Annales Henri Poincaré - Volume 2

Ann. Henri Poincaré 2 (2001) 1 – 26 © Birkhäuser Verlag, Basel, 2001 1424-0637/01/010001-26 $ 1.50+0.20/0 Annales...

Author: Vincent Rivasseau (Chief Editor)


Ann. Henri Poincaré 2 (2001) 1 – 26 © Birkhäuser Verlag, Basel, 2001 1424-0637/01/010001-26 $ 1.50+0.20/0

Annales Henri Poincaré

The Rate of Optimal Purification Procedures

M. Keyl, R. F. Werner

Abstract. Purification is a process in which decoherence is partially reversed by using several input systems which have been subject to the same noise. The purity of the outputs generally increases with the number of input systems, and decreases with the number of required output systems. We construct the optimal quantum operations for this task, and discuss their asymptotic behaviour as the number of inputs goes to infinity. The rate at which output systems may be generated depends crucially on the type of purity requirement. If one tests the purity of the output systems one at a time, the rate is infinite: this fidelity may be made to approach 1, while at the same time the number of outputs goes to infinity arbitrarily fast. On the other hand, if one also requires the correlations between outputs to decrease, the rate is zero: if fidelity with the pure product state is to go to 1, the number of outputs per input goes to zero. However, if only a fidelity close to 1 is required, the optimal purifier achieves a positive rate, which we compute.

1 Introduction

A central problem of quantum information processing is to ensure that devices which have been designed to perform certain tasks still work well in the presence of decoherence, i.e., under the combined influences of inaccurate specifications, interaction with further degrees of freedom, and thermal noise. Decoherence typically has the effect of producing mixed states out of pure states, so it is natural to ask whether the effects of decoherence can be partially undone, by processes turning mixed states into purer ones. As in the classical case this is impossible for operations working on single systems. However, if many (say $N$) systems are available, all of which were originally prepared in the same unknown pure state $\sigma$, and subsequently exposed to the same (known) decohering process $R_*$, then an analysis of the combined state may well allow the reconstruction of the original pure state. The quality of this reconstruction will increase with $N$. In fact, it should approach perfection as $N \to \infty$: in this limit one can determine the decohered state $R_*\sigma$ to an arbitrary accuracy by statistical measurements. The question is only whether the knowledge of the full density matrix $R_*\sigma$ admits the reconstruction of $\sigma$, i.e., whether the linear operator $R_*$ is invertible. Generically, and for sufficiently small decoherence, this is the case. However, the operator $R_*^{-1}$ is usually not positive, i.e., it takes some density matrices into operators with negative eigenvalues. Therefore, it does not correspond to a physically realizable apparatus. But it does describe a computation we can perform to reconstruct $\sigma$ from the measured (or estimated) density matrix $\rho = R_*\sigma$.


How well can this reversal of decoherence be done when the number $N$ of inputs is given, and finite? The answer depends critically on the way the purification task is set up, and what “figure of merit” we try to optimize. In general, the resulting variational problems may be very hard to solve. However, in the specific model situation chosen in this paper, the solution is fairly straightforward: we take qubit systems, and assume that decoherence is described by a depolarizing channel of the form

$$R_*\sigma = \lambda\sigma + (1-\lambda)\,\frac{\mathbb{1}}{2}. \tag{1}$$
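The remark in the introduction that $R_*^{-1}$ is linear but not positive can be illustrated numerically for the channel (1). The following is a minimal sketch, not part of the original paper: the function names are hypothetical, matrices are plain nested lists, and the inverse is simply obtained by solving Equation (1) for $\sigma$.

```python
import math

def depolarize(sigma, lam):
    """Apply R_* sigma = lam*sigma + (1-lam)*I/2 to a 2x2 matrix (nested lists)."""
    return [[lam * sigma[i][j] + (1 - lam) * (0.5 if i == j else 0.0)
             for j in range(2)] for i in range(2)]

def invert_depolarize(rho, lam):
    """Formal inverse of Equation (1): (rho - (1-lam)*I/2) / lam. Linear, not positive."""
    return [[(rho[i][j] - (1 - lam) * (0.5 if i == j else 0.0)) / lam
             for j in range(2)] for i in range(2)]

def eigenvalues_symmetric_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, smallest first."""
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return (half_trace - root, half_trace + root)

lam = 0.5
sigma = [[1.0, 0.0], [0.0, 0.0]]            # a pure state |psi><psi|
rho = depolarize(sigma, lam)                # the decohered state R_* sigma
recovered = invert_depolarize(rho, lam)     # the computation recovers sigma
unphysical = invert_depolarize(sigma, lam)  # the same map applied to a density matrix

print(rho)
print(recovered)
print(eigenvalues_symmetric_2x2(unphysical))  # one eigenvalue is negative
```

Applied to $R_*\sigma$ the inverse returns $\sigma$, but applied to a density matrix outside the range of $R_*$ on states (here the pure state itself) it yields an operator with a negative eigenvalue, so it cannot be realized as a device.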

The purifier will be a device $T$ taking a state of $N$ qubits, and turning out some number $M$ of qubits, where $M$ may be either fixed or itself a random quantity. In the latter case $T$ is given mathematically by a family $T_M$ of completely positive maps, where $T_M$ takes a density matrix of $N$ qubit systems, and produces a positive operator on the $M$ qubit space, which is not necessarily normalized to unity: the normalization constant $w_M = \operatorname{tr}(T_M(\rho))$ is interpreted as the probability of getting exactly $M$ outputs from the input state $\rho$. Thus $\sum_M w_M = 1$. Our aim is to design $T$ to get outputs as close as possible to the uncorrupted input state $\sigma$, and also as many of them as possible. This is reminiscent of cloning problems [1, 2]. However, in cloning problems the aim is to get many copies of the input state to $T$, which in our case is the mixed state $R_*\sigma$, rather than the pure state $\sigma$. In both cases there is clearly a trade-off between the quality of the outputs and their number, which is why there are several different ways to state the problem. In the sequel we will briefly describe the variants of the purification problem, together with the results, which will be shown later in the paper.

1. Maximal fidelity, failure to produce any output admissible. The best fidelity of outputs is clearly achieved when the weakest possible demands are made on the number of outputs. In this case we do not even insist on an output every time the device is run, but only on some non-zero probability for getting an output. The best achievable fidelity of these outputs goes to 1 as $N \to \infty$, but not substantially faster than with the following stronger requirement on output numbers.

2. $M = 1$ fixed, number $M$ never increased at expense of output purity. This is the approach taken by [3]. At least one output qubit is required, and the figure of merit is based on the fidelity of this one qubit. As it turns out, the optimal device for this problem can just as well produce more outputs of the same optimal fidelity, with a certain rate. However, this rate is not part of the optimization criterion.

3. $M$ fixed, purity measured by one-particle restrictions. For fixed $M, N$, this problem is rather similar to 2. However, with the additional parameter $M$ we can discuss better the trade-off between rate and quality of outputs. Suppose we fix some dependence of the number of outputs $M(N)$ on the number of inputs. Do the states still approach $\sigma$ as $N \to \infty$? Clearly, if $M(N)$ increases slowly, e.g., at the rate given by the optimal device from 2, this will be the case. What may seem surprising at first, however, is that no matter how fast $M(N) \to \infty$, the state of each output qubit still approaches the uncorrupted pure state. In this sense, optimal purification works with an infinite rate.

4. $M$ fixed, purity measured by fidelity with respect to $\sigma^{\otimes M}$. The infinite rate depends critically on what we use as the quality criterion for outputs. Apart from the fidelity of the restrictions of the output state to single qubits used in 3, we could also look at the fidelity of the outputs with respect to the $M$ particle pure state $\sigma^{\otimes M}$, thereby taking into account also the correlations between different outputs. For fixed $M$, the difference between these two fidelity measures does not seem so great, because one can be estimated in terms of the other. However, the estimates are $M$-dependent (see below), and hence for problems involving a limit $M \to \infty$ the fidelity with respect to the combined state may (and does) turn out to be a much tighter criterion. In fact, no process with finite rate $M/N$ achieves fidelity $\to 1$, and in this sense even optimal purification works with zero rate, in sharp contrast to 3 above. On the other hand, for any fixed fidelity requirement below 1, there is an output rate for an optimized process, which is computed below.

These results will be stated in precise terms in the following Section 2, together with the notation needed for that purpose, and graphs of the optimal fidelities and rates. The proofs follow in the subsequent sections. Technically they hinge on the decomposition theory of tensor product representations of SU(2), and this background is provided in Section 3. The reason for representation theory to enter in such a crucial way is isolated in Section 3.1, where it is shown that the optimal devices can be taken to be SU(2)-covariant (i.e., they do not single out a basis in the qubit space). The two basic purifiers, called the “natural purifier” (optimal for question 2 above), and the “optimal purifier” (optimal for question 3 above) are defined in Section 4, and their fidelities are computed. The proof of the optimality claims is given in Section 5. Finally, in Section 6, we determine the asymptotic behaviour for the optimal purifier, and the output rates.

2 Figures of Merit and Main Results

In this section we will state the optimization problems for purifiers mathematically. A device (not necessarily a purification procedure) taking $N$ qubit systems as input and producing $M$ output qubits is described mathematically by a trace preserving, completely positive linear map (“cp-map”) $T_* : \mathcal{B}_*(\mathcal{H}^{\otimes N}) \to \mathcal{B}_*(\mathcal{H}^{\otimes M})$, which takes input density matrices to output density matrices. Equivalently, we may work in the Heisenberg picture, using the dual $T$ of $T_*$, the unital (i.e. $T(\mathbb{1}) = \mathbb{1}$) cp-map $T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$, which is related to $T_*$ by $\operatorname{tr}[T(X)\rho] = \operatorname{tr}[X\,T_*(\rho)]$. Here $\mathcal{H} = \mathbb{C}^2$ is the one qubit Hilbert space, $\mathcal{B}(\,\cdot\,)$ is the space of all (bounded) operators on the corresponding Hilbert space and $\mathcal{B}_*(\,\cdot\,)$ denotes the space of trace class operators. Since $\dim \mathcal{H} = 2 < \infty$, the spaces $\mathcal{B}_*(\mathcal{H})$ and $\mathcal{B}(\mathcal{H})$ are just the $2\times 2$ matrices, but it is nevertheless helpful to keep track of the distinction between spaces of observables and spaces of states.

“Good purifiers” should make $T_*\big((R_*\sigma)^{\otimes N}\big)$ very close to $\sigma^{\otimes M}$. A simple figure of merit is the fidelity of the output with respect to the desired state in the worst case, i.e.,

$$F_{\rm all}(T) = \inf_{\sigma}\, \operatorname{tr}\Big[\sigma^{\otimes M}\, T_*\big((R_*\sigma)^{\otimes N}\big)\Big], \tag{2}$$

where the infimum is over all one-particle pure states $\sigma$. Similarly, we could pick any one of the outputs, say the one with number $i$, $1 \le i \le M$, and test its fidelity. The worst case then gives the fidelity

$$F_{\rm one}(T) = \inf_{i}\,\inf_{\sigma}\, \operatorname{tr}\Big[\sigma^{(i)}\, T_*\big((R_*\sigma)^{\otimes N}\big)\Big], \tag{3}$$

where $\sigma^{(i)} = \mathbb{1} \otimes \cdots \otimes \sigma \otimes \cdots \otimes \mathbb{1}$ denotes the tensor product with $(M-1)$ factors “$\mathbb{1}$” and one factor $\sigma$ at the $i$th position. We seek to maximize these numbers by judicious choice of $T$. Let us denote the optimal values by

$$F_{\#}^{\max}(N, M) = \sup_{T} F_{\#}(T), \tag{4}$$

where # = “all” or # = “one”, and the supremum is over all unital cp-maps $T$ with the specified number of inputs and outputs.

For devices with variable numbers of outputs all these quantities become random variables as well. Typically, one will seek to optimize the mean fidelity. It is then natural not to take the infimum in Equation (3), but the mean. The case where no output is produced at all is interpreted here as one output qubit in the completely mixed state. The resulting mean fidelity [3] can be thought of as the fidelity $F_{\rm one}(\tilde T)$ of a modified device $\tilde T$, which uses $T$, followed by a random selection of one of the outputs. Therefore, the problem of maximizing mean fidelity is exactly the same as maximizing $F_{\rm one}(T)$ for devices with fixed output number $M = 1$, with optimal value $F_{\rm one}^{\max}(N, 1)$.

Rather than looking at the mean of the fidelity distribution of a device with variable number of outputs we could also look at its maximum. This corresponds to the problem in item 1 of the previous section. More precisely, one should omit the “worst case” infimum with respect to $i$ in this case, and allow the device to either pick one of its outputs, or to declare failure. This leads to a device with only


the two output numbers 0 and 1, and the functional to be optimized is the fidelity of the “1”-output. We will denote the optimum for this problem by $F_{\rm one}^{\max}(N, 0)$, with a slight abuse of notation expressing that this is the case with no demands on output numbers at all.

It is clear that $F_{\#}^{\max}(N, M)$ is a decreasing function of $M$, and that therefore the limit

$$F_{\#}^{\max}(N, \infty) = \lim_{M \to \infty} F_{\#}^{\max}(N, M) \tag{5}$$

exists. For # = all, this limit is zero. However, for # = one, it is an interesting quantity, which even goes to 1 as $N \to \infty$.

The results for the quantities $F_{\rm one}^{\max}(N, 0)$, $F_{\rm one}^{\max}(N, 1)$, and $F_{\rm one}^{\max}(N, \infty)$ are shown in Figure 1. Of course, all these quantities also depend on the parameter describing the noise, which we have suppressed for notational convenience. It is fixed in the following graphs as $\lambda = 0.5$ (resp. $\beta = 0.549$, see Section 3). It is clear that $F_{\#}^{\max}(N, M) \to 1$ for any $N$ and $M$, as the noise level goes to zero ($\lambda \to 1$).

Figure 1: The three basic fidelities for the one-particle figure of merit, plotted against $N$ (top: $F_{\rm one}^{\max}(N, 0)$, middle: $F_{\rm one}^{\max}(N, 1)$, bottom: $F_{\rm one}^{\max}(N, \infty)$).

The leading asymptotic behaviour (as $N \to \infty$) is of the form

$$F_{\rm one}^{\max}(N, M) \propto 1 - \frac{c_M}{2N} + \cdots \tag{6}$$

$$c_0 = (1 - \lambda)/\lambda \tag{7}$$

$$c_1 = (1 - \lambda)/\lambda^2 \tag{8}$$

$$c_\infty = (\lambda + 1)/\lambda^2. \tag{9}$$
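To get a feeling for the size of these leading-order terms, Equations (6) to (9) can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and it covers just the three cases $M = 0$, $M = 1$ and $M = \infty$ for which the text gives the coefficient $c_M$.

```python
import math

def c_coefficient(lam, M):
    """Coefficient c_M from Equations (7)-(9); only M = 0, 1 and infinity are given."""
    if M == 0:
        return (1 - lam) / lam
    if M == 1:
        return (1 - lam) / lam ** 2
    if M == math.inf:
        return (1 + lam) / lam ** 2
    raise ValueError("the text gives c_M only for M = 0, 1 and infinity")

def leading_order_fidelity(N, lam, M):
    """Leading-order approximation 1 - c_M / (2N) from Equation (6)."""
    return 1 - c_coefficient(lam, M) / (2 * N)

lam = 0.5  # noise parameter used for Figure 1
for M in (0, 1, math.inf):
    print(M, leading_order_fidelity(20, lam, M))
```

For $\lambda = 0.5$ the coefficients are $c_0 = 1$, $c_1 = 2$ and $c_\infty = 6$, so the three curves are ordered as in Figure 1; the approximation drops the higher-order terms indicated by the dots in (6).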

From these asymptotic results, a simple estimate for the all-particle fidelity criteria can be obtained: By Equation (41), $1 - F_{\rm all}(T) \le M\,(1 - F_{\rm one}(T))$, where $M$ is the number of outputs. Hence, for sufficiently small rate $M/N$ one achieves good fidelity, even for the all-particle test criterion:

$$1 - F_{\rm all}^{\max}(N, M) \le M\big(1 - F_{\rm one}^{\max}(N, M)\big) \le M\big(1 - F_{\rm one}^{\max}(N, \infty)\big) \approx \frac{M}{2N}\, c_\infty.$$

Of course, the second estimate is rather crude, and a refined version will be given in Section 6. The argument does show, however, that one may expect optimal all-particle fidelity to become a function of the output rate. This function will be computed in Section 6.3: for every $\mu > 0$, we find the limit

$$\Phi(\mu) = \lim_{\substack{N \to \infty \\ M/N \to \mu}} F_{\rm all}^{\max}(N, M) = \begin{cases} \dfrac{2\lambda^2}{2\lambda^2 + \mu(1 - \lambda)} & \text{if } \mu \le \lambda \\[2ex] \dfrac{2\lambda^2}{\mu(1 + \lambda)} & \text{if } \mu \ge \lambda. \end{cases} \tag{10}$$

The function Φ is continuous and satisfies Φ(0) = 1 and Φ(∞) = 0, so at small rates purification is near perfect, but becomes arbitrarily bad at too high rates. In Figure 2 Φ is plotted with the noise parameter λ going in steps of 0.1 from 0 to 1. The dotted line describes the performance of the natural purifier (see Section 4.1), which operates with rate µ = λ.
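The piecewise formula (10) is easy to evaluate, and doing so confirms the stated properties of $\Phi$. This is a small sketch with a hypothetical function name `phi`; it simply transcribes the two branches of Equation (10).

```python
def phi(mu, lam):
    """Asymptotic all-particle fidelity Phi(mu) of Equation (10) at output rate mu."""
    if mu <= lam:
        return 2 * lam ** 2 / (2 * lam ** 2 + mu * (1 - lam))
    return 2 * lam ** 2 / (mu * (1 + lam))

lam = 0.5
for mu in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(mu, phi(mu, lam))
```

At $\mu = \lambda$ both branches reduce to $2\lambda/(1 + \lambda)$, which is the continuity claimed above and also the value of $\Phi$ at the rate of the natural purifier.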

Figure 2: Asymptotic fidelity $\Phi(\mu)$ for the all-particle figure of merit (10), plotted against the rate $\mu$. Curve parameter: $\lambda = .1, .2, \ldots, 1$; dotted line: natural purifier.

3 Decomposition theory

Many arguments in this paper are based on group theory, in particular the decomposition of tensor products of irreducible representations of SU(2). In this section we will summarize the relevant results which are needed throughout the paper.


3.1 Reduction to fully symmetric case

There are two reasons why group theory is useful for us. First, the depolarizing channel $R$ producing the noise is “covariant”, which means that it does not prefer any particular polarization direction (basis in the underlying Hilbert space $\mathcal{H} = \mathbb{C}^2$). Second, we are looking at a “universal” purification problem, i.e. the purification devices $T$ we are looking for should work well on an arbitrary unknown input state $\sigma$. Therefore, it is natural to look at those $T$ which are covariant as well: $T$ should work in exactly the same way on any input. Carrying this idea further, it should also be impossible to single out any one of the input and output channels. Mathematically, these “natural conditions” are stated as follows:

Definition 3.1. A unital cp-map $T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ is called fully symmetric if it is U(2) covariant, i.e.

$$T(U^{\otimes M} A\, U^{*\otimes M}) = U^{\otimes N}\, T(A)\, U^{*\otimes N} \qquad \forall A \in \mathcal{B}(\mathcal{H}^{\otimes M})\ \ \forall U \in \mathrm{U}(2),$$

and permutation invariant, i.e.

$$T(\eta A \eta^*) = T(A) \qquad \forall \eta \in S_M\ \ \forall A \in \mathcal{B}(\mathcal{H}^{\otimes M})$$

and

$$\tau\, T(A)\, \tau^* = T(A) \qquad \forall \tau \in S_N\ \ \forall A \in \mathcal{B}(\mathcal{H}^{\otimes M}).$$

Here $\eta \in S_M$, $\tau \in S_N$ denote permutations of $M$ respectively $N$ elements and at the same time the corresponding unitaries in $\mathcal{B}(\mathcal{H}^{\otimes M})$ and $\mathcal{B}(\mathcal{H}^{\otimes N})$, i.e. $\eta(\psi_1 \otimes \cdots \otimes \psi_M) = \psi_{\eta(1)} \otimes \cdots \otimes \psi_{\eta(M)}$.

We could have made this condition part of our definition of a purifier, and restricted the discussion to fully symmetric operations from the outset. However, we have chosen to take the heuristic arguments at the beginning of this section more seriously: the kind of “universality” described there is already embodied in the figures of merit of Section 2, so it becomes a mathematical question whether optimal purifiers are indeed fully symmetric or else symmetry is broken, and a non-symmetric purifier can outperform all symmetric ones.

We now argue that the optimal devices (with respect to $F_{\rm one}$ and $F_{\rm all}$) may indeed be assumed to be fully symmetric. To make this precise, note that $F_{\rm all}(T)$ and $F_{\rm one}(T)$ are infima over expressions which are linear in $T$, and hence concave functionals. Therefore, averaging over many $T$'s with the same figure of merit produces a $T$ at least as good. Clearly, for all permutations $\eta \in S_M$, $\tau \in S_N$ and $U \in \mathrm{U}(2)$, the purifier

$$T'(X) = \tau\, U^{\otimes N}\, T\big(\eta\, U^{*\otimes M} X\, U^{\otimes M} \eta^*\big)\, U^{*\otimes N} \tau^*$$

has the same figure of merit as $T$. By averaging over these parameters (with respect to the appropriate Haar measures) we thus find a purifier which is at least as good as $T$ and, in addition, fully symmetric. Similar arguments apply for purifiers with variable numbers of outputs (although one has to be more careful in defining figures of merit). Therefore, we will restrict our discussion to fully symmetric purifiers from now on.


3.2 Decomposition of tensor products

The reduction to fully symmetric purifiers allows the application of techniques from group theory (especially representation theory of SU(2)), which simplifies our problems significantly. Consider in particular the $N$-fold tensor product

$$\mathrm{SU}(2) \ni U \mapsto \pi_{1/2}(U)^{\otimes N} = U^{\otimes N} \in \mathcal{B}(\mathcal{H}^{\otimes N})$$

of the spin-1/2, or “defining”, representation $\mathrm{SU}(2) \ni U \mapsto \pi_{1/2}(U) = U \in \mathcal{B}(\mathcal{H})$. It decomposes into a direct sum of irreducible subrepresentations

$$\pi_{1/2}(U)^{\otimes N} = U^{\otimes N} = \bigoplus_{s \in I[N]} \pi_s(U) \otimes \mathbb{1} \tag{11}$$

with $\pi_s(U) \otimes \mathbb{1} \in \mathcal{B}(\mathcal{H}_s \otimes \mathcal{K}_{N,s})$ and

$$\mathcal{H}^{\otimes N} = \bigoplus_{s \in I[N]} \mathcal{H}_s \otimes \mathcal{K}_{N,s}, \qquad I[N] = \begin{cases} \{0, 1, \ldots, \tfrac{N}{2}\} & N \text{ even} \\ \{\tfrac{1}{2}, \tfrac{3}{2}, \ldots, \tfrac{N}{2}\} & N \text{ odd.} \end{cases}$$

Here $\pi_s$ denotes the spin-$s$ irreducible representation of SU(2), $\mathcal{H}_s$ its $(2s+1)$-dimensional representation space, which we will identify in the following with the symmetric tensor product $\mathcal{H}_+^{\otimes 2s}$, i.e. the $2s$-qubit Bose subspace, and $\mathcal{K}_{N,s}$ denotes a multiplicity space, which carries an appropriate representation of the symmetric group $S_N$.

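3.3 Decomposition of states

Consider now a general qubit density matrix $\rho$, which in its eigenbasis can be written as ($\beta \ge 0$)

$$\rho(\beta) = \frac{1}{2\cosh(\beta)} \exp\Big(2\beta\,\frac{\sigma_3}{2}\Big) = \frac{1}{e^{\beta} + e^{-\beta}} \begin{pmatrix} e^{\beta} & 0 \\ 0 & e^{-\beta} \end{pmatrix} \tag{12}$$

$$\phantom{\rho(\beta)} = \tanh(\beta)\,|\psi\rangle\langle\psi| + \big(1 - \tanh(\beta)\big)\,\frac{1}{2}\,\mathbb{1}, \qquad \psi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

The parametrization of $\rho$ in terms of the “pseudo-temperature” $\beta$ is chosen here because it is, as we will see soon, very useful for calculations. The relation to the form of $\rho = R_*\sigma$ initially given in Equation (1) is $\lambda = \tanh(\beta)$.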
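The correspondence $\lambda = \tanh(\beta)$ between Equations (1) and (12) can be checked numerically; for instance it reproduces the value $\beta = 0.549$ quoted for $\lambda = 0.5$ in Section 2. The snippet is a sketch with hypothetical helper names, representing the diagonal state $\rho(\beta)$ by its two eigenvalues.

```python
import math

def beta_from_lambda(lam):
    """Invert lambda = tanh(beta) for 0 <= lam < 1."""
    return math.atanh(lam)

def rho_eigenvalues(beta):
    """Diagonal entries of rho(beta) in Equation (12): e^{+beta} and e^{-beta} over 2 cosh(beta)."""
    z = 2 * math.cosh(beta)
    return (math.exp(beta) / z, math.exp(-beta) / z)

lam = 0.5
beta = beta_from_lambda(lam)
p_up, p_down = rho_eigenvalues(beta)

# Equation (1) applied to sigma = |psi><psi| has diagonal entries (1+lam)/2 and (1-lam)/2.
print(beta, p_up, p_down, (1 + lam) / 2, (1 - lam) / 2)
```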


The $N$-fold tensor product $\rho^{\otimes N}$ can be expressed as $\rho(\beta)^{\otimes N} = (2\cosh(\beta))^{-N} \exp(2\beta L_3)$, where

$$\mathcal{B}(\mathcal{H}^{\otimes N}) \ni L_3 = \frac{1}{2}\Big(\sigma_3 \otimes \mathbb{1}^{\otimes(N-1)} + \cdots + \mathbb{1}^{\otimes(N-1)} \otimes \sigma_3\Big) \tag{13}$$

denotes the 3-component of angular momentum in the representation $\pi_{1/2}^{\otimes N}$. In other words, the density matrices are just analytic continuations of group unitaries, or “SU(2)-rotations by an imaginary angle $2i\beta$”. This reduces the decomposition of $\rho(\beta)^{\otimes N}$ to the decomposition (11) of the tensor product representation. Of course, analytically continued group elements are not normalized as density operators. Extracting appropriate normalization factors, the decomposition becomes

$$\rho(\beta)^{\otimes N} = \bigoplus_{s \in I[N]} w_N(s)\, \rho_s(\beta) \otimes \frac{\mathbb{1}}{\dim \mathcal{K}_{N,s}},$$

with

$$w_N(s) = \frac{\sinh\big((2s+1)\beta\big)}{\sinh(\beta)\,(2\cosh(\beta))^{N}}\, \dim \mathcal{K}_{N,s} \tag{14}$$

and

$$\rho_s(\beta) = \frac{\sinh(\beta)}{\sinh\big((2s+1)\beta\big)} \exp\big(2\beta L_3^{(s)}\big).$$

Here $L_3^{(s)}$ denotes again the 3-component of angular momentum, now in the representation $\pi_s$. The $\rho_s(\beta)$ are normalized, i.e. $\operatorname{tr} \rho_s(\beta) = 1$. Hence $\sum_s w_N(s) = 1$ and $0 \le w_N(s) \le 1$ due to the normalization of $\rho(\beta)^{\otimes N}$. Together with the fact that the multiplicities $\dim \mathcal{K}_{N,s}$ are independent of $\beta$, we can extract from Equation (14) a generating functional for $\dim \mathcal{K}_{N,s}$:

$$\big(e^{\beta} - e^{-\beta}\big)\big(e^{\beta} + e^{-\beta}\big)^{N} = 2\sinh(\beta)\,(2\cosh(\beta))^{N} = \sum_{s \in I[N]} 2\sinh\big((2s+1)\beta\big) \dim \mathcal{K}_{N,s} = \sum_{s \in I[N]} \big(e^{(2s+1)\beta} - e^{-(2s+1)\beta}\big) \dim \mathcal{K}_{N,s},$$

obtaining

$$\dim \mathcal{K}_{N,s} = \frac{2s+1}{N/2 + s + 1} \binom{N}{N/2 - s}$$

provided $N/2 - s$ is integer, and zero otherwise. The same result can be derived using representation theory of the symmetric group; see [4], where the more general case $\dim \mathcal{H} = d \in \mathbb{N}$ is studied.
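The multiplicity formula and the weights of Equation (14) lend themselves to a quick numerical sanity check. The sketch below uses hypothetical function names and works with the integer $2s$ instead of $s$ to avoid half-integers; it checks that the dimensions add up to $2^N$ and that the weights $w_N(s)$ sum to one.

```python
import math

def allowed_two_s(N):
    """Values of 2s for s in I[N]: same parity as N, from 0 or 1 up to N."""
    return range(N % 2, N + 1, 2)

def dim_K(N, two_s):
    """dim K_{N,s} = (2s+1)/(N/2+s+1) * binom(N, N/2-s), written in terms of 2s."""
    if (N - two_s) % 2 or not 0 <= two_s <= N:
        return 0
    # (2s+1)/(N/2+s+1) equals 2(2s+1)/(N+2s+2); the division is exact.
    return 2 * (two_s + 1) * math.comb(N, (N - two_s) // 2) // (N + two_s + 2)

def weight(N, two_s, beta):
    """Probability w_N(s) from Equation (14)."""
    return (math.sinh((two_s + 1) * beta) * dim_K(N, two_s)
            / (math.sinh(beta) * (2 * math.cosh(beta)) ** N))

N, beta = 6, 0.549
print([dim_K(N, t) for t in allowed_two_s(N)])
print(sum((t + 1) * dim_K(N, t) for t in allowed_two_s(N)), 2 ** N)
print(sum(weight(N, t, beta) for t in allowed_two_s(N)))
```

The dimension count $\sum_s (2s+1)\dim \mathcal{K}_{N,s} = 2^N$ follows from the decomposition of $\mathcal{H}^{\otimes N}$ in Section 3.2, and the normalization of the weights is the statement $\sum_s w_N(s) = 1$ made above.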


3.4 Decomposition of operations and optimal cloning

Let us come back now to fully symmetric cp-maps $T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$. Using the results of Subsection 3.2 it is easy to see that $T$ can be decomposed into a direct sum

$$T(A) = \bigoplus_{s \in I[N]} T_s(A) \otimes \mathbb{1} \tag{15}$$

where the $T_s : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}_s)$ are unital cp-maps which are again fully symmetric (using an obvious modification of Definition 3.1). Identifying, as in Subsection 3.2, the representation space $\mathcal{H}_s$ with the $2s$-fold symmetric tensor product $\mathcal{H}_+^{\otimes 2s}$ leads to the significantly simpler problem of decomposing fully symmetric, unital cp-maps $Q : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}_+^{\otimes N})$, which is already solved in [2]. Hence we will state only the corresponding results here. In particular we have the following theorem:

Theorem 3.1. Consider again the 3-components of angular momentum $L_3$ and $L_3^{(s)}$ in the representations $\pi_{1/2}^{\otimes M}$ respectively $\pi_s$ (cf. Subsection 3.3).

1. For each fully symmetric cp-map $Q : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}_+^{\otimes 2s})$ there is a constant $\omega(Q) \in \mathbb{R}^+$ with $Q(L_3) = \omega(Q)\, L_3^{(s)}$.

2. For each $2s \in \mathbb{N}_0$ there is exactly one fully symmetric $\hat Q_{2s}$ with

$$\omega(\hat Q_{2s}) = \max_{Q}\, \omega(Q) = \begin{cases} \dfrac{M}{2s} & \text{for } 2s \ge M \\[2ex] \dfrac{M+2}{2s+2} & \text{for } 2s < M, \end{cases} \tag{16}$$

where the maximum is taken over the set of all fully symmetric cp-maps $Q : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}_+^{\otimes 2s})$.

3. If $M > 2s$ holds, $\hat Q_{2s}$ is given in terms of its pre-dual $\hat Q_{2s*} : \mathcal{B}_*(\mathcal{H}_+^{\otimes 2s}) \to \mathcal{B}_*(\mathcal{H}^{\otimes M})$ by

$$\hat Q_{2s*}(\theta) = \frac{2s+1}{M+1}\, S_M\big(\theta \otimes \mathbb{1}^{\otimes(M-2s)}\big) S_M \tag{17}$$

where $S_M$ is the projector from $\mathcal{H}^{\otimes M}$ onto the Bose subspace $\mathcal{H}_+^{\otimes M}$.

4. For $M \le 2s$ the map $\hat Q_{2s}$ is given by $\hat Q_{2s}(A) = S_{2s}\big(A \otimes \mathbb{1}^{\otimes(2s-M)}\big) S_{2s}$, or in terms of its predual

$$\hat Q_{2s*}(\theta) = \operatorname{tr}_{2s-M}\, \theta, \tag{18}$$

where $\operatorname{tr}_{2s-M}$ denotes the partial trace over the first $2s - M$ tensor factors.


Note that the family of cp-maps $\hat Q_{2s}$ defined in Equation (17) respectively (18) plays a very special role not only mathematically: $\hat Q_{2s}$ describes the optimal way to increase (17) or decrease (18) the number of qubits. More precisely, $\hat Q_{2s*}$ maps a finite number $2s$ of qubits in the same unknown pure state $\sigma$ to the best possible approximation $\hat Q_{2s*}(\sigma^{\otimes 2s})$ of the product state $\sigma^{\otimes M}$. The quality of $\hat Q_{2s*}(\sigma^{\otimes 2s})$ is measured here by the fidelities

$$G_{\rm all}(Q) := \inf_{\sigma}\, \operatorname{tr}\big[\sigma^{\otimes M}\, Q_*(\sigma^{\otimes 2s})\big]$$

or

$$G_{\rm one}(Q) := \inf_{i}\,\inf_{\sigma}\, \operatorname{tr}\big[\sigma^{(i)}\, Q_*(\sigma^{\otimes 2s})\big].$$

If $2s \ge M$ holds (item 4) we simply have to discard $2s - M$ qubits to get exactly $\hat Q_{2s*}(\sigma^{\otimes 2s}) = \sigma^{\otimes M}$. If the number of qubits should be increased, i.e. $M > 2s$ holds (item 3), the target state $\sigma^{\otimes M}$ cannot be reached. In this case $\hat Q_{2s}$ is the optimal quantum cloning device described in [1, 2].
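As a small illustration of Equation (16), one can tabulate $\omega(\hat Q_{2s})$ and the resulting one-particle fidelity for pure inputs. The sketch below is not from the paper: the function names are hypothetical, and it assumes the relation $\gamma[\hat Q_{2s*}(\theta)] = \omega(\hat Q_{2s})\,\frac{2s}{M}\,\gamma[\theta]$ together with fidelity $= (1+\gamma)/2$, both of which are derived in Section 4.2, applied to a pure product input with $\gamma = 1$.

```python
from fractions import Fraction

def omega_hat(two_s, M):
    """Maximal omega from Equation (16) for the map taking 2s qubits to M qubits."""
    if two_s >= M:
        return Fraction(M, two_s)
    return Fraction(M + 2, two_s + 2)

def one_particle_fidelity_pure_input(two_s, M):
    """(1 + gamma)/2 with gamma = omega * 2s / M, assuming a pure input (gamma = 1)."""
    gamma = omega_hat(two_s, M) * two_s / M
    return (1 + gamma) / 2

print(one_particle_fidelity_pure_input(1, 2))  # 1 -> 2 cloning
print(one_particle_fidelity_pure_input(4, 2))  # discarding qubits loses nothing
```

For $2s \ge M$ the result is exactly 1, matching the statement that discarding qubits reproduces $\sigma^{\otimes M}$; for $M > 2s$ it stays strictly below 1, as expected for cloning.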

4 Natural and optimal purifiers

In this section we will introduce a particular class of purification maps which arise very naturally from the group theoretical discussion of the last section and which maximize, as we will see in Section 5, the fidelities $F_{\rm all}$ and $F_{\rm one}$.

4.1 The definitions

As a first step let us reinterpret the decomposition of $\rho(\beta)^{\otimes N}$ discussed in Subsection 3.3 in terms of the cp-map

$$\bigoplus_{s \in I[N]} \mathcal{B}(\mathcal{H}_+^{\otimes 2s}) \ni \bigoplus_{s \in I[N]} A_s =: A \mapsto T^{\rm nat}(A) := \bigoplus_{s \in I[N]} T_s^{\rm nat}(A_s) := \bigoplus_{s \in I[N]} A_s \otimes \mathbb{1} \in \bigoplus_{s \in I[N]} \mathcal{B}(\mathcal{H}_s \otimes \mathcal{K}_{N,s}) \subset \mathcal{B}(\mathcal{H}^{\otimes N}). \tag{19}$$

Its predual maps the density matrix $\rho(\beta)^{\otimes N}$ to $\bigoplus_{s \in I[N]} w_N(s)\, \rho_s(\beta)$. The latter should be interpreted as a (normal) state on the von Neumann algebra $\bigoplus_{s \in I[N]} \mathcal{B}(\mathcal{H}_+^{\otimes 2s})$. Hence $T^{\rm nat}$ is an instrument which produces with probability $w_N(s)$ the $2s$-qubit state $\rho_s(\beta)$ from the input state $\rho(\beta)^{\otimes N}$. This implies in particular that the number of output systems of $T^{\rm nat}$ is not a fixed parameter but an observable. We will see soon that the fidelities of the output states $\rho_s(\beta)$ are bigger than those of the input state $\rho(\beta)^{\otimes N}$ provided $s > 0$ holds. Hence we will call $T^{\rm nat}$ the natural purifier.


The most obvious way to construct a device which always produces the same number of output systems is the composition of $T^{\rm nat}$ with the cloning operation

$$\mathcal{B}(\mathcal{H}^{\otimes M}) \ni A \mapsto \hat Q(A) = \bigoplus_{s \in I[N]} \hat Q_{2s}(A) \in \bigoplus_{s \in I[N]} \mathcal{B}(\mathcal{H}_+^{\otimes 2s}).$$

Here the $\hat Q_{2s}$ are the operations introduced in Theorem 3.1. Combining $T^{\rm nat}$ with $\hat Q$ we get an operation

$$\mathcal{B}(\mathcal{H}^{\otimes M}) \ni A \mapsto T^{\rm opt}(A) := (T^{\rm nat} \hat Q)(A) \in \mathcal{B}(\mathcal{H}^{\otimes N}) \tag{20}$$

which produces, as stated, a fixed number $M$ of output systems from $N$ input qubits. Physically we can interpret $T^{\rm opt}$ in the following way: First we apply the natural purifier to the input state $\rho(\beta)^{\otimes N}$ and we get $2s$ output systems in the common state $\rho_s(\beta)$. If $2s \ge M$ we throw away $2s - M$ qubits and end up with $M$ qubits. If $2s < M$ we have to invoke the $2s \to M$ optimal cloner to reach the required number of $M$ output systems. Although this cloning process is wasteful, we will see soon that the fidelities $F_{\#}(T^{\rm opt})$ of the output state produced by $T^{\rm opt}$ are even the best fidelities we can get for any $N \to M$ purifier. Hence we will call $T^{\rm opt}$ the optimal purifier.

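4.2 The one qubit fidelity

Now we will calculate the one qubit fidelity $F_{\rm one}$. Due to covariance of the depolarizing channel $R$, the expressions under the infima defining $F_{\rm one}(T)$ (and $F_{\rm all}(T)$) in Equations (2) and (3) do not depend on $\sigma$ and $i$ for any fully symmetric purifier. I.e. we get with $R_*\sigma = \rho(\beta)$:

$$F_{\rm all}(T) = \operatorname{tr}\big[\sigma^{\otimes M}\, T_*(\rho(\beta)^{\otimes N})\big] \quad\text{and}\quad F_{\rm one}(T) = \operatorname{tr}\big[\sigma^{(1)}\, T_*(\rho(\beta)^{\otimes N})\big] \tag{21}$$

with $\sigma = |\psi\rangle\langle\psi|$. In the case of $F_{\rm one}$ the situation is further simplified by the introduction of the black cow parameter (cf. [1]) $\gamma(\theta)$, which is defined for each density matrix $\theta$ on $\mathcal{H}^{\otimes M}$ by

$$\gamma(\theta) = \frac{1}{M} \operatorname{tr}(2 L_3\, \theta).$$

To derive the relation of $\gamma$ to $F_{\rm one}$ note that full symmetry of $T$ implies, equivalently to (21),

$$F_{\rm one}(T) = \operatorname{tr}\Big[\Big(\frac{1}{M} \sum_{j=1}^{M} \sigma^{(j)}\Big)\, T_*(\rho(\beta)^{\otimes N})\Big].$$

Since $\sigma = (\mathbb{1} + \sigma_3)/2$ holds with the Pauli matrix $\sigma_3$, we get together with the definition of $L_3$ in Equation (13)

$$F_{\rm one}(T) = \frac{1}{2}\Big(1 + \gamma\big[T_*(\rho(\beta)^{\otimes N})\big]\Big). \tag{22}$$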


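In other words it is sufficient to calculate $\gamma\big[T_*(\rho(\beta)^{\otimes N})\big]$ (which is simpler because SU(2) representation theory is more directly applicable) instead of $F_{\rm one}(T)$. Another advantage of $\gamma$ is its close relation to the parameter $\lambda = \tanh(\beta)$ defining the operation $R_*$ in Equation (1). In fact we have

$$\gamma\big(\rho(\beta)^{\otimes N}\big) = \frac{1}{N} \operatorname{tr}\big(2 L_3\, \rho(\beta)^{\otimes N}\big) = \frac{1}{N}\, N \operatorname{tr}\big(\sigma_3\, \rho(\beta)\big) = \tanh(\beta) = \lambda.$$

In other words the one particle restrictions of the output state $T_*(\rho(\beta)^{\otimes N})$ are given by

$$\gamma\big[T_*(\rho(\beta)^{\otimes N})\big]\, \sigma + \Big(1 - \gamma\big[T_*(\rho(\beta)^{\otimes N})\big]\Big)\, \frac{\mathbb{1}}{2}.$$

This implies that $\gamma\big[T_*(\rho(\beta)^{\otimes N})\big] > \lambda$ should hold if $T$ is really a purifier.

Let us consider now the natural purifier $T^{\rm nat}$. Since the number of output qubits is not constant in this case, we have to consider for each $s \in I[N]$ the quantity $F_{\rm one}(T_s^{\rm nat})$ (see Equation (19) for the definition of the $T_s^{\rm nat}$) instead of one fixed parameter $F_{\rm one}(T^{\rm nat})$ (in other words: the fidelity of $T^{\rm nat}$ is, like the number of output qubits, not a constant but an observable). According to the discussion above we get

$$\gamma\big[\rho_s(\beta)\big] = \frac{1}{2s} \operatorname{tr}\big(2 L_3^{(s)} \rho_s(\beta)\big) = \frac{1}{2s}\, \frac{\operatorname{tr}\big(2 L_3^{(s)} \exp(2\beta L_3^{(s)})\big)}{\operatorname{tr} \exp(2\beta L_3^{(s)})} = \frac{1}{2s} \frac{d}{d\beta} \ln \operatorname{tr} \exp\big(2\beta L_3^{(s)}\big)$$

$$\phantom{\gamma\big[\rho_s(\beta)\big]} = \frac{1}{2s} \frac{d}{d\beta} \Big(\ln \sinh\big((2s+1)\beta\big) - \ln \sinh \beta\Big) = \frac{2s+1}{2s} \coth\big((2s+1)\beta\big) - \frac{1}{2s} \coth \beta \tag{23}$$

and hence

$$F_{\rm one}(T_s^{\rm nat}) = \frac{1}{2}\Big(1 + \gamma\big[\rho_s(\beta)\big]\Big) = \frac{1}{2}\Big[1 + \frac{2s+1}{2s} \coth\big((2s+1)\beta\big) - \frac{1}{2s} \coth \beta\Big].$$

If $s = 1/2$ we have $\gamma[\rho_s(\beta)] = \tanh(\beta) = \lambda$, hence the (perturbed) input state $\rho(\beta)$ is reproduced. Taking the derivative with respect to $s$ shows in addition that $\gamma[\rho_s(\beta)]$ is strictly increasing in $s$. Hence $T^{\rm nat}$ really purifies (according to the remark above), and the best result is obtained if $s$ is maximal. In the limit $s \to 0$ we find $\gamma[\rho_s(\beta)] = 0$, which is reasonable because $T^{\rm nat}$ does not produce any output at all in this case ($\dim \mathcal{H}_s = 1$ for $s = 0$).

Let us apply these results to the optimal purifier. According to the definition of $T^{\rm opt}$ and $T^{\rm nat}$ in Equations (20) and (19), the decomposition of $T^{\rm opt}$ given in (15) has the form

$$T^{\rm opt}(A) = T^{\rm nat}\big(\hat Q(A)\big) = \bigoplus_{s \in I[N]} \hat Q_{2s}(A) \otimes \mathbb{1} = \bigoplus_{s \in I[N]} T_s^{\rm opt}(A) \otimes \mathbb{1}, \tag{24}$$

hence $T_s^{\rm opt}(A) = \hat Q_{2s}(A)$. Together with (22) we get

$$F_{\rm one}(T^{\rm opt}) = \frac{1}{2}\Big(1 + \sum_{s \in I[N]} w_N(s)\, \gamma\big[T_{s*}^{\rm opt}(\rho_s(\beta))\big]\Big) = \frac{1}{2}\Big(1 + \sum_{s \in I[N]} w_N(s)\, \gamma\big[\hat Q_{2s*}(\rho_s(\beta))\big]\Big) =: \sum_{s \in I[N]} w_N(s)\, f_{\rm one}(M, \beta, s), \tag{25}$$

where we have introduced the abbreviation

$$f_{\rm one}(M, \beta, s) := \frac{1}{2}\Big(1 + \gamma\big[\hat Q_{2s*}(\rho_s(\beta))\big]\Big).$$

Together with Theorem 3.1 this implies:

$$2 f_{\rm one}(M, \beta, s) - 1 = \gamma\big[\hat Q_{2s*}(\rho_s(\beta))\big] = \frac{1}{M} \operatorname{tr}\big(2 \hat Q_{2s}(L_3)\, \rho_s(\beta)\big) = \frac{\omega(\hat Q_{2s})}{M} \operatorname{tr}\big[2 L_3^{(s)} \rho_s(\beta)\big] = \frac{\omega(\hat Q_{2s})\, 2s}{M}\, \gamma[\rho_s(\beta)].$$

Inserting the values of $\omega(\hat Q_{2s})$ and $\gamma[\rho_s(\beta)]$ from Equations (16) and (23) we get

$$2 f_{\rm one}(M, \beta, s) - 1 = \begin{cases} \dfrac{2s+1}{2s} \coth\big((2s+1)\beta\big) - \dfrac{1}{2s} \coth \beta & \text{for } 2s > M \\[2ex] \dfrac{1}{2s+2}\, \dfrac{M+2}{M} \Big((2s+1) \coth\big((2s+1)\beta\big) - \coth \beta\Big) & \text{for } 2s \le M. \end{cases} \tag{26}$$

Hence we have proved the following proposition.

Proposition 4.1. The one-qubit fidelity $F_{\rm one}(T^{\rm opt})$ of the optimal purifier is given by

$$F_{\rm one}(T^{\rm opt}) = \sum_{s \in I[N]} w_N(s)\, f_{\rm one}(M, \beta, s) \tag{27}$$

with $f_{\rm one}(M, \beta, s)$ from Equation (26).

Note in particular that in the case $M = 1$ the one-qubit fidelity coincides with the expectation value of the fidelity of $T^{\rm nat}$ in the state $T_*^{\rm nat}(\rho(\beta)^{\otimes N})$, i.e. the mean fidelity. Hence we can reinterpret the natural purifier as a device which produces exactly one output system (cf. [3]).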
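Proposition 4.1 gives a finite sum that can be evaluated directly. The following sketch (hypothetical function names, $2s$ stored as an integer, $M \ge 1$ assumed) combines the weights of Equation (14) with $f_{\rm one}$ from Equation (26) and compares the result for $M = 1$ with the leading-order form $1 - c_1/(2N)$ of Equations (6) and (8).

```python
import math

def dim_K(N, two_s):
    """Multiplicity dim K_{N,s} from Section 3.3, in terms of the integer 2s."""
    if (N - two_s) % 2 or not 0 <= two_s <= N:
        return 0
    return 2 * (two_s + 1) * math.comb(N, (N - two_s) // 2) // (N + two_s + 2)

def weight(N, two_s, beta):
    """Probability w_N(s) from Equation (14)."""
    return (math.sinh((two_s + 1) * beta) * dim_K(N, two_s)
            / (math.sinh(beta) * (2 * math.cosh(beta)) ** N))

def coth(x):
    return math.cosh(x) / math.sinh(x)

def f_one(M, beta, two_s):
    """f_one(M, beta, s) obtained from Equation (26); assumes M >= 1."""
    bracket = (two_s + 1) * coth((two_s + 1) * beta) - coth(beta)
    if two_s >= M:
        gamma = bracket / two_s  # both branches of (26) agree at 2s = M
    else:
        gamma = (M + 2) / (two_s + 2) * bracket / M  # vanishes for 2s = 0
    return 0.5 * (1 + gamma)

def F_one_opt(N, M, beta):
    """One-qubit fidelity of the optimal purifier, Equation (27)."""
    return sum(weight(N, t, beta) * f_one(M, beta, t)
               for t in range(N % 2, N + 1, 2))

lam = 0.5
beta = math.atanh(lam)
for N in (5, 10, 20, 200):
    print(N, F_one_opt(N, 1, beta), 1 - (1 - lam) / lam ** 2 / (2 * N))
```

The two columns approach each other as $N$ grows, as expected for a leading-order expansion; for small $N$ the neglected terms are still visible.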


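4.3 The all qubit fidelity

As in the one-qubit case the all-qubit fidelity of $T^{\rm nat}$ is an observable rather than a fixed parameter. Hence we have to calculate $F_{\rm all}(T_s^{\rm nat})$ for each fixed $s$. Applying again Equation (21) we get

$$F_{\rm all}(T_s^{\rm nat}) = \operatorname{tr}\big(\sigma^{\otimes 2s} \rho_s(\beta)\big) = \frac{\sinh(\beta)}{\sinh\big((2s+1)\beta\big)}\, e^{2\beta s} = \frac{e^{(2s+1)\beta} - e^{(2s-1)\beta}}{e^{(2s+1)\beta} - e^{-(2s+1)\beta}} = \frac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}}.$$

Using the decomposition of $T^{\rm opt}$ given in Equation (24) we get for the optimal purifier an expression similar to the one in the last subsection:

$$F_{\rm all}(T^{\rm opt}) = \sum_{s \in I[N]} w_N(s) \operatorname{tr}\big[\sigma^{\otimes M}\, T_{s*}^{\rm opt}\, \rho_s(\beta)\big] = \sum_{s \in I[N]} w_N(s) \operatorname{tr}\big[\sigma^{\otimes M}\, \hat Q_{2s*}\, \rho_s(\beta)\big]. \tag{28}$$

However, the calculation of

$$f_{\rm all}(M, \beta, s) := \operatorname{tr}\big[\sigma^{\otimes M}\, \hat Q_{2s*}\, \rho_s(\beta)\big]$$

is now more difficult, since the knowledge of $\hat Q_{2s}(L_3) = \omega(\hat Q_{2s})\, L_3^{(s)}$ is not sufficient in this case. Hence we have to use the explicit form of $\hat Q_{2s}$ in Equations (17) and (18). For $2s < M$ this leads to

$$f_{\rm all}(M, \beta, s) = \frac{2s+1}{M+1} \big\langle \psi^{\otimes M},\, S_M\big(\rho_s \otimes \mathbb{1}^{\otimes(M-2s)}\big) S_M\, \psi^{\otimes M} \big\rangle = \frac{2s+1}{M+1} \big\langle \psi^{\otimes M},\, \big(\rho_s \otimes \mathbb{1}^{\otimes(M-2s)}\big) \psi^{\otimes M} \big\rangle$$

$$\phantom{f_{\rm all}(M, \beta, s)} = \frac{2s+1}{M+1} \big\langle \psi^{\otimes 2s},\, \rho_s\, \psi^{\otimes 2s} \big\rangle = \frac{2s+1}{M+1}\, \frac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}}.$$

For $M \le 2s$ we have to calculate

$$f_{\rm all}(M, \beta, s) = \operatorname{tr}\big[\sigma^{\otimes M}\, \hat Q_{2s*}\, \rho_s(\beta)\big] = \operatorname{tr}\big[\hat Q_{2s}(\sigma^{\otimes M})\, \rho_s(\beta)\big] = \operatorname{tr}\Big[\rho_s(\beta)\, S_{2s}\big[\big(|\psi^{\otimes M}\rangle\langle\psi^{\otimes M}|\big) \otimes \mathbb{1}^{\otimes(2s-M)}\big] S_{2s}\Big]. \tag{29}$$

We will compute the operator $\hat Q_{2s}(\sigma^{\otimes M})$ in occupation number representation. By definition, the basis vector “$|n\rangle$” of the occupation number basis is the normalized version of $S_M \Psi$, where $\Psi$ is a tensor product of $n$ factors $\psi$ and $(M - n)$ factors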


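$\phi$, where $\phi = \binom{0}{1}$ denotes the second basis vector. The normalization factor is easily computed to be

$$S_M\big(\psi^{\otimes n} \otimes \phi^{\otimes(M-n)}\big) = \binom{M}{n}^{-1/2} |n\rangle. \tag{30}$$

We can now expand the “$\mathbb{1}$” in Equation (29) in product basis, and apply (30) (with $2s$ in place of $M$), to find

$$S_{2s}\big[\big(|\psi^{\otimes M}\rangle\langle\psi^{\otimes M}|\big) \otimes \mathbb{1}^{\otimes(2s-M)}\big] S_{2s} = \sum_{K} \binom{2s-M}{K-M} \binom{2s}{K}^{-1} |K\rangle\langle K|.$$

Now $L_3^{(s)}$ is diagonal in this basis, with eigenvalues $m_K = (K - s)$, $K = 0, \ldots, 2s$, so that $\rho_s(\beta)$ has the eigenvalues $\sinh(\beta)\, e^{2\beta(K-s)} / \sinh\big((2s+1)\beta\big) = e^{2\beta(K-2s)}\, (1 - e^{-2\beta}) / (1 - e^{-(4s+2)\beta})$. With $\rho_s(\beta)$ from (12) we get

$$f_{\rm all}(M, \beta, s) = \frac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}} \sum_{K} \binom{2s}{K}^{-1} \binom{2s-M}{K-M}\, e^{2\beta(K-2s)} \qquad \text{for } M \le 2s.$$

Together with

$$\binom{2s}{K}^{-1} \binom{2s-M}{K-M} = \frac{K!\,(2s-K)!}{(2s)!}\, \frac{(2s-M)!}{(K-M)!\,(2s-K)!} = \binom{2s}{M}^{-1} \binom{K}{M}$$

we get

$$f_{\rm all}(M, \beta, s) = \frac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}} \binom{2s}{M}^{-1} \sum_{K} \binom{K}{M}\, e^{2\beta(K-2s)}.$$

Summarizing these calculations we get the following proposition:

Proposition 4.2. The all-qubit fidelity $F_{\rm all}(T^{\rm opt})$ of the optimal purifier is given by

$$F_{\rm all}(T^{\rm opt}) = \sum_{s \in I[N]} w_N(s)\, f_{\rm all}(M, \beta, s) \tag{31}$$

where $f_{\rm all}(M, \beta, s)$ is given by

$$f_{\rm all}(M, \beta, s) = \begin{cases} \dfrac{2s+1}{M+1}\, \dfrac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}} & M > 2s \\[2ex] \dfrac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}} \dbinom{2s}{M}^{-1} \displaystyle\sum_{K} \dbinom{K}{M}\, e^{2\beta(K-2s)} & M \le 2s. \end{cases} \tag{32}$$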

Vol. 2, 2001


17

5 Solution of the optimization problems

Now we are going to prove the following theorem:

Theorem 5.1. The purifier $T^{\rm opt}$ maximizes the fidelities $F_{\rm one}(T)$ and $F_{\rm all}(T)$. Hence the optimal fidelities $F^{\max}_{\rm one}(N,M)$ and $F^{\max}_{\rm all}(N,M)$ defined in Section 2 are given by Equations (27) and (31).

Proof. Note first that the functionals $F_{\rm one}$ and $F_{\rm all}$ are, as infima over continuous functions, upper semicontinuous. Together with the compactness of the set of admissible $T$ this implies that the suprema $F^{\max}_{\#}(N,M)$ from Equation (4) are attained. In other words: optimal purifiers $T$ with $F_{\#}(T) = F^{\max}_{\#}(N,M)$ exist, and we can assume without loss of generality that they are fully symmetric (according to the discussion in Section 3.1). Hence we can apply Equation (21) and the decomposition (15) to get, in analogy to (25) and (28),

$$F_{\rm one}(T) = \frac12\Bigl[1 + \sum_{s\in I[N]} w_N(s)\,\gamma\bigl(T_s^{*}(\rho_s(\beta))\bigr)\Bigr]$$

and

$$F_{\rm all}(T) = \sum_{s\in I[N]} w_N(s)\,\mathrm{tr}\bigl(\sigma^{\otimes M}\,T_s^{*}[\rho_s(\beta)]\bigr). \tag{33}$$

The last two equations show that we have to optimize each component $T_s$ of the purifier $T$ independently. In the one-qubit case this is very easy, because we can use Theorem 3.1 to get $T_s(L_3) = \omega(T_s)\,L_3^{(s)}$ and $\gamma\bigl(T_s^{*}(\rho_s(\beta))\bigr) = \omega(T_s)\,\mathrm{tr}\bigl(L_3^{(s)}\rho_s(\beta)\bigr)$. Hence maximizing $\gamma\bigl(T_s^{*}(\rho_s(\beta))\bigr)$ is equivalent to maximizing $\omega(T_s)$. But we have according to Theorem 3.1

$$\max_{T}\,\omega(T_s) = \omega(\hat Q_{2s}) = \begin{cases}
\dfrac{M}{2s} & \text{for } 2s \ge M \\[2ex]
\dfrac{M+2}{2(s+1)} & \text{for } 2s < M,
\end{cases}$$

which shows that $F^{\max}_{\rm one}(N,M) = F_{\rm one}(T^{\rm opt})$ holds as stated.

For the many-qubit test version the proof is slightly more difficult. However, as in the $F_{\rm one}$ case we can solve the optimization problem for each summand in Equation (33) separately. First of all this means that we can assume without loss of generality that $T_s^{*}$ takes its values in $\mathcal{B}(\mathcal{H}_+^{\otimes M})$, because the functional

$$f_s(T_s) := \mathrm{tr}\bigl(\sigma^{\otimes M}\,T_s^{*}[\rho_s(\beta)]\bigr), \tag{34}$$

which we have to maximize, depends only on this part of the operation. Full symmetry implies in addition that $T_s^{*}(\rho_s(\beta))$ is diagonal in occupation number basis (see Equation (30)), because $T_s^{*}(\rho_s(\beta))$ commutes with each $\pi_{s'}(U)$ ($s' = M/2$, $U\in\mathrm{U}(2)$) whenever $\pi_s(U)$ commutes with $\rho_s(\beta)$. If $M > 2s$ this means we have $T_s^{*}(\rho_s(\beta)) = \kappa_*\,\sigma^{\otimes M} + r_*$, where $r_*$ is a positive operator with $\sigma^{\otimes M} r_* = r_*\sigma^{\otimes M} = 0$. Inserting this into (34) we see that $f_s(T_s) = \kappa_*$. Hence we have to maximize $\kappa_*$. The first step is an upper bound, which we get from the fact that $\mathrm{tr}\bigl(\sigma^{\otimes 2s}\rho_s(\beta)\bigr)\mathbb{1} - \rho_s(\beta)$ is a positive operator. Since $T_s^{*}(\mathbb{1}) = \frac{2s+1}{M+1}\mathbb{1}$ (another consequence of full symmetry) we have

$$0 \le T_s^{*}\Bigl[\mathrm{tr}\bigl(\sigma^{\otimes 2s}\rho_s(\beta)\bigr)\mathbb{1} - \rho_s(\beta)\Bigr] = \frac{2s+1}{M+1}\,\mathrm{tr}\bigl(\sigma^{\otimes 2s}\rho_s(\beta)\bigr)\mathbb{1} - \kappa_*\,\sigma^{\otimes M} - r_*.$$

Multiplying this equation with $\sigma^{\otimes M}$ and taking the trace we get

$$\kappa_* \le \frac{2s+1}{M+1}\,\mathrm{tr}\bigl(\sigma^{\otimes 2s}\rho_s(\beta)\bigr). \tag{35}$$

However, calculating $f_s(T_s^{\rm opt})$ we see that this upper bound is achieved; in other words, $T_s^{\rm opt}$ maximizes $f_s$.

If $M \le 2s$ holds we have to use slightly different arguments, because the estimate (35) is too weak in this case. However, we can consider in Equation (34) the dual $T_s$ instead of $T_s^{*}$ and then use similar arguments. In fact, for each covariant $T_s$ the quantity $T_s(\sigma^{\otimes M})$ is, for the same reasons as $T_s^{*}(\rho_s(\beta))$, diagonal in the occupation number basis, and we get $T_s(\sigma^{\otimes M}) = \kappa\,\sigma^{\otimes 2s} + r$, where $r$ is again a positive operator with $r = \sum_{n=0}^{2s-1} r_n\,|n\rangle\langle n|$ ($|n\rangle$ denotes again the occupation number basis) and $\kappa$ is a positive constant. Since $T_s$ is unital we get from $\mathbb{1} - \sigma^{\otimes M} \ge 0$ the estimate $0 \le \kappa \le 1$ in the same way as Equation (35). Calculating $T_s^{\rm opt}(\sigma^{\otimes M})$ shows again that the upper bound $\kappa = 1$ is indeed achieved; however, it is now not clear whether maximizing $\kappa$ is equivalent to maximizing $f_s(T_s)$. Hence let us show first that $\kappa = 1$ is necessary for $f_s(T_s)$ to be maximal. This follows basically from the fact that $T_s$ is, up to a multiplicative constant, trace preserving. In fact we have

$$\mathrm{tr}\,T_s(\sigma^{\otimes M}) = \mathrm{tr}\bigl(T_s(\sigma^{\otimes M})\,\mathbb{1}\bigr) = \mathrm{tr}\bigl(\sigma^{\otimes M}\,T_s^{*}(\mathbb{1})\bigr) = \frac{2s+1}{M+1}.$$

This means in particular that $\kappa + \mathrm{tr}(r) = (2s+1)/(M+1)$ holds, i.e. decreasing $\kappa$ by $0 < \epsilon < 1$ is equivalent to increasing $\mathrm{tr}(r)$ by the same $\epsilon$. Taking into account that $\rho_s(\beta) = \sum_{n=0}^{2s} h_n\,|n\rangle\langle n|$ holds with $h_n = \exp\bigl(2\beta(n-s)\bigr)$, we see that reducing $\kappa$ by $\epsilon$ reduces $f_s(T_s)$ at least by

$$\epsilon\Bigl[\mathrm{tr}\bigl(\sigma^{\otimes 2s}\rho_s(\beta)\bigr) - \mathrm{tr}\bigl(|2s-1\rangle\langle 2s-1|\,\rho_s(\beta)\bigr)\Bigr] = \epsilon\bigl(h_{2s} - h_{2s-1}\bigr) = \epsilon\bigl(e^{2\beta s} - e^{2\beta(s-1)}\bigr) > 0.$$

Therefore $\kappa = 1$ is necessary. The last question we have to answer is how the rest term $r$ has to be chosen for $f_s(T_s)$ to be maximal. To this end let us consider the slightly modified fidelity $\tilde f_s(T_s) = \mathrm{tr}\bigl(T_s(\sigma^{\otimes M})\,\sigma^{\otimes 2s}\bigr)$ (which is in fact related to optimal cloning; see [1] and Section 3.4). In contrast to $f_s(T_s)$, it is maximized iff $\kappa = 1$. However, the operation which maximizes $\tilde f_s(T_s)$ is obviously the optimal $M \to 2s$ cloner (up to normalization), which is unique according to [2]. This implies that $\kappa = 1$ fixes $T_s$ already. Together with the facts that $\kappa = 1$ is necessary for $f_s(T_s)$ to be maximal and that $\kappa = 1$ is realized for $T_s^{\rm opt}$, we conclude that $\max f_s(T_s) = f_s(T_s^{\rm opt})$ holds, which proves the assertion.

6 Asymptotic behaviour

Now we want to analyze the rate at which nearly perfect purified qubits can be produced in the limit $N\to\infty$. To this end we have to compute the asymptotic behaviour of various expectations involving $s$. It turns out that it is much better not to work with the explicit expressions for these expectations, which are sums over terms with many binomial coefficients, but to go back to the definition and use general properties of expectations of $\rho^{\otimes N}$. This has the added advantage of being easily generalized to Hilbert space dimensions $d > 2$, so we expect the method to be useful in its own right. We collect the basic statements in the following subsection, applying them to the concrete expressions in subsequent ones.

6.1 Convergence of weights to a point measure

In the classical case the general theory alluded to above is nothing but the theory of asymptotic distributions for independent identically distributed random variables (laws of large numbers of various sorts). In the quantum case this theory has been developed in the context of the statistical mechanics of general mean-field systems [5]. Of this theory we need only the simplest aspects (convergence to a point measure), and not the more advanced "Large Deviation" parts, in which it is shown how the probability of deviations from the limit decreases exponentially fast.

Consider operators of the form $A_N = (1/N)\sum_{i=1}^{N} a^{(i)}$, where $a^{(i)}$ denotes the copy of a fixed operator $a$ on $\mathcal{H}$ acting in the $i$th tensor factor of $\mathcal{H}^{\otimes N}$. It is clear that the expectations $\mathrm{tr}(\rho^{\otimes N}A_N) = \mathrm{tr}(\rho a)$ are independent of $N$. Now consider products of a finite number of such operators and expand the expectation into the average over all terms of the form $\mathrm{tr}(\rho^{\otimes N}a^{(i)}b^{(j)}c^{(k)}\cdots)$. It is easy to see that for large $N$ the majority of these terms will be such that all indices $i, j, k, \dots$ are different, and for such terms the above expression is equal to $\mathrm{tr}(\rho a)\,\mathrm{tr}(\rho b)\,\mathrm{tr}(\rho c)\cdots$. So this will be the limit of the expectation of the product $A_N B_N C_N\cdots$ as $N\to\infty$ (for precise combinatorial estimates, see [5]). Of course, this allows us to compute the asymptotic expectations for arbitrary polynomials and, by taking suitable limits, of arbitrary continuous functions of Hermitian operators. There is an abstract non-commutative functional calculus describing exactly these possibilities (see the appendix of [5]). However, for our purposes it is sufficient to say that all combinations of algebraic operations and continuous functions of a Hermitian variable (evaluated in the usual spectral functional calculus) are in this class.

For the case at hand, note that the angular momentum operators $L_k$ as in Equation (13) are of the form $N A_N$. Therefore, for any sequence of functions $f_N$ of three non-commuting arguments (this means that in writing out $f_N$ we have to keep track of operator ordering) which converges to a limit function $f_\infty$, we get

$$\lim_{N\to\infty}\mathrm{tr}\Bigl(\rho^{\otimes N} f_N\bigl(\tfrac{L_1}{N},\tfrac{L_2}{N},\tfrac{L_3}{N}\bigr)\Bigr) = f_\infty\Bigl(\mathrm{tr}(\rho\tfrac{\sigma_1}{2}),\mathrm{tr}(\rho\tfrac{\sigma_2}{2}),\mathrm{tr}(\rho\tfrac{\sigma_3}{2})\Bigr). \tag{36}$$

Note that the function $f_\infty$ is just evaluated on numbers (operators on a one-dimensional space), so all operator ordering problems disappear in the limit. This is the huge simplification which makes mean-field theory so accessible. The limit formula will be applied to functions of "$2s$", the number of outputs from the natural purifier, which can itself be written as a function of this sort. It is, of course, constant on each summand of the decomposition (11), so it is a function of the Casimir operator $\vec L^2 = s(s+1)$:

$$\begin{aligned}
\frac{2s}{N} &= g_N\bigl(\tfrac{L_1}{N},\tfrac{L_2}{N},\tfrac{L_3}{N}\bigr) = \sqrt{4(\vec L/N)^2 + N^{-2}} - \frac1N, \\
g_\infty(x_1,x_2,x_3) &= \lim_{N\to\infty}\Bigl(\sqrt{4\vec x^{\,2} + N^{-2}} - \frac1N\Bigr) = 2|\vec x|, \\
g_\infty\Bigl(\mathrm{tr}(\rho\tfrac{\sigma_1}{2}),\mathrm{tr}(\rho\tfrac{\sigma_2}{2}),\mathrm{tr}(\rho\tfrac{\sigma_3}{2})\Bigr) &= g_\infty(0,0,\lambda/2) = \lambda = \tanh\beta,
\end{aligned} \tag{37}$$

when $\rho = \rho(\beta)$ is given by Equation (12). Functions of $g$ then also lie in the relevant functional calculus, so we get the following statement, tailored to our needs in the following subsections. In it we have already incorporated further, straightforward approximation arguments, using uniformly convergent sequences of continuous functions to establish upper and lower bounds separately.

Lemma 6.1. Let $f_N : (0,1)\to\mathbb{R}$, $N\in\mathbb{N}$, be a uniformly bounded sequence of continuous functions, converging uniformly on a neighborhood of $\lambda = \mathrm{tr}(\rho(\beta)\sigma_3)$ to a continuous function $f_\infty$, and let $w_N(s)$ denote the weights in Equation (14). Then

$$\lim_{N\to\infty}\sum_{s\in I[N]} w_N(s)\,f_N(2s/N) = f_\infty(\lambda). \tag{38}$$

In the language of measure theory this says that the probability measures $\sum_s w_N(s)\,\delta(x - 2s/N)\,dx$ on the interval $[0,1]$ converge to the point measure $\delta(x-\lambda)\,dx$. Graphically, this is shown in Figure 3.
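The concentration of the weights described by Lemma 6.1 can be illustrated numerically. The sketch below is not taken from the paper: Equation (14) is not reproduced in this section, so the closed form used for $w_N(s)$ is an assumption, namely the multiplicity of spin $s$ in $N$ spin-1/2 systems, $\frac{2s+1}{N/2+s+1}\binom{N}{N/2-s}$, times $\sinh((2s+1)\beta)/\bigl(\sinh\beta\,(2\cosh\beta)^N\bigr)$. The function names are hypothetical.

```python
import math

def log_weight(N, s, beta):
    """log w_N(s) under the ASSUMED closed form
    w_N(s) = mult(N, s) * sinh((2s+1) beta) / (sinh(beta) * (2 cosh(beta))**N)."""
    k = N / 2 - s  # distance of s from the maximal spin N/2
    # log of the multiplicity (2s+1)/(N/2+s+1) * binom(N, N/2 - s)
    log_mult = (math.log(2 * s + 1) - math.log(N / 2 + s + 1)
                + math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1))
    x = (2 * s + 1) * beta
    log_sinh_x = x + math.log1p(-math.exp(-2 * x)) - math.log(2)  # stable log(sinh x)
    return (log_mult + log_sinh_x
            - math.log(math.sinh(beta)) - N * math.log(2 * math.cosh(beta)))

def weights(N, beta):
    """List of pairs (2s/N, w_N(s)) for s = N/2, N/2 - 1, ..., down to 0 or 1/2."""
    out = []
    for k in range(N // 2 + 1):
        s = N / 2 - k
        out.append((2 * s / N, math.exp(log_weight(N, s, beta))))
    return out

def expectation(f, N, beta):
    """Left-hand side of (38) for an N-independent test function f."""
    return sum(w * f(x) for x, w in weights(N, beta))

lam = 0.5
beta = math.atanh(lam)
for N in (10, 100, 1000):
    print(N, expectation(lambda x: x, N, beta))  # approaches lam as N grows
```

Working with logarithms avoids overflow in the binomial coefficients and in $(2\cosh\beta)^N$ for large $N$.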

6.2 The one particle test

Let us analyze first the behaviour of the optimal one-qubit fidelity $F^{\max}_{\rm one}(N,M)$ in the limit $M\to\infty$. Obviously only the $M > 2s$ case of $f_{\rm one}(M,\beta,s)$ is relevant in this situation and we get, together with Equation (27), the expression

$$F^{\max}_{\rm one}(N,\infty) = \sum_{s\in I[N]} w_N(s)\,\frac12\Bigl[1 + \frac{1}{2s+2}\Bigl((2s+1)\coth\bigl((2s+1)\beta\bigr) - \coth\beta\Bigr)\Bigr],$$

which obviously takes its values between 0 and 1. To take the limit $N\to\infty$ we can write

$$\lim_{N\to\infty}F^{\max}_{\rm one}(N,\infty) = \lim_{N\to\infty}\sum_{s\in I[N]} w_N(s)\,f_{N,\infty}\bigl(\tfrac{2s}{N}\bigr)$$

with

$$f_{N,\infty}(x) = \frac12\Bigl[1 + \frac{1}{Nx+2}\Bigl((Nx+1)\coth\bigl((Nx+1)\beta\bigr) - \coth\beta\Bigr)\Bigr].$$

The functions $f_{N,\infty}$ are continuous, bounded and converge on each interval $(\epsilon,1)$ with $0 < \epsilon < 1$ uniformly to $f_{\infty,\infty}\equiv 1$. Hence the assumptions of Lemma 6.1 are fulfilled and we get

$$\lim_{N\to\infty}F^{\max}_{\rm one}(N,\infty) = f_{\infty,\infty}(\lambda) = 1$$

Figure 3: Convergence of $w_N(s)$ to a point measure ($\lambda = 0.5$, $N = 10, 100, 1000$). Discrete points joined, and rescaled for total area 1.

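as already stated in Section 2. This means that we can produce arbitrarily good purified qubits at infinite rate if we have enough input systems.

To analyze how fast the quantity $F^{\max}_{\rm one}(N,\infty)$ approaches 1 as $N\to\infty$ let us consider the limit

$$\lim_{N\to\infty}N\bigl(1 - F^{\max}_{\rm one}(N,\infty)\bigr) = \lim_{N\to\infty}\sum_{s\in I[N]} w_N(s)\,\tilde f_{N,\infty}\bigl(\tfrac{2s}{N}\bigr) \equiv \frac{c_\infty}{2} \tag{39}$$

with $\tilde f_{N,\infty} = N(1 - f_{N,\infty})$. The existence of this limit is equivalent to the asymptotic formula

$$F^{\max}_{\rm one}(N,\infty) = 1 - \frac{c_\infty}{2N} + o\Bigl(\frac1N\Bigr),$$

where, as usual, $o\bigl(\frac1N\bigr)$ stands for terms going to zero faster than $\frac1N$. Lemma 6.1 leads to $c_\infty/2 = \tilde f_{\infty,\infty}(\lambda)$ with $\tilde f_{\infty,\infty} = \lim_{N\to\infty}\tilde f_{N,\infty}$ uniformly on $(\epsilon,1)$. To calculate $\tilde f_{\infty,\infty}$ note that

$$\tilde f_{N,\infty}(x) = \frac12\Bigl[\frac{N}{Nx+2} + \frac{N\coth\beta}{Nx+2}\Bigr] + \text{Rest}$$

holds, where "Rest" is a term which vanishes exponentially fast as $N\to\infty$. Hence $\tilde f_{\infty,\infty}(x) = (1+\coth\beta)/(2x)$, and with $\coth\beta = 1/\lambda$ we get

$$c_\infty = 2\tilde f_{\infty,\infty}(\lambda) = \frac{1+\lambda}{\lambda^2}.$$

The asymptotic behaviour of $F^{\max}_{\rm one}(N,1)$ can be analyzed in the same way. The only difference is that we now have to consider the $1 = M \le 2s$ branch of Equation (26). In analogy to Equation (39) we have to look at

$$\lim_{N\to\infty}N\bigl(1 - F^{\max}_{\rm one}(N,1)\bigr) = \lim_{N\to\infty}\sum_{s\in I[N]} w_N(s)\,\tilde f_{N,1}\bigl(\tfrac{2s}{N}\bigr) = \frac{c_1}{2}$$

with $\tilde f_{N,1} = N(1 - f_{N,1})$ and

$$f_{N,1}(x) = \frac12\Bigl[1 + \frac{1}{Nx}\Bigl((Nx+1)\coth\bigl((Nx+1)\beta\bigr) - \coth\beta\Bigr)\Bigr].$$

For $\tilde f_{\infty,1}$ we get

$$\tilde f_{\infty,1}(x) = \frac12\Bigl(-\frac1x + \frac{1}{x\lambda}\Bigr). \tag{40}$$

Using again Lemma 6.1 leads to

$$c_1 = 2\tilde f_{\infty,1}(\lambda) = \frac{1-\lambda}{\lambda^2}.$$

Finally let us consider $F^{\max}_{\rm one}(N,0)$. Here the situation is easier than in the other cases, because $F^{\max}_{\rm one}(N,0)$ equals the fidelity of the best possible output of the natural purifier, i.e.

$$F^{\max}_{\rm one}(N,0) = \frac12\Bigl[1 + \frac1N\Bigl((N+1)\coth\bigl((N+1)\beta\bigr) - \coth\beta\Bigr)\Bigr] = f_{N,1}(1).$$

Hence we only need the asymptotic behaviour of $f_{N,1}(x)$ at $x = 1$. Using Equation (40) we get

$$F^{\max}_{\rm one}(N,0) = 1 - \frac{1-\lambda}{\lambda}\,\frac{1}{2N} + \cdots.$$

This concludes the proof of Equations (6) to (9).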
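The two constants $c_\infty$ and $c_1$ can be checked numerically by evaluating $N(1 - f_{N,\cdot}(\lambda))$ for a large $N$. This is a minimal sketch, assuming that both branches have the form $\frac12[1 + (\dots)]$, which is what makes the fidelity approach 1. Evaluating at the single point $x = \lambda$ stands in for the weighted sum, as Lemma 6.1 permits in the limit. The function names are illustrative only.

```python
import math

def coth(x):
    return 1.0 / math.tanh(x)

def f_N_inf(N, x, beta):
    """f_{N,infinity}(x): the M -> infinity branch evaluated at 2s = N x."""
    n = N * x
    return 0.5 * (1.0 + ((n + 1) * coth((n + 1) * beta) - coth(beta)) / (n + 2))

def f_N_1(N, x, beta):
    """f_{N,1}(x): the 1 = M <= 2s branch evaluated at 2s = N x."""
    n = N * x
    return 0.5 * (1.0 + ((n + 1) * coth((n + 1) * beta) - coth(beta)) / n)

lam = 0.5
beta = math.atanh(lam)   # lambda = tanh(beta)
N = 10**6

c_inf_numeric = 2 * N * (1 - f_N_inf(N, lam, beta))
c_1_numeric = 2 * N * (1 - f_N_1(N, lam, beta))

print(c_inf_numeric, (1 + lam) / lam**2)  # both close to 6 for lam = 0.5
print(c_1_numeric, (1 - lam) / lam**2)    # both close to 2 for lam = 0.5
```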

6.3 The many particle test

Consider now the many-qubit fidelity $F_{\rm all}$. Although, like $F_{\rm one}$, it lies between zero and one, and would attain the value 1 precisely for a (non-existent) ideal purifier, both quantities behave quite differently when we use them to compare states in systems of varying size. We are looking here at the two kinds of fidelities for an $M$-particle output state $\rho_M$ with respect to a one-particle pure state given by the vector $\psi$, namely

$$F_{\rm all} = \langle\psi^{\otimes M},\rho_M\,\psi^{\otimes M}\rangle = \mathrm{tr}\bigl(\rho_M\,|\psi^{\otimes M}\rangle\langle\psi^{\otimes M}|\bigr),$$

and

$$F_i = \langle\psi,\rho_M^{(i)}\psi\rangle = \mathrm{tr}\bigl(\rho_M\,\mathbb{1}\otimes\cdots\otimes(|\psi\rangle\langle\psi|)_i\otimes\cdots\otimes\mathbb{1}\bigr),$$

where $\rho_M^{(i)}$ denotes the restriction of $\rho_M$ to the $i$th tensor factor. Let $p_{\rm all}$ and $p_i$ denote the projections whose $\rho_M$-expectations appear on the right hand side of these equations. These projections commute, and $p_{\rm all}$ is the intersection (in the commuting case: the product) of the $p_i$ in the lattice of projections. This corresponds to the union of the respective complements, i.e.,

$$\mathbb{1} - p_i \le \mathbb{1} - p_{\rm all} \le \sum_i(\mathbb{1} - p_i).$$

Taking expectations with respect to $\rho_M$, we find that $\sup_i(1 - F_i) \le (1 - F_{\rm all}) \le \sum_i(1 - F_i) \le M\sup_i(1 - F_i)$. For the two figures of merit introduced in Section 1 this implies

$$\bigl(1 - F_{\rm one}(T)\bigr) \le \bigl(1 - F_{\rm all}(T)\bigr) \le M\bigl(1 - F_{\rm one}(T)\bigr), \tag{41}$$

for every purifying device $T$. Hence, for fixed $N$ the two figures of merit are equivalent to within a factor $M$. But the upper bound becomes meaningless in the limit $M\to\infty$, so it is not clear at all whether we can bring the fidelity $F_{\rm all}(T)$ close to one for an increasing number of outputs. As a consequence of this analysis it is necessary to perform the limit $N, M\to\infty$ more carefully than in the one-qubit case. We will therefore consider the limits $N\to\infty$ and $M\to\infty$ simultaneously, while the quotient $M/N$ approaches a constant $\mu$, i.e. we will calculate the function $\Phi(\mu)$ defined in Equation (10). The first step in this context is the following lemma, which allows us to handle the term $\binom{2s}{M}^{-1}\sum_K\binom{K}{M}e^{2\beta(K-s)}$ in Equation (32).

Lemma 6.2. For integers $M\le K$ and $z\in\mathbb{C}$, define

$$\Phi(K,M,z) = \binom{K}{M}^{-1}\sum_{R=M}^{K}\binom{R}{M}\,z^{K-R}.$$

Then, for $|z| < 1$ and $0\le c\le 1$:

$$\lim_{\substack{M,K\to\infty\\ M/K\to c}}\Phi(K,M,z) = \frac{1}{1-(1-c)z}.$$

Proof. We substitute $R\to(K-R)$ in the sum, and get

$$\Phi(K,M,z) = \sum_{R=0}^{\infty}c(K,M,R)\,z^R,$$

where coefficients with $M + R > K$ are defined to be zero. We can write the non-zero coefficients as

$$\begin{aligned}
c(K,M,R) &= \binom{K}{M}^{-1}\binom{K-R}{M} = \frac{(K-M)!\,(K-R)!}{K!\,(K-R-M)!} \\
&= \frac{(K-M)}{K}\,\frac{(K-M-1)}{(K-1)}\cdots\frac{(K-M-R+1)}{(K-R+1)} = \prod_{S=0}^{R-1}\Bigl(1 - \frac{M}{K-S}\Bigr).
\end{aligned}$$

Since $0\le c(K,M,R)\le 1$ for all $K, M, R$, the series for different values of $M, K$ are all dominated by the geometric series, and we can go to the limit termwise, for every $R$ separately. In this limit we have $M/(K-S)\to c$ for every $S$, and hence $c(K,M,R)\to(1-c)^R$. The limit series is again geometric, with quotient $(1-c)z$, and we get the result.

To calculate now $\Phi(\mu)$ recall that the weights $w_N(s)$ approach a point measure in $2s/N =: x$ concentrated at $\lambda = \mathrm{tr}(\rho(\beta)\sigma_3)$. This means that in Equation (31) only the term with $2s = \lambda N$ survives the limit. Hence if $\mu\ge\lambda$ we get $M\ge\lambda N = 2s$. Using Equation (32) and Lemma 6.1 we get in this case

$$\Phi(\mu) = \frac{\lambda}{\mu}\,(1 - e^{-2\beta}).$$

We see that $\Phi(\mu)\to 0$ for $\mu\to\infty$ and $\Phi(\mu)\to 1-\exp(-2\beta)$ for $\mu\to\lambda$. If $0<\mu<\lambda$ we get $M<\lambda N = 2s$, which means we have to choose the $M\le 2s$ branch of Equation (32) for $f_{\rm all}(M,\beta,s)$. With Lemma 6.2 and Lemma 6.1 we get

$$\Phi(\mu) = \frac{1-e^{-2\beta}}{1-(1-\mu/\lambda)\,e^{-2\beta}},$$

which approaches 1 if $\mu\to 0$ and $1-\exp(-2\beta)$ if $\mu\to\lambda$. Writing this in terms of $\lambda = \tanh\beta$, we obtain Equation (10).
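The limit in Lemma 6.2 is easy to probe numerically using the product form of the coefficients $c(K,M,R)$ from the proof. The following sketch uses illustrative function names and compares the finite sum with the geometric-series limit for a real $z$.

```python
def phi_finite(K, M, z):
    """Phi(K, M, z) = sum_{R=0}^{K-M} c(K, M, R) z^R with
    c(K, M, R) = prod_{S=0}^{R-1} (1 - M/(K - S)), as in the proof of Lemma 6.2."""
    total, coeff, power = 0.0, 1.0, 1.0
    for R in range(K - M + 1):
        total += coeff * power
        if R < K - M:
            coeff *= 1.0 - M / (K - R)  # extend the product by the factor S = R
        power *= z
    return total

def phi_limit(c, z):
    """Claimed limit 1 / (1 - (1 - c) z) for M/K -> c and |z| < 1."""
    return 1.0 / (1.0 - (1.0 - c) * z)

z = 0.5
for K in (20, 200, 2000):
    M = K // 2
    print(K, phi_finite(K, M, z), phi_limit(0.5, z))
```

Building the coefficients multiplicatively avoids evaluating large binomial coefficients, which is the same simplification the proof exploits.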

6.4 Estimating the many particle fidelity in terms of one particle

In Section 2 we motivated the observation that the best all-particle fidelity is a function of the rate (and not identically equal to 1) by estimating the all-particle fidelity in terms of the one-particle fidelity. Since the latter quantity tends to be more easily computable, it is of some interest for further investigations how good that estimate actually is. The estimate mentioned in the text before Equation (10) amounts to

$$\Phi(\mu) \ge 1 - \frac{\mu}{2}\,c_\infty = 1 - \frac{\mu(\lambda+1)}{2\lambda^2}. \tag{42}$$

However, the same basic estimate via Equation (41) gives even more information:

$$\begin{aligned}
\Phi(\mu) &\ge 1 - \lim_{\substack{N\to\infty\\ M/N\to\mu}} M\bigl(1 - F^{\max}_{\rm one}(N,M)\bigr) \\
&\ge 1 - \mu\lim_{N\to\infty}\sum_{s\in I[N]} w_N(s)\,N\bigl(1 - f_{\rm one}(\mu N,\beta,s)\bigr) \\
&= \begin{cases}
1 - \dfrac{\mu(1-\lambda)}{2\lambda^2} & \text{if } \mu\le\lambda \\[2ex]
2 - \dfrac{\mu(1+\lambda)}{2\lambda^2} & \text{if } \mu\ge\lambda,
\end{cases}
\end{aligned} \tag{43}$$

where the evaluation of the limit was carried out with the same technique based on Lemma 6.1 used in the previous sections. Figure 4 displays the lower bounds (42) and (43) together with the exact result (10).

Figure 4: The lower bounds (42) and (43) together with the exact result (10) for the all-particle test fidelity as a function of the rate ($\lambda = 0.5$).

It is apparent that these bounds are rather weak, and in fact completely trivial for large rates. Hence all-particle fidelities contain new and independent information about purification processes, which is not already contained in their one-particle counterparts.
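The comparison shown in Figure 4 can be reproduced with a few lines of code. The sketch below evaluates $\Phi(\mu)$ in the two-branch form derived in Section 6.3 (with $e^{-2\beta} = (1-\lambda)/(1+\lambda)$) and the two lower bounds (42) and (43). The function names are illustrative.

```python
import math

def phi_exact(mu, lam):
    """Limiting all-particle fidelity Phi(mu) from Section 6.3, lam = tanh(beta)."""
    q = math.exp(-2.0 * math.atanh(lam))  # e^{-2 beta} = (1 - lam)/(1 + lam)
    if mu >= lam:
        return (lam / mu) * (1.0 - q)
    return (1.0 - q) / (1.0 - (1.0 - mu / lam) * q)

def bound_42(mu, lam):
    """Lower bound (42)."""
    return 1.0 - mu * (lam + 1.0) / (2.0 * lam**2)

def bound_43(mu, lam):
    """Lower bound (43), with its two branches."""
    if mu <= lam:
        return 1.0 - mu * (1.0 - lam) / (2.0 * lam**2)
    return 2.0 - mu * (1.0 + lam) / (2.0 * lam**2)

lam = 0.5
for mu in (0.1, 0.3, 0.5, 0.7, 1.0):
    print(mu, bound_42(mu, lam), bound_43(mu, lam), phi_exact(mu, lam))
```

For $\lambda = 0.5$ the bound (43) is already negative at $\mu = 1$, which is the sense in which the bounds become trivial at large rates.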


Acknowledgements

We acknowledge several rounds of email discussions with Ignacio Cirac, which helped to clarify the precise relation between this work and [3].

References

[1] R.F. Werner, Optimal cloning of pure states, Phys. Rev. A 58, 980–1003 (1998).
[2] M. Keyl and R.F. Werner, Optimal cloning of pure states, testing single clones, J. Math. Phys. 40, 3283–3299 (1999).
[3] J.I. Cirac, A.K. Ekert and C. Macchiavello, Optimal purification of single qubits, Phys. Rev. Lett. 82, 4344–4347 (1999).
[4] B. Simon, Representations of finite and compact groups, American Mathematical Society, Providence (1996).
[5] G.A. Raggio and R.F. Werner, Quantum statistical mechanics of general mean field systems, Helv. Phys. Acta 62, 980–1003 (1989).

M. Keyl and R. F. Werner
Institut für Mathematische Physik
TU Braunschweig
Mendelssohnstr. 3
D-38106 Braunschweig, Germany

Communicated by Vincent Rivasseau
submitted 24/01/00, accepted 25/09/00




Dissociation of Homonuclear Relativistic Molecular Ions

Rafael Benguria, Heinz Siedentop, Edgardo Stockmeyer

Abstract. We give lower bounds to the 'size' of diatomic homonuclear relativistic molecules which are modeled by the Herbst operator. We also show that, as in the non-relativistic case, the absence of sufficiently many electrons leads to the dissociation of the molecule. To obtain these results we found new bounds for the localization error in the semirelativistic approach.

1 Introduction

Experimentally, atoms and molecules are known to have only finitely many electrons. For example, stable doubly charged negative ions are not known experimentally. It is a mathematical challenge to show this physical fact. It has been conjectured for several years that the charge exceeding that of the electrically neutral molecule should be bounded by a universal constant times the number of nuclei involved. This is an unresolved question. However, many results in this direction have been obtained in the context of non-relativistic quantum mechanics, among them that of Lieb [5].

On the other hand it is physically obvious that, for molecules, the number of electrons cannot be too low either, since the Coulomb forces would drive the nuclei apart. Pioneering work in this direction has been done by Ruskai [9, 10]. Later improvements are due to Solovej [11] and Alarcón et al. [2]. In particular the last mentioned paper gives an upper bound on the nuclear charges of a homonuclear diatomic molecule. The purpose of this paper is an extension of the result of Alarcón et al. to the case when the underlying dynamics is no longer non-relativistic but relativistic.

For definiteness we now define the model to be treated, namely the (pseudo-)relativistic Herbst Hamiltonian $H_R$ of $N$ spinless electrons of mass $m$ and charge $-e$ in the field of two nuclei with the same nuclear charge $Ze$ separated by a distance $2R$:

$$H_R = \sum_{i=1}^{N}\Bigl[\sqrt{-\hbar^2c^2\Delta_i + m^2c^4} - \frac{Z\alpha}{|x_i - \mathbf{R}|} - \frac{Z\alpha}{|x_i + \mathbf{R}|}\Bigr] + \sum_{1\le i<j\le N}\frac{\alpha}{|x_i - x_j|} + \frac{Z^2\alpha}{2R}, \tag{1}$$

which is selfadjointly realized in $L^2(\mathbb{R}^{3N})$ and where $\mathbf{R} = (R,0,0)$ is the location of one nucleus and $\alpha = e^2/(\hbar c)$ is the Sommerfeld fine structure constant having the physical value of about 1/137. The lowest occurring energy for a given distance $2R$ between the nuclei is

$$E(R) = \inf\{(\psi,H_R\psi)\mid\psi\in C_0^\infty(\mathbb{R}^{3N}),\ \|\psi\| = 1\}. \tag{2}$$

Note that we disregard any symmetry of the underlying state space, i.e., we look at boltzonic electrons: particles for which there is no requirement on the symmetry of the states, and which have the same ground state energy as bosonic atoms. The ground state energy $E$ is given by

$$E = \inf\{E(R)\mid R>0\}. \tag{3}$$

The binding energy of the molecule is defined as the ground state energy of $H_R$ minus the energy of the split system,

$$E_b(N,R,Z) = \inf\{E(R)\mid R>0\} - E_s, \tag{4}$$

where $E_s$ is the lowest energy that results from separating the system into two parts. The molecule dissociates (is unstable) if $E_b$ is nonnegative. Since this condition is independent of the mass (for $m>0$), we use $m = 1$ in the following.

The structure of the paper is as follows: In Section 2 we give a minimal distance for the nuclei of a stable molecule. Technically this is expressed in Lemma 1. In Section 3 we use Lemma 1 to find a lower bound on the nuclear charge beyond which one-electron diatomic molecules dissociate. Finally, in Section 4 we use all the results above to find a lower bound on $Z/N$ in the general $N$ electron case.

2 A Lower Bound on the Size of the Molecule

Lemma 1. Consider a diatomic homonuclear molecule and assume the nuclei of atomic number $Z$ are located at $\mathbf{R} = (R,0,0)$ and $-\mathbf{R}$. If $E(R)$ as defined in (2) has a minimum at $R_0>0$ and $Z\alpha<\frac14$, then

$$R_0 \ge \frac{Z^2\alpha}{2N}\Biggl[1 - \Biggl[1 + \frac{(Z\alpha)^2}{\bigl[\frac14 + \sqrt{\frac1{16} - (Z\alpha)^2}\bigr]^2}\Biggr]^{-1/2}\Biggr]^{-1}. \tag{5}$$

Note that we are working in units where $m = \hbar = c = 1$, i.e., $\alpha = e^2$.

Proof. The Hamiltonian that describes the molecule is given by (1). The key ingredients in our proof are the Feynman-Hellmann formula, the virial theorem for the Herbst operator, the Rayleigh-Ritz principle, and the estimate from below for the ground state of a hydrogenic atom by Martin and Roy [7]. As shorthand notation let $x\in\mathbb{R}^{3N}$ be the vector $(x_1,x_2,\dots,x_N)$ and

$$V(x) = \sum_{i=1}^{N}\Bigl[-\frac{Z\alpha}{|x_i - R\hat n|} - \frac{Z\alpha}{|x_i + R\hat n|}\Bigr] + \sum_{1\le i<j\le N}\frac{\alpha}{|x_i - x_j|}. \tag{6}$$

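We denote by

$$x\cdot\nabla_x V = \sum_{i=1}^{N}x_i\cdot\nabla_{x_i}V. \tag{7}$$

It is simple to check the following identity:

$$\frac1R\,(x\cdot\nabla V + V) = \sum_{i=1}^{N}\Bigl[\frac{Z\alpha}{|x_i - R\hat n|^3}\,(x_i - R\hat n)\cdot\hat n - \frac{Z\alpha}{|x_i + R\hat n|^3}\,(x_i + R\hat n)\cdot\hat n\Bigr]. \tag{8}$$

Now, let $\psi$ denote the ground state of $H$. Using the Feynman-Hellmann formula, we have

$$\frac{\partial E(R)}{\partial R} = -\frac{Z^2\alpha}{2R^2} + \Bigl(\psi,\sum_{i=1}^{N}\Bigl[-\frac{Z\alpha}{|x_i - R\hat n|^3}\,(x_i - R\hat n)\cdot\hat n + \frac{Z\alpha}{|x_i + R\hat n|^3}\,(x_i + R\hat n)\cdot\hat n\Bigr]\psi\Bigr). \tag{9}$$

From (8) and (9) we have

$$\frac{\partial E(R)}{\partial R} = -\frac{Z^2\alpha}{2R^2} - \frac1R\,\bigl(\psi,(x\cdot\nabla V + V)\psi\bigr). \tag{10}$$

According to Herbst [4], Theorem 2.4, the following virial theorem holds:

$$(\psi,x\cdot\nabla V\,\psi) = \Bigl(\psi,\sum_{i=1}^{N}\frac{p_i\cdot p_i}{\sqrt{p_i^2+1}}\,\psi\Bigr). \tag{11}$$

As quadratic forms,

$$\frac{p_i\cdot p_i}{\sqrt{p_i^2+1}} \ge \sqrt{p_i^2+1} - 1 \equiv K_i. \tag{12}$$

Combining (10), (11), and (12) we get

$$\frac{\partial E(R)}{\partial R} \le -\frac{Z^2\alpha}{2R^2} - \frac1R\sum_{i=1}^{N}(\psi,K_i\psi) - \frac1R\,(\psi,V\psi). \tag{13}$$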

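However,

$$(\psi,V\psi) \ge \Bigl(\psi,\sum_{i=1}^{N}\Bigl[-\frac{Z\alpha}{|x_i - R\hat n|} - \frac{Z\alpha}{|x_i + R\hat n|}\Bigr]\psi\Bigr). \tag{14}$$

Using (13) and (14) we get

$$\frac{\partial E(R)}{\partial R} \le -\frac{Z^2\alpha}{2R^2} - \frac{1}{2R}\sum_{i=1}^{N}\Bigl(\psi,\Bigl[K_i - \frac{2Z\alpha}{|x_i - R\hat n|} + K_i - \frac{2Z\alpha}{|x_i + R\hat n|}\Bigr]\psi\Bigr). \tag{15}$$

At this point we note that the lowest eigenvalues of the hydrogenic Herbst operator in each angular momentum channel are bounded from below by the corresponding ones of the Klein-Gordon operator (Martin and Roy [7]). The latter are known explicitly. Since $K_i - \frac{2Z\alpha}{|x_i - R\hat n|}$ is just the Herbst operator for a hydrogenic atom (Herbst [4]) of charge $2Z$ located at $R\hat n$ (the location does not matter any more), we have, using the Rayleigh-Ritz principle and following Martin and Roy, that

$$-\Bigl(\psi,\Bigl[K_i - \frac{2Z\alpha}{|x_i - R\hat n|}\Bigr]\psi\Bigr) \le 1 - \frac{1}{\sqrt{1 + \dfrac{(Z\alpha)^2}{\bigl[\frac14 + \sqrt{\frac1{16} - (Z\alpha)^2}\bigr]^2}}}. \tag{16}$$

Finally, from (15) and (16),

$$\frac{\partial E(R)}{\partial R} \le -\frac{1}{2R^2}\left\{Z^2\alpha - 2RN\left[1 - \frac{1}{\sqrt{1 + \dfrac{(Z\alpha)^2}{\bigl[\frac14 + \sqrt{\frac1{16} - (Z\alpha)^2}\bigr]^2}}}\right]\right\}. \tag{17}$$

Hence, if

$$R \le \frac{Z^2\alpha}{2N}\left\{1 - \left[1 + \frac{(Z\alpha)^2}{\bigl[\frac14 + \sqrt{\frac1{16} - (Z\alpha)^2}\bigr]^2}\right]^{-1/2}\right\}^{-1} \tag{18}$$

and $Z\alpha<\frac14$, we have $\frac{\partial E(R)}{\partial R}\le 0$.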

Remark 1.

• The right-hand side of (5) is monotonically decreasing as a function of $Z$ and has a minimum at $Z\alpha = 1/4$, thus $RN\ge 14.617$ for all $Z\alpha<1/4$.

• Observe that $\frac{\partial E_b(R)}{\partial R} = \frac{\partial E(R)}{\partial R}$, since the energy of each of the two individual atoms obtained from the splitting of the molecule does not depend on $R$.

• We have treated the electrons as bosons when estimating (15) using (16).

• As opposed to the non-relativistic case treated in [11] and [1], the relativistic Hamiltonian does not allow for asymptotic results for large $Z\alpha$, since the kinetic energy becomes unbounded from below if $Z\alpha$ exceeds $2/\pi$. Even a single atom would be unstable in this situation, i.e., the electrons would fall into the nucleus. Our bound is natural in as far as it covers the range of the parameter $Z\alpha$ for which the Coulomb potential is relatively bounded, with bound less than one, with respect to the kinetic energy operator. Assuming the physical case of having integer $Z$ only and $\alpha = \alpha_{\rm phys} = 1/137$, we have only finitely many values of $N$ and $Z$ for which we can have stability of the molecule.
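The numerical values attached to Lemma 1 can be recomputed from the right-hand side of (5). The following is a small sketch with $\alpha = 1/137$ as in the text. The function name and the choice to parametrize by $Z\alpha$ rather than $Z$ are illustrative, not the authors'.

```python
import math

ALPHA = 1.0 / 137.0  # value of the fine structure constant used in the text

def r0_lower_bound(z_alpha, n_electrons, alpha=ALPHA):
    """Right-hand side of (5) as a function of Z*alpha (requires Z*alpha <= 1/4)."""
    # max(0, .) guards against a tiny negative argument from rounding at Z*alpha = 1/4
    root = math.sqrt(max(0.0, 1.0 / 16.0 - z_alpha**2))
    # Klein-Gordon ground state energy for nuclear charge 2Z, as it enters (16)
    kg_energy = (1.0 + z_alpha**2 / (0.25 + root) ** 2) ** -0.5
    prefactor = z_alpha**2 / (2.0 * alpha * n_electrons)  # equals Z^2 alpha / (2N)
    return prefactor / (1.0 - kg_energy)

print(r0_lower_bound(0.25, 1))       # about 14.617: the minimal value of R N
print(r0_lower_bound(3 * ALPHA, 1))  # about 34.17: the one-electron case with Z = 3
```

The second value is the lower bound $R_0\ge 34.17$ quoted for $Z\le 3$ in the one-electron section that follows.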

3 The One-Electron Molecule To demonstrate the strategy to obtain an upper bound on the minimal nuclear charge that prevents binding for the homonuclear diatomic molecule, we will begin with the simplest case, i.e., one electron whose kinetic energy is given by the naive quantization of the classical relativistic Hamiltonian (Herbst Hamiltonian). Theorem 1. Consider a diatomic molecule with homonuclear nuclei with only one electron. Then instability occurs when 2.864 ≤ Z
1 2


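Obviously these two functions form a Lipschitz continuous partition of unity. Note also that $\chi_2(t) = \chi_1(-t)$ holds with this choice. Localizing with these functions yields

$$\begin{aligned}
(\psi,H_R\psi) ={}& \Bigl(\psi,\chi_1\Bigl(\sqrt{-\Delta+1} - 1 - \frac{Z\alpha}{|\cdot+\mathbf{R}|}\Bigr)\chi_1\psi\Bigr) + \Bigl(\psi,\chi_2\Bigl(\sqrt{-\Delta+1} - 1 - \frac{Z\alpha}{|\cdot-\mathbf{R}|}\Bigr)\chi_2\psi\Bigr) \\
& - (\psi,L\psi) + \frac{Z^2\alpha}{2R} - Z\alpha\Bigl(\psi,\Bigl(\frac{\chi_1^2}{|\cdot-\mathbf{R}|} + \frac{\chi_2^2}{|\cdot+\mathbf{R}|}\Bigr)\psi\Bigr),
\end{aligned} \tag{22}$$

where $L$ is the localization error and has the kernel

$$L(x,y) = \frac{K_2(|x-y|)}{\pi^2|x-y|^2}\,\sin^2\bigl[(\psi(x_1) - \psi(y_1))/(2R)\bigr] \tag{23}$$

(Lieb and Yau [6]), where $K_2(x)$ is a modified Bessel function (see [8]). Estimating the operators sandwiched between the $\chi_j$ in the first two terms of (22) from below by $E_s$ and observing that $\chi_1^2 + \chi_2^2 = 1$ we get

$$(\psi,H_R\psi) - E_s \ge -(\psi,L\psi) + \frac{Z^2\alpha}{2R} - Z\alpha\Bigl(\psi,\Bigl(\frac{\chi_1^2}{|\cdot-\mathbf{R}|} + \frac{\chi_2^2}{|\cdot+\mathbf{R}|}\Bigr)\psi\Bigr). \tag{24}$$

First we use the fact that (see Appendix A for the proof)

$$Z\alpha\Bigl(\psi,\Bigl(\frac{\chi_1^2}{|x-\mathbf{R}|} + \frac{\chi_2^2}{|x+\mathbf{R}|}\Bigr)\psi\Bigr) \le \frac{Z\alpha}{R}, \tag{25}$$

yielding the following inequality for instability:

$$E_b \ge -(\psi,L\psi) + \frac{Z^2\alpha}{2R} - \frac{Z\alpha}{R}. \tag{26}$$

Next we estimate the localization error. Using the Schwarz inequality we get that

$$(\psi,L\psi) \le \max_{x_1\in\mathbb{R}}\varphi_L(x_1) \tag{27}$$

where

$$\varphi_L(x_1) := \int_{\mathbb{R}^3}L(x,y)\,dy. \tag{28}$$

As shorthand notation we define

$$S(s,t) := 4\sin^2\bigl((\psi(s) - \psi(t))/2\bigr). \tag{29}$$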

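Now we use cylindrical coordinates, choosing the line joining the nuclei as the first coordinate axis. We have

$$\varphi_L(x_1) = \frac{1}{2\pi}\int_{-\infty}^{\infty}dy_1\left\{\int_0^{\infty}\rho\,d\rho\,\frac{K_2\bigl(\sqrt{(x_1-y_1)^2+\rho^2}\bigr)}{(x_1-y_1)^2+\rho^2}\right\}S\Bigl(\frac{x_1}{R},\frac{y_1}{R}\Bigr). \tag{30}$$

Using [3], Formula 6.596.3, we can compute the expression in $\{\ \}$ and obtain

$$\varphi_L(x_1) = \frac{1}{2\pi}\int_{-\infty}^{\infty}dy_1\,\frac{K_1(|x_1-y_1|)}{|x_1-y_1|}\,S\Bigl(\frac{x_1}{R},\frac{y_1}{R}\Bigr). \tag{31}$$

Changing variables, $u\equiv x_1/R$ and $v\equiv y_1/R$, yields

$$\varphi_L(Ru) = \frac{1}{2\pi}\int_{-\infty}^{\infty}dv\,\frac{K_1(R|u-v|)}{|u-v|}\,S(u,v). \tag{32}$$

Using the fact that the expression above has a maximum at $u = 0$ (for the proof see Appendix B) and the choice of the $\chi_j$ in (21) we obtain

$$(\psi,L\psi) \le \frac{2}{\pi}\left[\Bigl(1 - \frac{\sqrt2}{2}\Bigr)\int_1^{\infty}\frac{K_1(Rv)}{v}\,dv + \int_0^1\frac{K_1(Rv)}{v}\Bigl(1 - \cos\bigl(\tfrac{\pi}{4}v\bigr)\Bigr)dv\right]. \tag{33}$$

Estimating the integrals

$$\int_1^{\infty}\frac{K_1(Rv)}{v}\,dv \le \int_1^{\infty}K_1(Rv)\,dv = \frac1R\,K_0(R), \tag{34}$$

and

$$I := \int_0^1\frac{K_1(Rv)}{v}\Bigl(1 - \cos\bigl(\tfrac{\pi}{4}v\bigr)\Bigr)dv = \int_0^1K_1(Rv)\,\frac{\pi^2v}{32}\,\cos\bigl(\tfrac{\pi}{4}w\bigr)\,dv, \tag{35}$$

where $w\in(0,1)$, and using the Taylor expansion for the cosine we get

$$I \le \frac{\pi^2}{32}\int_0^1dv\,K_1(Rv)\,v \le \frac{\pi^2}{32}\int_0^{\infty}dv\,K_1(Rv)\,v \le \frac{\pi^3}{64R^2}. \tag{36}$$

So finally we have that

$$(\psi,L\psi) \le \frac1R\,\frac2\pi\left[\Bigl(1 - \frac{\sqrt2}{2}\Bigr)K_0(R) + \frac{\pi^3}{64R}\right] := \frac{G(R)}{R}. \tag{37}$$

Note that $G$ is a decreasing function, implying that $(\psi,L\psi)\le\frac{G(R_0)}{R}$, where $R_0$ is the lower bound for $R$ that we found in Lemma 1. To obtain a numerical value, we will suppose that $Z\le 3$. We then get that $R_0\ge 34.17$ and we arrive at

$$(\psi,L\psi) \le \frac{0.009026}{R}. \tag{38}$$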

Now using (38) to estimate (26) we have

$$E_b \ge -\frac{0.009026}{R} + \frac{Z^2\alpha}{2R} - \frac{Z\alpha}{R} \ge 0, \tag{39}$$

where the last inequality holds only if $Z\ge 2.864$.

Remark 2. Notice that estimate (24) implies that the molecule dissociates, i.e., $E_b\ge 0$, when $Z$ is big enough and $R\ge R_0$. This is also true, if $Z\alpha<1/4$, when $R<R_0$, because $\frac{\partial E_b}{\partial R}\le 0$ by Lemma 1.
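The constants in (38) and (39) follow from evaluating $G(R_0)$ from (37) at $R_0 = 34.17$ and solving the quadratic inequality $Z^2\alpha/2 - Z\alpha - G(R_0)\ge 0$ for $Z$. The sketch below does this with the standard library only. The Bessel function $K_0$ is computed from the integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ by a trapezoidal rule, which is a convenience chosen here and not the method of the paper. All function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.0

def bessel_k0(x, t_max=12.0, steps=12000):
    """K_0(x) = integral_0^infinity exp(-x cosh t) dt, trapezoidal rule (x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # endpoint t = 0; the endpoint t = t_max is negligible
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return h * total

def G(R):
    """G(R) from (37): R * (psi, L psi) <= G(R)."""
    return (2.0 / math.pi) * ((1.0 - math.sqrt(2.0) / 2.0) * bessel_k0(R)
                              + math.pi**3 / (64.0 * R))

def z_threshold(g, alpha=ALPHA):
    """Smallest Z with Z^2 alpha / 2 - Z alpha - g >= 0, cf. (39)."""
    return 1.0 + math.sqrt(1.0 + 2.0 * g / alpha)

g0 = G(34.17)
print(g0)               # about 0.009026
print(z_threshold(g0))  # about 2.864
```

At $R = 34.17$ the $K_0$ term is of order $e^{-34}$, so $G(R_0)$ is dominated by the $\pi^3/(64R)$ contribution.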

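4 The N Electron Case

Theorem 2. If $Z/N \ge \min\{1 + (1 + \frac{a}{\alpha N})^{1/2},\ b\}$ with $Z\alpha\le 1/4$, where $a = 0.7587$ and $b = 3.900$, then $E_b\ge 0$, i.e., the molecule dissociates.

Proof. Consider the two-cluster decomposition $\beta = (\beta_1,\beta_2)$ of $\{1,\dots,N\}$. The intercluster potential is given by

$$I_\beta = \sum_{i\in\beta_2}\frac{-Z\alpha}{|x_i+\mathbf{R}|} + \sum_{i\in\beta_1}\frac{-Z\alpha}{|x_i-\mathbf{R}|} + \sum_{\substack{i\in\beta_1\\ j\in\beta_2}}\frac{\alpha}{|x_i-x_j|} + \frac{Z^2\alpha}{2R}. \tag{40}$$

We define the cluster Hamiltonian $H_\beta = H - I_\beta$. Let $\psi$ be the ground state of $H$ and let $E_\beta = \inf(\sigma(H_\beta))$; then the instability condition is given by

$$E \ge \min_\beta E_\beta = E_s. \tag{41}$$

The partition of unity is defined as

$$J_\beta(x) = \prod_{i\in\beta_1}\chi_1(x_i)\prod_{j\in\beta_2}\chi_2(x_j). \tag{42}$$

Now noting that $\sum_\beta J_\beta(x)^2 = 1$ we insert this in the expectation value of the Hamiltonian,

$$(\psi,H_R\psi) = \Bigl(\psi,\Bigl(\sum_\beta J_\beta H_RJ_\beta\Bigr)\psi\Bigr) - \Bigl(\psi,\Bigl(\sum_\beta J_\beta[H_R,J_\beta]\Bigr)\psi\Bigr), \tag{43}$$

and observe that

$$\sum_\beta J_\beta[H_R,J_\beta] = \sum_\beta J_\beta\Bigl[\sum_{k=1}^{N}\sqrt{-\Delta_k+1},\,J_\beta\Bigr] = \sum_{k=1}^{N}\sum_{j=1}^{2}\chi_j(x_k)\bigl[\sqrt{-\Delta_k+1},\,\chi_j(x_k)\bigr], \tag{44}$$

then

$$\Bigl(\psi,\Bigl(\sum_\beta J_\beta[H_R,J_\beta]\Bigr)\psi\Bigr) = N(\psi,L\psi) \tag{45}$$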

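where the kernel of $L$ was given in (23). We have

$$\begin{aligned}
(\psi,H\psi) &= \Bigl(\psi,\Bigl(\sum_\beta J_\beta H_\beta J_\beta\Bigr)\psi\Bigr) + \sum_\beta(\psi,(J_\beta^2I_\beta)\psi) - N(\psi,L\psi) && (46)\\
&\ge \sum_\beta E_\beta\,(\psi,J_\beta^2\psi) + \sum_\beta(\psi,(J_\beta^2I_\beta)\psi) - N(\psi,L\psi) && (47)\\
&\ge E_s + \sum_\beta(\psi,(J_\beta^2I_\beta)\psi) - N(\psi,L\psi). && (48)
\end{aligned}$$

Furthermore,

$$\sum_\beta(\psi,(J_\beta^2I_\beta)\psi) \ge -Z\alpha\sum_{i=1}^{N}\Bigl(\psi,\Bigl(\frac{\chi_1^2(x_i)}{|x_i-\mathbf{R}|} + \frac{\chi_2^2(x_i)}{|x_i+\mathbf{R}|}\Bigr)\psi\Bigr) + \frac{Z^2\alpha}{2R} \ge -\frac{Z\alpha N}{R} + \frac{Z^2\alpha}{2R}. \tag{49}$$

Here we have dropped the inter-electronic potential and used (25). We obtain

$$E_b = (\psi,H\psi) - E_s \ge -N(\psi,L\psi) - \frac{Z\alpha N}{R} + \frac{Z^2\alpha}{2R} \ge 0, \tag{50}$$

where the last inequality can be written as

$$\frac{\tilde Z^2\alpha}{2} - \tilde Z\alpha - \frac{R}{N}\,(\psi,L\psi) \ge 0 \tag{51}$$

with $\tilde Z := Z/N$. At this point we will bound the localization error in two different ways yielding, through equation (51), the conditions of the theorem. We start with the $1/R$ behavior. Observing that the Bessel function $K_1$ obeys the estimate $K_1(x)\le 1/x$ for positive $x$ (see Appendix B) we get

$$\int_1^{\infty}\frac{K_1(Rv)}{v}\,dv \le \frac1R\int_1^{\infty}\frac{1}{v^2}\,dv = \frac1R, \tag{52}$$

and moreover

$$\int_0^1\frac{K_1(Rv)}{v}\Bigl(1 - \cos\bigl(\tfrac\pi4v\bigr)\Bigr)dv \le \frac1R\int_0^1\frac{1 - \cos\bigl(\frac\pi4v\bigr)}{v^2}\,dv \le 0.303/R. \tag{53}$$

Thus $R(\psi,L\psi)\le 0.3793$, and using this to bound (51) we obtain that for instability we need

$$\tilde Z \ge 1 + \sqrt{1 + \frac{0.7587}{\alpha N}}, \tag{54}$$

which is one of the conditions.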

Finally, for the $1/R^2$ behavior, we just use (37), that is

$$(\psi,L\psi) \le \frac{1}{R^2}\,\frac2\pi\left[\Bigl(1 - \frac{\sqrt2}{2}\Bigr)RK_0(R) + \frac{\pi^3}{64}\right] \le \frac{0.3954}{R^2}, \tag{55}$$

since $xK_0(x)\le 0.4665$. Then for instability $\tilde Z \ge 1 + \sqrt{1 + \frac{0.7908}{\alpha RN}}$. If we restrict ourselves to $Z\alpha<1/4$, we need only consider values of $RN\ge 14.617$ (see Remarks 1 and 2 above). Therefore, we have dissociation if $\tilde Z\ge 3.900$. This together with (54) proves the theorem.

Remark 3. The two different estimates for $Z/N$ yield the same result for $N_0\equiv 14.027$. For small $N$, i.e., $N\le N_0$, it is sufficient to use the $1/R^2$ estimate, which is the nonrelativistic estimate. However, for large $N$, namely $N\ge N_0$, the $1/R$ estimate, which is the ultrarelativistic estimate, yields the better results. As a consequence of Lemma 1 we are able to consider $RN\ge 14.617$; therefore in terms of $R$ the $1/R$ estimate is fulfilled for $R\le 1.042$ and the $1/R^2$ estimate for the localization error is fulfilled for $R\le 1.042$, assuming, of course, that $Z\alpha<1/4$.
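The constants $a$, $b$ and $N_0$ of Theorem 2 and Remark 3 can be recomputed from the two bounds on the localization error. This is a minimal sketch. The inputs $xK_0(x)\le 0.4665$ and $RN\ge 14.617$ are taken from the text rather than recomputed, the integral in (53) is evaluated with a simple midpoint rule, and all names are illustrative.

```python
import math

ALPHA = 1.0 / 137.0
C = 1.0 - math.sqrt(2.0) / 2.0  # factor from the region |v| >= 1 in (33)

def midpoint_integral(f, a, b, n=20000):
    """Midpoint rule; it never evaluates f at the endpoints, so v = 0 is avoided."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Integral appearing in (53): int_0^1 (1 - cos(pi v / 4)) / v^2 dv
i53 = midpoint_integral(lambda v: (1.0 - math.cos(math.pi * v / 4.0)) / v**2, 0.0, 1.0)

k_1R = (2.0 / math.pi) * (C * 1.0 + i53)                 # R (psi, L psi) <= k_1R
k_2R = (2.0 / math.pi) * (C * 0.4665 + math.pi**3 / 64)  # R^2 (psi, L psi) <= k_2R

def threshold_1R(N, alpha=ALPHA):
    """Condition (54): Z/N >= 1 + sqrt(1 + a/(alpha N)) with a = 2 * k_1R."""
    return 1.0 + math.sqrt(1.0 + 2.0 * k_1R / (alpha * N))

def threshold_2R(RN=14.617, alpha=ALPHA):
    """Condition from (55) evaluated at the smallest admissible R N."""
    return 1.0 + math.sqrt(1.0 + 2.0 * k_2R / (alpha * RN))

N0 = k_1R * 14.617 / k_2R  # N at which the two thresholds coincide

print(i53, 2.0 * k_1R, k_2R)  # about 0.303, 0.759, 0.3954
print(threshold_2R(), N0)     # about 3.900 and 14.03
```

The last digits differ slightly from the quoted $a = 0.7587$ and $N_0 = 14.027$ because the text rounds the intermediate values 0.303 and 3.900 before combining them.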

A

To prove the equation (25) we first introduce
\[
F(t) := \frac{1}{R}\Big[ \frac{\cos(\psi(t_1))^2}{\sqrt{(t_1 - 1)^2 + t_2^2 + t_3^2}} + \frac{\sin(\psi(t_1))^2}{\sqrt{(t_1 + 1)^2 + t_2^2 + t_3^2}} \Big]. \tag{56}
\]
For $t_1 \leq -1$ we find that
\[
F(t) = \frac{1}{R}\,\frac{1}{\sqrt{(t_1 - 1)^2 + t_2^2 + t_3^2}} \leq \frac{1}{R\,|t_1 - 1|} \leq \frac{1}{R}. \tag{57}
\]
Similarly we get $F(t) \leq 1/R$ for $t_1 \geq 1$. For $t_1 \in (-1, 1)$ we obtain
\[
F(t) \leq \frac{1}{R}\Big\{ \frac{\cos(\psi(t_1))^2}{1 - t_1} + \frac{\sin(\psi(t_1))^2}{1 + t_1} \Big\} \leq \frac{1}{R}. \tag{58}
\]
The last inequality holds because the function in the braces is a positive concave even function on $(-1, 1)$ with maximum at $t_1 = 0$ and value 1 at this point. So the equation (25) is proved by taking $t := x/R$.


B

To prove that the maximum of the bound for the localization error occurs at $u = 0$ we first observe that
\[
\varphi_L(Ru) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{K_1(R|u - v|)}{|u - v|}\, S(u, v)\,dv = \frac{2}{\pi}\int_{-\infty}^{\infty} \frac{K_1(R|u - v|)}{|u - v|} \sin^2[(\psi(u) - \psi(v))/2]\,dv, \tag{59}
\]
where $\psi$ is defined in (21). We start with some simple facts.

Proposition 1. The following is true:
1. $K_1(x) \leq 1/x$ for positive $x$.
2. $\varphi_L(u) = \varphi_L(-u)$ for all $u \in \mathbb{R}$.

Proof. 1. Set $f(x) := xK_1(x)$. The proof is immediate from the observation that $\lim_{x \to 0} f(x) = 1$ and that $f'(x) < 0$ for positive $x$.
2. Let $g(u, v) := K_1(R|u - v|)S(u, v)/|u - v|$. The function $g$ fulfills $g(u, v) = g(-u, -v)$. Changing the variable in the integral, $v := -s$, yields the desired result.

We will now turn to the main goal of this section, namely proving that the function $\varphi_L$ has its maximum at zero. We will start with the massless case and extend the result to the general massive case. In the latter it is enough for us to assume $R \geq 1$.

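B.1 The Massless Case

Lemma 2. $\Upsilon$ has a unique maximum at $u = 0$, where
\[
\Upsilon(u) := \int_{-\infty}^{\infty} \frac{\sin^2[(\psi(u) - \psi(v))/2]}{(u - v)^2}\,dv
\]
and $\psi$ is as defined in (21).

Proof. Let us first suppose that $u \in [0, 1)$. We start by noting that
\[
\Upsilon(u) = \int_{-\infty}^{\infty} \frac{\sin^2[(\psi(u) - \psi(v + u))/2]}{v^2}\,dv.
\]
Taking the derivative of $\Upsilon$ and separating the integral using (21) we get
\[
\Upsilon'(u) = \frac{\pi}{4}\Big[ \sin(\psi(u)) \int_{1+u}^{\infty} \frac{dv}{v^2} - \cos(\psi(u)) \int_{1-u}^{\infty} \frac{dv}{v^2} \Big] = \frac{\pi}{4(1 - u^2)}\,[(1 - u)\sin(\psi(u)) - (1 + u)\cos(\psi(u))]. \tag{60}
\]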


It is clear that $\Upsilon'(0) = 0$. Moreover, $\Upsilon'(u) \leq 0$ for $u \in (0, 1)$ since the function between [ ] in the last equation is convex in $(0, 1)$ and vanishes at $u = 0$ and $u = 1$. Hence the maximum of $\Upsilon$ on $[0, 1)$ occurs at 0. To extend the same result to any $u$ we consider the extension of the second statement of Proposition 1 for the massless case and observe that for $u \geq 1$
\[
\Upsilon(u) = \frac{1}{2}\int_{-\infty}^{-1} \frac{dv}{(u - v)^2} + \int_{-1}^{1} dv\,\frac{\sin^2(\frac{\pi}{8}(v - 1))}{(u - v)^2}
\]
is a decreasing function of $u$. This together with the continuity of $\Upsilon$ proves the lemma.
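The sign claim for the bracket in (60) can be checked numerically once a concrete $\psi$ is fixed. Definition (21) is not reproduced in this part of the text, so the piecewise-linear `psi` below is an assumption: it is chosen only because it is consistent with the formulas shown here ($\Upsilon'(0) = 0$, and the values $1/2$ and $\sin^2(\pi(v-1)/8)$ appearing for $u \geq 1$). The function names are illustrative.

```python
import math

def psi(t: float) -> float:
    """ASSUMED form of the angle function from (21): 0 for t <= -1,
    pi/2 for t >= 1, linear in between. Not taken from this section."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return math.pi / 2.0
    return math.pi * (1.0 + t) / 4.0

def bracket_60(u: float) -> float:
    """The bracket in (60): (1 - u) sin(psi(u)) - (1 + u) cos(psi(u))."""
    return (1.0 - u) * math.sin(psi(u)) - (1.0 + u) * math.cos(psi(u))

def upsilon_prime(u: float) -> float:
    """Right-hand side of (60) for u in [0, 1)."""
    return math.pi * bracket_60(u) / (4.0 * (1.0 - u * u))

def braces_58(t: float) -> float:
    """The function in braces in (58) for t in (-1, 1)."""
    return math.cos(psi(t)) ** 2 / (1.0 - t) + math.sin(psi(t)) ** 2 / (1.0 + t)

grid = [j / 200.0 for j in range(1, 200)]  # sample points in (0, 1)
bracket_nonpositive = all(bracket_60(u) <= 0.0 for u in grid)
braces_at_most_one = all(braces_58(t) <= 1.0 and braces_58(-t) <= 1.0 for t in grid)
print(bracket_nonpositive, braces_at_most_one)
```

Under the assumed $\psi$ the bracket reduces to $\sqrt{2}\cos(\pi u/4)\,(\tan(\pi u/4) - u)$, which makes the sign on $(0,1)$ transparent; the same assumed $\psi$ also reproduces the bound by 1 used in (58) of Appendix A.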

B.2 The Massive Case

First, we state some facts.

Proposition 2. Let
\[
\Upsilon_R(u) := \int_{-\infty}^{\infty} dv\,\frac{K_1(R|u - v|)}{|u - v|} \sin^2[(\psi(u) - \psi(v))/2],
\]
and $R \geq 1$. Then the following is true:
1. The derivative with respect to $u$ fulfills $\Upsilon_R'(u) \leq \Upsilon_{R=1}'(u) =: \Upsilon_1'(u)$ for all $u \in [0, 1)$.
2. $K_1(u) \geq e^{-u}/u$ for all positive $u$.
3. $K_0(u + 1) \leq 5e^{-(u+1)}/4$ for all non-negative $u$.

Proof. 1. Using the same procedure as in Lemma 2 we obtain that for $u \in [0, 1)$
\[
\Upsilon_R'(u) = \frac{\pi}{4}\Big[ \sin(\psi(u)) \int_{R(1+u)}^{\infty} \frac{K_1(v)}{v}\,dv - \cos(\psi(u)) \int_{R(1-u)}^{\infty} \frac{K_1(v)}{v}\,dv \Big], \tag{61}
\]
and then differentiating with respect to $R$ we see that the function $\Upsilon_R'$ is decreasing as $R$ increases for $R > 0$ and $u \in [0, 1)$.
2. Let $f(u) := uK_1(u)e^{u}$. We see that $f(0) = 1$ and the derivative $f'(u) = ue^{u}(K_1(u) - K_0(u)) \geq 0$ for all $u \in (0, \infty)$.
3. The same idea as in 2. shows that $g(0) < 1$ and $g'(u) \leq 0$ on $u \in [0, \infty)$, where $g(u) := \frac{4}{5}K_0(1 + u)e^{(1+u)}$.

Lemma 3. Assume $\Upsilon_R$ as in Proposition 2 and $R \geq 1$. Then $\Upsilon_R$ has a unique maximum at $u = 0$.


Proof. First we prove the claim for $u \in [0, 1)$. We note that by (61) $u = 0$ is a critical point. Considering Proposition 2.1 it suffices to prove that $\Upsilon_1'(u) \leq 0$. Now, using Proposition 2.2 and (61) we get
\[
\begin{aligned}
\Upsilon_1'(u) &\leq (\sin(\psi(u)) - \cos(\psi(u))) \int_{1+u}^{\infty} \frac{K_1(v)}{v}\,dv - \cos(\psi(u)) \int_{1-u}^{1+u} \frac{K_1(v)}{v}\,dv \\
&\leq \frac{\sin(\psi(u)) - \cos(\psi(u))}{1 + u} \int_{1+u}^{\infty} K_1(v)\,dv - \cos(\psi(u)) \int_{1-u}^{1+u} \frac{e^{-v}}{v^2}\,dv \\
&\leq \frac{\sin(\psi(u)) - \cos(\psi(u))}{1 + u}\,K_0(1 + u) - \cos(\psi(u))\,e^{-(1+u)}\Big( \frac{1}{1 - u} - \frac{1}{1 + u} \Big) \\
&= \frac{K_0(1 + u)\,[\sin(\psi(u)) - u\sin(\psi(u)) - \cos(\psi(u)) + u\cos(\psi(u))]}{1 - u^2} - \frac{2u\,e^{-(1+u)}\cos(\psi(u))}{1 - u^2}.
\end{aligned}
\]
Denote the expression in the numerator of the last equation by $h(u)$. It is enough to prove that $h(u) \leq 0$ for $u \in [0, 1)$. Using the third statement of Proposition 2 we arrive at
\[
h(u) \leq e^{-(u+1)}\Big[ \frac{5}{4}\sin(\psi(u)) - \frac{5u}{4}\sin(\psi(u)) - \frac{5}{4}\cos(\psi(u)) - \frac{3u}{4}\cos(\psi(u)) \Big] \leq 0.
\]
The last inequality holds for all $u \in [0, 1)$, and the lemma is proved for this domain. To extend our proof to all real $u$ we use the second statement of Proposition 1 and that for $u \geq 1$
\[
\Upsilon_R(u) = \sin^2\Big(\frac{\pi}{4}\Big) \int_{-\infty}^{-1} \frac{K_1(R(u - v))}{u - v}\,dv + \int_{-1}^{1} \frac{K_1(R(u - v))}{u - v} \sin^2\Big( \frac{\pi(v - 1)}{8} \Big)\,dv.
\]
Here clearly $\Upsilon_R(u) \leq \Upsilon_1(u)$. Noting that the derivative of $K_1(u - v)/(u - v)$ with respect to $u$, which is $-[2K_1(u - v) + (u - v)K_0(u - v)]/(u - v)^2$, is non-positive for $u \geq 1$ and $v < 1$, we conclude that in this domain $\Upsilon_R$ has its maximum at 1. These facts and the continuity of $\Upsilon_R$ prove the lemma.

Acknowledgement

R. B. acknowledges partial support of the project by FONDECYT (Chile), project 199-0427; E. S. thanks CONICYT (Chile) for support through a doctoral fellowship. All authors acknowledge support by the Volkswagen Stiftung through a cooperation grant. The work has been partially supported by the European Union through its Training, Research, and Mobility program, grant FMRX-CT 96-0001960001. We are also thankful for the hospitality by our partner institutions.


References

[1] H. Alarcón and R. Benguria. A Lower Bound on the Size of Molecules. Lett. Math. Phys., 35:281–289, 1995.
[2] H. Alarcón, R. Benguria, P. Duclos, and H. Hogreve. On the stability of positive molecular ions. Helv. Phys. Acta, 72:386–407, 1999.
[3] I. S. Gradshteyn and I. M. Ryzhik. Table of Integrals, Series, and Products. Academic Press, 1980.
[4] Ira W. Herbst. Spectral theory of the operator $(p^2 + m^2)^{1/2} - Ze^2/r$. Commun. Math. Phys., 53:285–294, 1977.
[5] Elliott H. Lieb. Bound on the maximum negative ionization of atoms and molecules. Phys. Rev. A, 29:3018–3028, 1984.
[6] Elliott H. Lieb and Horng-Tzer Yau. The stability and instability of relativistic matter. Commun. Math. Phys., 118:177–213, 1988.
[7] André Martin and S. M. Roy. Semi-relativistic stability and critical mass of a system of spinless bosons in gravitational interaction. Phys. Lett. B, 233(3-4):407–411, 1989.
[8] F. W. J. Olver. Bessel functions of integer order. In Milton Abramowitz and Irene A. Stegun, editors, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, chapter 9, pages 355–433. Dover Publications, New York, 5th edition, 1968.
[9] Mary Beth Ruskai. Limits on Stability of Positive Molecular Ions. Lett. Math. Phys., 18:121–132, 1989.
[10] Mary Beth Ruskai. Absence of bound states in extremely asymmetric positive diatomic molecules. Comm. Math. Phys., 137(3):553–566, 1991.
[11] Jan Philip Solovej. Asymptotic neutrality of diatomic molecules. Comm. Math. Phys., 130(1):185–204, 1990.

Rafael Benguria and Edgardo Stockmeyer
Pontificia Universidad Católica de Chile
Departamento de Física
Casilla 306
Santiago 22
Chile
e-mail: rbenguri@fis.puc.cl
e-mail: estockme@maxwell.fis.puc.cl

Heinz Siedentop
Mathematik
Universität München
Theresienstraße 39
D-80333 München
Germany
e-mail: [email protected]

Communicated by Gian Michele Graf
submitted 8/03/00, accepted 18/09/00



Atoms with Bosonic “Electrons” in Strong Magnetic Fields

Bernhard Baumgartner, Robert Seiringer

Abstract. We study the ground state properties of an atom with nuclear charge $Z$ and $N$ bosonic “electrons” in the presence of a homogeneous magnetic field $B$. We investigate the mean field limit $N \to \infty$ with $N/Z$ fixed, and identify three different asymptotic regions, according to $B \ll Z^2$, $B \sim Z^2$, and $B \gg Z^2$. In Region 1 standard Hartree theory is applicable. Region 3 is described by a one-dimensional functional, which is identical to the so-called Hyper-Strong functional introduced by Lieb, Solovej and Yngvason for atoms with fermionic electrons in the region $B \gg Z^3$; i.e., for very strong magnetic fields the ground state properties of atoms are independent of statistics. For Region 2 we introduce a general magnetic Hartree functional, which is studied in detail. It is shown that in the special case of an atom it can be restricted to the subspace of zero angular momentum parallel to the magnetic field, which simplifies the theory considerably. The functional reproduces the energy and the one-particle reduced density matrix for the full $N$-particle ground state to leading order in $N$, and it implies the description of the other regions as limiting cases.

1 Introduction

The ground states of atoms with many electrons in magnetic fields have been studied in [LSY94a, LSY94b, BSY00], and their energies have been evaluated, exactly to leading order, as some of the physical parameters tend to infinity. The atoms have been modeled by the nonrelativistic quantum mechanics of $N$ fermionic electrons, with an unmovable pointlike nucleus of charge $Z$ in a homogeneous magnetic field of strength $B$. In order to shed some more light onto the interplay of the involved laws of physics, we investigate the effects of changing one of them: What would happen if the electrons were bosons? So we study the ground state of the Hamiltonian, written in appropriate units,
\[
H^{N,Z,B} = \sum_{i=1}^{N} \Big( H_{B,i} - B - \frac{Z}{|x_i|} \Big) + \sum_{i<j} \frac{1}{|x_i - x_j|}, \tag{1.1}
\]
where we set
\[
H_{B,j} = (-i\nabla_j + B\,a(x_j))^2. \tag{1.2}
\]
The vector potential is given by $a(x) = e \times x/2$, where $x \in \mathbb{R}^3$ and $e$ is the unit vector parallel to the magnetic field in the $z$-direction. This Hamiltonian acts on the

symmetric subspace of $L^2(\mathbb{R}^{3N}, d^{3N}x)$. We subtract $B$ for every particle because we are interested in the binding energy, which is now equal to the ground state energy $E(N, Z, B) = \inf \operatorname{spec} H^{N,Z,B}$.

In the study of asymptotics, as $B$ and $Z$ tend to infinity, we find a division into three different regions. They are, always in appropriate units, characterized by $B \ll Z^2$, $B \sim Z^2$, $B \gg Z^2$. This is in contrast to atoms with fermionic electrons, where five different regions have been identified: $B \ll Z^{4/3}$, $B \sim Z^{4/3}$, $Z^{4/3} \ll B \ll Z^3$, $B \sim Z^3$, $B \gg Z^3$. See [LSY94a] and references therein.

A simple heuristic argument: The length scale which is typical for the quantum effects of a single particle in the magnetic field is $\sim B^{-1/2}$. Typical energies are the differences of the Landau levels, $2B$. On the other hand, the length scale typical for a particle in the Coulomb potential only is $\sim Z^{-1}$, and hence the typical energy range is $\sim Z^2$. In Region 2, where $B \sim Z^2$, the magnetic and the Coulombic effects are therefore of the same order of magnitude. In Region 1, where $B \ll Z^2$, the Coulomb effects dominate in all directions. Magnetic effects will not contribute in leading order. In Region 3, where $B \gg Z^2$, the magnetic effects dominate the dynamics perpendicular to the magnetic field. The electron with low energy is confined to the lowest Landau band, and the typical wave functions are squeezed to needles with diameter $\sim B^{-1/2}$. (See [AHS81, FW94] for a detailed rigorous treatment and for citations concerning the history of this problem.)

Turning from the one-body system to the $N$-body problem, we remark that Bose statistics has no effect on the size of the ground states, if the pair interactions are ignored. Moreover, the repulsion of the particles is of the same order of magnitude as the attraction of the nucleus, if $N \sim Z$.
So the scaling properties of the lengths and energies per particle remain the same, and the distinction of the three regions for many bosons is the same as for a single electron.

We remark that we can moreover identify a Region 4 with strong magnetic field, where the nuclear charge $Z$ is fixed. In this region the asymptotics is, in leading order, independent of the statistics, as has been noted in its evaluation in [BSY00]. It is to be described by the model of one-dimensional atoms with delta-function interactions.

Scaling: In the following we will use the parameters $\lambda = N/Z$, $\beta = B/Z^2$ besides $N$. By scaling $x \to x/Z$ the operator $Z^{-2}H^{N,Z,B}$ is unitarily equivalent to
\[
H_{N,\lambda,\beta} = \sum_{i=1}^{N} \Big( H_{\beta,i} - \beta - \frac{1}{|x_i|} \Big) + \frac{\lambda}{N} \sum_{i<j} \frac{1}{|x_i - x_j|}, \tag{1.3}
\]
with ground state energy $E(N, \lambda, \beta) = Z^{-2}E(N, Z, B)$. We are interested in the limit $N \to \infty$ of $N^{-1}E(N, \lambda, \beta)$ with $\lambda$ fixed. In Region 1 this limit is coupled with $\beta \to 0$, and in Region 3 with $\beta \to \infty$, while $\beta$ is fixed in Region 2. The asymptotics of the atomic structure and of its energy for large $N$ in the three regions is modeled by energy functionals in generalized Hartree theory.
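The scaling step can be traced term by term. The following worked computation is a sketch that uses only the definitions (1.1) and (1.2), the linearity of $a(x) = e \times x/2$, and the parameters $\lambda = N/Z$, $\beta = B/Z^2$; the auxiliary variable $y$ is introduced here for illustration.

```latex
% Substitute x_i = y_i / Z in (1.1). Then \nabla_{x_i} = Z \nabla_{y_i} and a(x_i) = a(y_i)/Z, so
\begin{align*}
(-i\nabla_{x_i} + B\,a(x_i))^2
  &= Z^2\Big(-i\nabla_{y_i} + \frac{B}{Z^2}\,a(y_i)\Big)^2 = Z^2 H_{\beta,i}, \\
B &= Z^2 \beta, \\
\frac{Z}{|x_i|} &= \frac{Z^2}{|y_i|}, \\
\frac{1}{|x_i - x_j|} &= \frac{Z}{|y_i - y_j|}
   = Z^2\,\frac{1}{Z}\,\frac{1}{|y_i - y_j|}
   = Z^2\,\frac{\lambda}{N}\,\frac{1}{|y_i - y_j|}.
\end{align*}
% Every term carries the common factor Z^2, which gives Z^{-2} H^{N,Z,B} \cong H_{N,\lambda,\beta} as in (1.3).
```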


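In Region 1 standard Hartree theory is applicable. The energy functional, a functional of the density $\rho$, is
\[
\mathcal{E}^{\mathrm{H}}[\rho] = \int |\nabla \rho^{1/2}(x)|^2\,d^3x - \int \frac{1}{|x|}\rho(x)\,d^3x + D[\rho, \rho], \tag{1.4}
\]
where
\[
D[\rho, \rho] = \frac{1}{2} \int\!\!\int \frac{\rho(x)\rho(y)}{|x - y|}\,d^3x\,d^3y. \tag{1.5}
\]
Its ground state energy is
\[
E^{\mathrm{H}}(\lambda) = \inf_{\rho,\ \int \rho = \lambda} \mathcal{E}^{\mathrm{H}}[\rho]. \tag{1.6}
\]
It is known [BL83] that
\[
\lim_{N \to \infty} \frac{1}{N} E(N, \lambda, 0) = \frac{1}{\lambda} E^{\mathrm{H}}(\lambda). \tag{1.7}
\]
We will extend this result to

Theorem 1.1 (Energy asymptotics for Region 1). If $N \to \infty$ and $\beta = \beta(N) \to 0$ with $\lambda$ fixed, then
\[
\lim_{N \to \infty} \frac{1}{N} E(N, \lambda, \beta) = \frac{1}{\lambda} E^{\mathrm{H}}(\lambda). \tag{1.8}
\]

In Region 2 Hartree theory has to be generalized. The basic idea remains: the electrons occupy the ground states of an effective one-particle Hamiltonian with a mean field potential which has to be determined by self-consistency. Now in the presence of a magnetic field, the ground state of the effective Hamiltonian may a priori be degenerate, so that the electrons can be distributed over a larger set of states. To take this into account, one has to consider in general one-particle density matrices $\Gamma$ in this Magnetic Hartree Theory. The energy functional is
\[
\mathcal{E}_\beta^{\mathrm{MH}}[\Gamma] = \operatorname{Tr}[(H_\beta - \beta)\Gamma] - \int \frac{1}{|x|}\rho_\Gamma(x)\,d^3x + D[\rho_\Gamma, \rho_\Gamma], \tag{1.9}
\]
where $\rho_\Gamma(x)$ is the density defined by $\Gamma$, with $\int \rho_\Gamma = \operatorname{Tr}[\Gamma]$. We define the Hartree energy $E^{\mathrm{MH}}(\lambda, \beta)$ as
\[
E^{\mathrm{MH}}(\lambda, \beta) = \inf_{\Gamma,\ \operatorname{Tr}[\Gamma] = \lambda} \mathcal{E}_\beta^{\mathrm{MH}}[\Gamma]. \tag{1.10}
\]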

This general form of magnetic Hartree theory is necessary, if some extra external potential is added, or if one considers, e.g., molecules. For, in the presence of magnetic fields, some well known facts of ordinary quantum mechanics are no longer true: the ground state may be degenerate, and, in the case of an axially symmetric system, it can happen that the energy is not minimized by states with zero angular momentum [LO77, AHS78]. But for the atom without perturbing


forces, it turns out that the theory can be reduced to the consideration of pure states, rank one density matrices, with a zero angular momentum component parallel to the magnetic field. In this case (1.9) simplifies to the Magnetic Hartree density functional
\[
\hat{\mathcal{E}}_\beta^{\mathrm{MH}}[\rho] = \int \Big( |\nabla \sqrt{\rho}|^2 + \frac{\beta^2}{4} r^2 \rho - \beta\rho - \frac{1}{|x|}\rho \Big) d^3x + D[\rho, \rho], \tag{1.11}
\]
where $r$ is the radial coordinate perpendicular to the magnetic field. This functional is the restriction of (1.9) to density matrices of the form $|\sqrt{\rho}\rangle\langle\sqrt{\rho}|$. We will show in the next section that both $\mathcal{E}^{\mathrm{MH}}$ and $\hat{\mathcal{E}}^{\mathrm{MH}}$ have the same ground state energy and density, so one could alternatively define $E^{\mathrm{MH}}$ as the infimum of $\hat{\mathcal{E}}^{\mathrm{MH}}$. The energy asymptotics for Region 2 are stated in the following theorem:

Theorem 1.2 (Energy asymptotics for Region 2). If $N \to \infty$ with $\lambda$ and $\beta$ fixed, then
\[
\lim_{N \to \infty} \frac{1}{N} E(N, \lambda, \beta) = \frac{1}{\lambda} E^{\mathrm{MH}}(\lambda, \beta). \tag{1.12}
\]

In Region 3 the ground state of the atom is squeezed into a needle with diameter, in the scaled coordinates, $\sim \beta^{-1/2}$. The Coulomb interaction of the confined particles acts along the needle effectively like a one-dimensional delta function, with coupling constant $\sim \ln \beta$. It thus dictates the typical extension $\sim (\ln \beta)^{-1}$ of the ground state wave function in the direction of the field, and the typical energies as $\sim (\ln \beta)^2$. This effective reduction to a one-dimensional system has been discussed in [LSY94a, JY96, BSY00]; see also [BRW99] for related studies. In the appropriate scaling, $z \sim (\ln \beta)x_\parallel$, with $x_\parallel$ the coordinate in the direction parallel to the magnetic field, the theory is a Hartree theory for a one-dimensional model. It is identical to the theory for Region 5 of fermionic electrons, which has been studied in [LSY94a], including an exact solution of the ground state problem. Its energy functional has been called the Hyper Strong Functional; it is
\[
\mathcal{E}^{\mathrm{HS}}[\rho] = \int \Big( \frac{d}{dz}\rho^{1/2}(z) \Big)^2 dz - \rho(0) + \frac{1}{2}\int \rho(z)^2\,dz, \tag{1.13}
\]
with its ground state energy defined as
\[
E^{\mathrm{HS}}(\lambda) = \inf_{\rho,\ \int \rho = \lambda} \mathcal{E}^{\mathrm{HS}}[\rho]. \tag{1.14}
\]
We will prove

Theorem 1.3 (Energy asymptotics for Region 3). If $N \to \infty$ and $\beta = \beta(N) \to \infty$ with $\lambda$ fixed, then
\[
\lim_{N \to \infty} \frac{E(N, \lambda, \beta)}{N(\ln \beta)^2} = \frac{1}{\lambda} E^{\mathrm{HS}}(\lambda). \tag{1.15}
\]


The difference between bosonic atoms in our Region 3 and fermionic atoms in the fermion-Region 5 is in the condition of applicability of HS-theory. The Pauli principle demands one needle for each electron, and electrostatics makes them lie side by side. The fermionic atom is thus a bundle of $N$ needles, with total diameter, in unscaled coordinates and with unscaled parameters, $\sim N^{1/2}B^{-1/2}$. The condition for validity of HS-Theory is that the diameter of the atom, in the directions of $x_\perp$, perpendicular to the field, is much smaller than the characteristic length of Coulombic quantum effects, $\sim Z^{-1}$. This condition is therefore $B \gg NZ^2$ for fermionic atoms. Bosonic electrons may all occupy the same needle. For them, the particle number does therefore not appear in the condition, which is now $B \gg Z^2$.

In the investigation of the limits we exploit four principles: (i) Restriction to independent particles, (ii) Spatial concentration near the center, (iii) Concentration in the lowest Landau band, and (iv) High field limit of the Coulomb interaction. The principle (i) is fundamental for the validity of the Hartree theories: The atom with many interacting particles will be compared to models with independent particles in an effective mean field. The spatial concentration (ii) did not have to be stressed in systems without magnetic fields. In the studies of fermionic electrons, [LSY94a, BSY00], it has been proven as a consequence of the superharmonicity of the repulsive interactions. We will also use this superharmonicity, but in a different way: It implies the vanishing of the parallel component of the angular momentum in the state which minimizes the Hartree energy. Since one of the consequences is the absence of an “angular momentum barrier”, it can also be viewed as a spatial concentration of the bound electrons. The principles (iii) and (iv) are effective in Region 3, in the limit $\beta \to \infty$.
In the investigations of bosonic “electrons” we could probably have mimicked the procedure of [LSY94a], with some changes due to the Bose statistics. Our procedure relies in fact heavily on the same methods, but we combine them in a new way: The principles (ii), (iii) and (iv) mentioned above are studied for single particles in effective mean fields. In the study of many particle systems, we begin with the reduction to independent particles. But this has to be done in a subtle way, anticipating the limits which have to follow. We do this by extending the method of [BSY00]. The Hartree theories will be discussed in Sect. 2, including the restriction to zero angular momentum in Subsect. 2.5. The confinement to the lowest Landau band, relevant for Region 3, is treated in Subsect. 3.2 and the following subsection. In Sect. 4 Hartree theory is proven to be the limit of many particle quantum


mechanics, as it is formulated in the Theorems 1.1, 1.2 and 1.3. Subsection 4.3 treats the restriction to the independent particle model. Finally, some results on the states and on “Bose condensation” are presented in Subsections 4.6 and 4.7. In the investigations of the limiting procedures, we are interested in the physical dimensions of estimates and bounds, not about numerics. We use “C” or “const.” for all the numerical constants.

2 Hartree theory

2.1 Definitions and basic properties

Definitions. The Hartree functional without a magnetic field, $\mathcal{E}^{\mathrm{H}}[\rho]$ in (1.4), is defined for non-negative densities $\rho(x) \in L^1(\mathbb{R}^3, d^3x)$ with the restriction that every component of $\nabla\rho^{1/2}(x)$ is an element of $L^2(\mathbb{R}^3, d^3x)$. Analogously, the functional for Region 3, $\mathcal{E}^{\mathrm{HS}}[\rho]$ in (1.13), is defined for non-negative densities $\rho(z) \in L^1(\mathbb{R}, dz)$ with the restriction $d(\rho^{1/2}(z))/dz \in L^2(\mathbb{R}, dz)$. The functional for Region 2, $\mathcal{E}^{\mathrm{MH}}[\Gamma]$ in (1.9), is defined for density matrices, non-negative trace class operators $\Gamma$ acting on $L^2(\mathbb{R}^3, d^3x)$, with the restriction of a finite magnetic-kinetic energy:
\[
\operatorname{Tr}[H_\beta \Gamma] < \infty. \tag{2.1}
\]
The associated density $\rho_\Gamma(x)$ can be defined in $L^1(\mathbb{R}^3, d^3x)$ as a norm convergent sum of integrable densities $\sum_k w_k \rho_k$, by diagonalizing the density matrix as
\[
\Gamma = \sum_k w_k |\psi_k\rangle\langle\psi_k|, \tag{2.2}
\]
with normalized $\psi_k$, and $\rho_k(x) = |\psi_k(x)|^2$.

The conditions of finiteness of the kinetic energies imply the finiteness of potential energies, in all three regions: the attraction is bounded by the kinetic energy, because of the boundedness of the Coulomb potential (delta function potential in Region 3) relative to the operator of kinetic energy, which is proven with the stability of the hydrogen atom. Moreover, the repulsion is bounded by attraction, because for $\sigma(x)$ and $\rho(x)$ both non-negative elements of $L^1(\mathbb{R}^3, d^3x)$,
\[
2D[\sigma, \rho] < \lambda \sup_y A_y[\rho], \tag{2.3}
\]
with $\lambda = \int \sigma(x)\,d^3x$ and $A_y[\rho] = \int \frac{1}{|x - y|}\rho(x)\,d^3x$. The analogous inequality for the HS theory is
\[
\int \sigma(z)\rho(z)\,dz < \lambda \sup_y \rho(y), \tag{2.4}
\]
for $\sigma(z)$ and $\rho(z)$ both non-negative elements of $L^1(\mathbb{R}, dz)$ and $\lambda = \int \sigma(z)\,dz$.

In the variational principles which define the Hartree energies, the restrictions $\|\rho\|_1 = \lambda$ and $\operatorname{Tr}[\Gamma] = \lambda$ can be weakened to $\|\rho\|_1 \leq \lambda$ and $\operatorname{Tr}[\Gamma] \leq \lambda$, because one


can always “move some charge to infinity”. This will be used for the proof of the existence of a minimizer for the energy, and this in return means that “inf” can be replaced by “min”.

In the following discussion we will explicitly study the magnetic Hartree theory. All the results which do not refer to the dependence on $\beta$ are also valid, in their essential physical meaning and with some changes in the mathematics, for the standard and the HS theory. The proofs can be transferred, keeping their structure, but changing the mathematical expressions. This is an indication that the physics behind the arguments is often the same. We point out, in particular, that the delta potential has the same scaling properties as the Coulomb potential.

Introducing more non-negative parameters, we will study the extended energy functional
\[
\mathcal{E}_{\lambda,\beta,\zeta,\alpha}^{\mathrm{MH}}[\Gamma] = \lambda \operatorname{Tr}[(H_\beta - \beta)\Gamma] - \lambda \int \frac{\zeta}{|x|}\rho_\Gamma(x)\,d^3x + \alpha\lambda^2 D[\rho_\Gamma, \rho_\Gamma], \tag{2.5}
\]
and its ground state energy
\[
E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha) = \inf_{\Gamma,\ \operatorname{Tr}[\Gamma] \leq 1} \mathcal{E}_{\lambda,\beta,\zeta,\alpha}^{\mathrm{MH}}[\Gamma]. \tag{2.6}
\]
By scaling one verifies that the energies are related through
\[
E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha) = \frac{\zeta^3}{\alpha} E^{\mathrm{MH}}\Big( \frac{\alpha}{\zeta}\lambda, \frac{1}{\zeta^2}\beta \Big). \tag{2.7}
\]
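The scaling relation (2.7) can be verified directly from (2.5). The worked computation below is a sketch: the dilation $U$, the abbreviations $\beta'$, $\lambda'$ and the rescaled matrix $\Gamma'$ are auxiliary notation introduced here, not taken from the text.

```latex
% Dilation: (U\psi)(x) = \zeta^{3/2}\psi(\zeta x), and \beta' := \beta/\zeta^2. Since a(x) is linear in x,
\begin{align*}
U^* H_\beta U &= \zeta^2 H_{\beta'}, \qquad
\rho_{U\Gamma U^*}(x) = \zeta^3 \rho_\Gamma(\zeta x), \\
\int \frac{\zeta}{|x|}\,\rho_{U\Gamma U^*}(x)\,d^3x &= \zeta^2 \int \frac{1}{|x|}\,\rho_\Gamma(x)\,d^3x, \qquad
D[\rho_{U\Gamma U^*}, \rho_{U\Gamma U^*}] = \zeta\, D[\rho_\Gamma, \rho_\Gamma].
\end{align*}
% Insert into (2.5):
\begin{align*}
\mathcal{E}^{\mathrm{MH}}_{\lambda,\beta,\zeta,\alpha}[U\Gamma U^*]
 &= \zeta^2 \Big\{ \lambda \operatorname{Tr}[(H_{\beta'} - \beta')\Gamma]
    - \lambda \int \frac{1}{|x|}\rho_\Gamma\,d^3x
    + \frac{\alpha}{\zeta}\lambda^2 D[\rho_\Gamma, \rho_\Gamma] \Big\}.
\end{align*}
% With \lambda' := \alpha\lambda/\zeta and \Gamma' := \lambda'\Gamma (so \operatorname{Tr}[\Gamma'] \le \lambda'
% exactly when \operatorname{Tr}[\Gamma] \le 1), the braces equal (\lambda/\lambda')\,\mathcal{E}^{\mathrm{MH}}_{\beta'}[\Gamma'], hence
\begin{align*}
\mathcal{E}^{\mathrm{MH}}_{\lambda,\beta,\zeta,\alpha}[U\Gamma U^*]
 = \zeta^2\,\frac{\zeta}{\alpha}\,\mathcal{E}^{\mathrm{MH}}_{\beta'}[\Gamma']
 = \frac{\zeta^3}{\alpha}\,\mathcal{E}^{\mathrm{MH}}_{\beta/\zeta^2}[\Gamma'].
\end{align*}
% Taking the infimum on both sides, and using that the constraint \operatorname{Tr} = \lambda'
% may be relaxed to \operatorname{Tr} \le \lambda', gives (2.7).
```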

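Monotonicity, convexity and concavity properties.

(i) Since the functional $\mathcal{E}_{\lambda,\beta,\zeta,\alpha}^{\mathrm{MH}}[\Gamma]$ is decreasing in $\zeta$, increasing in $\alpha$ and jointly linear in $(\zeta, \alpha)$, the Hartree energy $E_{\mathrm{ext}}^{\mathrm{MH}}$ is decreasing in $\zeta$, increasing in $\alpha$ and jointly concave in $(\zeta, \alpha)$.

(ii) Since the functional $\mathcal{E}_{\lambda,\beta,\zeta,\alpha}^{\mathrm{MH}}[\Gamma]/\lambda$ is increasing in $\lambda$ and jointly linear in $(\zeta, \lambda)$, the energy per unit charge of the electron cloud, $E_{\mathrm{ext}}^{\mathrm{MH}}/\lambda$, is increasing in $\lambda$ and jointly concave in $(\zeta, \lambda)$. Because of (2.7), these properties are actually equivalent to (i).

(iii) Since moreover the functional $\mathcal{E}_\beta^{\mathrm{MH}}[\Gamma]$ is convex in $\Gamma$ (it is even strictly convex in $\rho_\Gamma$, since $D[\rho, \rho]$ is strictly convex in $\rho$), and since the sets $\{\operatorname{Tr}[\Gamma] \leq \lambda\}$ are ordered by inclusion, the Hartree energy $E^{\mathrm{MH}}(\lambda, \beta)$ is decreasing and convex in $\lambda$. For, if $\Gamma_n$ with $\operatorname{Tr}[\Gamma_n] = \lambda$ and $\Upsilon_n$ with $\operatorname{Tr}[\Upsilon_n] = \bar{\lambda}$ are minimizing sequences, it follows that $E^{\mathrm{MH}}((\lambda + \bar{\lambda})/2, \beta) \leq \inf_n \mathcal{E}_\beta^{\mathrm{MH}}[(\Gamma_n + \Upsilon_n)/2]$.

The convexity property justifies the definition of a critical charge $\lambda_c$, which a priori might be infinity, indicating the maximal charge which the nucleus can bind: $E^{\mathrm{MH}}$ is strictly decreasing for $\lambda \leq \lambda_c$, and constant for $\lambda \geq \lambda_c$.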


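Moreover, the convexity of $E^{\mathrm{MH}}$ in $\lambda$ and the concavity of $E^{\mathrm{MH}}/\lambda$ give upper and lower bounds on $\partial^2 E^{\mathrm{MH}}/\partial\lambda^2$, guaranteeing the existence of the chemical potential
\[
\mu = \frac{\partial E^{\mathrm{MH}}}{\partial \lambda} \tag{2.8}
\]
as a continuous function of $\lambda$.

Comparison with the hydrogen atom: The extended functional (2.5) with $\alpha = 0$ is obviously related to the energy of the hydrogen atom described with the Pauli Hamiltonian, where the magnetic moment of the electron serves for the subtraction of $\beta$ in the ground state energy:
\[
E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, 0) = \lambda E^{\mathrm{hyd}}(\beta, \zeta). \tag{2.9}
\]
We have the following bounds:

Proposition 2.1 (Energies of hydrogen as bounds). For $\lambda > 0$ and $\alpha > 0$
\[
E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha) > \lambda E^{\mathrm{hyd}}(\beta, \zeta), \tag{2.10}
\]
\[
E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha) < \lambda E^{\mathrm{hyd}}(\beta, \zeta - \lambda\alpha/2). \tag{2.11}
\]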

Proof. The first inequality follows from the strict positivity of $D[\rho, \rho]$. The upper bound can be given by choosing the projection onto the ground state of the Hamiltonian $H_\beta - (\zeta - \lambda\alpha/2)/|x|$ as a test density matrix $\Gamma$. Then we apply the bound to the repulsion by attraction (2.3), together with the observation that the nucleus has to be at the point of the maximum of the potential $A_y[\rho]$. (Otherwise the energy could be lowered by shifting $\rho$.)

We remark that the bound (2.11) is of no use for $\lambda\alpha \sim 2\zeta$ or larger. In this case one can use the monotone decrease of the energy $E_{\mathrm{ext}}^{\mathrm{MH}}$ in $\lambda$, and bound $E_{\mathrm{ext}}^{\mathrm{MH}}$ by $\min_\lambda \{\lambda E^{\mathrm{hyd}}(\beta, \zeta - \lambda\alpha/2)\}$. We will use these bounds in Subsection 3.2. As an obvious consequence of these bounds, using also the continuity of the hydrogen energy in $\zeta$, we add

Remark 2.2 (The limit $\lambda \to 0$). In the limit $\lambda \to 0$ the energy per unit charge, $E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha)/\lambda$, converges to the energy of the hydrogen atom, $E^{\mathrm{hyd}}(\beta, \zeta)$.

The bounds of Proposition 2.1 will actually be used in a simplified form:

Lemma 2.3 (Simple bounds).
\[
E^{\mathrm{MH}}(\lambda, \beta) \geq -(1/4 + \beta)\lambda, \tag{2.12}
\]
\[
E^{\mathrm{MH}}(\lambda, \beta) \leq -(1/4)\lambda(1 - \lambda/2)^2 \quad \text{for } \lambda \leq 2. \tag{2.13}
\]

Proof. Applying the diamagnetic inequality [S76] to the lower bound in Proposition 2.1, we get
\[
H_\beta - \beta - \frac{1}{|x|} \geq \inf \operatorname{spec}\Big( H_0 - \beta - \frac{1}{|x|} \Big) = -\frac{1}{4} - \beta. \tag{2.14}
\]
To get the upper bound, we apply Lieb’s inequality (Theorem A.1. in [AHS78])
\[
\inf \operatorname{spec}(H_\beta - \beta + V) \leq \inf \operatorname{spec}(H_0 + V) \tag{2.15}
\]
to the upper bound in Proposition 2.1.
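The simple bounds of Lemma 2.3 are easy to tabulate. The sketch below only encodes the right-hand sides of (2.12) and (2.13), together with the field-free hydrogen energy $-\zeta^2/4$ in the units of the text (consistent with (2.14) at $\zeta = 1$); the function names are illustrative and the sample values of $\lambda$ and $\beta$ are arbitrary.

```python
def hydrogen_energy_no_field(zeta: float) -> float:
    """Ground state energy of -Laplacian - zeta/|x| in the units of the text: -zeta^2/4."""
    return -0.25 * zeta * zeta

def lower_bound(lam: float, beta: float) -> float:
    """Right-hand side of (2.12)."""
    return -(0.25 + beta) * lam

def upper_bound(lam: float) -> float:
    """Right-hand side of (2.13), stated for lam <= 2.

    It equals lam * E_hyd(0, 1 - lam/2), i.e. (2.11) at zeta = alpha = 1
    combined with (2.15)."""
    if not 0.0 <= lam <= 2.0:
        raise ValueError("bound (2.13) is stated for lam <= 2")
    return lam * hydrogen_energy_no_field(1.0 - lam / 2.0)

# The two bounds bracket E^MH(lam, beta); the window widens with beta.
for lam in (0.5, 1.0, 1.5, 2.0):
    for beta in (0.0, 1.0):
        print(lam, beta, lower_bound(lam, beta), upper_bound(lam))
```

The upper bound is independent of $\beta$ and vanishes at $\lambda = 2$, which is why (2.13) carries no information beyond that value.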

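2.2 Minimizers

Theorem 2.4 (Existence of a minimizer). For each $\beta \geq 0$ and $\lambda > 0$ there is a minimizer $\Gamma^{\mathrm{H}}$ for $\mathcal{E}_\beta^{\mathrm{MH}}$ under the condition $\operatorname{Tr}[\Gamma^{\mathrm{H}}] \leq \lambda$, i.e.
\[
E^{\mathrm{MH}}(\lambda, \beta) = \mathcal{E}_\beta^{\mathrm{MH}}[\Gamma^{\mathrm{H}}]. \tag{2.16}
\]

Proof. We follow closely the proof of the analogous theorems 2.2 and 4.3 in [LSY94a]. Let $\Gamma_n$ be a minimizing sequence for $\mathcal{E}_\beta^{\mathrm{MH}}$ with $\operatorname{Tr}[\Gamma_n] \leq \lambda$. First note that $\operatorname{Tr}[H_\beta \Gamma_n]$ is bounded above, because $\mathcal{E}_\beta^{\mathrm{MH}}[\Gamma_n]$ is bounded above, and since the other contributions to the energy are bounded relative to $H_\beta$. Now we show that the magnetic-kinetic energy is bounded below by the 3-norm of $\rho$: Using the diamagnetic inequality (the prerequisite for the final result in [S76])
\[
\langle \psi, (H_\beta + V)\psi \rangle \geq \langle |\psi|, (H_0 + V)|\psi| \rangle, \tag{2.17}
\]
and the decomposition of $\Gamma$ as in (2.2), we get
\[
\operatorname{Tr}[(-i\nabla + \beta a(x))^2 \Gamma] \geq \sum_n w_n \int |\nabla|\psi_n||^2. \tag{2.18}
\]
Moreover, using the Cauchy-Schwarz inequality,
\[
|\nabla \rho_\Gamma|^2 = \Big| \sum_n 2 w_n |\psi_n| \nabla|\psi_n| \Big|^2 \leq 4 \sum_n w_n |\psi_n|^2 \sum_n w_n |\nabla|\psi_n||^2. \tag{2.19}
\]
Because $\nabla\rho_\Gamma = 2\rho_\Gamma^{1/2}\nabla\rho_\Gamma^{1/2}$ this gives
\[
\operatorname{Tr}[(-i\nabla + \beta a(x))^2 \Gamma] \geq \int \big| \nabla\rho_\Gamma^{1/2} \big|^2 \geq 3\Big( \frac{\pi}{2} \Big)^{4/3} \Big( \int \rho_\Gamma^3 \Big)^{1/3}, \tag{2.20}
\]
where we have used the Sobolev inequality in the last step.

We can then conclude that the corresponding sequence $\rho_n \equiv \rho_{\Gamma_n}$ is bounded in $L^3 \cap L^1$ and $\rho_n^{1/2}$ is bounded in $H^1$. Therefore, for each $p \in (1, 3]$, there exists a subsequence, again denoted by $\rho_n$, that converges to some $\rho_\infty$ weakly in $L^3 \cap L^p$,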


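and pointwise almost everywhere. It follows from weak convergence that $\rho_\infty \geq 0$ and $\int \rho_\infty \leq \lambda$. From Fatou’s lemma we infer that
\[
\liminf_{n \to \infty} D[\rho_n, \rho_n] \geq D[\rho_\infty, \rho_\infty]. \tag{2.21}
\]
Moreover, since $|x|^{-1} \in L^{3/2} + L^{3+\varepsilon}$ for every $\varepsilon > 0$, and choosing $p$ the dual of $3 + \varepsilon$, we see that
\[
\lim_{n \to \infty} \int \frac{1}{|x|}\rho_n = \int \frac{1}{|x|}\rho_\infty. \tag{2.22}
\]
Since the $\Gamma_n$’s are trace class operators on $L^2$, we can pass to a subsequence such that for some $\Gamma^{\mathrm{H}}$
\[
\lim_{n \to \infty} \operatorname{Tr}[\Gamma_n A] = \operatorname{Tr}[\Gamma^{\mathrm{H}} A] \tag{2.23}
\]
for every compact operator $A$. (Here we used the Banach-Alaoglu Theorem and the fact that the dual of the compact operators is the trace class operators.) In particular, we have
\[
\Gamma_n \rightharpoonup \Gamma^{\mathrm{H}} \tag{2.24}
\]
in the weak operator sense. It is clear that $\Gamma^{\mathrm{H}} \geq 0$. Now let $\phi_j \in C_0^\infty(\mathbb{R}^3)$ be an orthonormal basis for $L^2$. Again by Fatou’s Lemma
\[
\operatorname{Tr}[\Gamma^{\mathrm{H}}] = \sum_j \langle \phi_j | \Gamma^{\mathrm{H}} | \phi_j \rangle \leq \liminf_{n \to \infty} \operatorname{Tr}[\Gamma_n] \leq \lambda. \tag{2.25}
\]
In the same way one shows that
\[
\operatorname{Tr}[(H_\beta - \beta)\Gamma^{\mathrm{H}}] = \sum_j \langle (H_\beta - \beta)^{1/2}\phi_j | \Gamma^{\mathrm{H}} | (H_\beta - \beta)^{1/2}\phi_j \rangle \leq \liminf_{n \to \infty} \operatorname{Tr}[(H_\beta - \beta)\Gamma_n]. \tag{2.26}
\]
It remains to show that $\rho_{\Gamma^{\mathrm{H}}} = \rho_\infty$. We already mentioned that for some constant $C$
\[
\operatorname{Tr}[H_\beta \Gamma_n] < C \tag{2.27}
\]
for all $n$. It follows from (2.24) that
\[
(1 + H_\beta)^{1/2}\,\Gamma_n\,(1 + H_\beta)^{1/2} \rightharpoonup (1 + H_\beta)^{1/2}\,\Gamma^{\mathrm{H}}\,(1 + H_\beta)^{1/2} \tag{2.28}
\]
weakly on the dense set of $C_0^\infty$ functions. Since the operators are bounded by (2.27), (2.28) holds weakly in $L^2$. Now consider some $f \in C_0^\infty$ acting as a multiplication operator on $L^2$. It is easy to see that $f$ is relatively compact with respect to $-\Delta$, i.e. $f(1 - \Delta)^{-1}$ is compact. In fact, it is Hilbert-Schmidt, because the trace of its square is given by
\[
\int\!\!\int f(x) f(y) Y(x - y)^2\,d^3x\,d^3y \tag{2.29}
\]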


with the Yukawa potential $Y(x) = (4\pi|x|)^{-1}\exp(-|x|)$, and this is bounded by Young’s inequality. From [AHS78], Thm. 2.6, we infer that
\[
g = (1 + H_\beta)^{-1/2} f (1 + H_\beta)^{-1/2} \tag{2.30}
\]
is compact (it is even Hilbert-Schmidt). Thus there exists a sequence $g_i$ of finite-rank operators which approximates $g$ in norm. We have
\[
\int (\rho_n - \rho_{\Gamma^{\mathrm{H}}}) f = \operatorname{Tr}[(\Gamma_n - \Gamma^{\mathrm{H}}) f] \leq \operatorname{Tr}[(1 + H_\beta)^{1/2}(\Gamma_n - \Gamma^{\mathrm{H}})(1 + H_\beta)^{1/2} g_i] + 2(C + 1)\lambda \|g - g_i\|, \tag{2.31}
\]
where we have used (2.27). Hence $\rho_n \to \rho_{\Gamma^{\mathrm{H}}}$ in the sense of distributions. Because we already know that $\rho_n$ converges to $\rho_\infty$ pointwise almost everywhere, we conclude that $\rho_{\Gamma^{\mathrm{H}}} = \rho_\infty$. We have thus shown that there exists a $\Gamma^{\mathrm{H}}$ with $\operatorname{Tr}[\Gamma^{\mathrm{H}}] \leq \lambda$ and $\mathcal{E}_\beta^{\mathrm{MH}}[\Gamma^{\mathrm{H}}] \leq \liminf_{n \to \infty} \mathcal{E}_\beta^{\mathrm{MH}}[\Gamma_n]$, from which we conclude that $\mathcal{E}_\beta^{\mathrm{MH}}[\Gamma^{\mathrm{H}}] = E^{\mathrm{MH}}(\lambda, \beta)$.

Remark 2.5. If $\Gamma^{\mathrm{H}}$ is unique, then, given any minimizing sequence for $\mathcal{E}_\beta^{\mathrm{MH}}$, the whole sequence converges weakly to $\Gamma^{\mathrm{H}}$.

Although we cannot make an assertion about the uniqueness of $\Gamma^{\mathrm{H}}$ yet, we can state the

Proposition 2.6 (Uniqueness of the density). The density $\rho^{\mathrm{H}}$ corresponding to the minimizer is unique.

Proof. This follows immediately from the strict convexity of $D[\rho, \rho]$.

For $\lambda \leq \lambda_c$, the energy $E^{\mathrm{MH}}$ is a strictly decreasing function of $\lambda$. So the minimizers for different $\lambda \leq \lambda_c$ are different, and they have the normalization $\operatorname{Tr}[\Gamma] = \lambda$. No part of the electron cloud has to be moved to infinity. The strict convexity of $D[\rho, \rho]$ implies now, for $\lambda \leq \lambda_c$, a strict convexity of $E^{\mathrm{MH}}$ as a function of $\lambda$.

2.3 Some physical quantities and their interrelations

Since the densities of the minimizers are unique, the contributions to the Hartree energy, $\int \rho/|x|$ and $D[\rho, \rho]$, are fixed. We denote them as attraction $A$ and repulsion $R$, suppressing the dependence on the parameters. As a consequence, also $K$, the magnetic-kinetic energy, is fixed. Inserting a minimizer into (1.9) we get
\[
E = K - A + R. \tag{2.32}
\]


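To deduce an analogue to the Feynman-Hellmann theorem, we observe the following inequality: Consider two different parameters, $\zeta$ and $\bar{\zeta}$, with corresponding minimizers $\Gamma$ and $\bar{\Gamma}$, and their densities $\rho$ and $\bar{\rho}$. All the other parameters are fixed. Insert $\bar{\Gamma}$ into the functional (2.5) with parameter $\zeta$, to conclude
\[
\begin{aligned}
E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha) &< \mathcal{E}_{\lambda,\beta,\zeta,\alpha}^{\mathrm{MH}}[\bar{\Gamma}] \\
&= \mathcal{E}_{\lambda,\beta,\bar{\zeta},\alpha}^{\mathrm{MH}}[\bar{\Gamma}] - (\zeta - \bar{\zeta}) \int \frac{1}{|x|}\bar{\rho} \\
&= E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \bar{\zeta}, \alpha) - (\zeta - \bar{\zeta}) \int \frac{1}{|x|}\bar{\rho}.
\end{aligned} \tag{2.33}
\]
Together with the same argument, where $\zeta$ and $\bar{\zeta}$ are exchanged, we get, for $\zeta > \bar{\zeta}$,
\[
-\int \frac{1}{|x|}\rho < \frac{E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \zeta, \alpha) - E_{\mathrm{ext}}^{\mathrm{MH}}(\lambda, \beta, \bar{\zeta}, \alpha)}{\zeta - \bar{\zeta}} < -\int \frac{1}{|x|}\bar{\rho}. \tag{2.34}
\]
We conclude, using the continuity (2.22), where the $\rho_n$ are now the unique densities of the minimizers for $\bar{\zeta}_n \to \zeta$, that the Hartree energy is differentiable in $\zeta$, and
\[
\frac{\partial E_{\mathrm{ext}}^{\mathrm{MH}}}{\partial \zeta}\Big|_{\zeta=1} = -A. \tag{2.35}
\]
In the same way, using the inequality (2.21) instead of (2.22), we show that
\[
\frac{\partial E_{\mathrm{ext}}^{\mathrm{MH}}}{\partial \alpha}\Big|_{\alpha=1} = R. \tag{2.36}
\]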

Having established the differentiability in $\lambda$, $\zeta$ and in $\alpha$, the scaling relation (2.7) implies differentiability in $\beta$. So, the magnetic moment $\theta$,

$$\theta = \frac{\partial E^{\mathrm{MH}}}{\partial\beta}, \tag{2.37}$$

is well defined as a function of $\beta$. Taking the partial derivatives of (2.7) in $\alpha$ and $\zeta$ at $\alpha = 1$ and $\zeta = 1$ gives

$$R = -E + \lambda\mu, \tag{2.38}$$

$$-A = 3E - \lambda\mu - 2\beta\theta. \tag{2.39}$$

The relation (2.38) is, because of the negativity of the chemical potential $\mu$ for $\lambda < \lambda_c$, a virial inequality,

$$R < |E|, \tag{2.40}$$

turning into an equality at $\lambda = \lambda_c$. Inserting (2.38) into (2.39), and using (2.32), we get

$$\beta\theta = \frac{K - |E|}{2}. \tag{2.41}$$


The equation (2.38) and the inequality (2.40) are also valid in the Regions 1 and 3, where there is a scaling relation analogous to (2.7). But this relation is without a $\beta$, so instead of (2.41), in Regions 1 and 3 the well known virial equality

$$K = |E| = \frac{A - R}{2} \tag{2.42}$$

holds (see also [LSY94a, B84]). Moreover, for $\lambda = \lambda_c$ we have $|E| : K : A : R = 1 : 1 : 3 : 1$.
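The ratios quoted for $\lambda = \lambda_c$ follow from (2.32), (2.38) and (2.42) with $\mu = 0$. The short check below is only an illustrative sketch: it works in units where $|E| = 1$, and the function name is hypothetical, not taken from the paper.

```python
from fractions import Fraction

def ratios_at_critical_charge():
    """Return (|E|, K, A, R) in units of |E| at lambda = lambda_c (Regions 1 and 3)."""
    E = Fraction(-1)        # energy in units of |E|; E is negative
    lam_mu = Fraction(0)    # lambda * mu vanishes at lambda = lambda_c
    R = -E + lam_mu         # relation (2.38)
    K = abs(E)              # virial equality (2.42): K = |E|
    A = 2 * K + R           # virial equality (2.42): K = (A - R) / 2
    assert K - A + R == E   # consistency with (2.32): E = K - A + R
    return abs(E), K, A, R

print(ratios_at_critical_charge())
```

Exact rational arithmetic is used so that the consistency check with (2.32) is an identity rather than a floating-point comparison.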

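2.4 The linearized theory

Since the density $\rho^{\mathrm{H}}$ of a minimizer is unique and integrable, we can define a linearized Hartree functional by

$$\mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma] = \mathrm{Tr}[H^{\mathrm{H}}\Gamma], \tag{2.43}$$

with the one-particle Hartree Hamiltonian

$$H^{\mathrm{H}} \equiv H_\beta - \beta - \Phi^{\mathrm{H}}(x). \tag{2.44}$$

Here the Hartree potential $\Phi^{\mathrm{H}}(x)$ is given by

$$\Phi^{\mathrm{H}}(x) = \frac{1}{|x|} - \rho^{\mathrm{H}} * \frac{1}{|x|}. \tag{2.45}$$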

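Note that $H^{\mathrm{H}}$ depends on $\lambda$ only via $\rho^{\mathrm{H}}$. Since $\Phi^{\mathrm{H}}$ is in $L^2 + L^\infty_\varepsilon$, it is relatively compact with respect to $H_\beta$ [AHS78], so we know that the essential spectrum of $H^{\mathrm{H}}$ is $[0,\infty)$.

Lemma 2.7 (Linear Hartree functional). Let $\Gamma^{\mathrm{H}}$ be a minimizer of $\mathcal{E}^{\mathrm{MH}}_\beta$ under the constraint $\mathrm{Tr}[\Gamma] \le \lambda$. Then $\Gamma^{\mathrm{H}}$ also minimizes $\mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}$ (under the same constraint).

Proof. (We proceed essentially as in [LSY94a]). For any $\Gamma$

$$\mathcal{E}^{\mathrm{MH}}_\beta[\Gamma] = \mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma] + D[\rho_\Gamma - \rho^{\mathrm{H}}, \rho_\Gamma - \rho^{\mathrm{H}}] - D[\rho^{\mathrm{H}},\rho^{\mathrm{H}}]. \tag{2.46}$$

Especially, for all $\delta > 0$,

$$\mathcal{E}^{\mathrm{MH}}_\beta[(1-\delta)\Gamma^{\mathrm{H}} + \delta\Gamma] = (1-\delta)\mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma^{\mathrm{H}}] + \delta\,\mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma] - D[\rho^{\mathrm{H}},\rho^{\mathrm{H}}] + \delta^2 D[\rho_\Gamma - \rho^{\mathrm{H}}, \rho_\Gamma - \rho^{\mathrm{H}}]. \tag{2.47}$$

Now if there exists a $\Gamma_0$ with $\mathrm{Tr}[\Gamma_0] \le \lambda$ and $\mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma_0] < \mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma^{\mathrm{H}}]$ we can choose $\delta$ small enough to conclude that

$$\mathcal{E}^{\mathrm{MH}}_\beta[(1-\delta)\Gamma^{\mathrm{H}} + \delta\Gamma_0] < \mathcal{E}^{\mathrm{MH}}_{\beta,\mathrm{lin}}[\Gamma^{\mathrm{H}}] - D[\rho^{\mathrm{H}},\rho^{\mathrm{H}}] = \mathcal{E}^{\mathrm{MH}}_\beta[\Gamma^{\mathrm{H}}], \tag{2.48}$$

which contradicts the fact that $\Gamma^{\mathrm{H}}$ minimizes $\mathcal{E}^{\mathrm{MH}}_\beta$.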




The Hartree equation, the Euler-Lagrange equation corresponding to the minimization of the functional $\mathcal{E}^{\mathrm{MH}}_\beta$, is

$$H^{\mathrm{H}}\Gamma = \mu\Gamma. \tag{2.49}$$

The ground state energy of $H^{\mathrm{H}}$ is given by the chemical potential $\mu(\lambda,\beta) = (E + R)/\lambda$. For the overcritical values of $\lambda \ge \lambda_c$ the ground state energy $\mu$ of $H^{\mathrm{H}}$ is 0, and there is just one density corresponding to the minimizer. We can now ensure that $\lambda_c = \lambda_c(\beta)$ is not too small. In fact we state

Lemma 2.8 (Lower bound on the critical charge).

$$\lambda_c(\beta) > 1 \quad\text{for all } \beta \ge 0. \tag{2.50}$$

Proof. For $\beta = 0$ this was shown in [BL83], and $\lambda_c(0)$ was computed numerically to be 1.21 in [B84]. Fix $\beta > 0$. We assume that $\lambda_\rho \equiv \int\rho^{\mathrm{H}} \le 1$, and will show that $H^{\mathrm{H}}$ has an eigenvalue strictly below zero. Using $\psi(r,z) = \exp(-\beta r^2/4 - a|z|)$ with some $a > 0$ as a (not normalized) test function we compute

$$\langle\psi|H^{\mathrm{H}}\psi\rangle = \frac{2\pi}{\beta}a - 2\pi\int\Phi^{\mathrm{H}}(r,z)\,e^{-\frac12\beta r^2 - 2a|z|}\,r\,dr\,dz \le \frac{2\pi}{\beta}a - \int\rho^{\mathrm{H}}(x)\,(\phi(0) - \phi(x))\,d^3x, \tag{2.51}$$

where $\phi(x)$ is the potential generated by the charge distribution $\exp(-\frac12\beta r^2 - 2a|z|)$, i.e.

$$\phi(y) = \int\frac{e^{-\frac12\beta r^2 - 2a|z|}}{|x-y|}\,d^3x. \tag{2.52}$$

Note that $\phi(0) > \phi(y)$ for $y \ne 0$, so we can choose $a$ small enough to conclude that $H^{\mathrm{H}}$ has an eigenvalue strictly below zero and binds more charge than $\lambda_\rho$.

Lemma 2.8 shows that in Hartree theory there are always negative ions. The following lemma gives an upper bound on the maximal negative ionization.

Lemma 2.9 (Upper bound on the critical charge). For some constant $C$ independent of $\beta$

$$\lambda_c(\beta) \le 2 + \frac12\min\left\{1 + \beta,\; C\left(1 + [\ln\beta]^2\right)\right\}. \tag{2.53}$$

Proof. This follows from Lemma 4.9 and the bound on $N_c$ given in [S00] (see Remark 4.10 below).


2.5 The minimizer has L = 0

Let $H$ be the operator

$$H = H_\beta - \beta - \Phi(x), \tag{2.54}$$

where the potential is given by

$$\Phi(x) = \frac{1}{|x|} - \frac{1}{|x|} * \rho. \tag{2.55}$$

The function $\rho$ is assumed to be axially symmetric, non-negative, and $\rho \in L^1 \cap L^q$, where $q > 3/2$. This implies that $|x|^{-1} * \rho$ is a bounded, continuous function going to zero at infinity. The operator $H$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^3)$. Since $\Phi \in L^2 + L^\infty_\varepsilon$, it is a relatively compact perturbation of $H_\beta$ [AHS78], so we know that the essential spectrum of $H$ is given by $[0,\infty)$. The symmetry of $\rho$ implies that $H$ is axially symmetric, i.e. it commutes with the rotations generated by the parallel component of the angular momentum,

$$L = -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right). \tag{2.56}$$

For $m \ge 0$ let now $\Psi_m$ be a ground state of $H$, if there is one, with angular momentum $L\Psi_m = -m\Psi_m$. Note that we can restrict ourselves to considering non-negative $m$'s, since $H$ restricted to $L = -m$ is antiunitarily equivalent to $H + 2\beta m$ restricted to $L = m$ by complex conjugation, so in the ground state $m$ is certainly nonnegative. Writing

$$\Psi_m(x) = e^{-im\varphi}\,r^m f(r,z), \tag{2.57}$$

where $(r,\varphi)$ denote polar coordinates for $(x,y)$, we see that $f$ is a ground state for

$$\tilde H_m = -\frac{\partial^2}{\partial r^2} - \frac{2m+1}{r}\frac{\partial}{\partial r} - \frac{\partial^2}{\partial z^2} + \frac{\beta^2}{4}r^2 - (m+1)\beta - \Phi(r,z) \tag{2.58}$$

on $L^2(\mathbb{R}^3, r^{2m}d^3x)$. If $\tilde H_m$ has a ground state, it is unique and strictly positive ([RS78]; to apply the theorems therein note that the first three terms in $\tilde H_m$ are just the radial part of the Laplacian in $2m+3$ dimensions). So $\Psi_m$ is the only ground state of $H$ with $L = -m$. Moreover, $f$ is a bounded, continuous function [LL97], and hence

$$|\Psi_m(r,\varphi,z)| \le \mathrm{const.}\,r^m, \tag{2.59}$$

for small $r$ and some constant independent of $z$. In particular, $\Psi_m(0) = 0$ if $m > 0$. We now are able to prove the main result of this subsection:

Theorem 2.10 ($m = 0$ in the ground state of $H$). Let $H$ be a Hamiltonian as in (2.54), with $\rho$ axially symmetric and non-negative. If $\inf\mathrm{spec}\,H$ is an eigenvalue, the corresponding eigenvector is unique and has zero angular momentum, for all values of $\beta \ge 0$.




Proof. For $\rho = 0$, i.e. the hydrogen atom, this was shown in [AHS81] (and also in [GS95]). So we can restrict ourselves to considering the case $\int\rho > 0$. Let $\Psi_m$ be a normalized ground state for $H$, with angular momentum $-m$. Define

$$f(b) = -\int|\Psi_m(x)|^2\,\Phi(x-b)\,d^3x. \tag{2.60}$$

The function $f$ is continuous and bounded. It achieves its minimal value at $b = 0$, because otherwise one could lower the energy by translating $|\Psi_m|^2$. Strictly speaking,

$$f(b) - f(0) = \langle\tilde\Psi_m|H|\tilde\Psi_m\rangle - \langle\Psi_m|H|\Psi_m\rangle \ge 0, \tag{2.61}$$

with $b = (b_1,b_2,b_3)$ and

$$\tilde\Psi_m(x) = e^{-\frac{i}{2}\beta(b_2x - b_1y)}\,\Psi_m(x+b). \tag{2.62}$$

The phase in (2.62) is chosen such that the kinetic energy remains invariant. (Note that translating $H_\beta$ is equivalent to changing the gauge of the magnetic potential.) In the sense of distributions,

$$\Delta f(b) = 4\pi\left(|\Psi_m(b)|^2 - |\Psi_m|^2 * \rho_-(b)\right), \tag{2.63}$$

with $\rho_-(x) \equiv \rho(-x)$. The function $|\Psi_m|^2 * \rho_-$ is pointwise strictly positive and continuous. Assume now that $m > 0$. Since, by (2.59), $|\Psi_m(x)| \le C|x|^m$ for some $C > 0$, there is an $R > 0$ such that

$$\Delta f(b) < 0 \quad\text{for } |b| < R, \tag{2.64}$$

i.e. $f$ is superharmonic in some open region containing 0. This contradicts the fact that $f$ achieves its minimum at $b = 0$. As a consequence, a ground state of $H$ must have angular momentum $m = 0$.

An analogous result holds also for the restriction of $H$ to the lowest Landau band. This fact will be used in the proof of Theorem 3.6.

Corollary 2.11 ($m = 0$ in the ground state of $\Pi_0H\Pi_0$). Let $\Pi_0$ be the projector onto the lowest Landau band. If $\Pi_0H\Pi_0$ has a ground state, the ground state wave function $\Psi_0$ is given by

$$\Psi_0(x) = \sqrt{\frac{\beta}{2\pi}}\,\exp\left(-\frac{\beta}{4}r^2\right)\psi(z) \tag{2.65}$$

for some $\psi(z)$ with $\int|\psi(z)|^2dz = \|\Psi_0\|_2^2$.

Proof. Mimicking the proof of the last theorem, we see that if $\Pi_0H\Pi_0$ has a ground state, the corresponding wave function $\Psi_0 = \Pi_0\Psi_0$ has $L = 0$. Hence it is given by (2.65).


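Let now $\rho$ be the Hartree density $\rho^{\mathrm{H}}$, the unique density of the minimizer of $\mathcal{E}^{\mathrm{MH}}_\beta$ (depending on $\lambda$ and $\beta$). Applying Lemma 2.7 and the theorem above, we immediately get

Corollary 2.12 (Uniqueness of $\Gamma^{\mathrm{H}}$). The functional $\mathcal{E}^{\mathrm{MH}}_\beta$ has a unique minimizer $\Gamma^{\mathrm{H}}$ under the condition $\mathrm{Tr}[\Gamma] \le \lambda$, which is proportional to the projection onto the positive function $\sqrt{\rho^{\mathrm{H}}}$. In particular, $\Gamma^{\mathrm{H}}$ has rank 1. Moreover, $\rho^{\mathrm{H}}$ minimizes the density functional (1.11) under the condition $\int\rho \le \lambda$. It is $C^\infty$ away from the origin, continuous at $x = 0$, and strictly positive.

The properties of $\rho^{\mathrm{H}}$ stated above follow from the fact that $\rho^{\mathrm{H}}$ minimizes $\hat{\mathcal{E}}^{\mathrm{MH}}_\beta$, and therefore satisfies the variational equation

$$-\Delta\sqrt{\rho^{\mathrm{H}}} + \left(\frac{\beta^2}{4}r^2 - \beta - \Phi^{\mathrm{H}}\right)\sqrt{\rho^{\mathrm{H}}} = \mu\sqrt{\rho^{\mathrm{H}}}, \tag{2.66}$$

where $\mu$ is the chemical potential of the MH theory. Note that Corollary 2.12 proves the assertion made in the Introduction that both $\mathcal{E}^{\mathrm{MH}}$ and $\hat{\mathcal{E}}^{\mathrm{MH}}$ have the same ground state energy and density.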

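3 The limits of Region 2

3.1 The limit of very weak magnetic fields

The limit $\beta \to 0$ is rather easy to handle:

Theorem 3.1 (Hartree energy for small $\beta$). In the limit $\beta \to 0$,

$$E^{\mathrm{MH}}(\lambda,\beta) = E^{\mathrm{H}}(\lambda) - \beta\lambda + O(\beta^2). \tag{3.1}$$

Proof. Fix $\lambda > 0$, and let $\rho_\beta$ be the minimizer of the density functional $\hat{\mathcal{E}}^{\mathrm{MH}}_\beta$ under the constraint $\int\rho \le \lambda$. Using $\rho_0$ as a trial function we get

$$E^{\mathrm{MH}}(\lambda,\beta) \le \hat{\mathcal{E}}^{\mathrm{MH}}_\beta[\rho_0] = E^{\mathrm{MH}}(\lambda,0) - \beta\lambda + \frac{\beta^2}{4}\int r^2\rho_0. \tag{3.2}$$

It is well known that $\rho_0$ falls off more quickly than the inverse of any polynomial, so $\int r^2\rho_0$ is finite [BBL81]. For the converse, we estimate

$$E^{\mathrm{MH}}(\lambda,\beta) = \hat{\mathcal{E}}^{\mathrm{MH}}_\beta[\rho_\beta] \ge \hat{\mathcal{E}}^{\mathrm{MH}}_0[\rho_\beta] - \beta\lambda \ge E^{\mathrm{MH}}(\lambda,0) - \beta\lambda. \tag{3.3}$$

The observation $E^{\mathrm{MH}}(\lambda,0) = E^{\mathrm{H}}(\lambda)$ then leads to (3.1).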




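3.2 Lowest Landau band confinement in Hartree theory

We now show that for large $\beta$ most of the charge is confined to the lowest Landau band. To do this, we write the Hartree functional as

$$\mathcal{E}^{\mathrm{MH}}_\beta[\Gamma] = \mathrm{Tr}\left[\left(H_\beta - \beta - \frac{1}{|x|}\right)\Gamma\right] + \frac12\mathrm{Tr}_2\left[\Gamma\otimes\Gamma\,\frac{1}{|x-y|}\right], \tag{3.4}$$

where $\mathrm{Tr}_2$ means the trace over the doubled space $L^2(\mathbb{R}^3,d^3x)\otimes L^2(\mathbb{R}^3,d^3y)$. Let $\Pi_0$ be the projector onto the lowest Landau band, and let $\Pi_> = 1 - \Pi_0$. We will use the decomposition

$$H_\beta - \beta = \Pi_0(H_\beta - \beta)\Pi_0 + \Pi_>(H_\beta - \beta)\Pi_> \tag{3.5}$$

and

$$\frac{1}{|x|} = \Pi_0\frac{1}{|x|}\Pi_0 + \Pi_>\frac{1}{|x|}\Pi_> + \Pi_0\frac{1}{|x|}\Pi_> + \Pi_>\frac{1}{|x|}\Pi_0. \tag{3.6}$$

The off-diagonal terms can be bounded using

$$\left(\sqrt{\varepsilon}\,\Pi_0 - \frac{1}{\sqrt{\varepsilon}}\Pi_>\right)\frac{1}{|x|}\left(\sqrt{\varepsilon}\,\Pi_0 - \frac{1}{\sqrt{\varepsilon}}\Pi_>\right) \ge 0 \tag{3.7}$$

for some $0 < \varepsilon < 1$, with the result that

$$\frac{1}{|x|} \le (1+\varepsilon)\,\Pi_0\frac{1}{|x|}\Pi_0 + \left(1 + \frac{1}{\varepsilon}\right)\Pi_>\frac{1}{|x|}\Pi_>. \tag{3.8}$$

In the same way one shows that, see [LSY94a],

$$\begin{aligned}\frac{1}{|x-y|} \ge{}& (1-3\varepsilon)\,\Pi_0\otimes\Pi_0\frac{1}{|x-y|}\Pi_0\otimes\Pi_0 + \left(1 - \frac{3}{\varepsilon}\right)\Pi_>\otimes\Pi_>\frac{1}{|x-y|}\Pi_>\otimes\Pi_> \\ &+ \left(1 - 2\varepsilon - \frac{1}{\varepsilon}\right)\Pi_0\otimes\Pi_>\frac{1}{|x-y|}\Pi_0\otimes\Pi_> + \left(1 - \varepsilon - \frac{2}{\varepsilon}\right)\Pi_>\otimes\Pi_0\frac{1}{|x-y|}\Pi_>\otimes\Pi_0.\end{aligned} \tag{3.9}$$

Moreover, we will use

$$\mathrm{Tr}_2\left[\Pi_\#\otimes\Pi_>\frac{1}{|x-y|}\Pi_\#\otimes\Pi_>\,\Gamma\otimes\Gamma\right] \le \mathrm{Tr}[\Pi_\#\Gamma]\,\sup_a\mathrm{Tr}\left[\frac{1}{|x-a|}\Pi_>\Gamma\Pi_>\right], \tag{3.10}$$

where $\#$ stands for either $0$ or $>$. If we restrict ourselves to considering $\Gamma$'s with $\mathrm{Tr}[\Gamma] \le \lambda$ we therefore have

$$\begin{aligned}\mathcal{E}^{\mathrm{MH}}_\beta[\Gamma] \ge{}& (1-3\varepsilon)\,\mathcal{E}^{\mathrm{MH}}_\beta[\Pi_0\Gamma\Pi_0] + 3\varepsilon\lambda\left(\frac{8}{3}\right)^2\inf\mathrm{spec}\left(H_\beta - \beta - \frac{1}{2|x|}\right) \\ &+ \mathrm{Tr}[\Pi_>\Gamma]\left(\frac{\beta}{2} + \inf_a\,\inf\mathrm{spec}\left(\frac12H_\beta - \frac{2}{\varepsilon}\frac{1}{|x|} - \frac{2\lambda}{\varepsilon}\frac{1}{|x-a|}\right)\right),\end{aligned} \tag{3.11}$$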


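where we have used that

$$\Pi_>(H_\beta - \beta)\Pi_> \ge \frac12\Pi_>(H_\beta + \beta)\Pi_> \tag{3.12}$$

and that

$$\Pi_0\left(H_\beta - \beta - \frac{\delta}{2|x|}\right)\Pi_0 \ge \delta^2\,\Pi_0\left(H_\beta - \beta - \frac{1}{2|x|}\right)\Pi_0 \tag{3.13}$$

for $\delta \ge 1$, which can easily be seen by scaling $z \to \delta^{-1}z$. Now we use the comparison with the hydrogen atom, Proposition 2.1:

$$\inf\mathrm{spec}\left(H_\beta - \beta - \frac{1}{2|x|}\right) \ge \max\{1,\lambda^{-1}\}\,E^{\mathrm{MH}}(\lambda,\beta). \tag{3.14}$$

By the same argument as in the proof of this inequality, we can set $a = 0$ in (3.11). Using the diamagnetic inequality we see that $\frac12H_\beta - c|x|^{-1} \ge -\frac12c^2$, so finally

$$\mathcal{E}^{\mathrm{MH}}_\beta[\Gamma] \ge (1-3\varepsilon)\,\mathcal{E}^{\mathrm{MH}}_\beta[\Pi_0\Gamma\Pi_0] + 3\varepsilon\left(\frac{8}{3}\right)^2\max\{1,\lambda\}\,E^{\mathrm{MH}}(\lambda,\beta) + \frac12\left(\beta - \frac{4(1+\lambda)^2}{\varepsilon^2}\right)\mathrm{Tr}[\Pi_>\Gamma]. \tag{3.15}$$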

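Now if $\beta$ is large enough, we can set $\varepsilon^2 = 4(1+\lambda)^2/\beta$ to conclude

Lemma 3.2 (Comparison with the confined Hartree theory).

$$E^{\mathrm{MH}}_{\mathrm{conf}}(\lambda,\beta) \le E^{\mathrm{MH}}(\lambda,\beta)\left(1 - \mathrm{const.}(1+\lambda)^2\beta^{-1/2}\right), \tag{3.16}$$

where

$$E^{\mathrm{MH}}_{\mathrm{conf}}(\lambda,\beta) = \inf_{\mathrm{Tr}[\Pi_0\Gamma]\le\lambda}\mathcal{E}^{\mathrm{MH}}_\beta[\Pi_0\Gamma\Pi_0]. \tag{3.17}$$

Note that the constant in (3.16) can be chosen such that (3.16) is valid for all values of $\beta > 0$ and $\lambda > 0$.

Remark 3.3. Equation (3.15) can be used to estimate the part of the Hartree state $\Gamma^{\mathrm{H}}$ not confined to the lowest Landau band. In fact, if $\varepsilon^2 > 4(1+\lambda)^2/\beta$ in (3.15), and with $\Gamma = \Gamma^{\mathrm{H}}$, we get

$$\mathrm{Tr}[\Pi_>\Gamma^{\mathrm{H}}] \le -\mathrm{const.}(1+\lambda)\,\frac{\varepsilon E^{\mathrm{MH}}(\lambda,\beta)}{\beta - 4(1+\lambda)^2\varepsilon^{-2}}. \tag{3.18}$$

Using the simple lower bound on $E^{\mathrm{MH}}$, given in Lemma 2.3, and optimizing over $\varepsilon$ gives

$$\mathrm{Tr}[\Pi_>\Gamma^{\mathrm{H}}] \le \mathrm{const.}(1+\lambda)^3\beta^{-1/2}. \tag{3.19}$$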




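3.3 Confinement for the mean field Hamiltonian

An analogous result as in the previous subsection holds also for the linearized theory. The following estimate will be used in Subsection 4.4:

Lemma 3.4 (Confinement for the mean field Hamiltonian). Let $H$ be given as in (2.54), with $\rho \ge 0$ and $\int\rho \le \lambda$. Then for all $\beta > 0$

$$\inf\mathrm{spec}\,H \ge \inf\mathrm{spec}\,\Pi_0H\Pi_0 + (2+\lambda)\beta^{-1/2}E^{\mathrm{hyd}}(\beta,2). \tag{3.20}$$

Proof. For $0 < \varepsilon < 1$ we use again (3.5) and (3.8), and the analogous inequality in the other direction for $|x|^{-1} * \rho$, to conclude that

$$\begin{aligned}H \ge{}& \Pi_0\left(H_\beta - \beta - (1+\varepsilon)\frac{1}{|x|} + (1-\varepsilon)\frac{1}{|x|} * \rho\right)\Pi_0 + \Pi_>\left(H_\beta - \beta - (1+\varepsilon^{-1})\frac{1}{|x|} + (1-\varepsilon^{-1})\frac{1}{|x|} * \rho\right)\Pi_> \\ \ge{}& (1-\varepsilon)\,\Pi_0H\Pi_0 + \varepsilon\,\Pi_0\left(H_\beta - \beta - \frac{2}{|x|}\right)\Pi_0 + \frac12\Pi_>\left(H_\beta + \beta - \frac{4\varepsilon^{-1}}{|x|} - \frac{2\varepsilon^{-1}}{|x|} * \rho\right)\Pi_>,\end{aligned} \tag{3.21}$$

where we have also used (3.12). With the aid of the inequality (2.3) one easily sees that

$$\inf\mathrm{spec}\left(H_\beta - \frac{4\varepsilon^{-1}}{|x|} - \frac{2\varepsilon^{-1}}{|x|} * \rho\right) \ge \inf\mathrm{spec}\left(H_\beta - \frac{(4+2\lambda)\varepsilon^{-1}}{|x|}\right). \tag{3.22}$$

Using again $\frac12H_\beta - c|x|^{-1} \ge -\frac12c^2$ we finally get

$$H \ge (1-\varepsilon)\,\Pi_0H\Pi_0 + \varepsilon E^{\mathrm{hyd}}(\beta,2) + \frac12\Pi_>\left(\beta - (2+\lambda)^2\varepsilon^{-2}\right). \tag{3.23}$$

In particular, if we choose $\varepsilon = (2+\lambda)\beta^{-1/2}$, we arrive at the desired result, as long as $\beta$ is large enough to ensure $\varepsilon < 1$. But if $\beta \le (2+\lambda)^2$, (3.20) holds trivially because of the positivity of $\rho$.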

3.4 The limit of very strong magnetic fields

In the investigation of the limit $\beta \to \infty$ some special concepts from earlier studies are involved: The right scaling, the restriction to the lowest Landau band, the effects of the superharmonicity on the distribution of the density in the perpendicular directions, and the high field limit of the Coulomb interaction. We consider pure states $\Lambda = |\chi\rangle\langle\chi|$ defined by the wave functions with $L = 0$ in the lowest Landau band,

$$\chi(x) = \sqrt{\frac{\beta}{2\pi}}\,e^{-\beta(x_\perp)^2/4}\,\sqrt{L}\,\psi(Lx_\parallel). \tag{3.24}$$

We denote here the perpendicular and parallel components of the coordinates as $x = (x_\perp, x_\parallel)$. The scaling factor $L = L(\beta)$ has been defined in [BSY00] as the solution of the equation

$$\beta^{1/2} = L(\beta)\sinh[L(\beta)/2]. \tag{3.25}$$

Note that

$$L(\beta) = \ln\beta + O(\ln\ln\beta) \tag{3.26}$$

as $\beta \to \infty$. In the following we use the scaled coordinates

$$\mathbf{r} = \sqrt{\beta}\,x_\perp, \quad z = L(\beta)\,x_\parallel, \quad r = |\mathbf{r}|. \tag{3.27}$$
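Equation (3.25) defines $L(\beta)$ only implicitly. As a hedged illustration (the function name `scaling_length` and the bisection approach are choices made here, not taken from [BSY00]), the following sketch solves $\beta^{1/2} = L\sinh(L/2)$ numerically and prints $L(\beta)$ next to $\ln\beta$, so the logarithmic correction in (3.26) can be inspected.

```python
import math

def scaling_length(beta, tol=1e-12):
    """Solve sqrt(beta) = L * sinh(L / 2) for L > 0 by bisection (illustrative)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    target = math.sqrt(beta)

    def f(L):
        # L * sinh(L/2) is strictly increasing for L > 0, so f has one root.
        return L * math.sinh(L / 2.0) - target

    lo, hi = 0.0, 1.0
    while f(hi) < 0:              # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for beta in (1e4, 1e8, 1e16):
    L = scaling_length(beta)
    print(beta, L, math.log(beta))
```

Bisection is used because the left-hand side of the equation is monotone in $L$, which makes the bracket-and-halve strategy robust without derivative information.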

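With the density matrix $\Lambda = |\chi\rangle\langle\chi|$, we calculate the contributions to $\mathcal{E}^{\mathrm{MH}}_\beta[\Lambda]$. The normalization of $\Lambda$, we denote it as $\lambda_\psi$, is equal to $\|\psi\|_2^2$. We restrict the set of $\psi$ to real valued wave functions, since they minimize the kinetic energy, if the density $|\psi|^2$ is kept fixed.

Lemma 3.5 (The strong field limits). Given the state $\Lambda$ defined by real valued $\psi \in H^1(\mathbb{R})$ as in (3.24), the contributions to the energy in the magnetic Hartree theory, magnetic-kinetic energy $K_{\beta,\psi}$, attraction $A_{\beta,\psi}$, and repulsion $R_{\beta,\psi}$, are in the following way related to the kinetic energy, $K_\psi = \int(d\psi/dz)^2dz$, attraction-energy $A_\psi = \psi(0)^2$, and repulsion-energy $R_\psi = \frac12\int\psi(z)^4dz$ of the density $\psi(z)^2$ in the HS-theory:

$$K_{\beta,\psi} = L^2K_\psi, \tag{3.28}$$

$$\left|\frac{1}{L^2}A_{\beta,\psi} - A_\psi\right| \le \frac{C}{L}\left(\lambda_\psi + \lambda_\psi^{1/4}K_\psi^{3/4}\right), \tag{3.29}$$

$$\left|\frac{1}{L^2}R_{\beta,\psi} - R_\psi\right| \le \frac{C\lambda_\psi}{L}\left(\lambda_\psi + \lambda_\psi^{1/4}K_\psi^{3/4}\right). \tag{3.30}$$

Proof. The equation for the kinetic energy is obvious by definition of $\Lambda$. We calculate the energy of attraction as

$$A_{\beta,\psi} = L^2\int_0^\infty e^{-r^2/2}\left[\int_{-\infty}^\infty V_{\beta,r}(z)\,\psi^2(z)\,dz\right]r\,dr, \tag{3.31}$$

where

$$V_{\beta,r}(z) = \frac{1}{L\sqrt{L^2r^2/\beta + z^2}}. \tag{3.32}$$

For the term in square brackets we use Lemma 2.1 of [BSY00],

$$\left|[\ldots] - \psi(0)^2\right| \le \lambda_\psi/r + 8\lambda_\psi^{1/4}K_\psi^{3/4}r^{1/2}, \tag{3.33}$$

to estimate the difference to the expected limit in the HS-theory:

$$\left|\frac{1}{L^2}A_{\beta,\psi} - A_\psi\right| \le \frac{1}{L}\left(\lambda_\psi\sqrt{\pi/2} + 8\cdot2^{1/4}\,\Gamma(5/4)\,\lambda_\psi^{1/4}K_\psi^{3/4}\right). \tag{3.34}$$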




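For the repulsion $R_{\beta,\psi}$ we calculate

$$\left|\frac{1}{L^2}R_{\beta,\psi} - R_\psi\right| \le \frac12\int_{\mathbb{R}^4}d^2\mathbf{r}\,d^2\mathbf{r}'\left(\frac{1}{2\pi}\right)^2e^{-(r^2+r'^2)/2}\,\big|[\ldots]\big|, \tag{3.35}$$

where

$$[\ldots] = \int_{\mathbb{R}^2}dz\,dz'\left(V_{\beta,|\mathbf{r}-\mathbf{r}'|}(z-z') - \delta(z-z')\right)\psi^2(z)\,\psi^2(z'), \tag{3.36}$$

$$\big|[\ldots]\big| \le \frac{1}{L}\left(\frac{\lambda_\psi^2}{|\mathbf{r}-\mathbf{r}'|} + 8\lambda_\psi^{5/4}K_\psi^{3/4}\,|\mathbf{r}-\mathbf{r}'|^{1/2}\right). \tag{3.37}$$

The inequality is supplied by Lemma 2.2 of [BSY00], with an adaptation due to the different notation concerning the normalization. After inserting (3.37) in (3.35), the integrals can be evaluated as

$$\left|\frac{1}{L^2}R_{\beta,\psi} - R_\psi\right| \le \frac{1}{L}\left((\sqrt{\pi}/4)\,\lambda_\psi^2 + 4\sqrt{2}\,\Gamma(5/4)\cdot\lambda_\psi^{5/4}K_\psi^{3/4}\right). \tag{3.38}$$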
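The numerical constants in (3.34) and (3.38) come from elementary Gaussian integrals over the perpendicular coordinates. The sketch below checks them by a midpoint rule; it is an illustration only, the helper name is hypothetical, and the reduction used for (3.38), namely that the difference $\mathbf{r}-\mathbf{r}'$ of two independent standard Gaussians is Gaussian with doubled variance, is an assumption made here to carry out the integrals rather than a step spelled out in the text.

```python
import math

def gaussian_radial_integral(power, variance=1.0, n=200_000, r_max=14.0):
    """Midpoint rule for the integral of r**power * exp(-r**2 / (2*variance)) on [0, r_max]."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += r ** power * math.exp(-r * r / (2.0 * variance))
    return total * h

# Constants of (3.34): weight exp(-r^2/2) r dr against 1/r and 8 r^(1/2).
a_first = gaussian_radial_integral(0.0)
a_second = 8.0 * gaussian_radial_integral(1.5)

# Constants of (3.38): with s = |r - r'| the weight becomes (1/4) exp(-s^2/4) s ds.
r_first = 0.25 * gaussian_radial_integral(0.0, variance=2.0)
r_second = 0.25 * 8.0 * gaussian_radial_integral(1.5, variance=2.0)

print(a_first, math.sqrt(math.pi / 2.0))
print(a_second, 8.0 * 2.0 ** 0.25 * math.gamma(1.25))
print(r_first, math.sqrt(math.pi) / 4.0)
print(r_second, 4.0 * math.sqrt(2.0) * math.gamma(1.25))
```

Each printed pair compares the quadrature value with the closed-form constant appearing in the corresponding bound.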

Theorem 3.6 (Magnetic Hartree energy for large $\beta$). In the limit $\beta \to \infty$ with $\lambda$ fixed,

$$\lim_{\beta\to\infty}\frac{E^{\mathrm{MH}}(\lambda,\beta)}{(\ln\beta)^2} = E^{\mathrm{HS}}(\lambda). \tag{3.39}$$

Proof. Combining all the bounds of the Lemma, and using $\lambda_\psi^{1/4}K_\psi^{3/4} \le \frac14\lambda_\psi + \frac34K_\psi$ to simplify, gives

$$\left|\frac{1}{L^2}\mathcal{E}^{\mathrm{MH}}_\beta[\Lambda] - \mathcal{E}^{\mathrm{HS}}[|\psi|^2]\right| \le \frac{C}{L(\beta)}(1+\lambda_\psi)(\lambda_\psi + K_\psi). \tag{3.40}$$

For the upper bound on $E^{\mathrm{MH}}(\lambda,\beta)$ we specify $\psi$ as $\psi^{\mathrm{HS}}_\lambda$, the minimizing wave function for the HS-theory. See equation (3.6) in [LSY94a] or the following (4.30). It remains $\psi^{\mathrm{HS}}_2$ for all $\lambda > 2$, so these wave functions are normalized to $\|\psi\|_2^2 = \lambda_\psi = \min\{\lambda,2\}$. Since the variational principle with fixed norm can be replaced by a variational principle with bounded norm, as we have remarked in the Subsection 2.1, the wave function $\psi^{\mathrm{HS}}_2$ is good for the upper bounds for all $\lambda \ge 2$. $K_\psi$ is the kinetic energy in the hyperstrong theory. Using the virial equation (2.42), we may replace $K_\psi$ on the right side of (3.40) by $|E^{\mathrm{HS}}(\lambda)|$. With this choice of $\psi$ the equation (3.40) gives the upper bound

$$\frac{1}{L^2}E^{\mathrm{MH}}(\lambda,\beta) \le E^{\mathrm{HS}}(\lambda) + \frac{C}{L(\beta)}\left(\lambda + |E^{\mathrm{HS}}(\lambda)|\right). \tag{3.41}$$


To derive a lower bound, we use the error bound, when confining the theory to the lowest Landau band, which we have estimated in the Subsection 3.2. By Lemma 3.2 we have, for $\beta$ large enough,

$$E^{\mathrm{MH}}(\lambda,\beta) \ge \left(1 - \mathrm{const.}(1+\lambda)^2\beta^{-1/2}\right)^{-1}E^{\mathrm{MH}}_{\mathrm{conf}}(\lambda,\beta). \tag{3.42}$$

Also this confined Hartree theory has rotation invariant minimizers, since the lowest Landau band is mapped onto itself by rotations around the z-axis. Moreover, since the potential is superharmonic, also the minimizer of this confined theory has $L = 0$. See Corollary 2.11 of Subsection 2.5. The variational principle can therefore be restricted to states $\Lambda = |\chi\rangle\langle\chi|$, where the wave functions $\chi$ are specified as in equation (3.24), with general $\psi \in H^1(\mathbb{R})$, normalized to $\|\psi\|_2^2 = \lambda$:

$$E^{\mathrm{MH}}_{\mathrm{conf}}(\lambda,\beta) = \inf_{\Lambda,\ \mathrm{Tr}[\Lambda]=\lambda,\ \Lambda=|\chi\rangle\langle\chi|}\mathcal{E}^{\mathrm{MH}}_\beta[\Lambda]. \tag{3.43}$$

The comparison with HS-theory in (3.40) is now used as a lower bound

$$\frac{1}{L(\beta)^2}\mathcal{E}^{\mathrm{MH}}_\beta[\Lambda] \ge \mathcal{E}^{\mathrm{HS}}[|\psi|^2] - \frac{C}{L(\beta)}(1+\lambda)(\lambda + K_\psi). \tag{3.44}$$

The right side of this inequality can be considered as a functional similar to the HS-functional, but with the constant $1 - C(1+\lambda)/L(\beta)$ in front of the kinetic energy. With an appropriate scale transformation, this is equivalent to a HS-functional multiplied with $(1 - (1+\lambda)C/L(\beta))^{-1}$, as long as this parameter is positive. Taking the infima of both sides, we conclude that

$$\frac{1}{L(\beta)^2}E^{\mathrm{MH}}_{\mathrm{conf}}(\lambda,\beta) \ge \left(1 - C\frac{(1+\lambda)}{L(\beta)}\right)^{-1}E^{\mathrm{HS}}(\lambda) - C\frac{(1+\lambda)\lambda}{L(\beta)} \tag{3.45}$$

for $\beta$ large enough. Considering the limit $\beta \to \infty$ at constant $\lambda$ we infer

$$\liminf_{\beta\to\infty}\frac{1}{L(\beta)^2}E^{\mathrm{MH}}_{\mathrm{conf}}(\lambda,\beta) \ge E^{\mathrm{HS}}(\lambda). \tag{3.46}$$

Combining this with (3.42) proves, in union with (3.41), the theorem.

Remark 3.7. To be precise, we specify the asymptotics:

$$\frac{E^{\mathrm{MH}}(\lambda,\beta)}{L(\beta)^2} \le \left\{1 - C\frac{\lambda}{L(\beta)}\right\}E^{\mathrm{HS}}(\lambda) + C\frac{\lambda}{L(\beta)}, \tag{3.47}$$

$$\frac{E^{\mathrm{MH}}(\lambda,\beta)}{L(\beta)^2} \ge \left[\left\{1 - C\frac{1+\lambda}{L(\beta)}\right\}^{-1}E^{\mathrm{HS}}(\lambda) - C\frac{\lambda+\lambda^2}{L(\beta)}\right]\left\{1 - C\frac{(1+\lambda)^2}{\beta^{1/2}}\right\}^{-1}, \tag{3.48}$$

as long as the terms in braces are positive.




Remark 3.8. The convergence both of the energy of the atom, and of the energy per unit charge, when divided by $(\ln\beta)^2$, is uniform in $\lambda$ on bounded sets of $\lambda$.

Analyzing the limit $N \to \infty$ of many particle quantum mechanics, we will be led to the linearized mean field theory, where we need the

Lemma 3.9 (Generalized strong field limits). Given two states which are defined by real valued $\psi \in H^1(\mathbb{R})$ and real valued $\varphi \in H^1(\mathbb{R})$ as in (3.24), with densities $\rho_{\beta,\psi} = (\beta/2\pi)\,e^{-\beta(x_\perp)^2/2}\,L\,\psi(Lx_\parallel)^2$, the following estimate for the Coulomb repulsion holds:

$$\left|\frac{1}{L^2}D[\rho_{\beta,\varphi},\rho_{\beta,\psi}] - \frac12\int_{-\infty}^{+\infty}\varphi(z)^2\psi(z)^2\,dz\right| \le \frac{C\lambda_\varphi}{L}\left(\lambda_\psi + \lambda_\psi^{1/4}K_\psi^{3/4}\right). \tag{3.49}$$

Proof. As for $R_{\beta,\psi}$ in the proof of Lemma 3.5, with $\psi(z)^2$ in (3.36) changed to $\varphi(z)^2$.

4 The mean field limit

We now prove the theorems that we have stated in the introduction. We will derive appropriate upper and lower bounds to the quantum mechanical energy of symmetric states obeying Bose statistics. The atom with many interacting particles will be compared to models with independent particles in an effective field. In this comparison the upper bound is rather easy: One uses symmetric wave functions of product form as trial wave functions. Then one adds the "self energies" of the one-particle densities. The production of lower bounds is not so easy. For the Regions 1 and 2 we will use the Lieb-Oxford bound [LO81], and then we will borrow a part of the kinetic energy to estimate the correction in comparison to the mean field model. But this method breaks down in the presence of magnetic fields, when they are too strong. In Region 3 its use is restricted to the subregion $\beta \ll N^{4/3}$. This situation is similar to the case of fermionic electrons, discussed in Sect. 7 of [LSY94a]. For $\beta \gg N^{4/3}$ we have to borrow some kinetic energy at the level of quantum mechanics for $N$ particles. Here we develop a new way of producing bounds, extending the method of [BSY00].

4.1 Upper bounds

Lemma 4.1 (Upper bounds for Regions 1 and 2). The Hartree energies provide upper bounds to the quantum mechanical energies for each $N$, $\lambda$ and $\beta$:

$$\frac{E(N,\lambda,\beta)}{N} \le \frac{E^{\mathrm{MH}}(\lambda,\beta)}{\lambda}. \tag{4.1}$$


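Proof. We apply the variational principle, using $N$-fold products of 1-particle states $\Gamma$ as $N$-particle states

$$E(N,\lambda,\beta) \le N\,\mathrm{Tr}[(H_\beta - \beta)\Gamma] - N\int\frac{1}{|x|}\rho_\Gamma + \lambda\,\frac{N(N-1)}{N^2}\,D[\rho_\Gamma,\rho_\Gamma] \tag{4.2}$$

for all $\Gamma$ with $\mathrm{Tr}[\Gamma] \le 1$. Setting $\Gamma = \Gamma^{\mathrm{H}}/\lambda$ we see that the inequality holds, for each $\beta$ and $\lambda$.

Lemma 4.2 (Upper bound for Region 3). In the limit $\beta \to \infty$, the energies of the hyperstrong theory are asymptotic upper bounds to the quantum mechanical energies for each $N$:

$$\limsup_{\beta\to\infty}\frac{E(N,\lambda,\beta)}{N(\ln\beta)^2} \le \frac{E^{\mathrm{HS}}(\lambda)}{\lambda}. \tag{4.3}$$

To be precise:

$$\frac{E(N,\lambda,\beta)}{N\,L(\beta)^2} \le \left(1 - C\frac{\lambda}{L(\beta)}\right)\frac{E^{\mathrm{HS}}(\lambda)}{\lambda} + \frac{C}{L(\beta)}. \tag{4.4}$$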

Proof. This results from combining Lemma 4.1 with Theorem 3.6.

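4.2 Lower bounds for Regions 1 and 2

Lemma 4.3 (Lower bounds for Regions 1 and 2). In the limit $N \to \infty$, the Hartree energies are asymptotic lower bounds to the quantum mechanical energies,

$$\liminf_{N\to\infty}\frac{E(N,\lambda,\beta)}{N} \ge \frac{E^{\mathrm{MH}}(\lambda,\beta)}{\lambda}, \tag{4.5}$$

uniformly in $\beta$ for bounded $\beta$.

Proof. We use the Lieb-Oxford inequality [LO81]

$$\Big\langle\Psi\Big|\sum_{i<j}|x_i - x_j|^{-1}\Big|\Psi\Big\rangle \ge D[\rho_\Psi,\rho_\Psi] - C\int\rho_\Psi^{4/3} \tag{4.6}$$

($\Psi$ is normalized as $\|\Psi\|_2 = 1$, the constant can be chosen to be $C = 1.68$), which, together with Hölder's inequality $\int\rho^{4/3} \le (\int\rho^3)^{1/6}(\int\rho)^{5/6}$, implies that

$$\langle\Psi|H_{N,\lambda,\beta}\Psi\rangle \ge \frac{N}{\lambda}\mathcal{E}^{\mathrm{MH}}_\beta[\Gamma_\Psi\lambda/N] - C\lambda N^{-1/6}\left(\int\rho_\Psi^3\right)^{1/6}, \tag{4.7}$$

where $\Gamma_\Psi$ is the one-particle reduced density matrix of $\Psi$, and $\rho_\Psi$ is its density. Now if $\langle\Psi|H_{N,\lambda,\beta}\Psi\rangle \le 0$, which we can of course assume, then

$$\Big\langle\Psi\Big|\sum_i(H_{\beta,i} - \beta)\Psi\Big\rangle \le -\Big\langle\Psi\Big|\sum_i\left(H_{\beta,i} - \beta - 2|x_i|^{-1}\right)\Psi\Big\rangle \le -N\,E^{\mathrm{hyd}}(\beta,2), \tag{4.8}$$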




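with $E^{\mathrm{hyd}}$ defined in (2.9). Together with (2.20) this implies that

$$\left(\int\rho_\Psi^3\right)^{1/3} \le \frac13\left(\frac{2}{\pi}\right)^{4/3}N\left(\beta - E^{\mathrm{hyd}}(\beta,2)\right), \tag{4.9}$$

so finally

$$\frac{1}{N}E(N,\lambda,\beta) \ge \frac{1}{\lambda}E^{\mathrm{MH}}(\lambda,\beta) - C\lambda N^{-2/3}\left(\beta - E^{\mathrm{hyd}}(\beta,2)\right)^{1/2}. \tag{4.10}$$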

Remark 4.4. Note that the convergence of the energies in Region 2, including Region 1, is uniform in β for bounded β. So if β → 0 as N → ∞ we get the usual Hartree energy without magnetic field. It is even possible to let β → ∞ with N as long as β ≤ const.N 4/3 . In fact, in (4.10) we have an error term of order N −2/3 β 1/2 (note that E hyd ∼ (ln β)2 for large β), and this is of lower order than E MH as long as β ≤ const.N 4/3 .

4.3 Restriction to independent particles

An essential ingredient is the positive definiteness of the repulsive pair interaction. The method exploiting positive definiteness of a function $W$ uses the inequality

$$\sum_{i<j}W(x_i - x_j) \ge \sum_{i=1}^N\int W(x_i - y)\,\sigma(y)\,d^3y - \frac12\iint\sigma(x)\,W(x-y)\,\sigma(y)\,d^3x\,d^3y - N\frac{W(0)}{2}. \tag{4.11}$$

Actually we need a function $W$ which is positive definite, finite at the origin, and a lower bound to the Coulombic repulsive potential. Also the cutoff near the origin should vanish in the limits $N \to \infty$, $\beta \to \infty$, but still with $W(0)/(N(\ln\beta)^2)$ going to 0. If we choose for $W$ the spherical symmetric cutoff potential

$$V_{\mathrm{cutoff}}(x) = \frac{1 - e^{-\mu|x|}}{|x|}, \tag{4.12}$$

it turns out that this does not suffice for our purposes. The cutoff length $\mu^{-1}$ should be of the order of the typical lengths of the atom, and this would require a coupling of the limits $N \to \infty$ and $\beta \to \infty$, or more precisely, $\beta$ is not allowed to increase arbitrarily fast with $N$.

To get a useful lower bound for the entire Region 3, we have to push down $W(0)$ even further, without changing the effective interaction too much. To achieve this, we split $V_{\mathrm{cutoff}}$ (we restrict the parameter to $\mu > 1$) into its effective part $V_\mu$ and the long range tail $V_{\mathrm{long}}$,

$$V_\mu(x) = \frac{e^{-|x|} - e^{-\mu|x|}}{|x|}, \quad \mu > 1, \tag{4.13}$$

$$V_{\mathrm{long}}(x) = \frac{1 - e^{-|x|}}{|x|}, \tag{4.14}$$

and borrow some kinetic energy. The construction of $W$ proceeds in several steps. We begin with the observation that $V_\mu$ is integrable in $x_\parallel$ for any $x_\perp$. It has moreover the property of being decreasing in $|x|$. So the integrals along lines with fixed $x_\perp$ can be bounded by the integral along the line, where $x_\perp = 0$:

$$\int_{-\infty}^\infty V_\mu(x)\,dx_\parallel \le 2\ln\mu. \tag{4.15}$$
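On the axis $x_\perp = 0$ the integral in (4.15) is a Frullani-type integral and equals $2\ln\mu$ exactly; off the axis it is smaller because $V_\mu$ decreases in $|x|$. The sketch below checks both statements numerically. It is only an illustration: the function names, the value $\mu = 10$ and the quadrature parameters are arbitrary choices made here.

```python
import math

def v_mu(s, mu):
    """Effective potential V_mu of (4.13) at distance s > 0."""
    return (math.exp(-s) - math.exp(-mu * s)) / s

def line_integral(d, mu, n=200_000, t_max=60.0):
    """Midpoint rule for the integral of V_mu along a line at perpendicular distance d."""
    h = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h            # parallel coordinate, never exactly zero
        total += v_mu(math.hypot(d, t), mu)
    return 2.0 * total * h           # the integrand is even in t

mu = 10.0
print(line_integral(0.0, mu), 2.0 * math.log(mu))   # on-axis value vs 2 ln(mu)
print(line_integral(0.5, mu))                       # off-axis value, expected to be smaller
```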

We now use Lemma 6.3 of [BSY00], the operator inequality, which holds for $\alpha > 0$, $F > 0$, and for each $b > 0$, $b$ independent of $\alpha$ and $F$,

$$\alpha p_x^2 + F\delta(x) \ge F\cdot w_{F/\alpha,b}(x), \tag{4.16}$$

where

$$w_{D,b}(x) = D\,\frac{b^2}{2b+1}\,e^{-bD|x|} \tag{4.17}$$

and $p_x^2 = -(\partial/\partial x)^2$. These functions have the properties of being positive definite and producing delta-function sequences in the limit $bD \to \infty$ coupled with $b \to \infty$. (The variable $x$ will in the application be replaced by $x_\parallel$.) We extend this lemma to the

Proposition 4.5 (Operator inequalities). Let $V(x)$ define a positive Borel measure $V(x)dx$, with $\int_{-\infty}^{+\infty}V(x)dx \le F$, $F > 0$, $\alpha > 0$. For each $b > 0$ the operator inequality

$$\alpha p_x^2 + V(x) \ge (V * w_{F/\alpha,b})(x) \tag{4.18}$$

holds, in the sense of an inequality for quadratic forms, with the Sobolev space $H^1(\mathbb{R})$ as the form domain.

Proof. For $\psi \in H^1(\mathbb{R})$ consider the wave functions $\psi_y(x) = \psi(x+y)$. By (4.16) there holds for each $y$ the inequality

$$\langle\psi_y|(\alpha p^2 + F\delta)|\psi_y\rangle \ge F\langle\psi_y|w_{F/\alpha,b}|\psi_y\rangle. \tag{4.19}$$

Both sides of this inequality are Borel measurable functions of $y$. Observe $\langle\psi_y|p^2|\psi_y\rangle = \langle\psi|p^2|\psi\rangle$, $\langle\psi_y|w(x)|\psi_y\rangle = \langle\psi|w(x-y)|\psi\rangle$, and integrate both sides of (4.19) with $V(y)dy$. Divide by $F$, and use $\int_{-\infty}^{+\infty}V(x)dx/F \le 1$ in front of the kinetic energy term.
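The later estimates use three elementary properties of $w_{D,b}$: its total mass, its first absolute moment, and its value at the origin. The following sketch, offered as an illustration with hypothetical helper names, compares the closed forms obtained from standard exponential integrals with a midpoint-rule quadrature; the parameter values $D = 3$, $b = 2$ are arbitrary.

```python
import math

def w(x, D, b):
    """The function w_{D,b}(x) = D * b^2 / (2b + 1) * exp(-b * D * |x|) of (4.17)."""
    return D * b * b / (2.0 * b + 1.0) * math.exp(-b * D * abs(x))

def moments(D, b):
    """Closed forms for the mass and the first absolute moment of w_{D,b}."""
    mass = 2.0 * b / (2.0 * b + 1.0)               # tends to 1 as b grows
    first_abs_moment = 2.0 / ((2.0 * b + 1.0) * D)
    return mass, first_abs_moment

def numeric_moments(D, b, n=200_000):
    """Midpoint-rule check of the same two integrals."""
    x_max = 40.0 / (b * D)                          # tail beyond x_max is negligible
    h = x_max / n
    m0 = m1 = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        wx = w(x, D, b)
        m0 += wx
        m1 += x * wx
    return 2.0 * m0 * h, 2.0 * m1 * h               # w is even in x

D, b = 3.0, 2.0
print(moments(D, b), numeric_moments(D, b))
print(1.0 - moments(D, b)[0], 1.0 / (2.0 * b))      # mass deficit vs 1/(2b)
```

With $D = 2\ln\mu/\alpha$ these closed forms give a mass deficit $1/(2b+1)$ and a first absolute moment $\alpha/((2b+1)\ln\mu)$.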




We may consider $\alpha(p_i^\parallel)^2 + V_\mu(x_i - x_j)$ as a 1-particle Hamiltonian acting in $L^2(\mathbb{R},dx_i^\parallel)$, which is parametrized by the perpendicular coordinates and by $x_j$. However, the operator inequality (4.18) is also valid if the operators act in the extended space $L^2(\mathbb{R}^{3N},d^3x_1\ldots d^3x_N)$, so we can apply this proposition, with $F = 2\ln\mu$, to

$$\alpha(p_i^\parallel)^2 + V_\mu(x_i - x_j). \tag{4.20}$$

We add the missing long range tail and define

$$W(x) = \int V_\mu(x-y)\,\delta^2(y_\perp)\,w_{2\ln\mu/\alpha,b}(y_\parallel)\,d^3y + \frac{1 - e^{-|x|}}{|x|}, \tag{4.21}$$

where $w_{\ldots}(x_\parallel)$ is defined in (4.17). The essential effect of the convolution of $V_\mu$ with the distribution

$$w(x) = \delta(x_\perp)\,w(x_\parallel) \tag{4.22}$$

is the lowering of the value of the interactions at the points of coincidences of particles,

$$W(0) \le \int V_\mu\,dx_\parallel\;w_{2\ln\mu/\alpha,b}(0) + 1 < \frac{2b}{\alpha}(\ln\mu)^2 + 1. \tag{4.23}$$

Thus we have shown, that $W(x)$ is a lower bound to the Coulomb potential, when some borrowed part of the kinetic energy is added,

$$W = V_\mu * w + V_{\mathrm{long}} \le \alpha(p^\parallel)^2 + V_\mu + V_{\mathrm{long}} \le \alpha(p^\parallel)^2 + 1/|x|. \tag{4.24}$$

Now the function $W = V_\mu * w + V_{\mathrm{long}}$ is also positive definite, since the distribution $w$ and all the involved functions are positive definite. So we can refer to the technique (4.11) of reduction to one-particle models.

Proposition 4.6 (Bounds by independent particles). Considered as quadratic forms, the $N$-particle Hamiltonians $H_{N,\lambda,\beta}$ are bounded from below by sums of one-particle Hamiltonians and constants:

$$H_{N,\lambda,\beta} \ge \sum_{i=1}^Nh_i - \frac12\frac{\lambda}{N}\iint\sigma(x)\,W(x-y)\,\sigma(y)\,d^3x\,d^3y - \lambda\frac{b}{\alpha}(\ln\mu)^2 - \frac{\lambda}{2}, \tag{4.25}$$

where

$$h_i = H_{\beta,i} - \beta - \frac{\alpha\lambda}{2}(p_i^\parallel)^2 - \frac{1}{|x_i|} + \frac{\lambda}{N}\int W(x_i - y)\,\sigma(y)\,d^3y, \tag{4.26}$$

and where $W(x)$ is defined in (4.21). The parameters $\alpha$, $b$ have to be positive, $\mu > 1$, $\sigma(x)$ should be an element of $L^1(\mathbb{R}^3)$.


Proof. From $H_{\beta,i}$ we borrow a part of $(p_i^\parallel)^2 = -(\partial/\partial x_i^\parallel)^2$ to use (4.24),

$$\alpha(p_i^\parallel)^2 + \frac{1}{|x_i - x_j|} \ge W(x_i - x_j), \tag{4.27}$$

sum over all pairs $i \ne j$, and multiply with $\lambda/2N$. This gives

$$\alpha\lambda\,\frac{N-1}{2N}\sum_{i=1}^N(p_i^\parallel)^2 + \frac{\lambda}{N}\sum_{i<j}\frac{1}{|x_i - x_j|} \ge \frac{\lambda}{N}\sum_{i<j}W(x_i - x_j). \tag{4.28}$$

Inserting (4.28) and (4.11) into (1.3) completes the proof.

For the inequality to be of any use, $\alpha\lambda < 2$ is required. In the next subsection, the density $\sigma$ will be chosen to be an approximately minimizing Hartree density, and the parameters will be adapted to the limit $N, \beta \to \infty$.

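4.4 The lower bound for Region 3

We choose $\sigma = N\rho$, with

$$\rho(x) = \frac{\beta}{2\pi}\exp\left(-\frac{\beta(x_\perp)^2}{2}\right)\frac{1}{\lambda}\,L\,\rho^{\mathrm{HS}}_\lambda(Lx_\parallel). \tag{4.29}$$

The minimizing density $\rho^{\mathrm{HS}}_\lambda$ for the hyperstrong theory is, see [LSY94a],

$$\rho^{\mathrm{HS}}_\lambda(z) = \frac{2(2-\lambda)^2}{\left(4\sinh[(2-\lambda)|z|/4 + c(\lambda)]\right)^2}, \quad \tanh c(\lambda) = (2-\lambda)/2, \quad\text{for } \lambda < 2, \tag{4.30}$$

$$\rho^{\mathrm{HS}}_\lambda(z) = 2(2+|z|)^{-2} \quad\text{for } \lambda \ge 2. \tag{4.31}$$

Note that $\|\rho\|_1 = \min\{1, 2/\lambda\}$, $\|\rho\|_\infty = (\beta/2\pi)\,L(\beta)\,\rho^{\mathrm{HS}}_\lambda(0)/\lambda \le C\beta L(\beta)$.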
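As a sanity check on (4.30) and (4.31), the sketch below evaluates $\rho^{\mathrm{HS}}_\lambda$ and integrates it numerically for a subcritical $\lambda$. It is illustrative only: the function names are hypothetical, and the closed-form value $\rho^{\mathrm{HS}}_\lambda(0) = \lambda(4-\lambda)/8$ quoted in the comment is derived here from (4.30) and is not stated in the text.

```python
import math

def rho_hs(z, lam):
    """Minimizing HS density of (4.30)-(4.31) for a charge parameter lam > 0."""
    if lam >= 2.0:
        return 2.0 / (2.0 + abs(z)) ** 2
    a = (2.0 - lam) / 4.0
    c = math.atanh((2.0 - lam) / 2.0)      # tanh c(lam) = (2 - lam) / 2
    return 2.0 * (2.0 - lam) ** 2 / (4.0 * math.sinh(a * abs(z) + c)) ** 2

def total_charge(lam, z_max=200.0, n=400_000):
    """Midpoint rule for the integral of rho_hs over the real line (lam < 2 only)."""
    h = z_max / n
    return 2.0 * h * sum(rho_hs((k + 0.5) * h, lam) for k in range(n))

lam = 1.0
print(total_charge(lam))                             # expected to be close to lam
print(rho_hs(0.0, lam), lam * (4.0 - lam) / 8.0)     # value at origin vs derived closed form
```

The quadrature is restricted to $\lambda < 2$, where the density decays exponentially; for $\lambda \ge 2$ the tail is algebraic and a finite cutoff would underestimate the total charge 2.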

We want to study the lower bound to $N^{-1}E(N,\lambda,\beta)$, due to Proposition 4.6,

$$\inf\mathrm{spec}\,\{h\} - \frac{\lambda}{2}\iint\rho(x)\,W(x-y)\,\rho(y)\,d^3x\,d^3y - \frac{\lambda}{N}\left(\frac{b}{\alpha}(\ln\mu)^2 + \frac12\right), \tag{4.32}$$

which we expect to be asymptotically proportional to $(\ln\beta)^2$. For simplicity, we will restrict our considerations to $\beta > C_\beta$ for some $C_\beta > 1$ in the following. We choose the parameters as

$$\mu = \beta^{1/2+\delta}, \quad \alpha = N^{-\varepsilon}, \quad b = N^\eta, \tag{4.33}$$

with $\delta$, $\varepsilon$, $\eta$ all greater than zero, $\varepsilon + \eta < 1$. This guarantees first of all the relative vanishing of the constant which stems from $W(0)$:

$$\frac{\lambda}{N}\left(\frac{b}{\alpha}(\ln\mu)^2 + \frac12\right)\Big/(\ln\beta)^2 \to 0, \tag{4.34}$$




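as $\beta, N \to \infty$. It remains to study the Hamiltonian $h$ and the self energy of $\rho$ due to the interaction $W = V_\mu * w + V_{\mathrm{long}}$. The inequality

$$\iint\rho(x)\,(V_\mu * w)(x-y)\,\rho(y)\,d^3x\,d^3y \le \iint\rho(x)\,V_\mu(x-y)\,\rho(y)\,d^3x\,d^3y \tag{4.35}$$

can be observed in Fourier space, where $\hat V_\mu$ and $\hat w$ are positive, and $\hat w \le 1$. We add $V_{\mathrm{long}}$, and use the pointwise inequality in x-space $(V_\mu + V_{\mathrm{long}})(x) \le V^C(x) \equiv 1/|x|$. So the self-energy of $\rho$ due to $W$ is smaller than the self-energy $D[\rho,\rho]$ due to the interaction by the Coulomb potential $V^C$.

The potential $(V_\mu * w + V_{\mathrm{long}}) * \rho$ in the Hamiltonian $h$ is now changed to $V^C * \rho$. First we give a bound to the difference of $V_\mu * w * \rho$ to $V_\mu * \rho$ in the form of an operator inequality, which follows from

$$\left|\langle\psi|V_\mu * (w * \rho - \rho)|\psi\rangle\right| \le \left\|(|\psi|^2 * V_\mu)\cdot(w * \rho - \rho)\right\|_1 \le \|w * \rho - \rho\|_1\,\left\||\psi|^2 * V_\mu\right\|_\infty = \|w * \rho - \rho\|_1\,\sup_y\langle\psi|V_{\mu,y}|\psi\rangle, \tag{4.36}$$

where $V_{\mu,y} = V_\mu(x-y)$. Since $V_\mu \le V^C$, and, with $E^{\mathrm{hyd}}(\beta) \equiv E^{\mathrm{hyd}}(\beta,\zeta=1)$,

$$H_\beta - \beta - V^C_y \ge E^{\mathrm{hyd}}(\beta), \tag{4.37}$$

the last term in (4.36) is bounded by

$$\langle\psi|V^C_y|\psi\rangle \le \langle\psi|H_\beta - \beta - E^{\mathrm{hyd}}(\beta)|\psi\rangle, \tag{4.38}$$

multiplied with $\|w * \rho - \rho\|_1$. We observe, see Remark 2.2,

$$E^{\mathrm{hyd}}(\beta) = \lim_{\lambda\to0}\frac{E^{\mathrm{MH}}(\lambda,\beta)}{\lambda}, \qquad \lim_{\lambda\to0}\frac{E^{\mathrm{HS}}(\lambda)}{\lambda} = -\frac14. \tag{4.39}$$

The precise bound in the Remark 3.7 gives thus (see also [AHS81])

$$E^{\mathrm{hyd}}(\beta) \ge \left(-\frac14 - \frac{C}{L(\beta)}\right)L^2(\beta) \ge -C(\ln\beta)^2 \tag{4.40}$$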

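for $\beta > C_\beta$. An analogous bound holds also for $E^{\mathrm{hyd}}(\beta,2)$. Note that $L(\beta)$ can be replaced by $\ln\beta$ (and vice versa), since for $\beta > C_\beta > 1$ there are positive constants $C_1$ and $C_2$ such that $C_1 \le \ln(\beta)/L(\beta) \le C_2$.

To complete the estimate (4.36), we need a bound to $\|w * \rho - \rho\|_1$. There, the integrals in perpendicular coordinates can be done explicitly. Note that for all $\lambda$ we have $|d\rho^{\mathrm{HS}}_\lambda/dz| \le \rho^{\mathrm{HS}}_\lambda$. Using this and the monotonicity of $\rho^{\mathrm{HS}}_\lambda$ in $|z|$ we can estimate

$$\left|\rho^{\mathrm{HS}}_\lambda(z-y) - \rho^{\mathrm{HS}}_\lambda(z)\right| \le |y|\sup_{x\in[z-y,z]}\left|\frac{d\rho^{\mathrm{HS}}_\lambda}{dz}(x)\right| \le |y|\left(\rho^{\mathrm{HS}}_\lambda(z-y) + \rho^{\mathrm{HS}}_\lambda(z) + \Theta(|y|-|z|)\,\rho^{\mathrm{HS}}_\lambda(0)\right). \tag{4.41}$$

Since for $|y| \ge 1$ (4.41) is obviously true even without the last term, we see that (4.41) holds with $\Theta(|y|-|z|)$ replaced by $\Theta(1-|z|)$. This implies

$$\begin{aligned}\|\rho * w - \rho\|_1 \le{}& \frac{1}{\lambda}\int dx_\parallel\int d\eta\,L\,w(\eta)\left\{\left|\rho^{\mathrm{HS}}_\lambda(Lx_\parallel - L\eta) - \rho^{\mathrm{HS}}_\lambda(Lx_\parallel)\right| + \left(\Big(\int w\Big)^{-1} - 1\right)\rho^{\mathrm{HS}}_\lambda(Lx_\parallel)\right\} \\ \le{}& \lambda^{-1}\|\rho^{\mathrm{HS}}_\lambda\|_1\Big(1 - \int w\Big) + 2\lambda^{-1}\left(\|\rho^{\mathrm{HS}}_\lambda\|_1 + \rho^{\mathrm{HS}}_\lambda(0)\right)L\int|\eta|\,w(\eta)\,d\eta.\end{aligned} \tag{4.42}$$

Now $\|\rho^{\mathrm{HS}}_\lambda\|_1 \le \lambda$, $\rho^{\mathrm{HS}}_\lambda(0) \le \frac12\lambda$, $(1 - \int w) \le (2b)^{-1}$ and $\int|\eta|\,w \le \alpha/(2b\ln\mu)$, so we get

$$\|\rho * w - \rho\|_1 \le \frac{1}{2b} + \frac32\,\frac{\alpha L(\beta)}{b\ln\mu} \le C\left(N^{-\eta} + N^{-\eta-\varepsilon}L(\beta)/\ln\beta\right) \le C\,N^{-\eta}. \tag{4.43}$$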

Finally, we add to Vµ the short-range term Vshort (x) =

e−µ|x| , |x|

(4.44)

and we find the pointwise bound to (V C − Vµ ) ∗ ρ as Vshort 1 ρ∞ ≤ Cµ−2 βL(β) = Cβ −2δ L(β).

(4.45)

We collect the bounds we have obtained so far:

E(N, λ, β) ≥ inf spec {hb }−λD[ρ, ρ]−Cλ(ln β)2 N ε+η−1 + N −η + β −2δ L(β)−1 , N (4.46) with the one-particle Hamiltonian

hb = (Hβ − β) 1 − CλN −η − 2λN −ε − V C + λV C ∗ ρ, (4.47) where we have used (p )2 ≤ Hβ − β. We now use (4.8) to estimate the error terms in the kinetic energy. Then we apply the statements of the subsection on




the confinement in the mean field theory, Lemma 3.4, to estimate E(N, λ, β) N

Π0 Hβ − β − V C + λV C ∗ ρ Π0 − λD[ρ, ρ] 2 + λ −1/2 β −2δ 2 −η −ε ε+η−1 −Cλ(ln β) N + N + N + β + . L(β) λ (4.48)

≥ inf spec

Corollary 2.11 states that the minimizer of the operator in question has L = 0. The search for the infimum of the spectrum can therefore be restricted to the use of the same type of wave-functions as in the investigations on the limit of very strong magnetic fields in Subsection 3.4, Lemma 3.5 and 3.9, and these investigations can be applied, too. We choose ε = η = 1/3 and get the lower bound E(N, λ, β) N L2 (β) −1 2 1 1+λ HS HS 2 1−C (ρλ (z)) dz ≥ inf spec pz − δ(z) + ρλ (z) − 2λ L(β)

(4.49) − C λN −1/3 + L(β)−1 λβ −2δ + 1 + λ + (2 + λ)β −1/2 . The operator in curly brackets is the for the linearized HS theory, Hamiltonian 2 with ground state energy (E HS + 12 (ρHS ) )/λ (cf. eq. (2.38)). So our final result λ can be stated as Lemma 4.7 (Lower bound for Region 3). If L(β) > C(1 + λ), then −1 E(N, λ, β) 1+λ E HS (λ) 1+λ −1/3 − C λN + ≥ 1−C . N L2 (β) λ L(β) L(β)

(4.50)

Hence the convergence of the lower bound, in the limit N → ∞, β → ∞, is proven. Remark 4.8. The limits N → ∞ and β → ∞ can be considered independently of each other, in any succession or combination.

4.5 Critical particle number

Recall that $E(N, Z, B)$ denotes the ground state energy of the unscaled Hamiltonian (1.1).

Lemma 4.9 (Limit of the critical charge). Define

$$N_c(Z, B) = \max\{N \mid E(N, Z, B) < E(N-1, Z, B)\}. \tag{4.51}$$

We have

$$\liminf_{Z\to\infty}\frac{N_c(Z, \beta Z^2)}{Z} \ge \lambda_c(\beta). \tag{4.52}$$


Proof. This follows from the convergence of the energies, by analogous arguments as in [BL83] or [LSY94a]. Remark 4.10. In [S00], Theorem 3, the following upper bound on Nc is proven: 2 B Z B Nc < 2Z + 1 + min 1 + 2 , C 1 + ln (4.53) 2 Z Z2 for some constant C independent of B and Z. Inserting (4.53) into (4.52) this proves Lemma 2.9.

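4.6 Convergence of the density matrices

Lemma 4.11 (Ground states are Hartree minimizers). For fixed $\lambda$ and $\beta$, let $\Psi_N$ be an $\varepsilon$-approximate ground state of $H_{N,\lambda,\beta}$ defined in (1.3), i.e.

$$\langle\Psi_N, H_{N,\lambda,\beta}\Psi_N\rangle \le E(N,\lambda,\beta) + \varepsilon(N) \tag{4.54}$$

for all $N$, with $0 < \varepsilon = o(N)$. Let $\Gamma_N$ be the corresponding one-particle reduced density matrix. Then $\lambda\Gamma_N/N$ is a minimizing sequence for $\mathcal{E}^{\rm MH}_\beta$, i.e.

$$\lim_{N\to\infty}\mathcal{E}^{\rm MH}_\beta[\lambda\Gamma_N/N] = E^{\rm MH}(\lambda,\beta). \tag{4.55}$$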

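Proof. Let $\rho_N$ be the density of $\Gamma_N$. We have

\begin{align*}
E^{\rm MH} \le \mathcal{E}^{\rm MH}_\beta[\lambda\Gamma_N/N]
&= \frac{\lambda}{N}\,{\rm Tr}\big[(H_\beta - \beta - |x|^{-1})\Gamma_N\big] + \frac{\lambda^2}{N^2}D[\rho_N,\rho_N] \\
&= \frac{\lambda}{N}\langle\Psi_N|H_{N,\lambda,\beta}\Psi_N\rangle - \frac{\lambda^2}{N^2}\Big\langle\Psi_N\Big|\sum_{i<j}|x_i - x_j|^{-1}\Psi_N\Big\rangle + \frac{\lambda^2}{N^2}D[\rho_N,\rho_N] \\
&\le \frac{\lambda}{N}\big(E(N,\lambda,\beta) + \varepsilon\big) + \frac{\lambda^2}{N^2}\,C\int\rho_N^{4/3}, \tag{4.56}
\end{align*}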

where we used again the Lieb-Oxford inequality for the last step. By an analogous argument as in Lemma 4.3 we see that the right side of (4.56) converges to $E^{\rm MH}$ as $N \to \infty$.

Since $\lambda\Gamma_N/N$ is a minimizing sequence for $\mathcal{E}^{\rm MH}_\beta$, and $\Gamma^{\rm H}$ is unique, we know that $\lambda\Gamma_N/N \rightharpoonup \Gamma^{\rm H}$ in weak operator sense. If $\lambda \le \lambda_c$, there is even norm convergence. To show this we need the following general lemma:

Lemma 4.12 (Weak convergence implies norm convergence). Let $a_N$ be a sequence of positive trace class operators on a Hilbert space. Suppose that $a_N \rightharpoonup a$ in weak operator sense, for some positive trace class operator $a$. If $\lim_{N\to\infty}{\rm Tr}[a_N] = {\rm Tr}[a]$, then $a_N \to a$ in trace norm, i.e.

$$\|a_N - a\|_1 \equiv {\rm Tr}[|a_N - a|] \to 0 \quad\text{as}\quad N \to \infty. \tag{4.57}$$



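Proof. See [S79], Thm. 2.20.

Theorem 4.13 (Convergence of the density matrices). Let $\Gamma_N$ be as in Lemma 4.11, and let $\lambda \le \lambda_c$. Then for each fixed $\beta$

$$\lim_{N\to\infty}\big\|\lambda\Gamma_N/N - \Gamma^{\rm H}\big\|_1 = 0. \tag{4.58}$$

Proof. Since $\Gamma^{\rm H}$ is unique, $\lambda\Gamma_N/N$ converges weakly to $\Gamma^{\rm H}$ by Lemma 4.11 and the proof of Theorem 2.4. Apply Lemma 4.12 to $a_N = \lambda\Gamma_N/N$ and $a = \Gamma^{\rm H}$, and note that ${\rm Tr}[\lambda\Gamma_N/N] = \lambda$ and ${\rm Tr}[\Gamma^{\rm H}] = \min\{\lambda, \lambda_c\}$.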
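A standard example, not taken from the paper, may help to see why the trace condition in Lemma 4.12 cannot be dropped and why Theorem 4.13 is restricted to $\lambda \le \lambda_c$. The orthonormal basis $e_N$ of $\ell^2(\mathbb{N})$ below is an assumption introduced only for this illustration.

```latex
% Illustration only: a_N is the rank-one projection onto the basis vector e_N.
\[
a_N = |e_N\rangle\langle e_N| \rightharpoonup 0
\quad\text{since}\quad
\langle\phi|a_N\chi\rangle = \langle\phi|e_N\rangle\langle e_N|\chi\rangle \to 0,
\qquad\text{but}\qquad
{\rm Tr}[a_N] = 1 \neq 0, \quad \|a_N - 0\|_1 = 1 .
\]
```

The same loss of mass rules out trace norm convergence for $\lambda > \lambda_c$: there ${\rm Tr}[\lambda\Gamma_N/N] = \lambda$ stays strictly above ${\rm Tr}[\Gamma^{\rm H}] = \lambda_c$, while trace norm convergence would force the traces to converge.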

4.7 Bose condensation

Bose condensation, as defined in [PO56], means that the largest eigenvalue of the one-particle reduced density matrix of the ground state for the $N$-particle Hamiltonian is of order $N$ as $N \to \infty$, or equivalently, that $\liminf_{N\to\infty}\|\Gamma_N\|/{\rm Tr}[\Gamma_N] > 0$. This is shown to be the case for our model.

Theorem 4.14 (Bose condensation in the mean field limit). Let $\Gamma_N$ be as in Lemma 4.11. Then for each fixed $\beta$ and $\lambda$

$$\lim_{N\to\infty}\frac{\|\Gamma_N\|}{{\rm Tr}[\Gamma_N]} = \frac{\min\{\lambda,\lambda_c\}}{\lambda}. \tag{4.59}$$

In particular, if $\lambda \le \lambda_c$, and if $\varphi_N$ denotes the normalized eigenvector corresponding to the largest eigenvalue of $\Gamma_N$ with appropriate phase-factor, we have

$$\lim_{N\to\infty}\frac{\|\Gamma_N\|}{{\rm Tr}[\Gamma_N]} = 1, \qquad \lim_{N\to\infty}\|\varphi_N - \varphi^{\rm H}\|_2 = 0, \tag{4.60}$$

where $\varphi^{\rm H} = \lambda^{-1/2}\sqrt{\rho^{\rm H}}$ is the normalized ground state of $H^{\rm H}$.

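Proof. We use the notations of the proof of Theorem 4.13. The first assertion is easily proved, using that

$$\big|\,\|a_N\| - \|a\|\,\big| \le \|a_N - a\| \le \|a_N - a\|_1 \tag{4.61}$$

and $\|a\| = \min\{\lambda, \lambda_c\}$. To prove the second we denote $P = |\varphi^{\rm H}\rangle\langle\varphi^{\rm H}|$ and $P_N = |\varphi_N\rangle\langle\varphi_N|$, and compute

$${\rm Tr}[a_N P] = \|a_N\|\langle\varphi_N|P\varphi_N\rangle + {\rm Tr}\big[a_N^{1/2} P\, a_N^{1/2}(1 - P_N)\big] \le \|a_N\|\langle\varphi_N|P\varphi_N\rangle + {\rm Tr}[a_N] - \|a_N\|, \tag{4.62}$$

where we have used $P \le 1$ in the last step. Therefore $\langle\varphi_N|P\varphi_N\rangle \to 1$ as $N \to \infty$, which gives immediately the desired result.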
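The last two sentences of the proof can be spelled out as follows. This is a sketch under two assumptions that are not stated explicitly here: that $\Gamma^{\rm H} = \lambda P$ for $\lambda \le \lambda_c$ (a rank-one operator with trace $\lambda$), and that the phase of $\varphi_N$ is chosen so that $\langle\varphi^{\rm H}|\varphi_N\rangle \ge 0$.

```latex
% Assumptions: \lambda \le \lambda_c, a = \Gamma^H = \lambda P, \langle\varphi^H|\varphi_N\rangle \ge 0.
\begin{align*}
&|{\rm Tr}[(a_N - a)P]| \le \|a_N - a\|_1 \to 0
  \quad\Longrightarrow\quad {\rm Tr}[a_N P] \to {\rm Tr}[aP] = \lambda, \\
&\|a_N\| \to \lambda, \qquad {\rm Tr}[a_N] - \|a_N\| = \lambda - \|a_N\| \to 0, \\
&\text{so (4.62) gives}\quad
  1 \ge \langle\varphi_N|P\varphi_N\rangle
  \ge \frac{{\rm Tr}[a_N P] - ({\rm Tr}[a_N] - \|a_N\|)}{\|a_N\|} \to 1, \\
&\|\varphi_N - \varphi^{\rm H}\|_2^2
  = 2 - 2\,\langle\varphi^{\rm H}|\varphi_N\rangle
  = 2 - 2\,\langle\varphi_N|P\varphi_N\rangle^{1/2} \to 0 .
\end{align*}
```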


Acknowledgements We thank Jakob Yngvason for continuing interest and fruitful discussions.

References

[LSY94a] Elliott H. Lieb, Jan Philip Solovej, and Jakob Yngvason, Asymptotics of Heavy Atoms in High Magnetic Fields: I. Lowest Landau Band Regions, Commun. Pure Appl. Math. XLVII, 513–591 (1994)

[LSY94b] Elliott H. Lieb, Jan Philip Solovej, and Jakob Yngvason, Asymptotics of Heavy Atoms in High Magnetic Fields: II. Semiclassical Regions, Commun. Math. Phys. 161, 77–124 (1994)

[BSY00] Bernhard Baumgartner, Jan Philip Solovej, and Jakob Yngvason, Atoms in strong magnetic fields: The high field limit at fixed nuclear charge, Commun. Math. Phys. 212, 703–724 (2000)

[AHS81] J.E. Avron, Ira W. Herbst, and Barry Simon, Schrödinger Operators with Magnetic Fields III. Atoms in Homogeneous Magnetic Field, Commun. Math. Phys. 79, 529–572 (1981)

[FW94] R. Froese and R. Waxler, The spectrum of a hydrogen atom in an intense magnetic field, Rev. in Math. Phys. 6, 699–832 (1994)

[BL83] Rafael Benguria and Elliott H. Lieb, Proof of the Stability of Highly Negative Ions in the Absence of the Pauli Principle, Phys. Rev. Lett. 50, 1771–74 (1983)

[LO77] R. Lavine and M. O’Carroll, Ground state properties and lower bounds for energy levels of a particle in a uniform magnetic field and external potential, J. Math. Phys. 18, 1908–12 (1977)

[AHS78] J. Avron, I. Herbst, B. Simon, Schrödinger Operators with Magnetic Fields. I. General Interaction, Duke Math. J. 45, 847–883 (1978)

[JY96] K. Johnsen, J. Yngvason, Density-matrix calculations for matter in strong magnetic fields: Ground states of heavy atoms, Phys. Rev. A54, 1936–1946 (1996)

[BRW99] Raymond Brummelhuis, Mary Beth Ruskai, and Elisabeth Werner, One Dimensional Regularizations of the Coulomb Potential with Applications to Atoms in Strong Magnetic Fields, arXiv:math-ph/9912020

[S76] Barry Simon, Universal Diamagnetism of Spinless Bose Systems, Phys. Rev. Lett. 36, 1083–4 (1976)

[B84] B. Baumgartner, On Thomas-Fermi-von Weizsäcker and Hartree energies as functions of the degree of ionisation, J. Phys. A17, 1593–1602 (1984)

[S00] Robert Seiringer, On the maximal ionization of atoms in strong magnetic fields, arXiv:math-ph/0006002

[RS78] M. Reed, B. Simon, Methods of Modern Mathematical Physics IV, Academic Press (1978)

[LL97] E. H. Lieb, M. Loss, Analysis, Amer. Math. Society (1997)

[GS95] Harald Grosse, Joachim Stubbe, Splitting of Landau Levels in the Presence of External Potentials, Lett. Math. Phys. 34, 59–68 (1995)

[LO81] E. H. Lieb, S. Oxford, Improved Lower Bound on the Indirect Coulomb Energy, Int. J. Quant. Chem. 19, 427–439 (1981)

[BBL81] Rafael Benguria, H. Brezis, Elliott H. Lieb, The Thomas-Fermi-von Weizsäcker Theory of Atoms and Molecules, Commun. Math. Phys. 79, 167–180 (1981)

[S79] Barry Simon, Trace ideals and their applications, Cambridge University Press, 1979

[PO56] Oliver Penrose, Lars Onsager, Bose-Einstein Condensation and Liquid Helium, Phys. Rev. 104, 576–584 (1956)

Bernhard Baumgartner and Robert Seiringer
Institut für Theoretische Physik
Universität Wien
Boltzmanngasse 5
A-1090 Vienna
Austria
e-mail: [email protected]
e-mail: [email protected]

Communicated by Gian Michele Graf
submitted 12/07/00, accepted 11/09/00




Electron Wavefunctions and Densities for Atoms

Maria Hoffmann-Ostenhof, Thomas Hoffmann-Ostenhof, Thomas Østergaard Sørensen

Abstract. With a special ‘Ansatz’ we analyse the regularity properties of atomic electron wavefunctions and electron densities. In particular we prove an a priori estimate, $\sup_{y\in B(x,R)}|\nabla\psi(y)| \le C(R)\sup_{y\in B(x,2R)}|\psi(y)|$, and obtain for the spherically averaged electron density, $\tilde\rho(r)$, that $\tilde\rho''(0)$ exists and is non-negative.

1 Introduction and Results Let V be the Coulomb potential for an atom consisting of a nucleus of charge Z (fixed at the origin) and N electrons: V (x) = V (x1 , . . . , xN ) =

N j=1

x = (x1 , . . . , xN ) ∈ R

3N

−

Z + |xj |

1≤j 0 2 r

S2

(1.10)

h(rω) dω. Thereby,

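$\tilde h \in C^{\alpha}((0,\infty)) \cap C^{0}([0,\infty))$ for all $\alpha \in (0,1)$ and $\tilde\rho \in C^{2,\alpha}((0,\infty)) \cap C^{2}([0,\infty))$ for all $\alpha \in (0,1)$.

(iii)

$$h(x) \le C(R)\Big(\int_{B(x,R)}\rho(y)\,dy + \rho(x)\Big) \quad\text{for all } x \in \mathbb{R}^3, \tag{1.11}$$

$$h(x) \ge \varepsilon\,\rho(x) \quad\text{for all } x \in \mathbb{R}^3, \text{ if } \varepsilon = E_0^{N-1} - E > 0. \tag{1.12}$$

(iv)

$$\frac{d^2\tilde\rho}{dr^2}(0) = \frac{2}{3}\Big[\tilde h(0) + Z^2\tilde\rho(0)\Big]. \tag{1.13}$$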

Remark 1.12 The results in (i) generalize to the case of molecules, where the continuity results for $\rho$ and $h$ hold in the complement of the set $\{R_1, \dots, R_L\} \subset \mathbb{R}^3$ (see Remark 1.7).

Remark 1.13 It is known that eigenfunctions obey (Kato’s) Cusp Condition (see Kato [8]), and similar properties hold for particle densities. For more recent results see Hoffmann-Ostenhof et al. [4], [5], Hoffmann-Ostenhof and Seiler [6]. In the proof of Theorem 1.11 (iv) we make use of the Cusp Condition for $\tilde\rho$, namely

$$\tilde\rho'(0) = \lim_{r\downarrow 0}\frac{\tilde\rho(r) - \tilde\rho(0)}{r} = -Z\tilde\rho(0) \quad\text{and}\quad \lim_{r\downarrow 0}\tilde\rho'(r) = \tilde\rho'(0), \tag{1.14}$$

and also present a proof for it.

Remark 1.14 Of course our results are only first steps in a thorough investigation of qualitative properties of the one-electron density. Here are some obvious open questions:
(i) Is $\rho(x) > 0$ for all $x \in \mathbb{R}^3$? We remark that this cannot be true in general, since it is false for some excited states of Hydrogen.
(ii) Is $\rho \in C^{\infty}(\mathbb{R}^3\setminus\{0\})$ or even $C^{\omega}(\mathbb{R}^3\setminus\{0\})$?
(iii) Is $\tilde\rho$ smooth in $[0,\infty)$, in the sense that $\frac{d^k\tilde\rho}{dr^k}(r)$ exists for $r \ge 0$ for all $k$?
(iv) Is $\frac{d}{dr}\tilde\rho(r) \le 0$ for $r \ge 0$? This is expected to be true for groundstate densities, but not known even for the bosonic case like Helium. Our results imply that $\frac{d}{dr}\tilde\rho(r) \le 0$ for $r \le R_0$ for the bosonic case, where $R_0$ depends on the constant $C$ in Theorem 1.2. Note that because of (1.9) and (1.12) we have $\Delta\rho \ge 0$ for $|x| \ge Z/\varepsilon$, and so the Maximum Principle gives that $\frac{d}{dr}\tilde\rho(r) < 0$ for $r > Z/\varepsilon$.
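As a sanity check of (1.12), (1.13) and (1.14), one can work out the one-electron case by hand. The sketch below rests on assumptions that are not visible in this excerpt: $N = 1$, the operator $-\Delta - Z/|x|$ (consistent with the factor $Z/2$ in (3.51)), equation (1.9) in the form $-\tfrac12\Delta\rho - \tfrac{Z}{|x|}\rho + h = 0$ (consistent with Remark 1.14 (iv)), and $E_0^{0} = 0$. The normalization constant $c$ is hypothetical and drops out.

```latex
% Hydrogen-like ground state, assumptions as listed in the lead-in.
\begin{align*}
\psi(x) &= \sqrt{c}\,e^{-Z|x|/2}, \qquad E = -\tfrac{Z^2}{4}, \qquad \rho(x) = c\,e^{-Z|x|}, \\
\Delta\rho &= \rho'' + \tfrac{2}{r}\rho' = Z^2\rho - \tfrac{2Z}{r}\rho
  \quad\Longrightarrow\quad h = \tfrac12\Delta\rho + \tfrac{Z}{r}\rho = \tfrac{Z^2}{2}\rho, \\
\text{(1.12):}\quad h &= \tfrac{Z^2}{2}\rho \ge \tfrac{Z^2}{4}\rho = \varepsilon\rho, \\
\text{(1.14):}\quad \tilde\rho'(0) &= -Z\tilde\rho(0), \\
\text{(1.13):}\quad \tfrac{2}{3}\big[\tilde h(0) + Z^2\tilde\rho(0)\big]
  &= \tfrac{2}{3}\cdot\tfrac{3}{2}Z^2\tilde\rho(0) = Z^2\tilde\rho(0) = \tilde\rho''(0).
\end{align*}
```

Since $\rho$ is radial here, the spherical averages $\tilde\rho$ and $\tilde h$ differ from $\rho$ and $h$ only by the common factor $4\pi$, so every identity above holds with or without tildes.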

Remark 1.15 In the proof of Theorem 1.11 we obtain (see Proposition 3.1): With $\nabla_1 = \big(\frac{\partial}{\partial x_{1,1}}, \frac{\partial}{\partial x_{1,2}}, \frac{\partial}{\partial x_{1,3}}\big)$, the function

$$t_1(r) = \int_{S^2}\int_{\mathbb{R}^{3(N-1)}}|\nabla_1\psi(r\omega, x_2, \dots, x_N)|^2\,dx_2\cdots dx_N\,d\omega$$

is continuous on $[0,\infty)$.


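2 Proofs

Throughout the proofs, we will denote by $C$ generic constants. Crucial for our investigations is Corollary 8.36 in Gilbarg and Trudinger [3]. We shall make use of this result several times and for convenience we state it already here, adapted for our special case:

Proposition 2.1 Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ and suppose $u \in W^{1,2}(\Omega)$ is a weak solution of $\Delta u + \sum_{j=1}^{n} b_j D_j u + W u = g$ in $\Omega$, where $b_j, W, g \in L^{\infty}(\Omega)$. Then $u \in C^{1,\alpha}(\Omega)$ for all $\alpha \in (0,1)$ and for any domain $\Omega'$, $\overline{\Omega'} \subset \Omega$ we have

$$|u|_{C^{1,\alpha}(\Omega')} \le C\Big(\sup_{\Omega}|u| + \sup_{\Omega}|g|\Big)$$

for $C = C(\alpha, n, M, {\rm dist}(\Omega', \partial\Omega))$, with

$$\max_{j=1,\dots,n}\big\{1, \|b_j\|_{L^{\infty}(\Omega)}, \|W\|_{L^{\infty}(\Omega)}, \|g\|_{L^{\infty}(\Omega)}\big\} \le M.$$

Thereby

$$|u|_{C^{1,\alpha}(\Omega')} = \|u\|_{L^{\infty}(\Omega')} + \|\nabla u\|_{L^{\infty}(\Omega')} + \sup_{x,y\in\Omega',\ x\neq y}\frac{|\nabla u(x) - \nabla u(y)|}{|x-y|^{\alpha}}.$$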

Proof of Theorem 1.2 and Proposition 1.5. Let the function F be as in (1.6) and define the function F1 by F1 (x1 , . . . , xN ) =

N

−

j=1

Z 2

|xj |2 + 1 +

1≤j 0, this gives −1 b − a−1 ≤ |b − a|α max{a−1−α , b−1−α }. So, for x, y, z ∈ R3 , using max{s, t} ≤ s + t and the triangle inequality in R3 , we have |x − z|−1 − |y − z|−1 ≤ |x − y|α |x − z|−1−α + |y − z|−1−α . In this way, by (3.44) and equivalence of norms in R3N , |K(¯ x1 , x2 , . . . , xN )| |K(¯ x1 , x2 , . . . , xN )| + (B) ≤ dx2 · · · dxN |x1 − x2 |1+α |¯ x1 − x2 |1+α N ≤C exp(−γc0 |xj |) dxj j=3

×

R3

R3

1 1 + |¯ x1 − x2 |1+α |x1 − x2 |1+α

exp(−γc0 |x2 |) dx2

≤ C(x0 , R), since x1 , x ¯1 ∈ B(x0 , R). This finishes the proof of Lemma 3.4 (a). The proof of (b) is similar to that of (a) so we omit the details. The proof of the following fact is straightforward:

There exist constants $C = C(\gamma, R)$ and $\tilde\gamma = \tilde\gamma(\gamma)$ such that

$$\exp(-\gamma|(x, \dots, x_N)|) \le C\exp(-\tilde\gamma|(x_0, \dots, x_N)|) \tag{3.46}$$

for all $x \in B(x_0, R)$. Using this and Lemmas 3.3 and 3.4, we shall prove the following lemma on the regularity of the functions $J_1$, $J_2$ and $J_3$ from (3.42), and $\rho$.

Lemma 3.5 Let $J_1$, $J_2$ and $J_3$ be as in (3.42). Then
(i) $\rho, J_2, J_3 \in C^{\alpha}_{\rm loc}(\mathbb{R}^3)$ for all $\alpha \in (0,1)$.
(ii) $J_1 \in C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$ for all $\alpha \in (0,1)$.

Herefrom follow the regularity properties of the function $h$ stated in Proposition 3.1.

Proof of Lemma 3.5 (i). Firstly, by Theorem 1.2 and Remark 1.8,

$$|\psi(x)|,\ |\nabla\psi(x)| \le C\exp(-\gamma|x|), \quad x \in \mathbb{R}^{3N}, \tag{3.47}$$

which gives (3.44) for $K = \psi^2$. Next we verify that for $G = \psi^2$, (3.43) is fulfilled. Then Lemma 3.3 (with $G = \psi^2$) and Lemma 3.4 (with $K = \psi^2$) can be applied, and the Hölder-continuity of $J_2$, $J_3$ and $\rho$ follows.

Given $x_0 \in \mathbb{R}^3$, $R > 0$, $\alpha \in (0,1)$, and $x, y \in B(x_0, R)$, $x \neq y$, such that the line through $x$ and $y$ is parallel to one of the coordinate axes in $\mathbb{R}^3$, e.g. $x = (x_{1,1}, x_{1,2}, x_{1,3})$, $y = (y_{1,1}, x_{1,2}, x_{1,3})$, $x_{1,1} \neq y_{1,1}$. Using that $\psi^2 \in W^{1,1}(\mathbb{R}^{3N}) \cap C^{0}(\mathbb{R}^{3N})$ we get, for almost all $(x_{1,2}, x_{1,3}, x') \in \mathbb{R}^{3N-1}$ (here, $(x, x_2, \dots, x_N) = (x_{1,1}, x_{1,2}, x_{1,3}, x')$, $x' \in \mathbb{R}^{3(N-1)}$, $x_{1,i} \in \mathbb{R}$)

\begin{align*}
\psi^2(x, x') - \psi^2(y, x')
&= \int_0^1 \frac{\partial}{\partial s}\Big[\psi^2\big(s x_{1,1} + (1-s)y_{1,1}, x_{1,2}, x_{1,3}, x'\big)\Big]\,ds \\
&= \int_0^1 \frac{\partial(\psi^2)}{\partial x_{1,1}}\big(s x_{1,1} + (1-s)y_{1,1}, x_{1,2}, x_{1,3}, x'\big)\cdot(x_{1,1} - y_{1,1})\,ds. \tag{3.48}
\end{align*}

Since $sx + (1-s)y \in B(x_0, R)$ for all $s \in [0,1]$ we get, with (3.48), (3.47) and (3.46),

\begin{align*}
\frac{|\psi^2(x, x') - \psi^2(y, x')|}{|x-y|^{\alpha}}
&\le 2|x-y|^{1-\alpha}\int_0^1 \Big|\frac{\partial\psi}{\partial x_{1,1}}\big(sx + (1-s)y, x'\big)\Big|\cdot\big|\psi\big(sx + (1-s)y, x'\big)\big|\,ds \\
&\le 2(2R)^{1-\alpha}\int_0^1 C\exp\big(-2\gamma|(sx + (1-s)y, x')|\big)\,ds \\
&\le C\exp\big(-\tilde\gamma|(x_0, x')|\big). \tag{3.49}
\end{align*}

As both the RHS and the LHS of (3.49) are continuous in $(x_{1,2}, x_{1,3}, x')$, the inequality holds for all $(x_{1,2}, x_{1,3}, x') \in \mathbb{R}^{3N-1}$. The inequality for general $x, y \in B(x_0, R)$ now follows from the fact that $x$ and $y$ can be joined by a path consisting of three line segments, each parallel to a coordinate axis and completely contained in $B(x_0, R)$. So (3.43) follows for $G = \psi^2$. This proves (i) of Lemma 3.5.

To prove (ii), we write $\psi$ as in the proof of Theorem 1.2: $\psi = e^{F-F_1}\psi_1$, with $F$ and $F_1$ as in (1.6) and (2.15). Then

\begin{align*}
J_1(x) = \int|\nabla\psi|^2\,dx'
&= \int|\nabla F|^2\psi^2\,dx' + \int|\nabla F_1|^2\psi^2\,dx' - 2\int\nabla F\cdot\nabla F_1\,\psi^2\,dx' + \int e^{2(F-F_1)}|\nabla\psi_1|^2\,dx' \\
&\quad + 2\int\nabla F\cdot\nabla\psi_1\,e^{2(F-F_1)}\psi_1\,dx' - 2\int\nabla F_1\cdot\nabla\psi_1\,e^{2(F-F_1)}\psi_1\,dx' \\
&\equiv I_1(x) + I_2(x) + I_3(x) + I_4(x) + I_5(x) + I_6(x). \tag{3.50}
\end{align*}
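The six terms in (3.50) come from the product rule applied to $\psi = e^{F-F_1}\psi_1$. The worked step below is a sketch that assumes $\psi$ is real-valued (as the notation $\psi^2$ suggests) and that the integration variable is $x' = (x_2, \dots, x_N)$.

```latex
% Pointwise identity behind (3.50); integrate over x' to obtain I_1, ..., I_6.
\begin{align*}
\nabla\psi &= (\nabla F - \nabla F_1)\,\psi + e^{F-F_1}\nabla\psi_1, \\
|\nabla\psi|^2 &= |\nabla F|^2\psi^2 + |\nabla F_1|^2\psi^2 - 2\,\nabla F\cdot\nabla F_1\,\psi^2
   + e^{2(F-F_1)}|\nabla\psi_1|^2 \\
 &\quad + 2\,(\nabla F - \nabla F_1)\cdot\nabla\psi_1\;e^{F-F_1}\psi,
 \qquad\text{with}\quad e^{F-F_1}\psi = e^{2(F-F_1)}\psi_1 .
\end{align*}
```

Read in this order, the terms give $I_1$, $I_2$, $I_3$, $I_4$, and the last line splits into $I_5$ (the part with $\nabla F$) and $I_6$ (the part with $\nabla F_1$).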

Using the idea from (3.48) twice (on |∇F1 |2 and ψ2 , respectively), the estimates (2.17), (3.47), (3.49), and (3.46), we have, with x0 ∈ R3 , R > 0, α ∈ (0, 1), and x, y ∈ B(x0 , R): |∇F1 |2 ψ2 (x, x ) − |∇F1 |2 ψ2 (y, x ) ≤ |x − y|α |∇F1 (x, x )|2 − |∇F1 (y, x )|2 |ψ(x, x )|2 |x − y|α 2 ψ (x, x ) − ψ2 (y, x ) + |∇F1 (y, x )|2 |x − y|α ≤ C |x − y|1−α ∇ |∇F1 |2 ∞ exp(−2γ|(x, x2 , . . . , xN )|) − y|1−α |∇F1 |2 exp(− + 2C|x γ |(x0 , . . . , xN )|) ∞ ≤ C¯ exp(−¯ γ (x0 , . . . , xN )|). α By Lemma 3.3, with G = |∇F1 |2 ψ2 (x1 , · · · , xN ), this implies that I2 ∈ Cloc (R3 ). F1 −F ψ), gives (3.43) and Using the same ingredients, writing ∇ψ1 = ∇(e (3.44) with

G = K = e2(F −F1 ) |∇ψ1 |2 and G = K = ∇F1 · ∇ψ1 e2(F −F1 ) ψ1 , α and so by Lemma 3.3, I4 , I6 ∈ Cloc (R3 ).




The remaining terms are those involving the function F , namely I1 , I3 and I5 . Note that Z x1 xN

∇F (x1 , . . . , xN ) = − ,... , 2 |x1 | |xN |   N N N−1 xN − xk xj − xk 1  x1 − xk  + ,... , ,... , 4 |x1 − xk | |xj − xk | |xN − xk | k=2

k=1,k=j

(3.51)

k=1

and so Z N Z2 − |∇F (x1 , . . . , xN )| = 4 8

N

2

+

1 16

j,k=1,k=j N

j,k,l=1,k=j,l=j

xj xj − xk · |xj | |xj − xk |

xj − xk xj − xl · . |xj − xk | |xj − xl |

In this way, N Z2 I1 (x1 ) = 4 −

+

=

Z 8 1 16

ψ2 dx2 · · · dxN

N j,k=1,k=j

xj xj − xk 2 · ψ dx2 · · · dxN |xj | |xj − xk |

N j,k,l=1,k=j,l=j

N Z2 Z ρ(x1 ) − 4 8

N

xj − xk xj − xl 2 · ψ dx2 · · · dxN |xj − xk | |xj − xl | κj,k (x1 ) +

j,k=1,k=j

1 16

N

νj,k,l (x1 ).

(3.52)

j,k,l=1,k=j,l=j

Note that νj,k,l = νj,l,k . Using the ideas above, Lemma 3.3 implies that the following functions from α (R3 ): (3.52) (with the mentioned choices of G satisfying (3.43)) are all in Cloc ρ: κj,k ,

j, k = 1, j = k : νj,k,k , j = k :

νj,k,l ,

j, k, l = 1, l = j = k :

G = ψ2 , xj − xk 2 xj · ψ , G= |xj | |xj − xk | xj − xk xj − xk 2 G= · ψ = ψ2 , |xj − xk | |xj − xk | xj − xl 2 xj − xk · ψ . G= |xj − xk | |xj − xl |


Likewise, Lemma 3.4 implies (with the mentioned choices of G = K satisfying α (3.43) and (3.44)) that the following functions from (3.52) are all in Cloc (R3 ): κj,1 , νj,1,l ,

j = 1 :

j, l = 1, j = l :

xj · (xj − x1 ) 2 ψ , |xj | (xj − x1 ) · (xj − xl ) 2 ψ , G=K= |xj − xl | G=K=

From the decomposition of I1 in (3.52) we are left with x1 x1 − xk 2 · ψ dx , k = 2, . . . , N, κ1,k (x1 ) = |x1 | |x1 − xk |

(3.53)

and ν1,k,l (x1 ) = Note that

x1 − xk x1 − xl 2 · ψ dx |x1 − xk | |x1 − xl |

,

k, l ∈ {2, . . . , N }, k = l.

(3.54)

x1 x1 − xk 2 · ψ dx2 · · · dxN |x1 | |x1 − xk | 1 1 x1 · (x1 − xk ) ψ2 dx2 · · · dxN . = |x1 | |x1 − xk |

α The function 1/|x1 | is smooth for x1 = 0 and therefore in Cloc (R3 \ {0}). The 2 function x1 · (x1 − xk ) ψ satisfies (3.43) and (3.44) (by the same ideas as above), so Lemma 3.4 (a) implies that the function 1 x1 · (x1 − xk ) ψ2 dx2 · · · dxN |x1 − xk | α α is in Cloc (R3 ). The functions in (3.53) are therefore in Cloc (R3 \ {0}). α 3 As for the functions in (3.54), these are all in Cloc (R ), which can be seen by applying the previous ideas, in particular Lemma 3.5, (3.46), (3.47) and (3.49). α (R3 \ {0}). This proves that I1 ∈ Cloc As for I3 (see (3.50) and (3.51)), with ∇ = (∇1 , . . . , ∇N ),

I3 (x) = Z

N

xj · ∇j F1 ψ2 dx2 · · · dxN |xj | j=1

1 − 2

N j,k=1,j=k

xj − xk · ∇j F1 ψ2 dx2 · · · dxN . |xj − xk |

(3.55)

α The terms in the first sum with j = 1 are in Cloc (R3 ), due to Lemma 3.4 (b), with 2 G = K = (xj · ∇j F1 ) ψ satisfying (3.43) and (3.44). (To see this, use the previous




ideas; to apply the idea from (3.48) to ∇j F1 we use that F1 is smooth). The terms α in the second in Cloc (R3 ), due to Lemma 3.4 (a), applied with sum in (3.55) are all 2 α G = K = (xj − xk ) · ∇j F1 ψ . The term with j = 1 is in Cloc (R3 \ {0}). This can be seen by following the ideas in the proof of the regularity properties of the function in (3.53), now using Lemma 3.3 with G = (x1 · ∇1 F1 )ψ2 . The statements and proofs are similar for N

xj I5 (x) = −Z · ∇j ψ1 e2(F −F1 ) ψ1 dx2 · · · dxN |xj | j=1 +

1 2

N j,k=1,j=k

xj − xk · ∇j ψ1 e2(F −F1 ) ψ1 dx2 · · · dxN . |xj − xk |

(3.56)

That is, the functions in the first sum in (3.56) with j ≥ 2 and those in the second α (R3 ), whereas the function in the first sum with j = 1 is only sum are all in Cloc α 3 in Cloc (R \ {0}). To prove this we use the inequality (with x = (x1 , . . . , xN )): |ψ1 |C 1,α (B(x,R/2)) ≤ C

|ψ1 (y)| ≤ C exp(−γ|(x1 , . . . , xN )|).

sup y∈(B(x,R))

This inequality follows from (2.17), (2.19) and (3.47) (remember that $\psi_1 = e^{F_1-F}\psi$). This proves that $I_5 \in C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$, and so finishes the proof that $J_1 \in C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$ (see (3.50)). This proves (ii) and therefore Lemma 3.5.

That $\tilde h \in C^{\alpha}_{\rm loc}((0,\infty))$ is a consequence of the foregoing and of the following proposition:

Proposition 3.6 Assume $f \in C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$, $\alpha \in (0,1)$. Then $\tilde f \in C^{\alpha}_{\rm loc}((0,\infty))$, where $\tilde f(r) = \int_{S^2} f(r\omega)\,d\omega$.

Proof. Let $r \in (0,\infty)$. For all $x_0 \in A = \{x \in \mathbb{R}^3 \mid |x| = r\}$, choose $R = R(x_0)$ and $C = C(x_0)$ such that

$$\sup_{x,y\in B(x_0,R(x_0))}\frac{|f(x) - f(y)|}{|x-y|^{\alpha}} \le C(x_0).$$

This is possible, since $f \in C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$. Then $A \subset \bigcup_{x_0\in A} B(x_0, R(x_0))$. Using compactness of $A$, choose $x_1, \dots, x_m \in A$ such that

$$A \subset \bigcup_{j=1}^{m} B(x_j, R(x_j)). \tag{3.57}$$


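Choose $\epsilon \in (0, r)$ such that

$$\{y \in \mathbb{R}^3 \mid r - \epsilon < |y| < r + \epsilon\} \subset \bigcup_{j=1}^{m} B(x_j, R(x_j)).$$

Then, for all $s, t \in (r-\epsilon, r+\epsilon)$ and all $\omega \in S^2$ there exists $j \in \{1, \dots, m\}$ such that $s\omega, t\omega \in B(x_j, R(x_j))$ and therefore by (3.57),

$$\frac{|f(s\omega) - f(t\omega)|}{|s-t|^{\alpha}} = \frac{|f(s\omega) - f(t\omega)|}{|s\omega - t\omega|^{\alpha}} \le C(x_j).$$

So with $C = \max\{C(x_1), \dots, C(x_m)\}$,

$$\frac{|f(s\omega) - f(t\omega)|}{|s-t|^{\alpha}} \le C, \quad\text{for all } s, t \in (r-\epsilon, r+\epsilon) \text{ and all } \omega \in S^2.$$

This implies that

$$\frac{|\tilde f(s) - \tilde f(t)|}{|s-t|^{\alpha}} = \frac{\big|\int_{S^2}(f(s\omega) - f(t\omega))\,d\omega\big|}{|s-t|^{\alpha}} \le \int_{S^2}\frac{|f(s\omega) - f(t\omega)|}{|s\omega - t\omega|^{\alpha}}\,d\omega \le C, \quad\text{for all } s, t \in (r-\epsilon, r+\epsilon).$$

This proves that $\tilde f \in C^{\alpha}_{\rm loc}((0,\infty))$.

To prove that $\tilde h \in C^{0}([0,\infty))$, we apply the following:

Proposition 3.7 Assume $f \in C^{\alpha}_{\rm loc}(\mathbb{R}^3)$. Then $\tilde f \in C^{0}([0,\infty))$, where $\tilde f(r) = \int_{S^2} f(r\omega)\,d\omega$.

Proof. The function $f$ is continuous in $\mathbb{R}^3$, since it is in $C^{\alpha}_{\rm loc}(\mathbb{R}^3)$. Let $r \in [0,\infty)$. Then

$$\lim_{s\to r} f(s\omega) = f(r\omega) \quad\text{for all } \omega \in S^2.$$

Using the supremum of $f$ on a sufficiently large compact set in $\mathbb{R}^3$ as a dominant, Lebesgue’s Dominated Convergence Theorem gives us that

$$\lim_{s\to r}\int_{S^2} f(s\omega)\,d\omega = \int_{S^2} f(r\omega)\,d\omega.$$

Therefore $\tilde f \in C^{0}([0,\infty))$.

Recall the proof of the fact that $h \in C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$. In fact, the only terms in the decomposition of $h$ (see (3.41), (3.42), (3.50), and (3.51)) that are only in $C^{\alpha}_{\rm loc}(\mathbb{R}^3\setminus\{0\})$ and not in $C^{\alpha}_{\rm loc}(\mathbb{R}^3)$ are the functions

\begin{align*}
&\int\frac{x_1}{|x_1|}\cdot\frac{x_1 - x_k}{|x_1 - x_k|}\,\psi^2\,dx_2\cdots dx_N, \quad k = 2, \dots, N, \\
&\int\frac{x_1}{|x_1|}\cdot\nabla_1 F_1\,\psi^2\,dx_2\cdots dx_N, \\
&\int\frac{x_1}{|x_1|}\cdot\nabla_1\psi_1\,e^{2(F-F_1)}\psi_1\,dx_2\cdots dx_N. \tag{3.58}
\end{align*}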

Comparing (3.50), (3.53), (3.55) and (3.56), it can be seen that all the terms in (3.58) stem from the function $J_1$, namely from $I_1$, $I_3$, and $I_5$. All other terms in the decomposition of $h$ are in $C^{\alpha}_{\rm loc}(\mathbb{R}^3)$. When integrating them over $S^2$, we get something continuous in $[0,\infty)$, according to Proposition 3.7 above. For the terms in (3.58) we note that they are all of the form

$$\int\frac{x_1}{|x_1|}\cdot K(x_1, x')\,dx' = \frac{x_1}{|x_1|}\cdot\int K(x_1, x')\,dx'. \tag{3.59}$$

In each case, we have

$$L(x_1) = \big(L_1(x_1), L_2(x_1), L_3(x_1)\big) = \int K(x_1, x')\,dx', \qquad L_j \in C^{\alpha}_{\rm loc}(\mathbb{R}^3). \tag{3.60}$$

To see this, apply Lemma 3.3 to each of the coordinate functions $L_j$, $j = 1, 2, 3$. The integrands are easily seen to satisfy (3.43) in each case, by the previous ideas. To get continuity in $[0,\infty)$ of the functions in (3.58) we use (3.59) and (3.60), and the following proposition:

Proposition 3.8 Assume $f = (f_1, f_2, f_3)$, $f_j \in C^{\alpha}_{\rm loc}(\mathbb{R}^3)$. Then $\bar f \in C^{0}([0,\infty))$, where

$$\bar f(r) = \int_{S^2}\omega\cdot f(r\omega)\,d\omega.$$

Proof. The same as for Proposition 3.7, noting that for all $r \in [0,\infty)$ and fixed $\omega \in S^2$:

$$\lim_{s\to r}\omega\cdot f(s\omega) = \omega\cdot f(r\omega).$$

This holds even in the case $r = 0$, for which

$$\lim_{s\downarrow 0}\int_{S^2}\omega\cdot f(s\omega)\,d\omega = \int_{S^2}\omega\cdot f(0)\,d\omega = 0.$$

This proves that the functions in (3.58) are in $C^{0}([0,\infty))$. Therefore $\tilde h \in C^{0}([0,\infty))$, which finishes the proof of Proposition 3.1.
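The key point of Proposition 3.8, that the average $\int_{S^2}\omega\cdot f(r\omega)\,d\omega$ tends to $0$ as $r \downarrow 0$ even though $\omega = x/|x|$ is discontinuous at the origin, can be checked numerically. The following is a minimal sketch using a midpoint rule in spherical angles; the function names, the grid sizes and the test field $f(x) = (1 + x_1, 2, -3)$ are hypothetical choices made for illustration, not part of the paper.

```python
import math


def sphere_integral(g, n_theta=100, n_phi=200):
    """Midpoint-rule approximation of the integral of g(omega) over S^2.

    g takes a unit vector omega = (w1, w2, w3); the surface measure is
    sin(theta) dtheta dphi.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            omega = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            total += g(omega) * sin_t
    return total * d_theta * d_phi


def radial_dot_average(f, r):
    """Approximate f_bar(r) = integral over S^2 of omega . f(r * omega)."""
    def integrand(omega):
        value = f(tuple(r * c for c in omega))
        return sum(w * v for w, v in zip(omega, value))
    return sphere_integral(integrand)


def example_field(x):
    """Hypothetical smooth vector field with f(0) = (1, 2, -3) != 0."""
    return (1.0 + x[0], 2.0, -3.0)


if __name__ == "__main__":
    # f_bar(0) vanishes because omega integrates to zero over the sphere.
    print(radial_dot_average(example_field, 0.0))
    # For r > 0 only the x_1 term survives: f_bar(r) = r * 4*pi/3.
    print(radial_dot_average(example_field, 1.5), 1.5 * 4.0 * math.pi / 3.0)
```

For this field the exact value is $\bar f(r) = 4\pi r/3$, which is continuous on $[0,\infty)$ and vanishes at $r = 0$, as the proposition asserts.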


Acknowledgement The authors wish to thank G. Friesecke and S. Fournais for stimulating discussions.

References

[1] R. Ahlrichs, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, and J.D. Morgan, III, Bounds on the decay of electron densities with screening, Phys. Rev. A (3) 23 (1981), 2106–2117.

[2] H.L. Cycon, R.G. Froese, W. Kirsch, and B. Simon, Schrödinger Operators with Application to Quantum Mechanics and Global Geometry, Texts and Monographs in Physics, Springer-Verlag, Berlin, 1987.

[3] D. Gilbarg and N.S. Trudinger, Elliptic Partial Differential Equations of Second Order, second ed., Springer-Verlag, Berlin, 1983.

[4] M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, and H. Stremnitzer, Local properties of Coulombic wave functions, Comm. Math. Phys. 163 (1994), 185–215.

[5] M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, and W. Thirring, Simple bounds to the atomic one-electron density at the nucleus and to expectation values of one-electron operators, J. Phys. B11 (1978), L571–L575.

[6] M. Hoffmann-Ostenhof and R. Seiler, Cusp conditions for eigenfunctions of n-electron systems, Phys. Rev. A23 (1981), 21–23.

[7] E. Hopf, Über den funktionalen, insbesondere den analytischen Charakter der Lösungen elliptischer Differentialgleichungen zweiter Ordnung, Math. Z. 34 (1931), 194–233.

[8] T. Kato, On the eigenfunctions of many-particle systems in quantum mechanics, Comm. Pure Appl. Math. 10 (1957), 151–177.

[9] E.H. Lieb, The Stability of Matter: From Atoms to Stars. Selecta of Elliot H. Lieb, second ed., Springer-Verlag, Berlin/Heidelberg, 1997.

[10] E.H. Lieb and M. Loss, Analysis, American Mathematical Society, Providence, RI, 1997.

[11] Jan Malý and William P. Ziemer, Fine regularity of solutions of elliptic partial differential equations, American Mathematical Society, Providence, RI, 1997.

[12] M. Reed and B. Simon, Methods of modern mathematical physics. II. Fourier analysis, self-adjointness, Academic Press, New York, 1975.

[13] M. Reed and B. Simon, Methods of modern mathematical physics. IV. Analysis of operators, Academic Press, New York, 1978.

[14] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. (N.S.) 7 (1982), 447–526.

[15] G.M. Zhislin and A.G. Sigalov, On the spectrum of the energy operator for atoms with fixed nuclei on subspaces corresponding to irreducible representations of a group of permutations, Am. Math. Soc., Transl., II. Ser. 91 (1970), 263–295.

T. Hoffmann-Ostenhof
Institut für Theoretische Chemie
Währingerstrasse 17
Universität Wien
A-1090 Wien
Austria
e-mail: thoff[email protected]

M. Hoffmann-Ostenhof
Institut für Mathematik
Strudlhofgasse 4
Universität Wien
A-1090 Wien
Austria
e-mail: mhoff[email protected]

T. Hoffmann-Ostenhof, T. Ø. Sørensen
The Erwin Schrödinger International Institute for Mathematical Physics
Boltzmanngasse 9
A-1090 Wien
Austria
e-mail: [email protected]

Work supported by Ministerium für Wissenschaft und Verkehr der Republik Österreich, the Austrian Science Foundation, grant number P12865-MAT, and by the European Union TMR grant FMRX-CT 96-0001

Communicated by Rafael D. Benguria
submitted 12/04/00, accepted 23/10/00




Uniform Singular Continuous Spectrum for the Period Doubling Hamiltonian

David Damanik

Abstract. We consider the ergodic family of Schrödinger operators generated by the period doubling substitution and we prove that every element of this family has purely singular continuous spectrum.

1 Introduction

The discovery of quasicrystals by Shechtman et al. in 1984 [21] has motivated continuing interest by both physicists and mathematicians in adequate models describing these structures. A class of models that has attracted particular attention in this context is provided by one-dimensional Schrödinger operators with potentials generated by so-called primitive substitution sequences, the Fibonacci sequence being the most prominent example. A considerable amount of knowledge about the spectral properties of these operators has since been accumulated, and, from a mathematical point of view, their study is also motivated by the fact that they exhibit rather unusual properties such as purely singular continuous spectral measures which are supported on Cantor sets of Lebesgue measure zero. In particular the apparent tendency of the spectral measures to be always purely singular continuous seems to reflect that substitution sequences give rise to potentials being intermediate between periodic (leading to absolutely continuous spectrum) and random (leading to pure point spectrum). While absence of absolutely continuous spectrum follows in full generality from works of Kotani [17] and Last-Simon [18], the problem of excluding eigenvalues has not yet been solved in similar generality. The spectral theory of substitution Hamiltonians is most conveniently studied within the context of ergodic families of Schrödinger operators since primitive substitution sequences naturally induce strictly ergodic subshifts [20] which serve as a family of potentials associated with a substitution model. One can then employ the powerful general results from this framework; see [4] for the general theory of ergodic Schrödinger operators and [7] for an introduction to one-dimensional quasicrystal models and their spectral theory.
The results on absence of eigenvalues for ergodic families of Schrödinger operators generated by primitive substitutions can be classified, in increasing generality, as generic (absence of eigenvalues for a dense $G_\delta$ set of elements of the subshift), almost sure (absence of eigenvalues for almost every element with respect to the unique ergodic measure µ on the subshift), and uniform results (absence of eigenvalues for all elements of the subshift). Generic results for certain classes of substitution models were established in [3, 15], almost sure results can be found in [5, 6, 8], and [9, 10] contain uniform results. Essentially all the known results rely on criteria that deduce absence of $\ell^2$ solutions from the presence of local symmetries, namely, local repetitions [14] or palindromes [15]. This explains why our current understanding of the problem is limited to models that exhibit such symmetries. Explicit models are known (e.g., the Rudin-Shapiro substitution) which give rise to models that do not have the required local symmetries. Moreover, from [14, 15] one can extract three common criteria: the palindrome method of Hof et al. [15] (based on a criterion of Jitomirskaya and Simon [16] which was developed in the context of uniformly almost periodic models) and the two-block [23] and three-block [13] versions of Gordon’s method [14]. While the palindrome method is an excellent tool to establish generic results for a large class of primitive substitution models [15], it was shown in [12] that one cannot prove a stronger (i.e., almost sure or uniform) result: its scope is always limited to a set of zero µ-measure. On the other hand, the three-block version of Gordon’s method allows for a very simple proof of an almost sure result in the case where the substitution sequence contains a sufficiently high power (slightly more than a third power is enough) [6, 8], whereas the criterion is not sufficient to prove uniform results [8]: there are always elements in the subshift to which one cannot apply the three-block criterion. The only uniform results that are known so far have therefore been established by the two-block version of Gordon. However, the two-block method requires an additional input, namely, sufficient control on a dynamical system (the so-called trace map) that is naturally associated with a substitution model.
Sufficient control essentially means that one has to establish boundedness of its orbits (for energies from the spectrum). Unfortunately, such a strong result is known only for a very small subclass of substitution models, namely, those of minimal combinatorial complexity, that is, models which are Sturmian (e.g., the Fibonacci case). This indicates that it might be hard to establish uniform results outside this small subclass. Our goal here is to establish uniform absence of eigenvalues for a prominent model, the family of Schrödinger operators generated by the period doubling substitution (see, e.g., [1, 5]), which indeed lies outside this small subclass and for which the strong trace map result does not hold [2].

Namely, on the alphabet A = {a, b}, consider the period doubling substitution S(a) = ab, S(b) = aa. Iterating on a, we obtain a one-sided sequence u = abaaabab . . . which is invariant under the substitution process. Define the associated subshift Ω to be the set of all two-sided sequences which have all their finite subblocks occurring in u. Choose some non-constant function f : A → R and define for ω ∈ Ω, a discrete one-dimensional Schrödinger operator $H_\omega$, acting in $\ell^2(\mathbb{Z})$, by

$$(H_\omega\psi)(n) = \psi(n+1) + \psi(n-1) + f(\omega_n)\psi(n).$$

We will prove the following theorem.

Theorem 1 For every ω ∈ Ω, the operator Hω has empty point spectrum. Remarks. (a) Note that the result is valid for all replacement functions f , that is, it is robust with respect to variation of a potential coupling constant. This is a general phenomenon in the spectral theory of one-dimensional substitution Hamiltonians which stems from the fact that the proofs are mostly combinatorial. (b) If we combine Theorem 1 with the results of Kotani [17] and Last-Simon [18], we get that for every ω ∈ Ω, the operator Hω has purely singular continuous spectrum. (c) Theorem 1 extends [5] where purely singular continuous spectrum was established for almost all ω ∈ Ω with respect to the unique ergodic measure µ on Ω (see also [6]). (d) Our proof of Theorem 1 uses a combination of the two-block and three-block versions of Gordon’s criterion along with partitions of the elements in the hull Ω and results for the trace map obtained by Bellissard et al. [1]. After recalling some more or less known concepts and results in Section 2, we give a proof of Theorem 1 in Section 3. Since the result is somewhat surprising, by virtue of our discussion preceding the statement of the theorem, we also discuss in Section 3 to what extent the approach of the present paper is likely to apply to other substitution models.

2 The Trace Map, Partitions, and Gordon’s Criterion

In this section we recall some useful results and methods that will be used in our proof of Theorem 1. Among these are the trace map, a dynamical system which is directly induced by the substitution rule, partitions of the elements of the hull into products of canonical words, and criteria for absence of eigenvalues of general one-dimensional Schrödinger operators which are based on Gordon’s work [14]. Let us first recall that the sequence u can be regarded as a limit of a sequence of words $s_n$ which obey recursive relations. Namely, with $s_n = S^n(a)$ we have, with obvious notation and meaning, $u = \lim_{n\to\infty} s_n$. Moreover, the words $s_n$ obey the recursion $s_n = s_{n-1} s_{n-2}^2$. More transparently, we have with $t_n = S^n(b)$

$s_n = s_{n-1} t_{n-1}, \quad t_n = s_{n-1} s_{n-1}. \qquad (1)$
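The recursions above are easy to check by machine. The following Python sketch is an illustration only and not part of the original argument; the helper names `substitute`, `iterate` and `check_recursions` are our own. It applies the period doubling substitution letter by letter and compares the result with (1) and with the relation $s_n = s_{n-1} s_{n-2}^2$.

```python
# Illustrative sketch; the function names are hypothetical, not from the paper.
RULE = {"a": "ab", "b": "aa"}  # period doubling substitution S


def substitute(word):
    """Apply S letter by letter to a finite word over {a, b}."""
    return "".join(RULE[c] for c in word)


def iterate(letter, n):
    """Return S^n(letter)."""
    word = letter
    for _ in range(n):
        word = substitute(word)
    return word


def check_recursions(n_max):
    """Check recursion (1) and s_n = s_{n-1} s_{n-2}^2 for 1 <= n <= n_max."""
    s = [iterate("a", n) for n in range(n_max + 1)]
    t = [iterate("b", n) for n in range(n_max + 1)]
    for n in range(1, n_max + 1):
        # Recursion (1): s_n = s_{n-1} t_{n-1}, t_n = s_{n-1} s_{n-1}.
        if s[n] != s[n - 1] + t[n - 1] or t[n] != s[n - 1] + s[n - 1]:
            return False
        # Equivalent single recursion for s_n alone.
        if n >= 2 and s[n] != s[n - 1] + s[n - 2] * 2:
            return False
    return True


print(iterate("a", 3))      # a prefix of the fixed point u
print(check_recursions(8))
```

The first line printed is the word $s_3$, which reproduces the prefix abaaabab of u quoted in the introduction. Working with plain strings keeps the check independent of any particular choice of f.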

Notice that $s_n$ and $t_n$ both have length $m = 2^n$. Moreover, the words $s_n$ and $t_n$ are almost identical [6]:

Proposition 2.1 For every n ∈ N, the words $s_n$ and $t_n$ are the same except for their respective rightmost symbol.

For $s_n = u_1 \dots u_m$ and $t_n = v_1 \dots v_m$ with $u_i, v_i \in A$ and E ∈ R, we define the matrices $M_n = M_n(E)$ and $N_n = N_n(E)$ by

$M_n = T(E, u_m) \times \cdots \times T(E, u_1), \quad N_n = T(E, v_m) \times \cdots \times T(E, v_1),$

where for c ∈ A and E ∈ R,

$T(E, c) = \begin{pmatrix} E - f(c) & -1 \\ 1 & 0 \end{pmatrix}.$
Let $x_n = x_n(E) = \mathrm{tr}(M_n)$ and $y_n = y_n(E) = \mathrm{tr}(N_n)$. Bellissard et al. derived in [1] the recursion

$x_n = x_{n-1} y_{n-1} - 2, \quad y_n = x_{n-1}^2 - 2, \qquad (2)$

which is called the trace map. This dynamical system on $\mathbb{R}^2$ is the central tool in the investigation of the spectral properties of the operators $H_\omega$. Trace maps are induced by all substitutions (see [3] and references therein) and their study in this context is natural and very useful. It is a standard result that there is a compact set Σ ⊆ R such that $\sigma(H_\omega) = \Sigma$ for every ω ∈ Ω. This follows essentially from the minimality of Ω which results from the fact that u is almost periodic, that is, every finite subblock of u occurs in u infinitely often and with bounded gaps. It follows from the analysis of the trace map performed by Bellissard et al. in [1] that for every E ∈ Σ, we have the following: If $|x_n(E)| > 2$ for some n, then $|x_{n+1}(E)| \le 2$. We can therefore state the following proposition.

Proposition 2.2 For every E ∈ Σ and every n ∈ N, we have $\min\{|x_n(E)|, |x_{n+1}(E)|\} \le 2$.

The next crucial concept we want to recall is the fact that for every n, every ω ∈ Ω can be uniquely decomposed into an infinite product of blocks of the form $s_n$ or $t_n$. Let us call this decomposition the n-partition of ω. We summarize the properties we shall need in the following proposition.

Proposition 2.3 For every n, every ω ∈ Ω has a unique n-partition. In this product representation, a $t_n$-block is always isolated, and between two consecutive $t_n$-blocks there are either one or three $s_n$-blocks.

Proof. By definition, u can be written as a product of blocks of the form $s_0$ and $t_0$. Moreover, by the self-similarity property S(u) = u, we have, for every n ∈ N, an analogous decomposition into blocks of the form $s_n$ and $t_n$. It is easily checked for u that $t_n$-blocks are isolated and that $s_n$-blocks have multiplicity either one or three.
It is then a result of [19] (see [22] for an extension to higher dimensions) that these properties are inherited by the subshift elements ω ∈ Ω and that their canonical decompositions are in fact unique (this follows from aperiodicity of u). Finally, we discuss Gordon-type criteria which establish a link between combinatorial properties of the sequences ω ∈ Ω and non-decay properties of the solutions to (Hω − E)φ = 0. (3)

We do not give proofs here and refer the reader to [7, 13, 23] for proofs, discussions, and applications. Fix some ω ∈ Ω and some E ∈ R. Let φ be a two-sided sequence that solves (3) and obeys the normalization condition

$|\phi(-1)|^2 + |\phi(0)|^2 = 1. \qquad (4)$

Denote $\Phi(n) = (\phi(n), \phi(n-1))^T$. Then we have the following proposition.

Proposition 2.4 (a) If for some m ∈ N, we have $\omega_{-m+j} = \omega_j = \omega_{m+j}$, $0 \le j \le m-1$, then

$\max(\|\Phi(-m)\|, \|\Phi(m)\|, \|\Phi(2m)\|) \ge \frac{1}{2}.$

(b) If for some $m = 2^n \in \mathbb{N}$, we have that $\omega_0 \dots \omega_{2m-1}$ is a cyclic permutation of $s_n s_n$, then

$\max(|x_n(E)| \cdot \|\Phi(m)\|, \|\Phi(2m)\|) \ge \frac{1}{2}.$

Analogous conclusions hold if the assumptions in (a) and (b) are reflected at the origin.

We see that we obtain useful estimates for the solutions φ of (3) if we exhibit appropriate squares and cubes in ω.
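The trace map (2) recalled in this section can be checked directly from the definitions of $M_n$ and $N_n$. The sketch below is an illustration only: the function names, the sample energy E = 3/10 and the sample values f(a) = 1, f(b) = −1 are our assumptions, chosen for convenience and not taken from the paper. Exact rational arithmetic is used so that the two identities in (2) can be tested without rounding error.

```python
from fractions import Fraction


def words(n):
    """Return (s_n, t_n), built from s_0 = a, t_0 = b via recursion (1)."""
    s, t = "a", "b"
    for _ in range(n):
        s, t = s + t, s + s
    return s, t


def transfer_product(word, E, f):
    """Return T(E, u_m) ... T(E, u_1) for word = u_1 ... u_m as nested tuples."""
    m = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    for c in word:
        a = E - f[c]
        # Left-multiply the running product by T(E, c) = [[a, -1], [1, 0]].
        m = ((a * m[0][0] - m[1][0], a * m[0][1] - m[1][1]),
             (m[0][0], m[0][1]))
    return m


def traces(n, E, f):
    """Return (x_n, y_n) = (tr M_n, tr N_n)."""
    s, t = words(n)
    M = transfer_product(s, E, f)
    N = transfer_product(t, E, f)
    return M[0][0] + M[1][1], N[0][0] + N[1][1]


def trace_map_holds(n_max, E, f):
    """Check x_n = x_{n-1} y_{n-1} - 2 and y_n = x_{n-1}^2 - 2 for 1 <= n <= n_max."""
    x_prev, y_prev = traces(0, E, f)
    for n in range(1, n_max + 1):
        x, y = traces(n, E, f)
        if x != x_prev * y_prev - 2 or y != x_prev ** 2 - 2:
            return False
        x_prev, y_prev = x, y
    return True


E = Fraction(3, 10)                          # sample energy (assumption)
f = {"a": Fraction(1), "b": Fraction(-1)}    # sample non-constant f (assumption)
print(trace_map_holds(6, E, f))
```

The check does not require E to lie in the spectrum Σ: the recursion (2) is an algebraic identity for the traces, whereas Proposition 2.2 is a statement about energies in Σ only and is not tested here.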

3 The Proof of Theorem 1

Let us turn to the proof of Theorem 1. Fix ω ∈ Ω, E ∈ Σ, and a solution φ to (3) obeying (4). We want to prove that φ is not square-summable. We shall show that given any k ∈ N, there exists m ∈ Z with |m| ≥ k such that $\|\Phi(m)\| \ge \frac{1}{4}$. From this the assertion clearly follows. So let k ∈ N be fixed and pick n ∈ N such that $2^n \ge k$. Consider the n-partition of ω.

Case 1: The site 0 ∈ Z is contained in an $s_n$-block and this $s_n$-block is followed to the right by an $s_n$-block. Because of Proposition 2.3 there are two subcases.

Case 1.1: The n-partition looks at the origin locally like $t_n s_n \hat{s}_n s_n t_n$. Here and in the following, the hat-symbol marks the block that contains the site 0 ∈ Z. Because of Proposition 2.1 we can conclude by applying Proposition 2.4 (a) with $m = 2^n$.

Case 1.2: We have $t_n \hat{s}_n s_n s_n t_n$. Then either $|x_n(E)| \le 2$, and we are done in this case by Proposition 2.4 (b), or $|x_n(E)| > 2$ and then $|x_{n+1}(E)| \le 2$ by Proposition 2.2. Let us therefore consider the (n + 1)-partition where we must have $s_{n+1} \hat{t}_{n+1} s_{n+1}$. Note that the origin is not the rightmost site in $\hat{t}_{n+1}$.

Case 1.2.1: We have $s_{n+1} s_{n+1} \hat{t}_{n+1} s_{n+1}$. In this case we can conclude immediately by applying Proposition 2.1 and Proposition 2.4 (b) (reflected at the origin) because we have $|x_{n+1}(E)| \le 2$.

Case 1.2.2: We have $t_{n+1} s_{n+1} \hat{t}_{n+1} s_{n+1}$. We pass to the (n + 2)-partition where we must have $s_{n+2} \hat{s}_{n+2}$.

Case 1.2.2.1: We have $s_{n+2} s_{n+2} \hat{s}_{n+2} t_{n+2}$. In this case we can apply Proposition 2.4 (a) (reflected at the origin) with $m = -2^{n+2}$ using Proposition 2.1.

Case 1.2.2.2: We have $t_{n+2} s_{n+2} \hat{s}_{n+2} s_{n+2} t_{n+2}$. Again using Proposition 2.1 we can apply Proposition 2.4 (a) with $m = 2^{n+2}$. This closes Case 1.

Case 2: In the n-partition we have $s_n \hat{s}_n$. This case can be treated analogously to Case 1.2.2. This closes Case 2.

Case 3: In the n-partition we have $t_n \hat{s}_n t_n$. We pass to the (n + 1)-partition where we must have $s_{n+1} \hat{s}_{n+1}$ and we can then proceed analogously to Case 1.2.2. This closes Case 3.

Case 4: In the n-partition we have $\hat{t}_n$. Thus in the (n + 1)-partition we have $\hat{s}_{n+1}$ and we are in one of the Cases 1–3 with all indices increased by one. In particular, we obtain a sufficient solution estimate. This closes Case 4 and hence concludes the proof.

Remark. Let us briefly discuss the result of this paper and the obstacles one encounters when one tries to tackle related questions. We have shown that despite the absence of a uniform trace map bound, which was a crucial tool in the corresponding proof in the Fibonacci (and, more generally, Sturmian) case [10], we can nevertheless prove absence of eigenvalues for all elements in the hull. This raises two questions: Can we carry over some of the other uniform results that were proven in the Fibonacci case [9, 11], all of which relied heavily on the uniform trace map bound as well, and is it feasible that we can prove uniform absence of eigenvalues for non-Sturmian substitution models other than the period doubling substitution? (a) Virtually all further results in [9, 11] require power-law upper bounds on the solutions to the difference equation (3) for energies E in the spectrum.
A proof of this property, uniformly in the energy, seems to be out of reach in the period doubling case since one only has the weak trace map bound

$|x_n(E)| \le c \exp(d\gamma^n) \qquad (5)$

for E ∈ Σ, where γ is any number greater than $\sqrt{2}$ and c, d are constants depending on γ [1]. While this bound is sufficient to prove that the Lyapunov exponent vanishes on the spectrum (see [1]), it is clearly insufficient to prove power-law upper bounds on the solutions. One may try to improve this bound within certain energy ranges in order to study, for example, local α-continuity, but this requires a more detailed understanding of the trace map dynamics in the period doubling case. (b) While [6, 8] established purely singular continuous spectrum for almost all elements in the hull, the three-block version of the Gordon argument used there is not capable of proving uniform results [8]. The proofs of uniform results in [10] and the present paper therefore made essential use of the two-block version of Gordon’s method along with suitable trace map bounds. Since boundedness of trace map orbits for energies from the spectrum is only known for Sturmian models, any uniform result outside the class of Sturmian models has to be considered somewhat surprising.

What came to our rescue in the period doubling case, besides the weak uniform bound given in Proposition 2.2, is essentially Proposition 2.1 which says that for every n, any element in Ω is almost $2^n$-periodic with the only “defects” being the rightmost symbols in the $t_n$-blocks in the n-partition. Such a property is of course a very special feature of the period doubling case (one can certainly construct other examples with this property, such as $a \to a^k b$, $b \to a^k a$ for k ∈ N, but this is not really the point), so that it is currently not obvious how to extend the result of this paper to other non-Sturmian models.

Acknowledgments. The author would like to thank Daniel Lenz for the collaboration leading to [10] which developed a general approach to uniform results for Schrödinger operators with hierarchical potentials. Financial support from the German Academic Exchange Service through Hochschulsonderprogramm III (Postdoktoranden) is gratefully acknowledged.

References

[1] J. Bellissard, A. Bovier, and J.-M. Ghez, Spectral properties of a tight binding Hamiltonian with period doubling potential, Commun. Math. Phys. 135 (1991), 379–399
[2] A. Bovier, private communication
[3] A. Bovier and J.-M. Ghez, Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions, Commun. Math. Phys. 158 (1993), 45–66; Erratum Commun. Math. Phys. 166 (1994), 431–432
[4] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators, Birkhäuser, Boston (1990)
[5] D. Damanik, Singular continuous spectrum for the period doubling Hamiltonian on a set of full measure, Commun. Math. Phys. 196 (1998), 477–483
[6] D. Damanik, Singular continuous spectrum for a class of substitution Hamiltonians, Lett. Math. Phys. 46 (1998), 303–311
[7] D. Damanik, Gordon-type arguments in the spectral theory of one-dimensional quasicrystals, to appear in Directions in Mathematical Quasicrystals, Eds. M. Baake and R. V. Moody, CRM Monograph Series, AMS, Providence (2000)
[8] D. Damanik, Singular continuous spectrum for a class of substitution Hamiltonians II., to appear in Lett. Math. Phys.
[9] D. Damanik, R. Killip, and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity, Commun. Math. Phys. 212 (2000), 191–204
[10] D. Damanik and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, I. Absence of eigenvalues, Commun. Math. Phys. 207 (1999), 687–696
[11] D. Damanik and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, II. The Lyapunov exponent, Lett. Math. Phys. 50 (1999), 245–257
[12] D. Damanik and D. Zare, Palindrome complexity bounds for primitive substitution sequences, Discrete Math. 222 (2000), 259–267
[13] F. Delyon and D. Petritis, Absence of localization in a class of Schrödinger operators with quasiperiodic potential, Commun. Math. Phys. 103 (1986), 441–444
[14] A. Gordon, On the point spectrum of the one-dimensional Schrödinger operator, Usp. Math. Nauk 31 (1976), 257–258
[15] A. Hof, O. Knill, and B. Simon, Singular continuous spectrum for palindromic Schrödinger operators, Commun. Math. Phys. 174 (1995), 149–159
[16] S. Jitomirskaya and B. Simon, Operators with singular continuous spectrum: III. Almost periodic Schrödinger operators, Commun. Math. Phys. 165 (1994), 201–205
[17] S. Kotani, Jacobi matrices with random potentials taking finitely many values, Rev. Math. Phys. 1 (1989), 129–133
[18] Y. Last and B. Simon, Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators, Invent. Math. 135 (1999), 329–367
[19] B. Mossé, Puissance de mots et reconnaissabilité des points fixes d’une substitution, Theoret. Comput. Sci. 99 (1992), 327–334
[20] M. Queffélec, Substitution Dynamical Systems - Spectral Analysis, Lecture Notes in Mathematics, Vol. 1284, Springer, Berlin, Heidelberg, New York (1987)
[21] D. Shechtman, I. Blech, D. Gratias, and J. V. Cahn, Metallic phase with long-range orientational order and no translational symmetry, Phys. Rev. Lett. 53 (1984), 1951–1953
[22] B. Solomyak, Nonperiodicity implies unique composition for self-similar translationally finite tilings, Discrete Comput. Geom. 20 (1998), 265–279
[23] A. Sütő, The spectrum of a quasiperiodic Schrödinger operator, Commun. Math. Phys. 111 (1987), 409–415

David Damanik Department of Mathematics 253–37 California Institute of Technology Pasadena, CA 91125 USA

Fachbereich Mathematik Johann Wolfgang Goethe-Universität D-60054 Frankfurt Germany e-mail: [email protected]

Communicated by Jean Bellissard submitted 25/07/00, accepted 19/09/00



Regularity of Horizons and the Area Theorem
P.T. Chruściel, E. Delay, G.J. Galloway, R. Howard

Abstract. We prove that the area of sections of future event horizons in space–times satisfying the null energy condition is non–decreasing towards the future under the following circumstances: 1) the horizon is future geodesically complete; 2) the horizon is a black hole event horizon in a globally hyperbolic space–time and there exists a conformal completion with a “H–regular” $\mathscr{I}^+$; 3) the horizon is a black hole event horizon in a space–time which has a globally hyperbolic conformal completion. (Some related results under less restrictive hypotheses are also established.) This extends a theorem of Hawking, in which piecewise smoothness of the event horizon seems to have been assumed. We prove smoothness or analyticity of the relevant part of the event horizon when equality in the area inequality is attained; this has applications to the theory of stationary black holes, as well as to the structure of compact Cauchy horizons. In the course of the proof we establish several new results concerning the differentiability properties of horizons.

1 Introduction

The thermodynamics of black holes rests upon Hawking’s area theorem [37] which asserts that in appropriate space–times the area of cross–sections of black hole horizons is non–decreasing towards the future. In the published proofs of this result [37, 64] there is considerable vagueness as to the hypotheses of differentiability of the event horizon (see, however, [35]). Indeed, it is known that black hole horizons can be pretty rough [15], and it is not immediately clear that the area of their cross–sections can even be defined. The reading of the proofs given in [37, 64] suggests that those authors have assumed the horizons under consideration to be piecewise $C^2$. Such a hypothesis is certainly incompatible with the examples constructed in [15] which are nowhere $C^2$. The aim of this paper is to show that the monotonicity theorem holds without any further differentiability hypotheses on the horizon, in appropriate space–times, for a large class of cross–sections of the horizon. More precisely we show the following:

Theorem 1.1 (The area theorem) Let H be a black hole event horizon in a smooth space–time (M, g). Suppose that either

a) (M, g) is globally hyperbolic, and there exists a conformal completion $(\bar M, \bar g) = (M \cup \mathscr{I}^+, \Omega^2 g)$ of (M, g) with a H–regular $\mathscr{I}^+$ (see Section 4.1 for definitions). Further the null energy condition holds on the past $I^-(\mathscr{I}^+; \bar M) \cap M$ of $\mathscr{I}^+$ in M, or
b) the generators of H are future complete and the null energy condition holds on H, or

c) there exists a globally hyperbolic conformal completion $(\bar M, \bar g) = (M \cup \mathscr{I}^+, \Omega^2 g)$ of (M, g), with the null energy condition holding on $I^-(\mathscr{I}^+; \bar M) \cap M$.

Let $\Sigma_a$, a = 1, 2, be two achronal spacelike embedded hypersurfaces of $C^2$ differentiability class, set $S_a = \Sigma_a \cap H$. Then:

1. The area Area($S_a$) of $S_a$ is well defined.

2. If $S_1 \subset J^-(S_2)$, then the area of $S_2$ is larger than or equal to that of $S_1$. (Moreover, this is true even if the area of $S_1$ is counted with multiplicity of generators provided that $S_1 \cap S_2 = \emptyset$, see Theorem 6.1.)

Point 1. of Theorem 1.1 is Proposition 3.3 below, see also Proposition 3.4. Point 2. above follows immediately from Proposition 4.2, Corollary 4.14 and Proposition 4.17, as a special case of the first part of Theorem 6.1 below.² The question of how to define the area of sections of the horizon is discussed in detail in Section 3. It is suggested there that a notion of area, appropriate for the identification of area with the entropy, should include the multiplicity of generators of the horizon. We stress that we are not assuming that $\mathscr{I}^+$ is null (in fact it could even be changing causal type from point to point); in particular Theorem 1.1 also applies when the cosmological constant does not vanish. In point c) global hyperbolicity of $(\bar M, \bar g)$ should be understood as that of a manifold with boundary, cf. Section 4.1. Actually in points a) and c) of Theorem 1.1 we have assumed global hyperbolicity of (M, g) or that of $(\bar M, \bar g)$ for simplicity only: as far as Lorentzian causality hypotheses are concerned, the assumptions of Proposition 4.1 are sufficient to obtain the conclusions of Theorem 1.1. We show in Section 4.1 that those hypotheses will hold under the conditions of Theorem 1.1.
² The condition $S_1 \cap S_2 = \emptyset$ of Theorem 6.1 is needed for monotonicity of “area with multiplicity”, but is not needed if one only wants to compare standard areas, see Remark 6.2.

Alternative sets of causality conditions, which do not require global hyperbolicity of (M, g) or of its conformal completion, are given in Propositions 4.8 and 4.10. It seems useful to compare our results to other related ones existing in the literature [37, 64, 47]. First, the hypotheses of point a) above are fulfilled under the conditions of the area theorem of [64] (after replacing the space–time M considered in [64] by an appropriate subset thereof), and (disregarding questions of differentiability of H) are considerably weaker than the hypotheses of [64]. Consider, next, the original area theorem of [37], which we describe in some detail in Appendix B for the convenience of the reader. We note that we have been unable to obtain a proof of the area theorem without some condition of causal regularity of $\mathscr{I}^+$ (e.g. the one we propose in point a) of Theorem 1.1), and we do not know³ whether or not the area theorem holds under the original conditions of [37] without the modifications indicated above, or in Appendix B; see Appendix B for some comments concerning this point.

Let us make a few comments about the strategy of the proof of Theorem 1.1. It is well known that event horizons are Lipschitz hypersurfaces, and the examples constructed in [15, 8] show that much more cannot be expected. We start by showing that horizons are semi–convex⁴. This, together with Alexandrov’s theorem concerning the regularity of convex functions, shows that they are twice differentiable in an appropriate sense (cf. Proposition 2.1 below) almost everywhere. This allows one to define notions such as the divergence $\theta_{Al}$ of the generators of the horizon H, as well as the divergence $\theta_{Al}^{H \cap S}$ of sections H ∩ S of H. The existence of the second order expansions at Alexandrov points leads further to the proof that, under appropriate conditions, $\theta_{Al}$ or $\theta_{Al}^{H \cap S}$ have the right sign. Next, an approximation result of Whitney type, Proposition 6.6, allows one to embed certain subsets of the horizon into $C^{1,1}$ manifolds. The area theorem then follows from the change–of–variables theorem for Lipschitz maps proved in [23]. We note that some further effort is required to convert the information that $\theta_{Al}$ has the correct sign into an inequality concerning the Jacobian that appears in the change of variables formula. Various authors have considered the problem of defining black holes in settings more general than standard asymptotically flat spacetimes or spacetimes admitting a conformal infinity; see especially [34, 48, 49] and references cited therein.
³ The proof of Proposition 9.2.1 in [37] would imply that the causal regularity condition assumed here holds under the conditions of the area theorem of [37]. However, there are problems with that proof (this has already been noted by Newman [53]).

⁴ Actually this depends upon the time orientation: future horizons are semi–convex, while past horizons are semi–concave.

It is likely that proofs of the area theorem given in more general settings, for horizons assumed to be piecewise $C^2$, which are based on establishing the positivity of the (classically defined) expansion θ of the null generators can be adapted, using the methods of Section 4 (cf. especially Proposition 4.1 and Lemma 4.15), to obtain proofs which do not require the added smoothness. (We show in Appendix C that this is indeed the case for the area theorems of Królak [47, 48, 49].) In all situations which lead to the positivity of θ in the weak Alexandrov sense considered here, the area theorem follows from Theorem 6.1 below. It is of interest to consider the equality case: as discussed in more detail in Section 7, this question is relevant to the classification of stationary black holes, as well as to the understanding of compact Cauchy horizons. Here we prove the following:

Theorem 1.2 Under the hypotheses of Theorem 1.1, suppose that the area of $S_1$ equals that of $S_2$. Then

$(J^+(S_1) \setminus S_1) \cap (J^-(S_2) \setminus S_2)$

is a smooth (analytic if the metric is analytic) null hypersurface with vanishing null second fundamental form. Moreover, if γ is a null generator of H with $\gamma(0) \in S_1$ and $\gamma(1) \in S_2$, then the curvature tensor of (M, g) satisfies $R(X, \gamma'(t))\gamma'(t) = 0$ for all t ∈ [0, 1] and $X \in T_{\gamma(t)}H$.

Theorem 1.2 follows immediately from Corollary 4.14 and Proposition 4.17, as a special case of the second part of Theorem 6.1 below. The key step of the proof here is Theorem 6.18, which is of some interest in its own right. An application of those results to stationary black holes is given in Theorem 7.1, Section 7. As already pointed out, one of the steps of the proof of Theorem 1.1 is to establish that a notion of divergence $\theta_{Al}$ of the generators of the horizon, or of sections of the horizon, can be defined almost everywhere, and that $\theta_{Al}$ so defined is positive. We note that $\theta_{Al}$ coincides with the usual divergence θ for horizons which are twice differentiable. Let us show, by means of an example, that the positivity of θ might fail to hold in space–times (M, g) which do not satisfy the hypotheses of Theorem 1.1: Let t be a standard time coordinate on the three dimensional Minkowski space–time $\mathbb{R}^{1,2}$, and let K ⊂ {t = 0} be an open conditionally compact set with smooth boundary ∂K. Choose K so that the mean curvature H of ∂K has changing sign. Let $M = I^-(K)$, with the metric conformal to the Minkowski metric by a conformal factor which is one in a neighborhood of $\partial D^-(K; \mathbb{R}^{1,2}) \setminus K$, and which makes $\partial J^-(K; \mathbb{R}^{1,2})$ into $\mathscr{I}^+$ in the completion $\bar M \equiv M \cup \partial J^-(K; \mathbb{R}^{1,2})$. We have $M \setminus J^-(\mathscr{I}^+; \bar M) = D^-(K; \mathbb{R}^{1,2}) \neq \emptyset$, thus M contains a black hole region, with the event horizon being the Cauchy horizon $\partial D^-(K; \mathbb{R}^{1,2}) \setminus K$. The generators of the event horizon coincide with the generators of $\partial D^-(K; \mathbb{R}^{1,2})$, which are null geodesics normal to ∂K.
Further for t negative and close to zero the divergence θ of those generators is well defined in a classical sense (since the horizon is smooth there) and approaches, when t tends to zero along the generators, the mean curvature H of ∂K. Since the conformal factor equals one in a neighborhood of the horizon the null energy condition holds there, and θ is negative near those points of ∂K where H is negative. This implies also the failure of the area theorem for some (local) sections of the horizon. We note that condition b) of Theorem 1.1 is not satisfied because the generators of H are not future geodesically complete. On the other hand condition a) does not hold because g will not satisfy the null energy condition throughout $I^-(\mathscr{I}^+; \bar M)$ whatever the choice of the conformal factor⁵.

⁵ This follows from Theorem 1.1.

This paper is organized as follows: In Section 2 we show that future horizons, as defined there, are always semi–convex (Theorem 2.2). This allows us to define such notions as the Alexandrov divergence $\theta_{Al}$ of the generators of the horizon, and their Alexandrov null second fundamental form. In Section 3 we consider sections of horizons and their geometry; in particular we show that sections of horizons have a well defined area. We also discuss the ambiguities which arise when defining the area of those sections when the horizon is not globally smooth. In fact, those ambiguities have nothing to do with “very low” differentiability of horizons and arise already for piecewise smooth horizons. In Section 4 we prove positivity of the Alexandrov divergence of generators of horizons: in Section 4.1 this is done under the hypothesis of existence of a conformal completion satisfying a regularity condition, together with some global causality assumptions on the space–time; in Section 4.2 positivity of $\theta_{Al}$ is established under the hypothesis that the generators of the horizon are future complete. In Section 5 we show that Alexandrov points “propagate to the past” along the generators of the horizon. This allows one to show that the optical equation holds on “almost all” generators of the horizon. We also present there a theorem (Theorem 5.6) which shows that “almost all generators are Alexandrov”; while this theorem belongs naturally to Section 5, its proof uses methods which are developed in Section 6 only, therefore it is deferred to Appendix D. In Section 6 we prove our main result, the monotonicity theorem, Theorem 6.1. This is done under the assumption that $\theta_{Al}$ is non–negative. One of the key elements of the proof is a new (to us) extension result of Whitney type (Proposition 6.6), the hypotheses of which are rather different from the usual ones; in particular it seems to be much easier to work with in some situations. Section 7 discusses the relevance of the rigidity part of Theorem 6.1 to the theory of black holes and to the differentiability of compact Cauchy horizons. Appendix A reviews the geometry of $C^2$ horizons; we also prove there a new result concerning the relationship between the (classical) differentiability of a horizon and the (classical) differentiability of sections thereof, Proposition A.3. In Appendix B some comments on the area theorem of [37] are made.

2 Horizons

Let (M, g) be a smooth spacetime, that is, a smooth paracompact Hausdorff time-oriented Lorentzian manifold, of dimension n + 1 ≥ 3, with a smooth Lorentzian metric g. Throughout this paper hypersurfaces are assumed to be embedded. A hypersurface H ⊂ M will be said to be future null geodesically ruled if every point p ∈ H belongs to a future inextensible null geodesic Γ ⊂ H; those geodesics will be called the generators of H. We emphasize that the generators are allowed to have past endpoints on H, but no future endpoints. Past null geodesically ruled hypersurfaces are defined by changing the time orientation. We shall say that H is a future (past) horizon if H is an achronal, closed, future (past) null geodesically ruled topological hypersurface. A hypersurface H will be called a horizon if H is a future or a past horizon. Our terminology has been tailored to the black hole setting, so that a future black hole event horizon $\partial J^-(\mathscr{I}^+)$ is a future horizon in the sense just described [37, p. 312]. The terminology is somewhat awkward in a Cauchy horizon setting, in which a past Cauchy horizon $D^-(\Sigma)$ of an achronal edgeless set Σ is a future horizon in our terminology [56, Theorem 5.12].

Let dim M = n + 1 and suppose that O is a domain in $\mathbb{R}^n$. Recall that a continuous function f : O → R is called semi–convex iff each point p has a convex neighborhood U in O so that there exists a $C^2$ function φ: U → R such that f + φ is convex in U. We shall say that the graph of f is a semi–convex hypersurface if f is semi–convex. A hypersurface H in a manifold M will be said to be semi–convex if H can be covered by coordinate patches $U_\alpha$ such that $H \cap U_\alpha$ is a semi–convex graph for each α. The interest of this notion stems from the specific differentiability properties of such hypersurfaces:

Proposition 2.1 (Alexandrov [26, Appendix E]) Let B be an open subset of $\mathbb{R}^p$ and let f : B → R be semi–convex. Then there exists a set $B_{Al} \subset B$ such that:

1. the p dimensional Lebesgue measure $\mathcal{L}^p(B \setminus B_{Al})$ of $B \setminus B_{Al}$ vanishes.

2. f is differentiable at all points $x \in B_{Al}$, i.e., for all $x \in B_{Al}$ there exists $x^* \in (\mathbb{R}^p)^*$ such that for all y ∈ B

$f(y) - f(x) = x^*(y - x) + r_1(x, y), \qquad (2.1)$

with $r_1(x, y) = o(|x - y|)$. The linear map $x^*$ above will be denoted by df(x).

3. f is twice–differentiable at all points $x \in B_{Al}$ in the sense that for all $x \in B_{Al}$ there exists $Q \in (\mathbb{R}^p)^* \otimes (\mathbb{R}^p)^*$ such that for all y ∈ B

$f(y) - f(x) - df(x)(y - x) = Q(x - y, x - y) + r_2(x, y), \qquad (2.2)$

with $r_2(x, y) = o(|x - y|^2)$. The symmetric quadratic form Q above will be denoted by $\frac{1}{2} D^2 f(x)$, and will be called the second Alexandrov derivative of f at x.⁶

The points q at which Equations (2.1)–(2.2) hold will be called Alexandrov points of f, while the points (q, f(q)) will be called Alexandrov points of the graph of f (it will be shown in Proposition 2.5 below that if q is an Alexandrov point of f, then (q, f(q)) will project to an Alexandrov point of any graphing function of the graph of f, so this terminology is meaningful). We shall say that H is locally achronal if for every point p ∈ H there exists an open neighborhood O of p such that H ∩ O is achronal in O. We have the following:

Theorem 2.2 Let $H \neq \emptyset$ be a locally achronal future null geodesically ruled hypersurface. Then H is semi–convex.

Remark 2.3 An alternative proof of Theorem 2.2 can be given using a variational characterization of horizons (compare [4, 33, 57]).

⁶ Caffarelli and Cabré [9] use the term “second punctual differentiability of f at x” for (2.2); Fleming and Soner [26, Definition 5.3, p. 234] use the name “point of twice–differentiability” for points at which (2.2) holds.
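As a simple illustration of Proposition 2.1, consider the following one-dimensional example. It is our own and is not taken from the references; the particular function f and the auxiliary function φ are chosen only because every quantity in (2.1)–(2.2) can be written down explicitly.

```latex
% Worked example (an illustration added here, not part of the original text).
\[
  f(x) = |x| - x^2 , \qquad \varphi(x) = x^2 , \qquad
  (f + \varphi)(x) = |x| \ \text{ is convex},
\]
so $f$ is semi--convex on $B = \mathbb{R}$. For $x \neq 0$ and $y$ of the same sign as $x$,
\[
  df(x) = \operatorname{sgn}(x) - 2x , \qquad
  f(y) - f(x) - df(x)(y - x) = -(x - y)^2 ,
\]
so (2.2) holds with $Q(v, v) = -v^2$ and $r_2(x, y) = 0$ for $y$ near $x$,
that is, $D^2 f(x) = -2$.
At $x = 0$ the function is not differentiable, so one may take
$B_{Al} = \mathbb{R} \setminus \{0\}$, and indeed
$\mathcal{L}^1(B \setminus B_{Al}) = 0$.
```

The example also shows that the second Alexandrov derivative may be negative: semi–convexity only bounds it from below, here by the Hessian of −φ.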


Remark 2.4 Recall that for a real valued function f : B → R on a set B the epigraph of f is {(x, y) ∈ B × R : y ≥ f (x)}. We note that while the notion of convexity of a function and its epigraph is coordinate dependent, that of semi–convexity is not; indeed, the proof given below is based on the equivalence of semi–convexity and of existence of lower support hypersurfaces with locally uniform one side Hessian bounds. The latter is clearly independent of the coordinate systems used to represent f , or its graph, as long as the relevant orientation is preserved (changing y^{n+1} to −y^{n+1} transforms a lower support hypersurface into an upper one). Further, if the future of H is represented as an epigraph in two different ways,

J+(H) = {x^{n+1} ≥ f (x¹, . . . , x^n)} = {y^{n+1} ≥ g(y¹, . . . , y^n)} , (2.3)

then semi–convexity of f is equivalent to that of g. This follows immediately from the considerations below. Thus, the notion of a semi–convex hypersurface is not tied to a particular choice of coordinate systems and of graphing functions used to represent it. Proof. Let O be as in the definition of local achronality; passing to a subset of O we can without loss of generality assume that O is globally hyperbolic. Replacing the space–time (M, g) by O with the induced metric we can without loss of generality assume that (M, g) is globally hyperbolic. Let t be a time function on O which induces a diffeomorphism of O with R × Σ in the standard way [32, 59], with the level sets Στ ≡ {p | t(p) = τ } of t being Cauchy surfaces. As usual we identify Σ0 with Σ, and in the identification above the curves R × {q}, q ∈ Σ, are integral curves of ∇t. Define ΣH = {q ∈ Σ | R × {q} intersects H} .

(2.4)

For q ∈ ΣH the set (R × {q}) ∩ H is a point by achronality of H; we shall denote this point by (f (q), q). Thus an achronal hypersurface H in a globally hyperbolic space–time is a graph over ΣH of a function f . The map which to a point p ∈ H assigns q ∈ Σ such that p = (f (q), q) is injective, so that the hypothesis that H is a topological hypersurface together with the invariance of the domain theorem (cf., e.g., [18, Prop. 7.4, p. 79]) imply that ΣH is open. We wish to use [1, Lemma 3.2]7 to obtain semi–convexity of the (local) graphing function f ; this requires a construction of lower support hypersurfaces at p. Let Γ be a generator of H passing through p and let p+ ∈ Γ ∩ J+(p). By achronality of Γ there are no points on J−(p+) ∩ J+(p) which are conjugate to p+ , cf. [37, Prop. 4.5.12, p. 115] or [5, Theorem 10.72, p. 391]. It follows that the intersection of a sufficiently small neighborhood of p with the past light cone ∂J−(p+) of p+ is a smooth hypersurface contained in J−(H). This provides the appropriate lower support hypersurface at p.
7 The result in that Lemma actually follows from [2, Lemma 2.15] together with [20, Prop. 5.4, p. 24].
In particular, for suitably chosen points p+ , these support hypersurfaces have




null second fundamental forms (see Appendix A) which are locally (in the point p) uniformly bounded from below. This in turn implies that the Hessians of the associated graphing functions are locally bounded from below, as is needed to apply Lemma 3.2 in [1]. ✷
Theorem 2.2 allows us to apply Proposition 2.1 to f to infer twice Alexandrov differentiability of H almost everywhere, in the sense made precise in Proposition 2.1. Let us denote by L^n_{h0} the n dimensional Riemannian measure8 on Σ ≡ Σ0 , where h0 is the metric induced on Σ0 by g; then there exists a set ΣHAl ⊂ ΣH on which f is twice Alexandrov differentiable, and such that L^n_{h0}(ΣH \ ΣHAl ) = 0. Set

HAl ≡ graph of f over ΣHAl . (2.5)

Point 2 of Proposition 2.1 shows that HAl has a tangent space at every point p ∈ HAl . For those points define

k(p) = kµ(p) dx^µ = −dt + df (q) , p = (f (q), q) , q ∈ ΣHAl . (2.6)

A theorem of Beem and Królak shows that H is differentiable precisely at those points p which belong to exactly one generator Γ [6, 14], with Tp H ⊂ Tp M being the null hyper-plane containing Γ˙ p . It follows that K ≡ g µν kµ ∂ν

(2.7)

is null, future pointing, and tangent to the generators of H, wherever defined. We define the generators of HAl as the intersections of the generators of H with HAl . Point 3 of Proposition 2.1 allows us to define the divergence θAl of those generators, as follows: Let ei , i = 1, . . . , n be a basis of Tp H such that

g(e1 , e1 ) = · · · = g(en−1 , en−1 ) = 1 , g(ei , ej ) = 0 , i ≠ j , en = K . (2.8)

We further choose the ea ’s, a = 1, . . . , n − 1, to be orthogonal to ∇t. It follows that the vectors ea = e_a^µ ∂µ have no ∂/∂t components in the coordinate system used in the proof of Theorem 2.2, thus e_a^0 = 0. Using this coordinate system for p ∈ HAl we set

∇i kj = D²_{ij} f − Γ^µ_{ij} kµ , i, j = 1, . . . , n , (2.9)

θAl = (e_1^i e_1^j + · · · + e_{n−1}^i e_{n−1}^j) ∇i kj . (2.10)
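To see the formulas (2.9)–(2.10) at work, the following sketch evaluates them on the future light cone of the origin in (n + 1) dimensional Minkowski space–time. This is an example chosen here for illustration, not one from the paper; it assumes the flat metric −dt² + δ in standard coordinates, so that all Christoffel symbols vanish and f is smooth away from the tip.

```latex
% Illustrative example (hypothetical, not from the paper); requires amsmath.
% H = \{ t = f(x) \}, f(x) = |x| = r, x \in \mathbb{R}^n \setminus \{0\},
% in Minkowski space-time with metric -dt^2 + \delta_{ij} dx^i dx^j.
\begin{align*}
  k &= -dt + df = -dt + \frac{x_i}{r}\,dx^i ,
  & K &= g^{\mu\nu}k_\mu\partial_\nu = \partial_t + \frac{x^i}{r}\,\partial_i ,\\
  \nabla_i k_j &= D^2_{ij} f - \Gamma^\mu_{ij}k_\mu
               = \frac{1}{r}\Big(\delta_{ij} - \frac{x_i x_j}{r^2}\Big) ,
  & g(K,K) &= -1 + \frac{x^i x_i}{r^2} = 0 .
\end{align*}
% The e_a, a = 1,\dots,n-1, are unit, orthogonal to \nabla t and tangent to H,
% hence tangent to the sphere |x| = r:  e_a^i x_i = 0.  Therefore
\begin{equation*}
  \theta_{\mathrm{Al}} = \sum_{a=1}^{n-1} e_a^i e_a^j \nabla_i k_j
                       = \sum_{a=1}^{n-1} \frac{1}{r} = \frac{n-1}{r} > 0 .
\end{equation*}
```

The positive sign is consistent with the convention that expansion towards the future is counted as positive.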

8 In local coordinates, dL^n_{h0} = √(det h0) d^n x.

It is sometimes convenient to set θAl = 0 on H \ HAl . The function θAl so defined on H will be called the divergence (towards the future) of both the generators of HAl and of H. It coincides with the usual divergence of the generators of H on every set U ⊂ H on which the horizon is of C² differentiability class: Indeed, in a space–time neighborhood of U we can locally extend the ei ’s to C¹ vector fields, still denoted ei , satisfying (2.8). It is then easily checked that the set of Alexandrov points of U is U, and that the divergence θ of the generators as defined in [37, 64] or in Appendix A coincides on U with θAl as defined by (2.10). The null second fundamental form BAl of H, or of HAl , is defined as follows: in the basis above, if X = X^a ea , Y = Y^b eb (the sums are from 1 to n − 1), then at Alexandrov points we set

BAl (X, Y ) = X^a Y^b e_a^i e_b^j ∇i kj , (2.11)

with ∇i kj defined by (2.9). This coincides with the usual definition of B as discussed e.g. in Appendix A on any subset of H which is C 2 . In this definition of the null second fundamental form B with respect to the null direction K, B measures expansion as positive and contraction as negative. The definitions of θAl and BAl given in Equations (2.10)–(2.11) have, so far, been only given for horizons which can be globally covered by an appropriate coordinate system. As a first step towards a globalization of those notions one needs to find out how those objects change when another representation is chosen. We have the following: 9

Proposition 2.5 1. Let f and g be two locally Lipschitz graphing functions representing H in two coordinate systems {x^i}, i = 1, . . . , n + 1, and {y^i}, i = 1, . . . , n + 1, related to each other by a C² diffeomorphism φ. If (x_0^1, . . . , x_0^n) is an Alexandrov point of f and (y_0^1, . . . , y_0^n, g(y_0^1, . . . , y_0^n)) = φ(x_0^1, . . . , x_0^n, f (x_0^1, . . . , x_0^n)) , then (y_0^1, . . . , y_0^n) is an Alexandrov point of g. 2. The null second fundamental form BAl of Equation (2.11) is invariantly defined modulo a point dependent multiplicative factor. In particular the sign of θAl defined in Equation (2.10) does not depend upon the choice of the graphing function used in (2.10).
Remark 2.6 Recall that the standard divergence θ of generators is defined up to a multiplicative function (constant on the generators) only, so that (essentially) the only geometric invariant associated to θ is its sign.
Remark 2.7 We emphasize that we do not assume that ∂/∂x^{n+1} and/or ∂/∂y^{n+1} are timelike.
Proof. 1. Let y⃗ = (y¹, . . . , y^n), x⃗0 = (x_0^1, . . . , x_0^n), etc., set f0 = f (x⃗0), g0 = g(y⃗0). Without loss of generality we may assume (x⃗0 , f (x⃗0)) = (y⃗0 , g(y⃗0)) = 0. To establish (2.2) for g, write φ as (φ⃗, φ^{n+1}), and let ψ⃗(x⃗) := φ⃗(x⃗, f (x⃗)) .
9 More precisely, when H is C² equation (2.11) defines, in local coordinates, a tensor field which reproduces b defined in Appendix A when passing to the quotient H/K.




As (id, f ) and (id, g), where id is the identity map, are bijections between neighborhoods of zero and open subsets of the graph, invariance of domain shows that ψ⃗ is a homeomorphism from a neighborhood of zero to its image. Further, ψ⃗ admits a second order Alexandrov expansion at the origin:

ψ⃗(x⃗) = L1 x⃗ + α1(x⃗, x⃗) + o(|x⃗|²) , (2.12)

where L1 = D_{x⃗} φ⃗(0) + D_{x^{n+1}} φ⃗(0) Df (0) . Let us show that L1 is invertible: suppose, for contradiction, that this is not the case, let x⃗ ≠ 0 be an element of the kernel ker L1 , and for |s| 0, empty if τ < 0). In particular all points of S0 = {t = x = 0} are Alexandrov points thereof, while none of them is an Alexandrov point of H. In this example the horizon H is not differentiable at any of the points of S0 , which is a necessary condition for being an Alexandrov point of H. It would be of some interest to find out whether or not Alexandrov points of sections of H which are also points of differentiability of H are necessarily Alexandrov points of H. Following [29], we shall say that a differentiable embedded hypersurface S meets H properly transversally if for each point p ∈ S ∩ H for which Tp H exists the tangent hyperplane Tp S is transverse to Tp H. If S is spacelike and intersects H, proper transversality will always hold; on the other hand if S is timelike this might, but does not have to, be the case. If H and S are C¹ and S is either spacelike or timelike intersecting H transversely then H ∩ S is a C¹ spacelike submanifold. Therefore when S is timelike there is a spacelike S1 so that H ∩ S = H ∩ S1 . It would be interesting to know if this was true (even locally) for the transverse intersection of general horizons H with timelike C² hypersurfaces S.


Let S be any spacelike or timelike C² hypersurface in M meeting H properly transversally.11 Suppose, first, that S is covered by a single coordinate patch such that S ∩ H is a graph x^n = g(x¹, . . . , x^{n−1}) of a semi–convex function g. Let n be the field of unit normals to S; at each point (x¹, . . . , x^{n−1}) at which g is differentiable and for which Tp H exists, where p ≡ (x¹, . . . , x^{n−1}, g(x¹, . . . , x^{n−1})) , there is a unique number a(p) ∈ R and a unique future pointing null vector K ∈ Tp H such that

⟨K − an, ·⟩ = −dx^n + dg . (3.2)

Here we have assumed that the coordinate x^n is chosen so that J−(H) ∩ S lies under the graph of g. We set kµ dx^µ = ⟨K, ·⟩ ∈ Tp∗ M . Assume moreover that (x¹, . . . , x^{n−1}) is an Alexandrov point of g; we thus have the Alexandrov second derivatives D²g at our disposal. Consider ei , a basis of Tp H as in Equation (2.8), satisfying further ei ∈ Tp S ∩ Tp H, i = 1, . . . , n − 1. In a manner completely analogous to (2.9)–(2.10) we set

∇i kj = D²_{ij} g − Γ^µ_{ij} kµ , i, j = 1, . . . , n − 1 , (3.3)

θ^{S∩H}_Al = (e_1^i e_1^j + · · · + e_{n−1}^i e_{n−1}^j) ∇i kj . (3.4)

(Here, as in (2.9), the Γ^µ_{ij} ’s are the Christoffel symbols of the space–time metric g.) For X, Y ∈ Tp S ∩ Tp H we set, analogously to (2.11),

B^{S∩H}_Al (X, Y ) = X^a Y^b e_a^i e_b^j ∇i kj . (3.5)

Similarly to the definitions of θAl and BAl , the vector K with respect to which θ^{S∩H}_Al and B^{S∩H}_Al have been defined has been tied to the particular choice of coordinates used to represent S ∩ H as a graph. In order to globalize this definition it might be convenient to regard B^{S∩H}_Al (p) as an equivalence class of tensors defined up to a positive multiplicative factor. Then θ^{S∩H}_Al (p) can be thought of as the assignment to a point p of the number 0, ±1, according to the sign of θ^{S∩H}_Al (p). This, together with Proposition 2.5, can then be used to define B^{S∩H}_Al and θ^{S∩H}_Al for S which are not globally covered by a single coordinate patch.
11 Our definitions below make use of a unit normal to S, whence the restriction to spacelike or timelike, properly transverse S’s. Clearly one should be able to give a definition of θ^{S∩H}_Al , etc., for any hypersurface S intersecting H properly transversally. Now for a smooth H and for smooth properly transverse S’s the intersection S ∩ H will be a smooth spacelike submanifold of M, and it is easy to construct a spacelike S′ so that S′ ∩ H = S ∩ H. Thus, in the smooth case, no loss of generality is involved by restricting the S’s to be spacelike. For this reason, and because the current setup is sufficient for our purposes anyway, we do not address the complications which arise when S is allowed to be null, or to change type.




As already pointed out, if p ∈ S ∩ H is an Alexandrov point of H then p will also be an Alexandrov point of S ∩ H. In such a case the equivalence class of BAl , defined at p by Equation (2.11), will coincide with that defined by (3.5), when (2.11) is restricted to vectors X, Y ∈ Tp S ∩ Tp H. Similarly the sign of θAl (p) will coincide with that of θ^{S∩H}_Al (p), and θAl (p) will vanish if and only if θ^{S∩H}_Al (p) does. Let us turn our attention to the question of how to define the area of sections of horizons. The monotonicity theorem we prove in Section 6 uses the Hausdorff measure, so let us start by pointing out the following:
Proposition 3.3 Let H be a horizon and let S be an embedded hypersurface in M . Then S ∩ H is a Borel set, in particular it is ν–Hausdorff measurable for any ν ∈ R+ .
Proof. Let σ be any complete Riemannian metric on M ; we can cover S by a countable collection of sets Oi ⊂ S of the form Oi = Bσ (pi , ri ) ∩ S, where the Bσ (pi , ri ) are open balls of σ–radius ri centered at pi with compact closure. We have Oi = Bσ (pi , ri ) ∩ S \ ∂Bσ (pi , ri ), which shows that the Oi ’s are Borel sets. Since S ∩ H = ∪i (Oi ∩ H), the Borel character of S ∩ H ensues. The Hausdorff measurability of S ∩ H follows now from the fact that Borel sets are Hausdorff measurable12 . ✷
Proposition 3.3 is sufficient to guarantee that a notion of area of sections of horizons, namely their (n − 1)–dimensional Hausdorff area, is well defined. Since the Hausdorff area is not something very handy to work with in practice, it is convenient to obtain more information about regularity of those sections. Before we do this let us briefly discuss how area can be defined, depending upon the regularity of the set under consideration.
12 Let (X, d) be a metric space. Then an outer measure µ defined on the class of all subsets of X is a metric outer measure iff µ(A ∪ B) = µ(A) + µ(B) whenever the distance (i.e. inf a∈A,b∈B d(a, b)) is positive. For a metric outer measure, µ, the Borel sets are all µ-measurable (cf. [36, p. 48 Prob. 8] or [39, p. 188 Exercise 1.48]). The definition of the Hausdorff outer measures implies they are metric outer measures (cf. [39, p. 188 Exercise 1.49]). See also [61, p. 7] or [19, p. 147].

When S is a piecewise C¹, (n − 1) dimensional, paracompact, orientable submanifold of M this can be done by first defining d^{n−1}µ = e1 ∧ · · · ∧ en−1 , where the ea ’s form an oriented orthonormal basis of the cotangent space T∗S; obviously d^{n−1}µ does not depend upon the choice of the ea ’s. Then one sets

Area(S) = ∫_S d^{n−1}µ . (3.6)

Suppose, next, that S is the image by a Lipschitz map ψ of a C¹ manifold N . Let hS denote any complete Riemannian metric on N ; by [61, Theorem 5.3] for every ε > 0 there exists a C¹ map ψε from N to M such that

L^{n−1}_{hS}({ψ ≠ ψε}) ≤ ε . (3.7)

Here and throughout L^{n−1}_{hS} denotes the (n − 1) dimensional Riemannian measure associated with a metric hS . One then sets Sε = ψε(N ) and

Area(S) = lim_{ε→0} Area(Sε) . (3.8)

(It is straightforward to check that Area(S) so defined is independent of the choice of the sequence ψε . In particular if ψ is C¹ on N one recovers the definition (3.6) using ψε = ψ for all ε.) It turns out that for general sections of horizons some more work is needed. Throughout this paper σ will be an auxiliary Riemannian metric such that (M, σ) is a complete Riemannian manifold. Let S be a Lipschitz (n − 1) dimensional submanifold of M such that S ⊂ H; recall that, by Rademacher’s theorem, S is differentiable H^{n−1}_σ almost everywhere. Let us denote by H^s_σ the s dimensional Hausdorff measure [24] defined using the distance function of σ. Recall that S is called n rectifiable [24] iff S is the image of a bounded subset of R^n under a Lipschitz map. A set is countably n rectifiable iff it is a countable union of n rectifiable sets. (Cf. [24, p. 251].) (Instructive examples of countably rectifiable sets can be found in [52].) We have the following result [42] (the proof of the first part of Proposition 3.4 is given in Remark 6.13 below):
Proposition 3.4 Let S be as in Proposition 3.1, then S is countably (n − 1) rectifiable. If S is compact then it is (n − 1) rectifiable. ✷
Consider, then, a family of sets Vqi which are Lipschitz images and form a partition of S up to a set of H^{n−1}_h Hausdorff measure zero, where h is the metric on Σ induced by g. One sets

Area(S) = Σ_i Area(Vqi ) . (3.9)

We note that Area(S) so defined again does not depend on the choices made, and reduces to the previous definitions whenever applicable. Further, Area(S) so defined is13 precisely the (n − 1) dimensional Hausdorff measure H^{n−1}_h of S:

Area(S) = ∫_S dH^{n−1}_h (p) = H^{n−1}_h (S) . (3.10)

13 The equality (3.10) can be established using the area formula. In the case of C¹ submanifolds of Euclidean space this is done explicitly in [61, p. 48]. This can be extended to countably n rectifiable sets by use of a more general version of the area formula [61, p. 69] (for subsets of Euclidean space) or [23, Theorem 3.1] (for subsets of Riemannian manifolds).




As we will see, there is still another quantity which appears naturally in the area theorem, Theorem 6.1 below: the area counting multiplicities. Recall that the multiplicity N (p) of a point p belonging to a horizon H is defined as the number (possibly infinite) of generators passing through p. Similarly, given a subset S of H, for p ∈ H we set

N (p, S) = the number of generators of H passing through p which meet S when followed to the (causal) future . (3.11)

Note that N (p, S) = N (p) for p ∈ S. Whenever N (p, S2 ) is H^{n−1}_{h1} measurable on the intersection S1 of H with a spacelike hypersurface Σ1 we set

AreaS2 (S1 ) = ∫_{S1} N (p, S2 ) dH^{n−1}_{h1} (p) , (3.12)

where h1 is the metric induced by g on Σ1 . Note that N (p, S2 ) = 0 at points of S1 which have the property that the generators through them do not meet S2 ; thus the area AreaS2 (S1 ) only takes into account those generators that are seen from S2 . If S1 ⊂ J − (S2 ), as will be the case e.g. if S2 is obtained by intersecting H with a Cauchy surface Σ2 lying to the future of S1 , then N (p, S2 ) ≥ 1 for all p ∈ S1 (actually in that case we will have N (p, S2 ) = N (p)), so that AreaS2 (S1 ) ≥ Area(S1 ) . Let us show, by means of an example, that the inequality can be strict in some cases. Consider a black hole in a three dimensional space–time, suppose that its section by a spacelike hypersurface t = 0 looks as shown in Figure 1.

Figure 1: A section of the event horizon in a 2 + 1 dimensional space–time with “two black holes merging”.

As we are in three dimensions area should be replaced by length. (A four dimensional analogue of Figure 1 can be obtained by rotating the curve from Figure 1 around a vertical axis.) When a slicing by spacelike hypersurfaces is appropriately chosen, the behavior depicted can occur when two black holes merge together14 .
14 The four dimensional analogue of Figure 1 obtained by rotating the curve from Figure 1 around a vertical axis can occur in a time slicing of a black–hole space–time in which a “tem-
When measuring the length of the curve in Figure 1 one faces various options:


a) discard the middle piece altogether, as it has no interior; b) count it once; c) count it twice — once from each side. The purely differential geometric approach to area, as given by Equation (3.6) does not say which choice should be made. The Hausdorff area approach, Equation (3.10), counts the middle piece once. The prescription (3.12) counts it twice. We wish to argue that the most reasonable prescription, from an entropic point of view, is to use the prescription (3.12). In order to do that, let (M, g) be the three dimensional Minkowski space–time and consider a thin long straw R lying on the y axis in the hypersurface t = 0: R = {t = x = 0, y ∈ [−10, 10]} .

(3.13)

Set K = J+(R) ∩ {t = 1}; then K is a convex compact set which consists of a strip of width 2 lying parallel to the straw with two half–disks of radius 1 added at the ends. Let M̂ = M \ K, equipped with the Minkowski metric, still denoted by g. Then M̂ has a black hole region B which is the past domain of dependence of K in M (with K removed). The sections Hτ of the event horizon H defined as H ∩ {t = τ } are empty for τ < 0 and τ ≥ 1. Next, H0 consists precisely of the straw. Finally, for t ∈ (0, 1) Ht is the boundary of the union of a strip of width 2t lying parallel to the straw with two half–disks of radius t added at the ends of the strip. Thus 0 , t 0 a δε > 0 so that |x| < δε implies |f (x)| ≤ ε|x|² . To do this choose any r0 > 0 so that B(0, r0 ) ⊂ U . Then choose an ℓ large enough that

−ε|x|² ≤ D²f^−_ℓ (0)(x, x) , D²f^+_ℓ (0)(x, x) ≤ ε|x|² ,

for all x ∈ B(0, r0 ). Now choose r1 ≤ r0 so that B(0, r1 ) ⊂ U . For this ℓ we can use the Taylor expansions of f^+_ℓ and f^−_ℓ at 0 to find a δε > 0 with 0 < δε ≤ r1 so that

|f^±_ℓ (x) − ½ D²f^±_ℓ (0)(x, x)| ≤ ½ ε|x|²

for all x with |x| < δε . Then f ≤ f^+_ℓ implies that if |x| < δε then

f (x) ≤ ½ D²f^+_ℓ (0)(x, x) + (f^+_ℓ (x) − ½ D²f^+_ℓ (0)(x, x)) ≤ (ε/2)|x|² + (ε/2)|x|² = ε|x|² ,

with a similar calculation, using f^−_ℓ ≤ f , yielding −ε|x|² ≤ f (x). This completes the proof. ✷
Proof of Theorem 5.1. This proof uses, essentially, the same geometric facts about horizons that are used in the proof of Proposition A.1. Let p0 ∈ Γint be an Alexandrov point of a section S ∩ H, where S is a spacelike or timelike C² hypersurface that intersects H properly transversally. By restricting to a suitable neighborhood of p0 we can assume without loss of generality that M is globally hyperbolic, that Γ maximizes the distance between any two of its points, and that H is the boundary of J+(H) and J−(H) (that is, H divides M into two open sets, its future I+(H) and its past I−(H)). We will simplify notation a bit and assume that S is spacelike; only trivial changes are required in the case S is timelike. We can find C² coordinates x¹, . . . , x^n on S centered at p0 so that in these coordinates S ∩ H is given by a graph x^n = h(x¹, . . . , x^{n−1}). By restricting the size of S we can assume that these coordinates are defined on all of S and that their image is B^{n−1}(r) × (−δ, δ) for some r, δ > 0 and that h: B^{n−1}(r) → (−δ, δ). As p0 is an Alexandrov point of S ∩ H and h(0) = 0


the function h has a second order expansion

h(x⃗) = dh(0)x⃗ + ½ D²h(0)(x⃗, x⃗) + o(|x⃗|²) ,

where x⃗ = (x¹, . . . , x^{n−1}). By possibly changing the sign of x^n we may assume that

{(x⃗, y) ∈ S | y ≥ h(x⃗)} ⊂ J+(H) , {(x⃗, y) ∈ S | y ≤ h(x⃗)} ⊂ J−(H) .

For ε ≥ 0 define

h^±_ε (x⃗) = dh(0)x⃗ + ½ D²h(0)(x⃗, x⃗) ± (ε/2)|x⃗|² .

Then h^±_ε is a C² function on B^{n−1}(r), and for ε = 0 the function h0 = h^+_0 = h^−_0 is just the second order expansion of h at x⃗ = 0. For each ε > 0 the second order Taylor expansion for h at 0 implies that there is an open neighborhood Vε of 0 in B^{n−1}(r) so that

h^−_ε ≤ h ≤ h^+_ε on Vε .

Set

N := {(x⃗, h0 (x⃗)) | x⃗ ∈ B^{n−1}(r)} , N^±_ε := {(x⃗, h^±_ε (x⃗)) | x⃗ ∈ Vε } .
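The existence of the neighborhood Vε on which h is sandwiched between the two quadratic barriers is a direct consequence of the little–o remainder. The short derivation below spells this out; the radius ρε is a symbol introduced here for illustration and does not appear in the paper.

```latex
% Sketch of the sandwich step; \rho_\varepsilon is a hypothetical name. Requires amsmath.
% Write h_0 for the second order expansion of h at 0, so that h = h_0 + o(|\vec x|^2).
\begin{align*}
  &\text{By definition of } o(|\vec x|^2):\quad
   \forall\, \varepsilon > 0 \;\; \exists\, \rho_\varepsilon > 0 \;\text{ such that }\;
   |h(\vec x) - h_0(\vec x)| \le \tfrac{\varepsilon}{2}\,|\vec x|^2
   \;\text{ for } |\vec x| < \rho_\varepsilon ,\\
  &\text{hence}\quad
   h^-_\varepsilon(\vec x) = h_0(\vec x) - \tfrac{\varepsilon}{2}|\vec x|^2
   \;\le\; h(\vec x) \;\le\;
   h_0(\vec x) + \tfrac{\varepsilon}{2}|\vec x|^2 = h^+_\varepsilon(\vec x)
   \quad\text{on } V_\varepsilon := B(0,\rho_\varepsilon) \cap B^{n-1}(r) .
\end{align*}
```

All three functions agree to first order at the origin, so the graphs of h and of the barriers touch at p0.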

Let n be the future pointing timelike unit normal to S and let η: (a, b) → Γint be the affine parameterization of Γint with η(0) = p0 and ⟨η′(0), n(p0 )⟩ = −1 (which implies that η is future directed). Let k be the unique C¹ future directed null vector field along N so that k(p0 ) = η′(0) and ⟨k, n⟩ = −1. Likewise let k^±_ε be the C¹ future directed null vector field along N^±_ε with k^±_ε (p0 ) = η′(0) and ⟨k^±_ε , n⟩ = −1. Let p be any point of Γint ∩ J−(p0 ) \ {p0 }. Then p = η(t0 ) for some t0 ∈ (a, b). To simplify notation assume that t0 ≤ 0. By Lemma 4.15 N has no focal points in Γint and in particular no focal points on η|(a,0] . Therefore if we fix a t1 with t1 < t0 < b then there will be an open neighborhood W of 0 in B^{n−1}(r) so that

H̃ := {exp(t k(x⃗, h0 (x⃗))) | x⃗ ∈ W, t ∈ (t1 , 0)}

is an embedded null hypersurface of M . By Proposition A.3 the hypersurface H̃ is of smoothness class C². The focal points depend continuously on the second fundamental form so there is an ε0 > 0 so that if ε < ε0 then none of the submanifolds N^±_ε have focal points along η|[t1 ,0] . Therefore if 0 < ε < ε0 there is an open neighborhood Wε such that

H̃^±_ε := {exp(t k^±_ε (x⃗, h^±_ε (x⃗))) | x⃗ ∈ Wε , t ∈ (t1 , 0)}

is a C² embedded hypersurface of M . We now choose smooth local coordinates y¹, . . . , y^{n+1} for M centered at p = η(t0 ) so that ∂/∂y^{n+1} is a future pointing timelike vector field and the level sets




y^{n+1} = Const. are spacelike. Then there will be an open neighborhood U of 0 in R^n so that near p the horizon H is given by a graph y^{n+1} = f (y¹, . . . , y^n). Near p the future and past of H are given by J+(H) = {y^{n+1} ≥ f (y¹, . . . , y^n)} and J−(H) = {y^{n+1} ≤ f (y¹, . . . , y^n)}. There will also be open neighborhoods U0 and U^±_ε of 0 in R^n and functions f0 and f^±_ε defined on these sets so that near p

H̃ = {(y¹, . . . , y^n, f0 (y¹, . . . , y^n)) | (y¹, . . . , y^n) ∈ U0 } ,
H̃^±_ε = {(y¹, . . . , y^n, f^±_ε (y¹, . . . , y^n)) | (y¹, . . . , y^n) ∈ U^±_ε } .

The hypersurfaces H̃ and H̃^±_ε are C², which implies the functions f0 and f^±_ε are all C². Since N^−_ε ⊂ J−(H), a simple achronality argument shows that H̃^−_ε ⊂ J−(H). (This uses the properties of H described in the first paragraph of the proof.) By choosing N^+_ε small enough, we can assume that H̃^+_ε is achronal. We now show that, relative to some neighborhood of p, H ⊂ J−(H̃^+_ε ). Let Oε be a neighborhood of p, disjoint from S, such that H̃^+_ε ∩ Oε separates Oε into the disjoint open sets I+(H̃^+_ε ∩ Oε ; Oε ) and I−(H̃^+_ε ∩ Oε ; Oε ). Now, by taking Oε small enough, we claim that H does not meet I+(H̃^+_ε ∩ Oε ; Oε ). If that is not the case, there is a sequence {pℓ } such that pℓ ∈ H ∩ I+(H̃^+_ε ∩ Oε ; Oε ) and pℓ → p. For each ℓ, let Γℓ be a future inextendible null generator of H starting at pℓ . Since p is an interior point of Γ, the portions of the Γℓ ’s to the future of pℓ must approach the portion of Γ to the future of p. Hence for ℓ sufficiently large, Γℓ will meet S at a point qℓ ∈ S ∩ H such that qℓ → p0 . For such ℓ, the segment of Γℓ from pℓ to qℓ approaches the segment of Γ from p to p0 uniformly as ℓ → ∞. Since H̃^+_ε separates a small neighborhood of the segment of Γ from p to p0 , it follows that for ℓ large enough, the segment of Γℓ from pℓ to qℓ will meet H̃^+_ε . But this implies for such ℓ that pℓ ∈ I−(N^+_ε ), which contradicts the achronality of H̃^+_ε . We conclude, by choosing Oε small enough, that H ∩ I+(H̃^+_ε ∩ Oε ; Oε ) = ∅, and hence that H ∩ Oε ⊂ J−(H̃^+_ε ∩ Oε ; Oε ). Shrinking U^+_ε if necessary, this inclusion, and the inclusion H̃^−_ε ⊂ J−(H), imply that on U^−_ε we have f^−_ε ≤ f and that on U^+_ε we have f ≤ f^+_ε . Therefore if we can show that lim_{ε↘0} D²f^±_ε (0) = D²f0 (0) then Lemma 5.4 implies that 0 is an Alexandrov point of f with Alexandrov second derivative given by D²f (0) = D²f0 (0). To see that lim_{ε↘0} D²f^±_ε (0) = D²f0 (0) let b0 and b^±_ε be the Weingarten maps of H̃ and H̃^±_ε along η.
By Proposition A.1 they all satisfy the same Riccati equation (A.2). The initial condition for b0 is calculated algebraically from the second fundamental form of N at p0 (cf. Section 3) and likewise the initial condition for b^±_ε is calculated algebraically from the second fundamental form of N^±_ε at p0 . From the definitions we clearly have

Second Fundamental Form of N^±_ε at p0 −→ Second Fundamental Form of N at p0

as ε ↘ 0. Therefore continuous dependence of solutions to ODE’s on initial conditions implies that lim_{ε↘0} b^±_ε = b0 at all points of η|[t0 ,0] . As D²f^±_ε (0) and D²f0 (0) are algebraic functions of b^±_ε and b0 at the point p = η(t0 ) this implies that lim_{ε↘0} D²f^±_ε (0) = D²f0 (0), and completes the proof that p is an Alexandrov point of H. Finally, it follows from the argument that at all points of Γint the null Weingarten map b for H is the same as the Weingarten map for the C² null hypersurface H̃. So b will satisfy (A.2) by Proposition A.1. ✷
We end this section with one more regularity result, Theorem 5.6 below. Its proof requires some techniques which are introduced in Section 6 only; more precisely, an appropriate generalization of Lemma 6.11 is needed, which in turn relies on an appropriate generalization of Lemma 6.9. For this reason we defer the proof of Theorem 5.6 to an appendix, Appendix D.
Theorem 5.6 Let S be any C² hypersurface intersecting H properly transversally, and define

S0 = {q ∈ S ∩ H : q is an interior point of a generator of H} , (5.4)
S1 = {q ∈ S0 : all interior points of the generator through q are Alexandrov points of H} , (5.5)
S2 = {q ∈ S0 : q is an Alexandrov point of H} . (5.6)

Then S1 and S2 have full n − 1 dimensional Hausdorff measure in S0 . Remark 5.7 S0 does not have to have full measure in S ∩ H, it can even be empty. This last case occurs indeed when S = {t = 0} in the example described around Equation (3.13) in Section 3. Note, however, that if S is a level set of a properly transverse foliation S(t), then (as already mentioned) for almost all t’s the sets S(t)0 (as defined in (5.4) with S replaced by S(t)) will have full measure in S(t). We shall call a generator an Alexandrov generator if all its interior points are Alexandrov points. It follows that for generic sections (in the measure sense above) for almost all points through those sections the corresponding generators will be Alexandrov. The discussion here thus gives a precise meaning to the statement that almost all generators of a horizon are Alexandrov generators.

6 Area monotonicity In this section we shall show the monotonicity of the area, assuming that the Alexandrov divergence θAl of the generators of H, or that of a section of H, is non–negative. The result is local in the sense that it only depends on the part of the event horizon H that is between the two sections S1 and S2 whose area we are trying to compare. We consider a spacetime (M, g) of dimension n + 1. We have the following:




Theorem 6.1 Suppose that (M, g) is an (n+1) dimensional spacetime with a future horizon H. Let Σa , a = 1, 2 be two embedded achronal spacelike hypersurfaces of C² differentiability class, set Sa = Σa ∩ H. Assume that

S1 ⊂ J−(S2 ) , S1 ∩ S2 = ∅ , (6.1)

and that either
1. the divergence θAl of H defined (H^n_σ –almost everywhere) by Equation (2.10) is non–negative on J+(S1 ) ∩ J−(S2 ), or
2. the divergence θ^{S2}_Al of S2 defined (H^{n−1}_σ –almost everywhere) by Equation (3.4) is non–negative, with the null energy condition (4.1) holding on J+(S1 ) ∩ J−(S2 ).

Then 1.

AreaS2 (S1 ) ≤ Area(S2 ) , (6.2)

with AreaS2 (S1 ) defined in Equations (3.11)–(3.12) and Area(S2 ) defined in Equation (3.10). This implies the inequality Area(S1 ) ≤ Area(S2 ) . 2. If equality holds in (6.2), then (J + (S1 )\S1 )∩(J − (S2 )\S2 ) (which is the part of H between S1 and S2 ) is a smooth totally geodesic (analytic if the metric is analytic) null hypersurface in M . Further, if γ is a null generator of H with γ(0) ∈ S1 and γ(1) ∈ S2 , then the curvature tensor of (M, g) satisfies R(X, γ (t))γ (t) = 0 for all t ∈ [0, 1] and X ∈ Tγ(t) H. Remark 6.2 Note that if S1 ∩ S2 = ∅ then the inequality AreaS2 (S1 ) ≤ Area(S2 ) need not hold. (The inequality Area(S1 ) ≤ Area(S2 ) will still hold.) For example, if AreaS1 (S1 ) > Area(S1 ), then letting S2 = S1 gives a counterexample. If S1 ∩ S2 = ∅ then the correct inequality is AreaS2 (S1 \ S2 ) ≤ Area(S2 \ S1 ). Proof. Let us start with an outline of the proof, without the technical details — these will be supplied later in the form of a series of lemmas. Let NH (S1 ) be the collection of generators of H that meet S1 . Let A ⊆ S2 be the set of points that are of the form S2 ∩ γ with γ ∈ NH (S1 ); replacing Σ2 by an appropriate submanifold thereof if necessary, A will be a closed subset of S2 (Proposition 6.3). The condition (6.1) together with achronality of H imply that every q ∈ A is on exactly one of the generators γ ∈ NH (S1 ), we thus have a well defined function φ: A → S1 given by φ(q) = S1 ∩ γ

where γ ∈ NH (S1 ) and q = S2 ∩ γ.

(6.3)
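The uniqueness claim behind (6.3) can be unpacked as the following sketch. It rests on assumptions that are not proved in this section: that $q$ lies strictly to the future of $p$ along $\gamma$ (the reading of $S_1 \subset J^-(S_2)$ used here), that through an interior point of a generator of $H$ there passes exactly one generator, and that an achronal spacelike hypersurface is met at most once by a causal geodesic.

```latex
% Sketch only; steps (ii)-(iv) use the assumptions named in the lead-in.
\begin{align*}
&\text{Let } q \in A,\quad \gamma \in N_H(S_1),\quad q \in \gamma \cap S_2,\quad p \in \gamma \cap S_1. \\
&\text{(i) } p \neq q, \text{ because } S_1 \cap S_2 = \emptyset. \\
&\text{(ii) } q \text{ lies on } \gamma \text{ strictly to the future of } p,
   \text{ so } q \text{ is not a past end point of } \gamma. \\
&\text{(iii) Generators of a future horizon are future inextendible, so } q
   \text{ is an interior point of } \gamma. \\
&\text{(iv) An interior point lies on exactly one generator, so } \gamma
   \text{ is determined by } q, \\
&\qquad \text{and } \phi(q) := S_1 \cap \gamma \text{ is a single point of } S_1.
\end{align*}
```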


Let us for simplicity assume that the affine distance from $S_1$ to $S_2$ on the generators passing through $S_1$ is bounded from below, and that the affine existence time of those generators to the future of $S_2$ is also bounded from below (in what follows we will see how to reduce the general case to this one). In this case $A$ can be embedded in a $C^{1,1}$ hypersurface $N$ in $\Sigma_2$ (Lemma 6.9) and $\phi$ can be extended to a locally Lipschitz function $\hat\phi$ from $N$ to $\Sigma_1$ (Lemma 6.11). $A$ is $\mathcal{H}^{n-1}_{h_2}$ measurable (closed subsets of manifolds are Hausdorff measurable$^{12}$), so we can apply the generalization of the change–of–variables theorem known as the area formula [23, Theorem 3.1] (with $m$ and $k$ there equal to $n - 1$) to the extension $\hat\phi$ of $\phi$ to get

$$\int_{S_1} \#\phi^{-1}[p]\, d\mathcal{H}^{n-1}_{h_1}(p) = \int_A J(\phi)(q)\, d\mathcal{H}^{n-1}_{h_2}(q), \tag{6.4}$$

where $J(\phi)$ is the restriction of the Jacobian$^{20}$ of $\hat\phi$ to $A$, and $h_a$, $a = 1, 2$, denotes the metric induced on $\Sigma_a$ by $g$. We observe that

$$\#\phi^{-1}[p] = N(p, S_2)$$

for all $p \in S_1$. Indeed, if $\gamma$ is a null generator of $H$ and $p \in \gamma \cap S_1$ then the point $q \in S_2$ with $q \in \gamma$ satisfies $\phi(q) = p$. Thus there is a bijective correspondence between $\phi^{-1}[p]$ and the null generators of $H$ passing through $p$. (This does use that $S_1 \cap S_2 = \emptyset$.) This implies that

$$\int_{S_1} \#\phi^{-1}[p]\, d\mathcal{H}^{n-1}_{h_1}(p) = \int_{S_1} N(p, S_2)\, d\mathcal{H}^{n-1}_{h_1}(p),$$

which can be combined with the area formula (6.4) to give that

$$\mathrm{Area}_{S_2}(S_1) = \int_A J(\phi)(q)\, d\mathcal{H}^{n-1}_{h_2}(q). \tag{6.5}$$
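Anticipating the bound $J(\phi) \le 1$ of Proposition 6.16, the way (6.5) yields (6.2) can be written out as a short chain. This is only a sketch: it assumes that $\mathrm{Area}(S_2)$ in Equation (3.10), which is not reproduced in this section, is the measure $\mathcal{H}^{n-1}_{h_2}(S_2)$.

```latex
% Assumption: Area(S_2) = \mathcal{H}^{n-1}_{h_2}(S_2), as in Equation (3.10).
\begin{align*}
\mathrm{Area}_{S_2}(S_1)
  &= \int_A J(\phi)(q)\, d\mathcal{H}^{n-1}_{h_2}(q)
     && \text{by (6.5)} \\
  &\le \int_A 1 \, d\mathcal{H}^{n-1}_{h_2}(q)
     = \mathcal{H}^{n-1}_{h_2}(A)
     && \text{since } J(\phi) \le 1 \text{ a.e. on } A \\
  &\le \mathcal{H}^{n-1}_{h_2}(S_2)
     = \mathrm{Area}(S_2)
     && \text{since } A \subseteq S_2 .
\end{align*}
```

Read this way, equality in (6.2) with finite area forces both inequalities to be equalities, that is, $J(\phi) = 1$ almost everywhere on $A$ and $\mathcal{H}^{n-1}_{h_2}(S_2 \setminus A) = 0$.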

We note that [24, Theorem 3.2.3, p. 243] also guarantees$^{21}$ that $A \ni p \to N(p, S_2)$ is measurable, so that $\mathrm{Area}_{S_2}(S_1)$ is well defined. Now a calculation, that is straightforward when $H$ is smooth (Proposition A.5, Appendix A), shows that having the null mean curvatures $\theta_{Al}$ (or $\theta_{Al}^{S_2}$, together with the null energy condition) nonnegative implies that $J(\phi) \le 1$ almost everywhere with respect to $\mathcal{H}^{n-1}_{h_2}$ on $A$; this is established in Proposition 6.16. Using this in (6.5) completes the proof.

$^{20}$ It should be pointed out that the Jacobian $J(\phi)$ is not the usual Jacobian which occurs in the change–of–variables theorem for Lebesgue measure on $\mathbb{R}^n$, but contains the appropriate $\sqrt{\det g_{ij}}$ factors occurring in the definition of the measure associated with a Riemannian metric, see [23, Def. 2.10, p. 423]. A clear exposition of the Jacobians that occur in the area and co–area formulas can be found in [22, Section 3.2] in the flat $\mathbb{R}^n$ case. See also [41, Appendix p. 66] for the smooth Riemannian case.

$^{21}$ The result in [24, Theorem 3.2.3, p. 243] is formulated for subsets of $\mathbb{R}^q$, but the result generalizes immediately to manifolds by considering local charts, together with a partition of unity argument.

Our first technical step is the following:




Proposition 6.3 Let the setting be as in Theorem 6.1. Then there exists an open submanifold $\Sigma_2'$ of $\Sigma_2$ such that $A$ is a closed subset of $S_2' = \Sigma_2' \cap H$. Replacing $\Sigma_2$ with $\Sigma_2'$ we can thus assume that $A$ is closed in $S_2$.

Proof. Let $W_2$ be the subset of $S_2$ consisting of all points $p \in S_2$ for which there exists a semi–tangent $X$ at $p$ of $H$ such that the null geodesic $\eta$ starting at $p$ in the direction $X$ does not meet $\Sigma_1$ when extended to the past (whether or not it remains in $H$). Let $\{p_k\}$ be a sequence of points in $W_2$ such that $p_k \to p \in S_2$. We show that $p \in W_2$, and hence that $W_2$ is a closed subset of $S_2$. For each $k$, let $X_k$ be a semi–tangent at $p_k$ such that the null geodesic $\eta_k$ starting at $p_k$ in the direction $X_k$ does not meet $\Sigma_1$ when extended to the past. Since the collection of $\sigma$-normalized null vectors is locally compact, by passing to a subsequence if necessary, we may assume that $X_k \to X$, where $X$ is a future pointing null vector at $p$. Let $\eta$ be the null geodesic starting at $p$ in the direction $X$. Since $(p_k, X_k) \to (p, X)$, $\eta_k \to \eta$ in the strong sense of geodesics. Since, further, the $\eta_k$'s remain in $H$ to the future and $H$ is closed, it follows that $\eta$ is a null generator of $H$ and $X$ is a semi-tangent of $H$ at $p$. Since $\eta_k \to \eta$, if $\eta$ meets $\Sigma_1$ when extended to the past then so will $\eta_k$ for $k$ large enough. Hence, $\eta$ does not meet $\Sigma_1$, so $p \in W_2$, and $W_2$ is a closed subset of $S_2$.

Then $S_2' := S_2 \setminus W_2$ is an open subset of $S_2$. Thus, there exists an open set $\Sigma_2'$ in $\Sigma_2$ such that $S_2' = S_2 \cap \Sigma_2' = \Sigma_2' \cap H$. Note that $p \in S_2'$ iff for each semi-tangent $X$ of $H$ at $p$, the null geodesic starting at $p$ in the direction $X$ meets $\Sigma_1$ when extended to the past. In particular, $A \subset S_2'$.

To finish, we show that $A$ is a closed subset of $S_2'$. Let $\{p_k\}$ be a sequence in $A$ such that $p_k \to p \in S_2'$. Let $\eta_k$ be the unique null generator of $H$ through $p_k$. Let $X_k$ be the $\sigma$-normalized tangent to $\eta_k$ at $p_k$. Let $q_k$ be the point where $\eta_k$ meets $S_1$. Again, by passing to a subsequence if necessary, we may assume $X_k \to X$, where $X$ is a semi–tangent of $H$ at $p$. Let $\eta$ be the null geodesic at $p$ in the direction $X$. We know that when extended to the past, $\eta$ meets $\Sigma_1$ at a point $q$, say. Since $\eta_k \to \eta$ we must in fact have $q_k \to q$ and hence $q \in S_1$. It follows that $\eta$ starting from $q$ is a null generator of $H$. Hence $p \in A$, and $A$ is closed in $S_2'$. ✷

We note that in the proof above we have also shown the following:

Lemma 6.4 The collection $N = \cup_{p \in H} N_p$ of semi–tangents is a closed subset of $TM$. ✷

Recall that we have fixed a complete Riemannian metric $\sigma$ on $M$. For each $\delta > 0$ let

$$A_\delta := \{p \in A \mid \sigma\text{-dist}(p, S_1) \ge \delta, \text{ and the generator } \gamma \text{ through } p \text{ can be extended at least a } \sigma\text{–distance } \delta \text{ to the future}\}. \tag{6.6}$$

We note that if the $\sigma$ distance from $S_1$ to $S_2$ on the generators passing through $S_1$ is bounded from below, and if the length of the portions of those generators which lie to the future of $S_2$ is also bounded from below (which is trivially fulfilled when the generators are assumed to be future complete), then $A_\delta$ will coincide with $A$ for $\delta$ small enough.

Lemma 6.5 Without loss of generality we may assume

$$\bar S_1 \cap S_2 = \emptyset \tag{6.7}$$

and

$$A = \cup_{\delta > 0} A_\delta. \tag{6.8}$$

Proof. We shall show how to reduce the general situation in which $S_1 \cap S_2 = \emptyset$ to one in which Equation (6.7) holds. Assume, first, that $\Sigma_1$ is connected, and introduce a complete Riemannian metric on $\Sigma_1$. With respect to this metric $\Sigma_1$ is a complete metric space such that the closed distance balls $\overline{B(p, r)}$ (closure in $\Sigma_1$) are compact in $\Sigma_1$, and hence also in $M$. Choose a $p$ in $\Sigma_1$ and let $\Sigma_{1,i} = B(p, i)$ ($B(p, i)$ being open balls). Then $\Sigma_1 = \cup_i \Sigma_{1,i} = \cup_i \overline{\Sigma_{1,i}}$ (closure either in $\Sigma_1$ or in $M$) is an increasing union of compact sets. The $\Sigma_{1,i}$'s are spacelike achronal hypersurfaces which have the desired property

$$\bar S_{1,i} \cap S_2 \subset \Sigma_1 \cap \Sigma_2 = \emptyset \ \ (\text{closure in } M), \qquad S_{1,i} \equiv \Sigma_{1,i} \cap H.$$

Suppose that we have shown that point 1 of Theorem 6.1 is true for $S_{1,i}$, thus $\mathrm{Area}_{S_2}(S_{1,i}) \le \mathrm{Area}(S_2)$. As $S_{1,i} \subset S_{1,i+1}$ and $S_1 = \cup_i S_{1,i}$, the monotone convergence theorem gives

$$\lim_{i \to \infty} \mathrm{Area}_{S_2}(S_{1,i}) = \mathrm{Area}_{S_2}(S_1),$$

whence the result. If $\Sigma_1$ is not connected, one can carry the above procedure out on each component (at most countably many), obtaining a sequence of sets $\Sigma_{1,i}$ for each component of $\Sigma_1$. The resulting collection of sets is countable, and an obvious modification of the above argument establishes that (6.7) holds. But if Equation (6.7) holds, that is if $\bar S_1 \cap S_2 = \emptyset$, we have $\sigma\text{-dist}(p, S_1) = \sigma\text{-dist}(p, \bar S_1) > 0$ for all $p \in S_2$. Therefore (6.8) holds. This completes the proof. ✷

In order to continue we need the following variation of the Whitney extension theorem. As the proof involves very different ideas than the rest of this section we postpone the proof to Appendix E.




Proposition 6.6 Let $A \subset \mathbb{R}^n$ and $f: A \to \mathbb{R}$. Assume there is a constant $C > 0$ and for each $p \in A$ there is a vector $a_p \in \mathbb{R}^n$ so that the inequalities

$$f(p) + \langle x - p, a_p \rangle - \frac{C}{2}\|x - p\|^2 \le f(x) \le f(p) + \langle x - p, a_p \rangle + \frac{C}{2}\|x - p\|^2 \tag{6.9}$$

hold for all $x \in A$. Also assume that for all $p, q \in A$ and all $x \in \mathbb{R}^n$ the inequality

$$f(p) + \langle x - p, a_p \rangle - \frac{C}{2}\|x - p\|^2 \le f(q) + \langle x - q, a_q \rangle + \frac{C}{2}\|x - q\|^2 \tag{6.10}$$

holds. Then there is a function $F: \mathbb{R}^n \to \mathbb{R}$ of class $C^{1,1}$ so that $f$ is the restriction of $F$ to $A$.

Remark 6.7 Following [9, Prop. 1.1 p. 7] the hypothesis (6.9) of Proposition 6.6 will be stated by saying $f$ has global upper and lower support paraboloids of opening $C$ in $A$. The condition (6.10) can be expressed by saying the upper and lower support paraboloids of $f$ are disjoint.

Remark 6.8 Unlike most extension theorems of Whitney type this result does not require that the set being extended from be closed (here $A$ need not even be measurable) and there is no continuity assumption on $f$ or on the map that sends $p$ to the vector $a_p$ (cf. [62, Chap. VI Sec. 2], [40, Vol. 1, Thm 2.3.6, p. 48] where to get a $C^{1,1}$ extension the mapping $p \to a_p$ is required to be Lipschitz). The usual continuity assumptions are replaced by the “disjointness” condition (6.10) which is more natural in the geometric problems considered here.

Proposition 6.6 is a key element for the proof of the result that follows [42]:

Lemma 6.9 For any $\delta > 0$ there is a $C^{1,1}_{loc}$ hypersurface $N_\delta$ of $\Sigma_2$ (thus $N_\delta$ has co–dimension two in $M$) with $A_\delta \subseteq N_\delta$.
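As a consistency check on Proposition 6.6, its hypotheses are also necessary: any function that is already the restriction of a $C^{1,1}$ function satisfies (6.9) and (6.10). The sketch below assumes that $F$ has a globally Lipschitz gradient with constant $C$ and takes $a_p = \nabla F(p)$; both choices are assumptions of this illustration, not statements from the source.

```latex
% Assumptions: F \in C^{1,1}(\mathbb{R}^n), \|\nabla F(x) - \nabla F(y)\| \le C \|x - y\|,
%              f = F|_A, and a_p := \nabla F(p).
\begin{align*}
\bigl| F(x) - F(p) - \langle x - p, a_p \rangle \bigr|
  &= \Bigl| \int_0^1 \bigl\langle \nabla F(p + t(x - p)) - \nabla F(p),\, x - p \bigr\rangle \, dt \Bigr| \\
  &\le \int_0^1 C\, t\, \|x - p\|^2 \, dt = \frac{C}{2}\, \|x - p\|^2 ,
\end{align*}
% which is (6.9) for x \in A. Using the lower bound at p and the upper bound at q
% at the same point x \in \mathbb{R}^n gives (6.10):
\begin{equation*}
f(p) + \langle x - p, a_p \rangle - \frac{C}{2}\|x - p\|^2
  \;\le\; F(x) \;\le\;
f(q) + \langle x - q, a_q \rangle + \frac{C}{2}\|x - q\|^2 .
\end{equation*}
```

In this reading the content of the proposition is the converse direction, where no regularity of $A$, $f$ or $p \to a_p$ is assumed.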

Remark 6.10 We note that $N_\delta$ does not have to be connected.

Proof. The strategy of the proof is as follows: We start by showing that all points $q$ in $A_\delta$ possess space–time neighborhoods in which all the generators of $H$ passing through $A_\delta$ are contained in a $C^{1,1}$ hypersurface in $M$. This hypersurface is not necessarily null, but is transverse to $\Sigma_2$ on $A_\delta$. $N_\delta$ will be obtained by a globalization argument that uses the intersections of those locally defined hypersurfaces with $\Sigma_2$.

By the local character of the arguments that follow, one can assume without loss of generality that $M$ is globally hyperbolic; otherwise, where relevant, one could restrict to a globally hyperbolic neighborhood and define the pasts, futures, etc., with respect to this neighborhood.


For $q \in A_\delta$ define $q^\pm \in J^\pm(q) \cap H$ as those points on the generator $\gamma_q$ of $H$ through $q$ which lie a $\sigma$–distance $\delta$ away from $q$. Achronality of $H$ implies that $\gamma_q$ has no conjugate points, hence the hypersurfaces $\partial J^\mp(q^\pm)$ are smooth in a neighborhood of $q$, tangent to $H$ there.

Let $q_0$ in $A_\delta$ and choose a basis $B = \{E_1, \dots, E_n, E_{n+1}\}$ of $T_{q_0^+}M$ such that $g(E_i, E_j) = \delta_{ij}$ and $g(E_i, E_{n+1}) = 0$ for $i, j = 1, \dots, n$, while $g(E_{n+1}, E_{n+1}) = -1$. We shall denote by $(y_1, \dots, y_{n+1})$ the normal coordinates associated with this basis. We note that $A_{\delta'} \subset A_\delta$ for $\delta < \delta'$ so that it is sufficient to establish our claims for $\delta$ small. Choosing $\delta$ small enough the cone

$$\{y_{n+1} = -\sqrt{y_1^2 + \cdots + y_n^2} \mid y_i \in \mathbb{R},\ i = 1, \dots, n\}$$

coincides with $\partial J^-(q_0^+)$ in a neighborhood of $q_0$. Reducing $\delta$ again if necessary we can suppose that $q_0^-$ belongs to a normal coordinate neighborhood of $q_0^+$. Let $y^0_{n+1}$ be the $(n+1)$st coordinate of $q_0$ and let $y^{0-}_{n+1}$ be the $(n+1)$st coordinate of $q_0^-$. Denote by $C^-$ the smooth hypersurface with compact closure obtained by intersecting the previous cone by the space–time slab lying between the hypersurfaces $y_{n+1} = \frac{1}{2} y^0_{n+1}$ and $y_{n+1} = \frac{1}{2}(y^0_{n+1} + y^{0-}_{n+1})$. Let us use the symbol $\omega$ to denote any smooth local parameterization $\omega \to n(\omega) \in T_{q_0^+}M$ of this hypersurface. We then obtain a smooth parameterization by $\omega$ of $\partial J^-(q_0^+)$ in a neighborhood of $q_0$ by using the map $\omega \to \exp_{q_0^+}(n(\omega))$. Reducing $\delta$ again if necessary, without loss of generality we may assume that there are no conjugate points on $\exp_{q_0^+}(n(\omega)) \subset C^-$.

Using parallel transport of $B$ along radial geodesics from $q_0^+$, we can then use $\omega$ to obtain a smooth parameterization of $\partial J^-(m)$ in a neighborhood of $q_0$, for all $m$ in a neighborhood of $q_0^+$. For all $m$ in this neighborhood let us denote by $n_m(\omega)$ the corresponding vector in $T_mM$. The map $(m, \omega) \to \exp_m(n_m(\omega))$ is smooth for $m$ close to $q_0^+$ and for $\omega \in C^-$. This implies the continuity of the map which to a couple $(m, \omega)$ assigns the second fundamental form $II(m, \omega)$, defined with respect to an auxiliary Riemannian metric $\sigma$, of $\partial J^-(m)$ at $\exp_m(n_m(\omega))$. If we choose the $m$'s in a sufficiently small compact coordinate neighborhood of $q_0^+$, compactness of $C^-$ implies that the $\sigma$-norm of $II$ can be bounded in this neighborhood by a constant, independently of $(m, \omega)$.

Let, now, $(x^1, \dots, x^{n+1})$ be a coordinate system covering a globally hyperbolic neighborhood of $q_0$, centered at $q_0$, and of the form $O = B^n(2r) \times (-a, a)$. We can further require that $g(\partial_{n+1}, \partial_{n+1}) < C < 0$ over $O$. Transversality shows that, reducing $r$ and $a$ if necessary, the hypersurfaces $\exp_m(n_m(\omega))$ are smooth graphs above $B^n(2r)$, for $m$ close to $q_0^+$; we shall denote by $f_m^-$ the corresponding graphing functions. Now, the first derivatives of the $f_m^-$'s can be bounded by an $m$–independent constant: the vectors $V_i = \partial_i + D_i f_m^- \partial_{n+1}$ are tangent to the graph, hence non–timelike, and the result follows immediately from the equation $g(V_i, V_i) \ge 0$. Next, the explicit formula for the second fundamental form of a graph (cf., e.g., [2, Eqns. (3.3) and (3.4), p. 604]) gives a uniform bound for the second derivatives of the $f_m^-$'s over $B^n(2r)$. We make a similar argument, using future light cones issued from points $m$ near to $q_0^-$, and their graphing functions over $B^n(2r)$ denoted by $f_m^+$. Let $f$ be the graphing function of $H$ over $B^n(2r)$; we can choose a constant $C$ such that, for all $p$ in

$$G_\delta := \{x \in \Gamma \mid \Gamma \text{ is a generator of } H \text{ passing through } A_\delta\} \cap O, \tag{6.11}$$

the graph of the function $f_p(x) = f(x_p) + Df(x_p)(x - x_p) - C\|x - x_p\|^2$ lies under the graph of $f^-_{p^+}$, hence in the past of $p^+$. Here we write $x_p$ for the space coordinate of $p$, thus $p = (x_p, f(x_p))$, etc. Similarly, for all $q$ in $G_\delta$ the graph of $f_q(x) = f(x_q) + Df(x_q)(x - x_q) + C\|x - x_q\|^2$ lies above the graph of $f^+_{q^-}$, thus is to the future of $q^-$. Achronality of the horizon implies then that the inequality (6.10) holds for $x \in B^n(2r)$. Increasing $C$ if necessary, Equation (6.10) will hold for all $x_p, x_q \in B^n(r)$ and $x \in \mathbb{R}^n$. Indeed, let us make $C$ large enough so that

$$(C - 1) r^2 \ge \mathrm{Lip}^2_{B^n(r)}(f) - \inf_{B^n(r)}(f) + \sup_{B^n(r)}(f),$$

where $\mathrm{Lip}_{B^n(r)}(f)$ is the Lipschitz continuity constant of $f$ on $B^n(r)$. For $x \notin B^n(2r)$ we then have

$$\begin{aligned}
& f(x_q) - f(x_p) + \frac{C}{2}\bigl(\|x - x_q\|^2 + \|x - x_p\|^2\bigr) + \langle x - x_q, a_q \rangle - \langle x - x_p, a_p \rangle \\
&\quad \ge f(x_q) - f(x_p) + \frac{C}{2}\bigl(\|x - x_q\|^2 + \|x - x_p\|^2\bigr) - \frac{1}{2}\bigl(\|x - x_q\|^2 + \|x - x_p\|^2 + \|a_q\|^2 + \|a_p\|^2\bigr) \\
&\quad \ge \inf_{B^n(r)}(f) - \sup_{B^n(r)}(f) + (C - 1) r^2 - \mathrm{Lip}^2_{B^n(r)}(f) \ge 0.
\end{aligned}$$

Here $a_p = df(x_p)$, $a_q = df(x_q)$, and we have used $\mathrm{Lip}_{B^n(r)}(f)$ to control $\|a_p\|$ and $\|a_q\|$. Let the set $A$ of Proposition 6.6 be the projection on $B^n(r)$ of $G_\delta$; by that proposition there exists a $C^{1,1}$ function from $B^n(r)$ to $(-a, a)$, the graph of which contains $G_\delta$. (It may be necessary to reduce $r$ and $O$ to obtain this.) Note that this graph contains $A_\delta \cap O$ and is transverse to $\Sigma_2$ there. From the fact that a transverse intersection of a $C^2$ hypersurface with a $C^{1,1}$ hypersurface is a $C^{1,1}$ manifold we obtain that $A_\delta \cap O$ is included in a $C^{1,1}$ submanifold of $\Sigma_2$, which has space–time co–dimension two.


Now $A_\delta$ is a closed subset of the manifold $\Sigma_2'$ defined in Proposition 6.3, and can thus be covered by a countable union of compact sets. Further, by definition of $A_\delta$ any point thereof is an interior point of a generator. Those facts and the arguments of the proof of Proposition 3.1 show that $A_\delta$ can be covered by a countable locally finite collection of relatively compact coordinate neighborhoods $U_i$ of the form $U_i = B^{n-1}(\epsilon_i) \times (-\eta_i, \eta_i)$ such that $U_i \cap H$ is a graph of a semi–convex function $g_i$:

$$U_i \cap H = \{x^n = g_i(x^A),\ (x^A) \equiv (x^1, \dots, x^{n-1}) \in B^{n-1}(\epsilon_i)\}. \tag{6.12}$$

The $\epsilon_i$'s will further be restricted by the requirement that there exists a $C^{1,1}$ function

$$h_i : B^{n-1}(\epsilon_i) \to \mathbb{R} \tag{6.13}$$

such that the graph of $h_i$ contains $U_i \cap A_\delta$. The $h_i$'s are the graphing functions of the $C^{1,1}$ manifolds just constructed in a neighborhood of each point of $A_\delta$. At those points at which $S_2$ is differentiable let $m$ denote a $h_2$–unit vector field normal to $S_2$ pointing towards $J^-(H) \cap \Sigma_2$. Choosing the orientation of $x^n$ appropriately we can assume that at those points at which $m$ is defined we have

$$m(x^n - g_i) > 0. \tag{6.14}$$

In order to globalize this construction we use an idea of [42]. Let $\phi_i$ be a partition of unity subordinate to the cover $\{U_i\}_{i \in \mathbb{N}}$, define

$$\chi_i(q) = \begin{cases} (x^n - h_i(x^A))\, \phi_i(q), & q = (x^A, x^n) \in U_i, \\ 0, & \text{otherwise}, \end{cases} \tag{6.15}$$

$$\chi_\delta = \sum_i \chi_i. \tag{6.16}$$

Define a hypersurface $N_\delta \subset \cup_i U_i$ via the equations

$$N_\delta \cap U_i = \{\chi_\delta = 0,\ d\chi_\delta \neq 0\}. \tag{6.17}$$

(We note that $N_\delta$ does not have to be connected, but this is irrelevant for our purposes.) If $q \in A_\delta$ we have $\chi_i(q) = 0$ for all $i$'s hence $A_\delta \subset \{\chi_\delta = 0\}$. Further, for $q \in A_\delta \cap U_i$ we have $m(\chi_i) = \phi_i\, m(x^n - g_i) \ge 0$ from Equation (6.14), and since for each $q \in A_\delta$ there exists an $i$ at which this is strictly positive we obtain

$$m(\chi_\delta) = \sum_i m(\chi_i) > 0.$$

It follows that $A_\delta \subset N_\delta$, and the result is established. ✷

Lemma 6.11 Under the condition (6.8), the map φ: A → S1 is locally Lipschitz.




Proof. Let $q_0 \in A$; we need to find a neighborhood $U_A \equiv U \cap A$ in $A$ so that $\phi$ is Lipschitz in this neighborhood. By the condition (6.8) there exists $\delta > 0$ such that $q_0 \in A_\delta$. By lower semi–continuity of the existence time of geodesics we can choose a neighborhood $U$ of $q_0$ in $\Sigma_2$ small enough so that

$$U \cap A_\delta = U \cap A. \tag{6.18}$$

Denote by $N$ the $C^{1,1}_{loc}$ hypersurface $N_\delta$ (corresponding to the chosen small value of $\delta$) given by Lemma 6.9, so that

$$A \cap U \subseteq N \cap U.$$

Let $n$ be the future pointing unit normal to $\Sigma_2$. (So $n$ is a unit timelike vector.) Let $m$ be the unit normal to $N$ in $\Sigma_2$ that points to $J^-(H) \cap \Sigma_2$. Let $k := -(n + m)$. Then $k$ is a past pointing null vector field along $N$ and if $q \in A_\delta$ then $k(q)$ is tangent to the unique generator of $H$ through $q$. As $\Sigma_2$ is $C^2$ the vector field $n$ is $C^1$, while the $C^{1,1}_{loc}$ character of $N$ implies that $m$ is locally Lipschitz. It follows that $k$ is a locally Lipschitz vector field along $N$. Now for each $q \in A \cap U$ there is a unique positive real number $r(q)$ so that $\phi(q) = \exp(r(q)k(q)) \in S_1$. Lower semi–continuity of existence time of geodesics implies that, passing to a subset of $U$ if necessary, for each $q \in N \cap U$ there is a unique positive real number $\hat r(q)$ so that $\hat\phi(q) = \exp(\hat r(q)k(q)) \in \Sigma_1$. As $\Sigma_1$ is a Lipschitz hypersurface, Clarke's implicit function theorem [17, Corollary, p. 256] implies that $q \to \hat r(q)$ is a locally Lipschitz function near $q_0$. Thus $\hat\phi$ is Lipschitz near $q_0$ and the restriction of $\hat\phi$ to $A_\delta$ is $\phi$. ✷

Corollary 6.12 The section $S_1$ is countably $(n - 1)$ rectifiable.

Remark 6.13 By starting with an arbitrary section $S = S_1$ of the form $S = H \cap \Sigma$, where $\Sigma$ is a $C^2$ spacelike hypersurface or timelike hypersurface that meets $H$ properly transversely, and then constructing a section $S_2 = H \cap \Sigma_2$ for a $C^2$ spacelike hypersurface $\Sigma_2$ so that the hypotheses of Theorem 6.1 hold, one obtains that every such section $S$ is countably $(n - 1)$ rectifiable. In the case that $\Sigma_1 \equiv \Sigma$ is spacelike a more precise version of this is given in [42] (where it is shown that $S$ is countably $(n - 1)$ rectifiable of class $C^2$).

Proof. From Lemma 6.9 it is clear for each $\delta > 0$ that $A_\delta$ is countably $(n - 1)$ rectifiable. By Lemma 6.11 the map $\phi|_{A_\delta}: A_\delta \to S_1$ is locally Lipschitz and therefore $\phi[A_\delta]$ is locally countably $(n - 1)$ rectifiable. But $S_1 = \cup_{k=1}^{\infty} \phi[A_{1/k}]$ so $S_1$ is a countable union of countably $(n - 1)$ rectifiable sets and therefore is itself countably $(n - 1)$ rectifiable. ✷

It follows from the outline given at the beginning of the proof of Theorem 6.1 that we have obtained:


Proposition 6.14 The formula (6.5) holds.

Corollary 6.15 For any section $S = H \cap \Sigma$ where $\Sigma$ is a $C^2$ spacelike or timelike hypersurface that intersects $H$ properly transversely the set of points of $S$ that are on infinitely many generators has vanishing $(n - 1)$ dimensional Hausdorff measure.

Proof. For any point $p$ of $S$ choose a globally hyperbolic neighborhood $O$ of the point and then choose a neighborhood $S_1$ of $p$ in $S$ small enough that the closure $\bar S_1$ is compact, satisfies $\bar S_1 \subset S \cap O$, and so that there is a $C^\infty$ Cauchy hypersurface $\Sigma_2$ of $O$ such that $\bar S_1 \subset I^-(\Sigma_2; O)$. Let $S_2 = H \cap \Sigma_2$ and let $A$ be the set of points of $S_2$ that are of the form $S_2 \cap \gamma$ where $\gamma$ is a generator of $H$ that meets $\bar S_1$. Compactness of $\bar S_1$ together with the argument of the proof of Proposition 6.3 show that the set of generators of $H$ that meet $\bar S_1$ is a compact subset of the bundle of null geodesic rays of $M$, and that $A$ is a compact subset of $S_2$. Then Lemma 6.9 implies that $A$ is a compact set in a $C^{1,1}$ hypersurface of $\Sigma_2$ and so $\mathcal{H}^{n-1}(A) < \infty$. Lemma 6.11 and the compactness of $A$ yield that $\phi: A \to \bar S_1$ given by (6.3) (with $\bar S_1$ replacing $S_1$) is Lipschitz. Therefore the Jacobian $J(\phi)$ is bounded on $A$. By Proposition 6.14

$$\int_{S_1} N(p, S_2)\, d\mathcal{H}^{n-1}(p) = \int_A J(\phi)(q)\, d\mathcal{H}^{n-1}_{h_2}(q) < \infty$$

and so $N(p, S_2) < \infty$ except on a set of $\mathcal{H}^{n-1}$ measure zero. But $N(p, S_2)$ is the number of generators of $H$ through $p$ so this implies the set of points of $S_1$ that are on infinitely many generators has vanishing $(n - 1)$ dimensional Hausdorff measure. Now $S$ can be covered by a countable collection of such neighborhoods $S_1$ and so the set of points on $S$ that are on infinitely many generators has vanishing $(n - 1)$ dimensional Hausdorff measure. This completes the proof. ✷

To establish (6.2) it remains to show that $J(\phi) \le 1$ $\mathcal{H}^{n-1}_{h_2}$-almost everywhere. To do this one would like to use the classical formula that relates the Jacobian of $\phi$ to the divergence $\theta$ of the horizon, cf. Proposition A.5, Appendix A below. However, for horizons which are not $C^2$ we only have the Alexandrov divergence $\theta_{Al}$ at our disposal, and it is not clear whether or not this formula holds with $\theta$ replaced by $\theta_{Al}$ for general horizons. The proof below consists in showing that this formula remains true after such a replacement for generators passing through almost all points of $A$.

Proposition 6.16 $J(\phi) \le 1$ $\mathcal{H}^{n-1}_{h_2}$–almost everywhere on $A$.

Proof. The argument of the paragraph preceding Equation (6.18) shows that it is sufficient to show $J(\phi) \le 1$ $\mathcal{H}^{n-1}_{h_2}$–almost everywhere on $A_\delta$ for each $\delta > 0$. Let $N_\delta$ be the $C^{1,1}$ manifold constructed in Lemma 6.9, and let $U \subset \Sigma_2$ be a coordinate neighborhood of the form $V \times (a, b)$, with $V \subset \mathbb{R}^{n-1}$ and $a, b \in \mathbb{R}$, in which $U \cap N_\delta$ is the graph of a $C^{1,1}$ function $g: V \to \mathbb{R}$, and in which $H \cap U$ is the graph of a semi–convex function $f: V \to \mathbb{R}$. By [24, Theorem 3.1.5, p. 227] for every $\epsilon > 0$ there exists a twice differentiable function $g_{1/\epsilon}: V \to \mathbb{R}$ such that

$$\mathcal{L}^{n-1}_{h_V}(\{g \neq g_{1/\epsilon}\}) < \epsilon. \tag{6.19}$$

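Here $\mathcal{L}^{n-1}_{h_V}$ denotes the $(n - 1)$ dimensional Riemannian measure on $V$ associated with the pull–back $h_V$ of the space–time metric $g$ to $V$. Let $\mathrm{pr}A_\delta$ denote the projection on $V$ of $A_\delta \cap U$, thus $A_\delta \cap U$ is the graph of $g$ over $\mathrm{pr}A_\delta$. For $q \in V$ let $\theta_*^{n-1}(\mathcal{L}^{n-1}_{h_V}, \mathrm{pr}A_\delta, q)$ be the density function of $\mathrm{pr}A_\delta$ in $V$ with respect to the measure $\mathcal{L}^{n-1}_{h_V}$, defined as in [61, page 10] using geodesic coordinates centered at $q$ with respect to the metric $h_V$. Define

$$B = \{q \in \mathrm{pr}A_\delta \mid \theta_*^{n-1}(\mathcal{L}^{n-1}_{h_V}, \mathrm{pr}A_\delta, q) = 1\} \subset V. \tag{6.20}$$

By [52, Corollary 2.9] or [24, 2.9.12] the function $\theta_*^{n-1}(\mathcal{L}^{n-1}_{h_V}, \mathrm{pr}A_\delta, \cdot)$ differs from the characteristic function $\chi_{\mathrm{pr}A_\delta}$ of $\mathrm{pr}A_\delta$ by a function supported on a set of vanishing measure, which implies that $B$ has full measure in $\mathrm{pr}A_\delta$. Let

$$B_{1/\epsilon} = B \cap \{g = g_{1/\epsilon}\};$$

Equation (6.19) shows that

$$\mathcal{L}^{n-1}_{h_V}(B \setminus B_{1/\epsilon}) < \epsilon. \tag{6.21}$$

We define $\tilde B_{1/\epsilon} \subset B_{1/\epsilon} \subset \mathrm{pr}A_\delta \subset V$ as follows:

$$\tilde B_{1/\epsilon} = \{q \in B_{1/\epsilon} \mid \theta_*^{n-1}(\mathcal{L}^{n-1}_{h_V}, B_{1/\epsilon}, q) = 1\}. \tag{6.22}$$

Similarly the set $\tilde B_{1/\epsilon}$ has full measure in $B_{1/\epsilon}$, hence

$$\mathcal{L}^{n-1}_{h_V}(B \setminus \tilde B_{1/\epsilon}) < \epsilon. \tag{6.23}$$

Let $(\mathrm{pr}A_\delta)_{Al}$ denote the projection on $V$ of the set of those Alexandrov points of $H \cap \Sigma_2$ which are in $A_\delta \cap U$; $(\mathrm{pr}A_\delta)_{Al}$ has full measure in $\mathrm{pr}A_\delta$. Let, further, $V_{Rad}$ be the set of points at which $g$ is twice differentiable; by Rademacher's theorem (cf., e.g., [22, p. 81]) $V_{Rad}$ has full measure in $V$. We set

$$\hat B_{1/\epsilon} = \tilde B_{1/\epsilon} \cap (\mathrm{pr}A_\delta)_{Al} \cap V_{Rad}, \tag{6.24}$$

$$\hat B = \bigcup_{i \in \mathbb{N}} \hat B_i. \tag{6.25}$$

It follows from (6.23) that $\hat B$ has full measure in $\mathrm{pr}A_\delta$. Since $g$ is Lipschitz we obtain that the graph of $g$ over $\hat B$ has full $\mathcal{H}^{n-1}_{h_2}$–measure in $A_\delta \cap U$.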


Consider any $x_0 \in \hat B$; then $x_0$ is an Alexandrov point of $f$ so that we have the expansion

$$f(x) = f(x_0) + df(x_0)(x - x_0) + \tfrac{1}{2} D^2 f(x_0)(x - x_0, x - x_0) + o(|x - x_0|^2). \tag{6.26}$$

Next, $g$ is twice differentiable at $x_0$ so that we also have

$$g(x) = g(x_0) + dg(x_0)(x - x_0) + \tfrac{1}{2} D^2 g(x_0)(x - x_0, x - x_0) + o(|x - x_0|^2), \tag{6.27}$$

$$dg(x) - dg(x_0) = D^2 g(x_0)(x - x_0, \cdot) + o(|x - x_0|). \tag{6.28}$$

Further there exists $j \in \mathbb{N}$ such that $x_0 \in \hat B_j \subset B_j$; by definition of $B_j$ we then have

$$g_j(x_0) = g(x_0) = f(x_0). \tag{6.29}$$

We claim that the set of directions $\vec n \in B^{n-1}(1) \subset T_{x_0}V$ for which there exists a sequence of points $q_i = x_0 + r_i \vec n$ with $r_i \to 0$ and with the property that $f(q_i) = g_j(q_i)$ is dense in $B^{n-1}(1)$. Indeed, suppose that this is not the case; then there exists $\epsilon > 0$ and an open set $\Omega \subset B^{n-1}(1) \subset T_{x_0}V$ of directions $\vec n \in B^{n-1}(1)$ such that the solid cone $K = \{x_0 + r \vec n \mid \vec n \in \Omega,\ r \in [0, \epsilon]\}$ contains no points from $B_j$. It follows that the density function $\theta_*^{n-1}(\mathcal{L}^{n-1}_{h_V}, B_j, x_0)$ is strictly smaller than one, which contradicts the fact that $x_0$ is a density point of $\mathrm{pr}A_\delta$ ($x_0 \in \hat B_j \subset B$, cf. Equation (6.20)), and that $x_0$ is a density point of $B_j$ ($x_0 \in \hat B_j \subset \tilde B_j$, cf. Equation (6.22)). Equation (6.29) leads to the following Taylor expansions at $x_0$:

$$g_j(x) = f(x_0) + dg_j(x_0)(x - x_0) + \tfrac{1}{2} D^2 g_j(x_0)(x - x_0, x - x_0) + o(|x - x_0|^2), \tag{6.30}$$

$$dg_j(x) - dg_j(x_0) = D^2 g_j(x_0)(x - x_0, \cdot) + o(|x - x_0|). \tag{6.31}$$

Subtracting Equation (6.26) from Equation (6.30) at points $q_i = x_0 + r_i \vec n$ at which $f(q_i) = g_j(q_i)$ we obtain

$$dg_j(x_0)(\vec n) - df(x_0)(\vec n) = O(r_i). \tag{6.32}$$

Density of the set of $\vec n$'s for which (6.32) holds implies that

$$dg_j(x_0) = df(x_0). \tag{6.33}$$

Again comparing Equation (6.26) with Equation (6.30) at points at which $f(q_i) = g_j(q_i)$ it now follows that

$$D^2 g_j(x_0)(\vec n, \vec n) - D^2 f(x_0)(\vec n, \vec n) = 2\, \frac{g_j(x_0 + r_i \vec n) - f(x_0 + r_i \vec n)}{r_i^2} + o(1) = o(1), \tag{6.34}$$




and density of the set of $\vec n$'s together with the polarization identity gives

$$D^2 g_j(x_0) = D^2 f(x_0). \tag{6.35}$$

Similarly one obtains

$$dg(x_0) = df(x_0), \tag{6.36}$$

$$D^2 g(x_0) = D^2 f(x_0). \tag{6.37}$$

Define $S_j$ to be the graph (over $V$) of $g_j$; Equations (6.33) and (6.36) show that both $S_j$ and $N_\delta$ are tangent to $H \cap \Sigma_2$ at $p_0 \equiv (x_0, f(x_0))$:

$$T_{p_0}(H \cap \Sigma_2) = T_{p_0} N_\delta = T_{p_0} S_j. \tag{6.38}$$

Let $n_j$ denote the ($C^1$) field of $\sigma$–unit future directed null normals to $S_j$ such that $n_j(p_0) = n(p_0)$, where $n(p_0)$ is the semi–tangent to $H$ at $p_0$. Let $\phi_j: S_j \to \Sigma_1$ be the map obtained by intersecting with $\Sigma_1$ the null geodesics passing through points $q \in S_j$ with tangent parallel to $n_j(q)$ there. Equation (6.38) shows that

$$\phi_j(p_0) = \phi(p_0) = \hat\phi(p_0). \tag{6.39}$$

The lower semi–continuity of the existence time of geodesics shows that, passing to a subset of $U$ if necessary, $\phi_j$ is well defined on $S_j$. By an argument similar to the one leading to (6.33), it follows from Equations (6.28), (6.31), (6.33) and (6.35)–(6.37) that the derivatives of $\phi_j$ and of $\hat\phi$ coincide at $p_0$, in particular

$$J(\phi)(p_0) \equiv J(\hat\phi)(p_0) = J(\phi_j)(p_0). \tag{6.40}$$

Equation (6.35) further shows that $S_j$ is second order tangent to $H$ at $p_0$ in the sense defined before Lemma 4.15; we thus infer from that lemma that there are no focal points of $S_j$ along the segment of the generator $\Gamma$ of $H$ passing through $p_0$ which lies to the future of $\Sigma_1$. Consider the set $H_j$ obtained as the union of null geodesics passing through $S_j$ and tangent to $n_j$ there; standard considerations show that there exists a neighborhood of $\Gamma \cap I^+(\Sigma_1) \cap I^-(\Sigma_2)$ in which $H_j$ is a $C^1$ hypersurface. It is shown in Appendix A that 1) $H_j$ is actually a $C^2$ hypersurface, cf. Proposition A.3, and 2) the null Weingarten map $b = b_{H_j}$ of $H_j$ satisfies the Riccati equation

$$b' + b^2 + R = 0. \tag{6.41}$$

Here a prime denotes a derivative with respect to an affine parameterization $s \to \eta(s)$ of $\Gamma$ that makes $\Gamma$ future directed. Theorem 5.1 implies that $H$ has a null Weingarten map $b_{Al}$ defined in terms of the Alexandrov second derivatives of $H$ on all of the segment $\Gamma \cap I^+(\Sigma_1) \cap I^-(\Sigma_2)$ and that this Weingarten map also satisfies the Riccati equation (6.41). As the null Weingarten map can be expressed in terms of the first and second derivatives of the graphing function of a section, Equations (6.33) and (6.35) imply that $b_{H_j}(p_0) = b_{Al}(p_0)$. Therefore uniqueness of solutions to initial value problems implies that $b_{H_j} = b_{Al}$ on all of $\Gamma \cap I^+(\Sigma_1) \cap I^-(\Sigma_2)$. But the divergence (or null mean curvature) of a null hypersurface is the trace of its null Weingarten map and thus on the segment $\Gamma \cap I^+(\Sigma_1) \cap I^-(\Sigma_2)$ we have

$$\theta_{H_j} = \mathrm{trace}\, b_{H_j} = \mathrm{trace}\, b_{Al} = \theta_{Al}.$$

We now finish the proof under the first hypothesis of Theorem 6.1, that is, that the divergence satisfies $\theta_{Al} \ge 0$ on $J^+(S_1) \cap J^-(S_2)$. Then $\theta_{H_j} = \theta_{Al} \ge 0$ and Proposition A.5 implies that $J(\phi)(p_0) = J(\phi_j)(p_0) \le 1$ as required.

The other hypothesis of Theorem 6.1 is that $\theta_{Al}^{S_2} \ge 0$ with the null energy condition holding on $J^+(S_1) \cap J^-(S_2)$. Recall that $p_0 = \Gamma \cap S_2$ is an Alexandrov point of $H$; thus Theorem 5.1 applies and shows that $\theta = \theta_{Al}$ exists along the segment $\Gamma \cap I^+(\Sigma_1) \cap I^-(\Sigma_2)$, and satisfies the Raychaudhuri equation

$$\theta' = -\mathrm{Ric}(\eta', \eta') - \sigma^2 - \frac{1}{n-2}\, \theta^2.$$

Here $\sigma^2$ is the norm squared of the shear (and should not be confused with the auxiliary Riemannian metric which we have also denoted by $\sigma$). But the null energy condition implies $\mathrm{Ric}(\eta', \eta') \ge 0$, so this equation gives $\theta' \le 0$, which together with $\theta_{Al}^{S_2} \ge 0$ implies $\theta = \theta_{Al} \ge 0$ on $\Gamma \cap I^+(\Sigma_1) \cap I^-(\Sigma_2)$. Then the equality $\theta_{H_j} = \theta_{Al}$ yields $\theta_{H_j} \ge 0$ and again we can use Proposition A.5 to conclude $J(\phi)(p_0) = J(\phi_j)(p_0) \le 1$. This completes the proof. ✷

To finish the proof of Theorem 6.1, we need to analyze what happens when

$$\mathrm{Area}_{S_2}(S_1) = \mathrm{Area}(S_2). \tag{6.42}$$

In this case Equation (6.4) together with Proposition 6.16 show that $A$ has full measure in $S_2$, that $N(p, S_2) = 1$ $\mathcal{H}^{n-1}_{h_1}$–almost everywhere on $S_1$, and that $J(\phi) = 1$ $\mathcal{H}^{n-1}_{h_2}$–almost everywhere on $S_2$. Next, the arguments of the proof of that Proposition show that

$$\theta_{Al} = 0 \tag{6.43}$$

$\mathcal{H}^n_\sigma$–almost everywhere on $J^-(A) \cap J^+(S_1) = J^-(S_2) \cap J^+(S_1)$. The proof of Proposition 6.16 further shows that $\mathrm{Ric}(\eta', \eta') = 0$ $\mathcal{H}^n_\sigma$–almost everywhere on $H \cap J^+(S_1) \cap J^-(S_2)$ (cf. Equation (6.41)), hence everywhere there as the metric is assumed to be smooth and the distribution of semi–tangents is a closed set (Lemma 6.4). Here $\eta'$ is any semi–tangent to $H$. We first note the following observation, the proof of which borrows arguments from [6, Section IV]:

Lemma 6.17 Under the hypotheses of Theorem 6.1, suppose further that the equality (6.42) holds. Then there are no end points of generators of $H$ on

$$\Omega \equiv (J^+(S_1) \setminus S_1) \cap (J^-(S_2) \setminus S_2). \tag{6.44}$$




Proof. Suppose that there exists q ∈ Ω which is an end point of a generator Γ of H, set {p} = Γ ∩ S2 , extending Γ beyond its end point and parameterizing it appropriately we will have Γ(0) = q ,

Γ(a) ∈ I − (H) ,

Γ(1) = p ,

for any a < 0 for which Γ(a) is defined. Now p is an interior point of a generator, and semi–tangents at points in a sufficiently small neighborhood of p are arbitrarily close to the semi–tangent Xp at p. Since I − (H) is open it follows from continuous dependence of solutions of ODE’s upon initial values that there exists a neighborhood V ⊂ S2 of p such that every generator of H passing through V leaves H before intersecting Σ1 when followed backwards in time from S2 , hence A ∩ V = ∅, and A does not have full measure in S2 . ✷ To finish the proof we shall need the following result, which seems to be of independent interest: Theorem 6.18 Let Ω be an open subset of a horizon H which contains no end points of generators of H, and suppose that the divergence θAl of H defined (Hnσ – almost everywhere) by Equation (2.10) vanishes Hnσ –almost everywhere. Then Ω is a smooth submanifold of M (analytic if the metric is analytic). Remark 6.19 The condition on Ω is equivalent to Ω being a C 1 hypersurface, cf. [6]. Proof. Let p0 ∈ Ω and choose a smooth local foliation {Σλ | − ε < λ < ε} of an open neighborhood U of p0 in M by spacelike hypersurfaces so that U ∩ Ω ⊂ Ω and so that p0 ∈ Σ0 . Letting σ be a the auxiliary Riemannian metric, by possibly making U smaller we can assume that the σ-distance of U ∩ Ω to Ω \ Ω is < δ for some δ > 0. Let ΩAl be the set of Alexandrov points of Ω and let B = Ω \ ΩAl be the set of points of Ω where the Alexandrov second derivatives do not exist. We view λ as a function λ: U → R in the natural way. Now λ is smooth on U and by Remark 6.19 Ω is a C 1 manifold so the restriction λΩ is a C 1 function. Letting hσ be the pull back of our auxiliary Riemannian metric σ to Ω we apply the co-area formula to λΩ and use that by Alexandrov’s theorem Hnhσ (B) = 0 to get

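\[ \int_{-\varepsilon}^{\varepsilon} \mathcal{H}^{n-1}_{h_\sigma}(B \cap \Sigma_\lambda)\, d\lambda = \int_B J(\lambda_\Omega)\, d\mathcal{H}^n_{h_\sigma} = 0 . \]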

This implies that $\mathcal{H}^{n-1}_{h_\sigma}(B \cap \Sigma_\lambda) = 0$ for almost all λ ∈ (−ε, ε). Therefore we can choose a λ just a little bigger than 0 with $\mathcal{H}^{n-1}_{h_\sigma}(B \cap \Sigma_\lambda) = 0$ and so that $p_0 \in J^-(\Omega \cap \Sigma_\lambda)$. To simplify notation we denote Σλ by Σ. Then from the choice of Σ we have that $\mathcal{H}^{n-1}_{h_\sigma}$–almost every point of Σ is an Alexandrov point of Ω. By transversality, and because Ω is $C^1$, the set Σ ∩ Ω is a $C^1$ submanifold of Σ. Recalling that δ is less than the σ-distance of U ∩ Ω to $\overline{\Omega} \setminus \Omega$, we see that for any p ∈ Σ the unique (because Ω is $C^1$) generator Γ of Ω through p extends in Ω a σ-distance of at least δ both to the future and to the past of Σ. Letting A = Σ ∩ Ω and using the notation of Equation (6.6), this implies that $A = A_\delta$. As A = Σ ∩ Ω is already a $C^1$ submanifold of Σ, Lemma 6.9 implies that Σ ∩ Ω is a $C^{1,1}$ hypersurface in Σ. Let g be any Lipschitz local graphing function of A in Σ. From Rademacher's theorem (cf., e.g., [22, p. 81]) it follows that $C^{1,1} = W^{2,\infty}_{\mathrm{loc}}$; further, the Alexandrov second derivatives of g coincide with the classical ones almost everywhere. By [22, p. 235] the second distributional derivatives of g equal the second classical derivatives of g almost everywhere. It follows that the equation
\[ \theta_{Al} = 0 \tag{6.45} \]

can be rewritten, by freezing the coefficients of the second derivatives at the solution g, as a linear elliptic weak (distributional) equation with Lipschitz continuous coefficients for the graphing function $g \in W^{2,\infty}_{\mathrm{loc}}$. Elliptic regularity shows that g is, locally, of $C^{2,\alpha}$ differentiability class for any α ∈ (0, 1). Further, Equation (6.45) is a quasi–linear elliptic equation for g (cf., e.g., [29]); a standard bootstrap argument shows that g is smooth (analytic if the metric is analytic), and it easily follows that Ω in a neighborhood of Σ ∩ Ω containing p0 is smooth (or analytic). As p0 was an arbitrary point of Ω this completes the proof. ✷

Returning to the proof of Theorem 6.1, we note that Lemma 6.17 shows that all points of Σ ∩ Ω, where Ω is given by Equation (6.44), are interior points of generators of H. Simple arguments together with the invariance of the domain theorem (cf., e.g., [18, Prop. 7.4, p. 79]) show that Ω is an open submanifold of H, and Equation (6.43) shows that we can use Theorem 6.18 to conclude. ✷

7 Conclusions

Let us present here some applications of Theorems 1.2 and 6.18, proved above. The first one is to the theory of stationary black holes (cf., e.g., [38, 12] and references therein): in that theory the question of differentiability of event horizons arises at several key places. Recall that smoothness of event horizons has been established in 1) static [63, 10] and 2) stationary–axisymmetric [10] space–times. However, staticity or stationarity–axisymmetry are often not known a priori; that is indeed the case in Hawking's rigidity theorem [37]²². Now, the rigidity theorem asserts that a certain class of stationary black holes have axi–symmetric domains of outer communication; its hypotheses include that of analyticity of the metric and of the event horizon. The examples of black holes (in analytic vacuum space–times) the horizons of which are nowhere $C^2$ constructed in [15] show that the hypothesis of analyticity of the event horizon and that of analyticity of the metric are logically independent. It is thus of interest to note the following result, which is a straightforward corollary of Theorem 1.2 and of the fact that isometries preserve area:

²² This theorem is actually wrong as stated in [37]; a corrected version, together with a proof, can be found in [13].

Theorem 7.1 Let φ be an isometry of a black–hole space–time (M, g) satisfying the hypotheses of Theorem 1.2. If φ maps H into H, then for every spacelike hypersurface Σ such that

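\[ \varphi(\Sigma \cap H) \subset J^+(\Sigma \cap H) \tag{7.1} \]
the set
\[ \big( J^-(\varphi(\Sigma \cap H)) \setminus \varphi(\Sigma \cap H) \big) \cap \big( J^+(\Sigma \cap H) \setminus (\Sigma \cap H) \big) \subset H \tag{7.2} \]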

is a smooth (analytic if the metric is analytic) null submanifold of M with vanishing null second fundamental form.

As already pointed out, the application we have in mind is that to stationary black holes, where φ actually arises from a one parameter group of isometries $\varphi_t$. We note that in such a setting the fact that isometries preserve the event horizon, as well as the existence of hypersurfaces Σ for which (7.1) holds with φ replaced by $\varphi_t$ (for some, or for all t's), can be established under various standard conditions on the geometry of stationary black holes, which are of no concern to us here.

Recall, next, that the question of differentiability of Cauchy horizons often arises in considerations concerning cosmic censorship issues (cf., e.g., [3, 58]). An interesting result in this context, indicating non–genericity of occurrence of compact Cauchy horizons, is the Isenberg–Moncrief theorem, which asserts that analytic compact Cauchy horizons with periodic generators in analytic, electro–vacuum space-times are Killing horizons, for a Killing vector field defined on a neighborhood of the Cauchy horizon [45]. We note that if all the generators of the horizon are periodic, then the horizon has no end–points, and analyticity follows²³ from Theorem 6.18. Hence the hypothesis of analyticity of the event horizon is not needed in [45]. We also note that there exists a (partial) version of the Isenberg–Moncrief theorem, due to Friedrich, Rácz and Wald [27], in which the hypotheses of analyticity of [45] are replaced by those of smoothness both of the metric and of the Cauchy horizon. Theorem 6.18 again shows that the hypothesis of smoothness of the Cauchy horizon is not necessary in [27].
To close this section let us note an interesting theorem of Beem and Królak [6, Section IV], which asserts that if a compact Cauchy horizon in a space–time satisfying the null energy condition contains, roughly speaking, an open dense subset O which is a $C^2$ manifold, then there are no end points of the generators of the event horizon, and the divergence of the event horizon vanishes. Theorem 6.18 again applies to show that the horizon must be as smooth as the metric allows. Our methods here could perhaps provide a proof of a version of the Beem–Królak theorem in which the hypothesis of existence of the set O will not be needed; this remains to be seen.

²³ The proof proceeds as follows: Theorem 5.1 shows that the optical equations hold on almost all generators of the Cauchy horizon; periodicity of the generators together with the Raychaudhuri equation shows then that $\theta_{Al} = 0$ almost everywhere, hence Theorem 6.18 applies.

A The Geometry of $C^2$ Null Hypersurfaces

In this appendix we prove a result concerning the regularity of null hypersurfaces normal to a $C^k$ submanifold in space–time. We also review some aspects of the geometry of null hypersurfaces, with the presentation adapted to our needs. We follow the exposition of [29].

Let (M, g) be a spacetime, i.e., a smooth, paracompact time-oriented Lorentzian manifold, of dimension n + 1 ≥ 3. We denote the Lorentzian metric on M by g or ⟨ , ⟩. A ($C^2$) null hypersurface in M is a $C^2$ co-dimension one embedded submanifold H of M such that the pullback of the metric g to H is degenerate. Each such hypersurface H admits a $C^1$ non-vanishing future directed null vector field K ∈ ΓTH such that the normal space of K at a point p ∈ H coincides with the tangent space of H at p, i.e., $K_p^\perp = T_pH$ for all p ∈ H. (If H is $C^2$ the best regularity we can require for K is $C^1$.) In particular, tangent vectors to H not parallel to K are spacelike. It is well-known that the integral curves of K, when suitably parameterized, are null geodesics. These integral curves are called the null geodesic generators of H. We note that the vector field K is unique up to a positive scale factor.

Since K is orthogonal to H we can introduce the null Weingarten map and null second fundamental form of H with respect to K in a manner roughly analogous to what is done for spacelike hypersurfaces or hypersurfaces in a Riemannian manifold, as follows. We start by introducing an equivalence relation on tangent vectors: for $X, X' \in T_pH$, $X' = X \bmod K$ if and only if $X' - X = \lambda K$ for some λ ∈ R. Let $\overline{X}$ denote the equivalence class of X. Simple computations show that if $X' = X \bmod K$ and $Y' = Y \bmod K$ then $\langle X', Y' \rangle = \langle X, Y \rangle$ and $\langle \nabla_{X'} K, Y' \rangle = \langle \nabla_X K, Y \rangle$, where ∇ is the Levi-Civita connection of M. Hence, for various quantities of interest, components along K are not of interest. For this reason one works with the tangent space of H modded out by K, i.e., $T_pH/K = \{\overline{X} \mid X \in T_pH\}$ and $TH/K = \cup_{p \in H} T_pH/K$.
TH/K is a rank n − 1 vector bundle over H. This vector bundle does not depend on the particular choice of null vector field K. There is a natural positive definite metric h in TH/K induced from ⟨ , ⟩: for each p ∈ H, define $h: T_pH/K \times T_pH/K \to \mathbb{R}$ by $h(\overline{X}, \overline{Y}) = \langle X, Y \rangle$. From remarks above, h is well-defined.

The null Weingarten map $b = b_K$ of H with respect to K is, for each point p ∈ H, a linear map $b: T_pH/K \to T_pH/K$ defined by $b(\overline{X}) = \overline{\nabla_X K}$. It is easily verified that b is well-defined and, as it involves taking a derivative of K, which is $C^1$, the tensor b will be $C^0$ but no more regularity can be expected. Note that if $\tilde K = fK$, $f \in C^1(H)$, is any other future directed null vector field tangent to H, then $\nabla_X \tilde K = f \nabla_X K \bmod K$. Thus $b_{fK} = f b_K$. It follows that the Weingarten map b of H is unique up to positive scale factor and that b at a given point p ∈ H depends only on the value of K at p when we keep H fixed but allow K to vary while remaining tangent to the generators of H.

A standard computation shows $h(b(\overline{X}), \overline{Y}) = \langle \nabla_X K, Y \rangle = \langle X, \nabla_Y K \rangle = h(\overline{X}, b(\overline{Y}))$. Hence b is self-adjoint with respect to h. The null second fundamental form $B = B_K$ of H with respect to K is the bilinear form associated to b via h: for each p ∈ H, $B: T_pH/K \times T_pH/K \to \mathbb{R}$ is defined by $B(\overline{X}, \overline{Y}) = h(b(\overline{X}), \overline{Y}) = \langle \nabla_X K, Y \rangle$. Since b is self-adjoint, B is symmetric. In a manner analogous to the second fundamental form for spacelike hypersurfaces, a null hypersurface is totally geodesic if and only if B vanishes identically [51, Theorem 30].

The null mean curvature of H with respect to K is the continuous scalar field $\theta \in C^0(H)$ defined by θ = tr b; in the general relativity literature θ is often referred to as the convergence or divergence of the horizon. Let $e_1, e_2, \ldots, e_{n-1}$ be n − 1 orthonormal spacelike vectors (with respect to ⟨ , ⟩) tangent to H at p. Then $\{\overline{e}_1, \overline{e}_2, \ldots, \overline{e}_{n-1}\}$ is an orthonormal basis (with respect to h) of $T_pH/K$. Hence at p,
\[ \theta = \operatorname{tr} b = \sum_{i=1}^{n-1} h(b(\overline{e}_i), \overline{e}_i) = \sum_{i=1}^{n-1} \langle \nabla_{e_i} K, e_i \rangle . \tag{A.1} \]

Let Σ be the intersection, transverse to K, of a hypersurface in M with H. Then Σ is a $C^2$ (n − 1) dimensional spacelike submanifold of M contained in H which meets K orthogonally. From Equation (A.1), $\theta|_\Sigma = \operatorname{div}_\Sigma K$, and hence the null mean curvature gives a measure of the divergence of the null generators of H. Note that if $\tilde K = fK$ then $\tilde\theta = f\theta$. Thus the null mean curvature inequalities θ ≥ 0, θ ≤ 0, are invariant under positive scaling of K. In Minkowski space, a future null cone $H = \partial I^+(p) \setminus \{p\}$ (respectively, past null cone $H = \partial I^-(p) \setminus \{p\}$) has positive null mean curvature, θ > 0 (respectively, negative null mean curvature, θ < 0).

The null second fundamental form of a null hypersurface obeys a well-defined comparison theory roughly similar to the comparison theory satisfied by the second fundamental forms of a family of parallel spacelike hypersurfaces (cf. Eschenburg [21], which we follow in spirit). Let η: (a, b) → M, s → η(s), be a future directed affinely parameterized null geodesic generator of H. For each s ∈ (a, b), let
\[ b(s) = b_{\eta'(s)} : T_{\eta(s)}H/\eta'(s) \to T_{\eta(s)}H/\eta'(s) \]
be the Weingarten map based at η(s) with respect to the null vector K = η′(s). Recall that the null Weingarten map b of a smooth null hypersurface H satisfies a Riccati equation (cf. [5, p. 431]; for completeness we indicate the proof below):
\[ b' + b^2 + R = 0 . \tag{A.2} \]
Here ′ denotes covariant differentiation in the direction η′(s), with η an affinely parameterized null geodesic generator of H; more precisely, if X = X(s) is a vector field along η tangent to H, then²⁴
\[ b'(\overline{X}) = b(\overline{X})' - b(\overline{X'}) . \tag{A.3} \]

Finally $R: T_{\eta(s)}H/\eta'(s) \to T_{\eta(s)}H/\eta'(s)$ is the curvature endomorphism defined by $R(\overline{X}) = \overline{R(X, \eta'(s))\eta'(s)}$, where (X, Y, Z) → R(X, Y)Z is the Riemann curvature tensor of M (in our conventions, $R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$).

We indicate the proof of Equation (A.2). Fix a point p = η(s0), s0 ∈ (a, b), on η. On a neighborhood U of p in H we can scale the null vector field K so that K is a geodesic vector field, $\nabla_K K = 0$, and so that K, restricted to η, is the velocity vector field to η, i.e., for each s near s0, $K_{\eta(s)} = \eta'(s)$. Let $X \in T_pM$. Shrinking U if necessary, we can extend X to a smooth vector field on U so that $[X, K] = \nabla_X K - \nabla_K X = 0$. Then $R(X, K)K = \nabla_X \nabla_K K - \nabla_K \nabla_X K - \nabla_{[X,K]} K = -\nabla_K \nabla_K X$. Hence along η we have $X'' = -R(X, \eta')\eta'$ (which implies that X, restricted to η, is a Jacobi field along η). Thus, from Equation (A.3), at the point p we have
\[ b'(\overline{X}) = \overline{\nabla_X K}' - b(\overline{\nabla_K X}) = \overline{\nabla_K X}' - b(\overline{\nabla_X K}) = \overline{X''} - b(b(\overline{X})) = -\overline{R(X, \eta')\eta'} - b^2(\overline{X}) = -R(\overline{X}) - b^2(\overline{X}) , \tag{A.4} \]
which establishes Equation (A.2).

Equation (A.2) leads to the well known Raychaudhuri equation for an irrotational null geodesic congruence in general relativity: by taking the trace of (A.2) we obtain the following formula for the derivative of the null mean curvature θ = θ(s) along η,
\[ \theta' = -\operatorname{Ric}(\eta', \eta') - \sigma^2 - \frac{1}{n-1}\,\theta^2 , \tag{A.5} \]
where σ, the shear scalar, is the trace of the square of the trace free part of b. This equation shows how the Ricci curvature of spacetime influences the null mean curvature of a null hypersurface. We note the following:

Proposition A.1 Let H be a $C^2$ null hypersurface in the (n + 1) dimensional spacetime (M, g) and let b be the one parameter family of Weingarten maps along an affinely parameterized null generator η. Then the covariant derivative b′ defined by Equation (A.3) exists and satisfies Equation (A.2).

Remark A.2 When H is smooth this is a standard result, proved by the calculation (A.4). However when H is only $C^2$ all we know is that b is a $C^0$ tensor field, so that there is no reason a priori that the derivative b′ should exist. A main point of the proposition is that it does exist and satisfies the expected differential equation. As the function $s \to R_{\eta(s)}$ is $C^\infty$, the Riccati equation then implies that actually the dependence of $b_{\eta(s)}$ on s is $C^\infty$. This will be clear from the proof below for other reasons.

²⁴ Here $b(\overline{X})$ is an equivalence class of vectors, so it might be useful to give a practical prescription how its derivative $b(\overline{X})'$ can be calculated. Let s → c(s) be a null generator of H. Let s → V(s) be a TH/K–vector field along c, i.e., for each s, V(s) is an element of $T_{c(s)}H/K$. Say s → V(s) is smooth if (at least locally) there is a smooth (in the usual sense) vector field s → Y(s) along c such that $V(s) = \overline{Y}(s)$ for each s. Then define the covariant derivative of s → V(s) along c by $V'(s) = \overline{Y'(s)}$, where Y′ is the usual covariant derivative. It is easily shown, using the fact that $\nabla_K K$ is proportional to K, that V′ so defined is independent of the choice of Y. This definition applies in particular to $b(\overline{X})$.

Proof. Let η: (a, b) → H be an affinely parameterized null generator of H. To simplify notation we assume that 0 ∈ (a, b) and choose a $C^\infty$ spacelike hypersurface Σ of M that passes through p = η(0) and let N = H ∩ Σ. Then N is a $C^2$ hypersurface in Σ. Now let $\tilde N$ be a $C^\infty$ hypersurface in Σ so that $\tilde N$ has second order contact with N at p. Let $\tilde K$ be a smooth null normal vector field along $\tilde N$ such that at p, $\tilde K = \eta'(0)$. Consider the hypersurface $\tilde H$ obtained by exponentiating normally along $\tilde N$ in the direction $\tilde K$; by Lemma 4.15 there are no focal points along η as long as η stays on H. Passing to a subset of $\tilde N$ if necessary to avoid cut points, $\tilde H$ will then be a $C^\infty$ null hypersurface in a neighborhood of η. Let B(s) and $\tilde B(s)$ be the null second fundamental forms of H and $\tilde H$, respectively, at η(s) in the direction η′(s). We claim that $\tilde B(s) = B(s)$ for all s ∈ (a, b). Since the null Weingarten maps $\tilde b = \tilde b(s)$ associated to $\tilde B = \tilde B(s)$ satisfy Equation (A.2), this is sufficient to establish the proposition.

We first show that $\tilde B(s) = B(s)$ for all s ∈ [0, c] for some c ∈ (0, b). By restricting to a suitable neighborhood of p we can assume without loss of generality that M is globally hyperbolic. Let $X \in T_p\Sigma$ be the projection of $\eta'(0) \in T_pM$ onto $T_p\Sigma$. By an arbitrarily small second order deformation of $\tilde N \subset \Sigma$ (depending on a parameter ε in a fashion similar to Equation (4.4)) we obtain a $C^\infty$ hypersurface $\tilde N^+$ in Σ which meets N only in the point p and lies to the side of N into which X points. Similarly, we obtain a $C^\infty$ hypersurface $\tilde N^-$ in Σ which meets N only in the point p and lies to the side of N into which −X points. Let $\tilde K^\pm$ be a smooth null normal vector field along $\tilde N^\pm$ which agrees with η′(0) at p. By exponentiating normally along $\tilde N^\pm$ in the direction $\tilde K^\pm$ we obtain, as before, in a neighborhood of $\eta|_{[0,c]}$ a $C^\infty$ null hypersurface $\tilde H^\pm$, for some c ∈ (0, b). Let $\tilde B^\pm(s)$ be the null second fundamental form of $\tilde H^\pm$ at η(s) in the direction η′(s). By restricting the size of Σ if necessary we find open sets W, $W^\pm$ in Σ, with $W^- \subset W \subset W^+$, such that $N \subset \partial_\Sigma W$ and $\tilde N^\pm \subset \partial_\Sigma W^\pm$. Restricting to a sufficiently small neighborhood of $\eta|_{[0,c]}$, we have $H \cap J^+(\Sigma) \subset \partial J^+(W)$ and $\tilde H^\pm \cap J^+(\Sigma) \subset \partial J^+(W^\pm)$. Since $J^+(W^-) \subset J^+(W) \subset J^+(W^+)$, it follows that $\tilde H^-$ is to the future of H near η(s) and H is to the future of $\tilde H^+$ near η(s), s ∈ [0, c]. Now if two null hypersurfaces H1 and H2 are tangent at a point p, and H2 is to the future of H1, then the difference of the null second fundamental forms $B_2 - B_1$ is positive semidefinite at p. We thus obtain $\tilde B^-(s) \ge B(s) \ge \tilde B^+(s)$. Letting ε → 0 (i.e., letting the deformations go to zero), we obtain $\tilde B(s) = B(s)$ for all s ∈ [0, c]. A straightforward continuation argument implies, in fact, that $\tilde B(s) = B(s)$ for all s ∈ [0, b). A similar argument establishes equality for s ∈ (a, 0]. ✷
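As a numerical illustration of the Riccati equation (A.2), the following minimal sketch treats the simplest case mentioned above, a future null cone in Minkowski space. It rests on assumptions not spelled out in the text: the curvature term R vanishes, and along an affinely parameterized generator at affine distance s from the vertex the Weingarten map is a multiple of the identity, b(s) = β(s)·Id with β(s) = 1/s, so that (A.2) reduces to the scalar equation β′ = −β². The function names are hypothetical and chosen for this example only.

```python
def riccati_rhs(beta, curvature=0.0):
    """Right-hand side of b' = -b^2 - R for b = beta * Id and R = curvature * Id."""
    return -beta * beta - curvature


def integrate_riccati(beta0, s0, s1, steps=1000):
    """Integrate beta' = -beta^2 (flat space, R = 0) from s0 to s1 with classical RK4."""
    h = (s1 - s0) / steps
    beta = beta0
    for _ in range(steps):
        k1 = riccati_rhs(beta)
        k2 = riccati_rhs(beta + 0.5 * h * k1)
        k3 = riccati_rhs(beta + 0.5 * h * k2)
        k4 = riccati_rhs(beta + h * k3)
        beta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return beta


def null_mean_curvature(beta, n):
    """theta = tr b on the rank n - 1 bundle TH/K, for b = beta * Id."""
    return (n - 1) * beta


n = 3  # spacetime dimension n + 1 = 4
# Start on the cone at affine distance s = 1 from the vertex, where beta = 1/s = 1.
beta_end = integrate_riccati(beta0=1.0, s0=1.0, s1=2.0)
theta_end = null_mean_curvature(beta_end, n)
```

Under these assumptions the integrated value should agree with the closed form β(2) = 1/2 up to discretization error, and θ = (n − 1)/s stays positive along the generator, consistent with the sign of the null mean curvature of a future null cone.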


In the last result above the hypersurface H had to be of at least $C^2$ differentiability class. Now, in our applications we have to consider hypersurfaces H obtained as a collection of null geodesics normal to a $C^2$ surface. A naive inspection of the problem at hand shows that such H's could in principle be of $C^1$ differentiability only. Let us show that one does indeed have $C^2$ differentiability of the resulting hypersurface:

Proposition A.3 Consider a $C^{k+1}$ spacelike submanifold N ⊂ M of co–dimension two in an (n + 1) dimensional spacetime (M, g), with k ≥ 1. Let k be a nonvanishing $C^k$ null vector field along N, and let U ⊆ R × N be the set of points where the function f: U → M, $f(t, p) := \exp_p(t\,k(p))$, is defined. If $f_{(t_0,p_0)*}$ is injective then there is an open neighborhood O of (t0, p0) so that the image f[O] is a $C^{k+1}$ embedded hypersurface in M.

Remark A.4 In our application we only need the case k = 1. This result is somewhat surprising as the function p → k(p) used in the definition of f is only $C^k$. We emphasize that we are not assuming that f is injective. We note that f will not be of $C^{k+1}$ differentiability class in general, which can be seen as follows: Let t → r(t) be a $C^{k+1}$ curve in the x-y plane of Minkowski 3-space which is not of $C^{k+2}$ differentiability class. Let t → n(t) be the spacelike unit normal field along the curve in the x-y plane; then t → n(t) is $C^k$ and is not $C^{k+1}$. Let T = (0, 0, 1) be the unit normal to the x-y plane. Then K(t) = n(t) + T is a $C^k$ normal null field along t → r(t). The normal exponential map $f: \mathbb{R}^2 \to \mathbb{R}^3$ in the direction K is given by f(s, t) = r(t) + s[n(t) + T], and hence df/dt = r′(t) + s n′(t), showing explicitly that the regularity of f can be no greater than the regularity of n(t), and hence no greater than the regularity of r′(t).

Proof. This result is local in N about p0 so there is no loss of generality, by possibly replacing N by a neighborhood of p0 in N, in assuming that N is an embedded submanifold of M. The map f is of class $C^k$ and the derivative $f_{(t_0,p_0)*}$ is injective, so the implicit function theorem implies f[U] is a $C^k$ hypersurface near f(t0, p0). Let η be any nonzero timelike $C^\infty$ vector field on M defined near p0 (some restrictions to be put on η shortly) and let $\Phi_s$ be the flow of η. Then for sufficiently small ε the map $\tilde f: (-\varepsilon, \varepsilon) \times N \to M$ given by $\tilde f(s, p) := \Phi_s(p)$ is injective and of class $C^{k+1}$. Extend k to any $C^k$ vector field $\tilde k$ along $\tilde f$. (It is not assumed that the extension $\tilde k$ is null.) That is, $\tilde k: (-\varepsilon, \varepsilon) \times N \to TM$ is a $C^k$ map and $\tilde k(s, p) \in T_{\tilde f(s,p)}M$. Note that we can choose $\tilde k(s, p)$ so that the covariant derivative
\[ \frac{\nabla \tilde k}{\partial s}(0, p_0) \]
has any value we wish at the one point (0, p0). Define a map

$F: (t_0 - \varepsilon, t_0 + \varepsilon) \times (-\varepsilon, \varepsilon) \times N \to M$ by
\[ F(t, s, p) = \exp(t\,\tilde k(s, p)) . \]
We now show that F can be chosen to be a local diffeomorphism near (t0, 0, p0). Note that F(t, 0, p) = f(t, p) and by assumption $f_{*(t_0,p_0)}$ is injective. Therefore the restriction of $F_{*(t_0,0,p_0)}$ to $T_{(t_0,p_0)}(\mathbb{R} \times N) \subset T_{(t_0,0,p_0)}(\mathbb{R} \times \mathbb{R} \times N)$ is injective. Thus by the inverse function theorem it is enough to show that $F_{*(t_0,0,p_0)}(\partial/\partial s)$ is linearly independent of the subspace $F_{*(t_0,0,p_0)}[T_{(t_0,p_0)}(\mathbb{R} \times N)]$. Let
\[ V(t) = \frac{\partial F}{\partial s}(t, s, p_0)\Big|_{s=0} . \]
Then $V(t_0) = F_{*(t_0,0,p_0)}(\partial/\partial s)$ and our claim that F is a local diffeomorphism follows if $V(t_0) \notin F_{*(t_0,0,p_0)}[T_{(t_0,p_0)}(\mathbb{R} \times N)]$. For each s, p the map t → F(t, s, p) is a geodesic and therefore V is a Jacobi field along t → F(t, 0, p0). (Those geodesics might change type as s is varied at fixed p0, but this is irrelevant for our purposes.) The initial conditions of this Jacobi field are
\[ V(0) = \frac{\partial}{\partial s} F(0, s, p_0)\Big|_{s=0} = \frac{\partial}{\partial s} \Phi_s(p_0)\Big|_{s=0} = \eta(p_0) \]
and
\[ \frac{\nabla V}{\partial t}(0) = \frac{\nabla}{\partial t}\frac{\partial}{\partial s} F(t, s, p_0)\Big|_{s=0,\,t=0} = \frac{\nabla}{\partial s}\frac{\partial}{\partial t} F(t, s, p_0)\Big|_{s=0,\,t=0} = \frac{\nabla \tilde k}{\partial s}(0, p_0) . \]
From our set up we can choose η(p0) to be any timelike vector and $\frac{\nabla \tilde k}{\partial s}(0, p_0)$ to be any vector. As the linear map from $T_{p_0}M \times T_{p_0}M \to T_{f(t_0,p_0)}M$ which maps the initial conditions V(0), $\frac{\nabla V}{\partial t}(0)$ of a Jacobi field V to its value V(t0) is surjective²⁵, it is an open map. Therefore we can choose η(p0) and $\frac{\nabla \tilde k}{\partial s}(0, p_0)$ so that V(t0) is not in the nowhere dense set $F_{*(t_0,0,p_0)}[T_{(t_0,p_0)}(\mathbb{R} \times N)]$. Thus we can assume F is a local $C^k$ diffeomorphism on some small neighborhood A of (t0, 0, p0) onto a small neighborhood B := F[A] of F(t0, 0, p0) as claimed.

²⁵ If $v \in T_{f(t_0,p_0)}M$ there is a Jacobi field with V(t0) = v and $\frac{\nabla V}{\partial t}(t_0) = 0$, which implies surjectivity.

Consider the vector field $F_*(\partial/\partial t) = \partial F/\partial t$ along F. Then the integral curves of this vector field are the geodesics $t \to F(t, s, p) = \exp(t\,\tilde k(s, p))$. (This is true even when F is not injective on its entire domain.) These geodesics and their velocity vectors depend smoothly on the initial data. In the case at hand the initial data is $C^k$ so ∂F/∂t is a $C^k$ vector field along F. Therefore the one form α defined by α(X) := ⟨X, ∂F/∂t⟩ on the neighborhood B of q0 is $C^k$. The definition of F implies that f(t, p) = F(t, 0, p) and therefore the vector field ∂F/∂t is tangent to f[O] and the null geodesics t → f(t, p) = F(t, 0, p) rule f[O] so that

f[O] is a null hypersurface. Therefore for any vector X tangent to f[O] we have α(X) = ⟨X, ∂F/∂t⟩ = 0. Thus f[O] is an integral submanifold for the distribution {X | α(X) = 0} defined by α. But, as is easily seen by writing out the definitions in local coordinates, an integral submanifold of a $C^k$ distribution is a $C^{k+1}$ submanifold. (Note that in general there is no reason to believe that the distribution defined by α is integrable. However, we have shown directly that f[O] is an integral submanifold of that distribution.) ✷

We shall close this appendix with a calculation, needed in the main body of the paper, concerning Jacobians. Let us start by recalling the definition of the Jacobian needed in our context. Let φ: M → N be a $C^1$ map between Riemannian manifolds, with dim M ≤ dim N. Let n = dim M and let $e_1, \ldots, e_n$ be an orthonormal basis of $T_pM$; then the Jacobian of φ at p is $J(\phi)(p) = \|\phi_{*p} e_1 \wedge \phi_{*p} e_2 \wedge \cdots \wedge \phi_{*p} e_n\|$. When dim M = dim N and both M and N are oriented with $\omega_M$ being the volume form on M, and $\omega_N$ being the volume form on N, then J(φ) can also be described as the positive scalar satisfying $\phi^*(\omega_N) = \pm J(\phi)\,\omega_M$.

Let S be a $C^2$ co-dimension two acausal spacelike submanifold of a smooth spacetime M, and let K be a past directed $C^1$ null vector field along S. Consider the normal exponential map in the direction K, Φ: R × S → M, defined by $\Phi(s, x) = \exp_x sK$. (Φ need not be defined on all of R × S.) Suppose the null geodesic η: s → Φ(s, p) meets a given acausal spacelike hypersurface Σ at η(1). Then there is a neighborhood W of p in S such that each geodesic s → Φ(s, x), x ∈ W, meets Σ, and so determines a $C^1$ map φ: W → Σ, which is the projection into Σ along these geodesics. Let J(φ) denote the Jacobian determinant of φ at p. J(φ) may be computed as follows. Let $\{X_1, X_2, \ldots, X_k\}$ be an orthonormal basis for the tangent space $T_pS$. Then $J(\phi) = \|\phi_{*p} X_1 \wedge \phi_{*p} X_2 \wedge \cdots \wedge \phi_{*p} X_k\|$. Suppose there are no focal points to S along $\eta|_{[0,1]}$.
Then by shrinking W and rescaling K if necessary, Φ: [0, 1] × W → M is a $C^1$ embedded null hypersurface N such that Φ({1} × W) ⊂ Σ. Extend K to be the $C^1$ past directed null vector field $K = \Phi_*(\partial/\partial s)$ on N. Let θ = θ(s) be the null mean curvature of N with respect to −K along η. For completeness let us give a proof of the following, well known result:

Proposition A.5 With θ = θ(s) as described above,

1. If there are no focal points to S along $\eta|_{[0,1]}$, then
\[ J(\phi) = \exp\Big( -\int_0^1 \theta(s)\, ds \Big) . \tag{A.6} \]

2. If η(1) is the first focal point to S along $\eta|_{[0,1]}$, then J(φ) = 0.

Remark A.6 In particular, if N has nonnegative null mean curvature with respect to the future pointing null normal, i.e., if θ ≥ 0, we obtain that J(φ) ≤ 1.

Remark A.7 Recall that θ was only defined when a normalization of K has been chosen. We stress that in (A.6) that normalization is so that K is tangent to an affinely parameterized geodesic, with s being an affine distance along η, and with p corresponding to s = 0 and φ(p) corresponding to s = 1.

Proof. 1. To relate J(φ) to the null mean curvature of N, extend the orthonormal basis $\{X_1, X_2, \ldots, X_k\}$ to Lie parallel vector fields $s \to X_i(s)$, i = 1, . . . , k, along η, $L_K X_i = 0$ along η. Then by a standard computation,
\[ J(\phi) = \|\phi_{*p} X_1 \wedge \phi_{*p} X_2 \wedge \cdots \wedge \phi_{*p} X_k\| = \|X_1(1) \wedge X_2(1) \wedge \cdots \wedge X_k(1)\| = \sqrt{g}\,\Big|_{s=1} , \]
where $g = \det[g_{ij}]$, and $g_{ij} = g_{ij}(s) = \langle X_i(s), X_j(s) \rangle$. We claim that along η,
\[ \theta = -\frac{1}{\sqrt{g}} \frac{d}{ds} \sqrt{g} . \]
The computation is standard. Set $b_{ij} = B(\overline{X}_i, \overline{X}_j)$, where B is the null second fundamental form of N with respect to −K, $h_{ij} = h(\overline{X}_i, \overline{X}_j) = g_{ij}$, and let $g^{ij}$ be the i, j'th entry of the inverse matrix $[g_{ij}]^{-1}$. Then $\theta = g^{ij} b_{ij}$. Differentiating $g_{ij}$ along η we obtain
\[ \frac{d}{ds} g_{ij} = K \langle X_i, X_j \rangle = \langle \nabla_K X_i, X_j \rangle + \langle X_i, \nabla_K X_j \rangle = \langle \nabla_{X_i} K, X_j \rangle + \langle X_i, \nabla_{X_j} K \rangle = -(b_{ij} + b_{ji}) = -2 b_{ij} . \]
Thus,
\[ \theta = g^{ij} b_{ij} = -\frac{1}{2}\, g^{ij} \frac{d}{ds} g_{ij} = -\frac{1}{2} \frac{1}{g} \frac{dg}{ds} = -\frac{1}{\sqrt{g}} \frac{d}{ds} \sqrt{g} , \]
as claimed. Integrating along η from s = 0 to s = 1 we obtain
\[ J(\phi) = \sqrt{g}\,\Big|_{s=1} = \sqrt{g}\,\Big|_{s=0} \cdot \exp\Big( -\int_0^1 \theta\, ds \Big) = \exp\Big( -\int_0^1 \theta\, ds \Big) . \]

2. Suppose now that η(1) is a focal point to S along η, but that there are no focal points to S along η prior to that. Then we can still construct the $C^1$ map Φ: [0, 1] × W → M, with Φ({1} × W) ⊂ Σ, such that Φ is an embedding when restricted to a sufficiently small open set in [0, 1] × W containing [0, 1) × {p}. The vector fields $s \to X_i(s)$, s ∈ [0, 1), i = 1, . . . , k, may be constructed as above, and are Jacobi fields along $\eta|_{[0,1)}$, which extend smoothly to η(1). Since η(1) is a focal point, the vectors $\phi_* X_1 = X_1(1), \ldots, \phi_* X_k = X_k(1)$ must be linearly dependent, which implies that J(φ) = 0. ✷
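Formula (A.6) can be checked numerically in a simple explicit case. The sketch below is an illustration under assumptions that are not part of the text: M is Minkowski space of dimension n + 1, S is the round sphere of radius r0 in {t = 0}, K = c(−∂_t − ∂_r) is the past directed inward null normal with 0 < c < r0, and Σ = {t = −c}. Then φ is the radial map onto the sphere of radius r0 − c, the hypersurface N is part of a future null cone, and its null mean curvature with respect to −K is θ(s) = c(n − 1)/(r0 − cs). All function names are hypothetical.

```python
import math


def theta(s, r0, c, n):
    """Null mean curvature of N with respect to -K at affine parameter s (assumed model)."""
    return c * (n - 1) / (r0 - c * s)


def jacobian_from_theta(r0, c, n, steps=2000):
    """exp(-int_0^1 theta ds), with the integral computed by composite Simpson's rule.

    steps must be even.
    """
    h = 1.0 / steps
    total = theta(0.0, r0, c, n) + theta(1.0, r0, c, n)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * theta(i * h, r0, c, n)
    return math.exp(-total * h / 3.0)


def jacobian_direct(r0, c, n):
    """Jacobian of the radial map between round (n-1)-spheres of radii r0 and r0 - c."""
    return ((r0 - c) / r0) ** (n - 1)


r0, c, n = 2.0, 1.0, 3
j_formula = jacobian_from_theta(r0, c, n)
j_direct = jacobian_direct(r0, c, n)
```

In this model the two values should agree up to quadrature error, and since θ ≥ 0 the Jacobian does not exceed 1, in line with Remark A.6. Letting c approach r0 moves η(1) to the vertex of the cone, which is a focal point, and the direct Jacobian tends to 0 as in part 2 of Proposition A.5.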

B Some comments on the area theorem of Hawking and Ellis In this appendix we wish to discuss the status of our H–regularity condition with respect to the conformal completions considered by Hawking and Ellis [37] in their treatment of the area theorem. For the convenience of the reader let us recall here the setting of [37]. One of the conditions of the Hawking–Ellis area theorem [37, Proposition 9.2.7, p. 318] is that spacetime (M, g) is weakly asymptotically simple and empty (“WASE”, [37, p. 225]). This means that there exists an open set U ⊂ M which is isometric to U ∩ M , where U is a neighborhood of null infinity in an asymptotically simple and empty (ASE) spacetime (M , g ) [37, p. 222]. It is further assumed that M admits a partial Cauchy surface S with respect to which M is future asymptotically predictable ([37], p. 310). This is defined by the requirement that I+ is contained in closure of the future domain of dependence D+ (S; M) of S, where the closure is taken in the conformally completed manifold ¯ ∪ I+ ∪ I− , with both I+ and I− being null hypersurfaces. Next, one says that M (M, g) is strongly future asymptotically predictable ([37], p. 313) if it is future ¯ is contained in D+ (S; M). asymptotically predictable and if J + (S) ∩ J¯− (I+ ; M) Finally ([37], p. 318), (M, g) is said to be a regular predictable space if (M, g) is strongly future asymptotically predictable and if the following three conditions hold: ¯ is homeomorphic to R3 \(an open set with compact closure). (α) S ∩ J¯− (I+ ; M) (β) S is simply connected. (γ) the family of hypersurfaces S(τ ) constructed in [37, Proposition 9.2.3, p. 313] ¯ are has the property that for sufficiently large τ the sets S(τ ) ∩ J¯− (I+ ; M) ¯ contained in J¯+ (I− ; M). It is then asserted in [37, Proposition 9.2.7, p. 318] that the area theorem holds for regular predictable spaces satisfying the null energy condition. Now in the proof of [37, Proposition 9.2.1, p. 
311] (which is one of the results used in the proof of [37, Proposition 9.2.7, p. 318]) Hawking and Ellis write: “This shows that if W is any compact set of S, every generator of $\mathscr{I}^+$ leaves $J^+(W; \bar M)$.” The justification of this given in [37] is wrong²⁶. If one is willing to impose this as a supplementary hypothesis, then this condition can be thought of as the Hawking–Ellis equivalent of our condition of H–regularity of $\mathscr{I}^+$. When such a condition is imposed in addition to the hypothesis of strong asymptotic predictability and weak asymptotic simplicity (“WASE”) of (M, g), then the hypotheses of Proposition 4.1 hold, and the conclusions of our version of the area theorem, Theorem 1.1, apply.

²⁶ In the proof of [37, Proposition 9.2.1, p. 311] it is claimed that “... Then S \ U is compact...”. This statement is incorrect in general, as shown by the example (M, g) = (M′, g′) = ($\mathbb{R}^4$, diag(−1, +1, +1, +1)), S = S′ = {t = 0}, U = U′ = {t ≠ 0}. This example does not show that the claim is wrong, but that the proof is; we do not know whether the claim in Proposition 9.2.1 is correct as stated under the hypothesis of future asymptotic predictability of (M, g) made there. Let us note that the conditions (α)–(γ) do not seem to be used anywhere in the proof of Proposition 9.2.7 as presented in [37], and it is conceivable that the authors of [37] had in mind some use of those conditions in the proof of Proposition 9.2.1. We have not investigated in detail whether or not the assertion made there can be justified if the supplementary hypothesis that (M, g) is a regular predictable space is made, as the approach we advocate in Section 4.1 allows one to avoid the “WASE” framework altogether.

An alternative way to guarantee that the hypotheses of Proposition 4.1 will hold in the “future asymptotically predictable WASE” set–up of [37] (for those sets C which lie to the future of S) is to impose some mild additional conditions on U and S. There are quite a few possibilities; one such set of conditions is as follows. Let ψ: U → U′ ∩ M′ denote the isometry arising in the definition of the WASE spacetime M. First, we require that ψ can be extended by continuity to a continuous map, still denoted by ψ, defined on $\bar U$. Next, suppose there exists a compact set K ⊂ M′ such that
\[ \psi(J^+(S; M) \cap \partial U) \subset J^+(K; M') , \tag{B.1} \]
see Figure 2.

Figure 2: The set $\Omega \equiv J^+(S; M) \cap \partial U$ and its image under ψ.

Let us show that, under the future asymptotically predictable WASE conditions together with (B.1), for every compact set $C \subset J^+(S; M)$ that meets $I^-(\mathscr{I}^+; \bar M)$ there exists a future inextendible (in M) null geodesic $\eta \subset \partial J^+(C; M)$ starting on C and having future end point on $\mathscr{I}^+$. First, we claim that
\[ \psi(J^+(C; M) \cap U) \subset J^+(K \cup C'; M') , \tag{B.2} \]
where $C' = \psi(C \cap \bar U)$. Indeed, let $p \in \psi(J^+(C; M) \cap U)$; then there exists a future directed causal curve γ from C to $\psi^{-1}(p) \in U$. If γ ⊂ U, then $p \in J^+(C'; M')$. If not, then γ exits U when followed from $\psi^{-1}(p)$ to the past at some point in $J^+(S; M) \cap \partial U$, and thus $p \in \psi(J^+(S; M) \cap \partial U) \subset J^+(K; M')$, which establishes (B.2). Since K ∪ C′ is compact, by Lemma 4.5 and Proposition 4.13 in [54], each generator of $\mathscr{I}^+_{M'}$ meets $\partial J^+(K \cup C'; \bar M')$ exactly once. It follows that, under the natural identification of $\mathscr{I}^+_M$ with $\mathscr{I}^+_{M'}$, the criteria for H-regularity discussed in Remark 4.5 are satisfied. Hence, we may apply Proposition 4.8 to obtain the desired null geodesic η.

There exist several other proposals how to modify the WASE conditions of [37] to obtain better control of the space–times at hand [53, 50, 16], but we have not investigated in detail their suitability to the problems considered here.

C Some comments on the area theorems of Królak

Królak has previously extended the definition of a black hole to settings more general, in various ways, than the standard setting considered in Hawking and Ellis [37]. In each of the papers [47, 48, 49] Królak obtains an area theorem, under the implicit assumption of piecewise smoothness. It follows from the results presented here that the area theorems of Królak still hold without the supplementary hypothesis of piecewise smoothness, which can be seen as follows. First, in each of the papers [47, 48, 49] the event horizon $H$ is defined as the boundary of a certain past set, which implies by [37, Prop. 6.3.1 p. 187] that $H$ is an achronal closed embedded $C^{0,1}$ hypersurface. Moreover, by arguments in [47, 48, 49] $H$ is ruled by future inextendible null geodesics and hence, in all the papers [47, 48, 49] $H$ is a future horizon as defined here. Now, because in [48, 47] the null generators of $H$ are assumed to be future complete, one can apply Theorem 1.1 to conclude that the area theorem holds, under the explicit assumptions of [48, 47], for the horizons considered there, with no additional regularity conditions.

On the other hand, in [49] completeness of generators is not assumed; instead a regularity condition on the horizon is imposed. Using the notation of [49], we shall say that a horizon $H_T$ (as defined in [49]) is weakly regular iff for any point $p$ of $H_T$ there is an open neighborhood $U$ of $p$ such that for any compact set $K$ contained in $U \cap \bar W_T$ the set $J^+(K)$ contains an $N_\infty$-TIP from $\mathcal{W}^T$. (The set $W_T$ may be thought of as the region outside of the black hole, while $\mathcal{W}^T$ represents null infinity.) This differs from the definition in [49, p. 370] in that Królak requires the compact set $K$ to be in $U \cap W_T$ rather than in the somewhat larger set $U \cap \bar W_T$.27 Under this slightly modified regularity condition, the arguments of the proof of [49, Theorem 5.2] yield positivity of $\theta_{Al}$: the deformation of the set $T$ needed in that proof in [49] is obtained using our sets $S_{\epsilon,\eta,\delta}$ from Proposition 4.1. Our Theorem 6.1 then implies that area monotonicity holds for the horizons considered in [49], subject to the minor change of the notion of weak regularity discussed above, with no additional regularity conditions.

27 In Królak’s definition $K$ is not allowed to touch the event horizon $H_T$. But then in the proof of [49, Theorem 5.2] when $T$ is deformed, it must be moved completely off of the horizon and into $W_T$. So the deformed $T$ will, in general, have a boundary in $W_T$. The generator $\gamma$ in the proof of [49, Theorem 5.2] may then meet $T$ at a boundary point, which introduces difficulties in the focusing argument used in the proof. The definition used here avoids this problem.

D Proof of Theorem 5.6

For $q \in S_0$ let $\Gamma_q \subset H$ denote the generator of $H$ passing through $q$. Throughout this proof all curves will be parameterized by signed σ–distance from $S_0$, with the distance being negative to the past of $S_0$ and positive to the future. We will need the following Lemma:

Lemma D.1 $S_0$ is a Borel subset of $S$, in particular $S_0$ is $\mathcal{H}^{n-1}_\sigma$ measurable.

Proof. For each $\delta > 0$ let

\[ A_\delta := \{p \in S_0 \mid \text{the domain of definition of } \Gamma_p \text{ contains the interval } [-\delta, \delta]\} , \tag{D.1} \]

\[ B_\delta := \{p \in S_0 \mid \text{the domain of definition of the inextendible geodesic } \gamma_p \text{ containing } \Gamma_p \text{ contains the interval } [-\delta, \delta]\} . \tag{D.2} \]

Lower semi–continuity of existence time of geodesics shows that the $B_\delta$’s are open subsets of $S$. Clearly $A_{\delta'} \subset B_\delta$ for $\delta' \geq \delta$. We claim that for $\delta' \geq \delta$ the sets $A_{\delta'}$ are closed subsets of the $B_\delta$’s. Indeed, let $q_i \in A_{\delta'} \cap B_\delta$ be a sequence such that $q_i \to q_\infty \in B_\delta$. Since the generators of $H$ never leave $H$ to the future, and since $q_\infty \in B_\delta$, it immediately follows that the domain of definition of $\Gamma_{q_\infty}$ contains the interval $[0, \delta]$. Suppose, for contradiction, that $\Gamma_{q_\infty}(s_-)$ is an endpoint on $H$ with $s_- \in (-\delta, 0]$, hence there exists $s' \in (-\delta, 0]$ such that $\gamma_{q_\infty}(s') \in I^-(H)$. As $q_\infty$ is an interior point of $\Gamma_{q_\infty}$, the σ–unit tangents to the $\Gamma_{q_i}$’s at $q_i$ converge to the σ–unit tangent to $\Gamma_{q_\infty}$ at $q_\infty$. Now $I^-(H)$ is open, and continuous dependence of solutions of ODEs upon initial data shows that $\gamma_{q_i}(s') \in I^-(H)$ for $i$ large enough, contradicting the fact that $q_i \in A_{\delta'}$ with $\delta' \geq \delta$. It follows that $q_\infty \in A_{\delta'}$, and that $A_{\delta'}$ is closed in $B_\delta$. But a closed subset of an open set is a Borel set, hence $A_{\delta'}$ is Borel in $S$. Clearly $S_0 = \cup_i A_{1/i}$, which implies that $S_0$ is a Borel subset of $S$. The $\mathcal{H}^{n-1}_\sigma$–measurability of $S_0$ follows now from [25, p. 293] or [19, p. 147]12. ✷

Returning to the proof of Theorem 5.6, set

\[ \Gamma_q^+ = \Gamma_q \cap J^+(q) , \quad H_{\mathrm{sing}} = H \setminus H_{Al} , \quad \Omega = \cup_{q \in S_0} \Gamma_q^+ , \quad \Omega_{\mathrm{sing}} = \Omega \cap H_{\mathrm{sing}} . \]


By definition we have $\Omega_{\mathrm{sing}} \subset H_{\mathrm{sing}}$ and completeness of the Hausdorff measure28 together with $\mathcal{H}^n_\sigma(H_{\mathrm{sing}}) = 0$ implies that $\Omega_{\mathrm{sing}}$ is n–Hausdorff measurable, with

\[ \mathcal{H}^n_\sigma(\Omega_{\mathrm{sing}}) = 0 . \tag{D.3} \]

Let $\phi : \Omega \to S_0$ be the map which to a point $p \in \Gamma_q^+$ assigns $q \in S_0$. The arguments of the proofs of Lemmata 6.9 and 6.11 show that $\phi$ is locally Lipschitz. This, together with Lemma D.1, allows us to use the co–area formula [23, Theorem 3.1] to infer from (D.3) that

\[ 0 = \int_{\Omega_{\mathrm{sing}}} J(\phi)\, d\mathcal{H}^n_\sigma = \int_{S_0} \mathcal{H}^1_\sigma(\Omega_{\mathrm{sing}} \cap \phi^{-1}(q))\, d\mathcal{H}^{n-1}_\sigma(q) , \]

where $J(\phi)$ is the Jacobian of $\phi$, cf. [23, p. 423]. Hence

\[ \mathcal{H}^1_\sigma(\Omega_{\mathrm{sing}} \cap \phi^{-1}(q)) = 0 \tag{D.4} \]

for almost all $q$’s in $S_0$. A chase through the definitions shows that (D.4) is equivalent to

\[ \mathcal{H}^1_\sigma(\Gamma_q^+ \setminus H_{Al}) = 0 , \tag{D.5} \]

for almost all $q$’s in $S_0$. Clearly the set of Alexandrov points of $H$ is dense in $\Gamma_q^+$ when (D.5) holds. Theorem 5.1 shows, for such $q$’s, that all interior points of $\Gamma_q$ are Alexandrov points of $H$, hence points $q$ satisfying (D.5) are in $S_1$. It follows that $S_1$ is $\mathcal{H}^{n-1}_\sigma$ measurable, and has full (n − 1)–Hausdorff measure in $S_0$. This establishes our claim about $S_1$. The claim about $S_2$ follows now from the inclusion $S_1 \subset S_2$. ✷

28 A measure is complete iff all sets of outer measure zero are measurable. Hausdorff measure is constructed from an outer measure using Carathéodory’s definition of measurable sets [24, p. 54]. All such measures are complete [24, Theorem 2.1.3 pp. 54–55].

E Proof of Proposition 6.6

Because of the identities

\[ f(p) + \langle x - p, a_p \rangle - \frac{C}{2}\|x - p\|^2 + \frac{C}{2}\|x\|^2 = f(p) + \frac{C}{2}\|p\|^2 + \langle x - p, a_p + Cp \rangle , \]

and

\[ f(q) + \langle x - q, a_q \rangle + \frac{C}{2}\|x - q\|^2 + \frac{C}{2}\|x\|^2 = f(q) + \frac{C}{2}\|q\|^2 + \langle x - q, a_q + Cq \rangle + C\|x - q\|^2 , \]

we can replace $f$ by $x \mapsto f(x) + C\|x\|^2/2$ and $a_p$ by $a_p + Cp$ and assume that for all $p, x \in A$ we have

\[ f(p) + \langle x - p, a_p \rangle \leq f(x) \leq f(p) + \langle x - p, a_p \rangle + C\|x - p\|^2 , \tag{E.1} \]

and for all $p, q \in A$ and $x \in \mathbb{R}^n$

\[ f(p) + \langle x - p, a_p \rangle \leq f(q) + \langle x - q, a_q \rangle + C\|x - q\|^2 . \tag{E.2} \]

These inequalities can be given a geometric form that is easier to work with. Let $P := \{(x, y) \in \mathbb{R}^n \times \mathbb{R} : y > C\|x\|^2\}$. Then $P$ is an open convex solid paraboloid of $\mathbb{R}^{n+1}$. We will denote the closure of $P$ by $\bar P$. From the identity

\[ f(q) + \langle x - q, a_q \rangle + C\|x - q\|^2 = f(q) - (4C)^{-1}\|a_q\|^2 + C\|x - q + (2C)^{-1} a_q\|^2 \]

it follows that the solid open paraboloids $\{(x, y) \in \mathbb{R}^n \times \mathbb{R} : y > f(q) + \langle x - q, a_q \rangle + C\|x - q\|^2\}$ are translates in $\mathbb{R}^{n+1}$ of $P$. Let $G[f] := \{(x, y) \in \mathbb{R}^n \times \mathbb{R} : x \in A, y = f(x)\}$ be the graph of $f$. The inequalities (E.1) and (E.2) imply that for each $p \in A$ there is an affine hyperplane $H_p = \{(x, y) \in \mathbb{R}^n \times \mathbb{R} : y = f(p) + \langle x - p, a_p \rangle\}$ of $\mathbb{R}^{n+1}$ and a vector $b_p \in \mathbb{R}^{n+1}$ so that

\[ (p, f(p)) \in H_p , \tag{E.3} \]

\[ (P + b_p) \cap G[f] = \emptyset \ \text{ but } \ (p, f(p)) \in b_p + \bar P , \tag{E.4} \]

and for all $p, q \in A$

\[ H_p \cap (b_q + P) = \emptyset . \tag{E.5} \]
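The completing-the-square step behind the claim that these paraboloids are translates of $P$ can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names, the random sampling ranges, and the dimension are illustrative assumptions. It compares $f(q) + \langle x - q, a_q \rangle + C\|x - q\|^2$ with $f(q) - \|a_q\|^2/(4C) + C\|x - q + a_q/(2C)\|^2$, i.e. with the vertex of the paraboloid placed at $q - a_q/(2C)$.

```python
import random


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def expanded_form(x, q, a, fq, C):
    """f(q) + <x - q, a> + C * |x - q|^2."""
    d = [xi - qi for xi, qi in zip(x, q)]
    return fq + dot(d, a) + C * dot(d, d)


def vertex_form(x, q, a, fq, C):
    """f(q) - |a|^2 / (4C) + C * |x - q + a / (2C)|^2."""
    s = [xi - qi + ai / (2.0 * C) for xi, qi, ai in zip(x, q, a)]
    return fq - dot(a, a) / (4.0 * C) + C * dot(s, s)


def max_identity_gap(trials=1000, n=3, seed=0):
    """Largest absolute difference between the two forms over random samples.

    The sampling ranges below are arbitrary choices made for this sketch.
    """
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        q = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        fq = rng.uniform(-1.0, 1.0)
        C = rng.uniform(0.5, 2.0)
        gap = abs(expanded_form(x, q, a, fq, C) - vertex_form(x, q, a, fq, C))
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    print("largest gap:", max_identity_gap())
```

The gap should be at the level of floating-point rounding, which is consistent with each such set being $P$ translated by the vector $(q - a_q/(2C),\, f(q) - \|a_q\|^2/(4C))$.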

As the paraboloids open up, this last condition implies that each $b_q + P$ lies above all the hyperplanes $H_p$. Let

\[ Q := \text{Convex Hull}\Big( \bigcup_{p \in A} (b_p + P) \Big) . \]

Because $P$ is convex, if $\alpha_1, \ldots, \alpha_m \geq 0$ satisfy $\sum_{i=1}^m \alpha_i = 1$ then for any $v_1, \ldots, v_m \in \mathbb{R}^{n+1}$

\[ \alpha_1 (v_1 + P) + \alpha_2 (v_2 + P) + \cdots + \alpha_m (v_m + P) = (\alpha_1 v_1 + \cdots + \alpha_m v_m) + P . \]

Therefore if $B := \text{Convex Hull}\{b_p : p \in A\}$ then

\[ Q = \bigcup_{v \in B} (v + P) \]

so that $Q$ is a union of translates of $P$. Thus $Q$ is open. Because $P$ is open we have that if $\lim_{\ell \to \infty} v_\ell = v$ then $v + P \subseteq \bigcup_\ell (v_\ell + P)$. So if $\bar B$ is the closure of $B$ in $\mathbb{R}^{n+1}$ then we also have

\[ Q = \bigcup_{v \in \bar B} (v + P) . \tag{E.6} \]


For each $p$ the open half space above $H_p$ is an open convex set and $Q$ is the convex hull of a subset of this half space. Therefore $Q$ is contained in this half space. Therefore $Q \cap H_p = \emptyset$ for all $p$.

We now claim that for each point $z \in \partial Q$ there is a supporting paraboloid for $z$ in the sense that there is a vector $v \in \bar B$ with $v + P \subset Q$ and $z \in v + \bar P$. To see this note that as $z \in \partial Q$ there is a sequence $\{b_\ell\}_{\ell=1}^\infty \subset B$ and $\{w_\ell\}_{\ell=1}^\infty \subset P$ so that $\lim_{\ell \to \infty} (b_\ell + w_\ell) = z$. Fixing a $p_0 \in A$ and using that all the sets $b_\ell + P$ are above the hyperplane $H_{p_0}$ we see that both the sequences $b_\ell$ and $w_\ell$ are bounded subsets of $\mathbb{R}^{n+1}$ and by going to a subsequence we can assume that $v := \lim_{\ell \to \infty} b_\ell$ and $w := \lim_{\ell \to \infty} w_\ell$ exist. Then $z = v + w$, $v \in \bar B$ and $w \in \bar P$. Then (E.6) implies $v + P \subset Q$ and $w \in \bar P$ implies $z \in v + \bar P$. Thus we have the desired supporting paraboloid.

We also claim that the graph $G[f]$ satisfies $G[f] \subset \partial Q$. This is because for $p \in A$ the point $(p, f(p)) \in H_p$ and $H_p$ is disjoint from $Q$. Thus $(p, f(p)) \notin Q$. But from (E.4) $(p, f(p)) \in b_p + \bar P$ and as $b_p + P \subset Q$ this implies $(p, f(p)) \in \bar Q$. Therefore $(p, f(p)) \in \partial Q$ as claimed.

Let $F : \mathbb{R}^n \to \mathbb{R}$ be the function that defines $\partial Q$. Explicitly $F(x) = \inf\{y : (x, y) \in Q\}$. This is a function defined on all of $\mathbb{R}^n$ and $G[f] \subset \partial Q$ implies that this function extends $f$. For any $x_0 \in \mathbb{R}^n$ the convexity of $Q$ implies there is a supporting hyperplane $H$ for $Q$ at its boundary point $(x_0, F(x_0))$ and we have seen there also is a supporting paraboloid $v + P$ for $\partial Q = G[F]$ at $(x_0, F(x_0))$. Expressing $H$ and $\partial(v + P)$ as graphs over $\mathbb{R}^n$ these geometric facts yield that there is a vector $a_{x_0}$ so that the inequalities

\[ F(x_0) + \langle x - x_0, a_{x_0} \rangle \leq F(x) \leq F(x_0) + \langle x - x_0, a_{x_0} \rangle + C\|x - x_0\|^2 \]

hold for all $x \in \mathbb{R}^n$. Because the function $F$ is defined on all of $\mathbb{R}^n$ (rather than just the subset $A$), it follows from [9, Prop. 1.1 p. 7] that $F$ is of class $C^{1,1}$. ✷

Acknowledgments

P.T.C. acknowledges useful discussions with or comments from Guy Barles, Piotr Hajłasz, Tom Ilmanen, Bernd Kirchheim, Andrzej Królak, Olivier Ley and Laurent Véron. R.H. benefited from discussions and/or correspondence with Joe Fu, Sergei Konyagin, and Ron DeVore. We are grateful to Lars Andersson for bibliographical advice. E.D. is grateful to the KTH, Stockholm for hospitality during the final stage of work on this project.

References

[1] L. Andersson, G.J. Galloway, and R. Howard, The cosmological time function, Class. Quantum Grav. 15 (1998), 309–322, gr-qc/9709084.
[2] L. Andersson, G.J. Galloway, and R. Howard, A strong maximum principle for weak solutions of quasi-linear elliptic equations with applications to Lorentzian and Riemannian geometry, Comm. Pure Appl. Math. 51 (1998), 581–624.
[3] L. Andersson and V. Moncrief, The global existence problem in general relativity, 1999, Proceedings of the Besse Seminar on Lorentzian Geometry, Nancy, June 1998.
[4] F. Antonacci and P. Piccione, A Fermat principle on Lorentzian manifolds and applications, Appl. Math. Lett. 9 (1996), 91–95.
[5] J.K. Beem, P.E. Ehrlich, and K.L. Easley, Global Lorentzian geometry, 2 ed., Pure and Applied Mathematics, vol. 202, Marcel Dekker, New York, 1996.
[6] J.K. Beem and A. Królak, Cauchy horizon endpoints and differentiability, Jour. Math. Phys. 39 (1998), 6001–6010, gr-qc/9709046.
[7] D. Brill, J. Louko, and P. Peldan, Thermodynamics of (3+1)-dimensional black holes with toroidal or higher genus horizons, Phys. Rev. D56 (1997), 3600–3610, gr-qc/9705012.
[8] R. Budzyński, W. Kondracki, and A. Królak, On the differentiability of Cauchy horizons, Jour. Math. Phys. 40 (1999), 5138–5142.
[9] L.A. Caffarelli and X. Cabré, Fully nonlinear elliptic equations, Colloquium Publications, vol. 43, AMS, Providence, RI, 1995.
[10] B. Carter, Killing horizons and orthogonally transitive groups in space–time, Jour. Math. Phys. 10 (1969), 70–81.
[11] Y. Choquet-Bruhat and J. York, The Cauchy problem, General Relativity (A. Held, ed.), Plenum Press, New York, 1980.
[12] P.T. Chruściel, Uniqueness of black holes revisited, (N. Straumann, Ph. Jetzer, and G. Lavrelashvili, eds.), vol. 69, Helv. Phys. Acta, May 1996, Proceedings of Journées Relativistes 1996, gr-qc/9610010, pp. 529–552.
[13] P.T. Chruściel, On rigidity of analytic black holes, Commun. Math. Phys. 189 (1997), 1–7, gr-qc/9610011.
[14] P.T. Chruściel, A remark on differentiability of Cauchy horizons, Class. Quantum Grav. (1998), 3845–3848, gr-qc/9807059.


[15] P.T. Chruściel and G.J. Galloway, Horizons non–differentiable on dense sets, Commun. Math. Phys. 193 (1998), 449–470, gr-qc/9611032.
[16] C.J.S. Clarke and F. de Felice, Globally noncausal space-times, Jour. Phys. A15 (1982), 2415–2417.
[17] F.H. Clarke, Optimization and nonsmooth analysis, second ed., Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1990.
[18] A. Dold, Lectures on algebraic topology, Springer, Berlin, Heidelberg and New York, 1972.
[19] G.A. Edgar, Measure, topology, and fractal geometry, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1990.
[20] I. Ekeland and R. Temam, Convex analysis and variational problems, Studies in Math. and its Appl., vol. 1, North Holland, Amsterdam, 1976.
[21] J.-H. Eschenburg, Comparison theorems and hypersurfaces, Manuscripta Math. 59 (1987), no. 3, 295–323.
[22] L.C. Evans and R.F. Gariepy, Measure theory and fine properties of functions, CRC Press, Boca Raton, FL, 1992.
[23] H. Federer, Curvature measures, Trans. Amer. Math. Soc. 93 (1959), 418–491.
[24] H. Federer, Geometric measure theory, Springer Verlag, New York, 1969, (Die Grundlehren der mathematischen Wissenschaften, Vol. 153).
[25] H. Federer, Colloquium lectures on geometric measure theory, Bull. Amer. Math. Soc. 84 (1978), no. 3, 291–338.
[26] W.H. Fleming and H. Mete Soner, Controlled Markov processes and viscosity solutions, Applications of Mathematics, vol. 25, Springer–Verlag, New York, Heidelberg, 1993.
[27] H. Friedrich, I. Rácz, and R.M. Wald, On the rigidity theorem for spacetimes with a stationary event horizon or a compact Cauchy horizon, Commun. Math. Phys. 204 (1999), 691–707, gr-qc/9811021.
[28] G.J. Galloway, A “finite infinity” version of the FSW topological censorship, Class. Quantum Grav. 13 (1996), 1471–1478.
[29] G.J. Galloway, Maximum principles for null hypersurfaces and null splitting theorems, Stockholm preprint, 1999.
[30] G.J. Galloway, K. Schleich, D.M. Witt, and E. Woolgar, Topological censorship and higher genus black holes, Phys. Rev. D60, 104039 (1999).

[31] G.J. Galloway and E. Woolgar, The cosmic censor forbids naked topology, Class. Quantum Grav. 14 (1996), L1–L7, gr-qc/9609007.
[32] R. Geroch, Domain of dependence, Jour. Math. Phys. 11 (1970), 437–449.
[33] F. Giannoni, A. Masiello, and P. Piccione, A variational theory for light rays in stably causal Lorentzian manifolds: regularity and multiplicity results, Commun. Math. Phys. 187 (1997), 375–415.
[34] G.W. Gibbons and S.W. Hawking, Cosmological event horizons, thermodynamics, and particle creation, Phys. Rev. D15 (1977), 2738–2751.
[35] D. Giulini, Is there a general area theorem for black holes?, Jour. Math. Phys. 39 (1998), 6603–6606.
[36] P.R. Halmos, Measure Theory, D. Van Nostrand Company, Inc., New York, N.Y., 1950.
[37] S.W. Hawking and G.F.R. Ellis, The large scale structure of space-time, Cambridge University Press, Cambridge, 1973.
[38] M. Heusler, Black hole uniqueness theorems, Cambridge University Press, Cambridge, 1996.
[39] E. Hewitt and K. Stromberg, Real and abstract analysis, Springer-Verlag, New York, 1975, Graduate Texts in Mathematics, No. 25.
[40] L. Hörmander, The analysis of linear partial differential operators, Springer, Berlin, 1985.
[41] R. Howard, The kinematic formula in Riemannian homogeneous spaces, Mem. Amer. Math. Soc. 106 (1993), no. 509, vi+69.
[42] R. Howard, The boundary structure of sets satisfying a locally uniform inner ball condition, preprint, 1999.
[43] S.A. Hughes, C.R. Keeton, S.T. Shapiro, S.A. Teukolsky, P. Walker, and K. Walsh, Finding black holes in numerical spacetimes, Phys. Rev. D49 (1994), 4004–4015.
[44] S. Husa and J. Winicour, The asymmetric merger of black holes, Phys. Rev. D60 (1999), 084019 (13 pp.), gr-qc/9905039.
[45] J. Isenberg and V. Moncrief, Symmetries of cosmological Cauchy horizons with exceptional orbits, Jour. Math. Phys. 26 (1985), 1024–1027.
[46] F. Kottler, Über die physikalischen Grundlagen der Einsteinschen Gravitationstheorie, Annalen der Physik 56 (1918), 401–462.


[47] A. Królak, Definitions of black holes without use of the boundary at infinity, Gen. Rel. Grav. 14 (1982), 793–801.
[48] A. Królak, Black holes and the strong cosmic censorship, Gen. Rel. Grav. 16 (1984), 121–130.
[49] A. Królak, Black holes and the weak cosmic censorship, Gen. Rel. Grav. 16 (1984), 365–373.
[50] A. Królak, The existence of regular partially future asymptotically predictable space-times, Jour. Math. Phys. 29 (1988), 1786–1788.
[51] D.N. Kupeli, On null submanifolds in spacetimes, Geom. Dedicata 23 (1987), 33–51.
[52] F. Morgan, Geometric measure theory, a beginner’s guide, Academic Press, San Diego, 1995.
[53] R.P.A.C. Newman, Cosmic censorship and curvature growth, Gen. Rel. Grav. 15 (1983), 641–353.
[54] R.P.A.C. Newman, The global structure of simple space-times, Commun. Math. Phys. 123 (1989), 17–52.
[55] B. O’Neill, Semi–Riemannian geometry, Academic Press, New York, 1983.
[56] R. Penrose, Techniques of differential topology in relativity, SIAM, Philadelphia, 1972, (Regional Conf. Series in Appl. Math., vol. 7).
[57] V. Perlick, On Fermat’s principle in general relativity: I. The general case, Class. Quantum Grav. 7 (1990), 1319–1331.
[58] A. Rendall, Local and global existence theorems for the Einstein equations, Living Reviews in Relativity 1 (1998), URL http://www.livingreviews.org.
[59] H.J. Seifert, Smoothing and extending cosmic time functions, Gen. Rel. Grav. 8 (1977), 815–831.
[60] S.T. Shapiro, S.A. Teukolsky, and J. Winicour, Toroidal black holes and topological censorship, Phys. Rev. D 52 (1995), 6982–6987.
[61] L. Simon, Lectures on geometric measure theory, Proceedings of the CMA, vol. 3, Australian University Press, Canberra, 1983.
[62] E.M. Stein, Singular integrals and differentiability properties of functions, Princeton University Press, Princeton, N.J., 1970, Princeton Mathematical Series, No. 30.
[63] C.V. Vishveshwara, Generalization of the “Schwarzschild surface” to arbitrary static and stationary metrics, Jour. Math. Phys. 9 (1968), 1319–1322.
[64] R.M. Wald, General relativity, University of Chicago Press, Chicago, 1984.

Piotr T. Chruściel*, Erwann Delay**
Département de Mathématiques
Faculté des Sciences
Parc de Grandmont
F-37200 Tours, France
e-mail: [email protected]
e-mail: [email protected]
* Supported in part by KBN grant # 2 P03B 130 16.
** Supported in part by the EU TMR project Stochastic Analysis and its Application, ERB-FMRX-CT96-0075.

Gregory J. Galloway
Department of Mathematics and Computer Science
University of Miami
Coral Gables, FL 33124, USA
e-mail: [email protected]
Supported in part by NSF grant # DMS-9803566.

Ralph Howard
Department of Mathematics
University of South Carolina
Columbia, SC 29208, USA
e-mail: [email protected]
Supported in part by DoD Grant # N00014-97-1-0806.

Communicated by Sergiu Klainerman
submitted 23/12/99, accepted 16/08/00




The Maxwell-Lorentz System of a Rigid Charge

Gernot Bauer, Detlef Dürr

Abstract. We prove global existence and uniqueness of classical solutions for the Maxwell-Lorentz system of a nonrotating rigid charge distribution, i.e. the relativistic dynamics of a nonrotating extended electron, which is subject to its own electromagnetic fields and an external potential. Local existence and uniqueness is achieved via the contraction mapping principle. Suitable a-priori-bounds yield global existence. We show that in case of a negative bare mass and an attracting external potential the stationary solution is unstable. We believe that this result clarifies the origin of the so-called “runaway”-solutions, which appear when the limit to a point charge is taken, formally described by the so-called Lorentz-Dirac equation for the radiating electron.

1 Introduction

In this paper, we study the following system of differential equations:

\[
\begin{aligned}
\dot q(t) &= \frac{\alpha p(t)}{\sqrt{1 + p(t)^2}} , \\
\dot p(t) &= -\kappa q(t) + \int \rho\big(x - q(t)\big) \Big[ E(x, t) + \frac{\alpha p(t)}{\sqrt{1 + p(t)^2}} \times B(x, t) \Big] dx , \\
\dot E(t, x) &= \mathrm{curl}\, B(t, x) - 4\pi \frac{\alpha p(t)}{\sqrt{1 + p(t)^2}}\, \rho\big(x - q(t)\big) , \\
\dot B(t, x) &= -\mathrm{curl}\, E(t, x) , \\
\mathrm{div}\, E(t, x) &= 4\pi \rho\big(x - q(t)\big) , \qquad \mathrm{div}\, B(t, x) = 0 .
\end{aligned}
\tag{1}
\]

This system describes the motion of a charge distribution $\rho$, which is subject to the electric field $E$ and the magnetic field $B$. The charge distribution is rigidly carried along with a mass point, whose space coordinate $q$ is located in its center. The motion of $q$ is governed by the Lorentz force law and therefore by the coupling of $\rho$ to the electromagnetic fields. On the other hand, the dynamics of the fields is determined by Maxwell’s equations, where $\rho$ and the corresponding current $\dot q \rho$

appear as sources. Therefore the system describes the self-interaction of a charge distribution with its own radiation field. Additionally, the charge distribution is subject to an external potential, which is assumed to be quadratic∗ with curvature $\kappa$. In our units the absolute value of the charge center’s mass $m$ as well as the velocity of light $c$ are equal to one. However, we allow the sign $\alpha = \mathrm{sgn}(m)$ to be negative, i.e. $\alpha$ may take the values $\alpha = \pm 1$. We will refer to (1) as the Maxwell-Lorentz system of a rigid charge distribution. We do not consider the generalization of the Maxwell-Lorentz system to a spinning extended charge [1]. With rotational degrees of freedom included in the equations of motion our analysis becomes more complicated [2]. There is a huge literature on the subject of a charge distribution interacting with the electromagnetic fields generated by itself, especially in connection with the limit to a point charge. However, despite its physical relevance, no mathematically rigorous results in connection with the Maxwell-Lorentz system existed. This situation changed rather recently, and now many results are available. We mention the early work [3] on the linearized problem, and the more recent [4], where the long time behavior of solutions of the Maxwell-Lorentz equations is considered, [5] for soliton-like solutions, [6] for conservation laws and [2], a very recent publication on the motion of a rotating extended charge, which also contains an exhaustive list of references on the subject. We prove global existence and uniqueness of classical solutions for the above Maxwell-Lorentz system. In addition to our existence result we prove that the stationary solution of the Maxwell-Lorentz system is unstable if $\alpha = -1$ and $\kappa > 0$, i.e. if the charge center’s mass is negative and the external potential is attracting. This result is interesting for the following reason.
∗ See however Remark 2 below for more general potentials.

Physicists have always been tempted to derive from the Maxwell-Lorentz system an equation of motion for a point charge. There are various reasons that make a point charge theory preferable. One reason is that an equation of motion for a point charge would be more fundamental. It would not depend on the arbitrariness of the charge distribution’s size and shape. Moreover, a point charge theory would be “naturally” relativistically invariant, and thus it could serve as a fundamental model for a charged elementary particle, e.g. an electron. But the electromagnetic fields generated by a point charge are singular at the location of the charge. Therefore the Lorentz force is no longer well-defined: in case of a point charge the fields have to be evaluated at the charge’s location, i.e. just where they have their singularity. Consequently, there is no Maxwell-Lorentz electrodynamics for a point charge. Dirac now invented a limiting procedure for (1), in which a sequence $\rho_n$ of charge distributions is considered, each having the fixed total charge $Q$ and a radius $r_n$ such that $r_n \to 0$ as $n \to \infty$. This leads formally to a point charge theory, and to overcome the above problem Dirac proposed that in the limiting procedure the particle’s bare mass $m$ should depend on $n$, too [7]. (In fact, his proposal was the first example of a so-called mass renormalization, which later became an essential tool in quantum field theory.) He showed that if $m_n \to -\infty$ as $n \to \infty$ in such a way that the sum of $m_n$ and the particle’s electromagnetic mass (i.e. the field energy of the electromagnetic fields it generates) remains finite, a mathematically well-defined equation for a point charge emerges in the limit $n \to \infty$. This equation however, now called the Lorentz-Dirac equation, is of third order in the time derivative and allows pathological solutions, the famous “runaway”-solutions, where the charge is accelerated to the speed of light without (or even with an attracting) external force [8, 9]. Since Dirac’s work there has been much speculation on the physical origin of these solutions. The discussion has often been obscured by the employment of the ad hoc renormalization procedure from which the Lorentz-Dirac equation emerges. Our result now sheds light on the origin of the “runaway”-solutions as they are due to the particle’s bare mass being negative. We prove that “runaway”-like behavior, i.e. instability of the stationary solution, already occurs without taking the limit of the mass renormalization, namely as soon as $m$ in the Maxwell-Lorentz equations is negative, while in the case of positive mass and confining potential the motion of the charge is always bounded. It was in fact one motivation for our existence theory to have a mathematical basis for achieving such a result.

Recently there has been renewed interest in the physical relevance of the Lorentz-Dirac equation, whose stable solutions (living on a center manifold) can be shown to be governed by an effective second order equation [10].

The paper is organized as follows. In section 2 we fix our notation and reformulate the Maxwell-Lorentz equations as an evolution equation in a Hilbert space. In section 3 we prove existence and uniqueness of local solutions via the contraction mapping principle. By means of appropriate a-priori-bounds we establish global existence in section 4.
In the final section 5 we consider the case that the charge center’s bare mass m is negative and the external potential is attracting, i.e. α = −1 and κ > 0. We prove that in this case the stationary solution of the above system is unstable (in the sense of Lyapunov).
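The role of the sign $\alpha$ can already be seen in a drastically simplified toy model. The sketch below is not the system studied in this paper: it assumes one space dimension and sets the fields $E$ and $B$ to zero, so that only $\dot q = \alpha p/\sqrt{1+p^2}$, $\dot p = -\kappa q$ remain. The function names, the step size, and the initial data are illustrative assumptions. In this reduced model the quantity $\alpha\sqrt{1+p^2} + \kappa q^2/2$ is conserved, which confines the motion for $\alpha = +1$, $\kappa > 0$, while for $\alpha = -1$ the rest point $q = p = 0$ is a saddle.

```python
import math


def rhs(q, p, alpha, kappa):
    """Right-hand side of the toy model: particle equations with E = B = 0.

    This drops the self-field entirely; it is an assumption of the sketch,
    not the Maxwell-Lorentz system itself.
    """
    return alpha * p / math.sqrt(1.0 + p * p), -kappa * q


def integrate(alpha, kappa, q0, p0, t_end, dt=1e-3):
    """Classical fourth-order Runge-Kutta; returns the list of q values."""
    q, p = q0, p0
    qs = [q]
    for _ in range(int(round(t_end / dt))):
        k1q, k1p = rhs(q, p, alpha, kappa)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p, alpha, kappa)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p, alpha, kappa)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p, alpha, kappa)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        qs.append(q)
    return qs


if __name__ == "__main__":
    for alpha in (+1, -1):
        qs = integrate(alpha, kappa=1.0, q0=0.01, p0=0.0, t_end=10.0)
        print("alpha =", alpha, " max |q| =", max(abs(q) for q in qs))
```

With a small initial displacement the $\alpha = +1$ orbit stays near the origin, whereas the $\alpha = -1$ orbit moves away from it. This only illustrates the mechanism; the instability statement proved in section 5 concerns the full coupled system including the fields.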

2 Notational and technical preliminaries

Throughout this paper we assume the charge distribution $\rho$ to be smooth and of compact support∗, i.e. $\rho \in C_0^\infty(\mathbb{R}^3)$. For $q \in \mathbb{R}^3$ we write $\rho_q = \rho(\cdot - q)$. By $L^2$ we denote the Hilbert space $L^2(\mathbb{R}^3)$ of real-valued, measurable, square-integrable functions on $\mathbb{R}^3$. Then $(L^2)^3$ is the real Hilbert space of 3-tuples of functions from $L^2$. We write $\langle f, g \rangle$ and $x \cdot y$ for the canonical scalar products of two functions $f, g \in (L^2)^3$ and two vectors $x, y \in \mathbb{R}^3$, respectively. Both the norms in $L^2$ and $(L^2)^3$ will be denoted by $\|\cdot\|$, whereas both the norms in $\mathbb{R}$ and $\mathbb{R}^3$ will be denoted by $|\cdot|$. We write $x_j$ ($j = 1, 2, 3$) for the j-th component of a vector $x \in \mathbb{R}^3$ with respect to the canonical basis vectors of $\mathbb{R}^3$. We treat the Maxwell-Lorentz equations as an evolution equation in the following Hilbert space $H$:

\[ H = \mathbb{R}^3 \oplus \mathbb{R}^3 \oplus (L^2)^3 \oplus (L^2)^3 . \]

∗ See however Remark 2 below for more general $\rho$.
H is equipped with the scalar product . , . H and the corresponding norm · H ,

where . , . H is defined by

ϕ, ϕ H = q · q + p · p + E, E + B, B for ϕ, ϕ ∈ H, ϕ = (q, p, E, B), ϕ = (q , p , E , B ). An element ϕ = (q, p, E, B) of H, interpreted as a physical state vector, consists of the charge center’s location q and momentum p as well as the electromagnetic fields E and B. Having introduced the underlying state space H, we rewrite the first four lines in (1) in the compact form ϕ(t) ˙ = Aϕ(t) + J ϕ(t) . (2) Here, ϕ(·) : t → ϕ(t) = q(t), p(t), E(t), B(t) is a H-valued map on IR that represents the evolution of the physical state in time. The linear operator A and the nonlinear operator J are formally given by Aϕ = 0, 0, curlB, −curlE (3) and J(ϕ) =

αp

, −κq + 1 + p2

−4παp ρq (x) E(x) + ×B(x) dx, ρq , 0 1 + p2 1 + p2 (4) αp

for ϕ = (q, p, E, B). We first analyze the operator A. Let 3 3 W curl = E ∈ (L2 ) curlE ∈ (L2 ) , where the partial derivatives of the curl are meant in the sense of distributions. We define the domain of A by D(A) = IR3 ⊕ IR3 ⊕ W curl ⊕ W curl and for each n ∈ IN, n ≥ 2, the domain of the n-th power An of A by D(An ) = ϕ ∈ D(A) Aj ϕ ∈ D(A), j = 1, . . . , n − 1 . For our purposes it will be essential that A generates a strongly continuous group of unitary operators on H. This is established in the following lemma.


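Lemma 1 There is a strongly continuous family $\{U_t\}_{t \in \mathbb{R}}$ of unitary operators $U_t : H \to H$ such that

\[ U_t \varphi \in D(A) , \qquad \lim_{h \to 0} \frac{U_h \varphi - \varphi}{h} = A\varphi \qquad \text{and} \qquad A U_t \varphi = U_t A \varphi \]

for each $\varphi \in D(A)$ and $t \in \mathbb{R}$.

Proof. A standard functional analytical argument, using Fourier transform and self-adjointness, see e.g. [11].

We now come to analyze the operator $J$ in (2), formally given by (4). The following lemma states that we may choose $D(A)$ as domain of $J$ and provides us with estimates on $J$.

Lemma 2 The nonlinear operator $J$, cf. (4), has the following properties:

(i) $J$ maps $D(A)$ into $\bigcap_{n=0}^{\infty} D(A^n)$.

(ii) For each $n \in \mathbb{N}$, $n \geq 0$, it holds

\[
\begin{aligned}
A^{2n+1} J(\varphi) &= \Big( 0, 0, 0, (-1)^n \frac{4\pi\alpha}{\sqrt{1 + p^2}}\, \mathrm{curl}^{2n+1}(p\, \rho_q) \Big) , \\
A^{2n+2} J(\varphi) &= \Big( 0, 0, (-1)^n \frac{4\pi\alpha}{\sqrt{1 + p^2}}\, \mathrm{curl}^{2n+2}(p\, \rho_q), 0 \Big) .
\end{aligned}
\tag{5}
\]

(iii) There are constants $K_n > 0$ such that

\[
\begin{aligned}
\| A^n J(\varphi) \|_H &\leq K_n , \\
\| A^n \big( J(\varphi) - J(\tilde\varphi) \big) \|_H &\leq K_n \big( 1 + \|\varphi\|_H \big) \|\varphi - \tilde\varphi\|_H
\end{aligned}
\]

for all $\varphi, \tilde\varphi \in D(A)$, $n \in \mathbb{N}$, $n \geq 0$.

Proof. A straightforward calculation. For (iii) only standard inequalities are needed. Note that the second inequality immediately follows from the first one.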
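The alternating pattern in part (ii) of Lemma 2 follows from applying $A\varphi = (0, 0, \mathrm{curl}\, B, -\mathrm{curl}\, E)$ repeatedly to the field part of $J(\varphi)$. As a bookkeeping aid, the following minimal sketch tracks only signs and curl powers; the encoding of a field as a pair `(sign, k)`, standing for `sign` times $\frac{4\pi\alpha}{\sqrt{1+p^2}}\, \mathrm{curl}^k(p\, \rho_q)$, and all function names are assumptions of this illustration and do not appear in the paper.

```python
def apply_A(state):
    """Apply A to the field part (E, B) of a state.

    Each field is None (the zero field) or a pair (sign, k) standing for
    sign * c * curl^k(p rho_q), with c = 4*pi*alpha / sqrt(1 + p^2).
    A sends (E, B) to (curl B, -curl E); the q and p slots become zero
    and are therefore not tracked here.
    """
    E, B = state
    new_E = None if B is None else (B[0], B[1] + 1)   # curl B
    new_B = None if E is None else (-E[0], E[1] + 1)  # -curl E
    return (new_E, new_B)


def powers_of_A_on_J(m_max):
    """Return {m: field part of A^m J(phi)} for m = 1, ..., m_max."""
    # Field part of J(phi): E-slot is -c * p * rho_q, B-slot is zero, cf. (4).
    state = ((-1, 0), None)
    table = {}
    for m in range(1, m_max + 1):
        state = apply_A(state)
        table[m] = state
    return table


if __name__ == "__main__":
    for m, (E, B) in powers_of_A_on_J(6).items():
        print(m, "E:", E, "B:", B)
```

Odd powers leave only the $B$ slot nonzero and even powers only the $E$ slot, with the sign $(-1)^n$ for the exponents $2n+1$ and $2n+2$, in agreement with (5).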

3 Local existence and uniqueness

We shall now use the group $\{U_t\}_{t \in \mathbb{R}}$, cf. Lemma 1, to prove via the contraction mapping principle the existence and uniqueness of local solutions of the integral equation

\[ \varphi(t) = U_t \varphi_0 + \int_0^t U_{t-s}\, J\big(\varphi(s)\big)\, ds \tag{6} \]




and to show that a unique solution of (6) is a unique solution of (2) with initial value $\varphi(0) = \varphi_0$. Doing this, we specify the set of initial values $\varphi_0$ as well as the exact notion of uniqueness. Our existence and uniqueness result is more or less standard from the viewpoint of semigroup theory. Therefore we keep the proofs short and refer to [12], section X.13, where a very similar technique is presented in connection with nonlinear wave equations.

Proposition 1 (Local existence and uniqueness) For each $\varphi_0 \in D(A^n)$, $n \ge 1$, there is a $T > 0$ and a map $\varphi(\cdot) : [0, T) \to D(A^n)$ with the following properties:

(i) $\varphi(\cdot)$ is an $n$ times strongly continuously differentiable solution of (2) with initial value $\varphi(0) = \varphi_0$ and $\frac{d^j}{dt^j}\varphi(t) \in D(A^{n-j})$ for all $t \in [0, T)$ and $j = 0, \dots, n$.

(ii) If $\tilde\varphi(\cdot) : [0, \tilde T) \to D(A)$ is a strongly continuously differentiable solution of (2) with initial value $\tilde\varphi(0) = \varphi_0$, then $\tilde\varphi(t) = \varphi(t)$ for $t \in [0, \tilde T) \cap [0, T)$.

Proof. (i) For given $n \in \mathbb{N}$, $n \ge 1$, and $T > 0$ we define

$$X_{T,n} := \Big\{ \varphi(\cdot) : [0, T) \to D(A^n) \ \Big|\ t \mapsto A^j \varphi(t) \text{ strongly continuous in } [0, T),\ j = 0, \dots, n,\ \|\varphi(\cdot)\|_X := \sum_{j=0}^{n} \sup_{t \in [0,T)} \| A^j \varphi(t) \|_H < \infty \Big\}.$$

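Whenever the dependence on $n$ or $T$ is not needed explicitly, we write $X$ for $X_{T,n}$. Since $A$ is a closed operator, $X$ equipped with the norm $\|\cdot\|_X$ is a Banach space. The map $U_{(\cdot)} \varphi_0 : t \mapsto U_t \varphi_0$ is an element of $X$. Define

$$M_{T,n,\varphi_0} := \big\{ \varphi(\cdot) \in X_{T,n} \ \big|\ \varphi(0) = \varphi_0,\ \| \varphi(\cdot) - U_{(\cdot)} \varphi_0 \|_{X_{T,n}} \le 1 \big\},$$

which is a closed subset of $X_{T,n}$. We shall show that the map $S_{\varphi_0} : \varphi(\cdot) \mapsto (S_{\varphi_0}\varphi)(\cdot)$, where

$$(S_{\varphi_0}\varphi)(t) = U_t \varphi_0 + \int_0^t U_{t-s} J(\varphi(s))\, ds, \qquad t \in [0, T), \qquad (7)$$

is a contracting self-mapping on $M_{T,n,\varphi_0}$ provided $T$ is chosen sufficiently small. Before doing so, we observe that for each $\varphi(\cdot) \in M_{T,n,\varphi_0}$ and for all $t \in [0, T)$ we have $\|\varphi(t)\|_H \le \|U_t \varphi_0\|_H + 1 = \|\varphi_0\|_H + 1$. Hence it follows from lemma 2 that we simply have

$$\| A^j J(\varphi(t)) \|_H \le C_{n,\varphi_0}, \qquad \| A^j ( J(\varphi(t)) - J(\tilde\varphi(t)) ) \|_H \le C_{n,\varphi_0} \| \varphi(t) - \tilde\varphi(t) \|_H \qquad (8)$$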


for all $\varphi(\cdot), \tilde\varphi(\cdot) \in M_{T,n,\varphi_0}$, $t \in [0, T)$ and $j = 0, \dots, n$, where

$$C_{n,\varphi_0} := ( 2 + \|\varphi_0\|_H ) \max_{j=0,\dots,n} K_j. \qquad (9)$$

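For the sake of brevity, we will often write $M$ for $M_{T,n,\varphi_0}$, $C$ for $C_{n,\varphi_0}$ and $S$ for $S_{\varphi_0}$. We first assert that the map $S$ in (7) yields for each $\varphi(\cdot) \in M$ a $D(A^n)$-valued function $t \mapsto (S\varphi)(t)$ in $[0, T)$, i.e. we show that the integral in (7) is $D(A^n)$-valued. Since by (8) and the unitarity of $U_t$

$$\| A^j U_{t-(s+h)} J(\varphi(s+h)) - A^j U_{t-s} J(\varphi(s)) \|_H \le \| A^j J(\varphi(s+h)) - A^j J(\varphi(s)) \|_H + \| U_{-h} A^j J(\varphi(s)) - A^j J(\varphi(s)) \|_H \le C \| \varphi(s+h) - \varphi(s) \|_H + \| U_{-h} A^j J(\varphi(s)) - A^j J(\varphi(s)) \|_H, \qquad (10)$$

the mappings $s \mapsto A^j U_{t-s} J(\varphi(s))$ are continuous in $[0, t)$ for all $t \in [0, T)$ and $j = 0, \dots, n$. Therefore the integrals

$$\sigma^{(j)}(t) := \int_0^t A^j U_{t-s} J(\varphi(s))\, ds, \qquad \sigma(t) := \sigma^{(0)}(t), \qquad (11)$$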

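are well-defined as $H$-valued Riemann integrals. Define

$$\sigma_N^{(j)}(t) := \frac{t}{N} \sum_{k=1}^{N} U_{t - \frac{k}{N} t}\, A^j J\big( \varphi( \tfrac{k}{N} t ) \big), \qquad \sigma_N(t) := \sigma_N^{(0)}(t),$$

as Riemann sums of the integrals in (11). We have $\sigma_N(t) \in D(A^n)$ and

$$\lim_{N \to \infty} \sigma_N(t) = \sigma(t), \qquad \lim_{N \to \infty} A^j \sigma_N(t) = \sigma^{(j)}(t)$$

with respect to the norm $\|\cdot\|_H$ for all $t \in [0, T)$ and $j = 0, \dots, n$. Since $A$ is closed, this implies $\sigma(t) \in D(A^n)$ and $A^j \sigma(t) = \sigma^{(j)}(t)$, i.e.

$$A^j \int_0^t U_{t-s} J(\varphi(s))\, ds = \int_0^t U_{t-s} A^j J(\varphi(s))\, ds. \qquad (12)$$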

Thus, $t \mapsto (S\varphi)(t) = U_t \varphi_0 + \sigma(t)$ is a $D(A^n)$-valued map in $[0, T)$. Next, we assert the continuity of the mappings $t \mapsto A^j (S\varphi)(t)$ in $[0, T)$ for $j = 0, \dots, n$, which holds if the mappings $t \mapsto A^j \sigma(t)$ are continuous. Using (8) and (12), we have

$$\| A^j \sigma(t+h) - A^j \sigma(t) \|_H \le \int_0^t \| U_h A^j J(\varphi(s)) - A^j J(\varphi(s)) \|_H\, ds + hC.$$




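Here, the integrand converges to zero as $h \to 0$ for each $s \in [0, T)$ and is bounded uniformly in $h$ according to

$$\| U_h A^j J(\varphi(s)) - A^j J(\varphi(s)) \|_H \le 2 \| A^j J(\varphi(s)) \|_H \le 2C.$$

Therefore, by the dominated convergence theorem, the integral converges to zero as $h \to 0$, and so does $\| A^j \sigma(t+h) - A^j \sigma(t) \|_H$. This implies the continuity of $t \mapsto A^j \sigma(t)$ in $[0, T)$ for each $j \in \mathbb{N}$ with $0 \le j \le n$. Finally we observe by use of (8) that

$$\| (S\varphi)(\cdot) - U_{(\cdot)} \varphi_0 \|_X = \sum_{j=0}^{n} \sup_{t \in [0,T)} \Big\| \int_0^t U_{t-s} A^j J(\varphi(s))\, ds \Big\|_H \le T C (n+1) \qquad (13)$$

and similarly

$$\| (S\varphi)(\cdot) - (S\tilde\varphi)(\cdot) \|_X \le T C (n+1) \| \varphi(\cdot) - \tilde\varphi(\cdot) \|_X \qquad (14)$$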

for all $\varphi(\cdot), \tilde\varphi(\cdot) \in M$. Hence, if $T < 1/(C(n+1))$, (13) completes showing that $S$ is a self-mapping on $M$ and (14) asserts $S$ to be a contraction. By the Banach fixed point theorem, we thus obtain that for each $\varphi_0 \in D(A^n)$, $n \ge 1$, there is a $T = T_{n,\varphi_0} > 0$, given by

$$T_{n,\varphi_0} = \frac{1}{2 C_{n,\varphi_0} (n+1)} \qquad (15)$$

with $C_{n,\varphi_0}$ cf. (9), such that the map $S_{\varphi_0}$ has exactly one fixed point $\varphi(\cdot)$ in $M_{T,n,\varphi_0}$. For the proof of the differentiability of $\varphi(\cdot)$ in $[0, T)$ we need the following two lemmas.

Lemma 3 For each $j = 0, \dots, n-1$ the map $t \mapsto A^j \varphi(t)$ is strongly continuously differentiable in $[0, T)$ and

$$\frac{d}{dt} A^j \varphi(t) = A^{j+1} \varphi(t) + A^j J(\varphi(t)). \qquad (16)$$

Proof. We consider

$$\frac{1}{h} \big( A^j \varphi(t+h) - A^j \varphi(t) \big) = \frac{U_h - I}{h} A^j U_t \varphi_0 + \frac{1}{h} \int_t^{t+h} U_{t+h-s} A^j J(\varphi(s))\, ds + \int_0^t \frac{U_h - I}{h} A^j U_{t-s} J(\varphi(s))\, ds. \qquad (17)$$

Since $\varphi_0 \in D(A^n)$ the first term converges to $A^{j+1} U_t \varphi_0$, and since the integrand of the second term is continuous the second term converges to $A^j J(\varphi(t))$ as $h$ tends to zero. The integrand of the third term converges to $A^{j+1} U_{t-s} J(\varphi(s))$ for each $s \in [0, t)$ and according to

$$\Big\| \frac{U_h - I}{h} A^j U_{t-s} J(\varphi(s)) \Big\|_H = \Big\| \frac{1}{h} \int_0^h U_{h'} A^{j+1} J(\varphi(s))\, dh' \Big\|_H \le C$$

it is bounded uniformly in $h$. Thus, by the dominated convergence theorem, the third term converges to

$$\lim_{h \to 0} \int_0^t \frac{U_h - I}{h} A^j U_{t-s} J(\varphi(s))\, ds = \int_0^t A^{j+1} U_{t-s} J(\varphi(s))\, ds = A^{j+1} \int_0^t U_{t-s} J(\varphi(s))\, ds.$$

To sum up, $t \mapsto A^j \varphi(t)$ is differentiable in $[0, T)$ and, taking the terms together, $\frac{d}{dt} A^j \varphi(t)$ satisfies (16). Note that the right-hand side of this equation is continuous because of $\varphi(\cdot) \in M$, $j \le n-1$ and lemma 2, so that $t \mapsto A^j \varphi(t)$ is even strongly continuously differentiable.

Lemma 4 Assume that the map $\varphi(\cdot) : [0, T) \to D(A^n)$ is $j$ times strongly continuously differentiable for some $j \in \mathbb{N}$, where $\frac{d^k}{dt^k} \varphi(t) \in D(A)$ for all $t \in [0, T)$ and $k \le j$. Then $t \mapsto A^l J(\varphi(t))$ is $j$ times strongly continuously differentiable,

$$\frac{d^k}{dt^k} J(\varphi(t)) \in \bigcap_{l=0}^{\infty} D(A^l) \qquad \text{and} \qquad A^l \frac{d^k}{dt^k} J(\varphi(t)) = \frac{d^k}{dt^k} A^l J(\varphi(t)) \qquad (18)$$

for all $t \in [0, T)$ and $k, l \in \mathbb{N}$ with $k \le j$.

Proof. Let $\varphi(\cdot) = (q(\cdot), p(\cdot), E(\cdot), B(\cdot))$ be as supposed in the lemma. Then the component functions $q(\cdot)$, $p(\cdot)$, $E(\cdot)$ and $B(\cdot)$ of $\varphi(\cdot)$ are $j$ times continuously differentiable in $[0, T)$ and $E^{(k)}(t), B^{(k)}(t) \in (L^2)^3$ for all $t \in [0, T)$ and $k \le j$. The lemma is an immediate consequence of $\rho \in C_0^\infty(\mathbb{R}^3)$ and the definition of $A$ and $J$, cf. (4) and (5). The integrand in (4), as an $(L^2)^3$-valued function of $t$ in $[0, T)$, is $j$ times strongly continuously differentiable, and all derivatives with respect to $t$ are again $(L^2)^3$-valued. Therefore also the integral, as a function of $t$ in $[0, T)$, is $j$ times continuously differentiable, integration over $\mathbb{R}^3$ and differentiation with respect to $t$ being exchangeable. Moreover, for each $l \in \mathbb{N}$, the $(C_0^\infty(\mathbb{R}^3))^3$-valued maps

$$t \mapsto \operatorname{curl}^l \Big[ \frac{p(t)}{\sqrt{1 + p(t)^2}}\, \rho(\cdot - q(t)) \Big]$$

are $j$ times continuously differentiable in $[0, T)$, the derivatives are again $(C_0^\infty(\mathbb{R}^3))^3$-valued maps in $[0, T)$, and differentiation with respect to $t$ commutes with the curl.




Proof of Proposition 1 (continued). It is now straightforward to show that the fixed point $\varphi(\cdot)$ of $S_{\varphi_0}$ in $M_{T,n,\varphi_0}$ has the following property: $\varphi(\cdot)$ is $n$ times strongly continuously differentiable, $\frac{d^j}{dt^j} \varphi(t) \in D(A^{n-j})$ and

$$\frac{d^j}{dt^j} \varphi(t) = A^j \varphi(t) + \sum_{\nu=0}^{j-1} A^{j-1-\nu} \frac{d^\nu}{dt^\nu} J(\varphi(t))$$

for all $t \in [0, T)$ and $j = 0, \dots, n$. We prove this statement by induction with respect to $j$. The case $j = 0$ is obvious. Now choose some $j \in \mathbb{N}$ with $j \le n-1$ and assume that the statement is valid for all $k \in \mathbb{N}$ with $k \le j$. Then $\varphi(\cdot)$ is $j$ times continuously differentiable and $\frac{d^k}{dt^k} \varphi(t) \in D(A^{n-k})$ for all $t \in [0, T)$ and $k \le j$. Therefore, by lemma 4, the maps

$$t \mapsto A^{j-1-\nu} \frac{d^\nu}{dt^\nu} J(\varphi(t)) = \frac{d^\nu}{dt^\nu} A^{j-1-\nu} J(\varphi(t))$$

are continuously differentiable for each $\nu \le j-1$ and

$$\frac{d}{dt} A^{j-1-\nu} \frac{d^\nu}{dt^\nu} J(\varphi(t)) = A^{j-1-\nu} \frac{d^{\nu+1}}{dt^{\nu+1}} J(\varphi(t)) \in D(A^{n-(j+1)})$$

for all $t \in [0, T)$. By lemma 3, also $t \mapsto A^j \varphi(t)$ is continuously differentiable and (16) holds, where the right-hand side is in $D(A^{n-(j+1)})$ for each $t \in [0, T)$. We thus have obtained that $\frac{d^j}{dt^j} \varphi(\cdot)$ is continuously differentiable in $[0, T)$ and

$$\frac{d^{j+1}}{dt^{j+1}} \varphi(t) = A^{j+1} \varphi(t) + A^j J(\varphi(t)) + \sum_{\nu=0}^{j-1} A^{j-1-\nu} \frac{d^{\nu+1}}{dt^{\nu+1}} J(\varphi(t)) = A^{j+1} \varphi(t) + \sum_{\nu=0}^{j} A^{j-\nu} \frac{d^\nu}{dt^\nu} J(\varphi(t)) \in D(A^{n-(j+1)}),$$

i.e. the above statement holds for $j + 1$. This yields part (i) of the proposition.

(ii) Let $\varphi(\cdot)$ be the unique fixed point of $S_{\varphi_0}$ in $M_{T,n,\varphi_0}$, where $T = T_{n,\varphi_0}$. Then for each $T_0$ with $0 < T_0 \le T$ the restriction $\varphi(\cdot)|_{[0,T_0)}$ of $\varphi(\cdot)$ to $[0, T_0)$ is the unique fixed point of $S_{\varphi_0}$ in $M_{T_0,1,\varphi_0}$, since by (15) and (9) $T_{n,\varphi_0}$ is monotonically decreasing in $n$. Now let $\tilde\varphi(\cdot) : [0, \tilde T) \to D(A)$ be a strongly continuously differentiable solution of (2) with initial value $\tilde\varphi(0) = \varphi_0$. By the differential equation (2), $t \mapsto A\tilde\varphi(t)$ is strongly continuous in $[0, \tilde T)$. Therefore $\tilde\varphi(\cdot)|_{[0,T_0)} \in M_{T_0,1,\varphi_0}$ for some $T_0$ with $0 < T_0 \le \min(T, \tilde T)$. Since $\tilde\varphi(\cdot)$ satisfies (2), $\tilde\varphi(\cdot)|_{[0,T_0)}$ is a fixed point of $S_{\varphi_0}$ in $M_{T_0,1,\varphi_0}$. This implies $\tilde\varphi(t) = \varphi(t)$ for all $t$ with $0 \le t < T_0$. Let $T_1$ be the supremum of such $T_0$ and suppose $T_1 < \min(T, \tilde T)$. By the definition of $M_{T,1,\varphi_0}$ we must have $\tilde\varphi(T_1) = \varphi(T_1)$. We can thus repeat the above argument, with initial value $\varphi(T_1)$ instead of $\varphi_0$, to show that $\tilde\varphi(t) = \varphi(t)$ in a small interval $T_1 \le t < T_2$. This contradicts the maximality of $T_1$ and hence $\tilde\varphi(t) = \varphi(t)$ holds for all $t$ with $0 \le t < \min(T, \tilde T)$, i.e. for $t \in [0, \tilde T) \cap [0, T)$.
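The contraction argument behind Proposition 1 can be illustrated on a finite-dimensional toy problem. The sketch below is not the Maxwell-Lorentz system: it assumes $H = \mathbb{R}^2$, takes $U_t$ to be a rotation (so it is unitary), and uses a hypothetical bounded nonlinearity `J` with Lipschitz constant 1. It iterates a discretized version of the map $S$ from (7) and records the sup-norm distance between successive iterates, which should shrink by at least the factor $T$ in each step.

```python
import math

def rotate(t, v):
    """Toy U_t = exp(tA) for A = [[0, 1], [-1, 0]]: a rotation, hence unitary."""
    c, s = math.cos(t), math.sin(t)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])

def J(v):
    """Hypothetical nonlinearity: bounded by 1 and Lipschitz with constant 1."""
    return (0.0, math.sin(v[0]))

def picard_step(phi, phi0, ts):
    """Discretized S: (S phi)(t) = U_t phi0 + int_0^t U_{t-s} J(phi(s)) ds."""
    h = ts[1] - ts[0]
    out = []
    for i, t in enumerate(ts):
        acc0, acc1 = 0.0, 0.0
        if i > 0:  # the integral over [0, 0] is zero
            for k in range(i + 1):
                w = 0.5 if k in (0, i) else 1.0  # trapezoid weights
                g = rotate(t - ts[k], J(phi[k]))
                acc0 += w * g[0]
                acc1 += w * g[1]
        u = rotate(t, phi0)
        out.append((u[0] + h * acc0, u[1] + h * acc1))
    return out

def sup_dist(a, b):
    """Sup norm over the time grid of the Euclidean distance."""
    return max(math.hypot(x[0] - y[0], x[1] - y[1]) for x, y in zip(a, b))

def picard_distances(phi0=(1.0, 0.0), T=0.4, steps=40, iterations=5):
    """Distances between successive Picard iterates, starting from U_t phi0."""
    ts = [T * i / steps for i in range(steps + 1)]
    phi = [rotate(t, phi0) for t in ts]
    dists = []
    for _ in range(iterations):
        nxt = picard_step(phi, phi0, ts)
        dists.append(sup_dist(nxt, phi))
        phi = nxt
    return dists

if __name__ == "__main__":
    for k, d in enumerate(picard_distances()):
        print(k, d)
```

In this toy setting the contraction constant is the interval length $T$, which mirrors the role of $T C (n+1)$ in (14).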


4 Regularity and global existence

So far we did not take into account the last two lines in (1). From now on, however, we will always impose on the initial value $\varphi_0 = (q_0, p_0, E_0, B_0)$ the constraint

$$\operatorname{div} E_0 = 4\pi \rho(\cdot - q_0), \qquad \operatorname{div} B_0 = 0, \qquad (19)$$

where the partial derivatives of the divergence are meant in the sense of distributions. The following proposition states that this constraint implies for a solution of (2) with initial value $\varphi_0$ the validity of the last two lines in (1) on the whole interval of existence as well as certain regularity properties.

Proposition 2 Suppose $\varphi_0 = (q_0, p_0, E_0, B_0) \in D(A^n)$, $n \ge 1$. Suppose further that $q_0$, $E_0$ and $B_0$ satisfy (19). Then the solution $\varphi(\cdot) = (q(\cdot), p(\cdot), E(\cdot), B(\cdot)) : [0, T) \to D(A^n)$ of (2) with initial value $\varphi(0) = \varphi_0$, cf. proposition 1, has the following properties:

(i) For all $t \in [0, T)$, the functions $q(\cdot)$, $E(\cdot)$ and $B(\cdot)$ satisfy

$$\operatorname{div} E(t) = 4\pi \rho(\cdot - q(t)), \qquad \operatorname{div} B(t) = 0, \qquad (20)$$

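where the partial derivatives of the divergence are meant in the sense of distributions.

(ii) If $n \ge 2\lambda + 2$ for some $\lambda \in \mathbb{N}$, then, for every $t \in [0, T)$, each of $E(t)$ and $B(t)$ is equal almost everywhere in $\mathbb{R}^3$ to a function in $(C^{2\lambda}(\mathbb{R}^3))^3$. If $\varphi_0 \in \bigcap_{n=0}^{\infty} D(A^n)$, then for every $t \in [0, T)$ each of $E(t)$ and $B(t)$ is equal almost everywhere in $\mathbb{R}^3$ to a function in $(C^\infty(\mathbb{R}^3))^3$.

Proof. (i) Since $t \mapsto B(t)$ is continuously differentiable in $[0, T)$, we may write

$$\operatorname{div} B(t) = \operatorname{div} \Big( B_0 + \int_0^t \dot B(s)\, ds \Big) = -\operatorname{div} \int_0^t \operatorname{curl} E(s)\, ds$$

for each $t \in [0, T)$. Now choose some arbitrary test function $\phi$. We have

$$-\int \operatorname{div} \Big( \int_0^t \operatorname{curl} E(x, s)\, ds \Big) \phi(x)\, dx = \int_0^t \int \operatorname{curl} E(x, s) \cdot \operatorname{grad} \phi(x)\, dx\, ds = 0, \qquad (21)$$

using Fubini's theorem, which may be applied since the integral

$$\int_0^t \int | \operatorname{curl} E(x, s) \cdot \operatorname{grad} \phi(x) |\, dx\, ds \le \| \operatorname{grad} \phi \| \int_0^t \| \operatorname{curl} E(s) \|\, ds$$

exists. We further have

$$\operatorname{div} E(t) = \operatorname{div} \Big( E_0 + \int_0^t \dot E(s)\, ds \Big) = 4\pi \rho(\cdot - q_0) + \operatorname{div} \int_0^t \operatorname{curl} B(s)\, ds - 4\pi \operatorname{div} \int_0^t \frac{\alpha p(s)}{\sqrt{1 + p(s)^2}}\, \rho(\cdot - q(s))\, ds.$$

Here, the second term vanishes by an argument completely analogous to (21). In the third term we are allowed to commute the divergence with the integration, since $p(\cdot)$ and $q(\cdot)$ are continuous and $\rho \in C_0^\infty(\mathbb{R}^3)$. Since $\dot q(s) = \alpha p(s)/\sqrt{1 + p(s)^2}$ by (4), this yields

$$\operatorname{div} E(t) = 4\pi \rho(\cdot - q_0) - 4\pi \int_0^t \frac{\alpha p(s)}{\sqrt{1 + p(s)^2}} \cdot \operatorname{grad} \rho(\cdot - q(s))\, ds = 4\pi \rho(\cdot - q_0) + 4\pi \int_0^t \frac{d}{ds} \rho(\cdot - q(s))\, ds = 4\pi \rho(\cdot - q(t)).$$

(ii) Assume $n \ge 2\lambda + 2$ for some $\lambda \in \mathbb{N}$. Fix some arbitrary $t \in [0, T)$. Since $\varphi(t) \in D(A^n)$, we have $\operatorname{curl}^{2l} E(t), \operatorname{curl}^{2l} B(t) \in (L^2)^3$ for all $l \in \mathbb{N}$ with $0 \le l \le \lambda + 1$, where the partial derivatives of the curl are meant in the sense of distributions. For $l \ge 1$ we have

$$\operatorname{curl}^{2l} E(t) = (\operatorname{curl}^2)^{l-1} \operatorname{curl}^2 E(t) = (\operatorname{grad}\operatorname{div} - \Delta)^{l-1} \operatorname{curl}^2 E(t) = \sum_{\nu=0}^{l-1} \binom{l-1}{\nu} (-\Delta)^{l-1-\nu} (\operatorname{grad}\operatorname{div})^{\nu} \operatorname{curl}^2 E(t) = (-\Delta)^{l-1} \operatorname{curl}^2 E(t) = (-1)^{l-1} \Delta^{l-1} \operatorname{grad}\operatorname{div} E(t) + (-1)^l \Delta^l E(t) \qquad (22)$$

in the distributional sense, where $\Delta$ denotes the Laplacian. The same calculation as (22) also holds for $B(t)$. By $\operatorname{div} E(t) = 4\pi \rho(\cdot - q(t))$ and $\operatorname{div} B(t) = 0$ according to (i) we thus obtain

$$\operatorname{curl}^{2l} E(t) = 4\pi (-1)^{l-1} \Delta^{l-1} \operatorname{grad} \rho(\cdot - q(t)) + (-1)^l \Delta^l E(t), \qquad \operatorname{curl}^{2l} B(t) = (-1)^l \Delta^l B(t).$$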

Thus, by $\rho \in C_0^\infty(\mathbb{R}^3)$, we conclude $\Delta^l E(t), \Delta^l B(t) \in (L^2)^3$ for $0 \le l \le \lambda + 1$. For the Fourier transforms $\widehat{\Delta^l E(t)}$ of $\Delta^l E(t)$ and $\hat E(t)$ of $E(t)$ in the sense of tempered distributions we have $\widehat{\Delta^l E(t)} = (-1)^l |k|^{2l} \hat E(t)$, and by $\Delta^l E(t) \in (L^2)^3$, the Plancherel theorem and $|k|^{2l} \ge k_j^{2l}$ this yields $k_j^{2l} \hat E(t) \in (L^2)^3$ for $0 \le l \le \lambda + 1$ and $j = 1, 2, 3$. By Schwarz' inequality,

$$| k_j^{l+l'} \hat E(t) |^2 = | k_j^{2l} \hat E(t) |\, | k_j^{2l'} \hat E(t) | \in L^1(\mathbb{R}^3)$$

and therefore $k_j^{l+l'} \hat E(t) \in (L^2)^3$ for $0 \le l, l' \le \lambda + 1$, i.e. $k_j^l \hat E(t) \in (L^2)^3$ for all $l \in \mathbb{N}$ with $0 \le l \le 2\lambda + 2$ and $j = 1, 2, 3$. Applying Fourier transform and the Plancherel theorem again, this implies $\partial^l / \partial x_j^l\, E(t) \in (L^2)^3$ for $0 \le l \le 2\lambda + 2$ and $j = 1, 2, 3$, with the partial derivatives $\partial^l / \partial x_j^l$ in the sense of distributions. The same holds for $B(t)$. By this property of $E(t)$ and $B(t)$, Sobolev's lemma [13] assures that each of $E(t)$ and $B(t)$ is equal almost everywhere in $\mathbb{R}^3$ to a function in $(C^{2\lambda}(\mathbb{R}^3))^3$.
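The identity $\operatorname{curl}^2 = \operatorname{grad}\operatorname{div} - \Delta$ used in (22) can be checked on Fourier symbols, where $\operatorname{curl}$ acts as $ik \times$, $\operatorname{grad}\operatorname{div}$ as $-k (k \cdot\ )$ and $-\Delta$ as $|k|^2$. The following sketch compares both sides for a single wave vector; the function names are illustrative and not taken from the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def curl_curl_symbol(k, v):
    """Symbol of curl curl: (ik) x ((ik) x v) = -k x (k x v)."""
    inner = cross(k, v)
    return tuple(-c for c in cross(k, inner))

def grad_div_minus_laplace_symbol(k, v):
    """Symbol of grad div - Laplacian: -k (k . v) + |k|^2 v."""
    kv = dot(k, v)
    k2 = dot(k, k)
    return tuple(-k[i] * kv + k2 * v[i] for i in range(3))

if __name__ == "__main__":
    k = (1, 2, 3)
    v = (4, -5, 6)
    print(curl_curl_symbol(k, v))
    print(grad_div_minus_laplace_symbol(k, v))
```

For a divergence-free field ($k \cdot v = 0$) both symbols reduce to $|k|^2 v$, which is why only the $\nu = 0$ term of the binomial sum in (22) survives.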

We now come to formulate and prove our central theorem, the existence and uniqueness of global classical solutions of the Maxwell-Lorentz system. It remains to show that the local solutions of (2), cf. propositions 1 and 2, may be continued to the whole interval $[0, \infty)$, i.e. that they exist globally. The natural idea would be to use conservation of energy∗. We shall not do that, also because we have included the case of negative mass, for which the energy is no longer bounded below.

∗ See Remark 3 below.

Theorem 1 Let $\varphi_0 = (q_0, p_0, E_0, B_0) \in D(A^n)$, $n \ge 1$. Suppose further that $q_0$, $E_0$ and $B_0$ satisfy (19). Then the following is true:

(i) (Global existence) There is a function $\varphi(\cdot) = (q(\cdot), p(\cdot), E(\cdot), B(\cdot)) : \mathbb{R} \to D(A^n)$ with the following properties: $\varphi(\cdot)$ is an $n$ times strongly continuously differentiable solution of (2) with initial value $\varphi(0) = \varphi_0$ and $\varphi^{(j)}(t) \in D(A^{n-j})$ for all $t \in \mathbb{R}$ and $j = 0, \dots, n$. The functions $q(\cdot)$, $E(\cdot)$ and $B(\cdot)$ satisfy (20) for all $t \in \mathbb{R}$. Thus, $\varphi(\cdot)$ is a global solution of the Maxwell-Lorentz system, with curl and divergence in the distributional sense.

(ii) (Uniqueness) Suppose $\Lambda \subset \mathbb{R}$ is an interval and $T_0 \in \Lambda$. If $\tilde\varphi(\cdot) : \Lambda \to D(A)$ is a continuously differentiable solution of (2) with initial value $\tilde\varphi(T_0) = \varphi(T_0)$, then $\tilde\varphi(t) = \varphi(t)$ for all $t \in \Lambda$.

(iii) (Regularity) If $n \ge 2\lambda + 2$ for some $\lambda \in \mathbb{N}$, then for every $t \in \mathbb{R}$ each of $E(t)$ and $B(t)$ is equal almost everywhere in $\mathbb{R}^3$ to a function in $(C^{2\lambda}(\mathbb{R}^3))^3$. If $\varphi_0 \in \bigcap_{n=0}^{\infty} D(A^n)$, then for every $t \in \mathbb{R}$ each of $E(t)$ and $B(t)$ is equal almost everywhere in $\mathbb{R}^3$ to a function in $(C^\infty(\mathbb{R}^3))^3$.

Proof. (i) According to proposition 1 there is a $T > 0$ and a function $\varphi(\cdot) : [0, T) \to D(A^n)$ such that $\varphi(\cdot)$ is an $n$ times strongly continuously differentiable solution of (2) in $[0, T)$ with initial value $\varphi(0) = \varphi_0$. Let $\bar T$ be the supremum of all $T > 0$ with this property and suppose $\bar T < \infty$. In order to derive a contradiction, observe

$$\|\varphi(t)\|_H = \Big\| U_t \varphi_0 + \int_0^t U_{t-s} J(\varphi(s))\, ds \Big\|_H \le \|\varphi_0\|_H + \int_0^t \| J(\varphi(s)) \|_H\, ds \le \|\varphi_0\|_H + K_1 \int_0^t \|\varphi(s)\|_H\, ds$$

for all $t \in [0, \bar T)$, using lemma 2. Gronwall's lemma yields

$$\|\varphi(t)\|_H \le \|\varphi_0\|_H\, e^{K_1 t} \le \|\varphi_0\|_H\, e^{K_1 \bar T} < \infty, \qquad (23)$$

i.e. the map $t \mapsto \|\varphi(t)\|_H$ is bounded in $[0, \bar T)$. Since by (15) and (9) the length of the interval over which one can use the contraction mapping principle only depends on this bound, the solution can be continued beyond $\bar T$. This violates the maximality of $\bar T$ and hence $\bar T = \infty$. Note that nowhere in our proof do we make use of the sign of $t$. Our existence result and the properties of the solutions thus are valid for $(-\infty, 0]$ as well as for $[0, \infty)$, i.e. for the whole of $\mathbb{R}$.




(ii) For $\Lambda = [0, \tilde T)$ and $T_0 = 0$ what has to be shown follows immediately from part (ii) of proposition 1, whose proof also includes the case $T = \infty$. For general intervals $\Lambda$ the proof may be traced back easily to this special case by translation or inversion of the interval.

(iii) This statement is the content of part (ii) of proposition 2, whose proof includes the case that $\varphi(\cdot)$ is a global solution.

Remarks. 1. Part (iii) of the theorem states that if $\varphi_0 \in D(A^n)$ for all $n \ge 0$, then for each $t \in \mathbb{R}$ the equivalence class of functions in $(L^2)^3$ that differ from $E(t)$ only on a null set of $\mathbb{R}^3$ has a representative in $(C^\infty(\mathbb{R}^3))^3$. The same holds for $B(t)$. If we denote by $E(t)$ and $B(t)$ this representative and no longer the whole equivalence class, we have $E(t), B(t) \in (C^\infty(\mathbb{R}^3))^3$. Furthermore, it follows from (i) that for each interval $\Lambda \subset \mathbb{R}$ the maps $t \mapsto E(t)$ and $t \mapsto B(t)$ as well as their $(L^2)^3$-derivatives $t \mapsto \partial^l / \partial t^l\, E(t)$ and $t \mapsto \partial^l / \partial t^l\, B(t)$, $l \ge 0$, are elements of $L^2(\Lambda, (L^2)^3)$, this space being isomorphic to $(L^2(\Lambda \times \mathbb{R}^3))^3$. Since $C_0^\infty \subset L^2$, the derivative in the time direction with respect to the $(L^2)^3$-norm is the distributional derivative. Therefore we can use Sobolev's lemma [13] to conclude from $\partial^l / \partial t^l\, E,\ \partial^l / \partial t^l\, B \in (L^2(\Lambda \times \mathbb{R}^3))^3$, $l \ge 0$, that $E$ and $B$ are not only strongly continuously differentiable with respect to the $(L^2)^3$-norm, but even pointwise up to all orders. Thus $E, B \in (C^\infty(\mathbb{R} \times \mathbb{R}^3))^3$, and $\varphi(\cdot)$ is a classical solution of the Maxwell-Lorentz system. (Lesser smoothness of the initial conditions would yield in a similar way classical solutions which are less smooth.)

2. We wrote the Maxwell-Lorentz system (1) with a linear force term acting in addition to the Lorentz force, i.e. we have a quadratic external potential. We note that this is only for notational simplicity and that the consideration of more general potentials would require only minor adjustments.
For example, as long as a more general potential is bounded by a quadratic function, the estimates in lemma 2 remain unaltered and global existence is still valid. In this respect we also remark that our results apply in a straightforward manner to the Maxwell-Lorentz system with physical constants mass $m$ and velocity of light $c$ written out. Our assumption $\rho \in C_0^\infty(\mathbb{R}^3)$ is again for notational simplicity. In case of more general $\rho$, the regularity properties of the solution would reflect the degree of differentiability of $\rho$.

3. We mention that the total energy of the Maxwell-Lorentz system is conserved. It can be used as an a priori bound on the norm of local solutions if $\alpha = 1$ and $\kappa \ge 0$. However, we proved global existence by other means for general $\alpha$ and $\kappa$.

Lemma 5 (conservation of energy) Let $\varphi(\cdot) = (q(\cdot), p(\cdot), E(\cdot), B(\cdot)) : [0, T) \to D(A^n)$, $n \ge 1$, be a solution of (2) according to proposition 1. Then the quantity $H$,

$$H = \alpha \sqrt{1 + p(t)^2} + \frac{1}{2} \kappa q(t)^2 + \frac{1}{8\pi} \int \big( E(t, x)^2 + B(t, x)^2 \big)\, dx, \qquad (24)$$

is independent of $t$. We call $H$ the total energy of the Maxwell-Lorentz system.

Proof. A straightforward calculation.

5 Negative bare mass and "runaway"-solutions

Throughout this section we consider the situation that the charge distribution's bare mass $m$ is negative and that the external potential is attracting, i.e. $\alpha = \operatorname{sgn}(m) = -1$ and $\kappa > 0$. In addition, we assume the charge distribution $\rho$ to be spherically symmetric. Our investigation is concerned with the stationary solution of the Maxwell-Lorentz system (1). By this we mean the following time-independent state vector $\psi_s$:

$$\psi_s = (0, 0, E_C, 0), \qquad \text{where} \qquad E_C(x) = \int \rho(x') \frac{x - x'}{|x - x'|^3}\, dx'$$

is the electrostatic Coulomb field of the charge distribution $\rho$. It is easy to check, making use of the symmetry of $\rho$, that $\psi_s$ is indeed a solution of (1). The following theorem states that in an arbitrarily small neighbourhood of $\psi_s$ there are initial data $\varphi(0)$ such that the corresponding solution $t \mapsto \varphi(t)$ of (1) has "runaway" character, i.e. the charge center runs away infinitely far from the origin and its velocity tends to the speed of light as $t \to \infty$.

Theorem 2 Let $\alpha = -1$, $\kappa > 0$ and let $\rho$ be spherically symmetric. Then the stationary solution $\psi_s$ of the Maxwell-Lorentz system (1) is unstable in the sense of Lyapunov: For each $\varepsilon > 0$ there is a solution $t \mapsto \varphi(t) = (q(t), p(t), E(t), B(t))$ of (1) with $\| \psi_s - \varphi(0) \|_H < \varepsilon$ and

$$\lim_{t \to \infty} |q(t)| = \infty, \qquad \lim_{t \to \infty} |p(t)| = \infty, \qquad \text{i.e.} \qquad \lim_{t \to \infty} \| \psi_s - \varphi(t) \|_H = \infty.$$

Proof. Let $u$ be some unit vector, $u \in \mathbb{R}^3$ and $|u| = 1$. We choose the following initial data:

$$q_0 = x_0 u, \quad x_0 > 0,$$
$$\dot q_0 = \frac{-p_0}{\sqrt{1 + p_0^2}} = v_0 u, \quad v_0 > 0,$$
$$E_0(x) = \int \rho(x') \frac{x - q(0) - x'}{|x - q(0) - x'|^3}\, dx' = E_C(x - q(0)),$$
$$B_0(x) = 0.$$




These initial data satisfy (19). We may assume that

$$\varphi_0 = (q_0, p_0, E_0, B_0) = \Big( q_0,\ \frac{-\dot q_0}{\sqrt{1 - \dot q_0^2}},\ E_C(\cdot - q_0),\ 0 \Big)$$

satisfies $\| \psi_s - \varphi_0 \|_H < \varepsilon$. For this we only have to choose $x_0$ and $v_0$ sufficiently small. By theorem 1, there is a global solution $t \mapsto \varphi(t) = (q(t), p(t), E(t), B(t))$ of the Maxwell-Lorentz system (1) with $\varphi(0) = \varphi_0$. Due to the symmetry of the initial data, i.e. $q_0, p_0 \parallel u$ and $E_0$, $B_0$ being rotationally symmetric around the axis through $u$, it is assured that the motion of $q$ is one-dimensional: We may write

$$q(t) = x(t) u \quad (t \in \mathbb{R}), \qquad (25)$$

with a real-valued function $t \mapsto x(t)$ satisfying $x(0) = x_0$ and $\dot x(0) = v_0$. Making use now of energy conservation, cf. lemma 5, we have

$$H = \frac{-1}{\sqrt{1 - \dot x(t)^2}} + \frac{1}{2} \kappa x(t)^2 + \frac{1}{8\pi} \int \big( E(t, x)^2 + B(t, x)^2 \big)\, dx = \frac{-1}{\sqrt{1 - v_0^2}} + \frac{1}{2} \kappa x_0^2 + \frac{1}{8\pi} \int E_C^2\, dx \qquad (26)$$

for all $t \in \mathbb{R}$. We decompose the electric field into a longitudinal and a transversal part: $E = E_l + E_t$, where $\operatorname{div} E_t = 0$ and $\operatorname{rot} E_l = 0$. Note that $E_l(x, t) = E_C(x - q(t))$ and $\int E_C \cdot E_t\, dx = 0$ since $\operatorname{div} E(x, t) = \operatorname{div} E_C(x - q(t)) = 4\pi \rho(x - q(t))$. Therefore

$$\int E^2\, dx = \int \big( E_C^2 + 2 E_C \cdot E_t + E_t^2 \big)\, dx \ge \int E_C^2\, dx.$$

From (26) we thus obtain

$$\frac{1}{\sqrt{1 - \dot x(t)^2}} \ge \frac{1}{2} \kappa \big( x(t)^2 - x_0^2 \big) + \frac{1}{\sqrt{1 - v_0^2}}$$

for all $t \in \mathbb{R}$. Since $x_0, v_0 > 0$, this implies $\dot x(t) > v_0$ for all $t > 0$ and therefore

$$\lim_{t \to \infty} |x(t)| = \infty, \qquad \lim_{t \to \infty} |\dot x(t)| = 1.$$

Because of (25) this is what we had to show.

Remark. As we mentioned in the introduction, this result sheds light on the genesis of the "runaway"-solutions of the Lorentz-Dirac equation. We wish to remark that the Lorentz-Dirac equation exhibits "runaway"-solutions also without external potential. We employed the potential merely as a tool to have a quick argument for instability. Clearly, also without external potential, as soon as the particle accelerates, it will continue to accelerate, thereby losing energy to the fields. A rigorous argument for this would need a more detailed analysis of the fields.
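The mechanism behind Theorem 2 can be seen in a much simpler toy model. The sketch below is an assumption-laden illustration, not the Maxwell-Lorentz system: it drops the field coupling entirely (as if $\rho = 0$) and integrates only the one-dimensional particle equations $\dot q = \alpha p / \sqrt{1 + p^2}$, $\dot p = -\kappa q$ with a classical Runge-Kutta scheme. The particle part of the energy (24), $\alpha \sqrt{1 + p^2} + \frac{1}{2} \kappa q^2$, is conserved by these equations, and for $\alpha = -1$, $\kappa > 0$ a small initial displacement grows while the speed approaches 1. Step size, end time and initial values are arbitrary choices.

```python
import math

def rhs(q, p, alpha, kappa):
    """Right-hand side of the field-free toy system (dq/dt, dp/dt)."""
    return alpha * p / math.sqrt(1.0 + p * p), -kappa * q

def energy(q, p, alpha, kappa):
    """Particle part of the energy (24), without the field term."""
    return alpha * math.sqrt(1.0 + p * p) + 0.5 * kappa * q * q

def simulate(x0=0.01, v0=0.01, alpha=-1.0, kappa=1.0, t_end=15.0, dt=0.001):
    """Integrate with classical RK4; returns (q, p, initial energy, final energy)."""
    q = x0
    # Momentum chosen so that the initial velocity alpha*p/sqrt(1+p^2) equals v0.
    p = alpha * v0 / math.sqrt(1.0 - v0 * v0)
    h0 = energy(q, p, alpha, kappa)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1q, k1p = rhs(q, p, alpha, kappa)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p, alpha, kappa)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p, alpha, kappa)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p, alpha, kappa)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p, h0, energy(q, p, alpha, kappa)

if __name__ == "__main__":
    q, p, h0, h1 = simulate()
    speed = abs(p) / math.sqrt(1.0 + p * p)
    print("q =", q, "speed =", speed, "energy drift =", h1 - h0)
```

In the full system the field energy in excess of the Coulomb energy is non-negative, which is exactly what turns the energy identity (26) into the inequality used in the proof; the toy model corresponds to setting that excess to zero.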


Acknowledgements. We thank Karin Berndl, Michael Kiessling, Alexander Komech and Herbert Spohn for valuable discussions. Earlier discussions with Sheldon Goldstein and Nino Zanghi on the subject are also gratefully acknowledged. This work was supported by the DFG via the Graduiertenkolleg "Mathematik im Bereich ihrer Wechselwirkung mit der Physik".

References

[1] J.S. Nodvik, A Covariant Formulation of Classical Electrodynamics for Charges of Finite Extension, Ann. Phys. 28, 225 (1964).
[2] W. Appel, M. Kiessling, Mass and Spin Renormalisation in Lorentz Electrodynamics, preprint (2000).
[3] D. Bambusi, D. Noja, On Classical Electrodynamics of Point Particles and Renormalisation: Some Preliminary Results, Lett. Math. Phys. 37, 436 (1996).
[4] A. Komech, H. Spohn, Long-Time Asymptotics for the Coupled Maxwell-Lorentz Equations, Comm. Part. Diff. Eq. 25, 558 (2000).
[5] A. Komech, H. Spohn, V. Imaikin, Soliton-like Asymptotics and Scattering for Coupled Maxwell-Lorentz Equations, Proc. of 5th Int. Conf. on Math and Num. Aspects of Waves Propagation, July 10-14, Santiago de Compostela, Spain, SIAM, 2000.
[6] M. Kiessling, Classical Electron Theory and Conservation Laws, Phys. Lett. A258, 197 (1999).
[7] P.A.M. Dirac, Classical Theory of Radiating Electrons, Proc. Roy. Soc. London A178, 148 (1938).
[8] F. Rohrlich, Classical Charged Particles, Addison-Wesley (1990).
[9] A.O. Barut, Electrodynamics and Classical Theory of Fields and Particles, Dover (1980).
[10] H. Spohn, Dynamics of Charged Particles and their Radiation Field, Los Alamos Archive Math.-Phys./9908024.
[11] R.D. Richtmyer, Principles of Advanced Mathematical Physics I, Springer (1978).
[12] M. Reed, B. Simon, Methods of Modern Mathematical Physics II, Academic Press (1975).
[13] W. Rudin, Functional Analysis, McGraw-Hill (1973).



Gernot Bauer
Hollenbeckerstr. 23
D-48143 Münster
Germany

Detlef Dürr
Fakultät für Mathematik
Universität München
Theresienstr. 39
D-80333 München
Germany

Communicated by Jean Bellissard
submitted 28/01/99, accepted 18/09/00





Sinai Billiards Under Small External Forces

N.I. Chernov

Abstract. Consider a particle moving freely on the torus and colliding elastically with some fixed convex bodies. This model is called a periodic Lorentz gas, or a Sinai billiard. It is a Hamiltonian system with a smooth invariant measure, whose ergodic and statistical properties have been well investigated. Now let the particle be subjected to a small external force. This new system is not likely to have a smooth invariant measure. Then a Sinai-Ruelle-Bowen (SRB) measure describes the evolution of typical phase trajectories. We find general sufficient conditions on the external force under which the SRB measure for the collision map exists, is unique, and enjoys good ergodic and statistical properties, including Bernoulliness and an exponential decay of correlations.

1 Introduction

Let $B_1, \dots, B_s$ be open convex domains on the unit 2-dimensional torus $\mathbb{T}^2$. Assume that $\bar B_i \cap \bar B_j = \emptyset$ for $i \ne j$, and for each $i$ the boundary $\partial B_i$ is a $C^3$ smooth closed curve with non-vanishing curvature. Consider a particle of unit mass moving in $Q := \mathbb{T}^2 \setminus \cup_i B_i$ according to equations

$$\dot{\mathbf{q}} = \mathbf{p}, \qquad \dot{\mathbf{p}} = \mathbf{F} \qquad (1.1)$$

where $\mathbf{q} = (x, y)$ is the position vector, $\mathbf{p} = (u, v)$ is the momentum (or velocity) vector, and $\mathbf{F}(x, y, u, v) = (F_1, F_2)$ is a stationary force (the force is independent of time). Upon reaching the boundary $\partial Q = \cup_i \partial B_i$, the particle reflects elastically, according to the usual rule

$$\mathbf{p}^+ = \mathbf{p}^- - 2\, ( \mathbf{n}(\mathbf{q}) \cdot \mathbf{p}^- )\, \mathbf{n}(\mathbf{q}). \qquad (1.2)$$

Here $\mathbf{q} \in \partial Q$ is the point of reflection, $\mathbf{n}(\mathbf{q})$ is the unit normal vector to $\partial Q$ pointing inside $Q$, and $\mathbf{p}^-$, $\mathbf{p}^+$ are the incoming and outgoing velocity vectors, respectively. The case $\mathbf{F} = 0$ corresponds to the ordinary billiard dynamics on the table $Q$. It preserves the kinetic energy $K = \frac{1}{2} \|\mathbf{p}\|^2$, so that one can fix it, usually by setting $\|\mathbf{p}\| = 1$. Then the phase space of the system is a compact three-dimensional manifold $\mathcal{M}_0 := Q \times S^1$, with identification of incoming and outgoing velocity vectors, i.e. $\mathbf{p}^-$ and $\mathbf{p}^+$ in (1.2), at every point of reflection. The dynamics $\Phi_0^t$ on $\mathcal{M}_0$ preserves the Liouville measure, which is simply a uniform measure on $\mathcal{M}_0$.
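The reflection rule (1.2) is easy to state in code. The following minimal sketch assumes the normal `n` is already a unit vector; the function name is illustrative. It also shows that the rule preserves the speed, which is why the kinetic energy can be fixed in the force-free case.

```python
import math

def reflect(p, n):
    """Elastic reflection (1.2): p_plus = p_minus - 2 (n . p_minus) n, with |n| = 1."""
    d = p[0] * n[0] + p[1] * n[1]
    return (p[0] - 2.0 * d * n[0], p[1] - 2.0 * d * n[1])

if __name__ == "__main__":
    p_minus = (1.0, -1.0)          # incoming velocity, heading towards the wall y = 0
    n = (0.0, 1.0)                 # unit normal pointing into the table
    p_plus = reflect(p_minus, n)
    print(p_plus, math.hypot(*p_plus) - math.hypot(*p_minus))
```

Only the normal component of the velocity changes sign; the tangential component is untouched.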



In the study of billiards, one usually considers the following two-dimensional cross-section of $\mathcal{M}_0$:

$$M_0 := \{ (\mathbf{q}, \mathbf{p}) \in \mathcal{M}_0 : \mathbf{q} \in \partial Q,\ ( \mathbf{p} \cdot \mathbf{n}(\mathbf{q}) ) \ge 0 \} \qquad (1.3)$$

which consists of all outgoing velocity vectors at reflection points. Then the first return map $T_0 : M_0 \to M_0$ is well defined; it is called the collision map or billiard map. The cross-section $M_0$ can be parameterized by $(r, \varphi)$, where $r$ is the arclength parameter along $\partial Q$ and $\varphi \in [-\pi/2, \pi/2]$ is the angle between $\mathbf{p}$ and $\mathbf{n}(\mathbf{q})$. In these coordinates, $M_0 = \partial Q \times [-\pi/2, \pi/2]$. The map $T_0$ preserves a finite smooth measure on $M_0$, induced by the Liouville measure on $\mathcal{M}_0$. It is given by $d\nu_0 = \mathrm{const} \cdot \cos\varphi\, dr\, d\varphi$. Since each obstacle $B_i$ is convex, it acts as a scatterer, so that parallel bundles of trajectories diverge upon reflection, see Fig. 1. Billiards with this property are said to be dispersing, or Sinai billiards. The map $T_0$ and the flow $\Phi_0^t$ for dispersing billiards are proved to be hyperbolic (i.e., they have one positive and one negative Lyapunov exponent), ergodic, mixing, K-mixing and Bernoulli [Si, GO]. The map $T_0$ enjoys strong statistical properties: it has exponential decay of correlations and satisfies a central limit theorem and a weak invariance principle [BSC2, Y1, Ch2].
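As a small illustration of the coordinates $(r, \varphi)$, the sketch below computes them for a single circular scatterer. The circular shape, the choice of where arclength starts, and the sign convention for $\varphi$ are assumptions made for this example only; the paper allows general convex obstacles.

```python
import math

def collision_coordinates(center, radius, q, p):
    """Return (r, phi) for an outgoing velocity p at the point q on a circular scatterer.

    r is the arclength measured counterclockwise from the point at angle 0;
    phi is the signed angle from the normal n(q) to p (sign convention assumed).
    """
    # Unit normal pointing away from the disc, i.e. into the billiard table Q.
    nx = (q[0] - center[0]) / radius
    ny = (q[1] - center[1]) / radius
    r = radius * (math.atan2(ny, nx) % (2.0 * math.pi))
    phi = math.atan2(nx * p[1] - ny * p[0], nx * p[0] + ny * p[1])
    return r, phi

def invariant_density(phi):
    """Unnormalized density cos(phi) of the invariant measure in (r, phi)."""
    return math.cos(phi)

if __name__ == "__main__":
    print(collision_coordinates((0.0, 0.0), 1.0, (1.0, 0.0), (1.0, 0.0)))
    print(collision_coordinates((0.0, 0.0), 1.0, (0.0, 1.0), (-1.0, 0.0)))
```

Outgoing velocities along the normal have $\varphi = 0$ and carry the largest weight $\cos\varphi = 1$, while grazing velocities with $\varphi = \pm\pi/2$ carry weight zero.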

Figure 1. Scattering effect and the coordinates r, ϕ.

Let us now assume that the configuration of scatterers has finite horizon, meaning that the free path of the billiard particle in $Q$ is uniformly bounded (by a constant $L > 0$). In other words, any straight line of length $L$ on the torus intersects one of the $B_i$. Under this condition, in addition to all the cited properties, the flow $\Phi_0^t$ satisfies a central limit theorem and weak invariance principle [BSC2], hence the billiard particle satisfies a diffusion equation [BSC2]. It is very likely that the flow $\Phi_0^t$ enjoys exponential decay of correlations as well, but this is not proven yet. The assumption of finite horizon seems to be necessary for the above properties, because in billiards without horizon the moving particle exhibits superdiffusive (ballistic) behavior [Bl], and the correlations seem to decay very slowly, as $\mathrm{const} \cdot t^{-1}$, see, e.g., [FM1, FM2].

Very little is known in the general case $\mathbf{F} \ne 0$, though. Clearly, a large force can change the dynamics dramatically, so that the properties of the dynamics will be determined by the character of $\mathbf{F}$ in (1.1) more than by the scattering effect of collisions with obstacles. Hence, the dynamics can be of quite generic nature. It is presently understood, due to the KAM theory, that generic mechanical (even Hamiltonian) systems are not completely hyperbolic or ergodic: typically, chaotic regions in the phase space coexist with elliptic islands of stability. So, we have to restrict ourselves to small forces that will not overcome the scattering effect of collisions with obstacles. Thus we will keep the dynamics close enough to the original billiard. Then we can hope that many properties of the system with force will be "inherited" from the original billiard model. It is now clear that the assumption on finite horizon will be necessary: without it the effect of even a small force $\mathbf{F}$ may accumulate to a dangerous level during long runs between collisions.

The system (1.1)-(1.2) with a force $\mathbf{F} \ne 0$ may be hyperbolic but is likely to admit no smooth invariant measure. Then the evolution of typical phase trajectories is governed by the so-called Sinai-Ruelle-Bowen (SRB) measures. Those measures are characterized by smooth conditional distributions on unstable manifolds. The SRB measures are the only physically observable measures; they are called non-equilibrium steady states in the language of statistical mechanics. We refer the reader to [GC, Ru, Y2] for more discussion on SRB measures and their role in hyperbolic dynamics and physics. There is a remarkable example of the system (1.1)-(1.2) well studied in the literature.
Let $\mathbf{F}$ be a small constant electric force, possibly combined with a small magnetic force, with a Gaussian thermostat added, see (2.8) below. An SRB measure was constructed and its strong ergodic and statistical properties were mathematically proved [CELS1, CELS2] for this particular model. Certain transport laws of physics were then rigorously derived, including Ohm's law, the Einstein relation, the Green-Kubo formula, etc. Similar models are now getting more and more popular in physics.

The purpose of this paper is to find general classes of forces $\mathbf{F}$ for which the system (1.1)-(1.2) has an SRB measure with good ergodic and statistical properties. In fact, we try to consider forces as general as possible, assuming only what seems to be necessary. First, we will assume that the force $\mathbf{F}$ is small, as we said above. The only other major assumption we need is an additional integral of motion. If none exists, then the phase space of the system is a four-dimensional non-compact manifold $Q \times \mathbb{R}^2$. Then we would face two almost hopeless problems. First, we can only ensure that two nonzero Lyapunov exponents exist (they are inherited from the original billiard), but the other two may be zero or arbitrarily small. This makes the system, essentially, only partially hyperbolic with little chance for any good ergodic or statistical properties. To make things worse, the non-compactness of

200

N.I. Chernov


the phase space makes the existence of physically interesting invariant measures very unlikely. In fact, without a proper temperature control (thermostating), the system will usually heat up (||p|| → ∞) or cool down (||p|| → 0), which effectively rules out interesting invariant measures. Hence, we will assume that the dynamics preserves a smooth function E(q, p), an integral of motion, and its level surface is a compact 3-D manifold. We now turn to exact assumptions on the force F in our model.

2 The model and main results

Here we state our assumptions on the force $\mathbf{F}$.

Assumption A (additional integral). A smooth function $E(q, \mathbf{p})$ is preserved by the dynamics $\Phi^t$ defined by (1.1)-(1.2). Its level surface $\mathcal{M} := \{E(q, \mathbf{p}) = \mathrm{const}\}$ is a compact 3-D manifold.

Two extra assumptions are made for convenience: (A1) $\|\mathbf{p}\| \neq 0$ on $\mathcal{M}$; (A2) for each $q \in Q$ and $\mathbf{p} \in S^1$ the ray $\{(q, s\mathbf{p}),\ s > 0\}$ intersects the manifold $\mathcal{M}$ in exactly one point.

Under the assumptions (A1)-(A2), $\mathcal{M}$ can be parameterized by $(x, y, \theta)$, where $(x, y) = q \in Q$ and $0 \le \theta < 2\pi$ is a cyclic coordinate, the angle between $\mathbf{p}$ and the positive x-axis. The dynamics (1.1)-(1.2) restricted to $\mathcal{M}$ is a flow that we denote by $\Phi^t$. In the coordinates $(x, y, \theta)$ the equations of motion (1.1) can be rewritten as
\[ \dot{x} = p\cos\theta, \qquad \dot{y} = p\sin\theta, \qquad \dot{\theta} = ph \tag{2.1} \]
where $p = \|\mathbf{p}\| > 0$ and
\[ h = (-F_1\sin\theta + F_2\cos\theta)/p^2 . \]
It is also useful to note that
\[ \dot{p} = F_1\cos\theta + F_2\sin\theta . \tag{2.2} \]
Both $h = h(x, y, \theta)$ and $p = p(x, y, \theta)$ are assumed to be $C^2$ smooth functions on $\mathcal{M}$. Due to our assumption (A1), we have
\[ 0 < p_{\min} \le p \le p_{\max} < \infty . \tag{2.3} \]
Note that at the time of reflection, the angle $\theta$ changes discontinuously, say from $\theta^-$ to $\theta^+$. The law (1.2) then imposes the restriction
\[ p(x, y, \theta^-) = p(x, y, \theta^+) \tag{2.4} \]
on the function $p$ at every point $(x, y) \in \partial Q$.

For a function $f$ on $\mathcal{M}$, let $f_x, f_y, f_\theta$ denote the partial derivatives of $f$. Denote by $\|f\|_{C^2}$ the maximum of $f$ and its first and second partial derivatives over $\mathcal{M}$. Now put
\[ B_0 = \max\{p_{\min}^{-1},\ \|p\|_{C^2},\ \|h\|_{C^2}\} . \tag{2.5} \]
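The change of variables behind (2.1)-(2.2) can be checked numerically. The following is a minimal sketch, assuming that (1.1) has the form q̇ = p, ṗ = F (as the thermostatted version (2.7) suggests); the function names are illustrative and not part of the paper. It evaluates the right-hand side of (2.1)-(2.2) and verifies, via the product rule, that the momentum vector (p cos θ, p sin θ) then changes at the rate (F1, F2).

```python
import math

def flow_rhs(theta, p, F1, F2):
    """Right-hand side of (2.1)-(2.2) for speed p = ||p|| > 0,
    direction theta and force components (F1, F2)."""
    h = (-F1 * math.sin(theta) + F2 * math.cos(theta)) / p**2
    x_dot = p * math.cos(theta)
    y_dot = p * math.sin(theta)
    theta_dot = p * h
    p_dot = F1 * math.cos(theta) + F2 * math.sin(theta)
    return x_dot, y_dot, theta_dot, p_dot

def cartesian_momentum_rate(theta, p, theta_dot, p_dot):
    """Time derivative of the vector (p cos(theta), p sin(theta)), by the product rule."""
    return (p_dot * math.cos(theta) - p * math.sin(theta) * theta_dot,
            p_dot * math.sin(theta) + p * math.cos(theta) * theta_dot)
```

Feeding the outputs `theta_dot` and `p_dot` of `flow_rhs` into `cartesian_momentum_rate` should return the force components that were passed in, up to rounding.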

Assumption B (smallness of the force). We assume that the force $\mathbf{F}$ and its first derivatives are small enough. This means that
\[ \delta_0 = \max\{|h|, |h_x|, |h_y|, |h_\theta|\} \]
is sufficiently small. More precisely, we require that for any given $B_* > 0$ there should be a small $\delta_* = \delta_*(Q, B_*)$ such that all our results will hold whenever $B_0 < B_*$ and $\delta_0 < \delta_*$.

Remark. The geometric curvature of the trajectories of the particle on the torus is $\dot{\theta}/p = h$. By Assumption B, it is small, and so the trajectories are nearly straight lines. This has an important implication: no trajectory can collide with one body $B_i$ more than once during a short interval of time. Hence, the distance between collisions is uniformly bounded below by a positive constant $L_{\min} > 0$. The time between collisions is uniformly bounded below as well, by $t_{\min} = L_{\min}/p_{\max}$.

Lastly, we state our assumption on finite horizon:

Assumption C (finite horizon). There is an $L > 0$ so that every straight line of length $L$ on the torus $\mathbb{T}^2$ crosses at least one obstacle $B_i$.

Remark. Under Assumptions B and C, every trajectory of length $L_{\max}$ for the system (1.1)-(1.2), for some $L_{\max} > L$, must hit a scatterer $B_i$. So, the collision-free path of the particle is uniformly bounded by $L_{\max}$. The time between collisions is uniformly bounded as well, by $t_{\max} = L_{\max}/p_{\min}$.

Consider the two-dimensional cross-section of the manifold $\mathcal{M}$ of Assumption A:
\[ M := \{(q, \mathbf{p}) \in \mathcal{M} : q \in \partial Q,\ (\mathbf{p} \cdot n(q)) \ge 0\} \tag{2.6} \]
which, as $M_0$ in (1.3), consists of all outgoing velocity vectors at reflection points. The first return map $T : M \to M$ is then well defined; we also call it the collision map. The cross-section $M$ can be parameterized by $(r, \varphi)$, where $r$ is the arclength parameter along $\partial Q$ and $\varphi \in [-\pi/2, \pi/2]$ is the angle between $\mathbf{p}$ and $n(q)$. In these coordinates, $M = \partial Q \times [-\pi/2, \pi/2]$, the same as $M_0$ in (1.3). We choose the orientation of the coordinates $r$ and $\varphi$ as shown on Fig. 1 (where $r$ and $\varphi$ increase in the direction of arrows). Also, denote by $K(r) > 0$ the curvature of the curve $\partial Q$ at the point with coordinate $r$.

There are two particularly interesting types of forces satisfying our assumptions A and B:

Type 1 forces (potential forces). Consider an isotropic force $\mathbf{F} = \mathbf{F}(q)$ (independent of $\mathbf{p}$) such that $\mathbf{F} = -\nabla U$, where $U = U(q)$ is a (small) potential function. Note that $U(x, y)$ must be a smooth function on the torus, so it is necessarily a periodic function in $x$ and $y$. These forces preserve the total energy $T = \frac{1}{2}\|\mathbf{p}\|^2 + U(q)$. In this case we can set $T = 1/2$, so that $\|\mathbf{p}\|^2 = 1 - 2U(q) \approx 1$, assuming $U(q)$ is small.

Remark. Type 1 forces preserve the Lebesgue measure $dx\,dy\,d\theta$ on the manifold $\mathcal{M}$, since the divergence of the vector field (2.1) vanishes. This follows from the
equation $\mathbf{F} = -\nabla U$ by direct calculations. Therefore, the collision map $T : M \to M$ also preserves a smooth measure $\nu$. Sinai billiards under type 1 forces have been studied in numerous papers, and, in most cases, ergodicity and Bernoulli property were established.

Type 2 forces (isokinetic forces). Consider forces satisfying $(\mathbf{F} \cdot \mathbf{p}) = 0$. They preserve the kinetic energy, i.e. $K = \frac{1}{2}\|\mathbf{p}\|^2 = \mathrm{const}$. In this case we can set $\|\mathbf{p}\| = 1$, as in billiards. Note that the equations in (2.1) hold with $p = 1$ and $|h| = \|\mathbf{F}\|$ (the sign of $h$ is determined by the direction of $\mathbf{F}$).

Example of type 2 forces: thermostating. Let $\mathbf{F}$ be an arbitrary force. One can modify the equations (1.1) so that the kinetic energy will be preserved:
\[ \dot{q} = \mathbf{p}, \qquad \dot{\mathbf{p}} = \mathbf{F} - \alpha\mathbf{p}, \qquad \text{where} \qquad \alpha = (\mathbf{F} \cdot \mathbf{p})/(\mathbf{p} \cdot \mathbf{p}) . \tag{2.7} \]
It is easy to verify that $\|\mathbf{p}\| = \mathrm{const}$. The added term $\alpha\mathbf{p}$ is called a Gaussian thermostat; it satisfies the Gaussian principle of least constraint. Also, $\alpha$ is called the Gaussian friction coefficient.

Example: electric and magnetic fields. A well studied example of a force of type 2 is the following, see [CELS1]:
\[ \mathbf{F}(q, \mathbf{p}) = \mathbf{E} + [\mathbf{B} \times \mathbf{p}] - \alpha\mathbf{p} . \tag{2.8} \]
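As a quick numerical illustration of (2.7)-(2.8), the sketch below evaluates the thermostatted force in the plane and checks that it does no work, i.e. F · p = 0, so that ||p|| is conserved. It is only a sketch: the function names are hypothetical, and the magnetic field is represented by a single number B, the component perpendicular to the table.

```python
def thermostatted_force(E, B, p):
    """Force (2.8) in the plane: F = E + [B x p] - alpha * p.

    E = (E1, E2) is the electric field, B is the out-of-plane component of
    the magnetic field, p = (p1, p2) is the momentum, and
    alpha = (E . p) / (p . p) is the Gaussian friction coefficient.
    """
    (E1, E2), (p1, p2) = E, p
    alpha = (E1 * p1 + E2 * p2) / (p1 * p1 + p2 * p2)
    # (0, 0, B) x (p1, p2, 0) = (-B * p2, B * p1, 0)
    F1 = E1 - B * p2 - alpha * p1
    F2 = E2 + B * p1 - alpha * p2
    return F1, F2

def power(F, p):
    """F . p, which equals d/dt (||p||^2 / 2) when dp/dt = F."""
    return F[0] * p[0] + F[1] * p[1]
```

For any nonzero momentum, `power(thermostatted_force(E, B, p), p)` should vanish up to rounding, because the magnetic term is perpendicular to p and the thermostat removes exactly the work done by E.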

Here $\mathbf{E}$ is a small constant electric field, $\mathbf{B}$ is a small constant magnetic field (a vector in $\mathbb{R}^3$ perpendicular to the billiard table $Q$), and $\alpha\mathbf{p}$ is the Gaussian thermostat, $\alpha = (\mathbf{E} \cdot \mathbf{p})/(\mathbf{p} \cdot \mathbf{p})$. If $\mathbf{E} = 0$ (thus $\alpha = 0$), then the system preserves the Lebesgue measure $dx\,dy\,d\theta$ on $\mathcal{M}$, just as does the pure billiard dynamics $\Phi_0^t$. If $\mathbf{E} \neq 0$, then the system has no absolutely continuous invariant measure, but has a unique SRB measure with good ergodic and statistical properties. We refer the reader to [CELS1] for a detailed study of the system (2.8). Recently, M. Wojtkowski found explicit conditions on the field $\mathbf{E}$ under which the system is hyperbolic [W2, W3].

Now we state the main results of this paper.

Theorem 2.1 Under Assumptions A, B, and C, the map $T : M \to M$ is a uniformly hyperbolic map with singularities. It admits a unique SRB measure $\nu$, which is positive on open sets, K-mixing and Bernoulli.

The next theorem concerns the statistical properties of the map $T : M \to M$. Let $H_\eta$ be the class of Hölder continuous functions on $M$ with exponent $\eta > 0$:
\[ H_\eta = \{f : M \to \mathbb{R} \mid \exists C > 0 : |f(X) - f(Y)| \le C\,[\mathrm{dist}(X, Y)]^\eta,\ \forall X, Y \in M\} . \]
We say that $(T, \nu)$ has exponential decay of correlations for Hölder continuous functions if for all $\eta > 0$ there is $\lambda = \lambda(\eta) \in (0, 1)$ such that for all $f, g \in H_\eta$ and some $C = C(f, g) > 0$ we have
\[ \left| \int_M (f \circ T^n)\, g \, d\nu - \int_M f \, d\nu \int_M g \, d\nu \right| \le C\lambda^{|n|} \tag{2.9} \]
for all $n \in \mathbb{Z}$. We say that $(T, \nu)$ satisfies the central limit theorem for Hölder continuous functions if for all $\eta > 0$ and $f \in H_\eta$ with $\int f\,d\nu = 0$, there is $\sigma_f \ge 0$ such that
\[ \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} f \circ T^i \ \xrightarrow{\ \mathrm{distr}\ }\ N(0, \sigma_f^2) \tag{2.10} \]
which means the convergence in distribution to the normal law $N(0, \sigma_f^2)$. Furthermore, $\sigma_f = 0$ iff $f$ is cohomologous to zero, i.e. $f = g \circ T - g$ for some $g \in L^2(\nu)$.

Theorem 2.2 The measure $\nu$ enjoys exponential decay of correlations and satisfies the central limit theorem. The decay of correlations is uniform in the force $\mathbf{F}$, i.e. the constants $\lambda$ and $C$ in (2.9) are independent of $\mathbf{F}$.

We only remark that Theorem 2.1 easily implies that the flow $\Phi^t : \mathcal{M} \to \mathcal{M}$ is fully hyperbolic and has a unique SRB measure $\mu$ that is ergodic and positive on open sets. In a forthcoming paper, we will prove that the flow $\Phi^t$ is actually mixing and Bernoulli and satisfies the central limit theorem.

3 Hyperbolicity of $\Phi^t$ and $T$

Our first goal is to prove that the flow $\Phi^t$ on $\mathcal{M}$ is hyperbolic, i.e. it has one positive and one negative Lyapunov exponent. The hyperbolicity is usually obtained by constructing a family of invariant cones in the tangent space [W1]. For Sinai billiards, invariant cones have a clear geometrical interpretation. Unstable cones correspond to divergent bundles of trajectories, and stable cones to convergent bundles of trajectories. Any divergent bundle of trajectories remains divergent upon reflections off convex obstacles, as in Fig. 1; this easily implies the invariance of the unstable cones. Similarly, any convergent bundle of trajectories remains convergent in the past (as they flow backwards).

We will prove that in our dynamics a sufficiently divergent bundle of trajectories remains divergent in the future. Obviously, we need to consider runs between collisions carefully and make sure that the divergence is not lost there. We use some new techniques to do that.

Let $P = (x, y, \theta) \in \mathcal{M}$ be an arbitrary point and $dP = (dx, dy, d\theta) \in T_P\mathcal{M}$ a tangent vector at $P$. Pick a smooth curve $P_s = (x_s, y_s, \theta_s) \subset \mathcal{M}$ tangent to the vector $dP$ at the point $P$, i.e. assume that $P_0 = P$ and $P'_0 = dP$. In the calculations below, we denote differentiation with respect to the auxiliary parameter $s$ by primes and that with respect to time $t$ by dots. In particular, $\dot{P} = (\dot{x}, \dot{y}, \dot{\theta}) = (p\cos\theta, p\sin\theta, ph)$ is the velocity vector of the flow $\Phi^t$. It is not to be confused with the velocity vector $(\dot{x}, \dot{y}) = (p\cos\theta, p\sin\theta)$ of the moving particle on the torus; the latter will be referred to as the particle velocity.

Now consider $P_{st} = (x_{st}, y_{st}, \theta_{st}) := \Phi^t P_s$. The points $P_{st}$ make a two-dimensional surface $S$ in $\mathcal{M}$. It is standard that
\[ D\Phi^t(dP) = P'_{0t} = \frac{d}{ds} P_{st}\Big|_{s=0} . \]

In subsequent formulas, all the calculations will be done at the point $P_{0t}$, where $s = 0$, and for brevity we will often drop the subscript $0t$. Note that the vectors
\[ P' = (x', y', \theta') \quad\text{and}\quad \dot{P} = (\dot{x}, \dot{y}, \dot{\theta}) = (p\cos\theta, p\sin\theta, ph) \tag{3.1} \]
are both tangent vectors to $S$ at the point $P$ $(= P_{0t})$. We introduce two quantities
\[ v = x'\cos\theta + y'\sin\theta \quad\text{and}\quad w = -x'\sin\theta + y'\cos\theta . \tag{3.2} \]
It is easy to see that $v$ is the component of the vector $(x', y')$ parallel to the particle velocity $(\dot{x}, \dot{y})$, and $w$ is the perpendicular component of $(x', y')$. Solving (3.2) for $x', y'$ gives
\[ x' = v\cos\theta - w\sin\theta \quad\text{and}\quad y' = v\sin\theta + w\cos\theta . \tag{3.3} \]
Now let
\[ \alpha = v/w \quad\text{and}\quad \kappa = (\theta' - vh)/w . \tag{3.4} \]
So, $\alpha$ is the cotangent of the angle between the vector $(x', y')$ and the particle velocity $(\dot{x}, \dot{y})$. To describe $\kappa$ geometrically, consider the one parameter family of trajectories $\{(x_{st}, y_{st})\}$ on the torus, where $s$ is the parameter of the family and $t$ is the internal parameter along each trajectory. Then $\kappa$ is the curvature of the orthogonal cross-section of this family. Furthermore, $\kappa > 0$ corresponds to divergent families, $\kappa < 0$ to convergent families, and $\kappa = 0$ to parallel families. Also, note that $|w|$ is the width of that family in the direction perpendicular to the particle velocity, per unit increment of the parameter $s$.

Now, consider two vectors
\[ U = (\cos\theta, \sin\theta, h) \quad\text{and}\quad R = (-\sin\theta, \cos\theta, \kappa) . \]
Both are tangent vectors to the surface $S$, as follows from the equations
\[ \dot{P} = pU \quad\text{and}\quad P' = vU + wR . \]
The vector $U$ is obtained by taking a unit vector $(\cos\theta, \sin\theta)$ in the direction of the particle velocity $(\dot{x}, \dot{y})$ and lifting it to a tangent vector to the surface $S$. Similarly, the vector $R$ is obtained by taking a unit vector $(-\sin\theta, \cos\theta)$ in the perpendicular direction and lifting it to a tangent vector to the surface $S$. Denote by $p_U$ and $p_R$ the ‘scaled’ directional derivatives of the function $p$ along the vectors $U$, $R$, respectively, defined by
\[ p_U = p_x\cos\theta + p_y\sin\theta + p_\theta h, \qquad p_R = -p_x\sin\theta + p_y\cos\theta + p_\theta\kappa . \]
Similarly,
\[ h_U = h_x\cos\theta + h_y\sin\theta + h_\theta h, \qquad h_R = -h_x\sin\theta + h_y\cos\theta + h_\theta\kappa . \]

It is then straightforward that
\[ p' = p_U v + p_R w \quad\text{and}\quad h' = h_U v + h_R w \tag{3.5} \]
upon direct differentiation and substitution of (3.3) and using
\[ \theta' = \kappa w + hv . \tag{3.6} \]
It is also easy to see that
\[ \dot{p} = p_U\, p \quad\text{and}\quad \dot{h} = h_U\, p . \]

Lemma 3.1 The evolution of the quantities $\kappa, w, \alpha$ is given by the equations
\[ \dot{\kappa}/p = -\kappa^2 - h^2 + h_R \tag{3.7} \]
\[ \dot{w} = p\kappa w \tag{3.8} \]
and
\[ \dot{\alpha} = -p\kappa\alpha + p_U\alpha + p_R + ph . \tag{3.9} \]

Proof. First, note that $dx'/dt = (\dot{x})' = p'\cos\theta - p\sin\theta \cdot \theta'$ and similarly $dy'/dt = p'\sin\theta + p\cos\theta \cdot \theta'$. Also, $d\theta'/dt = (\dot{\theta})' = p'h + ph'$. Hence
\[ \dot{v} = p' + phw \quad\text{and}\quad \dot{w} = p\theta' - phv = p\kappa w . \]
Then direct differentiation of the equations (3.4) and substitution of (3.5) completes the proof. ✷

Let $\tau$ be the length parameter along the trajectory $(x_{0t}, y_{0t})$ on the torus, i.e. $d\tau/dt = p$. Then the equations (3.7)-(3.9) can be rewritten as
\[ d\kappa/d\tau = -\kappa^2 - h^2 + h_R \tag{3.10} \]
\[ dw/d\tau = \kappa w \tag{3.11} \]
and
\[ d\alpha/d\tau = -\kappa\alpha + h + p_U\alpha/p + p_R/p . \tag{3.12} \]
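To see what (3.10)-(3.11) say in the simplest situation, one can integrate them numerically. The sketch below is an illustration only, not the method of the paper: it uses a hand-written classical Runge-Kutta step, and the terms h and h_R are supplied as user-chosen callables (hypothetical stand-ins for the true functions on the manifold). With h = h_R = 0 (pure billiard free flight) the exact solution is κ(τ) = 1/(1/κ0 + τ) and w(τ) = w0 (1 + κ0 τ), which the numerical result should reproduce.

```python
def rk4_step(f, y, tau, dtau):
    """One classical fourth-order Runge-Kutta step for y' = f(tau, y)."""
    k1 = f(tau, y)
    k2 = f(tau + dtau / 2, [yi + dtau / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(tau + dtau / 2, [yi + dtau / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(tau + dtau, [yi + dtau * ki for yi, ki in zip(y, k3)])
    return [yi + dtau / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integrate_family(kappa0, w0, length,
                     h=lambda tau: 0.0,
                     h_R=lambda tau, kappa: 0.0,
                     steps=2000):
    """Integrate (3.10)-(3.11) over 0 <= tau <= length.

    h and h_R are illustrative callables; the defaults give the
    pure billiard case d(kappa)/d(tau) = -kappa^2, dw/d(tau) = kappa * w.
    Returns [kappa(length), w(length)].
    """
    def rhs(tau, y):
        kappa, w = y
        return [-kappa**2 - h(tau)**2 + h_R(tau, kappa), kappa * w]

    y, dtau = [kappa0, w0], length / steps
    for i in range(steps):
        y = rk4_step(rhs, y, i * dtau, dtau)
    return y
```

Passing a small nonzero `h` perturbs the result only slightly, which is the qualitative content of the estimates that follow for small forces.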

Now consider a reflection at some $\partial B_i \subset \partial Q$ experienced by the family $P_{st}$. For every $s$, the trajectory $P_{st}$ reflects in $\partial B_i$ at some moment of time $t = t_s$. The outgoing velocity vectors of these trajectories (taken immediately after the reflection) make a curve $\gamma$ in $M$; we call it the trace of the family $P_{st}$ (on $M$). Let $\gamma$ satisfy an equation $\varphi = \varphi(r)$ in the coordinates $r, \varphi$ introduced after (2.6). Note that the curve $\gamma$ is also parameterized by $s$, because each point corresponds to a trajectory of the family $P_{st}$. The quantities $\theta, \alpha, \kappa, w, v, p, h$ may change discontinuously at the reflection. Denote by $\theta^-, \alpha^-, \kappa^-$, etc., their values before the reflection and by $\theta^+, \alpha^+, \kappa^+$, etc., their values after the reflection. Actually, we have $p^+ = p^-$ by (2.4).

Lemma 3.2 The derivative $t' = dt_s/ds$ satisfies
\[ t' = \mp(w^\pm\tan\varphi \pm v^\pm)/p^\pm . \tag{3.13} \]
The derivative $dr/ds$ on $\gamma$ satisfies
\[ dr/ds = \mp w^\pm/\cos\varphi . \tag{3.14} \]
The derivative of the function $\varphi = \varphi(r)$ satisfies
\[ d\varphi/dr = \mp K(r) + \kappa^\pm\cos\varphi \mp h^\pm\sin\varphi . \tag{3.15} \]
Recall that $K(r) > 0$ is the curvature of the boundary $\partial B_i \subset \partial Q$ at the point $r \in \partial Q$ on the torus $\mathbb{T}^2$.

Proof. Let the boundary $\partial B_i$ satisfy an equation $G(x, y) = 0$, where the function $G$ is chosen so that its gradient vector $(G_x, G_y)$ is a normal vector to $\partial B_i$ pointing inside $Q$ (outside $B_i$). Then $t_s$ satisfies the equation $G(x_{st_s}, y_{st_s}) = 0$. Differentiating with respect to $s$ before and after the reflection gives, respectively,
\[ [(x')^\pm + (\dot{x})^\pm t']\,G_x + [(y')^\pm + (\dot{y})^\pm t']\,G_y = 0 . \tag{3.16} \]
A simple geometric analysis shows that, in the orientation of $\varphi$ specified after (2.6), we have
\[ G_y/G_x = \tan(\theta^+ + \varphi) = \tan(\theta^- - \varphi) . \tag{3.17} \]
Solving (3.16) for $t'$ then gives
\[ t' = -\frac{(x')^\pm + (y')^\pm\tan(\theta^\pm \pm \varphi)}{(\dot{x})^\pm + (\dot{y})^\pm\tan(\theta^\pm \pm \varphi)} . \]
Substitution of (3.1) and (3.3) yields (3.13). Next, we have
\[ |dr/ds| = \sqrt{(x' + \dot{x}t')^2 + (y' + \dot{y}t')^2} , \]
where $x', \dot{x}, y', \dot{y}$ are all taken either before the reflection or after it. Using (3.1), (3.3) and (3.13) and taking into account our orientation of the coordinate $r$ (to determine the sign of $dr/ds$) yields (3.14). Another simple geometric inspection shows that $K(r) = -d(\tan^{-1}(G_x/G_y))/dr$ in our orientation of the coordinate $r$. Therefore, using (3.17) gives
\[ d\varphi/dr = -K(r) - d\theta^+/dr = K(r) + d\theta^-/dr . \]
Lastly,
\[ \frac{d\theta^\pm}{dr} = \frac{d\theta^\pm}{ds} \Big/ \frac{dr}{ds} = \mp\big[(\theta')^\pm + (\dot{\theta})^\pm t'\big] \big/ \big[w^\pm/\cos\varphi\big] . \]
Now substituting (3.1), (3.6), and (3.13) proves (3.15). The lemma is proved. ✷
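The substitution step in the proof of Lemma 3.2 is easy to get wrong by a sign, so a numerical cross-check may be helpful. The sketch below (function names are illustrative) evaluates t' from the quotient obtained by solving (3.16) with (3.17), using (3.1) and (3.3) for the derivatives, and compares it with the closed form (3.13). It also evaluates |dr/ds| from the square-root formula for comparison with (3.14). The argument `sign` is +1 for quantities taken after the reflection and -1 for quantities taken before it.

```python
import math

def t_prime_quotient(theta, phi, v, w, p, sign):
    """t' from the quotient obtained by solving (3.16), using (3.17)."""
    beta = theta + sign * phi                       # theta^± ± phi
    xs = v * math.cos(theta) - w * math.sin(theta)  # x' by (3.3)
    ys = v * math.sin(theta) + w * math.cos(theta)  # y' by (3.3)
    xt = p * math.cos(theta)                        # x-dot by (3.1)
    yt = p * math.sin(theta)                        # y-dot by (3.1)
    return -(xs + ys * math.tan(beta)) / (xt + yt * math.tan(beta))

def t_prime_formula(phi, v, w, p, sign):
    """Closed form (3.13): t' = ∓(w^± tan(phi) ± v^±) / p^±."""
    return -sign * (w * math.tan(phi) + sign * v) / p

def dr_ds_abs(theta, v, w, p, t_prime):
    """|dr/ds| from the square-root formula in the proof."""
    xs = v * math.cos(theta) - w * math.sin(theta)
    ys = v * math.sin(theta) + w * math.cos(theta)
    return math.hypot(xs + p * math.cos(theta) * t_prime,
                      ys + p * math.sin(theta) * t_prime)
```

For either sign, the two expressions for t' should agree, and `dr_ds_abs` should return |w| / cos ϕ, as long as the tangent in the quotient is finite.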

Lemma 3.3 At each reflection, we have $v^+ = v^-$, i.e. $v$ remains unchanged. Also, $w^+ = -w^-$ and hence $\alpha^+ = -\alpha^-$. The variable $\kappa$ changes by the rule
\[ \kappa^+ = \kappa^- + \Delta\kappa \tag{3.18} \]
where
\[ \Delta\kappa = \frac{2K(r) + (h^+ + h^-)\sin\varphi}{\cos\varphi} . \tag{3.19} \]
Generally, there is no relation between $h^+$ and $h^-$.

All this directly follows from the previous lemma. ✷

Note that by setting $h \equiv 0$ in (3.19) we recover the well known equation $\Delta\kappa = 2K(r)/\cos\varphi$ for billiards derived by Sinai, see e.g. [Si, BSC1]. Observe that $\Delta\kappa$ does not depend on the family of trajectories $P_{st}$ (i.e., on $\alpha, \kappa, w$). It only depends on the point $(r, \varphi) \in M$. Hence it is a (smooth) function on the cross-section $M$; we call it $\Theta(r, \varphi)$, i.e.
\[ \Theta(r, \varphi) = \frac{2K(r) + (h^+ + h^-)\sin\varphi}{\cos\varphi} . \]
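The reflection rule of Lemma 3.3 is simple enough to state as code. The following sketch (names are illustrative, not from the paper) computes the jump (3.19) and applies the full update of (v, w, κ) at a collision. With the default h⁺ = h⁻ = 0 it reduces to the pure billiard value 2K(r)/cos ϕ.

```python
import math

def delta_kappa(K, phi, h_plus=0.0, h_minus=0.0):
    """Jump of the curvature at a reflection, formula (3.19)."""
    return (2.0 * K + (h_plus + h_minus) * math.sin(phi)) / math.cos(phi)

def reflect(v, w, kappa, K, phi, h_plus=0.0, h_minus=0.0):
    """Lemma 3.3: v is unchanged, w changes sign, kappa jumps by (3.19)."""
    return v, -w, kappa + delta_kappa(K, phi, h_plus, h_minus)
```

Since α = v/w, the sign change of w alone accounts for α⁺ = −α⁻. The factor 1/cos ϕ makes the jump large for nearly grazing collisions.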

Note that this function has a positive lower bound,
\[ \Theta(r, \varphi) \ge \Theta_{\min} = 2\min_r K(r) - 2\delta_0 > 0 , \]
but it is not bounded above (as, indeed, $\cos\varphi$ may be arbitrarily close to zero during almost ‘grazing’ reflections).

We now consider a family of trajectories that are divergent before some reflection, i.e. assume $\kappa^- > 0$. Then $\kappa^+ > \Theta_{\min}$ by (3.18), so the curvature of the family is big enough after the reflection. Denote by $L$ the length of the trajectory on the torus between the current and the next reflection points, and parameterize this trajectory segment by the length parameter $\tau$, $0 \le \tau \le L$. Recall that the free path between consecutive reflections is uniformly bounded, hence $L_{\min} \le L \le L_{\max}$.

Lemma 3.4 Let $\kappa^- > 0$. Then
\[ \frac{1}{(\kappa^+)^{-1} + \tau} - \delta_1 \le \kappa_\tau \le \frac{1}{(\kappa^+)^{-1} + \tau} + \delta_1 \tag{3.20} \]
for all $0 < \tau < L$. Here $\delta_1$ is a small constant that depends only on $\delta_0$ in Assumption B and on $\Theta_{\min}$.

Proof. The equation (3.10) and Assumption B imply
\[ -\kappa^2 - \delta_0\kappa - 2\delta_0 \le d\kappa/d\tau \le -\kappa^2 + \delta_0\kappa + 2\delta_0 \]
assuming that $\kappa > 0$. Hence,
\[ -(\kappa + \delta')^2 \le d\kappa/d\tau \le -(\kappa - \delta')^2 + (\delta'')^2 \tag{3.21} \]
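where $\delta' = (2\delta_0)^{1/2}$ and $\delta'' = (4\delta_0)^{1/2}$. The smallness of $\delta', \delta''$ and the initial bound $\kappa_0 = \kappa^+ \ge \Theta_{\min} > 0$ allows direct integration of (3.21) resulting in
\[ \frac{1}{(\kappa^+ + \delta')^{-1} + \tau} - \delta' \le \kappa_\tau \le \delta' + \delta''\,\frac{Ae^{2\delta''\tau} + 1}{Ae^{2\delta''\tau} - 1} \]
where
\[ A = \frac{\kappa^+ - \delta' + \delta''}{\kappa^+ - \delta' - \delta''} \]
(this, in particular, justifies the assumption $\kappa > 0$). Since $\delta', \delta''$ are small, one can now easily obtain (3.20) with $\delta_1 = \delta' + 2\delta''$. ✷

Convention (on δ’s). Throughout the paper, we denote by $\delta_i$ various small constants that depend on the domain $Q$ and $\delta_0$ in Assumption B so that all $\delta_i \to 0$ as $\delta_0 \to 0$. Hence, all those constants are effectively assumed to be small enough.

Denote
\[ \kappa_{\min} := \frac{1}{\Theta_{\min}^{-1} + L_{\max}} - \delta_1 \quad\text{and}\quad \kappa^-_{\max} := \frac{1}{L_{\min}} + \delta_1 . \]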
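For a concrete table these constants are straightforward to evaluate. The helper below is a small sketch with hypothetical argument names. It computes Θ_min from the minimal boundary curvature and δ0, and then κ_min and κ⁻_max from the free-path bounds and δ1, exactly as in the displayed definitions. It rejects inputs for which Θ_min or κ_min would fail to be positive, since the estimates above rely on that positivity.

```python
def curvature_bounds(K_min, delta0, delta1, L_min, L_max):
    """Return (Theta_min, kappa_min, kappa_max_minus).

    K_min  : minimum of the boundary curvature K(r)
    delta0 : smallness parameter of Assumption B
    delta1 : the small constant of Lemma 3.4
    L_min, L_max : bounds on the free path between collisions
    """
    theta_min = 2.0 * K_min - 2.0 * delta0
    if theta_min <= 0.0:
        raise ValueError("delta0 too large: Theta_min must be positive")
    kappa_min = 1.0 / (1.0 / theta_min + L_max) - delta1
    if kappa_min <= 0.0:
        raise ValueError("delta1 too large: kappa_min must be positive")
    kappa_max_minus = 1.0 / L_min + delta1
    return theta_min, kappa_min, kappa_max_minus
```

For example, with K_min = 1, L_min = 0.5, L_max = 2 and no force (δ0 = δ1 = 0), the helper returns Θ_min = 2, κ_min = 0.4 and κ⁻_max = 2.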

Corollary 3.5 If a family of trajectories is divergent before a reflection at time $t_0$, i.e., $\kappa_{t_0 - 0} > 0$, then for all $t > t_0$ we have $\kappa_t > \kappa_{\min} > 0$, i.e. the curvature of the family stays bounded away from zero. In addition, at each reflection that occurs after the time $t_0$ we have $\kappa^- \le \kappa^-_{\max}$, i.e. the curvature of the family falling upon $\partial Q$ is uniformly bounded above.

We call a family of trajectories $P_{st}$ strongly divergent on a time interval $(t_1, t_2)$ if $\kappa_t \ge \kappa_{\min}$ for all $t_1 < t < t_2$ (and then, of course, for all $t > t_1$). We emphasize the following:

“Invariance principle”. Any strongly divergent family of trajectories remains strongly divergent in the future under the flow $\Phi^t$.

We note that later on some additional restrictions on the class of strongly divergent families will be assumed (see, e.g., the convention on α’s below), but this invariance principle will hold.

Remark. We do not assume that the derivatives $p_x, p_y, p_\theta$ are small; they are just bounded as the function $p$ is smooth on a compact manifold $\mathcal{M}$. It is important, though, that the function $p_U = \dot{p}/p = d(\ln p)/dt$ has uniformly bounded integrals along any orbit segment of the flow:
\[ \left| \int_{t_1}^{t_2} p_U \, dt \right| \le \mathrm{const} = \ln(p_{\max}/p_{\min}) < \infty \]
for any $t_1 < t_2$.

Lemma 3.6 There are constants $\alpha_{\max}$ and $\bar{\alpha}_{\max}$ such that for any strongly divergent family of trajectories on the interval $(t_0, \infty)$ we have $|\alpha_t| \le \alpha_{\max}$ eventually, for all $t > t_1$ (where $t_1$ depends on $\alpha_{t_0}$). Moreover, if $|\alpha_{t_0}| < \alpha_{\max}$, then $|\alpha_t| \le \bar{\alpha}_{\max}$ for all $t > t_0$.

Proof. At every reflection, $\alpha$ simply changes sign, i.e. $|\alpha^+| = |\alpha^-|$. Due to (3.12), we have
\[ d\alpha/d\tau = -\kappa(\alpha - p_\theta/p) + (p_U/p)\alpha + (-p_x\sin\theta + p_y\cos\theta)/p + h . \tag{3.22} \]
Note that the terms $p_\theta/p$, $p_U/p$, and $(-p_x\sin\theta + p_y\cos\theta)/p + h$ are uniformly bounded. Since $\kappa \ge \kappa_{\min} > 0$, the first term in (3.22) drives $\alpha$ back whenever it gets too large. The influence of the second term, $(p_U/p)\alpha$, is uniformly bounded by the previous remark. ✷

Convention (on α’s). In all that follows, we will only consider strongly divergent families that satisfy $|\alpha_t| \le \alpha_{\max}$ for all relevant $t$. We also assume that the “invariance principle” holds, as we may in view of Lemma 3.6.

Lemma 3.7 For any strongly divergent family of trajectories on an interval $(t_0, \infty)$, its width $|w|$ grows exponentially in time:
\[ |w_t| = |w_{t_0}| \exp\Big( \int_{t_0}^{t} p_u\kappa_u \, du \Big) \ge |w_{t_0}|\, e^{c(t - t_0)} \]
where $c = p_{\min}\kappa_{\min} > 0$.

We also need the invariance and exponential growth for sufficiently convergent families of trajectories as they flow backwards in time. The following trick will do the job.

Time reversal principle. There is a convenient way to study backward dynamics $\Phi^t$ as $t \to -\infty$. Consider the involution map $I : (x, y, \theta) \mapsto (x, y, \theta + \pi)$ on $\mathcal{M}$. The flow $\Phi^t_- := I \circ \Phi^{-t} \circ I$ is governed by the equations
\[ \dot{x} = p_-\cos\theta, \qquad \dot{y} = p_-\sin\theta, \qquad \dot{\theta} = p_- h_- \tag{3.23} \]
where $p_-(x, y, \theta) = p(x, y, \theta + \pi)$ and $h_-(x, y, \theta) = -h(x, y, \theta + \pi)$. So, equations (3.23) are similar to (2.1). The new flow $\Phi^t_-$ satisfies Assumption A, quite obviously, and Assumption B, because the function $h_-$ and its partial derivatives are the negatives of those of $h$. Thus, all the properties of the flow $\Phi^t$ also hold for $\Phi^t_-$. It is clear that convergent families of trajectories and their backward evolution correspond to divergent families of the flow $\Phi^t_-$ and their forward evolution. Hence, all the properties we proved and assumptions we made for divergent families have their counterparts for convergent families. We will say that a family is strongly convergent on an interval $(t_1, t_2)$ if $\kappa_t \le -\kappa_{\max,-}$ where $\kappa_{\max,-} > 0$ is the constant defined just as $\kappa_{\max}$, but for the flow $\Phi_-$. Our convention on α’s and the “invariance principle” (under the backward dynamics) apply to strongly convergent families, and they grow (in terms of the width $w$) exponentially in time as $t \to -\infty$.
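The passage from (p, h) to (p₋, h₋) in (3.23) can be written as a small higher-order function. The sketch below is illustrative only: the names are hypothetical, and the two sample functions h are toy data (unit speed, with either a force whose components are frozen at constant values independent of θ, or a constant curvature h = B of the kind produced by a purely magnetic force). It shows that the first sample satisfies h₋ = h, while the second gives h₋ = −h.

```python
import math

def involution(x, y, theta):
    """I : (x, y, theta) -> (x, y, theta + pi), with theta reduced mod 2*pi."""
    return x, y, (theta + math.pi) % (2.0 * math.pi)

def reversed_coefficients(p, h):
    """Return (p_minus, h_minus) of (3.23) for callables p, h of (x, y, theta)."""
    def p_minus(x, y, theta):
        return p(x, y, theta + math.pi)

    def h_minus(x, y, theta):
        return -h(x, y, theta + math.pi)

    return p_minus, h_minus

# Toy data (hypothetical values, for illustration only).
def p_unit(x, y, theta):
    return 1.0

def h_position_force(x, y, theta, F1=0.03, F2=-0.01):
    # h from (2.1) with p = 1 and force components independent of theta
    return -F1 * math.sin(theta) + F2 * math.cos(theta)

def h_magnetic(x, y, theta, B=0.2):
    # constant trajectory curvature, as for a purely magnetic force with p = 1
    return B
```

Applying `involution` twice returns the original point (up to rounding in the angle), which is what makes I an involution.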

Remark. In a particular case where
\[ p(x, y, \theta) = p(x, y, \theta + \pi) \quad\text{and}\quad h(x, y, \theta) = -h(x, y, \theta + \pi) \]
the flows $\Phi^t$ and $\Phi^t_-$ coincide. Then we say that the flow $\Phi^t$ is time reversible. Time reversibility is quite common in many models of direct physical origin. For example, potential forces (type 1) are always time reversible. The model (2.8) is time reversible, though, only if $\mathbf{B} = 0$. Generally, time reversibility does not follow from Assumptions A and B.

We now arrive at the first major theorem.

Theorem 3.8 (Hyperbolicity) The flow $\{\Phi^t\}$ on $\mathcal{M}$ is hyperbolic with respect to any invariant measure, i.e. it has one positive and one negative Lyapunov exponent almost everywhere. The unstable tangent vector $dP^u = (dx^u, dy^u, d\theta^u)$ at a point $P \in \mathcal{M}$ corresponds to a family of trajectories that is strongly divergent at all times $(-\infty < t < \infty)$. The stable tangent vector $dP^s = (dx^s, dy^s, d\theta^s)$ corresponds to a strongly convergent family of trajectories at all times. The angle between the particle velocity $(\dot{x}, \dot{y})$ and the vector $(dx^u, dy^u)$ is uniformly bounded away from zero, and the same holds for the angle between $(\dot{x}, \dot{y})$ and $(dx^s, dy^s)$.

Having established hyperbolicity for the flow $\Phi^t$, we can project unstable and stable vectors on the cross section $M$, and hence the following

Corollary 3.9 The first return map $T : M \to M$ induced by the flow $\Phi^t$ is hyperbolic, too, with respect to any invariant measure.

In the rest of this section, we prove that $T$ is, essentially, a uniformly hyperbolic map. Let $X = (r, \varphi) \in M$ and $V = (dr, d\varphi) \in T_X M$. We call $V$ an unstable vector if it is a tangent vector to the trace $\varphi = \varphi(r)$ of a strongly unstable family $P_{st}$. Similarly, $V$ is a stable vector if it is tangent to the trace of a strongly stable family.

Lemma 3.10 (Uniform hyperbolicity - 1) There is a constant $B_1 > 1$ such that for every nonzero unstable vector $V = (dr, d\varphi)$ we have $B_1^{-1} \le d\varphi/dr \le B_1$. Similarly, for any stable vector $V \neq 0$ we have $-B_1 \le d\varphi/dr \le -B_1^{-1}$. As a result, the angles between stable and unstable vectors are bounded away from zero.

Proof. The lemma follows from (3.15), the bound $0 < \kappa^- \le \kappa^-_{\max}$ in Corollary 3.5, and a similar bound $-\kappa^-_{\max} \le \kappa^+ < 0$ for strongly convergent families. ✷

Convention (on B’s). We denote by $B_i > 0$ constants that only depend on the domain $Q$ and the bounds on the function $p(x, y, \theta)$ and its derivatives. Such constants are called global constants.

Remark. All our claims about unstable vectors here have their obvious counterparts for stable vectors, as in the above lemma. For brevity, we will only state the claims for unstable vectors.

Lemma 3.11 (Uniform hyperbolicity - 2) Let $V$ and $\tilde{V}$ be two unstable vectors at a point $X \in M$ and $T^n$ continuous at $X$. Then the angle between the unstable vectors $DT^n(V)$ and $DT^n(\tilde{V})$ at the point $T^n X$ is less than $C\lambda^n$, where $C > 0$ and $\lambda \in (0, 1)$ are global constants. In other words, the cones made by unstable vectors shrink uniformly and exponentially fast under $DT^n$ as $n \to \infty$.

Proof. Let $V$ and $\tilde{V}$ be tangent vectors to the traces $\varphi = \varphi(r)$ and $\varphi = \tilde{\varphi}(r)$ of two strongly unstable families $P_{st}$ and $\tilde{P}_{st}$. According to (3.15), $|d\varphi/dr - d\tilde{\varphi}/dr| \le |\kappa^- - \tilde{\kappa}^-| \cos\varphi$, hence it is enough to prove that $|\kappa^-_n - \tilde{\kappa}^-_n| \le C\lambda^n$, where $\kappa^-_n$ and $\tilde{\kappa}^-_n$ are taken at the point $T^n X$ before the reflection. Note that $\Delta := \kappa - \tilde{\kappa}$ satisfies
\[ d\Delta/d\tau = -(\kappa + \tilde{\kappa} - h_\theta)\Delta \]
according to (3.10), and does not change at reflections due to Lemma 3.3. Hence, $|\Delta_\tau| \le |\Delta_0| e^{-a\tau}$ where $a = 2\kappa_{\min} - \delta_0 > 0$. ✷

Now denote by $V_1 = (dr_1, d\varphi_1) = DT(V)$ the image of the vector $V$ under $DT$. It is a tangent vector at $X_1 = TX$. If $V$ is an unstable vector, then so is $V_1$. Let $V$ and $V_1$ be tangent vectors to the traces left on $M$ by a strongly divergent family $P_{st}$ at the points $X$ and $X_1$, respectively. Denote by $L$ the length of the trajectory segment on the torus between the points $X$ and $X_1$, and parameterize that segment by the length parameter $\tau$, $0 \le \tau \le L$. Denote by $w^+, \kappa^+$, etc. the quantities introduced in Sect. 3 taken for the family $P_{st}$ immediately after the reflection at the point $X$, and by $w_1^-, \kappa_1^-$, etc. the corresponding quantities before the reflection at the point $X_1$.

Lemma 3.12 For any unstable vector $V$
\[ e^{-\delta_2}[1 + \kappa^+ L] \le \frac{|w_1^-|}{|w^+|} \le e^{\delta_2}[1 + \kappa^+ L] \tag{3.24} \]
with some small $\delta_2 > 0$.

Proof. Combining (3.11) and (3.20) and integrating with respect to $\tau$ from $0$ to $L$ yields (3.24) with $\delta_2 := \delta_1 L_{\max}$. ✷

Note that integrating from $0$ to any $\tau < L$ in the above proof gives
\[ e^{-\delta_2}[1 + \kappa^+ \tau] \le \frac{|w_\tau|}{|w^+|} \le e^{\delta_2}[1 + \kappa^+ \tau] . \tag{3.25} \]
In the theory of dispersing billiards, a convenient norm of stable and unstable vectors is often used; it is called the p-norm: $|V|_p = \cos\varphi\,|dr|$, and respectively $|V_1|_p = \cos\varphi_1\,|dr_1|$. This is not really a norm in $T_X M$, since $|W|_p = 0$ for some $W \neq 0$, but at least $|V|_p > 0$ for every stable and unstable vector $V \neq 0$ due to Lemma 3.10. Now (3.24) can be rewritten as
\[ e^{-\delta_2}[1 + \kappa^+ L] \le \frac{|V_1|_p}{|V|_p} \le e^{\delta_2}[1 + \kappa^+ L] \tag{3.26} \]
because $|V_1|_p/|V|_p = |w_1^-|/|w^+|$, as follows by applying (3.14) to $V$ and $V_1$.

Note that in the pure billiard dynamics $\delta_2 = 0$, and we recover a standard formula $|V_1|_p/|V|_p = 1 + \kappa^+ L$, see [Si]. The inequality (3.26) shows that the p-norm of unstable vectors grows monotonically and exponentially in time (= the number of collisions), i.e. for all $n \ge 1$
\[ |DT^n(V)|_p/|V|_p \ge \Lambda^n \tag{3.27} \]
where $\Lambda > 1$ is a global constant, say
\[ \Lambda = 1 + \kappa_{\min}L_{\min}/2 . \tag{3.28} \]
The p-metric plays the role of the so-called adapted metric of Axiom A systems. It also follows from (3.19) and (3.26) that the expansion of $V$ under $DT$ is mainly determined by $\cos\varphi$:
\[ \frac{B_2^{-1}}{\cos\varphi} \le \frac{|V_1|_p}{|V|_p} \le \frac{B_2}{\cos\varphi} \tag{3.29} \]
for some constant $B_2 > 0$. We will primarily work with the Euclidean metric $|V| = \sqrt{(dr)^2 + (d\varphi)^2}$. It is clear that for stable and unstable vectors $V \neq 0$, which satisfy Lemma 3.10, we have
\[ 1 \le \frac{|V|\cos\varphi}{|V|_p} \le B_3 \tag{3.30} \]
for some constant $B_3 > 0$. Then (3.29) and (3.30) imply
\[ \frac{B_4^{-1}}{\cos\varphi_1} \le \frac{|DT(V)|}{|V|} \le \frac{B_4}{\cos\varphi_1} \tag{3.31} \]
for some constant $B_4 > 0$.

Lemma 3.13 (Uniform hyperbolicity - 3) For any unstable vector $V$ where $DT^n$ is defined
\[ |DT^n(V)|/|V| \ge B_5\Lambda^n \tag{3.32} \]
for global constants $\Lambda > 1$ and $B_5 > 0$.

Proof. Indeed, due to (3.27), (3.29) and (3.30)
\[ |DT^n(V)| \ge |DT^n(V)|_p \ge \Lambda^{n-1}|DT(V)|_p \ge \Lambda^{n-1}\frac{B_2^{-1}}{\cos\varphi}|V|_p \ge \Lambda^{n-1}B_2^{-1}B_3^{-1}|V| . \]
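The two norms used above are easy to compare in code. The sketch below (with illustrative names) implements the p-norm, the Euclidean norm, the ratio bounded in (3.30), and the rate Λ of (3.28). Since |V| cos ϕ / |V|_p = |V| / |dr| = sqrt(1 + (dϕ/dr)²), the slope bound of Lemma 3.10 shows that B3 = sqrt(1 + B1²) is one admissible choice in (3.30). This particular value is our own observation and is not stated in the text.

```python
import math

def p_norm(dr, phi):
    """The p-norm |V|_p = cos(phi) * |dr| of a vector V = (dr, dphi)."""
    return math.cos(phi) * abs(dr)

def euclid_norm(dr, dphi):
    """The Euclidean norm |V| = sqrt(dr^2 + dphi^2)."""
    return math.hypot(dr, dphi)

def norm_ratio(dr, dphi, phi):
    """|V| cos(phi) / |V|_p, the quantity bounded in (3.30)."""
    return euclid_norm(dr, dphi) * math.cos(phi) / p_norm(dr, phi)

def growth_rate(kappa_min, L_min):
    """The constant Lambda of (3.28)."""
    return 1.0 + kappa_min * L_min / 2.0
```

The ratio does not depend on ϕ at all, which is why the comparison constant in (3.30) stays uniform even for nearly grazing collisions, where the p-norm itself degenerates.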

4 The properties of the billiard map $T : M \to M$

Here we study stable and unstable curves, and singularity curves, for the billiard map $T$ on the cross-section $M$. Certain technical properties of those curves are necessary for the construction and further study of SRB measures. In the theory of dynamical systems, these properties are called curvature bounds, distortion bounds, absolute continuity, alignment etc. The proofs of these properties are, unfortunately, quite involved. To make things worse, the proofs are not always available even in the pure billiard case: some of these facts are just known as folklore, whose proofs have never been published. For the sake of completeness, we provide here full proofs of all these facts.

Definition. A smooth curve $\gamma \subset M$ given by $\varphi = \varphi(r)$ is called an unstable curve (or a stable curve) if it is the trace of a strongly divergent (resp., strongly convergent) family of trajectories.

Our “invariance principle” for strongly divergent families implies that the class of unstable curves is invariant under $T^n$, $n \ge 1$, and the class of stable curves is invariant under $T^{-n}$, $n \ge 1$. We will refer to this as the “invariance principle” for unstable curves.

Lemma 4.1 (Curvature bounds) There are constants $B_{\max}$ and $\bar{B}_{\max}$ such that for any $C^2$ smooth unstable curve $\gamma$ its images $T^n\gamma$ satisfy $|d^2\varphi/dr^2| \le B_{\max}$ eventually, for all $n \ge n_\gamma$. Moreover, if $\gamma$ itself satisfies $|d^2\varphi/dr^2| \le B_{\max}$, then all its images $T^n\gamma$, $n \ge 1$, satisfy $|d^2\varphi/dr^2| \le \bar{B}_{\max}$.

We note that a similar property for pure billiard dynamics is known [Y1, Ch2], but hardly a complete proof was ever published. Our proof certainly covers the pure billiard case.

Proof. Let $\varphi = \varphi(r)$ be an unstable curve, the trace of a strongly divergent family $P_{st}$. Differentiating (3.15) gives
\[ \frac{d^2\varphi}{dr^2} = \frac{dK(r)}{dr} + \frac{d\kappa^-}{dr}\cos\varphi - \kappa^-\sin\varphi\,\frac{d\varphi}{dr} + \frac{dh^-}{dr}\sin\varphi + h^-\cos\varphi\,\frac{d\varphi}{dr} . \]
Since $\partial Q$ is $C^3$ smooth, the term $dK/dr$ is bounded. The term $\kappa^-$ is bounded by Corollary 3.5, and $d\varphi/dr$ is bounded by Lemma 3.10. Now, using (3.5) and (3.13) gives $dh^-/ds = h_R^- w^- + h_U^- w^-\tan\varphi$. Hence, due to (3.14), $dh^-/dr = (dh^-/ds)/(dr/ds) = h_R^-\cos\varphi + h_U^-\sin\varphi$, so $|dh^-/dr| \le (4 + \kappa^-_{\max})\delta_0$. It then remains to estimate the term $d\kappa^-/dr$. First, according to (3.14)
\[ \frac{d\kappa^-}{dr} = \frac{d\kappa^-}{ds} \Big/ \frac{dr}{ds} = \big[(\kappa')^- + (\dot{\kappa})^- t'\big] \cdot \frac{\cos\varphi}{w^-} . \]
Substituting (3.7) and (3.13) gives
\[ d\kappa^-/dr = (\kappa'/w)^-\cos\varphi - \big[(\kappa^-)^2 + (h^-)^2 - h_R^-\big] \cdot (\sin\varphi - \alpha^-\cos\varphi) . \]

Here all the terms are bounded except, possibly, the term (κ /w)− . So, it is enough to prove that the quantity Ξ := κ /w is bounded by a global constant before every reflection. Direct differentiation and using (3.7) yields dκ /dt = dκ/ds ˙ = −2pκκ − pθ wκ3 − D1 wκ2 + phθ κ where D1 is an expression involving first and second order derivatives of the functions p and h. All those derivatives are bounded, since these functions are C 2 smooth on a compact manifold M, hence |D1 | is bounded by a global constant. Now, by using (3.8), dΞ/dt

˙ 2 = (dκ /dt)/w − κ w/w = −3pκΞ + phθ Ξ − pθ κ3 − D1 κ2 .

(4.1)

Now consider a reflection experienced by the family Pst and denote by Ξ− and Ξ+ the values of Ξ before and after the reflection. Differentiating (3.18)-(3.19) gives dκ+ dκ− 2K(r) sin ϕ dϕ 2K H1 + = + · + + − h+ θ κ sin ϕ . 2 dr dr cos ϕ dr cos ϕ cos2 ϕ

(4.2)

Here we denote $K' = dK/dr$, which is bounded on $\partial Q$, and $H_1$ is a small quantity, see below.

Convention (on D's and H's). We will denote by $D_i$ variable quantities whose absolute values are bounded above by global constants, i.e. $|D_i| \le B_i$ for some global constant $B_i$. We will also denote by $H_i$ variable quantities whose absolute values are bounded by some small constants depending on $\delta_0$ in Assumption B, i.e. $|H_i| \le \delta_i^*$ where $\delta_i^* \to 0$ as $\delta_0 \to 0$, i.e. $\delta_i^*$ satisfy our convention on $\delta$'s.

Note that $d\kappa^+/dr = (d\kappa^+/ds)/(dr/ds) = [(\kappa')^+ + (\dot\kappa)^+ t']/(-w^+/\cos\varphi)$ and, similarly, $d\kappa^-/dr = [(\kappa')^- + (\dot\kappa)^- t']/(w^-/\cos\varphi)$, where we used (3.14). Substituting these into (4.2) and using (3.7), (3.15), (3.18), and (3.13) yields
$$\Xi^+ = -\Xi^- + \Delta\Xi, \tag{4.3}$$
where
$$\Delta\Xi = -\frac{6K^2(r)\sin\varphi}{\cos^3\varphi} + \frac{D_2}{\cos^2\varphi} + \frac{H_2}{\cos^3\varphi}. \tag{4.4}$$

Here $D_2$ is an expression involving $K'$, $\kappa^-$, $\alpha^\pm$ and other bounded quantities. Eqs. (4.1) and (4.3)-(4.4) completely describe the evolution of the quantity $\Xi$ in time. Since (4.1) is a linear differential equation, we can decompose $\Xi = \Xi_1 + \Xi_2$ so that
$$d\Xi_1/dt = -3p\kappa\Xi_1 + p h_\theta \Xi_1 \quad\text{and}\quad d\Xi_2/dt = -3p\kappa\Xi_2 + p h_\theta \Xi_2 - p_\theta \kappa^3 - D_1 \kappa^2 \tag{4.5}$$
and at every reflection
$$\Xi_1^+ = -\Xi_1^- \quad\text{and}\quad \Xi_2^+ = -\Xi_2^- + \Delta\Xi. \tag{4.6}$$


Initially, at a time $t_0 + 0$ when the family $P_s^t$ just leaves the curve $\gamma$ (its trace on $M$), we set $\Xi_1(t_0 + 0) = \Xi(t_0 + 0)$ and $\Xi_2(t_0 + 0) = 0$. Now, since $|\Xi_1|$ does not change during reflections, the first equation (4.5) implies that
$$|\Xi_1(t)| \le |\Xi_1(t_0)| \cdot e^{-a(t - t_0)} \quad\text{for } t > t_0, \tag{4.7}$$
where $a = 3p_{\min}\kappa_{\min} - \delta_0 > 0$. Hence, the component $\Xi_1$ converges to zero exponentially fast.

Claim. There is a global constant $B_6 > 0$ such that $|\Xi_2(t - 0)| \le B_6$ for every moment of reflection $t > t_0$.

We prove the claim inductively. Suppose $|\Xi_2(t_1 - 0)| \le B_6$ before a reflection at some time $t_1 > t_0$. During the interval from $t_1$ to the next reflection, $t_2$, we decompose $\Xi_2 = \Xi_{21} + \Xi_{22}$ as in (4.5), so that
$$d\Xi_{21}/dt = -3p\kappa\Xi_{21} + p h_\theta \Xi_{21} \quad\text{and}\quad d\Xi_{22}/dt = -3p\kappa\Xi_{22} + p h_\theta \Xi_{22} - p_\theta \kappa^3 - D_1 \kappa^2 \tag{4.8}$$
and initially set $\Xi_{21}(t_1 + 0) = -\Xi_2(t_1 - 0)$ and $\Xi_{22}(t_1 + 0) = \Delta\Xi$, where $\Delta\Xi$ is given by (4.4) and taken at the reflection at $t_1$. Similarly to (4.7), we now have
$$|\Xi_{21}(t_2 - 0)| \le |\Xi_2(t_1 - 0)| \cdot e^{-a(t_2 - t_1)}. \tag{4.9}$$
The equation (4.4) shows that $\Xi_{22}(t_1 + 0) = \Delta\Xi$ is of order $O(1/\cos^3\varphi) = O(\kappa^3(t_1 + 0))$. It is then convenient to "link" $\Xi_{22}$ with $\kappa^3$ and consider the ratio $g(t) := \Xi_{22}(t)/\kappa^3(t)$. First, by (3.19) and (4.4), $g(t_1 + 0) = \Xi_{22}(t_1 + 0)/\kappa^3(t_1 + 0) \le B'$ with some global constant $B'$. Then, (4.8) and (3.7) imply
$$dg/dt = -p_\theta + H_3 g + D_3/\kappa.$$
Hence, $|g|$ stays bounded by a global constant between the two reflections, i.e. $|g(t)| \le B''$ for all $t_1 < t < t_2$. Therefore, $|\Xi_{22}(t_2 - 0)| \le B''(\kappa^-_{\max})^3$ and
$$|\Xi_2(t_2 - 0)| \le B_6 e^{-a t_{\min}} + B''(\kappa^-_{\max})^3.$$
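The last inequality closes the induction once $B_6$ is large enough. One admissible choice can be written out as follows. This is only a sketch: $C$ stands for the bound on $|\Xi_{22}(t_2 - 0)|$ obtained above, and $t_{\min}$ is assumed to be a lower bound on the time $t_2 - t_1$ between consecutive reflections (the label $C$ is introduced here purely for illustration).

```latex
% Sketch: C = bound on |\Xi_{22}(t_2 - 0)|, t_{\min} = assumed lower bound on t_2 - t_1
\begin{align*}
|\Xi_2(t_2 - 0)| &\le |\Xi_{21}(t_2 - 0)| + |\Xi_{22}(t_2 - 0)| \\
                 &\le B_6\, e^{-a t_{\min}} + C \\
                 &\le B_6 \qquad\text{whenever}\qquad B_6 \ge \frac{C}{1 - e^{-a t_{\min}}} .
\end{align*}
```

The denominator is positive because $a > 0$ and $t_{\min} > 0$, so such a $B_6$ exists and does not depend on the particular reflection.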

Hence, an appropriate choice of $B_6$ ensures that $|\Xi_2(t_2 - 0)| \le B_6$. The claim is proved. Lemma 4.1 now easily follows. ✷

Convention. In all that follows we will only consider unstable curves that satisfy $|d^2\varphi/dr^2| \le B_{\max}$. Hence, all our unstable curves will have uniformly bounded geometric curvature. The same goes, of course, for stable curves. We also assume that the "invariance principle" for unstable curves holds, as we may in view of Lemma 4.1.



This convention is equivalent to the requirement that for any strongly divergent family, immediately before any reflection,
$$\Xi^- = (\kappa')^-/w^- \le B_7 \tag{4.10}$$
for some global constant $B_7$. Also note that the proof of the claim in the proof of Lemma 4.1 implies that, under the above convention, all strongly divergent families satisfy
$$|\Xi|/\kappa^3 = |\kappa'|/(\kappa^3 |w|) \le B_8 \tag{4.11}$$

for some global constant $B_8$. This will be used later.

We now turn to distortion bounds, but first, a remark is in order. Let $\gamma$ be an unstable curve on which $T^n$ is continuous for some $n \ge 1$. We know that $T^n$ expands $\gamma$ exponentially fast in $n$, due to (3.32). We now need to compare the expansion rates at different points of $\gamma$ and ensure that those rates vary slowly over $\gamma$ (this property is referred to as 'bounded distortions'). However, at almost grazing reflections, when $\cos\varphi \approx 0$, the expansion of unstable curves is highly nonuniform, and so distortions are unbounded. To fix the situation, we consider the so-called homogeneous unstable curves.

We partition $M$ into countably many rectangular domains $I_k$, for $k = 0$ and $|k| \ge k_0$, where $k_0 > 1$ is a large constant to be specified later. For every $k \ge k_0$ we put
$$I_k = \{(r, \varphi) : \pi/2 - k^{-2} < \varphi < \pi/2 - (k+1)^{-2}\}$$
and
$$I_{-k} = \{(r, \varphi) : -\pi/2 + (k+1)^{-2} < \varphi < -\pi/2 + k^{-2}\}$$
and lastly
$$I_0 = \{(r, \varphi) : -\pi/2 + k_0^{-2} < \varphi < \pi/2 - k_0^{-2}\}.$$
The domains $I_k$ are called homogeneity strips; they are also used in the study of pure billiard systems [BSC2, Y1, Ch2]. We say that an unstable curve $\gamma \subset M$ is homogeneous if it is entirely contained in one homogeneity strip $I_k$. Note that if $\gamma$ is a homogeneous unstable curve, then for every point $X = (r, \varphi) \in \gamma$ we have
$$\cos\varphi \ge B_9^{-1} |\gamma|^{2/3} \tag{4.12}$$
where $B_9 > 0$ is a global constant. Here and below $|\gamma|$ denotes the length of $\gamma$ in the Euclidean metric $(dl)^2 = (dr)^2 + (d\varphi)^2$.

Let $\gamma$ be an unstable curve, $X \in \gamma$ and $T^n$ continuous at $X$. Denote by $J_{\gamma,n}(X)$ the expansion factor of the curve $\gamma$ under $T^n$ at the point $X$, i.e. $J_{\gamma,n}(X) := |DT^n V|/|V|$ for any tangent vector $V$ to $\gamma$ at $X$.
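To make the partition into homogeneity strips concrete, the following minimal sketch returns the index $k$ of the strip $I_k$ containing a given angle $\varphi$. The function name `homogeneity_strip` and the sample value of `k0` are hypothetical, chosen only for illustration; points lying exactly on a boundary line between strips are not treated specially.

```python
import math


def homogeneity_strip(phi: float, k0: int) -> int:
    """Index k of the homogeneity strip I_k that contains the angle phi.

    Illustrative helper (not from the paper). Strips follow the definition
    in the text: I_0 is the central strip, I_k and I_{-k} with k >= k0
    accumulate at phi = +pi/2 and phi = -pi/2 respectively.
    """
    if k0 < 2:
        raise ValueError("k0 must be an integer greater than 1")
    d = math.pi / 2 - abs(phi)  # angular distance to the grazing line
    if d <= 0:
        raise ValueError("phi must lie strictly inside (-pi/2, pi/2)")
    if d > k0 ** -2:
        return 0  # central strip I_0
    # (k+1)^(-2) < d < k^(-2)  is equivalent to  k < d^(-1/2) < k + 1
    k = max(math.floor(d ** -0.5), k0)
    return k if phi > 0 else -k


print(homogeneity_strip(0.3, 10))                          # central strip
print(homogeneity_strip(math.pi / 2 - 1 / 10.5 ** 2, 10))  # strip near +pi/2
```

For $k \ge k_0$ the strip $I_k$ has width $k^{-2} - (k+1)^{-2}$, roughly $2k^{-3}$, in the $\varphi$ direction, while $\cos\varphi$ is of order $k^{-2}$ on it; this is the scaling behind the exponent $2/3$ in (4.12).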


Lemma 4.2 (Distortion bounds) Let $\gamma$ be an unstable curve on which $T^n$ is continuous. Assume that $\gamma_i := T^i\gamma$ is a homogeneous unstable curve for each $0 \le i \le n$. Then for all $X, Y \in \gamma$
$$|\ln J_{\gamma,n}(X) - \ln J_{\gamma,n}(Y)| \le B_{10} |\gamma_n|^b$$
for some global constants $B_{10} > 0$ and $b > 0$ (in fact, $b = 1/3$).

We note that the corresponding property for pure billiard dynamics is known [Ch2], but only a proof of a somewhat weaker statement was published [BSC2]. Our proof covers the pure billiard case, too.

Proof. Note that $J_{\gamma,n}(X) = \prod_{i=0}^{n-1} J_{\gamma_i,1}(T^i X)$. Hence, it is enough to prove the lemma for $n = 1$, because $|\gamma_i|$ grows exponentially in $i$ due to (3.32) (the bounds for the individual factors then sum to a geometric series dominated by its last term, which is of order $|\gamma_n|^b$). So we put $n = 1$. Let $P_s^t$ be a strongly divergent family whose trace on $M$ is the curve $\gamma$. We will use the notation adopted before Lemma 3.12. Consider $J_{\gamma,1}(X)$ as a function of $X_1 = (r_1, \varphi_1) = TX \in \gamma_1$, and parameterize $\gamma_1$ by $r_1$. It is enough to prove that
$$\left|\frac{d \ln J_{\gamma,1}}{dr_1}\right| \le \frac{B'}{|\gamma_1|^{2/3}} \tag{4.13}$$
for some global constant $B' > 0$. Then Lemma 4.2 (with $n = 1$) would follow by integration over $\gamma_1$. The bound (4.13), in turn, follows from
$$\left|\frac{d \ln J_{\gamma,1}}{dr_1}\right| \le \frac{B'' \cos\varphi_1}{\cos\varphi} + \frac{B''}{\cos\varphi_1} \tag{4.14}$$
with a global constant $B'' > 0$, by applying (4.12) to both $\gamma$ and $\gamma_1$, and because $|\gamma| \ge B_4^{-1} |\gamma_1| \cos\varphi_1$, which follows from (3.31).

We now prove (4.14). We have
$$|V| = |dr| \sqrt{1 + (d\varphi/dr)^2} = |ds|\, |w^+| (\cos\varphi)^{-1} \sqrt{1 + (d\varphi/dr)^2},$$
and similarly for $|V_1|$, hence
$$J_{\gamma,1}(X) = \frac{|V_1|}{|V|} = \frac{\sqrt{1 + (d\varphi_1/dr_1)^2}}{\sqrt{1 + (d\varphi/dr)^2}} \cdot \frac{\cos\varphi}{\cos\varphi_1} \cdot \frac{|w_1^-|}{|w^+|} = J' \cdot J'' \cdot J''',$$
where $J'$, $J''$, $J'''$ simply denote the first, second and third factors in this expression. We bound them separately. First,
$$\left|\frac{d \ln J'}{dr_1}\right| \le \frac{|d\varphi_1/dr_1| \cdot |d^2\varphi_1/dr_1^2|}{1 + (d\varphi_1/dr_1)^2} + \frac{|d\varphi/dr| \cdot |d^2\varphi/dr^2|}{1 + (d\varphi/dr)^2} \cdot \left|\frac{dr}{dr_1}\right|.$$
Note that $|dr/dr_1| \le B_4 \cos\varphi_1$ for some global constant $B_4 > 0$ due to (3.31). Hence, $|d \ln J'/dr_1|$ is uniformly bounded due to Lemmas 3.10 and 4.1. Next,
$$\left|\frac{d \ln J''}{dr_1}\right| \le \left|\frac{d\varphi_1/dr_1}{\cos\varphi_1}\right| + \left|\frac{d\varphi/dr}{\cos\varphi} \cdot \frac{dr}{dr_1}\right| \le \frac{B_1}{\cos\varphi_1} + \frac{B_1 B_4 \cos\varphi_1}{\cos\varphi}$$
as required by (4.14).



It remains to consider $\ln J'''(X) = \int_{t_0}^{t_1} \kappa p \, dt$, cf. (3.8), where $t_0$ and $t_1$ denote the moments of reflection at $X$ and $X_1$, respectively. First, $d \ln J'''/dr_1 = (d \ln J'''/ds)/(dr_1/ds)$, and
$$\begin{aligned} d \ln J'''/ds &= -\kappa^+ p^+ \, dt_0/ds + \kappa_1^- p_1^- \, dt_1/ds + \int_{t_0}^{t_1} (\kappa p' + \kappa' p) \, dt \\ &= \kappa^+ (w^+ \tan\varphi + v^+) + \kappa_1^- (w_1^- \tan\varphi_1 - v_1^-) + \int_{t_0}^{t_1} (\kappa p' + \kappa' p) \, dt. \end{aligned}$$
Note that $|w^+/w_1^-| \le 2(\kappa^+ L_{\min})^{-1}$ by (3.24) and for all $w = w(t)$, $t_0 < t < t_1$, we also have by (3.25)
$$|w/w_1^-| \le |w/w^+| \cdot |w^+/w_1^-| \le 4 L_{\min}^{-1} \big[(\kappa^+)^{-1} + \tau\big]. \tag{4.15}$$
Combining the above formulas and (3.14) yields
$$|d \ln J'''/dr_1| \le 2 L_{\min}^{-1} |\tan\varphi + \alpha^+| \cos\varphi_1 + \kappa^-_{\max} \big(|\sin\varphi_1| + |\alpha_1^-| \cos\varphi_1\big) + \cos\varphi_1 \int_{t_0}^{t_1} |\kappa p' + \kappa' p| / |w_1^-| \, dt.$$
The first two terms are clearly properly bounded, as required by (4.14). The integral term can be estimated by (3.5) and (4.11), so it does not exceed
$$\cos\varphi_1 \int_{t_0}^{t_1} \big[\kappa\, |p_U \alpha + p_R| \cdot |w/w_1^-| + B_8 \kappa^3 p\, |w/w_1^-|\big] \, dt.$$
Using (4.15) and an obvious $|p_U \alpha + p_R| \le \text{const} \cdot (1 + \kappa)$ shows that the last expression does not exceed
$$\cos\varphi_1 \int_{t_0}^{t_1} B' \kappa^3 p\, |w/w_1^-| \, dt \le \cos\varphi_1 \int_{t_0}^{t_1} B'' \big[(\kappa^+)^{-1} + \tau\big]^{-2} p \, dt$$
where $B'$ and $B''$ are some global constants. A direct integration shows that the last expression is bounded by
$$\cos\varphi_1 \int_0^L B'' \big[(\kappa^+)^{-1} + \tau\big]^{-2} \, d\tau \le \text{const} \cdot \cos\varphi_1 \, \kappa^+.$$
Since $\kappa^+ \le \text{const} \cdot (1 + 1/\cos\varphi)$, the last expression is properly bounded as required by (4.14). This completes the proof of (4.14). Lemma 4.2 is proved. ✷

Before we turn to the absolute continuity, one useful observation should be made.


Volume compression. The volume $dV = dx\, dy\, d\theta$ in the phase space $M$ is not necessarily invariant under $\Phi^t$. Its rate of change is given by the divergence of the flow $\Phi^t$:
$$\frac{d}{dt} (\ln dV_t) = p_x \cos\theta + p_y \sin\theta + p_\theta h + p h_\theta = p_U + p h_\theta. \tag{4.16}$$

Under our assumptions, $p h_\theta$ is small. The function $p_U$ has uniformly bounded integrals along orbits, see Remark before Lemma 3.6. Therefore, for all $X \in M$ and $t > 0$
$$B_{11}^{-1} e^{-\delta_4 t} < |D\Phi^t(X)| < B_{11} e^{\delta_4 t} \tag{4.17}$$
for some small constant $\delta_4 > 0$ and a global constant $B_{11}$. Also, let $P_s^t$ be a strongly convergent or divergent family on a time interval $(t_1, t_2)$ that does not experience singularities (grazing reflections) for $t_1 < t < t_2$. Then for any $X, Y \in P_s^{t_1}$ we have
$$B_{12}^{-1} < |D\Phi^{t_2 - t_1}(X)| / |D\Phi^{t_2 - t_1}(Y)| < B_{12} \tag{4.18}$$
with a global constant $B_{12}$. Indeed, the trajectories $\Phi^t X$ and $\Phi^t Y$ exponentially converge to each other due to the uniform hyperbolicity of $\Phi^t$, and the smoothness of the functions in (4.16) then proves (4.18).

Similar inequalities hold for the map $T$ and the element $d\nu_0$ of the smooth invariant measure $\nu_0$ of the billiard map $T_0$. Recall that $d\nu_0 = \text{const} \cdot \cos\varphi \, dr \, d\varphi$. The elements $dV$ and $d\nu_0$ are related by
$$dV = dx\, dy\, d\theta = p \cos\varphi \, dr \, d\varphi \, dt = \text{const} \cdot p \, d\nu_0 \, dt \tag{4.19}$$

in the immediate vicinity of the cross-section $M$. Note that the corresponding relation for pure billiard systems (with $p = 1$) holds everywhere in the phase space $M_0$, cf. [Si]. Denote by $|DT^n|_0$ the Jacobian of $T^n$ with respect to the measure $\nu_0$. Now (4.17) and (4.19) imply
$$B_{11}^{-1} e^{-\delta_4 n} < |DT^n|_0 < B_{11} e^{\delta_4 n}. \tag{4.20}$$

Also, let $\gamma \subset M$ be a stable or unstable curve on which $T^n$ is continuous, and $T^n\gamma$ also a stable (resp., unstable) curve. Then for any $X, Y \in \gamma$, (4.18) and (4.19) imply
$$B_{12}^{-1} < |DT^n(X)|_0 / |DT^n(Y)|_0 < B_{12}. \tag{4.21}$$
We use the same notation $\delta_4$, $B_{11}$, $B_{12}$ here, even though the values of these constants in (4.17)-(4.18) and (4.20)-(4.21) may be different.

Lemma 4.3 (Absolute continuity) Let $\xi$ be a stable curve, $X, Y \in \xi$, and $\gamma_1$, $\gamma_2$ two unstable curves crossing $\xi$ at $X$ and $Y$, respectively. Assume that $T^n$ is continuous on $\xi$ and $T^i\xi$ is a homogeneous stable curve for each $0 \le i \le n$. Then
$$|\ln J_{\gamma_1,n}(X) - \ln J_{\gamma_2,n}(Y)| \le B_{13} \tag{4.22}$$
where $B_{13}$ is a global constant.



Proof. For any $Z \in \xi$, let $J_{\xi,n}(Z)$ be the contraction factor of $\xi$ under $T^n$ at the point $Z$, i.e. $J_{\xi,n}(Z) = |DT^n(V)|/|V|$ for any tangent vector $V$ to $\xi$ at $Z$. By Lemma 4.2 (applied to $\xi$) we have
$$|\ln J_{\xi,n}(X) - \ln J_{\xi,n}(Y)| \le B_9 |\xi|^b \le B' \tag{4.23}$$
for a global constant $B'$. Let $|DT^n(Z)|_e$ denote the Jacobian of $T^n$ at $Z \in M$ with respect to the Lebesgue measure $dr\, d\varphi$ on $M$, i.e. $|DT^n(Z)|_e = |DT^n(Z)|_0 \cos\varphi(Z) / \cos\varphi(T^n Z)$. Since both $\xi$ and $T^n\xi$ are homogeneous curves, (4.21) implies
$$B^{-1} < |DT^n(X)|_e / |DT^n(Y)|_e < B \tag{4.24}$$

for a global constant $B$. Now (4.23) and (4.24), along with Lemma 3.10, prove (4.22). ✷

Next, we describe the singularities of the map $T$. Let $S_0 = \partial Q \times \{\varphi = \pm\pi/2\}$ be the natural boundary of $M$. Put $S_n = T^n S_0$ for all $n \in \mathbb{Z}$, and $S_{m,n} = \cup_{i=m}^{n} S_i$ for $-\infty \le m \le n \le \infty$. On the sets $S_{-n,-1}$ and $S_{1,n}$ the maps $T^n$ and $T^{-n}$, respectively, are discontinuous. We will also need the set
$$D_0 = \cup_{k \ge k_0} \{\varphi = \pm(\pi/2 - k^{-2})\},$$
the union of countably many parallel lines in $M$ separating the homogeneity strips. Put $D_n = T^n D_0$ for all $n \in \mathbb{Z}$, and $D_{m,n} = \cup_{i=m}^{n} D_i$ for $-\infty \le m \le n \le \infty$.

Lemma 4.4 (Alignment) For each $n \ge 1$ the set $S_n$ is a finite union of $C^2$ unstable curves. The set $S_{-n}$ is a finite union of stable curves. Similarly, the set $D_n$ is a countable union of unstable curves and $D_{-n}$ is a countable union of stable curves. The curvature of all these curves in $M$ is bounded by a global constant.

Proof. One only needs to prove this for $n = 1$, due to the invariance of unstable (stable) curves under $T$ (resp., $T^{-1}$). Since the curves of $D_0$ converge to $S_0$, their images (components of $D_1$) converge to $S_1$, so it is enough to prove the lemma for $D_1$. Consider a curve $\gamma$ in $D_0$ given by $\varphi = \varphi_0 = \pm(\pi/2 - k^{-2})$ with some $|k| \ge k_0$. It is the trace of a family $P_s^t$ that can be naturally parameterized by $s = r$, and we set $t = 0$ on that curve. Note that $x', y'$ is a unit tangent vector to $\partial Q$. It is then easy to compute $v^+ = \sin\varphi_0$, $w^+ = -\cos\varphi_0$, $(\theta')^+ = -K(r)$, and $\kappa^+ = (K(r) + h^+ \sin\varphi_0)/\cos\varphi_0$. Hence, $\kappa^+ \ge \Theta_{\min}$ and so the family $P_s^t$ is strongly divergent for $t > 0$.

We now prove the boundedness of curvature. The above natural parameterization $r = s$ does not satisfy our convention on $\alpha$'s when $k$ is large. But for any point $X = (r, \varphi) \in \gamma$ we can reparameterize the outgoing family $P_s^t$, $t > 0$, with a new parameter $s$ so that $v^+ = 0$ and $w = 1$ at $X$. In this parameterization, as one can compute directly, $(\kappa')^+ = -$

2 dK(r)/dr + sin ϕ0 dh+ /dr p sin ϕ0 [(κ+ )2 + (h+ )2 − h+ R] + . cos2 ϕ0 cos3 ϕ0


Now we see that $(\kappa')^+ = O(\cos^{-3}\varphi_0) = O((\kappa^+)^3)$, so we are in the position of the proof of the claim in the proof of Lemma 4.1. Just like then, we get a uniform bound $\kappa \le B_6$ before the next reflection occurs. This proves that the curvature of the curve $T\gamma$ is bounded by a global constant. ✷

Corollary 4.5 Unstable curves are uniformly transversal to the boundary $\partial M = S_0$ and to the components of the singularity set $S_{-n}$, $n \ge 1$ (and to those of $D_{-n}$, $n \ge 1$). Stable curves are uniformly transversal to the boundary $\partial M = S_0$ and to the components of the singularity set $S_n$, $n \ge 1$ (and to those of $D_n$, $n \ge 1$).

The following continuation property is standard [BSC2, Ch2]:

Remark (Continuation property). Each endpoint, $X$, of every smooth curve $\gamma \subset S_{-n,0}$, $n \ge 1$, lies either on $S_0 = \partial M$ or on another smooth curve $\gamma' \subset S_{-n,0}$ that itself does not terminate at $X$. Hence, each curve $\gamma \subset S_{-n,0}$ can be continued monotonically up to $S_0 = \partial M$ by other curves in $S_{-n,0}$.

5 Growth of unstable curves

Here we discuss iterations of unstable curves under the action of $T$. We prove a version of the so-called "growth lemma", a key element in the modern studies of ergodic and statistical properties of hyperbolic dynamical systems.

Let $\gamma \subset M$ be an unstable curve of small length $\varepsilon$ and $m \ge 1$. The map $T^m$ is defined on $\gamma \setminus S_{-m,0}$. By the "invariance principle" for unstable curves, the set $\gamma_m := T^m(\gamma \setminus S_{-m,0})$ is a union of some unstable curves. Denote by $K_m(\gamma)$ the number of those curves (connected components of $\gamma_m$). By Lemma 3.13 (uniform hyperbolicity) the total length of $\gamma_m$ is $\ge B_5 \Lambda^m \varepsilon$. However, the effect of growth of $\gamma_m$ with $m$ may be effectively eliminated if $B_5 \Lambda^m \ll K_m(\gamma)$. In that case applying $T^m$ to $\gamma$ may produce nothing but a bunch of curves that are even shorter than $\gamma$. If that happens for all $m$, the very existence of SRB measures would be doubtful, if not hopeless. Fortunately, $K_m(\gamma)$ only grows linearly with $m$, provided $\varepsilon$ is small enough. We prove this below.

First, note that $K_m(\gamma) - 1$ is the number of points of intersection $\gamma \cap S_{-m,-1}$. A point $X \in M$ where $k \ge 2$ smooth curves of the set $S_{-m,0}$ meet is called a multiple singularity point, and $k$ is its multiplicity. Denote by $K_m$ the maximal multiplicity of all $X \in M$ for a given $m$.

Lemma 5.1 For each $m \ge 1$ there is an $\varepsilon_m > 0$ such that for any unstable curve $\gamma \subset M$ of length $\varepsilon < \varepsilon_m$ we have $K_m(\gamma) \le K_m$.

The lemma easily follows from the properties of unstable curves and the singularity set $S_{-m,0}$ proved in the previous section.

Lemma 5.2 (Multiplicity bound) There is a global constant $C_0 > 0$ such that $K_m \le C_0 m$ for all $m \ge 1$.



We note that a linear bound on Km was first observed by Bunimovich for pure billiard dynamics, see [BSC2]. It is now understood that it is the continuity of the flow Φt that implies the linear bound on Km . We give a proof of this fact different from the original one in [BSC2].

Proof. If $K_m$ curves of $S_{-m,0}$ meet at $X$, then a neighborhood $U(X)$ of $X$ is divided by those curves into some $L_m$ parts (sectors), and clearly $K_m \le L_m$. We now will show that $L_m \le C_0 m$ for some $C_0 > 0$. On each of the $L_m$ parts of $U(X)$ the map $T^m$ is continuous and can be extended by continuity to the point $X$. Thus, $T^m X$ can be defined in $L_m$ different ways.

To see exactly how that happens, first note that the real time trajectory $\Phi^t X$ is well and uniquely defined for all $t > 0$. This trajectory may be tangent to $\partial Q$ at one or more points. We call such points tangent (grazing) reflections. Now, the $L_m > 1$ different versions of $T^m$ at $X$ are possible precisely when the trajectory $\Phi^t X$ has tangential reflections: each of those reflections can be counted as either a "hit" (making an iteration of $T$) or a "miss" (skipping it in the construction of $T$).

Note that the real time elapsed until the $m$th iteration of $T$ (in any of its versions) is less than $m\tau_{\max}$. Hence, there can be no more than $C_1 m$ reflections (both tangential and regular ones) involved in the construction of $T^m$ at $X$, where $C_1 = \tau_{\max}/\tau_{\min}$ is a global constant. Let $\tilde m \le C_1 m$ be the number of tangential reflections among the first $C_1 m$ reflections on the trajectory $\Phi^t X$. It seems that, with a choice of hit or miss at every tangential reflection, we would have up to $2^{\tilde m}$ versions of $T^m$ at $X$. That would be too many for us. Fortunately, relatively few sequences of hits and misses materialize, as we show next.

Note that there can be no more than $C_1$ tangential reflections in a row. Consider a string of $p$ consecutive tangential reflections on the trajectory $\Phi^t X$, $t > 0$, with $1 \le p \le C_1$. Let $Y = \Phi^{t'} X \in M$ be the last regular reflection point on the trajectory $\Phi^t X$ before the above string. If there are previous tangential reflections on $\Phi^t X$, $0 < t < t'$, then the neighborhood $U(Y) \subset M$ is already divided into some $L$ parts (sectors) according to the hit/miss sequences arisen in those reflections.
The boundaries of those $L$ sectors of $U(Y)$ are curves in $S_{1,n}$ for some $n > 0$, so they are increasing curves in the $r, \varphi$ coordinates (by Lemma 4.4). Now, there are at most $2^p$ possible hit/miss sequences on the string of $p$ tangential reflections that we have right after the point $Y$. Accordingly, $U(Y)$ is divided into $\le 2^p$ parts (sectors) along some curves in $S_{-p,-1}$, which are decreasing curves (by Lemma 4.4). So, we have two partitions of $U(Y)$: one into $L$ sectors by increasing curves, and the other into $\le 2^p$ sectors by decreasing curves. These two partitions combined divide $U(Y)$ into no more than $L + 2^p$ parts, as is clear from Fig. 2. So, each string of $p$ consecutive tangential reflections adds $\le 2^p$ (i.e., $\le 2^{C_1}$) parts to the partition of $U(X)$ by $S_{-m,-1}$. Hence, $L_m \le 2^{C_1} C_1 m$. ✷


Figure 2. The partition of the neighborhood U(Y).
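The competition between the linear bound $C_0 m$ on the number of pieces and the exponential expansion factor $B_5 \Lambda^m$ can be illustrated numerically. The sketch below searches for the smallest $m$ with $C_0 m < B_5 \Lambda^m - 1$. The function name and the sample constants are hypothetical placeholders, not values from the paper; the comparison is done in logarithms only to avoid floating-point overflow for large $m$.

```python
import math


def smallest_admissible_m(C0: float, B5: float, Lam: float, m_max: int = 100_000) -> int:
    """Smallest m >= 1 with C0*m < B5*Lam**m - 1 (illustrative helper).

    The inequality is rewritten as log(C0*m + 1) < log(B5) + m*log(Lam).
    """
    if C0 <= 0 or B5 <= 0 or Lam <= 1:
        raise ValueError("need C0 > 0, B5 > 0 and Lam > 1")
    for m in range(1, m_max + 1):
        if math.log(C0 * m + 1) < math.log(B5) + m * math.log(Lam):
            return m
    raise ValueError("no admissible m found up to m_max")


# Made-up constants: linear cutting bound 3*m against expansion 0.5 * 1.5**m.
m = smallest_admissible_m(C0=3.0, B5=0.5, Lam=1.5)
print(m, 3.0 * m, 0.5 * 1.5 ** m - 1)
```

Since $\Lambda > 1$, the exponential term eventually dominates any linear bound, so the search always terminates for admissible inputs; only the size of the resulting $m$ depends on the constants.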

Lemmas 5.1-5.2 effectively guarantee the growth of sufficiently short unstable curves under $T^m$. Precisely, if $m$ is large enough, and the unstable curve $\gamma$ is short enough, then the expansion factor $B_5 \Lambda^m$ of $\gamma$ under $T^m$ is larger than the "cutting factor" $K_m(\gamma) \le C_0 m$.

We can now proceed exactly as in [Ch2]. A scheme developed there for the pure billiard dynamics works perfectly for us here; it can be repeated almost word for word. We refer the reader to [Ch2] and only describe certain major steps in the scheme necessary for our further analysis.

We start by cutting $M$ along the boundaries of the homogeneity strips $I_k$, thus making $M = \cup_k I_k$ a disconnected countable union of strips $I_k$. This makes the map $T$ discontinuous on the set $\Gamma = S_{-1} \cup D_{-1}$. Note that after cutting $M$ into these strips, any connected unstable curve $\gamma \subset M$ will be automatically homogeneous.

Then we fix a higher iteration $T_1 = T^m$ of the map $T$, with $m$ picked so that $C_0 m < B_5 \Lambda^m - 1$. The map $T_1$ uniformly expands unstable vectors: $|DT_1(V)| \ge \Lambda_1 |V|$ with $\Lambda_1 := B_5 \Lambda^m > 1$ for all unstable vectors $V$ by Lemma 3.13. The map $T_1$ has singularity set $\Gamma_1 = \Gamma \cup T^{-1}\Gamma \cup \cdots \cup T^{-m+1}\Gamma = S_{-m,-1} \cup D_{-m,-1}$. Note also that $\Lambda_1 > K_m + 1$ by Lemma 5.2, so that $T_1$ expands sufficiently short unstable curves faster than the singularity set $S_{-m,-1}$ breaks them into pieces.

For any smooth curve $\gamma \subset M$ we denote by $\rho_\gamma$ the metric on $\gamma$ induced by the Euclidean metric on $M$ and by $m_\gamma$ the Lebesgue measure on $\gamma$ generated by $\rho_\gamma$. Note that $m_\gamma(\gamma) = |\gamma|$ is the length of the curve $\gamma$.

An important remark is now in order. Let $\gamma$ be a homogeneous unstable curve, $n \ge 1$, and $\xi \subset T_1^n\gamma$ any connected (and hence homogeneous and unstable) curve. Consider the measure $m_\xi^{(n)} := T_{1,*}^n m_\gamma |_\xi$, i.e. the image of the Lebesgue measure $m_\gamma$ under $T_1^n = T^{mn}$ conditioned on $\xi$. It is a probability measure on $\xi$ absolutely continuous with respect to the Lebesgue measure $m_\xi$, and its density $f_\xi^{(n)} = dm_\xi^{(n)}/dm_\xi$ satisfies
$$\frac{f_\xi^{(n)}(X)}{f_\xi^{(n)}(Y)} = \frac{J_{\gamma,mn}(T^{-mn}Y)}{J_{\gamma,mn}(T^{-mn}X)} \quad\text{for all } X, Y \in \xi. \tag{5.1}$$

Lemma 4.2 (distortion bounds) implies that
$$|\ln f_\xi^{(n)}(X) - \ln f_\xi^{(n)}(Y)| \le B_{10} |\xi|^{1/3}. \tag{5.2}$$

Key Remark. By making $|\xi|$ smaller, we can make the density $f_\xi^{(n)}$ almost constant on $\xi$, uniformly in $\xi$, $\gamma$ and $n$. In what follows we only work with unstable curves of small length, less than some $\rho_0 > 0$. We will assume that $\rho_0$ is small enough, hence all the measures $m_\xi^{(n)}$ on curves $\xi \subset T_1^n\gamma$ will be almost uniform.

For $n \ge 1$ denote by $\Gamma_1^{(n)} = \Gamma_1 \cup T_1^{-1}\Gamma_1 \cup \cdots \cup T_1^{-n+1}\Gamma_1$ the singularity set for $T_1^n$. For any $\delta > 0$ let $U_\delta$ denote the $\delta$-neighborhood of the set $\Gamma_1 \cup \partial M$.

Let $\rho_0 > 0$, $n \ge 0$, and $\gamma \subset M$ an unstable curve (which is automatically homogeneous). Let $\xi \subset \gamma$ be a disjoint union of open subintervals of $\gamma$, and for every $X \in \xi$ denote by $\xi(X)$ the subinterval of $\xi$ containing the point $X$. We call $\xi$ a $(\rho_0, n)$-subset (of $\gamma$) if for every $X \in \xi$ the set $T_1^n\xi(X)$ is a single homogeneous unstable curve of length $\le \rho_0$ (in particular, $\xi$ does not intersect the set $\Gamma_1^{(n)}$). Define a function $r_{\xi,n}$ on $\xi$ by
$$r_{\xi,n}(X) = \rho_{T_1^n\xi(X)}\big(T_1^n X, \partial T_1^n\xi(X)\big) \tag{5.3}$$

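which is simply the distance from $T_1^n X$ to the nearest endpoint of the curve $T_1^n\xi(X)$ (measured along this curve). In particular, note that $r_{\gamma,0}(X) = \rho_\gamma(X, \partial\gamma)$. We will use shorthand $m_\gamma(r_{\xi,n} < \varepsilon)$ for $m_\gamma(X \in \xi : r_{\xi,n}(X) < \varepsilon)$.

Proposition 5.3 ("Growth lemma") There is a global constant $\alpha_0 \in (0, 1)$ and positive global constants $\beta_0, \beta_1, \beta_2, \kappa, \sigma, \zeta$ with the following property. For any sufficiently small $\rho_0, \delta > 0$ and any homogeneous unstable curve $\gamma \subset M$ of length $\le \rho_0$, there is an open $(\rho_0, 0)$-subset $\xi_\delta^0 \subset \gamma \cap U_\delta$ and an open $(\rho_0, 1)$-subset $\xi_\delta^1 \subset \gamma \setminus U_\delta$ (one of these subsets may be empty) such that $m_\gamma(\gamma \setminus (\xi_\delta^0 \cup \xi_\delta^1)) = 0$ and for all $\varepsilon > 0$ we have
$$m_\gamma(r_{\xi_\delta^1,1} < \varepsilon) \le \alpha_0 \Lambda_1 \cdot m_\gamma(r_{\gamma,0} < \varepsilon/\Lambda_1) + \varepsilon \beta_0 \rho_0^{-1} m_\gamma(\gamma), \tag{5.4}$$
$$m_\gamma(r_{\xi_\delta^0,0} < \varepsilon) \le \beta_1 \delta^{-\kappa} m_\gamma(r_{\gamma,0} < \varepsilon) \tag{5.5}$$
and
$$m_\gamma(\xi_\delta^0) = m_\gamma(\gamma \cap U_\delta) \le \beta_2 m_\gamma(r_{\gamma,0} < \zeta\delta^\sigma). \tag{5.6}$$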


A general meaning of the above inequalities is the following: (5.4) ensures that the curves in the set $T_1\xi_\delta^1$ are, on the average, long enough; (5.6) asserts that the total measure of the set $\xi_\delta^0$ is small enough; and (5.5) guarantees that the connected components of $\xi_\delta^0$ are not too tiny (hence, they will grow under $T_1^n$ fast enough).

The proof of this proposition repeats word for word the proof of an identical proposition for the pure billiard case. That proof was given in [Ch2] (see the proof of the estimates (2.6)–(2.8) in Section 7 there). It was based on certain facts about billiards which were all listed in [Ch2]. Here we have proved the corresponding facts for our model in Sections 3 and 4. We even tried to use similar notation for the convenience of the reader. Thus here we can refer to [Ch2] for the proof of the above proposition.

Corollary 5.4 For any sufficiently small $\rho_0 > 0$ and any homogeneous unstable curve $\gamma \subset M$ of length $\le \rho_0$ there is an open $(\rho_0, 1)$-subset $\xi^1 \subset \gamma$ such that $m_\gamma(\gamma \setminus \xi^1) = 0$ and for all $\varepsilon > 0$ we have
$$m_\gamma(r_{\xi^1,1} < \varepsilon) \le \alpha_0 \Lambda_1 \cdot m_\gamma(r_{\gamma,0} < \varepsilon/\Lambda_1) + \varepsilon \beta_0 \rho_0^{-1} m_\gamma(\gamma). \tag{5.7}$$
Also, for any $n \ge 2$ there is an open $(\rho_0, n)$-subset $\xi^n \subset \gamma$ such that $m_\gamma(\gamma \setminus \xi^n) = 0$ and for all $\varepsilon > 0$ we have
$$\begin{aligned} m_\gamma(r_{\xi^n,n} < \varepsilon) &\le (\alpha_1 \Lambda_1)^n \cdot m_\gamma(r_{\gamma,0} < \varepsilon/\Lambda_1^n) + \varepsilon \beta_3 \rho_0^{-1} (1 + \alpha_1 + \cdots + \alpha_1^{n-1}) m_\gamma(\gamma) \\ &\le \alpha_1^n \varepsilon + \varepsilon \beta_3 \rho_0^{-1} (1 - \alpha_1)^{-1} m_\gamma(\gamma). \end{aligned} \tag{5.8}$$
Lastly, for all sufficiently small $\delta > 0$ we have
$$\begin{aligned} m_\gamma(\gamma \cap T_1^{-n} U_\delta) &\le \beta_4 m_\gamma(r_{\xi^n,n} < \zeta\delta^\sigma) \\ &\le \beta_4 \alpha_1^n \zeta\delta^\sigma + \beta_4 \zeta\delta^\sigma \beta_3 \rho_0^{-1} (1 - \alpha_1)^{-1} m_\gamma(\gamma). \end{aligned} \tag{5.9}$$
Here $\alpha_1 \in (\alpha_0, 1)$ and $\beta_3 > \beta_0$, $\beta_4 > \beta_2$ are some global constants.

Proof. The bound (5.7) follows from (5.4) by taking the limit $\delta \to 0$. The bound (5.8) follows from (5.7) by induction on $n$; this induction argument was explained in detail on pp. 432–433 in [Ch1]. The first inequality in (5.9) is obtained by applying (5.6) to every connected curve in $T_1^n\xi^n$, where $\xi^n$ is the set involved in (5.8). The second inequality in (5.9) then follows directly from the bound (5.8).

We note that the necessity to slightly increase the constants $\alpha_0, \beta_0, \beta_2$ (to $\alpha_1, \beta_3, \beta_4$ respectively) results from the slight non-uniformity of the measure $m_\xi^{(n)}$ with respect to the Lebesgue measure $m_\xi$ on every connected component $\xi$ of the set $T_1^n\xi^n$. In view of our Key Remark, we can make $\rho_0 > 0$ small enough, so that the increase of $\alpha_0$ will be small and $\alpha_1$ will still be less than one; the requirement $\alpha_1 < 1$ is crucial. ✷

Now we fix a $\rho_0 > 0$ satisfying Proposition 5.3. We also fix a small $q \in (0, 1)$ and let $\rho_1 = \rho_0 q (1 - \alpha_1)/4\beta_3$. For any homogeneous unstable curve $\gamma \subset M$ of



length $\le \rho_0$ and $n \ge 1$ let $\xi^n \subset \gamma$ be the set involved in (5.8). Denote
$$\xi^n(\rho_1) = \{X \in \xi^n : |T_1^n\xi^n(X)| \ge \rho_1\}.$$
In other words, $T_1^n\xi^n(\rho_1)$ will be the union of long enough (longer than $\rho_1$) components of $T_1^n\xi^n$. A direct calculation based on (5.8) yields:

Corollary 5.5 For all $n \ge n(\gamma) := \ln m_\gamma(\gamma)/\ln\alpha_1 + \ln(q/\rho_1)/\ln\alpha_1$ we have
$$m_\gamma(\xi^n(\rho_1)) \ge (1 - q)\, m_\gamma(\gamma).$$
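To get a feeling for the threshold $n(\gamma)$, the following sketch evaluates $\rho_1 = \rho_0 q (1 - \alpha_1)/4\beta_3$ and $n(\gamma)$ for given constants. The function names and the numerical values are hypothetical and chosen only for illustration; the paper does not provide explicit values for these constants.

```python
import math


def rho_1(rho0: float, q: float, alpha1: float, beta3: float) -> float:
    """Length threshold rho_1 = rho_0 * q * (1 - alpha_1) / (4 * beta_3)."""
    return rho0 * q * (1.0 - alpha1) / (4.0 * beta3)


def n_gamma(length: float, q: float, rho1: float, alpha1: float) -> float:
    """n(gamma) = ln|gamma| / ln(alpha_1) + ln(q / rho_1) / ln(alpha_1)."""
    return math.log(length) / math.log(alpha1) + math.log(q / rho1) / math.log(alpha1)


# Made-up constants for illustration only.
r1 = rho_1(rho0=0.01, q=0.1, alpha1=0.9, beta3=2.0)
for length in (1e-4, 1e-6, 1e-8):
    # Number of iterations of T_1 after which long components prevail.
    print(length, math.ceil(n_gamma(length, 0.1, r1, 0.9)))
```

Because $0 < \alpha_1 < 1$, the threshold grows only logarithmically as the initial curve gets shorter: each reduction of $m_\gamma(\gamma)$ by a fixed factor adds a fixed number of iterations.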

This means that in the set $T_1^n\gamma$, sufficiently long components (longer than $\rho_1$) will be prevalent after $n(\gamma)$ iterations of $T_1$. Note that $\rho_0, \rho_1, q$ are global constants (independent of the force F).

We complete this section with the construction of stable and unstable manifolds. An unstable curve $\gamma \subset M$ is called an unstable fiber (or unstable manifold) if for all $n \ge 1$ the map $T^{-n}$ is defined on $\gamma$ and $T^{-n}\gamma$ is also an unstable curve. Likewise, $\gamma$ is a stable fiber if $T^n\gamma$ is a stable curve for all $n \ge 0$. Note that for an unstable fiber $\gamma$ we have $\mathrm{diam}(T^{-n}\gamma) \to 0$ as $n \to \infty$. Similarly, for a stable fiber $\gamma$ we have $\mathrm{diam}(T^n\gamma) \to 0$ as $n \to \infty$.

The above notion corresponds to a standard definition of stable and unstable manifolds for hyperbolic dynamical systems. It is not very helpful in the case of billiards, because of the lack of proper distortion bounds. Such bounds are only available on homogeneous stable and unstable curves, as we have seen in Section 4. Hence, we adopt the following:

Definition. An unstable curve $\gamma \subset M$ is called an unstable homogeneous fiber, or h-fiber, if for all $n \ge 0$ the curve $T^{-n}\gamma$ is a homogeneous unstable curve. Similarly, $\gamma \subset M$ is a stable h-fiber if for all $n \ge 0$ the curve $T^n\gamma$ is a homogeneous stable curve.

Clearly, stable and unstable h-fibers are automatically ordinary stable and unstable fibers. But generally, h-fibers are shorter than ordinary fibers. In other words, an ordinary fiber can be a union (finite or countable) of h-fibers.

We now prove that h-fibers exist and are abundant in $M$. The hyperbolicity of the flow $\Phi^t$ or the map $T$ does not automatically provide the existence of h-fibers, though, because both the flow and the map have singularities. For $\varepsilon > 0$, denote by $U_\varepsilon^-$ the $\varepsilon$-neighborhood of $S_0 \cup S_{-1} \cup D_0$, and by $U_\varepsilon^+$ the $\varepsilon$-neighborhood of $S_0 \cup S_1 \cup D_0$. Let
$$M_\varepsilon^\pm = \{X \in M : T^{\pm n}X \notin U^\pm_{\varepsilon\Lambda^{-n}} \text{ for all } n \ge 1\}$$
(here and below $\Lambda$ is the global constant defined by (3.28)). The following is standard [Pe, Y1, Ch1]:


Fact. For every point $X \in M_\varepsilon^-$, an unstable h-fiber $\gamma^u(X)$ exists and stretches by at least $c_0\varepsilon$ in both directions from $X$ (where $c_0 > 0$ is a global constant). Similarly, for every point $X \in M_\varepsilon^+$, a stable h-fiber $\gamma^s(X)$ exists and stretches by at least $c_0\varepsilon$ in both directions from $X$.

In the notation of the previous section, we have $r_{\gamma^u(X),0}(X) \ge c_0\varepsilon$ for every $X \in M_\varepsilon^-$, and $r_{\gamma^s(X),0}(X) \ge c_0\varepsilon$ for every $X \in M_\varepsilon^+$.

Proposition 5.6 For $\nu_0$-almost every point $X \in M$ there are stable and unstable h-fibers $\gamma^u(X)$ and $\gamma^s(X)$ through $X$. Moreover, $\nu_0(X : r_{\gamma^u(X),0}(X) \le \varepsilon) \le C\varepsilon$ and $\nu_0(X : r_{\gamma^s(X),0}(X) \le \varepsilon) \le C\varepsilon$ for some global constant $C > 0$. In particular, the union of h-fibers shorter than $\varepsilon$ has $\nu_0$-measure less than $\text{const} \cdot \varepsilon$.

Proof. Since the set $S_0 \cup S_{\pm 1}$ is a finite union of smooth compact curves, the $\nu_0$-measure of its $\varepsilon$-neighborhood is less than $\text{const} \cdot \varepsilon$. A similar fact for the set $D_0$ can be verified by direct inspection. Then $\nu_0(U_\varepsilon^-) \le B'\varepsilon$ for some global constant $B'$. Due to (4.20), for all $n \ge 1$ we have $\nu_0(T^n U^-_{\varepsilon\Lambda^{-n}}) \le B' B_{11} \varepsilon (e^{-\delta_4}\Lambda)^{-n}$. Therefore, $\nu_0(M_\varepsilon^-) \ge 1 - B\varepsilon$ for some global constant $B$. A similar bound holds for $M_\varepsilon^+$. Now the proposition follows from the above fact. ✷

We record a few standard facts about h-fibers, which follow from the properties proved in Sections 3-4, in the same way as in the pure billiard case [BSC2]: (1) if a sequence of h-fibers $\gamma_n^u$, $n \ge 1$, converges to a curve $\gamma$ in the $C^0$ metric, then $\gamma$ is an h-fiber. (2) For every point $X \in M_\varepsilon^-$ the h-fiber $\gamma^u(X)$ is unique, i.e. h-fibers do not cross each other or branch out. The same holds for every $X \in M_\varepsilon^+$ and $\gamma^s(X)$.
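The measure estimate in the proof of Proposition 5.6 amounts to summing a geometric series. It can be written out as follows, as a sketch under two assumptions that the text uses implicitly: $\nu_0$ is normalized to be a probability measure, and $\delta_4$ is small enough that $e^{-\delta_4}\Lambda > 1$. The symbol $c_U$ below denotes the constant in the bound on $\nu_0(U_\varepsilon^-)$ and is a label introduced here only for illustration.

```latex
% Sketch: c_U = constant in \nu_0(U_\varepsilon^-) \le c_U \varepsilon (illustrative label)
\begin{align*}
M \setminus M_\varepsilon^- &\subset \bigcup_{n \ge 1} T^n\big(U^-_{\varepsilon\Lambda^{-n}}\big), \\
\nu_0\big(T^n U^-_{\varepsilon\Lambda^{-n}}\big)
  &\le B_{11}\, e^{\delta_4 n}\, \nu_0\big(U^-_{\varepsilon\Lambda^{-n}}\big)
   \le c_U B_{11}\, \varepsilon\, (e^{-\delta_4}\Lambda)^{-n}, \\
\nu_0\big(M \setminus M_\varepsilon^-\big)
  &\le c_U B_{11}\, \varepsilon \sum_{n \ge 1} (e^{-\delta_4}\Lambda)^{-n}
   = \frac{c_U B_{11}}{e^{-\delta_4}\Lambda - 1}\, \varepsilon .
\end{align*}
```

The second line uses the upper Jacobian bound in (4.20); the third line gives the linear-in-$\varepsilon$ bound on the complement of $M_\varepsilon^-$.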

6 A Sinai-Ruelle-Bowen measure for the map T

For any unstable h-fiber $\gamma \subset M$, a unique probability measure $\nu_\gamma$, absolutely continuous with respect to the Lebesgue measure $m_\gamma$ with density $f_\gamma = d\nu_\gamma/dm_\gamma$, is defined by the following condition:
$$\frac{f_\gamma(X)}{f_\gamma(Y)} = \lim_{n \to \infty} \frac{J_{T^{-n}\gamma,n}(T^{-n}Y)}{J_{T^{-n}\gamma,n}(T^{-n}X)} \quad\text{for all } X, Y \in \gamma \tag{6.1}$$
(compare this to (5.1)). The existence of the limit (6.1) is guaranteed by Lemma 4.2 (distortion bounds). We call $\nu_\gamma$ the u-SRB measure on $\gamma$. Observe that u-SRB measures are conditionally invariant under $T$, i.e. for any subsegment $\gamma_1 \subset T\gamma$, the measure $T_*\nu_\gamma |_{\gamma_1}$ (the image of $\nu_\gamma$ under $T$ conditioned on $\gamma_1$) coincides with $\nu_{\gamma_1}$.

Note that the density $f_\gamma$ is a pointwise limit of the densities $f_\gamma^{(n)}$ introduced in the previous section, as $n \to \infty$. The bound (5.2) implies a similar bound for



$f_\gamma$. So, according to our Key Remark, all the u-SRB densities are almost constant on unstable h-fibers of length $\le \rho_0$.

Definition. A $T$-invariant ergodic probability measure $\nu$ on $M$ is called a Sinai-Ruelle-Bowen (SRB) measure if its conditional distributions on unstable h-fibers are absolutely continuous. In that case the conditional measure $\nu|_\gamma$ is the u-SRB measure $\nu_\gamma$ on every unstable h-fiber.

The significance of SRB measures lies in the following facts. For any SRB measure $\nu$ there is a set $B \subset M$ of positive Lebesgue measure (sometimes called the basin of attraction) such that for every $X \in B$ and any continuous function $f : M \to \mathbb{R}$
\[ \frac{f(X) + f(TX) + \cdots + f(T^{n-1}X)}{n} \to \int_M f(X)\, d\nu \]
as $n \to \infty$. Thus, the measure $\nu$ describes the distribution of trajectories of points $X \in B$, which are physically observable (detectable) since $\nu_0(B) > 0$. Hence, SRB measures are physically observable.

The first goal of this section is to prove the existence and finiteness of SRB measures. We first prove a similar claim for the map $T_1 = T^m$ introduced in the previous section. In [Pe], Pesin found sufficient conditions for the existence of SRB measures for a wide class of hyperbolic maps with singularities (he called them generalized hyperbolic attractors), which included the class we study here. We restate Pesin’s existence theorem in our notation. Denote by $m$ the Lebesgue measure on $M$.

Theorem 6.1 (see [Pe]) The map $T_1$ admits at least one and at most countably many SRB measures, provided the following two conditions hold. First, there are constants $C_1 > 0$, $q_1 > 0$ such that for all $\varepsilon > 0$, $n \ge 1$
\[ m(T_1^{-n} U_\varepsilon) \le C_1 \varepsilon^{q_1}. \tag{6.2} \]
Second, there is an unstable h-fiber $\gamma \subset M$ and constants $C_2 > 0$, $q_2 > 0$ such that for all $\varepsilon > 0$, $n \ge 1$
\[ m_\gamma(\gamma \cap T_1^{-n} U_\varepsilon) \le C_2 \varepsilon^{q_2}. \tag{6.3} \]
Each SRB measure is K-mixing and Bernoulli, up to a finite cycle.

Recall that $U_\varepsilon$ stands for the $\varepsilon$-neighborhood of the set $\Gamma_1 \cup \partial M$. Later Sataev [Sa] showed that the number of SRB measures is finite under two additional conditions: there are constants $C_3 > 0$, $q_3 > 0$ such that for every homogeneous unstable curve $\gamma \subset M$ there are $n_\gamma \ge 1$ and $C_\gamma > 0$ such that for all $\varepsilon > 0$
\[ m_\gamma(\gamma \cap T_1^{-n} U_\varepsilon) \le C_\gamma\, \varepsilon^{q_3}\, m_\gamma(\gamma) \quad \text{for all } n > 0 \tag{6.4} \]
and
\[ m_\gamma(\gamma \cap T_1^{-n} U_\varepsilon) \le C_3\, \varepsilon^{q_3}\, m_\gamma(\gamma) \quad \text{for all } n > n_\gamma. \tag{6.5} \]
We now verify Pesin’s and Sataev’s conditions.
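As an aside, the convergence of time averages to the space average $\int_M f\,d\nu$, which is what makes SRB measures physically observable, can be illustrated numerically. The sketch below does not use the map $T$ of this paper. As a stand-in it takes the logistic map $x \mapsto 4x(1-x)$ on $[0,1]$, whose absolutely continuous invariant measure has density $1/(\pi\sqrt{x(1-x)})$, so that the space average of $f(x) = x$ equals $1/2$. The function names, the starting point and the orbit length are illustrative assumptions.

```python
def logistic(x):
    """Logistic map x -> 4x(1-x), used here only as a toy chaotic map."""
    return 4.0 * x * (1.0 - x)


def birkhoff_average(f, x0, n, burn_in=1000):
    """Return (f(X) + f(TX) + ... + f(T^{n-1} X)) / n along a single orbit."""
    x = x0
    for _ in range(burn_in):  # discard a transient; not needed in theory
        x = logistic(x)
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = logistic(x)
    return total / n


# The space average of f(x) = x against 1 / (pi * sqrt(x (1 - x))) is 1/2.
time_avg = birkhoff_average(lambda x: x, x0=0.1234, n=200_000)
print("time average:", time_avg, " space average: 0.5")
```

In floating-point arithmetic the orbit is only a pseudo-orbit, so the agreement is statistical rather than exact.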


Proposition 6.2 The map $T_1$ satisfies (6.2)-(6.5). Hence, $T_1$ admits at least one and at most finitely many SRB measures. Every SRB measure is K-mixing and Bernoulli, up to a finite cycle.

Proof. We foliate $M$ by smooth unstable curves whose collection we denote by $\Gamma^* = \{\gamma\}$. We require that the length of each $\gamma \in \Gamma^*$ be $\rho_0$ (except for the corners of $M$ and narrow strips $I_k$, where the curves are necessarily shorter). Let $m^*_\gamma$ be the conditional measure on each $\gamma \in \Gamma^*$ induced by the Lebesgue measure $m$ on $M$, and $m^*$ the factor measure on $\Gamma^*$. If the foliation is smooth enough and $\rho_0$ small enough, then every $m^*_\gamma$ will have almost uniform density with respect to the Lebesgue measure $m_\gamma$. In fact, the curves $\gamma$ can be chosen as parallel line segments, and then the measures $m^*_\gamma$ will be exactly uniform. Now the condition (6.2) easily follows from the bound (5.9) by integration over $\Gamma^*$ with respect to the factor measure $m^*$, which is a straightforward calculation. The conditions (6.3) and (6.4) are direct consequences of (5.9). Lastly, the inequality (6.5) follows from (5.9) whenever $\alpha_1^n < m_\gamma(\gamma)$, i.e. $n > \ln m_\gamma(\gamma)/\ln \alpha_1$. ✷

Proposition 6.3 The map $T$ admits at least one and at most finitely many SRB measures. Every SRB measure is K-mixing and Bernoulli, up to a finite cycle.

Proof. Let $\nu$ be an SRB measure for the map $T_1 = T^m$. Then the measure $(\nu + T_*\nu + \cdots + T_*^{m-1}\nu)/m$ will be an SRB measure for the map $T$, hence the existence part. Now, let $\nu$ be an SRB measure for $T$. If it is ergodic for $T_1$, then it is an SRB measure for $T_1$. Otherwise $\nu$ has at most $m$ ergodic components (with respect to $T_1$), each of which is an SRB measure for $T_1$. This proves Proposition 6.3. ✷

The following proposition gives Theorem 2.2 modulo Theorem 2.1, whose proof is yet to be completed.

Proposition 6.4 Each SRB measure $\nu$ of the map $T_F$ enjoys the exponential decay of correlations (2.9) and satisfies the central limit theorem (2.10). The correlation bound (2.9) is uniform in F.

Proof. This follows from a general theorem proved in [Ch2]. That theorem is stated for generic hyperbolic maps satisfying certain assumptions. All the assumptions have already been verified in Sections 3-5. The uniformity of the correlation bound follows from the fact that all the constants in the crucial estimates in Sections 3-5 (most notably, in the “growth lemma” 5.3) are global, i.e. independent of F. Thus we obtain Proposition 6.4. ✷

The uniqueness of an SRB measure for $T$ requires a more elaborate argument. We recall that the space $M$ in the coordinates $(r, \varphi)$ does not depend on the force F in (1.1). So we consider all the maps $T = T_F$ as defined on the same space $M$. For F = 0, we get the billiard map $T_0$. Recall that the space $M$ is cut into countably many strips, $I_k$, hence all stable and unstable curves will be automatically homogeneous.


The classes of stable and unstable curves depend on F, but only slightly, as follows from our definitions in Sections 3 and 4. For simplicity, we intersect these classes over all relevant F’s. Hence, from now on, stable and unstable curves mean such curves for all relevant maps $T = T_F$. On the contrary, stable and unstable h-fibers depend on F strongly (not only their directions, but even more their sizes), so we will denote them by $\gamma^{s,u}_F(X)$, respectively, for $X \in M$.

For any $\rho > 0$, consider the class $\mathcal{C}_u(\rho)$ of unstable curves $\gamma \subset M$ of length $\ge \rho$. Denote by $\bar{\mathcal{C}}_u(\rho)$ its closure in the Hausdorff metric. Recall that the Hausdorff metric defines the distance between two compact subsets $A, B \subset M$ by
\[ \mathrm{dist}(A, B) = \max\Big\{ \max_{X \in A} \mathrm{dist}(X, B),\ \max_{Y \in B} \mathrm{dist}(Y, A) \Big\} \]
(it is just the $C^0$ metric if restricted to continuous curves in $M$). Recall that our unstable curves are at least $C^2$, their tangent vectors satisfy the uniform bound in Lemma 3.10 and their curvature is uniformly bounded by Lemma 4.1. Therefore, all the curves in the class $\bar{\mathcal{C}}_u(\rho)$ are of length $\ge \rho$, at least $C^1$ (but not necessarily $C^2$), and their tangent vectors satisfy the same bound in Lemma 3.10. We will call curves $\gamma \subset \cup_{\rho>0} \bar{\mathcal{C}}_u(\rho)$ generalized unstable curves. Generalized stable curves are defined similarly, and we denote their class by $\cup_{\rho>0} \bar{\mathcal{C}}_s(\rho)$.

We call a rhombus $R \subset M$ a domain bounded by two unstable curves and two stable curves (called the sides of $R$). We say that a generalized unstable curve $\gamma$ straddles $R$ if $\gamma \subset R$ and the endpoints of $\gamma$ lie on the (opposite) stable sides of $R$. We say that $\gamma$ properly crosses $R$ if $\gamma$ intersects the middle half of each stable side of $R$ and the points of intersection divide $\gamma$ into three parts of which the smallest one is $\gamma \cap R$. Similar notions are defined for generalized stable curves. For a rhombus $R$, let $R^*_F$ be the set of points $X \in R$ such that both $\gamma^u_F(X)$ and $\gamma^s_F(X)$ properly cross $R$.

Lemma 6.5 There is a rhombus $R \subset M$ such that $\nu_0(R^*_0) > 0$.

This easily follows from Proposition 5.6. ✷

Note that we do not claim that $\nu_0(R^*_F) > 0$ for all F, or even for any F ≠ 0. This will follow from our further results, see Corollary 6.10, etc. We fix a rhombus $R$ that satisfies the above lemma. Since it does not depend on the force F, it is a “global” object, just like our constants $B_i$ in the previous sections.

For any generalized unstable curve $\gamma$ and $n \ge 1$ let $\gamma_F(n)$ denote the union of intervals $\xi \subset \gamma$ such that $T_F^n \xi$ is one generalized unstable curve that straddles $R$. Also, let $\gamma^p_F(n)$ denote the union of intervals $\xi \subset \gamma$ such that $T_F^n \xi$ is one generalized unstable curve that properly crosses $R$.
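For readers who want to experiment with the Hausdorff metric recalled above, here is a minimal sketch for finite point sets in the plane, which can serve as a discretization of curves. Sampling curves by finitely many points and the names `dist_point_to_set` and `hausdorff` are assumptions of this illustration, not constructions from the paper.

```python
import math


def dist_point_to_set(p, points):
    """dist(X, B): distance from the point p to the nearest point of the set."""
    return min(math.dist(p, q) for q in points)


def hausdorff(a, b):
    """dist(A, B) = max{ max_{X in A} dist(X, B), max_{Y in B} dist(Y, A) }."""
    return max(
        max(dist_point_to_set(p, b) for p in a),
        max(dist_point_to_set(q, a) for q in b),
    )


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
d = hausdorff(A, B)
print(d)  # every point of A lies in B, but (0, 3) is at distance 3 from A
```

Both one-sided maxima are needed: here the first is zero while the second is not, so neither alone defines a metric.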
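Lemma 6.6 There are global constants $\tilde n = \tilde n(\rho_1, R) \ge 1$ and $\tilde\beta_1 = \tilde\beta_1(\rho_1, R) > 0$ such that for every generalized unstable curve $\gamma \subset M$ of length $\ge \rho_1$ and all $n \ge \tilde n$
\[ m_\gamma(\gamma^p_0(n)) \ge \tilde\beta_1\, m_\gamma(\gamma). \]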


In other words, for all $n \ge \tilde n$ the image $T_0^n \gamma$ will contain a certain positive fraction (characterized by $\tilde\beta_1$) of curves that properly cross the rhombus $R$. This lemma is proved in [BSC2] (see Theorem 3.13 there) under the additional assumption that $\gamma$ is an h-fiber. However, the past images $T_0^{-k}\gamma$, $k > 0$, are not involved in that theorem or its proof, and, clearly, there is no difference between unstable h-fibers and generalized unstable curves as far as their forward iterations are concerned. Thus, Theorem 3.13 in [BSC2] extends to generalized unstable curves.

Lemma 6.7 For any force F satisfying Assumptions A and B and every generalized unstable curve $\gamma \subset M$ of length $\ge \rho_1$ and all $n \in [\tilde n, \tilde n + m]$
\[ m_\gamma(\gamma_F(n)) \ge \tilde\beta_2\, m_\gamma(\gamma) \]
where $\tilde\beta_2 > 0$ is a global constant, and $m$ is again the fixed power of $T$, i.e. such that $T_1 = T^m$.

In other words, for every $n = \tilde n, \ldots, \tilde n + m$ the image $T_F^n\gamma$ will contain a certain positive fraction of curves that straddle the rhombus $R$.

Proof. Let $\gamma$ be a generalized unstable curve and $n \in [\tilde n, \tilde n + m]$. Consider the curves $\xi \in T_0^n\gamma$ that properly cross $R$ (such curves exist by Lemma 6.6). Let F be small enough, so that the map $T_F$ is a small enough perturbation of $T_0$, in particular the singularity sets of these maps are close enough to each other. Let also $\gamma' \subset M$ be a generalized unstable curve sufficiently close to $\gamma$ in the Hausdorff metric. We claim that if F ≈ 0 and $\gamma' \approx \gamma$, then to every curve $\xi \in T_0^n\gamma$ that properly crosses $R$ there corresponds a curve $\xi' \in T_F^n\gamma'$ that is close to $\xi$ and has almost the same length. We emphasize that we first fix $\gamma$ and $n$, and then assume that F ≈ 0 and $\gamma' \approx \gamma$, for the given $\gamma$ and $n$. Note that $\xi'$ will be one curve (not broken by singularities), because of the continuation property from the end of Section 4. One can easily see that, by that property, if any long enough generalized unstable curve $\gamma$ intersects the singularity set for $T_0$, then any generalized unstable curve $\gamma'$ close enough to $\gamma$ (in the Hausdorff metric) intersects the singularity set for $T_F$ with F ≈ 0, and vice versa. This justifies our claim.

Now, since $\xi'$ is close to $\xi$, and $\xi$ properly crosses $R$, the curve $\xi'$ crosses both stable sides of $R$, and so the curve $\xi' \cap R$ straddles $R$. Thus, given $\gamma$ and $n$, there is an open neighborhood $\mathcal{V}(\gamma, n)$ of the curve $\gamma$ in the class $\bar{\mathcal{C}}_u(\rho_1)$ equipped with the Hausdorff metric and a $\delta_0(\gamma, n) > 0$ such that any curve $\gamma' \in \mathcal{V}(\gamma, n)$ satisfies
\[ m_{\gamma'}(\gamma'_F(n)) \ge \tilde\beta_2\, m_{\gamma'}(\gamma') \tag{6.6} \]
for some global constant $\tilde\beta_2 > 0$ and all F’s that satisfy Assumptions A and B with $\delta_0 < \delta_0(\gamma, n)$. The finite intersection
\[ \mathcal{V}(\gamma) := \bigcap_{n=\tilde n}^{\tilde n + m} \mathcal{V}(\gamma, n) \]


is also an open neighborhood of the curve $\gamma$ in the class $\bar{\mathcal{C}}_u(\rho_1)$. Any curve $\gamma' \in \mathcal{V}(\gamma)$ satisfies the inequality (6.6) for all $n \in [\tilde n, \tilde n + m]$ and with all F’s that satisfy Assumptions A and B with
\[ \delta_0 < \delta_0(\gamma) := \min_{\tilde n \le n \le \tilde n + m} \delta_0(\gamma, n). \]
Since the class $\bar{\mathcal{C}}_u(\rho_1)$ is obviously compact in the Hausdorff metric, there is a finite cover of $\bar{\mathcal{C}}_u(\rho_1)$ by some $\mathcal{V}(\gamma_j)$, $1 \le j \le J$. This proves the lemma for all forces satisfying Assumptions A and B with
\[ \delta_0 < \delta_* := \min_{1 \le j \le J} \delta_0(\gamma_j). \]

✷

Remark. Note that in the proof of Lemma 6.7 we have put a new restriction $\delta_0 < \delta_*$ on the $\delta_0$ that enters Assumption B. This restriction is probably much more severe than any of the restrictions on $\delta_0$ we needed before. Therefore, the uniqueness of the SRB measure probably holds for much smaller forces F than the hyperbolicity of $T_F$ and the existence and finiteness of SRB measures do. We therefore expect that in physical models where F changes from F = 0 continuously (such as by increasing the strength of an electrical field [CELS1]), one first observes a unique non-smooth SRB measure, then a finite collection of SRB measures, and then non-SRB stationary states. Such experiments were done, for example, in [DM]. This discussion is related to the physically important issue of the range of applicability of the linear response theory; see van Kampen’s objections [K] and some counterarguments in [CELS1].

Lemma 6.7 and Corollary 5.5 easily imply the following two corollaries.

Corollary 6.8 There are global constants $\tilde n_1 \ge 1$ and $\tilde\beta_3 > 0$ such that for any force F satisfying Assumptions A and B with $\delta_0 < \delta_*$ and every generalized unstable curve $\gamma \subset M$ of length $\ge \rho_1$ and all $n \ge \tilde n_1$
\[ m_\gamma(\gamma_F(n)) \ge \tilde\beta_3\, m_\gamma(\gamma). \]
The main difference from Lemma 6.7 is that now all $n \ge \tilde n_1$ are covered, rather than $n \in [\tilde n, \tilde n + m]$.

Corollary 6.9 There are global constants $\tilde n_2 \ge 1$ and $\tilde\beta_4 > 0$ such that for any force F satisfying Assumptions A and B with $\delta_0 < \delta_*$ and every generalized unstable curve $\gamma \subset M$ of length $|\gamma| = \varepsilon > 0$ and all
\[ n \ge \tilde n(\varepsilon) := \ln \varepsilon / \ln \alpha_1 + \tilde n_2 \tag{6.7} \]
we have
\[ m_\gamma(\gamma_F(n)) \ge \tilde\beta_4\, m_\gamma(\gamma). \tag{6.8} \]


Let $R^u_F$ be the set of points $X \in R$ such that the unstable h-fiber $\gamma^u(X) \cap R$ straddles $R$.

Corollary 6.10 There is a global constant $\tilde\beta_R > 0$ such that for any $T_F$ with $\delta_0$ [...] $\tilde\beta_R$. Furthermore, let $\nu$ be not mixing, so that by Proposition 6.3, $T_F$ permutes a finite number of subsets $X_1, \ldots, X_k \subset M$ on each of which $T_F^k$ is mixing. In this case we have $\nu(R^u_F \cap X_i) > 0$ for every $i = 1, \ldots, k$.

Proposition 6.11 For any force F satisfying Assumptions A and B with $\delta_0 < \delta_*$ the SRB measure of the map $T_F$ is unique and mixing.

We first adopt a definition.

Definition. Let $\gamma^s_F$ be a stable h-fiber. For $\varepsilon > 0$, let $\Gamma_\varepsilon(\gamma^s_F)$ denote the union of all stable h-fibers in $M$ that are $\varepsilon$-close to $\gamma^s_F$ in the Hausdorff metric. We call $\gamma^s_F$ a density h-fiber if for every $\varepsilon > 0$ the set $\Gamma_\varepsilon(\gamma^s_F)$ has positive Lebesgue measure in $M$. Note that in this case for any generalized unstable curve $\xi \subset M$ that crosses $\gamma^s_F$, the set $\xi \cap \Gamma_\varepsilon(\gamma^s_F)$ has positive $m_\xi$ measure, by the absolute continuity Lemma 4.3. Similarly, we introduce unstable density h-fibers.

Lemma 6.12 For each map $T_F$ there are density h-fibers. In fact, their union has full Lebesgue measure. If $\gamma^s_F$ is a density h-fiber, then all the connected components of $T^{-n}\gamma^s_F$ are density h-fibers, too, for every $n \ge 1$.

Proof. The first two claims follow from Proposition 5.6. To prove the last one, we set $n = 1$ and note that $T_F^{-1}$ is piecewise smooth and its singularities are unstable curves with the continuation property. Then we use induction on $n$. ✷

Proof of Proposition 6.11. Let $\gamma^s_F$ be a density h-fiber. By the above Lemma and Corollary 6.9 (actually applied to stable curves), there exist density h-fibers in $T^{-n}\gamma^s_F$ for some $n \ge 1$ that straddle the rhombus $R$. This, along with Corollary 6.10, proves Proposition 6.11. ✷

Proposition 6.13 For any force F satisfying Assumptions A and B with $\delta_0 < \delta_*$ the SRB measure $\nu$ of the map $T_F$ is positive on open sets. Moreover, for every small round disk $D \subset M$ we have
\[ \nu(D) \ge c_1 [\nu_0(D)]^{1+\delta_5} \]
for some global constant $c_1 > 0$ and small constant $\delta_5 > 0$ depending on $\delta_0$ (i.e., $\delta_5 \to 0$ as $\delta_0 \to 0$).

Proof. Since the disk $D$ is connected, it lies in one homogeneity strip $I_k$, and so the quantity $\cos\varphi$ does not vary too much over $D$, i.e. the measure $\nu_0$ is almost proportional to the Lebesgue measure $m$ on $D$. We can find a rhombus $R_D \subset D$ whose opposite sides are parallel straight lines and which is big enough so that, say,


$\nu_0(R_D) \ge \nu_0(D)/10$. Now, we foliate the rhombus $R_D$ by parallel stable segments $\gamma$ that straddle $R_D$ and are parallel to the stable sides of $R_D$. Hence, all $\gamma$’s in our foliation have the same length, $\varepsilon$. Note that $\varepsilon \ge c_2 \sqrt{\nu_0(D)}$ with a global constant $c_2 > 0$. For any $\gamma$ in our foliation of $R_D$ and $n \ge 1$ let $\gamma_F(-n)$ denote the union of intervals $\xi \subset \gamma$ such that $T_F^{-n}\xi$ is one stable curve that straddles the rhombus $R$ (fixed earlier). Corollary 6.9 (actually, its dual statement for stable curves) implies that $m_\gamma(\gamma_F(-n)) > \tilde\beta_4\, m_\gamma(\gamma)$ for all $n \ge \tilde n := \tilde n(\varepsilon)$. Consider the set $R_D(-n) := \cup_\gamma \gamma_F(-n)$ where the union is taken over all $\gamma$ in our foliation of $R_D$. Our previous estimates imply that $m(R_D(-n)) \ge \tilde\beta_4\, m(R_D)$ and hence $\nu_0(R_D(-n)) \ge \tilde\beta_5\, \nu_0(R_D)$ for all $n \ge \tilde n$, and with the global constant $\tilde\beta_5 = \tilde\beta_4/2$. Now the volume compression bounds (4.20) imply
\[ \nu_0(T_F^{-n} R_D(-n)) \ge B_{11}^{-1} e^{-\delta_4 n}\, \tilde\beta_5\, \nu_0(R_D) \]
for all $n \ge \tilde n$. We put $n = \tilde n = \tilde n(\varepsilon)$ given by (6.7) and obtain
\[ \nu_0(T_F^{-\tilde n} R_D(-\tilde n)) \ge c'\, \varepsilon^{\delta_6}\, \nu_0(R_D) \ge c''\, [\nu_0(R_D)]^{1+\delta_6/2} \]
with some positive global constants $c'$, $c''$ and a small constant $\delta_6 = \delta_4/\ln\alpha_1$. We put $\delta_5 = \delta_6/2$.

Next observe that the set $T_F^{-\tilde n} R_D(-\tilde n)$ is a union of stable curves that straddle our fixed rhombus $R$. Lemma 4.3 (absolute continuity) then implies that for any unstable curve $\xi$ that straddles $R$ we have
\[ m_\xi(\xi \cap T_F^{-\tilde n} R_D(-\tilde n)) \ge c\, [\nu_0(R_D)]^{1+\delta_5}\, m_\xi(\xi) \]
with a global constant $c > 0$. This bound combined with Corollary 6.10 yields
\[ \nu(T_F^{-\tilde n} R_D(-\tilde n)) \ge \tilde c\, [\nu_0(R_D)]^{1+\delta_5} \]
for some global constant $\tilde c > 0$ and the SRB measure $\nu$ of the map $T_F$. The $T_F$-invariance of $\nu$ completes the proof of Proposition 6.13. ✷

Theorems 2.1 and 2.2 are now proved.

Acknowledgment. The author is grateful to H. van den Bedem and L.-S. Young for their interest in this work and useful discussions, and to the referee for helpful comments.


References

[Bl]

P. Bleher, Statistical properties of two-dimensional periodic Lorentz gas with infinite horizon, J. Statist. Phys. 66, 315–373 (1992).

[BSC1]

L. A. Bunimovich, Ya. G. Sinai and N.I. Chernov, Markov partitions for two-dimensional billiards, Russ. Math. Surv. 45, 105–152 (1990).

[BSC2]

L. A. Bunimovich, Ya. G. Sinai and N. I. Chernov, Statistical properties of two-dimensional hyperbolic billiards, Russ. Math. Surv. 46, 47–106 (1991).

[CELS1]

N. I. Chernov, G. L. Eyink, J. L. Lebowitz and Ya. G. Sinai, Steady-state electrical conduction in the periodic Lorentz gas, Comm. Math. Phys. 154, 569–601 (1993).

[CELS2]

N. I. Chernov, G. L. Eyink, J. L. Lebowitz and Ya. G. Sinai, Derivation of Ohm’s law in a deterministic mechanical model, Phys. Rev. Lett. 70, 2209–2212 (1993).

[Ch1]

N. Chernov, Statistical properties of piecewise smooth hyperbolic systems in high dimensions, Discr. Cont. Dynam. Syst. 5, 425–448 (1999).

[Ch2]

N. Chernov, Decay of correlations and dispersing billiards, J. Statist. Phys. 94, 513–556 (1999).

[DM]

C. P. Dettmann and G. P. Morriss, Crisis in the periodic Lorentz gas, Phys. Rev. E. 54, 4782–4790 (1996).

[FM1]

B. Friedman, R. F. Martin, Decay of the velocity autocorrelation function for the periodic Lorentz gas, Phys. Lett. A 105, 23–26 (1984).

[FM2]

B. Friedman, R. F. Martin, Behavior of the velocity autocorrelation function for the periodic Lorentz gas, Physica D 30, 219–227 (1988).

[GO]

G. Gallavotti and D. Ornstein, Billiards and Bernoulli schemes, Comm. Math. Phys. 38, 83–101 (1974).

[GC]

G. Gallavotti and E.D.G. Cohen, Dynamical ensembles in stationary states, J. Stat. Phys. 80, 931–970 (1995).

[K]

N. van Kampen, The case against linear response theory, Phys. Norv. 5, 279–284 (1971).

[Pe]

Ya. B. Pesin, Dynamical systems with generalized hyperbolic attractors: hyperbolic, ergodic and topological properties, Ergod. Th. Dynam. Sys. 12, 123–151 (1992).


[Ru]

D. Ruelle, Smooth dynamics and new theoretical ideas in non-equilibrium statistical mechanics, J. Statist. Phys. 94 (1999).

[Sa]

E. A. Sataev, Invariant measures for hyperbolic maps with singularities, Russ. Math. Surv. 47, 191–251 (1992).

[Si]

Ya. G. Sinai, Dynamical systems with elastic reflections. Ergodic properties of dispersing billiards, Russ. Math. Surv. 25, 137–189 (1970).

[W1]

M. P. Wojtkowski, Invariant families of cones and Lyapunov exponents, Ergod. Th. Dynam. Sys. 5, 145–161 (1985).

[W2]

M. P. Wojtkowski, Magnetic flows and Gaussian thermostats on manifolds of negative curvature, Fund. Math. 163, 177–191 (2000).

[W3]

M. P. Wojtkowski, Flows on Weyl manifolds and Gaussian thermostats, preprint (2000).

[Y1]

L.S. Young, Statistical properties of dynamical systems with some hyperbolicity, Annals of Math. 147, 585–650 (1998).

[Y2]

L.-S. Young, Ergodic theory of chaotic dynamical systems, XIIth International Congress of Mathematical Physics (ICMP’97) (Brisbane), 131–143, Internat. Press, Cambridge, MA, 1999.

N.I. Chernov
Department of Mathematics
University of Alabama at Birmingham
Birmingham, AL 35294

Communicated by Eduard Zehnder
submitted 26/01/00, accepted 22/11/00




Supersymmetry, Witten Complex and Asymptotics for Directional Lyapunov Exponents in $\mathbb{Z}^d$

Wei-Min Wang

Abstract. We study the long distance asymptotics for random walks in random potentials. We use the analytical formulation. We relate the problem to Witten Laplacians by using supersymmetry. We obtain the asymptotics for the directional Lyapunov exponents.

I Introduction

In this paper, we consider random walks in random potentials. There are two equivalent ways to look at the problem: probabilistic and analytical. We use the latter. But the former is more natural and useful. So we begin by presenting the problem from the probabilistic point of view.

I.1

Random walks in random potentials

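Let $\Delta$ be the usual discrete Laplacian acting on $\ell^2(\mathbb{Z}^d)$, with its matrix elements
\[ \Delta_{ij} = \begin{cases} 1 & \text{if } |i - j|_1 = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{1.1} \]
Define $P$ on $\mathbb{Z}^d \times \mathbb{Z}^d$ by
\[ P(i, j) = \frac{\Delta_{ij}}{2d}, \]
for all $(i, j) \in \mathbb{Z}^d \times \mathbb{Z}^d$. We have from (1.1) that
\[ P(i, j) = P(0, i - j), \qquad P(0, i) \ge 0, \qquad \sum_{i \in \mathbb{Z}^d} P(0, i) = 1, \]
i.e., $P(i, j)$ defines the transition probability of a (simple) random walk (see e.g., [Spi]). Let $H$ be the operator
\[ H = \Delta + \gamma V \quad \text{on } \ell^2(\mathbb{Z}^d), \tag{1.2} \]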

where ∆ is as defined in (1.1), γ is a positive parameter, the potential function V is a diagonal matrix: V = diag(vj ), j ∈ Zd , where {vj } is a family of independently identically distributed (iid) real random variables with distribution dg. The


probability measure is taken to be $P = \prod_{j \in \mathbb{Z}^d} dg(v_j)$. We use the notation $\langle\ \rangle_g$ to denote the expectation value with respect to $P$. The operator $H$ defines a random walk in the random potential $V$. As is well known, the spectrum $\sigma(\Delta)$ is $[-2d, 2d]$. (Note that the Laplacian defined here differs from the one usually used by the probabilists by the constant $2d$ times the identity. So the spectrum differs by $2d$.) Let supp $dg$ be the support of $dg$, then we have the well established fact (see e.g., [CFKS, PF]) that
\[ \sigma(H) = [-2d, 2d] + \operatorname{supp} dg \tag{1.3} \]
almost surely. Assume $E \notin \sigma(H)$, then we define $G(E) = (E - H)^{-1}$ and
\[ G(E, \mu, \nu) = (\delta_\mu, (E - H)^{-1} \delta_\nu), \qquad \mu, \nu \in \mathbb{Z}^d, \tag{1.4} \]
to be the Green’s function (or correlation function) of $H$ at energy $E$.

In the absence of the potential, the Green’s function (i.e., the Green’s function for $\Delta$) has the following path representation (see e.g., p. 169 of [La]):
\[ G_0(E, \mu, \nu) = (E - \Delta)^{-1}(\mu, \nu) = \sum_{w:\, \mu \to \nu} E^{-|w|}, \qquad E > 2d, \tag{1.5} \]
where the sum is over all walks $w$ from $\mu$ to $\nu$ and $|w|$ is the length (i.e., total number of steps) of the walk $w$. Similarly the Green’s function for $H$ defined earlier in (1.4) has the representation (see e.g., (1) of [Z]):
\[ G(E, \mu, \nu) = (E - H)^{-1}(\mu, \nu) = \sum_{n=0}^{\infty}\ \sum_{w:\, \mu \to \nu;\ |w| = n}\ \prod_{j=0}^{n} (E - \gamma V(w_j))^{-1}, \qquad E > 2d, \tag{1.6} \]
where $w_j$ is the position of the walk $w$ after $j$ steps. Note that due to the discrete time formulation, (1.5) and (1.6) are slightly different from the usual Feynman-Kac formula. Clearly (1.6) reduces to (1.5) when $V = 0$, and can be seen as defining a weighted (by $V$) path measure. (In fact, (1.5), (1.6) are, in analytical language, the resolvent (or Neumann) series about $E$ or $E - \gamma V$ written in a slightly different way.)

For a given realization of the potential $V$, $G(E, \mu, \nu)$ is the probability of finding the random walk at site $\nu$ conditioned that it starts at site $\mu$. (Recall that the Green’s function is the integral over all time $t$ of the heat kernel.) For more details, see e.g., [La, MS, Spi]. It follows then that $\langle G(E, \mu, \nu) \rangle_g$ is the expected (with respect to the random potential $V$) probability of finding the random walk at site $\nu$ conditioned that it starts at site $\mu$. In the limit $|\mu - \nu| \to \infty$, $\langle G(E, \mu, \nu) \rangle_g$ is also essentially the normalization constant for the measure defined as the tensor product of the probability measure $P = \prod_{i \in \mathbb{Z}^d} dg(v_i)$ with the path measure for $H$ in (1.6) (see (7), p. XI and (2.7), p. 323 of [Sz1]).
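The walk expansion (1.6) can be checked numerically. The following sketch makes several assumptions that are not in the text: the lattice $\mathbb{Z}^d$ is replaced by a finite ring of $N$ sites with $d = 1$, the potential is one sample of the Bernoulli type with values in $\{0, -1\}$, and the sum over walks is truncated at a finite length. It accumulates the terms of (1.6) by repeatedly applying $(E - \gamma V)^{-1}\Delta$, then checks that the result inverts $E - \Delta - \gamma V$. All function names are illustrative.

```python
import random


def apply_laplacian(u):
    """Ring Laplacian: (Delta u)(i) = u(i - 1) + u(i + 1), indices mod N."""
    n = len(u)
    return [u[(i - 1) % n] + u[(i + 1) % n] for i in range(n)]


def green_by_walks(E, gamma, v, nu, n_max=300):
    """Sum the walk expansion (1.6) over walks of length <= n_max.

    Returns the vector mu -> G(E, mu, nu) on the ring.
    """
    n = len(v)
    d_inv = [1.0 / (E - gamma * v[i]) for i in range(n)]  # (E - gamma V)^{-1}
    term = [d_inv[i] if i == nu else 0.0 for i in range(n)]  # walks of length 0
    total = term[:]
    for _ in range(n_max):
        step = apply_laplacian(term)  # extend every walk by one step
        term = [d_inv[i] * step[i] for i in range(n)]  # weight of the new site
        total = [total[i] + term[i] for i in range(n)]
    return total


def residual(E, gamma, v, nu, g):
    """Max norm of (E - Delta - gamma V) g - delta_nu; zero for the resolvent."""
    lap = apply_laplacian(g)
    return max(
        abs((E - gamma * v[i]) * g[i] - lap[i] - (1.0 if i == nu else 0.0))
        for i in range(len(g))
    )


random.seed(0)
N, E, gamma = 12, 3.0, 0.5
v = [random.choice([0.0, -1.0]) for _ in range(N)]  # one potential sample
g = green_by_walks(E, gamma, v, nu=0)
res = residual(E, gamma, v, 0, g)
print("residual:", res)
```

Since $E - \gamma V \ge E > 2d$ here, each additional step multiplies the terms by at most $2d/E < 1$, which is why the truncated sum converges geometrically.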


Remark. The book Brownian Motion, Obstacles and Random Media by A. S. Sznitman [Sz1] provides an excellent in-depth reference on the subject. There is also a Bourbaki seminar [Kom] by Komorowski centered on the work of Sznitman, which can serve as a short introduction to the subject. The method that we use here is substantially different–it is purely analytical. However much of our understanding of the subject comes from reading the book, which we frequently refer to in this section. Also when we make references to works of Sznitman covered in the book [Sz1], we do not in general trace back to the original reference.

I.2

The main results

In order to state our results, we need to impose some conditions on $dg$. We first need to ensure that the resolvent set is nonempty: $\mathbb{R} \setminus \sigma(H) \ne \emptyset$. This is satisfied if supp $dg$ is bounded either from below or above. We define the Laplace transform of $dg$, $\hat g(t)$ for $t \ge 0$, to be
\[ \hat g(t) = \int e^{-tv}\, dg(v), \tag{1.7} \]
if supp $dg$ is bounded from below; and
\[ \hat g(t) = \int e^{tv}\, dg(v), \tag{1.8} \]
if supp $dg$ is bounded from above. We also require that $dg$ has bounded moments, in order that the derivatives of $\hat g$ have the required properties at infinity.

Remark. This is needed for the Witten Laplacian construction used later (see subsect. 3 below). For the present paper since we do not work to all orders of $\gamma$, it is sufficient to assume that $dg$ has finite $n$th moment, for some fixed $n$. But as the construction could be extended to all orders, we assume that all moments of $dg$ are finite.

Due to the presence of the parameter $E$, without loss of generality, we may then assume that

(H1) supp $dg \subseteq (-\infty, 0]$ or supp $dg \subseteq [0, +\infty)$

(H2) $dg$ has finite moments, i.e., $\left| \int v^n\, dg(v) \right| < \infty$ for all $n \ge 0$.

Define
\[ g(a) = \int_{-a}^{0+} dg(v), \qquad (a \ge 0) \]
in the first case of (H1) and
\[ g(a) = \int_{0-}^{a} dg(v), \qquad (a \ge 0) \]


in the second case of (H1). We assume further

(H3) $dg$ is such that $g$ is of regular variation at 0 with exponent $\rho$ ($0 \le \rho < \infty$), i.e.,
\[ \frac{g(Ca)}{g(a)} \to f(C) = C^\rho \tag{1.9} \]
as $a \to 0$ for all $C > 0$.

Remark. Since $dg$ is a probability measure, $g$ is a positive monotone function on $[0, \infty)$. For such functions, condition (1.9) is much less restrictive than it seems at first sight. This is because once we assume the limit of $g(Ca)/g(a)$ exists and is finite as $a \to 0$ for a dense set of $C$’s in $\mathbb{R}^+$, then the limit $f(C)$ is necessarily of the form $C^\rho$ for some $0 \le \rho < \infty$. For more details on this, see p. 268 of [F]. (H3) enables us to use Tauberian type theorems to deduce the desired properties of the derivatives of $\hat g$ as $t \to \infty$. We note in particular that the Bernoulli distribution
\[ dg(v) = \frac{\delta(v) + \delta(v + 1)}{2} = \frac{dh(v) + dh(v + 1)}{2} \tag{1.10} \]
where $\delta$ is the Dirac distribution at 0 and $h$ is the Heaviside function, is of regular variation with exponent $\rho = 0$.

For all $\delta > 0$, define
\[ I_\delta = \begin{cases} (2d + \delta, \infty) & \text{if supp } dg \subseteq (-\infty, 0], \\ (-\infty, -2d - \delta) & \text{if supp } dg \subseteq [0, \infty). \end{cases} \tag{1.11} \]
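Condition (H3) is easy to probe numerically. The sketch below evaluates the ratio $g(Ca)/g(a)$ from (1.9) for small $a$ in three cases with support in $(-\infty, 0]$: the Bernoulli distribution (1.10), and two illustrative choices that are assumptions of this example and do not come from the text, namely the uniform distribution on $[-1, 0]$ and the distribution with density $2|v|$ on $[-1, 0]$. The expected exponents are $\rho = 0, 1, 2$ respectively.

```python
def g_bernoulli(a):
    """g(a) = mass of [-a, 0] for dg = (delta_0 + delta_{-1}) / 2, a >= 0."""
    return 0.5 + (0.5 if a >= 1.0 else 0.0)


def g_uniform(a):
    """g(a) for the uniform distribution on [-1, 0] (illustrative choice)."""
    return min(a, 1.0)


def g_linear_density(a):
    """g(a) for the density 2|v| on [-1, 0] (illustrative): g(a) = a^2."""
    return min(a, 1.0) ** 2


def variation_ratio(g, C, a):
    """The ratio g(C a) / g(a) whose limit as a -> 0 defines f(C) in (1.9)."""
    return g(C * a) / g(a)


C = 3.0
cases = [("Bernoulli", g_bernoulli, 0),
         ("uniform", g_uniform, 1),
         ("density 2|v|", g_linear_density, 2)]
for name, g, rho in cases:
    ratios = [variation_ratio(g, C, 10.0 ** (-k)) for k in (2, 4, 6)]
    print(name, ratios, "expected limit C**rho =", C ** rho)
```

For the Bernoulli case the ratio equals 1 for every $C$ once $Ca < 1$, which is the statement $\rho = 0$.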

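Assuming (H1-3), our main result is

Theorem 7.2 For all $\delta > 0$, there exists $\gamma_0 > 0$, such that for all $0 < \gamma < \gamma_0$, all $E \in I_\delta$, all $\mu, \nu \in \mathbb{Z}^d$,
\[
\begin{aligned}
\log \langle G(E, \mu, \nu) \rangle_g &= \log \langle G(E, \mu - \nu, 0) \rangle_g \\
&= \log \Big[ E - \Delta - \gamma \langle v \rangle_g - \gamma^2 \langle (v - \langle v \rangle_g)^2 \rangle_g\, [(E - \Delta - \gamma \langle v \rangle_g)^{-1}(0, 0)] \\
&\qquad - \gamma^3 \langle (v - \langle v \rangle_g)^3 \rangle_g\, [(E - \Delta - \gamma \langle v \rangle_g)^{-1}(0, 0)]^2 \Big]^{-1}(\mu - \nu, 0) + O(\gamma^4)(|\mu - \nu| + 1) \\
&= \log (\tilde E - \Delta)^{-1}(\mu - \nu, 0) + O(\gamma^4)(|\mu - \nu| + 1),
\end{aligned}
\tag{1.12}
\]
where
\[
\begin{aligned}
\tilde E \overset{\mathrm{def}}{=}\ & E - \gamma \langle v \rangle_g - \gamma^2 \langle (v - \langle v \rangle_g)^2 \rangle_g\, [(E - \Delta - \gamma \langle v \rangle_g)^{-1}(0, 0)] \\
& - \gamma^3 \langle (v - \langle v \rangle_g)^3 \rangle_g\, [(E - \Delta - \gamma \langle v \rangle_g)^{-1}(0, 0)]^2 \notin [-2d, 2d].
\end{aligned}
\tag{1.13}
\]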


We stress here that the order of the quantifiers is important, in particular $\gamma_0$ is uniform in $|\mu - \nu|$.

Remark. We do not require any regularity conditions on $dg$ other than (H1-3) for Theorem 7.2 to hold, in marked contrast to the random Schrödinger case, see e.g., [SW1, SW2]. Instead the conditions on $dg$, in particular (H3), are reminiscent of the conditions that one imposes to obtain Lifshitz tails, see e.g., [PF]. The main reason for this contrast, as we will see in (1.30, 1.31), is that when $E \in \mathbb{R} \setminus \sigma(H)$, it is the Laplace transform of $dg$ that is relevant; while when $E \in \mathbb{C} \setminus \sigma(H)$, $\operatorname{Re} E \in \sigma(H)$, it is the Fourier transform of $dg$.

Note the interesting $O(\gamma^2)$, $O(\gamma^3)$ terms, for which we will give a simple explanation in (1.15)-(1.17). The $O(\gamma)$ term just corresponds to a shift in $E$. Note also that
\[ |E - \gamma \langle v \rangle_g| = |E| + \gamma |\langle v \rangle_g| > |E| \tag{1.14} \]
for $E \in I_\delta$. So the rate of decay is better than the naive one; recall that the almost sure spectrum is contained in $[-2d, \infty)$ or $(-\infty, 2d]$. Hence in some sense, the “effective” spectrum is further away than the almost sure one. This is related to the so called Lifshitz tails for the density of states in random Schrödinger operators, see e.g., [PF].

We now give a simple explanation for the terms $O(\gamma)$, $O(\gamma^2)$ and $O(\gamma^3)$ in (1.12). Let $G_0 = (E - \Delta - \gamma \langle v \rangle_g)^{-1}$ and $\bar V = V - \langle v \rangle_g$, so that $E - H = G_0^{-1} - \gamma \bar V$. By the resolvent series, we have
\[
\begin{aligned}
G(E, \mu, \nu) ={}& G_0(\mu, \nu) + \gamma (G_0 \bar V G_0)(\mu, \nu) + \gamma^2 (G_0 \bar V G_0 \bar V G_0)(\mu, \nu) \\
& + \gamma^3 (G_0 \bar V G_0 \bar V G_0 \bar V G_0)(\mu, \nu) + \gamma^4 (G_0 \bar V G_0 \bar V G_0 \bar V G_0 \bar V G)(\mu, \nu).
\end{aligned}
\tag{1.15}
\]
Note that the last factor in order $O(\gamma^4)$ is $G$ itself. So
\[
\begin{aligned}
\langle G(E, \mu, \nu) \rangle_g ={}& G_0(\mu, \nu) + \gamma^2 \langle (v - \langle v \rangle_g)^2 \rangle_g\, G_0(0, 0)\, (G_0 G_0)(\mu, \nu) \\
& + \gamma^3 \langle (v - \langle v \rangle_g)^3 \rangle_g\, (G_0(0, 0))^2\, (G_0 G_0)(\mu, \nu) + O(\gamma^4).
\end{aligned}
\tag{1.16}
\]
We hence obtain that for all $\mu, \nu$ with $|\mu - \nu|$ fixed
\[
\begin{aligned}
\langle G(E, \mu, \nu) \rangle_g ={}& \Big[ E - \Delta - \gamma \langle v \rangle_g - \gamma^2 \langle (v - \langle v \rangle_g)^2 \rangle_g\, (E - \Delta - \gamma \langle v \rangle_g)^{-1}(0, 0) \\
& - \gamma^3 \langle (v - \langle v \rangle_g)^3 \rangle_g\, ((E - \Delta - \gamma \langle v \rangle_g)^{-1}(0, 0))^2 \Big]^{-1}(\mu, \nu) + O(\gamma^4).
\end{aligned}
\tag{1.17}
\]
We see that indeed, the $O(\gamma)$, $O(\gamma^2)$, $O(\gamma^3)$ terms are identical to the ones in (1.12). The trouble comes when we let $|\mu - \nu| \to \infty$, as the estimate for the remainder $O(\gamma^4)$ is not in the appropriate weighted space due to the appearance of $G$ there. We see therefore that Theorem 7.2 extends the result obtained from perturbation series, which is valid only for fixed $|\mu - \nu|$.
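The fixed-distance statement (1.17) can be tested by brute force on a small system. The sketch below is an illustration under assumptions that are not in the text: $\mathbb{Z}^d$ is replaced by a ring of $N = 6$ sites ($d = 1$), the potential is Bernoulli with values in $\{0, -1\}$ as in (1.10), and the average $\langle G(E, 0, 0) \rangle_g$ is computed exactly by enumerating all $2^N$ configurations. It uses the centred potential $\bar V = V - \langle v \rangle_g$, which is the centring required by the resolvent identity with $G_0 = (E - \Delta - \gamma\langle v\rangle_g)^{-1}$. The helper names are hypothetical.

```python
import itertools


def ring_laplacian(n):
    """Matrix of Delta on the ring Z/nZ: 1 for nearest neighbours, else 0."""
    return [[1.0 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)]
            for i in range(n)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def green_00(E, shift, n):
    """Return (E - Delta - diag(shift))^{-1}(0, 0) on the ring of n sites."""
    lap = ring_laplacian(n)
    a = [[(E - shift[i] if i == j else 0.0) - lap[i][j] for j in range(n)]
         for i in range(n)]
    return solve(a, [1.0] + [0.0] * (n - 1))[0]


N, E, gamma = 6, 3.0, 0.1
values = (0.0, -1.0)  # Bernoulli potential, each value with probability 1/2
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
m3 = sum((v - mean) ** 3 for v in values) / len(values)

# Exact average of G(E, 0, 0) over all 2^N potential configurations.
configs = list(itertools.product(values, repeat=N))
exact = sum(green_00(E, [gamma * v for v in cfg], N) for cfg in configs)
exact /= len(configs)

# G_0(0, 0) and the effective energy shift E - E_tilde from (1.13).
g0 = green_00(E, [gamma * mean] * N, N)
shift = gamma * mean + gamma ** 2 * var * g0 + gamma ** 3 * m3 * g0 ** 2
effective = green_00(E, [shift] * N, N)

err_naive = abs(exact - g0)              # keeps only the O(gamma) shift
err_effective = abs(exact - effective)   # expected to be O(gamma^4)
print(err_naive, err_effective)
```

Halving $\gamma$ should reduce `err_effective` by roughly a factor of 16, while `err_naive` only drops by about 4; this is a quick way to see the orders in (1.17).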


Using the usual method in field theory, in order to obtain the correct behavior for $\langle G(E, \mu, \nu) \rangle_g$ for $|\mu - \nu| \to \infty$, one needs to expand the infinite series and then resum it. The advantage of our method is that one establishes first the existence of an effective convolution matrix. It is only when one wants to derive an expression for this convolution matrix that one expands. Therefore to obtain an accuracy of the given order, one only needs to expand a finite number of terms. It is not an infinite series.

For $E \notin \sigma(H)$, define
\[ \beta_E(j) = \lim_{n \to \infty} - \frac{\log \langle G(E, 0, nj) \rangle_g}{n}, \tag{1.18} \]
if the limit in the RHS exists. Using the FKG inequality [FKG, Li] as stated in Remark 3.2 on p. 241-242 of [Sz1] and the subadditive ergodic theorem, it can be shown, similarly to [Sz1, Z], that for all $E \notin \sigma(H)$, the limit in (1.18) exists and moreover by patching of limits:
\[ \lim_{|j| \to \infty} \frac{|\beta_E(j) + \log \langle G(E, 0, j) \rangle_g|}{|j|} = 0. \tag{1.19} \]
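In the simplest situation the limit (1.18) can be computed by hand and checked numerically. The sketch below assumes $d = 1$ and no potential ($\gamma = 0$), so that no averaging is needed. It sums the walk expansion of $(E - \Delta)^{-1}(0, n)$ by counting walks with binomial coefficients, weighting a walk of length $k$ by $E^{-(k+1)}$ as in the Neumann series of $(E - \Delta)^{-1}$; the overall factor $1/E$ does not affect the decay rate. The result is compared with the closed form $\lambda^{n}/\sqrt{E^2 - 4}$, $\lambda = (E - \sqrt{E^2 - 4})/2$, whose decay rate is $\operatorname{arccosh}(E/2)$. The function name and the truncation `k_max` are assumptions of this example.

```python
import math


def free_green_1d(E, n, k_max=400):
    """(E - Delta)^{-1}(0, n) on Z by walk counting, for an integer E > 2."""
    total = 0.0
    for k in range(n, k_max + 1, 2):  # walks 0 -> n have length n, n + 2, ...
        walks = math.comb(k, (k + n) // 2)  # number of such walks of length k
        total += walks / E ** (k + 1)       # exact integers, one division
    return total


E, n = 3, 40
g = free_green_1d(E, n)
lam = (E - math.sqrt(E * E - 4)) / 2
closed_form = lam ** n / math.sqrt(E * E - 4)
rate = -math.log(g) / n

print("walk sum   :", g)
print("closed form:", closed_form)
print("rate at n=40:", rate, " limit arccosh(E/2):", math.acosh(E / 2))
```

The finite-$n$ rate exceeds its limit by $\log\sqrt{E^2-4}\,/\,n$, a correction that vanishes as $n \to \infty$.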

The $\beta_E(j)$ are called the annealed directional Lyapunov exponents. They are in some sense, natural generalizations of the traditional Lyapunov exponent to higher dimensions. For a fixed $E$, $\beta_E$ defines a norm on $\mathbb{R}^d$ [Sz1, Z]. One can similarly define $\alpha_E$, the directional Lyapunov exponents in the almost sure case, which we will not address in this paper. The $\alpha_E$ are called the quenched Lyapunov exponents. For more detailed statements, see Theorem 3.4, p. 244 of [Sz1] and Theorem A of [Z]. (We remark here that for $E \in I_\delta$, (1.19) can also be proved by using the constructions in sects. III-VI.)

Remark. There also appears to be some connection between the Lyapunov exponents defined here and the Lyapunov exponents of $d$-dimensional lattices of coupled, non-linear oscillators. See [EW1, EW2] and references therein for a discussion and some confirmation of this connection.

Define
\[ D_{|\tilde E|} = \Big\{ y \in \mathbb{R}^{d*} \ \Big|\ \sum_{i=1}^{d} \cosh y_i < |\tilde E| \Big\}, \tag{1.20} \]
where $\tilde E$ is as in (1.13). $D_{|\tilde E|}$ is a convex set. We then obtain as a direct consequence of Theorem 7.2:

Corollary For all $\delta > 0$, there exists $\gamma_0 > 0$, such that for all $0 < \gamma < \gamma_0$, all $E \in I_\delta$,
\[ \beta_E(j) = \sup_{y \in \bar D_{|\tilde E|}} y \cdot j + O(\gamma^4)|j|, \tag{1.21} \]


where $\tilde E$ is as in (1.13). (Note that $|y| = O(1)$.) Let
\[ \hat j = \frac{j}{|j|} \in S^{d-1}. \tag{1.22} \]
$\beta_E(\hat j)$ extends to an analytic function in $E$ for $E$ such that $\operatorname{Re} E \in I_\delta$.

The above result is new for $d > 1$. For $d = 1$, there are similar results in the continuum by using probabilistic methods [Fr, Po, Sz2]. The case $d = 1$ is special, as the (random walk) process is additive. We use lattice field theory (see subsect. 3) to prove Theorem 7.2; the Corollary is a straightforward consequence.

Proof of the Corollary. (1.21) follows directly from Theorem 7.2 and standard convex analysis as in [Sj1], so we do not repeat it here. (For a general reference on convex analysis, see e.g., [H].) Analyticity in $E$ ($\operatorname{Re} E \in I_\delta$) for $\beta_E(\hat j)$ is a direct consequence of Theorem 7.2, (1.21) and the Cauchy Theorem.

Remark. We note from the Corollary that, to $O(\gamma^4)$, the Lyapunov exponent is the support function of a convex set corresponding to the effective convolution matrix $\tilde E - \Delta$. Hopefully it will become clear after the proof of Theorem 7.2 that to all orders in $\gamma$, the Lyapunov exponent is the support function of a convex set corresponding to an effective convolution matrix, obtained by iterating the procedure used here. (cf. [Sj2] for a related situation.) Starting from $O(\gamma^4)$, the off-diagonal elements of the effective matrix will be different from $\Delta$, i.e., there will be corrections to the generator itself.

Comments on Theorem 7.2. Quantitative results like (1.12, 1.13) were not known before for random walks in (time-independent) random potentials in $d > 1$. If one considers random walks in time-dependent random potentials (cf. [Bol, IS]), which is a type of the so called directed random walks, then Sinai in [Sin] obtained a result which is similar to Theorem 7.2. (See also (2.22) on p. 327 of [Sz1].) The term directed refers to the fact that the walk is parameterized by time and that the graph of the walk in $\mathbb{Z} \times \mathbb{Z}^{d-1}$ moves at a constant rate in the time direction.
The main and also the crucial difference between the directed case and our (non-directed) case is that in the directed case, one can always, in some sense, reduce to a transfer matrix type of situation, as there is a preferred direction; while in our case, the non-directed case, one cannot avoid treating walks that return (loops or self-intersections), even though they are unlikely due to the condition $E \notin \sigma(H)$. (See (1.6).) In spite of the difference stressed above, for high dimensions and potentials which have small variations ($\gamma$ [...] 1. For $d = 1$, in the continuum, there are some results due to Freidlin, Povel and Sznitman [Fr, Po, Sz2], which are summarized on p. 233, 325 of [Sz1]. The case $d = 1$ is special as the Brownian motion or the random walk process is additive. Theorem 7.2 and its Corollary confirm a few conjectures raised in the past in the annealed (averaged) case. For a sample of such conjectures in the quenched (almost sure) case, see p. 219, 233 of [Sz1]. In [Sz2], Sznitman proved that in $d = 1$ and for Brownian motions in Poissonian potentials, i.e., for
\[ H = -\frac{d^2}{dx^2} + V, \]

where V ≥ 0 is a random potential with Poisson distributions, the quenched Lyapunov exponent αE (1) is analytic in E for E such that E > 0. He used an explicit formula obtained in [Fr] to derive the result. The analyticity of βE (ˆj) proved in the above Corollary extends analyticity results to the annealed case in Zd for all d ≥ 1, when γ 2, the Corollary would then imply that αE (ˆj) is analytic for d > 2, almost surely, which certainly carries interesting informations. In the case of Brownian motions in Poissonian potentials, Sznitman [Sz1] derived upper and lower bounds for αE and βE . In the case of random walks in random potentials (same setting as in this paper), Zerner [Z] derived upper and lower bounds for αE . Our aim is actually to compute the asymptotics of βE as γ 0. Our method is constructive. Due to the lack of rotational symmetry in Zd , the directional dependence of the Lyapunov exponents are more complicated than in Rd . (In Rd , it is known to be proportional to the Euclidean norm, although the constant of proportionality is not known, see p. 219 of [Sz1].) In particular, the Corollary shows that the unit ball in the norm defined by βE is not rotationally invariant. In [Z], when V is a constant, i.e., H is a convolution matrix, Zerner obtained a closed expression for the Lyapunov exponents. Moreover for d = 2, using this expression, Zerner numerically computed the Lyapunov exponents in the case where the potential V is a constant. Combining the Corollary with this numerical result, we see explicitly that for d = 2, the unit ball in the norm given by βE approaches instead the shape of a diamond. We mention another application of the Corollary to random walks in random potentials with a constant drift h ∈ Rd . More precisely, we define the first order difference operator ∇ component wise as (∇eα u)(n) = u(n + eα ) − u(n), where eα ∈ Zd , eα (β) = δαβ , α, β = 1, · · · , d, and we replace the generator ∆ by ∗ as ∆ + dα=1 hα ∇eα . 
We define the dual norm $\beta_E^*$ by
\[
\beta_E^*(\ell) = \sup_{x \neq 0} \frac{\ell \cdot x}{\beta_E(x)}, \qquad \ell \in \mathbb{R}^d, \quad E \notin \sigma(H).
\]
The unit ball in the norm $\beta_E^*$ is the critical unit ball [Sz1]: if $h$ is such that $\beta_E^*(h) > 1$, then the motion is ballistic; if $h$ is such that $\beta_E^*(h) < 1$, then the motion is sub-ballistic. The Corollary then allows us to compute the critical drifts $h_c$, which are direction dependent, for $E \in I_\delta$.

I.3

The constructions toward the proof of Theorem 7.2

We now describe briefly the construction of the proof of Theorem 7.2. As usual we use finite dimensional approximations: we take $\Lambda$ to be a finite set in $\mathbb{Z}^d$ and define $H_\Lambda$ to be the restricted operator with appropriate boundary conditions. For concreteness, assume $\mathrm{supp}\, dg \subseteq (-\infty, 0]$ and $E > 2d$. (The other case works in exactly the same way.) Our starting point, as in [SW1], is the following Gaussian integral representation of $G_\Lambda(E, \mu, \nu)$:
\[
G_\Lambda(E, \mu, \nu) = \int x_\mu \cdot x_\nu \, \det(E - H_\Lambda)\, e^{-\sum_{i,j} (E - H_\Lambda)_{ij}\, x_i \cdot x_j} \prod_{i \in \Lambda} \frac{d^2 x_i}{\pi}, \tag{1.24}
\]

where $x_i \in \mathbb{R}^2$ and $\cdot$ is the usual scalar product in $\mathbb{R}^2$. Using Grassmann variables (see sect. II), the determinant can be further expressed as a Gaussian integral (of Grassmann variables). (This was first used in this context in [BCKP, KS].) This is the so-called supersymmetry. Using this representation, we can explicitly take the expectation value of $G_\Lambda(E, \mu, \nu)$ with respect to $dg$. Let
\[
\hat g(t) = \int e^{tv}\, dg(v) \tag{1.25}
\]
be the Laplace transform of $dg$. Since $\hat g(t) > 0$, we define $k(t) = \log \hat g(t)$. We obtain (see (2.8, 2.9))
\[
\langle G_\Lambda(E, \mu, \nu) \rangle_g = \int x_\mu \cdot x_\nu \, e^{-2\phi(x)} \prod_{i \in \Lambda} \frac{d^2 x_i}{\pi}, \tag{1.26}
\]
where
\[
\phi(x) = \frac{1}{2} \Big[ \sum_{i,j} (E - \Delta_\Lambda)_{ij}\, x_i \cdot x_j - \sum_{j \in \Lambda} k(\gamma x_j \cdot x_j) - \log\det\big(E - \Delta_\Lambda - \gamma\, \mathrm{diag}\, k'(\gamma x_j \cdot x_j)\big) \Big]. \tag{1.27}
\]
We see that the above supersymmetric-Gaussian transform has mapped the averaged correlation function of a random walk in random potentials to a correlation function of statistical mechanics. The measure $\prod_{i \in \Lambda} dg(v_i)$ on $\mathbb{R}^\Lambda$ has been transformed to $e^{-2\phi}$ on $(\mathbb{R}^2)^\Lambda \stackrel{\mathrm{def}}{=} \mathbb{R}^{2\Lambda}$. From now on we denote by $\langle\ \rangle$, without subscript, the expectation with respect to the measure $e^{-2\phi}$. Remark. In the physics literature (see e.g., [Ef]), supersymmetry has always played an important role in the study of disordered systems. In [BCKP, KS], supersymmetry was incorporated in a mathematically rigorous manner for the first time. One of the main differences between the present paper (also the earlier paper [SW1])

and [BCKP, KS] is that the Grassmann variables are further integrated over (see (1.26, 1.27)), so that “conventional” analysis becomes feasible. By a now well-known integration by parts initiated in [HS2, Sj1], which for completeness we rederive in sect. III, we have further (formally)
\[
\langle G_\Lambda(E, \mu, \nu) \rangle_g = \sum_{i=1}^{2} \Big( e^{-\phi}\, dx_\mu^{(i)},\ (\Delta_\phi^{(1)})^{-1}\, e^{-\phi}\, dx_\nu^{(i)} \Big), \tag{1.28}
\]
where
\[
\begin{aligned}
\Delta_\phi^{(1)} &= \Delta_\phi^{(0)} \otimes I + 2\phi'' \quad \text{on } C_0^\infty(\mathbb{R}^{2\Lambda}; \wedge^1 \mathbb{R}^{2\Lambda}),\\
\Delta_\phi^{(0)} &= \sum_j z_j^* z_j \quad \text{on } C_0^\infty(\mathbb{R}^{2\Lambda}), \qquad z_j = \frac{\partial}{\partial x_j} + \frac{\partial \phi}{\partial x_j}, \quad z_j^* = -\frac{\partial}{\partial x_j} + \frac{\partial \phi}{\partial x_j},
\end{aligned} \tag{1.29}
\]
$z_j^*$ is the formal adjoint of $z_j$, we have identified 1-forms with functions with values in $\mathbb{R}^{2\Lambda}$, and $(\ ,\ )$ denotes the inner product on $L^2(\mathbb{R}^{2\Lambda}; \wedge^1 \mathbb{R}^{2\Lambda})$. (We will address the questions of conditions on $\phi$, domains of $\Delta_\phi^{(0)}$, $\Delta_\phi^{(1)}$ etc., in sect. III. At this stage it suffices to mention that under conditions (H1-3) on $g$, $\Delta_\phi^{(0)}$, $\Delta_\phi^{(1)}$ have self-adjoint extensions.) $\Delta_\phi^{(0)}$, $\Delta_\phi^{(1)}$ are respectively the Witten Laplacians on 0-forms and 1-forms. Note that when $\phi$ is quadratic, $\Delta_\phi^{(0)}$ is just the usual harmonic oscillator on $L^2(\mathbb{R}^{2\Lambda})$. More generally
\[
\Delta_\phi = d_\phi^* d_\phi + d_\phi d_\phi^*, \qquad d_\phi = e^{-\phi}\, d\, e^{\phi} = \sum_j z_j\, dx_j^{\wedge}, \qquad d_\phi^* = e^{\phi}\, d^*\, e^{-\phi} = \sum_j z_j^*\, dx_j^{\rfloor}, \tag{1.30}
\]
where $d$ is the usual exterior differential and $d^*$ its formal adjoint (and consequently $d_\phi^*$ is the formal adjoint of $d_\phi$).

The spectra of $\Delta_\phi^{(0)}$, $\Delta_\phi^{(1)}$ play a crucial role in our construction. For now it suffices to mention that $\Delta_\phi^{(0)} \geq 0$ and that $\Delta_\phi^{(0)}$ has 0 as an eigenvalue of multiplicity 1 with $e^{-\phi}$ the unique eigenfunction. If $\phi$ is strictly convex, i.e., $\phi'' > c > 0$ uniformly in $\Lambda$ as an operator, which is the case for $E \in I_\delta$ ($\delta > 0$), we obtain that $\Delta_\phi^{(1)} \geq c > 0$ uniformly in $\Lambda$. This is the so-called spectral gap, which is responsible for the exponential decay in Theorem 7.2. Hence to obtain asymptotics for $G_\Lambda$, we need to compute this spectral gap as an asymptotic series in $\gamma$. This requires precise control over the spectrum beyond the gap. The main difficulty here is that we need estimates which are uniform in $\Lambda$, so that we can pass to the limit $\Lambda \nearrow \mathbb{Z}^d$. This is achieved in sects. IV, V by using appropriate weighted spaces.

We also note that for $E \notin \sigma(H)$,
\[
\lim_{\Lambda \nearrow \mathbb{Z}^d} \langle G_\Lambda(E, \mu, \nu) \rangle_g = \langle G(E, \mu, \nu) \rangle_g
\]
exists a priori by resolvent series and spectral theory, so we only need to ensure that we have uniform estimates in this paper.

We use a Grushin problem to reduce the study of $\Delta_\phi^{(1)}$ near the lower part of its spectrum to that of an effective operator. From the representation in (1.28), we are only interested in $(\Delta_\phi^{(1)} - z)^{-1}$ for $z = 0$. We show that to order $O(\gamma^4)$ in appropriate weighted spaces, this effective operator is just $\langle 2\phi'' \rangle = \int e^{-2\phi}\, 2\phi''$.

Sketchily, this is accomplished as follows. Let $\bar\Lambda = \Lambda \times \{1, 2\}$, which we identify with $\{1, 2, \cdots, |\Lambda|, \cdots, 2|\Lambda|\}$. We first use the set of orthonormal 1-forms $A_1 = \{e^{-\phi}\, dx_j,\ j \in \bar\Lambda\}$ to pose the first Grushin problem for $\Delta_\phi^{(1)}$. (Recall that $e^{-\phi}$ is the unique eigenfunction of $\Delta_\phi^{(0)}$ with eigenvalue 0.) Consequently, we obtain that the effective operator is $\langle 2\phi'' \rangle$ modulo $O(\gamma^2)$, valid for a certain spectral interval $I_1$ at 0. (See Propositions 4.1, 4.3.) This step is similar to that of [Sj1] (see also [BJS]) in the statistical mechanics context, where the effective operator is constructed to order $O(h^{3/2})$, where $h$ is the semi-classical parameter. The order $O(h^{3/2})$ there approximately corresponds to the order $O(\gamma^2)$ here. However, as is clear from (1.12, 1.13) and as we explained earlier, the terms which reveal the structure of the problem start at $O(\gamma^2)$. So we need to carry the Grushin constructions a bit further. To $O(\gamma^4)$, this is accomplished by enlarging the set of orthonormal forms to
\[
A_3 = \{e^{-\phi}\, dx_j,\ j \in \bar\Lambda\} \cup \{z_k^* e^{-\phi}\, dx_j,\ z_k^* z_\ell^* e^{-\phi}\, dx_j,\ j, k, \ell \in \bar\Lambda,\ k \leq \ell\}^{\dagger},
\]
where $\{\ \}^{\dagger}$ denotes the set of orthonormal 1-forms obtained from $\{\ \}$ by orthonormalization. These two sets are of the same dimension here. (The restriction $k \leq \ell$ is due to the commutativity of the $z_j^*$ ($j \in \bar\Lambda$), defined in (1.29).)

More precisely, we show that the well-posedness of the first Grushin problem for $\Delta_\phi^{(1)}$ using $\{e^{-\phi}\, dx_j,\ j \in \bar\Lambda\}$ implies the well-posedness of a Grushin problem for $\Delta_\phi^{(0)}$ using
\[
\{e^{-\phi}\} \cup \{z_k^* e^{-\phi},\ k \in \bar\Lambda\}^{\dagger}.
\]
So we obtain an effective operator for $\Delta_\phi^{(0)}$ valid in the same interval $I_1$. (See Proposition 4.4.) Here we used in a crucial way that $d_\phi$ is a complex and therefore the spectra of $\Delta_\phi^{(1)}$ and $\Delta_\phi^{(0)}$ are related. Using the first equation of (1.29) and $\phi'' > c > 0$, we obtain an effective operator for $\Delta_\phi^{(1)}$ valid in a larger spectral interval $I_2 \supset I_1$ by using
\[
A_2 = \{e^{-\phi}\, dx_j,\ j \in \bar\Lambda\} \cup \{z_k^* e^{-\phi}\, dx_j,\ j, k \in \bar\Lambda\}^{\dagger}.
\]
(See Proposition 4.7.)

Iterating this once more (Propositions 4.9-4.14), we obtain an effective operator for $\Delta_\phi^{(1)}$ modulo $O(\gamma^4)$ when restricted to the subspace spanned by $A_1$, in an even larger interval $I_3 \supset I_2 \supset I_1$ by using $A_3$. (Although we do not pursue it in this paper, it is clear to us that this procedure can in fact be iterated to all orders.) We stress that since we are interested in the asymptotics as $|\mu - \nu| \to \infty$, it is crucial that $I_3$ is large enough, as this translates into estimates which are in appropriate weighted spaces. Theorem 7.2 is then obtained by computing explicitly $\langle 2\phi'' \rangle^{-1}$. The Corollary follows by using Theorem 7.2 and standard results on the inverse of convolution matrices.

We note also that for $\gamma$ small, $\phi$ is close to a quadratic form, and $e^{-2\phi}$ is close to a Gaussian. So it should not be surprising that the set of functions
\[
\{e^{-\phi},\ z_k^* e^{-\phi},\ z_k^* z_\ell^* e^{-\phi},\ \cdots,\ k, \ell \in \bar\Lambda,\ k \leq \ell,\ \cdots\}
\]
plays a special role here. This is because when $\gamma = 0$, this set of functions after proper orthonormalization is just a set of Hermite functions. What we did can therefore be seen as a perturbation theory around an “infinite” dimensional Gaussian or harmonic oscillator. In other words, what we are using is a Witten complex formulation of Euclidean lattice field theory, whose foundation was laid down by Sjöstrand in his seminal paper [Sj1].

After we finished the paper, we realized that the connections between random walks and lattice field theory are in fact well known, see e.g., [FFS], in particular [BF1, BF2], where they use the term oscillation modes, which roughly corresponds to the eigenfunctions of $\Delta_\phi^{(1)}$ in our vocabulary. The novelty here is that we have an explicit transformation of random walks in random potentials to a specific operator: $\Delta_\phi^{(1)}$, where $\phi$ is determined by the random walk and the random potential as in (1.27). $\Delta_\phi^{(1)}$ can in turn be interpreted as giving rise to a field theory. It is due to the equality in (1.28) that we are able to compute various quantities.

Finally, we would like to add that we also hope to address the almost sure (quenched) behavior of $G$ by similar constructions in the future. As mentioned earlier, it is conjectured that $\alpha_E = \beta_E$ a.s., for $d > 2$. If it is true, then the Lyapunov exponents obtained here are also the Lyapunov exponents in the quenched case for $d > 2$.

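II The Averaged Green's Function

We start by recapitulating the problem that we study:
\[
H = \Delta + \gamma V \quad \text{on } \ell^2(\mathbb{Z}^d), \tag{2.1}
\]
where
\[
\Delta_{ij} = \begin{cases} 1 & |i - j|_1 = 1, \\ 0 & \text{otherwise}; \end{cases} \tag{2.2}
\]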

$\gamma$ is a positive parameter, $(Vu)_j = v_j u_j$, the $v_j$ form a family of iid random variables with common distribution $dg$ satisfying (H1-3) of sect. I, and $\sigma(H) = [-2d, 2d] + \gamma\, \mathrm{supp}\, dg$. Let $E \in \mathbb{R} \setminus \sigma(H)$. For definiteness assume $\mathrm{supp}\, dg \subseteq (-\infty, 0]$ and $E > 0$. (The case $\mathrm{supp}\, dg \subseteq [0, +\infty)$ and $E < 0$ works the same way.) We are interested in computing the asymptotics of $\langle G(E, \mu, \nu) \rangle_g = \langle (E - H)^{-1}_{\mu\nu} \rangle_g$ as $|\mu - \nu| \to \infty$ for $\gamma$ sufficiently small.

We use finite dimensional approximations. We take $\Lambda$ to be a finite set in $\mathbb{Z}^d$ or a large torus of the form $(\mathbb{Z}/N\mathbb{Z})^d$. We denote by $|\Lambda|$ the number of lattice points of $\mathbb{Z}^d$ in $\Lambda$. We identify $\Lambda$ with the set $\{1, 2, \cdots, |\Lambda|\}$. The corresponding discrete Laplacian $\Delta_\Lambda$ is then defined as in (2.2) for $i, j \in \Lambda$. Similarly we define $V_\Lambda$. Define $H_\Lambda = \Delta_\Lambda + \gamma V_\Lambda$ on $\ell^2(\Lambda)$. Clearly for all $\Lambda$, $\sigma(H_\Lambda) \subseteq [-2d, 2d] + \gamma\, \mathrm{supp}\, dg$. So $G_\Lambda = (E - H_\Lambda)^{-1}$ is well defined. As mentioned earlier in sect. I, by using spectral theory $(E - H)^{-1}$ exists a priori. So to infer the behavior of $(E - H)^{-1}_{\mu\nu}$, it is sufficient to obtain estimates on $(E - H_\Lambda)^{-1}_{\mu\nu}$ which are uniform in $\Lambda$. For the rest of the paper, we will only work with $H_\Lambda$ etc., so for simplicity of notation we will drop the subscript $\Lambda$. We reserve $\Delta$ (or $\Delta_\Lambda$) to denote the discrete Laplacians only; the Witten Laplacians will always be denoted as $\Delta_\phi$.

The main goal of this section is to obtain a convenient representation of $\langle (E - H)^{-1}_{\mu\nu} \rangle_g$. This is achieved in Proposition 2.1. Our principal tools are the Gaussian integrals. For any symmetric $n \times n$ matrix $A$ with $A > 0$, we recall the following well-known formula
\[
A^{-1}_{\mu\nu} = \int x_\mu \cdot x_\nu \, \det A \; e^{-\sum_{i,j=1}^{n} A_{ij}\, x_i \cdot x_j} \prod_{i=1}^{n} \frac{d^2 x_i}{\pi}, \tag{2.3}
\]
where $x_i \in \mathbb{R}^2$ and $x_i \cdot x_j$ is the usual scalar product in $\mathbb{R}^2$. Substituting $E - H$ for $A$ in the above formula, we then have
\[
G(E, \mu, \nu) = \int x_\mu \cdot x_\nu \, \det(E - H)\, e^{-\sum_{i,j \in \Lambda} (E - H)_{ij}\, x_i \cdot x_j} \prod_{i \in \Lambda} \frac{d^2 x_i}{\pi}. \tag{2.4}
\]
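As a sanity check on the normalization in (2.3), the case $n = 1$ can be worked out by hand. The following is only an illustrative computation, with $a > 0$ standing for the single matrix entry (a symbol introduced here, not in the text):

```latex
% n = 1, A = (a) with a > 0, x \in \mathbb{R}^2.
% Polar coordinates with s = |x|^2 give d^2x / \pi = ds.
\[
\int_{\mathbb{R}^2} |x|^2 \, a \, e^{-a |x|^2} \, \frac{d^2 x}{\pi}
  = a \int_0^\infty s \, e^{-a s} \, ds
  = a \cdot \frac{1}{a^2}
  = \frac{1}{a}
  = A^{-1}_{11}.
\]
% The same substitution shows
% \int e^{-a|x|^2} d^2x / \pi = 1/a = (\det A)^{-1},
% which is why the factor \det A appears in (2.3).
```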

Let $\hat g$ be the Laplace transform of $dg$:
\[
\hat g(t) = \int e^{tv}\, dg(v), \qquad t \geq 0. \tag{2.5}
\]
For example, for the Bernoulli measure
\[
dg(v) = \frac{\delta(v + 1) + \delta(v)}{2} \tag{2.6}
\]
introduced earlier, $\hat g(t) = e^{-t/2} \cosh(t/2)$. Using (H1), $\hat g$ is well defined for $t \geq 0$; moreover $\hat g$ is a real analytic function of $t$ for $t > 0$. Since $dg$ is a probability

measure, $\hat g(t) > 0$ for $t \geq 0$. We write $\hat g(t) = e^{k(t)}$ for $t \geq 0$. Hence
\[
k(t) = \log \hat g(t). \tag{2.7}
\]
For the Bernoulli measure in (2.6), $k(t) = \log\cosh(t/2) - t/2$.

Proposition 2.1
\[
\langle G(E, \mu, \nu) \rangle_g = \int x_\mu \cdot x_\nu \, e^{-2\phi(x)} \prod_{i \in \Lambda} \frac{d^2 x_i}{\pi}, \tag{2.8}
\]
where
\[
\phi(x) = \frac{1}{2} \Big[ \sum_{i,j} (E - \Delta)_{ij}\, x_i \cdot x_j - \sum_{j \in \Lambda} k(\gamma x_j \cdot x_j) - \log\det\big(E - \Delta - \gamma\, \mathrm{diag}\, k'(\gamma x_j \cdot x_j)\big) \Big], \tag{2.9}
\]
$x_i \in \mathbb{R}^2$.

Starting from sect. III, we will be using the representation in (2.8) to study $\langle (E - H)^{-1}_{\mu\nu} \rangle_g$. The method that we will use (aside from some estimates) is essentially independent of the derivation of (2.8). So those readers who are not particularly interested in the derivation could skip the following proof and proceed directly to sect. III. The proof that we give below is almost identical to that given in [SW1]. For completeness, we summarize it again here.

Proof of Proposition 2.1. There are presumably a few ways to derive (2.8). We continue to use the method in [SW1], which uses Grassmann algebra. (For another related way, see appx. B of [SW1].) For a general proper formulation, see e.g., [Be, V]; and in this context see appx. A of [SW1]. In the following we will only write the few lines that are absolutely necessary for the proof of (2.8). Let $|\Lambda|$ be the number of points in $\Lambda$. We use the Grassmann algebra of $2|\Lambda|$ generators to express $\det[E - H]$. This algebra is generated by $2|\Lambda|$ anti-commuting variables $\xi_i, \eta_i$, $i \in \Lambda$, satisfying the relations
\[
[\xi_i, \eta_j] = \xi_i \eta_j + \eta_j \xi_i = 0, \qquad [\xi_i, \xi_j] = \xi_i \xi_j + \xi_j \xi_i = 0, \qquad [\eta_i, \eta_j] = \eta_i \eta_j + \eta_j \eta_i = 0, \tag{2.10}
\]
where we write $[a, b] = ab + ba$ for the anti-commutator. It is denoted by $\Lambda[\xi_1, \eta_1, \ldots, \xi_{|\Lambda|}, \eta_{|\Lambda|}]$ (if we identify $\Lambda$ with $\{1, \ldots, |\Lambda|\}$). From (2.10), we see in particular that $\xi_i^2 = \eta_i^2 = 0$. We write these anti-commutative variables collectively as $\xi, \eta$.

To make connections with better known objects, $\xi_j, \eta_j$ can be thought of as the two 1-forms on the $j$th copy of $\mathbb{R}^2$. “$C^\infty$ functions” $F(\xi_i, \eta_j)$ of these anti-commuting variables are defined by Taylor's formula at $(0, 0)$, which contains a finite number of terms because of nilpotency. In this way $F(\xi, \eta)$ becomes an element of the Grassmann algebra. For example, if
\[
F(\xi, \eta) = e^{A_{ij} \xi_i \eta_j}, \tag{2.11}
\]
then
\[
F(\xi, \eta) = 1 + A_{ij} \xi_i \eta_j. \tag{2.12}
\]
This is the function that we need in writing the determinant. We also need to define the notions of differentiation and integration. Define:
\[
\frac{\partial}{\partial \xi_i}(\xi_i) = 1, \qquad \frac{\partial}{\partial \eta_i}(\eta_i) = 1. \tag{2.13}
\]
We require further that these differentiations be linear operators and that Leibniz's rule hold. We can then define integrals (with respect to $\partial$) as follows:
\[
\int 1\, d\xi_i = 0, \qquad \int \xi_i\, d\xi_i = 1, \qquad \int 1\, d\eta_i = 0, \qquad \int \eta_i\, d\eta_i = 1. \tag{2.14}
\]
A multiple integral is defined to be a repeated integral. For example,
\[
\int \xi_i \eta_j\, d\xi_i\, d\eta_j = -\int \eta_j \xi_i\, d\xi_i\, d\eta_j = -\int \eta_j\, d\eta_j = -1. \tag{2.15}
\]
Using (2.11)–(2.15), we obtain
\[
\det(E - H) = \int e^{-\sum_{j,k \in \Lambda} (E - H)_{jk}\, \eta_j \xi_k} \prod_{j \in \Lambda} (d\eta_j\, d\xi_j), \tag{2.16}
\]
\[
(E - H)^{-1}_{\mu\nu} \det(E - H) = \int \xi_\mu \eta_\nu \, e^{-\sum_{j,k \in \Lambda} (E - H)_{jk}\, \eta_j \xi_k} \prod_{j \in \Lambda} (d\eta_j\, d\xi_j). \tag{2.17}
\]
We illustrate (2.16) in the case where $(E - H) = M$ is a $2 \times 2$ matrix. The integrand in the RHS of (2.16) is
\[
e^{-\sum_{j,k=1,2} M_{jk}\, \eta_j \xi_k} = \prod_{j,k=1,2} (1 - M_{jk}\, \eta_j \xi_k), \tag{2.18}
\]
where we used the commutativity of the $e^{-M_{jk} \eta_j \xi_k}$ and (2.11, 2.12). Doing the integral of the RHS of (2.18) upon using (2.14) and (2.15), we obtain
\[
\mathrm{RHS} = M_{11} M_{22} - M_{12} M_{21} = \det M, \tag{2.19}
\]

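as expected. Similarly, we obtain for the RHS of (2.17):
\[
\mathrm{RHS} = (-1)^{|\mu| + |\nu|} \det(M^{\mu\nu}) = M^{-1}_{\mu\nu} \det M, \qquad \mu, \nu = 1, 2, \tag{2.20}
\]
where $M^{\mu\nu}$ is the comatrix with the $\mu$th row and $\nu$th column removed. (In the $2 \times 2$ case it is just the one matrix element that is left.) Combining (2.4) with (2.16), we obtain the following expression:
\[
G(E, \mu, \nu) = \int x_\mu \cdot x_\nu \, e^{-\sum_{j,k \in \Lambda} (E - H)_{jk}\, X_j \cdot X_k} \prod_{j \in \Lambda} d^2 X_j, \tag{2.21}
\]
where
\[
x_j \in \mathbb{R}^2, \qquad X_j \stackrel{\mathrm{def}}{=} (x_j, \xi_j, \eta_j), \qquad X_j \cdot X_k \stackrel{\mathrm{def}}{=} x_j \cdot x_k + \frac{1}{2} (\eta_j \xi_k + \eta_k \xi_j), \qquad d^2 X_j \stackrel{\mathrm{def}}{=} \frac{d^2 x_j}{\pi}\, d\eta_j\, d\xi_j, \tag{2.22}
\]
and we used that $(E - H)$ is a symmetric matrix. Hence,
\[
\begin{aligned}
\langle G(E, \mu, \nu) \rangle_g
&= \int x_\mu \cdot x_\nu \, e^{-\left( \sum_{j \in \Lambda} E X_j \cdot X_j - \sum_{j,k \in \Lambda,\, |j - k|_1 = 1} X_j \cdot X_k - \gamma \sum_{j \in \Lambda} v_j X_j \cdot X_j \right)} \prod_{j \in \Lambda} dg(v_j) \prod_{j \in \Lambda} d^2 X_j \\
&= \int x_\mu \cdot x_\nu \, e^{-\left( \sum_{j \in \Lambda} E X_j \cdot X_j - \sum_{j,k \in \Lambda,\, |j - k|_1 = 1} X_j \cdot X_k - \sum_{j \in \Lambda} k(\gamma X_j \cdot X_j) \right)} \prod_{j \in \Lambda} d^2 X_j \\
&\stackrel{\mathrm{def}}{=} \int x_\mu \cdot x_\nu \, e^{-2\Phi} \prod_{j \in \Lambda} d^2 X_j,
\end{aligned} \tag{2.23}
\]
where
\[
k(\gamma X_j \cdot X_j) = k(\gamma x_j \cdot x_j + \gamma \eta_j \xi_j) = k(\gamma x_j \cdot x_j) + \gamma k'(\gamma x_j \cdot x_j)\, \eta_j \xi_j, \tag{2.24}
\]
by using (2.12) to expand $e^{\gamma v_j \eta_j \xi_j}$ and direct computations, and
\[
2\Phi \stackrel{\mathrm{def}}{=} \sum_{j \in \Lambda} E X_j \cdot X_j - \sum_{j,k \in \Lambda,\, |j - k|_1 = 1} X_j \cdot X_k - \sum_{j \in \Lambda} k(\gamma X_j \cdot X_j). \tag{2.25}
\]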
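The Grassmann integral identities (2.16) and (2.17) used in this proof can be checked mechanically on a small matrix. The following is a minimal sketch, not taken from the paper: it represents a Grassmann element as a dictionary from sorted generator tuples to coefficients, and the helper names (`gmul`, `gexp`, `berezin`, `grassmann_gaussian`), the generator numbering and the test matrix are all assumptions made for illustration. The integration rule follows the convention of (2.14) and (2.15): the variable being integrated is first moved to the right end of each monomial.

```python
from fractions import Fraction
from math import factorial

# A Grassmann element is a dict {sorted tuple of generator ids: coefficient}.
# Generator ids (an arbitrary convention for this sketch): eta_j -> 2j, xi_j -> 2j + 1.
def eta(j):
    return {(2 * j,): Fraction(1)}

def xi(j):
    return {(2 * j + 1,): Fraction(1)}

def gmul(a, b):
    """Product a*b, using anti-commutativity to sort each monomial."""
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            if set(ma) & set(mb):
                continue  # a repeated generator squares to zero
            seq, sign = list(ma + mb), 1
            for i in range(len(seq)):  # bubble sort, one sign flip per swap
                for j in range(len(seq) - 1 - i):
                    if seq[j] > seq[j + 1]:
                        seq[j], seq[j + 1] = seq[j + 1], seq[j]
                        sign = -sign
            key = tuple(seq)
            out[key] = out.get(key, Fraction(0)) + sign * ca * cb
    return out

def gexp(s, max_power):
    """exp(s) for an even nilpotent element s, via the finite Taylor series."""
    result, term = {(): Fraction(1)}, {(): Fraction(1)}
    for m in range(1, max_power + 1):
        term = gmul(term, s)  # term now holds s**m
        for key, c in term.items():
            result[key] = result.get(key, Fraction(0)) + c / factorial(m)
    return result

def berezin(f, gen):
    """Integrate over one generator: move it to the right end, then remove it."""
    out = {}
    for mono, c in f.items():
        if gen in mono:
            p = mono.index(gen)
            sign = -1 if (len(mono) - 1 - p) % 2 else 1
            key = mono[:p] + mono[p + 1:]
            out[key] = out.get(key, Fraction(0)) + sign * c
    return out

def grassmann_gaussian(M, prefactor=None):
    """Integral of prefactor * exp(-sum_jk M_jk eta_j xi_k) against prod_j (d eta_j d xi_j)."""
    n = len(M)
    s = {}
    for j in range(n):
        for k in range(n):
            for key, c in gmul(eta(j), xi(k)).items():
                s[key] = s.get(key, Fraction(0)) - M[j][k] * c
    f = gexp(s, n)
    if prefactor is not None:
        f = gmul(prefactor, f)
    for j in range(n):
        f = berezin(berezin(f, 2 * j), 2 * j + 1)  # d eta_j first, then d xi_j
    return f.get((), Fraction(0))

M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]  # symmetric test matrix (hypothetical)
det_M = grassmann_gaussian(M)  # right-hand side of (2.16)
adj_01 = grassmann_gaussian(M, gmul(xi(0), eta(1)))  # right-hand side of (2.17)
```

Exact rational arithmetic is used so that the comparison with the determinant and the cofactors is not blurred by rounding; the cost grows exponentially with the number of generators, so this is only meant for very small matrices.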

Using (2.23) and integrating over the anti-commutative variables ξ, η according to (2.14, 2.15), we then obtain the proposition. We notice from the proof of Proposition 2.1 that in (2.21), apart from the factor xµ · xν , the integrand is only a “function” of Xj · Xk . More precisely, we

consider the algebra $C^\infty(\mathbb{R}^{2|\Lambda|}) \otimes \Lambda[\xi_1, \eta_1, \ldots, \xi_{|\Lambda|}, \eta_{|\Lambda|}]$. We call elements of this algebra (super)functions. Clearly,
\[
D(X) = X_j \cdot X_k = x_j \cdot x_k + \frac{1}{2} (\eta_j \xi_k + \eta_k \xi_j) \in C^\infty(\mathbb{R}^{2|\Lambda|}) \otimes \Lambda[\xi_1, \eta_1, \ldots, \xi_{|\Lambda|}, \eta_{|\Lambda|}],
\]
and is hence a (super)function. We call the scalars (which are functions in $C^\infty(\mathbb{R}^{2|\Lambda|})$) in front of the Grassmann generators $\xi, \eta$ coefficients. In the case of $X_j \cdot X_k$, the coefficients are $x_j \cdot x_k$ and $1/2$. Those (super)functions which are only functions of $X_j \cdot X_k$ ($j, k \in \Lambda$) are called supersymmetric functions. More precisely, supersymmetries are defined to be the set of transformations (in the algebraic sense) that leave the dot products $X_j \cdot X_k$ ($j, k \in \Lambda$) invariant. Two obvious transformations that leave the dot product invariant are the usual rotations $O$ in $\mathbb{R}^2$,
\[
x_i = x_i' O \qquad (i = 1, \cdots, n),
\]
and the transformations $A \in \mathrm{Sp}(2)$ acting on $\xi_i, \eta_i$ ($i = 1, \cdots, n$) such that
\[
(\xi_i, \eta_i) = (\xi_i', \eta_i') A,
\]
where $\{x_i', \xi_i', \eta_i'\}_{i=1}^{n}$ is another set of coordinates, $x_i'$ being the even ones and $\xi_i', \eta_i'$ the odd ones. We put $X_i' = (x_i', \xi_i', \eta_i')$ ($i = 1, \cdots, n$). Aside from these two linear transformations, supersymmetries also include transformations generated by (super)vector fields of the type
\[
V = \sum_i V_i = \sum_i \Big[ (\xi_i a + \eta_i b) \frac{\partial}{\partial x_i} + 2 (b \cdot x_i) \frac{\partial}{\partial \xi_i} - 2 (a \cdot x_i) \frac{\partial}{\partial \eta_i} \Big],
\]
where $a, b \in \mathbb{R}^2$, and
\[
a \frac{\partial}{\partial x_i} \stackrel{\mathrm{def}}{=} a^{(1)} \frac{\partial}{\partial x_i^{(1)}} + a^{(2)} \frac{\partial}{\partial x_i^{(2)}}, \qquad
b \frac{\partial}{\partial x_i} \stackrel{\mathrm{def}}{=} b^{(1)} \frac{\partial}{\partial x_i^{(1)}} + b^{(2)} \frac{\partial}{\partial x_i^{(2)}}.
\]
(Note that it is the same transformation in all the $X_i$.) The above transformation is to be understood in the algebraic sense. We check that $V D(X_i, X_j) = 0$. Analogous to the usual change of variables, the (super) change of variables also generates a Jacobian, which is called a Berezinian. We will not go into the details, except to mention that the Berezinian here is 1. Let $\tau$ be a supersymmetric transformation. Let $X_i' = \tau X_i$ ($i = 1, \cdots, n$).

Definition. A superfunction $F$ is supersymmetric if it is invariant under all supersymmetries:
\[
F(X_1', \cdots, X_n') = F(X_1, \cdots, X_n)
\]
for all supersymmetric transformations $\tau$.

Clearly, supersymmetric functions belong to a rather restricted class of functions. For example, in $\mathbb{R}^2$, with two generators $\xi, \eta$, $F$ is supersymmetric if and only if there exists $f : [0, \infty) \to \mathbb{R}$ of class $C^\infty$ such that $F(X) = f(X \cdot X) = f(x \cdot x) + f'(x \cdot x)\, \eta \xi$. For the general case, see e.g., [KS]. Define $dX_i = (d^2 x_i / \pi)\, d\eta_i\, d\xi_i$ ($i = 1, \cdots, n$). One of the most useful properties of the supersymmetric functions is the following:

Proposition 2.2. If $F$ is supersymmetric with all of its coefficients in $\mathcal{S}(\mathbb{R}^{2n})$, then
\[
\int F(X_1, \cdots, X_n)\, dX_1 \cdots dX_n = F(0, \cdots, 0). \tag{2.26}
\]
For a proof, see e.g., [K, KS]. Proposition 2.2 is related to the so-called “localization formula” in geometry, see e.g., [BGV]. Let $F(X) = X_\mu \cdot X_\nu \, e^{-2\Phi}$. Using Proposition 2.2, (2.23), $F(0) = 0$ and integrating over the anti-commuting variables, we then obtain another expression for $\langle G(E, \mu, \nu) \rangle_g$ (to be compared to (2.8)):
\[
\langle G(E, \mu, \nu) \rangle_g = \int \xi_\mu \eta_\nu \, e^{-2\Phi} \prod_{j \in \Lambda} d^2 X_j = \int \big(E - \Delta - \gamma\, \mathrm{diag}\, k'\big)^{-1}_{\mu\nu}\, e^{-2\phi} \prod_{j \in \Lambda} \frac{d^2 x_j}{\pi}, \tag{2.27}
\]
where we used (2.17, 2.9) to reach the last equality. Proposition 2.2, (2.17) and (2.27) will be our main tools in estimating various integrals with respect to the measure $e^{-2\phi}$. They help us to obtain the correct order of magnitude in $\gamma$. They also allow us to skirt the use of maximum principles as in [Sj1] (or Witten Laplacians as in [BJS]). In spite of all of its virtues, supersymmetry is not central for the main constructions in this paper. So if the readers wish, they could skip the details of such estimates of the integrals and only use the end results. Finally we remark that the measure $e^{-2\phi}$ with $\phi$ defined in (2.9) is normalized for all $\Lambda$ as $\prod_{j \in \Lambda} dg(v_j)$ is normalized, which can also be verified by using (2.25, 2.9) and Proposition 2.2.
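Proposition 2.1 and formula (2.27) can be sanity-checked numerically in the smallest possible case, a single lattice site, where $\Delta = 0$ and $H = \gamma v$. The sketch below is only an illustration under stated assumptions: it uses the Bernoulli measure (2.6), illustrative values of $E$ and $\gamma$, polar coordinates $s = |x|^2$ (so that $d^2x/\pi$ becomes $ds$), and a hand-written Simpson rule; the function names are hypothetical.

```python
import math

def g_hat(t):
    """Laplace transform of the Bernoulli measure (2.6): (e^{-t} + 1) / 2."""
    return 0.5 * (1.0 + math.exp(-t))

def k(t):
    return math.log(g_hat(t))

def k_prime(t):
    return -1.0 / (1.0 + math.exp(t))

def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def exact_average(E, gamma):
    """Average of (E - gamma*v)^{-1} over v = -1 or 0, each with probability 1/2."""
    return 0.5 * (1.0 / (E + gamma) + 1.0 / E)

def weight(s, E, gamma):
    """e^{-2 phi} from (2.9) for one site, written as a function of s = |x|^2."""
    return math.exp(-E * s + k(gamma * s)) * (E - gamma * k_prime(gamma * s))

def via_2_8(E, gamma, n=20000):
    # (2.8): integrand (x . x) e^{-2 phi}; the tail beyond 60/E is negligible.
    return simpson(lambda s: s * weight(s, E, gamma), 0.0, 60.0 / E, n)

def via_2_27(E, gamma, n=20000):
    # (2.27): integrand (E - gamma k')^{-1} e^{-2 phi}.
    return simpson(
        lambda s: weight(s, E, gamma) / (E - gamma * k_prime(gamma * s)),
        0.0, 60.0 / E, n)
```

For instance, `via_2_8(1.0, 0.3)` and `via_2_27(1.0, 0.3)` can both be compared with `exact_average(1.0, 0.3)`; agreement of the three numbers is what Proposition 2.1 and (2.27) predict in this toy case.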

III The Witten Laplacian Representation of the Averaged Green's Function

Recall that in the Gaussian integral representation (2.4), we doubled the dimension. We therefore introduce the set $\bar\Lambda = \Lambda \times \{1, 2\}$ as in sect. I, which we identify with $\{1, 2, \cdots, |\Lambda|\} \times \{1, 2\}$ and sometimes also with $\{1, 2, \cdots, |\Lambda|, \cdots, 2|\Lambda|\}$. Our main aim is to write the correlation
\[
\langle G(E, \mu, \nu) \rangle_g = \int x_\mu \cdot x_\nu \, e^{-2\phi(x)} \prod_{i \in \Lambda} \frac{d^2 x_i}{\pi} \tag{3.1}
\]
in (2.8) using Witten Laplacians as initiated in [Sj1]. In order to use the Witten Laplacian construction, we need to have some control over $k$ and its derivatives. To that end, we use (H3) and apply a classical Tauberian theorem, see Theorems 2, 3 on p. 418-422 of [W], which we summarize below as Theorem 3.0. Recall also the notion of regularly varying functions introduced earlier in (H3). Let $dU$ be a measure defined on $[0, \infty)$ and such that its Laplace transform
\[
\omega(\lambda) = \int_0^\infty e^{-\lambda x}\, dU(x) \tag{3.2}
\]
exists for $\lambda > 0$.

Theorem 3.0. Let $\rho \in [0, \infty)$. If $L \geq 0$ defined on $[0, \infty)$ is regularly varying at 0 with exponent 0, then each of the relations
\[
\omega(\lambda) \sim \lambda^{-\rho} L\Big(\frac{1}{\lambda}\Big), \qquad \lambda \to \infty, \tag{3.3}
\]
and
\[
U(t) \sim \frac{1}{\Gamma(\rho + 1)}\, t^{\rho} L(t), \qquad t \to 0, \tag{3.4}
\]
implies the other.

Using the above Tauberian theorem, we arrive at

Lemma 3.1. Assume (H1-3), then
\[
k^{(n)}(t) = \frac{O_n(1)}{(1 + |t|)^n}, \qquad (n \geq 0). \tag{3.5}
\]

Proof. For simplicity we work out the case $\mathrm{supp}\, dg \subseteq (-\infty, 0]$; the other case works in the same way. Since
\[
\hat g(t) = \int e^{tv}\, dg(v) = e^{k(t)},
\]
(3.5) is obviously true when $n = 0$ by using (H1). We now look at $n \geq 1$. We have
\[
\hat g'(t) = \int v e^{tv}\, dg(v) = k'(t)\, e^{k(t)}.
\]
So
\[
k'(t) = \frac{\int v e^{tv}\, dg(v)}{\int e^{tv}\, dg(v)} = \frac{\int e^{tv}\, d(g(v) v) - \int e^{tv} g(v)\, dv}{\int e^{tv}\, dg(v)},
\]
by integration by parts, as $g$ is of regular variation. Since $g$ is of regular variation with exponent $\rho$, $vg$ and $\int_0^v g(s)\, ds$ are both of regular variation with exponent $\rho + 1$. The Tauberian Theorem 3.0 then gives
\[
k'(t) = \frac{O_1(1)}{(1 + |t|)} \qquad \text{as } t \to \infty.
\]
Similarly
\[
\begin{aligned}
k''(t) &= \frac{\int v^2 e^{tv}\, dg(v)}{\int e^{tv}\, dg(v)} - (k'(t))^2 \\
&= \frac{\int e^{tv}\, d(g(v) v^2) - 2 \int e^{tv} v g(v)\, dv}{\int e^{tv}\, dg(v)} - (k'(t))^2 \\
&= \frac{O_2(1)}{(1 + |t|)^2}, \\
&\ \ \vdots \\
k^{(n)}(t) &= \frac{O_n(1)}{(1 + |t|)^n}.
\end{aligned}
\]
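The decay in Lemma 3.1 can be illustrated on the Bernoulli example, for which $k(t) = \log\cosh(t/2) - t/2$ is explicit. The sketch below assumes that this measure falls under the hypotheses of the lemma (the text uses it as its running example); it simply evaluates $(1 + t)^n |k^{(n)}(t)|$ on a grid for $n = 1, 2$, using the closed forms $k'(t) = -1/(1 + e^{t})$ and $k''(t) = e^{t}/(1 + e^{t})^2$ obtained by differentiating $k$. The grid size and cutoff are arbitrary choices.

```python
import math

def k1(t):
    """k'(t) for k(t) = log cosh(t/2) - t/2."""
    return -1.0 / (1.0 + math.exp(t))

def k2(t):
    """k''(t) for the same k."""
    e = math.exp(t)
    return e / (1.0 + e) ** 2

def weighted_sup(f, n, t_max=50.0, steps=5000):
    """Maximum over a grid of (1 + t)^n |f(t)|, a numerical stand-in for O_n(1)."""
    return max((1.0 + t) ** n * abs(f(t))
               for t in (i * t_max / steps for i in range(steps + 1)))

sup1 = weighted_sup(k1, 1)  # bound for n = 1 in (3.5)
sup2 = weighted_sup(k2, 2)  # bound for n = 2 in (3.5)
```

In this example the derivatives actually decay exponentially, so the polynomial bounds of (3.5) hold with a lot of room; the lemma is of course aimed at measures where only regular variation is available.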

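Using (2.9), we compute that
\[
(2\phi')_{i^{(\alpha)}}(x) = 2 \sum_{\ell} \big(E - \Delta - \gamma\, \mathrm{diag}\, k'(\gamma x_j \cdot x_j)\big)_{i\ell}\, x_\ell^{(\alpha)} + 2\gamma^2 k''(\gamma x_i \cdot x_i) \big(E - \Delta - \gamma\, \mathrm{diag}\, k'(\gamma x_j \cdot x_j)\big)^{-1}_{ii}\, x_i^{(\alpha)}, \tag{3.6}
\]
and
\[
\begin{aligned}
(2\phi''(x))_{i^{(\alpha)} j^{(\beta)}} ={}& 2E\, \delta_{ij} \delta_{\alpha\beta} - 2\Delta_{ij}\, \delta_{\alpha\beta} - 4\gamma^2 x_i^{(\alpha)} x_i^{(\beta)} k''(\gamma x_i \cdot x_i)\, \delta_{ij} \\
& - 2\gamma \big[ k'(\gamma x_i \cdot x_i) - \gamma k''(\gamma x_i \cdot x_i) (E - \Delta - \gamma\, \mathrm{diag}\, k')^{-1}_{ii} \big]\, \delta_{ij} \delta_{\alpha\beta} \\
& + 4\gamma^3 x_i^{(\alpha)} x_i^{(\beta)} k'''(\gamma x_i \cdot x_i) (E - \Delta - \gamma\, \mathrm{diag}\, k')^{-1}_{ii}\, \delta_{ij} \\
& + 4\gamma^4 x_i^{(\alpha)} x_j^{(\beta)} k''(\gamma x_i \cdot x_i)\, k''(\gamma x_j \cdot x_j) \big[ (E - \Delta - \gamma\, \mathrm{diag}\, k')^{-1}_{ij} \big]^2.
\end{aligned} \tag{3.7}
\]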
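The gradient formula (3.6) can be compared with finite differences of $2\phi$ from (2.9). The following is a small sketch under explicit assumptions: two lattice sites with $\Delta_{01} = \Delta_{10} = 1$, the Bernoulli $k$ of (2.6), and illustrative values of $E$ and $\gamma$ chosen so that the matrix $E - \Delta - \gamma\, \mathrm{diag}\, k'$ is positive definite. The function names and the test point are hypothetical.

```python
import math

E, GAMMA = 3.0, 0.5  # illustrative values (assumptions), not taken from the paper

def k(t):
    return math.log(0.5 * (1.0 + math.exp(-t)))

def k1(t):
    return -1.0 / (1.0 + math.exp(t))

def k2(t):
    e = math.exp(t)
    return e / (1.0 + e) ** 2

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def m_matrix(x):
    """Diagonal entries and determinant of E - Delta - gamma diag k'(gamma x_j.x_j)."""
    m00 = E - GAMMA * k1(GAMMA * dot(x[0], x[0]))
    m11 = E - GAMMA * k1(GAMMA * dot(x[1], x[1]))
    return m00, m11, m00 * m11 - 1.0  # the off-diagonal entries are -1

def two_phi(x):
    """2*phi(x) from (2.9) for x = [x_0, x_1], each x_i in R^2."""
    _, _, det = m_matrix(x)
    quad = E * (dot(x[0], x[0]) + dot(x[1], x[1])) - 2.0 * dot(x[0], x[1])
    return quad - k(GAMMA * dot(x[0], x[0])) - k(GAMMA * dot(x[1], x[1])) - math.log(det)

def grad_formula(x):
    """Gradient of 2*phi according to (3.6)."""
    m00, m11, det = m_matrix(x)
    diag = (m00, m11)
    inv_diag = (m11 / det, m00 / det)  # diagonal of the 2x2 inverse
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in (0, 1):
        c = 2.0 * GAMMA ** 2 * k2(GAMMA * dot(x[i], x[i])) * inv_diag[i]
        for a in (0, 1):
            grad[i][a] = 2.0 * (diag[i] * x[i][a] - x[1 - i][a]) + c * x[i][a]
    return grad

def grad_numeric(x, h=1e-5):
    """Central finite differences of two_phi."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in (0, 1):
        for a in (0, 1):
            xp = [list(v) for v in x]
            xm = [list(v) for v in x]
            xp[i][a] += h
            xm[i][a] -= h
            grad[i][a] = (two_phi(xp) - two_phi(xm)) / (2.0 * h)
    return grad

x0 = [[0.7, -0.3], [0.2, 1.1]]
max_err = max(abs(grad_formula(x0)[i][a] - grad_numeric(x0)[i][a])
              for i in (0, 1) for a in (0, 1))
```

The same pattern (differentiate `grad_formula` numerically) could be used to test (3.7), at the cost of a longer script.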

For all $\delta > 0$, $E \in I_\delta$ ($I_\delta$ as defined in (1.11)), using Lemma 3.1, we deduce that $\phi \in C^\infty(\mathbb{R}^{2|\Lambda|}, \mathbb{R})$ satisfies
\[
\phi(x) = O(1 + |x|^2), \qquad \phi'(x) = O(1 + |x|), \qquad x \cdot \phi'(x) \sim |x|^2 \ \text{ for } |x| \gg 1, \qquad \phi^{(n)}(x) = O_n(1) \ \text{ for } n \geq 2, \tag{3.8}
\]
and
\[
\phi''(x) \geq r_E > 0 \quad \text{in } \mathcal{L}(\ell^2(\bar\Lambda), \ell^2(\bar\Lambda)), \tag{3.9}
\]
where $r_E$ (depending on $\delta$) is uniform in $\Lambda$. The results to be stated below (except the last subsection when $\phi$ is an exact quadratic form) are general and in particular do not require that $\phi'' > c > 0$. For notational simplicity, we make a slight change of point of view. We make the identification:
\[
(\mathbb{R}^2)^{\Lambda} \sim (\mathbb{R})^{2|\Lambda|} \sim (\mathbb{R})^{\bar\Lambda}.
\]
(In fact, careful readers probably have already remarked that we made this identification earlier in the introduction.) Hence unless specified otherwise, the $i, j, k, \cdots$ etc. will now be in $\bar\Lambda$ and $x_i, x_j, x_k, \cdots$ etc. will now be in $\mathbb{R}$. We will generally keep this point of view unless we need to estimate integrals with respect to $e^{-2\phi}$, where it is more convenient to revert back to the point of view of $(\mathbb{R}^2)^{\Lambda}$ and use supersymmetry.

III.1

The Witten complex

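We introduce the Witten complex (originally introduced in [W]) as in [Sj1] (cf. also [BJS, HS2]):
\[
d_\phi = e^{-\phi}\, d\, e^{\phi} = d + d\phi^{\wedge} = \sum_{j \in \bar\Lambda} z_j\, dx_j^{\wedge}, \tag{3.10}
\]
acting (for the moment) on $C_0^\infty(\mathbb{R}^{2\Lambda}; \wedge^{\ell} \mathbb{R}^{2\Lambda})$ ($0 \leq \ell \leq 2|\Lambda|$), where $d = \sum_{j \in \bar\Lambda} \partial_{x_j}\, dx_j^{\wedge}$ is the usual exterior differentiation and
\[
z_j = \frac{\partial}{\partial x_j} + \frac{\partial \phi}{\partial x_j} \tag{3.11}
\]
can be viewed as an annihilation operator. Let
\[
d_\phi^* = e^{\phi}\, d^*\, e^{-\phi} = \sum_{j \in \bar\Lambda} z_j^*\, dx_j^{\rfloor} \tag{3.12}
\]
be the formal complex adjoint, where
\[
z_j^* = -\frac{\partial}{\partial x_j} + \frac{\partial \phi}{\partial x_j} \tag{3.13}
\]
can be viewed as a creation operator. We have the commutation relation
\[
[z_j, z_k^*] = 2\, \partial_j \partial_k \phi, \tag{3.14}
\]
which plays an important role in our later constructions. We check easily that indeed
\[
d_\phi d_\phi = d_\phi^* d_\phi^* = 0. \tag{3.15}
\]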
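For readers who want to see the commutation relation (3.14) verified explicitly, here is the short computation on a smooth test function $f$ (a symbol introduced only for this check), writing $\phi_j = \partial_j \phi$ and $\phi_{jk} = \partial_j \partial_k \phi$:

```latex
\begin{align*}
z_j z_k^* f &= (\partial_j + \phi_j)(-\partial_k f + \phi_k f)
  = -\partial_j \partial_k f + \phi_{jk} f + \phi_k \partial_j f
    - \phi_j \partial_k f + \phi_j \phi_k f, \\
z_k^* z_j f &= (-\partial_k + \phi_k)(\partial_j f + \phi_j f)
  = -\partial_k \partial_j f - \phi_{jk} f - \phi_j \partial_k f
    + \phi_k \partial_j f + \phi_k \phi_j f, \\
[z_j, z_k^*] f &= z_j z_k^* f - z_k^* z_j f = 2 \phi_{jk} f .
\end{align*}
```

All first-order and quadratic terms cancel, and only the Hessian entry survives, with the factor 2 coming from one contribution of each product.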

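Using $d_\phi$, $d_\phi^*$, we define the Witten Laplacian, which is a deformed Hodge Laplacian:
\[
\Delta_\phi = d_\phi^* d_\phi + d_\phi d_\phi^*, \tag{3.16}
\]
on $C_0^\infty(\mathbb{R}^{2\Lambda}; \wedge^{\ell} \mathbb{R}^{2\Lambda})$ ($0 \leq \ell \leq 2|\Lambda|$). Notice that
\[
d_\phi \Delta_\phi = \Delta_\phi d_\phi, \qquad d_\phi^* \Delta_\phi = \Delta_\phi d_\phi^* \tag{3.17}
\]
by using (3.15). If we let $\Delta_\phi^{(\ell)}$ be the restriction of $\Delta_\phi$ to forms of degree $\ell$, we obtain more precisely:
\[
d_\phi \Delta_\phi^{(\ell)} = \Delta_\phi^{(\ell+1)} d_\phi, \qquad d_\phi^* \Delta_\phi^{(\ell+1)} = \Delta_\phi^{(\ell)} d_\phi^*. \tag{3.18}
\]
This is the same as in the case of Hodge Laplacians, which corresponds to taking $\phi = 0$. We have explicitly
\[
\Delta_\phi^{(0)} = d_\phi^* d_\phi = \sum_{j \in \bar\Lambda} z_j^* z_j = -\sum_{j \in \bar\Lambda} \frac{\partial^2}{\partial x_j^2} + \|d\phi\|^2 - \mathrm{Tr}\, \mathrm{Hess}\, \phi. \tag{3.19}
\]
For example, if $\phi$ is a non-degenerate quadratic form, then $\Delta_\phi^{(0)}$ is a $2|\Lambda|$-dimensional harmonic oscillator; $z_j$ and $z_j^*$ are just the familiar annihilation and creation operators for the harmonic oscillator. More generally, we have
\[
\begin{aligned}
\Delta_\phi &= \sum z_j z_k^*\, dx_j^{\wedge} dx_k^{\rfloor} + \sum z_k^* z_j\, dx_k^{\rfloor} dx_j^{\wedge} \\
&= \sum z_k^* z_j \big( dx_j^{\wedge} dx_k^{\rfloor} + dx_k^{\rfloor} dx_j^{\wedge} \big) + \sum [z_j, z_k^*]\, dx_j^{\wedge} dx_k^{\rfloor} \\
&= \sum z_j^* z_j + 2 \sum (\partial_{x_j} \partial_{x_k} \phi)\, dx_j^{\wedge} dx_k^{\rfloor} \\
&= \Delta_\phi^{(0)} \otimes I + 2 \sum (\partial_{x_j} \partial_{x_k} \phi)\, dx_j^{\wedge} dx_k^{\rfloor},
\end{aligned} \tag{3.20}
\]
where to obtain the third line from the second, we used (3.14). In particular, if we identify 1-forms with $\mathbb{R}^{2\Lambda}$-valued functions, we obtain
\[
\Delta_\phi^{(1)} = \Delta_\phi^{(0)} \otimes I + 2\phi''. \tag{3.21}
\]
Also if we identify 2-forms with functions valued in the space of real anti-symmetric matrices, we have
\[
\Delta_\phi^{(2)} u = \Delta_\phi^{(0)} u + 2 (\phi'' \circ u + u \circ \phi''). \tag{3.22}
\]
Below we describe briefly some general properties of $\Delta_\phi^{(\ell)}$ for $\phi$ satisfying (3.8). ((3.9) is not needed for these general statements.) There is a well written paper by Johnsen [Jo] on the subject. (3.8) ensures that most of the results there are valid in the present case. So for more details, the readers are referred to [Jo].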

Since $(\Delta_\phi^{(\ell)} u, u) \geq 0$ for all $u \in C_0^\infty(\mathbb{R}^{2\Lambda}; \wedge^{\ell} (\mathbb{R}^{2\Lambda})^*)$, we can define $\Delta_\phi^{(\ell)}$ as a self-adjoint operator by taking the Friedrichs extension. Using (3.8), we also see that $\Delta_\phi^{(\ell)}$ has compact resolvent. Moreover $\Delta_\phi^{(\ell)}$ has discrete spectrum contained in $[0, \infty)$. The lowest eigenvalue of $\Delta_\phi^{(0)}$ is zero and a corresponding eigenfunction is $e^{-\phi}$, since this function is annihilated by $d_\phi$. This eigenvalue is simple, for if $u$ is another eigenfunction associated to the same eigenvalue, then $0 = (\Delta_\phi^{(0)} u, u) = \|d_\phi u\|^2$ and hence $d_\phi u = 0$, which means precisely that $u$ is a multiple of $e^{-\phi}$.

For $\phi$ satisfying (3.8), we also have the important property that $\Delta_\phi^{(0)}$, $\Delta_\phi^{(1)}$ have closed range (Theorem 1.7 of [Jo]), i.e.,
\[
\overline{\mathrm{Ran}\, \Delta_\phi^{(0)}} = \mathrm{Ran}\, \Delta_\phi^{(0)}, \qquad \overline{\mathrm{Ran}\, \Delta_\phi^{(1)}} = \mathrm{Ran}\, \Delta_\phi^{(1)}. \tag{3.23}
\]
We will need this property to prove the well-posedness of the various Grushin problems in sect. IV.

We next repeat the argument given in [Sj1] to show that the lowest eigenvalue of $\Delta_\phi^{(1)}$ on $L^2(\mathbb{R}^{2|\Lambda|}; \wedge^1 \mathbb{R}^{2|\Lambda|})$ is positive. (For more detailed arguments, see Theorem 7.5 of [Jo] and its proof.) Assume that $u \in D(\Delta_\phi^{(1)})$ and that $\Delta_\phi^{(1)} u = 0$. Then we have $d_\phi u = 0$, $d_\phi^* u = 0$. Since we are in $\mathbb{R}^{2\Lambda}$, the first equation implies that $e^{\phi} u$ is an exact form: $e^{\phi} u = d(e^{\phi} v)$, i.e., $u = d_\phi v$, where $v$ is given by
\[
v(x) = \int_0^x e^{-(\phi(x) - \phi(y))}\, u(y), \tag{3.24}
\]
a line integral. From (3.8), it follows by using weighted Lithner-Agmon estimates as in [HS1] that $u \in \mathcal{S}(\mathbb{R}^{2\Lambda})$ and then, using (3.24) and (3.8) once more, that $v \in \mathcal{S}(\mathbb{R}^{2\Lambda})$. Using that $d_\phi^* u = 0$, we obtain $\Delta_\phi^{(0)} v = 0$, so $v = \mathrm{const.}\, e^{-\phi}$. This implies that $u = d_\phi v = 0$.

Remark. The above argument does not require $\phi$ to be convex. It holds for all $\Lambda$ which are finite subsets of $\mathbb{Z}^d$. The point is that the lower bound obtained above is not uniform in $\Lambda$.

The relations (3.17, 3.18) yield connections between the spectra of Witten Laplacians on various forms. For instance, if $\lambda \neq 0$ is an eigenvalue for $\Delta_\phi^{(\ell)}$ with eigenform $u \in \mathcal{S}(\mathbb{R}^{2\Lambda}; \wedge^{\ell} \mathbb{R}^{2\Lambda})$, then $d_\phi u \in \mathcal{S}(\mathbb{R}^{2\Lambda}; \wedge^{\ell+1} \mathbb{R}^{2\Lambda})$ and $d_\phi^* u \in \mathcal{S}(\mathbb{R}^{2\Lambda}; \wedge^{\ell-1} \mathbb{R}^{2\Lambda})$ (if they are non-zero) are eigenforms for $\Delta_\phi^{(\ell+1)}$ and $\Delta_\phi^{(\ell-1)}$ respectively. (When $\lambda = 0$, both $d_\phi u$ and $d_\phi^* u$ are zero, so there is no such straightforward relation between the zero eigenvalues.) These are basic properties of a complex, see e.g., chap. 11 of [CFKS].

The following observation, which will be used later in the paper, is a manifestation of the above fundamental fact. Let $\lambda > 0$ be the smallest eigenvalue of $\Delta_\phi^{(1)}$. Then the gap between the first eigenvalue of $\Delta_\phi^{(0)}$, which is zero, and the second one is $\geq \lambda$. This is because if $\mu < \lambda$ is the second eigenvalue for $\Delta_\phi^{(0)}$ with eigenfunction $u \in \mathcal{S}(\mathbb{R}^{2\Lambda}; \mathbb{R})$, then $0 \neq d_\phi u \in \mathcal{S}(\mathbb{R}^{2\Lambda}; \mathbb{R}^{2\Lambda})$ is an eigenform for $\Delta_\phi^{(1)}$ with eigenvalue $\mu < \lambda$, which is in contradiction with the assumption. The Grushin reductions in sect. IV can be seen as using the above argument for higher parts of the spectrum.

We mentioned earlier that in the case of a Gaussian ($\phi$ a quadratic form), $\Delta_\phi^{(0)}$ is just a $2|\Lambda|$-dimensional harmonic oscillator. This observation can be generalized to note that if $e^{-\phi}$ is the eigenfunction of the Schrödinger operator
\[
-\sum_j \frac{\partial^2}{\partial x_j^2} + V
\]
for the lowest eigenvalue $\lambda$, then
\[
-\sum_j \frac{\partial^2}{\partial x_j^2} + V - \lambda = \Delta_\phi^{(0)}.
\]
From this point of view, the Witten Laplacian can be seen as, in some sense, a natural generalization of harmonic oscillators.
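To make the harmonic oscillator picture concrete, here is the simplest instance worked out by hand: one real variable and $\phi(x) = a x^2 / 2$ with a constant $a > 0$ (both introduced only for this illustration, and with the one-variable analogue of (3.21)).

```latex
% One variable, \phi(x) = a x^2 / 2, a > 0.
\begin{align*}
z &= \frac{d}{dx} + a x, \qquad z^* = -\frac{d}{dx} + a x, \qquad [z, z^*] = 2a = 2\phi'', \\
\Delta_\phi^{(0)} &= z^* z = -\frac{d^2}{dx^2} + a^2 x^2 - a,
  \qquad \sigma\big(\Delta_\phi^{(0)}\big) = \{\, 2an : n = 0, 1, 2, \dots \,\}, \\
\Delta_\phi^{(1)} &= \Delta_\phi^{(0)} + 2\phi'' = -\frac{d^2}{dx^2} + a^2 x^2 + a,
  \qquad \sigma\big(\Delta_\phi^{(1)}\big) = \{\, 2a(n+1) : n = 0, 1, 2, \dots \,\}.
\end{align*}
% The ground state of \Delta_\phi^{(0)} is e^{-\phi} = e^{-a x^2 / 2}, with eigenvalue 0.
% The lowest eigenvalue of \Delta_\phi^{(1)} is 2a, which equals the gap between the
% first two eigenvalues of \Delta_\phi^{(0)}: the inequality "gap \geq \lambda"
% in the observation above is an equality here.
```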

III.2

The Witten Laplacian Representation

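We begin by rewriting the averaged Green's function in (3.1) in the new notation as
\[
\langle G(E, \mu, \nu) \rangle_g = 2 \int x_\mu x_\nu \, e^{-2\phi(x)} \prod_{i \in \bar\Lambda} \frac{dx_i}{\sqrt{\pi}} = 2 \big( x_\mu e^{-\phi},\ x_\nu e^{-\phi} \big), \tag{3.25}
\]
where $\mu, \nu \in \Lambda \subset \bar\Lambda$, $x_\mu, x_\nu, x_i \in \mathbb{R}$ and the extra factor of 2 comes from symmetry. We note that due to symmetry $(e^{-\phi}, x_\nu e^{-\phi}) = 0$ for all $\nu \in \Lambda$. We look for a 1-form $\omega$ such that
\[
x_\nu e^{-\phi} - (e^{-\phi}, x_\nu e^{-\phi})\, e^{-\phi} = x_\nu e^{-\phi} = d_\phi^* \omega. \tag{3.26}
\]
Further we look for a particular type of 1-form $\omega = d_\phi u$. (3.26) then leads to the following equation for $u$:
\[
x_\nu e^{-\phi} = \Delta_\phi^{(0)} u, \tag{3.27}
\]
where $u \in D(\Delta_\phi^{(0)})$. Using (3.27) in (3.25), we obtain
\[
\langle G(E, \mu, \nu) \rangle_g = 2 \big( x_\mu e^{-\phi},\ \Delta_\phi^{(0)} u \big) = 2 \big( e^{-\phi}\, dx_\mu,\ d_\phi u \big). \tag{3.28}
\]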

Applying $d_\phi$ to (3.27), we obtain
$$e^{-\phi}dx_\nu = \Delta_\phi^{(1)}(d_\phi u). \tag{3.29}$$
Since $e^{-\phi}dx_\nu \in \mathcal{S}$, by using a partition of unity, weighted $L^2$ estimates and repeated differentiation of (3.27), it can be shown that $u \in \mathcal{S}$, therefore $d_\phi u \in \mathcal{S} \subset D(\Delta_\phi^{(1)})$ (see also [Sj1]). Hence
$$d_\phi u = (\Delta_\phi^{(1)})^{-1}(e^{-\phi}dx_\nu). \tag{3.30}$$
Using (3.30) in (3.28), we obtain finally
$$\langle G(E,\mu,\nu)\rangle_g = 2\,\bigl(e^{-\phi}dx_\mu, (\Delta_\phi^{(1)})^{-1}(e^{-\phi}dx_\nu)\bigr), \tag{3.31}$$
where $x_\mu, x_\nu \in \mathbb{R}$.

Remark. For all finite sets $\Lambda \subset \mathbb{Z}^d$, $\Delta_\phi^{(1)} > 0$ as discussed previously. So (3.31) is well defined. The main work starting in sect. IV is to study the lower part of the spectrum of $\Delta_\phi^{(1)}$ uniformly in $\Lambda$ as $\Lambda \nearrow \mathbb{Z}^d$. Looking for a 1-form $\omega$ as in (3.26) is dual to looking for a vector field (for integration by parts) as in [HS2, SW1]. In fact $e^{\phi}\Delta_\phi^{(1)}e^{-\phi}$ is the vector heat operator obtained there after identifying 1-forms with vector fields.

III.3 $\Delta_\phi^{(0)}$ and $\Delta_\phi^{(1)}$ for $\phi$ an exact quadratic form: the Fock space representation

When $\phi$ is a non-degenerate quadratic form, $\Delta_\phi^{(0)}$ is a $2|\Lambda|$-dimensional harmonic oscillator, as mentioned earlier. Its properties are well-known. However, since most of our insights in the general case come from the quadratic case, below we mention briefly the spectral properties of $\Delta_\phi^{(0)}$ and $\Delta_\phi^{(1)}$ when $\gamma = 0$, i.e., for
$$\phi(x) = \tfrac12 (E-\Delta)x\cdot x - \tfrac12 \log\det(E-\Delta), \tag{3.32}$$
where $E \notin \sigma(\Delta)$, and for definiteness we have assumed $E > 0$; the second term in $\phi$ is just the normalization constant. We will also express $\Delta_\phi^{(0)}$, $\Delta_\phi^{(1)}$ in the basis provided by the associated Hermite functions and thus obtain the so-called Fock space representations in (3.38), (3.42). As we will realize later, most of the essential features in the general case are already present in the quadratic case. Since in the latter $\phi$ decouples into the sum of two identical terms corresponding to the two copies of $\mathbb{R}$, for simplicity, in the rest of this subsection (and only there) we assume $j \in \Lambda$ instead of $\bar\Lambda$.

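Substituting (3.32) in (3.12) and (3.14), we have
$$\begin{aligned}
\Delta_\phi^{(0)} &= -\sum_{j\in\Lambda}\frac{\partial^2}{\partial x_j^2} + \sum_{j,k\in\Lambda}[(E-\Delta)^2]_{jk}\,x_j x_k - 2|\Lambda|E, \\
\Delta_\phi^{(1)} &= \Delta_\phi^{(0)}\otimes I + 2(E-\Delta),
\end{aligned} \tag{3.33}$$
where as earlier, we have identified 1-forms with $\mathbb{R}^\Lambda$ valued functions and we now further identify $\mathbb{R}^\Lambda$ valued functions with $\ell^2(\Lambda)$. Let $\omega_i$ ($i = 1, \dots, |\Lambda|$) be the eigenvalues of $E - \Delta$, in increasing order. Clearly $E - 2d \le \omega_1 < \cdots < \omega_{|\Lambda|} \le E + 2d$. As $\Lambda \nearrow \mathbb{Z}^d$, the equalities are reached and in the limit the spectrum is $\sigma(E-\Delta) = [E-2d, E+2d]$. For any finite $\Lambda$, the spectrum of $\Delta_\phi^{(0)}$ is discrete. Its eigenvalues are:
$$\sum_{i=1}^{|\Lambda|} 2 m_i\omega_i, \qquad m_i \in \{0, 1, \dots\}.$$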

The first spectral gap is $2\omega_1 > 0$ and approaches $2(E - 2d) > 0$ as $|\Lambda| \to \infty$. The higher gaps generally diminish in size as $|\Lambda|$ increases and the $\omega_i$'s approach a continuum. Let $v_i$ be the eigenvector of $(E - \Delta)$ corresponding to the eigenvalue $\omega_i$ ($i = 1, \dots, |\Lambda|$). Then clearly $e^{-\phi}v_i$ is an eigenform of $\Delta_\phi^{(1)}$ with eigenvalue $2\omega_i$. Let $\psi_1$ denote the eigenfunction of $\Delta_\phi^{(0)}$ corresponding to the eigenvalue $2\omega_1$. Similarly, we have that $\psi_1 v_i$ (component-wise multiplication) are eigenforms for $\Delta_\phi^{(1)}$ corresponding to eigenvalues $2\omega_1 + 2\omega_i \ge 4\omega_1$ ($i = 1, \dots, |\Lambda|$). Clearly, we can keep on iterating and obtain the whole spectrum of $\Delta_\phi^{(1)}$ this way. Let $\Pi$ be the projection onto the subspace in $L^2(\mathbb{R}^\Lambda; \mathbb{R}^\Lambda)$ spanned by $e^{-\phi}v_i$. The feature to notice is that
$$\Delta_\phi^{(1)} \ge 2\omega_1, \qquad \Delta_\phi^{(1)}(1-\Pi) \ge 4\omega_1.$$
Customarily, $2\omega_1$ is called "the gap" and $4\omega_1$ "the upper gap". Note that the upper gap is twice the gap. As we will see, these two features remain in the general case and manifest themselves when we compute the correlation function. In fact, in order to compute the asymptotics of the correlation function, it is necessary to have precise control over the spectrum beyond the gap. For example, in our case, to order $O(\gamma^2)$ it is the upper gap, and to $O(\gamma^4)$ it is the spectrum beyond the upper gap.
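The gap and upper gap in the quadratic case can be tabulated directly from the eigenvalue formulas above. The sketch below is illustrative only: it assumes $d = 1$, a chain of `n` sites, and that $\Delta$ is the nearest-neighbour hopping matrix with zero diagonal (so its eigenvalues are $2\cos(k\pi/(n+1))$); the function names are hypothetical and not from the paper.

```python
import math
from itertools import product

def omegas(E, n):
    """Eigenvalues of E - Delta on a chain of n sites (d = 1), in increasing order.

    Assumption: Delta is the nearest-neighbour hopping matrix with zero
    diagonal, whose eigenvalues on a chain are 2*cos(k*pi/(n+1)).
    """
    return sorted(E - 2.0 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1))

def occupations(n, max_quanta):
    """All occupation vectors (m_1, ..., m_n) with m_1 + ... + m_n <= max_quanta."""
    return [m for m in product(range(max_quanta + 1), repeat=n) if sum(m) <= max_quanta]

def spectrum_delta0(w, max_quanta):
    """Low-lying eigenvalues sum_i 2 m_i w_i of Delta^(0), truncated in total quanta."""
    return sorted(sum(2 * mi * wi for mi, wi in zip(m, w))
                  for m in occupations(len(w), max_quanta))

def gaps_delta1(w, max_quanta):
    """Return (bottom of Delta^(1), bottom of Delta^(1) on the range of 1 - Pi).

    Eigenvalues of Delta^(1) are sum_i 2 m_i w_i + 2 w_j; the range of Pi
    corresponds to the vacuum m = 0.
    """
    vacuum, excited = [], []
    for m in occupations(len(w), max_quanta):
        base = sum(2 * mi * wi for mi, wi in zip(m, w))
        target = excited if any(m) else vacuum
        target.extend(base + 2 * wj for wj in w)
    return min(vacuum), min(excited)

if __name__ == "__main__":
    w = omegas(E=3.0, n=4)
    print("first gap of Delta^(0):", spectrum_delta0(w, 2)[1], " 2*omega_1:", 2 * w[0])
    print("gap and upper gap of Delta^(1):", gaps_delta1(w, 2))
```

The truncation in total quanta only affects high eigenvalues, so the two lowest thresholds are captured already with `max_quanta = 2`.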

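Coming back to the quadratic case, where $\phi$ is as in (3.32), we know that the Hermite functions provide a basis for $L^2(\mathbb{R}^\Lambda)$. It is instructive to express $\Delta_\phi^{(0)}$ in that basis. Let
$$\zeta_j^* = \sum_k [(2\phi'')^{-1/2}]_{jk}\, z_k^*. \tag{3.34}$$
The relevant orthonormal set of Hermite functions is:
$$e^{-\phi};\quad \zeta_j^* e^{-\phi}\ (j\in\Lambda);\quad \zeta_j^*\zeta_k^* e^{-\phi}\ (j,k\in\Lambda,\ j>k),\ \ \frac{(\zeta_j^*)^2}{\sqrt2}\,e^{-\phi}\ (j\in\Lambda);\ \cdots \tag{3.35}$$
For any vector space $E$, we consider the symmetric tensor product $E\otimes_s E$ as a subspace of $E\otimes E$. Therefore $\ell^2(\Lambda)^{\otimes_s n}$ is a subspace of $\ell^2(\Lambda)^{\otimes n}$. We further identify $\mathbb{R}e^{-\phi}$ with $\mathbb{R}$, the span of $\{\zeta_j^* e^{-\phi}\}$ with $\mathbb{R}^\Lambda \sim \ell^2(\Lambda)$, the span of $\{\zeta_j^*\zeta_k^* e^{-\phi}\ (j>k),\ \frac{(\zeta_j^*)^2}{\sqrt2}e^{-\phi}\}$ with $\ell^2(\Lambda)\otimes_s\ell^2(\Lambda)$, etc., where $\otimes_s$ denotes the symmetric tensor product, as $[\zeta_j^*, \zeta_k^*] = 0$ for all $j, k\in\Lambda$. Let $L^2(\mathbb{R}^\Lambda)_n$ be the span of $\zeta_{j_1}^*\cdots\zeta_{j_n}^* e^{-\phi}$. We have the decomposition (Hilbert direct sum):
$$L^2(\mathbb{R}^\Lambda) = \bigoplus_{n=0}^\infty L^2(\mathbb{R}^\Lambda)_n = \bigoplus_{n=0}^\infty \ell^2(\Lambda)^{\otimes_s n} = \mathbb{R}\oplus\bigoplus_{n=1}^\infty \ell^2(\Lambda)^{\otimes_s n}. \tag{3.36}$$
Since $\Delta_\phi^{(0)}$ leaves each $L^2(\mathbb{R}^\Lambda)_n$ invariant, we write $\Delta_{\phi,n}^{(0)}$ for the restriction. We can identify $\Delta_{\phi,n}^{(0)}$ with an element of $\bigl(\ell^2(\Lambda)^{\otimes_s n}\bigr)^*\otimes\ell^2(\Lambda)^{\otimes_s n}$. Therefore $\Delta_{\phi,n}^{(0)}$ can be seen as a tensor in $\otimes^{2n}\ell^2(\Lambda)$, (separately) symmetric in the first $n$ indices and the last $n$ indices. We have further that
$$\begin{aligned}
\Delta_{\phi,0}^{(0)} &= 0, \\
\Delta_{\phi,1}^{(0)} &= E-\Delta, \\
\Delta_{\phi,2}^{(0)} &= I\otimes(E-\Delta) + (E-\Delta)\otimes I \quad\text{restricted to } \ell^2(\Lambda)\otimes_s\ell^2(\Lambda), \\
\Delta_{\phi,3}^{(0)} &= I\otimes I\otimes(E-\Delta) + I\otimes(E-\Delta)\otimes I + (E-\Delta)\otimes I\otimes I \\
&\qquad\text{restricted to } \ell^2(\Lambda)\otimes_s\ell^2(\Lambda)\otimes_s\ell^2(\Lambda), \\
&\ \ \vdots
\end{aligned} \tag{3.37}$$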

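We summarize by writing $\Delta_\phi^{(0)}$ as a block diagonal matrix:
$$\Delta_\phi^{(0)} = 2\begin{pmatrix}
0 & 0 & 0 & \cdots \\
0 & (E-\Delta) & 0 & \cdots \\
0 & 0 & (E-\Delta)\otimes_s I + I\otimes_s(E-\Delta) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}, \tag{3.38}$$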

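where $(E-\Delta)$, $I$ are once contra, once covariant tensors. In fact, we have the identification
$$(E-\Delta)^i_j = (E-\Delta)_{ij};\qquad I^i_j = I_{ij} = \delta_{ij}; \tag{3.39}$$
$\otimes_s$ (with a slight abuse of notation) denotes the restriction of the tensors to the symmetric subspace. (3.37) and (3.38) can be proved by direct inspection of the corresponding matrix element, for example
$$\begin{aligned}
\bigl((E-\Delta)\otimes_s I + I\otimes_s(E-\Delta)\bigr)^{ij}_{k\ell}
&= (E-\Delta)_{ik}\delta_{j\ell} + (E-\Delta)_{i\ell}\delta_{kj} + \delta_{ki}(E-\Delta)_{j\ell} + \delta_{i\ell}(E-\Delta)_{jk} && (i\ne j,\ k\ne\ell) \\
&= (E-\Delta)_{ik}\delta_{j\ell} + \delta_{i\ell}(E-\Delta)_{jk} && (i\ne j,\ k=\ell) \\
&= (E-\Delta)_{ik}\delta_{i\ell} + \delta_{ki}(E-\Delta)_{i\ell} && (i=j,\ k\ne\ell) \\
&= (E-\Delta)_{ii} && (i=j=k=\ell).
\end{aligned} \tag{3.40}$$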

By using the identification (3.39), we have the same equality as in (3.38) with all the indices down. (3.38) is just the usual Fock space representation of $\Delta_\phi^{(0)}$. (See e.g., [Be].)

By the same reasoning, $\Delta_\phi^{(1)}$ operates on
$$L^2(\mathbb{R}^\Lambda;\mathbb{R}^\Lambda) \sim \Bigl(\mathbb{R}\oplus\bigoplus_{n=1}^\infty \ell^2(\Lambda)^{\otimes_s n}\Bigr)\otimes\ell^2(\Lambda), \tag{3.41}$$
where the tensor product outside of the parentheses is unsymmetrized. It has the following Fock space representation:
$$\Delta_\phi^{(1)} = 2\begin{pmatrix}
(E-\Delta) & 0 & 0 & \cdots \\
0 & (E-\Delta)\otimes I + I\otimes(E-\Delta) & 0 & \cdots \\
0 & 0 & [(E-\Delta)\otimes_s I + I\otimes_s(E-\Delta)]\otimes I + (I\otimes_s I)\otimes(E-\Delta) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}. \tag{3.42}$$

We notice also that the Witten Laplacian representation gives in this case
$$\langle G(E,\mu,\nu)\rangle_g = 2\,\bigl(e^{-\phi}dx_\mu, (\Delta_\phi^{(1)})^{-1}(e^{-\phi}dx_\nu)\bigr) = (E-\Delta)^{-1}(\mu,\nu), \tag{3.43}$$
which is undoubtedly the correct answer (cf. (1.1,1.4) with $V$ set to 0).

We end this section by commenting that for given $\mu, \nu$, the sign of $G(E;\mu,\nu)$ is independent of $V$ for $E \notin [-2d, 2d] + \gamma\,\mathrm{supp}\,dg$. Hence $\langle G(E;\mu,\nu)\rangle_g$ has statistical significance. This can be seen as follows:

(i) $E > 2d$. Since
$$G = (E-H)^{-1} = (E-\Delta-\gamma V)^{-1}, \qquad \frac{\partial}{\partial E}G(E,\mu,\nu) = -\sum_k G(\mu,k)G(k,\nu),$$
we see that it is enough to prove $G(E;\mu,\nu) > 0$ for some large enough $E$. For such $E$'s, we can simply use the resolvent series about $(E-V)^{-1}$ to conclude that $G(E;\mu,\nu) > 0$.

(ii) $E < -2d$. Let $U$ be the unitary multiplication operator $(Uf)(i) = (-1)^{|i|_1}f(i)$. Then $U^{-1}HU = -\Delta + \gamma V$. Hence
$$G(E;\mu,\nu) = (E-H)^{-1}(\mu,\nu) = (-1)^{|\mu|_1-|\nu|_1}(E-\gamma V+\Delta)^{-1}(\mu,\nu).$$
Now $(E-\gamma V+\Delta)^{-1}(\mu,\nu) < 0$ by using the same argument as before. So for given $\mu, \nu$, $G(E;\mu,\nu)$ has a sign which is independent of $V$.
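The sign statement in (i) and (ii) can be illustrated numerically on a small lattice. The sketch below is not from the paper: it assumes $d = 1$, a chain of six sites, $\Delta$ the nearest-neighbour hopping matrix with zero diagonal, and a potential with values in $[-1, 1]$ (so that $E = \pm 3$ lies outside $[-2d, 2d] + \gamma\,\mathrm{supp}\,dg$ for $\gamma = 1/2$). Exact rational arithmetic is used so that the signs are not affected by rounding; all function names are hypothetical.

```python
from fractions import Fraction
import random

def inverse(M):
    """Gauss-Jordan inverse of a square matrix, computed exactly with Fractions."""
    n = len(M)
    aug = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def green(E, gamma, V):
    """G = (E - Delta - gamma V)^(-1) on a chain of len(V) sites.

    Assumption: Delta has entries 1 between nearest neighbours, 0 elsewhere.
    """
    n = len(V)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = E - gamma * V[i]
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = Fraction(-1)
    return inverse(A)

def check_signs(n=6, trials=20, gamma=Fraction(1, 2), seed=0):
    """Check the sign of G(E; mu, nu) for E = 3 and E = -3 over random potentials."""
    rng = random.Random(seed)
    above, below = True, True
    for _ in range(trials):
        V = [Fraction(rng.randint(-10, 10), 10) for _ in range(n)]
        G_plus = green(Fraction(3), gamma, V)
        G_minus = green(Fraction(-3), gamma, V)
        for mu in range(n):
            for nu in range(n):
                parity = (-1) ** abs(mu - nu)
                above = above and G_plus[mu][nu] > 0          # case (i)
                below = below and parity * G_minus[mu][nu] < 0  # case (ii)
    return above, below

if __name__ == "__main__":
    print(check_signs())
```

Both flags being `True` means that, in every sampled realization, each matrix element had the sign predicted by the argument above.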

IV Grushin Problems and the Spectrum of $\Delta_\phi^{(1)}$

We start the study of $\Delta_\phi^{(1)}$, where $\phi$ is as defined in (2.9). Recall from (3.8,3.9) that for all $\delta > 0$, $E \in I_\delta$ ($I_\delta$ as defined in (1.11)), we have uniformly in $\Lambda$,
$$\begin{aligned}
&\phi(x) = O(1+x^2),\quad \phi'(x) = O(1+|x|),\quad x\cdot\phi'(x)\sim|x|^2 \ \text{ for } |x|\gg1, \\
&\phi''(x)\ge r_E>0 \ \text{ in } \mathcal{L}(\ell^2(\Lambda),\ell^2(\Lambda)),\quad \phi^{(n)}(x) = O_n(1)\ \text{ for } n\ge2,
\end{aligned} \tag{4.1}$$
where $r_E$ (depending on $\delta$) is independent of $\Lambda$. We recall the expression for $2\phi''$, which first appeared in (3.7) of sect. III, and where we used the point of view of

(R2 )Λ : (2φ (x))i(α) j (β) = 2Eδij δαβ − 2∆ij δαβ − 4γ 2 xi xi k (γxi · xi )δij (α) (β)


2 + 4γ 4 xi xj k (γxi · xi )k (γxj · xj )[(E − ∆ − γ diag k )−1 ij ] , (4.2) where k = log gˆ as in (2.7) and gˆ is the Laplace transform of dg; i, j ∈ Λ, xi , xj ∈ R2 , α, β ∈ {1, 2}. Unless specified otherwise, all our estimates below are uniform in Λ. (α) (β)

From (3.31), we see that we are mainly interested in the spectrum of $\Delta_\phi^{(1)}$ around 0. Hence we use the method of Grushin problems to obtain an effective operator (around 0). Similar to the quadratic case, the family of functions $e^{-\phi}$, $\{z_j^* e^{-\phi},\ j\in\bar\Lambda\}$, $\{z_j^* z_k^* e^{-\phi},\ j,k\in\bar\Lambda,\ j\ge k\}$, etc., after orthonormalization will play a special role and provide a "basis" for the Grushin problems. Our main goal is to compute the asymptotics for the averaged Green's function in (2.8) as $|\mu-\nu|\to\infty$. In order to capture the interesting effect of randomness, we need the error term to be at most $o(\gamma^2)$. (To $O(1)$, the asymptotics only reflects $E-\Delta$. The $O(\gamma)$ term simply corresponds to the shift in energy: $\gamma\langle v\rangle_g$.) In the following we study $\Delta_\phi^{(1)}$ to $O(\gamma^4)$. However, it will become clear (we hope) that the procedure that we describe below can be iterated to get the asymptotics to all orders.

IV.1 The initial Grushin problem for $\Delta_\phi^{(1)}$ and approximations to $O(\gamma^2)$

We begin our study of the spectrum of $\Delta_\phi^{(1)}$ around 0. From (3.21,3.19,4.1) we have, a priori, for all $u$,
$$(\Delta_\phi^{(1)}u, u) \ge 2r_0\|u\|^2 \tag{4.3}$$
for some $r_0 \ge r_E > 0$. Hence the second eigenvalue (spectral gap) of $\Delta_\phi^{(0)}$ is $\ge r_0 > 0$. (Recall that we are working with a complex.) Therefore if $u \perp e^{-\phi}dx_j$ for all $j\in\bar\Lambda$, then
$$(u, \Delta_\phi^{(1)}u) \ge 2(r_0+r_E)\|u\|^2. \tag{4.4}$$
As we will see by the end of the section, in order that the error in the decay rate in the asymptotics of $\langle G_\Lambda(E,\mu,\nu)\rangle$ as $|\mu-\nu|\to\infty$ is of $o(\gamma^2)$, we need to use not only $e^{-\phi}$, but also $\{z_j^*e^{-\phi},\ j\in\bar\Lambda\}$, $\{z_j^*z_k^*e^{-\phi},\ j,k\in\bar\Lambda,\ j\ge k\}$ (properly orthonormalized) as our Grushin basis. In order to get a more optimal ("correct") lower bound than in (4.4) for $u$ in the corresponding orthogonal space, we need to do a preliminary study of $\Delta_\phi^{(1)}$ around 0. To that end, we pose our first Grushin problem.

Let $e^{-\phi}dx_j$ (all $j\in\bar\Lambda$) be our initial Grushin basis. We introduce the projection operator $R_+: L^2(\mathbb{R}^{2\Lambda};\wedge^1\mathbb{R}^{2\Lambda}) \to \ell^2(\bar\Lambda)$ by
$$R_+u(j) = (u, e^{-\phi}dx_j) = (u_j, e^{-\phi}), \qquad j\in\bar\Lambda, \tag{4.5}$$
where $u = \sum_{j\in\bar\Lambda}u_j\,dx_j$. Let $R_- = R_+^t: \ell^2(\bar\Lambda)\to L^2(\mathbb{R}^{2\Lambda};\wedge^1\mathbb{R}^{2\Lambda})$:
$$R_-u_- = \sum_j e^{-\phi}u_-(j)\,dx_j. \tag{4.6}$$
Note that $R_+R_- = I$. We pose our initial Grushin problem:
$$\begin{pmatrix}\Delta_\phi^{(1)}-z & R_- \\ R_+ & 0\end{pmatrix}\begin{pmatrix}u\\u_-\end{pmatrix} = \begin{pmatrix}v\\v_+\end{pmatrix}, \tag{4.7}$$
where $u\in D(\Delta_\phi^{(1)})$, $v\in L^2$, $u_-, v_+\in\ell^2$; $\begin{pmatrix}v\\v_+\end{pmatrix}$ is the given and $\begin{pmatrix}u\\u_-\end{pmatrix}$ is the unknown. Using (4.2), we have the following proposition.

Proposition 4.1. For every $C > 0$, there exists $\gamma_0 > 0$ such that the problem (4.7) is well posed for all $z \le 2(r_0+r_E) - \frac1C$, $0\le\gamma\le\gamma_0$. Moreover
$$\|u\| + \|u_-\| \le O(1)(\|v\| + \|v_+\|). \tag{4.8}$$

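The proof of the above proposition (and indeed the study of $\Delta_\phi^{(1)}$ to $O(\gamma^2)$, see Proposition 4.3) depends on the following fundamental fact.

Lemma 4.2. Let $w\in\ell^2$, $R_-$ be as defined in (4.6). Then
$$\|(\phi''-\langle\phi''\rangle)R_-w\| \le O(\gamma^2)\|w\|. \tag{4.9}$$
Proof.
$$\|(\phi''-\langle\phi''\rangle)R_-w\|^2 = \sum_{ijk}\bigl((\phi''_{ik}-\langle\phi''_{ik}\rangle)e^{-\phi}w_k,\ (\phi''_{ij}-\langle\phi''_{ij}\rangle)e^{-\phi}w_j\bigr).$$
We now write
$$\phi''_{ik}e^{-\phi} - \langle\phi''_{ik}\rangle e^{-\phi} = \Delta_\phi^{(0)}\tilde u$$
for some $\tilde u$. Proceeding exactly as in (3.17), (3.22) in sect. III, we obtain
$$\begin{aligned}
\text{RHS} &= \sum_{ijk}\bigl(e^{-\phi}d\phi''_{ik},\ (\Delta_\phi^{(1)})^{-1}e^{-\phi}d\phi''_{ij}\bigr)\,w_jw_k \\
&= O(1)\sum_{j,k}|w_j||w_k|\sum_i\|e^{-\phi}d\phi''_{ik}\|\,\|e^{-\phi}d\phi''_{ij}\|,
\end{aligned} \tag{4.10}$$
where we used (4.3) to bound $(\Delta_\phi^{(1)})^{-1}$.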

To estimate the RHS of (4.10), we need to estimate $\langle(\phi''')^2\rangle$. For this purpose, it is convenient to revert back to the point of view of $(\mathbb{R}^2)^\Lambda$ and use supersymmetry. (We will have more occasions to do such computations in sect. VII, where we need more precise results.) Hence elements of $\bar\Lambda$ will be denoted by $i^{(\alpha)}$, $i\in\Lambda$, $\alpha\in\{1,2\}$, and $x_i\in\mathbb{R}^2$. (Those readers who are not particularly interested in the details of this estimate could go directly to the end result (4.21) and continue with the reading of the proof.) From (4.2), there are 7 terms in $2\phi''$, which we denote by $a_n$, $n = 1,\dots,7$. So
$$(\phi''')^2 = \Bigl(\sum_{n=1}^7 a_n'\Bigr)^2 = \sum_{n=3}^7 (a_n')^2 + 2\sum_{n>m>2}a_n'a_m' \le 5\sum_{n=3}^7 (a_n')^2, \tag{4.11}$$

as a1 , a2 are constants. The estimates on (φ )2 , and indeed (φ(m) )2 for all m ≥ 3 rely on the following estimate: 2q −2φ = Op,q (1), p, q = 0, 1 · · · (4.12) x2p i xj e where Op,q (1) only depends on p and q and is uniform in Λ. Clearly (4.12) holds when p = q = 0. Assume p, q ≥ 1. (The argument works the same way if either p or q = 0.) (4.12) can be easily seen by using Proposition 2.2 as follows. Since (4.13) Xi2p Xj2q e−2Φ = 0, def

where Xi2 → = Xi · Xi = xi · xi + ηi ξi as in (2.22) and Φ is as defined in (2.25), LHS = (xi · xi + ηi ξi )p (xj · xj + ηj ξj )q e−2Φ = 0. Therefore

2q −2Φ x2p i xj e

= − 2q

2(q−1)

− 2p − 4pq

x2p i xj

ηj ξj e−2Φ

2(p−1) 2q xj ηi ξi e−2Φ

xi

2(p−1) 2(q−1) xj ηi ξi ηj ξj e−2Φ .

xi

(4.14)

Let M = E −∆−γ diag k . Integrating over the antisymmetric variables ξ, η using (2.14)-(2.15), we have dη dξ = e−2φ e−2Φ ∈Λ

ηi ξi e−2Φ

dη dξ = Mii−1 e−2φ

∈Λ

−2Φ

ηj ξj e

(4.15) −1 −2φ dη dξ = Mjj e

∈Λ

ηi ξi ηj ξj e−2φ

−2φ dη dξ = Mii−1 (M (ii) )−1 , jj e

∈Λ

where M (ii) is the matrix obtained from M after deleting the ith row and the ith column. Using (4.15) in (4.14), we are left with integrals over R2Λ only and (4.15) becomes 2p 2q −2φ −1 2p 2(q−1) −2φ 2 xi xj e d x = − 2q Mjj xi xj e d2 x ∈Λ

− 2p − 4pq

∈Λ 2(p−1) 2q −2φ Mii−1 xi xj e

d2 x

∈Λ

Mii−1 (M (ii) )−1 jj xi

2(p−1) 2(q−1) −2φ xj e

d2 x .

∈Λ

(4.16)

Since Mii−1 , (M (ii) )−1 jj = O(1), we obtain |

2q −2φ x2p i xj e

2(q−1) −2φ

x2p i xj

d x | ≤ O(1)[ 2

∈Λ

+ +

e

d2 x

∈Λ 2(p−1) 2q −2φ xi xj e

∈Λ 2(p−1) 2(q−1) −2φ xj e

xi

d2 x

(4.17)

d2 x ].

∈Λ

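We note that the integrand in the RHS of (4.17) is of degree $\le 2(p+q-1)$ in $x$, while that of the LHS is of degree $2(p+q)$. Clearly the above procedure (4.14)-(4.17) can be repeated on the integrals in the RHS of (4.17) until all the integrands become of degree 0, i.e., constants. We therefore obtain (4.12). (It is also clear that we have similar estimates when the integrand is of the form $\prod_{i=1}^n x_i^{2p_i}$, $n\in\mathbb{N}$, although we do not need it here for $n > 2$.)

The other ingredient that we need to estimate $\langle(\phi''')^2\rangle$ is the following equality obtained from a straightforward computation:
$$\frac{\partial}{\partial x_\ell^{(\alpha)}}(E-\Delta-\gamma\,\mathrm{diag}\,k')^{-1}_{ij} = 2(E-\Delta-\gamma\,\mathrm{diag}\,k')^{-1}_{i\ell}\,\gamma^2x_\ell^{(\alpha)}k''(\gamma x_\ell\cdot x_\ell)\,(E-\Delta-\gamma\,\mathrm{diag}\,k')^{-1}_{\ell j}, \qquad (\alpha = 1,2). \tag{4.18}$$
Using (4.2), Lemma 3.1, and the bound $\|(E-\Delta-\gamma\,\mathrm{diag}\,k')^{-1}\|_{\ell^2} = O(1)$, we see that pointwise (before integration)
$$|a_n'| \le O(\gamma^2)$$
for $n = 5, 6, 7$. So we only need to estimate $\langle(a_n')^2\rangle$ for $n = 3, 4$. Since
$$\frac{\partial a_3}{\partial x_i^{(\alpha)}} = -4\gamma^2x_i^{(\alpha)}k''(\gamma x_i\cdot x_i),$$
$$\Bigl(\frac{\partial a_3}{\partial x_i^{(\alpha)}}\Bigr)^2 = 16\gamma^4(x_i^{(\alpha)})^2\bigl(k''(\gamma x_i\cdot x_i)\bigr)^2 \le 16\gamma^4(x_i\cdot x_i)\bigl(k''(\gamma x_i\cdot x_i)\bigr)^2.$$
So
$$\Bigl\langle\Bigl(\frac{\partial a_3}{\partial x_i^{(\alpha)}}\Bigr)^2\Bigr\rangle \le O(\gamma^4),$$
by using Lemma 3.1 and the estimate in (4.12). Similarly we obtain
$$\Bigl\langle\Bigl(\frac{\partial a_4}{\partial x_i^{(\alpha)}}\Bigr)^2\Bigr\rangle \le O(\gamma^4).$$
So
$$\langle(\phi''')^2\rangle = O(\gamma^4). \tag{4.19}$$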

Let $\pi: \bar\Lambda = \Lambda\times\{1,2\}\to\Lambda$. Since $(E-\Delta-\gamma\,\mathrm{diag}\,k')^{-1}$ and also its various derivatives are pointwise uniformly bounded operators in appropriate exponentially weighted spaces for $E\in I_\delta$ ($I_\delta$ defined in (1.13)), by using (4.18) and (3.5), it is easy to see that (4.19) generalizes to
$$\langle(\phi'''_{ijk})^2\rangle = O(\gamma^4)\,e^{-\tilde\alpha\, d(i',j',k')}, \tag{4.20}$$
for some $\tilde\alpha > 0$, where $i' = \pi i\in\Lambda$, $j' = \pi j\in\Lambda$, $k' = \pi k\in\Lambda$, and $d(i',j',k')$ is the length of the shortest path connecting $i'$, $j'$ and $k'$. We obtain
$$\|e^{-\phi}d\phi''_{ik}\| = O(\gamma^2)\,e^{-\alpha|i'-k'|} \tag{4.21}$$
for some $\alpha > 0$ uniform in $\Lambda$, where $i' = \pi i\in\Lambda$, $k' = \pi k\in\Lambda$. Note that by using supersymmetry we gained $O(\gamma^{1/2})$ in (4.21); the naive estimate would have been $O(\gamma^{3/2})$. Using (4.21) in (4.10), we obtain
$$\|(\phi''-\langle\phi''\rangle)R_-w\|^2 = O(\gamma^4)\sum_{j,k}|w_j||w_k|\,e^{-\alpha|j-k|} = O(\gamma^4)\|w\|^2.$$
Taking the square root on both sides, we obtain the lemma.
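The last equality in the proof relies on the fact that an exponentially decaying kernel defines a bounded quadratic form on $\ell^2$, uniformly in $\Lambda$. A sketch of this standard step (Schur's test) is given below; the constant $C_\alpha$ is a name introduced here for illustration only, and $|j-k|$ is understood as the lattice distance between the projections $\pi j$ and $\pi k$.

```latex
\sum_{j,k} |w_j|\,|w_k|\, e^{-\alpha|j-k|}
 \;\le\; \tfrac12 \sum_{j,k} \bigl(|w_j|^2 + |w_k|^2\bigr)\, e^{-\alpha|j-k|}
 \;\le\; C_\alpha \,\|w\|^2,
\qquad
C_\alpha := \sup_{j} \sum_{k} e^{-\alpha|j-k|}
 \;\le\; 2 \sum_{m\in\mathbb{Z}^d} e^{-\alpha|m|} < \infty .
```

The bound on $C_\alpha$ does not involve $\Lambda$, which is what makes the estimate uniform in the volume.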

Remark. Lemma 4.2 should be seen as a statement of the "small" variation of $\phi''$, and indeed of any function $f$ whose derivative $f'$ is "small". We will see more general versions of (4.9) later in Lemma 4.8.

Proof of Proposition 4.1. Let
$$\mathcal{P}(z) := \begin{pmatrix}\Delta_\phi^{(1)}-z & R_-\\ R_+ & 0\end{pmatrix}. \tag{4.22}$$
It is a self-adjoint operator with domain $D(\Delta_\phi^{(1)})\times\ell^2$. Hence it is enough to establish (4.8). We write out (4.7) separately:
$$\begin{aligned}(\Delta_\phi^{(1)}-z)u + R_-u_- &= v,\\ R_+u &= v_+.\end{aligned} \tag{4.23}$$
We first look at the special case $v_+ = 0$. (Later we reduce the general case to the special case.) Taking the inner product with $u$ in the first equation in (4.23), we obtain
$$\bigl(u, (\Delta_\phi^{(1)}-z)u\bigr) + (u, R_-u_-) = (u, v). \tag{4.24}$$
Using the second equation in (4.23) with $v_+ = 0$ and the fact that $R_-^* = R_+$, we see that $(u, R_-u_-) = 0$ and we have
$$\bigl(u, (\Delta_\phi^{(1)}-z)u\bigr) = (u, v).$$
So
$$\frac{1}{O(1)}\|u\|^2 \le \bigl(u, (\Delta_\phi^{(1)}-z)u\bigr) = (u, v) \le \|u\|\,\|v\|,$$
where we used (4.4) and the assumption on $z$ in the Proposition. Hence
$$\|u\| \le O(1)\|v\|. \tag{4.25}$$

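We now take the inner product with $R_-u_-$ in the first equation in (4.23):
$$\bigl(R_-u_-, (\Delta_\phi^{(1)}-z)u\bigr) + (R_-u_-, R_-u_-) = (R_-u_-, v). \tag{4.26}$$
Since $(\Delta_\phi^{(0)}\otimes I)R_-u_- = 0$ and $\|(\phi''-\langle\phi''\rangle)R_-u_-\| = O(\gamma^2)\|u_-\|$ by Lemma 4.2, we have
$$\bigl(R_-u_-, (\Delta_\phi^{(1)}-z)u\bigr) = (2\langle\phi''\rangle R_-u_-, u) + O(\gamma^2)\|u_-\|\,\|u\| = O(\gamma^2)\|u_-\|\,\|u\|, \tag{4.27}$$
where we used the fact that $R_+\langle\phi''\rangle u = \langle\phi''\rangle R_+u = 0$. Hence (4.26) gives
$$\|u_-\|^2 - O(\gamma^2)\|u_-\|\,\|u\| \le \|u_-\|\,\|v\|,$$
where we used $\|R_-u_-\|^2 = \|u_-\|^2$. So
$$\|u_-\| \le \|v\| + O(\gamma^2)\|u\| = O(1)\|v\|. \tag{4.28}$$
We now consider (4.23) in general. Since $R_+R_- = I$, we obtain
$$\begin{aligned}(\Delta_\phi^{(1)}-z)(u-R_-v_+) + R_-u_- &= v + (z-2\phi'')R_-v_+,\\ R_+(u-R_-v_+) &= 0,\end{aligned} \tag{4.29}$$
which is of the type (4.23) with $v_+ = 0$. Applying the estimates (4.25) and (4.28) to (4.29), we have
$$\|u-R_-v_+\| + \|u_-\| = O(1)\bigl(\|v\| + \|(z-2\phi'')R_-v_+\|\bigr) = O(1)(\|v\| + \|v_+\|).$$
Since
$$\|u-R_-v_+\| \ge \|u\| - \|R_-v_+\| \ge \|u\| - \|v_+\|,$$
we obtain
$$\|u\| + \|u_-\| = O(1)(\|v\| + \|v_+\|). \tag{4.30}$$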
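The above proposition shows that the self-adjoint Grushin operator
$$\mathcal{P} = \begin{pmatrix}\Delta_\phi^{(1)}-z & R_-\\ R_+ & 0\end{pmatrix}$$
introduced in (4.22) is invertible for such $z$, $\gamma$. Let $\mathcal{E}$ denote the inverse of $\mathcal{P}$:
$$\mathcal{E} := \mathcal{P}^{-1} := \begin{pmatrix}E & E_+\\ E_- & E_{-+}\end{pmatrix}, \tag{4.31}$$
where $E_- = E_+^*$, $E\in\mathcal{L}\bigl(L^2(\mathbb{R}^{2\Lambda},\wedge^1\mathbb{R}^{2\Lambda}); L^2(\mathbb{R}^{2\Lambda},\wedge^1\mathbb{R}^{2\Lambda})\bigr)$. We then have the following identity:
$$(\Delta_\phi^{(1)}-z)^{-1} = E - E_+E_{-+}^{-1}E_-, \tag{4.32}$$
where $E$, $E_+$, $E_-$ and $E_{-+}$ are analytic in $z$. Therefore (4.32) implies that
$$z\in\sigma(\Delta_\phi^{(1)}) \iff 0\in\sigma(E_{-+}).$$
$E_{-+}$ is our effective operator. Using Lemma 4.2, we have moreover the following approximations for $E_{-+}$, $E_+$, $E_-$:

Proposition 4.3. For every $C > 0$, there exists $\gamma_0 > 0$ such that for all $z \le 2(r_0+r_E) - \frac1C$, $|z| = O(1)$, and all $0\le\gamma\le\gamma_0$,
$$\begin{aligned}
\|E_{-+} - (z-2\langle\phi''\rangle)\|_{\mathcal{L}(\ell^2,\ell^2)} &= O(\gamma^2),\\
\|E_+ - R_-\|_{\mathcal{L}(\ell^2,L^2)} &= O(\gamma^2),\\
\|E_- - R_+\|_{\mathcal{L}(L^2,\ell^2)} &= O(\gamma^2).
\end{aligned} \tag{4.33}$$
Proof. From (4.31),
$$\begin{aligned}u &= Ev + E_+v_+,\\ u_- &= E_-v + E_{-+}v_+.\end{aligned} \tag{4.34}$$
Take $v = 0$ and let
$$E_+^0 = R_-,\qquad E_-^0 = R_+,\qquad E_{-+}^0 = z - 2\langle\phi''\rangle. \tag{4.35}$$
Take as approximate solutions:
$$\begin{aligned}u^0 &= E_+^0v_+ = R_-v_+,\\ u_-^0 &= E_{-+}^0v_+ = (z-2\langle\phi''\rangle)v_+.\end{aligned} \tag{4.36}$$
We then have
$$\begin{aligned}(\Delta_\phi^{(1)}-z)(u^0-u) + R_-(u_-^0-u_-) &= (\Delta_\phi^{(1)}-z)R_-v_+ + R_-(z-2\langle\phi''\rangle)v_+\\ &= 2(\phi''-\langle\phi''\rangle)R_-v_+,\end{aligned} \tag{4.37}$$
where we used the fact that $v = 0$, and
$$R_+(u^0-u) = R_+(R_-v_+-u) = 0. \tag{4.38}$$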

Applying Proposition 4.1 to the system of equations comprised of (4.37) and (4.38), we have
$$\|u^0-u\| + \|u_-^0-u_-\| = O(1)\|(\phi''-\langle\phi''\rangle)R_-v_+\| = O(\gamma^2)\|v_+\|, \tag{4.39}$$
where we used Lemma 4.2 to obtain the last inequality. Combining (4.39) with (4.34) and (4.36), we then obtain the proposition.
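The Schur complement identity (4.32) behind the effective operator $E_{-+}$ is purely algebraic, so it can be checked on a finite-dimensional stand-in. In the sketch below a symmetric $3\times3$ matrix `A` plays the role of $\Delta_\phi^{(1)}$ and a single unit vector `r` plays the role of $R_-$; both, and all function names, are hypothetical choices made for illustration. Exact rational arithmetic is used so that the identity can be tested with equality.

```python
from fractions import Fraction

def inverse(M):
    """Gauss-Jordan inverse of a square matrix, computed exactly with Fractions."""
    n = len(M)
    aug = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def grushin_blocks(A, r, z):
    """Invert P(z) = [[A - z, r], [r^T, 0]] and return (E, E_plus, E_minus, E_mp)."""
    n = len(A)
    P = [[A[i][j] - (z if i == j else 0) for j in range(n)] + [r[i]] for i in range(n)]
    P.append(list(r) + [0])
    Pinv = inverse(P)
    E = [row[:n] for row in Pinv[:n]]
    E_plus = [row[n] for row in Pinv[:n]]   # column block
    E_minus = Pinv[n][:n]                    # row block
    E_mp = Pinv[n][n]                        # effective (here scalar) operator
    return E, E_plus, E_minus, E_mp

def resolvent_via_schur(A, r, z):
    """(A - z)^(-1) recovered as E - E_plus * E_mp^(-1) * E_minus."""
    E, E_plus, E_minus, E_mp = grushin_blocks(A, r, z)
    n = len(A)
    return [[E[i][j] - E_plus[i] * E_minus[j] / E_mp for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
    r = [1, 0, 0]
    z = Fraction(1, 2)
    direct = inverse([[A[i][j] - (z if i == j else 0) for j in range(3)] for i in range(3)])
    print(resolvent_via_schur(A, r, z) == direct)
```

In this toy setting $E_{-+}$ is a scalar, and the resolvent of `A` at `z` exists exactly when that scalar is non-zero, mirroring the statement that $z\in\sigma(\Delta_\phi^{(1)})$ iff $0\in\sigma(E_{-+})$.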

IV.2 A first Grushin problem for $\Delta_\phi^{(0)}$

As mentioned earlier, in order to obtain an accuracy which is of $O(\gamma^2)$, we need to also use (aside from $e^{-\phi}dx_j$) $z_k^*e^{-\phi}dx_j$, $z_k^*z_\ell^*e^{-\phi}dx_j$ ($k\ge\ell$) (after orthonormalization) as our Grushin basis. We do this in two steps. We first add $z_k^*e^{-\phi}dx_j$ to our basis. In order to obtain the correct analogue of the lower bound in (4.4) for $u\perp e^{-\phi}dx_j$, $z_k^*e^{-\phi}dx_j$, for all $j,k\in\bar\Lambda$, we need to study more in depth the spectrum of $\Delta_\phi^{(0)}$. As before, we shall exploit in an essential way the relation between the spectra of $\Delta_\phi^{(0)}$ and $\Delta_\phi^{(1)}$. We start by posing a variant of the Grushin problem in (4.7):
$$\begin{pmatrix}\Delta_\phi^{(1)}-z & d_\phi d_\phi^*\langle2\phi''\rangle^{-1}R_-\\ R_+ & 0\end{pmatrix}\begin{pmatrix}u\\u_-\end{pmatrix} = \begin{pmatrix}v\\v_+\end{pmatrix}, \tag{4.40}$$
where $\langle2\phi''\rangle^{-1}$ means the inverse of the matrix $\langle2\phi''\rangle$ (recall also that we have identified 1-forms with $\mathbb{R}^{\bar\Lambda}$ valued functions); $u\in D(\Delta_\phi^{(1)})$, $v\in L^2$, $u_-, v_+\in\ell^2$; $R_+$, $R_-$ are as defined before in (4.5,4.6). Let $\mathcal{P}$ be the Grushin operator defined in the LHS of (4.40). The interest of the above Grushin problem is that once its well-posedness is established, it implies the well-posedness of a related Grushin problem for $\Delta_\phi^{(0)}$ (Proposition 4.5), which in turn gives us the desired spectral information on $\Delta_\phi^{(0)}$.

Proposition 4.4. For every $C > 0$, there exists $\gamma_0 > 0$ such that the problem (4.40) is well posed for all $z\le2(r_0+r_E)-\frac1C$, $0\le\gamma\le\gamma_0$. Moreover
$$\|u\| + \|u_-\| \le O(1)(\|v\| + \|v_+\|). \tag{4.41}$$
Proof. We prove the proposition by reference to the well-posedness of the Grushin problem in Proposition 4.1. (4.40) can be written as
$$\begin{aligned}(\Delta_\phi^{(1)}-z)u + R_-u_- &= v + (1-d_\phi d_\phi^*\langle2\phi''\rangle^{-1})R_-u_-,\\ R_+u &= v_+.\end{aligned} \tag{4.42}$$

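From Proposition 4.1, we have
$$\|u\| + \|u_-\| \le O(1)\bigl(\|v\| + \|(1-d_\phi d_\phi^*\langle2\phi''\rangle^{-1})R_-u_-\| + \|v_+\|\bigr). \tag{4.43}$$
Since
$$d_\phi d_\phi^*\langle2\phi''\rangle^{-1}R_-u_- = d_\phi^*d_\phi\langle2\phi''\rangle^{-1}R_-u_- + [d_\phi, d_\phi^*]\langle2\phi''\rangle^{-1}R_-u_- = 2\phi''\circ\langle2\phi''\rangle^{-1}R_-u_-, \tag{4.44}$$
the second term in the RHS of (4.43) can be written as
$$\begin{aligned}\|(1-d_\phi d_\phi^*\langle2\phi''\rangle^{-1})R_-u_-\| &= \|(\langle2\phi''\rangle-2\phi'')\langle2\phi''\rangle^{-1}R_-u_-\|\\ &\le O(\gamma^2)\|\langle2\phi''\rangle^{-1}u_-\|\\ &\le O(\gamma^2)\|u_-\|,\end{aligned} \tag{4.45}$$
where we used Lemma 4.2 to reach the first inequality and (4.1) to reach the last inequality. Using (4.45) in (4.43), we have
$$\|u\| + \|u_-\| \le O(1)(\|v\| + \|v_+\|) + O(\gamma^2)\|u_-\|.$$
So
$$\|u\| + \|u_-\| \le O(1)(\|v\| + \|v_+\|)$$
for $z\le2(r_0+r_E)-\frac1C$ and small enough $\gamma$. This proves uniqueness, i.e.,
$$\mathcal{P}U = 0 \implies U = 0.$$
By a similar argument we prove
$$\mathcal{P}^*U = 0 \implies U = 0. \tag{4.46}$$
Using (4.5), $\mathrm{Ran}(R_+)\sim\ell^2(\bar\Lambda)\sim\mathbb{C}^{2\Lambda}$, so
$$\mathrm{Ran}(\Delta_\phi^{(1)})\oplus\{0\}\times\ell^2(\bar\Lambda) \subseteq \mathrm{Ran}(\mathcal{P}) \subseteq \bigl(\mathrm{Ran}(\Delta_\phi^{(1)})\oplus\mathrm{Ran}(d_\phi d_\phi^*\langle2\phi''\rangle^{-1})R_-\bigr)\times\ell^2(\bar\Lambda) \subset L^2(\mathbb{R}^{2\Lambda},\mathbb{C}^{2\Lambda})\times\ell^2(\bar\Lambda).$$
From (4.44), we see that $\mathrm{Ran}(d_\phi d_\phi^*\langle2\phi''\rangle^{-1})R_-$ is of finite dimension. Using (4.1), we see that the hypotheses of Theorem 1.7 of [Jo] are satisfied, so $\Delta_\phi^{(1)}$ has closed range:
$$\mathrm{Ran}(\Delta_\phi^{(1)}) = \overline{\mathrm{Ran}(\Delta_\phi^{(1)})}.$$
Combining the above arguments, we obtain that
$$\mathrm{Ran}(\mathcal{P}) = \overline{\mathrm{Ran}(\mathcal{P})}\quad\text{in } L^2\times\ell^2.$$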

From Theorem 3.1 of [Jo], we then obtain that
$$(\mathrm{Ran}\,\mathcal{P})^\perp = \mathrm{Ker}(\mathcal{P}^*) = \{0\},$$
where we also used (4.46). So $\mathcal{P}$ is onto. Therefore we have existence of solutions to (4.40).

We now come to the main reason why we posed (4.40) in the first place, which is to use (4.40) to establish the well-posedness of the Grushin problem for $\Delta_\phi^{(0)}$ in the following Proposition 4.5. Let
$$e_0 = e^{-\phi},\qquad \begin{pmatrix}e_1\\\vdots\\e_{2|\Lambda|}\end{pmatrix} = \langle2\phi''\rangle^{-1/2}\begin{pmatrix}z_1^*e^{-\phi}\\\vdots\\z_{2|\Lambda|}^*e^{-\phi}\end{pmatrix} \tag{4.47}$$
be the new Grushin basis. Note that $(e_j, e_k) = \delta_{jk}$, $j, k\in\{0\}\cup\bar\Lambda$. Then we have

Proposition 4.5. For every $C > 0$, there exists $\gamma_0 > 0$ such that the following Grushin problem for $\Delta_\phi^{(0)}$:
$$\begin{cases}(\Delta_\phi^{(0)}-z)u + \displaystyle\sum_{j=0}^{2|\Lambda|}u_-(j)e_j = v,\\[4pt] (u, e_j) = v_+(j),\quad j\in\{0\}\cup\bar\Lambda,\end{cases} \tag{4.48}$$
where $u\in D(\Delta_\phi^{(0)})$, $v\in L^2(\mathbb{R}^{2\Lambda})$, $u_-, v_+\in\mathbb{R}\oplus\ell^2(\bar\Lambda)$, is well posed for all $z\le2(r_0+r_E)-\frac1C$, $0\le\gamma\le\gamma_0$. Moreover
$$\|u\| + \|u_-\| \le O(1)(\|v\| + \|v_+\|). \tag{4.49}$$
To prove Proposition 4.5, we specialize to the case where $v$ is a closed form: $d_\phi v = 0$, $v\in\mathcal{S}$ in (4.40). We first need to prove:

Lemma 4.6. For $C$, $\gamma_0$, $z$ such that Proposition 4.4 holds, if $v\in\mathcal{S}$ is a closed form, i.e., $d_\phi v = 0$, then so is $u$: $d_\phi u = 0$.

Proof. Applying $d_\phi$ to the first equation in (4.40), we obtain
$$(\Delta_\phi^{(2)}-z)d_\phi u = 0. \tag{4.50}$$
Since $v\in\mathcal{S}$, from (4.40), using an argument similar to that used in sect. III leading to (3.30), we see that $d_\phi u\in\mathcal{S}\subset D(\Delta_\phi^{(2)})$. Since $(\Delta_\phi^{(2)}-z) > c > 0$ for $z$ considered in Proposition 4.4, by using (4.1,3.22), we conclude that $d_\phi u = 0$.

Proof of Proposition 4.5. Let $v\in\mathcal{S}$ be a closed 1-form in (4.40). Since $d_\phi := e^{-\phi}de^{\phi}$ and we work in $\mathbb{R}^{2\Lambda}$, closed forms are exact. Therefore if $d_\phi v = 0$, then $v = d_\phi\tilde v$ for some $\tilde v$. To show that $\tilde v\in\mathcal{S}$, we proceed as follows. Write
$$v = \Delta_\phi^{(1)}w \tag{4.51}$$
for some $w\in\mathcal{S}$. Applying $d_\phi$ to (4.51), we obtain
$$d_\phi v = \Delta_\phi^{(2)}d_\phi w = 0 \tag{4.52}$$
as $v$ is assumed to be closed. Since $\Delta_\phi^{(2)} > 0$, (4.52) implies that $d_\phi w = 0$. Using this in (4.51), we have that $v = d_\phi(d_\phi^*w)$. So $\tilde v = d_\phi^*w\in\mathcal{S}$. Lemma 4.6 then implies that $u = d_\phi\tilde u$ for some $\tilde u\in\mathcal{S}$. Note there is no uniqueness for $\tilde u$: if $\tilde u$ is a solution, then so is $\tilde u + Ce^{-\phi}$, for all $C$. Therefore in order to specify $\tilde u$ uniquely, we need to specify $C$.

Let $\tilde v$ be a given function in $\mathcal{S}(\mathbb{R}^{2\Lambda})$, $v_+$ a given vector in $\mathbb{R}\oplus\ell^2(\bar\Lambda)$. We consider the Grushin problem as in (4.40) with $v = d_\phi\tilde v$. Let $w$ be a 1-form in $\mathcal{S}$. Taking the scalar product of the first equation in (4.40) with $w$, we obtain
$$\bigl(w, (\Delta_\phi^{(1)}-z)d_\phi\tilde u\bigr) + \bigl(w, d_\phi d_\phi^*\langle2\phi''\rangle^{-1}R_-u_-\bigr) = (w, d_\phi\tilde v). \tag{4.53}$$
Note that there is an ambiguity when we write the solution $u$ as $d_\phi\tilde u$, which we clarify by imposing the condition
$$R_+\tilde u = (e^{-\phi}, \tilde u) = v_+(0). \tag{4.54}$$
We have from (4.53)
$$(\Delta_\phi^{(1)}-z)d_\phi\tilde u + d_\phi d_\phi^*\langle2\phi''\rangle^{-1}R_-u_- = d_\phi\tilde v. \tag{4.55}$$
Since the kernel of $d_\phi$, $\mathrm{Ker}\,d_\phi$, is $\mathbb{R}e^{-\phi}$, we obtain as an immediate consequence the following equation
$$(\Delta_\phi^{(0)}-z)\tilde u + d_\phi^*\langle2\phi''\rangle^{-1}R_-u_- + R_-u_-(0) = \tilde v. \tag{4.56}$$
The second equation in (4.40) gives
$$R_+d_\phi\tilde u = v_+. \tag{4.57}$$
The well-posedness of the Grushin problem (4.40) then implies the well-posedness of the Grushin problem composed of (4.56), (4.57) and (4.54). Let $e_0$ denote the function $e^{-\phi}$; then
$$(d_\phi^*\langle2\phi''\rangle^{-1})R_-u_- + R_-u_-(0) = \sum_{j,k}\langle2\phi''\rangle^{-1}_{jk}z_k^*e_0\,u_-(j) + e_0u_-(0). \tag{4.58}$$

Define
$$\begin{pmatrix}e_1\\\vdots\\e_{2|\Lambda|}\end{pmatrix} = \langle2\phi''\rangle^{-1/2}\begin{pmatrix}z_1^*e_0\\\vdots\\z_{2|\Lambda|}^*e_0\end{pmatrix}.$$
From (4.56), (4.57), (4.58), the fact that the Grushin operator defined by (4.48) has closed range (by using the same arguments leading to the closed-rangeness of $\mathcal{P}$ defined in (4.40)) and that $\mathcal{S}$ is dense in $L^2$, we then obtain the well-posedness of the Grushin problem (4.48) in $L^2\times(\mathbb{R}\oplus\ell^2)$, with the $v_+(0)$ there equal to the $v_+(0)$ in (4.54) and the $v_+(j)$ ($j\ne0$) there equal to $\sum_k\langle2\phi''\rangle^{-1/2}_{jk}v_+(k)$ of the $v_+(k)$ defined in (4.40).

Remark. Although we did not state it that way, the earlier remark that the second eigenvalue of $\Delta_\phi^{(0)}$ is $\ge2r_0$ implies the well-posedness of
$$\begin{aligned}(\Delta_\phi^{(0)}-z)u + u_-(0)e_0 &= v,\\ (u, e_0) &= v_+(0)\end{aligned} \tag{4.59}$$
for $z\le2r_0-\frac{1}{O(1)}$ and sufficiently small $\gamma$. This leads to the well-posedness of the Grushin problem for $\Delta_\phi^{(1)}$ in (4.7) for $z$ in the larger domain $z\le2(r_0+r_E)-\frac{1}{O(1)}$ (Proposition 4.1), which in turn implies the well-posedness of the Grushin problem for $\Delta_\phi^{(0)}$ in (4.48) in the larger domain $z\le2(r_0+r_E)-\frac{1}{O(1)}$. Clearly, we have started an iteration process.

IV.3 The second Grushin problem for $\Delta_\phi^{(1)}$

Bearing in mind the preceding remark, the proposition below should come as no surprise.

Let $e_\alpha dx_j$ (all $\alpha\in\{0\}\cup\bar\Lambda$, $j\in\bar\Lambda$) be our Grushin basis. (From now on, to avoid confusion we use Latin letters only to denote an element of $\bar\Lambda$. For elements of other larger sets, we generally use Greek letters.) Define $R_+: L^2(\mathbb{R}^{2\Lambda})\to\bigl(\mathbb{R}\oplus\ell^2(\bar\Lambda)\bigr)\otimes\ell^2(\bar\Lambda)$ by
$$(R_+u)(j,\alpha) = (u, e_\alpha dx_j) = (u_j, e_\alpha),\qquad j\in\bar\Lambda,\ \alpha\in\{0\}\cup\bar\Lambda, \tag{4.60}$$
where $u = \sum_{j\in\bar\Lambda}u_j\,dx_j$. Let $R_- = R_+^*$ as usual. Note that $\bigl(\mathbb{R}\oplus\ell^2(\bar\Lambda)\bigr)\otimes\ell^2(\bar\Lambda)$ is of dimension $2|\Lambda|(2|\Lambda|+1)$. We have

Proposition 4.7. For every $C > 0$, there exists $\gamma_0 > 0$ such that the Grushin problem
$$\begin{pmatrix}\Delta_\phi^{(1)}-z & R_-\\ R_+ & 0\end{pmatrix}\begin{pmatrix}u\\u_-\end{pmatrix} = \begin{pmatrix}v\\v_+\end{pmatrix}, \tag{4.61}$$
where $u\in D(\Delta_\phi^{(1)})$, $v\in L^2(\mathbb{R}^{2\Lambda};\wedge^1\mathbb{R}^{2\Lambda})$, $u_-, v_+\in\bigl(\mathbb{R}\oplus\ell^2(\bar\Lambda)\bigr)\otimes\ell^2(\bar\Lambda)$, is well posed for all $z\le2(r_0+2r_E)-\frac1C$, $0\le\gamma\le\gamma_0$. Moreover
$$\|u\| + \|u_-\| \le O(1)(\|v\| + \|v_+\|). \tag{4.62}$$
In order to prove the above Proposition, we need the following lemma on small variations (it is instructive to compare with Lemma 4.2):

φ ∈L

¯ ⊗ 2 (Λ); ¯ R ⊕ 2 (Λ) ¯ ⊗ 2 (Λ) ¯ R ⊕ 2 (Λ)

to be

¯ α, β ∈ {0} ∪ Λ. ¯ φβj;αk = (eβ , φ jk eα ), j, k ∈ Λ, ¯ R ⊕ 2 (Λ) ¯ to be Define M ∈ L R ⊕ 2 (Λ); ¯ Mαβ = (eα , zj∗ zj eβ ), α, β ∈ {0} ∪ Λ

(4.63)

(4.64)

¯ j∈Λ

(Note that Mαβ = 0 if α = 0 or β = 0 or both.) We have

φ R− u− − R− φ u− = O(γ 2 )u−

∗ ∗ where R− = R+ , R+ is as defined in (4.50) and zj∗ zj )eα − Mαβ eβ = O(γ 2 )

(4.65)

(4.66)

¯ β∈{0,Λ}

Proof. Define φ˜ ∈ L

¯ ⊗ 2 (Λ); ¯ R ⊕ 2 (Λ) ¯ ⊗ 2 (Λ) ¯ to be R ⊕ 2 (Λ) φ˜ βj;αk = φ jk δβα .

(4.67)

R− (φ − φ˜ )u− = (φ − φ˜ )u− = O(γ 2 )u− .

(4.68)

We first show that

Assume α = 0, β = 0,

φβj;αk = (eβ , φ jk eα ) −1/2 −1/2 ∗ = 2φ βω 2φ ακ (zω e0 , φ jk zκ∗ e0 ) ω,κ

−1/2 −1/2 = 2φ βω 2φ ακ (e0 , zω φ jk zκ∗ e0 ) ω,κ

1 (4) −1/2 −1/2 φjk φωκ + φjkωκ . = φ βω φ ακ 2 ω,κ

(4.69)

So (φ − φ˜ )βj;αk =

−1/2 −1/2 φjk φωκ − φ jk φ ωκ φ βω φ ακ ω,κ

1 −1/2 −1/2 (4) φ βω φ ακ φjkωκ + 2 ω,κ

(α = 0, β = 0),

(φ − φ˜ )0j;0k = 0, (φ − φ˜ )0j;k = (φ − φ˜ )j;0k =

(4.70)

−1/2 2φ m φjmk = 0, m

(4.71) where to reach the last equality we used the fact that φ(x) = φ(−x) and hence φ(x) = −φ(−x). Using Lemma 4.2 to approximate the first term in (4.70) and a straight forward computation to approximate the second term in (4.70), and since φ(4) = O(γ 2 ) pointwise, we obtain (4.68). Using (4.67) (4.68) in (4.65), (4.6), we see that it is sufficient to prove (φ − φ )R− u− = O(γ 2 )u− .

(4.72)

Write (φ − φ )R− u− 2 = (φ − φ )R− u− , (φ − φ )R− u− .

(4.73)

Once again, we write ˜ (φ − φ )R− u− − (e0 , (φ − φ )R− u− )e0 = ∆φ u (0)

(4.74)

for some u ˜. Let A denote the LHS of (4.74). Since the second term in A is O(γ 2 )u− , in order to prove (4.72), it is sufficient to prove A = O(γ 2 )u− .

(4.75)

Using our usual strategy, we write 2 u− (k, α)dφ [(φ jk − φ jk )eα ], A = j,k, (1) (∆φ )−1

α

u− (, β)dφ [(φ j

−

φ j )eβ ]

β

≤ O(1)

u− (k, α)dφ [(φ jk − φ jk )eα ] u− (, β)dφ [φ j − φ j eβ ] j,k,

α

β

(4.76) where we used the fact that

(1) (∆φ )−1

= O(1).

def

¯ Then Define ωα → = dφ [(φ jk − φ jk )eα ] (α ∈ {0} ∪ Λ). ωα = zi [(φ jk − φ jk )eα ]dxi i

= (φ jk − φ jk )φ −1 φ ijk eα dxi if α = 0 αα φiα e0 dxi + i

=

α

(4.77)

i

φ ijk e0 dxi if α = 0.

i

¯ Then using our usual “integration by Define Wαβ = (ωα , ωβ ), α, β ∈ {0} ∪ Λ. parts” similar to (3.25)–(3.31) and using (4.12) and (4.18) to estimate (φ(4) )2 , we obtain ¯ W = O(γ 4 )e−2δ|j−k| on (R ⊕ 2 (Λ)) (4.78) 2 ¯ for some δ > 0 (uniformly in Λ). Let u ˜− (k) = α u− (k, α), (k ∈ Λ). Then substituting (4.78) and (4.77) in (4.76), we obtain A2 ≤ O(γ 4 ) u ˜− (k)˜ u− ()e−δ|j−k| e−δ|j−| j,k, (4.79) 4 2 ≤ O(γ )u− . Taking the square root on both side, we obtain (4.75) and hence (4.65). We now prove (4.66). The case α = 0 is trivially true. We therefore assume α = 0. We first compute Mαβ for α = 0, β = 0. Mαβ = (eα , zj∗ zj eβ ) −1/2 −1/2 = 2φ αα 2φ ββ (zα∗ e0 , zj∗ zj zβ∗ e0 ) α ,β

=2

j −1/2 −1/2 2φ αα 2φ ββ

φ jα φ jβ

α ,β

=2

(4.80)

j −1/2

−1/2

φ αα φ ββ φ jα φ jβ + O(γ 2 )e−δ|α−β|

α ,β ,j

= 2φ αβ + O(γ 2 )e−δ|α−β| , (α = 0, β = 0), where to reach the fourth line from the third, we used Lemma 4.2. Let F = (0) ˜ for ( zj∗ zj )eα − Mαβ eβ . We first note that (F, e0 ) = 0. Therefore F = ∆φ u some u ˜. Proceeding as usual, we see that (0)

(F, F ) = (∆φ u ˜, F ) = (dφ u ˜, dφ F ) = ((∆φ )−1 dφ F, dφ F ). (1)

So F ≤ O(1)dφ F ,

(4.81)

zj∗ zj eα − zi Mαβ eβ dxi def →= gi dxi .

where

dφ F =

(4.82)

Since −1/2 zj∗ zj eα = 23/2 φ αα (e0 , φ ij φ jα e0 ) e0 , zi α =0,j

= 23/2

−1/2

φ αα φ ij φ jα + O(γ 2 )e−δ|i−α|

(4.83)

j,α =0

= 2φ iα + O(γ 2 )e−δ|i−α| 3/2

and zi

Mαβ eβ =

(α = 0),

−1/2

Mαβ zi 2φ ββ zβ∗ e0

ββ =0

β=0

=

(4.84)

3/2 2φ iα e0

2

−δ|i−α|

+ O(γ )e

(α = 0),

where we used (4.64) to reach the last equality, we obtain that $(g_i, e_0) = O(\gamma^2) e^{-\delta|i-\alpha|}$. So

$$(g_i, g_i) = ((\Delta_\phi^{(1)})^{-1} d_\phi g_i, d_\phi g_i) + O(\gamma^4) e^{-2\delta|i-\alpha|} = O(\gamma^4) e^{-2\delta|i-\alpha|}, \qquad \|g_i\| = O(\gamma^2) e^{-\delta|i-\alpha|}, \tag{4.85}$$

by proceeding in a similar way as in the proof of Lemma 4.2. Hence $\|d_\phi F\| = O(\gamma^2)$. Using (4.82), (4.85) and summing over $i$, we obtain (4.66).

Proof of Proposition 4.7. We proceed exactly as in the proof of Proposition 4.1, with Proposition 4.5 in place of (4.4) and Lemma 4.8 in place of Lemma 4.2.

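IV.4 The second Grushin problem for $\Delta_\phi^{(0)}$

We proceed as in subsect. 2 by first posing a variant of the Grushin problem in (4.61). Let

$$M(\beta, j; \alpha, k) := (e_\beta, z_j z_k^* e_\alpha), \qquad M \in \mathcal{L}\big((\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda);\ (\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda)\big). \tag{4.86}$$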



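Let

$$\tilde M(\beta, j; \alpha, k) = \begin{cases} 2\langle\phi''_{jk}\rangle, & \alpha = \beta = 0, \\ 0, & \alpha = 0 \text{ or } \beta = 0 \text{ but not both}, \\ 2\big(\langle\phi''\rangle^{1/2}_{\beta k} \langle\phi''\rangle^{1/2}_{\alpha j} + \delta_{\alpha\beta} \langle\phi''_{jk}\rangle\big), & \alpha \ne 0,\ \beta \ne 0. \end{cases}$$

Then $\tilde M \in \mathcal{L}\big((\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda);\ (\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda)\big)$ and $\|M - \tilde M\| = O(\gamma^2)$. Clearly $\tilde M^{-1}$ exists and $\tilde M^{-1}$ belongs to the same space of bounded operators. So $M^{-1}$ exists and $M^{-1} \in \mathcal{L}\big((\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda);\ (\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda)\big)$.

Proposition 4.9. For every $C > 0$, there exists $\gamma_0 > 0$ such that the Grushin problem

$$\begin{pmatrix} \Delta_\phi^{(1)} - z & d_\phi d_\phi^* R_- M^{-1} \\ R_+ & 0 \end{pmatrix} \begin{pmatrix} u \\ u_- \end{pmatrix} = \begin{pmatrix} v \\ v_+ \end{pmatrix} \tag{4.88}$$

where $u \in D(\Delta_\phi^{(1)})$, $v \in L^2(\mathbb{R}^{2\Lambda}; \wedge\mathbb{R}^{2\Lambda})$, $u_-, v_+ \in (\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda)$, is well posed for all $z \le 2\big(r_0 + 2r_E - \frac{1}{C}\big)$, $0 \le \gamma \le \gamma_0$. Moreover

$$\|u\| + \|u_-\| \le O(1)\,(\|v\| + \|v_+\|). \tag{4.89}$$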

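Proof. Similar to the proof of Proposition 4.4, it is enough to prove that for all $u_-$,

$$\|R_- u_- - d_\phi d_\phi^* R_- M^{-1} u_-\| = O(\gamma^2)\,\|u_-\|. \tag{4.90}$$

Since both $M, M^{-1} \in \mathcal{L}\big((\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda);\ (\mathbb{R} \oplus \ell^2(\bar\Lambda)) \otimes \ell^2(\bar\Lambda)\big)$, this is equivalent to proving

$$\|M R_- u_- - d_\phi d_\phi^* R_- u_-\| = O(\gamma^2)\,\|u_-\|. \tag{4.91}$$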

The LHS of (4.91) without the norm sign, equals to def M(β, j; α, k)u− (α, k)eβ dxj − zj zk∗ (u− (α, k)eα )dxj → = ψj dxj . j,k,α,β

j,k,α,β

¯ j∈Λ

(4.92) Using our basic “integration by parts” as in (3.25)–(3.31), we proceed essentially as in the proof of Lemma 4.8, which we summarize as follows : ψj ≤O(1) dψj + O(γ 2 )u− ¯ j∈Λ

¯ j∈Λ def

→ =O(1)

ψ j dx + O(γ 2 )u−

j∈Λ

=O(1)

ψ j + O(γ 2 )u−

¯ j,∈Λ

≤O(1)

dψ j + O(γ 2 )u−

¯ j,∈Λ

≤O(γ )u− 2

(4.93)


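The proposition then follows immediately by repeating the arguments in the proof of Proposition 4.4.

Let $u_j = z_j^* e_0$, $j \in \bar\Lambda$, $u_0 = e_0$. Define

$$\bar M(k, \alpha; j, \beta) := \tfrac{1}{2}\,(z_k^* u_\alpha, z_j^* u_\beta), \qquad k, j \in \bar\Lambda, \quad \alpha, \beta \in \{0\} \cup \bar\Lambda. \tag{4.94}$$

We consider $\bar M$ as an element of
$$\big(\ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]\big)^* \otimes \big(\ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]\big),$$
which in turn can be seen as a tensor in $\big(\ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda)]\big)^{\otimes 2}$, (separately) symmetric in the first set of indices and the last set of indices (cf. sect. III). $\bar M$ is the restriction to the symmetric subspace $\ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]$ of an operator, which (with a slight abuse of notation) we also denote by

$$\bar M \in \mathcal{L}\big(\ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda)];\ \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda)]\big). \tag{4.95}$$

It is easy to see by perturbation theory that $\bar M^{-1}$ exists, satisfying

$$\bar M \circ \bar M^{-1} = \bar M^{-1} \circ \bar M = I \oplus [I \otimes I], \tag{4.96}$$

where $I^i_k = \delta_{ik}$, $(I \otimes I)^{ij}_{k\ell} = \delta_{ik}\delta_{j\ell}$. When restricted to the symmetric subspace $\ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]$, we have that (with the same abuse of notation)

$$\bar M \circ \bar M^{-1} = \bar M^{-1} \circ \bar M = I \oplus [I \otimes_s I], \tag{4.97}$$

where

$$(I \otimes_s I)^{ij}_{k\ell} = \delta_{ik}\delta_{j\ell} + \delta_{i\ell}\delta_{jk}. \tag{4.98}$$

Since $\bar M^{-1}$ is a symmetric operator, $\bar M^{-1} > 0$ as an operator, and moreover $\bar M^{-1/2}$ is also well defined:

$$\bar M^{-1/2} \circ \bar M^{-1/2} = \bar M^{-1}. \tag{4.99}$$

Define

$$\begin{aligned}
e_{k\alpha} &:= \sum_{j,\beta} (2\bar M)^{-1/2}(k, \alpha; j, \beta)\, z_j^* u_\beta, && (k, j = 1, \ldots, 2|\Lambda|,\ \alpha, \beta = 0, 1, \ldots, 2|\Lambda|,\ k \ne \alpha), \\
e_{kk} &:= \frac{1}{\sqrt{2}} \sum_{j,\beta} (2\bar M)^{-1/2}(k, k; j, \beta)\, z_j^* u_\beta, && (k = 1, \ldots, 2|\Lambda|).
\end{aligned} \tag{4.100}$$

The $e_{k\alpha}$ ($k \ge \alpha$) are orthonormalized by using (4.94), (4.99). Let $e_{k\alpha}$ ($k \in \bar\Lambda$, $\alpha \in \{0\} \cup \bar\Lambda$, $k \ge \alpha$), $e_0$ be our Grushin basis. Similar to Proposition 4.5, we arrive at the main conclusion of this subsection: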



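Proposition 4.10. For every $C > 0$, there exists $\gamma_0 > 0$ such that the following Grushin problem for $\Delta_\phi^{(0)}$:

$$\begin{cases}
(\Delta_\phi^{(0)} - z)u + \displaystyle\sum_{k \in \bar\Lambda,\ \alpha \in \{0\} \cup \bar\Lambda,\ k \ge \alpha} u_-(k, \alpha)\, e_{k\alpha} + u_-(0)\, e_0 = v, \\
(u, e_{k\alpha}) = v_+(k, \alpha), \\
(u, e_0) = v_+(0),
\end{cases} \tag{4.101}$$

where $u \in D(\Delta_\phi^{(0)})$, $v \in L^2(\mathbb{R}^{2\Lambda})$, $u_-, v_+ \in \mathbb{R} \oplus \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]$, which is of dimension $1 + 2|\Lambda| + |\Lambda|(2|\Lambda| + 1)$, is well posed for all $z \le 2\big(r_0 + 2r_E - \frac{1}{C}\big)$, $0 \le \gamma \le \gamma_0$. Moreover

$$\|u\| + \|u_-\| \le O(1)\,(\|v\| + \|v_+\|). \tag{4.102}$$

(The proof is the same as the proof of Proposition 4.5, so we omit it.)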
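The dimension count in Proposition 4.10 can be checked by enumerating the labels of the Grushin basis. The following is a minimal sketch, not part of the original argument: the function names and the integer labelling of $\bar\Lambda$ by $1, \ldots, 2|\Lambda|$ (as in (4.100)) are illustrative assumptions.

```python
from itertools import product


def grushin_index_set(num_sites):
    """Enumerate labels of e_0 and e_{k,alpha} (k >= alpha) for |Lambda| = num_sites."""
    n = 2 * num_sites                      # |Lambda-bar| = 2 |Lambda|
    labels = [("e0",)]                     # the ground state e_0
    for k, alpha in product(range(1, n + 1), range(0, n + 1)):
        if k >= alpha:                     # alpha = 0: single excitation; alpha >= 1: symmetric pair
            labels.append((k, alpha))
    return labels


def expected_dimension(num_sites):
    """Dimension stated in Proposition 4.10."""
    return 1 + 2 * num_sites + num_sites * (2 * num_sites + 1)


for size in (1, 2, 5):
    assert len(grushin_index_set(size)) == expected_dimension(size)
```

The three summands correspond to $e_0$, the $2|\Lambda|$ vectors $e_{k0}$, and the $|\Lambda|(2|\Lambda|+1)$ symmetric pairs $k \ge \alpha \ge 1$.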

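IV.5 The third Grushin problem for $\Delta_\phi^{(1)}$ and approximations to $O(\gamma^4)$

Let $e_{k\alpha}\,dx_j$, $e_0\,dx_j$ ($\alpha \in \{0\} \cup \bar\Lambda$, $k, j \in \bar\Lambda$, $k \ge \alpha$), with $e_{k\alpha}$ defined in (4.100), be our Grushin basis. Define $R_+ : L^2(\mathbb{R}^{2\Lambda}; \mathbb{R}^{2\Lambda}) \to \big(\mathbb{R} \oplus \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]\big) \otimes \ell^2(\bar\Lambda)$ by

$$\begin{aligned}
(R_+ u)(j, k, \alpha) &= (u, e_{k\alpha}\,dx_j) = (u_j, e_{k\alpha}), \\
(R_+ u)(j, 0) &= (u, e_0\,dx_j) = (u_j, e_0),
\end{aligned} \tag{4.103}$$

where $u = \sum_{j \in \bar\Lambda} u_j\,dx_j$. Let $R_- = R_+^*$ as usual. As a direct consequence of Proposition 4.10, we have

Proposition 4.11. For every $C > 0$, there exists $\gamma_0 > 0$ such that the Grushin problem

$$\begin{pmatrix} \Delta_\phi^{(1)} - z & R_- \\ R_+ & 0 \end{pmatrix} \begin{pmatrix} u \\ u_- \end{pmatrix} = \begin{pmatrix} v \\ v_+ \end{pmatrix} \tag{4.104}$$

where $u \in D(\Delta_\phi^{(1)})$, $v \in L^2(\mathbb{R}^{2\Lambda}; \mathbb{R}^{2\Lambda})$, $u_-, v_+ \in \big(\mathbb{R} \oplus \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]\big) \otimes \ell^2(\bar\Lambda)$, is well posed for all $z \le 2\big(r_0 + 3r_E - \frac{1}{C}\big)$, $0 \le \gamma \le \gamma_0$. Moreover

$$\|u\| + \|u_-\| \le O(1)\,(\|v\| + \|v_+\|). \tag{4.105}$$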

As in the proof of Proposition 4.7, we need the following ¯ ⊕ [2 (Λ) ¯ ⊗s 2 (Λ)] ¯ ⊗ 2 (Λ). ¯ Define φ : Lemma 4.12. Let u− ∈ R ⊕ 2 (Λ) ¯ ⊕ [2 (Λ) ¯ ⊗s 2 (Λ)] ¯ ⊗ 2 (Λ); ¯ R ⊕ 2 (Λ) ¯ ⊕ [2 (Λ) ¯ ⊗s 2 (Λ)] ¯ ⊗ 2 (Λ) ¯ L R ⊕ 2 (Λ)


to be

¯ α, β ∈ {0} ∪ Λ) ¯ φβj;mαk = (eβ , φ jk emα ), (j, k, , m ∈ Λ,

φ0j;mαk = (e0 , φ jk emα )

(4.106)

φβj;0k = (eβ , φ jk e0 )

φ0j;0k = (e0 φ jk e0 ) = φ jk . Define

Mkα;jβ = ekα z∗ z ejβ .

(4.107)

¯ ∈Λ

Note that Mkα,0 = M0,kα = M0,0 = 0. We have and

(4.108)

φ R− u− − R− φ u− = O(γ 2 )u−

(4.109)

z∗ z ekα − Mkα,jβ ej,β = O(γ 2 ). (

(4.110)

¯ j∈Λ ¯ β∈{0,Λ}

Proof. The proof proceeds exactly as in the proof of Lemma 4.8. We note that

φβj;mαk = (eβ φ jk emα ) ∗ ¯ −1/2 (β, β )M ¯ −1/2 (mα, m α )(z∗ uβ , φ jk zm =M uα ) ∗ ¯ −1/2 (β, β )M ¯ −1/2 (mα, m α )(uβ , z φ jk zm =M uα ) −1/2 ¯ −1/2 ¯ = φjk (δαβ δm + δα δβm ) + M (β, β )M (mα, m α ) ∗ (z∗ eβ , (φ jk − φ jk )zm eα )

(4.111) where we used (4.100, 4.106). Using (4.12) and (4.18), it can be easily shown that the second term2in the 2 ¯ ⊕ ) on R ⊕ (Λ) last line of (4.111) correspond to an operator which is of O(γ 2 ¯ 2 ¯ 2 ¯ [ (Λ) ⊗s (Λ)] ⊗ (Λ). Let φ˜βj;mαk = φ jk (δm δαβ + δα δβm ) = φ jk (I ⊗s I)βmα φ˜βj;0k = 0 φ˜0j;mαk = 0 φ˜0j;0k = φ jk .

(4.112)

Then


R− (φ − φ˜ )u− = (φ − φ˜ )u−

$= O(\gamma^2)\,\|u_-\|$, where we used the fact that the operators defined in the second and third equations of (4.106) are of $O(\gamma^2)$. We may now follow the proof of Lemma 4.8 to obtain (4.109). Similarly we obtain (4.110), by using the same type of arguments that we used to prove (4.66) of Lemma 4.8.

The proof of Proposition 4.11 now follows immediately by applying Lemma 4.12, in the same way that the proof of Proposition 4.1 follows from Lemma 4.2.

Remark. It is worth noticing that in Lemmas 4.2, 4.8 and 4.12, the estimates on the RHS's are all of $O(\gamma^2)$. This is fundamentally due to the fact that $[z_j, z_k^*] = 2\phi''_{jk}$ and $(\phi''')^2 = O(\gamma^2)$, and hence the variation of $\phi''_{jk}$ is of $O(\gamma^2)$, i.e.,

$$\langle(\phi''_{jk})^2\rangle - \langle\phi''_{jk}\rangle^2 = \langle(\phi''_{jk} - \langle\phi''_{jk}\rangle)^2\rangle = O(\gamma^2).$$

(This is of course the content of Lemma 4.2.) If $\phi''$ were a constant, i.e. $\phi''' = 0$, then all the estimates on the RHS would be replaced by 0; this corresponds to a high-dimensional harmonic oscillator. The point of this paper is to do perturbation theory around the harmonic oscillator as the dimension goes to infinity.

Using Lemma 4.12 and proceeding exactly as in the proof of Proposition 4.3, we obtain the following approximations:

Proposition 4.13. For every $C > 0$, there exists $\gamma_0 > 0$, such that for all $z \le 2\big(r_0 + 3r_E - \frac{1}{C}\big)$, $|z| = O(1)$, all $0 \le \gamma \le \gamma_0$,

E−+ − (z − M − 2φ )L(2 ,2 ) = O(γ 2 ) E+ − R− L(2 ,L2 ) = O(γ 2 ) E− − R+ L(L2 ,2 ) = O(γ 2 ) ¯ [2 (Λ)⊗ ¯ s 2 (Λ)] ¯ ⊗ where M, φ are defined in (4.107,4.106), 2 isfor R ⊕ 2 (Λ)⊕ E+ v ¯ and L2 for L2 (R2Λ ; R2Λ ). Let u = E 2 (Λ) be the solution u− v+ E− E−+ to the Grushin problem in Proposition 4.11. We specialize to the case v = 0 and ¯ i.e. v+ = {v+ (j, 0), j ∈ Λ)}. ¯ v+ ∈ 2 (Λ), Then u = E+ v+ u− = E−+ v+ . Define Λ+ to be the projection : ¯ ⊕ [2 (Λ) ¯ ⊗s 2 (Λ)] ¯ ⊗ 2 (Λ) ¯ −→ R ⊗ 2 (Λ) ¯ ∼ 2 (Λ), ¯ Λ+ : R ⊕ 2 (Λ)

(4.113)


¯ α ∈ {0} ∪ Λ. ¯ Assume ∀u : (Λ+ u)(j, 0) = u(j, 0), (Λ+ u)(j, k, α) = 0 for all k ∈ Λ, γ, z are such that Propositions 4.11, 4.13 hold and |z| = O(1). We then have the following improved approximations to E+ , E− and E−+ in this special case. 1 Proposition 4.14. For |z| = O(1), z ≤ 2 r0 + 3rE − O(1)

(E−+ − (z − 2φ ))Λ+ L(2 ,2 ) = O(γ 4 ) (E+ − R− )Λ+ L(2 ,L2 ) = O(γ 4 ) Λ+ (E− − R+ )L(L2 ,2 ) = O(γ 4 )

(4.114)

¯ ⊕ where φ is as defined in (4.106) in Lemma 4.12, 2 is short for R ⊕ 2 (Λ) 2 ¯ 2 ¯ 2 ¯ 2 2 2Λ 2Λ [ (Λ) ⊗s (Λ)] ⊗ (Λ) and L for L (R ; R ). Proof. Let u0 = R− v+

u0− = (z − 2φ )v+ ¯ be approximations to u, u− in (4.104). We then obtain that (where v+ ∈ 2 (Λ))

(∆φ − z)(u0 − u) + R− (u0− − u− ) = 2(φ − φ )R− v+ (1)

R+ (u0 − u) = 0, (0)

¯ Applying where we used the fact that (∆φ ⊗ I)(R− v+ ) = 0 for v+ ∈ 2 (Λ). Proposition 4.11 to the Grushin problem in (4.104), we obtain

u0 − u + u0− − u− ≤ 2(φ − φ )R− v+ . Therefore in order to prove the proposition, it is sufficient to prove that

(φ − φ )R− v+ = O(γ 4 )v+

(4.115)

¯ (Compare (4.115) with (4.109) and (4.110)) for v+ ∈ 2 (Λ). ! " (φ jk e0 − φ jk e0 − (φ − φ )R− v+ 2 = (eβ , φ jk e0 )eβ v+ (k) j,k,p

!

,β,≥β

(φ jp e0

−

φ jp e0

−

" (euω , φ jp e0 )euω v+ (p)

u,ω,u≥ω

where eβ is as defined earlier in (4.100). def

Let A1 → = φ jk e0 − φ jk e0 − (0)

≥β

(eβ φ jk e0 )eβ . Since (A1 , e0 ) = 0, A1 =

˜1 . By using our usual mechanism, we need to estimate dA1 . Hence we need ∆φ u



to estimate def

A2 → = zn A1 = zn (φ jk e0 − φ jk e0 ) −

(eβ φ jk e0 )eβ

≥β

=

φ jkn e0

=

φ jkn e0

− (e0 φ jk e0 )zn e0 − (eβ φ jk e0 )zn eβ ,0

−

−

≥β>0

−1/2 −1/2 2φ (z∗ e0 , φ jk e0 )zn 2φ z∗ e0

(eβ φ jk e0 )zn eβ

≥β>0 −1 = φ jkn e0 − 2φ φjk (2φn )e0 − (eβ φ jk e0 )zn eβ ≥β>0

=

φ jkn e0

−

(eβ φ jk e0 )zn eβ ,

≥β>0

where we used the fact that φ jk = 0 to reach the last equality. Since (A2 , e0 ) = 0, (0)

˜2 . We need to estimate dA2 . Hence we define A2 = ∆φ u

(4)

A3 = zm A2 = φjknm e0 −

(eβ φ jk e0 )zm zn eβ

≥β>0 (4)

(A3 , e0 ) = φjknm −

(eβ , φ jk e0 )(e0 , zm zn eβ ).

≥β>0 ∗ ∗ Since (e0 , zm zn e0 ) = 0 and (eβ , φ jk e0 )eβ − φ jk e0 = Ojk (γ 2 ) φ jk e0 − ≥β ∗ ∗ zn e0 − zm

∗ ∗ (eβ , zm zn e0 )eβ = Omn (γ 2 ), ≥β

we obtain (A3 , e0 ) = φjknm − (e0 , zm zn φ jk e0 ) + Ojknm (γ 4 ) = Ojknm (γ 4 ). (4)

Therefore A3 can be written as def

A3 → = A4 + Ojknm (γ 4 ) (0)

where A4 = ∆φ u ˜3 . We are only left to estimate dA4 ; to that end, we define (5)

A5 = zs A4 = φjknms e0 + Ojknms (γ 4 ).


Since $\phi^{(5)}_{jknms} e_0 = O_{jknms}(\gamma^4)$ by using (4.12) and (4.18) (cf. sect. VII for more of such computations), we obtain that $A_5 = O_{jknms}(\gamma^4)$. Using now the fact that $\phi''$ and all higher Hessians (obtained by repeated differentiation of (4.2)) are bounded operators in appropriately exponentially weighted spaces, we obtain (4.115) by combining the estimates on $A_1, A_2, \ldots, A_5$. We therefore obtain the Proposition.

V Grushin Problems in Weighted Spaces

Beginning in this section, we study the Grushin problem in Proposition 4.11 in appropriate weighted spaces. This is the necessary final step toward obtaining the asymptotics of $\langle G(E; \mu, \nu)\rangle_g$ as $|\mu - \nu| \to \infty$. In this section, we study the general classes of weights that we are allowed to take. In sect. VI, we further specify a class of such weights, using the convolution structure that is inherent in the problem.

Propositions 4.11, 4.13 as they stand are not sufficient for obtaining the asymptotics. However, from (3.31), we see that we are only interested in $z = 0$. When $z = 0$, using Propositions 4.10, 4.11, 4.13, we deduce our desired weighted estimates. But first we need to specify a class $\mathcal{W}$ of weights. $\mathcal{W}$ consists of weights $\rho : \bar\Lambda \to (0, \infty)$ such that if $\rho \in \mathcal{W}$, then

(W1) $\rho\phi''(x)\rho^{-1}$, $\rho^{-1}\phi''(x)\rho = O(1)$ in $\mathcal{L}(\ell^2(\bar\Lambda); \ell^2(\bar\Lambda))$ for all $x$, where $O(1)$ is uniform in $\rho$, $x$. Further $\rho$ is such that $\rho_{i^{(1)}} = \rho_{i^{(2)}} = \rho_i$, where $i^{(1)} \in \Lambda \times \{1, 2\} = \bar\Lambda$, $i^{(2)} \in \Lambda \times \{1, 2\} = \bar\Lambda$, $i \in \Lambda$.

(W2) If $\rho \in \mathcal{W}$, then $\rho^{-1} \in \mathcal{W}$, and $\rho$ satisfies

$$(\rho\phi''(x)\rho^{-1} t, t) \ge \Big(\frac{1}{O(1)} - r_0 - 2r_E\Big)\|t\|^2, \qquad t \in \ell^2(\bar\Lambda), \tag{5.1}$$

for all $x$, where $O(1)$ is some positive constant uniform in $\rho$, $x$; $r_0$ is taken to be $r_0 = $ the smallest eigenvalue of $\langle\phi''\rangle - \frac{1}{C}$ and $r_E = E - 2d - \frac{1}{C}$, with arbitrarily large $C > 0$ and small enough $\gamma$ depending on $C$.

Remark. $\mathcal{W}$ becomes useful only when $O(1)$ is such that the RHS of (5.1) is negative, i.e., there exists $\rho \in \mathcal{W}$ satisfying (W1), such that

$$\Big(\frac{1}{O(1)} - r_0 - 2r_E\Big)\|t\|^2 = -\Big|\frac{1}{O(1)} - r_0 - 2r_E\Big|\,\|t\|^2 \le (\rho\phi''(x)\rho^{-1} t, t) < -c\|t\|^2 < 0. \tag{5.2}$$

(See (5.6).) From the expression for $\phi''$ in (4.2), we see that in order for $\rho$ to verify (W1), we only need to check the last term, the $O(\gamma^4)$ term. To $O(\gamma)$, that term decays in $|i - j|$ at approximately twice the rate of $(E - \Delta)^{-1}$ and hence of $(\phi''(x))^{-1}$. So for sufficiently small $\gamma$, this does not cause any problems, and there exist $\rho \in \mathcal{W}$ such that (5.2) is satisfied. For a more precise statement, see (6.17), (6.18).
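To see how an exponential weight enters a quadratic form of the type appearing in (5.1), the following sketch keeps only the $E - \Delta$ part of $\phi''$, in $d = 1$, on a finite chain, with the weight $\rho(k) = e^{ck}$. All of these simplifications, and the function names, are assumptions made for illustration; the weights of the paper are more general. For real $t$, conjugation by $\rho$ replaces $(\Delta t, t)$ by $\cosh(c)\,(\Delta t, t)$, which gives the lower bound $(E - 2\cosh c)\|t\|^2$.

```python
import math


def conjugated_form(E, c, t):
    """(rho (E - Delta) rho^{-1} t, t) on a finite chain with rho(k) = exp(c * k)."""
    n = len(t)
    total = 0.0
    for j in range(n):
        total += E * t[j] * t[j]
        for k in (j - 1, j + 1):           # nearest neighbours, Delta(j, k) = 1
            if 0 <= k < n:
                total -= math.exp(c * (j - k)) * t[j] * t[k]
    return total


def symmetrized_form(E, c, t):
    """The same form written as E |t|^2 - cosh(c) (Delta t, t)."""
    norm2 = sum(x * x for x in t)
    hop = 2.0 * sum(t[j] * t[j + 1] for j in range(len(t) - 1))   # (Delta t, t)
    return E * norm2 - math.cosh(c) * hop
```

In this toy setting the form stays positive only while $2\cosh c < E$, and becomes indefinite for larger $c$, which is the regime the Remark describes.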



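We now use the weights in $\mathcal{W}$ to obtain the weighted analogues of Propositions 4.11, 4.13 at $z = 0$. We first establish that the Grushin problem in Proposition 4.11 at $z = 0$ is well posed for $\rho \in \mathcal{W}$. We proceed as in the proof of Proposition 4.11. We first look at the special case $v_+ = 0$ and write

$$\begin{aligned} \Delta_\phi^{(1)} u + R_- u_- &= v, \\ R_+ u &= 0. \end{aligned} \tag{5.3}$$

Multiplying (5.3) by $\rho$, we obtain

$$\begin{aligned} \rho\Delta_\phi^{(1)} u + \rho R_- u_- &= \rho v, \\ \rho R_+ u &= 0, \end{aligned} \tag{5.4}$$

where $\rho v$ for $v$ a 1-form is defined to be $\rho v = \sum_j \rho_j v_j\,dx_j$, $j \in \bar\Lambda$; $\rho R_+$ is defined to be $\rho R_+ = (R_- \rho)^* = (\rho R_-)^*$. Taking the scalar product with $\rho u$ in the first equation of (5.4), we have

$$(\rho u, \rho\Delta_\phi^{(1)} u) + (\rho u, \rho R_- u_-) = (\rho u, \rho v). \tag{5.5}$$

Using the second equation of (5.4) and the fact that $\rho$ and $R_-$ commute, we have

$$\begin{aligned}
(\rho u, \rho\Delta_\phi^{(1)} u) + (\rho u, \rho R_- u_-) &= (\rho u, (\Delta_\phi^{(0)} \otimes I)\rho u) + 2(\rho u, \rho\phi''\rho^{-1}\rho u) \\
&\ge 2(r_0 + 2r_E)\|\rho u\|^2 + 2\Big(\frac{1}{O(1)} - r_0 - 2r_E\Big)\|\rho u\|^2 \\
&\ge \frac{1}{O(1)}\|\rho u\|^2,
\end{aligned} \tag{5.6}$$

where $r_0$, $r_E$ are as in (4.1), (4.3) and we used Proposition 4.10. Combining (5.6) and (5.5), we obtain

$$\|\rho u\| \le O(1)\,\|\rho v\|. \tag{5.7}$$

We next take the scalar product of the first equation in (5.4) with $\rho R_- u_-$:

$$(\rho R_- u_-, \rho\Delta_\phi^{(1)} u) + (\rho R_- u_-, \rho R_- u_-) = (\rho R_- u_-, \rho v). \tag{5.8}$$

So

$$(\rho(\Delta_\phi^{(0)} \otimes I) R_- u_-, \rho u) + 2(\rho\phi''\rho^{-1}\rho R_- u_-, \rho u) + \|\rho u_-\|^2 \le \|\rho u_-\|\,\|\rho v\|. \tag{5.9}$$

Using the weighted analogue of Lemma 4.12, which can be proved similarly using in addition (W1), and the second equation of (5.4), we obtain

$$O(\gamma^2)\,\|\rho u_-\|\,\|\rho u\| + \|\rho u_-\|^2 \le \|\rho u_-\|\,\|\rho v\|.$$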


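Therefore

$$\|\rho u_-\| \le O(1)\,\|\rho v\|. \tag{5.10}$$

We now consider (5.4) in general. Since $R_+ R_- = I$, we have from (5.4):

$$\begin{aligned} \rho\Delta_\phi^{(1)}(u - R_- v_+) + \rho R_- u_- &= \rho(v - \Delta_\phi^{(1)} R_- v_+), \\ \rho R_+(u - R_- v_+) &= 0. \end{aligned} \tag{5.11}$$

Applying the estimates (5.7) and (5.10) to (5.11), we obtain

$$\|\rho(u - R_- v_+)\| + \|\rho u_-\| \le O(1)\,\|\rho(v - \Delta_\phi^{(1)} R_- v_+)\| \le O(1)\,(\|\rho v\| + \|\rho v_+\|), \tag{5.12}$$

where we used the weighted analogue of Lemma 4.12. Since

$$\|\rho(u - R_- v_+)\| \ge \|\rho u\| - \|\rho v_+\|, \tag{5.13}$$

we obtain from (5.12) that

$$\|\rho u\| + \|\rho u_-\| \le O(1)\,(\|\rho v\| + \|\rho v_+\|), \tag{5.14}$$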

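which implies that

$$\|\rho E(0)\rho^{-1}\|_{\mathcal{L}(L^2, L^2)},\ \|\rho E_+(0)\rho^{-1}\|_{\mathcal{L}(\ell^2, L^2)},\ \|\rho E_-(0)\rho^{-1}\|_{\mathcal{L}(L^2, \ell^2)},\ \|\rho E_{-+}(0)\rho^{-1}\|_{\mathcal{L}(\ell^2, \ell^2)} = O(1), \tag{5.15}$$

where $L^2$ is shorthand for $L^2(\mathbb{R}^{2\Lambda}; \mathbb{R}^{2\Lambda})$ and $\ell^2$ is $\big(\mathbb{R} \oplus \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]\big) \otimes \ell^2(\bar\Lambda)$. It is also easy to see that all the steps in the proofs of Propositions 4.13, 4.14 go through in the weighted space. We hence have the following weighted analogues:

$$\begin{aligned}
\|\rho(E_{-+}(0) + (M + 2\bar\phi''))\rho^{-1}\|_{\mathcal{L}(\ell^2, \ell^2)} &= O(\gamma^2), \\
\|\rho(E_+(0) - R_-)\rho^{-1}\|_{\mathcal{L}(\ell^2, L^2)} &= O(\gamma^2), \\
\|\rho^{-1}(E_-(0) - R_+)\rho\|_{\mathcal{L}(L^2, \ell^2)} &= O(\gamma^2), \\
\|\rho(E_{-+}(0) + 2\bar\phi'')\rho^{-1}\Lambda_+\|_{\mathcal{L}(\ell^2, \ell^2)} &= O(\gamma^4), \\
\|\rho(E_+(0) - R_-)\rho^{-1}\Lambda_+\|_{\mathcal{L}(\ell^2, L^2)} &= O(\gamma^4), \\
\|\Lambda_+\rho^{-1}(E_-(0) - R_+)\rho\|_{\mathcal{L}(L^2, L^2)} &= O(\gamma^4),
\end{aligned} \tag{5.16}$$

where $\Lambda_+$ is the projection as defined in (4.113); $M$, $\bar\phi''$ are as in Lemma 4.12 and Proposition 4.13. We end this section by summarizing the above results as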



Theorem 5.1. At $z = 0$, the Grushin problem in Proposition 4.11 is well posed for $\rho \in \mathcal{W}$, with the solution:

$$\begin{aligned} u &= E(0)v + E_+(0)v_+, \\ u_- &= E_-(0)v + E_{-+}(0)v_+, \end{aligned}$$

where $E(0)$, $E_+(0)$, $E_-(0)$, $E_{-+}(0)$ satisfy (5.15) and (5.16), uniformly in $\rho \in \mathcal{W}$.

VI The Inverse of $E_{-+}$ and Convolution Equations

Starting from this section, we use the symmetry in the problem to facilitate some of the computations. Naturally similar results hold in the general case. Recall from (2.9) that $\phi(x) = \phi(-x)$. Therefore $\langle\phi^{(n)}(x)\rangle = 0$ for $n$ odd. (This was already used earlier in the proof of Proposition 4.14 for $n = 3$.) $\Delta_\phi^{(1)}$ is parity conserving.

Using this reflection symmetry, the Grushin basis for $\Delta_\phi^{(0)}$ constructed earlier in (4.47) and (4.100) simplifies into:

$$\begin{aligned}
e_0 &= e^{-\phi}, \\
e_k &= \sum_j (2\langle\phi''\rangle)^{-1/2}_{kj}\,(z_j^* e^{-\phi}) && (k, j \in \bar\Lambda), \\
e_{kk'} &= \sum_{j,j'} (2\bar M)^{-1/2}_{kk';jj'}\,(z_j^* z_{j'}^* e^{-\phi}) && (k, k', j, j' \in \bar\Lambda,\ k \ge k'), \\
e_{kk} &= \frac{1}{\sqrt{2}} \sum_{j,j'} (2\bar M)^{-1/2}_{kk;jj'}\,(z_j^* z_{j'}^* e^{-\phi}) && (k, j, j' \in \bar\Lambda),
\end{aligned} \tag{6.1}$$

where

$$\bar M_{kk';jj'} = 2\langle\phi''_{kj}\phi''_{k'j'}\rangle + 2\langle\phi''_{kj'}\phi''_{k'j}\rangle + \langle\phi^{(4)}_{kk'jj'}\rangle. \tag{6.2}$$

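(The simplification comes from the fact that $\langle\phi'''(x)\rangle = 0$.) In Proposition 4.13, using $e_0\,dx_j$, $e_k\,dx_j$, $e_{kk'}\,dx_j$ ($k, k', j \in \bar\Lambda$; $k \ge k'$) as the Grushin basis, we obtained the effective operator $E_{-+}(z)$ of $\Delta_\phi^{(1)} - z$. Recall the identifications:

$$\begin{aligned}
e_0\,dx_j\ (j \in \bar\Lambda) &\in \mathbb{R}^{2\Lambda} \sim \ell^2(\bar\Lambda), \\
e_k\,dx_j\ (k, j \in \bar\Lambda) &\in \mathbb{R}^{2\Lambda} \otimes \mathbb{R}^{2\Lambda} \sim \ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda), \\
e_{kk'}\,dx_j\ (k \ge k',\ j \in \bar\Lambda) &\in \mathbb{R}^{2\Lambda} \otimes_s \mathbb{R}^{2\Lambda} \otimes \mathbb{R}^{2\Lambda} \sim \ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda);
\end{aligned} \tag{6.3}$$

and $E_{-+}(z)$ is therefore a bounded operator in

$$\begin{aligned}
\big(\mathbb{R} \oplus \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda)]\big) \otimes \ell^2(\bar\Lambda) &= \ell^2(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda)] \oplus \big[\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda) \otimes \ell^2(\bar\Lambda)\big] \\
&:= F^1 \oplus F^2 \oplus F^3 = F.
\end{aligned} \tag{6.4}$$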


Furthermore, Theorem 5.1 implies that at $z = 0$, $E_{-+}(z = 0)$ is a bounded operator in

$$\ell^2_\rho(\bar\Lambda) \oplus [\ell^2(\bar\Lambda) \otimes \ell^2_\rho(\bar\Lambda)] \oplus \big[\ell^2(\bar\Lambda) \otimes_s \ell^2(\bar\Lambda) \otimes \ell^2_\rho(\bar\Lambda)\big] := F^1_\rho \oplus F^2_\rho \oplus F^3_\rho = F_\rho, \tag{6.5}$$

where $\rho \in \mathcal{W}$ satisfies (W1,2). We now look more closely at (W1,2), taking into account the special form of $\phi''$. For the sake of convenience, we repeat the expression for $2\phi''$ from (4.2): (2φ (x))i(α) j (β) = 2Eδij δαβ − 2∆ij δαβ − 4γ 2 xi xi k (γxi · xi )δij (α) (β)


2 + 4γ 4 xi xj k (γxi · xi )k (γxj · xj )[(E − ∆ − γ diag k )−1 ij ] , (6.6) where k = log gˆ and gˆ is the Laplace transform of g; and we used the point of view of (R2 )Λ , i.e., i, j ∈ Λ, α, β ∈ {1, 2}, xi , xj ∈ R2 . ∆ij is as defined in (1.1), for i, j ∈ Λ and ∆ = ∆Λ is the matrix of ∆ij . To avoid confusion, we now write ∆Λ for the finite volume Laplacian and reserve ∆ for the corresponding one on 2 (Zd ). The spectrum of ∆Λ is contained in [−2d, 2d]. (α) (β)

At this point it is good to recall the problem that we started with. We study

$$H = \Delta + \gamma V \quad \text{on } \ell^2(\mathbb{Z}^d), \tag{6.7}$$

as defined in (1.1). For all $\delta > 0$, let $I_\delta = (2d + \delta, \infty)$ or $I_\delta = (-\infty, -2d - \delta)$, depending on the support of $dg$ as in (1.11). We are interested in the asymptotics of $\langle G(E, \mu, \nu)\rangle_g$ as $|\mu - \nu| \to \infty$ for $E \in I_\delta$. We proceed by using finite volume approximations. We take $\Lambda \ni \mu, \nu$ to be a finite subset of $\mathbb{Z}^d$ and impose the Dirichlet boundary conditions. For given $\mu, \nu$, we take $\Lambda$ sufficiently large, so that $\mu, \nu$ are "well" in the interior of $\Lambda$. Using the resolvent series and spectral theory, the limit as $\Lambda \nearrow \mathbb{Z}^d$ exists, is independent of the boundary conditions, and we have moreover

$$\lim_{\Lambda \nearrow \mathbb{Z}^d} \langle G_\Lambda(E; \mu, \nu)\rangle_g = \langle G(E; \mu, \nu)\rangle_g. \tag{6.8}$$

Therefore we further specify $\Lambda$ to be a cube of side length $N \gg 1$. We now make more precise the class of weights $\mathcal{W}$ that we shall take, using the fact that $R - \Delta$ ($R \in \mathbb{R}$) is a convolution matrix on $\ell^2(\mathbb{Z}^d)$:

$$\mathcal{F}(R - \Delta)(\xi) = R - \sum_{i=1}^d \cos\xi_i, \qquad \xi = \{\xi_i\} \in \mathbb{T}^{d*}.$$



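(We slightly abused the notation and wrote $\Delta(j - k)$ for $\Delta_{jk}$.) Let $R > 2d$, $y = \{y_i\} \in \mathbb{R}^{d*} \sim \mathbb{R}^d$. Define

$$D_R = \Big\{ y_i \ \Big|\ \sum_{i=1}^d \cosh y_i < R \Big\} \subset \mathbb{R}^d; \tag{6.9}$$

$D_R$ is a strictly convex, compact domain in $\mathbb{R}^d$. Let

$$F(y) = 2\sum_{i=1}^d \cosh y_i.$$

We define the support function of $D_R$:

$$H_{D_R}(k) = \sup_{y \in \bar D_R} y \cdot k$$

for $k \in \mathbb{Z}^d \setminus \{0\}$. $H_{D_R}(k)$ is even, convex and homogeneous of degree 1. Therefore $H_{D_R}(k)$ defines a distance on $\mathbb{Z}^d$: $d_R(k, \ell) = H_{D_R}(k - \ell)$ for $k, \ell \in \mathbb{Z}^d$. Let $\rho : \mathbb{Z}^d \to (0, \infty)$ be a class of weights such that

$$\frac{\rho(k)}{\rho(\ell)} \le e^{d_R(k, \ell)}. \tag{6.10}$$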
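The support function and the weight condition (6.10) can be made concrete in one dimension. The sketch below is only an illustration under stated assumptions: it takes $d = 1$ and uses (6.9) exactly as printed, so that $\bar D_R = \{|y| \le \operatorname{arccosh} R\}$ and $H_{D_R}(k) = |k|\operatorname{arccosh} R$; the function names and the sample weight $\rho(k) = e^{c|k|}$ are hypothetical.

```python
import math


def support_function_1d(R, k):
    """H_{D_R}(k) = sup{ y * k : cosh(y) <= R } in d = 1, i.e. |k| * arccosh(R)."""
    return abs(k) * math.acosh(R)


def brute_force_support_1d(R, k, steps=20000):
    """Grid approximation of the same supremum, used only as a sanity check."""
    y_max = math.acosh(R)
    return max(k * (-y_max + 2.0 * y_max * i / steps) for i in range(steps + 1))


def dist(R, k, l):
    """d_R(k, l) = H_{D_R}(k - l)."""
    return support_function_1d(R, k - l)


def satisfies_weight_bound(R, c, k, l):
    """Check log(rho(k)/rho(l)) <= d_R(k, l) for the sample weight rho(k) = exp(c * |k|)."""
    return c * (abs(k) - abs(l)) <= dist(R, k, l) + 1e-12
```

For this sample weight the bound holds whenever $c \le \operatorname{arccosh} R$, because $|k| - |\ell| \le |k - \ell|$; evenness, homogeneity of degree 1 and the triangle inequality of $d_R$ are immediate from the closed form.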

(This is always possible by a standard modification of the weight at infinity.) Let 2d < r ≤ R and let ρ satisfy (W1,2) and max(| log

ρ(k) ρ(k) − y(k) · (k − )|, | log − y() · (k − )|) ≤ H ρ() ρ()

(H 2d). (6.19) O(1) Let

HΛ = MΛ + 2φ Λ , φ Λ

(6.20)

φ

where MΛ , are as M , in the earlier notations in (4.64) and (4.63). Then HΛ ∈ L(FρΛ ; FρΛ ) for ρ ∈ W satisfying (6.10) (6.11) and is an approximation to E−+,Λ (0) satisfying (5.15) and (5.16). Let Λj+ (j = 1, 2, 3.) be the projections: ¯ Λ1+ F = F 1 = 2 (Λ) ¯ ⊗ 2 (Λ) ¯ Λ2+ F = F 2 = 2 (Λ) Λ3+ F

(6.21)

¯ ⊗s 2 (Λ)] ¯ ⊗ 2 (Λ). ¯ = F = [ (Λ) 3

2

(Λ1+ was denoted earlier by Λ+ in (4.113).) Let Hj,Λ = Λj+ HΛ Λj+ Hjk,Λ = Qj,Λ = Qjk,Λ = DΛ =

(j = 1, 2, 3.),

Λj+ HΛ Λk+ (j = Λj+ E−+,Λ (0)Λj+ Λj+ E−+,Λ (0)Λk+ ⊕3j=1 Qj,Λ

k, j, k = 1, 2, 3.); (j = 1, 2, 3.), (j = k, j, k = 1, 2, 3.);

(6.22)

RΛ = E−+,Λ (0) − DΛ . Since parity is conserved, H12,Λ = H21,Λ = H23,Λ = H32,Λ = 0; similarly, Q12,Λ = Q21,Λ = Q23,Λ = Q32,Λ = 0. Then by using the resolvent identity of [E−+,Λ (0)]−1 , we have Λ1+ [E−+,Λ (0)]−1 Λ1+ −1 1 −1 −1 1 −1 =Λ1+ DΛ Λ+ + Λ1+ DΛ RΛ DΛ Λ+ + Λ1+ DΛ RΛ [E−+,Λ (0)]−1 Λ1+ −1 =(Q1,Λ − Q13,Λ Q−1 3,Λ Q31,Λ ) def

˜ −1 . → =Q Λ

(6.23)
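The last equality in (6.23) is the Schur complement formula for the corner of an inverse when the blocks coupling sector 2 to sectors 1 and 3 vanish. A minimal exact check with scalar blocks follows; the numerical values, the helper `invert`, and the reduction to $1 \times 1$ blocks are illustrative assumptions, not data from the paper.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical scalar blocks with Q12 = Q21 = Q23 = Q32 = 0, as in the parity argument.
q1, q2, q3, q13, q31 = 5, 7, 11, 2, 3
E = [[q1, 0, q13],
     [0, q2, 0],
     [q31, 0, q3]]
corner = invert(E)[0][0]                                   # the (1,1) entry of E^{-1}
schur = 1 / (Fraction(q1) - Fraction(q13 * q31, q3))       # (Q1 - Q13 Q3^{-1} Q31)^{-1}
assert corner == schur
```

The same identity holds for operator blocks as long as $Q_3$ and the Schur complement are invertible; the scalar case is used here only to keep the check exact.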


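Furthermore, from Theorem 5.1, we have

$$\begin{aligned}
\|Q_{1,\Lambda} - H_{1,\Lambda}\|_{\mathcal{L}(F^1_{\rho_\Lambda}, F^1_{\rho_\Lambda})} &= O(\gamma^4), \\
\|Q_{13,\Lambda} - H_{13,\Lambda}\|_{\mathcal{L}(F^3_{\rho_\Lambda}, F^1_{\rho_\Lambda})} &= O(\gamma^4), \\
\|Q_{31,\Lambda} - H_{31,\Lambda}\|_{\mathcal{L}(F^1_{\rho_\Lambda}, F^3_{\rho_\Lambda})} &= O(\gamma^4), \\
\|Q_{3,\Lambda} - H_{3,\Lambda}\|_{\mathcal{L}(F^3_{\rho_\Lambda}, F^3_{\rho_\Lambda})} &= O(\gamma^2).
\end{aligned} \tag{6.24}$$

From the definition of $\tilde Q_\Lambda$ in the RHS of (6.23), the only operator that we need to look into is $Q_{3,\Lambda}^{-1}$. Using (6.24), we see that to order $O(\gamma^4)$, it is sufficient to study $H_{13,\Lambda} H_{3,\Lambda}^{-1} H_{31,\Lambda}$. To that end, we compute $H_{3,\Lambda}$ as defined by (6.20), (6.21). To gain experience and also to see the structure, we start by computing $H_{1,\Lambda}$, $H_{2,\Lambda}$. By definition:

$$(H_{1,\Lambda})_{ij} = \big(e^{-\phi}, (\Delta_\phi^{(0)}\delta_{ij} + 2\phi''_{\Lambda,ij})\,e^{-\phi}\big) = 2\langle\phi''_\Lambda\rangle_{ij},$$

so

$$H_{1,\Lambda} = 2\langle\phi''_\Lambda\rangle. \tag{6.25}$$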

(The φ’s that appear below should all be understood as φΛ ’s. For notational simplicity, we usually only put the subscript back on the end results.) (H2,Λ )ii ;jj = (ei , (∆φ δi j + 2φ i j )ej ) −1/2 −1/2 (e0 , zk ( zn∗ zn )z∗ e0 )δi j + 2(e0 , zk φ i j z∗ e0 ) = (2φ )ik (2φ )j (0)

−1/2

−1/2

= (2φ )ik (2φ )j (4) 4(e0 , ( φ kn φ n )e0 )δi j + 4(e0 , φ i j φ k e0 ) + 2(e0 , φi j k e0 ) n

−1/2

= (2φ )ik

−1/2

(2φ )j

[4( (φ kn φ n )δi j + 4φ i j φ k n

+

(4) 2φi j k

(1) +4 (e−φ dφ kn , (∆φ )−1 (e−φ dφ n ))δi j n (1)

+ 4(e−φ dφ i j , (∆φ )−1 (e−φ dφ k ))]. (6.26) So

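$$H_{2,\Lambda} = 2\langle\phi''_\Lambda\rangle \otimes I + I \otimes 2\langle\phi''_\Lambda\rangle + O(\gamma^2), \tag{6.27}$$

where $O(\gamma^2)$ is in the sense of bounded operators in $F^2_{\rho_\Lambda}$. Similarly

$$H_{3,\Lambda} = [2\langle\phi''_\Lambda\rangle \otimes_s I] \otimes I + [I \otimes_s 2\langle\phi''_\Lambda\rangle] \otimes I + [I \otimes_s I] \otimes 2\langle\phi''_\Lambda\rangle + O(\gamma^2), \tag{6.28}$$

where $O(\gamma^2)$ is in the sense of bounded operators in $F^3_{\rho_\Lambda}$.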
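The leading parts of (6.27) and (6.28) are Kronecker sums, and the eigenvalues of a Kronecker sum are sums of eigenvalues of the factors. The sketch below checks this on a $2 \times 2$ example; the matrix `A` is an arbitrary stand-in for $2\langle\phi''_\Lambda\rangle$ and the helper names are assumptions made for illustration.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matvec(a, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


A = [[2, 1], [1, 2]]                 # stand-in for 2<phi''>, eigenvalues 1 and 3
I = [[1, 0], [0, 1]]
H2 = add(kron(A, I), kron(I, A))     # A (x) I + I (x) A

v, lam = [1, -1], 1                  # A v = 1 * v
w, mu = [1, 1], 3                    # A w = 3 * w
vw = [x * y for x in v for y in w]   # v (x) w
assert matvec(H2, vw) == [(lam + mu) * x for x in vw]
```

In particular the bottom of the spectrum of such a sum is the sum of the bottoms of the spectra of the factors, which is the mechanism behind the shifted thresholds for two and three tensor factors.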



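Let

$$P = [2\langle\phi''_\Lambda\rangle \otimes_s I] \otimes I + [I \otimes_s 2\langle\phi''_\Lambda\rangle] \otimes I + [I \otimes_s I] \otimes 2\langle\phi''_\Lambda\rangle. \tag{6.29}$$

Then $P^{-1} \in \mathcal{L}(F^3_{\rho_\Lambda}; F^3_{\rho_\Lambda})$, where $\rho \in \mathcal{W}$ satisfies (6.10), (6.11) with $\bar E$ in (6.19) further lowered to

$$E + 2(E - 2d) - \frac{1}{O(1)}. \tag{6.30}$$

Using (6.28), we have that $H_{3,\Lambda}^{-1} \in \mathcal{L}(F^3_{\rho_\Lambda}; F^3_{\rho_\Lambda})$. Using (6.24), we therefore have that

$$Q_{3,\Lambda}^{-1} \in \mathcal{L}(F^3_{\rho_\Lambda}; F^3_{\rho_\Lambda}). \tag{6.31}$$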

We now compute H13 : (H13 )jkmn = 2(e0 , φ jk emn ) −1/2 ∗ ∗ = (2M)mnm n (e0 , φ jk zm zn e0 ) m ,n

=

−1/2

(4)

(2M)mnm n φjkm n

(6.32)

m ,n

$= O(\gamma^2)$, where we used (4.12) and (4.18) (recall the proof of Lemma 4.2) to reach the last estimate (cf. sect. VII for more of such computations). Therefore by the usual argument, we obtain that $H_{13} = O(\gamma^2) \in \mathcal{L}(F^3_{\rho_\Lambda}; F^3_{\rho_\Lambda})$. Using (6.24), (6.31), we obtain that

$$\|Q_{13} Q_3^{-1} Q_{31}\|_{\mathcal{L}(\ell^2_\rho(\bar\Lambda), \ell^2_\rho(\bar\Lambda))} = O(\gamma^4). \tag{6.33}$$

Combining (6.33) with (6.24), we then obtain that

$$\|\tilde Q_\Lambda - H_{1,\Lambda}\|_{\mathcal{L}(\ell^2_\rho(\bar\Lambda); \ell^2_\rho(\bar\Lambda))} = O(\gamma^4), \tag{6.34}$$

where $\tilde Q_\Lambda$ is as defined in (6.23). Therefore to $O(\gamma^4)$, in order to study $\Lambda^1_+ E_{-+}^{-1}(0) \Lambda^1_+$, we only need to study $H_{1,\Lambda} = 2\langle\phi''_\Lambda\rangle$.

VII Asymptotics for the Averaged Green’s Function Recall from sect. I that σ(H) = [−2d, 2d] + γ supp dg almost surely and that E ∈ Iδ σ(H). Recall also from sect. VI that for such E’s and for all given µ, ν, we take µ, ν - Λ ⊂ Zd to be a cube of side length N . We have that lim (E − HΛ )−1 (µ, ν) g = lim GΛ (E; µ, ν) g = G(E; µ, ν) g .

Λ Zd

Λ Zd

For the computations in this section, it is more convenient to revert back to the point of view of associating (R2 )Λ to Λ. The indices i, j, µ, · · · etc. below are once again in Λ and xi , · · · ∈ R2 .


For the cubes mentioned above, using (3.31), we have (1) −1 −φ GΛ (E; µ, ν) g = (e−φ dx(α) e dx(α) µ , (∆φ ) ν ) =

α=1,2 −φ

(e

−φ dx(α) dx(α) µ , EΛ e ν )

−1 + (e−φ dxµ(α) , (E+,Λ E−+,Λ E−,Λ )e−φ dxν(α) ).

α=1,2

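(7.1)

From Theorem 5.1, $\|\rho_\Lambda E_\Lambda \rho_\Lambda^{-1}\|_{\mathcal{L}(L^2, L^2)} = O(1)$ uniformly in $\Lambda$, where as before, $\rho_\Lambda$ is the restriction of $\rho$ to $\Lambda$ and $\rho$ satisfies (6.10), (6.11) with

$$R = E + 3(E - 2d) - \frac{1}{O(1)}.$$

We therefore have

$$|(e^{-\phi}dx_\mu^{(\alpha)}, E_\Lambda e^{-\phi}dx_\nu^{(\alpha)})| = O(1)\,e^{-d_r(\mu, \nu)}$$

uniformly in $\Lambda$, where

$$r = R - \frac{1}{O(1)} = E + 3(E - 2d) - \frac{1}{O(1)}. \tag{7.2}$$

(Recall that this is approximately twice the decay rate that we expect from $\langle G_\Lambda(E; \mu, \nu)\rangle_g$.) As a consequence, we obtain that

$$\lim_{\Lambda \nearrow \mathbb{Z}^d} (e^{-\phi}dx_\mu^{(\alpha)}, E_\Lambda e^{-\phi}dx_\nu^{(\alpha)}) = O(1)\,e^{-d_r(\mu, \nu)}. \tag{7.3}$$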

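To study the second term in the second equality of (7.1), we use Theorem 5.1, (5.15) and (5.16) to obtain

$$\begin{aligned}
(e^{-\phi}dx_\mu^{(\alpha)}, (E_{+,\Lambda} E_{-+,\Lambda}^{-1} E_{-,\Lambda})(e^{-\phi}dx_\nu^{(\alpha)})) &= (R_{+,\Lambda}(e^{-\phi}dx_\mu^{(\alpha)}), E_{-+,\Lambda}^{-1} R_{+,\Lambda}(e^{-\phi}dx_\nu^{(\alpha)}))\,(1 + O(\gamma^4)) \\
&= (1 + O(\gamma^4))\,(\Lambda^1_+ (E_{-+,\Lambda}^{-1}) \Lambda^1_+)(\mu, \nu),
\end{aligned} \tag{7.4}$$

where $O(\gamma^4)$ is uniform in $\Lambda$. Using (6.34), (6.23) we have

$$(\Lambda^1_+ (E_{-+,\Lambda}^{-1}) \Lambda^1_+)(\mu, \nu) = (2\langle\phi''_\Lambda\rangle + O(\gamma^4))^{-1}(\mu, \nu) \tag{7.5}$$

where $O(\gamma^4)$ (uniformly in $\Lambda$) is in the sense of $\mathcal{L}(\ell^2_{\rho_\Lambda}(\bar\Lambda); \ell^2_{\rho_\Lambda}(\bar\Lambda))$ and $\rho_\Lambda$ is the restriction of $\rho$ to $\Lambda$ with $R$ lowered to

$$R = E + 2(E - 2d) - \frac{1}{O(1)}.$$

To simplify things further, we use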



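Lemma 7.1.

$$\begin{aligned}
\langle\phi''_\Lambda\rangle = \Big( E - \Delta_\Lambda - \gamma\langle v\rangle_g &- \gamma^2 \operatorname{diag}\big\{\langle(v - \langle v\rangle_g)^2\rangle_g\,[(E - \Delta_\Lambda - \gamma\langle v\rangle_g)^{-1}(j, j)]\big\} \\
&- \gamma^3 \operatorname{diag}\big\{\langle(v - \langle v\rangle_g)^3\rangle_g\,[(E - \Delta_\Lambda - \gamma\langle v\rangle_g)^{-1}(j, j)]^2\big\} \Big) \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + O(\gamma^4),
\end{aligned}$$

where the operator in the parenthesis acts on $\ell^2(\Lambda)$, and $O(\gamma^4)$ (uniformly in $\Lambda$) is in the sense of $\mathcal{L}(\ell^2_{\rho_\Lambda}(\bar\Lambda); \ell^2_{\rho_\Lambda}(\bar\Lambda))$ with

$$R = E + 2(E - 2d) - \frac{1}{O(1)}$$

as in (6.30).

Proof. We use Proposition 2.2 to estimate (6.6). We estimate term by term. To compute the integral, it is convenient to go back to the supersymmetric representation. We have

$$\begin{aligned}
\langle k'(\gamma x_i \cdot x_i)\rangle &= \int k'(\gamma x_i \cdot x_i)\, e^{-2\phi} \prod_{j \in \Lambda} d^2x_j \\
&= \int k'(\gamma x_i \cdot x_i)\, e^{-\left(\sum_{j \in \Lambda} E X_j \cdot X_j - \sum_{j,k \in \Lambda,\ |j-k|_1 = 1} X_j \cdot X_k - \sum_{j \in \Lambda} k(\gamma X_j \cdot X_j)\right)} \prod_{j \in \Lambda} d^2X_j.
\end{aligned} \tag{7.6}$$

Since

$$\int k'(\gamma X_i \cdot X_i)\, e^{-\left(\sum_{j \in \Lambda} E X_j \cdot X_j - \sum_{j,k \in \Lambda,\ |j-k|_1 = 1} X_j \cdot X_k - \sum_{j \in \Lambda} k(\gamma X_j \cdot X_j)\right)} \prod_{j \in \Lambda} d^2X_j = k'(0) = \langle v\rangle_g \tag{7.7}$$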

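by using Proposition 2.2, and $k'(\gamma X_i \cdot X_i) = k'(\gamma x_i \cdot x_i) + \xi_i\eta_i\,\gamma k''(\gamma x_i \cdot x_i)$, we have

$$\langle k'(\gamma x_i \cdot x_i)\rangle = \langle v\rangle_g + \gamma \int \eta_i\xi_i\, k''(\gamma x_i \cdot x_i)\, e^{-\left(\sum_{j \in \Lambda} E X_j \cdot X_j - \sum_{j,k \in \Lambda,\ |j-k|_1 = 1} X_j \cdot X_k - \sum_{j \in \Lambda} k(\gamma X_j \cdot X_j)\right)} \prod_{j \in \Lambda} d\xi_j\,d\eta_j \prod_{j \in \Lambda} d^2x_j \tag{7.8}$$

$$= \langle v\rangle_g + \gamma \int k''(\gamma x_i \cdot x_i)\,\big(E - \Delta_\Lambda - \gamma \operatorname{diag} k'(\gamma x_j \cdot x_j)\big)^{-1}_{ii}\, e^{-2\phi} \prod_{j \in \Lambda} d^2x_j, \tag{7.9}$$

where we used (2.12) and integrated over the anti-commutative variables (cf. also (2.17)). Hence the sum of the average of the third and the fifth term in the RHS of (6.6) gives $-\gamma\langle v\rangle_g$.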


We now look at the fourth term in (6.6). Using similar arguments as in (7.6)– (7.9), we have xi xi k (γxi · xi ) 1 = δαβ xi · xi k (γxi · xi ) 2 1 = δαβ ηi ξi k (γxi · xi ) 2 (α) (β)

e−(

j∈Λ

EXj ·Xj − j,k∈Λ,|j−k|

1 =1


d2 Xj

j∈Λ

1 + δαβ γxi · xi k (γxi · xi )(E − ∆Λ − γ diag k (γxj · xj ))−1 ii . 2

(7.10) After multiplying by −4γ 2 , the second term in the RHS of (7.10) cancels the average of the sixth term. So we only need to estimate the first term in the RHS of (7.10). After integrating out the anti-commutative variables, we have ηi ξi k (γxi · xi )e−( j∈Λ EXj ·Xj − j,k∈Λ,|j−k|1 =1 Xj ·Xk − j∈Λ k(γXj ·Xj )) d2 Xj

−2φ k (γxi · xi )(E − ∆Λ − γ diag k (γxj · xj ))−1 ii e

=

j∈Λ

d2 xj

j∈Λ

k (γxi · xi ){(E − ∆Λ − γk (0))−1 ii

=

+ γ[(E − ∆Λ − γk (0))−1 (diag(k − k (0)))(E − ∆Λ − k (0))−1 ]ii }e−2φ d2 xj + O(γ 2 ) j∈Λ

= k (0)(E − ∆Λ −

γv g )−1 ii

2 2 + γk (0)[(E − ∆Λ − γv g )−1 ii ] + O(γ )

−1 2 3 = (v − v g )2 g (E − ∆Λ − γv g )−1 ii + γ(v − v g ) g [(E − ∆Λ − γv g )ii ]

$+\ O(\gamma^2)$, (7.11)

where we used Proposition 2.2 and the resolvent series for $\big(E - \Delta_\Lambda - \gamma \operatorname{diag} k'(\gamma x_j \cdot x_j)\big)^{-1}_{ii}$ about

$$E - \Delta_\Lambda - \gamma k'(0) = E - \Delta_\Lambda - \gamma\langle v\rangle_g.$$

Since the last term of the RHS of (6.6) is of order $O(\gamma^4)$ by using a similar argument as in (7.10), (7.11), we obtain the Lemma by using (2.8), (2.9).

Remark. We note the rather remarkable cancellations that occurred in the computation of $\langle\phi''\rangle$. We suspect that this continues to higher orders.
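To get a feeling for the size of the diagonal corrections in Lemma 7.1, one can evaluate them in the simplest translation-invariant setting. The sketch below is an illustration under explicit assumptions: $d = 1$, the finite-volume $\Delta_\Lambda$ is replaced by the nearest-neighbour Laplacian on all of $\mathbb{Z}$ (entries 1 for $|j - k| = 1$, spectrum $[-2, 2]$), and the energy lies above the spectrum. The moments `mean`, `m2`, `m3` stand for $\langle v\rangle_g$ and the second and third central moments; all names are hypothetical.

```python
import math


def green_1d(E, n):
    """(E - Delta)^{-1}(n, 0) on l^2(Z) with Delta(j, k) = 1 for |j - k| = 1, valid for E > 2."""
    root = math.sqrt(E * E - 4.0)
    lam = (E - root) / 2.0               # decaying root of lam + 1/lam = E
    return lam ** abs(n) / root


def decay_rate(E):
    """Exponential decay rate of green_1d(E, n) in |n|, valid for E > 2."""
    return math.acosh(E / 2.0)


def effective_energy(E, gamma, mean, m2, m3):
    """E - g<v> - g^2 m2 G(0,0) - g^3 m3 G(0,0)^2, with G = (E - Delta - g<v>)^{-1} on Z."""
    g00 = green_1d(E - gamma * mean, 0)
    return E - gamma * mean - gamma ** 2 * m2 * g00 - gamma ** 3 * m3 * g00 ** 2
```

For example, `decay_rate(effective_energy(3.0, 0.1, 0.5, 0.2, 0.1))` gives the decay rate of the free resolvent at the shifted energy in this toy model; it is not a substitute for the weighted estimates of sect. V and VI.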



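For all $\delta > 0$, let $I_\delta$ be defined as in (1.11). Assuming $dg$ satisfies (H1-3), we then have the principal result of the paper:

Theorem 7.2. For all $\delta > 0$, there exists $\gamma_0 > 0$, such that for all $0 < \gamma < \gamma_0$, all $E \in I_\delta$, all $\mu, \nu \in \mathbb{Z}^d$,

$$\begin{aligned}
\log\langle G(E, \mu, \nu)\rangle_g &= \log\langle G(E, \mu - \nu, 0)\rangle_g \\
&= \log\Big[ E - \Delta - \gamma\langle v\rangle_g - \gamma^2\langle(v - \langle v\rangle_g)^2\rangle_g\,[(E - \Delta - \gamma\langle v\rangle_g)^{-1}(0, 0)] \\
&\qquad\quad - \gamma^3\langle(v - \langle v\rangle_g)^3\rangle_g\,[(E - \Delta - \gamma\langle v\rangle_g)^{-1}(0, 0)]^2 \Big]^{-1}(\mu - \nu, 0) + O(\gamma^4)(|\mu - \nu| + 1) \\
&= \log(\mathcal{E} - \Delta)^{-1}(\mu - \nu, 0) + O(\gamma^4)(|\mu - \nu| + 1),
\end{aligned} \tag{7.12}$$

where

$$\mathcal{E} := E - \gamma\langle v\rangle_g - \gamma^2\langle(v - \langle v\rangle_g)^2\rangle_g\,[(E - \Delta - \gamma\langle v\rangle_g)^{-1}(0, 0)] - \gamma^3\langle(v - \langle v\rangle_g)^3\rangle_g\,[(E - \Delta - \gamma\langle v\rangle_g)^{-1}(0, 0)]^2 \tag{7.13}$$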

$\notin [-2d, 2d]$.

Proof. Since all the estimates are uniform in $\Lambda$ and the limit $\Lambda \nearrow \mathbb{Z}^d$ exists, combining (7.1)–(7.5), Lemma 7.1, (6.23) and taking the limit $\Lambda \nearrow \mathbb{Z}^d$, we obtain the Theorem.

Acknowledgment. We are indebted to J. Sjöstrand for many inspiring conversations regarding the Witten Laplacian approach and for providing some preliminary sketches (in Swedish) on higher order Grushin problems in the semi-classical context. We are grateful to T. Spencer for sharing his deep insights on field theory and for pointing out the references [CC, P-L, Sch]. We thank the referee for his constructive comments. We learned of this problem from A. S. Sznitman, with whom we also had various beneficial discussions. We thank J. Fröhlich for providing the references [BF1, BF2], T. Bodineau, E. Bolthausen, B. Derrida, B. Duplantier for useful conversations, B. Helffer, Y. LeJan and J. Sjöstrand for helpful comments on the manuscript. Finally we thank Mme Bardot for her help with typing. This work was started when the author was a member at the Institute for Advanced Study in the year 1997-1998. The support of the NSF grant DMS 9304580 and the European network TMR program FMRX-CT 960001 is gratefully acknowledged.

References

[BJS] V. Bach, T. Jecko and J. Sjöstrand, Correlation asymptotics of classical lattice spin systems with non-convex Hamilton function at low temperature, Ann. H. Poincaré 1 (1), 59 (2000).


[Be]

F. A. Berezin, The Method of Second Quantization, 1966 Academic Press, New York.

[BGV]

N. Berline, E. Getzler and M. Vergne, Heat kernels and Dirac operators, Springer-Verlag, 1992.

[BCKP] A. Bovier, M. Campanino, A. Klein, and F. Perez, Smoothness of the density of states in the Anderson model at high disorder, Com. Math. Phys. 114, 439-461(1988). [BF1]

J. Bricmont, J. Fröhlich, Statistical mechanics methods in particle structure analysis of lattice field theory [I] General theoryNuclear Phys. B 251 [FS13], 517-552 (1985).

[BF2]

J. Bricmont, J. Fröhlich, Statistical mechanics methods in particle structure analysis of lattice field theory [II] Scalar and surface models, Com. Math. Phys. 98, 553-578 (1985).

[Bol]

E. Bolthausen, A note on the diffusion of directed polymers in a random environment, Com. Math. Phys. 123, 529–534 (1989).

[CC]

J. T. Chayes and L. Chayes, Ornstein-Zernike behavior for self-Avoiding walks at all non-critical temperature, Com. Math. Phys. 105, 221-238 (1986).

[CFKS] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schr¨ odinger Operators, Springer-Verlag (1987). [EW1]

J.-P Eckmann and E. C. Wayne, Lyapunov spectra for infinite chains of non-linear oscillators J. Stat. Phys. 50, 853-878 (1987).

[EW2]

J.-P Eckmann and E. C. Wayne, The largest Lyapunov exponent for random matrices and directed polymers in a random environment Com. Math. Phys. 121, 147-175 (1989).

[Ef]

K. B. Effetov, Supersymmetry and the theory of disordered metals Adv. Phys. 32, 53-127 (1983).

[F]

W. Feller, An Introduction to Probability Theory and Its Applications, John-Wiley and Sons (1966).

[FFS]

R. Fernandez, J. Fröhlich, A. Sokal, Random Walks, Critical Phenomena and Triviality in Quantum Field Theory, Springer-Verlag (1992).

[FKG]

C. M. Fortuin, P. W. Kasteleyn, J. Ginibre, Correlation inequalities on some partially ordered sets, Com. Math. Phys. 22, 89-103 (1971).

[Fr]

M. Freidlin, Functional Integration and Partial Differential Equations, Ann. of Math. Studies 109, Princeton University Press (1985).



[H]

L. Hörmander, Introduction to Complex Analysis in Several Variables, Elsevier (North-Holland Mathematical Library, Vol. 7), Amsterdam, 3rd edition (1990).

[HS1]

B. Helffer and J. Sjöstrand, Multiple wells in the semi-classical limit I, Com. PDE 9(4), 337-408 (1984).

[HS2]

B. Helffer and J. Sjöstrand, On the correlation for Kac like models in the convex case, J. Stat. Phys. 74 (1,2), 349-409 (1994).

[IS]

J. Z. Imbrie and T. Spencer, Diffusion of directed polymers in a random environment, J. of Stat. Phys., 609-626 (1988).

[Jo]

J. Johnsen, On spectral properties of Witten-Laplacians, their range projections and Brascamp-Lieb inequality, Aalborg University preprint (1998).

[K]

A. Klein, The supersymmetric replica trick and smoothness of the density of states for the random Schrödinger operators, Proceedings of Symposium in Pure Mathematics 51 (1990).

[KS]

A. Klein, A. Spies, Smoothness of the density of states in the Anderson model on a one-dimensional strip, Ann. of Phys. 183, 352-398 (1988).

[Kom]

T. Komorowski, Brownian motion in a Poissonian obstacle field, Séminaire N. Bourbaki 853 (1988).

[La]

G. F. Lawler, Intersection of Random Walks, Birkhäuser (1991).

[Li]

T. M. Liggett, Interacting Particle Systems, Springer-Verlag (1985).

[MS]

N. Madras, G. Slade, The Self-Avoiding Walk, Birkhäuser (1993).

[PF]

L. Pastur and A. Figotin, Spectra of Random and Almost Periodic Operators, Springer (1992).

[Po]

T. Povel, The one dimensional annealed δ-Lyapunov exponent, Ann. IHP, Probab. et Stat. 34, 61-72 (1998).

[P-L]

P. J. Paes-Leme, Ornstein-Zernike and analyticity properties for classical lattice spin systems, J. Ann. Phys. 115, 367 (1978).

[Sch]

R. S. Schor, The particle structure of ν-dimensional Ising models at low temperature, Commun. Math. Phys. 59, 213 (1978).

[Sin]

Ya. G. Sinai, A remark concerning random walk with random potentials, Fund. Math 147, 173-180 (1995).

[Sj1]

J. Sjöstrand, Correlation asymptotics and Witten Laplacians, Algebra and Analysis 8, (1996).

[Sj2]

J. Sjöstrand, (In preparation).

[Spi]

F. Spitzer, Principles of Random Walks, D. Van Nostrand Company, Inc (1964).

[Spe]

T. Spencer, (Private conversations).

[Sz1]

A. S. Sznitman, Brownian Motion, Obstacles and Random Media, Springer Monographs in Mathematics (1998).

[Sz2]

A. S. Sznitman, Shape theorem, Lyapunov exponents and large deviations for Brownian motion in a Poissonian potential, Com. Pure Appl. Math (1994).

[SW1]

J. Sjöstrand and W. M. Wang, Supersymmetric measures and maximum principles in the complex domain: decay of Green's functions, Ann. Scient. Éc. Norm. Sup. 32 (1999).

[SW2]

J. Sjöstrand and W. M. Wang, Decay of averaged Green's functions: a direct approach, Ann. Scient. Éc. Norm. Sup. 32 (1999).

[V]

T. Voronov, Geometric Integration Theory on Super-manifolds, Mathematical Physics Review, USSR Academy of Sciences (1993), Moscow.

[W]

E. Witten, Supersymmetry and Morse theory, J. Diff. Geom 17, 661-692 (1982).

[Z]

M. P. W. Zerner, Directional decay of the Green's function for a random non-negative potential on $\mathbb{Z}^d$, The Annals of Applied Probability 246-280 (1998).

Wei-Min Wang
UMR 8628 du CNRS
Département de Mathématiques
Université Paris-Sud
Bâtiment 425
F-91405 Orsay cedex, France
e-mail : [email protected]

Communicated by Michael Aizenman
submitted 20/06/00, accepted 04/11/00



Aharonov–Bohm Effect in Scattering by Point-like Magnetic Fields at Large Separation

H. T. Ito, H. Tamura

Abstract. The aim is to study the Aharonov–Bohm effect in the scattering by two point-like magnetic fields at large separation in two dimensions. We analyze the asymptotic behavior of the scattering amplitude when the distance between the centers of the two fields goes to infinity. The result depends heavily on the fluxes of the fields and on the incident and final directions.

1 Introduction

Magnetic potentials have a direct significance for the motion of particles in quantum mechanics. This property is known as the Aharonov–Bohm effect ([3]), and an extensive physics literature can be found in the recent book [2]. In this work we consider the scattering by two δ-like magnetic fields at large separation in two dimensions, and we analyze the asymptotic behavior of the scattering amplitude when the distance between the centers of the two fields goes to infinity. Even if a field is compactly supported, the corresponding magnetic potential is not expected to fall off rapidly. In general, it has the long-range property at infinity. We study how the Aharonov–Bohm effect is reflected in the scattering by magnetic fields at large separation.

We work in the two dimensional space $\mathbb{R}^2$ throughout the entire discussion. We denote by $x = (x_1, x_2)$ a generic point, and we write
\[ H(A) = (-i\nabla - A)^2 = \sum_{j=1}^{2} (-i\partial_j - a_j)^2, \qquad \partial_j = \partial/\partial x_j, \]
for the Schrödinger operator with magnetic potential $A(x) = (a_1(x), a_2(x)) : \mathbb{R}^2 \to \mathbb{R}^2$. The magnetic field $b(x)$ is defined as $b = \nabla \times A = \partial_1 a_2 - \partial_2 a_1$, and the quantity $\alpha = (2\pi)^{-1} \int b(x)\,dx$ is called the total flux of the field $b$, where an integral with no domain attached is taken over the whole space. We often use this abbreviation.

We begin with a brief review of the scattering theory for the Hamiltonian with magnetic field supported on a single point. Such a Hamiltonian is regarded as one of the solvable models in quantum mechanics, and the explicit form of the scattering amplitude has already been calculated ([3,17]). In section 2 we discuss the subject in some detail. Let $2\pi\alpha\delta(x)$ be the magnetic field with



flux $\alpha$ and center at the origin. The magnetic potential $A_\alpha(x)$ associated with the field is given by
\[ A_\alpha(x) = \alpha\left(-x_2/|x|^2,\ x_1/|x|^2\right) = \alpha\left(-\partial_2 \log|x|,\ \partial_1 \log|x|\right). \]
In fact, a simple calculation yields $\nabla \times A_\alpha = \alpha\,\Delta \log|x| = 2\pi\alpha\delta(x)$. If we denote by $\gamma(x)$ the azimuth angle from the positive $x_1$ axis, then $A_\alpha$ can be written in the different form
\[ A_\alpha(x) = \alpha\nabla\gamma(x) = \alpha\left(-x_2/|x|^2,\ x_1/|x|^2\right). \tag{1.1} \]
This representation is important. The same relation remains true for the azimuth angle $\gamma(x;\omega)$ from direction $\omega \in S^1$, where $S^1$ is the unit circle.

Let $H_0 = -\Delta$ be the free Hamiltonian and define $H_\alpha$ by $H_\alpha = H(A_\alpha)$. The potential $A_\alpha(x)$ has a strong singularity at the origin, and it is known ([1,7]) that the formally defined operator is not essentially self-adjoint in $C_0^\infty(\mathbb{R}^2 \setminus \{0\})$. We have to impose some boundary condition at the origin. The operator $H_\alpha$ becomes self-adjoint in $L^2 = L^2(\mathbb{R}^2)$ under the condition $\lim_{|x|\to 0} |u(x)| < \infty$, and it is called the Aharonov–Bohm Hamiltonian. If, in particular, $\alpha \notin \mathbb{Z}$ is not an integer, the limit is zero: $\lim_{|x|\to 0} |u(x)| = 0$.

We now denote by $f(\omega \to \tilde\omega; E, H_\alpha, H_0)$ the scattering amplitude for the scattering from initial direction $\omega$ to final direction $\tilde\omega$ at energy $E > 0$. If we identify the coordinates over $S^1$ with the azimuth angles from the positive $x_1$ axis, then the amplitude is given by
\[ f(\omega \to \tilde\omega) = c(E)\left( (\cos\alpha\pi - 1)\,\delta(\tilde\omega - \omega) - (i/\pi)\sin\alpha\pi\; e^{i[\alpha](\tilde\omega-\omega)} F_0(\tilde\omega - \omega) \right) \tag{1.2} \]
with $c(E) = (2\pi/i\sqrt{E})^{1/2}$, where the Gauss notation $[\alpha]$ denotes the maximal integer not exceeding $\alpha$ and $F_0(\theta) = \mathrm{v.p.}\; e^{i\theta}/(e^{i\theta} - 1)$.

We move to the scattering by two δ-like magnetic fields. Let $2\pi\alpha_1\delta(x)$ and $2\pi\alpha_2\delta(x - d)$ be given magnetic fields with centers at the origin and at $d \in \mathbb{R}^2$ respectively. We consider the Hamiltonian $H_d = H(A_{\alpha_1} + A_{\alpha_2,d})$, where
\[ A_{\alpha_2,d}(x) = A_{\alpha_2}(x - d), \qquad A_{\alpha_j}(x) = \alpha_j\nabla\gamma(x) = \alpha_j\left(-x_2/|x|^2,\ x_1/|x|^2\right) \tag{1.3} \]
is the magnetic potential associated with the field $2\pi\alpha_j\delta(x)$. In section 7, we will study the basic spectral problems such as the self-adjointness, the absence of bound states, the principle of limiting absorption and the asymptotic completeness of wave operators for $H_d$. According to the result there, $H_d$ becomes self-adjoint with domain
\[ D(H_d) = \{u \in L^2 : H_d u \in L^2,\ \lim_{|x|\to 0} |u(x)| < \infty,\ \lim_{|x-d|\to 0} |u(x)| < \infty\}, \tag{1.4} \]


where $H_d u$ is understood in the distributional sense. We set
\[ H_j = H(A_{\alpha_j}), \qquad j = 1, 2, \]
and we denote by $f_d(\omega \to \tilde\omega; E)$ and $f_j(\omega \to \tilde\omega; E)$ the scattering amplitudes for the pairs $(H_d, H_0)$ and $(H_j, H_0)$ respectively. By (1.2), the scattering amplitude for $(H_j, H_0)$ is explicitly calculated as
\[ f_j(\omega \to \tilde\omega; E) = -c(E)(i/\pi)\sin\alpha_j\pi\; e^{i[\alpha_j](\tilde\omega-\omega)} F_0(\tilde\omega - \omega) \]
for $\omega \neq \tilde\omega$.

The aim here is to study the asymptotic behavior as $|d| \to \infty$ of $f_d(\omega \to \tilde\omega; E)$. If we make the change of variables $x \to |d|y$, then this becomes the problem of the asymptotic behavior at high energy $|d|^2 E$ of the scattering amplitude for the Hamiltonian $H(A_{\alpha_1} + \tilde A_{\alpha_2})$, where $\tilde A_{\alpha_2}(x) = \alpha_2\nabla\gamma(x - \hat d)$ and $\hat d = d/|d| \in S^1$. We fix the notation. We define $\tau(x;\omega,\tilde\omega)$ by
\[ \tau(x;\omega,\tilde\omega) = \gamma(x;\omega) - \gamma(x;-\tilde\omega) \]
and we interpret $\exp(i\alpha\gamma(x;\omega))$ with $\omega = x/|x|$ as
\[ \exp(i\alpha\gamma(x;\omega)) := (1 + \exp(i2\alpha\pi))/2 = \cos\alpha\pi \times \exp(i\alpha\pi). \]
The result obtained is formulated as the following theorem.

Theorem 1.1 Let the notation be as above and let
\[ f_{2,d}(\omega \to \tilde\omega; E) = \exp(-i\sqrt{E}\, d\cdot(\tilde\omega - \omega))\, f_2(\omega \to \tilde\omega; E) \]
be the scattering amplitude for the pair $(H_{2,d}, H_0)$, $H_{2,d} = H(A_{\alpha_2,d})$. Fix the direction $\hat d = d/|d|$. If $\omega \neq \tilde\omega$, then $f_d(\omega \to \tilde\omega; E)$ behaves like
\[ f_d(\omega \to \tilde\omega; E) = \exp(i\alpha_2\tau(-d;\omega,\tilde\omega))\, f_1(\omega \to \tilde\omega; E) + \exp(i\alpha_1\tau(d;\omega,\tilde\omega))\, f_{2,d}(\omega \to \tilde\omega; E) + o(1) \]
as $|d| \to \infty$. In particular, the backward scattering amplitudes obey
\[ f_d(\omega \to -\omega; E) = f_1(\omega \to -\omega; E) + f_{2,d}(\omega \to -\omega; E) + o(1) \]
for $\omega \neq \pm\hat d$, and
\[ f_d(\hat d \to -\hat d; E) = f_1(\hat d \to -\hat d; E) + (\cos\alpha_1\pi)^2 f_{2,d}(\hat d \to -\hat d; E) + o(1), \]
\[ f_d(-\hat d \to \hat d; E) = (\cos\alpha_2\pi)^2 f_1(-\hat d \to \hat d; E) + f_{2,d}(-\hat d \to \hat d; E) + o(1). \]
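The phase factors in the backward directions can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `gamma`, `phase` and `tau_factor` are hypothetical, the angle is taken in $[0, 2\pi)$ by assumption, and the convention $\exp(i\alpha\gamma(x;\omega)) := (1 + \exp(i2\alpha\pi))/2$ on the ray $\omega = x/|x|$ is applied to both $\alpha$ and $-\alpha$.

```python
import cmath
import math

def gamma(x, omega):
    """Azimuth angle of x measured from the direction omega, taken in [0, 2*pi)."""
    angle = math.atan2(x[1], x[0]) - math.atan2(omega[1], omega[0])
    return angle % (2.0 * math.pi)

def phase(alpha, x, omega, tol=1e-12):
    """exp(i*alpha*gamma(x; omega)), with the averaging convention when x/|x| = omega."""
    r = math.hypot(x[0], x[1])
    if math.hypot(x[0] / r - omega[0], x[1] / r - omega[1]) < tol:
        # On the ray the angle jumps between 0 and 2*pi; the text takes the mean
        # (1 + exp(2*pi*i*alpha)) / 2 = cos(alpha*pi) * exp(i*alpha*pi).
        return (1.0 + cmath.exp(2j * math.pi * alpha)) / 2.0
    return cmath.exp(1j * alpha * gamma(x, omega))

def tau_factor(alpha, x, omega_in, omega_out):
    """exp(i*alpha*tau), tau = gamma(x; omega_in) - gamma(x; -omega_out)."""
    minus_out = (-omega_out[0], -omega_out[1])
    return phase(alpha, x, omega_in) * phase(-alpha, x, minus_out)

if __name__ == "__main__":
    d = (10.0, 0.0)          # illustrative separation vector (assumption)
    d_hat = (1.0, 0.0)
    alpha1 = 0.3             # illustrative flux (assumption)
    factor = tau_factor(alpha1, d, d_hat, (-1.0, 0.0))
    print(factor, math.cos(alpha1 * math.pi) ** 2)
```

With these conventions the factor in front of $f_{2,d}(\hat d \to -\hat d; E)$ reduces to $(\cos\alpha_1\pi)^2$, and it is numerically zero for a half-integer flux.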




As stated at the beginning, the motion of quantum particles is subject to the influence of magnetic potentials as well as of magnetic fields. This quantum property can be seen in the asymptotic formula above. In fact, the first field $2\pi\alpha_1\delta(x)$ influences the scattering by the second one through the phase factor $\exp(i\alpha_1\tau(d;\omega,\tilde\omega))$ in front of $f_{2,d}(\omega \to \tilde\omega; E)$, although the centers of the two fields are far away from each other. This can be seen more clearly in the backward scattering amplitude $f_d(\hat d \to -\hat d; E)$ or $f_d(-\hat d \to \hat d; E)$. If, in particular, the flux $\alpha_1$ is a half-integer, then the scattering by the second field does not make any contribution to the leading term of the asymptotic formula for $f_d(\hat d \to -\hat d; E)$.

An extensive literature on the spectral and scattering theory of Schrödinger operators with potentials supported on a discrete set of points can be found in the book [4], and the work [11] has recently dealt with the asymptotic behavior of the scattering amplitude for the Schrödinger operator $-\Delta + V_1(x) + V_2(x - d)$ with potentials falling off rapidly at infinity. In the case of potential scattering, we do not have to modify phase factors, and the asymptotic formula splits completely into the sum of the two scattering amplitudes corresponding to the potentials $V_1$ and $V_2(\cdot - d)$. However, the situation is quite different in the scattering by magnetic fields. Roughly speaking, the difficulty comes from the long-range property of magnetic potentials. Several new devices are required at many stages of the argument. The micro-local resolvent estimates for $H_d$ and the asymptotic behavior at infinity of the eigenfunctions of $H_1 = H(A_{\alpha_1})$ or $H_2$ play an important role in proving the theorem.

We end the section with a brief comment on the extension to the case of scattering by point-like magnetic fields supported on several points. This is a natural problem. The analysis depends heavily on the location of the centers and on the initial and final directions. Some new difficulties may arise. However, the idea developed here is expected to be useful for such a generalization. We are going to discuss the details elsewhere.
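As a small numerical companion to formula (1.2), the sketch below evaluates the off-diagonal part of the single-flux amplitude and compares $|f|^2$ with the closed form $\sin^2(\alpha\pi) / (2\pi\sqrt{E}\,\sin^2(\theta/2))$ that follows from $|e^{i\theta} - 1| = 2|\sin(\theta/2)|$. The function names are hypothetical, and the principal branch of the square root in $c(E)$ is an assumption that does not affect the modulus.

```python
import cmath
import math

def c_of_E(E):
    """c(E) = (2*pi / (i*sqrt(E)))**(1/2); principal branch assumed."""
    return cmath.sqrt(2.0 * math.pi / (1j * math.sqrt(E)))

def F0(theta):
    """F0(theta) = exp(i*theta) / (exp(i*theta) - 1) for theta != 0 mod 2*pi."""
    z = cmath.exp(1j * theta)
    return z / (z - 1.0)

def amplitude(alpha, theta, E):
    """Off-diagonal part of (1.2), with theta = omega_tilde - omega != 0."""
    n = math.floor(alpha)  # Gauss bracket [alpha]
    return (-c_of_E(E) * (1j / math.pi) * math.sin(alpha * math.pi)
            * cmath.exp(1j * n * theta) * F0(theta))

def cross_section(alpha, theta, E):
    """|f|^2 computed directly from the amplitude."""
    return abs(amplitude(alpha, theta, E)) ** 2

def closed_form(alpha, theta, E):
    """sin^2(alpha*pi) / (2*pi*sqrt(E)*sin^2(theta/2))."""
    return (math.sin(alpha * math.pi) ** 2
            / (2.0 * math.pi * math.sqrt(E) * math.sin(theta / 2.0) ** 2))

if __name__ == "__main__":
    print(cross_section(0.3, 1.0, 2.0), closed_form(0.3, 1.0, 2.0))
```

The modulus depends on $\alpha$ only through $\sin^2(\alpha\pi)$, so it is periodic in the flux with period one and vanishes for integer flux.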

2 Scattering by δ-like magnetic field

The present section is devoted to the scattering theory for the Schrödinger operator with point-like magnetic field supported on a single point. Such an operator is called the Aharonov–Bohm Hamiltonian.

2.1. We first review the results from [3,17]. We consider the Hamiltonian
\[ H_\alpha = H(A_\alpha), \qquad A_\alpha(x) = \alpha\nabla\gamma(x) = \alpha\left(-x_2/|x|^2,\ x_1/|x|^2\right), \]
which has the δ-like field $2\pi\alpha\delta(x)$ at the origin. We know ([1,7]) that $H_\alpha$ is self-adjoint with domain
\[ D(H_\alpha) = \{u \in L^2 : H_\alpha u \in L^2,\ \lim_{|x|\to 0} |u(x)| < \infty\}, \]


$H_\alpha u$ being understood in $\mathcal{D}'$, and that the wave operator
\[ W_\pm(H_\alpha, H_0) = \operatorname*{s-lim}_{t\to\pm\infty} \exp(itH_\alpha)\exp(-itH_0) : L^2 \to L^2 \]
exists and is asymptotically complete: $\operatorname{Ran} W_\pm(H_\alpha, H_0) = L^2$. Hence the scattering operator
\[ S(H_\alpha, H_0) = W_+^*(H_\alpha, H_0)\, W_-(H_\alpha, H_0) : L^2 \to L^2 \]
can be defined as a unitary operator.

We use the notation $\cdot$ to denote the scalar product in $\mathbb{R}^2$. Let $\varphi_0(x;\lambda,\omega) = \exp(i\sqrt{\lambda}\, x\cdot\omega)$ be the generalized eigenfunction of the free Hamiltonian $H_0 = -\Delta$, where $\lambda > 0$ and $\omega \in S^1$. The unitary mapping $F : L^2 \to L^2((0,\infty); d\lambda) \otimes L^2(S^1)$ defined by
\[ (Fu)(\lambda,\omega) = 2^{-1/2}(2\pi)^{-1} \int \bar\varphi_0(x;\lambda,\omega)\, u(x)\, dx \tag{2.1} \]
decomposes $S(H_\alpha, H_0)$ into the direct integral
\[ S(H_\alpha, H_0) \simeq F S(H_\alpha, H_0) F^* = \int_0^\infty \oplus\, S(\lambda; H_\alpha, H_0)\, d\lambda, \]
where the fiber $S(\lambda; H_\alpha, H_0) : L^2(S^1) \to L^2(S^1)$ is called the scattering matrix at energy $\lambda > 0$, and it acts as
\[ \left(S(\lambda; H_\alpha, H_0)(Fu)(\lambda,\cdot)\right)(\omega) = \left(F S(H_\alpha, H_0)u\right)(\lambda,\omega) \]
for $u \in L^2$.

We calculate the generalized eigenfunction $\varphi_\mp(x;\lambda,\omega)$ of $H_\alpha$ to derive the integral kernel of $S(\lambda; H_\alpha, H_0)$. The operator $H_\alpha$ is rotationally invariant. We work in the polar coordinate system $(r,\theta)$. Let $\Lambda_l$, $l \in \mathbb{Z}$, be the eigenspace associated with eigenvalue $l$ of the operator $-i\partial/\partial\theta$ acting on $L^2(S^1)$. Then
\[ L^2((0,\infty); dr) \otimes L^2(S^1) = \bigoplus_{l\in\mathbb{Z}} L^2((0,\infty); dr) \otimes \Lambda_l. \]
We define the unitary mapping
\[ (Uu)(r,\theta) = r^{1/2} u(r\theta) : L^2 \to L^2((0,\infty); dr) \otimes L^2(S^1). \]
The mapping $U$ yields the partial wave expansion
\[ H_\alpha \simeq U H_\alpha U^* = \bigoplus_{l\in\mathbb{Z}} (H_{l\alpha} \otimes \mathrm{Id}), \]
where $\mathrm{Id}$ is the identity operator and
\[ H_{l\alpha} = -\partial_r^2 + (\nu^2 - 1/4)\, r^{-2}, \qquad \nu = |l - \alpha|, \]




is self-adjoint with domain
\[ D(H_{l\alpha}) = \{u \in L^2((0,\infty); dr) : H_{l\alpha}u \in L^2((0,\infty); dr),\ \lim_{r\to 0} r^{-1/2}|u(r)| < \infty\}. \]
The eigenfunction $\varphi_\mp$ is formally defined as $\varphi_\mp = W_\pm(H_\alpha, H_0)\varphi_0$ by using the intertwining property of wave operators. However, this does not have a precise meaning, because $\varphi_0(x;\lambda,\omega)$ is not in $L^2$. The precise definition requires the expansion formula
\[ \varphi_0(x;\lambda,\omega) = \sum_{l\in\mathbb{Z}} \exp(i|l|\pi/2)\, \exp(il\gamma(x;\omega))\, J_{|l|}(\sqrt{\lambda}|x|) \tag{2.2} \]
in terms of the Bessel functions $J_p(r)$. The function $J_p(r)$ satisfies the asymptotic formula
\[ J_p(r) = (2/\pi)^{1/2}\, r^{-1/2} \cos(r - (2p+1)\pi/4)\,\left(1 + g_N(r)\right) + O(r^{-N}), \qquad r \to \infty, \]
for any $N \gg 1$ large enough, where $g_N(r)$ obeys $(d/dr)^k g_N(r) = O(r^{-1-k})$. If we set
\[ e_{\mp l}(r) = \exp(\pm i|l|\pi/2)\, J_{|l|}(r) - \exp(\pm i\nu\pi/2)\, J_\nu(r), \]
then
\[ e_{\mp l}(r) = \exp(\mp ir)\left(C_{\mp l}\, r^{-1/2} + O(r^{-3/2})\right) + \exp(\pm ir)\, O(r^{-3/2}) \]
for some constant $C_{\mp l} \neq 0$. Hence $e_{-l}(r)$ satisfies the incoming radiation condition $e_{-l}' + i e_{-l} = O(r^{-3/2})$ at infinity, while $e_{+l}(r)$ satisfies the outgoing radiation condition $e_{+l}' - i e_{+l} = O(r^{-3/2})$. The simple relation $\exp(il\gamma(x;-\omega)) = \exp(i|l|\pi + il\gamma(x;\omega))$ holds between the azimuth angles $\gamma(x;\omega)$ and $\gamma(x;-\omega)$. If we take account of (2.2), then the eigenfunction $\varphi_\mp$ is given by
\[ \varphi_\mp(x;\lambda,\omega) = \sum_{l\in\mathbb{Z}} \exp(\pm i\nu\pi/2)\, \exp(il\gamma(x;\pm\omega))\, J_\nu(\sqrt{\lambda}|x|) \tag{2.3} \]
with $\nu = |l - \alpha|$ again. We can easily see that the series converges locally uniformly and that $\varphi_\mp$ satisfies $H_\alpha\varphi_\mp = \lambda\varphi_\mp$.

We often identify the coordinates over the unit circle $S^1$ with the azimuth angles from the positive $x_1$ axis. The scattering matrix $S(\lambda; H_\alpha, H_0)$ has the property $S(\lambda; H_\alpha, H_0) : \varphi_+(x;\lambda,\cdot) \to \varphi_-(x;\lambda,\cdot)$. A simple computation yields
\[ \exp(i\nu\pi/2)\exp(-il\gamma(x;-\omega)) = \exp(i(\nu - l)\pi)\, \exp(-i\nu\pi/2)\exp(-il\gamma(x;\omega)) \]


and hence the kernel of $S(\lambda; H_\alpha, H_0)$ is calculated as
\[ S(\theta',\theta;\lambda,H_\alpha,H_0) = (2\pi)^{-1} \sum_{l\in\mathbb{Z}} \exp(i(l-\nu)\pi)\, \exp(il(\theta'-\theta)). \]
According to [17], the sum on the right side equals
\[ \sum_{l\in\mathbb{Z}} \exp(i(l-\nu)\pi)\, \exp(il\theta) = 2\pi\left( \cos\alpha\pi\; \delta(\theta) - (i/\pi)\sin\alpha\pi\; e^{i[\alpha]\theta} F_0(\theta) \right), \]
where $F_0(\theta) = \mathrm{v.p.}\; e^{i\theta}/(e^{i\theta} - 1)$. Thus we obtain the representation (1.2) of the amplitude
\[ f(\omega \to \tilde\omega; E, H_\alpha, H_0) = c(E)\left( S(\tilde\omega,\omega;E,H_\alpha,H_0) - \delta(\tilde\omega - \omega) \right) \]
for the scattering from initial direction $\omega$ into final direction $\tilde\omega$ at energy $E > 0$, where $c(E) = (2\pi/i\sqrt{E})^{1/2}$.

2.2. The asymptotic behavior as $|x| \to \infty$ of the eigenfunction $\varphi_\mp(x;\lambda,\omega)$ plays an important role in proving the main theorem. It is already known in the physics literature [3,5,14]. However, we shall prove the following proposition in section 6 because of its importance.

Proposition 2.1 The eigenfunction $\varphi_\mp(x;\lambda,\omega)$ has the following asymptotic properties at infinity.

(1) Assume that $|x/|x| - \omega| > c > 0$. Then $\varphi_+(x;\lambda,\omega)$ behaves like
\[ \varphi_+(x;\lambda,\omega) = \exp\left(i\alpha(\gamma(x;\omega) - \pi)\right) \exp(i\sqrt{\lambda}\, x\cdot\omega) + e^{i\sqrt{\lambda}|x|}\, |x|^{-1/2} \sum_{j=0}^{N-1} c_{+j}(x)\,|x|^{-j} + O(|x|^{-(N+1/2)}), \]
where the coefficient $c_{+j}(x)$ obeys the bound $|\partial_x^\beta c_{+j}| = O(|x|^{-|\beta|})$.

(2) If $|x/|x| + \omega| > c > 0$, then a similar formula
\[ \varphi_-(x;\lambda,\omega) = \exp\left(i\alpha(\gamma(x;-\omega) - \pi)\right) \exp(i\sqrt{\lambda}\, x\cdot\omega) + e^{-i\sqrt{\lambda}|x|}\, |x|^{-1/2} \sum_{j=0}^{N-1} c_{-j}(x)\,|x|^{-j} + O(|x|^{-(N+1/2)}) \]
holds true for the incoming eigenfunction $\varphi_-(x;\lambda,\omega)$.

(3) Assume that $1/2 < q \le 1$. If $0 < |x/|x| - \omega| < c|x|^{-q}$ for some $c > 0$, then
\[ \varphi_+(x;\lambda,\omega) = \cos\alpha\pi \times \exp(i\sqrt{\lambda}\, x\cdot\omega) + O(|x|^{-\nu}) \]




with $\nu = 2(q - 1/2)/3 > 0$, and if $0 < |x/|x| + \omega| < c|x|^{-q}$, then
\[ \varphi_-(x;\lambda,\omega) = \cos\alpha\pi \times \exp(i\sqrt{\lambda}\, x\cdot\omega) + O(|x|^{-\nu}) \]
for the same $\nu$ as above.

(4) $\varphi_\mp(x;\lambda,\omega)$ is bounded uniformly in $x$.
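The kernel sum of subsection 2.1, quoted from [17], can be probed numerically away from $\theta = 0$, where the delta term does not contribute. The sketch below is only an illustration under stated assumptions: it uses Abel regularization (the weight $r^{|l|}$ with $r$ close to 1, our choice and not taken from the source) and compares the result with $-2i \sin\alpha\pi\; e^{i[\alpha]\theta} F_0(\theta)$. The function names are hypothetical.

```python
import cmath
import math

def abel_sum(alpha, theta, r=0.999, lmax=20000):
    """Abel-regularized sum over l of exp(i(l - nu)pi) exp(i l theta), nu = |l - alpha|."""
    total = 0j
    for l in range(-lmax, lmax + 1):
        nu = abs(l - alpha)
        total += (r ** abs(l)) * cmath.exp(1j * (l - nu) * math.pi) * cmath.exp(1j * l * theta)
    return total

def regular_part(alpha, theta):
    """-2i sin(alpha*pi) exp(i[alpha]theta) F0(theta), valid for theta != 0 mod 2*pi."""
    z = cmath.exp(1j * theta)
    f0 = z / (z - 1.0)
    n = math.floor(alpha)  # Gauss bracket [alpha]
    return -2j * math.sin(alpha * math.pi) * cmath.exp(1j * n * theta) * f0

if __name__ == "__main__":
    print(abel_sum(0.3, 1.0), regular_part(0.3, 1.0))
```

The two values agree up to an error of order $1 - r$, which is consistent with the stated formula after division by $2\pi$.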

2.3. We represent the amplitude $f(\omega \to \tilde\omega; E, H_\alpha, H_0)$ in terms of the resolvent $R(E + i0; H_\alpha)$. We know that the boundary values
\[ R(\lambda \pm i0; H_\alpha) = \lim_{\varepsilon\downarrow 0} R(\lambda \pm i\varepsilon; H_\alpha), \qquad R(\zeta; H_\alpha) = (H_\alpha - \zeta)^{-1}, \]
to the positive real axis exist (principle of limiting absorption) and
\[ R(\lambda \pm i0; H_\alpha) : L^2_s(\mathbb{R}^2) = L^2(\mathbb{R}^2; \langle x\rangle^{2s} dx) \to L^2_{-s}(\mathbb{R}^2) \tag{2.4} \]
is bounded for $s > 1/2$, where $\langle x\rangle = (1 + |x|^2)^{1/2}$. This is verified by use of the commutator method due to Mourre [13] (see Proposition 7.3 in section 7).

We now introduce a basic cut-off function. Let $\chi \in C_0^\infty[0,\infty)$ be a smooth function such that $\chi(s) \ge 0$ and
\[ \chi(s) = 1 \ \text{for } 0 \le s \le 1, \qquad \chi(s) = 0 \ \text{for } s > 2. \tag{2.5} \]
We fix $E > 0$ and we choose $\delta$, $0 < \delta \ll 1$, sufficiently small. We define
\[ \beta_0(\xi) = \chi(2|\xi - \sqrt{E}\omega|/\delta^2) \]
for initial direction $\omega$. We further take a nonnegative function $j_0 \in C^\infty(\mathbb{R}^2)$ such that
\[ \operatorname{supp} j_0 \subset \Sigma(R,-\omega,\delta), \qquad j_0 = 1 \ \text{on } \Sigma(2R,-\omega,\delta/2), \tag{2.6} \]
and $\partial_x^\beta j_0(x) = O(|x|^{-|\beta|})$ at infinity, where
\[ \Sigma(R,\omega,\delta) = \{x : |x| > R,\ |x/|x| - \omega| < \delta\}, \qquad R > 0. \]
Recall that the azimuth angle $\gamma(x;\omega)$ satisfies (1.1). Hence we have
\[ \exp(-i\alpha\gamma(x;\omega))\, H_\alpha \exp(i\alpha\gamma(x;\omega)) = H(A_\alpha - \alpha\nabla\gamma) = H_0 \tag{2.7} \]
on $\Sigma(R,-\omega,\delta)$. The next lemma is well known ([15]). We skip the proof.

Lemma 2.1 Let $f \in L^2$. Then the free solution $\exp(-itH_0)f$ behaves like
\[ (\exp(-itH_0)f)(x) = (2it)^{-1} \exp(i|x|^2/4t)\, \hat f(x/2t) + o(1), \qquad |t| \to \infty, \]
in $L^2$, where $\hat f(\xi) = (2\pi)^{-1} \int e^{-ix\cdot\xi} f(x)\, dx$ is the Fourier transform.
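Lemma 2.1 can be illustrated on a Gaussian, for which both sides are explicit. The sketch below is a hedged example and not part of the source: it assumes $f(x) = \exp(-|x|^2/2)$, for which $\exp(-itH_0)f = (1+2it)^{-1}\exp(-|x|^2/(2(1+2it)))$ and $\hat f(\xi) = \exp(-|\xi|^2/2)$, and it measures the $L^2$ distance to the leading term by a midpoint rule in the radial variable. A direct Gaussian computation (ours) suggests this distance equals $\sqrt{2\pi/(16t^2+1)}$.

```python
import cmath
import math

def evolved_gaussian(r, t):
    """exp(-itH0) f at radius r for f(x) = exp(-|x|^2/2), H0 = -Laplacian on R^2."""
    s = 1.0 + 2.0j * t
    return cmath.exp(-r * r / (2.0 * s)) / s

def lemma_leading_term(r, t):
    """(2it)^{-1} exp(i|x|^2/4t) fhat(x/2t) with fhat(xi) = exp(-|xi|^2/2)."""
    y = r / (2.0 * t)
    f_hat = math.exp(-y * y / 2.0)
    return cmath.exp(1j * r * r / (4.0 * t)) * f_hat / (2.0j * t)

def l2_distance(t, steps=20000, widths=8.0):
    """L2(R^2) norm of the difference, via a radial midpoint rule."""
    r_max = widths * math.sqrt(1.0 + 4.0 * t * t)  # several widths of |u(t)|
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        diff = evolved_gaussian(r, t) - lemma_leading_term(r, t)
        total += abs(diff) ** 2 * r
    return math.sqrt(2.0 * math.pi * total * h)

if __name__ == "__main__":
    for t in (5.0, 20.0, 100.0):
        print(t, l2_distance(t))
```

The distance decays like $1/|t|$ for this particular $f$; the lemma itself only asserts $o(1)$ for general $f \in L^2$.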


Let $K_1$ and $K_2$ be two self-adjoint operators in $L^2$. We introduce the new notation
\[ W_\pm(K_2, K_1; J) = \operatorname*{s-lim}_{t\to\pm\infty} \exp(itK_2)\, J \exp(-itK_1) \]
for a bounded operator $J$ on $L^2$. Let $\beta_0(\xi)$ and $j_0(x)$ be as above. We set $J = j_0^2\beta_0(D_x)^2$. Then $W_-(H_\alpha, H_0)\beta_0(D_x)^2 = W_-(H_\alpha, H_0; J)$ by Lemma 2.1, so that we have the decomposition
\[ W_-(H_\alpha, H_0)\beta_0(D_x)^2 = W_-(H_\alpha, H_0; J_0)\, W_-(H_0, H_0; J_1), \tag{2.8} \]
where
\[ J_0 = j_0 \exp(i\alpha\gamma(x;\omega))\,\beta_0(D_x), \qquad J_1 = j_0 \exp(-i\alpha\gamma(x;\omega))\,\beta_0(D_x). \]
The existence of $W_-(H_0, H_0; J_1)$ follows from Lemma 2.1, while the existence of $W_-(H_\alpha, H_0; J_0)$ is verified by use of (2.7). The same argument applies to the final direction $\tilde\omega$. We define
\[ \tilde\beta_0(\xi) = \chi(2|\xi - \sqrt{E}\tilde\omega|/\delta^2) \]
and we take a function $\tilde j_0 \in C^\infty(\mathbb{R}^2)$ such that
\[ \operatorname{supp} \tilde j_0 \subset \Sigma(R,\tilde\omega,\delta), \qquad \tilde j_0 = 1 \ \text{on } \Sigma(2R,\tilde\omega,\delta/2). \tag{2.9} \]
If we set
\[ \tilde J_0 = \tilde j_0 \exp(i\alpha\gamma(x;-\tilde\omega))\,\tilde\beta_0(D_x), \qquad \tilde J_1 = \tilde j_0 \exp(-i\alpha\gamma(x;-\tilde\omega))\,\tilde\beta_0(D_x), \]
then we obtain
\[ W_+(H_\alpha, H_0)\tilde\beta_0(D_x)^2 = W_+(H_\alpha, H_0; \tilde J_0)\, W_+(H_0, H_0; \tilde J_1). \tag{2.10} \]
We combine (2.8) and (2.10) to obtain that
\[ \tilde\beta_0(D_x)^2\, S(H_\alpha, H_0)\, \beta_0(D_x)^2 = W_+^*(H_0, H_0; \tilde J_1)\, S_0(H_\alpha, H_0)\, W_-(H_0, H_0; J_1), \tag{2.11} \]
where
\[ S_0(H_\alpha, H_0) = W_+^*(H_\alpha, H_0; \tilde J_0)\, W_-(H_\alpha, H_0; J_0). \]
The operator $S_0(H_\alpha, H_0)$ also has a direct integral decomposition, because it commutes with $H_0$. We denote by $S_0(\lambda; H_\alpha, H_0) : L^2(S^1) \to L^2(S^1)$ the fiber and by $S_0(\theta',\theta;\lambda,H_\alpha,H_0)$ the kernel of $S_0(\lambda; H_\alpha, H_0)$. By Lemma 2.1, $W_-(H_0, H_0; J_1)$ acts as the multiplication
\[ F\, W_-(H_0, H_0; J_1)\, F^* = \exp(-i\alpha\gamma(-\theta;\omega))\,\beta_0(\sqrt{\lambda}\theta)\,\times \]




on $L^2((0,\infty); d\lambda) \otimes L^2(S^1)$, where $F : L^2 \to L^2((0,\infty); d\lambda) \otimes L^2(S^1)$ is the unitary mapping defined by (2.1). Similarly
\[ F\, W_+(H_0, H_0; \tilde J_1)\, F^* = \exp(-i\alpha\gamma(\theta;-\tilde\omega))\,\tilde\beta_0(\sqrt{\lambda}\theta)\,\times. \]
Since $e^{-i\alpha\gamma(-\omega;\omega)}\beta_0(\sqrt{E}\omega) = e^{-i\alpha\pi}$ and $e^{-i\alpha\gamma(\tilde\omega;-\tilde\omega)}\tilde\beta_0(\sqrt{E}\tilde\omega) = e^{-i\alpha\pi}$, we have
\[ S(\tilde\omega,\omega;E,H_\alpha,H_0) = S_0(\tilde\omega,\omega;E,H_\alpha,H_0) \tag{2.12} \]
by (2.11). We derive the representation for $S_0(\theta',\theta;E,H_\alpha,H_0)$ on the right side. The derivation is based on the idea due to [10]. We calculate $T = H_\alpha J_0 - J_0 H_0$ as
\[ T = \exp(i\alpha\gamma(x;\omega))\,(H_0 j_0 - j_0 H_0)\,\beta_0(D_x) = \exp(i\alpha\gamma(x;\omega))\,[H_0, j_0]\,\beta_0(D_x) \]
by use of (2.7). Similarly we have
\[ \tilde T = H_\alpha \tilde J_0 - \tilde J_0 H_0 = \exp(i\alpha\gamma(x;-\tilde\omega))\,[H_0, \tilde j_0]\,\tilde\beta_0(D_x). \]
Since $W_+(H_\alpha, H_0; J_0) = 0$ by Lemma 2.1, it follows that
\[ W_-(H_\alpha, H_0; J_0) = -i \int \exp(itH_\alpha)\, T \exp(-itH_0)\, dt. \]
If we make use of this relation, then we obtain the representation
\[ S_0(\lambda; H_\alpha, H_0) = 2\pi i\, F(\lambda)\left( -\tilde J_0^* T + \tilde T^* R(\lambda + i0; H_\alpha)\, T \right) F^*(\lambda) \tag{2.13} \]
in exactly the same way as [10, Theorem 3.3], where $F(\lambda) : L^2_s(\mathbb{R}^2) \to L^2(S^1)$, $s > 1/2$, is the trace operator defined by $(F(\lambda)u)(\theta) = (Fu)(\mu,\theta)|_{\mu=\lambda}$. We write $\varphi_0(\omega,\lambda)$ for $\varphi_0(x;\lambda,\omega) = \exp(i\sqrt{\lambda}\, x\cdot\omega)$ and denote by $(\ ,\ )$ the $L^2$ scalar product. The next lemma follows immediately from (2.12) and (2.13).

Lemma 2.2 Assume that $\omega \neq \tilde\omega$. Then
\[ f(\omega \to \tilde\omega; E, H_\alpha, H_0) = -(ic(E)/4\pi)\,(T\varphi_0(\omega,E),\ \tilde J_0\varphi_0(\tilde\omega,E)) + (ic(E)/4\pi)\,(R(E + i0; H_\alpha)T\varphi_0(\omega,E),\ \tilde T\varphi_0(\tilde\omega,E)). \]

We fix $\sigma$, $0 < \sigma \ll 1$, small enough and take $R = |d|^\sigma$, $|d| \gg 1$, in (2.6) and (2.9). We may assume that $j_0$ obeys $\partial_x^\beta j_0(x) = O(|x|^{-|\beta|})$ uniformly in $d$; similarly for $\tilde j_0$. The operators $\tilde J_0$, $T$ and $\tilde T$ are all pseudo-differential operators. If $\omega \neq \tilde\omega$, then we can choose $\delta$ so small that the supports of the symbols $T(x,\xi)$ and $\tilde J_0(x,\xi)$ do not intersect. Hence it follows that
\[ (T\varphi_0(\omega,E),\ \tilde J_0\varphi_0(\tilde\omega,E)) = O(|d|^{-N}), \qquad |d| \to \infty, \]


for any $N \gg 1$. Thus we have
\[ f(\omega \to \tilde\omega; E, H_\alpha, H_0) = (ic(E)/4\pi)\,(R(E + i0; H_\alpha)T\varphi_0(\omega,E),\ \tilde T\varphi_0(\tilde\omega,E)) + o(1) \]
as $|d| \to \infty$. We continue to analyze the behavior as $|d| \to \infty$ of the term on the right side. We decompose $T = T(x, D_x)$ into $T = \chi_0 T + (1 - \chi_0)T = T_0 + T_1$, where
\[ \chi_0(x) = \chi(|x|/2|d|^\sigma) \tag{2.14} \]
for the cut-off function $\chi \in C_0^\infty[0,\infty)$ with property (2.5). By (2.6), $\nabla j_0$ vanishes on $\Sigma(2R,-\omega,\delta/2)$ with $R = |d|^\sigma$. Hence the symbol $T_1(x,\xi)$ has its support in the outgoing region
\[ \operatorname{supp} T_1 \subset \{(x,\xi) : |x| > 2|d|^\sigma,\ |\xi - \sqrt{E}\omega| < \delta^2,\ x\cdot\xi > (-1 + \delta/3)|x||\xi|\}. \]
The particle with initial state $(x,\xi) \in \operatorname{supp} T_1$ at $t = 0$ moves like the free particle and it does not pass through a neighborhood of the origin for $t \ge 0$. In fact, we have
\[ |x + t\xi|^2 \ge |x|^2 - 2t(1 - \delta/3)|x||\xi| + t^2|\xi|^2 \ge c\,(|x| + t|\xi|)^2, \qquad c > 0. \]
Thus the outgoing particle does not take momentum around $\sqrt{E}\tilde\omega$, so that
\[ (R(E + i0; H_\alpha)T_1\varphi_0(\omega,E),\ \tilde T\varphi_0(\tilde\omega,E)) = O(|d|^{-N}) \]
by the micro-local resolvent estimate ([9, Theorems 1 and 2]). Similarly we decompose $\tilde T$ into $\tilde T = \tilde T_0 + \tilde T_1$. Then we obtain
\[ (R(E + i0; H_\alpha)T_0\varphi_0(\omega,E),\ \tilde T_1\varphi_0(\tilde\omega,E)) = O(|d|^{-N}). \]
A similar argument has been used in the semi-classical analysis of scattering amplitudes ([16]). The magnetic potential $A_\alpha(x)$ has a singularity at the origin, but the classical particle starting from $(x,\xi) \in \operatorname{supp} T_1$ or $(x,\xi) \in \operatorname{supp} \tilde T_1$ does not pass over the origin. Thus the argument there applies to $H_\alpha$ without any essential changes. The next lemma is obtained as a consequence of Lemma 2.2.

Lemma 2.3 Let $j_0$, $\tilde j_0$ be as in (2.6) and (2.9) respectively and let $\chi_0$ be defined by (2.14). Assume that $\omega \neq \tilde\omega$. Then
\[ f(\omega \to \tilde\omega; E, H_\alpha, H_0) = (ic(E)/4\pi)\,(R(E + i0; H_\alpha)T_0\varphi_0(\omega,E),\ \tilde T_0\varphi_0(\tilde\omega,E)) + o(1) \]
as $|d| \to \infty$, where $T_0$ acts as
\[ T_0\varphi_0(\omega,E) = e^{i\alpha\gamma(x;\omega)}\,\chi_0\,[H_0, j_0]\,\varphi_0(\omega,E) \]
on $\varphi_0(\omega,E) = \varphi_0(x;\omega,E) = \exp(i\sqrt{E}\, x\cdot\omega)$, and $\tilde T_0$ acts as
\[ \tilde T_0\varphi_0(\tilde\omega,E) = e^{i\alpha\gamma(x;-\tilde\omega)}\,\chi_0\,[H_0, \tilde j_0]\,\varphi_0(\tilde\omega,E). \]
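The lower bound $|x + t\xi|^2 \ge c\,(|x| + t|\xi|)^2$ used in subsection 2.3 can be made quantitative. Writing $a = |x|$, $b = t|\xi|$ and $\varepsilon = \delta/3$, one has $a^2 - 2(1-\varepsilon)ab + b^2 = (1-\varepsilon)(a-b)^2 + \varepsilon(a^2+b^2) \ge (\varepsilon/2)(a+b)^2$, so $c = \delta/6$ is admissible. This value is our own computation, not stated in the source. The sketch below scans a grid of angles and times (hypothetical helper names) and confirms that the smallest ratio is $\delta/6$.

```python
import math

def ratio(x, xi, t):
    """|x + t*xi|^2 / (|x| + t*|xi|)^2 for planar vectors x, xi and t >= 0."""
    num = (x[0] + t * xi[0]) ** 2 + (x[1] + t * xi[1]) ** 2
    den = (math.hypot(x[0], x[1]) + t * math.hypot(xi[0], xi[1])) ** 2
    return num / den

def min_ratio(delta, n_angle=200, n_time=200):
    """Smallest ratio over the closed cone x.xi >= (-1 + delta/3)|x||xi| and 0 <= t <= 10."""
    eps = delta / 3.0
    phi_max = math.acos(-1.0 + eps)   # largest admissible angle between x and xi
    x = (1.0, 0.0)
    best = float("inf")
    for j in range(-n_angle, n_angle + 1):
        phi = phi_max * j / n_angle
        xi = (math.cos(phi), math.sin(phi))
        for k in range(n_time + 1):
            t = k / 20.0              # includes t = 1, where |x| = t|xi|
            best = min(best, ratio(x, xi, t))
    return best

if __name__ == "__main__":
    print(min_ratio(0.3), 0.3 / 6.0)
```

The minimum is attained on the boundary of the cone at the time when $|x| = t|\xi|$, which is why the grid is built to contain that point.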




2.4. The main idea of the proof of the theorem is to represent the scattering amplitude $f_d(\omega \to \tilde\omega; E)$ in terms of the eigenfunctions of $H_1 = H(A_{\alpha_1})$ or $H_2$. This subsection is devoted to a preliminary step towards the representation.

The eigenfunction $\varphi_\mp(x;\lambda,\omega)$ of $H_\alpha$ is defined by (2.3). We denote by $F_\pm : L^2 \to L^2((0,\infty); d\lambda) \otimes L^2(S^1)$ the unitary mapping
\[ (F_\pm u)(\lambda,\theta) = 2^{-1/2}(2\pi)^{-1} \int \bar\varphi_\pm(x;\lambda,\theta)\, u(x)\, dx \]
and by $F_\pm(\lambda) : L^2_s(\mathbb{R}^2) \to L^2(S^1)$, $s > 1/2$, the trace operator $(F_\pm(\lambda)u)(\theta) = (F_\pm u)(\mu,\theta)|_{\mu=\lambda}$. According to the stationary scattering theory, we know that
\[ W_\mp(H_\alpha, H_0) = F_\pm^* F \tag{2.15} \]
and hence it follows that
\[ F_\pm(\lambda)\, W_\mp(H_\alpha, H_0)u = F(\lambda)u, \qquad \text{a.e. } \lambda > 0, \tag{2.16} \]
for $u \in L^2$. We now consider a function of the form
\[ v_l(x) = f_l(r)e^{il\theta}, \qquad (Fv_l)(\lambda,\theta) = g_l(\lambda)e^{il\theta}, \tag{2.17} \]
for $l \in \mathbb{Z}$, where $f_l \in \mathcal{S}[0,\infty)$ (Schwartz space) and
\[ g_l(\lambda) = 2^{-1/2} e^{-i|l|\pi/2} \int_0^\infty J_{|l|}(\sqrt{\lambda}r)\, f_l(r)\, r\, dr. \]
We assume that $g_l \in C_0^\infty(0,\infty)$ is supported away from the origin.

Lemma 2.4 Let $v_l$ be as above. Then $\langle x\rangle^N W_\pm(H_\alpha, H_0)v_l \in L^2$ for any $N \gg 1$.

Proof. By (2.15), we have
\[ (W_+(H_\alpha, H_0)v_l)(x) = (F_-^* F v_l)(x) = f_{-l}(r)e^{il\theta}, \]
where
\[ f_{-l}(r) = 2^{-1/2} e^{i\nu\pi/2} \int_0^\infty J_\nu(\sqrt{\lambda}r)\, g_l(\lambda)\, d\lambda \]
with $\nu = |l - \alpha|$. The Bessel function $J_p(r)$ obeys the asymptotic formula
\[ J_p(r) = e^{ir} h_{+p}(r) + e^{-ir} h_{-p}(r) \tag{2.18} \]
at infinity, where $\partial_r^m h_{\pm p}(r) = O(r^{-1/2-m})$. By assumption, $g_l \in C_0^\infty(0,\infty)$ has compact support away from the origin. Hence the lemma follows by repeated use of partial integration. ✷
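The decay $h_{\pm p}(r) = O(r^{-1/2})$ in (2.18) can be illustrated for integer order, where $J_n$ has the elementary integral representation $J_n(r) = \pi^{-1}\int_0^\pi \cos(n\tau - r\sin\tau)\, d\tau$. The sketch below is an illustration only: it is restricted to integer $p$ (the orders $\nu = |l - \alpha|$ of the text are generally non-integer), the helper names are hypothetical, and it compares $J_p(r)$ with the leading term $(2/\pi r)^{1/2}\cos(r - (2p+1)\pi/4)$.

```python
import math

def bessel_j(n, r, steps=4000):
    """Integer-order Bessel function via (1/pi) * int_0^pi cos(n*tau - r*sin(tau)) dtau."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h            # midpoint rule on [0, pi]
        total += math.cos(n * tau - r * math.sin(tau))
    return total * h / math.pi

def leading_term(p, r):
    """sqrt(2/(pi*r)) * cos(r - (2p+1)*pi/4), the leading large-r behavior of J_p."""
    return math.sqrt(2.0 / (math.pi * r)) * math.cos(r - (2 * p + 1) * math.pi / 4.0)

if __name__ == "__main__":
    for r in (50.0, 200.0):
        for p in (0, 1, 2):
            print(p, r, bessel_j(p, r), leading_term(p, r))
```

The discrepancy shrinks as $r$ grows, in line with a remainder of order $r^{-3/2}$.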


Lemma 2.5 One has
\[ \| \langle x\rangle^{-m} \exp(-itH_\alpha)\, W_\pm(H_\alpha, H_0)v_l \|_{L^2} = O(|t|^{-m}), \qquad |t| \to \infty, \]
for $m \ge 0$.

Proof. We divide $\mathbb{R}^2$ into the two regions $\{x : |x| > c|t|\}$ and $\{x : |x| < c|t|\}$ for some $c > 0$. It is easy to see that the term in the lemma satisfies the desired bound $O(|t|^{-m})$ over the region $\{x : |x| > c|t|\}$. It follows from (2.15) that
\[ \exp(-itH_\alpha)\, W_+(H_\alpha, H_0)v_l(x) = 2^{-1/2} e^{i\nu\pi/2} \int_0^\infty J_\nu(\sqrt{\lambda}r)\, e^{-it\lambda} g_l(\lambda)\, d\lambda\; e^{il\theta}. \]
Assume that $|x| < c|t|$. Then we can take $c > 0$ so small that the integral above obeys the bound $O(|t|^{-N})$ for any $N \gg 1$. This is again obtained by repeated use of partial integration. Thus the proof is complete. ✷

Lemma 2.6 Let $\beta_0(\xi) = \chi(2|\xi - \sqrt{E}\omega|/\delta^2)$ be as before and let $j_\pm(x)$ be a bounded function vanishing in a conical neighborhood of $\pm\omega$. Then one can choose $\delta > 0$ so small that
\[ \| j_+\beta_0(D_x) \exp(-itH_\alpha)\, W_\pm(H_\alpha, H_0)v_l \|_{L^2} = O(|t|^{-N}), \qquad t \to \infty, \]
\[ \| j_-\beta_0(D_x) \exp(-itH_\alpha)\, W_\pm(H_\alpha, H_0)v_l \|_{L^2} = O(|t|^{-N}), \qquad t \to -\infty, \]
for any $N \gg 1$.

Proof. We give only a sketch of the proof, which is again done by repeated use of partial integration. We show that the term $I = j_+\beta_0(D_x)\exp(-itH_\alpha)W_-(H_\alpha, H_0)v_l$ obeys the bound $O(|t|^{-N})$ as $t \to \infty$. A similar argument applies to the other terms. If we take account of (2.18), then $I$ is expressed as the sum of two oscillatory integrals of the form
\[ I_\pm = \iiint_0^\infty \exp(i\psi_\pm(x,\xi,y,\lambda;t))\, f_\pm(x,\xi,y,\lambda)\, d\lambda\, dy\, d\xi\; e^{il\theta}, \]
where
\[ \psi_\pm(x,\xi,y,\lambda;t) = (x - y)\cdot\xi \pm \sqrt{\lambda}|y| - t\lambda, \qquad t \gg 1. \]
We consider the integral $I_+$ only. The amplitude function $f_+$ is supported in a small neighborhood of $\sqrt{E}\omega$ in the variable $\xi$ and has compact support away from the origin in the variable $\lambda$, while the stationary point $(\xi, y, \lambda)$ of the phase function $\psi_+$ has to fulfill the relations
\[ y = x, \qquad \xi = \sqrt{\lambda}\, y/|y|, \qquad |y| = 2\sqrt{\lambda}\, t \]




for $x \in \operatorname{supp} j_+$. If we take $\delta > 0$ small enough, then we see that such a stationary point does not exist. This yields the desired bound. ✷

Remark 2.1 If $v_l \in L^2$ takes the form $v_l = (F_-^* g e^{il\theta})(x)$ or $v_l = (F_+^* g e^{il\theta})(x)$ for $g(\lambda) \in C_0^\infty(0,\infty)$ supported away from the origin, then we can show in exactly the same way as above that $\| \langle x\rangle^{-m} \exp(-itH_\alpha)v_l \|_{L^2} = O(|t|^{-m})$ and
\[ \| j_+\beta_0(D_x)\exp(-itH_\alpha)v_l \|_{L^2} = O(|t|^{-N}), \qquad t \to \infty, \]
\[ \| j_-\beta_0(D_x)\exp(-itH_\alpha)v_l \|_{L^2} = O(|t|^{-N}), \qquad t \to -\infty. \]
The totality of such $v_l$ is dense in $L^2$. As an immediate consequence, we have $W_+(H_d, H_\alpha; J_+) = 0$ for $J_+ = j_+\beta_0(D_x)$.
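The stationary-point relations used in the proof of Lemma 2.6 can be checked by finite differences. The sketch below is illustrative only, with hypothetical helper names and arbitrary sample values: it builds the point $y = x$, $\sqrt{\lambda} = |x|/2t$, $\xi = \sqrt{\lambda}\,x/|x|$ and verifies that the gradient of $\psi_+$ in $(\xi, y, \lambda)$ vanishes there. Since the stationary $\xi$ is parallel to $x$, it cannot lie near $\sqrt{E}\omega$ when $x$ stays outside a cone around $\omega$.

```python
import math

def psi_plus(xi, y, lam, x, t):
    """Phase psi_+ = (x - y).xi + sqrt(lam)*|y| - t*lam."""
    return ((x[0] - y[0]) * xi[0] + (x[1] - y[1]) * xi[1]
            + math.sqrt(lam) * math.hypot(y[0], y[1]) - t * lam)

def stationary_point(x, t):
    """Candidate stationary point (xi, y, lam) read off from the relations in the text."""
    r = math.hypot(x[0], x[1])
    sqrt_lam = r / (2.0 * t)
    xi = (sqrt_lam * x[0] / r, sqrt_lam * x[1] / r)
    return xi, (x[0], x[1]), sqrt_lam ** 2

def gradient(xi, y, lam, x, t, h=1e-6):
    """Central-difference gradient of psi_+ in the five variables (xi, y, lam)."""
    point = [xi[0], xi[1], y[0], y[1], lam]
    grad = []
    for i in range(5):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        f_up = psi_plus((up[0], up[1]), (up[2], up[3]), up[4], x, t)
        f_down = psi_plus((down[0], down[1]), (down[2], down[3]), down[4], x, t)
        grad.append((f_up - f_down) / (2.0 * h))
    return grad

if __name__ == "__main__":
    x, t = (3.0, 4.0), 2.5           # sample values (assumption)
    xi, y, lam = stationary_point(x, t)
    print(xi, lam, gradient(xi, y, lam, x, t))
```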

3 Proof of main theorem: reduction to basic lemmas

In this section we prove the main theorem (Theorem 1.1) by reduction to three lemmas (Lemmas 3.2 to 3.4). The proof of these lemmas is given in section 4, and section 5 is devoted to proving the estimates for the resolvent $R(E + i0; H_d)$ which play a central role in the proof of the lemmas. As previously stated, we prove the self-adjointness, the absence of bound states, the principle of limiting absorption and the asymptotic completeness of wave operators for $H_d$ in section 7. We use these facts without further reference.

3.1. The perturbation $H_d - H_0$ between $H_d$ and $H_0 = -\Delta$ is of long-range class. However, we can show that the ordinary wave operator
\[ W_\pm(H_d, H_0) = \operatorname*{s-lim}_{t\to\pm\infty} \exp(itH_d)\exp(-itH_0) : L^2 \to L^2 \]
exists and is asymptotically complete:
\[ \operatorname{Ran} W_-(H_d, H_0) = \operatorname{Ran} W_+(H_d, H_0) = L^2. \]
Hence the scattering operator
\[ S(H_d, H_0) = W_+^*(H_d, H_0)\, W_-(H_d, H_0) : L^2 \to L^2 \]
can be defined as a unitary operator, and it has the direct integral decomposition
\[ S(H_d, H_0) \simeq F S(H_d, H_0) F^* = \int_0^\infty \oplus\, S(\lambda; H_d, H_0)\, d\lambda. \]
If we denote by $S(\theta',\theta;\lambda,H_d,H_0)$ the kernel of the fiber $S(\lambda; H_d, H_0) : L^2(S^1) \to L^2(S^1)$, then the scattering amplitude $f_d(\omega \to \tilde\omega; E)$ in question is defined by
\[ f_d(\omega \to \tilde\omega; E) = c(E)\left( S(\tilde\omega,\omega;E,H_d,H_0) - \delta(\tilde\omega - \omega) \right) \]


with $c(E) = (2\pi/i\sqrt{E})^{1/2}$ again. If, in particular, $\omega \neq \tilde\omega$, then
\[ f_d(\omega \to \tilde\omega; E) = c(E)\, S(\tilde\omega,\omega;E,H_d,H_0). \]
The first step toward the proof of Theorem 1.1 is to represent $f_d(\omega \to \tilde\omega; E)$ in a convenient form. We always assume that $\omega \neq \tilde\omega$.

We keep the same notation as in section 2. Let $j_0$ and $\tilde j_0$ be as in (2.6) and (2.9), where $R$ is taken as $R = |d|^\sigma$ for $0 < \sigma \ll 1$ fixed small enough. We set $\chi_\infty(x) = 1 - \chi(2|x|/|d|^\sigma)$, so that $\chi_\infty(x) = 1$ for $|x| > |d|^\sigma$. We further define the following operators:
\[ J_{0d} = \exp(i\alpha_2\gamma(x - d;\omega))\, j_{0d}\, \chi_\infty\beta_0(D_x)\chi_\infty, \]
\[ J_{1d} = \exp(-i\alpha_2\gamma(x - d;\omega))\, j_{0d}\, \beta_0(D_x), \]
where $j_{0d}(x) = j_0(x - d)$. Then $W_-(H_d, H_0)\beta_0(D_x)^2$ is decomposed into
\[ W_-(H_d, H_0)\beta_0(D_x)^2 = W_-(H_d, H_1; J_{0d})\, W_-(H_1, H_0)\, W_-(H_0, H_0; J_{1d}). \]
By Lemma 2.1, $W_-(H_0, H_0; J_{1d})$ is realized as the multiplication
\[ F\, W_-(H_0, H_0; J_{1d})\, F^* = e^{-i\alpha_2\gamma(-\theta;\omega)}\,\beta_0(\sqrt{\lambda}\theta)\,\times \]
on $L^2((0,\infty); d\lambda) \otimes L^2(S^1)$. A similar relation
\[ W_+(H_d, H_0)\tilde\beta_0(D_x)^2 = W_+(H_d, H_1; \tilde J_{0d})\, W_+(H_1, H_0)\, W_+(H_0, H_0; \tilde J_{1d}) \]
holds for the wave operator $W_+(H_d, H_0)$, where
\[ \tilde J_{0d} = \exp(i\alpha_2\gamma(x - d;-\tilde\omega))\, \tilde j_{0d}\, \chi_\infty\tilde\beta_0(D_x)\chi_\infty, \]
\[ \tilde J_{1d} = \exp(-i\alpha_2\gamma(x - d;-\tilde\omega))\, \tilde j_{0d}\, \tilde\beta_0(D_x). \]
The eigenfunction $\varphi_{\mp 1}(x;\lambda,\theta)$ of $H_1 = H(A_{\alpha_1})$ is defined by (2.3) with $\alpha$ replaced by $\alpha_1$. We write $F_{\pm 1} : L^2 \to L^2((0,\infty); d\lambda) \otimes L^2(S^1)$ for the unitary mapping associated with $\varphi_{\pm 1}$ and $F_{\pm 1}(\lambda) : L^2_s(\mathbb{R}^2) \to L^2(S^1)$, $s > 1/2$, for the trace operator. Then it follows from (2.15) and (2.16) that $W_\mp(H_1, H_0) = F_{\pm 1}^* F$ and
\[ F_{\pm 1}(\lambda)\, W_\mp(H_1, H_0)u = F(\lambda)u, \qquad \text{a.e. } \lambda > 0, \tag{3.1} \]
for $u \in L^2$. We now define $S_0 : L^2 \to L^2$ as
\[ S_0 = W_+^*(H_1, H_0)\, W_+^*(H_d, H_1; \tilde J_{0d})\, W_-(H_d, H_1; J_{0d})\, W_-(H_1, H_0). \]
Since $S_0$ commutes with $H_0$, it has a direct integral decomposition. We denote by $S_0(\lambda) : L^2(S^1) \to L^2(S^1)$ the fiber of $S_0$.




Lemma 3.1 Let the notation be as above. Then the fiber S0 (λ) is represented as ∗ ∗ S0 (λ) = 2πiF−1 (λ) −J˜0d Td + T˜d∗ R(λ + i0; Hd )Td F+1 (λ), where Td = Hd J0d − J0d H1 ,

T˜d = Hd J˜0d − J˜0d H1 .

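Before going into the proof, we calculate $T_d$ and $\tilde T_d$ in the lemma. Both operators are realized as pseudo-differential operators. We write $\gamma_d = \gamma(x-d;\omega)$ and $\beta_0 = \beta_0(D_x)$ for brevity. Since
\[ e^{-i\alpha_2\gamma_d}H_de^{i\alpha_2\gamma_d} = e^{-i\alpha_2\gamma_d}H(A_{\alpha_1}+A_{\alpha_2,d})e^{i\alpha_2\gamma_d} = H(A_{\alpha_1}) = H_1 \]
on the support of $j_{0d}$, we have
\[ T_d = e^{i\alpha_2\gamma_d}\bigl([H_1,j_{0d}]\chi_\infty\beta_0\chi_\infty + j_{0d}[H_1,\chi_\infty\beta_0\chi_\infty]\bigr). \]
We set $Q = H_1 - H_0$. The coefficients of $Q$ have a singularity at the origin only. Since $\chi_\infty = \chi_\infty(|x|)$ is rotationally invariant, it is easy to see that $[Q,\chi_\infty] = 0$. Hence we can calculate the second commutator as
\begin{align*}
[H_1,\chi_\infty\beta_0\chi_\infty] &= [H_0,\chi_\infty\beta_0\chi_\infty] + [Q,\chi_\infty\beta_0\chi_\infty] \\
&= [H_0,\chi_\infty]\beta_0\chi_\infty + \chi_\infty\beta_0[H_0,\chi_\infty] + \chi_\infty[Q,\beta_0]\chi_\infty \\
&= [H_0,\chi_\infty]\beta_0\chi_\infty + \chi_\infty\beta_0[H_0,\chi_\infty] + [\chi_\infty Q,\beta_0]\chi_\infty + [\beta_0,\chi_\infty]Q\chi_\infty.
\end{align*}
Thus $T_d$ admits the decomposition
\[ T_d = \Gamma_{1d} + \Gamma_{2d} + \Gamma_{3d}, \tag{3.2} \]
where
\begin{align*}
\Gamma_{1d} &= e^{i\alpha_2\gamma(x-d;\omega)}j_{0d}\bigl([H_0,\chi_\infty]\beta_0\chi_\infty + \chi_\infty\beta_0[H_0,\chi_\infty]\bigr), \\
\Gamma_{2d} &= e^{i\alpha_2\gamma(x-d;\omega)}[H_1,j_{0d}]\chi_\infty\beta_0\chi_\infty, \\
\Gamma_{3d} &= e^{i\alpha_2\gamma(x-d;\omega)}j_{0d}\bigl([\chi_\infty Q,\beta_0]\chi_\infty + [\beta_0,\chi_\infty]Q\chi_\infty\bigr)
\end{align*}
with $Q = H_1 - H_0$. Similarly
\[ \tilde T_d = \tilde\Gamma_{1d} + \tilde\Gamma_{2d} + \tilde\Gamma_{3d}, \tag{3.3} \]
where
\begin{align*}
\tilde\Gamma_{1d} &= e^{i\alpha_2\gamma(x-d;-\tilde\omega)}\tilde j_{0d}\bigl([H_0,\chi_\infty]\tilde\beta_0\chi_\infty + \chi_\infty\tilde\beta_0[H_0,\chi_\infty]\bigr), \\
\tilde\Gamma_{2d} &= e^{i\alpha_2\gamma(x-d;-\tilde\omega)}[H_1,\tilde j_{0d}]\chi_\infty\tilde\beta_0\chi_\infty, \\
\tilde\Gamma_{3d} &= e^{i\alpha_2\gamma(x-d;-\tilde\omega)}\tilde j_{0d}\bigl([\chi_\infty Q,\tilde\beta_0]\chi_\infty + [\tilde\beta_0,\chi_\infty]Q\chi_\infty\bigr).
\end{align*}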
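The commutator rearrangement behind (3.2) can be checked term by term. The sketch below uses only the fact stated in the text that $[Q,\chi_\infty]=0$, together with the assumption that $\beta_0(D_x)$, being a Fourier multiplier, commutes with the constant-coefficient operator $H_0$ (this is consistent with the formula in the text, in which no $[H_0,\beta_0]$ term appears).

```latex
\begin{align*}
[H_0,\chi_\infty\beta_0\chi_\infty]
  &= [H_0,\chi_\infty]\beta_0\chi_\infty
   + \chi_\infty[H_0,\beta_0]\chi_\infty
   + \chi_\infty\beta_0[H_0,\chi_\infty] \\
  &= [H_0,\chi_\infty]\beta_0\chi_\infty
   + \chi_\infty\beta_0[H_0,\chi_\infty], \\[4pt]
[Q,\chi_\infty\beta_0\chi_\infty]
  &= \chi_\infty[Q,\beta_0]\chi_\infty
   = \chi_\infty Q\beta_0\chi_\infty - \chi_\infty\beta_0Q\chi_\infty \\
  &= (\chi_\infty Q\beta_0 - \beta_0\chi_\infty Q)\chi_\infty
   + (\beta_0\chi_\infty - \chi_\infty\beta_0)Q\chi_\infty \\
  &= [\chi_\infty Q,\beta_0]\chi_\infty + [\beta_0,\chi_\infty]Q\chi_\infty .
\end{align*}
```

Multiplying the first identity by $e^{i\alpha_2\gamma_d}j_{0d}$ gives $\Gamma_{1d}$, the second gives $\Gamma_{3d}$, and the remaining term $e^{i\alpha_2\gamma_d}[H_1,j_{0d}]\chi_\infty\beta_0\chi_\infty$ is $\Gamma_{2d}$.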


We see in the course of the proof of Theorem 1.1 in this section that
\[ F_{-1}(\lambda)\tilde\Gamma_{kd}^*R(\lambda+i0;H_d)\Gamma_{jd}F_{+1}^*(\lambda) : L^2(S^1)\to L^2(S^1), \qquad 1\le j,k\le 3, \]
are all bounded, and hence the relation in Lemma 3.1 makes sense. In fact, each operator is implicitly shown to have a bounded kernel as an integral operator.

Proof of Lemma 3.1. The dependence on $d$ does not matter throughout the proof. We use the following simplified notation:
\[ W_\pm = W_\pm(H_1,H_0), \qquad V_\pm = W_\pm(H_d,H_1;J_{0d}), \qquad \tilde V_\pm = W_\pm(H_d,H_1;\tilde J_{0d}) \]
and
\[ U_1(t) = \exp(-itH_1), \qquad U(t) = \exp(-itH_d). \]

The proof is based on the same idea as used to derive (2.13) (see [10,15]). We consider the integral
\[ (S_0u,v) = \int_0^\infty \langle S_0(\lambda)F(\lambda)u, F(\lambda)v\rangle\,d\lambda \]
for $u,v\in L^2$, where $\langle\ ,\ \rangle$ denotes the $L^2$ scalar product in $L^2(S^1)$. According to the notation above, we have $(S_0u,v) = (V_-W_-u,\tilde V_+W_+v)$. We assume for the moment that $u$ and $v$ take the form
\[ u(x) = f_l(r)e^{il\theta}, \qquad v(x) = f_m(r)e^{im\theta} \tag{3.4} \]
as in (2.17). Then Lemma 2.4 implies that $\langle x\rangle^NW_\pm u\in L^2$, and it follows from Lemmas 2.5 and 2.6 that $\|T_dU_1(t)W_\pm u\|_{L^2} = O(|t|^{-2})$ as $|t|\to\infty$. These facts enable us to justify the rather formal computation below. Since $V_+ = 0$ (see Remark 2.1), we can write $V_-$ in the integral form
\[ V_- = -i\int U(-t)T_dU_1(t)\,dt \]
and hence we obtain
\[ (S_0u,v) = -i\int (T_dU_1(t)W_-u,\ \tilde V_+U_1(t)W_+v)\,dt \]
by the intertwining property $U(t)\tilde V_+ = \tilde V_+U_1(t)$. If we further make use of the relation
\[ \tilde V_+ = \tilde J_{0d} + i\int_0^\infty U(-s)\tilde T_dU_1(s)\,ds, \]
then we have
\[ (S_0u,v) = -i\int (\tilde J_{0d}^*T_dU_1(t)W_-u,\ U_1(t)W_+v)\,dt - \int\!\!\int_0^\infty (\tilde T_d^*U(s)T_dU_1(t)W_-u,\ U_1(t+s)W_+v)\,dt\,ds. \]
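The passage from the one-term to the two-term expression for $(S_0u,v)$ is a short bookkeeping of factors of $i$. The following sketch assumes that the scalar product is linear in its first slot and conjugate-linear in its second (which is what the later step $F_{-1}(\lambda)U_1(t) \mapsto e^{it\lambda}$ in the first slot suggests), and that the unlabeled $t$-integrals run over the whole real line.

```latex
\begin{align*}
(S_0u,v)
 &= -i\int \bigl(T_dU_1(t)W_-u,\ \tilde V_+U_1(t)W_+v\bigr)\,dt \\
 &= -i\int \bigl(T_dU_1(t)W_-u,\ \tilde J_{0d}U_1(t)W_+v\bigr)\,dt \\
 &\quad -i\int \Bigl(T_dU_1(t)W_-u,\
      i\int_0^\infty U(-s)\tilde T_dU_1(s)U_1(t)W_+v\,ds\Bigr)\,dt \\
 &= -i\int \bigl(\tilde J_{0d}^*T_dU_1(t)W_-u,\ U_1(t)W_+v\bigr)\,dt \\
 &\quad +(-i)\,\overline{i}\int\!\!\int_0^\infty
      \bigl(\tilde T_d^*U(s)T_dU_1(t)W_-u,\ U_1(t+s)W_+v\bigr)\,ds\,dt ,
\end{align*}
% Here U(-s)^* = U(s) and U_1(s)U_1(t) = U_1(t+s) were used,
% and (-i)\overline{i} = (-i)(-i) = -1 gives the minus sign of the second term.
```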

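We denote by $I_1$ the first integral on the right side and by $I_2$ the second one. We calculate $I_1$ as
\begin{align*}
I_1 &= -i\int\!\!\int_0^\infty \langle F_{-1}(\lambda)\tilde J_{0d}^*T_dU_1(t)W_-u,\ F_{-1}(\lambda)U_1(t)W_+v\rangle\,d\lambda\,dt \\
&= -i\int\!\!\int_0^\infty \langle F_{-1}(\lambda)\tilde J_{0d}^*T_d\bigl(e^{it\lambda}U_1(t)\bigr)W_-u,\ F_{-1}(\lambda)W_+v\rangle\,d\lambda\,dt \\
&= -i\lim_{\varepsilon\downarrow0}\int\!\!\int_0^\infty \langle F_{-1}(\lambda)\tilde J_{0d}^*T_d\bigl(e^{-\varepsilon|t|}e^{it\lambda}U_1(t)\bigr)W_-u,\ F_{-1}(\lambda)W_+v\rangle\,d\lambda\,dt.
\end{align*}
The formula
\[ \lim_{\varepsilon\to0}\int e^{-\varepsilon|t|}e^{it\lambda}U_1(t)\,dt = i\bigl(R(\lambda-i0;H_1) - R(\lambda+i0;H_1)\bigr) = 2\pi F_{\pm1}(\lambda)^*F_{\pm1}(\lambda) \]
is well known in the stationary scattering theory. Hence it follows from (3.1) that
\[ I_1 = 2\pi i\int_0^\infty \langle -F_{-1}(\lambda)\tilde J_{0d}^*T_dF_{+1}(\lambda)^*F(\lambda)u,\ F(\lambda)v\rangle\,d\lambda. \]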
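The limit formula quoted above can be recovered from the functional calculus. The sketch below assumes the resolvent convention $R(z;H) = (H-z)^{-1}$, which matches the representation $R(E+i0;H_0) = i\int_0^\infty e^{itE}e^{-itH_0}\,dt$ used later in this section; the symbol $E_{H_1}$ for the spectral measure of $H_1$ is introduced here only for illustration, and the limit is understood in the weak sense.

```latex
\begin{align*}
\int_0^\infty e^{-\varepsilon t}e^{it(\lambda-H_1)}\,dt
  &= \frac{1}{\varepsilon-i(\lambda-H_1)}
   = -i\,(H_1-\lambda-i\varepsilon)^{-1}
   = -i\,R(\lambda+i\varepsilon;H_1), \\
\int_{-\infty}^0 e^{\varepsilon t}e^{it(\lambda-H_1)}\,dt
  &= \frac{1}{\varepsilon+i(\lambda-H_1)}
   = i\,(H_1-\lambda+i\varepsilon)^{-1}
   = i\,R(\lambda-i\varepsilon;H_1), \\
\int_{-\infty}^\infty e^{-\varepsilon|t|}e^{it\lambda}U_1(t)\,dt
  &= i\bigl(R(\lambda-i\varepsilon;H_1)-R(\lambda+i\varepsilon;H_1)\bigr)
   = \frac{2\varepsilon}{(H_1-\lambda)^2+\varepsilon^2} \\
  &\longrightarrow\ 2\pi\,\frac{dE_{H_1}}{d\lambda}(\lambda)
   \qquad (\varepsilon\downarrow0).
\end{align*}
```

The last expression is $2\pi$ times the spectral density of $H_1$ at $\lambda$, which in the spectral representation is $F_{\pm1}(\lambda)^*F_{\pm1}(\lambda)$.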

A similar computation gives
\[ I_2 = 2\pi i\int_0^\infty \langle F_{-1}(\lambda)\tilde T_d^*R(\lambda+i0;H_d)T_dF_{+1}(\lambda)^*F(\lambda)u,\ F(\lambda)v\rangle\,d\lambda, \]
where the resolvent $R(\lambda+i0;H_d)$ comes from the integration in the variable $s$. We combine the two relations above to obtain
\[ \int_0^\infty \langle S_0(\lambda)F(\lambda)u, F(\lambda)v\rangle\,d\lambda = 2\pi i\int_0^\infty \langle F_{-1}(\lambda)\bigl(-\tilde J_{0d}^*T_d + \tilde T_d^*R(\lambda+i0;H_d)T_d\bigr)F_{+1}(\lambda)^*F(\lambda)u,\ F(\lambda)v\rangle\,d\lambda \]
for $u, v$ as in (3.4). The Fourier expansion and the limit procedure show that this relation remains true for $u,v\in L^2$ such that $(Fu)(\lambda,\theta) = g(\lambda)\eta(\theta)$ and $(Fv)(\lambda,\theta) = \tilde g(\lambda)\tilde\eta(\theta)$, where $\eta,\tilde\eta\in C^\infty(S^1)$, and $g,\tilde g\in C_0^\infty(0,\infty)$ have compact support away from the origin. This completes the proof. ✷

We write $S_0(\theta',\theta;\lambda)$ for the kernel of the fiber $S_0(\lambda)$. As is easily seen,
\[ S(\tilde\omega,\omega;E,H_d,H_0) = S_0(\tilde\omega,\omega;E) \]
and hence it follows from Lemma 3.1 that
\[ f_d(\omega\to\tilde\omega;E) = -(ic(E)/4\pi)\bigl(T_d\varphi_{+1}(\omega,E),\ \tilde J_{0d}\varphi_{-1}(\tilde\omega,E)\bigr) + (ic(E)/4\pi)\bigl(R(E+i0;H_d)T_d\varphi_{+1}(\omega,E),\ \tilde T_d\varphi_{-1}(\tilde\omega,E)\bigr), \]
where $\varphi_{\pm1}(\omega,E) = \varphi_{\pm1}(x;\omega,E)$. By Proposition 2.1, $\varphi_{\pm1}(x;\omega,E)$ is bounded uniformly in $x\in\mathbf{R}^2$. Roughly speaking, the supports of the symbols $T_d(x,\xi)$ and $\tilde J_{0d}(x,\xi)$ do not intersect, provided that $\omega\ne\tilde\omega$. A simple calculus of pseudo-differential operators yields that
\[ \bigl(T_d\varphi_{+1}(\omega,E),\ \tilde J_{0d}\varphi_{-1}(\tilde\omega,E)\bigr) = O(|d|^{-N}) \]
and hence we have
\[ f_d(\omega\to\tilde\omega;E) = (ic(E)/4\pi)\bigl(R(E+i0;H_d)T_d\varphi_{+1}(\omega,E),\ \tilde T_d\varphi_{-1}(\tilde\omega,E)\bigr) + o(1). \tag{3.5} \]

3.2. The second step is to study the behavior as $|d|\to\infty$ of the term on the right side of (3.5) by making use of estimates on the resolvent $R(E+i0;H_d)$. We introduce new notation to formulate the resolvent estimates. Let $0<\sigma\ll1$ be still fixed small enough and write $\hat x$ for the direction $x/|x|$. We set
\[ B_{1d} = \{x : |x| < C|d|^\sigma\}, \qquad B_{2d} = \{x : |x-d| < C|d|^\sigma\} \]

and ˆ < δ, |x − d| > δ|d|σ , |(x ˆ < δ} x − d| − d) + d| Λd = {x : |x| > δ|d|σ , |ˆ for some C 1, and we denote by b1d , b2d and λd the characteristic function of B1d , B2d and Λd respectively. We further denote by the norm of bounded operators acting on L2 , and we use the notation Qd O(|d|ν ) when Qd : L2 → L2 obeys the bound Qd ≤ cε |d|ν+ε , |d| 1, for any ε > 0. The proof of the main theorem is based on the following three lemmas. Lemma 3.2 Let rL be the pseudo-differential operator defined by rL = rL (x, Dx ) = (|x|2 + |d|2 )−L/2 Dx −L for L 1. Then one has : (1)

rL R(E + i0; Hd )b1d = O(|d|−L/2 ) ; similarly for b2d and λd .

(2)

rL R(E + i0; Hd )rL = O(|d|−L ).

(3.6)




The estimates in the lemma are very rough. This lemma is used to control error terms which arise in constructing outgoing and incoming approximations to the resolvent R(E + i0; Hd ). According to the principle of limiting absorption (Proposition 7.3), we know that R(E +i0; Hd ) is bounded from L2s (R2 ) to L2−s (R2 ) for s > 1/2, but we do not here intend to pursue how sharp the resolvent estimate can be made. The proof of the theorem does not require such a sharp estimate. Lemma 3.3 One has b1d R(E + i0; Hd )b2d O(|d|−1/2+4σ ) and

b1d R(E + i0; Hd ) − R(E + i0; H1 ) b1d O(|d|−1+7σ ), b2d R(E + i0; Hd ) − R(E + i0; H2,d ) b2d O(|d|−1+7σ ).

ˆ Then one has Lemma 3.4 Write γd (x) for γ(x − d; d). b2d R(E + i0; Hd )λd x−1 O(|d|−1/2+3σ ) and b1d R(E + i0; Hd ) − eiα2 γd R(E + i0; H1 )e−iα2 γd λd x−1 O(|d|−1+6σ ), x−1 λd R(E + i0; Hd ) − eiα2 γd R(E + i0; H1 )e−iα2 γd λd x−1 O(|d|−1+5σ ). Remark 3.1 All the lemmas remain true for R(E − i0; Hd ). Thus Lemma 3.2 shows b1d R(E + i0; Hd )rL = O(|d|−L/2 ) by adjoint. In the argument below, we use such an immediate consequence without further references. We shall complete the proof of Theorem 1.1, accepting these lemmas as ˆ E) only. If ω = proved. To fix the idea, we prove the theorem for fd (dˆ → −d; ˆ ˆ −d, we represent fd (−d → ω ˜ ; E) in terms of the eigenfunction ϕ∓2 (x; θ, λ) of H2 = H(Aα2 ) and the other cases are more easier to deal with. If, in fact, ω = ±dˆ ˆ then the situation becomes much simpler and the proof does not and ω ˜ = ±d, require Lemma 3.4. ˜ jd be as in (3.2) and (3.3) respectively. We set Let Γjd and Γ ˜ kd ϕ−1 ) γjk = (ic(E)/4π)(R(E + i0; Hd )Γjd ϕ+1 , Γ


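for $1\le j,k\le3$, where $\varphi_{+1} = \varphi_{+1}(x;\hat d,E)$ and $\varphi_{-1} = \varphi_{-1}(x;-\hat d,E)$. To prove the theorem, we have only to show that
\[ \gamma_{jk} = o(1), \qquad j\ne k, \tag{3.7} \]
\[ \gamma_{33} = o(1) \tag{3.8} \]
and
\[ \gamma_{11} = f_1(\hat d\to-\hat d;E) + o(1), \tag{3.9} \]
\[ \gamma_{22} = (\cos\alpha_1\pi)^2f_{2,d}(\hat d\to-\hat d;E) + o(1). \tag{3.10} \]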

When $\omega = \hat d$ and $\tilde\omega = -\hat d$, we may take the two functions $j_0$ and $\tilde j_0$ in such a way that these functions coincide with each other. Thus we assume that $j_0 = \tilde j_0$. The three lemmas above can be seen to remain true for the smooth functions
\[ b_{1d}(x) = \chi(|x|/C|d|^\sigma), \qquad b_{2d}(x) = \chi(|x-d|/C|d|^\sigma) \]
and
\[ \lambda_d(x) = \bigl(1-\chi(2|x|/\delta|d|^\sigma)\bigr)\,\chi(|\hat x-\hat d|/\delta)\,\bigl(1-\chi(2|x-d|/\delta|d|^\sigma)\bigr)\,\chi(|\widehat{(x-d)}+\hat d|/\delta) \]
associated with the three sets $B_{1d}$, $B_{2d}$ and $\Lambda_d$ respectively. We use the notation $b_{1d}$, $b_{2d}$ and $\lambda_d$ with the meaning ascribed above throughout the proof of (3.7) ∼ (3.10). We begin with (3.8). The proof is based on the following lemma.

Lemma 3.5 Let $r_L = r_L(x,D_x)$, $L\gg1$, be defined by (3.6) and let $\lambda_d(x)$ be as above. Then $\Gamma_{3d}\varphi_{+1}$ and $\tilde\Gamma_{3d}\varphi_{-1}$ take the form
\[ \Gamma_{3d}\varphi_{+1} = \lambda_d\Gamma_{3d}\varphi_{+1} + r_Le_d, \qquad \tilde\Gamma_{3d}\varphi_{-1} = \lambda_d\tilde\Gamma_{3d}\varphi_{-1} + r_L\tilde e_d, \]
where the $L^2$ norm of the remainder terms $e_d$ and $\tilde e_d$ is bounded uniformly in $d$.

Proof. The proof uses Proposition 2.1. Roughly speaking, the symbol $\Gamma_{3d}(x,\xi)$ has support on $\mathrm{supp}\,j_{0d}$ in the variable $x$ and on $\mathrm{supp}\,\nabla\beta_0$ in the variable $\xi$. By (2.6), $j_{0d}(x) = j_0(x-d)$ has support in $\{x : x-d\in\Sigma(|d|^\sigma,-\hat d,\delta)\}$, and $\nabla\beta_0$ has support in $\{\xi : \delta^2/2 < |\xi-\sqrt E\hat d| < \delta^2\}$ for the incident direction $\hat d$. If $\beta(\xi)$ vanishes around $\xi = \sqrt E\hat d$, then $\beta(D_x)\exp(i\sqrt Ex\cdot\hat d) = 0$, and if $x\in\mathrm{supp}\,j_{0d}\cap\Lambda_d^c$ and $\xi\in\mathrm{supp}\,\nabla\beta_0$, then
\[ \bigl|\nabla\bigl(\sqrt E|x| - \xi\cdot x\bigr)\bigr| = \bigl|\sqrt E\hat x - \xi\bigr| > c > 0. \]
Thus the first relation follows from Proposition 2.1 (1) and (4). A similar argument applies to the second one and the proof is complete. ✷

Lemma 3.5 implies (3.8). The symbols $\Gamma_{3d}(x,\xi)$ and $\tilde\Gamma_{3d}(x,\xi)$ fall off with order $O(|x|^{-2})$ at infinity uniformly in $d$. By Proposition 2.1 (4), $\langle x\rangle\lambda_d\Gamma_{3d}\varphi_{+1}$ and $\langle x\rangle\lambda_d\tilde\Gamma_{3d}\varphi_{-1}$ are of order $O(\log|d|)$ in the $L^2$ norm, and by the principle of limiting absorption,
\[ \langle x\rangle^{-\rho}R(E+i0;H_1)\langle x\rangle^{-\rho} : L^2\to L^2 \]
is bounded for any $\rho>1/2$. Hence (3.8) follows from Lemmas 3.2 and 3.4.

To prove (3.7), we further prove one lemma. We write $\beta_0$, $\tilde\beta_0$, $\beta_1$ and $\tilde\beta_1$ for the pseudo-differential operators with symbols
\begin{align*}
\beta_0(\xi) &= \chi(2|\xi-\sqrt E\hat d|/\delta^2), & \tilde\beta_0(\xi) &= \chi(2|\xi+\sqrt E\hat d|/\delta^2), \\
\beta_1(\xi) &= \chi(|\xi-\sqrt E\hat d|/\delta^2), & \tilde\beta_1(\xi) &= \chi(|\xi+\sqrt E\hat d|/\delta^2),
\end{align*}
respectively. By definition, $\beta_1\beta_0 = \beta_0$ and $\tilde\beta_1\tilde\beta_0 = \tilde\beta_0$. Let $\lambda(x)$ be a smooth function such that $\partial_x^\beta\lambda = O(|x|^{-|\beta|})$ and
\[ \mathrm{supp}\,\lambda\subset\{x : |x-d| > C|d|^\sigma,\ |\widehat{(x-d)}+\hat d| > \delta\} \]
for $C\gg1$. We construct an outgoing approximation for $R(E+i0;H_d)\lambda\beta_0$ and an incoming one for $R(E-i0;H_d)\lambda\tilde\beta_0$. To do this, we take a function $j\in C^\infty(\mathbf{R}^2)$ such that $\partial_x^\beta j = O(|x|^{-|\beta|})$ and
\[ \mathrm{supp}\,j\subset\{x : |x-d| > |d|^\sigma,\ |\widehat{(x-d)}+\hat d| > \delta/4\} \]
and $j(x) = 1$ on $\{x : |x-d| > 2|d|^\sigma,\ |\widehat{(x-d)}+\hat d| > \delta/2\}$. Hence $j = 1$ on the support of $\lambda$.

Lemma 3.6 Let the notation be as above and let $\theta_d(x)$ be defined by
\[ \theta_d(x) = \alpha_1\gamma(x;-\hat d) + \alpha_2\gamma(x-d;-\hat d). \]
Then one has
\begin{align*}
R(E+i0;H_d)\lambda\beta_0 &= j\exp(i\theta_d)R(E+i0;H_0)\beta_1\exp(-i\theta_d)\lambda\beta_0 + R(E+i0;H_d)\tilde r_L, \\
R(E-i0;H_d)\lambda\tilde\beta_0 &= j\exp(i\theta_d)R(E-i0;H_0)\tilde\beta_1\exp(-i\theta_d)\lambda\tilde\beta_0 + R(E-i0;H_d)\tilde r_L
\end{align*}
for $L\gg1$, where $\tilde r_L$ denotes an operator such that
\[ \tilde r_L\langle D_x\rangle^L(|x|^2+|d|^2)^{L/2}, \qquad \langle D_x\rangle^L(|x|^2+|d|^2)^{L/2}\tilde r_L : L^2\to L^2 \]

are bounded uniformly in d. Proof. We prove only the first relation. We calculate (Hd − E)j exp(iθd ) = exp(iθd )(H0 − E)j

(3.11)


by use of a relation similar to (2.7). Hence
\[ (H_d-E)\,j\exp(i\theta_d)R(E+i0;H_0)\beta_1\exp(-i\theta_d)\lambda\beta_0 = \lambda\beta_0 + \tilde r_L + \exp(i\theta_d)[H_0,j]R(E+i0;H_0)\beta_1\exp(-i\theta_d)\lambda\beta_0. \]
The resolvent $R(E+i0;H_0)$ is represented in the integral form
\[ R(E+i0;H_0) = i\int_0^\infty e^{itE}\exp(-itH_0)\,dt. \]
If we choose $\delta$ small enough, then the free particle with initial state $(x,\xi)\in\mathrm{supp}\,\lambda\times\mathrm{supp}\,\beta_1$ does not pass over $\mathrm{supp}\,\nabla j$ for $t>0$, so that we can put
\[ \tilde r_L = \exp(i\theta_d)[H_0,j]R(E+i0;H_0)\beta_1\exp(-i\theta_d)\lambda\beta_0 \]
for the remainder term on the right side of the above relation. In fact, this can be shown in the standard way by repeated integration by parts. Thus the proof is complete. ✷

We proceed to the proof of (3.7). We first consider the term $\gamma_{13}$. Recall that $\chi_\infty = 1-\chi(2|x|/|d|^\sigma)$, so that $\nabla\chi_\infty$ has support on $\{x : |d|^\sigma/2 < |x| < |d|^\sigma\}\subset B_{1d}$. Since $\Gamma_{1d}\varphi_{+1}$ is uniformly bounded in $L^2$, we have
\[ \gamma_{13} = (ic(E)/4\pi)\bigl(e^{i\alpha_2\gamma_d}R(E+i0;H_1)e^{-i\alpha_2\gamma_d}\Gamma_{1d}\varphi_{+1},\ \lambda_d\tilde\Gamma_{3d}\varphi_{-1}\bigr) + o(1) \]
by Lemmas 3.2, 3.4 and 3.5, where $\gamma_d(x) = \gamma(x-d;\hat d)$. We construct approximations for the resolvent $R(E\pm i0;H_1)$. Let
\[ \lambda_{1d}(x) = \bigl(1-\chi(4|x|/|d|^\sigma)\bigr)\,\chi(|x|/|d|^\sigma)\,\chi(|\hat x+\hat d|/\delta) \]
be the smooth function associated with the set
\[ \Lambda_{1d} = \{x : |d|^\sigma/2 < |x| < |d|^\sigma,\ |\hat x+\hat d| < \delta\}. \]
Assume that $x\in\mathrm{supp}\,\nabla\chi_\infty$ satisfies $|\hat x+\hat d| > \delta$ and $\xi\in\mathrm{supp}\,\beta_0$. Then it follows that $|x+t\xi| > c\,(t+|x|)$, $c>0$, for $t>0$. Hence the particle starting from the initial state $(x,\xi)$ at $t=0$ moves like the free particle and it does not take momentum around $-\sqrt E\hat d\in\mathrm{supp}\,\tilde\beta_0$. This enables us to construct an outgoing approximation in the form
\[ \tilde\Gamma_{3d}^*\lambda_de^{i\alpha_2\gamma_d}R(E+i0;H_1)e^{-i\alpha_2\gamma_d}(1-\lambda_{1d})\Gamma_{1d} = \tilde r_L + \tilde\Gamma_{3d}^*\lambda_de^{i\alpha_2\gamma_d}R(E+i0;H_1)\tilde r_L \]
for any $L\gg1$. The construction is based on the same idea as in the proof of Lemma 3.6. Thus we obtain
\[ \gamma_{13} = (ic(E)/4\pi)\bigl(\lambda_{1d}\Gamma_{1d}\varphi_{+1},\ e^{i\alpha_2\gamma_d}R(E-i0;H_1)e^{-i\alpha_2\gamma_d}\lambda_d\tilde\Gamma_{3d}\varphi_{-1}\bigr) + o(1). \]




We further construct an incoming approximation for $R(E-i0;H_1)$. If $x\in\Lambda_d$ and $\xi\in\mathrm{supp}\,\tilde\beta_0$, then the particle with initial state $(x,\xi)$ does not pass over $\Lambda_{1d}$ for $t<0$. Hence we get $\gamma_{13} = o(1)$ by constructing an approximation
\[ \lambda_{1d}e^{i\alpha_2\gamma_d}R(E-i0;H_1)e^{-i\alpha_2\gamma_d}\lambda_d\tilde\Gamma_{3d} = \tilde r_L + \lambda_{1d}e^{i\alpha_2\gamma_d}R(E-i0;H_1)\tilde r_L. \]
Similarly we can show $\gamma_{31} = o(1)$.

Next we consider the term $\gamma_{23}$. Recall that $\nabla j_{0d}$, $j_{0d} = j_0(x-d)$, has support on
\[ \{x : x-d\in\Sigma(|d|^\sigma,-\hat d,\delta)\setminus\Sigma(2|d|^\sigma,-\hat d,\delta/2)\}. \]
We construct an outgoing approximation for $R(E+i0;H_d)(1-b_{2d})\Gamma_{2d}$. By Lemma 3.6, the approximation takes the form
\[ \tilde\Gamma_{3d}^*R(E+i0;H_d)(1-b_{2d})\Gamma_{2d} = \tilde r_L + \tilde\Gamma_{3d}^*R(E+i0;H_d)\tilde r_L, \]
and hence we have
\[ \gamma_{23} = (ic(E)/4\pi)\bigl(R(E+i0;H_d)b_{2d}\Gamma_{2d}\varphi_{+1},\ \lambda_d\tilde\Gamma_{3d}\varphi_{-1}\bigr) + o(1) \]
by Lemmas 3.2 and 3.5. Since $b_{2d}\Gamma_{2d}\varphi_{+1}$ is uniformly bounded in $L^2$, the desired bound $\gamma_{23} = o(1)$ follows from Lemma 3.4. A similar argument applies to the other terms $\gamma_{21}$, $\gamma_{12}$ and $\gamma_{32}$. Thus (3.7) is verified.

We prove (3.9). We first apply Lemma 3.3 to obtain
\[ \gamma_{11} = (ic(E)/4\pi)\bigl(R(E+i0;H_1)\Gamma_{1d}\varphi_{+1},\ \tilde\Gamma_{1d}\varphi_{-1}\bigr) + o(1). \]
Next we construct an outgoing approximation for $R(E+i0;H_1)(1-\lambda_{1d})\Gamma_{1d}$ and an incoming one for $R(E-i0;H_1)(1-\lambda_{1d})\tilde\Gamma_{1d}$ as in Lemma 3.6. Then we get
\[ \gamma_{11} = (ic(E)/4\pi)\bigl(R(E+i0;H_1)\lambda_{1d}\Gamma_{1d}\varphi_{+1},\ \lambda_{1d}\tilde\Gamma_{1d}\varphi_{-1}\bigr) + o(1). \tag{3.12} \]
The set $\Lambda_{1d}$ does not contain a conical neighborhood of the direction $\hat d$. Hence it follows from Proposition 2.1 (1) that
\[ \varphi_{+1} = \varphi_{+1}(x;\hat d,E) = e^{i\alpha_1(\gamma(x;\hat d)-\pi)}\varphi_0(\hat d,E) + e^{i\sqrt E|x|}O(|x|^{-1/2}) \]
on $\Lambda_{1d}$, where $\varphi_0(\hat d,E) = \exp(i\sqrt Ex\cdot\hat d)$. If $\xi\in\mathrm{supp}\,\beta_0$, then
\[ \bigl|\nabla\bigl(\sqrt E|x| - \xi\cdot x\bigr)\bigr| = \bigl|\sqrt E\hat x - \xi\bigr| > c > 0 \]
for $x\in\Lambda_{1d}$. This implies that the remainder term is negligible. We note that $j_{0d} = 1$ and
\[ e^{i\alpha_2\gamma(x-d;\hat d)} = e^{i\alpha_2\pi} + O(|d|^{-1+\sigma}) \]
on $\Lambda_{1d}$. Since $\beta_0(D_x)\varphi_0 = \varphi_0$ for $\varphi_0 = \varphi_0(\hat d,E)$, we have
\[ \lambda_{1d}\Gamma_{1d}\varphi_{+1} = \lambda_{1d}e^{i(\alpha_2-\alpha_1)\pi}\Bigl(e^{i\alpha_1\gamma(x;\hat d)}[H_0,\chi_\infty^2]\varphi_0(\hat d,E) + O(|d|^{-1+\sigma})\Bigr). \]
Similarly
\[ \lambda_{1d}\tilde\Gamma_{1d}\varphi_{-1} = \lambda_{1d}e^{i(\alpha_2-\alpha_1)\pi}\Bigl(e^{i\alpha_1\gamma(x;\hat d)}[H_0,\chi_\infty^2]\varphi_0(-\hat d,E) + O(|d|^{-1+\sigma})\Bigr). \]
Hence we have
\[ \gamma_{11} = (ic(E)/4\pi)\bigl(R(E+i0;H_1)\lambda_{1d}\Phi_{1d}(\hat d,E),\ \lambda_{1d}\Phi_{1d}(-\hat d,E)\bigr) + o(1), \]
where
\[ \Phi_{1d}(\omega,E) = \Phi_{1d}(x;\omega,E) = e^{i\alpha_1\gamma(x;\hat d)}[H_0,\chi_\infty^2]\varphi_0(\omega,E). \]
We further obtain
\[ \gamma_{11} = (ic(E)/4\pi)\bigl(R(E+i0;H_1)\Phi_{1d}(\hat d,E),\ \Phi_{1d}(-\hat d,E)\bigr) + o(1) \]
by repeating the same argument as used to derive (3.12). We split $[H_0,\chi_\infty^2]$ into
\[ [H_0,\chi_\infty^2] = \chi(|x|/2|d|^\sigma)\bigl([H_0,j_1\chi_\infty^2] + [H_0,(1-j_1)\chi_\infty^2]\bigr), \]
where $j_1\in C^\infty(\mathbf{R}^2)$ is a real function such that $\partial_x^\beta j_1 = O(|x|^{-|\beta|})$ and
\[ \mathrm{supp}\,j_1\subset\Sigma(|d|^\sigma/4,-\hat d,\delta), \qquad j_1 = 1 \text{ on } \Sigma(|d|^\sigma/2,-\hat d,\delta/2). \]
We see that only the first commutator makes a contribution. This can be shown by constructing outgoing and incoming approximations for the second commutator. Thus (3.9) is obtained by Lemma 2.3 with $j_0 = \tilde j_0 = j_1\chi_\infty^2$.

The proof of (3.10) is similar, with slight differences. By Lemma 3.6, we construct an outgoing approximation
\[ \tilde\Gamma_{2d}^*R(E+i0;H_d)(1-b_{2d})\Gamma_{2d} = \tilde r_L + \tilde\Gamma_{2d}^*R(E+i0;H_d)\tilde r_L \]
and an incoming approximation
\[ R(E-i0;H_d)(1-b_{2d})\tilde\Gamma_{2d} = je^{i\theta_d}R(E-i0;H_0)\tilde\beta_1e^{-i\theta_d}(1-b_{2d})\tilde\Gamma_{2d} + R(E-i0;H_d)\tilde r_L. \]
We know by the resolvent estimate of [9] that
\[ \langle x\rangle^{-s-\tau}R(E-i0;H_0)\tilde\beta_1(1-b_{2d})\langle x\rangle^s : L^2\to L^2, \qquad s>0, \]
is bounded for $\tau>1$. Hence we have
\[ \gamma_{22} = (ic(E)/4\pi)\bigl(R(E+i0;H_d)b_{2d}\Gamma_{2d}\varphi_{+1},\ b_{2d}\tilde\Gamma_{2d}\varphi_{-1}\bigr) + o(1) \]




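by Lemma 3.2, and it follows from Lemma 3.3 that
\[ \gamma_{22} = (ic(E)/4\pi)\bigl(R(E+i0;H_{2,d})b_{2d}\Gamma_{2d}\varphi_{+1},\ b_{2d}\tilde\Gamma_{2d}\varphi_{-1}\bigr) + o(1). \]
Let $\Lambda_{2d} = \{x : |d|^\sigma < |x-d| < C|d|^\sigma,\ |\widehat{(x-d)}+\hat d| < \delta\}$ for $C\gg1$, and denote by
\[ \lambda_{2d}(x) = \bigl(1-\chi(2|x-d|/|d|^\sigma)\bigr)\,\chi(|x-d|/C|d|^\sigma)\,\chi(|\widehat{(x-d)}+\hat d|/\delta) \]
the smooth function associated with $\Lambda_{2d}$. Then we obtain
\[ \gamma_{22} = (ic(E)/4\pi)\bigl(R(E+i0;H_{2,d})\lambda_{2d}\Gamma_{2d}\varphi_{+1},\ \lambda_{2d}\tilde\Gamma_{2d}\varphi_{-1}\bigr) + o(1) \]
in the same way as (3.12). By the principle of limiting absorption,
\[ \langle x-d\rangle^{-\rho}R(E+i0;H_{2,d})\langle x-d\rangle^{-\rho} : L^2\to L^2 \]
is bounded uniformly in $d$ for any $\rho>1/2$, and by Proposition 2.1 (3) with $q = 1-\sigma$, the eigenfunction $\varphi_{\pm1}$ behaves like
\begin{align*}
\varphi_{+1} &= \varphi_{+1}(x;\hat d,E) = \cos\alpha_1\pi\times\varphi_0(x;\hat d,E) + O(|d|^{-\nu}), \\
\varphi_{-1} &= \varphi_{-1}(x;-\hat d,E) = \cos\alpha_1\pi\times\varphi_0(x;-\hat d,E) + O(|d|^{-\nu})
\end{align*}
on $\Lambda_{2d}$, where $\nu = 2(1/2-\sigma)/3$. Since $\langle x-d\rangle^\rho\le c\,|d|^{\rho\sigma}$ on $\Lambda_{2d}$ and $2\rho\sigma<\nu$ for $\sigma$ small enough, we have
\[ \gamma_{22} = (\cos\alpha_1\pi)^2(ic(E)/4\pi)\bigl(R(E+i0;H_{2,d})\lambda_{2d}\Gamma_{2d}\varphi_0(\hat d,E),\ \lambda_{2d}\tilde\Gamma_{2d}\varphi_0(-\hat d,E)\bigr) + o(1). \]
The commutator $[H_1,j_{0d}]$ is calculated as
\begin{align*}
[H_1,j_{0d}] = [H(A_{\alpha_1}),j_{0d}] &= e^{i\alpha_1\gamma(x;-\hat d)}[H_0,j_{0d}]e^{-i\alpha_1\gamma(x;-\hat d)} \\
&= \bigl(e^{i\alpha_1\pi}+O(|d|^{-1+\sigma})\bigr)[H_0,j_{0d}]\bigl(e^{-i\alpha_1\pi}+O(|d|^{-1+\sigma})\bigr)
\end{align*}
on $\Lambda_{2d}$. We have assumed that $j_0(x-d) = \tilde j_0(x-d)$. Note that $\chi_\infty = 1$ on $\mathrm{supp}\,\nabla j_{0d}$. Hence we have
\[ \gamma_{22} = (\cos\alpha_1\pi)^2(ic(E)/4\pi)\bigl(R(E+i0;H_{2,d})\lambda_{2d}\Phi_{2d}(\hat d,E),\ \lambda_{2d}\Phi_{2d}(-\hat d,E)\bigr) + o(1), \]
where
\[ \Phi_{2d}(\omega,E) = \Phi_{2d}(x;\omega,E) = e^{i\alpha_2\gamma(x-d;\hat d)}[H_0,j_{0d}]\varphi_0(\omega,E). \]
Thus (3.10) is obtained from Lemma 2.3 after the change of variables $x-d\to x$.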


4 Completion: proof of Lemmas 3.2, 3.3 and 3.4

In this section we prove the three lemmas and complete the proof of Theorem 1.1.

4.1. The proof of the lemmas requires several auxiliary operators. We first define these operators. We fix $0<\sigma_1,\sigma_2\ll1$ small enough, and we define the following two sets
\begin{align*}
\Pi_{1d} &= \{x : |x| < C|d|^{\sigma_1}\}\cup\{x : |x|\ge C|d|^{\sigma_1},\ |\hat x+\hat d| < |d|^{-\sigma_1/2}\}, \tag{4.1} \\
\Pi_{2d} &= \{x : |x-d| < C|d|^{\sigma_2}\}\cup\{x : |x-d|\ge C|d|^{\sigma_2},\ |\widehat{(x-d)}-\hat d| < |d|^{-\sigma_2/2}\}
\end{align*}
for $C\gg1$. These two sets are disjoint for $|d|\gg1$. Let $\zeta_{jd}\in C^\infty(\mathbf{R})$, $1\le j\le2$, be a real periodic function with period $2\pi$ such that $\zeta_{jd}(s) = \alpha_js$ for $s\in(|d|^{-\sigma_j/2},\ 2\pi-|d|^{-\sigma_j/2})$ and
\[ |(d/ds)^m\zeta_{jd}(s)|\le C_m|d|^{m\sigma_j/2} \]
for $C_m>0$ independent of $d$. We define a smooth real function $\eta_{1d}$ by $\eta_{1d}(x) = 0$ for $|x| < |d|^{\sigma_1}/2$ and by
\[ \eta_{1d}(x) = \zeta_{1d}(\gamma(x;-\hat d)) \]
for $|x| > |d|^{\sigma_1}$. We may assume that $\eta_{1d}$ satisfies
\[ |\partial_x^\beta\eta_{1d}(x)|\le C_\beta|d|^{|\beta|\sigma_1/2}|x|^{-|\beta|}\le\tilde C_\beta\langle x\rangle^{-|\beta|/2} \tag{4.2} \]
uniformly in $d$. By definition, we have
\[ \nabla\eta_{1d}(x) = \zeta_{1d}'(\gamma(x;-\hat d))\nabla\gamma(x;-\hat d) = \zeta_{1d}'(\gamma(x;-\hat d))\,(-x_2/|x|^2,\ x_1/|x|^2) \tag{4.3} \]
and hence
\[ \nabla\eta_{1d}(x) = \alpha_1(-x_2/|x|^2,\ x_1/|x|^2) \tag{4.4} \]
for $x\in\Pi_{1d}^c$, where $\Pi_{1d}^c$ is the complement of $\Pi_{1d}$. Similarly we define $\eta_{2d}$ by
\[ \eta_{2d}(x) = \zeta_{2d}(\gamma(x-d;\hat d)) \]
for $|x-d| > |d|^{\sigma_2}$ and by $\eta_{2d}(x) = 0$ for $|x-d| < |d|^{\sigma_2}/2$. We set $p_{1d}(x) = \exp(i\eta_{1d}(x))$ and $q_{1d}(x) = 1/p_{1d}(x)$. By (4.2), we have
\[ |\partial_x^\beta p_{1d}(x)| + |\partial_x^\beta q_{1d}(x)|\le C_\beta\langle x\rangle^{-|\beta|/2} \]
uniformly in $d$. If $x\in\Pi_{1d}^c$, then
\[ p_{1d}(x) = \exp(i\alpha_1\gamma(x;-\hat d)), \qquad q_{1d}(x) = \exp(-i\alpha_1\gamma(x;-\hat d)). \]
Similarly we define $p_{2d}(x) = \exp(i\eta_{2d}(x))$ and $q_{2d}(x) = 1/p_{2d}(x)$. Then
\[ |\partial_x^\beta p_{2d}(x)| + |\partial_x^\beta q_{2d}(x)|\le C_\beta\langle x-d\rangle^{-|\beta|/2} \tag{4.5} \]




and
\[ p_{2d}(x) = \exp(i\alpha_2\gamma(x-d;\hat d)), \qquad q_{2d}(x) = \exp(-i\alpha_2\gamma(x-d;\hat d)) \]
for $x\in\Pi_{2d}^c$. We now introduce the following three operators
\begin{align*}
K_{1d} &= p_{2d}H_1q_{2d} = p_{2d}H(A_{\alpha_1})q_{2d} = H(A_{\alpha_1}+\nabla\eta_{2d}), \\
K_{2d} &= p_{1d}H_{2,d}q_{1d} = p_{1d}H(A_{\alpha_2,d})q_{1d} = H(\nabla\eta_{1d}+A_{\alpha_2,d})
\end{align*}
and
\[ K_{0d} = p_dH_0q_d = H(\nabla\eta_{1d}+\nabla\eta_{2d}) \]
as basic auxiliary operators, where $p_d = p_{1d}p_{2d}$ and $q_d = q_{1d}q_{2d}$. The operator $K_{0d}$ has the domain $D(K_{0d}) = H^2(\mathbf{R}^2)$, $H^s(\mathbf{R}^2)$ being the Sobolev space of order $s$, while $K_{1d}$ and $K_{2d}$ have the domains
\begin{align*}
D(K_{1d}) &= \{u\in L^2 : K_{1d}u\in L^2,\ \lim_{|x|\to0}|u(x)| < \infty\}, \\
D(K_{2d}) &= \{u\in L^2 : K_{2d}u\in L^2,\ \lim_{|x-d|\to0}|u(x)| < \infty\}.
\end{align*}
We consider the difference $W_{1d} = K_{1d}-K_{0d}$. By (4.4), $A_{\alpha_1} = \nabla\eta_{1d}$ on $\Pi_{1d}^c$, and hence $W_{1d} = 0$ there. Similarly we have
\[ H_d-K_{2d} = H(A_{\alpha_1}+A_{\alpha_2,d})-K_{2d} = 0 \]
on $\Pi_{1d}^c$. Since $A_{\alpha_2,d}(x) = A_{\alpha_2}(x-d) = \nabla\eta_2(x-d)$ on $\Pi_{1d}$, we also have
\[ H_d-K_{2d} = K_{1d}-K_{0d} = W_{1d} \]
on $\Pi_{1d}$. A similar argument applies to $W_{2d} = K_{2d}-K_{0d}$. Thus we can obtain the following relations
\[ H_d = K_{1d}+W_{2d}, \qquad H_d = K_{2d}+W_{1d}. \tag{4.6} \]
The difference $W_{jd}$ is a differential operator of first order. For example, $W_{1d}$ takes the form
\[ W_{1d} = 2ie_{1d}(x)\cdot\nabla + e_{0d}(x) \tag{4.7} \]
and the coefficients have support in $\Pi_{1d}$ and a singularity at $x=0$ only. By (4.2) and (4.3), $e_{1d}$ and $e_{0d}$ satisfy
\[ e_{1d}(x) = \bigl(\alpha_1-\zeta_{1d}'(\gamma(x;-\hat d))\bigr)\nabla\gamma = O(|d|^{\sigma_1/2})\nabla\gamma \tag{4.8} \]
with $\gamma = \gamma(x;-\hat d)$ and
\[ e_{0d}(x) = O(|d|^{\sigma_1})|x|^{-2} \tag{4.9} \]
for $|x| > |d|^{\sigma_1}$, and by (4.5), we have
\[ |\partial_x^\beta e_{0d}(x)| + |\partial_x^\beta e_{1d}(x)|\le C_\beta\langle x\rangle^{-|\beta|/2} \tag{4.10} \]


for $|x|>1$ uniformly in $d$. The coefficients of $W_{2d}$ have similar properties. They have support in $\Pi_{2d}$ and a singularity at $x=d$ only.

The domain of $K_{1d}$ or $K_{2d}$ is different from that of $K_{0d}$, and the ordinary resolvent identity is not expected to hold for $(K_{jd},K_{0d})$. However we can derive the following relation
\[ \psi_jR(E+i0;K_{jd}) = R(E+i0;K_{0d})\psi_j - R(E+i0;K_{0d})U_{jd}R(E+i0;K_{jd}) \tag{4.11} \]
for $j=1,2$, where $\psi_1$ and $\psi_2$ are smooth bounded functions vanishing around $x=0$ and $x=d$ respectively, and
\[ U_{jd} = -[K_{jd},\psi_j] + W_{jd}\psi_j. \tag{4.12} \]
We often use the relation with
\[ \psi_1(x) = 1-\chi(|x|/\delta|d|^{\sigma_1}), \qquad \psi_2(x) = 1-\chi(|x-d|/\delta|d|^{\sigma_2}) \tag{4.13} \]
in later application. We shall show (4.11) in a rather formal way. We write the solution $u$ to the equation $(K_{0d}-E)u = \psi_1f$ as $u = \psi_1R(E+i0;K_{1d})f + v$. Since $K_{0d} = K_{1d}-W_{1d}$, the remainder $v$ obeys
\[ (K_{0d}-E)v = \bigl(-[K_{1d},\psi_1]+W_{1d}\psi_1\bigr)R(E+i0;K_{1d})f. \]
This yields the desired relation. Similarly we can show that
\begin{align*}
R(E+i0;H_d)\psi_2 &= \psi_2R(E+i0;K_{1d}) - R(E+i0;H_d)V_{2d}R(E+i0;K_{1d}), \tag{4.14} \\
R(E+i0;H_d)\psi_1 &= \psi_1R(E+i0;K_{2d}) - R(E+i0;H_d)V_{1d}R(E+i0;K_{2d}), \tag{4.15}
\end{align*}
where
\[ V_{2d} = [K_{1d},\psi_2]+W_{2d}\psi_2, \qquad V_{1d} = [K_{2d},\psi_1]+W_{1d}\psi_1. \tag{4.16} \]
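The formal argument for (4.11), and its analogue for (4.14), amounts to commuting the cut-off function through the operator. The sketch below writes $R_0 = R(E+i0;K_{0d})$, $R_1 = R(E+i0;K_{1d})$ and $R_d = R(E+i0;H_d)$, abbreviations introduced here only for brevity, and ignores domain questions, as the text does.

```latex
\begin{align*}
(K_{0d}-E)\,\psi_1R_1
  &= (K_{1d}-W_{1d}-E)\,\psi_1R_1 \\
  &= \psi_1(K_{1d}-E)R_1 + [K_{1d},\psi_1]R_1 - W_{1d}\psi_1R_1 \\
  &= \psi_1 - U_{1d}R_1,
  \qquad U_{1d} = -[K_{1d},\psi_1] + W_{1d}\psi_1, \\
\psi_1R_1 &= R_0\psi_1 - R_0U_{1d}R_1 ; \\[4pt]
(H_d-E)\,\psi_2R_1
  &= (K_{1d}+W_{2d}-E)\,\psi_2R_1 \\
  &= \psi_2 + \bigl([K_{1d},\psi_2] + W_{2d}\psi_2\bigr)R_1
   = \psi_2 + V_{2d}R_1, \\
R_d\psi_2 &= \psi_2R_1 - R_dV_{2d}R_1 .
\end{align*}
```

The first chain uses $K_{0d} = K_{1d}-W_{1d}$ and reproduces (4.11) for $j=1$ after applying $R_0$ from the left; the second uses $H_d = K_{1d}+W_{2d}$ from (4.6) and reproduces (4.14) after applying $R_d$ from the left. The relation (4.15) follows in the same way with the roles of the indices exchanged.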

If $\psi_j$ is taken as in (4.13), then $V_{jd}$ has properties similar to $W_{jd}$. The only difference is that the coefficients of $V_{jd}$ are all smooth and bounded uniformly in $d$. The operator $U_{jd}$ defined by (4.12) also has similar properties.

The argument below requires the Green kernel $G_d(x,y;E)$ of $R(E+i0;K_{0d})$. The resolvent $R(E+i0;H_0)$ has the kernel
\[ G_0(x,y;E) = (i/4)H_0^{(1)}(\sqrt E\,|x-y|), \]
where $H_0^{(1)}(z)$ is the Hankel function of first kind and order zero. As is well known, $H_0^{(1)}(z)$ behaves like
\[ H_0^{(1)}(z) = (2/\pi)^{1/2}\exp(i(z-\pi/4))\,z^{-1/2}\bigl(1+O(|z|^{-1})\bigr) \]
at infinity. Hence $G_d(x,y;E)$ behaves like
\[ G_d = c_0(E)\,p_d(x)\exp(i\sqrt E|x-y|)\,|x-y|^{-1/2}q_d(y)\bigl(1+O(|x-y|^{-1})\bigr) \tag{4.17} \]
as $|x-y|\to\infty$, where $c_0(E) = (1/8\pi)^{1/2}\exp(i\pi/4)E^{-1/4}$.

4.2. Let $\sigma$, $0<\sigma\ll1$, be fixed small enough as in Lemmas 3.2, 3.3 and 3.4. Throughout the argument in this subsection, $K_{1d}$, $K_{2d}$ and $K_{0d}$ are defined with $\sigma_1 = \sigma_2 = \sigma$. We prove several lemmas on the resolvent estimates for these operators before going into the proof of the three lemmas. The functions $b_{1d}$, $b_{2d}$ and $\lambda_d$ again denote the characteristic functions of the sets $B_{1d}$, $B_{2d}$ and $\Lambda_d$ respectively.

Lemma 4.1

b2d R(E + i0; K0d )b1d = O(|d|−1/2+2σ ), b2d R(E + i0; K0d )λd x−1 O(|d|−1/2+σ ).

Proof. To prove the first bound, we evaluate the Hilbert–Schmidt norm of the operator. Since the kernel Gd (x, y; E) of R(E + i0; K0d ) obeys (4.17), this bound follows at once. To prove the second bound, we decompose λd into the sum λd (x) = λd (x) χ(|x − d|/δ|d|) + (1 − χ(|x − d|/δ|d|)) = µ2d (x) + µ1d (x). By the principle of limiting absorption, we have x − d−ρ R(E + i0; K0d )x − d−ρ : L2 → L2 is bounded for any ρ > 1/2. Since |x| > c |d| on the support of µ2d for some c > 0, we can choose ρ so close to 1/2 that b2d R(E + i0; K0d )µ2d x−1 = O(|d|−1+ρ+ρσ ) O(|d|−1/2+σ/2 ). On the other hand, we obtain b2d R(E + i0; K0d )µ1d x−1 O(|d|−1/2+σ ) by evaluating the Hilbert–Schmidt norm. This yields the desired bound. Lemma 4.2 Let V1d = [K2d , ψ1 ] + W1d ψ1 ,

ψ1 (x) = 1 − χ(|x|/δ|d|σ ),

be defined by (4.16) with σ1 = σ. Take ρ > 1/2 close enough to 1/2. Then xρ V1d R(E + i0; K0d )rL = O(|d|−L/2 ), where rL is the pseudo-differential operator defined by (3.6).

✷


Proof. The proof is based on the fact that the free Hamiltonian $H_0$ and $\partial/\partial\theta$ commute with each other. By definition, we have
\[ R(E+i0;K_{0d}) = p_dR(E+i0;H_0)q_d, \]
where $p_d = p_{1d}p_{2d}$ and $q_d = 1/p_d$. By (4.7), (4.8) and (4.9), $V_{1d}$ takes the form
\[ V_{1d} = O(|d|^{\sigma/2})\nabla\gamma\cdot\nabla + O(|d|^\sigma)|x|^{-2}, \qquad \gamma = \gamma(x;-\hat d), \]
in $\{x : |x| > |d|^\sigma\}$. The differential operator $\nabla\gamma\cdot\nabla$ can be written as
\[ \nabla\gamma\cdot\nabla = |x|^{-2}(-x_2\partial_1 + x_1\partial_2) = |x|^{-2}\,\partial/\partial\theta \]
and $p_d$ satisfies the estimate
\[ \nabla p_d = |d|^{\sigma/2}\bigl(O(|x|^{-1}) + O(|x-d|^{-1})\bigr). \]
If we take account of these facts, the lemma is easily verified.

✷
× We introduce a smooth nonnegative We work in the phase space 2 partition of unity over Rξ . The partition {β± , β∞ } is normalized by R2x

R2ξ .

β+ (ξ) + β− (ξ) + β∞ (ξ) = 1

(4.18)

and has the following properties : supp β∞ ⊂ {ξ : |ξ|2 < E/2 or |ξ|2 > 2E} and supp β+ ⊂ {ξ : E/3 < |ξ|2 < 3E, ξˆ · dˆ > −1/4} supp β− ⊂ {ξ : E/3 < |ξ|2 < 3E, ξˆ · dˆ < 1/4}. The proof of the two lemmas below is based on the micro-local estimates for the resolvent of auxiliary operators. We make repeated use of a similar idea in the future discussion. Lemma 4.3 b2d R(E + i0; K1d )b1d O(|d|−1/2+3σ ), b1d R(E + i0; K2d )b2d O(|d|−1/2+3σ ) and

b2d R(E + i0; K1d )λd x−1 O(|d|−1/2+2σ ).

Proof. We prove the first bound only. The second and third bounds are obtained in a similar way. Let ψ1 be as in Lemma 4.2. We use (4.11) for the function ψ1 . Since ψ1 b2d = b2d , we have b2d R(E + i0; K1d )b1d

= b2d R(E + i0; K0d )ψ1 b1d − b2d R(E + i0; K0d )U1d R(E + i0; K1d )b1d .




By Lemma 4.1, the first operator on the right side obeys the bound O(|d|−1/2+2σ ). To evaluate the second operator, we decompose U1d into the sum of four operators 2 U1d + U∞ (x, Dx ) + U+ (x, Dx ) + U− (x, Dx ), U1d = g1d

(4.19)

where g1d (x) = χ(|x|/M |d|σ ) for M 1, and 2 )U1d β± (Dx ), U± (x, Dx ) = (1 − g1d

We have

2 U∞ (x, Dx ) = (1 − g1d )U1d β∞ (Dx ).

b2d R(E + i0; K0d )g1d = O(|d|−1/2+2σ )

in the same way as in the proof of Lemma 4.1. By the principle of limiting absorption, x−ρ R(E + i0; K1d )x−ρ : L2 → L2 is bounded for any ρ > 1/2. Since the coefficients of U1d vanish around x = 0 and are bounded uniformly in d, we have g1d U1d R(E + i0; K1d )b1d O(|d|σ ) by elliptic estimate. Thus 2 U1d R(E + i0; K1d )b1d O(|d|−1/2+3σ ). b2d R(E + i0; K0d )g1d

We now assume that x ∈ Π1d and |x| > M |d|σ , where Π1d is defined by (4.1) with σ1 = σ. Then the symbol of K0d − E takes the form |ξ|2 − E approximately. If ξ ∈ supp β∞ , then it has a bounded inverse. Since Π1d and B2d do not intersect with each other, we have by the standard calculus of pseudo-differential operators that rN b2d R(E + i0; K0d )U∞ = rÑ + b2d R(E + i0; K0d )˜ for any N 1, where rÑ again denotes a bounded operator having the property (3.11). Hence b2d R(E + i0; K0d )U∞ R(E + i0; K1d )b1d = O(|d|−N ). We still assume that x ∈ Π1d and |x| > M |d|σ . If ξ ∈ supp β− , then the free particle with initial state (x, ξ) at t = 0 never passes over B2d for t > 0. Hence we have b2d R(E + i0; K0d )U− R(E + i0; K1d )b1d = O(|d|−N ) by use of the micro-local estimate on the resolvent R(E + i0; K0d ). If, on the other hand, ξ ∈ supp β+ , then we can take M 1 so large that the incoming particle with state (x, ξ) at t = 0 never passes over B1d for t < 0. This enables us to construct an incoming approximation for ∗ ∗ U+ R(E + i0; K1d )b1d = b1d R(E − i0; K1d )U+ .


We use an argument similar to that in the proof of Lemma 3.6. Then the approximation is constructed in the form U+ R(E + i0; K1d )b1d = rÑ + rÑ R(E + i0; K1d )b1d and hence we get b2d R(E + i0; K0d )U+ R(E + i0; K1d )b1d = O(|d|−N ). ✷

Thus the desired bound is obtained. Lemma 4.4 Let ρ > 1/2 and V1d be as in Lemma 4.2. Then xρ V1d R(E + i0; K2d )rL = O(|d|−L/2 ).

Proof. We use (4.11) with ψ2 = 1 − χ(|x − d|/δ|d|σ ). Since V1d ψ2 = V1d , we have xρ V1d R(E + i0; K2d )rL

= xρ V1d R(E + i0; K0d )ψ2 rL − xρ V1d R(E + i0; K0d )U2d R(E + i0; K2d )rL .

By Lemma 4.2, the first operator obeys O(|d|−L/2 ). We decompose U2d into the sum of three operators U2d = U2d β∞ (Dx ) + β+ (Dx ) + β− (Dx ) . The coefficients of U2d have support in {x : |x − d| > δ|d|σ }. If we repeat the same argument as in the proof of Lemma 4.3, then we obtain xρ V1d R(E + i0; K0d )U2d β∞ R(E + i0; K2d )rL = O(|d|−L ), xρ V1d R(E + i0; K0d )U2d β+ R(E + i0; K2d )rL = O(|d|−L ) by Lemma 4.2. We know by the micro-local resolvent estimate ([9, Theorem 1]) that x − ds U2d β− (Dx )R(E + i0; K2d )x − d−s−τ : L2 → L2 ,

s ≥ 0,

is bounded for τ > 1. Hence this, together with Lemma 4.2, yields xρ V1d R(E + i0; K0d )U2d β− R(E + i0; K2d )rL = O(|d|−L/2 ). Thus the proof is complete.

✷

The following two propositions play a basic role in proving the three lemmas.




Proposition 4.1 Define Π1d and Π2d with σ1 = σ2 = σ and denote by πjd (x), j = 1, 2, the characteristic function of Πjd . Let ρ > 1/2. Then one has : (1)

rL R(E + i0; Hd )π1d x−ρ = O(|d|−L/2 ).

(2)

rL R(E + i0; Hd )π2d x − d−ρ = O(|d|−L/2 ).

Proposition 4.2 b2d R(E + i0; Hd )b1d = O(|d|3σ ). 4.3. We proceed to proving the three lemmas in question, accepting the two propositions above as proved. The proof of the propositions is done in section 5. Throughout the proof of the lemmas, ψ1 (x) and ψ2 (x) are defined by (4.13) with σ1 = σ2 = σ. Proof of Lemma 3.2. First it is clear from Proposition 4.1 that rL R(E +i0; Hd )bjd obeys the desired bound. We consider the operator Q = rL R(E + i0; Hd )rL . We decompose Q into the sum Q = rL R(E + i0; Hd )ψ1 rL + rL R(E + i0; Hd )(1 − ψ1 )rL = Q1 + Q2 . The function 1 − ψ1 (x) = χ(|x|/δ|d|σ ) has support around x = 0, and it satisfies W2d (1 − ψ1 ) = 0. We use (4.15) for Q1 and (4.14) for Q2 . Then Q1 Q2

= rL ψ1 R(E + i0; K2d )rL − rL R(E + i0; Hd )V1d R(E + i0; K2d )rL , = rL (1 − ψ1 )R(E + i0; K1d )rL − rL R(E + i0; Hd )V˜2d R(E + i0; K1d )rL ,

where V˜2d = −[K1d , ψ1 ]. We decompose V1d into V1d = (π1d x−ρ ) (xρ V1d ), and we use Lemma 4.4 and Proposition 4.1. Then we obtain Q1 = O(|d|−L ). Since the coefficients of V˜2d have support around x = 0, we have also Q2 = O(|d|−L ) by Proposition 4.1 again. Thus rL R(E + i0; Hd )rL = O(|d|−L )

(4.20)

and (2) is proved. Next we consider the operator R = rL R(E + i0; Hd )λd . By (4.14), R is represented as R = rL ψ2 R(E + i0; K1d )λd − rL R(E + i0; Hd )V2d R(E + i0; K1d )λd . The first operator is easy to evaluate. This obeys the bound O(|d|−L/2 ). To evaluate the second operator, we decompose V2d into the sum of four operators 2 V2d + V∞ (x, Dx ) + V+ (x, Dx ) + V− (x, Dx ), V2d = g2d

(4.21)

where g2d (x) = χ(|x − d|/M |d|σ ) for M 1, and 2 )V2d β± (Dx ), V± (x, Dx ) = (1 − g2d

2 V∞ (x, Dx ) = (1 − g2d )V2d β∞ (Dx ).


According to the decomposition above, we set R0 R∞ R±

2 = rL R(E + i0; Hd )g2d V2d R(E + i0; K1d )λd , = rL R(E + i0; Hd )V∞ R(E + i0; K1d )λd , = rL R(E + i0; Hd )V± R(E + i0; K1d )λd .

Since g2d = O(|d|)x−1 , it follows that g2d R(E + i0; K1d )λd = O(|d|ν ) for some ν > 0, and hence R0 = O(|d|−L/2 ) by Proposition 4.1. We use the micro-local analysis for the operators R∞ and R± . A simple calculus of pseudo-differential operators yields V∞ R(E + i0; K1d )λd = rÑ + rÑ R(E + i0; K1d )λd . Hence it follows from (4.20) that R∞ = O(|d|−L ). Assume that x ∈ Π2d and |x| > M |d|σ . If ξ ∈ supp β− , then we can take M 1 so large that the incoming free particle with state (x, ξ) at t = 0 does not pass over Λd for t < 0. Hence we can construct an incoming approximation V− R(E + i0; K1d )λd = rÑ + rÑ R(E + i0; K1d )λd . If we again use (4.20), then we get R− = O(|d|−L ). To deal with R+ , we construct an outgoing approximation in the form R(E + i0; Hd )V+ = j exp(iθd )R(E + i0; H0 )β˜+ exp(−iθd )V+ + R(E + i0; Hd )˜ rN by an argument similar to that in the proof of Lemma 3.6, where β˜+ ∈ C0∞ (R2ξ ) satisfies β˜+ β+ = β+ , and j(x) and θd (x) are used with the meaning ascribed in Lemma 3.6. The first operator obeys rL R(E + i0; H0 )β˜+ exp(−iθd )V+ = O(|d|−L/2 ) by the micro-local resolvent estimate ([9, Theorem 1]), and the remainder operator is evaluated as O(|d|−L ) by (4.20). Hence we have R+ = O(|d|−L/2 ). This completes the proof. ✷ For later reference, we here note that the proof of Lemma 3.2 does not use Proposition 4.2. Hence we can use Lemma 3.2 to prove Proposition 4.2. Proof of Lemma 3.3. By (4.14) and (4.15), we have the following three relations : b2d R(E + i0; Hd )b1d = b2d ψ2 R(E + i0; K1d )b1d − b2d R(E + i0; Hd )V2d R(E + i0; K1d )b1d , b1d (R(E + i0; Hd ) − R(E + i0; K1d )) b1d = −b1d R(E + i0; Hd )V2d R(E + i0; K1d )b1d ,




\[ b_{2d}\bigl(R(E+i0;H_d) - R(E+i0;K_{2d})\bigr)b_{2d} = -b_{2d}R(E+i0;H_d)V_{1d}R(E+i0;K_{2d})b_{2d}. \]
We decompose $V_{1d}$ as in (4.19) with $g_{1d} = \chi(|x|/M|d|^{\sigma})$ and $V_{2d}$ as in (4.21) with $g_{2d} = \chi(|x-d|/M|d|^{\sigma})$, and we construct outgoing and incoming approximations. The construction is based on the same idea as in the proof of Lemma 3.6. For example, the approximation for $b_{2d}R(E+i0;H_d)V_+$ is constructed in the form
\[ b_{2d}R(E+i0;H_d)V_+ = \tilde r_L + b_{2d}R(E+i0;H_d)\tilde r_L, \]
and hence it follows from Lemma 3.2 that $b_{2d}R(E+i0;H_d)V_+R(E+i0;K_{1d})b_{1d} = O(|d|^{-L})$. Thus we repeat the same argument as used in the proofs of Lemmas 4.3, 4.4 and 3.2 to obtain the following three inequalities:
\begin{align}
\|b_{2d}R(E+i0;H_d)b_{1d}\| &\le C_\varepsilon|d|^{-1/2+3\sigma+\varepsilon}\bigl(1 + \|b_{2d}R(E+i0;H_d)g_{2d}\|\bigr) + C_L|d|^{-L}, \tag{4.22}\\
\|b_{1d}\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)b_{1d}\| &\le C_\varepsilon|d|^{-1/2+3\sigma+\varepsilon}\|b_{1d}R(E+i0;H_d)g_{2d}\| + C_L|d|^{-L}, \tag{4.23}\\
\|b_{2d}\bigl(R(E+i0;H_d) - R(E+i0;K_{2d})\bigr)b_{2d}\| &\le C_\varepsilon|d|^{-1/2+3\sigma+\varepsilon}\|b_{2d}R(E+i0;H_d)g_{1d}\| + C_L|d|^{-L} \tag{4.24}
\end{align}
for $L \gg 1$ and any $\varepsilon$, $0 < \varepsilon \ll 1$. By Proposition 4.2, we have
\[ \|b_{2d}R(E+i0;H_d)g_{1d}\| + \|b_{1d}R(E+i0;H_d)g_{2d}\| = O(|d|^{3\sigma}). \]
The desired bound is derived by combining this estimate with the three inequalities above. In fact, (4.23) and (4.24) imply that $\|b_{jd}R(E+i0;H_d)b_{jd}\| = O(|d|^{\sigma})$ for $j = 1, 2$. We may assume that this is still valid for $g_{jd}$, so that we have $\|b_{2d}R(E+i0;H_d)b_{1d}\| = O(|d|^{-1/2+4\sigma})$ by (4.22). This is also valid for $g_{1d}$ and $g_{2d}$. Thus it again follows from (4.23) and (4.24) that
\[ \|b_{jd}\bigl(R(E+i0;H_d) - R(E+i0;K_{jd})\bigr)b_{jd}\| = O(|d|^{-1+7\sigma}) \]


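for $j = 1, 2$. The operator $R(E+i0;K_{1d})$ is represented as
\[ R(E+i0;K_{1d}) = p_{2d}R(E+i0;H_1)q_{2d}, \qquad q_{2d} = 1/p_{2d}. \]
The function $p_{2d}$ behaves like
\[ p_{2d}(x) = e^{i\alpha_2\gamma(x-d;\hat d)} = e^{i\alpha_2\gamma(-d;\hat d)} + O(|d|^{-1+\sigma}) = e^{i\alpha_2\pi} + O(|d|^{-1+\sigma}) \]
on $B_{1d}$ ($= \mathrm{supp}\,b_{1d}$). Similarly $q_{2d}(x) = e^{-i\alpha_2\pi} + O(|d|^{-1+\sigma})$. Thus
\[ \|b_{1d}\bigl(R(E+i0;H_d) - R(E+i0;H_1)\bigr)b_{1d}\| = O(|d|^{-1+7\sigma}). \]
A similar bound is true for $b_{2d}R(E+i0;H_d)b_{2d}$, and the proof of the lemma is complete. ✷

Proof of Lemma 3.4. The lemma is verified in almost the same way as in the proof of Lemma 3.3. We give only a sketch of the proof. We keep the same notation as above. The following three identities are obtained from (4.14) and (4.15):
\begin{align*}
b_{2d}R(E+i0;H_d)\lambda_d\langle x\rangle^{-1} &= b_{2d}\psi_2R(E+i0;K_{1d})\lambda_d\langle x\rangle^{-1} - b_{2d}R(E+i0;H_d)V_{2d}R(E+i0;K_{1d})\lambda_d\langle x\rangle^{-1},\\
b_{1d}\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)\lambda_d\langle x\rangle^{-1} &= -b_{1d}R(E+i0;H_d)V_{2d}R(E+i0;K_{1d})\lambda_d\langle x\rangle^{-1},\\
\langle x\rangle^{-1}\lambda_d\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)\lambda_d\langle x\rangle^{-1} &= -\langle x\rangle^{-1}\lambda_dR(E+i0;H_d)V_{2d}R(E+i0;K_{1d})\lambda_d\langle x\rangle^{-1}.
\end{align*}
From these relations, we get the following three inequalities:
\begin{align*}
\|b_{2d}R(E+i0;H_d)\lambda_d\langle x\rangle^{-1}\| &\le C_\varepsilon|d|^{-1/2+2\sigma+\varepsilon}\bigl(1 + \|b_{2d}R(E+i0;H_d)g_{2d}\|\bigr) + C_L|d|^{-L},\\
\|b_{1d}\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)\lambda_d\langle x\rangle^{-1}\| &\le C_\varepsilon|d|^{-1/2+2\sigma+\varepsilon}\|b_{1d}R(E+i0;H_d)g_{2d}\| + C_L|d|^{-L},\\
\|\langle x\rangle^{-1}\lambda_d\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)\lambda_d\langle x\rangle^{-1}\| &\le C_\varepsilon|d|^{-1/2+2\sigma+\varepsilon}\|\langle x\rangle^{-1}\lambda_dR(E+i0;H_d)g_{2d}\| + C_L|d|^{-L}.
\end{align*}
It follows from Lemma 3.3 that
\[ \|b_{2d}R(E+i0;H_d)g_{2d}\| = O(|d|^{\sigma}), \qquad \|b_{1d}R(E+i0;H_d)g_{2d}\| = O(|d|^{-1/2+4\sigma}) \]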




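and hence we have
\[ \|b_{2d}R(E+i0;H_d)\lambda_d\langle x\rangle^{-1}\| = O(|d|^{-1/2+3\sigma}) \tag{4.25} \]
and
\[ \|b_{1d}\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)\lambda_d\langle x\rangle^{-1}\| = O(|d|^{-1+6\sigma}). \]
If we further make use of (4.25), then we obtain
\[ \|\langle x\rangle^{-1}\lambda_d\bigl(R(E+i0;H_d) - R(E+i0;K_{1d})\bigr)\lambda_d\langle x\rangle^{-1}\| = O(|d|^{-1+5\sigma}). \]
Thus the lemma is proved. ✷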

5 Resolvent estimates

The present section is devoted to proving Propositions 4.1 and 4.2. Throughout the section, we fix $\sigma_1$ as $\sigma \le \sigma_1 \ll 1$ and take $\rho$ as
\[ 1/2 < \rho < \sigma_1/4 + 1/2. \tag{5.1} \]
On the other hand, $\sigma_2$ is assumed to satisfy
\[ 0 < \sigma_2 < (\sigma_1/4 - (\rho - 1/2))/3 \tag{5.2} \]
for $\rho > 1/2$ as above. We further use the notation $h_{2d}(x)$ to denote the characteristic function of the set $\{x : |x-d| < C|d|^{\kappa}\}$ for some $C \gg 1$ large enough and $0 < \kappa \ll 1$ small enough.

5.1. The argument here is based on the following proposition.

Proposition 5.1 Assume that $\rho$ fulfills (5.1). Define
\[ \tilde W_{1d} = \psi_1W_{1d}, \qquad \psi_1(x) = 1 - \chi(|x|/|d|^{\sigma_1}). \]
Then
\[ \langle x\rangle^{\rho}\tilde W_{1d}R(E+i0;K_{0d})h_{2d} = O(|d|^{-\nu}) \]
with $\nu = \sigma_1/4 - (\rho - 1/2) - \kappa$.

The proof of this proposition heavily depends on the special form of the differential operator $W_{1d}$. By (4.7), it takes the form $W_{1d} = 2ie_{1d}\cdot\nabla + e_{0d}$, where
\[ e_{1d}(x) = \bigl(\alpha_1 - \zeta_{1d}(\gamma(x;-\hat d))\bigr)\nabla\gamma = O(|d|^{\sigma_1/2})\nabla\gamma, \qquad \gamma = \gamma(x;-\hat d), \]
and $e_{0d}(x) = O(|d|^{\sigma_1})|x|^{-2}$ in $\{x : |x| > |d|^{\sigma_1}\}$.


Lemma 5.1 Recall that $\pi_{1d}$ denotes the characteristic function of $\Pi_{1d}$. Then
\[ \langle x\rangle^{\rho-2}\pi_{1d}R(E+i0;K_{0d})h_{2d} = O(|d|^{-(\sigma_1+\nu)}) \]
with $\nu = 1/2 - \sigma_1 - \kappa > 0$.

Proof. Let $D_1 = \{(x,y) : x \in \Pi_{1d},\ y \in \mathrm{supp}\,h_{2d}\}$. We consider the integral
\[ I = \iint_{D_1}\langle x\rangle^{2(\rho-2)}|G_d(x,y;E)|^2\,dy\,dx, \]
where $G_d(x,y;E)$ is the kernel of $R(E+i0;K_{0d})$. If $(x,y) \in D_1$, then $|x-y| > c(|x|+|d|)$ for some $c > 0$. Hence it follows from (4.17) that $I$ is evaluated as
\[ I = O(|d|^{2\kappa})\int_{\Pi_{1d}}\langle x\rangle^{2(\rho-2)}(|x|+|d|)^{-1}\,dx = O(|d|^{2\kappa})\,O(|d|^{-1})\int_0^\infty(1+r)^{2(\rho-2)}r\,dr = O(|d|^{-2(1/2-\kappa)}). \]
Thus we have $I = O(|d|^{-2(\sigma_1+\nu)})$ with $\nu$ as in the lemma. This proves the lemma. ✷

Lemma 5.2 If $g$ is a bounded function with support in $\{x : x \in \Pi_{1d},\ |x| > |d|^{\sigma_1}\}$, then one has
\[ \langle x\rangle^{\rho}g\,(\nabla\gamma\cdot\nabla)R(E+i0;K_{0d})h_{2d} = O(|d|^{-(\sigma_1/2+\nu)}), \qquad \gamma = \gamma(x;-\hat d), \]
with $\nu = \sigma_1/4 - (\rho - 1/2) - \kappa$.

Proof. Let $D_2 = \{(x,y) : x \in \Pi_{1d},\ |x| > |d|^{\sigma_1},\ y \in \mathrm{supp}\,h_{2d}\}$. We calculate
\[ I(x,y) = (\nabla\gamma\cdot\nabla)\exp(i\sqrt{E}|x-y|) \]
for $(x,y) \in D_2$. A direct calculation yields
\[ I(x,y) = i\sqrt{E}\,|x|^{-1}|x-y|^{-1}|y|\,(\hat x_2\hat y_1 - \hat x_1\hat y_2)\exp(i\sqrt{E}|x-y|), \]
where $\hat x = (\hat x_1,\hat x_2)$. If $(x,y) \in D_2$, then $\hat x = -\hat d + O(|d|^{-\sigma_1/2})$ and $\hat y = \hat d + O(|d|^{-1+\kappa})$, so that $\hat x_2\hat y_1 - \hat x_1\hat y_2 = O(|d|^{-\sigma_1/2})$. Thus we have
\[ I(x,y) = O(|d|^{1-\sigma_1/2})|x|^{-1}|x-y|^{-1} \]
uniformly in $(x,y) \in D_2$. Hence the integral obeys the bound
\begin{align*}
I &= \iint_{D_2}|x|^{2\rho}|I(x,y)|^2|x-y|^{-1}\,dy\,dx = O(|d|^{2-\sigma_1+2\kappa})\int_{\Pi_{1d}}|x|^{2\rho-2}(|x|+|d|)^{-3}\,dx\\
&= O(|d|^{2-\sigma_1+2\kappa})\,O(|d|^{-\sigma_1/2})\int_0^\infty r^{2\rho-1}(r+|d|)^{-3}\,dr = O(|d|^{-(\sigma_1+2\nu)})
\end{align*}




for ν as in the lemma. The lemma is obtained from this estimate.

✷
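The exponent bookkeeping in the proofs of Lemmas 5.1 and 5.2 can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes that the radial integral in Lemma 5.2 scales as $|d|^{2\rho-3}$ (substitute $r = |d|s$, valid for $0 < 2\rho < 3$), and the function names and sample parameter values are hypothetical choices made only for illustration.

```python
from fractions import Fraction as Fr


def lemma_5_1_exponents(sigma1, kappa):
    """Return (computed, claimed) powers of |d| for the integral I in Lemma 5.1."""
    # O(|d|^{2 kappa}) * O(|d|^{-1}) * (a convergent radial integral independent of d)
    computed = 2 * kappa - 1
    nu = Fr(1, 2) - sigma1 - kappa          # nu as defined in Lemma 5.1
    claimed = -2 * (sigma1 + nu)
    return computed, claimed


def lemma_5_2_exponents(sigma1, rho, kappa):
    """Return (computed, claimed) powers of |d| for the integral I in Lemma 5.2."""
    prefactor = 2 - sigma1 + 2 * kappa      # O(|d|^{2 - sigma1 + 2 kappa})
    angular = -sigma1 / 2                   # O(|d|^{-sigma1/2}) from the angular width
    radial = 2 * rho - 3                    # assumed scaling of int r^{2rho-1}(r+|d|)^{-3} dr
    computed = prefactor + angular + radial
    nu = sigma1 / 4 - (rho - Fr(1, 2)) - kappa   # nu as defined in Lemma 5.2
    claimed = -(sigma1 + 2 * nu)
    return computed, claimed


# Hypothetical sample values satisfying (5.1): 1/2 < rho < sigma1/4 + 1/2.
sigma1, kappa = Fr(1, 10), Fr(1, 1000)
rho = Fr(1, 2) + Fr(1, 100)

print(lemma_5_1_exponents(sigma1, kappa))
print(lemma_5_2_exponents(sigma1, rho, kappa))
```

Exact rational arithmetic is used so that the comparison of the two exponents is an equality test rather than a floating-point tolerance check.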

Proof of Proposition 5.1. The proposition follows immediately from the two lemmas above. ✷

Lemma 5.3 Let $\psi_1$ be as in Proposition 5.1. Define $V_{1d}$ and $U_{1d}$ by (4.16) and (4.12) respectively. Then
\[ \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{0d})h_{2d} = O(|d|^{-\nu}), \qquad \langle x\rangle^{\rho}U_{1d}R(E+i0;K_{0d})h_{2d} = O(|d|^{-\nu}), \]
where $\nu = \sigma_1/4 - (\rho - 1/2) - \kappa$.

Proof. By definition, $\tilde W_{1d} = V_{1d}$ on $\{x : |x| > 2|d|^{\sigma_1}\}$. The coefficients of $K_{0d}$ and $V_{1d}$ are smooth and bounded uniformly in $d$. If we denote by $h_{1d}(x)$ the characteristic function of the set $\{x : |x| < 2|d|^{\sigma_1}\}$, then it follows from (4.17) that
\[ \langle x\rangle^{\rho}h_{1d}R(E+i0;K_{0d})h_{2d} = O(|d|^{-\mu}) \]
with $\mu = 1/2 - (\rho+1)\sigma_1 - \kappa > 0$, so that
\[ \langle x\rangle^{\rho}h_{1d}V_{1d}R(E+i0;K_{0d})h_{2d} = O(|d|^{-\mu}) \]
by the elliptic estimate. It is obvious that $\mu > \nu$ for $\sigma_1$ small enough. Hence the first bound follows from Proposition 5.1. The second one is verified in exactly the same way. ✷

Lemma 5.4 One has
\[ h_{2d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho} = O(|d|^{-\nu}) \]
with $\nu = \sigma_1/4 - (\rho - 1/2) - \kappa$.

Proof. Let $\psi_1$ be as in Proposition 5.1. Note that $h_{2d}\psi_1 = h_{2d}$. By (4.11), we have
\[ h_{2d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho} = h_{2d}R(E+i0;K_{0d})\psi_1\pi_{1d}\langle x\rangle^{-\rho} - h_{2d}R(E+i0;K_{0d})U_{1d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho}. \]
It follows from (4.17) that the first operator on the right side obeys
\[ h_{2d}R(E+i0;K_{0d})\pi_{1d}\langle x\rangle^{-\rho} = O(|d|^{-(\sigma_1/4+(\rho-1/2)-\kappa)}). \]
To evaluate the second operator, we decompose $U_{1d}$ into $U_{1d} = U_{1d}\langle x\rangle^{\rho}\langle x\rangle^{-\rho}$. Since $\langle x\rangle^{-\rho}R(E+i0;K_{1d})\langle x\rangle^{-\rho} : L^2 \to L^2$ is bounded uniformly in $d$, the lemma is obtained from Lemma 5.3.

✷


Lemma 5.5 Let $V_{1d}$ be as in Lemma 5.3 and let $\sigma_2$ be as in (5.2). If $\kappa = \sigma_2$, then
\[ \|\langle x\rangle^{\rho}V_{1d}R(E+i0;K_{2d})h_{2d}\| = O(|d|^{-\nu}) \]
with $\nu = \sigma_1/4 - (\rho - 1/2) - 2\sigma_2 > 0$.

Proof. The proof uses an argument similar to that in the proof of Lemma 4.3. We use (4.11) with $\psi_2(x) = 1 - \chi(|x-d|/\delta|d|^{\sigma_2})$. Then we have
\[ \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{2d})h_{2d} = \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{0d})\psi_2h_{2d} - \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{0d})U_{2d}R(E+i0;K_{2d})h_{2d}. \]
By Lemma 5.3, the first operator on the right side is majorized by $O(|d|^{-\mu})$ with $\mu = \sigma_1/4 - (\rho - 1/2) - \sigma_2$. To estimate the second operator, we decompose $U_{2d}$ into the sum of four operators
\[ U_{2d} = g_{2d}^2U_{2d} + U_\infty(x,D_x) + U_-(x,D_x) + U_+(x,D_x) \]
as in (4.21), where $g_{2d}(x) = \chi(|x-d|/M|d|^{\sigma_2})$ for $M \gg 1$, and
\[ U_\infty(x,D_x) = (1-g_{2d}(x)^2)U_{2d}\,\beta_\infty(D_x), \qquad U_\pm(x,D_x) = (1-g_{2d}(x)^2)U_{2d}\,\beta_\pm(D_x). \]
By Lemma 5.3 again, we have
\[ \|\langle x\rangle^{\rho}V_{1d}R(E+i0;K_{0d})g_{2d}^2U_{2d}R(E+i0;K_{2d})h_{2d}\| = O(|d|^{-\nu}) \]
for $\nu$ as in the lemma, because $\|g_{2d}U_{2d}R(E+i0;K_{2d})h_{2d}\| = O(|d|^{\sigma_2})$ by the principle of limiting absorption. If we make use of Lemma 4.2, the other operators with $U_\infty(x,D_x)$ and $U_\pm(x,D_x)$ can be shown to obey the bound $O(|d|^{-N})$ for any $N \gg 1$. This proves the lemma. ✷

Lemma 5.6 Let
\[ V_+ = V_+(x,D_x) = (1-g_{2d}^2)V_{2d}\,\beta_+(D_x), \qquad g_{2d}(x) = \chi(|x-d|/M|d|^{\sigma_2}), \]
be as in (4.21). Then
\[ \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{2d})V_+\langle x\rangle^{\rho} = O(|d|^{-L}) \]
for any $L \gg 1$.




Proof. We construct an outgoing approximation for $R(E+i0;K_{2d})V_+\langle x\rangle^{\rho}$. If the particle starts from $x \in \{x \in \Pi_{2d} : |x-d| > M|d|^{\sigma_2}\}$ with momentum $\xi \in \mathrm{supp}\,\beta_+$ at time $t = 0$, then it does not pass over $\Pi_{1d}$ for $t > 0$. This enables us to construct the approximation in the form
\[ \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{2d})V_+\langle x\rangle^{\rho} = \tilde r_L + \langle x\rangle^{\rho}V_{1d}R(E+i0;K_{2d})\tilde r_L. \]
Hence the lemma is implied by Lemma 4.4.

✷

5.2. We are now in a position to prove Propositions 4.1 and 4.2.

Proof of Proposition 4.1. We prove only the first statement. A similar argument applies to the second one. Throughout the proof, we take $\sigma_1 = \sigma$ and use the relations (4.14) and (4.15) with
\[ \psi_1(x) = 1 - \chi(|x|/\delta|d|^{\sigma}), \qquad \psi_2(x) = 1 - \chi(|x-d|/\delta|d|^{\sigma_2}) \]
for $0 < \delta \ll 1$ small enough, where $\sigma_2$ is specified by (5.2) with $\sigma_1 = \sigma$. We write
\[ X = r_LR(E+i0;H_d)\pi_{1d}\langle x\rangle^{-\rho} \]
for the operator in the proposition. Since $\pi_{1d}\psi_2 = \pi_{1d}$, it follows from (4.14) that
\[ X = r_L\psi_2R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho} - r_LR(E+i0;H_d)V_{2d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho}. \]
The first operator on the right side satisfies $r_L\psi_2R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho} = O(|d|^{-L/2})$. To estimate the second operator, we decompose $V_{2d}$ into the sum of four operators
\[ V_{2d} = g_{2d}^2V_{2d} + V_\infty(x,D_x) + V_+(x,D_x) + V_-(x,D_x) \]
as in (4.21), where $g_{2d}(x) = \chi(|x-d|/M|d|^{\sigma_2})$ for $M \gg 1$, and
\[ V_\pm(x,D_x) = (1-g_{2d}^2)V_{2d}\,\beta_\pm(D_x), \qquad V_\infty(x,D_x) = (1-g_{2d}^2)V_{2d}\,\beta_\infty(D_x). \]
We set
\begin{align*}
X_0 &= r_LR(E+i0;H_d)\,g_{2d}^2V_{2d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho},\\
X_\infty &= r_LR(E+i0;H_d)\,V_\infty R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho},\\
X_\pm &= r_LR(E+i0;H_d)\,V_\pm R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho}.
\end{align*}
Then the operator $X$ in question satisfies
\[ \|X\| \le C_L|d|^{-L/2} + \|X_0\| + \|X_\infty\| + \|X_-\| + \|X_+\|. \]


Note that $\psi_1V_\pm = V_\pm$ and $\psi_1V_\infty = V_\infty$. We can show
\[ \|X_\infty\| + \|X_-\| \le C_L\|r_LR(E+i0;H_d)\psi_1r_L\| \]
as in the proof of Lemma 3.2. To evaluate the operator $r_LR(E+i0;H_d)\psi_1r_L$, we represent it as
\[ r_L\psi_1R(E+i0;K_{2d})r_L - r_LR(E+i0;H_d)V_{1d}R(E+i0;K_{2d})r_L \]
by (4.15). If we decompose $V_{1d}$ into $V_{1d} = \pi_{1d}\langle x\rangle^{-\rho}\langle x\rangle^{\rho}V_{1d}$, then it follows from Lemma 4.4 that
\[ \|r_LR(E+i0;H_d)\psi_1r_L\| = O(|d|^{-L}) + O(|d|^{-L/2})\|X\| \]
and hence we have
\[ \|X_\infty\| + \|X_-\| \le C_L\bigl(|d|^{-L/2} + |d|^{-L/2}\|X\|\bigr). \]
We consider the operator $X_+$. We decompose it into the product
\[ X_+ = r_LR(E+i0;H_d)V_+\langle x\rangle^{\rho}\cdot\langle x\rangle^{-\rho}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho}. \]
The second operator is bounded uniformly in $d$, and the first one is represented as
\[ r_L\psi_1R(E+i0;K_{2d})V_+\langle x\rangle^{\rho} - r_LR(E+i0;H_d)V_{1d}R(E+i0;K_{2d})V_+\langle x\rangle^{\rho} \]
by use of (4.15) again. The micro-local resolvent estimate of [9] shows that $r_L\psi_1R(E+i0;K_{2d})V_+\langle x\rangle^{\rho} = O(|d|^{-L/2})$, which, together with Lemma 5.6, implies that
\[ \|r_LR(E+i0;H_d)V_+\langle x\rangle^{\rho}\| = O(|d|^{-L/2}) + O(|d|^{-L/2})\|X\|. \]
Thus $X$ satisfies
\[ \|X\| \le C_L\bigl(|d|^{-L/2} + |d|^{-L/2}\|X\|\bigr) + \|X_0\|. \]
We shall evaluate $X_0$. This obeys the bound $\|X_0\| = o(1)\,\|r_LR(E+i0;H_d)g_{2d}\|$ by Lemma 5.4 with $\kappa = \sigma_2$, and $r_LR(E+i0;H_d)g_{2d}$ is written as
\[ r_L\psi_1R(E+i0;K_{2d})g_{2d} - r_LR(E+i0;H_d)V_{1d}R(E+i0;K_{2d})g_{2d} \]
by (4.15). Hence Lemma 5.5 yields
\[ \|X_0\| = O(|d|^{-L/2}) + o(1)\,\|X\|. \tag{5.3} \]




Thus the desired bound is obtained from (5.3) and the proof is complete.

✷
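The final absorption step of the proof above can be spelled out as follows. This is only a sketch: it reads the inequalities for $X$ and $X_0$ as operator-norm bounds, assumes that $\|X\|$ is finite for each fixed $d$ (so that it may be moved to the left side), and writes $\epsilon(d)$ for the $o(1)$ factor in (5.3) and $C_L'$ for a generic constant; both symbols are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
\|X\| &\le C_L\bigl(|d|^{-L/2} + |d|^{-L/2}\|X\|\bigr) + \|X_0\|
       \le C_L'\,|d|^{-L/2} + \bigl(C_L|d|^{-L/2} + \epsilon(d)\bigr)\|X\|, \\
\bigl(1 - C_L|d|^{-L/2} - \epsilon(d)\bigr)\|X\| &\le C_L'\,|d|^{-L/2}
\quad\Longrightarrow\quad
\|X\| \le 2\,C_L'\,|d|^{-L/2} \quad\text{for } |d| \gg 1,
\end{align*}
```

since $C_L|d|^{-L/2} + \epsilon(d) \le 1/2$ once $|d|$ is large enough.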

We proceed to the proof of Proposition 4.2. As previously stated, we are allowed to use Lemma 3.2 for the proof of the proposition.

Proof of Proposition 4.2. The proof is based on the same idea as in the proof of Proposition 4.1, although we have to modify the argument there slightly. Throughout the proof, $\sigma_2$ is fixed as $\sigma_2 = \sigma$, and $\sigma_1$ and $\rho$ are chosen to fulfill (5.1) and (5.2). We set
\[ Y = b_{2d}R(E+i0;H_d)\pi_{1d}\langle x\rangle^{-\rho}. \]
Since $\sigma_1 > \sigma$, $b_{1d}\pi_{1d} = b_{1d}$. Hence it suffices to show the bound $Y = O(|d|^{2\sigma})$ in order to prove the proposition. We use the relations (4.14) and (4.15) with
\[ \psi_1(x) = 1 - \chi(|x|/\delta|d|^{\sigma_1}), \qquad \psi_2(x) = 1 - \chi(|x-d|/\delta|d|^{\sigma}). \]
By (4.14), we have
\[ Y = b_{2d}\psi_2R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho} - b_{2d}R(E+i0;H_d)V_{2d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho}. \]
The first operator on the right side satisfies $b_{2d}\psi_2R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho} = o(1)$ by Lemma 5.4. We decompose $V_{2d}$ as in the proof of Proposition 4.1 and set
\begin{align*}
Y_0 &= b_{2d}R(E+i0;H_d)\,g_{2d}^2V_{2d}R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho},\\
Y_\infty &= b_{2d}R(E+i0;H_d)\,V_\infty R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho},\\
Y_\pm &= b_{2d}R(E+i0;H_d)\,V_\pm R(E+i0;K_{1d})\pi_{1d}\langle x\rangle^{-\rho},
\end{align*}
where $g_{2d}(x) = \chi(|x-d|/M|d|^{\sigma})$ for $M \gg 1$. We can show
\[ \|Y_\infty\| + \|Y_-\| + \|Y_+\| \le C_L\|b_{2d}R(E+i0;H_d)r_L\| + O(|d|^{-L}) = O(|d|^{-L/2}) \]
by Lemma 3.2. To estimate the operator $Y_+$, we construct an outgoing approximation for $b_{2d}R(E+i0;H_d)V_+$, which takes the form
\[ b_{2d}R(E+i0;H_d)V_+ = \tilde r_L + b_{2d}R(E+i0;H_d)\tilde r_L. \]
Thus we have $\|Y\| = o(1) + \|Y_0\|$. The operator $Y_0$ is also estimated in the same way as $X_0$. It satisfies
\[ \|Y_0\| \le \|b_{2d}R(E+i0;K_{2d})g_{2d}\| + o(1)\,\|Y\| \le C|d|^{2\sigma} + o(1)\,\|Y\| \]
by Lemmas 5.4 and 5.5. Hence the desired bound follows at once and the proof is complete. ✷


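6 Asymptotic behavior of eigenfunction

In this section we prove Proposition 2.1, which has played a basic role in proving the main theorem. As already stated in section 2, the asymptotic behavior of the eigenfunction $\varphi_\mp(x;\lambda,\omega)$ has been studied in the physics literature [3,5,14]. The proof here is based on the idea from [14]. The original idea is due to T. Takabayashi.

Proof of Proposition 2.1. We consider only the case $\alpha \notin \mathbb{Z}$. For brevity, we assume that $0 < \alpha < 1$, and we set $\lambda = 1$. The proof uses the integral representation
\[ J_p(r) = \frac{i^p}{\pi}\left(\int_0^{\pi}e^{-ir\cos t}\cos pt\,dt - \sin p\pi\int_0^{\infty}e^{-pt+ir\cosh t}\,dt\right), \qquad r > 0, \tag{6.1} \]
for the Bessel function $J_p(r)$ with $p > 0$ ([8]).

(1) We write $\varphi_+(x;\omega)$ for $\varphi_+(x;\lambda,\omega)$ with $\lambda = 1$ and denote by
\[ \varphi_{\rm inc}(x;\omega) = \exp(i\alpha(\gamma(x;\omega)-\pi))\exp(ix\cdot\omega) \]
the leading term in the asymptotic formula. If we make a change of variable $\sigma = \sigma(x;\omega) = \gamma(x;\omega) - \pi$, then $-\pi \le \sigma < \pi$ and it follows from (2.3) that
\[ \varphi_+(x;\omega) = \sum_{l\in\mathbb{Z}}(-i)^{\nu}e^{il\sigma}J_\nu(|x|) \]
with $\nu = |l-\alpha|$. We also have $\varphi_{\rm inc}(x;\omega) = e^{i\alpha\sigma-i|x|\cos\sigma}$. By the Fourier expansion,
\[ \varphi_{\rm inc}(x;\omega) = \frac{1}{2\pi}\sum_{l\in\mathbb{Z}}e^{il\sigma}\int_{-\pi}^{\pi}e^{i\alpha t-i|x|\cos t}e^{-ilt}\,dt = \frac{1}{\pi}\sum_{l\in\mathbb{Z}}e^{il\sigma}\int_0^{\pi}e^{-i|x|\cos t}\cos\nu t\,dt. \]
On the other hand, we have
\[ \varphi_+(x;\omega) = \frac{1}{\pi}\sum_{l\in\mathbb{Z}}e^{il\sigma}\left(\int_0^{\pi}e^{-i|x|\cos t}\cos\nu t\,dt - \sin\nu\pi\int_0^{\infty}e^{-\nu t+i|x|\cosh t}\,dt\right) \]
by the integral representation (6.1). Hence
\[ \varphi_+(x;\omega) - \varphi_{\rm inc}(x;\omega) = -\frac{1}{\pi}\sum_{l\in\mathbb{Z}}e^{il\sigma}\sin\nu\pi\int_0^{\infty}e^{-\nu t+i|x|\cosh t}\,dt. \]
We calculate the sum on the right side. If $\gamma(x;\omega) \neq 0$, then $|\sigma| < \pi$ and $e^{\pm i\sigma} \neq -1$. A simple computation shows that
\[ \sum_{l\in\mathbb{Z}}e^{il\sigma}e^{-\nu t}\sin\nu\pi = \sin\alpha\pi\left(\frac{e^{\alpha t}}{1+e^{-i\sigma}e^{t}} + \frac{e^{-\alpha t}}{1+e^{-i\sigma}e^{-t}}\right) \]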




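for $0 < \alpha < 1$. This yields
\[ \varphi_+(x;\omega) - \varphi_{\rm inc}(x;\omega) = -\frac{\sin\alpha\pi}{\pi}\int_{-\infty}^{\infty}\frac{e^{-\alpha t}}{1+e^{-i\sigma}e^{-t}}\,e^{i|x|\cosh t}\,dt \tag{6.2} \]
for $|\sigma| < \pi$. We apply the stationary phase method to the integral on the right side. If $x$ fulfills the assumption $|x/|x| - \omega| > c > 0$, then $|\sigma| < \pi - c$ for $|x| \gg 1$ and hence $|1 + e^{-i\sigma}e^{-t}| > c_1 > 0$ in a neighborhood of the stationary point $t = 0$. Thus we can obtain the desired asymptotic expansion.

(2) If we write $\varphi_\mp(x;\omega,\alpha)$ for $\varphi_\mp(x;\omega)$, then $\varphi_-(x;\omega,\alpha) = \varphi_+(-x;\omega,-\alpha)$. Hence (2) follows from (1) at once.

(3) We consider $\varphi_+(x;\omega)$ only. By assumption, $|x/|x| - \omega| < c|x|^{-q}$ for some $q$, $1/2 < q \le 1$. We set $\delta = (q-1/2)/3 > 0$ and
\[ \eta(x) = i\bigl(e^{i\sigma} + 1\bigr) = i\bigl(e^{i\sigma(x;\omega)} + 1\bigr) \]
for $x$ as above. We evaluate the integral $I$ on the right side of (6.2). If $|x|^{-1/2+\delta} < |t| < 1$, then $|\partial_t\cosh t| > c_2|t|$ and $|\partial_t(1+e^{-i\sigma}e^{-t})^{-1}| < c_3|t|^{-2}$, so that
\[ \int_{|t|>|x|^{-1/2+\delta}}\frac{e^{-\alpha t}}{1+e^{-i\sigma}e^{-t}}\,e^{i|x|\cosh t}\,dt = O(|x|^{-2\delta}) \]
by partial integration. Thus we have
\begin{align*}
I &= -e^{i\sigma}e^{i|x|}\int_{-|x|^{-1/2+\delta}}^{|x|^{-1/2+\delta}}\frac{1}{t+i\eta}\,e^{i|x|t^2/2}\,dt + O(|x|^{-1+4\delta}) + O(|x|^{-2\delta})\\
&= -e^{i\sigma}e^{i|x|}\int_{-|x|^{\delta}}^{|x|^{\delta}}\frac{1}{s+i|x|^{1/2}\eta}\,e^{i|s|^2/2}\,ds + O(|x|^{-2\delta}).
\end{align*}
We write $\sigma = -\pi + \varepsilon$ or $\sigma = \pi - \varepsilon$. Then $\varepsilon > 0$ and $\varepsilon = O(|x|^{-q})$. If $\sigma = -\pi + \varepsilon$, then $\eta = \varepsilon + O(\varepsilon^2)$ and $|x|^{1/2}\eta = O(|x|^{-q+1/2})$. Hence it follows that
\[ \int_{|x|^{-\delta}}^{|x|^{\delta}}\left(\frac{1}{s} - \frac{1}{s+i|x|^{1/2}\eta}\right)e^{i|s|^2/2}\,ds = O(|x|^{-(q-1/2)+\delta}). \]
This yields
\[ I = -e^{i\sigma}e^{i|x|}\int_{-|x|^{-\delta}}^{|x|^{-\delta}}\frac{1}{s+i|x|^{1/2}\eta}\,ds + O(|x|^{-(q-1/2)+\delta}) + O(|x|^{-2\delta}), \]
so that
\[ I = -i\pi e^{i|x|} + O(|x|^{-\nu}), \qquad \nu = 2(q-1/2)/3, \]
for $\sigma = -\pi + \varepsilon$. Similarly we have $I = i\pi e^{i|x|} + O(|x|^{-\nu})$ for $\sigma = \pi - \varepsilon$. Thus (3) follows immediately from (6.2).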

(4) We again evaluate the integral I. If |x/|x| − ω| > |x|−1/2 , then 1 I= ei|x| cosh t dt + O(1), |x| → ∞. −iσ e−t 1 + e −1/2 |x| 1/2 in the uniform topology, where the convergence is locally uniform in λ ∈ (0, ∞).

Proof. The proof uses the positive commutator method due to Mourre [13]. Let $H_\alpha$ be defined by (7.2). Define the operator $C$ as $C = -i(x\cdot\nabla + \nabla\cdot x)$. Then we have
\[ i[H_\alpha, C] = i(H_\alpha C - CH_\alpha) = 4H_\alpha \]
by formal computation. Let $\chi_\infty(x)$ be as in the proof of Proposition 7.1. Recall that $\chi_\infty(x)$ vanishes around the two centers $e_1$ and $e_2$. We take $D = \chi_\infty C\chi_\infty$ as a conjugate operator. Since $h(H+i)^{-1} : L^2 \to L^2$ is compact for $h(x)$ falling off at infinity and since
\[ A_\alpha(x) - A_1(x) - A_2(x) = O(|x|^{-2}) \tag{7.3} \]
as $|x| \to \infty$, we obtain the relation
\[ f(H)\,i[H,D]\,f(H) = 4f(H)Hf(H) + f(H)K_0f(H) \]
for some compact operator $K_0 : L^2 \to L^2$, where $f \in C_0^\infty(0,\infty)$ is supported away from the origin. This enables us to repeat the same argument as in [6,13] and we get the proposition. ✷

Finally we discuss the existence and completeness of the wave operator
\[ W_\pm(H,H_0) = \operatorname*{s-lim}_{t\to\pm\infty}\exp(itH)\exp(-itH_0) : L^2 \to L^2. \]




Proposition 7.4 The wave operator $W_\pm(H,H_0)$ exists and is asymptotically complete:
\[ \mathrm{Ran}\,W_+(H,H_0) = \mathrm{Ran}\,W_-(H,H_0) = L^2. \]

Proof. The existence can be proved in almost the same way as in the case of smooth magnetic fields ([12]). We skip the proof. To prove the completeness, it suffices to show that the limit
\[ W_\pm(H_0,H) = \operatorname*{s-lim}_{t\to\pm\infty}\exp(itH_0)\exp(-itH) \tag{7.4} \]
exists. Let $H_\alpha$ be again defined by (7.2). We know from [17] that $W_\pm(H_\alpha,H_0)$ exists and is asymptotically complete. This implies the existence of the limit
\[ W_\pm(H_0,H_\alpha) = \operatorname*{s-lim}_{t\to\pm\infty}\exp(itH_0)\exp(-itH_\alpha). \]
On the other hand, the difference $H - H_\alpha$ is a perturbation of short-range class by (7.3). Hence we can show the existence of
\[ W_\pm(H_\alpha,H) = \operatorname*{s-lim}_{t\to\pm\infty}\exp(itH_\alpha)\varphi_\infty\exp(-itH) \]
by use of Kato's smoothness property, which follows from Proposition 7.3 ([15]), where $\varphi_\infty(x)$ is a smooth real function such that $\varphi_\infty(x) = 1$ for $|x| > L \gg 1$ and $\varphi_\infty(x) = 0$ for $|x| < L/2$. Thus the limit (7.4) in question can be shown to exist and the proof is completed. ✷

References

[1] R. Adami and A. Teta, On the Aharonov–Bohm Hamiltonian, Lett. Math. Phys. 43, 43–53 (1998).
[2] G. N. Afanasiev, Topological Effects in Quantum Mechanics, Kluwer Academic Publishers (1999).
[3] Y. Aharonov and D. Bohm, Significance of electromagnetic potential in the quantum theory, Phys. Rev. 115, 485–491 (1959).
[4] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, Solvable Models in Quantum Mechanics, Texts and Monographs in Physics, Springer–Verlag (1988).
[5] M. V. Berry, R. G. Chambers, M. D. Large, C. Upstill and J. C. Walmsley, Wavefront dislocations in the Aharonov–Bohm effect and its water wave analogue, Eur. J. Phys. 1, 154–162 (1980).
[6] H. L. Cycon, R. Froese, W. Kirsch and B. Simon, Schrödinger Operators with Applications to Quantum Mechanics and Global Geometry, Springer (1987).
[7] L. Dabrowski and P. Stovicek, Aharonov–Bohm effect with δ–type interaction, J. Math. Phys. 39, 47–62 (1998).
[8] A. Erdélyi, Higher Transcendental Functions, Vol. II, Robert E. Krieger Publ. (1953).
[9] H. Isozaki and H. Kitada, A remark on the micro–local resolvent estimates for two body Schrödinger operators, Publ. RIMS, Kyoto Univ. 21, 889–910 (1985).
[10] H. Isozaki and H. Kitada, Scattering matrices for two–body Schrödinger operators, Sci. Papers Coll. of Arts and Sci., Tokyo Univ. 35, 81–107 (1985).
[11] V. Kostrykin and R. Schrader, Cluster properties of one particle Schrödinger operators. II, Rev. Math. Phys. 10, 627–683 (1998).
[12] M. Loss and B. Thaller, Scattering of particles by long–range magnetic fields, Ann. of Phys. 176, 159–180 (1987).
[13] E. Mourre, Absence of singular continuous spectrum for certain selfadjoint operators, Comm. Math. Phys. 78, 391–408 (1981).
[14] Y. Ohnuki, Aharonov–Bohm kōka (in Japanese), Butsurigaku saizensen 9, Kyōritsu syuppan (1984).
[15] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. II (1976), Vol. III (1979), Academic Press.
[16] D. Robert and H. Tamura, Asymptotic behavior of scattering amplitudes in semi–classical and low energy limits, Ann. Inst. Fourier Grenoble 39, 155–192 (1989).
[17] S. N. M. Ruijsenaars, The Aharonov–Bohm effect and scattering theory, Ann. of Phys. 146, 1–34 (1983).

H.T. Ito
Department of Computer Science
Ehime University
Matsuyama 790–8577, Japan

H. Tamura
Department of Mathematics
Okayama University
Okayama 700–8530, Japan

Communicated by Bernard Helffer
submitted 04/09/00, accepted 30/11/00



Resonances Formed by Massless Bosons Coupled to a Harmonic Oscillator

C. Billionnet

Abstract. For a harmonic oscillator coupled to a massless boson field, with a coupling function fulfilling certain conditions, we prove the existence of resonances, or possibly stable states, which are not in one-to-one correspondence with the energies of the isolated oscillator. The result suggests the existence of such states in other systems coupled to massless bosons. It indicates that complex scaling is a way to look for these states.

1 Introduction

In this paper, we consider the harmonic oscillator coupled to a massless boson field via the Hamiltonian
\[ H = a^*a \otimes 1 + 1 \otimes H_{\rm bos} + \lambda\bigl(a^* \otimes c(g_1) + a \otimes c(g_1)^*\bigr). \tag{1} \]
$a^*$ and $a$ are the creation and annihilation operators for the oscillator, $c(g)^*$ and $c(g)$ those of the boson field, and $H_{\rm bos}$ is the free Hamiltonian of the boson field. $\mathcal{H}$ will denote the global Hilbert space, and $H_I$ the interaction part in the Hamiltonian. The first excited level of the oscillator, whose state vector will be denoted by $|1\rangle$, is turned into a resonance, given by one of the zeros of the multi-valued function
\[ f(\lambda,z) = z - 1 - \lambda^2\int_{-\infty}^{\infty}\frac{|g_1(p)|^2}{z-|p|}\,dp. \tag{2} \]
This can be seen in the following way. $\Omega_{\rm bos}$ being the vacuum in the boson space, let $P$ be the orthonormal projector onto the vector $|1\rangle \otimes \Omega_{\rm bos}$ and $\bar P = 1 - P$. With respect to the decomposition $\mathcal{H} = \bar P\mathcal{H} \oplus P\mathcal{H}$, $z - H$ has the matrix representation
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \]
where $A = \bar P(z-H)\bar P$, $B = \bar PH_IP$, $C = PH_I\bar P$, $D = P(z-1)P$. It is a general result when $A$ is invertible that the above matrix $M$ is invertible if and only if $D - CA^{-1}B$ is invertible; then $PM^{-1}P = (D - CA^{-1}B)^{-1}$. Acting in the one-dimensional space generated by $|1\rangle \otimes \Omega_{\rm bos}$, $D - CA^{-1}B$ is the multiplication by $f(\lambda,z)$. Thus, if $\Im z > 0$, both $A$ and $D - CA^{-1}B$ being then invertible, we have
\[ \langle 1 \otimes \Omega_{\rm bos}|\,[z-H]^{-1}\,|1 \otimes \Omega_{\rm bos}\rangle = \frac{1}{f(\lambda,z)}. \]
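The block-inversion fact used above, namely that the corner of $M^{-1}$ on the range of $P$ equals $(D - CA^{-1}B)^{-1}$, can be checked on a finite-dimensional toy example. The sketch below is only an illustration under stated assumptions: the 3×3 matrix and the helper `inverse` are hypothetical and have nothing to do with the actual operator $z - H$, which acts on an infinite-dimensional space.

```python
from fractions import Fraction as Fr


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    # Augment m with the identity matrix.
    aug = [list(row) + [Fr(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Toy matrix M = [[A, B], [C, D]] with a 2x2 block A and a 1x1 block D.
M = [[Fr(3), Fr(1), Fr(1)],
     [Fr(1), Fr(2), Fr(2)],
     [Fr(1), Fr(2), Fr(5)]]
A = [row[:2] for row in M[:2]]
B = [M[0][2], M[1][2]]
C = M[2][:2]
D = M[2][2]

A_inv = inverse(A)
# Schur complement D - C A^{-1} B (a scalar here, playing the role of f).
schur = D - sum(C[i] * A_inv[i][j] * B[j] for i in range(2) for j in range(2))
# Corner entry of M^{-1}, playing the role of the resolvent matrix element.
corner = inverse(M)[2][2]

print(schur, corner)
```

In the toy example the scalar `schur` stands in for $f(\lambda,z)$ and `corner` for the resolvent matrix element, so the identity `corner == 1 / schur` mirrors the last displayed formula.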



When $\lambda$ is small, for quite general $g$, the analytic continuation into the second sheet of $f(\lambda,\cdot)$ has a zero close to 1. It is a pole of the (continuation of the) resolvent matrix element, thus a resonance associated to the excited state $|1\rangle$. Here we are going to show that the continuation of $f(\lambda,\cdot)$ has other zeros. They are not the resonances corresponding to excited states of energy greater than 1, and we will give information on their positions. A motivation for the study is that these zeros could be interpreted as physical states which might have appreciable effects in some domain of physics. When the imaginary part of these zeros is non-zero, they could be considered as oscillator-boson resonances, whereas when it is zero, as is the case for some of them when $\lambda$ is large, they would correspond to stable bound oscillator-boson states. The harmonic oscillator coupled to the boson is but an example of a system with discrete energy levels coupled to a continuum. Other systems of this type may have resonances given by zeros of functions similar to $f$. For example, let us replace the harmonic oscillator by a two spin-state system (a particular case of the spin-boson model), and take
\[ H_{{\rm sp},1}(\lambda,g_1) = P_1\Bigl(\sigma_z \otimes 1 + 1 \otimes H_{\rm bos} + \lambda\,\sigma_x \otimes \bigl([c(g_1)]^* + c(g_1)\bigr)\Bigr)P_1 \tag{3} \]
as the Hamiltonian; to simplify, we limit ourselves to at-most-one-boson states by introducing $P_1$, the projection operator on the space they span. Due to the interaction, and when $\lambda$ is small, the energy of the state $|-\rangle$ is moved whereas $|+\rangle$ is turned into a resonance. The corresponding energies are zeros of functions $\phi_-$ and $\phi_+$ defined by
\[ \phi_-(\lambda,z) := 3 + f(\lambda,z-1) \quad\text{and}\quad \phi_+(\lambda,z) := -1 + f(\lambda,z+1). \tag{4} \]

They are the zeros which go respectively to −1 and +1 when λ goes to 0. We will also show that these functions have other zeros. One may also consider the more complex system formed by the electrons inside an atom, coupled to the photon field. We already indicated in [1], in a special case, what we develop in the present paper in a more general way, namely that the study concerning the harmonic oscillator suggests the existence of resonances different from those directly linked to the discrete energy states of the atom by a perturbative argument. Nevertheless we used the argument that their imaginary part is likely to be great, at least for atomic size systems, to infer that they would be difficult to detect experimentally. Other systems coupled to zero-mass bosons could also be considered. The result is stated in the Proposition proved in section 2, the proofs of lemmas being gathered in section 3. Section 4 is devoted to some comments, some of which introduce the generalization presented in a more developed form in section 5. Since the coupling constant λ will not vary throughout the paper, we do not make the λ-dependence of the quantities explicit when it is not necessary. It may have any positive value.


2 Existence of at least three zeros of f(z), in the first and second sheets

We suppose that $g_1$ is real on the real axis and has the five following properties:

P1 $g_1^2$ is holomorphic except at a finite number of poles $\pi_i$ which are non-real.
P2 $g_1 \in L^2(\mathbb{R})$, with $\|g_1\|_2 = 1$.
P3 $p^{-1/2}g_1 \in L^2(\mathbb{R})$.
P4 $g_1(0) = 0$.
P5 There exists $C_0$ such that for $p \in \mathbb{C}$ and $|p| > 2\sup_i|\pi_i|$, $|g_1(p)|^2 < C_0|p|^{-2}$ holds.

Conditions P1 and P5 imply that $g_1$ is a rational function. Let $\mu$ be real positive. For $\Im z \neq 0$, or ($\Im z = 0$ and $\Re z < 0$), or ($z \neq 0$ and $\mu = 0$), we define
\[ f(\mu,z) := z - 1 - \lambda^2\int_{-\infty}^{\infty}\frac{g_1^2(p)}{z-\mu|p|}\,dp. \tag{5a} \]
Since
\[ f(\mu,z) := z - 1 - 2\lambda^2\int_0^{\infty}\frac{g^2(p)}{z-\mu p}\,dp, \tag{5b} \]
where $g^2(p) = \frac12\bigl(g_1^2(p) + g_1^2(-p)\bigr)$, we will replace the study of (5a) by that of (5b), supposing that $g$ satisfies the same hypotheses as $g_1$. We set $C := \|p^{-1/2}g\|_2^2$. We will consider the family of functions of $z$ which (5b) defines when the parameter $\mu$ varies. If $\mu \neq 0$, the function has a branch point at $z = 0$ and has an analytic continuation with poles at $z = \pi_i$; however these points are regular points of the principal branch which (5b) defines in the plane cut along the positive real axis. In the introduction we recalled that in physics one is interested in the zeros of the analytic continuation of $f(1,\cdot)$. The advantage of introducing a $\mu$-dependence here is to connect zeros of $f(1,\cdot)$ to zeros of $f(0,\cdot)$, these being easy to find. The link is obtained by continuity, or even analyticity for generic values of $\mu$. $\mu^{-1}$ is a dilation variable. In many studies on resonances, the introduction of such a variable has proved fruitful (see references in [2]). Let us note that we are not using it in exactly the same way here. Usually the momenta are also dilated in the interaction term whereas they are not in our approach. However, the method of complex dilation will probably be useful in a more complete treatment of the present oscillator-boson problem. As the zeros of $f(\mu,\cdot)$ may lie in the second sheet, or even cross the cut, passing from the first to the second sheet, when $\mu$ varies continuously, we have to consider other branches than the principal one defined by (5b). We will sometimes denote the function with all its branches by $F(\mu,\cdot)$. We will have to consider several zeros of this function for a given value of $\mu$; when it is not important which one is considered we will use the generic notation $z_{\rm root}(\mu)$. We will be particularly interested in the zeros of the branch obtained by crossing the positive real axis

364

C. Billionnet


once, clockwise from the upper half-plane. This branch will be denoted by $f_+(\mu,\cdot)$. For $\mu \neq 0$ and $z/\mu \notin \mathbb{R}^+ \cup \{0\}$, one has
\[ f_+(\mu,z) = f(\mu,z) + 4i\pi\,\frac{\lambda^2}{\mu}\,\Bigl[g\Bigl(\frac{z}{\mu}\Bigr)\Bigr]^2 \tag{6} \]
and for $z/\mu$ real positive,
\[ f_+(\mu,z) = z - 1 + 2\,\frac{\lambda^2}{\mu}\,P\!\!\int_0^{\infty}\frac{g^2(p)}{p-z/\mu}\,dp + 2i\pi\,\frac{\lambda^2}{\mu}\,\Bigl[g\Bigl(\frac{z}{\mu}\Bigr)\Bigr]^2. \tag{7} \]
An alternative expression to (6) and (7), when $-\min\{|\arg(\pi_i)|\} < \arg(z/\mu) \le 0$, is
\[ f_+(\mu,z) = z - 1 - 2\lambda^2\int_{e^{-i\vartheta}\mathbb{R}^+}\frac{[g_1(p)]^2}{z-\mu p}\,dp \tag{7$'$} \]
where $\vartheta$ satisfies $-\arg(z/\mu) < \vartheta < \min\{|\arg(\pi_i)|\}$. In order to get an existence theorem easily for the zeros and their displacements when $\mu$ varies along a certain path, we want to use the analyticity of the function $F(\mu,\cdot)$. But for $\mu = \mu_c(\lambda) := C\lambda^2$, the singular point $z = 0$ is a zero of $f(\mu_c(\lambda),\cdot)$. Therefore, if $\mu$ is to vary up to a value $\mu_{\max} > \mu_c(\lambda)$, we will manage to have the paths followed by the zeros avoid the branch point by choosing for $\mu$ a path avoiding $\mu_c(\lambda)$, for example by going round it in the upper half-plane. (See Lemma 2.) Set $\gamma_r = \{z;\ |z - C\lambda^2| = r\}$ and $\gamma_r^+ = \gamma_r \cap \{z;\ \Im z \ge 0\}$. The parameter $\mu$ will then vary on
\[ C^+(\mu_{\max}) := [0,(C-\epsilon)\lambda^2] \cup \gamma^+_{\epsilon\lambda^2} \cup [(C+\epsilon)\lambda^2,\mu_{\max}]. \]
If $\mu_{\max} < \mu_c(\lambda)$, $\mu$ stays real. From now on, we drop the dependence of $\mu_c$ on $\lambda$. For $\mu_{\max} = \mu_c$, a special study is necessary but is not performed in this paper. Since we are led to consider the functions $f(\mu,\cdot)$ where $\mu$ may have complex values, let us make their definition a bit more precise. They are defined by analytic continuation in the $z$-variable, from a region where $\Im z$ is large enough for (5b) to make sense. Denoting by $\Phi$ the multi-valued one-variable function defined by the continuation across the positive real axis of
\[ \varphi(\zeta) := \int_0^{\infty}\frac{g^2(p)}{\zeta-p}\,dp, \tag{8} \]
defined if $\zeta \notin \mathbb{R}^+$, we have, for $\mu \neq 0$,
\[ f(\mu,z) = z - 1 - 2\lambda^2\mu^{-1}\Phi\Bigl(\frac{z}{\mu}\Bigr). \tag{9} \]
For $\mu \neq 0$, we will also set $\varphi(\mu,z) := \mu^{-1}\varphi(z/\mu)$. We keep the notation $f$ for the branch defined by formula (5b) without any deformation of the integration path. $f_+(\mu,z)$ will denote the continuation in $z$ of $f(\mu,z_1)$ along a path $t \to z(t)$ joining $z_1$ to $z$, when $\zeta(t) := \mu^{-1}z(t)$ crosses the positive real axis once clockwise.
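The role of the critical value $\mu_c = C\lambda^2$ introduced above can be illustrated numerically on the principal branch (5b). The sketch below is an illustration under stated assumptions, not the author's method: it uses the sample coupling $g^2(p) = (2/\pi)\,p^2/(1+p^2)^2$, takes $C$ to be $\int_{\mathbb{R}} g^2(p)/|p|\,dp$ (which equals $2/\pi$ for this $g$), restricts to real negative $z$ where the integrand is regular, and evaluates the integral with the substitution $p = \tan\theta$ and a midpoint rule. The function name `f_principal` and the sample values are hypothetical.

```python
import math


def f_principal(mu, z, lam, n=20000):
    """Approximate f(mu, z) = z - 1 - 2 lam^2 * int_0^inf g^2(p)/(z - mu p) dp
    for real z < 0 and mu > 0, with g^2(p) = (2/pi) p^2/(1+p^2)^2.

    With p = tan(theta) the integrand becomes
    (2/pi) sin^2(theta) cos(theta) / (z cos(theta) - mu sin(theta)) on [0, pi/2].
    """
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * h              # midpoint of the k-th subinterval
        s, c = math.sin(th), math.cos(th)
        total += s * s * c / (z * c - mu * s)
    integral = (2.0 / math.pi) * total * h
    return z - 1.0 - 2.0 * lam ** 2 * integral


lam = 1.0
C = 2.0 / math.pi            # assumed value of C for the sample g
mu_c = C * lam ** 2          # critical value of mu
near_zero = -1e-4            # a point just to the left of the branch point z = 0

# At mu = mu_c the function nearly vanishes as z -> 0-; below mu_c it is positive
# there, above mu_c it is negative.
print(f_principal(mu_c, near_zero, lam))
print(f_principal(0.5 * mu_c, near_zero, lam))
print(f_principal(2.0 * mu_c, near_zero, lam))
```

For $\mu < \mu_c$ the sign change between $z \to 0^-$ (positive) and $z \to -\infty$ (negative) is consistent with a real negative zero of the principal branch, which reaches the branch point as $\mu$ increases to $\mu_c$.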


Our study leans on the fact that $f(0,\cdot)$ has two zeros $z_0 := -d(\lambda)$ and $z_1 := 1 + d(\lambda)$, where $d(\lambda) := \frac12\bigl(\sqrt{1+4\lambda^2} - 1\bigr)$. We are going to prove the existence of certain zeros of $f(1,\cdot)$ by continuing $z_0$ and $z_1$ into zeros of $f(\mu,\cdot)$ when $\mu$ follows the path $C^+(1)$, if $1 > C\lambda^2$, or $[0,1]$, if $1 < C\lambda^2$. (See Figure 1. Dotted lines indicate points in the second sheet of the function $f(\mu,\cdot)$, the cut being $\mu\mathbb{R}^+$, varying with $\mu$.) In this latter case, let us remark that one of the zeros is real negative. (See [1].) Also let us insist on the fact that there may exist other zeros than those obtained by our method, even in the disc of radius 1 centered at the origin; a proof is given in [1] with $g(p) \sim p(1+p^2)^{-1}$.

Figure 1. Variation of µ, and corresponding paths of zeros of F for small µ (left panel: µ complex plane; right panel: z complex plane).
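The two starting zeros at $\mu = 0$ can be verified directly. A minimal sketch, assuming the normalization $\int_0^\infty g^2(p)\,dp = 1/2$ (which follows from P2 and the definition of $g^2$), so that (5b) reduces to $f(0,z) = z - 1 - \lambda^2/z$; the function names and the sample value of $\lambda$ are hypothetical.

```python
import math


def d(lam):
    """d(lambda) = (sqrt(1 + 4 lambda^2) - 1) / 2."""
    return 0.5 * (math.sqrt(1.0 + 4.0 * lam ** 2) - 1.0)


def f0(lam, z):
    """f(0, z) = z - 1 - lambda^2 / z, assuming int_0^inf g^2(p) dp = 1/2."""
    return z - 1.0 - lam ** 2 / z


lam = 0.8                 # sample coupling constant
z0 = -d(lam)              # negative zero
z1 = 1.0 + d(lam)         # zero to the right of the unperturbed level 1

print(z0, f0(lam, z0))
print(z1, f0(lam, z1))
```

Both values are the roots of $z^2 - z - \lambda^2 = 0$, which is $z f(0,z) = 0$ under the stated normalization.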

We will then think of applying this method of determination of zeros (corresponding to bound states or resonances) to the harmonic-oscillator Hamiltonian, the spin-boson Hamiltonian or to the Hamiltonian of more general systems coupled to a massless boson field. Concerning the harmonic-oscillator Hamiltonian and $f(z) = \langle 1 | [z - H]^{-1} | 1 \rangle^{-1}$, we have

Proposition. Let $g$ satisfy properties P1–P5 and $f(\mu, .)$ be defined by (5b) and its analytic continuation. For all $\mu_{\max} \neq C\lambda^2$, there exist three functions $z_0(.)$, $z_{0,+}(.)$ and $z_1(.)$, defined and continuous on $C^+(\mu_{\max})$ if $\mu_{\max} > C\lambda^2$, on $[0, \mu_{\max}]$ if $\mu_{\max} < C\lambda^2$, such that, at every point $\mu$ of this domain of definition, and thus at $\mu_{\max}$, their values are zeros of $f(\mu, .)$ and

$$z_0(0) = z_0 , \qquad z_{0,+}(0) = z_0 , \qquad z_1(0) = z_1 .$$

Corollary. For $\lambda \neq C^{-1/2}$ and $g$ satisfying P1–P5, $F(.)$, defined by (5b) and its analytic continuation, has at least the three zeros

$$z_0(1) , \qquad z_{0,+}(1) , \qquad z_1(1) .$$

If $\mu_{\max} = C\lambda^2$, we will content ourselves with the Proposition, which asserts the existence of the functions for any $\mu$ strictly smaller than $C\lambda^2$, joined to the observation that $z_0(\mu)$, the zero which is real negative (see [1]), satisfies $z_0(C\lambda^2) = 0$. We will not be concerned here with the question of knowing whether the two other functions $z_{0,+}$ and $z_1$ have a limit when $\mu$ tends to $C\lambda^2$. A point which indicates that the situation near $\mu_c$ cannot simply be derived by a continuity argument from the result we will get here is the following: $f_+(\mu_c, .)$ vanishes at 0, as can be seen on (6). In the particular case where $g = 2^{1/2}\pi^{-1/2} p(1 + p^2)^{-1}$, the zero which is likely to come to this singular point when $\mu$ goes to $\mu_c^-$ can be seen to be a zero of $f_+(\mu, .)$ which goes to 0 when $\mu$ goes to 0. (See [1].) Now $f_+(0, 0) \neq 0$. Thus this zero cannot be obtained by the method we follow in the present paper, which relies on the zeros for $\mu = 0$. It is of a different kind. This fact, together with the fact that the zeros close to 0 are difficult to control for lack of analyticity there, makes the existence and $\mu$-dependence of the zeros more difficult to obtain when $\mu$ is close to $\mu_c$. Concerning $z_{0,+}$ and $z_1$, we would have for instance to rule out the possibility that they might come to the singular point for this value of $\mu$. These difficulties are the reasons why we impose $\mu_{\max} \neq C\lambda^2$. Note that the value $z_{\rm root}(\mu)$ for a given $\mu$ a priori depends on $\varepsilon$, that is to say on the continuation path. One will have to study analyticity with respect to the $\mu$-variable in order to prove independence with respect to $\varepsilon$. The path around $\mu_c$ could also lie in the lower half-plane, but resonances are found the other way round.

The first part of the proof (local extension) will consist in showing the following. Let $\mu_0$ be either on the path $C^+(\mu_{\max})$, if $\mu_{\max} > C\lambda^2$, or on the path $[0, \mu_{\max}]$, if $\mu_{\max} < C\lambda^2$. For $z_{\rm root}(\mu_0) \neq 0$ a zero of order $k$ of $F$, one can find $k$ functions $z^{(k)}_{\rm root}(.)$ defined in a neighborhood of $\mu_0$ on the path, such that, for $\mu$ in this neighborhood, $f(\mu, z^{(k)}_{\rm root}(\mu)) = 0$ and $z^{(k)}_{\rm root}(\mu) \neq 0$. Let us suppose that this point is established and show how we then proceed.
We start from µ0 = 0, zroot (0) taking the values z0 or z1 and, for µ small and positive, we define functions z0 (.), z0,+ (.) and z1 (.), continuous, and satisfying f (µ, z0 (µ)) = 0 ,

f+ (µ, z0,+ (µ)) = 0 ,

f+ (µ, z1 (µ)) = 0 .

We denote them generically by $z_{\rm root}(.)$. (That, for small $\mu$, $z_1(\mu)$ is a zero of $f_+$ rather than $f$ is due to the fact that $f(\mu, .)$ does not vanish if $z$ is not a negative real.) We then try to extend the region in which these functions are defined, making sure that their values keep away from 0. Figure 2 illustrates the procedure. (The meaning of dotted lines is the same as in Figure 1.) If $\mu_{\max} > C\lambda^2$, let us parameterize $C^+(\mu_{\max})$ by a continuous function $\mu(t)$ defined on $[0, 1]$, with $\mu(0) = 0$ and $\mu(1) = \mu_{\max}$. If $\mu_{\max} < C\lambda^2$, we simply take $\mu(t) = \mu_{\max} t$. For all $t \in [0, 1]$ such that $\mu(t)$ is in the definition domain of $z_{\rm root}(.)$, with $z_{\rm root}(\mu(t)) \neq 0$, we apply the assumed preceding result around $\mu(t)$. If $z_{\rm root}(\mu(t))$ is simple, we thus get a closed interval $I$ containing $t$ and such that $z_{\rm root}(.)$ is now defined on all $\mu(I)$. If it is not, there is some arbitrariness in the choices of the zeros. (See further on.) We perform this operation step by step, starting from the point $t = 0$, which determines an interval $I_1 = [0, t_1]$; $t_1$ then determines $I_2$, etc. One thus constructs an increasing sequence $t_n$ in $[0, 1]$ which is such that the definition domain of $z_{\rm root}(.)$ extends to greater and greater arcs (or intervals), with, for all $n$, $z_{\rm root}(\mu(t)) \neq 0$ if $t \leq t_n$.

[Figure 2. Extension of the functions $z_{\rm root}(\lambda, \mu(.))$. Panels: $t$-plane, $\mu$-plane, $z$-plane.]

The second part (global existence) will consist in showing that, for each of these three functions, the $\mu(t_n)$ just constructed do not accumulate on some point differing from $\mu_{\max}$.

Proof of the proposition

1) Local extension. Let $\mu_0 \in \mathbb{C}$ and $z_{\rm root}(\mu_0) \in \mathbb{C}\setminus\{0\}$ be such that $F(\mu_0, z_{\rm root}(\mu_0)) = 0$.

To show the existence of the zeros $z^{(k)}_{\rm root}(.)$ in the neighborhood of $\mu_0$, let us show that we are in a position to apply Hurwitz' theorem. To do so, let us show more generally that for every $\xi$ which is non-zero and different from any of the $\mu_0\pi_i$'s, but not necessarily a zero of $F(\mu_0, .)$, the two following properties hold:

(A) There exist $\alpha > 0$ and a neighborhood $V_{\mu_0}(\xi)$ of $\xi$ such that, for $|\mu - \mu_0| < \alpha$ if $\mu_0 \neq 0$, or $0 \leq \mu < \alpha$ if $\mu_0 = 0$, $F(\mu, .)$ is analytic in $V_{\mu_0}(\xi)$.

(A') $F(., z)$ is continuous at $\mu_0$ if $\mu_0 \neq 0$ and, if $\mu_0 = 0$, $F(\mu(.), z)$ is right continuous at $t = 0$, these two continuities being uniform for $z$ in $V_{\mu_0}(\xi)$.

These properties will assure us that the hypotheses of the theorem are satisfied when we take $\xi = z_{\rm root}(\mu_0)$, once we have shown that this value is non-zero. Now showing that (A) and (A') are true is of course equivalent to showing the analogous properties with, instead of $F$, $F_1(\mu, z) := \mu^{-1}\Phi(z/\mu)$ for $\mu \neq 0$ and $F_1(0, z) = (2z)^{-1}$. (See (9) and (5b).) Let us prove analyticity, the only possible singularities being $0$ and $\mu\pi_i$. If $\mu_0 \neq 0$, it follows from the continuity of $(\mu, z) \to z/\mu$ on $(\mathbb{C}\setminus\{0\}) \times \mathbb{C}$. For $\mu_0 = 0$, if $0 \leq \mu < \alpha(\xi) := 2^{-1}|\xi| (\sup_i |\pi_i|)^{-1}$, then $F_1(\mu, .)$ (multi-valued if $\mu \neq 0$) is analytic in every neighborhood $V_{\mu_0}(\xi)$ contained in the disc $D(\xi, \frac{|\xi|}{2})$. For $\mu_0 \neq 0$, the continuity property of $\Phi(., .)$ and $F_1(., .)$ follows from the continuity of $\Phi(.)$. This yields the continuity at $\mu_0$, uniform in some neighborhood of $\zeta$.



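Now let us consider the case $\mu_0 = 0$. As $\mu$ varies on $C^+(\mu_{\max})$ or on $[0, \mu_{\max}]$, we may suppose that $\mu$ is real non-negative. First, if $\xi$ ($\neq 0$) is not real positive, let us set $\nu(\xi) := |\operatorname{Im}\xi| > 0$ if $\operatorname{Im}\xi \neq 0$, $\nu(\xi) := |\xi| > 0$ if $\operatorname{Im}\xi = 0$, and $V_{\mu_0}(\xi) := D(\xi, \frac{1}{2}\nu(\xi))$. Then, $\forall \mu > 0$,

$$z \in V_{\mu_0}(\xi) \Longrightarrow |z - \mu p| > \tfrac{1}{2}\nu(\xi) \quad \forall p \geq 0 .$$

If integral (8) defining $\varphi(\mu, \xi)$ is decomposed into $\int_0^1 + \int_1^\infty$, we get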

|ϕ(µ, z) − ϕ(0, z)|
2 supi |πi |.



We may apply the lemma to $\psi = g^2$, which is bounded outside $D(0, 2\sup_i|\pi_i|)$; it implies, for a finite number of branches of $F$, that there exists $B > 0$ such that $|\mu^{-1}\Phi(z/\mu)| < B$, for $|\mu| \in [|\mu_1|, |\mu_{\max}|]$ and $|z| > 2|\mu_{\max}| \sup_i|\pi_i|$. From (9) it follows that, still for $|\mu| \in [|\mu_1|, |\mu_{\max}|]$ and $|z|$ large enough, none of these determinations of $F$ vanishes. As a consequence, the set $\{z_{\rm root}(\mu(t_n))\}$ is contained in a compact set. Therefore it has at least one accumulation point. Let us denote it by $z^1_{\rm root,lim}(\lambda)$.

Let us now show that $F(\mu_{\rm lim}, .)$ is holomorphic at $z^1_{\rm root,lim}(\lambda)$. That $z^1_{\rm root,lim}$ cannot be 0 follows from the following lemma, proved in section 3.

Lemma 2. Given $a, b > 0$, there exists $r(a, b)$ s.t. $|z| \leq r(a, b)$, $a \leq |\mu| < |\mu_{\max}|$ and $|\mu - \mu_c| > b$ $\Longrightarrow$ $f(\mu, z) \neq 0$.

We use it noting that $t \in [t_1, 1]$ implies $|\mu(t)| > |\mu_1|$, and also that $|\mu(t) - \mu_c| > \lambda^2$ holds if $\mu_{\max} > C\lambda^2$ or $|\mu(t) - \mu_c| > C\lambda^2 - \mu_{\max} > 0$ if $\mu_{\max} < C\lambda^2$.

To show that $z^1_{\rm root,lim}$ is not a pole, let $z_{\rm root}(\mu_k)$ be a sub-sequence with limit $z^1_{\rm root,lim}$, satisfying $F(\mu_k, z_{\rm root}(\mu_k)) = 0$, for all $k$. The $\mu_k$ are complex numbers with $\lim \mu_k = \mu_{\rm lim}$ and $|\mu_k| > |\mu_1|$. By (9), $\Phi(z(\mu_k)/\mu_k)$ stays bounded as $k \to \infty$, thus $z^1_{\rm root,lim}/\mu_{\rm lim}$ is not a pole of $\Phi$ and therefore, again by (9), $F(\mu_{\rm lim}, z^1_{\rm root,lim})$ is defined. Now since $F(\mu, z)$ is jointly continuous wherever defined, $F(\mu_{\rm lim}, z^1_{\rm root,lim}) = 0$. If $z^1_{\rm root,lim}$ is a zero of order $l$ of $F(\mu_{\rm lim}, .)$, then Hurwitz' theorem tells us that there exists a certain neighborhood of $\mu_{\rm lim}$ in which the number of zeros of $F(\mu_{\rm lim}, .)$ is $l$; in this neighborhood the zeros form $l$ continuous arcs, distinct or not, crossing at $z^1_{\rm root,lim}$. Since $z^1_{\rm root,lim}$ is an accumulation point, there exists $k$ such that $z_{\rm root}(\mu_k)$ is in this neighborhood and therefore on one of the arcs. This implies that $z_{\rm root}(\mu(t))$ can be defined beyond $t_{\rm lim}$. (Take $z(\mu(t))$ up to $z(\mu_k)$, then the arc element containing $z^1_{\rm root,lim}$.)
Thus we can extend the domain of definition of the function zroot (µ) beyond µlim , except of course if µlim is the end point of the path on which µ is to vary. Applying this result to each of the three functions z0 (.), z0,+ (.), z1 (.) constructed for small µ, we thus get their existence on the whole of C + (µmax ) if µmax > Cλ2 or [0, µmax ] if µmax < Cλ2 . So our proposition will be proved when we have proved lemmas 1 and 2. ✷
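The step-by-step continuation in $\mu$ can be illustrated numerically in the simplest situation, that of the real negative zero $z_0(\mu)$ on the principal branch for real $\mu$ below the critical value. The following is a minimal sketch, not the construction used in the proof: it assumes the sample coupling $g(p) = 2^{1/2}\pi^{-1/2}p(1+p^2)^{-1}$ mentioned in the text, takes $f$ from (9), evaluates the integral (8) by a midpoint rule after the substitution $p = t/(1-t)$, and locates the zero by bisection. All function names and numerical parameters are illustrative assumptions; complex paths and second-sheet zeros are not handled.

```python
import math


def coupling_integral(mu, z, n=2000):
    """Midpoint-rule value of int_0^inf g(p)^2 / (mu*p - z) dp for z < 0 <= mu.

    Assumes g(p) = sqrt(2/pi) * p / (1 + p^2); p = t / (1 - t) maps [0, inf) to [0, 1).
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        s = 1.0 - t
        total += t * t * s / ((s * s + t * t) ** 2 * (mu * t - z * s))
    return (2.0 / math.pi) * total * h


def f(mu, z, lam):
    # Formula (9) on the principal branch, written for real z < 0:
    # f = z - 1 - 2 lam^2 * int g^2 / (z - mu p) dp.
    return z - 1.0 + 2.0 * lam * lam * coupling_integral(mu, z)


def negative_zero(mu, lam, lo, hi=-1e-6, iterations=50):
    """Bisection for the zero of z -> f(mu, z) in [lo, hi]."""
    if not (f(mu, lo, lam) < 0.0 < f(mu, hi, lam)):
        raise ValueError("no sign change: mu may be above the critical value")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mu, mid, lam) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def track_z0(lam, mu_values):
    """Follow the negative zero from mu = 0 along increasing real mu."""
    d = 0.5 * (math.sqrt(1.0 + 4.0 * lam * lam) - 1.0)
    return [negative_zero(mu, lam, lo=-d - 1.0) for mu in mu_values]
```

For instance, `track_z0(1.0, [0.0, 0.1, 0.2, 0.3])` starts close to $-d(1)$ at $\mu = 0$ and returns values that stay negative and move towards 0 as $\mu$ increases, in line with the observation that this zero reaches 0 at the critical value of $\mu$.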

3 Proofs of lemmas 1 and 2

Proof of Lemma 1 Set $r := \sup_i |\pi_i|$. First, if $\operatorname{Re}\zeta < 0$, $|h(\zeta)| < \|p^{-1}\psi\|_{L^1(\mathbb{R}^+)}$. If $\operatorname{Re}\zeta \in [0, r]$, $|\zeta| > r$ implies $|\operatorname{Im}(\zeta)| \geq r\sqrt{3}$ and therefore $|h(\zeta)| \leq r^{-1}3^{-1/2}\|\psi\|_{L^1(\mathbb{R}^+)}$. Let us now suppose $\operatorname{Re}\zeta > r$ and consider the disc $D(\zeta, r)$. From the hypotheses, it is outside $D(0, r)$ and thus does not contain any pole of $\psi$. If $D(\zeta, r)$ does not meet $\mathbb{R}^+$, the integration path defining $h$ need not be deformed and $|p - \zeta| > r$ holds for every $p$ on this path. If $D(\zeta, r)$ does meet $\mathbb{R}^+$, $\zeta$ coming either from the upper half-plane or the lower half-plane, it is possible to deform the integration path in such


a way that $|p - \zeta| > r$ still holds for every $p$ on this path: the part that is touched by the positive real axis is dragged by the disc and remains stuck to it, until the axis possibly recovers its initial position. The $p$-integration can be separated in two parts. In the first one, the integration is taken on $\{p,\ p > 0,\ |p - \zeta| > r\}$ and gives a contribution to the bound for $|h(\zeta)|$ of at most $r^{-1}\|\psi\|_{L^1(\mathbb{R}^+)}$. In the second one, the integration is taken on an arc whose length is smaller than $2\pi r$, the integrand being bounded by $r^{-1}\sup\{|\psi(p)|;\ |p| > r\}$; hence a contribution to the bound for $|h(\zeta)|$ of at most $2\pi \sup\{|\psi(p)|;\ |p| > r\}$. Thus $h_*$ ($h$ or $h_+$) is bounded and so is any branch of $h$ since it is connected to $h_*$ by the residue formula. ✷

To prove Lemma 2 we need the following lemma.

Lemma 3. Let $\psi$ be a meromorphic function with poles $\pi_i$, $\pi_i \notin \{\mathbb{R}^+ \cup \{0\}\}$, such that $\psi \in L^1(\mathbb{R}^+)$. Let $h$ be the function defined in Lemma 1. For all integers $m$, there exist $r > 0$, $a_m > 0$ and $b_m > 0$ such that

$$\zeta \neq 0,\ |\zeta| < r \Longrightarrow |h_{(+)^m}(\zeta)| \leq a_m + b_m |\psi(\zeta)|\, |\log|\zeta|\,| .$$

Proof of Lemma 3 Set $\zeta = x + iy$. Let $r$ be such that $\psi$ is analytic in a neighborhood of the disc $D(0, 2r)$ and let us suppose $|\zeta| < r$. Let us first suppose $y \neq 0$.

$$h(\zeta) = \int_0^{2r} \frac{\psi(p)}{p - \zeta}\, dp + \int_{p > 2r} \frac{\psi(p)}{p - \zeta}\, dp ,$$

$$\left| \int_{p > 2r} \frac{\psi(p)}{p - \zeta}\, dp \right| < r^{-1}\|\psi\|_{L^1(\mathbb{R}^+)} ,$$

$$\int_0^{2r} \frac{\psi(p)}{p - \zeta}\, dp = \int_0^{2r} \frac{\psi(p) - \psi(\zeta)}{p - \zeta}\, dp + \psi(\zeta) \int_0^{2r} \frac{1}{p - \zeta}\, dp .$$

For $p \in [0, 2r]$ and $\zeta \in D(0, r)$, the segment which joins $p$ and $\zeta$ in the complex plane is contained in the analyticity domain $D(0, 2r)$ of $\psi$ and, consequently,

$$\left| \frac{\psi(p) - \psi(\zeta)}{p - \zeta} \right| \leq \sup_{z \in D(0, 2r)} |\psi'(z)| .$$

Finally,

$$\left| \int_0^{2r} \frac{1}{p - (x + iy)}\, dp \right| \leq \pi + \log\left| \frac{2r}{\zeta} - 1 \right| .$$

Therefore

$$|h(\zeta)| < r^{-1}\|\psi\|_{L^1(\mathbb{R}^+)} + 2r \sup_{z \in D(0, 2r)} |\psi'(z)| + |\psi(\zeta)| \left( \pi + \log\left| \frac{2r}{\zeta} - 1 \right| \right) .$$

From

$$\log\left| \frac{2r}{\zeta} - 1 \right| \leq \log\left(1 + \frac{2r}{|\zeta|}\right) \leq \max\{|\log(3r)|, |\log(2r)|\} + |\log|\zeta|\,|$$

we get the announced inequality for $h$.



Let us now consider $y \to 0$; $\zeta$ stays in $D(0, 2r)$. We have just shown

$$\left| h(\zeta) - \int_{p > 2r} \frac{\psi(p)}{p - \zeta}\, dp - \psi(\zeta) \log\left| \frac{2r}{\zeta} - 1 \right| \right| < \pi \sup_{z \in D(0, 2r)} |\psi(z)| + 2r \sup_{z \in D(0, 2r)} |\psi'(z)| .$$

$\lim_{y \to 0} h(\zeta)$ exists (since the integration path defining $h$ may be deformed) and the limit is $h_+(x)$. Therefore the left-hand side of the above inequality has a limit and it satisfies the same bound since the bound does not depend on $y$. Thus the announced bound for $h_+$ holds, if $y = 0$. Lastly, when $y$ goes from a positive value to a negative one, $\zeta$ crossing $m$ times the real axis, we get $h_{(+)^m}(\zeta) = h(\zeta) + m\,\psi(\zeta)$, from where the bound for $h_{(+)^m}$ follows. ✷

Remark. If we add the hypothesis $p^{-1}\psi \in L^1(\mathbb{R}^+)$, then, since the analyticity of $\psi$ near 0 now implies that $\psi$ vanishes at the origin, $h$ and $h_+$ are now bounded in a neighborhood of 0. Thus, from P1, P3 and P4, each branch of $\varphi(\zeta)$ is bounded in a neighborhood of 0; we are going to see that a value may be given to the function at the branch point, making each branch continuous there, in the cut plane.

Proof of Lemma 2. $f(\mu, 0)$ is defined by formula (5b) and its value is $-1 + C\frac{\lambda^2}{\mu}$. For $z/\mu \notin (\mathbb{R}^+ \cup \{0\})$,

$$f(\mu, z) - f(\mu, 0) = z \left( 1 - 2\,\frac{\lambda^2}{\mu^2} \int_0^\infty \frac{g^2(p)}{p\,(z/\mu - p)}\, dp \right) . \qquad (10)$$

Since $|\mu| > a$, for $z$ sufficiently small, $\frac{z}{\mu}$ cannot coincide with a pole $\pi_i$. Because of the three following hypotheses: the only singularities of $g$ are non-real poles $\pi_i$, $g(0) = 0$ and $p^{-1/2} g \in L^2(\mathbb{R})$, the hypotheses of Lemma 3 are satisfied by the function $\psi(p) := p^{-1} g^2(p)$. With $\rho := \min_i\{|\pi_i|\}$, (10) then implies that $f(\mu, .)$ together with any of its analytic continuations into any of the sheets satisfy

$$z \neq 0,\ |z| < a\rho \Longrightarrow |f(\mu, z) - f(\mu, 0)| < |z| \left( A + B|z|\, |\log|z|\,| \right) ,$$

uniformly for $\mu$ such that $\mu_{\max} \geq |\mu| \geq a > 0$. Lemma 2 is then a consequence of

$$|f(\mu, 0)| = \frac{|\mu - C\lambda^2|}{|\mu|} \geq \frac{b}{\mu_{\max}} .$$

✷

4 Comments on the result

Essentially, it tells us that the zeros of a certain family of µ-dependent functions, zeros which are candidates for resonances or bound states for the value µ = 1, do not disappear as the parameter µ increases. Our strategy has been to obtain


them from the zeros present when $\mu = 0$, a situation in which the bosons could be considered as having zero energy. A zero may disappear, for a certain value of the parameter $\mu$ on which the functions of a family depend, when this zero meets a pole. We showed that this does not occur in our situation. Incidentally, even if it were so, that would not mean that it does not reappear for greater values of $\mu$. Let us remark that $z_0(\mu)$ and $z_{0,+}(\mu)$ may, in some sense, be considered as two "presentations" of the same zero. Indeed, for $\mu$ small, one is a zero of $f(\mu)$, the other one a zero of $f_+(\mu)$ which corresponds to the preceding one. When $\mu$ is small, the two determinations $f(\mu)$ and $f_+(\mu)$ are neighboring and so are the zeros. On the contrary, $z_0(\mu)$ and $z_1(\mu)$ cannot be paired.

Let us now come back to the Hamiltonian from which we get the function $f$ in (5), in order to put our result in a more general context. It is $H_\mu = H_{\rm osc} \otimes 1 + \mu\, 1 \otimes H_{\rm bos} + H_I$, where $H_I$ is given in (1). For $\mu = 0$, the zeros we leant on are eigenvalues of $H_{\mu=0} = H_{\rm osc} \otimes 1 + H_I$. They are only some of them, because in considering $f$ we only considered the restriction of the Hamiltonian to the subspace consisting of at-most-one-boson states, a subspace which is invariant under $H_\mu$. However, the result we obtained is that certain resonances which are not those usually considered can be derived from eigenvalues of $H_{\mu=0}$. Those usually considered are connected to eigenvalues of $H_{\rm osc} = a^* a$, or also $H(\lambda = 0)$. This formulation leads to a more general point of view, according to which the property suggested by our result is likely to be relevant to quite general situations. We will suggest developments of this idea in section 5, after having recalled that some work still remains to be done in the case of the harmonic oscillator. Before that, let us say some words about the role played by the poles of $g$. In physics, one may expect the function $g$ to have poles.
The greater the range of the interaction, the closer to the real axis these poles would be. Besides, mathematically, if $g$ (non-zero) has no singularities other than poles, hypotheses P1 and P5 imply that $g$ does have poles. Their presence allows us to understand how $f(.)$ may have zeros other than the one near 1, for small $\lambda$: $\varphi_+$ has the same poles as $g$ and therefore the term $\lambda^2 \varphi_+(\frac{z}{\mu})$ cannot be considered as negligible compared to $z - 1$, for small $\lambda$. The example of the zeros of the function $z - 1 + z^{-1}\lambda$ illustrates this. However, the proof of the existence of the zeros $z_0$ and $z_{0,+}$ has used the poles of $g$ only in an indirect way, by the fact that their existence is a consequence of the hypotheses. It is rather analyticity properties of $g$ that have been used. Let us see however how the position of these poles affects the position of the zeros of $F$ which are not close to 1. Suppose now that $\mu$ is fixed. If $g$ has no pole inside a disc of radius $R$, then the different determinations of $\varphi$ have no poles in the same disc cut along $\mathbb{R}^+$; therefore they are bounded there. As a consequence, with for instance $\mu R > 2$, for all $\eta \in\, ]0, \frac{1}{2}]$, there exists a certain $\lambda_0(\mu, R, \eta)$ such that for $\lambda$ smaller than $\lambda_0(\mu, R, \eta)$, $f_+(\mu, .)$ vanishes, in $D(0, \mu R)$ cut along $\mathbb{R}^+$, only in the disc $D(1, \eta)$. According to our result, there are other zeros. Therefore they must be outside the disc $D(0, \mu R)$. In fact, in a region where $\lambda^2 \varphi_+(\frac{z}{\mu})$ is not bounded, possible zeros of $f(\mu, .)$ are not necessarily close to 1.



5 Possible extensions

5.1 The harmonic oscillator.

An extension of our result would consist in no longer restricting oneself to the at-most-one-photon space. Although the knowledge of H on this subspace suffices to determine H everywhere, it remains to find how to determine explicitly all the new resonances (or bound states) from the one we described in section 2. Moreover, a complementary study is still necessary to treat possible other zeros, like those we said we left aside.

5.2 The spin-boson model.

First let us consider the simplified version mentioned in the introduction. Following the same method as before, we look for the zeros of functions $\phi^+(z)$ and $\phi^-(z)$ by means of the path followed by the zeros of functions

$$\phi^+(\mu, z) := -1 + f(\mu, z + 1) \quad \text{and} \quad \phi^-(\mu, z) := 3 + f(\mu, z - 1) ,$$

when $\mu$ varies from 0 to 1 on a path in the complex plane. This amounts to introducing the auxiliary Hamiltonian

$$H_\mu := P_1 \left( \sigma_z \otimes 1 + \mu\, 1 \otimes H_{\rm bos} + \lambda\, \sigma_x \otimes \left( c^*(g_1) + c(g_1) \right) \right) P_1 .$$

The starting point was the position of the zeros for $\mu = 0$. $\phi^+(0, z)$ has two zeros $\pm\sqrt{1 + \lambda^2}$; the same is true for $\phi^-(0, z)$. The twofold degeneracy for each of the energies $\pm\sqrt{1 + \lambda^2}$ is due to the fact that the following two states of the spin-boson system have the same energy: spin $|+\rangle$ with 0 boson and spin $|+\rangle$ with 1 boson. The same for $|-\rangle$. $\phi^+(\mu, .)$ has a branch point at $z = -1$ and the principal determination vanishes at that point if and only if $\mu = \frac{1}{2}C\lambda^2$. $\phi^-(\mu, .)$ has a branch point at $z = +1$ and the principal determination vanishes at that point if and only if $\mu = -\frac{1}{2}C\lambda^2$, so that it does not vanish if $\mu > 0$. Analyticity and continuity properties (A) and (A'), proved for $f$, are still true for $\phi^+$ and $\phi^-$. Therefore, if $\mu_{\max} \neq \frac{1}{2}C\lambda^2$, an argument similar to the one we used for the function $f$ of the oscillator can be used for $\phi^+$ and $\phi^-$. In the study of zeros of $\phi^-$, $\mu$ can stay real, whereas for $\phi^+$ we do need to consider a path $Q^+_{\lambda,\varepsilon}(\mu_{\max})$ which goes round the point $\frac{1}{2}C\lambda^2$, if $\mu_{\max} > \frac{1}{2}C\lambda^2$. Then, for $\phi^+$, we get the existence of two zeros depending on $\mu$ and coinciding with $\pm\sqrt{1 + \lambda^2}$ for $\mu = 0$, when $\mu$ varies either on $Q^+_{\lambda,\varepsilon}(\mu_{\max})$ for $\mu_{\max} > \frac{1}{2}C\lambda^2$ or on $[0, \mu_{\max}]$ for $\mu_{\max} < \frac{1}{2}C\lambda^2$. For $\phi^-$ the same is true for $\mu \in [0, \mu_{\max}]$. In particular, $\mu$ can be 1 except if $1 = \frac{1}{2}C\lambda^2$. The zeros are denoted in the following way:

$z_0^+(\mu)$ and $z_1^-(\mu)$ are the zeros of $\phi^+(\mu, z)$ such that $z_0^+(0) = \sqrt{1 + \lambda^2}$ and $z_1^-(0) = -\sqrt{1 + \lambda^2}$;

$z_1^+(\mu)$ and $z_0^-(\mu)$ are the zeros of $\phi^-(\mu, z)$ such that $z_1^+(0) = \sqrt{1 + \lambda^2}$ and $z_0^-(0) = -\sqrt{1 + \lambda^2}$.

Note that the superscripts $\pm$ in the notation of the zeros at $\mu = 0$ do not refer to those in $\phi^\pm$ but to the fact that these zeros are close to $\pm 1$ if $\lambda$ is small. The subscripts 0 or 1 refer to the dominating number of bosons in the eigenvectors corresponding to these eigenvalues for small $\lambda$. For all real and positive $\lambda$ and $\mu$, the principal determination of $\phi^-(\mu, .)$ has a real zero which is smaller than 1, for the function is increasing on $]-\infty, 1]$ and $\phi^-(\mu, 1) > 0$. It tends to $-1$ if $\lambda$ tends to 0, $\mu$ being fixed; it is $z_0^-(\mu)$. The principal determination of $\phi^+(\mu, .)$ has no real zero if $\mu > \frac{1}{2}C\lambda^2$, but has one which is smaller than $-1$ if $\mu < \frac{1}{2}C\lambda^2$, which is the case for sufficiently small $\mu$; it is $z_1^-(\mu)$. It becomes complex when $\mu$ increases, passing the branch point $z = -1$ for $\mu = \frac{1}{2}C\lambda^2$. We are not concerned here with the ground state energy problem. In particular we will not try to determine which of the two eigenvalues $z_0^-(1)$ or $z_1^-(1)$ is the smallest when $\lambda^2 > 2/C$. Thus resonances for the Hamiltonian (3) are derived from eigenvalues $z_0^\pm(0)$ and $z_1^\pm(0)$ of

$$H_{\mu=0} := P_1 \left( \sigma_z \otimes 1 + \lambda\, \sigma_x \otimes \left( [c(g_1)]^* + c(g_1) \right) \right) P_1 . \qquad (11)$$

One may think that this would also be true for

$$H_{{\rm sp},N}(g_1, \mu) = P_N \left( \sigma_z \otimes 1 + \mu\, 1 \otimes H_{\rm bos} + \lambda\, \sigma_x \otimes \left( c^*(g_1) + c(g_1) \right) \right) P_N , \qquad (13)$$

the number of bosons being this time at most $N$. This would be exploitable since some eigenvalues and eigenvectors of $H_{{\rm sp},N}(g_1, 0)$ can be constructed explicitly.
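The starting zeros $\pm\sqrt{1 + \lambda^2}$ of $\phi^\pm(0, .)$ follow from the definitions of $\phi^\pm$ in terms of $f$ and can be checked numerically. The following is a minimal sketch, assuming that $f(0, z) = z - 1 - \lambda^2/z$ (a form inferred from the zeros $-d(\lambda)$ and $1 + d(\lambda)$ quoted for the oscillator, not written out in the text); the function names are illustrative.

```python
import math


def f_at_mu_zero(z, lam):
    # Assumption: f(0, z) = z - 1 - lambda^2 / z (inferred, not quoted from the paper).
    return z - 1.0 - lam * lam / z


def phi_plus_at_mu_zero(z, lam):
    """phi^+(0, z) = -1 + f(0, z + 1)."""
    return -1.0 + f_at_mu_zero(z + 1.0, lam)


def phi_minus_at_mu_zero(z, lam):
    """phi^-(0, z) = 3 + f(0, z - 1)."""
    return 3.0 + f_at_mu_zero(z - 1.0, lam)


def starting_zeros(lam):
    """The pair +sqrt(1 + lambda^2), -sqrt(1 + lambda^2)."""
    s = math.sqrt(1.0 + lam * lam)
    return s, -s
```

With this assumed form, both $\phi^+(0, z) = 0$ and $\phi^-(0, z) = 0$ reduce to $z^2 - 1 = \lambda^2$, which is why the same two values appear for each function.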

5.3 A general system with discrete energy levels coupled to a boson field.

As announced at the end of section 1, we are now going to set the question in a general way. Consider a system $S$ which is described, when it is isolated, by a Hamiltonian $H_S$ and, when it is coupled to a massless boson field, by the Hamiltonian

$$H = H_S \otimes 1 + 1 \otimes H_{\rm bos} + H_{\rm int} . \qquad (14)$$

It is often considered that the coupling to the boson transforms each eigenvalue of $H_S$ different from the ground-state energy into a resonance, a pole of matrix elements of the resolvent in the second sheet. The preceding study, based on the $\mu$-dependence, suggests that there should be other resonances, or even bound states, in correspondence with the eigenvalues of $H_S \otimes 1 + H_{\rm int}$, rather than with those of $H_S$. ($z_0$ is an eigenvalue of $H_S \otimes 1 + H_{\rm int}$, associated with the eigenvector $\lambda^{-1} d(\lambda)\, |1\rangle \otimes \Omega_{\rm bos} - |0\rangle \otimes c^*(g)\Omega_{\rm bos}$.) The question is: under which sufficiently general hypotheses is there such a correspondence between resonances for $H$ and eigenvalues for $H_S \otimes 1 + H_{\rm int}$ in that kind of problem? It is to be noted that the ratio of the number of eigenvalues of $H_S \otimes 1 + H_{\rm int}$ to the number of eigenvalues of $H_S$ may be expected to be large. For example, in the case of the harmonic oscillator and $H$ given by (1), it is a countable infinity; indeed, the eigenvalues of $H_{\rm osc} \otimes 1 + H_{\rm int}$ are of the form $s_+(1 + d(\lambda)) - s_- d(\lambda)$, where $s_+$ and $s_-$ are non-negative integers.

References

[1] C. Billionnet, Eur. Phys. J. D, Vol. 8, 2000, pp. 157–164.
[2] H.L. Cycon, R.G. Froese, W. Kirsch, B. Simon, Schrödinger Operators, Springer 87.
[3] V. Bach, J. Fröhlich, I.M. Sigal, Adv. Math., Vol. 137, 1998, 299.

C. Billionnet Centre de Physique Théorique Ecole Polytechnique F-91128 Palaiseau cedex, France e-mail : [email protected] Communicated by Gian Michele Graf submitted 21/03/00, accepted 22/09/00




Algebraic Representation Theory of Partial Algebras G.O.S. Ekhaguere

Abstract. This paper develops an algebraic representation theory of partial algebras. We introduce the notion of irreducibility for representations of a partial algebra, prove necessary and sufficient conditions for a representation to be (a) irreducible or (b) completely reducible, and develop aspects of the decomposition theory of completely reducible representations. An application to boson quantum field theory is furnished.

0. Introduction

In one approach to the axiomatic formulation of some aspects of quantum field theory and quantum statistical mechanics, the theory of C*- and W*-algebras is applied. Since these algebras admit realizations as bounded linear maps on Hilbert spaces, whereas many of the linear operators encountered in quantum theory are unbounded, densely-defined maps on Hilbert spaces, their use gives rise to a number of problems. For example, thermodynamic limits of some nets of observables may not exist in the uniform topology available in a C*-algebraic approach to quantum statistical mechanics. To address some of these issues, the notion of a partial ∗-algebra has recently been introduced as a generalization of the notion of a ∗-algebra. The properties and structure of partial ∗-algebras are currently being studied by a number of authors [1-10]. In this paper, we consider the class of partial algebras and discuss the algebraic representation of such an algebra by the members of L(D, X), where X is a linear space over ℂ (the complex numbers), D is a subspace of X and L(D, X) is the set of all X-valued linear maps whose domains in X contain D. If π is a representation of a partial algebra A in L(D, X), then the maps π(a), a ∈ A, do not necessarily leave D invariant. We circumvent this obstruction in Section 3 by introducing an enlargement of D which is left invariant by the derived operator set described in that Section. This enables us in Sections 3 and 4 to establish necessary and sufficient conditions for a representation of a partial algebra to be irreducible or completely reducible, and to develop aspects of the decomposition theory of completely reducible representations.

1. Fundamental notions

Definition A partial algebra is a triplet (A, Γ, ◦) comprising a complex linear space A, a relation Γ ⊆ A × A on A and a partial multiplication ◦ such that



(i) (x, y) ∈ Γ ⇐⇒ x ◦ y ∈ A;

(ii) (x, z) ∈ Γ and (y, z) ∈ Γ imply (αx + βy, z) ∈ Γ, and then (αx + βy) ◦ z = α(x ◦ z) + β(y ◦ z) for all α, β ∈ ℂ.

Remark. Partial algebras, especially partial ∗-algebras, and their applications have been studied by a number of authors [1-10].

Notation : Partial algebras may be analyzed by means of their variety of multipliers. These are introduced as follows. Let (A, Γ, ◦) be a partial algebra.

(i) For x ∈ A, the set

$$M_L(x) = \{z \in A : (z, x) \in \Gamma\}$$

is the set of left multipliers of x.

(ii) For C ⊆ A, let

$$M_L(C) = \bigcap_{x \in C} M_L(x) .$$

Then $M_L(C)$ is the set of all elements of A that are left multipliers of every element of C. We remark that right multipliers of elements and subsets of A are similarly defined. The right multipliers of x ∈ A (resp. C ⊆ A) will be denoted by $M_R(x)$ (resp. $M_R(C)$).

(iii) For C ⊆ A, let

$$M(C) = M_L(C) \cap M_R(C) .$$

Then M(C) is the set of universal multipliers of C. The foregoing notions will be employed in the sequel.
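The multiplier sets just introduced depend only on the relation Γ, so they can be illustrated on a finite toy model. The following is a minimal sketch which ignores the linear structure of A and treats Γ as a plain set of ordered pairs; it assumes that right multipliers are defined symmetrically, by $M_R(x) = \{z : (x, z) \in \Gamma\}$, which the text only describes as "similarly defined". All names are illustrative.

```python
def left_multipliers(x, elements, gamma):
    """M_L(x): all z such that (z, x) belongs to gamma."""
    return {z for z in elements if (z, x) in gamma}


def right_multipliers(x, elements, gamma):
    # Assumption: M_R(x) = {z : (x, z) in gamma}, the mirror image of M_L(x).
    return {z for z in elements if (x, z) in gamma}


def left_multipliers_of_set(subset, elements, gamma):
    """M_L(C): intersection of M_L(x) over x in C (all of `elements` if C is empty)."""
    result = set(elements)
    for x in subset:
        result &= left_multipliers(x, elements, gamma)
    return result


def right_multipliers_of_set(subset, elements, gamma):
    """M_R(C): intersection of M_R(x) over x in C."""
    result = set(elements)
    for x in subset:
        result &= right_multipliers(x, elements, gamma)
    return result


def universal_multipliers(subset, elements, gamma):
    """M(C) = M_L(C) intersected with M_R(C)."""
    return (left_multipliers_of_set(subset, elements, gamma)
            & right_multipliers_of_set(subset, elements, gamma))


# A three-element toy relation: the pairs for which the product is declared defined.
ELEMENTS = {"a", "b", "c"}
GAMMA = {("a", "a"), ("a", "b"), ("b", "a"), ("c", "a"), ("c", "b")}
```

In this toy relation, only the element `"a"` multiplies both `"a"` and `"b"` from the left and from the right, so it is the only universal multiplier of the subset `{"a", "b"}`.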

2. The concrete partial algebra L(D, X)

Within the context of the algebraic representation theory of a partial algebra considered in this paper, there is the concrete partial algebra L(D, X) which is introduced as follows. Let X be a linear space over ℂ, D a subspace of X and L(D, X) the set of all X-valued linear maps whose domains in X contain D. The set L(D, X) is a linear space when furnished with the operations of pointwise addition and pointwise scalar multiplication on D. Furthermore, if dom(T) means the domain of T in X, A · B (which will often be written simply as AB in the sequel) denotes the pointwise composition of A, B ∈ L(D, X) on D and

Γ(D, X) = {(A, B) ∈ L(D, X) × L(D, X) : BD ⊂ dom(A)},

then the triplet (L(D, X), Γ(D, X), ·) is a partial algebra. This concrete partial algebra will be employed in the sequel.


Example : The partial algebra of the boson quantum field

Let H be a complex Hilbert space, with inner product ⟨·, ·⟩, and X the boson Fock space [14] over H. The identity map on X will be denoted by 11. Each f ∈ H determines a member e(f) of X, called an exponential vector [15], defined by

$$e(f) = \bigoplus_{n=0}^{\infty} (n!)^{-1/2} \otimes^n f , \qquad f \in H ,$$

where $\otimes^0 f = 1$ and $\otimes^n f$ is the n-fold tensor product of f with itself for n ≥ 1. Write D for the linear span of all the exponential vectors in X. It is known [15] that D is dense in X. Using previous notation, the triplet (L(D, X), Γ(D, X), ·) is a partial algebra, which we denote simply by L(D, X). On D, introduce the X-valued linear operators a(f) and a∗(f), f ∈ H, defined by

$$a(f)e(g) = \langle f, g \rangle e(g) , \qquad a^*(f)e(g) = \frac{d}{d\sigma}\, e(g + \sigma f)\Big|_{\sigma = 0} , \qquad g \in H .$$

Then, for f ∈ H, a(f) is the annihilation operator and a∗(f) is the creation operator of quantum field theory. Notice that whereas a(f) leaves D invariant, a∗(f) does not. Let A(D) be the partial subalgebra of L(D, X) generated by the set {11} ∪ {a(f), a∗(g) : f, g ∈ H}. We call A(D) the partial algebra of the boson quantum field. If we take H to be $L^2(\mathbb{R}^3, dx)$, then A(D) has a quasi-local structure, as explained below. In the sequel, we introduce and study the representations of arbitrary partial algebras. We remark that, among other things, the representations of A(D) are of much interest in the study of boson quantum fields.

Remark. If Z is a linear space, then L(Z, Z), which will sometimes be written as L(Z), denotes the linear space of all endomorphisms of Z.

Definition A representation of a partial algebra (A, Γ, ◦) on a linear space X is a homomorphism π of (A, Γ, ◦) into the concrete partial algebra (L(D, X), Γ(D, X), ·), i.e. π is linear and satisfies:

(i) (x, y) ∈ Γ =⇒ (π(x), π(y)) ∈ Γ(D, X) and π(x ◦ y) = π(x) · π(y);

(ii) if A is unital, with unit e, then π(e) = 11, the identity element of L(X, X).

Remark. 1. The quasi-local structure of the partial algebra A(D) of the boson quantum field is determined by a net {A(D, O) : O is a bounded open subset of $\mathbb{R}^3$} of partial algebras, ordered under the isotonous inclusion: $A(D, O_1)$ is a partial subalgebra of $A(D, O_2)$, whenever $O_1 \subseteq O_2$, where A(D, O) is the partial subalgebra of L(D, X) generated by the set {11} ∪ {a(f), a∗(g) : f, g ∈ H, supp f and supp g ⊆ O}. Indeed, there is a canonical partial algebra homomorphism $h_{O_1,O_2}$ of $A(D, O_1)$ onto a partial subalgebra of $A(D, O_2)$, for $O_1 \subseteq O_2$. Then A(D) is the partial subalgebra of L(D, X) generated by ∪{A(D, O) : O is a bounded open subset of $\mathbb{R}^3$}. Equivalently, A(D) is the inductive limit of the net of local partial algebras {A(D, O) : O is a bounded open subset of $\mathbb{R}^3$}, equipped with the canonical homomorphisms $\{h_{O_1,O_2} : O_1 \subseteq O_2\}$.

2. Issues concerning the representations of quasi ∗-algebras and partial ∗-algebras have been studied in the references [5-6], [1][18]. Some of the considerations relate to determining suitable conditions for a representation of a subalgebra to admit an extension to the algebra containing it [5]; others concern the construction of representations of partial ∗-algebras from states, weights or biweights [6][18]. In this paper, we study the representations of partial algebras in their own right, and not as constructs associated with preassigned states, weights or biweights on the algebras.

3. In the sequel, (A, Γ, ◦) is a partial algebra, which will often be written simply as A. Similarly, the partial algebra (L(D, X), Γ(D, X), ·) will be abbreviated as L(D, X).
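The defining formulas for a(f) and a∗(f) on exponential vectors, given in the example of this section, can be checked numerically in the simplest case. The following is a minimal sketch, assuming a one-dimensional space H = ℂ, an inner product that is conjugate-linear in its first argument, and a truncation of the Fock space to finitely many components; none of these choices is made in the text, and the function names are illustrative.

```python
import math


def exponential_vector(g, n_max):
    """Components g^n / sqrt(n!) of e(g) for n = 0..n_max (one-mode case H = C)."""
    return [g ** n / math.sqrt(math.factorial(n)) for n in range(n_max + 1)]


def annihilate(f, psi):
    """a(f) on a truncated vector: (a(f) psi)_n = conj(f) * sqrt(n + 1) * psi_{n+1}.

    The result has one component fewer than psi, since the top one is not determined.
    Assumption: <f, g> = conj(f) * g.
    """
    fc = complex(f).conjugate()
    return [fc * math.sqrt(n + 1) * psi[n + 1] for n in range(len(psi) - 1)]


def create_on_exponential(f, g, n_max, step=1e-6):
    """a*(f) e(g), approximated by a central difference in sigma of e(g + sigma f)."""
    plus = exponential_vector(g + step * f, n_max)
    minus = exponential_vector(g - step * f, n_max)
    return [(p - m) / (2.0 * step) for p, m in zip(plus, minus)]
```

Comparing `annihilate(f, exponential_vector(g, n_max))` with `conj(f) * g` times the components of e(g) reproduces the eigenvector relation a(f)e(g) = ⟨f, g⟩e(g) in this truncated setting. The derivative returned by `create_on_exponential` has n-th component close to $f\sqrt{n}$ times the (n − 1)-th component of e(g), so it is not a multiple of e(g).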

3. Irreducible representations

The notion of an operator set will be required in the subsequent discussion. Definition Let M be a subset of A and Z a linear space. An L(Z)-valued map on M is called an operator set indexed by M. Notation : The set of all L(Z)-valued operator sets indexed by M will be denoted by Ops(M, Z). Definition Let ϕ ∈ Ops(M, Z). Then a subspace D of Z is called ϕ-stable if ϕ(a)D ⊆ D for all a ∈ M. Remark. For ϕ ∈ Ops(M, Z), the subspaces {0} and Z of Z are evidently ϕ-stable. These ϕ-stable subspaces are called trivial; all other ϕ-stable subspaces, if any, are called non-trivial. Definition A member ϕ of Ops(M, Z) is irreducible if Z = {0}, ϕ is non-zero and the only ϕ-stable subspaces of Z are the trivial ones. Notation : Let Ops(M, Z) and Ops(M, Z ) be two operator sets and ϕ ∈ Ops(M, Z), ϕ ∈ Ops(M, Z ). Then, Hom(ϕ, ϕ ) is the linear space of all linear maps T : Z → Z such that T · ϕ(a) = ϕ (a) · T on Z, for all a ∈ M. Definition 1. A member of Hom(ϕ, ϕ ) is called an intertwining operator, as it intertwines the operator sets ϕ and ϕ . 2. If Hom(ϕ, ϕ ) contains a bijection, then ϕ and ϕ are called equivalent; this is written ϕ ∼ = ϕ . Remark. The various notions introduced above will be applied to representations


of A in the sequel.

Notation: Let π be a representation of A in L(D, X).
1. If E is a subset of D, then the symbol E[π] denotes the linear span of E ∪ π(A)E.
2. Define π^L : M_L(A) → L(D[π]) by π^L(a)ξ = π(a)ξ, for a ∈ M_L(A), ξ ∈ D[π]. Then π^L ∈ Ops(M_L(A), D[π]); the map π^L will be called the derived operator set of π.
3. Notice that the linear space D[π] is π^L-stable and that D[π] = D iff π(A)D ⊆ D; the latter, of course, implies π(M_L(A))D ⊆ D.

Remark. Several important notions for π will be introduced by means of its derived operator set π^L.

Definition Let π be a representation of A in L(D, X).
1. π will be called irreducible if π is not the zero representation and its derived operator set π^L is irreducible, i.e. if D[π] and {0} are the only π^L-stable subspaces of D[π].
2. If D′ is a π^L-stable subspace of D[π] and π^L_{D′} ∈ Ops(M_L(A), D′) is defined by π^L_{D′}(a)ξ = π^L(a)ξ, a ∈ M_L(A), ξ ∈ D′, then D′ will be called π-irreducible provided that the operator set π^L_{D′} is irreducible.

3. A subspace D_0 of D is π-irreducible if the linear span of D_0 ∪ π(A)D_0 is π-irreducible.

Definition Let π : A → L(D, X) be a representation of A, and E a π^L-stable subspace of D[π]. On the set D̃[π] = D[π]/E, define the map π̃(a) for each a ∈ A by π̃(a)(ξ + E) = π(a)ξ + E, for ξ ∈ D[π], a ∈ A. Then the operator set π̃ : A → L(D̃[π]) just introduced will be called the quotient of π on D̃[π].

Schur’s Lemma Let π : A → L(D, X) and π′ : A → L(D′, X′) be two representations of A, T ∈ Hom(π, π′), E ≡ ker(T) and E′ ≡ range of T. Then
(i) E is π^L-stable;
(ii) E′ is π′^L-stable;



(iii) the quotient π̃ of π on D̃[π] = D[π]/E is equivalent to π′^L_{E′}.

Proof. (i) If a ∈ M_L(A) and ξ ∈ E, then T(π(a)ξ) = π′(a)Tξ = 0, whence π(a)ξ ∈ E, showing that E is π^L-stable.
(ii) If a ∈ M_L(A) and η ∈ E′, then η = Tξ for some ξ ∈ D[π], and π′(a)η = π′(a)Tξ = T(π(a)ξ) ∈ E′, showing that E′ is π′^L-stable.
(iii) Define T̃ : D̃[π] → E′ by T̃(ξ + E) = Tξ, ξ ∈ D[π]. Since E = ker(T), the map T̃ is well defined and is a bijection onto E′. Let π̃ be the quotient of π on D̃[π]. Then
T̃ · π̃(a)(ξ + E) = T̃(π(a)ξ + E)
= Tπ(a)ξ
= π′(a)Tξ
= π′^L_{E′}(a)T̃(ξ + E),
for a ∈ M_L(A), ξ ∈ D[π], showing that T̃ · π̃(a) = π′^L_{E′}(a) · T̃ on D̃[π], for all a ∈ M_L(A). This concludes the proof.

✷

Remark. Schur’s Lemma leads to the following result.

Proposition For any two irreducible representations π : A → L(D, X) and π′ : A → L(D′, X′), either Hom(π^L, π′^L) = {0} or else π ≅ π′.

Proof. Let T be an arbitrary member of Hom(π, π′). As E ≡ ker(T) and E′ ≡ range of T are stable under π^L and π′^L, respectively, and π, π′ are irreducible, it follows that E is either {0} or D[π] and E′ is either {0} or D′[π′]. If E = {0}, then T is injective and its range must be D′[π′], showing that π ≅ π′. If E = D[π], then T is the zero operator, whose range is of course {0}. As T was arbitrary, it follows that Hom(π^L, π′^L) = {0} unless π ≅ π′. This ends the proof. ✷

Theorem Let π be a representation of a partial algebra A in some L(D, X), with D ≠ {0}. Then a necessary and sufficient condition for π to be irreducible is that to any pair ξ, η of vectors in D with ξ ≠ 0, there corresponds a ∈ A such that π(a)ξ = η.

Proof. Suppose π is irreducible. Then N ≡ {ξ ∈ D : π(a)ξ = 0 ∀ a ∈ A} is a π^L-stable subset of D[π]. It follows that N is either {0} or D[π]. If N = D[π], then π is a zero representation, which is impossible. Therefore, N = {0}. Let ξ be a fixed non-zero member of D and M ≡ {π(a)ξ : a ∈ A}. Since ξ ∉ N, M ≠ {0}. Moreover, M is π^L-stable. It follows that π(A)ξ = M = D[π], showing


that every vector in D is of the form π(a)ξ, for some a ∈ A. The sufficiency of the assertion is clear. ✷
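The transitivity criterion of the Theorem can be illustrated in the simplest finite-dimensional situation. The sketch below assumes, purely for illustration, that A is the ordinary algebra of all 2×2 matrices with rational entries (every product is defined, so it is a very special partial algebra) acting on D = ℚ² by π(a) = a. The helper names `rank_one_map` and `apply` are hypothetical and not taken from the paper. For ξ ≠ 0 the matrix a = η ξᵀ/(ξ·ξ) satisfies aξ = η, which is the condition stated in the Theorem.

```python
from fractions import Fraction


def rank_one_map(xi, eta):
    """Return the matrix a = eta * xi^T / (xi . xi), which satisfies a xi = eta.

    Illustrative assumption: the algebra is the full matrix algebra, so this
    rank-one matrix is an admissible element a.
    """
    norm_sq = sum(Fraction(x) * Fraction(x) for x in xi)
    if norm_sq == 0:
        raise ValueError("xi must be non-zero")
    return [[Fraction(e) * Fraction(x) / norm_sq for x in xi] for e in eta]


def apply(matrix, vec):
    """Apply a matrix (list of rows) to a vector (list of entries)."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


# For the pair (xi, eta) below, the constructed a maps xi exactly onto eta.
xi = [1, 2]
eta = [3, -1]
a = rank_one_map(xi, eta)
image = apply(a, xi)
```

Exact rational arithmetic is used so that the identity aξ = η holds without rounding error.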

4. Completely reducible representations

We shall discuss aspects of the decomposition theory of representations.

Definition A representation π : A → L(D, X) of A will be called completely reducible if there are irreducible representations π_α : A → L(D_α, X_α), α in some index set J, of A such that
π^L ≅ ⊕_{α∈J} π_α^L,
where π_α^L is the derived operator set of π_α, α ∈ J.

Remark. It is clear from the notion of irreducibility for representations that each of the operator sets π_α^L, α ∈ J, appearing in the last Definition is irreducible. Moreover, the direct sum representation for π^L implies that the linear space D[π] is isomorphic to the direct sum ⊕_{α∈J} D_α[π_α].

Proposition Let π be a representation of A in L(D, X), E a π^L-stable subspace of D[π] and
D[π] = ⊕_{α∈J} D_{π,α}    (∗)
a direct sum decomposition of D[π] into π-irreducible subspaces D_{π,α} of D[π]. Then there is a subset K of J such that
D[π] = E ⊕ (⊕_{α∈K} D_{π,α}).    (∗∗)

Proof. By Zorn’s Lemma, there is a maximal subset K of J such that E, {D_{π,α} : α ∈ K} are linearly independent. Thus, the sum
D_π = E + Σ_{α∈K} D_{π,α}
is direct. It will be shown that D_π = D[π]. Assume D_π ≠ D[π]. By (∗), D_{π,β} ⊄ D_π for some β ∈ J\K. Now, D_π ∩ D_{π,β} is a π^L-stable subspace of D_{π,β}. As D_{π,β} is π-irreducible, it follows that D_π ∩ D_{π,β} is either {0} or D_{π,β}. If D_π ∩ D_{π,β} = {0}, then D_{π,β} is linearly independent of E, {D_{π,α} : α ∈ K}, contradicting the maximality of K. If instead D_π ∩ D_{π,β} = D_{π,β}, then D_{π,β} ⊂ D_π, a contradiction. Hence, D_π = D[π]. ✷

Remark. The last result gives rise to the following.

Corollary Let π : A → L(D, X) be a completely reducible representation of A. Then every π^L-stable subspace of D[π] possesses a complementary π^L-stable subspace in D[π]. ✷

Remark. Completely reducible representations of partial algebras associated with



quantum fields arise when superselection rules are present. The irreducible subrepresentations of completely reducible representations correspond to the superselection sectors of the quantum fields [11-13][16-17].
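The Corollary of this section (every stable subspace of a completely reducible representation has a stable complement) can be contrasted with a standard finite-dimensional counterexample. The sketch below is only an illustration and not part of the paper: it takes a hypothetical operator set consisting of the single 2×2 Jordan block J on Z = ℝ². A line span{v} is J-stable exactly when Jv is parallel to v, i.e. when det[v, Jv] = 0. For v = (x, y) this determinant equals −y², so span{(1, 0)} is the only stable line and it has no stable complement. Such an operator set is therefore neither irreducible nor a direct sum of irreducible ones.

```python
def apply(matrix, vec):
    """Apply a 2x2 matrix (tuple of rows) to a 2-vector."""
    return (
        matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
        matrix[1][0] * vec[0] + matrix[1][1] * vec[1],
    )


def line_is_stable(matrix, vec):
    """span{vec} is stable iff matrix*vec is parallel to vec (det[vec, image] == 0)."""
    image = apply(matrix, vec)
    return vec[0] * image[1] - vec[1] * image[0] == 0


# Hypothetical operator set with a single element: the 2x2 Jordan block.
JORDAN = ((1, 1), (0, 1))


def stable_lines(matrix, bound=5):
    """Integer direction vectors in a small grid whose span is stable.

    The grid search is only a finite check; for the Jordan block the
    determinant is -y**2, which settles the question for every direction.
    """
    found = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if (x, y) != (0, 0) and line_is_stable(matrix, (x, y)):
                found.add((x, y))
    return found
```

Every direction returned by `stable_lines(JORDAN)` has vanishing second component, so all of them span the same line.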

5. Acknowledgment

I am grateful to both Professor M.S. Narasimhan, Director of the Mathematics Section of the ICTP, and Professor M.A. Virasoro, Director of the ICTP, for a Visiting Mathematicians award and the Commission for Development and Exchange of the International Mathematical Union for a travel grant that made the writing of this paper possible. I also thank the referee for his/her comments.

References

[1] J.-P. Antoine and A. Inoue, Representability of invariant positive sesquilinear forms on partial ∗-algebras, Mathematical Proceedings of the Cambridge Philosophical Society 108, 337–353 (1990).
[2] J.-P. Antoine, A. Inoue and C. Trapani, Partial ∗-algebras of closable operators I. Basic theory and the abelian case, Publications of the RIMS, Kyoto University 26, 359–395 (1990).
[3] J.-P. Antoine, A. Inoue and C. Trapani, Partial ∗-algebras of closable operators II. States and representations of partial ∗-algebras, Publications of the RIMS, Kyoto University 27, 399–430 (1990).
[4] J.-P. Antoine, A. Inoue and C. Trapani, Partial ∗-algebras of closable operators: a review, Reviews in Mathematical Physics 8, 1–42 (1996).
[5] J.-P. Antoine, F. Bagarello and C. Trapani, Extension of representations of quasi ∗-algebras, Ann. Inst. Henri Poincaré (Physique Théorique) 69, 241–264 (1998).
[6] J.-P. Antoine, A. Inoue and C. Trapani, Biweights on partial ∗-algebra, Journal of Mathematical Analysis and Applications 242, 164–190 (2000).
[7] G.O.S. Ekhaguere, Dirichlet forms on partial ∗-algebras, Mathematical Proceedings of the Cambridge Philosophical Society 104, 129–140 (1988).
[8] G.O.S. Ekhaguere and P.O. Odiobala, Completely positive conjugate-bilinear maps on partial ∗-algebras, Journal of Mathematical Physics 32, 2951–2958 (1991).
[9] G.O.S. Ekhaguere, Non-commutative mean ergodic theorem on partial ∗-algebras, International Journal of Theoretical Physics 32, 1187–1196 (1993).


[10] G.O.S. Ekhaguere, Representation of completely positive maps between partial ∗-algebras, International Journal of Theoretical Physics 35, 1571–1579 (1996).
[11] S. Doplicher, R. Haag and J.E. Roberts, Fields, observables and gauge transformations I, Communications in Mathematical Physics 13, 1–23 (1969).
[12] S. Doplicher, R. Haag and J.E. Roberts, Fields, observables and gauge transformations II, Communications in Mathematical Physics 15, 173–200 (1969).
[13] S. Doplicher, R. Haag and J.E. Roberts, Local observables and particle statistics I, Communications in Mathematical Physics 23, 199–230 (1971).
[14] J.M. Cook, The mathematics of second quantization, Transactions of the American Mathematical Society 74, 222 (1953).
[15] A. Guichardet, Symmetric Hilbert Spaces and Related Topics, Springer-Verlag, Berlin (1972).
[16] G.O.S. Ekhaguere, The theory of superselection rules. I. A class of inequivalent, irreducible ∗-representations of the canonical commutation relations of the free electromagnetic field, Journal of Mathematical Physics 19, 1751–1757 (1978).
[17] G.O.S. Ekhaguere, The theory of superselection rules. II. Sectors of the free electromagnetic field, Journal of Mathematical Physics 25, 678–683 (1984).
[18] J.-P. Antoine, Y. Soulet and C. Trapani, Weights on partial ∗-algebras, Journal of Mathematical Analysis and Applications 192, 920–941 (1995).

G.O.S. Ekhaguere
Permanent address:
Department of Mathematics
University of Ibadan
Ibadan
NIGERIA
and
Association of African Universities
P.O. Box 5744, Accra-North
Accra
GHANA
email: [email protected]

Communicated by Detlev Buchholz
submitted 10/04/00, accepted 23/11/00



Temperature Independent Renormalization of Finite Temperature Field Theory

Christoph Kopper, Volkhard F. Müller, Thomas Reisz

Abstract. We analyze 4-dimensional massive $\varphi^4$ theory at finite temperature T in the imaginary-time formalism. We present a rigorous proof that this quantum field theory is renormalizable, to all orders of the loop expansion. Our main point is to show that the counterterms can be chosen temperature independent, so that the temperature flow of the relevant parameters as a function of T can be followed. Our result confirms the experience from explicit calculations to the leading orders. The proof is based on flow equations, i.e. on the (perturbative) Wilson renormalization group. In fact we will show that the difference between the theories at T > 0 and at T = 0 contains no relevant terms. In contrast to BPHZ-type formalisms, our approach gives access to renormalization conditions and counterterms at the same time, since both appear as boundary terms of the renormalization group flow. This is crucial for the proof.

1 Introduction

Field theories at finite temperature and density have been proposed as the fundamental underlying theory for the description of the physics of the early universe. A proposed scenario for baryogenesis is by the electroweak phase transition [1]. QCD is expected to become deconfined at high temperature. The formation of a quark gluon plasma and the phase transitions of QCD are supposed to be visible in relativistic heavy ion collisions and astrophysics [2]. A modern presentation of finite temperature field theory can be found in [3]. Beyond their phenomenological implications, quantum field theories at finite temperature are also very challenging from the more theoretical point of view. There is a real-time as well as an imaginary-time formalism, the first describing dynamical and the second equilibrium properties [4]. Many fundamental issues and problems are unsolved so far or require a deeper understanding. Quantum field theories at finite temperature and density are subject to enhanced complexities compared to zero temperature and zero density. This is largely related to the presence of additional length scales, due to the interaction with a heat bath. On the various scales the properties of the theory are considerably different.



The separation of scales is widely believed to be an intrinsic property of the field theory. In QCD the scales are associated with the generation of electric and magnetic screening and plasmon masses. In the framework of perturbation theory, this manifests itself in terms of IR divergences that are “severe”. They cannot be removed by adjusting the renormalization prescription, as is the case at temperature T = 0 [5]. Various elaborate resummation techniques have been proposed to (at least partially) remove the IR singularities and in addition compute screening masses in perturbation theory. In any case, all the approaches (need to) aim at a clean separation of IR and UV behaviour. A precondition of all these considerations is renormalizability. Renormalizability is an essential requirement of any local quantum field theory, both at zero and non-zero temperature [6]. It implies that the correlation functions stay finite as the UV-cutoff Λ0, say, is removed, Λ0 → ∞, and that the limit is parameterized by a set of renormalized (relevant) coupling constants. Moreover, it is crucial that renormalization can be achieved in a temperature independent way, which means that the field theory renormalized at zero temperature stays UV finite at every T > 0 as well. This is often taken for granted even for complicated theories, such as gauge theories. Temperature independent renormalizability is indispensable for relating bare and renormalized coupling constants in a T-independent way. It is thus required when formulating Callan-Symanzik type equations that govern the T-dependence of observables, including correlation functions and effective masses. More generally it implies that the static and dynamic properties mediated by the interactions with a heat bath are intrinsic features of the field theory itself. Various attempts and steps towards proving renormalizability exist [7].
In order to separate off the IR problem from the UV scale, a massive field theory is considered. Both in the real- and in the imaginary-time formalism, the investigations are commonly based on a Feynman diagrammatic approach in momentum space. In the real-time description, it is argued that the part of the propagator which depends on the temperature T or the chemical potential µ decays exponentially fast for large momenta, so it should be “innocent” of any UV problem. In the imaginary-time formalism the approach is generically more “cumbersome”, but it is again argued that in the sum over the Matsubara frequencies all T - or µ-dependent UV divergences cancel out. Experience obtained by explicit computations to leading orders of perturbation theory confirms that, once IR and UV singularities are properly disentangled, all UV divergences found are T -independent and are removed by the zero temperature counterterms. However, this is not so for non-zero chemical potential µ (associated to a finite density). A field theory that has been renormalized at µ = 0 is able to generate µ-dependent UV divergences that are not removed by the µ = 0 counterterms. A simple example is given by a 4-dimensional Yukawa model, with a chemical potential associated to the fermion number. In the framework of the renormalization group, the chemical potential introduces an additional relevant operator, so at least one additional renormalization condition is expected. This also indicates a possible problem for the analytic continuation from the Euclidean


to the real-time formulation, in agreement with a discussion [8] in the framework of axiomatic quantum field theories at finite temperature, where the problem of proving the existence of correlation functions (even at µ = 0) in the real-time formalism has been stressed. The renormalization of field theories at T = 0 is well understood. Strong statements and proofs on the renormalizability of various field theories relevant in physics exist, including several different regularization and renormalization schemes, see e.g. [9, 10]. Unfortunately, this sophistication does not extend to finite T so far. Rigorous proofs do not exist, to the best of our knowledge. We would like to point out, however, that recently rigorous bounds, uniform in the temperature, have been established for the perturbative correlation functions of many-fermion models. Here renormalization is necessary to obtain well-behaved bounds on the IR side, when approaching the Fermi surface, whereas the UV regularization is kept fixed. Feldman et al. [11] renormalize the many-fermion models with T-independent counterterms, as we do. In this paper we give a mathematical proof that massive $\varphi^4$ theory at finite T, in the imaginary-time formalism, is renormalizable. More precisely, we show, to all orders of the loop expansion, that all correlation functions become UV finite at every finite T once the theory has been renormalized at T = 0 by (one of the) usual renormalization prescriptions. The proof is given in the framework of Wilson’s flow equation. It avoids the analysis of individual Feynman integrals (or Feynman sums), which requires the involved combinatorics encoded in the forest formula for overlapping divergences. Moreover it avoids the formulation and proof of a power counting theorem.
Using flow equations, the proof of renormalizability merely amounts to establishing appropriate bounds in momentum space on the correlation functions, which are viewed as coefficient functions of the associated generating functional. The proof is by induction on the number of loops. This paper is organized as follows. In Sect. 2 we introduce our basic notations. This includes the definition of the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ of the connected, free propagator amputated Green functions on “momentum scale Λ”, with $0 \le \Lambda \le \Lambda_0$, where $\Lambda_0$ denotes the UV cutoff. The dependence of $L^{\Lambda,\Lambda_0}$ on the scale Λ is described by the so-called Wilson flow equation. We recap the basic steps of proving renormalizability of 4-dimensional $\varphi^4$ field theories at zero temperature by means of the flow equation. Renormalizability is stated in terms of uniform bounds on the (coefficient functions of the) solution $L^{\Lambda,\Lambda_0}(\varphi)$ of the flow equation and its derivative with respect to the UV-cutoff $\Lambda_0$, with boundary conditions imposed at Λ = 0 for the relevant couplings and at $\Lambda = \Lambda_0$ for the irrelevant interactions. In Sect. 3 we show that the difference $D^{\Lambda,\Lambda_0}(\varphi;T)$ of the generating functionals at temperature T > 0 and T = 0,

$$ D^{\Lambda,\Lambda_0}(\varphi;T) \equiv L^{\Lambda,\Lambda_0}(\varphi;T) - L^{\Lambda,\Lambda_0}(\varphi), \tag{1} $$

has the properties of an irrelevant operator in the sense of the renormalization




group¹. More precisely, T-independence of the counterterms means that the boundary condition

$$ D^{\Lambda_0,\Lambda_0}(\varphi;T) \equiv 0 \tag{2} $$

holds. From this we derive strong bounds on all scales Λ for $D^{\Lambda,\Lambda_0}(\varphi;T)$. Together with the bounds on $L^{\Lambda,\Lambda_0}(\varphi)$ this proves UV finiteness of massive $\varphi^4_4$ for every finite T, that is,

$$ \lim_{\Lambda_0\to\infty,\ \Lambda\to 0} L^{\Lambda,\Lambda_0}(\varphi;T) \tag{3} $$

exists, to all orders of the loop expansion. As an immediate consequence, the theory is also made UV finite by imposing normalization conditions on the mass, the wave function constant and on the quartic coupling constant at any fixed temperature $T_0$. In Sect. 4 we summarize our central statements and give a short outlook.

2 Renormalization of zero temperature $\varphi^4_4$ theory: a short reminder

Perturbative renormalizability of Euclidean zero temperature $\varphi^4_4$ theory will be established by analyzing the generating functional $L^{\Lambda,\Lambda_0}$ of connected (free propagator) amputated Green functions (CAG). The upper indices Λ and $\Lambda_0$ enter through the regularized propagator

$$ C^{\Lambda,\Lambda_0}(p) = \frac{1}{p^2+m^2}\left\{ e^{-\frac{p^2+m^2}{\Lambda_0^2}} - e^{-\frac{p^2+m^2}{\Lambda^2}} \right\} \tag{4} $$

or its Fourier transform

$$ \hat C^{\Lambda,\Lambda_0}(x) = \int_p C^{\Lambda,\Lambda_0}(p)\, e^{ipx}, \tag{5} $$

where we use the shorthand

$$ \int_p := \int_{\mathbb{R}^4} \frac{d^4p}{(2\pi)^4}. \tag{6} $$

We assume

$$ 0 \le \Lambda \le \Lambda_0 \le \infty \tag{7} $$

so that the Wilson flow parameter Λ takes the role of an infrared (IR) cutoff², whereas $\Lambda_0$ is the ultraviolet (UV) regularization. The full propagator is recovered for Λ = 0 and $\Lambda_0 \to \infty$. We also introduce the convention

$$ \hat\varphi(x) = \int_p \varphi(p)\, e^{ipx}, \qquad \frac{\delta}{\delta\hat\varphi(x)} = (2\pi)^4 \int_p \frac{\delta}{\delta\varphi(p)}\, e^{-ipx}. \tag{8} $$

¹ For the definition of the momentum space field variables φ and their position space Fourier transform $\hat\varphi$ we refer to the beginning of Sect. 3. Equ. (1) should be understood in the weak sense, i.e. in a formal power series expansion w.r.t. $\hbar$ and as an identity for all coefficient functions generated by the generating functionals. For the equation to make sense as it stands the variables $\hat\varphi$ have to be appropriately restricted, for instance to be smooth functions, supported in the interval [0, β] in the $x_0$-component in position space.

² Such a cutoff is of course not necessary in a massive theory. The IR behaviour is only modified for Λ above m.

For our purposes the “fields” $\hat\varphi(x)$ may be assumed to live in the Schwartz space $\mathcal{S}(\mathbb{R}^4)$. For finite $\Lambda_0$ and in finite volume the theory can be given rigorous meaning starting from the functional integral

$$ e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\hat\varphi) + I^{\Lambda,\Lambda_0}\right)} = \int d\mu_{\Lambda,\Lambda_0}(\hat\phi)\; e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\hat\phi + \hat\varphi)}, \tag{9} $$

where the factors of $\hbar$ have been introduced to allow for a consistent loop expansion in the sequel. In (9) $d\mu_{\Lambda,\Lambda_0}(\hat\phi)$ denotes the (translation invariant) Gaussian measure with covariance $\hbar\,\hat C^{\Lambda,\Lambda_0}(x)$. The normalization factor $e^{-\frac{1}{\hbar} I^{\Lambda,\Lambda_0}}$ is due to vacuum contributions. It diverges in infinite volume so that we can take the infinite volume limit only when it has been eliminated [10]. We do not make the finite volume explicit here since it plays no role in the sequel.³ The functional $L^{\Lambda_0,\Lambda_0}(\hat\varphi)$ is the bare action including counterterms, viewed as a formal power series in $\hbar$. Its general form for symmetric⁴ $\varphi^4_4$ theory is

$$ L^{\Lambda_0,\Lambda_0}(\hat\varphi) = \frac{g}{4!}\int d^4x\, \hat\varphi^4(x) + \int d^4x \left\{ \frac{1}{2}\, a(\Lambda_0)\,\hat\varphi^2(x) + \frac{1}{2}\, b(\Lambda_0) \sum_{\mu=0}^{3} (\partial_\mu\hat\varphi)^2(x) + \frac{1}{4!}\, c(\Lambda_0)\,\hat\varphi^4(x) \right\}, \tag{10} $$

where g > 0 is the renormalized coupling, and the parameters $a(\Lambda_0)$, $b(\Lambda_0)$, $c(\Lambda_0)$ fulfill

$$ a(\Lambda_0),\ b(\Lambda_0),\ c(\Lambda_0) = O(\hbar). \tag{11} $$

They are directly related to the standard mass, wave function and coupling constant counterterms. Since in the flow equation framework it is not necessary to introduce bare fields in distinction to renormalized ones (our field is the renormalized one in this language), there is a slight difference, which is to be kept in mind only when comparing to other schemes.

³ A rigorous treatment of the thermodynamic limit requires replacing the propagator (5) by a finite volume version, e.g. $\hat C_V^{\Lambda,\Lambda_0}(x,y) = \chi_V(x)\, \hat C^{\Lambda,\Lambda_0}(x-y)\, \chi_V(y)$, where $\chi_V$ is the characteristic function of the volume V, and considering the Gaussian measure with covariance $\hbar\,\hat C_V^{\Lambda,\Lambda_0}(x,y)$. In this case the quantity $I_V^{\Lambda,\Lambda_0}$ is obviously well defined, at any order l in $\hbar$. Then (9) is well-defined. After decomposing $L_V^{\Lambda,\Lambda_0}$ w.r.t. powers of $\hbar$ and of the field $\hat\varphi$, we realize that the coefficient functions $L_{l,n}^{\Lambda,\Lambda_0}$ are well-defined in the thermodynamic limit, since they are given as finite sums over UV-regularized connected diagrams. The existence of the thermodynamic limit is of course confirmed by the bounds on the solutions of the FE. It should also be feasible to study the thermodynamic limit itself with the aid of the FE in finite volume, by proving inductively uniform bounds on the (appropriately defined) “translational invariant part” of the finite volume Green functions and a convergence statement analogous to (18).

⁴ The necessary generalizations in the non-symmetric case will be surveyed in the end of the next section.

The Wilson flow equation (FE) is obtained from (9) on differentiating w.r.t. Λ. It is a differential equation for the functional $L^{\Lambda,\Lambda_0}$:

$$ \partial_\Lambda \left( L^{\Lambda,\Lambda_0} + I^{\Lambda,\Lambda_0} \right) = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\hat\varphi},\, (\partial_\Lambda \hat C^{\Lambda,\Lambda_0})\, \frac{\delta}{\delta\hat\varphi} \right\rangle L^{\Lambda,\Lambda_0} - \frac{1}{2} \left\langle \frac{\delta}{\delta\hat\varphi} L^{\Lambda,\Lambda_0},\, (\partial_\Lambda \hat C^{\Lambda,\Lambda_0})\, \frac{\delta}{\delta\hat\varphi} L^{\Lambda,\Lambda_0} \right\rangle. \tag{12} $$

By $\langle\ ,\ \rangle$ we denote the standard scalar product in $L^2(\mathbb{R}^4, d^4x)$. Changing to momentum space and expanding in a formal power series w.r.t. $\hbar$ we write with slight abuse of notation

$$ L^{\Lambda,\Lambda_0}(\varphi) = \sum_{l=0}^{\infty} \hbar^l\, L_l^{\Lambda,\Lambda_0}(\varphi). \tag{13} $$

From $L_l^{\Lambda,\Lambda_0}(\varphi)$ we then obtain the CAG of loop order l in momentum space as⁵

$$ (2\pi)^{4(n-1)}\, \delta_{\varphi(p_1)} \cdots \delta_{\varphi(p_n)} L_l^{\Lambda,\Lambda_0} \big|_{\varphi\equiv 0} = \delta^{(4)}(p_1 + \ldots + p_n)\, L_{l,n}^{\Lambda,\Lambda_0}(p_1,\ldots,p_{n-1}), \tag{14} $$

where we have written $\delta_{\varphi(p)} = \delta/\delta\varphi(p)$. Note that our definition of the $L_{l,n}^{\Lambda,\Lambda_0}$ is such that $L_{0,2}^{\Lambda,\Lambda_0}$ vanishes. The absence of 0-loop two (and one-) point functions is important for the set-up of the inductive scheme, from which we will prove renormalizability below. The FE (12) rewritten in terms of the CAG (14) takes the following form

$$ \partial_\Lambda \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\ldots,p_{n-1}) = \frac{1}{2} \int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\, \partial^w L_{l-1,n+2}^{\Lambda,\Lambda_0}(k,-k,p_1,\ldots,p_{n-1}) $$
$$ \qquad - \frac{1}{2} \sum_{\substack{l_1+l_2=l,\ w_1+w_2+w_3=w \\ n_1+n_2=n}} \left[ \partial^{w_1} L_{l_1,n_1+1}^{\Lambda,\Lambda_0}(p_1,\ldots,p_{n_1})\, (\partial^{w_3} \partial_\Lambda C^{\Lambda,\Lambda_0}(p'))\, \partial^{w_2} L_{l_2,n_2+1}^{\Lambda,\Lambda_0}(p_{n_1+1},\ldots,p_n) \right]_{ssym}, \tag{15} $$

where $p' = -p_1 - \ldots - p_{n_1} = p_{n_1+1} + \ldots + p_n$.

Here we have written (15) directly in a form where also momentum derivatives of the CAG (14) are performed, and we used the shorthand notation

$$ \partial^w := \prod_{i=1}^{n-1} \prod_{\mu=0}^{3} \left( \frac{\partial}{\partial p_{i,\mu}} \right)^{w_{i,\mu}} \quad \text{with } w = (w_{1,0},\ldots,w_{n-1,3}),\quad |w| = \sum w_{i,\mu},\quad w_{i,\mu} \in \mathbb{N}_0. \tag{16} $$

⁵ The normalization of the $L_{l,n}^{\Lambda,\Lambda_0}$ is defined differently from earlier references.


The symbol ssym⁶ means summation over those permutations of the momenta $p_1,\ldots,p_n$ which do not leave invariant the subsets $\{p_1,\ldots,p_{n_1}\}$ and $\{p_{n_1+1},\ldots,p_n\}$. Note that the CAG are symmetric in their momentum arguments by definition. A simple inductive proof of the renormalizability of $\varphi^4_4$ theory has been exposed several times in the literature [10], and we will not repeat it in detail. The line of reasoning can be summarized as follows. The induction hypotheses to be proven are:

A) Boundedness

$$ |\partial^w L_{l,n}^{\Lambda,\Lambda_0}(\vec p)| \le (\Lambda+m)^{4-n-|w|}\; \mathcal{P}_1\!\left(\log\frac{\Lambda+m}{m}\right) \mathcal{P}_2\!\left(\frac{|\vec p|}{\Lambda+m}\right). \tag{17} $$

B) Convergence

$$ |\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda,\Lambda_0}(\vec p)| \le \frac{1}{\Lambda_0^2}\, (\Lambda+m)^{5-n-|w|}\; \mathcal{P}_3\!\left(\log\frac{\Lambda_0}{m}\right) \mathcal{P}_4\!\left(\frac{|\vec p|}{\Lambda+m}\right).^{7} \tag{18} $$

Here and in the following the $\mathcal{P}$ denote (each time they appear possibly new) polynomials with nonnegative coefficients. The coefficients depend on l, n, |w|, m, but not on $\vec p$, Λ, $\Lambda_0$. We used the shorthand $\vec p = (p_1,\ldots,p_{n-1})$ and $|\vec p| = \sup\{|p_1|,\ldots,|p_n|\}$. The statement (18) implies renormalizability, since it proves that the limits $\lim_{\Lambda_0\to\infty,\,\Lambda\to 0} L_{l,n}^{\Lambda,\Lambda_0}(\vec p)$ exist to all loop orders l. But the statement (17) has to be obtained first to prove (18). The inductive scheme to prove the statements proceeds upwards in l, for given l upwards in n, and for given (l, n) downwards in |w|, starting from some arbitrary $|w_{max}| \ge 3$. The important point to note is that the terms on the r.h.s. of the FE always are prior to the one on the l.h.s. in the inductive order. So the bound (17) may be used as an induction hypothesis on the r.h.s. Then we may integrate the FE, where terms with n + |w| ≥ 5 are integrated down from $\Lambda_0$ to Λ, since for those terms we have the boundary conditions following from (10)

$$ \partial^w L_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\ldots,p_{n-1}) = 0, \quad \text{for } n + |w| > 4, \tag{19} $$

whereas the terms with n + |w| ≤ 4 at the renormalization point (which we choose at zero momentum for simplicity) are integrated upwards from 0 to Λ, since they are fixed by ($\Lambda_0$-independent) renormalization conditions, fixing the relevant parameters of the theory⁸:

$$ L_{l,2}^{0,\Lambda_0}(p) = a_l^R + b_l^R\, p^2 + O((p^2)^2), \qquad L_{0,4}^{0,\Lambda_0}(0) = g, \qquad L_{l,4}^{0,\Lambda_0}(0) = c_l^R, \quad l \ge 1. \tag{20} $$

⁶ It is defined differently from the symbol sym in [10], the present conventions being slightly more elegant.

⁷ In fact, in symmetric $\varphi^4_4$ theory $\frac{1}{\Lambda_0^2}$ can be replaced by $\frac{\Lambda}{\Lambda_0^3}$ as shown in [13].

⁸ The simplest choice would be to set $a_l^R = 0$, $b_l^R = 0$, $c_l^R = 0$, in which case the renormalized coupling is identical to the connected four point function at zero momentum. A shift away from zero momentum would result in non-vanishing terms $c_l^R$, just to mention one example of more general choices.




Symmetry considerations tell us that there are no other non-vanishing renormalization constants apart from $a_l^R$, $b_l^R$, $c_l^R$, and the Schlömilch or integrated Taylor formula permits us to move away from the renormalization point, treating first $L_{l,4}^{0,\Lambda_0}$ and then the momentum derivatives of $L_{l,2}^{0,\Lambda_0}$, in descending order. With these remarks on the boundary conditions, and using the bounds on the propagator and its derivatives

$$ |\partial^w \partial_\Lambda C^{\Lambda,\Lambda_0}(p)| \le \Lambda^{-3-|w|}\; \mathcal{P}(|p|/\Lambda)\; e^{-\frac{p^2+m^2}{\Lambda^2}}, \tag{21} $$

statement A) above is straightforwardly verified by inductive integration of the FE. Once this has been achieved statement B) follows on applying the same inductive scheme to bound the solutions of the FE, integrated over Λ and then differentiated w.r.t. $\Lambda_0$.
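For w = 0 the bound (21) can be made explicit. Differentiating (4) with respect to Λ gives $\partial_\Lambda C^{\Lambda,\Lambda_0}(p) = -\frac{2}{\Lambda^3}\, e^{-(p^2+m^2)/\Lambda^2}$, so (21) holds with the constant polynomial $\mathcal{P} = 2$, and the derivative does not depend on $\Lambda_0$. The following minimal sketch checks this closed form against a central finite difference. The function names and the sample parameter values are illustrative assumptions, not part of the paper.

```python
import math


def propagator(p_sq, m, lam, lam0):
    """Regularized propagator (4) as a function of p^2, with IR cutoff lam and UV cutoff lam0."""
    a = p_sq + m * m
    # lam0 = infinity removes the UV regularization, lam = 0 removes the IR cutoff.
    uv = math.exp(-a / lam0 ** 2) if math.isfinite(lam0) else 1.0
    ir = math.exp(-a / lam ** 2) if lam > 0 else 0.0
    return (uv - ir) / a


def d_lambda_propagator(p_sq, m, lam):
    """Closed form of the Lambda-derivative of (4): -(2 / lam^3) exp(-(p^2 + m^2) / lam^2)."""
    a = p_sq + m * m
    return -2.0 / lam ** 3 * math.exp(-a / lam ** 2)


def finite_difference(p_sq, m, lam, lam0, h=1e-5):
    """Central finite difference in lam, used only as a numerical cross-check."""
    upper = propagator(p_sq, m, lam + h, lam0)
    lower = propagator(p_sq, m, lam - h, lam0)
    return (upper - lower) / (2.0 * h)
```

Setting `lam = 0` and `lam0 = math.inf` in `propagator` returns the full propagator $1/(p^2+m^2)$, in line with the remark after (7).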

3 Temperature independent renormalization of finite temperature $\varphi^4_4$ theory

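We fix the notations, recalling at the same time some basic facts about Euclidean finite temperature field theory. The scalar field $\hat\varphi(x)$ becomes periodic in $x_0$ at finite temperature with period β = 1/T. Correspondingly position space integrals over the zero component of the coordinates are now restricted to the compact interval [0, β]. Symbols denoting finite temperature quantities will generally be underlined, thus we write

$$ \underline{p} := (\underline{p}_0, \vec p) := (2\pi n T, \vec p), \quad n \in \mathbb{Z}, \qquad \int_{\underline{p}} := T \sum_{n\in\mathbb{Z}} \int_{\mathbb{R}^3} \frac{d^3p}{(2\pi)^3}. \tag{22} $$

We also introduce the convention

$$ \hat\varphi(x) := \int_{\underline{p}} \varphi(\underline{p})\, e^{i\underline{p}x}, \qquad \varphi(\underline{p}) = \int_0^\beta dx_0 \int_{\mathbb{R}^3} d^3x\; \hat\varphi(x)\, e^{-i\underline{p}x}, \tag{23} $$

$$ \frac{\delta}{\delta\hat\varphi(x)} = \frac{(2\pi)^3}{T} \int_{\underline{p}} \frac{\delta}{\delta\varphi(\underline{p})}\, e^{-i\underline{p}x}, \qquad \frac{\delta}{\delta\varphi(\underline{p})} = \frac{T}{(2\pi)^3} \int_0^\beta dx_0 \int_{\mathbb{R}^3} d^3x\; \frac{\delta}{\delta\hat\varphi(x)}\, e^{i\underline{p}x}. \tag{24} $$

The regularized propagator now takes the form

$$ C^{\Lambda,\Lambda_0}(\underline{p}) = \frac{1}{\underline{p}^2+m^2} \left\{ e^{-\frac{\underline{p}^2+m^2}{\Lambda_0^2}} - e^{-\frac{\underline{p}^2+m^2}{\Lambda^2}} \right\}. \tag{25} $$

The generating functional of the finite temperature CAG will be called $L^{\Lambda,\Lambda_0}(\varphi;T)$. In analogy with (14) we define the CAG through

$$ \delta_{\varphi(\underline{p}_1)} \cdots \delta_{\varphi(\underline{p}_n)} L_l^{\Lambda,\Lambda_0}(\varphi;T) \big|_{\varphi\equiv 0} = \left( \frac{T}{(2\pi)^3} \right)^{n-1} \delta_{0,(\underline{p}_{1,0}+\ldots+\underline{p}_{n,0})}\; \delta^{(3)}(\vec p_1 + \ldots + \vec p_n)\; L_{l,n}^{\Lambda,\Lambda_0}(\underline{p}_1,\ldots,\underline{p}_{n-1};T). \tag{26} $$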

At this stage we could prove renormalizability of the finite temperature theory in the same way as for the zero temperature theory. A slight difference is that the relations (20) are to be replaced by

$$ L_{l,2}^{0,\Lambda_0}(\underline{p};T) = a_l^R(T) + b_l^{R,0}(T)\, \underline{p}_0^2 + b_l^{R,1}(T)\, \vec p^{\,2} + O(\underline{p}^4), $$
$$ L_{0,4}^{0,\Lambda_0}(\underline{p}=0;T) = g, \qquad L_{l,4}^{0,\Lambda_0}(\underline{p}=0;T) = c_l^R(T), \quad l \ge 1, \tag{27} $$

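since the space-time O(4)-symmetry is broken down to a $\mathbb{Z}_2 \times O(3)$-symmetry, which demands a new renormalization condition. However we want to go beyond this and prove temperature independent renormalizability, in the sense that the counterterms can be chosen temperature independent. To do so, it is advantageous to pass directly to the difference between the finite and zero temperature theories, which we will do now. Note in this respect that a direct proof of the renormalizability of the finite temperature theory, keeping the counterterms fixed at their zero temperature values, would not work within our scheme and procedure: the CAG would become arbitrarily divergent in $\Lambda_0$ with increasing loop order, since integrating relevant terms from $\Lambda_0$ to 0 (instead of integrating them from a renormalization condition fixed at Λ = 0 up to $\Lambda_0$) gives divergent integrals. Thus we rather study the difference functions

$$ D_{l,n}^{\Lambda,\Lambda_0}(\{\underline{p}\}) := L_{l,n}^{\Lambda,\Lambda_0}(\{\underline{p}\};T) - L_{l,n}^{\Lambda,\Lambda_0}(\{\underline{p}\}). \tag{28} $$

We only define and need the difference CAG $D_{l,n}^{\Lambda,\Lambda_0}$ at the external momenta $(\{\underline{p}\}) := (\underline{p}_1,\ldots,\underline{p}_{n-1})$. From the FE (15) and the analogous equation for the $L_{l,n}^{\Lambda,\Lambda_0}(\{\underline{p}\};T)$ we can derive a FE for the $D_{l,n}^{\Lambda,\Lambda_0}(\{\underline{p}\})$ in the following form:

$$ \partial_\Lambda D_{l,n}^{\Lambda,\Lambda_0}(\{\underline{p}\}) = \frac{1}{2} \int_{\underline{k}} (\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{k}))\, D_{l-1,n+2}^{\Lambda,\Lambda_0}(\underline{k},-\underline{k},\{\underline{p}\}) $$
$$ \qquad + \frac{1}{2} \left[ \int_{\underline{k}} (\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{k}))\, L_{l-1,n+2}^{\Lambda,\Lambda_0}(\underline{k},-\underline{k},\{\underline{p}\}) - \int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\, L_{l-1,n+2}^{\Lambda,\Lambda_0}(k,-k,\{\underline{p}\}) \right] $$
$$ \qquad - \frac{1}{2} \sum_{\substack{l_1+l_2=l \\ n_1+n_2=n}} \Big\{ \left[ L_{l_1,n_1+1}^{\Lambda,\Lambda_0}(\underline{p}_1,\ldots,\underline{p}_{n_1};T)\, (\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{p}'))\, D_{l_2,n_2+1}^{\Lambda,\Lambda_0}(\underline{p}_{n_1+1},\ldots,\underline{p}_n) \right]_{ssym} $$
$$ \qquad\qquad + \left[ D_{l_1,n_1+1}^{\Lambda,\Lambda_0}(\underline{p}_1,\ldots,\underline{p}_{n_1})\, (\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{p}'))\, L_{l_2,n_2+1}^{\Lambda,\Lambda_0}(\underline{p}_{n_1+1},\ldots,\underline{p}_n) \right]_{ssym} \Big\}, \tag{29} $$

where again $\underline{p}' = -\underline{p}_1 - \ldots - \underline{p}_{n_1} = \underline{p}_{n_1+1} + \ldots + \underline{p}_n$.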
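The structure of the flow equation for the difference functions follows from elementary algebra. As a sketch in shorthand that is not the paper's own notation, write $L_T$ for the finite temperature CAG, $L$ for the zero temperature CAG restricted to the same momenta, $D = L_T - L$ and $\dot C = \partial_\Lambda C^{\Lambda,\Lambda_0}$, suppressing all indices and momentum arguments. Subtracting the zero temperature FE from the finite temperature one then gives, for the linear and the quadratic terms respectively:

```latex
\begin{aligned}
\int_{\underline{k}} \dot C\, L_T \;-\; \int_{k} \dot C\, L
  &= \int_{\underline{k}} \dot C\, D
     \;+\; \Big[ \int_{\underline{k}} \dot C\, L \;-\; \int_{k} \dot C\, L \Big], \\[4pt]
L_T\, \dot C\, L_T \;-\; L\, \dot C\, L
  &= L_T\, \dot C\, (L_T - L) \;+\; (L_T - L)\, \dot C\, L \\
  &= L_T\, \dot C\, D \;+\; D\, \dot C\, L .
\end{aligned}
```

The bracket in the first line is the only contribution that does not contain a factor D. It compares a Matsubara sum with the corresponding integral over the same zero temperature function, which is why it requires separate treatment.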




Λ,Λ0 The boundary conditions we want to impose on the system Dl,n are (from the previous remarks) obviously the following ones : Λ0 ,Λ0 Dl,n (p1 , . . . , pn−1 ) = 0 ,

l, n ∈ IN .

(30)

n ∈ IN ,

(31)

To start the induction we also note Λ,Λ0 D0,n (p1 , . . . , pn−1 ) = 0 ,

i.e. at the tree level all difference terms $D^{\Lambda,\Lambda_0}_{0,n}$ vanish. This follows from the fact that, restricted to the momenta $(p_1,\dots,p_{n-1})$, the tree level functions $L^{\Lambda,\Lambda_0}_{0,n}(p_1,\dots,p_{n-1};T)$ and $L^{\Lambda,\Lambda_0}_{0,n}(p_1,\dots,p_{n-1})$ agree. Now we would like to use the same inductive scheme, proceeding upwards in l, and for given l upwards in n, to prove the finiteness of $\lim_{\Lambda_0\to\infty,\,\Lambda\to 0} D^{\Lambda,\Lambda_0}_{l,n}$. Due to the form of (30) we always integrate the FE for $D^{\Lambda,\Lambda_0}_{l,n}$ from $\Lambda_0$ down to $\Lambda$, since the boundary terms at $\Lambda_0$ always vanish. We want to prove the following

Theorem

$$ |D^{\Lambda,\Lambda_0}_{l,n}(p_1,\dots,p_{n-1})| \le (\Lambda+m)^{-s-n}\, P_1\Big(\log\frac{\Lambda+m}{m}\Big)\, P_2\Big(\frac{|\{p\}|}{\Lambda+m}\Big) , \tag{32} $$

$$ |\partial_{\Lambda_0} D^{\Lambda,\Lambda_0}_{l,n}(p_1,\dots,p_{n-1})| \le \frac{1}{\Lambda_0^2}\, (\Lambda+m)^{-s-n}\, P_3\Big(\log\frac{\Lambda_0}{m}\Big)\, P_4\Big(\frac{|\{p\}|}{\Lambda+m}\Big) . \tag{33} $$

The nonnegative coefficients in the polynomials P depend on l, n, s, m and (smoothly) on T, but not on $\{p\}$, $\Lambda$, $\Lambda_0$. The positive integer $s \in \mathbb{N}$ may be chosen arbitrarily. The finite temperature CAG $L^{\Lambda,\Lambda_0}_{l,n}(p_1,\dots,p_{n-1};T)$, when renormalized with the same counterterms as the zero temperature ones, satisfy the same bounds as in (17,18) restricted to w = 0. The coefficients in the polynomials P may now depend on l, n, m and (smoothly) on T.

Remark. It is possible to prove the bounds (17,18) also for derivatives of the finite temperature CAG $L^{\Lambda,\Lambda_0}_{l,n}(p_1,\dots,p_{n-1};T)$. In the $p_{i,0}$-components differentiations then have to be replaced by finite differences. However these bounds are not needed in the inductive proof, so we skip them here.

Proof. We first prove (32) and the statement corresponding to (17) for w = 0, using the inductive scheme indicated previously. Regarding the FE (29), we state that it is compatible with the inductive scheme and that the only term in which (32) cannot be used as an induction hypothesis is the following difference of the finite and zero temperature loop integrals:

$$ \int_{\underline{k}} \big(\partial_\Lambda C^{\Lambda,\Lambda_0}(k)\big)\, L^{\Lambda,\Lambda_0}_{l-1,n+2}(k,-k,\{p\}) - \int_{k} \big(\partial_\Lambda C^{\Lambda,\Lambda_0}(k)\big)\, L^{\Lambda,\Lambda_0}_{l-1,n+2}(k,-k,\{p\}) . \tag{34} $$


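So our sharp $\Lambda$-bound on $D^{\Lambda,\Lambda_0}_{l,n}$ can only be verified if it holds for this difference term. Here we use (17,18) and the Euler-MacLaurin formula, see e.g. [12]. We can rewrite (34) as

$$ -\frac{2}{\Lambda^3} \int \frac{d^3k}{(2\pi)^4}\, e^{-\frac{\vec{k}^2+m^2}{\Lambda^2}} \Big[ 2\pi T \sum_{n\in\mathbb{Z}} g(2\pi n T) - \int_{-\infty}^{\infty} dk_0\, g(k_0) \Big] , \tag{35} $$

where we introduced the function

$$ g(k_0) = e^{-\frac{k_0^2}{\Lambda^2}}\, L^{\Lambda,\Lambda_0}_{l-1,n+2}(k,-k,\{p\}) \quad \text{for } \vec{k}, \{p\} \text{ fixed} . \tag{36} $$

The Euler-MacLaurin formula for our case can be stated in the form

$$ 2\pi T \sum_{n\in\mathbb{Z}} g(2\pi n T) - \int_{-\infty}^{\infty} dk_0\, g(k_0) = -\pi T\, [g(\infty) - g(-\infty)] + \sum_{k=1}^{r+1} \frac{b_{2k}\,(2\pi T)^{2k}}{(2k)!}\, [g^{(2k-1)}(\infty) - g^{(2k-1)}(-\infty)] + R_{r+1} . \tag{37} $$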

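Here $b_{2k}$ are the Bernoulli numbers. We observe that passing to the limit of an infinite integration interval is justified, since the function $g(k_0)$ and its derivatives vanish rapidly at infinity. The remainder $R_{r+1}$ obeys the following bound [12]

$$ |R_{r+1}| \le 4\, e^{2\pi}\, T^{2r+3} \int_{-\infty}^{\infty} dk_0\, |g^{(2r+3)}(k_0)| , \tag{38} $$

therefore we obtain, using again (17,18),

$$ |R_{r+1}| \le T^{2r+3}\, \frac{(\Lambda+m)^{2-n}}{\Lambda^{2r+2}}\, P_1\Big(\log\frac{\Lambda+m}{m}\Big)\, P_2\Big(\frac{|\{k,p\}|}{\Lambda+m}\Big) . \tag{39} $$

Note that $r \in \mathbb{N}$ can be chosen arbitrarily here, and the bound for (34) is thus

$$ T^{2r+3}\, e^{-\frac{m^2}{\Lambda^2}}\, \frac{(\Lambda+m)^{2-n}}{\Lambda^{2r+2}}\, P_1\Big(\log\frac{\Lambda+m}{m}\Big)\, P_2\Big(\frac{|\{k,p\}|}{\Lambda+m}\Big) \le T^{2r+3}\, (\Lambda+m)^{2-n-2r-2}\, P_3\Big(\log\frac{\Lambda+m}{m}\Big)\, P_4\Big(\frac{|\{k,p\}|}{\Lambda+m}\Big) . \tag{40} $$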
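The smallness of the difference between the discrete frequency sum and the integral in (35) can be illustrated numerically. The following is only a toy sketch, not the estimate used in the proof: it assumes the pure Gaussian $g(k_0) = e^{-k_0^2/\Lambda^2}$ (dropping the factor $L^{\Lambda,\Lambda_0}_{l-1,n+2}$), and the function names are hypothetical. For this choice the difference can also be obtained from Poisson summation, which serves as a cross-check.

```python
import math

def matsubara_minus_integral(T, lam, cutoff=40.0):
    """2*pi*T * sum_n g(2*pi*n*T) minus the integral of g over the real line,
    for the toy choice g(k0) = exp(-k0**2 / lam**2)."""
    h = 2.0 * math.pi * T                      # spacing of the discrete frequencies
    n_max = int(cutoff * lam / h) + 1          # terms beyond this underflow to 0.0
    total = sum(math.exp(-((n * h) / lam) ** 2) for n in range(-n_max, n_max + 1))
    exact_integral = lam * math.sqrt(math.pi)  # Gaussian integral in closed form
    return h * total - exact_integral

def poisson_prediction(T, lam, terms=20):
    """The same difference from Poisson summation:
    2 * lam * sqrt(pi) * sum_{j>=1} exp(-(j * lam / (2 * T))**2)."""
    return 2.0 * lam * math.sqrt(math.pi) * sum(
        math.exp(-((j * lam) / (2.0 * T)) ** 2) for j in range(1, terms + 1)
    )

if __name__ == "__main__":
    for T in (0.4, 0.2, 0.1):
        print(T, matsubara_minus_integral(T, 1.0), poisson_prediction(T, 1.0))
```

For this toy integrand the difference decays faster than any power of $T/\Lambda$, which is consistent with the freedom to choose r arbitrarily in (39).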

After this preparation we consider the induction process: At each loop order we first prove (32), and then (17) for finite T and corresponding momenta. This second step is trivial from (17,18) at T = 0, from the definition (28) and from (32).⁹ We know already the theorem to be true at 0 loop order. This and the form of the FE (29) implies that we do not need a bound on any of the $L^{\Lambda,\Lambda_0}_{l,n}(\{p\};T)$ in the inductive bound on $D^{\Lambda,\Lambda_0}_{l,n}$ at the given loop order l. It is instructive to regard how the induction starts at loop order l = 1. Treating first the case n = 2 we find that the only non-vanishing contribution on the r.h.s. of the FE stems from (34), and it is momentum independent, so that integrating over $\Lambda$ we get

$$ |D^{\Lambda,\Lambda_0}_{1,2}(p)| \le c\, (\Lambda+m)^{-2r-1} $$

with a suitable constant c, depending on r. For n = 4 also the last two terms on the r.h.s. of the FE contribute. Using the result for $D^{\Lambda,\Lambda_0}_{1,2}(p)$, integration over $\Lambda$ gives

$$ |D^{\Lambda,\Lambda_0}_{1,4}(\{p\})| \le (\Lambda+m)^{-2-2r-1}\, P\Big(\frac{|\{p\}|}{\Lambda+m}\Big) . $$

From this one inductively obtains the bound for $n \ge 6$

$$ |D^{\Lambda,\Lambda_0}_{1,n}(\{p\})| \le (\Lambda+m)^{-(n-2)-2r-1}\, P\Big(\frac{|\{p\}|}{\Lambda+m}\Big) . $$

⁹ We may choose the bounds for s = 0 from (32,33) when bounding the finite temperature CAG, so that polynomials appearing in the bounds may be chosen s-independent.
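The integration step behind the n = 2 bound at one loop can be spelled out as follows. This is a sketch under simplifying assumptions: the right-hand side of (40) at n = 2 is replaced by a momentum independent constant $c'$ times the powers of T and $\Lambda + m$ (the constant $c'$ is introduced here only for illustration), and the boundary condition (30) is used at $\Lambda_0$.

```latex
|D^{\Lambda,\Lambda_0}_{1,2}(p)|
  \le \int_{\Lambda}^{\Lambda_0} d\lambda \; c'\, T^{2r+3}\, (\lambda+m)^{-2r-2}
  = \frac{c'\, T^{2r+3}}{2r+1}
    \Big[ (\Lambda+m)^{-2r-1} - (\Lambda_0+m)^{-2r-1} \Big]
  \le c\, (\Lambda+m)^{-2r-1},
\qquad c := \frac{c'\, T^{2r+3}}{2r+1}.
```

The bound is uniform in $\Lambda_0$ because the subtracted term is positive, so each integration over $\Lambda$ costs exactly one power of $(\Lambda+m)$ relative to (40).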

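Having bounded the difference functions $D^{\Lambda,\Lambda_0}_{1,n}$ we can bound the CAG $L^{\Lambda,\Lambda_0}_{1,n}(T) = L^{\Lambda,\Lambda_0}_{1,n}(T=0) + D^{\Lambda,\Lambda_0}_{1,n}$, see (28). Then we may proceed inductively to higher loop orders and verify the inductive bound

$$ |D^{\Lambda,\Lambda_0}_{l,n}(\{p\})| \le (\Lambda+m)^{-(n-2)-2r-1}\, P_1\Big(\log\frac{\Lambda+m}{m}\Big)\, P_2\Big(\frac{|\{p\}|}{\Lambda+m}\Big) . $$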

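This proves the first part of the theorem on writing s = 2r − 1 for s odd, and majorizing to obtain even s. It follows that the $L^{\Lambda,\Lambda_0}_{l,n}(T)$ may be bounded in agreement with (17,18).

Now we turn to the proof of the statement (33), which implies convergence of the $D^{\Lambda,\Lambda_0}_{l,n}$ for $\Lambda_0 \to \infty$. The proof is based on the same inductive scheme and starts from the FE (29) integrated over $\Lambda$ from $\Lambda_0$ to $\Lambda$, and then differentiated w.r.t. $\Lambda_0$. The result is of the form

$$ -\partial_{\Lambda_0} D^{\Lambda,\Lambda_0}_{l,n}(\{p\}) = [\text{RHS of (29)}]\big|_{\Lambda=\Lambda_0} + \int_{\Lambda}^{\Lambda_0} d\lambda\; \partial_{\Lambda_0} [\text{RHS of (29)}](\lambda) , \tag{41} $$

and we denote the RHS of this equation shortly as $I^{\Lambda_0}_{l,n}(\{p\}) + I^{\Lambda,\Lambda_0}_{l,n}(\{p\})$. Since we have imposed $L^{\Lambda_0,\Lambda_0}_{l,n}(T) \equiv L^{\Lambda_0,\Lambda_0}_{l,n}$, and since moreover these terms vanish for $n \ge 6$, we find

$$ I^{\Lambda_0}_{l,n}(\{p\}) = -\delta_{n,2} \Big[ \int_{\underline{k}} \frac{e^{-\frac{k^2+m^2}{\Lambda_0^2}}}{\Lambda_0^3} - \int_{k} \frac{e^{-\frac{k^2+m^2}{\Lambda_0^2}}}{\Lambda_0^3} \Big]\, L^{\Lambda_0,\Lambda_0}_{l-1,4} , \tag{42} $$

where the first integral is the finite temperature loop integral and the second the zero temperature one.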


Since $L^{\Lambda_0,\Lambda_0}_{l-1,4} \equiv c_{l-1}(\Lambda_0)$, $l > 1$, and $L^{\Lambda_0,\Lambda_0}_{0,4} \equiv g$, see (10), we realize that (42) is momentum independent. The difference can be calculated explicitly or bounded again using the Euler-MacLaurin formula, and we obtain

$$ |I^{\Lambda_0}_{l,n}| \le \delta_{n,2}\, \Lambda_0^{-2-2r}\, P\Big(\log\frac{\Lambda_0}{m}\Big) \tag{43} $$

for $r \in \mathbb{N}$ and a suitable P depending on r. To get a bound on $I^{\Lambda,\Lambda_0}_{l,n}(\{p\})$ we apply the derivative in (41) to all entries using the product rule (noting that when applied to $\partial_\Lambda C^{\Lambda,\Lambda_0}$ it gives zero). In any case the derivative brings down the required factor of $\Lambda_0^{-2}$, either by (18), or by (33) together with the induction hypothesis. Apart from this the bound (33) is obtained similarly as (32), using in particular the Euler-MacLaurin formula for the difference term (34) differentiated w.r.t. $\Lambda_0$. This proves also (33). ✷

We end this section with two remarks on possible generalizations. First, the preceding analysis can be extended to non-symmetric $\varphi^4_4$-theory. The action (10) then has to be replaced by

$$ \tilde{L}^{\Lambda_0,\Lambda_0}(\hat\varphi) = L^{\Lambda_0,\Lambda_0}(\hat\varphi) + \frac{h}{3!} \int d^4x\, \hat\varphi^3(x) + \int d^4x\, \Big\{ \frac{1}{3!}\, d(\Lambda_0)\, \hat\varphi^3(x) + v(\Lambda_0)\, \hat\varphi(x) \Big\} \tag{44} $$

with the tree level three-point coupling h and $\Lambda_0$-dependent parameters

$$ d(\Lambda_0) ,\; v(\Lambda_0) = O(\hbar) \tag{45} $$

implementing the counterterms necessary to renormalize the one- and three-point functions. Correspondingly we pose additional renormalization conditions

$$ L^{0,\Lambda_0}_{l,1} = v^R_l , \qquad L^{0,\Lambda_0}_{l,3}(0) = d^R_l \qquad \text{for } l \in \mathbb{N} , \tag{46} $$

to be joined to (20). Then the bounds (17,18) hold again, but are no longer trivially fulfilled for n odd.¹⁰ Once the theory at T = 0 is bounded, the differences (28) again yield the theory at T > 0. Bounds corresponding to (32,33) are proven proceeding as before, in the symmetric case.

¹⁰ These bounds can be improved by replacing n by $\hat n$, defined to be the smallest even integer greater or equal to n.

As a second remark, we point out that for the existence of the large cutoff limit $\Lambda_0 \to \infty$, it is not necessary that the relevant coupling constants are subject to normalization conditions at zero temperature. Equally well we can impose normalization conditions at some temperature $T_0 > 0$. We pointed out that at finite temperature the space-time O(4)-symmetry is broken down to $\mathbb{Z}_2 \times O(3)$. Then the 3 independent renormalization constants $a^R$, $b^R$ and $c^R$ at T = 0, (20), are replaced by four parameters $a^R(T_0)$, $b^{R,0}(T_0)$, $b^{R,1}(T_0)$ and $c^R(T_0)$ at $T_0$, cf. (27), corresponding to four relevant couplings. However, starting from an O(4)-symmetric zero temperature theory we have proved that

$$ L^{\Lambda,\Lambda_0}(\varphi;T_0) - L^{\Lambda,\Lambda_0}(\varphi) \tag{47} $$

has the properties of an irrelevant operator. This implies that for given $b^{R,0}(T_0)$ there is a unique choice for $b^{R,1}(T_0)$, or vice versa, such that the finite temperature theory stems from an O(4)-symmetric zero temperature theory. Any different choice would be associated to a zero temperature theory, where O(4)-symmetry is broken by hand through the renormalization conditions. Note that the O(4)-symmetric choice is generally not the one where $b^{R,0}(T_0) = b^{R,1}(T_0)$: Integration over $\Lambda$, starting from the same counterterms (the O(4)-symmetric ones) will lead to a finite difference at $\Lambda = 0$, since O(4)-invariance is broken in the propagator. Otherwise stated, the fact that the finite temperature theory stems from an O(4)-symmetric zero temperature theory can be simply recognized on inspection of the counterterms, but not on the renormalization conditions.

4 Summary

We have presented a proof for the perturbative renormalizability of massive finite temperature $\varphi^4_4$-theory. The starting point is the set of bounds (17,18), which prove the renormalizability of the zero temperature theory. In the flow equation framework they serve at the same time as induction hypotheses for the inductive proof. Bounds of this type have by now been rigorously established for nearly all theories of physical interest, including gauge theories, where the restoration of the Ward identities in the final theory poses an additional problem, to be solved by a suitable restriction on the renormalization conditions. Taking due care of the exceptional momentum problem, corresponding bounds can also be established for theories with massless particles. To extend the bounds to the corresponding finite temperature theories presents no really new problems for the practitioner. The main problem to be solved is rather that the existence of the correlation functions in the large cutoff limit should be proven without changing the counterterms. In our setup this corresponds to posing the boundary conditions (30) for the difference Green functions D between the T > 0 and the T = 0 theories. The announced result is contained in the bounds (32,33). The main new technical tool used to get there is the Euler-MacLaurin formula, generalized to an infinite integration interval for a rapidly decaying integrand. It is applied to the difference terms appearing in the flow equations for the functions D that are not bounded by the induction hypothesis alone (see (34)-(40)). Here it comes to our help that the bounds (17,18) are sufficiently powerful so as to transform momentum derivatives into negative powers of $\Lambda$. Via the Euler-MacLaurin formula it is then possible to gain an arbitrary power in $\Lambda$, paying the corresponding power in T (see (39)). This achieves (far more than) showing that all difference functions D are irrelevant. For the latter a gain of a power of $\Lambda^{2+\varepsilon}$ would have sufficed.
We emphasize again that our result agrees with the experience and intuition gained from explicit perturbative calculations. Renormalization is a central issue that is strongly related to the fundamental principles of local quantum field theory. Renormalizability of a field theory gives it a meaning beyond some low energy effective model. The techniques we have presented here for proving renormalizability of a field theory at finite temperature mainly rely on two properties. The first property is renormalizability at zero temperature. The second one is that the difference between the theory at finite and zero temperature acts like an irrelevant operator that does not spoil renormalizability. Renormalization group flow equations provide an appropriate tool to put this statement on a strong basis and prove renormalizability for finite temperature. We expect that these methods generalize appropriately to apply to more realistic and complex field theories such as QCD, where both the UV and the IR scale problem are to be attacked.

References

[1] A.D. Linde, Phys. Lett. B160, 243 (1985).
[2] H. Meyer-Ortmanns, Phase Transitions in Quantum Chromodynamics, Rev. Mod. Phys. 68, 473 (1996).
[3] M. Le Bellac, Thermal Field Theory, Cambridge Monographs in Mathematical Physics, Cambridge University Press 1996.
[4] N.P. Landsman and Ch.G. van Weert, Real- and Imaginary-Time Field Theory at Finite Temperature and Density, Phys. Rep. 145, 141 (1987).
[5] J.H. Lowenstein, Comm. Math. Phys. 47, 53 (1976).
[6] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Clarendon Press, Oxford, 3rd ed. 1997.
[7] H. Matsumoto, I. Ojima and H. Umezawa, Ann. Phys. 152, 348 (1984); A.J. Niemi and G.W. Semenoff, Nucl. Phys. B230 [FS10], 181 (1984); R.E. Norton and J.M. Cornwall, Ann. Phys. 91, 106 (1975); M.B. Kislinger and P.D. Morley, Phys. Rev. D13, 2771 (1976); N.P. Landsman, Comm. Math. Phys. 125, 643 (1989).
[8] O. Steinmann, Perturbative Quantum Field Theory at Positive Temperatures: An Axiomatic Approach, Comm. Math. Phys. 170, 405 (1995).
[9] N. Bogoliubov and O. Parasiuk, Acta Math. 97, 227 (1957); K. Hepp, Comm. Math. Phys. 2, 301 (1966); W. Zimmermann, Comm. Math. Phys. 11, 1 (1968), and ibid. 15, 208 (1969); P. Breitenlohner and D. Maison, Comm. Math. Phys. 52, 39 (1977), and ibid. 52, 55 (1977); J. Polchinski, Nucl. Phys. B231, 269 (1984); T. Reisz, Comm. Math. Phys. 116, 81 (1988), and ibid. 117, 573 (1988).
[10] G. Keller, Ch. Kopper, M. Salmhofer, Helv. Phys. Acta 65, 32 (1991); G. Keller and Ch. Kopper, Comm. Math. Phys. 148, 445 (1992); Ch. Kopper, Renormierungstheorie mit Flußgleichungen, Shaker Verlag, Aachen, 1998.
[11] J. Feldman, H. Knörrer, M. Salmhofer and E. Trubowitz, J. Stat. Phys. 94, 113 (1999).
[12] N. Bourbaki, Fonctions d’une variable réelle, ch. 6, éditions Hermann, Paris, 1976.
[13] Ch. Kopper and W. Pedra, Irrelevant Interactions without Composite Operators: A Remark on the Universality of second order Phase Transitions, preprint cond-mat 0007476, to appear in J. Phys. A.

C. Kopper
Centre de Physique Théorique
Ecole Polytechnique
F-91128 Palaiseau
France
email : [email protected]

V.F. Müller
Fachbereich Physik
Universität Kaiserslautern
D-67653 Kaiserslautern
email : [email protected]

T. Reisz
Institut für Theor. Physik
Universität Heidelberg
D-69120 Heidelberg
Germany
and
Service de Physique Théorique de Saclay
CE-Saclay
F-91191 Gif-sur-Yvette Cedex
France
email : [email protected]

Communicated by Detlev Buchholz
submitted 19/09/00, accepted 25/10/00



Erratum to “Poincaré renormalized forms”

Giuseppe Gaeta

The computation given in section 12 of [1] is incorrect for the case ν ≤ µ (these are defined in Lemma 2 there). The correct expression for formula (12.14) in Lemma 2 is

$$ f^*(x) = Ax + \widehat{a}_\mu X_\mu + \widehat{a}_{2\mu} X_{2\mu} + \sum_{k=\nu}^{\mu} \widehat{b}_k Y_k , \tag{1} $$

where $\widehat{a}_k$, $\widehat{b}_k$ are real constants, in general different from $a_k$, $b_k$. Note also that in [1] it is not sufficiently stressed that this computation is performed by making use of the Lie algebraic properties of the set (12.2) of vector fields in normal form. Following step by step the general and generic algorithm described in sections 6–9 one would obtain a different reduced normal form, i.e. $f^*(x) = Ax + b_\nu Y_\nu + \sum_{k=\mu}^{\infty} a_k X_k$. Similar considerations and corrections also apply to sect. 13 of [1] and sect. VIII.6 of [2]. I stress that in [3] my definition of PRFs was incorrectly reported; thus the remarks given there do not concern PRFs. These points, and the related computations, are discussed in detail in [4], available via http://mpej.unige.ch/mp_arc/mp_arc-home.html .

References

[1] G. Gaeta, “Poincaré renormalized forms”, Ann. Inst. H. Poincaré (Phys. Theo.) 70 (1999), 461–514.
[2] G. Cicogna and G. Gaeta, Symmetry and perturbation theory in nonlinear dynamics, Springer (Lecture Notes in Physics, vol. m57), 1999.
[3] A.D. Bruno, reviews 1999a:34111 and 2000h:37071, Mathematical Reviews.
[4] G. Gaeta, “Poincaré renormalized forms and regular singular points of vector fields in the plane”, preprint mp-arc 01-17 (2001); “Algorithmic reduction of Poincaré normal forms and Lie algebras”, in preparation.

Giuseppe Gaeta
Dipartimento di Fisica
Università di Roma
I–00185 Roma, Italy

Communicated by Vincent Rivasseau
received 05/02/01



The Bianchi IX Attractor

Hans Ringström

Abstract. We consider the asymptotic behaviour of spatially homogeneous spacetimes of Bianchi type IX close to the singularity (we also consider some of the other Bianchi types, e. g. Bianchi VIII in the stiff fluid case). The matter content is assumed to be an orthogonal perfect fluid with linear equation of state and zero cosmological constant. In terms of the variables of Wainwright and Hsu, we have the following results. In the stiff fluid case, the solution converges to a point for all the Bianchi class A types. For the other matter models we consider, the Bianchi IX solutions generically converge to an attractor consisting of the closure of the vacuum type II orbits. Furthermore, we observe that for all the Bianchi class A spacetimes, except those of vacuum Taub type, a curvature invariant is unbounded in the incomplete directions of inextendible causal geodesics.

1 Introduction In the last few decades, the Bianchi IX spacetimes have received considerable attention, see for instance [6], [13], [21] and references therein. Agreement has been reached, at least concerning some aspects of the asymptotic behaviour as one approaches a singularity, but the basis for the consensus has mainly consisted of numerical studies and heuristic arguments. The objective of this work is to provide mathematical proofs for some aspects of the ’accepted’ picture. The main result of this paper was for example conjectured in [21] p. 146-147, partly on the basis of a numerical analysis. In the standard cosmological models of the universe, one assumes it to be spatially homogeneous and isotropic, and one then typically obtains a cosmological singularity. The question arises to what extent the singularity is caused by the symmetry assumptions. The singularity theorems of Hawking and Penrose yield the conclusion that the existence of cosmological singularities is in some sense generic. However, the concept of a singularity used in these theorems is that of causal geodesic incompleteness, and thus, the theorems do not say much about the character of the singularities. In fact, the curvature may remain bounded as one approaches a singularity, and it may be possible to extend the spacetime beyond it. In other words, the problem of cosmic censorship for vacuum cosmological spacetimes, i. e. that the maximal globally hyperbolic development is generically inextendible, cannot be solved using the singularity theorems. There is thus an incentive to try to analyze the character of cosmological singularities of solutions to Einstein’s equations. Since this is difficult in general, it is natural to consider

solutions with symmetry. This paper is concerned with the spatially homogeneous case. The interest in Bianchi IX spacetimes is partly due to the work of Belinskii, Khalatnikov and Lifshitz, below referred to as BKL. Around 1970, they considered such spacetimes in an attempt to construct a general solution of Einstein’s equations with a singularity, see e. g. [3] and [4]. Misner began his work at about the same time [17], his original motivation being questions of isotropy, and he christened the Bianchi IX vacuum solutions mixmaster universes. Later, the seemingly chaotic behaviour of mixmaster universes attracted attention, see for instance [13] and [8], and Bianchi IX has been considered to be a model case for understanding the chaotic nature of solutions to Einstein’s equations. However, the question of what should be meant by chaoticity in this context has caused a great deal of debate. Let us describe in more detail our two main sources of motivation. Cosmic censorship. The Bianchi IX class contains the Taub spacetimes. These spacetimes are vacuum maximal globally hyperbolic spacetimes that are causally geodesically incomplete both to the future and to the past, see [7] and [16]. However, as one approaches a singularity, in the sense of causal geodesic incompleteness, the curvature remains bounded. In fact, one can extend the spacetime beyond the singularities in inequivalent ways, see [7]. The extensions are called Taub-NUT. It is natural to conjecture that the behaviour exhibited by the Taub spacetimes is non-generic, and it is interesting to try to prove that the behaviour is nongeneric in the Bianchi IX class. In fact we prove that all Bianchi IX initial data considered in this paper other than Taub yield inextendible globally hyperbolic developments such that the curvature becomes unbounded as one approaches a singularity. 
This result is in fact more of an observation, since the corresponding result is known in the vacuum case, see [19], and curvature blow up is easy to prove in the non-vacuum cases we consider. The BKL conjecture. Another reason for studying the Bianchi IX spacetimes is the BKL conjecture, see [4]. According to this conjecture, the ’local’ approach to the singularity of a general inhomogeneous solution should exhibit oscillatory behaviour. The prototypes for this behaviour among the spatially homogeneous spacetimes are the Bianchi VIII and IX classes. Furthermore, matter is conjectured to become unimportant as one approaches a singularity, with some exceptions, for example the stiff fluid case. We refer to [5] for arguments supporting the BKL conjecture and to [1] for an overview of conjectures and results under symmetry assumptions of varying degree. In this paper we prove, under certain restrictions on the allowed matter models, that generic Bianchi IX solutions exhibit oscillatory behaviour and that the matter becomes unimportant as one approaches a singularity. What is meant by the latter statement will be made precise below. If the matter model is a stiff fluid the matter will be important, and in that case we prove that the behaviour is quiescent. This should be compared with [2] concerning the structure of singularities of analytic solutions to Einstein’s equations coupled to a scalar field or stiff fluid. In that paper, Andersson and Rendall prove


that given a certain kind of solution to the so called velocity dominated system, there is a unique solution of Einstein’s equations coupled to a stiff fluid approaching the velocity dominated solution asymptotically. One can then ask the question whether it is natural to assume that a solution has the asymptotics they prescribe. In Section 20, we show that all Bianchi VIII and IX stiff fluid solutions exhibit such asymptotic behaviour. The results presented in this paper can be divided into two parts. The first part consists of statements about developments of orthogonal perfect fluid data of class A. We clarify below what we mean by this. The results concern curvature blow up and inextendibility of developments. The second part consists of results expressed in terms of the variables of Wainwright and Hsu. These variables describe the spacetime close to the singularity, and we prove that Bianchi IX solutions generically converge to an attractor.

1.1 Properties of developments

We consider spatially homogeneous Lorentz manifolds $(\bar{M}, \bar{g})$ with a perfect fluid source. The stress energy tensor is thus given by

$$ T_{ab} = \mu\, u_a u_b + p\,(\bar{g}_{ab} + u_a u_b) , \tag{1} $$

where u is a unit timelike vector field, the 4-velocity of the fluid. We assume that p and µ satisfy a linear equation of state

$$ p = (\gamma - 1)\mu , \tag{2} $$

where we in this paper restrict our attention to 2/3 < γ ≤ 2. We will also assume that u is perpendicular to the hypersurfaces of homogeneity. Einstein’s equations can be written

$$ \bar{R}_{ab} - \frac{1}{2} \bar{R}\, \bar{g}_{ab} = T_{ab} , \tag{3} $$

where $\bar{R}_{ab}$ and $\bar{R}$ are the Ricci and scalar curvature of $(\bar{M}, \bar{g})$. In order to formulate an initial value problem in this setting, consider a spacelike submanifold (M, g) of $(\bar{M}, \bar{g})$, orthogonal to u. Let $e_\alpha$, α = 0, .., 3 be a local frame with $e_0 = u$ and $e_i$, i = 1, 2, 3 tangent to M, and let $k_{ij}$ be the second fundamental form of (M, g). Then g and k must satisfy the equations

$$ R_g - k_{ij} k^{ij} + (\mathrm{tr}_g k)^2 = 2\bar{R}_{00} + \bar{R} $$

and

$$ \nabla_i\, \mathrm{tr}_g k - \nabla^j k_{ij} = \bar{R}_{0i} , $$

where ∇ is the Levi-Civita connection of g, $R_g$ is the corresponding scalar curvature, and indices are raised and lowered by g. If we specify a Riemannian metric g, and a symmetric covariant 2-tensor k, as initial data on a 3-manifold, they should thus in our situation satisfy

$$ R_g - k_{ij} k^{ij} + (\mathrm{tr}_g k)^2 = 2\mu \tag{4} $$

and

$$ \nabla_i\, \mathrm{tr}_g k - \nabla^j k_{ij} = 0 , \tag{5} $$

because of (3), (1) and the fact that u is perpendicular to M. In other words, we should also specify the initial value of µ as part of the data. We consider only a restricted class of manifolds M and initial data. The 3-manifold M is assumed to be a special type of Lie group, and g, k and µ are assumed to be left invariant. In order to be more precise concerning the type of Lie groups M = G we consider, let $e_i$, i = 1, 2, 3 be a basis of the Lie algebra with structure constants determined by $[e_i, e_j] = \gamma^k_{ij} e_k$. If $\gamma^k_{ik} = 0$, then the Lie algebra and Lie group are said to be of class A, and

$$ \gamma^k_{ij} = \epsilon_{ijm}\, n^{km} , \tag{6} $$

where the symmetric matrix $n^{ij}$ is given by

$$ n^{ij} = \frac{1}{2}\, \gamma^{(i}_{kl}\, \epsilon^{j)kl} . \tag{7} $$

Definition 1.1 Orthogonal perfect fluid data of class A for Einstein’s equations consist of the following. A Lie group G of class A, a left invariant Riemannian metric g on G, a left invariant symmetric covariant 2-tensor k on G, and a constant µ0 ≥ 0 satisfying (4) and (5) with µ replaced by µ0 . We can choose a left invariant orthonormal basis {ei } with respect to g, so that the corresponding matrix nij defined in (7) is diagonal with diagonal elements n1 , n2 and n3 . By an appropriate choice of orthonormal basis, n1 , n2 , n3 can be assumed to belong to one and only one of the types given in Table 1.1. We assign a Bianchi type to the initial data accordingly. This division constitutes a classification of the class A Lie algebras. We refer to Lemma 21.1 for a proof of these statements. Let kij = k(ei , ej ). Then the matrices nij and kij commute according to (5), so that we may assume kij to be diagonal with diagonal elements k1 , k2 and k3 , cf. (140). Definition 1.2 Orthogonal perfect fluid data of class A satisfying k2 = k3 and n2 = n3 or one of the permuted conditions are said to be of Taub type. Data with µ0 = 0 are called vacuum data. Observe that the Taub condition is independent of the choice of orthonormal basis diagonalizing n and k, cf. (140). Considering the equations of Ellis and MacCallum (131)-(135), one can see that if n2 = n3 and k2 = k3 at one point in time, then the equalities always hold, cf. the construction of the spacetime carried out in the appendix. The resulting spacetimes are locally rotationally symmetric and thus have a higher degree of symmetry. According to [10], vacuum Bianchi IX solutions satisfying these conditions are the Taub solutions.


Table 1: Bianchi class A.

Type    n1   n2   n3
I       0    0    0
II      +    0    0
VI0     0    +    −
VII0    0    +    +
VIII    −    +    +
IX      +    +    +
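The classification in Table 1 depends only on the sign pattern of (n1, n2, n3). A minimal sketch of the lookup is given below; the function name is hypothetical, and the code assumes, as a reading of the phrase "by an appropriate choice of orthonormal basis", that permutations of the three entries and an overall sign flip lead to the same type.

```python
def bianchi_class_a_type(n1, n2, n3):
    """Return the Bianchi class A type for the diagonal entries (n1, n2, n3).

    Only the sign pattern is used. Permutations of the entries and an overall
    sign flip are treated as equivalent (an assumption of this sketch).
    """
    signs = [(x > 0) - (x < 0) for x in (n1, n2, n3)]
    positive = signs.count(1)
    negative = signs.count(-1)
    nonzero = positive + negative
    if nonzero == 0:
        return "I"
    if nonzero == 1:
        return "II"
    if nonzero == 2:
        # two non-zero entries: equal signs give VII0, opposite signs give VI0
        return "VII0" if (positive == 2 or negative == 2) else "VI0"
    # three non-zero entries: all equal signs give IX, otherwise VIII
    return "IX" if (positive == 3 or negative == 3) else "VIII"
```

For example, `bianchi_class_a_type(-1.0, 2.0, 0.5)` falls in the row with one sign differing from the other two, i.e. type VIII.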

Definition 1.3 By an orthogonal perfect fluid development of orthogonal perfect fluid data of class A, we will mean the following. A connected 4-dimensional Lorentz manifold $(\bar{M}, \bar{g})$ and a 2-tensor T, as in (1), on $(\bar{M}, \bar{g})$, such that there is an embedding $i : G \to \bar{M}$ with $i^*(\bar{g}) = g$, $i^*(\bar{k}) = k$ and $i^*(\mu) = \mu_0$, where $\bar{k}$ is the second fundamental form of i(G) in $(\bar{M}, \bar{g})$. Finally, we demand that these objects be related by Einstein’s field equations (3).

In the appendix, we construct globally hyperbolic orthogonal perfect fluid developments, given initial data, and we refer to them as class A developments, cf. Definition 21.1. The logic of the construction is as follows. Given a solution to certain differential equations, we construct a globally hyperbolic development which we call the class A development. This development need not a priori be maximal. However, we prove that except for the Taub type solutions, this development is inextendible. The Taub type solutions we leave aside as they have been considered elsewhere, see for instance [7]. By the construction it follows that the Bianchi type is preserved by the flow. We can therefore assign a type to a Bianchi class A development according to the type of the initial data. What is meant by inextendibility is explained in the following.

Definition 1.4 Consider a connected Lorentz manifold (M, g). If there is a connected $C^2$ Lorentz manifold $(\hat{M}, \hat{g})$ of the same dimension, and a map $i : M \to \hat{M}$, with $i(M) \neq \hat{M}$, which is an isometry onto its image, then (M, g) is said to be $C^2$-extendible and $(\hat{M}, \hat{g})$ is called a $C^2$-extension of (M, g). A Lorentz manifold which is not $C^2$-extendible is said to be $C^2$-inextendible.

Remark. There is an analogous definition of smooth extensions. Unless otherwise mentioned, manifolds are assumed to be smooth, and maps between manifolds are assumed to be as regular as possible.

The main conclusion concerning developments is that the vacuum Taub type solutions are the only ones which do not exhibit curvature blow up. A more precise statement is to be found in Section 19.


1.2 Results expressed in the Wainwright-Hsu variables

We now turn to the results that are expressed in terms of the variables of Wainwright and Hsu. The equations and some of their properties are to be found in Section 2. The appendix contains a derivation. It is natural to divide the matter models into two categories: the non-stiff fluid case and the stiff fluid case (γ = 2).

Figure 1: The Kasner map (horizontal axis Σ+, vertical axis Σ−).

Let us begin with the non-stiff fluid case, including the vacuum case. We confine our attention to Bianchi IX solutions. The existence interval stretches back to −∞, which corresponds to the singularity. There are some fixed points to which certain solutions converge, and data which lead to such solutions together with data of Taub type will be considered to be non-generic. The Kasner map, which is supposed to be an approximation of the Bianchi IX dynamics as one approaches a singularity, is illustrated in Figure 1. The circle in the Σ+Σ−-plane appearing in the figure is called the Kasner circle, and we have depicted two bounces of the Kasner map. The starting point is marked by a star, and the end point by a plus sign. Given a point x on the Kasner circle, the Kasner map yields a new point y on the Kasner circle by taking the corner of the triangle closest to x, drawing a straight line from the corner through x, and then letting y be the second point of intersection between the line and the Kasner circle. One solid line corresponds to the closure of a vacuum type II orbit of the equations of Wainwright and Hsu. Actually, it is the projection of the closure of such an orbit to the Σ+Σ−-plane. A vacuum type II solution has one $N_i$ non-zero and the other two zero, and the three different $N_i$ correspond to the three corners of the triangle; the rightmost corner corresponds to $N_1 \neq 0$ and the corner on the top left corresponds to $N_3 \neq 0$. The


constraint (11) for the vacuum type II solutions is given by

$$ \Sigma_+^2 + \Sigma_-^2 + \frac{3}{4} N_i^2 = 1 . $$

The closure of this set is given a name in the following definition.

Definition 1.5 The set

$$ \mathcal{A} = \{ (\Omega, \Sigma_+, \Sigma_-, N_1, N_2, N_3) : \Omega + |N_1 N_2| + |N_2 N_3| + |N_3 N_1| = 0 \} \cap M , $$

where M is defined by (11), is called the Bianchi attractor.

The main result of this paper is that for generic Bianchi IX data, the solution converges to the attractor. That is

$$ \lim_{\tau \to -\infty} (\Omega + N_1 N_2 + N_2 N_3 + N_3 N_1) = 0 . \tag{8} $$
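The geometric construction of the Kasner map given in connection with Figure 1 can be sketched in a few lines. The corner coordinates used below, (2, 0) and (−1, ±√3), are an assumption of this sketch (the text only says that the rightmost corner belongs to N1 and the top left one to N3), and the function names are hypothetical. The second helper evaluates the quantity that defines the attractor in Definition 1.5.

```python
import math

# Corners of the triangle in the (Sigma_plus, Sigma_minus)-plane.
# Assumed values, not stated in the text: rightmost corner for N1,
# top left corner for N3, bottom left corner for N2.
CORNERS = ((2.0, 0.0), (-1.0, math.sqrt(3.0)), (-1.0, -math.sqrt(3.0)))

def kasner_map(x):
    """Map a point x on the unit (Kasner) circle to the second intersection
    of the circle with the straight line through the nearest corner and x."""
    cx, cy = min(CORNERS, key=lambda c: (x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2)
    dx, dy = x[0] - cx, x[1] - cy
    # Points on the line are c + t*d, and t = 1 gives x itself. Since |c|^2 = 4,
    # the two roots of |c + t*d|^2 = 1 have product 3/|d|^2, so the other root is:
    t = 3.0 / (dx * dx + dy * dy)
    return (cx + t * dx, cy + t * dy)

def attractor_defect(omega, n1, n2, n3):
    """Omega + |N1 N2| + |N2 N3| + |N3 N1|, which vanishes exactly on the attractor."""
    return omega + abs(n1 * n2) + abs(n2 * n3) + abs(n3 * n1)

if __name__ == "__main__":
    point = (math.cos(0.3), math.sin(0.3))
    for _ in range(2):               # two bounces, as depicted in Figure 1
        point = kasner_map(point)
        print(point)
```

The three points of the circle at which the line from the nearest corner is tangent to the circle, among them (Σ+, Σ−) = (−1, 0), are mapped to themselves by this construction.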

This conclusion supports the statement that the Kasner map approximates the dynamics, and also the statement that the matter content loses significance close to the singularity. Let us introduce some terminology.

Definition 1.6 Let $f \in C^\infty(\mathbb{R}^n, \mathbb{R}^n)$, and consider a solution x to the equation

$$ \frac{dx}{dt} = f \circ x , \qquad x(0) = x_0 , $$

with maximal existence interval $(t_-, t_+)$. We call a point $x_*$ an α-limit point of the solution x, if there is a sequence $t_k \to t_-$ with $x(t_k) \to x_*$. The α-limit set of x is the set of its α-limit points. The ω-limit set is defined similarly by replacing $t_-$ with $t_+$.

Remark. If $t_- > -\infty$ then the α-limit set is empty, cf. [19].

Thus, the α-limit set of a generic solution is contained in the attractor. The desired statement is that the α-limit set coincides with the attractor, but the best result we have achieved in this direction is that there must at least be three α-limit points on the Kasner circle. This worst case situation corresponds to the solution converging to a periodic orbit of the Kasner map with period three. Observe that we have not proven anything concerning Bianchi VIII solutions, but the behaviour in that case is expected to be similar.

Let us sketch the proof. It is natural to divide it into two parts. The first part consists of proving the existence of an α-limit point on the Kasner circle. We achieve this in the following steps. First we analyze the α-limit sets of the Bianchi types I, II and VII0. An analysis of types I and II can also be found in Ellis and Wainwright [21]. Then we prove the existence of an α-limit point for a generic Bianchi IX solution. To go from the existence of an α-limit point to an α-limit



point on the Kasner circle, we use the analysis of the lower Bianchi types. In the second part, we prove (8). Let d be the function appearing in that equation. We assume that d does not converge to zero in order to reach a contradiction. The existence of an α-limit point on the Kasner circle proves that there is a sequence τk → −∞ such that d(τk ) → 0. If d does not converge to zero there is a δ > 0, and a sequence sk → −∞ such that d(sk ) ≥ δ. We can assume sk ≤ τk and conclude that d on the whole has to grow (going backwards) in the interval [sk , τk ]. What can be said about this growth? In Section 14, we prove that we can control the density parameter Ω in this process, assuming δ is small enough, which is not a restriction. As a consequence Ω can be assumed to be arbitrarily small during the growth. Some further arguments, given in Section 15, show that we can assume the growth to occur in the product N2 N3 , using the symmetries of the equations. Furthermore, one can assume the Σ+ Σ− -variables to be arbitrarily close to (Σ+ , Σ− ) = (−1, 0), and that some expressions dominate others. For instance 1+Σ+ can be assumed to be arbitrarily much smaller than N2 N3 . This control introduces a natural concept of order of magnitude. The behaviour of the product N2 N3 will be oscillatory; it will look roughly like a sine wave. The point is to prove that the product decays during a period of its oscillation; that would lead to a contradiction. The variation during a period can be expressed in terms of an integral, and we use the order of magnitude concept to prove an estimate showing that this integral has the right sign. Now consider the stiff fluid case with positive density parameter. In this case we will consider Bianchi VIII and IX solutions. The analysis is similar for the other cases and a description of the results is to be found in Section 19. Again the singularity corresponds to −∞. 
The density parameter Ω converges to a non-zero value, all the Ni converge to zero, and in the Σ+ Σ− -plane the solution converges to a point inside the triangle shown in Figure 2. In Section 2, we formulate the equations of Wainwright and Hsu and briefly describe their origin and some of their properties. Section 3 contains some elementary properties of solutions. We give the existence intervals of solutions to the equations, and prove that the ΩΣ+ Σ− -variables are contained in a compact set to the past for Bianchi IX solutions. As in the vacuum case, we also prove that (Σ+ , Σ− ) can converge to (−1, 0) only if the solution is of Taub type, although this is no longer a characterization. In Section 4, we mention some critical points and make more precise the statement that solutions converging to these points are non-generic. Included in this section are also two technical lemmas relevant to the analysis. The monotonicity principle is explained in Section 5. It is fundamental to the analysis of the α-limit sets of the solutions. We present two applications; the fact that all α-limit points of Bianchi IX solutions are of type I, II or VII0 and an analysis of the vacuum type II orbits. The last application is not complicated, but illustrates the arguments involved as well as demonstrating how the map depicted in Figure 1 can be viewed as a sequence of type II orbits. Section 6 deals with situations such that one has control over the shear variables and the density parameter. Specifically, it gives a geometric interpretation of some of the

Figure 2: The triangle mentioned in the text (horizontal axis Σ+, vertical axis Σ−).

equations in ΩΣ+Σ−-space. As an application, we prove that if a Bianchi IX solution has an α-limit point on the Kasner circle then all the points obtained by applying the Kasner map to this point belong to the α-limit set of the solution. The stiff fluid case is handled in Section 7. In this case the α-limit set consists of a point regardless of type. Sections 8-10 deal with the lower order Bianchi types needed in order to analyze Bianchi IX. An analysis of types I and II can also be found in Ellis and Wainwright [21]. Section 11 gives the possibilities for a Taub type Bianchi IX solution. The technical Section 12 is needed in order to prove the existence of an α-limit point for Bianchi IX solutions, and also to prove that the set of vacuum type II points is an attractor. It is used for approximating the solution in situations where the behaviour is oscillatory. Section 13 proves the existence of an α-limit point for a Bianchi IX solution and the existence of an α-limit point on the Kasner circle for generic Bianchi IX solutions. In Section 14, we prove that if one has control over the sum |N1 N2| + |N2 N3| + |N3 N1| in some time interval [τ1, τ2], and control over Ω at τ2, then one has control over Ω in the entire interval. This rather technical observation is essential in the proof that generic solutions converge to the attractor. The heart of this paper is Section 15, which contains a proof of (8). It also contains arguments that will be used in Section 16 to analyze the regularity of the set of non-generic points. In Section 17, we observe that the convergence to the attractor is uniform, and in Section 18 we prove the existence of at least three non-special α-limit points on the Kasner circle. We formulate the main conclusions and prove Theorem 19.4 in Section 19. In Section 20, we relate our results concerning stiff fluid solutions to those of [2]. The appendices contain



results relating solutions to the equations of Wainwright and Hsu with properties of the class A developments and some curvature computations.

2 Equations of Wainwright and Hsu

The essence of this paper is an analysis of the asymptotic behaviour of solutions to the equations of Wainwright and Hsu (9)-(11). One important property of these equations is that they describe all the Bianchi class A types at the same time. Another important property is that it seems that the variables remain in a compact set as one approaches a singularity. In the Bianchi IX case, this follows from the analysis presented in this paper.

Let us give a rough description of the origin of the variables. In the situations we consider, there is a foliation of the Lorentz manifold by homogeneous spacelike hypersurfaces diffeomorphic to a Lie group G of class A. One can define an orthonormal basis eα, α = 0, ..., 3, such that ei, i = 1, 2, 3, span the tangent space of the spacelike hypersurfaces of homogeneity, and e0 = ∂t for a suitable globally defined time coordinate t. It is possible to associate a matrix nij with the spacelike vectors ei, as in (7), and assume it to be diagonal with diagonal components ni. One changes the time coordinate by dt/dτ = 3/θ, where θ is minus the trace of the second fundamental form of the spacelike hypersurface corresponding to t. The Ni(τ) below are the ni(τ) divided by θ(τ), the Σ+ and Σ− correspond to the traceless part of the second fundamental form of the spacelike hypersurface corresponding to τ, similarly normalized, and finally Ω = 3µ/θ². We will refer to Σ+ and Σ− as the shear variables, and to Ω as the density parameter.

The question then arises to what extent this makes sense, since θ could become zero. An answer is given in the appendix. For all the Bianchi types except IX, this procedure is essentially harmless, and the variables of Wainwright and Hsu capture the entire Lorentz manifold. In the Bianchi IX case, there is however a point at which θ = 0, at least if 1 ≤ γ ≤ 2, see the appendix, and the variables are only valid for half a development in that case.
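As far as the analysis of the asymptotics is concerned, this is however not important. A derivation of the equations is given in the appendix. They are

$$\begin{aligned}
N_1' &= (q - 4\Sigma_+) N_1 \\
N_2' &= (q + 2\Sigma_+ + 2\sqrt{3}\,\Sigma_-) N_2 \\
N_3' &= (q + 2\Sigma_+ - 2\sqrt{3}\,\Sigma_-) N_3 \\
\Sigma_+' &= -(2 - q)\Sigma_+ - 3 S_+ \\
\Sigma_-' &= -(2 - q)\Sigma_- - 3 S_- \\
\Omega' &= [2q - (3\gamma - 2)]\,\Omega.
\end{aligned} \tag{9}$$

The prime denotes derivative with respect to a time coordinate τ, and

$$\begin{aligned}
q &= \tfrac{1}{2}(3\gamma - 2)\Omega + 2(\Sigma_+^2 + \Sigma_-^2) \\
S_+ &= \tfrac{1}{2}\left[(N_2 - N_3)^2 - N_1(2N_1 - N_2 - N_3)\right] \\
S_- &= \tfrac{\sqrt{3}}{2}(N_3 - N_2)(N_1 - N_2 - N_3).
\end{aligned} \tag{10}$$

The constraint is

$$\Omega + \Sigma_+^2 + \Sigma_-^2 + \tfrac{3}{4}\left[N_1^2 + N_2^2 + N_3^2 - 2(N_1 N_2 + N_2 N_3 + N_3 N_1)\right] = 1. \tag{11}$$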

We demand that 2/3 < γ ≤ 2 and Ω ≥ 0. The equations (9)-(11) have certain symmetries, described in Wainwright and Hsu [20]. By permuting N1, N2, N3 arbitrarily, we get new solutions, if we at the same time carry out appropriate combinations of rotations by integer multiples of 2π/3, and reflections in the (Σ+, Σ−)-plane. Explicitly, the transformations

$$(\tilde N_1, \tilde N_2, \tilde N_3) = (N_3, N_1, N_2), \qquad (\tilde\Sigma_+, \tilde\Sigma_-) = \left(-\tfrac{1}{2}\Sigma_+ + \tfrac{1}{2}\sqrt{3}\,\Sigma_-,\; -\tfrac{1}{2}\sqrt{3}\,\Sigma_+ - \tfrac{1}{2}\Sigma_-\right)$$

and

$$(\tilde N_1, \tilde N_2, \tilde N_3) = (N_1, N_3, N_2), \qquad (\tilde\Sigma_+, \tilde\Sigma_-) = (\Sigma_+, -\Sigma_-)$$

yield new solutions. Below, we refer to rotations by integer multiples of 2π/3 as rotations. Changing the sign of all the Ni at the same time does not change the equations. Classify points (Ω, Σ+, Σ−, N1, N2, N3) according to the values of N1, N2, N3 in the same way as in Table 1.1. Since the sets Ni > 0, Ni < 0 and Ni = 0 are invariant under the flow of the equations, we may classify solutions to (9)-(11) accordingly.

Definition 2.1 The Kasner circle is defined by the conditions Ni = Ω = 0 and the constraint (11). There are three points on this circle called special: (Σ+, Σ−) = (−1, 0) and (1/2, ±√3/2). We will also call points in the Σ+Σ−-plane special.

The following reformulation of Σ+′ is written down for future reference,

$$\Sigma_+' = -(2 - 2\Omega - 2\Sigma_+^2 - 2\Sigma_-^2)(\Sigma_+ + 1) - \frac{3}{2}(2 - \gamma)\Omega\Sigma_+ + \frac{9}{2} N_1 (N_1 - N_2 - N_3). \tag{12}$$
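The system (9)-(11) is easy to check numerically. The following is a minimal sketch, not part of the paper: the function names, the variable ordering (Ω, Σ+, Σ−, N1, N2, N3) and the sample points are illustrative assumptions. It evaluates the right-hand side of (9), verifies that the left-hand side C of (11) satisfies C′ = 2q(C − 1) along the flow (a short computation from (9) and (10), which shows that C = 1 is preserved), compares (12) with the Σ+ equation of (9) on the constraint surface, and checks that the cyclic symmetry commutes with the vector field.

```python
import math

SQRT3 = math.sqrt(3.0)

def q_of(x, gamma):
    """The quantity q from (10); x = (Omega, Sigma+, Sigma-, N1, N2, N3)."""
    Om, Sp, Sm, N1, N2, N3 = x
    return 0.5 * (3 * gamma - 2) * Om + 2 * (Sp**2 + Sm**2)

def rhs(x, gamma):
    """Right-hand side of (9), returned in the same ordering as x."""
    Om, Sp, Sm, N1, N2, N3 = x
    q = q_of(x, gamma)
    S_plus = 0.5 * ((N2 - N3)**2 - N1 * (2 * N1 - N2 - N3))
    S_minus = 0.5 * SQRT3 * (N3 - N2) * (N1 - N2 - N3)
    return (
        (2 * q - (3 * gamma - 2)) * Om,
        -(2 - q) * Sp - 3 * S_plus,
        -(2 - q) * Sm - 3 * S_minus,
        (q - 4 * Sp) * N1,
        (q + 2 * Sp + 2 * SQRT3 * Sm) * N2,
        (q + 2 * Sp - 2 * SQRT3 * Sm) * N3,
    )

def constraint(x):
    """Left-hand side of (11)."""
    Om, Sp, Sm, N1, N2, N3 = x
    return Om + Sp**2 + Sm**2 + 0.75 * (
        N1**2 + N2**2 + N3**2 - 2 * (N1 * N2 + N2 * N3 + N3 * N1))

def constraint_dot(x, gamma):
    """Derivative of the left-hand side of (11) along the flow (chain rule)."""
    Om, Sp, Sm, N1, N2, N3 = x
    dOm, dSp, dSm, dN1, dN2, dN3 = rhs(x, gamma)
    return dOm + 2 * Sp * dSp + 2 * Sm * dSm + 1.5 * (
        N1 * dN1 + N2 * dN2 + N3 * dN3
        - (dN1 * N2 + N1 * dN2 + dN2 * N3 + N2 * dN3 + dN3 * N1 + N3 * dN1))

def sigma_plus_dot_12(x, gamma):
    """Right-hand side of (12); agrees with (9) only on the constraint surface."""
    Om, Sp, Sm, N1, N2, N3 = x
    return (-(2 - 2 * Om - 2 * Sp**2 - 2 * Sm**2) * (Sp + 1)
            - 1.5 * (2 - gamma) * Om * Sp
            + 4.5 * N1 * (N1 - N2 - N3))

def rotate(x):
    """Cyclic symmetry: (N1, N2, N3) -> (N3, N1, N2) with the rotated shear."""
    Om, Sp, Sm, N1, N2, N3 = x
    return (Om, -0.5 * Sp + 0.5 * SQRT3 * Sm,
            -0.5 * SQRT3 * Sp - 0.5 * Sm, N3, N1, N2)

def on_constraint(Sp, Sm, N1, N2, N3):
    """Solve (11) for Omega, given the remaining variables."""
    Om = 1.0 - constraint((0.0, Sp, Sm, N1, N2, N3))
    return (Om, Sp, Sm, N1, N2, N3)

if __name__ == "__main__":
    gamma = 4.0 / 3.0                      # sample value in (2/3, 2]
    x = on_constraint(0.1, -0.2, 0.5, 0.7, 0.9)
    print("C' on the constraint surface:", constraint_dot(x, gamma))
    print("(9) minus (12):", rhs(x, gamma)[1] - sigma_plus_dot_12(x, gamma))
```

Because the symmetry acts linearly on the six variables, applying `rotate` to the derivative vector gives the transformed derivative, which is what the symmetry check compares against.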

3 Elementary properties of solutions Here we collect some miscellaneous observations that will be of importance. Most of them are similar to results obtained in [19]. The α-limit set defined in Definition 1.6 plays an important role in this paper, and here we mention some of its properties. Lemma 3.1 Let f and x be as in Definition 1.6. The α-limit set of x is closed and invariant under the flow of f . If there is a T such that x(t) is contained in a compact set for t ≤ T , then the α-limit set of x is connected.

Proof. See e.g. [14]. ✷

Definition 3.1 A solution to (9)-(11) satisfying N2 = N3 and Σ− = 0, or one of the conditions found by applying the symmetries, is said to be of Taub type. Remark. The set defined by N2 = N3 and Σ− = 0 is invariant under the flow of (9). Lemma 3.2 The existence intervals for all solutions to (9)-(11) except Bianchi IX are (−∞, ∞). For Bianchi IX solutions we have past global existence. Proof. As in the vacuum case, see [19]. ✷ By observations made in the appendix, −∞ corresponds to the singularity. Lemma 3.3 Let 2/3 < γ ≤ 2. Consider a solution of type IX. Then (Σ+ , Σ− , Ω) is contained in a compact set for τ ∈ (−∞, 0], the size of which depends on the initial data. Further, if at a point in time N3 ≥ N2 ≥ N1 and N3 ≥ 2, then N2 ≥ N3 /10. Proof. As in the vacuum case, see [19].

✷

That (Σ+, Σ−, Ω) is contained in a compact set for all the other types follows from the constraint. The second part of this lemma will be important in the proof of the existence of an α-limit point. One consequence is that one Ni may not become unbounded alone.

The final observation is relevant in proving curvature blow up. One can define a normalized version (151) of the Kretschmann scalar (111), and it can be expressed as a polynomial in the variables of Wainwright and Hsu. One way of proving that a specific solution exhibits curvature blow up is to prove that it has an α-limit point at which the normalized Kretschmann scalar is non-zero. We refer to the appendix for the details. It turns out that this polynomial is zero when N2 = N3, N1 = 0, Σ− = 0, Σ+ = −1 and Ω = 0. The same is true of the points obtained by applying the symmetries. It is then natural to ask the question: for which solutions does (Σ+, Σ−) converge to (−1, 0)?

Proposition 3.1 A solution to (9)-(11) with 2/3 < γ < 2 satisfies

$$\lim_{\tau \to -\infty} (\Sigma_+(\tau), \Sigma_-(\tau)) = (-1, 0)$$

only if it is contained in the invariant set Σ− = 0 and N2 = N3.

Remark. The proposition does not apply to the stiff fluid case. The analogous statements for the points (Σ+, Σ−) = (1/2, ±√3/2) are true by an application of the symmetries. We may not replace the implication with an equivalence, cf. Proposition 9.1.

Proof. The argument is essentially the same as in the vacuum case, see [19]. We only need to observe that Ω will decay exponentially when (Σ+, Σ−) is close to (−1, 0). ✷


4 Critical points

Definition 4.1 The critical point F is defined by Ω = 1 and all other variables zero. In the case 2/3 < γ < 2, we define the critical point P1+(II) to be the type II point with Σ− = 0, N1 > 0, Σ+ = (3γ − 2)/8 and Ω = 1 − (3γ − 2)/16. The critical points Pi+(II), i = 2, 3 are found by applying the symmetries.

It will turn out that there are solutions which converge to these points as τ → −∞, but observe that only non-vacuum solutions can do so.

Definition 4.2 Let I_VII0 denote initial data to (9)-(11) of type VII0 with Ω > 0, and correspondingly for the other types. Let P_VII0 be the elements of I_VII0 such that the corresponding solutions converge to one of Pi+(II) as τ → −∞ and similarly for Bianchi II and IX. Finally, let F_VII0 be the elements of I_VII0 such that the corresponding solutions converge to F as τ → −∞, and similarly for the other types.

Remark. The sets F_II and so on depend on γ, but we omit this reference.

Observe that I_I, I_II, I_VII0 and I_IX are submanifolds of R^6 of dimensions 2, 3, 4 and 5 respectively. We will prove that P_II consists of points and that F_I is the point F. Let 2/3 < γ < 2 be fixed. In Theorem 16.1, we will be able to prove that the sets F_II, F_VII0, F_IX, P_VII0 and P_IX are C^1 submanifolds of R^6 of dimensions 1, 2, 3, 1 and 2 respectively. This justifies the following definition.

Definition 4.3 Let 2/3 < γ < 2. A solution to (9)-(11) is said to be generic if it is not of Taub type, and if it does not belong to F_I, F_II, F_VII0, F_IX, P_II, P_VII0 or P_IX.

We will need the following two lemmas in the sequel.

Lemma 4.1 Consider a solution x to (9)-(11) such that x has P1+(II) as an α-limit point but does not converge to it. Then x has an α-limit point of type II with N1 ≠ 0, which is not P1+(II).

Remark. There is no solution satisfying the conditions of this lemma, but we will need it to establish that fact.

Proof. Consider the solution to belong to R^6, and let the point x0 represent P1+(II).
There is an ε > 0 such that for each T, there is a τ ≤ T such that x(τ) does not belong to the open ball B_ε(x0). In x0 one can compute that q + 2Σ+ ± 2√3Σ− > 0. Let ε be so small that these expressions are positive in B_ε(x0). Let τk → −∞ be a sequence such that x(τk) → x0, and let sk ≤ τk be a sequence such that x(sk) ∈ ∂B_ε(x0) and x((sk, τk]) ⊆ B_ε(x0). Since x(sk) is contained in a compact set, there is a convergent subsequence yielding an α-limit point which is not P1+(II). Since



N2 and N3 converge to zero in τk and decay in absolute value from τk to sk, the α-limit point has to be of type II (N1 has to be non-zero for the new α-limit point if ε is small enough). ✷

Lemma 4.2 Consider a solution x to (9)-(11) such that x has F as an α-limit point, but which does not converge to F. Then x has an α-limit point of type I which is not F.

Remark. The same remark as that made in connection with Lemma 4.1 holds concerning this lemma.

Proof. The idea is the same as in the previous lemma. We need only observe that q − 4Σ+, q + 2Σ+ + 2√3Σ− and q + 2Σ+ − 2√3Σ− are positive in F. ✷
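The critical points of Definition 4.1 can be checked directly against (9)-(11). The sketch below is illustrative only: the function names are assumptions, and the value of N1 at P1+(II) is not stated in the definition but is solved here from the constraint (11). It confirms that the right-hand side of (9) vanishes at F and at P1+(II), and that the coefficients used in the proofs of Lemmas 4.1 and 4.2 are positive there.

```python
import math

SQRT3 = math.sqrt(3.0)

def rhs(x, gamma):
    """Right-hand side of (9); x = (Omega, Sigma+, Sigma-, N1, N2, N3)."""
    Om, Sp, Sm, N1, N2, N3 = x
    q = 0.5 * (3 * gamma - 2) * Om + 2 * (Sp**2 + Sm**2)
    S_plus = 0.5 * ((N2 - N3)**2 - N1 * (2 * N1 - N2 - N3))
    S_minus = 0.5 * SQRT3 * (N3 - N2) * (N1 - N2 - N3)
    return (
        (2 * q - (3 * gamma - 2)) * Om,
        -(2 - q) * Sp - 3 * S_plus,
        -(2 - q) * Sm - 3 * S_minus,
        (q - 4 * Sp) * N1,
        (q + 2 * Sp + 2 * SQRT3 * Sm) * N2,
        (q + 2 * Sp - 2 * SQRT3 * Sm) * N3,
    )

def constraint(x):
    """Left-hand side of (11)."""
    Om, Sp, Sm, N1, N2, N3 = x
    return Om + Sp**2 + Sm**2 + 0.75 * (
        N1**2 + N2**2 + N3**2 - 2 * (N1 * N2 + N2 * N3 + N3 * N1))

def rates(x, gamma):
    """Coefficients multiplying N1, N2, N3 in (9)."""
    Om, Sp, Sm = x[0], x[1], x[2]
    q = 0.5 * (3 * gamma - 2) * Om + 2 * (Sp**2 + Sm**2)
    return (q - 4 * Sp, q + 2 * Sp + 2 * SQRT3 * Sm, q + 2 * Sp - 2 * SQRT3 * Sm)

def point_F():
    """The critical point F: Omega = 1, all other variables zero."""
    return (1.0, 0.0, 0.0, 0.0, 0.0, 0.0)

def point_P1(gamma):
    """P1+(II): Sigma+ and Omega as in Definition 4.1, N1 > 0 solved from (11)."""
    Sp = (3 * gamma - 2) / 8
    Om = 1 - (3 * gamma - 2) / 16
    N1 = math.sqrt(4.0 / 3.0 * (1 - Om - Sp**2))   # assumption: derived from (11)
    return (Om, Sp, 0.0, N1, 0.0, 0.0)

if __name__ == "__main__":
    for gamma in (1.0, 4.0 / 3.0):
        print(gamma, rhs(point_P1(gamma), gamma), rates(point_P1(gamma), gamma))
```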

5 The monotonicity principle

The following lemma will be a basic tool in the analysis of the asymptotics; we will refer to it as the monotonicity principle.

Lemma 5.1 Consider

$$\frac{dx}{dt} = f \circ x \tag{13}$$

where f ∈ C^∞(R^n, R^n). Let U be an open subset of R^n, and M a closed subset invariant under the flow of the vectorfield f. Let G : U → R be a continuous function such that G(x(t)) is strictly monotone for any solution x(t) of (13), as long as x(t) ∈ U ∩ M. Then no solution of (13) whose image is contained in U ∩ M has an α- or ω-limit point in U.

Remark. Observe that one can use M = R^n. We will mainly choose M to be the closed invariant subset of R^6 defined by (11). If one Ni is zero and two are non-zero, we consider the number of variables to be five etc.

Proof. Suppose p ∈ U is an α-limit point of a solution x contained in U ∩ M. Then G ◦ x is strictly monotone. There is a sequence tn → t− such that x(tn) → p by our supposition. Thus G(x(tn)) → G(p), but G ◦ x is monotone so that G(x(t)) → G(p). Thus G(q) = G(p) for all α-limit points q of x. Since M is closed, p ∈ M. The solution x̄ of (13), with initial value p, is contained in M by the invariance property of M, and it consists of α-limit points of x so that G(x̄(t)) = G(p), which is constant. Furthermore, on an open set containing zero it takes values in U, contradicting the assumptions of the lemma. The argument for the ω-limit set is similar. ✷

Let us give an example of an application.

Lemma 5.2 Consider a solution to (9)-(11) of type VIII or IX. If it has an α-limit point, then

$$\lim_{\tau \to -\infty} (N_1 N_2 N_3)(\tau) = 0.$$


Proof. Let U of Lemma 5.1 be defined by the union of the sets Ni ≠ 0, i = 1, 2, 3, M by the constraint (11), and G by the function N1N2N3. Compute

$$(N_1 N_2 N_3)' = 3q\, N_1 N_2 N_3. \tag{14}$$

Consider a solution x of (9)-(11). We need to prove that G ◦ x is strictly monotone as long as x(τ) ∈ U ∩ M. By (14) the only problem that could occur is q = 0. However, q = 0 implies |Σ+′| + |Σ−′| > 0 by (9)-(11) so that G ◦ x has the desired property. If the sequence τk → −∞ yields the α-limit point we assume exists, then we conclude that (N1N2N3)(τk) → 0. Since N1N2N3 is monotone, we conclude that it converges to zero. ✷

One important consequence of this observation is the fact that all α-limit points of Bianchi VIII and IX solutions are of one of the lower Bianchi types. Since the α-limit set is invariant under the flow, it is thus of interest to know something about the α-limit sets of the lower Bianchi types, if one wants to prove the existence of an α-limit point on the Kasner circle for a Bianchi VIII or IX solution. Let us now analyze the vacuum type II orbits and define the Kasner map.

Proposition 5.1 A Bianchi II vacuum solution of (9)-(11) with N1 > 0 and N2 = N3 = 0 satisfies

$$\lim_{\tau \to \pm\infty} N_1 = 0. \tag{15}$$

The ω-limit set is a point in K1 and the α-limit set is a point on the Kasner circle, in the complement of the closure of K1.

Remark. K1 is the segment of the Kasner circle with Σ+ > 1/2, cf. Definition 6.1.

Proof. Using the constraint (11) we deduce that

$$\Sigma_+' = \frac{3}{2} N_1^2 (2 - \Sigma_+).$$

We wish to apply the monotonicity principle. There are three variables. Let U be defined by N1 > 0, M be defined by (11), and G(Σ+, Σ−, N1) = Σ+. We conclude that (15) is true as follows. Let τn → ∞. A subsequence yields an ω-limit point by (11). The monotonicity principle yields N1(τ_{n_k}) → 0 for the subsequence. The argument for the α-limit set is similar, and equation (15) follows. Combining this with the constraint, we deduce

$$\lim_{\tau \to \pm\infty} q = 2.$$

Using the monotonicity of Σ+ and the connectedness of the α-limit set, we conclude that (Σ+, Σ−) has to converge. As for the α-limit set, convergence to K1 is not allowed since N1′ < 0 close to K1. Convergence to one of the special points in the



closure of K1 is also forbidden, since Proposition 3.1 would imply N1 = 0 for the solution in that case. Assume now that (Σ+, Σ−) → (σ+, σ−) as τ → ∞. Compute

$$\left( \frac{\Sigma_-}{2 - \Sigma_+} \right)' = 0. \tag{16}$$

We get

$$\frac{\Sigma_-}{2 - \Sigma_+} = \frac{\sigma_-}{2 - \sigma_+} \tag{17}$$

for arbitrary (Σ+, Σ−) belonging to the solution. Since N1′ = (q − 4Σ+)N1 and N1 → 0, we have to have σ+ ≥ 1/2. If σ+ = 1/2, then σ− = ±√3/2. The two corresponding lines in the Σ+Σ−-plane, obtained by substituting (σ+, σ−) into (17), do not intersect any points interior to the Kasner circle. Therefore σ+ = 1/2 is not an allowed limit point, and the proposition follows. ✷

Observe that by (16), the projection of the solution to the Σ+Σ−-plane is a straight line. The orbits when N2 > 0 and when N3 > 0 are obtained by applying the symmetries. Figure 1 shows a sequence of vacuum type II orbits projected to the Σ+Σ−-plane. The first line, starting at the star, has N1 > 0, the second N3 > 0 and the third N2 > 0.

Definition 5.1 If x0 is a non-special point on the Kasner circle, then the Kasner map applied to x0 is defined to be the point x1 on the Kasner circle, with the property that there is a vacuum type II orbit with x0 as an ω-limit point and x1 as an α-limit point.
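Since (17) says that a vacuum type II orbit with N1 > 0 projects to a straight line through the point (2, 0), the Kasner map can be computed by intersecting that line with the unit circle. The following sketch makes this explicit; the closed formula, the vertex table and the function name are derived here for illustration and are assumptions, not taken from the paper. The vertices for N2 and N3 are obtained from (2, 0) by the rotations by 2π/3.

```python
import math

SQRT3 = math.sqrt(3.0)

# On the Kasner circle q = 2, so the coefficient of N_i in (9) equals
# 2 * (1 - v_i . Sigma) for the vertices v_i below; K_i is where v_i . Sigma > 1.
VERTICES = {1: (2.0, 0.0), 2: (-1.0, -SQRT3), 3: (-1.0, SQRT3)}

def kasner_map(sigma):
    """Return (i, x1): the index of the N_i that is switched on and the image point.

    sigma = (Sigma+, Sigma-) must be a non-special point of the unit circle.
    The image is the second intersection of the line through v_i and sigma
    with the unit circle.
    """
    sp, sm = sigma
    for i, (vx, vy) in VERTICES.items():
        dot = vx * sp + vy * sm
        if dot > 1.0:                       # sigma lies in K_i
            t = 3.0 / (5.0 - 2.0 * dot)     # 5 - 2*dot = |sigma - v_i|^2
            return i, (vx + t * (sp - vx), vy + t * (sm - vy))
    raise ValueError("special point: the Kasner map is not defined here")

if __name__ == "__main__":
    point = (math.cos(0.3), math.sin(0.3))
    for _ in range(5):                      # a few iterations, as in Figure 1
        i, point = kasner_map(point)
        print("N%d orbit ->" % i, point)
```

Iterating the function reproduces the kind of sequence shown in Figure 1, with the index telling which Ni is non-zero along each straight segment.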

6 Dependence on the shear variables

In several arguments, we will have control over the shear variables and the density parameter in some time interval, and it is of interest to know how the remaining variables behave in such situations. Consider for instance the expression multiplying N1 in the formula for N1′, see (9). It is given by q − 4Σ+ and equals zero when

$$\frac{1}{4}(3\gamma - 2)\Omega + (1 - \Sigma_+)^2 + \Sigma_-^2 = 1. \tag{18}$$

The set of points in ΩΣ+Σ−-space satisfying this equation is a paraboloid, and the intersection with Ω = 0 is the dashed circle shown in Figure 3. If (Ω, Σ+, Σ−) belongs to the interior of the paraboloid (18) with Ω ≥ 0, then |N1|′ will be negative, so that |N1| increases as we go backward. Outside of the paraboloid, |N1| decreases. The situation is similar for N2 and N3. Observe that the circle obtained by letting Ω = 0 in (18) intersects the Kasner circle in two special points. The same is true of the rotated circles corresponding to N2 and N3. It will be convenient to introduce notation for the points on the Kasner circle at which |Ni|′ is negative.


Figure 3: The circles mentioned in the text.

Definition 6.1 We let K1, K2 and K3 be the subsets of the Kasner circle where q − 4Σ+ < 0, q + 2Σ+ + 2√3Σ− < 0 and q + 2Σ+ − 2√3Σ− < 0 respectively.

Remark. On the Kasner circle, Ω = 0 so that q = 2(Σ+² + Σ−²) = 2 under the conditions of this definition.

It is also of interest to know when the derivatives of N2N3 and similar products are zero. Since (N2N3)′ = (2q + 4Σ+)N2N3, we consider the set on which q + 2Σ+ equals zero. This set is a paraboloid and is given by

$$\frac{1}{4}(3\gamma - 2)\Omega + \left(\Sigma_+ + \frac{1}{2}\right)^2 + \Sigma_-^2 = \frac{1}{4}.$$

The intersection with the plane Ω = 0 is the circle with radius 1/2 shown in Figure 3. Again, inside the paraboloid |N2N3| increases as we go backward, and outside it decreases. There are corresponding paraboloids for the products N1N2 and N1N3. Observe that in the non-vacuum case, it is harmless to introduce ω = Ω^{1/2} and then the paraboloids become half ellipsoids.

Proposition 6.1 Consider a Bianchi IX solution to (9)-(11) with 2/3 < γ < 2. If the solution has a non-special α-limit point x on the Kasner circle, then the closure of the vacuum type II orbit with x as an ω-limit point belongs to the α-limit set.

Remark. The same conclusion holds for a Bianchi type VII0 solution with N1 = 0, if it has an α-limit point in K2 or K3.

Proof. Assume the limit point lies in K1 with (Σ+, Σ−) = (σ+, σ−). There is a sequence τk → −∞, such that the solution evaluated at τk converges to the point



on the Kasner circle. There is a ball B_η(σ+, σ−) in the Σ+Σ−-plane, centered at this point, such that |N2|, |N3|, |N1N2|, |N1N3| and Ω all decay exponentially, at least as e^{ξτ} for some fixed ξ > 0, and N1 increases exponentially, at least as e^{−ξτ}, in the closure of this ball. There is a K such that (Σ+(τk), Σ−(τk)) ∈ B_η(σ+, σ−) for all k ≥ K. For each time we enter the ball, we must leave it, since if we stay in it to the past, N1 will grow to infinity whereas N2 and N3 will decay to zero, in violation of the constraint. Thus for each τk, k ≥ K, there is a tk ≤ τk corresponding to the first time we leave the ball, starting at τk and going backward. We may compute

$$\left( \frac{\Sigma_-}{2 - \Sigma_+} \right)' = h,$$

where

$$|h(\tau)| \le \epsilon_k e^{\xi(\tau - \tau_k)}$$

in [tk, τk] and ε_k → 0. Thus

$$\frac{\Sigma_-(\tau_k)}{2 - \Sigma_+(\tau_k)} - \frac{\Sigma_-(t_k)}{2 - \Sigma_+(t_k)} = \int_{t_k}^{\tau_k} h \, d\tau.$$

But

$$\left| \int_{t_k}^{\tau_k} h \, d\tau \right| \le \frac{\epsilon_k}{\xi},$$

and in consequence

$$\frac{\Sigma_-(\tau_k)}{2 - \Sigma_+(\tau_k)} - \frac{\Sigma_-(t_k)}{2 - \Sigma_+(t_k)} \to 0.$$

We thus get a type II vacuum limit point with N1 > 0, to which we may apply the flow, and deduce the conclusion of the lemma. The statement made in the remark follows in the same way. Observe that the only important thing was that the limit point was in K1 and N1 was non-zero for the solution. ✷
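The paraboloids of this section are just the zero sets of q − 4Σ+ and q + 2Σ+ written after completing squares. A small sketch makes the bookkeeping concrete; the function names and sample points are illustrative assumptions. It checks the two identities q − 4Σ+ = 2·(left side of (18) minus 1) and q + 2Σ+ = 2·(left side of the second paraboloid equation minus 1/4), and reports which quantities grow towards the past at a given point.

```python
def q_of(Om, Sp, Sm, gamma):
    """The quantity q from (10)."""
    return 0.5 * (3 * gamma - 2) * Om + 2 * (Sp**2 + Sm**2)

def paraboloid_18(Om, Sp, Sm, gamma):
    """Left-hand side of (18) minus 1; negative inside the paraboloid."""
    return 0.25 * (3 * gamma - 2) * Om + (1 - Sp)**2 + Sm**2 - 1

def paraboloid_n2n3(Om, Sp, Sm, gamma):
    """Defining function of the paraboloid q + 2*Sigma+ = 0; negative inside."""
    return 0.25 * (3 * gamma - 2) * Om + (Sp + 0.5)**2 + Sm**2 - 0.25

def grows_backward(Om, Sp, Sm, gamma):
    """Which of |N1| and |N2 N3| increase as tau decreases at this point."""
    q = q_of(Om, Sp, Sm, gamma)
    return {"N1": q - 4 * Sp < 0, "N2N3": 2 * q + 4 * Sp < 0}

if __name__ == "__main__":
    # A point of K1 on the Kasner circle: N1 grows towards the past.
    print(grows_backward(0.0, 1.0, 0.0, 1.0))
    # A point inside the small circle of radius 1/2: N2*N3 grows towards the past.
    print(grows_backward(0.0, -0.5, 0.0, 1.0))
```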

7 The stiff fluid case In this section we will assume Ω > 0 and γ = 2 for all solutions we consider. We begin by explaining the origin of the triangle shown in Figure 2. Then we analyze the type II orbits. They yield an analogue of the Kasner map, connecting two points inside the Kasner circle, and we state an analogue of Proposition 6.1 for this map. We then prove that Ω is bounded away from zero to the past. Only in the case of Bianchi IX is an argument required, but this result is the central part of the analysis of the stiff fluid case. A peculiarity of the equations then yields the conclusion that |N1 N2 | + |N2 N3 | + |N3 N1 | converges to zero exponentially. This proves that any solution is contained in a compact set to the past, and that all α-limit points are of type I or II. Another consequence is that Ω has to converge to a non-zero value; this requires a proof in the Bianchi IX case. Next one concludes that all Ni converge to zero, since if that were not the case, there would be an


α-limit point of type II to which one could apply the flow, obtaining α-limit points with different values of Ω. Then if a Bianchi IX solution had an α-limit point outside the triangle, one could apply the 'Kasner' map to such a point, obtaining an α-limit point with some Ni > 0. Finally, some technical arguments finish the analysis.

In the case of a stiff fluid, that is γ = 2, it is convenient to introduce ω = Ω^{1/2}. We then have, since 3γ − 2 = 4,

$$\omega' = -(2 - q)\omega. \tag{19}$$

The expression Ω + Σ+² + Σ−² turns into ω² + Σ+² + Σ−², and the ω, Σ+, Σ−-coordinates of the type I points obey

$$\omega^2 + \Sigma_+^2 + \Sigma_-^2 = 1, \qquad \omega \ge 0. \tag{20}$$

In the stiff fluid case, all the type I points are fixed points, and they play a role similar to that of the Kasner circle in the vacuum case. Let us make some observations. If N1 ≠ 0, then N1′ = 0 is equivalent to q − 4Σ+ = 0. Dividing by 2 and completing squares, we see that this condition is equivalent to

$$\omega^2 + (1 - \Sigma_+)^2 + \Sigma_-^2 = 1, \qquad \omega \ge 0. \tag{21}$$

By applying the symmetries, the conditions Ni ≠ 0, Ni′ = 0 are consequently all fulfilled precisely on half spheres of radii 1. Since |N1|′ < 0 corresponds to an increase in |N1| as we go backward, |N1| increases exponentially as we are inside the half sphere (21) and decreases exponentially as we are outside it. If one takes the intersection of (20) and (21), one gets the subset Σ+ = 1/2 of (20). The corresponding intersections for N2 and N3 yield two more lines in the Σ+Σ−-plane. Together they yield the triangle in Figure 2. Consequently, if (ω, Σ+, Σ−) is close to (20) and (Σ+, Σ−) is in the interior of the triangle, then all the Ni decay exponentially as τ → −∞. Let M1 be the subset of ωΣ+Σ−-space obeying (20) with ω > 0 and Σ+ > 1/2, and M2, M3 be the corresponding sets for N2 and N3. We also let L1 be the subset of the intersection between (20) and (21) with ω > 0, and correspondingly N2 and N3 yield L2 and L3.

Lemma 7.1 Consider a solution to (9)-(11) with γ = 2, N1 > 0, ω > 0 and N2 = N3 = 0. Then

$$\lim_{\tau \to \pm\infty} N_1(\tau) = 0 \tag{22}$$

and (ω, Σ+, Σ−) converges to a point, satisfying (20) and ω > 0, in the complement of L1 ∪ M1, as τ → −∞. In ωΣ+Σ−-space, the orbit of the solution is a straight line connecting two points satisfying (20). Furthermore, ω > 0 is strictly increasing along the solution, going backwards in time.



Proof. Since q < 2 for the entire solution, we can apply the monotonicity principle with U defined by q < 2, G defined by Σ+ and M by the constraint (11). If q does not converge to 2 as τ → −∞, we get an α-limit point with q < 2. We have a contradiction. This argument also yields the conclusion that N1 → 0 as τ → ∞. Equation (22) follows. Observe that

$$\Sigma_+' = \frac{3}{2} N_1^2 (2 - \Sigma_+), \qquad \Sigma_-' = -\frac{3}{2} N_1^2 \Sigma_- \tag{23}$$

and

$$\omega' = -\frac{3}{2} N_1^2 \omega. \tag{24}$$

Consequently, Σ+, Σ− and ω are all monotone so that they converge, both as τ → ∞ and as τ → −∞. It also follows from (23) and (24) that the quotients (2 − Σ+)/ω and Σ−/ω are constant. Thus the orbit in ωΣ+Σ−-space describes a straight line connecting two points satisfying (20). As τ → −∞, the solution cannot converge to a point in L1 ∪ M1 for the following reason. Assume it does. Since Σ+ decreases as τ decreases, see (23), we must have Σ+ ≥ 1/2 for the entire solution, since Σ+ by assumption converges to a value ≥ 1/2. But then N1′ < 0 for the entire solution by (9) and (11). Thus N1 increases as we go backward, contradicting the fact that N1 → 0. ✷

The next thing we wish to prove is that if a solution has an α-limit point x in the set M1, and N1 ≠ 0 for the solution, then we can apply the 'Kasner' map to that point. What we mean by that is that an entire type II orbit with x as an ω-limit point belongs to the α-limit set of the original solution. From this one can draw quite strong conclusions. Observe for instance that by (19), ω is monotone for a Bianchi VIII solution to (9)-(11). Thus ω converges as τ → −∞ since it is bounded. If the Bianchi VIII solution has an α-limit point of type I outside the triangle, we can apply the Kasner map to it to obtain α-limit points with different ω. But that is impossible.

Lemma 7.2 Consider a solution to (9)-(11) with γ = 2 such that N1 ≠ 0. Then if the solution has an α-limit point x ∈ M1, the orbit of a type II solution with x as an ω-limit point belongs to the α-limit set of the solution.

Proof. The proof is analogous to the proof of Proposition 6.1. ✷
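The relations (23) and (24), and the straight-line structure of the stiff fluid type II orbits, can be checked with a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, and the explicit formula for the 'Kasner' map is derived here from the constancy of (2 − Σ+)/ω and Σ−/ω, which places the orbit on a line through (ω, Σ+, Σ−) = (0, 2, 0); it is not quoted from the paper.

```python
import math

def stiff_type2_rhs(omega, Sp, Sm, N1):
    """(9) with gamma = 2 and N2 = N3 = 0, in terms of omega = Omega**0.5."""
    q = 2 * (omega**2 + Sp**2 + Sm**2)      # (10) with 3*gamma - 2 = 4
    S_plus = -N1**2                         # S+ from (10) when N2 = N3 = 0
    return (
        -(2 - q) * omega,                   # (19)
        -(2 - q) * Sp - 3 * S_plus,
        -(2 - q) * Sm,                      # S- vanishes when N2 = N3 = 0
        (q - 4 * Sp) * N1,
    )

def stiff_kasner_map(omega, Sp, Sm):
    """Second intersection with the sphere (20) of the line through (0, 2, 0).

    The input is assumed to satisfy (20) with Sigma+ > 1/2 (a point of M1);
    the output is the past endpoint of the type II orbit.
    """
    t = 3.0 / (5.0 - 4.0 * Sp)
    return (t * omega, 2.0 + t * (Sp - 2.0), t * Sm)

if __name__ == "__main__":
    # A point on the constraint surface of type II: omega solved from (11).
    Sp, Sm, N1 = 0.2, 0.3, 0.4
    omega = math.sqrt(1 - Sp**2 - Sm**2 - 0.75 * N1**2)
    print(stiff_type2_rhs(omega, Sp, Sm, N1))
    print(stiff_kasner_map(0.6, 0.8, 0.0))
```

In the second call the starting point (0.6, 0.8, 0) lies in M1, and the image has a larger ω and Σ+ < 1/2, in line with the statement of Lemma 7.1 that ω increases going backwards in time.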

Consider a solution such that ω > 0. We want to exclude the possibility that ω → 0 as τ → −∞. Considering (19), we see that the only possibility for ω to decrease is if q > 2. In that context, the following lemma is relevant.

Lemma 7.3 Consider a Bianchi IX solution to (9)-(11) with γ = 2. There is an α0 such that if α ≤ α0 and (N1N2N3)(τ) ≤ α, then q(τ) − 2 ≤ 4α^{1/3}.


Proof. By a permutation of the variables, we can assume N1 ≤ N2 ≤ N3 in τ. Observe that q − 2 ≤ 3N1(N2 + N3) by the constraint (11). If N3 ≤ α^{1/2} in τ, we get q − 2 ≤ 6α ≤ 4α^{1/3} if α0 is small enough. If N3 ≥ α^{1/2} in τ, we get N1N2 ≤ α^{1/2}. Assume, in order to reach a contradiction, (N1N3)(τ) ≥ α^{1/3}. Then N2(τ) ≤ α^{2/3}, so that N1(τ) ≤ α^{2/3} and N3(τ) ≥ α^{−1/3}. By Lemma 3.3 we get a contradiction if α0 is small enough. Thus q − 2 ≤ 3(N1N2 + N1N3)(τ) ≤ 3(α^{1/3} + α^{1/2}) ≤ 4α^{1/3} if α0 is small enough. ✷

For all solutions except those of Bianchi IX type, ω is monotone increasing as τ decreases. Thus, ω is greater than zero on the α-limit set of any non-vacuum solution which is not of type IX. It turns out that the same is true for a Bianchi IX solution.

Lemma 7.4 Consider a Bianchi IX solution to (9)-(11) with γ = 2 such that ω > 0. Then there is an ε > 0 such that ω(τ) ≥ ε for all τ ≤ 0.

Proof. Assume all the Ni are positive. The function

$$\varphi = \frac{(N_1 N_2 N_3)^{1/3}}{\omega}$$

satisfies φ′ = 2φ. Thus, for τ ≤ 0,

$$(N_1 N_2 N_3)^{1/3}(\tau) = \omega(\tau)\varphi(0)e^{2\tau} \le C e^{2\tau},$$

because of Lemma 3.3. For τ ≤ T ≤ 0, we can thus apply Lemma 7.3, so that for τ ≤ T,

$$\int_\tau^0 (q(s) - 2)\,ds = \int_\tau^T (q(s) - 2)\,ds + \int_T^0 (q(s) - 2)\,ds \le 4C \int_\tau^T e^{2s}\,ds + \int_T^0 (q(s) - 2)\,ds \le 2Ce^{2T} + \int_T^0 (q(s) - 2)\,ds \le C' < \infty.$$

Consequently,

$$\omega(\tau) = \omega(0) \exp\left(-\int_\tau^0 (q(s) - 2)\,ds\right) \ge \omega(0)e^{-C'}$$

for τ ≤ T, and the lemma follows. ✷

The next lemma will be used to prove that ω converges for a Bianchi IX solution.


Lemma 7.5 Consider a solution to (9)-(11) with γ = 2 and ω > 0. Then there is an α > 0 and a T such that |N1N2| + |N2N3| + |N3N1| ≤ e^{ατ} for all τ ≤ T.

Proof. Consider g = |N2N3|/ω. Then
$$g' = (2\omega^2 + 2(1+\Sigma_+)^2 + 2\Sigma_-^2)\,g.$$
Since ω(τ) ≥ ε for all τ ≤ 0, we conclude that g(τ) ≤ g(0) exp(2ε²τ), so that |(N2N3)(τ)| ≤ g(0)ω(τ) exp(2ε²τ). There are similar estimates for the other products. By Lemma 3.3, we know that ω is bounded in (−∞, 0], so that by choosing α = ε² and T negative enough the lemma follows. ✷

Corollary 7.1 Consider a solution to (9)-(11) with γ = 2 and ω > 0. Then (ω, Σ+, Σ−, N1, N2, N3)(−∞, 0] is contained in a compact set and all the α-limit points are of type I or II.

Lemma 7.6 Consider a solution to (9)-(11) with γ = 2 and ω > 0. Then
$$\lim_{\tau\to-\infty}\omega(\tau) = \omega_0 > 0.$$

Proof. Since this follows from the monotonicity of ω in all cases except Bianchi IX, see (19), we assume that the solution is of type IX. Let τk → −∞ be a sequence such that ω(τk) → ω1 > 0. This is possible since ω is constrained to belong to a compact set for τ ≤ 0 by Lemma 3.3, and since ω is bounded away from zero to the past by Lemma 7.4. Assume ω does not converge to ω1. Then there is a sequence sk → −∞ such that ω(sk) → ω2, where we can assume ω2 > ω1. We can also assume τk ≤ sk. Then
$$\omega(s_k) = \exp\Big(\int_{\tau_k}^{s_k}(q-2)\,ds\Big)\,\omega(\tau_k).$$
Since q − 2 ≤ 3(N1N2 + N2N3 + N3N1) ≤ 3e^{ατ} for τ ≤ T by Lemma 7.5 and the constraint (11), we have, assuming sk ≤ T,
$$\int_{\tau_k}^{s_k}(q-2)\,ds \le 3\int_{\tau_k}^{s_k} e^{\alpha\tau}\,d\tau \le \frac{3}{\alpha}\,e^{\alpha s_k}.$$
Thus
$$\omega(s_k) \le \exp\Big(\frac{3}{\alpha}\,e^{\alpha s_k}\Big)\,\omega(\tau_k) \to \omega_1,$$
so that ω2 ≤ ω1, contradicting our assumption. ✷


Corollary 7.2 Consider a solution to (9)-(11) with γ = 2 and ω > 0. Then
$$\lim_{\tau\to-\infty} N_i(\tau) = 0$$
for i = 1, 2, 3.

Proof. Assume N1 does not converge to zero. Then there is a type II α-limit point with N1 and ω non-zero by Corollary 7.1 and Lemma 7.6. If we apply the flow, we get α-limit points with different ω, in contradiction to Lemma 7.6. ✷

Lemma 7.7 Consider a solution to (9)-(11) with γ = 2 and ω > 0. If it has an α-limit point of type I inside the triangle, the solution converges to that point.

Proof. Let x be the limit point. Let B be a ball of radius ε in ωΣ+Σ−-space, with center given by the ω, Σ+, Σ−-coordinates of x. Let τk → −∞ be a sequence that yields x. Assume the solution leaves B to the past of every τk. Then there is a sequence sk → −∞, such that the ω, Σ+, Σ−-coordinates of the solution evaluated in sk converge to a point on the boundary of B, sk ≤ τk, and the ω, Σ+, Σ−-coordinates of the solution are contained in B during [sk, τk], k large enough. Since all expressions in the Ni decay exponentially as e^{ατ}, for some α > 0, as long as the ω, Σ+, Σ−-coordinates are in B (ε small enough), we have
$$|\Sigma_+'| + |\Sigma_-'| + |\omega'| \le \alpha_k\,e^{\alpha(\tau-\tau_k)}$$
for τ ∈ [sk, τk], where αk → 0. We get
$$|\Sigma_+(\tau_k) - \Sigma_+(s_k)| \le \frac{\alpha_k}{\alpha} \to 0,$$

and similarly for Σ− and ω. The assumption that we always leave B consequently yields a contradiction. We must thus converge to the given α-limit point. ✷ Proposition 7.1 Consider a solution to (9)-(11) with γ = 2 and ω > 0. If Ni is non-zero for the solution, it converges to a type I point in the complement of Mi with ω > 0. Proof. If there is an α-limit point on Mi , we can use Lemma 7.2 to obtain a contradiction to Lemma 7.6. If there is an α-limit point in Mk and Nk is zero for the solution, the solution converges to that point by an argument similar to the one given in the previous lemma. What remains is the possibility that all the α-limit points are on the Lk . Since ω converges, the possible points projected to the Σ+ Σ− -plane are the intersection between a triangle and a circle. Since the α-limit set is connected, we conclude that the solution must converge to a point on one of the Lk . ✷ Proposition 7.2 Consider a solution to (9)-(11) with γ = 2 and ω > 0. If Ni is non-zero for the solution, the solution cannot converge to a point in Li .


Proof. Assume i = 1. Then Li is the subset of (20) consisting of points with Σ+ = 1/2 and ω > 0. Since N2, N3, N2N3, N2N1 and N3N1 converge to zero faster than N1², Σ+′ will in the end be positive, cf. (23), so that there is a T such that Σ+(τ) ≥ 1/2 for τ ≤ T. Since N1 will dominate in the end, we can also assume q(τ) < 2 for τ ≤ T. By (9) we conclude that |N1| increases backward for τ ≤ T, contradicting Corollary 7.2. ✷

Combining the last two propositions, we conclude that the Σ+Σ−-variables of Bianchi VIII and IX solutions converge to a point interior to the triangle of Figure 2, and Ω to the value then determined by the constraint (11). In the Bianchi VII0 case, a side of the triangle disappears, increasing the set of points to which Σ+, Σ− may converge. We sum up the conclusions in Section 19.

8 Type I solutions

From now on we consider the non-stiff fluid case. Consider type I solutions (Ni = 0). The point F and the points on the Kasner circle are fixed points. Consider a solution with 0 < Ω(τ0) < 1. Using the constraint, we may express the time derivative of Ω in terms of Ω. Solving the resulting equation yields
$$\lim_{\tau\to-\infty}\Omega(\tau) = 0, \qquad \lim_{\tau\to\infty}\Omega(\tau) = 1.$$
By (9), (Σ+, Σ−) moves radially.

Proposition 8.1 For a type I solution, with 2/3 < γ < 2, which is not F, we have
$$\lim_{\tau\to-\infty}(\Sigma_+,\Sigma_-,\Omega)(\tau) = (\sigma_+/|\sigma|,\ \sigma_-/|\sigma|,\ 0),$$
where (σ+, σ−) is the initial value of (Σ+, Σ−), and |σ| is the Euclidean norm of the initial value.
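The statement of Proposition 8.1 can be checked numerically. The sketch below is only an illustration: it assumes that (9)-(11), which are not restated in this section, have the standard Wainwright-Hsu form, so that for Ni = 0 one has Σ±′ = −(2 − q)Σ±, Ω′ = [2q − (3γ − 2)]Ω with q = (3γ − 2)Ω/2 + 2(Σ+² + Σ−²), and the constraint Ω + Σ+² + Σ−² = 1. The function names, the initial data and the step size are hypothetical choices made for the example.

```python
import math

def type_one_rhs(state, gamma):
    """Right-hand side of the assumed type I system (all N_i = 0)."""
    sp, sm, om = state
    q = 0.5 * (3.0 * gamma - 2.0) * om + 2.0 * (sp * sp + sm * sm)
    return (-(2.0 - q) * sp,
            -(2.0 - q) * sm,
            (2.0 * q - (3.0 * gamma - 2.0)) * om)

def rk4(rhs, state, tau_end, steps, gamma):
    """Classical Runge-Kutta from tau = 0 to tau_end (tau_end may be negative)."""
    h = tau_end / steps
    for _ in range(steps):
        k1 = rhs(state, gamma)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), gamma)
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), gamma)
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), gamma)
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def type_one_past_limit(sigma_plus, sigma_minus, gamma=1.0,
                        tau_end=-20.0, steps=20000):
    """Integrate backward from data on the constraint surface."""
    omega0 = 1.0 - sigma_plus ** 2 - sigma_minus ** 2  # constraint with N_i = 0
    start = (sigma_plus, sigma_minus, omega0)
    return rk4(type_one_rhs, start, tau_end, steps, gamma)

sp, sm, om = type_one_past_limit(0.3, 0.4)
norm = math.hypot(0.3, 0.4)
# Compare the state at tau = -20 with the limit predicted by Proposition 8.1.
print((sp, sm, om), (0.3 / norm, 0.4 / norm, 0.0))
```

Under the same assumption the constraint reduces the equation for Ω to Ω′ = 3(2 − γ)(1 − Ω)Ω, which is why Ω tends to 0 to the past and to 1 to the future.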

9 Type II solutions

An analysis of the type II solutions can be found in e.g. [21], but we include a proof for completeness.

Proposition 9.1 Consider a type II solution with N1 > 0 and 2/3 < γ < 2. If the initial value for Σ− is non-zero, the α-limit set is a point in K2 ∪ K3. If the initial value for Σ− is zero, either the solution is the special point P1+(II), it is contained in FII, or
$$\lim_{\tau\to-\infty}(\Omega,\Sigma_+,N_1)(\tau) = (0,-1,0). \tag{25}$$

Proof. Let the initial data be given by (σ+ , σ− , Ω0 ). The vacuum case was handled in Proposition 5.1, so we will assume Ω0 > 0.


Consider first the case σ− ≠ 0. Compute
$$q - 2 = -\tfrac{3}{2}(2-\gamma)\Omega - \tfrac{3}{2}N_1^2.$$
Thus, Σ− decreases if it is negative, and increases if it is positive, as we go backward in time, by (9). Thus, both N1 and Ω must converge to 0 as τ → −∞, since the variables are constrained to belong to a compact set, and because of the monotonicity principle. Since Σ− is monotone and the α-limit set is connected, see Lemma 3.1, (Σ+, Σ−) must converge to a point, say (s+, s−), on the Kasner circle. We must have s− ≠ 0, and 2s+² + 2s−² − 4s+ ≥ 0, since N1 converges to 0. There are two special points in this set, but we may not converge to them, since that would imply N1 = 0 for the entire solution by Proposition 3.1. The first part of the proposition follows.

Consider the case σ− = 0. There is a fixed point P1+(II). Eliminating Ω from (9)-(11), we are left with the two variables N1 and Σ+. The linearization has negative eigenvalues at P1+(II), so that no solution which does not equal P1+(II) can have it as an α-limit point, cf. [12] pp. 228-234. There is also a set of solutions converging to the fixed point F. Consider now the complement of the above. The function
$$Z_7 = \frac{N_1^{2m}\,\Omega^{1-m}}{(1 - v\Sigma_+)^2},$$
where v = (3γ − 2)/8 and m = 3v(2 − γ)/(8(1 − v²)), found by Uggla, satisfies
$$Z_7' = \frac{3(2-\gamma)}{1 - v\Sigma_+}\,\frac{1}{1-v^2}\,(\Sigma_+ - v)^2\,Z_7.$$
Apply the monotonicity principle. Let G = Z7 and U be defined as the subset of ΩΣ+N1-space consisting of points different from P1+(II), which have Ω > 0, N1 > 0 and |Σ+| < 1. Let M be defined by the constraint. If Σ+ = v then Z7′ = 0, but if we are not at P1+(II), Σ+ = v implies Σ+′ ≠ 0. Thus, G ◦ x is strictly monotone as long as x is contained in U ∩ M. Since the solution cannot have P1+(II) as an α-limit point, we must thus have N1 = 0 or Ω = 0 in the α-limit set. Observe that
$$\Sigma_+' = \tfrac{3}{2}N_1^2(2-\Sigma_+) - \tfrac{3}{2}(2-\gamma)\Omega\Sigma_+. \tag{26}$$
Thus, if the solution attains a point Σ+ ≤ 0, then (25) holds. We will now prove that this is the only possibility.

a. Assume we have an α-limit point with N1 > 0 and Ω = 0. Then we may apply the flow to that limit point to get Σ+ = −1 as a limit point, but then the solution must attain Σ+ ≤ 0.

b. If Ω > 0 but N1 = 0, then we may assume Σ+ ≠ 0 since we are not on FII, cf. Lemma 4.2. Apply the flow to arrive at Σ+ = −1 or Σ+ = 1. The former alternative has been dealt with, and the latter case allows us to construct an α-limit point with N1 > 0 and Ω = 0, since N1 increases exponentially, and Ω decreases exponentially, in a neighbourhood of the point on the Kasner circle with Σ+ = 1, cf. Proposition 6.1.

c. The situation Ω = N1 = 0 can be handled as above. ✷

We make one more observation that will be relevant in analyzing the regularity of FII. Let A be the vacuum type I and II points.

Lemma 9.1 The closure of FII does not intersect A.

Proof. Assume there is a sequence xk ∈ FII such that the distance from xk to A goes to zero. We can assume that all the xk have N1 > 0 by choosing a suitable subsequence and then applying the symmetries. We can also assume that xk → x ∈ A. Since Σ− = 0 for all the xk by Proposition 9.1, the same holds for x. Observe that no element of FII can have Σ+ ≤ 0, because of (26). If N1 corresponding to x is zero, we then conclude that x is defined by Σ+ = 1 and all the other variables zero. Applying the flow to the past to the points xk will then yield a sequence yk ∈ FII such that yk converges to a type II vacuum point with N1 > 0 and Σ− = 0, cf. the proof of Proposition 6.1. Thus, we can assume that the limit point x ∈ A has N1 > 0. Applying the flow to x yields the point Σ+ = −1 on the Kasner circle by Proposition 5.1. By the continuity of the flow, we can apply the flow to xk to obtain elements in FII with Σ+ < 0, which is impossible. ✷

10 Type VII0 solutions

Claes Uggla and John Wainwright have analyzed the α-limit set of Bianchi VII0 solutions, but as they have not published their results, we include an analysis. When speaking of Bianchi VII0 solutions, we will always assume N1 = 0 and N2, N3 > 0. Consider first the case N2 = N3 and Σ− = 0.

Proposition 10.1 Consider a type VII0 solution with N1 = 0 and 2/3 < γ < 2. If N2 = N3 and Σ− = 0, one of the following possibilities occurs.

1. The solution converges to Σ+ = 1 on the Kasner circle.
2. The solution converges to F.
3. lim_{τ→−∞} Σ+ = −1, lim_{τ→−∞} N2 = n2 > 0, lim_{τ→−∞} Ω = 0.

Proof. Since
$$\Sigma_+' = -\tfrac{3}{2}(2-\gamma)\Omega\Sigma_+$$
if N2 = N3, the conclusions of the proposition follow, except for the statement that N2 converges to a non-zero value if Σ+ converges to −1. However, Ω will decay to zero exponentially close to the Kasner circle, and by the constraint, 1 + Σ+ will behave as Ω close to Σ+ = −1. Thus, q + 2Σ+ will be integrable. ✷



Before we state a proposition concerning the behaviour of generic Bianchi VII0 solutions, let us give an intuitive picture. Figure 4 shows a simulation with γ = 1, where the plus sign represents the starting point, and the star the end point, going backward. Ω will decay to zero quite rapidly, and the same holds for the product N2N3. In that sense, the solution will asymptotically behave like a sequence of type II vacuum orbits. If both N2 and N3 are small, and we are close to the section K2 on the Kasner circle, then N2 will increase exponentially, and N3 will decay exponentially, yielding in the end roughly a type II orbit with N2 > 0. If this orbit ends at a point in K3, then the game begins anew, and we get roughly a type II orbit with N3 > 0. Observe however that if we get close to K1, there is nothing to make us bounce away, since N1 is zero. The simulation illustrates this behaviour. Consider the figure of the solution projected to the Σ+Σ−-plane. The three points that appear to be on the Kasner circle are close to K2, K3 and K1 respectively. Observe how this correlates with the graphs of N2, N3 and q.
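A simulation of this kind can be sketched as follows. The code is an illustration only and is not the computation behind Figure 4: it assumes that (9)-(11) have the standard Wainwright-Hsu form written out in the comments, and the initial data, step size and function names are hypothetical. It integrates a generic Bianchi VII0 solution backward in time, and monitors the constraint and the quotient ((4/3)Σ−² + (N2 − N3)²)/(N2N3), which should grow toward the past.

```python
import math

SQRT3 = math.sqrt(3.0)

def wainwright_hsu_rhs(state, gamma):
    """Assumed form of (9)-(10); state = (Omega, Sigma+, Sigma-, N1, N2, N3)."""
    om, sp, sm, n1, n2, n3 = state
    q = 0.5 * (3.0 * gamma - 2.0) * om + 2.0 * (sp * sp + sm * sm)
    s_plus = 0.5 * ((n2 - n3) ** 2 - n1 * (2.0 * n1 - n2 - n3))
    s_minus = 0.5 * SQRT3 * (n3 - n2) * (n1 - n2 - n3)
    return (
        (2.0 * q - (3.0 * gamma - 2.0)) * om,   # Omega'
        -(2.0 - q) * sp - 3.0 * s_plus,         # Sigma+'
        -(2.0 - q) * sm - 3.0 * s_minus,        # Sigma-'
        (q - 4.0 * sp) * n1,                    # N1'
        (q + 2.0 * sp + 2.0 * SQRT3 * sm) * n2, # N2'
        (q + 2.0 * sp - 2.0 * SQRT3 * sm) * n3, # N3'
    )

def constraint(state):
    """Left-hand side of the assumed constraint (11); equals 1 on solutions."""
    om, sp, sm, n1, n2, n3 = state
    return om + sp * sp + sm * sm + 0.75 * (
        n1 * n1 + n2 * n2 + n3 * n3 - 2.0 * (n1 * n2 + n2 * n3 + n3 * n1))

def z_minus_one(state):
    """The quotient ((4/3) Sigma-^2 + (N2 - N3)^2) / (N2 N3)."""
    _, _, sm, _, n2, n3 = state
    return ((4.0 / 3.0) * sm * sm + (n2 - n3) ** 2) / (n2 * n3)

def integrate(state, gamma, tau_end, steps):
    """Classical Runge-Kutta from tau = 0 to tau_end (negative: backward)."""
    h = tau_end / steps
    for _ in range(steps):
        k1 = wainwright_hsu_rhs(state, gamma)
        k2 = wainwright_hsu_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), gamma)
        k3 = wainwright_hsu_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), gamma)
        k4 = wainwright_hsu_rhs(tuple(s + h * k for s, k in zip(state, k3)), gamma)
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

# Hypothetical generic VII0 data: N1 = 0, N2 != N3, Sigma- != 0.
sp0, sm0, n2_0, n3_0 = 0.0, 0.3, 0.6, 0.2
om0 = 1.0 - sp0 ** 2 - sm0 ** 2 - 0.75 * (n2_0 - n3_0) ** 2  # from the constraint
start = (om0, sp0, sm0, 0.0, n2_0, n3_0)
end = integrate(start, gamma=1.0, tau_end=-20.0, steps=40000)
print("state at tau = -20:", end)
print("constraint drift:", abs(constraint(end) - 1.0))
print("Z_{-1} at start and end:", z_minus_one(start), z_minus_one(end))
```

Plotting the intermediate states instead of keeping only the last one would give pictures of the kind described above; that part is omitted to keep the sketch free of plotting dependencies.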

[Figure 4: Illustration of a Bianchi VII0 solution. Panels: the solution projected to the Σ+Σ−-plane, and q, N2 and N3 plotted against −τ.]

Proposition 10.2 Generic Bianchi VII0 solutions with N1 = 0 and 2/3 < γ < 2 converge to a point in K1.


We divide the proof into lemmas. First we prove that the past dynamics are contained in a compact set.

Lemma 10.1 For a generic Bianchi VII0 solution with N1 = 0 and 2/3 < γ < 2, (N2, N3)(−∞, 0] is contained in a compact set.

Proof. For a generic solution,
$$Z_{-1} = \frac{\tfrac{4}{3}\Sigma_-^2 + (N_2 - N_3)^2}{N_2 N_3}$$
is never zero. Compute
$$Z_{-1}' = -\frac{16}{3}\,\frac{\Sigma_-^2(1+\Sigma_+)}{\tfrac{4}{3}\Sigma_-^2 + (N_2-N_3)^2}\,Z_{-1}. \tag{27}$$
The proof that the past dynamics are contained in a compact set is as in Rendall [18]. Let τ ≤ 0. Then Z−1(τ) ≥ Z−1(0), so that
$$(N_2 N_3)(\tau) \le \frac{4}{3Z_{-1}(0)}.$$
Combining this fact with the constraint, we see that all the variables are contained in a compact set during (−∞, 0]. ✷

We now prove that N2N3 → 0. The reason is the desire to reduce the problem by proving that all the limit points are of type I or II, and then use our knowledge about what happens when we apply the flow to such points.

Lemma 10.2 Generic Bianchi VII0 solutions with N1 = 0 and 2/3 < γ < 2 satisfy
$$\lim_{\tau\to-\infty}(N_2 N_3)(\tau) = 0.$$

Proof. Assume the contrary. Then we can use Lemma 10.1 to construct an α-limit point (ω, σ+, σ−, 0, n2, n3) where n2n3 > 0. We apply the monotonicity principle in order to arrive at a contradiction. With notation as in Lemma 5.1, let U be defined by N2 > 0, N3 > 0 and Σ−² + (N2 − N3)² > 0. Let G be defined by Z−1, and M by the constraint (11). We have to show that G evaluated on a solution is strictly monotone as long as the solution is contained in U ∩ M. Consider (27). By the constraint (11), Σ−² + (N2 − N3)² > 0 implies Σ+ > −1. Furthermore, Z−1 > 0 on U. If Z−1′ = 0 in U ∩ M, we thus have Σ− = 0, but then Σ−′ ≠ 0 since Σ−² + (N2 − N3)² > 0 and N2 + N3 > 0. The α-limit point we have constructed cannot belong to U. On the other hand, n2, n3 > 0 and since Z−1 increases as we go backward, σ−² + (n2 − n3)² cannot be zero. We have a contradiction. ✷

Proof of Proposition 10.2. Note that
$$\Sigma_+' = -(2 - 2\Omega - 2\Sigma_+^2 - 2\Sigma_-^2)(1+\Sigma_+) - \tfrac{3}{2}(2-\gamma)\Omega\Sigma_+ \tag{28}$$


by (12). Assume we are not on PVII0 or FVII0. Let us first prove that there is an α-limit point on the Kasner circle. Assume F is an α-limit point. Then we may construct a type I limit point which is not F, and thus a limit point on the Kasner circle, cf. Lemma 4.2 and Proposition 8.1. By Lemma 10.2, we may then assume that there is a limit point of type I or II, which is not P2+(II) or P3+(II), and does not lie in FI or FII, cf. Lemma 4.1. Thus, we get a limit point on the Kasner circle by Proposition 8.1 and Proposition 9.1.

Next, we prove that there has to be an α-limit point which lies in the closure of K1. If the α-limit point we have constructed is in K2 or K3, we can apply the Kasner map according to the remark following Proposition 6.1. After a finite number of Kasner iterates we will end up in the desired set. If the α-limit point we obtained has Σ+ = −1, we may construct a limit point with 1 + Σ+ = ε > 0 by Proposition 3.1. We can also assume that Ω = 0 for this point, since Ω decays exponentially going backward when Σ+ is close to −1. By Lemma 10.2, this limit point will be a type I or II vacuum point, and by applying the flow we get a non-special limit point on the Kasner circle. As above, we then get an α-limit point in the desired set.

Let the Σ+Σ−-variables of one α-limit point in the closure of K1 be (σ+, σ−). By (28), we conclude that once Σ+ has become greater than 0, it becomes monotone so that it has to converge. Moreover, we see by the same equation that Ω then has to converge to zero, and Σ+² + Σ−² has to converge to 1. Since the α-limit set is connected, by Lemma 3.1 and Lemma 10.1, we conclude that (Σ+, Σ−) has to converge to (σ+, σ−). By Proposition 3.1, (σ+, σ−) cannot equal (1/2, ±√3/2), since otherwise N2 or N3 would be zero for the entire solution. Consequently, σ+ > 1/2, and we conclude that N2 and N3 have to converge to zero. The proposition follows. ✷

11 Taub type IX solutions

Consider the Taub type solutions: Σ− = 0 and N2 = N3. We prove that except for the cases when the solution belongs to FIX or PIX, (Σ+, Σ−) converges to (−1, 0).

Lemma 11.1 Consider a type IX solution with Σ− = 0, N2 = N3 and 2/3 < γ < 2. Then Σ+(τ0) ≤ 0 and Ω(τ0) < 1 imply
$$\lim_{\tau\to-\infty}(\Omega,\Sigma_+,\Sigma_-,N_1,N_2,N_3)(\tau) = (0,-1,0,0,n_2,n_2),$$
where 0 < n2 < ∞.

Proof. We prove that the flow will take us to the boundary of the parabola Ω + Σ+² = 1 with Σ+ < 0, and that we will then slide down the side on the outside to reach Σ+ = −1, see Figure 5. The plus sign in the figure represents the starting point, and the star the end point.

[Figure 5: Part of a Taub type IX solution projected to the Σ+Ω-plane.]

1. Let us first assume Σ+(τ0) ≤ 0, Ω(τ0) < 1 and Ω(τ0) + Σ+²(τ0) ≥ 1. Consider

C = {τ ≤ τ0 : t ∈ [τ, τ0] ⇒ Σ+(t) ≤ 0, Ω(t) ≤ Ω(τ0), Ω(t) + Σ+²(t) ≥ 1}.

We prove that C is not bounded from below. Assume the contrary. Let t be the infimum of C, which exists since C is non-empty and bounded from below. Since t ∈ C, Σ+(t) < 0. Let t′ < t be such that Σ+ < 0 in [t′, t]. Observe that
$$\Omega' = [(3\gamma-2)(\Omega + \Sigma_+^2 - 1) + 3(2-\gamma)\Sigma_+^2]\,\Omega. \tag{29}$$
By the constraint,
$$\Omega + \Sigma_+^2 - 1 = \frac{3}{4}N_1^2\Big(4\frac{N_2}{N_1} - 1\Big). \tag{30}$$
Since Σ+ < 0 in [t′, t], N2/N1 increases as we go backward in that interval, because of
$$\Big(\frac{N_2}{N_1}\Big)' = 6\Sigma_+\frac{N_2}{N_1}.$$
Consequently Ω + Σ+² ≥ 1 in [t′, t], by (30), so that Ω decreases in the interval by (29). Thus t′ ∈ C, contradicting the fact that t is the infimum of C. Let τ ≤ τ0. Then Σ+(τ) ≤ −(1 − Ω(τ0))^{1/2}. By (29), we then conclude Ω → 0. By (9), we also conclude that N1N2 → 0 and N1 → 0. By (30), we have Σ+ → −1. Using the constraint (30) and (10), we conclude that q + 2Σ+ is integrable, so that N2 = N3 will converge to a finite non-zero value.

2. Assume now Σ+(τ0) ≤ 0, Ω(τ0) < 1 and Ω(τ0) + Σ+²(τ0) < 1. Observe that
$$\Sigma_+' = (1 - \Omega - \Sigma_+^2)(4 - 2\Sigma_+) - \tfrac{3}{2}(2-\gamma)\Omega\Sigma_+ + 9N_1N_2. \tag{31}$$
As long as Ω + Σ+² < 1, Σ+ decreases as we go backward in time by (31). Then N2/N1 will increase exponentially until Ω + Σ+² = 1, by the constraint, and Σ+ < 0. ✷

Lemma 11.2 Consider a type IX solution with Σ− = 0, N2 = N3 and 2/3 < γ < 2. It is contained in a compact set for τ ≤ 0 and N1N2 → 0.

Proof. Note that N1 must be bounded for τ ≤ 0, as follows from Lemma 3.3, the fact that N2 = N3, and the fact that N1N2N3 decreases backward in time. To prove the first statement, assume the contrary. Then there is a sequence τk → −∞ such that N2(τk) → ∞. We can assume N2′(τk) ≤ 0, and thus
$$\tfrac{1}{2}(3\gamma-2)\Omega + 2\Sigma_+^2 + 2\Sigma_+ \le 0 \tag{32}$$
in τk. Since N1N2² is decreasing as we go backward, N1 and N1N2 evaluated at τk must go to zero. Thus Ω + Σ+² − 1 will become arbitrarily small in τk by (30). If Ω(τk) ≥ 1 for all k, we get
$$\Sigma_+(\tau_k) \le -\tfrac{1}{4}(3\gamma-2)$$
by (32), so that
$$\Sigma_+^2(\tau_k) + \Omega(\tau_k) \ge 1 + \tfrac{1}{16}(3\gamma-2)^2,$$
which is a contradiction. In other words, there is a k such that Σ+(τk) ≤ 0, by (32), and Ω(τk) < 1. We can then use Lemma 11.1 to arrive at a contradiction to the assumption that the solution is not contained in a compact set. To prove the second part of the lemma, observe that N1N2² converges to zero, as follows from the existence of an α-limit point and Lemma 5.2. Thus
$$N_1N_2 = N_1^{1/2}\,[N_1N_2^2]^{1/2} \le C\,[N_1N_2^2]^{1/2} \to 0.$$
✷

Proposition 11.1 For a type IX solution with Σ− = 0, N2 = N3 and 2/3 < γ < 2, either the solution is contained in FIX or PIX, or
$$\lim_{\tau\to-\infty}(\Omega,\Sigma_+,\Sigma_-,N_1,N_2,N_3)(\tau) = (0,-1,0,0,n_2,n_2)$$
where 0 < n2 < ∞.


Remark. Compare with Proposition 3.1. Observe also that when Σ+ for the solution converges to −1, we approach Σ+ = −1, Ω = 0 from outside the parabola Ω + Σ+² = 1, as follows from the proof of Lemma 11.1.

Proof. Consider a solution which is not contained in FIX or PIX. By Lemma 11.2, there is an α-limit point with N1N2 = 0. We can assume it is not P1+(II). We have the following possibilities.

1. It is contained in FI ∪ FII ∪ FVII0. Then F is an α-limit point. Since the solution is not contained in FIX, we get a type I limit point which is not F, by Lemma 4.2, and thus either Σ+ = −1 or Σ+ = 1 as limit points, by Proposition 8.1. The first alternative implies convergence to Σ+ = −1, by Lemma 11.1. If we have a type I α-limit point with Σ+ = 1, we can apply the Kasner map by Proposition 6.1 in order to obtain a type I limit point with Σ+ = −1.

2. The limit point is of type I. This possibility can be dealt with as above.

3. It is of type II. We can assume that it is not P1+(II), by Lemma 4.1, and that it is not contained in FII. Thus we get Σ+ = −1 on the Kasner circle as an α-limit point, by Proposition 9.1, and thus as above convergence to Σ+ = −1.

4. The limit point is of type VII0. We can assume Σ+ ≠ 0. If Σ+ < 0, we can apply Lemma 11.1 again, and if Σ+ > 0, we get Σ+ = 1 on the Kasner circle as an α-limit point, by Proposition 10.1, a case which can be dealt with as above. ✷

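12 Oscillatory behaviour

It will be necessary to consider Bianchi IX solutions to (9)-(11) under circumstances such that the behaviour is oscillatory. This section provides the technical tools needed. Let g be a smooth function,
$$A = \begin{pmatrix} 0 & g \\ -g & 0 \end{pmatrix}, \tag{33}$$
and let $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y})^t$ satisfy
$$\tilde{\mathbf{x}}' = A\tilde{\mathbf{x}} + \epsilon,$$
where ε is some vector valued function.

Lemma 12.1 Let φ0 be such that (sin(φ0), cos(φ0)) and (x̃(τ0), ỹ(τ0)) are parallel. Define
$$\xi(\tau) = \int_{\tau_0}^{\tau} g(s)\,ds + \phi_0 \tag{34}$$
and
$$\mathbf{x}(\tau) = \begin{pmatrix} x(\tau) \\ y(\tau) \end{pmatrix} = \begin{pmatrix} \sin\xi(\tau) \\ \cos\xi(\tau) \end{pmatrix}. \tag{35}$$
Then
$$\|\tilde{\mathbf{x}}(\tau) - \mathbf{x}(\tau)\| \le |1 - (\tilde{x}^2(\tau_0) + \tilde{y}^2(\tau_0))^{1/2}| + \Big|\int_{\tau_0}^{\tau}\|\epsilon(s)\|\,ds\Big|. \tag{36}$$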
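Before turning to the proof, the estimate (36) can be illustrated numerically. The following sketch integrates the perturbed system together with ξ and the running integral of ‖ε‖, and compares the resulting error with the right-hand side of (36). The functions g and ε, the initial data and all names are hypothetical choices for the example, and φ0 is taken so that (sin φ0, cos φ0) points in the same direction as the initial data.

```python
import math

def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, [a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f(t + 0.5 * h, [a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f(t + h, [a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2.0 * r + 2.0 * s + u)
            for a, p, r, s, u in zip(y, k1, k2, k3, k4)]

def compare_with_rotation(g, eps, start, tau0, tau1, steps=20000):
    """Return (error, bound): the two sides of (36) at tau1."""
    x0, y0 = start
    phi0 = math.atan2(x0, y0)  # (sin phi0, cos phi0) is parallel to (x0, y0)

    def rhs(t, state):
        xt, yt, xi, acc = state
        e1, e2 = eps(t)
        # x~' = A x~ + eps with A = [[0, g], [-g, 0]]; xi' = g; acc' = |eps|
        return [g(t) * yt + e1, -g(t) * xt + e2, g(t), math.hypot(e1, e2)]

    state = [x0, y0, phi0, 0.0]
    h = (tau1 - tau0) / steps
    t = tau0
    for _ in range(steps):
        state = rk4_step(rhs, t, state, h)
        t += h
    xt, yt, xi, acc = state
    error = math.hypot(xt - math.sin(xi), yt - math.cos(xi))
    bound = abs(1.0 - math.hypot(x0, y0)) + abs(acc)
    return error, bound

error, bound = compare_with_rotation(
    g=lambda t: -10.0 * (1.0 + 0.1 * math.sin(t)),
    eps=lambda t: (0.01 * math.exp(t), -0.02 * math.exp(t)),
    start=(0.6, 0.9), tau0=0.0, tau1=-5.0)
print(error, bound)
```

The example integrates backward (τ1 < τ0), which is the direction used in the applications below; the absolute value around the accumulated integral takes care of the sign.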

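Proof. Let
$$\Phi = \begin{pmatrix} y & -x \\ x & y \end{pmatrix}.$$
We have [A, Φ] = 0, Φ′ = −AΦ and x′ = Ax. We get
$$(\Phi(\tilde{\mathbf{x}} - \mathbf{x}))' = -A\Phi(\tilde{\mathbf{x}} - \mathbf{x}) + \Phi(A(\tilde{\mathbf{x}} - \mathbf{x}) + \epsilon) = \Phi\epsilon.$$
Thus
$$(\tilde{\mathbf{x}} - \mathbf{x})(\tau) = \Phi^{-1}(\tau)\Phi(\tau_0)(\tilde{\mathbf{x}} - \mathbf{x})(\tau_0) + \Phi^{-1}(\tau)\int_{\tau_0}^{\tau}\Phi(s)\epsilon(s)\,ds.$$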

But Φ takes values in SO(2) and the lemma follows. ✷

In order to prove the existence of an α-limit point for Bianchi IX solutions, and that, generically, there is a limit point on the Kasner circle, we need the following lemma.

Lemma 12.2 Consider a Bianchi IX solution with 2/3 < γ < 2. Assume there is a sequence τk → −∞ such that q(τk) → 0, and N2(τk), N3(τk) → ∞. Then for each T, there is a τ ≤ T such that Σ+(τ) ≥ 0.

Proof. Observe that by (12), q = 0 and N2 + N3 ≥ N1 implies Σ+′ ≤ −2. However, the only term appearing in the constraint which does not go to zero in τk is (N2 − N3)², since the product N1N2N3 decreases as we go backward. Thus |Σ−′(τk)| → ∞, and the behaviour is oscillatory. It is clear that Σ+′ could become positive during the oscillations, but only when |Σ−| is big, so that we on the whole should move in the positive direction. Assume there is a T such that Σ+(τ) < 0 for all τ ≤ T. We begin by examining the behaviour of different expressions in the sets
$$D_k = \bigcup_{n=k}^{\infty}[\tau_n - 1, \tau_n] \quad\text{and}\quad D = \bigcup_{n=1}^{\infty}[\tau_n - 1, \tau_n].$$
Observe that by the fact that (Ω, Σ+, Σ−) are constrained to belong to a compact set during (−∞, 0], according to Lemma 3.3, N2 and N3 go to infinity uniformly in D (by which we will mean the following): ∀M ∃K : k ≥ K ⇒ Ni(τ) ≥ M ∀τ ∈ Dk, i = 2, 3. Thus N1 and N1(N2 + N3) go to zero uniformly in D. By (9), Ω also converges to zero uniformly in D. Due to the constraint, we get a bound on Σ−² + (3/4)(N2 − N3)² in D. Consider (12). The last two terms go to zero uniformly. If the first term is not negative, 1 − Ω − Σ+² − Σ−² ≤ 0. By the constraint, it will then be bounded by an expression that converges to zero uniformly in D. Thus, for every δ > 0 there


is a K such that k ≥ K implies Σ+′ ≤ δ in Dk. Combining this with the fact that q(τk) → 0, and the assumption that Σ+(τ) < 0 for τ ≤ T, we conclude that Σ+ converges uniformly to zero in D.

Next, we use Lemma 12.1 in order to approximate the oscillatory behaviour. Define the functions
$$\tilde{x} = \frac{\Sigma_-}{(1-\Sigma_+^2)^{1/2}}, \qquad \tilde{y} = \frac{\sqrt{3}}{2}\,\frac{N_2 - N_3}{(1-\Sigma_+^2)^{1/2}}.$$
We can apply Lemma 12.1 with g = −3(N2 + N3) − 2(1 + Σ+)x̃ỹ = g1 + g2 and εx, εy given by (60) and (61), cf. Lemma 15.1. By the above, we conclude that x̃ and ỹ are uniformly bounded on Dk, if k is great enough, and that ε converges to zero uniformly on D. Let xk be the expression given by Lemma 12.1, with τ0 replaced by τk and φ0 by a suitable φk. Let δ > 0. By the above and q(τk) → 0, we get
$$\|(\tilde{\mathbf{x}} - \mathbf{x}_k)(\tau)\| \le \delta, \tag{37}$$
if τ ∈ [τk − 1, τk], and k is great enough. In [τk − 1, τk], we thus have
$$\Sigma_+' = -2 + 2x_k^2(1-\Sigma_+^2) + \rho_k, \tag{38}$$
where the error ρk can be assumed to be arbitrarily small by choosing k great enough, cf. (12). Let
$$\xi_k(\tau) = \int_{\tau_k}^{\tau} g(s)\,ds + \phi_k$$
be as in (34). Since N2 + N3 goes to infinity uniformly, [τk − 1, τk] can be assumed to contain an arbitrary number of periods of ξk, if k is great enough. Thus, we can assume the existence of τ1,k, τ2,k ∈ [τk − 1, τk], such that τ2,k − τ1,k ≥ 1/2 and ξk(τ1,k) − ξk(τ2,k) is an integer multiple of π. Let [τ1, τ2] ⊆ [τ1,k, τ2,k] satisfy ξk(τ1) − ξk(τ2) = π. We can assume τ2 − τ1 to be arbitrarily small by choosing k great enough. Considering (9), and using the fact that q is bounded, we conclude that N2 + N3 cannot change by more than a factor arbitrarily close to one during [τ1, τ2]. Since the expression involving N2 + N3 dominates g, we conclude that
$$-\tfrac{3}{4}\,g(\tau_{\max}) \le -g(\tau_{\min}),$$
where τmax and τmin correspond to the maximum and the minimum of −g in [τ1, τ2]. Estimate
$$\int_{\tau_1}^{\tau_2} 2x_k^2(1-\Sigma_+^2)\,ds = \int_{\xi_k(\tau_1)}^{\xi_k(\tau_2)} \frac{2x_k^2(1-\Sigma_+^2)}{g}\,d\eta = -\int_{\xi_k(\tau_1)-\pi}^{\xi_k(\tau_1)} \frac{2x_k^2(1-\Sigma_+^2)}{g}\,d\eta \le -\frac{1}{g(\tau_{\min})}\int_{\xi_k(\tau_1)-\pi}^{\xi_k(\tau_1)} 2\sin^2(\eta)\,d\eta = -\frac{\pi}{g(\tau_{\min})}.$$
We get
$$\tau_2 - \tau_1 = \int_{\xi_k(\tau_1)}^{\xi_k(\tau_2)} \frac{1}{g}\,d\eta \ge -\frac{\pi}{g(\tau_{\max})} \ge \frac{3}{4}\int_{\tau_1}^{\tau_2} 2x_k^2(1-\Sigma_+^2)\,ds.$$
Consequently, (38) yields
$$\Sigma_+(\tau_2) - \Sigma_+(\tau_1) = -2(\tau_2 - \tau_1) + \int_{\tau_1}^{\tau_2} 2x_k^2(1-\Sigma_+^2)\,ds + \int_{\tau_1}^{\tau_2}\rho_k\,d\tau \le -\frac{2}{3}(\tau_2 - \tau_1) + \int_{\tau_1}^{\tau_2}\rho_k\,d\tau.$$
Since ξk(τ1,k) − ξk(τ2,k) corresponds to an integer multiple of π, we conclude that
$$\Sigma_+(\tau_{2,k}) - \Sigma_+(\tau_{1,k}) \le -\frac{2}{3}(\tau_{2,k} - \tau_{1,k}) + \int_{\tau_{1,k}}^{\tau_{2,k}}\rho_k\,d\tau \le -\frac{1}{3} + \int_{\tau_{1,k}}^{\tau_{2,k}}\rho_k\,d\tau.$$
However, the expressions on the far left can be assumed to be arbitrarily small, and the integral of ρk can be assumed to be arbitrarily small. We have a contradiction. ✷

13 Bianchi IX solutions

We first prove that there is an α-limit point. If we assume that there is no α-limit point, we get the conclusion that the Euclidean norm N of the vector (N1, N2, N3) has to converge to infinity, since (Ω, Σ+, Σ−) is constrained to belong to a compact set to the past by Lemma 3.3. In fact, Lemma 3.3 yields more; it implies that two Ni have to be large at any given time. Since the product N1N2N3 decays as we go backward, the third Ni has to be small. Sooner or later, the two Ni which are large and the one which is small have to be fixed, since a 'changing of roles' would require two Ni to be small, and thereby also the third by Lemma 3.3, contradicting the fact that N → ∞. Therefore, one can assume that two Ni converge to infinity, and that the third converges to zero. More precisely, we have the following.

Lemma 13.1 Consider a Bianchi IX solution. If N → ∞, we can, by applying the symmetries to the equations, assume that N2, N3 → ∞ and N1, N1(N2 + N3) → 0.

Proof. As in the vacuum case, see [19]. ✷

Lemma 13.2 A Bianchi IX solution with 2/3 < γ < 2 has an α-limit point.


Proof. If the solution is of Taub type, we already know that it is true so assume not. We assume N2, N3 → ∞, since if this does not occur, there is an α-limit point by Lemma 3.3 and Lemma 13.1. By (12) we have Σ+′ < 0 if Σ+ = 0 using the constraint (assuming N2 + N3 > 3N1). Thus, there is a T such that if Σ+ attains zero in τ ≤ T, it will be non-negative to the past, and thus N2N3 will be bounded to the past since Σ+ has to be negative for the product to grow. If there is a sequence τk → −∞ such that q(τk) → 0, we can apply Lemma 12.2 to arrive at a contradiction. Thus there is an S such that
$$q(\tau) \ge \epsilon > 0 \tag{39}$$
for all τ ≤ S. Consider
$$Z_{-1} = \frac{\tfrac{4}{3}\Sigma_-^2 + (N_2 - N_3)^2}{N_2 N_3}. \tag{40}$$
The reason we consider this function is that the derivative is in a sense almost negative, so that it almost increases as we go backward. On the other hand, it converges to zero as τ → −∞ by our assumptions. The lemma follows from the resulting contradiction. We have
$$Z_{-1}' = \frac{h}{N_2N_3} = \frac{-\tfrac{16}{3}\Sigma_-^2(1+\Sigma_+) + 4\sqrt{3}\,\Sigma_-(N_2-N_3)N_1}{N_2N_3}. \tag{41}$$
Letting
$$f = \tfrac{4}{3}\Sigma_-^2 + (N_2 - N_3)^2,$$
we have, using the constraint,
$$h \le 4\Sigma_-^2 N_1(N_2+N_3) + 2\sqrt{3}\,N_1 f \le N_1N_2N_3\,f$$
for, say, τ ≤ T ≤ S. Thus
$$Z_{-1}' \le N_1N_2N_3\,Z_{-1} \tag{42}$$
for all τ ≤ T. Since q ≥ ε > 0 for all τ ≤ T ≤ S by (39), we get
$$(N_1N_2N_3)(\tau) \le (N_1N_2N_3)(T)\exp[3\epsilon(\tau - T)]$$
for τ ≤ T. Inserting this inequality in (42), we can integrate to obtain
$$Z_{-1}(\tau) \ge Z_{-1}(T)\exp\Big(-\frac{1}{3\epsilon}(N_1N_2N_3)(T)\Big) > 0$$
for τ ≤ T. But Z−1(τ) → 0 as τ → −∞ by our assumption, and we have a contradiction. ✷
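The integration step at the end of the proof can be spelled out as follows. This is a sketch that uses only (39), (42) and the relation (N1N2N3)′ = 3q N1N2N3; the latter is what (9) gives if it has the standard Wainwright-Hsu form, which is an assumption here since (9) is not restated in this section.

```latex
\begin{align*}
\frac{d}{d\tau}\ln Z_{-1} &\le N_1N_2N_3
  && \text{by (42), since } Z_{-1} > 0,\\
(N_1N_2N_3)(s) &\le (N_1N_2N_3)(T)\,e^{3\epsilon(s-T)}
  && \text{for } s \le T,\ \text{by (39)},\\
\ln Z_{-1}(T) - \ln Z_{-1}(\tau)
  &\le \int_{\tau}^{T}(N_1N_2N_3)(s)\,ds
  \le (N_1N_2N_3)(T)\int_{-\infty}^{T} e^{3\epsilon(s-T)}\,ds
  = \frac{(N_1N_2N_3)(T)}{3\epsilon}.
\end{align*}
```

Exponentiating the last line gives the lower bound on Z−1(τ) used in the proof, uniformly in τ ≤ T.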


Corollary 13.1 Consider a Bianchi IX solution with 2/3 < γ < 2. For all ε > 0, there is a T such that Ω + Σ+² + Σ−² ≤ 1 + ε for all τ ≤ T. Furthermore
$$\lim_{\tau\to-\infty}(N_1N_2N_3)(\tau) = 0.$$

Proof. As in the vacuum case, see [19]. The second part follows from Lemma 5.2 and Lemma 13.2. ✷

Proposition 13.1 A generic Bianchi IX solution with 2/3 < γ < 2 has an α-limit point on the Kasner circle.

Proof. Observe that by Lemma 13.2 and Corollary 13.1, there is an α-limit point of type I, II or VII0.

1. First we prove that we can assume the α-limit point to be a type VII0 point with N1 = 0, 0 < N2 = N3, Ω = 0, Σ− = 0 and Σ+ = −1.

a. If there is an α-limit point in FI, FII or FVII0, F is a limit point, but then there is an α-limit point on the Kasner circle, by Lemma 4.2 and Proposition 8.1.

b. Assume there is an α-limit point in PVII0, or that one of Pi+(II) is an α-limit point. Then there is a limit point of type II which is not Pi+(II), by Lemma 4.1, and we can assume it does not belong to FII. We thus get an α-limit point on the Kasner circle by Proposition 9.1.

c. Consider the complement of the above. We have an α-limit point of type I, II or VII0 which is generic or possibly of Taub type. If the limit point is of type I or II, we get an α-limit point on the Kasner circle by Proposition 8.1 and Proposition 9.1. If the limit point is a non-Taub type VII0 point, we get an α-limit point on the Kasner circle by Proposition 10.2. Assume it is of Taub type with Σ− = 0, N2 = N3. By Proposition 10.1, we can assume that we have an α-limit point of the type mentioned.

2. We construct an α-limit point on the Kasner circle given an α-limit point as in 1. Since the solution is not of Taub type, we must leave a neighbourhood of the point (Σ+, Σ−) = (−1, 0). If N2 and N3 evaluated at the times we leave do not go to infinity, we are done. The reason is that we can choose the neighbourhood to be so small that Ω and N1 decrease exponentially in it, see (9). If N2(tk) or N3(tk) is bounded, we get a vacuum Bianchi VII0 α-limit point which is not of Taub type by choosing a suitable subsequence (if we get a type I or II point we are done, see the above arguments).
By Proposition 10.2, we then get an α-limit point on the Kasner circle. Thus, we can assume the existence of a sequence tk → −∞ such that N2 (tk ) and N3 (tk ) go to infinity. There are two problems we have to confront. First of all N2 and N3 have to decay from their values in tk in order for us to get an α-limit point. Secondly, and more importantly, we need to see to it that we do not get an α-limit point of the same type we started with. Let us divide the situation into two cases.


a. Assume that for each $t_k$ there is an $s_k \le t_k$ such that $\Sigma_+(s_k) = 0$. Observe that when $\Sigma_+ = 0$, we have

$$\Sigma_+' \le \frac{1}{2} N_1 (9N_1 - 3N_2 - 3N_3)$$

by the constraint (11), and (12). Thus, we can assume that we have $3N_1 \ge N_2 + N_3$ in $s_k$, since there is an α-limit point with $\Sigma_+ = -1$. Thus there must be an $r_k \le t_k$ such that, at $r_k$, either $N_1 = N_2 < N_3$, $N_1 = N_3 < N_2$, or $N_1 < N_2$, $N_1 < N_3$ and $3N_1 \ge N_2 + N_3$. One of these possibilities must occur an infinite number of times. The first two possibilities yield a type I or II limit point, and the last a type I limit point, because of the fact that $N_1 N_2 N_3 \to 0$ and Lemma 3.3. As above, we get an α-limit point on the Kasner circle.

b. Assume there is a $T$ such that $\Sigma_+(\tau) < 0$ for all $\tau \le T$. Then $N_1 \to 0$, since $N_1(t_k) \to 0$, and $\Sigma_+ < 0$ implies that $N_1$ is monotone. Assume there is a sequence $\tau_k \to -\infty$ such that $N_2$ or $N_3$ evaluated at it goes to zero. Then we get an α-limit point of type I or II, a situation we may deal with as above. Thus we may assume $N_i \ge \epsilon > 0$, $i = 2, 3$, to the past of $T$. Similarly to the proof of the existence of an α-limit point, we have

$$Z_{-1}' \le c\, N_1 N_2 N_3\, Z_{-1}.$$

If there is an S and a ξ > 0 such that q(τ ) ≥ ξ > 0 for all τ ≤ S, we get a contradiction as in the proof of Lemma 13.2, since (N2 N3 )(tk ) → ∞. Thus there exists a sequence τk → −∞ such that q(τk ) → 0. If N2 (τk ) or N3 (τk ) contains a bounded subsequence, we may refer to possibilities already handled. By Lemma 12.2, we get Σ+ ≥ 0, a contradiction. ✷
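The bound on the derivative of Σ+ used in case a above can be checked by a short computation. The following is only a sketch: it assumes that (12) has the form quoted later in the proof of Lemma 15.3, and that the constraint (11) has the standard Wainwright-Hsu form written in the comments (an assumption, since (11) is not restated in this part of the text).

```latex
% Assumed forms of (12) and of the constraint (11):
%   \Sigma_+' = -(2 - 2\Omega - 2\Sigma_+^2 - 2\Sigma_-^2)(\Sigma_+ + 1)
%               - \tfrac32 (2-\gamma)\Omega\Sigma_+ + \tfrac92 N_1 (N_1 - N_2 - N_3),
%   \Omega + \Sigma_+^2 + \Sigma_-^2
%     + \tfrac34 \bigl[ N_1^2 + (N_2 - N_3)^2 - 2 N_1 (N_2 + N_3) \bigr] = 1.
\begin{align*}
\text{At } \Sigma_+ = 0:\quad
\Sigma_+' &= -2\,(1 - \Omega - \Sigma_-^2) + \tfrac92 N_1 (N_1 - N_2 - N_3), \\
1 - \Omega - \Sigma_-^2 &= \tfrac34 \bigl[ N_1^2 + (N_2 - N_3)^2 \bigr] - \tfrac32 N_1 (N_2 + N_3)
  \;\ge\; -\tfrac32 N_1 (N_2 + N_3), \\
\Sigma_+' &\le 3 N_1 (N_2 + N_3) + \tfrac92 N_1 (N_1 - N_2 - N_3)
  = \tfrac12 N_1 (9 N_1 - 3 N_2 - 3 N_3).
\end{align*}
```

Under these assumptions the right-hand side is non-negative exactly when 3N1 ≥ N2 + N3, which is the condition invoked in case a.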

14 Control over the density parameter

The idea behind the main argument is to use the existence of an α-limit point on the Kasner circle to obtain a contradiction to the assumption that the solution does not converge to the closure of the set of vacuum type II points. The function d = Ω + N1 N2 + N2 N3 + N3 N1 is a measure of the distance from the attractor. We can consider d to be a function of τ, if we evaluate it at a generic Bianchi IX solution. If τk → −∞ yields the α-limit point on the Kasner circle, then d(τk) → 0. If d does not converge to zero, then it must grow from an arbitrarily small value up to some fixed number, say δ > 0, as we go backward. In the contradiction argument, it is convenient to know that the growth occurs only in the sum of products of the Ni, and that during the growth one can assume Ω to be arbitrarily small. The following proposition achieves this goal, assuming δ is small enough, which is not a restriction. The proof is to be found at the end of this section.
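As a quick illustration of why d can serve as a distance to the attractor, one can evaluate it on the relevant point sets. This is a sketch that assumes the usual Bianchi class A conventions: vacuum type I points (the Kasner circle) have all N_i = 0, vacuum type II points have exactly one non-zero N_i, and Bianchi IX points have all N_i > 0.

```latex
\begin{align*}
\text{Kasner circle (type I, vacuum):}\quad & \Omega = 0,\ N_1 = N_2 = N_3 = 0
   &&\Longrightarrow\quad d = 0, \\
\text{type II, vacuum:}\quad & \Omega = 0,\ N_1 > 0,\ N_2 = N_3 = 0
   &&\Longrightarrow\quad d = 0, \\
\text{Bianchi IX:}\quad & N_1, N_2, N_3 > 0
   &&\Longrightarrow\quad d \ge N_1 N_2 + N_2 N_3 + N_3 N_1 > 0 .
\end{align*}
% Conversely, for \Omega \ge 0 and N_i \ge 0, d = 0 forces \Omega = 0 and at most
% one non-zero N_i, i.e. a vacuum point of type I or II.
```

So d vanishes precisely on the closure of the vacuum type II points and is strictly positive along any Bianchi IX solution, which is what makes the convergence d → 0 a meaningful statement.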


Proposition 14.1 Consider a Bianchi IX solution with 2/3 < γ < 2. There exists an $\epsilon > 0$ such that if

$$N_1 N_2 + N_2 N_3 + N_1 N_3 \le \epsilon \tag{43}$$

in $[\tau_1, \tau_2]$, then $\Omega \le c_\gamma \Omega(\tau_2)$ in $[\tau_1, \tau_2]$ if $\Omega(\tau_2) \le \epsilon$. Here $c_\gamma > 0$ only depends on γ.


The idea of the proof is the following. If the sum of products of the Ni and Ω are small, the solution should behave in the following way. If all the Ni are small, then we are close to the Kasner circle and Ω decays exponentially. One of the Ni may become large alone, and then Ω increases, but it can only be large for a short period of time. After that it must decay until some other Ni becomes large. But this process of the Ni changing roles takes a long time, and most of it occurs close to the Kasner circle, where Ω decays exponentially. Thus, Ω may increase by a certain factor, but after that it must decay by a larger factor until it can increase again, hence the result. Figure 6 illustrates the behaviour.

Figure 6: Part of a type IX solution. (Axis labels recoverable from the plot: Σ+, Σ−, q, N, Ω and −τ.)



We divide the proof into lemmas, and begin by making the statement that Ω decays exponentially close to the Kasner circle more precise.

Lemma 14.1 Consider a Bianchi IX solution with 2/3 < γ < 2. If

$$\Sigma_+^2 + \Sigma_-^2 \ge \frac{1}{8}(3\gamma + 2)$$

in an interval $[s_1, s_2]$, then

$$\Omega(s) \le \Omega(s_2)\, e^{-\alpha_\gamma (s_2 - s)}$$

for $s \in [s_1, s_2]$, where

$$\alpha_\gamma = \frac{3}{2}(2 - \gamma).$$

Proof. Observe that

$$\Omega' \ge 4\Bigl[\Sigma_+^2 + \Sigma_-^2 - \frac{1}{4}(3\gamma - 2)\Bigr]\Omega, \tag{44}$$

so that under the conditions of the lemma $\Omega' \ge \alpha_\gamma \Omega$. The conclusion follows. ✷

Next, we prove that if the $N_i$ all stay sufficiently small under a condition as in (43) and Ω starts out small, then Ω will remain small.

Lemma 14.2 Consider a Bianchi IX solution with 2/3 < γ < 2. There is an $\epsilon > 0$ such that if

$$N_1 N_2 + N_2 N_3 + N_1 N_3 \le \epsilon, \tag{45}$$

$$\frac{3}{4} N_i^2 \le \frac{1}{8}(6 - 3\gamma) \tag{46}$$

in an interval $[s_1, s_2]$, and $\Omega(s_2) \le \epsilon$, then $\Omega(s) \le \Omega(s_2)$ for all $s \in [s_1, s_2]$.

Proof. Let $E = \{\tau \in [s_1, s_2] : t \in [\tau, s_2] \Rightarrow \Omega(t) \le \Omega(s_2)\}$. Let $\tau \in E$, $\tau > s_1$. There must be two $N_i$, say $N_2$ and $N_3$, such that $N_2 \le \epsilon^{1/2}$ and $N_3 \le \epsilon^{1/2}$ in τ, by (46). By the constraint (11) and (46), we have in τ,

$$\Sigma_+^2 + \Sigma_-^2 \ge 1 - \frac{3}{4} N_1^2 - \Omega - h_1 \ge \frac{1}{8}(3\gamma + 2) - 4\epsilon,$$

so that assuming $\epsilon$ small enough depending only on γ, we have $\Omega'(\tau) > 0$, cf. (44). Thus there exists an $s < \tau$ such that $s \in E$. In other words, $E$ is an open, closed, and non-empty subset of $[s_1, s_2]$, so that $E = [s_1, s_2]$. ✷

The next lemma describes the phase during which Ω may increase.
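Before turning to that lemma, the arithmetic behind Lemma 14.1 and the proof of Lemma 14.2 can be spelled out. The following is a sketch based only on (44); the sample values of γ and the explicit smallness condition on ε are illustrations chosen here, not taken from the text.

```latex
\begin{align*}
\Omega' &\ge 4\Bigl[\Sigma_+^2 + \Sigma_-^2 - \tfrac14(3\gamma - 2)\Bigr]\Omega
   \;\ge\; 4\Bigl[\tfrac18(3\gamma + 2) - \tfrac14(3\gamma - 2)\Bigr]\Omega
   \;=\; \tfrac32(2 - \gamma)\,\Omega \;=\; \alpha_\gamma \Omega, \\
\frac{d}{ds}\Bigl(e^{-\alpha_\gamma s}\,\Omega(s)\Bigr) &\ge 0
   \quad\Longrightarrow\quad
   \Omega(s) \le \Omega(s_2)\, e^{-\alpha_\gamma (s_2 - s)}, \qquad s \le s_2 .
\end{align*}
% Illustrative values:
%   \gamma = 1   : threshold (3\gamma+2)/8 = 5/8,  \alpha_\gamma = 3/2
%   \gamma = 4/3 : threshold (3\gamma+2)/8 = 3/4,  \alpha_\gamma = 1
% In Lemma 14.2 the lower bound is (3\gamma+2)/8 - 4\epsilon, and
%   (3\gamma+2)/8 - 4\epsilon > (3\gamma-2)/4  \iff  \epsilon < (6 - 3\gamma)/32,
% which is one sufficient way to make the bracket in (44) positive.
```

The same computation explains why the rate in Lemma 14.1 degenerates as γ → 2: the gap between the threshold (3γ + 2)/8 and the value (3γ − 2)/4 at which the bracket in (44) vanishes is (6 − 3γ)/8.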


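Lemma 14.3 Consider a Bianchi IX solution with 2/3 < γ < 2. There is an $\epsilon > 0$ such that if

$$N_1 N_2 + N_2 N_3 + N_1 N_3 \le \epsilon, \tag{47}$$

$$\frac{3}{4} N_1^2 \ge \frac{1}{8}(6 - 3\gamma) \tag{48}$$

in $[s_1, s_2]$, and $\Omega(s_2) \le \epsilon$, then $s_2 - s_1 \le c_{1,\gamma}$ and $\Omega(s) \le c_{2,\gamma}\Omega(s_2)$ for all $s \in [s_1, s_2]$, where $c_{1,\gamma}$ and $c_{2,\gamma}$ are positive constants depending on γ.

Proof. Assume $\epsilon$ is small enough that

$$\frac{3}{4}\epsilon^{1/2} \le \frac{1}{8}(6 - 3\gamma),$$

so that $N_1 \ge \epsilon^{1/4}$ in $[s_1, s_2]$. Assuming $\epsilon < 1$ we get $N_i \le \epsilon^{1/2}$ in $[s_1, s_2]$, $i = 2, 3$. Use the constraint (11) to write

$$1 - \Omega - \Sigma_+^2 - \Sigma_-^2 = \frac{3}{4} N_1^2 + h_1 \tag{49}$$

where $|h_1| \le 3\epsilon$ by (48). Thus,

$$1 - \Omega - \Sigma_+^2 - \Sigma_-^2 \ge \frac{3}{4}\epsilon^{1/2} - 3\epsilon,$$

so that we may assume

$$\Omega + \Sigma_+^2 + \Sigma_-^2 < 1 \tag{50}$$

in $[s_1, s_2]$. We now compare the behaviour with a type II vacuum solution. By (12) and (49), we have

$$\Sigma_+' = -2\Bigl(\frac{3}{4} N_1^2 + h_1\Bigr)(\Sigma_+ + 1) - \frac{3}{2}(2 - \gamma)\Omega\Sigma_+ + \frac{9}{2} N_1^2 - \frac{9}{2} N_1 (N_2 + N_3) = \frac{3}{2} N_1^2 (2 - \Sigma_+) + h_2 \Omega + h_3,$$

where $|h_3| \le 17\epsilon$ and $|h_2| \le 2$ in $[s_1, s_2]$. Let $a_\gamma = (6 - 3\gamma)/4$. Then,

$$\Sigma_+(s_2) - \Sigma_+(s_1) \ge a_\gamma (s_2 - s_1) + \int_{s_1}^{s_2} (h_2 \Omega + h_3)\, dt.$$

However,

$$\Omega(s) \le \Omega(s_2)\, e^{-4(s - s_2)} \le \epsilon\, e^{-4(s - s_2)}$$

for all $s \in [s_1, s_2]$, see (9). Thus,

$$\Bigl|\int_{s_1}^{s_2} h_2 \Omega\, ds\Bigr| \le \frac{1}{2}\Omega(s_2)\, e^{4(s_2 - s_1)}. \tag{51}$$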



We get

$$\Sigma_+(s_2) - \Sigma_+(s_1) \ge a_\gamma (s_2 - s_1) - \frac{1}{2}\epsilon\, e^{4(s_2 - s_1)} - 17\epsilon (s_2 - s_1).$$

This inequality contradicts the statement that $s_2 - s_1$ may be taken equal to $4/a_\gamma$, by choosing $\epsilon$ small enough. We conclude that $s_2 - s_1 \le 4/a_\gamma = c_{1,\gamma}$, and that we may choose $c_{2,\gamma} = \exp(16/a_\gamma)$. ✷

The following lemma deals with the decay in Ω that has to follow an increase. The idea is that if $N_1$ is on the boundary between big and small, and its derivative is non-negative at a point, then it will decrease as we go backward, and the solution will not move far from the Kasner circle until one of the other $N_i$ has become large. That takes a long time and Ω will decay.

Lemma 14.4 Consider a Bianchi IX solution such that 2/3 < γ < 2. There is an $\epsilon > 0$ such that if

$$N_1 N_2 + N_2 N_3 + N_3 N_1 \le \epsilon \tag{52}$$

in $[s_1, s_2]$,

$$\frac{3}{4} N_1^2(s_2) = \frac{1}{8}(6 - 3\gamma), \qquad N_1'(s_2) \ge 0$$

and $\Omega(s_2) \le c_{2,\gamma}\epsilon$, where $c_{2,\gamma}$ is the constant appearing in Lemma 14.3, then Ω decays as we go backward starting at $s_2$, until $s = s_1$, or we reach a point $s$ at which

$$\Omega(s) \le \frac{\Omega(s_2)}{2 c_{2,\gamma}}.$$

Proof. We begin by assuming that $\epsilon > 0$ is a fixed number. As the proof progresses, we will restrict it to be smaller than a certain constant depending on γ. We could spell it out here, but prefer to add restrictions successively. Let $N_1 \ge \epsilon^{1/4}$ in $[t_1, s_2]$ and $N_1(t_1) = \epsilon^{1/4}$ or $t_1 = s_1$, in case $N_1$ does not attain $\epsilon^{1/4}$ in $[s_1, s_2]$. As in the proof of Lemma 14.3, we conclude that $N_i \le \epsilon^{1/2}$, $i = 2, 3$ in $[t_1, s_2]$, and that we may assume

$$\Omega + \Sigma_+^2 + \Sigma_-^2 < 1. \tag{53}$$

The variables $(\Omega, \Sigma_+, \Sigma_-)$ have to belong to the interior of a paraboloid for $N_1'$ to be negative. Since $N_1'(s_2) \ge 0$ we are on the boundary or outside the paraboloid. The boundary is given by $g = 0$, where

$$g = \frac{1}{2}(3\gamma - 2)\Omega + 2\Sigma_+^2 + 2\Sigma_-^2 - 4\Sigma_+.$$

An outward pointing normal is given by $\nabla g$, where the derivatives are taken in the order: Ω, $\Sigma_+$ and $\Sigma_-$. Let $E = \{\tau \in [t_1, s_2] : t \in [\tau, s_2] \Rightarrow N_1'(t) \ge 0,\ \Omega(t) \le c_{2,\gamma}\epsilon\}$.


Let $\tau \in E$. By (53) we get $q(\tau) < 2$ and, as we are also outside the interior of the paraboloid, $\Sigma_+(\tau) \le 1/2$. For $\epsilon$, and thereby Ω, small enough depending only on γ, we have $\Sigma_+'(\tau) \ge \epsilon^{1/2}$, cf. (51). Using the above observations, we estimate in τ,

$$\nabla g \cdot (\Omega', \Sigma_+', \Sigma_-') \le C_\gamma \epsilon - \epsilon^{1/2},$$

where $C_\gamma$ only depends on γ. For $\epsilon$ small enough, the scalar product is negative. Thus, if $(\Omega(\tau), \Sigma_+(\tau), \Sigma_-(\tau))$ is on the surface of the paraboloid, the solution moves away from it as we go backward, so that $N_1' \ge 0$ in $[s, \tau]$ for some $s < \tau$. If we are already outside the paraboloid, the existence of such an $s$ is guaranteed by less complicated arguments. As in the proof of Lemma 14.2, we get $\Omega' > 0$ for $\epsilon$ small enough depending only on γ, so that $E$ is open, closed and non-empty. Thus $N_1$ decreases from $s_2$ to $t_1$ going backward. Now,

$$\Sigma_+^2 + \Sigma_-^2 \ge 1 - \frac{3}{4} N_1^2 - \Omega - h_1 \ge \frac{1}{8}(3\gamma + 2) - c_{2,\gamma}\epsilon - 3\epsilon$$

in $[t_1, s_2]$, so that

$$\Omega(t_1) \le \Omega(s_2)\, e^{-(2 - \gamma)(s_2 - t_1)}, \tag{54}$$

by an argument similar to Lemma 14.1, if $\epsilon$ is small enough. We can assume $\epsilon$ is small enough that the time required for $N_1$ to decrease to $\epsilon^{1/4}$ is great enough that if $t_1 \ne s_1$, then the conclusion of the lemma follows by (54). ✷

Proof of Proposition 14.1. Assume $\epsilon$ is small enough that all the conditions of Lemmas 14.2-14.4 are fulfilled. We divide the interval $[\tau_1, \tau_2]$ into suitable subintervals, such that we may apply the above lemmas to them. If

$$\frac{3}{4} N_i^2 \le \frac{1}{8}(6 - 3\gamma) \tag{55}$$

in $\tau_2$ for $i = 1, 2, 3$, then we let $t_2 \in [\tau_1, \tau_2]$ be the smallest member of the interval such that (55) holds in all of $[t_2, \tau_2]$. Otherwise, we choose $t_2 = \tau_2$. Either $t_2 = \tau_1$ or $3N_1^2(t_2)/4 \ge (6 - 3\gamma)/8$, by a suitable permutation of the variables. If $t_2 \ne \tau_1$, let $t_1$ be the smallest member of $[\tau_1, t_2]$ such that $3N_1^2/4 \ge (6 - 3\gamma)/8$ in $[t_1, t_2]$. Because of Lemma 14.2, Ω decays in $[t_2, \tau_2]$. If $t_2 = \tau_1$, we are done; let $c_\gamma = 1$. Otherwise, we apply Lemma 14.3 to the interval $[t_1, t_2]$ to conclude that $\Omega(\tau) \le c_{2,\gamma}\Omega(\tau_2)$ in $[t_1, \tau_2]$. If $t_1 = \tau_1$, we can choose $c_\gamma = c_{2,\gamma}$. Otherwise, we apply Lemma 14.4 to $[\tau_1, t_1]$. Either Ω decays until we have reached $\tau_1$, or there is a point $s_1 \in [\tau_1, t_1]$ such that $\Omega(s_1) \le \Omega(\tau_2)/2$. By the proof of Lemma 14.4, we can assume that $\tau_2 - s_1 \ge 1$; some time has to elapse for the decay to take place.

Given an interval $[\tau_1, \tau_2]$ as in the statement of the proposition, there are thus two possibilities. Either $\Omega(\tau) \le c_{2,\gamma}\Omega(\tau_2)$ for all $\tau \in [\tau_1, \tau_2]$ or we can construct an $s_1 \in [\tau_1, \tau_2]$ such that $\tau_2 - s_1 \ge 1$, $\Omega(s_1) \le \Omega(\tau_2)/2$, and $\Omega(\tau) \le c_{2,\gamma}\Omega(\tau_2)$ for all $\tau \in [s_1, \tau_2]$. If the second possibility is the one that occurs, we can apply the same argument to $[\tau_1, s_1]$, and by repeated application, the proposition follows. ✷
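To get a feeling for the size of the constants, and for how the repeated application closes, the following sketch evaluates the constants of Lemma 14.3 for two sample values of γ and writes out the bookkeeping of the iteration. The sample values of γ and the names $s_0, s_1, s_2, \dots$ for the successive points are illustrative choices made here.

```latex
\begin{align*}
a_\gamma &= \tfrac14(6 - 3\gamma), \qquad
c_{1,\gamma} = \frac{4}{a_\gamma}, \qquad
c_{2,\gamma} = \exp\Bigl(\frac{16}{a_\gamma}\Bigr) = e^{4 c_{1,\gamma}}, \\
\gamma = 1 :\quad & a_\gamma = \tfrac34, \quad c_{1,\gamma} = \tfrac{16}{3}, \quad c_{2,\gamma} = e^{64/3}, \\
\gamma = \tfrac43 :\quad & a_\gamma = \tfrac12, \quad c_{1,\gamma} = 8, \quad c_{2,\gamma} = e^{32} .
\end{align*}
% Iteration, with s_0 = \tau_2 and s_j constructed from s_{j-1} as in the proof:
\begin{align*}
\Omega(s_j) &\le \tfrac12\,\Omega(s_{j-1}) \le 2^{-j}\,\Omega(\tau_2) \le \epsilon, \qquad
s_{j-1} - s_j \ge 1, \\
\Omega(\tau) &\le c_{2,\gamma}\,\Omega(s_{j-1}) \le c_{2,\gamma}\, 2^{-(j-1)}\,\Omega(\tau_2)
   \le c_{2,\gamma}\,\Omega(\tau_2)
   \qquad \text{for } \tau \in [s_j, s_{j-1}] .
\end{align*}
```

Since each step moves at least one unit of time towards $\tau_1$, the construction stops after finitely many steps, and the bookkeeping suggests that $c_\gamma = c_{2,\gamma}$ is an admissible choice in Proposition 14.1.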



Corollary 14.1 Consider a Bianchi IX solution with 2/3 < γ < 2. If

$$\lim_{\tau \to -\infty} (N_1 N_2 + N_2 N_3 + N_1 N_3) = 0$$

and there is a sequence $\tau_k \to -\infty$ such that $\Omega(\tau_k) \to 0$, then

$$\lim_{\tau \to -\infty} \Omega(\tau) = 0.$$

15 Generic attractor for Bianchi IX solutions

In this section, we prove that for a generic Bianchi IX solution, the closure of the set of type II vacuum points is an attractor, assuming 2/3 < γ < 2. What we need to prove is that

$$\lim_{\tau \to -\infty} (\Omega + N_1 N_2 + N_2 N_3 + N_1 N_3) = 0,$$

since then we may for each $\epsilon > 0$ choose a $T$ such that at least two of the $N_i$ and Ω must be less than $\epsilon$ for $\tau \le T$. The starting point is the existence of a limit point on the Kasner circle for a generic solution, given by Proposition 13.1. Since there is such a limit point, there is a sequence $\tau_k \to -\infty$ such that $N_i(\tau_k)$ and $\Omega(\tau_k)$ go to zero. If

$$h = N_1 N_2 + N_2 N_3 + N_1 N_3 \tag{56}$$

does not converge to zero, it must thus grow from an arbitrarily small value up to some $\epsilon$. By choosing $\epsilon$ so that Proposition 14.1 is applicable, we have control over Ω. A few arguments yield the conclusion that we may assume that it is the product $N_2 N_3$ that grows, and that the growth occurs close to the special point $(\Sigma_+, \Sigma_-) = (-1, 0)$. Close to this point, Ω, $N_1$ and $N_1(N_2 + N_3)$ decay exponentially, so as far as intuition goes, we may equate them with zero. We thus have a Bianchi VII0 vacuum solution close to the special point $(-1, 0)$. The behaviour of $N_2 N_3$ will be oscillatory, and we may reduce the problem to one in which the product behaves essentially as a sine wave. However, by doing some technical estimates, one may see that one goes down going from top to top during the oscillation, and that this contradicts the assumed growth. Figure 7 illustrates the behaviour. It is a simulation of part of a Bianchi VII0 vacuum solution.

We begin by rewriting the solutions in a form that makes the oscillatory behaviour apparent. Consider a non-Taub Bianchi IX solution in an interval such that $-1 < \Sigma_+ < 1$. Define the functions

$$\tilde{x} = \frac{\Sigma_-}{(1 - \Sigma_+^2)^{1/2}}, \tag{57}$$

$$\tilde{y} = \frac{\sqrt{3}}{2}\, \frac{N_2 - N_3}{(1 - \Sigma_+^2)^{1/2}}. \tag{58}$$

Figure 7: Part of a Bianchi VII0 vacuum solution. (Axis labels recoverable from the plot: N2N3, Σ−, N2 − N3 and 1 + Σ+, each against −τ.)

The reason why these expressions are natural to consider is that, for reasons mentioned above, $N_1$, Ω and so forth may be considered to be zero. In the situation we will need to consider, $N_2 - N_3$ and $\Sigma_-$ will have much greater derivatives than $\Sigma_+$, so that it is natural to consider $\tilde{x}$ and $\tilde{y}$ as sine and cosine, since the constraint essentially says $\tilde{x}^2 + \tilde{y}^2 = 1$. Let

$$g = -3(N_2 + N_3) - 2(1 + \Sigma_+)\tilde{x}\tilde{y} = g_1 + g_2. \tag{59}$$
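The statement that the constraint essentially says $\tilde{x}^2 + \tilde{y}^2 = 1$ can be made explicit. The sketch below assumes that the constraint (11) has the standard Wainwright-Hsu form written in the comment; this form is not restated in this section, but it is consistent with (49) and with the expression appearing in (83).

```latex
% Assumed form of the constraint (11):
%   \Omega + \Sigma_+^2 + \Sigma_-^2
%     + \tfrac34 \bigl[ N_1^2 + (N_2 - N_3)^2 - 2 N_1 (N_2 + N_3) \bigr] = 1.
\begin{align*}
\tilde{x}^2 + \tilde{y}^2
  &= \frac{\Sigma_-^2 + \tfrac34 (N_2 - N_3)^2}{1 - \Sigma_+^2}
   = 1 + \frac{\tfrac32 N_1 (N_2 + N_3) - \tfrac34 N_1^2 - \Omega}{1 - \Sigma_+^2} .
\end{align*}
% For a vacuum type VII_0 solution (N_1 = 0, \Omega = 0) the correction term
% vanishes and \tilde{x}^2 + \tilde{y}^2 = 1 holds exactly.
```

The correction term only involves Ω and $N_1$, which is why the exponential decay of these quantities makes $(\tilde{x}, \tilde{y})$ behave like a point on the unit circle.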

In our applications, $g_1$ will essentially be constant, and $g_2$ will essentially be zero.

Lemma 15.1 The vector $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y})^t$ satisfies

$$\tilde{\mathbf{x}}' = A\tilde{\mathbf{x}} + \epsilon,$$

where $A$ is defined as in (33), with $g$ as in (59) and $\epsilon = (\epsilon_x, \epsilon_y)^t$, where the components are given by (60) and (61). The error terms are

$$\begin{aligned}
\epsilon_x ={}& 3N_1\tilde{y} + \Bigl(\frac{9}{2} N_1 (N_1 - N_2 - N_3) - \frac{3}{2}(2 - \gamma)\Omega\Sigma_+\Bigr)\frac{\Sigma_+\tilde{x}}{1 - \Sigma_+^2} - \Bigl(\frac{3}{2} N_1^2 - 3N_1(N_2 + N_3)\Bigr)\frac{\Sigma_+\tilde{x}}{1 - \Sigma_+} \\
& - 2\Bigl(\frac{3}{4} N_1^2 - \frac{3}{2} N_1 (N_2 + N_3)\Bigr)\tilde{x} - \frac{3}{2}(2 - \gamma)\Omega\tilde{x}
\end{aligned} \tag{60}$$

and

$$\epsilon_y = \Bigl[\frac{1}{2}(3\gamma - 2)\Omega(1 + \Sigma_+) + \frac{3}{2}(2 - \gamma)\Omega + \frac{9}{2} N_1 (N_1 - N_2 - N_3)\Bigr]\frac{\tilde{y}\Sigma_+}{1 - \Sigma_+^2} + \frac{1}{2}(3\gamma - 2)\Omega\tilde{y}. \tag{61}$$

It is clear that if we have a vacuum type VII0 solution, $\epsilon_x = \epsilon_y = 0$, so that we may write $\tilde{\mathbf{x}} = (\sin(\xi(\tau)), \cos(\xi(\tau)))$, where ξ is as in (34). In our situation, there is an error term, but by the exponential decay mentioned above, it only makes the technical details somewhat longer.

We begin by proving that we can assume that the growth occurs in the product $N_2 N_3$, and that Ω can be assumed to be negligible during the growth. We also put bounds on $\Sigma_+$. They constitute a starting point for further restrictions. The values of certain constants have been chosen for future convenience. The lemma below is formulated to handle more general situations than the one above, one reason being the desire to prove uniform convergence to the attractor. We will use the terminology that if $x$ constitutes initial data for (9)-(11), then $\Sigma_+(\tau, x)$ and so on will denote the solution of the equations with initial value $x$ evaluated at τ, assuming that τ belongs to the existence interval. We will use $\Phi(\tau, x)$ to summarize all the variables. The goal of this section is to prove that the conditions of the lemma below are never met.

Lemma 15.2 Let 2/3 < γ < 2. Consider a sequence $x_l$ of Bianchi IX initial data with all $N_i > 0$ and two sequences $s_l \le \tau_l$ of real numbers, belonging to the existence interval corresponding to $x_l$, such that

$$\lim_{l \to \infty} d(\tau_l, x_l) = 0, \tag{62}$$

where $d = \Omega + N_1 N_2 + N_2 N_3 + N_1 N_3$, and

$$h(s_l, x_l) \ge \delta \tag{63}$$

for some $\delta > 0$ independent of $l$. Then there is an $\epsilon > 0$ and a $k_0$, such that for each $k \ge k_0$ there is an $l_k$, a symmetry operation on $\Phi(\cdot, x_{l_k})$, and an interval $[u_k, v_k]$ belonging to the existence interval of $\Phi(\cdot, x_{l_k})$, such that the transformed variables satisfy

$$\begin{aligned}
& (N_2 N_3)(u_k, x_{l_k}) = \epsilon, \qquad (N_2 N_3)(v_k, x_{l_k}) \le \epsilon e^{-20k}, \\
& \epsilon e^{-20k - 1} \le (N_2 N_3)(\tau, x_{l_k}) \le \epsilon, \qquad N_1(\tau, x_{l_k}) \le \epsilon \exp(-30k) \quad \text{and} \\
& 2 \ge N_2(\tau, x_{l_k}),\ N_3(\tau, x_{l_k}) \ge \epsilon \exp(-25k)
\end{aligned} \tag{64}$$

for $\tau \in [u_k, v_k]$. Furthermore

$$\Omega(\cdot, x_{l_k}) \le e^{-13k} \quad \text{and} \quad -1 < \Sigma_+(\cdot, x_{l_k}) \le 0 \tag{65}$$

in $[u_k, v_k]$.


Remark. Observe that for the main application of this lemma, the sequence $x_l$ will be independent of $l$.

Proof. By (62) and (63), there is an $\epsilon > 0$ such that for every $k$ there is a suitable $l_k$ and $u_k \le v_k$ with $[u_k, v_k] \subseteq [s_{l_k}, \tau_{l_k}]$ such that

$$\epsilon e^{-20k - 1} \le h(\tau, x_{l_k}) \le 2\epsilon, \tag{66}$$

$$h(u_k, x_{l_k}) = 2\epsilon, \qquad h(v_k, x_{l_k}) = \epsilon \exp(-20k - 1)$$

where $\tau \in [u_k, v_k]$. We can also assume that

$$h(\tau, x_{l_k}) \le 2\epsilon \tag{67}$$

for all $\tau \in [u_k, \tau_{l_k}]$. Furthermore, we can assume

$$(N_1 N_2 N_3)(\cdot, x_{l_k}) \le \epsilon^2 \exp(-50k - 1)/4 \tag{68}$$

in $[u_k, \tau_{l_k}]$. The reason is that $d(\tau_l, x_l)$ converges to zero, so that $(N_1 N_2 N_3)(\tau_l, x_l)$ also converges to zero. Consequently, we can assume $(N_1 N_2 N_3)(\tau_{l_k}, x_{l_k})$ to be as small as we wish, and thus we get (68) by the monotonicity of the product. Since we may assume $\Omega(\tau_{l_k}, x_{l_k})$ to be arbitrarily small by (62), we may apply Proposition 14.1 in $[u_k, \tau_{l_k}]$ by (67), choosing $\epsilon$ small enough. Thus we may assume $\Omega \le \exp(-13k)$ in $[u_k, v_k]$. From now on, we consider the solution $\Phi(\cdot, x_{l_k})$ in the interval $[u_k, \tau_{l_k}]$ and only use the observations above. To avoid cumbersome notation, we will omit reference to the evaluation at $x_{l_k}$. By (66) and (68), we have in $[u_k, v_k]$

$$\epsilon e^{-20k - 1} \le h = N_1 N_2 N_3 \Bigl(\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3}\Bigr) \le \frac{1}{4}\epsilon^2 e^{-50k - 1}\Bigl(\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3}\Bigr),$$

so that

$$\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} \ge \frac{4}{\epsilon} e^{30k}.$$

At a given $\tau \in [u_k, v_k]$, one $N_i$, say $N_1$, must be smaller than $\epsilon \exp(-30k)$. If the second smallest is smaller than $\epsilon \exp(-25k)$, the largest cannot be bigger than 2, by Lemma 3.3, but that will contradict $h \ge \epsilon \exp(-20k - 1)$ if $k$ is great enough. Thus, if $N_1$ is the smallest $N_i$ for one τ, it is always the smallest. We may thus assume $N_1 \le \epsilon \exp(-30k)$ and $N_2, N_3 \ge \epsilon \exp(-25k)$ in $[u_k, v_k]$. If $\epsilon$ is small enough, we can assume $N_2, N_3 \le 2$ by Lemma 3.3. Thus,

$$\epsilon e^{-20k - 1} - 4\epsilon e^{-30k} \le N_2 N_3 \le 2\epsilon + 4\epsilon e^{-30k}.$$

We may shift $u_k$ by adding a positive number to it so that

$$(N_2 N_3)(u_k) = \epsilon \quad \text{and} \quad (N_2 N_3)(\tau) \le \epsilon \tag{69}$$

for $\tau \in [u_k, v_k]$. We may also shift $v_k$ in the negative direction to achieve $(N_2 N_3)(v_k) \le \epsilon e^{-20k}$, $(N_2 N_3)'(v_k) < 0$ and $(N_2 N_3)(\tau) \ge \epsilon e^{-20k - 1}$ for $\tau \in [u_k, v_k]$. The condition on the derivative is there to get control on $\Sigma_+$.



We now establish (65). Since $(N_2 N_3)'(v_k) < 0$, $-1 < \Sigma_+(v_k) < 0$. Due to (64), (12) and the constraint, $\Sigma_+' < 0$ if $\Sigma_+ = 0$ or $\Sigma_+ = -1$. In other words, $\Sigma_+(w_k) = 0$ implies $\Sigma_+ \ge 0$ in $[u_k, w_k]$. But if $u_k < w_k$ then $\Sigma_+(u_k) > 0$ so that $(N_2 N_3)(u_k) < (N_2 N_3)(w_k)$, contradicting the construction as stated in (69). We thus have $\Sigma_+ \le 0$ in $[u_k, v_k]$. We also have $-1 < \Sigma_+$ in that interval. ✷

Below, we will omit reference to the evaluation at $x_{l_k}$ to avoid cumbersome notation, but it should be remembered that we in general have a different solution for each $k$. Let

$$r(\tau) = \int_\tau^{v_k} (q/2 + \Sigma_+)\, ds.$$

Here we mean $q(s, x_{l_k})$ when we write $q$, and similarly for $\Sigma_+$. Observe that $r$ depends on $k$, but that we omit reference to this dependence. All the information concerning the growth of $N_2 N_3$ is contained in $r$, see (9), and this integral will be our main object of study rather than the product $N_2 N_3$. Let $[u_k, v_k]$ be an interval as in Lemma 15.2. Since $(N_2 N_3)(v_k) = e^{4r(u_k)}(N_2 N_3)(u_k)$, we have $r(u_k) \le -5k$. Let $u_k \le \nu_k \le \sigma_k \le \tau_k \le r_k \le v_k$. Starting at $u_k$, let $\nu_k$ be the last point where $r = -4k$, so that $r \ge -4k$ in $[\nu_k, v_k]$. Furthermore, let $r \ge -k$ in $[r_k, v_k]$ and finally, assume $r \le -2k$ in $[\nu_k, \tau_k]$. We also assume that $r$ evaluated at $r_k$, $\tau_k$, $\sigma_k$ and $\nu_k$ is $-k$, $-2k$, $-3k$ and $-4k$ respectively. See Table 2.

Why? The interval we will work with in the end is $[\sigma_k, \tau_k]$, but the other intervals are used to get control of the variables there. First of all, we want to get control of $\Sigma_+$, and the interval $[u_k, \nu_k]$ together with the additional demand on $\nu_k$ serves that purpose. The intervals at the other end, together with the associated demands, are there to yield us a quantitative statement of the intuitive idea that Ω and $N_1$ are negligible relative to the other expressions of interest. Finally, we need to get quantitative bounds relating the different variables; as was mentioned earlier, the main idea is to prove that $N_2 N_3$ oscillates, but that it decreases during a period. In order to prove the decrease, we need to have control over the relative sizes of different expressions, and $[\nu_k, \sigma_k]$ is used to achieve the desired estimates. From this point until the statement of Theorem 15.1, we will assume that the conditions of Lemma 15.2 are fulfilled. We will use the consequences of this assumption, as stated above, freely.

We improve the control of $\Sigma_+$. Let us first give an intuitive argument. Observe that under the present circumstances, the solution is approximated by a Bianchi VII0 vacuum solution.
For such a solution, the function $Z_{-1}$, defined in (40), is monotone increasing going backwards. According to the Bianchi VII0 vacuum constraint, $Z_{-1}$ is proportional to $(1 - \Sigma_+^2)/N_2 N_3$. However, we know that $N_2 N_3$ has to increase by a factor of $e^{20k}$ going from $v_k$ to $u_k$, and consequently $1 - \Sigma_+^2$ has to increase by an even larger factor. The only way this can occur is if a large part of the growth in $N_2 N_3$ occurs when $\Sigma_+$ is very close to $-1$. Taking this into account, we see that the relevant variation in $1 - \Sigma_+^2 = (1 - \Sigma_+)(1 + \Sigma_+)$ occurs in the factor $1 + \Sigma_+$.

Table 2: Subdivision of the interval of growth.

Interval        Bound on r
[νk, σk]        −4k ≤ r ≤ −2k
[σk, τk]        −4k ≤ r ≤ −2k
[τk, rk]        −4k ≤ r
[rk, vk]        −k ≤ r

Below, we will use the function $(1 + \Sigma_+)/N_2 N_3$ instead of $Z_{-1}$. Let us begin by considering the vacuum case, in order to see the idea behind the argument, without the technical difficulties associated with the non-vacuum case. We have $1 + \Sigma_+$ [...] $0$ in $[u_k, v_k]$ by (65), so that the first term appearing in the numerator of the right hand side of (71) has the right sign.

Proof. Using (12), we have

$$\Bigl(\frac{1 + \Sigma_+}{N_2 N_3}\Bigr)' = \Bigl[-(2 - 2\Omega - 2\Sigma_+^2 - 2\Sigma_-^2)(\Sigma_+ + 1) - \frac{3}{2}(2 - \gamma)\Omega\Sigma_+ + \frac{9}{2} N_1 (N_1 - N_2 - N_3) - (2q + 4\Sigma_+)(1 + \Sigma_+)\Bigr](N_2 N_3)^{-1}.$$



Consider the numerator of the right hand side. The term involving the $N_i$ has the right sign by (64), and the terms not involving Ω add up to the first term of the numerator of the right hand side of (71). Let us consider the terms involving Ω. They are

$$2\Omega(1 + \Sigma_+) - \frac{3}{2}(2 - \gamma)\Omega(1 + \Sigma_+) + \frac{3}{2}(2 - \gamma)\Omega - (3\gamma - 2)\Omega(1 + \Sigma_+) = -\frac{1}{2}(3\gamma - 2)\Omega(1 + \Sigma_+) + \frac{3}{2}(2 - \gamma)\Omega \le \frac{3}{2}(2 - \gamma)\Omega,$$

proving (71). To prove (72), we observe that by the constraint and the fact that $0 < 1 + \Sigma_+ \le 1$ in the interval of interest, we have

$$-(2 - 2\Omega - 2\Sigma_+^2 - 2\Sigma_-^2)(\Sigma_+ + 1) \le 3N_1(N_2 + N_3)(1 + \Sigma_+) \le 3N_1(N_2 + N_3).$$

Inserting this inequality into (12), we get

$$\Sigma_+' \le -\frac{3}{2}(2 - \gamma)\Omega(1 + \Sigma_+) + \frac{3}{2}(2 - \gamma)\Omega + \frac{1}{2} N_1 (9N_1 - 3N_2 - 3N_3) \le \frac{3}{2}(2 - \gamma)\Omega$$

by (64) and (65) if $k$ is large enough, proving (72). ✷

In the vacuum case, $\Sigma_+$ is monotone in our situation, see (72), but in the general case we have the following weaker result.

Lemma 15.4 Consider an interval $[s, t] \subseteq [u_k, v_k]$ such that

$$\Sigma_+^2 \ge \frac{1}{8}(3\gamma + 2).$$

Then

$$(1 + \Sigma_+(t)) - \Omega(t) \le 1 + \Sigma_+(s) \tag{73}$$

if $k$ is large enough.

Proof. In $[s, t]$ we have $\Omega' \ge \alpha_\gamma \Omega$, where $\alpha_\gamma = 3(2 - \gamma)/2$, see the proof of Lemma 14.1. Thus, $\Omega(u) \le \Omega(t)\exp[\alpha_\gamma(u - t)]$ for all $u \in [s, t]$. Integrating (72) we get (73). ✷

In connection with (71), the following lemma is of interest.

Lemma 15.5 If $k$ is large enough and $(1 + \Sigma_+(\tau))^3 \ge e^{3k}\Omega(\tau)$ for some $\tau \in [u_k, v_k]$, then

$$(1 + \Sigma_+)^3 \ge \frac{3}{4}(2 - \gamma)\Omega$$

in $[u_k, \tau]$.


Proof. If the solution is of vacuum type the lemma follows, so assume $\Omega > 0$. Let us first prove that $(1 + \Sigma_+(u))^3 \ge e^k \Omega(\tau)$ for $u \in [u_k, \tau]$. Assume there is an $s \in [u_k, \tau]$ such that the reverse inequality holds. Then there is a $t$ with $\tau \ge t \ge s$, such that $(1 + \Sigma_+)^3 \le e^{3k}\Omega(\tau)$ in $[s, t]$, with equality at $t$. Because of (65), Lemma 15.4 is applicable for $k$ large enough. Thus

$$e^k \Omega^{1/3}(\tau) - \Omega(t) \le 1 + \Sigma_+(s) \le e^{k/3}\Omega^{1/3}(\tau). \tag{74}$$

However, by the proof of Lemma 15.2, Proposition 14.1 is applicable in any subinterval of $[u_k, v_k]$, so that $\Omega(t) \le c_\gamma \Omega(\tau)$. Substituting this into (74), we get

$$e^k \Omega^{1/3}(\tau) - c_\gamma \Omega(\tau) \le e^{k/3}\Omega^{1/3}(\tau),$$

which is impossible for $k$ large enough. Thus we have, for $u \in [u_k, \tau]$ and $k$ large enough,

$$(1 + \Sigma_+(u))^3 \ge e^k \Omega(\tau) \ge e^k\, \frac{1}{\frac{3}{4}(2 - \gamma) c_\gamma}\, \frac{3}{4}(2 - \gamma)\Omega(u) \ge \frac{3}{4}(2 - \gamma)\Omega(u)$$

where $c_\gamma$ is the constant appearing in the statement of Proposition 14.1. The lemma follows. ✷

We now prove that we have control over $1 + \Sigma_+$ in $[\nu_k, v_k]$.

Lemma 15.6 Let $\nu_k$ and $v_k$ be as above. Then for $k$ large enough,

$$0 < 1 + \Sigma_+ < e^{-k} \tag{75}$$

in $[\nu_k, v_k]$.

Proof. Assume $1 + \Sigma_+(\tau) \ge e^{-k}$ for some $\tau \in [\nu_k, v_k]$. Because of (65), we then conclude that Lemma 15.5 is applicable, so that

$$\Bigl(\frac{1 + \Sigma_+}{N_2 N_3}\Bigr)' \le 0$$

in $[u_k, \tau]$ by (71). Thus

$$\frac{1 + \Sigma_+(u_k)}{(N_2 N_3)(u_k)} \ge \frac{1 + \Sigma_+(\tau)}{(N_2 N_3)(\tau)} \ge \frac{e^{-k}}{(N_2 N_3)(\tau)},$$

but by our construction

$$(N_2 N_3)(\tau) = e^{4r(u_k) - 4r(\tau)}(N_2 N_3)(u_k) \le e^{-20k + 16k}(N_2 N_3)(u_k),$$

so that $e^{3k} \le 1 + \Sigma_+(u_k) \le 1$. The lemma follows. ✷
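The exponent bookkeeping in the proof of Lemma 15.6 can be laid out step by step. This is a sketch that uses only the definition of $r$, the bound $r(u_k) \le -5k$ and the choice of $\nu_k$ (so that $r \ge -4k$ in $[\nu_k, v_k]$); nothing beyond what is stated above is assumed.

```latex
\begin{align*}
(N_2 N_3)(v_k) = e^{4 r(\tau)} (N_2 N_3)(\tau)
  &\quad\Longrightarrow\quad
  (N_2 N_3)(\tau) = e^{4 r(u_k) - 4 r(\tau)} (N_2 N_3)(u_k), \\
r(u_k) \le -5k,\quad r(\tau) \ge -4k
  &\quad\Longrightarrow\quad
  4 r(u_k) - 4 r(\tau) \le -20k + 16k = -4k, \\
\frac{1 + \Sigma_+(u_k)}{(N_2 N_3)(u_k)}
  \ge \frac{e^{-k}}{(N_2 N_3)(\tau)}
  \ge \frac{e^{-k}\, e^{4k}}{(N_2 N_3)(u_k)}
  &\quad\Longrightarrow\quad
  1 + \Sigma_+(u_k) \ge e^{3k} > 1 \qquad (k \ge 1).
\end{align*}
```

Since (65) gives $1 + \Sigma_+(u_k) \le 1$, the assumption $1 + \Sigma_+(\tau) \ge e^{-k}$ cannot hold, which is the contradiction used in the proof.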



Corollary 15.1 Let $\nu_k$ and $v_k$ be as above. For $k$ large enough,

$$\Omega + \Sigma_-^2 + (1 + \Sigma_+)^2 \le 4e^{-k}$$

in $[\nu_k, v_k]$.

Proof. By (64), we have

$$N_1(N_2 + N_3) \le 4e^{-30k}$$

in $[u_k, v_k]$. This observation, the constraint, and Lemma 15.6 yield

$$\Omega + \Sigma_-^2 \le 1 - \Sigma_+^2 + \frac{3}{2} N_1 (N_2 + N_3) \le 3e^{-k}$$

in $[\nu_k, v_k]$, for $k$ large enough. The corollary follows using Lemma 15.6. ✷

The next thing to prove is that $N_1$ and Ω are small compared with $1 + \Sigma_+$. The fact that $r(r_k) = -k$ will imply that the integral of $1 + \Sigma_+$ is large, but if $1 + \Sigma_+$ is comparable with $N_1$ or Ω, it cannot be large since $N_1$ and Ω decay exponentially. The reason $(1 + \Sigma_+)^9$ appears in the estimate (76) below is that the final argument will consist of an estimate of an integral up to 'order of magnitude'. Expressions of the form $(1 + \Sigma_+)^n$ and $(1 + \Sigma_+)^m/(N_2 + N_3)^l$ will define what is 'big' and 'small', and here we see to it that terms involving Ω and $N_1$ are negligible in this order of magnitude calculus. The factor $\exp(-3k)$ is there in order for us to be able to ignore possible factors multiplying expressions involving $N_1$ and Ω. We only turn up the number $k$ and change $\exp(-3k)$ to $\exp(-2k)$ to eliminate constants we do not want to think about; consider (60) and (61).

Lemma 15.7 Let $\nu_k$ and $\tau_k$ be as above. Then for $k$ large enough,

$$\Omega + N_1 + N_1(N_2 + N_3) \le e^{-3k} e^{3b_\gamma(\tau - v_k)}(1 + \Sigma_+)^9 \tag{76}$$

in $[\nu_k, \tau_k]$ where $b_\gamma > 0$. Furthermore,

$$\Bigl(\frac{1 + \Sigma_+}{N_2 N_3}\Bigr)' \le -2\Sigma_-^2\, \frac{1 + \Sigma_+}{N_2 N_3} \tag{77}$$

in $[u_k, \tau_k]$.

Proof. Note that

$$-\int_{r_k}^{v_k} (1 + \Sigma_+)\, d\tau \le \int_{r_k}^{v_k} (\Sigma_+^2 + \Sigma_+)\, d\tau \le \int_{r_k}^{v_k} (q/2 + \Sigma_+)\, d\tau = -k,$$

so that

$$k \le \int_{r_k}^{v_k} (1 + \Sigma_+)\, d\tau. \tag{78}$$


Let $\rho_1 = \Omega + N_1 + N_1(N_2 + N_3)$. By the construction in Lemma 15.2, we may assume $\rho_1(v_k) \le e^{-12k}$. Because of Corollary 15.1, we have $\rho_1(\tau) \le e^{-12k} e^{4b_\gamma(\tau - v_k)}$ for all $\tau \in [\nu_k, v_k]$, where $b_\gamma > 0$ is some constant depending only on γ. Let

$$\rho_2(\tau) = e^{-9k} e^{b_\gamma(\tau - v_k)} \ge e^{3k} e^{-3b_\gamma(\tau - v_k)}\rho_1(\tau).$$

The assumption that $(1 + \Sigma_+)^9 \le \rho_2$ in $[r_k, v_k]$ contradicts (78). Thus there must be a $t_0 \in [r_k, v_k]$ such that $(1 + \Sigma_+(t_0))^9 \ge \rho_2(t_0)$. In the vacuum case, $1 + \Sigma_+$ increases as we go backward, and $\rho_2$ obviously decreases, and thus we are in that case able to conclude $(1 + \Sigma_+)^9 \ge \rho_2$ in $[\nu_k, r_k]$. In the general case, we observe that $(1 + \Sigma_+(t_0))^3 \ge e^{3k}\Omega(t_0)$ by the above constructions. We get

$$\Bigl(\frac{1 + \Sigma_+}{N_2 N_3}\Bigr)' \le -2\Sigma_-^2\, \frac{1 + \Sigma_+}{N_2 N_3}$$

in $[u_k, t_0]$, by combining Lemma 15.5 and (71). Inequality (77) follows. Thus, if $\tau \in [\nu_k, \tau_k]$, we have

$$1 + \Sigma_+(\tau) \ge \frac{(N_2 N_3)(\tau)}{(N_2 N_3)(t_0)}(1 + \Sigma_+(t_0)) \ge e^{4k}(1 + \Sigma_+(t_0)).$$

Consequently, we will have $(1 + \Sigma_+(\tau))^9 \ge \rho_2(\tau)$, since $1 + \Sigma_+$ has increased from its value at $t_0$ and $\rho_2$ has decreased. The lemma follows. ✷

Next we establish a relation between $1 + \Sigma_+$ and the product $N_2 N_3$. We prove that $(1 + \Sigma_+)/(N_2 N_3)$ can be chosen arbitrarily small in the interval $[\sigma_k, \tau_k]$, by estimating it in $\nu_k$, and then comparing the integral of $1 + \Sigma_+$ from $\nu_k$ to $\sigma_k$ with the integral of $\Sigma_-^2$ over the same interval. The following lemma is the starting point.

Lemma 15.8 Let $\sigma_k$, $\tau_k$ be as above. Then for $k$ large enough,

$$\frac{1 + \Sigma_+(\tau)}{(N_2 N_3)(\tau)} \le \frac{1}{\epsilon}\exp\Bigl(-2\int_{\nu_k}^{\sigma_k} \Sigma_-^2\, ds\Bigr)$$

if $\tau \in [\sigma_k, \tau_k]$. Furthermore,

$$\frac{1 + \Sigma_+(\tau)}{(N_2 N_3)(\tau)} \le \frac{1}{\epsilon} \tag{79}$$

in $[u_k, \tau_k]$.
458

H. Ringstr¨ om


Proof. The statements follow from (77), and the fact that

$$\frac{(1 + \Sigma_+)(u_k)}{(N_2 N_3)(u_k)} \le \frac{1}{\epsilon}.$$

✷

Considering the constraint, it is clear that $\Sigma_-^2$ should be comparable with $1 + \Sigma_+$ when $N_2 - N_3$ and $\Sigma_-$ oscillate, and thus the integral should be comparable with $k$, cf. (78). However, we have to work out the technical details. We carry out the comparison between the integrals in three steps. First, we estimate the error committed in viewing $\tilde{x}$ and $\tilde{y}$ in (57) and (58) as sine and cosine. Then we may, up to a small error, express the integral of $\Sigma_-^2$ as the integral of $\sin^2(\eta/2)$, multiplied by some function $f(\eta)$ by changing variables. In order to make the comparison, we need to estimate the variation of $f$ during a period: the second step. The only expressions involved are $1 + \Sigma_+$ and $N_2 + N_3$. The third step consists of making the comparison, using the information obtained in the earlier steps.

Let $\tilde{x}$, $\tilde{y}$, $g$, $g_1$ and $g_2$ be defined as in (57)-(59), and ξ, $x$ and $y$ be defined as in the statement of Lemma 12.1, with $\tau_0$ replaced by $\tau_k$ and $\phi_0$ by $\phi_k$. Observe that $x$, $y$ and ξ in fact depend on $k$. We need to compare $x$ with $\tilde{x}$.

Lemma 15.9 Let $\nu_k$ and $\tau_k$ be as above. Then for $k$ large enough,

$$|\Sigma_-^2 - (1 - \Sigma_+^2)x^2| \le 12e^{-2k}(1 + \Sigma_+)^9 \tag{80}$$

in $[\nu_k, \tau_k]$. Furthermore,

$$|1 - (\tilde{x}^2 + \tilde{y}^2)| \le e^{-k} \tag{81}$$

and

$$\|\tilde{\mathbf{x}} - \mathbf{x}\| \le 3e^{-2k}(1 + \Sigma_+)^8 \tag{82}$$

in that interval.

Proof. We have

$$\bigl|1 - (\tilde{x}^2(\tau_k) + \tilde{y}^2(\tau_k))^{1/2}\bigr| \le \Bigl|1 - \Bigl(1 + \frac{\frac{3}{2} N_1 (N_2 + N_3) - \frac{3}{4} N_1^2 - \Omega}{1 - \Sigma_+^2}\Bigr)^{1/2}\Bigr| \le e^{-2k}(1 + \Sigma_+(\tau_k))^8 \tag{83}$$

by (76). Equation (81) follows similarly. By (60), (61), (76) and (81), we have

$$\|\epsilon(s)\| \le 2b_\gamma e^{-2k}(1 + \Sigma_+(s))^8 e^{3b_\gamma(s - v_k)}$$

for $k$ large enough. Let us estimate how much $1 + \Sigma_+$ may decrease as we go backward in time. By (72) and (76), we have

$$(1 + \Sigma_+)' \le \frac{3}{2}(2 - \gamma)e^{-3k} e^{3b_\gamma(\tau - v_k)}(1 + \Sigma_+)^9,$$


so that if $[s, t] \subseteq [\nu_k, \tau_k]$,

$$1 + \Sigma_+(t) \le \exp(\exp(-2k))(1 + \Sigma_+(s)), \tag{84}$$

for $k$ large enough. Thus, for $\tau \le \tau_k$, we get

$$\int_\tau^{\tau_k} \|\epsilon(s)\|\, ds \le e^{-2k}(1 + \Sigma_+(\tau))^8. \tag{85}$$

By (36), (85), (84) and (83), we thus have

$$\|\tilde{\mathbf{x}} - \mathbf{x}\| \le \frac{5}{2} e^{-2k}(1 + \Sigma_+)^8$$

in $[\nu_k, \tau_k]$, and (82) follows. Since $|x| \le 1$ and $|\tilde{x}| \le 1.1$, cf. (81), we have $|\tilde{x}^2 - x^2| \le 6e^{-2k}(1 + \Sigma_+)^8$, so that

$$|\Sigma_-^2 - (1 - \Sigma_+^2)x^2| \le 12e^{-2k}(1 + \Sigma_+)^9$$

in the interval $[\nu_k, \tau_k]$. ✷

Let us introduce

$$\eta(\tau) = 2\xi(\tau) = 2\int_{\tau_k}^{\tau} g(s)\, ds + 2\phi_k, \tag{86}$$

where $g = -3(N_2 + N_3) - 2(1 + \Sigma_+)\tilde{x}\tilde{y} = g_1 + g_2$. The reason we study η instead of ξ is that the trigonometric expression we will be interested in is $\sin^2(\xi)$, which has a period of length π, cf. Lemma 15.9. In the proof of Lemma 15.10, it is shown that, in the interval $[\nu_k, \tau_k]$, the first term appearing in $g$ is much greater than the second. We can thus consider functions of τ in the interval $[\nu_k, \tau_k]$ to be functions of η. We will mainly be interested in considering an interval $[\eta_0, \eta_0 + 2\pi]$ at a time, so that we will only need to estimate the variation of the relevant expressions during one such period.

Lemma 15.10 Let $\eta_{1,k} = \eta(\sigma_k)$ and $\eta_{2,k} = \eta(\nu_k)$. If $[\eta_1, \eta_1 + 2\pi] \subseteq [\eta_{1,k}, \eta_{2,k}]$ and $\eta_a, \eta_b \in [\eta_1, \eta_1 + 2\pi]$, then for $k$ large enough

$$e^{-6\pi/\epsilon} \le \frac{(N_2 + N_3)(\eta_a)}{(N_2 + N_3)(\eta_b)} \le e^{6\pi/\epsilon}, \tag{87}$$

$$\frac{1}{2} \le \frac{1 + \Sigma_+(\eta_a)}{1 + \Sigma_+(\eta_b)} \le 2 \tag{88}$$

and

$$|g_1|/2 \le |g| \le 2|g_1|. \tag{89}$$



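Proof. Because of Lemma 15.8,
\[ \frac{1+\Sigma_+}{N_2+N_3} \le \frac{1+\Sigma_+}{2(N_2N_3)^{1/2}} = \frac{1+\Sigma_+}{2N_2N_3}(N_2N_3)^{1/2} \le \frac{1}{2\epsilon}\left(\frac{N_2N_3}{(N_2N_3)(u_k)}\right)^{1/2}(N_2N_3)^{1/2}(u_k) \le \frac{1}{2^{1/2}}e^{-2k} \tag{90} \]
in the interval $[\nu_k,\tau_k]$. By (81) we may assume $\tilde{x}^2+\tilde{y}^2 \le 2$ in $[\nu_k,\tau_k]$. Combining this fact with (90) yields (89) in $[\nu_k,\tau_k]$. Thus, $d\eta/d\tau < 0$ in that interval. We have
\[ \left|\frac{d(N_2+N_3)}{d\eta}\right| = \left|\frac{1}{2g}\left((q+2\Sigma_+)(N_2+N_3) + 2\sqrt{3}\,\Sigma_-(N_2-N_3)\right)\right| \le \frac{1}{2}(3\gamma-2)\Omega + |\Sigma_+^2 + (1-\Sigma_+^2)\tilde{x}^2 + \Sigma_+| + 2\frac{|\tilde{x}\tilde{y}|}{|g|}(1-\Sigma_+^2) \le 6(1+\Sigma_+) + 8\,\frac{1+\Sigma_+}{N_2+N_3}, \]
so that
\[ \left|\frac{1}{N_2+N_3}\frac{d(N_2+N_3)}{d\eta}\right| \le 6\,\frac{1+\Sigma_+}{N_2+N_3} + 8\,\frac{1+\Sigma_+}{(N_2+N_3)^2} \le 6\,\frac{1+\Sigma_+}{N_2+N_3} + 2\,\frac{1+\Sigma_+}{N_2N_3} \le \frac{3}{\epsilon} \]
in $[\nu_k,\sigma_k]$ for $k$ large, by Lemma 15.8 and (90). If $N_2+N_3$ has a maximum in $\eta_{\max} \in [\eta_1,\eta_1+2\pi]$ and a minimum in $\eta_{\min}$, we get
\[ \frac{(N_2+N_3)(\eta_{\max})}{(N_2+N_3)(\eta_{\min})} \le e^{6\pi/\epsilon}, \]
and (87) follows. We also need to know how much $1+\Sigma_+$ varies over one period. By (12),
\[ (1+\Sigma_+)' = (2\Sigma_+^2 + 2\Sigma_-^2 - 2)(1+\Sigma_+) + f_1, \]
where $f_1$ is an expression that can be estimated as in (76), so that in $[\nu_k,\tau_k]$ we have
\[ \left|\frac{(1+\Sigma_+)'}{1+\Sigma_+}\right| \le 2(1-\Sigma_+^2)(1+\tilde{x}^2) + (1+\Sigma_+) \le 13(1+\Sigma_+), \]
for $k$ large enough. Thus,
\[ \left|\frac{1}{1+\Sigma_+}\frac{d(1+\Sigma_+)}{d\eta}\right| \le \frac{10(1+\Sigma_+)}{N_2+N_3}, \tag{91} \]
so that (88) holds if $k$ is big enough and $|\eta_a-\eta_b| \le 2\pi$, by (90). ✷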
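The step from a bound on the logarithmic derivative to a bound on the ratio over one period, used for (87) and (88), can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the rate function and the value `eps = 0.5` are made up for the example and are not taken from the system of Wainwright and Hsu.

```python
import math

def max_ratio_over_period(rate, bound, steps=20000):
    """Integrate f' = rate(eta) * f over [0, 2*pi] (midpoint rule on log f)
    and return max(f) / min(f) over the period."""
    h = 2 * math.pi / steps
    log_f = 0.0
    lo = hi = 0.0
    for i in range(steps):
        r = rate((i + 0.5) * h)
        assert abs(r) <= bound  # the hypothesis |f'/f| <= bound
        log_f += r * h
        lo = min(lo, log_f)
        hi = max(hi, log_f)
    return math.exp(hi - lo)

eps = 0.5                 # illustrative value only
bound = 3.0 / eps         # plays the role of the bound 3/epsilon
limit = math.exp(2 * math.pi * bound)   # e^{6*pi/epsilon}

# An oscillating rate stays far below the limit ...
oscillating = max_ratio_over_period(lambda eta: bound * math.cos(3 * eta), bound)
# ... while a constant rate of maximal size attains it.
worst = max_ratio_over_period(lambda eta: bound, bound)
```

The constant rate shows that the factor $e^{6\pi/\epsilon}$ cannot be improved using the derivative bound alone.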


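Lemma 15.11 Let $\sigma_k$ and $\tau_k$ be as above. Then if $k$ is large enough,
\[ \frac{1+\Sigma_+}{N_2N_3} \le \frac{1}{\epsilon}\,e^{-c_\epsilon k} \]
in $[\sigma_k,\tau_k]$, where $c_\epsilon > 0$.

Proof. Observe that similarly to the proof of Lemma 15.7, we have
\[ k \le \int_{\nu_k}^{\sigma_k}(1+\Sigma_+)\,d\tau = \int_{\eta_{1,k}}^{\eta_{2,k}}\frac{1+\Sigma_+}{-2g}\,d\eta. \]
The contribution from one period in $\eta$ is negligible, by (90) and (89). Compare this integral with
\[ \int_{\eta_{1,k}}^{\eta_{2,k}}\frac{\Sigma_-^2}{-g}\,d\eta = \int_{\eta_{1,k}}^{\eta_{2,k}}\frac{(1-\Sigma_+^2)x^2}{-g}\,d\eta + \int_{\eta_{1,k}}^{\eta_{2,k}}\frac{\Sigma_-^2-(1-\Sigma_+^2)x^2}{-g}\,d\eta = I_{1,k} + I_{2,k}. \]
Now,
\[ |I_{2,k}| \le e^{-k}\int_{\eta_{1,k}}^{\eta_{2,k}}\frac{1+\Sigma_+}{-g}\,d\eta \]
by (80). Consider an interval $[\eta_1,\eta_1+2\pi]$. Estimate, letting $\eta_a$ and $\eta_b$ be the minimum and maximum of $\Sigma_+$ respectively, and $\eta_{\min}$, $\eta_{\max}$ the min and max for $-g_1$ in this interval,
\[ \int_{\eta_1}^{\eta_1+2\pi}\frac{(1-\Sigma_+^2)x^2}{-g}\,d\eta \ge \int_{\eta_1}^{\eta_1+2\pi}\frac{(1+\Sigma_+)x^2}{-g}\,d\eta = \int_{\eta_1}^{\eta_1+2\pi}\frac{(1+\Sigma_+)\sin^2(\eta/2)}{-g}\,d\eta \ge \pi\,\frac{1+\Sigma_+(\eta_a)}{2|g_1(\eta_{\max})|} \ge e^{-6\pi/\epsilon}\,\frac{\pi}{2}\,\frac{1+\Sigma_+(\eta_a)}{|g_1(\eta_{\min})|} \]
\[ \ge \frac{\pi}{4}\,e^{-6\pi/\epsilon}\,\frac{1+\Sigma_+(\eta_b)}{|g_1(\eta_{\min})|} = \frac{1}{8}\,e^{-6\pi/\epsilon}\int_{\eta_1}^{\eta_1+2\pi}\frac{1+\Sigma_+(\eta_b)}{|g_1(\eta_{\min})|}\,d\eta \ge \frac{1}{16}\,e^{-6\pi/\epsilon}\int_{\eta_1}^{\eta_1+2\pi}\frac{1+\Sigma_+(\eta)}{-g(\eta)}\,d\eta, \]
where we have used (87), (88) and (89). Assuming, without loss of generality, that $\eta_{2,k}-\eta_{1,k}$ is an integer multiple of $2\pi$, we get
\[ \int_{\nu_k}^{\sigma_k} 2\Sigma_-^2\,d\tau = \int_{\eta_{1,k}}^{\eta_{2,k}}\frac{\Sigma_-^2}{-g}\,d\eta = I_{1,k}+I_{2,k} \ge \left(\frac{1}{16}e^{-6\pi/\epsilon} - e^{-k}\right)\int_{\eta_{1,k}}^{\eta_{2,k}}\frac{1+\Sigma_+(\eta)}{-g(\eta)}\,d\eta \]
\[ \ge \frac{1}{20}\,e^{-6\pi/\epsilon}\int_{\eta_{1,k}}^{\eta_{2,k}}\frac{1+\Sigma_+(\eta)}{-g(\eta)}\,d\eta = \frac{1}{10}\,e^{-6\pi/\epsilon}\int_{\nu_k}^{\sigma_k}(1+\Sigma_+)\,d\tau \ge \frac{k}{10}\,e^{-6\pi/\epsilon} = c_\epsilon k \]
for $k$ large enough, and the lemma follows from (79). ✷

The following corollary summarizes the estimates that make the order of magnitude calculus well defined.

Corollary 15.2 Let $\sigma_k$ and $\tau_k$ be as above. Then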

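\[ \frac{1+\Sigma_+}{(N_2+N_3)^2} \le \frac{1}{\epsilon}\,e^{-c_\epsilon k}, \tag{92} \]
\[ \frac{1+\Sigma_+}{N_2+N_3} \le e^{-2k} \tag{93} \]
and
\[ 1-e^{-2k} \le \frac{g}{g_1} \le 1+e^{-2k} \tag{94} \]
in $[\sigma_k,\tau_k]$ for $k$ large enough.

Proof. Observe that by Lemma 15.11,
\[ \frac{1+\Sigma_+}{(N_2+N_3)^2} \le \frac{1+\Sigma_+}{N_2N_3} \le \frac{1}{\epsilon}\,e^{-c_\epsilon k} \]
and
\[ \frac{1+\Sigma_+}{N_2+N_3} \le \frac{1+\Sigma_+}{2(N_2N_3)^{1/2}} \le \frac{1}{2^{1/2}}\,e^{-2k}e^{-c_\epsilon k} \le e^{-2k} \]
for $k$ large enough, cf. (90). We have
\[ \frac{g}{g_1} = 1 + \frac{2(1+\Sigma_+)\tilde{x}\tilde{y}}{3(N_2+N_3)}. \]
By (81) and the above estimates, we get (94) for $k$ large enough. ✷

The interval we will work with from now on is $[\sigma_k,\tau_k]$. Let $\eta$ be defined as in (86), but define $\eta_{1,k} = \eta(\tau_k)$ and $\eta_{2,k} = \eta(\sigma_k)$. We need to improve the estimates of the variation of $1+\Sigma_+$ and $N_2+N_3$ during a period contained in $[\eta_{1,k},\eta_{2,k}]$.

Lemma 15.12 Consider an interval $I = [\eta_1,\eta_1+2\pi] \subseteq [\eta_{1,k},\eta_{2,k}]$, where $\eta_{1,k} = \eta(\tau_k)$ and $\eta_{2,k} = \eta(\sigma_k)$. Let $\eta_a$ and $\eta_b$ correspond to the max and min of $1+\Sigma_+$ in $I$, and let $\eta_{\max}$ and $\eta_{\min}$ correspond to the max and min of $N_2+N_3$ in the same interval. Then,
\[ |\Sigma_+(\eta_b)-\Sigma_+(\eta_a)| \le \frac{40\pi(1+\Sigma_+(\eta_b))^2}{(N_2+N_3)(\eta_{\max})} \tag{95} \]
and
\[ \frac{(N_2+N_3)(\eta_{\max})}{(N_2+N_3)(\eta_{\min})} \le \exp\left(\frac{20\pi}{\epsilon}\exp(-c_\epsilon k)\right). \tag{96} \]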


Proof. The derivation of (91) is still valid, so that
\[ \left|\frac{1}{1+\Sigma_+}\frac{d(1+\Sigma_+)}{d\eta}\right| \le \frac{10(1+\Sigma_+)}{N_2+N_3}. \]
By (93) we conclude that $(1+\Sigma_+(\eta_a))/(1+\Sigma_+(\eta_b))$ can be chosen to be arbitrarily close to one by choosing $k$ large enough. Now,
\[ \frac{1}{N_2+N_3}\frac{d(N_2+N_3)}{d\eta} = \frac{1}{N_2+N_3}\,\frac{1}{2g}\,\frac{d(N_2+N_3)}{d\tau} = \frac{1}{N_2+N_3}\,\frac{1}{2g}\left((q+2\Sigma_+)(N_2+N_3) + 2\sqrt{3}\,\Sigma_-(N_2-N_3)\right) = \frac{q+2\Sigma_+}{2g} + \frac{4(1-\Sigma_+^2)\tilde{x}\tilde{y}}{2(N_2+N_3)g}, \]
and consequently
\[ \left|\frac{1}{N_2+N_3}\frac{d(N_2+N_3)}{d\eta}\right| \le \frac{10}{\epsilon}\,e^{-c_\epsilon k}. \]
Equation (96) follows, and the relative variation of $N_2+N_3$ during one period can be chosen arbitrarily small. Finally,
\[ |\Sigma_+(\eta_b)-\Sigma_+(\eta_a)| = (1+\Sigma_+(\eta_b))\left|\frac{1+\Sigma_+(\eta_a)}{1+\Sigma_+(\eta_b)} - 1\right| \le \frac{30\pi(1+\Sigma_+(\eta_b))^2}{(N_2+N_3)(\eta_{\min})} \]
by (91) and the above observations. We may also change $\eta_{\min}$ to $\eta_{\max}$ at the cost of increasing the constant. ✷

As has been stated earlier, the goal of this section is to prove that the conditions of Lemma 15.2 are never met. We do this by deducing a contradiction from the consequences of that lemma. On the one hand, we have a rough picture of how the solution behaves in $[\sigma_k,\tau_k]$ by Lemma 15.9, Lemma 15.12 and Corollary 15.2. On the other hand, we know that, since $r(\sigma_k)-r(\tau_k) = -k$,
\[ -k = \int_{\sigma_k}^{\tau_k}\left(\frac{1}{4}(3\gamma-2)\Omega + \Sigma_+^2+\Sigma_-^2+\Sigma_+\right)d\tau = \alpha_k + \int_{\eta_{1,k}}^{\eta_{2,k}}\frac{\Sigma_+^2+\Sigma_-^2+\Sigma_+}{-2g}\,d\eta. \tag{97} \]
We will use our knowledge of the behaviour of the solution in $[\sigma_k,\tau_k]$ to prove that (97) is false. Observe that $\eta_{1,k} < \eta_{2,k}$, and that the contribution from one period is negligible, cf. Corollary 15.2. Also, $\alpha_k \to 0$ as $k \to \infty$ so that we may ignore it. We will prove that for $k$ great enough, the integral of $(\Sigma_+^2+\Sigma_-^2+\Sigma_+)/(-2g)$ over a suitably chosen period is positive. From here on, we consider an interval $[\eta_1,\eta_1+2\pi]$ which, excepting intervals of length less than a period at each end of $[\eta_{1,k},\eta_{2,k}]$, we can assume to be of the form $[-\pi/2,3\pi/2]$. There is however one

thing that should be kept in mind; when translating the $\eta$-variable by $2m\pi$ the $\xi$-variable is translated by $m\pi$. In other words, there is a sign involved, and in order to keep track of it we write out the details. By the above observations we have the following.

Lemma 15.13 For each $k$ there are integers $m_{1,k}$ and $m_{2,k}$ such that
\[ -k = \beta_k + \int_{-\pi/2+2m_{1,k}\pi}^{3\pi/2+2m_{2,k}\pi}\frac{\Sigma_+^2+\Sigma_-^2+\Sigma_+}{-2g}\,d\eta, \tag{98} \]
where $\beta_k \to 0$ as $k \to \infty$, and $\eta_{1,k} \le -\pi/2+2m_{1,k}\pi \le \eta_{1,k}+2\pi$, $\eta_{2,k}-2\pi \le 3\pi/2+2m_{2,k}\pi \le \eta_{2,k}$.

Consider now an interval $[-\pi/2+2m\pi,\,3\pi/2+2m\pi] \subseteq [-\pi/2+2m_{1,k}\pi,\,3\pi/2+2m_{2,k}\pi]$, where $m$ is an integer, and make the substitution $\tilde{\eta} = \eta-2m\pi$, $\tilde{\xi} = \xi - m\pi$ in that interval. Compute
\[ \Sigma_+^2 + (1-\Sigma_+^2)x^2 + \Sigma_+ = (1+\Sigma_+)\left(\Sigma_+ + (1-\Sigma_+)\frac{1}{2}(1-\cos\eta)\right) = (1+\Sigma_+)\left(\frac{1}{2}(1+\Sigma_+) - \frac{1}{2}(1-\Sigma_+)\cos\tilde{\eta}\right) = \frac{1}{2}(1+\Sigma_+)\left((1+\Sigma_+) - (1-\Sigma_+)\cos\tilde{\eta}\right). \]
This expression is the relevant part of the numerator of the integrand in the right hand side of (98). There is a drift term yielding a positive contribution to the integral, but the oscillatory term is arbitrarily much greater by Lemma 15.6. The interval $[-\pi/2,3\pi/2]$ was not chosen at random. By considering the above expression, one concludes that the oscillatory term is negative in $[-\pi/2,\pi/2]$ and positive in $[\pi/2,3\pi/2]$. As far as obtaining a contradiction goes, the first interval is thus bad and the second good. In order to estimate the integral over a period, the natural thing to do is then to make a substitution in the interval $[\pi/2,3\pi/2]$, so that it becomes an integral over the interval $[-\pi/2,\pi/2]$. It is then important to know how the different expressions vary with $\eta$. We will prove a lemma saying that $\Sigma_+$ roughly increases with $\eta$, and it will turn out to be useful that $\Sigma_+$ is greater in the good part than in the bad. Let
\[ J = \int_{-\pi/2+2m\pi}^{3\pi/2+2m\pi}\frac{\Sigma_+^2+\Sigma_-^2+\Sigma_+}{-2g}\,d\eta = \frac{1}{2}\int_{-\pi/2}^{3\pi/2}\frac{(1+\Sigma_+(\tilde{\eta}+2m\pi))^2}{-2g(\tilde{\eta}+2m\pi)}\,d\tilde{\eta} - \frac{1}{2}\int_{-\pi/2}^{3\pi/2}\frac{(1-\Sigma_+^2(\tilde{\eta}+2m\pi))\cos\tilde{\eta}}{-2g(\tilde{\eta}+2m\pi)}\,d\tilde{\eta} \]
\[ \qquad + \int_{-\pi/2}^{3\pi/2}\frac{\Sigma_-^2(\tilde{\eta}+2m\pi) - (1-\Sigma_+^2(\tilde{\eta}+2m\pi))\,x^2(\tilde{\eta}+2m\pi)}{-2g(\tilde{\eta}+2m\pi)}\,d\tilde{\eta} = J_1+J_2+J_3. \tag{99} \]
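The splitting of the numerator into a drift term and an oscillatory term, which underlies $J_1$ and $J_2$ in (99), can be checked numerically. The following Python sketch assumes $x = \sin(\eta/2)$, as in the computation preceding (99); the function names are illustrative and not part of the original text.

```python
import math

def numerator(sigma_plus, eta):
    """Sigma_+^2 + (1 - Sigma_+^2) x^2 + Sigma_+ with x = sin(eta/2)."""
    x = math.sin(eta / 2)
    return sigma_plus**2 + (1 - sigma_plus**2) * x**2 + sigma_plus

def split(sigma_plus, eta):
    """Return (drift, oscillation): the two parts of the numerator."""
    drift = 0.5 * (1 + sigma_plus) ** 2
    oscillation = -0.5 * (1 - sigma_plus**2) * math.cos(eta)
    return drift, oscillation

# Sample check at one point with Sigma_+ close to -1.
d, o = split(-0.99, 0.3)
gap = abs(numerator(-0.99, 0.3) - (d + o))
```

For $\Sigma_+$ close to $-1$ the drift is quadratic in $1+\Sigma_+$ while the oscillation is linear in it, which is why the oscillatory term dominates and has to be handled by a cancellation argument.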

If we can prove that $J$ is positive regardless of $m$ we are done, since $J$ positive contradicts (98). The integral $J_1$ is positive, and because the relative variation of the integrand can be chosen arbitrarily small by choosing $k$ large enough, $J_1$ is of the order of magnitude
\[ \frac{(1+\Sigma_+)^2}{N_2+N_3}. \tag{100} \]
If negative terms in $J_2$ and $J_3$ of the orders of magnitude
\[ \frac{(1+\Sigma_+)^3}{(N_2+N_3)^2} \tag{101} \]
or
\[ \frac{(1+\Sigma_+)^3}{(N_2+N_3)^3} \tag{102} \]
occur, we may ignore them by (93) and (92). By (80), $J_3$ may be ignored. Observe that the largest integrand is the one appearing in $J_2$. However, it oscillates. Considering (99), one can see that writing out arguments such as $\tilde{\eta}+2m\pi$ does not make things all that much clearer. For that reason, we introduce the following convention.

Convention 15.1 By $\Sigma_+(\tilde{\eta})$ and $\Sigma_+(-\tilde{\eta}+\pi)$, we will mean $\Sigma_+(\tilde{\eta}+2m\pi)$ and $\Sigma_+(-\tilde{\eta}+\pi+2m\pi)$ respectively, and similarly for all expressions in the variables of Wainwright and Hsu. However, trigonometric expressions should be read as stated. Thus $\cos(\tilde{\eta}/2)$ means just that and not $\cos(\tilde{\eta}/2+m\pi)$.

Definition 15.1 Consider an integral expression
\[ I = \int_{-\pi/2}^{3\pi/2} f(\tilde{\eta})\,d\tilde{\eta}. \]
Then we say that $I$ is less than or equal to zero up to order of magnitude, if
\[ I \le \int_{-\pi/2}^{3\pi/2} g(\tilde{\eta})\,d\tilde{\eta}, \]
where $g$ satisfies a bound
\[ g \le C_1\frac{(1+\Sigma_+)^3}{(N_2+N_3)^2} + C_2\frac{(1+\Sigma_+)^3}{(N_2+N_3)^3}, \]
for $k$ large enough, where $C_1$ and $C_2$ are positive constants independent of $k$. We write $I \lesssim 0$. The definition of $I \gtrsim 0$ is similar. We also define the concept similarly if the interval of integration is different.

We will use the same terminology more generally in inequalities between functions, if those inequalities, when inserted into the proper integrals, yield inequalities in the sense of the definition above. We will write $\approx$ if the error is of negligible order of magnitude.

Lemma 15.14 If $J_2$ as defined above satisfies $J_2 \gtrsim 0$, then $J$ is non-negative for $k$ large enough.

Proof. Under the assumptions of the lemma, we have
\[ J \ge \frac{1}{2}\int_{-\pi/2}^{3\pi/2}\frac{(1+\Sigma_+)^2}{-2g}\,d\tilde{\eta} - \int_{-\pi/2}^{3\pi/2}\left(C_1\frac{(1+\Sigma_+)^3}{(N_2+N_3)^2} + C_2\frac{(1+\Sigma_+)^3}{(N_2+N_3)^3}\right)d\tilde{\eta} + \int_{-\pi/2}^{3\pi/2}\frac{\Sigma_-^2-(1-\Sigma_+^2)x^2}{-2g}\,d\tilde{\eta}. \]
By Corollary 15.2, Lemma 15.12 and (80), we conclude that for $k$ large enough, $J$ is positive. ✷

The following lemma says that $\Sigma_+$ almost increases with $\tilde{\eta}$.

Lemma 15.15 Let $-\pi/2 \le \tilde{\eta}_a \le \tilde{\eta}_b \le 3\pi/2$. Then
\[ \Sigma_+(\tilde{\eta}_b) - \Sigma_+(\tilde{\eta}_a) \ge -(1+\Sigma_+(\tilde{\eta}_{\min}))^8, \]
where $\tilde{\eta}_{\min}$ corresponds to the minimum of $1+\Sigma_+$ in $[-\pi/2,3\pi/2]$.

Proof. We have
\[ \Sigma_+' \le \frac{3}{2}(2-\gamma)\Omega, \]
so that
\[ \frac{d\Sigma_+}{d\tilde{\eta}} \ge \frac{3}{2}(2-\gamma)\frac{\Omega}{2g}. \]
Using (76), (93) and Lemma 15.12, we conclude that
\[ \frac{d\Sigma_+}{d\tilde{\eta}} \ge -\frac{1}{2\pi}(1+\Sigma_+(\tilde{\eta}_{\min}))^8. \]
The lemma follows. ✷

Lemma 15.16 If
\[ I = \int_{-\pi/2}^{3\pi/2}\frac{1+\Sigma_+}{-g}\cos\tilde{\eta}\,d\tilde{\eta} \]
satisfies $I \lesssim 0$, then $J_2 \gtrsim 0$.


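Proof. Consider
\[ -J_2 = \int_{-\pi/2}^{3\pi/2}\frac{(1-\Sigma_+^2)\cos\tilde{\eta}}{-4g}\,d\tilde{\eta} = \int_{-\pi/2}^{3\pi/2}\frac{(\Sigma_+(3\pi/2)-\Sigma_+)(1+\Sigma_+)}{-4g}\cos\tilde{\eta}\,d\tilde{\eta} + (1-\Sigma_+(3\pi/2))\int_{-\pi/2}^{3\pi/2}\frac{1+\Sigma_+}{-4g}\cos\tilde{\eta}\,d\tilde{\eta}. \]
The first integral is negligible by (95). The lemma follows. ✷

Lemma 15.17 If
\[ I_1 = \int_{-\pi/2}^{\pi/2}\frac{(1+\Sigma_+(\tilde{\eta}))(g_1(\tilde{\eta})-g_1(-\tilde{\eta}+\pi))}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} \]
satisfies $I_1 \lesssim 0$, then $J_2 \gtrsim 0$.

Proof. We have
\[ I = \int_{-\pi/2}^{3\pi/2}\frac{1+\Sigma_+}{-g}\cos\tilde{\eta}\,d\tilde{\eta} = \int_{-\pi/2}^{\pi/2}\frac{1+\Sigma_+}{-g}\cos\tilde{\eta}\,d\tilde{\eta} + \int_{\pi/2}^{3\pi/2}\frac{1+\Sigma_+}{-g}\cos\tilde{\eta}\,d\tilde{\eta}. \]
Make the substitution $\chi = -\tilde{\eta}+\pi$ in the second integral;
\[ \int_{\pi/2}^{-\pi/2}\frac{1+\Sigma_+(-\chi+\pi)}{-g(-\chi+\pi)}\cos(-\chi+\pi)\,(-d\chi) = -\int_{-\pi/2}^{\pi/2}\frac{1+\Sigma_+(-\chi+\pi)}{-g(-\chi+\pi)}\cos(\chi)\,d\chi. \]
Thus,
\[ I = \int_{-\pi/2}^{\pi/2}\left(\frac{1+\Sigma_+(\tilde{\eta})}{-g(\tilde{\eta})} - \frac{1+\Sigma_+(-\tilde{\eta}+\pi)}{-g(-\tilde{\eta}+\pi)}\right)\cos\tilde{\eta}\,d\tilde{\eta} = \int_{-\pi/2}^{\pi/2}\frac{(1+\Sigma_+(-\tilde{\eta}+\pi))g(\tilde{\eta}) - (1+\Sigma_+(\tilde{\eta}))g(-\tilde{\eta}+\pi)}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta}. \]
But $(1+\Sigma_+(-\tilde{\eta}+\pi))g(\tilde{\eta}) \lesssim (1+\Sigma_+(\tilde{\eta}))g(\tilde{\eta})$, by Lemma 15.15, so that
\[ I \lesssim \int_{-\pi/2}^{\pi/2}\frac{(1+\Sigma_+(\tilde{\eta}))(g(\tilde{\eta})-g(-\tilde{\eta}+\pi))}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta}. \tag{103} \]
Now, $g(\tilde{\eta})-g(-\tilde{\eta}+\pi) = g_1(\tilde{\eta})-g_1(-\tilde{\eta}+\pi)+g_2(\tilde{\eta})-g_2(-\tilde{\eta}+\pi)$, but since $2xy = \sin\tilde{\eta}$ and the error committed in replacing $\tilde{x}$ with $x$ and $\tilde{y}$ with $y$ is negligible by (82), we have
\[ g_2(\tilde{\eta})-g_2(-\tilde{\eta}+\pi) \approx -(1+\Sigma_+(\tilde{\eta}))\sin\tilde{\eta} + (1+\Sigma_+(-\tilde{\eta}+\pi))\sin(-\tilde{\eta}+\pi) = \]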

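\[ = (\Sigma_+(-\tilde{\eta}+\pi) - \Sigma_+(\tilde{\eta}))\sin\tilde{\eta}. \]
The corresponding contribution to the integral may consequently be neglected; the error in the integral will be of type (102) by (95). Consequently, if
\[ I_1 = \int_{-\pi/2}^{\pi/2}\frac{(1+\Sigma_+(\tilde{\eta}))(g_1(\tilde{\eta})-g_1(-\tilde{\eta}+\pi))}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} \]
satisfies $I_1 \lesssim 0$, then $I \lesssim 0$ by (103), so that the lemma follows by Lemma 15.16. ✷

Let $h_1(\tilde{\eta}) = g_1(\tilde{\eta}) - g_1(-\tilde{\eta}+\pi)$. We estimate $h_1$ by estimating the derivative. We have $h_1(\pi/2) = 0$.

Lemma 15.18 Let $h_1$ be as above. In the interval $[-\pi/2,\pi/2]$, we have
\[ \frac{dh_1}{d\tilde{\eta}} \gtrsim 3\left(\frac{1-\Sigma_+^2(\tilde{\eta})}{-g(\tilde{\eta})} + \frac{1-\Sigma_+^2(-\tilde{\eta}+\pi)}{-g(-\tilde{\eta}+\pi)}\right)\sin\tilde{\eta}. \tag{104} \]

Proof. Compute
\[ \frac{dh_1}{d\tilde{\eta}}(\tilde{\eta}) = \frac{dg_1}{d\tilde{\eta}}(\tilde{\eta}) + \frac{dg_1}{d\tilde{\eta}}(-\tilde{\eta}+\pi). \]
But
\[ \frac{dg_1}{d\tilde{\eta}} = -\frac{3}{2g}\left((q+2\Sigma_+)(N_2+N_3) + 2\sqrt{3}\,\Sigma_-(N_2-N_3)\right) = \frac{1}{2}\,\frac{g-g_2}{g}\,(q+2\Sigma_+) - 3\sqrt{3}\,\frac{\Sigma_-(N_2-N_3)}{g}. \]
Observe that $x$ and $y$ are trigonometric expressions, and that $2x(\tilde{\eta}+2m\pi)y(\tilde{\eta}+2m\pi) = 2\sin(\tilde{\eta}/2+m\pi)\cos(\tilde{\eta}/2+m\pi) = \sin\tilde{\eta}$. We have
\[ \sqrt{3}\,\Sigma_-(N_2-N_3) \approx 2(1-\Sigma_+^2)xy = (1-\Sigma_+^2)\sin\tilde{\eta}, \]
so that
\[ \frac{dg_1}{d\tilde{\eta}} \approx \left(\frac{1}{4}(3\gamma-2)\Omega + \Sigma_+^2+\Sigma_-^2+\Sigma_+\right) - \frac{g_2}{g}\left(\frac{1}{4}(3\gamma-2)\Omega + \Sigma_+^2+\Sigma_-^2+\Sigma_+\right) - \frac{3(1-\Sigma_+^2)\sin\tilde{\eta}}{g}. \]
The middle term and all terms involving $\Omega$ may be ignored. Estimate
\[ \Sigma_+^2(\tilde{\eta}) + \Sigma_-^2(\tilde{\eta}) + \Sigma_+(\tilde{\eta}) + \Sigma_+^2(-\tilde{\eta}+\pi) + \Sigma_-^2(-\tilde{\eta}+\pi) + \Sigma_+(-\tilde{\eta}+\pi) \approx \]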


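\[ \approx \Sigma_+^2(\tilde{\eta}) + (1-\Sigma_+^2(\tilde{\eta}))(\sin^2(\tilde{\eta}/2+m\pi)-1/2) + \frac{1}{2}(1-\Sigma_+^2(\tilde{\eta})) + \Sigma_+(\tilde{\eta}) \]
\[ \quad + \Sigma_+^2(-\tilde{\eta}+\pi) + (1-\Sigma_+^2(-\tilde{\eta}+\pi))(\cos^2(\tilde{\eta}/2+m\pi)-1/2) + \frac{1}{2}(1-\Sigma_+^2(-\tilde{\eta}+\pi)) + \Sigma_+(-\tilde{\eta}+\pi) \]
\[ = \frac{1}{2}(1+\Sigma_+(\tilde{\eta}))^2 + \frac{1}{2}(1+\Sigma_+(-\tilde{\eta}+\pi))^2 + (1-\Sigma_+^2(\tilde{\eta}))(\sin^2(\tilde{\eta}/2)-1/2) + (1-\Sigma_+^2(-\tilde{\eta}+\pi))(\cos^2(\tilde{\eta}/2)-1/2). \]
The first equality is a consequence of (80). Due to the fact that $\tilde{\eta} \in [-\pi/2,\pi/2]$, we have $\cos^2(\tilde{\eta}/2)-1/2 \ge 0$. Since $-\tilde{\eta}+\pi \ge \tilde{\eta}$ and $\Sigma_+$ increases with $\tilde{\eta}$ up to order of magnitude according to Lemma 15.15, we have $1-\Sigma_+^2(-\tilde{\eta}+\pi) \gtrsim 1-\Sigma_+^2(\tilde{\eta})$. Consequently,
\[ \frac{1}{2}(1+\Sigma_+(\tilde{\eta}))^2 + \frac{1}{2}(1+\Sigma_+(-\tilde{\eta}+\pi))^2 + (1-\Sigma_+^2(\tilde{\eta}))(\sin^2(\tilde{\eta}/2)-1/2) + (1-\Sigma_+^2(-\tilde{\eta}+\pi))(\cos^2(\tilde{\eta}/2)-1/2) \]
\[ \gtrsim \frac{1}{2}(1+\Sigma_+(\tilde{\eta}))^2 + \frac{1}{2}(1+\Sigma_+(-\tilde{\eta}+\pi))^2 + (1-\Sigma_+^2(\tilde{\eta}))(\sin^2(\tilde{\eta}/2)-1/2) + (1-\Sigma_+^2(\tilde{\eta}))(\cos^2(\tilde{\eta}/2)-1/2) \ge 0. \]
In other words, we have (104). Here the importance of the fact that $\Sigma_+$ is greater in the good part than in the bad becomes apparent. ✷

Lemma 15.19 Let $I_1$ be defined as above. Then $I_1 \lesssim 0$.

Proof. Let $\tilde{\eta}_{\max}$ and $\tilde{\eta}_{\min}$ correspond to the max and min of $-g$ in the interval $[-\pi/2,3\pi/2]$, and let $\tilde{\eta}_a$ and $\tilde{\eta}_b$ correspond to the max and min of $\Sigma_+$ in the same interval. Observe that for $\tilde{\eta} \in [-\pi/2,3\pi/2]$, we have
\[ 1-\Sigma_+^2(\tilde{\eta}_a) \ge 1-\Sigma_+^2(\tilde{\eta}) \ge 1-\Sigma_+^2(\tilde{\eta}_b). \]
In order not to obtain too complicated expressions, let us introduce the following terminology:
\[ a_1 = 6\,\frac{1-\Sigma_+^2(\tilde{\eta}_b)}{-g(\tilde{\eta}_{\max})} \le 3\left(\frac{1-\Sigma_+^2(\tilde{\eta})}{-g(\tilde{\eta})} + \frac{1-\Sigma_+^2(-\tilde{\eta}+\pi)}{-g(-\tilde{\eta}+\pi)}\right) \le 6\,\frac{1-\Sigma_+^2(\tilde{\eta}_a)}{-g(\tilde{\eta}_{\min})} = a_2 \]
and
\[ b_1 = \frac{1+\Sigma_+(\tilde{\eta}_b)}{g^2(\tilde{\eta}_{\max})} \le \frac{1+\Sigma_+(\tilde{\eta})}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)} \le \frac{1+\Sigma_+(\tilde{\eta}_a)}{g^2(\tilde{\eta}_{\min})} = b_2, \]
where $\tilde{\eta} \in [-\pi/2,3\pi/2]$. Observe that
\[ \lim_{k\to\infty}\frac{a_1}{a_2} = \lim_{k\to\infty}\frac{b_1}{b_2} = 1, \tag{105} \]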

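by Corollary 15.2 and Lemma 15.12. Consider the interval $[0,\pi/2]$. By (104), we have
\[ \frac{dh_1}{d\tilde{\eta}} \gtrsim a_1\sin\tilde{\eta}, \tag{106} \]
so that
\[ h_1(\tilde{\eta}) = h_1(\pi/2) - \int_{\tilde{\eta}}^{\pi/2}\frac{dh_1}{d\tilde{\eta}}\,d\tilde{\eta} \lesssim -a_1\cos\tilde{\eta} \]
in the interval $[0,\pi/2]$. Now consider the interval $[-\pi/2,0]$. We have
\[ \frac{dh_1}{d\tilde{\eta}} \gtrsim a_2\sin\tilde{\eta}. \]
Consequently,
\[ h_1(\tilde{\eta}) = h_1(0) - \int_{\tilde{\eta}}^{0}\frac{dh_1}{d\tilde{\eta}}\,d\tilde{\eta} \lesssim -a_1 + a_2(1-\cos\tilde{\eta}) \]
in the interval $[-\pi/2,0]$. Estimate
\[ \int_0^{\pi/2}\frac{(1+\Sigma_+(\tilde{\eta}))(g_1(\tilde{\eta})-g_1(-\tilde{\eta}+\pi))}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} = \int_0^{\pi/2}\frac{(1+\Sigma_+(\tilde{\eta}))h_1(\tilde{\eta})}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} \lesssim \int_0^{\pi/2}\frac{1+\Sigma_+(\tilde{\eta})}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}(-a_1\cos^2\tilde{\eta})\,d\tilde{\eta} \le -a_1b_1\int_0^{\pi/2}\cos^2\tilde{\eta}\,d\tilde{\eta} = -\frac{\pi a_1b_1}{4}. \]
We also estimate
\[ \int_{-\pi/2}^{0}\frac{(1+\Sigma_+(\tilde{\eta}))(g_1(\tilde{\eta})-g_1(-\tilde{\eta}+\pi))}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} = \int_{-\pi/2}^{0}\frac{(1+\Sigma_+(\tilde{\eta}))h_1(\tilde{\eta})}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} \]
\[ \lesssim -a_1\int_{-\pi/2}^{0}\frac{1+\Sigma_+(\tilde{\eta})}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}\cos\tilde{\eta}\,d\tilde{\eta} + a_2\int_{-\pi/2}^{0}\frac{1+\Sigma_+(\tilde{\eta})}{g(\tilde{\eta})g(-\tilde{\eta}+\pi)}(1-\cos\tilde{\eta})\cos\tilde{\eta}\,d\tilde{\eta} \]
\[ \le -a_1b_1\int_{-\pi/2}^{0}\cos\tilde{\eta}\,d\tilde{\eta} + a_2b_2\int_{-\pi/2}^{0}(1-\cos\tilde{\eta})\cos\tilde{\eta}\,d\tilde{\eta} \le -a_1b_1 + \left(1-\frac{\pi}{4}\right)a_2b_2. \]
Adding up, we conclude that
\[ I_1 \lesssim -(1+\pi/4)\,a_1b_1 + (1-\pi/4)\,a_2b_2 = \left[-(1+\pi/4)\frac{a_1b_1}{a_2b_2} + (1-\pi/4)\right]a_2b_2, \]
which is negative for $k$ large enough by (105). Thus $I_1 \lesssim 0$. ✷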
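The three elementary integrals entering the last estimate, and the sign of the resulting bracket, can be confirmed numerically. The Python sketch below uses a plain composite Simpson rule; the helper names `simpson` and `bracket` are illustrative, and `ratio` stands for the quotient $a_1b_1/(a_2b_2)$, which tends to one by (105).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Integral of cos^2 over [0, pi/2]: the weight of -a1*b1 on the first interval.
good = simpson(lambda t: math.cos(t) ** 2, 0.0, math.pi / 2)
# Integral of cos over [-pi/2, 0]: the weight of -a1*b1 on the second interval.
bad_a1 = simpson(math.cos, -math.pi / 2, 0.0)
# Integral of (1 - cos) cos over [-pi/2, 0]: the weight of a2*b2.
bad_a2 = simpson(lambda t: (1 - math.cos(t)) * math.cos(t), -math.pi / 2, 0.0)

def bracket(ratio):
    """-(1 + pi/4) * ratio + (1 - pi/4), with ratio = a1*b1 / (a2*b2)."""
    return -(good + bad_a1) * ratio + bad_a2

limit_value = bracket(1.0)  # value of the bracket in the limit k -> infinity
```

In the limit the bracket equals $-\pi/2$, so there is ample room: the bracket stays negative as long as the ratio exceeds $(4-\pi)/(4+\pi)$.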


Theorem 15.1 The conditions of Lemma 15.2 are never met.

Proof. If the conditions are met, then Lemma 15.13 follows, and also that it is false, by Lemmas 15.19, 15.17, 15.14 and (99). ✷

Let $A$ be the set of vacuum type I and II points.

Corollary 15.3 Let $2/3 < \gamma < 2$. For every $\epsilon > 0$ there is a $\delta > 0$ such that if $x$ constitutes Bianchi IX initial data for (9)-(11) and
\[ \inf_{y\in A}\|x-y\| \le \delta \]
then
\[ \inf_{y\in A}\|\Phi(\tau,x)-y\| \le \epsilon \]
for all $\tau \le 0$, where $\Phi$ is the flow of (9)-(11).

Proof. Assuming the contrary, there is an $\epsilon > 0$ and a sequence $x_l \to A$ such that
\[ \inf_{y\in A}\|\Phi(s_l,x_l)-y\| \ge \epsilon \]
for some $s_l \le 0$. Let $\tau_l = 0$. Since $d(\tau_l,x_l) \to 0$ and we can assume $\epsilon$ is small enough that Proposition 14.1 is applicable, there must be an $\eta > 0$ such that $h(s_l,x_l) > \eta$ for $l$ large enough, contradicting Theorem 15.1. ✷

Corollary 15.4 Consider a generic Bianchi IX solution with $2/3 < \gamma < 2$. Then
\[ \lim_{\tau\to-\infty}(\Omega + N_1N_2 + N_2N_3 + N_1N_3) = 0. \]

Proof. If $h$ does not converge to zero, then the conditions of Lemma 15.2 are met, since for a generic solution there is an α-limit point on the Kasner circle by Proposition 13.1. Corollary 14.1 then yields the desired conclusion. ✷

Corollary 15.5 Let $2/3 < \gamma < 2$. The closure of $F_{IX}$ and the closure of $P_{IX}$ do not intersect $A$. Furthermore, the set of generic Bianchi IX points is open in the set of Bianchi IX points.

Remark. The closure of the Taub type IX points does intersect $A$.

Proof. Assume there is a sequence $x_l \in F_{IX}$ such that $x_l \to x \in A$. By Corollary 15.3 this is impossible since $F$ has a positive Ω-coordinate. The argument for $P_{IX}$ is similar, since the Ω-coordinate of $P_1^+(II)$ is positive. Consider now a generic point $x$ in the set of Bianchi IX points. There is a neighbourhood of $x$ that does not intersect the Taub points. Let us prove the similar statement for $F_{IX}$ and $P_{IX}$. Assume there is a sequence $x_l \in F_{IX}$ such that $x_l \to x$. For each $\epsilon > 0$ there is a $T \le 0$ such that $d(T,x) \le \epsilon/2$, by Corollary 15.4. By continuity of the flow and the function $d$, we conclude that for $l$ large enough we have $d(T,x_l) \le \epsilon$. Since $\Phi(T,x_l) \in F_{IX}$, we get a contradiction to the first part of the lemma. Thus, there is an open neighbourhood of $x$ that does not intersect $F_{IX}$. The argument for $P_{IX}$ is similar. ✷

Corollary 15.6 Let 2/3 < γ < 2. The closure of FVII0 and the closure of PVII0 do not intersect A. Furthermore, the generic Bianchi VII0 points are open in the set of Bianchi VII0 points. Proof. The argument proving the first part is as in the Bianchi IX case, once one has checked that analogues of Proposition 14.1 and Theorem 15.1 hold in the Bianchi VII0 case. The second part then follows as in the Bianchi IX case, using Proposition 10.2. ✷

16 Regularity of the set of non-generic points

Observe that the constraint (11) together with the additional assumption $\Omega \ge 0$ defines a 5-dimensional submanifold of $\mathbb{R}^6$ which has a 4-dimensional boundary given by the vacuum points. We have the following.

Theorem 16.1 Let $2/3 < \gamma < 2$. The sets $F_{II}$, $F_{VII_0}$, $F_{IX}$, $P_{VII_0}$ and $P_{IX}$ are $C^1$ submanifolds of $\mathbb{R}^6$ of dimensions 1, 2, 3, 1 and 2 respectively.

We prove this theorem at the end of this section. The idea is as follows. The only obstruction to e.g. $F_{II}$ being a $C^1$ submanifold is if there is an open set $O$ containing $F$ and a sequence $x_k \in F_{II}$ such that $x_k \to F$, but each $x_k$ has to leave $O$ before it can converge to $F$. If there is such a sequence, we produce a sequence $y_k \in F_{II}$ such that the distance from $y_k$ to $A$ converges to zero, contradicting Lemma 9.1. The argument is similar in the other cases. We will need some results from [12]. The theorem stated below is a special case of Theorem 6.2, p. 243.

Theorem 16.2 In the differential equation
\[ \xi' = E\xi + G(\xi) \tag{107} \]
let $G$ be of class $C^1$ and $G(0) = 0$, $\partial_\xi G(0) = 0$. Let $E$ have $e > 0$ eigenvalues with positive real parts, $d > 0$ eigenvalues with negative real parts and no eigenvalues with zero real part. Let $\xi_t = \xi(t,\xi_0)$ be the solution of (107) satisfying $\xi(0,\xi_0) = \xi_0$ and $T^t$ the corresponding map $T^t(\xi_0) = \xi(t,\xi_0)$. Then there exists a map $R$ of a neighbourhood of $\xi = 0$ in $\xi$-space onto a neighbourhood of the origin in Euclidean $(u,v)$-space, where $\dim(u) = d$ and $\dim(v) = e$, such that $R$ is $C^1$ with nonvanishing Jacobian and $RT^tR^{-1}$ has the form
\[ \begin{pmatrix} u_t \\ v_t \end{pmatrix} = \begin{pmatrix} e^{tP}u_0 + U(t,u_0,v_0) \\ e^{tQ}v_0 + V(t,u_0,v_0) \end{pmatrix}. \tag{108} \]
$U$, $V$ and their partial derivatives with respect to $u_0$, $v_0$ vanish at $(u_0,v_0) = 0$. Furthermore $V = 0$ if $v_0 = 0$ and $U = 0$ if $u_0 = 0$. Finally $\|e^P\| < 1$ and $\|e^{-Q}\| < 1$.
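To get a feeling for the normal form (108), it can help to iterate a toy time-one map of the same shape. The Python sketch below is purely illustrative: $u$ and $v$ are scalars, and the constants `a`, `c`, `k` and the bilinear perturbation are assumptions made for the example (they satisfy $U = 0$ if $u_0 = 0$ and $V = 0$ if $v_0 = 0$), not quantities derived from the equations of Wainwright and Hsu.

```python
def time_one_map(u, v, a=0.5, c=2.0, k=0.1):
    """Toy analogue of (108) at t = 1: contraction by a in u, expansion by c in v,
    plus a small perturbation k*u*v in each component."""
    return a * u + k * u * v, c * v + k * u * v

def steps_inside(u, v, radius=0.2, max_steps=200):
    """Iterate the map while the orbit stays in the sup-norm ball of the given radius.
    Returns (n, escaped), where n is the number of iterations applied."""
    for n in range(max_steps):
        if max(abs(u), abs(v)) >= radius:
            return n, True
        u, v = time_one_map(u, v)
    return max_steps, False

on_stable_set = steps_inside(0.1, 0.0)    # v = 0: the orbit contracts towards the origin
off_stable_set = steps_inside(0.1, 1e-6)  # tiny v: the orbit is eventually expelled
```

Orbits that stay in the ball for all time are exactly those with $v = 0$ in this toy model, which mirrors how the set $\{v = 0\}$ singles out the points that converge to the fixed point.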


Let us begin by considering the local behaviour close to the fixed points.

Lemma 16.1 Consider the critical point $F$. There is an open neighbourhood $O$ of $F$ in $\mathbb{R}^6$, and a 1-dimensional $C^1$ submanifold $M_{II} \subseteq F_{II}$ of $O \cap I_{II}$, such that for each $x \in O \cap I_{II}$, either $x \in M_{II}$, or $x$ will leave $O$ as the flow of (9)-(11) is applied to $x$ in the negative time direction. Similarly, we get a 2-dimensional $C^1$ submanifold $M_{VII_0}$ of $O \cap I_{VII_0}$, and a 3-dimensional $C^1$ submanifold $M_{IX}$ of $O \cap I_{IX}$ with the same properties. Consider the critical point $P_1^+(II)$. We then have a similar situation. Give the neighbourhood corresponding to $O$ the name $P$, and use the letter $N$ instead of the letter $M$ to denote the relevant submanifolds. Then $N_{VII_0}$ has dimension 1 and $N_{IX}$ has dimension 2.

Proof. Observe that when $\Omega > 0$, we can consider (9)-(11) to be an unconstrained system of equations in five variables. Using the constraint (11) to express $\Omega$ in terms of the other variables, we can ignore $\Omega$ and consider the first five equations of (9) as a set of equations on an open submanifold of $\mathbb{R}^5$, defined by the condition $\Omega > 0$ (considering $\Omega$ as a function of the other variables). In the Bianchi VII$_0$ case, we can consider the system to be unconstrained in four variables. Let us first deal with the Bianchi VII$_0$ case. Consider the fixed point $P_1^+(II)$. Considering the Bianchi VII$_0$ points with $N_1, N_2 > 0$ and $N_3 = 0$, the linearization has one eigenvalue with positive real part and three with negative real part, cf. [20]. By a suitable translation of the variables, reversal of time, and a suitable definition of $G$ and $E$ in (107), we can consider a solution to (9)-(11) converging to $P_1^+(II)$ as $\tau \to -\infty$ as a solution $\xi$ to (107) converging to 0 as $t \to \infty$.

$E$ has one eigenvalue with negative real part and three with positive real part, so that Theorem 16.2 yields a $C^1$ map $R$ of a neighbourhood of 0 with non-vanishing Jacobian to a neighbourhood of the origin in $\mathbb{R}^4$, such that the flow takes the form (108) where $u \in \mathbb{R}$ and $v \in \mathbb{R}^3$. Observe that since $\xi = 0$ is a fixed point, there is a neighbourhood of that point such that the flow is defined for $|t| \le 1$. There is also an open bounded ball $B$ centered at the origin in $(u_0,v_0)$-space such that $U$ and $V$ are defined in a neighbourhood $N$ of $[-1,1]\times B$. Let $a = \|e^P\|$ and $1/c = \|e^{-Q}\|$. For any $\epsilon > 0$, we can choose $B$ and then $N$ small enough that the norms of $U$, $V$ and their partial derivatives with respect to $u$ and $v$ are smaller than $\epsilon$ in $N$. Assume $B$ and $N$ are such for some $\epsilon$ satisfying
\[ \epsilon < \min\left\{\frac{c-1}{2},\frac{1-a}{2}\right\}. \tag{109} \]
Consider a solution $\xi$ to (107) such that $R\circ\xi(t) \in B$ for all $t \ge T$. Let $(u_t,v_t) = R(\xi(t))$ for $t \ge T$. We wish to prove that $v_t = 0$, and assume therefore that $v_{t_0} \ne 0$ for some $t_0 \ge T$. We have
\[ \|v_{t_0+n}\| \ge \|e^Qv_{t_0+n-1} + V(1,u_{t_0+n-1},v_{t_0+n-1})\| \ge c\|v_{t_0+n-1}\| - \epsilon\|v_{t_0+n-1}\| \ge \frac{1+c}{2}\|v_{t_0+n-1}\|, \]

where we have used (109), the fact that $V$ is zero when $v_0 = 0$, and the fact that $(u_t,v_t)$ remain in $B$ for $t \ge T$. Thus,
\[ \|v_{t_0+n}\| \ge \left(\frac{1+c}{2}\right)^n\|v_{t_0}\|, \]
which is irreconcilable with the fact that $v_t$ remains bounded. If $(u_{t_0},v_{t_0}) \in B$ and $v_{t_0} = 0$, (108) yields $v_{t_0+1} = 0$ and
\[ \|u_{t_0+1}\| \le \left(a + \frac{1-a}{2}\right)\|u_{t_0}\| = \frac{1+a}{2}\|u_{t_0}\|. \]
Consequently, all points $(u,v) \in B$ with $v = 0$ converge to $(0,0)$ as one applies the flow. We are now in a position to go backwards in order to obtain the conclusions of the lemma. The set $R^{-1}(B)$ will, after suitable operations, including non-unique extensions, turn into the set $P$ and $R^{-1}(\{v = 0\}\cap B)$ turns into $N_{VII_0}$. One can carry out a similar construction in the Bianchi IX case. Observe that one might then get a different $P$, but by taking the intersection we can assume them to be the same. The dimension of $N_{IX}$ follows from a computation of the eigenvalues. The argument concerning the fixed point $F$ is similar. ✷

Proof of Theorem 16.1. Let $O$, $M_{II}$ and so on be as in the statement of Lemma 16.1, and let $\Phi$ be the flow of (9)-(10). Observe that if there is a neighbourhood $\tilde{O} \subseteq O$ of $F$ such that $F_{II}\cap\tilde{O} = M_{II}\cap\tilde{O}$, then $F_{II}$ is a $C^1$ submanifold. The reason is that given any $x \in F_{II}$, there is a $T$ such that $\Phi(\tau,x) \in \tilde{O}$ for all $\tau \le T$. By Lemma 16.1, we conclude that $\Phi(T,x) \in M_{II}$. Then there is a neighbourhood $O' \subseteq \tilde{O}$ of $\Phi(T,x)$ such that $O'\cap F_{II} = O'\cap M_{II}$. We thus get, for $O'$ suitably chosen, a $C^1$ map $\psi : O' \to \mathbb{R}^6$ with $C^1$ inverse, sending $F_{II}\cap O'$ to a one dimensional hyperplane. If $O'$ is small enough, we can apply $\Phi(-T,\cdot)$ to it, obtaining a neighbourhood of $x$. By the invariance of $F_{II}$, we have $\Phi(-T,O')\cap F_{II} = \Phi(-T,O'\cap F_{II})$. In other words, $\psi(\Phi(T,\cdot))$ defines coordinates on $\Phi(-T,O')$ straightening out $F_{II}$. The arguments for the other cases are similar.

Let us now assume, in order to reach a contradiction, that there is a sequence $x_k \in F_{II}\cap O$ such that $x_k \to F$ but $x_k \notin M_{II}$ for all $k$. If we let $O' \subseteq O$ be a small enough ball containing $F$, we can assume that $|N_i|' \ge 0$ for $i = 1,2,3$ in $O'$, cf. the proof of Lemma 4.2. For $k$ large enough, $x_k \in O'$ and applying the flow to them we obtain points $y_k \in F_{II}\cap\partial O'$. By choosing a suitable subsequence, we can assume that $y_k$ converges to a type I point $y$ which is not $F$.

Given $\epsilon > 0$, there is a $T$ such that $\Phi(-T,y)$ is at distance less than $\epsilon/2$ from $A$. For $k$ large enough, $\Phi(-T,y_k) \in F_{II}$ will then be at distance less than $\epsilon$ from $A$. We get a contradiction to Lemma 9.1. The arguments for $F_{VII_0}$ and $F_{IX}$ are similar, due to Corollaries 15.6 and 15.5.


For $P_{VII_0}$ and $P_{IX}$, we need to modify the argument. Assume there is a sequence $x_k \in P_{VII_0}\cap P$ such that $x_k \to P_1^+(II)$, but $x_k \notin N_{VII_0}$ for all $k$. By choosing $P' \subseteq P$ as a small enough ball, we can assume that $|N_i|' \ge 0$ in $P'$ for $i = 2,3$, cf. the proof of Lemma 4.1. For $k$ large enough, $x_k \in P'$, and applying the flow to them we obtain points $y_k \in P_{VII_0}\cap\partial P'$. By choosing a suitable subsequence, we can assume that $y_k$ converges to a type II point $y$ which is not $P_1^+(II)$. If $y \notin F_{II}$, we can apply the same kind of reasoning as before, using Proposition 9.1 to get a contradiction to the consequences of Corollary 15.6. If $y \in F_{II}$ we get, by applying the flow to the points $y_k$, a sequence $z_k \in P_{VII_0}$ converging to $F$. Applying the flow again, as before, we get a contradiction. The Bianchi IX case is similar using Corollary 15.5. ✷

17 Uniform convergence to the attractor

If $x$ constitutes initial data to (9)-(11) at $\tau = 0$, then we denote the corresponding solution $\Sigma_+(\tau,x)$ and so on.

Proposition 17.1 Let $2/3 < \gamma \le 2$ and let $K$ be a compact set of Bianchi IX initial data. Then $N_1N_2N_3$ converges uniformly to zero on $K$. That is, for all $\epsilon > 0$ there is a $T$ such that
\[ (N_1N_2N_3)(\tau,x) \le \epsilon \]
for all $\tau \le T$ and all $x \in K$.

Proof. Assume that $N_1N_2N_3$ does not converge to zero uniformly. Then there is an $\epsilon > 0$, a sequence $\tau_k \to -\infty$ and $x_k \in K$ such that
\[ (N_1N_2N_3)(\tau_k,x_k) \ge \epsilon. \]
We may assume, by choosing a convergent subsequence, that $x_k \to x_*$ as $k \to \infty$. Because of the monotonicity of $(N_1N_2N_3)(\cdot,x_k)$, we conclude that $(N_1N_2N_3)(\tau,x_k) \ge \epsilon$ for $\tau \in [\tau_k,0]$. Thus
\[ (N_1N_2N_3)(\tau,x_*) = \lim_{k\to\infty}(N_1N_2N_3)(\tau,x_k) \ge \epsilon \]
for all $\tau \le 0$. We have a contradiction. ✷

Corollary 17.1 Let $2/3 < \gamma \le 2$ and let $K$ be a compact set of Bianchi IX initial data. Then for every $\epsilon > 0$, there is a $T$ such that
\[ \Omega + \Sigma_+^2 + \Sigma_-^2 \le 1 + \epsilon \]
for all $x \in K$ and $\tau \le T$.

Proof. As in the pointwise vacuum case, see [19]. ✷

Consider
\[ d = \Omega + N_1N_2 + N_2N_3 + N_3N_1. \]

Proposition 17.2 Let $K$ be a compact set of generic Bianchi IX initial data with $2/3 < \gamma < 2$. Then $d$ converges uniformly to zero on $K$.

Proof. Assume that $d$ does not converge to zero uniformly. Then there is an $\eta > 0$, a sequence $\tau_k \to -\infty$ and a sequence $x_k \in K$ such that
\[ d(\tau_k,x_k) \ge \eta. \tag{110} \]
We now prove that there is no sequence $s_{k_n}$ such that $\tau_{k_n} \le s_{k_n} \le 0$ and $d(s_{k_n},x_{k_n}) \to 0$. Assume there is. By Theorem 15.1, there is no $\delta > 0$ such that the maximum of $h(\cdot,x_{k_n})$ in $[\tau_{k_n},s_{k_n}]$ exceeds $\delta$ for all $n$. For $\delta$ small enough, we can apply Proposition 14.1 to the interval $[\tau_{k_n},s_{k_n}]$ to conclude that for some $n$, $\Omega$ cannot grow very much in that interval either. We obtain a contradiction to (110) for $\delta$ small enough and $n$ big enough. Thus there is an $\epsilon > 0$ such that $d(\tau,x_k) \ge \epsilon$ for all $\tau \in [\tau_k,0]$ and all $k$. Assume $x_k \to x_*$. Then
\[ d(\tau,x_*) = \lim_{k\to\infty}d(\tau,x_k) \ge \epsilon > 0 \]
for all $\tau \le 0$. But $x_*$ constitutes generic initial data. ✷

18 Existence of non-special α-limit points on the Kasner circle

We know that there is an α-limit point on the Kasner circle, but in order to prove curvature blow up we wish to prove the existence of a non-special α-limit point on the Kasner circle.

Lemma 18.1 Consider a generic Bianchi IX solution with $2/3 < \gamma < 2$. If it has a special point on the Kasner circle as an α-limit point then it has an infinite number of α-limit points on the Kasner circle.

Proof. By applying the symmetries, we can assume that there is an α-limit point on the Kasner circle with $(\Sigma_+,\Sigma_-) = (-1,0)$. Since the solution is not of Taub type, $(\Sigma_+,\Sigma_-)$ cannot converge to $(-1,0)$ by Proposition 3.1. Thus there is an $\epsilon$ with $1 > \epsilon > 0$ such that for each $T$ there is a $\tau \le T$ such that $1+\Sigma_+(\tau) \ge \epsilon$. Let $\tau_k \to -\infty$ be such that $\Sigma_+(\tau_k) \to -1$.

Let $\eta > 0$ satisfy $\eta < \epsilon$. We wish to prove that there is a non-special α-limit point on the Kasner circle with $1+\Sigma_+ \le \eta$. There is a sequence $t_k \le \tau_k$ such that $1+\Sigma_+(t_k) = \eta$ and $\Sigma_+'(t_k) \le 0$ assuming $k$ is large enough. The condition on the derivative is possible to impose due to the fact that $1+\Sigma_+$ eventually has to become greater than $\epsilon$. Choosing a suitable subsequence of $\{t_k\}$, we get an α-limit point which has to be a vacuum type I or II point by Corollary 15.4. If it is of type I, we get an α-limit point on the Kasner circle with $1+\Sigma_+ = \eta$ and we are done. The α-limit point cannot have $N_1 > 0$, because of the condition on the derivative, cf. the proof of Proposition 5.1. If it is of type II with $N_2$ or $N_3$ greater than zero, we can apply the flow to get a type II solution, call it $x$, of α-limit points to the original solution. Since a type II solution with $N_2$ or $N_3$ greater than zero satisfies $\Sigma_+' < 0$, the ω-limit point $y$ of $x$ must have $1+\Sigma_+ < \eta$. By Proposition 5.1, $y \in K_2\cup K_3$, so that it is non-special.

Let $0 < \eta_1 < \epsilon$. As above, we can then construct a non-special α-limit point $x_1$ on the Kasner circle with $\Sigma_+$ coordinate $\Sigma_{+,1}$ such that $1+\Sigma_{+,1} \le \eta_1$. Assume we have constructed non-special α-limit points $x_i$ on the Kasner circle, $i = 1,\ldots,m$, with $\Sigma_+$ coordinates $\Sigma_{+,i}$ satisfying $\Sigma_{+,i} < \Sigma_{+,i-1}$. Let $0 < \eta_{m+1} < 1+\Sigma_{+,m}$. Then by the above we can construct a non-special α-limit point $x_{m+1}$ on the Kasner circle with $\Sigma_+$ coordinate $\Sigma_{+,m+1}$, satisfying $\Sigma_{+,m+1} < \Sigma_{+,m}$. Thus the solution has an infinite number of α-limit points on the Kasner circle. ✷

Corollary 18.1 A generic Bianchi IX solution with $2/3 < \gamma < 2$ has at least three non-special α-limit points on the Kasner circle. Furthermore, no $N_i$ converges to zero.

Proof. Assume first that the solution has a special α-limit point on the Kasner circle. By Lemma 18.1, the first part of the lemma follows.
By the proof of Lemma 18.1, there is a non-special α-limit point on the Kasner circle with Σ+ coordinate arbitrarily close to −1, say that it belongs to K2 . Repeated application of Proposition 6.1 then gives α-limit points first in K3 , and after enough iterates, either an α-limit point in K1 , or a special α-limit point on the Kasner circle with Σ+ = 1/2. If the latter case occurs, a similar argument to the proof of Lemma 18.1 yields an α-limit point on K1 . By Proposition 6.1, we conclude that there are α-limit points with N1 > 0, with N2 > 0 and with N3 > 0. Assume that there is no special α-limit point on the Kasner circle. Repeated application of the Kasner map yields α-limit points in Ki , i = 1, 2, 3, and the conclusions of the lemma follow as in the previous situation. ✷

19 Conclusions

Let us first state the conclusions concerning the asymptotics of solutions to the equations of Wainwright and Hsu. We begin with the stiff fluid case.



Theorem 19.1 Consider a solution to (9)-(11) with $\gamma = 2$ and $\Omega > 0$. Then the solution converges to a type I point with $\Sigma_+^2 + \Sigma_-^2 < 1$. For the Bianchi types other than I, we have the following additional restrictions.
1. If the solution is of type II with $N_1 > 0$, then $\Sigma_+ < 1/2$.
2. For a type VI0 or VII0 with $N_2$ and $N_3$ non-zero, then $\Sigma_+ \pm \sqrt{3}\,\Sigma_- > -1$.
3. If the solution is of type VIII or IX, then $\Sigma_+ \pm \sqrt{3}\,\Sigma_- > -1$ and $\Sigma_+ < 1/2$.


Remark. Figure 8 illustrates the restriction on the shear variables. The types depicted are I, II, VI0 and VII0 , and VIII and IX, counting from top left to bottom right.
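The restrictions in Theorem 19.1 are plain inequalities in the $(\Sigma_+, \Sigma_-)$-plane, so membership in the allowed regions can be checked mechanically. The following is a minimal sketch; the function name `is_allowed_limit` and the string labels for the Bianchi types are illustrative choices and not notation from the paper.

```python
import math

SQRT3 = math.sqrt(3.0)


def is_allowed_limit(sigma_plus, sigma_minus, bianchi_type):
    """Return True if (sigma_plus, sigma_minus) satisfies the restrictions
    of Theorem 19.1 for the given Bianchi type (stiff fluid, Omega > 0).

    Type II is taken with N1 > 0 and types VI0/VII0 with N2, N3 non-zero,
    as in the statement of the theorem.
    """
    # Every limit point is a type I point strictly inside the Kasner circle.
    inside_circle = sigma_plus**2 + sigma_minus**2 < 1.0
    # Additional restriction for type II with N1 > 0.
    half_plane = sigma_plus < 0.5
    # Additional restriction for types VI0 and VII0: both signs must hold.
    wedge = (sigma_plus + SQRT3 * sigma_minus > -1.0
             and sigma_plus - SQRT3 * sigma_minus > -1.0)

    if bianchi_type == "I":
        return inside_circle
    if bianchi_type == "II":
        return inside_circle and half_plane
    if bianchi_type in ("VI0", "VII0"):
        return inside_circle and wedge
    if bianchi_type in ("VIII", "IX"):
        return inside_circle and wedge and half_plane
    raise ValueError(f"unknown Bianchi type: {bianchi_type!r}")
```

For instance, the point $(0.6, 0)$ passes the test for types I and VI0 but fails it for types II and IX, because it violates $\Sigma_+ < 1/2$.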


Figure 8: The points to which the shear variables may converge for a stiff fluid.

Proof. The theorem follows from Propositions 7.1 and 7.2. ✷

Consider now the case 2/3 < γ < 2. Let A be the closure of the type II vacuum points.


Theorem 19.2 Consider a generic Bianchi IX solution $x$ with $2/3 < \gamma < 2$. Then it converges to the closure of the set of vacuum type II points, that is
\[ \lim_{\tau\to-\infty}\,\inf_{y\in A}\|x(\tau) - y\| = 0, \]
where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^6$. Furthermore, there are at least three non-special $\alpha$-limit points on the Kasner circle.

Remark. One can start out arbitrarily close to this set without converging to it, cf. Proposition 11.1. Observe that the set of non-generic points is a union of $C^1$ submanifolds, and that the set of generic data is open in the set of Bianchi IX initial data. Observe also that the convergence to the attractor is uniform, and that Corollary 15.3 yields some interesting conclusions.

Proof. The first part follows from Corollary 15.4 and the second part follows from Corollary 18.1. ✷

Next we turn to the precise statement concerning curvature blow up. Let us make a division of the initial data according to their global behaviour.

Theorem 19.3 Consider a class A development with $1 \le \gamma \le 2$.
1. If the initial data are not of type IX, but satisfy $\mathrm{tr}_g k = 0$, then $\mu_0 = 0$ and the development is causally geodesically complete. Only types I and VII0 permit this possibility.
2. If the initial data are of type I, II, VI0, VII0 or VIII, and satisfy $\mathrm{tr}_g k < 0$, then the development is future causally geodesically complete and past causally geodesically incomplete. Such initial data we will refer to as expanding.
3. Bianchi IX initial data yield developments that are past and future causally geodesically incomplete. Such data are called recollapsing.

Observe that this theorem is not new. As far as class A developments are concerned, we will restrict our attention to equations of state with $1 \le \gamma \le 2$. The reason is that there is cause to doubt the well-posedness of the initial value problem for $2/3 < \gamma < 1$, cf. [11] p. 85 and p. 88. Furthermore, in the Bianchi IX case we use results from [16] concerning recollapse, see Lemma 21.6. In order to be allowed to do that, we need the above mentioned condition on $\gamma$. We will use the Kretschmann scalar,
\[ \kappa = \bar R_{\alpha\beta\gamma\delta}\bar R^{\alpha\beta\gamma\delta}, \tag{111} \]
as our main measure of whether curvature blows up or not, but in the non-vacuum case it is natural to consider the Ricci tensor contracted with itself, $\bar R_{\alpha\beta}\bar R^{\alpha\beta}$. The next theorem states the main conclusion concerning developments.

Theorem 19.4 For class A developments with $1 \le \gamma \le 2$, we have the following.



1. Consider expanding initial data of type I, II or VII0 with $1 \le \gamma < 2$ which are not of Taub vacuum type. Then the Kretschmann scalar is unbounded along all inextendible causal geodesics in the incomplete direction.
2. Consider non-Taub recollapsing initial data with $1 \le \gamma < 2$. Then the Kretschmann scalar is unbounded along all inextendible causal geodesics in both incomplete directions.
3. For expanding and recollapsing data with $\gamma = 2$ and $\mu_0 > 0$, the Kretschmann scalar is unbounded along all inextendible causal geodesics in all incomplete directions.
4. Consider expanding and recollapsing data with $\mu_0 > 0$. Then $\bar R_{\alpha\beta}\bar R^{\alpha\beta}$ is unbounded along all inextendible causal geodesics in all incomplete directions.
In all cases mentioned above the class A development is $C^2$-inextendible.

Remark. Observe that the Bianchi VIII vacuum case was handled in [19], and the Bianchi VI0 vacuum case in [18]. The above theorem thus isolates the vacuum Taub type solutions as the only ones among the Bianchi class A spacetimes that do not exhibit curvature blow up, given our particular matter model.

Proof of Theorems 19.3 and 19.4. Let $(M, g)$ be the Lorentz manifold obtained in Lemma 21.2 with topology $I \times G$. It is globally hyperbolic by Lemma 21.4. If the initial data satisfy $\mathrm{tr}_g k = 0$ for a development not of type IX, then it is causally geodesically complete and satisfies $\mu = 0$ for the entire development, by Lemma 21.5 and Lemma 21.8. The first part of Theorem 19.3 follows. Consider initial data of type I, II, VI0, VII0 or VIII such that $\mathrm{tr}_g k \ne 0$. By Lemma 21.5 and Lemma 21.8, we may then time orient the development so that it is future causally geodesically complete and past causally geodesically incomplete, and the second part of Theorem 19.3 follows. The third part follows from Lemma 21.8.

Consider an inextendible future directed causal geodesic in the above development. Since each hypersurface $\{v\} \times G$ is a Cauchy hypersurface by Lemma 21.4, the causal curve exhausts the interval $I$.
1. If the solution is not of type IX, then the solution to (131)-(136), which is used in constructing the class A development, corresponds to a solution to (9)-(11), because of Lemma 21.5. Furthermore, $t \to t_-$ corresponds to $\tau \to -\infty$, because of Lemma 22.4.
   a. In all the stiff fluid cases, the solution to (9)-(11) converges to a non-vacuum type I point by Theorem 19.1, so that Lemma 22.1 and Lemma 22.3 yield the desired conclusions in that case.
   b. Type I, II and VII0 with $1 \le \gamma < 2$. That the Kretschmann scalar is unbounded in the cases stated in Theorem 19.4 follows from Proposition 8.1, Proposition 9.1, Proposition 10.1, Proposition 10.2, Lemma 22.1 and Lemma 22.2.
   c. Non-vacuum solutions which are not of type IX. Then $\bar R_{\alpha\beta}\bar R^{\alpha\beta}$ is unbounded using Lemma 22.3.


2. If the solution is of type IX, then half of a solution to (131)-(136) corresponds to a Bianchi IX solution to (9)-(11), because of Lemma 21.6. By Lemma 22.5, $t \to t_\pm$ corresponds to $\tau \to -\infty$.
   a. In the stiff fluid case, we get the desired statement as before.
   b. If $1 \le \gamma < 2$, we get the desired conclusions, concerning blow up of the Kretschmann scalar, from Corollary 18.1, Proposition 11.1, Lemma 22.1 and Lemma 22.2.
   c. Non-vacuum solutions. Then $\bar R_{\alpha\beta}\bar R^{\alpha\beta}$ is unbounded using Lemma 22.3.

Let us now prove that the development is inextendible in the relevant cases. Assume there is a connected Lorentz manifold $(\hat M, \hat g)$ of the same dimension, and a map $i : M \to \hat M$ which is an isometry onto its image, with $i(M) \ne \hat M$. Then there is a $p \in \hat M - i(M)$ and a timelike geodesic $\gamma : [a, b] \to \hat M$ such that $\gamma([a, b)) \subseteq i(M)$ and $\gamma(b) = p$. Since $\gamma|_{[a,b)}$ can be considered to be a future or past inextendible timelike geodesic in $M$, either it has infinite length or a curvature invariant blows up along it, by the above arguments. Both possibilities lead to a contradiction. Theorem 19.4 follows. ✷

20 Asymptotically velocity term dominated behaviour near the singularity

In this section, we consider the asymptotic behaviour of Bianchi VIII and IX stiff fluid solutions from another point of view. We wish to compare our results with [2], a paper which deals with analytic solutions of Einstein's equations coupled to a scalar field or a stiff fluid. In [2], Andersson and Rendall prove that given a certain kind of solution to the so called velocity dominated system, there is a unique solution of Einstein's equations coupled to a stiff fluid approaching the velocity dominated solution asymptotically. We will be more specific concerning the details below. The question which arises is to what extent it is natural to assume that a solution has the asymptotic behaviour they prescribe. We show here that all Bianchi VIII and IX stiff fluid solutions exhibit such asymptotic behaviour.

In order to speak about velocity term dominance, we need to have a foliation. In our case, there is a natural foliation given by the spatial hypersurfaces of homogeneity. Relative to this foliation, we can express the metric as in (141) according to Lemma 21.2. In what follows, we will use the frame $e'_i$ appearing in Lemma 21.2, and Latin indices will refer to this frame. Let $g$ be the Riemannian metric, and $k$ the second fundamental form of the spatial hypersurfaces of homogeneity, so that
\[ g_{ij} = \bar g(e'_i, e'_j) = a_i^{-2}\delta_{ij}, \tag{112} \]
where $\bar g$ is as in (141). The constraint equations in our situation are
\[ R - k_{ij}k^{ij} + (\mathrm{tr}\,k)^2 = 2\mu, \tag{113} \]
\[ \nabla^i k_{ij} - \nabla_j(\mathrm{tr}\,k) = 0, \tag{114} \]
which are the same as (135) and (132) respectively. The evolution equations are
\[ \partial_t g_{ij} = -2k_{ij}, \tag{115} \]
\[ \partial_t k^i{}_j = R^i{}_j + (\mathrm{tr}\,k)\,k^i{}_j. \tag{116} \]
The evolution equation for the matter is
\[ \partial_t\mu = 2(\mathrm{tr}\,k)\mu. \tag{117} \]

We wish to compare solutions to these equations with solutions to the so called velocity dominated system. This system also consists of constraints and evolution equations, and we will denote the velocity dominated solution with a left superscript zero. The constraints are
\[ -{}^0k_{ij}\,{}^0k^{ij} + (\mathrm{tr}\,{}^0k)^2 = 2\,{}^0\mu, \tag{118} \]
\[ {}^0\nabla^i({}^0k_{ij}) - {}^0\nabla_j(\mathrm{tr}\,{}^0k) = 0. \tag{119} \]
The evolution equations are
\[ \partial_t\,{}^0g_{ij} = -2\,{}^0k_{ij}, \tag{120} \]
\[ \partial_t\,{}^0k^i{}_j = (\mathrm{tr}\,{}^0k)\,{}^0k^i{}_j, \tag{121} \]
and the matter equation is
\[ \partial_t\,{}^0\mu = 2(\mathrm{tr}\,{}^0k)\,{}^0\mu. \tag{122} \]
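To see what solutions of (118)-(122) look like in the simplest setting, the following worked example may help. It is a sketch under assumptions that are not part of the statement above: a diagonal, spatially homogeneous ansatz, with constants $p_i$, $c_i$ and $m$ introduced here purely for illustration, and the normalization $t\,\mathrm{tr}\,{}^0k = -1$. The constraint (119) is not addressed in this sketch.

```latex
% Assumption: {}^0k^i{}_j = -(p_i/t)\,\delta^i{}_j with constants p_i,
% p_1 + p_2 + p_3 = 1, and a diagonal metric {}^0g_{ij} (no summation on i).
\begin{align*}
\operatorname{tr}{}^0k &= -\frac{1}{t}, \\
\partial_t\,{}^0k^i{}_j &= \frac{p_i}{t^2}\,\delta^i{}_j
   = (\operatorname{tr}{}^0k)\,{}^0k^i{}_j
   && \text{so (121) holds}, \\
\partial_t\,{}^0g_{ii} &= -2\,{}^0k_{ii} = \frac{2p_i}{t}\,{}^0g_{ii}
   \;\Longrightarrow\; {}^0g_{ii} = c_i\,t^{2p_i}
   && \text{from (120)}, \\
\partial_t\,{}^0\mu &= -\frac{2}{t}\,{}^0\mu
   \;\Longrightarrow\; {}^0\mu = \frac{m}{t^2}
   && \text{from (122)}, \\
-\,{}^0k_{ij}\,{}^0k^{ij} + (\operatorname{tr}{}^0k)^2
   &= \frac{1 - p_1^2 - p_2^2 - p_3^2}{t^2} = \frac{2m}{t^2}
   && \text{from (118)}.
\end{align*}
```

In this example ${}^0\mu > 0$ corresponds to $p_1^2 + p_2^2 + p_3^2 < 1$, and the eigenvalues of $-t\,{}^0k^i{}_j$ are the constants $p_i$.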

We raise and lower indices of the velocity dominated system with the velocity dominated metric. In [2], Andersson and Rendall prove that given an analytic solution to (118)-(122) on $S \times (0, \infty)$ such that $t\,\mathrm{tr}\,{}^0k = -1$, and such that the eigenvalues of $-t\,{}^0k^i{}_j$ are positive, there is a unique analytic solution to (113)-(117) asymptotic, in a suitable sense, to the solution of the velocity dominated system. In fact, they prove this statement in a more general setting than the one given above. We have specialized to our situation. Observe the condition on the eigenvalues of $-t\,{}^0k^i{}_j$. Our goal is to prove that this is a natural condition in the Bianchi VIII and IX cases.

Theorem 20.1 Consider a Bianchi VIII or IX stiff fluid development, as constructed in Lemma 21.2, with $\mu_0 > 0$. Choose time coordinate so that $t_- = 0$. Then there is a solution to (118)-(122) such that $t\,\mathrm{tr}\,{}^0k = -1$, the eigenvalues of $-t\,{}^0k^i{}_j$ are positive, and the following estimates hold
1. ${}^0g^{il}g_{lj} = \delta^i{}_j + o(t^{\alpha^i{}_j})$
2. $k^i{}_j = {}^0k^i{}_j + o(t^{-1+\alpha^i{}_j})$
3. $\mu = {}^0\mu + o(t^{-2+\beta_1})$,
where $\alpha^i{}_j$ and $\beta_1$ are positive real numbers.


Remark. In [2] two more estimates occur. They are not included here as they are replaced by equalities in our situation. Observe that the difficulties encountered in [2] concerning the non-diagonal terms of $k^i{}_j$ disappear in the present situation.

Proof. Below we will use the results of Lemma 21.2 and its proof implicitly. When we speak of $\theta_{ij}$, $\sigma_{ij}$, $\theta$, $n_{ij}$ and $\mu$, we will refer to the solution of (131)-(136) and the indices of these objects should not be understood in terms of evaluation on a frame. Since $\theta_{ij}$ and so on are all diagonal, we will sometimes write $\theta_i$ etc. instead, denoting diagonal component $i$. There are two relevant frames: $e'_i$ and $e_i = a_ie'_i$. The latter frame yields $n_{ij}$ through (7). When we speak of $k^i{}_j$, $R^i{}_j$ and so on, we will always refer to the frame $e'_i$. We have $k^i{}_j = -\theta_i\delta^i{}_j$ (no summation on $i$). The metric is given by (112) above. Let us choose
\[ {}^0k^i{}_j = -{}^0\theta_i\,\delta^i{}_j, \tag{123} \]
let ${}^0\theta = {}^0\theta_1 + {}^0\theta_2 + {}^0\theta_3$ and
\[ {}^0g_{ij} = {}^0a_i^{-2}\,\delta_{ij} \]
(no summation on $i$). Because of (123), equation (119) will be satisfied since it is a statement concerning the commutation of ${}^0k_{ij}$ and $n_{ij}$. The existence interval for the solution to Einstein's equations is $(0, t_+)$ by our conventions, and since we wish to have $t\,\mathrm{tr}\,{}^0k = -1$ we need to define ${}^0\theta(t) = 1/t$. Observe that ${}^0\theta_i/{}^0\theta$ is constant in time, and that $\theta_i/\theta$ converges to a positive value as $t \to 0$; this is a consequence of Theorem 19.1 and the definition (138) of the variables $\Sigma_+$ and $\Sigma_-$. Choose ${}^0\theta_i$ so that ${}^0\theta_i/{}^0\theta$ coincides with the limit of $\theta_i/\theta$. Similarly ${}^0\mu/{}^0\theta^2$ is constant, $\mu/\theta^2$ converges to a positive value, and we choose ${}^0\mu/{}^0\theta^2$ to be the limit. Since $R/\theta^2$ is a polynomial in the $N_i$ and the $N_i$ converge to zero by Theorem 19.1, equation (118) will be fulfilled. By our choices, (121) and (122) will also be fulfilled. We will specify the initial value of ${}^0a_i$ later on, and then define ${}^0a_i$ by demanding that (120) holds.

It will be of interest to estimate terms of the form $R^i{}_j/\theta^2$. These terms are quadratic polynomials in the $N_i$. By abuse of notation, we will write $N_i(\tau)$ when we wish to evaluate $N_i$ in the Wainwright-Hsu time (137) and $N_i(t)$ when we wish to evaluate in the time used in this theorem. By Theorem 19.1, there is an $\epsilon > 0$ and a $\tau_0$ such that $|N_i(\tau)| \le \exp(\epsilon\tau)$ for all $\tau \le \tau_0$. We wish to rewrite this estimate in terms of $t$. Let us begin with (139). Since we can assume that $q \le 3$ for $\tau \le \tau_0$ we get
\[ \theta(\tau) \le \exp[-4(\tau - \tau_0)]\,\theta(\tau_0), \]
so that for $\tau_1 \le \tau \le \tau_0$ we get, using (137),
\[ t(\tau) - t(\tau_1) = \int_{\tau_1}^{\tau}\frac{3}{\theta}\,ds \ge \frac{3}{4\theta(\tau_0)}\left(\exp[4(\tau - \tau_0)] - \exp[4(\tau_1 - \tau_0)]\right). \]



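Letting $\tau_1$ go to $-\infty$ and observing that $t(-\infty) = 0$, cf. Lemma 22.4 and Lemma 22.5, we get for some constant $c$
\[ e^{4\tau} \le c\,t(\tau), \]
so that
\[ N_i(t) \le \exp(\epsilon\,\tau(t)) \le Ct^{\eta} \]
for some positive number $\eta$. Consequently, expressions such as $R^i{}_j/\theta^2$ and $R/\theta^2$ satisfy similar bounds.

Let us now prove the estimates formulated in the statement of the theorem. Observe that for $t$ small enough, we have
\[ -\theta = \mathrm{tr}\,k(t) = -\left(\int_0^t\left[\frac{R}{\theta^2} + 1\right]ds\right)^{-1}, \]
since the singularity is at $t = 0$ and $\mathrm{tr}\,k$ must become unbounded at the singularity, cf. Lemma 22.4, 22.5 and (139). Thus we get
\[ \theta - {}^0\theta = -\int_0^t\frac{R}{\theta^2}\,ds\left\{t\int_0^t\left[\frac{R}{\theta^2} + 1\right]ds\right\}^{-1} = o(t^{-1+\eta_1}) \tag{124} \]
for some $\eta_1 > 0$. In order to make the estimates concerning $k^i{}_j$, we need only consider $\theta_i$ and ${}^0\theta_i$. We have
\[ \partial_t\left(\frac{\theta_i}{\theta} - \frac{{}^0\theta_i}{{}^0\theta}\right) = \partial_t\frac{\theta_i}{\theta} = \frac{\theta_iR - R^i{}_i\,\theta}{\theta^2} \]
with no summation on the $i$ in $R^i{}_i$. This computation, together with the estimates above and the fact that $\theta_i/\theta - {}^0\theta_i/{}^0\theta$ converges to zero, yields the estimate
\[ \frac{\theta_i}{\theta} - \frac{{}^0\theta_i}{{}^0\theta} = o(t^{\eta_2}), \tag{125} \]
for some $\eta_2 > 0$. However,
\[ \frac{\theta_i}{\theta} - \frac{{}^0\theta_i}{{}^0\theta} = \frac{\theta_i}{\theta}\,\frac{{}^0\theta - \theta}{{}^0\theta} + \frac{\theta_i - {}^0\theta_i}{{}^0\theta}. \tag{126} \]
Combining (124), (125) and (126), we get estimate 2 of the theorem. Similarly, we have
\[ \partial_t\left(\frac{\mu}{\theta^2} - \frac{{}^0\mu}{{}^0\theta^2}\right) = \partial_t\frac{\mu}{\theta^2} = \frac{2\mu R}{\theta^3}. \]
Integrating, using the fact that $\mu/\theta^2$ converges to ${}^0\mu/{}^0\theta^2$, we get
\[ \frac{\mu}{\theta^2} - \frac{{}^0\mu}{{}^0\theta^2} = o(t^{\eta_3}) \tag{127} \]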


where $\eta_3 > 0$. Using
\[ \frac{\mu}{\theta^2} - \frac{{}^0\mu}{{}^0\theta^2} = \frac{\mu - {}^0\mu}{{}^0\theta^2} + \frac{\mu}{\theta^2}\,\frac{{}^0\theta^2 - \theta^2}{{}^0\theta^2}, \]
(124) and (127), we get estimate 3 of the theorem.

Finally, we need to specify the initial value of ${}^0a_i$ and prove estimate 1. Since $\partial_ta_i = -\theta_ia_i$ (no summation on $i$), and similarly for ${}^0a_i$, we get
\[ \partial_t\frac{a_i}{{}^0a_i} = \frac{a_i}{{}^0a_i}\,({}^0\theta_i - \theta_i). \]
By our estimates on ${}^0\theta_i - \theta_i$, we see that this implies that $a_i/{}^0a_i$ converges as $t \to 0$. Choose the value of ${}^0a_i$ at one point in time so that this limit is 1. We thus get, using estimate 2 of the theorem,
\[ \frac{a_i}{{}^0a_i} - 1 = o(t^{\alpha^i{}_j}). \]
Estimate 1 of the theorem now follows from this estimate and the fact that
\[ {}^0g^{il}g_{lj} = \left(\frac{{}^0a_i}{a_i}\right)^2\delta^i{}_j. \]
The theorem follows.

✷
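The integral formula for $\mathrm{tr}\,k$ used at the start of the estimates in the proof above can be traced back to the Raychaudhuri equation and the Hamiltonian constraint. The following derivation is a sketch under the assumption that, in the diagonal frame, $k_{ij}k^{ij} = \theta_{ij}\theta_{ij}$ and $\mathrm{tr}\,k = -\theta$, with ${}^0\theta(t) = 1/t$ as in the proof.

```latex
\begin{align*}
\partial_t\theta &= -\theta_{ij}\theta_{ij} - 2\mu
   && \text{(134) with } \gamma = 2, \\
\theta_{ij}\theta_{ij} + 2\mu &= R + \theta^2
   && \text{(113)}, \\
\partial_t\!\left(\frac{1}{\theta}\right) &= -\frac{\partial_t\theta}{\theta^2}
   = \frac{R}{\theta^2} + 1, \\
\frac{1}{\theta(t)} &= \int_0^t\left[\frac{R}{\theta^2} + 1\right]ds
   && \text{since } 1/\theta \to 0 \text{ as } t \to 0, \\
\theta - {}^0\theta &= \frac{1}{\int_0^t[R/\theta^2 + 1]\,ds} - \frac{1}{t}
   = -\int_0^t\frac{R}{\theta^2}\,ds
     \left\{t\int_0^t\left[\frac{R}{\theta^2} + 1\right]ds\right\}^{-1}.
\end{align*}
```

The last line is the identity behind (124); the bound $o(t^{-1+\eta_1})$ then comes from the power-law decay of $R/\theta^2$.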

21 Appendix

The goal of this appendix is to relate the asymptotic behaviour of solutions to the ODE (9)-(11) to the behaviour of the spacetime in the incomplete directions of inextendible causal curves. We proceed as follows.
1. First, we formulate Einstein's equations as an ODE, assuming that the spacetime has a given structure (128). The first formulation presented is due to Ellis and MacCallum. We also relate this formulation to the one by Wainwright and Hsu.
2. Given initial data as in Definition 1.1, we then show how to construct a Lorentz manifold as in (128), satisfying Einstein's equations and with initial data as specified, using the equations of Ellis and MacCallum. We also prove some properties of this development such as global hyperbolicity and answer some questions concerning causal geodesic completeness.
3. Finally, we relate the asymptotic behaviour of solutions to (9)-(11) to the question of curvature blow up in the development obtained by the above procedure.



We consider a special class of spatially homogeneous four dimensional spacetimes of the form
\[ (\bar M, \bar g) = (I \times G,\; -dt^2 + \chi_{ij}(t)\,\xi^i \otimes \xi^j), \tag{128} \]
where $I$ is an open interval, $G$ is a Lie group of class A, $\chi_{ij}$ is a smooth positive definite matrix and the $\xi^i$ are the duals of a left invariant basis on $G$. The stress energy tensor is assumed to be given by
\[ T = \mu\,dt^2 + p(\bar g + dt^2), \tag{129} \]
where $p = (\gamma - 1)\mu$. Below, Latin indices will be raised and lowered by $\delta_{ij}$.

Consider a four dimensional $(\bar M, \bar g)$ as in (128) with $G$ of class A. In order to define the different variables, we specify a suitable orthonormal basis. Let $e_0 = \partial_t$ and $e_i = a_i{}^jZ_j$, $i = 1, 2, 3$, be an orthonormal basis, where $a$ is a $C^\infty$ matrix valued function of $t$ and the $Z_i$ are the duals of $\xi^i$.

By the following argument, we can assume that $\langle\bar\nabla_{e_0}e_i, e_j\rangle = 0$. Let the matrix valued function $A$ satisfy $e_0(A) + AB = 0$, $A(0) = \mathrm{Id}$, where $B_{ij} = \langle\bar\nabla_{e_0}e_i, e_j\rangle$ and $\mathrm{Id}$ is the $3 \times 3$ identity matrix. Then $A$ is smooth and $SO(3)$ valued and if $e'_i = A_i{}^je_j$, then $\langle\bar\nabla_{e_0}e'_i, e'_j\rangle = 0$.

Let
\[ \theta(X, Y) = \langle\bar\nabla_Xe_0, Y\rangle, \tag{130} \]
$\theta_{\alpha\beta} = \theta(e_\alpha, e_\beta)$ and $[e_\beta, e_\gamma] = \gamma^\alpha_{\beta\gamma}e_\alpha$, where Greek indices run from 0 to 3. The objects $\theta_{\alpha\beta}$ and $\gamma^\alpha_{\beta\gamma}$ will be viewed as smooth functions from $I$ to some suitable $\mathbb{R}^k$, and our variables will be defined in terms of them. Observe that $[Z_i, e_0] = 0$. The $e_i$ span the tangent space of $G$, and $\langle[e_0, e_i], e_0\rangle = 0$. We get $\theta_{00} = \theta_{0i} = 0$ and $\theta_{\alpha\beta}$ symmetric. We also have $\gamma^0_{ij} = \gamma^0_{0i} = 0$ and $\gamma^i_{0j} = -\theta_{ij}$. We let $n$ be defined as in (7) and
\[ \sigma_{ij} = \theta_{ij} - \frac{1}{3}\theta\delta_{ij}, \]
where we by abuse of notation have written $\mathrm{tr}(\theta)$ as $\theta$. We express Einstein's equations in terms of $n$, $\sigma$ and $\theta$. The Jacobi identities for $e_\alpha$ yield
\[ e_0(n_{ij}) - 2n_{k(i}\sigma_{j)k} + \frac{1}{3}\theta n_{ij} = 0. \tag{131} \]
The $0i$-components of the Einstein equations are equivalent to
\[ \sigma_i{}^kn_{kj} - n_i{}^k\sigma_{kj} = 0. \tag{132} \]
Letting $b_{ij} = 2n_i{}^kn_{kj} - \mathrm{tr}(n)n_{ij}$ and $s_{ij} = b_{ij} - \frac{1}{3}\mathrm{tr}(b)\delta_{ij}$, the trace free part of the $ij$ equations are
\[ e_0(\sigma_{ij}) + \theta\sigma_{ij} + s_{ij} = 0. \tag{133} \]


The 00-component yields the Raychaudhuri equation
\[ e_0(\theta) + \theta_{ij}\theta_{ij} + \frac{1}{2}(3\gamma - 2)\mu = 0, \tag{134} \]
and using this together with the trace of the $ij$-equations yields a constraint
\[ \sigma_{ij}\sigma_{ij} + \left(n_{ij}n_{ij} - \frac{1}{2}\mathrm{tr}(n)^2\right) + 2\mu = \frac{2}{3}\theta^2. \tag{135} \]
Equations (131)-(135) are special cases of equations given in Ellis and MacCallum [10]. At a point $t_0$, we may diagonalize $n$ and $\sigma$ simultaneously since they commute (132). Rotating $e_\alpha$ by the corresponding element of $SO(3)$ yields upon going through the definitions that the new $n$ and $\sigma$ are diagonal at $t_0$. Collect the off-diagonal terms of $n$ and $\sigma$ in one vector $v$. By (131) and (133), there is a time dependent matrix $C$ such that $\dot v = Cv$ so that $v(t) = 0$ for all $t$, since $v(t_0) = 0$. Since the rotation was time independent, $\langle\bar\nabla_{e_0}e_i, e_j\rangle = 0$ holds in the new basis. Since $n$ and $\sigma$ are diagonal and (131) holds, the Bianchi type is preserved by the evolution. The fact that $T$ is divergence free yields
\[ e_0(\mu) + \gamma\theta\mu = 0. \tag{136} \]
Introduce, as in Wainwright and Hsu [20],
\[ \Sigma_{ij} = \sigma_{ij}/\theta, \quad N_{ij} = n_{ij}/\theta, \quad \Omega = 3\mu/\theta^2, \]
and define a new time coordinate $\tau$, independent of time orientation, satisfying
\[ \frac{dt}{d\tau} = \frac{3}{\theta}. \tag{137} \]
For Bianchi IX developments, we only consider the part of spacetime where $\theta$ is strictly positive or strictly negative. Let
\[ \Sigma_+ = \frac{3}{2}(\Sigma_{22} + \Sigma_{33}) \quad\text{and}\quad \Sigma_- = \frac{\sqrt{3}}{2}(\Sigma_{22} - \Sigma_{33}). \tag{138} \]
If we let $N_i$ be the diagonal elements of $N_{ij}$, equations (131) and (133) turn into (9) with definitions as in (10), except for the expression for $d\Omega/d\tau$. It can however be derived from (136). The constraint (135) turns into (11). The Raychaudhuri equation (134) takes the form
\[ \frac{d\theta}{d\tau} = -(1 + q)\theta. \tag{139} \]
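The change of variables above can be tried out numerically. The sketch below assumes diagonal, trace free $\sigma$ and diagonal $n$; the function names are hypothetical, and the normalized constraint used here is the one obtained by dividing (135) by $2\theta^2/3$, which is how the sketch arrives at it rather than a quotation of (11).

```python
import math


def wainwright_hsu_variables(sigma, n, theta, mu):
    """Map diagonal Ellis-MacCallum data to expansion-normalized variables.

    sigma: diagonal entries (sigma_1, sigma_2, sigma_3), assumed trace free
    n:     diagonal entries (n_1, n_2, n_3)
    theta: expansion, assumed non-zero
    mu:    energy density
    """
    Sigma = [s / theta for s in sigma]
    N = [x / theta for x in n]
    Omega = 3.0 * mu / theta**2
    # Definition (138), using the 22 and 33 components.
    sigma_plus = 1.5 * (Sigma[1] + Sigma[2])
    sigma_minus = (math.sqrt(3.0) / 2.0) * (Sigma[1] - Sigma[2])
    return sigma_plus, sigma_minus, N, Omega


def constraint_135_residual(sigma, n, theta, mu):
    """Left side minus right side of the constraint (135)."""
    sig2 = sum(s * s for s in sigma)
    n2 = sum(x * x for x in n)
    return sig2 + (n2 - 0.5 * sum(n) ** 2) + 2.0 * mu - (2.0 / 3.0) * theta**2


def normalized_constraint_residual(sigma_plus, sigma_minus, N, Omega):
    """Residual of (135) after division by 2 * theta**2 / 3."""
    n_term = sum(x * x for x in N) - 0.5 * sum(N) ** 2
    return sigma_plus**2 + sigma_minus**2 + 1.5 * n_term + Omega - 1.0
```

The two residuals differ by the factor $3/(2\theta^2)$; this relies on the identity $\Sigma_+^2 + \Sigma_-^2 = \frac{3}{2}\Sigma_{ij}\Sigma_{ij}$, which holds for trace free diagonal $\Sigma_{ij}$.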



The $\tau$-time will not be used further in this appendix. Before using the equations of Ellis and MacCallum to construct a development, it is convenient to know that one can make some simplifying assumptions concerning the choice of basis. The next lemma fulfills this objective, and also proves the classification of the class A Lie algebras mentioned in the introduction.

Lemma 21.1 Table 1.1 constitutes a classification of the class A Lie algebras. Consider an arbitrary basis $\{e_i\}$ of the Lie algebra. Then by applying an orthogonal matrix to it, we can construct a basis $\{e'_i\}$ such that the corresponding $n'$ defined by (7) is diagonal, with diagonal elements of one of the types given in Table 1.1.

Proof. Let $e_i$ be a basis for the Lie algebra and $n$ be defined as in (7). If we change the basis according to $e'_i = (A^{-1})_i{}^je_j$, then $n$ transforms to
\[ n' = (\det A)^{-1}A^tnA. \tag{140} \]
Since $n$ is symmetric, we assume from here on that the basis is such that it is diagonal. The matrix $A = \mathrm{diag}(1, 1, -1)$ changes the sign of $n$. A suitable orthogonal matrix performs even permutations of the diagonal. The number of non-zero elements on the diagonal is invariant under transformations (140) taking one diagonal matrix to another. If $A = (a_{ij})$ and the diagonal matrix $n'$ is constructed as in (140), we have $n'_{kk} = (\det A)^{-1}\sum_{i=1}^3a_{ik}^2n_{ii}$, so that if all the diagonal elements of $n$ have the same sign, the same is true for $n'$. The statements of the lemma follow. ✷

We now prove that if we begin with initial data as in Definition 1.1, we get a development as in Definition 1.3 of the form (128), with certain properties.

Lemma 21.2 Fix $2/3 < \gamma \le 2$. Let $G$, $g$, $k$ and $\mu_0$ be initial data as in Definition 1.1. Then there is an orthonormal basis $e'_i$, $i = 1, 2, 3$, of the Lie algebra such that $n'_{ij}$ defined by (7) and $k_{ij} = k(e'_i, e'_j)$ are diagonal and $n'_{ij}$ is of one of the forms given in Table 1.1. Let
\[ \theta(0) = -\mathrm{tr}_gk, \quad \sigma_{ij}(0) = -k(e'_i, e'_j) + \frac{1}{3}\theta(0)\delta_{ij}, \quad n_{ij}(0) = n'_{ij} \quad\text{and}\quad \mu(0) = \mu_0. \]
Solve (131), (133), (134) and (136) with these conditions as initial data to obtain $n$, $\sigma$, $\theta$ and $\mu$, and let $I$ be the corresponding existence interval. Then there are smooth functions $a_i : I \to (0, \infty)$, $i = 1, 2, 3$, with $a_i(0) = 1$, such that
\[ \bar g = -dt^2 + \sum_{i=1}^3a_i^{-2}(t)\,\xi^i \otimes \xi^i, \tag{141} \]

where $\xi^i$ is the dual of $e'_i$, satisfies Einstein's equations (3) on $\bar M = I \times G$, with $T$ as in (1) with $u = e_0$, $\mu$ as above and $p = (\gamma - 1)\mu$. Furthermore, $\langle\bar\nabla_{e_i}e_0, e_j\rangle = \sigma_{ij} + \frac{1}{3}\theta\delta_{ij}$, [...] on $\bar M$ [...] and $e_i$ spacelike, and let $\bar\nabla$ be the associated Levi-Civita connection. Compute $\langle\bar\nabla_{e_0}e_i, e_j\rangle = 0$. If $\tilde\theta(X, Y) = \langle\bar\nabla_Xe_0, Y\rangle$ and $\tilde\theta_{\mu\nu} = \tilde\theta(e_\mu, e_\nu)$, then $\tilde\theta_{00} = \tilde\theta_{i0} = \tilde\theta_{0i} = 0$. Furthermore,
\[ \frac{1}{a_j}e_0(a_j)\delta_{ij} = -\tilde\theta_{ij} \]
(no summation over $j$) so that $\tilde\theta_{ij}$ is diagonal and $\mathrm{tr}\,\tilde\theta = \theta$. Finally,
\[ -\tilde\sigma_{ii} = -\tilde\theta_{ii} + \frac{1}{3}\theta = -\sigma_i. \]
The lemma follows by considering the derivation of the equations of Ellis and MacCallum. ✷

Definition 21.1 A development as in Lemma 21.2 will be called a class A development. We will also assign a type to such a development according to the type of the initial data.

The next thing to prove is that each $\bar M_v = \{v\} \times G$ is a Cauchy surface, but first we need a lemma.

Lemma 21.3 Let $\rho$ be a left invariant Riemannian metric on a Lie group $G$. Then $\rho$ is geodesically complete.



Proof. Assume $\gamma : (t_-, t_+) \to G$ is a geodesic satisfying $\rho(\gamma', \gamma') = 1$, with $t_+ < \infty$. There is a $\delta > 0$ such that every geodesic $\lambda$ satisfying $\lambda(0) = e$, the identity element of $G$, and $\lambda'(0) = v$ with $\rho(v, v) \le 1$ is defined on $(-\delta, \delta)$. If $L_h : G \to G$ is defined by $L_h(h_1) = hh_1$, then $L_h$ is by definition an isometry. Let $t_0 \in (t_-, t_+)$ satisfy $t_+ - t_0 \le \delta/2$. Let $v \in T_eG$ be the vector corresponding to $\gamma'(t_0)$ under the isometry $L_{\gamma(t_0)}$. Let $\lambda$ be a geodesic with $\lambda(0) = e$ and $\lambda'(0) = v$. Then $L_{\gamma(t_0)} \circ \lambda$ is a geodesic extending $\gamma$. ✷

Let us be precise concerning the concept Cauchy surface.

Definition 21.2 Consider a time oriented Lorentz manifold $(M, g)$. Let $I$ be an interval in $\mathbb{R}$ and $\gamma : I \to M$ be a continuous map which is smooth except for a finite number of points. We say that $\gamma$ is a future directed causal, timelike or null curve if at each $t \in I$ where $\gamma$ is differentiable, $\gamma'(t)$ is a future oriented causal, timelike or null vector respectively. We define past directed curves similarly. A causal curve is a curve which is either a future directed causal curve or a past directed causal curve and similarly for timelike and null curves. If there is a curve $\lambda : I_1 \to M$ such that $\gamma(I)$ is properly contained in $\lambda(I_1)$, then $\gamma$ is said to be extendible, otherwise it is called inextendible. A subset $S \subset M$ is called a Cauchy surface if it is intersected exactly once by every inextendible causal curve. A Lorentz manifold as above which admits a Cauchy surface is said to be globally hyperbolic.

Lemma 21.4 For a class A development, each $\bar M_v = \{v\} \times G$ is a Cauchy surface.

Proof. The metric is given by (141). A causal curve cannot intersect $\bar M_v$ twice since the $t$-component of such a curve must be strictly monotone. Assume that $\gamma : (s_-, s_+) \to \bar M$ is an inextendible causal curve that never intersects $\bar M_v$. Let $\tilde t : \bar M \to I$ be defined by $\tilde t[(s, h)] = s$. Let $s_0 \in (s_-, s_+)$ and assume that $\tilde t(\gamma(s_0)) = t_1 < v$ and that $\langle\gamma', \partial_t\rangle < 0$ where it is defined. Thus $\tilde t(\gamma(s))$ increases with $s$ and $\tilde t(\gamma([s_0, s_+))) \subseteq [t_1, v]$. Since we have uniform bounds on $a_i$ from below and above on $[t_1, v]$ and the curve is causal, we get
\[ \left(\sum_{i=1}^3\xi^i(\gamma')^2\right)^{1/2} \le -C\langle\gamma', e_0\rangle \tag{142} \]
on that interval, with $C > 0$. Since
\[ \int_{s_0}^{s_+}-\langle\gamma', e_0\rangle\,ds = \int_{s_0}^{s_+}\frac{d\,\tilde t\circ\gamma}{ds}\,ds \le v - t_1, \tag{143} \]
the curve $\gamma|_{[s_0, s_+)}$, projected to $G$, will have finite length in the metric $\rho$ on $G$ defined by making $e'_i$ an orthonormal basis. Since $\rho$ is a left invariant metric on a Lie group, it is complete by Lemma 21.3, and sets closed and bounded in the corresponding topological metric must be compact. Adding the above observations, we conclude that $\gamma([s_0, s_+))$ is contained in a compact set, and thus there is a sequence $s_k \in [s_0, s_+)$ with $s_k \to s_+$ such that $\gamma(s_k)$ converges. Since $\tilde t(\gamma(s))$


is monotone and bounded it converges. Using (142) and an analogue of (143), we conclude that $\gamma$ has to converge as $s \to s_+$. Consequently, $\gamma$ is extendible, contradicting our assumption. By this and similar arguments covering the other cases, we conclude that $\bar M_v$ is a Cauchy surface for each $v \in (t_-, t_+)$. ✷

Before we turn to the questions concerning causal geodesic completeness, let us consider the evolution of $\theta$ for solutions to the equations of Ellis and MacCallum. This is relevant also for the definition of the variables of Wainwright and Hsu, since there one divides by $\theta$. We first consider developments as in Lemma 21.2 which are not of type IX.

Lemma 21.5 Consider class A developments which are not of type IX. Let the existence interval be $I = (t_-, t_+)$. Then there are two possibilities.
1. $\theta \ne 0$ for the entire development. We then time orient the manifold so that $\theta > 0$. With this time orientation, $t_+ = \infty$.
2. $\theta = 0$, $\sigma_{ij} = 0$ and $\mu = 0$ for the entire development. Furthermore, $n_{ij}$ is constant and diagonal and two of the diagonal components are equal and the third is zero. The only Bianchi types which admit this possibility are thus type I and type VII0. Furthermore $I = (-\infty, \infty)$.

Proof. Since $n_{ij}$ is diagonal, see the proof of Lemma 21.2, we can formulate the constraint (135) as
\[ \sigma_{ij}\sigma_{ij} + \frac{1}{2}\left[n_1^2 + (n_2 - n_3)^2 - 2n_1(n_2 + n_3)\right] + 2\mu = \frac{2}{3}\theta^2, \]
where the $n_i$ are the diagonal components of $n_{ij}$. Considering Table 1.1, we see that, excepting type IX, the expression in the $n_i$ is always non-negative. Thus we deduce the inequality
\[ \sigma_{ij}\sigma_{ij} + 2\mu \le \frac{2}{3}\theta^2. \tag{144} \]
Combining it with (134), we get $|e_0(\theta)| \le \theta^2$, using the fact that $2/3 < \gamma \le 2$. Consequently, if $\theta$ is zero once, it is always zero. Time orient the developments with $\theta \ne 0$ so that $\theta > 0$. Consider the possibility $\theta = 0$. Equation (134) then implies $\sigma_{ij} = 0$ and $\mu = 0$, since $\gamma > 2/3$. Equations (135) and (133) then imply $b_{ij} = 0$, and (131) implies $n_{ij}$ constant. All the statements except the fact that $t_+ = \infty$ in the $\theta > 0$ case follow from the above. Observe that $\theta$ decreases in magnitude with time, so that it is bounded to the future. By (144), the same is true of $\sigma_{ij}$ and $\mu$. Using (131), we get control of $n_{ij}$ and conclude that the solution may not blow up in finite time. We must thus have $t_+ = \infty$. ✷

By a theorem of Lin and Wald [16], Bianchi IX developments recollapse.



Lemma 21.6 Consider a Bianchi IX class A development with $1 \le \gamma \le 2$ and $I = (t_-, t_+)$. Then there is a $t_0 \in I$ such that $\theta > 0$ in $(t_-, t_0)$ and $\theta < 0$ in $(t_0, t_+)$.

Proof. Let us begin by proving that $\theta$ can be zero at most once. If $\theta(t_i) = 0$, $i = 1, 2$ and $t_1 < t_2$, then $\theta = 0$ in $(t_1, t_2)$ since it is monotone by (134). Thus (134) implies $\sigma_{ij} = 0 = \mu$ in $(t_1, t_2)$ as well. Combining this fact with (135) and (133), we get $b_{ij} = 0$, which is impossible for a Bianchi IX solution.

Assume $\theta$ is never zero. By a suitable choice of time orientation, we can assume that $\theta > 0$ on $I$. Let us prove that $t_+ = \infty$. Since $\theta$ is decreasing on $I_1 = [0, t_+)$ and non-negative on $I$ it is bounded on $I_1$. By (131), $n_1n_2n_3$ decreases so that it is bounded on $I_1$. By an argument similar to the proof of Lemma 3.3, one can combine this bound with (135) to conclude that $\sigma_{ij}$ and $\mu$ are bounded on $I_1$. By (131), we conclude that $n_{ij}$ cannot grow faster than exponentially. Consequently, the future existence interval must be infinite, that is $t_+ = \infty$, since $I$ was the maximal existence interval and solutions cannot blow up in finite time. In order to use the arguments of Lin and Wald, we define
\[ \beta_i(t) = \int_0^t\sigma_i(s)\,ds + \beta_i^0, \quad \alpha(t) = \int_0^t\frac{1}{3}\theta(s)\,ds + \alpha_0, \]
where $2\beta_i^0 - \alpha_0 = \ln(n_i(0))$ and $\sum_{i=1}^3\beta_i^0 = 0$. Then
\[ n_i = \exp(2\beta_i - \alpha). \]
Let $\rho = \mu/8\pi$ and $P_i = p/8\pi = (\gamma - 1)\mu/8\pi$, $i = 1, 2, 3$. Equations (135) and (134) then imply equations (1.4) and (1.5) of [16], and equations (1.6) and (1.7) of [16] follow from (133). We have thus constructed a solution to (1.4)-(1.7) of [16] on an interval $[0, \infty)$ with $d\alpha/dt > 0$. Lin and Wald prove in their paper [16] that this assumption leads to a contradiction, if one assumes that $|P_i| \le \rho$ and $P_1 + P_2 + P_3 \ge 0$. However, these conditions are fulfilled in our situation, assuming $1 \le \gamma \le 2$. In other words, there is a zero and since $\theta$ is decreasing it must be positive before the zero and negative after it. The lemma follows. ✷

The lemma concerning causal geodesic completeness will build on the following estimate.

Lemma 21.7 Consider a class A development. Let $\gamma : (s_-, s_+) \to \bar M$ be a future directed inextendible causal geodesic, and
\[ f_\nu(s) = \langle\gamma'(s), e_\nu|_{\gamma(s)}\rangle. \tag{145} \]
If $\theta = 0$ for the entire development, then $f_0$ is constant. Otherwise,
\[ \frac{d}{ds}(f_0\theta) \ge \frac{2 - \sqrt{2}}{3}\,\theta^2f_0^2. \tag{146} \]


Remark. We consider functions of $t$ as functions of $s$ by evaluating them at $\tilde t(\gamma(s))$, where $\tilde t$ is the function defined in Lemma 21.4.

Proof. Compute, using the proof of Lemma 21.2,
\[ \frac{df_0}{ds} = \langle\gamma'(s), \nabla_{\gamma'(s)}e_0\rangle = \sum_{k=1}^3\theta_kf_k^2, \]
where $\theta_k$ are the diagonal elements of $\theta_{ij}$. If $\theta = 0$ for the entire development, then $\theta_k = 0$ for the entire development by Lemma 21.5 and Lemma 21.6, so that $f_0$ is constant. Compute, using Raychaudhuri's equation (134),
\[ \frac{d}{ds}(f_0\theta) = \frac{1}{3}\theta^2\sum_{k=1}^3f_k^2 + \theta\sum_{k=1}^3\sigma_kf_k^2 + f_0^2\sum_{k=1}^3\sigma_k^2 + \frac{1}{3}\theta^2f_0^2 + \frac{1}{2}(3\gamma - 2)\mu f_0^2, \]
where $\sigma_k$ are the diagonal elements of $\sigma_{ij}$. Estimate
\[ \left|\sum_{k=1}^3\sigma_kf_k^2\right| \le \left(\frac{2}{3}\right)^{1/2}\left(\sum_{k=1}^3\sigma_k^2\right)^{1/2}\sum_{k=1}^3f_k^2, \]
using the tracelessness of $\sigma_{ij}$. By making a division into the three cases $\sum_{k=1}^3\sigma_k^2 \le \theta^2/3$, $\theta^2/3 \le \sum_{k=1}^3\sigma_k^2 \le 2\theta^2/3$ and $2\theta^2/3 \le \sum_{k=1}^3\sigma_k^2$, and using the causality of $\gamma$ we deduce (146). ✷

Lemma 21.8 Consider a class A development and let the existence interval be $I = (t_-, t_+)$. There are three possibilities.
1. $\theta = 0$ for the entire development, in which case the development is causally geodesically complete.
2. The development is not of type IX and $\theta > 0$. Then all inextendible causal geodesics are future complete and past incomplete. Furthermore, $t_- > -\infty$ and $t_+ = \infty$.
3. If the development is of type IX with $1 \le \gamma \le 2$, then all inextendible causal geodesics are past and future incomplete. We also have $t_- > -\infty$ and $t_+ < \infty$.

Proof. Let $\gamma : (s_-, s_+) \to \bar M$ be a future directed inextendible causal geodesic and $f_\nu$ be defined as in (145). Let furthermore $I = (t_-, t_+)$ be the existence interval mentioned in Lemma 21.2. Since every $\bar M_v$, $v \in I$, is a Cauchy surface by Lemma 21.4, $\tilde t(\gamma(s))$ must cover the interval $I$ as $s$ runs through $(s_-, s_+)$. Furthermore, $\tilde t(\gamma(s))$ is monotone increasing so that
\[ \tilde t(\gamma(s)) \to t_\pm \quad\text{as}\quad s \to s_\pm. \tag{147} \]

Let s0 ∈ (s−, s+) and compute
\[ \int_{s_0}^{s} -f_0(u)\,du = \tilde{t}(\gamma(s)) - \tilde{t}(\gamma(s_0)). \tag{148} \]

Consider the case θ = 0 for the entire development. By Lemma 21.7, f0 is then constant, and I = (−∞, ∞) by Lemma 21.5. Equations (148) and (147) then prove that we must have (s−, s+) = (−∞, ∞). Thus, all inextendible causal geodesics must be complete.

Assume that the development is not of type IX and that θ > 0. Since f0θ is negative on [s0, s+), its absolute value is bounded on that interval by (146). If s+ were finite, θ would be bounded from below by a positive constant on [s0, s+), since
\[ \Big| \frac{d\theta}{ds} \Big| \le -f_0\theta^2 \le C\theta \]
on that interval for some C > 0, cf. (144) and the observations following that equation. Since f0θ is bounded, we then deduce that f0 is bounded on [s0, s+). But then (147) and (148) cannot both hold, since t+ = ∞ by Lemma 21.5. Thus, s+ = ∞ and all inextendible causal geodesics must be future complete. Since f0θ is negative on (s−, s+), (146) proves that this expression must blow up in finite s-time going backward, so that s− > −∞. Since the curve γ(s) = (s, e) is an inextendible timelike geodesic, we conclude that t− > −∞.

Consider the Bianchi IX case. By Lemma 21.4 and 21.6, we conclude the existence of an s0 ∈ (s−, s+) such that f0θ is negative on (s−, s0) and positive on (s0, s+). By (146), f0θ must blow up a finite s-time before s0, and a finite s-time after s0. Every inextendible causal geodesic is thus future and past incomplete. We conclude t− > −∞ and t+ < ∞. ✷

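22 Appendix

In this appendix, we consider the curvature expressions. According to [22], p. 40, the Weyl tensor $\bar{C}_{\alpha\beta\gamma\delta}$ is defined by
\[ \bar{R}_{\alpha\beta\gamma\delta} = \bar{C}_{\alpha\beta\gamma\delta} + (\bar{g}_{\alpha[\gamma}\bar{R}_{\delta]\beta} - \bar{g}_{\beta[\gamma}\bar{R}_{\delta]\alpha}) - \frac{1}{3}\bar{R}\,\bar{g}_{\alpha[\gamma}\bar{g}_{\delta]\beta}, \]
where the bar in $\bar{g}_{\alpha\beta}$ and so on indicates that we are dealing with spacetime objects as opposed to objects on a spatial hypersurface. Using this relation and the fact that our spacetime satisfies (3), where T is given by (1) and (2), one can derive the following expression for the Kretschmann scalar
\[ \kappa = \bar{R}_{\alpha\beta\gamma\delta}\bar{R}^{\alpha\beta\gamma\delta} = \bar{C}_{\alpha\beta\gamma\delta}\bar{C}^{\alpha\beta\gamma\delta} + 2\bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta} - \frac{1}{3}\bar{R}^2 = \bar{C}_{\alpha\beta\gamma\delta}\bar{C}^{\alpha\beta\gamma\delta} + \frac{1}{3}[4 + (3\gamma-2)^2]\mu^2. \tag{149} \]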


However, according to [21], p. 19, we have
\[ \bar{C}_{\alpha\beta\gamma\delta}\bar{C}^{\alpha\beta\gamma\delta} = 8(E_{\alpha\beta}E^{\alpha\beta} - H_{\alpha\beta}H^{\alpha\beta}), \tag{150} \]
where, relative to the frame eα appearing in Lemma 21.2, all components of E and H involving e0 are zero, and the ij components are given by
\[ E_{ij} = \frac{1}{3}\theta\sigma_{ij} - \Big(\sigma_i{}^{k}\sigma_{kj} - \frac{1}{3}\sigma_{kl}\sigma^{kl}\delta_{ij}\Big) + s_{ij}, \]
\[ H_{ij} = -3\sigma^{k}{}_{(i}n_{j)k} + n_{kl}\sigma^{kl}\delta_{ij} + \frac{1}{2}n^{k}{}_{k}\sigma_{ij}, \]
where sij is the same expression that appears in (133), see p. 40 of [21]. Observe that in our situation, E and H are diagonal, since we are interested in the developments obtained in Lemma 21.2. It is natural to normalize $\tilde{E}_{ij} = E_{ij}/\theta^2$ and similarly for H. We will denote the diagonal components of $\tilde{E}_{ij}$ by $\tilde{E}_i$, and those of $\tilde{H}_{ij}$ by $\tilde{H}_i$. We want to have expressions in Σ+, Σ− and so on, and therefore we compute
\[ \tilde{H}_1 = N_1\Sigma_+ + \frac{1}{\sqrt{3}}(N_2 - N_3)\Sigma_-, \]
\[ \tilde{H}_2 = -\frac{1}{2}N_2(\Sigma_+ + \sqrt{3}\Sigma_-) + \frac{1}{2}(N_3 - N_1)\Big(\Sigma_+ - \frac{1}{\sqrt{3}}\Sigma_-\Big), \]
\[ \tilde{E}_2 - \tilde{E}_3 = \frac{2}{3\sqrt{3}}\Sigma_-(1 - 2\Sigma_+) + (N_2 - N_3)(N_2 + N_3 - N_1), \]
\[ \tilde{E}_2 + \tilde{E}_3 = \frac{2}{9}\Sigma_+(1 + \Sigma_+) - \frac{2}{9}\Sigma_-^2 - \frac{2}{3}N_1^2 + \frac{1}{3}(N_2 - N_3)^2 + \frac{1}{3}N_1(N_2 + N_3). \]
Observe that all other components of $\tilde{E}_i$ and $\tilde{H}_i$ can be computed from this, as Eij and Hij are both traceless. It is convenient to define the normalized Kretschmann scalar
\[ \tilde{\kappa} = \bar{R}_{\alpha\beta\gamma\delta}\bar{R}^{\alpha\beta\gamma\delta}/\theta^4. \tag{151} \]
The latter object can be expressed as a polynomial in the variables of Wainwright and Hsu. By the above observations and the fact that Ω = 3µ/θ², we have
\[ \tilde{\kappa} = 8\Big[\frac{3}{2}(\tilde{E}_2 + \tilde{E}_3)^2 + \frac{1}{2}(\tilde{E}_2 - \tilde{E}_3)^2 - 2\tilde{H}_1^2 - 2\tilde{H}_2^2 - 2\tilde{H}_1\tilde{H}_2\Big] + \frac{1}{27}[4 + (3\gamma-2)^2]\Omega^2. \]
We will associate a κ and a $\bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta}$ to a solution to (9)-(11) in the following way. Since κ/θ⁴ can be expressed in terms of the variables of Wainwright and Hsu, it is natural to define κ by this expression multiplied by θ⁴, where θ obeys (139). There is of course an ambiguity as to the initial value of θ, but we are only interested in the asymptotics, and any non-zero value will yield the same conclusion. We associate $\bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta}$ to a solution similarly.
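The polynomial expression for the normalized Kretschmann scalar can be evaluated numerically. The following is only a minimal sketch: the function name and argument order are our own choices, and the formulas are the ones for the normalized electric and magnetic components and for the normalized Kretschmann scalar as read from the text above. It can be used, for instance, to see that the expression vanishes at the points (Σ+, Σ−) = (−1, 0) and (1/2, ±√3/2) of the Kasner circle and is non-zero at other Kasner points.

```python
import math


def normalized_kretschmann(sp, sm, n1, n2, n3, omega, gamma):
    """Normalized Kretschmann scalar in Wainwright-Hsu variables.

    sp, sm     : Sigma_+, Sigma_-
    n1, n2, n3 : N_1, N_2, N_3
    omega      : Omega
    gamma      : equation-of-state parameter
    (Hypothetical helper; names are illustrative only.)
    """
    r3 = math.sqrt(3.0)
    # Diagonal components of the normalized magnetic part.
    h1 = n1 * sp + (n2 - n3) * sm / r3
    h2 = -0.5 * n2 * (sp + r3 * sm) + 0.5 * (n3 - n1) * (sp - sm / r3)
    # Difference and sum of the normalized electric components E_2, E_3.
    e_diff = 2.0 / (3.0 * r3) * sm * (1.0 - 2.0 * sp) \
        + (n2 - n3) * (n2 + n3 - n1)
    e_sum = (2.0 / 9.0) * sp * (1.0 + sp) - (2.0 / 9.0) * sm ** 2 \
        - (2.0 / 3.0) * n1 ** 2 + (n2 - n3) ** 2 / 3.0 \
        + n1 * (n2 + n3) / 3.0
    weyl = 8.0 * (1.5 * e_sum ** 2 + 0.5 * e_diff ** 2
                  - 2.0 * h1 ** 2 - 2.0 * h2 ** 2 - 2.0 * h1 * h2)
    matter = (4.0 + (3.0 * gamma - 2.0) ** 2) * omega ** 2 / 27.0
    return weyl + matter


if __name__ == "__main__":
    # Kasner circle: all N_i = 0, Omega = 0, Sigma_+^2 + Sigma_-^2 = 1.
    for angle in (0.0, math.pi / 3, math.pi / 2, math.pi):
        sp, sm = math.cos(angle), math.sin(angle)
        print(angle, normalized_kretschmann(sp, sm, 0, 0, 0, 0, 1.0))
```

At a vacuum type I point only the Σ-terms survive, which is why the zeros on the Kasner circle are easy to read off from the output.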

Lemma 22.1 The normalized Kretschmann scalar (151) is non-zero at the fixed points F, $P_i^+(\mathrm{II})$, at the non-special points on the Kasner circle, and at the type I stiff fluid points with Ω > 0. Consequently
\[ \limsup_{\tau \to -\infty} |\kappa(\tau)| = \infty \tag{152} \]
for all solutions to (9)-(11) which have one such point as an α-limit point.

Proof. The statement concerning the normalized Kretschmann scalar is a computation. Equation (152) is a consequence of this computation, the fact that $\kappa = \tilde{\kappa}\theta^4$ and the fact that θ → ∞ as τ → −∞, cf. (139). ✷

For some non-vacuum Taub type solutions with 2/3 < γ < 2, the following lemma is needed.

Lemma 22.2 Consider a solution to (9)-(11) with Ω > 0 and 2/3 < γ < 2 such that
\[ \lim_{\tau \to -\infty} (\Sigma_+, \Sigma_-) = (-1, 0). \tag{153} \]
Then
\[ \lim_{\tau \to -\infty} \kappa(\tau) = \infty. \]

Proof. By Proposition 3.1, the solution must satisfy Σ− = 0 and N2 = N3. Observe that because of (153), we have Ω → 0, since Ω decays exponentially for $\Sigma_+^2$ large, cf. the proof of Lemma 14.1. Consequently, q → 2. One can then prove that for any ε > 0, there is a T such that
\[ \exp[(a_\gamma + \epsilon)\tau] \le \Omega(\tau) \le \exp[(a_\gamma - \epsilon)\tau] \tag{154} \]
\[ \exp[(6 + \epsilon)\tau] \le N_1(\tau) \le \exp[(6 - \epsilon)\tau] \tag{155} \]
\[ \exp[(6 + \epsilon)\tau] \le [N_1(N_2 + N_3)](\tau) \le \exp[(6 - \epsilon)\tau] \tag{156} \]
\[ \exp[(-6 + \epsilon)\tau] \le \theta^2(\tau) \le \exp[(-6 - \epsilon)\tau] \tag{157} \]
for all τ ≤ T, where $a_\gamma = 3(2 - \gamma)$. However, the constraint can be written
\[ (1 - \Sigma_+)(1 + \Sigma_+) = \Omega + \frac{3}{4}N_1^2 - \frac{3}{2}N_1(N_2 + N_3). \]
By (154)-(156), Ω will dominate the right hand side, since it is non-zero. Since 1 − Σ+ converges to 2, 1 + Σ+ will consequently have to be positive and of the order of magnitude Ω. In particular, for every ε > 0 there is a T such that
\[ \exp[(a_\gamma + \epsilon)\tau] \le (1 + \Sigma_+)(\tau) \le \exp[(a_\gamma - \epsilon)\tau] \tag{158} \]
for all τ ≤ T. Observe that since $a_\gamma < 4$, Ωθ² and (1 + Σ+)θ² both diverge to infinity as τ → −∞, by (154), (157) and (158). Other expressions of interest are N1θ² and N1(N2 + N3)θ². The estimates (154)-(157) do not yield any conclusions concerning whether they are bounded or not. However, using (139), we have
\[ N_1(\tau)\theta^2(\tau) = N_1(0)\theta^2(0)\exp\Big[\int_\tau^0 (2 + q + 4\Sigma_+)\,ds\Big] = N_1(0)\theta^2(0)\exp\Big[\int_\tau^0 \Big(2(1 + \Sigma_+) + \frac{1}{2}(3\gamma - 2)\Omega + 2\Sigma_+(1 + \Sigma_+)\Big)ds\Big], \]
which is bounded since all the terms appearing in the integral are integrable by (154) and (158). A similar argument yields the same conclusion concerning N1(N2 + N3)θ².

Since the solution is of Taub type, we have $\tilde{H}_1 = N_1\Sigma_+$ and $\tilde{H}_2 = \tilde{H}_3 = -\tilde{H}_1/2$. We also have $\tilde{E}_2 = \tilde{E}_3$ and
\[ 2\tilde{E}_2 = \frac{2}{9}\Sigma_+(1 + \Sigma_+) - \frac{2}{3}N_1^2 + \frac{1}{3}N_1(N_2 + N_3). \]
Consequently the E field blows up and the H field remains bounded, and the lemma follows. ✷

Finally, we observe that $\bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta}$ becomes unbounded in the matter case.

Lemma 22.3 Consider a solution to (9)-(11) with Ω > 0. Then
\[ \lim_{\tau \to -\infty} \bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta} = \infty. \]

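Remark. How to associate $\bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta}$ to a solution of (9)-(11) is clarified in the remarks preceding the statement of Lemma 22.1.

Proof. We have
\[ \bar{R}_{\alpha\beta}\bar{R}^{\alpha\beta} = \mu^2 + 3p^2 = [1 + 3(\gamma - 1)^2]\mu^2 = \frac{1}{9}[1 + 3(\gamma - 1)^2]\Omega^2\theta^4. \]
But by (9) and (139), we have
\[ \Omega^2(\tau)\theta^4(\tau) = \Omega^2(0)\theta^4(0)\exp\Big(\int_\tau^0 (-4q + 2(3\gamma - 2) + 4 + 4q)\,ds\Big) = \Omega^2(0)\theta^4(0)\exp(-6\gamma\tau), \]
since the integrand equals 6γ, and the lemma follows. ✷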
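The scaling of Ω²θ⁴ used in the proof of Lemma 22.3 can be checked numerically. The sketch below is an illustration only: it assumes that (9) and (139) give Ω' = [2q − (3γ − 2)]Ω and θ' = −(1 + q)θ, which is the form suggested by the integrand −4q + 2(3γ − 2) + 4 + 4q = 6γ, and it takes an arbitrary made-up function q(τ), since the q-dependence cancels. The helper names are hypothetical.

```python
import math


def rhs(tau, y, gamma, q):
    """Right-hand side of the assumed evolution equations for (Omega, theta)."""
    omega, theta = y
    qv = q(tau)
    return ((2.0 * qv - (3.0 * gamma - 2.0)) * omega, -(1.0 + qv) * theta)


def rk4(y0, tau0, tau1, steps, gamma, q):
    """Classical fourth-order Runge-Kutta from tau0 to tau1 (tau1 < tau0 is allowed)."""
    h = (tau1 - tau0) / steps
    tau, y = tau0, y0
    for _ in range(steps):
        k1 = rhs(tau, y, gamma, q)
        k2 = rhs(tau + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)), gamma, q)
        k3 = rhs(tau + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)), gamma, q)
        k4 = rhs(tau + h, tuple(a + h * b for a, b in zip(y, k3)), gamma, q)
        y = tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
        tau += h
    return y


def omega2_theta4_ratio(tau, gamma, q, y0=(0.3, 1.7), steps=2000):
    """Return Omega^2 theta^4 at time tau divided by its value at time 0."""
    omega, theta = rk4(y0, 0.0, tau, steps, gamma, q)
    return (omega ** 2 * theta ** 4) / (y0[0] ** 2 * y0[1] ** 4)


if __name__ == "__main__":
    gamma = 4.0 / 3.0
    q = lambda t: 2.0 - math.exp(t)  # arbitrary illustrative choice of q
    for tau in (-0.5, -1.0, -2.0):
        print(tau, omega2_theta4_ratio(tau, gamma, q), math.exp(-6.0 * gamma * tau))
```

Under these assumptions the two printed columns agree, so Ω²θ⁴ grows like exp(−6γτ) as τ → −∞, whatever q is.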

Lemma 22.4 Consider a class A development, not of type IX, with I = (t− , t+ ) and θ > 0. Then the corresponding solution to the equations of Wainwright and Hsu has existence interval R, and t → t± corresponds to τ → ±∞.

Proof. The function θ has to converge to infinity as t → t− for the following reason. Assume it does not. As θ is monotone decreasing, we can assume it to be bounded on (t− , 0]. By the constraint (135), σij and µ are then bounded on (t− , 0], so that the same will be true of nij by (131) and the fact that t− > −∞. But then one can extend the solution beyond t− , contradicting the fact that I is the maximal existence interval. By (134), θ → 0 as t → ∞ = t+ . Equation (137) defines a diffeomorphism τ˜ : (t− , t+ ) → (τ− , τ+ ), and we get a solution to the equations of Wainwright and Hsu on (τ− , τ+ ). By (139), we conclude that the statement of the lemma holds. ✷

Lemma 22.5 Consider a Bianchi IX class A development with I = (t−, t+) and 1 ≤ γ ≤ 2. According to Lemma 21.6, there is a t0 ∈ I such that θ > 0 in I− = (t−, t0) and θ < 0 in I+ = (t0, t+). The solution to the equations of Wainwright and Hsu corresponding to the interval I− has existence interval (−∞, τ−), and t → t− corresponds to τ → −∞. Similarly, I+ corresponds to (−∞, τ+) with t → t+ corresponding to τ → −∞.

Proof. Let us relate the different time coordinates on I−. According to equation (137), τ has to satisfy dt/dτ = 3/θ. Define $\tilde{\tau}(t) = \int_{t_1}^{t} \theta(s)/3\,ds$, where t1 ∈ I−. Then $\tilde{\tau} : I_- \to \tilde{\tau}(I_-)$ is a diffeomorphism and strictly monotone on I−. Since θ is positive in I−, $\tilde{\tau}$ increases with t. Since θ is continuous beyond t0, it is clear that $\tilde{\tau}(t) \to \tau_- \in \mathbb{R}$ as t → t0. To prove that t → t− corresponds to τ → −∞, we make the following observation. One of the expressions θ and dθ/dt is unbounded on (t−, t1], since if both were bounded the same would be true of σij, µ and nij by (134) and (131) respectively. Then we would be able to extend the solution beyond t−, contradicting the fact that I is the maximal existence interval (observe that t− > −∞ by Lemma 21.8). If $\tilde{\tau}$ were bounded from below on I−, then θ and θ′ would be bounded on $\tilde{\tau}((t_-, t_1])$ by Lemma 3.2, and thus θ and dθ/dt would be bounded on (t−, t1]. Thus t → t− corresponds to τ → −∞. Similar arguments yield the same conclusion concerning I+. ✷

Acknowledgments

This research was supported in part by the National Science Foundation under Grant No. PHY94-07194. Part of this work was carried out while the author was enjoying the hospitality of the Institute for Theoretical Physics, Santa Barbara. The author also wishes to acknowledge the support of the Royal Swedish Academy of Sciences. Finally, he would like to express his gratitude to Lars Andersson and Alan Rendall, whose suggestions have improved the article.


References
[1] L. Andersson, The global existence problem in general relativity, gr-qc/9911032 (1999).
[2] L. Andersson and A. Rendall, Quiescent cosmological singularities, gr-qc/0001047 (2000).
[3] V.A. Belinskii, I.M. Khalatnikov and E.M. Lifshitz, Oscillatory approach to a singular point in the relativistic (1970).
[4] V.A. Belinskii, I.M. Khalatnikov and E.M. Lifshitz, A general solution of the Einstein equations with a time singularity, Adv. Phys. 31, 639–667 (1982).
[5] B.K. Berger et al., The singularity in generic gravitational collapse is spacelike, local and oscillatory, Mod. Phys. Lett. A13, 1565–1574, gr-qc/9805063 (1998).
[6] O. Bogoyavlensky, Qualitative theory of dynamical systems in astrophysics and gas dynamics, Springer-Verlag (1985).
[7] P.T. Chruściel and J. Isenberg, Non-isometric vacuum extensions of vacuum maximal globally hyperbolic spacetimes, Phys. Rev. D48, 1616–1628 (1993).
[8] N. Cornish and J. Levin, The mixmaster universe: A chaotic Farey tale, Phys. Rev. D55, 7489–7510 (1997).
[9] D. Eardley, E. Liang and R. Sachs, Velocity-dominated singularities in irrotational dust cosmologies, J. Math. Phys. 13, 99–106 (1972).
[10] G. Ellis and M. MacCallum, A class of homogeneous cosmological models, Commun. Math. Phys. 12, 108–141 (1969).
[11] H. Friedrich and A. Rendall, The Cauchy problem for the Einstein equations, gr-qc/0002074 (2000).
[12] P. Hartman, Ordinary differential equations, John Wiley and Sons (1964).
[13] D. Hobill, A. Burd and A. Coley (editors), Deterministic chaos in general relativity, Plenum Press (1994).
[14] M.C. Irwin, Smooth dynamical systems, Academic Press (1980).
[15] J. Isenberg and V. Moncrief, Asymptotic behaviour of the gravitational field and the nature of singularities in Gowdy spacetimes, Ann. Phys. 199, 84–122 (1990).
[16] X-F. Lin and R. Wald, Proof of the closed-universe-recollapse conjecture for diagonal Bianchi type-IX cosmologies, Phys. Rev. D 40, 3280–86 (1989).
[17] C. Misner, Mixmaster universe, Phys. Rev. Lett. 22, 1071–1074 (1969).
[18] A. Rendall, Global dynamics of the mixmaster model, Class. Quantum Grav. 14, 2341–2356 (1997).
[19] H. Ringström, Curvature blow up in Bianchi VIII and IX vacuum spacetimes, Class. Quantum Grav. 17, 713–731, gr-qc/9911115.
[20] J. Wainwright and L. Hsu, A dynamical systems approach to Bianchi cosmologies: orthogonal models of class A, Class. Quantum Grav. 6, 1409–1431 (1989).
[21] J. Wainwright and G.F.R. Ellis (editors), Dynamical Systems in Cosmology, Cambridge University Press (1997).
[22] R. Wald, General Relativity, University of Chicago Press (1984).

Hans Ringström
Max Planck Institut für Gravitationsphysik
Albert Einstein Institut
Am Mühlenberg 1
D-14476 Golm
Germany
e-mail: [email protected]

Communicated by Sergiu Klainerman
submitted 21/11/00, accepted 14/12/00




Non Exponential Law of Entrance Times in Asymptotically Rare Events for Intermittent Maps with Infinite Invariant Measure

Xavier Bressaud, Roland Zweimüller

Abstract. We study piecewise affine maps of the interval with an indifferent fixed point causing the absolutely continuous invariant measure to be infinite. Considering the laws of the first entrance times of a point (picked at random according to Lebesgue measure) into a sequence of events shrinking to the strongly repelling fixed point, we prove that (when suitably normalized) they converge in distribution to the independent product of an exponential law to some power and a one-sided stable law.

Résumé. We study a class of piecewise affine maps of the interval with an indifferent fixed point whose absolutely continuous invariant measure is infinite. We consider the laws of the first entrance times of a point (chosen at random according to Lebesgue measure) into a sequence of events concentrating around the strongly repelling fixed point. We prove that, suitably renormalized, these times converge in distribution to the independent product of an exponential law raised to a certain power and a one-sided stable law.

1 Introduction

There has been a recent interest in statistics of entrance (or return) times into rare events for chaotic dynamical systems. Given a sequence of sets in the phase space of some ergodic system with measures decaying to zero, one can ask about the asymptotic behaviour of the sequence of entrance times in these sets. In the case of hyperbolic systems preserving a probability measure, entrance times typically converge to an exponential distribution when normalized by their expectations. The lack of memory property of the limit distribution is often interpreted as “unpredictability” of the occurrence of rare events. Results of this type have been proved for different classes of systems and sequences of shrinking sets, see for example the survey in [5]. One basic family of examples is that of uniformly expanding maps of the interval.

Interval maps with indifferent fixed points, frequently referred to as intermittent maps, perhaps give the simplest models beyond uniform hyperbolicity. For those situations where there still exists a finite absolutely continuous invariant measure, precise results again giving exponential limit laws have been given in [17]. The case of maps with an indifferent fixed point whose SRB measure is a Dirac mass at the fixed point, and where the only absolutely continuous invariant measure is infinite, is somewhat different. We refer to [1] for general ergodic properties of infinite measure preserving systems, and to [19] for specific information on interval maps with indifferent fixed points and further references. The authors of [8] considered a particular piecewise affine (i.e. Markov chain) model, and proved convergence to an exponential distribution for entrance times close to the indifferent fixed point, which however are not rare in the sense of either the invariant measure or the dynamics.

The purpose of the present note is to similarly present a simple family of piecewise affine examples for which the entrance times to a particular sequence of sets, namely those shrinking to the strongly repelling fixed point, in general converge to a non exponential law which depends on the fine local behaviour at the fixed point. We also discuss what we expect to be the behaviour of these entrance times for more general sequences of cylinders. The only other results known to us where a limit law different from the exponential distribution turns up are for systems of (very) low complexity, such as rotations (see [4]) and substitutions (see [10]). In these cases, the limit distributions are distributions of discrete random variables and the analysis has a different flavour.

2 Statement of the result

Let (I, λ) be the interval I = [0, 1] endowed with Lebesgue measure λ. Let $(c_j)_{j \ge 0}$ be a sequence strictly decreasing to 0 with $c_0 = 1$ satisfying $c_{j+1}/c_j \to 1$. These points yield a partition (mod λ) of I into the intervals $I_j := (c_{j+1}, c_j)$, j ≥ 0. We consider the map T on I which is affine and increasing on each $I_j$ and maps $I_0$ onto I (with slope $s := (1 - c_1)^{-1}$) and $I_j$ onto $I_{j-1}$ for all j ≥ 1, cf. Fig. 1. Since $T'(x) \to 1$ as x → 0, transformations of this type frequently serve as simplified models for smooth 'intermittent' maps with an indifferent fixed point. The piecewise affine version T in fact is just a renewal Markov chain in a sense we shall make precise below. T is conservative ergodic and has a unique (up to a constant factor) absolutely continuous invariant measure µ (whose density is constant on each $I_j$) which is infinite if and only if $\sum_j c_j = +\infty$. Throughout we shall assume that this is the case (i.e. that the chain is null recurrent) and we choose µ such that $\mu(I_0) = \lambda(I_0)$.

Example 1 Specific examples which are frequently studied in the literature are given by $c_j := \mathrm{const} \cdot j^{-\alpha}$, α ∈ (0, 1], which corresponds to $Tx = x + ax^{1 + 1/\alpha} + o(x^{1 + 1/\alpha})$ in the smooth setting.

We are interested in the asymptotic distributional behaviour of the (first) entrance times to a sequence of asymptotically rare events. More precisely we consider the sequence $(d_j)$ of the preimages of $c_1$ under the rightmost branch of T, i.e. $d_j := 1 - s^{-j}$, and the sequence of intervals $B_m := (d_{m+1}, 1)$, m ≥ 0, with $\lambda(B_m) = \mu(B_m) = s^{-m}$. The variables $\tau_m$, m ≥ 0, we are interested in are the numbers of steps needed to enter $B_m$, that is
\[ \tau_m(x) := \min\{i \ge 1,\ T^i(x) \in B_m\}. \]
These entrance times obviously go to infinity almost surely and have infinite expectation with respect to λ. Still it is possible to understand their asymptotic behaviour.

Figure 1: The map T. (The figure marks the intervals $I_j, \ldots, I_2, I_1, I_0$ and the sets $B_3, B_4, \ldots$)

To state the result, we let E denote the exponential law of parameter 1, and also use the same symbol for a generic random variable with this distribution, independent of all other variables that may appear. Similarly, $G_\alpha$ denotes the (essentially unique) one-sided stable law of index α ∈ (0, 1), i.e. the distribution on $\mathbb{R}_+ = (0, \infty)$ with Laplace transform $\tilde{G}_\alpha(t) = e^{-t^\alpha}$, see [12], pp.448, as well as the generic random variable with this distribution. For example, $G_{1/2}$ (which naturally arises in return time problems for the simple coin-tossing random walk, cf. [11], p.90) is the law of $1/N^2$, where N has a standard normal distribution. We shall say more about how these laws arise after the statement of the theorem, and it will become clear that it is natural to write $G_1$ for the law with unit mass at 1.

The theorem below applies to the maps of Example 1, but we prefer to state the result in full generality since this causes no additional difficulties in the proof, and might turn the reader's attention to a classical probabilistic theory which is not particularly well known in the dynamics community. When talking about asymptotic properties we shall identify a sequence $(c_j)$ with its piecewise constant extension $c(x) := c_{[x]}$, $x \in \mathbb{R}_+$. Recall that a function $c : \mathbb{R}_+ \to \mathbb{R}_+$ is called regularly varying (at infinity) with index α ∈ R if it is of the form $c(x) = x^\alpha l(x)$ where l is slowly varying in that it satisfies $\lim_{x \to +\infty} l(\sigma x)/l(x) = 1$ for all σ > 0 (e.g. if l is constant or l(x) = log x). A function b is asymptotically inverse to c if b(c(x)) ∼ c(b(x)) ∼ x as x → ∞. Such functions exist and are unique up to asymptotic equivalence if α > 0, see [2], pp.28.

Theorem 1 (Distributional convergence of the entrance times) If the sequence $(c_j)$ is regularly varying of index −α for some α ∈ (0, 1), or if $(\sum_{j=0}^{n} c_j)_{n \ge 1}$ is slowly varying and α := 1, then
\[ \frac{1}{b(s^m)} \cdot \tau_m \overset{d}{\Longrightarrow} E^{1/\alpha} \cdot G_\alpha \]

as m → ∞, where the $\tau_m$ are considered as random variables distributed according to Lebesgue measure λ on I, and b is a function asymptotically inverse to $x \mapsto (c_1\Gamma(1 - \alpha)c(x))^{-1}$ in the first case, and asymptotically inverse to $x \mapsto x/(c_1 \int_0^x c(y)\,dy)$ in the second. (Hence b is regularly varying with index $1/\alpha$ and satisfies x = o(b(x)) as x → ∞.)

Example 2 In the case α = 1, which lies at the threshold between the finite and the infinite measure regime, we still have an exponential distribution in the limit, although the normalizing sequence can no longer be given by the expectations of the $\tau_m$ which are already infinite. For the particular α = 1 map from the family of Example 1, we have (with κ a suitable constant)
\[ \kappa \cdot m^{-1} \cdot s^{-m} \cdot \tau_m \overset{d}{\Longrightarrow} E. \]

Example 3 In the α ∈ (0, 1) cases of Example 1, we have $b(s^m) = \kappa \cdot s^{m/\alpha}$. If, in particular, α = 1/2, we obtain
\[ \kappa \cdot s^{-2m} \cdot \tau_m \overset{d}{\Longrightarrow} \Big(\frac{E}{N}\Big)^2. \]

Remark 1. The distribution $H_\alpha := E^{1/\alpha} \cdot G_\alpha$ of the independent product of the $1/\alpha$ power of an exponential law of parameter 1 and the one-sided stable law of index α can more explicitly be described by its Laplace transform which is easily seen to be
\[ \tilde{H}_\alpha(t) = \frac{1}{1 + t^\alpha}. \]

Remark 2. A minor modification of our argument also gives the asymptotic distributional behaviour of the first return times $\varphi_m(x) := \min\{i \ge 1,\ T^i(x) \in B_m\}$, $x \in B_m$, regarded as random variables on the respective sets $B_m$ with normalized Lebesgue measure $\lambda_m := \lambda(B_m)^{-1} \cdot \lambda$. We have
\[ \frac{1}{b(s^m)} \cdot \varphi_m \overset{d}{\Longrightarrow} s^{-1}\delta_0 + (1 - s^{-1})\, E^{1/\alpha} \cdot G_\alpha, \]
where $\delta_0$ denotes unit point mass in zero. This is because $\{\varphi_m = 1\} = B_{m+1} \subseteq B_m$ always has $\lambda_m$-measure $s^{-1}$ while under the condition that it should be larger than 1, $\varphi_m$ behaves as $\tau_m$ above.

To get an intuitive understanding of the result we take a closer look at a Markov chain equivalent to T. It is a simple renewal chain $(X_n)$ with states $I_j$, the renewal state being $I_0$, see Fig. 2.

Figure 2: The Markov chain model. (The figure shows the states $I_j, \ldots, I_4, I_3, I_2, I_1, I_0$.)

The transition probabilities are given by $P(X_{n+1} = I_j \mid X_n = I_0) = \lambda(I_j)/\lambda(I \setminus I_0)$. The precise relation to the interval map is as follows: if $Y_0 \in I$ is randomly chosen according to some probability density $\sum_j \pi_j 1_{I_j}$ constant on each $I_j$, and $Y_n := T^n(Y_0)$, n ≥ 1, the resulting random sequence $(X_n)$ with $X_n := I_j$ if $Y_n \in I_j$ is the renewal chain with initial distribution $(\pi_j)$.

Any sample path of the renewal chain consists of a sequence of excursions to the left part. If we let $L_k$ denote the time between the (k − 1)st and kth visit in $I_0$, then $(L_k)$ clearly is an iid sequence, and, when starting in $I_0$, the number of steps until we return to $I_0$ for the nth time is $\sum_{k=1}^{n} L_k$. This is where the stable laws enter: By classical results, arithmetical averages of nonnegative iid variables $L_k$ without expectation converge to some nondegenerate limit distribution iff the sequence of tail weights $t_j := P(L_k \ge j)$ is regularly varying of index −α for some α ∈ (0, 1), in which case we have
\[ \frac{1}{b(n)} \sum_{k=1}^{n} L_k \overset{d}{\Longrightarrow} G_\alpha, \tag{1} \]
where b is asymptotically inverse to $x \mapsto (\Gamma(1 - \alpha)t(x))^{-1}$, cf. [12], pp.448 or [2], pp.343. The same conclusion holds with α := 1 provided $(\sum_{j=0}^{n} t_j)_{n \ge 1}$ is slowly varying and b is asymptotically inverse to $x \mapsto x/\int_0^x t(y)\,dy$, cf. [2], pp.372 or [12], pp.234. Observe that in the case α = 1, which is closest to the situation of finite expectation (where the strong law of large numbers would give a.s. convergence of the averages $(E(L_1) \cdot n)^{-1} \sum_{k=1}^{n} L_k \to 1$), (1) with $G_1 = 1$ still gives a weak law of large numbers, while for α < 1 stronger fluctuations cause the limit to become continuously distributed. In our particular situation we have $t_j = c_j$, showing that the conditions on $(c_j)$ are most natural from a probabilist's point of view.

In fact (1) is essential for understanding how the limit law in the theorem arises. We give a rough heuristic sketch of the argument: Recall (cf. [12], pp.169) that α-stability of the law by definition means that the sum of n independent random variables $G_1, \ldots, G_n$ sharing this distribution has the same law as $n^{1/\alpha} G_1$. The target event $B_m$ is to stay at $I_0$ for at least m steps. This can happen only at the end of an excursion when we are back at $I_0$, where we have a certain probability $p_m$ (with $p_m \to 0$) for $B_m$ to occur. If it does not, we are given another chance at our next return to $I_0$. The number $\theta_m$ of trials (and hence excursions) we need therefore will roughly have a geometric distribution and should thus converge to an exponential law as m → ∞. On the other hand, the total number of steps done during that time will be given by the random sum $L_1 + \cdots + L_{\theta_m}$. Assume for the moment that the $L_k$ were distributed according to $G_\alpha$ (which they are not, but they share the same tail behaviour) and that they were independent of $\theta_m$ (in fact we shall see below that in a sense the major part of them is). Then, by the defining property of an α-stable law, this sum is distributed like $\theta_m^{1/\alpha} \cdot L_1$, so that
\[ \tau_m \approx L_1 + \cdots + L_{\theta_m} \approx \theta_m^{1/\alpha} \cdot G_\alpha, \]
where $\theta_m$ (when normalized by its expectation) is close to an exponential distribution.
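The Laplace transform of the limit law stated in Remark 1 can be illustrated by a small Monte Carlo experiment for α = 1/2. This is only a sketch under an explicit assumption: the stable variable is realised as 1/(2N²) with N standard normal, a scaling chosen here so that its Laplace transform is exactly exp(−√t); with the unscaled 1/N² one obtains the same law up to a constant factor. The function names are our own.

```python
import math
import random


def sample_h_half(rng):
    """One draw of E^2 * G_{1/2}, with G_{1/2} realised as 1 / (2 N^2) (assumed scaling)."""
    e = rng.expovariate(1.0)          # exponential law of parameter 1
    n = rng.gauss(0.0, 1.0)           # standard normal
    denom = 2.0 * n * n
    if denom == 0.0:                  # guard against underflow of n * n
        return math.inf
    return e * e / denom


def empirical_laplace(t, samples):
    """Monte Carlo estimate of the Laplace transform at t > 0."""
    return sum(math.exp(-t * x) for x in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_h_half(rng) for _ in range(200000)]
    for t in (0.25, 1.0, 4.0):
        # Second column: the value 1 / (1 + t^alpha) with alpha = 1/2.
        print(t, empirical_laplace(t, samples), 1.0 / (1.0 + math.sqrt(t)))
```

Conditioning on the exponential variable reduces the expectation to that of exp(−√t · E), which is 1/(1 + √t); the simulation merely makes this visible.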

3 Proof of the theorem

The adequate framework for proving a probabilistic result about a dynamical system metrically isomorphic to a Markov chain should be that of the latter. Instead of working with the simple renewal chain mentioned before, we shall find it more convenient to use a slightly refined Markov chain model in which the target events $B_m$ appear explicitly. We let $J_j := (d_j, d_{j+1})$, j ≥ 0, and consider the Markov chain $(X_i)_{i \ge 0}$ whose states are the $I_j$, j ≥ 1, and $J_j$, j ≥ 1, with the obvious transition probabilities
\[ P(X_{n+1} \in I_j \mid X_n = J_1) = \lambda(I_j)/\lambda(J_0) = (c_j - c_{j+1})/c_1 \]
and
\[ P(X_{n+1} \in J_j \mid X_n = I_1) = \lambda(J_j)/\lambda(I_0) = s^{-j}(s - 1), \]
cf. Fig. 3.

Figure 3: The refined Markov chain. (The figure shows the states $I_j, \ldots, I_4, I_3, I_2, I_1$ and $J_1, J_2, J_3, J_4, \ldots, J_j$.)

The relation between the chain and the map T is analogous to what we said before, the target event $B_m$ is $\bigcup_{j > m} J_j$. For convenience we shall first consider the chain starting with an initial distribution that in the interval map setting corresponds to normalized Lebesgue measure on $I_0$, that is, $P(X_0 = J_j) = \lambda(J_j)/\lambda(I_0) = (1 - s^{-1})s^{-j+1}$, j ≥ 1. Again we consider $\tau_m := \min\{i \ge 1,\ X_i \in B_m\}$. To get an easy understanding of paths that enter $B_m$ for the first time at a certain step we shall focus on the states $J_1$ and $I_1$ to separate excursions to the left and to the right. We let $\Theta_m$ denote the number of passages through $J_1$ (and hence through $I_1$) before time $\tau_m$:
\[ \Theta_m := \sum_{i=0}^{\tau_m - 1} 1_{J_1}(X_i). \]
Whether or not we hit $B_m$ between two passages through $J_1$ depends on the edge we choose from $I_1$. Now, $p_m := P(X_{i+1} \in B_m \mid X_i = I_1) = s^{-m} \to 0$ as m → +∞, and $P(\Theta_m = 0) = P(X_0 \in B_{m+1}) = p_{m+1}$, while $P(\Theta_m = r) = (1 - p_{m+1})p_m(1 - p_m)^{r-1}$ for r ≥ 1. Consequently, the $\Theta_m$ normalized by their expectations $E[\Theta_m] = \frac{(1 - p_{m+1})(1 - p_m)}{p_m} \sim s^m$, converge to an exponential law of parameter 1:
\[ \frac{1}{E[\Theta_m]} \cdot \Theta_m \overset{d}{\Longrightarrow} E. \tag{2} \]
Turning back to $\tau_m$ we are going to decompose it into the successive excursion times spent on either side. To formalize this, we set $S_0 := 0$, and for k ≥ 1 let
\[ T_k := \min\{i \ge S_{k-1} : X_i = J_1\} \quad \text{and} \quad S_k := \min\{i \ge T_k : X_i = I_1\}. \]

The lengths of the kth excursion to the left and to the right are then respectively given by $L_k := S_k - T_k$, k ≥ 1, and $R_k := T_{k+1} - S_k$, k ≥ 0. (These $L_k$ correspond morally, though not precisely, to those from the sketch above.) We can then represent the entrance time $\tau_m$ as
\[ \tau_m = \sum_{k=1}^{\Theta_m} L_k + \sum_{k=0}^{\Theta_m - 1} R_k + 1. \tag{3} \]

This decomposition is useful because the sequences $(L_k)$ and $(R_k)$ are iid, and (most important for our purposes) the sequence $(L_k)$ is independent of each $\Theta_m$: the number of excursions to the left is independent of their lengths. Moreover we shall see later that the contribution of the $R_k$ vanishes asymptotically, and we therefore concentrate on the first of the sums in (3).

As the tail weights $t_j = P(L_k > j)$ are now given by $c_j/c_1$, our assumptions on $(c_j)$ ensure that (1) holds with b as in the theorem. Therefore the correct order of magnitude of $\sum_{k=1}^{\Theta_m} L_k$ is that of the random sequence $(b(\Theta_m))$ which in view of (2) we might hope to be given by $(b(E[\Theta_m]))$. We therefore write
\[ \frac{1}{b(s^m)} \sum_{k=1}^{\Theta_m} L_k = \frac{b(E[\Theta_m])}{b(s^m)} \cdot \frac{b(\Theta_m)}{b(E[\Theta_m])} \cdot \frac{1}{b(\Theta_m)} \sum_{k=1}^{\Theta_m} L_k. \tag{4} \]
The scalar factor in front converges to 1 because of the regular variation of b. The second factor exhibits good limiting behaviour, too: we have
\[ \frac{b(\Theta_m)}{b(E[\Theta_m])} \overset{d}{\Longrightarrow} E^{1/\alpha}, \tag{5} \]

which is immediate from the following lemma.

Lemma 1 Assume that E and $E_m$, m ≥ 0, are random variables taking values in $\mathbb{R}_+ = (0, \infty)$, such that $\frac{1}{\gamma_m} E_m \overset{d}{\Longrightarrow} E$, for suitable normalizing constants $\gamma_m \to \infty$. If $b : \mathbb{R}_+ \to \mathbb{R}_+$ is regularly varying at infinity with index β ≠ 0, then
\[ \frac{b(E_m)}{b(\gamma_m)} \overset{d}{\Longrightarrow} E^\beta. \]

Proof. Writing
\[ \frac{b(E_m)}{b(\gamma_m)} = \Big(\frac{E_m}{\gamma_m}\Big)^\beta \cdot \frac{l\big(\frac{E_m}{\gamma_m}\gamma_m\big)}{l(\gamma_m)}, \]
l being the slowly varying part of b, this is an easy application of the uniform convergence theorem for slowly varying functions which ensures that $l(\sigma x)/l(x) \to 1$, as x → +∞, uniformly in $\sigma \in [\Sigma^{-1}, \Sigma]$, for any Σ > 1. See [2], p.6.

Let us return to (4). Since we know that $\Theta_m \to \infty$ in probability and each is independent of $(L_k)$ it is easy to see that the rightmost term will converge in law to a stable distribution $G_\alpha$. However, as both random terms contain the $\Theta_m$, they are not independent and we have to be careful about the distribution of their product. The reason why we will still have convergence to the independent product $E^{1/\alpha} \cdot G_\alpha$ is that the only thing that matters for the last term is that $\Theta_m$ is large. The precise distribution of $\Theta_m$ has hardly any effect on the distribution of the sum. This is made precise in the following lemma, the easy proof of which we omit.

Lemma 2 Assume that $Q_n$, Q, $H_m$, H, and $T_m$ are random variables such that
1. $Q_n$ take values in $\mathbb{R}_+$ and $Q_n \overset{d}{\Longrightarrow} Q$,
2. $T_m$ take values in $\mathbb{N}$ and $T_m \to \infty$ in probability,
3. $H_m \overset{d}{\Longrightarrow} H$,
4. Each of $T_m$, $H_m$, and H is independent of the sequence $(Q_n)$ and of Q.
Then
\[ H_m \cdot Q_{T_m} \overset{d}{\Longrightarrow} H \cdot Q. \]

Of course, the important point here is that $H_m$ and $T_m$ need not be independent. Taking $H_m := \frac{b(\Theta_m)}{b(E[\Theta_m])}$, $T_m := \Theta_m$ and $Q_n := \frac{1}{b(n)} \sum_{k=1}^{n} L_k$ we obtain
\[ \frac{1}{b(s^m)} \sum_{k=1}^{\Theta_m} L_k \overset{d}{\Longrightarrow} E^{1/\alpha} \cdot G_\alpha. \tag{6} \]
To get the asymptotics of $\tau_m$ we still have to take care of the $R_k$, cf. (3). Recall that $(R_k)$ is an iid sequence and that the $R_k$ have finite expectation. Therefore $n^{-1} \sum_{k=0}^{n-1} R_k \to E[R_1] \in \mathbb{R}_+$ almost surely. Since also $\Theta_m \to \infty$ a.s., we have $\Theta_m^{-1} \sum_{k=0}^{\Theta_m - 1} R_k \to E[R_1]$ a.s. as m → ∞. In view of x/b(x) → 0 (which is clear from (1) as $E[L_k] = \infty$) and (2), this implies
\[ \frac{1}{b(s^m)} \sum_{k=0}^{\Theta_m - 1} R_k \to 0 \quad \text{in probability.} \tag{7} \]
We therefore end up with
\[ \frac{1}{b(s^m)} \cdot \tau_m \overset{d}{\Longrightarrow} E^{1/\alpha} \cdot G_\alpha, \tag{8} \]
which shows that the distribution of the first entrance time in the small events $B_m$ has the asserted limiting behaviour if we start our chain on the righthand half with the measure specified in the beginning.

To finally obtain the result for the case of the initial distribution which corresponds to Lebesgue measure for the interval map is almost trivial now: It is enough to notice that the shifted chain $(\bar{X}_i)_{i \ge 0}$ defined by $\bar{X}_i := X_{i+1}$ has this initial distribution, thus giving a realization of the process we are interested in, and to observe that for $\bar{\tau}_m := \min\{i \ge 0,\ \bar{X}_i \in B_m\}$ we have $\bar{\tau}_m - (\tau_m - 1) \to 0$ almost surely, so that (8) holds just as well with $\tau_m$ replaced by $\bar{\tau}_m$.
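The convergence (8) can be explored by simulating the refined Markov chain directly. The sketch below is an illustration under several assumptions that are not taken from the text: the concrete choice $c_j = (j+1)^{-\alpha}$ (so that $c_0 = 1$), the normalisation $b(y) = (\Gamma(1-\alpha)\,y/c_1)^{1/\alpha}$, obtained by inverting $x \mapsto (\Gamma(1-\alpha)\,c(x)/c_1)^{-1}$ with tail weights $c_j/c_1$, and all function names. Deterministic runs of the chain (walking down $I_j \to I_{j-1}$ or $J_j \to J_{j-1}$) are skipped in one step.

```python
import math
import random


def entrance_time(m, alpha, rng):
    """One sample of tau_m for the refined chain, started from the law of X_0 on I_0."""
    c1 = 2.0 ** (-alpha)              # c_1 for the assumed c_j = (j + 1)^(-alpha)
    s = 1.0 / (1.0 - c1)              # slope on I_0
    log_s = math.log(s)

    def geometric():
        # Index j >= 1 with P(j > k) = s^(-k).
        u = 1.0 - rng.random()        # uniform on (0, 1]
        return max(1, math.ceil(-math.log(u) / log_s))

    def left_length():
        # Index j >= 1 with P(j >= k) = c_k / c_1: the state I_j reached from J_1.
        u = 1.0 - rng.random()
        return max(1, int((u * c1) ** (-1.0 / alpha)) - 1)

    j0 = geometric()                  # X_0 = J_{j0}
    if j0 - 1 > m:
        return 1                      # X_1 = J_{j0 - 1} already lies in B_m
    t = j0 - 1                        # time of the first visit to J_1
    while True:
        t += left_length()            # J_1 -> I_j -> ... -> I_1
        k = geometric()               # next step leads from I_1 to J_k
        if k > m:
            return t + 1              # J_k lies in B_m
        t += k                        # I_1 -> J_k -> ... -> J_1


def laplace_estimate(m, alpha, t, n, seed=0):
    """Monte Carlo estimate of E[exp(-t * tau_m / b(s^m))] with the assumed b."""
    rng = random.Random(seed)
    c1 = 2.0 ** (-alpha)
    s = 1.0 / (1.0 - c1)
    b = (math.gamma(1.0 - alpha) * s ** m / c1) ** (1.0 / alpha)
    return sum(math.exp(-t * entrance_time(m, alpha, rng) / b) for _ in range(n)) / n


if __name__ == "__main__":
    for m in (2, 3, 4):
        # Compare with 1 / (1 + t^alpha) = 0.5 for t = 1.
        print(m, laplace_estimate(m, 0.5, 1.0, 20000))
```

With these choices the estimates should be close to the value 1/(1 + t^α) of the Laplace transform of $E^{1/\alpha} \cdot G_\alpha$, up to Monte Carlo error and finite-m effects.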

4 A more general pattern

The following heuristic considerations suggest that the same limit laws should arise for a larger class of asymptotically rare events defined by prescribing the durations $k_1, k_2, \ldots \in \mathbb{N}$ of $m$ consecutive excursions from $I_0$ and letting $m \to \infty$. (That is, we consider the nested sequence of cylinders around some point $x \in (0, 1)$.) The situation is more intricate than before, since the excursions required to continue a successful attempt may change from step to step, and if we fail, we still need not necessarily start from scratch, as the last few excursions may well fit a shorter initial segment of $(k_i)$.

We start from the Markov chain $(X_n)_{n \ge 0}$ with states $I_j$, $j \ge 0$, cf. Fig. 2, and $P(X_0 = I_0) = 1$. $L_i$, $i \ge 1$, will denote the duration of the $i$th excursion from $I_0$, and we let $S_n := \sum_{k=0}^{n-1} 1_{I_0}(X_k)$. To keep track of how many consecutive excursions of the prescribed lengths we have done up to step $n$, we set $D_0 := 0$ and define

$D_n := \max(\{0\} \cup \{r \ge 1 : L_{S_n - r + 1} = k_1, \ldots, L_{S_n} = k_r\}), \quad n \ge 1.$

Observe then that $Z_n := (X_n, D_n)$, $n \ge 0$, again is a Markov chain. At step $n$ we complete a series of $m$ excursions of lengths $k_1, \ldots, k_m$ iff $Z_n = (I_0, m)$. The waiting time for this event is given by $\tau_m := \inf\{n \ge 1 : Z_n = (I_0, m)\}$. We decompose paths according to the visits of $(Z_n)$ to $(I_0, 0)$. Let $L^*_k$, $k \ge 1$, denote the time between the $(k-1)$st and $k$th visit, and $\Theta_m := \sum_{k=0}^{\tau_m} 1_{(I_0,0)}(Z_k)$. Then $\tau_m$ is essentially given by $\sum_{k=1}^{\Theta_m} L^*_k$. $\Theta_m$ is the waiting time until the first success (meaning that, with probability $p_m \to 0$, we reach $(I_0, m)$ before returning to $(I_0, 0)$) in a sequence of Bernoulli trials performed at each visit to $(I_0, 0)$. Hence $p_m \Theta_m \overset{d}{\Longrightarrow} E$ as $m \to \infty$. Notice now that $\sum_{k=1}^{\Theta_m} L^*_k$ has the same distribution as $\sum_{k=1}^{\Theta_m} E_k^{(m)}$, where $(E_k^{(m)})_{k \ge 1}$ is an iid sequence independent of $\Theta_m$, $E_k^{(m)}$ having the first return distribution $F^{(m)}$ of $(Z_n)$ to $(I_0, 0)$ under the condition that we do not pass through $(I_0, m)$.

If the $F^{(m)}$ are uniformly in the domain of attraction of $G_\alpha$ in the sense that both the $L^\infty$-convergence of the distribution functions of $b^{(m)}(n)^{-1} \sum_{k=1}^{n} E_k^{(m)}$ to $G_\alpha$, and the regular variation of the $b^{(m)}$ are uniform in $m$, then easy generalizations of the Lemmas above show that

$\frac{1}{b^{(m)}(p_m^{-1})} \sum_{k=1}^{\Theta_m} E_k^{(m)} \overset{d}{\Longrightarrow} E^{1/\alpha} \cdot G_\alpha, \quad \text{as } m \to \infty.$

We are however not going to rigorously discuss this question here.
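The convergence $p_m \Theta_m \Rightarrow E$ used above is the classical geometric-to-exponential limit, and it can be checked numerically without simulation. The sketch below assumes that $\Theta$ is geometric on $\{1, 2, \ldots\}$ with success probability $p$ and that $E$ is a standard exponential variable. The function names and the grid of test points are illustrative choices, not taken from the paper.

```python
import math

def geometric_tail(p: float, t: float) -> float:
    """P(p * Theta > t) for Theta geometric on {1, 2, ...} with success probability p."""
    # Theta > n happens exactly when the first n Bernoulli trials all fail.
    n = math.floor(t / p)
    return (1.0 - p) ** n

def max_gap_to_exponential(p: float, ts) -> float:
    """Largest |P(p * Theta > t) - exp(-t)| over the sample points ts."""
    return max(abs(geometric_tail(p, t) - math.exp(-t)) for t in ts)

# Sample points t = 0.25, 0.5, ..., 5.0 (an arbitrary grid for illustration).
ts = [0.25 * k for k in range(1, 21)]
for p in (0.1, 0.01, 0.001):
    print(p, max_gap_to_exponential(p, ts))
```

The printed gaps shrink as $p$ decreases, which is the behaviour the heuristic argument relies on.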


Finally we notice that this pattern does not include the interesting case of cylinders shrinking to the indifferent fixed point $x = 0$. As remarked earlier, they do not constitute events which are asymptotically rare w.r.t. the invariant measure. A rough analysis suggests that the entrance times should behave rather differently. In effect, these entrance times can be written $\tau_m = \sum_{i=1}^{\Theta_m - 1} L_i$ where $(L_i)$ is the sequence of iid random variables describing the durations of the excursions from $I_0$ and $\Theta_m$ is the first index $i$ for which $L_i$ is larger than $m$. For each $m$, we can consider an iid sequence $(E_i^{(m)})_{i \ge 1}$, independent of $\Theta_m$, having the distribution of $L_i$ given $\{L_i < m\}$. The random variable $\tau_m$ has the distribution of $\sum_{i=1}^{\Theta_m - 1} E_i^{(m)}$. Our point is that, at least in the simplest cases, one can use the theorem in Section IX.7 of [12] to identify the limit distribution of the triangular array $b(m)^{-1} \sum_{i=1}^{E[\Theta_m] - 1} E_i^{(m)}$ for suitable normalizing sequences $b$. It has finite expectation but is not trivial. So we believe that another class of limit laws may arise in this situation.

Acknowledgments. Most of this work was done during meetings and visits made possible by the PRODYN program of the ESF.

References

[1] J. Aaronson, An introduction to infinite ergodic theory, AMS (1997).
[2] N.H. Bingham, C.M. Goldie, J.L. Teugels, Regular variation, Cambridge UP (1989).
[3] M. Campanino, S. Isola, Statistical properties of long return times in type I intermittency, Forum Math. 7(3), 331–348 (1995).
[4] Z. Coelho, E. de Faria, Limit laws of entrance times for homeomorphisms of the circle, Israel J. Math. 93, 93–112 (1996).
[5] Z. Coelho, Asymptotic laws for symbolical dynamical systems, Lecture notes, CIMPA Summer School (Temuco, Chile) (1997).
[6] Z. Coelho, P. Collet, Poisson laws associated to subsystems of finite type in symbolic dynamical systems, Preprint (2000).
[7] P. Collet, A. Galves, Asymptotic distribution of entrance times for expanding maps, Dynamical systems and Applications, WSSIAA 4, 139–152 (1995).
[8] P. Collet, A. Galves, B. Schmitt, Unpredictability of the occurrence time of a long laminar period in a model of temporal intermittency, Annales de l’Institut Henri Poincaré 57(3), 319 (1992).
[9] W. Doeblin, Remarques sur la théorie métrique des fractions continues, Compositio Math. 7, 353–371 (1940).
[10] F. Durand, A. Maass, Limit laws of entrance times for low complexity Cantor minimal systems, Preprint (2000).
[11] W. Feller, An introduction to Probability theory and its applications, Vol. I, Wiley (1970).
[12] W. Feller, An introduction to Probability theory and its applications, Vol. II, Wiley (1971).
[13] A. Galves, B. Schmitt, Occurrence times of rare events for mixing dynamical systems, Annales de l’Institut Henri Poincaré 52(3), 267–281 (1990).
[14] A. Galves, B. Schmitt, Inequalities for hitting times in mixing dynamical systems, Random Comput. Dynam. 5(4), 337–347 (1997).
[15] L. Heinrich, Poisson Approximation for the Number of Large Digits of Inhomogeneous f-Expansions, Mh. Math. 124, 237–253 (1997).
[16] M. Hirata, Poisson law for Axiom A diffeomorphisms, Ergod. Th. & Dyn. Sys. 13, 533–556 (1993).
[17] M. Hirata, B. Saussol, S. Vaienti, Statistics of return times: a general framework and new applications, Comm. Math. Phys. 206 no. 1, 33–55 (1999).
[18] S. Isola, Renewal sequences and intermittency, J. Statist. Phys. 97 no. 1-2, 263–280 (1999).
[19] R. Zweimüller, Ergodic properties of infinite measure preserving interval maps with indifferent fixed points, Ergod. Th. & Dyn. Sys. 20, 1519–1549 (2000).

Xavier Bressaud
Institut de Mathématiques de Luminy, Case 907
163, avenue de Luminy
F-13288 Marseille Cedex 9, France
e-mail: [email protected]

Roland Zweimüller
Mathematisches Institut
Universität Erlangen-Nürnberg
Bismarckstraße 11/2
D-91054 Erlangen, Germany
e-mail: [email protected]

Communicated by Jean-Pierre Eckmann
submitted 19/05/00, accepted 15/09/00



Singularity Cancellation in Fermion Loops Through Ward Identities

C. Kopper and J. Magnen

Abstract. Recently Neumayr and Metzner [1] have shown that the connected N-point density-correlation functions of the two-dimensional and the one-dimensional Fermi gas at one-loop order generically (i.e. for nonexceptional energy-momentum configurations) vanish/are regular in the small momentum/small energy-momentum limits. Their result is based on an explicit analysis building on the results of Feldman et al. [2]. In this note we use Ward identities to give a proof of the same fact - in a considerably shortened and simplified way - for any dimension of space.

The infrared properties of the connected N-point density-correlation function of the interacting Fermi gas at one-loop order, to be called N-loop for shortness, are important for the understanding of interacting Fermi systems, in particular in the low energy regime. The N-loops appear as Feynman (sub)diagrams or as kernels in effective actions. In two dimensions e.g., their properties are relevant for the analysis of the electron gas in relation with questions such as the breakdown of Fermi liquid theory and high temperature superconductivity. We refer to the literature in this respect, see [3,4] and references given there. Whereas the contribution of a single loop-diagram to the N-point function for $N \ge 3$ generally diverges in the small energy-momentum limit, these singularities have been known to cancel each other in various situations [3,4,5] in the symmetrized contribution, i.e. when summing over all possible orderings of the external momenta, a phenomenon called loop-cancellation. The two-loop has been known explicitly in one, two and three dimensions for quite some time [1]; the calculation in two dimensions goes back to Stern [6].

We introduce the following notations adapted to those of [1]: $\Pi_N(q_1, \ldots, q_N)$ denotes the Fermionic N-loop for $N \ge 3$, see (2) below, as a function of the (outgoing) external energy-momentum variables $q_1, q_2, \ldots, q_{N-1}$ and $q_N = -(q_1 + \ldots + q_{N-1})$. Here the $(d+1)$-vector $q$ stands for $(q_0, q_1, \ldots, q_d) = (q_0, \vec q)$. We also introduce the variables

$p_i = q_1 + q_2 + \ldots + q_{i-1}, \quad p_1 = 0, \quad 1 \le i \le N.$    (1)

By definition we then have

$\Pi_N(q_1, \ldots, q_N) = \int \frac{dk_0}{2\pi} \frac{d^d k}{(2\pi)^d} \, I_N(k; q_1, \ldots, q_N) \quad \text{with} \quad I_N(k; q_1, \ldots, q_N) = \prod_{j=1}^{N} G_0(k - p_j)$    (2)

and

$G_0(k) = \frac{1}{i k_0 - (\varepsilon_{\vec k} - \mu)}, \quad \varepsilon_{\vec k} = \frac{\vec k^2}{2m},$

$\mu$ being the Fermi energy.

To have absolutely convergent integrals for $N \ge 3$, we restrict the subsequent considerations to the physically interesting cases $d \le 3$. At the end of the paper we indicate how the same results can be obtained for $d \ge 4$. We also assume that the variables $q_j$ have been chosen such that the integrand is not singular (see below (8)). In the following we will choose units such that $\mu = 1$, $2m = 1$. By convention the vertex of $q_1$ will be viewed as the first vertex.

Symmetrization with respect to the external momenta $(q_1, \ldots, q_N)$ diminishes the degree of singularity of the Fermion loops. To prove this fact we have to introduce some notation on permutations. We denote by $\sigma$ any permutation of the sequence $(2, \ldots, N)$. By $\Pi_N^{\sigma}(q_1, \ldots, q_N)$ we then denote $\Pi_N(q_1, q_{\sigma^{-1}(2)}, \ldots, q_{\sigma^{-1}(N)})$. For the completely symmetrized N-loop we write¹

$\Pi_N^{S}(q_1, \ldots, q_N) = \sum_{\sigma} \Pi_N^{\sigma}(q_1, \ldots, q_N).$    (3)

We will also have to consider subsets of permutations: For $n \le N - 2$ and $2 \le j_1 < j_2 < \ldots < j_n \le N$ we denote by $\sigma^{(i_1, \ldots, i_n)}_{(j_1, \ldots, j_n)}$ the permutation mapping $j_\nu \to i_\nu = \sigma(j_\nu) \in \{2, \ldots, N\}$, which preserves the order of the remaining sequence $(2, \ldots, N) - (j_1, \ldots, j_n)$, i.e. $\sigma(\nu) < \sigma(\mu)$ for $\nu < \mu$, if $\nu, \mu \notin \{j_1, \ldots, j_n\}$. When the target positions $(i_1, \ldots, i_n)$ are summed over (see e.g. (4) below), we will write shortly $\sigma(j_1, \ldots, j_n)$, or also $\sigma_N(j_1, \ldots, j_n)$, $\sigma_N^n$, if we want to indicate the number $N$. Note that $n = N - 2$ is already the most general case, since fixing the positions of $N - 2$ variables (apart from $q_1$) fixes automatically that of the last. For the permutation $\sigma^{(i)}_{(j)}$, which maps $j$ onto the $i$-th position in the sequence $(2, \ldots, N)$ (preserving the order of the other variables), we use the shorthands $\sigma_j^i$ or $\sigma_j$. We then also introduce the N-loop, symmetrized with respect to the previously introduced subsets of permutations, i.e.²

$\Pi_N^{S_n(j_1, \ldots, j_n)}(q_1, \ldots, q_N) = \sum_{\sigma_N(j_1, \ldots, j_n)} \Pi_N^{\sigma_N(j_1, \ldots, j_n)}(q_1, \ldots, q_N), \quad \text{in particular} \quad \Pi_N^{S} = \Pi_N^{S_{N-2}}.$    (4)
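The combinatorics of these permutation sets can be made concrete by brute-force enumeration. The sketch below is only an illustration: the helper names `symmetrized_orderings` and `partial_orderings` are hypothetical, and labels $1, \ldots, N$ stand for the momenta $q_1, \ldots, q_N$. It lists the orderings with the first vertex held fixed, and the restricted orderings in which only a chosen set of labels changes position while the others keep their relative order.

```python
from itertools import permutations

def symmetrized_orderings(N: int):
    """All orderings (1, s(2), ..., s(N)) with the first vertex held fixed."""
    return [(1,) + tail for tail in permutations(range(2, N + 1))]

def partial_orderings(N: int, moved):
    """Orderings of (2, ..., N) in which only the labels in `moved` change
    position, while the remaining labels keep their relative order."""
    rest = [j for j in range(2, N + 1) if j not in moved]
    result = []
    for tail in permutations(range(2, N + 1)):
        # Keep this ordering only if the unmoved labels appear in their original order.
        if [j for j in tail if j not in moved] == rest:
            result.append((1,) + tail)
    return result

print(len(symmetrized_orderings(5)))         # (N-1)! orderings for N = 5
print(len(partial_orderings(5, {2, 4})))     # (N-1)!/(N-n-1)! orderings for n = 2
print(len(partial_orderings(5, {2, 3, 4})))  # n = N-2 already gives all orderings
```

The counts agree with the factors $(N-1)!$ and $(N-1)!/(N-n-1)!$ by which the sums are not normalized, and with the remark that $n = N - 2$ is already the most general case.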

The notations corresponding to (3)-(4) will be applied in the same sense also to $I_N$. The recent result [1] of Neumayr and Metzner, based on the exact expression for the N-loop from [2], which however is nontrivial to analyze, shows that for $N > 2$ and $d = 1, 2$ one has generically:

$\Pi_N^{S}(\lambda q_1, \ldots, \lambda q_N) = O(1)$ for $\lambda \to 0$,    (5)

$\Pi_N^{S}(q_{10}, \lambda \vec q_1, \ldots, q_{N0}, \lambda \vec q_N) = O(\lambda^{2N-2})$ for $\lambda \to 0$,    (6)

$\Pi_N^{S}(q_1, \ldots, q_N) = O(|q_j|)$ for $q_j \to 0$.    (7)

¹ We do not divide by the number of permutations, here $(N-1)!$, to shorten some of the subsequent formulae.
² Again we do not multiply by $\frac{(N-n-1)!}{(N-1)!}$.

We are not completely sure about the authors’ definition of ’generically’. In any case their restrictions on the energy-momentum variables include the following one: The energy-momentum set $\{q_1, \ldots, q_N\}$ is nonexceptional, if for all $J \subsetneq \{1, \ldots, N\}$ we have

$\left| \sum_{i \in J} q_{i0} \right| \ge \eta > 0.$    (8)
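Condition (8) is a finite check over subsets, so it is easy to test for a given configuration. The sketch below is a minimal illustration under two assumptions that are not stated in the text: the subsets $J$ are taken to be nonempty (the empty sum would vanish trivially), and the function name `is_nonexceptional` is a hypothetical choice. Only the energy components $q_{i0}$ enter.

```python
from itertools import combinations

def is_nonexceptional(energies, eta):
    """Check |sum_{i in J} q_{i0}| >= eta for every nonempty proper subset J.

    `energies` lists the components q_{10}, ..., q_{N0}; by energy conservation
    they are expected to sum to zero, which is why J = {1, ..., N} is excluded.
    """
    N = len(energies)
    for size in range(1, N):  # sizes 1 .. N-1: nonempty proper subsets only
        for J in combinations(range(N), size):
            if abs(sum(energies[i] for i in J)) < eta:
                return False
    return True

print(is_nonexceptional([1.0, 2.0, -3.0], eta=0.5))
print(is_nonexceptional([1.0, -1.0, 0.5, -0.5], eta=0.5))
```

In the second configuration the subset $\{q_1, q_2\}$ has vanishing total energy, so the configuration is exceptional for any $\eta > 0$.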

Our bounds given in the subsequent proposition are based on this condition.³ Though we cannot exclude (and did not really try) that linear relations among the momentum variables $\{q_1, \ldots, q_N\}$ could even improve those bounds, it seems quite clear that they are saturated apart from subsets of momentum configurations of measure zero, cf. also the numerical results mentioned in [1]. Furthermore they deteriorate with the parameter $\eta^{-1}$ (cf. the remarks in the end of the paper).

Proposition. For nonexceptional energy-momentum configurations $\{q_1, \ldots, q_N\}$ (as defined through (8) with $\eta$ fixed) and for $N \ge 3$ and $n \le N - 2$ the following bounds hold:

A) In the small $\lambda$ limit $q_{i0} \to \lambda q_{i0}$, $\vec q_i \to \lambda \vec q_i$, $\lambda \to 0$:

A1) $|\Pi_N(\lambda q_1, \ldots, \lambda q_N)| \le O(\lambda^{-(N-2)})$,    (9)

A2) $|\Pi_N^{S_n}(\lambda q_1, \ldots, \lambda q_N)| \le O(\lambda^{-(N-2-n)})$,    (10)

$|\Pi_N^{S}(\lambda q_1, \ldots, \lambda q_N)| \le O(1)$.    (11)

B) In the dynamical limit $\vec q_i \to \lambda \vec q_i$, $\lambda \to 0$:

B1) $|\Pi_N(q_{10}, \lambda \vec q_1, \ldots, q_{N0}, \lambda \vec q_N)| \le O(\lambda^{2})$,    (12)

B2) $|\Pi_N^{S_n}(q_{10}, \lambda \vec q_1, \ldots)| \le O(\lambda^{2n+2})$,    (13)

$|\Pi_N^{S}(q_{10}, \lambda \vec q_1, \ldots)| \le O(\lambda^{2N-2})$.    (14)

The functions $\lambda^{N-2} \Pi_N(\lambda q_1, \ldots, \lambda q_N)$, $\Pi_N^{S}(\lambda q_1, \ldots, \lambda q_N)$, $\Pi_N(q_{10}, \lambda \vec q_1, \ldots, q_{N0}, \lambda \vec q_N)$ and $\Pi_N^{S}(q_{10}, \lambda \vec q_1, \ldots, q_{N0}, \lambda \vec q_N)$ are analytic functions of $\lambda$ in a neighbourhood of $\lambda = 0$ (depending on the momentum configuration, in particular on $\eta$).

Proof. To prove A1) we perform the $k_0$-integration using residue calculus, so that (2) takes the form (cf. [2,1])

$\Pi_N(q_1, \ldots, q_N) = \sum_{i=1}^{N} \int_{|\vec k - \vec p_i| < \ldots}$ […]

[…] $> 0$ we regard the scaling resp. dynamical limit for (44), multiplied by $\varphi$ and integrated over $k$. We can apply the induction hypothesis to the r.h.s. of (44) noting again that the functions $\Delta \varphi$ and $A \varphi$ have the properties required for $\varphi$. We also use (25) if $N = 3$. For each entry in the sum in the last line of (44) we perform in the second term the change of variables $\tilde k = k + \lambda q_{j_\nu}$ resp. $\tilde k = k + (q_{j_\nu,0}, \lambda \vec q_{j_\nu})$ and then use (21). With the aid of (45, 46) and the induction hypothesis one then shows

$\left| \int \frac{dk_0}{2\pi} \frac{d^d k}{(2\pi)^d} \, I_N^{S_n}(k; \lambda q_1, \ldots, \lambda q_N) \, \varphi(k; \lambda q_1, \ldots, \lambda q_N) \right| \le O(\lambda^{-(N-2-n)}),$    (48)

$\left| \int \frac{dk_0}{2\pi} \frac{d^d k}{(2\pi)^d} \, I_N^{S_n}(k; q_{1,0}, \lambda \vec q_1, \ldots, q_{N,0}, \lambda \vec q_N) \, \varphi(k; q_{1,0}, \lambda \vec q_1, \ldots, q_{N,0}, \lambda \vec q_N) \right| \le O(\lambda^{2+2n}).$    (49)

On specializing to $\varphi \equiv 1$, this ends the proof of the proposition. ✷

We add a few comments on various extensions of the results obtained.

a) For dimensions $d \ge 4$ the N-loop integrals are absolutely convergent for $2N > d + 1$ and can be obtained as limits $\Lambda_0 \to \infty$ of their regularized versions, which are defined on introducing a regulating function $\rho(\vec k^2 / \Lambda_0^2)$ in the propagators

$G_0(k) \to G_0(\Lambda_0, k) = \frac{1}{i k_0 - \left( \rho^{-1}(\vec k^2 / \Lambda_0^2) \, \vec k^2 - 1 \right)}.$

We suppose $\rho$ to be smooth, monotonic, positive, of fast decrease and such that $\rho(x) \equiv 1$ for $x \le 1$. The regulator then appears in the $A$-factors when using the Ward identity, e.g. (32) changes into

$i q_{j,0} + \rho^{-1}\!\left( \frac{(\vec k - \vec q_j)^2}{\Lambda_0^2} \right) (\vec k - \vec q_j)^2 - \rho^{-1}\!\left( \frac{\vec k^2}{\Lambda_0^2} \right) \vec k^2.$

But since these factors are still independent of $k_0$, the regulator disappears without leaving any trace after performing the $k_0$-integration, if $\Lambda_0 \ge |\vec q_j| + 1$. So we still obtain the same results for $d \ge 4$, if $2N > d + 1$, and we obtain them without this last restriction in case we define the integrals as $\Lambda_0 \to \infty$ limits of their regulated versions from the beginning.

b) Neumayr and Metzner [1] also prove $|\Pi_N^{S}(q_1, \ldots, q_N)| \le O(|q_j|)$ for $q_j \to 0$, keeping the other variables fixed. In our framework this result is obtained immediately from (33), and we realize that it holds already on symmetrization with respect to $q_j$; full symmetrization is not required. This result can be generalized to several vanishing external momenta $q_{j_1}, \ldots, q_{j_n}$, in the same way as we did for the proof of A2) and B2) in the proposition. Using (44) we obtain on induction

$|\Pi_N^{S_n(j_1, \ldots, j_n)}(q_1, \ldots, q_N)| \le O\!\left( \prod_{\nu=1}^{n} |q_{j_\nu}| \right)$    (50)

and of course the same bound on $\Pi_N^{S}$.

c) From the proof one can straightforwardly read off a bound w.r.t. the dependence on the parameter $\eta$ from (8). This bound is in terms of $\eta^{-(N+n)}$, stemming from the contributions with a maximal number of factors of $\Delta$. It is of course rather crude, since it does not take into account the effects of the nonvanishing spatial variables and can be improved, depending on the hypotheses made on those.

In conclusion we have recovered previous results on the infrared behaviour of the connected N-point density-correlation functions, in short N-loops, by simple, but rigorous arguments based on the Ward identity. We obtain bounds for the fully symmetrized N-loop, in showing how successive symmetrization improves the infrared behaviour.⁵ The bounds hold in any spatial dimension (taking into account the remarks from a) above). Since the Ward identities are explicit and easy to handle, they permit generalizations such as (50).

Acknowledgment. We would like to thank Walter Metzner for acquainting us with the results from [1] and comments on a previous version of the paper. We are particularly indebted to Manfred Salmhofer for detecting a major mistake in our first version, suggesting quite a number of further corrections and improvements and for many valuable comments on the paper.

⁵ We recently learned from W. Metzner that they were also aware of the fact that partial symmetrization improves the infrared behaviour, but did not mention it in [1].

References

[1] A. Neumayr and W. Metzner, Phys. Rev. B58, 15449 (1998), and J. Stat. Phys. 96, 613 (1999).
[2] J. Feldman, H. Knörrer, R. Sinclair and E. Trubowitz, in Singularities, edited by M. Greuel (Birkhäuser, Basel, 1998).
[3] W. Metzner, C. Castellani and C. di Castro, Adv. Phys. 47, 317 (1998).
[4] P. Kopietz, Bosonization of Interacting Fermions in Arbitrary Dimensions (Springer, Berlin, 1997).
[5] P. Kopietz, J. Hermisson and K. Schönhammer, Phys. Rev. B52, 10877 (1995).
[6] F. Stern, Phys. Rev. Lett. 18, 546 (1967).

C. Kopper and J. Magnen
Centre de Physique Théorique
CNRS UPR 14
Ecole Polytechnique
F–91128 Palaiseau Cedex, France
e-mail: [email protected]
e-mail: [email protected]

Communicated by Joel Feldman
submitted 01/08/00, accepted 10/01/01




A-Priori Decay for Eigenfunctions of Perturbed Periodic Schrödinger Operators

Marius Măntoiu, Radu Purice

We dedicate this work to Werner Amrein for his 60th anniversary

Abstract. In this paper we use a general procedure [11] allowing to study the asymptotic behavior of eigenfunctions (even for eigenvalues that are embedded in the continuous spectrum) and prove exponential decay of eigenfunctions for a large class of perturbed periodic Schrödinger Hamiltonians.

1 Introduction

In this paper we consider the problem of obtaining upper bounds for the rate of decay at infinity for eigenfunctions of perturbed periodic Schrödinger operators. More precisely, let us fix a Hamiltonian of the form $H_I := H + V_I$ where $H := -\Delta + V$ is a periodic Schrödinger operator in dimension $n$ and $V_I$ is a perturbation decaying at infinity (faster than $|x|^{-1}$). We shall suppose that the spectrum of $H$ has an isolated part at the bottom that can be described by $N$ analytic eigenvalues with analytic associated eigenprojectors (for example if the first band is isolated); more precisely we shall impose our Hypothesis 1.1 below. Under these conditions we show that any eigenvalue of the perturbed Hamiltonian $H_I$ that is a regular value (more precisely see Definition 1.2) has eigenfunctions that decay exponentially at infinity, with an exponent linear in $|x|$ (see Theorem 1.4). Let us remark that our result covers also the case of embedded eigenvalues as long as they are regular.

Let us point out that the existence of embedded eigenvalues for perturbations of periodic Schrödinger operators has been the subject of intensive work. In [10] it is shown that for any continuous $V$ and any number $E$ belonging to the spectrum of $H$, there exists a function $V_I$ which is $O(\langle x \rangle^{-1})$ at infinity such that $E$ is an eigenvalue of $H + V_I$. In more than one dimension the situation is less clear. Anyway, if $n = 2$ or $3$, for some classes of periodic $V$'s, eigenvalues embedded into the spectrum of $H$ are forbidden if one imposes the very restrictive condition $|V_I(x)| \le C \exp(-|x|^{4/3+\varepsilon})$ for a strictly positive $\varepsilon$ (see [9]).

We obtain our result (Theorem 1.4) by first proving a weighted estimate of Hardy type (with exponential weights) for the unperturbed periodic Hamiltonian $H$ (Theorem 1.3). In fact, inspired by [1], [2], [3], [4], [5], [8], we elaborate a general scheme (see also [11]) for obtaining Hardy type inequalities for a Hamiltonian starting from a conjugate operator of a special form imposed by the form of the weight function. In [11] we have used this general method for Hamiltonians given by convolution with analytic functions. In our case we shall isolate the bounded energy region of the first $N$ bands, for which we shall apply a generalization of our previous method, and the rest of the spectrum, for which we shall use a variant of the general method of Agmon [1].

We shall denote by $\nabla$ and $\Delta$ the usual gradient and Laplace operators on $C_0^\infty(\mathbb{R}^n)$ and by $\mathcal{H}^2(\mathbb{R}^n)$ the Sobolev space of second order. Let $p = 2$ for $n = 1, 2, 3$, $p > n/2$ for $n \ge 4$ and let $V \in L^p_{loc}(\mathbb{R}^n; \mathbb{R})$ be $\mathbb{Z}^n$-periodic on $\mathbb{R}^n$. By some obvious modifications one can also consider a general type of lattice. We consider the Hamiltonian

$H = -\Delta + V$    (1.1)

to be the usual self-adjoint operator in $L^2(\mathbb{R}^n)$ (having domain $\mathcal{H}^2(\mathbb{R}^n)$, see [12]). The well-known Floquet representation allows one to decompose $H$ as a direct integral corresponding to the representation $L^2(\mathbb{R}^n) \cong L^2(\mathbb{T}^n; L^2(\Omega))$, where

$\mathbb{T}^n := \mathbb{R}^n / \mathbb{Z}^n \cong (S^1)^n; \quad \Omega := [0, 1)^n$    (1.2)

are the $n$-dimensional torus and the fundamental domain associated to $\mathbb{Z}^n$. In the following we identify functions defined on $\mathbb{T}^n$ with periodic functions on $\mathbb{R}^n$. The Hamiltonian $H$ is decomposable with respect to the above representation and each "fibre Hamiltonian" $H(\tau)$ (for $\tau \in \mathbb{T}^n$) has compact resolvent and thus a discrete spectrum $\{\lambda_a(\tau)\}_{a \in \mathbb{N}}$, defining the so-called "band functions". Since our procedure relies on the regularity of the functions $\mathbb{T}^n \ni \tau \mapsto \lambda_a(\tau)$, and since it is well known that for $n > 1$ some difficult problems appear in this context, we are obliged to impose some implicit conditions that we now formulate. We shall constantly denote:

$\mathbb{C}^n_\delta := \{ z \in \mathbb{C}^n \mid |\mathrm{Im}\, z_j| < \delta, \ \forall j \in \{1, \ldots, n\} \}, \quad \delta > 0,$
$\mathbb{P}(L^2(\Omega)) := \{ P \in B(L^2(\Omega)) \mid P^2 = P = P^* \}.$    (1.3)

Hypothesis 1.1. Denoting by $\sigma(H)$ the spectrum of the operator $H$, we assume:
a) $\sigma(H) = \sigma_0 \cup \sigma_\infty$, where $(\inf \sigma_\infty) - (\sup \sigma_0) = d_0 > 0$;
b) there is some $N \in \mathbb{N}^*$ and for each $a \in \{1, \ldots, N\}$ two functions

$\lambda_a : \mathbb{T}^n \to \mathbb{R}, \quad \pi_a : \mathbb{T}^n \to \mathbb{P}(L^2(\Omega))$    (1.4)

that are analytic (with respect to the uniform topology on $\mathbb{P}(L^2(\Omega))$ in the second case) and admit holomorphic extensions to some strip $\mathbb{C}^n_\delta$ for some $\delta > 0$, such that the Hamiltonian $H$ reduced to the spectral subspace associated to $\sigma_0$ is unitarily equivalent, in the Floquet representation, to multiplication with the following operator-valued function of $\tau \in \mathbb{T}^n$:

$\sum_{a=1}^{N} \lambda_a(\tau) \pi_a(\tau).$    (1.5)

Let us remark that our Hypothesis covers the usual case in which the spectrum of $H$ has an isolated band at its bottom, but also the situation of several bands, even overlapping, as long as one can assure the analyticity of the eigenvalues and of the eigenprojections.

Definition 1.2. Let us denote by $E_0(H)$ the set of points $t < \inf \sigma_\infty$ such that $\exists \varepsilon > 0$, $\exists \alpha_0 > 0$ for which $|(\nabla \lambda_a)(\tau)| \ge \alpha_0$, $\forall \tau \in \lambda_a^{-1}((t - \varepsilon, t + \varepsilon))$ and $\forall a \in \{1, \ldots, N\}$. We call this set the regular set of $H$ below $\sigma_\infty$.

Let us remark that $E_0(H)$ is the complement in $(-\infty, \inf \sigma_\infty)$ of the set of critical values of the functions $\{\lambda_1, \ldots, \lambda_N\}$. With these notations we can now state the main results of our work, which will be proved in Section 3.

Theorem 1.3. Let $H$ be a periodic Schrödinger Hamiltonian satisfying the Hypothesis 1.1 and let $E \in E_0(H)$. Then there exists a constant $\kappa_0 \in (0, 2\pi\delta)$ such that for any $\kappa \in (0, \kappa_0)$ there exists a positive constant $C$ (depending on $E$ and $\kappa$) for which:

$\| e^{\kappa \langle Q \rangle} f \|_{D(H)} \le C \, \| \langle Q \rangle e^{\kappa \langle Q \rangle} (H - E) f \|, \quad \forall f \in D(H).$    (1.6)

We have denoted by $\| \cdot \|_{D(H)}$ the graph norm with respect to $H$.

Theorem 1.4. Let $H$ be a periodic Schrödinger operator (1.1) for which Hypothesis 1.1 holds. Let $V_I$ be a potential of class $L^p_{loc}(\mathbb{R}^n)$ (with $p$ as defined before (1.1)), such that $\lim_{|x| \to \infty} \langle x \rangle |V_I(x)| = 0$. Then for any eigenvalue $E$ of the Hamiltonian $H_I := H + V_I$ that belongs to $E_0(H)$ there exists $\kappa \in (0, \delta)$ such that for any corresponding eigenvector $g$:

$e^{\kappa \langle Q \rangle} g \in L^2(\mathbb{R}^n).$    (1.7)

An Appendix is dedicated to some technical lemmas needed in the proof of Theorem 1.3.

2 Some Developments in the Floquet Representation

Let $H$ be a periodic Schrödinger Hamiltonian as in the preceding section. We shall briefly recall some facts concerning the Floquet representation in order to fix our notations and to highlight some objects and properties that we shall need in the sequel.

For $x \in \mathbb{R}^n$ let $x = [x] + \{x\}$ with $[x] \in \mathbb{Z}^n$, $\{x\} \in \Omega$. Then, if we denote $\mathcal{K} = L^2(\Omega)$, we can define the unitary isomorphism $L^2(\mathbb{R}^n) \ni f \mapsto U_0 f \in L^2(\mathbb{T}^n; \mathcal{K})$,

$(U_0 f)(\tau, \xi) := (2\pi)^{n/2} \sum_{\alpha \in \mathbb{Z}^n} e^{-i 2\pi \alpha \cdot \tau} f(\alpha + \xi).$    (2.8)

For further use let us also give the explicit form of its inverse:

$\forall \mathring{f} \in L^2(\mathbb{T}^n; \mathcal{K}), \quad (U_0^{-1} \mathring{f})(x) = (2\pi)^{-n/2} \int_{\mathbb{T}^n} e^{i 2\pi [x] \cdot \tau} \mathring{f}(\tau, \{x\}) \, d\tau.$    (2.9)
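The decomposition of a point into its lattice part and its part in the fundamental domain $\Omega = [0, 1)^n$ is a componentwise floor operation. The sketch below is only an illustration: the function name `split_lattice` is a hypothetical choice, and the brace notation for the component in $\Omega$ is assumed. It returns both parts.

```python
import math

def split_lattice(x):
    """Return ([x], {x}) with [x] in Z^n and {x} in [0, 1)^n, componentwise."""
    integer_part = tuple(math.floor(c) for c in x)
    # The remainder lies in [0, 1) in every coordinate, also for negative entries.
    fractional_part = tuple(c - k for c, k in zip(x, integer_part))
    return integer_part, fractional_part

print(split_lattice((2.5, -0.25)))
```

Note that for negative coordinates the lattice part is rounded down, not towards zero, so that the remainder always lies in $\Omega$.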

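We constantly distinguish between the two unitarily equivalent representations $\mathcal{H} = L^2(\mathbb{R}^n)$ and $\mathring{\mathcal{H}} = L^2(\mathbb{T}^n; \mathcal{K})$ and we use notations of the form $\mathring{H} := U_0 H U_0^{-1}$. For the position operators

$Q := (Q_j)_{j=1,\ldots,n}, \quad (Q_j f)(x) := x_j f(x), \quad D(Q_j) := \left\{ f \in \mathcal{H} \;\middle|\; \int_{\mathbb{R}^n} |x_j f(x)|^2 \, dx < \infty \right\}$    (2.10)

we have the explicit form in the representation $\mathring{\mathcal{H}}$:

$(\mathring{Q} \mathring{f})(\tau, \xi) := (U_0 Q U_0^{-1} \mathring{f})(\tau, \xi) = \left( \frac{i}{2\pi} \nabla_\tau + M_\xi \right) \mathring{f}(\tau, \xi)$    (2.11)

for any $\mathring{f} \in C^\infty(\mathbb{T}^n; \mathcal{K})$, where $\nabla_\tau$ is the gradient operator with respect to the variable $\tau \in \mathbb{T}^n$ and $M_\xi$ is the operator of multiplication with the variable in $\mathcal{K}$.

For any $n$ commuting variables $\{X_1, \ldots, X_n\}$ let $\langle X \rangle := \left( 1 + \sum_{j=1}^{n} X_j^2 \right)^{1/2}$. Then $\langle Q \rangle$ defines a self-adjoint operator on the domain $D(Q) := \bigcap_{j=1}^{n} D(Q_j)$ that is a domain of essential self-adjointness for each $Q_j$. It is useful to observe that for $j = 1, \ldots, n$, one can define the operators

$([Q_j] f)(x) := [x_j] f(x), \quad D([Q_j]) := D(Q_j)$    (2.12)

and they satisfy the relation

$[Q_j] = -\frac{1}{2\pi} U_0^{-1} (-i \nabla_\tau) U_0.$    (2.13)

Associated to these operators we have a third representation that we shall frequently use, $\tilde{\mathcal{H}} := l^2(\mathbb{Z}^n; \mathcal{K})$, obtained by the inverse discrete Fourier transform $F_0 : l^2(\mathbb{Z}^n; \mathcal{K}) \to L^2(\mathbb{T}^n; \mathcal{K})$,

$(F_0 \tilde{u})(\tau, \xi) := (2\pi)^{n/2} \sum_{\alpha \in \mathbb{Z}^n} e^{-i 2\pi \alpha \cdot \tau} \tilde{u}(\alpha, \xi).$    (2.14)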

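We shall also use the following unitary operator:

$U := F_0^{-1} U_0 : L^2(\mathbb{R}^n) \to l^2(\mathbb{Z}^n; \mathcal{K}), \quad (U f)(\alpha, \xi) = f(\alpha + \xi), \quad (U^{-1} \tilde{f})(x) = \tilde{f}([x], \{x\}).$    (2.15)

For any functions $F : \mathbb{Z}^n \to B(\mathcal{K})$ and $\lambda : \mathbb{T}^n \to B(\mathcal{K})$ we can define the multiplication operators $\tilde{M}_F$ on $\tilde{\mathcal{H}}$ and $\mathring{M}_\lambda$ on $\mathring{\mathcal{H}}$, with evident domains, given by:

$(\tilde{M}_F \tilde{f})(\alpha, \xi) := F(\alpha) \tilde{f}(\alpha, \xi), \quad (\mathring{M}_\lambda \mathring{f})(\tau, \xi) := \lambda(\tau) \mathring{f}(\tau, \xi).$    (2.16)

For $\lambda : \mathbb{T}^n \to B(\mathcal{K})$ we can define its Fourier transform

$\hat{\lambda}(\alpha) := (2\pi)^{-n/2} \int_{\mathbb{T}^n} e^{i 2\pi \alpha \cdot \tau} \lambda(\tau) \, d\tau$    (2.17)

(with integrals defined in weak sense in $B(\mathcal{K})$) and we define the convolution operator on $\tilde{\mathcal{H}}$:

$(\lambda * \tilde{f})(\alpha, \xi) := (F_0^{-1} \mathring{M}_\lambda F_0 \tilde{f})(\alpha, \xi) = \sum_{\beta \in \mathbb{Z}^n} \left( \hat{\lambda}(\alpha - \beta) \tilde{f}(\beta) \right)(\xi).$    (2.18)

Thus for any bounded function $\lambda$ we have $\| \lambda * \|_{B(\tilde{\mathcal{H}})} = \| \lambda \|_{L^\infty(\mathbb{T}^n; B(\mathcal{K}))}$. In order to simplify some formulae let us define the discrete translations in $\tilde{\mathcal{H}}$. For $j = 1, \ldots, n$ let $\epsilon_j \in \mathbb{Z}^n$ be given by $(\epsilon_j)_k := \delta_{jk}$ and:

$(V_j \tilde{f})(\alpha, \xi) := \tilde{f}(\alpha - \epsilon_j, \xi).$    (2.19)

Due to the fact that $\{V_1, \ldots, V_n\}$ commute, for any $\alpha \in \mathbb{Z}^n$ one can define

$V(\alpha) \equiv V^\alpha = \prod_{j=1}^{n} V_j^{\alpha_j}$    (2.20)

so that

$\lambda * = \sum_{\beta \in \mathbb{Z}^n} \hat{\lambda}(\beta) V(\beta).$    (2.21)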

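In the sequel we shall frequently need to estimate the norm of the operator $\lambda *$ between spaces with weights (growing exponentially at infinity). Even the definition of the conjugate operator that we shall propose requires the control of such objects. Formally one has:

$(\tilde{M}_F \, \lambda * \, \tilde{M}_{F^{-1}} \tilde{f})(\alpha, \xi) = \sum_{\beta \in \mathbb{Z}^n} F(\alpha) \hat{\lambda}(\alpha - \beta) F(\beta)^{-1} \tilde{f}(\beta, \xi).$    (2.22)

Lemma 2.5. Let $\rho : \mathbb{T}^n \to B(\mathcal{K})$ be an analytic function having a holomorphic extension to a strip $\mathbb{C}^n_\delta$ for some strictly positive constant $\delta$. Then for $\kappa \in [0, 2\pi\delta)$ we have:

$\| \rho \|_{2,-\kappa}^2 := \sum_{\beta \in \mathbb{Z}^n} \left( e^{\kappa |\beta|} \| \hat{\rho}(\beta) \|_{B(\mathcal{K})} \right)^2 < \infty.$    (2.23)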

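Proof. Let us remark that for $\beta \in \mathbb{Z}^n$ and $\nu \in \mathbb{N}^n$:

$\beta^\nu \hat{\rho}(\beta) = (2\pi)^{-n/2} (2\pi i)^{-|\nu|} \int_{\mathbb{T}^n} e^{i 2\pi \beta \cdot \tau} (\partial^\nu \rho)(\tau) \, d\tau,$

$\| \beta^\nu \hat{\rho}(\beta) \|_{B(\mathcal{K})} \le (2\pi)^{n - (|\nu| + n/2)} \sup_{\tau \in \mathbb{T}^n} \| (\partial^\nu \rho)(\tau) \|_{B(\mathcal{K})} \le M_\rho \, (2\pi)^{n/2 - |\nu|} \, \frac{\nu!}{\delta^{|\nu|}}$

due to the analyticity assumption on $\rho$ and the Cauchy inequalities. On the other hand one has for any $\theta \in \mathbb{R}_+$ and $l \in \mathbb{N}$:

$(\theta |\beta|)^l \le \theta^l \sum_{|\nu| = l} \frac{l!}{\nu!} |\beta^\nu|$

so that:

$(\theta |\beta|)^l \, \| \hat{\rho}(\beta) \|_{B(\mathcal{K})} \le C_{n,\varepsilon} \, \frac{M_\rho (2\pi)^{n/2}}{(n-1)!} \, l! \left( \frac{(1+\varepsilon)\theta}{2\pi\delta} \right)^l$

for any $\varepsilon > 0$. By summing up we get that for any $\theta > \kappa$:

$e^{\theta |\beta|} \| \hat{\rho}(\beta) \|_{B(\mathcal{K})} \le C_{n,\varepsilon} \, \frac{M_\rho (2\pi)^{n/2}}{(n-1)!} \sum_{l \in \mathbb{N}} \left( \frac{(1+\varepsilon)\theta}{2\pi\delta} \right)^l,$

$\sum_{\beta \in \mathbb{Z}^n} \left( e^{\kappa |\beta|} \| \hat{\rho}(\beta) \|_{B(\mathcal{K})} \right)^2 \le C^2 \sum_{\beta \in \mathbb{Z}^n} e^{-2(\theta - \kappa)|\beta|} \left( \sum_{l \in \mathbb{N}} \left( \frac{(1+\varepsilon)\theta}{2\pi\delta} \right)^l \right)^2$    (2.24)

and this is finite for $(1 + \varepsilon)\theta < 2\pi\delta$.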
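The elementary inequality $(\theta|\beta|)^l \le \theta^l \sum_{|\nu| = l} \frac{l!}{\nu!} |\beta^\nu|$ used in the proof follows from the multinomial theorem, since the right-hand sum equals $(|\beta_1| + \ldots + |\beta_n|)^l$ and the Euclidean norm is bounded by the $\ell^1$ norm. The sketch below checks this with exact integer arithmetic. It assumes $|\beta|$ denotes the Euclidean norm, and the helper name `multinomial_sum` is a hypothetical choice.

```python
from itertools import product
from math import factorial, prod, sqrt

def multinomial_sum(beta, l):
    """Sum over multi-indices nu with |nu| = l of (l!/nu!) * |beta^nu|."""
    n = len(beta)
    total = 0
    for nu in product(range(l + 1), repeat=n):
        if sum(nu) != l:
            continue
        coeff = factorial(l)
        for k in nu:
            coeff //= factorial(k)  # exact: each intermediate quotient is an integer
        total += coeff * prod(abs(b) ** k for b, k in zip(beta, nu))
    return total

beta, l = (3, -4), 3
l1_norm = sum(abs(b) for b in beta)
euclid = sqrt(sum(b * b for b in beta))
print(multinomial_sum(beta, l), l1_norm ** l, euclid ** l)
```

For $\beta = (3, -4)$ and $l = 3$ the sum equals the third power of the $\ell^1$ norm, which dominates the third power of the Euclidean norm.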

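Definition 2.6. Let $\rho : \mathbb{T}^n \to B(\mathcal{K})$ admit a holomorphic extension to the strip $\mathbb{C}^n_\delta$ (with respect to the uniform topology) for some $\delta > 0$. Assume given a function $m : \mathbb{Z}^n \to \mathbb{R}$ satisfying: $m(\alpha) \ge 1$, $m(\alpha + \beta) \le C_1 m(\alpha) m(\beta)$. For any function $G : \mathbb{Z}^n \times \mathbb{Z}^n \to \mathbb{C}$ such that for some $\kappa \in [0, 2\pi\delta)$:

$\sup_{\alpha, \beta \in \mathbb{Z}^n} e^{-\kappa |\alpha|} m(\beta) |G(\alpha, \beta)| \equiv \| G \|_{\infty,\kappa,m} < \infty$    (2.25)

we define in $\tilde{\mathcal{H}}$ the following operators:

$((\rho \Diamond G) \tilde{f})(\alpha, \xi) := \sum_{\beta \in \mathbb{Z}^n} G(\beta, \alpha) \left( \hat{\rho}(\beta) \tilde{f}(\alpha - \beta) \right)(\xi),$
$((\rho \Diamond G)^\dagger \tilde{f})(\alpha, \xi) := \sum_{\beta \in \mathbb{Z}^n} G(\beta, \alpha - \beta) \left( \hat{\rho}(\beta) \tilde{f}(\alpha - \beta) \right)(\xi).$    (2.26)

If $m(\beta) = 1$ for every $\beta$ we denote $\| G \|_{\infty,\kappa,1} = \| G \|_{\infty,\kappa}$.

Proposition 2.7. For $\rho$, $m$ and $G$ as in Definition 2.6 let $\tilde{\mathcal{H}}_m$ denote the domain of the operator of multiplication with the function $m$ provided with the graph-norm. Then for any $\kappa' \in (\kappa, 2\pi\delta)$ (for $\kappa$ the exponent associated to the function $G$), we have the estimate:

$\| \rho \Diamond G \|_{B(\tilde{\mathcal{H}}; \tilde{\mathcal{H}}_m)} \le C \, \| G \|_{\infty,\kappa,m} \, \| \rho \|_{2,-\kappa'}.$    (2.27)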

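Proof.

$\| (\rho \Diamond G) \tilde{f} \|_{\tilde{\mathcal{H}}_m}^2 := \sum_{\alpha \in \mathbb{Z}^n} m(\alpha)^2 \left\| \sum_{\beta \in \mathbb{Z}^n} G(\beta, \alpha) \hat{\rho}(\beta) \tilde{f}(\alpha - \beta, \cdot) \right\|_{\mathcal{K}}^2$

$\le C_\varepsilon^2 \, \| G \|_{\infty,\kappa,m}^2 \sum_{\alpha \in \mathbb{Z}^n} \left( \sum_{\beta \in \mathbb{Z}^n} \frac{e^{\kappa' |\beta|}}{\langle \beta \rangle^{n/2 + \varepsilon}} \, \| \hat{\rho}(\beta) \|_{B(\mathcal{K})} \, \| \tilde{f}(\alpha - \beta, \cdot) \|_{\mathcal{K}} \right)^2$

$\le C^2 \, \| G \|_{\infty,\kappa,m}^2 \, \| \rho \|_{2,-\kappa'}^2 \, \| \tilde{f} \|_{\tilde{\mathcal{H}}}^2.$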

In computing commutators we use a slight generalization of the above result. Definition 2.8. Let λ : Tn → B(K) and ρ : Tn → B(K) admit holomorphic extensions to Cnδ (with respect to the uniform topology) for some δ > 0. Assume given a function m : Zn → R satisfying : m(α) ≥ 1, m(α + β) ≤ Cm(α)m(β). For any function Γ : Zn × Zn × Zn → C such that for some κ ∈ [0, 2πδ) : sup α,β,γ∈Zn

e−κ(|α|+|β|) m(γ) |Γ(α, β, γ)| ≡ Γ∞,κ,m < ∞

˜ the following operator : we define in H ˆ ρ(γ)M ˜ Γ(γ,β,.) V (β + γ)f˜ (α, ξ). ((λ 4 ρ) ♦Γ) f˜ (α, ξ) := λ(β)ˆ

(2.28)

(2.29)

β,γ∈Zn

˜ m denote the domain Proposition 2.9. For λ, ρ, m and Γ as in Definition 2.8 let H of the operator of multiplication with the function m provided with the graph-norm. Then for any κ ∈ (κ, 2πδ) (with κ the exponent associated to the function Γ), we have the estimation : (λ 4 ρ) ♦ΓB(H; ˜ H ˜ m ) ≤ C Γ∞,κ,m λ2,−κ ρ2,−κ .

(2.30)

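The proof is similar to the previous one. Let us now give the application of this result in computing commutators. In the sequel we use the restriction to $\mathbb{Z}^n$ of functions defined on $\mathbb{R}^n$ and we need some bounds on their variation on $\mathbb{Z}^n$. It is convenient to express this variation by using the Leibniz formula applied to the initial function defined on $\mathbb{R}^n$.

Corollary 2.10. Let $\lambda : \mathbb{T}^n \to B(\mathcal K)$ and $\rho : \mathbb{T}^n \to B(\mathcal K)$ admit holomorphic extensions to the strip $\mathbb{C}^n_\delta$ (with respect to the uniform topology) for some $\delta > 0$. Let $m : \mathbb{Z}^n \to \mathbb{R}_+$ and $G : \mathbb{R}^n\times\mathbb{R}^n \to \mathbb{C}$ be given such that the restriction of $G$ to $\mathbb{Z}^n\times\mathbb{Z}^n$ satisfies the assumptions of Definition 2.6 and also the following estimate:

$$\sup_{\alpha,\beta\in\mathbb{Z}^n} e^{-\kappa|\alpha|}\, m'(\beta)\,|(\nabla G)(\alpha,\beta)| \equiv \|\nabla G\|_{\infty,\kappa,m'} < \infty \tag{2.31}$$

for a function $m'$ satisfying the same conditions as the function $m$. Then: $[\lambda_*, \rho\diamondsuit G] = (\lambda\triangle\rho)\diamondsuit\Gamma$ with:

$$\Gamma(\alpha,\beta,\gamma) := G(\alpha,\beta-\gamma) - G(\alpha,\beta) = -\int_0^1 ds\;\gamma\cdot\big(\nabla^{(2)}G\big)(\alpha,\beta-s\gamma) \tag{2.32}$$

(here $\nabla^{(2)}$ represents the gradient with respect to the second variable) so that we can apply Proposition 2.9.

Proof.

$$\big([\lambda_*,\rho\diamondsuit G]\tilde f\big)(\alpha,\xi) = \Big(\Big[\sum_{\beta\in\mathbb{Z}^n}\hat\lambda(\beta)V(\beta),\ \sum_{\gamma\in\mathbb{Z}^n}\hat\rho(\gamma)\,G(\gamma,\alpha)\,V(\gamma)\Big]\tilde f\Big)(\alpha,\xi) = \sum_{\beta,\gamma\in\mathbb{Z}^n}\{G(\gamma,\alpha-\beta)-G(\gamma,\alpha)\}\,\hat\lambda(\beta)\,\hat\rho(\gamma)\,\big(V(\beta+\gamma)\tilde f\big)(\alpha,\xi).$$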
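The integral form of $\Gamma$ in (2.32) is the fundamental theorem of calculus applied along the segment from $\beta$ to $\beta-\gamma$. As a minimal numerical sketch in one dimension, assuming a hypothetical smooth test function `G` (it is not a function from the text) and a simple midpoint quadrature, the two sides of (2.32) can be compared as follows:

```python
import math


def G(a, b):
    # Hypothetical smooth test function standing in for G(alpha, beta).
    return math.exp(-0.1 * a * a) * math.sin(0.7 * b + 0.3 * a)


def dG_second(a, b):
    # Exact derivative of the test function G with respect to its second argument.
    return 0.7 * math.exp(-0.1 * a * a) * math.cos(0.7 * b + 0.3 * a)


def midpoint(f, n=2000):
    # Composite midpoint rule on [0, 1].
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def gamma_difference(a, b, c):
    # Left form of (2.32): G(alpha, beta - gamma) - G(alpha, beta).
    return G(a, b - c) - G(a, b)


def gamma_integral(a, b, c):
    # Right form of (2.32): minus the integral over s in [0, 1]
    # of gamma * (d_2 G)(alpha, beta - s * gamma).
    return -midpoint(lambda s: c * dG_second(a, b - s * c))
```

Calling `gamma_difference(a, b, c)` and `gamma_integral(a, b, c)` with the same arguments should give values that agree up to the quadrature error.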

As is well known [6], [7], [12], [13], the operator $H$ is analytically decomposable, i.e. $\mathring H$ may be viewed as a multiplication operator with a function $\mathring H(\tau)$ defined on $\mathbb{T}^n$ with values self-adjoint operators on $\mathcal K$, with compact resolvent, that depends analytically on $\tau \in \mathbb{T}^n$. We shall suppose that $\sigma(H) = \sigma_0 \cup \sigma_\infty$ with $(\inf\sigma_\infty) - (\sup\sigma_0) = d_0 > 0$ and consider the spectral projection $P_0$ of $H$ corresponding to $\sigma_0$. We denote: $K := P_0 H P_0$, $H_\infty := H - K$, $P_\infty := 1 - P_0$. By our Hypothesis 1.1 there exists a number $N \in \mathbb{N}^*$ such that the operator $\mathring K = U_0 K U_0^{-1}$ has the following expression:

$$\mathring K = \sum_{a=1}^N \mathring M_{\lambda_a}\mathring M_{\pi_a} \equiv \sum_{a=1}^N \mathring K_a \equiv \mathring M_k \tag{2.33}$$

where:

$$k(\tau) := \sum_{a=1}^N \lambda_a(\tau)\,\pi_a(\tau). \tag{2.34}$$

We shall sometimes use the notations: $P_a := U_0^{-1}\mathring M_{\pi_a}U_0$, $\Lambda_a := U_0^{-1}\mathring M_{\lambda_a}U_0$. Let us observe that $\mathring P_0 := U_0 P_0 U_0^{-1}$ is an operator of multiplication with the analytic function:

$$p_0(\tau) := -\frac{1}{2\pi i}\oint_\Gamma d\zeta\,\big(\mathring H(\tau)-\zeta\big)^{-1} \tag{2.35}$$

for any contour $\Gamma$ separating $\sigma_0$ from the rest of the spectrum. We remark that $p_0(\tau)$ and $k(\tau)$ are analytic functions of $\tau$ even without the condition (b) of our Hypothesis 1.1. Moreover we have:

$$p_0(\tau) = \sum_{a=1}^N \pi_a(\tau), \qquad \sigma_0 = \bigcup_{a=1}^N \lambda_a(\mathbb{T}^n), \qquad \pi_a(\tau)\pi_b(\tau) = 0 \ \text{ for } a \ne b. \tag{2.36}$$

As in our previous paper [11], in order to define the conjugate operator we shall need the derivatives of the function $k(\tau)$ (in the uniform topology). We shall use the following notations:

$$l_a : \mathbb{T}^n \ni \tau \mapsto l_a(\tau) := (\nabla\lambda_a)(\tau) \in \mathbb{R}^n, \qquad \mathring L := \sum_{a=1}^N \mathring M_{l_a}\mathring M_{\pi_a} \equiv \sum_{a=1}^N \mathring L_a. \tag{2.37}$$

An important difficulty in extending our previous results [11] from the case of a scalar analytic function $\lambda : \mathbb{T}^n \to \mathbb{R}$ to an analytic operator-valued function $k : \mathbb{T}^n \to B(\mathcal K)$ of the form (2.34) comes from terms like $\pi_a(\nabla\pi_b)\pi_c$, appearing when computing commutators. Nevertheless, a simple computation shows that:

$$\pi_a(\nabla\pi_b)\pi_a = 0, \qquad \forall (a,b) \in \{1,\dots,N\}^2. \tag{2.38}$$

Thus in our developments a very important role will be played by the following linear projection:

$$B(\mathcal H) \ni S \mapsto P_K(S) := \sum_{a=1}^N P_a S P_a \in B(\mathcal H). \tag{2.39}$$

Proposition 2.11. Let $P_K$ be the projection defined above (2.39). Then:
1. $P_K(KS) = P_K(SK) = K P_K(S)$,
2. $P_K^2 = P_K$,
3. $P_K(S^*) = P_K(S)^*$,
4. $P_K(S P_K(T)) = P_K(S) P_K(T)$,
5. $P_K([K,T]) = [K, P_K(T)]$.
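The properties listed in Proposition 2.11 can be checked by hand on a finite-dimensional toy model. The sketch below is only an illustration under simplifying assumptions that are not in the text: the $P_a$ are two diagonal projections on $\mathbb{R}^3$, the $\Lambda_a$ are replaced by scalars, and the matrices `S`, `T` passed to the functions are arbitrary real matrices (so the adjoint is the transpose).

```python
def diag(*entries):
    # Diagonal matrix with the given entries, as a list of rows.
    n = len(entries)
    return [[float(entries[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


# Toy data (assumptions): two mutually orthogonal projections summing to the identity,
# and K = sum_a lambda_a P_a with scalar lambda_a.
PROJECTIONS = [diag(1, 1, 0), diag(0, 0, 1)]
EIGENVALUES = [2.0, -1.0]
K = add(scale(EIGENVALUES[0], PROJECTIONS[0]), scale(EIGENVALUES[1], PROJECTIONS[1]))


def pinch(S):
    # P_K(S) := sum_a P_a S P_a, as in (2.39).
    total = diag(0, 0, 0)
    for P in PROJECTIONS:
        total = add(total, matmul(matmul(P, S), P))
    return total


def commutator(A, B):
    return sub(matmul(A, B), matmul(B, A))
```

With these helpers, item 1 reads `close(pinch(matmul(K, S)), matmul(K, pinch(S)))`, item 2 reads `close(pinch(pinch(S)), pinch(S))`, and so on for the remaining items.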

We concentrate now on the study of the weight functions that we shall use. In order to control the exponential growth of the weight we are interested in, we shall need to use a cut-off procedure and work with a class of bounded weights for which we shall prove estimates that are uniform with respect to the cut-off.

Definition 2.12. Given some constant $\kappa > 0$ we define $\Phi_\kappa$ as the class of functions $\tilde\varphi : [1,\infty) \to \mathbb{R}_+$ that are of class $C^\infty$ and satisfy the properties:

$$|\tilde\varphi(t)| \le \kappa t; \qquad 0 < (\partial\tilde\varphi)(t) \le \kappa; \qquad |(\partial^p\tilde\varphi)(t)| \le \frac{\kappa}{t},\ \forall p \ge 2.$$

Notation 2.13.

$$\varphi(x) := \tilde\varphi(\langle x\rangle); \quad w(x) := e^{\varphi(x)}; \quad W := w(Q); \quad W_0 := w([Q]); \quad X(x) := (\nabla\varphi)(x) \equiv x\,\eta(x).$$

Proposition 2.14. We have the estimates:

$$|X(x)| \le \kappa; \qquad |\eta(x)| \le \frac{\kappa}{\langle x\rangle}; \qquad |(\nabla\eta)(x)| \le \frac{\kappa C}{\langle x\rangle^2}.$$

In the following we shall need to compare the weights $W$ and $W_0$.

Lemma 2.15. There exists a strictly positive constant $C$ such that we have:

$$C^{-1}w(x) \le w([x]) \le C\,w(x), \qquad \forall x \in \mathbb{R}^n.$$

Proof.

$$|\varphi(x)-\varphi([x])| = \Big|(x-[x])\cdot\int_0^1(\nabla\varphi)\big([x]+s(x-[x])\big)\,ds\Big| \le \kappa',$$

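$$e^{-\kappa'}w([x]) \le w(x) \le e^{\kappa'}w([x]).$$

Lemma 2.16. There is a constant $C$ such that $\forall a \in \{1,\dots,N\}$:

$$\big\|[P_a,W_0]W_0^{-1}\big\|_{B(\mathcal H)} \le C\kappa.$$

Proof. We study the element $[P_a,W_0]W_0^{-1}f$ in the representation $\tilde{\mathcal H}$. Denoting:

$$\theta_{\alpha,\beta}(s) := \int_0^s \beta\cdot X(\alpha-t\beta)\,dt,$$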

we have : |θα,β (s)| ≤ sκ |β| , $ ˜ [(πa )∗ , W0 ] W0−1 f˜ (α, ξ) = − (θα,β (1)) eθα,β (s) (π a ) (β) V (β) f (α, ξ) , β∈Zn 2 [(πa )∗ , w([Q])] w([Q])−1 f˜ ≤ κ2 C 2 πa 2,−κ f˜ . ˜ H

˜ H

We come now to the problem of defining a conjugate operator for $K$.

Proposition 2.17. $P_a$ (for $a = 1,\dots,N$), $K$ and $L$ leave $D(\langle Q\rangle)$ invariant.

Proof. We have:

$$U D(\langle Q\rangle) = U D(\langle [Q]\rangle) = D(\langle\nabla_\tau\rangle), \qquad \nabla_\tau\big(\mathring P_a\mathring f\big)(\tau,\xi) = (\nabla_\tau\pi_a)(\tau)\,\mathring f(\tau,\xi) + \pi_a(\tau)\,\big(\nabla_\tau\mathring f\big)(\tau,\xi)$$

and all the functions $\lambda_a(\tau)$, $l_a(\tau)$ and $\pi_a(\tau)$ are analytic on $\mathbb{T}^n$.

Definition 2.18. On $D(\langle Q\rangle)$ we define the following symmetric operator:

$$A_0 := \frac{1}{2}\{[Q]\cdot L + L\cdot[Q]\}.$$

Once we have fixed $E \in \mathcal{E}_0(H)$ (as in the statement of Theorem 1.3) let us choose a bounded open interval $I$ such that: $E \in I \subset \bar I \subset \mathcal{E}_0(H)$. We would like to use the operator $P_K(A_0)$ as a conjugate operator for $K$ on $I$.

Proposition 2.19. With the above notations we have:

$$i[K,P_K(A_0)] = \frac{1}{2\pi}L^2 \in B(\mathcal H),$$
$$E_K(I)\,i[K,P_K(A_0)]\,E_K(I) = \frac{1}{2\pi}E_K(I)\,L^2\,E_K(I) \ge \omega_I^2\,E_K(I)$$

where $E_K(I)$ is the spectral projection of $K$ corresponding to the interval $I$ and

$$\omega_I := \frac{1}{2\pi}\min_a\ \inf_{\tau\in\lambda_a^{-1}(I)}|(\nabla_\tau\lambda_a)(\tau)| > 0. \tag{2.40}$$

Proof. Using the properties of the projection $P_K$ we observe that:

$$i[K,P_K(A_0)] = \frac{i}{2}\sum_{a=1}^N\{P_a[K_a,[Q]]\cdot LP_a + P_aL\cdot[K_a,[Q]]P_a\} + \frac{i}{2}\sum_{a=1}^N\{P_a[Q]\cdot[K_a,L]P_a + P_a[K_a,L]\cdot[Q]P_a\};$$

$$LP_a = \sum_{b=1}^N U_0^{-1}\mathring M_{l_b}U_0P_bP_a = P_aU_0^{-1}\mathring M_{l_a}U_0P_a = P_aL;$$

$$P_a[K_a,[Q]]P_a = \frac{i}{2\pi}U^{-1}\mathring M_{\pi_a}\big[\mathring M_{\lambda_a},\nabla_\tau\big]\mathring M_{\pi_a}U + \frac{i}{2\pi}U^{-1}\mathring M_{\lambda_a}\mathring M_{\pi_a}\big[\mathring M_{\pi_a},\nabla_\tau\big]\mathring M_{\pi_a}U = -\frac{i}{2\pi}L_a;$$

$$[K_a,L] = [K_a,L_a] = 0$$

(in the last line both operators being multiplication with scalar functions in the subspace corresponding to $\pi_a(\tau)$).

In order to derive a Hardy type inequality with exponential weights one has to define a conjugate operator that is very intimately related to the commutator of the Hamiltonian with the weight function. Thus we need a more complicated conjugate operator for $K$ on the interval $I$; the definition we propose is motivated by the results of the Appendix. Let $X : \mathbb{R}^n \to \mathbb{R}^n$ be a vector field of class $C^\infty(\mathbb{R}^n)$ satisfying:

$$|X(x)| \le \kappa; \qquad |(\partial_x^\nu X)(x)| \le \frac{\kappa}{\langle x\rangle},\ |\nu| \ge 1.$$

We shall denote by the same letter $X$ its restriction to $\mathbb{Z}^n$. Later we shall take $X$ to be the field defined in Notation 2.13.

Notation 2.20.

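$$Z_\pm(\alpha,\beta) := \int_0^1 e^{\pm s\,\alpha\cdot X(\beta)}\,ds = \pm\frac{e^{\pm\alpha\cdot X(\beta)}-1}{\alpha\cdot X(\beta)}.$$

Let us observe that:

$$|Z_\pm(\alpha,\beta)| \le e^{\kappa|\alpha|}, \qquad \forall(\alpha,\beta) \in \mathbb{Z}^n\times\mathbb{Z}^n \tag{2.41}$$

so that it satisfies the assumptions on the function $G$ (with $m(\beta) \equiv 1$) made in Definition 2.6. For any $a = 1,\dots,N$ we define now:

$$L_X^+ := \sum_{a=1}^N P_a\,(l_a\diamondsuit Z_+)^\dagger\,P_a, \qquad L_X^- := \sum_{a=1}^N P_a\,(l_a\diamondsuit Z_-)\,P_a. \tag{2.42}$$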
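The closed form in Notation 2.20 and the bound (2.41) can be checked numerically. The following is a minimal sketch under stated assumptions: the vector field is the hypothetical choice $X(x) = \kappa x/\langle x\rangle$ with $\langle x\rangle = (1+|x|^2)^{1/2}$ (the gradient of $\kappa\langle x\rangle$), the value of `KAPPA` is arbitrary, and the integral over $s$ is approximated by a midpoint rule.

```python
import math

KAPPA = 0.3  # arbitrary illustrative value of kappa (assumption)


def bracket(x):
    # <x> := sqrt(1 + |x|^2); this reading of the bracket is an assumption.
    return math.sqrt(1.0 + sum(c * c for c in x))


def X(x):
    # Hypothetical vector field with |X(x)| <= KAPPA: the gradient of KAPPA * <x>.
    b = bracket(x)
    return tuple(KAPPA * c / b for c in x)


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def z_closed(sign, alpha, beta):
    # Closed form: sign * (exp(sign * a) - 1) / a with a = alpha . X(beta).
    a = dot(alpha, X(beta))
    if a == 0.0:
        return 1.0  # limiting value of the integral when alpha . X(beta) = 0
    return sign * math.expm1(sign * a) / a


def z_quadrature(sign, alpha, beta, n=2000):
    # Midpoint approximation of the integral over s in [0, 1] of exp(sign * s * a).
    a = dot(alpha, X(beta))
    h = 1.0 / n
    return h * sum(math.exp(sign * (i + 0.5) * h * a) for i in range(n))
```

Here `sign` is `+1` for $Z_+$ and `-1` for $Z_-$; comparing `z_closed` with `z_quadrature` on a few lattice points, and `z_closed` with `math.exp(KAPPA * |alpha|)`, illustrates both statements.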

Definition 2.21. On $D(\langle Q\rangle)$ we define the following symmetric operator:

$$A_X := \frac{1}{2}\{[Q]\cdot L_X^+ + L_X^-\cdot[Q]\}.$$

By Proposition 2.17, $P_K(A_X)$ is well defined and symmetric on $D(\langle Q\rangle)$.

Proposition 2.22. On $D(\langle Q\rangle)$ we have the following equality:

$$[K,P_K(A_X)] = [K,P_K(A_0)] + R_X$$

where for some constant $C$ (independent of $\kappa$): $\|R_X\|_{B(\mathcal H)} \le C\kappa$.

Remark 2.23. For a given interval $I$ as above, if $\kappa$ is small enough, the operator $P_K(A_X)$ is still conjugate to $K$ on $I$.

Proof. Let us observe that:

$$|Z_\pm(\alpha,\beta)-1| \le |\alpha\cdot X(\beta)|\,\Big|\int_0^1\!\!\int_0^1 e^{\pm st\,\alpha\cdot X(\beta)}\,ds\,dt\Big| \le \kappa|\alpha|e^{\kappa|\alpha|} \le \kappa e^{\kappa'|\alpha|}$$

for any $\kappa' \in (\kappa,2\pi\delta)$. Moreover:

$$i[K,P_K(A_X)] = \frac{i}{2}\sum_{a=1}^N\{P_a[K_a,[Q]]\cdot L_X^+P_a + P_aL_X^-\cdot[K_a,[Q]]P_a\} + \frac{i}{2}\sum_{a=1}^N\{P_a[Q]\cdot[K_a,L_X^+]P_a + P_a[K_a,L_X^-]\cdot[Q]P_a\},$$

$$P_a[Q]\cdot[K_a,L_X^+]P_a = P_a[Q]\cdot P_aU^{-1}\big[\mathring M_{\lambda_a},(l_a\diamondsuit Z_+)^\dagger\big]UP_a.$$

To compute this commutator we make use of Corollary 2.10. We define:

$$\Gamma_+(\gamma,\alpha,\beta) := Z_+(\gamma,\alpha-\beta-\gamma) - Z_+(\gamma,\alpha-\gamma) \tag{2.43}$$

and observe that it satisfies the estimate:

$$|\Gamma_+(\gamma,\alpha,\beta)| \le \int_0^1 ds\,\big|e^{s\gamma\cdot X(\alpha-\beta-\gamma)} - e^{s\gamma\cdot X(\alpha-\gamma)}\big| \le \int_0^1 ds\int_0^1 dt\;s\,|\gamma\,\beta\,(\nabla X)(\alpha-t\beta-\gamma)|\,e^{s\gamma\cdot X(\alpha-t\beta-\gamma)} \le \kappa\langle\alpha\rangle^{-1}e^{\kappa'(|\beta|+|\gamma|)}.$$

Thus a direct use of Corollary 2.10 gives us the expected result.

3 The Exponential Weighted Estimation

In this Section we prove Theorem 1.3 and Theorem 1.4 of the Introduction. Our strategy is to follow the procedure elaborated in [11]. Thus we shall make a cut-off on the weight in order to make it bounded and also a cut-off on the support of the test function. Our main technical result is an estimate for compactly supported test functions, with bounded weights associated to the class $\Phi_\kappa$, but with constants depending only on $\kappa$ (the upper bound on the derivative of the phase function from $\Phi_\kappa$). In dealing with this situation we shall treat separately a neighborhood of $\sigma_\infty$, for which we shall apply the well-known Agmon method [1], and a neighborhood of $\sigma_0$, for which we shall extend our method [11] from the case of scalar analytic functions to that of a function $k : \mathbb{T}^n \to B(\mathcal K)$ of the type (2.34). From now on we shall use Definition 2.12 and Notation 2.13 assuming that:

$$\tilde\varphi \in \Phi_\kappa \cap L^\infty([1,\infty)). \tag{3.44}$$

Our first step is to prove the following estimate.

Proposition 3.24. For $\kappa \in [0,2\pi\delta)$ and any $E \in \mathcal{E}_0(H)$ there exists a constant $C$ such that for any $f \in H^2_{\mathrm{comp}}(\mathbb{R}^n)$ one gets:

$$\|Wf\|_{D(H)} \le C\,\big\|\psi(\langle Q\rangle)^{-1}W(H-E)f\big\|$$

(the function $\psi$ is defined by $\psi(x) := \sqrt{\kappa\langle x\rangle^{-2} + 2\eta(x)}$).

The proof of this estimate is based on the following two Propositions dealing separately with $P_\infty\mathcal H$ and $P_0\mathcal H$.

Proposition 3.25. For $E < \inf\sigma_\infty$ there exist two positive constants $C_\kappa$ and $C$ (the second one being independent of $\kappa$) such that for any $f \in H^2_{\mathrm{comp}}(\mathbb{R}^n)$ the following estimate holds:

$$\|P_\infty Wf\|^2_{D(H)} - \kappa C\,\|P_0Wf\|^2 \le C_\kappa\,\|W(H-E)f\|^2.$$

Proof. Evidently, the fact that $f \in H^2_{\mathrm{comp}}(\mathbb{R}^n)$ implies that $Wf \in H^2_{\mathrm{comp}}(\mathbb{R}^n)$. Let $d := \mathrm{dist}(E,\sigma_\infty)$. Let us observe that $H_\infty = P_\infty H = HP_\infty$ so that by hypothesis our value of $E$ is beneath the spectrum of $H_\infty$ and we can follow [1]:

$$2\langle Wf,(H_\infty - EP_\infty)Wf\rangle \ge 2d\,\|P_\infty Wf\|^2, \tag{3.45}$$

$$d\,\|P_\infty Wf\|^2 \le \mathrm{Re}\langle P_\infty Wf, W(H-E)f\rangle + \mathrm{Re}\langle P_\infty Wf,[H,W]f\rangle.$$

For the first term on the right-hand side we use the Schwarz inequality and for any $\theta > 0$ we write:

$$2\,\mathrm{Re}\langle P_\infty Wf,W(H-E)f\rangle \le \theta\,\|P_\infty Wf\|^2 + \theta^{-1}\|P_\infty W(H-E)f\|^2.$$

For the second term we observe that on $D(H)$:

$$[H,W]W^{-1} = (-i\nabla\varphi)\cdot D + D\cdot(-i\nabla\varphi) + (\nabla\varphi)^2,$$

thus:

$$\mathrm{Re}\langle P_\infty Wf,[H,W]W^{-1}P_\infty Wf\rangle = \|(\nabla\varphi)P_\infty Wf\|^2.$$

Using once again the Schwarz inequality we obtain that for $\theta_0 > 0$:

$$2\,\mathrm{Re}\langle P_\infty Wf,[H,W]W^{-1}P_0Wf\rangle \le \theta_0\,\|P_\infty Wf\|^2 + \theta_0^{-1}\big\|[H,W]W^{-1}P_0Wf\big\|^2. \tag{3.46}$$

In order to estimate the second term above let us observe that for $\mathrm{Im}\,z \ne 0$:

$$\|DP_0g\|^2 = \big\|D(H+z)^{-1}(H+z)P_0g\big\|^2 \le C^2\|P_0g\|^2,$$

due to the fact that $D(H+z)^{-1}$ is a bounded operator and $P_0$ projects on a bounded spectral region of $H$. Moreover by Hypothesis 1.1 we have $|\nabla\varphi| \le \kappa$ so that choosing $\theta_0 = 2C^2\kappa$ we get:

$$\theta_0^{-1}\big\|[H,W]W^{-1}P_0Wf\big\|^2 \le \kappa\Big(1+\frac{\kappa^2}{2C^2}\Big)\|P_0Wf\|^2.$$

Choosing finally $\theta < \kappa^2$ we get:

$$(d-\kappa C)\,\|P_\infty Wf\|^2 - 2\kappa\,\|P_0Wf\|^2 \le d^{-1}\|P_\infty W(H-E)f\|^2.$$

Let us now obtain the graph norm of $H$ on the left hand side:

$$\|g\|^2_{D(H)} = \|g\|^2 + \|Hg\|^2,$$

$$\|P_\infty Wf\|^2_{D(H)} \le (1+2E^2)\,\|P_\infty Wf\|^2 + 2\,\|(H-E)Wf\|^2 \le (1+2E^2)\,\|P_\infty Wf\|^2 + 2\,\|W(H-E)f\|^2 + 2\,\big\|[H,W]W^{-1}(H+z)^{-1}\big\|^2\,\|Wf\|^2_{D(H)},$$

$$\big\|[H,W]W^{-1}(H+z)^{-1}\big\|^2 \le \kappa^2C^2, \tag{3.47}$$

$$\|Wf\|^2_{D(H)} = \|P_\infty Wf\|^2_{D(H)} + \|P_0Wf\|^2_{D(H)}.$$

Putting all these together we get the result.

For the neighborhood of $\sigma_0$ we shall obtain an estimate for the operator $K$ with "weight operator" $P_K(W_0)$.

Proposition 3.26. Let $E \in I \subset \bar I \subset \mathcal{E}_0(H)$, $\eta$ be defined by Notation 2.13 and $\psi(x) := \sqrt{\kappa\langle x\rangle^{-2} + 2\eta(x)}$. Then there exists a constant $C_0$ such that for any $f \in H^2_{\mathrm{comp}}(\mathbb{R}^n)$ one has:

$$\|P_K(W_0)f\|^2 \le C_0\,\big\|\psi^{-1}([Q])P_K(W_0)(K-E)f\big\|^2.$$

Proof. Let us first remark that: $P_K(W_0)(K-E) = P_K(W_0)(H-E)$. As in our previous paper [11] we shall consider the following expression:

$$2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,(H-E)P_K(W_0)f\rangle = -i\,\langle P_K(W_0)f,[P_K(A_X),H]P_K(W_0)f\rangle. \tag{3.48}$$

But (see Proposition 2.22):

$$[P_K(A_X),H] = [P_K(A_X),K] = [P_K(A_0),K] - R_X, \qquad \|R_X\|_{B(\mathcal H)} \le C\kappa. \tag{3.49}$$

Using now Proposition 2.19 we can write:

$$[P_K(A_0),K] = E_H(I)[P_K(A_0),K]E_H(I) + (P_0-E_H(I))[P_K(A_0),K]E_H(I) + P_0[P_K(A_0),K](P_0-E_H(I))$$

and $[P_K(A_0),K] = \frac{i}{2\pi}L^2 \in B(\mathcal H)$. We have the inequality $\|E_H(I)g\| \le \|P_0g\|$, so that by using the Schwarz inequality we obtain:

$$\big|\langle P_K(W_0)f,(P_0-E_H(I))[P_K(A_0),K]E_H(I)P_K(W_0)f\rangle + \langle P_K(W_0)f,P_0[P_K(A_0),K](P_0-E_H(I))P_K(W_0)f\rangle\big| \le \frac{1}{2\pi}\|L\|^2\Big\{\theta\,\|(P_0-E_H(I))P_K(W_0)f\|^2 + \theta^{-1}\|P_K(W_0)f\|^2\Big\}.$$

Let us observe that:

$$(P_0-E_H(I)) = (P_0-E_H(I))(K-E)^{-1}(K-E),$$

$$\|(P_0-E_H(I))P_K(W_0)f\| \le C_E\,\{\|P_K(W_0)(K-E)f\| + \|[K,P_K(W_0)]f\|\}.$$

For the last term on the right hand side we use Proposition 4.31 from the Appendix and the Remark following it. This gives us the following estimate:

$$\|(P_0-E_H(I))P_K(W_0)f\| \le C_E\,\{\|P_K(W_0)(K-E)f\| + \kappa C\,\|P_K(W_0)f\|\}.$$

If we choose $\theta > \kappa^{-1}$, we obtain that the left hand side is bounded by:

$$\|L\|^2C_E\,\Big\{\|P_K(W_0)(K-E)f\|^2 + (\kappa C_E)^2\,\|P_K(W_0)f\|^2\Big\}.$$

Using the Mourre estimate (Proposition 2.19 and Proposition 2.22), we obtain:

$$2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,(H-E)P_K(W_0)f\rangle \ge \omega_I^2\,\|P_K(W_0)f\|^2 - C_E\|L\|^2\Big\{\|P_K(W_0)(K-E)f\|^2 + \kappa^2\,\|P_K(W_0)f\|^2\Big\} \tag{3.50}$$

(for the first term of the second line we used the same procedure as above). For the first term in (3.50), we observe that $HP_K(W_0) = KP_K(W_0)$ and commute $K$ with $P_K(W_0)$. The Schwarz inequality gives:

$$2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,P_K(W_0)(K-E)f\rangle \le \|\psi([Q])P_K(A_X)P_K(W_0)f\|^2 + \big\|\psi([Q])^{-1}P_K(W_0)(K-E)f\big\|^2. \tag{3.51}$$

For the term with the commutator we use Conclusion 4.35 of the Appendix:

$$2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,(H-E)P_K(W_0)f\rangle - \Big\|\sqrt{\psi([Q])^2-2\eta([Q])}\,P_K(A_X)P_K(W_0)f\Big\|^2 \le \big\|\psi([Q])^{-1}P_K(W_0)(K-E)f\big\|^2 + \kappa C\,\|P_K(W_0)f\|^2. \tag{3.52}$$

If we now choose $\psi(x)$ as in the statement of the theorem, we obtain the inequality

$$\Big\|\sqrt{\psi([Q])^2-2\eta([Q])}\,P_K(A_X)P_K(W_0)f\Big\|^2 = \kappa\,\big\|\langle[Q]\rangle^{-1}P_K(A_X)P_K(W_0)f\big\|^2 \le \kappa C\,\|P_K(W_0)f\|^2.$$

From this and (3.50) we get the expected result for $\kappa$ small enough.

Proof of Proposition 3.24. For $f \in H^2_{\mathrm{comp}}(\mathbb{R}^n)$ we get from the previous two propositions:

$$\|P_K(W_0)f\|^2 \le C_0\,\big\|\psi^{-1}([Q])P_K(W_0)(H-E)f\big\|^2, \qquad \|P_\infty Wf\|^2_{D(H)} - \kappa C\,\|P_0Wf\|^2 \le C_\kappa\,\|W(H-E)f\|^2. \tag{3.53}$$

We shall begin with the first inequality and obtain an estimate for $P_0Wf$.

$$P_K(W_0)f = P_0W_0f + \sum_{a=1}^N P_a[W_0,P_a]f.$$

Using Lemma 2.16 for the terms of the sum on the right hand side we obtain:

$$\|P_0W_0f\| - \kappa NC\,\|W_0f\| \le \|P_K(W_0)f\|. \tag{3.54}$$

By Lemma 2.15 from Section 2 we have:

$$\|P_0Wf\| \le \|WP_0f\| + \|[P_0,W]f\| \le C_1\|W_0P_0f\| + \|[P_0,W]f\| \le C_1\|P_0W_0f\| + C_1\|[P_0,W_0]f\| + \|[P_0,W]f\|.$$

Let us now compute the commutator $[P_0,W]$.

−1 + (α + x) − W + (x) f(α + x) , U π *a ([x] − α) W ([Pa , W ] f ) (x) = α∈Zn s−1 ϕ (x + s (α − [x])) − ϕ (α + x) = (α − [x]) · 0 dtX (α + x + t (α − [x])) . Putting all these together we get the estimations : [P0 , W ] f ≤ κC W f , [P0 , W0 ] f ≤ κC W0 f ≤ κC W f

(3.55)

for some constants $C$, $C'$ independent of $\kappa$. Our first estimate in (3.53) implies:

$$\|P_0Wf\| \le C_\kappa\,\big\|\psi([Q])^{-1}P_K(W_0)(H-E)f\big\|. \tag{3.56}$$

Now we have to repeat the arguments above in order to treat the right hand side and eliminate the projection $P_K$. We shall use the following notations:

$$\tilde W_0 := \psi([Q])^{-1}W_0; \qquad \tilde W := \psi(Q)^{-1}W.$$

Then we have:

−1 ψ ([Q]) PK (W0 ) (H − E)f ≤ % & N + +0 , Pa (H − E)f ≤ C Pa W0 (H − E)f + Pa W + a=1 %

& −1 + ψ ([Q]) , Pa W0 Pa (H − E)f , & % % & −1 −1 −1 ψ ([Q]) , Pa ψ ([Q]) = U π *a (α) ψ ([Q]) , V (α) ψ ([Q]) = α∈Zn −2 −1 =− U π *a (α) dsα · ψ ∇ψ ([Q] − sα) ψ ([Q] − α) V (α) . α∈Zn

(3.57)

Let us recall the definition of the function $\psi$ and observe that:

$$\big|\big(\psi^{-2}\nabla\psi\big)(\beta-s\alpha)\,\psi(\beta-\alpha)\big| \le \kappa C\langle\alpha\rangle,$$

so that by using Proposition 2.7 we get the estimate:

$$\big\|[\psi([Q])^{-1},P_a]\,\psi([Q])\big\| \le \kappa C. \tag{3.58}$$

By similar arguments we obtain the bound $\big\|[\tilde W_0,P_a]\tilde W_0^{-1}\big\| \le \kappa C$. Putting all these estimates together we obtain the following inequality:

$$\|P_0Wf\| \le C_\kappa\,\big\|\tilde W(H-E)f\big\| \tag{3.59}$$

and combining with the inequality (3.53) we finally obtain:

$$\|Wf\|_{D(H)} \le C\,\big\|\tilde W(H-E)f\big\|. \tag{3.60}$$

In view of our Theorem 1.4 we shall now obtain a similar "local estimate" for the perturbed Hamiltonian $H_I = H + V_I$, where $V_I$ satisfies the conditions of Theorem 1.4. We have for $f$ supported outside the ball of radius $R$:

$$\big\|\tilde WV_If\big\| \le \big\|\langle Q\rangle\chi_RV_I(H+i)^{-1}\big\|\,\|Wf\|_{D(H)} \le \theta C\,\big\|\tilde W(H-E)f\big\|$$

for any chosen $\theta > 0$, once we take $R$ large enough. Thus:

$$\big\|\tilde W(H_I-E)f\big\| \ge (1-\theta C)\,\big\|\tilde W(H-E)f\big\| \ge \frac{1-\theta C}{C}\,\|Wf\|_{D(H)}. \tag{3.61}$$

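We now present the cut-off procedure that allows us to obtain our main result (Theorem 1.3) from Proposition 3.24. We fix $\kappa > 0$ and the phase function $\tilde\varphi_0(t) = \kappa t$ for $t \in [1,\infty)$. Let $f$ belong to:

$$\mathcal M := \Big\{f \in D(H)\ \Big|\ \sqrt{\langle Q\rangle}\,e^{\varphi_0(\langle Q\rangle)}(H-E)f \in L^2(\mathbb{R}^n)\Big\}. \tag{3.62}$$

We shall approximate the function $f$ with functions with compact support, but in order to control the limit we shall need to work first with bounded phase functions $\tilde\varphi \in \Phi_\kappa$ that converge to $\tilde\varphi_0$. Let us fix $\chi \in C_0^\infty(\mathbb{R})$ such that:

$$0 \le \chi(t) \le 1, \qquad \chi(t) = 0 \ \text{ for } |t| \ge 1, \qquad \chi(t) = 1 \ \text{ for } |t| \le 1/2. \tag{3.63}$$

For $f \in \mathcal M$, $x \in \mathbb{R}^n$ and $\theta \in (0,1]$ we set:

$$\chi_\theta(x) := \chi(\theta\langle x\rangle); \qquad f_\theta := \chi_\theta f. \tag{3.64}$$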

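Let:

$$j(t) := \begin{cases}\Big(\displaystyle\int_{\mathbb{R}} e^{-\frac{1}{1-t^2}}\,dt\Big)^{-1} e^{-\frac{1}{1-t^2}}, & \text{for } |t| < 1,\\[2mm] 0, & \text{for } |t| \ge 1.\end{cases} \tag{3.65}$$

For $N \in \mathbb{N}$ let:

$$j_N(t) := \frac{1}{N}\,j(t/N), \qquad \tilde\eta_N(t) := \begin{cases}\kappa, & \text{for } t \le 2N,\\ 0, & \text{for } t > 2N,\end{cases} \tag{3.66}$$

$$\eta_N := j_N * \tilde\eta_N, \qquad \varphi_N(t) := \int_0^t \eta_N(s)\,ds,\ \ \forall t \ge 0. \tag{3.67}$$

Lemma 3.27. The following relations are true:
1. $j \in C_0^\infty(\mathbb{R})$, $0 \le j(t)$, $\int_{\mathbb{R}} j(t)\,dt = 1$, $j(-x) = j(x)$,
2. $\int_{\mathbb{R}} j_N(t)\,dt = 1$, $j_N(t) = 0$ for $|t| \ge N$,
3. $\eta_N \in C^\infty(\mathbb{R})$, $\eta_N(t) \le \kappa$, $|t(\partial\eta_N)(t)| \le C_1\kappa$, $|(\partial^k\eta_N)(t)| \le C_k\kappa$ $\forall t \in \mathbb{R}$, for $k \in \mathbb{N}$ and with $C_k$ independent of $\kappa$,
4. $\varphi_N(t) \le \varphi_0(t)$, $\lim_{N\to\infty}\varphi_N(t) = \varphi_0(t)$, $\forall t \in \mathbb{R}$.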

Proof. We shall prove only those estimates that are not completely obvious. First we observe that $0 \le \eta_N(t) \le \kappa$ and that for $t \le N$ we get $\eta_N(t) = \kappa$ and for $t \ge 3N$ we get $\eta_N(t) = 0$. For the first derivative of $\eta_N(t)$ we see that:

$$t(\partial\eta_N)(t) = t\kappa\int_{t-2N}^{N}(\partial j_N)(\tau)\,d\tau = -\kappa\,\frac{t}{N}\,j(t/N-2); \tag{3.68}$$

but $j(\tau-2) \ne 0$ implies that $1 < \tau < 3$ so that $|t(\partial\eta_N)(t)| \le 3c\kappa$. For the higher derivatives we observe that:

$$(\partial^k\eta_N)(t) = -\kappa\,(\partial^{k-1}j_N)(t-2N) = -\kappa\,\frac{1}{N^k}\,(\partial^{k-1}j)(t/N-2), \tag{3.69}$$

so that $|(\partial^k\eta_N)(t)| \le C_k\kappa$ for any $k > 1$, with $C_k$ independent of $\kappa$.
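The construction (3.65)-(3.67) can be explored numerically. The sketch below is an illustration only: it uses the identity $\eta_N(t) = \kappa\int_{t/N-2}^{1} j(\tau)\,d\tau$, which follows from the definitions by a change of variables, and approximates all integrals by a midpoint rule; the function names and the quadrature resolution are assumptions made for the example.

```python
import math


def bump(t):
    # Unnormalised mollifier exp(-1 / (1 - t^2)) on (-1, 1), zero elsewhere.
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def integrate(f, a, b, n=1000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


NORM = integrate(bump, -1.0, 1.0)


def j(t):
    # Normalised mollifier of (3.65), up to quadrature error in NORM.
    return bump(t) / NORM


def eta(t, N, kappa):
    # eta_N(t) = (j_N * eta~_N)(t) = kappa * integral of j over [t/N - 2, 1].
    lower = max(t / N - 2.0, -1.0)
    if lower >= 1.0:
        return 0.0
    return kappa * integrate(j, lower, 1.0)


def phi(t, N, kappa, n=200):
    # phi_N(t) = integral of eta_N over [0, t], as in (3.67).
    return integrate(lambda s: eta(s, N, kappa), 0.0, t, n)
```

For instance, with `N = 5` and `kappa = 0.3`, `eta(t, 5, 0.3)` stays at `kappa` up to `t = 5`, passes through `kappa / 2` at `t = 10` by the symmetry of `j`, and vanishes from `t = 15` on, while `phi(t, 5, 0.3)` follows `kappa * t` for small `t` and then levels off.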

Corollary 3.28. For any $N \in \mathbb{N}$ the phase function $\varphi_N$ defined by (3.67) belongs to the class $\Phi_{\kappa'}$ for some $\kappa' > \kappa$.

We now fix the value of $\kappa$ small enough (as in the statement of Proposition 3.24), $f \in \mathcal M$, $\theta \in (0,1]$ and $N \in \mathbb{N}$ large enough so that we can apply Proposition 3.24 with the phase function $\varphi_N$ for the function $f_\theta$ (with compact support). Thus:

$$\|e^{\varphi_N}f_\theta\|^2_{D(H)} \le C_\kappa\,\big\|\psi_N(Q)^{-1}e^{\varphi_N}(H-E)f_\theta\big\|^2, \tag{3.70}$$

where $\psi_N$ is given by the same formula as in Proposition 3.26 but with $\varphi$ replaced by $\varphi_N$. We remove the cut-off in $f$ by letting $\theta \to 0$ and we use Fatou's Lemma on the left hand side of the inequality (3.70) and the Dominated Convergence Theorem on the right hand side (the boundedness of $e^{\varphi_N}$ is crucial at this step). This leads us to an estimate for any $f \in \mathcal M$ with phase function $\varphi_N$. A similar procedure allows us to control the limit $N \to \infty$ and to finish the proof of Theorem 1.3. Let us consider the limit of the right hand side of (3.70) when $\theta \to 0$.

$$\psi_N^{-1}e^{\varphi_N}(H-E)f_\theta = \chi_\theta\,\psi_N^{-1}e^{\varphi_N}(H-E)f + \psi_N^{-1}e^{\varphi_N}[H,\chi_\theta]f. \tag{3.71}$$

When $\theta \to 0$ the first term converges in $L^2$-norm to $\psi_N(Q)^{-1}e^{\varphi_N}(H-E)f$. Concerning the second term, we observe that for any $N \in \mathbb{N}$ we can find a finite constant $C_N$ (diverging with $N$) such that: $\big\|e^{\varphi_N}\psi_N(Q)^{-1}\langle Q\rangle^{-1}\big\| \le C_N$. For a fixed $N$ we study the family $\{\langle Q\rangle[H,\chi_\theta(Q)]f\}_{\theta>0}$ of $L^2$-functions. We denote:

$$\zeta_\theta(x) := -2i\theta\,x\,\chi'(\theta\langle x\rangle), \qquad \tilde\zeta_\theta := -i\nabla\Big(\frac{1}{2\langle x\rangle}\,\zeta_\theta\Big) \tag{3.72}$$

and observe that we can write:

$$\langle Q\rangle[H,\chi_\theta(Q)]f = \zeta_\theta(Q)Df + \tilde\zeta_\theta(Q)f. \tag{3.73}$$

We shall now estimate the norm $\|\langle Q\rangle[H,\chi_\theta(Q)]f\|$. If we take into account that $\chi'(t)$ has support in the set $\{1/2 \le t \le 1\}$ and if we denote by $h_\theta$ the characteristic function of the set $\{\tau \in \mathbb{R}_+ \mid \frac{1}{2\theta} \le \tau \le \frac{1}{\theta}\}$ (that evidently converges pointwise to 0 for $\theta \to 0$) we finally get that:

$$|\zeta_\theta(x)| \le C\,h_\theta(\langle x\rangle); \qquad |\tilde\zeta_\theta(x)| \le C\theta. \tag{3.74}$$

We use the fact that for $f \in D(H)$ the vector $Df$ belongs to $L^2(\mathbb{R}^n)$ in order to show that the second term in (3.71) converges to zero for $\theta \to 0$. We have thus proved that:

$$\lim_{\theta\to0}\|\langle Q\rangle[H,\chi_\theta(Q)]f\| = 0. \tag{3.75}$$

In conclusion, for a fixed $N \in \mathbb{N}$, the cut-off in $f$ on the right hand side of (3.70) can be removed. For the left hand side we observe that for any $y \in \mathbb{R}^n$:

$$\lim_{\theta\to0}e^{\varphi_N(y)}f_\theta(y) = e^{\varphi_N(y)}f(y). \tag{3.76}$$

Let us point out that in the left hand side of (3.70) we have to control the behavior of the graph norm $\|e^{\varphi_N}\chi_\theta f\|_{D(H)}$ when $\theta \to 0$. For that we commute $H$ with $\chi_\theta$ and use once again the computation done above (where now the factor $\langle x\rangle$ in the definition of $\zeta_\theta$ is absent so that the convergence to zero with $\theta$ follows immediately). We still have to study the behavior of the inequality (3.70) with $f_\theta$ replaced by $f$, when $N \to \infty$. For this we prove the following lemma.

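Lemma 3.29. There exists a constant $C$ such that for any $N \in \mathbb{N}$ we have:

$$\frac{e^{\varphi_N(\langle x\rangle)}}{\psi_N(x)} \le C\sqrt{\langle x\rangle}\,e^{\kappa\langle x\rangle}.$$

Proof. For $N \in \mathbb{N}$ we define the function:

$$g_N(t) := \frac{e^{\tilde\varphi_N(t)}}{\sqrt{\kappa t^{-2} + 2t^{-1}\tilde\varphi_N'(t)}} = \frac{t\,e^{\tilde\varphi_N(t)}}{\sqrt{\kappa + 2t\,\tilde\varphi_N'(t)}}. \tag{3.77}$$

We have:

$$\tilde\varphi_N'(t) = \eta_N(t) = \frac{\kappa}{N}\int_{-\infty}^{2N} j((t-s)/N)\,ds = \kappa\int_{\frac{t}{N}-2}^{1} j(\tau)\,d\tau. \tag{3.78}$$

Since $\tilde\varphi_N'$ is decreasing and $\tilde\varphi_N'(2N) = \kappa/2$, one has: $\tilde\varphi_N'(t) \ge \kappa/2$ for $t \le 2N$ and $\tilde\varphi_N'(t) \le \kappa/2$ for $t \ge 2N$. Hence, for $t \le 2N$ we have $\{\kappa + 2t\,\tilde\varphi_N'(t)\}^{1/2} \ge (\kappa t)^{1/2}$, which implies $g_N(t) \le (t/\kappa)^{1/2}e^{\tilde\varphi_0(t)}$. For $t \ge 2N$ we get $\tilde\varphi_N(t) \le \kappa t/2$, which gives $g_N(t) \le \omega\sqrt{t}\,e^{\tilde\varphi_0(t)}$, with $\omega := \sup_{t\ge1}\sqrt{t}\,e^{-\kappa t/2}$.

Using this result we see that the right hand side of (3.70) (with $f_\theta$ replaced by $f$) is uniformly bounded by:

$$\big\|\psi_N(Q)^{-1}e^{\tilde\varphi_N}(H-E)f\big\|^2 \le C\,\Big\|\sqrt{\langle Q\rangle}\,e^{\kappa\langle Q\rangle}(H-E)f\Big\|^2, \qquad \forall N \in \mathbb{N} \tag{3.79}$$

with $C$ independent of $N$, the right hand side being finite due to the hypothesis $f \in \mathcal M$. But evidently:

$$\psi_N(x)^{-1}e^{\tilde\varphi_N(x)} \xrightarrow[N\to\infty]{} \sqrt{\langle x\rangle}\,e^{\kappa\langle x\rangle}, \tag{3.80}$$

so that we can use the Dominated Convergence Theorem. For the first term on the left hand side one can immediately use Fatou's Lemma in a way similar to the argument we gave for the $\theta \to 0$ limit. Thus we obtain the expected inequality:

$$\big\|e^{\kappa\langle Q\rangle}f\big\|_{D(H)} \le C_\kappa\,\Big\|\sqrt{\langle Q\rangle}\,e^{\kappa\langle Q\rangle}(H-E)f\Big\| \tag{3.81}$$

and this finishes the proof of Theorem 1.3.

Proof of Theorem 1.4. First let us consider the set $\mathcal M_I$ defined as in (3.62) but with $H_I$ replacing $H$. Let us fix some $f \in \mathcal M_I$ with support far enough from the origin (so that after a cut-off to a compact support we can apply the estimate in (3.61)). Then we can repeat the above cut-off procedure. Due to the fact that $V_I$ commutes with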

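all the cut-off functions, it follows that the whole procedure of removing cut-offs extends to the perturbed case without any modification. We thus obtain (see (3.61)):

$$\big\|e^{\kappa\langle Q\rangle}f\big\|_{D(H)} \le C_\kappa\,\Big\|\sqrt{\langle Q\rangle}\,e^{\kappa\langle Q\rangle}(H_I-E)f\Big\|.$$

Suppose $H_I$ has an eigenvalue $E$ belonging to $\mathcal{E}_0(H)$ with eigenfunction $g$. Denoting by $\chi$ the smoothed characteristic function of a ball of sufficiently large radius $R$ in $\mathbb{R}^n$, by $\chi^\perp = 1-\chi$ and by $f = \chi^\perp g$ we see that:

$$\big\|e^{\kappa\langle Q\rangle}g\big\| \le C\,\Big\|\sqrt{\langle Q\rangle}\,e^{\kappa\langle Q\rangle}(H_I-E)f\Big\| + \big\|e^{\kappa\langle Q\rangle}\chi g\big\|,$$

$$(H_I-E)f = (H_I-E)g - (H_I-E)\chi g = -(H_I-E)\chi g$$

so that:

$$\big\|e^{\kappa\langle Q\rangle}g\big\| \le C\,\Big\|\sqrt{\langle Q\rangle}\,e^{\kappa\langle Q\rangle}(H_I-E)\chi g\Big\| + \big\|e^{\kappa\langle Q\rangle}\chi g\big\| < \infty,$$

due to the fact that $H_I$ is a differential operator.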

4 Appendix

In this appendix we shall study the commutator $[K,P_K(W_0)]$ and show that it can be written in a special form that allows one to compare it with the conjugate operator $P_K(A_X)$. We have:

$$[K,P_K(W_0)] = \sum_{a=1}^N P_a[\Lambda_aP_a,W_0]P_a = \sum_{a=1}^N P_a[\Lambda_a,W_0]P_a. \tag{4.82}$$

Let us observe that:

$$P_a[\Lambda_a,W_0]P_a = P_a[\Lambda_a,W_0]W_0^{-1}P_aW_0P_a + P_a\big[P_a,[\Lambda_a,W_0]W_0^{-1}\big]W_0P_a. \tag{4.83}$$

Lemma 4.30. The operator $[\Lambda_a,W_0]W_0^{-1}$ defines a bounded operator in $\mathcal H$ and

$$\big\|[\Lambda_a,W_0]W_0^{-1}\big\|_{B(\mathcal H)} \le \kappa C.$$

Proof. We have:

$$[\Lambda_a,W_0]W_0^{-1} = U^{-1}(\lambda_a\diamondsuit F)\,U,$$

where we denoted:

$$F(\alpha,\beta) := -\int_0^1 ds\;\alpha\cdot\int_0^1 dt\,X(\beta-t\alpha)\;\exp\Big\{s\,\alpha\cdot\int_0^1 dt\,X(\beta-t\alpha)\Big\}. \tag{4.84}$$

We observe that we have the estimate:

$$|F(\alpha,\beta)| \le \kappa\langle\alpha\rangle e^{\kappa|\alpha|} \le \kappa e^{\kappa'|\alpha|}$$

for any $\kappa' \in (\kappa,2\pi\delta)$. Using now Proposition 2.7 we get the expected result.

An important difficulty in our technical developments comes from the fact that we have to consider the product of the operator $[\Lambda_a,W_0]W_0^{-1}$ with some unbounded operator, and the above lemma does not give sufficient information to control this product. More precisely, our method of obtaining Hardy type inequalities from a Mourre estimate relies heavily on the study of the following object:

$$2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,[K,P_K(W_0)]f\rangle. \tag{4.85}$$

The next proposition gives a technical result concerning the structure of the commutator of $K$ with $P_K(W_0)$, that will allow us to treat the expression (4.85).

Proposition 4.31. The following relation holds:

$$[K,P_K(W_0)] = -\frac{i}{2\pi}\,\eta([Q])\,P_K(A_X)P_K(W_0) + T\,P_K(W_0) + R_0\,P_K(W_0) + \sum_{a=1}^N R_aW_0P_a$$

where $T = T^* \in B(\mathcal H)$, $R_a \in B(\mathcal H;\mathcal H_m)$ for $m(x) := \langle x\rangle$, $a = 0,\dots,N$ and:

$$\|T\|_{B(\mathcal H)} + \max_{a=0,1,\dots,N}\|R_a\|_{B(\mathcal H;\mathcal H_m)} \le \kappa C,$$

for some constant $C$ independent of $\kappa$.

Proof. We consider once again (4.82) and (4.83) and we observe that:

$$\big[P_a,[\Lambda_a,W_0]W_0^{-1}\big] = U^{-1}\big[(\pi_a)_*,(\lambda_a\diamondsuit F)\big]U =: U^{-1}\big((\pi_a\triangle\lambda_a)\diamondsuit\Psi\big)U$$

where:

$$\Psi(\alpha,\beta,\gamma) := -\int_0^1 ds\;\beta\cdot\nabla^{(2)}F(\alpha,\gamma-s\beta). \tag{4.86}$$

We denote:

$$Y(\alpha,\beta) := \alpha\cdot\int_0^1 dt\,X(\beta-t\alpha), \qquad Y_1(\alpha,\beta) := \alpha\cdot\int_0^1 dt\,(\partial X)(\beta-t\alpha), \tag{4.87}$$

so that:

$$\nabla^{(2)}F(\alpha,\beta) = -\int_0^1 ds\,Y_1(\alpha,\beta)\exp\{sY(\alpha,\beta)\} - \int_0^1 ds\,Y(\alpha,\beta)\,\big(sY_1(\alpha,\beta)\big)\exp\{sY(\alpha,\beta)\},$$

hence we have the estimate (with $\kappa' > \kappa$):

$$\big|\nabla^{(2)}F(\alpha,\beta)\big| \le \kappa\int_0^1 dt\,\frac{|\alpha|}{\langle\beta-t\alpha\rangle}\,e^{\kappa|\alpha|} \le \frac{\kappa}{\langle\beta\rangle}\,e^{\kappa'|\alpha|}. \tag{4.88}$$

In order to treat the first term in (4.83) we have to make a more detailed analysis of the factor $[\Lambda_a,W_0]W_0^{-1}$ and separate it into its hermitian and antihermitian parts:

$$[\Lambda_a,W_0]W_0^{-1} = \frac{1}{2}\big\{2\Lambda_a - W_0\Lambda_aW_0^{-1} - W_0^{-1}\Lambda_aW_0\big\} + \frac{1}{2}\big\{W_0^{-1}\Lambda_aW_0 - W_0\Lambda_aW_0^{-1}\big\},$$
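The splitting of $[\Lambda_a,W_0]W_0^{-1} = \Lambda_a - W_0\Lambda_aW_0^{-1}$ into a hermitian and an antihermitian part can be verified on a toy example. The sketch below assumes, purely for illustration, a real symmetric $2\times2$ matrix in place of $\Lambda_a$ and a positive diagonal matrix in place of $W_0$ (so that the adjoint is the transpose); none of these matrices come from the text.

```python
import math


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def lincomb(coeffs, mats):
    # Entrywise linear combination sum_k coeffs[k] * mats[k].
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)] for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


# Toy stand-ins (assumptions): LAM is symmetric, W is a positive diagonal weight.
LAM = [[1.0, 2.0], [2.0, -3.0]]
W = [[math.exp(0.2), 0.0], [0.0, math.exp(-0.5)]]
W_INV = [[math.exp(-0.2), 0.0], [0.0, math.exp(0.5)]]

W_LAM_WINV = matmul(matmul(W, LAM), W_INV)   # W Lambda W^{-1}
WINV_LAM_W = matmul(matmul(W_INV, LAM), W)   # W^{-1} Lambda W

full = lincomb([1.0, -1.0], [LAM, W_LAM_WINV])                        # [Lambda, W] W^{-1}
hermitian = lincomb([1.0, -0.5, -0.5], [LAM, W_LAM_WINV, WINV_LAM_W])  # (1/2){2L - WLW^-1 - W^-1LW}
antihermitian = lincomb([0.5, -0.5], [WINV_LAM_W, W_LAM_WINV])         # (1/2){W^-1LW - WLW^-1}
```

The two parts add up to `full`, the first equals its own transpose, and the second equals minus its transpose, because $(W\Lambda W^{-1})^* = W^{-1}\Lambda W$ for symmetric $\Lambda$ and symmetric $W$.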

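$$2\Lambda_a - W_0\Lambda_aW_0^{-1} - W_0^{-1}\Lambda_aW_0 = U^{-1}(\lambda_a\diamondsuit G_+)\,U,$$

where:

$$G_+(\alpha,\beta) := \big\{1 - e^{(\varphi(\beta)-\varphi(\beta-\alpha))}\big\} + \big\{1 - e^{(\varphi(\beta-\alpha)-\varphi(\beta))}\big\}. \tag{4.89}$$

Some algebra, using the Leibniz formula, shows that $G_+$ satisfies:

$$|G_+(\alpha,\beta)| \le \kappa\langle\alpha\rangle^2e^{\kappa|\alpha|} \le \kappa e^{\kappa'|\alpha|}$$

for any $\kappa' \in (\kappa,2\pi\delta)$. Then:

$$W_0^{-1}\Lambda_aW_0 - W_0\Lambda_aW_0^{-1} = U^{-1}\Big\{\sum_{\alpha\in\mathbb{Z}^n}\hat\lambda_a(\alpha)\Big[V(\alpha)\,e^{(\varphi([Q])-\varphi([Q]+\alpha))} - e^{(\varphi([Q])-\varphi([Q]-\alpha))}\,V(\alpha)\Big]\Big\}U. \tag{4.90}$$

Let us observe that:

$$e^{(\varphi(\beta)-\varphi(\beta\pm\alpha))} = \big\{e^{(\varphi(\beta)-\varphi(\beta\pm\alpha))} - e^{\mp\alpha\cdot X(\beta)}\big\} + \big\{e^{\mp\alpha\cdot X(\beta)} - 1\big\} + 1,$$

$$\varphi(\beta) - \varphi(\beta\pm\alpha) \pm \alpha\cdot X(\beta) = -\int_0^1 t\,dt\int_0^1 du\sum_{j,k=1}^n\{\alpha_j\alpha_k\,\partial_jX_k(\beta\pm ut\alpha)\} \equiv Y_\pm(\beta,\alpha). \tag{4.91}$$

Let us introduce the notations:

$$G_1(\alpha,\beta) := -\frac{1}{2}\int_0^1 ds\sum_{j,k=1}^n\alpha_j\alpha_k\,(\partial_kX_j)(\beta-s\alpha)\,e^{-s\alpha\cdot X(\beta-\alpha)},$$

$$G_2(\alpha,\beta) := -\frac{1}{2}\int_0^1 ds\;\alpha\cdot(\nabla\eta)(\beta-s\alpha),$$

$$G_-(\beta,\alpha) := \frac{1}{2}\int_0^1 ds\,\Big\{Y_+(\beta,\alpha)\,e^{-\alpha\cdot X(\beta)}\exp\{sY_+(\beta,\alpha)\} - Y_-(\beta,\alpha)\,e^{\alpha\cdot X(\beta)}\exp\{-sY_-(\beta,\alpha)\}\Big\}. \tag{4.92}$$

We have the estimates:

$$|G_1(\alpha,\beta)| \le \kappa C\,\frac{\langle\alpha\rangle^3}{\langle\beta\rangle}\,e^{\kappa|\alpha|} \le \frac{\kappa C}{\langle\beta\rangle}\,e^{\kappa'|\alpha|},$$

$$|G_2(\alpha,\beta)| \le \kappa C\,\frac{\langle\alpha\rangle^3}{\langle\beta\rangle^2} \le \frac{\kappa C}{\langle\beta\rangle^2}\,e^{\kappa'|\alpha|},$$

$$|G_-(\alpha,\beta)| \le \kappa C\,\frac{\langle\alpha\rangle^3}{\langle\beta\rangle}\,e^{2\kappa|\alpha|} \le \frac{\kappa C}{\langle\beta\rangle}\,e^{2\kappa'|\alpha|} \tag{4.93}$$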

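for any strictly positive constants $\kappa$ and $\kappa' > \kappa$. Then we can write:

$$\frac{1}{2}P_a\big\{W_0^{-1}\Lambda_aW_0 - W_0\Lambda_aW_0^{-1}\big\}P_a = -\frac{i}{2\pi}\,\eta([Q])\,P_a(A_X)P_a + P_aU^{-1}\big(\lambda_a\diamondsuit(G_1+G_-)\big)UP_a + U^{-1}(\pi_a\diamondsuit G_2)\,U\,P_a(A_X)P_a.$$

In conclusion we obtain:

$$[K,P_K(W_0)] = -\frac{i}{2\pi}\,\eta([Q])\sum_{a=1}^N P_aA_XP_a\,P_K(W_0) + \sum_{a=1}^N\Big\{P_aU^{-1}\big(\lambda_a\diamondsuit(G_1+G_-)\big)UP_a + U^{-1}(\pi_a\diamondsuit G_2)\,U\,P_aA_XP_a\Big\}P_K(W_0) + \frac{1}{2}\sum_{a=1}^N P_aU^{-1}(\lambda_a\diamondsuit G_+)\,U\,P_K(W_0) + \sum_{a=1}^N P_aU^{-1}\big((\pi_a\triangle\lambda_a)\diamondsuit\Psi\big)U\,W_0P_a.$$

We now introduce the notations:

$$T := \frac{1}{2}\sum_{a=1}^N P_aU^{-1}(\lambda_a\diamondsuit G_+)\,U\,P_a,$$

$$R_0 := \sum_{a=1}^N\Big\{P_aU^{-1}\big(\lambda_a\diamondsuit(G_1+G_-)\big)UP_a + U^{-1}(\pi_a\diamondsuit G_2)\,U\,P_aA_XP_a\Big\}, \tag{4.94}$$

$$R_a := P_aU^{-1}\big((\pi_a\triangle\lambda_a)\diamondsuit\Psi\big)U.$$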

Taking into account Proposition 2.7, Proposition 2.9 and the estimates proved above for $G_+$, $G_1$, $G_2$, $G_-$ and $\Psi$ we get the stated result.

Remark 4.32. Let us finally remark that for $a = 1,\dots,N$:

$$W_0P_af = P_aW_0P_af + [W_0,P_a]W_0^{-1}\,W_0P_af,$$
$$[W_0,P_a]W_0^{-1} = -U^{-1}(\pi_a\diamondsuit F)\,U,$$
$$\big\|[W_0,P_a]W_0^{-1}\big\|_{B(\mathcal H)} \le \kappa\,\|\pi_a\|_{2,\kappa},$$
$$\|W_0P_af\|_{\mathcal H} \le \big(1-\kappa\,\|\pi_a\|_{2,\kappa}\big)^{-1}\|P_aW_0P_af\|_{\mathcal H}.$$

Summing upon $a \in \{1,\dots,N\}$ we re-obtain the term $P_K(W_0)f$.

Remark 4.33. We have the following relations:

$$\big|2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,[K,P_K(W_0)]f\rangle + 2\,\langle P_K(A_X)P_K(W_0)f,\eta([Q])P_K(A_X)P_K(W_0)f\rangle\big| \le \big|2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,T\,P_K(W_0)f\rangle\big| + \kappa C\,\|P_K(W_0)f\|^2 = \big|(-i)\,\langle P_K(W_0)f,[P_K(A_X),T]P_K(W_0)f\rangle\big| + \kappa C\,\|P_K(W_0)f\|^2.$$

Lemma 4.34. We have $[P_K(A_X),T] \in B(\mathcal H)$ and $\|[P_K(A_X),T]\| \le \kappa C$.

Proof. Let us recall that:

$$T := \frac{1}{2}\sum_{a=1}^N P_aU^{-1}(\lambda_a\diamondsuit G_+)\,U\,P_a \tag{4.95}$$

so that the commutator takes the form : [PK (AX ), T ] = −

N % &

1 Pa U −1 [Q] (la ♦Z+ )† + (la ♦Z− ) [Q] , λa ♦G+ U Pa . 2 a=1

&

% [Q] (la ♦Z+ )† + (la ♦Z− ) [Q] , λa ♦G+ = = [[Q] , λ%a ♦G+ ] (la ♦Z+ )† +&(la ♦Z− ) [λa ♦G+ , [Q]] + †

+ [Q] (la ♦Z+ ) , λa ♦G+ + [la ♦Z− , λa ♦G+ ] [Q] ,

[[Q] , λa ♦G+ ] =

(la ♦G+%) , & *a (β)M ˜ Z (γ,[Q]) V (γ), λ ˜ G (β,[Q]) V (β) = la (γ)M [la ♦Z− , λa ♦G+ ] = − + β,γ∈Zn * 1 la (γ)λa (β) 0 dsβ · ∇(2) Z− (γ, [Q] − sβ) − = β,γ∈Zn

1 − 0 dsγ · ∇(2) G+ (β, [Q] − sγ) V (β + γ), # # # ∇(2) G+ (α, β)# ≤ κC < α >2 eκ|α| ≤ κC eκ |α| i 2π

(by some obvious calculations). Proposition 2.7 gives the expected estimate.

Conclusion 4.35. Putting together the above results we get the following relation:

$$\Big|2\,\mathrm{Im}\langle P_K(A_X)P_K(W_0)f,[K,P_K(W_0)]f\rangle + \Big\|\sqrt{2\eta([Q])}\,P_K(A_X)P_K(W_0)f\Big\|^2\Big| \le \kappa C\,\|P_K(W_0)f\|^2.$$

Acknowledgments. We want to thank the University of Geneva for its hospitality during the preparation of this work.

References

[1] S. Agmon, "Lectures on Exponential Decay of Solutions of Second Order Elliptic Equations", Princeton Univ. Press, (1982).
[2] S. Agmon, I. Herbst, E. Skibsted, "Perturbation of Embedded Eigenvalues in the Generalized N-Body Problem", Comm. Math. Phys. 122, 411–438, (1989).
[3] W. Amrein, Anne Boutet de Monvel, V. Georgescu, "Hardy Type Inequalities for Abstract Differential Operators", Memoirs of the American Mathematical Society, 375, 1–119, (1987).
[4] R. Froese, I. Herbst, "Exponential Bounds and Absence of Positive Eigenvalues for N-Body Schrödinger Operators", Comm. Math. Phys., 87, 429–447, (1982).
[5] R. Froese, I. Herbst, Maria Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, "L2-Exponential Lower Bounds to Solutions of the Schrödinger Equation", Comm. Math. Phys., 87, 265–286, (1982).
[6] Ch. Gérard, F. Nier, "The Mourre Theory for Analytically Fibred Operators", J. Func. Anal. 152 (1), 202–219, (1998).
[7] Ch. Gérard, F. Nier, "Scattering Theory for the Perturbations of Periodic Schrödinger Operators", J. Math. Kyoto Univ. 38 (4), 595–634, (1998).
[8] I. Herbst, "Perturbation theory for the decay rate of eigenfunctions in the generalized N-body problem", Comm. Math. Phys. 158, (1993).
[9] P. Kuchment, B. Vainberg, "On Embedded Eigenvalues of Perturbed Periodic Schrödinger Operators", in Spectral and Scattering Theory (Newark, DE, 1997), Plenum, New York, 67–75, 1998.
[10] L.A. Malozemov, "On the Eigenvalues of a Perturbed Almost Periodic Operator that are Immersed in the Continuous Spectrum", Usp. Mat. Nauk 43, no. 4 (262), 211–212, 1988.
[11] M. Mantoiu, R. Purice, "Weighted Estimations from a Conjugate Operator", Lett. Math. Phys., 51, 17–35, 2000.
[12] M. Reed, B. Simon, "Methods of Modern Mathematical Physics, Vol. IV: Analysis of Operators", Academic Press, 1978.
[13] J. Sjöstrand, "Microlocal Analysis for the Periodic Magnetic Schrödinger Equation and Related Questions", Springer Lect. Notes in Math., 1495, 237–332 (1991).

Marius Măntoiu**, Radu Purice
Institute of Mathematics "Simion Stoilow"
The Romanian Academy
P.O. Box 1-764
70700 Bucharest
Romania
Research partially supported by the Swiss National Science Foundation and the grant CNCSU-13
** Present address: Université de Genève, 32, bd. d'Yvoy, CH-1211 Genève 4, Suisse

Communicated by Gian Michele Graf submitted 27/09/00, accepted 11/12/00



Bound States in Weakly Deformed Strips and Layers D. Borisov, P. Exner, R. Gadyl’shin, and D. Krejˇciˇr´ık Abstract. We consider Dirichlet Laplacians on straight strips in R2 or layers in R3 with a weak local deformation. First we generalize a result of Bulla et al. to the three-dimensional situation showing that weakly coupled bound states exist if the volume change induced by the deformation is positive; we also derive the leading order of the weak-coupling asymptotics. With the knowledge of the eigenvalue analytic properties, we demonstrate then an alternative method which makes it possible to evaluate the next term in the asymptotic expansion for both the strips and layers. It gives, in particular, a criterion for the bound-state existence in the critical case when the added volume is zero.

1 Introduction

Spectra of Dirichlet Laplacians in infinitely stretched regions such as a planar strip or a layer of a fixed width have attracted a lot of attention recently. Of course, the problem is trivial as long as the strip or layer is straight because then one can employ separation of variables. However, already a local perturbation such as bending, deformation, or a change of boundary conditions can produce a nonempty discrete spectrum. This effect was studied intensively in the last decade, first because it had applications in condensed matter physics, and also because it was itself an interesting mathematical problem.

A particular aspect we will be concerned with here is the behaviour in the weak-coupling regime, i.e., the situation when the perturbation is gentle. Recall that the answer to this question depends on the type of the perturbation. For bent strips, e.g., one can perform the Birman-Schwinger analysis which yields the first term in the asymptotic expansion for the gap between the eigenvalue and the threshold of the essential spectrum [DE]. It is proportional to the fourth power of the bending angle and always positive, since any nontrivial (local) bending induces a non-empty discrete spectrum. A local switch of the boundary condition from Dirichlet to Neumann has a similar effect. Here the weak-coupling behaviour was determined variationally to be governed by the fourth power of the "window width" [EV1] and the exact asymptotics was derived formally in [Po] by a direct application of the technique developed in [Il, Ga]. Notice that this asymptotics differs substantially from that corresponding to a local change in the mixed boundary conditions, where the Birman-Schwinger technique is applicable and the leading term is a multiple of the square of the said parameter [EK]. Recall also that analogous results can be derived for layers with locally perturbed boundary conditions where, however, the asymptotics is exponential rather than power-like [EV2].

The present paper deals with the case of a local deformation of the strip or layer, which is more subtle than the bending or boundary-condition modification. The main difference is that the effective interaction induced by a deformation can be of different signs, both attractive and repulsive. It is easy to see by bracketing that a bulge on a strip or layer does create bound states while a squeeze does not. The answer is less clear for more complicated deformations where the width change does not have a definite sign.

The first rigorous treatment of this problem was presented in the work of Bulla et al. [BGRS] dealing with a local one-sided deformation (characterized by a function $\lambda v$) of a straight strip of a constant width $d$. The authors found that the added volume was decisive: a bound state exists for small positive $\lambda$ if the area change $\lambda d\langle v\rangle$ is positive, where $\langle v\rangle$ denotes the integral of $v$, and in that case the ground-state eigenvalue has the following weak-coupling expansion,

$$E(\lambda) = \kappa_1^2 - \lambda^2\kappa_1^4\langle v\rangle^2 + O(\lambda^3), \tag{1.1}$$

where $\kappa_1 = \pi/d$ is the square root of the first transverse eigenvalue.¹ On the other hand, the discrete spectrum is empty if $\langle v\rangle < 0$.

¹ In fact, they assumed $d = 1$, but it is easy to restore the strip width in their expression, obtaining eq. (1.1).

A problem arises in the critical case, $\langle v\rangle = 0$, when the areas of the outward and inward deformation coincide. The authors of [BGRS] suggested that the analogy with one-dimensional Schrödinger operators, by which bound states should exist again, may be misleading due to the presence of the higher transverse modes. This suspicion was confirmed in [EV3], where it was shown that this is true only if the deformation is "smeared" enough. More specifically, the discrete spectrum is empty if

$$d > \frac{4}{\sqrt{3}}\, b \tag{1.2}$$

provided $\operatorname{supp} v \subset [-b, b]$. On the other hand, a weakly bound state exists if

$$\frac{\|v'\|^2}{\|v\|^2} < \frac{6\kappa_1^2}{9 + \sqrt{90 + 12\pi^2}}, \tag{1.3}$$

and in that case there are positive $c_1, c_2$ such that

$$-c_1\lambda^4 \le E(\lambda) - \kappa_1^2 \le -c_2\lambda^4. \tag{1.4}$$

These results have been obtained by a variational method and they are certainly not optimal, because there are deformed strips which fulfill neither of the conditions (1.2), (1.3).

A way to improve the above conclusions would be to compute the Birman-Schwinger expansion employed in [BGRS] to the second order, which becomes the leading one when the term proportional to $\lambda^2$ in (1.1) is absent, and the asymptotics is governed by $\lambda^4$ in correspondence with (1.4). This is not easy, however. The standard technique in these situations is to map the strip in question onto a straight one by means of suitable curvilinear coordinates. In distinction to the bent-strip case [DE] these coordinates typically are not locally orthogonal. Hence the transformed Laplacian contains numerous terms which make the computation extremely cumbersome.

After this introduction, let us describe the aim and the scope of the present paper. The aim is twofold. First we are going to consider an extension of the result of [BGRS] to the case of a locally deformed layer. The result is summarized in Theorem 2.4. In particular, we derive a weak-coupling expansion of the ground-state eigenvalue,

$$E(\lambda) = \kappa_1^2 - \exp\left\{ 2\left( -\lambda\,\frac{\kappa_1^2}{\pi}\,\langle v\rangle + O(\lambda^2) \right)^{-1} \right\} \tag{1.5}$$

and show the analytical properties of the round-bracket expression w.r.t. $\lambda$. This is done in Sec. 2; the results again say nothing about the behaviour in the critical case.

Instead of attempting to proceed further by the Birman-Schwinger method, we demonstrate in Sec. 3 a different approach to the weak-coupling problem. It is based on constructing the asymptotics of a particular boundary value problem, and requires as a prerequisite the analyticity of the function $E(\cdot)$ itself in dimension two, and of its above mentioned constituent in dimension three. In the present case, however, these properties are guaranteed by [BGRS] and the results of Sec. 2. The method allows us to recover the expansions (1.1) and (1.5) in a different way. What is more, we are also able to compute higher terms, in principle of any order. We perform the explicit computation for the second-order terms which play a role in the critical case. In particular, we make in this way more precise the result expressed by (1.2) and (1.3) about the critical bound-state existence for smeared perturbations, and derive its analog in the deformed-layer case.

2 Locally deformed layers

2.1 The curvilinear coordinates

Let $x = (x_1, x_2) \in \mathbb{R}^2$ and $(x, u) \in \Omega_0 := \mathbb{R}^2 \times (0, d)$ with $d > 0$. Given a function $v \in C_0^\infty(\mathbb{R}^2)$ we define the mapping

$$\phi : \Omega_0 \to \mathbb{R}^3 : \left\{ (x, u) \mapsto \phi(x, u) := \big(x_1,\, x_2,\, (1 + \lambda v(x))\,u\big) \right\} \tag{2.1}$$

for $\lambda > 0$, which defines our deformed layer $\Omega_\lambda := \phi(\Omega_0)$. To make use of the curvilinear coordinates defined by the mapping $\phi$ we need the metric tensor $G_{ij} := \phi_{,i}\cdot\phi_{,j}$ of the deformed layer. It can be seen easily to be of the form

$$(G_{ij}) = \begin{pmatrix} 1 + \lambda^2 v_{,1}^2 u^2 & \lambda^2 v_{,1}v_{,2}u^2 & \lambda v_{,1}(1+\lambda v)u \\ \lambda^2 v_{,1}v_{,2}u^2 & 1 + \lambda^2 v_{,2}^2 u^2 & \lambda v_{,2}(1+\lambda v)u \\ \lambda v_{,1}(1+\lambda v)u & \lambda v_{,2}(1+\lambda v)u & (1+\lambda v)^2 \end{pmatrix}, \tag{2.2}$$

where $v_{,\mu}$ means the derivative w.r.t. $x_\mu$, and its determinant is $G := \det(G_{ij}) = (1 + \lambda v)^2$. In view of the inverse function theorem, the mapping $\phi$ defining the layer will be a diffeomorphism provided $\lambda\|v_-\|_\infty < 1$, where we put conventionally $v_- := \max\{0, -v\}$. For a sign-changing $v$, this is a nontrivial restriction which is satisfied, however, when $\lambda$ is small enough. That is just the case we are interested in.

We will also need the contravariant metric tensor, in other words the inverse matrix

$$(G^{ij}) = \begin{pmatrix} 1 & 0 & -\dfrac{\lambda v_{,1}u}{1+\lambda v} \\ 0 & 1 & -\dfrac{\lambda v_{,2}u}{1+\lambda v} \\ -\dfrac{\lambda v_{,1}u}{1+\lambda v} & -\dfrac{\lambda v_{,2}u}{1+\lambda v} & \dfrac{1+\lambda^2|\nabla v|^2u^2}{(1+\lambda v)^2} \end{pmatrix} \tag{2.3}$$

and the following contraction identities

$$G^{\mu j}{}_{,j} = -\frac{\lambda v_{,\mu}}{1+\lambda v}, \qquad G^{3j}{}_{,j} = -\frac{\lambda\,\Delta v\, u}{1+\lambda v} + \frac{3\lambda^2|\nabla v|^2 u}{(1+\lambda v)^2}, \tag{2.4}$$

where conventionally summation is performed over repeated indices, and we denote $|\nabla v|^2 := v_{,1}^2 + v_{,2}^2$ and $\Delta v := v_{,11} + v_{,22}$. Another convention concerns the range of the indices, which is $1, 2$ for Greek and $1, 2, 3$ for Latin indices. The indices are associated with the above coordinates by $(1, 2, 3) \leftrightarrow (x_1, x_2, u)$.
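The two algebraic claims of this subsection, $\det(G_{ij}) = (1+\lambda v)^2$ and the explicit inverse (2.3), can be spot-checked numerically. The following is a minimal sketch in plain Python, not part of the original argument: it treats $v$, $v_{,1}$, $v_{,2}$ as independent random numbers at a single point, and the function names (`metric`, `inverse_metric`, `max_deviation`) are hypothetical helpers introduced only for this illustration.

```python
import random


def metric(lam, v, v1, v2, u):
    """Covariant metric (2.2) at one point; v1, v2 stand for v_{,1}, v_{,2}."""
    c = 1.0 + lam * v
    a, b = lam * v1 * u, lam * v2 * u
    return [[1.0 + a * a, a * b, a * c],
            [a * b, 1.0 + b * b, b * c],
            [a * c, b * c, c * c]]


def inverse_metric(lam, v, v1, v2, u):
    """Contravariant metric (2.3) at the same point."""
    c = 1.0 + lam * v
    a, b = lam * v1 * u, lam * v2 * u
    return [[1.0, 0.0, -a / c],
            [0.0, 1.0, -b / c],
            [-a / c, -b / c, (1.0 + a * a + b * b) / (c * c)]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def matmul3(p, q):
    """Product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_deviation(trials=200, seed=0):
    """Largest violation of det G = (1 + lam v)^2 and of G G^{-1} = I
    over randomly sampled points with lam * |v| <= 1/2."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        lam = rng.uniform(0.0, 0.5)
        v = rng.uniform(-1.0, 1.0)
        v1 = rng.uniform(-1.0, 1.0)
        v2 = rng.uniform(-1.0, 1.0)
        u = rng.uniform(0.0, 1.0)
        g = metric(lam, v, v1, v2, u)
        g_inv = inverse_metric(lam, v, v1, v2, u)
        worst = max(worst, abs(det3(g) - (1.0 + lam * v) ** 2))
        prod = matmul3(g, g_inv)
        for i in range(3):
            for j in range(3):
                target = 1.0 if i == j else 0.0
                worst = max(worst, abs(prod[i][j] - target))
    return worst
```

The sampling range keeps $1 + \lambda v \ge 1/2$, in line with the diffeomorphism condition $\lambda\|v_-\|_\infty < 1$; the contraction identities (2.4) involve derivatives and are not covered by this pointwise check.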

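2.2 The straightening transformation

As mentioned in the introduction, the main object of our study is the Dirichlet Laplacian $-\Delta_D^{\Omega_\lambda}$ on $L^2(\Omega_\lambda)$. If we think of a quantum particle living in the region $\Omega_\lambda$ with hard walls and exposed to no other interaction, $-\Delta_D^{\Omega_\lambda}$ will be its Hamiltonian up to a multiplicative constant; we can get rid of the latter by setting the Planck's constant $\hbar = 1$ and the effective mass $m^* = \frac{1}{2}$. Mathematically speaking, $-\Delta_D^{\Omega_\lambda}$ is defined for an open set $\Omega_\lambda \subset \mathbb{R}^3$ as the Friedrichs extension of the free Laplacian with the domain $C_0^\infty(\Omega)$ (cf. [RS, Sec. XIII.15]). Moreover, since the smooth boundary of $\Omega_\lambda$ has the segment property, $-\Delta_D^{\Omega_\lambda}$ acts simply as $\psi \mapsto -\psi_{,jj}$ with the Dirichlet b.c. at $\partial\Omega_\lambda$.

A natural way to investigate the Hamiltonian is to introduce the unitary transformation $U : L^2(\Omega_\lambda) \to L^2(\Omega_0) : \{\psi \mapsto U\psi := G^{1/4}\,\psi\circ\phi\}$ and to investigate the unitarily equivalent operator

$$H_\lambda := U(-\Delta_D^{\Omega_\lambda})U^{-1} = -G^{-1/4}\,\partial_i\, G^{1/2} G^{ij}\,\partial_j\, G^{-1/4} \tag{2.5}$$

with the form domain $Q(H_\lambda) = W_0^{1,2}(\Omega_0)$ instead of $-\Delta_D^{\Omega_\lambda}$. As usual in such situations, the "straightened" region is geometrically simpler and the price we pay is a more complicated form of the operator (2.5).

To make it more explicit, put $F := \ln G^{1/4}$. Commuting $G^{-1/4}$ with the gradient components, we cast the operator (2.5) into a form which has a simpler kinetic part,

$$H_\lambda = -\partial_i G^{ij}\partial_j + V = -G^{ij}\partial_i\partial_j - G^{ij}{}_{,j}\,\partial_i + V,$$

but contains an effective potential,

$$V := (G^{ij}F_{,j})_{,i} + F_{,i}G^{ij}F_{,j} = G^{ij}F_{,ij} + G^{ij}{}_{,j}F_{,i} + G^{ij}F_{,i}F_{,j}.$$

If we now employ the particular form (2.2) of the metric tensor together with (2.3), (2.4), we can write

$$H_\lambda = -\partial_1^2 - \partial_2^2 - \frac{1+\lambda^2|\nabla v|^2u^2}{(1+\lambda v)^2}\,\partial_3^2 + \frac{2\lambda v_{,1}u}{1+\lambda v}\,\partial_1\partial_3 + \frac{2\lambda v_{,2}u}{1+\lambda v}\,\partial_2\partial_3 + \frac{\lambda v_{,1}}{1+\lambda v}\,\partial_1 + \frac{\lambda v_{,2}}{1+\lambda v}\,\partial_2 + \left( \frac{\lambda\,\Delta v\,u}{1+\lambda v} - \frac{3\lambda^2|\nabla v|^2u}{(1+\lambda v)^2} \right)\partial_3 + V$$

with

$$V = \frac{\lambda\,\Delta v}{2} - \frac{\lambda^2 v\,\Delta v}{2(1+\lambda v)} - \frac{3\lambda^2|\nabla v|^2}{4(1+\lambda v)^2}.$$

For our purpose it is useful to rewrite this expression further in a form sorted w.r.t. the powers of $\lambda$:

$$H_\lambda = -\Delta_D^{\Omega_0} + \lambda\left( 2v\,\partial_3^2 + 2v_{,1}u\,\partial_1\partial_3 + 2v_{,2}u\,\partial_2\partial_3 + v_{,1}\partial_1 + v_{,2}\partial_2 + (\Delta v)\,u\,\partial_3 + \frac{\Delta v}{2} \right)$$
$$\qquad - \lambda^2\left( \frac{3v^2 + |\nabla v|^2u^2 + 2\lambda v^3}{(1+\lambda v)^2}\,\partial_3^2 + \frac{2vv_{,1}u}{1+\lambda v}\,\partial_1\partial_3 + \frac{2vv_{,2}u}{1+\lambda v}\,\partial_2\partial_3 + \frac{vv_{,1}}{1+\lambda v}\,\partial_1 + \frac{vv_{,2}}{1+\lambda v}\,\partial_2 + \left( \frac{v(\Delta v)\,u}{1+\lambda v} + \frac{3|\nabla v|^2u}{(1+\lambda v)^2} \right)\partial_3 + \frac{v\,\Delta v}{2(1+\lambda v)} + \frac{3|\nabla v|^2}{4(1+\lambda v)^2} \right).$$

In analogy with [BGRS], we thus get the following formula for the "straightened" operator,

$$H_\lambda = H_0 + \lambda\sum_{n=1}^{3} A_n^* B_n + \lambda^2\sum_{n=4}^{7} A_n^* B_n, \tag{2.6}$$

where each of the $A_n$'s and $B_n$'s is a first-order differential operator with compactly supported coefficients and

$$\begin{aligned}
A_1^* &:= 2v\,\partial_3, & B_1 &:= \omega\,\partial_3, \\
A_2^* &:= \Delta v, & B_2 &:= \omega\left(u\,\partial_3 + \tfrac{1}{2}\right), \\
A_3^* &:= (2u\,\partial_3 + 1)\,\omega, & B_3 &:= v_{,1}\partial_1 + v_{,2}\partial_2, \\
A_4^* &:= -\frac{3v^2 + |\nabla v|^2u^2 + 2\lambda v^3}{(1+\lambda v)^2}\,\partial_3, & B_4 &:= \omega\,\partial_3, \\
A_5^* &:= -\frac{v\,\Delta v}{1+\lambda v}, & B_5 &:= \omega\left(u\,\partial_3 + \tfrac{1}{2}\right), \\
A_6^* &:= -\frac{3|\nabla v|^2}{(1+\lambda v)^2}, & B_6 &:= \omega\left(u\,\partial_3 + \tfrac{1}{4}\right), \\
A_7^* &:= -\frac{2u\,\partial_3 + 1}{1+\lambda v}\,v, & B_7 &:= v_{,1}\partial_1 + v_{,2}\partial_2,
\end{aligned}$$

with $\omega \in C_0^\infty(\mathbb{R}^2)$ such that $\omega \equiv 1$ on $\operatorname{supp} v$. We define a pair of operators $C_\lambda, D : L^2(\Omega_0) \to L^2(\Omega_0)\otimes\mathbb{C}^7$ by

$$\varphi \mapsto (C_\lambda\varphi)_n := \begin{cases} A_n\varphi & n = 1, 2, 3 \\ \lambda A_n\varphi & n = 4, \dots, 7 \end{cases} \qquad\qquad \varphi \mapsto (D\varphi)_n := B_n\varphi, \quad n = 1, \dots, 7;$$

then (2.6) finally becomes $H_\lambda = H_0 + \lambda C_\lambda^* D$.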

2.3 Weak coupling analysis

First we note that since our layer is deformed only locally, we have

$$\sigma_{\mathrm{ess}}(-\Delta_D^{\Omega_\lambda}) = \sigma_{\mathrm{ess}}(-\Delta_D^{\Omega_0}) = [\kappa_1^2, \infty).$$

This is easy to see, for instance, by using a bracketing to show that $\inf\sigma_{\mathrm{ess}}(-\Delta_D^{\Omega_\lambda}) = \kappa_1^2$ (cf. [DEK]), while the opposite inclusion is obtained by constructing an appropriate Weyl sequence. We use the notation $\kappa_j^2 := \left(\frac{\pi}{d}j\right)^2$ for the eigenvalues of the transverse operator $(-\partial_3^2)_D$; the corresponding eigenfunctions are denoted by $\chi_j$, and their explicit form is

$$\chi_j(u) = \sqrt{\frac{2}{d}}\,\sin\kappa_j u.$$

Next we define $K_\lambda^\alpha := \lambda D(H_0 - \alpha^2)^{-1}C_\lambda^*$. We are interested in (positive) eigenvalues $E(\lambda) =: \alpha^2$ of $H_\lambda$ below the lowest transverse mode, hence we choose $\alpha \in [0, \kappa_1)$. Our basic tool is the following classical result (cf. [BGRS, Lemma 2.1]):

Proposition 2.1 (Birman-Schwinger principle) $\alpha^2 \in \sigma_{\mathrm{disc}}(H_\lambda) \iff -1 \in \sigma_{\mathrm{disc}}(K_\lambda^\alpha)$.

Proof. If $K_\lambda^\alpha\psi = -\psi$, then $\varphi := -\lambda(H_0 - \alpha^2)^{-1}C_\lambda^*\psi$ is easily checked to satisfy $H_\lambda\varphi = \alpha^2\varphi$. Conversely, if $H_\lambda\varphi = \alpha^2\varphi$, we have $\varphi \in Q(H_\lambda) \subset D(D)$, so $\psi := D\varphi$ is in $L^2(\Omega_0)$ and $K_\lambda^\alpha\psi = -\psi$. ✷

To make use of the above equivalence, we have to analyze the structure of $K_\lambda^\alpha$. Let $R_0(\alpha) := (H_0 - \alpha^2)^{-1}$ be the free resolvent corresponding to $H_0$. Using the transverse-mode decomposition and the fact that $H_0 = -\Delta^{\mathbb{R}^2}\otimes I_1 + I_2\otimes(-\partial_3^2)_D$, we can express the integral kernel of $R_0$,

$$R_0(x, u, x', u'; \alpha) = \sum_{j=1}^{\infty}\chi_j(u)\, r_j(x, x'; \alpha)\,\chi_j(u'),$$

where $r_j(x, x'; \alpha)$ is the kernel of $(-\Delta^{\mathbb{R}^2} + \kappa_j^2 - \alpha^2)^{-1}$ in $L^2(\mathbb{R}^2)$. We define $k_j(\alpha)^2 := \kappa_j^2 - \alpha^2$. The free kernel $r_j$ can be expressed in terms of Hankel's functions (cf. [AGH, Chap. I.5]), which are related to Macdonald's functions by [AS, 9.6.4], so finally we arrive at the formula

$$R_0(x, u, x', u'; \alpha) = \frac{1}{2\pi}\sum_{j=1}^{\infty}\chi_j(u)\, K_0(k_j(\alpha)|x - x'|)\,\chi_j(u').$$

Now we want to split off the singular part of $R_0(\alpha)$; we write $K_\lambda^\alpha = \hat L_\lambda + \hat M_\lambda$, where $\hat L_\lambda := \lambda D L_\alpha C_\lambda^*$ contains the singularity:

$$L_\alpha(x, u, x', u') := -\frac{1}{2\pi}\,\chi_1(u)\,\ln k_1(\alpha)\,\chi_1(u')$$

diverges logarithmically as $\alpha \to \kappa_1-$. The regular part $\hat M_\lambda = \lambda D M_\alpha C_\lambda^*$ consists of two terms, $M_\alpha = N_\alpha + R_0^\perp(\alpha)$, where the operator $R_0^\perp$ is defined as the projection of the resolvent on higher transverse modes

$$R_0^\perp(x, u, x', u'; \alpha) := \frac{1}{2\pi}\sum_{j=2}^{\infty}\chi_j(u)\, K_0(k_j(\alpha)|x - x'|)\,\chi_j(u'),$$

and the remaining term is therefore

$$N_\alpha(x, u, x', u') := \frac{1}{2\pi}\,\chi_1(u)\,\big( K_0(k_1(\alpha)|x - x'|) + \ln k_1(\alpha) \big)\,\chi_1(u').$$

Put $w^{-1} := \ln k_1(\alpha)$. The next step in the BS method is to show the boundedness and the analyticity (w.r.t. $w$) of the regular part of $K_\lambda^\alpha$. A more difficult part of this task concerns the operator containing $N_\alpha$, where we have to take a different route than that used in [BGRS].




First we note that while the Hilbert-Schmidt norm is suitable for estimating the operator $N_\alpha$, it fails when the latter is sandwiched between $\lambda D$ and $C_\lambda^*$. More specifically, using the regularity and compact support of the functions involved one could transform $\lambda D N_\alpha C_\lambda^*$ into an integral operator via integration by parts, but the obtained kernel has a singularity which is not square integrable. Hence we use instead the "continuous" version of the Schur-Holmgren bound. Since it seems to be less known than its discrete analogue [AGH, Lemma C.3], [Mad, Thm. 7.1.9], we present it here with the proof.

Lemma 2.2 Suppose that $M$ is an open subset of $\mathbb{R}^n$ and let $K : L^2(M) \to L^2(M)$ be an integral operator with the kernel $K(\cdot,\cdot)$. Then

$$\|K\| \le \|K\|_{SH} := \left( \sup_{x\in M}\int_M |K(x, x')|\,dx' \;\sup_{x'\in M}\int_M |K(x, x')|\,dx \right)^{1/2}.$$

Proof. The claim follows from the inequality

$$\|K\|_{p,p} \le \|K\|_{1,1}^{1/p}\,\|K\|_{\infty,\infty}^{1/q}, \tag{2.7}$$

where $K$ is now an integral operator on $L^p(M)$, $p^{-1} + q^{-1} = 1$, and

$$\|K\|_{\infty,\infty} := \sup_{x\in M}\int_M |K(x, x')|\,dx', \qquad \|K\|_{1,1} := \sup_{x'\in M}\int_M |K(x, x')|\,dx.$$

If $K$ is bounded for $p = 1, \infty$, we can prove (2.7) for the other $p$ by an interpolation argument adapted from the discrete case [Mad]. By Hölder's inequality

$$\left| \int_M K(x, x')\psi(x')\,dx' \right| \le \int_M |K(x, x')|^{1/p}\,|K(x, x')|^{1/q}\,|\psi(x')|\,dx' \le \left( \int_M |K(x, x')|\,|\psi(x')|^p\,dx' \right)^{1/p}\left( \int_M |K(x, x')|\,dx' \right)^{1/q},$$

so we can easily estimate the $L^p$-norm of $K\psi$,

$$\begin{aligned}
\|K\psi\|_p^p &= \int_M dx\,\left| \int_M K(x, x')\psi(x')\,dx' \right|^p \\
&\le \|K\|_{\infty,\infty}^{p/q}\int_M dx\int_M |K(x, x')|\,|\psi(x')|^p\,dx' \\
&\le \|K\|_{\infty,\infty}^{p/q}\int_M dx'\,|\psi(x')|^p\int_M |K(x, x')|\,dx \\
&\le \|K\|_{\infty,\infty}^{p/q}\,\|K\|_{1,1}\,\|\psi\|_p^p,
\end{aligned}$$

which yields the result.

✷
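The discrete analogue of Lemma 2.2 cited above is easy to experiment with: for a matrix, the Schur-Holmgren quantity is the geometric mean of the largest absolute row sum and the largest absolute column sum. The sketch below (plain Python; the helper names `schur_bound`, `operator_norm` and `random_kernel` are hypothetical and introduced only here) compares it with a power-iteration estimate of the $\ell^2$ operator norm. Since the estimate is $\|Kx\|$ for a unit vector $x$, it never exceeds the true norm, so the comparison is safe even before the iteration has converged.

```python
import math
import random


def schur_bound(k):
    """sqrt(max absolute row sum * max absolute column sum) of a matrix."""
    row = max(sum(abs(entry) for entry in r) for r in k)
    col = max(sum(abs(k[i][j]) for i in range(len(k)))
              for j in range(len(k[0])))
    return math.sqrt(row * col)


def operator_norm(k, iters=200, seed=1):
    """Lower estimate of the l2 operator norm by power iteration on K^T K."""
    rng = random.Random(seed)
    n_rows, n_cols = len(k), len(k[0])
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_cols)]
    sigma = 0.0
    for _ in range(iters):
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            break
        x = [t / norm_x for t in x]
        # y = K x with |x| = 1, hence |y| <= ||K||
        y = [sum(k[i][j] * x[j] for j in range(n_cols)) for i in range(n_rows)]
        sigma = math.sqrt(sum(t * t for t in y))
        # next iterate: x = K^T y
        x = [sum(k[i][j] * y[i] for i in range(n_rows)) for j in range(n_cols)]
    return sigma


def random_kernel(n, seed=0):
    """An n x n matrix with entries drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
```

For a diagonal matrix both quantities coincide, which shows that the bound cannot be improved in general; for a generic matrix the bound is strict.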

Recall that $\|\cdot\|_{SH}$ is not a norm and that it simplifies for symmetric kernels, $\|K\|_{SH} = \sup_{x\in M}\int_M |K(x, x')|\,dx'$. We are now ready to prove the following key result.

Lemma 2.3 $w \mapsto \hat M(\alpha(w))$ is a bounded and analytic operator-valued function, which can be continued from $\{w \in \mathbb{C} \mid \operatorname{Re} w < 0\}$ to a region that includes $w = 0$.

Proof. As in [BGRS, Lemma 2.2], let $\mathcal{H}_1 \subset L^2(\Omega_0)$ be the space of $L^2(\Omega_0)$ functions of the form $\varphi\chi_1$, where $\varphi \in L^2(\mathbb{R}^2)$. Let further $P_1$ be the projection onto this subspace, and $P_1^\perp := I - P_1$ the projection onto its orthogonal complement in $L^2(\Omega_0)$. Then $R_0^\perp(\alpha) \equiv R_0(\alpha)P_1^\perp$ has an analytic continuation into the region $\{\alpha \in \mathbb{C} \mid \alpha^2 \in \mathbb{C}\setminus[\kappa_2^2, \infty)\}$ since the lowest point in the spectrum of $H_0P_1^\perp \restriction P_1^\perp L^2(\Omega_0)$ is $\kappa_2^2$. This region includes the domain $[0, \kappa_1)$ actually considered. To accommodate the extra factors $D$, $C_\lambda^*$, we introduce the quadratic form

$$b_\alpha(\phi, \psi) := (\phi, D R_0^\perp(\alpha) C_\lambda^*\psi) = \big( R_0^\perp(\alpha)^{1/2}P_1^\perp D^*\phi,\; R_0^\perp(\alpha)^{1/2}P_1^\perp C_\lambda^*\psi \big).$$

To check boundedness of this form, it is therefore sufficient to verify that $R_0^\perp(\alpha)^{1/2}P_1^\perp D^*$ and $R_0^\perp(\alpha)^{1/2}P_1^\perp C_\lambda^*$ are bounded operators. We shall check it for their adjoints. To this purpose, it is enough to show that $C_\lambda P_1^\perp$ and $DP_1^\perp$ are $(R_0^\perp(\alpha)^{-1/2}P_1^\perp)$-bounded, i.e., that there exist positive $a, b$ such that

$$\forall\psi \in Q(H_\lambda): \qquad \|C_\lambda P_1^\perp\psi\| \le a\,\|R_0^\perp(\alpha)^{-1/2}P_1^\perp\psi\| + b\,\|\psi\|,$$

and similarly for $DP_1^\perp$. However,

$$\begin{aligned}
\|\nabla P_1^\perp\psi\|^2 &= \|(H_0 + 1)^{1/2}P_1^\perp\psi\|^2 - \|P_1^\perp\psi\|^2, \\
\|(H_0 + 1)^{1/2}P_1^\perp\psi\| &\le \|(H_0 - \alpha^2)^{1/2}P_1^\perp\psi\| + \sqrt{1 + \alpha^2}\,\|P_1^\perp\psi\| \\
&\le \|R_0^\perp(\alpha)^{-1/2}P_1^\perp\psi\| + \sqrt{1 + \alpha^2}\,\|\psi\|.
\end{aligned}$$

Here $\nabla$ means the gradient in the variables $(x, u)$ through which all the actions of $C_\lambda$, $D$ can be estimated, e.g., $|(C_\lambda\psi)_1| \equiv |A_1\psi| \le 2\|v\|_\infty|\nabla\psi|$, etc. In the same way, one verifies the analyticity of the operator-valued function $D R_0^\perp(\alpha) C_\lambda^*$, which is equivalent to the analyticity of the complex-valued function $\alpha \mapsto b_\alpha(\cdot,\cdot)$.

Consider next the regular part of $R_0(\alpha)P_1$ containing the operator $N_\alpha$. Let $h$ be a $C^\infty$-function of compact support in $\mathbb{R}^2$. As pointed out above, using integration by parts and the explicit form of the operators $C_\lambda$, $D$ one sees that it is sufficient to check the boundedness and analyticity of $h\,n_\alpha h$ and $h\,n_{\alpha,\mu}h$, where

$$n_\alpha(x, x') := \frac{1}{2\pi}\big( K_0(k_1(\alpha)|x - x'|) + \ln k_1(\alpha) \big), \qquad n_{\alpha,\mu}(x, x') = -\frac{1}{2\pi}\,\frac{x_\mu - x'_\mu}{|x - x'|}\,k_1(\alpha)\,K_1(k_1(\alpha)|x - x'|);$$

recall that $_{,\mu}$ means the derivative w.r.t. $x_\mu$ and $K_0' = -K_1$ holds true (cf. [AS, 9.6.27]). We will use the following estimates, which are valid for the Macdonald functions [AS, 9.6–7] with any $z \in (0, \infty)$:

$$|(K_0(z) + \ln z)\,e^{-z}| \le c_1, \qquad |K_1(z) - z^{-1}| \le c_2, \qquad \big|K_1(z) - z\,(K_0(z) + K_2(z))/2\big| \le c_3, \qquad |zK_1(z)| \le 1.$$

Passing to the polar coordinates,

$$x_\mu - x'_\mu = (\rho\cos\varphi,\, \rho\sin\varphi), \qquad \rho_m := \sup_{x, x'\in\operatorname{supp} h}|x - x'|,$$

we check the finiteness of the Schur-Holmgren bounds:

$$\begin{aligned}
\|h\,n_\alpha h\|_{SH} &= \sup_{x\in\mathbb{R}^2}|h(x)|\int_{\mathbb{R}^2}|n_\alpha(x, x')\,h(x')|\,dx' \\
&\le c_1\|h\|_\infty^2\left( \int_0^{\rho_m} e^{k_1(\alpha)\rho}\rho\,d\rho + \int_0^{\rho_m}|\ln\rho|\,\rho\,d\rho \right) \\
&\le c_1\|h\|_\infty^2\,\rho_m\left( \rho_m e^{\kappa_1\rho_m} + \max\{e^{-1}, \rho_m\ln\rho_m\} \right), \\
\|h\,n_{\alpha,\mu}h\|_{SH} &\le \|h\|_\infty^2\int_0^{\rho_m}\frac{\rho\,d\rho}{\rho} = \|h\|_\infty^2\,\rho_m.
\end{aligned}$$

Concerning the analyticity, one should investigate the complex-valued functions $w \mapsto (\phi, h\,n_{\alpha(w)}h\,\psi)$ and $w \mapsto (\phi, h\,n_{\alpha(w),\mu}h\,\psi)$, where $\phi, \psi$ are arbitrary vectors of $L^2(\Omega_0)$. Using the Schwarz inequality, it is sufficient to check the finiteness of norms of the complex derivative w.r.t. $w$ of the corresponding operator-valued functions. Since $K_1' = -(K_0 + K_2)/2$ by [AS, 9.6.29] and $k_1(\alpha(w)) = e^{w^{-1}}$, we put $z := k_1(\alpha(w))|x - x'|$ and write

$$\frac{dn_{\alpha(w)}}{dw}(x, x') = \frac{1}{2\pi}\,\frac{z}{w^2}\left( K_1(z) - \frac{1}{z} \right), \qquad \frac{dn_{\alpha(w),\mu}}{dw}(x, x') = \frac{1}{2\pi}\,\frac{x_\mu - x'_\mu}{|x - x'|}\,\frac{e^{w^{-1}}}{w^2}\left( K_1(z) - \frac{z}{2}\big(K_0(z) + K_2(z)\big) \right).$$

Using now the inequality $w^{-2}e^{w^{-1}} \le c_4$ for $w \in (-\infty, 0)$, we are able to estimate the Schur-Holmgren bounds:

$$\left\| h\,\frac{dn_{\alpha(w)}}{dw}\,h \right\|_{SH} \le c_2c_4\,\|h\|_\infty^2\,\rho_m^2, \qquad \left\| h\,\frac{dn_{\alpha(w),\mu}}{dw}\,h \right\|_{SH} \le c_3c_4\,\|h\|_\infty^2\,\rho_m^2.$$

Thus the derivatives are bounded for $w \in (-\infty, 0)$, and since the limits as $w$ tends to zero make sense, we can continue the function analytically to $w = 0$. ✷

Now we are in position to follow the standard Birman-Schwinger scheme to derive the weak-coupling expansion. Eigenvalues of $H_\lambda$ correspond to singularities of the operator-valued function $(I + K_\lambda^\alpha)^{-1}$, which we can express as

$$(I + K_\lambda^\alpha)^{-1} = \left[ I + (I + \hat M_\lambda)^{-1}\hat L_\lambda \right]^{-1}(I + \hat M_\lambda)^{-1}. \tag{2.8}$$

Owing to Lemma 2.3, $\|\hat M_\lambda\|$ is finite and we can choose $\lambda$ sufficiently small to have $\|\hat M_\lambda\| < 1$; then the second term at the r.h.s. of (2.8) is a bounded operator. On the other hand, $(I + \hat M_\lambda)^{-1}\hat L_\lambda$ is a rank-one operator of the form $(\psi, \cdot)\varphi$, where

$$\bar\psi(x, u) := -\frac{\lambda}{2\pi}\,\ln k_1(\alpha)\,\chi_1(u)\,C_\lambda^*, \qquad \varphi(x, u) := \big[(I + \hat M_\lambda)^{-1}D\,\chi_1\big](x, u),$$

so it has just one eigenvalue, which is

$$(\psi, \varphi) = -\frac{\lambda}{2\pi}\,\ln k_1(\alpha)\int_0^d\!\!\int_{\mathbb{R}^2}\chi_1(u)\,\big[C_\lambda^*(I + \hat M_\lambda)^{-1}D\,\chi_1\big](x, u)\,dx\,du.$$

Putting it equal to $-1$ we get an implicit equation, $F(\lambda, w) = 0$, with

$$F(\lambda, w) := w - \frac{\lambda}{2\pi}\int_0^d\!\!\int_{\mathbb{R}^2}\chi_1(u)\,\big[C_\lambda^*(I + \hat M_\lambda)^{-1}D\,\chi_1\big](x, u)\,dx\,du, \tag{2.9}$$

where $\hat M_\lambda$ has to be understood as a function both of $\lambda$ and $w$. Expanding $(I + \hat M_\lambda)^{-1}$ into the Neumann series we find

$$F_{,w}(0, 0) = 1 \ne 0, \qquad F_{,\lambda}(0, 0) = -\frac{1}{2\pi}\,(\chi_1, C_0^*D\,\chi_1),$$

and by Lemma 2.3 we know that $F(\lambda, w)$ is jointly analytic in $\lambda, w$. In view of the implicit function theorem $w = w(\lambda)$ is then an analytic function and we can compute the first term in its Taylor expansion:

$$\frac{dw}{d\lambda}(0) = -\frac{F_{,\lambda}(0, 0)}{F_{,w}(0, 0)} = \frac{1}{2\pi}\,(\chi_1, C_0^*D\,\chi_1).$$

But $(C_0)_n = 0$ for $n = 4, \dots, 7$, $B_3\chi_1 = 0$, and $(A_2\chi_1, B_2\chi_1) = 0$ since $\int_{\mathbb{R}^2}\Delta v = 0$. It follows that

$$\frac{dw}{d\lambda}(0) = \frac{1}{2\pi}\,(A_1\chi_1, B_1\chi_1) = -\frac{1}{\pi}\int_0^d\chi_1'(u)^2\,du\int_{\mathbb{R}^2}v(x)\,dx = -\frac{\kappa_1^2}{\pi}\,\langle v\rangle, \tag{2.10}$$

where we have employed the symbol $\langle v\rangle := \int_{\mathbb{R}^2}v(x)\,dx$.

We note that $\alpha^2 \to \kappa_1^2-$ holds as $\lambda \to 0+$, and consequently, $k_1(\alpha) \to 0+$. Thus $w(0) = 0$ is well defined because $w = (\ln k_1(\alpha))^{-1}$ by definition. Furthermore, the solution $\alpha^2$ clearly represents an eigenvalue if and only if $w$ is strictly negative for $\lambda$ small. A sufficient condition for that is that the first term of the expansion of $w(\lambda)$ is strictly negative; due to (2.10) this happens if $\langle v\rangle$ is strictly positive. Summing up the discussion, we get the announced three-dimensional analogue of Theorem 1.2 in [BGRS]:

Theorem 2.4 Let $\Omega_\lambda$ be given by (2.1), where $v \in C_0^\infty(\mathbb{R}^2)$ satisfies $\langle v\rangle > 0$. Then for all sufficiently small positive $\lambda$, $-\Delta_D^{\Omega_\lambda}$ has a unique eigenvalue $E(\lambda)$ in $[0, \kappa_1^2)$, which is simple and can be expressed as $E(\lambda) = \kappa_1^2 - e^{2w(\lambda)^{-1}}$, where $\lambda \mapsto w(\lambda)$ is an analytic function. Moreover, the following asymptotic expansion is valid:

$$w(\lambda) = -\lambda\,\frac{\kappa_1^2}{\pi}\,\langle v\rangle + O(\lambda^2).$$
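The coefficient in Theorem 2.4 rests on two elementary facts about the transverse modes: the $\chi_j$ are orthonormal on $(0, d)$, and $\int_0^d \chi_1'(u)^2\,du = \kappa_1^2$, which is the integral entering (2.10). A minimal numerical sketch in plain Python follows; the composite Simpson rule and the helper names (`chi`, `dchi`, `simpson`, `overlap`, `dchi_norm_sq`, `w_slope`) are assumptions of this illustration, not notation from the paper.

```python
import math


def chi(j, u, d):
    """Transverse Dirichlet eigenfunction chi_j(u) = sqrt(2/d) sin(kappa_j u)."""
    kappa = math.pi * j / d
    return math.sqrt(2.0 / d) * math.sin(kappa * u)


def dchi(j, u, d):
    """Derivative of chi_j with respect to u."""
    kappa = math.pi * j / d
    return math.sqrt(2.0 / d) * kappa * math.cos(kappa * u)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def overlap(j, k, d):
    """L^2(0, d) scalar product of chi_j and chi_k."""
    return simpson(lambda u: chi(j, u, d) * chi(k, u, d), 0.0, d)


def dchi_norm_sq(j, d):
    """Integral of chi_j'(u)^2 over (0, d); equals kappa_j^2 analytically."""
    return simpson(lambda u: dchi(j, u, d) ** 2, 0.0, d)


def w_slope(d, v_mean):
    """First Taylor coefficient dw/dlambda(0) = -kappa_1^2 <v> / pi from (2.10)."""
    kappa1 = math.pi / d
    return -kappa1 ** 2 * v_mean / math.pi
```

With `v_mean` standing for $\langle v\rangle$, a positive mean gives a negative slope, which is exactly the sign condition under which the solution of $F(\lambda, w) = 0$ represents an eigenvalue.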




3 An alternative method

Now we will derive the weak-coupling expansion by constructing the asymptotics for singularities in a particular boundary value problem. This approach enables us to derive easily higher terms of the expansion. At the same time it allows a unified treatment for different dimensions; in this way we will be able to amend the existing results concerning deformed strips.

First we introduce a unifying notation. Let $n = 2, 3$ be the dimension of the considered deformed region, i.e., the perturbed planar strip or layer, respectively. We set $x = (x_1, \dots, x_{n-1}) \in \mathbb{R}^{n-1}$ and $(x, u) \in \Omega_0 := \mathbb{R}^{n-1}\times(0, d)$ for the unperturbed domain. For technical reasons it is convenient to change the setting slightly, in comparison with (2.1) and [BGRS], [EV3], and to deform the "lower" boundary of $\Omega_0$, which we certainly can do without loss of generality. We denote therefore in this section

$$\Omega_\lambda := \{(x, u) \in \mathbb{R}^n : -\lambda d\,v(x) < u < d\} \tag{3.1}$$

with $v \in C_0^\infty(\mathbb{R}^{n-1})$. We denote by $-\Delta'$ the $(n-1)$-dimensional Laplacian, while $-\Delta$ stands for the $n$-dimensional one. We also use

$$\langle f\rangle := \int_{\mathbb{R}^{n-1}} f(x)\,dx,$$

$\|\cdot\|$ as the norm in $L^2(\mathbb{R}^{n-1})$, and

$$\alpha(m) := \begin{cases} m & \text{if } n = 2 \\ (\ln m)^{-1} & \text{if } n = 3 \end{cases} \qquad\qquad \beta(t) := \begin{cases} t & \text{if } n = 2 \\ \ln t & \text{if } n = 3 \end{cases}$$

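3.1 The asymptotic expansion

Let us now construct the asymptotics of the eigenvalues $m_\lambda$ of the following boundary value problem:

$$(\Delta + \kappa_1^2)\Psi_\lambda = m_\lambda^2\Psi_\lambda \ \text{ in } \Omega_\lambda, \qquad \Psi_\lambda(x, -\lambda d\,v(x)) = \Psi_\lambda(x, d) = 0$$

as they approach zero. We will seek it in the form

$$m_\lambda = \begin{cases} \displaystyle\sum_{i=1}^{\infty}\lambda^i m_i & \text{if } n = 2 \\[2ex] \displaystyle\exp\left\{ -\left( \sum_{i=1}^{\infty}\lambda^i m_i \right)^{-1} \right\} & \text{if } n = 3 \end{cases} \tag{3.2}$$

where the existence of such expansions follows from [BGRS] and Theorem 2.4, respectively. Notice that this corresponds to the expansion of $E(\lambda) = \kappa_1^2 - m_\lambda^2$, the ground-state eigenvalue of $-\Delta_D^{\Omega_\lambda}$ in the problem discussed above, because the mirror transformation of $\Omega_\lambda$ in (3.1) does not affect the spectral properties.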


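Suppose that a function $f \in C_0^\infty(\mathbb{R}^{n-1})$ with $\operatorname{supp} f \cap \operatorname{supp} v = \emptyset$ and $\langle f\rangle \ne 0$ is given. If we manage to construct a solution $\psi_\lambda(x, u; m)$ of the boundary value problem

$$(\Delta + \kappa_1^2)\psi_\lambda = m^2\psi_\lambda + \big(\alpha(m) - \alpha(m_\lambda)\big)f\chi_1 \ \text{ in } \Omega_\lambda, \qquad \psi_\lambda = 0 \ \text{ on } \partial\Omega_\lambda, \tag{3.3}$$

which is bounded and non-vanishing w.r.t. $m$ for small nonzero $m$, then $\Psi_\lambda(x, u) = \psi_\lambda(x, u; m_\lambda)$. We shall look for the asymptotics of $\psi_\lambda$ in the following form,

$$\psi_\lambda(x, u; m) = \sum_{i=0}^{\infty}\lambda^i\psi_i(x, u; m). \tag{3.4}$$

Substituting (3.4) and (3.2) into (3.3), we obtain a family of boundary value problems. For $i = 0$,

$$(\Delta + \kappa_1^2)\psi_0 = m^2\psi_0 + \alpha(m)f\chi_1 \ \text{ in } \Omega_0, \qquad \psi_0 = 0 \ \text{ on } \partial\Omega_0, \tag{3.5}$$

and for $i \ge 1$,

$$\begin{aligned}
(\Delta + \kappa_1^2)\psi_i &= m^2\psi_i + (-1)^{n-1}m_i f\chi_1 && \text{in } \Omega_0, \\
\psi_i &= 0 && \text{if } u = d, \\
\psi_i &= -\sum_{j=1}^{i}\frac{d^j(-v)^j}{j!}\,\frac{\partial^j\psi_{i-j}}{\partial u^j} && \text{if } u = 0.
\end{aligned} \tag{3.6}$$

One can check easily that $\psi_0 = -\alpha(m)(-\Delta' + m^2)^{-1}f\chi_1$ solves (3.5) and has the asymptotics

$$\psi_0(x, u; m) = \frac{(-1)^{n-1}}{2\pi^{n-2}}\,\chi_1(u)\left[ \langle f\rangle + (-1)^{n-1}\alpha(m)\left( \int_{\mathbb{R}^{n-1}}\beta(|x - x'|)f(x')\,dx' + \delta_{n3}(\gamma - \ln 2)\langle f\rangle \right) + O\big(\alpha(m)^2\big) \right] \tag{3.7}$$

as $m \to 0$, where $\gamma$ is the Euler number and $\delta_{nj}$ the Kronecker delta.

Lemma 3.1 Suppose that $F \in C^\infty(\Omega_0)$ with a bounded support and $H \in C_0^\infty(\mathbb{R}^{n-1})$ have the expansions

$$F(x, u; m) = \sum_{i=0}^{\infty}\alpha(m)^i F_i(x, u), \qquad H(x; m) = \sum_{i=0}^{\infty}\alpha(m)^i H_i(x)$$

as $m \to 0$. Define $F_{i,k} := \int_0^d F_i(\cdot, u)\,\chi_k(u)\,du$. Let $\phi_0$ be the solution of the boundary value problem

$$(\Delta + \kappa_1^2)\phi_0 = F_0 \ \text{ in } \Omega_0, \qquad \phi_0 = 0 \ \text{ if } u = d, \qquad \phi_0 = H_0 \ \text{ if } u = 0; \tag{3.8}$$

then the condition

$$\langle F_{0,1}\rangle = \sqrt{\frac{2}{d}}\,\kappa_1\langle H_0\rangle \tag{3.9}$$

is necessary and sufficient for existence of a solution of the boundary value problem

$$(\Delta + \kappa_1^2)\phi = m^2\phi + F \ \text{ in } \Omega_0, \qquad \phi = 0 \ \text{ if } u = d, \qquad \phi = H \ \text{ if } u = 0,$$

which is bounded as $m \to 0$. If it is satisfied, the solution has the asymptotics

$$\phi(x, u; m) = \phi_0(x, u) + \frac{(-1)^{n-1}}{2\pi^{n-2}}\,\chi_1(u)\left\langle F_{1,1} - \sqrt{\frac{2}{d}}\,\kappa_1 H_1 \right\rangle + O(\alpha(m)).$$

Proof. The statement is obvious if $H = 0$. In particular, the solution $\phi$ is constructed by the Fourier method in the explicit form

$$\phi(x, u; m) = \sum_{i=1}^{\infty}\tilde\phi_i(x; m)\,\chi_i(u).$$

By a direct calculation it is easy to see that $\tilde\phi_i$ are bounded functions for $m \ge 0$ so long as $i \ge 2$. The problem arises for $i = 1$, because in general $\tilde\phi_1$ tends to infinity as $m \to 0$. The condition (3.9) guarantees that the explicit solution $\phi$ has no such pole. This proves the sufficiency. To see that the condition is necessary at the same time, one integrates by parts in the scalar product equation $\big(\chi_1, (\Delta + \kappa_1^2 - m^2)\phi\big) = (\chi_1, F)$ and puts $m = 0$ afterwards. In the opposite case, $H \ne 0$, we use the replacement

$$\phi(x, u; m) = \varphi(x, u; m) + \left( 1 - \frac{u}{d} \right)H(x; m)$$

and expand the r.h.s. of the equation for $\varphi$ in the Fourier series, which reduces the task to the previous situation. ✷

Corollary 3.2 $\phi \in C^\infty(Q)$ holds for any bounded domain $Q \subset \Omega_0$.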


It follows from Lemma 3.1 that the recursive system of boundary value problems (3.6) has solutions which are continuous with respect to $m$ in the vicinity of $m = 0$ and decay as $|x| \to \infty$ for $m > 0$, provided the $m_i$'s satisfy the following recursive relations:

$$m_i = (-1)^n\sqrt{\frac{2}{d}}\,\frac{\kappa_1}{\langle f\rangle}\left\langle \sum_{j=1}^{i}\frac{d^j(-v)^j}{j!}\,\frac{\partial^j\psi_{i-j}}{\partial u^j}(\cdot, 0; 0) \right\rangle. \tag{3.10}$$

In particular, owing to (3.7) and Lemma 3.1 we get

$$m_1 = \frac{\kappa_1^2}{\pi^{n-2}}\,\langle v\rangle, \tag{3.11}$$

which agrees with the leading term obtained by the Birman-Schwinger method in the previous section (cf. Theorem 2.4 and (3.2)) as well as with the corresponding result (1.1) in the strip case.
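The agreement just stated can be made concrete by truncating the series in (3.2) after its first term and comparing the resulting gap $\kappa_1^2 - E(\lambda) = m_\lambda^2$ with (1.1) for $n = 2$ and with Theorem 2.4 for $n = 3$. The sketch below does this in plain Python; the function names `m1` and `gap_leading` and the argument `v_mean` (standing for $\langle v\rangle$) are hypothetical, and the truncation itself is an assumption of the illustration: higher-order terms are simply dropped.

```python
import math


def m1(n, d, v_mean):
    """Leading coefficient (3.11): kappa_1^2 <v> / pi^(n-2), for n = 2 or 3."""
    if n not in (2, 3):
        raise ValueError("n must be 2 (strip) or 3 (layer)")
    kappa1 = math.pi / d
    return kappa1 ** 2 * v_mean / math.pi ** (n - 2)


def gap_leading(n, lam, d, v_mean):
    """kappa_1^2 - E(lambda) with the series in (3.2) cut after the first term."""
    s = lam * m1(n, d, v_mean)
    if s <= 0.0:
        raise ValueError("requires lam > 0 and <v> > 0")
    # n = 2: m_lambda = s;  n = 3: m_lambda = exp(-1/s)
    m_lam = s if n == 2 else math.exp(-1.0 / s)
    return m_lam ** 2
```

For the same width and the same $\langle v\rangle$, the layer gap obtained this way is exponentially small in $1/\lambda$, whereas the strip gap is quadratic in $\lambda$, in line with the difference between (1.1) and (1.5).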

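3.2 The next-to-leading order

Let us now calculate $m_2$. By virtue of (3.6), (3.7) and (3.11) the boundary value problem for $\psi_1$ together with the boundary condition for $\psi_2(x, u; 0)$ look as follows:

$$\begin{aligned}
(\Delta + \kappa_1^2)\psi_1 &= m^2\psi_1 + (-1)^{n-1}\frac{\kappa_1^2}{\pi^{n-2}}\,\langle v\rangle f\chi_1 && \text{in } \Omega_0, \\
\psi_1 &= 0 && \text{if } u = d, \\
\psi_1 &= d\,v\,\frac{\partial\psi_0}{\partial u} && \text{if } u = 0, \\
\psi_2 &= 0 && \text{if } u = d, \\
\psi_2 &= d\,v\,\frac{\partial\psi_1}{\partial u} && \text{if } u = 0,\ m = 0,
\end{aligned} \tag{3.12}$$

with

$$\frac{\partial\psi_0}{\partial u}(x, 0; m) = \frac{(-1)^{n-1}}{2\pi^{n-2}}\sqrt{\frac{2}{d}}\,\kappa_1 B(f), \tag{3.13}$$

where $B(f)$ is the square bracket from (3.7). Hence

$$m_2 = (-1)^{n-1}\sqrt{\frac{2}{d}}\,\frac{\kappa_1 d}{\langle f\rangle}\left\langle v\,\frac{\partial\psi_1}{\partial u}(\cdot, 0; 0) \right\rangle \tag{3.14}$$

and it is sufficient to find $\psi_1$. With eq. (3.12) and Lemma 3.1 in mind, we consider the following boundary value problem

$$\begin{aligned}
(\Delta + \kappa_1^2)\phi_0 &= (-1)^{n-1}\frac{\kappa_1^2}{\pi^{n-2}}\,\langle v\rangle f\chi_1 && \text{in } \Omega_0, \\
\phi_0 &= 0 && \text{if } u = d, \\
\phi_0 &= \frac{(-1)^{n-1}}{2\pi^{n-2}}\sqrt{\frac{2}{d}}\,\kappa_1 d\,v\,\langle f\rangle && \text{if } u = 0,
\end{aligned} \tag{3.15}$$

and seek $\phi_0$ in the form

$$\phi_0(x, u) = \frac{(-1)^{n-1}}{2\pi^{n-2}}\sqrt{\frac{2}{d}}\,\kappa_1\left[ \left( 1 - \frac{u}{d} \right)\langle f\rangle\,d\,v(x) + \varphi(x, u) \right]; \tag{3.16}$$

substituting it into (3.15), we arrive at the boundary value problem

$$(\Delta + \kappa_1^2)\varphi = -d\,\langle f\rangle\left( 1 - \frac{u}{d} \right)(\Delta' + \kappa_1^2)v + 2\kappa_1\sqrt{\frac{d}{2}}\,\langle v\rangle f\chi_1 \ \text{ in } \Omega_0, \qquad \varphi = 0 \ \text{ on } \partial\Omega_0.$$

The Fourier method, together with the expansion $1 - \frac{u}{d} = \sqrt{\frac{2}{d}}\sum_{k=1}^{\infty}\frac{\chi_k(u)}{\kappa_k}$, gives

$$\varphi = -\sqrt{\frac{2}{d}}\,d\,\langle f\rangle\sum_{k=2}^{\infty}\frac{\chi_k}{\kappa_k}\,(-\Delta' + \kappa_k^2 - \kappa_1^2)^{-1}(-\Delta' - \kappa_1^2)\,v - \sqrt{\frac{2}{d}}\,d\,\frac{\chi_1}{\kappa_1}\left[ \langle f\rangle v + \kappa_1^2(-\Delta')^{-1}\big( \langle v\rangle f - \langle f\rangle v \big) \right].$$

Lemma 3.1 and relations (3.12), (3.13), (3.15), and (3.16) together with the last result imply that

$$\begin{aligned}
\frac{\partial\psi_1}{\partial u}(x, 0; 0) = \frac{(-1)^n}{2\pi^{n-2}}\sqrt{\frac{2}{d}}\,\kappa_1 \times \Bigg\{ & \frac{\kappa_1^2}{\pi^{n-2}}\int_{\mathbb{R}^{n-1}\times\mathbb{R}^{n-1}} v(y)\,\beta(|y - y'|)\,f(y')\,dy\,dy' \\
& + \langle f\rangle\left( 3v(x) + 2\sum_{k=2}^{\infty}\left[ (-\Delta' + \kappa_k^2 - \kappa_1^2)^{-1}(-\Delta' - \kappa_1^2)\,v \right](x) \right) \\
& + \frac{\kappa_1^2}{\pi^{n-2}}\left( \langle f\rangle\int_{\mathbb{R}^{n-1}}\beta(|x - x'|)\,v(x')\,dx' - \langle v\rangle\int_{\mathbb{R}^{n-1}}\beta(|x - x'|)\,f(x')\,dx' \right) \\
& + \delta_{n3}\,\frac{\kappa_1^2}{\pi}\,(\gamma - \ln 2)\,\langle v\rangle\langle f\rangle \Bigg\},
\end{aligned}$$

where we have employed also the implication

$$\langle F\rangle = 0 \ \Longrightarrow\ (-\Delta')^{-1}F = \frac{-1}{2\pi^{n-2}}\int_{\mathbb{R}^{n-1}}\beta(|\cdot - x'|)\,F(x')\,dx'.$$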


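Substituting this into (3.14) we get the sought coefficient:

$$m_2 = -\frac{\kappa_1^2}{\pi^{n-2}}\Bigg\{ 3\|v\|^2 + \frac{\kappa_1^2}{\pi^{n-2}}\int_{\mathbb{R}^{n-1}\times\mathbb{R}^{n-1}} v(x)\,\beta(|x - x'|)\,v(x')\,dx\,dx' + 2\sum_{k=2}^{\infty}\left( v,\ (-\Delta' + \kappa_k^2 - \kappa_1^2)^{-1}(-\Delta' - \kappa_1^2)\,v \right) + \delta_{n3}\,\frac{\kappa_1^2}{\pi}\,(\gamma - \ln 2)\,\langle v\rangle^2 \Bigg\}. \tag{3.17}$$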

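3.3 The critical case

As we have pointed out in the introduction, the above result is most interesting in the critical case, $\langle v\rangle = 0$, when the first coefficient (3.11) equals zero and $m_2$ given by (3.17) determines the leading order. In this situation we have the following result.

Theorem 3.3 Let $V \in C_0^\infty(\mathbb{R}^{n-1})$ be an arbitrary function such that $\langle V\rangle = 0$ and

$$v(x) = V\left( \frac{x}{\sigma} \right), \qquad \sigma > 0.$$

Then the following inequalities hold,

$$-\frac{\kappa_1^2\sigma^{n-1}}{\pi^{n-2}}\left( \frac{9}{2}\|V\|^2 + \frac{3}{2\kappa_1^2\sigma^2}\,\|V\|\,\|\Delta' V\| - 2\kappa_1^2\sigma^2\|\nabla'(\Delta')^{-1}V\|^2 \right) \le m_2 \le -\frac{\kappa_1^2\sigma^{n-1}}{\pi^{n-2}}\left( \frac{3}{2}\|V\|^2 - 2\kappa_1^2\sigma^2\|\nabla'(\Delta')^{-1}V\|^2 \right).$$

Proof. In the first place, note that $\langle V\rangle = 0$ implies

$$-\frac{\kappa_1^2}{\pi^{n-2}}\int_{\mathbb{R}^{n-1}\times\mathbb{R}^{n-1}} V(x)\,\beta(|x - x'|)\,V(x')\,dx\,dx' = 2\kappa_1^2\,\|\nabla'(\Delta')^{-1}V\|^2 > 0,$$

because $\Delta'\beta(|x|) = 2\pi^{n-2}\delta(x)$ holds in the sense of distributions. Under the stated assumptions, the formula (3.17) therefore yields

$$m_2 = -\frac{\kappa_1^2\sigma^{n-1}}{\pi^{n-2}}\left( 3\|V\|^2 - 2\kappa_1^2\sigma^2\|\nabla'(\Delta')^{-1}V\|^2 + 2A(\sigma) \right),$$

where

$$A(\sigma) := \sum_{k=2}^{\infty}\left( V,\ \big( -\Delta' + (\kappa_k^2 - \kappa_1^2)\sigma^2 \big)^{-1}(-\Delta' - \kappa_1^2\sigma^2)\,V \right),$$

and it suffices to find suitable bounds on $A(\sigma)$.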




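Since the Fourier transformation together with the Plancherel theorem give the estimate

$$\left\| \big( -\Delta' + (\kappa_k^2 - \kappa_1^2)\sigma^2 \big)^{-1}F \right\| \le \frac{\|F\|}{(\kappa_k^2 - \kappa_1^2)\sigma^2}, \tag{3.18}$$

we obtain the upper bound

$$A(\sigma) \le \frac{3}{4}\left( \|V\|^2 + \frac{1}{\kappa_1^2\sigma^2}\,\|V\|\,\|\Delta' V\| \right),$$

where the numerical factor comes from $\sum_{k=2}^{\infty}(k^2 - 1)^{-1} = \frac{3}{4}$. On the other hand, denoting

$$U_k(x; \sigma) := \left[ \big( -\Delta' + (\kappa_k^2 - \kappa_1^2)\sigma^2 \big)^{-1}V \right](x),$$

we see that

$$\left( V,\ \big( -\Delta' + (\kappa_k^2 - \kappa_1^2)\sigma^2 \big)^{-1}(-\Delta' - \kappa_1^2\sigma^2)\,V \right) = \left( \big( -\Delta' + (\kappa_k^2 - \kappa_1^2)\sigma^2 \big)U_k,\ (-\Delta' - \kappa_1^2\sigma^2)\,U_k \right).$$

Integrating the r.h.s. by parts and using (3.18), we get the lower bound

$$\begin{aligned}
A(\sigma) &= \sum_{k=2}^{\infty}\left( \|\Delta' U_k\|^2 + \kappa_1^2(k^2 - 2)\sigma^2\|\nabla' U_k\|^2 - \kappa_1^4(k^2 - 1)\sigma^4\|U_k\|^2 \right) \\
&> -\sum_{k=2}^{\infty}\kappa_1^4(k^2 - 1)\sigma^4\|U_k\|^2 \ \ge\ -\|V\|^2\sum_{k=2}^{\infty}\frac{1}{k^2 - 1} = -\frac{3}{4}\|V\|^2,
\end{aligned}$$

which concludes the proof. ✷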
This theorem confirms the spectral picture we got from (1.2) and (1.3). More specifically, m2 > 0 as σ → ∞ so the critical weakly bound state exists for sufficiently smeared deformations, and vice versa. In contrast to (1.2) and (1.3), however, we are able now to tell from (3.17) for any given zero-mean v the sign of m2 .
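The numerical factor $3/4$ used in both bounds on $A(\sigma)$ in the proof of Theorem 3.3 comes from a telescoping series: since $\frac{1}{k^2-1} = \frac{1}{2}\left(\frac{1}{k-1} - \frac{1}{k+1}\right)$, the partial sum up to $k = N$ equals $\frac{3}{4} - \frac{2N+1}{2N(N+1)}$. A short plain-Python sketch of this arithmetic follows; the function names are hypothetical and serve only this illustration.

```python
def partial_sum(n_max):
    """Sum of 1/(k^2 - 1) for k = 2, ..., n_max."""
    return sum(1.0 / (k * k - 1) for k in range(2, n_max + 1))


def closed_form(n_max):
    """Telescoped value of the same partial sum: 3/4 - (2N + 1) / (2N(N + 1))."""
    return 0.75 - (2 * n_max + 1) / (2.0 * n_max * (n_max + 1))
```

The remainder behaves like $1/N$, so the partial sums approach $3/4$ from below rather slowly; this does not affect the proof, which only uses the value of the full series.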

Acknowledgment

R.G. is grateful for the hospitality extended to him at NPI AS where a part of this work was done. The research has been partially supported by GA AS and the Czech Ministry of Education under the contracts 1048801 and ME170. The first and the third authors have been partially supported by Russian Fund of Basic Research – Grants 99-01-00139 and 99-01-01143, respectively.


References

[AS] M.S. Abramowitz, I.A. Stegun, eds., Handbook of Mathematical Functions, Dover, New York (1965).

[AGH] S. Albeverio, F. Gesztesy, R. Høegh-Krohn, H. Holden, Solvable Models in Quantum Mechanics, Springer, Heidelberg (1988).

[BGRS] W. Bulla, F. Gesztesy, W. Renger, B. Simon, Weakly Coupled Bound States in Quantum Waveguides, Proc. Amer. Math. Soc. 127, 1487–1495 (1997).

[DE] P. Duclos, P. Exner, Curvature-Induced Bound States in Quantum Waveguides in Two and Three Dimensions, Rev. Math. Phys. 7, 73–102 (1995).

[DEK] P. Duclos, P. Exner, D. Krejčiřík, Locally Curved Quantum Layers, Ukrainian J. Phys. 45, 595–601 (2000).

[EK] P. Exner, D. Krejčiřík, Waveguides Coupled Through a Semitransparent Barrier: a Birman-Schwinger Analysis, Rev. Math. Phys. 13, 307–334 (2001).

[EV1] P. Exner, S.A. Vugalter, Asymptotic Estimates for Bound States in Quantum Waveguides Coupled Laterally Through a Narrow Window, Ann. Inst. H. Poincaré: Phys. théor. 65, 109–123 (1996).

[EV2] P. Exner, S.A. Vugalter, Bound-State Asymptotic Estimates for Window-Coupled Dirichlet Strips and Layers, J. Phys. A30, 7863–7878 (1997).

[EV3] P. Exner, S.A. Vugalter, Bound States in a Locally Deformed Waveguide: the Critical Case, Lett. Math. Phys. 39, 59–68 (1997).

[Ga] R.R. Gadyl'shin, Surface Potentials and the Method of Matching Asymptotic Expansions in the Problem of the Helmholtz Resonator, Algebra i Analiz 4, 88–115 (1992); English transl. in St. Petersburg Math. J. 4, 273–296 (1993).

[Il] A.M. Il'in, Matching of Asymptotic Expansions of Solutions of Boundary Value Problems, Nauka, Moscow (1989); English transl., Amer. Math. Soc., Providence, RI (1992).

[Mad] I.J. Maddox, Elements of Functional Analysis, Cambridge Univ. Press (1970).

[Po] I.Yu. Popov, Asymptotics for Bound State for Laterally Coupled Waveguides, Rep. Math. Phys. 4, 88–115 (1992).

[RS] M. Reed, B. Simon, Methods of Modern Mathematical Physics, IV. Analysis of Operators, Academic Press, New York (1978).



D. Borisov and R. Gadyl’shin
Bashkir State Pedagogical University
October Revolution St. 3a
RU-450000 Ufa, Russia
email: [email protected]
email: [email protected]

P. Exner and D. Krejčiřík
Department of Theoretical Physics
Nuclear Physics Institute
Academy of Sciences
CZ-25068 Řež, Czech Republic
email: [email protected]
email: [email protected]

Communicated by Gian Michele Graf
submitted 11/10/00, accepted 23/11/00





Lattice Points, Perturbation Theory and the Periodic Polyharmonic Operator Leonid Parnovski, Alexander V. Sobolev

1 Introduction

Consider the polyharmonic operator acting in $L^2(\mathbb{R}^d)$, perturbed by a real-valued periodic function:

\[ H = H_0 + V, \quad H_0 = (-\Delta)^l, \quad l > 0. \tag{1.1} \]

The spectrum of H is formed from closed intervals (spectral bands), possibly separated by gaps (see [6], [10]). We shall concentrate on one aspect of this structure, known as the Bethe-Sommerfeld conjecture, which states that the number of spectral gaps is finite. This hypothesis was put forward by H. Bethe and A. Sommerfeld for the Schrödinger operator in dimension three, i.e. for l = 1, d = 3. Ever since, the case l = 1 has been the subject of intensive study by a number of authors, which led to the justification of the conjecture for d = 2 in [9], [1], for d = 3 in [13] and for d = 2, 3, 4 in [2]. In dimensions d ≥ 5 the problem was solved only for rational lattices of periods (see [12]). For arbitrary l the number of gaps was shown to be finite for 2l > d, d ≥ 3 in [11], [12]. Later, in [3] (see also [4]), these conditions were relaxed to 4l > d + 1, d ≥ 2. In our recent paper [7] we prove the conjecture for 6l > d + 2, d ≥ 2. The aim of the present paper is to loosen the condition from [7] further. Namely, we show that the number of gaps in the spectrum of H is finite if 8l > d + 3, d ≥ 2. In the physically most relevant case l = 1 (i.e. for the Schrödinger operator), this requirement is fulfilled for d = 2, 3 or 4. These are exactly the dimensions for which the conjecture was justified in the papers cited above. However, our method has the considerable advantage that it relies only on elementary perturbation theoretic arguments and treats all dimensions d and exponents l satisfying 8l > d + 3 in a unified fashion.
In connection with this, it is appropriate to note that the study of the polyharmonic operator with an arbitrary l > 0 (rather than with l = 1 only) is useful and instructive as it allows one to understand better the mechanisms responsible for the quantitative characteristics of the spectrum, and to find out how far one can push the perturbation theoretic argument in its investigation. Our approach follows the plan of [7] and comprises two main ingredients: 1. Number-theoretic estimates, more precisely, estimates on the number of lattice points inside a ball of a large radius;


2. An estimate on the difference between the counting functions of the perturbed and unperturbed problems.

All the necessary number-theoretic facts were obtained in the previous article [7] and are used here without any modifications (see Proposition 3.1). In contrast, for ingredient 2 we now rely on a bound (see Proposition 3.2), borrowed from [5], which is more precise than the corresponding bound established in [7]. This modification enabled us to improve the sufficient condition for the validity of the Bethe-Sommerfeld conjecture from 6l > d + 2 to 8l > d + 3. Before we learnt about the existence of paper [5] we established an alternative version of Proposition 3.2, which required the condition $V \in C^\infty$, a more restrictive assumption than that of [5]. This version can be found in [8].

Notation. By bold lowercase letters we denote vectors in $\mathbb{R}^d$ and $\mathbb{Z}^d$, e.g. $\mathbf{x} \in \mathbb{R}^d$, $\mathbf{m} \in \mathbb{Z}^d$. Bold uppercase letters $\mathbf{G}$, $\mathbf{F}$ are used for d × d constant positive definite matrices. The notations $\mathbf{a}\mathbf{b}$ and $\mathbf{a}\mathbf{G}\mathbf{b}$ stand for the scalar product in $\mathbb{R}^d$ and the bilinear form of the matrix $\mathbf{G}$ respectively. For any function $f \in L^1(\mathcal{O})$, $\mathcal{O} = [0, 2\pi)^d$, the Fourier transform is defined as follows:

\[ \hat f(\mathbf{m}) = \frac{1}{(2\pi)^{d/2}} \int_{\mathcal{O}} e^{-i\mathbf{m}\mathbf{x}} f(\mathbf{x})\,d\mathbf{x}. \]

Throughout the paper we also use the following notation:

\[ \delta = \delta_d = \begin{cases} 0, & d \neq 1 \pmod 4; \\ \text{arbitrary positive number}, & d = 1 \pmod 4. \end{cases} \tag{1.2} \]

By C and c (with or without indices) we denote various positive constants whose precise value is unimportant.

2 Main result and preliminaries

2.1 Notation and main result

Using a linear change of coordinates, (1.1) can be transformed to the following form:

\[ H = H_0 + V, \quad H_0 = H_0^{(l)} = (\mathbf{D}\mathbf{G}\mathbf{D})^l, \quad \mathbf{D} = -i\nabla, \]

where $\mathbf{G}$ is a constant positive-definite d × d matrix, and V is a bounded real-valued function periodic with respect to the cubic lattice $\Gamma = (2\pi\mathbb{Z})^d$. As V is bounded, the operator H is self-adjoint on the domain $D(H_0) = H^{2l}(\mathbb{R}^d)$. We use the following notation for the fundamental domains of the lattice $\Gamma$ and its dual lattice $\Gamma^\dagger = \mathbb{Z}^d$:

\[ \mathcal{O} = [0, 2\pi)^d, \quad \mathcal{O}^\dagger = [0, 1)^d. \]


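Let us also introduce the torus $\mathbb{T}^d = \mathbb{R}^d/\Gamma$. To describe the spectrum of H we use the Floquet decomposition of the operator H (see [10]). We identify the space $L^2(\mathbb{R}^d)$ with the direct integral

\[ \mathfrak{G} = \int_{\mathcal{O}^\dagger} \mathfrak{H}\,d\mathbf{k}, \quad \mathfrak{H} = L^2(\mathcal{O}). \]

The identification is implemented by the Gelfand transform

\[ (Uu)(\mathbf{x}, \mathbf{k}) = e^{-i\mathbf{k}\mathbf{x}} \sum_{\mathbf{m}\in\mathbb{Z}^d} e^{-i2\pi\mathbf{k}\mathbf{m}}\, u(\mathbf{x} + 2\pi\mathbf{m}), \quad \mathbf{k} \in \mathbb{R}^d, \]

which is initially defined on functions from the Schwartz class and extends by continuity to a unitary mapping from $L^2(\mathbb{R}^d)$ onto $\mathfrak{G}$. It is readily seen that

\[ (U H_0 U^{-1} u)(\,\cdot\,, \mathbf{k}) = H_0(\mathbf{k})\,u(\,\cdot\,, \mathbf{k}), \quad H_0(\mathbf{k}) = \big((\mathbf{D} + \mathbf{k})\mathbf{G}(\mathbf{D} + \mathbf{k})\big)^l, \quad \mathbf{k} \in \mathbb{R}^d, \]

with the domain $D(H_0(\mathbf{k})) = H^{2l}(\mathbb{T}^d)$. The family $H(\mathbf{k}) = H_0(\mathbf{k}) + V$ realises the decomposition of H in the direct integral:

\[ U H U^{-1} = \int_{\mathcal{O}^\dagger} H(\mathbf{k})\,d\mathbf{k}. \]

The spectra of all $H(\mathbf{k})$ consist of discrete eigenvalues $\lambda_j(\mathbf{k})$, j = 1, 2, . . . , that we arrange in non-decreasing order counting multiplicity. It is clear that $\lambda_j(\,\cdot\,)$ are continuous functions of $\mathbf{k}$. The images

\[ \ell_j = \bigcup_{\mathbf{k}\in\mathcal{O}^\dagger} \lambda_j(\mathbf{k}) \]

of the functions $\lambda_j$ are called spectral bands. The spectrum of the initial operator H has the following representation:

\[ \sigma(H) = \bigcup_j \ell_j. \]

The bands with distinct numbers may overlap. To characterise this overlapping we introduce the function $m(\lambda) = m(\lambda, V)$ called the multiplicity of overlapping, which is equal to the number of bands containing a given point $\lambda \in \mathbb{R}$:

\[ m(\lambda) = \#\{j : \lambda \in \ell_j\}; \]

and the overlapping function $\zeta(\lambda) = \zeta(\lambda, V)$, $\lambda \in \mathbb{R}$, defined as the maximal number t such that the symmetric interval $[\lambda - t, \lambda + t]$ is entirely contained in one of the bands $\ell_j$:

\[ \zeta(\lambda) = \max_j \max\{t : [\lambda - t, \lambda + t] \subset \ell_j\}. \]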




These two quantities were first introduced by M. Skriganov (see e.g. [12]). It is easy to see that $\zeta$ is a continuous function of $\lambda \in \mathbb{R}$. To state the main result we have to impose additional smoothness conditions on the potential V. It will be convenient to formulate them in terms of the Fourier coefficients $\hat V(\theta)$ of the potential V. We shall assume that

\[ \sum_{\theta\in\mathbb{Z}^d} |\hat V(\theta)|\,|\theta|^{\nu} < \infty, \tag{2.1} \]

with

\[ \nu > \begin{cases} (d-1)/2, & d \ge 3; \\ 2(l+1)/3, & d = 2. \end{cases} \tag{2.2} \]

This condition is exactly the same as in Section 1 of [5]. The main results of the paper are stated in the following theorem. Recall that the parameter $\delta = \delta_d$ used in the Theorem is defined in (1.2).

Theorem 2.1. Let l > 0, d ≥ 2 and let $V \in C^\infty(\mathbb{T}^d)$ be a real-valued function satisfying the conditions (2.1), (2.2). Suppose that 8l > d + 3. Then there is a number $\lambda_l = \lambda_l(V, \delta) \in \mathbb{R}$ such that

\[ m(\lambda) \ge c_0 \lambda^{\frac{d-1}{4l}-\delta}, \quad \zeta(\lambda) \ge c_0 \lambda^{1-\frac{d+1}{4l}-\delta} \tag{2.3} \]

for all $\lambda \ge \lambda_l$ with a constant $c_0$ independent of V.

Clearly, this Theorem implies the validity of the Bethe-Sommerfeld conjecture. The proof of Theorem 2.1 exploits the connection between the functions $m(\lambda)$, $\zeta(\lambda)$ and the counting functions

\[ N(\lambda; H(\mathbf{k})) = \sum_{\lambda_j(\mathbf{k}) \le \lambda} 1, \quad n(\lambda; H(\mathbf{k})) = \sum_{\lambda_j(\mathbf{k}) < \lambda} 1. \]

Denote […]

For any ρ > 0 let $E(\rho) = E(\rho, \mathbf{F}) \subset \mathbb{R}^d$ be the ellipsoid $\{\xi \in \mathbb{R}^d : |\mathbf{F}\xi| \le \rho\}$, $\mathbf{F} = \mathbf{G}^{1/2}$. There is a very simple connection between integer points in the ellipsoid and the eigenvalues of the unperturbed problem. Indeed, the eigenvalues of the operator $H_0(\mathbf{k})$ equal $|\mathbf{F}(\mathbf{m} + \mathbf{k})|^{2l}$, $\mathbf{m} \in \mathbb{Z}^d$, which ensures that for all ρ ≥ 0

\[ \begin{cases} N(\rho^{2l}; H_0(\mathbf{k})) = \#(\mathbf{k}; E(\rho)), \\ \langle N(\rho^{2l}; H_0) \rangle = \langle \#(E(\rho)) \rangle = w_d\,\rho^d, \end{cases} \tag{3.1} \]

where

\[ w_d = \frac{K_d}{\sqrt{\det \mathbf{G}}}, \quad K_d = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}, \]

$K_d$ being the volume of the unit ball in $\mathbb{R}^d$. We are interested in the lower bound for the deviation of the function $N(\rho^{2l}; H_0(\mathbf{k}))$ from the volume $w_d\rho^d$ of the ellipsoid $E(\rho)$ as ρ → ∞:




Proposition 3.1. Let the number δ be as defined in (1.2). Then for all sufficiently large ρ the following estimate holds:

\[ \big\langle\, |\#(E(\rho)) - w_d\rho^d|\, \big\rangle \ge C\rho^{\frac{d-1}{2}-\delta}, \]

with a constant C = C(d, G, δ).

Note that we do not need any upper bound on the l.h.s. of the above inequality. We refer to [7] for the proof and discussion of Proposition 3.1.
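The objects entering Proposition 3.1 can be explored numerically in the simplest case. The sketch below is an illustration under our own assumptions (d = 2, G the identity matrix, so that the ellipsoid is a disc of area πρ²); the function names and the midpoint grid used to average over the shift k in [0, 1)² are hypothetical choices, not taken from the paper. It counts the points m of Z² with |m + k| ≤ ρ, averages the count over k, and measures the mean absolute deviation from the area.

```python
import math


def lattice_count(rho, k=(0.0, 0.0)):
    """Number of m in Z^2 with |m + k| <= rho (d = 2, G = identity)."""
    reach = int(math.ceil(rho)) + 1  # |m_i| <= rho + |k_i| < rho + 1
    count = 0
    for m1 in range(-reach, reach + 1):
        for m2 in range(-reach, reach + 1):
            if (m1 + k[0]) ** 2 + (m2 + k[1]) ** 2 <= rho * rho:
                count += 1
    return count


def shifts(n):
    """Midpoint grid of n*n shifts k in [0, 1)^2 used for averaging."""
    return [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]


def averaged_count(rho, n=20):
    """Grid approximation of the k-average of the count."""
    ks = shifts(n)
    return sum(lattice_count(rho, k) for k in ks) / len(ks)


def mean_abs_deviation(rho, n=20):
    """Grid approximation of the k-average of |count - area|."""
    area = math.pi * rho * rho
    ks = shifts(n)
    return sum(abs(lattice_count(rho, k) - area) for k in ks) / len(ks)


if __name__ == "__main__":
    for rho in (3.0, 6.0, 12.0):
        print(rho, averaged_count(rho), math.pi * rho ** 2, mean_abs_deviation(rho))
```

The averaged count approximates the area, in line with (3.1), while the mean absolute deviation is the quantity that Proposition 3.1 bounds from below. A coarse grid only approximates the average over k, so the printed numbers are indicative rather than exact.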

3.2 An estimate for the counting function $N(\rho^{2l}; H(\mathbf{k}))$

As in [7], the second crucial ingredient of the proof is an estimate for the deviation of $N(\lambda; H(\mathbf{k}))$ from the unperturbed counting function $N(\lambda; H_0(\mathbf{k}))$, averaged in $\mathbf{k} \in \mathcal{O}^\dagger$. In contrast to [7], we use a more precise estimate established in [5]:

Proposition 3.2. Let d ≥ 2, 2l > 1. Suppose that the potential satisfies the conditions (2.1), (2.2). Then

\[ \big\langle\, |N(\rho^{2l}; H) - N(\rho^{2l}; H_0)|\, \big\rangle \le C\rho^{d+1-4l}\ln\rho, \tag{3.2} \]

for sufficiently large ρ.

The bound (3.2) was derived in [5] as an intermediate result for obtaining the corresponding estimate for the integrated density of states $D(\rho^{2l}; H) = \langle N(\rho^{2l}; H) \rangle$. Indeed, by (3.1), the unperturbed density of states $D(\rho^{2l}; H_0)$ coincides with $w_d\rho^d$, so that (3.2) leads to

\[ D(\rho^{2l}; H) = w_d\rho^d + \rho^{d+1-4l}\,O(\ln\rho), \quad \rho \to \infty. \]

For l = 1 and $V \in C^\infty(\mathbb{T}^d)$ a similar estimate with the remainder $O(\rho^{d-3+\eta})$ with arbitrary η > 0 was proved in [2] for all d ≥ 2. Note also that for $V \in C^\infty(\mathbb{T}^d)$ and arbitrary l > 1/2 an estimate similar to (3.2) with the remainder $O(\rho^{d+1-4l+\eta})$ with arbitrary η > 0 was found in [8].

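3.3 Proof of Theorem 2.1

Observe that under the condition 8l > d + 3 we have d + 1 − 4l < (d − 1)/2. Therefore Proposition 3.2 and (3.1) give the equalities

\[ \lim \rho^{-\beta} \big\langle\, |N(\rho^{2l}; H) - N(\rho^{2l}; H_0)|\, \big\rangle = 0, \tag{3.3} \]

\[ \lim \rho^{-\beta} \big| \langle N(\rho^{2l}; H) \rangle - w_d\rho^d \big| = 0, \tag{3.4} \]

as ρ → ∞, for β = (d − 1)/2 − δ with a sufficiently small δ (see (1.2) for the definition of δ). Note that

\[ \big\langle\, |N(\lambda; H) - \langle N(\lambda; H) \rangle|\, \big\rangle \ge \big\langle\, |N(\lambda; H_0) - w_d\rho^d|\, \big\rangle - \big\langle\, |N(\lambda; H) - N(\lambda; H_0)|\, \big\rangle - \big| \langle N(\lambda; H) \rangle - w_d\rho^d \big|, \quad \lambda = \rho^{2l}. \]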
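The equivalence between the hypothesis 8l > d + 3 and the exponent inequality d + 1 − 4l < (d − 1)/2 is elementary: the gap (d − 1)/2 − (d + 1 − 4l) equals (8l − d − 3)/2. The short sketch below, with helper names of our own choosing, checks this with exact rational arithmetic and lists the dimensions admitted for a given l.

```python
from fractions import Fraction


def condition_holds(l, d):
    """The hypothesis 8l > d + 3 of Theorem 2.1."""
    return 8 * Fraction(l) > d + 3


def exponent_gap(l, d):
    """(d - 1)/2 - (d + 1 - 4l); positive exactly when 8l > d + 3."""
    return Fraction(d - 1, 2) - (d + 1 - 4 * Fraction(l))


def admissible_dimensions(l, d_max=12):
    """Dimensions 2 <= d <= d_max for which 8l > d + 3."""
    return [d for d in range(2, d_max + 1) if condition_holds(l, d)]


if __name__ == "__main__":
    print(admissible_dimensions(1))             # Schrodinger case l = 1
    print(admissible_dimensions(2, d_max=20))   # biharmonic case l = 2
```

For l = 1 the admissible dimensions are d = 2, 3, 4, as stated in the Introduction.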


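Now, using Proposition 3.1 and (3.1) for the first term in the r.h.s., and the relations (3.3) and (3.4) for the remaining terms, we obtain that

\[ \big\langle\, |N(\rho^{2l}; H) - \langle N(\rho^{2l}; H) \rangle|\, \big\rangle \ge c\rho^{\beta}, \]

for all ρ ≥ ρ_0 with a sufficiently large ρ_0 > 0. Noticing that the function in the l.h.s. is of average zero, we see that

\[ \max_{\mathbf{k}} N(\rho^{2l}; H(\mathbf{k})) \ge \langle N(\rho^{2l}; H) \rangle + c\rho^{\beta}, \quad \min_{\mathbf{k}} N(\rho^{2l}; H(\mathbf{k})) \le \langle N(\rho^{2l}; H) \rangle - c\rho^{\beta}, \]

which implies, in view of (3.4), that

\[ \begin{cases} \max_{\mathbf{k}} N(\rho^{2l}; H(\mathbf{k})) \ge w_d\rho^d + c\rho^{\beta}, \\ \min_{\mathbf{k}} N(\rho^{2l}; H(\mathbf{k})) \le w_d\rho^d - c\rho^{\beta}, \end{cases} \tag{3.5} \]

for ρ ≥ ρ_0. According to (3.5) for all non-negative $t \le \rho^{2l}/2$ we have

\[ N_+(\rho^{2l} - t) \ge w_d(\rho^{2l} - t)^{\frac{d}{2l}} + C(\rho^{2l} - t)^{\frac{\beta}{2l}} \ge w_d\rho^d + C\rho^{\beta} - ct\rho^{d-2l}, \quad \forall \rho \ge 2\rho_0. \]

Similarly,

\[ N_-(\rho^{2l} + t) \le w_d(\rho^{2l} + t)^{\frac{d}{2l}} - C(\rho^{2l} + t)^{\frac{\beta}{2l}} \le w_d\rho^d - C\rho^{\beta} + ct\rho^{d-2l}, \quad \forall \rho \ge \rho_0. \]

Now one concludes from (2.5) that

\[ m(\rho^{2l}) \ge N_+(\rho^{2l}) - N_-(\rho^{2l}) \ge 2C\rho^{\beta}, \quad \forall \rho \ge 2\rho_0, \]

and hence

\[ m(\lambda) \ge c\lambda^{\frac{\beta}{2l}}, \]

which yields (2.3) for all $\lambda \ge \lambda_l = (2\rho_0)^{2l}$. This completes the proof of the lower bound for $m(\lambda)$. To estimate $\zeta(\lambda)$ write

\[ N_+(\rho^{2l} - t) - N_-(\rho^{2l} + t) \ge 2C\rho^{\beta} - 2ct\rho^{d-2l}. \]

From the formula (2.4) one can now infer (2.3) for $\zeta(\lambda)$, $\lambda \ge (2\rho_0)^{2l}$. Theorem 2.1 is proved.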




References

[1] B.E.J. Dahlberg, E. Trubowitz, A remark on two dimensional periodic potentials, Comment. Math. Helvetici 57 (1982), 130–134.

[2] B. Helffer, A. Mohamed, Asymptotics of the density of states for the Schrödinger operator with periodic electric potential, Duke Math. J. 92 (1998), 1–60.

[3] Yu. E. Karpeshina, Analytic Perturbation Theory for a Periodic Potential, Izv. Akad. Nauk SSSR Ser. Mat. 53 (1989), No 1, 45–65; English transl.: Math. USSR Izv. 34 (1990), No 1, 43–63.

[4] Yu. E. Karpeshina, Perturbation theory for the Schrödinger operator with a periodic potential, Lecture Notes in Math. vol. 1663, Springer, Berlin (1997).

[5] Yu. E. Karpeshina, On the density of states for the periodic Schrödinger operator, Ark. Mat. 38, 111–137 (2000).

[6] P. Kuchment, Floquet theory for partial differential equations, Birkhäuser, Basel (1993).

[7] L. Parnovski, A.V. Sobolev, On the Bethe-Sommerfeld conjecture for the polyharmonic operator, to appear in Duke Math. J.

[8] L. Parnovski, A.V. Sobolev, Perturbation theory and the Bethe-Sommerfeld conjecture, Research report No 2000-05, University of Sussex (2000).

[9] V.N. Popov, M. Skriganov, A remark on the spectral structure of the two dimensional Schrödinger operator with a periodic potential, Zap. Nauchn. Sem. LOMI AN SSSR 109, 131–133 (Russian) (1981).

[10] M. Reed, B. Simon, Methods of modern mathematical physics, IV, Academic Press, New York (1975).

[11] M. Skriganov, Finiteness of the number of gaps in the spectrum of the multidimensional polyharmonic operator with a periodic potential, Mat. Sb. 113 (155), 131–145 (1980); Engl. transl.: Math. USSR Sb. 41 (1982).

[12] M. Skriganov, Geometrical and arithmetical methods in the spectral theory of the multi-dimensional periodic operators, Proc. Steklov Math. Inst. Vol. 171 (1984).

[13] M. Skriganov, The spectrum band structure of the three-dimensional Schrödinger operator with periodic potential, Inv. Math. 80, 107–121 (1985).

Leonid Parnovski
Department of Mathematics
University College London
Gower Street
London WC1E 6BT, UK
email: [email protected]

Alexander V. Sobolev
Centre for Mathematical Analysis and Its Applications
University of Sussex
Falmer, Brighton BN1 9QH, UK
email: [email protected]

Communicated by Bernard Helffer
submitted 26/04/00, accepted 21/12/00



Resonances of the Dirac Hamiltonian in the Non Relativistic Limit

L. Amour, R. Brummelhuis, J. Nourrigat

Abstract. For a Dirac operator in $\mathbb{R}^3$, with an electric potential behaving at infinity like a power of |x|, we prove the existence of resonances and we study, when c → +∞, the asymptotic expansion of their real part, and an estimation of their imaginary part, generalizing an old result of Titchmarsh.

1 Introduction

We are interested in the following Dirac operator D(c) in $\mathbb{R}^3$, depending on a parameter c > 1,

\[ D(c) = \begin{pmatrix} V(x) & c\,\sigma\cdot D_x \\ c\,\sigma\cdot D_x & V(x) - 2c^2 \end{pmatrix}. \tag{1} \]

Here $\sigma\cdot D_x$ denotes $\sigma_1 D_1 + \sigma_2 D_2 + \sigma_3 D_3$, where the $\sigma_j$ are the Pauli matrices, and V is a $C^\infty$ real-valued function, satisfying the following hypotheses.

(H1) We assume that V can be extended to a holomorphic function in the following open set of $\mathbb{C}^3$, for some positive constants a and r,

\[ \Omega = S_a \cup B(0, r), \tag{2} \]

where $S_a$ is the complex sector $\{z \in \mathbb{C}^3,\ |\mathrm{Arg}\, z_j| < a,\ \forall j = 1, 2, 3\}$, and B(0, r) is the open complex ball with center 0 and radius r. We assume also that for some positive constants k, $m_0$ and R, we have

\[ |V(z)| \le m_0(1 + |z|^k), \quad \forall z \in S_a. \tag{3} \]

(H2) We have also, if $x \in \mathbb{R}^3$ and |x| ≥ R,

\[ |x|^k \le m_0 V(x). \tag{4} \]

(H3) We have also, if $x \in \mathbb{R}^3$ and |x| ≥ R,

\[ |x|^k \le m_0\, x\cdot\frac{\partial V}{\partial x}. \tag{5} \]

We see easily that D(c) is essentially self-adjoint, and Titchmarsh proved, when V is radial, that D(c) has the whole real line as a purely absolutely continuous spectrum (see Thaller [14]). Let H be the corresponding Schrödinger operator

\[ H = -\frac{1}{2}\Delta + V(x). \tag{6} \]



The spectrum of H is discrete. We shall prove that, when c is large enough, D(c) has resonances near the eigenvalues of H and we shall study their asymptotic behaviour when c → +∞. Recall that, in the semiclassical limit, the asymptotic behaviour of the resonances is studied in Parisse [9] (see also Balslev-Helffer [2]). For the Dirac operator in one dimension, with potential V(x) = |x|, Titchmarsh [15] gave an explicit computation of the resonances (see also Veselic [16] and Thaller [14]). For the definition of resonances, we need the analytic dilations (see Aguilar-Combes [1]). For each $\theta \in \mathbb{C}$ such that $|\mathrm{Im}\,\theta| < a$, we denote by D(θ, c) the following Hamiltonian

\[ D(\theta, c) = \begin{pmatrix} V(e^{\theta}x) & e^{-\theta}c\,\sigma\cdot D_x \\ e^{-\theta}c\,\sigma\cdot D_x & V(e^{\theta}x) - 2c^2 \end{pmatrix}, \tag{7} \]

with domain

\[ B^1(\mathbb{R}^3, \mathbb{C}^4) = \{u \in H^1(\mathbb{R}^3, \mathbb{C}^4),\ |x|^k u \in L^2(\mathbb{R}^3, \mathbb{C}^4)\}. \tag{8} \]

We shall prove in Section 2 the following theorem.

Theorem 1 D(θ, c) has pure point spectrum for small positive Im θ. Each eigenvalue $\lambda_j(\theta, c)$ is isolated and of finite even multiplicity, and does not depend on θ.

The eigenvalues of D(θ, c), denoted by $E_j(c)$ since they do not depend on θ, will be called resonances. We shall prove in Section 3 the following theorem.

Theorem 2 If Im θ is small enough, we have the following properties. (i) Let K be a compact set of $\mathbb{C}$ containing no eigenvalue of H. Then, if c is large enough, K contains no resonance. (ii) Let D be a compact disc centered at an eigenvalue $E_0$ of H, of multiplicity µ, and containing no other eigenvalue. Then, if c is large enough, D contains a finite number of resonances, and the sum of their multiplicities is 2µ.

Theorem 3 If Im θ is small enough, we have the following property. If D is a disc as in Theorem 2, and if $E_0$ is a simple eigenvalue of H, then D contains, for c large enough, one resonance λ(c) of multiplicity 2, and there exists a $C^\infty$ function f in a neighborhood of 0 such that $f(0) = E_0$ and, for c large enough,

\[ \lambda(c) = f\!\left(\frac{1}{c^2}\right). \tag{9} \]

This theorem is proved in Section 4. Recall that, when $V(x) = O(\langle x \rangle^{-s})$ (s > 0), if $E_0$ is an isolated simple eigenvalue of H, Grigore-Nenciu-Purice [3] proved that for c large enough, D(c) has a double eigenvalue λ(c) defined by an equality like (9), but where f is analytic. If V is a polynomial, one may expect that the function f in (9) belongs to some Gevrey class related to the degree of V.


Now, we can study the imaginary part of the resonances. We consider the following Agmon metric $ds_c^2$ in $\mathbb{R}^3$, depending on c (see Wang [17])

\[ ds_c^2 = \frac{1}{c^2}\, V(x)_+\, (2c^2 - V(x))_+\, dx^2, \tag{10} \]

where $x_+ = \sup(x, 0)$. For each ε > 0, we consider the "sea"

\[ M(c, \varepsilon) = \{x \in \mathbb{R}^3,\ V(x) \ge (2 - \varepsilon)c^2\}. \tag{11} \]

We denote by S(c, ε) the distance, for the metric $ds_c^2$, of the origin to M(c, ε).

Theorem 4 Under the hypothesis of Theorem 2 (point ii), for each ε > 0, there exists $C_\varepsilon > 0$ such that the resonances $E_j(c)$ contained in D satisfy

\[ |\mathrm{Im}\, E_j(c)| \le C_\varepsilon\, e^{-(2-\varepsilon)S(c,\varepsilon)}. \tag{12} \]

We are very grateful to X.P. Wang for useful discussions about the exterior scaling, used in Section 5.

2 Proof of Theorem 1.

We remark first that D(c) is essentially self-adjoint, since we have easily the following implication:

\[ u \in L^2(\mathbb{R}^3, \mathbb{C}^4), \quad \mathrm{Im}\, z < 0, \quad (D(c) - z)u = 0 \ \Rightarrow\ u = 0. \tag{13} \]

Now c is fixed. It can be seen using Cauchy’s estimate that (H1) implies

\[ |\partial_z^{\alpha} V(z)| \le C_\alpha (1 + |z|)^{k-|\alpha|}, \quad \forall z \in S_{a/2}. \tag{14} \]

From the calculus adapted to the harmonic oscillator, straightforward modifications are easily made, to obtain a calculus for global elliptic pseudo-differential operators, adapted to first order systems with a potential behaving like $|x|^k$. Therefore, we briefly give the main aspects. See Shubin [12] for more considerations.

For each $m \in \mathbb{R}$, let $\Gamma^m$ be the space of $d \in C^\infty(\mathbb{R}^6, M_4(\mathbb{C}))$ such that for all α and β in $\mathbb{N}^3$, there exists $C_{\alpha\beta}$ such that, for all $(x, \xi) \in \mathbb{R}^6$,

\[ |\partial_x^{\alpha}\partial_\xi^{\beta} d(x, \xi)| \le C_{\alpha\beta}\,(1 + |x|^k + |\xi|)^{m - \frac{|\alpha|}{k} - |\beta|}. \]

For each $d \in \Gamma^m$, let Op(d) be the corresponding operator, associated to d by the standard calculus

\[ (\mathrm{Op}(d)\varphi)(x) = (2\pi)^{-3} \iint e^{i\langle x-y,\,\xi\rangle}\, d(y, \xi)\,\varphi(y)\,dy\,d\xi, \quad \forall \varphi \in \mathcal{S}(\mathbb{R}^3; \mathbb{C}^4). \]

The operator Op(d) ($d \in \Gamma^m$) is said to be globally elliptic if, for some positive real number C,

\[ (|x|^k + |\xi|)^m \le C\,(1 + |\mathrm{Det}\, d(x, \xi)|)^{1/4}, \quad \text{for all } (x, \xi) \in \mathbb{R}^6. \]




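The notation $\langle\cdot,\cdot\rangle$ stands for the inner scalar product of $L^2(\mathbb{R}^3; \mathbb{C}^4)$ and $\|\cdot\|$ denotes the corresponding norm. For $j \in \mathbb{N}$, let

\[ B^j(\mathbb{R}^3; \mathbb{C}^4) = \left\{ \phi \in L^2(\mathbb{R}^3; \mathbb{C}^4),\ x^{\alpha} D_x^{\beta}\phi \in L^2(\mathbb{R}^3; \mathbb{C}^4) \text{ for } \frac{|\alpha|}{k} + |\beta| \le j \right\}. \]

In particular, for $d \in \Gamma^m$, Op(d) maps $B^{s-m}(\mathbb{R}^3; \mathbb{C}^4)$ into $B^s(\mathbb{R}^3; \mathbb{C}^4)$ for any $s \in \mathbb{N}$. It is seen in Lemma 3 that for small positive Im θ, the family D(θ, c) is Kato analytic. The resonances are defined as the eigenvalues of D(θ, c), for small positive Im θ.

Lemma 1 There exists $\tau_0 > 0$ such that, if $0 < \mathrm{Im}\,\theta < \tau_0$, then D(θ, c) is globally elliptic.

Proof: The symbol d of D(θ, c) satisfies

\[ \mathrm{Det}\, d(x, \xi, c, \theta) = \left( V_\theta(x)\big(V_\theta(x) - 2c^2\big) - c^2 e^{-2\theta}|\xi|^2 \right)^2 \tag{15} \]

where $V_\theta(x) = V(e^{\theta}x)$. We write θ = σ + iτ, σ, τ ∈ $\mathbb{R}$, and K, C, $\tau_0$ denote three positive real numbers independent of x and τ. The real numbers K, C (resp. $\tau_0$) may increase (resp. decrease). By the analyticity of V, there exists $\tau_0 > 0$ such that, for $0 < \mathrm{Im}\,\theta < \tau_0$ and for all $x \in \mathbb{R}^3$,

\[ V_\theta(x) = V(xe^{\sigma}) + i\tau e^{\sigma} \sum_{j=1}^{3} x_j \frac{\partial V}{\partial x_j}(xe^{\sigma}) + \tau^2 M(x, \theta). \]

There exist K, C, $\tau_0 > 0$ such that, for all $\theta \in \mathbb{C}$ with $0 < \mathrm{Im}\,\theta < \tau_0$ and all |x| ≥ C,

\[ |M(x, \theta)| \le K|x|^k. \tag{16} \]

Then, for some K, C, $\tau_0 > 0$, if $0 < \mathrm{Im}\,\theta < \tau_0$ and |x| ≥ C,

\[ K^{-1}\tau \le \mathrm{Arg}\, V_\theta(x),\ \mathrm{Arg}(V_\theta(x) - 2c^2) \le K\tau, \qquad |V_\theta(x)|,\ |V_\theta(x) - 2c^2| \ge K^{-1}|x|^k. \tag{17} \]

From (17), there exist K, C, $\tau_0 > 0$ such that, for all θ and x such that $0 < \mathrm{Im}\,\theta < \tau_0$ and |x| ≥ C,

\[ K^{-1}\tau \le \mathrm{Arg}\big(V_\theta(x)(V_\theta(x) - 2c^2)\big) \le K\tau. \tag{18} \]

Then (18) shows that, for some K, C, $\tau_0 > 0$ ($\tau_0 < \pi/2$), if $0 < \mathrm{Im}\,\theta < \tau_0$ and |x| ≥ C, then

\[ \left| V_\theta(x)\big(V_\theta(x) - 2c^2\big) - c^2 e^{-2\theta}|\xi|^2 \right| \ge \sin(K^{-1}\tau)\,\big|V_\theta(x)(V_\theta(x) - 2c^2)\big| + c^2 \sin(2\tau)\,|\xi|^2. \tag{19} \]

The proof of Lemma 1 follows from (15), (17), (19). ✷

Theorem 1 will follow from the two lemmas below.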
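Formula (15) for the determinant of the symbol can be checked numerically at a single phase-space point. The sketch below is our own illustration, not part of the proof: `V` stands for the complex number $V_\theta(x)$ at a fixed x, the sample values are arbitrary, and the function names are hypothetical. It builds the 4 × 4 matrix with diagonal blocks $V$ and $V - 2c^2$ and off-diagonal blocks $c\,e^{-\theta}\sigma\cdot\xi$, computes its determinant by Gaussian elimination, and compares the result with the right-hand side of (15). The identity rests on $(\sigma\cdot\xi)^2 = |\xi|^2 I$ for real ξ.

```python
import cmath


def pauli_dot(xi):
    """The 2x2 matrix sigma . xi for a real vector xi = (xi1, xi2, xi3)."""
    x1, x2, x3 = xi
    return [[x3, x1 - 1j * x2], [x1 + 1j * x2, -x3]]


def dirac_symbol(V, xi, c, theta):
    """4x4 symbol [[V I, b S], [b S, (V - 2c^2) I]], b = c e^{-theta}, S = sigma . xi."""
    b = c * cmath.exp(-theta)
    S = pauli_dot(xi)
    W = V - 2 * c * c
    return [
        [V, 0, b * S[0][0], b * S[0][1]],
        [0, V, b * S[1][0], b * S[1][1]],
        [b * S[0][0], b * S[0][1], W, 0],
        [b * S[1][0], b * S[1][1], 0, W],
    ]


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [[complex(v) for v in row] for row in M]
    n = len(A)
    result = 1 + 0j
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(A[r][i]))
        if abs(A[p][i]) == 0:
            return 0j
        if p != i:
            A[i], A[p] = A[p], A[i]
            result = -result
        result *= A[i][i]
        for r in range(i + 1, n):
            f = A[r][i] / A[i][i]
            for col in range(i, n):
                A[r][col] -= f * A[i][col]
    return result


def formula_15(V, xi, c, theta):
    """Right-hand side of (15): (V (V - 2c^2) - c^2 e^{-2 theta} |xi|^2)^2."""
    q = V * (V - 2 * c * c) - c * c * cmath.exp(-2 * theta) * sum(x * x for x in xi)
    return q * q


if __name__ == "__main__":
    V, xi, c, theta = 1.5 + 0.3j, (0.4, -1.1, 0.7), 2.0, 0.1 + 0.2j
    print(det(dirac_symbol(V, xi, c, theta)), formula_15(V, xi, c, theta))
```

The two printed numbers should agree up to rounding, which is consistent with the exponent 1/4 appearing in the definition of global ellipticity for 4 × 4 symbols.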


Lemma 2 There exists $\tau_0 > 0$ such that, if $0 < \mathrm{Im}\,\theta < \tau_0$, then the resolvent set of D(θ, c) is not empty.

Proof. For $m \in \mathbb{N}$, let $\tilde\Gamma^m$ be the space of $a(\cdot, \cdot, \rho) \in C^\infty(\mathbb{R}^6, M_4(\mathbb{C}))$, depending on a parameter ρ ≥ 1, such that, for all α and β in $\mathbb{N}^3$, there exists $C_{\alpha,\beta}$, independent of ρ, such that, for all $(x, \xi, \rho) \in \mathbb{R}^6 \times [1, +\infty)$,

\[ |\partial_x^{\alpha}\partial_\xi^{\beta} a(x, \xi, \rho)| \le C_{\alpha,\beta}\,(1 + |\xi| + |x|^k + \rho)^{m - \frac{|\alpha|}{k} - |\beta|}. \]

The operator Op(a(ρ)) ($a \in \tilde\Gamma^m$) is said to be globally elliptic with parameter ρ if there exists C > 0 such that, for all $(x, \xi, \rho) \in \mathbb{R}^6 \times [1, +\infty)$,

\[ (|\xi| + |x|^k + \rho)^m \le C\,(1 + |\mathrm{Det}\, a(x, \xi, \rho)|)^{1/4}. \]

As in the proof of Lemma 1, θ = σ + iτ, σ, τ ∈ $\mathbb{R}$, and K, C, $\tau_0$ are three positive real numbers independent of x and τ which may change. Let ρ > 0, α ∈ [0, 2π), and set $P = D(\theta, c) + \rho e^{i\alpha}$. The symbol p(x, ξ, ρ) of P (associated with the standard calculus) belongs to $\tilde\Gamma^1$. Take K, C, $\tau_0$ such that (17) holds, and set α = Kτ. There exist K, C, $\tau_0$ (possibly different) such that, if $0 < \mathrm{Im}\,\theta < \tau_0$ and |x| ≥ C, then

\[ \begin{aligned} & K^{-1}\tau \le \mathrm{Arg}(V_\theta(x) + \rho e^{i\alpha}),\ \mathrm{Arg}(V_\theta(x) + \rho e^{i\alpha} - 2c^2) \le K\tau, \\ & |V_\theta(x) + \rho e^{i\alpha}| \ge \cos(K\tau)\,(|V_\theta(x)| + \rho) \ge K^{-1}(|x|^k + \rho), \\ & |V_\theta(x) + \rho e^{i\alpha} - 2c^2| \ge \cos(K\tau)\,(|V_\theta(x) - 2c^2| + \rho) \ge K^{-1}(|x|^k + \rho). \end{aligned} \tag{20} \]

Then (20) shows that, for some K, C, $\tau_0 > 0$ ($\tau_0 < \pi/2$), if $0 < \mathrm{Im}\,\theta < \tau_0$ and |x| ≥ C, then

\[ \left| (V_\theta(x) + \rho e^{i\alpha})(V_\theta(x) + \rho e^{i\alpha} - 2c^2) - c^2 e^{-2\theta}|\xi|^2 \right| \ge \sin(K^{-1}\tau)\,\big|(V_\theta(x) + \rho e^{i\alpha})(V_\theta(x) + \rho e^{i\alpha} - 2c^2)\big| + c^2\sin(2\tau)\,|\xi|^2. \tag{21} \]

By (20) and (21), P is globally elliptic with parameter ρ. Then, there are q(ρ) and r(ρ) in $\tilde\Gamma^{-1}$ such that

\[ (D(\theta, c) + \rho e^{i\alpha})\,\mathrm{Op}(q(\rho)) = I + \mathrm{Op}(r(\rho)). \tag{22} \]

Moreover, $\sup_{\rho \ge 1} \rho\,\|\mathrm{Op}(r)\|_{\mathcal{L}(L^2(\mathbb{R}^3))} < \infty$. Thus, the r.h.s. of (22) is invertible for a sufficiently large ρ. This proves Lemma 2. ✷

Lemma 3 There exists $\tau_0 > 0$ such that the family of operators $\{D(\theta, c),\ 0 < \mathrm{Im}\,\theta < \tau_0\}$ is analytic in the sense of Kato.

Let $\tau_0$ be as in Lemma 1, and let $\theta \in \mathbb{C}$ with $0 < \mathrm{Im}\,\theta < \tau_0$. The existence of parametrices for the globally elliptic operator D(θ, c) shows that

\[ \exists\, C > 0,\ \forall \phi \in B^1, \quad \|\phi\|_{B^1} \le C\,(\|D(\theta, c)\phi\| + \|\phi\|). \]

It implies that, for all $\theta \in \mathbb{C}$ with $0 < \mathrm{Im}\,\theta < \tau_0$, D(θ, c) is closed on $B^1$.




There exist another $\tau_0 > 0$ and K > 0 such that, for all θ, h ∈ $\mathbb{C}$ satisfying $0 < \mathrm{Im}\, h,\ \mathrm{Im}\,\theta < \tau_0$ and for all $x \in \mathbb{R}^3$,

\[ \begin{aligned} & \big(V(xe^{\theta+h}) - V(xe^{\theta})\big)/h = e^{\theta}\sum_{j=1}^{3} x_j \frac{\partial V}{\partial x_j}(xe^{\theta}) + hN(x, \theta, h), \\ & \Big| e^{\theta}\sum_{j=1}^{3} x_j \frac{\partial V}{\partial x_j}(xe^{\theta}) \Big|,\ |N(x, \theta, h)| \le K\langle x \rangle^{k}. \end{aligned} \tag{23} \]

Fix $u, v \in L^2(\mathbb{R}^3, \langle x \rangle^{2k}dx)$, and let F be the map $\theta \mapsto \langle V_\theta u, v \rangle_{L^2(\mathbb{R}^3; \mathbb{C})}$. From (23), if $0 < \mathrm{Im}\,\theta < \tau_0$, then (F(θ + h) − F(θ))/h has a limit as h → 0 ($0 < \mathrm{Im}\, h < \tau_0$). For each $u \in L^2(\mathbb{R}^3, \langle x \rangle^{2k}dx)$, $\theta \mapsto V_\theta u$ is a (weakly) analytic vector valued function. Then, for each $\phi \in B^1$, D(θ, c)φ is a vector valued analytic function of $\theta \in \{z \in \mathbb{C},\ 0 < \mathrm{Im}\, z < \tau_0\}$. The above closure and analyticity results, added to Lemma 2, imply that $\{D(\theta, c),\ 0 < \mathrm{Im}\,\theta < \tau_0\}$ is an analytic family of type (A) [7, VII.2]. ✷

Proof of Theorem 1. Using Lemma 2, there exists $z \in \mathbb{C}$ such that $(D(\theta, c) - z)^{-1}$ maps $L^2(\mathbb{R}^3; \mathbb{C}^4)$ into $B^1(\mathbb{R}^3; \mathbb{C}^4)$, hence is a compact operator of $L^2(\mathbb{R}^3; \mathbb{C}^4)$. Therefore, the spectrum of D(θ, c) is a sequence of eigenvalues $\lambda_j(c, \theta)$ of finite multiplicity. It is clear that

\[ D(\theta, c) = U(\mathrm{Re}\,\theta)\, D(i\,\mathrm{Im}\,\theta, c)\, U(\mathrm{Re}\,\theta)^{-1}, \]

that is to say, D(θ, c) is unitarily equivalent to $D(i\,\mathrm{Im}\,\theta, c)$. Therefore each $\lambda_j(c, \theta)$ does not depend on Re θ. In addition, Lemma 3 implies that each $\lambda_j(c, \theta)$ depends analytically on θ with, at most, algebraic singularities. As in [11, proof of Theorem 1(i)], it can be proved using Puiseux series that each $\lambda_j(c, \theta)$ is a constant function of θ. The multiplicity of each of these eigenvalues $\lambda_j(c, \theta)$ is even. This can be proved as in Parisse [9]. This completes the proof. ✷

3 Proof of Theorem 2. By arguments similar to that of Section 2, the spectrum of the following Schr¨ odinger operator 1 Hθ = − e−2θ ∆ + V (xeθ ) (24) 2 is discrete, and the eigenvalues are the same as H, with the same multiplicities. (The only difference with Section 2 is that the sign of θ plays no role, and there is τ > 0 such that the family (Hθ )|θ| 0 and R > 0 such that, for each θ ∈ B + (θ0 ), there exists Aθ > 0 such that, if |x| ≥ R < x >k ≤ Aθ Vθ (x),

< x >k ≤ Aθ eθ−θ Vθ (x).

(27)

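Proof. By the hypotheses (H1) and (H2), we can write, if $\theta = \sigma + i\tau \in B^+(a/2)$,

\[ V_{\sigma+i\tau}(x) = V_\sigma(x) + i\tau e^{\sigma}\sum_{j=1}^{3} x_j \frac{\partial V}{\partial x_j}(xe^{\sigma}) + O(\tau^2 \langle x \rangle^{k}). \tag{28} \]

If |x| is large enough and $0 < \mathrm{Im}\,\theta < \theta_0$ (where $\theta_0$ depends on the constants of hypotheses (H2) and (H3)), there exists $A_\theta > 0$ such that (27) is valid. ✷

For each ε > 0, we set

\[ \Delta_\varepsilon = \begin{pmatrix} 1 & 0 \\ 0 & \varepsilon \end{pmatrix}. \tag{29} \]

The points i) and ii) of Theorem 2 are consequences of the points ii) and iii) of the following Lemma.

Lemma 5 i) Let K be a compact set of $\mathbb{C}$ and θ such that $0 < \mathrm{Im}\,\theta < \theta_0$. Then there exists $B_\theta > 0$ (independent of c) such that, if c is large enough, for each $u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ in $\mathcal{S}(\mathbb{R}^3, \mathbb{C}^4)$ ($u_j \in \mathcal{S}(\mathbb{R}^3, \mathbb{C}^2)$), for each z ∈ K and c ≥ 1, we have

\[ \|\langle x \rangle^{k/2}u_1\| + \|\langle x \rangle^{-k}\sigma.Du_1\| + \|u_2\| \le B_\theta\left( \|\Delta_c^{-1}(D(\theta, c) - z)\Delta_c^{-1}u\| + \|u_1\| \right). \tag{30} \]

ii) If K contains no eigenvalue of H, there exists $A_\theta > 0$ (independent of c) such that, if c is large enough,

\[ \|u\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)} \le A_\theta\,\|\Delta_c^{-1}(D(\theta, c) - z)\Delta_c^{-1}u\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)}, \tag{31} \]

for all $u \in \mathcal{S}(\mathbb{R}^3, \mathbb{C}^4)$ and z ∈ K, and therefore,

\[ \|u\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)} \le A_\theta\,\|(D(\theta, c) - z)u\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)}. \tag{32} \]

iii) If D is a disc centered at an eigenvalue of H, and containing no other eigenvalue, then, if $0 < \mathrm{Im}\,\theta < \theta_0$,

\[ \lim_{c \to +\infty}\ \sup_{z \in \partial D} \|(D(\theta, c) - z)^{-1} - R_{z\theta\infty}\| = 0. \tag{33} \]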



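Proof of point i). The equality $\Delta_c^{-1}(D(\theta, c) - z)\Delta_c^{-1}u = \begin{pmatrix} f \\ g \end{pmatrix}$ is equivalent to

\[ f = (V_\theta - z)u_1 + e^{-\theta}\sigma.Du_2, \tag{34} \]

\[ g = e^{-\theta}\sigma.Du_1 + \left( \frac{V_\theta - z}{c^2} - 2 \right)u_2. \tag{35} \]

It follows from the two last equalities that

\[ \langle u_1, e^{\theta-\bar\theta}(V_\theta - z)u_1 \rangle - \left\langle \left( \frac{V_\theta - z}{c^2} - 2 \right)u_2, u_2 \right\rangle = \langle u_1, e^{\theta-\bar\theta}f \rangle - \langle g, u_2 \rangle \tag{36} \]

and therefore, taking the imaginary parts in the last equality and applying Lemma 4,

\[ \|\langle x \rangle^{k/2}u_1\|^2 + c^{-2}\|\langle x \rangle^{k/2}u_2\|^2 \le B_\theta\left[ \|f\|^2 + \|u_1\|^2 + \|g\|\,\|u_2\| + c^{-2}\|u_2\|^2 \right]. \tag{37} \]

Taking now the real parts in (36), we obtain, with another $B_\theta$,

\[ \|u_2\|^2 \le B_\theta\left[ \|f\|^2 + \|u_1\|^2 + \|g\|\,\|u_2\| + c^{-2}\|u_2\|^2 \right]. \]

The inequality (30) (with another $B_\theta$) follows easily, if c is large enough, from the two last ones.

Proof of point ii). Suppose that the inequality (31) were false. Then there would exist a sequence $(u_n)$ in $\mathcal{S}(\mathbb{R}^3, \mathbb{C}^4)$, a sequence $(z_n)$ in K, and a sequence $c_n \to +\infty$ such that

\[ \|u_n\| = 1, \qquad \|\Delta_{c_n}^{-1}(D(\theta, c_n) - z_n)\Delta_{c_n}^{-1}u_n\| \to 0. \tag{38} \]

Taking a subsequence, we can assume that $z_n \to z \in K$. Let us set $u_n = \begin{pmatrix} \varphi_n \\ \psi_n \end{pmatrix}$ and $\Delta_{c_n}^{-1}(D(\theta, c_n) - z_n)\Delta_{c_n}^{-1}u_n = \begin{pmatrix} f_n \\ g_n \end{pmatrix}$. If we set $V_\theta(x) = V(xe^{\theta})$, we have the relations (34) and (35) with f, g, $u_1$, $u_2$ replaced by $f_n$, $g_n$, $\varphi_n$, $\psi_n$. By (30), the sequences $\langle x \rangle^{k/2}\varphi_n$ and $\langle x \rangle^{-k}\sigma.D\varphi_n$ are bounded in $L^2(\mathbb{R}^3, \mathbb{C}^2)$. Note that the operator $\langle x \rangle^{k} + \langle x \rangle^{-k}(\sigma.D)^2\langle x \rangle^{-k}$ has compact resolvent. By these properties, we may assume (after taking subsequences) that there exist ϕ and ψ in $L^2(\mathbb{R}^3, \mathbb{C}^2)$ such that $\varphi_n \to \varphi$ (strongly) and $\psi_n \to \psi$ (weakly) in $L^2(\mathbb{R}^3, \mathbb{C}^2)$. We have

\[ (V_\theta - z)\varphi + e^{-\theta}\sigma.D\psi = 0 \tag{39} \]

and

\[ e^{-\theta}\sigma.D\varphi - 2\psi = 0, \tag{40} \]

and therefore $(H_\theta - z)\varphi = 0$. If ϕ = 0, it follows that $\varphi_n \to 0$, and, since $\|f_n\| + \|g_n\| \to 0$, the point i) shows that $\psi_n \to 0$, and this gives a contradiction since $\|\varphi_n\|^2 + \|\psi_n\|^2 = 1$. Therefore, there exists ϕ ≠ 0 in $L^2(\mathbb{R}^3, \mathbb{C}^2)$ such that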


$(H_\theta - z)\varphi = 0$, and there is a contradiction since z ∈ K and K contains no eigenvalue of $H_\theta$. The inequality (31) is proved, and (32) follows easily.

Proof of point iii). Suppose that there exist θ such that $0 < \mathrm{Im}\,\theta < \theta_0$, a sequence $(F_n)$ in $L^2(\mathbb{R}^3, \mathbb{C}^4)$, a sequence $(z_n)$ in ∂D, a sequence $c_n \to +\infty$, and δ > 0 such that

\[ \|F_n\| = 1, \qquad \|(D(\theta, c_n) - z_n)^{-1}F_n - R_{z_n\theta\infty}F_n\| \ge \delta. \tag{41} \]

Let us set

\[ F_n = \begin{pmatrix} f_n \\ g_n \end{pmatrix}, \qquad U_n = (D(\theta, c_n) - z_n)^{-1}F_n = \begin{pmatrix} \varphi_n \\ \psi_n \end{pmatrix}. \tag{42} \]

By the point ii) (applied to the compact ∂D), the sequence $U_n$ is bounded. By the point i) (applied to the function $\Delta_{c_n}U_n$), we have

\[ \|\langle x \rangle^{k/2}\varphi_n\| + \|\langle x \rangle^{-k}\sigma.D\varphi_n\| + c_n\|\psi_n\| \le B_\theta\left[ \|F_n\| + \|\varphi_n\| \right] = O(1). \tag{43} \]

Therefore $\psi_n \to 0$, which implies, together with (41), that, for n large enough,

\[ \|\varphi_n - (H_\theta - z_n)^{-1}f_n\| \ge \frac{\delta}{2}. \tag{44} \]

By (43), we may assume (after taking a subsequence) that there exist ϕ and ψ in $L^2(\mathbb{R}^3, \mathbb{C}^2)$ such that $\varphi_n \to \varphi$ (strongly) and $c_n\psi_n \to \psi$ (weakly) in $L^2(\mathbb{R}^3, \mathbb{C}^2)$. We may assume also that $z_n \to z \in \partial D$ and that $f_n$ weakly converges to $f \in L^2(\mathbb{R}^3, \mathbb{C}^4)$. It follows that

\[ (V_\theta(x) - z)\varphi + e^{-\theta}\sigma.D\psi = f, \qquad e^{-\theta}\sigma.D\varphi - 2\psi = 0, \]

and therefore $(H_\theta - z)\varphi = f$. Since the operator $(H_\theta - z)^{-1}$ is compact, we may assume also that

\[ (H_\theta - z_n)^{-1}f_n \to \tilde\varphi \in L^2(\mathbb{R}^3, \mathbb{C}^2) \tag{45} \]

(strong convergence). We have $(H_\theta - z)\tilde\varphi = (H_\theta - z)\varphi = f$, and there is a contradiction with (44) since $\|\varphi - \tilde\varphi\| \ge \delta/2$ and z is not in the spectrum of $H_\theta$.

Proof of Theorem 2. The point i) is a consequence of Lemma 5 (point ii). For the point ii), let $E_0$ be a simple eigenvalue of H. Let D be a disc, centered at $E_0$, with radius ρ > 0, containing no other eigenvalue of H inside it, and let Γ be the boundary of D. By the point i), we know that, for c large enough, D(θ, c) − z is invertible for all z ∈ Γ. We define then an operator $\Pi_{\theta c}$ by

\[ \Pi_{\theta c} = \frac{1}{2i\pi}\int_\Gamma (D(\theta, c) - z)^{-1}\,dz. \tag{46} \]

Similarly we define $\Pi_{\theta\infty}$ by

\[ \Pi_{\theta\infty} = \frac{1}{2i\pi}\int_\Gamma R_{z\theta\infty}\,dz, \tag{47} \]




where $R_{z\theta\infty}$ is defined in (25). It follows from Lemma 5 (point iii) that

\[ \lim_{c \to +\infty} \|\Pi_{\theta c} - \Pi_{\theta\infty}\| = 0. \tag{48} \]

The point ii) follows easily. ✷

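4 Proof of Theorem 3.

If $D = B(E_0,\rho)$ is a disc as in Theorems 2 and 3, and if $E_0$ is a simple eigenvalue of $H$, we know, by Theorem 2, that, for $c$ large enough, $D(\theta,c)$ has only one eigenvalue $\lambda(c)$, of multiplicity 2, in $B(E_0,\rho)$. Since $E_0$ is also a simple eigenvalue of the dilated Schrödinger operator $H_\theta$ defined in (24) (Section 3), let $\varphi_\theta$ be a normalized eigenvector ($H_\theta\varphi_\theta = E_0\varphi_\theta$, $\|\varphi_\theta\| = 1$). By the global ellipticity of $H_\theta$, we know that $\varphi_\theta$ is in $\mathcal{S}(\mathbb{R}^3)$. Let
\[ \psi_\theta = \begin{pmatrix} \varphi_\theta \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{49} \]
If $\Pi_{\theta c}$ is defined in (46) (where $\Gamma$ is the boundary of $D$), then $\Pi_{\theta c}\psi_\theta$ is in the eigenspace of $D(\theta,c)$ corresponding to the eigenvalue $\lambda(c)$ and, by (48), if $c$ is large enough, $\Pi_{\theta c}\psi_\theta \neq 0$. Therefore
\[ \lambda(c) = \frac{(D(\theta,c)\Pi_{\theta c}\psi_\theta,\ \Pi_{\theta c}\psi_\theta)}{\|\Pi_{\theta c}\psi_\theta\|^2}, \qquad E_0 = \frac{(H_\theta\Pi_{\theta\infty}\psi_\theta,\ \Pi_{\theta\infty}\psi_\theta)}{\|\Pi_{\theta\infty}\psi_\theta\|^2} \tag{50} \]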

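(since $\Pi_{\theta\infty}\psi_\theta = \psi_\theta$).

Lemma 6 Let $\psi$ be a function in $\mathcal{S}(\mathbb{R}^3,\mathbb{C}^4)$ and $\Gamma$ be the boundary of $D = B(E_0,\rho)$. Let $F(\varepsilon,z)$ be the function defined, for $\varepsilon$ small enough and $z \in \Gamma$, by
\[ F(\varepsilon,z) = (D(\theta,1/\varepsilon) - z)^{-1}\psi \quad \text{if } \varepsilon \neq 0, \tag{51} \]
\[ F(\varepsilon,z) = R_{z\theta\infty}\psi \quad \text{if } \varepsilon = 0, \tag{52} \]
where $R_{z\theta\infty}$ is defined in (25). Then $\varepsilon \mapsto F(\varepsilon,z)$ is $C^\infty$ from some neighborhood of 0 to $\mathcal{H} = L^2(\mathbb{R}^3,\mathbb{C}^4)$, and depends continuously on $z$ in $\Gamma$.

Proof. If $\Delta_\varepsilon$ is the operator defined in (29), we can write, by (34) and (35),
\[ \Delta_c^{-1}(D(\theta,c) - z)\Delta_c^{-1} = A + c^{-2}B, \]
where
\[ A = \begin{pmatrix} V_\theta - z & e^{-\theta}\sigma\cdot D \\ e^{-\theta}\sigma\cdot D & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 0 & V_\theta - z \end{pmatrix}. \tag{53} \]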


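By Lemma 5, there is $t_0 > 0$ such that $A + tB : B^1 \to \mathcal{H}$ is invertible if $0 < t \le t_0$, and there exists $K > 0$ such that
\[ \|(A + tB)^{-1} f\| \le K\|f\|, \qquad 0 < t \le t_0, \quad \forall f \in \mathcal{H}. \tag{54} \]
Moreover, if we set $(A + tB)^{-1} f = \begin{pmatrix} u(t) \\ v(t) \end{pmatrix}$, we have, by Lemma 5,
\[ \|\langle x\rangle^{k/2} u(t)\| + \|\langle x\rangle^{-k}\sigma\cdot D u(t)\| + \|v(t)\| \le K(\|f\| + \|u(t)\|), \qquad 0 < t \le t_0, \quad \forall f \in \mathcal{H}. \]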

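On the other hand, if $H_\theta$ is the operator defined in (24) and $z \in \Gamma$, the operators $D^\alpha (H_\theta - z)^{-1} D^\beta$ are bounded in $L^2(\mathbb{R}^3)$ if $|\alpha + \beta| \le 2$ (a parametrix of this operator is easily constructed in a suitable class). Therefore, the following operator $S$ is bounded in $\mathcal{H}$:
\[ S = \begin{pmatrix} (H_\theta - z)^{-1} & \frac{e^{-\theta}}{2}(H_\theta - z)^{-1}\sigma\cdot D \\[4pt] \frac{e^{-\theta}}{2}\sigma\cdot D(H_\theta - z)^{-1} & \frac{e^{-2\theta}}{4}\sigma\cdot D(H_\theta - z)^{-1}\sigma\cdot D - \frac{1}{2} I_2 \end{pmatrix}, \]
and it satisfies $AS = I$. Moreover, $u \in \mathcal{H}$ and $(A + tB)u = 0$ imply $u = 0$ ($0 \le t \le t_0$). It follows easily from these properties that, if $f \in \mathcal{H}$, the function $G(t)f$ defined by
\[ G(t)f = (A + tB)^{-1} f \quad \text{if } 0 < t \le t_0, \qquad G(0)f = Sf \tag{55} \]
is continuous from $[0,t_0]$ to $\mathcal{H}$. Let $E$ be the space of $f \in \mathcal{H}$ such that, for each $m$, $\langle x\rangle^m f$ is in $\mathcal{H}$. Using the commutation relation
\[ x_j (A + tB)^{-1} = (A + tB)^{-1} x_j - i e^{-\theta}(A + tB)^{-1}\alpha_j (A + tB)^{-1}, \qquad \alpha_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}, \]
it follows that, for each integer $m$, there is $K_m$ such that
\[ \|\langle x\rangle^m (A + tB)^{-1} f\| \le K_m \|\langle x\rangle^m f\|, \qquad \forall f \in E, \quad 0 \le t \le t_0, \]
and that, for each $f \in E$, the function $\langle x\rangle^m G(t)f$ is continuous from $[0,t_0]$ to $\mathcal{H}$. It follows that, for each $f \in E$, the function $G(t)f$ is $C^\infty$ from $[0,t_0]$ to $\mathcal{H}$, and that
\[ G^{(p)}(t)f = (-1)^p\, p!\,(A + tB)^{-1}\big[B(A + tB)^{-1}\big]^p f \quad \text{if } 0 < t \le t_0 \tag{56} \]
and $G^{(p)}(0)f = (-1)^p\, p!\, S(BS)^p f$. This property can be proved by induction on $p$, using the previous remarks. The Lemma follows easily, since $F(\varepsilon,z) = \Delta_\varepsilon G(\varepsilon^2)\Delta_\varepsilon\psi$.

Proof of Theorem 3. Since $\psi_\theta$ defined in (49) is in $\mathcal{S}(\mathbb{R}^3,\mathbb{C}^4)$ (this can be proved by using a parametrix of $H_\theta$), it follows from (50) and Lemma 6 that the function $g$ defined in some neighborhood of 0 by
\[ g(\varepsilon) = \lambda(1/\varepsilon) \quad \text{if } \varepsilon \neq 0, \tag{57} \]
\[ g(0) = E_0 \tag{58} \]
is $C^\infty$. We remark that
\[ J D(\theta,c) J = D(\theta,-c), \qquad J = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}. \tag{59} \]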

Since ψθ defined in (49) satisfies Jψθ = ψθ , it follows that g is an even function of ε, and there exists a C ∞ function f in a neighborhood of 0 such that g(ε) = f (ε2 ), which proves Theorem 3.

5 Imaginary part of the resonances.

In this section, we need another definition of the resonances, using exterior scaling. We are very grateful to X.P. Wang for this suggestion. For each $\varepsilon > 0$ and $c > 1$, we have to introduce two auxiliary Hamiltonians: one of them (denoted by $D_{dis}(\theta,c)$) is obtained from $D(c)$ by an exterior complex scaling (cf. Hunziker [6]), and the other one, denoted by $D_0(c)$, is obtained from $D(c)$ by a modification of the potential (cf. Wang [17] and Parisse [9]). For the construction of the distorted operator $D_{dis}(\theta,c)$, we use, for each $\varepsilon \in (0,1)$, a function $\varphi \in C^\infty(\mathbb{R})$ such that $\varphi(t) = 0$ if $t \le 2 - \frac{\varepsilon}{2}$ and $\varphi(t) = 1$ if $t \ge 2$. For each $\theta \in \mathbb{C}$ and $x \in \mathbb{R}^3$, we set
\[ \varphi_\theta(x) = x + \theta X_c(x), \qquad X_c(x) = x\,\varphi\!\left(\frac{V(x)}{c^2}\right). \tag{60} \]
If $|\theta|$ is small enough, we can define a system $p_\theta = (p_{\theta,1}, p_{\theta,2}, p_{\theta,3})$ of differential operators by
\[ p_\theta = {}^t(\varphi_\theta'(x))^{-1} D_x - \frac{i}{2}\nabla(\ln J_\theta(x)), \qquad J_\theta(x) = \det \varphi_\theta'(x), \tag{61} \]
and a distorted Dirac operator $D_{dis}(\theta,c)$ by
\[ D_{dis}(\theta,c) = \begin{pmatrix} V(\varphi_\theta(x)) & c\,\sigma\cdot p_\theta \\ c\,\sigma\cdot p_\theta & V(\varphi_\theta(x)) - 2c^2 \end{pmatrix}. \tag{62} \]

Proposition 1 With the previous notations, if |θ| is small enough, if D is a disc as in Theorem 2 (point ii), and if c is large enough, the spectrum of Ddis (θ, c) in D is the same sequence of eigenvalues Ej (c) as for the operator D(θ, c) defined in (7), with the same multiplicities. For the proof of this Proposition, we shall use the following Lemma. Lemma 7 There exist A > 0 and θ0 > 0 with the following properties. If z ∈ C, I

z < 0, c ≥ 1, if θ ∈ Ω, where Ω = {θ ∈ C, I

|θ| < θ0 ,

0< θ
k ≤ A eθ Vθ (x) , if |θ| ≤ 1, 0 < θ < ε0 , |x| ≥ R and

| eθ Vθ (x) | ≤ A θ,

if |θ| ≤ 1, 0 < θ < ε0 , |x| ≤ R.

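It follows that, with other constants $A$ and $\varepsilon_0$, if $\operatorname{Im} z < 0$, $|\theta| < 1$, $0 < \operatorname{Im}\theta < \varepsilon_0$, and if $(D(\theta,c) - z)u = f$, we have
\[ |\operatorname{Im} z|\,\|u\|^2 \le A\big[\|f\|\,\|u\| + |\operatorname{Im}\theta|\,(c^2 + |\operatorname{Re} z|)\,\|u\|^2\big]. \]
If, moreover, $0 \le \operatorname{Im}\theta \le |\operatorname{Im} z| / (2A(c^2 + |\operatorname{Re} z|))$, then
\[ \|u\|_{\mathcal{H}} \le \frac{2A}{|\operatorname{Im} z|}\,\|(z - D(\theta,c))u\|_{\mathcal{H}}. \tag{65} \]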

By the results of Section 2, it follows that, for each $\theta \in \Omega$ (with another $A$), $z - D(\theta,c) : B^1 \to \mathcal{H}$ is invertible and that the inverse depends holomorphically on $\theta$ in $\Omega$. The result about weak continuity follows from (64), using the implication (13). ✷

End of the proof of the Proposition. Once Lemma 7 is established, the proof of Proposition 1 follows the classical proof of the Aguilar-Balslev-Combes theorem [1] (see Hislop-Sigal [5] or Laguel [8] for more details). For real $\theta$, small enough, we define an operator $U_\theta : \mathcal{H} \to \mathcal{H}$ by
\[ (U_\theta f)(x) = e^{3\theta/2} f(x e^\theta) \tag{66} \]
and an operator $\tilde U_\theta : \mathcal{H} \to \mathcal{H}$ by
\[ (\tilde U_\theta f)(x) = J_\theta(x)^{1/2} f(\varphi_\theta(x)). \tag{67} \]
Then $U_\theta$ and $\tilde U_\theta$ are unitary, and we have
\[ D(\theta,c) = U_\theta D(c) U_\theta^{-1}, \qquad D_{dis}(\theta,c) = \tilde U_\theta D(c) \tilde U_\theta^{-1}. \tag{68} \]
There exist a subspace $\mathcal{A}$ of $\mathcal{H} = L^2(\mathbb{R}^3,\mathbb{C}^4)$ and $\theta_0 > 0$ such that, for each $f \in \mathcal{A}$, the functions $\theta \mapsto U_\theta f$ and $\theta \mapsto \tilde U_\theta f$ extend to holomorphic functions from $B(0,\theta_0)$ to $\mathcal{H}$, and such that, for each $\theta \in B(0,\theta_0)$, $U_\theta\mathcal{A}$ and $\tilde U_\theta\mathcal{A}$ are dense in $\mathcal{H}$. If $f, g \in \mathcal{A}$, $|\theta| < \theta_0$ and $\operatorname{Im}\theta > 0$, we set
\[ F_{fg}(z,\theta) = \langle U_\theta f, (z - D(\theta,c))^{-1} U_\theta g\rangle, \tag{69} \]
\[ \tilde F_{fg}(z,\theta) = \langle \tilde U_\theta f, (z - D_{dis}(\theta,c))^{-1} \tilde U_\theta g\rangle. \tag{70} \]
By the results of Section 2 and their analogues for $D_{dis}(\theta,c)$, we know that, if $c \ge 1$, these functions of $z$ are meromorphic in $D$. Let $A$ and $\theta_0$ be the constants of Lemma 7. There is an analogue of Lemma 7 with $D(\theta,c)$ replaced by $D_{dis}(\theta,c)$, and we may assume that the constants $A$ and $\theta_0$ are the same. If $E_0$ is the center of $D$ and $\rho$ its radius, let
\[ \omega = \Big\{\theta \in \mathbb{C},\ |\theta| < \theta_0,\ 0 < \operatorname{Im}\theta < \frac{\rho}{2A(c^2 + |E_0| + \rho)}\Big\}. \]
By Lemma 7, if $z \in D$ and $\operatorname{Im} z < -\frac{\rho}{2}$, the functions $\theta \mapsto F_{fg}(z,\theta)$ and $\theta \mapsto \tilde F_{fg}(z,\theta)$ are holomorphic in $\omega$ and continuous in $\overline{\omega}$. By (68), they are equal in $\overline{\omega} \cap \mathbb{R}$, and therefore they are equal in $\omega$. Now, if $\theta \in \omega$, the functions $z \mapsto F_{fg}(z,\theta)$ and $z \mapsto \tilde F_{fg}(z,\theta)$ are meromorphic in $D$ and equal in $\{z \in D,\ \operatorname{Im} z < -\frac{\rho}{2}\}$, and therefore they are equal on $D$. A point $z_0 \in D$ is an eigenvalue of $D(\theta,c)$ (resp. of $D_{dis}(\theta,c)$) iff there are $f$ and $g \in \mathcal{A}$ such that $z_0$ is a pole of $z \mapsto F_{fg}(z,\theta)$ (resp. of $z \mapsto \tilde F_{fg}(z,\theta)$). Therefore, these eigenvalues are the same. ✷

Therefore, under the hypotheses of Theorem 2, if $D$ is a disc centered at $E_0$, of radius $\rho$, and containing no other eigenvalue of $H$, and if $E_j(c)$ ($1 \le j \le 2\mu$) are the resonances in $D$, there exists an orthonormal system of functions $\psi_j$ in $L^2(\mathbb{R}^3,\mathbb{C}^4)$ ($1 \le j \le 2\mu$) such that, if $c$ is large enough,
\[ D_{dis}(\theta,c)\psi_j = E_j(c)\psi_j. \tag{71} \]

Now we shall define a modified real-valued potential, as in Wang [17] and Parisse [9] in the semiclassical study of multiple wells or resonances for the Dirac operator. For that, we can choose a nondecreasing function $\psi \in C^\infty(\mathbb{R})$ such that $\psi(t) = t$ if $t \le 2 - \frac{\varepsilon}{2}$, $\psi(t) \le t$ for all $t$, and $\psi(t) = 2 - \frac{\varepsilon}{4}$ if $t \ge 2$. Using this function, we define a modified potential $V_0$ (depending on $\varepsilon$ and $c$) by
\[ V_0(x) = c^2\,\psi\!\left(\frac{V(x)}{c^2}\right). \tag{72} \]
Let $d(x,V_0,c)$ be the distance from $x \in \mathbb{R}^3$ to the origin for the Agmon metric defined as in Section 1, but with the potential $V_0$ instead of $V$. We set
\[ \Sigma(c,\varepsilon) = \inf_{V(x) \ge (2 - \frac{\varepsilon}{2})c^2} d(x,V_0,c). \tag{73} \]


Lemma 8 If $\varepsilon < 1/2$, there exists $K_\varepsilon > 0$ such that

i) $V(x) \ge \frac{3}{2}c^2 \Rightarrow c \le K_\varepsilon\, d(x,V_0,c)$.

ii) $\langle x\rangle \le K_\varepsilon (1 + d(x,V_0,c))$, $\forall x \in \mathbb{R}^3$.

iii) $S(c,\varepsilon) \le \Sigma(c,\varepsilon)$.

Proof. Let $x \in \mathbb{R}^3$, and let $t \mapsto x(t)$ be a $C^1$ curve such that $x(0) = 0$ and $x(1) = x$. Suppose that $V(x) \ge (3/2)c^2$. Let $t_0$ and $t_1$ be such that
\[ 0 < t_0 < t_1 < 1, \qquad V(x(t_0)) = \frac{1}{2}c^2, \qquad V(x(t_1)) = c^2, \]
and
\[ \frac{1}{2}c^2 \le V(x(t)) \le c^2, \qquad \forall t \in [t_0,t_1]. \]
For each $t \in [t_0,t_1]$, we have $V_0(x(t)) \ge \frac{1}{2}c^2$ and $2c^2 - V_0(x(t)) \ge \frac{\varepsilon}{4}c^2$, and therefore
\[ \frac{1}{c}\int_0^1 \big[V_0(x(t))_+\,(2c^2 - V_0(x(t)))\big]^{1/2} |x'(t)|\,dt \ge \frac{c\sqrt{\varepsilon}}{4}\,|x(t_1) - x(t_0)|. \]
By the hypotheses on the potential $V$, there exist $K > 0$ and $K' > 0$ such that, if $c$ is large enough,
\[ \frac{1}{2}c^2 \le |V(x(t_0)) - V(x(t_1))| \le K|x(t_0) - x(t_1)|\,\big[\langle x(t_0)\rangle + \langle x(t_1)\rangle\big]^{k-1} \le K'|x(t_0) - x(t_1)|\,V(x(t_1))^{(k-1)/k} \le K'|x(t_0) - x(t_1)|\,c^{2-2/k}. \]
Point i) follows from the last inequalities. For point ii), we can find $R > 0$ such that $V_0(x) \ge 1$ if $|x| \ge R$. If $|x| \ge R$ and if $x(t)$ is a curve as above, there exists $t_0 \in [0,1]$ such that $|x(t_0)| \le R$ and $|x(t)| \ge R$ if $t \in [t_0,1]$. It follows that
\[ \frac{1}{c}\int_0^1 \big[V_0(x(t))_+\,(2c^2 - V_0(x(t)))\big]^{1/2} |x'(t)|\,dt \ge \frac{\varepsilon}{2}\,|x - x(t_0)| \]
and therefore $|x| \le R + \frac{2}{\varepsilon}\,d(x,V_0,c)$. The proof of point iii) is straightforward. ✷

We denote by $D_0(c)$ the modified Hamiltonian corresponding to the modified potential $V_0$:
\[ D_0(c) = \begin{pmatrix} V_0(x) & c\,\sigma\cdot D_x \\ c\,\sigma\cdot D_x & V_0(x) - 2c^2 \end{pmatrix}. \tag{74} \]
We see easily that $D_0(c)$ is essentially self-adjoint and, using the arguments of Section 3, we see that, if $D$ is a neighborhood of $E_0$ as in Theorem 2 (point ii), $D \cap \mathbb{R}$ contains, for $c$ large enough, $2\mu$ eigenvalues $\lambda_j(c)$ ($1 \le j \le 2\mu$) of $D_0(c)$ (repeated according to their multiplicities). Let $\varphi_j = \varphi_j(c)$ ($1 \le j \le 2\mu$) be an orthonormal system of corresponding eigenfunctions,
\[ D_0(c)\varphi_j = \lambda_j(c)\varphi_j, \qquad \|\varphi_j\| = 1, \tag{75} \]
and we have, if $\rho$ is the radius of $D$ and if $c$ is large enough,
\[ |\lambda_j(c) - E_0| \le \frac{\rho}{2}. \tag{76} \]

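The following result about the exponential decay at infinity of the functions $\varphi_j(c)$ is well known (see Wang [17]).

Proposition 2 With the previous notations, for each $\varepsilon > 0$, there exists $C_\varepsilon > 0$, independent of $c$, such that the functions $\varphi_j$ ($1 \le j \le 2\mu$) satisfy
\[ \|e^{(1-\varepsilon)d(\cdot,V_0,c)}\varphi_j\|^2 + \frac{1}{c^2}\|e^{(1-\varepsilon)d(\cdot,V_0,c)}\nabla\varphi_j\|^2 \le C_\varepsilon. \tag{77} \]

Proof. The proof is the same as in Wang [17] but, since it is written in [17] in the semiclassical context, we give a sketch of the proof here. By a direct calculation, we see, as in Wang [17] (Proposition 2.1), that, for each real-valued function $\Phi$, bounded and uniformly Lipschitz on $\mathbb{R}^3$, we have
\[ c^2\int_{\mathbb{R}^3} |\nabla(e^\Phi\varphi_j)|^2\,dx + \int_{\mathbb{R}^3} \delta(x,c)\,|e^\Phi\varphi_j|^2\,dx = 0 \tag{78} \]
where
\[ \delta(x,c) = [V_0(x) - \lambda_j(c)]\,[2c^2 - V_0(x) + \lambda_j(c)] - c^2|\nabla\Phi(x)|^2. \tag{79} \]
There exists $R_\varepsilon > 0$ such that, if $0 \le \varepsilon \le 1$,
\[ |x| \ge R_\varepsilon \Rightarrow V_0(x) \ge \frac{8(|E_0| + (\rho/2)) + 4}{2\varepsilon^2 - \varepsilon^3}. \]
If $\Phi$ satisfies $\Phi(0) = 0$ and
\[ c^2|\nabla\Phi|^2 \le (1-\varepsilon)^2\,V_0(x)_+\,(2c^2 - V_0(x)), \tag{80} \]
then, using (76), we see that
\[ \delta(x,c) \ge c^2 \quad \text{if } |x| \ge R_\varepsilon. \tag{81} \]
We can find $K_\varepsilon > 0$, independent of $c$, such that
\[ c^{-2}|\delta(x,c)| + |\Phi(x)| \le K_\varepsilon \quad \text{if } |x| \le R_\varepsilon. \tag{82} \]
It follows that
\[ \int_{\mathbb{R}^3} |\nabla(e^{\Phi(x)}\varphi_j(x))|^2\,dx + \int_{|x| \ge R_\varepsilon} |e^{\Phi(x)}\varphi_j(x)|^2\,dx \le \dots \tag{83} \]
\[ \dots \le K_\varepsilon \int_{|x| \le R_\varepsilon} |e^{\Phi(x)}\varphi_j(x)|^2\,dx \le K_\varepsilon e^{K_\varepsilon}. \tag{84} \]
Since, for $c$ large enough, $|\nabla\Phi(x)|^2 \le 6c^2$, it follows from (84) and (82) that
\[ \int_{|x| \ge R_\varepsilon} |e^{\Phi(x)}\nabla\varphi_j(x)|^2\,dx \le (2 + 12c^2)K_\varepsilon e^{K_\varepsilon}. \tag{85} \]
Since $\varphi_j$ satisfies (75), we remark also that
\[ \int_{|x| \le R_\varepsilon} |e^{\Phi(x)}\nabla\varphi_j(x)|^2\,dx \le 3\,\frac{e^{2K_\varepsilon}}{c^2}\big[\|D_0(c)\varphi_j\|^2 + \|V_0\varphi_j\|^2 + c^2\|\varphi_j\|^2\big] \tag{86} \]
\[ \le K'_\varepsilon c^2 \tag{87} \]
where $K'_\varepsilon$ is independent of $c$. We used $|\lambda_j(c)| \le |E_0| + (\rho/2)$ and $V_0(x) \le (2 - (\varepsilon/4))c^2$. Therefore, with $K''_\varepsilon > 0$ independent of $c$ and of the function $\Phi$ satisfying (80),
\[ \frac{1}{c^2}\|e^\Phi\nabla\varphi_j\|^2 + \|e^\Phi\varphi_j\|^2 \le K''_\varepsilon. \tag{88} \]
The Proposition follows by the argument of [17]. ✷

Now we shall study the decay at infinity of the orthonormal system of functions $\psi_j$ satisfying (71), following the technique of Sigal [13]. For that, we set
\[ \tilde d(x,V_0,c) = \inf\big(d(x,V_0,c),\ \Sigma(c,\varepsilon)\big). \tag{89} \]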

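Proposition 3 With the previous notations, for each $\varepsilon > 0$, there exists $K_\varepsilon > 0$, independent of $c$, such that the functions $\psi_j$ ($1 \le j \le 2\mu$) satisfy
\[ \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)}\psi_j\| \le K_\varepsilon\, c^{(1-2/k)_+}. \tag{90} \]
In the proof, and also later, we shall use a cut-off function defined as follows. We can choose a function $h \in C^\infty(\mathbb{R})$ such that $0 \le h(t) \le 1$ for all $t$, $h(t) = 1$ if $t \le 2 - \varepsilon$ and $h(t) = 0$ if $t \ge 2 - \frac{\varepsilon}{2}$. We set
\[ \chi(x) = h\!\left(\frac{V(x)}{c^2}\right), \qquad \forall x \in \mathbb{R}^3. \tag{91} \]
We remark that, with $A_\varepsilon$ independent of $c$,
\[ |\nabla\chi(x)| \le A_\varepsilon c^{-2/k}. \tag{92} \]
We remark also that
\[ \chi D_{dis}(\theta,c) = \chi D_0(c) \tag{93} \]
and therefore
\[ D_{dis}(\theta,c)\chi - \chi D_0(c) = [D_0(c),\chi] = c\,(D\chi)\cdot\alpha \tag{94} \]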



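where
\[ (D\chi)\cdot\alpha = \begin{pmatrix} 0 & \sigma\cdot(D\chi) \\ \sigma\cdot(D\chi) & 0 \end{pmatrix}. \]

Proof of Proposition 3. Let $\gamma$ be the boundary of $D$ (a circle with center $E_0$ and radius $\rho$). If $c$ is large enough, all the resonances $E_j(c)$ ($1 \le j \le 2\mu$) are contained in $B(E_0,\rho/2)$. The same arguments as for Lemma 5 (point ii) show that, for $c$ large enough,
\[ \|(z - D_{dis}(\theta,c))^{-1}\| \le K \tag{95} \]
for all $z \in \gamma$, where $K$ is independent of $c$. Let $P$ be the projection defined, for $c$ large enough, by
\[ Pf = \frac{1}{2i\pi}\int_\gamma (z - D_{dis}(\theta,c))^{-1} f\,dz. \tag{96} \]
First, we shall prove that the functions $P\varphi_j$ satisfy the estimates of the proposition. It follows from (94) that, for each $z \in \gamma$ and for all $f \in L^2(\mathbb{R}^3,\mathbb{C}^4)$,
\[ (z - D_{dis}(\theta,c))^{-1}(\chi f) = \big[\chi + c\,(z - D_{dis}(\theta,c))^{-1}(D\chi)\cdot\alpha\big](z - D_0(c))^{-1} f. \tag{97} \]
Applying this equality with $f = \varphi_j$ and integrating over $\gamma$, we obtain, by (96),
\[ P(\chi\varphi_j) = \chi\varphi_j + g_j, \qquad g_j = \frac{c}{2i\pi}\int_\gamma \frac{(z - D_{dis}(\theta,c))^{-1}(D\chi)\cdot\alpha\,\varphi_j}{z - \lambda_j(c)}\,dz. \tag{98} \]
We can write
\[ \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)} P\varphi_j\| \le e^{(1-\varepsilon)\Sigma(c,\varepsilon)}\|P((1-\chi)\varphi_j)\| + \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)}\chi\varphi_j\| + \dots \tag{99} \]
\[ \dots + \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)} g_j\|. \tag{100} \]
By (95), the $L^2$ norm of the projector $P$ is bounded by some constant $K$ independent of $c$. By the definition of $\Sigma(c,\varepsilon)$ and by Proposition 2,
\[ e^{(1-\varepsilon)\Sigma(c,\varepsilon)}\|P((1-\chi)\varphi_j)\| \le K_\varepsilon \tag{101} \]
for some constant $K_\varepsilon$ independent of $c$. If $c$ is large enough, using (95) and (76), we see that
\[ \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)} g_j\| \le K_0\, c\, e^{(1-\varepsilon)\Sigma(c,\varepsilon)}\|(\nabla\chi)\varphi_j\| \tag{102} \]
with some other constant $K_0$. Therefore, using also (92) and the definition of $\Sigma(c,\varepsilon)$, we obtain
\[ \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)} g_j\| \le K_\varepsilon\, c^{1-(2/k)}\,\|e^{(1-\varepsilon)d(\cdot,V_0,c)}\varphi_j\| \le K'_\varepsilon\, c^{1-(2/k)} \tag{103} \]
where $K_\varepsilon$ and $K'_\varepsilon$ are independent of $c$. We used Proposition 2, which shows also that
\[ \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)}(\chi\varphi_j)\| \le \|e^{(1-\varepsilon)d(\cdot,V_0,c)}\varphi_j\| \le C_\varepsilon. \tag{104} \]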


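Summing up, we proved that, for some other $K_\varepsilon$ independent of $c$,
\[ \|e^{(1-\varepsilon)\tilde d(\cdot,V_0,c)} P\varphi_j\| \le K_\varepsilon\, c^{(1-(2/k))_+}. \tag{105} \]
Now we shall orthogonalize the system $(P\varphi_j)$ ($1 \le j \le 2\mu$). We remark that
\[ P\varphi_j - \varphi_j = \frac{1}{2i\pi}\int_\gamma \frac{(z - D_{dis}(\theta,c))^{-1}\big[D_{dis}(\theta,c) - D_0(c)\big]\varphi_j}{z - \lambda_j(c)}\,dz. \tag{106} \]
It follows that
\[ \|P\varphi_j - \varphi_j\| \le K_0\,\big\|\big[D_{dis}(\theta,c) - D_0(c)\big]\varphi_j\big\| \tag{107} \]
where $K_0$ is independent of $c$. We have, if $V_\theta$ is defined in (26) and $V_0$ in (72),
\[ \big\|\big[D_{dis}(\theta,c) - D_0(c)\big]\varphi_j\big\|^2 \le K\int_{V(x) \ge (2-\varepsilon/2)c^2} |\nabla\varphi_j(x)|^2\,dx + K\int_{V(x) \ge (2-\varepsilon/2)c^2} \big[1 + |V_\theta(x) - V_0(x)|^2\big]\,|\varphi_j(x)|^2\,dx \tag{108} \]
for some constant $K$, and we have also $|V_\theta(x) - V_0(x)| \le K\langle x\rangle^k$. By Lemma 8 and Proposition 2, it follows that, for some $K_\varepsilon$, $\|P\varphi_j - \varphi_j\| \le K_\varepsilon e^{-\Sigma(c,\varepsilon)}$. By Lemma 8, $\|P\varphi_j - \varphi_j\| \to 0$ when $c \to +\infty$. Hence the Gram matrix $S = \big((P\varphi_j, P\varphi_k)\big)_{1 \le j,k \le 2\mu}$ tends to the identity when $c \to +\infty$. Therefore, if $c$ is large enough, $T = S^{-1/2}$ is defined, and bounded independently of $c$. If we set $T = (a_{jk})$, the system of functions $\psi_j = \sum_k a_{jk} P\varphi_k$ is an orthonormal basis of $\operatorname{Im} P$, which satisfies the estimates (90). ✷

End of the proof of Theorem 4. We consider again the function $\chi$ defined in (91) and an orthonormal system of eigenfunctions $\psi_j$ satisfying (71). By Proposition 3, we can write
\[ \int_{\operatorname{supp}(1-\chi)} |\psi_j(x)|^2\,dx \le K_\varepsilon^2\, c^2\, e^{-2(1-\varepsilon)\Sigma(c,\varepsilon)}. \tag{109} \]
It follows by Lemma 8 (point i)) that, if $c$ is large enough,
\[ \int (1 - \chi(x))\,|\psi_j(x)|^2\,dx \le \frac{1}{2}. \tag{110} \]
If we take the imaginary part of the scalar product of both sides of (71) with $\chi\psi_j$, we obtain, using (93),
\[ (\operatorname{Im} E_j(c))\int_{\mathbb{R}^3} \chi(x)\,|\psi_j(x)|^2\,dx = \operatorname{Im}\langle D_{dis}(\theta,c)\psi_j, \chi\psi_j\rangle = \operatorname{Im}\langle D_0(c)\psi_j, \chi\psi_j\rangle = -\frac{1}{2}\operatorname{Im}\langle [D_0(c),\chi]\psi_j, \psi_j\rangle. \]
Using (110) and (92), we have, for some constants $K$, $K'$ and $K_\varepsilon$,
\[ |\operatorname{Im} E_j(c)| \le |\langle [D_0(c),\chi]\psi_j, \psi_j\rangle| \le Kc\int |\nabla\chi(x)|\,|\psi_j(x)|^2\,dx \le K' c^{1-(2/k)}\int_{\operatorname{supp}(1-\chi)} |\psi_j(x)|^2\,dx \le K_\varepsilon\, c^3\, e^{-2(1-\varepsilon)\Sigma(c,\varepsilon)}. \tag{111} \]
The estimate (12) of Theorem 4 follows, with another $\varepsilon$, using Lemma 8.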

References

[1] J. Aguilar, J.M. Combes, A class of analytic perturbations for one-body Schrödinger Hamiltonians, Comm. Math. Physics, 22, 280–294 (1971).
[2] E. Balslev, B. Helffer, Limiting absorption principle and resonances for the Dirac operator, Advances in Appl. Math., 13, 186–215 (1992).
[3] D. Grigore, G. Nenciu, R. Purice, On the nonrelativistic limit of the Dirac Hamiltonian, Ann. Inst. H. Poincaré, Phys. Th., 51, 231–263 (1989).
[4] B. Helffer, J. Sjöstrand, Résonances en limite semi-classique, Mémoire de la S.M.F. 24/25 (1986).
[5] P.D. Hislop, I.M. Sigal, Introduction to spectral theory, with applications to Schrödinger operators, Springer (1996).
[6] W. Hunziker, Distortion analyticity and molecular resonance curves, Ann. Inst. H. Poincaré, Phys. Th., 45, 4, 339–358 (1986).
[7] T. Kato, Perturbation theory of linear operators, Springer (1980).
[8] M. Laguel, Résonances en limite semiclassique et propriétés de l'Hamiltonien effectif, Thèse, Reims (1999).
[9] B. Parisse, Résonances pour l'opérateur de Dirac, Helvetica Physica Acta, 64, 557–591 (1991).
[10] B. Parisse, Résonances pour l'opérateur de Dirac II, Helvetica Physica Acta, 65, 1077–1118 (1992).
[11] P. Seba, The complex scaling method for Dirac resonances, Lett. Math. Phys., 16, 51–59 (1988).
[12] M.A. Shubin, Pseudodifferential operators and spectral theory, Springer (1987).
[13] I.M. Sigal, Sharp exponential bounds on resonances states and width of resonances states, Advances in Applied Math., 9, 127–166 (1988).
[14] B. Thaller, The Dirac equation, Springer (1992).
[15] E. Titchmarsh, A problem in relativistic quantum mechanics, Proc. London Math. Society, 41, 170–192 (1961).
[16] K. Veselic, The nonrelativistic limit of the Dirac equation and the spectral concentration, Glasnik Math., 4, 231–240 (1969).
[17] X.P. Wang, Puits multiples pour l'opérateur de Dirac, Ann. Inst. H. Poincaré, 42, 269–319 (1985).

L. Amour, R. Brummelhuis, J. Nourrigat
Département de Mathématiques, ESA 6056 CNRS, Université de Reims
Moulin de la Housse, B.P. 1039, F-51687 Reims Cedex 2, France

Communicated by Bernard Helffer
submitted 05/06/00, accepted 20/07/00




On the Formation of Singularities in Solutions of the Critical Nonlinear Schrödinger Equation

Galina Perelman

Abstract. For the one-dimensional nonlinear Schrödinger equation with critical power nonlinearity the Cauchy problem with initial data close to a soliton is considered. It is shown that for a certain class of initial perturbations the solution develops a self-similar singularity in finite time $T^*$, the profile being given by the ground state solitary wave and the limiting self-focusing law being of the form
\[ \lambda(t) \sim (\ln|\ln(T^* - t)|)^{1/2}\,(T^* - t)^{-1/2}. \]

Introduction

Consider the nonlinear Schrödinger equation
\[ i\psi_t = -\Delta\psi - |\psi|^{2p}\psi, \qquad x \in \mathbb{R}^d, \tag{1} \]
with initial data $\psi|_{t=0} = \psi_0 \in H^1$. It is well known that for $p \ge \frac{2}{d}$ the problem has solutions that blow up in finite time [G]. The case $p = \frac{2}{d}$ marks the transition between global existence and the blowup phenomenon. In this paper we study the participation of nonlinear bound states in singularity formation in the one-dimensional critical case: $d = 1$, $p = 2$. The NLS (1) has an important solution of special form, the soliton $e^{it}\varphi_0(x)$, where $\varphi_0$ is the "ground state solitary wave". Ground states are orbitally stable relative to small perturbations of initial data in the subcritical case and unstable in the critical and supercritical cases. In fact, for $p \ge \frac{2}{d}$, initial data arbitrarily close to a ground state may give rise to a solution that blows up in finite time. In the critical case, however, a kind of orbital stability result is still valid provided one extends the definition of the ground state orbit, taking dilations as well as translations into account. More precisely, any blowup solution $\psi$ with $L^2$ norm close to the $L^2$ norm of $\varphi_0$ is close (in $L^2$) to the set $\{e^{i\mu}\lambda^{1/2}\varphi_0(\lambda(x + b)),\ \mu, b \in \mathbb{R},\ \lambda \in \mathbb{R}_+\}$ for $t$ close enough to the blowup time, see [MM], [W4]. Although it gives some information on the spatial structure of the solutions near the blowup time, this result does not answer the question of what the asymptotic behavior of the system is. Toward an understanding of this asymptotic behavior we have the following



result. We consider the Cauchy problem for (1) ($p = 2$, $d = 1$) with even initial data close to a soliton:
\[ \psi|_{t=0} = \varphi_0 + \chi_0, \tag{2} \]
where $\chi_0$ is small in a suitable sense. We show that for a certain set (open in $X = \{\chi_0 \in H^1,\ x\chi_0 \in L^2\}$) of initial perturbations the solution $\psi$ blows up in finite time $T^*$, admitting the following asymptotic representation:
\[ \psi(t,x) \sim e^{i\mu(t)}\lambda^{1/2}(t)\,\varphi_0(\lambda(t)x), \qquad t \to T^*, \tag{3} \]
\[ \lambda(t) \sim (T^* - t)^{-1/2}(\ln|\ln(T^* - t)|)^{1/2}, \qquad \mu(t) \sim \ln(T^* - t)\,\ln|\ln(T^* - t)|. \tag{4} \]
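To get a feel for how weak the double-logarithmic correction in (4) is, the following minimal sketch tabulates the factor $(\ln|\ln s|)^{1/2}$ for $s = T^* - t$ and compares the resulting scale factor with the rate $(T^* - t)^{-1}$ of the explicit blowup solutions. The function names and sample values are illustrative assumptions, and all unspecified constants in (4) are set to 1.

```python
import math

def loglog_factor(s: float) -> float:
    """Return (ln|ln s|)^(1/2) for s = T* - t in (0, 1/e)."""
    if not 0.0 < s < math.exp(-1.0):
        raise ValueError("need 0 < s < 1/e so that ln|ln s| > 0")
    return math.sqrt(math.log(abs(math.log(s))))

def rate_loglog(s: float) -> float:
    """Scale factor s^(-1/2) (ln|ln s|)^(1/2), with the constant set to 1 (assumption)."""
    return s ** -0.5 * loglog_factor(s)

def rate_explicit(s: float, t_star: float = 1.0) -> float:
    """Scale factor T*/(T* - t) of the explicit blowup solutions; T* = 1 is a sample value."""
    return t_star / s

for s in (1e-2, 1e-6, 1e-12, 1e-100):
    print(f"s={s:.0e}  correction={loglog_factor(s):.3f}  "
          f"loglog={rate_loglog(s):.3e}  explicit={rate_explicit(s):.3e}")
```

Even at $s = 10^{-100}$ the correction factor stays below 2.4, which illustrates why the log-log law is hard to distinguish numerically from the pure self-similar rate $(T^* - t)^{-1/2}$.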

Thus, up to a phase factor, the formation of the singularity is self-similar with a profile given by the ground state. In the multidimensional case the existence and stability of blowup solutions with the asymptotic behavior (3), (4) have been conjectured and formally explained by several authors, see, for example, [DNPZ], [Fr], [KSZ], [LPSS], [LePSS1], [LePSS2], [M1], [M2], [M3], [SF], [SS1], [SS2]. The asymptotics (3), (4) clearly cannot hold for all blowup solutions starting from data close to a ground state, since there is a family of explicit blowup solutions with a different blowup rate:
\[ \left(\frac{T^*}{T^* - t}\right)^{1/2} e^{\,i\frac{x^2}{4(t - T^*)} + i\frac{tT^*}{T^* - t}}\ \varphi_0\!\left(\frac{T^* x}{T^* - t}\right). \tag{5} \]
However, it may be reasonable to expect the exceptional set of initial data to be a manifold of codimension one and the corresponding solutions to behave (up to the invariances of the equation) like the explicit ones (5), see [BW]. This phenomenon is due to a certain degeneracy of the model and is unstable with respect to perturbations of the equation. For the Zakharov equation (which can be considered as a physical refinement of (1)) the solutions with the blowup rate (4) disappear: the minimal blowup rate is given by that of the explicit solutions, see [GM], [Me3].

The structure of this article is briefly as follows. It consists of two sections fairly different in nature. The first contains a complete proof of the indicated result with reference to certain estimates for the linearized operators. The second contains a systematic treatment of the properties of the linearized operators and, in particular, a proof of the estimates mentioned in Section 1. The expositions in the two sections are essentially independent up to the overlap concerning the estimates mentioned. A brief variant of the present article containing a description of the main results was published in [P].

1. Asymptotic behavior of solutions of nonlinear equation

We start by devoting subsection 1.1 to a description of preliminary concepts and to the exact formulation of the results. Subsections 1.2 and 1.3 are devoted to the


proof of (3) for the solution of the Cauchy problem (1), (2). Up to some technical modifications the main line will repeat that of [BP1], [BP2].

1.1 Preliminary facts and formulation of the result

1.1.1 The nonlinear equation

We formulate here the necessary facts about the Cauchy problem for the equation
\[ i\psi_t = -\psi_{xx} - |\psi|^4\psi \tag{1.1.1} \]
with initial data in $H^1$.

Proposition 1.1.1 The Cauchy problem for equation (1.1.1) with initial data $\psi(0,x) = \psi_0(x)$, $\psi_0 \in H^1$, has a unique solution $\psi$ in the space $C([0,T^*) \to H^1)$ with some $T^* > 0$, and

(i) $\psi$ satisfies the conservation laws
\[ \int dx\,|\psi|^2 = \text{const}, \qquad H(\psi) = \int dx\,\Big[|\psi_x|^2 - \frac{1}{3}|\psi|^6\Big] = \text{const}; \]

(ii) if $T^* < \infty$, then $\|\psi_x\|_2 \to \infty$ as $t \to T^*$ and $\|\psi_x\|_2 \ge c\,(T^* - t)^{-1/2}$;

(iii) if $H(\psi_0) < 0$ then $T^* < \infty$.

Suppose in addition that $x\psi_0 \in L^2$. Then $x\psi \in C([0,T^*) \to L^2)$ and $\psi$ satisfies the pseudo-conformal conservation law
\[ \int dx\,|(x + 2it\partial_x)\psi|^2 - \frac{4}{3}t^2\int dx\,|\psi|^6 = \text{const}. \]
The assertions stated here can be found in [CW1], [OT], for example. Equation (1.1.1) is invariant with respect to the transformations
\[ \psi(x,t) \to (a + bt)^{-1/2}\, e^{\,i\omega + i\frac{bx^2}{4(a + bt)}}\ \psi\!\left(\frac{x}{a + bt}, \frac{c + dt}{a + bt}\right), \]
where $\omega \in \mathbb{R}$, $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{R})$.

1.1.2 Exact blowup solutions

Equation (1.1.1) has a family of soliton solutions
\[ e^{\,i\frac{\alpha^2}{4}t}\,\varphi_0(x,\alpha), \qquad \alpha > 0, \tag{1.1.2} \]



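where $\varphi_0$ is a positive even smooth decreasing function satisfying the equation
\[ -\varphi_{0xx} + \frac{\alpha^2}{4}\varphi_0 - \varphi_0^5 = 0. \]
As $|x| \to \infty$, $\varphi_0 \sim \varphi_\infty(\alpha)\,e^{-\frac{\alpha}{2}|x|}$. One has the relation
\[ \varphi_0(x,\alpha) = \Big(\frac{\alpha}{2}\Big)^{1/2}\varphi_0\Big(\frac{\alpha}{2}x\Big), \tag{1.1.3} \]
where $\varphi_0(x)$ stands for $\varphi_0(x,2)$. One can give an explicit expression for $\varphi_0$:
\[ \varphi_0(x) = \frac{3^{1/4}}{\operatorname{ch}^{1/2} 2x}. \]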
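The explicit formula for the ground state can be sanity-checked numerically. The sketch below, which is only an illustration and not part of the argument, evaluates the residual of $-\varphi_0'' + \varphi_0 - \varphi_0^5 = 0$ (the case $\alpha = 2$) by finite differences, and computes the mass $\int \varphi_0^2\,dx$ and the energy $H(\varphi_0)$ by Simpson's rule. The helper names, step sizes and integration window are assumptions chosen for the example; the energy integrand uses $\varphi_0' = -\tanh(2x)\,\varphi_0$, which follows from the explicit formula.

```python
import math

def phi0(x: float) -> float:
    """Ground state phi_0(x) = 3^(1/4) / cosh(2x)^(1/2)."""
    return 3.0 ** 0.25 / math.sqrt(math.cosh(2.0 * x))

def residual(x: float, step: float = 1e-3) -> float:
    """Finite-difference value of -phi'' + phi - phi^5 at x (should be close to 0)."""
    second = (phi0(x + step) - 2.0 * phi0(x) + phi0(x - step)) / step ** 2
    return -second + phi0(x) - phi0(x) ** 5

def integrate(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def energy_density(x: float) -> float:
    """Integrand of H(phi_0) = int [phi_0'^2 - phi_0^6 / 3] dx."""
    p = phi0(x)
    return (math.tanh(2.0 * x) * p) ** 2 - p ** 6 / 3.0

max_residual = max(abs(residual(0.1 * k)) for k in range(-50, 51))
mass = integrate(lambda x: phi0(x) ** 2, -20.0, 20.0)
energy = integrate(energy_density, -20.0, 20.0)
print(max_residual, mass, energy)
```

Under these assumptions the residual is at the level of the discretization error, the mass agrees with $\sqrt{3}\,\pi/2$, and the energy vanishes up to quadrature error, as expected for the ground state in the critical case.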

Applying the transformations (1.1.2) to (1.1.1) one gets a 3-parameter family of solutions
\[ e^{\,i\mu(t) - i\beta(t)z^2/4}\,\lambda^{1/2}(t)\,\varphi_0(z), \qquad z = \lambda(t)x, \tag{1.1.4} \]
where $\mu$, $\beta$, $\lambda$ are given by
\[ \lambda(t) = (a + bt)^{-1}, \qquad \beta(t) = -b(a + bt), \qquad \mu(t) = \frac{c + dt}{a + bt}. \]
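As a quick check of these formulas, the following sketch verifies numerically that $\lambda$, $\beta$, $\mu$ satisfy the system $\lambda^{-3}\lambda_t = \beta$, $\lambda^{-2}\beta_t + \beta^2 = 0$, $\lambda^{-2}\mu_t = 1$ recorded just after (1.1.4). The third equation relies on the normalization $ad - bc = 1$. The sample matrix entries and the finite-difference step are hypothetical choices made for the illustration.

```python
def parameters(t, a, b, c, d):
    """lambda, beta, mu of (1.1.4) for a matrix [[a, b], [c, d]] with ad - bc = 1."""
    lam = 1.0 / (a + b * t)
    beta = -b * (a + b * t)
    mu = (c + d * t) / (a + b * t)
    return lam, beta, mu

def system_residuals(t, a, b, c, d, step=1e-5):
    """Residuals of lam^-3 lam_t = beta, lam^-2 beta_t + beta^2 = 0, lam^-2 mu_t = 1."""
    lam, beta, _ = parameters(t, a, b, c, d)
    plus = parameters(t + step, a, b, c, d)
    minus = parameters(t - step, a, b, c, d)
    # Central differences for the three time derivatives.
    lam_t, beta_t, mu_t = ((p - m) / (2.0 * step) for p, m in zip(plus, minus))
    return (lam_t / lam ** 3 - beta,
            beta_t / lam ** 2 + beta ** 2,
            mu_t / lam ** 2 - 1.0)

# Hypothetical sample: a = 1, b = -1/2, so a + bt vanishes at t = 2.
a, b, c = 1.0, -0.5, 0.3
d = (1.0 + b * c) / a  # enforce ad - bc = 1
residuals = [system_residuals(t, a, b, c, d) for t in (0.0, 0.5, 1.0, 1.5)]
print(residuals)
```

All residuals are zero up to finite-difference error at the sample times, which all lie before the singular time $t = -a/b$.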

Note that $\lambda(t)$, $\beta(t)$, $\mu(t)$ satisfy the system
\[ \lambda^{-3}\lambda_t = \beta, \qquad \lambda^{-2}\beta_t + \beta^2 = 0, \qquad \lambda^{-2}\mu_t = 1. \]
If $b \neq 0$, solution (1.1.4) blows up in finite time. It is known that equation (1.1.1) has no blowup solutions in the class $\{\psi \in H^1(\mathbb{R}),\ \|\psi\|_2 < \|\varphi_0\|_2\}$, see [W3]. The solutions (1.1.4) are the only blowup solutions (up to Galilei invariance) with minimal mass, see [Me1], [Me2].

1.1.3 Extended manifold of blowup solutions

The 3-parameter family (1.1.4) can be considered as the boundary $a = 0$ of the 4-parameter family of formal solutions $w(x,\sigma(t))$,
\[ w(x,\sigma) = e^{\,i\mu - i\beta z^2/4}\,\lambda^{1/2}\,\varphi(z,a), \qquad z = \lambda x, \]
$\sigma = (\frac{\mu}{2}, \lambda, \beta, a)$, $\lambda \in \mathbb{R}_+$, $\beta, \mu, a \in \mathbb{R}$. Here
\[ \varphi(z,a) = \sum_{n=0}^{\infty} a^n \varphi_n(z) \tag{1.1.5} \]



is a formal solution of the equation
\[ -\varphi_{zz} + \varphi - \frac{az^2}{4}\varphi - \varphi^5 = 0. \tag{1.1.6} \]
Equation (1.1.6) is equivalent to the following system for $\varphi_n$:
\[ L_{0+}\varphi_n = \frac{z^2}{4}\varphi_{n-1} + F_n, \qquad n \ge 1, \]
where $L_{0+} = -\partial_z^2 + 1 - 5\varphi_0^4$, $F_n$ being a homogeneous polynomial of degree 5 in $\varphi_k$, $k \le n - 1$. In particular, $\varphi_1$ is characterized by the equation
\[ L_{0+}\varphi_1 = \frac{z^2}{4}\varphi_0. \]
Since $L_{0+}\varphi_0' = 0$ and $\varphi_0'$ is odd, the operator $L_{0+}$ is invertible when restricted to the subspace of even functions. As a consequence, the above equations have a unique even solution decreasing as $|z| \to \infty$. More precisely,
\[ |\varphi_n(z)| \le c\,\langle z\rangle^{3n} e^{-|z|}, \qquad z \in \mathbb{R}. \]

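We use the notation $\langle z\rangle = (1 + z^2)^{1/2}$. The function $w(x,\sigma(t))$ is a formal solution of (1.1.1) if $\sigma(t)$ satisfies the system
\[ \lambda^{-3}\lambda_t = \beta, \qquad \lambda^{-2}\beta_t + \beta^2 = a, \qquad \lambda^{-2}\mu_t = 1, \qquad a_t = 0, \tag{1.1.7} \]
which gives, in particular, $\lambda = (d_2 t^2 + d_1 t + d_0)^{-1/2}$, $a = d_1^2/4 - d_2 d_0$. Here the $d_j$ are constants. (Indeed, with $Q = d_2 t^2 + d_1 t + d_0$ one has $\beta = -Q'/2$, so $\lambda^{-2}\beta_t + \beta^2 = -d_2 Q + Q'^2/4 = d_1^2/4 - d_2 d_0$.) We shall use the notations
\[ \varphi^N(z,a) = \sum_{k=0}^{N} a^k \varphi_k(z), \qquad \varphi^N(x,\alpha,a) = \Big(\frac{\alpha}{2}\Big)^{1/2}\varphi^N\Big(\frac{\alpha}{2}x, \frac{16a}{\alpha^4}\Big). \]

1.1.4 Linearization of (1.1.1) on a soliton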

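Consider the linearization of (1.1.1) on the soliton $e^{it}\varphi_0(x)$:
\[ i\chi_t = -\chi_{xx} - \varphi_0^4\chi - 2\varphi_0^4(\chi + e^{2it}\bar\chi). \]
Introduce the function $f$: $\chi = e^{it}f$. Then $f$ satisfies the equation
\[ i\vec f_t = H_0\vec f, \qquad \vec f = \begin{pmatrix} f \\ \bar f \end{pmatrix}, \]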



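\[ H_0 = (-\partial_x^2 + 1)\sigma_3 + V(\varphi_0), \qquad V(\xi) = -3\xi^4\sigma_3 - 2i\xi^4\sigma_2, \]
$\sigma_2$, $\sigma_3$ being the standard Pauli matrices:
\[ \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]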
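The matrix structure of $H_0$ can be checked by hand or with a few lines of code. The sketch below is an illustration at the level of $2 \times 2$ matrices only: it replaces $-\partial_x^2$ by a number $k^2$ (an assumption made purely for the check, not an operator statement) and verifies that $M = (k^2 + 1)\sigma_3 + V(\xi)$ satisfies $\sigma_3 M \sigma_3 = M^*$ and $\sigma_1 M \sigma_1 = -M$, with $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. These are the pointwise counterparts of the symmetry relations for $H_0$ stated next. All helper names and sample values are hypothetical.

```python
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(s, p, t, q):
    """Linear combination s*p + t*q of two 2x2 matrices."""
    return [[s * p[i][j] + t * q[i][j] for j in range(2)] for i in range(2)]

def adjoint(p):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(p[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(p, q, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(p[i][j] - q[i][j]) <= tol for i in range(2) for j in range(2))

def potential(xi):
    """V(xi) = -3 xi^4 sigma_3 - 2i xi^4 sigma_2."""
    return combine(-3.0 * xi ** 4, SIGMA3, -2j * xi ** 4, SIGMA2)

def symbol(k, xi):
    """Pointwise matrix (k^2 + 1) sigma_3 + V(xi); k^2 stands in for -d^2/dx^2."""
    return combine(k * k + 1.0, SIGMA3, 1.0, potential(xi))

def conjugate_by(s, m):
    """Return s m s for an involutive matrix s."""
    return matmul(s, matmul(m, s))

m = symbol(0.7, 1.1)  # hypothetical sample values of k and xi
relation_adjoint = close(conjugate_by(SIGMA3, m), adjoint(m))
relation_sign = close(conjugate_by(SIGMA1, m), combine(-1.0, m, 0.0, m))
print(relation_adjoint, relation_sign)
```

The check also makes visible that $V(\xi)$ is the real matrix $\xi^4\begin{pmatrix} -3 & -2 \\ 2 & 3 \end{pmatrix}$, which is the form that arises when the linearized equation for $f$ is written for the pair $(f, \bar f)$.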

$H_0$ is considered as a linear operator in $L^2(\mathbb{R} \to \mathbb{C}^2)$ defined on the natural domain. In this section $L^2$ stands for the subspace of the standard $L^2$ consisting of even functions. The operator $H_0$ satisfies the relations
\[ \sigma_3 H_0 \sigma_3 = H_0^*, \qquad \sigma_1 H_0 \sigma_1 = -H_0, \tag{1.1.8} \]
where $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. The continuous spectrum of $H_0$ consists of two semi-axes $(-\infty,-1]$, $[1,\infty)$ and is simple. The point $E = 0$ is an eigenvalue of multiplicity 4. By differentiating the solution $w$ with respect to the parameters it is easy to distinguish an eigenfunction $\vec\xi_0$,
\[ H_0\vec\xi_0 = 0, \qquad \vec\xi_0 = i\varphi_0\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \]
and three associated functions $\vec\xi_j$, $j = 1, 2, 3$, $H_0\vec\xi_j = i\vec\xi_{j-1}$, where
\[ \vec\xi_1(x) = \frac{1}{4}(1 + 2x\partial_x)\varphi_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \vec\xi_2(x) = -\frac{i}{8}x^2\varphi_0(x)\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \qquad \vec\xi_3(x) = \frac{1}{2}\varphi_1(x)\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
$\varphi_1$ being the second coefficient in the expansion (1.1.5). Since
\[ \langle\vec\xi_3, \sigma_3\vec\xi_0\rangle = -ie, \qquad e = \frac{\|x\varphi_0\|_2^2}{8} \neq 0, \]
the vectors $\vec\xi_j$, $j = 0, \dots, 3$, span the root subspace of $H_0$ corresponding to the eigenvalue $E = 0$. It will be shown in Section 2 that $E = 0$ is the only eigenvalue of $H_0$.



1.1.5 Main theorem

Consider the Cauchy problem for equation (1.1.1) with initial data
\[ \psi|_{t=0} = \psi_0, \qquad \psi_0(x) = e^{-i\beta_0 x^2/4}\big(\varphi^N(x,\beta_0^2) + \chi_0(x)\big), \qquad \beta_0 > 0, \tag{1.1.9} \]
where $\chi_0(x) = \chi_0(-x)$ and $\chi_0$ satisfies the estimate
\[ \|\chi_0\|_X = O(\beta_0^{2N}). \tag{1.1.10} \]
Here $\|f\|_X = \|f\|_{H^1} + \|xf\|_{L^2}$. Assume that

(i) $N$ is sufficiently large;

(ii) $\beta_0$ is sufficiently small.

These conditions give, in particular,
\[ H(\varphi^N(\beta_0^2) + \chi_0) = -2\beta_0^2 e + O(\beta_0^4) < 0, \]
which together with the conformal invariance implies that the solution $\psi$ of the Cauchy problem (1.1.1), (1.1.9) blows up in finite time $T^* < \infty$. Our main result is the following.

Theorem 1.1.1 The solution $\psi$ of the Cauchy problem (1.1.1), (1.1.9) blows up in finite time $T^* = \frac{1}{2\beta_0}(1 + o(1))$, as $\beta_0 \to 0$, and there exist $\lambda(t), \mu(t) \in C^1([0,T^*))$,
\[ \lambda(t) = \text{const}\,(T^* - t)^{-1/2}(\ln|\ln(T^* - t)|)^{1/2}(1 + o(1)), \qquad \mu(t) = \text{const}\,\ln(T^* - t)\ln|\ln(T^* - t)|\,(1 + o(1)), \qquad t \to T^*, \tag{1.1.11} \]
such that $\psi$ admits the representation
\[ \psi(x,t) = e^{i\mu(t)}\lambda^{1/2}(t)\,(\varphi_0(z) + \chi(z,t)), \qquad z = \lambda(t)x, \]
where $\chi$ is small in $L^2 \cap L^\infty$ uniformly with respect to $t \in [0,T^*)$. Moreover, $\|\chi\|_\infty = o(1)$ as $t \to T^*$. The constants in (1.1.11) are independent of the initial data.

Remark. Due to the conformal invariance the same result remains valid for initial data of the form
\[ \tilde\psi_0(x) = e^{\,i\omega - ibz^2/4}\lambda^{1/2}\psi_0(z), \qquad z = \lambda x, \]
where $\omega \in \mathbb{R}$, $\lambda \in \mathbb{R}_+$, $b > -\frac{1}{T^*}$.

Remark. In principle our approach makes it possible to obtain an explicit value of the constant assumed in hypothesis (i). But this would make the calculations less transparent and the result would be very far from the optimal one (we expect the theorem to be true for $N > 2$).

1.1.6 Outline of the proof

The proof contains two main ingredients: the ideas of the works [BP1], [BP2], [SW1], [SW2], where the asymptotic stability of solitary waves was considered, and the asymptotic constructions of the works mentioned in the introduction, especially that of [SF]. We shall now briefly describe the main steps of the proof.

Step 1. Splitting of the motion. Following [BP1], [BP2] we start by introducing some new coordinates for the description of the solution with initial data (1.1.9). The new coordinates possess an important property: they allow us to split the motion into two parts. The first part is a finite-dimensional dynamics on the manifold of formal solutions $\{w(\cdot,\sigma)\}$, and the second part remains small in some sense for all $t\in[0,T^*)$. To describe these coordinates we introduce a quasi-solution $\tilde\varphi(z,a)$ of (1.1.6). One of the principal difficulties in the description of the critical blow-up comes from the fact that (1.1.6) has no admissible solutions for $a>0$, which explains the presence of a correction to the self-similar blowup rate $(T^*-t)^{-1/2}$, see again [DNPZ], [Fr], [KSZ], [LPSS], [LePSS1], [LePSS2], [M1], [M2], [M3], [SF], [SS1], [SS2]. By admissible we mean a solution with the purely outgoing behavior at infinity
$$\varphi\sim\mathrm{const}\;e^{i\frac{hz^2}{4}}|z|^{-\frac12-\frac ih},\quad h=\sqrt a,\quad\text{as }|z|\to\infty,$$
which would give a finite energy blowup solution $w$ of (1.1.1) with the blowup rate $(T^*-t)^{-1/2}$. To overcome this difficulty we follow the approach of [SF]. Instead of (1.1.6) we consider a modified equation where the quadratic potential $-\frac{az^2}{4}$ is replaced by zero outside the interval $h^{-1}[-2+\delta_0,2-\delta_0]$ with some $\delta_0>0$. For $a$ sufficiently small this modified equation has a solution $\tilde\varphi$ that decreases exponentially as $|z|\to\infty$. The obtained profile $\tilde\varphi$ almost satisfies (1.1.6):
$$-\tilde\varphi_{zz}+\tilde\varphi-\frac{az^2}{4}\tilde\varphi-\tilde\varphi^5=F_0(a),$$
where the error $F_0$ is exponentially small (with respect to $a$). Choosing $\delta_0$ sufficiently small we shall make $F_0$ almost of the same order as the effective small parameter of the problem $e^{-\frac{S_0}{h}}$, $S_0=\int_0^2ds\,\sqrt{1-s^2/4}$ (we use this expression for $S_0$ instead of the explicit value in order to underline its obvious semi-classical meaning). The exact assertions related to the modified profile $\tilde\varphi$ as well as a description of the spectral properties of the corresponding linearized operator $\tilde H$ are given in subsubsection 1.2.1. Using the profile $\tilde\varphi$ we decompose the solution $\psi$ of (1.1.1), (1.1.9) as follows:
$$\psi(x,t)=e^{i\mu(t)-i\beta(t)z^2/4}\lambda^{1/2}(t)\bigl(\tilde\varphi(z,a)+f(z,t)\bigr),$$
the decomposition being fixed by some suitable orthogonality conditions that have a natural interpretation in terms of the spectral objects associated to $\tilde H$, see subsubsection 1.2.2. For the present the parameter $\delta_0$ in the definition of $\tilde\varphi$ is arbitrary. We fix it only at the last steps of the proof.
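As a small worked aside, the constant $S_0$ can be evaluated in closed form. The sketch below assumes the reading $S_0=\int_0^2ds\,\sqrt{1-s^2/4}$, with upper limit 2 at the turning point of $1-s^2/4$; the substitution variable $\phi$ is introduced here only for the computation.

```latex
% substitution s = 2 sin(phi), ds = 2 cos(phi) dphi
S_0=\int_0^2\sqrt{1-\tfrac{s^2}{4}}\,ds
   =\int_0^{\pi/2}\cos\phi\cdot 2\cos\phi\,d\phi
   =2\int_0^{\pi/2}\cos^2\phi\,d\phi
   =\frac{\pi}{2}.
```

Under this reading, the combination $4S_0$ that appears in the final blow-up rate equals $2\pi$.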

The functions $\sigma(t)=\bigl(\frac{\mu(t)}{2},\lambda(t),\beta(t),a(t)\bigr)$ and $f$ satisfy the system of coupled equations:
$$i\vec f_\tau=H(a)\vec f+\mathcal N(a,f),\tag{1.1.12}$$
$$\sigma_\tau=\mathcal G(a,f),\tag{1.1.13}$$
where
$$H(a)=\Bigl(-\partial_z^2+1-\frac{az^2}{4}\Bigr)\sigma_3+V(\tilde\varphi(a)),$$
$\mathcal G$, $\mathcal N$ are some nonlinear functions, and $\tau$ is a changed time variable: $\tau=\int_0^tds\,\lambda^2(s)$, $\tau\to\infty$ as $t\to T^*$.

Step 2. Effective equations. Assuming that $a(\tau)$ is a small, slowly varying function, we single out the main order terms in $\mathcal N$, $\mathcal G$ and derive a model system that we expect to describe qualitatively the dynamics (1.1.12), (1.1.13). The model system has the form
$$if_\tau=\Bigl(-\partial_z^2+1-a\frac{z^2}{4}\Bigr)f+F_0(a),$$
$$\lambda^{-1}\lambda_\tau=\beta,\quad\beta_\tau+\beta^2=h^2,\quad\mu_\tau=1,\quad h_\tau=-ch^{-1}e^{-\frac{2S_0}{h}}(1+O(h)),$$
$$f|_{\tau=0}=\chi_0,\quad\lambda(0)=1,\quad\beta(0)=h(0)=\beta_0,\quad\mu(0)=0,$$
where $c$ is a positive constant. At this stage the constructions are formal and quite similar to those of [SF]. Solving the equation for $h$ one gets
$$h\sim\ln^{-1}(\tau+\tau^*),\quad\tau^*\sim e^{\frac{2S_0}{\beta_0}}\beta_0^3,$$
which leads to (3), (4).

Step 3. Estimates of the solution. To prove that the complete dynamics (1.1.12), (1.1.13) is indeed close to the model one we employ the standard perturbation methods; the same methods were used in [BP1], [BP2]. To ensure that the correction terms in (1.1.12) can be treated perturbatively one requires suitable time-decay estimates (local in space) for the dispersive solutions of the linear equation $i\vec f_\tau=H(a(\tau))\vec f$. In our case this local decay is a consequence of the corresponding properties of the group $e^{-i\tau H(a)}$ restricted to the subspace of the “continuous” spectrum of $H(a)$, see proposition 1.2.7, and of the fact that $a$ depends on $\tau$ slowly.
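To see where the equations for $\lambda$, $\beta$ and $\mu$ in the model system of Step 2 come from, one can set to zero the small modulation quantities that are introduced later in (1.2.8). The following is only a sketch of that bookkeeping; the labels $\eta_1,\dots,\eta_4$ are a labelling assumed here for the four entries of $\vec\eta$.

```latex
\vec\eta=(\eta_1,\eta_2,\eta_3,\eta_4)
 =\Bigl(\tfrac{\mu_\tau-1}{2},\ \tfrac{\lambda_\tau}{\lambda}-\beta,\
   \beta_\tau-\beta^2+2\beta\tfrac{\lambda_\tau}{\lambda}-a,\ a_\tau\Bigr).
% eta_1 = 0  gives  mu_tau = 1
% eta_2 = 0  gives  lambda^{-1} lambda_tau = beta
% eta_3 = 0, combined with lambda_tau / lambda = beta:
\beta_\tau-\beta^2+2\beta^2-a=0
\quad\Longrightarrow\quad
\beta_\tau+\beta^2=a=h^2 .
```

Only the fourth component is kept at its leading nonzero order, which is what produces the slow equation for $h=\sqrt a$.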

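1.2 Splitting of motions

1.2.1 Modified ground state

Consider the equation
$$-\tilde\varphi_{zz}+\frac{\alpha^2}{4}\tilde\varphi-\frac{az^2}{4}\theta(hz)\tilde\varphi-\tilde\varphi^5=0,\quad h=\sqrt{|a|}>0,\tag{1.2.1}$$
$\alpha,a\in\mathbb R$. Here $\theta\in C_0^\infty(\mathbb R)$, $\theta(\xi)=\theta(-\xi)$, $\theta(\xi)\le1$,
$$\theta(\xi)=\begin{cases}1,&|\xi|\le2-\delta_0,\\0,&|\xi|>2-\delta_0/2,\end{cases}$$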

$\delta_0>0$ is sufficiently small ($\theta$ can be considered as a family of cut-off functions parametrized by $\delta_0$). One has the following proposition.

Proposition 1.2.1 For $\alpha$ in some finite vicinity of 2 and for $a$ sufficiently small,¹ equation (1.2.1) has a unique positive even smooth decreasing solution $\tilde\varphi(z,\alpha,a)$ which is close to $\varphi_0(z,\alpha)$. Moreover,
(i) as $a\to0$, $\tilde\varphi(z,\alpha,a)$ admits the asymptotic expansion (1.1.5) in the sense
$$|\tilde\varphi-\varphi_N|\le c|a|^{N+1}\langle x\rangle^{3(N+1)}e^{-\frac1h\tilde S_{\alpha,a}(h|x|)},\quad\tilde S_{\alpha,a}(\xi)=\frac12\int_0^\xi ds\,\sqrt{\alpha^2-(a)_+s^2\theta(s)};$$
(ii) $\|e^{\frac1hS_{\alpha,a}(h|x|)}\tilde\varphi(\alpha,a)\|_\infty\le c$, $\ S_{\alpha,a}(\xi)=\frac12\int_0^\xi ds\,\sqrt{\alpha^2-\operatorname{sgn}a\,s^2\theta(s)}$.
Similar formulas are valid for the derivatives of $\tilde\varphi$ with respect to $z$, $\alpha$, $a$. Here $(a)_+$ stands for $\max(a,0)$.

¹ The constants here and below depend on $\delta_0$.

See subsection 2.2 for the proof.

Introduce a linearized operator $\tilde H(a)$ associated to the modified ground state $\tilde\varphi(z,a)=\tilde\varphi(z,2,a)$:
$$\tilde H(a)=\Bigl(-\partial_z^2+1-\frac{az^2}{4}\theta\Bigr)\sigma_3+V(\tilde\varphi(a)).$$
The continuous spectrum of $\tilde H(a)$ is the same as in the case of the operator $H_0$. The point $E=0$ is an eigenvalue of $\tilde H(a)$ of multiplicity 2. There are an eigenfunction $\tilde\zeta_0(a)$,
$$\tilde\zeta_0(a)=i\tilde\varphi(a)\binom{1}{-1},\quad\tilde H\tilde\zeta_0=0,$$
and an associated function $\tilde\zeta_1(a)$,
$$\tilde\zeta_1(a)=\partial_\alpha\tilde\varphi(\alpha,a)\big|_{\alpha=2}\binom11,\quad\tilde H\tilde\zeta_1=i\tilde\zeta_0,$$
$$\langle\tilde\zeta_1,\sigma_3\tilde\zeta_0\rangle=i4ea+O(a^2).$$
A more detailed description of the discrete spectrum can be obtained by means of the standard perturbation methods. In particular, the following proposition is proved in subsubsection 2.3.2.

Proposition 1.2.2 For $a$ sufficiently small the discrete spectrum of the operator $\tilde H(a)$ in some finite vicinity of the point $E=0$ consists of 0 and two simple eigenvalues $\pm\lambda(a)$, $\lambda(a)=i\sqrt a\,\lambda'(a)$, where $\lambda'$ is a smooth real function of $a$. As $a\to0$, $\lambda'(a)=2+O(a)$. Let $\tilde\zeta_2(a)$ be an eigenfunction corresponding to $\lambda(a)$ normalized by the condition
$$\langle\tilde\zeta_2,\vec\xi_0\rangle=\langle\tilde\zeta_0,\vec\xi_0\rangle-\lambda^2\langle\vec\xi_2,\vec\xi_0\rangle.$$
Then $\tilde\zeta_2(a)$ is a smooth function of $a^{1/2}$ admitting the following asymptotic expansion as $a\to0$:
$$\tilde\zeta_2=\tilde\zeta_0-i\lambda\tilde\zeta_1-\lambda^2\vec\xi_2+i\lambda^3\vec\xi_3+ia\lambda^2(h_0+O(a))\binom{1}{-1}+ia\lambda^3(h_1+O(a))\binom11,$$
where $h_i$, $i=0,1$, are some real even smooth exponentially decreasing functions. $O(a)$ corresponds to the $L_\infty$-norm with the weight $e^{\frac{1-\gamma}{h}\tilde S_a(h|x|)}$, $\tilde S_a(\xi)=\tilde S_{2,a}(\xi)$, $\gamma>0$. This asymptotic representation can be differentiated any number of times with respect to $x$ and $a$.

Let us mention that
$$\sigma_1\tilde\zeta_2=\bar{\tilde\zeta}_2.$$
In the subspace generated by $\tilde\zeta_j(a)$, $j=0,\dots,3$, where $\tilde\zeta_3=\sigma_1\tilde\zeta_2$ is an eigenfunction corresponding to the eigenvalue $-\lambda$, we introduce a new basis $\{\vec e_j(a)\}_{j=0}^3$:
$$\vec e_0=\tilde\zeta_0,\quad\vec e_1=\tilde\zeta_1,\quad\vec e_2=\frac{1}{2\lambda^2}\bigl(-\tilde\zeta_2+\tilde\zeta_3+2\tilde\zeta_0\bigr),\quad\vec e_3=-\frac{i}{2\lambda^3}\bigl(\tilde\zeta_2+\tilde\zeta_3+i2\lambda\tilde\zeta_1\bigr),$$
$$\vec e_2=e_2\binom{1}{-1},\quad\vec e_3=e_3\binom11,\quad\bar e_j=(-1)^{j-1}e_j.$$
It follows from proposition 1.2.2 that as $a\to0$,
$$\vec e_2=\vec\xi_2-iah_0\binom{1}{-1}+O(a^2),\qquad\vec e_3=\vec\xi_3+ah_1\binom11+O(a^2).$$

1.2.2 Orthogonality conditions

Return to the Cauchy problem (1.1.1), (1.1.9). Using the profile $\tilde\varphi$ one can rewrite the initial data $\psi_0$ in the form
$$\psi_0=e^{-i\frac{\beta_0x^2}{4}}\bigl(\tilde\varphi(\beta_0^2)+\chi_0\bigr),\quad\|\chi_0\|_X=O(\beta_0^{2N}).$$
Below we shall omit “ “ in the notation of $\chi_0$. Write the solution $\psi$ as the sum
$$\psi(x,t)=e^{i\Phi}\lambda^{1/2}(t)\bigl(\tilde\varphi(z,a(t))+f(z,t)\bigr),\quad\Phi=\mu(t)-\frac{\beta(t)}{4}z^2,\quad z=\lambda(t)x,\tag{1.2.2}$$
where $\tilde\varphi(z,a)=\tilde\varphi(z,2,a)$, $\sigma(t)=\bigl(\frac{\mu(t)}{2},\lambda(t),\beta(t),a(t)\bigr)$ being an arbitrary curve in $\mathbb R_+\times\mathbb R^3$; it is not a solution of (1.1.7) in general. The decomposition can be fixed by the orthogonality conditions
$$\langle\vec f(t),\sigma_3\vec e_j(a(t))\rangle=0,\quad j=0,\dots,3.\tag{1.2.3}$$

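This means that $\sigma$ has to satisfy the system
$$F_j(\psi,\sigma)=0,\quad j=0,\dots,3,\tag{1.2.4}$$
$$F_j(\psi,\sigma)=\lambda^{1/2}\langle\vec\psi,\sigma_3e^{i\Phi\sigma_3}\vec e_j(\lambda\cdot,a)\rangle-\langle\vec e_0(a),\vec e_j(a)\rangle,\qquad\vec\psi=\binom{\psi}{\bar\psi}.$$
The solvability of (1.2.4) for $\psi$ in some small $L_2$-vicinity of $\varphi_0$ is guaranteed by the smoothness of the basis $\vec e_j(a)$, $j=0,\dots,3$, and the non-degeneracy of the corresponding Jacobi matrix
$$B_0=\Bigl(\frac{\partial F_j}{\partial\sigma_k}\Bigr)\Big|_{\psi=\varphi_0,\ \sigma=(1,0,0,0)}.$$
It is not difficult to check that
$$B_0=-2\bigl(\langle\vec\xi_k,\sigma_3\vec\xi_j\rangle\bigr)_{k,j=0}^3,\qquad\det B_0=\bigl(2\langle\vec\xi_1,\sigma_3\vec\xi_2\rangle\bigr)^4=(8e)^4\ne0.$$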

So, one can assume that the initial decomposition obeys (1.2.3): $\langle\vec\chi_0,\sigma_3\vec e_j(\beta_0^2)\rangle=0$, $j=0,\dots,3$. To prove the existence of a trajectory $\sigma(t)$ we need the following orbital stability result:

Proposition 1.2.3 For any $\epsilon>0$ there exists $\delta>0$ such that for any $\psi_0$, $\|\psi_0-\varphi_0\|_{H^1}\le\delta$, $E(\psi_0)<0$, there exists $\mu(t)\in C([0,T^*))$ such that the solution $\psi$ corresponding to the initial data $\psi_0$ satisfies the inequality
$$\|\psi(t)-\lambda^{1/2}(t)e^{i\mu(t)}\varphi_0(\lambda(t)\cdot)\|_2\le\epsilon,\quad0\le t<T^*,$$
where $\lambda(t)$ is given by
$$\lambda(t)=\frac{\|\psi_x(t)\|_2}{\|\varphi_{0x}\|_2}.$$
See [LBSK], [W2], [W3] for the proof.

By (1.1.10), $\tilde\psi_0$, $\tilde\psi_0=\tilde\varphi(\beta_0^2)+\chi_0$, satisfies the conditions of the above proposition. Thus, the corresponding solution $\tilde\psi(t)$ admits the representation
$$\tilde\psi(x,t)=e^{i\tilde\Phi}\tilde\lambda^{1/2}(t)\bigl(\tilde\varphi(z,\tilde a(t))+\tilde f(z,t)\bigr),\quad\tilde\Phi=\tilde\mu(t)-\frac{\tilde\beta(t)}{4}z^2,\quad z=\tilde\lambda(t)x,$$
where $\tilde\sigma(t)=\bigl(\frac{\tilde\mu(t)}{2},\tilde\lambda(t),\tilde\beta(t),\tilde a(t)\bigr)$, $\tilde\sigma(0)=(0,1,0,\beta_0^2)$, is a continuous trajectory satisfying (1.2.4), with $\|\tilde f\|_2$, $\bigl|\tilde\lambda\frac{\|\varphi_{0x}\|_2}{\|\tilde\psi_x(t)\|_2}-1\bigr|$, $\tilde\beta$, $\tilde a$ being small uniformly with respect to $t$. By the conformal invariance we can now write the solution $\psi(t)$ of the Cauchy problem (1.1.1), (1.1.9) in the form (1.2.2), where
$$\mu(t)=\tilde\mu(\rho),\quad\lambda(t)=(1-\beta_0t)^{-1}\tilde\lambda(\rho),\quad\beta(t)=\beta_0(1-\beta_0t)\tilde\lambda^{-2}+\tilde\beta(\rho),$$
$$a(t)=\tilde a(\rho),\quad\rho=\frac{t}{1-\beta_0t},$$
with $f(z,t)=\tilde f(z,\rho)$ satisfying the orthogonality conditions (1.2.3). By (i) of proposition 1.1.1, $\lambda$ admits the estimate
$$\lambda(t)\ge c(T^*-t)^{-1/2}.\tag{1.2.5}$$
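The passage from $\tilde\psi$ to $\psi$ by conformal invariance can be checked by hand. The sketch below assumes that (1.1.1) is the quintic equation $i\psi_t=-\psi_{xx}-|\psi|^4\psi$ and that "conformal invariance" refers to the pseudo-conformal transformation written in the first line; neither formula is displayed in this section, so both should be read as assumptions.

```latex
% pseudo-conformal transformation (assumed form), rho = t/(1-beta_0 t):
\psi(x,t)=(1-\beta_0t)^{-1/2}
  \exp\Bigl(-\frac{i\beta_0x^2}{4(1-\beta_0t)}\Bigr)\,
  \tilde\psi\Bigl(\frac{x}{1-\beta_0t},\rho\Bigr),
\qquad
\psi(x,0)=e^{-i\beta_0x^2/4}\tilde\psi_0(x).
% insert the representation of \tilde\psi, evaluated at x/(1-beta_0 t):
z=\tilde\lambda(\rho)\frac{x}{1-\beta_0t}=\lambda(t)x
\quad\Longrightarrow\quad
\lambda(t)=(1-\beta_0t)^{-1}\tilde\lambda(\rho),
% amplitude:
(1-\beta_0t)^{-1/2}\tilde\lambda^{1/2}(\rho)=\lambda^{1/2}(t),
% phase, using x^2=(1-beta_0 t)^2 z^2/\tilde\lambda^2:
-\frac{\beta_0x^2}{4(1-\beta_0t)}-\frac{\tilde\beta(\rho)}{4}z^2
 =-\frac{z^2}{4}\Bigl(\beta_0(1-\beta_0t)\tilde\lambda^{-2}+\tilde\beta(\rho)\Bigr)
 =-\frac{\beta(t)}{4}z^2 .
```

The three identities reproduce the stated formulas for $\lambda(t)$ and $\beta(t)$, while $\mu$, $a$ and $f$ are simply re-evaluated at the time $\rho$.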

Remark that since $\psi(t)\in C^1([0,T^*)\to H^{-1})$ the trajectory $\sigma(t)$ belongs, in fact, to $C^1$.

1.2.3 Differential equations

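We write a system of equations for $\sigma$ and $f$ in explicit form. Introduce a new time variable $\tau$:
$$\tau=\int_0^tds\,\lambda^2(s).$$
By (1.2.5), $\tau\to\infty$ as $t\to T^*$. In terms of $f$, (1.1.1) takes the form
$$i\vec f_\tau=\tilde H(a)\vec f+N,\tag{1.2.6}$$
where
$$N=N_0(a,f)+N_1(\tilde\varphi,f)+l(\sigma)\Bigl(\tilde\varphi\binom11+\vec f\Bigr)-ia_\tau\tilde\varphi_a\binom11,$$
$$N_0(a,f)=\frac{az^2}{4}(\theta(hz)-1)\sigma_3\Bigl(\tilde\varphi\binom11+\vec f\Bigr),$$
$$N_1(\tilde\varphi,f)=-|\tilde\varphi+f|^4\sigma_3\Bigl(\tilde\varphi\binom11+\vec f\Bigr)+\tilde\varphi^5\binom{1}{-1}-V(\tilde\varphi)\vec f,\tag{1.2.7}$$
$$l(\sigma)=(\mu_\tau-1)\sigma_3+i\Bigl(\beta-\frac{\lambda_\tau}{\lambda}\Bigr)\Bigl(z\partial_z+\frac12\Bigr)+\Bigl(a-\beta_\tau+\beta^2-2\beta\frac{\lambda_\tau}{\lambda}\Bigr)\frac{z^2}{4}\sigma_3.$$
Substitute the expression for $\vec f_\tau$ from (1.2.6), (1.2.7) into the derivative of the orthogonality conditions. The result can be written down as follows:
$$(A_0(a)+A_1(a,f))\vec\eta=\vec g(a,f).\tag{1.2.8}$$
Here
$$\vec\eta=\Bigl(\frac{\mu_\tau-1}{2},\ \frac{\lambda_\tau}{\lambda}-\beta,\ \beta_\tau-\beta^2+2\beta\frac{\lambda_\tau}{\lambda}-a,\ a_\tau\Bigr),$$
$$A_0=2\begin{pmatrix}0&0&0&-(\tilde\varphi_a,\tilde\varphi)\\2(\tilde\varphi,\tilde\varphi_\alpha)&0&-(\frac{z^2}{4}\tilde\varphi,\tilde\varphi_\alpha)&0\\0&-i((z\partial_z+\frac12)\tilde\varphi,e_2)&0&-i(\tilde\varphi_a,e_2)\\2(\tilde\varphi,e_3)&0&-(\frac{z^2}{4}\tilde\varphi,e_3)&0\end{pmatrix},$$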

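$$(A_1\vec\eta)_j=\langle l(\sigma)\vec f,\sigma_3\vec e_j\rangle+ia_\tau\langle\vec f,\sigma_3\partial_a\vec e_j\rangle,\qquad g_j(a,f)=-\langle N_0+N_1,\sigma_3\vec e_j\rangle.$$
By propositions 1.2.1, 1.2.2,
$$A_0(a)=iB_0+O(a),\tag{1.2.9}$$
as $a\to0$. In principle (1.2.8) can be solved with respect to the derivatives $\eta$ and together with equation (1.2.6) constitutes a complete system for $\sigma$, $\vec f$:
$$i\vec f_\tau=H(a)\vec f+\mathcal N(a,f),\tag{1.2.10}$$
$$\vec\eta=G(a,f),\tag{1.2.11}$$
$$f|_{\tau=0}=\chi_0,\qquad\sigma|_{\tau=0}=(0,1,\beta_0,\beta_0^2).$$
Here $H(a)=\bigl(-\partial_z^2+1-\frac{az^2}{4}\bigr)\sigma_3+V(\tilde\varphi(a))$, $\ \mathcal N=N-a\frac{z^2}{4}(\theta-1)\sigma_3\vec f$.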
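The relation between the two nonlinear terms follows from comparing the operators $\tilde H(a)$ and $H(a)$. A short check, writing $\mathcal N$ for the nonlinearity in (1.2.10) and $N$ for the one in (1.2.6) (the calligraphic letter is a notational choice made here to keep the two apart):

```latex
\tilde H(a)-H(a)
 =\Bigl(-\frac{az^2}{4}\theta+\frac{az^2}{4}\Bigr)\sigma_3
 =-\frac{az^2}{4}(\theta-1)\sigma_3,
\qquad
\tilde H(a)\vec f+N
 =H(a)\vec f
  +\underbrace{N-\frac{az^2}{4}(\theta-1)\sigma_3\vec f}_{\mathcal N}.
```

So (1.2.10) is (1.2.6) with the cut-off part of the quadratic potential moved from the linear operator into the nonlinear term.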

1.2.4 Effective equations

In order to derive a system of effective equations consider the main nonlinear terms of (1.2.10), (1.2.11). Below it will become clear that the function $a$ depends slowly on $\tau$. More precisely,
$$a\sim\ln^{-2}(\tau+\tau^*),\tag{1.2.12}$$
with some $\tau^*=O(e^{\frac{2S_0}{\beta_0}}\beta_0^3)$. We shall also see that the contribution $f$ of the continuous spectrum asymptotically is of the order $e^{-\frac{S_0}{h}}$, $h=\sqrt a$, (in the uniform norm) and of the order $e^{-\frac{2S_0}{h}}$ for $z$ not too large. In its turn the vector $\eta$ also has the order $e^{-\frac{2S_0}{h}}$. We shall use these facts while deriving the equations. At this stage we are not worrying about formal justification. The main terms of $\mathcal N$ are generated by the expression
$$\mathcal N\sim F_0(a)\binom{1}{-1},\qquad F_0(a)=a\frac{z^2}{4}(\theta-1)\tilde\varphi.\tag{1.2.13}$$
Thus, it is clear that in the region $|z|\ge\mathrm{const}\,h^{-1}$ the main order term of $f$ is given by the expression
$$f\sim-(l(a)+1-i0)^{-1}F_0(a),\tag{1.2.14}$$
where $l(a)=-\partial_z^2-a\frac{z^2}{4}$.

The sign “-” (in $-i0$) is essential: it means that $e^{-i\frac{hz^2}{4}}(l(a)+1-i0)^{-1}F_0(a)$ has finite energy. For the following it is convenient to write $f=f^0+f^1$, $f^0=-(l(a)+1-i0)^{-1}F_0(a)$. It will become clear later that in the region $|z|\ge\mathrm{const}\,h^{-1}$ $f^0$ and $f^1$ are of the order $e^{-\frac{S_0}{h}}$ and $e^{-\frac{2S_0}{h}}$ respectively, while for $|z|\sim1$ both $f^0$ and $f^1$ have the order $e^{-\frac{2S_0}{h}}$. Consider (1.2.11). The main term of $G$ is given by the expression
$$G\sim A_0^{-1}(a)\vec g^{\,0}(a),$$
where $g_j^0=-\langle N_0(a,f^0),\sigma_3\vec e_j\rangle$. So we rewrite (1.2.11) in the form
$$\vec\eta=G_0(a)+G_R(a,f).\tag{1.2.15}$$

Here $G_0(a)=-A_0^{-1}(a)\vec g^{\,0}(a)$, $G_R$ being the remainder. The behavior of $f^0(a)$, $G_0(a)$ in the limit $a\to0$ is described by the following proposition.

Proposition 1.2.4 For $a>0$ sufficiently small, $f^0(a)$, $G_0(a)$ satisfy the estimates
$$\|f^0(a)\|_\infty\le ce^{-(1-\epsilon)\frac{S_0}{h}},\qquad\|\tilde\varphi(a)f^0(a)\|_\infty\le ce^{-(2-\epsilon)\frac{S_0}{h}},$$
$$\bigl\|\widehat{e^{-i\frac{hz^2}{4}}f^0}\bigr\|_1,\ \bigl\|\widehat{(z\partial_z+\tfrac12)e^{-i\frac{hz^2}{4}}f^0}\bigr\|_1,\ \bigl\|\widehat{\partial_he^{-i\frac{hz^2}{4}}f^0}\bigr\|_1\le ce^{-(1-\epsilon)\frac{S_0}{h}},$$
$$|G_0(a)|\le ce^{-(2-\epsilon)\frac{S_0}{h}}.$$
Moreover, the last component $G_0^3$ of $G_0$ admits the following representation:
$$G_0^3(a)=-2\nu_0e^{-\frac{2S_0}{h}}(1+O(a)),\qquad\nu_0=\frac{\varphi_\infty^2}{e}.$$
This asymptotic estimate can be differentiated any number of times with respect to $a$.

Here $\hat f$ stands for the Fourier transform of $f$: $\hat f(p)=(2\pi)^{-1/2}\int dx\,e^{-ipx}f(x)$. Here and in what follows the letter $\epsilon$ is used as a general notation for small positive constants that depend on the choice of the cut-off function $\theta$ and tend to zero as $\delta_0\to0$. They may change from line to line. The proof of this proposition is given in appendix 2.

In order to estimate qualitatively the behavior of $a$, consider the last equation of (1.2.15) neglecting the remainder $G_R$:
$$a_\tau=G_0^3(a).$$
We denote by $a_0(\tau)$ the solution of this equation with initial data $a_0(0)=\beta_0^2$. It is easy to check that $h_0=\sqrt{a_0}$ admits the representation
$$h_0^{-1}(\tau)=\frac{1}{2S_0}\bigl(\ln\nu_1(\tau+\tau^*)+3\ln\ln\nu_1(\tau+\tau^*)\bigr)+O\Bigl(\frac{\ln\ln(\tau+\tau^*)}{\ln(\tau+\tau^*)}\Bigr),\tag{1.2.16}$$
as $\tau+\tau^*\to+\infty$,
$$\nu_1=\frac{\nu_0}{4S_0^2},\qquad\tau^*=\frac{\beta_0^3}{2S_0\nu_0}e^{\frac{2S_0}{\beta_0}}(1+O(\beta_0)).$$
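The representation (1.2.16) can be recovered by separating variables in $a_\tau=G_0^3(a)$. The computation below is a sketch that keeps only the leading term $-2\nu_0e^{-2S_0/h}$ and drops the factor $1+O(a)$; the auxiliary variable $u=1/h_0$ is introduced here for convenience.

```latex
% a = h^2, so 2 h h_tau = -2 nu_0 e^{-2 S_0/h}; with u = 1/h:
u_\tau=\nu_0u^3e^{-2S_0u}
\quad\Longrightarrow\quad
\int_{1/\beta_0}^{u}v^{-3}e^{2S_0v}\,dv=\nu_0\tau .
% for large v the integral is dominated by its upper end:
\int^{u}v^{-3}e^{2S_0v}\,dv=\frac{e^{2S_0u}}{2S_0u^3}\bigl(1+O(u^{-1})\bigr)
\quad\Longrightarrow\quad
\frac{e^{2S_0u}}{2S_0u^3}\bigl(1+O(u^{-1})\bigr)=\nu_0(\tau+\tau^*),
\qquad
\tau^*=\frac{\beta_0^3}{2S_0\nu_0}e^{2S_0/\beta_0}\bigl(1+O(\beta_0)\bigr).
% take logarithms and iterate once, u ~ ln(.)/(2 S_0):
2S_0u=\ln\bigl(2S_0\nu_0(\tau+\tau^*)\bigr)+3\ln u+O(u^{-1})
     =\ln\nu_1(\tau+\tau^*)+3\ln\ln\nu_1(\tau+\tau^*)
      +O\Bigl(\frac{\ln\ln(\tau+\tau^*)}{\ln(\tau+\tau^*)}\Bigr),
\qquad
\nu_1=\frac{\nu_0}{4S_0^2}.
```

The constant $\nu_1$ arises because $3\ln u$ contributes $-3\ln(2S_0)$, which combines with $\ln(2S_0\nu_0)$ into $\ln(\nu_0/4S_0^2)$.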

1.2.5 Spectral properties of the operator H(a)

To study the behavior of solutions to (1.2.10), (1.2.11) we need some information about spectral properties of $H(a)$, $a>0$, in the limit $a\to0$. The necessary facts are collected in this subsubsection, the proofs being given in Section 2. We renormalize $H(a)$ to make the principal part independent of the parameters:
$$\hat H(a)=a^{-1/2}T^*(a^{1/4})H(a)T(a^{1/4}),\qquad(T(a)f)(z)=a^{1/2}f(az),\quad a>0.$$
The operator $\hat H(a)$ has the form
$$\hat H(a)=\Bigl(-\partial_z^2+\hat E_0-\frac{z^2}{4}\Bigr)\sigma_3+\hat W(a),\qquad\hat E_0=a^{-1/2},$$
where $\hat W(a)=a^{-1/2}T^*(a^{1/4})V(\tilde\varphi(a))T(a^{1/4})$. We consider $\hat H(a)$ as a linear operator in $L_2(\mathbb R\to\mathbb C^2)$ defined on the domain where the operator $(-\partial_z^2-\frac{z^2}{4})\sigma_3$ is self-adjoint. The continuous spectrum of $\hat H(a)$ coincides with $\mathbb R$. Because of the exponential decrease of the potential $\hat W$ at infinity the point spectrum contains only finitely many eigenvalues, and the corresponding root subspaces are finite-dimensional. $\hat H(a)$ satisfies the same relations (1.1.8) as $H_0$. As a consequence the spectrum is symmetric with respect to the transformations $E\to-E$ and $E\to\bar E$. Consider the equation
$$(\hat H-E)\psi=0.\tag{1.2.17}$$
One can find a basis of solutions $\hat f_j(z,E)$, $j=1,\dots,4$, with the following properties. The solutions $\hat f_j$ are holomorphic functions of $E$, $E\in\mathbb C$, admitting the following asymptotic representations as $z\to+\infty$:
$$\hat f_1(z,E)=e^{i\frac{z^2}{4}}z^{\hat\nu(E)}\Bigl[\binom10+o(1)\Bigr],\qquad\hat f_2(z,E)=e^{-i\frac{z^2}{4}}z^{\bar{\hat\nu}(E)}\Bigl[\binom10+o(1)\Bigr],$$
$$\hat f_3(z,E)=e^{-i\frac{z^2}{4}}z^{\bar{\hat\nu}(-E)}\Bigl[\binom01+o(1)\Bigr],\qquad\hat f_4(z,E)=e^{i\frac{z^2}{4}}z^{\hat\nu(-E)}\Bigl[\binom01+o(1)\Bigr],$$
where $\hat\nu(E)=-\frac12+i(E-\hat E_0)$. We introduce the solutions $\hat g_j(z,E)$, $j=1,\dots,4$, with standard behavior at $-\infty$ by $\hat g_j(z,E)=\hat f_j(-z,E)$.
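The renormalization of $H(a)$ introduced at the beginning of this subsubsection amounts to a dilation by $a^{1/4}$. A sketch of the check, with the shorthand $b=a^{1/4}$ (not used in the original) and taking the prefactor to be $a^{-1/2}$:

```latex
% (T(b)f)(z) = b^{1/2} f(bz) is unitary in L_2, T^*(b) = T(1/b), and
T^*(b)\,\partial_z^2\,T(b)=b^{2}\partial_z^2,
\qquad
T^*(b)\,z^2\,T(b)=b^{-2}z^2 .
% hence, with b = a^{1/4}:
a^{-1/2}\,T^*(b)\Bigl(-\partial_z^2+1-\frac{az^2}{4}\Bigr)T(b)
 =a^{-1/2}\Bigl(-a^{1/2}\partial_z^2+1-\frac{a^{1/2}z^2}{4}\Bigr)
 =-\partial_z^2+a^{-1/2}-\frac{z^2}{4}.
```

This reproduces $\hat E_0=a^{-1/2}$, and the same conjugation applied to $V(\tilde\varphi(a))$ gives $\hat W(a)$.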

Consider the matrix solutions
$$\hat F_1=(\hat f_1,\hat f_3),\quad\hat F_2=(\hat f_2,\hat f_4),\quad\hat G_1=(\hat g_1,\hat g_3),\quad\hat G_2=(\hat g_2,\hat g_4).$$
One can express $\hat F_1$ in terms of $\hat G_j$, $j=1,2$:
$$\hat F_1=\hat G_2\hat A+\hat G_1\hat B,$$
where $\hat A=\hat A(E)$, $\hat B=\hat B(E)$ are holomorphic functions of $E$, $E\in\mathbb C$. The eigenvalues of the operator $\hat H$ lying in the upper half plane $\{\operatorname{Im}E>0\}$ are characterized by the equation
$$\det\hat A(E)=0.$$
The solutions of this equation in the lower half plane $\{\operatorname{Im}E\le0\}$ are called resonances. One can prove the following result.

Proposition 1.2.5 For $a>0$ sufficiently small,
(i) the point spectrum of $\hat H(a)$ restricted to the subspace of even functions consists of four simple purely imaginary eigenvalues $\pm i\hat E_{1,2}(a)$, $\hat E_j>0$,
$$\hat E_1=O(e^{-(1-\epsilon)S_0/h}),\qquad|\hat E_2(a)-\lambda'(a)|=O(e^{-(2-\epsilon)S_0/h});$$
(ii) there exists $C_0>0$, independent of $a$, such that in the strip $\{E:-C_0<\operatorname{Im}E\le0\}$ the operator $\hat H(a)$ has only one simple resonance $i\hat E_R(a)$, $\hat E_R<0$. Moreover, $\hat E_R$ admits the asymptotic estimates
$$\hat E_R=O(e^{-(1-\epsilon)S_0/h}),\qquad\hat E_R+\hat E_1=O(a^{-2}e^{-2S_0/h}).$$

Let $\hat\zeta_j$, $j=1,\dots,4$, be eigenfunctions corresponding to the eigenvalues $\pm i\hat E_j$, $j=1,2$:
$$\hat H\hat\zeta_j=i\hat E_j\hat\zeta_j,\qquad\hat H\hat\zeta_{j+2}=-i\hat E_j\hat\zeta_{j+2},\quad j=1,2.$$
Let $\hat\zeta_R$ be a resonant function corresponding to $i\hat E_R$:
$$\hat H\hat\zeta_R=i\hat E_R\hat\zeta_R,\qquad\hat\zeta_R\sim e^{\frac{iz^2}{4}\sigma_3}|z|^{-\frac12-\hat E_R-i\hat E_0\sigma_3}\vec c,$$
as $|z|\to\infty$. Here $\vec c$ is a constant vector. Let $\hat P(a)$ stand for the spectral projection onto the eigenspace corresponding to the eigenvalues $i\hat E_1$, $\pm i\hat E_2$ and to the resonance $i\hat E_R$:
$$\hat P(a)f=n_1^{-1}\hat\zeta_1\langle f,\sigma_3\hat\zeta_3\rangle+n_2^{-1}\hat\zeta_2\langle f,\sigma_3\hat\zeta_4\rangle+\bar n_2^{-1}\hat\zeta_4\langle f,\sigma_3\hat\zeta_2\rangle+n_R^{-1}\hat\zeta_R\langle f,\sigma_3\bar{\hat\zeta}_R\rangle.$$
The normalization constants $n_1$, $n_2$, $n_R$ are given by
$$n_R=\langle\hat\zeta_R,\sigma_3\bar{\hat\zeta}_R\rangle,\qquad n_j=\langle\hat\zeta_j,\sigma_3\hat\zeta_{j+2}\rangle,\quad j=1,2.$$
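The spectral projection $P(a)$ of the operator $H(a)$ corresponding to the eigenvalues $iE_1$, $\pm iE_2$ and to the resonance $iE_R$ is given by
$$P(a)=T(a^{1/4})\hat P(a)T^*(a^{1/4}).$$
Introduce the operator $Q(a)$:
$$Q(a)=(I-\tilde P(a))P(a)(I-\tilde P(a)),$$
where $\tilde P(a)$ is the spectral projection of the operator $\tilde H(a)$ onto the subspace corresponding to the eigenvalues $E=\pm\lambda(a)$ and $E=0$:
$$\tilde P(a)f=\tilde n_1^{-1}\tilde\zeta_0\langle f,\sigma_3\tilde\zeta_1\rangle-\tilde n_1^{-1}\tilde\zeta_1\langle f,\sigma_3\tilde\zeta_0\rangle+\tilde n_2^{-1}\tilde\zeta_2\langle f,\sigma_3\tilde\zeta_3\rangle-\tilde n_2^{-1}\tilde\zeta_3\langle f,\sigma_3\tilde\zeta_2\rangle,$$
$$\tilde n_1=\langle\tilde\zeta_0,\sigma_3\tilde\zeta_1\rangle,\qquad\tilde n_2=\langle\tilde\zeta_2,\sigma_3\tilde\zeta_3\rangle.$$
The following proposition is proved in subsubsection 2.4.4.

Proposition 1.2.6 The operators $P$, $Q$ admit the estimates
$$|(Pf)(z)|\le c\langle z\rangle^{-1/2+\hat E_R}\bigl\|e^{-i\frac{hz^2}{4}\sigma_3}f\bigr\|_{H^1},$$
$$|(Qf)(z)|\le c\langle z\rangle^{-1/2+\hat E_R}e^{\frac1hS(h|z|)}e^{-(3-\epsilon)S_0/h}\bigl\|e^{-i\frac{hz^2}{4}\sigma_3}f\bigr\|_{H^1},$$
where $S(\xi)=\int_0^\xi ds\,\sqrt{(1-s^2/4)_+}$.

Let us introduce the operators $\hat{\mathbf F},\hat{\mathbf G}:L_2(\mathbb R\to\mathbb C^2)\to L_2(\mathbb R\to\mathbb C^2)$:
$$(\hat{\mathbf F}\Phi)(z)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}dE\,\hat{\mathcal F}(z,E)\Phi(E),\qquad(\hat{\mathbf G}\Phi)(z)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}dE\,\hat{\mathcal G}(z,E)\Phi(E).$$
Here $\hat{\mathcal F}$, $\hat{\mathcal G}$ are solutions of the scattering problem:
$$\hat{\mathcal F}=\hat F_1\hat A^{-1},\qquad\hat{\mathcal G}=\hat G_1\hat A^{-1},$$
$$\hat{\mathcal F}(z,E)\sim e^{\frac{iz^2}{4}\sigma_3}z^{-\frac12+i(E-\hat E_0\sigma_3)}\hat A^{-1},\quad z\to+\infty,$$
$$\hat{\mathcal F}(z,E)\sim e^{-\frac{iz^2}{4}\sigma_3}|z|^{-\frac12-i(E-\hat E_0\sigma_3)}+e^{\frac{iz^2}{4}\sigma_3}|z|^{-\frac12+i(E-\hat E_0\sigma_3)}\hat B\hat A^{-1},\quad z\to-\infty.$$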

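The action of the adjoint operators $\hat{\mathbf F}^*$, $\hat{\mathbf G}^*$ is given by
$$(\hat{\mathbf F}^*\psi)(E)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}dz\,\hat{\mathcal F}^*(z,E)\psi(z),\qquad(\hat{\mathbf G}^*\psi)(E)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}dz\,\hat{\mathcal G}^*(z,E)\psi(z).$$
It is not difficult to show that $\hat{\mathbf F}$, $\hat{\mathbf G}$ are bounded in $L_2$ and satisfy the relations
$$\hat{\mathbf E}\hat\sigma_3\hat{\mathbf E}^*\sigma_3=P^c,\qquad\hat{\mathbf E}^*\sigma_3\hat{\mathbf E}\hat\sigma_3=I,$$
where $\hat{\mathbf E}:L_2(\mathbb R\to\mathbb C^2)\times L_2(\mathbb R\to\mathbb C^2)\to L_2(\mathbb R\to\mathbb C^2)$,
$$\hat{\mathbf E}\vec\Phi=\hat{\mathbf F}\Phi_1+\hat{\mathbf G}\Phi_2,\qquad\vec\Phi=(\Phi_1,\Phi_2),\qquad\hat\sigma_3=\begin{pmatrix}\sigma_3&0\\0&\sigma_3\end{pmatrix},$$
$P^c$ being the spectral projection onto the subspace of the continuous spectrum. Moreover, one can prove the following proposition.

Proposition 1.2.7 For $a>0$ sufficiently small, there exists $b_0$, $\frac12>b_0>0$, independent of $a$, such that
(i) for $e^{-i\frac{z^2}{4}\sigma_3}f\in H^1$, $(\hat{\mathbf F}^*f)(E)$ is a meromorphic function of $E$ in the strip $-b_0\le\operatorname{Im}E\le0$ with the only pole in $-i\hat E_1$ and satisfies the estimate
$$\|\hat{\mathbf F}^*f\|_{L_2(\mathbb R-ib)},\ \|\partial_h\hat{\mathbf F}^*f\|_{L_2(\mathbb R-ib)}\le ch^{-K_1}\bigl\|e^{-i\frac{z^2}{4}\sigma_3}f\bigr\|_{H^1},\qquad h^L\le b\le b_0;$$
(ii) let us introduce the operators $\hat{\mathbf F}_b$:
$$(\hat{\mathbf F}_b\Phi)(z)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}dE\,\hat{\mathcal F}(z,E-ib)\Phi(E).$$
For $h^L\le b\le b_0$, they satisfy the inequality
$$\|(1+|z|)^{-\nu_2}\hat{\mathbf F}_b\Phi\|_2\le ch^{-K_2}\|\Phi\|_2,\qquad\nu_2>1/2,$$
the same being true for $\hat{\mathbf F}$ replaced by $\hat{\mathbf G}$. Here $K_j$, $j=1,2$, depend only on $L$.

1.2.6 Equations on the finite interval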

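Following [BP1], [BP2] we consider the system (1.2.10), (1.2.11) on some finite interval $[0,\tau_1]$ and later investigate the limit $\tau_1\to\infty$. On the interval $[0,t_1]$, $t_1=t(\tau_1)$, we approximate the trajectory $\sigma(t)$ by $\sigma_1(t)$, where $\sigma_1(t)=\bigl(\frac{\mu(t)}{2},\lambda_1(t),\beta_1(t),a_1(t)\bigr)$ is the solution of the following Cauchy problem:
$$\lambda_1^{-3}\lambda_1'=\beta_1,\quad\lambda_1^{-2}\beta_1'+\beta_1^2=a_1,\quad a_1'=0,$$
$$\lambda_1(t_1)=\lambda(t_1),\quad\beta_1(t_1)=a^{1/2}(t_1),\quad a_1(t_1)=a(t_1).$$
We associate to the trajectory $\sigma_1$ a new function $g$:
$$g(y,\rho)=e^{i\varepsilon y^2}r^{1/2}f(ry,\tau),$$
where $\varepsilon=\frac{1-\beta r^2}{4}$, $r=\frac{\lambda}{\sqrt{\beta_1}\lambda_1}$, $\rho=\int_0^\tau ds\,r^{-2}$. Equation (1.2.10) in terms of $g$ takes the form
$$i\vec g_\rho=\hat H(a)\vec g+\mathcal N_0+\mathcal N_1+\mathcal N_2+\mathcal N_3,\tag{1.2.18}$$
where
$$\mathcal N_0=e^{i\varepsilon y^2\sigma_3}r^{5/2}F_0(a)\binom{1}{-1},\qquad\mathcal N_1=e^{i\varepsilon y^2\sigma_3}r^{5/2}N_1,$$
$$\mathcal N_2=e^{i\varepsilon y^2\sigma_3}r^{5/2}\Bigl[l(\sigma)\tilde\varphi\binom11-ia_\tau\tilde\varphi_a\binom11\Bigr],\tag{1.2.19}$$
$$\mathcal N_3=e^{i\varepsilon y^2\sigma_3}r^{5/2}V(\tilde\varphi(a))\vec f-\hat W(a)\vec g+(\mu_\rho-h^{-1})\sigma_3\vec g.$$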

Since $a$ depends slowly on $\tau$ it is natural to rewrite the above equation in terms of the spectral representation of $\hat H(a)$. Write $\vec g$ as the sum
$$\vec g=\vec h+\vec k\tag{1.2.20}$$
of the projections on the subspaces corresponding to the discrete and continuous spectra of $\hat H(a)$. More precisely, set $\vec k=\hat P(a)\vec g$. Then
$$\vec h=(\hat{\mathbf F}_b+\hat{\mathbf G}_b)\sigma_3\Phi(\cdot-ib),\qquad\Phi(E)=(\hat{\mathbf F}^*\sigma_3\vec g)(E),$$
where $-\hat E_R<b\le b_0$. Let us remark that due to the orthogonality conditions (1.2.3) the four-dimensional component $k$ is controlled by $h$ (or equivalently by $\Phi$). Projecting (1.2.18) on the subspace of the continuous spectrum of $\hat H(a)$ one gets an equation for $\Phi$:
$$i\Phi_\rho=E\Phi+D,\tag{1.2.21}$$
where $D=D_0+D_1+D_2$,
$$D_0=\hat{\mathbf F}^*\sigma_3\mathcal N_0,\qquad D_1=i\hat{\mathbf F}^*_\rho\sigma_3\vec g,\qquad D_2=\sum_{j=1}^3\hat{\mathbf F}^*\sigma_3\mathcal N_j.\tag{1.2.22}$$
Consider (1.2.21) on the line $\operatorname{Im}E=-b$ with some $b$, $0<b\le b_0$, that will be fixed later, rewriting it as an integral equation:
$$\Phi(\rho)=e^{-iE\rho}\hat{\mathbf F}^*(0)\sigma_3\vec g_0-i\int_0^\rho ds\,e^{-iE(\rho-s)}D(s),\quad\operatorname{Im}E=-b.\tag{1.2.23}$$
Here
$$\hat{\mathbf F}(0)=\hat{\mathbf F}(a(0)),\qquad g_0(y)=e^{i\varepsilon_0y^2}r_0^{1/2}\chi_0(r_0y),\qquad\varepsilon_0=\frac{1-\beta_0r_0^2}{4},\qquad r_0=(\beta_1\lambda_1^2(0))^{-1/2}.$$
The relations (1.2.3), (1.2.15), (1.2.20), (1.2.23) make up the final form of the equations which is used to investigate the dynamical system on the interval $[0,\tau_1]$. It follows from (1.2.13), (1.2.14) that the main part of $D$ is given by $D_0$. The contribution of $D_0$ in (1.2.22) allows some asymptotic simplifications. After a natural integration by parts one gets
$$\Phi=\Phi_0+\Phi_1,\qquad\Phi_0=-\frac1ED_0,\qquad\Phi_1(\rho)=e^{-iE\rho}\Phi_{10}-i\int_0^\rho ds\,e^{-iE(\rho-s)}D'(s).\tag{1.2.24}$$
Here $\Phi_{10}=\hat{\mathbf F}^*(0)\sigma_3\vec g_0+\frac1ED_0(0)$, $D'=D_1+D_2+i\frac{D_{0\rho}}{E}$. In accordance with (1.2.13) the main order term of $\Phi$ is given by $\Phi_0$.

1.3 Estimates of the solution

Here we prove that the new coordinates indeed admit only small (in a suitable sense) deviations from their initial values. As in [BP1], [BP2], for this purpose we use the method of majorants.

1.3.1 Estimates of soliton parameters

Introduce a natural system of norms for the components of the solution $\psi$:
$$s_0(\tau)=\sup_{s\le\tau}|h(s)-h_0(s)|h_0^{-2}(s),$$
$$s_1(\tau)=\sup_{s\le\tau}|\beta(s)-h(s)|h_0^{-2}(s)p^{-1}(s;\kappa_1,r_1),$$
$$s_2(\tau)=\sup_{\tau\le s\le\tau_1}|\beta(s)-r^{-2}|h_0^{-2}(s)p^{-1}(s;\kappa_2,r_2),$$
$$M_0(\tau)=\sup_{s\le\tau}\|f(s)\|_\infty p^{-1}(s;\kappa_0,r_0),$$
$$M_1(\tau)=\sup_{s\le\tau}\|\langle z\rangle^{-\nu_3}f^1(s)\|_\infty p^{-1}(s;\kappa_3,r_3),\quad\nu_3\ge2,$$
$$M_2(\tau)=\sup_{s\le\tau}\|\rho_\delta f(s)\|_2p^{-1}(s;\kappa_4,r_4),$$
where
$$p(\tau;\kappa,r)=e^{-\kappa\int_0^\tau ds\,h_0(s)}+e^{-r\frac{S_0}{h_0(\tau)}},\qquad\rho_\delta=e^{-\frac{(1-\delta)}{h_0}\int_0^{h_0|z|}ds\,\sqrt{1-\frac{s^2}{4}\theta(s)}},$$
$\kappa_4=\frac{b_0}{4}$, $\kappa_0=\kappa_3=\frac78\kappa_4$, $\kappa_1=\frac32\kappa_4$, $\kappa_2=\frac54\kappa_4$, $r_0=\frac34$, $r_1=\frac{15}{8}$, $r_2=\frac74$, $r_3=\frac43$, $r_4=\frac32$; $\delta>0$ is supposed to be a sufficiently small fixed number. At last, set
$$\hat s_j=s_j(\tau_1),\ j=0,1,\qquad\hat s_2=s_2(0),\qquad\hat M_j=M_j(\tau_1).$$
Consider equation (1.2.15). It follows immediately from (1.2.7), (1.2.9) and from proposition 1.2.4 that
$$|\eta|\le W(M,s)\Bigl[e^{-(2-\epsilon)\frac{S_0}{h_0(\tau)}}+e^{-(1-\epsilon)\frac{S_0}{h_0(\tau)}}\|\langle z\rangle^{-\nu_3}f^1\|_\infty+\|\rho_\delta f\|_2^2+\|\rho_\delta f\|_2\|f\|_\infty^4\Bigr],$$
$$|G_R|\le W(M,s)\Bigl[e^{-(4-\epsilon)\frac{S_0}{h_0(\tau)}}+e^{-(1-\epsilon)\frac{S_0}{h_0(\tau)}}\|\langle z\rangle^{-\nu_3}f^1\|_\infty+\|\rho_\delta f\|_2^2+\|\rho_\delta f\|_2\|f\|_\infty^4\Bigr].$$
We use $W(M,s)$ as a general notation for functions of $M_j$, $j=0,1,2$, $s_k$, $k=0,1,2$, defined on $\mathbb R^6$, which are bounded in some finite neighborhood of 0 and may acquire the infinite value $+\infty$ outside some larger neighborhood. While depending on $\delta_0$, $\delta$, $W$ does not depend on $\beta_0$. In all the formulas where $W$ appears it would be possible to replace it by some explicit expression, but such expressions are useless for our aims. In terms of majorants the above inequalities take the form
$$|\eta|\le W(M,s)\Bigl[\Psi_0(M)e^{-2\kappa_3\int_0^\tau ds\,h_0(s)}+e^{-(2-\epsilon)\frac{S_0}{h_0(\tau)}}\Bigr],\tag{1.3.1}$$
$$|G_R|\le W(M,s)\Psi_1(M)\Bigl[e^{-\frac{3\kappa_3}{2}\int_0^\tau ds\,h_0(s)}+e^{-\frac{3r_4}{2}\frac{S_0}{h_0(\tau)}}\Bigr],\tag{1.3.2}$$
where
$$\Psi_0(M)=M_2M_0^4+\beta_0^4M_1^2+M_2^2,\qquad\Psi_1(M)=e^{-\gamma/\beta_0}+M_2M_0^4+M_2^2,$$
with some $\gamma>0$. Using (1.3.1), (1.3.2) and proposition 1.2.4 it is not difficult to get the following inequalities:
$$s_0\le W(M,s)\bigl[s_0^2+\beta_0^{-4}\Psi_1(M)\bigr],$$
$$s_1\le W(M,s)\bigl[\beta_0s_1^2+e^{-\frac{\gamma}{\beta_0}}+\beta_0^{-4}\Psi_0(M)\bigr],$$
$$s_2\le W(\hat M,\hat s)\bigl[\hat s_1+\beta_0s_2^2+e^{-\frac{\gamma}{\beta_0}}+\beta_0^{-3}\Psi_0(\hat M)\bigr].$$
See appendix 3 for the proof. Changing, if necessary, the functions $W$ one can simplify these inequalities:
$$s_0\le W(M,s)\beta_0^{-4}\Psi_1(M),$$
$$s_1\le W(M,s)\bigl[e^{-\frac{\gamma}{\beta_0}}+\beta_0^{-4}\Psi_0(M)\bigr],\tag{1.3.3}$$
$$s_2\le W(\hat M,\hat s)\bigl[e^{-\frac{\gamma}{\beta_0}}+\beta_0^{-4}\Psi_0(\hat M)\bigr],\quad\gamma>0.$$

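1.3.2 Estimates of $D_j$

Consider (1.2.24). Using propositions 1.2.4, 1.2.7 one gets for $D_0$
$$\|D_0\|_{L_2(\mathbb R-ib)}\le W(\hat M,\hat s)e^{-(1-\epsilon)\frac{S_0}{h_0(\tau)}},\tag{1.3.4}$$
$$\Bigl\|\frac{D_{0\rho}}{E}\Bigr\|_{L_2(\mathbb R-ib)}\le W(\hat M,\hat s)e^{-(1-\epsilon)\frac{S_0}{h_0(\tau)}}\bigl[|a_\rho|+|\beta_\rho|+|r_\rho|\bigr]\le W(\hat M,\hat s)e^{-(1-\epsilon)\frac{S_0}{h_0(\tau)}}\bigl[|\eta|+|\beta-h|+|\beta-r^{-2}|\bigr].\tag{1.3.5}$$
In a similar manner
$$\|D_1\|_{L_2(\mathbb R-ib)}\le W(\hat M,\hat s)h_0^{-K}\bigl[|\eta|+|\beta-h|+|\beta-r^{-2}|\bigr]\bigl\|e^{-i\frac{\beta z^2}{4}}f\bigr\|_{H^1}.\tag{1.3.6}$$
In this subsubsection and the next one we use the letter $K$ as a general notation for nonnegative numbers independent of parameters that may change from line to line. Consider $D_2$. It is not difficult to show that
$$\bigl\|e^{-i\frac{y^2}{4}\sigma_3}\mathcal N_1\bigr\|_{H^1}\le W(\hat M,\hat s)h_0^{-3}\bigl\|e^{-i\frac{\beta z^2}{4}\sigma_3}N_1\bigr\|_{H^1}$$
$$\le W(\hat M,\hat s)h_0^{-3}\bigl(1+\bigl\|e^{-i\frac{\beta z^2}{4}}f\bigr\|_{H^1}\bigr)\Bigl[e^{-(2-\epsilon)\frac{S_0}{h_0(\tau)}}+\|\langle z\rangle^{-\nu_3}f^1\|_\infty\bigl(\bigl\|\partial_z(e^{-i\frac{hz^2}{4}}f)\bigr\|_2+\|\rho_\delta f\|_2\bigr)+\|\rho_\delta f\|_2^2+\|f\|_\infty^4\Bigr],$$
$$\bigl\|e^{-i\frac{y^2}{4}\sigma_3}\mathcal N_2\bigr\|_{H^1}\le W(\hat M,\hat s)h_0^{-K}|\eta|,\tag{1.3.7}$$
$$\bigl\|e^{-i\frac{y^2}{4}\sigma_3}\mathcal N_3\bigr\|_{H^1}\le W(\hat M,\hat s)h_0^{-K}\bigl[|\mu_\rho-r^2|+|\varepsilon|+|r^{-2}-h|\bigr]\bigl\|e^{-i\frac{\beta z^2}{4}}f\bigr\|_{H^1}\le W(\hat M,\hat s)h_0^{-K}\bigl[|\eta|+|\beta-r^{-2}|+|\beta-h|\bigr]\bigl\|e^{-i\frac{\beta z^2}{4}}f\bigr\|_{H^1}.$$
Combining the inequalities (1.3.5)-(1.3.7) one obtains
$$\|D'\|_{L_2(\mathbb R-ib)}\le W(\hat M,\hat s)h_0^{-K}\bigl(1+\bigl\|e^{-i\frac{\beta z^2}{4}}f\bigr\|_{H^1}\bigr)$$
$$\times\Bigl[|\eta|+|\beta-h|+|\beta-r^{-2}|+e^{-(2-\epsilon)\frac{S_0}{h_0(\tau)}}+\|\langle z\rangle^{-\nu_3}f^1\|_\infty\bigl(\bigl\|\partial_z(e^{-i\frac{hz^2}{4}}f)\bigr\|_2+\|\rho_\delta f\|_2\bigr)+\|\rho_\delta f\|_2^2+\|f\|_\infty^4\Bigr].\tag{1.3.8}$$
It follows directly from the conservation laws that
$$\|f\|_2\le W(\hat M,\hat s),$$
$$\bigl\|\partial_z(e^{-i\frac{hz^2}{4}}f)\bigr\|_2\le W(\hat M,\hat s)\Bigl[\lambda^{-1}\beta_0^N+|h-\beta|^{1/2}+e^{-(1-\epsilon)\frac{S_0}{h_0(\tau)}}+\|\rho_\delta f\|_2^{1/2}+\|f\|_\infty^2\Bigr].\tag{1.3.9}$$
In the last inequality we also made use of the obvious asymptotic estimate
$$\bigl|H(e^{-i\frac{hz^2}{4}}\tilde\varphi(a))\bigr|=O(e^{-(2-\epsilon)\frac{S_0}{h}}).$$
The inequalities (1.3.8), (1.3.9) lead to the estimate
$$\|D'\|_{L_2(\mathbb R-ib)}\le W(\beta_0^{-1}\hat M,\hat s)h_0^{-K}\bigl[\beta_0^{2N}+\Psi_2(M)\bigr]p(\tau;\kappa_2,r_2),\tag{1.3.10}$$
$$\Psi_2(M)=M_1M_2^{1/2}+(M_0+M_1)^2+M_2^2.$$
Here we have also used (1.3.1), (1.3.3) and the obvious inequality
$$\lambda^{-1}\le W(\beta_0^{-1}\hat M,\hat s)e^{-\gamma\int_0^\tau ds\,h_0(s)},\quad\gamma<1.$$

1.3.3 Estimates of $f$ in $L_2$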

To estimate $f$ we represent it as the sum $f=f_0+f_1+f_2$,
$$\vec f_j=(I-\tilde P(a))T(r^{-1})e^{-i\varepsilon y^2\sigma_3}\vec h_j,\quad j=0,1,$$
where $\vec h_j=(\hat{\mathbf F}_b+\hat{\mathbf G}_b)\Phi_j(\cdot-ib)$. At last,
$$\vec f_2=(I-\tilde P(a))T(r^{-1})e^{-i\varepsilon y^2\sigma_3}\vec k.$$
Consider $f_0$. Using the representation $\vec h_0=-(\hat H-i0)^{-1}(I-\hat P)\mathcal N_0$, one can get the following estimate (see appendix 5):
$$\|\rho_\delta f_0\|_2\le W(\hat M,\hat s)e^{-(2-\epsilon)\frac{S_0}{h}}.\tag{1.3.11}$$
Here and in what follows $\epsilon$ depends on both $\delta_0$ and $\delta$ and tends to zero as $\delta_0,\delta\to0$. It follows from proposition 1.2.7 and (1.2.24), (1.3.4), (1.3.10) that
$$\|\rho_\delta f_1\|_2\le W(\beta_0^{-1}\hat M,\hat s)h_0^{-K}\bigl[\beta_0^{2N}+\Psi_2(M)\bigr]p(\tau;\kappa_2,r_2),\tag{1.3.12}$$
provided $b>\kappa_2$. Using proposition 1.2.6 one can easily prove the following estimate:
$$\|\rho_\delta f_2\|_2\le W(\hat M,\hat s)h_0^{-K}\bigl[e^{-(2-\epsilon)\frac{S_0}{h}}+|\beta-h|+|\beta-r^{-2}|\bigr]\bigl\|e^{-i\frac{\beta z^2}{4}}f\bigr\|_{H^1}.\tag{1.3.13}$$
Combining (1.3.11)-(1.3.13) and taking into account (1.3.3) one gets finally
$$\|\rho_\delta f\|_2\le W(\beta_0^{-1}\hat M,\hat s)h_0^{-K_0}\bigl[\beta_0^{2N}+\Psi_2(M)\bigr]p(\tau;\kappa_2,r_2)\le W(\beta_0^{-1}\hat M,\hat s)\beta_0^{-K_0}\bigl[\beta_0^{2N}+\Psi_2(M)\bigr]p(\tau;\kappa_4,r_4)\tag{1.3.14}$$
with some $K_0\ge0$.
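1.3.4 Estimates of $f$ in $L_\infty$

We represent $f$ by the sum $\vec f=e^{i\frac{\beta z^2}{4}\sigma_3}(\tilde f^0+\tilde f^1)$, where $\tilde f^0=e^{-i\frac{hz^2}{4}\sigma_3}\vec f^{\,0}(a)$. Then $\tilde f^1$ satisfies the equation
$$i\tilde f^1_\tau=(-\partial_z^2+\mu_\tau)\sigma_3\tilde f^1-i\frac{\lambda_\tau}{\lambda}\Bigl(\frac12+z\partial_z\Bigr)\tilde f^1+H_0+H_1,\tag{1.3.15}$$
where $H_0=H_{00}+H_{01}+H_{02}$,
$$H_{00}=-i\tilde f^0_\tau+(\mu_\tau-1)\sigma_3\tilde f^0+i\Bigl(h-\frac{\lambda_\tau}{\lambda}\Bigr)\Bigl(\frac12+z\partial_z\Bigr)\tilde f^0,$$
$$H_{01}=e^{-i\frac{\beta z^2}{4}\sigma_3}N_1,$$
$$H_{02}=e^{-i\frac{\beta z^2}{4}\sigma_3}\Bigl[l(\sigma)\tilde\varphi\binom11-ia_\tau\tilde\varphi_a\binom11\Bigr]+\bigl(e^{-i\frac{\beta z^2}{4}\sigma_3}-e^{-i\frac{hz^2}{4}\sigma_3}\bigr)F_0(a)\binom{1}{-1}.$$
At last, $H_1=e^{-i\frac{\beta z^2}{4}\sigma_3}V(\tilde\varphi(a))\vec f$. We rewrite (1.3.15) as an integral equation
$$\tilde f^1=U(\tau,0)\vec\chi_1-i\int_0^\tau ds\,U(\tau,s)(H_0(s)+H_1(s)),\tag{1.3.16}$$
where $\chi_1=e^{-i\frac{\beta_0z^2}{4}}(\chi_0-f^0(\beta_0^2))$, $U(\tau,s)$ being the propagator corresponding to the equation $if_\tau=(-\partial_z^2+\mu_\tau)\sigma_3f-i\frac{\lambda_\tau}{\lambda}(\frac12+z\partial_z)f$. It follows from (1.3.16) that
$$\|\tilde f^1\|_\infty\le c\Bigl[\lambda^{-1/2}(\tau)\|\hat\chi_1\|_1+\int_0^\tau ds\Bigl(\frac{\lambda(s)}{\lambda(\tau)}\Bigr)^{1/2}\|\hat H_0\|_1+\int_0^\tau ds\,\frac{\lambda^{-\frac12}(\tau)\lambda^{-\frac12}(s)}{\sqrt{t(\tau)-t(s)}}\|H_1\|_1\Bigr].\tag{1.3.17}$$
Here we made use of the obvious estimates
$$\|U(\tau,s)f\|_\infty\le c\Bigl(\frac{\lambda(s)}{\lambda(\tau)}\Bigr)^{1/2}\|\hat f\|_1,\qquad\|U(\tau,s)f\|_\infty\le c\,\frac{\lambda^{-\frac12}(\tau)\lambda^{-\frac12}(s)}{\sqrt{t(\tau)-t(s)}}\|f\|_1.$$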
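Both estimates for $U(\tau,s)$ reduce to standard bounds for the free Schrödinger group once the scaling is undone. A sketch for the scalar component on which $\sigma_3$ acts as $+1$, with the auxiliary function $u$ introduced here for the purpose:

```latex
% u(x,t) = e^{i mu} lambda^{1/2} f(lambda x, tau), d tau/dt = lambda^2, z = lambda x:
iu_t=e^{i\mu}\lambda^{5/2}\Bigl[if_\tau-\mu_\tau f
      +i\frac{\lambda_\tau}{\lambda}\bigl(\tfrac12+z\partial_z\bigr)f\Bigr]
    =-e^{i\mu}\lambda^{5/2}f_{zz}=-u_{xx}.
% scaling of the norms involved:
\|u\|_\infty=\lambda^{1/2}\|f\|_\infty,\qquad
\|u\|_1=\lambda^{-1/2}\|f\|_1,\qquad
\|\hat u\|_1=\lambda^{1/2}\|\hat f\|_1 .
% free evolution from time t(s) to time t(tau):
\|u(t(\tau))\|_\infty\le(2\pi)^{-1/2}\|\hat u(t(s))\|_1
\ \Longrightarrow\
\|f(\tau)\|_\infty\le c\Bigl(\frac{\lambda(s)}{\lambda(\tau)}\Bigr)^{1/2}\|\hat f(s)\|_1,
\\
\|u(t(\tau))\|_\infty\le\frac{\|u(t(s))\|_1}{\sqrt{4\pi\,(t(\tau)-t(s))}}
\ \Longrightarrow\
\|f(\tau)\|_\infty\le c\,
 \frac{\lambda^{-1/2}(\tau)\lambda^{-1/2}(s)}{\sqrt{t(\tau)-t(s)}}\,\|f(s)\|_1 .
```

The first bound uses that the free flow preserves $|\hat u|$; the second is the usual one-dimensional dispersive estimate. The other component is handled in the same way with $t$ reversed.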

The first term in the right hand side of (1.3.17) can be estimated as follows:
$$\lambda^{-1/2}(\tau)\|\hat\chi_1\|_1\le W(\beta_0^{-1}M,s)\beta_0^{2N}p(\tau;\kappa_3,r_3).\tag{1.3.18}$$
Consider $H_0$. Using proposition 1.2.4 and (1.3.1), (1.3.7) one gets
$$\|\hat H_0\|_1\le c\bigl(\|\hat H_{00}\|_1+\|H_{01}\|_{H^1}+\|H_{02}\|_{H^1}\bigr)\le W(\beta_0^{-1}M,s)\bigl[\beta_0^{2N}+\beta_0^{L_0}s_1+\Psi_2(M)\bigr]p(\tau;\kappa_2,r_2).$$
Thus, the contribution of $H_0$ in the right hand side of (1.3.17) admits the estimate
$$\int_0^\tau ds\Bigl(\frac{\lambda(s)}{\lambda(\tau)}\Bigr)^{\frac12}\|\hat H_0\|_1\le W(\beta_0^{-1}M,s)\beta_0^{-1}\bigl[\beta_0^{2N}+\beta_0^{L_0}s_1+\Psi_2(M)\bigr]p(\tau;\kappa_3,r_3).\tag{1.3.19}$$
The third term of (1.3.17) can be estimated as follows:
$$\int_0^\tau ds\,\frac{\lambda^{-\frac12}(\tau)\lambda^{-\frac12}(s)}{\sqrt{t(\tau)-t(s)}}\|H_1\|_1\le W(M,s)M_2\int_0^\tau ds\,\frac{\lambda^{-\frac12}(\tau)\lambda^{-\frac12}(s)}{\sqrt{t(\tau)-t(s)}}p(s;\kappa_4,r_4)\le W(M,s)M_2\beta_0^{-1}p(\tau;\kappa_3,r_3),$$
which together with proposition 1.2.4 and (1.3.3), (1.3.17)-(1.3.19) gives
$$M_0+M_1\le W(\beta_0^{-1}M,s)\beta_0^{-1}\bigl[\beta_0^{2N}+M_2+(M_0+M_1)^2+\beta_0^{-2}M_2^2+\beta_0^{-2}(M_0+M_1)^4\bigr].\tag{1.3.20}$$

1.3.5 Estimates of majorants

Consider the system of inequalities (1.3.3), (1.3.14), (1.3.20). Introduce new scales:
$$\hat M_j=\beta_0\check M_j,\ j=0,1,\qquad\hat M_2=\beta_0^{2K_0+2}\check M_2.$$
Remark that one can choose the function $W$ to be spherically symmetric and monotone. Then in terms of $\check M_j$ the inequalities (1.3.3), (1.3.14), (1.3.20) can be written in the form
$$\hat s_0,\hat s_1,\hat s_2\le W(\check M,\hat s)\bigl[e^{-\frac{\gamma}{\beta_0}}+\beta_0^2(\check M_0+\check M_1)^2+\beta_0^{4K_0}\check M_2^2\bigr],\tag{1.3.21}$$
$$\check M_0+\check M_1\le W(\check M,\hat s)\bigl[\beta_0^{2N-2}+\beta_0^{2K_0}\check M_2\bigr],\tag{1.3.22}$$
$$\check M_2\le W(\check M,\hat s)\bigl[\beta_0^{2N-3K_0-2}+\beta_0^{-2K_0}\check M_2^{\frac12}(\check M_0+\check M_1)+\beta_0^{-3K_0}(\check M_0+\check M_1)^2\bigr].$$
Taking into account the second inequality one can rewrite the third one as follows:
$$\check M_2\le W(\check M,\hat s)\beta_0^{2N-3K_0-2}.\tag{1.3.23}$$
Choosing $N>1+\frac{3K_0}{2}$ one gets that for $\beta_0$ sufficiently small the solution of (1.3.21)-(1.3.23) can belong either to a small neighborhood of 0 or to some domain whose distance from 0 is bounded from below uniformly with respect to $\beta_0$. Since all $\check M_j$, $s_j$ are continuous functions of $\tau_1$ and are small for $\tau_1=0$, only the first possibility can be realized. As a consequence, one finally obtains
$$M_0,M_1\le c\beta_0^{2N-K_0-1},\qquad M_2\le c\beta_0^{2N-K_0},\tag{1.3.24}$$
$$s_0,s_1\le c\beta_0^{4N-2K_0-4},\qquad\tau\le\tau_1.\tag{1.3.25}$$
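The dichotomy used above is the usual continuity argument behind the method of majorants. A one-variable caricature may help: a scalar $x(\tau_1)\ge0$ stands in for the whole collection of majorants, and the constants $C>0$ and $\delta>0$ are hypothetical placeholders for the function $W$ and for the small powers of $\beta_0$.

```latex
% model inequality; x is continuous in tau_1, x(0) is small, 4 C^2 delta < 1:
x\le C(\delta+x^2)
\quad\Longleftrightarrow\quad
x\le x_-\ \text{ or }\ x\ge x_+,
\qquad
x_\pm=\frac{1\pm\sqrt{1-4C^2\delta}}{2C}.
% the lower root is of the size of the data, the upper one is not small:
x_-=\frac{2C\delta}{1+\sqrt{1-4C^2\delta}}\le2C\delta,
\qquad
x_+\ge\frac{1}{2C}.
% continuity: x(0) <= x_- and x never enters the gap (x_-, x_+), hence
x(\tau_1)\le x_-\le2C\delta\quad\text{for all }\tau_1 .
```

In the actual system the role of $\delta$ is played by the terms $e^{-\gamma/\beta_0}$ and $\beta_0^{2N-3K_0-2}$, which is why a lower bound on $N$ is needed.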

The constant $c$ here does not depend either on $\beta_0$ or on $\tau_1$. Since $\tau_1$ is arbitrary these estimates are valid, in fact, for $\tau\in\mathbb R_+$.

1.3.6 Asymptotic behavior of the solution as $t\to T^*$

The statement of theorem 1.1.1 is a simple consequence of the inequalities (1.3.1), (1.3.24), (1.3.25). Indeed, proposition 1.2.1 and the estimates (1.3.24), (1.3.25) ensure that
$$\psi(x,t)=e^{i\mu(t)}\lambda^{1/2}(t)\bigl(\varphi_0(z)+\chi(z,t)\bigr),\quad z=\lambda(t)x,$$
where $\chi$ admits the estimate
$$\|\chi\|_\infty\le ch_0.$$
Consider $\lambda=e^{\int_0^\tau ds(\beta+\eta_2)}$. By (1.3.1), (1.3.24), (1.3.25),
$$|\beta+\eta_2-h_0|\le ch_0^2.\tag{1.3.26}$$
So, one gets for $\lambda$
$$\lambda=e^{\int_0^\tau ds(h_0+O(h_0^2))}=e^{\frac{2S_0\tau}{\ln\tau}(1+o(1))},\quad\tau\to+\infty.\tag{1.3.27}$$
In the last equality we have made use of (1.2.16). Consider the relation
$$T^*-t=\int_\tau^\infty ds\,\frac{1}{\lambda^2}=\frac{1}{2h_0\lambda^2}-\int_\tau^\infty ds\,\frac{1}{h_0\lambda^2}\Bigl(\beta+\eta_2-h_0+\frac{h_{0\tau}}{2h_0}\Bigr).$$
By (1.3.26), this identity implies
$$\lambda=(2h_0(T^*-t))^{-1/2}(1+O(h_0))=\Bigl(\frac{4S_0(T^*-t)}{\ln\tau}\Bigr)^{-1/2}(1+o(1)),\tag{1.3.28}$$
as $t\to T^*$, which together with (1.3.27) gives
$$\ln\tau\;e^{-\frac{4S_0\tau}{\ln\tau}(1+o(1))}=4S_0(T^*-t)(1+o(1)),\quad t\to T^*.$$
As a consequence, one gets
$$\tau=\frac{1}{4S_0}|\ln(T^*-t)|\ln(|\ln(T^*-t)|)(1+o(1)).\tag{1.3.29}$$
Combining (1.3.28), (1.3.29), one obtains finally
$$\lambda=\Bigl(\frac{4S_0(T^*-t)}{\ln|\ln(T^*-t)|}\Bigr)^{-1/2}(1+o(1)).$$
Consider $\mu=\tau+2\int_0^\tau ds\,\eta_1$. By (1.3.1), (1.3.24), (1.3.25),
$$\mu=\tau(1+o(1)),$$
which together with (1.3.29) implies
$$\mu=\frac{1}{4S_0}|\ln(T^*-t)|\ln(|\ln(T^*-t)|)(1+o(1)).$$
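The inversion that leads from (1.3.27), (1.3.28) to (1.3.29) can be made explicit. A sketch, using the shorthand $L=|\ln(T^*-t)|$ (introduced here only for brevity) and keeping leading orders:

```latex
% from (1.3.27) and (1.3.28):
\lambda^2=e^{\frac{4S_0\tau}{\ln\tau}(1+o(1))},
\qquad
\lambda^2=\frac{\ln\tau}{4S_0(T^*-t)}(1+o(1)).
% equate and take logarithms; ln ln tau is negligible against tau / ln tau:
\frac{4S_0\tau}{\ln\tau}(1+o(1))=L+\ln\ln\tau-\ln4S_0=L\,(1+o(1)).
% hence ln tau = ln L + ln ln tau + O(1) = ln L (1+o(1)), and
\tau=\frac{1}{4S_0}\,L\ln L\,(1+o(1)),
\qquad
\lambda=\Bigl(\frac{4S_0(T^*-t)}{\ln L}\Bigr)^{-1/2}(1+o(1)).
```

The double logarithm in the blow-up rate thus comes from replacing $\ln\tau$ by $\ln L$ in (1.3.28).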

2. Properties of the linearized equations As mentioned in the introduction, this section has a technical value : it contains ˜ a detailed description of the spectral properties of the operators H(a), H(a) in the limit a → 0. In particular, we prove here the propositions 1.2.1, 1.2.2 and 1.2.4-1.2.7. The present section consists of four subsection. In the first subsection we collect some elementary properties of the soliton linearization H0 1 that will be used in what follows (most of them were proved in [BP1].) In subsection 2.2 we construct the modified ground state ϕ(a) ˜ and prove proposition 1.2.1. Subsection 2.3 contains a proof of proposition 1.2.2. In subsection 2.4 we prove the estimates related to the operator H(a). Finally, we have five appendices where some technical details are removed.

Footnote 1. Here we consider $H_0$ as an operator on the whole $L^2(\mathbb{R}\to\mathbb{C}^2)$.

2.1 Operator $H_0$

2.1.1 Standard solutions

Consider the equation

$$H_0 f = Ef. \tag{2.1.1}$$

Since $\sigma_1 H_0 = -H_0\sigma_1$, it suffices to consider the solutions for $\operatorname{Re}E \ge 0$. In [BP1] a basis of solutions $f_j^0$, $j=1,\dots,4$, with the standard behavior $e^{\pm ikx}\binom{1}{0}$, $e^{\pm\mu x}\binom{0}{1}$, $k=\sqrt{E-1}$, $\mu=\sqrt{E+1}$, as $x\to+\infty$ was constructed. We collect here some properties of these solutions that we shall need later:

(i) the decreasing solutions $f_i^0(x,k)$, $i=1,3$, and their derivatives with respect to $x$ are holomorphic functions of $k\in\Omega_i$, $i=1,3$, where

$$\Omega_3=\{k:\ \operatorname{Re}\mu-|\operatorname{Im}k|>-\delta_1\},\qquad \Omega_1=\{k:\ k\in\Omega_3,\ \operatorname{Im}k>-\delta_1\},$$

$\mu=\sqrt{k^2+2}$, the root being defined on the plane with the cuts $(-i\infty,-i\sqrt2]$, $[i\sqrt2,i\infty)$, $\operatorname{Re}\mu>0$. Here $\delta_1$ is a small positive number determined by the rate of decrease of the potential $V(\varphi_0)$.

(ii) $f_i^0$, $i=1,3$, have the following asymptotics as $x\to+\infty$:

$$f_1^0(x,k)=e^{ikx}\Bigl[\binom10+O\bigl((1+|k|)^{-1}e^{-\gamma x}\bigr)\Bigr],\qquad k\in\Omega_{11},$$

$$f_1^0(x,k)=e^{ikx}\binom10+c(k)\,e^{-\mu x}\binom01+O\bigl((1+|k|)^{-1}e^{-\operatorname{Im}k\,x-\gamma x}\bigr),\qquad k\in\Omega_{12},$$

$$f_3^0(x,k)=e^{-\mu x}\Bigl[\binom01+O\bigl((1+|k|)^{-1}e^{-\gamma x}\bigr)\Bigr],\qquad k\in\Omega_3. \tag{2.1.2}$$

Here $\gamma$ is some positive number, $\Omega_{11}$ and $\Omega_{12}$ are two subsets of $\Omega_1=\Omega_{11}\cup\Omega_{12}$, $\Omega_{11}=\{k:\ \operatorname{Re}\mu-\operatorname{Im}k>\delta_2\}$, $\Omega_{12}=\{k:\ \operatorname{Re}\mu-\operatorname{Im}k\le\delta_2\}$, $\delta_2>0$ being a small positive number, and $c(k)$ is a holomorphic function of $k$ admitting the estimate $c(k)=O((1+|k|)^{-1})$.

(iii) The increasing solutions $f_i^0$, $i=2,4$, are holomorphic functions of $k\in\Omega_2=\{k:\ |\operatorname{Im}k|<\delta_1\}$, with the following asymptotic behavior as $x\to+\infty$:

$$f_2^0(x,k)=e^{-ikx}\Bigl[\binom10+O\bigl((1+|k|)^{-1}e^{-\gamma x}\bigr)\Bigr],\qquad
f_4^0(x,k)=e^{\mu x}\Bigl[\binom01+O\bigl((1+|k|)^{-1}e^{-\gamma x}\bigr)\Bigr], \tag{2.1.3}$$

uniformly with respect to $k\in\Omega_2$. The asymptotic representations (2.1.2), (2.1.3) can be differentiated with respect to $x$ and $k$ any number of times.

(iv) One can choose $f_j^0$ in such a way that

$$f_1^0(x,-k)=f_2^0(x,k),\quad \overline{f_1^0(x,k)}=f_2^0(x,k),\quad f_{3,4}^0(x,-k)=f_{3,4}^0(x,k),\quad \overline{f_{3,4}^0(x,k)}=f_{3,4}^0(x,k),\qquad k\in\mathbb{R}. \tag{2.1.4}$$

The Wronskian $w(f,g)=\langle f',g\rangle_{\mathbb{R}^2}-\langle f,g'\rangle_{\mathbb{R}^2}$ does not depend on $x$ if $f$ and $g$ are solutions of (2.1.1).

(v) The system of Wronskians for $f_j^0$ has the form

$$w(f_1^0,f_2^0)=2ik,\quad w(f_1^0,f_3^0)=0,\quad w(f_1^0,f_4^0)=0,\quad w(f_3^0,f_4^0)=-2\mu,\qquad k\in\Omega_2. \tag{2.1.5}$$
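The diagonal entries of (2.1.5) can be checked from the leading asymptotics alone. The following sketch assumes real $k\neq0$ and that $\langle\cdot,\cdot\rangle_{\mathbb{R}^2}$ is the bilinear pairing (no complex conjugation); since $w$ is independent of $x$, its value may be computed in the limit $x\to+\infty$, where the $O(e^{-\gamma x})$ corrections drop out:

```latex
\begin{align*}
w(f_1^0,f_2^0) &= \lim_{x\to+\infty}\Bigl[\,(ik\,e^{ikx})(e^{-ikx}) - (e^{ikx})(-ik\,e^{-ikx})\Bigr] = 2ik,\\
w(f_3^0,f_4^0) &= \lim_{x\to+\infty}\Bigl[\,(-\mu\,e^{-\mu x})(e^{\mu x}) - (e^{-\mu x})(\mu\,e^{\mu x})\Bigr] = -2\mu,\\
% f_1^0 and f_3^0 point along orthogonal directions at leading order, and every
% remaining product carries a factor e^{-\mu x}, which tends to zero for real k.
w(f_1^0,f_3^0) &= \lim_{x\to+\infty} O\bigl(e^{-\mu x}\bigr) = 0 .
\end{align*}
```

The relation $w(f_1^0,f_4^0)=0$ does not follow from this crude limit, because $f_4^0$ grows; it needs the finer information on the solutions from [BP1].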



The solutions with standard behavior as $x\to-\infty$ can be obtained by using the fact that the operator $H_0$ is invariant under the change of variable $x\to-x$. Let $g_j^0(x,k)=f_j^0(-x,k)$, $j=1,\dots,4$. In addition to the scalar Wronskian we shall also use the matrix Wronskian

$$W(F,G)=F'^{\,t}G-F^{t}G',$$

where $F$ and $G$ are $2\times2$ matrices composed of pairs of solutions. The matrix Wronskian does not depend on $x$. We introduce the concrete matrix solutions

$$F_1^0=(f_1^0,f_3^0),\quad F_2^0=(f_2^0,f_4^0),\quad G_1^0=(g_1^0,g_3^0),\quad G_2^0=(g_2^0,g_4^0).$$

Since $V$ decays exponentially, $H_0$ cannot have more than a finite number of eigenvalues, all of them being of finite multiplicity. It was shown in [BP1] that

Proposition 2.1.1 The eigenvalues of the operator $H_0$ in the domain $\operatorname{Re}E\ge0$ and its resonances at the boundary point $E=1$ of the continuous spectrum (footnote 1) are characterized by the equation $\det D_0=0$, where $D_0=W(G_1^0,F_1^0)$.

Footnote 1. Generically the equation $H_0f=\pm f$ does not have solutions bounded at infinity. If, nevertheless, such bounded solutions exist, the points $\pm1$ are called resonances.

Remark. Let us mention that the most rapidly decreasing solution $f_3^0$ is simply defined by means of the integral equation

$$f_3^0(x)=e^{-\mu x}\binom01-\int_x^\infty dy\begin{pmatrix}\frac{\sin k(x-y)}{k}&0\\0&\frac{\sinh\mu(x-y)}{\mu}\end{pmatrix}\sigma_3V(\varphi_0(y))f_3^0(y).$$

For $E$ in some small vicinity of zero one can use similar equations to construct a complete set of solutions. Indeed, consider the equation

$$w_1^0(x)=e^{ikx}\binom10-\int_x^\infty dy\begin{pmatrix}\frac{\sin k(x-y)}{k}&0\\0&\frac{\sinh\mu(x-y)}{\mu}\end{pmatrix}\sigma_3V(\varphi_0(y))w_1^0(y). \tag{2.1.6}$$

The potential $V(\varphi_0)$ decreases exponentially: $|V(\varphi_0(x))|\le ce^{-4|x|}$, so, for $E$ in a sufficiently small vicinity of zero (for example, for $|E|\le2$), the integral operator in (2.1.6) reproduces the behavior of the free term. Thus, omitting standard details, we get the existence of a solution $w_1^0(x,k)$ of (2.1.1) that is a holomorphic function of $k\in\Omega_0$, $\Omega_0=\{k:\ |k^2+1|<2\}$, with the following asymptotic behavior as $x\to+\infty$:

$$w_1^0=e^{ikx}\Bigl[\binom10+O(e^{-4x})\Bigr], \tag{2.1.7}$$

uniformly with respect to $k$. This asymptotic formula can be differentiated with respect to $x$, $k$ any number of times. The constructed solution satisfies the relation $f_3^0(x,k)=\sigma_1w_1^0(x,i\mu)$.

On the set $\Omega_0$ with the cuts along the intervals $(-i\sqrt3,-i\sqrt2]$, $[i\sqrt2,i\sqrt3)$ introduce the basis of solutions $\{w_j^0\}_{j=1}^4$,

$$w_2^0(x,k)=w_1^0(x,-k),\quad w_3^0(x,k)=\sigma_1w_1^0(x,i\mu),\quad w_4^0(x,k)=\sigma_1w_1^0(x,-i\mu),\qquad \operatorname{Re}\mu>0.$$

The $w_j^0$ satisfy the same set of relations (2.1.4), (2.1.5) as $f_j^0$. Consider the Wronskian

$$\hat D_0=W(U^0,W^0),$$

where $W^0=(w_1^0,w_3^0)$, $U^0(x,k)=W^0(-x,k)$. Clearly, the zeros of $\det\hat D_0$ coincide with those of $\det D_0$ (in $\Omega_0\cap\Omega_1$).

Since $H_0$ is invariant under the change of variable $x\to-x$, the matrices $D_0$, $\hat D_0$ can be factorized:

$$D_0=-2D_0^+D_0^-,\qquad D_0^-(k)=(F_1^0(0,k))^t,\quad D_0^+(k)=F_{1x}^0(0,k),$$

$$\hat D_0=-2\hat D_0^+\hat D_0^-,\qquad \hat D_0^-(k)=(W^0(0,k))^t,\quad \hat D_0^+(k)=W_x^0(0,k).$$

2.1.2 Discrete spectrum

Taking into account the special structure of the perturbation $V(\varphi_0)$ one can get a more precise description of the discrete spectrum. The structure of the root subspace of $H_0$ restricted to the subspace of even functions corresponding to the eigenvalue $E=0$ has already been described in Section 1. Taking into account also the Galilei invariance of equation (1.1.1) one can get the complete description: corresponding to the point $E=0$ are two eigenvectors $\vec\eta_0$, $\vec\xi_0$ and four associated functions $\vec\eta_1$, $\vec\xi_i$, $i=1,2,3$,

$$H_0\vec\xi_0=H_0\vec\eta_0=0,\qquad H_0\vec\eta_1=i\vec\eta_0,\qquad H_0\vec\xi_i=i\vec\xi_{i-1},\quad i=1,2,3,$$

$$\vec\xi_i=\binom{\xi_i}{\bar\xi_i},\qquad \vec\eta_i=\binom{\eta_i}{\bar\eta_i},$$

$$\xi_0=i\varphi_0,\quad \xi_1=\frac14(1+2x\partial_x)\varphi_0,\quad \xi_2=-\frac i8x^2\varphi_0,\quad \xi_3=\frac12\varphi_1,\quad \eta_0=\varphi_0',\quad \eta_1=-\frac i2x\varphi_0.$$

Since

$$\langle\vec\xi_3,\sigma_3\vec\xi_0\rangle=-\langle\vec\xi_2,\sigma_3\vec\xi_1\rangle=-i\frac{\|x\varphi_0\|_2^2}{8},\qquad \langle\vec\eta_1,\sigma_3\vec\eta_0\rangle=i\frac{\|\varphi_0\|_2^2}{2},$$

the vectors $\vec\xi_i$, $\vec\eta_j$, $i=0,\dots,3$, $j=0,1$, span the root subspace corresponding to the point $E=0$.

Let us pass to a new basis in the matrix representation of $H_0$:

$$L_0=WH_0W^{-1},\qquad W=\frac{1}{\sqrt2}\begin{pmatrix}1&1\\1&-1\end{pmatrix}.$$

The operator $L_0$ has the form

$$L_0=\begin{pmatrix}0&L_{0-}\\L_{0+}&0\end{pmatrix},$$

where

$$L_{0+}=-\partial_x^2+1-5\varphi_0^4,\qquad L_{0-}=-\partial_x^2+1-\varphi_0^4.$$
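For orientation, several objects of this subsection can be made explicit. The sketch below assumes that $\varphi_0$ is the positive even decaying solution of $-\varphi_0''+\varphi_0-\varphi_0^5=0$ (which is the statement $L_{0-}\varphi_0=0$) and that $\varphi_0(x,\alpha)=(\alpha/2)^{1/2}\varphi_0(\alpha x/2)$; the closed form and the value $E_0=-8$ are computed here and are not quoted from the text:

```latex
\begin{align*}
\varphi_0(x) &= \frac{3^{1/4}}{\cosh^{1/2}(2x)}, \qquad
\varphi_0'' = \varphi_0-\varphi_0^5, \qquad
(\varphi_0')^2 = \varphi_0^2-\tfrac13\varphi_0^6,\\
% Differentiating the profile equation in x gives the zero mode of L_{0+}:
L_{0+}\varphi_0' &= \bigl(-\varphi_0''+\varphi_0-\varphi_0^5\bigr)' = 0,\\
% Use (\varphi_0^3)'' = 3\varphi_0^2\varphi_0'' + 6\varphi_0(\varphi_0')^2 = 9\varphi_0^3-5\varphi_0^7:
L_{0+}\varphi_0^3 &= -(9\varphi_0^3-5\varphi_0^7)+\varphi_0^3-5\varphi_0^7 = -8\,\varphi_0^3
   \qquad(\text{so } E_0=-8),\\
% Differentiate -\varphi''+\tfrac{\alpha^2}{4}\varphi-\varphi^5=0 in \alpha at \alpha=2;
% \partial_\alpha\varphi_0(x,\alpha)|_{\alpha=2} = \tfrac14(1+2x\partial_x)\varphi_0 = \xi_1:
L_{0+}\xi_1 &= -\varphi_0, \qquad
(\xi_1,\varphi_0) = \tfrac14\int_{\mathbb R}\bigl(\varphi_0^2+x(\varphi_0^2)'\bigr)\,dx = 0 .
\end{align*}
```

The last identity is the critical-case orthogonality that enters the proof below through $g(0)=-(\xi_1,\varphi_0)=0$, and $\varphi_0^4=3\cosh^{-2}(2x)$ is consistent with the decay $|V(\varphi_0(x))|\le ce^{-4|x|}$.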

The operators $L_{0\pm}$ are self-adjoint in $L^2$; their continuous spectra lie on the half-axis $E\ge1$. $L_{0-}$ has the only eigenvalue $E=0$ with the eigenfunction $\varphi_0$. $L_{0+}$ has two eigenvalues $E_0$, $0$, $E_0<0$, with the eigenfunctions $\varphi_0^3$, $\varphi_0'$ respectively. Both $L_{0-}$ and $L_{0+}$ have no resonances at the end point of the continuous spectrum. Remark that

$$L_0^2=\begin{pmatrix}T_0&0\\0&T_0^*\end{pmatrix},\qquad T_0=L_{0-}L_{0+}.$$

The spectra of the operators $T_0$ and $T_0^*$ are connected in a canonical way, i.e., they are complex conjugate, and the corresponding root subspaces are finite-dimensional and have the same structure. Consider $T_0$. Obviously, $T_0\xi_1=T_0\eta_0=0$. The spectrum of $T_0$ is real, the minimal eigenvalue being equal to zero (see [BP1], for example). Moreover, one has the following proposition.

Proposition 2.1.2 Zero is the only eigenvalue of the operator $T_0$ in the interval $(-\infty,1]$.

Proof. We argue by contradiction. Let $1\ge E>0$ be an eigenvalue of $T_0$ with an eigenfunction $\psi$: $T_0\psi=E\psi$. Then $(\psi,\varphi_0)=0$, $(L_{0-}^{-1}\psi,\xi_1)=(L_{0-}^{-1}\psi,\eta_0)=0$. Consider the self-adjoint operator $A=PL_{0+}P$, $P$ being the projection orthogonal to $\varphi_0$. A direct calculation shows that

$$\frac{(Au,u)}{(u,u)}\le\frac{(L_{0-}^{-1}Pu,Pu)}{(Pu,Pu)}<1,$$

provided $u\in F$, $F=\mathcal{L}\{\psi,\eta_0,\xi_j,\ j=0,1\}$. Obviously, $\dim F=4$, which implies that the number of eigenvalues of $A$ in $(-\infty,1)$, counted with their multiplicities, is greater than or equal to four. On the other hand, the only eigenvalue of $A$ in the interval $(-\infty,1)$ is the point $E=0$, with $\eta_0$, $\xi_j$, $j=0,1$, being the corresponding eigenfunctions. Indeed, let $E\ne0$ be an eigenvalue of $PL_{0+}P$; then $E>E_0$ and there exists $u$, $(u,\varphi_0)=0$, such that $L_{0+}u=Eu+\varphi_0$. Consequently, $u=(L_{0+}-E)^{-1}\varphi_0$, which implies

$$((L_{0+}-E)^{-1}\varphi_0,\varphi_0)=0. \tag{2.1.8}$$

Consider the function $g(\lambda)=((L_{0+}-\lambda)^{-1}\varphi_0,\varphi_0)$, assuming that $\lambda\in(E_0,1)$. The function $g$ has the following obvious properties: 1) $g(\lambda)$ is monotonically increasing, because $g'(\lambda)=\|(L_{0+}-\lambda)^{-1}\varphi_0\|_2^2$; 2) $g(0)=-(\xi_1,\varphi_0)=0$. Thus, (2.1.8) is impossible for $E\ne0$.

Proposition 2.1.2 extends immediately to the operators $L_0$ and $H_0$:

Corollary 2.1.3 $E=0$ is the unique point in the discrete spectrum of the operator $H_0$.

A slight modification of the arguments used in the proof of Proposition 2.1.2 allows us to get

Proposition 2.1.4 The operator $H_0$ has no resonances at the end points of the continuous spectrum.

See Appendix 1 for the proof.

2.1.3 Embedded eigenvalues

In this subsubsection we prove the absence of embedded eigenvalues. Consider equation (2.1.1) with $E>1$. After the change of variables $f(x)=v(z)$, $z=\tanh 2x$, (2.1.1) takes the form

$$\Bigl(-\partial_z^2+\frac{2z}{1-z^2}\partial_z+\frac{1}{4(1-z^2)^2}\Bigr)v-\frac{9}{4(1-z^2)}v-\frac{3}{2(1-z^2)}\sigma_1v=\frac{E}{4(1-z^2)^2}\sigma_3v. \tag{2.1.9}$$
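The exponents $\pm ik/4$, $\pm\mu/4$ at the points $z_\pm$ can be read off from the most singular terms of (2.1.9). A sketch of the indicial computation at $z=1$ (the point $z=-1$ is symmetric), with a trial exponent $s$ and a constant vector $c$ introduced here only for illustration:

```latex
% Near z = 1:  1-z^2 \approx 2(1-z),  \quad 2z/(1-z^2) \approx 1/(1-z),  \quad 4(1-z^2)^2 \approx 16(1-z)^2.
% Insert v = (1-z)^s c and keep the terms of order (1-z)^{s-2};
% the terms with a single factor 1/(1-z^2) are of lower order.
\begin{align*}
\Bigl[-s(s-1)-s+\tfrac{1}{16}\Bigr]c &= \tfrac{E}{16}\,\sigma_3c
 \quad\Longleftrightarrow\quad s^2c=\tfrac{1}{16}\,(1-E\sigma_3)\,c,\\
c=\tbinom10:&\quad s^2=-\tfrac{E-1}{16}=-\tfrac{k^2}{16},\qquad s=\pm\tfrac{ik}{4},\\
c=\tbinom01:&\quad s^2=\tfrac{E+1}{16}=\tfrac{\mu^2}{16},\qquad s=\pm\tfrac{\mu}{4}.
\end{align*}
```

The two real exponents differ by $\mu/2$, which is why a logarithmic term can appear exactly when $\mu/2\in\mathbb{Z}$.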

The only singular points of this system (considered on the whole plane $z\in\mathbb{C}$) are $z_\pm=\pm1$ and $z_\infty=\infty$. It is easy to check that they are regular. In particular, in a vicinity of $z_\pm$ one can find a basis of solutions of the form

$$(z-z_j)^{ik/4}e_{j1}(z),\quad (z-z_j)^{-ik/4}e_{j2}(z),\quad (z-z_j)^{\mu/4}e_{j3}(z),$$

together with

$$(z-z_j)^{-\mu/4}e_{j4}(z)\quad\text{if }\mu/2\notin\mathbb{Z},\qquad \ln(z-z_j)(z-z_j)^{\mu/4}e_{j3}(z)+(z-z_j)^{-\mu/4}e_{j4}(z)\quad\text{if }\mu/2\in\mathbb{Z},$$

where $e_{jl}$, $l=1,\dots,4$, $j=\pm$, are holomorphic non-vanishing functions in some vicinity of $z_j$, $k$ and $\mu$ being the same as in subsubsection 2.1.1. Thus, if $E>1$ is an eigenvalue of $H_0$, there exists a nontrivial solution $v$ of (2.1.9) such that $v(z)=(1-z^2)^{\mu/4}\tilde v(z)$, where $\tilde v$ is an entire function. Since $z_\infty$ is a regular singular point of (2.1.9), $\tilde v$ has at most polynomial growth at infinity, which means that $\tilde v$ is a polynomial. Moreover, it is easy to check that the roots of the characteristic equation at infinity are given by $-\frac12\pm2$, $-\frac12\pm1$, which implies $n=0$, where $n$ is the degree of $\tilde v$. A direct calculation shows that (2.1.9) has no nontrivial solution of the form $(1-z^2)^{\mu/4}a$, where $a$ is a constant vector. Combining these results with the results of the previous subsection one gets the proposition.

Proposition 2.1.5 $\det D_0(k)\ne0$, $k\in\Omega_1$, $\operatorname{Im}k\ge0$, provided $k\ne i$.

2.2 Profile $\tilde\varphi$

Consider (1.2.1). We are looking for a real even solution of (1.2.1). Write $\tilde\varphi$ as the sum $\tilde\varphi(x,\alpha,a)=\varphi_0(x,\alpha)+\chi(x,\alpha,a)$. Then $\chi$ satisfies the equation

$$\chi=\tilde L_+^{-1}\chi_0+J(\chi), \tag{2.2.1}$$

where

$$\chi_0=\frac{ax^2}{4}\theta(hx)\varphi_0(x,\alpha),\qquad J(\chi)=\tilde L_+^{-1}\bigl[(\varphi_0+\chi)^5-\varphi_0^5-5\varphi_0^4\chi\bigr],\qquad \tilde L_+=-\partial_x^2+\frac{\alpha^2}{4}-\frac{ax^2}{4}\theta(hx)-5\varphi_0^4.$$

$\tilde L_+$ is a self-adjoint operator in $L^2$. It follows from the corresponding properties of $L_{0+}$ that the restriction of $\tilde L_+$ to the subspace of even functions has a bounded inverse. Moreover, one has the estimate

$$|\tilde G_+(x,y)|\le ce^{-\frac1h|S_{\alpha,a}(hx)-S_{\alpha,a}(hy)|},\qquad x\ge0,\ y\ge0, \tag{2.2.2}$$

$$S_{\alpha,a}(\xi)=\frac12\int_0^\xi ds\sqrt{\alpha^2-\operatorname{sgn}a\,s^2\theta(s)}.$$

Here $\tilde G_+$ is the kernel of $\tilde L_+^{-1}$, if we consider $\tilde L_+$ as an operator on the half-line $x\ge0$ with the Neumann boundary condition at $x=0$. This estimate can be obtained as an immediate consequence of the constructions developed in the next subsection. It follows from (2.2.2) that

$$\bigl|\tilde L_+^{-1}\chi_0(x)\bigr|\le c|a|\langle x\rangle^3e^{-\frac1h\tilde S_{\alpha,a}(h|x|)}, \tag{2.2.3}$$

$$\bigl\|e^{\frac1h\tilde S_{\alpha,a}(h|x|)}\tilde L_+^{-1}f\bigr\|_\infty\le c\bigl\|e^{\frac1h\tilde S_{\alpha,a}(h|x|)}f\bigr\|_\infty. \tag{2.2.4}$$

Here $\tilde S_{\alpha,a}(\xi)=\frac12\int_0^\xi ds\sqrt{\alpha^2-(a)_+s^2\theta(s)}$.

Consider (2.2.1). The basic idea is to view this equation as a mapping of the space of continuous functions equipped with the norm

$$|\chi|_p=\bigl\|\langle x\rangle^{-p}e^{\frac1h\tilde S_{\alpha,a}(h|x|)}\chi\bigr\|_\infty,$$

with some $p\ge0$, to itself and to seek a fixed point. Using (2.2.4) it is not difficult to check that the nonlinear operator $J$ maps this space into itself:

$$|J(\chi)|_p\le c\bigl[|\chi|_p^2+|\chi|_p^5\bigr]. \tag{2.2.5}$$

Moreover,

$$|J(\chi_1)-J(\chi_2)|_p\le c|\chi_1-\chi_2|_p\bigl[|\chi_1|_p+|\chi_2|_p+(|\chi_1|_p+|\chi_2|_p)^4\bigr]. \tag{2.2.6}$$

The estimates (2.2.3), (2.2.5), (2.2.6) mean that for $a$ sufficiently small the mapping $\chi\to\tilde L_+^{-1}\chi_0+J(\chi)$ is a contraction of the ball $|\chi|_3\le\eta$ into itself with some $\eta>0$, and, consequently, has a unique fixed point which satisfies the estimate

$$|\chi|_3\le c|a|. \tag{2.2.7}$$
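The contraction argument can be made quantitative in a few lines. In the sketch below $T(\chi)=\tilde L_+^{-1}\chi_0+J(\chi)$, $c_0$ denotes the constant of (2.2.3) and $c$ the larger of the constants of (2.2.5), (2.2.6); the particular choice of $\eta$ is one convenient assumption, not the author's:

```latex
% Fix \eta \in (0,1] with  c\,(2\eta+16\eta^4) \le 1/2  (hence c(\eta+\eta^4) \le 1/4),
% then restrict to  |a| \le \eta/(2c_0).
\begin{align*}
|T(\chi)|_3 &\le c_0|a| + c\,(\eta^2+\eta^5) \le \tfrac{\eta}{2}+\tfrac{\eta}{4} \le \eta
   && (|\chi|_3\le\eta),\\
|T(\chi_1)-T(\chi_2)|_3 &\le c\,\bigl[2\eta+(2\eta)^4\bigr]\,|\chi_1-\chi_2|_3
   \le \tfrac12\,|\chi_1-\chi_2|_3
   && (|\chi_{1,2}|_3\le\eta),\\
|\chi|_3 &\le c_0|a| + |J(\chi)-J(0)|_3 \le c_0|a|+\tfrac12|\chi|_3
   \ \Longrightarrow\ |\chi|_3\le 2c_0|a|
   && (\chi=T(\chi),\ J(0)=0).
\end{align*}
```

The first two lines give the self-mapping and contraction properties, and the third reproduces (2.2.7) with the constant $2c_0$.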

In the same manner one can prove the asymptotic expansion (1.1.5). Write $\tilde\varphi=\varphi^N+\chi^N$. The function $\chi^N$ satisfies the equation

$$-\partial_x^2\chi^N+\frac{\alpha^2}{4}\chi^N-\frac{ax^2}{4}\theta(hx)\chi^N-(\varphi^N+\chi^N)^5+(\varphi^N)^5-R_N=0,$$

where $R_N$ admits the estimate

$$|R_N(x)|\le c\Bigl[|a|^{N+1}\langle x\rangle^{3N+2}+(1-\theta(hx))\sum_{k=0}^{N-1}|a|^{k+1}|x|^{3k+2}\Bigr]e^{-\frac\alpha2|x|}.$$

We rewrite this equation in a form similar to (2.2.1):

$$\chi^N=\chi_0^N+J_N(\chi^N), \tag{2.2.8}$$



$\chi_0^N=\tilde L_+^{-1}R_N$, $J_N(\chi)=\tilde L_+^{-1}F_N(\chi)$, where $F_N(\chi)=(\varphi^N+\chi)^5-(\varphi^N)^5-5\varphi_0^4\chi$. By (2.2.2), (2.2.4),

$$\bigl\|\langle x\rangle^{-3(N+1)}e^{\frac1h\tilde S_{\alpha,a}(h|x|)}\chi_0^N\bigr\|_\infty\le c|a|^{N+1},\qquad |J_N(\chi)|_p\le c\bigl(|a||\chi|_p+|\chi|_p^2+|\chi|_p^5\bigr),$$

which together with (2.2.7) implies

$$\bigl\|\langle x\rangle^{-3(N+1)}e^{\frac1h\tilde S_{\alpha,a}(h|x|)}\chi^N\bigr\|_\infty\le c|a|^{N+1},$$

provided $a$ is sufficiently small. By (2.2.7), $\tilde\varphi$ admits the estimate

$$\bigl\|\langle x\rangle^{-3}e^{\frac1h\tilde S_{\alpha,a}(h|x|)}\tilde\varphi\bigr\|_\infty\le c. \tag{2.2.9}$$

Plugging this inequality into the right-hand side of the representation

$$\tilde\varphi=\Bigl(-\partial_x^2+\frac{\alpha^2}{4}-\frac{ax^2}{4}\theta\Bigr)^{-1}\tilde\varphi^5$$

and using the corresponding estimate of the free resolvent one gets an improved version of (2.2.9):

$$c_2\bigl(1+O(e^{-\frac4hS_{\alpha,a}(h|x|)})\bigr)\le e^{\frac1hS_{\alpha,a}(h|x|)}\tilde\varphi\le c_1,\qquad x\in\mathbb{R}, \tag{2.2.10}$$

with some $c_1$, $c_2>0$ independent of $\alpha$, $a$, which together with (2.2.7) implies the positivity of $\tilde\varphi$ provided $a$ is sufficiently small. We can now formulate the final assertion with respect to $\tilde\varphi$.

Proposition 2.2.1 For $\alpha$ in some finite vicinity of 2 and for $a$ sufficiently small, equation (2.2.1) has a unique positive even decreasing solution $\tilde\varphi(x,\alpha,a)$ which is close to $\varphi_0(x,\alpha)$. Moreover, as $a\to0$, $\tilde\varphi(x,\alpha,a)$ admits the asymptotic expansion (1.1.5) in the sense

$$|\tilde\varphi-\varphi^N|\le c|a|^{N+1}\langle x\rangle^{3(N+1)}e^{-\frac1h\tilde S_{\alpha,a}(h|x|)}. \tag{2.2.11}$$

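Remark. It is not difficult to check that

(i) the solution $\tilde\varphi$ is a smooth function of its arguments and the asymptotic representation (2.2.11) can be differentiated with respect to $x$, $\alpha$ and $a$ any number of times;

(ii) $\tilde\varphi$ "almost" satisfies the scaling law

$$\tilde\varphi(x,\alpha,a)\sim\Bigl(\frac\alpha2\Bigr)^{1/2}\tilde\varphi\Bigl(\frac\alpha2x,2,\frac{16a}{\alpha^4}\Bigr). \tag{2.2.12}$$

More precisely,

$$\Bigl|\tilde\varphi(x,\alpha,a)-\Bigl(\frac\alpha2\Bigr)^{1/2}\tilde\varphi\Bigl(\frac\alpha2x,2,\frac{16a}{\alpha^4}\Bigr)\Bigr|\le ce^{-\gamma_1/h}e^{-\gamma_2|x|},$$

with some $\gamma_1$, $\gamma_2>0$.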
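The origin of the parameter $16a/\alpha^4$ in the scaling law can be seen by ignoring the cut-off. The sketch below assumes $\theta\equiv1$ and that the profile equation has the form $-\varphi''+\frac{\alpha^2}{4}\varphi-\frac{ax^2}{4}\varphi-\varphi^5=0$ (the form implied by $\tilde L_+$ and $\chi_0$ above); the symbols $u$, $b$, $y$ are introduced here only for illustration:

```latex
% Let u(y) solve the \alpha = 2 equation with parameter b:
%   -u'' + u - \tfrac{b\,y^2}{4}\,u - u^5 = 0,
% and put  \varphi(x) = (\alpha/2)^{1/2}\,u(\alpha x/2),  so that  y = \alpha x/2.
\begin{align*}
-\varphi''+\tfrac{\alpha^2}{4}\varphi-\varphi^5
 &= \bigl(\tfrac\alpha2\bigr)^{5/2}\bigl(-u''+u-u^5\bigr)
  = \bigl(\tfrac\alpha2\bigr)^{5/2}\,\tfrac{b\,y^2}{4}\,u
  = \tfrac{b\,\alpha^4}{16}\cdot\tfrac{x^2}{4}\,\varphi .
\end{align*}
% Hence \varphi solves the equation with parameters (\alpha, a) exactly when
%   a = b\,\alpha^4/16,   i.e.   b = 16a/\alpha^4 .
```

With the cut-off $\theta(hx)$ restored the identity holds only up to the exponentially small error stated in the remark, since $\theta$ does not scale in the same way.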



2.3 Operator $\tilde H(a)$

In this subsection we establish the spectral properties of the operator $\tilde H(a)$ (in the limit $a\to0$) that were announced and used in Section 1.

2.3.1 Standard solutions

Consider the equation

$$(\tilde H(a)-E)\psi=0. \tag{2.3.1}$$

For $E$ in some small but fixed vicinity of zero we introduce a basis of solutions $\psi_j$, $j=1,\dots,4$, of (2.3.1) with the standard behavior at $+\infty$ by means of the integral equations

$$\psi_j(x,E)=\psi_{0j}(x,E)-\int_x^\infty dy\,\tilde K(x,y,E)\sigma_3V(\tilde\varphi(y))\psi_j(y,E), \tag{2.3.2}$$

$j=1,\dots,4$, where

$$\psi_{01}(x,E)=u_1(x,\lambda_1)\binom10,\qquad \psi_{02}(x,E)=u_2(x,\lambda_1)\binom10,\qquad \psi_{0j}(x,E)=\sigma_1\psi_{0j+2}(x,-E),$$

$$\tilde K(x,y,E)=\begin{pmatrix}\tilde k(x,y,\lambda_1)&0\\0&\tilde k(x,y,\lambda_2)\end{pmatrix},\qquad \lambda_1=E-1,\quad \lambda_2=-E-1,$$

$$\tilde k(x,y,\lambda)=\frac{1}{w(u_1,u_2)}\bigl(u_1(x,\lambda)u_2(y,\lambda)-u_1(y,\lambda)u_2(x,\lambda)\bigr),$$

$w(u_1,u_2)=u_1'u_2-u_2'u_1$, $u_2(x,\lambda)=u_1(-x,\lambda)$, $u_1$ being a decreasing (as $x\to+\infty$) solution of the equation

$$-u_{xx}-\frac{ax^2}{4}\theta(hx)u=\lambda u.$$

We normalize $u_1$ by the condition

$$u_1=\frac{1}{\bigl(-\lambda-\frac{ax^2}{4}\theta(hx)\bigr)^{1/4}}\,e^{-\frac1h\int_0^{hx}ds\sqrt{-\lambda-\operatorname{sgn}a\,\frac{s^2}{4}\theta(s)}},\qquad x\to+\infty. \tag{2.3.3}$$

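The roots here are defined on the complex plane with the cut along the negative semi-axis. They are positive for positive values of the argument. For $\lambda$ in some finite vicinity of $-1$, $x\in\mathbb{R}$, the asymptotics of $u_1$ as $a\to0$ is given by the standard WKB formulas

$$u_1(x,\lambda)=e^{-\frac1h\int_0^{hx}ds\sqrt{-\lambda-\operatorname{sgn}a\,\frac{s^2}{4}\theta(s)}}\sum_{j=0}^\infty h^ju_1^j(hx,\lambda), \tag{2.3.4}$$

where

$$u_1^0(\xi,\lambda)=\frac{1}{\bigl(-\lambda-\operatorname{sgn}a\,\frac{\xi^2\theta(\xi)}{4}\bigr)^{1/4}},\qquad
u_1^j(\xi,\lambda)=-\frac{1}{2\bigl(-\lambda-\operatorname{sgn}a\,\frac{\xi^2\theta(\xi)}{4}\bigr)^{1/4}}\int_\xi^\infty ds\,\frac{u_{1ss}^{j-1}}{\bigl(-\lambda-\operatorname{sgn}a\,\frac{s^2\theta(s)}{4}\bigr)^{1/4}}.$$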
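These coefficient formulas are the usual WKB transport recursion. A sketch of the derivation, under the assumption that $a=\operatorname{sgn}a\,h^2$ (that is, $h=|a|^{1/2}$, which is what the phases above suggest) and with the shorthand $p$, $\Phi$, $A$ introduced here:

```latex
% Variables: \xi = hx, \quad p(\xi)^2 = -\lambda - \operatorname{sgn}a\,\tfrac{\xi^2}{4}\theta(\xi), \quad \Phi' = p.
% The equation  -u_{xx} - \tfrac{a x^2}{4}\theta(hx)\,u = \lambda u  becomes  -h^2 u_{\xi\xi} + p^2 u = 0.
% Ansatz: u = e^{-\Phi/h} A(\xi), \quad A = \sum_{j\ge 0} h^j u_1^j.
\begin{align*}
-h^2u_{\xi\xi}+p^2u &= e^{-\Phi/h}\bigl(2hp\,A'+hp'A-h^2A''\bigr)=0
   \quad\Longleftrightarrow\quad 2pA'+p'A=hA'',\\
\text{order } h^0:&\quad 2p\,(u_1^0)'+p'u_1^0=0 \ \Longrightarrow\ u_1^0=p^{-1/2},\\
\text{order } h^j:&\quad 2p\,(u_1^j)'+p'u_1^j=(u_1^{j-1})''
   \ \Longrightarrow\ \bigl(p^{1/2}u_1^j\bigr)'=\frac{(u_1^{j-1})''}{2p^{1/2}},\\
&\quad u_1^j(\xi)=-\frac{1}{2p^{1/2}(\xi)}\int_\xi^\infty\frac{(u_1^{j-1})''(s)}{p^{1/2}(s)}\,ds .
\end{align*}
```

The integration constant is fixed by requiring $p^{1/2}u_1^j\to0$ as $\xi\to\infty$, in agreement with the normalization (2.3.3); since $p^{1/2}=(p^2)^{1/4}$, the last line is the stated formula for $u_1^j$.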

As a consequence, one gets

$$w(u_1,u_2)=-2+O(h),\qquad |\tilde k(x,y,\lambda)|\le ce^{\frac1h\int_{hx}^{hy}ds\operatorname{Re}\sqrt{-\lambda-\operatorname{sgn}a\,\frac{s^2}{4}\theta(s)}},\qquad x\le y,$$

uniformly with respect to $\lambda$ in some finite vicinity of $-1$. The potential $V(\tilde\varphi)$ decreases exponentially:

$$|V(\tilde\varphi(x))|\le ce^{-\frac4hS_a(h|x|)},$$

$S_a(\xi)=S_{2,a}(\xi)$, so for $E$ in some finite vicinity of zero we get the existence of a solution $\psi_j$ of (2.3.1) that has the following asymptotic behavior as $x\to+\infty$:

$$\psi_j(x,E)=u_j(x,E-1)\Bigl[\binom10+O\bigl(e^{-\frac4hS_a(hx)}\bigr)\Bigr],\qquad j=1,2, \tag{2.3.5}$$

$$\psi_j(x,E)=u_{j-2}(x,-E-1)\Bigl[\binom01+O\bigl(e^{-\frac4hS_a(hx)}\bigr)\Bigr],\qquad j=3,4, \tag{2.3.6}$$

uniformly with respect to $a$, $E$. In this formulation and in subsequent ones we omit phrases of the following type: the solutions $\psi_j$ and their derivatives with respect to $x$ are holomorphic functions of $E$ and the asymptotic representations can be differentiated with respect to $x$ and $E$ any number of times. Clearly,

$$\psi_{j+2}(x,E)=\sigma_1\psi_j(x,-E),\qquad \psi_j(x,\bar E)=\overline{\psi_j(x,E)}, \tag{2.3.7}$$

$$w(\psi_1,\psi_2)=w(\psi_{01},\psi_{02}),\quad w(\psi_3,\psi_4)=w(\psi_{03},\psi_{04}),\quad w(\psi_1,\psi_{3,4})=0,\quad w(\psi_{3,4},\psi_2)=0. \tag{2.3.8}$$

One can use $\psi_j(-x,E)$, $j=1,\dots,4$, as a basis of solutions with the standard behavior at $-\infty$. We shall now describe the behavior of the decreasing solutions $\psi_{1,3}$ in the limit $a\to0$. By (2.3.7), it is sufficient to consider $\psi_1$. We represent it as the sum

$$\psi_1=e^{-ikx}u_1(x,\lambda_1)w_1^0(x,k)+r_1,\qquad k=\sqrt{E-1},\quad \operatorname{Im}k>0. \tag{2.3.9}$$

One can write down the following integral equation for $r_1$:

$$r_1(x,E)=-\int_x^\infty dy\,\tilde K(x,y,E)\bigl[R_1+\sigma_3V(\tilde\varphi(y))r_1(y,E)\bigr].$$

Here

$$R_1=(V(\tilde\varphi)-V(\varphi_0))e^{-ikx}u_1(x,\lambda_1)w_1^0(x,k)-2e^{ikx}\bigl(e^{-ikx}u_1(x,\lambda_1)\bigr)_x\sigma_3\bigl(e^{-ikx}w_1^0(x,k)\bigr)_x.$$

By (2.1.7), (2.3.4),

$$|R_1|\le ch|u_1(x,\lambda_1)|\langle x\rangle^3e^{-\frac4h\tilde S_{\alpha,a}(hx)},$$

which leads to the following asymptotic estimate for $r_1$:

$$r_1=O\bigl(hu_1(x,\lambda_1)e^{-\frac\gamma h\tilde S_a(hx)}\bigr),\qquad \gamma<4. \tag{2.3.10}$$

For $x$ not too large the representation (2.3.9), (2.3.10) can be simplified:

$$\psi_1=d_0w_1^0+O\bigl(he^{-\frac{1-\gamma}{h}\tilde S_a(hx)}\bigr),\qquad d_0=(-\lambda_1)^{-1/4}, \tag{2.3.11}$$

with some $\gamma>0$ (footnote 1), uniformly with respect to $E$ in some finite vicinity of zero. In a similar way one can get a complete asymptotic expansion of $\psi_1$ in powers of $h$. Without dwelling on the derivation we describe the result. Let us introduce a formal solution $w$,

$$w(x,E,a)=\sum_{n=0}^\infty a^nw_n(x,E), \tag{2.3.12}$$

of the equation

$$\Bigl[\Bigl(-\partial_x^2+1-\frac{ax^2}{4}\Bigr)\sigma_3+V(\tilde\varphi(a))\Bigr]\psi=E\psi. \tag{2.3.13}$$

Equation (2.3.13) is equivalent to the following recurrent system for $w_n$:

$$(H_0-E)w_0=0,\qquad (H_0-E)w_n-\frac{x^2}{4}\sigma_3w_{n-1}+\sum_{k=1}^nV^kw_{n-k}=0,\quad n\ge1,$$

where $V^k$ are the coefficients of the expansion $V(\tilde\varphi(a))=\sum_{k\ge0}a^kV^k$. It is easy to check that this system admits a solution with the following asymptotic behavior:

$$w_n=e^{ikx}\Bigl[P_n(x,E)\binom10+O\bigl(\langle x\rangle^{3n}e^{-4x}\bigr)\Bigr],\qquad x\to+\infty,$$

$k=\sqrt{E-1}$, $\operatorname{Im}k>0$, $P_n$ being a polynomial in $x$ of degree $3n$. The coefficients $w_n$ can be fixed uniquely by the condition $P_n(0,E)=0$ for $n>0$, $P_0=1$. Then

$$w_0(x,E)=w_1^0(x,k),\qquad w(x,\bar E,a)=\overline{w(x,E,a)}.$$

Footnote 1. $\gamma$ can be made arbitrarily small by choosing a sufficiently small vicinity of the point $E=0$.
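The recurrent system for the coefficients $w_n$ is obtained by matching powers of $a$. A short sketch, assuming that the zeroth coefficient of the potential expansion is $V^0=V(\varphi_0)$, so that $H_0=(-\partial_x^2+1)\sigma_3+V^0$, and with the convention $w_{-1}:=0$ introduced here:

```latex
% Insert  w = \sum_n a^n w_n  and  V(\tilde\varphi(a)) = \sum_k a^k V^k  into (2.3.13).
\begin{align*}
0 &= \Bigl[(-\partial_x^2+1)\sigma_3 - a\,\tfrac{x^2}{4}\,\sigma_3 + \sum_{k\ge0}a^kV^k - E\Bigr]\sum_{n\ge0}a^nw_n\\
  &= \sum_{n\ge0}a^n\Bigl[(H_0-E)\,w_n - \tfrac{x^2}{4}\,\sigma_3w_{n-1} + \sum_{k=1}^{n}V^kw_{n-k}\Bigr].
\end{align*}
```

Setting each bracket to zero gives $(H_0-E)w_0=0$ for $n=0$ and the stated relation for $n\ge1$.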



One can show that after a renormalization the solution $\psi_1$ admits the asymptotic expansion (2.3.12). More precisely, there exists a formal series $d(E,a)=\sum_{n\ge0}h^nd_n(E,\hat a)$, $\hat a=a/|a|$ ($d_0$ being the same as in (2.3.11)), such that

$$\psi_1=dw, \tag{2.3.14}$$

in the sense

$$\Bigl|\psi_1(x,E,a)-\sum_{n\le N}h^n\psi_1^n(x,E,\hat a)\Bigr|\le ch^{N+1}e^{-\frac{(1-\gamma)}{h}\tilde S_a(hx)},\qquad x\ge0, \tag{2.3.15}$$

uniformly with respect to $E$ in some finite vicinity of zero. Here $\psi_1^n$ are the coefficients of the series $dw$, and $\gamma$ is the same as in (2.3.11). It is worth mentioning that $d$ can be found from the formal relation

$$u_1(x,\lambda_1,a)=e^{ikx}d(E,a)\sum_{n\ge0}a^nP_n(x,E).$$

In particular,

$$d_1=\frac{1}{2(-\lambda_1)^{1/4}}\int_0^\infty ds\Bigl(\frac{\partial}{\partial s}\Bigl(-\lambda_1-\hat a\frac{s^2}{4}\theta(s)\Bigr)^{-1/4}\Bigr)^2.$$

By (2.3.7), an expansion similar to (2.3.14), (2.3.15) is valid for $\psi_3$:

$$\psi_3(x,E,a)=\sum_{n\ge0}h^n\psi_3^n(x,E,\hat a), \tag{2.3.16}$$

where $\psi_3^n(x,E,\hat a)=\sigma_1\psi_1^n(x,-E,\hat a)$.

2.3.2 Spectral properties of the operator $\tilde H(a)$

The operator $\tilde H(a)$ has the same continuous spectrum as $H_0$. In addition, $\tilde H(a)$ can have only finitely many eigenvalues of finite multiplicity. $\tilde H(a)$ satisfies relations similar to (1.1.8):

$$\sigma_3\tilde H(a)\sigma_3=\tilde H^*(a),\qquad \sigma_1\tilde H(a)\sigma_1=-\tilde H(a), \tag{2.3.17}$$

which leads to a clear symmetry in the structure of the spectrum of $\tilde H(a)$. The point $E=0$ is an eigenvalue: there is an eigenfunction $\tilde\zeta_0$ and an associated function $\tilde\zeta_1$,

$$\tilde H(a)\tilde\zeta_0=0,\qquad \tilde H(a)\tilde\zeta_1=i\tilde\zeta_0,$$

$$\tilde\zeta_0(a)=i\tilde\varphi(a)\binom{1}{-1},\qquad \tilde\zeta_1(a)=\partial_\alpha\tilde\varphi(\alpha,a)\big|_{\alpha=2}\binom11, \tag{2.3.18}$$



$$\langle\tilde\zeta_1(a),\sigma_3\tilde\zeta_0(a)\rangle=4ia(\varphi_0,\varphi_1)+O(a^2)=4iea+O(a^2). \tag{2.3.19}$$

The eigenvalues of $\tilde H(a)$ lying in some finite vicinity of zero can be characterized by the equation $\det D(E)=0$, where $D=W(\Xi_1,\Psi_1)$, $\Psi_1=(\psi_1,\psi_3)$, $\Xi_1(x,E)=\Psi_1(-x,E)$; $D$ is a holomorphic function of $E$ in some finite vicinity of the point $E=0$. In the same manner as $D_0$, the matrix $D$ can be factorized:

$$D=-2D^-D^+,\qquad D^-(E,a)=\Psi_1^t(0,E,a),\quad D^+(E,a)=\Psi_{1x}(0,E,a);$$

the zeros of $\det D^+$ ($\det D^-$), counted with their multiplicity, correspond to the eigenvalues of $\tilde H(a)$ restricted to the subspace of even (odd) functions. By (2.3.7),

$$\sigma_1D^\pm(E)\sigma_1=D^\pm(-E),\qquad D^\pm(\bar E)=\overline{D^\pm(E)}. \tag{2.3.20}$$

It follows from (2.3.18), (2.3.19) that the point $E=0$ is a root of $\det D^+$ of multiplicity two:

$$\det D^+=\kappa(a)E^2+O(E^4). \tag{2.3.21}$$

As $a\to0$, $\kappa$ admits an asymptotic representation of the form

$$\kappa(a)=d^2(0,a)\hat\kappa(a), \tag{2.3.22}$$

where $\hat\kappa(a)$ is a formal series in powers of $a$; in particular,

$$\hat\kappa(a)=\kappa_0a+O(a^2),\qquad \kappa_0=\frac{(\varphi_0^4(0)-1)e}{\varphi_\infty^2}>0, \tag{2.3.23}$$

where $\varphi_\infty=\varphi_\infty(2)$. In terms of the matrix solution $\Psi_1$, (2.3.14), (2.3.16) take the form

$$\Psi_1=\mathcal{W}\Lambda, \tag{2.3.24}$$

where $\mathcal{W}$ is the formal matrix solution of (2.3.13),

$$\mathcal{W}(x,E,a)=\sum_{n\ge0}a^nW^n(x,E),\qquad W^n(x,E)=(w_n(x,E),\sigma_1w_n(x,-E)), \tag{2.3.25}$$

$$\Lambda(E,a)=\begin{pmatrix}d(E,a)&0\\0&d(-E,a)\end{pmatrix}.$$

Let us note the obvious relation

$$W^0(x,0)=\frac{1}{\sqrt2\varphi_\infty}(\vec\eta_0,-\vec\xi_0)W, \tag{2.3.26}$$

$W$ on the right-hand side being the constant matrix introduced in subsubsection 2.1.2.



The formulas (2.3.24), (2.3.25) imply the following asymptotic expansion of $D^\pm$:

$$D^-=\Lambda\hat D^-,\qquad D^+=\hat D^+\Lambda, \tag{2.3.27}$$

where $\hat D^\pm$ is a formal series in powers of $a$:

$$\hat D^\pm(E,a)=\sum_{n\ge0}\hat D_n^\pm(E)a^n,\qquad \hat D_n^-(E)=(W^n(0,E))^t,\quad \hat D_n^+(E)=W_x^n(0,E).$$

ˆ + (E). Taking into account the structure of the root subspace of Consider D 0 H0 corresponding to the zero eigenvalue one can get the following relation : ˆ + (E)W = D ˆ + (E) 1 m1 (E) + E 4 γ0 0 1 D + O(E 5 ), 0 0 1 m1 (E) 0 −1 ˆ + (0)W = γ1 1 0 , m1 (E) = m10 E + m11 E 3 , D (2.3.28) 0 1 0 m1k , k = 0, 1, γk , k = 0, 1, are some constants, all of them can be calculated explicitly but in what follows we shall need only γk , k = 0, 1 ϕ0xx (0) γ1 = − √ , 2ϕ∞

$$\gamma_0=\frac{e}{4\sqrt2\varphi_0(0)\varphi_\infty}.$$

These formulas imply:

$$\det\hat D_0^+=\frac{\kappa_0}{4}E^4+O(E^6). \tag{2.3.29}$$

In a similar manner one can get

$$\det\hat D_0^-=\kappa_1E^2+O(E^4),\qquad \kappa_1=\frac{\|\varphi_0\|_2^2}{2\varphi_\infty(1-\varphi_0^4)}. \tag{2.3.30}$$

It follows from (2.3.27) that asymptotically (as $a\to0$), the eigenvalues of $\tilde H(a)$ restricted to the subspace of even (odd) functions are characterized by the equation $\Phi^+(E,a)=0$ ($\Phi^-(E,a)=0$), where $\Phi^\pm=\det\hat D^\pm$ is a formal series in powers of $a$:

$$\Phi^\pm(E,a)=\sum_{n\ge0}a^n\Phi_n^\pm(E),\qquad \Phi_0^\pm=\det\hat D_0^\pm. \tag{2.3.31}$$

By (2.3.20),

$$\Phi_n^\pm(E)=\Phi_n^\pm(-E)=\overline{\Phi_n^\pm(\bar E)},$$

and by (2.3.21), as $E\to0$,

$$\Phi_n^+(E)=O(E^2).$$

One can show

$$\Phi_1^-(E)=\kappa_1+O(E^2),\qquad \Phi_1^+(E)=\kappa_0E^2+O(E^4). \tag{2.3.32}$$



The formulas (2.3.30)-(2.3.32) show that for $a$ sufficiently small $\det D^+(E,a)$ and $\det D^-(E,a)$ have two simple roots $\pm\lambda(a)$ and $\pm\mu(a)$ respectively, $\lambda(a)=i\sqrt a\,\lambda'(a)$, $\mu(a)=i\sqrt a\,\mu'(a)$, where $\lambda'(a)$, $\mu'(a)$ are smooth real functions,

$$\lambda'(a)=2+O(a),\qquad \mu'(a)=1+O(a).$$
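The leading values $\lambda'(0)=2$ and $\mu'(0)=1$ can be read off by truncating the series $\Phi^\pm$ after the first two terms. The sketch below uses $\Phi_0^+=\det\hat D_0^+=\frac{\kappa_0}{4}E^4+O(E^6)$, $\Phi_0^-=\det\hat D_0^-=\kappa_1E^2+O(E^4)$ and (2.3.32), and drops all higher-order corrections:

```latex
\begin{align*}
% Even sector:
\Phi^+(E,a) &\approx \tfrac{\kappa_0}{4}E^4 + a\,\kappa_0E^2
   = \tfrac{\kappa_0}{4}\,E^2\,(E^2+4a)
   \ \Longrightarrow\ E=0 \ (\text{double}),\quad E=\pm 2i\sqrt a,\\
% Odd sector:
\Phi^-(E,a) &\approx \kappa_1E^2 + a\,\kappa_1 = \kappa_1\,(E^2+a)
   \ \Longrightarrow\ E=\pm i\sqrt a .
\end{align*}
```

For $a>0$ both pairs of nonzero roots are purely imaginary, which matches the form $\lambda(a)=i\sqrt a\,(2+O(a))$, $\mu(a)=i\sqrt a\,(1+O(a))$.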

Since for $a$ sufficiently small the number of roots of $\det D^-$ ($\det D^+$), counted with their multiplicity, in some finite vicinity of the point $E=0$ is equal to two (four), there are no roots except for $\pm\mu$ (zero and $\pm\lambda$).

Let $\tilde\zeta_2(a)$ be an eigenfunction of $\tilde H(a)$ corresponding to the eigenvalue $\lambda(a)$. By (2.3.26), (2.3.28), $\tilde\zeta_2(a)$ can be normalized in such a way that

$$\langle\tilde\zeta_2,\vec\xi_0\rangle=\langle\tilde\zeta_0,\vec\xi_0\rangle-\lambda^2\langle\vec\xi_2,\vec\xi_0\rangle. \tag{2.3.33}$$

Then $\tilde\zeta_2=\tilde\zeta_0+O(h)$. A slightly more detailed consideration of the series $\mathcal{W}$, $\hat D^+$ allows us to get the following refinement of the above representation:

$$\tilde\zeta_2=\tilde\zeta_0-i\lambda\tilde\zeta_1-\lambda^2\vec\xi_2+i\lambda^3\vec\xi_3+\sum_{k\ge4}(i\lambda)^kh_k\binom{1}{(-1)^{k-1}}, \tag{2.3.34}$$

where $h_k$ are even smooth real exponentially decreasing functions of $x$, $(h_{2k},\varphi_0)=0$. This asymptotic expansion holds in the sense of the $L^\infty$-norm with the weight $e^{\frac{(1-\gamma)}{h}\tilde S_a(h|x|)}$, $\gamma>0$:

$$\Bigl|\tilde\zeta_2-\tilde\zeta_0+i\lambda\tilde\zeta_1+\lambda^2\vec\xi_2-i\lambda^3\vec\xi_3-\sum_{4\le k\le N}(i\lambda)^kh_k\binom{1}{(-1)^{k-1}}\Bigr|\le c|a|^{N+1}e^{-\frac{(1-\gamma)}{h}\tilde S_a(h|x|)}. \tag{2.3.35}$$

The results of this subsubsection imply in particular the following proposition.

Proposition 2.3.1 For $a$ sufficiently small, the discrete spectrum of the operator $\tilde H(a)$ (restricted to the subspace of even functions) in some finite vicinity of the point $E=0$ consists of $0$, the corresponding root subspace being described by (2.3.18), and two simple eigenvalues $\pm\lambda(a)$, $\lambda(a)=i\sqrt a\,\lambda'(a)$, where $\lambda'(a)$ is a smooth real function of $a$, $\lambda'(a)=2+O(a)$. The eigenfunction $\tilde\zeta_2(a)$ corresponding to the eigenvalue $\lambda(a)$, normalized by the condition (2.3.33), is a smooth function of $\sqrt a$, admitting the asymptotic expansion (2.3.34) as $a\to0$ in the sense (2.3.35).

2.4 Operator $H(a)$

In this subsection we establish the estimates related to the operator $H(a)$, $a>0$, that were announced and used in Section 1.



2.4.1 Standard solutions

Consider the equation

$$(H(a)-E)\psi=0. \tag{2.4.1}$$

We introduce a basis of solutions $f_j(x,E)$, $j=1,\dots,4$, of (2.4.1) with the following asymptotic behavior as $x\to+\infty$:

$$f_1(x,E)=v(x,\lambda_1)\Bigl[\binom10+O_{E,a}\bigl(e^{-\frac4hS_a(hx)}\bigr)\Bigr],\qquad
f_2(x,E)=v^*(x,\lambda_1)\Bigl[\binom10+O_{E,a}\bigl(e^{-\frac4hS_a(hx)}\bigr)\Bigr],$$

$$f_3(x,E)=v^*(x,\lambda_2)\Bigl[\binom01+O_{E,a}\bigl(e^{-\frac4hS_a(hx)}\bigr)\Bigr],\qquad
f_4(x,E)=v(x,\lambda_2)\Bigl[\binom01+O_{E,a}\bigl(e^{-\frac4hS_a(hx)}\bigr)\Bigr], \tag{2.4.2}$$

where $v^*(x,\lambda)=\overline{v(x,\bar\lambda)}$,

$$v(x,\lambda)=C_\nu e^{\frac{ihx^2}{4}}H_\nu\Bigl(e^{-\frac{i\pi}{4}}\Bigl(\frac h2\Bigr)^{1/2}x\Bigr),\qquad \nu=-\frac12+i\frac\lambda h,\qquad C_\nu=e^{\frac{i\nu\pi}{4}}(2h)^{-\frac\nu2},$$

$H_\nu$ being the Hermite function. The function $v$ is a holomorphic function of $\lambda\in\mathbb{C}$ satisfying the equation

$$-v_{xx}-\frac{ax^2}{4}v=\lambda v. \tag{2.4.3}$$

As $x\to+\infty$,

$$v=e^{\frac{ihx^2}{4}}x^\nu\bigl[1+O_\nu(\langle hx^2\rangle^{-1})\bigr].$$

The solutions $f_j$ can be characterized by appropriate integral equations. In particular, for $f_1$ one can write the following one:

$$f_1(x,E)=v(x,\lambda_1)\binom10-\int_x^\infty dy\,K(x,y,E)\sigma_3V(\tilde\varphi(y))f_1(y,E),$$

where

$$K(x,y,E)=\begin{pmatrix}k(x,y,\lambda_1)&0\\0&k(x,y,\lambda_2)\end{pmatrix},\qquad k(x,y,\lambda)=\frac{1}{w(v,v^*)}\bigl(v(x,\lambda)v^*(y,\lambda)-v(y,\lambda)v^*(x,\lambda)\bigr).$$

By standard arguments one gets from this equation the existence of a solution $f_1$ with the asymptotic behavior (2.4.2) as $x\to\infty$, $f_1$ being an entire function of $E$.
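The parameters in the definition of $v$ can be cross-checked against (2.4.3) using only the leading asymptotics. The sketch below assumes $h=\sqrt a$ (so that $ax^2/4=h^2x^2/4$), takes $\lambda$ real in the last line, and uses the standard value $H_\nu(0)=2^\nu\sqrt\pi/\Gamma\bigl(\frac{1-\nu}{2}\bigr)$ of the Hermite function:

```latex
\begin{align*}
% Leading behaviour  v \approx e^{ihx^2/4} x^\nu  as  x \to +\infty:
\frac{v'}{v} &= \frac{ihx}{2}+\frac{\nu}{x},\qquad
\frac{v''}{v} = \Bigl(\frac{ihx}{2}+\frac{\nu}{x}\Bigr)^2+\frac{ih}{2}-\frac{\nu}{x^2}
   = -\frac{h^2x^2}{4}+ih\bigl(\nu+\tfrac12\bigr)+O(x^{-2}),\\
-\frac{v''}{v}-\frac{h^2x^2}{4} &= -ih\bigl(\nu+\tfrac12\bigr)+O(x^{-2})
   = \lambda+O(x^{-2}) \qquad\text{for } \nu=-\tfrac12+\tfrac{i\lambda}{h},\\
% Value at the origin:
v(0,\lambda) &= C_\nu H_\nu(0)
   = e^{\frac{i\pi\nu}{4}}(2h)^{-\frac\nu2}\,\frac{2^\nu\sqrt\pi}{\Gamma\bigl(\frac{1-\nu}{2}\bigr)}
   = e^{\frac{i\pi\nu}{4}}\Bigl(\frac2h\Bigr)^{\frac\nu2}\frac{\sqrt\pi}{\Gamma\bigl(\frac{1-\nu}{2}\bigr)},\\
% Constant Wronskian from the leading terms (|x^\nu|^2 = x^{-1} for real \lambda):
w(v,v^*) &= v'v^*-v\,v^{*\prime} \approx 2i\,\operatorname{Im}\Bigl(\frac{v'}{v}\Bigr)|v|^2 \longrightarrow ih .
\end{align*}
```

The value $v(0,\lambda)$ is the quantity denoted $a(\lambda)$ later in this subsection; its zeros are the poles of $\Gamma\bigl(\frac{1-\nu}{2}\bigr)$, that is $\nu=1+2n$, $n=0,1,\dots$, or $\lambda=-ih(\frac32+2n)$.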



The solutions $f_j$, $j=1,\dots,4$, satisfy the relations:

$$f_2(x,E)=\overline{f_1(x,\bar E)},\qquad f_3(x,E)=\sigma_1\overline{f_1(x,-\bar E)},\qquad f_4(x,E)=\sigma_1f_1(x,-E),$$

$$w(f_1,f_2)=ih,\qquad w(f_{1,2},f_{3,4})=0,\qquad w(f_3,f_4)=-ih.$$

Let us introduce the solutions $g_j(x,E)$, $j=1,\dots,4$, with standard behavior at $-\infty$ by $g_j(x,E)=f_j(-x,E)$. Consider the matrix solutions $F_1=(f_1,f_3)$, $F_2=(f_2,f_4)$, $G_1=(g_1,g_3)$, $G_2=(g_2,g_4)$. One can express $F_1$ in terms of $G_j$, $j=1,2$:

$$F_1=G_2A+G_1B,$$

where $A=A(E)$, $B=B(E)$ are holomorphic functions of $E\in\mathbb{C}$. One can get the Wronskian representations for $A$ and $B$:

$$A=ih^{-1}\sigma_3W(G_1,F_1),\qquad B=-ih^{-1}\sigma_3W(G_2,F_1),$$

$A$ admitting a factorization into even and odd parts:

$$A=-2ih^{-1}\sigma_3A^-A^+,\qquad A^-=F_1^t(0,E),\quad A^+=F_{1x}(0,E).$$

The solutions $F_j$, $G_j$ satisfy the following orthogonality relations:

$$\int_{\mathbb{R}}dx\,F_1^t(x,E)\sigma_3G_1(x,E')=2\pi h\sigma_3A(E)\delta(E-E'),\qquad \int_{\mathbb{R}}dx\,F_2^t(x,E)\sigma_3G_1(x,E')=0. \tag{2.4.4}$$

2.4.2 Asymptotics of the standard solutions as $a\to0$

In this subsubsection we describe the asymptotic behavior of the solutions $f_j$ in the limit $a\to0$. We formulate the results and outline the proofs, omitting some technical details of the calculations. Consider $f_3$ on the set $D=\{E:\ \operatorname{Re}E\ge0,\ \operatorname{Im}E\ge-\delta_3h\}$, where $\delta_3$ is a small positive number. It is not difficult to check that on this set $f_3$ admits the following asymptotic representation.

Lemma 2.4.1 As $x\to\infty$,

$$f_3(x,E)=v^*(x,\lambda_2)\Bigl[e^{\mu x}f_3^0(x,k)+O\bigl(h(1+|E|)^{-1/2}e^{-\frac\gamma hS_a(hx)}\bigr)\Bigr],$$

$0<\gamma<4$, uniformly with respect to $h$ in some small vicinity of zero and $E\in D$. Here $\mu=\sqrt{E+1}$, $\operatorname{Re}\mu>0$, $k=\sqrt{E-1}$.



By way of explanation we remark that the assertions of Lemma 2.4.1 can be obtained by combining the standard WKB description of $v(x,\lambda)$ (see Appendix 4) and the following representation:

$$f_3(x,E)=v^*(x,\lambda_2)e^{\mu x}f_3^0(x,k)+f_3^1(x,E),\qquad f_3^1(x,E)=-\int_x^\infty dy\,K(x,y,E)\sigma_3\bigl[R+V(\tilde\varphi(y))f_3^1(y,E)\bigr],$$

where

$$R=(V(\tilde\varphi)-V(\varphi_0))v^*(x,\lambda_2)e^{\mu x}f_3^0-2\bigl(v^*(x,\lambda_2)e^{\mu x}\bigr)_x\sigma_3\bigl(f_{3x}^0+\mu f_3^0\bigr),$$

$$|R|\le ch\langle x\rangle^3e^{-\frac4hS_a(hx)}|v^*(x,\lambda_2)|,$$

uniformly with respect to $E\in D$, $x\in\mathbb{R}_+$, and $h$ sufficiently small.

To describe the behavior of $f_1$ we must single out three subsets of the set $D$: $D=D_{0,R}\cup D_{1,R}\cup D_{2,R}$,

$$D_{0,R}=\{E:\ |E-1|\ge Rh,\ \arg(1-E)\in(-\delta_4,\delta_4)\}\cap D,\qquad D_{1,R}=\{E:\ |E-1|\le Rh\}\cap D,\qquad D_{2,R}=D\setminus(D_{0,R}\cup D_{1,R}),$$

where $\delta_4$ is a small fixed number, $R>0$. Proceeding in the same manner as in Lemma 2.4.1 one can get the following result.

Lemma 2.4.2 The solution $f_1$ admits the following estimates:

(i) if $E\in D_{0,R}$ then

$$f_1(x,E)=v(x,\lambda_1)\Bigl[e^{-ikx}w_1^0(x,k)+O\Bigl(\frac{h}{|k|}e^{-\frac\gamma hS_a(hx)}\Bigr)\Bigr],$$

where $k=\sqrt{E-1}$, $\operatorname{Im}k>0$, provided $R$ is sufficiently large and $h$ is sufficiently small;

(ii) if $E\in D_{1,R}$ then

$$f_1(x,E)=v(x,\lambda_1)\Bigl[w_1^0(x,0)+O_R\bigl(h^{1/2}e^{-\frac\gamma hS_a(hx)}\bigr)\Bigr].$$

Here $\gamma$ is the same as in Lemma 2.4.1.

To describe the behavior of $f_1$ on the set $D_{2,R}$ we use the standard substitution reducing the order of the system (2.4.1):

$$f_1=z_0f_3+z_1\binom10. \tag{2.4.5}$$

Setting $z_2=z_0'f_{32}$, where $f_3=\binom{f_{31}}{f_{32}}$, we get

$$-z_1''-(E-1)z_1-\frac{ax^2}{4}z_1+V_{11}z_1+V_{12}z_2=0,\qquad
-z_2'-z_2\frac{v_x^*(x,\lambda_2)}{v^*(x,\lambda_2)}+V_{21}z_1+V_{22}z_2=0. \tag{2.4.6}$$

Here

$$V_{11}=V_1(\tilde\varphi)-V_2(\tilde\varphi)\frac{\chi_1}{\chi_2},\qquad V_{21}=V_2(\tilde\varphi),\qquad V_{12}=\frac{2}{\chi_2^2}(\chi_2'\chi_1-\chi_1'\chi_2),\qquad V_{22}=-\frac{\chi_2'}{\chi_2},$$

$\chi_j=\frac{f_{3j}}{v^*(x,\lambda_2)}$, $j=1,2$, $V_1$ and $V_2$ being the components of the potential $V$: $V=V_1\sigma_3+iV_2\sigma_2$.
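The coefficients $V_{ij}$ come out of a direct substitution. The following sketch assumes that (2.4.1) has the form $\bigl[(-\partial_x^2+1-\frac{ax^2}{4})\sigma_3+V\bigr]f=Ef$ with $V=V_1\sigma_3+iV_2\sigma_2$; the auxiliary matrix $U=\sigma_3V$ is introduced here only for illustration:

```latex
% Multiply (2.4.1) by \sigma_3:   -f'' + (1-\tfrac{a x^2}{4}) f + U f - E\sigma_3 f = 0,
%   U = \sigma_3 V = V_1 + V_2\sigma_1   (since \sigma_3\, i\sigma_2 = \sigma_1).
% Insert  f_1 = z_0 f_3 + z_1 (1,0)^t,  use that f_3 is a solution, and put  z_2 = z_0' f_{32}.
\begin{align*}
\text{2nd component:}\quad
 & -z_0''f_{32}-2z_0'f_{32}'+V_2z_1=0
   \ \Longleftrightarrow\ -z_2'-z_2\,\frac{f_{32}'}{f_{32}}+V_2z_1=0,\\
 & \frac{f_{32}'}{f_{32}}=\frac{v_x^*}{v^*}+\frac{\chi_2'}{\chi_2}
   \qquad (f_{32}=v^*\chi_2),\\
\text{1st component:}\quad
 & -z_1''-(E-1)z_1-\tfrac{a x^2}{4}z_1+V_1z_1-z_0''f_{31}-2z_0'f_{31}'=0,\\
 & -z_0''f_{31}-2z_0'f_{31}'
   =-\frac{\chi_1}{\chi_2}\,V_2z_1+\frac{2}{\chi_2^2}\bigl(\chi_2'\chi_1-\chi_1'\chi_2\bigr)z_2 .
\end{align*}
```

The last identity follows by eliminating $z_0''$ with the help of the second component and writing $f_{3j}=v^*\chi_j$. Collecting terms reproduces $V_{11}$, $V_{12}$, $V_{21}$, $V_{22}$ as listed.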

By Lemma 2.4.1, this system has smooth coefficients for $x\ge M$ ($M$ sufficiently large) which are holomorphic functions of $E\in D$.

Let $\vec z^{\,0}=\binom{z_1^0}{z_2^0}$ be the most rapidly decreasing solution of the unperturbed system

$$-z_1''-k^2z_1+V_{11}^0z_1+V_{12}^0z_2=0,\qquad -z_2'+\mu z_2+V_{21}^0z_1+V_{22}^0z_2=0,$$

where

$$V_{11}^0=V_1(\varphi_0)-V_2(\varphi_0)\frac{\chi_1^0}{\chi_2^0},\qquad V_{21}^0=V_2(\varphi_0),\qquad V_{12}^0=\frac{2}{(\chi_2^0)^2}\bigl((\chi_2^0)'\chi_1^0-(\chi_1^0)'\chi_2^0\bigr),\qquad V_{22}^0=-\frac{(\chi_2^0)'}{\chi_2^0},$$

$\chi^0=\binom{\chi_1^0}{\chi_2^0}$ being defined by $\chi^0=e^{\mu x}f_3^0(x,k)$. The solution $\vec z^{\,0}$ can be characterized by the following integral equation:

$$\vec z^{\,0}=e^{ikx}\binom10-\int_x^\infty dy\begin{pmatrix}\frac{\sin k(x-y)}{k}&0\\0&e^{-\mu(y-x)}\end{pmatrix}\mathcal{V}_0\vec z^{\,0}(y),$$

where $\mathcal{V}_0=\begin{pmatrix}V_{11}^0&V_{12}^0\\V_{21}^0&V_{22}^0\end{pmatrix}$. If $k\in\Omega_1$ then for sufficiently large $x\ge M$ a solution $\vec z^{\,0}$ is defined that depends smoothly on $x$, holomorphically on $k$, and admits the asymptotic representation

$$\vec z^{\,0}=e^{ikx}\Bigl[\binom10+O\bigl((1+|k|)^{-1}e^{-4x}\bigr)\Bigr].$$

It is worth mentioning that the function $z_0^0f_3^0+z_1^0\binom10$, where $z_0^0=\int_M^xdy\,\frac{z_2^0}{f_{32}^0}$, satisfies (2.1.1) and in fact coincides with the solution $f_1^0$.

Let us return to the complete system (2.4.6). Write $\vec z$ as the sum

$$\vec z=v(x,\lambda_1)e^{-ikx}\vec z^{\,0}(x,k)+\vec z^{\,1},$$



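where $k=\sqrt{E-1}$, the square root being defined on the complex plane with the cut along the negative semi-axis, $\operatorname{Re}k>0$. Then for $\vec z^{\,1}$ one can write down the following equation:

$$\vec z^{\,1}=-\int_x^\infty dy\begin{pmatrix}k(x,y,\lambda_1)&0\\0&t(x,y,\lambda_2)\end{pmatrix}\bigl[R'+\mathcal{V}\vec z^{\,1}(y)\bigr],$$

where

$$t(x,y,\lambda)=\frac{v^*(y,\lambda)}{v^*(x,\lambda)},\qquad \mathcal{V}=\begin{pmatrix}V_{11}&V_{12}\\V_{21}&V_{22}\end{pmatrix},$$

$$R'=(\mathcal{V}-\mathcal{V}_0)e^{-ikx}v(x,\lambda_1)\vec z^{\,0}(x,k)-\begin{pmatrix}2e^{ikx}\bigl(v(x,\lambda_1)e^{-ikx}\bigr)_x\bigl(e^{-ikx}z_1^0\bigr)_x\\[4pt] v(x,\lambda_1)e^{-ikx}z_2^0\Bigl(\dfrac{v_x(x,\lambda_1)}{v(x,\lambda_1)}-ik+\dfrac{v_x^*(x,\lambda_2)}{v^*(x,\lambda_2)}+\mu\Bigr)\end{pmatrix}.$$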

By Lemma 2.4.1, $R'$ admits the estimate

$$|R'|\le ch|k|^{-1}(1+|k|)e^{-\frac\gamma hS_a(hx)}|v|,$$

provided $x\ge M$, $\gamma<4$. Using standard arguments one checks that a solution $\vec z^{\,1}$ is defined, depends smoothly on $x$, $x\ge M$, depends holomorphically on $E\in D_{2,R}$, and admits the estimate

$$|\vec z^{\,1}|\le ch|k|^{-1}e^{-\frac\gamma hS_a(hx)}|v|.$$

Here $R$ is supposed again to be sufficiently large. Thus, $f_1$ admits a representation of the form (2.4.5), where

$$z_0=-\int_x^\infty dy\,\frac{z_2}{f_{32}},\qquad \binom{z_1}{z_2}=e^{-ikx}v\Bigl[\binom{z_1^0}{z_2^0}+O\Bigl(\frac{h}{|k|}e^{-\frac\gamma hS_a(hx)}\Bigr)\Bigr], \tag{2.4.7}$$

provided $x\ge M$.

As a direct consequence of Lemmas 2.4.1, 2.4.2 and (2.4.5), (2.4.7) one gets the following asymptotic representations of the matrices $A^\pm$. For $E\in D_{0,R}$:

$$A^-(E)=a(E)\Bigl[\hat D_0^-(k)+O\Bigl(\frac{h}{|k|}\Bigr)\Bigr],\qquad A^+(E)=\Bigl[\hat D_0^+(k)+O\Bigl(\frac{h}{|k|}\Bigr)\Bigr]a(E),\qquad \operatorname{Im}k>0, \tag{2.4.8}$$

where $a(E)=\begin{pmatrix}a(\lambda_1)&0\\0&a^*(\lambda_2)\end{pmatrix}$, $a(\lambda)=v(0,\lambda)$. Here $R$ is the same as in the first part of Lemma 2.4.2.



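For $a(\lambda)$ one can write down an explicit expression:
\[ a(\lambda) = e^{\frac{i\pi\nu}{4}} \Big(\frac{2}{h}\Big)^{\nu/2} \frac{\sqrt{\pi}}{\Gamma\big(\frac{1-\nu}{2}\big)}. \]
Thus, $a(\lambda)$ has no zeros except for the points $\lambda = -ih(\tfrac32 + 2n)$, $n = 0, 1, \dots$. For $E \in D_{1,R}$:
\[ A^-(E) = a(E)\big[ \hat D_0^-(0) + O_R(h^{1/2}) \big], \qquad A^+(E) = \big[ \hat D_0^+(0) + O_R(h^{1/2}) \big] a(E). \tag{2.4.9} \]
Here we made use of the obvious estimate
\[ \Big| \frac{v_x(0,\lambda_1)}{v(0,\lambda_1)} \Big| \le c h^{1/2}, \]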

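provided $E \in D_{1,R}$, $\delta_3 < 3/2$. It follows from lemma 2.4.1, (2.1.2), (2.4.5), (2.4.7) that
(i) as $|E| \to \infty$, $E \in D_{2,R}$,
\[ A^-(E) = \mathbf a_1^t(E)\big[ I + O(|E|^{-1/2}) \big], \qquad A^+(E) = \big[ I + O(|E|^{-1/2}) \big] (ikp - \mu q)\, \mathbf a_1(E), \tag{2.4.10} \]
$p = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $q = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$, uniformly with respect to $h$ sufficiently small;
(ii)
\[ A^-(E) = \mathbf a_2^t(E)\Big[ D_0^-(k) + O\Big(\frac{h}{|k|}\Big) \Big], \qquad A^+(E) = \Big[ D_0^+(k) + O\Big(\frac{h}{|k|}\Big) \Big] \mathbf a_2(E) \tag{2.4.11} \]
uniformly with respect to $E$ in any compact subset of $D_{2,R}$. Here $\operatorname{Re} k > 0$,
\[ \mathbf a_j = \begin{pmatrix} a(\lambda_1) & 0 \\ a_j(E) & a^*(\lambda_2) \end{pmatrix}, \qquad j = 1, 2, \]
$a_j$ being holomorphic functions of $E \in D_{2,R}$.

2.4.3 The point spectrum of H(a)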

Since $H$ satisfies (2.3.14) the spectrum is invariant under the transformations $E \to -E$, $E \to \bar E$. It follows from (2.4.2) that the eigenvalues of $H$ lie outside the continuous spectrum. In the upper half plane they are characterized by the equation $\det A = 0$, the zeros of $\det A^+$ ($\det A^-$) corresponding to the eigenvalues of $H$ restricted to the subspace of even (odd) functions.
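As a small worked consequence of the two symmetries just stated, one can spell out how eigenvalues are grouped. This is only a sketch: the notation $\sigma_p(H)$ for the set of eigenvalues and the real parameter $\varkappa$ are introduced here for illustration and are not used in the paper.

```latex
% Assumption: \sigma_p(H) denotes the set of eigenvalues of H (illustrative notation).
E_0 \in \sigma_p(H)
\;\Longrightarrow\;
\{\, E_0,\ -E_0,\ \bar E_0,\ -\bar E_0 \,\} \subset \sigma_p(H).
% If E_0 = i\varkappa with \varkappa real, then -\bar E_0 = E_0 and \bar E_0 = -E_0,
% so the four-point orbit collapses to the pair \{ i\varkappa,\ -i\varkappa \}.
```

This matches the way the purely imaginary eigenvalues are listed further on, as pairs of the form $\pm iE_j(a)$.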



The zeros of $\det A$ in the closed lower half plane $\{\operatorname{Im} E \le 0\}$ are called resonances. It follows directly from (2.4.8)-(2.4.11) and proposition 2.1.5 that for $h$ sufficiently small the number of zeros of $\det A$ in the half plane $\operatorname{Im} E \ge -\delta_3 h$ is finite and, if there are any, they belong to a small vicinity of the point $E = 0$. Moreover, one has the following proposition.

Proposition 2.4.3 For $a > 0$ sufficiently small, in the half plane $\operatorname{Im} E > -\delta_3 h$ ($\delta_3 > 0$ sufficiently small)
(i) $\det A^+$ has only three zeros: $iE_{1,2}(a)$, $iE_R(a)$. They are simple and purely imaginary, $E_{1,2} > 0$, $E_R < 0$, and admit the following asymptotic estimates as $a \to 0$:
\[ |iE_2(a) - \lambda(a)| = O(e^{-(2-\varepsilon)S_0/h}), \qquad E_1, E_R = O(e^{-(1-\varepsilon)S_0/h}), \qquad E_R + E_1 = O(a^{-3/2} e^{-2S_0/h}). \]
(ii) $\det A^-$ has only one zero, which is simple, purely imaginary and belongs to an $O(e^{-(2-\varepsilon)S_0/h})$ vicinity of $\mu(a)$.
Here $\lambda(a)$ ($\mu(a)$) is the corresponding eigenvalue of $\tilde H(a)$ restricted to the subspace of even (odd) functions:
\[ \lambda(a) = i\sqrt{a}\,(2 + O(a)), \qquad \mu(a) = i\sqrt{a}\,(1 + O(a)), \qquad a > 0. \]
Before starting the proof we mention the following obvious consequence of the above proposition: (i) the discrete spectrum of $H(a)$ restricted to the subspace of even functions consists of four simple purely imaginary eigenvalues $\pm iE_{1,2}(a)$; (ii) in the strip $\{E : -\delta_3 h < \operatorname{Im} E \le 0\}$ the operator $H(a)$ has only one simple resonance $iE_R(a)$.

Proof of proposition 2.4.3. For $E$ in some small vicinity of zero and for $h|x| \le 2 - \delta_0$ the solution $F_1$ of (2.4.1) can be expressed in terms of the solutions $\Psi_1$, $\Psi_2$, $\Psi_2 = (\psi_2, \psi_4)$ of (2.3.1):
\[ F_1 = \Psi_1 T_1 + \Psi_2 T_2, \tag{2.4.12} \]
\[ W(\Psi_2, F_1) = W(\Psi_2, \Psi_1) T_1, \]

W (Ψ1 , F1 ) = −W (Ψ2 , Ψ1 )T2 .

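It follows directly from lemmas 2.4.1, 2.4.2 that for $E \in \Upsilon = \{|E| \le \delta_4,\ \operatorname{Im} E > -h\delta_3\}$, $\delta_4 > 0$ sufficiently small,
\[ T_1(E) = \Big[ \mathbf t_1(E) + O\big(e^{-(4-\varepsilon-O(E))S_0/h}\big) \Big] a(E), \qquad T_2(E) = \Big[ \mathbf t_2(E) + O\big(e^{-(6-\varepsilon-O(E))S_0/h}\big) \Big] a(E), \tag{2.4.13} \]
where
\[ \mathbf t_i(E) = \begin{pmatrix} t_i(E) & 0 \\ 0 & \overline{t_i(-\bar E)} \end{pmatrix}, \]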



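\[ t_1(E) = \frac{w(u_2(\lambda_1), v(\lambda_1))}{w(u_2(\lambda_1), u_1(\lambda_1))} = (-\lambda_1)^{1/4} a(\lambda_1)(1 + O(h)), \]
\[ t_2(E) = -\frac{w(u_1(\lambda_1), v(\lambda_1))}{w(u_2(\lambda_1), u_1(\lambda_1))} = \frac{a}{4 w(u_2(\lambda_1), u_1(\lambda_1))} \int_0^\infty dx\, x^2 (1-\theta)\, v(\lambda_1) u_1(\lambda_1) = O\big(e^{-(2-\varepsilon-O(E))S_0/h}\big)\, a(\lambda_1). \tag{2.4.14} \]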

Here we used the WKB representation (2.3.4) of $u_1$ and a similar one of $v$ (see appendix 4). The above representations imply the equivalence between the equations $\det A^+ = 0$ and
\[ \Phi(E) = \det[D^+(E) + \Psi_{2x}(0,E) T_0(E)] = 0, \qquad T_0 = T_2 T_1^{-1}. \]
By (2.4.13), (2.4.14),
\[ T_0 = \mathbf t_0 + O\big(e^{-(6-\varepsilon-O(E))S_0/h}\big), \qquad \mathbf t_0(E) = \begin{pmatrix} t_0(E) & 0 \\ 0 & \overline{t_0(-\bar E)} \end{pmatrix}, \tag{2.4.15} \]
where $t_0(E) = t_2(E) t_1^{-1}(E)$. The zeros of $\det A^-$ are characterized by a similar equation, $D^+$ being replaced by $D^-$ and $\Psi_{2x}$ by $\Psi_2^t$. The asymptotic estimates (2.4.13)-(2.4.15) together with the analytic properties of $D^\pm$ imply directly that in $\Upsilon$
(i) $\det A^+(E)$ has only three zeros (counted with their multiplicity): one ($E_2$) is in an $O(e^{-\frac{2-\varepsilon}{h}S_0})$ vicinity of $\lambda(a)$, two others belong to an $O(e^{-\frac{1-\varepsilon}{h}S_0})$ vicinity of the point $E = 0$;
(ii) $\det A^-(E)$ has only one zero $E_3$, which belongs to an $O(e^{-\frac{2-\varepsilon}{h}S_0})$ vicinity of $\mu(a)$.
Since
\[ A^\pm(E) = \sigma_1 \overline{A^\pm(-\bar E)}\, \sigma_1, \tag{2.4.16} \]
$E_{2,3}$ are purely imaginary. Clearly, the zeros of $\det A^+$ that are exponentially close to the point $E = 0$ can be characterized (asymptotically) by the equation
\[ \det[D^+(E) + \Psi_{2x}(0,0)\, \mathbf t_0(0)] = 0. \]
Taking into account the structure of the root subspace of $\tilde H(a)$ corresponding to the zero eigenvalue one can rewrite (again asymptotically) the above equation as follows:
\[ \kappa E^2 + 2\gamma_2\gamma_3 \operatorname{Re} t_0(0) = 0, \tag{2.4.17} \]



where κ has been introduced in subsubsection 2.3.2 and γ2,3 are defined by the relations : 1 1 1 1 1 1 √ √ Ψ1x (0, 0) Ψ2x (0, 0) , γ3 . = = γ2 1 −1 1 −1 2 2 It follows from (2.4.13), (2.4.14) and the WKB representations of $u_i$, $i = 1, 2$, and $v$ that
\[ \operatorname{Re} t_0(0) \ge c \int_0^\infty dy\, (1 - \theta(hy))\, e^{-\frac{1}{h} S(hy)}, \tag{2.4.18} \]

with some positive constant c. Here ξ ds( 1 − s2 θ(s)/4 + (1 − s2 /4)+ ). S(ξ) = 0

By (2.3.7), (2.3.21), (2.3.23), (2.3.25), √ 2d0 (0)ϕ∞ γ2 = d(0, a)(γ1 + O(a)), γ3 = + O(h). ϕ0 (0)

(2.4.19)

The formulas (2.4.15), (2.4.17)-(2.4.19) imply the existence of two simple zeros iE1 , iER of det A+ , & & 2Re t0 γ2 γ3 2Re t0 (2−)S0 /h + O(e ϕ∞ (1 + O(h)). (2.4.20) )=± E1 , ER = ± κ ea By (2.4.16), they are purely imaginary. The expression E1 + ER can be calculated as follows. E1 + ER = i

Φ (0) + O(e(3−)S0 /h ). Φ (0)

By (2.3.18), (2.4.13)-(2.4.15), Φ (0) = 2κ(a) + O(e(2−)S0 /h ).

(2.4.21)

For Φ (0) the direct calculations give Φ (0) = −i

S0 γ2 γ3 −2S0 /h (1 + O(h)). e 2h

Combining (2.3.19), (2.3.20) and (2.4.19), (2.4.21), (2.4.22) one gets E1 + ER =

κ2 −2S0 /h e (1 + O(h)), h3

κ2 =

S0 ϕ2∞ . 4e

(2.4.22)



Let $\zeta_1(x,a)$ and $\zeta_2(x,a)$ be eigenfunctions corresponding to the eigenvalues $iE_1$ and $iE_2$ respectively. Let $\zeta_R(x,a)$ be a resonant function associated to the resonance $iE_R$:
\[ H\zeta_R = iE_R \zeta_R, \qquad \zeta_R \sim e^{\frac{ihx^2}{4}\sigma_3} |x|^{-\frac12 - \frac{E_R + i\sigma_3}{h}}\, \vec c, \]
as $|x| \to \infty$. Here $\vec c$ is a constant vector. Clearly $\zeta_j$, $j = 1, 2, R$, can be normalized by the conditions
\[ \langle \zeta_j, \tilde\zeta_0 \rangle = \langle \tilde\zeta_0, \tilde\zeta_0 \rangle, \qquad j = 1, 2, R. \]
The following lemma is an immediate consequence of (2.4.12)-(2.4.14), lemmas 2.4.1, 2.4.2 and (2.3.23)-(2.3.25).

Lemma 2.4.4 $\zeta_j$, $j = 1, 2, R$, admit the estimates
\[ |\zeta_j - \tilde\zeta_0 - E_j \tilde\zeta_1| \le c\, e^{-(2-\varepsilon)S_0/h}\, e^{\frac1h \int_0^{h|x|} ds \sqrt{(1-s^2/4)_+}}\, \langle x \rangle^{-\frac12 - \frac{E_j}{h}}, \qquad j = 1, R, \]
\[ |\zeta_2 - \tilde\zeta_2| \le c\, e^{-(2-\varepsilon)S_0/h}\, e^{\frac1h \int_0^{h|x|} ds \sqrt{(1-s^2/4)_+}}\, \langle x \rangle^{-\frac12 - \frac{E_2}{h}}, \]
where $\tilde\zeta_2 = \tilde\zeta_2(a)$ is the eigenfunction of $\tilde H(a)$ corresponding to the eigenvalue $\lambda(a)$, normalized by the condition $\langle \tilde\zeta_2, \tilde\zeta_0 \rangle = \langle \tilde\zeta_0, \tilde\zeta_0 \rangle$.

Let us mention that $\tilde\zeta_2(a)$ introduced here differs slightly from that of subsection 2.3. As a consequence of lemmas 2.4.1, 2.4.2, 2.4.4 and the representations (2.4.12), (2.4.13) one can get the estimates of the operators $P(a)$, $Q(a)$ announced in proposition 1.2.6.

2.4.4 The resolvent of H(a)

The resolvent $R(E) = (H - E \cdot I)^{-1}$, $\operatorname{Im} E > 0$, of $H$ is an integral operator with $2 \times 2$ matrix kernel
\[ G(x,y,E) = \begin{cases} F_1(x,E)\, D^{-1} G_1^t(y,E)\, \sigma_3, & y \le x, \\ G_1(x,E)\, (D^t)^{-1} F_1^t(y,E)\, \sigma_3, & x \le y, \end{cases} \]
where $D = W(G_1, F_1) = -ih\sigma_3 A$, the resolvent kernel in the lower half plane $\operatorname{Im} E < 0$ being given by $\overline{G(x,y,\bar E)}$. The kernel $G$ is a meromorphic function of $E$ on the complex plane and its poles in the upper (lower) half plane coincide with the zeros of $\det A$, i.e., with the eigenvalues (resonances) of $H$. It follows from the estimates (2.4.2) for the solutions $F_1$ and $G_1$ that for $\operatorname{Im} E > 0$ and away from the zeros of $A$ the kernel $G$ determines a bounded operator in $L_2$.
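The kernel above follows the usual pattern "solution decaying on the right, times inverse Wronskian, times solution decaying on the left". The following sketch illustrates that pattern on a scalar toy problem only: the operator $-d^2/dx^2 + \kappa^2$, the functions `green_kernel` and `apply_resolvent`, and the test function are all assumptions chosen for illustration, not objects from the paper.

```python
import math

def green_kernel(x, y, kappa):
    """Kernel of (-d^2/dx^2 + kappa^2)^{-1} on the real line (scalar toy model)."""
    f = lambda s: math.exp(-kappa * s)   # solution decaying at +infinity
    g = lambda s: math.exp(kappa * s)    # solution decaying at -infinity
    wronskian = 2.0 * kappa              # f*g' - f'*g, independent of x
    if y <= x:
        return f(x) * g(y) / wronskian
    return g(x) * f(y) / wronskian

def apply_resolvent(rhs, x, kappa, half_width=12.0, n=4000):
    """Trapezoid approximation of the integral of green_kernel(x, y) * rhs(y) dy."""
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        y = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * green_kernel(x, y, kappa) * rhs(y)
    return total * step

# Hypothetical test data: u(x) = exp(-x^2), so rhs = -u'' + kappa^2 u.
kappa = 1.5
u = lambda s: math.exp(-s * s)
rhs = lambda s: (2.0 - 4.0 * s * s + kappa ** 2) * math.exp(-s * s)
print(apply_resolvent(rhs, 0.3, kappa), u(0.3))
```

Applying the kernel to `rhs` should reproduce `u` up to quadrature error, which is the scalar counterpart of the statement that $G$ is the kernel of the resolvent.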



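The formula for the resolvent makes it easy to construct on the continuous spectrum a complete system of generalized eigenfunctions. Let $\mathcal F$, $\mathcal G$ be solutions of the scattering problem:
\[ \mathcal F = F_1 A^{-1}, \qquad \mathcal G = G_1 A^{-1}, \]
\[ \mathcal F(x,E) \sim e^{\frac{ihx^2}{4}\sigma_3}\, x^{-\frac12 + \frac{i}{h}(E - \sigma_3)} A^{-1}, \qquad x \to +\infty, \]
\[ \mathcal F(x,E) \sim e^{-\frac{ihx^2}{4}\sigma_3}\, |x|^{-\frac12 - \frac{i}{h}(E - \sigma_3)} + e^{\frac{ihx^2}{4}\sigma_3}\, |x|^{-\frac12 + \frac{i}{h}(E - \sigma_3)} B A^{-1}, \qquad x \to -\infty. \]
By proposition 2.4.3, $\mathcal F$, $\mathcal G$ are meromorphic functions in the strip $-h\delta_3 < \operatorname{Im} E < h\delta_3$ with the only poles at $iE_R$, $iE_2$, which are simple. The relations (2.4.4) imply the orthonormality of the scattering problem solutions:
\[ \frac{1}{2\pi h} \int_{\mathbb R} dx\, \mathcal F^*(x,E)\, \sigma_3\, \mathcal F(x,E') = \delta(E - E')\, \sigma_3, \]
\[ \frac{1}{2\pi h} \int_{\mathbb R} dx\, \mathcal G^*(x,E)\, \sigma_3\, \mathcal G(x,E') = \delta(E - E')\, \sigma_3, \tag{2.4.23} \]
\[ \frac{1}{2\pi h} \int_{\mathbb R} dx\, \mathcal F^*(x,E)\, \sigma_3\, \mathcal G(x,E') = 0. \]
It is easy to express the jump of the resolvent on the continuous spectrum in terms of the solutions $\mathcal F$, $\mathcal G$:
\[ \frac{1}{2\pi i} \big( G(x,y,E+i0) - G(x,y,E-i0) \big) = \frac{1}{2\pi h} \big[ \mathcal F(x,E)\, \sigma_3\, \mathcal F^*(y,E) + \mathcal G(x,E)\, \sigma_3\, \mathcal G^*(y,E) \big] \sigma_3. \]
Introduce the operators $\mathbf F, \mathbf G : L_2(\mathbb R \to \mathbb C^2) \to L_2(\mathbb R \to \mathbb C^2)$:
\[ (\mathbf F \Phi)(x) = \frac{1}{\sqrt{2\pi h}} \int_{\mathbb R} dE\, \mathcal F(x,E)\, \Phi(E), \qquad (\mathbf G \Phi)(x) = \frac{1}{\sqrt{2\pi h}} \int_{\mathbb R} dE\, \mathcal G(x,E)\, \Phi(E). \]
The action of the adjoint operators $\mathbf F^*$, $\mathbf G^*$ is given by
\[ (\mathbf F^* \psi)(E) = \frac{1}{\sqrt{2\pi h}} \int_{\mathbb R} dx\, \mathcal F^*(x,E)\, \psi(x), \qquad (\mathbf G^* \psi)(E) = \frac{1}{\sqrt{2\pi h}} \int_{\mathbb R} dx\, \mathcal G^*(x,E)\, \psi(x). \tag{2.4.24} \]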



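Proposition 2.4.5 $\mathbf F$ is a bounded operator. Moreover,
(i) for $e^{-\frac{ihx^2}{4}\sigma_3} f \in H^1$, $(\mathbf F^* f)(E)$ is a meromorphic function of $E$ in the strip $-b_0 h < \operatorname{Im} E \le 0$ with the only pole at $-iE_2$ and satisfies the estimate
\[ \| \mathbf F^* f \|_{L_2(\mathbb R - ibh)} \le c h^{-K_1} \big\| e^{-\frac{ihx^2}{4}\sigma_3} f \big\|_{H^1}, \qquad h^L \le b < b_0; \]
(ii) let us introduce the operators $\mathbf F_b$:
\[ (\mathbf F_b \Phi)(x) = \frac{1}{\sqrt{2\pi h}} \int_{\mathbb R} dE\, \mathcal F(x, E - ibh)\, \Phi(E). \]
For $h^L \le b < b_0$, they satisfy the inequality
\[ \| \langle x \rangle^{-\nu_2} \mathbf F_b \Phi \|_2 \le c h^{-K_2} \| \Phi \|_2, \qquad \nu_2 > 1/2, \]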

provided b0 is sufficiently small. Here Kj , j = 1, 2, depend on L but do not depend on a. The same is true for the operator G. Proof. This proposition is a direct consequence of the similar estimates related to 2 the unperturbed operator H 0 (a), H 0 (a) = (−∂x2 + 1 − ax4 )σ3 , lemmas 2.4.1 and 2.4.2, the representation (2.4.5), (2.4.7) and proposition 2.4.3. To illustrate the arguments used we prove here the estimates for F, the part (i) can be obtained in a similar manner. We start by remarking that in the free case (V = 0) the above proposition is an immediate consequence of the explicit factorization of the corresponding operators F0 , F∗0 in terms of the Fourier transform : 3 14 +i E−σ ∞ √ 2h 2 E−σ3 2 1 h i hx σ3 +i π σ3 4 4 e dρeiρ σ3 +i 2hxρσ3 ρ− 2 −i h . 2 0 (2.4.25) Here F0 is the solution of the scattering problem associated to the operator H 0 (a). This representation implies, in particular, the unitary property of F0 and the estimates x−ν2 F0b Φ2 ≤ chb/2 Φ2 , (2.4.26)

1 F0 (x, E, a) = √ π

F∗0 f L2 (R−ibh) ≤ ch−b/2 e−ih

x2 4

σ3

f H 1 ,

provided $0 \le b < \tfrac12$. To take into account the perturbation $V$ we use the representation
\[ \mathcal F = \mathcal F_0 + \mathcal F_1, \qquad \mathcal F_1 = -(H^0 - E)^{-1}_+ V \mathcal F. \]
Here $(H^0 - E)^{-1}_+$ stands for the meromorphic continuation of the resolvent $(H^0 - E)^{-1}$ from the upper half-plane into the lower half-plane.



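Using lemmas 2.4.1 and 2.4.2, the representation (2.4.5), (2.4.7) and proposition 2.4.3 it is not difficult to prove the estimates
\[ e^{-\gamma|x|} |\mathcal F(x,E,a)|,\ e^{-\gamma|x|} |\mathcal F_E(x,E,a)| \le c h^{-K} e^{-\gamma_1|x|} (1+|E|)^{-\frac14 + \frac{\operatorname{Im} E}{2h}}, \qquad h^L \le |\operatorname{Im} E| \le h\delta_3, \tag{2.4.27} \]
\[ e^{-\gamma|x|} |\mathcal F(x,E,a)| \le c(h)\, e^{-\gamma_1|x|} (1+|E|)^{-1/4}, \qquad E \in \mathbb R. \tag{2.4.28} \]
Here $\gamma > \gamma_1 > 0$, $K$ is a positive constant depending only on $L$. Combining (2.4.27) with the obvious estimates of the free operator:
\[ \| (H^0 - E)^{-1}_+ f \|_\infty \le c h^{-1/2} (1+|E|)^{-1/2} \| \langle x \rangle^{M} f \|_\infty, \qquad |\operatorname{Im} E| \le \frac{h}{2}, \tag{2.4.29} \]
where $M$ is a positive constant independent of $h$ and $\lambda$, one gets
\[ \Big\| \int_{\mathbb R} dE\, \mathcal F_1(x, E - ibh)\, \Phi(E) \Big\|_\infty \le c h^{-K-1/2} \| \Phi \|_2, \qquad h^{L-1} \le b \le \min(1/2, \delta_3), \tag{2.4.30} \]
\[ \Big\| \int_{\mathbb R} dE\, \mathcal F_1(x,E)\, \Phi(E) \Big\|_\infty \le c(h) \| \Phi \|_2. \tag{2.4.31} \]
The inequalities (2.4.26), (2.4.30) lead to the desired estimate for $\mathbf F_b$. To estimate the $L_2$-norm of the integral $\int_{\mathbb R} dE\, \mathcal F_1(x,E)\Phi(E)$ the following refinement of (2.3.29) is needed:
\[ \| (l(a) - \lambda - i0)^{-1} f \|_\infty \le c |\lambda|^{-1} \| \langle x \rangle^{M} f \|_\infty, \qquad \lambda \le -1, \tag{2.4.32} \]
\[ \Big| (l(a) - \lambda - i0)^{-1} f(x) + \frac{v(x,\lambda)}{2 v(0,\lambda) v_x(0,\lambda)} \int_{\mathbb R} dy\, v(-y,\lambda) f(y) \Big| \le c h^{-1/2} \langle \lambda \rangle^{-1/2} \langle x \rangle^{-\alpha} \| \langle y \rangle^{M} f \|_\infty, \tag{2.4.33} \]
$l(a) = -\partial_x^2 - \frac{a x^2}{4}$. In the second estimate $hx \ge 2(-\lambda)_+^{1/2}$, $\lambda \in \mathbb R$, $\alpha$ is arbitrary, $M$ depends on $\alpha$. By way of explanation we remark that these estimates as well as (2.4.29) can be obtained easily by combining the explicit representation of the resolvent $(l(a) - \lambda - i0)^{-1}$ in terms of $v(x,\lambda)$ with the corresponding properties of the Weber functions, see [B] and appendix 4. Since $\mathcal F(x,E) = \sigma_1 \mathcal F(x,-E) \sigma_1$, $E \in \mathbb R$, it is sufficient to consider the integral
\[ I = \int_0^\infty dE\, \mathcal F_1(x,E)\, \Phi(E). \]
By (2.4.31),
\[ \int_{h|x| \le 4} dx\, |I|^2 \le c(h) \| \Phi \|_2^2. \tag{2.4.34} \]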



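To estimate $I$ in the region $h|x| \ge 4$ we break it into two terms:
\[ I = I_1 + I_2, \qquad I_1 = pI, \qquad I_2 = qI. \]
Consider $I_1$. Using (2.4.28), (2.4.33), the boundedness of $\mathbf F_0$ and the obvious estimate (see appendix 4, (A4.1), (A4.4))
\[ |v(0,\lambda) v_x(0,\lambda)| \le c(h), \qquad \lambda \ge -1, \]
one gets immediately
\[ \int_{hx \ge 4} dx\, |I_1|^2 \le c(h) \| \Phi \|_2^2. \]
The same estimate is valid in the region $hx \le -4$. Thus,
\[ \int_{h|x| \ge 4} dx\, |I_1|^2 \le c(h) \| \Phi \|_2^2. \tag{2.4.35} \]
Consider $I_2$. We represent it as the sum $I_2 = I_{21} + I_{22}$,
\[ I_{21} = \int_0^{\frac{h^2x^2}{16} - 1} dE\, q\mathcal F_1(x,E)\Phi(E), \qquad I_{22} = \int_{\frac{h^2x^2}{16} - 1}^\infty dE\, q\mathcal F_1(x,E)\Phi(E). \]
By (2.4.28), (2.4.32),
\[ |q \mathcal F_1(x,E)| \le c(h) \langle E \rangle^{-5/4}, \qquad E \ge 0,\ x \in \mathbb R, \]
which allows us to get for $I_{22}$
\[ \int_{h|x| \ge 4} dx\, |I_{22}|^2 \le c(h) \| \Phi \|_2^2. \tag{2.4.36} \]
To estimate $I_{21}$ in the region $hx \ge 4$ we combine (2.4.33) with the following estimate of $v$ (see appendix 4, (A4.1)):
\[ \big| v(x,\lambda) - e^{\frac{ihx^2}{4}} x^{-1/2 + i\lambda/h} \big| \le c |\lambda|^2 h^{-3} x^{-5/2}, \]
$\lambda \le -1$, $hx \ge |\lambda|^{1/2}(2 + \delta)$, $\delta > 0$. As a result, one gets the representation
\[ q \mathcal F_1(x,E) = e^{-\frac{ihx^2}{4}} x^{-1/2 + i(E+1)/h} \mu(E) + R_2, \tag{2.4.37} \]
where
\[ \mu(E) = \int dy\, \frac{v(-y,\lambda_2)}{2 v(0,\lambda_2) v_x(0,\lambda_2)}\, q V \mathcal F(y,E), \]
and $R_2$ admits the estimate
\[ |R_2| \le c(h)\, x^{-5/2} \big[ (E+1)^{-3/4} + (E+1)^2 |\mu(E)| \big], \]
provided $hx \ge 4(E+1)^{1/2}$, $E \ge 0$.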



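The function $\mu$ can be estimated as follows:
\[ |\mu(E)| \le c(h)\, e^{-\gamma \frac{(E+1)^{1/2}}{h}}, \tag{2.4.38} \]
with some $\gamma > 0$. Here we have used (2.4.28) and the following estimate of $v$:
\[ e^{-\gamma|x|} \Big| \frac{v(x,\lambda)}{v(0,\lambda) v_x(0,\lambda)} \Big| \le c\, e^{-\gamma' \frac{|\lambda|^{1/2}}{h}}, \tag{2.4.39} \]
$-\lambda \ge \delta > 0$, where $\gamma'$ is a positive constant depending only on $\delta$ and $\gamma$. The estimate (2.4.39) is an immediate consequence of the WKB representations of $v$, see appendix 4, (A4.2)-(A4.4). It follows from (2.4.35), (2.4.36) that for $hx \ge 4$,
\[ I_{21} = e^{-\frac{ihx^2}{4}} x^{-1/2 + i/h} \int_0^\infty dE\, x^{iE/h} \mu(E) \Phi(E) + O_h\big( \| \Phi \|_2\, x^{-5/2} \big). \]
As a consequence,
\[ \int_{hx \ge 4} dx\, |I_{21}|^2 \le c(h) \| \Phi \|_2^2. \tag{2.4.40} \]
In a similar way one can obtain
\[ \int_{hx \le -4} dx\, |I_{21}|^2 \le c(h) \| \Phi \|_2^2. \tag{2.4.41} \]
Combining (2.4.34)-(2.4.36), (2.4.40), (2.4.41) one gets finally: $\| I \|_2 \le c(h) \| \Phi \|_2$, which implies the boundedness of the operator $\mathbf F$. Since
\[ \hat{\mathcal F}(z, E, a) = \mathcal F(h^{-1/2} z, hE, a)\, h^{-\frac14 - \frac{i}{2}(E - \hat E_0 \sigma_3)}, \tag{2.4.42} \]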

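proposition 2.4.4 implies immediately the corresponding inequalities of proposition 1.2.7. In order to prove the estimates for the derivative $\hat{\mathbf F}_a$ one can use the following representation
\[ (\hat{\mathbf F}_a^* \vec g)(E) = -\hat E_{0a} \frac{d}{dE} (\hat{\mathbf F}^* \sigma_3 \vec g)(E) + \frac{1}{\sqrt{2\pi}} \int_{\mathbb R} dy\, \mathcal F_2^*(y,E)\, \vec g(y), \]
where
\[ \mathcal F_2(E) = -(\hat H - E)^{-1} \Big[ \hat E_{0a} [\sigma_3, \hat W]\, \hat{\mathcal F}_E(E) + \hat W_a \hat{\mathcal F}(E) \Big], \qquad \operatorname{Im} E > 0. \]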



The desired inequalities then follow directly from proposition 2.4.4, the estimate (2.4.27) and (2.4.42). Introduce the operator $\mathbf E : L_2(\mathbb R \to \mathbb C^2) \times L_2(\mathbb R \to \mathbb C^2) \to L_2(\mathbb R \to \mathbb C^2)$:
\[ \mathbf E \vec\Phi = \mathbf F \Phi_1 + \mathbf G \Phi_2, \qquad \vec\Phi = (\Phi_1, \Phi_2). \]
In terms of $\mathbf E$ the orthonormality conditions (2.4.23) mean
\[ \mathbf E^* \sigma_3 \mathbf E \hat\sigma_3 = I. \]
The formula for the jump in the resolvent leads to a relation meaning that the scattering problem solutions form a complete system of eigenfunctions of the continuous spectrum of $H$:
\[ \mathbf E \hat\sigma_3 \mathbf E^* \sigma_3 = P^c, \]
where $\hat\sigma_3 = \begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix}$, $P^c$ being the spectral projection onto the subspace of the continuous spectrum. The operator $\mathbf E$ realizes a linear equivalence between the restriction of $H$ to the continuous spectrum and the multiplication by $E$:
\[ H P^c = \mathbf E\, E\, \hat\sigma_3 \mathbf E^* \sigma_3. \]
Moreover, for any bounded continuous function $\varphi$ we have
\[ \varphi(H) P^c = \mathbf E\, \varphi(E)\, \hat\sigma_3 \mathbf E^* \sigma_3. \]
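As a consistency check on the two operator identities above (orthonormality and completeness), one can verify algebraically that they make $P^c$ idempotent and make the functional calculus multiplicative. The computation below is a sketch that uses only those two identities, $\hat\sigma_3^2 = I$, and the assumption that multiplication by a scalar function of $E$ commutes with $\hat\sigma_3$; the second test function $\psi$ is introduced here for illustration.

```latex
% Idempotency of P^c from orthonormality  E^* \sigma_3 E \hat\sigma_3 = I :
(P^c)^2
 = \mathbf E \hat\sigma_3 \bigl( \mathbf E^* \sigma_3 \mathbf E \hat\sigma_3 \bigr) \mathbf E^* \sigma_3
 = \mathbf E \hat\sigma_3 \mathbf E^* \sigma_3
 = P^c .
% Multiplicativity: orthonormality gives  \mathbf E^* \sigma_3 \mathbf E = \hat\sigma_3  (since \hat\sigma_3^2 = I), hence
\bigl( \mathbf E \varphi(E) \hat\sigma_3 \mathbf E^* \sigma_3 \bigr)
\bigl( \mathbf E \psi(E) \hat\sigma_3 \mathbf E^* \sigma_3 \bigr)
 = \mathbf E \varphi(E) \hat\sigma_3 \hat\sigma_3 \psi(E) \hat\sigma_3 \mathbf E^* \sigma_3
 = \mathbf E (\varphi\psi)(E) \hat\sigma_3 \mathbf E^* \sigma_3 .
```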

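Appendix 1 Here we prove proposition 2.1.2. By (1.1.8) it suffices to consider the point $E = 1$. Let the equation $(L_0 - 1)\psi = 0$ have a bounded solution $\psi$, $\psi \notin L_2$. Then the same is true for the operator $T_0$: there exists $\psi_0$ such that
\[ T_0 \psi_0 = \psi_0, \qquad \psi_0 = C_\pm (1 + O(e^{\mp\gamma x})), \quad x \to \pm\infty, \tag{A1.1} \]
where $\gamma > 0$, $|C_-| + |C_+| > 0$. Obviously, $(\psi_0, \varphi_0) = 0$. One can take $\psi_0$ to be real and either odd or even. We normalize $\psi_0$ in such a way that $C_+ = 1$. Introduce a truncated resonant function $\psi_0^\varepsilon$:
\[ \psi_0^\varepsilon(x) = \Theta(\varepsilon x)\psi_0 + \mu(\varepsilon)\varphi_0, \qquad \mu(\varepsilon) = -\frac{(\Theta\psi_0, \varphi_0)}{\|\varphi_0\|_2^2}, \]
where $\varepsilon > 0$ is small, $\Theta$ is even, $\Theta \in C_0^\infty$, $\Theta(\xi) = 1$ in some vicinity of zero. Clearly, $(\psi_0^\varepsilon, \varphi_0) = 0$, $|\mu(\varepsilon)| \le c e^{-\gamma/\varepsilon}$, $\gamma > 0$.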


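Direct calculations show
\[ \|\psi_0^\varepsilon\|_2^2 = \varepsilon^{-1} M_0 + M_1 + O(e^{-\gamma/\varepsilon}), \qquad (L_{0+}\psi_0^\varepsilon, \psi_0^\varepsilon) = \varepsilon^{-1} M_0 + M_1 + M_2 + O(\varepsilon), \tag{A1.2} \]
\[ M_0 = \|\Theta\|_2^2, \qquad M_1 = \int_{\mathbb R} dx\, (|\psi_0|^2 - 1), \qquad M_2 = ((L_{0+} - 1)\psi_0, \psi_0). \]
As in the proof of proposition 2.1.2 we consider the quotient $\frac{(Au,u)}{(u,u)}$, $u \in F$, $F = \mathcal L\{\psi_0^\varepsilon, \eta_0, \xi_j, j = 0, 1\}$, $A = P L_{0+} P$. It is clear that $\dim F = 4$. It follows from (A1.2) that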

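\[ \frac{(Au,u)}{(u,u)} \le \Big( 1 + \varepsilon \frac{M_2}{M_0} + O(\varepsilon^2) \Big) \max_{x \in \mathbb C^3} \frac{|x_1|^2}{\langle (I+B)x, x \rangle_{\mathbb C^3}}, \]
where
\[ B = \begin{pmatrix} 0 & b_1 & b_2 \\ b_1 & 0 & 0 \\ b_2 & 0 & 0 \end{pmatrix}, \qquad b_j = \frac{(\psi_0^\varepsilon, e_j)}{\|\psi_0^\varepsilon\|_2}, \qquad e_j = \begin{cases} \eta_0 / \|\eta_0\|_2, & j = 1, \\ \xi_1 / \|\xi_1\|_2, & j = 2, \end{cases} \]
\[ b_j = \varepsilon^{1/2} \big( M_0^{-1/2} (\psi_0, e_j) + O(\varepsilon) \big), \qquad j = 1, 2. \]
It is easy to check that
\[ \max_{x \in \mathbb C^3} \frac{|x_1|^2}{\langle (I+B)x, x \rangle_{\mathbb C^3}} = \frac{1}{1 - b_j^2}, \qquad j = \begin{cases} 1 & \text{if } \psi_0 \text{ is odd}, \\ 2 & \text{if } \psi_0 \text{ is even}. \end{cases} \]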
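The maximum just quoted can be checked by hand: for a positive definite $M = I + B$ the quotient $|x_1|^2 / \langle Mx, x \rangle$ is maximized at $x \propto M^{-1} e_1$, with value $(M^{-1})_{11} = 1/(1 - b_1^2 - b_2^2)$, which reduces to $1/(1 - b_j^2)$ when the other coefficient vanishes. The sketch below verifies this numerically for real vectors, which suffices because $B$ is real symmetric. The function names and the sample values of $b_1$, $b_2$ are hypothetical, and the vanishing of one coefficient by parity is an assumption read off from the case distinction in the text.

```python
import random

def quotient(x, b1, b2):
    """|x1|^2 / <(I+B)x, x> for a real vector x and B = [[0,b1,b2],[b1,0,0],[b2,0,0]]."""
    x1, x2, x3 = x
    quad = x1 * x1 + x2 * x2 + x3 * x3 + 2.0 * b1 * x1 * x2 + 2.0 * b2 * x1 * x3
    return x1 * x1 / quad

def closed_form(b1, b2):
    """(1,1) entry of (I+B)^{-1}, obtained from the Schur complement."""
    return 1.0 / (1.0 - b1 * b1 - b2 * b2)

def maximiser(b1, b2):
    """A vector proportional to (I+B)^{-1} e_1, where the maximum is attained."""
    return (1.0, -b1, -b2)

# Hypothetical sample: psi_0 odd, so only b_1 is assumed non-zero.
b1, b2 = 0.3, 0.0
rng = random.Random(0)
samples = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(2000)]
best_sampled = max(quotient(x, b1, b2) for x in samples)
print(best_sampled, quotient(maximiser(b1, b2), b1, b2), closed_form(b1, b2))
```

Random sampling never exceeds the closed form, while the explicit maximizer attains it.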

Thus,
\[ \frac{(Au,u)}{(u,u)} \le 1 + \varepsilon \frac{\kappa_j}{M_0} + O(\varepsilon^2), \qquad \kappa_j = M_2 + (\psi_0, e_j)^2. \]
Consider $\kappa_j$. Clearly, $\kappa_j = (f, \psi_0) + (f, e_j)^2 \le (f, \psi_0 + f)$, where $f = (P L_{0+} - 1)\psi_0$, $f$ is a real smooth function decreasing exponentially as $|x| \to \infty$, $(f, \varphi_0) = 0$. By (A1.1), $(f, \psi_0 + f) = -((L_{0-} - 1)^{-1} f, f)$. Since $L_{0-}$ has no resonances at the end point $E = 1$ of the continuous spectrum, the expression $((L_{0-} - 1)^{-1} f, f)$ is well defined, and it is positive since $(f, \varphi_0) = 0$. Thus, $\kappa_j < 0$, $j = 1, 2$. This means that for $\varepsilon$ sufficiently small
\[ \frac{(Au,u)}{(u,u)} < 1, \]
provided $u \in F$, which contradicts the fact that the number of eigenvalues of $A$, counted with their multiplicity, is equal to three.



Appendix 2 Here we prove proposition 1.2.6. Using the obvious estimate
\[ |(l(a) + 1 - i0)^{-1}(x,y)| \le c h^{-1/3} e^{-\frac1h |S(hx) - S(hy)|}, \]
and the inequality (ii) of proposition 1.2.1, one gets immediately
\[ \| f^0(a) \|_\infty \le c e^{-(1-\varepsilon)\frac{S_0}{h}}, \qquad \| \tilde\varphi(a) f^0(a) \|_\infty \le c e^{-(2-\varepsilon)\frac{S_0}{h}}. \]
By (1.2.9), the expression $G_0(a)$ can be estimated as follows:
\[ |G_0(a)| \le c\, a \sum_{j=0,\dots,3} \int dx\, x^2 (1 - \theta(hx)) |\vec e_j| (\tilde\varphi + |f^0(a)|) \le c e^{-(2-\varepsilon)\frac{S_0}{h}}. \]

Here we also made use of propositions 1.2.1, 1.2.2. Consider G30 : ' 2 ( az 1 1 3 0 ' (θ − 1)f , ϕ˜ = G0 = 2(ϕã , ϕ) ˜ 4 −1 R 1 1 az 2 0 ˜ =− dyf 0 · l(a)f 0 = Im ( (θ − 1)f , ϕ) lim Im (ϕã , ϕ) ˜ 4 (ϕã , ϕ) ˜ R→+∞ R

2 h lim Im f¯0 (R)f 0 (R) = − |κ|2 , (A2.1) (ϕã , ϕ) ˜ R→+∞ (ϕã , ϕ) ˜ where κ can be characterized by the asymptotic representation f0 = e

ihz 2 4

|z|−1/2−i/h (κ + o(1)),

z → ∞.

It is not difficult to check that ay 2 1 1 dyψ− (y) dyψ− (y)ϕ˜5 (y). (1 − θ(hy))ϕ(y) ˜ = κ= w(ψ− , ψ+ ) 4 w(ψ− , ψ+ ) R

R

Here $\psi_\pm$ is a solution of the equation $(l(a) + 1)\psi = 0$, characterized by the following behavior at $\pm\infty$:
\[ \psi_\pm = e^{\frac{ihz^2}{4}} |z|^{-1/2 - i/h} (1 + o(1)), \qquad z \to \pm\infty. \]

Using the standard WKB descriptions of ψ± , see appendix 4, and proposition 1.2.1 one can easily check that as a → 0, Γ−1 h|κ|2 admits an asymptotic expansion in powers of a : 2 2S 1 2 −1 − h0 n y 5 dye ϕ0 (y) = 2ϕ2∞ . kn a , k0 = (A2.2) |κ| = h e 2 n≥0



Combining (A2.1) and (A2.2) one gets the following asymptotic (as $a \to +0$) representation of $G_0^3$:
\[ G_0^3 = e^{-\frac{2S_0}{h}} \sum_{k \ge 0} G^3_{0k} a^k, \qquad G^3_{00} = -\frac{k_0}{(\varphi_0, \varphi_1)} = -\frac{2\varphi_\infty^2}{e} < 0. \]

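This asymptotic expansion can be differentiated any number of times with respect to $a$. To estimate the Fourier transform $\hat{\tilde f}^0$ of $\tilde f^0 = e^{-\frac{ihz^2}{4}} f^0$ we use the representation:
\[ \hat{\tilde f}^0(p) = -\frac{i}{h} \int_{|p|}^\infty ds\, e^{\frac{i}{2h}(p^2 - s^2)} \Big( \frac{p}{s} \Big)^{i/h} \frac{\hat{\tilde F}^0(s)}{|p|^{1/2} |s|^{1/2}}, \tag{A2.3} \]
where $\hat{\tilde F}^0$ is the Fourier transform of $e^{-\frac{ihz^2}{4}} F_0$.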

This representation gives immediately S0 ihz 2 f˜ˆ0 1 ≤ ch−1 F˜ˆ0 1 ≤ ch−1 e− 4 F0 H 1 ≤ ce−(1−) h .

+ 12 )f˜0 = − hi [(p2 +1)f˜ˆ0 + Consider (z∂z + 12 )f˜0 . Using the representation (z∂z

F˜ˆ0 ], and taking into account (A2.3) one gets

1 2 (z∂z + )f˜0 1 ≤ ch−2 p F˜ˆ0 1 2 ≤ ch−2 e−

ihz 2 4

F0 H 3 ≤ ce−(1−)

S0 h

.

˜0 At last, the expression ∂

h f can be estimated as follows. " # ˆ + (p∂ + 1 )f˜ˆ0 ≤ ce−(1−) Sh0 . 0 ≤ ch−1 ∂ F ˜ ˜ ∂

f h 1 h 0 1 p 1 2

Appendix 3 Here we prove the inequalities (1.3.3). We start by estimating $s_0$. Write $h$ as the sum $h = h_0 + h_1$. Then $h_1$ admits the representation
\[ h_1(\tau) = \int_0^\tau ds\, e^{\int_s^\tau du\, \Lambda_0(h_0(u))}\, \Lambda_1(s), \]
where
\[ \Lambda_0(h) = \frac12 \frac{d}{dh} \big( h^{-1} G_0^3(h) \big), \]



1 3 1 3 1 3 G (h) − G . G (h0 ) − Λ0 h1 + 2h 0 2h0 0 2h R Taking into account proposition 1.2.4 one can estimate Λj as follows. Λ1 =

Λ0 (τ ) ≤ −ch0 (τ )h−2 0 , c > 0, − |Λ1 (τ )| ≤ W (M, s)[Ψ1 (M )h−1 0 (τ )(e

3κ3 2

τ 0

dsh0 (s)

+ e−

3r4 S0 2 h0 (τ )

)

−2S0 /h0 (τ ) +s2 h−1 ], 0 (τ )e

which implies the inequality |h1 | ≤ W (M, s) Ψ1 (M )(I1 + I2 ) + s2 I3 .

Here

τ

−1

dsec(h0

I1 =

(s)−h−1 0 (τ )) h−1 (s)e− 0

3κ3 2

s 0

duh0 (u)

≤ ch20 (τ )β0−4 ,

0

τ

−1

dsec(h0

I2 =

− (s)−h−1 0 (τ )) h−1 (s)e 0

3r4 S0 2 h0 (s)

≤ ch20 (τ )e−γ/β0 ,

0

with some γ > 0,

τ

−1

dsec(h0

I3 =

S

0 −2 h (s) (s)−h−1 0 (τ )) h−1 (s)e 0 0

≤ ch20 (τ ).

0

Combining these inequalities one gets

s0 ≤ W (M, s) s20 + β0−4 Ψ1 (M ) . Consider s1 . Set β2 = h−β. For β2 one can write down the following equation τ τ β2 = dse−2 s duh(u) Λ3 (s), 0

1 η3 . 2h Taking into account (1.3.1) one can estimate Λ3 as follows S0 |Λ3 | ≤ W (M, s) s21 h40 p(τ ; κ1 , r1 ) + e−(2−) h0 + h−1 0 Ψ0 (M )p(τ ; 2κ3 , 2r3 ) . Λ3 = β22 + 2βη1 − η2 +

As a consequence, one obtains the following estimate of β2 : γ |β2 (τ )| ≤ W (M, s) β0 s21 + e− β0 + β0−4 Ψ0 (M ) h20 (τ )p(τ ; κ1 , r1 ). Here we made use of the obvious estimates τ τ −1 −γ/h0 (s) dse− s duh0 (u) hM ≤ chM (τ )e−γ/h0 (τ ) , 0 (s)e 0 0


τ

dse−

τ s

duh0 (u) M h0 (s)e−α

s 0

duh0 (u)


−1 ≤ chM (τ )e−α 0

τ 0

duh0 (u)

,

0

provided α < 1. Consider β3 = β − r−2 . It satisfies the equation β3τ = 2ββ3 + Λ4 , Λ4 = 2β32 − 2β3 η1 + η2 + a − β 2 . By (1.3.1), S0

|Λ4 | ≤ W (M, s)[s22 h40 p(τ, κ2 , r2 ) + e−(2−) h0 +Ψ0 (M )p(τ, 2κ3 , 2r3 ) + s1 h30 p(τ, κ1 , r1 )]. Since

|β3 | ≤

τ1

3

dse

τ

duh0 (u)

s

|Λ4 (s)|,

τ

one finally gets γ ˆ , sˆ) sˆ1 + β0 s22 + e− β0 + β −3 Ψ0 (M ˆ) . s2 ≤ W (M 0
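The integral representation of $h_1$ at the start of Appendix 3 has the form of the variation-of-constants solution of a scalar linear equation $y' = a(\tau) y + f(\tau)$, $y(0) = 0$, with $a$ playing the role of $\Lambda_0(h_0(\tau))$ and $f$ that of $\Lambda_1$. The sketch below checks this equivalence numerically on made-up coefficients; the functions `duhamel` and `rk4` and the choices of `a` and `f` are assumptions for illustration, not quantities from the paper.

```python
import math

def duhamel(a, f, tau, n=2000):
    """y(tau) = int_0^tau exp(int_s^tau a(u) du) f(s) ds, via the trapezoid rule."""
    step = tau / n
    grid = [i * step for i in range(n + 1)]
    # Cumulative integral A(t) = int_0^t a(u) du on the grid.
    A = [0.0]
    for i in range(n):
        A.append(A[-1] + 0.5 * step * (a(grid[i]) + a(grid[i + 1])))
    vals = [math.exp(A[-1] - A[i]) * f(grid[i]) for i in range(n + 1)]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def rk4(a, f, tau, n=2000):
    """Integrate y' = a(t) y + f(t), y(0) = 0, with the classical Runge-Kutta scheme."""
    step = tau / n
    slope = lambda t, y: a(t) * y + f(t)
    y, t = 0.0, 0.0
    for _ in range(n):
        k1 = slope(t, y)
        k2 = slope(t + step / 2, y + step * k1 / 2)
        k3 = slope(t + step / 2, y + step * k2 / 2)
        k4 = slope(t + step, y + step * k3)
        y += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += step
    return y

# Hypothetical coefficients: a damping term a < 0 and a decaying source f.
a = lambda t: -1.0 / (1.0 + t)
f = lambda t: math.exp(-t)
tau = 3.0
exact = (2.0 - (2.0 + tau) * math.exp(-tau)) / (1.0 + tau)
print(duhamel(a, f, tau), rk4(a, f, tau), exact)
```

With a negative coefficient `a`, the exponential weight damps contributions of the source from early times, which is the mechanism behind the bounds on $I_1$, $I_2$, $I_3$.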

Appendix 4 In this appendix we collect some results related to the behavior of the function $v(x,\lambda)$ in the limit $\frac{h}{|\lambda|} \to 0$, which corresponds to the semi-classical regime for the equation (2.4.3). The necessary results can be obtained by the WKB method (see, e.g., [F]). Since the subject is so well known we just formulate them. For $\arg\lambda \in [0, \pi - \delta]$, where $\delta$ is a small positive number, the asymptotics of $v$ as $\varepsilon \equiv \frac{h}{|\lambda|} \to 0$ is given by the standard WKB formula (uniformly with respect to $x \in \mathbb R$):
\[ v(x,\lambda) = C_0(\lambda,h)\, e^{\frac{i}{\varepsilon}\Omega_0(y,\omega)} (\omega + y^2/4)^{-1/4} \Big[ 1 + O\Big( \frac{\varepsilon}{1 + (y)_+^2} \Big) \Big]. \tag{A4.1} \]
Here $\omega = \frac{\lambda}{|\lambda|}$, $y = \frac{hx}{|\lambda|^{1/2}}$, $C_0(\lambda,h) = \frac{1}{\sqrt2} \Big( \frac{h}{|\lambda|^{1/2}} \Big)^{-\nu}$,
\[ \Omega_0(y,\omega) = y^2/4 + \omega \ln y - \int_y^\infty ds \Big( \sqrt{\omega + s^2/4} - s/2 - \omega/s \Big). \]
The roots are defined on the complex plane with the cut along the negative semi-axis. They are positive for the positive values of the argument. A similar representation (with the appropriate change of the signs in the phases) is valid for $v^*$ on the semi-bounded intervals $y \ge \mathrm{const}$ provided $\frac{\operatorname{Im}\lambda}{h}$ is sufficiently small. Consider the case $\arg\lambda \in (\pi - \delta, \pi]$. For $y \ge \operatorname{Re} y_1 + \delta'$, $y_1 = 2\sqrt{-\omega}$, $\delta' > 0$ fixed, (A4.1) is still valid. To describe the behavior of the solutions on a finite



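vicinity of the turning point $y_1$ one can use so-called Olver type asymptotic representations, see [F]. Let $b$ be an interval of the form $b = (-\operatorname{Re} y_1 + \delta', +\infty)$. For $y \in b$ the function $v$ has the following asymptotic behavior as $\varepsilon \to 0$:
\[ v(x,\lambda) = C_1(\lambda,h) \Big[ \varepsilon^{-1/6} A(y,\varepsilon)\, w_1(-\varepsilon^{-2/3}\zeta(y)) + \varepsilon^{1/6} B(y,\varepsilon)\, w_1'(-\varepsilon^{-2/3}\zeta(y)) \Big]. \tag{A4.2} \]
Here $C_1(\lambda,h) = C_0(\lambda,h)\, e^{i\frac{\lambda}{2h}(\ln(-\omega) + S_1)}$, $S_1 = \int_2^\infty ds \big( \sqrt{s^2 - 4} - s + 2/s \big) - 2 + 2\ln 2$, and $w_1(z)$ is the solution of the Airy equation $w_1'' - z w_1 = 0$ with the following asymptotic behavior as $z \to -\infty$:
\[ w_1(z) = e^{i\frac23 (-z)^{3/2}} (-z)^{-1/4} \big[ 1 + O((-z)^{-3/2}) \big]. \]
As $z \to +\infty$,
\[ w_1(z) = e^{\frac23 z^{3/2} - i\pi/4} z^{-1/4} \big[ 1 + O(z^{-3/2}) \big]. \]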
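For orientation, the two asymptotic formulas for $w_1$ can be compared with the standard Airy functions. The identification below is not stated in the paper; it is a consistency check based on the classical large-argument asymptotics of $\mathrm{Ai}$ and $\mathrm{Bi}$, and the symbol $\xi$ is introduced here only for this purpose.

```latex
% Classical leading-order asymptotics, with \xi = \tfrac23 |z|^{3/2}:
%   z \to -\infty:  Ai(z) \sim \pi^{-1/2} (-z)^{-1/4} \sin(\xi + \pi/4),
%                   Bi(z) \sim \pi^{-1/2} (-z)^{-1/4} \cos(\xi + \pi/4);
%   z \to +\infty:  Bi(z) \sim \pi^{-1/2} z^{-1/4} e^{\xi},  Ai(z) exponentially small.
% Hence the combination below reproduces both stated asymptotics of w_1:
w_1(z) = \sqrt{\pi}\, e^{-i\pi/4} \bigl( \mathrm{Bi}(z) + i\, \mathrm{Ai}(z) \bigr),
\qquad
w_1(z) \sim
\begin{cases}
e^{\, i \frac23 (-z)^{3/2}} (-z)^{-1/4}, & z \to -\infty, \\[2pt]
e^{\frac23 z^{3/2} - i\pi/4}\, z^{-1/4}, & z \to +\infty .
\end{cases}
```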

The new slow variable $\zeta(y)$ is given by
\[ \zeta(y) = \Big( \frac32 \int_{y_1}^y \sqrt{\omega + s^2/4}\, ds \Big)^{2/3}. \]
$\zeta(y)$ is a holomorphic function of $y$ in some finite vicinity of $y_1$ and it is real for real $\omega$ and $y$. As $y \to y_1$, $\zeta(y) \sim (-\omega)^{1/6}(y - y_1)$. Note that $\zeta(y)$ is a solution of the equation
\[ (\zeta')^2 \zeta = \omega + \frac{y^2}{4}. \]
At last,
\[ A = (\zeta_y)^{-1/2}(1 + O(\varepsilon)), \qquad B = O(\varepsilon \langle y \rangle^{-5/6}), \tag{A4.3} \]
uniformly with respect to $y \in b$. The solution $v^*$ admits a similar representation (with $w_1$ replaced by $w_2 = w_1^*$). In the limit $\varepsilon^{-2/3}(y - \operatorname{Re} y_1) \to +\infty$ the representation (A4.2), (A4.3) takes the simpler form (A4.1). When $\varepsilon^{-2/3}(y - \operatorname{Re} y_1) \to -\infty$, (A4.2), (A4.3) can again be simplified and one gets the standard WKB formula (now with a real phase for $\lambda \in \mathbb R$):
\[ v(x,\lambda) = C_2(\lambda,h)\, e^{-\frac{1}{\varepsilon}\Omega_1(y,\omega)} (-\omega - y^2/4)^{-1/4} \big[ 1 + O(\varepsilon) \big], \tag{A4.4} \]
uniformly with respect to $y$, $|y| \le \operatorname{Re} y_1 - \delta'$. Here $C_2 = C_1 e^{-\frac{i\pi}{4} - \frac{\lambda}{h} S_0}$, $\Omega_1(y,\omega) = \int_0^y ds \sqrt{-\omega - s^2/4}$. The solution $v^*$ admits a similar description.
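The differential equation for $\zeta$ quoted in this appendix follows from its integral definition in one line. The short derivation below is a sketch that assumes the definition of $\zeta$ as the $2/3$ power of $\frac32 \int_{y_1}^{y} \sqrt{\omega + s^2/4}\, ds$, as read from the text above.

```latex
\zeta(y)^{3/2} = \frac32 \int_{y_1}^{y} \sqrt{\omega + s^2/4}\; ds
\;\Longrightarrow\;
\frac32\, \zeta^{1/2}\, \zeta' = \frac32 \sqrt{\omega + y^2/4}
\;\Longrightarrow\;
(\zeta')^2\, \zeta = \omega + \frac{y^2}{4} .
% With z = -\varepsilon^{-2/3}\zeta, this is the change of variable that turns the
% WKB phase \varepsilon^{-1} \int \sqrt{\omega + s^2/4}\, ds into the Airy phase \tfrac23 (-z)^{3/2}.
```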



Appendix 5 Here we outline the proof of the estimate (1.3.11) for f'0 , 2 −2 ˜0, f'0 = −a1/2 (I − P˜ (a))e−iz r σ3 T ∗ (ra1/4 )h

where $\tilde h_0 = (H(a) - i0)^{-1}(I - P(a))\, T(a^{1/4}) N_0$. Clearly,
\[ \| \rho_\delta \vec f_0 \|_2 \le W(\hat M, \hat s)\, \| \rho_{2\delta} \tilde h_0 \|_2. \tag{A5.1} \]
To estimate $\tilde h_0$, we rewrite it in the form
\[ \tilde h_0 = \frac{1}{2\pi i} \oint_{|E| = a} \frac{dE}{E}\, (I - P)(H - E)^{-1}_+ T(a^{1/4}) N_0. \tag{A5.2} \]
Using lemmas 2.4.1, 2.4.2, proposition 2.4.3 and the WKB representations of the solutions of (2.4.3) one can prove the following estimate for the kernel of $(H - E)^{-1}_+$:
\[ |G(x,y,E)| \le c\, a^{-K} e^{-\frac1h |S(hx) - S(hy)|}, \qquad |E| = a, \]
with some $K > 0$. As a consequence, one gets the inequality
\[ \| \rho_{2\delta} (H - E)^{-1}_+ T(a^{1/4}) N_0 \|_2 \le W(\hat M, \hat s)\, e^{-(2-\varepsilon)\frac{S_0}{h_0}}. \tag{A5.3} \]
Here we have also used proposition 2.2.1. Consider the expression $\oint_{|E| = a} \frac{dE}{E}\, P (H - E)^{-1}_+ T(a^{1/4}) N_0$. Using propositions 1.2.6, 2.3.1 and lemma 2.4.4 it is not difficult to show that it admits an estimate similar to (A5.3):
\[ \Big\| \rho_{2\delta} \oint_{|E| = a} \frac{dE}{E}\, P (H - E)^{-1}_+ T(a^{1/4}) N_0 \Big\|_2 \le W(\hat M, \hat s)\, e^{-(2-\varepsilon)\frac{S_0}{h_0}}. \tag{A5.4} \]
Combining (A5.1)-(A5.4) one gets the desired result:
\[ \| \rho_\delta \vec f_0 \|_2 \le W(\hat M, \hat s)\, e^{-(2-\varepsilon)\frac{S_0}{h_0}}. \]

Acknowledgment It is a pleasure to thank V.S. Buslaev, F. Merle and J. Sjöstrand for numerous helpful discussions.



References

[B] H. Bateman, Higher transcendental functions, v. II, New York (1953).

[BW] J. Bourgain, W. Wang, Construction of blow-up solutions for the nonlinear Schrödinger equation with critical nonlinearity, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) XXV, 197–215 (1997).

[BP1] V.S. Buslaev, G.S. Perelman, Scattering for the nonlinear Schrödinger equation: states close to a soliton, St. Peters. Math. J. 4, 1111–1143 (1993).

[BP2] V.S. Buslaev, G.S. Perelman, On the stability of solitary waves for nonlinear Schrödinger equation, Amer. Math. Soc. Transl. 2 164, 75–99 (1995).

[C] T. Cazenave, An introduction to nonlinear Schrödinger equation, Textos de Métodos Matemáticos, UFRJ, Rio de Janeiro (1989).

[CW1] T. Cazenave, F.B. Weissler, The structure of solutions to the pseudoconformal invariant nonlinear Schrödinger equation, Proc. Royal Soc. Edinburgh 117A, 251–273 (1991).

[CW2] T. Cazenave, F.B. Weissler, The Cauchy problem for the critical nonlinear Schrödinger equation in H^s, Nonlin. Anal., Theory, Methods & Appl. 14, 807–836 (1990).

[DNPZ] S. Dyachenko, A.C. Newell, A. Pushkarev, V.I. Zakharov, Optical turbulence: weak turbulence, condensates and collapsing filaments in the nonlinear Schrödinger equation, Phys. D 57, 96–160 (1992).

[F] M.V. Fedoryuk, Asymptotic analysis: Linear ordinary differential equations, Berlin (1993).

[Fr] G.M. Fraiman, Asymptotic stability of manifold of self-similar solutions in self-focusing, Sov. Phys. JETP 61, 228–233 (1985).

[GV1] J. Ginibre, G. Velo, On a class of nonlinear Schrödinger equations I, II, J. Func. Anal. 32, 1–71 (1979).

[GV2] J. Ginibre, G. Velo, On a class of nonlinear Schrödinger equations III, Ann. Inst. H. Poincaré Phys. Théor. 28, 287–316 (1978).

[G] R. Glassey, On the blowing up of solutions to the Cauchy problem for nonlinear Schrödinger operators, J. Math. Phys. 8, 1794–1797 (1977).

[GM] L. Glangetas, F. Merle, Existence of self-similar blow-up solution for the Zakharov equation in dimension two, Comm. Math. Phys. 160, 173–215 (1994).

[Gr] M. Grillakis, Linearized instability for nonlinear Schrödinger and Klein-Gordon equations, Comm. Pure Appl. Math., 747–774 (1988).

[KSZ] N. Kosmatov, V. Schvets, V. Zakharov, Computer simulation of wave collapse in nonlinear Schrödinger equation, Phys. D 52, 16–35 (1991).

[LBSK] E.W. Laedke, R. Blaha, K.H. Spatschek, E.A. Kuznetsov, On the stability of collapse in the critical case, J. Math. Phys. 33 (3), 967–973 (1992).

[LPSS] M. Landman, G. Papanicolaou, C. Sulem, P.L. Sulem, Rate of blow-up for solutions of the nonlinear Schrödinger equation at critical dimension, Phys. Rev. A 38, 3837–3843 (1988).

[LePSS1] B.J. LeMesurier, G. Papanicolaou, C. Sulem, P.L. Sulem, Focusing and multi-focusing solutions of the nonlinear Schrödinger equation, Phys. D 31, 78–102 (1988).

[LePSS2] B.J. LeMesurier, G. Papanicolaou, C. Sulem, P. Sulem, Local structure of the self-focusing singularity of the nonlinear Schrödinger equation, Phys. D 32, 210–226 (1988).

[M1] V.M. Malkin, Dynamics of wave collapse in the critical case, Phys. Lett. A 151, 285–288 (1990).

[M2] V.M. Malkin, On the analytical theory for stationary self-focusing of radiation, Phys. D 64, 251–266 (1993).

[M3] V.M. Malkin, Singularity formation for nonlinear Schrödinger equation, in "Partial Differential equations and their applications", P.C. Grenier, V. Ivrii, L.A. Seco and C. Sulem, eds, CRM Proceedings and Lecture Notes 12, 183–198 (1997).

[MM] Y. Martel, F. Merle, Stability of blowup profile and lower bounds for blowup rate for the critical generalized KdV equation, preprint (2000).

[Me1] F. Merle, On uniqueness and continuation properties after blow-up time of self-similar solutions of the nonlinear Schrödinger equation with critical exponent and critical mass, Comm. Pure Appl. Math. 15, 203–254 (1992).

[Me2] F. Merle, Determination of blowup solutions with minimal mass for nonlinear Schrödinger equation with critical power, Duke Math. J. 69, 427–453 (1993).

[Me3] F. Merle, Lower bounds for the blow-up rate of solutions of the Zakharov equation in dimension two, Comm. Pure Appl. Math. 49, 765–79 (1996).

[OT] T. Ogawa, Y. Tsutsumi, Blow-up of H^1 solution for the one-dimensional nonlinear Schrödinger equation with critical power nonlinearity, Proc. Amer. Math. Soc. 111, 487–496 (1991).

[P] G. Perelman, On the blow up phenomenon for the critical nonlinear Schrödinger equation in 1D, Sem. EDP Ec. Polytec. (1999-2000).

[SF] A.I. Smirnov, G.M. Fraiman, The interaction representation in the self-focusing theory, Phys. D 51, 2–15 (1991).

[SS1] C. Sulem, P.L. Sulem, Focusing nonlinear Schrödinger equation and wave-packet collapse, Nonlin. Anal., Theory, Meth. & Appl. 30 (2), 833–844 (1997).

[SS2] C. Sulem, P.L. Sulem, The nonlinear Schrödinger equation. Self-focusing and wave collapse, Appl. Math. Scien. 139 (1999), Springer-Verlag, New York.

[SW1] A. Soffer and M.I. Weinstein, Multichannel nonlinear scattering theory for nonintegrable equations I, Comm. Math. Phys. 133, 119–146 (1990).

[SW2] A. Soffer and M.I. Weinstein, Multichannel nonlinear scattering theory for nonintegrable equations II, J. Diff. Eq. 98, 376–390 (1992).

[W1] M.I. Weinstein, Modulation stability of ground states of nonlinear Schrödinger equation, SIAM J. Math. Anal. 16, 472–491 (1985).

[W2] M.I. Weinstein, Lyapunov stability of ground states of nonlinear dispersive evolution equations, Commun. Pure Appl. Math. 39, 51–68 (1986).

[W3] M.I. Weinstein, Nonlinear Schrödinger equations and sharp interpolation estimates, Comm. Math. Phys. 87, 567–576 (1983).

[W4] M.I. Weinstein, On the structure and formation of singularities in solutions to nonlinear dispersive evolution equations, Comm. Part. Diff. Eq. 11 (5), 545–565 (1986).

[W5] M.I. Weinstein, Solitary waves of nonlinear dispersive evolution equations with critical power nonlinearities, J. Diff. Eq. 69, 192–203 (1987).

Galina Perelman
Centre de Mathématiques
Ecole Polytechnique
F-91128 Palaiseau Cedex
France
email: [email protected]

Communicated by Bernard Helffer
submitted 20/10/00, accepted 08/03/01



Semi-classical Estimates on the Scattering Determinant

Vesselin Petkov and Maciej Zworski

Abstract. We present a unifying framework for the study of Breit-Wigner formulæ, trace formulæ for resonances and asymptotics for resonances of bottles. Our approach is based on semi-classical estimates on the scattering determinant and on some complex function theory.

1 Introduction

The purpose of this paper is to present a semi-classical estimate on the scattering determinant and its applications. We work in the technically simplest setting of compactly supported perturbations of $-h^2\Delta$ on $\mathbb{R}^n$, and concentrate on presenting a complex analytic framework for a general study of Breit-Wigner formulæ, trace formulæ for resonances, and asymptotics for resonances of bottles. This allows us to make the paper essentially self-contained.

The scattering matrix constitutes a mathematical model for the data obtained in a scattering experiment or a chemical reaction. Resonances model states which live for certain times but eventually decay – the real part of a resonance gives the rest energy of the state and its imaginary part the rate of its decay. A basic intuition connects resonances and scattering matrices via the time delay operator or the Breit-Wigner approximation: the long living states should contribute peaks in the derivatives of expressions obtained from the scattering matrix (i.e. expressions which at least in principle are obtained from scattering data). Mathematically this connection is expressed most simply through the fact that the resonances are the poles of the meromorphic continuation of the resolvent. We refer to [37] for a basic introduction to the theory of resonances and for references.

The scattering determinant, that is the determinant of the scattering matrix, is a natural mathematical object to study. It is closely related to the scattering phase which replaces the counting function of eigenvalues for problems on noncompact domains – see [16] for an introduction and references. The connection between the asymptotics of the scattering phase and resonances was first explored by Melrose [15] who proved the Weyl law for the scattering phase using bounds on the resonances.
The further connections between resonances and the scattering phase were then investigated by Guillopé and the authors [11],[18],[35], and the present paper is a semi-classical continuation of these works. We are however using, rather than proving, asymptotics of the scattering phase, as established in the generality we consider by Christiansen [6] and Bruneau and the first author [4], who followed, among other things, the ideas of Robert [20].



Related problems have been recently studied by Bony [2] and Bony-Sjöstrand [3] without a direct appeal to scattering theory, but following Sjöstrand's work on local trace formulæ [22],[23]. That approach allows obtaining some of the applications directly and in greater generality. For instance, it is shown in [2] that for a large class of perturbations, if $\lambda > 0$ is a non-critical energy level, and $Ch < \delta < 1/C$, then we have

$$\#\{z : z \in \mathrm{Res}(P(h)),\ |z - \lambda| < \delta\} = O(\delta)h^{-n},$$

where $\mathrm{Res}(P(h))$ denotes the set of resonances. This provides a fine upper bound on resonances in small sets, generalizing [18, Proposition 2] and Lemma 6.1 below.

The basic estimate on the scattering determinant which follows directly from adapting the proofs in the classical case [18],[34] is:

$$|\det S(z,h)| \le e^{Ch^{-n}}, \quad \operatorname{Im} z \ge 0, \quad z \in \Omega \Subset \{\operatorname{Re} z > 0\}, \tag{1.1}$$

where $S(z,h)$ is the scattering matrix and where $\operatorname{Im} z > 0$ is the "physical half plane" (that is the half plane where $S(z,h)$ is holomorphic). It is interesting and useful that the constant in (1.1) depends only on the size of the support of the perturbation, not on its properties. The difficulty in using (1.1) lies in the need for a lower bound

$$\forall\, 0 < h \le h_0 \ \ \exists\, z_0 = z_0(h) \in \Omega, \ \operatorname{Im} z_0 > \delta, \qquad |\det S(z_0,h)| \ge e^{-Ch^{-n}}. \tag{1.2}$$

Here $z_0$ clearly can depend on $h$ but $\delta > 0$ is fixed. When we can find $z_0$'s such that (1.2) holds with $\Omega = (a,b) + i(-c,c)$, $0 < a < b$, $0 < c$, then for any $\epsilon > 0$ we can factorize $\det S(z,h)$:

$$\det S(z,h) = e^{g(z,h)}\,\frac{\overline{P(\bar z,h)}}{P(z,h)}, \qquad |g(z,h)| \le C(N(h) + h^{-n}) + C, \quad z \in \Omega,$$

$$P(z,h) = \prod_{w \in \mathrm{Res}(P(h)) \cap \Omega_\epsilon} (z - w), \qquad \Omega_\epsilon = \Omega + D(0,\epsilon), \qquad N(h) = \#\,\mathrm{Res}(P(h)) \cap \Omega_\epsilon, \tag{1.3}$$

where we denoted the set of resonances of $P(h)$ by $\mathrm{Res}\,P(h)$. In particular, this shows that we have an improved estimate $|\det S(z,h)| \le C\exp(C \operatorname{Im} z\, h^{-n})$, $\operatorname{Im} z \ge 0$. The factorization is essentially equivalent, via the Birman-Krein formula, to the local trace formula of Sjöstrand [22],[23], just as the earlier global formulæ of Bardos-Guillot-Ralston, Melrose and Sjöstrand-Zworski were equivalent to global factorization of the scattering determinant [11],[35] – see Sect.5. Finer analysis under stronger spectral assumptions leads to factorization in sets of size $h$ and that gives for $0 < \delta < h/C$ the semi-classical Breit-Wigner

formula:

$$\sigma(\lambda+\delta,h) - \sigma(\lambda-\delta,h) = \sum_{|z-\lambda| <\, [\ldots]} \omega_{\mathbb{C}_-}(z,[\lambda-\delta,\lambda+\delta]) + O(\delta)h^{-n},$$

[...] $> 0$ is fixed and $B(x,R) = \{y \in \mathbb{R}^n : |x - y| < R\}$. We assume that $P(h)$, $0 < h \le 1$, is a family of self-adjoint operators, $P(h) : \mathcal{H} \to \mathcal{H}$, with domain $\mathcal{D} \subset \mathcal{H}$, satisfying the following conditions:

$$1\!\!1_{\mathbb{R}^n \setminus B(0,R_0)}\,\mathcal{D} = H^2(\mathbb{R}^n \setminus B(0,R_0)),$$
$$1\!\!1_{\mathbb{R}^n \setminus B(0,R_0)}\,P(h) = -h^2\Delta|_{\mathbb{R}^n \setminus B(0,R_0)},$$
$$(P(h) + i)^{-1} \ \text{is compact}, \qquad P(h) \ge -C, \quad C \ge 0.$$

For convenience we will also add the reality condition (used in Proposition 2.2 only, where it also could be avoided): $\overline{Pu} = P\bar u$, which is satisfied in interesting situations.

Under the above conditions, it is known that the resolvent $R(z,h) = (P(h) - z)^{-1} : \mathcal{H} \to \mathcal{D}$ continues meromorphically from $\{z : \operatorname{Im} z > 0\}$, through $(0,\infty)$, to the double cover of $\mathbb{C}$ when $n$ is odd, and to the logarithmic plane $\Lambda$ when $n$ is even (see the proof of Proposition 4.1 for a direct argument). The first sheet, where $R(z,h)$ is meromorphic on $\mathcal{H}$ (with poles corresponding to eigenvalues), is called the physical plane. This continuation is as an operator from $\mathcal{H}_{\mathrm{comp}}$ to $\mathcal{D}_{\mathrm{loc}}$, and the poles are of finite rank. The poles are called resonances of $P(h)$. We will denote the set of resonances by $\mathrm{Res}(P(h))$, and will always include them according to their multiplicity, $m_R(z,h)$, which for $z \ne 0$ is defined as

$$m_R(z,h) = \operatorname{rank} \oint_{\gamma_\epsilon(z)} R(w,h)\,dw, \qquad \gamma_\epsilon(z) = \{z + \epsilon e^{it} : 0 \le t \le 2\pi\}, \quad 0 < \epsilon \ll 1,$$

see [24] and [35] for a discussion of this. We remark that we include the point spectrum of $P(h)$, denoted by $\sigma(P(h))$, in the set $\mathrm{Res}(P(h))$. Strictly speaking, resonances have non-zero imaginary parts and a distinction could be made.

In order to guarantee a polynomial bound on the counting function of resonances, we need a spectral condition on $P(h)$. It is formulated in terms of a reference operator constructed from $P(h)$: let

$$\mathcal{H}^\sharp = \mathcal{H}_{R_0} \oplus L^2(\mathbb{T}^n_{R_1} \setminus B(0,R_0)), \qquad \mathbb{T}^n_{R_1} = \mathbb{R}^n/(R_1\mathbb{Z}^n), \quad R_1 \gg R_0,$$

and define $P^\sharp(h)$ by replacing $-h^2\Delta_{\mathbb{R}^n}$ by $-h^2\Delta_{\mathbb{T}^n_{R_1}}$ in the definition of $P(h)$ (see [24] and [22]). The assumptions on $P(h)$ imply that $P^\sharp(h)$ has discrete spectrum and we assume that if $N(P^\sharp(h),\lambda)$ is the number of eigenvalues of $P^\sharp(h)$ in $[-\lambda,\lambda]$ then

$$N(P^\sharp(h),\lambda) = O\!\left(\left(\frac{\lambda}{h^2}\right)^{n^\sharp/2}\right), \quad \text{for } \lambda \ge 1, \tag{2.1}$$

for some number $n^\sharp \ge n$. As was observed in [24], this assumption does not depend on $R_1$, only on $P(h)$.

The scattering matrix for a "black box" perturbation is defined just as in the usual obstacle or potential scattering (see [16], [6] and references given there). We recall the stationary definition here: for any $\lambda > 0$ and a function $f \in C^\infty(S^{n-1})$, there exists $u \in \mathcal{D}_{\mathrm{loc}}$, such that for $|x| > R_0$

$$(P - \lambda)u = 0, \qquad u(x) = |x|^{-\frac{n-1}{2}}\left(e^{-\frac{i\sqrt{\lambda}|x|}{h}} f(x/|x|) + e^{\frac{i\sqrt{\lambda}|x|}{h}} g(x/|x|) + O(1/|x|)\right), \tag{2.2}$$

where $g \in C^\infty(S^{n-1})$. By Rellich's Uniqueness Theorem (see for instance [38, Sect.3]), $u$ is unique up to a compactly supported eigenfunction $\tilde u \in \mathcal{D}_{\mathrm{comp}}$, $(P - \lambda)\tilde u = 0$. From the black box assumptions we know that the set of such $\lambda$'s is discrete, and the compact support of the eigenfunctions $\tilde u$ makes them irrelevant in our study of scattering. The function $f$ can be considered as the incoming data, and $g$ as the outgoing data. This is consistent with our notion of the outgoing resolvent, $R(z,h)$, which is bounded on $\mathcal{H}$ for $\operatorname{Im} z > 0$: the outgoing term $\exp(i\sqrt{z}|x|/h)$ is bounded in $L^2$ for $\operatorname{Im} z > 0$. The absolute scattering matrix relates the two data:

$$\widetilde{S}(\lambda,h) : f \longmapsto g,$$

and we denote by $S^0(\lambda,h)$ the free scattering matrix corresponding to $P = -h^2\Delta$. It is essentially given by the antipodal map: $S^0(\lambda,h)f(\omega) = i^{1-n} f(-\omega)$, $\lambda > 0$ (see the proof of Proposition 2.1 below). We then define the standard (relative) scattering matrix as

$$S(\lambda,h) = S^0(\lambda,h)^{-1}\,\widetilde{S}(\lambda,h).$$

It has the form (see (2.5) below):

$$S(\lambda,h) = I + A(\lambda,h), \qquad A(\lambda,h) \in C^\infty(S^{n-1} \times S^{n-1}).$$

Under our assumptions, it continues meromorphically in $\lambda$ to the double cover of $\mathbb{C}$ (Riemann surface for $z = w^2$) for $n$ odd and to the logarithmic plane, $\Lambda$, when $n$ is even. It is holomorphic in $\operatorname{Im} z > 0$, $\operatorname{Re} z > 0$ and the poles of its continuation correspond to resonances of $P(h)$ (see Proposition 2.2 below). We recall also the crucial unitarity

$$S(z,h)^{-1} = S(\bar z,h)^*. \tag{2.3}$$

It follows from the pairing formula recalled in the proof of Proposition 2.1. We now present one of many possible representations of $A(z,h)$ in terms of the resolvent (see [17, Sect.2] and [36, Sect.3]) and its proof contains the proof of the general statements about $S(z,h)$ made above.

Proposition 2.1 For $\phi \in C_c^\infty(\mathbb{R}^n)$ let us denote by

$$E^\phi_\pm(z,h) : L^2(\mathbb{R}^n) \to L^2(S^{n-1}) \tag{2.4}$$

the operator with the kernel $\phi(x)\exp(\pm i\sqrt{z}\langle x,\omega\rangle/h)$, with $\sqrt{z}$ positive on the real axis. Let us choose $\chi_i \in C_c^\infty(\mathbb{R}^n)$, $i = 1,2,3$, such that $\chi_i \equiv 1$ near $B(0,R_0)$, and $\chi_{i+1} \equiv 1$ on $\operatorname{supp}\chi_i$, $i = 1,2$. Then for $\operatorname{Im} z > 0$, $\operatorname{Re} z > 0$ we have

$$A(z,h) = c_n h^{-n} z^{\frac{n-2}{2}}\, E^{\chi_3}_+(z,h)[h^2\Delta,\chi_1]R(z,h)[h^2\Delta,\chi_2]\,{}^tE^{\chi_3}_-(z,h), \qquad c_n = i\pi(2\pi)^{-n}, \tag{2.5}$$

where ${}^tE$ denotes the transpose of $E$.

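We remark that the transpose is defined using the Schwartz kernel: ${}^tE(x,\omega) = E(\omega,x)$.

Proof. We give a direct proof in the spirit of [30] and use the pairing formula: if $\lambda > 0$ and

$$(P - \lambda)u_i = f_i \in \mathcal{H}, \qquad f_i|_{\mathbb{R}^n \setminus B(0,R_0)} \in \mathcal{S},$$
$$u_i(x) = |x|^{-\frac{n-1}{2}}\left(e^{-\frac{i\sqrt{\lambda}|x|}{h}} a_i^-(x/|x|) + e^{\frac{i\sqrt{\lambda}|x|}{h}} a_i^+(x/|x|) + O(1/|x|)\right), \quad |x| \to \infty,$$

then

$$\langle u_1, f_2\rangle_{\mathcal{H}} - \langle f_1, u_2\rangle_{\mathcal{H}} = 2ih\sqrt{\lambda}\left(\langle a_1^-, a_2^-\rangle_{L^2(S^{n-1})} - \langle a_1^+, a_2^+\rangle_{L^2(S^{n-1})}\right).$$

Let us introduce the operators $E_\pm(\lambda,h)$ with Schwartz kernels $\exp(\pm i\sqrt{\lambda}\langle x,\omega\rangle/h)$ and assume that $\lambda > 0$ is not an eigenvalue of $P$. For $g_1, g_2 \in C^\infty(S^{n-1})$ let us put

$$u_1 = (1 - \chi_1)\,{}^tE_-(\lambda,h)g_1, \qquad u_2 = \left((1 - \chi_2)\,{}^tE_-(\lambda,h) - R(\lambda,h)[h^2\Delta,\chi_2]\,{}^tE_-(\lambda,h)\right)g_2,$$

so that $(P - \lambda)u_2 = 0$ and $(P - \lambda)u_1 = [h^2\Delta,\chi_1]\,{}^tE_-(\lambda,h)g_1$. A stationary phase argument now gives

$$a_1^- = \alpha_n g_1, \qquad a_1^+ = \alpha_n i^{1-n} g_1(-\bullet), \qquad \alpha_n = \lambda^{-\frac14(n-1)} h^{\frac12(n-1)} e^{\frac{i}{4}\pi(n-1)} (2\pi)^{\frac12(n-1)}.$$

For $u_2$ we note that since $R(\lambda,h)$ is the outgoing resolvent, the only incoming contribution comes from the free term $(1 - \chi_1)\,{}^tE_-(\lambda,h)g_2$ (that $R(\lambda,h)$ has no incoming term is seen, for instance, from the properties of the free resolvent and (4.2) below). Hence

$$a_2^- = \alpha_n g_2, \qquad a_2^+ = \alpha_n i^{1-n} Sg_2(-\bullet).$$

Using the fact that $(1 - \chi_2)[h^2\Delta,\chi_1] = 0$, and the pairing formula above we see that

$$\langle g_1, E_+(\lambda,h)[h^2\Delta,\chi_1]R(\lambda,h)[h^2\Delta,\chi_2]\,{}^tE_-(\lambda,h)g_2\rangle_{L^2(S^{n-1})}$$
$$= -\langle [h^2\Delta,\chi_1]\,{}^tE_-(\lambda,h)g_1,\ (1 - \chi_2)\,{}^tE_-(\lambda,h)g_2 - R(\lambda,h)[h^2\Delta,\chi_2]\,{}^tE_-(\lambda,h)g_2\rangle_{\mathcal{H}}$$
$$= \langle u_1, (P - \lambda)u_2\rangle_{\mathcal{H}} - \langle (P - \lambda)u_1, u_2\rangle_{\mathcal{H}}$$
$$= 2i\lambda^{-\frac12(n-2)} h^n (2\pi)^{n-1} \langle g_1, (I - S(\lambda,h))g_2\rangle_{L^2(S^{n-1})}.$$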

The general result follows from analytic continuation – in fact, we proved here that the scattering matrix has an analytic continuation, once that of $R(z,h)$ is established. ✷

Remark. It is interesting to note that the representation (2.5) does not depend on the cut-off functions, and that we can reverse the condition $\chi_2 \equiv 1$ on the support of $\chi_1$ to $\chi_1 \equiv 1$ on the support of $\chi_2$. Both facts follow directly from the properties of the scattering matrix but here we propose a direct argument based on the standard properties of "quantum flux". Suppose that $\chi_2$ is equal to one on the supports of functions $\chi_1$, $\tilde\chi_1$, which are equal to 1 near $B(0,R_0)$. We claim that

$$E^{\chi_3}_+(z,h)[h^2\Delta,\chi_2]R(z,h)[h^2\Delta,\chi_1 - \tilde\chi_1]\,{}^tE^{\chi_3}_-(z,h) \equiv 0.$$

This will follow from showing that

$$(-h^2\Delta - z)v_j = 0, \ j = 1,2 \ \Longrightarrow\ \langle R(z,h)[h^2\Delta,\chi_1 - \tilde\chi_1]v_1,\ [h^2\Delta,\chi_2]v_2\rangle_{\mathcal{H}} = 0,$$

which is clear as the left hand side is equal to

$$\langle R(z,h)(-P(\chi_1 - \tilde\chi_1) - (\chi_1 - \tilde\chi_1)h^2\Delta)v_1,\ [h^2\Delta,\chi_2]v_2\rangle_{\mathcal{H}} = -\langle(\chi_1 - \tilde\chi_1)v_1,\ [h^2\Delta,\chi_2]v_2\rangle_{\mathcal{H}} = 0,$$

since $(\chi_1 - \tilde\chi_1)[h^2\Delta,\chi_2] = 0$. Similarly, if $\chi_1 \equiv 1$ on the support of $\tilde\chi_1$, and $\tilde\chi_1 \equiv 1$ near $B(0,R_0)$, then

$$E^{\chi_3}_+(z,h)[h^2\Delta,\chi_2 - \tilde\chi_1]R(z,h)[h^2\Delta,\chi_1]\,{}^tE^{\chi_3}_-(z,h) \equiv 0,$$

which shows that we can switch the conditions on $\chi_1$ and $\chi_2$. Yet another argument of the same type shows that

$$E^{\chi_3}_+(z,h)[h^2\Delta,\chi_2]R_0(z,h)[h^2\Delta,\chi_1]\,{}^tE^{\chi_3}_-(z,h) \equiv 0, \qquad R_0(z,h) = (-h^2\Delta - z)^{-1}.$$

In the next proposition we list two well known facts:

Proposition 2.2 If we define the multiplicity of a pole or a zero of $\det S(z,h)$ as

$$m_S(z,h) = -\frac{1}{2\pi i}\operatorname{tr}\oint_{\gamma_\epsilon(z)} S(w,h)^{-1}\frac{d}{dw}S(w,h)\,dw, \tag{2.6}$$

$\gamma_\epsilon(z) = \{z + \epsilon e^{it} : 0 \le t \le 2\pi\}$, $0 < \epsilon \ll 1$, then

• $\det S(w,h) = (w - z)^{-m_S(z,h)} g_z(w)$, for $w$ near $z$, with $g_z(z) \ne 0$,

• $m_S(z,h) = m_R(z,h) - m_R(\bar z,h)$, $\operatorname{Re} z > 0$, where one of $z$, $\bar z$ is in the physical, and one in the non-physical half-plane.

In particular, the non-negative eigenvalues of $P(h)$ do not contribute to the poles of the scattering matrix.

We outline the proof for the reader's convenience:

Proof. The first part is a direct application of a classical result of Gohberg and Sigal [10]. To see the second part we will use the continuity properties of the multiplicities and the generic simplicity of resonances (see [13]): this makes the argument considerably simpler. By continuity property we mean the fact that for any $w_0$ and $\epsilon > 0$, $\sum_{|w - w_0| < \epsilon} m_\bullet(w,h)$ is constant for sufficiently small perturbations, which follows from the definition of multiplicities using integrals. Consequently we can assume that $m_R(w,h) \le 1$ as the general statement follows from a deformation to the generic case.

Suppose then that $-\pi/2 < \arg w_0 < 0$, that is, that $w_0$ is in the first sheet of the non-physical plane, and that $m_R(w_0,h) = 1$. The proof of the meromorphic continuation (see the derivation of (4.2) below) shows that in this case

$$R(w,h) = \frac{A}{w - w_0} + B(w),$$

where $B(w)$ is holomorphic in $w$ near $w_0$. The reality of $P$ implies that $R(w,h)$ is symmetric (with respect to the indefinite form $\langle\bullet,\bar\bullet\rangle_{\mathcal{H}}$) and consequently $A = \phi \otimes \phi$, $Au = \langle\phi,\bar u\rangle_{\mathcal{H}}\,\phi$. Another look at the structure of the resolvent (see (4.2), and for a more detailed discussion [38, Lemma 1]) shows that $\phi = R_0(w_0,h)g$, where $g \in C_c^\infty(\mathbb{R}^n)$ and $R_0(z,h)$ is the free resolvent. Proposition 2.1 shows that

$$S(w,h) = \frac{A_1}{w - w_0} + B_1(w), \qquad A_1 = c_n z^{\frac{n-2}{2}} h^{-n}\, E_+(w_0,h)g \otimes E_-(w_0,h)g. \tag{2.7}$$

In fact, all that needs to be checked is that

$$E_\mp(z,h)g = E_\mp(z,h)[h^2\Delta,\chi]R_0(z,h)g, \qquad g \in C_c^\infty(\mathbb{R}^n), \quad (1 - \chi)g = 0,$$

and that follows from integration by parts: for $z \in (0,\infty)$ and $\chi \in C_c^\infty(\mathbb{R}^n)$,

$$\langle[h^2\Delta,\chi]R_0(z,h)g,\ {}^tE_\pm(z,h)f\rangle_{\mathcal{H}} = -\langle(-h^2\Delta - z)\chi R_0(z,h)g,\ {}^tE_\pm(z,h)f\rangle_{\mathcal{H}} + \langle g,\ {}^tE_\pm(z,h)f\rangle_{\mathcal{H}} = \langle g,\ {}^tE_\pm(z,h)f\rangle_{\mathcal{H}}.$$

The essentially standard Rellich's Uniqueness Theorem type argument (see [38, Sect.3]) shows that for $\arg z \ne 2\pi k$, $k = 0,1$, $E_\pm(w_0,h)g \ne 0$. We can then find invertible operators, $F_k(w)$, $k = 1,2$, holomorphic near $w_0$, such that

$$S(w,h) = F_1(w)\left(\frac{P_1}{w - w_0} + P_0(w)\right)F_2(w), \qquad P_1^2 = P_1,$$

with $P_0(w)$ holomorphic near $w_0$. As shown in [10], the operators $F_k(w)$ make no contribution to the integral in (2.6), and consequently we can assume that $S(w,h)$ is given by the expression in the middle. The representation (2.7) shows that

$$d_{S^{-1}}(w_0) \overset{\mathrm{def}}{=} \dim\{\psi \in L^2(S^{n-1}) : S^{-1}(z,h)\psi = O(|z - w_0|^k)\} \le 1,$$

and that the only power $k$ which can occur is $k = 1$. In fact using the projection $P_1$ we can construct an element of the kernel and hence $d_{S^{-1}}(w_0) = m_R(w_0,h)$. On the other hand, for $k \ge 1$ we have

$$d_S(w_0) \overset{\mathrm{def}}{=} \dim\{\psi \in L^2(S^{n-1}) : S(z,h)\psi = O(|z - w_0|^k)\} = 0,$$

since the equality (2.3) implies that $S^{-1}(z,h)$ is continuous at $w_0$. If we apply [10, Theorem 2.1] in this situation, we obtain that

$$m_R(w_0,h) = d_{S^{-1}}(w_0) - d_S(w_0) = -\frac{1}{2\pi i}\operatorname{tr}\oint_{\gamma_\epsilon(w_0)} S(w,h)^{-1}\frac{d}{dw}S(w,h)\,dw$$

and that proves the second part of the proposition for $\operatorname{Im} z < 0$, as then $m_R(\bar z,h) = 0$. For $\operatorname{Im} z > 0$, we use (2.3), which shows then that $m_S(z,h) = -m_S(\bar z,h) = -m_R(\bar z,h)$. As now $m_R(z,h) = 0$, we obtain our formula. ✷
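The multiplicity (2.6) is an argument-principle integral, and in the scalar case it can be checked numerically. The sketch below is only an illustration under stated assumptions: the toy determinant $s(w) = (w - \bar w_0)/(w - w_0)$, the sample resonance `W0`, and the helper names are hypothetical and not taken from the paper. For this $s$ the integral should return $+1$ at the pole $w_0$ and $-1$ at the zero $\bar w_0$, in line with $m_S(z,h) = m_R(z,h) - m_R(\bar z,h)$.

```python
import cmath
import math


def log_derivative(w, w0):
    """s'(w)/s(w) for the toy determinant s(w) = (w - conj(w0)) / (w - w0)."""
    return 1.0 / (w - w0.conjugate()) - 1.0 / (w - w0)


def multiplicity(center, w0, eps=0.05, steps=400):
    """Approximate -(1/(2*pi*i)) * integral of s'/s over the circle |w - center| = eps.

    The periodic trapezoid rule is used; eps must be small enough that the
    circle encloses at most one of w0, conj(w0).
    """
    total = 0j
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        direction = cmath.exp(1j * t)
        w = center + eps * direction
        dw = 1j * eps * direction * (2.0 * math.pi / steps)
        total += log_derivative(w, w0) * dw
    return -total / (2j * math.pi)


# Hypothetical resonance in the lower (non-physical) half-plane.
W0 = 1.0 - 0.3j
```

Calling `multiplicity(W0, W0)` evaluates the integral around the pole, and `multiplicity(W0.conjugate(), W0)` around the mirror point in the physical half-plane.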

Remark. We could avoid the results of [13], which strictly speaking do not apply to the whole logarithmic plane when the dimension is even (but apply in the region considered here), and use instead the direct argument of [11, Sect.2] which is based on [10].

Proposition 2.1 has the following strange consequence which perhaps has been observed before:

Proposition 2.3 Suppose that $n \ge 5$. Then $S(z,h)$ is holomorphic in

$$O_n(h,R_0) = h^2R_0^{-2}O_n, \qquad O_n = \{re^{i\theta} ;\ r^{n-4} \le \alpha_n\sin^2\theta,\ 0 < r \le 1\},$$

where $\alpha_n$ is a constant depending on the dimension. Moreover,

$$\frac12 \le |\det S(z,h)| \le 2, \qquad z \in O_n(h,R_0).$$

We recall that it is well known that if $n \ge 5$ and 0 is a pole of the resolvent, then it is an eigenvalue. The proposition shows that this phenomenon of absence of resonances propagates to a set near zero.

Proof. We can take $R_0 = 1$ as the general result follows from scaling. To show that $S(z,h)$ is holomorphic in $O_n(h,1) \cap \{\operatorname{Im} z < 0\}$, we will show that $S(z,h)$ is invertible in $O_n(h,1) \cap \{\operatorname{Im} z > 0\}$. That is done by showing that $\|A(z,h)\|$ is small there. In fact, for $\chi \in C_c^\infty(\mathbb{R}^n)$, $\chi \equiv 1$ near $B(0,1)$,

$$\|[h^2\Delta,\chi]\,{}^tE^\phi_\pm(z,h)\|_{L^2(S^{n-1}) \to L^2(\mathbb{R}^n)} \le C_\chi(h^2 + |z|)e^{C_\chi|z|^{1/2}/h},$$
$$\|E^\phi_\pm(z,h)[h^2\Delta,\chi]\|_{L^2(\mathbb{R}^n) \to L^2(S^{n-1})} \le C_\chi(h^2 + |z|)e^{C_\chi|z|^{1/2}/h},$$
$$\|(1 - \chi)R(z,h)(1 - \chi)\|_{L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)} \le \frac{C}{\operatorname{Im} z}, \qquad \operatorname{Im} z > 0.$$

Hence

$$\|A(z,h)\| \le C\,\frac{|z|}{\operatorname{Im} z}\,e^{C|z|^{1/2}/h}\left((h^{-2}|z|)^{\frac{n-4}{2}} + (h^{-2}|z|)^{\frac{n}{2}}\right), \qquad \operatorname{Im} z > 0,$$

where the constants depend on the cut-off functions used and the dimension. By choosing $\alpha_n$ in the definition of $O_n$ small enough we can make $\|A(z,h)\|$ small in $O_n(h,1)$. To estimate the determinant we observe that

$$e^{-\|(I + A(z,h))^{-1}\|\,\|A(z,h)\|_{\mathrm{tr}}} \le |\det S(z,h)| \le e^{\|A(z,h)\|_{\mathrm{tr}}}, \qquad \operatorname{Im} z > 0.$$

Since $\|A(z,h)\|_{\mathrm{tr}} \le C\|(I - \Delta_{S^{n-1}})^{\frac{n+1}{2}}A(z,h)\|$, the determinant estimate follows from the previous argument by observing that

$$\|(I - \Delta_{S^{n-1}})^{\frac{n+1}{2}}E^\phi_\pm(z,h)[h^2\Delta,\chi]\|_{L^2(\mathbb{R}^n) \to L^2(S^{n-1})} \le C\left(1 + (h^{-2}|z|)^{\frac{n+1}{2}}\right)(|z| + h^2), \qquad |z| \le h^2.$$

✷

The standard object of study in scattering theory is the scattering phase which is defined as

$$\sigma(\lambda,h) = \frac{1}{2\pi i}\log s(\lambda,h), \tag{2.8}$$

with some choice of the logarithm, for instance, $\sigma(0,h) = 0$. It is related to the spectral shift function which is defined using normalized traces of functions of $P(h)$. To present this relation we introduce a normalized trace: for $g \in \mathcal{S}(\mathbb{R})$ we let

$$\widetilde{\operatorname{tr}}\,g(P(h)) = \operatorname{tr}_{\mathcal{H}}\left(g(P(h)) - (1 - \chi)g(-h^2\Delta)(1 - \chi)\right) - \operatorname{tr}_{L^2(\mathbb{R}^n)}\left(g(-h^2\Delta) - (1 - \chi)g(-h^2\Delta)(1 - \chi)\right), \tag{2.9}$$

where $\chi \in C_c^\infty(\mathbb{R}^n)$, $\chi \equiv 1$ on $B(0,R_0 + a)$, $a > 0$. The Birman-Krein formula then takes the following well known form

$$\widetilde{\operatorname{tr}}\,g(P(h)) = -\int\frac{dg}{d\lambda}(\lambda)\,\sigma(\lambda,h)\,d\lambda + \sum_{\lambda \in \sigma(P(h))} g(\lambda), \qquad g \in \mathcal{S}((0,\infty)), \tag{2.10}$$

and for the adaptation of the standard proof to the black box case we refer to [6]. By using the assumption (2.1) and the representation of the scattering phase (see [4, Theorem 3]) we have, for every $J \Subset \mathbb{R}_+$,

$$|\sigma(\lambda,h)| \le C(J)h^{-n^\sharp}, \qquad \lambda \in J, \quad 0 < h \le h_0(J). \tag{2.11}$$

If we define

$$N(\lambda,h) = \#\,\sigma(P^\sharp(h)) \cap (0,\lambda] - \#\,\sigma(-h^2\Delta_{\mathbb{T}^n_{R_1}}) \cap (0,\lambda],$$

then, as shown recently by Bruneau and the first author [4, Theorem 3], for $E > 0$ and $\mu > 0$ we have

$$\sigma(E + \mu,h) - \sigma(E,h) + \sum_{\lambda \in \sigma(P(h))}\delta_\lambda((E,E + \mu]) = N(E + \mu,h) - N(E,h) + O(h^{-n^\sharp + 1}). \tag{2.12}$$

In particular, in the interesting situation when $n = n^\sharp$,

$$N(E + \mu,h) - N(E,h) = W(E,\mu)h^{-n} + O(h^{-n+1}), \tag{2.13}$$

and $\sigma(P(h)) \cap (0,\infty) = \emptyset$, we have

$$\sigma(E + \mu,h) - \sigma(E,h) = W(E,\mu)h^{-n} + O(h^{-n+1}), \tag{2.14}$$

where the Weyl term $W(E,\mu)$ is assumed to be smooth in $\mu$ as is the case for spectral asymptotics near non-degenerate energy levels (see for instance [7, Sect.11]).




3 Some complex analysis

In the aspects of scattering theory studied here we apply the following principle of complex analysis: if a holomorphic function is not identically zero then, at most points, it is bounded from below by a constant times the reciprocal of its upper bound provided we have control on the lower bound of the function at one point. This follows from a precise statement for the disc:

Lemma 3.1 Suppose $f(z)$ is holomorphic in the disc $|z| \le r$ and that $f(0) \ne 0$. Suppose that the number of zeros of $f(z)$ in $|z| \le r$ is equal to $N$. Then for any $\theta \in (0,1)$ we have

$$\min_{|z| = \rho}\log|f(z)| > -\left(\frac{r + \rho}{r - \rho}\max_{|z| = r}\log|f(z)| + N\log\frac{1}{\theta}\right) + \frac{2r}{r - \rho}\log|f(0)|, \tag{3.1}$$

for $\rho \in (0,r) \setminus \bigcup_{k=1}^K(\rho_k - \delta_k, \rho_k + \delta_k)$, $0 < \delta_k < \rho$, $\sum_{k=1}^K\delta_k \le 6\theta r$.

The proof follows from the classical lemma of Cartan (see for instance [12, Lemma 6.17]) and the Poisson-Jensen formula (see [12, Lemma 6.18]). We recall that $N$ can be estimated using Jensen's formula by

$$\left(\log\left(1 + \frac{\epsilon}{r}\right)\right)^{-1}\left(\max_{|z| = r + \epsilon}\log|f(z)| - \log|f(0)|\right), \qquad \epsilon > 0.$$

For future use we will recall here Cartan's beautiful estimate: Given arbitrary numbers $z_m \in \mathbb{C}$, $m = 1,\ldots,M$, for any $\eta > 0$ there exists a set, $\bigcup_{l=1}^L D(a_l,r_l)$, formed by the union of $L \le M$ discs, $D(a_l,r_l)$, centered at some points $a_l \in \mathbb{C}$, such that $\sum_{l=1}^L r_l < 2e\eta$ and

$$\prod_{m=1}^M|z - z_m| > \eta^M, \qquad z \in \mathbb{C} \setminus \bigcup_{l=1}^L D(a_l,r_l). \tag{3.2}$$
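Cartan's estimate (3.2) has a consequence that is easy to test numerically: since the exceptional set lies in discs whose radii sum to less than $2e\eta$, its intersection with any straight line has length less than $4e\eta$. The following is a minimal sketch of such a check, not a construction of the discs themselves; the function names, the sampling grid and the sample points are assumptions made for illustration.

```python
import math
import random


def log_product(z, points):
    """Return log of prod_m |z - z_m| (minus infinity if z hits one of the points)."""
    total = 0.0
    for p in points:
        d = abs(z - p)
        if d == 0.0:
            return -math.inf
        total += math.log(d)
    return total


def exceptional_length_on_line(points, eta, y=0.0, lo=-2.0, hi=2.0, steps=20000):
    """Estimate the length of {x in [lo, hi] : prod_m |x + iy - z_m| <= eta^M}.

    A midpoint rule on a uniform grid is used, so the result is only accurate
    up to a few grid steps.
    """
    threshold = len(points) * math.log(eta)
    dx = (hi - lo) / steps
    bad = 0
    for k in range(steps):
        z = complex(lo + (k + 0.5) * dx, y)
        if log_product(z, points) <= threshold:
            bad += 1
    return bad * dx


def random_points(count, seed=0):
    """Hypothetical sample: `count` points drawn from the square [-0.7, 0.7]^2."""
    rng = random.Random(seed)
    return [complex(rng.uniform(-0.7, 0.7), rng.uniform(-0.7, 0.7)) for _ in range(count)]
```

For instance, `exceptional_length_on_line(random_points(20), 0.05)` can be compared with the bound `4 * math.e * 0.05`; when all points coincide at the origin the exceptional set on the real axis is simply the interval $[-\eta,\eta]$.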

Lemma 3.1 is then a consequence of this, and of the Carathéodory or Harnack inequalities (see the proof of Proposition 4.2 for a direct application of a similar argument). We will also need a result in the case of a cone for which we quote [5, Theorem 56]:

Lemma 3.2 Suppose that $f$ is holomorphic in $\{z : 0 < \arg z < \pi/k + \epsilon\}$, $\epsilon > 0$, and that $\log|f(z)| \le B_1|z|^k$, $\log|f(z_0)| \ge -B_2$ with $0 < \arg z_0 < \pi/k$. Then for any $\delta > 0$,

$$\log|f(re^{i\theta})| > -C_\delta r^k, \quad r > r_0, \quad \theta \in (0,\theta_0) \setminus \Sigma(r), \quad |\Sigma(r)| < \delta, \tag{3.3}$$
$$C_\delta = C_\delta(\epsilon,z_0,B_1,B_2), \qquad r_0 = r_0(\epsilon,z_0,B_1,B_2).$$

We remark that this estimate also follows from the estimates obtained more directly by Sjöstrand [23, Sect.7]. We recall also the standard Carleman inequality which in a similar context was already used in [18] (see for instance [29, 3.7]):

Lemma 3.3 Let $f(z)$ be holomorphic in $|z - \lambda| \le R$, $\operatorname{Im} z \ge 0$, with $\lambda \in \mathbb{R}$. Let $z_j$ denote the zeros of $f(z)$ and let $0 < \rho < r < R$ be such that no zeros of $f(z)$ lie on $|z - \lambda| = \rho$ and $|z - \lambda| = r$, and on the real axis. Then for $0 < \delta < 1 - \rho/r$ we have

$$\sum_{\rho <\, [\ldots]}\frac{\operatorname{Im} z_j}{|z_j - \lambda|^2} \le \frac{1}{\delta}\,\frac{1}{\pi r}\int_0^\pi\log|f(\lambda + re^{i\theta})|\sin\theta\,d\theta \ [\ldots]$$

[...] $> 0$,

$$M_0(h) = \max_{\Omega_0(h) \cap \{\operatorname{Im} z = 0\}}\log_+|F(z,h)|, \qquad M_1(h) = \max_{\Omega_0(h) \cap \{\operatorname{Im} z = h^M\}}\log_+|F(z,h)|,$$

where $\log_+ x = \max(0,\log x)$.

Proof. For $\epsilon > 0$ let us introduce

$$f(z,h) = \frac{h^{-L/2}}{\sqrt{\pi}}\int_{a + \epsilon/2}^{b - \epsilon/2}\exp(-h^{-L}(x - z)^2)\,dx, \qquad L < 2M,$$

so that $|f(z,h)| \le e$ on $\Omega_0(h)$, $|f(z,h)| \ge 1/2$ on $\Omega_\epsilon(h)$, if $h \le h(\epsilon)$, and $|f(z,h)| \le C\exp(-Ch^{-L})$ on $\Omega_0(h) \setminus \Omega_{\epsilon/4}(h)$. We then apply the maximum principle to the subharmonic function

$$\log|G(z,h)| = \log|F(z,h)| + \log|f(z,h)| - M_0(h) - M_1(h) - 1.$$

If we choose $L > K$ then, on $\partial\Omega_0(h)$ we have $\log|G(z,h)| \le 0$ and on $\Omega_\epsilon(h)$ we get $\log|f(z,h)| = O(1)$. ✷
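The auxiliary function $f(z,h)$ in this proof is a Gaussian-smoothed indicator of the interval $[a + \epsilon/2, b - \epsilon/2]$. For real $z = x$ the integral has a closed form in terms of the error function, which makes the stated size properties easy to inspect. The sketch below covers only real arguments; the function name and the sample parameter values are assumptions made for illustration.

```python
import math


def smoothed_indicator(x, a, b, eps, h, L):
    """f(x, h) for real x.

    Evaluates (h^{-L/2} / sqrt(pi)) * integral over [a + eps/2, b - eps/2]
    of exp(-h^{-L} (t - x)^2) dt, using
    integral of exp(-s^2 u^2) du = sqrt(pi) / (2 s) * erf(s u).
    """
    s = h ** (-L / 2.0)
    lo = a + eps / 2.0
    hi = b - eps / 2.0
    return 0.5 * (math.erf(s * (hi - x)) - math.erf(s * (lo - x)))
```

With the hypothetical values $a = 1$, $b = 2$, $\epsilon = 0.2$, $h = 0.01$, $L = 2$, the function is close to 1 on $[a + \epsilon, b - \epsilon]$ and negligible at the endpoints $a$ and $b$, matching the behaviour used in the maximum principle argument.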

4 Estimates on the scattering determinant

We will give a self-contained discussion of the estimates for the number of resonances, the cut-off resolvent and the scattering determinant in the setting of semiclassical compactly supported black box perturbations. Our presentation comes largely from [28, Sect.4] and it is based on the works of Melrose [14], Sjöstrand [22],[24], Vodev [31], and the second author [34],[36]. We also adapt the methods of [18] to the semi-classical setting to obtain the estimate on the scattering determinant (Lemma 4.3), and its factorization (Proposition 4.4). We start with a polynomial bound on the number of resonances:

Proposition 4.1 If $n^\sharp$ is as in (2.1) and $\Omega \Subset \{z : \operatorname{Re} z > 0\}$ is a pre-compact neighborhood of $E \in \mathbb{R}_+$, then

$$\#\{z : z \in \mathrm{Res}(P(h)) \cap \Omega\} = O(h^{-n^\sharp}), \qquad \Omega \Subset \mathbb{C}. \tag{4.1}$$

Remark. This proposition can be improved by replacing $h^{-n^\sharp}$ by a more precise bound on the number of eigenvalues of the reference operator, $\Phi(h^{-2})$ – see [22]. For a large class of majorants $\Phi$, the proof given here can be improved following [31]. Consequently we can replace $h^{-n^\sharp}$ by $\Phi(Ch^{-2})$ in all subsequent estimates, which we avoid for the sake of clarity. The most interesting case is of course $n^\sharp = n$ and a nice case where $\Phi(t) = t^{n^\sharp/2}$, $n^\sharp > n$, is given by finite volume hyperbolic quotients [24].

Proof. Let $R_0(z,h)$ be the meromorphic continuation of the free resolvent $(-h^2\Delta - z)^{-1}$ from $\operatorname{Im} z > 0$ to $\widetilde\Omega$, $\Omega \Subset \widetilde\Omega \Subset \mathbb{C}$. Let us also consider the following cut-off functions $\chi_i \in C_c^\infty(\mathbb{R}^n)$, $i = 0,1,2$, $\chi_0 \equiv 1$ near $B(0,R_0)$, $\chi_i \equiv 1$ near $\operatorname{supp}\chi_{i-1}$ and $\chi \equiv 1$ near $\operatorname{supp}(\chi_2)$. We then define

$$Q_0 = Q_0(z,h) = (1 - \chi_0)R_0(z,h)(1 - \chi_1), \qquad Q_1 = Q_1(z_0,h) = \chi_2R(z_0,h)\chi_1, \quad \operatorname{Im} z_0 > 0,$$

so that

$$(P(h) - z)(Q_0(z,h) + Q_1(z_0,h)) = I + K_0(z,h) + K_1(z_0,z,h),$$
$$K_0(z,h) = -[h^2\Delta,\chi_0]R_0(z,h)(1 - \chi_1),$$
$$K_1(z_0,z,h) = -[h^2\Delta,\chi_2]R(z_0,h)\chi_1 + \chi_2(z_0 - z)R(z_0,h)\chi_1.$$

We now put $K = K(z_0,z,h) = K_0(z,h)\chi + K_1(z_0,z,h)\chi$ which is a compact operator $\mathcal{H} \to \mathcal{H}$ and the norm of $K(z_0,z_0,h)$ is $O(h)$. Hence $(I + K(z_0,z_0,h))^{-1}$ exists for $h$ small enough and consequently (via analytic Fredholm theory) $(I + K(z_0,z,h))^{-1}$ is meromorphic in $z$ (under our assumptions, on the Riemann surface of $z = w^2$ for $n$ odd and $z = e^w$ for $n$ even). Hence

$$R(z,h)\chi = (Q_0(z,h)\chi + Q_1(z_0,h)\chi)(I + K(z_0,z,h))^{-1}, \tag{4.2}$$

and we have essentially reviewed the proof of the meromorphic continuation of the resolvent from [24]. We now introduce

$$f(z,h) = \det(I + K^{n^\sharp + 1}(z,h)),$$

where $n^\sharp$ is as in (2.1) and, as we will see below, the choice of the power justifies the existence of the determinant. By Weyl inequalities (see for instance [9, Chapter II, Corollary 3.1]), $|f(z,h)| \le M(h)$, $z \in \widetilde\Omega$, where

$$M(h) = \sup_{z \in \widetilde\Omega}\det\left(I + (K^*K)^{\frac{n^\sharp + 1}{2}}\right).$$

To estimate $M(h)$ we need to estimate the eigenvalues of $(K^*K)^{\frac12}$, that is the characteristic values $\mu_j(K)$ of $K$. The standard properties of characteristic values (see [9, Chapter II]) show that it is enough to estimate the characteristic values of various summands.

We start by proving that

$$\mu_j([h^2\Delta,\chi_2]R(z_0,h)\chi_1) \le Ch\left(\frac{j}{h}\right)^{-\frac{1}{n^\sharp}}, \qquad \mu_j(\chi_2R(z_0,h)\chi_1) \le C\left(\frac{j}{h}\right)^{-\frac{2}{n^\sharp}}.$$

In fact, for all $N$, $M$,

$$\chi_2R(z_0,h)\chi_1 - \chi_2(P^\sharp(h) - z_0)^{-1}\chi_1 = O(h^N) : \mathcal{H} \to \mathcal{D}^M$$

(see the proof of [24, Proposition 5.4]). From this the estimates follow from the estimates on the characteristic values of $\chi_2(P^\sharp(h) - z_0)^{-1}\chi_1$ which in turn follow from (2.1).

Greater difficulty lies in estimating the $K_0\chi$ term, where we encounter exponential growth. We start by observing that for $\operatorname{Im} z \ge 0$,

$$\mu_j([h^2\Delta,\chi_2]\chi R_0(z,h)\chi) \le Ch\left(\frac{j}{h}\right)^{-\frac{1}{n}}$$

(see, for instance, [34, Lemma 4]). For $\operatorname{Im} z < 0$ we write

$$\chi R_0(z,h)\chi = \chi(-h^2\Delta - z)^{-1}\chi + \chi(R_0(z,h) - (-h^2\Delta - z)^{-1})\chi,$$

where $R_0(z,h)$ is the meromorphic continuation of the resolvent from $\operatorname{Im} z > 0$ and $(-h^2\Delta - z)^{-1}$ is the resolvent, holomorphic on $L^2$ for $z \in \mathbb{C} \setminus \mathbb{R}_+$. This reduces the problem to estimating the characteristic values of $\chi(R_0(z,h) - (-h^2\Delta - z)^{-1})\chi$. We rewrite this operator using the standard representation of the spectral projection (see for instance the proof of [34, Lemma 1]):

$$\chi(R_0(z,h) - (-h^2\Delta - z)^{-1})\chi = \tilde c_n h^{-n} z^{\frac{n-2}{2}}\,{}^tE^\chi_+(z,h)E^\chi_-(z,h),$$

where $E^\chi_\pm$ are as in (2.4). Hence,

$$\mu_j(\chi(R_0(z,h) - (-h^2\Delta - z)^{-1})\chi) \le |\tilde c_n||z|^{\frac{n-2}{2}}h^{-n}\|{}^tE^\chi_+(z,h)\|\,\mu_j(E^\chi_-(z,h))$$

and we estimate

$$\mu_j(E^\chi_-(z,h)) = \mu_j((I - \Delta_{S^{n-1}})^{-k}(I - \Delta_{S^{n-1}})^kE^\chi_-(z,h)) \le \mu_j((I - \Delta_{S^{n-1}})^{-k})\,\|(I - \Delta_{S^{n-1}})^kE^\chi_-(z,h)\|$$
$$\le C^kj^{-\frac{2k}{n-1}}(2k)!\exp(C/h) \le C\exp(Ch^{-1} - j^{\frac{1}{n-1}}/C), \tag{4.3}$$

where we used the Cauchy inequalities and then optimized in $k$. By summing up the contributions from different terms in $K$, we obtain the following estimate on the determinant

$$M(h) = O(e^{Ch^{-n^\sharp}}). \tag{4.4}$$

Since $I + K(z_0,z_0,h)^{n^\sharp + 1}$ can be inverted by Neumann series, and since the same estimates hold for

$$(I + K(z_0,z_0,h)^{n^\sharp + 1})^{-1} = I - K(z_0,z_0,h)^{n^\sharp + 1}(I + K(z_0,z_0,h)^{n^\sharp + 1})^{-1},$$

we can estimate $f(z_0,h)^{-1}$ so we get

$$|f(z_0,h)| > e^{-Ch^{-n^\sharp}}. \tag{4.5}$$

Let us now put $\Omega_0 = D(z_0,r)$, $\Omega_0 \subset \Omega$, $|\operatorname{Im} z_0| < r < \operatorname{Re} z_0$ (choosing $z_0$ appropriately for that). Let $N(h)$ be the number of zeros, $w_j(h)$, of $f(z,h)$ in $D(z_0,r + \epsilon) \subset D(z_0,r + 2\epsilon) \subset \Omega$. Then by the Jensen inequality

$$N(h) \le C_\epsilon\left(\max_{D(z_0,r + 2\epsilon)}\log|f(z,h)| - \log|f(z_0,h)|\right) \le C_\epsilon(\log M(h) - \log|f(z_0,h)|) \le Ch^{-n^\sharp}. \tag{4.6}$$

By Lemma 3.1, we can cover $\Omega$ by discs centered at $\tilde z$ at which (4.5) holds with $z_0$ replaced by $\tilde z$. Hence by repeating the argument we obtain (4.1). ✷

The next result holds in greater generality (see [27, Lemma 1] and references given there), but we will give a direct argument following directly from the proof of Proposition 4.1:

Proposition 4.2 If $\Omega$ is as in (4.1), then for $0 < h \le h_0$ we have

$$\|\chi R(z,h)\chi\|_{\mathcal{H} \to \mathcal{H}} \le C_\Omega\exp(C_\Omega h^{-n^\sharp}\log(1/F(h))), \qquad z \in \Omega \setminus \bigcup_{z_j(h) \in \mathrm{Res}\,P(h)} D(z_j(h),F(h)), \quad 0 < F(h) \ll 1,$$

where $R(z,h)$ is the meromorphically continued resolvent, $z_j(h)$ are the resonances of $P(h)$ and $\chi \in C_c^\infty(\mathbb{R}^n)$, $\chi \equiv 1$ near $B(0,R_0)$.

Proof. To estimate the resolvent we now use, with the notation of the proof of Proposition 4.1, the following inequality

$$\|\chi R(z,h)\chi\| \le (\|Q_0\chi\| + \|Q_1\chi\|)\,\|(I + K(z_0,z,h))^{-1}\| \le (\|Q_0\chi\| + \|Q_1\chi\|)\,\frac{\det\left(I + (K^*K)^{\frac{n^\sharp + 1}{2}}\right)}{|\det(I + K^{n^\sharp + 1})|} \le e^{Ch}M(h)|f(z,h)|^{-1}.$$

Here in the second inequality we have used [9, Chapter V, Theorem 5.1]. Hence the problem is reduced to lower bounds on $|f(z,h)|$. We could apply Lemma 3.1 but instead we trade the quality of the lower bound for an explicit characterization of the exceptional set.

Thus, again with the same notation as in the proof of Proposition 4.1, we write

$$f(z,h) = e^{g(z,h)}\prod_{j=1}^{N(h)}(z - w_j(h)), \qquad z \in D(z_0,r),$$

where $g(z,h)$ is holomorphic in $D(z_0,r)$ and $w_j(h)$ are the zeros of $f(z,h)$ in $D(z_0,r + \epsilon)$. Next using the estimate (3.2) for $\prod_{j=1}^{N(h)}(z - w_j(h))$ with some $\eta_0 > 0$, the bound (4.6) for $N(h)$, the estimate (4.4) and the maximum principle for the harmonic function $\operatorname{Re} g(z,h)$, we deduce an upper bound $\operatorname{Re} g(z,h) \le Ch^{-n^\sharp}$, $z \in D(z_0,r)$. Carathéodory's inequality (see for instance [29, 5.5]) gives

$$\max_{|z - z_0| = \rho}|g(z,h)| \le \frac{2\rho}{r - \rho}\max_{|z - z_0| = r}\operatorname{Re} g(z,h) + \frac{r + \rho}{r - \rho}|g(z_0,h)|, \qquad r > \rho. \tag{4.7}$$

Taking $0 < \operatorname{Im} z_0 < \operatorname{Re} z_0$, $z_0 \notin \bigcup_{j=1}^{N(h)}D(w_j(h),F(h))$, we get $\log|f(z_0,h)| > -Ch^{-n^\sharp}$ and

$$\log\Big|\prod_{j=1}^{N(h)}(z_0 - w_j(h))\Big| \ge N(h)\log F(h) \ge -Ch^{-n^\sharp}\log\frac{1}{F(h)},$$

which yields

$$|\operatorname{Re} g(z_0,h)| \le Ch^{-n^\sharp}\log(1/F(h)).$$

We can choose appropriately $\operatorname{Im} g(z_0,h)$ so that $|g(z_0,h)| \le Ch^{-n^\sharp}\log(1/F(h))$, and that gives the lower bound

$$\log|f(z,h)| \ge -Ch^{-n^\sharp}\log(1/F(h)) \quad \text{for } z \in D(z_0,\rho) \setminus \bigcup_{j=1}^{N(h)}D(w_j(h),F(h)).$$

Now recall that the resonances $z_j(h)$ are included in the set of zeros of $f(z,h)$, so applying the maximum principle for the operator-valued holomorphic function $\chi R(z,h)\chi$, outside the discs centered at $z_j(h)$, we obtain the conclusion of the proposition for $z \in D(z_0,\rho) \setminus \bigcup_{z_j(h) \in \mathrm{Res}\,P(h)}D(z_j(h),F(h))$. Covering $\Omega$ by discs and using the successive lower bounds for $|f(z,h)|$ gives the result for general domains. ✷

We now give the crucial estimate on the scattering determinant. It generalizes the estimate given in [34, Proposition 2, (14)]. Its interest comes from its universality: it does not depend in any way on the structure of the perturbation, only on the size of its support:

Lemma 4.3 If $s(z,h) = \det S(z,h)$ and $\Omega$ is as in (4.1), then

$$|s(z,h)| \le Ce^{Ch^{-n}}, \qquad z \in \Omega \cap \mathbb{C}_+, \quad C = C(R_0,\Omega). \tag{4.8}$$


Proof. This is an almost immediate consequence of Proposition 2.1, the estimates (4.3), and the resolvent estimate of Proposition 4.2. As before, we use Weyl inequalities to have

\[ |s(z,h)| \le \prod_{j=1}^{\infty}\bigl(1+\mu_j(A(z,h))\bigr) . \]

For $\operatorname{Im} z>h^M$, $M$ fixed, we have that $R(z,h)=O(h^{-M}):\mathcal{H}\to\mathcal{H}$, and the equation $(1-\chi_1)(-h^2\Delta-z)R(z,h)=1-\chi_1$ then gives

\[ \|[h^2\Delta,\chi_2]R(z,h)[h^2\Delta,\chi_1]\|_{L^2(\mathbb{R}^n)\to L^2(\mathbb{R}^n)} = O(h^{-M+2}) . \]

Hence,

\[ \mu_j(A(z,h)) \le C_n(\Omega)\,h^{-n-M+2}\,\|E_+^{\chi_3}(z,h)\|\,\mu_j(E_-^{\chi_3}(z,h)) . \]

We now use the estimate (4.3) to obtain

\[ \mu_j(A(z,h)) \le \exp\bigl(C_Mh^{-1}-j^{\frac{1}{n-1}}/C\bigr) . \]

Consequently, the product of $1+\mu_j(A(z,h))$ over $j\ge Ch^{-n+1}$ is bounded by $\exp(Ch^{-1})$, which implies that

\[ |s(z,h)| \le e^{Ch^{-1}}\bigl(1+\|A(z,h)\|\bigr)^{Ch^{-n+1}} \le e^{C(\operatorname{Im}\sqrt{z}\,h^{-n}+(n+M-2)\log(1/h)h^{-n+1})} , \quad \operatorname{Im} z>h^M , \tag{4.9} \]

where, as in the proof of Proposition 2.3, we estimated $\|A(z,h)\|$ by $Ch^{-n-M+2}\exp(C\operatorname{Im}\sqrt{z}/h)$. Since $|s(z,h)|=1$ for $z\in\mathbb{R}$, we can apply a version of the three lines theorem given in Lemma 3.5 to conclude the proof. For that we need some weak estimate valid everywhere and we claim that

\[ |s(z,h)| \le e^{Ch^{-(n^\sharp+1)n}} , \quad \operatorname{Im} z\ge 0 , \quad z\in\Omega_\epsilon=\Omega+D(0,\epsilon) . \tag{4.10} \]

In fact, Proposition 4.2 shows that for every $x\in\Omega_\epsilon\cap\mathbb{R}$ we can find $x'\in\Omega_{2\epsilon}\cap\mathbb{R}$ such that $|x-x'|<\epsilon$ and for $z\in x'+i[0,h^M]$ and $0<h\le h(\epsilon)$ we have

\[ \|\chi R(z,h)\chi\|_{\mathcal{H}\to\mathcal{H}} = O\bigl(e^{h^{-n^\sharp-1}}\bigr) . \]

Hence

\[ \|[h^2\Delta,\chi_2]R(z,h)[h^2\Delta,\chi_1]\|_{L^2(\mathbb{R}^n)\to L^2(\mathbb{R}^n)} = O\bigl(e^{h^{-n^\sharp-1}}\bigr) . \]

Consequently, by the same argument as above,

\[ \mu_j(A(z,h)) \le \exp\bigl(Ch^{-n^\sharp-1}-j^{\frac{1}{n-1}}/C\bigr) , \]

which proves (4.10) and concludes the proof of the lemma.

✷




As recalled in Sect. 2, the poles of the scattering determinant are given by the poles of the resolvent, away from the real axis. That, and the unitarity (2.3), immediately imply a factorization of the scattering determinant. The issue is the estimate on the non-vanishing term in that factorization and this is addressed in

Proposition 4.4 Let $s(z,h)=\det S(z,h)$ be the scattering determinant. Then

\[ s(z,h) = e^{g(z,h)}\frac{P(\bar z,h)}{P(z,h)} , \qquad |g(z,h)| \le \begin{cases} C_\epsilon h^{-n^\sharp} , & n\ge 1 ,\\ C_\epsilon(N(h)+h^{-n}) , & n\ge 5 , \end{cases} \quad z\in R , \]

\[ P(z,h) = \prod_{w\in\operatorname{Res}(P(h))\cap R_\epsilon}(z-w) , \qquad N(h) = \#\operatorname{Res}(P(h))\cap R_\epsilon , \tag{4.11} \]

where $g(z,h)$ is holomorphic in $R_\epsilon$ and

\[ R=(a,b)+i(-c,c) , \quad 0<a<b , \quad 0<c , \quad R_\epsilon=R+D(0,\epsilon) . \]

In particular for $n^\sharp=n$ we have an improved estimate

\[ |s(z,h)| \le Ce^{C\operatorname{Im} z\,h^{-n}} , \quad z\in R\cap\mathbb{C}_+ . \tag{4.12} \]

To obtain this proposition we need the following

Lemma 4.5 For any $\Omega=[a,b]+i(0,c)$, $0<a<b$, $c>0$, there exist $\delta>0$ and $C$, such that for any $0<h\le h_0$, there exists $z_0=z_0(h)$, which satisfies

\[ \log|s(z_0,h)| \ge -Ch^{-n^\sharp} , \quad z_0\in\Omega , \quad \operatorname{Im} z_0>\delta . \tag{4.13} \]

The constant $C$ in (4.13) depends on $P(h)$.

Proof. If we factorize $s(z,h)$ as in (4.11), then Cartan's lemma (3.2) and the bound on the number of resonances (4.1) show that we need to find $z_0$ for which $\operatorname{Re} g(z_0,h)\ge -Ch^{-n^\sharp}$. We normalize $g(z,h)$ by assuming that $|g(\tilde a,h)|\le 2\pi$, $a<\tilde a<b$, and note that Lemma 4.3 and Cartan's lemma imply that

\[ \operatorname{Re} g(z,h) \le Ch^{-n^\sharp} , \quad z\in\Omega\cap\{\operatorname{Im} z\ge 0\} . \]

We claim that

\[ |\operatorname{Im} g(z,h)| \le Ch^{-n^\sharp} , \quad z\in R\cap\mathbb{R} . \]


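In fact, for $\lambda$ real we have

\[ \operatorname{Im} g(\lambda,h)-\operatorname{Im} g(\tilde a,h) = 2\pi\bigl(\sigma(\lambda,h)-\sigma(\tilde a,h)\bigr) + 2\sum_{w\in\operatorname{Res}(P(h))\cap R_\epsilon}\int_{\tilde a}^{\lambda}\frac{\operatorname{Im} w}{|w-t|^2}\,dt . \]

Using (4.1) and the estimate

\[ \Bigl|\int_{\tilde a}^{\lambda}\frac{y}{y^2+(x-t)^2}\,dt\Bigr| \le \pi , \]

we see that the second term on the right hand side is $O(h^{-n^\sharp})$. The first term satisfies the same estimate in view of (2.11). If we put $f_h(z)=g(z,h)h^{n^\sharp}$, then we know that

\[ f_h(\bar z) = -\overline{f_h(z)} , \ z\in R , \qquad |f_h(z)|\le C_1 , \ z\in R\cap\mathbb{R} , \qquad \operatorname{Re} f_h(z)\le C , \ z\in R\cap\{\operatorname{Im} z\ge 0\} , \]

and we want to show that

\[ \exists\,\delta>0 ,\ C_2>0 ,\ \forall\, 0<h\le h_0 ,\ \exists\, z_0=z_0(h)\in R , \quad \operatorname{Im} z_0>\delta , \quad \operatorname{Re} f_h(z_0)\ge -C_2 . \tag{4.14} \]

If not, we would have a sequence of holomorphic functions $g_N$ such that

\[ g_N(\bar z) = -\overline{g_N(z)} , \ z\in R , \qquad |g_N(z)|\le C_1 , \ z\in R\cap\mathbb{R} , \]
\[ \operatorname{Re} g_N\le -N , \ \operatorname{Im} z>1/N , \qquad \operatorname{Re} g_N\le C , \ \operatorname{Im} z\ge 0 . \]

The Poisson formula applied as in the proof of Lemma 3.4 shows that

\[ \operatorname{Re} g_N \le -N\operatorname{Im} z/C , \quad z\in D(\tilde a,\rho) , \quad \operatorname{Im} z\ge 0 , \]

for some $\rho$ and $C$ independent of $N$. Since $\operatorname{Re} g_N|_{\mathbb{R}}=0$ we conclude that $\partial_{\operatorname{Im} z}\operatorname{Re} g_N|_{\mathbb{R}}\le -N/C$. From the Cauchy-Riemann equations we now get

\[ \partial_{\operatorname{Re} z}\operatorname{Im} g_N(z) \ge N/C , \quad z\in D(\tilde a,\rho)\cap\mathbb{R} , \]

and that contradicts the uniform boundedness of $g_N$ on $R\cap\mathbb{R}$. Hence (4.14) holds and the lemma is proved. ✷

When the dimension is large enough we obtain the following stronger result:

Lemma 4.6 Suppose that $n\ge 5$. For any $\Omega=[a,b]+i(0,c)$, $0<a<b$, $c>0$, there exist $\delta>0$ and $C$, such that for any $0<h\le h_0$, there exists $z_0=z_0(h)$, which satisfies

\[ \log|s(z_0,h)| \ge -Ch^{-n} , \quad z_0\in\Omega , \quad \operatorname{Im} z_0>\delta . \tag{4.15} \]

The constant $C$ depends only on $\Omega$ and the support of the perturbation, $B(0,R_0)$.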




Proof. We first make the following observation based on the proof of Lemma 4.3: fix any $H>0$, then for any $0<h\le H$, we have, for any compact set $\Omega_1\Subset\mathbb{C}$,

\[ |s(z,h)| \le Ce^{Ch^{-n}} , \quad z\in\Omega_1\cap\{\operatorname{Im} z>\min(1/C,h^M)\} , \tag{4.16} \]

where the constants depend only on $M$, $H$, and $R_0$. Put $P_\rho(h)=\rho^{-1}P(\sqrt{\rho}\,h)$, $\rho>0$. Then $P(h)|_{\mathbb{R}^n\setminus B(0,R_0)}=P_\rho(h)|_{\mathbb{R}^n\setminus B(0,R_0)}$ and $P_\rho(h)$ satisfies the black box assumptions of Sect. 2 (without uniformity with respect to $\rho$). If $s_\rho(z,h)$ is the scattering determinant corresponding to $P_\rho(h)$, then we have the following relation:

\[ s(w\rho,h) = s_\rho(w,h/\sqrt{\rho}) . \]

We can now apply (4.16) to $s_\rho$ and that gives

\[ |s_\rho(w,h/\sqrt{\rho})| \le C\exp\bigl(Ch^{-n}\rho^{\frac{n}{2}}\bigr) , \quad w\in\Omega_1\Subset\mathbb{C} , \quad \operatorname{Im} w>(h/\sqrt{\rho})^M . \]

By scaling, using $\rho\sim|z|$, this implies that

\[ |s(z,h)| \le C\exp\bigl(Ch^{-n}|z|^{\frac{n}{2}}\bigr) , \quad \operatorname{Im} z>h^2/C , \quad \operatorname{Re} z>0 , \]

if we take $M>2$. We now put $f_h(w)=s(h^2w,h)$, which in view of the previous estimate satisfies

\[ \log|f_h(w)| \le C|w|^{\frac{n}{2}}+C , \quad \operatorname{Im} w>1/C , \quad \operatorname{Re} w>0 , \]

uniformly with respect to $h$. Proposition 2.3 shows that there exist many $\tilde w$'s, $|\tilde w|\le 1$, $\operatorname{Im}\tilde w>1/C$, such that

\[ \log|f_h(\tilde w)| \ge -C \]

holds with a constant independent of $h$. We can now apply Lemma 3.2 and conclude that

\[ \log|f_h(re^{i\theta})| > -Cr^{n/2} , \quad r>r_0 , \quad \theta\in(0,\theta_0)\setminus\Sigma(r,h) , \quad |\Sigma(r,h)|<\delta_0 . \]

This implies the existence of $z_0(h)=h^2w_0(h)$, $\operatorname{Im} w_0(h)>\delta/h^2$, $|w_0(h)|\le C/h^2$, such that $z_0(h)$ satisfies the conditions in (4.15), and

\[ \log|s(z_0(h),h)| = \log|f_h(w_0(h))| > -C|w_0(h)|^{\frac{n}{2}} > -Ch^{-n} . \]

✷

Proof of Proposition 4.4. Since we clearly have a factorization given in (4.11), the only thing to check is the estimate on $g(z,h)$. The slight difference with the standard arguments lies in having estimates on $|s(z,h)|$ for $\operatorname{Im} z\ge 0$ only. The unitarity implies however that $g(z,h)=-\overline{g(\bar z,h)}$, and hence we only need to estimate $g$ for $\operatorname{Im} z\ge 0$. In that region, the bound (4.8), an application of Cartan's lemma (3.2), and the maximum principle give

\[ \operatorname{Re} g(z,h) \le C_1(h^{-n}+N(h)) , \quad \operatorname{Im} z\ge 0 , \quad z\in R_{\epsilon/2} . \]


Lemmas 4.5 and 4.6, and the trivial bound

\[ \frac{|z-\bar w|}{|z-w|} \le 1 , \quad \operatorname{Im} z\ge 0 , \quad \operatorname{Im} w\le 0 , \tag{4.17} \]

give an existence of $z_0=z_0(h)\in R$, $\operatorname{Im} z_0\ge\delta>0$, such that

\[ \operatorname{Re} g(z_0,h) \ge \begin{cases} -C_2h^{-n^\sharp} , & n\ge 1 ,\\ -C_2h^{-n} , & n\ge 5 . \end{cases} \]

When $n\ge 5$, Harnack's inequality, applied to the harmonic function $G(z,h)=2C_1(h^{-n}+N(h))-\operatorname{Re} g(z,h)$, positive for $\operatorname{Im} z\ge 0$, $z\in R_{\epsilon/2}$, shows that

\[ |\operatorname{Re} g(z,h)| \le \frac{1}{\rho}\,C(N(h)+h^{-n}) , \quad z\in R_{\epsilon/4} , \quad \operatorname{Im} z>\rho . \tag{4.18} \]

In fact, if $0<\rho<\operatorname{Im} z_0$ is such that $D(z_0,\operatorname{Im} z_0-\rho)\subset R_{\epsilon/2}$, we have

\[ \max_{z\in D(z_0,\operatorname{Im} z_0-\rho)}G(z,h) \le \frac{2|z_0|}{\rho}\,G(z_0,h) \le \frac{2|z_0|}{\rho}\bigl((2C_1+C_2)h^{-n}+2C_1N(h)\bigr) . \]

Using this inequality with different $\rho$ and $z_0$, we get the bound (4.18) for all $z\in R$, $\operatorname{Im} z>\rho$. In view of (4.18), we can apply Lemma 3.4 to $u(z,h)=(h^{-n}+N(h))^{-1}\operatorname{Re} g(z,h)$ and deduce the estimate

\[ |\operatorname{Re} g(z,h)| \le C(h^{-n}+N(h))\,|\operatorname{Im} z| , \quad z\in R_{\epsilon/4} , \tag{4.19} \]

which combined with the Carathéodory inequality gives the bound

\[ |g(z,h)| \le C(h^{-n}+N(h)) , \quad z\in R . \tag{4.20} \]

Recalling (4.17), it also gives (4.12). We proceed similarly for lower dimensions.

5 Local trace formula for resonances

As an application of the results of Sect. 4 we present a proof of a slight improvement of Sjöstrand's local trace formula in the setting of semi-classical compactly supported perturbations. We stress that it depends only on the upper bound on the number of resonances (4.1), the factorization of the scattering determinant (4.11), and on the Birman-Krein formula (2.10). It is essentially a localized version of the arguments of [11] and [35].




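Theorem 1 Suppose that $P(h)$ satisfies the assumptions of Sect. 2. Let $\Omega$, $\Omega\Subset\{\operatorname{Re} z>0\}$, be an open, simply connected set such that $\Omega\cap\mathbb{R}$ is connected. Suppose that $f$ is holomorphic on a neighborhood of $\Omega$ and that $\psi\in C_c^\infty(\mathbb{R})$ satisfies

\[ \psi(\lambda) = \begin{cases} 0 , & d(\Omega\cap\mathbb{R},\lambda)>2\epsilon ,\\ 1 , & d(\Omega\cap\mathbb{R},\lambda)<\epsilon , \end{cases} \]

where $\epsilon>0$ is sufficiently small. Then

\[ \operatorname{tr}(\psi f)(P(h)) = \sum_{z\in\operatorname{Res}(P(h))\cap\Omega}f(z) + E_{\Omega,f,\psi}(h) , \tag{5.1} \]
\[ |E_{\Omega,f,\psi}(h)| \le M(\psi,\Omega)\,h^{-n^\sharp}\sup\{|f(z)| : 0<d(z,\Omega)<2\epsilon ,\ \operatorname{Im} z\le 0\} , \]

where $\operatorname{tr}$ is defined in (2.9) and $n^\sharp$ is as in (4.1).

Remark. We note that unlike in [22], [23] we only estimate the function $f$ in the lower half plane to control the error $E_{\Omega,f,\psi}(h)$.

Proof. The Birman-Krein formula recalled in Sect. 2 shows that

\[ \operatorname{tr}(\psi f)(P(h)) = \int(\psi f)(\lambda)\frac{d\sigma}{d\lambda}(\lambda,h)\,d\lambda + \sum_{\lambda\in\sigma(P(h))}(\psi f)(\lambda) . \tag{5.2} \]

Let $\tilde\psi\in C_c^\infty(\mathbb{C})$ be an almost analytic extension of $\psi$ satisfying

\[ \operatorname{supp}\bar\partial_z\tilde\psi \subset \{z : \epsilon\le d(z,\Omega)\le 2\epsilon\} , \]

which can certainly be arranged. We note that this implies that $\tilde\psi\equiv 1$ on $\Omega$. An application of Green's formula gives

\[ \operatorname{tr}(\psi f)(P(h)) = \sum_{z\in\operatorname{Res}(P(h))}(\tilde\psi f)(z) + \frac{1}{\pi}\int_{\mathbb{C}_-}\bar\partial_z\tilde\psi(z)\,f(z)\,\frac{\partial_zs(z,h)}{s(z,h)}\,L(dz) , \]

where we used the definition of the scattering phase $\sigma(\lambda,h)$ given in (2.8), and where $L(dz)$ denotes the Lebesgue measure on $\mathbb{C}$. Notice that if $\lambda\in\sigma(P(h))$, then $\lambda\in\operatorname{Res}(P(h))$ so the eigenvalues are included in the first term. On the other hand, $\partial_zs(z,h)/s(z,h)$ is regular on $\Omega\cap\mathbb{R}$ which justifies the application of Green's formula. We first note that the properties of $\tilde\psi$ and Proposition 4.1 show that

\[ \sum_{z\in\operatorname{Res}(P(h))}(\tilde\psi f)(z) = \sum_{z\in\operatorname{Res}(P(h))\cap\Omega}f(z) + O(h^{-n^\sharp})\sup\{|f(z)| : 0<d(z,\Omega)<2\epsilon ,\ \operatorname{Im} z\le 0\} . \]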


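Using the elementary inequality

\[ \int_{\Omega_1}\frac{1}{|z-w|}L(dz) \le \int_{D(w,\rho)}\frac{1}{|z-w|}L(dz) + \int_{\Omega_1\setminus D(w,\rho)}\frac{1}{|z-w|}L(dz) \le 2\pi\rho+\frac{1}{\rho}|\Omega_1| \le 2\sqrt{2\pi|\Omega_1|} , \quad \rho=(|\Omega_1|/(2\pi))^{\frac12} , \tag{5.3} \]

(4.1) and (4.11) conclude the proof, as, with $\Omega\subset R$,

\[ \Bigl|\frac{s'(z,h)}{s(z,h)}\Bigr| \le |g'(z,h)| + \sum_{w\in\operatorname{Res}(P(h))\cap R_\epsilon}\Bigl(\frac{1}{|z-w|}+\frac{1}{|z-\bar w|}\Bigr) . \]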

✷
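The choice of $\rho$ in (5.3) balances the two contributions $2\pi\rho$ (the disc around $w$) and $|\Omega_1|/\rho$ (the rest of $\Omega_1$). The following minimal sketch checks that balance numerically. The function names and the sample area are illustrative assumptions, not taken from the paper.

```python
import math

def split_bound(area, rho):
    """Right-hand side 2*pi*rho + area/rho of the elementary inequality,
    where area stands for the Lebesgue measure |Omega_1|."""
    return 2 * math.pi * rho + area / rho

def balancing_rho(area):
    """Radius rho = (area / (2*pi))**(1/2), the value used in (5.3)."""
    return math.sqrt(area / (2 * math.pi))

area = 3.0                              # hypothetical value of |Omega_1|
rho = balancing_rho(area)
best = split_bound(area, rho)           # equals 2*sqrt(2*pi*area)
```

At this radius both contributions equal $\sqrt{2\pi|\Omega_1|}$, and any other radius gives a larger right-hand side.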

6 Breit-Wigner approximation

We now establish the semi-classical version of the Breit-Wigner approximation and throughout this section we assume that $n^\sharp=n$. Again, it is a purely complex-analytic consequence of the estimate on the scattering determinant, and of the existence of a good remainder in the Weyl law for the scattering phase. It generalizes the large energy result of [18].

Theorem 2 Suppose that σ(P (h)) ∩ (0, ∞) = ∅, and that the spectral condition (2.14) holds for E in a neighbourhood of λ > 0 and for µ sufficiently small. Then for any 0 < δ < h/C we have σ(λ + δ, h) − σ(λ − δ, h) = ωC− (z, [λ − δ, λ + δ]) + O(δ)h−n , (6.1) |z−λ| 0, and h/C < δ < 1/C # {z : z ∈ Res (P (h)) , |z − λ| < δ} = O(δh−n ) . (6.2)




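Proof. We recall that the spectral assumption (2.14) implies that the scattering phase satisfies

\[ \sigma(\lambda+2\delta,h)-\sigma(\lambda-2\delta,h) = O(\delta h^{-n}) . \]

As in [18, Proposition 1] we now show that

\[ \sigma(\lambda+2\delta,h)-\sigma(\lambda-2\delta,h) \ge \frac12\,\#\{z\in\operatorname{Res}(P(h)) : |z-\lambda|<\delta\} - O(\delta h^{-n}) , \tag{6.3} \]

which then implies the lemma.

To see (6.3), we apply (4.11) with $R$ centered at $\lambda$, so that

\[ |\sigma(\lambda+2\delta,h)-\sigma(\lambda-2\delta,h)| = \Bigl|\frac{1}{2\pi i}\int_{\lambda-2\delta}^{\lambda+2\delta}\frac{s'(t,h)}{s(t,h)}\,dt\Bigr| = \frac{1}{2\pi}\Bigl|\int_{\lambda-2\delta}^{\lambda+2\delta}\Bigl(g'(t,h)-\sum_{z\in\operatorname{Res}(P(h))\cap R_\epsilon}\frac{2i\operatorname{Im} z}{|z-t|^2}\Bigr)dt\Bigr| \]
\[ \ge \frac{1}{\pi}\sum_{z\in\operatorname{Res}(P(h))\cap R_\epsilon}\int_{\lambda-2\delta}^{\lambda+2\delta}\frac{|\operatorname{Im} z|}{|z-t|^2}\,dt - O(\delta h^{-n}) \ge \frac12\,\#\{z\in\operatorname{Res}(P(h)) : |z-\lambda|<\delta\} - O(\delta h^{-n}) , \tag{6.4} \]

since for $0<y<\delta$ and $|x-\lambda|<\delta$ we have

\[ \int_{\lambda-2\delta}^{\lambda+2\delta}\frac{y}{(x-t)^2+y^2}\,dt \ge \int_{-\delta/y}^{\delta/y}\frac{1}{1+r^2}\,dr \ge \frac{\pi}{2} . \]

✷

We need one more lemma which is an $h$-local version of Proposition 4.4:

Lemma 6.2 Let $\Omega(h)=\{z : |z-\lambda|\le C_1h\}$, $\lambda>0$, and, for $|z-\lambda|<C_2h$, $0<C_2<C_1$, put

\[ s(z,h) = e^{g_\lambda(z,h)}\frac{P_\lambda(\bar z,h)}{P_\lambda(z,h)} , \qquad P_\lambda(z,h) = \prod_{w\in\operatorname{Res}(P(h))\cap\Omega(h)}(z-w) . \]

Then under the assumptions of Theorem 2 we can choose $g_\lambda$ so that

\[ |g_\lambda(z,h)| \le Ch^{-n+1} , \quad |z-\lambda|\le C_2h . \]

Proof. We will use the factorization in Proposition 4.4 in the domain $R=\Omega=(\lambda/2,3\lambda/2)+i(-c,c)$, $c>0$, and we denote by $g(z,h)$ the corresponding holomorphic function and recall that $P(z,h)=\prod_{w\in\operatorname{Res}(P(h))\cap\Omega_\epsilon}(z-w)$. Comparing the expressions for $s(z,h)$, we see that

\[ g_\lambda(z,h) = g(z,h) + \log\frac{P(\bar z,h)P_\lambda(z,h)}{P_\lambda(\bar z,h)P(z,h)} , \]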


and we need to show that the second term on the right hand side is bounded by Ch−n+1 for |z − λ| < C2 h. In fact, the real part of the first term is bounded by C| Im z|h−n = O(h−n+1 ) because of (4.19) and by Carathéodory inequality we conclude that this term is O(h−n+1 ). Now we will show that d P (¯ z , h)Pλ (z, h) log = dz P (z, h)Pλ (¯ z , h) (6.5) 1 1 −n − ≤ Ch , w∈Res (P (h))∩Ω z − w ¯ z−w |w−λ|>C h 1

for |z − λ| < C2 h, from which the needed estimate follows by integration and a choice of the branch of logarithm. To see (6.5), we proceed as in [19] and rewrite the expression to be estimated as 2| Im w| | Re z − w|2 w∈Res (P (h))∩Ω |w−λ|>C1 h

Im z

+ 0

1 1 − 2 ((Re z + iy) − w) ((Re z + iy) − w) ¯ 2

dy .

(6.6)

The sum of the integrated terms is harmless as 1 1 ≤ 2 |z − w| |z − w|2 w∈Res (P (h))∩Ω w∈Res (P (h))∩Ω |w−λ|>C1 h

|w−z|>(C1 −C2 )h

C log(1/h)

k=1

C3 2k h≤|z−w|C1 h

1 − h

π

log |s(λ + C1 he , h)| sin θdθ iθ

0

,




where we used the fact that |s(z, h)| = 1 for z real and r > 0 is chosen so that Ω ⊂ {w ∈ C : |w − λ| < r}. By Lemma 4.3 the first integral is bounded from above by Ch−n . To estimate the absolute value of the second integral, we rewrite it as follows. We put Ωλ,h = {z : Im z ≥ 0 , |z − λ| ≤ C1 h}, define Γλ,h , as its boundary, denote by L(dz) the Lebesgue measure on C and use Green’s formula: C1 π 1 iθ log |s(λ + C1 he , h)| sin θdθ = − 2 Re log |s(z, h)|dz h 0 h Γ λ,h 1 2i ∂¯z log |s(z, h)|L(dz) = − 2 Re h Ωλ,h 1 i ∂z log s(z, h)L(dz) . = 2 Re h Ωλ,h The integrand in this last integral can be rewritten as  1 i  1 − g (z, h) − h2 z−w z−w ¯ w∈Res (P (h))∩Ω |w−λ|≤C1 h

−

w∈Res (P (h))∩Ω |w−λ|>C1 h

1 1 − z−w z−w ¯

  .

(6.7)

The integral of the first term is estimated by 1

−n , |g (z, h)|L(dz) ≤ Ch−n−2 |Ωλ,h | ≤ Ch h2 Ωλ,h and that of the second one by 1 1 1 − L(dz) 2 h z − w z − w ¯ Ωλ,h w∈Res (P (h))∩Ω |w−λ|≤C1 h 1 1 1 ≤ 2 + L(dz) h |z − w| |z − w| ¯ Ωλ,h w∈Res (P (h))∩Ω |w−λ|≤C1 h 1 C ≤ 2 |Ωλ,h | 2 h−n+1 ≤ Ch−n , h

where we used (5.3) and Lemma 6.1. It remains to estimate the integral of the last term in (6.7) (the sum over |w − λ| > C1 h). That term is exactly the left hand side of (6.5), and we rewrite it


again as in (6.6)3 The second term in (6.6) is treated the same way as before, and the first term is estimated using (6.4): | Im w| | Im w| 1 C λ+Ch L(dz) ≤ dt 2 2 h | Re z − w| h |t − w|2 λ−Ch w∈Res (P (h))∩Ω Ωλ,h w∈Res (P (h))∩Ω |w−λ|>C1 h

|w−λ|>C1 h

C

−n , ≤ Ch−n+1 ≤ Ch h ✷

and this estimate completes the proof of the lemma.

Proof of Theorem 2. In the notations of (4.11) and (6.1), and for 0 < δ < h/C we get λ+δ s (t, h) 1 σ(λ + δ, h) − σ(λ − δ, h) = dt 2πi λ−δ s(t, h)   λ+δ 2i Im z  1  = gλ (t, h) −  dt 2πi λ−δ |z − t|2 z∈Res (P (h))∩R =

|z−λ| 0, let Ωγ = (a − γ, b + γ) − i(0, c), 0 < a < b, 0 < c, and let z0 = z0 (h), z¯0 ∈ Ω0 satisfy Im z0 (h) > 2δ > 0, with 0 < δ < 1 fixed. Suppose that ψ ± ∈ Cc∞ (R; [0, 1]) have the properties that ψ ± ≡ 1 in Ω± ∩ R and supp ψ ± ⊂ Ω ± ∩ R. Then we have # Res (P (h)) ∩ Ω2 ∩ {Im z < −δ} ≤ C1 h−n − C2 log |s(z0 (h), h)| , and

dσ (λ, h)dλ − E − (h) dλ dσ ≤ # Res (P (h)) ∩ Ω0 ≤ ψ + (λ) (λ, h)dλ + E + (h) , dλ

(7.1)

ψ − (λ)

(7.2)

where, in the notation of (4.1), √ |E ± (h)| ≤ A0 ( δ + ) # Res (P (h)) ∩ Ω3 \ Ω− + A1 h−n − A2 log |s(z0 (h), h)| , with the constants A0 = A0 (R0 , Ω0 ), Ai = Ai (R0 , Ω0 , , δ), i = 1, 2, which do not depend on P (h). Proof. We first observe that Lemma 3.1, Jensen’s inequality, and (4.8) imply (7.1), and that there exist z’s satisfying log |s(z, h)| ≥ C1 h−n − C2 log |s(z0 (h), h)| , z ∈ Ω2 ∩ {Im z > δ/2} , for any δ > 0. The factorization argument, as in the proof of (4.11), now shows that for z ∈ Ω2 , | Im z| > δ, we have s(z, h) = egδ (z,h)

Pδ (¯ z , h) , Pδ (z, h) = Pδ (z, h)

(z − w) ,

(7.3)

w∈Res (P (h))∩Ω3 Im w δ} , where the new constants again depend only on R0 as far as the dependence on P (h) is concerned. We now proceed as in the proof of Theorem 1: let ψ˜ ± ∈ Cc∞ (C; [0, 1]) be an almost analytic extension of ψ ± satisfying supp ∂¯z ψ˜ ± ⊂ Ω ± \ Ω± .



Green’s formula then gives dσ ψ ± (λ) (λ, h)dλ = dλ

1 ψ˜ ± (z) + π z∈Res (P (h)) = @(Res (P (h)) ∩ Ω0 ) +

+

∂z s(z, h) L(dz) ∂¯z ψ˜ ± (z) s(z, h) C−

ψ˜ ± (z)

z∈Res (P (h))\Ω0

1 ± ˜ ψ (z) − 1 + π

z∈(Res (P (h))∩Ω0 )


∂z s(z, h) L(dz) , ∂¯z ψ˜ ± (z) s(z, h) C−

and if we call the sum of the last three terms on the right hand side E ± (h), then (7.2) holds and we need to estimate E ± (h). We first use (5.3) and deduce from (4.11) (just as in the proof of Theorem 1) that ¯z ψ˜± (z) ∂z s(z, h) L(dz) ∂ s(z, h) −δ≤Im z≤0

√ ≤ C0 δ max h−n , # (Res (P (h)) ∩ (Ω3 \ Ω− )) .

For Im z < −δ we use the improved factorization (7.3) which, again as in the proof of Theorem 1, gives ¯z ψ˜± (z) ∂z s(z, h) L(dz) ≤ C5 h−n − C6 log |s(z0 (h), h)| , ∂ s(z, h) Im z 0 fixed, 0 < a < b. Then for any δ > 0 we have Nδ ([a, b], h) ≤ C(R0 , δ, a, b)h−n , 0 < h ≤ h0 (R0 , δ, a, b) .


If N ([a, b], h) = # σ(P (h)) ∩ [a, b] is the counting function for the reference operator, then for any > 0 we have N ([a + , b − ], h) − E− (h) ≤ N0 ([a, b], h) ≤ N ([a − , b + ], h) + E+ (h) , 0 ≤ E± (h) ≤ Ch−n + C(R0 , )h−n + C(, P )h−n

+1

.

Remark. The theorem is stated in a weaker form than actually available: if we use the optimal version of Proposition 4.1 discussed in the remark following it, we can replace h−n by a better bound in the estimates on E± (h). Proof. When we apply (2.12) and (7.2) we only need to check that d ± d ± ψ (λ)(σ(λ, h) − σ(a± , h))dλ ψ (λ) (σ(λ, h) − σ(a± , h))dλ = − dλ dλ d ± ψ (λ) N ([a± , λ], h) + OP (h−n +1 ) dλ =− dλ  −  ≥ N ([a + , b − ], h) − O ,P (h−n +1 ), ,  ≤ N ([a − 2, b + 2], h) + O ,P (h−n +1 ), + with a+ = a − 2, a− = a. An application of Proposition 4.1 to estimate #Res (P (h)) ∩ (Ω3 \ Ω− ) completes the proof (we take δ = 2 and we change in the estimate involving N ([a − 2, b + 2], h)). ✷ With this in place we immediately obtain Sj¨ ostrand’s bottle theorem [23] for compactly supported perturbations: Theorem 5 Suppose that P satisfies the assumptions of Sect.2 with h = 1. Let Nδ (r) = #{z ∈ Res(P ) : 1 ≤ |z| ≤ r , −π/2 < arg(z) < −δ}. Then for δ > 0 we have Nδ (r) ≤ C(δ, R0 )rn , r ≥ r0 (δ, P ) ,

(7.4)

where C(δ, R0 ) does not depend on P . For any > 0, and r ≥ r1 (, P ), we have N ((1 − )r) − E− (r) ≤ N0 (r) ≤ N ((1 + )r) + E+ (r) ,

0 ≤ E± (r) ≤ C0 rn + C1 (R0 , )rn + C2 (, P )rn

−1

(7.5)

.

where, as indicated, the constants C0 and C1 (R0 , ) in the error terms do not 2 depend on P , and where N (r) = @ σ(P ) ∩ [1, r ] is the normalized counting function of eigenvalues of the reference operator P . When n ≥ 5 then r0 (δ, P ), and r1 (, P ) depend only on R0 .




Proof. This is a straightforward application of Theorem 4. We only comment on the case of $n<5$. In that case, we can apply the proof of Lemma 4.6 to obtain a desired lower bound on the scattering determinant since we always have $\log|s(z)|>-C_P$ at some $z$, $\operatorname{Im} z>0$, $|z|\le C$. We refer to [19] for more details. ✷

To illustrate the theorem we conclude with two examples which are implicit in [23]:

Example 7.1 Let $P=-\Delta_g$ be a metric perturbation of the Laplacian which satisfies $\operatorname{vol}_g(B(0,R_0))\gg R_0^n$. Then the number of resonances in any conic neighbourhood of the real axis is comparable to $r^n$, if $r$ is sufficiently large. In fact, a scaling argument shows that the constants depending on $R_0$ in (7.5) are all bounded by $CR_0^n$. This generalizes the estimate given in [25, Example 3].

Example 7.2 Suppose that $N(r)\sim Cr^p\log^qr$ where $p+q>n$. Such examples can be obtained by considering hypoelliptic operators, see [26, Example 5.1] and references given there. Here we use a stronger version of Theorem 5 as discussed in the remark following the statement of Theorem 4. We then obtain that

\[ N_0(r) = Cr^p\log^qr\,(1+o(1)) , \]

which was first proved by Vodev [33].

Note added in proofs. By combining ideas of this paper with the techniques of [23], some of our results have been generalized to larger classes of perturbations by V. Bruneau and the first author. A new, slightly simpler, proof of Theorem 2 has been provided there as well.

References

[1] S. Agmon, A perturbation theory for resonances, Comm. Pure Appl. Math. 51, 1255–1309 (1998).
[2] J.-F. Bony, Majoration du nombre de résonances dans des domaines de taille h, preprint (2000).
[3] J.-F. Bony and J. Sjöstrand, Trace formula for resonances in small domains, preprint (2000).
[4] V. Bruneau and V. Petkov, Representation of the spectral shift function and spectral asymptotics for trapping perturbations, preprint (2000).
[5] M. Cartwright, Integral Functions, Cambridge University Press (1956).
[6] T. Christiansen, Spectral asymptotics for general compactly supported perturbations of the Laplacian on R^n, Comm. P.D.E. 23, 933–947 (1998).


[7] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the semi-classical limit, Cambridge University Press (1999).
[8] C. Gérard and A. Martinez, Prolongement méromorphe de la matrice de scattering pour des problèmes à deux corps à longue portée, Ann. Inst. H. Poincaré (Physique Théorique) 51, 81–110 (1989).
[9] I.C. Gohberg and M.G. Krein, Introduction to the Theory of Linear Nonself-adjoint Operators, Translations of Mathematical Monographs 18, A.M.S., Providence (1969).
[10] I.C. Gohberg and E.I. Sigal, An operator generalization of the logarithmic residue theorem and the theorem of Rouché, Math. USSR Sbornik 13, 603–624 (1971).
[11] L. Guillopé and M. Zworski, Scattering asymptotics for Riemann surfaces, Ann. of Math. 129, 597–660 (1997).
[12] W.K. Hayman, Subharmonic Functions, vol. II, Academic Press, London (1989).
[13] F. Klopp and M. Zworski, Generic simplicity of resonances, Helv. Phys. Acta 68, 531–538 (1995).
[14] R.B. Melrose, Polynomial bound on the number of scattering poles, J. Funct. Anal. 53, 287–303 (1983).
[15] R.B. Melrose, Weyl asymptotics for the phase in obstacle scattering, Comm. P.D.E. 13, 1431–1439 (1988).
[16] R.B. Melrose, Geometric Scattering Theory, Cambridge University Press (1996).
[17] V. Petkov and L. Stoyanov, Sojourn times of trapping rays and the behavior of the modified resolvent of the Laplacian, Ann. Inst. H. Poincaré (Physique Théorique) 62, 17–45 (1995).
[18] V. Petkov and M. Zworski, Breit-Wigner approximation and the distribution of resonances, Commun. Math. Phys. 204, 329–351 (1999).
[19] V. Petkov and M. Zworski, Erratum to [18], Commun. Math. Phys. 214, 733–735 (2000).
[20] D. Robert, Relative time-delay for perturbations of elliptic operators and semiclassical asymptotics, J. Funct. Anal. 126, 36–82 (1994).
[21] N. Shenk and D. Thoe, Resonant states and poles of the scattering matrix for perturbations of −∆, J. Math. Anal. Appl. 87, 467–491 (1972).




[22] J. Sjöstrand, A Trace Formula and Review of Some Estimates for Resonances, in Microlocal analysis and spectral theory (Lucca, 1996), 377–437, NATO Adv. Sci. Inst. Ser. C, Math. Phys. Sci., 490, Kluwer Acad. Publ., Dordrecht (1997).
[23] J. Sjöstrand, Resonances for bottles and related trace formulæ, Math. Nachr. 221, 95–149 (2001).
[24] J. Sjöstrand and M. Zworski, Complex Scaling and the Distribution of Scattering Poles, Journal of AMS 4, 729–769 (4) (1991).
[25] J. Sjöstrand and M. Zworski, Lower bounds on the number of scattering poles, Comm. P.D.E. 18, 847–857 (1993).
[26] J. Sjöstrand and M. Zworski, Lower bounds on the number of scattering poles, II, J. Funct. Anal. 123 (4), 336–367 (1994).
[27] S.H. Tang and M. Zworski, From quasimodes to resonances, Math. Res. Lett. 5, 261–272 (1998).
[28] S.H. Tang and M. Zworski, Resonance expansions of scattered waves, Comm. Pure Appl. Math. 53, 1305–1334 (2000).
[29] E.C. Titchmarsh, The Theory of Functions, Oxford University, Oxford (1968).
[30] A. Vasy, Geometric scattering theory for long-range potentials and metrics, Internat. Math. Res. Notices 6, 285–315 (1998).
[31] G. Vodev, Sharp polynomial bounds on the number of scattering poles for perturbations of the Laplacian, Commun. Math. Phys. 146, 39–49 (1992).
[32] G. Vodev, On the distribution of scattering poles for perturbations of the Laplacian, Ann. Inst. Fourier (Grenoble) 42, 625–635 (1992).
[33] G. Vodev, Asymptotics on the number of scattering poles for degenerate perturbations of the Laplacian, J. Funct. Anal. 138, 295–310 (1996).
[34] M. Zworski, Sharp polynomial bounds on the number of scattering poles, Duke Math. J. 59, 311–323 (1989).
[35] M. Zworski, Poisson formulæ for resonances, Séminaire E.D.P. 1996-1997, École Polytechnique, XIII-1-XIII-12.
[36] M. Zworski, Poisson formulæ for resonances in even dimensions, Asian J. Math. 2 (3), 609–617 (1998).
[37] M. Zworski, Resonances in physics and geometry, Notices Amer. Math. Soc. 46, 319–328 (1999).


[38] M. Zworski, Singular part of the scattering matrix determines the obstacle, preprint (1999), Osaka J. Math., to appear.

Vesselin Petkov
Département de Mathématiques Appliquées
Université de Bordeaux I
351, Cours de la Libération
F-33405 Talence
France
email: [email protected]

Maciej Zworski
Mathematics Department
University of California
Evans Hall, Berkeley, CA 94720
USA
email: [email protected]

Communicated by Bernard Helffer
submitted 02/10/00, accepted 31/01/01




Precise Asymptotic Formulas for Semilinear Eigenvalue Problems

T. Shibata

Abstract. We consider the nonlinear Sturm-Liouville problem

\[ -u''(t) = |u(t)|^{p-1}u(t)-\lambda u(t) , \quad t\in I:=(0,1) , \qquad u(0)=u(1)=0 , \]

where $p>1$ and $\lambda\in\mathbb{R}$ is an eigenvalue parameter. To investigate the global $L^2$-bifurcation phenomena, we establish asymptotic formulas for the $n$-th bifurcation branch $\lambda=\lambda_n(\alpha)$ with precise remainder term, where $\alpha$ is the $L^2$ norm of the eigenfunction associated with $\lambda$.

1 Introduction

In this paper we study global $L^2$-bifurcation phenomena associated with the nonlinear Sturm-Liouville problem

\[ -u''(t) = |u(t)|^{p-1}u(t)-\lambda u(t) , \quad t\in I:=(0,1) , \tag{1.1} \]
\[ u(0)=u(1)=0 , \tag{1.2} \]

where $p>1$ and $\lambda\in\mathbb{R}$ is an eigenvalue parameter.

As is well known (cf. Berestycki [2]), $(-(n\pi)^2,0)\in\mathbb{R}\times C^2(\bar I)$ are the bifurcation points of (1.1)–(1.2) for $n\in\mathbb{N}$ and $\{(\lambda,u_{n,\lambda}) : \lambda>-(n\pi)^2\}\subset\mathbb{R}\times C^2(\bar I)$ form $C^1$-curves emanating from $(-(n\pi)^2,0)$, where $u_{n,\lambda}\in C^2(\bar I)$ is the eigenfunction associated with $\lambda$ which has exactly $n-1$ interior zeros and is positive near $t=0$. Since (1.1)–(1.2) is an eigenvalue problem, a detailed study of the global behavior of these bifurcation branches in the $L^2$ framework is important. To this end, we consider the solution pair $(\lambda_n(\alpha),u_{n,\alpha})\in\mathbb{R}\times C^2(\bar I)$ of (1.1)–(1.2) with the properties

(i) $\|u_{n,\alpha}\|_{L^2(I)}=\alpha>0$,
(ii) $u_{n,\alpha}$ has exactly $n-1$ interior zeros in $I$,
(iii) $u_{n,\alpha}(t)>0$ near $t=0$,

and establish precise asymptotic formulas for $\lambda_n(\alpha)$ when $\lambda_n(\alpha)\gg 1$.

As far as the author knows, the first contribution to this problem is the following result due to Benguria and Depassier [1]: Let $p=3$. Then as $\alpha\to\infty$

\[ \lambda_1(\alpha) = \frac{1}{16}(1+o(1))\alpha^4 . \tag{1.3} \]



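Motivated by this, Shibata [6] established the following asymptotic formula to study the first term of $\lambda_n(\alpha)$ for $p>1$: as $\lambda\to\infty$

\[ \alpha^2 := \|u_{n,\lambda}\|^2_{L^2(I)} = nC_1\lambda^{(5-p)/(2(p-1))} + O\bigl(\lambda^{(5-p)/(2(p-1))-1/2}\bigr) , \tag{1.4} \]

where

\[ C_1 = 2\Bigl(\frac{p+1}{2}\Bigr)^{2/(p-1)}C_2 , \qquad C_2 = \frac{\sqrt{\pi}}{p-1}\,\Gamma\Bigl(\frac{2}{p-1}\Bigr)\Big/\Gamma\Bigl(\frac{p+3}{2(p-1)}\Bigr) , \qquad \Gamma(q) = \int_0^\infty x^{q-1}e^{-x}\,dx \quad (q>0) . \tag{1.5} \]

For the related topics, we also refer to Shibata [7]. Then (1.4) gives us the formulas

\[ \lambda_n(\alpha) = K_1n^{2(p-1)/(p-5)}\alpha^{4(p-1)/(5-p)} + o\bigl(\alpha^{4(p-1)/(5-p)}\bigr) \quad\text{as } \alpha\to\infty \ (1<p<5) , \tag{1.6} \]
\[ \lambda_n(\alpha) = K_1n^{2(p-1)/(p-5)}\alpha^{4(p-1)/(5-p)} + o\bigl(\alpha^{4(p-1)/(5-p)}\bigr) \quad\text{as } \alpha\to 0 \ (p>5) , \tag{1.7} \]

where

\[ K_1 = C_1^{(2(p-1))/(p-5)} . \tag{1.8} \]

By (1.6)–(1.7), we understand the first term of $\lambda_n(\alpha)$. We also find that $\lambda_n(\alpha)\to\infty$ as $\alpha\to\infty$ ($1<p<5$) (resp. $\alpha\to 0$ ($p>5$)). This drives us to the natural question: What is the remainder term of $\lambda_n(\alpha)$?

For the case $p=3$, we can obtain the second term of $\lambda_1(\alpha)$ as follows. In this case, $(\lambda,u_{1,\lambda})$ is given parametrically by

\[ \lambda = 4K(k)^2(2k^2-1) , \qquad \|u_{1,\lambda}\|_{L^2(I)} = \{8K(k)[E(k)-(1-k^2)K(k)]\}^{1/2} \qquad (1/\sqrt{2}\le k<1) \]

(cf. [1]). Here $K(k)$ and $E(k)$ are the complete elliptic integrals. Since $k\to 1$ corresponds to $\alpha\to\infty$, by using the asymptotic formulas for $K(k)$ and $E(k)$ as $k\to 1$ (cf. Gradshteyn and Ryzhik [4, pp. 905–906]), we obtain that as $\alpha\to\infty$

\[ \lambda_1(\alpha) = \frac{1}{16}\alpha^4 + \frac{1}{8}(1+o(1))\alpha^6e^{-\alpha^2/4} . \tag{1.9} \]

Motivated by (1.9), we shall establish the precise asymptotic formulas for $\lambda_n(\alpha)$ for general $p>1$. Now we state our results.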

Vol. 2, 2001

Precise Asymptotic Formulas for Semilinear Eigenvalue Problems

715

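Theorem 1. Let $n\in\mathbb{N}$ be fixed. Assume $1<p<5$. Then as $\alpha\to\infty$

\[ \begin{aligned} \lambda_n(\alpha) ={}& K_1n^{2(p-1)/(p-5)}\alpha^{4(p-1)/(5-p)} \\ &+ K_2n^{2(p+1)/(p-5)}\alpha^{6(p-1)/(5-p)}e^{-\sqrt{K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}} \\ &+ K_3n^{2(p-1)/(p-5)}\alpha^{4(p-1)/(5-p)}e^{-\sqrt{K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}} \\ &+ o\Bigl(\alpha^{4(p-1)/(5-p)}e^{-\sqrt{K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}}\Bigr) , \end{aligned} \tag{1.10} \]

where

\[ K_2 := 2^{4(p+1)(p-3)/((p-1)(p-5))}\,\frac{p-1}{5-p}\Bigl(\frac{p+1}{2}\Bigr)^{6/(p-5)}C_2^{2(p+1)/(p-5)} , \tag{1.11} \]
\[ K_3 := 2^{2(p+1)/(p-1)}\,\frac{K_1}{5-p}\Bigl(-4-\frac{p-1}{2C_2}+\frac{K_4}{C_2}\Bigr) , \tag{1.12} \]
\[ K_4 := \int_0^1\frac{1-s^{2/(p-1)}}{(1-s)^{3/2}}\,ds . \tag{1.13} \]

Furthermore, if $p>5$, then (1.10) holds as $\alpha\to 0$.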
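The constants of Theorem 1 can be evaluated numerically and compared with the case $p=3$ of (1.3) and (1.9), where the leading coefficients are $1/16$ and $1/8$. The sketch below is only a consistency check: it assumes the readings $C_1=2((p+1)/2)^{2/(p-1)}C_2$, $C_2=\frac{\sqrt{\pi}}{p-1}\Gamma(\frac{2}{p-1})/\Gamma(\frac{p+3}{2(p-1)})$, $K_1=C_1^{2(p-1)/(p-5)}$ and the forms of $K_2$, $K_3$, $K_4$ written in the code comments, which are reconstructions of the typeset formulas (1.5), (1.8), (1.11)–(1.13). The function name `shibata_constants` is hypothetical.

```python
import math

def shibata_constants(p, steps=20000):
    """Evaluate C1, C2, K1, K2, K3, K4 for 1 < p, p != 5 (assumed readings)."""
    q = 2.0 / (p - 1.0)
    # C2 = sqrt(pi)/(p-1) * Gamma(2/(p-1)) / Gamma((p+3)/(2(p-1)))
    C2 = math.sqrt(math.pi) / (p - 1.0) * math.gamma(q) / math.gamma((p + 3.0) / (2.0 * (p - 1.0)))
    # C1 = 2 * ((p+1)/2)^(2/(p-1)) * C2
    C1 = 2.0 * ((p + 1.0) / 2.0) ** q * C2
    # K1 = C1^(2(p-1)/(p-5))
    K1 = C1 ** (2.0 * (p - 1.0) / (p - 5.0))
    # K2 = 2^(4(p+1)(p-3)/((p-1)(p-5))) * (p-1)/(5-p) * ((p+1)/2)^(6/(p-5)) * C2^(2(p+1)/(p-5))
    K2 = (2.0 ** (4.0 * (p + 1.0) * (p - 3.0) / ((p - 1.0) * (p - 5.0)))
          * (p - 1.0) / (5.0 - p)
          * ((p + 1.0) / 2.0) ** (6.0 / (p - 5.0))
          * C2 ** (2.0 * (p + 1.0) / (p - 5.0)))
    # K4 = int_0^1 (1 - s^q) / (1 - s)^(3/2) ds; the substitution s = 1 - t^2
    # turns it into 2 * int_0^1 (1 - (1 - t^2)^q) / t^2 dt (midpoint rule below).
    h = 1.0 / steps
    K4 = h * sum(2.0 * (1.0 - (1.0 - t * t) ** q) / (t * t)
                 for t in ((k + 0.5) * h for k in range(steps)))
    # K3 = 2^(2(p+1)/(p-1)) * K1/(5-p) * (-4 - (p-1)/(2 C2) + K4/C2)
    K3 = 2.0 ** (2.0 * (p + 1.0) / (p - 1.0)) * K1 / (5.0 - p) * (-4.0 - (p - 1.0) / (2.0 * C2) + K4 / C2)
    return {"C1": C1, "C2": C2, "K1": K1, "K2": K2, "K3": K3, "K4": K4}

consts = shibata_constants(3.0)
```

For $p=3$ this gives $C_2=1$, $C_1=4$, $K_1=1/16$ and $K_2=1/8$, in agreement with the coefficients in (1.3) and (1.9).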

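Theorem 2. Let $n\in\mathbb{N}$ be fixed. Assume $p=5$. Then as $\alpha\uparrow(\sqrt{3}\,n\pi/2)^{1/2}$

\[ \lambda_n(\alpha) = n^2\Bigl\{\mu_n(\alpha)+\log\mu_n(\alpha)+\log 4\sqrt{3} + \frac{1}{\mu_n(\alpha)}\log\mu_n(\alpha) + \frac{\log 4\sqrt{3}-2}{\mu_n(\alpha)}(1+o(1))\Bigr\}^2 , \tag{1.14} \]

where

\[ \mu_n(\alpha) = -\log\Bigl(\frac{\sqrt{3}}{2}\pi-\frac{\alpha^2}{n}\Bigr) . \tag{1.15} \]

The following Theorems 3 and 4 give the asymptotics of the $L^\infty$-norm of the corresponding solutions with respect to $\lambda$ and $\alpha$.

Theorem 3. Let $n\in\mathbb{N}$ be fixed. Then as $\lambda\to\infty$

\[ \|u_{n,\lambda}\|_\infty = \Bigl(\frac{p+1}{2}\Bigr)^{1/(p-1)}\lambda^{1/(p-1)}\Bigl\{1+\frac{1}{p-1}\,2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}/n}+o\bigl(e^{-\sqrt{\lambda}/n}\bigr)\Bigr\} . \tag{1.16} \]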



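Theorem 4. Let $n\in\mathbb{N}$ be fixed. Assume that $1<p<5$. Then as $\alpha\to\infty$

\[ \begin{aligned} \|u_{n,\alpha}\|_\infty ={}& \Bigl(\frac{(p+1)K_1}{2}\Bigr)^{1/(p-1)}n^{2/(p-5)}\alpha^{4/(5-p)} \\ &\times\Bigl\{1+\frac{K_2}{(p-1)K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}e^{-\sqrt{K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}} \\ &\qquad+\frac{1}{p-1}\Bigl(\frac{K_3}{K_1}+2^{2(p+1)/(p-1)}\Bigr)e^{-\sqrt{K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}} + o\Bigl(e^{-\sqrt{K_1}\,n^{4/(p-5)}\alpha^{2(p-1)/(5-p)}}\Bigr)\Bigr\} . \end{aligned} \tag{1.17} \]

If $p>5$, then (1.17) holds as $\alpha\to 0$. If $p=5$, then as $\alpha\uparrow(\sqrt{3}\,n\pi/2)^{1/2}$

\[ \begin{aligned} \|u_{n,\alpha}\|_\infty ={}& 3^{1/4}\sqrt{n}\,\Bigl\{\mu_n(\alpha)+\log\mu_n(\alpha)+\log 4\sqrt{3}+\frac{1}{\mu_n(\alpha)}\log\mu_n(\alpha)+\frac{\log 4\sqrt{3}-2}{\mu_n(\alpha)}(1+o(1))\Bigr\}^{1/2} \\ &\times\Bigl\{1+\frac{1+o(1)}{2\sqrt{3}\,\mu_n(\alpha)}\Bigl(\frac{\sqrt{3}}{2}\pi-\frac{\alpha^2}{n}\Bigr)\Bigr\} . \end{aligned} \tag{1.18} \]

We briefly explain the idea of the proofs of the Theorems. For a unique positive solution pair $(\lambda,u_{1,\lambda})$ of (1.1)–(1.2) for a given $\lambda\gg 1$, we put

\[ w_\lambda(s) = \lambda^{-1/(p-1)}u_{1,\lambda}(t) , \quad (s:=\sqrt{\lambda}(t-1/2) ,\ t\in I) . \]

Then $w_\lambda$ satisfies the problem

\[ -w_\lambda''(s) = w_\lambda(s)^p-w_\lambda(s) , \quad s\in I_\lambda:=(-\sqrt{\lambda}/2,\sqrt{\lambda}/2) , \tag{1.19} \]
\[ w_\lambda(s)>0 , \quad s\in I_\lambda , \tag{1.20} \]
\[ w_\lambda(\pm\sqrt{\lambda}/2)=0 . \tag{1.21} \]

The limit equation of (1.19)–(1.21) is

\[ -w''(s) = w(s)^p-w(s) , \quad s\in\mathbb{R} , \tag{1.22} \]
\[ w(s)>0 , \quad s\in\mathbb{R} , \tag{1.23} \]
\[ \lim_{s\to\pm\infty}w(s)=0 . \tag{1.24} \]

Let $w$ be a unique solution of (1.22)–(1.24) (cf. Berestycki and Lions [3]). Then as $\lambda\to\infty$

\[ \alpha^2 := \|u_{1,\lambda}\|^2_{L^2(I)} = \lambda^{(5-p)/(2(p-1))}\|w_\lambda\|^2_{L^2(I_\lambda)} = \lambda^{(5-p)/(2(p-1))}\bigl(\|w\|^2_{L^2(\mathbb{R})}+o(1)\bigr) . \tag{1.25} \]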


Therefore, the first term of the formula (1.10) comes from $\|w\|_{L^2(\mathbb{R})}$. This observation was accomplished in [6]. However, to obtain the remainder term, we need more precise information about $\|w_\lambda\|^2_{L^2(I_\lambda)}$ for $\lambda\gg 1$. To this end, we study the detailed asymptotic behavior of $\|w_\lambda\|_{L^2(I_\lambda)}$ as $\lambda\to\infty$ by using the relationship between $\|w_\lambda\|_\infty$ and $\|w_\lambda\|_{L^2(I_\lambda)}$ carefully.

The remainder of this paper is organized as follows. In Section 2, we study the relationship between $\|w_\lambda\|_\infty$ and $\|w_\lambda\|_{L^2(I_\lambda)}$. In Section 3, we prove our Theorems. Section 4 is the appendix, in which two estimates we accept without proof in Section 2 are proved.

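2 Preliminaries

We first recall some fundamental properties of $w$ and $w_\lambda$ for $\lambda>0$ (cf. Berestycki and Lions [3]). We know that

\[ w_\lambda(s) = w_\lambda(-s) , \quad s\in\bar I_\lambda , \tag{2.1} \]
\[ \|w_\lambda\|_\infty = w_\lambda(0) > \Bigl(\frac{p+1}{2}\Bigr)^{1/(p-1)} = \|w\|_\infty , \tag{2.2} \]
\[ w_\lambda'(s)<0 , \quad 0<s<\frac{\sqrt{\lambda}}{2} . \tag{2.3} \]

We define $\epsilon(\lambda)>0$ by

\[ \|w_\lambda\|_\infty = \Bigl(\frac{p+1}{2}\,(1+\epsilon(\lambda))\Bigr)^{1/(p-1)} . \tag{2.4} \]

Then by (2.2) and the result of Kwong [5], it is known that $\epsilon(\lambda)\downarrow 0$ as $\lambda\uparrow\infty$. We begin with the fundamental lemma for $\epsilon(\lambda)$.

Lemma 2.1. $\sqrt{\lambda}=2J(\epsilon(\lambda))$ for $\lambda>0$, where

\[ J(\epsilon) := \int_0^1\frac{1}{\sqrt{y^2-y^{p+1}+\epsilon(1-y^{p+1})}}\,dy \quad (\epsilon>0) . \tag{2.5} \]

Proof. Multiply (1.19) by $w_\lambda'$. Then we obtain

\[ w_\lambda''(s)w_\lambda'(s)+w_\lambda(s)^pw_\lambda'(s)-w_\lambda(s)w_\lambda'(s) = 0 , \quad s\in\bar I_\lambda . \]

This implies

\[ \frac{d}{ds}\Bigl[\frac12w_\lambda'(s)^2+\frac{1}{p+1}w_\lambda(s)^{p+1}-\frac12w_\lambda(s)^2\Bigr] \equiv 0 , \quad s\in\bar I_\lambda . \]

So the inside of the bracket is constant. Put $s=0$. Then by (2.2), we obtain

\[ \frac12w_\lambda'(s)^2+\frac{1}{p+1}w_\lambda(s)^{p+1}-\frac12w_\lambda(s)^2 = \frac{1}{p+1}\|w_\lambda\|_\infty^{p+1}-\frac12\|w_\lambda\|_\infty^2 , \quad s\in\bar I_\lambda . \]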

718

T. Shibata


We put zλ := wλ /wλ ∞ . Then by this, we obtain zλ (s)2 =

2 p+1 ) − (1 − zλ (s)2 ), s ∈ I¯λ . wλ p−1 ∞ (1 − zλ (s) p+1

By this, (2.3) and (2.4), we obtain −zλ (s) =

√ zλ (s)2 − zλ (s)p+1 + (λ)(1 − zλ (s)p+1 ), 0 ≤ s ≤ λ/2.

(2.6)

Put y = zλ (s). Then by (2.6), we obtain 1√ λ = 2

√

λ/2

0

−zλ (s)

ds zλ (s)2 − zλ (s)p+1 + (λ)(1 − zλ (s)p+1 )

1

1

dy 2 − y p+1 + (λ)(1 − y p+1 ) y 0 = J((λ)).

=

Thus the proof is complete. Next, we study the asymptotic behavior of (λ) as λ → ∞. Lemma 2.2. For λ 1

✷

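$$\log \epsilon(\lambda) = -\sqrt{\lambda} + \frac{2(p+1)}{p-1}\log 2 + O\left(\epsilon(\lambda)^{(p-1)/2}\right) + O\left(\sqrt{\epsilon(\lambda)}\right) + O(\epsilon(\lambda)). \tag{2.7}$$

Proof. Let
$$J(\epsilon) = J_1(\epsilon) + J_2 + J_3(\epsilon), \tag{2.8}$$
where
$$J_1(\epsilon) := \int_0^1 \frac{1}{\sqrt{y^2 + \epsilon}}\,dy, \qquad J_2 := \int_0^1 \frac{y^{p-2}}{\sqrt{1 - y^{p-1}}\left(1 + \sqrt{1 - y^{p-1}}\right)}\,dy, \qquad J_3(\epsilon) := J(\epsilon) - J_1(\epsilon) - J_2.$$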

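We study the asymptotic behavior of $J(\epsilon)$ as $\epsilon \to 0$. We first calculate $J_1(\epsilon)$. We know from [4, pp. 51] that for $x \gg 1$
$$\tan^{-1} x = \frac{\pi}{2} - \frac{1}{x} + O\left(\frac{1}{x^3}\right). \tag{2.9}$$
Put $y = \sqrt{\epsilon}\tan\theta$ in $J_1(\epsilon)$. Then by (2.9) and Taylor expansion of $\tan\theta$ at $\theta = \pi/4$, for $0 < \epsilon \ll 1$, we obtain
$$\begin{aligned}
J_1(\epsilon) &= \int_0^{\tan^{-1}(1/\sqrt{\epsilon})} \frac{1}{\cos\theta}\,d\theta \\
&= \log\tan\left(\frac{1}{2}\tan^{-1}\frac{1}{\sqrt{\epsilon}} + \frac{\pi}{4}\right) \\
&= \log\left(1 + \tan\left(\frac{1}{2}\tan^{-1}\frac{1}{\sqrt{\epsilon}}\right)\right) - \log\left(1 - \tan\left(\frac{1}{2}\tan^{-1}\frac{1}{\sqrt{\epsilon}}\right)\right) \\
&= \log\left(2 - \sqrt{\epsilon} + O(\epsilon)\right) - \log\left(\sqrt{\epsilon} + O(\epsilon)\right) \\
&= \log 2 - \frac{1}{2}\log\epsilon + O(\sqrt{\epsilon}).
\end{aligned} \tag{2.10}$$
Next, put $y = \sin^{2/(p-1)}\theta$ in $J_2$. Then we obtain
$$J_2 = \frac{2}{p-1}\int_0^{\pi/2} \frac{\sin\theta}{1 + \cos\theta}\,d\theta = \frac{2}{p-1}\log 2. \tag{2.11}$$
Moreover, for $0 < \epsilon \ll 1$, we can prove
$$|J_3(\epsilon)| \le C\left(\epsilon^{(p-1)/2} + \sqrt{\epsilon} + \epsilon\right). \tag{2.12}$$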
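The decomposition (2.8) can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it approximates $J(\epsilon)$ from (2.5) by a midpoint rule and compares it with the leading terms $J_1 + J_2 \approx \log 2 - \frac{1}{2}\log\epsilon + \frac{2}{p-1}\log 2$ taken from (2.10) and (2.11). The function names, the substitution $y = 1 - u^2$ and the grid size are illustrative assumptions.

```python
import math


def J(eps, p, n=200_000):
    """Midpoint-rule approximation of J(eps) from (2.5).

    The substitution y = 1 - u**2 (an assumption of this sketch) removes the
    inverse-square-root endpoint singularity of the integrand at y = 1.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        y = 1.0 - u * u
        radicand = y * y - y ** (p + 1) + eps * (1.0 - y ** (p + 1))
        total += 2.0 * u / math.sqrt(radicand)  # dy = 2u du up to orientation
    return total * h


def J_leading(eps, p):
    """Leading terms of J1 + J2 from (2.10)-(2.11), remainders dropped."""
    return math.log(2.0) - 0.5 * math.log(eps) + 2.0 * math.log(2.0) / (p - 1)
```

For instance, with $p = 3$ and $\epsilon = 10^{-4}$ the difference `J(1e-4, 3) - J_leading(1e-4, 3)` should be small, which is what (2.12) predicts for the remainder $J_3(\epsilon)$.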

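We accept (2.12) without proof here, since the calculation is long and complicated. The proof will be given in Section 4 later. Once (2.12) is accepted, then since $J(\epsilon(\lambda)) = \sqrt{\lambda}/2$ by Lemma 2.1, we obtain (2.7) by (2.8), (2.10)–(2.12). Thus the proof is complete. ✷

Put $y = z_\lambda(s)$. Then by (2.1) and (2.6), we obtain
$$\begin{aligned}
\|z_\lambda\|^2_{L^2(I_\lambda)} &= 2\int_0^{\sqrt{\lambda}/2} z_\lambda(s)^2\,ds \\
&= 2\int_0^{\sqrt{\lambda}/2} z_\lambda(s)^2 \cdot \frac{-z_\lambda'(s)}{\sqrt{z_\lambda(s)^2 - z_\lambda(s)^{p+1} + \epsilon(\lambda)(1 - z_\lambda(s)^{p+1})}}\,ds \\
&= 2L(\epsilon(\lambda)),
\end{aligned} \tag{2.13}$$
where
$$L(\epsilon) := \int_0^1 \frac{y^2}{\sqrt{y^2 - y^{p+1} + \epsilon(1 - y^{p+1})}}\,dy. \tag{2.14}$$
Put $\epsilon = 0$ and $y = \sin^{2/(p-1)}\theta$ in (2.14). Then we have
$$L(0) = \int_0^1 \frac{y^2}{\sqrt{y^2 - y^{p+1}}}\,dy = \frac{2}{p-1}\int_0^{\pi/2} \sin^{(5-p)/(p-1)}\theta\,d\theta = C_2. \tag{2.15}$$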
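As a quick illustration of (2.15), the constant $C_2$ can be evaluated for concrete $p$. The sketch below is an illustrative aid under stated assumptions: it uses the standard Gamma-function formula $\int_0^{\pi/2}\sin^a\theta\,d\theta = \frac{\sqrt{\pi}}{2}\,\Gamma(\frac{a+1}{2})/\Gamma(\frac{a}{2}+1)$ (valid for $a > -1$, which holds here since $p > 1$), and the function names are hypothetical.

```python
import math


def C2_closed_form(p):
    """C2 = (2/(p-1)) * int_0^{pi/2} sin^a(theta) dtheta with a = (5-p)/(p-1)."""
    a = (5.0 - p) / (p - 1.0)
    sine_integral = 0.5 * math.sqrt(math.pi) * math.gamma((a + 1.0) / 2.0) / math.gamma(a / 2.0 + 1.0)
    return 2.0 / (p - 1.0) * sine_integral


def C2_midpoint(p, n=100_000):
    """Same constant by a plain midpoint rule, as an independent cross-check."""
    a = (5.0 - p) / (p - 1.0)
    h = (math.pi / 2.0) / n
    total = sum(math.sin((i + 0.5) * h) ** a for i in range(n))
    return 2.0 / (p - 1.0) * total * h
```

For $p = 5$ the exponent vanishes and the closed form reduces to $\pi/4$, in agreement with (3.10); for $p = 3$ it gives $C_2 = 1$.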



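To study the asymptotic behavior of $\|z_\lambda\|_{L^2(I_\lambda)}$ as $\lambda \to \infty$, we prepare the asymptotic formula for $L(\epsilon)$ as $\epsilon \to 0$.

Lemma 2.3. For $0 < \epsilon \ll 1$
$$L(\epsilon) = C_2 + \frac{1}{4}\epsilon\log\epsilon + K_5\epsilon + o(\epsilon), \tag{2.16}$$
where
$$K_5 := \frac{1}{4} - \frac{p+1}{2(p-1)}\log 2 - \frac{1}{2(p-1)}K_4. \tag{2.17}$$

Proof. By (2.15), we have
$$L(\epsilon) - C_2 = L(\epsilon) - L(0) = -\epsilon\int_0^1 \frac{y(1 - y^{p+1})}{A_1(\epsilon, y)}\,dy =: -\epsilon L_1(\epsilon), \tag{2.18}$$
where
$$A_1(\epsilon, y) = \sqrt{y^2 - y^{p+1} + \epsilon(1 - y^{p+1})}\,\sqrt{1 - y^{p-1}} \times \left(\sqrt{y^2 - y^{p+1}} + \sqrt{y^2 - y^{p+1} + \epsilon(1 - y^{p+1})}\right).$$
We put
$$L_1(\epsilon) = L_2(\epsilon) + L_3(\epsilon), \tag{2.19}$$
where
$$L_2(\epsilon) := \int_0^1 \frac{y}{\sqrt{y^2 + \epsilon}\left(y + \sqrt{y^2 + \epsilon}\right)}\,dy, \tag{2.20}$$
$$L_3(\epsilon) := L_1(\epsilon) - L_2(\epsilon). \tag{2.21}$$
We first calculate $L_2(\epsilon)$. Put $y = \sqrt{\epsilon}\tan\theta$ and $x = \tan(\theta/2)$ in (2.20). Then by (2.9) and Taylor expansion, we obtain
$$\begin{aligned}
L_2(\epsilon) &= \int_0^{\tan^{-1}(1/\sqrt{\epsilon})} \frac{\sin\theta}{\cos\theta(1 + \sin\theta)}\,d\theta \\
&= \int_0^{\tan(\frac{1}{2}\tan^{-1}(1/\sqrt{\epsilon}))} \frac{4x}{(1 - x)(1 + x)^3}\,dx \\
&= \int_0^{\tan(\frac{1}{2}\tan^{-1}(1/\sqrt{\epsilon}))} \left\{\frac{1}{2(1 - x)} + \frac{1}{2(1 + x)} + \frac{1}{(1 + x)^2} - \frac{2}{(1 + x)^3}\right\}dx \\
&= -\frac{1}{2}\log\left(1 - \tan\left(\frac{1}{2}\tan^{-1}\frac{1}{\sqrt{\epsilon}}\right)\right) + \frac{1}{2}\log 2 - \frac{1}{4} + o(1) \\
&= -\frac{1}{2}\log\left\{\left(\frac{\pi}{2} - \tan^{-1}\frac{1}{\sqrt{\epsilon}}\right)(1 + o(1))\right\} + \frac{1}{2}\log 2 - \frac{1}{4} + o(1) \\
&= -\frac{1}{2}\log\left(\sqrt{\epsilon}\,(1 + O(\epsilon))\right) + \frac{1}{2}\log 2 - \frac{1}{4} + o(1) \\
&= -\frac{1}{4}\log\epsilon + \frac{1}{2}\log 2 - \frac{1}{4} + o(1).
\end{aligned} \tag{2.22}$$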
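The expansion (2.22) lends itself to a direct numerical check. The following sketch (illustrative only; the function names and grid size are assumptions) integrates the definition (2.20) by a midpoint rule and compares it with the leading terms $-\frac{1}{4}\log\epsilon + \frac{1}{2}\log 2 - \frac{1}{4}$.

```python
import math


def L2_numeric(eps, n=200_000):
    """Midpoint-rule approximation of L2(eps) as defined in (2.20)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        s = math.sqrt(y * y + eps)
        total += y / (s * (y + s))
    return total * h


def L2_leading(eps):
    """Leading terms of (2.22), with the o(1) remainder dropped."""
    return -0.25 * math.log(eps) + 0.5 * math.log(2.0) - 0.25
```

The two values should approach each other as $\epsilon \to 0$, consistent with the $o(1)$ remainder in (2.22).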

Secondly, we can show that as $\epsilon \to 0$
$$L_3(\epsilon) \to L_3(0) = \frac{1}{p-1}\left(\log 2 + \frac{1}{2}K_4\right), \tag{2.23}$$
where $K_4$ is the constant defined by (1.13). The proof will be given in Section 4 later. By (2.18), (2.19), (2.22) and (2.23), we obtain (2.16). Thus the proof is complete. ✷

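3 Proof of Theorems.

We first prove our Theorems for the case $n = 1$.

Proof of Theorem 1 for n = 1. By (1.25), (2.4), (2.13), Lemma 2.3 and Taylor expansion, for $\lambda \gg 1$, we obtain
$$\begin{aligned}
\alpha^2 &= \lambda^{(5-p)/(2(p-1))}\|w_\lambda\|^2_{L^2(I_\lambda)} = \lambda^{(5-p)/(2(p-1))}\|w_\lambda\|^2_\infty\|z_\lambda\|^2_{L^2(I_\lambda)} \\
&= 2\lambda^{(5-p)/(2(p-1))}\|w_\lambda\|^2_\infty L(\epsilon(\lambda)) \\
&= 2\lambda^{(5-p)/(2(p-1))}\left(\frac{p+1}{2}(1 + \epsilon(\lambda))\right)^{2/(p-1)} \times \left\{C_2 + \frac{1}{4}\epsilon(\lambda)\log\epsilon(\lambda) + K_5\epsilon(\lambda) + o(\epsilon(\lambda))\right\} \\
&= 2\left(\frac{p+1}{2}\right)^{2/(p-1)} C_2\,\lambda^{(5-p)/(2(p-1))}\left\{1 + \frac{2}{p-1}\epsilon(\lambda) + o(\epsilon(\lambda))\right\} \\
&\qquad \times \left\{1 + \frac{1}{4C_2}\epsilon(\lambda)\log\epsilon(\lambda) + \frac{K_5}{C_2}\epsilon(\lambda) + o(\epsilon(\lambda))\right\}.
\end{aligned} \tag{3.1}$$
It follows from (1.5) and (1.8) that if $p \ne 5$, then
$$K_1 = \left\{2^{-1}\left(\frac{2}{p+1}\right)^{2/(p-1)} C_2^{-1}\right\}^{2(p-1)/(5-p)}. \tag{3.2}$$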

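Therefore, by (3.1), (3.2) and Taylor expansion, for $p \ne 5$ and $\lambda \gg 1$, we obtain
$$\lambda = K_1\alpha^{4(p-1)/(5-p)}\left\{1 - \frac{p-1}{2(5-p)C_2}\epsilon(\lambda)\log\epsilon(\lambda) - \frac{2(p-1)}{5-p}\left(\frac{2}{p-1} + \frac{K_5}{C_2}\right)\epsilon(\lambda) + o(\epsilon(\lambda))\right\}. \tag{3.3}$$
By Lemma 2.2, for $\lambda \gg 1$, we have
$$\begin{aligned}
\epsilon(\lambda) &= 2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}}e^{\xi(\lambda)} \\
&= 2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}} + 2^{2(p+1)/(p-1)}\left(e^{\xi(\lambda)} - 1\right)e^{-\sqrt{\lambda}} \\
&=: 2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}} + \epsilon_1(\lambda),
\end{aligned} \tag{3.4}$$
where $\xi(\lambda) = O\left(e^{-(p-1)\sqrt{\lambda}/2} + e^{-\sqrt{\lambda}/2}\right)$. Then we see that as $\lambda \to \infty$
$$\frac{\sqrt{\lambda}\,\epsilon_1(\lambda)}{e^{-\sqrt{\lambda}}} = \frac{2^{2(p+1)/(p-1)}\sqrt{\lambda}\left(e^{\xi(\lambda)} - 1\right)e^{-\sqrt{\lambda}}}{e^{-\sqrt{\lambda}}} \to 0.$$
This along with (3.4) implies that for $\lambda \gg 1$
$$\epsilon(\lambda) = 2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}} + o\left(\frac{e^{-\sqrt{\lambda}}}{\sqrt{\lambda}}\right). \tag{3.5}$$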

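Moreover, by (3.1), (3.3)–(3.5) and Lemma 2.2, we obtain that as $\lambda \to \infty$
$$\left|\alpha^{4(p-1)/(5-p)}\epsilon(\lambda)\log\epsilon(\lambda)\right| \le C\lambda^{3/2}e^{-\sqrt{\lambda}} \to 0.$$
This along with (3.3) implies
$$\sqrt{\lambda} = \sqrt{K_1}\,\alpha^{2(p-1)/(5-p)} + o(1). \tag{3.6}$$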

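Therefore, by (2.17), (3.3)–(3.6) and Lemma 2.2, we obtain that for $\lambda \gg 1$
$$\begin{aligned}
\lambda &= K_1\alpha^{4(p-1)/(5-p)}\Bigl\{1 + \frac{p-1}{2(5-p)C_2}2^{2(p+1)/(p-1)}\sqrt{\lambda}\,e^{-\sqrt{\lambda}} \\
&\qquad + 2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}}\Bigl[-\frac{2(p-1)}{5-p}\Bigl(\frac{2}{p-1} + \frac{K_5}{C_2}\Bigr) - \frac{p+1}{(5-p)C_2}\log 2\Bigr] + o\bigl(e^{-\sqrt{\lambda}}\bigr)\Bigr\} \\
&= K_1\alpha^{4(p-1)/(5-p)}\Bigl\{1 + \frac{p-1}{2(5-p)C_2}2^{2(p+1)/(p-1)}\sqrt{K_1}\,\alpha^{2(p-1)/(5-p)}e^{-\sqrt{\lambda}} \\
&\qquad + 2^{2(p+1)/(p-1)}e^{-\sqrt{\lambda}}\Bigl[-\frac{2(p-1)}{5-p}\Bigl(\frac{2}{p-1} + \frac{K_5}{C_2}\Bigr) - \frac{p+1}{(5-p)C_2}\log 2\Bigr] + o\bigl(e^{-\sqrt{\lambda}}\bigr)\Bigr\} \\
&= K_1\alpha^{4(p-1)/(5-p)} + K_2\alpha^{6(p-1)/(5-p)}e^{-\sqrt{\lambda}} + K_3\alpha^{4(p-1)/(5-p)}e^{-\sqrt{\lambda}} + o\bigl(\alpha^{4(p-1)/(5-p)}e^{-\sqrt{\lambda}}\bigr).
\end{aligned} \tag{3.7}$$
By this and the same calculation as that to obtain (3.5), for $\lambda \gg 1$, we obtain
$$e^{-\sqrt{\lambda}} = e^{-\sqrt{K_1}\alpha^{2(p-1)/(5-p)}} + o\bigl(e^{-\sqrt{K_1}\alpha^{2(p-1)/(5-p)}}\bigr), \tag{3.8}$$
$$\alpha^{k(p-1)/(5-p)}e^{-\sqrt{\lambda}} = \alpha^{k(p-1)/(5-p)}e^{-\sqrt{K_1}\alpha^{2(p-1)/(5-p)}} + o\bigl(e^{-\sqrt{K_1}\alpha^{2(p-1)/(5-p)}}\bigr), \quad (k = 4, 6). \tag{3.9}$$
By (3.7)–(3.9), we obtain (1.10) for $n = 1$. ✷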


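Proof of Theorem 2 for n = 1. Let $p = 5$. By putting $y = \sin^{1/2}\theta$, by (2.15), we have
$$C_2 = \int_0^1 \frac{y}{\sqrt{1 - y^4}}\,dy = \frac{\pi}{4}. \tag{3.10}$$
Moreover, by Lemma 2.2, (3.4) and (3.5), we obtain
$$\epsilon(\lambda)\log\epsilon(\lambda) = \bigl[8e^{-\sqrt{\lambda}} + \epsilon_1(\lambda)\bigr]\bigl[-\sqrt{\lambda} + 3\log 2 + o(1)\bigr] = -8\sqrt{\lambda}\,e^{-\sqrt{\lambda}} + 24(\log 2)e^{-\sqrt{\lambda}} + o\bigl(e^{-\sqrt{\lambda}}\bigr). \tag{3.11}$$
Then by this, (3.1) and (3.5), we obtain
$$\begin{aligned}
\alpha^2 &= 2\sqrt{3}\left\{1 + \frac{1}{2}\epsilon(\lambda) + o(\epsilon(\lambda))\right\}\left\{\frac{\pi}{4} + \frac{1}{4}\epsilon(\lambda)\log\epsilon(\lambda) + K_5\epsilon(\lambda) + o(\epsilon(\lambda))\right\} \\
&= 2\sqrt{3}\left\{\frac{\pi}{4} + \frac{1}{4}\epsilon(\lambda)\log\epsilon(\lambda) + \left(\frac{\pi}{8} + K_5\right)\epsilon(\lambda) + o(\epsilon(\lambda))\right\} \\
&= 2\sqrt{3}\left\{\frac{\pi}{4} - 2\sqrt{\lambda}\,e^{-\sqrt{\lambda}} + (\pi + 6\log 2 + 8K_5)e^{-\sqrt{\lambda}} + o\bigl(e^{-\sqrt{\lambda}}\bigr)\right\}.
\end{aligned} \tag{3.12}$$
Recall that $K_4$ is defined by (1.13). Since $p = 5$, we have
$$\begin{aligned}
K_4 &= \int_0^1 \frac{1 - s^{1/2}}{(1 - s)^{3/2}}\,ds \quad (\text{put } t = s^{1/2}) \\
&= 2\int_0^1 \frac{1 - t}{(1 - t^2)^{3/2}}\,t\,dt \quad (\text{put } t = \sin\theta) \\
&= 2\int_0^{\pi/2} \frac{\sin\theta}{1 + \sin\theta}\,d\theta \quad (\text{put } x = \tan(\theta/2)) \\
&= 8\int_0^1 \frac{x}{(1 + x)^2(1 + x^2)}\,dx \\
&= 8\int_0^1 \left\{-\frac{1}{2(1 + x)^2} + \frac{1}{2(1 + x^2)}\right\}dx \\
&= -2 + \pi.
\end{aligned} \tag{3.13}$$
Therefore, by (2.17) and (3.13)
$$K_5 = \frac{1}{4} - \frac{3}{4}\log 2 - \frac{K_4}{8} = \frac{1}{2} - \frac{3}{4}\log 2 - \frac{\pi}{8}. \tag{3.14}$$
Then by (3.12) and (3.14), we obtain
$$\alpha^2 = \frac{\sqrt{3}}{2}\pi - 4\sqrt{3}\sqrt{\lambda}\,e^{-\sqrt{\lambda}} + 8\sqrt{3}\,e^{-\sqrt{\lambda}} + o\bigl(e^{-\sqrt{\lambda}}\bigr). \tag{3.15}$$
This implies
$$0 < \frac{\sqrt{3}}{2}\pi - \alpha^2 = 4\sqrt{3}\sqrt{\lambda}\,e^{-\sqrt{\lambda}}\left\{1 - \frac{2}{\sqrt{\lambda}}(1 + o(1))\right\}. \tag{3.16}$$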
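The constants entering (3.12)–(3.15) can be cross-checked numerically. The sketch below is an illustration under stated assumptions: it evaluates the integral for $K_4$ (in the form used in (3.13) and (4.26), namely $\int_0^1 (1 - s^{2/(p-1)})(1 - s)^{-3/2}\,ds$) by a midpoint rule after the substitution $s = 1 - u^2$, and builds $K_5$ from (2.17). Function names and the grid size are hypothetical.

```python
import math


def K4(p, n=200_000):
    """Midpoint approximation of K4 = int_0^1 (1 - s^(2/(p-1))) / (1 - s)^(3/2) ds.

    With s = 1 - u**2 the integrand becomes 2 (1 - s**q) / u**2, which stays
    bounded as u -> 0.
    """
    q = 2.0 / (p - 1.0)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        s = 1.0 - u * u
        total += 2.0 * (1.0 - s ** q) / (u * u)
    return total * h


def K5(p):
    """K5 as defined in (2.17), using the numerical K4 above."""
    return 0.25 - (p + 1.0) / (2.0 * (p - 1.0)) * math.log(2.0) - K4(p) / (2.0 * (p - 1.0))
```

For $p = 5$ one expects `K4(5)` close to $\pi - 2$, and the coefficient $\pi + 6\log 2 + 8K_5$ in (3.12) close to $4$, which is what turns the last line of (3.12) into the term $8\sqrt{3}\,e^{-\sqrt{\lambda}}$ of (3.15).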


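Now we put
$$\mu_1(\alpha) := -\log\left(\frac{\sqrt{3}}{2}\pi - \alpha^2\right), \qquad \tilde{\lambda} := \sqrt{\lambda} - \mu_1(\alpha).$$
Then by taking the logarithm of both sides of (3.16), we obtain that as $\alpha \uparrow (\sqrt{3}\pi/2)^{1/2}$
$$\sqrt{\lambda} = (1 + o(1))\mu_1(\alpha), \tag{3.17}$$
$$\tilde{\lambda} = o(\mu_1(\alpha)). \tag{3.18}$$
By (3.18) and Taylor expansion, we obtain
$$\log\sqrt{\lambda} = \log(\mu_1(\alpha) + \tilde{\lambda}) = \log\mu_1(\alpha) + \frac{\tilde{\lambda}}{\mu_1(\alpha)} - \frac{1}{2}\left(\frac{\tilde{\lambda}}{\mu_1(\alpha)}\right)^2(1 + o(1)). \tag{3.19}$$
Then by taking the logarithm of both sides of (3.16) and using Taylor expansion and (3.19), we obtain
$$\begin{aligned}
\tilde{\lambda} &= \log\mu_1(\alpha) + \frac{\tilde{\lambda}}{\mu_1(\alpha)} - \frac{1}{2}\left(\frac{\tilde{\lambda}}{\mu_1(\alpha)}\right)^2(1 + o(1)) + \log 4\sqrt{3} + \log\left(1 - \frac{2}{\sqrt{\lambda}}(1 + o(1))\right) \\
&= \log\mu_1(\alpha) + \frac{\tilde{\lambda}}{\mu_1(\alpha)} - \frac{1}{2}\left(\frac{\tilde{\lambda}}{\mu_1(\alpha)}\right)^2(1 + o(1)) + \log 4\sqrt{3} - \frac{2}{\sqrt{\lambda}}(1 + o(1)).
\end{aligned}$$
This implies
$$\tilde{\lambda}\left\{1 - \frac{1}{\mu_1(\alpha)} + \frac{\tilde{\lambda}}{2\mu_1(\alpha)^2}(1 + o(1))\right\} = \log\mu_1(\alpha) + \log 4\sqrt{3} - \frac{2}{\sqrt{\lambda}}(1 + o(1)). \tag{3.20}$$
By this and (3.17), we obtain
$$\begin{aligned}
\tilde{\lambda} &= \left\{\log\mu_1(\alpha) + \log 4\sqrt{3} - \frac{2}{\mu_1(\alpha)}(1 + o(1))\right\} \times \left\{1 + \frac{1}{\mu_1(\alpha)} - \frac{\tilde{\lambda}}{2\mu_1(\alpha)^2}(1 + o(1)) + O\left(\frac{1}{\mu_1(\alpha)^2}\right)\right\} \\
&= \log\mu_1(\alpha) + \log 4\sqrt{3} + \frac{\log\mu_1(\alpha)}{\mu_1(\alpha)} + \frac{\log 4\sqrt{3} - 2}{\mu_1(\alpha)}(1 + o(1)).
\end{aligned} \tag{3.21}$$
This implies Theorem 2 for $n = 1$. ✷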
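The inversion carried out in (3.17)–(3.21) can be illustrated numerically. The sketch below is a simplified model, not the exact problem: it drops the $o(1)$ term in (3.16) and solves $\delta = 4\sqrt{3}\,(x - 2)e^{-x}$ for $x = \sqrt{\lambda}$, where $\delta$ stands for $\frac{\sqrt{3}}{2}\pi - \alpha^2$, then compares the root with the expansion $\mu_1 + \log\mu_1 + \log 4\sqrt{3} + \frac{\log\mu_1}{\mu_1} + \frac{\log 4\sqrt{3} - 2}{\mu_1}$ read off from (3.21). The names `sqrt_lambda_from_gap` and `sqrt_lambda_expansion` are hypothetical.

```python
import math

LOG_4_SQRT3 = math.log(4.0 * math.sqrt(3.0))


def sqrt_lambda_from_gap(delta, iterations=200):
    """Solve delta = 4*sqrt(3)*(x - 2)*exp(-x) for the large root x.

    Taking logarithms gives x = mu + log(4 sqrt 3) + log(x - 2) with
    mu = -log(delta); for small delta this map is a contraction near the root.
    """
    mu = -math.log(delta)
    x = mu + LOG_4_SQRT3
    for _ in range(iterations):
        x = mu + LOG_4_SQRT3 + math.log(x - 2.0)
    return x


def sqrt_lambda_expansion(delta):
    """Truncated asymptotic expansion of sqrt(lambda) suggested by (3.21)."""
    mu = -math.log(delta)
    return mu + math.log(mu) + LOG_4_SQRT3 + math.log(mu) / mu + (LOG_4_SQRT3 - 2.0) / mu
```

For a small gap such as $\delta = 10^{-20}$ the two values should differ only by a quantity that is small compared with $1/\mu_1$, in line with the $(1 + o(1))$ factor in (3.21).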


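Proof of Theorem 3 for n = 1. By (2.4) and Taylor expansion, we obtain
$$\begin{aligned}
\|u_{1,\lambda}\|_\infty &= \lambda^{1/(p-1)}\|w_\lambda\|_\infty \\
&= \left(\frac{p+1}{2}\right)^{1/(p-1)}\lambda^{1/(p-1)}(1 + \epsilon(\lambda))^{1/(p-1)} \\
&= \left(\frac{p+1}{2}\right)^{1/(p-1)}\lambda^{1/(p-1)}\left\{1 + \frac{1}{p-1}\epsilon(\lambda) + o(\epsilon(\lambda))\right\}.
\end{aligned} \tag{3.22}$$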

This along with (3.5) implies Theorem 3 for n = 1. ✷ Proof of Theorem 4 for n = 1. We substitute (1.10) into (3.22). If p = 5, then by (3.5) and (3.8), we obtain u1,α ∞ = λ1 (α)1/(p−1) wλ1 (α) ∞ (3.23) 1/(p−1) √ 2(p−1)/(5−p) p+1 = K1 α4(p−1)/(5−p) + K2 α6(p−1)/(5−p) e− K1 α 2 1/(p−1) √ 2(p−1)/(5−p) +K3 (1 + o(1))α4(p−1)/(5−p) e− K1 α

√ 2(p−1)/(5−p) 1 2(p+1)/(p−1) −√K1 α2(p−1)/(5−p) × 1+ e + o e− K1 α 2 p−1 1/(p−1) (p + 1)K1 α4/(5−p) = 2

K2 2(p−1)/(5−p) −√K1 α2(p−1)/(5−p) × 1+ α e K1 1/(p−1) √ K3 −√K1 α2(p−1)/(5−p) − K1 α2(p−1)/(5−p) + e +o e K1

√ 1 2(p+1)/(p−1) −√K1 α2(p−1)/(5−p) − K1 α2(p−1)/(5−p) × 1+ e +o e 2 p−1 1/(p−1) (p + 1)K1 α4/(5−p) = 2

K2 2(p−1)/(5−p) −√K1 α2(p−1)/(5−p) 1 × 1+ α e p − 1 K1 √ 2(p−1)/(5−p) K3 −√K1 α2(p−1)/(5−p) + e + o e− K1 α K1

√ 2(p−1)/(5−p) 1 2(p+1)/(p−1) −√K1 α2(p−1)/(5−p) e + o e− K1 α 2 . × 1+ p−1 This implies (1.17) for n = 1. Finally, we prove (1.18) for n = 1. Let p = 5. We substitute (1.14) into (3.22).



Then by (3.5), (3.16) and (3.17), we obtain u1,α ∞ = λ1 (α)1/4 wλ1 (α) ∞

(3.24) 1/2 √ √ log µ1 (α) log 4 3 − 2 + (1 + o(1)) = 31/4 µ1 (α) + log µ1 (α) + log 4 3 + µ1 (α) µ1 (α)

1 × 1 + (λ1 (α)) + o((λ1 (α))) 4 1/2 √ √ log µ1 (α) log 4 3 − 2 1/4 + (1 + o(1)) µ1 (α) + log µ1 (α) + log 4 3 + =3 µ1 (α) µ1 (α) √ × 1 + 2(1 + o(1))e− λ1 (α) 1/2 √ √ (α) 3 − 2 log 4 log µ 1 + (1 + o(1)) = 31/4 µ1 (α) + log µ1 (α) + log 4 3 + µ1 (α) µ1 (α) √ ( 3π/2) − α2 × 1 + (1 + o(1)) √ 2 3 λ1 (α) 1/2 √ √ log µ1 (α) log 4 3 − 2 1/4 =3 + (1 + o(1)) µ1 (α) + log µ1 (α) + log 4 3 + µ1 (α) µ1 (α) √ ( 3π/2) − α2 × 1 + (1 + o(1)) √ . 2 3µ1 (α) This implies (1.18) for n = 1.

✷

Proof of Theorems 1–4 for n ≥ 2. Let $n \ge 2$. Since (1.1)–(1.2) is autonomous, we see that the interior zeros of $u_{n,\lambda}$ are exactly $\{k/n : k = 1, 2, \cdots, n - 1\}$. Put $\mu := \lambda/n^2$, $\beta := \alpha/n^{2/(p-1)}$ and
$$w_\mu(s) := \lambda^{-1/(p-1)}u_{n,\lambda}(t), \qquad s = \sqrt{\lambda}\,(t - 1/(2n)),\quad 0 \le t \le 1/n.$$
Then $w_\mu$ satisfies (1.19)–(1.21) in $I_\mu$ and we easily obtain
$$\beta^2 = \mu^{(5-p)/(2(p-1))}\|w_\mu\|^2_{L^2(I_\mu)}. \tag{3.25}$$
Therefore, Theorems 1–2 for $n \ge 2$ are obtained by replacing $\lambda$ and $\alpha$ by $\mu$ and $\beta$, respectively, in all the arguments in Sections 2–3. To obtain Theorem 3 for $n \ge 2$, we have only to note that
$$\|u_{n,\lambda}\|_\infty = \lambda^{1/(p-1)}\|w_\mu\|_\infty = \lambda^{1/(p-1)}\left(\frac{p+1}{2}(1 + \epsilon(\mu))\right)^{1/(p-1)}.$$
Then by replacing $\epsilon(\lambda)$ by $\epsilon(\mu)$ in Lemma 2.2, we obtain Theorem 3 for $n \ge 2$. Finally, Theorem 4 for $n \ge 2$ is a direct consequence of Theorems 1–3 for $n \ge 2$. ✷


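4 Appendix

The purpose of this section is to prove (2.12) and (2.23). Let $C_k > 0$ $(k = 3, 4, \cdots)$ be constants independent of $\epsilon$.

Proof of (2.12). We divide the proof into several steps.

Step 1. We put
$$B_1(\epsilon, y) := \sqrt{y^2 - y^{p+1} + \epsilon(1 - y^{p+1})}, \tag{4.1}$$
$$B_2(\epsilon, y) := \sqrt{y^2 + \epsilon}, \tag{4.2}$$
$$B_3(\epsilon, y) := B_1(\epsilon, y) + B_2(\epsilon, y). \tag{4.3}$$
Then it is easy to see that
$$\begin{aligned}
J_3(\epsilon) &= J_{3,1}(\epsilon) + J_{3,2}(\epsilon) \\
&:= \int_0^1 y^{p+1}\left\{\frac{1}{B_1(\epsilon, y)B_2(\epsilon, y)B_3(\epsilon, y)} - \frac{1}{B_1(0, y)B_2(0, y)B_3(0, y)}\right\}dy + \epsilon J_2(\epsilon),
\end{aligned} \tag{4.4}$$
where
$$J_2(\epsilon) := \int_0^1 \frac{y^{p+1}}{B_1(\epsilon, y)B_2(\epsilon, y)B_3(\epsilon, y)}\,dy.$$
We calculate $J_{3,2}(\epsilon)$ first. By Lebesgue's dominated convergence theorem
$$J_2(\epsilon) \to J_2 \quad \text{as } \epsilon \to 0.$$
By this and (2.11), for $0 < \epsilon \ll 1$, we obtain
$$J_{3,2}(\epsilon) = \frac{2}{p-1}\epsilon\log 2 + o(\epsilon). \tag{4.5}$$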
Next, we calculate J3,1 (). We have

J3,1 ()

B1 (0, y)B2 (0, y)B3 (0, y) − B1 (, y)B2 (, y)B3 (, y) dy B1 (0, y)B2 (0, y)B3 (0, y)B1 (, y)B2 (, y)B3 (, y) 0 := −(J4 () + J5 () + J6 ()), (4.6) 1

y p+1

:=

where

y p+1

B1 (, y)B2 (, y)B3 (, y) − B1 (0, y)B2 (, y)B3 (, y) dy, B1 (0, y)B2 (0, y)B3 (0, y)B1 (, y)B2 (, y)B3 (, y)

y p+1

B1 (0, y)B2 (, y)B3 (, y) − B1 (0, y)B2 (0, y)B3 (, y) dy, B1 (0, y)B2 (0, y)B3 (0, y)B1 (, y)B2 (, y)B3 (, y)

y p+1

B1 (0, y)B2 (0, y)B3 (, y) − B1 (0, y)B2 (0, y)B3 (0, y) dy. B1 (0, y)B2 (0, y)B3 (0, y)B1 (, y)B2 (, y)B3 (, y)

1

J4 () := 0

1

J5 () := 0

J6 () :=

0

1



Step 2. We calculate J4 (). Since B1 (, y) − B1 (0, y) = we obtain J4 ()

1

= 0

(1 − y p+1 ) , B1 (, y) + B1 (0, y)

1 − y p+1 y p+1 dy B1 (, y)(B1 (, y) + B0 (0, y)) B1 (0, y)B2 (0, y)B3 (0, y)

1 − y p+1 y p−2

dy 2 1 − y p−1 (1 + 1 − y p−1 ) 0 B1 (, y) δ 1 + , = J7 () + J8 () := 1

≤

0

(4.7)

δ

where 0 < δ 1 is a fixed constant. Let Ck,δ (k ∈ N) be a positive constant which depends on δ but independent of . First, we consider the case where p ≥ 2. √ Put y = tan θ. Then we obtain δ 1 δ p−2 dy (4.8) · J7 () ≤ p−1 2 )(y + ) 1 − δ p−1 0 (1 − δ δ 1 dy ≤ C1,δ 2+ y 0 tan−1 (δ/√ ) √ πC1,δ √ = C1,δ dθ ≤ . 2 0 √ Next, we consider the case where 1 < p < 2. By putting y = tan θ, we obtain δ p−2 y dy (4.9) J7 () ≤ (1 − δ p−1 )2 0 y 2 + √ tan−1 (δ/ ) cos2−p θ dθ = C2,δ (p−1)/2 sin2−p θ 0 ≤ C3,δ (p−1)/2 . On the other hand, it is easy to see that J8 () ≤ C4,δ J2 ≤ C5,δ .

(4.10)

Hence, by (4.7)–(4.10), we obtain J4 () ≤ C3 ((p−1)/2 +

√ + ).

(4.11)

Step 3. We calculate J5 (). Since B1 (0, y)B2 (, y)B3 (, y) − B1 (0, y)B2 (0, y)B3 (, y) =

B1 (0, y)B3 (, y) , B2 (, y) + B2 (0, y)


we obtain J5 ()

1

y p+1 dy B2 (0, y)B3 (0, y)B1 (, y)B2 (, y)(B2 (, y) + B2 (0, y))

1

y p+1 dy B1 (0, y)B2 (0, y)B3 (0, y) B2 (, y)2

= 0

≤

0

1

= 0

(4.12)

y p−2

dy . y 2 + 1 − y p−1 (1 + 1 − y p−1 )

Then by the same calculation as that of (4.7)–(4.10), we obtain J5 () ≤ C4 ((p−1)/2 +

√

+ )

(4.13)

Step 4. Finally, we calculate J6 (). Since (4.14) B1 (0, y)B2 (0, y)B3 (, y) − B1 (0, y)B2 (0, y)B3 (0, y) = B1 (0, y)B2 (0, y){[B1 (, y) − B1 (0, y)] + [B2 (, y) − B2 (0, y)]} = D1 + D2 (1 − y p+1 ) + := B1 (0, y)B2 (0, y) , B1 (, y) + B1 (0, y) y2 + + y we obtain J6 ()

=

D3 + D4 1

(4.15) p+1

D1 y dy B (0, y)B (0, y)B (0, y)B1 (, y)B2 (, y)B3 (, y) 1 2 3 0 1 D2 y p+1 + dy. 0 B1 (0, y)B2 (0, y)B3 (0, y)B1 (, y)B2 (, y)B3 (, y)

:=

Then D3

1

y p+1 (1 − y p+1 ) dy B1 (, y)B2 (, y)B3 (, y)B3 (0, y)(B1 (, y) + B1 (0, y))

1

1 − y p+1 y p+1 dy B1 (0, y)B2 (0, y)B3 (0, y) B1 (, y)B3 (, y)

= 0

≤

0

≤

0

1

(4.16)

1 − y p+1 y p−2

dy. 1 − y p−1 (1 + 1 − y p−1 ) y 2 +

Then by (4.7)–(4.10), we obtain D3 ≤ C5 ((p−1)/2 +

√ + ).

(4.17)



Next, D4

1

= 0

≤

0

1

1 y p+1

dy B3 (0, y)B1 (, y)B2 (, y) B3 (, y)( y 2 + + y)

(4.18)

1 y p−2

dy. 2+ p−1 p−1 y 1−y (1 + 1 − y )

Then by (4.7)–(4.10), we obtain √ D4 ≤ C6,δ ( + (p−1)/2 ).

(4.19)

Therefore, by (4.15) and (4.19)–(4.20), we obtain J6 () ≤ C7,δ ((p−1)/2 +

√ + ).

Then we obtain (2.12) by (4.4)–(4.6), (4.11), (4.13) and (4.20). Proof of (2.23). We see that L3 () =

1

y 0

A2 (, y) dy, A3 (, y)

(4.20) ✷

(4.21)

where A2 (, y) A3 (, y)

= (1 − y p+1 ) y 2 + (y + y 2 + )

− B1 (, y) 1 − y p−1 (B1 (0, y) + B1 (, y)),

= B1 (, y) 1 − y p−1 (B1 (0, y) + B1 (, y))

× y2 + y + y2 + .

(4.22) (4.23)

It is easy to see that A3 (, y) is decreasing as → 0 for any y ∈ I. We show that A2 (, y) is increasing as → 0 for y ∈ I. Indeed, we have dA2 (, y) d

=

M1 + M2

(4.24)

:= (1 − y p+1 )(1 − 1 − y p−1 ) 1 y(1 − y p+1 ) 1 − y p−1

+ − . 2 y 2 (1 − y p−1 ) + (1 − y p+1 ) y2 +

It is clear that M1 > 0 for y ∈ I. Moreover, by direct calculation, we find that M2 > 0 for y ∈ I is equivalent to y 2 + (2 − y 2 − y p−1 ) > y p+1 ,


which is valid for y ∈ I. Therefore, M2 > 0 for y ∈ I. Furthermore, by noting that

y 2 − y p+1 + (1 − y p+1 ) ≤ y 2 + , we see that for y ∈ I A2 (, y)

= y(1 − y p+1 ) y 2 + + (1 − y p+1 )(y 2 + ) (4.25)

p−1 2 p+1 p+1 ) y −y + (1 − y ) − y(1 − y

2 p+1 p+1 p−1 − 1−y (y − y + (1 − y ))

p+1 p−1 2 ) + y y + (y p−1 − y p+1 ) > (1 − y )(1 − 1 − y + y 2 {(1 − y p+1 ) − (1 − y p−1 )3/2 } > 0.

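Therefore, $yA_2(\epsilon, y)/A_3(\epsilon, y) > 0$ in $I$ and increasing as $\epsilon \to 0$. Then by the monotone convergence theorem, we see that as $\epsilon \to 0$
$$\begin{aligned}
L_3(\epsilon) \to L_3(0) &= \int_0^1 \frac{1 - y^{p+1} - (1 - y^{p-1})^{3/2}}{2y(1 - y^{p-1})^{3/2}}\,dy \quad (\text{put } y = \sin^{2/(p-1)}\theta) \\
&= \frac{1}{p-1}\int_0^{\pi/2} \frac{1 - \sin^{2(p+1)/(p-1)}\theta - \cos^3\theta}{\sin\theta\cos^2\theta}\,d\theta \quad (\text{put } t = \cos\theta) \\
&= \frac{1}{p-1}\int_0^1 \frac{1 - t^3 - (1 - t^2)^{(p+1)/(p-1)}}{t^2(1 - t^2)}\,dt \\
&= \frac{1}{p-1}\left\{\int_0^1 \frac{1}{1 + t}\,dt + \int_0^1 \frac{1 - (1 - t^2)^{2/(p-1)}}{t^2}\,dt\right\} \quad (\text{put } s = 1 - t^2) \\
&= \frac{1}{p-1}\left\{\log 2 + \frac{1}{2}\int_0^1 \frac{1 - s^{2/(p-1)}}{(1 - s)^{3/2}}\,ds\right\} \\
&= \frac{1}{p-1}\left\{\log 2 + \frac{1}{2}K_4\right\}.
\end{aligned} \tag{4.26}$$
This implies (2.23). ✷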

References

[1] R. Benguria and M. C. Depassier, Upper and lower bounds for eigenvalues of nonlinear elliptic equations: I. The lowest eigenvalue, J. Math. Phys. 24, 501–503 (1983).
[2] H. Berestycki, Le nombre de solutions de certains problèmes semi-linéares elliptiques, J. Functional Analysis 40, 1–29 (1981).
[3] H. Berestycki and P. L. Lions, Nonlinear scalar field equation I, existence of a ground state, Arch. Rational Mech. Anal. 82, 313–345 (1983).
[4] I. S. Gradshteyn and I. M. Ryzhik, Table of integrals, series, and products, Academic Press, New York (1980).
[5] M. K. Kwong, Uniqueness of positive solution of $\Delta u - u + u^p = 0$ in $\mathbb{R}^n$, Arch. Rational Mech. Anal. 105, 243–266 (1989).
[6] T. Shibata, Spectral asymptotics for nonlinear Sturm-Liouville problems, Forum Math. 7, 207–224 (1995).
[7] T. Shibata, Global $L^2$-bifurcation of nonlinear Sturm-Liouville problems, Z. Angew. Math. Phys. 46, 859–871 (1995).

T. Shibata
The Division of Mathematical and Information Sciences
Faculty of Integrated Arts and Sciences
Hiroshima University
Higashi-Hiroshima, 739-8521, Japan
email: [email protected]

Communicated by Rafael D. Benguria
submitted 02/11/00, accepted 25/01/01




Interacting Fermi Liquid in Three Dimensions at Finite Temperature: Part I: Convergent Contributions M. Disertori, J. Magnen and V. Rivasseau Abstract. In this paper we complete the first step, namely the uniform bound on completely convergent contributions, towards proving that a three dimensional interacting system of Fermions is a Fermi liquid in the sense of Salmhofer. The analysis relies on a direct space decomposition of the propagator, on a bosonic multiscale cluster expansion and on the Hadamard inequality, rather than on a Fermionic expansion and an angular analysis in momentum space, as was used in the recent proof by two of us of Salmhofer’s criterion in two dimensions.

I Introduction

Conducting electrons in a metal at low temperature are well described by Fermi liquid theory. However we know that the Fermi liquid theory is not valid down to zero temperature. Indeed below the BCS critical temperature the dressed electrons or holes which are the excitations of the Fermi liquid bind into Cooper pairs and the metal becomes superconducting. Even when the dominant electron interaction is repulsive, the Kohn-Luttinger instabilities prevent the Fermi liquid theory from being generically valid down to zero temperature. Hence Fermi liquid theory (e.g. for the simplest case of a jellium model with a spherical Fermi surface) is only an effective theory above some non-perturbative transition temperature, and it is not obvious how to make its mathematical definition precise. Recently Salmhofer proposed such a mathematical definition [S]. It consists in proving that (under a suitable renormalization condition on the two-point function), perturbation theory is analytic in a domain $|\lambda \log T| \le K$, where $\lambda$ is the coupling constant and $T$ is the temperature, and that uniform bounds hold in that domain for the self-energy and its first and second derivatives. This criterion in particular excludes Luttinger liquid behavior, which has been proved to hold in one dimension [BGPS-BM], and for which second momentum-space derivatives of the self-energy are unbounded in that domain. Recently two of us proved Salmhofer's criterion for the two dimensional jellium model [DR1-2]. However the proof relies in a key way on the special momentum conservation rules in two dimensions. In three dimensions general vertices are not necessarily planar in momentum space. This has drastic constructive consequences (although perturbative power counting is similar in 2 and 3 dimensions). In particular it seems to prevent, up to now, any constructive analysis based on



angular decomposition in momentum space. The only existing constructive result for three dimensional Fermions relies on the use of a bosonic method (cluster expansion) together with the Hadamard inequality [MR]. It proves that the radius of convergence of the theory in a single momentum slice of the renormalization group analysis around the Fermi surface is at least a constant independent of the slice. In this paper, we build upon the analysis of [MR], extending it to many slices. We use a multiscale bosonic cluster expansion based on a direct space decomposition of the propagator, which is not the usual momentum decomposition around the Fermi sphere. We bound uniformly the sum of all convergent polymers in the Salmhofer domain $|\lambda \log T| \le K$. Hence our result is the three dimensional analog of [FMRT] and [DR1]. Because of its technical nature, this result is stated precisely only in section III.6, after the definition of the multiscale cluster expansion. Using a Mayer expansion we plan in a future paper (which would be the three dimensional analog of [DR2]) to perform renormalization of the two point subgraphs and to study boundedness of the self energy and of its first and second momentum space derivatives. That would complete the proof of Fermi liquid behavior in three dimensions. Remark however that the optimal analyticity radius of the Fermi liquid series should be given by $|\lambda \ln T| = K_{BCS}$ where $K_{BCS}$ is a numerical constant given by the coefficient of a so called “wrong-way” bubble graph [FT2]. In this paper we prove analyticity in a domain $|\lambda| |\ln T| \le K$ but our constant is not the expected optimal one, $K_{BCS}$, not only because of some lazy bounds, but also because of a fundamental difficulty linked to the use of the Hadamard inequality. Actually the kind of Hadamard bound relevant for a model of fermions with two spin states is
$$\left|\frac{\lambda^n}{n!}\det{}^2 A_n\right| \le \left(|\lambda| a^2\right)^n \frac{n^n}{n!},$$
where $A_n$ is an $n \times n$ matrix whose coefficients are all bounded by $a$.
Hence (using Stirling's formula), the radius of convergence in $\lambda$ of that series is only shown to be at least $1/(e a^2)$ by this bound, whereas $1/a^2$ would be expected from perturbation theory. For this reason it seems to us that the analyticity radius obtained by any method based on the Hadamard bound is smaller than the optimal radius by a factor at least $1/e$, and we do not know how to cure this defect.
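The loss of a factor $e$ described above can be made concrete with a small numerical experiment. The sketch below is purely illustrative and not part of the paper's argument: it checks the elementary Hadamard estimate $|\det A_n| \le (a\sqrt{n})^n$ on random matrices with entries bounded by $a$, and evaluates $(n^n/n!)^{1/n}$, which tends to $e$ by Stirling's formula. All function names are hypothetical, and the determinant routine is a plain Gaussian elimination written only for this example.

```python
import math
import random


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col + 1, n):
                a[r][c] -= factor * a[col][c]
    return result


def hadamard_bound(n, a):
    """Hadamard bound (a * sqrt(n))**n for an n x n matrix with entries bounded by a."""
    return (a * math.sqrt(n)) ** n


def stirling_ratio_root(n):
    """(n**n / n!)**(1/n), computed via lgamma to avoid overflow."""
    return math.exp(math.log(n) - math.lgamma(n + 1) / n)
```

Squaring the Hadamard bound gives $\det^2 A_n \le a^{2n} n^n$, so the $n$-th root of the bound on the $n$-th term behaves like $|\lambda| a^2 e$ for large $n$; this is the origin of the radius $1/(e a^2)$ quoted in the text.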

II Model

We consider the simple model of isotropic jellium in three spatial dimensions with a local four point interaction. We use the formalism of non-relativistic field theory at imaginary time of [FT1-2-BG] to describe the interacting fermions at finite temperature. Our model is therefore similar to the Gross-Neveu model, but with a different, non relativistic propagator.

II.1 Free propagator

Using the Matsubara formalism, the propagator at temperature $T$, $C(x_0, \vec{x})$, is antiperiodic in the variable $x_0$ with antiperiod $\frac{1}{T}$. This means that the Fourier transform defined by
$$\hat{C}(k) = \frac{1}{2}\int_{-\frac{1}{T}}^{\frac{1}{T}} dx_0 \int d^3x\; e^{-ikx}\,C(x) \tag{II.1}$$
is not zero only for discrete values (called the Matsubara frequencies):
$$k_0 = \frac{2n+1}{\beta}\pi, \qquad n \in \mathbb{Z}, \tag{II.2}$$

where $\beta = 1/T$ (we take $\hbar = k = 1$). Remark that only odd frequencies appear, because of antiperiodicity. Our convention is that a four dimensional vector is denoted by $x = (x_0, \vec{x})$ where $\vec{x}$ is the three dimensional spatial component. The scalar product is defined as $kx := -k_0 x_0 + \vec{k}\cdot\vec{x}$. By some slight abuse of notation we may write either $C(x - \bar{x})$ or $C(x, \bar{x})$, where the first point corresponds to the field and the second one to the antifield (using translation invariance of the corresponding kernel). Actually $\hat{C}(k)$ is obtained from the real time propagator by changing $k_0$ to $ik_0$ and is equal to:
$$\hat{C}_{ab}(k) = \delta_{ab}\,\frac{1}{ik_0 - e(\vec{k})}, \qquad e(\vec{k}) = \frac{\vec{k}^2}{2m} - \mu, \tag{II.3}$$
where $a, b \in \{\uparrow, \downarrow\}$ are the spin indices. The vector $\vec{k}$ is three-dimensional. Since our theory has three spatial dimensions and one time dimension, there are really four dimensions. The parameters $m$ and $\mu$ correspond to the effective mass and to the chemical potential (which fixes the Fermi energy). To simplify notation we put $2m = \mu = 1$, so that, if $\rho = |\vec{k}|$, $e(\vec{k}) = e(\rho) = \rho^2 - 1$. Hence,
$$C_{ab}(x) = \frac{1}{(2\pi)^3\beta}\sum_{k_0}\int d^3k\; e^{ikx}\,\hat{C}_{ab}(k). \tag{II.4}$$
The notation $\sum_{k_0}$ really means the discrete sum over the integer $n$ in (II.2). When $T \to 0$ (which means $\beta \to \infty$) $k_0$ becomes a continuous variable, the corresponding discrete sum becomes an integral, and the corresponding propagator $C_0(x)$ becomes singular on the Fermi surface defined by $k_0 = 0$ and $|\vec{k}| = 1$. In the following, to simplify notation, we will write:
$$\int d^4k \equiv \frac{1}{\beta}\sum_{k_0}\int d^3k, \qquad \int d^4x \equiv \frac{1}{2}\int_{-\beta}^{\beta} dx_0\int d^3x. \tag{II.5}$$
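The link between antiperiodicity and the odd frequencies in (II.2) is easy to illustrate. The following sketch (an illustration only; the function names are hypothetical) builds a few Matsubara frequencies $k_0 = (2n+1)\pi/\beta$ and evaluates the time dependence $e^{-ik_0x_0}$ that enters $e^{ikx}$ through the convention $kx = -k_0x_0 + \vec{k}\cdot\vec{x}$.

```python
import cmath
import math


def matsubara_frequencies(T, n_max):
    """Fermionic Matsubara frequencies k0 = (2n+1)*pi/beta for -n_max <= n < n_max."""
    beta = 1.0 / T
    return [(2 * n + 1) * math.pi / beta for n in range(-n_max, n_max)]


def time_mode(k0, x0):
    """Time dependence exp(-i*k0*x0) of a single Fourier mode."""
    return cmath.exp(-1j * k0 * x0)
```

Shifting $x_0$ by the antiperiod $\beta = 1/T$ multiplies each mode by $e^{-i(2n+1)\pi} = -1$, while a shift by $2\beta$ leaves it unchanged, which is consistent with the integration range $[-\beta, \beta]$ and the factor $\frac{1}{2}$ in (II.1) and (II.5).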

II.2 Ultraviolet cutoff

It is convenient to add a continuous ultraviolet cut-off (at a fixed scale $\Lambda_u$) to the propagator (II.3) for two reasons: first because it makes its Fourier transformed kernel in position space well defined, and second because a non relativistic theory does not make sense anyway at high energies. To preserve physical (or Osterwalder-Schrader) positivity one should introduce this ultraviolet cutoff only on spatial frequencies [FT2]. However for convenience we introduce this cutoff both on spatial and on Matsubara frequencies as in [FMRT]; indeed the Matsubara cutoff could be lifted with little additional work. The propagator (II.3) equipped with this cut-off is called $C^u$ and is defined as:
$$\hat{C}^u(k) := \hat{C}(k)\,[u(r)]\big|_{r = k_0^2 + e^2(\vec{k})} \tag{II.6}$$
where the compact support function $0 \le u(r) \in C_0^\infty(\mathbb{R})$ satisfies: $u(r) = 1$ for $r \le 1$, $u(r) = 0$ for $r > 10$.

II.3 Position space

In the following we will use the propagator in position space. The key point for further analysis is to write it as
$$C^u(\vec{x}, t) = \frac{1}{1 + |\vec{x}|}\;\frac{1}{1 + f(t) + |\vec{x}|}\;F(\vec{x}, t) \tag{II.7}$$
where $f(t)$ is defined by
$$f(t) := \frac{|\sin(2\pi T t)|}{2\pi T} = \varepsilon(t)\,\frac{\sin(2\pi T t)}{2\pi T}, \qquad t \in \left[-\frac{1}{T}, \frac{1}{T}\right], \tag{II.8}$$
and $\varepsilon(t)$ is the sign of $\sin(2\pi T t)$. This is useful since the remaining function $F$ has a spatial decay scaled with $T$, and no global scaling factor in $T$, as proved in the following lemma.

Lemma 1 For any $p \ge 1$, there exists $K_p$ such that the function $F(\vec{x}, t)$ defined by (II.7) satisfies
$$|F(\vec{x}, t)| \le \frac{K_p}{(1 + T|\vec{x}|)^p} \qquad \forall p \ge 1. \tag{II.9}$$

Proof. In radial coordinates the propagator is written as
$$C^u(\vec{x}, t) = \frac{T}{(2\pi)^3}\sum_{k_0}\int_0^{2\pi} d\phi\int_0^{\pi} d\theta\,\sin\theta\int_0^{\infty} d\rho\;\rho^2\,\frac{e^{i\rho|\vec{x}|\cos\theta - ik_0t}}{ik_0 - e(\rho)}\;u\left[k_0^2 + e^2(\rho)\right]. \tag{II.10}$$
By symmetry considerations, changing $\theta$ to $\pi - \theta$, we can rewrite this as
$$C^u(\vec{x}, t) = \frac{T}{2(2\pi)^3}\sum_{k_0}\int_0^{2\pi} d\phi\int_0^{\pi} d\theta\,\sin\theta\int_{-\infty}^{\infty} d\rho\;\rho^2\,\frac{e^{i\rho|\vec{x}|\cos\theta - ik_0t}}{ik_0 - e(\rho)}\;u\left[k_0^2 + e^2(\rho)\right]. \tag{II.11}$$
Vol. 2, 2001


Now we write the integral over θ as π dθ sin θ eiρ|x| cos θ =

1

737

dv eiρ|x|v

(II.12)

i d 1− eiρ|x|v ρ dv

(II.13)

−1

0

and applying twice the identity iρ| x|v

e

1 = (1 + | x|)

we obtain 1 dv eiρ|x|v = −1

1 1 i d eiρ|x|v dv 1 − (II.14) 1 + | x| −1 ρ dv 1 1 1 iρ|x| = dv eiρ|x|v + − e−iρ|x| e 1 + | x| −1 iρ 1 1 (2 + | x|) 1 iρ|x| iρ| x|v −iρ| x| . = dv e − e e + (1 + | x|)2 −1 (1 + | x|)2 iρ

We decompose further, introducing for the first term 1 = χ(| x| ≤ 1) + χ(| x| > 1), where χ is the characteristic function of the event indicated, and perform the v integration for the second term only. In this way the function F can be written as a sum of two terms F = F1 + F2 where (1 + f (t) + | x|) T (1 + | x|) 2(2π)2 ∞ 1 u[k02 + e2 (ρ)] dv dρ ρ2 eiρ|x|v−ik0 t ik0 − e(ρ) −1 −∞

F1 = χ(| x| ≤ 1)

(II.15)

k0

F2 =

T (2 + | x| + χ(| x| > 1)/| x|)(1 + f (t) + | x|) (1 + | x|) 2(2π)2 ∞ σρ u[k02 + e2 (ρ)] dρ eiσρ|x|−ik0 t . i ik0 − e(ρ) σ=±1 −∞

(II.16)

k0

Now we apply on F1 and on F2 the identity

∆ 1 d i iai ρ| x|−ik0 t eiai ρ|x|−ik0 t = 1 + ε(t) i − 1 + f (t) − ε(t)ai | x| e 2 ∆k0 2 dρ (II.17) where we defined a1 =: v for F1 and a2 =: σ for F2 , and where the discretized ∆ derivative ∆k on a function F (k0 ) is defined by 0 ∆ 1 [F (k0 + 2πT ) − F (k0 − 2πT )] . F (k0 ) = ∆k0 4πT

(II.18)




Hence integrating by parts the Fi ’s are written as ∞ T 1 F1 ( x, t) = dvf ( x , t, v) dρeiρ|x|v−ik0 t G1 (k0 , ρ) 1 2(2π)2 −1 −∞ k0 ρ2 u k02 + e2 (ρ) G1 (k0 , ρ) = [1 + ε(t)∆] ik0 − e(ρ)

F2 ( x, t) =

∞ T σ f ( x , t, σ) dρ G2 (k0 , ρ) eiσρ|x|−ik0 t 2 2(2π)2 −∞ i k0 σ ρ u k02 + e2 (ρ) G2 (k0 , ρ) = [1 + ε(t)∆] ik0 − e(ρ)

(II.19)

(II.20)

where we have defined f1 ( x, t, v) = χ(| x| ≤ 1) f2 ( x, t, σ) =

(1 + f (t) + | x|) (1 + | x|)(1 + f (t) − 2i ε(t)v| x|)

1 + f (t) + | x| 2 + | x| + χ(| x| > 1)/| x| 1 + | x| (1 + f (t) − 2i ε(t)| x|σ)

1 d ∆ ∆= −i . 2 dρ ∆k0

(II.21) (II.22) (II.23)

Remark that these functions are uniformly bounded in modulus (f1 is bounded by 1 and f2 by 6). The signs and coefficients in ∆ have been optimized in order to obtain a positive factor 1 + f (t) and to minimize the action of ∆ on (ik0 − e(ρ))−1 . After a tedious but trivial computation, we find &(t)bi ρbi −1 u[k02 + e2 (ρ)] u[k02 + e2 (ρ)] Gi =: [1 + &(t)∆] ρbi = ρbi + ik0 − e(ρ) 2 ik0 − e(ρ) 2 2ρ(ρ − 1) ik0 (ik0 − e(ρ)) − +ρbi &(t) u [k02 + e2 (ρ)] ik0 − e(ρ) [ik0 − e(ρ)]2 + 4π 2 T 2 (ρ − 1)[ik0 − e(ρ)]2 + 4π 2 T 2 ρ 2 [ik0 − e(ρ)]2 [ik0 − e(ρ)] + 4π 2 T 2 O(T )

+u[k02 + e2 (ρ)]

+

[ik0 − e(ρ)]2 + 4π 2 T 2

(II.24)

where b1 = 2 for G1 and b2 = 1 for G2 . Using these explicit expressions it now easy to check that F1 and F2 are uniformly bounded by some constant K (independent of T as T → 0). To complete the proof of Lemma 1, there remains to check that these functions F1 and F2 also decay like any power as T | x| → ∞. For F1 there is obviously nothing to check


remarking the function $\chi(|\vec{x}| \le 1)$ in (II.21). Hence we have only to prove
$$(1 + |\vec{x}|T)^p\,|F_2| \le K_p \tag{II.25}$$
for some constant $K_p$ independent of $T$. Since
$$(1 + |\vec{x}|T)^p\,e^{i\sigma|\vec{x}|\rho} = \left(1 - T\,\frac{i}{\sigma}\frac{d}{d\rho}\right)^p e^{i\sigma|\vec{x}|\rho} \tag{II.26}$$
we have
$$\left|(1 + |\vec{x}|T)^p F_2(\vec{x}, t)\right| \le T\,K_1 \sup_{\sigma = \pm 1}\sum_{k_0}\int_{-\infty}^{\infty} d\rho\,\left|\left(1 + T\,\frac{i}{\sigma}\frac{d}{d\rho}\right)^p G_2(k_0, \rho)\right| \tag{II.27}$$
where we bounded the factors $|f_i|$ by constants. Now, performing the change of variable $w = \rho^2 - 1$, using the fact that the $u$ function has compact support, and the fact that the sum over $k_0$ is bounded away from 0 since by (II.2) $|k_0| \ge T$, it is a trivial power counting exercise to check that (II.27) is actually bounded by a constant.

Remark that it is not possible to improve significantly Lemma 1. Actually if we try in (II.7) to obtain e.g. more factors such as $(1 + f(t) + |\vec{x}|)$, identity (II.17) should be applied several times and the action of two or more $\Delta$ operators on the free propagator (II.3) would generate terms that diverge when $T \to 0$. Similarly, if the factor $(1 + |\vec{x}|)$ appears more than one time, some corresponding factors $f_i$ would not remain bounded when $|\vec{x}| \to \infty$. In the following we will use the spatial decay of the propagator to integrate and the following lemma will be useful.

Lemma 2 Let the interval $\left[-\frac{1}{T}, \frac{1}{T}\right]$ be divided into eight sub-intervals
$$I_j := \left[-\frac{1}{T} + \frac{j-1}{4T},\; -\frac{1}{T} + \frac{j}{4T}\right], \qquad 1 \le j \le 8. \tag{II.28}$$
Then
$$\frac{1}{1 + f(t) + |\vec{x}|} \le \frac{1}{1 + \frac{2}{\pi}|t - t_j| + |\vec{x}|} \tag{II.29}$$
where $t_j = -\frac{1}{T} + \frac{j-1}{4T}$ for $j$ odd and $t_j = -\frac{1}{T} + \frac{j}{4T}$ for $j$ even.

Proof. Remember that $f(t) = \varepsilon(t)\frac{\sin 2\pi T t}{2\pi T}$ is positive and periodic with period $1/2T$ (see Fig. 1). In each interval $I_j$ with $j$ odd, the function $\varepsilon(t)\sin 2\pi T t$ is greater than or equal to the line $4T(t - t_j)$, while for $j$ even it is greater than or equal to the decreasing line $-4T(t - t_j)$. The proof follows.$^1$

$^1$ Splitting $C = \sum_{j=1}^{8} C_j$ according to which interval we are in, and taking $t_j$ as the new origin, we could in fact obviously restrict ourselves to proving the main result of this paper for $j = 5$, where $t \ge 0$ and $f(t) \ge 2t/\pi$.

Figure 1: The function ε(t) sin(2πT t) (horizontal axis $t$, marked at $-1/T$, $-1/2T$, $1/2T$, $1/T$)
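The geometric content of Lemma 2 is that, on each quarter-period $I_j$, the arch $|\sin(2\pi T t)|$ lies above the chord through its zero $t_j$, i.e. $f(t) \ge \frac{2}{\pi}|t - t_j|$. The sketch below checks this on a grid; it is an illustration only, with hypothetical function names and an arbitrary choice of $T$ and sample count.

```python
import math


def f(t, T):
    """f(t) = |sin(2*pi*T*t)| / (2*pi*T), as in (II.8)."""
    return abs(math.sin(2.0 * math.pi * T * t)) / (2.0 * math.pi * T)


def sub_interval(j, T):
    """Endpoints of I_j from (II.28), for 1 <= j <= 8."""
    return (-1.0 / T + (j - 1) / (4.0 * T), -1.0 / T + j / (4.0 * T))


def anchor(j, T):
    """t_j: left endpoint of I_j for odd j, right endpoint for even j."""
    left, right = sub_interval(j, T)
    return left if j % 2 == 1 else right


def worst_slack(T, samples=1000):
    """Minimum over all I_j of f(t) - (2/pi)*|t - t_j| on a uniform grid."""
    worst = float("inf")
    for j in range(1, 9):
        left, right = sub_interval(j, T)
        t_j = anchor(j, T)
        for i in range(samples + 1):
            t = left + (right - left) * i / samples
            worst = min(worst, f(t, T) - (2.0 / math.pi) * abs(t - t_j))
    return worst
```

A non-negative `worst_slack(T)` (up to rounding) is exactly the inequality behind (II.29), since increasing the denominator $1 + f(t) + |\vec{x}|$ can only decrease the fraction.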

II.4 Heuristic discussion

Before going on into technical details, we include here some informal discussion, according to the referee’s suggestion. The particular expansion scheme below may seem unnecessarily complicated, but it has been developed to overcome a series of hurdles, that we try to sketch here. - First the choice of the Hadamard inequality is up to now the only way to overcome the main difficulty of three dimensional Fermionic models, namely the nonplanarity of the three dimensional vertex. This comes with a price: since Hadamard’s bound consumes the symmetry factorial of the vertices, nothing is left to make a tree expansion, which is usually the simplest way to treat constructively a Fermionic theory (see e.g. [DR1]). A consequence is that one needs to treat the infinite volume limit as in a bosonic theory, using cluster expansions [MR]. - In [MR] a single step of the renormalization group is performed, but there is nevertheless some multiscale aspect, because an expansion had to be performed with respect to a superrenormalizable auxiliary index. The corresponding cluster expansions were made with respect to rectangular rather than square boxes. This is due to the difference between space and time decrease of the propagator that appears in the previous subsection. Therefore the naive multiscale generalization of [MR] would be a somewhat mind-boggling “double index” expansion with rectangles of all sizes and aspects. We prefer to avoid this complication, and to stay within the much more familiar renormalization group picture of a multiscale cluster expansion performed with respect to a single sequence of growing cubic lattices. It is for that purpose that we introduce in the next subsection a scale decomposition solely related to the size of x and t. Hence the slices used in this paper are not the usual momentum shells around the Fermi surface of [FT1-2] or of [MR], although they are loosely related to them. Our decomposition is anisotropic. Two conditions have to be satisfied. 
One of them involves the square of the propagator, $C^2$, since the Hadamard bound is written in terms of $C^2$. The other one involves directly the propagator $C$, since it is used to perform the “horizontal” cluster expansion
between cubes of a given scale. These two conditions are explained in detail in Appendix A. - After applying the Hadamard bound the power counting factors of the propagators is entirely consumed and nothing is left to sum over the scales of the four fields hooked to a given vertex. Roughly speaking for symmetric vertices, i.e. when the scales of the four fields are summed over identical intervals, these sums can be paid for by the fact that the coupling constant is in c/| log T |. On the other hand the vertical part of a multiscale expansion (the one that couples together the different scales) typically dissymmetrizes the field scales of the vertices, introducing some constraints. For instance one considers usually that a vertical coupling at scale j is made of a vertex with at least one field of scale higher than j and one field of scale lower than j. But after applying Hadamard’s inequality there is not enough decay to sum over j for such vertices. Our inductive definition of the vertical expansion in section III may seem complicated but it is designed to carefully avoid this problem. It extends the notion of vertical coupling at scale j to include any vertex with at least one field of scale higher than j (but not necessarily any field of scale lower than j!). We remark that the corresponding extended vertical expansion would not work for an ordinary bosonic theory. Indeed it can create an arbitrary number of vertices say in a single cube of the first slice, corresponding to vertical connections of lower scales, and this leads to the usual bosonic divergence of perturbation theory. However remember that Fermionic perturbation theory with cutoffs converges, and this is why this new kind of vertical expansion is possible in our context! Nevertheless to implement this convergence in practice is quite subtle. It requires in particular an optimal use of the two different forms of the Hadamard inequality (IV.113) and (IV.114), for rows and for columns. 
This is done in subsection IV.3, introducing a so-called weight expansion, which is really the core of our paper. Let us finally mention some technical complications related to our extended rules for vertical connections. The vertical expansion is inductive, since new vertical connections must bring in new vertices, and we have not been able to cast it in the most symmetric and compact tree formalism of Brydges-Kennedy [BK] [AR2]. Since we use instead the older formalism of Brydges-Battle-Federbush [BF1-2], we have to exploit carefully the additional 1/n! factors that are hidden in the integrals over the interpolating factors. This is explained in subsection IV.7.1. Also our inductive rules for this vertical expansion create quite naturally redundant vertical connections. Nevertheless to organize the sum over the positions of the cubes of a multiscale connected component of the expansion, it is convenient to extract a tree out of these redundant connections, which joins all the cubes. This tree extraction is explained in subsection III.3.

II.5 Slice decomposition

To introduce multiscale analysis we can work directly in position space. We then write the propagator as

$$C^u(\vec x,t) = \sum_{j=0}^{j_M+1} C^j(\vec x,t)\,; \qquad C^j(\vec x,t) = C^u(\vec x,t)\,\chi_{\Omega_j}(\vec x,t) \tag{II.30}$$

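where $\chi_\Omega(\vec x,t)$ is the characteristic function of the subset $\Omega \subset \mathbb{R}^4$,

$$\chi_\Omega(\vec x,t) = \begin{cases} 1 & \text{if } (\vec x,t)\in\Omega \\ 0 & \text{otherwise} \end{cases} \tag{II.31}$$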

and the subset $\Omega_j$ is defined as follows:

$$\Omega_j = \begin{cases} \{(\vec x,t) \mid M^{j-1} \le (1+|\vec x|)^{3/4}\,(1+f(t)+|\vec x|)^{1/4} < M^{j}\} & 0\le j\le j_M \\ \{(\vec x,t) \mid M^{j_M} \le (1+|\vec x|)^{3/4}\,(1+f(t)+|\vec x|)^{1/4}\} & j = j_M+1 \end{cases} \tag{II.32}$$

where $M > 0$ is a constant that will be chosen later. In Appendix A we discuss why the relative powers 3/4 and 1/4 for $(1+|\vec x|)$ and $(1+f(t)+|\vec x|)$ are convenient. $j_M$ is defined as the temperature scale $M^{j_M} \simeq 1/T$, more precisely

$$j_M = 1 + I\left(\frac{\ln T^{-1}}{\ln M}\right) \tag{II.33}$$

where $I$ means the integer part. With these definitions

$$\sum_{j=0}^{j_M+1} \chi_{\Omega_j}(\vec x,t) = 1 . \tag{II.34}$$
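As a concrete illustration of (II.32) and (II.33), the following minimal sketch computes $j_M$ and the slice index of a point from $|\vec x|$ and $f(t)$. It assumes $M > 1$ and takes the value of $f(t)$ as a plain non-negative number supplied by the caller; the function names `j_max` and `slice_index` are hypothetical and do not come from the paper.

```python
import math

def j_max(M: float, T: float) -> int:
    """Temperature scale j_M = 1 + I(ln(1/T) / ln M), as in (II.33)."""
    return 1 + math.floor(math.log(1.0 / T) / math.log(M))

def slice_index(x_norm: float, f_t: float, M: float, T: float) -> int:
    """Index j of the slice Omega_j of (II.32) containing the point.

    x_norm stands for |x| and f_t for f(t); both are assumed non-negative,
    and M is assumed to be larger than 1.
    """
    jM = j_max(M, T)
    # Anisotropic "radius" whose size decides the slice.
    r = (1.0 + x_norm) ** 0.75 * (1.0 + f_t + x_norm) ** 0.25
    for j in range(jM + 1):
        if M ** (j - 1) <= r < M ** j:
            return j
    # Anything at or beyond M**jM falls in the last slice j_M + 1.
    return jM + 1
```

Each point is assigned to exactly one slice, which is the content of (II.34). Under these assumptions $r \ge 1$, so the smallest index the sketch returns is $j = 1$.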

This decomposition is somewhat dual to the usual slice decomposition in momentum space of the renormalization group. Now, for each slice $j$ we can introduce a corresponding lattice decomposition. We work at finite volume $\Lambda := [-\beta,\beta]\times\Lambda'$, where $\Lambda'$ is a finite volume in the three dimensional space. For $j \le j_M$ we partition $\Lambda$ in cubes of side $M^j$ in all directions, forming the lattice $D_j$. For that we introduce the function

$$\chi_\Delta(x) = \begin{cases} 1 & \text{if } x\in\Delta \\ 0 & \text{otherwise}\end{cases} \tag{II.35}$$

satisfying $\sum_{\Delta\in D_j}\chi_\Delta(x) = \chi_\Lambda(x)$. For $j = j_M+1$ we partition $\Lambda$ in cubes of side $M^{j_M}$ in all directions, forming the lattice $D_{j_M+1} = D_{j_M}$. We define the union of all partitions $D = \cup_j D_j$.


Auxiliary scales. The function $\chi_{\Omega_j}$ actually mixes temporal and spatial coordinates. In order to sharpen the analysis of $\vec x$ and $t$, we will need later an auxiliary slice decoupling for each scale $j$:

$$C^j(\vec x,t) = \sum_{k=0}^{k_M(j)} C^{j,k}(\vec x,t)\,; \qquad C^{j,k}(\vec x,t) = C^j(\vec x,t)\,\chi_{\Omega_{j,k}}(t) \tag{II.36}$$

where, for any $j\le j_M$, we defined

$$\Omega_{j,k} = \begin{cases}\{t \mid M^{j+k-1}\le f(t) < M^{j+k}\} & k>0\\ \{t\mid 0\le f(t)<M^{j}\} & k=0\end{cases} \tag{II.37}$$

and $k_M(j)$ is defined as

$$k_M(j) = \min\{j_M - j,\; 3j\} . \tag{II.38}$$

The bound $k\le j_M-j$ is obtained observing that $f(t)\le M^{j_M}$ in any case by periodicity. The bound $k\le 3j$ is obtained observing that $(1+f(t))^{1/4}\le M^{j}$. The case $j=j_M+1$ is special. In this case we must have $0\le f(t)\le M^{j_M}$ by periodicity, therefore there is no $k$ decomposition. Actually we say that $k=0$ and we define

$$\Omega_{j_M+1,0} = \{t \mid 0\le f(t)\le M^{j_M}\} \tag{II.39}$$

Spatial constraints. For any $j$ and $k$ fixed, the spatial decay is constrained too. We must distinguish three cases:

• $j\le j_M$ and $k>0$: then there is a non zero contribution only for
$$M^{\,j-\frac{k}{3}-\frac{4}{3}}\;2^{-\frac13} \le (1+|\vec x|) \le M^{\,j-\frac{k}{3}+\frac13} \tag{II.40}$$

• $j\le j_M$ and $k=0$: then there is a non zero contribution only for
$$M^{\,j-\frac{4}{3}}\;2^{-\frac13} \le (1+|\vec x|) \le M^{\,j} \tag{II.41}$$

• $j=j_M+1$: then there is a non zero contribution only for
$$M^{\,j_M}\;2^{-\frac13} \le (1+|\vec x|) \tag{II.42}$$

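Power counting and scaled decay of the propagator. Now for each $j$ and $k$ we can estimate more sharply the propagator $C^{j,k}$. We distinguish three cases:

• for $j\le j_M$ and $k>0$ we have
$$\left|C^{j,k}(\vec x,t)\right| \le K_1\, M^{-2j-\frac23 k}\, M^{\frac73}\, 2^{\frac13}\, \chi_{j,k}(\vec x,f(t)) \tag{II.43}$$
where the function $\chi_{j,k}$ is defined by
$$\chi_{j,k}(\vec x,t) = \begin{cases}1 & \text{if } |\vec x|\le M^{\,j-\frac{k}{3}+\frac13},\ f(t)\le M^{j+k}\\ 0&\text{otherwise}\end{cases}\tag{II.44}$$
and the function $F(\vec x,t)$ is bounded by $K_p$.

• for $j\le j_M$ and $k=0$ we have
$$\left|C^{j,k}(\vec x,t)\right| \le K_1\, M^{-2j}\, M^{\frac83}\, 2^{\frac23}\, \chi_{j,0}(\vec x,f(t)) \tag{II.45}$$
where the function $\chi_{j,0}$ is defined by
$$\chi_{j,0}(\vec x,t) = \begin{cases}1 & \text{if } |\vec x|\le M^{j},\ f(t)\le M^{j}\\ 0&\text{otherwise}\end{cases}\tag{II.46}$$

• for $j=j_M+1$ we have
$$\left|C^{j_M+1,0}(\vec x,t)\right| \le M^{-2j_M}\, 2^{\frac23}\, \chi_{j_M+1,0}(f(t))\, \frac{K_p}{(1+M^{-j_M}|\vec x|)^{p}} \tag{II.47}$$
where the function $\chi_{j_M+1,0}$ is defined by
$$\chi_{j_M+1,0}(t) = \begin{cases}1 & \text{if } f(t)\le M^{j_M}\\ 0&\text{otherwise}\end{cases}\tag{II.48}$$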

and the spatial decay for $|\vec x|$ comes from the decay of the function $F$ in (II.9). In the following, the multiscale analysis is essentially performed using the $j$ index. The auxiliary structure will be introduced only in section IV. In that section we will also need to exchange the sums over $j$ and $k$. The constraints on the maximal value of $k$, $k_M(j)$, are then changed into constraints on $j$:

$$\sum_{j=0}^{j_M+1}\ \sum_{k=0}^{k_M(j)} C^{j,k} = \sum_{k=0}^{\frac{3 j_M}{4}}\ \sum_{j\in J(k)} C^{j,k} \tag{II.49}$$

where

$$J(k) = \left[\tfrac{k}{3},\; j_M-k\right] \quad \text{for } k>0 \tag{II.50}$$
$$J(0) = [0,\; j_M+1] \tag{II.51}$$
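The exchange of sums in (II.49) can be checked by brute force. The sketch below enumerates the index pairs $(j,k)$ in both orders and compares the two sets; it assumes integer indices, so that $j \ge k/3$ is read as $j \ge \lceil k/3\rceil$ and the upper limit $3j_M/4$ as its integer part. The function names are hypothetical.

```python
def pairs_by_j(jM: int) -> set:
    """(j, k) pairs in the order 'sum over j, then k <= k_M(j)'."""
    pairs = {(j, k) for j in range(jM + 1)
             for k in range(min(jM - j, 3 * j) + 1)}
    pairs.add((jM + 1, 0))  # the last slice has no k decomposition
    return pairs

def pairs_by_k(jM: int) -> set:
    """(j, k) pairs in the order 'sum over k, then j in J(k)'."""
    pairs = {(j, 0) for j in range(jM + 2)}        # J(0) = [0, jM + 1]
    for k in range(1, (3 * jM) // 4 + 1):
        j_low = (k + 2) // 3                       # smallest integer >= k/3
        pairs |= {(j, k) for j in range(j_low, jM - k + 1)}
    return pairs
```

For instance `pairs_by_j(4) == pairs_by_k(4)` holds, and both contain the extreme pair `(1, 3)` where $k$ reaches $3j_M/4$.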

II.6 Partition function

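We introduce now the local four point interaction

$$I(\psi,\bar\psi) = \lambda\int_\Lambda d^4x\,(\bar\psi_\uparrow\psi_\uparrow)(\bar\psi_\downarrow\psi_\downarrow) = \lambda\int_\Lambda d^4x \prod_{c=1}^{4}\psi_c , \tag{II.52}$$

where $\psi_c$ is defined as:

$$\psi_1=\bar\psi_\uparrow,\qquad \psi_2=\psi_\uparrow,\qquad \psi_3=\bar\psi_\downarrow,\qquad \psi_4=\psi_\downarrow . \tag{II.53}$$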

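The partition function is then defined as

$$Z_\Lambda^u = \int d\mu_{C^u}(\psi,\bar\psi)\, e^{I(\psi,\bar\psi)} = \sum_{n=0}^{\infty} \frac{1}{n!}\int d\mu_{C^u}(\psi,\bar\psi)\, I(\psi,\bar\psi)^n = \sum_{n=0}^{\infty}\frac{1}{n!}\int d\mu_{C^u}(\psi,\bar\psi)\prod_{v\in V} I_v(\psi,\bar\psi) \tag{II.54}$$

where $V$ is the set of $n$ vertices and $I_v(\psi,\bar\psi)$ denotes the local interaction at vertex $v$. Now we can introduce slice decomposition over fields:

$$\psi_c = \sum_{j=0}^{j_M+1} \psi_c^{j} \tag{II.55}$$

hence

$$I_v(\psi,\bar\psi) = \lambda\int_\Lambda d^4x_v \sum_{J_v}\prod_{c=1}^{4} \psi_c^{j_c^v} \tag{II.56}$$

where $x_v$ is the position of the vertex $v$, and $J_v = (j_1^v,j_2^v,j_3^v,j_4^v)$ gives the slice indices for the fields hooked to $v$. Now we write

$$I(v) = \lambda\sum_{J_v}\sum_{\Delta_v}\int_{\Delta_v} d^4x_v\prod_{c=1}^{4}\psi_c^{j_c^v} \tag{II.57}$$

where $\Delta_v\in D_0$ and

$$Z_\Lambda^u = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}\sum_{J_V}\sum_{\Delta_V}\prod_{v}\int_{\Delta_v}d^4x_v\int d\mu_{C^u}(\psi,\bar\psi)\prod_{v}\prod_{c=1}^{4}\psi_c^{j_c^v}(x_v) , \tag{II.58}$$

where we denoted any set $\{a_v\}_{v\in V}$ by $a_V$.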

Figure 2: Ancestor (labels: scales $j$ and $j+1$, cube $\Delta$, ancestor $A(\Delta)$).

The Grassmann functional integral at the $n$-th order in (II.58) can be written as a determinant

$$\int d\mu_{C^u}(\psi,\bar\psi)\prod_{v}\prod_{c=1}^{4}\psi_c^{j_c^v}(x_v) = \det M(J_V,\{x_v\}) \tag{II.59}$$

where $M(J_V,\{x_v\})$ is a $2n\times 2n$ matrix, whose rows correspond to fields and whose columns correspond to antifields. Therefore, for a given vertex $v$, $\psi_1(x_v)$ and $\psi_3(x_v)$ correspond to columns and $\psi_2(x_v)$ and $\psi_4(x_v)$ correspond to rows. The matrix element is then

$$M_{vc;\bar v\bar c} = \delta_{j_c^v,\,j_{\bar c}^{\bar v}}\; C^{j_c^v}(x_v,x_{\bar v}) \tag{II.60}$$

where $c\in C =: \{2,4\}$ are field indices and $\bar c\in\bar C =: \{1,3\}$ are antifield indices.

Notations. For each cube $\Delta$ we denote by $i_\Delta$ its slice index, that is $\Delta\in D_j$ with $j = i_\Delta$. We call ancestor of any cube $\Delta\in D_j$, $A(\Delta)$, the unique cube $\Delta'\in D_{j+1}$ satisfying $\Delta\subset\Delta'$ (see Fig. 2). In the same way for any set $S$ of cubes in $D_j$, we call ancestor of $S$ the set $A(S)=\cup_{\Delta\in S}A(\Delta)$. We call $\Delta_v^j$ the unique cube $\Delta\in D_j$, for any $j\ge i_{\Delta_v}$, satisfying $\Delta_v\subseteq\Delta$ (for $j=i_{\Delta_v}$ we have $\Delta_v^j=\Delta_v$). (We remark that for the moment $i_{\Delta_v}=0$ for all $\Delta_v$.)

In the following we will denote by $h_c^v$ the half-line corresponding to the field $\psi_c^{j_c^v}(x_v)$. We say that $h_c^v$ is an external field for the cube $\Delta$ if $\Delta_v\subseteq\Delta$, $i_\Delta<j_c^v$ and there exists at least one field $h_{c'}^v$ hooked to $v$ (different from $h_c^v$) with attribution $j_{c'}^v\le i_\Delta$ (see Fig. 3). We call $E(\Delta)$ the set of external fields and antifields of $\Delta$. In the same way we denote by $E(S)=\cup_{\Delta\in S}E(\Delta)$ the set of external fields and antifields of the subset $S\subset D_j$.

We need also to introduce some notations for the fields with smallest index attached to a vertex $v$. We call $i_v$ the smallest scale of the vertex $v$, $n_v$ the number of fields hooked to $v$ with band index $j=i_v$ ($1\le n_v\le 4$) and $\sigma_v$ the set of indices of these $n_v$ fields with $j=i_v$, which is necessarily non-empty. Finally we distinguish the particular field in $\sigma_v$ with lowest value of $c$, which we call $c_v$:

$$i_v=\inf\{j_c^v\mid 1\le c\le 4\}\,;\quad \sigma_v=\{c\mid j_c^v=i_v\}\,;\quad n_v=|\sigma_v|\,;\quad c_v=\inf\{c\in\sigma_v\} \tag{II.61}$$
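The quantities in (II.61) are simple functions of the four slice indices of a vertex. A minimal sketch, with a hypothetical function name and the field labels $c$ running from 1 to 4 as in (II.53):

```python
def smallest_scale_data(J_v):
    """Return (i_v, sigma_v, n_v, c_v) of (II.61) for the slice indices J_v.

    J_v = (j_1, j_2, j_3, j_4) lists the slice index of each of the four
    fields hooked to the vertex.
    """
    i_v = min(J_v)
    # Labels of the fields sitting in the lowest slice i_v.
    sigma_v = {c for c, j in enumerate(J_v, start=1) if j == i_v}
    return i_v, sigma_v, len(sigma_v), min(sigma_v)
```

For example, `smallest_scale_data((3, 1, 1, 2))` gives $i_v = 1$, $\sigma_v = \{2, 3\}$, $n_v = 2$ and $c_v = 2$.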

Figure 3: External fields for $\Delta$ (labels: cube $\Delta$, ancestor $A(\Delta)$).

We say that a vertex $v$ belongs to a cube $\Delta\in D_j$ if $x_v\in\Delta$, and we denote the corresponding set of vertices by

$$V(\Delta)=\{v\mid\Delta_v\subseteq\Delta\} . \tag{II.62}$$

In the same way we denote by $V(S)=\cup_{\Delta\in S}V(\Delta)$ the set of vertices belonging to the subset $S\subset D_j$. We then say that a vertex $v$ is internal for a cube $\Delta\in D_j$ if $v$ belongs to $\Delta$ and $i_v\le j$. The set of internal vertices of $\Delta$ is therefore defined as

$$I(\Delta) = V(\Delta)\cap\{v\mid i_v\le j\} . \tag{II.63}$$

We remark that there may be vertices in $V(\Delta)\setminus I(\Delta)$. In the same way we denote by $I(S)=\cup_{\Delta\in S}I(\Delta)$ the set of internal vertices for the subset $S\subset D_j$. Remark that, if $v\in I(\Delta)$, then $v\in I(\Delta')$ for any $\Delta'$ such that $\Delta\subseteq\Delta'$.
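To make the nested lattices concrete, here is a small sketch of ancestors and internal vertices. It assumes an integer $M$ and cubes of $D_j$ of the form $\prod_i [n_i M^j, (n_i+1)M^j)$, labelled by the scale $j$ and the integer vector $n$; this alignment of the lattices, and the names `cube_of`, `ancestor` and `is_internal`, are illustrative assumptions rather than the paper's conventions.

```python
def cube_of(x, j, M):
    """Label (j, n) of the cube of D_j containing the point x."""
    side = M ** j
    return (j, tuple(int(xi // side) for xi in x))

def ancestor(cube, M):
    """A(cube): the unique cube of D_{j+1} containing a given cube of D_j."""
    j, n = cube
    return (j + 1, tuple(ni // M for ni in n))

def is_internal(x_v, J_v, cube, M):
    """True if the vertex at x_v with slice indices J_v is internal for cube:
    it belongs to the cube and its smallest scale i_v is at most the cube scale."""
    j, _ = cube
    return cube_of(x_v, j, M) == cube and min(J_v) <= j
```

With $M = 2$, a vertex at `(5.0, -3.0, 0.5, 7.0)` with indices `(3, 1, 2, 2)` is not internal for its cube of $D_0$ (since $i_v = 1 > 0$) but is internal for the ancestor of that cube, in line with the monotonicity remark above.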

III Connected functions

In order to compute physical quantities, we need to extract connected functions. For instance $Z$ in perturbation theory is the sum over all vacuum graphs corresponding to the full expansion of the determinant in (II.59), and we know that the logarithm of $Z$ is the same sum but restricted to connected graphs. But while in ordinary graphs the connectedness can be read directly from the propagators joining vertices, here we need for constructive reasons to test the connection between different cubes in $D$ by a multiscale cluster expansion. Then the computation of $\log Z$ is achieved through a Mayer expansion [R]. For this purpose we must introduce two kinds of connections: vertical connections between cubes at adjacent levels $j-1$ and $j$, whose scale is defined as $j$, and horizontal connections between cubes at the same level $j$, whose scale is defined as $j$. (We remark that there is therefore no vertical connection of scale 0.)

The difficulty is that our definition of these connections is inductive, starting from the scale zero towards the scale $j_M$. We define a connected polymer $Y$ as a subset of cubes in $D$, such that for any two cubes $\Delta,\Delta'\in Y$, there exists a chain of cubes $\Delta_1,...,\Delta_N\in Y$ such that $\Delta_1=\Delta$, $\Delta_N=\Delta'$ and there is a connection between $\Delta_i$ and $\Delta_{i-1}$ for any $i=2,...,N$. For each scale $j$ we define connected subpolymers at scale $j$ as subsets of cubes belonging to $\cup_{q=0}^{j} D_q$ that are connected through connections of scale $\le j$. These are the analogs of the quasi-local subgraphs in [R]. As for usual graphs, we call $Y_k^j$ ($k=1,...,c(j)$) the $c(j)$ connected polymers at scale $j$ and $y_k^j$ their restriction to $D_j$. The set of external fields for $Y_k^j$ then corresponds to the set of external fields for $y_k^j$, which is denoted by $E(y_k^j)$. Finally for a given vertex $v$ we call $y_v^j$ the particular connected component $y_k^j$ which contains the vertex $v$.

Connections.

1) For any pair $\Delta,\Delta'$, with $\Delta,\Delta'\in D_j$ and $\Delta\ne\Delta'$, we say that there is a horizontal connection, or h-connection $(\Delta,\Delta')$, between them if there exists a propagator $C^j(x_v,x_{v'})$ with $\Delta_v\subseteq\Delta$ and $\Delta_{v'}\subseteq\Delta'$ in the expansion of the determinant of (II.59). (This definition is not inductive.) It is also convenient to introduce generalized notions: a “generalized cube” $\tilde\Delta$ of scale $j$ is a subset of cubes of scale $j$, and a generalized horizontal connection, or gh-connection $(\tilde\Delta,\tilde\Delta')$, is a propagator $C^j(x_v,x_{v'})$ with $\Delta_v\subseteq\tilde\Delta$ and $\Delta_{v'}\subseteq\tilde\Delta'$ in the expansion of the determinant of (II.59).

2) For each connected subpolymer at scale $j$, denoted by $Y$, we suppose by induction that we have defined all subconnections for the subpolymers in $Y$ of scales $\le j$. Let us suppose that $|y|=p$ and that the $p$ cubes of $y=Y\cap D_j$ are labeled as $\Delta_1,...,\Delta_p$.
• 2a) We say that there exists a vertical connection, called v-connection, between each cube $\Delta_i$ of $y$ and its ancestor $A(\Delta_i)$ for $i=1$ to $p$ if we can associate to $y$ a single new internal vertex $v$ in $Y$ that has never been associated previously by the inductive process to any previous vertical connection at scale $j'\le j$. We remark that the existence of such a single vertex always creates a set of associated vertical connections, with cardinal $|y|$. This set of v-connections is called the v-block associated to the vertex $v$. In summary, typically (when $|y|>1$) several vertical connections are associated to a single vertex, and these vertical connections can form loops (see Fig. 4).
• 2b) If condition 2a is not satisfied, i.e. there is no such new internal vertex $v$ for $y$, but $|E(y)|>0$, we say that there exists a vertical connection, called f-connection, again between each cube $\Delta_i$ of $y$ for $i=1$ to $p$ and its ancestor $A(\Delta_i)$. In this case all these vertical connections are called f-connections, and $|E(y)|$ is called the strength of each such connection. The set of all such f-connections for a fixed set of external lines is called the f-block associated to these external lines.
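Once the connections are known, the chain condition defining a connected polymer is ordinary graph connectivity. The sketch below groups cubes with a union-find structure; it takes the list of connections as given (deciding which v-, f- and h-connections exist is the inductive part described above and is not modelled here), and the function name and cube labels are hypothetical.

```python
def connected_polymers(cubes, connections):
    """Group cubes into connected polymers.

    cubes: iterable of hashable cube labels.
    connections: iterable of pairs (cube, cube), horizontal or vertical.
    Returns a list of sets, one per connected polymer.
    """
    parent = {c: c for c in cubes}

    def find(c):
        # Follow parent pointers to the representative, halving the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in connections:
        parent[find(a)] = find(b)

    groups = {}
    for c in parent:
        groups.setdefault(find(c), set()).add(c)
    return list(groups.values())
```

Two cubes end up in the same set exactly when a chain of connections joins them.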

Figure 4: Example of vertical and horizontal connections (labels: cubes $\Delta_1$, $\Delta_2$ with $A(\Delta_1)=A(\Delta_2)$).

Figure 5: An example of polymer $Y$ (axes: impulsions, positions; labels: 0, $j_m(Y)$, $j_M(Y)$, $E(\Delta)$, $\Delta$).

In fact in this paper we will restrict ourselves to the analysis and bound for connected subpolymers for which in the second case, we always have |E(y)| ≥ 6, since the other cases need renormalization. When there is no vertical connection, i.e. no new vertex, and |E(y)| = 0, we call Y simply a (vacuum) polymer.

III.1 Polymer structure

With these definitions in phase space (in our usual representation, for which index space is vertical) all polymers have a “solid on solid” profile (see Fig. 5).²

² This is not the unique possible choice. In [AR1] polymers with holes or overhangs are allowed. Here we choose polymers without holes for simplicity.

We define the lowest and highest slice index of each polymer $Y$ as

$$m_Y=\min_{\Delta\in Y}i_\Delta ,\qquad M_Y=\max_{\Delta\in Y}i_\Delta . \tag{III.64}$$

For each cube $\Delta\in Y$, we define the “exposed volume of $\Delta$” as

$$Ex(\Delta)=\bigcup_{\Delta'\in D \text{ with } \Delta=A(\Delta') \text{ and } \Delta'\notin Y}\Delta' . \tag{III.65}$$

In other words this is the part of $\Delta$ that contains no other cube of $Y$, and is therefore at the upper border of the polymer (see Fig. 5). An element $\Delta\in Y$ is called a “summit cube” if $Ex(\Delta)\ne\emptyset$, and we define the “border of $Y$”, $B(Y)$, as the union of all summit cubes: $B(Y)=\cup_{\{\Delta \mid Ex(\Delta)\ne\emptyset\}}\Delta$. We remark that $\{Ex(\Delta)\}_{\Delta\in B(Y)}$ is a partition of the volume occupied by $Y$, and the sum over $\Delta_v$ for any $v$ in $Y$ can be written as

$$\sum_{\Delta_v\in D_0}\int_{\Delta_v}dx_v=\sum_{\Delta_v\in B(Y)}\int_{Ex(\Delta_v)}dx_v \tag{III.66}$$

and we say that the vertex $v$ is localized in the summit cube $\Delta_v\in B(Y)$.

Trees and Forests. The connections among cubes in a polymer are the constructive analogs of lines in a graph. It is useful to select among these connections a minimal set, i.e. a tree connecting the cubes of the polymer. This is the purpose of the expansion defined below. But we perform this task in two steps. In the main step, called the multiscale cluster expansion, we select vertices, external lines and propagators which form v-blocks, f-blocks and gh-connections (still containing loops, see Figure 4); then in a second, auxiliary step, called the tree and root selection, we eliminate some redundant connections from the v-blocks and f-blocks, and we localize the gh-connections into ordinary h-connections, in order to obtain an ordinary tree connecting all cubes of the polymer; moreover we select for any subpolymer a particular cube called the root, in a coherent way.

Just like the definition of the connections, our expansion is inductive. The multiscale expansion starts from the slices with lowest index towards the ones with higher index. The tree and root selection works also inductively but in the inverse order, from the slices with highest index towards the ones with lower index. In the end the particular connections which are selected by the expansion to form the tree will be called links (more precisely v-link, f-link, or h-link, if they correspond to a v-connection, an f-connection, or an h-connection). Therefore by construction for each subpolymer $Y_k^j$, the set of horizontal and vertical links of scale $j'\le j$ forms a subtree $T_j$ spanning the subpolymer; and for the union $\cup_kY_k^j$ of subpolymers at scale $j$ it forms a forest $F_j$ (i.e. a set of disjoint trees). The forest $F_j$ at scale $j$ is built from the forest $F_{j-1}$ at scale $j-1$, by adding a set of v-links or f-links of scale $j$ and a set of h-links of scale $j$. Therefore $F_0\subset F_1\subset ...\subset F_{j_M+1}:=F$ (such a growing sequence of forests is technically called a “jungle” [AR2]).

III.2 Multiscale Cluster Expansion

In this first step we build connected polymers by choosing v-blocks, f-blocks and gh-links which ensure the connectedness of the polymer. This is done through Taylor expansions with integral remainders, inductively from scale 0 to scale $j_M$. We build the connected subpolymers at scale $j+1$, knowing already the connected subpolymers at scales $j'<j+1$. We perform first the vertical expansion, then the horizontal one, except for the first slice, for which we start with the horizontal one.

III.2.1 Vertical expansion

For each connected subpolymer $Y_k^j$, we define $I_j(y_k^j)\subset I(y_k^j)$ as the subset of vertices internal for $y_k^j$ that have been selected until the step $j$. We can also define the union of all vertices already selected until scale $j$ as $I_j(F_j)=\cup_k I_j(y_k^j)$. We extract first the v-blocks, then the f-blocks of scale $j+1$ (all other connections at scale $j'\le j$ being already fixed).

v-blocks. First we test the existence of a v-block associated to a vertex. We want therefore to know whether $I(y_k^j)\setminus I_j(y_k^j)\ne\emptyset$ for each $y_k^j$, namely whether there is at least one internal vertex $v$ that has not already been selected. For this purpose we introduce into (II.58) the identity

$$1=\prod_{v\in V\setminus I_j(F_j)}\ \prod_{c=1}^{4}\left[\upsilon^{j}(j_c^v)+\upsilon^{>j}(j_c^v)\right] \tag{III.67}$$

where we defined

$$\upsilon^{j}(j_c^v)=\begin{cases}1&\text{if } j_c^v\le j\\0&\text{otherwise}\end{cases} \tag{III.68}$$

and $\upsilon^{>j}(j_c^v)=1-\upsilon^{j}(j_c^v)$. Remark that $v$ is an internal vertex for $\Delta_v^j$ if there is at least one field hooked to $v$ with $j_c^v\le j$. Therefore, to select one new internal vertex for $y_k^j$ we define the function

$$F(w_{y_k^j})=\prod_{v\in V(y_k^j)\setminus I_j(y_k^j)}\ \prod_{c=1}^{4}\left[w_{y_k^j}\,\upsilon^{j}(j_c^v)+\upsilon^{>j}(j_c^v)\right] . \tag{III.69}$$

The identity (III.67) corresponds to $F(w_{y_k^j}=1)$. Now we apply the first order Taylor formula:

$$F(1)=F(0)+\int_0^1 dw_{y_k^j}\,F'(w_{y_k^j}) \tag{III.70}$$

where

$$F(0)=\prod_{v\in V(y_k^j)\setminus I_j(y_k^j)}\ \prod_{c=1}^{4}\upsilon^{>j}(j_c^v) \tag{III.71}$$

means there is no new internal vertex for $y_k^j$ (hence $I_j(y_k^j)=I(y_k^j)$), and we must go to the next paragraph to test for the existence of external fields (f-blocks). On the other hand, the integral remainder

$$\int_0^1 dw_{y_k^j}\,F'(w_{y_k^j})=\sum_{v\in V(y_k^j)\setminus I_j(y_k^j)}\ \sum_{\alpha_v=1}^{4}\upsilon^{j}(j_{\alpha_v}^v)\int_0^1 dw_{y_k^j}\prod_{c'\ne\alpha_v}\left[w_{y_k^j}\upsilon^{j}(j_{c'}^v)+\upsilon^{>j}(j_{c'}^v)\right]\prod_{\substack{v'\in V(y_k^j)\setminus I_j(y_k^j)\\ v'\ne v}}\ \prod_{c=1}^{4}\left[w_{y_k^j}\upsilon^{j}(j_c^{v'})+\upsilon^{>j}(j_c^{v'})\right] \tag{III.72}$$

extracts one new internal vertex for $y_k^j$, choosing the field with $c=\alpha_v$ to have $j_c^v\le j$. To simplify this expression we define

$$\Upsilon_j(v,c)=w_{y_v^j}\,\upsilon^{j}(j_c^v)+\upsilon^{>j}(j_c^v) \tag{III.73}$$

where we recall that $y_v^j$ is the particular connected component $y_k^j$ at scale $j$ containing $v$, as defined in the introduction of Section III. Hence the remainder term is written

$$\int_0^1 dw_{y_k^j}\,F'(w_{y_k^j})=\sum_{v\in V(y_k^j)\setminus I_j(y_k^j)}\ \sum_{\alpha_v=1}^{4}\upsilon^{j}(j_{\alpha_v}^v)\int_0^1 dw_{y_k^j}\prod_{c'\ne\alpha_v}\Upsilon_j(v,c')\prod_{\substack{v'\in V(y_k^j)\setminus I_j(y_k^j)\\ v'\ne v}}\ \prod_{c=1}^{4}\Upsilon_j(v',c) . \tag{III.74}$$

When this remainder term is selected, we have built the v-block corresponding to $y_k^j$ and to the vertex $v$. Remember that this v-block associated to the vertex $v$ is made of as many vertical connections as there are cubes in $y_k^j$. This analysis is performed for each connected component $y_k^j$ before going on.

f-blocks. If $w_{y_k^j}=0$, that is $I(y_k^j)\setminus I_j(y_k^j)=\emptyset$, there is no v-block connecting $y_k^j$ to its ancestor, therefore we must test for the existence of external fields (f-block). For each $v\in I(y_k^j)$ (actually in this case $I_j(y_k^j)=I(y_k^j)$) we can write the sum over field attributions as follows

$$\sum_{J_v}=\sum_{n_v,\sigma_v}\ \sum_{i_v\in I_v}\ \sum_{J'_v} \tag{III.75}$$

where we recall that $i_v=\min\{j_c^v\mid c=1,...,4\}$, $\sigma_v$ gives the indices of the fields with $j_c^v=i_v$ and $n_v=|\sigma_v|$ (II.61). The attribution $i_v$ can belong only to the interval $I_v=[0,l_v]$ where $l_v$ is the scale where the vertex $v$ has been associated to a vertical block. Remark that $l_v\le j-1$ because this vertex has been extracted as internal vertex for some $y_{k'}^{j'}$ with $j'<j$. Finally $J'_v$ gives the band indices for the $4-n_v$ fields that do not belong to the band $i_v$: $j_c^v>i_v$, $\forall c\notin\sigma_v$. Remark that if the field $c=\alpha_v$ does not belong to $\sigma_v$ then it satisfies the constraint $i_v<j_{\alpha_v}^v\le l_v\le j-1$. The interpolating function $F$ is now

$$F(w_{y_k^j})=\prod_{v\in I(y_k^j)}\ \prod_{\substack{c\notin\sigma_v\\ j_c^v>j}}w_{y_k^j} . \tag{III.76}$$

We want to extract external lines until we have convergent power counting. Since in this theory two and four point functions a priori require renormalization [FT1-2], we push the Taylor formula in $w$ to sixth order:

$$F(w=1)=\sum_{p=0}^{5}F^{(p)}(w=0)+\int_0^1 dw'\,F^{(6)}(w') \tag{III.77}$$
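For comparison, the standard Taylor formula with integral remainder carries explicit combinatorial factors. How the notation $F^{(p)}$ in (III.77) accounts for them is an assumption here, not something the text states. Written out with plain derivatives, and applied to the monomial $F(w)=w^m$ (one factor $w$ per candidate external field), it reads:

```latex
% Standard sixth-order Taylor formula with integral remainder
F(1) = \sum_{p=0}^{5} \frac{1}{p!}\,\frac{d^p F}{dw^p}(0)
       + \int_0^1 dw'\,\frac{(1-w')^5}{5!}\,\frac{d^6 F}{dw^6}(w')

% Worked case F(w) = w^m:
%  m <= 5 : only the term p = m survives, (1/m!) * m! = 1, remainder = 0
%  m >= 6 : all terms with p <= 5 vanish at w = 0, and the remainder is
\int_0^1 dw'\,\frac{(1-w')^5}{5!}\,\frac{m!}{(m-6)!}\,w'^{\,m-6}
   = \frac{m!}{5!\,(m-6)!}\cdot\frac{5!\,(m-6)!}{m!} = 1
```

In this toy case the whole contribution comes from the term $p=m$ when $m\le 5$ and from the remainder when $m\ge 6$, matching the reading of the remainder as the case of six or more external legs.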

where all terms with $p$ odd are zero by parity and the term $F^{(p)}(w=0)$ for $p=0,2,4$ corresponds to the case of 0, 2 and 4 external fields. Finally the integral remainder corresponds to the case of 6 external legs or more. When a field is derived by the Taylor formula at scale $j$, hence is chosen as external field, its band attribution is constrained to the set $j_c^v>j$. The highest band is constrained to $i_v\le j$, but this was already true because external fields only hook to vertices that have been extracted at some level $j'\le j$ (therefore $i_v\le j-1$). Remark that the same field may be chosen as external field at different scales. When any term in (III.77) is selected except the one with $p=0$ we have built the f-block corresponding to $y_k^j$ and to the corresponding set of selected external lines, and we say that the f-block has a corresponding strength of $p=2$, 4, or 6.³ Remember that this f-block again is made of as many vertical f-connections as there are cubes in $y_k^j$. This analysis is again performed for each connected component $y_k^j$ before going on.

³ In part II of this study we plan to perform renormalization of the two point function and to simply bound logarithmic divergences such as those of the 4-point function using the smallness of the coupling constant like in [DR2]. For that purpose we need to complicate slightly this definition, and to introduce holes in the vertical direction of our polymers when f-blocks have strength 2 or 4. These complications are not necessary here so we postpone them to this future publication.

III.2.2 Horizontal expansion

The extraction of the vertical blocks has fixed a certain set of generalized cubes at scale $j+1$, called $\tilde D_{j+1}$. The elements of $\tilde D_{j+1}$ are the connected components at scale $j+1$, taking into account all previous connections, that is the connections of scale $j'\le j$ and the vertical connections of the v- and f-blocks of scale $j+1$ that have just been built. In order to complete the construction of the connected subpolymers at scale $j+1$, we must test horizontal connections between these generalized cubes, that is gh-connections. Extracting these gh-connections actually corresponds to extracting forests made of such gh-connections at scale $j+1$ over these generalized cubes. We denote such a forest by $F_{j+1}^h$. This is done using a so called forest formula.

Forest formula. To simplify notation we work at scale $j$ instead of $j+1$. Forest formulas are Taylor expansions with integral remainders which test connections (here the gh-connections at scale $j$) between $n\ge 1$ points (here the generalized cubes at scale $j$) and stop as soon as the final connected components are built. The result is a sum over forests, a forest being a set of disjoint trees. We use the unordered Brydges-Kennedy Taylor formula, which states [AR2] that for any smooth function $H$ of the $n(n-1)/2$ variables $u_l$, $l\in P_n=\{(i,j)\mid i,j\in\{1,..,n\},\ i\ne j\}$,

$$H|_{h_l=1}=\sum_{u-F}\left(\prod_{q=1}^{k}\int_0^1 dw_q\right)\left(\prod_{q=1}^{k}\frac{\partial}{\partial h_{l_q}}H\right)\left(h_l^F(w_q),\ l\in P_n\right) \tag{III.78}$$

where $u-F$ is any unordered forest, made of $0\le k\le n-1$ lines $l_1,...,l_k$ over the $n$ points. To each line $l_q$, $q=1,...,k$, of $F$ is associated the parameter $w_q$, and to each pair $l=(i,j)$ is associated the weakening factor $h_l^F(w_q)$. These factors replace the variables $u_l$ as arguments of the derived function $\prod_{q=1}^{k}\frac{\partial}{\partial h_{l_q}}H$ in (III.78). These weakening factors $h_l^F(w)$ are themselves functions of the parameters $w_q$, $q=1,...,k$, through the formulas

$$\begin{aligned}
h^F_{i,i}(w)&=1\\
h^F_{i,j}(w)&=\inf_{l_q\in P^F_{i,j}}w_q \quad\text{if $i$ and $j$ are connected by $F$, where $P^F_{i,j}$ is the unique path in the forest $F$ connecting $i$ to $j$}\\
h^F_{i,j}(w)&=0\quad\text{if $i$ and $j$ are not connected by $F$.}
\end{aligned} \tag{III.79}$$
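The rule (III.79) is easy to evaluate for a given forest. The following sketch computes all weakening factors by walking each tree and carrying the smallest parameter met along the path; points are labelled $0,\dots,n-1$, the forest is passed as a dictionary from lines to parameters, and the function name is hypothetical. It assumes the input really is a forest (no loops).

```python
def weakening_factors(n, forest):
    """Weakening factors h^F_{i,j}(w) of (III.79).

    n: number of points, labelled 0..n-1.
    forest: dict mapping a line (i, j) of the forest F to its parameter w_q.
    Returns an n x n nested list h with h[i][j] = h^F_{i,j}(w).
    """
    neighbours = {i: [] for i in range(n)}
    for (i, j), w in forest.items():
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))

    h = [[0.0] * n for _ in range(n)]  # 0 for points not connected by F
    for start in range(n):
        h[start][start] = 1.0
        # Depth-first walk of the tree containing `start`, carrying the
        # smallest parameter met so far along the unique path.
        stack = [(start, None, 1.0)]
        while stack:
            node, came_from, w_min = stack.pop()
            for nxt, w in neighbours[node]:
                if nxt != came_from:
                    h[start][nxt] = min(w_min, w)
                    stack.append((nxt, node, h[start][nxt]))
    return h
```

For the forest with lines $(0,1)$ and $(1,2)$ carrying parameters 0.5 and 0.2 over four points, the factor between points 0 and 2 is 0.2, while point 3 is decoupled from the others.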

In our case, the $H$ function is the determinant, and $P_n$ is the set of pairs of generalized cubes at scale $j$:

$$P_n=\{(\tilde\Delta,\tilde\Delta')\mid\tilde\Delta,\tilde\Delta'\in\tilde D_j\} . \tag{III.80}$$

We apply the forest formula (III.78) at scale $j$ and we denote the corresponding forest by $F_j^h$. Therefore the interpolation parameter $h^{F_j^h}_{\tilde\Delta\tilde\Delta'}$ is inserted besides the matrix element defined in (II.60):

$$M_{vc;\bar v\bar c}=\delta_{j_c^v,\,j_{\bar c}^{\bar v}}\left[C^{j}_{\Delta_v^j,\Delta_{\bar v}^j}(x_v,x_{\bar v})\right]_{j=j_c^v} \tag{III.81}$$

where we defined

$$C^{j}_{\Delta_v^j,\Delta_{\bar v}^j}(x_v,x_{\bar v})=:\chi_{\Delta_v^j}(x_v)\,C^{j}(x_v,x_{\bar v})\,\chi_{\Delta_{\bar v}^j}(x_{\bar v}) \tag{III.82}$$

and $\chi_\Delta(x)$ is the characteristic function of $\Delta$, defined by: $\chi_\Delta(x)=1$ if $x\in\Delta$ and $\chi_\Delta(x)=0$ otherwise. The interpolated matrix element, for any $j_c^v=j$, is then

$$M_{vc;\bar v\bar c}\left(h^{j}_{\tilde\Delta,\tilde\Delta'}\right)=\delta_{j_c^v,\,j_{\bar c}^{\bar v}}\left[h^{j}_{\tilde\Delta_v^j,\tilde\Delta_{\bar v}^j}\;C^{j}_{\Delta_v^j,\Delta_{\bar v}^j}(x_v,x_{\bar v})\right]_{j=j_c^v} \tag{III.83}$$

where we defined $\tilde\Delta_v^j$ as the unique generalized cube containing $\Delta_v^j$, and write for simplicity $h^{j}_{\tilde\Delta_v^j,\tilde\Delta_{\bar v}^j}$ instead of $h^{F_j^h}_{\tilde\Delta_v^j,\tilde\Delta_{\bar v}^j}$.

III.3 Tree and root selection

Localization of the gh-connections. We now fix, for each field $h$ or antifield $\bar h$ hooked to a vertex $v$, whether it belongs or not to a propagator derived by the horizontal expansions (since this costs only a factor 2 per field or antifield, hence a factor 16 per vertex). As we know the position of $\Delta_v$ for any $v$, we know exactly for each $\tilde\Delta$ in $y_k^j$ the set of $h$, $\bar h$ that form at scale $j$ (as $j_b=j$) the propagators of the tree $T_{jk}$. We denote this set by $b(\tilde\Delta)$.

The first, rather trivial step, consists in replacing each gh-connection between generalized cubes by an ordinary h-link between ordinary cubes. This means, in the propagator $\chi_{\tilde\Delta}\,C^{j}\,\chi_{\tilde\Delta'}$ corresponding to the gh-connection, that we expand the characteristic functions as $\chi_{\tilde\Delta}=\sum_{\Delta\in D_j,\,\Delta\subset\tilde\Delta}\chi_\Delta$, and $\chi_{\tilde\Delta'}=\sum_{\Delta'\in D_j,\,\Delta'\subset\tilde\Delta'}\chi_{\Delta'}$. Accordingly the gh-connection is localized into an ordinary connection, or h-link, between $\Delta$ and $\Delta'$.⁴

⁴ The corresponding sums are bounded below in two steps: in the first step, at the beginning of section III.4, the set $b$ of the fields for the h-links is chosen (and paid in section IV.7.3), and in section IV.6 the contraction between these fields is performed (construction of $T_{jk}$). Since in section III.4 the position of all the fields is known, together these two steps pay for the localization of gh-connections into ordinary connections.

Choice of the roots. Remember that at each scale $j$ each connected subpolymer $y_k^j$ is actually made of a set of disjoint generalized cubes $\tilde\Delta$. We want now to choose one generalized cube $\tilde\Delta_{root}$ in each $y_k^j$, called the root of the subpolymer, and one particular cube $\Delta_{root}$ in each generalized cube $\tilde\Delta$, called the root of the generalized cube.

The root cube in $\tilde\Delta_{root}$ is special: it will correspond to the root cube of the whole subpolymer, therefore we will denote it by $\Delta^0_{root}$.

Finally, in each $y_k^j$, for each $\tilde\Delta\ne\tilde\Delta_{root}$, we want to choose one field or antifield in $b(\tilde\Delta)$ as the one contracting towards the root in $T_{jk}$ and we call it $h_{root}$ (the vertex to which it is hooked being called $v_{root}$). We call then $R_{root}$ the set of all $h_{root}$ for all generalized cubes at all the different scales.




Remark that the choice of the set $R_{root}$ can be performed only after the choice of $\tilde\Delta_{root}$. The set of remaining fields in $b(\tilde\Delta)$ is denoted by $l_b(\tilde\Delta)$ (and called the leaves for $\tilde\Delta$). Remark that for $\tilde\Delta_{root}$ all fields are leaves: $b(\tilde\Delta) = l_b(\tilde\Delta)$.

The roots are chosen inductively scale by scale, from bottom up, starting from the biggest index scale $M_Y$ of the polymer and going up until the smallest index $m_Y$. To break translation invariance, we need to assume from now on that the polymer $Y$ contains a particular point, namely the origin $x = 0$.

At the biggest scale we have only one connected component, which must contain the origin $x = 0$. Therefore we choose $\tilde\Delta_{root}$ as the unique $\tilde\Delta$ containing $x = 0$, and $\Delta_{root} = \Delta^0_{root}$ as the unique cube $\Delta \in \tilde\Delta_{root}$ containing $x = 0$. Now for each $\tilde\Delta \neq \tilde\Delta_{root}$ we define $\Delta_{root}$ as the (necessarily unique) cube $\Delta \in \tilde\Delta$ containing a field $h_{root} \in R_{root}$ of that scale.

With these definitions we can introduce the general inductive rule. We assume that all $\Delta_{root}$ and $\tilde\Delta_{root}$ have been defined until the scale $j$. We now want to define the roots at scale $j-1$. Remark that each connected component $y^{j-1}_k$ actually corresponds to some generalized cube $\tilde\Delta^0$ at scale $j$. We denote by $\Delta^0$ its root cube. Now we distinguish two cases:

• there exists a cube $\Delta_1 \in y^{j-1}_k$ with $\Delta_1 \subseteq \Delta^0$ which contains either $0$ or one $h_{root}$ at some scale $j' \ge j$. Remark that this $\Delta_1$ must be unique. Then we define as $\tilde\Delta_{root}$ for $y^{j-1}_k$ the unique $\tilde\Delta$ with $\Delta_1 \subseteq \tilde\Delta$. Now for all $\tilde\Delta \neq \tilde\Delta_{root}$ we introduce $h_{root}$ and $\Delta_{root}$ exactly as in the case of the lowest band $M_Y$. Finally for $\tilde\Delta_{root}$ we choose $\Delta_1$ as root cube: $\Delta_1 = \Delta^0_{root}$.

• there is no cube $\Delta_1 \in y^{j-1}_k$ with $\Delta_1 \subseteq \Delta^0$ with $0 \in \Delta_1$ or $\Delta_{v_{root}} \subseteq \Delta_1$ for some $h_{root}$ at a lower scale. Therefore we choose as root one of the $\tilde\Delta \in y^{j-1}_k$ satisfying $\tilde\Delta \cap \Delta^0 \neq \emptyset$ (remark that there must be at least one such $\tilde\Delta$ by construction). For all $\tilde\Delta \neq \tilde\Delta_{root}$ we introduce $h_{root}$ and $\Delta_{root}$ exactly as in the case of the lowest band $M_Y$. Finally for $\tilde\Delta_{root}$ we choose as $\Delta^0_{root}$ one of the cubes satisfying $\Delta \subseteq \Delta^0$ (there must be at least one by construction).

For an example see Fig. 6, where cubes of three scales are shown. The lines connecting two cubes are h-links. The union $\Delta_1 \cup \Delta_2 \cup \Delta_3$ is a generalized cube at scale $j$ (corresponding to $\tilde\Delta^0$ above). From the figure one can see that there are three generalized cubes at scale $j-1$:

$$\tilde\Delta'_1 = \Delta'_1, \qquad \tilde\Delta'_2 = \Delta'_2 \cup \Delta'_3 \cup \Delta'_4, \qquad \tilde\Delta'_3 = \Delta'_5 \cup \Delta'_6. \tag{III.84}$$

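Now, let us say that $\tilde\Delta^0$ is a root at scale $j$, $\Delta_2$ is the corresponding root cube, $0 \notin \Delta_2$ and no $h_{root}$ has its vertex in $\Delta_2$. Then we have two equivalent choices for $\tilde\Delta_{root}$, as $\tilde\Delta'_2 \cap \Delta_2 \neq \emptyset$ and $\tilde\Delta'_3 \cap \Delta_2 \neq \emptyset$. Let us take $\tilde\Delta_{root} = \tilde\Delta'_2$. Now inside $\tilde\Delta'_2$ we have again two equivalent choices for $\Delta_{root}$, as $\Delta'_3$ and $\Delta'_4 \subset \Delta_2$.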

Figure 6: Construction of roots

Choice of the v-links and f-links. Remember that in order to avoid loops, each time several cubes in $y^j_k$ have the same ancestor we must choose only one of them in the block to bear a link (either of v or f type). The choice of this cube is completely arbitrary (for instance, choose the first one in some lexicographic ordering of the cubes), except for one constraint: for each connected subpolymer $y$ the root cube $\Delta^0_{root}$ acts as root for $y$, therefore we decide to always choose as vertical link $(\Delta^0_{root}, A(\Delta^0_{root}))$. All other choices are arbitrary. This constraint is useful because in the following all the vertical power counting for $y^j_k$ will be concentrated on this special vertical link $(\Delta, A(\Delta))$ (with $\Delta = \Delta^0_{root}$). At the end of this selection process we have therefore an ordinary tree of v, f or h links connecting together all cubes of $Y$.

III.4 Result of the expansion

As a result of this inductive process we obtain the following expression ∞ n λ ZΛu = εF d4 xv n! ∆ a a v b b v n=0 ∆V F Vd ,αVd a,b,R lVd {Jh },{Jh Cb ¯ } {jh },{jh ¯}          jM +1 jM +1 jM +1 1 1 1     dwl   dwl   dwl  

j=0

l∈hLj

jM +1



 

0

j=1

  j C∆ ¯l )  ¯ (xl , x ∆ l

 

j=0

v∈Vd



l

l∈hLj

υ >jm (v) (jαv ) υ lv (jαv ) v v

0

l∈vLj

v∈Vd l v −1 j=0

 

j=1

l∈f L6j

0

     

nv σv ρv iv ∈Iv J v



Υj (v, αv )

v∈V¯d Jv



 



υ >jm (v) (jcv )

j=0

v∈Vd c=αv

 

4

 υ >jm (v) (jcv )

v∈V¯d c=1

 

v∈Vd c∈σv

lv



 Υj (v, c) 

Υj (v, c)

j=0

jcv −1



jM


 sj (v, c) det M ({wl })

(III.85)

j=0

where

• $V_d = \{v \in V \mid \exists$ one v-link associated to $v\}$ and $\bar V_d = V \setminus V_d$;

• $a = \{h_{vc} \mid v \in V_d$ and $h_{vc}$ is associated to some f-links at one or several scales$\}$;

• $b = \{h_{vc} \mid h_{vc}$ is associated to one h-link$\}$;

• $R = R_{root} = \{h_{vc} \mid h_{vc}$ is a root field or antifield$\}$;

• $l_{V_d} = \{l_v \mid v \in V_d\}$ where $l_v + 1$ is the scale of the v-links associated to $v$ (they are all at the same scale);

• $J^a_h$ is the set of scales $j$ where the field $h$ is associated to a f-link: for each $j \in J^a_h$, $h_{vc}$ is an external field for $y^j_v$. The same definition holds for $\bar h$;

• $j^b_h$ is the scale of the h-link associated to $h$. The same definition holds for $\bar h$;

• $C_b$ fixes the pairs $h$, $\bar h$ that form the h-links;

• $\varepsilon_F$ is a sign coming from the horizontal forest formulas;

• $hL_j$ is the set of h-links of scale $j$ in $F_j$. For each h-link $l$ we denote the corresponding field and antifield by $h_l$, $\bar h_l$. The vertices are denoted by $v(l)$ and $\bar v(l)$, their positions by $x_l$ ($\bar x_l$) and the cubes of the link containing them by $\Delta_l$ and $\bar\Delta_l$;

• $vL_j$ is the set of vertical links of scale $j$ associated to a vertex. We recall that each such vertex corresponds to a set of v-links in $F_j$ connecting some subset $y$ at scale $j-1$ (which is already connected by $F_{j-1}$) to its ancestor;

• $fL^p_j$ is the set of vertical links of scale $j$ associated to $p$ external fields. We recall that each such set of external fields corresponds to a set of f-links of scale $j$ and order $p$ ($p = 2, 4, 6$) in $F_j$ connecting some subset $y$ at scale $j-1$ (which is already connected by $F_{j-1}$) to its ancestor;

• $w'_l = w'_{y^j_k}$ where $l$ is the v-link connecting $y^j_k$ to its ancestor. The same definition holds for $w''_l$;

• Defining

$$j_m(v) = \max\{j \mid y^j_v \text{ connected to } A(y^j_v) \text{ by a f-link}\} \quad \text{if } v \in \bar V_d,$$
$$j_m(v) = \max\{j < l_v \mid y^j_v \text{ connected to } A(y^j_v) \text{ by a f-link}\} \quad \text{if } v \in V_d, \tag{III.86}$$

we must have, for all $h_{vc}$, $j^v_c > j_m(v)$. This bound can be understood as follows: a vertex $v$ cannot have $i_v \le j_m(v)$. Indeed otherwise it would be internal for $y_v^{j_m(v)}$, and would have been chosen at that scale instead of the f-link connecting $y_v^{j_m(v)}$ to its ancestor. We remark that for $v \in V_d$ this argument only applies for scales $j < l_v$, since after $l_v$ the vertex can no longer be selected as a vertical connection. This explains the definition (III.86). All these constraints are expressed in formula (III.85) by the function $\upsilon^{>j_m(v)}(j^v_c)$. Moreover, for each $v \in V_d$ we have inserted an additional sum

$$\sum_{\rho_v} = \prod_{\{h_{vc} \mid c \neq c_v\}} \sum_{\rho_h} \tag{III.87}$$

where we recall that $c_v = \min\{c \in \sigma_v\}$ (II.61), and we define $\rho_h = 1$ if $i_v \le j_h \le l_v$ and $\rho_h = 2$ if $l_v < j_h$. Remark that for $c \in \sigma_v$ and $c \neq c_v$, or for $c = \alpha_v$, we must have $\rho_{h_{vc}} = 1$ by construction (recall that $\alpha_v$ is defined in (III.72)). On the other hand, if $h \in a$ we must have $\rho_h = 2$ by construction.

• the values of $s_j$ depend on the f-links:

- $s_j(v, c) = 1$ if $y^j_v$ is connected to its ancestor by a v-link or if $j \in J^a_{h_{vc}}$ (which means $h_{vc}$ is associated to a f-link connecting $y^j_v$ to its ancestor);

- $s_j(v, c) = w''_{y^j_v}$ if $y^j_v$ is connected to its ancestor by a f-link of order 6 and $j \notin J^a_{h_{vc}}$;

- $s_j(v, c) = 0$ if $y^j_v$ is connected to its ancestor by a f-link of order 2 or 4 and $j \notin J^a_{h_{vc}}$.

• finally $\det M$ is the determinant remaining after the propagators corresponding to h-links have been extracted. The matrix element is

$$M_{vc;\bar v\bar c}(\{w_l\}) = \delta_{j^v_c,\, j^{\bar v}_{\bar c}} \Big[ h^{F^h_j}_{\Delta^j_v, \Delta^j_{\bar v}}(w)\; C^j_{\Delta^j_v, \Delta^j_{\bar v}}(x_v, x_{\bar v}) \Big]_{j = j^v_c} \tag{III.88}$$

where the weakening factor $h^{F^h_j}_{\Delta^j_v, \Delta^j_{\bar v}}(w)$ is defined in (III.79), substituting in the formulae the general forest $F$ with the horizontal forest $F^h_j$.
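The three cases that define $s_j(v, c)$ above can be read as a small lookup rule. The following Python sketch only illustrates that rule; the function name `s_factor` and its arguments (`link_type`, `order`, `field_in_link`, `w`) are hypothetical names introduced here, and `w` stands for the f-link interpolation parameter attached to the component $y^j_v$.

```python
def s_factor(link_type, order, field_in_link, w):
    """Illustrative value of s_j(v, c) at one scale j.

    link_type     -- "v" or "f": type of the vertical link joining y_v^j
                     to its ancestor (hypothetical encoding)
    order         -- order of the f-link (2, 4 or 6); ignored for v-links
    field_in_link -- True if j belongs to J^a of the field h_vc, i.e. the
                     field itself is one of the external fields of the f-link
    w             -- interpolation parameter of the f-link, in [0, 1]
    """
    # Case 1: v-link, or the field is itself attached to the f-link.
    if link_type == "v" or field_in_link:
        return 1
    # Case 2: f-link of order 6 that does not involve the field.
    if link_type == "f" and order == 6:
        return w
    # Case 3: f-link of order 2 or 4 that does not involve the field.
    if link_type == "f" and order in (2, 4):
        return 0
    raise ValueError("unknown vertical link type or order")
```

In this reading a factor 0 simply removes the corresponding attribution, while a factor `w` weakens it.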




Constrained attributions. The non zero contributions are given by the following attributions:

- for $v \in V_d$ and $c \in \sigma_v$ we must have

$$i_v \in I_v^c = [1 + j_m(v),\ l_v] \tag{III.89}$$

- for $v \in V_d$ and $c \notin \sigma_v$ we must have

$$j^v_c \in J_v^c = [i_v,\ l_v] \quad \text{for } c = \alpha_v, \qquad j^v_c \in J_v^c = [j_m(v, c),\ j_M(v, c)] \quad \text{for } c \neq \alpha_v \tag{III.90}$$

where

$$j_m(v, c) = 1 + i_v \ \text{ if } h_{vc} \notin a \text{ and } \rho_{h_{vc}} = 1, \qquad j_m(v, c) = 1 + l_v \ \text{ if } h_{vc} \notin a \text{ and } \rho_{h_{vc}} = 2, \qquad j_m(v, c) = 1 + \max\{j \in J^a_{h_{vc}}\} \ \text{ if } h_{vc} \in a \tag{III.91}$$

and $j_M(v, c) = l_v$ if $h_{vc} \notin a$ and $\rho_{h_{vc}} = 1$, otherwise $j_M(v, c) = \min\{j > i_v \mid y^j_v$ is connected to its ancestor by a f-link of order $p = 2, 4\}$, with the convention that $\min \emptyset = j_M + 1$. Remark that $j^v_{\alpha_v}$ satisfies a special constraint because this is the field derived in order to extract a v-link at scale $l_v + 1$, therefore it must satisfy $j^v_{\alpha_v} \le l_v$;

- finally, for $v \in \bar V_d$ we must have

$$j^v_c \in J_v^c = [j_m(v) + 1,\ j_M + 1]. \tag{III.92}$$

Reinserting attribution sums inside the determinant. This is a key step for later bounds. We observe that for all $v \in \bar V_d$ the constraints $\upsilon^{\le j}$ and $\upsilon^{>j}$ on the attributions for each field hooked to $v$ are independent. Therefore we can reinsert all the sums inside the determinant (bringing with them the corresponding vertical weakening factors $w'$ and $w''$). On the other hand, for $v \in V_d$, the sums over attributions for $h_{vc}$ with $c \notin \sigma_v$ are independent from each other but all depend on $i_v$. Therefore we can reinsert in the determinant the sums for $c \notin \sigma_v$ (with their vertical weakening factors), but we must keep the sum over $i_v$ outside the determinant. The weakening factors for all $c \neq c_v$ are inserted in the determinant. On the other hand, for the particular field $h_{vc_v}$ we keep outside the determinant the weakening factors $w'$, as they will be used to perform certain sums, and reinsert the others in the determinant. Therefore we can write the partition function as

$$
\begin{aligned}
Z^u_\Lambda = {} & \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} \sum_{\Delta_V} \sum_{F} \sum_{V_d, \alpha_{V_d}} \sum_{a, b, R} \sum_{l_{V_d}} \sum_{\{J^a_h\}, \{J^a_{\bar h}\}} \sum_{\{j^b_h\}, \{j^b_{\bar h}\}} \sum_{C_b} \varepsilon_F \prod_v \int_{\Delta_v} d^4x_v \\
& \Bigg[ \prod_{j=0}^{j_M+1} \prod_{l \in hL_j} \int_0^1 dw_l \Bigg] \Bigg[ \prod_{j=1}^{j_M+1} \prod_{l \in vL_j} \int_0^1 dw'_l \Bigg] \Bigg[ \prod_{j=1}^{j_M+1} \prod_{l \in fL^6_j} \int_0^1 dw''_l \Bigg] \prod_{v \in V_d} \sum_{n_v \sigma_v \rho_v} \sum_{i_v \in I_v^c} \\
& \Bigg[ \prod_{j=0}^{j_M+1} \prod_{l \in hL_j} C^j_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) \Bigg] \det M(\{w_l\}, \{w'_l\}, \{w''_l\}) \\
& \Bigg[ \prod_{\substack{v \in V_d \\ c_v \neq \alpha_v}} \prod_{j=i_v}^{l_v} w'_{y^j_v} \Bigg] \Bigg[ \prod_{\substack{v \in V_d \\ c_v = \alpha_v}} \prod_{j=i_v}^{l_v-1} w'_{y^j_v} \Bigg]
\end{aligned}
\tag{III.93}
$$

where the matrix element is

$$M_{vc;\bar v\bar c}(\{w_l\}, \{w'_l\}, \{w''_l\}) = \sum_{j^v_c \in I_c^v} \sum_{j^{\bar v}_{\bar c} \in I_{\bar c}^{\bar v}} \delta_{j^v_c,\, j^{\bar v}_{\bar c}}\; W_{vc}(j^v_c) \Big[ h^{F^h_j}_{\Delta^j_v, \Delta^j_{\bar v}}(w)\; C^j_{\Delta^j_v, \Delta^j_{\bar v}}(x_v, x_{\bar v}) \Big]_{j = j^v_c} W_{\bar v\bar c}(j^{\bar v}_{\bar c}) \tag{III.94}$$

and the horizontal propagator is

$$C^j_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) = W_{v_l c_l}(j)\; C^j_{\Delta_l \bar\Delta_l}(x_l, \bar x_l)\; W_{\bar v_l \bar c_l}(j) \tag{III.95}$$

and $v_l, c_l$ and $\bar v_l, \bar c_l$ identify respectively the field and the antifield of the link. We defined

$$I_c^v = \{i_v\} \ \text{ for } v \in V_d,\ c \in \sigma_v; \qquad I_c^v = J_v^c \ \text{ for } v \in V_d,\ c \notin \sigma_v; \qquad I_c^v = J_v^c \ \text{ for } v \in \bar V_d \tag{III.96}$$

and $I_v^c$ and $J_v^c$ are the sets of band attributions with the constraints due to the forest structure that we introduced above. Finally the definitions for the factors $W_{vc}$ are given below.

Vertical weakening factors. The expression for $W_{vc}(j^v_c)$ is given by the $\Upsilon_j(v, c)$ and $s_j$ functions. Remark that

$$\Upsilon_j(v, c) = 1 \ \text{ if } j < j^v_c, \qquad \Upsilon_j(v, c) = w'_{y^j_v} \ \text{ if } j \ge j^v_c. \tag{III.97}$$

Actually we have to distinguish different cases.



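If $v \in V_d$, $c = \alpha_v$ and $c \neq c_v$:

$$W_{v\alpha_v}(j^v_{\alpha_v}) = \Bigg[ \prod_{j = j^v_{\alpha_v}}^{l_v - 1} w'_{y^j_v} \Bigg] \Bigg[ \prod_{j = i_v}^{j^v_{\alpha_v} - 1} s_j(v, \alpha_v) \Bigg]. \tag{III.98}$$

If $v \in V_d$, $c \neq \alpha_v$ and $c \neq c_v$:

$$W_{vc}(j^v_c) = \Bigg[ \prod_{j = j^v_c}^{l_v} w'_{y^j_v} \Bigg] \Bigg[ \prod_{j = i_v}^{j^v_c - 1} s_j(v, c) \Bigg]. \tag{III.99}$$

If $v \in V_d$ and $c = c_v$:

$$W_{vc_v}(j^v_{c_v}) = \prod_{j = i_v}^{j^v_{c_v} - 1} s_j(v, c_v). \tag{III.100}$$

Finally if $v \in \bar V_d$:

$$W_{vc}(j^v_c) = \Bigg[ \prod_{j = j^v_c}^{j_M} w'_{y^j_v} \Bigg] \Bigg[ \prod_{j = i_v}^{j^v_c - 1} s_j(v, c) \Bigg] \tag{III.101}$$

where we take the convention that a void product is 1. Therefore for $v \in V_d$ and $\rho_h = 1$ the product over $s_j$ is reduced to 1 and for $v \in V_d$ and $\rho_h = 2$ the product over $w'$ is reduced to 1.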

III.5 Connected components

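Now, at each order $n$ we can factorize the connected components, namely the polymers. The forest $F$ is connected if at the highest slice index (hence the lowest energy scale) there is only one connected component. Remark that $F$ could have no link for any $j > j_F$. In this case the forest is connected if $F_{j_F}$ has only one connected component. The partition function is written as

$$Z^u_\Lambda = \sum_{k_Y = 0}^{\infty} \frac{1}{k_Y!} \sum_{\substack{Y_1, \dots, Y_{k_Y} \\ \cup_q Y_q = D,\ Y_q \cap Y_{q'} = \emptyset}} \prod_q A(Y_q) \tag{III.102}$$

where $k_Y$ is the number of different connected polymers $Y_q$ and the amplitude for a polymer $Y$ is defined as

$$
\begin{aligned}
A(Y) = {} & \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} \sum_{\Delta_V} \sum_{F^c_{M_Y}} \sum_{V_d, \alpha_{V_d}} \sum_{a, b, R} \sum_{l_{V_d}} \sum_{\{J^a_h\}, \{J^a_{\bar h}\}} \sum_{\{j^b_h\}, \{j^b_{\bar h}\}} \sum_{C_b} \varepsilon_F \prod_v \int_{\Delta_v} d^4x_v \\
& \Bigg[ \prod_{j=m_Y}^{M_Y} \prod_{l \in hL_j} \int_0^1 dw_l \Bigg] \Bigg[ \prod_{j=m_Y+1}^{M_Y} \prod_{l \in vL_j} \int_0^1 dw'_l \Bigg] \Bigg[ \prod_{j=m_Y+1}^{M_Y} \prod_{l \in fL^6_j} \int_0^1 dw''_l \Bigg] \prod_{v \in V_d} \sum_{n_v \sigma_v \rho_v} \sum_{i_v \in I_v^c} \\
& \Bigg[ \prod_{j=m_Y}^{M_Y} \prod_{l \in hL_j} C^j_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) \Bigg] \det M(\{w_l\}, \{w'_l\}, \{w''_l\}) \\
& \Bigg[ \prod_{\substack{v \in V_d \\ c_v \neq \alpha_v}} \prod_{j=i_v}^{l_v} w'_{y^j_v} \Bigg] \Bigg[ \prod_{\substack{v \in V_d \\ c_v = \alpha_v}} \prod_{j=i_v}^{l_v-1} w'_{y^j_v} \Bigg]
\end{aligned}
\tag{III.103}
$$

where $F^c_{M_Y}$ is any connected forest over $Y$ $^5$. The spatial integral for each $v$ is still written in terms of cubes in $D_0$, but all sums are restricted to the polymer. This means that $\Delta \in D_j$ becomes $\Delta \in D_j \cap Y$ and so on. Remark that $F^c_{M_Y}$ has no link at scale $j < m_Y$.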

III.6 Main result

Now we have nearly succeeded in computing the logarithm of $Z$. Actually (III.102) would be the exponential of $\sum_Y A(Y)$, if there was no constraint $Y_q \cap Y_{q'} = \emptyset$, $\cup_q Y_q = D$. Taking out these conditions and computing the logarithm is the purpose of the so-called Mayer expansion [R]. By translation invariance, a Mayer expansion converges essentially if the following condition holds:

$$\sum_{Y,\ 0 \in Y} |A(Y)|\, e^{|Y|} \le 1 \tag{III.104}$$

(where $|Y|$ is the cardinal of $Y$, hence the total number of cubes of all scales forming $Y$). If we perform power counting, we find that all sub-polymers $Y^j_k$ of $Y$ with $|E(Y^j_k)| = 2, 4$ need renormalization. This is postponed to a future publication $^6$. To start with a simpler situation, in this paper we restrict ourselves to the case $|E(Y^j_k)| > 4$ for all $j < j_M + 1$. We call this subset the convergent attributions for $Y$ and we denote the corresponding amplitudes by $A_c(Y)$. Remark that $A_c(Y)$ contains only f-links of order 6. We therefore prove the following theorem, which is a 3-d analog of [FMRT] and [DR1].

Theorem. For any $L > 0$, there exists $K > 0$ such that if

$$|\lambda \ln T| \le K \tag{III.105}$$

we have

$$\sum_{Y,\ 0 \in Y} |A_c(Y)|\, L^{|Y|} \le 1. \tag{III.106}$$

The sum is performed over all polymers that contain the position $x = 0$, and $A_c(Y)$ is the amplitude of $Y$ restricted to the convergent attributions. The rest of the paper is devoted to the proof of this theorem, and from now on we further assume $K \le 1$.

$^5$ The constraint that $Y$ must be connected implies that the term at order $n$ is zero unless $n$ is big enough (in order to be able to connect $Y$).

$^6$ In this future publication, we plan in fact to renormalize only the 2-point function, and to bound the logarithmic divergence of the 4-point functions by the condition $\lambda |\log T| \le K$, like in [DR2].

IV Proof

The general idea is to bound the determinant by a Hadamard inequality, and to sum over the horizontal structures using the decay of the horizontal propagators. The Hadamard inequality generally costs a factor

$$n^n\, |\ln T|^{\,|\bar V_d \setminus V_b| + (1 - \varepsilon)|V_d \cup V_b|} \tag{IV.107}$$

where $0 < \varepsilon < 1$ and $V_b$ is the set of vertices hooked to some horizontal link:

$$V_b = \{v \in V \mid h_{vc} \in b \text{ for some } c\}. \tag{IV.108}$$

The factor $n^n$ is bounded by the global $1/n!$ symmetry factor of the vertices, up to a factor $e^n$ by the Stirling formula, which is absorbed in the constant $K$ (see however the remark in the Introduction). The logarithm is bounded by a fraction of the small coupling constant $\lambda^n$. A delicate point is to prove that the factor $\varepsilon$ is strictly positive, $\varepsilon > 0$, since we need to spare a fraction of $\lambda$ at each derived vertex $v \in V_d \cup V_b$ in order to extract a small factor per cube. This factor is necessary to bound the last sum over the polymer size and shape.

In the following we will denote fields only by $h$ (not $h_{vc}$) and antifields by $\bar h$. The corresponding vertex is $v_h$, $v_{\bar h}$, the field index is $c_h \in C$, the antifield index $c_{\bar h} \in \bar C$ ($C$ and $\bar C$ are introduced in section II.6), their slice indices are $j_h$, $j_{\bar h}$ and their vertex positions are $x_h$, $x_{\bar h}$. In order to bound the amplitude of a polymer $A(Y)$ we must introduce the auxiliary slice decoupling of section II.5. For each propagator extracted from the determinant we write

$$C^j_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) = \sum_{k=0}^{k_M(j)} C^{jk}_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) = \sum_{k_{h_l}} \sum_{k_{\bar h_l}} \delta_{k_{h_l}, k_{\bar h_l}}\, C^{jk_{h_l}}_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) \tag{IV.109}$$

where

$$C^{jk}_{\Delta_l \bar\Delta_l}(x_l, \bar x_l, \{w'_l\}, \{w''_l\}) = W_{v_l c_l}(j)\; C^{jk}_{\Delta_l \bar\Delta_l}(x_l, \bar x_l)\; W_{\bar v_l \bar c_l}(j), \tag{IV.110}$$

$C^{jk}$ is defined in (II.36) and $W_h(j)$ corresponds to the function $W_{vc}(j)$ defined in (III.98)-(III.101). The matrix element is written as

$$M_{h;\bar h}(\{w_l\}, \{w'_l\}, \{w''_l\}) = \sum_{k_h} \sum_{k_{\bar h}} \delta_{k_h, k_{\bar h}} \sum_{j \in I_h \cap I_{\bar h} \cap J(k_h)} [W_h(j)]\; h^{F^h_j}_{\Delta^j_h, \Delta^j_{\bar h}}(w)\; C^{jk_h}_{\Delta^j_h, \Delta^j_{\bar h}}(x_h, x_{\bar h})\; [W_{\bar h}(j)] \tag{IV.111}$$

where we have exchanged the sums over $j_h$ and $k_h$, $J(k)$ is defined in (II.50) and the interval $I_h$ corresponds to the interval $I_c^v$ defined in (III.96). Finally we denote by $\Delta^j_h$ the cube $\Delta^j_{v_h}$. The same definitions hold for $\bar h$. The sums over $k_h$ and $k_{\bar h}$ are extracted from the determinant by multilinearity.

We need now to reorganize the sum over $Y$ according to a tree structure analogous to the “Gallavotti-Nicolò tree” [GN], which is called here $S$.

IV.1 The S structure

Let $M_Y$ be the lowest scale of the polymer. $S$ is a rooted tree that pictures the inclusion relations for the connected components of $Y$ at each scale and the type of vertical connection (vertex or field). In this rooted tree the extremal leaves are pictured as dots and the other vertices as circles. A circle at layer $l$ represents a connected subpolymer at scale $j = M_Y - l$. A leaf at layer $l$ by convention represents an extremal summit cube, that is a cube such that $Ex(\Delta) = \Delta$ (no cube above), whose scale is $M_Y - l + 1$. The highest layer fixes the scale $m_Y$: $l_{max} = M_Y - m_Y + 1$ (as at scale $M_Y - l_{max}$ there are only leaves, hence no cubes) and satisfies $l_{max} - 1 \le j_M$.

There are two types of links in $S$: the leaf-links, which join a leaf to a circle, and the circle-links, which join two circles. To each circle-link corresponds a vertical block in the multiscale expansion, and we can associate to it a label f or v depending on whether this block is associated to a vertex or to external fields $^7$. An example of $S$ structure is given in Fig. 7 and two possible polymers corresponding to this structure are given in Fig. 8 a and b.

We remark that $S$ fixes in a unique way the number and scales of the extremal summit cubes, but that several polymers, with different total numbers of cubes, may correspond to the same structure $S$. In order to fix this total number of cubes, we introduce for each circle-link of $S$ a further number which fixes the number of vertical links (which are v-links or f-links depending on the type of the circle-link) selected in the block in section III.3. Since there is one vertical link per ancestor cube, this number is the number of ancestor cubes of the connected component $y$ corresponding to the circle at the top of the circle-link. We call this collection of indices $VL$. $S$ and $VL$ together fix the number $|Y|$ of cubes in $Y$. For instance the situations in Fig. 9 a and b correspond to the same $S$, shown in Fig. 9 d. But the case a) corresponds to an index $VL = \{1\}$ for the unique circle-link and to 4 cubes in $Y$, whereas the case b) corresponds to an index $VL = \{2\}$ and to 5 cubes in $Y$.

Finally when $S$ and $VL$ are given, we can label all the cubes of $Y$, and we fix the subset $B_S$ of those cubes of $Y$ which are summit cubes. They are those with non-zero exposed volume: $|Ex(\Delta)| > 0$ $^8$. Nevertheless we remark that there is still some ambiguity, as even $VL$ and $B_S$ cannot distinguish between Fig. 9 b and c, and the position of the cubes of $Y$ is not yet fixed.

$^7$ We remark that the circles at level $l$ connected only to leaves at level $l + 1$ must be connected to the previous circle at level $l - 1$ by a v-circle-link. Indeed each of the extremal summit cubes forming that circle must contain at least one vertex.

Figure 7: Example of S

Figure 8: Two possible polymers corresponding to S

Figure 9: a,b,c: three polymers corresponding to the same S shown in d: V L can distinguish a from b, but not b from c

IV.2 The reorganized sum

The sum (III.106) is then reorganized in terms of the structure S as ∞ λn |Ac (Y )|L|Y | ≤ L|Y | n! Y MY S V L BS {x∆ }c n=0 Vd ,αVd a,b,R {vl }l∈vL 0∈Y     v∈Vd iv ∈Ivc ∆v ∈Div ∩Y

nVd σVd ρVd {n∆ }∆∈B ∆cV¯ S



 cj M Y    

M Y

j=mY



M Y

 

l∈hLj cj

 



1

dwl 

0

M Y

j=mY +1

 

∆v

v∈Vd

1

a b b {Jha },{Jh ¯} ¯ } {jh },{jh ¯ } {kh },{kh

dxv 

M Y

dwl 

0

l∈vLj

(IV.112)

εF

j=mY +1





 

l∈f L6j

 1

dwl 

0

jkhl C∆ ∆ ¯l , {wl }, {wl }) δkhl kh¯  ¯ (xl , x l

j=mY k=1

Ex(∆v )

v∈V¯d

j=mY k=1 Tjk



 dxv 

d

l

l

l∈Tjk

   lv l v −1    det M {wl }, {wl }, {wl }, {kh,h¯ } wyj   wyj   v v v∈Vd j=iv v∈Vd j=iv cv =αv c =α v

v

where

• $\{x_\Delta\}^c$ chooses the position of each cube in the polymer, constrained by $S$, $VL$ and $B_S$, with the additional constraint that at the lowest level $M_Y$ there is one cube containing the origin $x = 0$.

• $v_l$ is the vertex $v \in V_d$ associated to the vertical link $l \in vL$, where $vL = \cup_j vL_j$. Remark that once we know $v_l$ for each $l \in vL$, we automatically know $l_v$ for all $v \in V_d$. The vertices of $V_d$ are from now on said to be localized in the cube $\Delta_{i_v} \in D_{i_v}$ to which they belong.

• $n_{B_S} = \{n_\Delta\}_{\Delta \in B_S}$ gives the number of vertices in $\bar V_d$ localized in each summit cube (recall (III.66)): $n_{B_S} = \{n_\Delta \mid \Delta \in B_S\}$ with the constraint $\sum_{\Delta \in B_S} n_\Delta = |\bar V_d| = n - |V_d|$.

• $n_{V_d}$, $\sigma_{V_d}$, $\rho_{V_d}$ are the assignments $n_v$, $\sigma_v$, $\rho_v$ $\forall v \in V_d$.

• $\Delta^c_{\bar V_d}$ chooses which vertices $v \in \bar V_d$ are localized in each summit cube: $\Delta^c_{\bar V_d} = \{\Delta_v\}_{v \in \bar V_d}$ with the constraint $\#\{v \mid v \in \bar V_d,\ \Delta_v = \Delta\} = n_\Delta$, $\forall \Delta \in B_S$. The spatial integral for each $v \in \bar V_d$ is then performed over the exposed volume of the corresponding cube $Ex(\Delta_v)$ (see (III.66)).

• $k_h$ fixes the value of an auxiliary scale (defined in section II.5) that will be used in the propagator analysis; $k_{\bar h}$ is the same thing for antifields.

• $T_{jk}$ chooses the tree connecting the generalized cubes $\tilde\Delta \in y^j_k$ by h-links of scale $j$. To fix $T_{jk}$ one has to choose the h-links and the corresponding fields. As the fields (antifields) that must contract at scale $j$ in order to create $T_{jk}$ are already fixed by $b$, $j^b_h$ and $j^b_{\bar h}$, we only have to fix the field-antifield pairing $C_b$ restricted to $y^j_k$.

$^8$ Actually $B_S$ only really fixes the non-extremal summit cubes $\Delta$ (with $0 < |Ex(\Delta)| < |\Delta|$) since the extremal summit cubes with $Ex(\Delta) = \Delta$ were already known from the data in $S$.

IV.3 Bounding the determinant

In order to bound the main determinant we apply the following Hadamard inequalities. If $M$ is an $n \times n$ matrix with elements $M_{ij}$, its determinant satisfies the following bounds:

$$H_r: \quad |\det M| \le \prod_{i=1}^{n} \Bigg( \sum_{j=1}^{n} |M_{ij}|^2 \Bigg)^{1/2} \tag{IV.113}$$

$$H_c: \quad |\det M| \le \prod_{j=1}^{n} \Bigg( \sum_{i=1}^{n} |M_{ij}|^2 \Bigg)^{1/2} \tag{IV.114}$$

where $H_r$ is obtained by considering each row as an $n$-component vector, and $H_c$ by considering each column as an $n$-component vector. We remark that these two inequalities are both true, but not identical. In our case it is crucial to optimize our bounds as much as possible, and to use either the row or the column inequality depending on the kind of fields involved and on various scaling and occupation factors.
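As a purely numerical illustration of the remark that $H_r$ and $H_c$ are both true but not identical, the following Python sketch evaluates both bounds on a small real matrix. The helper names (`det`, `hadamard_row_bound`, `hadamard_col_bound`) are introduced here for illustration only, and the determinant is computed by plain Gaussian elimination with partial pivoting.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    sign, result = 1.0, 1.0
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding errors.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def hadamard_row_bound(matrix):
    """Product over rows of the Euclidean norm of each row (H_r)."""
    return math.prod(math.sqrt(sum(abs(x) ** 2 for x in row)) for row in matrix)


def hadamard_col_bound(matrix):
    """Product over columns of the Euclidean norm of each column (H_c)."""
    columns = list(zip(*matrix))
    return hadamard_row_bound(columns)


example = [[1, 2], [3, 4]]
row_bound = hadamard_row_bound(example)   # sqrt(5) * sqrt(25)
col_bound = hadamard_col_bound(example)   # sqrt(10) * sqrt(20)
```

For this matrix the row bound is the smaller of the two, which illustrates why choosing between rows and columns can matter for the final estimate.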


Before expanding the determinant in (IV.112) we distinguish therefore five different types of fields (antifields) denoted by an index $\alpha_h$, $\alpha_{\bar h}$:

$$
\begin{aligned}
\alpha_h &= 1 \quad \text{if } v_h \in \bar V_d \\
\alpha_h &= 2 \quad \text{if } v_h \in V_d,\ c_h \neq c_v \text{ and } \rho_h = 1 \\
\alpha_h &= 3 \quad \text{if } v_h \in V_d,\ c_h \neq c_v,\ h \notin a \text{ and } \rho_h = 2 \\
\alpha_h &= 4 \quad \text{if } v_h \in V_d,\ h \in a \\
\alpha_h &= 5 \quad \text{if } v_h \in V_d \text{ and } c_h = c_v
\end{aligned}
\tag{IV.115}
$$

The same definitions hold for antifields $\bar h$. The case $\alpha_h = 1$ is the most general one. This is a partition, since neither the fields with $\rho_h = 1$ and $c_h \neq c_v$ nor the special fields $h$ with $v_h \in V_d$ and $c_h = c_v$ can belong to $a$. We now define for each field $h$ a weight $I_h$ which depends on the type of the field as follows:

$$
\begin{aligned}
\alpha_h = 1: \quad & I_h = n_{\Delta_h}\, M^{-4 i_{\Delta_h}}\, f^{-1}_{\Delta_h} \\
\alpha_h = 2: \quad & I_h = M^{-4 i_{v_h}} \\
\alpha_h = 3: \quad & I_h = M^{-4 l_{v_h}} \\
\alpha_h = 4: \quad & I_h = M^{-4 i_h} \\
\alpha_h = 5: \quad & I_h = M^{-4 i_{v_h}}
\end{aligned}
\tag{IV.116}
$$

where $\Delta_h$ is the cube where the vertex $v_h$ is localized. For $\Delta \in B_S$ we defined $f_\Delta$ as the exposed fraction of the volume $|\Delta| = M^{4 i_\Delta}$, and $n_\Delta$ as the number of vertices in $\bar V_d$ localized in the summit cube $\Delta$. Finally, for each $h \in a$ the scale $i_h$ is defined as

$$i_h = \max J^a_h. \tag{IV.117}$$

We remark that actually $h \in a$ can only have attributions $j \ge 1 + i_h$. The same definitions hold for $\bar h$.

The Hadamard inequality will be either of the row or of the column type depending on whether the ratio of weights of the fields involved is larger or smaller than 1. In fact we need to discretize these ratios in order to transfer some factors from fields to antifields and conversely and to obtain a correct bound. To implement this program we introduce an auxiliary expansion called the weight expansion.

IV.3.1 The weight expansion

We expand

$$h = \sum_{\beta_h = 1}^{5} h^{\beta_h} \tag{IV.118}$$

where $h^{\beta_h}$ means that $h$ can contract only with $\bar h$ such that $\alpha_{\bar h} = \beta_h$. The same holds for the antifields. Finally, we expand each $h^{\beta_h}$ ($\bar h^{\beta_{\bar h}}$) as

$$h^{\beta_h} = \sum_{r \in \mathbb{Z}} h^{\beta_h}(r) \tag{IV.119}$$

where $h^{\beta_h}(r)$ means that $h$ can contract only with $\bar h$ such that

$$\frac{I_h}{I_{\bar h}} \in I_r; \qquad I_0 = \{1\}, \quad I_r = \,]2^{r-1}, 2^r] \ \text{ if } r > 0, \quad I_r = [2^r, 2^{r+1}[ \ \text{ if } r < 0. \tag{IV.120}$$
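The dyadic intervals of (IV.120) can be made concrete with a short Python sketch. It assigns to a positive weight ratio the index $r$ of the interval $I_r$ containing it, reading $I_0$ as the single point 1, and it lets one check on examples that the inverse ratio falls in $I_{-r}$. The function name `dyadic_index` is hypothetical, and exact rational arithmetic is used so that the interval endpoints are handled without rounding.

```python
from fractions import Fraction


def dyadic_index(ratio):
    """Return r such that ratio lies in I_r, with
    I_0 = {1}, I_r = ]2^(r-1), 2^r] for r > 0, I_r = [2^r, 2^(r+1)[ for r < 0."""
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("the weight ratio must be strictly positive")
    if ratio == 1:
        return 0
    if ratio > 1:
        r = 1
        # Smallest r > 0 with ratio <= 2^r, so that 2^(r-1) < ratio <= 2^r.
        while ratio > 2 ** r:
            r += 1
        return r
    r = -1
    # Largest r < 0 with 2^r <= ratio, so that 2^r <= ratio < 2^(r+1).
    while ratio < Fraction(1, 2 ** (-r)):
        r -= 1
    return r
```

For instance a ratio of 3 lies in $]2, 4]$ and its inverse $1/3$ lies in $[1/4, 1/2[$, so the two indices are opposite.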

We remark that the intervals $I_r$ are disjoint with $\cup_{r \in \mathbb{Z}} I_r = \,]0, +\infty[$ and that with this definition $h(r)$ can contract only with antifields $\bar h(r')$ with $r' = -r$. The same holds for the antifields.

The special fields or antifields of type 5 require an additional expansion. We define for each such field $h$ an occupation number $n(h)$, which is the number of derived vertices localized in the same cube as $h$:

$$n(h) = n_d(\Delta_{i_{v_h}}) = |\{\text{vertices in } V_d \text{ localized in the cube } \Delta_{i_{v_h}}\}| \tag{IV.121}$$

We remark that $n_d(\Delta_{i_{v_h}})$ has nothing to do with $n_{\Delta_{v_h}}$ in general, since these numbers concern respectively $V_d$ and $\bar V_d$. We recall that the vertices $v \in V_d$ are localized in the cube of $D_{i_v}$ to which they belong, whereas the vertices of $\bar V_d$ are localized in the summit cube to which they belong. By convention, for any field not of type 5 we put

$$n(h) = 1 \tag{IV.122}$$

The same definitions hold for the antifields. Now we expand each field as

$$h^{\beta}(r) = \sum_{s \in \mathbb{Z}} h^{\beta}(r, s) \tag{IV.123}$$

where $h^{\beta_h}(r, s)$ means that $h$ can contract only with $\bar h$ such that

$$\frac{n(h)}{n(\bar h)} \in I_s \tag{IV.124}$$

where $I_s$ is defined like $I_r$ in (IV.120). We remark that this additional $s$ expansion is trivial (reduced to the term $s = 0$) unless $\alpha$ or $\beta$ equals 5, and that for $\alpha \neq 5$, $\beta = 5$, $s$ is negative: $s \le 0$. Symmetrically for $\alpha = 5$, $\beta \neq 5$, $s$ is positive: $s \ge 0$.

Summarizing all constraints, the field $h^{\beta_h}(r, s)$ contracts only with antifields $\bar h^{\beta_{\bar h}}(r', s')$ such that $\beta_h = \alpha_{\bar h}$, $\beta_{\bar h} = \alpha_h$, $k_h = k_{\bar h}$, $r' = -r$ and $s' = -s$. Therefore we have

$$\{h^{\beta}(r, s) \mid \alpha_h = \alpha,\ k_h = k\} = \{\bar h^{\alpha}(-r, -s) \mid \alpha_{\bar h} = \beta,\ k_{\bar h} = k\}. \tag{IV.125}$$


The determinant in (IV.112) is now written as

$$\det M = \sum_{\{\beta_h\}, \{\beta_{\bar h}\}}\ \sum_{\{r_h\}, \{r_{\bar h}\}}\ \sum_{\{s_h\}, \{s_{\bar h}\}}\ \prod_{r, s \in \mathbb{Z}} \det M_{r,s}(\{\beta_h\}, \{\beta_{\bar h}\}) \tag{IV.126}$$

where the sums over $r_h$, $r_{\bar h}$, $s_h$, $s_{\bar h}$, $\beta_h$ and $\beta_{\bar h}$ are extracted from the determinant by multilinearity, and $M_{r,s}$ is the matrix containing only fields with $r_h = r$ (therefore only antifields with $r_{\bar h} = -r$) and $s_h = s$ (therefore only antifields with $s_{\bar h} = -s$). We take the convention that $\det M_{r,s} = 1$ if there is no field with $r_h = r$ and $s_h = s$. We recall that the sums over $s_h$ and $s_{\bar h}$ are restricted by some constraints: $s = 0$ unless $\beta_h$ or $\beta_{\bar h}$ equals 5, $s \le 0$ for $\beta_h = 5$, $\beta_{\bar h} \neq 5$, and $s \ge 0$ for $\beta_h \neq 5$, $\beta_{\bar h} = 5$. Now we can insert absolute values inside the sums and (IV.112) can be bounded by

|Y |

|Ac (Y )|L

≤

MY

Y 0∈Y

S





 cj M Y    

v∈V¯d

M Y

j=mY







l∈hLj

dwl 

0

BS

Ex(∆v )



1

VL

M Y

j=mY +1



{x∆ }





  dxv

d

j=mY k=1 Tjk

L

∞ |λ|n n! c n=0

v∈Vd iv ∈Ivc ∆v ∈Div ∩Y

nVd σVd ρVd {n∆ }∆∈B ∆cV¯ S



|Y |

 

∆v

v∈Vd

l∈vLj

1

Vd ,αVd a,b,R {vl }l∈vL

j=mY k=1

{sh },{sh ¯}

 

dxv 

dwl 

0

(IV.127)

r,s∈Z Z



M Y



j=mY +1

l∈Tjk

a b b {Jha },{Jh ¯} ¯ } {jh },{jh ¯ } {kh },{kh

  M Y jk   ¯l , {wl }, {wl }) δkhl kh¯  C∆ll∆ ¯ l (xl , x l cj

l∈f L6j

 1

dwl 

0

{βh }{βh ¯ } {rh },{rh ¯}

   lv l v −1    wy j   wy j  |det Mr,s ({βh }{βh¯ })|  v v v∈Vd cv =αv

j=iv

v∈Vd cv =αv

j=iv

Now, for each $r, s$ we distinguish between three cases.

• If $r > 0$ (which means $r_h = r > 0$ and $r_{\bar h} = -r < 0$), then $I_h > I_{\bar h}$ for any $h$, $\bar h$ in $M_r$. In this case we apply the row inequality (IV.113).

• If $r < 0$ (which means $r_h = r < 0$ and $r_{\bar h} = -r > 0$), then $I_h < I_{\bar h}$ for any $h$, $\bar h$ in $M_r$. This case is similar to the first case, exchanging the role of fields and antifields, so we apply the column inequality (IV.114).

• If $r = 0$ (which means $r_h = r = 0$ and $r_{\bar h} = -r = 0$), then $I_h = I_{\bar h}$ for any $h$, $\bar h$ in $M_r$. In this case we must analyze in more detail the subdeterminants, as will be explained later.

With these conventions the fixed index (field or antifield) in the sum $\sum_{j=1}^{n} |M_{ij}|^2$ for $H_r$ or $\sum_{i=1}^{n} |M_{ij}|^2$ for $H_c$ is always the one with the highest weight $I$. This is essential in the following bounds.

IV.3.2 Case r > 0 (and r < 0)

As remarked above we treat only the case $r > 0$, the other case being similar, exchanging fields and antifields, hence rows and columns. In that case we apply the row inequality (IV.113):

$$|\det M_{r,s}(\{\beta_h\}, \{\beta_{\bar h}\})| \le \prod_{\substack{h \notin b,\ r_h = r \\ s_h = s}} \Bigg( \sum_{\substack{\bar h \notin b \,\mid\, \beta_h = \alpha_{\bar h},\ \alpha_h = \beta_{\bar h}, \\ k_{\bar h} = k_h,\ r_{\bar h} = -r,\ s_{\bar h} = -s}} |M_{h,\bar h}|^2 \Bigg)^{1/2} \tag{IV.128}$$

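where $h \notin b$ is the set of fields that are not extracted from the determinant to give some h-link. Now

$$|M_{h,\bar h}|^2 = \delta_{k_h, k_{\bar h}} \Bigg| \sum_{j \in I_h \cap I_{\bar h} \cap J(k_h)} [W_h(j)]\; h^{F^h_j}_{\Delta^j_h, \Delta^j_{\bar h}}(w)\; C^{jk_h}_{\Delta^j_h, \Delta^j_{\bar h}}(x_h, x_{\bar h})\; [W_{\bar h}(j)] \Bigg|^2 \le \delta_{k_h, k_{\bar h}} \sum_{j \in I_h} \Big| C^{jk_h}_{\Delta^j_h, \Delta^j_{\bar h}}(x_h, x_{\bar h}) \Big|^2 \tag{IV.129}$$

where the weakening factors $W_h(j)$, $W_{\bar h}(j)$ and $h^{F^h_j}_{\Delta^j_h, \Delta^j_{\bar h}}(w)$ are bounded by one, the sum over $j$ is extended from $I_h \cap I_{\bar h} \cap J(k_h)$ to the larger set $I_h$, which gives an upper bound, and we applied the identity

$$C^{jk_h}_{\Delta^j_h, \Delta^j_{\bar h}}(x_h, x_{\bar h})\; C^{j'k_h}_{\Delta^{j'}_h, \Delta^{j'}_{\bar h}}(x_h, x_{\bar h}) = 0 \quad \text{if } j \neq j' \tag{IV.130}$$

which is true by construction. For any $\bar h$ in the sum, its weight satisfies

$$I_h\, 2^{-r} \le I_{\bar h} < I_h\, 2^{-r+1}. \tag{IV.131}$$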

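Before going on we prove the following lemma.

Lemma. If $r > 0$, the only non zero contributions are for $\alpha_h < 5$.

Proof. Actually if there exists $\alpha_h = 5$ we must have

$$M^{-4 i_{v_h}} > 2^{r-1} I_{\bar h} \ge I_{\bar h}. \tag{IV.132}$$

But this is impossible. Indeed let us consider for instance the case $\alpha_{\bar h} = 1$. Then

$$M^{-4 i_{v_h}} > n_{\Delta_{\bar h}}\, M^{-4 i_{\Delta_{\bar h}}}\, f^{-1}_{\Delta_{\bar h}} \ge M^{-4 i_{\Delta_{\bar h}}} \tag{IV.133}$$

which implies $i_{v_h} < i_{\Delta_{\bar h}}$. But to contract $h$ with $\bar h$ we must also have $i_{v_h} \ge i_{\Delta_{\bar h}}$, which is a contradiction. The other cases are verified in the same way.

Now the first step is to estimate the sum over $\bar h$

$$\Sigma_{\bar h} =: \sum_{\substack{\bar h \notin b \,\mid\, \beta_h = \alpha_{\bar h},\ \alpha_h = \beta_{\bar h}, \\ k_{\bar h} = k_h,\ r_{\bar h} = -r,\ s_{\bar h} = -s}} |M_{h,\bar h}|^2. \tag{IV.134}$$

For this purpose we distinguish five cases.

1.) $\beta_h = 1$, which means that $h$ can contract only with $\bar h$ of type 1 ($\alpha_{\bar h} = 1$). Therefore for any $\bar h$ the weight $I$ is

$$I_{\bar h} = n_{\Delta_{\bar h}}\, M^{-4 i_{\Delta_{\bar h}}}\, f^{-1}_{\Delta_{\bar h}}. \tag{IV.135}$$

Therefore the sum $\Sigma_{\bar h}$ is bounded by

$$\Sigma_{\bar h} \le \sum_{j \in I_h} \sum_{\Delta \in D_j} \Big| C^{jk_h}(x_{\Delta^j_h}, x_\Delta) \Big|^2 \sum_{\Delta' \in B_S,\ \Delta' \subset \Delta} 2 n_{\Delta'} \tag{IV.136}$$

where $2 n_{\Delta'}$ is the maximal number of antifields (two for each vertex) localized in $\Delta'$. We remark that the vertex position in the propagator is substituted by the cube center $x_\Delta$. By (IV.131) and (IV.135) we see that

$$n_{\Delta'} < I_h\, 2^{-r+1}\, M^{4 i_{\Delta'}} f_{\Delta'} \tag{IV.137}$$

therefore (IV.136) is bounded by

$$\Sigma_{\bar h} \le 2\, I_h\, 2^{-r+1} \sum_{j \in I_h} \sum_{\Delta \in D_j} \Big| C^{jk_h}(x_{\Delta^j_h}, x_\Delta) \Big|^2 \sum_{\Delta' \in B_S,\ \Delta' \subset \Delta} M^{4 i_{\Delta'}} f_{\Delta'}. \tag{IV.138}$$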

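Now we observe that $M^{4 i_{\Delta'}} f_{\Delta'}$ is the exposed volume of $\Delta'$ and that $\cup_{\Delta'\subset\Delta} Ex(\Delta')$ is a partition of $\Delta$, for any cube $\Delta$, therefore we have
\[ \sum_{\Delta'\in B_S,\ \Delta'\subset\Delta} M^{4 i_{\Delta'}} f_{\Delta'} = M^{4j} \tag{IV.139} \]
hence $\Sigma_{\bar h}$ is bounded by
\[ \Sigma_{\bar h} \le I_h\, 2^{-r+2} \sum_{j\in I_h} M^{4j} \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2. \tag{IV.140} \]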




Finally the sum over $\Delta$ is bounded by
\[ \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2 \le C\, M^{\frac{16}{3}} M^{-4j} M^{-\frac{4}{3}k_h} \sum_{\Delta\in D_j} \chi_{j,k}(x_{\Delta_h^j}, x_\Delta) \tag{IV.141} \]
where from now on we use $C$ as a generic name for a constant independent of $M$, which can be tracked but whose precise numerical value is inessential. We applied the scaled decay (II.43)-(II.47), and the function $\chi_{j,k}$ is different from zero only for $|\vec x_{\Delta_h^j} - \vec x_\Delta| \le M^j$ and $|t_{\Delta_h^j} - t_\Delta| \le M^{j+k}$ (actually for $k > 0$ we have $|\vec x_{\Delta_h^j} - \vec x_\Delta| \le M^{j-\frac{k}{3}+\frac{1}{3}} \le M^j$). Now, for $x_{\Delta_h^j}$ fixed, the number of cubes whose center $x_\Delta$ satisfies these bounds is at most $26\,(2M^{k_h})$, where 26 is the number of nearest neighbors of $\Delta_h^j$ in the position space, and $2M^{k_h}$ is the number of choices in the time direction. Therefore
\[ \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2 \le C\, M^{\frac{16}{3}} M^{-4j} M^{-\frac{k_h}{3}}. \tag{IV.142} \]
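The exponent bookkeeping that leads from (IV.141) to (IV.142) can be checked mechanically. The following is only an illustrative sketch: the function names are hypothetical and the values of $M$ and $k_h$ used with it are arbitrary.

```python
from fractions import Fraction

SPATIAL_NEIGHBOURS = 26  # nearest neighbours of a cube in the three space directions


def max_cubes(M: int, k: int) -> int:
    """Upper bound 26 * (2 * M**k) on the cubes whose centre satisfies the chi_{j,k} constraints."""
    return SPATIAL_NEIGHBOURS * 2 * M ** k


def k_exponent_after_sum() -> Fraction:
    """Exponent of M^{k_h} after the sum over cubes.

    (IV.141) carries M^{-(4/3) k_h}; the 2 * M^{k_h} choices in time add +1.
    """
    return Fraction(-4, 3) + 1
```

The numerical prefactor $26 \cdot 2$ is absorbed in the generic constant $C$, so only the exponent $-1/3$ survives in (IV.142).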

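Remark that the case $j = j_M + 1$ needs a different treatment. Actually in this case we have
\[ \sum_{\Delta\in D_j} \big| C^{j_M+1,0}(x_{\Delta_h^j}, x_\Delta) \big|^2 \le \sum_{\Delta\in D_j} \chi\big(|t_{\Delta_h^j} - t_\Delta| \le M^{j_M}\big) \frac{C_p\, M^{-4j_M}}{\big(1 + M^{-j_M} |\vec x_{\Delta_h^j} - \vec x_\Delta|\big)^{2p}} \le M^{-4j_M} \sum_{n_1,n_2,n_3\in \mathbb{Z}} \frac{C_p}{\big(1 + M^{-j_M} M^{j_M} (|n_1| + |n_2| + |n_3|)\big)^{2p}} \le C\, M^{-4j_M}. \tag{IV.143} \]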
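The last step of (IV.143) only needs the lattice sum over $(n_1,n_2,n_3)$ to converge. The sketch below is illustrative and not part of the original argument: it assumes $p = 2$ (any $p \ge 2$ works the same way), and the helper names `shell_count` and `lattice_sum` are hypothetical.

```python
from itertools import product


def shell_count(m: int) -> int:
    """Brute-force number of points (n1, n2, n3) in Z^3 with |n1| + |n2| + |n3| == m."""
    rng = range(-m, m + 1)
    return sum(1 for n in product(rng, repeat=3) if sum(abs(c) for c in n) == m)


def lattice_sum(p: int, cutoff: int) -> float:
    """Partial sum of (1 + |n1| + |n2| + |n3|)^(-2p) over shells m <= cutoff.

    Uses the closed form 4*m^2 + 2 for the number of points in shell m >= 1.
    """
    total = 1.0  # contribution of the origin (m = 0)
    for m in range(1, cutoff + 1):
        total += (4 * m * m + 2) / (1 + m) ** (2 * p)
    return total
```

Since the shell size grows like $m^2$ while the summand decays like $m^{-2p}$, the series converges as soon as $2p > 3$, which is why the result is a constant independent of $j_M$.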

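The sum $\Sigma_{\bar h}$ is finally bounded by
\[ \Sigma_{\bar h} \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} \sum_{j\in I_h} M^{4j} M^{-4j} \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} |I_h| \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} j_M \tag{IV.144} \]
where $|I_h|$ is the number of elements in the interval $I_h$, the numerical constants have been absorbed in $C$ and we bounded $|I_h|$ by $j_M$.

2.) $\beta_h = 2$, which means that $h$ can contract only with $\bar h$ of type 2 ($\alpha_{\bar h} = 2$). Therefore all $\bar h$ must be hooked to some vertex in $V_d$ and must have scale attribution $i_{v_{\bar h}} \le j_{\bar h} \le l_{v_{\bar h}}$. The weight $I_{\bar h}$ is
\[ I_{\bar h} = M^{-4 i_{v_{\bar h}}} = M^{-4 i_r(h)} \tag{IV.145} \]
where $i_r(h)$ is the unique scale for which (IV.131) is satisfied. Now
\[ \Sigma_{\bar h} \le \sum_{j\in I_h,\ j\ge i_r(h)} \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2\; 2\,(j_M + 2 - j) \tag{IV.146} \]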


where $j_M + 2 - j \le 2j_M$ is the maximal number of cubes at scale $j_M + 1 \ge j' \ge j$ containing $\Delta$. As $j = j_{\bar h} \le l_{v_{\bar h}}$, only vertices localized in these cubes can contribute. The factor 2 appears because there is only one vertex localized in each cube and at most 2 antifields hooked to that vertex. The sum over $\Delta$ is performed as in the case 1.). Therefore
\[ \Sigma_{\bar h} \le C\, j_M\, M^{\frac{16}{3}} M^{-\frac{k_h}{3}} \sum_{j\in I_h,\ j\ge i_r(h)} M^{-4j}. \tag{IV.147} \]
Now
\[ M^{-4j} = M^{-4(j-i_r(h))} M^{-4 i_r(h)} \le M^{-4(j-i_r(h))}\, 2^{-r+1} I_h. \tag{IV.148} \]
The sum over $j$ is performed with the decay $M^{-4(j-i_r(h))}$:
\[ \sum_{j\in I_h,\ j\ge i_r(h)} M^{-4(j-i_r(h))} \le C. \tag{IV.149} \]
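The bound (IV.149) is a geometric series, so the constant can be taken to be $1/(1-M^{-4})$. A minimal numerical sketch of this, with hypothetical helper names and arbitrary values of $M$, could read:

```python
def scale_sum(M: int, i_r: int, j_max: int) -> float:
    """Sum of M^(-4 (j - i_r)) over the integer scales i_r <= j <= j_max."""
    return sum(float(M) ** (-4 * (j - i_r)) for j in range(i_r, j_max + 1))


def geometric_bound(M: int) -> float:
    """Value 1 / (1 - M^-4) of the full geometric series, valid for M >= 2."""
    return 1.0 / (1.0 - float(M) ** -4)
```

The bound does not depend on $i_r(h)$ or on the length of the interval $I_h$, which is what allows it to be absorbed in $C$.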

Finally
\[ \Sigma_{\bar h} \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} j_M \tag{IV.150} \]
where all constants have been inserted in $C$.

3.) $\beta_h = 3$, which means that $h$ can contract only with $\bar h$ of type 3 ($\alpha_{\bar h} = 3$). Therefore all $\bar h$ in the sum are hooked to some $v \in V_d$ and have $j_{\bar h} > l_v$. The weight is
\[ I_{\bar h} = M^{-4 l_{v_{\bar h}}} = M^{-4 i_r(h)} \tag{IV.151} \]
where $i_r(h)$ is the unique scale for which (IV.131) is satisfied. Then
\[ \Sigma_{\bar h} \le \sum_{j\in I_h} \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2 \sum_{\Delta'\in D_{i_r(h)},\ \Delta'\subset\Delta} 2 \tag{IV.152} \]
where 2 is the maximal number of antifields with $\Delta_v \subseteq \Delta'$ that are hooked to the vertex $v_l$ of the vertical link $l \in vL_{i_{\Delta'}}$ connecting the connected component $y_k^j$ ($j = i_{\Delta'}$) containing $\Delta'$ to its ancestor. Now we observe that
\[ \sum_{\Delta'\in D_{i_r(h)},\ \Delta'\subset\Delta} 2 \le 2\, M^{4j} M^{-4 i_r(h)} \tag{IV.153} \]
where $M^{4j-4i_r(h)}$ is the number of cubes of scale $i_r(h)$ contained in a cube of scale $j$. By (IV.131)-(IV.151) we see that
\[ M^{-4 i_r(h)} \le I_h\, 2^{-r+1} \tag{IV.154} \]




hence $\Sigma_{\bar h}$ is bounded by
\[ \Sigma_{\bar h} \le 2\, I_h\, 2^{-r+1} \sum_{j\in I_h} M^{4j} \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2. \tag{IV.155} \]
The sum over $\Delta$ is bounded as in the case 1.) above. Therefore
\[ \Sigma_{\bar h} \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} \sum_{j\in I_h} M^{4j} M^{-4j} \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} |I_h| \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} j_M \tag{IV.156} \]
where $|I_h| \le j_M + 1 \le 2j_M$ and all constant factors are absorbed in $C$.

4.) $\beta_h = 4$, which means that $h$ can contract only with $\bar h$ of type 4 ($\alpha_{\bar h} = 4$). Therefore all $\bar h$ in the sum are associated to some f-link of order 6 and the weight is
\[ I_{\bar h} = M^{-4 i_{\bar h}} = M^{-4 i_r(h)} \tag{IV.157} \]
where $i_r(h)$ is the unique scale for which (IV.131) is satisfied. Then
\[ \Sigma_{\bar h} \le \sum_{j\in I_h} \sum_{\Delta\in D_j} \big| C^{jk_h}(x_{\Delta_h^j}, x_\Delta) \big|^2 \sum_{\Delta'\in D_{i_r(h)},\ \Delta'\subset\Delta} 6 \tag{IV.158} \]
where 6 is the maximal number of antifields with $\Delta_v \subset \Delta'$ that have been derived by a f-link of order 6 at scale $i_{\Delta'}$ for the connected component $y_k^j$ ($j = i_{\Delta'}$) containing $\Delta'$. Now we can apply the same analysis as for the case 3.) except that instead of a factor 2 we have a factor 6. Hence we obtain
\[ \Sigma_{\bar h} \le C\, M^{\frac{16}{3}} I_h\, 2^{-r} M^{-\frac{k_h}{3}} j_M. \tag{IV.159} \]

5.) $\beta_h = 5$, which means that $h$ can contract only with $\bar h$ of type 5 ($\alpha_{\bar h} = 5$). Therefore all $\bar h$ in the sum are hooked to some $v \in V_d$ and have $j_{\bar h} = i_v$. The weight is
\[ I_{\bar h} = M^{-4 i_{v_{\bar h}}} = M^{-4 i_r(h)} \tag{IV.160} \]
where $i_r(h)$ is the unique scale for which (IV.131) is satisfied. There is no sum over $j$ to compute, as we have only $j = i_r(h)$:
\[ \Sigma_{\bar h} \le \sum_{\Delta\in D_{i_r(h)}} \big| C^{i_r(h) k_h}(x_{\Delta_h^{i_r(h)}}, x_\Delta) \big|^2\, n(\bar h). \tag{IV.161} \]
We know that $s$ is negative, and by (IV.124) (and the fact that $n(h) = 1$), we obtain $n(\bar h) \le 2^{-s} = 2^{|s|}$. Therefore
\[ \Sigma_{\bar h} \le 2^{|s|} \sum_{\Delta\in D_{i_r(h)}} \big| C^{i_r(h) k_h}(x_{\Delta_h^{i_r(h)}}, x_\Delta) \big|^2. \tag{IV.162} \]


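The sum over $\Delta$ is performed as in the other cases, then
\[ \Sigma_{\bar h} \le C\, 2^{|s|} M^{\frac{16}{3}} M^{-\frac{k_h}{3}} M^{-4 i_r(h)}. \tag{IV.163} \]
Applying (IV.160) we have
\[ \Sigma_{\bar h} \le C\, 2^{|s|}\, 2^{-r} I_h\, M^{\frac{16}{3}} M^{-\frac{k_h}{3}}. \tag{IV.164} \]
Now we can insert all these bounds in (IV.128):
\[ |\det M_{r,s}(\{\beta_h\}\{\beta_{\bar h}\})| \le \big(C M^{\frac{8}{3}}\big)^{n_{r,s}} \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=1,\dots,4}} \Big[ I_h^{\frac12}\, 2^{-\frac r2}\, j_M^{\frac12}\, M^{-\frac{k_h}{6}} \Big] \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=5}} \Big[ I_h^{\frac12}\, 2^{-\frac r2}\, 2^{\frac{|s|}{2}}\, M^{-\frac{k_h}{6}} \Big] \tag{IV.165} \]
where $C$ is a constant and $n_{r,s}$ is the number of fields belonging to the matrix $M_{r,s}$. Now we observe that
\[ \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=\beta}} I_h^{\frac12}\, 2^{-\frac r2} \le \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=\beta}} \Big[ I_h^{\frac14}\, 2^{\frac r4}\, 2^{-\frac r2} \Big] \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=\beta}} I_{\bar h}^{\frac14} = \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=\beta}} \Big[ I_h^{\frac14}\, 2^{-\frac r8} \Big] \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=\beta}} \Big[ I_{\bar h}^{\frac14}\, 2^{-\frac r8} \Big] \tag{IV.166} \]
where we applied the relation (IV.131) and the fact that $|\{h \,|\, r_h = r, s_h = s, \beta_h = \beta\}| = |\{\bar h \,|\, r_{\bar h} = -r, s_{\bar h} = -s, \alpha_{\bar h} = \beta\}|$ for $\beta = 1, \dots, 5$. Moreover
\[ \prod_{\substack{h\in b,\ r_h=r\\ s_h=s}} M^{-\frac{k_h}{6}} = \prod_{\substack{h\in b,\ r_h=r\\ s_h=s}} M^{-\frac{k_h}{12}} \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s}} M^{-\frac{k_{\bar h}}{12}} \tag{IV.167} \]
since $|\{h \,|\, r_h = r, s_h = s, k_h = k\}| = |\{\bar h \,|\, r_{\bar h} = -r, s_{\bar h} = -s, k_{\bar h} = k\}|$ for any $k \ge 0$, and
\[ \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=1,\dots,4}} j_M^{\frac12} = \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=1,\dots,4}} j_M^{\frac14} \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=1,\dots,4}} j_M^{\frac14} \le \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \alpha_h=1,\dots,4}} j_M^{\frac14} \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=1,\dots,4}} j_M^{\frac14} \tag{IV.168} \]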




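where we applied the relation $|\{h \,|\, r_h = r, s_h = s, \alpha_h = 1, \dots, 4\}| = |\{\bar h \,|\, r_{\bar h} = -r, s_{\bar h} = -s, \alpha_{\bar h} = 1, \dots, 4\}| + |\{\bar h \,|\, r_{\bar h} = -r, s_{\bar h} = -s, \alpha_{\bar h} = 5\}|$, which is true because $\alpha_h < 5\ \forall h$. Now, for any $h$ with $\beta_h = 5$ there is no factor $j_M$, therefore we write $1 \le j_M^{1/4}$. Finally we observe that (see (IV.121)):
\[ \prod_{\substack{h\in b,\ r_h=r\\ s_h=s,\ \beta_h=5}} 2^{\frac{|s|}{2}} = \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=5}} 2^{\frac{|s|}{2}} \le \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=5}} 2^{-\frac{|s|}{2}}\; 2 n_d(\Delta_{i_{v_{\bar h}}}) \le \prod_{\substack{h\in b,\ r_h=r\\ s_h=s}} 2^{-\frac{|s|}{4}} \prod_{\substack{\bar h\in b,\ r_{\bar h}=-r\\ s_{\bar h}=-s,\ \alpha_{\bar h}=5}} 2^{-\frac{|s|}{4}}\; 2 n_d(\Delta_{i_{v_{\bar h}}}) \tag{IV.169} \]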

where we apply the inequality $2^{|s|} \le 2 n_d(\Delta_{i_{v_{\bar h}}})$. The determinant $|\det M_{r,s}|$ is then bounded by |det Mr,s | ≤ ¯ ∈b,r ¯ =−r h h sh ¯ =−s,αh ¯ 0.

In the second case ($\alpha_h = 5$, $\beta_h < 5$) we apply the column inequality (IV.113) and everything goes as in the case $r > 0$, exchanging fields and antifields.

Finally in the third case we have some field with $\alpha_h = 5$ contracting with some antifield with $\alpha_{\bar h} = 5$. Here again we optimize the Hadamard inequalities depending on the sign of $s$. If $s \ge 0$ we apply the row inequality, and symmetrically$^9$. The two main weights are equal
\[ I_{\bar h} = M^{-4 i_{v_{\bar h}}} = M^{-4 i_{v_h}} = I_h. \tag{IV.173} \]
Remark that there is no sum over $j$ to compute, as we have only $j = i_{v_h}$:
\[ \Sigma_{\bar h} \le \sum_{\Delta\in D_{i_{v_h}}} \big| C^{i_{v_h} k_h}(x_{\Delta_h^{i_{v_h}}}, x_\Delta) \big|^2\, n(\bar h). \tag{IV.174} \]

9 This second optimization is not really necessary, but nicer.




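We know that $n(\bar h) < 2^{-s+1} n(h)$, therefore
\[ \Sigma_{\bar h} \le 2^{-|s|+1}\, n(h) \sum_{\Delta\in D_{i_{v_h}}} \big| C^{i_r(h) k_h}(x_{\Delta_h^{i_r(h)}}, x_\Delta) \big|^2. \tag{IV.175} \]
The sum over $\Delta$ is performed as in the other cases and we get
\[ \Sigma_{\bar h} \le 2^{-|s|+1}\, n(\Delta_{v_h})\, C\, M^{\frac{16}{3}} M^{-\frac{k_h}{3}} M^{-4 i_{v_h}} = C\, 2^{-|s|}\, n(\Delta_{v_h})\, I_h\, M^{\frac{16}{3}} M^{-\frac{k_h}{3}} \tag{IV.176} \]
where the constant 2 has been inserted into $C$. As before we will distribute the factor $2^{-|s|}$ on both sides of the determinant, which gives again factors $2^{-|s|/4}$ for each field or antifield of this determinant after the Hadamard inequality. The other factors are unchanged.

IV.3.4 Result of the weight expansion

The global determinant is bounded by
\[
\begin{aligned}
\prod_{r,s} |\det M_{r,s}| \le C^n M^{\frac{16 n}{3}}
&\prod_{h\in b,\ \alpha_h=1} \Big[ M^{-i_{\Delta_h}} n_{\Delta_h}^{\frac14} f_{\Delta_h}^{-\frac14} j_M^{\frac14} \Big]
\prod_{h\in b,\ \alpha_h=2,3} \Big[ M^{-i_{v_h}} j_M^{\frac14} \Big]
\prod_{h\in b,\ \alpha_h=4} \Big[ M^{-i_h} j_M^{\frac14} \Big] \\
&\prod_{h\in b,\ \alpha_h=5} \Big[ M^{-i_{v_h}} n_d(\Delta_{i_{v_h}}) \Big]
\prod_{h\in b} \Big[ M^{-\frac{k_h}{12}}\, 2^{-\frac{|r_h|}{8} - \frac{|s_h|}{4}} \Big] \\
&\prod_{\bar h\in b,\ \alpha_{\bar h}=1} \Big[ M^{-i_{\Delta_{\bar h}}} n_{\Delta_{\bar h}}^{\frac14} f_{\Delta_{\bar h}}^{-\frac14} j_M^{\frac14} \Big]
\prod_{\bar h\in b,\ \alpha_{\bar h}=2,3} \Big[ M^{-i_{v_{\bar h}}} j_M^{\frac14} \Big]
\prod_{\bar h\in b,\ \alpha_{\bar h}=4} \Big[ M^{-i_{\bar h}} j_M^{\frac14} \Big] \\
&\prod_{\bar h\in b,\ \alpha_{\bar h}=5} \Big[ M^{-i_{v_{\bar h}}} n_d(\Delta_{i_{v_{\bar h}}}) \Big]
\prod_{\bar h\in b} \Big[ M^{-\frac{k_{\bar h}}{12}}\, 2^{-\frac{|r_{\bar h}|}{8} - \frac{|s_{\bar h}|}{4}} \Big]
\end{aligned}
\tag{IV.177}
\]
where we have applied $\sum_{r,s} n_{r,s} \le 2n$, and all numerical factors have been absorbed in the constant $C$. The factors $M^{-l_{v_h}}$ have been moreover bounded by $M^{-i_{v_h}}$. Inserting this result inside (IV.127) we have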

|Ac (Y )|L|Y | ≤

MY

Y 0∈Y

S

VL

{vl }l∈vL nVd σVd ρVd {n∆ }∆∈BS ∆cV¯



 cj M Y   j=mY k=1 Tjk

v∈Vd

L|Y |

∆v

 

n 16 ∞ |λ|n CM 3 BS

{x∆ }c n=0

 


 dxv  d

n!

v∈Vb \Vd

Ex(∆v )

Vd ,αVd a,b,R

a b b {Jha },{Jh ¯} ¯ } {jh },{j¯ } {kh },{kh

 

dxv  

h

v∈V¯d \Vb



M 4i∆v 




M Y




cj

  jk  ¯l ) δkhl kh¯  C∆ l∆ ¯ (xl , x 

l

 

j=mY k=1

2−

l

l

l∈Tjk

|rh | |sh | 8 − 4

{h∈b}

 2−

|rh |sh ¯| ¯| 8 − 4

¯ ∈b} {h





M Y

 

{h∈b}

j=iv

| ln T |

v∈Vd ∪Vb

{βh }{βh ¯}

1  −i −1 4  M ∆h n∆ f 4  h ∆h h∈b αh =1

    

h∈b, αh =2,3

¯ ∈b} {h



M −ivh

¯ ∈b, h αh ¯ =2,3

 

   M −ivh¯  M −ih   h∈b, αh =4



1

0

l∈vLj

 l v −1 kh kh ¯   M − 12 M − 12  wy j    v v∈Vd cv =αv

(IV.178)

{rh },{rh ¯ } {sh },{sh ¯}

j=mY +1




  lv   dwl  wy j  v v∈Vd cv =αv

j=iv

| ln T |3/4

v∈Vd ∪Vb

 −i 1 − 14  4  M ∆h¯ n∆ f ∆ ¯ ¯  h h

¯ ∈b, h αh ¯ =1



¯ ∈b, h αh ¯ =4

 M −ih¯  

   M −ivh nd (∆ivh ) M −ivh¯ nd (∆ivh¯ )   ¯ ∈b, h αh ¯ =5

h∈b, αh =5

jkl (IV.179) (x , x ¯ , {w }, {w }) ¯l ) . C∆l ∆ ≤ C jkl (xl , x l l ¯l l l To get the factor v∈Vd ∪Vb | ln T |3/4 v∈Vd ∪Vb | ln T | in this bound we collected where

the factors $j_M^{1/4}$, which are bounded by $|\ln T|^{1/4}$ (II.33), and we used the fact that a vertex $v \in V_d \cup V_b$ either is in $V_d$, hence has a field or antifield of type 5 hooked to it, which has no $j_M^{1/4}$ factor, or is in $V_b - V_d$, hence has at least a field or antifield in $b$ which does not appear in the products of (IV.177). The integrals over the weakening factors wl and wl have been bounded by one, but the ones over wl are kept carefully since they are used below. Now we observe that
\[ \sum_{\{r_h\},\{r_{\bar h}\}}\ \sum_{\{s_h\},\{s_{\bar h}\}} \Big[ \prod_{h\in b} 2^{-\frac{|r_h|}{8} - \frac{|s_h|}{4}} \Big] \Big[ \prod_{\bar h\in b} 2^{-\frac{|r_{\bar h}|}{8} - \frac{|s_{\bar h}|}{4}} \Big] \le C^n. \tag{IV.180} \]
The logarithms are bounded using the relation $|\ln T|\,|\lambda| \le K$. Hence we can write (since we assumed $K \le 1$)
\[ \prod_{v\in V_d\cup V_b} |\lambda|\, |\ln T|^{3/4} \prod_{v\notin V_d\cup V_b} |\lambda|\, |\ln T| \le K^{|\bar V_d\setminus V_b|} \prod_{v\in V_d\cup V_b} |\lambda|^{\frac14}. \tag{IV.181} \]




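The $n_\Delta$ and $n_d(\Delta)$ factors coming from the Hadamard bound can be estimated using Stirling's formula as follows:
\[
\begin{aligned}
\prod_{h\in b,\ \alpha_h=1} n_{\Delta_h}^{\frac14} \prod_{\bar h\in b,\ \alpha_{\bar h}=1} n_{\Delta_{\bar h}}^{\frac14} &\le \prod_{\Delta\in B_S} n_\Delta^{n_\Delta} \le \prod_{\Delta\in B_S} n_\Delta!\; e^{n_\Delta} = e^n \prod_{\Delta\in B_S} n_\Delta! \\
\prod_{h\in b,\ \alpha_h=5} n_d(\Delta_{i_{v_h}}) \prod_{\bar h\in b,\ \alpha_{\bar h}=5} n_d(\Delta_{i_{v_{\bar h}}}) &\le \prod_{\Delta\in Y} n_d(\Delta)^{n_d(\Delta)} \le e^n \prod_{\Delta\in Y} n_d(\Delta)!
\end{aligned}
\tag{IV.182}
\]
Inserting all these results and absorbing all constants except $K$ in the global factor $C^n$ we have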

|Ac (Y )|L|Y | ≤

MY

Y 0∈Y

S

L|Y |

n 16 ∞ CM 3 M 4

VL

a,b,R {vl }l∈vL nVd σVd ρVd {n∆ }∆∈BS ∆cV¯



|λ|

{h∈b}

¯ ∈b} {h

 

h∈b, αh =2,3,5

M Y



j=mY k=1 Tjk M Y

¯ ∈b, h αh ¯ =2,3,5

 cj 

cj

M Y

j=mY +1

v∈Vd





 

l∈Tjk

l∈vLj

0

1

4i∆v 

M

∆v

h∈b, αh =4

nd (∆)!

 dxv 

v∈Vb \Vd



¯ ∈b, h αh ¯ =1

¯ ∈b, h αh ¯ =4

j=iv

 M −i∆h¯   

 M −ih¯  



dxv 

Ex(∆v )

    lv l v −1    dwl   wy j   wy j  v v v∈Vd cv =αv

h∈b, αh =1

  M −ivh¯  M −ih 

a b b {Jha },{Jh ¯ } {jh },{jh ¯}

kh  ¯ M − 12   M −i∆h 





∆∈Y



 jk  ¯l ) δkhl kh¯  C∆ll∆ ¯ l (xl , x l

j=mY k=1



M −ivh

¯

Vd ,αVd






  













v∈Vd ∪Vb

 kh n∆ !  M − 12

∆∈BS



v∈Vd ∪Vb

{kh },{kh ¯ } {βh }{βh ¯}

1 4

d

n!

{x∆ }c n=0

BS

K |Vd \Vb |

v∈Vd cv =αv

j=iv

(IV.183)


where we applied $f_\Delta \ge M^{-4}$ to bound every factor $f_{\Delta_h}^{-1/4}$ by $M$. As we have at most four fields of type 1 per vertex $v \in \bar V_d$ we obtain at most the factor $M^{4n}$.

IV.4 Extracting power counting

In order to extract the power counting for h-links we define
\[ C^{jk_l}_{\Delta_l \bar\Delta_l}(x_l, \bar x_l) = M^{-j^b_h} M^{-j^b_{\bar h}} M^{-\varepsilon k_h} M^{-\varepsilon k_{\bar h}}\, D^{jk_l}_{\Delta_l \bar\Delta_l}(x_l, \bar x_l) \tag{IV.184} \]
where $h$ and $\bar h$ are the field and antifield contracted to form the propagator and $\varepsilon$ is some small constant $0 < \varepsilon < 1$ that will be determined later. Remark that $j^b_h = j^b_{\bar h} = j$ and $k_h = k_{\bar h} = k_l$ by construction. The factor $M^{-\varepsilon k_h}$ is necessary to sum over $k_h$ and extract a small factor per cube. The factor $M^{-j^b_h}$ corresponds to a kind of power counting for the field. Now we can write
n 28 ∞ CM 3 ¯ K |Vd \Vb | |Ac (Y )|L|Y | ≤ L|Y | n! c n=0 Y MY

0∈Y

S

VL

BS

 

|λ|



1 4

v∈Vd ∪Vb

{kh },{kh ¯ } {βh }{βh ¯}

 kh n∆ !  M − 12

∆∈BS

{h∈b}

¯ ∈b} {h



      

h∈b, αh =1,2,3,5

M Y

M −i∆h

¯ ∈b, h αh ¯ =1,2,3,5

 cj 

j=mY k=1 Tjk M Y

j=mY +1

v∈Vd ∪Vb

d



M

4i∆v 

v∈Vd ∪Vb

Ωv

 

M

kh ¯ − 12

  

  M −i∆h¯  

Vd ,αVd



a b b {Jha },{Jh ¯ } {jh },{jh ¯}

nd (∆)!

∆∈Y

M −εkh

{h∈b}

 M −εkh¯ 

¯ {h∈b}

M −ih



¯

l∈vLj

0

 M −ih¯  

αh =4, h∈b | αh ¯ =4, { h∈bor| h∈b } ¯ or h∈b     cj M Y jk  dxv  ¯l ) δkhl kh¯  D∆ll∆ ¯ l (xl , x l

j=mY k=1

l∈Tjk

    l l −1 v v 1     dwl   wyj   wyj  v v 




a,b,R {vl }l∈vL nVd σVd ρVd {n∆ }∆∈BS ∆cV¯



{x∆ }

v∈Vd cv =αv

j=iv

v∈Vd cv =αv

(IV.185)

j=iv

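where we defined $i_h = j^b_h$ if $h \in b$, and we defined $\Omega_v = \Delta_{i_v}$ if $v \in V_d$ and $\Omega_v = Ex(\Delta_v)$ if $v \in V_b \cap \bar V_d$. We also defined $\Delta_h = \Delta_{v_h} \in B_S$ if $\alpha_h = 1$ and $\Delta_h = \Delta_{i_{v_h}}$ if $\alpha_h = 2, 3, 5$. Now for each $h$ with $i_h > i_{\Delta_h}$ we can write
\[ M^{-i_h} = M^{-i_{\Delta_h}} M^{-(i_h - i_{\Delta_h})}. \tag{IV.186} \]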

The same formulas hold for $\bar h$. The factor $M^{-i_{\Delta_h}}$ will be used to compensate the integration over $x_v \in \Delta_{v_h}$. To extract power counting for a v-link associated to the vertex $v$ we extract a fraction $|\lambda|^{1/8}$ for each vertex in $V_d$:
\[ \prod_{v\in V_d} |\lambda|^{\frac14} = \prod_{v\in V_d} |\lambda|^{\frac18} |\lambda|^{\frac18}. \tag{IV.187} \]
Now, for each $y_k^j$ connected to its ancestor by a f-link, there are 6 external fields. One of these may be the field $h_{root}$. For this field we keep the vertical decay $M^{-(j^b_h - i_{\Delta_h})}$ untouched, in order to perform later the sum over the tree structure. The vertical decay for the remaining five external fields, together with the factors $|\lambda|^{1/8}$, are necessary for several purposes:

• to ensure a factor $M^{-4}$ to sum the root cube for any $y_k^j$ inside a cube at scale $j + 1$;
• to sum over $J^a_h$ and $j^b_h$;
• to extract one small factor per cube;
• to sum over the tree structure.

Therefore we write the vertical decay for each of the five fields as follows:
\[ M^{-(i_h - i_{\Delta_h})} = M^{-\frac{\varepsilon'}{2}(i_h - i_{\Delta_h})}\, M^{-\frac{\varepsilon'}{2}(i_h - i_{\Delta_h})}\, M^{-(1-\varepsilon')(i_h - i_{\Delta_h})} \tag{IV.188} \]
where $0 < \varepsilon' < 1$ is some small constant that will be chosen later. One of the two fractions $\varepsilon'/2$ is necessary to sum over $J^a_h$ and $j^b_h$, and to extract one small factor per cube. The other fraction will be used to reconstruct some vertical decay in order to sum over the tree. Now we call $G_F$ the set of subpolymers $y_k^j$ connected to their ancestor by a f-link, and $G_V$ the set of subpolymers $y_k^j$ connected to their ancestor by a v-link. Therefore we can write

h∈b | αh =4, or h∈b and h∈Rroot

ε

M −(1−ε )(ih¯ −i∆h¯ ) M − 2 (ih¯ −i∆h¯ ) ≤

¯ ∈b | α¯ =4, or h h ¯ ∈R h∈b and h root

ε

M −(1−ε )(ih −i∆h ) M − 2 (ih −i∆h ) j gk ∈GF

(IV.189)

ε

M −5(1−ε ) M −5 2


where we applied the equation M

i∆h −1

−(1−ε )(ih −i∆h )

=

M −(1−ε )[j−(j−1)] .

(IV.190)

j=ih

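On the other hand for subpolymers with v-links we write
\[ \prod_{v\in V_d} |\lambda|^{\frac18} = \prod_{y_k^j\in G_V} |\lambda|^{\frac18} \le \Big[ \prod_{y_k^j\in G_V} M^{-5(1-\varepsilon')} \Big] \prod_{v\in V_d} |\lambda|^{\frac{\varepsilon'}{8}} \tag{IV.191} \]
where from now on we assume
\[ |\lambda|^{\frac18} \le M^{-5}. \tag{IV.192} \]
Finally we observe that for each $h \in R_{root}$ we can reconstruct a fraction of the vertical decay $j^b_h - i_{\Delta_h}$. This is possible because any cube $\Delta$ in the set
\[ A_h =: \{\Delta \,|\, j^b_h > i_\Delta \ge i_{\Delta_h} \text{ and } \Delta_h \subseteq \Delta\} \tag{IV.193} \]
must be $\Delta^0_{root}$ for some connected component at scale $i_\Delta$, with $\Delta_h \subseteq \Delta$, and we can extract a fraction $|\lambda|^{\varepsilon'/16}$ or $M^{-5\varepsilon'/2}$ of its vertical decay. Remark that no field $h' \neq h \in R_{root}$ can hook to any cube in $A_h$, because they are all cubes of type $\Delta^0_{root}$, therefore $A_h \cap A_{h'} = \emptyset$ for any $h \neq h' \in R_{root}$. This means that the same $\Delta$ is never used for more than one $h \in R_{root}$. Therefore we can write
\[ \prod_{v\in V_d} |\lambda|^{\frac{\varepsilon'}{8}} \prod_{y_k^j\in G_F} M^{-5\frac{\varepsilon'}{2}} \le \prod_{v\in V_d} |\lambda|^{\frac{\varepsilon'}{16}} \prod_{h\in R_{root}} M^{-5\frac{\varepsilon'}{2}(i_h - i_{\Delta_h})}. \tag{IV.194} \]
One of these fractions can be used to sum over $j^b_h$, the others will be used to sum over the tree. Inserting all these bounds we have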

|Ac (Y )|L|Y | ≤

MY

Y 0∈Y

S

L|Y |

n 28 ∞ CM 3

VL

BS

{x∆ }c n=0

Vd ,αVd a,b,R {vl }l∈vL nVd σVd ρVd {n∆ }∆∈BS ∆cV¯

b },{j b } {k },{k ¯ } {β }{β ¯ } {jh h h h h ¯ h

v∈Vd ∪Vb

1

|λ| 16



n!

¯


d



K |Vd \Vb |

v∈Vd ∪Vb

1

|λ| 16

 

a {Jha },{Jh ¯}



M −4i∆v

v∈Vd ∪Vb





M −εkh

{h∈b}

v∈Vd



 

{h∈b}

¯ ∈b} {h

M

−4 ε2

(ih −i∆h )

M −(ih −i∆h )

M Y

M

v∈Vd ∪Vb

Ωv

 M

¯

h∈b | αh ¯ =4, ¯ or h∈b

j=mY +1

l∈vLj

0

 ε M − 2 (ih¯ −i∆h¯ )  

 

−4 ε2

(ih ¯ −i∆¯ )  h

cj M Y −1





j=iv

M −5(1−ε ) 

l∈Tjk

j yk ∈GV

M −5(1−ε )

v∈Vd cv =αv

M 4i∆v

(IV.195)

j=iv

M −4i∆v = 1.

(IV.196)

v∈Vd ∪Vb

and

j=mY k=1

j=mY k=1

v∈Vd ∪Vb



    M cj Y jk  dxv  ¯l ) δkhl kh¯  D∆ll∆ ¯ l (xl , x l

v∈Vd cv =αv

where we applied





    lv l v −1 1     dwl   wy vj   wy vj  

kh ¯ − 12

M −(ih¯ −i∆h¯ ) 

¯ {h∈R root }

¯ {h∈R root }

 cj M Y   j=mY k=1 Tjk

{h∈Rroot }

¯ {h∈b}

{h∈Rroot }



∆∈BS

M −εkh¯  





 ε − ε2 (ih −i∆h ) |λ| 16  M  b | αh =4 } { h∈or h∈b





 kh n∆ !  M − 12

nd (∆)!

∆∈Y




cj M Y −1

M −5(1−ε ) = 



M −5(1−ε )  .

(IV.197)

j=mY k=1

j yk ∈GF 1

where we have extracted from $|\lambda|^{1/8}$ a fraction $|\lambda|^{1/16}$ that will be used to extract a small factor per cube. The factors $M^{-5(1-\varepsilon')}$ will be used to sum over the cube positions and to perform the last sum over $S$. Remember that $\Delta_v$ is a cube in $B_S$ if $v \in V_b \cap \bar V_d$, but if $v \in V_d$, the localization cube $\Delta_{i_v}$ of $v$ may not be a summit cube.

IV.5 Extracting a small factor per cube

Now, before bounding the polymer structure, we must extract a small factor $g$ for each cube, in order to obtain a factor $g^{|Y|}$. First we still need to extract some fractions of vertical decay. Actually, we will also need a fraction of the $k$ decay for tree lines. Therefore we write
\[ M^{-\varepsilon k_h} = M^{-\frac{\varepsilon}{2} k_h}\, M^{-\frac{\varepsilon}{2} k_h} \tag{IV.198} \]


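for each $h, \bar h \in b$. One fraction will be used to sum over $k_h$, and the remaining fraction is used to extract a small factor per cube. Finally we need to extract a fraction $\varepsilon'/4$ of the vertical decay $M^{-\frac{\varepsilon'}{2}(i_h - i_{\Delta_h})}$ for each $h, \bar h \in b$ with $\alpha_h = 4$, and for each $h, \bar h \notin b$. One fraction will be used to sum over $J^a$ and $j^b$. The remaining fraction is bounded by
\[ \Big[ \prod_{\{h\in b \,|\, \alpha_h=4\} \text{ or } h\notin b} M^{-\frac{\varepsilon'}{4}(i_h - i_{\Delta_h})} \Big] \Big[ \prod_{\{\bar h\in b \,|\, \alpha_{\bar h}=4\} \text{ or } \bar h\notin b} M^{-\frac{\varepsilon'}{4}(i_{\bar h} - i_{\Delta_{\bar h}})} \Big] \prod_{v\in V_d} |\lambda|^{\frac{\varepsilon'}{16}} \le \prod_{j=m_Y}^{M_Y-1} \prod_{k=1}^{c_j} M^{-5\frac{\varepsilon'}{4}}. \tag{IV.199} \]
Now we can prove the following lemma.

Lemma. One can extract from (IV.195) at least one small factor $g < 1$ for each cube in $Y$, where $g$ is defined by
\[ g = \max\big[\, |\lambda|^{\frac{1}{32}},\ M^{-\frac{5\varepsilon'}{4d}},\ M^{-\frac{\varepsilon}{2d}} \,\big] \tag{IV.200} \]
where $d = 3^4 = 81$ is the number of nearest neighbors for each cube (including itself).

Proof. We will prove the following inequality
\[ \prod_{v\in V_d\cup V_b} |\lambda|^{\frac{1}{16}} \Big[ \prod_{j=m_Y}^{M_Y-1} \prod_{k=1}^{c_j} M^{-5\frac{\varepsilon'}{4}} \Big] \Big[ \prod_{h\in b} M^{-\frac{\varepsilon}{2} k_h} \prod_{\bar h\in b} M^{-\frac{\varepsilon}{2} k_{\bar h}} \Big] \le g^{|Y|} \tag{IV.201} \]
which is enough to prove the lemma. First we make some remarks.

1) For every extremal summit cube $\Delta \in Y$ ($Ex(\Delta) = \Delta$), there must be at least one vertex $v \in V_d \cup V_b$ with $\Delta_v = \Delta$, as this cube must be connected to the polymer by a horizontal or vertical link. For this vertex we have a factor $|\lambda|^{1/16} \le g^2$. Therefore we have a factor $g^2$ for each $\Delta \in Y$ with $Ex(\Delta) = \Delta$.

2) For all $\Delta \in Y$ such that $\Delta = \Delta^0_{root}$ for some connected subpolymer $y_k^j$, there is a vertical link connecting $\Delta$ to its ancestor and we have a fraction of the vertical decay $M^{-5\varepsilon'/4} \le g^d$.

3) For each tree line $C^{jk}$ connecting some $\Delta, \Delta' \in D_j$, we can write the vertical decay $M^{-\frac{\varepsilon}{2} k}$ for the corresponding $h$ and $\bar h$ ($k_h = k_{\bar h} = k$) as
\[ M^{-\frac{\varepsilon}{2} k}\, M^{-\frac{\varepsilon}{2} k} = \prod_{j'=j}^{j+k-1} M^{-\frac{\varepsilon}{2}}\, M^{-\frac{\varepsilon}{2}}. \tag{IV.202} \]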




Therefore for all $\Delta'' \in D_{j'}$ with $j \le j' \le j + k - 1$ such that $\Delta \subseteq \Delta''$ or $\Delta' \subseteq \Delta''$ we have a factor $M^{-\frac{\varepsilon}{2}} \le g^d$.

With these remarks we can now prove (IV.201) by induction. Actually we will prove that, if at the scale $j$ we have a factor $g^2$ for any $\Delta \in D_j \cap Y$, then we can rewrite these factors in such a way as to have a factor $g$ for any $\Delta \in D_j \cap Y$ and a factor $g^2$ for all $\Delta \in D_{j+1} \cap Y$.

Inductive hypothesis: At scale $j$ we have a factor $g^2$ for any $\Delta \in D_j \cap Y$. This is certainly true for the highest scale $m_Y$, because at this scale all cubes are extremal summit cubes, therefore by remark 1) they have a factor $g^2$.

Proof of the induction. Now we must prove that, given a factor $g^2$ for any $\Delta \in D_j \cap Y$, we have a factor $g$ for any $\Delta \in D_j \cap Y$ and a factor $g^2$ for any $\Delta \in D_{j+1} \cap Y$. We consider a connected component $y_k^{j+1}$. This is made from a set of generalized cubes connected by a tree. Let us consider one particular generalized cube $\tilde\Delta$, which is made of cubes of scale $j + 1$ connected by links of higher scales. Now we consider each cube in $\tilde\Delta$. For each such $\Delta$ we denote by $s_\Delta$ the set of cubes above, that is $s_\Delta = \{\Delta' \in D_j \cap Y \,|\, \Delta' \subset \Delta\}$. We distinguish three situations.

a) If $|s_\Delta| = 0$ then we are in the special case $Ex(\Delta) = \Delta$, therefore the extremal summit cube $\Delta$ has a factor $g^2$.

b) If $|s_\Delta| \ge 2$ then we have
\[ g^{2|s_\Delta|} = g^{|s_\Delta|}\, g^{|s_\Delta|} \le g^{|s_\Delta|}\, g^2 \tag{IV.203} \]
therefore we can keep a factor $g$ for each $\Delta' \in s_\Delta$ and we have a factor $g^2$ for $\Delta$.

c) The case $|s_\Delta| = 1$ is the most difficult one. We call $\Delta'$ the unique element of $s_\Delta$. Again we distinguish three cases:

• there is no tree line of any scale connecting $\Delta'$ to some other $\Delta'' \in D_j$ (see Fig. 10 a). Therefore $\Delta = \tilde\Delta$ and $\Delta'$ must be $\Delta^0_{root}$ for some connected component at scale $j$, therefore there is a vertical link connecting $\Delta'$ to $\Delta$, and, by remark 2), we have a factor $g^d$. Hence we can write
\[ g^2 g^d = g\, g^{d+1} \le g\, g^2 \tag{IV.204} \]
and we can keep a factor $g$ for $\Delta'$ and assign a factor $g^2$ to $\Delta$.

• there is at least one tree line $C^{j'k}$ at some scale $j' \le j$ connecting $\Delta'$ to some $\Delta'' \subset \Delta_1$ ($\Delta_1 \in \tilde\Delta$) and $\Delta_1$ is not nearest neighbor of $\Delta$ (see Fig. 10 b). Then $|t_\Delta - t_{\Delta_1}| \ge M^{j+1}$ (in the space directions they must always be nearest neighbors) and the propagator must have $j' + k \ge j + 1$. Therefore, as $j' \le j$, $k$ cannot be zero and by remark 3) we can associate to $\Delta'$ a factor $g^d$ in addition to $g^2$. Hence we can write
\[ g^2 g^d = g\, g^{d+1} \le g\, g^2 \tag{IV.205} \]
and we can keep a factor $g$ for $\Delta'$ and assign a factor $g^2$ to $\Delta$.

Figure 10: Three possible cases for $|s_\Delta| = 1$ (panels a, b, c).

• there is at least one tree line C j k at some scale j ≤ j connecting ∆ to some ˜ and ∆ is nearest neighbor of ∆ (see Fig.10 c). ∆ ⊂ ∆1 (∆1 ∈ ∆) In this case j + k ≥ j and no factor can be extracted from the k decay. Remark that, if ∆ = ∆0root for some connected component at scale j, then there is a vertical link and everything works as in the case of Fig 10 a). On the other hand, if ∆ = ∆0root , there is still a vertical link connecting ∆ to ∆ but it does not have any vertical decay associated. In this case we have to distinguish three possible situations: a’ ) there is no other tree line connecting ∆ or ∆ to some other cube in Dj . Therefore ∆ must be ∆0root for some connected component at scale j and the corresponding vertical link has a vertical decay associated. Hence we have a factor g in addition to g 2 for each cube nearest neighbor (nn) of ∆ , hence for each of them we can keep one factor g and give the remaining g 2 to its ancestor. b’ ) there is a tree line connecting ∆ to some cube which is not nn of ∆ . Then we have some k vertical decay from the tree propagator, and we can assign a factor g in addition to g 2 for each cube nn of ∆ . Therefore, as in a’, we have a factor g in addition to g 2 for each cube nn of ∆ , hence for each of them we can keep one factor g and give the remaining g 2 to its ancestor. c’ ) there is a tree line connecting ∆ or ∆ to some cube nn. Then we test case a’ and b’ again, and we go on until a’ or b’ (see Fig11 a,b) is satisfied, or until the chain of nn cubes at scale j arrives to a cube at

scale $j + 1$ that is not nn of $\Delta$.

Figure 11: Three possible situations when extracting a small factor $g$ (panels a, b, c; scales $j$ and $j+1$).

In this last case (see Fig. 11 c) we must have at least $M$ such cubes, therefore we can write
\[ (g^2)^M \le g^M (g^2)^d \tag{IV.206} \]
which means that we keep one factor $g$ for each cube at scale $j$ and we give a factor $g^2$ to each nn of $\Delta$ at scale $j + 1$. This is true if $M$ satisfies
\[ M \ge 2d. \tag{IV.207} \]
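The elementary inequalities used in the induction, (IV.203)-(IV.206), all follow from $0 < g < 1$ together with $M \ge 2d$. The sketch below checks them in exact rational arithmetic; it is illustrative only, the function name is hypothetical, and the sample values of $g$ and $M$ are arbitrary.

```python
from fractions import Fraction

D_NEIGHBOURS = 3 ** 4  # d = 81 nearest neighbours of a cube in four dimensions, itself included


def small_factor_steps_hold(g: Fraction, M: int, d: int = D_NEIGHBOURS) -> bool:
    """Check (IV.203), (IV.204)/(IV.205) and (IV.206) exactly for 0 < g < 1 and M >= 2d."""
    if not (0 < g < 1) or M < 2 * d:
        raise ValueError("need 0 < g < 1 and M >= 2d")
    # (IV.203): g^{2|s|} <= g^{|s|} g^2 whenever |s| >= 2
    step_203 = all(g ** (2 * s) <= g ** s * g ** 2 for s in range(2, 12))
    # (IV.204), (IV.205): g^2 g^d = g g^{d+1} <= g g^2
    step_204 = g ** 2 * g ** d == g * g ** (d + 1) <= g * g ** 2
    # (IV.206): (g^2)^M <= g^M (g^2)^d, which is equivalent to M >= 2d
    step_206 = (g ** 2) ** M <= g ** M * (g ** 2) ** d
    return step_203 and step_204 and step_206
```

With $M = 2d$ the inequality (IV.206) becomes an equality, which is why (IV.207) is the threshold.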

IV.6 Bounding the tree choice

Construction of $T_{jk}$. Before summing over the trees we must see how the tree is built. In the connected component $y_k^j$, for each $\tilde\Delta \neq \tilde\Delta_{root}$ we have one $h \in R_{root}$ and $d_{\tilde\Delta}$ fields in $l_b(\tilde\Delta)$ (defined in sec. III.3). For $\tilde\Delta_{root}$ we have no $h \in R_{root}$ but we still have $d_{\tilde\Delta}$ fields in $l_b(\tilde\Delta)$. Each $h \in l_b(\tilde\Delta)$ can contract only with a $h' \in R_{root}$ in some $\tilde\Delta' \neq \tilde\Delta$. As there is only one field $h' \in b(\tilde\Delta')$ we only have to choose $\tilde\Delta'$. This last sum is performed using the decay of the tree line as we will prove below. Therefore for each $h \in l_b(\tilde\Delta)$ we have to perform the following sum
\[ \sum_{\tilde\Delta'\in y_k^j,\ \tilde\Delta'\neq\tilde\Delta} \int_{\Omega_{h'}} dx_{h'}\, \big| C^{jk_h}(x_h, x_{h'}) \big| = \sum_{\Delta_{root}\neq\Delta_{root}(h),\,\Delta^0_{root}} \int_{\Omega_{h'}} dx_{h'}\, \big| C^{jk_h}(x_h, x_{h'}) \big| \tag{IV.208} \]
where $h'$ is the unique field in $R_{root}$ hooked to $\tilde\Delta'$, $\Omega_{h'}$ is the localization volume $\Omega_{v_{h'}}$ of the vertex to which $h'$ is hooked, $\Delta_{root}$ is the corresponding cube in $\tilde\Delta'$ and $\Delta_{h'} \subseteq \Delta_{root}$ is the localization cube for $h'$. Finally we denoted by $\Delta_{root}(h)$ the cube $\Delta_{root}$ for the generalized cube $\tilde\Delta$ where $h$ is hooked (this contraction is not possible as it would generate a loop). Remark that the condition $\Delta_{root} \neq \Delta^0_{root}$ holds because this last cube does not contain any $h \in R_{root}$. The sum over the tree $T_{jk}$ is then bounded by
\[ \prod_{\substack{h\in b\setminus R_{root}\\ j^b_h = j \text{ and } \Delta_h\subseteq y_k^j}}\ \sum_{\Delta_{root}\neq\Delta_{root}(h),\,\Delta^0_{root}} \int_{\Omega_{h'}} dx_{h'}\, \big| C^{jk_h}(x_h, x_{h'}) \big|. \tag{IV.209} \]

Figure 12: Three types of oriented links (panels a, b, c).

Sum over the cube positions and $T_{jk}$. Now, for fixed $T_{jk}$, we have a multiscale tree structure. We want to sum over the cube positions following this tree from the leaves towards the root (which is the cube $\Delta^0_{root}$ at scale $M_Y$, which contains $x = 0$). For this purpose we give a direction (represented by an arrow) to all links (vertical and horizontal).

• For any vertical link connecting some $\Delta^0_{root}$ to its ancestor we draw an arrow going from $\Delta^0_{root}$ down to its ancestor and we call it a down-link (see Fig. 12a).
• For all other vertical links connecting some $\Delta$ to its ancestor we draw an arrow going from its ancestor up to $\Delta$ and we call it an up-link (see Fig. 12b).
• For each horizontal link, that is made by the contraction of a field (antifield) in $R_{root}$ with an antifield (field) in $b\setminus R_{root}$, we draw an arrow going from the field (antifield) in $R_{root}$ towards the antifield (field) in $b\setminus R_{root}$ (see Fig. 12c).

Now we can perform the sums following the tree. We have three situations.

• If we have a down-link we have to sum over the choices for $\Delta$, for $\Delta' = A(\Delta)$ fixed. Remark that for each down-link we have the vertical decay $M^{-5(1-\varepsilon')}$. From this we first extract a fraction $M^{-5\varepsilon'}$ that will be used for the last sum. With the remaining $M^{-5(1-2\varepsilon')}$, assuming $\varepsilon' \le 1/10$, we can write
\[ \sum_{\Delta\in D_j,\ \Delta\subset\Delta',\ i_{\Delta'}=j+1} M^{-5(1-2\varepsilon')} = \frac{|\Delta'|}{|\Delta|}\, M^{-5(1-2\varepsilon')} = M^4 M^{-5(1-2\varepsilon')} \le 1. \tag{IV.210} \]

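• If we have an up-link we have to sum over the choices for $\Delta' = A(\Delta)$ for $\Delta$ fixed. As there is only one $\Delta'$ such that $\Delta' = A(\Delta)$ there is no sum at all.
• If we have a horizontal link the argument is more subtle and we explain it below.

Sum over horizontal links. For some $h \in R_{root}$ we want to prove that
\[ \sum_{x_\Delta} M^{-(i_h - i_{\Delta_h})}\, M^{-4\frac{\varepsilon'}{2}(i_h - i_{\Delta_h})} \int_{\Omega_h} dx_h\, \big| D^{jk_h}(x_h, x_{h'}) \big| \le C\, M^{11/3} M^{4 i_{\Delta_h}} \tag{IV.211} \]
where $\Delta$ is the unique cube at scale $i_h = j^b_h = j$ with $\Delta_h \subseteq \Delta$ (see Fig. 12c). From now on we write $j$ instead of $i_h$. We recall that we defined for $k > 0$ (see (IV.184))
\[ \big| D^{j,k_h}(x_h, x_{h'}) \big| = \big| C^{j,k_h}(x_h, x_{h'}) \big|\, M^{2j} M^{2\varepsilon k_h} \le C\, M^{8/3} M^{2\varepsilon k_h} M^{-2k_h/3}\, \chi\big( |\vec x_h - \vec x_{h'}| \le M^{j-k_h/3+1/3},\ |t_h - t_{h'}| \le M^{j+k} \big) \tag{IV.212} \]
and for $k = 0$
\[ \big| D^{j,0}(x_h, x_{h'}) \big| = \big| C^{j,0}(x_h, x_{h'}) \big|\, M^{2j} M^{2\varepsilon k_h} \le C\, M^{8/3}\, \chi\big( |\vec x_h - \vec x_{h'}| \le M^j,\ |t_h - t_{h'}| \le M^j \big). \tag{IV.213} \]
The case $k = 0$ is simple as
\[ \int_{\Omega_h} dx_h\, \chi\big( |\vec x_h - \vec x_{h'}| \le M^j,\ |t_h - t_{h'}| \le M^j \big) \le M^{4 i_{\Delta_h}}\, \chi\big( |\vec x_\Delta - \vec x_{\Delta'}| \le M^j,\ |t_\Delta - t_{\Delta'}| \le M^j \big) \tag{IV.214} \]
and
\[ \sum_{x_\Delta} \chi\big( |\vec x_\Delta - \vec x_{\Delta'}| \le M^j,\ |t_\Delta - t_{\Delta'}| \le M^j \big) \le d \tag{IV.215} \]
where $d$ is the number of nearest neighbors. Therefore the left-hand side of (IV.211) is bounded by
\[ C\, M^{8/3} M^{4 i_{\Delta_h}} M^{-(j - i_{\Delta_h})} M^{-2\varepsilon'(j - i_{\Delta_h})} \tag{IV.216} \]
where the decay $M^{-(1+2\varepsilon')(j - i_{\Delta_h})}$ is just bounded by one.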


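The case $k > 0$ is more difficult. Now the integral is given by
\[ \int_{\Omega_h} dx_h\, \chi\big( |\vec x_h - \vec x_{h'}| \le M^{j-k_h/3+1/3},\ |t_h - t_{h'}| \le M^{j+k} \big) \le M^{i_{\Delta_h}} \big( \min[ M^{i_{\Delta_h}}, M^{j-k_h/3+1/3} ] \big)^3\, \chi\big( |\vec x_\Delta - \vec x_{\Delta'}| \le M^j,\ |t_\Delta - t_{\Delta'}| \le M^{j+k} \big) \tag{IV.217} \]
and the sum over $x_\Delta$ gives
\[ \sum_{x_\Delta} \chi\big( |\vec x_\Delta - \vec x_{\Delta'}| \le M^j,\ |t_\Delta - t_{\Delta'}| \le M^{j+k} \big) \le d\, 2M^k. \tag{IV.218} \]
Now we have to distinguish two cases.

1. If we have
\[ i_{\Delta_h} < j - \frac{k_h}{3} + \frac13 \tag{IV.219} \]
(IV.211) is bounded by
\[ C\, M^{8/3} M^{4 i_{\Delta_h}} M^{k_h} M^{-(1+2\varepsilon')(j - i_{\Delta_h})} M^{-k_h(2/3 - 2\varepsilon)}. \tag{IV.220} \]
By (IV.219) we have
\[ M^{-(1+2\varepsilon')(j - i_{\Delta_h})} \le M^{-(1+2\varepsilon') k_h/3}\, M^{(1+2\varepsilon')/3}. \tag{IV.221} \]
Inserting this bound in the equation above we obtain
\[ C\, M^{8/3} M^{(1+2\varepsilon')/3} M^{4 i_{\Delta_h}} M^{k_h(2\varepsilon - 2\varepsilon')} \le C\, M^{11/3} M^{4 i_{\Delta_h}} \tag{IV.222} \]
for $\varepsilon < \varepsilon'$.

2. On the other hand, if we have
\[ i_{\Delta_h} \ge j - \frac{k_h}{3} + \frac13 \ \Rightarrow\ k_h \ge 3 (j - i_{\Delta_h}) + 1 \tag{IV.223} \]
(IV.211) is bounded by
\[ C\, M^{8/3} M^{i_{\Delta_h}} M^{3(j - k_h/3 + 1/3)} M^{k_h} M^{-(1+2\varepsilon')(j - i_{\Delta_h})} M^{-k_h(2/3 - 2\varepsilon)}. \tag{IV.224} \]
Now we can write
\[ M^{i_{\Delta_h}} M^{3(j - k_h/3 + 1/3)} M^{k_h} = M\, M^{-k_h} M^{k_h} M^{4 i_{\Delta_h}} M^{3(j - i_{\Delta_h})} \tag{IV.225} \]
and (IV.211) is bounded by
\[ C\, M^{1+8/3} M^{4 i_{\Delta_h}} M^{(2-2\varepsilon')(j - i_{\Delta_h})} M^{-k_h(2/3 - 2\varepsilon)} \le C\, M^{11/3} M^{4 i_{\Delta_h}} \tag{IV.226} \]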


if we can prove that kh ≥ This is true by (IV.223) if

hence for ε
iv

p=0

iv <j1 <j2 ...<jp 1 we can extract a factor $|\lambda|$. Otherwise, if $|V_d| = 1$, we have a polymer reduced to one or two cubes, therefore there are no logarithms. We can extract the complete coupling constant for the unique vertex. Remark that in this case we have not extracted a small factor $g$ for the cube, but only a factor $K$. Nevertheless this is only one term of the sum (only the polymers with $|Y| = 1$).

10 This lemma is a particular variation on well known combinatoric identities [BF2], [DR2, Appendix B1].

11 In fact to perform a Mayer expansion, we need only to control $\sum_{Y,\ 0\in Y}$ with $M_Y$ fixed in our main result (III.104). However we prove the slightly stronger result (III.106) for simplicity, since it is also true.




• $|V_d| \le 16$ and $|Y| > |V_d|$. In this case we must have at least $|Y| - |V_d|$ vertical links of type f, therefore there must be at least 2 vertices with some derived fields hooked: $|V_d| \ge 2$. Let us say that the lowest f-link is at scale $j$. At lower scale there can be only v-links, therefore there are at most 16 scales. As $M_Y - j \le 16$, the set of attributions for the six fields derived to give the f-link has at most size $M_Y - j \le 16$, therefore these links do not give any logarithm, and we have a factor $|\lambda|^{6/4} < |\lambda|$.

IV.7.3 Remaining sums

Now the remaining sum is |Ac (Y )|L|Y | ≤ |λ| (gLC)|Y | MY

Y 0∈Y

S

V L BS

n 1 |Vd ∪Vb | ¯ CM 13 |λ| 272 K |Vd \Vb | n! n≥1 Vd ,αVd a,b,R {vl }l∈vL  M −1 cj Y n∆ !  M −5ε  nVd σVd ρVd {n∆ }∆∈B ∆cV¯ S

d

(IV.243)

j=mY k=1

∆∈BS

where all constants have been inserted ∆∈Y nd (∆)! coming in C and the factor 1 from (IV.231) is compensated by ∆∈Y nd (∆)! coming from Lemma IV.7.1a. Sum over {n∆ } and ∆cV . These sums are bounded as follows.

[n∆ !] ≤

{n∆ }∆∈BS ∆cV¯ ∆∈BS d

|V¯d |! n∆ ! ≤ |V¯d |! 2|Y |+n n ! ∆∈BS ∆

{n∆ }∆∈BS

∆∈BS

(IV.244) where we applied

as

1 ≤ 2|Y |+n

(IV.245)

{n∆ }∆∈BS

∆∈BS

n∆ = |V¯d | ≤ n.

Sum over $\{v_l\}_{l\in vL}$. This sum actually consumes a fraction of the global factorial, namely

$$\frac{1}{n!}\sum_{\{v_l\}_{l\in vL}} 1 \;\le\; \frac{1}{n!}\,\big[\,n\,(n-1)\,(n-2)\cdots(n-|V_d|+1)\,\big] \;=\; \frac{1}{|\bar V_d|!} \qquad \text{(IV.246)}$$

where we applied $n-|V_d| = |\bar V_d|$.

Vol. 2, 2001



Sum over $\sigma_{V_d}$, $n_{V_d}$, $a$, $b$, $R$, $V_d$, $\rho_{V_d}$ and $\alpha_{V_d}$. The sum over $\sigma_{V_d}$ costs at most a factor $4!$ per vertex, the sum over $n_{V_d}$ at most a factor 4 per vertex, the sums over $a$, $b$ and $R$ a factor 2 per field, the sum over $V_d$ a factor 2 per vertex, the sum over $\rho_{V_d}$ a factor 2 per field and finally the sum over $\alpha_{V_d}$ a factor 4 per vertex. Therefore

$$\sum_{V_d,\,\alpha_{V_d}}\ \sum_{a,b,R}\ \sum_{n_{V_d}}\ \sum_{\sigma_{V_d}} 1 \;\le\; C^{n}. \qquad \text{(IV.247)}$$

The remaining bound is now |Ac (Y )|L|Y | ≤ |λ| (gLC)|Y | MY

Y 0∈Y

CM

13 n

|λ|

|Vd ∪Vb | 272

S

(IV.248)

V L BS



cj M Y −1

K |Vd \Vb |  ¯



M −5(1−2ε ) 

j=mY k=1

n≥1

where all constants have been inserted in C and the factorial |V¯d |! in (IV.244) has been canceled by the factor |V¯1d |! in (IV.246). Now

CM 13

n

|λ|

n≥1

|Vd ∪Vb | 272

K |Vd \Vb | = ¯

(IV.249)

|Vd ∪Vb | CM 13 |λ|1/272

|Vd ∪Vb |≥1

|V¯d \Vb | CM 13 K ≤C

¯d \Vb |≥0 |V

for λ and K small enough, depending on M . The choice of BS costs a factor 2 per cube so finally we have to bound   cj M Y −1 (gLC)|Y |  M −5ε  (IV.250) |λ| MY

S

j=mY k=1

VL

Sum over S and V L These sums are performed together. For this purpose we reorganize the sum as follows: S

VL

d 0

d0 ≥0

i=1



cj M Y −1

(gLC)|Y |   



M −5ε  ≤

j=mY k=1

p1i ≥1

(8gLC)

0

(8gLC)p

(IV.251)

p≥1

1

p1i

M −5ε

di d1i ≥0 i =1

 

p2i ≥1

p2i

(8gLC)

M −5ε

 · · · 

d2i ≥0

where p0 is the number of cubes in the connected subpolymer at the layer l = 0 (corresponding to the scale MY ), d0 the number of connected components at the




scale $M_Y-1$ (circles in the rooted tree) connected to the root, $p^1_i$ the number of cubes for the connected subpolymer $i$, and so on. The factor 8 includes a factor 2 to decide, for each vertical link, whether it is a v-link or an f-link, a factor 2 to decide for any cube of the connected subpolymer whether it is going to give a dot or not in $S$ at the next layer (see Fig. 7), and finally a factor $2^p$ to decide the remaining positive numbers $VL$ for the circle links of $S$ (since they are strictly positive and their sum is $p$). The products stop at $p^{M_Y}$ as this is the maximal number of layers. We remark that for the root we do not have any vertical link, hence no vertical decay $M^{-5\varepsilon'}$. We start computing this formula from the leaves, which correspond to $d=0$. Assuming $gLC\le 1/16$ and $M^{-5\varepsilon'/2}\le 1/2$ we have

$$\sum_{p\ge 1}(8gLC)^{p}\,M^{-5\varepsilon'} \;\le\; \frac12\,M^{-5\varepsilon'/2}, \qquad \text{(IV.252)}$$

since $\sum_{p\ge1}(8gLC)^p\le 1$ and $M^{-5\varepsilon'} = \big(M^{-5\varepsilon'/2}\big)^2$. Now we can perform the sum over $d$ at the previous layer

$$\sum_{d\ge 0}\big[M^{-5\varepsilon'/2}\big]^{d} \;\le\; 2 \qquad \text{(IV.253)}$$

and at each layer we compensate the factor 2 by the new factor $M^{-5\varepsilon'/2}\le 1/2$. Therefore we can sum over all layers until the root, and the result is bounded by 2 because the last layer has no $M^{-5\varepsilon'}$ factor.

Sum over $M_Y$. This sum is finally bounded as announced by our spared factor $\lambda$:

$$\sum_{Y,\ 0\in Y}|A^{c}(Y)|\,L^{|Y|} \;\le\; |\lambda|\sum_{M_Y} 2 \;\le\; 2\,|\ln T|\,|\lambda| \;\le\; 2K \;\le\; 1 \qquad \text{(IV.254)}$$

for $|\lambda\ln T|\le K$. This ends the proof of the theorem. To summarize our conditions, for a given $L$ we compute first the constant $C$, we choose $M$ large enough (and $\lambda$ small enough) so that $gLC\le 1/16$ and $M^{-5\varepsilon'/2}\le 1/2$, and we restrict $\lambda$ again so that $CM^{13}|\lambda|^{1/272}\le 1/2$. These restrictions on $\lambda$ are therefore enforced solely by taking $K$ small enough depending on $L$, which is our theorem.
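The two geometric sums behind (IV.252) and (IV.253) can be checked numerically. The following is a minimal sketch, not part of the original argument: the sample values are hypothetical worst cases allowed by the stated assumptions ($gLC = 1/16$, $M^{-5\varepsilon'/2} = 1/2$), and the infinite sums are truncated after 200 terms.

```python
def leaf_sum(glc, m_decay, terms=200):
    """Truncated sum over p >= 1 of (8 gLC)^p * M^{-5 eps'} (left side of IV.252)."""
    ratio = 8.0 * glc
    return m_decay * sum(ratio ** p for p in range(1, terms + 1))


def layer_sum(half_decay, terms=200):
    """Truncated sum over d >= 0 of (M^{-5 eps'/2})^d (left side of IV.253)."""
    return sum(half_decay ** d for d in range(terms))


# Hypothetical worst-case values permitted by the assumptions in the text.
GLC = 1.0 / 16.0            # gLC <= 1/16
HALF_DECAY = 0.5            # M^{-5 eps'/2} <= 1/2
M_DECAY = HALF_DECAY ** 2   # M^{-5 eps'} is the square of the half decay

leaf = leaf_sum(GLC, M_DECAY)
layer = layer_sum(HALF_DECAY)
print("leaf sum :", leaf, " bound:", 0.5 * HALF_DECAY)
print("layer sum:", layer, " bound:", 2.0)
```

At the worst-case values both bounds are saturated, which is why the conditions on $gLC$ and $M$ cannot be relaxed in this form of the argument.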

Appendix A In section II.5 we have introduced band decoupling on the position space, and defined, for each band j the characteristic function Ωj . Let us introduce the following generalization of (II.32): Ωj = { ( x, t) | M j−1 ≤ (1 + | x|) 2 +α (1 + f (t) + | x|) 2 −α < M j 1 1 = { ( x, t) | M jM ≤ (1 + | x|) 2 +α (1 + f (t) + | x|) 2 −α 1

1

} j ≤ jM } j = jM (A.1)


To select the optimal value for α we must insert auxiliary scales as in section II.5 and estimate the scaled decay of the propagator C jk , as a function of α. We insert auxiliary scale decomposition as in (II.37). Spatial constraints The constraints on spatial positions now are: • if j ≤ jM and k > 0 there is a non zero contribution only for M j M −k( 1+2α ) M − 1+2α 2− 1+2α ≤ (1+| x|) ≤ M j M −k( 1+2α ) M 1+2α (A.2) 1−2α

2

1−2α

1−2α

1−2α

• for j ≤ jM and k = 0 there is a non zero contribution only for M j M − 1+2α 2− 1+2α ≤ (1 + | x|) ≤ M j 2

1−2α

(A.3)

• for j = jM + 1 there is a non zero contribution only for M jM 2− 1+2α ≤ (1 + | x|) 1−2α

(A.4)

Scaled decay of the propagator Now for each j and k we can estimate the scaled decay of the propagator C j,k . We distinguish three cases: • for j ≤ jM and k > 0 we have j,k 1−2α 4α ) M 3+2α C ( x, t) ≤ M −2j M −k( 1+2α 1+2α 2 1+2α χ x, f (t)) j,k (

(A.5)

where the function χj,k is defined by χj,k ( x, t) = 1 = 0

if | x| ≤ M j M −k( 1+2α ) M 1+2α , f (t) ≤ M j+k otherwise (A.6) 1−2α

1−2α

• for j ≤ jM and k = 0 we have j,0 1−2α 4 C ( x, t) ≤ M −2j M 1+2α 22( 1+2α ) χj,0 ( x, f (t))

(A.7)

where the function χj,0 is defined by χj,0 ( x, t) = 1 = 0

if | x| ≤ M j , f (t) ≤ M j otherwise

• for j = jM + 1 we have j +10 1−2α C M ( x, t) ≤ M −2jM 22( 1+2α ) χjM +1,0 (f (t))

(A.8)

Kp p (A.9) (1 + M −jM | x|)

where the function χjM +1,0 is defined by χjM +1,0 (t) = 1 = 0

if f (t) ≤ M jM otherwise

(A.10)

and the spatial decay for | x| comes from the decay of the function F in (II.7).




Integration volume The region of spatial integration (for a scale propagator) is now fixed by the χj,k domain. Therefore • for j ≤ jM and k > 0 we have Vj,k = | x|3 f (t) ≤ M 4j M −k( 1+2α ) M 3( 1+2α ) 2−8α

1−2α

(A.11)

• for j ≤ jM and k = 0 we have Vj,k = | x|3 f (t) ≤ M 4j

(A.12)

Vj,k = | x|3 f (t) ≤ M 4jM .

(A.13)

• for j = jM + 1 we have

As we have seen, the tree propagator is used in two cases, namely to bound the sum over cubes in the Hadamard bound (see (IV.141)) and to perform the sum over trees. In the Hadamard bound we must have Fjk =: |C jk |2 M 4j M k ≤ K M −εk

(A.14)

for some constants K, ε > 0 (K is actually proportional to some constant power of M ). The decay M −εk is necessary to sum over k. Inserting the α depending bounds for C jk we have, for k > 0 Fjk ≤ M −4j M −k( 1+2α ) M 2 1+2α 22 1+2α M 4j M k = M k[1−( 1+2α )] M 2 1+2α 22 1+2α (A.15) and (A.14) is true for

8α 1 1− . (A.16) 1 + 2α 6 8α

3+2α

1−2α

8α

3+2α

1−2α

On the other hand when summing over the tree structure we must ensure that

Fjk =: |C jk | Vjk ≤ K M 2j M −εk

(A.17)

for some constants K, ε > 0 (K is actually proportional to some constant power of M ). Again the decay M −εk is necessary to sum over k. Inserting the values for |C jk | and Vjk we have Fjk

M −2j M −k( 1+2α ) M 1+2α 2 1+2α M 4j M −k( 1+2α ) M 3( 1+2α ) 2−4α 3−2α 1−2α (A.18) ≤ M 2j M −k( 1+2α ) M 2( 1+2α ) 2 1+2α

≤

4α

3+2α

1−2α

2−8α

1−2α

and (A.17) is true for 2 − 8α > 0 ⇒ α
0 for $\lambda_0<\mu<\lambda_1$ and $\Gamma_n$ ($n\ge 1$) is a circuit around the interval $(\lambda_{2n-1},\lambda_{2n})$ with counterclockwise orientation. Flaschka and McLaughlin have obtained formula (1.2) by applying a well known procedure due to Arnold in the case of finite dimensional integrable systems: they defined the action variable $I_n$ by $I_n := \frac{1}{2\pi}\int_{c_n}\alpha$ where $\alpha$ is a 1-form satisfying $\omega = d\alpha$ and $(c_n)_n$ is an (appropriately chosen) basis of cycles of an invariant torus. Expressing $\frac{1}{2\pi}\int_{c_n}\alpha$ in conveniently chosen canonical coordinates they obtain the integral in (1.2). Denote by $(\gamma_n)_{n\ge1}$ the sequence of gap lengths, $\gamma_n := \lambda_{2n}-\lambda_{2n-1}$.

Proposition 1 Let $q_0\in L^2_0$. Then there exist a neighborhood $U_{q_0}$ of $q_0$ in $L^2_{0,\mathbb C}$ and a constant $C\ge1$ so that, for any $n\ge1$, $I_n$ is analytic on $U_{q_0}$ and

$$2I_n = \frac{1}{n\pi}\Big(\frac{\gamma_n}{2}\Big)^{2}(1+r_n)$$

where the error $r_n$ is analytic on $U_{q_0}$, satisfies $\frac1C\le|1+r_n|\le C$ and $\frac1C\le \mathrm{Re}(1+r_n)\le C$ as well as the asymptotic estimate $r_n = O\big(\frac{\log n}{n}\big)$. As a consequence,

$$\xi_n(q) := \left(\frac{2I_n}{(\gamma_n/2)^2}\right)^{1/2} \qquad (1.3)$$

is analytic and does not vanish on $U_{q_0}$ (with $z^{1/2}$ denoting the branch of the square root which equals 1 at $z=1$) and satisfies the asymptotic estimate ($q\in U_{q_0}$)

$$\Big|\xi_n-\frac{1}{\sqrt{n\pi}}\Big|\le C\,\frac{\log n}{n}$$

where $C\ge1$ is independent of $q$.


On Birkhoff Coordinates for KdV


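Proof. (in [BBGK], section 2) Integrating (1.2) by parts, the $L^2$-gradient $\frac{\partial I_n}{\partial q(x)}$ can be computed:

$$\frac{\partial I_n}{\partial q(x)} = -\frac1\pi\int_{\Gamma_n}\frac{\partial\Delta(\mu)/\partial q(x)}{\sqrt{\Delta^2(\mu)-4}}\,d\mu.$$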

2 Angle variables

To define the angle variables, introduce the holomorphic differentials investigated in [BKM2] (cf also [MT2]).

Proposition 2 There exists an open neighborhood $U = U_{L^2_0}$ in $L^2_{0,\mathbb C}$ so that for any $q$ in $U$, one can find a sequence of entire functions $\psi_j(\lambda)\equiv\psi_j(\lambda,q)$ ($j\ge1$) satisfying

$$\frac{1}{2\pi}\int_{\Gamma_n}\frac{\psi_j(\lambda,q)}{\sqrt{\Delta(\lambda,q)^2-4}}\,d\lambda = \delta_{j,n}. \qquad (2.1)$$

The functions $\psi_j$ depend analytically on $\lambda$ and $q$ and admit a product representation

$$\psi_j(\lambda) = \frac{c_j}{j^2\pi^2}\prod_{k\ne j}\frac{\mu^{(j)}_k-\lambda}{k^2\pi^2} \qquad (2.2)$$

with $\mu^{(j)}_k = \mu^{(j)}_k(q)$ and $c_j = c_j(q)$ depending analytically on $q\in U$ and satisfying

$$|\mu^{(j)}_k-\tau_k|\le C\,\frac1k\,|\gamma_k|^2\quad(k\ne j);\qquad \tau_k = \frac12(\lambda_{2k-1}+\lambda_{2k}) \qquad (2.3)$$

$$|c_j-2\pi j|\le C\,\frac1j \qquad (2.4)$$

where $C>0$ can be chosen locally uniformly with respect to $q$ and independently of $j\ge1$.

Proof. cf Theorem A.5 (in Appendix A.2), Lemma 3.2, and Lemma 3.3 in [BKM2].

It is convenient to introduce the following

Definition An open set $U$ in $L^2_{0,\mathbb C}$ is said to be a G-neighborhood if $U$ satisfies the properties stated in Proposition 2.

In the sequel, let $U_{q_0}$ always denote a bounded G-neighborhood of $q_0\in L^2_0$. To define the angle variables, introduce the hyperelliptic surface $\Sigma_q$, $y=\sqrt{\Delta^2(\lambda)-4}$, associated with spec($q$).


T. Kappeler and M. Makarov


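For $q$ in $U_{q_0}\setminus D_n$ with $D_n := \{q \mid \lambda_{2n}=\lambda_{2n-1}\}$ the angle variable $\theta_n(q)$ is defined - formally - to be the $n$'th component of the Abel map associated to $\Sigma_q$, evaluated at $(\mu^*_k)_{k\ge1}$ with $\mu^*_k := \big(\mu_k,\sqrt{\Delta^2(\mu_k)-4}\big)\in\Sigma_q$. Here $\mu_k=\mu_k(q)$ ($k\ge1$) denote the Dirichlet eigenvalues of the operator $-\frac{d^2}{dx^2}+q$ considered on $[0,1]$. More precisely, we define for $q$ in $U_{q_0}\setminus D_n$,

$$\theta_n(q) := \sum_{k\ge1}\int_{\lambda_{2k}(q)}^{\mu^*_k(q)}\frac{\psi_n(\lambda,q)}{\sqrt{\Delta^2(\lambda,q)-4}}\,d\lambda \qquad (2.5)$$

where for each $k\ge1$ the path in the integral

$$\eta_{n,k}(q) := \int_{\lambda_{2k}(q)}^{\mu^*_k(q)}\frac{\psi_n(\lambda,q)}{\sqrt{\Delta^2(\lambda,q)-4}}\,d\lambda \qquad (2.6)$$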

is near λ2k , but otherwise arbitrary. Formula (2.5)for the variables (θn )n conjugate to the actions can be obtained - at least formally - by taking the derivative of α = n In dθn with respect to In , q ∂α ∂α ∂In = dθn and integrating on an invariant torus with In = 0, θn = q0 ∂In where q0 is a base point of the invariant torus under consideration. By then expressing ∂α ∂In in conveniently chosen canonical coordinates one obtains formula (2.5) under the assumption that α coincides with the 1-form introduced in [FM]. In the remainder of this section we show that the ηn,k are well defined analytic functions on Uq0 \Dn , multivalued in the case k = n, and that they satisfy estimates to make the infinite sum in (2.5) convergent and θn (q) analytic. Lemma 3 (i) For k = n, ηn,k is a well defined function defined on Uq0 . In particular, the integral in (2.6) is independent of the path chosen (as long as the latter stays near λ2k ). (ii) ηn,n is well defined as a multivalued function on Uq0 \Dn with values differing by multiples of 2π. Proof. (i) First notice that ηn,k is well defined for q with γk (q) = 0. In such a case (n) µk = λ2k . Therefore ψn (λ) and ∆2 (λ) − 4 both contain the factor (λ2k − λ) and √ ψn2 (λ) is analytic near λ2k . Thus by Cauchy’s theorem, ηn,k is well defined ∆ (λ)−4

in this case. The independence of ηn,k of the path of integration in the case γk = 0 follows from the normalization (2.1) λ2k−1 ψn (λ)dλ (2.7) = πδn,k mod 2π. ∆2 (λ, q) − 4 λ2k


(ii) First we notice that as $\gamma_n(q)\ne0$, the integral in (2.6) is well defined. Due to the normalization condition (2.7), we have

$$\int_{\lambda_{2n}}^{\lambda_{2n-1}}\frac{\psi_n(\lambda)\,d\lambda}{\sqrt{\Delta^2(\lambda,q)-4}} = \pi \mod 2\pi. \qquad (2.8)$$

By Cauchy's theorem, $\eta_{n,n}$ is thus well defined mod $2\pi$.

To prove the boundedness result below, it is convenient to consider the model for $\Sigma_q$ obtained by glueing two copies of the complex plane, slit open along $(-\infty,\lambda_0)$, $(\lambda_{2n-1},\lambda_{2n})$ ($n\ge1$). These copies are referred to as the sheets of $\Sigma_q$.

Lemma 4 Let $U_{q_0}$ be a bounded G-neighborhood of $q_0\in L^2_0$. Then there exists $C>0$ so that for any $n\ge1$ the following holds:

(i) for all $k\ne n$ and $q\in U_{q_0}$,
$$|\eta_{n,k}(q)|\le\frac{Cn}{|k^2-n^2|}\,\frac1k\,\big(|\mu_k-\tau_k|+|\gamma_k|\big);$$

(ii) for $q\in U_{q_0}\setminus D_n$,
$$|\eta_{n,n}(q)\ \mathrm{mod}\ 2\pi|\le C\log\Big(2+\Big|\frac{\mu_n-\tau_n}{\gamma_n}\Big|\Big);$$

(iii) for all $q\in U_{q_0}$,
$$\sum_{k\ne n}|\eta_{n,k}(q)|\le\frac Cn\left(\Big(\sum_{k\ge1}|\mu_k-\tau_k|^2\Big)^{1/2}+\Big(\sum_{k\ge1}|\gamma_k|^2\Big)^{1/2}\right).$$

Proof. is provided in Appendix A.

To prove regularity properties of $\eta_{n,k}$, introduce

$$S_k := \{q\in U_{q_0}\mid\gamma_k(q)=0\},\qquad W_k := \{q\in U_{q_0}\mid\mu_k\in\{\lambda_{2k-1},\lambda_{2k}\}\}.$$

Notice that $S_k$ and $W_k$ are analytic subvarieties as $S_k=\{q\in U_{q_0}\mid\Delta(\dot\lambda_k)=(-1)^k2,\ \dot\Delta(\dot\lambda_k)=0\}$ (where $\dot\lambda_k$ is the root of $\dot\Delta(\lambda)=0$ near $\lambda_{2k}$) and $W_k=\{q\in U_{q_0}\mid y_1(1,\mu_k)=(-1)^k\}\equiv\{q\in U_{q_0}\mid y_1(1,\mu_k)-y_2'(1,\mu_k)=0\}$ where for the characterization of $W_k$ we used that the Wronskian identity $[y_1(x,\lambda),y_2(x,\lambda)]=1$, evaluated at $(x,\lambda)=(1,\mu_k)$, is given by $y_1(1,\mu_k)\,y_2'(1,\mu_k)=1$.

Lemma 5 Let $U_{q_0}$ be a G-neighborhood of $q_0\in L^2_0$. Then:




(i) for k = n, ηn,k is analytic on Uq0 ; (ii) ηn,n is an analytic, multivalued function on Uq0 \ Dn whose values can be identified modulo π; (iii) when restricted to real potentials, ηn,n is a continuous, multivalued function whose values can be identified modulo 2π. Proof. (i) Notice that for q ∈ Uq0 \ Sk and a small q-neighborhood V ⊆ Uq0 \ Sk , − + − there exist analytic functions λ+ k , λk on V with {λk , λk } = {λ2k , λ2k−1 }. In view µ∗k (q) ψn (λ,q) dλ. From this deduce that ηn,k is analytic on of (2.7) ηn,k (q) := λ+ (q) √ 2 k

∆ (λ,q)−4

V \ (Sk ∪ Wk ) and as a consequence, analytic on Uq0 \ (Sk ∪ Wk ). It remains to prove the analyticity of ηn,k for q ∈ Sk ∪Wk . By [[PT], Appendix A] this amounts to prove that ηn,k is locally bounded and weakly analytic. By Lemma 4, ηn,k is bounded on Uq0 . For ηn,k to be weakly analytic it is to show that for any given q ∈ Sk ∪ Wk and any p ∈ L20,C , ηn,k (q + zp) is analytic for z ∈ C near z = 0. Introduce D := {q + zp | z ∈ C, |z| < 3} and chose 3 sufficiently small so that D ⊆ Uq0 . Due to the fact that Sk and Wk are analytic submanifolds of Uq0 it follows that, for 3 sufficiently small, the following two cases occur: case 1S :

Sk ∩ D ⊆ {q};

case 2S :

Sk ∩ D = D

Wk ∩ D ⊆ {q};

case 2W :

Wk ∩ D = D .

and, similarly, case 1W :

Combining them, we obtain four different cases, (iS , jW ) (1 ≤ i, j ≤ 2) which are treated separately. First we notice that the cases (iS , 2W ) (i = 1, 2) are particularly easy as ηn,k = 0 on D . In the case (2S , 1W ) we have λ2k = λ2k−1 = τk on D and as τk is analytic it follows that ηn,k is continuous on D . As, by considerations above, ηn,k is analytic on D \ {q} it follows that ηn,k is analytic on D (removable singularity). It remains to treat the case (1S , 1W ). Again by the considerations above, ηn,k is analytic on D \ {q}. As lim r→q λj (r) = λ2k (q) for j = 2k, 2k − 1, r∈D

ηn,k |D is continuous at q. It follows that ηn,k is analytic on D in case (1S , 1W ). (ii) By Lemma 3, ηn,n is a multivalued function whose values coincide modulo 2π. For q ∈ Uq0 \ Dn , there exist a neighborhood V ⊆ Uq0 \ Dn and analytic functions − + − λ+ n , λn on V so that {λn , λn } = {λ2n , λ2n−1 }. As λ2n−1 ψn (λ) dλ = π mod 2π ∆2 (λ) − 4 λ2n and

µ∗n λ+ n

√ ψn2 (λ)

∆ (λ)−4

dλ is continuous on V , we conclude that ηn,k is continuous on

V when viewed as a multivalued function whose values coincide modulo π. Arguing as in (i), we conclude that ηn,n is analytic on V , and therefore on Uq0 \ Dn as well, when considered as a multivalued function.


(iii) As $\lambda_{2n}$ and $\lambda_{2n-1}$ are real for $q$ real valued, they are continuous in $q$. This implies that $\eta_{n,n}$ is continuous on $(U_{q_0}\setminus D_n)\cap L^2_0$ when viewed as a multivalued function whose values coincide modulo $2\pi$.

We summarize our results in the following

Proposition 6 There exists a G-neighborhood $U=U_{L^2_0}$ of $L^2_0$ in $L^2_{0,\mathbb C}$ so that, for any $n\ge1$, the following statements hold:

(i) $\tilde\theta_n := \sum_{k\ne n}\eta_{n,k}$ converges absolutely, is analytic on $U$, and satisfies $\tilde\theta_n=O\big(\frac1n\big)$ locally uniformly in $q$ (cf Lemma 4);

(ii) $\theta_n$ is an analytic, multivalued function on $U\setminus D_n$ with values equal modulo $\pi$;

(iii) when restricted to real valued potentials in $U\setminus D_n$, $\theta_n$ is a continuous multivalued function with values equal modulo $2\pi$.

3 Ω : Definition and regularity properties

In this section we define a real analytic map $\Omega=(\Omega_n)_{n\ge1}: L^2_0\to h^{1/2}(\mathbb N;\mathbb R^2)$ which satisfies - as will be proved in the subsequent sections - all the properties listed in Theorem 1. We begin by defining the $n$'th component of $\Omega$, $\Omega_n(q) := (x_n(q),y_n(q))$. Let $U\equiv U_{L^2_0}$ be a G-neighborhood of $L^2_0$ in $L^2_{0,\mathbb C}$.

Definition For $q\in U\setminus D_n$, set

$$\Omega_n(q) := (x_n(q),y_n(q)) := \xi_n(q)\,\frac{\gamma_n(q)}{2}\,\big(\cos\theta_n(q),\ \sin\theta_n(q)\big),$$

where $\xi_n(q)$ has been introduced in section 1, $\theta_n(q)$ in section 2, and where $\gamma_n(q) := \lambda_{2n}(q)-\lambda_{2n-1}(q)$ is related to the actions $I_n(q)$ by $2I_n(q)=\xi_n(q)^2\big(\frac{\gamma_n(q)}{2}\big)^2$. Recall that $\gamma_n(q)$ is not continuous on $U\setminus D_n$ due to the choice of the ordering of the eigenvalues. Further recall that $\theta_n=\eta_{n,n}+\tilde\theta_n$ where $\tilde\theta_n := \sum_{k\ne n}\eta_{n,k}$ is analytic on $U$ whereas

µ∗ n

ηn,n (q) = λ2n

εn ψn √ dλ ∆2 − 4

is analytic on U \ Dn when viewed as a multivalued function whose values coincide mod π (cf Lemma 5). Lemma 7 On U \ Dn , xn (q) and yn (q) are analytic.




Proof. Let $p\in U\setminus D_n$. Then there exist a neighborhood $V\subseteq U\setminus D_n$ and analytic functions $\lambda^\pm_n$ on $V$ with $\{\lambda^-_n(q),\lambda^+_n(q)\}=\{\lambda_{2n-1}(q),\lambda_{2n}(q)\}$. It follows from the proof of Lemma 5 that

$$\eta^+_{n,n}(q) := \int_{\lambda^+_n}^{\mu^*_n}\frac{\psi_n}{\sqrt{\Delta^2-4}}\,d\lambda$$

is analytic on $V$ when viewed as a multivalued function (mod $2\pi$). Introduce on $V$ the following functions

$$\gamma^+_n := \lambda^+_n-\lambda^-_n;\qquad\theta^+_n := \eta^+_{n,n}+\tilde\theta_n;$$

$$x^+_n := \xi_n\,\frac{\gamma^+_n}{2}\cos\theta^+_n;\qquad y^+_n := \xi_n\,\frac{\gamma^+_n}{2}\sin\theta^+_n.$$

Then $\gamma^+_n$, $\theta^+_n$, $x^+_n$, $y^+_n$ are analytic on $V$. Thus the claimed statement follows if $x_n=x^+_n$ and $y_n=y^+_n$.

Take $q$ in $V$. If $\lambda^+_n(q)=\lambda_{2n}(q)$ then, according to the definition of $\gamma_n$ and $\theta_n$, and Lemma 3,

$$\gamma^+_n(q)=\gamma_n(q),\qquad\theta^+_n(q)\equiv\theta_n(q)\mod 2\pi$$

whereas in the case $\lambda^+_n(q)=\lambda_{2n-1}(q)$, in view of (2.7),

$$\gamma^+_n(q)=-\gamma_n(q),\qquad\theta^+_n(q)\equiv\theta_n(q)+\pi\mod 2\pi.$$

Thus in both cases we conclude that $x_n(q)=x^+_n(q)$ and $y_n(q)=y^+_n(q)$.
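The last step of this proof rests on an elementary identity: replacing $\gamma_n$ by $-\gamma_n$ and $\theta_n$ by $\theta_n+\pi$ leaves $\gamma_n(\cos\theta_n,\sin\theta_n)$ unchanged. A minimal numerical sketch follows; the function name `omega_n` and the complex sample values are hypothetical and only stand in for $\xi_n$, $\gamma_n$, $\theta_n$ at some potential in $V$.

```python
import cmath


def omega_n(xi, gamma, theta):
    """Return (x_n, y_n) = xi * (gamma / 2) * (cos(theta), sin(theta))."""
    half = xi * gamma / 2
    return half * cmath.cos(theta), half * cmath.sin(theta)


# Hypothetical complex sample values (the quantities are complex on U).
xi, gamma, theta = 0.7 + 0.1j, 0.3 - 0.2j, 1.1 + 0.4j

# Ordering lambda_n^+ = lambda_{2n}: gamma^+ = gamma, theta^+ = theta.
ordered = omega_n(xi, gamma, theta)
# Ordering lambda_n^+ = lambda_{2n-1}: gamma^+ = -gamma, theta^+ = theta + pi.
swapped = omega_n(xi, -gamma, theta + cmath.pi)

print(abs(ordered[0] - swapped[0]), abs(ordered[1] - swapped[1]))
```

Both differences vanish up to rounding, which is the reason the coordinates $(x_n,y_n)$ do not see the discontinuity of $\gamma_n$ caused by the ordering of the eigenvalues.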

The next result shows that $\Omega_n$ can be extended:

Proposition 8 There exists a G-neighborhood $U=U_{L^2_0}$ of $L^2_0$ in $L^2_{0,\mathbb C}$ so that for any $n\ge1$, $\Omega_n=(x_n,y_n)$ admits an analytic continuation on $U$.

Let us outline our proof of Proposition 8. First we show that, for any $n\ge1$, $\Omega_n$ admits a continuous extension on $U$ (Corollary 11) and has a bound of the form

$$|\Omega_n(q)|\le\frac{C}{n^{1/2}}\big(|\gamma_n|+|\mu_n-\tau_n|\big)$$

where $C>0$ can be chosen independently of $q$ for $q$ in a bounded G-neighborhood of $q_0$ (Corollary 11). Using Lemma 7, Proposition 8 then follows by showing that $\Omega_n$ is weakly analytic. We begin by establishing an auxiliary result. For $q\in U_{q_0}$, $U_{q_0}$ a G-neighborhood of $q_0\in L^2_0$, and $n\ge1$ introduce the functions

$$\zeta_n\equiv\zeta_n(\lambda,q)=\frac{\psi_n(\lambda,q)}{v_n(\lambda,q)} \qquad (3.1)$$

defined for $\lambda\in\mathbb C$ near $\{\lambda_{2n}(q_0),\lambda_{2n-1}(q_0)\}$ where

$$v_n(\lambda,q) := (-1)^{n-1}\,\frac{2}{n\pi}\,\frac{(\lambda-\lambda_0)^{1/2}}{n\pi}\prod_{k\ne n}\frac{\big((\lambda_{2k}-\lambda)(\lambda_{2k-1}-\lambda)\big)^{1/2}}{k^2\pi^2} \qquad (3.2)$$

and $z^{1/2}$ denotes the branch defined on $\mathbb C\setminus\mathbb R^-$ with $1^{1/2}=1$. Then, for $\big(\lambda,\sqrt{\Delta(\lambda)^2-4}\big)\in\Sigma_q$ near the branch points $\{\lambda_{2n},\lambda_{2n-1}\}$, $\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}$ is defined by

$$\frac{\zeta_n(\lambda)}{\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}}=\frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2-4}}. \qquad (3.3)$$

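Lemma 9 Given a bounded G-neighborhood $U_{q_0}$ of $q_0\in L^2_0$, there exists a constant $C>0$ so that, for $q$ in $U_{q_0}$ and $n\ge1$, $|\zeta_n(\tau_n)-1|\le C|\gamma_n|$.

Proof. For $q\in U_{q_0}\setminus D_n$ real valued, by formula (2.1),

$$\frac1\pi\int_{\lambda_{2n}}^{\lambda_{2n-1}}\zeta_n(\lambda,q)\,\frac{d\lambda}{\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}}=1. \qquad (3.4)$$

Choose $\lambda(t) := \tau_n-t\frac{\gamma_n}{2}$ ($-1\le t\le1$) as path of integration. As $q$ is real valued,

$$\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}=-\frac{\gamma_n}{2}\big(1-t^2\big)^{1/2}. \qquad (3.5)$$

Substituting (3.5) into (3.4) yields

$$1=\frac1\pi\int_{-1}^{1}\zeta_n(\lambda(t))\frac{dt}{(1-t^2)^{1/2}}=\frac1\pi\int_{0}^{1}\big(\zeta_n(\lambda(t))+\zeta_n(\lambda(-t))\big)\frac{dt}{(1-t^2)^{1/2}}. \qquad (3.6)$$

Notice that $\zeta_n(\lambda(t))+\zeta_n(\lambda(-t))$ is even in $t\gamma_n$. Further, $\zeta_n(\lambda)$ as well as $\gamma_n^2$ are analytic in $q$, hence (3.6) remains valid on all of $U_{q_0}\setminus D_n$. The integral in (3.6) is split up into two parts, $F_I(q)+F_{II}(q)$, with

$$F_I(q) := \zeta_n(\tau_n)\,\frac1\pi\int_{-1}^{1}\frac{dt}{(1-t^2)^{1/2}}=\zeta_n(\tau_n).$$

Then (3.6) leads to

$$|\zeta_n(\tau_n)-1|\le|F_{II}(q)|. \qquad (3.7)$$

To estimate

$$F_{II}(q) := \frac1\pi\int_{-1}^{1}\big(\zeta_n(\lambda)-\zeta_n(\tau_n)\big)\frac{dt}{(1-t^2)^{1/2}},$$

notice that, as $\lambda(t)-\tau_n=-t\frac{\gamma_n}{2}$,

$$\zeta_n(\lambda)-\zeta_n(\tau_n)=\int_0^1\frac{\partial\zeta_n}{\partial\lambda}\big(\tau_n+s(\lambda-\tau_n)\big)(\lambda-\tau_n)\,ds=-t\,\frac{\gamma_n}{2}\int_0^1\frac{\partial\zeta_n}{\partial\lambda}\Big(\tau_n-st\frac{\gamma_n}{2}\Big)ds.$$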



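This leads to

$$F_{II}(q)=-\frac{\gamma_n}{2}\,\frac1\pi\int_{-1}^{1}\int_0^1\frac{t}{(1-t^2)^{1/2}}\,\frac{\partial\zeta_n}{\partial\lambda}\Big(\tau_n-st\frac{\gamma_n}{2}\Big)\,dt\,ds.$$

Choose $C>0$ so that

$$\sup_{\substack{0\le s\le1\\0\le|t|\le1}}\Big|\frac{\partial\zeta_n}{\partial\lambda}\Big(\tau_n-st\frac{\gamma_n}{2}\Big)\Big|\le C\qquad\forall q\in U_{q_0}.$$

Thus, for $q\in U_{q_0}\setminus D_n$,

$$|\zeta_n(\tau_n)-1|\le C|\gamma_n|. \qquad (3.8)$$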
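The mechanism of this proof can be illustrated numerically: the weight $\frac{1}{\pi}(1-t^2)^{-1/2}$ integrates to 1 over $[-1,1]$, so the weighted mean of $\zeta_n$ along the gap differs from $\zeta_n(\tau_n)$ only by a term controlled by $|\gamma_n|$. The sketch below is an illustration under stated assumptions, not the authors' computation: the quadratic `toy_zeta` is a hypothetical stand-in for $\zeta_n$, and the integral is evaluated with the substitution $t=\sin\varphi$ and the midpoint rule.

```python
import math


def weighted_mean(zeta, tau, gamma, steps=2000):
    """(1/pi) * integral over [-1, 1] of zeta(tau - t*gamma/2) / sqrt(1 - t^2) dt.

    With t = sin(phi) the weight becomes constant, so the midpoint rule
    on [-pi/2, pi/2] applies without any endpoint singularity.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = -math.pi / 2 + (k + 0.5) * h
        total += zeta(tau - 0.5 * gamma * math.sin(phi))
    return total * h / math.pi


TAU, GAMMA = 10.0, 0.2  # hypothetical gap midpoint and gap length


def toy_zeta(lam):
    """Hypothetical smooth stand-in for zeta_n with toy_zeta(TAU) = 1."""
    d = lam - TAU
    return 1.0 + 0.3 * d + 0.5 * d * d


normalisation = weighted_mean(lambda lam: 1.0, TAU, GAMMA)
mean = weighted_mean(toy_zeta, TAU, GAMMA)
print(normalisation, mean - toy_zeta(TAU))
```

For this toy function the linear term integrates to zero by symmetry and the quadratic term contributes $\gamma^2/16$, so the deviation from `toy_zeta(TAU)` is well within a bound of the form $C|\gamma|$.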

As ζn (τn ) and |γn | are continuous and Uq0 \ Dn is dense in Uq0 , (3.8) holds on the whole neighborhood Uq0 . Recall that in section 2, we have introduced the real analytic submanifolds Wn Sn

:= {q ∈ Uq0 | µn ∈ {λ2n , λ2n−1 }} , := {q ∈ Uq0 | λ2n = λ2n−1 }

where Uq0 is a bounded G-neighborhood of q0 ∈ L20 . To formulate our next result, introduce, for q ∈ Uq0 , 1 1 ∂ζn (τn + st(µn − τn ))dsdt. (3.9) pn (q) := (µn − τn ) 0 0 ∂λ Use the model for Σq near λ2n obtained by glueing two copies of the complex plane, slitopen along the interval Gn = {(1 − t)λ2n−1 + tλ2n | 0 ≤ t ≤ 1}. For λ∗ = (λ, ∆(λ)2 − 4) ∈ Σq with λ ∈ Gn and near λ2n , define 3n ≡ 3n (λ∗ ) = ±1 by 2 1/2 γn /2 (λ2n − λ)(λ − λ2n−1 ) = i3n · (λ − τn ) 1 − (3.10) λ − τn where (1 − z 2 )1/2 denotes the square root on C \ (−∞, −1) ∪ (1, ∞) with 11/2 = 1. Formula (3.10) then leads to 2 1/2 γ /2 n ∆(λ)2 − 4 = ζn (λ)i3n · (λ − τn ) 1 − . (3.11) λ − τn Define Ωn ≡ (xn , yn ) on Sn as follows (xn , yn )

:= (0, 0) on Sn ∩ Wn

(3.12)

(1, −i3n ) on Sn \ Wn (3.13) with 3n = 3n (µ∗n ), µ∗n = (µn , y1 (1, µn ) − y2 (1, µn )) and θñ := k=n ηn,k . Notice that Ωn |Sn is continuous on Sn . (xn , yn )

:= (µn − τn )ξn e

in θñ +pn


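Lemma 10 For $q_1\in S_n\setminus W_n$,

$$\lim_{\substack{q\to q_1\\ q\notin S_n\cup W_n}}\Omega_n(q)=\Omega_n(q_1).$$

Proof. We first evaluate the limits of $x_n(q)\pm iy_n(q)=\xi_n\frac{\gamma_n}{2}e^{\pm i\theta_n}$ for $q\to q_1$ with $q\in U_{q_0}\setminus(S_n\cup W_n)$. By Proposition 6, $\lim_{q\to q_1}e^{\pm i\tilde\theta_n(q)}=e^{\pm i\tilde\theta_n(q_1)}$ and by Proposition 1, $\lim_{q\to q_1}\xi_n(q)=\xi_n(q_1)$. Thus it remains to find the limit of $\frac{\gamma_n}{2}e^{\pm i\eta_{n,n}(q)}$ as $q\to q_1$. For $q\in U_{q_0}\setminus(S_n\cup W_n)$,

$$\eta_{n,n}(q)=\int_{\lambda_{2n}}^{\mu^*_n}\frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2-4}}\,d\lambda=\int_{\lambda_{2n}}^{\mu^*_n}\frac{\zeta_n(\lambda)}{\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}}\,d\lambda \qquad (3.14)$$

where $\zeta_n(\lambda)$ is given by (3.1) and the square root $\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}$ is defined on $\Sigma_q$ for $\lambda$ near $\lambda_{2n}$ by (3.10). For $q\in U_{q_0}\setminus(S_n\cup W_n)$ with $|\mu_n-\tau_n|\le4|\gamma_n|$, by Lemma 4,

$$|\eta_{n,n}(q)|\le C\qquad(\text{for }q\text{ with }|\mu_n-\tau_n|\le4|\gamma_n|). \qquad (3.15)$$

To evaluate $\int_{\lambda_{2n}}^{\mu^*_n}\frac{\zeta_n(\lambda)}{\sqrt{(\lambda_{2n}-\lambda)(\lambda-\lambda_{2n-1})}}\,d\lambda$ for $q\in U_{q_0}\setminus(S_n\cup W_n)$ with $|\mu_n-\tau_n|>4|\gamma_n|$ we consider two cases:

case 1: $\mathrm{Re}\,w_n\ge0$; case 2: $\mathrm{Re}\,w_n<0$,

where $w_n=\frac{\mu_n-\tau_n}{\gamma_n/2}$. Let us first consider case 1. Choose as path of integration

$$\lambda(t)=\lambda_{2n}+t(\mu_n-\lambda_{2n})=\tau_n+\frac{\gamma_n}{2}w(t)\quad\text{where}\quad w(t)=1-t+tw_n\quad(0\le t\le1).$$

Then

$$(\lambda_{2n}-\lambda(t))(\lambda(t)-\lambda_{2n-1})=\Big(\frac{\gamma_n}{2}\Big)^2(1-w(t))(1+w(t))=-\Big(\frac{\gamma_n}{2}\Big)^2w(t)^2\Big(1-\frac{1}{w(t)^2}\Big).$$

Notice that $\mathrm{Re}\,w(t)=1-t+t\,\mathrm{Re}\,w_n\ge0$ (case 1). Moreover, for $0\le t\le1$ (cf (3.10)),

$$\sqrt{(\lambda_{2n}-\lambda(t))(\lambda(t)-\lambda_{2n-1})}=i\epsilon_n\,\frac{\gamma_n}{2}\,w(t)\Big(1-\frac{1}{w(t)^2}\Big)^{1/2}. \qquad (3.16)$$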




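Substituting (3.16) into the integral in (3.14) we get

$$\eta_{n,n}(q)=\int_0^1\frac{\zeta_n(\lambda(t))\,(\mu_n-\lambda_{2n})\,dt}{i\epsilon_n\frac{\gamma_n}{2}w(t)\big(1-\frac{1}{w(t)^2}\big)^{1/2}}=\frac{\epsilon_n}{i}\int_0^1\frac{\zeta_n(\lambda(t))}{w(t)\big(1-\frac{1}{w(t)^2}\big)^{1/2}}\,(w_n-1)\,dt=\frac{\epsilon_n}{i}\int_1^{w_n}\frac{\zeta_n\big(\tau_n+\frac{\gamma_n}{2}w\big)}{w\big(1-\frac{1}{w^2}\big)^{1/2}}\,dw\mod 2\pi. \qquad (3.17)$$

Using the Taylor expansion

$$\zeta_n\Big(\tau_n+\frac{\gamma_n}{2}w\Big)=\zeta_n(\tau_n)+\frac{\gamma_n}{2}w\int_0^1\frac{\partial\zeta_n}{\partial\lambda}\Big(\tau_n+s\frac{\gamma_n}{2}w\Big)ds,$$

the last integral in (3.17) can be split into two parts, $\eta_{n,n}(q)=I(q)+II(q)$, where

$$I(q) := \frac{\epsilon_n}{i}\,\zeta_n(\tau_n)\int_1^{w_n}\frac{dw}{w\big(1-\frac{1}{w^2}\big)^{1/2}} \qquad (3.18)$$

and

$$II(q) := \frac{\epsilon_n}{i}\int_1^{w_n}\int_0^1\frac{\frac{\partial\zeta_n}{\partial\lambda}\big(\tau_n+s\frac{\gamma_n}{2}w\big)}{\big(1-\frac{1}{w^2}\big)^{1/2}}\,\frac{\gamma_n}{2}\,dw\,ds. \qquad (3.19)$$

Then, as $\mathrm{Re}\,w(t)>0$ for $0\le t<1$, and $w(0)=1$,

$$I(q)=\frac{\epsilon_n}{i}\,\zeta_n(\tau_n)\,\log\Big(w+w\Big(1-\frac{1}{w^2}\Big)^{1/2}\Big)\Big|_{w=w_n}\quad(\mathrm{mod}\ 2\pi) \qquad (3.20)$$

and with $\frac{\gamma_n}{2}dw=\frac{\gamma_n}{2}(w_n-1)\,dt=(\mu_n-\lambda_{2n})\,dt$,

$$II(q)=\frac{\epsilon_n}{i}\,(\mu_n-\lambda_{2n})\int_0^1\int_0^1\frac{\partial\zeta_n}{\partial\lambda}\Big(\tau_n+s\Big(\frac{\gamma_n}{2}+t(\mu_n-\lambda_{2n})\Big)\Big)\frac{dt\,ds}{\big(1-\frac{1}{w(t)^2}\big)^{1/2}}. \qquad (3.21)$$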
Notice that, for 0 < t ≤ 1, 1 = 1 1/2 (1 − (1 − w(t) 2)

1 1 1/2 (1 + w(t) )

2 ≤ 1/2 1 1/2 t w(t) )

(3.22)

where using that |wn | ≥ 4, 1 −

−1/2 1 1 = 1/2 w(t) t

1 + t(wn − 1) 1/2 2 ≤ 1/2 wn − 1 t

(3.23)


and, using that Re w(t) = 1 + t Re wn ≥ 1 1 +

−1/2 1 + t(wn − 1) 1/2 1 = ≤ 1. w(t) 2 + t(wn − 1)

(3.24)

Before continuing our argument for case 1 let us first consider the case 2: Re wn < 0. Then µ∗n µ∗n ψn (λ)dλ ψn (λ)dλ =π+ mod 2π (3.25) ηn,n (q) = 2 −4 ∆(λ) ∆(λ)2 − 4 λ2n λ2n−1 where we used (2.7). For the last integral in (3.25), choose as path of integration λ(t) = λ2n−1 + t(µn − λ2n−1 ) and argue as in case 1. It leads to the following formula, ηn,n = I(q) + II(q) + III(q) where I(q) is defined as in (3.20) but 3n II(q) := (µn − λ2n−1 ) i

1

0

1

0

∂ζn (τ (s, t)) ∂λ

dtds 1−

1 w(t)2

1/2

mod 2π (3.26)

where τ (s, t) := τn + s(− γ2n + t(µn − λ2n−1 )) and III(q) := (3n ζn (τn ) + 1)π

mod 2π.

(3.27)

The estimates (3.23), (3.24) allow to take the limit under the integral in (3.21) and (3.26) to obtain dtds 3n 1 1 ∂ζn (τ (s, t)) = pn (q1 ) lim II(q) = (µn − λ2n ) 1 1/2 q→q1 i 0 0 ∂λ (1 − w(t) 2) q=q1

(3.28) where we used that limq→q1 γn (q) = 0 and limq→q1 λ2n (q) = τn (q1 ). Now let us continue with the proof of case 1 and case 2 simultaneously. From (3.20) we obtain 1/2 ±n ζn (τn ) 1 γn ±iI(q) γn lim e w+w 1− 2 = lim (3.29) q→q1 2 q→q1 2 w w=wn

=

(µn − τn )(±3n (q1 ) + 1)

where we used |ζn (τn ) − 1| ≤ C|γn | (Lemma 9) and thus lim

q→q1

γn 2

1 γn /2

ζn (τn ) = 1.

(3.30)




Notice that III(q) (cf 3.27) is continuous in q and lim e±iIII(q) = lim exp (±i(3n ζn (τn ) + 1)π) = 1.

q→q1

q→q1

Combining (3.28), (3.29), and (3.31) we conclude that limq→q1 For q1 ∈ Sn \ Wn we then obtain (q ∈ Uq0 \ (Sn ∪ Wn )) lim (xn + iyn ) =

q→q1

= =

γn ±ηn,n 2 e

(3.31) exists.

γn iηn,n e 2 γ ˜ n (2wn )n ζn (τn ) en pn ξn eiθn lim q→q1 2 ˜

ξn eiθn lim

q→q1

˜

(1 + 3n )ξn eiθn (µn − τn )epn

where pn ≡ pn (q1 ) (cf (3.9)). Similarly, γn iηn,n e 2 γ ˜ n (2wn )−n ζn (τn ) e−n pn = ξn e−iθn lim q→q1 2 ˜

= ξn e−iθn lim

lim (xn − iyn )

q→q1

q→q1

˜

= (1 − 3n )ξn e−iθn (µn − τn )epn . Thus

˜

lim xn = ξn en iθn (µn − τn )epn

q→q1

and

˜

lim yn = −i3n ξn en iθn (µn − τn )epn = −i3n xn (q1 ).

q→q1

Corollary 11 (i) $\Omega_n$ is continuous on $U_{q_0}$. (ii) There exists $C>0$ so that for $q\in U_{q_0}$ and $n\ge1$,

$$|x_n|+|y_n|\le\frac{C}{n^{1/2}}\big(|\mu_n-\tau_n|+|\gamma_n|\big).$$

Proof. (i) Follows from Lemma 7, Lemma 10 and the definitions (3.12), (3.13).

(ii) On $U_{q_0}$, $\big(e^{\pm i\tilde\theta_n(q)}\big)_{n\ge1}$ (cf Proposition 6) and $\big(\sqrt n\,\xi_n\big)_{n\ge1}$ (cf Proposition 1) are bounded. It remains to bound $\frac{\gamma_n}{2}e^{\pm i\eta_{n,n}}$ by $C(|\mu_n-\tau_n|+|\gamma_n|)$. This follows from (3.15), the boundedness of $e^{\pm iII(q)}$ (cf (3.21) and (3.26)), the boundedness of $e^{\pm iIII(q)}$ (cf (3.27), Lemma 9), and the boundedness of $\frac{\gamma_n}{2}e^{\pm iI(q)}$ (cf (3.20), Lemma 9).

Proof. (of Proposition 8). The claimed statement follows if for any $q_0\in L^2_0$, there exists a G-neighborhood $U_{q_0}$ of $q_0$ in $L^2_{0,\mathbb C}$ so that $x_n$, $y_n$ are bounded on $U_{q_0}$ and weakly analytic (cf [PT]). By Corollary 11, $x_n$, $y_n$ are bounded on $U_{q_0}$. From


Lemma 7 and Corollary 11 one concludes, similarly as in the proof of Lemma 5, that $x_n(q)$, $y_n(q)$ are weakly analytic.

The results of this section lead to

Theorem 2 $\Omega := (\Omega_n)_{n\ge1}: L^2_0\to h^{1/2}(\mathbb N;\mathbb R^2)$ is real analytic.

Proof. Let $q_0\in L^2_0$. By Corollary 11 there exist $C>0$ and a G-neighborhood $U_{q_0}$ of $q_0$ in $L^2_{0,\mathbb C}$ so that for any $n\ge1$, $\Omega_n$ is analytic on $U_{q_0}$ and, for $q$ in $U_{q_0}$,

$$|x_n|^2+|y_n|^2\le\frac Cn\big(|\gamma_n(q)|^2+|\mu_n(q)-\tau_n(q)|^2\big).$$

By Proposition 28, $U_{q_0}$ and $C>0$ can be chosen so that, for $q\in U_{q_0}$,

$$\sum_{n\ge1}\big(|\gamma_n(q)|^2+|\mu_n(q)-\tau_n(q)|^2\big)\le C.$$

Thus $\Omega(q)\in h^{1/2}(\mathbb N;\mathbb R^2)$ and $\Omega$ is bounded on $U_{q_0}$. Together with the analyticity of $\Omega_n$ on $U_{q_0}$ ($n\ge1$), this implies that $\Omega$ is analytic on $U_{q_0}$.

4 Canonical relations: part 1

In this section we prove a first set of canonical relations for the variables $I_n$, $\theta_n$ ($n\ge1$) introduced in sections 1 and 2 respectively. These relations will be used in the next section to prove that the map $\Omega$, defined in section 3, is a local diffeomorphism. Let $O(q)$ be the set of open gaps, $O\equiv O(q) := \{n\in\mathbb N\mid\gamma_n(q)\ne0\}$.

Proposition 12 (i) For $q\in L^2_0$ and $m,n\ge1$, $\{I_n,I_m\}=0$. (ii) For $q\in L^2_0$, $m\in O(q)$, and $n\ge1$, $\{\theta_m,I_n\}(q)=-\delta_{n,m}$. (iii) For $q\in L^2_0$ and $m,n\in O(q)$, {xn , xm } =

{yn , ym } = 0;

{xn , ym } =

0 (m = n);

{xn , yn } = 0.

We prove parts (i), (ii), and (iii) of Proposition 12 separately.




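Proof of Proposition 12(i) Recall that

$$\frac{\partial I_k}{\partial q(x)}=-\frac2\pi\int_{\lambda_{2k-1}}^{\lambda_{2k}}\frac{1}{\sqrt{\Delta^2(\lambda)-4}}\,\frac{\partial\Delta(\lambda)}{\partial q(x)}\,d\lambda \qquad (4.1)$$

where the path of integration is given by $\lambda=\lambda_{2k-1}+t\gamma_k-i0$ with $0\le t\le1$. For $a,b\in\mathbb R$, we have (cf (B.3) in Appendix B) $\{\Delta(a,q),\Delta(b,q)\}=0$. Therefore $\{I_n,I_m\}=0$.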

The proof of Proposition 12(ii) requires several auxiliary results which we present first. For $q\in L^2_0$, let Iso($q$) denote the set of isospectral potentials. As Iso($q$) is compact and generically not contained in a finite dimensional space, Iso($q$) generically is not a manifold. Nevertheless its normal space $N_q\mathrm{Iso}(q)$ and its tangent space $T_q\mathrm{Iso}(q)$ at $q$ are well defined (cf [MT1]): $T_q\mathrm{Iso}(q)$ is the $L^2$-closure of the span of $\frac{d}{dx}\big(f_{2n}^2-f_{2n-1}^2\big)$ with $n\in O\equiv O(q)$ where $(f_n)_{n\ge0}$ denotes an orthonormal set of eigenfunctions of the Schrödinger operator $-\frac{d^2}{dx^2}+q$ on $[0,2]$, considered with periodic boundary conditions. The normal space $N_q\mathrm{Iso}(q)$ is the orthogonal complement of $T_q\mathrm{Iso}(q)$ in $L^2_0$.

Lemma 13 For $n\ge1$ and $q\in L^2_0$,

$$\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}\in T_q\mathrm{Iso}(q).$$

Proof. It suffices to consider $n\in O$ as, for $n\in\mathbb N\setminus O$, $\frac{\partial I_n}{\partial q(x)}=0$. Similarly as in the proof of Proposition 12(i) one shows that, for any $\lambda\in\mathbb R$,

$$\{\Delta(\lambda),I_n\}=0.$$

Therefore $\Delta(\cdot,q)$ remains unchanged along the flow generated by $\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}$. As $\Delta(\cdot,q)$ determines the spectrum of $q$, $\{\lambda_n(q)\}_{n=0}^{\infty}=\{\lambda\mid\Delta(\lambda,q)=\pm2\}$, we conclude that $\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}\in T_q\mathrm{Iso}(q)$.

Denote by mij = mij (λ, q) (1 ≤ i, j ≤ 2) the entries of the Floquet matrix mij := ∂xi−1 yj (1, λ, q). Lemma 14 For any k ≥ 1, q ∈ L20 , and λ = µk (q), {µk (·), ∆(λ, ·)}(q) =

1 m11 (µk (q), q) − m22 (µk (q), q) m12 (λ, q) . 2 m ˙ 12 (µk (q), q) λ − µk (q)

Proof. By the definition of the Poisson bracket, 1 ∂∆(λ, q) d ∂µk (q) dx. {µk , ∆(λ)}(q) = − ∂q(x) dx ∂q(x) 0

(4.2)

Vol. 2, 2001


Using that (cf. [PT])

∂µk ∂q(x)

=

y22 (x,µk ,q) m ˙ 12 (µk )m22 (µk )

2(λ − µk ){µk , ∆(λ)}

= =

825

we obtain (cf. (B.4) in Appendix B)

1 m12 (λ) − m22 (µk ) m ˙ 12 (µk ) m22 (µk ) m12 (λ) (m11 (µk ) − m22 (µk )) . m ˙ 12 (µk )

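Corollary 15 For any $k,n\ge1$ and $q\in L^2_0$,

$$\{\mu_k(\cdot),I_n(\cdot)\}=-\frac1\pi\,\frac{m_{11}(\mu_k)-m_{22}(\mu_k)}{\dot m_{12}(\mu_k)}\int_{\lambda_{2n-1}}^{\lambda_{2n}}\frac{m_{12}(\lambda)}{\lambda-\mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda)-4}}$$

where we have omitted $q$ from the list of parameters.

Proof. The claimed formula follows from Lemma 14 and

$$\frac{\partial I_n}{\partial q(x)}=-\frac2\pi\int_{\lambda_{2n-1}}^{\lambda_{2n}}\frac{1}{\sqrt{\Delta^2(\lambda)-4}}\,\frac{\partial\Delta(\lambda)}{\partial q(x)}\,d\lambda.$$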

∂θm (x) d ∂In (x) As dx onto Tq Iso(q) will matter ∂q(x) ∈ Tq Iso(q), only the projection of ∂q(x) for the computation of {θm , In }(q). As θm = k≥1 ηm,k we introduce, for k ∈ O and m ≥ 1,   − ψ˙m (µk ) y1 (x, µk )y2 (x, µk ) if µk ∈ {λ2k−1 , λ2k } ∆(µk ) hm,k (x, q) := ψm (µk ) ∂µ k if λ2k−1 < µk < λ2k  √ 2 ∂q(x) ∆ (µk )−4

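where $\psi_m(\lambda)$ ($m\ge1$) is given in Proposition 2.

Lemma 16 For $q\in L^2_0$, $k\in O$, and $m,n\ge1$,

(i)
$$\Big\langle\frac{\partial\eta_{m,k}}{\partial q(x)},\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}\Big\rangle_{L^2}=\Big\langle h_{m,k},\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}\Big\rangle_{L^2};$$

(ii)
$$\Big\langle\frac{\partial\eta_{m,k}}{\partial q(x)},\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}\Big\rangle_{L^2}=-\frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac1\pi\int_{\lambda_{2n-1}}^{\lambda_{2n}}\frac{m_{12}(\lambda)}{\lambda-\mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda)-4}}.$$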

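Proof. (i) Consider the case $\lambda_{2k-1} < \mu_k < \lambda_{2k}$. To prove the statement we use (C.3) in Appendix C. As $\lambda_{2k}(\cdot)$ is a spectral invariant, $\frac{\partial \lambda_{2k}}{\partial q(x)} \in N_q\mathrm{Iso}(q)$. By Lemma 13,

\[ \Big\langle \frac{\partial \lambda_{2k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = 0. \]

Similarly,

\[ \Big\langle \frac{\partial}{\partial q(x)}\Big(\frac{\psi_m(y + \lambda_{2k})}{\sqrt{-G(y + \lambda_{2k})}}\Big), \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = 0 \]

where $G(\lambda, q) := \frac{\Delta(\lambda)^2 - 4}{\lambda_{2k} - \lambda}$. Therefore in this case we obtain (i). In the case $\mu_k = \lambda_{2k}$, we use Lemma 42 in Appendix C. By Corollary 40 in Appendix B, $\big\langle y_2^2(x, \mu_k), \frac{d}{dx}\frac{\partial \Delta(\lambda)}{\partial q(x)} \big\rangle_{L^2} = 0$, as $\lambda_{2k} = \mu_k$. Therefore $\big\langle y_2^2(x, \mu_k), \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \big\rangle_{L^2} = 0$ and, by Lemma 42, we obtain (i). The case $\mu_k = \lambda_{2k-1}$ is treated similarly.

(ii) For $q \in L^2_0$ with $\mu_k \neq \lambda_{2k}$, the statement follows from (i) and Corollary 15 (recall that $\sqrt{\Delta^2(\mu_k) - 4} = m_{11}(\mu_k) - m_{22}(\mu_k)$). By continuity, (ii) holds for $m \neq k$, or $m = k$ and $m \in O$.

Denote by $\mathrm{Gap}^0_{\le K}$ the set of $K$-gap potentials

\[ \mathrm{Gap}^0_{\le K} := \{q \in L^2_0 \mid \gamma_k = 0 \text{ iff } k > K\}. \tag{4.3} \]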

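Proof of Proposition 12(ii) Fix $m, n \ge 1$. By Proposition 41, for $K \ge \max\{m, n\}$ and $q \in \mathrm{Gap}^0_{\le K}$,

\[ \{\theta_m, I_n\}(q) = \sum_{k=1}^{K} \Big\langle \frac{\partial \eta_{m,k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} + \sum_{k=K+1}^{\infty} \Big\langle \frac{\partial \eta_{m,k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2}. \]

Using Corollary 44 together with (B.4) (cf. Appendix B), we obtain, for $k > K$ and $\lambda \neq \mu_k$ (using that for $\lambda_{2k} = \lambda_{2k-1}$, $m_{22}^2(\mu_k) = 1$ and $m_{21}(\mu_k) = 0$),

\[ \Big\langle \frac{\partial \eta_{m,k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial \Delta(\lambda, q)}{\partial q(x)} \Big\rangle_{L^2} = 0. \]

Thus, for $k > K$,

\[ \Big\langle \frac{\partial \eta_{m,k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = -\frac{2}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \frac{1}{\sqrt{\Delta^2(\lambda) - 4}} \Big\langle \frac{\partial \eta_{m,k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial \Delta(\lambda)}{\partial q(x)} \Big\rangle_{L^2} d\lambda = 0. \]

Hence, for $q \in \mathrm{Gap}^0_{\le K}$ (cf. Lemma 16 and Lemma 47 in Appendix D),

\begin{align*}
\{\theta_m, I_n\}(q) &= \sum_{k=1}^{K} \Big\langle \frac{\partial \eta_{m,k}}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} \\
&= -\frac{1}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \sum_{k=1}^{K} \frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{m_{12}(\lambda)}{\lambda - \mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda) - 4}} \\
&= -\frac{1}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \frac{\psi_m(\lambda)}{\sqrt{\Delta^2(\lambda) - 4}}\,d\lambda = -\delta_{nm}.
\end{align*}

As $\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}$ and $\frac{\partial \theta_m}{\partial q(x)}$ depend continuously on $q$, and the set $\cup_{k \ge K}\mathrm{Gap}^0_{\le k}$ is dense in $L^2_0$, we conclude that $\{\theta_m, I_n\} = -\delta_{n,m}$ for $q \in U \setminus D_m$.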

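Corollary 17 For $k, n \ge 1$,

\[ \{x_k, I_n\} = \delta_{k,n}\, y_k; \qquad \{y_k, I_n\} = -\delta_{k,n}\, x_k. \]

Proof. Assume that $q \in U \setminus D_k$. Then

\begin{align*}
\{x_k, I_n\} &= \Big\langle \frac{1}{\sqrt{2I_k}}\frac{\partial I_k}{\partial q(x)}\cos\theta_k - \sqrt{2I_k}\,\sin\theta_k\,\frac{\partial \theta_k}{\partial q(x)},\ \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} \tag{4.4} \\
&= \delta_{k,n}\sqrt{2I_k}\,\sin\theta_k = \delta_{k,n}\, y_k.
\end{align*}

As $x_k$, $y_k$, and $\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}$ are analytic, we conclude that (4.4) holds for $q \in L^2_0$. The other identity in the statement is obtained in a similar fashion.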
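The evaluation of (4.4) is a direct application of the chain rule to $x_k = \sqrt{2I_k}\cos\theta_k$ and $y_k = \sqrt{2I_k}\sin\theta_k$. The following sketch assumes only the brackets $\{I_k, I_n\} = 0$ (Proposition 12(i)) and $\{\theta_k, I_n\} = -\delta_{k,n}$ (Proposition 12(ii)).

```latex
\begin{align*}
\{x_k, I_n\}
 &= \frac{\cos\theta_k}{\sqrt{2I_k}}\,\{I_k, I_n\}
    - \sqrt{2I_k}\,\sin\theta_k\,\{\theta_k, I_n\}
  = 0 + \delta_{k,n}\sqrt{2I_k}\,\sin\theta_k = \delta_{k,n}\,y_k, \\
\{y_k, I_n\}
 &= \frac{\sin\theta_k}{\sqrt{2I_k}}\,\{I_k, I_n\}
    + \sqrt{2I_k}\,\cos\theta_k\,\{\theta_k, I_n\}
  = 0 - \delta_{k,n}\sqrt{2I_k}\,\cos\theta_k = -\delta_{k,n}\,x_k .
\end{align*}
```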

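To prove Proposition 12(iii) we need the following two Lemmas. Recall that $\tilde\theta_n = \sum_{k \neq n} \eta_{n,k}(q)$ and introduce, for $q \in L^2_0$ with $\lambda_{2n-1} = \lambda_{2n}$, an $L^2[0,1]$-orthonormal basis $\tilde f_{2n-1}, \tilde f_{2n}$ of $\mathrm{span}\langle y_1(\cdot, \lambda_{2n}), y_2(\cdot, \lambda_{2n}) \rangle$ with $\tilde f_{2n} := \frac{y_2}{\|y_2\|}$ and $\tilde f_{2n-1}(0) > 0$. Then $\tilde f_{2n-1}$ is of the form ($y_j \equiv y_j(\cdot, \lambda_{2n})$, $j = 1, 2$)

\[ \tilde f_{2n-1} = \frac{y_1 + b_n y_2}{\|y_1 + b_n y_2\|}; \qquad b_n := -\frac{\langle y_1, y_2 \rangle_{L^2}}{\langle y_2, y_2 \rangle_{L^2}}. \]

Lemma 18 Let $q \in L^2_0$ with $\lambda_{2n-1}(q) = \lambda_{2n}(q)$. Then

\begin{align*}
\frac{\partial x_n}{\partial q(x)} &= \xi_n\Big(\cos\tilde\theta_n\,\frac{\tilde f_{2n}^2 - \tilde f_{2n-1}^2}{2} - \kappa_n \sin\tilde\theta_n\,\tilde f_{2n}\tilde f_{2n-1}\Big) \tag{4.5} \\
\frac{\partial y_n}{\partial q(x)} &= \xi_n\Big(\sin\tilde\theta_n\,\frac{\tilde f_{2n}^2 - \tilde f_{2n-1}^2}{2} + \kappa_n \cos\tilde\theta_n\,\tilde f_{2n}\tilde f_{2n-1}\Big) \tag{4.6}
\end{align*}

where $\kappa_n \equiv \kappa_n(q)$ satisfies $\kappa_n \neq 0$. If $q$ is a finite gap potential one has for $n \to \infty$

\[ \kappa_n = -1 + O\Big(\frac{\log n}{n}\Big). \]

Proof. The proof is given in Appendix C.

Lemma 19 Let $q \in L^2_0$ with $\lambda_{2m-1}(q) = \lambda_{2m}(q)$ and $\lambda_{2n-1}(q) = \lambda_{2n}(q)$. Then, with $\tilde f_j$ defined as above,

\begin{align*}
\Big\langle \tilde f_{2n}^2 - \tilde f_{2n-1}^2, \frac{d}{dx}\big(\tilde f_{2m}^2 - \tilde f_{2m-1}^2\big) \Big\rangle_{L^2} &= 0 \tag{4.7} \\
\Big\langle \tilde f_{2n}\tilde f_{2n-1}, \frac{d}{dx}\big(\tilde f_{2m}\tilde f_{2m-1}\big) \Big\rangle_{L^2} &= 0 \tag{4.8} \\
\Big\langle \tilde f_{2n}^2 - \tilde f_{2n-1}^2, \frac{d}{dx}\big(\tilde f_{2m}\tilde f_{2m-1}\big) \Big\rangle_{L^2} &= -\frac{\delta_{n,m}}{\|y_2\|\,\|y_1 + b_n y_2\|}. \tag{4.9}
\end{align*}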

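Proof. Assume that $q \in H^1_0$. The identities (4.7) and (4.8) clearly hold if $m = n$. If $m \neq n$, then, as $\tilde f_{2k-1}^2$, $\tilde f_{2k}^2$, and $\tilde f_{2k}\tilde f_{2k-1}$ with $k \in \{m, n\}$ are in $H^3$, we obtain by Lemma 39 in Appendix B that (4.7)-(4.9) hold. It remains to verify (4.9) for $m = n$. Notice that

\[ y_1(x, \lambda_{2n})\, y_2(x, \lambda_{2n}) = \alpha\,\tilde f_{2n-1}\tilde f_{2n} - b_n \|y_2\|^2\,\tilde f_{2n}^2 \]

where, in view of $\tilde f_{2n-1} = \frac{y_1 + b_n y_2}{\|y_1 + b_n y_2\|}$, $\alpha = \|y_1 + b_n y_2\|\,\|y_2\|$. Let $W[f, g] := f'g - fg'$. By a straightforward computation,

\begin{align*}
\Big\langle \tilde f_{2n}^2, \frac{d}{dx}\big(\tilde f_{2n}\tilde f_{2n-1}\big) \Big\rangle_{L^2} &= \frac{1}{2} W[\tilde f_{2n-1}, \tilde f_{2n}](0); \\
\Big\langle \tilde f_{2n-1}^2, \frac{d}{dx}\big(\tilde f_{2n}\tilde f_{2n-1}\big) \Big\rangle_{L^2} &= -\frac{1}{2} W[\tilde f_{2n-1}, \tilde f_{2n}](0).
\end{align*}

Combining the two identities above leads to

\[ \Big\langle \tilde f_{2n}^2 - \tilde f_{2n-1}^2, \frac{d}{dx}\big(\tilde f_{2n-1}\tilde f_{2n}\big) \Big\rangle_{L^2} = W[\tilde f_{2n-1}, \tilde f_{2n}](0) = -\frac{1}{\alpha} \]

and (4.9) holds for $n = m$. Finally one can argue by continuity to conclude that (4.7)-(4.9) hold for $q \in L^2_0$.

Proof of Proposition 12(iii) The claimed identities follow from Lemma 18 and Lemma 19.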
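As an illustration of how Lemma 18 and Lemma 19 combine, here is a sketch for the bracket $\{x_n, y_n\}$ at a potential with $\lambda_{2n-1} = \lambda_{2n}$. It rests on two assumptions about displays that are damaged in this copy: that $\xi_n$ multiplies the whole right-hand side in (4.5) and (4.6), and that the product of norms in (4.9) sits in the denominator (as the value $-1/\alpha$ in the proof of Lemma 19 suggests). The abbreviations $A$, $B$, $c$, $s$, $D$ are ours.

```latex
% Abbreviations (ours, not from the paper):
%   A = (\tilde f_{2n}^2 - \tilde f_{2n-1}^2)/2,   B = \tilde f_{2n}\tilde f_{2n-1},
%   c = \cos\tilde\theta_n,  s = \sin\tilde\theta_n,  D = d/dx,
%   \alpha = \|y_2\|\,\|y_1 + b_n y_2\|.
\begin{align*}
\frac{\partial x_n}{\partial q} &= \xi_n\,(cA - \kappa_n s B), \qquad
\frac{\partial y_n}{\partial q} = \xi_n\,(sA + \kappa_n c B), \\
\langle A, DA\rangle &= \langle B, DB\rangle = 0, \qquad
\langle A, DB\rangle = -\langle B, DA\rangle = -\frac{1}{2\alpha}
  && \text{(4.7)--(4.9) with } m = n, \\
\{x_n, y_n\}
 &= \Big\langle \frac{\partial x_n}{\partial q}, D\frac{\partial y_n}{\partial q}\Big\rangle_{L^2}
  = \xi_n^2\,\kappa_n\,(c^2 + s^2)\,\langle A, DB\rangle
  = -\frac{\xi_n^2\,\kappa_n}{2\alpha}.
\end{align*}
```

For finite gap potentials and large $n$, the leading-order values $\xi_n^2 \approx 1/(n\pi)$, $\kappa_n \approx -1$ and $\alpha \approx 1/(2n\pi)$ used in the proof of Lemma 21 make the right-hand side approximately 1, in line with $\{x_n, y_n\} = 1$ in Theorem 6(i).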

5 $d_q\Omega$ a local diffeomorphism

In this section we prove

Proposition 20 For $q \in L^2_0$, the map $d_q\Omega : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ is invertible.

Remark The derivative $d_q\Omega$ at $q = 0$ can be explicitly computed. It is given by ($p \in L^2_0$)

\[ d_0\Omega(p) = \Big(-\frac{1}{\sqrt{n\pi}}\,(p_{2n}, p_{2n-1})\Big)_{n \ge 1} \]

where $(p_n)_{n \ge 1}$ are the Fourier coefficients of $p$,

\[ p_{2n} = \int_0^1 p(x)\cos(2\pi n x)\,dx; \qquad p_{2n-1} = \int_0^1 p(x)\sin(2\pi n x)\,dx. \]

To prove Proposition 20 we show in a first step that dq Ω is Fredholm (cf Lemma 23 below). For this we need the following

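Lemma 21 For $K \ge 0$ and $q \in \mathrm{Gap}^0_{\le K}$ (cf. (4.3)), we have:

(i)
\begin{align*}
\sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)} &= -\sqrt{2}\cos 2\pi n x + O_\infty\Big(\frac{\log n}{n}\Big) \quad (n \to \infty) \\
\sqrt{2n\pi}\,\frac{\partial y_n}{\partial q(x)} &= -\sqrt{2}\sin 2\pi n x + O_\infty\Big(\frac{\log n}{n}\Big) \quad (n \to \infty);
\end{align*}

(ii)
\begin{align*}
\frac{1}{\sqrt{2n\pi}}\,\frac{d}{dx}\frac{\partial x_n}{\partial q(x)} &= \sqrt{2}\sin 2\pi n x + O_\infty\Big(\frac{\log n}{n}\Big) \quad (n \to \infty) \\
\frac{1}{\sqrt{2n\pi}}\,\frac{d}{dx}\frac{\partial y_n}{\partial q(x)} &= -\sqrt{2}\cos 2\pi n x + O_\infty\Big(\frac{\log n}{n}\Big) \quad (n \to \infty).
\end{align*}

Proof. The estimate for $\frac{\partial y_n}{\partial q(x)}$ is obtained similarly as the estimate for $\frac{\partial x_n}{\partial q(x)}$, so we concentrate on $\frac{\partial x_n}{\partial q(x)}$.

(i) Fix $K \ge 0$ and $q \in \mathrm{Gap}^0_{\le K}$ and let $n > K$ be arbitrary. As $\lambda_{2n-1}(q) = \lambda_{2n}(q)$, by Lemma 18,

\[ \frac{\partial x_n}{\partial q(x)} = \xi_n(q)\Big(\cos\tilde\theta_n\,\frac{\tilde f_{2n}^2 - \tilde f_{2n-1}^2}{2} - \kappa_n \sin\tilde\theta_n\,\tilde f_{2n}\tilde f_{2n-1}\Big). \tag{5.1} \]

Recall that $\tilde\theta_n = \sum_{k \neq n} \eta_{n,k}$. As, for $k > K$, $\mu_k = \lambda_{2k}$, we get, for $k > K$, $\eta_{n,k} = 0$. Therefore $\tilde\theta_n = \sum_{k=1}^{K} \eta_{n,k}$. By Lemma 4

\[ \tilde\theta_n = O\Big(\frac{1}{n}\Big). \tag{5.2} \]

Recall that $\xi_n = \frac{1}{\sqrt{n\pi}}\big(1 + O\big(\frac{\log n}{n}\big)\big)$, $\kappa_n = -1 + O\big(\frac{\log n}{n}\big)$. Further, as $y_1 = \cos n\pi x + O_\infty\big(\frac{1}{n}\big)$ and $y_2 = \frac{\sin n\pi x}{n\pi} + O_\infty\big(\frac{1}{n^2}\big)$ we have $\langle y_1, y_2 \rangle_{L^2} = O\big(\frac{1}{n^2}\big)$ and $\langle y_2, y_2 \rangle_{L^2} = O\big(\frac{1}{n^2}\big)$. Hence $b_n = -\frac{\langle y_1, y_2 \rangle_{L^2}}{\langle y_2, y_2 \rangle_{L^2}} = O(1)$ and $y_1 + b_n y_2 = \cos n\pi x + O_\infty\big(\frac{1}{n}\big)$. One thus obtains

\[ \tilde f_{2n} = \frac{y_2(x, \lambda_{2n})}{\|y_2(\cdot, \lambda_{2n})\|} = \sqrt{2}\sin n\pi x + O_\infty\Big(\frac{1}{n}\Big) \tag{5.3} \]

and

\[ \tilde f_{2n-1} = \frac{y_1 + b_n y_2}{\|y_1 + b_n y_2\|} = \sqrt{2}\cos n\pi x + O_\infty\Big(\frac{1}{n}\Big). \tag{5.4} \]

Therefore

\begin{align*}
\tilde f_{2n}\tilde f_{2n-1} &= \sin 2n\pi x + O_\infty\Big(\frac{1}{n}\Big), \tag{5.5} \\
\tilde f_{2n}^2 - \tilde f_{2n-1}^2 &= -2\cos 2n\pi x + O_\infty\Big(\frac{1}{n}\Big). \tag{5.6}
\end{align*}

Substituting the above estimates in (5.1), one obtains the claimed asymptotic.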
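The substitution can be written out as follows. This is a sketch under the assumption that $\xi_n$ multiplies the whole bracket in (5.1); the final comment compares the result at $q = 0$ with the Remark following Proposition 20.

```latex
\begin{align*}
\tilde f_{2n}\tilde f_{2n-1}
 &= 2\sin n\pi x\,\cos n\pi x + O_\infty(\tfrac1n) = \sin 2n\pi x + O_\infty(\tfrac1n), \\
\tilde f_{2n}^2 - \tilde f_{2n-1}^2
 &= 2\sin^2 n\pi x - 2\cos^2 n\pi x + O_\infty(\tfrac1n) = -2\cos 2n\pi x + O_\infty(\tfrac1n), \\
\cos\tilde\theta_n &= 1 + O(\tfrac{1}{n^2}), \qquad \sin\tilde\theta_n = O(\tfrac1n)
 \qquad \text{by (5.2)}, \\
\frac{\partial x_n}{\partial q(x)}
 &= \xi_n\big(-\cos 2n\pi x + O_\infty(\tfrac1n)\big), \\
\sqrt{2n\pi}\,\xi_n &= \sqrt2\,\big(1 + O(\tfrac{\log n}{n})\big), \\
\sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}
 &= -\sqrt2\,\cos 2n\pi x + O_\infty(\tfrac{\log n}{n}).
\end{align*}
% Consistency check at q = 0: the Remark gives
%   <dx_n/dq, p> = -(n\pi)^{-1/2} \int_0^1 p(x)\cos(2\pi n x)\,dx,
% and \sqrt{2n\pi}\,(n\pi)^{-1/2} = \sqrt2, so the leading term above is exact there.
```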

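(ii) The proof for (ii) is similar, using the asymptotics of the derivatives of the fundamental solutions $y_1(x, \lambda_{2n})$ and $y_2(x, \lambda_{2n})$ stated in (C.9).

Introduce ($n \ge 1$)

\begin{align*}
B_n \equiv B_n(q) &:= \sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}; & B_{-n} \equiv B_{-n}(q) &:= \sqrt{2n\pi}\,\frac{\partial y_n}{\partial q(x)}; \\
T_n \equiv T_n(q) &:= -\sqrt{2}\cos 2\pi n x; & T_{-n} \equiv T_{-n}(q) &:= -\sqrt{2}\sin 2\pi n x.
\end{align*}

From Lemma 21 we obtain, with $\mathrm{Gap}^0_{\mathrm{finite}} = \cup_{k \ge 1}\mathrm{Gap}^0_{\le k}$,

Corollary 22 For $q \in \mathrm{Gap}^0_{\mathrm{finite}}$, the system $(B_m)_{m \neq 0}$ is quadratically close to $(T_m)_{m \neq 0}$, i.e.

\[ \sum_{m \neq 0} \|B_m - T_m\|^2 < \infty. \]

The linear operator $d_q\Omega : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ is given by

\[ d_q\Omega(h) = \sum_{m \in \mathbb{Z}\setminus\{0\}} \langle h, B_m \rangle_{L^2}\, e_m \tag{5.7} \]

where $e_m = (2m\pi)^{-1/2}(\delta_{n,m}, 0)_{n \ge 1}$ and $e_{-m} = (2m\pi)^{-1/2}(0, \delta_{n,m})_{n \ge 1}$. Denote by $(e_m^*)_m$ the basis dual to $(e_m)_m$, i.e. $e_m^* = (2m\pi)^{1/2}(\delta_{n,m}, 0)_{n \ge 1}$ and $e_{-m}^* = (2m\pi)^{1/2}(0, \delta_{n,m})_{n \ge 1}$.

Lemma 23 Let $q \in L^2_0$.
(i) The operator $d_q\Omega$ is a Fredholm operator with index 0.
(ii) $B_m = T_m + o_2(1)$, ($\pm m \to \infty$).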

Proof. Introduce the operators $D : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ and $A_q : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$, given by

\[ D(h) := \sum_{m \in \mathbb{Z}\setminus\{0\}} \langle h, T_m \rangle_{L^2}\, e_m; \qquad A_q := d_q\Omega - D; \qquad A_q(h) = \sum_{m \in \mathbb{Z}\setminus\{0\}} \langle h, B_m - T_m \rangle_{L^2}\, e_m. \]

(i) First we prove that, for $q \in \mathrm{Gap}^0_{\mathrm{finite}}$, the operator $A_q$ is compact. It follows from Corollary 22 that, for any $q \in \mathrm{Gap}^0_{\mathrm{finite}}$ and $\varepsilon > 0$, there exist $a > 0$ and $M > 0$ such that for all $h \in L^2_0$ with $\|h\| \le 1$, the following inequalities hold

\[ \|A_q h\| \le a; \qquad \sum_{|m| > M} \langle h, B_m - T_m \rangle_{L^2}^2 < \varepsilon. \]

Thus $A_q$ is compact. As $A_q = d_q\Omega - D$ depends continuously on $q$ and $\mathrm{Gap}^0_{\mathrm{finite}}$ is dense in $L^2_0$, we conclude that $A_q$ is compact for $q \in L^2_0$. As $D$ is invertible, $d_q\Omega$ is a Fredholm operator of index 0.

(ii) Notice that, for $m \neq 0$, $(d_q\Omega)^*(e_m^*) = B_m$, where $(d_q\Omega)^* : h^{-1/2}(\mathbb{N}; \mathbb{R}^2) \to L^2_0$ and $(e_m^*)_m$ denotes the basis dual to $(e_m)_m$ introduced above. Indeed, for $h \in L^2_0$,

\[ \langle (d_q\Omega)^*(e_m^*), h \rangle_{L^2} = \langle e_m^*, d_q\Omega(h) \rangle = \langle h, B_m \rangle_{L^2} \]

where we used (5.7). By (i), $B_m = D^*(e_m^*) + A_q^*(e_m^*)$. Notice that $D^*(e_m^*) = T_m$. Further $A_q^*(e_m^*) = o_2(1)$ as $A_q^* : h^{-1/2}(\mathbb{N}; \mathbb{R}^2) \to L^2_0$ is compact.

As a second ingredient of the proof of Proposition 20, we show that $d_q\Omega$ is 1-1. First we need to establish some auxiliary results. Following [GK], we say that a sequence $(F_n)_{n \in J}$ in $L^2_0$ ($J \subset \mathbb{Z}$) is almost normalized if

\[ 0 < \inf_n \|F_n\| \quad \text{and} \quad \sup_n \|F_n\| < \infty. \]

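An almost normalized sequence $(F_n)_{n \in J}$ is said to be $\omega$-linearly independent in $L^2_0$ (cf. [GK] p. 316) if for any sequence $(\alpha_n)_{n \in J}$ with $\sum_{n \in J} \alpha_n^2 < \infty$ and $\sum_{n \in J} \alpha_n F_n = 0$, one has $\alpha_n = 0$ for all $n \in J$. Notice that, by Lemma 23, $(B_m)_{m \neq 0}$ is almost normalized.

Lemma 24 Let $q \in L^2_0$. Then $d_q\Omega$ is invertible iff $(B_m)_{m \neq 0}$ is $\omega$-linearly independent in $L^2_0$.

Proof. By Lemma 23, $(d_q\Omega)^* : h^{-1/2}(\mathbb{N}; \mathbb{R}^2) \to L^2_0$ is a Fredholm operator of index 0. Further, for $m \neq 0$, $(d_q\Omega)^*(e_m^*) = B_m$. Therefore, $\mathrm{Null}\,(d_q\Omega)^* = \{0\}$ iff $(B_m)_{m \neq 0}$ is $\omega$-linearly independent in $L^2_0$.

For $n \in O$, $\frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)} = \cos\theta_n\, B_n + \sin\theta_n\, B_{-n}$. Hence, by Lemma 21, the sequence $\Big(\frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}\Big)_{n \in O}$ is almost normalized.

Lemma 25 The system $(B_m)_{m \neq 0}$ is $\omega$-linearly independent in $L^2_0$ iff the system $\Big(\frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}\Big)_{n \in O}$ is $\omega$-linearly independent in $L^2_0$.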

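Proof. Assume that, for a sequence $(\alpha_m)_{m \neq 0}$ with $\sum_{m \in \mathbb{Z}\setminus\{0\}} \alpha_m^2 < \infty$,

\[ f := \sum_{m \in \mathbb{Z}\setminus\{0\}} \alpha_m B_m = \sum_{n \ge 1} \sqrt{2n\pi}\Big(\alpha_n \frac{\partial x_n}{\partial q(x)} + \alpha_{-n}\frac{\partial y_n}{\partial q(x)}\Big) = 0. \]

Then, by Corollary 17, for $k \in O$,

\[ 0 = \Big\langle f, \frac{d}{dx}\frac{\partial I_k}{\partial q(x)} \Big\rangle_{L^2} = \sqrt{2k\pi}\,(\alpha_k y_k - \alpha_{-k} x_k). \]

Thus, for $k \in O$, $(\alpha_k, \alpha_{-k}) = \pm\sqrt{\alpha_k^2 + \alpha_{-k}^2}\,(\cos\theta_k, \sin\theta_k)$ and

\[ \alpha_k \frac{\partial x_k}{\partial q(x)} + \alpha_{-k}\frac{\partial y_k}{\partial q(x)} = \pm\sqrt{\alpha_k^2 + \alpha_{-k}^2}\,\frac{1}{\sqrt{2I_k}}\frac{\partial I_k}{\partial q(x)}. \]

By Proposition 12(iii) and Corollary 17, for $k \notin O$,

\begin{align*}
0 &= \Big\langle f, \frac{d}{dx}\frac{\partial x_k}{\partial q(x)} \Big\rangle_{L^2} = \sqrt{2k\pi}\,\alpha_{-k}\Big\langle \frac{\partial y_k}{\partial q(x)}, \frac{d}{dx}\frac{\partial x_k}{\partial q(x)} \Big\rangle_{L^2} \\
0 &= \Big\langle f, \frac{d}{dx}\frac{\partial y_k}{\partial q(x)} \Big\rangle_{L^2} = \sqrt{2k\pi}\,\alpha_k\Big\langle \frac{\partial x_k}{\partial q(x)}, \frac{d}{dx}\frac{\partial y_k}{\partial q(x)} \Big\rangle_{L^2}.
\end{align*}

Hence, by Proposition 12(iii), for $k \notin O$, $\alpha_{\pm k} = 0$ and

\[ 0 = \sum_{m \in \mathbb{Z}\setminus\{0\}} \alpha_m B_m = \sum_{n \in O} \pm\sqrt{\alpha_n^2 + \alpha_{-n}^2}\,\frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}. \]

From these considerations the claimed statement follows.

Lemma 26 The system $\Big(\frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}\Big)_{n \in O}$ is $\omega$-linearly independent in $L^2_0$.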

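Proof. We have to show that for any $(\alpha_n)_{n \in O}$ with $\sum_{n \in O} \alpha_n^2 < \infty$ and

\[ \sum_{n \in O} \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)} = 0 \tag{5.8} \]

one has $\alpha_n = 0$ for any $n \in O$. Recall that, for $k \in O$ and $m \ge 1$, we have introduced

\[ h_{m,k}(x, q) := \begin{cases} -\dfrac{\psi_m(\mu_k)}{\dot\Delta(\mu_k)}\, y_1(x, \mu_k)\, y_2(x, \mu_k) & \mu_k \in \{\lambda_{2k-1}, \lambda_{2k}\} \\[2ex] \dfrac{\psi_m(\mu_k)}{\sqrt{\Delta^2(\mu_k) - 4}}\,\dfrac{\partial \mu_k}{\partial q(x)} & \lambda_{2k-1} < \mu_k < \lambda_{2k} \end{cases} \]

and proved (cf. Lemma 16)

\[ \Big\langle \frac{\partial I_n}{\partial q(x)}, \frac{d}{dx} h_{m,k} \Big\rangle_{L^2} = \frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{1}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \frac{m_{12}(\lambda)}{\lambda - \mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda) - 4}}. \]

For any $m \in O$ given, we want to conclude from (5.8) that $\alpha_m = 0$. Indeed,

\begin{align*}
0 &= \Big\langle \sum_{n \in O} \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}, \frac{d}{dx} h_{m,k} \Big\rangle_{L^2} \\
&= \sum_{n \in O} \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\,\frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{1}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \frac{m_{12}(\lambda)}{\lambda - \mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda) - 4}}.
\end{align*}

With the change of variable of integration $\lambda = \zeta_n(t) := \tau_n + t\,\frac{\gamma_n}{2}$ ($-1 \le t \le 1$),

\[ \int_{\lambda_{2n-1}}^{\lambda_{2n}} \frac{m_{12}(\lambda)}{\lambda - \mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda) - 4}} = \int_{-1}^{1} \frac{m_{12}(\zeta_n(t))}{\zeta_n(t) - \mu_k}\,\frac{\sqrt{1 - t^2}\,\gamma_n/2}{\sqrt{\Delta^2(\zeta_n(t)) - 4}}\,\frac{dt}{\sqrt{1 - t^2}} \]

and standard asymptotic estimates for $\sqrt{2I_n} = \xi_n \gamma_n/2$, $\psi_m(\lambda)$, and $\dot m_{12}(\lambda)$ one concludes that (for $n, k \neq m$)

\[ \left| \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\,\frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{m_{12}(\zeta_n(t))}{\zeta_n(t) - \mu_k}\,\frac{\sqrt{1 - t^2}\,\gamma_n/2}{\sqrt{\Delta^2(\zeta_n(t)) - 4}} \right| \le C\,\frac{m}{|k^2 - m^2|}\,\frac{|\alpha_n|}{n}. \]

Therefore

\[ 0 = \sum_{n \in O} \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\,\frac{1}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \sum_{k \in O} \frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{m_{12}(\lambda)}{\lambda - \mu_k}\,\frac{d\lambda}{\sqrt{\Delta^2(\lambda) - 4}}. \tag{5.9} \]

For $k \notin O$, $\psi_m(\mu_k) = 0$. Thus, by the sampling formula (cf. Proposition 46 in Appendix D),

\[ \sum_{k \in O} \frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{m_{12}(\lambda)}{\lambda - \mu_k} = \sum_{k \ge 1} \frac{\psi_m(\mu_k)}{\dot m_{12}(\mu_k)}\,\frac{m_{12}(\lambda)}{\lambda - \mu_k} = \psi_m(\lambda). \]

We now can rewrite (5.9) as

\[ 0 = \sum_{n \in O} \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\,\frac{1}{\pi}\int_{\lambda_{2n-1}}^{\lambda_{2n}} \frac{\psi_m(\lambda)}{\sqrt{\Delta^2(\lambda) - 4}}\,d\lambda = \sum_{n \in O} \alpha_n \frac{\sqrt{2n\pi}}{\sqrt{2I_n}}\,\delta_{n,m} \]

and hence $\alpha_m = 0$.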

6 Ω a diffeomorphism

The main result of this section is the following

Theorem 3 The map $\Omega : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ as well as its inverse is a real analytic diffeomorphism.

First we need to prove

Proposition 27 The map $\Omega : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ is proper.

Proof. Given a compact subset $K \subset h^{1/2}(\mathbb{N}; \mathbb{R}^2)$, there exists $M \ge 1$ and, for any $\varepsilon > 0$, $n_\varepsilon \ge 1$ so that, for all $q \in Q := \Omega^{-1}(K) \subseteq L^2_0$,

\begin{align*}
\sum_{n \ge 1} n\,|I_n(q)| &\le M; \tag{6.1} \\
\sum_{n \ge n_\varepsilon} n\,|I_n(q)| &\le \varepsilon. \tag{6.2}
\end{align*}

It is proved in [BBGK, Lemma 2.2] that

\[ I_n \ge \frac{1}{(8\pi)^2}\min\{(1/n)\gamma_n^2,\ n\gamma_n\}. \]

Thus the set $\{(\gamma_n(q))_{n \ge 1} \mid q \in \Omega^{-1}(K)\}$ is compact in $\ell^2$. Therefore $\Omega^{-1}(K)$ is compact in $L^2_0$ (cf. [GT]).

Proof of Theorem 3 We have established that $\Omega : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ is a real analytic map and a local diffeomorphism. It remains to show that $\Omega$ is 1-1 and onto. Consider the set $V := \{z \in h^{1/2}(\mathbb{N}; \mathbb{R}^2) \mid \#\Omega^{-1}(z) = 1\}$. Then $V$ is open and closed in $h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ as $\Omega$ is proper and a local diffeomorphism. In order to prove that $V = h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ it suffices therefore to show that $V \neq \emptyset$. Take $w = 0 \in h^{1/2}(\mathbb{N}; \mathbb{R}^2)$. Then, for any $q \in \Omega^{-1}(0)$ and $n \ge 1$, $\gamma_n(q) = 0$ and therefore $q \equiv 0$.

7 Restriction of Ω to $H^N_0$ ($N \ge 1$)

In this section we want to improve on Theorem 3. For any $N \ge 0$, denote by $\Omega^{(N)}$ the restriction of $\Omega \equiv \Omega^{(0)}$ to $H^N_0$. It turns out that the range of $\Omega^{(N)}$ is contained in $h^{N+1/2}(\mathbb{N}; \mathbb{R}^2)$ (cf. Lemma 29), hence $\Omega^{(N)}$ can be viewed as a map $\Omega^{(N)} : H^N_0 \to h^{N+1/2}(\mathbb{N}; \mathbb{R}^2)$.

Theorem 4 For any $N \ge 0$,
(i) $\Omega^{(N)}$ is a diffeomorphism;
(ii) $\Omega^{(N)}$ is real analytic.

The proof of Theorem 4 follows from the results stated in the remainder of this section. Recall the following result from [KM] (cf. also [ST]) and [Ma].

Proposition 28 (i) For $q_0 \in H^N_0$, there exists a complex neighborhood $U_{q_0} \subseteq H^N_{0,\mathbb{C}}$ so that, for $q \in U_{q_0}$, $(\gamma_n(q))_{n \ge 1}$ and $(\mu_n(q) - \lambda_{2n}(q))_{n \ge 1}$ are uniformly bounded in $h^N(\mathbb{N}; \mathbb{C})$.
(ii) For any real valued $q \in L^2_0$ one has $q \in H^N_0$ iff $(\gamma_n(q))_{n \ge 1} \in h^N(\mathbb{N}; \mathbb{R})$.

As a consequence we obtain the following

Lemma 29 Let $N \ge 0$.
(i) For $q_0 \in H^N_0$ there exists a complex neighborhood $U_{q_0}$ of $q_0$ in $H^N_{0,\mathbb{C}}$ so that $\Omega(U_{q_0})$ is bounded in $h^{N+1/2}(\mathbb{N}; \mathbb{C}^2)$.
(ii) For real valued potentials, the following characterization holds: $q \in H^N_0$ iff $(x_n(q), y_n(q))_{n \ge 1} \in h^{N+1/2}(\mathbb{N}; \mathbb{R}^2)$.

Proof. (i) By Proposition 28(i), there exists a complex neighborhood $V_{q_0}$ of $q_0$ in $H^N_{0,\mathbb{C}}$ so that $(\gamma_n(q))_{n \ge 1}$ and $(\mu_n(q) - \lambda_{2n}(q))_{n \ge 1}$ are uniformly bounded in $h^N(\mathbb{N}; \mathbb{C})$. By Corollary 11, there exists a complex neighborhood $W_{q_0}$ of $q_0$ so that $|x_n| + |y_n| \le \frac{C}{n^{1/2}}\big(|\mu_n - \tau_n| + |\gamma_n|\big)$ ($\forall n \ge 1$). Hence $\Omega(V_{q_0} \cap W_{q_0}) \subseteq h^{N+1/2}(\mathbb{N}; \mathbb{C}^2)$.

(ii) In view of (i) it remains to prove that for any element $(x_n, y_n)_{n \ge 1} \in h^{N+1/2}(\mathbb{N}; \mathbb{R}^2)$, $\Omega^{-1}\big((x_n, y_n)_{n \ge 1}\big) \in H^N_0$. By Theorem 3, $q := \Omega^{-1}\big((x_n, y_n)_{n \ge 1}\big) \in L^2_0$. As $q$ is real valued, $|x_n|^2 + |y_n|^2 = 2I_n$. By Proposition 1, $2I_n = O\big(\frac{1}{n}\gamma_n^2\big)$. As $q$ is real valued and $(x_n, y_n)_{n \ge 1} \in h^{N+1/2}(\mathbb{N}; \mathbb{R}^2)$ it then follows from Proposition 28(ii) that $q \in H^N_0$.

As a consequence of Lemma 29 one gets

Corollary 30 For any $N \ge 0$, $\Omega^{(N)} : H^N_0 \to h^{N+1/2}(\mathbb{N}; \mathbb{R}^2)$ is real analytic and bijective.

Proof. To see that $\Omega^{(N)}$ is real analytic it suffices to show that $\Omega^{(N)}$ is weakly analytic and locally bounded. As $\Omega$ is real analytic, $\Omega^{(N)}$ is weakly analytic. By Lemma 29(i), $\Omega^{(N)}$ is locally bounded. From the fact that $\Omega : L^2_0 \to h^{1/2}$ is bijective it follows that $\Omega^{(N)} : H^N_0 \to h^{N+1/2}$ is 1-1 and, by Lemma 29(ii), we have that $\Omega^{(N)}$ is onto.

Let us now analyze the derivative $d_q\Omega$ in more detail. Clearly, for $q \in H^N_0$,

\[ d_q\Omega^{(N)}_n = d_q\Omega_n\big|_{H^N_0}. \]

Using an inductive procedure, we obtain the following improvement of Lemma 21.

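Lemma 31 Let $q \in \mathrm{Gap}^0_{\le K}$ with $K \ge 0$ and $N \ge 0$. Then for any $p \in H^N_0$, the following statements hold:

\begin{align*}
\Big| \Big\langle \sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}, p \Big\rangle_{L^2} + \big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} \Big| &\le C_n \|p\|_{H^N}; \\
\Big| \Big\langle \sqrt{2n\pi}\,\frac{\partial y_n}{\partial q(x)}, p \Big\rangle_{L^2} + \big\langle \sqrt{2}\sin 2n\pi x, p \big\rangle_{L^2} \Big| &\le C_n \|p\|_{H^N}
\end{align*}

where the bounds $C_n$ are independent of $p$ and satisfy $C_n = O\big(\frac{\log n}{n^{N+1}}\big)$.

Proof. Both estimates are proved similarly, so we concentrate on the first one. The proof consists in verifying the statement for $N = 0, 1$ and in proving an inductive step. Let us start with the latter one. Assume that the statement has already been proved for $N \ge 0$. We want to show that the statement holds for $N + 2$. Let $p \in H^{N+2}_0$. According to Lemma 18 and as $q \in \mathrm{Gap}^0_{\le K}$, $\frac{\partial x_n}{\partial q(x)}$ is, for $n \ge K + 1$, a linear combination of the products $y_i(x, \lambda_{2n}, q)\, y_j(x, \lambda_{2n}, q) \in C^\infty$ ($1 \le i, j \le 2$). Hence (straightforward verification)

\[ L_q \frac{\partial x_n}{\partial q(x)} = 2\lambda_{2n}\,\frac{d}{dx}\frac{\partial x_n}{\partial q(x)} \tag{7.1} \]

where $L_q$ is a skew symmetric differential operator of order 3, given by

\[ L_q = -\frac{1}{2}\frac{d^3}{dx^3} + \frac{d}{dx}\,q + q\,\frac{d}{dx}. \]

Denote by $\big(\frac{d}{dx}\big)^{-1} : L^2_0 \to H^1_0$ the inverse of the restriction of $\frac{d}{dx}$ to $H^1_0$. It follows from (7.1) that

\[ \frac{\partial x_n}{\partial q(x)} = \frac{1}{2\lambda_{2n}}\Big(\frac{d}{dx}\Big)^{-1} L_q \frac{\partial x_n}{\partial q(x)}. \tag{7.2} \]

Substitute (7.2) into $\big\langle \frac{\partial x_n}{\partial q(x)}, p \big\rangle_{L^2}$ and integrate by parts to get

\[ \Big\langle \frac{\partial x_n}{\partial q(x)}, p \Big\rangle_{L^2} = \frac{1}{2\lambda_{2n}}\Big\langle \frac{\partial x_n}{\partial q(x)}, \tilde p \Big\rangle_{L^2} \tag{7.3} \]

where

\[ \tilde p := L_q \Big(\frac{d}{dx}\Big)^{-1} p = -\frac{1}{2}p'' + 2qp + q'\Big(\frac{d}{dx}\Big)^{-1} p \in H^N_0. \tag{7.4} \]

By the induction hypothesis

\[ \Big| \Big\langle \sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}, \tilde p \Big\rangle_{L^2} + \big\langle \sqrt{2}\cos 2n\pi x, \tilde p \big\rangle_{L^2} \Big| \le O\Big(\frac{\log n}{n^{N+1}}\Big)\|\tilde p\|_{H^N}. \tag{7.5} \]

By (7.4), we have

\[ \|\tilde p\|_{H^N} \le C\|p\|_{H^{N+2}}. \tag{7.6} \]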
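The "straightforward verification" behind (7.1) and the explicit form of $\tilde p$ in (7.4) can be sketched as follows. The sketch assumes $q$ is differentiable (finite gap potentials are smooth) and uses only that each factor of $f = y_i y_j$ solves $-y'' + qy = \lambda y$; the letters $f$ and $g$ are ours.

```latex
% f = y_i y_j, with -y'' + q y = \lambda y for both factors (here \lambda = \lambda_{2n}).
\begin{align*}
f'' &= 2(q-\lambda) f + 2\,y_i' y_j', \\
f''' &= 4(q-\lambda) f' + 2 q' f, \\
L_q f &= -\tfrac12 f''' + (q f)' + q f'
       = -2(q-\lambda) f' - q' f + q' f + 2 q f' = 2\lambda f'. \\[1ex]
% Explicit form of \tilde p, with g = (d/dx)^{-1} p, so that g' = p:
L_q g &= -\tfrac12 g''' + (q g)' + q g'
       = -\tfrac12 p'' + 2 q p + q'\Big(\frac{d}{dx}\Big)^{-1} p .
\end{align*}
```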

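Further,

\[ \big\langle \sqrt{2}\cos 2n\pi x, \tilde p \big\rangle_{L^2} = -\frac{1}{2}\big\langle \sqrt{2}\cos 2n\pi x, p'' \big\rangle_{L^2} + \Big\langle \sqrt{2}\cos 2n\pi x,\ 2qp + q'\Big(\frac{d}{dx}\Big)^{-1} p \Big\rangle_{L^2} \tag{7.7} \]

where

\[ \big\langle \sqrt{2}\cos 2n\pi x, p'' \big\rangle_{L^2} = -(2n\pi)^2 \big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} \tag{7.8} \]

and

\[ \Big| \Big\langle \sqrt{2}\cos 2n\pi x,\ 2qp + q'\Big(\frac{d}{dx}\Big)^{-1} p \Big\rangle_{L^2} \Big| \le O\Big(\frac{1}{n^{N+2}}\Big)\|p\|_{H^{N+2}}. \tag{7.9} \]

Substituting (7.8) and (7.9) into (7.7) and using (7.6), (7.5) leads to the following estimate

\[ \Big| \Big\langle \sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}, \tilde p \Big\rangle_{L^2} + 2n^2\pi^2 \big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} \Big| \le O\Big(\frac{\log n}{n^{N+1}}\Big)\|p\|_{H^{N+2}}. \tag{7.10} \]

Using (7.3), (7.10) and the asymptotics $\lambda_{2n} = n^2\pi^2 + O(1)$, we obtain

\begin{align*}
\Big| \Big\langle \sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}, p \Big\rangle_{L^2} &+ \big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} \Big| \\
&\le \Big| \frac{1}{2\lambda_{2n}}\Big\langle \sqrt{2n\pi}\,\frac{\partial x_n}{\partial q(x)}, \tilde p \Big\rangle_{L^2} + \frac{2n^2\pi^2}{2\lambda_{2n}}\big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} \Big| \\
&\quad + \Big| -\frac{2n^2\pi^2}{2\lambda_{2n}}\big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} + \big\langle \sqrt{2}\cos 2n\pi x, p \big\rangle_{L^2} \Big| \\
&\le O\Big(\frac{\log n}{n^{N+3}}\Big)\|p\|_{H^{N+2}}.
\end{align*}

This proves the induction step. It remains to verify the statements for $N = 0$ and $N = 1$. The case $N = 0$ is contained in Lemma 21(i). The case $N = 1$ is proved in similar fashion as the induction step using the operator $\big(\frac{d}{dx}\big)^{-1} L_q \big(\frac{d}{dx}\big)^{-1}$ instead of $L_q \big(\frac{d}{dx}\big)^{-1}$ together with Lemma 21(ii).

Lemma 32 For $q \in H^N_0$, $d_q\Omega^{(N)} : H^N_0 \to h^{N+1/2}$ is bijective.

Proof. By Theorem 3, $d_q\Omega : L^2_0 \to h^{1/2}$ is bijective, hence $d_q\Omega^{(N)} = d_q\Omega\big|_{H^N_0}$ is 1-1. To see that $d_q\Omega^{(N)}$ is onto it then suffices to prove that $d_q\Omega^{(N)}$ is a Fredholm operator of index 0. Using Lemma 31, this is verified in a similar way as in the proof of Lemma 23.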

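8 Ω a symplectomorphism

The symplectic structure $\omega$ associated to the Poisson bracket $\{F, G\} = \big\langle \frac{\partial F}{\partial q(x)}, \frac{d}{dx}\frac{\partial G}{\partial q(x)} \big\rangle_{L^2}$ is given by $\omega(f, g) := \big\langle f, \big(\frac{d}{dx}\big)^{-1} g \big\rangle_{L^2}$ ($f, g \in L^2_0$). Denote by $\omega_{can}$ the canonical symplectic structure $\omega_{can} = \sum_{k=1}^{\infty} dy_k \wedge dx_k$ on $h^{1/2}(\mathbb{N}; \mathbb{R}^2)$. In this section we prove

Theorem 5 The map $\Omega : (L^2_0, \omega) \to \big(h^{1/2}(\mathbb{N}; \mathbb{R}^2), \omega_{can}\big)$ is a symplectomorphism.

To establish Theorem 5, it remains to prove that $\Omega_*\omega = \omega_{can}$. We will establish this identity for finite gap potentials and then argue by continuity. First let us introduce some more notation. Recall that $D_m = \{q \mid \gamma_m(q) = 0\}$ and define, for any given $K \ge 0$, the map

\[ \Lambda_K : \bigcap_{m \le K} \big(L^2_0 \setminus D_m\big) \to \big(\mathbb{R}_{>0} \times S^1\big)^K \times h^{1/2}(\mathbb{N}_{>K}; \mathbb{R}^2), \qquad q \mapsto \big((I_n(q), \theta_n(q))_{1 \le n \le K},\ (x_n(q), y_n(q))_{n > K}\big). \]

By Proposition 20, $\Lambda_K$ is a local diffeomorphism. Further $d_q\Lambda_K : L^2_0 \to h^{1/2}(\mathbb{N}; \mathbb{R}^2)$ is given by

\[ d_q\Lambda_K(h) = \sum_{n=1}^{K}\Big( \Big\langle \frac{\partial I_n}{\partial q(x)}, h \Big\rangle_{L^2} e_n + \Big\langle \frac{\partial \theta_n}{\partial q(x)}, h \Big\rangle_{L^2} e_{-n} \Big) + \sum_{n=K+1}^{\infty}\Big( \Big\langle \frac{\partial x_n}{\partial q(x)}, h \Big\rangle_{L^2} e_n + \Big\langle \frac{\partial y_n}{\partial q(x)}, h \Big\rangle_{L^2} e_{-n} \Big). \]

Introduce $v_{\pm n} \equiv v_{\pm n}(q) := (d_q\Lambda_K)^{-1}(e_{\pm n})$ and let $\omega_K$ be the restriction of the symplectic form $\omega$ to $\mathrm{Gap}^0_{\le K}$ which we now analyze.

Lemma 33 Let $q \in \mathrm{Gap}^0_{\le K}$ and $1 \le n, m \le K$. Then
(i) $v_{\pm n}(q) \in T_q\mathrm{Gap}^0_{\le K}$.
(ii) $v_{-n}(q) = -\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}$.
(iii) $\omega_K(v_{-m}, v_{-n}) = 0$; $\omega_K(v_m, v_{-n}) = -\delta_{n,m}$.

Proof. Notice that the system $\big(\frac{\partial I_n}{\partial q(x)}, \frac{\partial \theta_n}{\partial q(x)}\big)_{1 \le n \le K}$, $\big(\frac{\partial x_n}{\partial q(x)}, \frac{\partial y_n}{\partial q(x)}\big)_{n > K}$ is biorthogonal to $(v_n, v_{-n})_{n \ge 1}$, i.e. for $1 \le n \le K$ and $m \ge 1$,

\begin{align*}
\Big\langle \frac{\partial I_n}{\partial q(x)}, v_m \Big\rangle_{L^2} &= \delta_{n,m}; & \Big\langle \frac{\partial \theta_n}{\partial q(x)}, v_{-m} \Big\rangle_{L^2} &= \delta_{n,m}; \tag{8.1} \\
\Big\langle \frac{\partial I_n}{\partial q(x)}, v_{-m} \Big\rangle_{L^2} &= 0; & \Big\langle \frac{\partial \theta_n}{\partial q(x)}, v_m \Big\rangle_{L^2} &= 0 \tag{8.2}
\end{align*}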

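and, for $n > K$, $m \ge 1$,

\begin{align*}
\Big\langle \frac{\partial x_n}{\partial q(x)}, v_m \Big\rangle_{L^2} &= \delta_{n,m}; & \Big\langle \frac{\partial y_n}{\partial q(x)}, v_{-m} \Big\rangle_{L^2} &= \delta_{n,m}; \tag{8.3} \\
\Big\langle \frac{\partial x_n}{\partial q(x)}, v_{-m} \Big\rangle_{L^2} &= 0; & \Big\langle \frac{\partial y_n}{\partial q(x)}, v_m \Big\rangle_{L^2} &= 0. \tag{8.4}
\end{align*}

(i) As $\mathrm{Gap}^0_{\le K} = \{q \in L^2_0 \mid x_n(q) = y_n(q) = 0 \text{ iff } n > K\}$, it follows from (8.3) and (8.4) that, for $1 \le m \le K$, $v_{\pm m} \in T_q\mathrm{Gap}^0_{\le K}$.

(ii) By Lemma 13, $\frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \in T_q\mathrm{Iso}(q) \subset T_q\mathrm{Gap}^0_{\le K}$. By Proposition 12(ii), for $1 \le n, m \le K$,

\[ \Big\langle \frac{\partial \theta_m}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = -\delta_{n,m}. \]

By Proposition 12(i) and Corollary 17, for $l > K$, $m \ge 1$, and $1 \le n \le K$, we have $\big\langle \frac{\partial I_m}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \big\rangle_{L^2} = 0$ and

\[ \Big\langle \frac{\partial x_l}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = 0; \qquad \Big\langle \frac{\partial y_l}{\partial q(x)}, \frac{d}{dx}\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = 0. \]

The conditions (8.1)-(8.4) determine $(v_n, v_{-n})_{n \ge 1}$ uniquely. Thus, for $1 \le n \le K$, $v_{-n}(q) = -\frac{d}{dx}\frac{\partial I_n}{\partial q(x)}$.

(iii) As, for $1 \le l \le K$, $v_{\pm l}(q) \in T_q\mathrm{Gap}^0_{\le K}$, we obtain, for $1 \le n, m \le K$, using (ii) and (8.1)

\begin{align*}
\omega_K(v_{-n}, v_{-m}) &= \omega(v_{-n}, v_{-m}) = \Big\langle \frac{d}{dx}\frac{\partial I_n}{\partial q(x)}, \frac{\partial I_m}{\partial q(x)} \Big\rangle_{L^2} = 0; \\
\omega_K(v_m, v_{-n}) &= \omega(v_m, v_{-n}) = \Big\langle v_m, -\frac{\partial I_n}{\partial q(x)} \Big\rangle_{L^2} = -\delta_{n,m}.
\end{align*}

When expressed in the coordinates $(I_n, \theta_n)_{1 \le n \le K}$ on $\mathrm{Gap}^0_{\le K}$ the 2-form $\omega_K$ takes, in view of Lemma 33, the form

\[ \omega_K = \sum_{n=1}^{K} d\theta_n \wedge dI_n + \sum_{1 \le i < j \le K} c_{ij}\, dI_i \wedge dI_j \tag{8.5} \]

where $c_{ij}$ are functions of $(I_n, \theta_n)_{1 \le n \le K}$ ($1 \le i, j \le K$). As $\omega$ is closed, $\omega_K$ is closed as well. Therefore the coefficients $c_{ij}$ depend only on $I_1, \ldots, I_K$. We want to show that $c_{ij}$ vanish. To this end we prove that $c_{ij} = 0$ when evaluated at a potential $q \in \mathrm{Gap}^0_{\le K}$ with $\theta_1 = \cdots = \theta_K = 0$. Introduce, for $A \subseteq L^2$, the subset of normalized potentials in $A$

\[ \mathrm{Nor}A := \{q \in A \mid \mu_k(q) = \lambda_{2k}(q)\ \forall k \ge 1\}. \]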

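Notice that on $\mathrm{Nor}\,\mathrm{Gap}^0_{\le K}$, $\theta_1 = \cdots = \theta_K = 0$. In Appendix C, we derive an explicit formula for the gradient $\frac{\partial \theta_n}{\partial q(x)}$ on $\mathrm{Nor}L^2_0 \setminus D_n$ which turns out to be in $H^2$ (cf. Proposition 41). Hence, on $(L^2_0 \setminus D_n) \cap (L^2_0 \setminus D_m) \cap \mathrm{Nor}L^2_0$, $\{\theta_m, \theta_n\}$ is well defined. Further in Appendix C, Lemma 45, the gradients $\frac{\partial x_l}{\partial q(x)}$ and $\frac{\partial y_l}{\partial q(x)}$ for potentials $q \in L^2_0$ with $\gamma_l(q) = 0$ are given which also turn out to be in $H^2$. Hence, for $q \in L^2_0$ with $\gamma_n \neq 0$ and $\gamma_l = 0$, $\{\theta_n, x_l\}(q)$ and $\{\theta_n, y_l\}(q)$ are both well defined.

Lemma 34 (i) For $m, n \ge 1$ and $q \in (L^2_0 \setminus D_m) \cap (L^2_0 \setminus D_n) \cap \mathrm{Nor}L^2_0$, $\{\theta_m, \theta_n\}(q) = 0$.
(ii) For $l, n \ge 1$ and $q \in \mathrm{Nor}L^2_0$ with $\gamma_l(q) = 0$ and $\gamma_n \neq 0$, $\{\theta_n, x_l\} = \{\theta_n, y_l\} = 0$.

Proof. (i) For $k \ge 1$, introduce

\[ a_k(x, q) := y_1(x, \mu_k(q), q)\, y_2(x, \mu_k(q), q); \qquad g_k(x, q) := \frac{y_2(x, \mu_k(q), q)}{\|y_2(\cdot, \mu_k(q), q)\|_{L^2}}. \]

Then (cf. [PT]), for $i, j \ge 1$,

\[ \Big\langle g_i^2, \frac{d}{dx} g_j^2 \Big\rangle_{L^2} = 0; \qquad \Big\langle a_i, \frac{d}{dx} a_j \Big\rangle_{L^2} = 0; \qquad \Big\langle a_j, \frac{d}{dx} g_i^2 \Big\rangle_{L^2} = \frac{1}{2}\delta_{i,j}. \]

The claimed statement then follows from Proposition 41.

(ii) For $q \in (L^2_0 \setminus D_n) \cap (L^2_0 \setminus D_l) \cap \mathrm{Nor}L^2_0$, we conclude from (i) and Proposition 12 that the claimed statement holds. In view of Proposition 41, the general case is then obtained by a limiting argument.

Lemma 35 Let $q \in \mathrm{Nor}\,\mathrm{Gap}^0_{\le K}$ and $1 \le n, m \le K$. Then
(i) $v_n(q) = \frac{d}{dx}\frac{\partial \theta_n}{\partial q(x)}$.
(ii) $\omega_K(v_n, v_m) = 0$.

Proof. (i) By Lemma 34, for $1 \le n \le K$, $l > K$, and $q \in \mathrm{Nor}\,\mathrm{Gap}^0_{\le K}$, $\{\theta_n, x_l\}(q) = \{\theta_n, y_l\}(q) = 0$ and, for $1 \le l \le K$,

\[ \{\theta_n, I_l\}(q) = -\delta_{n,l}; \qquad \{\theta_n, \theta_l\}(q) = 0. \]

Thus it follows from (8.1)-(8.4) that, for $1 \le n \le K$, $v_n = \frac{d}{dx}\frac{\partial \theta_n}{\partial q(x)}$.
(ii) Follows from (i) and (8.2).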

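Proposition 36 When expressed in the coordinates $(I_n, \theta_n)_{1 \le n \le K}$ on $\mathrm{Gap}^0_{\le K}$ the 2-form $\omega_K$ is canonical, i.e.

\[ \omega_K = \sum_{n=1}^{K} d\theta_n \wedge dI_n. \]

Proof. By (8.5)

\[ \omega_K = \sum_{n=1}^{K} d\theta_n \wedge dI_n + \sum_{1 \le i < j \le K} c_{ij}\, dI_i \wedge dI_j, \]

where the coefficients $c_{ij}$ depend only on $I_1, \ldots, I_K$. By Lemma 35, $c_{ij} = 0$ if $\theta_1 = \cdots = \theta_K = 0$. Thus $c_{ij} \equiv 0$ on $\mathrm{Gap}^0_{\le K}$, for $1 \le i < j \le K$.

Proof of Theorem 5 Introduce, for $q \in L^2_0$ and $n \ge 1$,

\[ u_{\pm n} \equiv u_{\pm n}(q) := (d_q\Omega)^{-1}(e_{\pm n}). \tag{8.6} \]

We have to prove that, for any $m, n \ge 1$ and any $q \in L^2_0$,

\[ \omega(u_m, u_n) = \omega(u_{-m}, u_{-n}) = 0; \qquad \omega(u_m, u_{-n}) = -\delta_{m,n}. \tag{8.7} \]

Fix $m, n \ge 1$. For any $K \ge \max\{m, n\}$ and $q \in \mathrm{Gap}^0_{\le K}$ we have, by Proposition 36,

\[ \omega(v_m, v_n) = \omega(v_{-m}, v_{-n}) = 0; \qquad \omega(v_m, v_{-n}) = -\delta_{m,n}. \]

For $1 \le k \le K$,

\begin{align*}
u_k &= \sqrt{2I_k}\, v_k \cos\theta_k - \frac{1}{\sqrt{2I_k}}\, v_{-k} \sin\theta_k \\
u_{-k} &= \sqrt{2I_k}\, v_k \sin\theta_k + \frac{1}{\sqrt{2I_k}}\, v_{-k} \cos\theta_k.
\end{align*}

Therefore, by Proposition 36, we obtain (8.7), for $q \in \mathrm{Gap}^0_{\le K}$. The set $\cup_{K \ge \max\{m,n\}} \mathrm{Gap}^0_{\le K}$ is dense in $L^2_0$ and, as $\Omega$ is analytic, $u_{\pm m}(q)$, $u_{\pm n}(q)$ depend continuously on $q$. Therefore (8.7) holds for any $q \in L^2_0$.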
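The passage from the vectors $v_{\pm k}$ to $u_{\pm k}$ is the usual change from action-angle to Cartesian coordinates. The sketch below assumes the identification of $v_k$, $v_{-k}$ with the coordinate vector fields of $I_k$, $\theta_k$ and of $u_k$, $u_{-k}$ with those of $x_k$, $y_k$ (which is how we read (8.1)-(8.4) and (8.6)); the shorthand $c$, $s$, $r$ is ours.

```latex
% x_k = \sqrt{2I_k}\cos\theta_k,  y_k = \sqrt{2I_k}\sin\theta_k,
% so I_k = (x_k^2 + y_k^2)/2 and \theta_k = \arg(x_k + i y_k).
% Shorthand (ours): c = \cos\theta_k, s = \sin\theta_k, r = \sqrt{2I_k}.
\begin{align*}
u_k &= \frac{\partial I_k}{\partial x_k}\,v_k + \frac{\partial\theta_k}{\partial x_k}\,v_{-k}
     = r c\, v_k - \frac{s}{r}\, v_{-k}, \qquad
u_{-k} = \frac{\partial I_k}{\partial y_k}\,v_k + \frac{\partial\theta_k}{\partial y_k}\,v_{-k}
     = r s\, v_k + \frac{c}{r}\, v_{-k}, \\
\omega(u_k, u_{-k})
 &= c^2\,\omega(v_k, v_{-k}) - s^2\,\omega(v_{-k}, v_k)
    + r^2 c s\,\omega(v_k, v_k) - \frac{c s}{r^2}\,\omega(v_{-k}, v_{-k}) \\
 &= (c^2 + s^2)\,\omega(v_k, v_{-k}) = -1 .
\end{align*}
```

For $m \neq n$ every term in the analogous expansion contains a factor $\omega(v_{\pm m}, v_{\pm n})$ with $m \neq n$, which vanishes by Proposition 36.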

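9 Canonical relations: part 2

In this section we establish regularity properties of the $L^2$-gradients of $\theta_n$, $x_n$, and $y_n$ (cf. Proposition 37 below) and apply them to prove the remaining canonical relations.

Proposition 37 For $n \ge 1$ and $N \ge 0$, the maps

\begin{align*}
\nabla\theta_n &: H^N_0 \setminus D_n \to H^{N+1}_0; & \nabla\theta_n &: q \mapsto \frac{\partial \theta_n}{\partial q(x)} \\
\nabla x_n &: H^N_0 \to H^{N+1}_0; & \nabla x_n &: q \mapsto \frac{\partial x_n}{\partial q(x)} \\
\nabla y_n &: H^N_0 \to H^{N+1}_0; & \nabla y_n &: q \mapsto \frac{\partial y_n}{\partial q(x)}
\end{align*}

are real analytic.

Proof. We prove the statement for $N = 0$, as for $N > 0$ the proof is similar. Let $q \in L^2_0$ and $z := \Omega(q)$. As $\Omega^{-1} : h^{1/2}(\mathbb{N}; \mathbb{R}^2) \to L^2_0$ is analytic, $d_z\Omega^{-1}$ depends analytically on $z$. Thus, for $n \ge 1$, the maps $u_{\pm n}(\cdot) : L^2_0 \to L^2_0$, $q \mapsto u_{\pm n}(q)$ (cf. (8.6)) are analytic. Notice that the system $\big(\frac{\partial x_n}{\partial q(x)}, \frac{\partial y_n}{\partial q(x)}\big)_{n \ge 1}$ is biorthogonal to the basis $(u_n, u_{-n})_{n \ge 1}$. On the other hand, it follows from (8.7) that

\begin{align*}
\Big\langle u_m, \Big(\frac{d}{dx}\Big)^{-1} u_n \Big\rangle_{L^2} = \Big\langle u_{-m}, \Big(\frac{d}{dx}\Big)^{-1} u_{-n} \Big\rangle_{L^2} &= 0; \tag{9.1} \\
\Big\langle u_m, \Big(\frac{d}{dx}\Big)^{-1} u_{-n} \Big\rangle_{L^2} &= -\delta_{m,n}. \tag{9.2}
\end{align*}

Thus $\big(-\big(\frac{d}{dx}\big)^{-1} u_{-n}, \big(\frac{d}{dx}\big)^{-1} u_n\big)_{n \ge 1}$ is a system biorthogonal to $(u_n, u_{-n})_{n \ge 1}$. As a basis admits exactly one biorthogonal system, we conclude that, for $n \ge 1$,

\[ \frac{\partial x_n}{\partial q(x)} = -\Big(\frac{d}{dx}\Big)^{-1} u_{-n}; \qquad \frac{\partial y_n}{\partial q(x)} = \Big(\frac{d}{dx}\Big)^{-1} u_n. \tag{9.3} \]

In particular, for $q \in L^2_0$, $\frac{\partial x_n}{\partial q(x)}, \frac{\partial y_n}{\partial q(x)} \in H^1_0$ and $\nabla x_n : q \mapsto \frac{\partial x_n}{\partial q(x)}$ and $\nabla y_n : q \mapsto \frac{\partial y_n}{\partial q(x)}$, viewed as maps from $L^2_0$ to $H^1_0$, are analytic. As, for $q \in L^2_0 \setminus D_n$,

\begin{align*}
\frac{\partial x_n}{\partial q(x)} &= \frac{1}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}\cos\theta_n - \sqrt{2I_n}\,\sin\theta_n\,\frac{\partial \theta_n}{\partial q(x)}; \\
\frac{\partial y_n}{\partial q(x)} &= \frac{1}{\sqrt{2I_n}}\frac{\partial I_n}{\partial q(x)}\sin\theta_n + \sqrt{2I_n}\,\cos\theta_n\,\frac{\partial \theta_n}{\partial q(x)}
\end{align*}

and the map $\nabla I_n : L^2_0 \to H^2_0$ is analytic, we conclude that $\nabla\theta_n : L^2_0 \setminus D_n \to H^1_0$ is a real analytic map.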

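Theorem 6 (i) For $q \in L^2_0$ and $m, n \ge 1$,

\[ \{x_m, x_n\} = 0; \qquad \{y_m, y_n\} = 0; \qquad \{x_n, y_m\} = \delta_{n,m}. \]

(ii) For $m, n \ge 1$ and $q \in (L^2_0 \setminus D_m) \cap (L^2_0 \setminus D_n)$, $\{\theta_m, \theta_n\} = 0$.

Proof. (i) By Proposition 37, any bracket in the statement is well defined. The statement follows from Theorem 5 (cf. (8.7)) and (9.3).

(ii) For $q \in (L^2_0 \setminus D_n) \cap (L^2_0 \setminus D_m)$, $\{\theta_n, \theta_m\}$ is well defined by Proposition 37. By (i) we have

\[ 0 = \{x_n, x_m\} = \big\{\sqrt{2I_n}\cos\theta_n, \sqrt{2I_m}\cos\theta_m\big\}. \tag{9.4} \]

Using that $\{I_n, I_m\} = 0$ and $\{\theta_n, I_m\} = -\delta_{n,m}$ one verifies

\[ \big\{\sqrt{2I_n}\cos\theta_n, \sqrt{2I_m}\cos\theta_m\big\} = \sin\theta_n \sin\theta_m \sqrt{2I_n}\sqrt{2I_m}\,\{\theta_n, \theta_m\}. \tag{9.5} \]

Combining (9.4) and (9.5) yields $\sin\theta_n \sin\theta_m \{\theta_n, \theta_m\} = 0$ and thus, for $\theta_n, \theta_m \notin \{0, \pi\}$ mod $2\pi$, $\{\theta_n, \theta_m\} = 0$. By continuity, $\{\theta_n, \theta_m\} = 0$ on $(L^2_0 \setminus D_n) \cap (L^2_0 \setminus D_m)$.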
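The identity (9.5) follows from the Leibniz rule for the bracket. A sketch, with the shorthand $F$, $G$ (ours) and using only $\{I_n, I_m\} = 0$, $\{\theta_n, I_m\} = -\delta_{n,m}$ and the skew symmetry of the bracket:

```latex
% F = \sqrt{2I_n}\cos\theta_n,  G = \sqrt{2I_m}\cos\theta_m.
\begin{align*}
\{F, G\}
 &= \frac{\cos\theta_n}{\sqrt{2I_n}}\frac{\cos\theta_m}{\sqrt{2I_m}}\,\{I_n, I_m\}
  - \frac{\cos\theta_n}{\sqrt{2I_n}}\sqrt{2I_m}\,\sin\theta_m\,\{I_n, \theta_m\} \\
 &\quad - \sqrt{2I_n}\,\sin\theta_n\,\frac{\cos\theta_m}{\sqrt{2I_m}}\,\{\theta_n, I_m\}
  + \sqrt{2I_n}\sqrt{2I_m}\,\sin\theta_n\sin\theta_m\,\{\theta_n, \theta_m\} \\
 &= 0 - \delta_{n,m}\cos\theta_n\sin\theta_n + \delta_{n,m}\sin\theta_n\cos\theta_n
  + \sqrt{2I_n}\sqrt{2I_m}\,\sin\theta_n\sin\theta_m\,\{\theta_n, \theta_m\} \\
 &= \sqrt{2I_n}\sqrt{2I_m}\,\sin\theta_n\sin\theta_m\,\{\theta_n, \theta_m\}.
\end{align*}
```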

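A Appendix

In this appendix, we prove Lemma 4 stated in section 2:

Lemma 38 Let $U_{q_0}$ be a bounded G-neighborhood of $q_0 \in L^2_0$. Then there exists $C > 0$ so that for any $n \ge 1$ the following holds:

(i) for all $k \neq n$ and $q \in U_{q_0}$,
\[ |\eta_{n,k}(q)| \le \frac{Cn}{|k^2 - n^2|}\,\frac{1}{k}\,\big(|\mu_k - \tau_k| + |\gamma_k|\big); \]

(ii) for $q \in U_{q_0} \setminus D_n$,
\[ |\eta_{n,n}(q) \bmod 2\pi| \le C \log\Big(2 + \Big|\frac{\mu_n - \tau_n}{\gamma_n}\Big|\Big); \]

(iii) for all $q \in U_{q_0}$,
\[ \sum_{k \neq n} |\eta_{n,k}(q)| \le \frac{C}{n}\Big( \Big(\sum_{k \ge 1} |\mu_k - \tau_k|^2\Big)^{1/2} + \Big(\sum_{k \ge 1} |\gamma_k|^2\Big)^{1/2} \Big). \]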

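Proof. (i) As $n \neq k$, one has by (2.7)

\[ \eta_{n,k} = \int_{\lambda_{2k-1}}^{\mu_k^*} \frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2 - 4}}\,d\lambda = \int_{\lambda_{2k}}^{\mu_k^*} \frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2 - 4}}\,d\lambda. \]

The following argument is not affected if one interchanges the roles of $\lambda_{2k-1}$ and $\lambda_{2k}$. Therefore we may assume in the following that $|\mu_k - \lambda_{2k-1}| \le |\mu_k - \lambda_{2k}|$. For $\lambda$ near $G_k := \{t\lambda_{2k} + (1 - t)\lambda_{2k-1} \mid 0 \le t \le 1\}$ we have

\[ \frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2 - 4}} = \pm\,\zeta_{n,k}(\lambda)\,\frac{\mu_k^{(n)} - \lambda}{\sqrt{(\lambda_{2k} - \lambda)(\lambda - \lambda_{2k-1})}} \]

where, with $\mu_n^{(n)} = \tau_n$,

\[ \zeta_{n,k} := \pm\,\frac{c_n}{\tau_n - \lambda}\,\frac{1}{k\pi}\,\prod_{j \neq k} \frac{\mu_j^{(n)} - \lambda}{j^2\pi^2}\,\Big( \frac{4(\lambda_0 - \lambda)}{k^2\pi^2} \prod_{j \neq k} \frac{(\lambda_{2j} - \lambda)(\lambda_{2j-1} - \lambda)}{(j^2\pi^2)^2} \Big)^{-1/2}. \]

Using that $c_n = O(n)$ (Proposition 2) we then conclude (cf. [PT], Appendix E), that for $\lambda$ near $G_k$, and any $n, k$ with $n \neq k$

\[ |\zeta_{n,k}(\lambda)| \le C\,\frac{n}{k\,|n^2 - k^2|} \tag{A.1} \]

uniformly for $q \in U_{q_0}$. Moreover, if we integrate along a straight line $l$ from $\lambda_{2k-1}$ to $\mu_k$ on the sheet of $\Sigma_q$ determined by $\mu_k^*$, then we have

\[ \sqrt{\frac{\mu_k^{(n)} - \lambda}{\lambda_{2k} - \lambda}} = O(1) \]

since $|\mu_k - \lambda_{2k-1}| \le |\mu_k - \lambda_{2k}|$ and $\mu_k^{(n)} = \tau_k + O(\gamma_k^2)$. Thus it remains to show that

\[ \Big| \int_{\lambda_{2k-1}}^{\mu_k^*} \sqrt{\frac{\lambda - \mu_k^{(n)}}{\lambda - \lambda_{2k-1}}}\,d\lambda \Big| = O\big(|\gamma_k| + |\mu_k - \tau_k|\big) \]

when integrating along the straight line $l$. But this follows with the substitution $\lambda = \lambda_{2k-1} + t(\mu_k - \lambda_{2k-1})$. Setting $\varepsilon = |\mu_k^{(n)} - \lambda_{2k-1}|$ and $\delta = |\mu_k - \lambda_{2k-1}|$ we obtain the bound

\[ \int_0^1 \frac{\sqrt{\varepsilon + \delta}}{\sqrt{\delta}\sqrt{t}}\,\delta\,dt = 2\sqrt{\varepsilon + \delta}\,\sqrt{\delta} \le \varepsilon + 2\delta. \]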
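The elementary bound at the end of this step can be checked as follows. The sketch reads the first of the two small quantities (printed as "3" in some copies) as $\varepsilon = |\mu_k^{(n)} - \lambda_{2k-1}|$, which is our interpretation; $\delta = |\mu_k - \lambda_{2k-1}|$ is as in the text.

```latex
% Parametrize the line l by \lambda = \lambda_{2k-1} + t(\mu_k - \lambda_{2k-1}), 0 \le t \le 1.
\begin{align*}
|\lambda - \lambda_{2k-1}| &= \delta t, \qquad
|\lambda - \mu_k^{(n)}| \le \varepsilon + \delta t \le \varepsilon + \delta, \qquad
|d\lambda| = \delta\,dt, \\
\int_0^1 \frac{\sqrt{\varepsilon+\delta}}{\sqrt{\delta t}}\,\delta\,dt
 &= \sqrt{\varepsilon+\delta}\,\sqrt{\delta}\int_0^1 t^{-1/2}\,dt
  = 2\sqrt{(\varepsilon+\delta)\,\delta}, \\
2\sqrt{(\varepsilon+\delta)\,\delta} &\le (\varepsilon+\delta) + \delta = \varepsilon + 2\delta
 \qquad \text{(arithmetic-geometric mean inequality)}.
\end{align*}
```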

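As $\varepsilon = O(|\gamma_k|)$ and $\delta = O(|\gamma_k| + |\mu_k - \tau_k|)$, the claim follows.

(ii) Arguing as in (i), we may assume, in view of (2.7), that $\mu_n \neq \lambda_{2n-1}, \lambda_{2n}$. In the case where $\mu_n$ satisfies $0 < |\mu_n - \lambda_n^+| \le 2|\gamma_n|$, one obtains as in (i),

\[ \Big| \int_{\lambda_{2n}}^{\mu_n^*} \frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2 - 4}}\,d\lambda \Big| \le \pi + C\int_0^1 \frac{1}{t^{1/2}}\,\frac{|\mu_n - \lambda_n^+|}{|\mu_n - \lambda_n^+|^{1/2}\,|\gamma_n/2|^{1/2}}\,dt, \tag{A.2} \]

which establishes the claimed estimate in this case. If $|\mu_n - \lambda_n^+| > 2|\gamma_n|$, the integral is split into two parts,

\[ \Big| \int_{\lambda_{2n}}^{\mu_n^*} \frac{\psi_n(\lambda)}{\sqrt{\Delta(\lambda)^2 - 4}}\,d\lambda \Big| \le \pi + \Big| \int_{\lambda_n^+}^{z} \frac{\psi_n(\lambda)\,d\lambda}{\sqrt{\Delta(\lambda)^2 - 4}} \Big| + \Big| \int_{z}^{\mu_n} \frac{\psi_n(\lambda)\,d\lambda}{\sqrt{\Delta(\lambda)^2 - 4}} \Big| \tag{A.3} \]

where $z = \tau_n + |\gamma_n|\,\frac{\mu_n - \tau_n}{|\mu_n - \tau_n|}$. The first integral on the right side of (A.3) is estimated as in (A.2). Arguing as in (i), the second integral can be estimated

\begin{align*}
\Big| \int_{z}^{\mu_n} \frac{\psi_n(\lambda)\,d\lambda}{\sqrt{\Delta(\lambda)^2 - 4}} \Big| &\le C\int_2^{2\left|\frac{\mu_n - \tau_n}{\gamma_n}\right|} \frac{1}{\left|\frac{\gamma_n}{2}\right|(t^2 - 1)^{1/2}}\,\Big|\frac{\gamma_n}{2}\Big|\,dt \tag{A.4} \\
&\le C\,\mathrm{arccosh}\,\Big|\frac{\mu_n - \tau_n}{\gamma_n/2}\Big| \le C\log\Big(2\,\Big|\frac{\mu_n - \tau_n}{\gamma_n/2}\Big|\Big).
\end{align*}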

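Combining (A.3) and (A.4) leads to the claimed estimate.

(iii) We split the sum $\sum_{k \neq n} |\eta_{n,k}(q)|$ into two parts, $\sum_{|k-n| \le n/2} |\eta_{n,k}(q)|$ and $\sum_{|k-n| > n/2} |\eta_{n,k}(q)|$. The two parts are estimated separately,

\begin{align*}
\sum_{|k-n| \le n/2} |\eta_{n,k}(q)| &\le C\sum_{|k-n| \le n/2} \frac{n}{n + k}\,\frac{1}{k}\,\frac{1}{|k - n|}\,\big(|\mu_k - \tau_k| + |\gamma_k|\big) \\
&\le C\,\frac{2}{n}\sum_{k \neq n} \frac{1}{|k - n|}\,\big(|\mu_k - \tau_k| + |\gamma_k|\big) \\
&\le C\,\frac{2}{n}\Big(\sum_{k \neq n} \frac{1}{|k - n|^2}\Big)^{1/2}\Big( \Big(\sum_{k \ge 1} |\mu_k - \tau_k|^2\Big)^{1/2} + \Big(\sum_{k \ge 1} |\gamma_k|^2\Big)^{1/2} \Big)
\end{align*}

where for the last inequality we have used the Cauchy-Schwarz inequality. The sum $\sum_{|k-n| > n/2} |\eta_{n,k}(q)|$ is treated similarly.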

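B Appendix

In this appendix, we prove various orthogonality relations.

For $\lambda \in \mathbb{R}$ and $q \in L^2$, introduce

\begin{align*}
F(x, \lambda, q) &:= \sum_{1 \le i, j \le 2} a_{ij}(q)\, y_i(x, \lambda, q)\, y_j(x, \lambda, q) \\
G(x, \lambda, q) &:= \sum_{1 \le i, j \le 2} b_{ij}(q)\, y_i(x, \lambda, q)\, y_j(x, \lambda, q)
\end{align*}

with $a_{ij}(\cdot), b_{ij}(\cdot) \in C(L^2; \mathbb{R})$. Notice that for $q \in H^1$, $F$ and $G$ are in $H^3_{loc}(\mathbb{R})$, but not necessarily periodic.

Lemma 39 Assume that $\alpha \neq \beta$, and $q \in H^1$. Then, with $F \equiv F(x, \alpha, q)$ and $G \equiv G(x, \beta, q)$,

\[ \Big\langle F, \frac{d}{dx} G \Big\rangle_{L^2} = \frac{1}{2(\beta - \alpha)}\Big\{ -\frac{1}{2}\big(F G'' - F' G' + F'' G\big)\Big|_0^1 + 2\big(F (q - \alpha) G\big)\Big|_0^1 \Big\}. \tag{B.1} \]

Moreover, if the right side of (B.1) is well defined and continuous for $q \in L^2$, (B.1) holds for $q \in L^2$.

Proof. For $a \in \mathbb{R}$, introduce

\[ L_{q;a} := -\frac{1}{2}\Big(\frac{d}{dx}\Big)^3 + q\,\frac{d}{dx} + \frac{d}{dx}\,q - 2a\,\frac{d}{dx}. \]

One verifies that

\[ L_{q;\alpha} F(x, \alpha, q) = L_{q;\beta} G(x, \beta, q) = 0. \tag{B.2} \]

As $\frac{d}{dx} = \frac{1}{2(\beta - \alpha)}(L_{q;\alpha} - L_{q;\beta})$, we obtain using (B.2)

\[ \Big\langle F, \frac{d}{dx} G \Big\rangle_{L^2} = \frac{1}{2(\beta - \alpha)}\big\langle F, (L_{q;\alpha} - L_{q;\beta}) G \big\rangle_{L^2} = \frac{1}{2(\beta - \alpha)}\big\langle F, L_{q;\alpha} G \big\rangle_{L^2}. \]

Integrating by parts, we obtain

\[ \big\langle F, L_{q;\alpha} G \big\rangle_{L^2} = -\frac{1}{2}\big(F G'' - F' G' + F'' G\big)\Big|_0^1 + 2\big(F (q - \alpha) G\big)\Big|_0^1 - \big\langle L_{q;\alpha} F, G \big\rangle. \]

Using (B.2) once again we obtain (B.1).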

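Corollary 40 (i) Assume that $\alpha \neq \beta$ and, for $q \in H^1$, $F \equiv F(\cdot, \alpha, q)$, $G \equiv G(\cdot, \beta, q) \in H^3$. Then, for $q \in L^2$,

\[ \Big\langle F, \frac{d}{dx} G \Big\rangle_{L^2} = 0. \]

(ii) For $\lambda, \beta$ arbitrary and $q \in L^2$,

\[ \Big\langle \frac{\partial \Delta(\lambda, q)}{\partial q(x)}, \frac{d}{dx}\frac{\partial \Delta(\beta, q)}{\partial q(x)} \Big\rangle_{L^2} = 0. \tag{B.3} \]

(iii) For $\lambda, a, b \in \mathbb{R}$, $k \ge 1$, and $q \in L^2$

\[ \Big\langle \frac{\partial \Delta(\lambda, q)}{\partial q(x)}, \frac{d}{dx}\Big( a\, y_1(x, \mu_k, q)\, y_2(x, \mu_k, q) + b\, y_2^2(x, \mu_k, q) \Big) \Big\rangle_{L^2} = \frac{m_{12}(\lambda)}{2(\lambda - \mu_k)}\Big( a\, m_{21}(\mu_k)\, m_{22}(\mu_k) + b\big(m_{22}^2(\mu_k) - 1\big) \Big). \tag{B.4} \]

Proof. (i) It follows from the assumption $F, G \in H^3$ that $\frac{d^j}{dx^j} F\big|_0^1 = 0$ and $\frac{d^j}{dx^j} G\big|_0^1 = 0$ for $0 \le j \le 2$. Hence the claimed statement is a direct consequence of Lemma 39.

(ii) For $q \in H^1$ and $\lambda \in \mathbb{R}$, $\frac{\partial \Delta(\lambda, q)}{\partial q(x)} \in H^3$ and (i) can be applied.

(iii) Assume that $q \in H^1$. Let $F := \frac{\partial \Delta(\lambda, q)}{\partial q(x)}$ and $G := a\, y_1(x, \mu_k, q)\, y_2(x, \mu_k, q) + b\, y_2^2(x, \mu_k, q)$. Then $F \in H^3$ and $G \in H^3_{loc}(\mathbb{R})$. One verifies that $F(0) = F(1) = m_{12}(\lambda)$, $F'(0) = F'(1) = m_{22}(\lambda) - m_{11}(\lambda)$, $G(0) = G(1) = 0$, $G'(0) = a$, $G'(1) = a\, m_{11}(\mu_k)\, m_{22}(\mu_k) = a$, $G''(0) = 2b$, $G''(1) = 2\big(a\, m_{21}(\mu_k)\, m_{22}(\mu_k) + b\, m_{22}^2(\mu_k)\big)$. Therefore (B.4) holds for $q \in H^1$. As the right hand side of (B.4) is defined and continuous on $L^2$, we conclude from Lemma 39 that the identity (B.4) remains valid for $q \in L^2$.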
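The step from the boundary values to (B.4) can be written out as follows. This is a sketch assuming the boundary data are read with primes as $G'(0) = G'(1) = a$, $G''(0) = 2b$, $G''(1) = 2(a\,m_{21}m_{22} + b\,m_{22}^2)$ at $\mu_k$ (the primes are lost in some copies), and that (B.1) is applied with $\alpha = \lambda$, $\beta = \mu_k$.

```latex
\begin{align*}
(F\,G'')\big|_0^1 &= m_{12}(\lambda)\big(2a\,m_{21}(\mu_k)\,m_{22}(\mu_k) + 2b\,m_{22}^2(\mu_k) - 2b\big), \\
(F'G')\big|_0^1 &= \big(m_{22}(\lambda) - m_{11}(\lambda)\big)(a - a) = 0, \\
(F''G)\big|_0^1 &= 0, \qquad \big(F(q-\lambda)G\big)\big|_0^1 = 0
  \qquad \text{since } G(0) = G(1) = 0, \\
\Big\langle F, \frac{d}{dx}G \Big\rangle_{L^2}
 &= \frac{1}{2(\mu_k - \lambda)}\cdot\Big(-\frac12\Big)\cdot
    2\,m_{12}(\lambda)\Big(a\,m_{21}(\mu_k)\,m_{22}(\mu_k) + b\big(m_{22}^2(\mu_k) - 1\big)\Big) \\
 &= \frac{m_{12}(\lambda)}{2(\lambda - \mu_k)}
    \Big(a\,m_{21}(\mu_k)\,m_{22}(\mu_k) + b\big(m_{22}^2(\mu_k) - 1\big)\Big).
\end{align*}
```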

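C Appendix

The purpose of this appendix is to derive an explicit formula for the gradient of the angle variables $\frac{\partial \theta_n}{\partial q(x)}$ for certain potentials. This formula is similar to the one obtained in [MV] for the nonlinear Schrödinger equation (NLS). In addition, we present formulas for $\frac{\partial x_n}{\partial q(x)}$ and $\frac{\partial y_n}{\partial q(x)}$ for $q \in L^2_0$ with $\lambda_{2n-1} = \lambda_{2n}$.

Recall that $D_n := \{q \mid \gamma_n(q) = 0\}$. For $k, n \ge 1$ and $q \in L^2_0 \setminus D_n$ introduce

\[ c_{n,k} \equiv c_{n,k}(q) := -\frac{\psi_n}{\dot\Delta}\Big|_{\lambda_{2k}, q}; \qquad d_k \equiv d_k(q) := (-1)^{k+1}\,\frac{\dot m_{11}\, m_{21}}{\dot\Delta}\Big|_{\lambda_{2k}, q}. \]

Recall that $\psi_n(\lambda, q)$ is an entire function introduced in section 2 and $m_{ij} = m_{ij}(\lambda, q)$ ($1 \le i, j \le 2$) denote the entries of the Floquet matrix, $m_{ij} := \partial_x^{i-1} y_j(1, \lambda, q)$.

Proposition 41 Let $K, n \ge 1$ and $q \in L^2_0 \setminus D_n$ with $\mu_k(q) = \lambda_{2k}(q)$ for $k \ge K$. Then

\[ \frac{\partial \theta_n}{\partial q(x)} = \sum_{k=1}^{K-1} \frac{\partial \eta_{n,k}}{\partial q(x)} + \sum_{k=K}^{\infty} c_{n,k}(q)\Big( y_1(x, \lambda_{2k}, q)\, y_2(x, \lambda_{2k}, q) + d_k(q)\, y_2^2(x, \lambda_{2k}, q) \Big) \tag{C.1} \]

where the series converges in $H^2$.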

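To prove Proposition 41 we first study the gradient of $\eta_{n,k}$. Notice that
\[ \eta_{n,k}(q) = \int_0^{\mu_k(q)-\lambda_{2k}(q)} \frac{\psi_n(y+\lambda_{2k}(q),q)}{\sqrt{y\,G(y+\lambda_{2k}(q),q)}}\,dy \tag{C.2} \]
where $G(\lambda,q) := \frac{\Delta^2(\lambda)-4}{\lambda_{2k}-\lambda}$. For $q \in L^2_0$ with $\lambda_{2k-1}(q) < \mu_k(q) < \lambda_{2k}(q)$, we can use (C.2) to write
\[ \frac{\partial\eta_{n,k}}{\partial q(x)} = \frac{\psi_n(\mu_k(q),q)}{\sqrt{\Delta^2(\mu_k(q),q)-4}} \Big( \frac{\partial\mu_k}{\partial q(x)}(q) - \frac{\partial\lambda_{2k}}{\partial q(x)}(q) \Big) + \int_0^{\mu_k(q)-\lambda_{2k}(q)} \frac{1}{\sqrt{-y}}\,\frac{\partial}{\partial q(x)}\Big[ \frac{\psi_n(y+\lambda_{2k}(q),q)}{\sqrt{-G(y+\lambda_{2k}(q),q)}} \Big]\,dy. \tag{C.3} \]

Lemma 42 For $p \in L^2_0$ with $\lambda_{2k-1}(p) < \mu_k(p) = \lambda_{2k}(p)$,
\[ \frac{\partial\eta_{n,k}}{\partial q(x)}\Big|_{q=p} = (-1)^k\,\frac{\psi_n}{\dot\Delta^2} \Big( \dot m_{22}\,\frac{\partial m_{11}}{\partial q(x)} - \dot m_{11}\,\frac{\partial m_{22}}{\partial q(x)} \Big)\Big|_{\lambda_{2k},p} = c_{n,k}\,\big( y_1(x)\,y_2(x) + d_k\,y_2^2(x) \big)\Big|_{\lambda_{2k},p} \tag{C.4} \]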

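where $\dot{}$ denotes the derivative with respect to $\lambda$.

Proof. Introduce the open sets ($k \ge 1$)
\[ V_k := \{ q \in L^2_0 \mid \lambda_{2k-1}(q) < \mu_k(q) < \lambda_{2k}(q) \}. \]
It follows from (C.3) and the analyticity of $\eta_{n,k}$ that
\[ \lim_{q \in V_k,\ q \to p} \frac{\partial\eta_{n,k}}{\partial q(x)} = \lim_{q \in V_k,\ q \to p} \frac{\psi_n(\mu_k(q),q)}{\sqrt{\Delta^2(\mu_k(q),q)-4}} \Big( \frac{\partial\mu_k}{\partial q(x)}(q) - \frac{\partial\lambda_{2k}}{\partial q(x)}(q) \Big). \]
As $\Delta(\lambda_{2k}(q),q) = (-1)^k\,2$ and $m_{12}(\mu_k(q),q) = 0$, we get, by implicit differentiation,
\[ \frac{\partial\lambda_{2k}}{\partial q(x)}(q) = -\frac{\frac{\partial\Delta}{\partial q(x)}(\lambda_{2k}(q),q)}{\dot\Delta(\lambda_{2k}(q),q)}; \qquad \frac{\partial\mu_k}{\partial q(x)}(q) = -\frac{\frac{\partial m_{12}}{\partial q(x)}(\mu_k,q)}{\dot m_{12}(\mu_k,q)}. \]
Differentiating the Wronskian identity, $m_{11}m_{22} - m_{12}m_{21} = 1$, with respect to $\lambda$ at $\lambda = \mu_k(q)$, we get, using that $2m_{11} = \Delta + \sqrt{\Delta^2-4}$ and $2m_{22} = \Delta - \sqrt{\Delta^2-4}$ at $\lambda = \mu_k$,
\[ 2\dot m_{12}\,m_{21} = 2\dot m_{11}\,m_{22} + 2m_{11}\,\dot m_{22} = \Delta\,(\dot m_{11} + \dot m_{22}) - \sqrt{\Delta^2-4}\,(\dot m_{11} - \dot m_{22}). \]
Similarly, differentiating the Wronskian identity with respect to $q$ and evaluating the result at $\lambda = \mu_k(q)$ we get
\[ \frac{\frac{\partial m_{12}}{\partial q(x)}}{\dot m_{12}} = \frac{\Delta\,\frac{\partial(m_{11}+m_{22})}{\partial q(x)} - \sqrt{\Delta^2-4}\,\frac{\partial(m_{11}-m_{22})}{\partial q(x)}}{\Delta\,(\dot m_{11} + \dot m_{22}) - \sqrt{\Delta^2-4}\,(\dot m_{11} - \dot m_{22})}. \]
Thus
\[ \frac{\psi_n(\mu_k,q)}{\sqrt{\Delta^2(\mu_k,q)-4}} \Big( \frac{\partial\mu_k}{\partial q(x)}(q) - \frac{\partial\lambda_{2k}}{\partial q(x)}(q) \Big) = \frac{\psi_n(\mu_k,q)}{\sqrt{\Delta^2(\mu_k,q)-4}} \left( \frac{\frac{\partial\Delta}{\partial q(x)}}{\dot\Delta}\Big|_{\lambda_{2k},q} - \frac{\Delta\,\frac{\partial\Delta}{\partial q(x)} - \sqrt{\Delta^2-4}\,\frac{\partial(m_{11}-m_{22})}{\partial q(x)}}{\Delta\,\dot\Delta - \sqrt{\Delta^2-4}\,(\dot m_{11} - \dot m_{22})}\Big|_{\mu_k,q} \right). \tag{C.5} \]
Taking the limit $q \to p$, (C.5) yields
\[ (-1)^k\,\psi_n\,\dot\Delta^{-2} \Big( \dot m_{22}\,\frac{\partial m_{11}}{\partial q(x)} - \dot m_{11}\,\frac{\partial m_{22}}{\partial q(x)} \Big)\Big|_{\lambda_{2k},p}. \]
To finish the derivation, notice that, as $\mu_k(p) = \lambda_{2k}(p)$, $m_{12}(\lambda_{2k},p) = 0$ and $m_{11}(\lambda_{2k},p) = m_{22}(\lambda_{2k},p) = (-1)^k$. Using that (cf [PT])
\[ \frac{\partial m_{11}}{\partial q(x)} = m_{12}\,y_1^2(x) - m_{11}\,y_1(x)\,y_2(x), \qquad \frac{\partial m_{22}}{\partial q(x)} = m_{22}\,y_1(x)\,y_2(x) - m_{21}\,y_2^2(x) \]
we obtain at $(\lambda_{2k}(p),p)$
\[ \dot m_{22}\,\frac{\partial m_{11}}{\partial q(x)} - \dot m_{11}\,\frac{\partial m_{22}}{\partial q(x)} = (-1)^{k+1}\,\dot\Delta\,y_1(x)\,y_2(x) + \dot m_{11}\,m_{21}\,y_2^2(x). \]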

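Lemma 43 (i) Let $n \ge 1$ be fixed. $c_{n,k}(q)$ with $k \ne n$ and $d_k(q)$ with $k \ge 1$ can be extended continuously on $L^2_0$ and satisfy the asymptotics
\[ c_{n,k} = O\Big(\frac{1}{k^2}\Big); \qquad d_k(q) = O(1). \]
(ii) For $n \ge 1$, $\gamma_n c_{n,n}$ can be extended continuously on $L^2_0$ and satisfies the asymptotics
\[ \tilde c_{n,n} := \gamma_n c_{n,n} = -4n\pi\Big(1 + O\Big(\frac{\log n}{n}\Big)\Big) \ne 0. \]

Proof. (i) Recall that $\psi_n(\lambda,q)$ and $\dot\Delta(\lambda,q)$ have the following product representations
\[ \psi_n(\lambda,q) = \frac{c_n(q)}{n^2\pi^2} \prod_{m \ne n} \frac{\mu_m^{(n)} - \lambda}{m^2\pi^2}; \qquad \dot\Delta(\lambda,q) = -\prod_{m \ge 1} \frac{\dot\lambda_m - \lambda}{m^2\pi^2}. \]
Thus $c_{n,k}(q) = -\frac{\psi_n(\lambda_{2k})}{\dot\Delta(\lambda_{2k})}$ can be written as a product of three quotients
\[ c_{n,k}(q) = \frac{c_n(q)}{\dot\lambda_n - \lambda_{2k}} \cdot \frac{f(\lambda_{2k})}{g(\lambda_{2k})} \cdot \frac{\mu_k^{(n)} - \lambda_{2k}}{\dot\lambda_k - \lambda_{2k}} \tag{C.6} \]
where
\[ f(\lambda) := \prod_{m \ge 1,\ m \ne k,n} \frac{\mu_m^{(n)} - \lambda}{m^2\pi^2} \quad\text{and}\quad g(\lambda) := \prod_{m \ge 1,\ m \ne k,n} \frac{\dot\lambda_m - \lambda}{m^2\pi^2}. \]
As, by assumption, $n \ne k$, the first two quotients on the right hand side of (C.6) are continuous on $L^2_0$. As $\lambda_{2k} = k^2\pi^2 + O(1)$, $\frac{c_n(q)}{\dot\lambda_n - \lambda_{2k}} = O\big(\frac{1}{k^2}\big)$ whereas $\frac{f(\lambda_{2k})}{g(\lambda_{2k})} = 1 + O\big(\frac{\log k}{k}\big)$ (cf [PT], Appendix E). To estimate the third quotient, recall that ([BKM1, Theorem 2.1] and [BKM2, Lemma 2.4])
\[ |\mu_k^{(n)}(p) - \tau_k(p)| = \gamma_k^2(p)\,O\Big(\frac{1}{k}\Big); \qquad |\dot\lambda_k(p) - \tau_k(p)| = \gamma_k^2(p)\,O\Big(\frac{\log k}{k}\Big) \tag{C.7} \]
uniformly in $\{(n,k) \in \mathbb{N} \times \mathbb{N} \mid k \ne n\}$ and $p$ in a sufficiently small neighborhood of $q$. This leads to
\[ \frac{\mu_k^{(n)} - \lambda_{2k}}{\dot\lambda_k - \lambda_{2k}} = \frac{\mu_k^{(n)} - \tau_k - \gamma_k/2}{\dot\lambda_k - \tau_k - \gamma_k/2} = \frac{1/2 + \gamma_k\,O(1/k)}{1/2 + \gamma_k\,O(\log k/k)}. \]
Thus the last quotient on the right hand side of (C.6) can be extended continuously on $L^2_0$ and is $O(1)$. The estimates for $d_k$ are obtained in a similar way.

(ii) Notice that
\[ \gamma_n c_{n,n} = \gamma_n\,\frac{c_n}{\dot\lambda_n - \lambda_{2n}} \cdot \frac{\prod_{m \ne n} \frac{\mu_m^{(n)} - \lambda_{2n}}{m^2\pi^2}}{\prod_{m \ne n} \frac{\dot\lambda_m - \lambda_{2n}}{m^2\pi^2}} \ne 0. \]
Similarly as in (i) one obtains, using $\lambda_{2n} = \tau_n + \gamma_n/2$,
\[ \gamma_n c_{n,n} = -2c_n\,\frac{\gamma_n/2}{\gamma_n/2 - (\dot\lambda_n - \tau_n)}\,\Big(1 + O\Big(\frac{\log n}{n}\Big)\Big). \]
Using (C.7) and the estimate $c_n = 2n\pi\big(1 + O\big(\frac{1}{n}\big)\big)$ (cf Proposition 2) one obtains the claimed asymptotic.

Combining the two Lemmas above, one obtains

Corollary 44 For $k \ne n$ and $q \in L^2_0$ with $\gamma_k(q) = 0$,
\[ \frac{\partial\eta_{n,k}}{\partial q(x)} = c_{n,k}\,\big( y_1(x,\lambda_{2k},q)\,y_2(x,\lambda_{2k},q) + d_k\,y_2^2(x,\lambda_{2k},q) \big). \]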

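Proof of Proposition 41 Formula (C.1) follows from Lemma 42 and Corollary 44. It remains to prove that the series in (C.1) converges in $H^2$. For $k \ge K$, $y_1(x,\lambda_{2k},q)\,y_2(x,\lambda_{2k},q)$ and $y_2^2(x,\lambda_{2k},q)$ are in $H^2$. Using that $c_{n,k}$ and $c_{n,k}d_k$ are $O\big(\frac{1}{k^2}\big)$ (Lemma 43) and the following estimates of $y_1 \equiv y_1(x,\lambda_{2k},q)$ and $y_2 \equiv y_2(x,\lambda_{2k},q)$ (cf [PT])
\[ y_1 = \cos \pi k x + O_\infty\Big(\frac{1}{k}\Big); \qquad y_2 = \frac{\sin \pi k x}{\pi k} + O_\infty\Big(\frac{1}{k^2}\Big); \tag{C.8} \]
\[ y_1' = -\pi k \sin \pi k x + O_\infty(1); \qquad y_2' = \cos \pi k x + O_\infty\Big(\frac{1}{k}\Big) \tag{C.9} \]
one obtains, by a straightforward computation, the convergence of the series in $H^2$.

To state the next result, recall that $\tilde\theta_n := \sum_{k \ne n} \eta_{n,k}$. For $q \in L^2_0$ with $\lambda_{2n-1}(q) = \lambda_{2n}(q)$ introduce an orthonormal basis $\tilde f_{2n}, \tilde f_{2n-1}$ of $\mathrm{span}\langle y_1(\cdot,\lambda_{2n}), y_2(\cdot,\lambda_{2n}) \rangle$ with $\tilde f_{2n} := \frac{y_2}{\|y_2\|}$ and $\tilde f_{2n-1}(0) > 0$. Then $\tilde f_{2n-1}$ is of the form ($y_j \equiv y_j(\cdot,\lambda_{2n})$, $j = 1,2$)
\[ \tilde f_{2n-1} = \frac{y_1 + b_n y_2}{\|y_1 + b_n y_2\|}; \qquad b_n := -\frac{\langle y_1, y_2 \rangle_{L^2}}{\langle y_2, y_2 \rangle_{L^2}}. \]

Lemma 45 Let $q \in L^2_0$ with $\lambda_{2n-1}(q) = \lambda_{2n}(q)$. Then
\[ \frac{\partial x_n}{\partial q(x)} = \xi_n \Big( \cos\tilde\theta_n\,\frac{\tilde f_{2n}^2 - \tilde f_{2n-1}^2}{2} - \kappa_n \sin\tilde\theta_n\,\tilde f_{2n}\tilde f_{2n-1} \Big) \tag{C.10} \]
\[ \frac{\partial y_n}{\partial q(x)} = \xi_n \Big( \sin\tilde\theta_n\,\frac{\tilde f_{2n}^2 - \tilde f_{2n-1}^2}{2} + \kappa_n \cos\tilde\theta_n\,\tilde f_{2n}\tilde f_{2n-1} \Big) \tag{C.11} \]
where $\kappa_n \equiv \kappa_n(q)$ satisfies $\kappa_n \ne 0$. If $q$ is a finite gap potential, one has for $n \to \infty$
\[ \kappa_n = -1 + O\Big(\frac{\log n}{n}\Big). \]

Proof. Formulas (C.10) and (C.11) are derived in a similar fashion, so we prove only (C.10). Let $(q_m)_{m \ge 1}$ be a sequence in $L^2_0$, convergent to $q$, such that $\mu_n(q_m) = \lambda_{2n}(q_m) > \lambda_{2n-1}(q_m)$ for all $m \ge 1$. For $p \in L^2_0 \setminus D_n$, $x_n = \sqrt{2I_n}\cos\theta_n = \frac12\,\xi_n\gamma_n\cos\theta_n$. Therefore,
\[ \frac{\partial x_n}{\partial q(x)} = \frac12 \lim_{m \to \infty} \Big( \frac{\partial\xi_n}{\partial q(x)}\,\gamma_n\cos\theta_n + \xi_n\,\frac{\partial\gamma_n}{\partial q(x)}\cos\theta_n - \xi_n\gamma_n\sin\theta_n\,\frac{\partial\theta_n}{\partial q(x)} \Big)\Big|_{q_m}. \]
By definition, $\eta_{n,n}(p) = 0$ for $p$ with $\lambda_{2n-1}(p) < \mu_n(p) = \lambda_{2n}(p)$. Hence $\theta_n(q_m) = \sum_{k \ne n} \eta_{n,k}(q_m)$. As $\sum_{k \ne n} \eta_{n,k}$ is analytic, the following limit exists,
\[ \tilde\theta_n := \lim_{m \to \infty} \theta_n(q_m) = \sum_{k \ne n} \eta_{n,k}(q). \]
As $\xi_n(\cdot)$ is analytic and $\lim_{m \to \infty} \gamma_n(q_m) = 0$, we obtain
\[ \lim_{m \to \infty} \frac{\partial\xi_n}{\partial q(x)}\,\gamma_n\cos\theta_n\Big|_{q_m} = 0. \]
Thus
\[ \frac{\partial x_n}{\partial q(x)} = \frac12\,\xi_n(q) \Big( \cos\tilde\theta_n \lim_{m \to \infty} \frac{\partial\gamma_n}{\partial q(x)}\Big|_{q_m} - \sin\tilde\theta_n \lim_{m \to \infty} \gamma_n\,\frac{\partial\theta_n}{\partial q(x)}\Big|_{q_m} \Big). \tag{C.12} \]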

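Step 1: Computation of the first limit on the right side of (C.12). For $p \in L^2_0 \setminus D_n$, $\frac{\partial\gamma_n}{\partial q(x)}\big|_p = f_{2n}^2(p) - f_{2n-1}^2(p)$, where $f_{2n-1}$ and $f_{2n}$ are $L^2$-normalized eigenfunctions corresponding to $\lambda_{2n-1}$ and $\lambda_{2n}$. As $\lambda_{2n}(q_m) = \mu_n(q_m)$, the eigenfunction $f_{2n}(q_m)$ can be chosen to be $f_{2n}(q_m) = \frac{y_2}{\|y_2\|}$. Then
\[ \lim_{m \to \infty} f_{2n}^2(q_m) = \tilde f_{2n}^2. \]
Notice that, as $\lambda_{2n-1}(q_m) < \lambda_{2n}(q_m)$, the eigenfunction $f_{2n-1}(q_m)$ is orthogonal to the eigenfunction $f_{2n}(q_m)$. Choose $f_{2n-1} = a_n\big(y_1(x,\lambda_{2n-1},q_m) + b_n\,y_2(x,\lambda_{2n-1},q_m)\big)$ with $a_n \equiv a_n(q_m) = \|y_1 + b_n y_2\|^{-1}$ and $b_n \equiv b_n(q_m)$ ($m$ sufficiently large). From
\[ \langle f_{2n-1}(q_m), f_{2n}(q_m) \rangle_{L^2} = 0 \tag{C.13} \]
it follows that
\[ \langle y_2, f_{2n} \rangle_{L^2}\,b_n = -\langle y_1, f_{2n} \rangle_{L^2} \]
where $f_{2n} = f_{2n}(x,q_m)$ and $y_j = y_j(x,\lambda_{2n-1}(q_m),q_m)$ ($j = 1,2$). Notice that
\[ \langle y_2, f_{2n} \rangle_{L^2} \to \|y_2(\cdot,\lambda_{2n}(q),q)\| \ne 0 \qquad (m \to \infty). \]
Hence for $m$ sufficiently large $\langle y_2, f_{2n} \rangle_{L^2} \ne 0$ and
\[ b_n = -\frac{\langle y_1, f_{2n} \rangle_{L^2}}{\langle y_2, f_{2n} \rangle_{L^2}}. \]
Define $Q(q_m) = \|y_1 + b_n y_2\|$ ($m$ sufficiently large) and notice that $Q(q_m) \to Q(q)$ with $Q(q) \ne 0$ as $y_1(x,\lambda_{2n}(q),q)$ and $y_2(x,\lambda_{2n}(q),q)$ are linearly independent. Hence $a_n(q_m) := 1/Q(q_m)$ is well defined for $m$ large and
\[ a_n(q_m) \to a_n(q) > 0 \qquad (m \to \infty). \]
We conclude that $\lim_{m \to \infty} f_{2n-1}(q_m) = \tilde f_{2n-1}(q)$ where
\[ \tilde f_{2n-1}(q) = \frac{y_1 + b_n y_2}{\|y_1 + b_n y_2\|} \quad\text{with}\quad b_n(q) := -\frac{\langle y_1, y_2 \rangle_{L^2}}{\langle y_2, y_2 \rangle_{L^2}}. \]
It follows that
\[ \|\tilde f_{2n-1}\| = 1; \qquad \langle \tilde f_{2n-1}, \tilde f_{2n} \rangle_{L^2} = 0 \tag{C.14} \]
and $\lim_{m \to \infty} f_{2n-1}^2(q_m) = \tilde f_{2n-1}^2$. Thus we have proved that
\[ \lim_{m \to \infty} \frac{\partial\gamma_n}{\partial q(x)}(q_m) = \tilde f_{2n}^2 - \tilde f_{2n-1}^2. \]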
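Step 2: Computation of the second limit on the right side of (C.12). We have to compute $\lim_{m \to \infty} \gamma_n\,\frac{\partial\theta_n}{\partial q(x)}\big|_{q_m}$. As $\sum_{k \ne n} \eta_{n,k}$ is analytic, its gradient $\frac{\partial}{\partial q(x)}\big|_p \sum_{k \ne n} \eta_{n,k}$ depends continuously on $p$. Therefore, as $\lim_{m \to \infty} \gamma_n(q_m) = 0$, we obtain
\[ \lim_{m \to \infty} \gamma_n\,\frac{\partial\theta_n}{\partial q(x)}\Big|_{q_m} = \lim_{m \to \infty} \gamma_n \Big( \frac{\partial \sum_{k \ne n} \eta_{n,k}}{\partial q(x)} + \frac{\partial\eta_{n,n}}{\partial q(x)} \Big)\Big|_{q_m} = \lim_{m \to \infty} \gamma_n\,\frac{\partial\eta_{n,n}}{\partial q(x)}\Big|_{q_m}. \]
By Lemma 42
\[ \lim_{m \to \infty} \gamma_n\,\frac{\partial\eta_{n,n}}{\partial q(x)}\Big|_{q_m} = \lim_{m \to \infty} \gamma_n(q_m)\,c_{n,n}(q_m)\;y_1(x,\lambda_{2n},q)\,y_2(x,\lambda_{2n},q) + \lim_{m \to \infty} \gamma_n(q_m)\,c_{n,n}(q_m)\,d_n(q_m)\;y_2^2(x,\lambda_{2n},q). \]
By Lemma 43,
\[ \tilde c_{n,n} := \lim_{m \to \infty} \gamma_n(q_m)\,c_{n,n}(q_m) = -4\pi n\Big(1 + O\Big(\frac{\log n}{n}\Big)\Big) \ne 0 \]
and $\lim_{m \to \infty} d_n(q_m) = d_n(q) = O(1)$. Hence
\[ \lim_{m \to \infty} \gamma_n\,\frac{\partial\theta_n}{\partial q(x)}\Big|_{q_m} = \tilde c_{n,n}\,\big( y_1(x,\lambda_{2n},q)\,y_2(x,\lambda_{2n},q) + d_n(q)\,y_2^2(x,\lambda_{2n},q) \big). \]
To obtain the claimed statement it remains to interpret the right side of the equation above. As $\theta_n(q + c) = \theta_n(q)$ for any $c$, we have $\int_0^1 \gamma_n\,\frac{\partial\theta_n}{\partial q(x)}\big|_{q_m}\,dx = 0$ for any $m$. Therefore $0 = \int_0^1 \big( y_1(x)\,y_2(x) + d_n\,y_2^2(x) \big)\,dx$. Hence $y_1 + d_n y_2$ and $y_2$ are orthogonal and thus $d_n = b_n$. It follows that
\[ \frac12\,\tilde c_{n,n}\,(y_1 y_2 + d_n y_2^2) = \kappa_n\,\tilde f_{2n}\tilde f_{2n-1} \]
with $\kappa_n := \frac12\,\tilde c_{n,n}\,\|y_2\|\,\|y_1 + b_n y_2\| \ne 0$ and
\[ \kappa_n = \frac12 \Big( \frac{1}{\sqrt2} + O\Big(\frac1n\Big) \Big) \Big( \frac{1}{\sqrt2\,n\pi} + O\Big(\frac{1}{n^2}\Big) \Big) (-4\pi n) \Big( 1 + O\Big(\frac{\log n}{n}\Big) \Big) = -1 + O\Big(\frac{\log n}{n}\Big). \]
In view of (C.12), formula (C.10) and the claimed asymptotics for $\kappa_n$ are thus proved.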
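As a sanity check of the leading term $-1$, the norms entering $\kappa_n$ can be read off from the asymptotics (C.8) with $k = n$. The following is only leading-order bookkeeping, assuming that $b_n = O(1)$ (which follows from $\langle y_1, y_2\rangle = O(n^{-2})$ and $\langle y_2, y_2\rangle \approx (2\pi^2 n^2)^{-1}$) and using $\int_0^1 \sin^2 \pi n x\,dx = \int_0^1 \cos^2 \pi n x\,dx = \tfrac12$.

```latex
\begin{align*}
\|y_2\| &= \frac{1}{\sqrt{2}\,n\pi} + O\!\left(\frac{1}{n^2}\right), \qquad
\|y_1 + b_n y_2\| = \frac{1}{\sqrt{2}} + O\!\left(\frac{1}{n}\right), \\
\kappa_n &= \frac12\,\tilde c_{n,n}\,\|y_2\|\,\|y_1+b_n y_2\|
 \approx \frac12 \cdot (-4\pi n)\cdot\frac{1}{\sqrt2\, n\pi}\cdot\frac{1}{\sqrt2} = -1 .
\end{align*}
```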

D Appendix

In this appendix, for the convenience of the reader, we review the sampling formula (cf [MT1]) in the form used in this paper. Recall that for $q \in L^2_0$, $j \ge 1$,
\[ \psi_j(\lambda,q) = \frac{c_j}{j^2\pi^2} \prod_{n \ne j} \frac{\mu_n^{(j)} - \lambda}{n^2\pi^2} \]
denote the functions introduced in section 2. The following interpolation formula is an instance of the sampling formula (cf [MT1]).

Proposition 46 For $q \in L^2_0$, $j \ge 1$,
\[ \sum_{k=1}^{\infty} \frac{\psi_j(\mu_k(q),q)}{\dot m_{12}(\mu_k(q),q)}\,\frac{m_{12}(\lambda,q)}{\lambda - \mu_k(q)} = \psi_j(\lambda,q) \qquad (\lambda \in \mathbb{C}) \tag{D.1} \]
where $\dot{}$ denotes the derivative with respect to $\lambda$ and $m_{12}(\lambda,q) = y_2(1,\lambda,q)$.

Proposition 46 follows by a limiting argument from the corresponding one for finite gap potentials. Denote by $\mathrm{Gap}^0_{\le K}$ the set of $K$-gap potentials
\[ \mathrm{Gap}^0_{\le K} := \{ q \in L^2_0 \mid \gamma_k = 0 \text{ iff } k > K \} \]
($1 \le K < \infty$ arbitrary).

Lemma 47 For $q \in \mathrm{Gap}^0_{\le K}$, $1 \le j \le K$, and $\lambda \in \mathbb{C}$
\[ \sum_{k=1}^{K} \frac{\psi_j(\mu_k(q),q)}{\dot m_{12}(\mu_k(q),q)}\,\frac{m_{12}(\lambda,q)}{\lambda - \mu_k(q)} = \psi_j(\lambda,q). \tag{D.2} \]

Proof. Denote the left and right hand side of (D.2) by $LHS_j(\lambda,q)$ and $RHS_j(\lambda,q)$ respectively. Using the product representation for $\psi_j$ and for $m_{12}$ (cf. [PT]), we conclude that
\[ \frac{m_{12}(\lambda,q)}{\lambda - \mu_k(q)} = -\frac{1}{k^2\pi^2} \Big( \prod_{1 \le l \le K,\ l \ne k} \frac{\mu_l(q) - \lambda}{l^2\pi^2} \Big)\,G_1(\lambda,q); \]
\[ \psi_j(\lambda,q) = \frac{c_j(q)}{j^2\pi^2} \Big( \prod_{1 \le l \le K,\ l \ne j} \frac{\mu_l^{(j)}(q) - \lambda}{l^2\pi^2} \Big)\,G_{2,j}(\lambda,q); \]
where
\[ G_1(\lambda,q) := \prod_{k > K} \frac{\mu_k(q) - \lambda}{k^2\pi^2}; \qquad G_{2,j}(\lambda,q) := \prod_{k > K} \frac{\mu_k^{(j)}(q) - \lambda}{k^2\pi^2}. \]
As $q \in \mathrm{Gap}^0_{\le K}$, for $k > K$, $\mu_k(q) = \mu_k^{(j)}(q) = \lambda_{2k-1}(q) = \lambda_{2k}(q)$ and $G_1(\lambda,q) = G_{2,j}(\lambda,q) =: G(\lambda,q)$. Thus $LHS_j(\lambda,q) = P_{1,j}(\lambda,q)\,G(\lambda,q)$ and $RHS_j(\lambda,q) = P_{2,j}(\lambda,q)\,G(\lambda,q)$ where $P_{1,j}(\lambda,q)$ and $P_{2,j}(\lambda,q)$ are polynomials in $\lambda$ of degree at most $K - 1$. As $m_{12}(\mu_k(q),q) = 0$ for $k \ge 1$, we obtain, by L'Hôpital's rule, that $LHS_j(\mu_k(q),q) = RHS_j(\mu_k(q),q)$. Clearly, $G(\mu_k(q),q) \ne 0$ for $1 \le k \le K$, thus $P_{1,j}(\mu_k(q),q) = P_{2,j}(\mu_k(q),q)$ for $1 \le k \le K$, which means that $P_{1,j}$ and $P_{2,j}$, both being polynomials of degree at most $K - 1$ that agree at $K$ distinct points, coincide.
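The last step of this proof is, at its core, Lagrange interpolation: a polynomial of degree at most K − 1 is determined by its values at K distinct nodes. The following Python sketch checks the purely finite-dimensional analogue of (D.2), namely sum_k P(mu_k)/Q'(mu_k) * Q(lam)/(lam − mu_k) = P(lam) with Q(x) = prod_k (x − mu_k) and deg P ≤ K − 1. The nodes and the polynomial `p` are arbitrary illustrative choices and are not quantities from the paper; the function name `interpolate` is likewise hypothetical.

```python
from math import prod


def interpolate(p, nodes, lam):
    """Evaluate sum_k p(mu_k)/Q'(mu_k) * Q(lam)/(lam - mu_k), with Q(x) = prod_k (x - mu_k).

    The quotient Q(lam)/(lam - mu_k) is computed as a product over the other
    nodes, so the formula is also valid when lam equals one of the nodes.
    """
    total = 0.0
    for k, mu_k in enumerate(nodes):
        others = [mu for i, mu in enumerate(nodes) if i != k]
        q_prime = prod(mu_k - mu for mu in others)   # Q'(mu_k)
        q_over = prod(lam - mu for mu in others)     # Q(lam) / (lam - mu_k)
        total += p(mu_k) / q_prime * q_over
    return total


def p(x):
    # Hypothetical test polynomial of degree 2 (= K - 1 for K = 3 nodes).
    return 2.0 * x**2 - 3.0 * x + 1.0


nodes = [1.0, 4.0, 9.0]   # illustrative stand-ins for mu_1 < mu_2 < mu_3
lam = 2.5
value = interpolate(p, nodes, lam)   # agrees with p(lam) up to rounding
```

In the lemma itself the role of Q is played by m12 (up to the common factor G), and the role of P by the polynomial part of psi_j.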

References

[At] M. Atiyah, Convexity and commuting Hamiltonians, Bull. London Math. Soc. 14 (1982), 1–15.

[BBGK] D. Bättig, A. Bloch, J.-C. Guillot, and T. Kappeler, On the symplectic structure of the phase space for periodic KdV, Toda and defocusing NLS, Duke Math. J. 79 (1995), 549–604.

[BKM1] D. Bättig, T. Kappeler, and B. Mityagin, On the Korteweg-deVries equation: convergent Birkhoff normal form, J. Funct. Anal. 140 (1996), 335–358.

[BKM2] D. Bättig, T. Kappeler, and B. Mityagin, On the Korteweg-deVries equation: frequencies and initial value problem, Pacific J. Math. 181 (1997), 1–55.

[FM] H. Flaschka and D. McLaughlin, Canonically conjugate variables for the Korteweg-deVries equation and Toda lattice with periodic boundary conditions, Progress of Theor. Phys. 55 (1976), 438–456.

[GT] J. Garnett, E. Trubowitz, Gaps and bands of one dimensional Schrödinger operators, Comm. Math. Helv. 59 (1984), 258–312.

[GK] I.C. Gohberg and M.G. Krein, Introduction to the theory of linear, nonselfadjoint operators, Transl. of Math. Monogr., Volume 18, AMS, 1969.

[GS] V. Guillemin, S. Sternberg, Convexity properties of the moment mapping, Invent. Math. 67 (1982), 491–515.

[Ka] T. Kappeler, Fibration of the phase-space for the Korteweg-deVries equation, Ann. Inst. Fourier 41 (1991), 539–575.

[KaMa] T. Kappeler, M. Makarov, On action-angle variables for the second Poisson bracket of KdV, Commun. Math. Phys. 214 (2000), 651–677.

[KM] T. Kappeler, B. Mityagin, Estimates for periodic and Dirichlet eigenvalues of the Schrödinger operator, to appear in SIAM J. of Math. Anal.

[Ma] V.A. Marchenko, Sturm-Liouville operators and applications, Operator Theory: Advances and Applications, Volume 22, Birkhäuser, 1986.

[MT1] H.P. McKean, E. Trubowitz, Hill's operator and hyperelliptic function theory in the presence of infinitely many branch points, CPAM 24 (1976), 143–226.

[MT2] H.P. McKean, E. Trubowitz, Hill's surfaces and their theta functions, Bull AMS 84 (1978), 1042–1085.

[MV] H.P. McKean, K.L. Vaninsky, Action-angle variables for the cubic Schrödinger equation, CPAM 50 (1997), 489–562.

[PT] J. Pöschel, E. Trubowitz, Inverse spectral theory, Academic Press, San Diego, 1987.

[ST] J.J. Sansuc, V. Tkachenko, Spectral properties of non-selfadjoint Hill's operators with smooth potentials, in A. Boutet de Monvel and V. Marchenko (eds.), Algebraic and geometric methods in mathematical physics, 371–385, Kluwer, 1996.

T. Kappeler and M. Makarov
Institut für Mathematik
Universität Zürich
Winterthurerstrasse 190
CH-8057 Zürich
Switzerland
email: [email protected]

Communicated by Eduard Zehnder
submitted 31/07/00, accepted 25/06/01


Ann. Henri Poincaré 2 (2001) 857 – 886 © Birkhäuser Verlag, Basel, 2001 1424-0637/01/050857-30 $ 1.50+0.20/0


The Vlasov-Poisson System with Radiation Damping

M. Kunze and A. D. Rendall

Abstract. We set up and analyze a model of radiation damping within the framework of continuum mechanics, inspired by a model of post-Newtonian hydrodynamics due to Blanchet, Damour and Schäfer. In order to simplify the problem as much as possible we replace the gravitational field by the electromagnetic field and the fluid by kinetic theory. We prove that the resulting system has a well-posed Cauchy problem globally in time for general initial data and that in all solutions the fields decay to zero at late times. In particular, this means that the model is free from the runaway solutions which frequently occur in descriptions of radiation reaction.

1 Introduction and main results The Vlasov-Poisson system is a well-known description of collisionless particles which interact via a field which they generate collectively. It can be applied in the case of particles interacting through the electromagnetic field (plasma physics case) or the gravitational field (stellar dynamics case). The equations modeling the two cases are only distinguished by a difference of sign. This description is nonrelativistic and is only appropriate for physical situations where the velocities of the particles are small compared to the velocity of light. When it is replaced by a fully relativistic model the two cases diverge drastically. In the electromagnetic case the appropriate system of equations is the (relativistic) Vlasov-Maxwell system while in the gravitational case it is the Vlasov-Einstein system, which is much more complicated. In classical electrodynamics it is well known that accelerated charged particles radiate and that this leads to an effect on the motion of the particles known as radiation reaction. This typically leads to damping, i.e. to loss of energy by the particles. A similar but more complicated effect occurs in the case of the gravitational field. It is, however, hard to formulate exactly due to difficulties such as the nonlinearity and coordinate dependence of the equations used. There is a large literature concerning effective equations in electrodynamics which incorporate radiation damping without providing a full relativistic description of the field and sources. These effective equations usually have undesirable solutions which tend to infinity exponentially fast, the so-called “runaway solutions”. It has recently been observed that nevertheless, in some of these models, the physically relevant solutions of the effective equation constitute a center-like manifold in phase space, restricted to which the dynamics is completely well-behaved. 
Moreover, the effective equation is a good approximation of the full system; cf. [14, 15].



In the case of the gravitational field, radiation damping is a subject of particular interest at the moment due to the fact that gravitational wave detectors will soon be ready to go into operation and it is important for their effective functioning that the sources of gravitational waves be understood well. (For background on gravitational wave detection see for instance [8] and references therein.) The most promising type of source at the moment is a strongly self-gravitating system of two stars rotating about their common center of mass which lose energy by (gravitational) radiation damping and eventually coalesce. As has already been indicated, it is hard to describe this within the full theory and hence effective equations like those known in electrodynamics are important. Very little is understood about this in terms of rigorous mathematics at this time. The aim of this paper is to take a first step towards bringing this subject into the domain where models can be defined in a mathematically precise way and theorems proved about them. The model we will discuss has the following characteristics. It clearly exhibits the phenomenon of radiation damping. It does not suffer from pathologies such as runaway solutions. It is simple enough so that we can prove theorems about the global behaviour of the general solution. The particular model was chosen with the aim of obtaining this combination of properties. It is inspired by a model of Blanchet, Damour and Sch¨ afer [5] for a perfect fluid with radiation damping. The phenomenon of radiation damping is intimately connected with the long time asymptotics of the system. Thus, in order to capture it mathematically, we need at least a global existence theorem. This seems hopeless for a fluid, due to the formation of shocks, and so we replace it by collisionless matter (Euler replaced by Vlasov). The latter is known to have good global existence properties [19, 17, 24, 11]. 
Although the original motivation came from the gravitational case, the electromagnetic case is much simpler. Thus we use a model motivated by the electromagnetic case here, hoping to return to the more complicated gravitational case at a later date. We are not aware that the model used here has a direct physical application. The model to be studied is defined as follows. There are two species of particles of opposite charges, say ions (“+”) and electrons (“−”). In the case of the Vlasov-Poisson system the motion of the individual particles is governed by the characteristic systems
\[ \dot X^+ = V^+, \qquad \dot V^+ = \nabla U(t,X^+), \tag{1} \]
\[ \dot X^- = V^-, \qquad \dot V^- = -\nabla U(t,X^-), \tag{2} \]
where $U = U(t,x)$ is the (electric) potential. The requirement that the particle densities $f^\pm = f^\pm(t,x,v)$ be constant along the characteristics leads to the Vlasov equations
\[ \partial_t f^+ + v \cdot \nabla_x f^+ + \nabla U \cdot \nabla_v f^+ = 0, \tag{3} \]
\[ \partial_t f^- + v \cdot \nabla_x f^- - \nabla U \cdot \nabla_v f^- = 0, \tag{4} \]

with $t \in \mathbb{R}$, $x \in \mathbb{R}^3$, and $v \in \mathbb{R}^3$ denoting time, position, and velocity variable, respectively. The potential $U$ derives from the Poisson equation
\[ \Delta U = 4\pi\rho = 4\pi(\rho^+ - \rho^-), \qquad \lim_{|x| \to \infty} U(t,x) = 0, \tag{5} \]
where
\[ \rho^\pm(t,x) = \int f^\pm(t,x,v)\,dv. \tag{6} \]
Supplied with suitable data $f^\pm(t=0) = f_0^\pm$, (3)–(6) constitutes the Vlasov-Poisson system for two species of opposite charges; see [9, 22] for general information on Vlasov-Poisson and related models. In order to introduce a damping effect due to radiation into (3)–(6), we modify the characteristic equations by introducing a small additional term. Let
\[ D(t) = \int x\,\rho(t,x)\,dx = \int\!\!\int x\,\big(f^+(t,x,v) - f^-(t,x,v)\big)\,dx\,dv \tag{7} \]
denote the corresponding dipole moment, and replace (1), (2) by
\[ \dot X^+ = V^+, \qquad \dot V^+ = \nabla U(t,X^+) + \varepsilon\,\dddot D(t), \tag{8} \]
\[ \dot X^- = V^-, \qquad \dot V^- = -\nabla U(t,X^-) - \varepsilon\,\dddot D(t), \tag{9} \]
with an $\varepsilon > 0$ small. This is to be thought of as an approximation to the full Vlasov-Maxwell system. It includes the electric dipole radiation which is supposed to give the leading contribution to the radiation reaction, cf. [13, p. 784]. The third time derivative in these equations can lead to pathological behaviour and so we will modify the model by formally small corrections so as to eliminate it. Here we follow the procedure of [5] which was used to tackle the fifth time derivatives which occur in the analogous gravitational problem. To reduce the order of derivatives on $D(t)$, we utilize the transformations
\[ \tilde V^+ = V^+ - \varepsilon\,\ddot D(t) \quad\text{and}\quad \tilde V^- = V^- + \varepsilon\,\ddot D(t). \tag{10} \]
Then (8), (9) read
\[ \dot X^+ = V^+ + \varepsilon\,\ddot D(t), \qquad \dot V^+ = \nabla U(t,X^+), \tag{11} \]
\[ \dot X^- = V^- - \varepsilon\,\ddot D(t), \qquad \dot V^- = -\nabla U(t,X^-), \tag{12} \]
where the tilde has been omitted for simplicity. The corresponding Vlasov equations are then
\[ \partial_t f^+ + (v + \varepsilon\,\ddot D(t)) \cdot \nabla_x f^+ + \nabla U \cdot \nabla_v f^+ = 0, \tag{13} \]
\[ \partial_t f^- + (v - \varepsilon\,\ddot D(t)) \cdot \nabla_x f^- - \nabla U \cdot \nabla_v f^- = 0. \tag{14} \]

Next we derive an approximation $D^{[2]}(t)$ to $\ddot D(t)$. By definition of $D(t)$ in (7) we formally calculate
\[ \dot D(t) = \int\!\!\int x\,(\partial_t f^+ - \partial_t f^-)\,dx\,dv = \int\!\!\int x\,\big( -v \cdot \nabla_x f^+ - \nabla U \cdot \nabla_v f^+ + v \cdot \nabla_x f^- - \nabla U \cdot \nabla_v f^- \big)\,dx\,dv = \int\!\!\int v\,(f^+ - f^-)\,dx\,dv. \]
Thus
\[ \begin{aligned} \ddot D(t) &\cong O(\varepsilon) + \int\!\!\int v\,(\partial_t f^+ - \partial_t f^-)\,dx\,dv \\ &\cong O(\varepsilon) + \int\!\!\int v\,\big( -v \cdot \nabla_x f^+ - \nabla U \cdot \nabla_v f^+ + v \cdot \nabla_x f^- - \nabla U \cdot \nabla_v f^- \big)\,dx\,dv \\ &= O(\varepsilon) - \int\!\!\int v\,\nabla U \cdot (\nabla_v f^+ + \nabla_v f^-)\,dx\,dv \\ &= O(\varepsilon) + \int\!\!\int \nabla U\,(f^+ + f^-)\,dx\,dv. \end{aligned} \]
Hence we are led to define the approximation
\[ D^{[2]}(t) = \int\!\!\int \nabla U(t,x)\,\big(f^+(t,x,v) + f^-(t,x,v)\big)\,dx\,dv = \int \nabla U(t,x)\,\big(\rho^+(t,x) + \rho^-(t,x)\big)\,dx, \tag{15} \]
and we replace (11), (12) by
\[ \dot X^+ = V^+ + \varepsilon D^{[2]}(t), \qquad \dot V^+ = \nabla U(t,X^+), \tag{16} \]
\[ \dot X^- = V^- - \varepsilon D^{[2]}(t), \qquad \dot V^- = -\nabla U(t,X^-), \tag{17} \]
with corresponding Vlasov equations
\[ \partial_t f^+ + (v + \varepsilon D^{[2]}(t)) \cdot \nabla_x f^+ + \nabla U \cdot \nabla_v f^+ = 0, \tag{18} \]
\[ \partial_t f^- + (v - \varepsilon D^{[2]}(t)) \cdot \nabla_x f^- - \nabla U \cdot \nabla_v f^- = 0, \tag{19} \]
and $U$ is determined by (5). We call the system consisting of (18), (19), (5), (6), and (15) the Vlasov-Poisson system with damping (VPD), and we propose it as a model to study the damping effect due to radiation. We add some more comments.

Remark 1 (a) To model radiation as we have done it, it is necessary to consider at least two species with different charge to mass ratios. Here we make the simplest choice of equal masses and two charges which are equal in magnitude and opposite in sign. If the charge to mass ratios were equal then the rate of change of the


dipole moment would be proportional to the linear momentum of the system and, by conservation of momentum, the radiation reaction force would vanish. This is a well-known fact (absence of bremsstrahlung for identical particles), cf. [7, p. 411] or [25, p. 201], and can also be seen from the corresponding effective equations for radiation reaction, cf. [16, eq. after (1.9)].
(b) In the context of general relativity, one has to use the quadrupole moment
\[ Q_{ij}(t) = \int \Big( x_i x_j - \frac13 |x|^2 \delta_{ij} \Big)\,\rho(t,x)\,dx \]
instead of the dipole moment $D(t)$ and it is the fifth time derivative which occurs instead of the third before reduction [7]. This leads to considerable complications.
(c) Notice that $D^{[2]}(t) \equiv 0$ for e.g. spherically symmetric solutions, whence there is no radiation damping in this case.

It is the purpose of this paper to analyze rigorously long-time properties of classical solutions to (VPD). Therefore we first have to deal with the question of global existence of solutions, e.g. for smooth data functions $f^\pm(t=0) = f_0^\pm$ of compact support, i.e. such that
\[ f_0^\pm \in C_0^\infty(\mathbb{R}^3 \times \mathbb{R}^3), \qquad f_0^\pm \ge 0, \qquad\text{and}\qquad f_0^\pm(x,v) = 0 \ \text{ for } |x| \ge r_0 \text{ or } |v| \ge r_0, \tag{20} \]
with some fixed $r_0 > 0$. Since global existence is a quite non-trivial issue for Vlasov-Poisson like systems, we provide a complete existence proof for (VPD) in the Appendix, section 3, deriving estimates on higher velocity moments of $f^\pm$ along the lines of [17]. This approach has been successfully applied to other related problems as well; cf. [2, 6]. In this manner we obtain

Theorem 1 If $f_0^\pm$ satisfy (20), then there is a unique solution $f^\pm \in C^1([0,\infty[ \times \mathbb{R}^3 \times \mathbb{R}^3)$ of (VPD) with data $f^\pm(t=0) = f_0^\pm$.

Having ensured that suitable solutions do exist, we now turn to the decay estimates for quantities related to (VPD). We define the total energy
\[ E(t) = E_{kin}(t) + E_{pot}(t), \tag{21} \]
with
\[ E_{kin}(t) = \frac12 \int\!\!\int |v|^2\,\big(f^+(t,x,v) + f^-(t,x,v)\big)\,dx\,dv \quad\text{and}\quad E_{pot}(t) = \frac{1}{8\pi} \int |\nabla U(t,x)|^2\,dx, \tag{22} \]
denoting kinetic and potential energy, respectively.

Theorem 2 Assume $f_0^\pm$ satisfy (20). Then
\[ \dot E(t) = -\varepsilon\,|D^{[2]}(t)|^2. \tag{23} \]
Moreover, the following estimates hold for $t \in [0,\infty[$.
(a) $\|\rho^\pm(t)\|_{p;x} \le C(1+t)^{-\frac{3(p-1)}{2p}}$ for $p \in [1, \frac53]$;
(b) $\|\nabla U(t)\|_{p;x} \le C(1+t)^{-\frac{5p-3}{7p}}$ for $p \in [2, \frac{15}{4}]$;
(c) $|D^{[2]}(t)| \le C(1+t)^{-\frac87}$.
In particular, (VPD) does not admit nontrivial static solutions, and the kinetic energy satisfies $E_{kin}(t) \to E_\infty$ as $t \to \infty$ for some $E_\infty \ge 0$. Moreover, if $E(0) > 0$ and $\varepsilon > 0$ is small enough, then $E_\infty > 0$.

See Section 2 for the proof. We note that in theorem 2 a slow dissipation of energy takes place due to the “damping term” $D^{[2]}(t)$, as can be seen from equation (23).

Remark 2 As an aside, we include a comment on a relation to the usual Vlasov-Poisson system. We start with the characteristic equations (16), (17), i.e.
\[ \dot X^+ = V^+ + \varepsilon D^{[2]}(t), \qquad \dot V^+ = \nabla U(t,X^+), \]
\[ \dot X^- = V^- - \varepsilon D^{[2]}(t), \qquad \dot V^- = -\nabla U(t,X^-). \]
Define
\[ \bar X^+ = X^+, \quad \bar V^+ = V^+ + \varepsilon D^{[2]}(t), \qquad \bar X^- = X^-, \quad \bar V^- = V^- - \varepsilon D^{[2]}(t). \]
Then
\[ \dot{\bar X}^+ = \bar V^+, \qquad \dot{\bar V}^+ = \nabla U(t,\bar X^+) + \varepsilon \dot D^{[2]}(t) = \nabla W(t,\bar X^+), \]
\[ \dot{\bar X}^- = \bar V^-, \qquad \dot{\bar V}^- = -\nabla U(t,\bar X^-) - \varepsilon \dot D^{[2]}(t) = -\nabla W(t,\bar X^-), \]
where
\[ W(t,x) = U(t,x) + \varepsilon \dot D^{[2]}(t) \cdot x. \]
Also $\Delta W = \Delta U$. Thus we obtain a solution of the Vlasov-Poisson system where the potential $W$ does not satisfy the usual boundary conditions. This is similar to the cosmological solutions of the Vlasov-Poisson system constructed in [23]. They are obtained directly as solutions of a transformed system but are in the end solutions of the Vlasov-Poisson system with unconventional boundary conditions. This reformulation gives a simple way of seeing the volume preserving property of the flow for (VPD), since we know it for Vlasov-Poisson. It is, however, not hard to see it directly.

Notation Throughout the paper, $C$ denotes a general constant which may change from line to line and which only depends on $f_0^\pm$. If we consider a solution on a fixed time interval $[0,T]$, and if $C$ additionally depends on $T$, this is indicated by $C_T$. The usual $L^p$-norm of a function $\varphi = \varphi(t,x)$ over $x \in \mathbb{R}^3$ is denoted by $\|\varphi(t)\|_{p;x}$, and if $\varphi = \varphi(t,x,v)$ and the integrals are to be extended over $(x,v) \in \mathbb{R}^3 \times \mathbb{R}^3$, then we write $\|\varphi(t)\|_{p;xv}$. To simplify notation, an integral $\int$ always means $\int_{\mathbb{R}^3}$.

Acknowledgments We wish to thank Thibault Damour, Gerhard Rein, Gerhard Schäfer and Herbert Spohn for discussions and helpful advice. MK acknowledges support through a Heisenberg fellowship of DFG.

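2 Proof of theorem 2

We split the proof into several subsections.

2.1 Energy dissipation

We verify (23) and calculate the change of the total energy $E(t)$ from (21). Due to (18) and (19) we have
\[ \begin{aligned} \dot E_{kin}(t) &= \frac12 \int\!\!\int v^2\,(\partial_t f^+ + \partial_t f^-)\,dx\,dv \\ &= \frac12 \int\!\!\int v^2\,\Big( -[v + \varepsilon D^{[2]}(t)] \cdot \nabla_x f^+ - \nabla U \cdot \nabla_v f^+ - [v - \varepsilon D^{[2]}(t)] \cdot \nabla_x f^- + \nabla U \cdot \nabla_v f^- \Big)\,dx\,dv \\ &= \int\!\!\int (v \cdot \nabla U)\,(f^+ - f^-)\,dx\,dv = \int \nabla U \cdot j\,dx, \end{aligned} \tag{24} \]
where
\[ j(t,x) = j^+(t,x) - j^-(t,x), \qquad j^\pm(t,x) = \int v\,f^\pm(t,x,v)\,dv, \]
is the current. The evaluation of $\dot E_{pot}(t)$ is a little more tedious, and for this purpose we will use
\[ E_{pot}(t) = \frac{1}{8\pi} \int |\nabla U|^2\,dx = -\frac{1}{8\pi} \int (\Delta U)\,U\,dx = \frac12 \int (\rho^- - \rho^+)\,U\,dx, \tag{25} \]
and moreover the representations of the electric field by means of Coulomb potentials
\[ \begin{aligned} U(t,x) &= -\int \frac{dy}{|x-y|}\,\big(\rho^+(t,y) - \rho^-(t,y)\big), \\ E(t,x) := \nabla U(t,x) &= -\int dy\,\nabla_x \frac{1}{|x-y|}\,\big(\rho^+(t,y) - \rho^-(t,y)\big) = \int dy\,\frac{x-y}{|x-y|^3}\,\big(\rho^+(t,y) - \rho^-(t,y)\big). \end{aligned} \tag{26} \]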

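Then we obtain through the change of variables $x \leftrightarrow y$ and $v \leftrightarrow w$
\[ \begin{aligned} \dot E_{pot}(t) &= \frac12 \int dx\,\Big[ (\partial_t\rho^- - \partial_t\rho^+)\,U + (\rho^- - \rho^+)\,(\partial_t U) \Big] \\ &= -\frac12 \int dx\,dy\,dv\,dw\,\Big[ \big(\partial_t f^-(t,x,v) - \partial_t f^+(t,x,v)\big)\,\frac{1}{|x-y|}\,\big(f^+(t,y,w) - f^-(t,y,w)\big) \\ &\qquad\qquad + \big(f^-(t,x,v) - f^+(t,x,v)\big)\,\frac{1}{|x-y|}\,\big(\partial_t f^+(t,y,w) - \partial_t f^-(t,y,w)\big) \Big] \\ &= -\int dx\,dy\,dv\,dw\,\big(\partial_t f^-(t,x,v) - \partial_t f^+(t,x,v)\big)\,\frac{1}{|x-y|}\,\big(f^+(t,y,w) - f^-(t,y,w)\big) \\ &= \int \frac{dx\,dy\,dv\,dw}{|x-y|}\,\Big( [v - \varepsilon D^{[2]}(t)] \cdot \nabla_x f^-(t,x,v) - \nabla U(t,x) \cdot \nabla_v f^-(t,x,v) \\ &\qquad\qquad - [v + \varepsilon D^{[2]}(t)] \cdot \nabla_x f^+(t,x,v) - \nabla U(t,x) \cdot \nabla_v f^+(t,x,v) \Big) \times \big(f^+(t,y,w) - f^-(t,y,w)\big) \\ &= \int \frac{dx\,dy\,dv}{|x-y|}\,\Big( [v - \varepsilon D^{[2]}(t)] \cdot \nabla_x f^-(t,x,v) - [v + \varepsilon D^{[2]}(t)] \cdot \nabla_x f^+(t,x,v) \Big) \times \big(\rho^+(t,y) - \rho^-(t,y)\big) \\ &= -\int dx\,dy\,dv\,\nabla_x \frac{1}{|x-y|} \cdot \Big( [v - \varepsilon D^{[2]}(t)]\,f^-(t,x,v) - [v + \varepsilon D^{[2]}(t)]\,f^+(t,x,v) \Big) \times \big(\rho^+(t,y) - \rho^-(t,y)\big) \\ &= \int dx\,dv\,\nabla U(t,x) \cdot \Big( [v - \varepsilon D^{[2]}(t)]\,f^-(t,x,v) - [v + \varepsilon D^{[2]}(t)]\,f^+(t,x,v) \Big) \\ &= \int dx\,dv\,(\nabla U \cdot v)\,(f^- - f^+) + \int dx\,dv\,\nabla U \cdot \big( -\varepsilon D^{[2]}(t)\,f^- - \varepsilon D^{[2]}(t)\,f^+ \big) \\ &= -\int \nabla U \cdot j\,dx - \varepsilon\,|D^{[2]}(t)|^2, \end{aligned} \]
recall the definition of $D^{[2]}(t)$ from (15). Combining this with (24), we see that (23) holds.

2.2 Decay of the potential energy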

Here we show a t−1 -decay of the potential energy Epot (t) from (25). The result is similar to [12, 18], but the proof requires appropriate modifications due to the presence of the term D[2] (t).

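Lemma 1 We have
\[ E_{pot}(t) \le C(1+t)^{-1} \quad\text{and}\quad \int\!\!\int (x - vt)^2\,(f^+ + f^-)\,dx\,dv \le Ct, \qquad t \in [0,\infty[, \]
the constants being independent of $\varepsilon \in [0,1]$.

Proof. Denote
\[ R(t) = \int\!\!\int (x - vt)^2\,(f^+ + f^-)\,dx\,dv \quad\text{and}\quad g(t) = \frac{t^2}{4\pi} \int |\nabla U|^2\,dx = 2t^2 E_{pot}(t). \tag{27} \]
Then a short calculation reveals
\[ \dot R(t) = -2t \int (x \cdot \nabla U)\,\rho\,dx + 2t^2 \int \nabla U \cdot j\,dx + 2\varepsilon D^{[2]}(t) \cdot \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv. \tag{28} \]
Inserting (26) for $\nabla U$ and writing $x \cdot (x-y)\,|x-y|^{-3} = |x-y|^{-1} + y \cdot (x-y)\,|x-y|^{-3}$, we see that
\[ \int (x \cdot \nabla U)\,\rho\,dx = -\frac12 \int U\rho\,dx = \frac{1}{8\pi} \int |\nabla U|^2\,dx. \tag{29} \]
On the other hand, (24) and the energy identity (23) imply
\[ \int \nabla U \cdot j\,dx = \dot E_{kin}(t) = -\dot E_{pot}(t) - \varepsilon\,|D^{[2]}(t)|^2 = -\frac{1}{8\pi}\,\frac{d}{dt} \int |\nabla U|^2\,dx - \varepsilon\,|D^{[2]}(t)|^2. \tag{30} \]
Using (29) and (30) in (28), it follows that
\[ \dot R(t) = -\frac{t}{4\pi} \int |\nabla U|^2\,dx + 2t^2 \Big( -\frac{1}{8\pi}\,\frac{d}{dt} \int |\nabla U|^2\,dx - \varepsilon\,|D^{[2]}(t)|^2 \Big) + 2\varepsilon D^{[2]}(t) \cdot \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv. \]
By means of $g$ from (27), this may be rewritten as
\[ \frac{d}{dt}\big( R(t) + g(t) \big) = \frac{g(t)}{t} - 2\varepsilon t^2\,|D^{[2]}(t)|^2 + 2\varepsilon D^{[2]}(t) \cdot \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv. \tag{31} \]
Compared to [12, p. 1412], it is now necessary to see how the two terms with $D^{[2]}(t)$ contribute. First we consider the case that $t > 0$ is such that
\[ t^2\,|D^{[2]}(t)| \le \Big| \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv \Big|. \]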

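Then we obtain from (31) that
\[ \frac{d}{dt}\big( R(t) + g(t) \big) \le \frac{g(t)}{t} + 2\varepsilon\,|D^{[2]}(t)|\,\Big| \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv \Big| \le \frac{g(t)}{t} + 2\varepsilon t^{-2}\,\Big| \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv \Big|^2. \tag{32} \]
However, if
\[ t^2\,|D^{[2]}(t)| \ge \Big| \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv \Big|, \]
then (31) yields
\[ \frac{d}{dt}\big( R(t) + g(t) \big) \le \frac{g(t)}{t}, \]
hence (32) is verified for all $t > 0$. In order to have bounds below independent of, say, $\varepsilon \in [0,1]$, we modify (32) to
\[ \frac{d}{dt}\big( R(t) + g(t) \big) \le \frac{g(t)}{t} + 2t^{-2}\,\Big| \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv \Big|^2. \tag{33} \]
To further exploit this, we next note that due to Hölder's inequality and by lemma 2 below with $p = 0$
\[ \Big| \int\!\!\int (x - tv)\,(f^+ - f^-)\,dx\,dv \Big|^2 \le \Big( \int\!\!\int (x - tv)^2\,[f^+ + f^-]\,dx\,dv \Big) \Big( \int\!\!\int [f^+ + f^-]\,dx\,dv \Big) \le C R(t), \tag{34} \]
thus by (33)
\[ \frac{d}{dt}\big( R(t) + g(t) \big) \le \frac{g(t)}{t} + C t^{-2} R(t), \qquad t > 0. \]
Integrating over $t \in [1,T]$, we see that
\[ R(T) \le R(T) + g(T) \le C + \int_1^T \frac{g(t)}{t}\,dt + C \int_1^T t^{-2} R(t)\,dt, \qquad T \ge 1. \tag{35} \]
Therefore
\[ R(T) \le C \Big( 1 + \int_1^T \frac{g(t)}{t}\,dt \Big), \qquad T \ge 1, \tag{36} \]
by Gronwall's lemma. Using this in (35), we find
\[ g(T) \le C + \int_1^T \frac{g(t)}{t}\,dt + C \int_1^T \frac{dt}{t^2} \Big( 1 + \int_1^t \frac{g(s)}{s}\,ds \Big) \le C + \int_1^T \frac{g(t)}{t}\,dt + C \int_1^T \Big( \frac1t - \frac1T \Big) \frac{g(t)}{t}\,dt, \]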


and consequently
$$g(T) \le CT, \qquad T\ge 1,$$
again by Gronwall's lemma. According to the definition of $g$, this proves the $t^{-1}$ decay of $E_{pot}(t)$, and then (36) shows that $R(T)\le CT$ holds as well. This completes the proof of lemma 1.
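The two applications of Gronwall's lemma in the proof of lemma 1 can be spelled out as follows. This is only a sketch: $C_0$ and $C_1$ are hypothetical names for the generic constants $C$ appearing in (35) and in the bound on $g(T)$, and $a(T)$ is an auxiliary abbreviation that does not occur in the text.

```latex
% Step 1: (35) implies (36).
% Put a(T) := C_0 + \int_1^T g(t)/t \, dt, which is nondecreasing because g >= 0.
\begin{align*}
R(T) \le a(T) + C_0\int_1^T t^{-2}R(t)\,dt
\quad\Longrightarrow\quad
R(T) \le a(T)\exp\Big(C_0\int_1^T t^{-2}\,dt\Big) \le e^{C_0}\,a(T).
\end{align*}
% Step 2: linear growth of g.
% Use 0 <= 1/t - 1/T <= 1/t in the last integral of the bound on g(T).
\begin{align*}
g(T) \le C_1 + \int_1^T\Big(\frac1t+\frac{C_1}{t^2}\Big)g(t)\,dt
\quad\Longrightarrow\quad
g(T) &\le C_1\exp\Big(\int_1^T\Big(\frac1t+\frac{C_1}{t^2}\Big)dt\Big) \\
     &= C_1\,T\,e^{C_1(1-1/T)} \le C_1e^{C_1}\,T .
\end{align*}
```

The factor $T$ comes entirely from the non-integrable part $1/t$ of the kernel; the part $C_1/t^2$ is integrable on $[1,\infty[$ and only changes the constant.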

2.3 Some general estimates

We digress now from the proof of theorem 2 and note some useful estimates that will also play a role later for the global existence of solutions, cf. theorem 1. Define the velocity moments
$$M_p^\pm(t) = \iint |v|^p f^\pm(t,x,v)\,dxdv, \quad\text{and}\quad M_p(t) = \sup_{s\in[0,t]}\big(M_p^+(s)+M_p^-(s)\big) \tag{37}$$
for $p\in[0,\infty[$.

Lemma 2 For $t\in[0,\infty[$ we have
$$\|f^\pm(t)\|_{\infty;xv} \le C \quad\text{and}\quad M_p(t)\le C, \qquad p\in[0,2].$$

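Proof. For fixed $t\in[0,\infty[$ let $(\mathcal X(s),\mathcal V(s)) = (\mathcal X(s;t,x,v),\mathcal V(s;t,x,v))$ denote the characteristics from (16) associated with (18), i.e.
$$\begin{pmatrix}\dot{\mathcal X}(s)\\ \dot{\mathcal V}(s)\end{pmatrix} = \begin{pmatrix}\mathcal V(s)+\varepsilon D^{[2]}(s)\\ \nabla U(s,\mathcal X(s))\end{pmatrix}, \qquad \begin{pmatrix}\mathcal X(t)\\ \mathcal V(t)\end{pmatrix} = \begin{pmatrix}x\\ v\end{pmatrix}. \tag{38}$$
Then $\frac{\partial}{\partial s}[f^+(s,\mathcal X(s),\mathcal V(s))] = 0$ shows $f^+(t,x,v) = f_0^+(\mathcal X(0),\mathcal V(0))$, and hence the first bound follows. Concerning the second, $M_2(t) = 2\sup_{s\in[0,t]}E_{kin}(s) \le 2\sup_{s\in[0,t]}E(s) = 2E(0)$ by (23). Moreover, $M_0^\pm(t) = M_0^\pm(0)$, as $(x,v)\mapsto(\mathcal X(0;t,x,v),\mathcal V(0;t,x,v))$ is a volume-preserving diffeomorphism of $\mathbb R^3\times\mathbb R^3$, due to the fact that the right-hand side of the ODE in (38) has divergence $\mathrm{div} = \mathrm{div}_{(\mathcal X,\mathcal V)}$ zero; see also lemma 16 below and remark 2. Observing that $|v|^p \le 1+|v|^2$ for $v\in\mathbb R^3$ and $p\in[0,2]$ completes the proof.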
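The divergence-free property used in the proof of lemma 2 can be checked in one line. The sketch below assumes only the form of the characteristic system (38); the symbol $W$ for its right-hand side and $J$ for the Jacobian determinant are introduced here for illustration.

```latex
% Right-hand side of (38), viewed as a vector field on R^3 x R^3:
%   W(s, X, V) = ( V + \varepsilon D^{[2]}(s),  \nabla U(s, X) ).
% The X-component does not depend on X, and the V-component does not depend on V.
\begin{align*}
\mathrm{div}_{(X,V)}\,W
  = \sum_{i=1}^3 \frac{\partial}{\partial X_i}\big(V_i+\varepsilon D^{[2]}_i(s)\big)
  + \sum_{i=1}^3 \frac{\partial}{\partial V_i}\big(\partial_{X_i}U(s,X)\big)
  = 0 + 0 .
\end{align*}
% Liouville's formula for J(s) := \det \partial(\mathcal X,\mathcal V)(s)/\partial(x,v):
\begin{align*}
\frac{d}{ds}J(s) = (\mathrm{div}_{(X,V)}W)\,J(s) = 0, \qquad J(t) = 1
\quad\Longrightarrow\quad J(s)\equiv 1 .
\end{align*}
```

In particular the damping term $\varepsilon D^{[2]}(s)$ depends on $s$ only, so it shifts positions rigidly and does not affect phase-space volume.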

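Lemma 3 Let $f = f(x,v)\in L^\infty(\mathbb R^3\times\mathbb R^3)$ be nonnegative such that $\iint |v|^p f(x,v)\,dxdv < \infty$ for some $p\in[0,\infty[$, and define $\phi(x) = \int f(x,v)\,dv$, $x\in\mathbb R^3$. Then
$$\|\phi\|_{\frac{3+p}{3};x} \le C\,\|f\|_{\infty;xv}^{\frac{p}{3+p}}\Big(\iint |v|^p f(x,v)\,dxdv\Big)^{\frac{3}{3+p}}. \tag{39}$$
Here $C$ depends only on $p$.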




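Proof. The argument is well-known, but indicated for completeness. We split
$$\phi(x) \le \int_{|v|\le R} f(x,v)\,dv + \int_{|v|\ge R} f(x,v)\,dv \le \frac{4\pi}{3}R^3\|f\|_{\infty;xv} + R^{-p}\int |v|^p f(x,v)\,dv$$
and optimize in $R$ to find
$$\phi(x) \le C\,\|f\|_{\infty;xv}^{\frac{p}{3+p}}\Big(\int |v|^p f(x,v)\,dv\Big)^{\frac{3}{3+p}},$$
whence integration w.r.t. $x$ yields (39).

Lemma 4 We have
$$\Big\|\nabla\frac{1}{|x|}*\rho\Big\|_{q;x} \le C\|\rho\|_{p;x}, \qquad q\in\Big]\frac32,\infty\Big[, \quad p = \frac{3q}{3+q}.$$
In addition,
$$\Big\|\nabla\frac{1}{|x|}*\mathrm{div}\,\Gamma\Big\|_{q;x} \le C\|\Gamma\|_{q;x}, \qquad q\in\,]1,\infty[,$$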

for smooth and compactly supported vector fields $\Gamma:\mathbb R^3\to\mathbb R^3$.

Proof. The first estimate is a consequence of the classical Hardy-Littlewood-Sobolev inequality; see [10, Thm. 4.5.3]. Concerning the second, we note that integration by parts reveals
$$\lim_{\varepsilon\to 0}\int_{|x-y|\ge\varepsilon}\frac{x-y}{|x-y|^3}\,\mathrm{div}\,\Gamma(y)\,dy = \frac{4\pi}{3}\Gamma(x) - \int\Gamma(x-y)\cdot g(y)\,dy, \tag{40}$$
with $g(y) = \frac{1}{|y|^3}G(y)$, where $G(y) = (-\mathrm{Id}) + \frac{3}{|y|^2}(y\otimes y)\in\mathbb R^{3\times3}$. Since $G$ is bounded in $\mathbb R^3\setminus\{0\}$, homogeneous of degree zero, and satisfies $\int_{|y|=1}G(y)\,d^2y = 0$, the Calderón-Zygmund inequality [1, Thm. 4.31] implies that the second term on the right-hand side of (40) defines a bounded operator $L^q(\mathbb R^3)\to L^q(\mathbb R^3)$; in view of the compact support of the $\Gamma$'s it is not necessary that $G$ also has compact support.

Lemma 5 For $t\in[0,\infty[$ we have
$$\|\rho^\pm(t)\|_{p;x} \le C\,M_{3(p-1)}(t)^{\frac1p}, \qquad p\in[1,\infty[, \tag{41}$$
as well as
$$\|\nabla U(t)\|_{q;x} \le C\|\rho(t)\|_{\frac{3q}{3+q};x} \le C\,M_{\frac{6q-9}{3+q}}(t)^{\frac{3+q}{3q}}, \qquad q\in\Big]\frac32,\infty\Big[. \tag{42}$$
Moreover,
$$|D^{[2]}(t)| \le \|\nabla U(t)\|_{p;x}\,\|\rho(t)\|_{p';x}, \qquad p\in[1,\infty]. \tag{43}$$


Proof. According to lemma 3 and lemma 2,
$$\|\rho^\pm(t)\|_{\frac{3+\alpha}{3};x} \le C\,\|f^\pm(t)\|_{\infty;xv}^{\frac{\alpha}{3+\alpha}}M_\alpha(t)^{\frac{3}{3+\alpha}} \le C\,M_\alpha(t)^{\frac{3}{3+\alpha}}$$
for all $\alpha\ge 0$, hence (41) holds. Due to (26) we see lemma 4 applies to yield, for $q\in\,]\frac32,\infty[$ and with $p = \frac{3q}{3+q}$, together with (41)
$$\|\nabla U(t)\|_{q;x} \le C\big(\|\rho^+(t)\|_{p;x} + \|\rho^-(t)\|_{p;x}\big) \le C\,M_{3(p-1)}(t)^{\frac1p}.$$
Expressing $p$ through $q$, we arrive at (42). The estimate on $|D^{[2]}(t)|$ is a consequence of (15) and Hölder's inequality.

2.4 Proof of theorem 2 (completed)

From lemma 1 we additionally obtain, analogously to [12], the following information.

Corollary 1 Under the assumptions of theorem 2, we moreover have
$$\|\rho^\pm(t)\|_{\frac53;x} \le C(1+t)^{-3/5}, \qquad t\in[0,\infty[, \tag{44}$$
and
$$\|\nabla U(t)\|_{\frac{15}{4};x} \le C(1+t)^{-3/5}, \qquad t\in[0,\infty[. \tag{45}$$

Proof. Using lemma 2 and lemma 1 we can split
$$\rho^\pm(t,x) \le \int_{\{v:|x-tv|\le R\}} f^\pm(t,x,v)\,dv + R^{-2}\int_{\{v:|x-tv|\ge R\}}(x-tv)^2 f^\pm(t,x,v)\,dv$$
$$\le CR^3t^{-3} + R^{-2}\int (x-tv)^2(f^++f^-)(t,x,v)\,dv \le CR^3t^{-3} + CR^{-2}t,$$
and then choose the optimal $R\cong t^{4/5}$ to obtain (44). Concerning (45), this follows from (44) and the first inequality in (42) with $q = \frac{15}{4}$.

Lemma 6 Assertions (a)–(c) of theorem 2 are satisfied.

Proof. From lemma 2 and corollary 1 we know $\|\rho^\pm(t)\|_{1;x}\le C$ as well as $\|\rho^\pm(t)\|_{\frac53;x}\le C(1+t)^{-3/5}$, and we have $\|\nabla U(t)\|_{2;x} = \sqrt{8\pi}\,E_{pot}(t)^{1/2} \le C(1+t)^{-1/2}$ as well as $\|\nabla U(t)\|_{\frac{15}{4};x}\le C(1+t)^{-3/5}$ by lemma 1 and corollary 1. Hence the general interpolation estimate
$$\|\phi\|_p \le \|\phi\|_{q_1}^{\alpha}\,\|\phi\|_{q_2}^{1-\alpha}, \qquad p\in[q_1,q_2], \quad \frac1p = \frac{\alpha}{q_1}+\frac{1-\alpha}{q_2},$$
yields (a) and (b). For (c), we use (43) with $p = 5/2$ and $p' = 5/3$, (a), and (b) to see that
$$|D^{[2]}(t)| \le \|\nabla U(t)\|_{\frac52;x}\,\|\rho(t)\|_{\frac53;x} \le C(1+t)^{-19/35}(1+t)^{-3/5} = C(1+t)^{-8/7}, \tag{46}$$
as was to be shown.
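The exponent $-19/35$ in (46) can be traced back to the interpolation estimate as follows. This is a worked check, assuming only the two endpoint bounds $\|\nabla U(t)\|_{2;x}\le C(1+t)^{-1/2}$ and $\|\nabla U(t)\|_{15/4;x}\le C(1+t)^{-3/5}$ stated in the proof of lemma 6.

```latex
% Interpolate L^{5/2} between L^2 and L^{15/4}:
%   1/p = alpha/q_1 + (1-alpha)/q_2  with  p = 5/2, q_1 = 2, q_2 = 15/4.
\begin{align*}
\frac25 = \frac{\alpha}{2} + \frac{4(1-\alpha)}{15}
\;\Longleftrightarrow\;
\frac{2}{15} = \frac{7}{30}\,\alpha
\;\Longleftrightarrow\;
\alpha = \frac47 .
\end{align*}
% Resulting decay rate of the L^{5/2} norm of \nabla U:
\begin{align*}
\|\nabla U(t)\|_{\frac52;x}
 \le C(1+t)^{-\frac47\cdot\frac12}\,(1+t)^{-\frac37\cdot\frac35}
 = C(1+t)^{-\frac{10}{35}-\frac{9}{35}}
 = C(1+t)^{-\frac{19}{35}} .
\end{align*}
% Combined with the L^{5/3} bound (44) on \rho:
\begin{align*}
\frac{19}{35}+\frac35 = \frac{19}{35}+\frac{21}{35} = \frac{40}{35} = \frac87 .
\end{align*}
```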

Remark 3 The estimates derived thus far suggest that the optimal decay rate for $D^{[2]}(t)$ be $|D^{[2]}(t)|\sim t^{-3/2}$ rather than $|D^{[2]}(t)|\sim t^{-8/7}$, for the following reason: from Hölder's inequality and lemma 1 it follows that $I(t) = \iint (x-vt)(f^+-f^-)\,dxdv$ satisfies
$$|I(t)| \le C\Big(\iint (x-vt)^2(f^++f^-)\,dxdv\Big)^{1/2} \le C(1+t)^{1/2},$$
thus we might expect $\dot I(t)\sim t^{-1/2}$. On the other hand, direct calculation shows
$$\dot I(t) = \Big(\varepsilon\iint (f^++f^-)\,dxdv - t\Big)D^{[2]}(t) \sim (-t)\,D^{[2]}(t),$$
whence we should have $|D^{[2]}(t)|\sim t^{-3/2}$. This decay would also be obtained if it were possible to use theorem 2(a) and (b) with $p = \frac{15}{4}$ and $p = \frac{15}{11}$, respectively, since then (43) would yield
$$|D^{[2]}(t)| \le \|\nabla U(t)\|_{\frac{15}{11};x}\,\|\rho(t)\|_{\frac{15}{4};x} \le C(1+t)^{-\frac{5(15/11)-3}{7(15/11)}}(1+t)^{-\frac{3((15/4)-1)}{2(15/4)}} = C(1+t)^{-3/2}.$$
However, the necessary decay estimates for such $p$-norms of $\nabla U(t)$ and $\rho(t)$ could not be proved.

Corollary 2 There are no nontrivial static solutions of (VPD), and $E_{kin}(t)\to E_\infty\ge 0$ as $t\to\infty$. If $E(0) > 0$ and $\varepsilon > 0$ is sufficiently small, then $E_\infty > 0$.

Proof. If (VPD) had a static solution $f^\pm(t)\equiv f_0^\pm$, then $E_{pot}(t)\equiv 0$, whence $\nabla U = 0$. This in turn yields $D^{[2]}(t)\equiv 0$ by definition. Consequently, the Vlasov equations (18), (19) reduce to $\partial_t f^\pm + v\cdot\nabla_x f^\pm = 0$ with unique solution $f^\pm(x,v) = f_0^\pm(x-vt,v)$. But then we see
$$\rho^\pm(x) = \int f_0^\pm(x-vt,v)\,dv = t^{-3}\int f_0^\pm(w,t^{-1}[x-w])\,dw,$$
showing as $t\to\infty$ that the solution has to be trivial. To prove the assertion concerning $E_{kin}(t)$, note that, since $E(t)$ is decaying by (23), $E(t)\to E_\infty\ge 0$ as $t\to\infty$. But $E(t) = E_{kin}(t)+E_{pot}(t)$ and $E_{pot}(t)\to 0$, hence the first claim follows. For the second, denote by $C_1$ the constant on the right-hand side of (46). Since all bounds are derived from lemma 1, we note that $C_1$ is independent of $\varepsilon\in[0,1]$. Integrating (23) yields
$$E_{kin}(t)+E_{pot}(t) = E(0) - \varepsilon\int_0^t|D^{[2]}(s)|^2\,ds,$$
thus as $t\to\infty$, provided that $E_\infty = 0$, by (46)
$$E(0) = \varepsilon\int_0^\infty|D^{[2]}(s)|^2\,ds \le C_1^2\,\varepsilon\int_0^\infty(1+s)^{-16/7}\,ds = (7C_1^2/9)\,\varepsilon.$$
So if we choose $\varepsilon < 9E(0)/(7C_1^2)$, then $E_\infty > 0$.

Taking into account Section 2.1, lemma 6, and corollary 2, we note that the proof of theorem 2 is complete.

Remark 4 With regard to corollary 2, $E_\infty = \lim_{t\to\infty}E_{kin}(t) > 0$ was to be expected, since otherwise the particle velocities would have to tend to zero. It is, however, not surprising that at late times, when we are in a small data regime and the radiation reaction force is getting small, the solution behaves like a solution of the Vlasov-Poisson system with small data. In that case the particles travel with constant non-zero velocity at late times, as shown in [3].

We note a further consequence of the foregoing estimates.

Corollary 3 If $E_\infty > 0$, then
$$C_1t - C_2 \le \iint (x\cdot v)\big(f^+(t,x,v)+f^-(t,x,v)\big)\,dxdv \le C_3(1+t), \qquad t\in[0,\infty[,$$
for constants $C_1, C_2, C_3 > 0$.

Proof. Denote $S(t) = \iint (x\cdot v)(f^++f^-)\,dxdv$. In view of lemma 1 we obtain
$$2t^2E_{kin}(t) = \iint (x-vt)^2(f^++f^-)\,dxdv - \iint x^2(f^++f^-)\,dxdv + 2tS(t) \le C(1+t) + 2tS(t),$$
whence $t^2E_\infty \le C(1+t)+2tS(t)$ for $t$ large enough. To prove the upper bound, note that $Q(t) = \iint x^2(f^++f^-)\,dxdv$ satisfies $\dot Q(t) = 2S(t) + 2\varepsilon D^{[2]}(t)\cdot\iint x(f^+-f^-)\,dxdv$, as follows by a straightforward calculation. Therefore utilizing Hölder's inequality we obtain
$$\dot Q(t) \le 2Q(t)^{1/2}(2E_{kin}(t))^{1/2} + C(1+t)^{-8/7}Q(t)^{1/2} \le CQ(t)^{1/2}.$$
Consequently, $Q(t)\le C(1+t)^2$, and this in turn yields, once more by Hölder's inequality, $|S(t)| \le Q(t)^{1/2}(2E_{kin}(t))^{1/2} \le C(1+t)$.
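The step from $\dot Q\le CQ^{1/2}$ to $Q(t)\le C(1+t)^2$ in the proof of corollary 3 is a standard separation of variables. A minimal sketch, assuming $Q(t) > 0$ so that $Q^{1/2}$ is differentiable; $C_0$ is a hypothetical name for the constant in the differential inequality:

```latex
\begin{align*}
\frac{d}{dt}\,Q(t)^{1/2} = \frac{\dot Q(t)}{2\,Q(t)^{1/2}} \le \frac{C_0}{2}
\quad\Longrightarrow\quad
Q(t)^{1/2} \le Q(0)^{1/2} + \frac{C_0}{2}\,t
\quad\Longrightarrow\quad
Q(t) \le \Big(Q(0)^{1/2}+\frac{C_0}{2}\,t\Big)^2 \le C(1+t)^2 .
\end{align*}
```

The bound is uniform in $\varepsilon\in[0,1]$ because the damping contribution $C(1+t)^{-8/7}Q^{1/2}$ was already absorbed into $C_0Q^{1/2}$.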




3 Appendix: Existence of solutions

As mentioned in the introduction, the proof follows [17]. The idea is to decompose the field $E = \nabla U = E_1+F$ in a “far field” $F$ that is small, and in some complementary part $E_1$ which is of higher regularity than $E$ itself. (More precisely, $\|E_1(t)\|_{p;x}\le C$ for every $p\in[1,\frac{15}{4}[$ can be achieved.) According to this splitting, we write the Vlasov equations (18) and (19) in the form
$$\partial_tf^+ + (v+\varepsilon D^{[2]}(t))\cdot\nabla_xf^+ + F\cdot\nabla_vf^+ = -E_1\cdot\nabla_vf^+, \tag{47}$$
$$\partial_tf^- + (v-\varepsilon D^{[2]}(t))\cdot\nabla_xf^- - F\cdot\nabla_vf^- = E_1\cdot\nabla_vf^-. \tag{48}$$
Since $F$ is small, the characteristics of e.g. (47) should behave as
$$\mathcal X(s) \approx x + (s-t)v + \varepsilon\int_t^sD^{[2]}(\tau)\,d\tau, \qquad \mathcal V(s)\approx v, \tag{49}$$
which is close to a free streaming, at least in case $D^{[2]}$ were not present. Writing $\rho^\pm(t,x)$ as a suitable integral over characteristics, it then turns out that in order to derive the necessary estimates for global existence (on higher moments), it is possible to use a rigorous form of (49). In particular, one may verify that
$$\Big|\det\frac{\partial\mathcal X}{\partial v}(s)\Big|^{-1} \approx |s-t|^{-3} \quad\text{and}\quad \Big|\frac{\partial x}{\partial\mathcal V}(s)\Big| \approx |s-t|,$$
as is important to transform away the characteristics. The main point to note here is that the term with $D^{[2]}$ drops if we take derivatives in (49) w.r.t. $x$ or $v$, and hence the arguments from [17] can be expected to carry over. Having derived the higher moment bounds
$$M_m(t)\le C, \qquad t\in[0,T], \quad m\in\Big]3,\frac{51}{11}\Big[,$$
then a standard argument yields the global existence of classical solutions for (VPD). It should finally be remarked that we did not succeed in generalizing the proofs of global existence for the usual Vlasov-Poisson system that bound the increase in velocity along a characteristic; see [19, 24, 22], and [20] for a recent application. The reason for this is that, when estimating the “ugly” term, an $\ddot{\mathcal X}(s)$ will appear, which in our case will lead to the expression $\dot D^{[2]}(s)$ that could not be bounded well enough to make the proof work.

3.1 Local existence

Similar to the case of the usual Vlasov-Poisson system, cf. [4], where this is contained implicitly, an iteration scheme may be set up to yield the local existence of a solution and a criterion when a local solution in fact will be global.


Theorem 3 Suppose $f_0^\pm$ satisfy (20). Then there exist unique solutions $f^\pm\in C^1([0,T_*[\,\times\mathbb R^3\times\mathbb R^3)$ of (5), (6), (14), and (13) with data $f^\pm(t=0) = f_0^\pm$, on a maximal time interval of existence $[0,T_*[$. If moreover $P = P^++P^-$, with
$$P^\pm(t) = \sup\big\{|v| : \exists\,x\in\mathbb R^3\ \exists\,s\in[0,t] : (x,v)\in\mathrm{supp}\,f^\pm(s)\big\}, \tag{50}$$
is bounded on $[0,T_*[$, then $T_* = \infty$.

We will not go into the proof of this result.

3.2 Some preliminary estimates

We first need to derive some a priori bounds. For this we consider a classical solution of the system that exists for times $t\in[0,T]$. Note that all estimates from the previous sections remain valid on any interval where the solution exists.

Lemma 7 For $t\in[0,T]$ we have
$$M_p(t) \le C_T\Big(1+\sup_{s\in[0,t]}\|\nabla U(s)\|_{3+p;x}^{3+p}\Big), \qquad p\in[1,\infty[.$$
In addition,
$$M_p(t) \le C_T\Big(1+M_{3(\frac{3+2p}{6+p})}(t)^{\frac{6+p}{3}}\Big), \qquad p\in[1,\infty[. \tag{51}$$

Proof. Recalling (37), from (18) and Hölder's inequality it follows that
$$\frac{d}{dt}M_p^+(t) = \iint |v|^p\big(-[v+\varepsilon D^{[2]}(t)]\cdot\nabla_xf^+ - \nabla U\cdot\nabla_vf^+\big)\,dxdv = -\iint |v|^p\,\nabla U\cdot\nabla_vf^+\,dxdv$$
$$= p\iint |v|^{p-2}(v\cdot\nabla U)f^+\,dxdv \le C\,\|\nabla U(t)\|_{3+p;x}\,\Big\|\int |v|^{p-1}f^+(t,\cdot,v)\,dv\Big\|_{\frac{3+p}{2+p};x}.$$
Now
$$\Big\|\int |v|^{p-1}f^\pm(t,\cdot,v)\,dv\Big\|_{\frac{3+p}{2+p};x} \le C\,M_p^\pm(t)^{\frac{2+p}{3+p}}$$
by an argument similar to the proof of lemma 3, and this yields
$$\frac{d}{dt}M_p^\pm(t) \le C\,\|\nabla U(t)\|_{3+p;x}\,M_p(t)^{\frac{2+p}{3+p}}.$$




Since $M_p(\cdot)$ is increasing, it is differentiable a.e. in $t$, with
$$\frac{d}{dt}M_p(t) \le \sup_{s\in[0,t]}\frac{d}{dt}\big(M_p^+ + M_p^-\big)(s) \le C\sup_{s\in[0,t]}\|\nabla U(s)\|_{3+p;x}\,M_p(t)^{\frac{2+p}{3+p}}. \tag{52}$$
Integration of this differential inequality gives the claim. Finally, for (51) we observe that by the first part and (42) with $q = 3+p$
$$M_p(t) \le C_T + C_T\sup_{s\in[0,t]}\|\nabla U(s)\|_{3+p;x}^{3+p} \le C_T + C_T\,M_{\frac{6q-9}{3+q}}(t)^{\frac{3+q}{3q}(3+p)},$$
and $\frac{6q-9}{3+q} = 3\big(\frac{3+2p}{6+p}\big)$.

3.3 Estimates for higher moments

For $R > 0$ choose a radially symmetric function $\chi_R\in C_0^\infty(\mathbb R^3)$ with $\chi_R(x)\in[0,1]$ for $x\in\mathbb R^3$, $\chi_R(x) = 1$ for $|x|\le R$, and $\chi_R(x) = 0$ for $|x|\ge 2R$; we write $\chi = \chi_R$ for short. Correspondingly we decompose the electric field $E(t,x)$ from (26) as $E(t,x) = E_1(t,x)+F(t,x)$, with
$$E_1(t,x) = \int\frac{(x-y)}{|x-y|^3}\,\chi(x-y)\,\big(\rho^+(t,y)-\rho^-(t,y)\big)\,dy = -\Big(\chi\nabla\frac{1}{|x|}\Big)*\big(\rho^+(t)-\rho^-(t)\big)(x). \tag{53}$$
Some useful estimates on $E_1$ and $F$ are stated in lemma 15 below. Then we write the Vlasov equations (18) and (19) in the form (47) and (48). This can be used to derive a representation formula for $\rho^\pm(t,x)$, and for simplicity we will consider only $\rho^+(t,x)$. We fix $x, v\in\mathbb R^3$ and $t\in[0,T]$, and denote $(X(s),V(s)) = (X(s;x,v),V(s;x,v))$ for $s\in[0,t]$ the solution of the characteristic system
$$\begin{pmatrix}\dot X(s)\\ \dot V(s)\end{pmatrix} = \begin{pmatrix}-V(s)-\varepsilon D^{[2]}(t-s)\\ -F(t-s,X(s))\end{pmatrix}, \qquad \begin{pmatrix}X(0)\\ V(0)\end{pmatrix} = \begin{pmatrix}x\\ v\end{pmatrix}, \tag{54}$$
associated with (47). Since
$$\frac{\partial}{\partial s}\big[f^+(t-s,X(s),V(s))\big] = -\partial_tf^+(t-s,X(s),V(s)) - [V(s)+\varepsilon D^{[2]}(t-s)]\cdot\partial_Xf^+(t-s,X(s),V(s)) - F(t-s,X(s))\cdot\partial_Vf^+(t-s,X(s),V(s))$$
$$= E_1(t-s,X(s))\cdot\partial_Vf^+(t-s,X(s),V(s))$$

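by (47), it follows through integrating $\int_0^tds(\ldots)$ and $\int dv(\ldots)$ that
$$\rho^+(t,x) = \int dv\,f_0^+(X(t),V(t)) - \int_0^tds\int dv\,E_1(t-s,X(s))\cdot\partial_Vf^+(t-s,X(s),V(s))$$
$$= \int dv\,f_0^+(X(t),V(t)) - \int_0^tds\int dv\,\mathrm{div}_V[E_1f^+](t-s,X(s),V(s)), \tag{55}$$
where $f_0^+ = f^+(t=0)$, and $[E_1f^+](\tau,X,V) = E_1(\tau,X)f^+(\tau,X,V)$; note that the dependence on $x$ and $v$ in (55) enters via $X(s)$ and $V(s)$. To rewrite (55) appropriately, define
$$G(X,V) = [E_1f^+](t-s,X,V) \quad\text{and}\quad \tilde G(x,v) = G(X(s;x,v),V(s;x,v)).$$
By lemma 16 below we then have $G(X,V) = \tilde G(x(s;X,V),v(s;X,V))$, and consequently
$$\mathrm{div}_VG = \mathrm{div}_x\Big(\frac{\partial x}{\partial V}\cdot\tilde G\Big) - \mathrm{div}_x\Big(\frac{\partial x}{\partial V}\Big)\cdot\tilde G + \sum_{i,j=1}^3\frac{\partial\tilde G_i}{\partial v_j}\frac{\partial v_j}{\partial V_i},$$
where $\mathrm{div}_x(\frac{\partial x}{\partial V})\cdot\tilde G = \sum_{i,j=1}^3\frac{\partial}{\partial x_j}\big(\frac{\partial x_j}{\partial V_i}\big)\tilde G_i$. Utilizing this in (55) and integrating by parts w.r.t. $v$ yields
$$\rho^+(t,x) = \int dv\,f_0^+(X(t),V(t)) - \mathrm{div}_x\int_0^tds\int dv\,\Big(\frac{\partial x}{\partial V}\cdot\tilde G\Big) + \int_0^tds\int dv\,\Big(\mathrm{div}_x\Big(\frac{\partial x}{\partial V}\Big)\cdot\tilde G + \mathrm{div}_v\Big(\frac{\partial v}{\partial V}\Big)\cdot\tilde G\Big)$$
$$=: \phi_0^+(t,x) - \mathrm{div}_x\Gamma^+(t,x) + R^+(t,x). \tag{56}$$
Similarly, we have
$$\rho^-(t,x) = \phi_0^-(t,x) - \mathrm{div}_x\Gamma^-(t,x) + R^-(t,x), \tag{57}$$
with the corresponding functions $\phi_0^-$, $\Gamma^-$, and $R^-$. Next we derive some estimates on $\phi_0^+$, $\Gamma^+$, and $R^+$.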

Lemma 8 For $t\in[0,T]$ we have
$$\|\phi_0^+(t)\|_{\frac{3+m}{3};x} \le C_T \quad (m > 0)$$
and
$$\|\phi_0^+(t)\|_{3(\frac{3+m}{6+m});x} \le C_T \quad (m\ge 3).$$

Proof. We can apply corollary 7 below with $s = t$ and $\tau = 0$ to obtain the first bound. Concerning the second, note that $m\ge 3$ implies $3(\frac{3+m}{6+m}) \le \frac{3+m}{3}$. Whence it suffices to bound the support of $x\mapsto\phi_0^+(t,x) = \int dv\,f_0^+(X(t;x,v),V(t;x,v))$. To do so, recall from (20) that $f_0^+(\bar x,\bar v) = 0$ for $|\bar x|\ge r_0$ or $|\bar v|\ge r_0$. From the proof of corollary 7 we know $|V(t)-v|\le C_1$, $C_1$ depending only on $T$. Thus




$\big|\frac{\partial}{\partial s}|X(s)-x|\big| \le |V(s)+\varepsilon D^{[2]}(t-s)| \le C(1+|v|)$ by (54) and theorem 2(c), whence $|X(t)-x|\le C_2(1+|v|)$. Then, if $|x|\ge C_2(1+C_1+r_0)+r_0 =: r_1$ and $|V(t)|\le r_0$, we have $|v|\le|V(t)-v|+|V(t)|\le C_1+r_0$ and therefore $|X(t)|\ge|x|-|X(t)-x|\ge r_0$. This yields $f_0^+(X(t),V(t)) = 0$, and thus $\phi_0^+(t,x) = 0$ for $|x|\ge r_1$ and $t\in[0,T]$.

Next we turn to bound $\Gamma^+(t,x) = \int_0^tds\int dv\,\big(\frac{\partial x}{\partial V}\cdot\tilde G\big)$.

Lemma 9 For $t\in[0,T]$ and any $t_0\in\,]0,T]$ we have
$$\|\Gamma^+(t)\|_{3+m;x} \le C_T\,t_0^{\frac{m-3}{6-m}}\big(1+M_m(t)\big)^{\frac{9}{(6-m)(3+m)}} + C_T(1+|\ln t_0|)\big(1+M_m(t)\big)^{\frac{1}{3+m}},$$
with $3 < m$
2 . 3+p 3+m 15 Then with p = 6m−9 6−m > 1 we find r = 3 . Moreover, 1 ≤ r < 4 by the choice


of $m$. Hence by lemma 15 and corollary 7
$$\Big\|\int_0^{t_0}ds\,s\int dv\,\big|[E_1f^+](t-s,X(s;\cdot,v),V(s;\cdot,v))\big|\Big\|_{3+m;x} \le C_T\,t_0^{2-\frac{3}{r'}}\sup_{\tau\in[0,T]}\|E_1(\tau)\|_{r';x}\,\sup_{\tau\in[0,t]}\Big(\int dx\Big(\int dv\,f^+(t-\tau,X(\tau),V(\tau))\Big)^{\frac{3+m}{r}}\Big)^{\frac{1}{3+m}}$$
$$\le C_T\,t_0^{2-\frac{3}{r'}}\big(1+M_p(t)\big)^{\frac{1}{3+m}};$$
note that Jensen's inequality has been used for the first estimate. Utilizing $3(\frac{3+2p}{6+p}) = m$ and (51), we can bound $M_p(t)$ by means of $M_m(t)$, as
$$M_p(t) \le C_T\big(1+M_m(t)^{\frac{6+p}{3}}\big) = C_T\big(1+M_m(t)^{\frac{9}{6-m}}\big).$$
Thus we have shown that
$$\Big\|\int_0^{t_0}ds\,s\int dv\,\big|[E_1f^+](t-s,X(s;\cdot,v),V(s;\cdot,v))\big|\Big\|_{3+m;x} \le C_T\,t_0^{2-\frac{3}{r'}}\big(1+M_m(t)\big)^{\frac{9}{(6-m)(3+m)}}. \tag{61}$$
As far as the second part of the integral is concerned, we now make use of (60) with $r = 3$ and $r' = \frac32$. Then
$$\Big\|\int_{t_0}^tds\,s\int dv\,\big|[E_1f^+](t-s,X(s;\cdot,v),V(s;\cdot,v))\big|\Big\|_{3+m;x} \le C_T(1+|\ln t_0|)\sup_{\tau\in[0,T]}\|E_1(\tau)\|_{\frac32;x}\,\sup_{\tau\in[0,t]}\Big(\int dx\Big(\int dv\,f^+(t-\tau,X(\tau),V(\tau))\Big)^{\frac{3+m}{3}}\Big)^{\frac{1}{3+m}}$$
$$\le C_T(1+|\ln t_0|)\big(1+M_m(t)\big)^{\frac{1}{3+m}}, \tag{62}$$
again by lemma 15 and corollary 7. Summarizing (61) and (62), we see that the first asserted estimate holds. To verify (58) it is sufficient to follow the argument just elaborated and to note that for $t_0 = t$ the contribution of the $\int_{t_0}^tds(\ldots)$-part of (59) drops out, whence we simply use (61) for $t_0 = t$.

Finally we need to consider
$$R^+(t,x) = \int_0^tds\int dv\,\Big(\mathrm{div}_x\Big(\frac{\partial x}{\partial V}\Big)\cdot\tilde G + \mathrm{div}_v\Big(\frac{\partial v}{\partial V}\Big)\cdot\tilde G\Big)$$
in (56).




Lemma 10 For $t\in[0,T]$ we have
$$\|R^+(t)\|_{3(\frac{3+m}{6+m});x} \le C_T,$$
with $m\in[0,\frac{147}{16}]$.

Proof. Using (67) below, and (60) with $r = \frac{13}{9}$ and $r' = \frac{13}{4}$, we estimate
$$|R^+(t,x)| \le C_T\int_0^tds\int dv\,\big|[E_1f^+](t-s,X(s;x,v),V(s;x,v))\big| \le C_T\sup_{\tau\in[0,T]}\|E_1(\tau)\|_{\frac{13}{4};x}\int_0^tds\,s^{-\frac{12}{13}}\Big(\int dv\,f^+(t-s,X(s),V(s))\Big)^{\frac{9}{13}}$$
$$\le C_T\int_0^tds\,s^{-\frac{12}{13}}\Big(\int dv\,f^+(t-s,X(s),V(s))\Big)^{\frac{9}{13}},$$
the latter according to lemma 15. Hence due to corollary 7, with $p$ determined through $\frac{3+p}{3} = \frac{27}{13}(\frac{3+m}{6+m})$,
$$\|R^+(t)\|_{3(\frac{3+m}{6+m});x} \le C_T\sup_{\tau\in[0,t]}\Big(\int dx\Big(\int dv\,f^+(t-\tau,X(\tau),V(\tau))\Big)^{\frac{27}{13}(\frac{3+m}{6+m})}\Big)^{\frac{6+m}{3(3+m)}} \le C_T\big(1+M_p(t)\big)^{\frac{6+m}{3(3+m)}} = C_T\big(1+M_{3(\alpha-1)}(t)\big)^{\frac{6+m}{3(3+m)}},$$
where $\alpha = \frac{27}{13}(\frac{3+m}{6+m})$. Since $0\le 3(\alpha-1) = \frac{3}{13}(\frac{3+14m}{6+m}) \le 2$ by choice of $m$, the claim follows from lemma 2.

The foregoing estimates, and analogous ones for $\phi_0^-$, $\Gamma^-$, and $R^-$, can be put together to yield the following result.

Lemma 11 For $t\in[0,T]$ and $m\in\,]3,\frac{51}{11}[$ we have
$$\|\nabla U(t)\|_{3+m;x} \le C_T\,t_0^{\frac{m-3}{6-m}}\big(1+M_m(t)\big)^{\frac{9}{(6-m)(3+m)}} + C_T(1+|\ln t_0|)\big(1+M_m(t)\big)^{\frac{1}{3+m}},$$
with $t_0\in\,]0,t]$ being arbitrary. Here $C_T$ does not depend on $t_0$. Moreover,
$$\|\nabla U(t)\|_{3+m;x} \le C_T\,t^{\frac{m-3}{6-m}}\big(1+M_m(t)\big)^{\frac{9}{(6-m)(3+m)}}, \qquad t\in[0,T]. \tag{63}$$

Proof. Due to (26), (56), and (57) we may write
$$\nabla U(t,x) = -\nabla\frac{1}{|x|}*\Big([\phi_0^+(t)-\phi_0^-(t)] + [R^+(t)-R^-(t)] + \mathrm{div}_x[\Gamma^-(t)-\Gamma^+(t)]\Big)(x).$$


Therefore lemma 4 implies for $m\in[0,\infty[$ that
$$\|\nabla U(t)\|_{3+m;x} \le C\Big(\|\phi_0^+(t)\|_{3(\frac{3+m}{6+m});x} + \|\phi_0^-(t)\|_{3(\frac{3+m}{6+m});x} + \|R^+(t)\|_{3(\frac{3+m}{6+m});x} + \|R^-(t)\|_{3(\frac{3+m}{6+m});x}\Big) + C\Big(\|\Gamma^+(t)\|_{3+m;x} + \|\Gamma^-(t)\|_{3+m;x}\Big).$$
Due to lemmas 8, 9, and 10 we thus obtain the first desired estimate. Concerning (63), we rather apply (58) than the first estimate from lemma 9.

This in particular can be used to derive a short time bound on $M_m(t)$.

Corollary 4 For $m\in\,]3,\frac{51}{11}[$ there exist $t_1\in\,]0,T]$ and $C_1 > 0$ (both depending on $T$) such that $M_m(t)\le C_1$, $t\in[0,t_1]$.

Proof. Combining (52) with (63) and observing $t^{\frac{m-3}{6-m}}\le C_T$ yields
$$\frac{d}{dt}M_m(t) \le C\sup_{s\in[0,t]}\|\nabla U(s)\|_{3+m;x}\,M_m(t)^{\frac{2+m}{3+m}} \le C_T\big(1+M_m(t)\big)^{\frac{9}{(6-m)(3+m)}}M_m(t)^{\frac{2+m}{3+m}}.$$
Integration of this differential inequality gives a local bound on $M_m(t)$, that, however, fails to extend to all of $[0,T]$ due to $\frac{9}{(6-m)(3+m)}+\frac{2+m}{3+m} = \frac{7-m}{6-m} > 1$.

Corollary 5 For $t\in[0,T]$ and $m\in\,]3,\frac{51}{11}[$ we have
$$\|\nabla U(t)\|_{3+m;x} \le C_T\,(1+|\ln M_m(t)|)\,\big(1+M_m(t)\big)^{\frac{1}{3+m}}. \tag{64}$$

Proof. Note that we may assume $M_m(t)\ge 1$ for $t\in[0,T]$, since otherwise $M_m(t)$ simply can be replaced by $M_m(t)+1$. Set
$$t_0 = t_1\,M_m(t)^{\frac{1}{3-m}} \le t_1$$
in lemma 11, with $t_1$ from corollary 4. If $t\in[t_1,T]$, then $t_0\le t$, and therefore $\frac{1}{3-m}(\frac{m-3}{6-m}) + \frac{9}{(6-m)(3+m)} = \frac{1}{3+m}$ shows that (64) holds for $t\in[t_1,T]$. On the other hand, if $t\in[0,t_1]$, then $M_m(t)\le C_1$ and (63) imply $\|\nabla U(t)\|_{3+m;x}\le C_2$ for some $C_2 > 0$. Hence (64) holds as well in this case if we choose $C_T\ge C_2$.

Theorem 4 For $t\in[0,T]$ and $m\in\,]3,\frac{51}{11}[$ we have $M_m(t)\le C_T$.




Proof. By (52) and due to corollary 5 we see that
$$\frac{d}{dt}M_m(t) \le C\sup_{s\in[0,t]}\|\nabla U(s)\|_{3+m;x}\,M_m(t)^{\frac{2+m}{3+m}} \le C_T\,(1+|\ln M_m(t)|)\,\big(1+M_m(t)\big)^{\frac{1}{3+m}}M_m(t)^{\frac{2+m}{3+m}} \le C_T\,(1+|\ln M_m(t)|)\,M_m(t).$$
Integration of this differential inequality yields the claimed estimate.
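The integration in the proof of theorem 4 can be made explicit with a logarithmic substitution. The following is a sketch under the normalization $M_m(t)\ge 1$ already used in the proof of corollary 5; the abbreviations $y(t) = M_m(t)$ and $z(t) = 1+\ln y(t)$ are introduced here for illustration, and $y$ is treated as absolutely continuous.

```latex
% Differential inequality from the proof:  y' <= C_T (1 + ln y) y,   y >= 1.
% Substitute z := 1 + ln y >= 1, so that z' = y'/y.
\begin{align*}
z'(t) = \frac{y'(t)}{y(t)} \le C_T\,z(t)
\quad\Longrightarrow\quad
z(t) \le z(0)\,e^{C_Tt}
\quad\Longrightarrow\quad
M_m(t) = y(t) \le \exp\!\Big(\big(1+\ln M_m(0)\big)e^{C_Tt} - 1\Big).
\end{align*}
```

The bound is doubly exponential in $t$ but finite on every $[0,T]$, which is all that is needed. This is where the logarithmic factor in (64) pays off: with the power-type bound from corollary 4 the exponent $\frac{7-m}{6-m} > 1$ would only give a bound up to a finite blow-up time.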

Corollary 6 For $t\in[0,T]$ we have
$$\|\rho^\pm(t)\|_{p;x} \le C_T, \qquad p\in\Big]2,\frac{28}{11}\Big[.$$

Proof. This is a consequence of (41) and theorem 4, since $m = 3(p-1)\in\,]3,\frac{51}{11}[$ corresponds to $p = \frac{3+m}{3}\in\,]2,\frac{28}{11}[$.

3.4 Global existence of solutions

We start with some preliminary (well-known) observations. Recall the definition of $P^\pm(t)$ from (50), and also that $[0,T_*[$ is the maximal interval of existence, cf. theorem 3.

Lemma 12 We have
$$P^\pm(t) \le P^\pm(0) + \int_0^t\|\nabla U(s)\|_{\infty;x}\,ds, \qquad t\in[0,T_*[.$$

Proof. Assume e.g. $(x,v)\in\mathrm{supp}\,f^+(s)$ for some $x\in\mathbb R^3$ and $s\in[0,t]$. From the proof of lemma 2 we know that $f^+(s,x,v) = f_0^+(\mathcal X(0;s,x,v),\mathcal V(0;s,x,v))$, with $(\mathcal X,\mathcal V)$ the characteristics from (38). This means that $(x,v) = (\mathcal X(s;0,x_0,v_0),\mathcal V(s;0,x_0,v_0))$ for $(x_0,v_0)\in\mathrm{supp}\,f_0^+$. Hence
$$|v| \le |v_0| + \int_0^s|\dot{\mathcal V}(\tau;0,x_0,v_0)|\,d\tau \le P^+(0) + \int_0^t\|\nabla U(\tau)\|_{\infty;x}\,d\tau$$
by the characteristic equation for $\mathcal V$.

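Next we need to derive a bound on $\|\nabla U(t)\|_{\infty;x}$.

Lemma 13 For $\alpha > \frac32$ we have
$$\|\nabla U(t)\|_{\infty;x} \le C\Big(\sum_{j=\pm}\|\rho^j(t)\|_{\infty;x}\Big)^{1-\frac{\alpha'}{3}}\Big(\sum_{j=\pm}\|\rho^j(t)\|_{\alpha';x}\Big)^{\frac{\alpha'}{3}}, \qquad t\in[0,T_*[,$$
where $\frac1\alpha+\frac{1}{\alpha'} = 1$. The constant $C$ depends only on $\alpha$.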


Proof. With $R > 0$ we estimate from (26)
$$|\nabla U(t,x)| \le \int_{|y-x|\le R}\big(\rho^+(t,y)+\rho^-(t,y)\big)\frac{dy}{|x-y|^2} + \int_{|y-x|\ge R}\big(\rho^+(t,y)+\rho^-(t,y)\big)\frac{dy}{|x-y|^2}$$
$$\le C\sum_{j=\pm}\|\rho^j(t)\|_{\infty;x}\,R + \sum_{j=\pm}\|\rho^j(t)\|_{\alpha';x}\Big(\int_{|y-x|\ge R}\frac{dy}{|x-y|^{2\alpha}}\Big)^{\frac1\alpha} \le C\sum_{j=\pm}\|\rho^j(t)\|_{\infty;x}\,R + C\sum_{j=\pm}\|\rho^j(t)\|_{\alpha';x}\,R^{\frac3\alpha-2}.$$
Choosing the optimal $R$ yields the claim.

Lemma 14 We have
$$\|\rho^\pm(t)\|_{\infty;x} \le C\,P^\pm(t)^3, \qquad t\in[0,T_*[,$$
with $C$ depending only on the data.

Proof. By definition of $P^+(t)$ and bounding $\|f^\pm(t)\|_{\infty;xv}$ as in lemma 2, it follows that
$$\rho^+(t,x) = \int_{|v|\le P^+(t)}f^+(t,x,v)\,dv \le C\,\|f^\pm(t)\|_{\infty;xv}\,P^+(t)^3 \le C\,P^+(t)^3,$$
as was to be shown.
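The optimization over $R$ at the end of the proof of lemma 13 can be written out as follows. This is a sketch: $A$ and $B$ are abbreviations introduced here for the two sums of norms, and both are assumed to be positive.

```latex
% Abbreviate  A := \sum_j \|\rho^j(t)\|_{\infty;x},   B := \sum_j \|\rho^j(t)\|_{\alpha';x},
% so that the proof gives  |\nabla U(t,x)| <= C ( A R + B R^{3/\alpha - 2} )  for every R > 0.
% Balance the two terms:  A R = B R^{3/\alpha - 2}  <=>  R^{3 - 3/\alpha} = B/A.
% Since 1 - 1/\alpha = 1/\alpha', this means  R = (B/A)^{\alpha'/3}.
\begin{align*}
A\,R = A\Big(\frac{B}{A}\Big)^{\frac{\alpha'}{3}} = A^{1-\frac{\alpha'}{3}}B^{\frac{\alpha'}{3}},
\qquad
B\,R^{\frac{3}{\alpha}-2} = B\Big(\frac{B}{A}\Big)^{\frac{\alpha'}{3}-1} = A^{1-\frac{\alpha'}{3}}B^{\frac{\alpha'}{3}},
\end{align*}
% where  (3/\alpha - 2)(\alpha'/3) = \alpha'/\alpha - 2\alpha'/3 = (\alpha' - 1) - 2\alpha'/3 = \alpha'/3 - 1.
% Example:  \alpha = 27/16  gives  \alpha' = 27/11,  \alpha'/3 = 9/11,  1 - \alpha'/3 = 2/11.
```

The condition $\alpha > \frac32$ enters twice: it makes $\int_{|y-x|\ge R}|x-y|^{-2\alpha}\,dy$ converge, and it gives $\alpha' < 3$, so that the exponent $1-\frac{\alpha'}{3}$ is positive.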

Using the criterion from theorem 3 and by means of corollary 6 we are finally going to complete the proof of theorem 1.

Proof of theorem 1: Assume $T_* < \infty$ in theorem 3. All the estimates on the moments remain valid if $[0,T]$ is replaced by $[0,T_*[$, since in the constants only terms of the form $CT_*$, $T_*^\alpha$, and $e^{CT_*}$ do enter. In particular,
$$\|\rho^\pm(t)\|_{\frac{27}{11};x} \le C, \qquad t\in[0,T_*[,$$
by corollary 6, where here and below the various constants $C$ depend on $T_*$. Choosing $\alpha = \frac{27}{16} > \frac32$, which corresponds to $\alpha' = \frac{27}{11}$, we deduce from lemmas 12, 13, and 14 that
$$P^+(t) \le P^+(0) + C\int_0^tds\,\Big(\sum_{j=\pm}\|\rho^j(s)\|_{\infty;x}\Big)^{\frac{2}{11}}\Big(\sum_{j=\pm}\|\rho^j(s)\|_{\frac{27}{11};x}\Big)^{\frac{9}{11}} \le P^+(0) + C\int_0^tds\,\big(P^+(s)^3+P^-(s)^3\big)^{\frac{2}{11}}.$$



This implies
$$P(t) \le P(0) + C\int_0^tP(s)^{\frac{6}{11}}\,ds, \qquad t\in[0,T_*[,$$
and hence the boundedness of $P$ on $[0,T_*[$.

We remark that we did not try to reduce the exponent $\frac{6}{11}$ so as to obtain the optimal power of $P$.

3.5 Some technical lemmas

Lemma 15 Define $E_1$ and $F$ as in (53). Then we have $\|E_1(t)\|_{p;x}\le C$ for $t\in[0,T]$ and $p\in[1,\frac{15}{4}[$, as well as
$$\|F(t)\|_{\infty;x} + \|\nabla F(t)\|_{\infty;x} + \|D^2F(t)\|_{\infty;x} \le CR^{-2}, \qquad t\in[0,T]. \tag{65}$$

Proof. From (53) and Young's inequality [21, p. 29] with $\frac1q+\frac1r = 1+\frac1p$ we obtain
$$\|E_1(t)\|_{p;x} = \Big\|\Big(\chi\nabla\frac{1}{|x|}\Big)*\big(\rho^+(t)-\rho^-(t)\big)\Big\|_{p;x} \le \Big\|\chi\nabla\frac{1}{|x|}\Big\|_{q;x}\,\big\|\rho^+(t)-\rho^-(t)\big\|_{r;x}.$$
We have $\chi(\cdot)\frac{x}{|x|^3}\in L^q(\mathbb R^3)$ for $q\in[1,\frac32[$ and $\|\rho^\pm(t)\|_{r;x}\le C$ for $r\in[1,\frac53]$ due to theorem 2(a). Combining those values for $q$ and $r$, we see that we need to have $p\in[1,\frac{15}{4}[$. The bounds in (65) are obtained by observing that
$$F(t,x) = \int_{|x-y|\ge R}\frac{(x-y)}{|x-y|^3}\big(1-\chi(x-y)\big)\big(\rho^+(t,y)-\rho^-(t,y)\big)\,dy,$$
where $\chi\in C_0^\infty(\mathbb R^3)$, and moreover $\int\rho^\pm(t,y)\,dy = \iint f^\pm(t,y,v)\,dydv \le C$.

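Lemma 16 For fixed $t\in\,]0,T]$ and $s\in[0,t]$ consider the map
$$Z(s):\ \mathbb R^6\ni(x,v)\mapsto(X(s;x,v),V(s;x,v)) = (X(s),V(s))\in\mathbb R^6,$$
where $(X(s),V(s))$ is the solution of the characteristic system (54), i.e.
$$\begin{pmatrix}\dot X(s)\\ \dot V(s)\end{pmatrix} = \begin{pmatrix}-V(s)-\varepsilon D^{[2]}(t-s)\\ -F(t-s,X(s))\end{pmatrix}, \qquad \begin{pmatrix}X(0)\\ V(0)\end{pmatrix} = \begin{pmatrix}x\\ v\end{pmatrix}.$$
Then $Z(s)$ is a volume-preserving diffeomorphism, and
$$\Big|\det\frac{\partial X}{\partial v}(s)\Big|^{-1} \le Cs^{-3}, \qquad \Big|\frac{\partial x}{\partial V}(s)\Big| \le Cs, \tag{66}$$
as well as
$$\Big|\frac{\partial}{\partial x_i}\frac{\partial x_j}{\partial V_k}(s)\Big| + \Big|\frac{\partial}{\partial v_i}\frac{\partial v_j}{\partial V_k}(s)\Big| \le C, \qquad 1\le i,j,k\le 3, \tag{67}$$
if $R > 0$ is chosen sufficiently large (depending on $T$). Here $Z(s)^{-1}: (X,V)\mapsto(x,v) = (x(s;X,V),v(s;X,V))$ is the inverse of $Z(s)$, and the constants $C$ do depend only on $T$, but not on $(t,s,x,v)$.

Proof. As the right-hand side of the characteristic system has divergence $\mathrm{div} = \mathrm{div}_{(X,V)} = 0$, the first claim follows; cf. also remark 2. Moreover, we have
$$\Big|\frac{\partial X}{\partial v}(s)+s\,\mathrm{Id}\Big| \le CR^{-2}s^3, \qquad \Big|\frac{\partial V}{\partial v}(s)-\mathrm{Id}\Big| \le CR^{-2}s^2, \tag{68}$$
$$\Big|\frac{\partial X}{\partial x}(s)-\mathrm{Id}\Big| \le CR^{-2}s^2, \qquad \Big|\frac{\partial V}{\partial x}(s)\Big| \le CR^{-2}s, \tag{69}$$
where $\mathrm{Id}$ denotes the unit matrix in $\mathbb R^3$. E.g. to validate the estimate on the $v$-derivatives one can introduce, following [3], the function $\phi(s) = \frac{\partial X}{\partial v}(s)+s\,\mathrm{Id}$ and calculate that $\phi(0) = 0$, $\dot\phi(s) = -\frac{\partial V}{\partial v}(s)+\mathrm{Id}$, $\dot\phi(0) = 0$, as well as $\ddot\phi(s) = \frac{\partial F}{\partial x}(t-s,X(s))[\phi(s)-s\,\mathrm{Id}]$. Here the important observation is that
$$\dot\phi(s) = \frac{\partial}{\partial s}\frac{\partial X}{\partial v}(s)+\mathrm{Id} = \frac{\partial}{\partial v}\dot X(s)+\mathrm{Id} = \frac{\partial}{\partial v}\big(-V(s)-\varepsilon D^{[2]}(t-s)\big)+\mathrm{Id} = -\frac{\partial V}{\partial v}(s)+\mathrm{Id},$$
since the term with $\varepsilon D^{[2]}(t-s)$ simply drops through the $v$-derivative, and thus the same general argument can be used as in the case without $D^{[2]}$. Then we may write $\phi(s) = \int_0^s(s-\tau)\ddot\phi(\tau)\,d\tau$ and utilize (65) to derive (68) by means of a Gronwall argument, whereas (69) is obtained in the same way using the function $\phi(s) = \frac{\partial X}{\partial x}(s)-\mathrm{Id}$ instead. Then
$$\Big|\det\frac{\partial X}{\partial v}(s)\Big| = s^3\,\Big|\det\Big(-s^{-1}\Big[\frac{\partial X}{\partial v}(s)+s\,\mathrm{Id}\Big]+\mathrm{Id}\Big)\Big|$$
together with (68) and the continuity of the map $A\mapsto|\det(A)|$ at $A = \mathrm{Id}$ yields that for $s\in[0,T]$ and $R > 0$ large enough,
$$\Big|\det\frac{\partial X}{\partial v}(s)\Big| \ge \frac12s^3,$$
and this proves the first estimate in (66). As what concerns the bound on $|\frac{\partial x}{\partial V}(s)|$, employing the chain rule it follows that
$$\frac{\partial x}{\partial V}(s) = -\frac{\partial x}{\partial X}(s)\,\frac{\partial X}{\partial v}(s)\Big(\frac{\partial V}{\partial v}(s)\Big)^{-1}, \qquad \frac{\partial x}{\partial X}(s) = \Big(\mathrm{Id}-\frac{\partial x}{\partial V}(s)\,\frac{\partial V}{\partial x}(s)\Big)\Big(\frac{\partial X}{\partial x}(s)\Big)^{-1}.$$
Choosing $R > 0$ sufficiently large, we find from (68) and (69) that
$$\Big|\Big(\frac{\partial V}{\partial v}(s)\Big)^{-1}\Big| + \Big|\Big(\frac{\partial X}{\partial x}(s)\Big)^{-1}\Big| \le C,$$
whence by (68) and (69),
$$\Big|\frac{\partial x}{\partial V}(s)\Big| \le Cs\,\Big|\frac{\partial x}{\partial X}(s)\Big|, \qquad \Big|\frac{\partial x}{\partial X}(s)\Big| \le C\Big(1+R^{-2}s\,\Big|\frac{\partial x}{\partial V}(s)\Big|\Big),$$
and this gives $|\frac{\partial x}{\partial V}(s)|\le Cs$, for $R > 0$ large enough. Finally, the estimates on the second derivatives in (67) are more tedious, but verified in a similar way.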

Corollary 7 If $(X(s),V(s))$ is a characteristic curve, cf. lemma 16, and $s,\tau\in[0,T]$, then for $p\in\,]0,\infty[$
$$\Big\|\int f^\pm(\tau,X(s;\cdot,v),V(s;\cdot,v))\,dv\Big\|_{\frac{3+p}{3};x} \le C_T\,\|f^\pm(\tau)\|_{\infty;xv}^{\frac{p}{3+p}}\big(M_0(\tau)+M_p(\tau)\big)^{\frac{3}{3+p}},$$
with $M_p(\cdot)$ from (37).

Proof. Utilizing lemma 3 with $f(x,v) = f^\pm(\tau,X(s;x,v),V(s;x,v))$, it follows that the left-hand side is bounded by
$$C\,\|f^\pm(\tau)\|_{\infty;xv}^{\frac{p}{3+p}}\Big(\iint |v|^pf^\pm(\tau,X(s;x,v),V(s;x,v))\,dxdv\Big)^{\frac{3}{3+p}}.$$
From the characteristic equation for $V(s)$ we obtain that $\big|\frac{\partial}{\partial s}|V(s)-v|\big| \le \|F\|_{\infty;xt}\le C$, whence $|V(s)-v|\le CT = C_T$, thus $|v|^p\le C(1+|V(s)|^p)$. Using this estimate and the fact that $Z(s)$ is a volume-preserving diffeomorphism yields the claim.

References

[1] R.A. Adams, Sobolev Spaces, Academic Press, New York (1975).
[2] H. Andréasson, Global existence of smooth solutions in three dimensions for the semiconductor Vlasov-Poisson-Boltzmann equation, Nonlinear Anal. 28, 1193–1211 (1997).
[3] C. Bardos and P. Degond, Global existence for the Vlasov-Poisson equation in 3 space variables with small initial data, Ann. Inst. H. Poincaré Anal. Non Linéaire 2, 101–118 (1985).


[4] J. Batt, Global symmetric solutions of the initial value problem in stellar dynamics, J. Differential Equations 25, 342–364 (1977).
[5] L. Blanchet, T. Damour and G. Schäfer, Post-Newtonian hydrodynamics and post-Newtonian gravitational wave generation for numerical relativity, Mon. Not. R. Astron. Soc. 242, 289–305 (1990).
[6] F. Bouchut, Existence and uniqueness of a global smooth solution for the Vlasov-Poisson-Fokker-Planck system in three dimensions, J. Funct. Anal. 111, 239–258 (1993).
[7] W.L. Burke, Gravitational radiation damping of slowly moving systems calculated using matched asymptotic expansions, J. Math. Phys. 12, 401–418 (1971).
[8] É.É. Flanagan, Astrophysical sources of gravitational radiation and prospects for their detection, in Dadhich N. and Narlikar J. (Eds.): Gravitation and Relativity at the Turn of the Millennium, Inter-University Center for Astronomy and Astrophysics, Pune (1998).
[9] R.T. Glassey, The Cauchy Problem in Kinetic Theory, SIAM, Philadelphia (1996).
[10] L. Hörmander, The Analysis of Linear Partial Differential Operators I, Springer, Berlin-Heidelberg-New York (1983).
[11] E. Horst, On the asymptotic growth of the solutions of the Vlasov-Poisson system, Math. Meth. Appl. Sci. 16, 75–85 (1993).
[12] R. Illner and G. Rein, Time decay of the solutions of the Vlasov-Poisson system in the plasma physical case, Math. Meth. Appl. Sci. 19, 1409–1413 (1996).
[13] J.D. Jackson, Classical Electrodynamics, Wiley, New York (1975).
[14] M. Kunze and H. Spohn, Radiation reaction and center manifolds, SIAM J. Math. Anal. 32, 30–53 (2000).
[15] M. Kunze and H. Spohn, Adiabatic limit for the Maxwell-Lorentz equations, Annales H. Poincaré 1, 625–653 (2000).
[16] M. Kunze and H. Spohn, Slow motion of charges interacting through the Maxwell field, Comm. Math. Phys. 212, 437–467 (2000).
[17] P.L. Lions and B. Perthame, Propagation of moments and regularity for the 3-dimensional Vlasov-Poisson system, Invent. Math. 105, 415–430 (1991).
[18] B. Perthame, Time decay, propagation of low moments and dispersive effects for kinetic equations, Comm. Partial Differential Equations 21, 659–686 (1996).




[19] K. Pfaffelmoser, Global classical solutions of the Vlasov-Poisson system in three dimensions for general initial data, J. Differential Equations 95, 281–303 (1992).
[20] M. Pulvirenti and C. Simeoni, L∞-estimates for the Vlasov-Poisson-Fokker-Planck equation, Math. Meth. Appl. Sci. 23, 923–935 (2000).
[21] M. Reed and B. Simon, Methods of Modern Mathematical Physics II: Fourier Analysis, Self Adjointness, Academic Press, New York (1975).
[22] G. Rein, Selfgravitating systems in Newtonian theory – the Vlasov-Poisson system, in Proc. Minisemester on Math. Aspects of Theories of Gravitation 1996, Banach Center Publications 41, part I, 179–194 (1997).
[23] G. Rein and A. Rendall, Global existence of classical solutions to the Vlasov-Poisson system in a three dimensional, cosmological setting, Arch. Rat. Mech. Anal. 126, 183–201 (1994).
[24] J. Schaeffer, Global existence of smooth solutions to the Vlasov-Poisson system in three dimensions, Comm. Partial Differential Equations 16, 1313–1335 (1991).
[25] V.V. Zheleznyakov, Radiation in Astrophysical Plasmas, Kluwer, Dordrecht (1996).

Markus Kunze
Zentrum Mathematik, TU München
Gabelsbergerstr. 49
D-80333 München
Germany
email: [email protected]

Alan D. Rendall
Max-Planck-Institut für Gravitationsphysik
Am Mühlenberg 1
D-14476 Golm
Germany
email: [email protected]

Communicated by Sergiu Klainerman
submitted 04/12/00, accepted 05/02/01




On the Condensate Multivortex Solutions of the Self-Dual Maxwell-Chern-Simons CP(1) Model D. Chae and H.-S. Nam

Abstract. In this paper we prove the existence of periodic multivortex solutions in the plane of the self-dual Maxwell-Chern-Simons CP(1) model where the kinetic part of the Lagrangian contains both Maxwell and Chern-Simons terms. We also study both of the Maxwell and the Chern-Simons limits. Finally we consider the single signed vortex case and prove that the solutions are bounded from below or above by solutions of the Maxwell CP(1) model depending on the sign of the vortices. As a simple corollary, in the vortex free case we construct a unique explicit solution.

1 The Maxwell-Chern-Simons CP(1) model

We consider the following Lagrangian of the gauged CP(1) model [7]:
$$\mathcal L = -\frac{1}{4q}F_{\alpha\beta}F^{\alpha\beta} + \frac{\kappa}{2}\epsilon^{\alpha\beta\gamma}A_\alpha\partial_\beta A_\gamma + \frac12|D_\alpha\phi|^2 + \frac{1}{2q}(\partial_\alpha N)^2 - V(\phi,N)$$
where q, κ, ρe are dimensionless parameters, $\phi:\mathbb R^2\to S^2$ is a scalar field, $D_\alpha\phi = \partial_\alpha\phi + A_\alpha\,n\times\phi$ for $n = (0,0,1)$ is the gauge covariant derivative, $F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha$ is the curvature tensor and $N$ is a scalar field. $\epsilon^{\alpha\beta\gamma}$ is the totally skew-symmetric tensor with $\epsilon^{012} = 1$ and $g_{\alpha\beta} = \mathrm{diag}(1,-1,-1)$. Here the first two terms in the Lagrangian are called the Maxwell and Chern-Simons terms, respectively. In order to obtain the self-dual equations (Bogomol'nyi's equation) for the static case, we choose the potential $V(\phi,N)$ as [7]:
$$V(\phi,N) = \frac q2\big(\kappa N + s - n\cdot\phi\big)^2 + \frac12N^2(n\times\phi)^2.$$
The Gauss law constraint obtained from the variation of $A_0$ is
$$\frac1q\partial_iF_{0i} + \kappa F_{12} + n\cdot\phi\times D_0\phi = 0. \tag{1.1}$$



Following Bogomol'nyi's argument one can obtain, by integration by parts and using (1.1), the static energy

$$\begin{aligned}
E &= \int_\Omega T^{00}\,d^2x \\
&= \int d^2x\,\Big\{\frac{1}{2q}\big(|F_{0i}|^2 + |F_{12}|^2\big) + \frac{1}{2}\big(|D_0\phi|^2 + |D_i\phi|^2\big) + \frac{1}{2q}\big(|\partial_0 N|^2 + |\partial_i N|^2\big) + V(\phi,N)\Big\} \\
&= \int d^2x\,\Big\{\frac{1}{2q}|\partial_0 N|^2 + \frac{1}{2q}|F_{0i} \pm \partial_i N|^2 + \frac{1}{2q}\big|F_{12} \mp q(\kappa N + s - n\cdot\phi)\big|^2 \\
&\qquad\qquad + \frac{1}{2}|D_0\phi \mp (n\times\phi)N|^2 + \frac{1}{2}|D_1\phi \pm \phi\times D_2\phi|^2\Big\} \pm \int d^2x\,\big\{\phi\cdot D_1\phi\times D_2\phi + (s - n\cdot\phi)F_{12}\big\} \\
&\ge \pm\int d^2x\,\big\{\phi\cdot D_1\phi\times D_2\phi + (s - n\cdot\phi)F_{12}\big\} \equiv \pm T,
\end{aligned}$$

where we can choose the sign $\pm$ so that $\pm T = |T|$. Since $E \ge |T|$ and $T$ is the generalized topological charge, the field configurations saturating the energy bound satisfy the Gauss law constraint (1.1) and the following self-dual equations:

$$\begin{cases}
\partial_0 N = 0, \\
F_{0i} = \mp\,\partial_i N, \\
F_{12} = \pm\, q(\kappa N + s - n\cdot\phi), \\
D_0\phi = \pm\,(n\times\phi)N, \\
D_1\phi = \mp\,\phi\times D_2\phi.
\end{cases} \tag{1.2}$$

In this paper we consider the case $|s| < 1$ and choose the upper signs in (1.2). By elementary calculations we obtain

$$\begin{cases}
A_0 = N, \\
F_{12} = q(\kappa N + s - n\cdot\phi), \\
\bar\partial u = -i\bar\alpha u,
\end{cases} \tag{1.3}$$

where $\bar\partial = \frac{1}{2}(\partial_1 + i\partial_2)$, $\alpha = \frac{1}{2}(A_1 - iA_2)$, $u = u_1 + iu_2$, and $u_1 = \frac{\phi_1}{1+\phi_3}$, $u_2 = \frac{\phi_2}{1+\phi_3}$ is the spherical projection of $\phi$.

Now we consider the doubly periodic boundary condition due to 't Hooft [10]. First, we observe that the Lagrangian $L$ is invariant under the following gauge transformation:

$$\phi \to R(\theta)\phi, \qquad A_\alpha \to A_\alpha + \partial_\alpha\theta,$$

where $R(\theta)$ denotes the rotation matrix with angle $-\theta$ about the fixed vector $n = (0,0,1)$. We define the doubly periodic region $\Omega$ by

$$\Omega = \{x = (x_1,x_2)\in\mathbb{R}^2 \mid x = t_1a_1 + t_2a_2,\ 0 < t_1, t_2 < 1\},$$

where $a_1$ and $a_2$ are linearly independent vectors in $\mathbb{R}^2$, and set $\Gamma_k = \{x\in\mathbb{R}^2 \mid x = t_ka_k,\ 0 < t_k < 1\}$ for $k = 1, 2$. Then the boundary $\partial\Omega$ can be written as $\partial\Omega = \Gamma_1\cup\Gamma_2\cup\{a_1+\Gamma_2\}\cup\{a_2+\Gamma_1\}\cup\{O, a_1, a_2, a_1+a_2\}$. Here we impose the following doubly periodic boundary condition:

$$\begin{aligned}
R(\theta_k(x+a_k))\,\phi(x+a_k) &= R(\theta_k(x))\,\phi(x), \\
N(x+a_k) &= N(x), \\
(A_j + \partial_j\theta_k)(x+a_k) &= (A_j + \partial_j\theta_k)(x),
\end{aligned} \tag{1.4}$$

where $x\in\Gamma_1\cup\Gamma_2 - \Gamma_k$ for $k = 1, 2$, and $\theta_1$, $\theta_2$ are real-valued smooth functions defined in a neighborhood of $\Gamma_2\cup\{a_1+\Gamma_2\}$ and $\Gamma_1\cup\{a_2+\Gamma_1\}$, respectively. For simplicity, we denote the value of $\theta_k$ at $x$ by $\theta_k(t_1,t_2)$ for $x = t_1a_1 + t_2a_2\in\bar\Omega$. Since $\phi$ is single-valued, we obtain that

$$\theta_1(1,1^-) - \theta_1(1,0^+) + \theta_1(0,0^+) - \theta_1(0,1^-) + \theta_2(0^+,1) - \theta_2(1^-,1) + \theta_2(1^-,0) - \theta_2(0^+,0) + 2\pi(m-n) = 0 \tag{1.5}$$

with a suitable integer $m - n$. Using (1.4), we can show that this integer $m - n$ determines the magnetic flux

$$\Phi = \int_\Omega F_{12}\,dx = \int_{\partial\Omega} A_j\,dx_j = -\int_{\partial\Omega}\partial_j\theta_k\,dx_j = 2\pi(m-n).$$

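If we denote by $x_1,\dots,x_m$ and $y_1,\dots,y_n$ the preimages of the north pole $n = (0,0,1)$ and the south pole $-n = (0,0,-1)$, counted with multiplicity, respectively, then we obtain that

$$\Delta\varphi = 2q\kappa N + 2q\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big) + 4\pi\sum_{k=1}^{m}\delta_{x_k} - 4\pi\sum_{l=1}^{n}\delta_{y_l}, \tag{1.6}$$

$$\Delta N = \kappa^2q^2N + \kappa q^2\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big) + q\,\frac{4e^{\varphi}}{(1+e^{\varphi})^2}\,N, \tag{1.7}$$

where $\varphi = \ln|u|^2$ and we used the Gauss law (1.1) in (1.7). For simplicity and for technical reasons we replace $\kappa N$ by $-N$ and write $\varphi = w + U$, where $w$ is uniquely determined by

$$\Delta w = -\frac{4\pi(m-n)}{|\Omega|} - 4\pi\sum_{l=1}^{n}\delta_{y_l} + 4\pi\sum_{k=1}^{m}\delta_{x_k}, \qquad \int_\Omega w = 0. \tag{1.8}$$

For existence and some properties, see [1].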




Finally we arrive at the following system of equations:

$$\Delta U = 2q\Big(-N + s - \frac{1-e^{w+U}}{1+e^{w+U}}\Big) + \frac{4\pi(m-n)}{|\Omega|}, \tag{1.9}$$

$$\Delta N = -\kappa^2q^2\Big(-N + s - \frac{1-e^{w+U}}{1+e^{w+U}}\Big) + q\,\frac{4e^{w+U}}{(1+e^{w+U})^2}\,N. \tag{1.10}$$

Formally setting $\kappa = 0$ we obtain the equation for the Maxwell CP(1) model in the periodic domain studied in [9]:

$$\Delta u^a = 2q\Big(s - \frac{1-e^{w+u^a}}{1+e^{w+u^a}}\Big) + \frac{4\pi(m-n)}{|\Omega|}. \tag{1.11}$$

On the other hand, the equation for the Chern-Simons CP(1) model in the periodic domain, studied by the authors of this paper in [5],

$$\Delta u = \frac{2}{\kappa^2}\,\frac{4e^{w+u}}{(1+e^{w+u})^2}\Big(s - \frac{1-e^{w+u}}{1+e^{w+u}}\Big) + \frac{4\pi(m-n)}{|\Omega|}, \tag{1.12}$$

has at least two multivortex solutions.

In the next section we prove the existence of a solution of the system (1.9)-(1.10) in the periodic domain $\Omega$. We also consider the behavior of the solutions $(U^{\kappa,q}, N^{\kappa,q})$ in the limits $\kappa\to 0$ with $q$ fixed (the Maxwell limit) and $q\to\infty$ with $\kappa$ fixed (the Chern-Simons limit), and their relation to solutions of (1.11) and (1.12). We also study the single signed vortex cases, i.e. the cases $m\ge 0$, $n = 0$ or $m = 0$, $n\ge 0$. We remark that there are similar limiting problems in other models (e.g. the Maxwell-Chern-Simons-Higgs system [3], [4], [8]). For monotone iteration, see [2].

The paper is organized as follows. In Section 2 we prove the existence of solutions using the super/subsolution method (Theorem 2.1). In Section 3 we consider the Maxwell limit (Theorem 3.1). Keeping the parameter $q$ fixed and letting $\kappa\to 0$, we obtain a dichotomy: we can choose subsequences such that the solutions either converge smoothly to the unique solution of the Maxwell gauged sigma model [9], or diverge. In Section 4 we consider the Chern-Simons limit (Theorem 4.1). Keeping $\kappa$ fixed and letting $q\to\infty$, we obtain a similar dichotomy. In this case, if the numbers of vortices and antivortices are not equal, then we can find a subsequence of solutions which converges smoothly to a solution of the Chern-Simons gauged CP(1) model. In Section 5 we consider the single signed vortex case. We prove that the solutions are bounded from above or below, depending on the sign of the vortices, by the unique solution of the Maxwell limit and, as a simple corollary, that in the vortex free case the solution pair is uniquely determined as a pair of constants (Theorem 5.1).
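As a consistency check on the reduction in Section 1, the following worked computation sketches how the system (1.9)-(1.10) arises from (1.6)-(1.8). The symbol $\hat N$ for the rescaled field is introduced here only for illustration; the text simply calls it $N$ again after replacing $\kappa N$ by $-N$.

```latex
% Assumption of this sketch: \hat N := -\kappa N denotes the rescaled field.
\begin{align*}
% (1.6) with \kappa N = -\hat N and \varphi = w + U, then (1.8) for \Delta w:
\Delta U &= \Delta\varphi - \Delta w \\
 &= 2q\Big(-\hat N + s - \frac{1-e^{w+U}}{1+e^{w+U}}\Big)
    + 4\pi\sum_{k=1}^{m}\delta_{x_k} - 4\pi\sum_{l=1}^{n}\delta_{y_l} - \Delta w \\
 &= 2q\Big(-\hat N + s - \frac{1-e^{w+U}}{1+e^{w+U}}\Big) + \frac{4\pi(m-n)}{|\Omega|}, \\[4pt]
% (1.7) multiplied by -\kappa:
\Delta\hat N &= -\kappa\,\Delta N
 = -\kappa^{2}q^{2}\Big(-\hat N + s - \frac{1-e^{w+U}}{1+e^{w+U}}\Big)
   + q\,\frac{4e^{w+U}}{(1+e^{w+U})^{2}}\,\hat N .
\end{align*}
```

The delta functions cancel against those in $\Delta w$, which is why $U$ satisfies an equation with a smooth right-hand side on the periodic cell.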


2 Existence of solutions

In this section we prove the following theorem.

Theorem 2.1 Let $-1 < s < 1$ and $x_1,\dots,x_m, y_1,\dots,y_n\in\Omega$ be given, where the $x_k$'s and the $y_l$'s are not necessarily distinct. Then there exists $\kappa_0$ such that for any $0 < \kappa < \kappa_0$ there exists a constant $q_\kappa = q(\kappa)$ such that if $q > q_\kappa$ then the self-dual equations (1.3) with the periodic boundary conditions have a multivortex solution $(\phi, N, A)$ such that $\phi^{-1}(n) = \{x_1,\dots,x_m\}$ and $\phi^{-1}(-n) = \{y_1,\dots,y_n\}$.

As we will see later, in proving the existence of super/subsolution pairs, $q_\kappa = q(\kappa)$ remains bounded as $\kappa\to 0$ and is unbounded as $\kappa\to\kappa_0$, while in the iteration process we need the unboundedness as $\kappa\to 0$.

First we show the existence of super/subsolution pairs. Note that $(\bar U,\bar N)$ is a supersolution pair if the following inequalities hold:

$$\begin{cases}
\Delta\bar U \le 2q\Big(-\bar N + s - \dfrac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) + \dfrac{4\pi(m-n)}{|\Omega|}, \\[8pt]
\Delta\bar N \le -\kappa^2q^2\Big(-\bar N + s - \dfrac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) + q\,\dfrac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\,\bar N.
\end{cases}$$

Similarly, $(\underline U,\underline N)$ is a subsolution pair if the reversed inequalities hold.

Lemma 2.1 Under the assumptions of Theorem 2.1, there exists $\kappa_0$ such that for any $0\le\kappa<\kappa_0$ there exist constants $q_\kappa^+$ and $q_\kappa^-$ such that if $q > \max\{q_\kappa^+, q_\kappa^-\}$ then there exist a supersolution pair $(\bar U,\bar N)$ and a subsolution pair $(\underline U,\underline N)$ of (1.9)-(1.10) with the property $\bar U\ge\underline U$ and $\bar N\ge 0\ge\underline N$.

Proof. We first construct explicitly a supersolution pair $(\bar U,\bar N)$. Let $\epsilon$ be a sufficiently small positive number so that the $(m+n)$ balls with center $x_k$ or $y_l$ and radius $2\epsilon$ are mutually disjoint and $8\pi(m+n)\epsilon^2 < |\Omega|$. Then we can define a smooth function $g^+$ with $-1\le g^+\le 0$ and

$$g^+(x) = \begin{cases} -1, & x\in\bigcup_{l=1}^{n}B(y_l,\epsilon), \\ 0, & x\in\Omega\setminus\bigcup_{l=1}^{n}B(y_l,2\epsilon). \end{cases}$$

We may assume that $|Dg^+(x)|\le\frac{2}{\epsilon}$ and $|D^2g^+(x)|\le\frac{4}{\epsilon^2}$. Let $\bar U$ be a solution of the following equation:

$$\Delta\bar U = \frac{4\pi m}{|\Omega|} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+ - 4\pi\sum_{k=1}^{m}\delta_{x_k}.$$

From this and (1.8) we get

$$\Delta(w+\bar U) = \frac{4\pi n}{|\Omega|} - 4\pi\sum_{l=1}^{n}\delta_{y_l} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+,$$

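and we know that $w+\bar U\nearrow\infty$ as $x\to y_l$. Since $w+\bar U$ is determined up to a constant, we can assume $w+\bar U\ge c_0$ for some positive constant, say $c_0 = \max\big(\ln\frac{1-s}{1+s}+1,\,1\big)$, so that

$$s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}} > 0. \tag{2.1}$$

Now we define $\bar N$ by

$$\begin{aligned}
\bar N &= -\frac{1}{2q}\Delta\bar U + \frac{1}{2q}\frac{4\pi(m-n)}{|\Omega|} + s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}} - \frac{1}{2q}\,4\pi\sum_{k=1}^{m}\delta_{x_k} \\
&= -\frac{1}{2q\epsilon^2}g^+ + \frac{1}{2q\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+ - \frac{1}{2q}\frac{4\pi n}{|\Omega|} + s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}.
\end{aligned} \tag{2.2}$$

Then clearly we get

$$\Delta\bar U \le 2q\Big(-\bar N + s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) + \frac{4\pi(m-n)}{|\Omega|}. \tag{2.3}$$

Thus it suffices to show that

$$\Delta\bar N \le -\kappa^2q^2\Big(-\bar N + s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) + q\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\,\bar N. \tag{2.4}$$

From (2.2) we can calculate the left hand side of (2.4) as

$$\begin{aligned}
\Delta\bar N &= -\Delta\Big(\frac{1}{2q\epsilon^2}g^+ - \frac{1}{2q\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+ + \frac{1}{2q}\frac{4\pi n}{|\Omega|} - s + \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) \\
&= -\frac{1}{2q\epsilon^2}\Delta g^+ - \Delta\Big(\frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) \\
&\le -\frac{1}{2q\epsilon^2}\Delta g^+ + \frac{1}{2}\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\,\Delta(w+\bar U) \\
&\le -\frac{1}{2q\epsilon^2}\Delta g^+ + \frac{1}{2}\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big(\frac{4\pi n}{|\Omega|} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+\Big).
\end{aligned}$$

On the other hand, the right hand side of (2.4) is

$$\begin{aligned}
&-\kappa^2q^2\Big(-\bar N + s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) + q\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\,\bar N \\
&\quad = -\frac{1}{2}\Big(\frac{4\pi n}{|\Omega|} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+\Big)\Big(\kappa^2q + \frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big) + q\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big(s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big).
\end{aligned}$$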


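Thus it suffices to show that

$$[LH] \equiv -\frac{1}{2q\epsilon^2}\Delta g^+ \le -\Big(\frac{\kappa^2q}{2} + \frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big)\Big(\frac{4\pi n}{|\Omega|} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+\Big) + q\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big(s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) \equiv [RH].$$

If $x\in\bigcup_{l=1}^{n}B(y_l,\epsilon)$, then

$$\begin{aligned}
[LH] = 0 &\le -\Big(\frac{\kappa^2q}{2} + \frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big)\Big(\frac{4\pi n}{|\Omega|} - \frac{1}{\epsilon^2} + \frac{1}{\epsilon^2}\frac{4n\pi\epsilon^2}{|\Omega|}\Big) \\
&\le -\Big(\frac{\kappa^2q}{2} + \frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big)\Big(\frac{4\pi n}{|\Omega|} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+\Big) + q\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big(s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) \equiv [RH],
\end{aligned}$$

where we used (2.1) in the last inequality. If $x\in\Omega\setminus\bigcup_{l=1}^{n}B(y_l,\epsilon)$, then

$$c_0 \le w+\bar U \le U^* = U^*(\epsilon) \equiv \sup\Big\{w+\bar U \,\Big|\, x\in\Omega\setminus\bigcup_{l=1}^{n}B(y_l,\epsilon)\Big\} < \infty$$

and

$$\begin{aligned}
[LH] \le \frac{2}{q\epsilon^4} &\le -\Big(\frac{\kappa^2q}{2} + \frac{4e^{c_0}}{(1+e^{c_0})^2}\Big)\Big(\frac{4\pi n}{|\Omega|} + \frac{4\pi n}{|\Omega|}\Big) + q\,\frac{4e^{U^*}}{(1+e^{U^*})^2}\Big(s - \frac{1-e^{c_0}}{1+e^{c_0}}\Big) \\
&\le -\Big(\frac{\kappa^2q}{2} + \frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big)\Big(\frac{4\pi n}{|\Omega|} + \frac{1}{\epsilon^2}g^+ - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^+\Big) + q\,\frac{4e^{w+\bar U}}{(1+e^{w+\bar U})^2}\Big(s - \frac{1-e^{w+\bar U}}{1+e^{w+\bar U}}\Big) \equiv [RH],
\end{aligned}$$

for all $q$ and $\kappa$ satisfying

$$\kappa^2 < \frac{|\Omega|}{4\pi n}\,\frac{4e^{U^*}}{(1+e^{U^*})^2}\Big(s - \frac{1-e^{c_0}}{1+e^{c_0}}\Big) \equiv \kappa_1^2,$$

$$q > \frac{\dfrac{4e^{c_0}}{(1+e^{c_0})^2}\dfrac{4\pi n}{|\Omega|} + \sqrt{\Big(\dfrac{4e^{c_0}}{(1+e^{c_0})^2}\dfrac{4\pi n}{|\Omega|}\Big)^2 + \dfrac{2}{\epsilon^4}\Big(\dfrac{4e^{U^*}}{(1+e^{U^*})^2}\Big(s - \dfrac{1-e^{c_0}}{1+e^{c_0}}\Big) - \dfrac{4\pi n}{|\Omega|}\kappa^2\Big)}}{\dfrac{4e^{U^*}}{(1+e^{U^*})^2}\Big(s - \dfrac{1-e^{c_0}}{1+e^{c_0}}\Big) - \kappa^2\dfrac{4\pi n}{|\Omega|}} \equiv q_\kappa^+.$$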




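This gives the existence of a supersolution pair $(\bar U,\bar N)$. Moreover, (2.1) and the maximum principle imply $\bar N\ge 0$. We note that $q_\kappa^+$ is monotone increasing in $\kappa$ and goes to infinity as $\kappa\to\kappa_1$. By a similar argument we can also construct a subsolution pair $(\underline U,\underline N)$ with the property $w+\underline U<0$ and $\underline N\le 0$ for any $q$ and $\kappa$ satisfying

$$\kappa^2 < \frac{|\Omega|}{4\pi m}\,\frac{4e^{U_*}}{(1+e^{U_*})^2}\Big(\frac{1-e^{c_1}}{1+e^{c_1}} - s\Big) \equiv \kappa_2^2,$$

$$q > \frac{\dfrac{4e^{c_1}}{(1+e^{c_1})^2}\dfrac{4\pi m}{|\Omega|} + \sqrt{\Big(\dfrac{4e^{c_1}}{(1+e^{c_1})^2}\dfrac{4\pi m}{|\Omega|}\Big)^2 + \dfrac{2}{\epsilon^4}\Big(\dfrac{4e^{U_*}}{(1+e^{U_*})^2}\Big(-s + \dfrac{1-e^{c_1}}{1+e^{c_1}}\Big) - \dfrac{4\pi m}{|\Omega|}\kappa^2\Big)}}{\dfrac{4e^{U_*}}{(1+e^{U_*})^2}\Big(\dfrac{1-e^{c_1}}{1+e^{c_1}} - s\Big) - \kappa^2\dfrac{4\pi m}{|\Omega|}} \equiv q_\kappa^-,$$

where $0 > c_1 \ge w+\underline U \ge U_* = U_*(\epsilon) \equiv \inf\{w+\underline U \mid x\in\Omega\setminus\bigcup_{k=1}^{m}B(x_k,\epsilon)\} > -\infty$, the constant $c_1$ is chosen so that

$$s - \frac{1-e^{w+\underline U}}{1+e^{w+\underline U}} < 0,$$

and $\underline U$ is determined as the solution of

$$\Delta\underline U = -\frac{4\pi n}{|\Omega|} + \frac{1}{\epsilon^2}g^- - \frac{1}{\epsilon^2}\frac{1}{|\Omega|}\int_\Omega g^- + 4\pi\sum_{l=1}^{n}\delta_{y_l}.$$

Here $g^-$ is a smooth function with $0\le g^-\le 1$ defined by

$$g^-(x) = \begin{cases} 1, & x\in\bigcup_{k=1}^{m}B(x_k,\epsilon), \\ 0, & x\in\Omega\setminus\bigcup_{k=1}^{m}B(x_k,2\epsilon). \end{cases}$$

Finally we set $\kappa_0 = \min\{\kappa_1,\kappa_2\}$ and this completes the proof.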
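The threshold $q_\kappa^+$ in the proof of Lemma 2.1 is the larger root of a quadratic in $q$, obtained from the requirement that $2/(q\epsilon^4)$ stays below the lower bound for [RH]. The following minimal Python sketch evaluates that formula and checks the inequality it comes from. All numerical inputs (`kappa`, `s`, `n`, `area`, `eps`, `c0`, `u_star`) are hypothetical sample values chosen for illustration, not values from the paper; in the proof, $U^*$ is determined by $w+\bar U$ and is not a free parameter.

```python
import math

def F(x):
    """F(x) = (1 - e^x) / (1 + e^x), written via tanh for numerical stability."""
    return -math.tanh(x / 2.0)

def H(x):
    """H(x) = 4 e^x / (1 + e^x)^2 = 1 / cosh(x/2)^2, with values in (0, 1]."""
    return 1.0 / math.cosh(x / 2.0) ** 2

def q_plus(kappa, s, n, area, eps, c0, u_star):
    """Return (kappa_1^2, q_kappa^+) for hypothetical sample inputs."""
    a = H(c0) * 4.0 * math.pi * n / area           # first term of the numerator
    gap = H(u_star) * (s - F(c0))                  # positive when c0 satisfies (2.1)
    kappa1_sq = area / (4.0 * math.pi * n) * gap   # admissible range of kappa^2
    if kappa ** 2 >= kappa1_sq:
        raise ValueError("need kappa^2 < kappa_1^2")
    d = gap - kappa ** 2 * 4.0 * math.pi * n / area  # denominator, positive here
    return kappa1_sq, (a + math.sqrt(a * a + 2.0 * d / eps ** 4)) / d

def margin(q, kappa, s, n, area, eps, c0, u_star):
    """Lower bound for [RH] minus 2/(q eps^4); nonnegative iff q >= q_kappa^+."""
    a = H(c0) * 4.0 * math.pi * n / area
    d = H(u_star) * (s - F(c0)) - kappa ** 2 * 4.0 * math.pi * n / area
    return q * d - 2.0 * a - 2.0 / (q * eps ** 4)

# Illustrative inputs only; c0 = 1 agrees with max(ln((1-s)/(1+s)) + 1, 1) for s = 0.2.
params = dict(kappa=0.05, s=0.2, n=1, area=4.0, eps=0.1, c0=1.0, u_star=3.0)
kappa1_sq, q_thr = q_plus(**params)
```

With these sample inputs the margin vanishes at `q_thr`, is positive above it and negative below it, which is the sense in which $q > q_\kappa^+$ is the sharp condition for this particular estimate.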

Now we prove the existence of solutions via monotone iteration.

Lemma 2.2 Suppose that there exist a supersolution pair $(\bar U,\bar N)$ and a subsolution pair $(\underline U,\underline N)$ with the property $\bar U\ge\underline U$, $\bar N\ge 0\ge\underline N$ and $\kappa^2q > 2\max_{x\in\Omega}\{\bar N, -\underline N\}$. Then there exists a solution pair $(U,N)$ of (1.9)-(1.10) between $(\underline U,\underline N)$ and $(\bar U,\bar N)$.

Proof. We consider the following iteration scheme with a constant $L > q + \kappa^2q^2$:

$$\begin{aligned}
(\Delta - L)U^{k+1} &= 2q\Big(-N^k + s - \frac{1-e^{w+U^k}}{1+e^{w+U^k}}\Big) - LU^k + \frac{4\pi(m-n)}{|\Omega|}, \\
(\Delta - L)N^{k+1} &= -\kappa^2q^2\Big(-N^k + s - \frac{1-e^{w+U^k}}{1+e^{w+U^k}}\Big) + q\,\frac{4e^{w+U^k}}{(1+e^{w+U^k})^2}\,N^k - LN^k,
\end{aligned} \tag{2.5}$$

where $k = 0, 1, 2, \dots$ and $(U^0, N^0) = (\bar U,\bar N)$.

Since $(\Delta - L)(U^1 - U^0)\ge 0$, $(\Delta - L)(N^1 - N^0)\ge 0$ and $(\Delta - L)(U^1 - \underline U)\le 0$, $(\Delta - L)(N^1 - \underline N)\le 0$, by the maximum principle we obtain that

$$U^0\ge U^1\ge\underline U, \qquad N^0\ge N^1\ge\underline N \qquad\text{in }\Omega.$$

Now suppose that for $k\ge 1$,

$$U^0\ge U^1\ge\dots\ge U^k\ge\underline U, \qquad N^0\ge N^1\ge\dots\ge N^k\ge\underline N,$$

and we want to check

$$U^k\ge U^{k+1}\ge\underline U, \qquad N^k\ge N^{k+1}\ge\underline N.$$

Noting that the function $F(x) = \frac{1-e^x}{1+e^x}$ is strictly decreasing with values in $(-1,1)$, it is easy to check that $U^k\ge U^{k+1}\ge\underline U$, and we concentrate on the remaining part. Here and below we abbreviate $F_k = F(w+U^k)$ and $H_k = H(w+U^k)$, where $H(x) = \frac{4e^x}{(1+e^x)^2}$. From (2.5) we get

$$(\Delta - L)(N^{k+1} - N^k) = (\kappa^2q^2 - L)(N^k - N^{k-1}) + \kappa^2q^2\big(F_k - F_{k-1}\big) + qH_kN^k - qH_{k-1}N^{k-1}.$$

Since $L - \kappa^2q^2 > q$ and $N^k\le N^{k-1}$, it suffices to show that

$$\kappa^2q^2\big(F_{k-1} - F_k\big) + qH_{k-1}N^{k-1} - qH_kN^k \le q\,(N^{k-1} - N^k). \tag{2.6}$$

We note that $U^k\le U^{k-1}$, $N^k\le N^{k-1}$ and $F_k\ge F_{k-1}$. First consider the case $\bar N\ge N^{k-1}\ge N^k\ge 0$ at the point $x\in\Omega$.

Case I: $H_{k-1}\le H_k$.

We can estimate the left hand side of (2.6) as

$$\begin{aligned}
\kappa^2q^2\big(F_{k-1} - F_k\big) + qH_{k-1}N^{k-1} - qH_kN^k
&\le q\big(H_{k-1}N^{k-1} - H_{k-1}N^k\big) + q\big(H_{k-1} - H_k\big)N^k \\
&\le qH_{k-1}\,(N^{k-1} - N^k) \\
&\le q\,(N^{k-1} - N^k),
\end{aligned}$$

and we obtain the desired result.

Case II: $H_{k-1} > H_k$.

Letting $N_0 = \max_{x\in\Omega}\{\bar N, -\underline N\}$, we can estimate the left hand side of (2.6) as

$$\begin{aligned}
\kappa^2q^2\big(F_{k-1} - F_k\big) + qH_{k-1}N^{k-1} - qH_kN^k
&= \kappa^2q^2\big(F_{k-1} - F_k\big) + q\big(H_{k-1} - H_k\big)N^{k-1} + qH_k\,(N^{k-1} - N^k) \\
&\le \kappa^2q^2F_{k-1} + qH_{k-1}N_0 - \kappa^2q^2F_k - qH_kN_0 + q\,(N^{k-1} - N^k) \\
&\le q\,(N^{k-1} - N^k),
\end{aligned}$$

where we used $U^{k-1}\ge U^k$ and the monotone decreasing property of the function

$$G(x) = 4N_0q\Big(\frac{\kappa^2q}{4N_0}\,\frac{2}{1+e^x} + \frac{e^x}{(1+e^x)^2}\Big)$$

for $\kappa^2q > 2N_0$. Thus we obtain the desired result.

The same argument is also valid for the case $0\ge N^{k-1}\ge N^k\ge\underline N$ at the point $x\in\Omega$. The only remaining case is $\bar N\ge N^{k-1}\ge 0\ge N^k\ge\underline N$ at the point $x\in\Omega$. Since $0\le H(x) = \frac{4e^x}{(1+e^x)^2}\le 1$, it is easy to check (2.6), and we get $N^{k+1}\le N^k$. We can apply the same argument to show that $N^{k+1}\ge\underline N$. Thus we have obtained decreasing sequences $(U^k, N^k)$, which converge to a solution of (1.9)-(1.10), and this completes the proof.

Proof of Theorem 2.1. From (2.2) we can estimate $2q\bar N$ from above in terms of $\epsilon$, $q$ and $|s|$, so that $\kappa^2q > 2\bar N$ holds for

$$q > \frac{1}{\kappa^2}\Big(|s| + 1 + \sqrt{(|s|+1)^2 + \frac{\kappa^2}{\epsilon^2}}\Big) \equiv q_\kappa^0. \tag{2.7}$$

Similarly $\kappa^2q > -2\underline N$ holds for $q > q_\kappa^0$ also. We note that $q_\kappa^0$ is monotone decreasing in $\kappa$ and goes to infinity as $\kappa\to 0$. By Lemmas 2.1 and 2.2, if we set

$$0 < \kappa < \kappa_0, \qquad q > \max\{q_\kappa^+, q_\kappa^-, q_\kappa^0\},$$

then we get the desired existence of solutions and the proof is complete.
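The mechanism behind the monotone iteration (2.5) can be illustrated numerically in a drastically simplified setting. The sketch below is not the scheme of the paper: it assumes no vortices ($m = n = 0$, so $w = 0$), drops the field $N$, and works on a one-dimensional periodic grid, so it iterates $(D_2 - L)u^{k+1} = 2q(s - F(u^k)) - Lu^k$ for the scalar equation (1.11). The grid size, period, starting profile and the names `solve_shifted_laplacian` and `monotone_iteration` are illustrative choices.

```python
import cmath
import math

def F(x):
    """F(x) = (1 - e^x) / (1 + e^x) = -tanh(x / 2), strictly decreasing."""
    return -math.tanh(x / 2.0)

def solve_shifted_laplacian(rhs, h, shift):
    """Solve (D2 - shift) v = rhs on a periodic 1D grid with spacing h.

    D2 is the 3-point second difference. It is circulant, so a plain
    discrete Fourier transform diagonalises it (O(n^2), fine for small n).
    """
    n = len(rhs)
    symbol = [(2.0 * math.cos(2.0 * math.pi * k / n) - 2.0) / h ** 2 - shift
              for k in range(n)]
    hat = [sum(rhs[a] * cmath.exp(-2j * math.pi * a * k / n) for a in range(n))
           for k in range(n)]
    return [(sum(hat[k] / symbol[k] * cmath.exp(2j * math.pi * a * k / n)
                 for k in range(n)) / n).real
            for a in range(n)]

def monotone_iteration(u_start, q, s, shift, h, steps):
    """Iterate (D2 - shift) u_new = 2q (s - F(u)) - shift * u from a supersolution."""
    history = [list(u_start)]
    for _ in range(steps):
        u = history[-1]
        rhs = [2.0 * q * (s - F(x)) - shift * x for x in u]
        history.append(solve_shifted_laplacian(rhs, h, shift))
    return history

# Illustrative parameters; `shift` plays the role of L and exceeds q,
# which makes the right-hand side decreasing in u for this scalar model.
q, s, shift = 1.0, 0.2, 2.0
n, length = 16, 8.0
h = length / n
# A non-constant discrete supersolution: D2 u_bar <= 2q (s - F(u_bar)) holds pointwise.
u_bar = [3.0 + 0.5 * math.cos(2.0 * math.pi * a / n) for a in range(n)]
history = monotone_iteration(u_bar, q, s, shift, h, steps=80)
u_limit = math.log((1.0 - s) / (1.0 + s))   # the constant solution for m = n = 0
```

In this toy setting the iterates decrease pointwise and approach the constant $\ln\frac{1-s}{1+s}$, mirroring the role of the maximum principle and of the condition on $L$ in the proof of Lemma 2.2.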

3 The Maxwell limit

In this section we prove the strong convergence of solutions in the Maxwell limit, i.e. $\kappa\to 0$ with $q$ kept fixed.

Theorem 3.1 Let $(U^\kappa, N^\kappa)$ be solution pairs of (1.9)-(1.10) with respect to $\kappa$ for fixed $q$. Then the following dichotomy holds. Either

(i) $\{U^\kappa\}$ has a convergent subsequence $\{U^{\kappa_j}\}$ in $W^{2,2}(\Omega)$, and

$$U^{\kappa_j}\to u^a, \qquad N^{\kappa_j}\to 0 \qquad\text{in } C^\infty(\Omega)$$

as $\kappa_j\to 0$, where $u^a$ is the unique solution of (1.11), or

(ii) $\|U^\kappa\|_{L^2}\to\infty$ as $\kappa\to 0$, and there exists a subsequence $\{N^{\kappa_j}\}$ such that

$$N^{\kappa_j}\to\frac{2\pi(m-n)}{q|\Omega|} + s + 1 \quad\text{in } C^{0,\alpha}(\Omega), \qquad 0\le\alpha<1, \tag{3.1}$$

or

$$N^{\kappa_j}\to\frac{2\pi(m-n)}{q|\Omega|} + s - 1 \quad\text{in } C^{0,\alpha}(\Omega), \qquad 0\le\alpha<1. \tag{3.2}$$

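To prove this theorem we need some lemmas.

Lemma 3.1 $\|N^\kappa\|_{W^{2,2}}\le C$.

Proof. Taking the $L^2$-inner product of (1.10) with $N^\kappa$, integrating by parts and using the Hölder inequality, we have $\|N^\kappa\|_{L^2}\le C$, $\|\nabla N^\kappa\|_{L^2}\le Cq$. On the other hand, (1.10) implies $\|\Delta N^\kappa\|_{L^2}\le Cq(1+\kappa^2)$. Then the Calderón-Zygmund inequality ([6]) gives the desired result.

From now on, we will use the following decomposition:

$$f = \tilde f + \frac{1}{|\Omega|}\int_\Omega f, \qquad\text{so that}\qquad \int_\Omega\tilde f = 0.$$

Lemma 3.2 $\|\nabla U^\kappa\|_{L^2} + \|\Delta U^\kappa\|_{L^2}\le C$ and $\|\tilde U^\kappa\|_{W^{2,2}}\le C$.

Proof. Taking the $L^2$-inner product of (1.9) with $U^\kappa$ and integrating by parts, we have

$$\begin{aligned}
\int_\Omega|\nabla U^\kappa|^2 &= -2q\int_\Omega\Big(-N^\kappa + s - \frac{1-e^{w+U^\kappa}}{1+e^{w+U^\kappa}}\Big)U^\kappa - \frac{4\pi(m-n)}{|\Omega|}\int_\Omega U^\kappa \\
&= -2q\int_\Omega\Big(-N^\kappa + s - \frac{1-e^{w+U^\kappa}}{1+e^{w+U^\kappa}}\Big)\tilde U^\kappa \\
&\le Cq\Big(\int_\Omega|\tilde U^\kappa|^2\Big)^{1/2}.
\end{aligned}$$

Using the Poincaré inequality and Young's inequality we get $\|\nabla U^\kappa\|_{L^2}\le C\sqrt q$. On the other hand, Lemma 3.1 and (1.9) imply that $\|\Delta U^\kappa\|_{L^2}\le C(1+q)$. Finally the Poincaré inequality completes the proof.

Recall that $u^a$ is the unique solution of (1.11), and set $U_\kappa = U^\kappa - u^a$. Subtracting (1.11) from (1.9) we obtain

$$\begin{aligned}
\Delta U_\kappa &= -2qN^\kappa - 2q\Big(\frac{1-e^{w+U^\kappa}}{1+e^{w+U^\kappa}} - \frac{1-e^{w+u^a}}{1+e^{w+u^a}}\Big) \\
&= -2qN^\kappa + q\,\frac{4e^{w+u^a+\sigma U_\kappa}}{(1+e^{w+u^a+\sigma U_\kappa})^2}\,U_\kappa,
\end{aligned} \tag{3.3}$$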

where $\sigma\in[0,1]$ and we used the mean value theorem. Similarly we obtain that $\|\nabla U_\kappa\|_{L^2} + \|\Delta U_\kappa\|_{L^2}\le C$.

Proof of Theorem 3.1. If we set $U^\kappa = \tilde U^\kappa + c_\kappa$ where $c_\kappa = \frac{1}{|\Omega|}\int_\Omega U^\kappa$, then by Lemma 3.2 and the Poincaré inequality, $\|\tilde U^\kappa\|_{W^{2,2}}\le C$. First we consider the case that $\{U^\kappa\}$ has a convergent subsequence. Then $\{c_\kappa\}$ also has a convergent subsequence, and $\|U^\kappa\|_{W^{2,2}}\le C$ and $\|U_\kappa\|_{W^{2,2}}\le C$. Since $W^{2,2}(\Omega)\hookrightarrow C^{0,\alpha}(\Omega)$ and this embedding is compact, there exist subsequences and $U, N\in C^{0,\alpha}(\Omega)$ such that

$$U_\kappa\to U, \qquad N^\kappa\to N \qquad\text{in } C^{0,\alpha}(\Omega),$$

where we used Lemma 3.1. Taking the $L^2$-inner product of (1.10) with a $C^2$ test function and taking the limit $\kappa\to 0$, we find that $N$ is a weak solution of

$$\Delta N = q\,\frac{4e^{w+U}}{(1+e^{w+U})^2}\,N.$$

By the standard regularity theory, $N\in C^{2,\alpha}(\Omega)$ and $N$ is a classical solution of the above equation. Moreover, using the maximum principle, we can conclude that $N\equiv 0$ and $N^\kappa\to 0$ in $C^{0,\alpha}(\Omega)$. In fact, by a bootstrap argument, this convergence holds in $C^\infty(\Omega)$. On the other hand, applying a similar argument to (3.3), we obtain that $U\equiv 0$ and $U^\kappa\to u^a$ in $C^\infty(\Omega)$. Hence (i) holds.

Now we suppose that $\{U^\kappa\}$ diverges. Then we can choose a subsequence such that $c_\kappa\nearrow\infty$ or $c_\kappa\searrow-\infty$ as $\kappa\to 0$. First consider the case $c_\kappa\nearrow\infty$. By Lemma 3.1 there exist a subsequence $\{N^\kappa\}$ and $N\in C^{0,\alpha}(\Omega)$ such that $N^\kappa\to N$ in $C^{0,\alpha}(\Omega)$ as $\kappa\to 0$. Taking the $L^2$-inner product of (1.10) with a $C^2$ test function and taking the limit $\kappa\to 0$, we find that $\Delta N = 0$ and $N\equiv C_q$ is a constant. On the other hand, by Lemma 3.2 there exist a subsequence of $\tilde U^\kappa$ and a function $\tilde U$ such that $\tilde U^\kappa\to\tilde U$ in $C^{0,\alpha}(\Omega)$. Then from (1.9) we see that $\tilde U$ is a classical solution of

$$\Delta\tilde U = 2q(-C_q + s + 1) + \frac{4\pi(m-n)}{|\Omega|}.$$

Since $\Omega$ is periodic and $\int_\Omega\tilde U = 0$, we have

$$C_q = \frac{2\pi(m-n)}{q|\Omega|} + s + 1 \qquad\text{and}\qquad \tilde U\equiv 0.$$

Thus (3.1) holds. In the case $c_\kappa\searrow-\infty$, a similar argument gives (3.2), and this completes the proof.
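The value of the constant in (3.1) comes from integrating the limit equation over the periodic cell. A short worked version, using only quantities defined in the proof, may make the step explicit; the term $s + 1$ reflects that $\frac{1-e^{w+U^\kappa}}{1+e^{w+U^\kappa}}\to -1$ when $c_\kappa\nearrow\infty$.

```latex
\begin{align*}
0 = \int_\Omega \Delta\tilde U
  &= |\Omega|\Big(2q\,(-C_q + s + 1) + \frac{4\pi(m-n)}{|\Omega|}\Big)
  && \text{(periodicity of } \tilde U\text{)} \\
\Longrightarrow\quad C_q &= \frac{2\pi(m-n)}{q|\Omega|} + s + 1 .
\end{align*}
% With this value the right-hand side of the limit equation vanishes, so
% \Delta\tilde U = 0; a periodic harmonic function is constant, and
% \int_\Omega \tilde U = 0 then forces \tilde U \equiv 0.
```

The case $c_\kappa\searrow-\infty$ is identical with $s + 1$ replaced by $s - 1$, which gives (3.2).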

4 The Chern-Simons limit

In this section we prove the strong convergence of solutions in the Chern-Simons limit, i.e. $q\to\infty$ with $\kappa$ kept fixed.

Theorem 4.1 Let $(U^q, N^q)$ be solution pairs of (1.9)-(1.10) with respect to $q$ for fixed $\kappa$. For $1\le p<\infty$, we have the following.

(i) If $m\ne n$, then there exists a subsequence $(U^{q_j}, N^{q_j})$ such that

$$U^{q_j}\to u^{cs} \ \text{ in } C^\infty(\Omega), \qquad -N^{q_j} + s - \frac{1-e^{w+U^{q_j}}}{1+e^{w+U^{q_j}}}\to 0 \ \text{ in } L^p(\Omega)$$

as $q_j\to\infty$, where $u^{cs}$ is a solution of (1.12).

(ii) If $m = n$, then the following dichotomy holds. Either

(a) $\{U^q\}$ has a convergent subsequence $\{U^{q_j}\}$ in $W^{2,2}(\Omega)$, and

$$U^{q_j}\to u^{cs} \ \text{ in } C^\infty(\Omega), \qquad -N^{q_j} + s - \frac{1-e^{w+U^{q_j}}}{1+e^{w+U^{q_j}}}\to 0 \ \text{ in } L^p(\Omega)$$

as $q_j\to\infty$, where $u^{cs}$ is a solution of (1.12), or

(b) $\|U^q\|_{L^2}\to\infty$ and

$$-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\to 0 \ \text{ in } L^p(\Omega)$$

as $q\to\infty$.

Remark If the solution pairs $\{(U^q, N^q)\}$ are constructed via Theorem 2.1, then $\|U^q\|_{L^2}$ is uniformly bounded and thus the case (ii)-(b) does not occur.

First we rewrite Lemmas 3.1 and 3.2 as

Lemma 4.1 $\|N^q\|_{L^2}\le C$, $\|\nabla N^q\|_{L^2}\le Cq$, $\|\Delta N^q\|_{L^2}\le Cq^2$, $\|\nabla U^q\|_{L^2}\le C\sqrt q$, $\|\Delta U^q\|_{L^2}\le Cq$.

Now we sharpen the estimates.

Lemma 4.2

$$\Big\|\nabla\Big(U^q + \frac{2}{\kappa^2q}N^q\Big)\Big\|_{L^2} + \Big\|\Delta\Big(U^q + \frac{2}{\kappa^2q}N^q\Big)\Big\|_{L^2} + \|\nabla U^q\|_{L^2}\le C.$$

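Proof. From (1.9) and (1.10) we have

$$\Delta\Big(U^q + \frac{2}{\kappa^2q}N^q\Big) = \frac{2}{\kappa^2}\,\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\,N^q + \frac{4\pi(m-n)}{|\Omega|}. \tag{4.1}$$

Taking the $L^2$-inner product and using the Poincaré inequality, we obtain that $\|\nabla(U^q + \frac{2}{\kappa^2q}N^q)\|_{L^2}\le C$, where $C$ is independent of $q$. From this and Lemma 4.1 we have

$$\|\nabla U^q\|_{L^2}\le\Big\|\nabla\Big(U^q + \frac{2}{\kappa^2q}N^q\Big)\Big\|_{L^2} + \Big\|\nabla\frac{2}{\kappa^2q}N^q\Big\|_{L^2}\le C.$$

Clearly $\|\Delta(U^q + \frac{2}{\kappa^2q}N^q)\|_{L^2}\le C$ and this completes the proof.

On the other hand, a simple calculation gives

$$\begin{aligned}
\Delta\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big) &= \Big(\kappa^2q^2 + q\,\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\Big)\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big) \\
&\quad + \frac{1}{2}\,\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\,\frac{1-e^{w+U^q}}{1+e^{w+U^q}}\,|\nabla(w+U^q)|^2 - q\,\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\,N^q.
\end{aligned} \tag{4.2}$$

Lemma 4.3

$$\Big\|-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big\|_{L^2}\le\frac{C}{q}, \qquad \Big\|\nabla\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big)\Big\|_{L^2}\le C.$$

Proof. Taking the $L^2$-inner product in (4.2), we obtain

$$\begin{aligned}
&\int_\Omega\Big|\nabla\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big)\Big|^2 + \int_\Omega\Big(\kappa^2q^2 + q\,\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\Big)\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big)^2 \\
&\quad = -\frac{1}{2}\int_\Omega\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\,\frac{1-e^{w+U^q}}{1+e^{w+U^q}}\,|\nabla(w+U^q)|^2\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big) \\
&\qquad + q\int_\Omega\frac{4e^{w+U^q}}{(1+e^{w+U^q})^2}\,N^q\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big).
\end{aligned}$$

Using the interpolation inequality

$$\|\nabla(w+U^q)\|_{L^4}\le C\,\|\nabla(w+U^q)\|_{L^2}^{1/2}\,\|\Delta(w+U^q)\|_{L^2}^{1/2}\le C\sqrt q,$$

we obtain

$$\begin{aligned}
\kappa^2\int_\Omega\Big|q\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big)\Big|^2 &\le C\,\|\nabla(w+U^q)\|_{L^4}^2\,\Big\|-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big\|_{L^2} + Cq\,\Big\|-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big\|_{L^2} \\
&\le C\,\Big\|q\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big)\Big\|_{L^2},
\end{aligned}$$

and this implies that

$$\Big\|q\Big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big)\Big\|_{L^2}\le C.$$

It is easy to get $\big\|\nabla\big(-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\big)\big\|_{L^2}\le C$.

From (1.9) and (1.10), we get the following corollary.

Corollary 4.1 $\|\Delta U^q\|_{L^2}\le C$, $\|\nabla N^q\|_{L^2}\le C$, $\|\Delta N^q\|_{L^2}\le Cq$.

Proof of Theorem 4.1. From Lemma 4.3, using the interpolation inequality, we obtain that for any given $1\le p<\infty$,

$$\Big\|-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\Big\|_{L^p}\to 0 \quad\text{as } q\to\infty.$$

If we set $U^q = \tilde U^q + c_q$ where $c_q = \frac{1}{|\Omega|}\int_\Omega U^q$, then by Lemma 4.2, Corollary 4.1 and the Poincaré inequality, we have $\|\tilde U^q\|_{W^{2,2}}\le C$. First we consider the case that $\{U^q\}$ has no convergent subsequence. Then $\{c_q\}$ also has no convergent subsequence and we can choose a subsequence such that $c_q\nearrow\infty$ or $c_q\searrow-\infty$ as $q\to\infty$. Consider the case $c_q\nearrow\infty$ as $q\to\infty$. By Lemma 4.1 and Corollary 4.1 there exists a subsequence $\frac{N^q}{q}$ such that $\frac{N^q}{q}\to 0$ as $q\to\infty$. On the other hand, $\|\tilde U^q + \frac{2}{\kappa^2q}\tilde N^q\|_{W^{2,2}}\le C$ and thus there exist a subsequence and its limit $\tilde U\in W^{2,2}(\Omega)$ such that $\tilde U^q + \frac{2}{\kappa^2q}\tilde N^q\to\tilde U$ in $C^{0,\alpha}(\Omega)$. Thus we have $\tilde U^q\to\tilde U$ in $C^{0,\alpha}(\Omega)$. Taking the $L^2$-inner product and taking the limit $q\to\infty$, we also obtain that $\tilde U$ is a weak solution of

$$\Delta\tilde U = \frac{4\pi(m-n)}{|\Omega|}.$$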

The standard regularity theory implies that $\tilde U$ is a classical solution and in fact a smooth solution. Since $\Omega$ is periodic and $\int_\Omega\tilde U = 0$, we have

$$m = n, \qquad \tilde U\equiv 0.$$

Thus (ii)-(b) occurs. Moreover, if $m\ne n$, then $\{U^q\}$ must have a convergent subsequence.

Now suppose that $\{U^q\}$, and hence $\{c_q\}$, has a convergent subsequence. Then by Lemma 4.2 and Corollary 4.1, $\|U^q + \frac{2}{\kappa^2q}N^q\|_{W^{2,2}}\le C$ and $\|\frac{N^q}{q}\|_{W^{2,2}}\le C$. Since $W^{2,2}(\Omega)\hookrightarrow C^{0,\alpha}(\Omega)$ and this embedding is compact, there exist subsequences and their limits $U, N\in W^{2,2}(\Omega)$ such that

$$U^q + \frac{2}{\kappa^2q}N^q\to U, \qquad \frac{N^q}{q}\to N \qquad\text{in } C^{0,\alpha}(\Omega).$$

Moreover, from Lemma 4.1 we know that $N\equiv 0$, i.e. $U^q\to U$ in $C^{0,\alpha}(\Omega)$. Since $-N^q + s - \frac{1-e^{w+U^q}}{1+e^{w+U^q}}\to 0$ in $L^p(\Omega)$, we take the $L^2$-inner product of (4.1) with a $C^2$ test function and take the limit $q\to\infty$. Then we find that $U$ is a weak solution of the Chern-Simons CP(1) equation (1.12). By the standard regularity theory, $U\in C^{2,\alpha}(\Omega)$ and $U$ is a classical solution of (1.12). Moreover, a bootstrap argument can be used to show that $U^q\to U$ in $C^\infty(\Omega)$. Hence (i) and (ii)-(a) hold and this completes the proof.

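5 Single signed vortex case

In this section, we consider the case $n = 0$ or $m = 0$. First we rewrite the system of equations (1.9)-(1.10) as

$$\Delta\varphi = 2q\Big(-N + s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big) + 4\pi\sum_{k=1}^{m}\delta_{x_k} - 4\pi\sum_{l=1}^{n}\delta_{y_l}, \tag{5.1}$$

$$\Delta N = -\kappa^2q^2\Big(-N + s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big) + q\,\frac{4e^{\varphi}}{(1+e^{\varphi})^2}\,N, \tag{5.2}$$

and the equation of the Maxwell CP(1) model (1.11) as

$$\Delta\varphi = 2q\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big) + 4\pi\sum_{k=1}^{m}\delta_{x_k} - 4\pi\sum_{l=1}^{n}\delta_{y_l}. \tag{5.3}$$

We denote by $\varphi^a$ the unique solution of (5.3) and prove the following theorem.

Theorem 5.1 Let $(\varphi, N)$ be a solution pair of (5.1)-(5.2).

(i) If $n = 0$, then

$$N(x)\le\frac{\kappa^2q}{\kappa^2q+1}\max\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)\le 0, \qquad \varphi(x)\le\varphi^a(x)\le\ln\frac{1-s}{1+s} \qquad \forall\,x\in\Omega.$$

(ii) If $m = 0$, then

$$N(x)\ge\frac{\kappa^2q}{\kappa^2q+1}\min\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)\ge 0, \qquad \varphi(x)\ge\varphi^a(x)\ge\ln\frac{1-s}{1+s} \qquad \forall\,x\in\Omega.$$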
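(iii) If $m = n = 0$, then $N\equiv 0$ and $\varphi\equiv\varphi^a\equiv\ln\frac{1-s}{1+s}$.

Proof. We first prove the case (i). Let $x^\varphi$ and $x^N$ be maximum points of $\varphi$ and $N$, respectively, so that $\Delta\varphi(x^\varphi)\le 0$ and $\Delta N(x^N)\le 0$. From (5.1)-(5.2) we obtain that

$$N(x^\varphi)\ge\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)(x^\varphi), \tag{5.4}$$

$$N(x^N)\le\frac{\kappa^2q^2\big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\big)(x^N)}{\kappa^2q^2 + q\,\frac{4e^{\varphi}}{(1+e^{\varphi})^2}(x^N)}. \tag{5.5}$$

From the choice of $x^\varphi$, $x^N$ and the monotone increasing property of $s - \frac{1-e^{\varphi}}{1+e^{\varphi}}$ with respect to $\varphi$, we have that

$$\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)(x^\varphi)\le N(x^\varphi)\le N(x^N)\le\frac{\kappa^2q^2\big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\big)(x^N)}{\kappa^2q^2 + q\,\frac{4e^{\varphi}}{(1+e^{\varphi})^2}(x^N)}\le\frac{\kappa^2q^2\big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\big)(x^\varphi)}{\kappa^2q^2 + q\,\frac{4e^{\varphi}}{(1+e^{\varphi})^2}(x^N)}.$$

Then we are led to

$$q\,\frac{4e^{\varphi}}{(1+e^{\varphi})^2}(x^N)\,\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)(x^\varphi)\le 0,$$

and we must have

$$\varphi(x^N) = \infty, \quad -\infty \quad\text{or}\quad \Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)(x^\varphi)\le 0.$$

Since $\varphi$ is bounded from above, $\varphi(x^N)\ne\infty$. If $\varphi(x^N) = -\infty$, then by (5.5),

$$N\le N(x^N)\le\Big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big)(x^N) = s - 1 < 0.$$

But this implies that

$$0\le -N + s - 1\le -N + s - \frac{1-e^{\varphi}}{1+e^{\varphi}}.$$

From (5.1) we obtain that

$$0\le 2q\int_\Omega(-N + s - 1)\le 2q\int_\Omega\Big(-N + s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\Big) = -4\pi m.$$

Thus we must have $m = 0$ and $\varphi$ cannot be $-\infty$. This is a contradiction and we must have $\big(s - \frac{1-e^{\varphi}}{1+e^{\varphi}}\big)(x^\varphi)\le 0$. On the other hand, subtracting (5.3) from (5.1) for $n = 0$, we have

$$\begin{aligned}
\Delta(\varphi - \varphi^a) &= -2qN - 2q\Big(\frac{1-e^{\varphi}}{1+e^{\varphi}} - \frac{1-e^{\varphi^a}}{1+e^{\varphi^a}}\Big) \\
&\ge q\,\frac{4e^{\varphi^a+\sigma(\varphi-\varphi^a)}}{(1+e^{\varphi^a+\sigma(\varphi-\varphi^a)})^2}\,(\varphi - \varphi^a),
\end{aligned}$$

where we used the mean value theorem and $N\le 0$. Then by the maximum principle, we obtain $\varphi\le\varphi^a$. Finally, from (5.3), the maximum principle implies that

$$s - \frac{1-e^{\varphi^a}}{1+e^{\varphi^a}}\le 0 \quad\text{in }\Omega.$$

The proof of (ii) is similar to that of (i). Combining (i) and (ii), (iii) follows immediately.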
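Part (iii) of Theorem 5.1 can be checked directly: for constant fields the Laplacians in (5.1)-(5.2) vanish, so a constant pair solves the vortex free system exactly when both right-hand sides are zero. The following small Python sketch evaluates those right-hand sides; the function names and the sample values of $q$, $\kappa$ and $s$ are illustrative assumptions.

```python
import math

def F(x):
    """F(x) = (1 - e^x) / (1 + e^x), written as -tanh(x / 2)."""
    return -math.tanh(x / 2.0)

def H(x):
    """H(x) = 4 e^x / (1 + e^x)^2 = 1 / cosh(x / 2)^2."""
    return 1.0 / math.cosh(x / 2.0) ** 2

def residuals(phi, N, q, kappa, s):
    """Right-hand sides of (5.1) and (5.2) for constant (phi, N) with m = n = 0."""
    x = -N + s - F(phi)
    return 2.0 * q * x, -kappa ** 2 * q ** 2 * x + q * H(phi) * N

def constant_solution(s):
    """The constant pair from Theorem 5.1 (iii): phi = ln((1 - s) / (1 + s)), N = 0."""
    return math.log((1.0 - s) / (1.0 + s)), 0.0

# Sample parameters (hypothetical): both residuals vanish up to rounding.
phi0, N0 = constant_solution(0.3)
r1, r2 = residuals(phi0, N0, q=2.0, kappa=0.7, s=0.3)
```

The check works because $F\big(\ln\frac{1-s}{1+s}\big) = s$, so $-N + s - F(\varphi)$ vanishes at $N = 0$; the uniqueness statement itself rests on the maximum principle argument in the proof and is not tested here.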

Acknowledgments This research is supported partially by GARC-KOSEF, KOSEF(K97-07-02-02-013) and BSRI-MOE.

References

[1] T. Aubin, Nonlinear Analysis on Manifolds. Monge-Ampère Equations, Springer-Verlag (1982).
[2] L.A. Caffarelli and Y. Yang, Vortex condensation in the Chern-Simons Higgs model: An existence theorem, Commun. Math. Phys. 168, 321–336 (1995).
[3] D. Chae and N. Kim, Topological Multivortex solutions of the Self-Dual Maxwell-Chern-Simons-Higgs System, J. Differential Equations 134, 154–182 (1997).
[4] D. Chae and N. Kim, Vortex condensates in the relativistic self-dual Maxwell-Chern-Simons-Higgs system, RIM-GARC preprint series 97-50 (1997).
[5] D. Chae and H.-S. Nam, Multiple Existence of the Multivortex Solutions of the Self-Dual Chern-Simons CP(1) Model on a Doubly Periodic Domain, Lett. Math. Phys. 49, 297–315 (1999).
[6] D. Gilbarg and N.S. Trudinger, Elliptic Partial Differential Equations of the Second Order, Springer-Verlag (1983).
[7] K. Kimm, K. Lee and T. Lee, Anyonic Bogomol'nyi Solitons in a Gauged O(3) Sigma Model, Phys. Rev. D 53, 4436–4440 (1996).
[8] T. Ricciardi and G. Tarantello, Self-dual vortices in the Maxwell-Chern-Simons-Higgs theory,
[9] B.J. Schroers, Bogomol'nyi solitons in a gauged O(3) sigma model, Phys. Lett. B 356, 291–296 (1995).
[10] G. 't Hooft, A property of electric and magnetic flux in non-abelian gauge theories, Nuclear Phys. B 153, 141–160 (1979).

Dongho Chae and Hee-Seok Nam Department of Mathematics Seoul National University Seoul 151-742 Korea email: [email protected], email: [email protected] Communicated by Tetsuji Miwa submitted 22/09/00, accepted 15/04/01




The Bisognano-Wichmann Theorem for Massive Theories J. Mund

Abstract. The geometric action of modular groups for wedge regions (Bisognano-Wichmann property) is derived from the principles of local quantum physics for a large class of Poincaré covariant models in d = 4. As a consequence, the CPT theorem holds for this class. The models must have a complete interpretation in terms of massive particles. The corresponding charges need not be localizable in compact regions: the most general case is admitted, namely localization in spacelike cones.

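Introduction

In local relativistic quantum theory [23], a model is specified in terms of a net of local observable algebras and a representation of the Poincaré group, under which the net is covariant. The Bisognano-Wichmann theorem [2, 3] intimately connects these two aspects, the algebraic and the geometric. Namely, it asserts that under certain conditions modular covariance is satisfied: the modular unitary group of the observable algebra associated to a wedge region coincides with the unitary group representing the boosts which preserve the wedge. Since the boosts associated to all wedge regions generate the Poincaré group, modular covariance implies that the representation of the Poincaré group is encoded intrinsically in the net of local algebras. It has further important consequences: it implies the spin-statistics theorem [27, 22] and, as Guido and Longo have shown [22], the CPT theorem. It also implies essential Haag duality, which is an important input to the structural analysis of charge superselection sectors [16, 17].

Counterexamples [32, 10, 11] demonstrate that modular covariance does not follow from the basic principles of quantum field theory without further input. But its remarkable implications assign a significant role to this property, and it is desirable to find physically transparent conditions under which it holds. Bisognano and Wichmann have shown modular covariance to hold in theories where the field algebras are generated by finite-component Wightman fields [2, 3]. In the framework of algebraic quantum field theory, Borchers has shown that the modular objects associated to wedges have the correct commutation relations (namely, those expected from modular covariance) with the translation operators as a consequence of the positive energy requirement [4]. Based on his result, Brunetti, Guido and Longo derived modular covariance for conformally covariant theories [10]. In the Poincaré covariant case, sufficient conditions for modular covariance of a technical nature have been found by several authors [6, 8, 29, 26, 21] (see [8] for a review of these results).

In the present article, we derive modular covariance in the setting of local quantum physics for a large class of massive models. Specifically, the models must contain massive particles whose scattering states span the whole Hilbert space (asymptotic completeness). Further, within each charge sector the occurring particle masses must be isolated eigenvalues of the mass operator. The corresponding representation of the covering group of the Poincaré group must have no accidental degeneracies; i.e. for each mass and charge there is one single particle multiplet under the gauge group (the group of inner symmetries). We admit the most general localization properties for the charges carried by these particles, namely localization in spacelike cones [13].

A byproduct of our analysis is that the CPT theorem holds under these rather general and transparent conditions. It must be mentioned that Epstein has already proved a rudimentary version of the CPT theorem for massive theories in the framework of local quantum physics [20]. But it refers only to the S-matrix (and not to the local fields), and is derived only for compactly localized charges. It must also be mentioned that the spin-statistics theorem, which is a consequence of modular covariance and need not be assumed for our derivation, has already been proved by Buchholz and Epstein [12] for massive theories with charges localized in spacelike cones.

The article is organized as follows. In Section 1, the general framework is set up and our assumptions concerning the particle spectrum are made precise, as well as our notion of modular covariance. We state our main result in Theorem 2.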
conformally covariant theories [10]. In the Poincaré covariant case, sufficient conditions for modular covariance of technical nature have been found by several authors [6, 8, 29, 26, 21] (see [8] for a review of these results). In the present article, we derive modular covariance in the setting of local quantum physics for a large class of massive models. Specifically, the models must contain massive particles whose scattering states span the whole Hilbert space (asymptotic completeness). Further, within each charge sector the occurring particle masses must be isolated eigenvalues of the mass operator. The corresponding representation of the covering group of the Poincaré group must have no accidental degeneracies; i.e. for each mass and charge there is one single particle multiplet under the gauge group (the group of inner symmetries). We admit the most general localization properties for the charges carried by these particles, namely localization in spacelike cones [13]. A byproduct of our analysis is that the CPT theorem holds under these rather general and transparent conditions. It must be mentioned that Epstein has already proved a rudimentary version of the CPT theorem for massive theories in the framework of local quantum physics [20]. But it refers only to the S-matrix (and not to the local fields), and is derived only for compactly localized charges. It must also be mentioned that the spin-statistics theorem, which is a consequence of modular covariance and needs not be assumed for our derivation, has already been proved by Buchholz and Epstein [12] for massive theories with charges localized in spacelike cones. The article is organized as follows. In Section 1, the general framework is set up and our assumptions concerning the particle spectrum are made precise, as well as our notion of modular covariance. We state our main result in Theorem 2. 
The proof will proceed in two steps: In Section 2, single-particle versions of the Bisognano-Wichmann and the CPT theorems are derived (Theorem 5). This is an extension of Buchholz and Epstein’s proof [12] of the spin-statistics theorem for topological charges. In Section 3 we prove modular covariance via Haag-Ruelle scattering theory (Proposition 7). As mentioned, this already implies the CPT theorem [22]. Yet for the sake of self consistency, we show in Section 4 that the CPT theorem can be derived directly from our assumptions via scattering theory (Proposition 9).

1 Assumptions and Result

In the algebraic framework, the fundamental objects of a quantum field theory are the observable algebra and the physically relevant representations of it. The set of equivalence classes of these representations, or charge superselection sectors, has the structure of a semigroup. We will assume that it is generated by a set of “elementary charges” which correspond to massive particles. Then all relevant charges are localizable in spacelike cones [13]. Under these circumstances and if Haag duality holds, it is known [19] that the theory may be equivalently described

Vol. 2, 2001

The Bisognano-Wichmann Theorem for Massive Theories

909

by an algebra of (unobservable) charged field operators localized in spacelike cones, and a compact gauge group acting on the fields. The observable algebra is then the set of gauge invariant elements of the field algebra, and modular covariance of the former is equivalent to modular covariance of the latter [27, 22]. We take the field algebra framework as the starting point of our analysis. It is noteworthy that essential Haag duality then need not be assumed for our result, but rather follows from it.

Let us briefly sketch this framework. The Hilbert space $\mathcal{H}$ carries a unitary representation $U$ of the universal covering group $\tilde{P}_+^\uparrow$ of the Poincaré group which has positive energy, i.e. the joint spectrum of the generators $P_\mu$ of the translations is contained in the closed forward lightcone. There is a unique, up to a factor, invariant vacuum vector $\Omega$. Further, there is a compact group $G$ (the gauge group) of unitary operators on $\mathcal{H}$ which commute with the representation $U$ and leave $\Omega$ invariant. For every spacelike cone$^2$ $C$ there is a von Neumann algebra $\mathcal{F}(C)$ of so-called field operators acting in $\mathcal{H}$. The family $C \to \mathcal{F}(C)$, together with the representation $U$ and the group $G$, satisfies the following properties.

0) Inner symmetry: For all $C$ and all $V \in G$, $V \mathcal{F}(C) V^{-1} = \mathcal{F}(C)$.
i) Isotony: $C_1 \subset C_2$ implies $\mathcal{F}(C_1) \subset \mathcal{F}(C_2)$.
ii) Covariance: For all $C$ and all $g \in \tilde{P}_+^\uparrow$, $U(g) \mathcal{F}(C) U(g)^{-1} = \mathcal{F}(g C)$.
iii) Twisted locality: There is a Bose-Fermi operator $\kappa$ in the center of $G$ with $\kappa^2 = 1$, determining the spacelike commutation relations of fields: let $Z \doteq \frac{1+i\kappa}{1+i}$. Then $Z \mathcal{F}(C_1) Z^* \subset \mathcal{F}(C_2)'$ if $C_1$ and $C_2$ are spacelike separated.
iv) Reeh-Schlieder property: For every $C$, $\mathcal{F}(C)\Omega$ is dense in $\mathcal{H}$.
v) Irreducibility: $\bigcap_C \mathcal{F}(C)' = \mathbb{C} \cdot 1$.
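As a quick sanity check on assumption iii), the twist operator $Z$ can be examined on the two spectral subspaces of $\kappa$. The following is a minimal sketch, assuming only that $\kappa$ takes the eigenvalues $+1$ (bosonic) and $-1$ (fermionic); the function names are illustrative and do not come from the article. It confirms that $Z$ is unitary and that $Z^2 = \kappa$, a relation that is used in the proof of Proposition 3.

```python
# Spectral check of the twist operator Z = (1 + i*kappa) / (1 + i).
# Assumption: kappa is a self-adjoint unitary, so its eigenvalues are +1 and -1.

def twist_eigenvalue(kappa):
    """Eigenvalue of Z on the subspace where the Bose-Fermi operator equals kappa."""
    return (1 + 1j * kappa) / (1 + 1j)

def check_twist(kappa, tol=1e-12):
    """Return True if |z| = 1 (unitarity) and z^2 = kappa on this subspace."""
    z = twist_eigenvalue(kappa)
    is_unitary = abs(abs(z) - 1.0) < tol
    squares_to_kappa = abs(z * z - kappa) < tol
    return is_unitary and squares_to_kappa

# One entry per eigenvalue of kappa.
results = {kappa: check_twist(kappa) for kappa in (+1, -1)}
```

On the bosonic subspace $Z$ acts as the identity and on the fermionic subspace as multiplication by $-i$, which is why twisted locality reduces to commutators for bosons and anticommutators for fermions.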
Note that twisted locality (iii) is equivalent to normal commutation relations [15]: Two field operators which are localized in causally disjoint cones anticommute if both operators are odd under the adjoint action of $\kappa$ (fermionic) and commute if one of them is even (bosonic). Let
$$\mathcal{H} = \bigoplus_{\alpha \in \Sigma} \mathcal{H}_\alpha$$
$^2$ A spacelike cone is a region in Minkowski space of the form $C = a + \bigcup_{\lambda > 0} \lambda O$, where $a \in \mathbb{R}^4$ is the apex of $C$ and $O$ is a double cone whose closure does not contain the origin.

910

J. Mund


be the factorial decomposition of $G$, with $\Sigma$ the set of equivalence classes of irreducible unitary representations of $G$ contained in the defining representation$^3$. The subspaces $\mathcal{H}_\alpha$ will be referred to as (charge) sectors, and two sectors corresponding to conjugate representations $\alpha, \bar\alpha$ of $G$ will be called conjugate sectors. We denote by $E_\alpha$ the projection in $\mathcal{H}$ onto $\mathcal{H}_\alpha$, and by $d_\alpha$ the (finite) dimension of the class $\alpha$. Note that the Poincaré representation commutes with $E_\alpha$. Let $\Sigma^{(1)} \subset \Sigma$ be the set of charges carried by the particle types of the theory: $\alpha \in \Sigma^{(1)}$ if, and only if, the restriction of the mass operator $\sqrt{P^2}$ to $\mathcal{H}_\alpha$ has non-zero eigenvalues.

vi) Massive particle spectrum: For each $\alpha \in \Sigma^{(1)}$, there is exactly one$^4$ eigenvalue $m_\alpha$ of $\sqrt{P^2} E_\alpha$. This eigenvalue is isolated and strictly positive. Further, the corresponding subrepresentation of $\tilde{P}_+^\uparrow$ contains only one irreducible representation, with multiplicity equal to $d_\alpha$. Thus, for each $\alpha \in \Sigma^{(1)}$, there is one multiplet under $G$ of particle types with the same charge $\alpha$, mass and spin.
vii) Asymptotic completeness: The scattering states span the whole Hilbert space (see equation (3.5)).

The property of modular covariance, which we are going to derive from these assumptions, means the following. Due to the Reeh-Schlieder property and locality, for every spacelike cone $C$ there is a Tomita operator [9] $S_{\mathrm{Tom}}(C)$ associated to $\mathcal{F}(C)$ and $\Omega$: It is the closed antilinear involution satisfying
$$S_{\mathrm{Tom}}(C)\, B\Omega = B^* \Omega \quad \text{for all } B \in \mathcal{F}(C).$$
Its polar decomposition is denoted as
$$S_{\mathrm{Tom}}(C) = J_C\, \Delta_C^{1/2}.$$
The anti-unitary involution $J_C$ in this decomposition is called the modular conjugation, and the positive operator $\Delta_C$ gives rise to the so-called modular unitary group $\Delta_C^{it}$ associated to $C$. Modular covariance means, generally speaking, that the Tomita operators associated to a distinguished class of space-time regions have geometrical significance. This class is the set of wedge regions, i.e. Poincaré transforms of the standard wedge region $W_1$:$^5$
$$W_1 \doteq \{ x \in \mathbb{R}^4 : |x^0| < x^1 \},$$
and the geometrical significance is as follows. Let $\Lambda_1(t)$ denote the Lorentz boost in $x^1$-direction, acting as $\cosh t\, \mathbb{1} + \sinh t\, \sigma_1$ on the coordinates $x^0, x^1$, and $\lambda_1(t)$ its lift to the covering group $\tilde{P}_+^\uparrow$.

$^3$ In fact, $\Sigma$ contains all irreducible representations of $G$, and is in 1-1 correspondence with the superselection sectors of the observable algebra [15].
$^4$ Our results still hold if no restriction is imposed on the number of (isolated) mass values in each sector.
$^5$ Wedges $W$ will be considered as limiting cases of spacelike cones. $\mathcal{F}(W)$ is the von Neumann algebra generated by all $\mathcal{F}(C)$ with $C \subset W$.


Definition 1 A theory is said to satisfy modular covariance if
$$\Delta^{it}_{W_1} = U(\lambda_1(-2\pi t)). \tag{1.1}$$
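The geometric content of Definition 1 rests on the fact that the boosts $\Lambda_1(t)$ map the wedge $W_1$ onto itself and form a one-parameter group. The following minimal sketch illustrates this numerically; the helper names `boost_1` and `in_wedge_w1` and the sample point are assumptions made for the illustration, not part of the article.

```python
import math

def boost_1(t, x):
    """Apply Lambda_1(t) = cosh(t) 1 + sinh(t) sigma_1 to (x0, x1); x2 and x3 are unchanged."""
    x0, x1, x2, x3 = x
    c, s = math.cosh(t), math.sinh(t)
    return (c * x0 + s * x1, s * x0 + c * x1, x2, x3)

def in_wedge_w1(x):
    """Membership test for the standard wedge W_1 = { x : |x0| < x1 }."""
    return abs(x[0]) < x[1]

# A sample point of W_1 and a few rapidities (arbitrary illustrative values).
point = (0.3, 1.0, 5.0, -2.0)
rapidities = [-3.0, -1.0, 0.0, 0.5, 2.0]
images_stay_in_wedge = all(in_wedge_w1(boost_1(t, point)) for t in rapidities)

# One-parameter group law: Lambda_1(0.7) Lambda_1(-0.2) = Lambda_1(0.5).
composed = boost_1(0.7, boost_1(-0.2, point))
direct = boost_1(0.5, point)
group_law_error = max(abs(a - b) for a, b in zip(composed, direct))
```

The invariance follows from the fact that the light-cone combinations $x^1 \pm x^0$ are only rescaled by the positive factors $e^{\pm t}$.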

Other notions of modular covariance have been proposed in the literature (see, e.g. [14]), but this is the strongest one. In particular, it implies [22] that the modular conjugation $J_{W_1}$ has the geometric significance of representing the reflexion $j$ at the edge of $W_1$, which inverts the sign of $x^0$ and $x^1$ and acts trivially on $x^2, x^3$. More precisely, Guido and Longo have shown in [22] that equation (1.1) implies that the operator $\Theta \doteq Z^* J_{W_1}$ acts geometrically correctly, i.e. satisfies
$$\Theta\, \mathcal{F}(W)\, \Theta^{-1} = \mathcal{F}(jW) \tag{1.2}$$
for all wedge regions $W$, and has the representation properties
$$\Theta\, U(g)\, \Theta^{-1} = U(jgj) \ \text{ for all } g \in \tilde{P}_+^\uparrow, \qquad \Theta^2 = 1. \tag{1.3}$$
Here we have denoted by $g \to jgj$ the unique lift of the adjoint action of $j$ on the Poincaré group to an automorphism of the covering group [31]. Since $\Theta$ also sends each sector to its conjugate, equations (1.2) and (1.3) exhibit $\Theta$ as a CPT operator$^6$. Thus, modular covariance (1.1) implies the CPT theorem. Further, the last two equations imply, by the Tomita-Takesaki theorem, that the theory satisfies twisted Haag duality for wedges, i.e.
$$Z \mathcal{F}(W') Z^* = \mathcal{F}(W)', \tag{1.4}$$
where $W'$ denotes the causal complement of $W$. Our main result is the following theorem.

Theorem 2 Let the assumptions 0), ..., vii) be satisfied. Then modular covariance, the CPT theorem as expressed by equations (1.2) and (1.3), and twisted Haag duality for wedges hold.

It is noteworthy that equation (1.2) holds not only for wedge regions, but also for spacelike cones if one replaces the family $\mathcal{F}$ with the so-called dual family $\mathcal{F}^d$. Namely, twisted Haag duality for wedge regions implies that the dual family $\mathcal{F}^d(C) \doteq \bigcap_{W \supset C} \mathcal{F}(W)$ is still local. On this family, $\Theta$ acts geometrically correctly, i.e. equation (1.2) holds for all $\mathcal{F}^d(C)$ [22].

Our proof of the theorem will proceed in two steps: In the next section, modular covariance is shown to hold in restriction to the single particle space (Theorem 5). In Section 3 we show that modular covariance extends to the space of scattering states (Proposition 7). By the assumption of asymptotic completeness this space coincides with $\mathcal{H}$, hence modular covariance holds, implying the CPT theorem.

$^6$ Here we consider $j$ as the PT transformation. The total space-time inversion arises from $j$ through a $\pi$-rotation about the 1-axis, and is thus also a symmetry (if combined with charge conjugation C). In odd-dimensional space-time, $j$ is the proper candidate for a symmetry in combination with C, while the total space-time inversion is not.



2 Modular Covariance on the Single Particle Space

As a first step, we prove single-particle versions of the Bisognano-Wichmann and the CPT theorems. Let $E_I$, $I \subset \mathbb{R}$, be the spectral projections of the mass operator $\sqrt{P^2}$. We denote by $E^{(1)}$ the sum of all $E_{\{m_\alpha\}}$, where $\alpha$ runs through the set $\Sigma^{(1)}$ of single particle charges and $m_\alpha$ are the corresponding particle masses. The range of $E^{(1)}$ is called the single particle space. An essential step towards the Bisognano-Wichmann theorem is the mentioned result of Borchers [4, 5], namely that the modular unitary group and the modular conjugation associated to the wedge $W_1$ have the correct commutation relations with the translations. In particular, they commute with the mass operator, which implies that $S_{\mathrm{Tom}}(W_1)$ commutes with $E^{(1)}$. Let us denote the corresponding restriction by
$$S_{\mathrm{Tom}} \doteq S_{\mathrm{Tom}}(W_1)\, E^{(1)}.$$
Similarly, the representation $U(\tilde{P}_+^\uparrow)$ leaves $E^{(1)}\mathcal{H}$ invariant, giving rise to the subrepresentation
$$U^{(1)}(g) \doteq U(g)\, E^{(1)},$$
and one may ask if modular covariance holds on $E^{(1)}\mathcal{H}$. We show in this section that this is indeed the case, the line of argument being as follows. Let $K$ denote the generator of the unitary group of 1-boosts, $U(\lambda_1(t)) = e^{itK}$. We exhibit an antiunitary “PT-operator” $U^{(1)}(j)$ representing the reflexion $j$ on $E^{(1)}\mathcal{H}$, and show that $S_{\mathrm{Tom}}$ coincides with the “geometric” involution
$$S_{\mathrm{geo}} \doteq U^{(1)}(j)\, e^{-\pi K} E^{(1)} \tag{2.1}$$

up to a unitary “charge conjugation” operator which commutes with the representation $U^{(1)}$ of $\tilde{P}_+^\uparrow$. By uniqueness of the polar decomposition, this will imply modular covariance on $E^{(1)}\mathcal{H}$.

We begin by exploiting our knowledge about $U^{(1)}(\tilde{P}_+^\uparrow)$. By assumption, for each $\alpha \in \Sigma^{(1)}$ the subrepresentation $U(g) E^{(1)} E_\alpha$ contains only one equivalence class of irreducible representations. As is well-known, the latter is fixed by the mass $m_\alpha$ and a number $s_\alpha \in \frac{1}{2}\mathbb{N}_0$, the spin of the corresponding particle species. We briefly recall the so-called covariant irreducible representation $U_{m,s}$ for mass $m > 0$ and spin $s \in \frac{1}{2}\mathbb{N}_0$. The universal covering group of the proper orthochronous Lorentz group $L_+^\uparrow$ is identified with $SL(2, \mathbb{C})$ [30]. Explicitly, the boosts $\Lambda_k(\cdot)$ in $k$-direction and rotations $R_k(\cdot)$ about the $k$-axis, $k = 1, 2, 3$, lift to
$$\lambda_k(t) = e^{\frac{1}{2} t \sigma_k} \quad \text{and} \quad r_k(\omega) = e^{\frac{i}{2} \omega \sigma_k}, \qquad k = 1, 2, 3, \tag{2.2}$$
respectively, where $\sigma_k$ are the Pauli matrices. The universal covering group $\tilde{P}_+^\uparrow$ of the proper orthochronous Poincaré group $P_+^\uparrow$ is the semidirect product of $SL(2, \mathbb{C})$ with the translation subgroup $\mathbb{R}^4$, elements being denoted by $g = (x, A)$. The representation $U_{m,s}$ of $\tilde{P}_+^\uparrow$ for $m > 0$ and $s \in \frac{1}{2}\mathbb{N}_0$ acts on a Hilbert space $\mathcal{H}_{m,s}$ of


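functions from the positive mass shell $H_m$ into $\mathbb{C}^{2s+1}$. The latter, viewed as the space of covariant spinors of rank $2s$, is acted upon by an irreducible representation $V_s$ of $SL(2, \mathbb{C})$ satisfying
$$V_s(A^*) = V_s(A)^* \quad \text{and} \quad V_s(\bar{A}) = \overline{V_s(A)}.$$
Let $\langle \cdot, \cdot \rangle$ denote the scalar product in $\mathbb{C}^{2s+1}$, and $d\mu(p)$ the Lorentz invariant measure on the mass shell $H_m$, and let, for $p = (p_0, \boldsymbol{p}) \in \mathbb{R}^4$,
$$\tilde{p} \doteq p_0 \mathbb{1} - \boldsymbol{p} \cdot \boldsymbol{\sigma} \quad \text{and} \quad \underset{\sim}{p} \doteq p_0 \mathbb{1} + \boldsymbol{p} \cdot \boldsymbol{\sigma},$$
where $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$. Then the scalar product in $\mathcal{H}_{m,s}$ is defined as
$$(\psi_1, \psi_2) = \int d\mu(p)\, \big\langle \psi_1(p), V_s(\tfrac{1}{m} \tilde{p})\, \psi_2(p) \big\rangle. \tag{2.3}$$
$U_{m,s}$ acts on $\mathcal{H}_{m,s}$ according to
$$\big(U_{m,s}(x, A)\psi\big)(p) = \exp(i x \cdot p)\, V_s(A)\, \psi(\Lambda(A^{-1}) p), \tag{2.4}$$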

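where $\Lambda : SL(2, \mathbb{C}) \to L_+^\uparrow$ denotes the covering homomorphism. To this representation an anti-unitary operator $U_{m,s}(j)$ can be adjoined satisfying the representation properties
$$U_{m,s}(j)^2 = 1 \quad \text{and} \quad U_{m,s}(j)\, U_{m,s}(g)\, U_{m,s}(j) = U_{m,s}(jgj) \tag{2.5}$$
for all $g \in \tilde{P}_+^\uparrow$. Namely, it is given by$^7$
$$\big(U_{m,s}(j)\psi\big)(p) \doteq V_s\big(\tfrac{1}{m}\, \underset{\sim}{p}\, \sigma_3\big)\, \overline{\psi(-j p)}. \tag{2.6}$$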
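The first property in (2.5) can be checked numerically in the simplest case. The sketch below assumes spin $s = \frac{1}{2}$ with $V_{1/2}(A) = A$ (the defining representation, an assumption made only for this illustration) and takes the operator of equation (2.6) to include the complex conjugation of $\psi$ that anti-linearity requires. Under these assumptions $U_{m,s}(j)^2$ acts as multiplication by the matrix $\frac{1}{m^2}\, \underset{\sim}{p}\, \sigma_3\, \overline{\underset{\sim}{q}\, \sigma_3}$ with $q = -jp$, which should be the identity. All helper names are hypothetical.

```python
import math

# Pauli matrices as nested tuples (illustrative helper layout, not from the article).
I2 = ((1, 0), (0, 1))
S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))

def mul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def conj(a):
    """Entrywise complex conjugate of a 2x2 matrix."""
    return tuple(tuple(complex(x).conjugate() for x in row) for row in a)

def scaled(a, c):
    """Multiply a 2x2 matrix by the scalar c."""
    return tuple(tuple(c * x for x in row) for row in a)

def pauli_combination(c0, c1, c2, c3):
    """Return c0*1 + c1*sigma_1 + c2*sigma_2 + c3*sigma_3."""
    return tuple(tuple(c0 * I2[i][j] + c1 * S1[i][j] + c2 * S2[i][j] + c3 * S3[i][j]
                       for j in range(2)) for i in range(2))

def on_shell(m, p_vec):
    """Four-momentum (p0, p1, p2, p3) on the positive mass shell of mass m."""
    p0 = math.sqrt(m * m + sum(c * c for c in p_vec))
    return (p0,) + tuple(p_vec)

def upper_tilde(p):
    """p0*1 - p.sigma"""
    return pauli_combination(p[0], -p[1], -p[2], -p[3])

def lower_tilde(p):
    """p0*1 + p.sigma"""
    return pauli_combination(p[0], p[1], p[2], p[3])

def minus_j(p):
    """-j flips the signs of the 2- and 3-components (j flips the 0- and 1-components)."""
    return (p[0], p[1], -p[2], -p[3])

def deviation_from_identity(a):
    return max(abs(a[i][j] - I2[i][j]) for i in range(2) for j in range(2))

m = 1.5                                # arbitrary illustrative mass
p = on_shell(m, (0.3, -1.2, 0.7))      # arbitrary illustrative momentum
q = minus_j(p)

# Mass-shell identity: (p0 - p.sigma)(p0 + p.sigma) = m^2.
shell_error = deviation_from_identity(
    scaled(mul(upper_tilde(p), lower_tilde(p)), 1 / m**2))

# Multiplication kernel of U(j)^2 for s = 1/2 (see the assumptions stated above).
kernel = scaled(mul(mul(lower_tilde(p), S3), conj(mul(lower_tilde(q), S3))), 1 / m**2)
square_error = deviation_from_identity(kernel)
```

The check reduces to the algebraic identity $\sigma_3\, \overline{\underset{\sim}{q}}\, \sigma_3 = \tilde{p}$ for $q = -jp$ together with $\underset{\sim}{p}\, \tilde{p} = m^2$ on the mass shell.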

By our assumption vi), we may identify the subrepresentation $U(g) E^{(1)} E_\alpha$ with the direct sum of $d_\alpha$ copies of $U_{m_\alpha, s_\alpha}$. Then there are mutually orthogonal projections $E^{(1)}_{\alpha,k} \subset E_\alpha$, $k = 1, \ldots, d_\alpha$, in $\mathcal{H}$ onto irreducible subspaces such that
$$E^{(1)} E_\alpha = \sum_{k=1}^{d_\alpha} E^{(1)}_{\alpha,k}, \tag{2.7}$$
$$U(g)\, E^{(1)}_{\alpha,k} = U_{m_\alpha, s_\alpha}(g)\, E^{(1)}_{\alpha,k} \quad \text{for all } g \in \tilde{P}_+^\uparrow.$$
We define a “PT-operator” $U^{(1)}(j)$ on $E^{(1)}\mathcal{H}$ as the anti-linear extension of
$$U^{(1)}(j)\, E^{(1)}_{\alpha,k} \doteq U_{m_\alpha, s_\alpha}(j)\, E^{(1)}_{\alpha,k}.$$
$^7$ A proof, as well as explicit formulae for the relevant group relations $jgj$, are given in the Appendix for the convenience of the reader.



Note that this definition of $U^{(1)}(j)$ depends on the choice of the decomposition (2.7). We define now a closed antilinear operator $S_{\mathrm{geo}}$ in terms of the representation $U^{(1)}$, as anticipated, by equation (2.1). Note that the group relation $j \lambda_1(t) j = \lambda_1(t)$ implies that $S_{\mathrm{geo}}$ is, like $S_{\mathrm{Tom}}$, an involution: it leaves its domain invariant and satisfies $(S_{\mathrm{geo}})^2 \subset 1$. The following proposition is a corollary of the article [12] of Buchholz and Epstein.

Proposition 3 There is a unitary “charge conjugation” operator $C$ on $E^{(1)}\mathcal{H}$ satisfying
$$C E_\alpha E^{(1)} = E_{\bar\alpha}\, C E^{(1)} \quad \text{and} \quad [C, U(g)]\, E^{(1)} = 0 \ \text{ for all } g \in \tilde{P}_+^\uparrow, \tag{2.8}$$
such that
$$C\, S_{\mathrm{geo}} = S_{\mathrm{Tom}}. \tag{2.9}$$

Proof. Let $\alpha \in \Sigma^{(1)}$. Corresponding to the decomposition (2.7) of the particle multiplet $\alpha$ into particle types $(\alpha, k)$ there is, for each $k$ in $\{1, \ldots, d_\alpha\}$, a family of linear subspaces $C \to \mathcal{F}_{\alpha,k}(C) \subset \mathcal{F}(C)$ satisfying
$$E^{(1)} \mathcal{F}_{\alpha,k}(C)\, \Omega = E^{(1)}_{\alpha,k}\, \mathcal{F}(C)\, \Omega, \tag{2.10}$$
see e.g. [18, 15]. Note that the closures of the above vector spaces are independent of $C$ by the Reeh-Schlieder property and span $E^{(1)}\mathcal{H}_\alpha$ if $k$ runs through $\{1, \ldots, d_\alpha\}$. Similarly, the “anti-particle” Hilbert spaces
$$\big( E^{(1)} \mathcal{F}_{\alpha,k}(C)^* \Omega \big)^{-} \tag{2.11}$$
are independent of $C$, mutually orthogonal for different $k$, and span $E^{(1)}\mathcal{H}_{\bar\alpha}$ if $k$ runs through $\{1, \ldots, d_\alpha\}$ (note that $d_{\bar\alpha} = d_\alpha$). Buchholz and Epstein [12] have shown the particle-antiparticle symmetry in this situation: For each $\alpha \in \Sigma^{(1)}$ and $k = 1, \ldots, d_\alpha$ there is a unitary map $C_{\alpha,k}$ from the closure of the vector space (2.10) onto the space (2.11) intertwining the respective (irreducible) subrepresentations of $\tilde{P}_+^\uparrow$.

We now recall in detail the relevant result of Buchholz and Epstein. Denote by $\mathcal{F}^\infty_{\alpha,k}(C)$ the set of field operators $B \in \mathcal{F}_{\alpha,k}(C)$ such that the map $g \to U(g)\, B\, U(g)^{-1}$ is smooth in the norm topology. Buchholz and Epstein consider a special class of spacelike cones: Let $\underline{C} \subset \mathbb{R}^3$ be an open, salient cone in the $x^0 = 0$ plane of Minkowski space, with apex at the origin. Then its causal completion $C = \underline{C}''$ is a spacelike cone. Its dual cone $C^*$ is defined as the set
$$C^* \doteq \big\{ (p_0, \boldsymbol{p}) \in \mathbb{R}^4 : \ \boldsymbol{p} \cdot \boldsymbol{x} > 0 \ \text{ for all } \boldsymbol{x} \in \underline{C}^{-} \setminus \{0\} \big\}.$$

Lemma 4 (Buchholz, Epstein) Let $B \in \mathcal{F}^\infty_{\alpha,k}(C)$, where $C$ is a spacelike cone as above and $\alpha \in \Sigma^{(1)}$. Then $E^{(1)} B \Omega$ is, considered as a function on the mass shell


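$H_{m_\alpha}$, the smooth boundary value of an analytic function in the simply connected subset
$$\Gamma_{C,\alpha} \doteq \{ k \in \mathbb{C}^4 \mid k^2 = m_\alpha^2, \ \operatorname{Im} k \in -C^* \}$$
of the complex mass shell. Further, its boundary value on $-H_{m_\alpha}$ satisfies
$$\omega_{\alpha,k}\, V_{s_\alpha}\big(\tfrac{1}{m_\alpha}\, \underset{\sim}{p}\, \sigma_2\big)\, \overline{\big(E^{(1)} B \Omega\big)(-p)} = \big(C^*_{\alpha,k}\, E^{(1)} B^* \Omega\big)(p), \tag{2.12}$$
where $\omega_{\alpha,k}$ is a complex number of unit modulus which is independent of $B$ and $C$.

Note that equation (2.12) coincides literally with equation (5.13) in [12]. We reformulate this result as follows. Denote by $\mathcal{K}_1$ the class of spacelike cones $C$ contained in $W_1$ which are of the form $C = \underline{C}''$ as in the lemma and contain the positive $x^1$-axis. Let further
$$D_0 \doteq \operatorname*{span}_{C \in \mathcal{K}_1,\, \alpha,\, k} E^{(1)} \mathcal{F}^\infty_{\alpha,k}(C)\, \Omega, \tag{2.13}$$
where $\alpha$ runs through $\Sigma^{(1)}$ and $k = 1, \ldots, d_\alpha$. The lemma asserts that on this domain an operator $S_0$ may be defined by
$$\big(S_0 E^{(1)}_{\alpha,k} \psi\big)(p) \doteq V_{s_\alpha}\big(\tfrac{1}{m_\alpha}\, \underset{\sim}{p}\, \sigma_2\big)\, \overline{\big(E^{(1)}_{\alpha,k} \psi\big)(-p)}, \qquad \psi \in D_0. \tag{2.14}$$
Further, the intertwiners $C_{\alpha,k}$, modified by the factors $\omega_{\alpha,k}$ appearing in (2.12), extend by linearity to a unitary “charge conjugation” operator $C$ on $E^{(1)}\mathcal{H}$,
$$C\, E^{(1)}_{\alpha,k} \doteq \omega_{\alpha,k}\, C_{\alpha,k}\, E^{(1)}_{\alpha,k},$$
which satisfies the equations (2.8) of the proposition. Now equation (2.12) may be rewritten as
$$C\, S_0 \subset S_{\mathrm{Tom}}. \tag{2.15}$$
This inclusion implies in particular that $S_0$ is closable, its closure satisfying the same relation. But this closure is an extension of the operator $S_{\mathrm{geo}}$, as we show in the Appendix (Lemma 11). Hence we have
$$C\, S_{\mathrm{geo}} \subset S_{\mathrm{Tom}}, \tag{2.16}$$
and it remains to show the opposite inclusion. To this end, we refer to the opposite wedge $W_1' = R_2(\pi)\, W_1$. Let
$$S'_{\mathrm{geo}} \doteq U^{(1)}(r_2(\pi))\, S_{\mathrm{geo}}\, U^{(1)}(r_2(\pi))^{-1},$$
$$S'_{\mathrm{Tom}} \doteq U^{(1)}(r_2(\pi))\, S_{\mathrm{Tom}}\, U^{(1)}(r_2(\pi))^{-1} = S_{\mathrm{Tom}}(W_1')\, E^{(1)}.$$
We claim that the following sequence of relations holds true:
$$S_{\mathrm{Tom}} \subset \kappa^{-1} (S'_{\mathrm{Tom}})^* \subset \kappa^{-1} C\, (S'_{\mathrm{geo}})^* = C\, S_{\mathrm{geo}}, \tag{2.17}$$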



where $\kappa$ is the Bose-Fermi operator. Twisted locality and modular theory imply that $Z S_{\mathrm{Tom}}(W_1') Z^* \subset S_{\mathrm{Tom}}(W_1)^*$. Applying $E^{(1)}$, this yields $Z S'_{\mathrm{Tom}} Z^* \subset (S_{\mathrm{Tom}})^*$. But $Z S'_{\mathrm{Tom}} Z^* = Z^2 S'_{\mathrm{Tom}}$, because $\kappa$ commutes with the modular operators. Using $Z^2 = \kappa$, this proves the first inclusion. Since $C$ commutes with $U^{(1)}(\tilde{P}_+^\uparrow)$ and both $S_{\mathrm{geo}}$ and $S_{\mathrm{Tom}}$ are involutions, the inclusion (2.16) implies that
$$C\, U^{(1)}(j) = U^{(1)}(j)\, C^*. \tag{2.18}$$
Thus, the adjoint of relation (2.16) reads $S_{\mathrm{Tom}}^* \subset C\, S_{\mathrm{geo}}^*$, which implies the second of the above inclusions. Finally, the group relations (A.2) and $\lambda_1(t)\, r_2(\pi) = r_2(\pi)\, \lambda_1(-t)$ imply that $(S'_{\mathrm{geo}})^* = U^{(1)}(r_2(2\pi))\, S_{\mathrm{geo}}$. But the spin-statistics theorem [12] asserts that $U(r_2(2\pi)) = \kappa$. (Namely, both operators act on $\mathcal{H}_\alpha$ as multiplication by the statistics sign $\kappa_\alpha = e^{2\pi i s_\alpha}$.) Hence the last equation in (2.17) holds. This completes the proof of (2.17) and hence of the proposition.

By uniqueness of the polar decomposition, equation (2.9) of the proposition implies the equations
$$\Delta_{W_1}^{1/2} E^{(1)} = e^{-\pi K} E^{(1)}, \qquad J_{W_1} E^{(1)} = C\, U^{(1)}(j)\, E^{(1)}.$$
Since the unitary $C$ commutes with $U^{(1)}(\tilde{P}_+^\uparrow)$ and satisfies equation (2.18), we have shown the single particle version of the Bisognano-Wichmann theorem:

Theorem 5 Let the assumptions 0), ..., vi) of Section 1 hold. Then
i) Modular covariance holds on the single particle space:
$$\Delta^{it}_{W_1} E^{(1)} = U(\lambda_1(-2\pi t))\, E^{(1)}. \tag{2.19}$$
ii) $J_{W_1} E^{(1)}$ is a “CPT operator” on $E^{(1)}\mathcal{H}$:
$$J_{W_1}\, U(g)\, J_{W_1} E^{(1)} = U(jgj)\, E^{(1)} \quad \text{for all } g \in \tilde{P}_+^\uparrow. \tag{2.20}$$

3 Modular Covariance on the Space of Scattering States

Having established modular covariance on the single particle space, we now show that it extends to the space of scattering states. The argument is an extension of Landau's analysis [28] on the structure of local internal symmetries to the present case of a symmetry which does not act strictly locally in the sense of Landau. The method to be employed is Haag-Ruelle scattering theory [24, 25], whose adaptation to the present situation of topological charges has been developed in [13]. This method associates a multi-particle state to n single particle vectors, which are created from the vacuum by quasilocal field operators carrying definite


charge. Recall [15] that for every $\alpha \in \Sigma$, there is a family of linear subspaces $C \to \mathcal{F}_\alpha(C) \subset \mathcal{F}(C)$ of field operators carrying charge $\alpha$: $\mathcal{F}_\alpha(C)\, \Omega = E_\alpha \mathcal{F}(C)\, \Omega$. Operators in $\mathcal{F}_\alpha(C)$ are bosons or fermions w.r.t. the normal commutation relations according as $\kappa$ takes the value $1$ or $-1$ on $\mathcal{H}_\alpha$.

The mentioned quasilocal creation operators are constructed as follows. For $\alpha \in \Sigma^{(1)}$, let $B \in \mathcal{F}_\alpha(C)$ be such that the spectral support of $B\Omega$ has non-vanishing intersection with the mass hyperboloid $H_{m_\alpha}$. Further, let $f \in \mathcal{S}(\mathbb{R}^4)$ be a Schwartz function whose Fourier transform has compact support contained in the open forward light cone $V_+$ and intersects the energy momentum spectrum of the sector $\alpha$ only in the mass shell $H_{m_\alpha}$. Recall that the latter is assumed to be isolated from the rest of the energy momentum spectrum in the sector $\mathcal{H}_\alpha$. For $t \in \mathbb{R}$, let $f_t$ be defined by
$$f_t(x) \doteq (2\pi)^{-2} \int d^4p\ e^{i(p_0 - \omega_\alpha(\boldsymbol{p}))t}\, e^{-ip \cdot x}\, \tilde{f}(p), \tag{3.1}$$
where $\omega_\alpha(\boldsymbol{p}) \doteq (\boldsymbol{p}^2 + m_\alpha^2)^{1/2}$. For large $|t|$, its support is essentially contained in the region $t\, V_\alpha(f)$, where $V_\alpha(f)$ is the velocity support of $f$,
$$V_\alpha(f) \doteq \Big\{ \Big(1, \frac{\boldsymbol{p}}{\omega_\alpha(\boldsymbol{p})}\Big), \ p = (p_0, \boldsymbol{p}) \in \operatorname{supp} \tilde{f} \Big\}. \tag{3.2}$$
More precisely [7, 24], for any $\varepsilon > 0$ there is a Schwartz function $f_t^\varepsilon$ with support in $t\, V_\alpha(f)^\varepsilon$, where $V^\varepsilon$ denotes an $\varepsilon$-neighbourhood of $V$, such that $f_t - f_t^\varepsilon$ converges to zero in the Schwartz topology for $|t| \to \infty$. Let now
$$B(f_t) \doteq \int d^4x\ f_t(x)\, U(x)\, B\, U(x)^{-1}.$$
For large $|t|$, this operator is essentially localized in $C + t\, V_\alpha(f)$. Namely, for any $\varepsilon > 0$, it can be approximated by the operator
$$B(f_t^\varepsilon) \in \mathcal{F}\big(C + t\, V_\alpha(f)^\varepsilon\big) \tag{3.3}$$
in the sense that $B(f_t^\varepsilon) - B(f_t)$ is of fast decrease in $t$. Further, it creates from the vacuum a single particle vector
$$B(f_t)\, \Omega = (2\pi)^2 \tilde{f}(P)\, B\, \Omega \ \in E^{(1)} \mathcal{H}_\alpha,$$
which is independent of $t$, and whose velocity support is contained in that of $f$. Here we understand the velocity support $V(\psi)$ of a single particle vector to be defined as in equation (3.2), with the spectral support of $\psi$ taking the role of $\operatorname{supp} \tilde{f}$.

To construct an outgoing scattering state from $n$ single particle vectors, pick $n$ localization regions $C_i$, $i = 1, \ldots, n$ and compact sets $V_i$ in velocity space, such that for suitable open neighbourhoods $V_i^\varepsilon \subset \mathbb{R}^4$ the regions $C_i + t\, V_i^\varepsilon$ are



mutually spacelike separated for large $t$. Next, choose $B_i \in \mathcal{F}_{\alpha_i}(C_i)$, and Schwartz functions $f_i$ as above with $V_{\alpha_i}(f_i) \subset V_i$. Then the limit
$$\lim_{t \to \infty} B_n(f_{n,t}) \cdots B_1(f_{1,t})\, \Omega \doteq (\psi_n \times \cdots \times \psi_1)^{\mathrm{out}} \tag{3.4}$$
exists and depends only on the single particle vectors $\psi_i \doteq B_i(f_{i,t})\, \Omega$, justifying the above notation. The convergence in (3.4) is of fast decrease in $t$, and the limit vector depends continuously on the single particle states, as a consequence of the cluster theorem. Further, the normal commutation relations survive in this limit. Let us write $\mathcal{H}^{(1)} \doteq E^{(1)}\mathcal{H}$, and denote by $\mathcal{H}^{(n)}$, $n \geq 2$, the closed span of outgoing $n$-particle scattering states and by $\mathcal{H}^{(\mathrm{ex})}$ the span of these spaces:
$$\mathcal{H}^{(\mathrm{ex})} = \mathbb{C}\, \Omega \oplus \bigoplus_{n \in \mathbb{N}} \mathcal{H}^{(n)}. \tag{3.5}$$
Asymptotic completeness means that $\mathcal{H}^{(\mathrm{ex})}$ coincides with $\mathcal{H}$. Our proof that modular covariance extends from $\mathcal{H}^{(1)}$ to $\mathcal{H}^{(\mathrm{ex})}$ relies on the following observation.

Lemma 6 In each $\mathcal{H}^{(n)}$, $n \geq 2$, there is a total set of scattering states as in equation (3.4), with the localization regions chosen such that $C_1, \ldots, C_{n-1} \subset W_1'$ and $C_n = W_1$. In particular, for these scattering states the regions $C_i + t\, V(\psi_i)^\varepsilon$, $i = 1, \ldots, n-1$, are spacelike separated from $W_1 + t\, V(\psi_n)^\varepsilon$ for large $t$.

Proof. Consider the set $M^n$ of velocity tuples $(\boldsymbol{v}_1, \ldots, \boldsymbol{v}_n) \in \mathbb{R}^{3n}$ satisfying the requirements that a) one of the velocities, say $\boldsymbol{v}_{i_0}$, has the strictly largest 1-component: $(\boldsymbol{v}_{i_0})_1 > (\boldsymbol{v}_i)_1$ for $i \neq i_0$, and b) the relative velocities w.r.t. $\boldsymbol{v}_{i_0}$ have different directions:
$$\mathbb{R}_+ (\boldsymbol{v}_i - \boldsymbol{v}_{i_0}) \neq \mathbb{R}_+ (\boldsymbol{v}_j - \boldsymbol{v}_{i_0}) \quad \text{for } i \neq j.$$
Given such $(\boldsymbol{v}_1, \ldots, \boldsymbol{v}_n)$, let $C_{i_0} = W_1$. For $i \neq i_0$, let $\underline{C}_i$ be a cone in the $t = 0$ plane of $\mathbb{R}^4$ containing the ray $\mathbb{R}_+ (\boldsymbol{v}_i - \boldsymbol{v}_{i_0})$ and with apex at the origin, and let then $C_i$ be its causal closure. Then, having chosen sufficiently small opening angles, the regions $C_i + t\{(1, \boldsymbol{v}_i)\}$, $i = 1, \ldots, n$, are mutually spacelike separated for all $t > 0$, and further $C_i \subset W_1'$ for $i \neq i_0$. Now the set $M^n$ exhausts $\mathbb{R}^{3n}$ except for a set of measure zero. Hence, a scattering state $(\psi_n \times \cdots \times \psi_1)^{\mathrm{out}}$ as in (3.4) can be approximated by a sum of scattering states $(\psi_n^\nu \times \cdots \times \psi_1^\nu)^{\mathrm{out}}$, whose localization regions satisfy that $C^\nu_{i_0} = W_1$ for some $i_0$, and $C^\nu_i \subset W_1'$ for $i \neq i_0$. This is accomplished by a standard argument [1] taking into account the continuous dependence of $(\psi_n \times \cdots \times \psi_1)^{\mathrm{out}}$ on the $\psi_i$ and the Reeh-Schlieder theorem. But due to the normal commutation relations obeyed by the scattering states, $(\psi_n^\nu \times \cdots \times \psi_1^\nu)^{\mathrm{out}}$ coincides with $\pm (\psi^\nu_{i_0} \times \cdots \psi^\nu_n \cdots \times \psi^\nu_1)^{\mathrm{out}}$ and hence is of the form required in the lemma.
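The two conditions defining the set $M^n$ in the proof of Lemma 6 are easy to test for a concrete velocity tuple. The following is a minimal sketch with hypothetical helper names (`velocity`, `lemma6_leader`) and illustrative tolerances; it returns the index $i_0$ of the particle that is assigned the wedge $W_1$, or `None` if the tuple lies in the exceptional set of measure zero.

```python
import math

def velocity(p_vec, mass):
    """Velocity p / omega(p) of a particle with spatial momentum p_vec and the given mass."""
    omega = math.sqrt(sum(c * c for c in p_vec) + mass * mass)
    return tuple(c / omega for c in p_vec)

def direction(v):
    """Unit vector along v (v is assumed to be non-zero)."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def lemma6_leader(velocities, tol=1e-9):
    """Return i0 if the tuple satisfies conditions a) and b), otherwise None."""
    n = len(velocities)
    firsts = [v[0] for v in velocities]
    i0 = max(range(n), key=lambda i: firsts[i])
    # a) v_{i0} has the strictly largest 1-component.
    if any(i != i0 and firsts[i0] - firsts[i] <= tol for i in range(n)):
        return None
    # b) The relative velocities w.r.t. v_{i0} point in pairwise different directions.
    dirs = [direction(tuple(a - b for a, b in zip(v, velocities[i0])))
            for i, v in enumerate(velocities) if i != i0]
    for a in range(len(dirs)):
        for b in range(a + 1, len(dirs)):
            if all(abs(x - y) <= tol for x, y in zip(dirs[a], dirs[b])):
                return None
    return i0

generic = [(0.5, 0.0, 0.0), (0.1, 0.2, 0.0), (-0.3, 0.0, 0.1)]
collinear = [(0.5, 0.0, 0.0), (0.3, 0.0, 0.0), (0.1, 0.0, 0.0)]
tied = [(0.5, 0.0, 0.0), (0.5, 0.1, 0.0)]
```

Since condition a) forces every relative velocity $\boldsymbol{v}_i - \boldsymbol{v}_{i_0}$ to have a negative 1-component, the corresponding rays point into the opposite wedge, which is why the remaining cones can be chosen inside $W_1'$.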


Proposition 7 If the unitary groups $\Delta^{it}_{W_1}$ and $U(\lambda_1(-2\pi t))$ coincide on $\mathcal{H}^{(1)}$, they also coincide on the space $\mathcal{H}^{(\mathrm{ex})}$ of scattering states.

Proof. Let $U_t \doteq \Delta^{it}_{W_1}\, U(\lambda_1(2\pi t))$. Considering this operator as an internal symmetry, it should act multiplicatively on the scattering states as shown by Landau in [28]. The complication is that $U_t$ does not act strictly locally, but only leaves $\mathcal{F}(W_1)$ invariant. We generalize Landau's argument to this case utilizing the last lemma.

By induction over the particle number $n$ we show that $U_t$ is the unit operator on each $\mathcal{H}^{(n)}$. Let $(\psi_n \times \cdots \times \psi_1)^{\mathrm{out}}$ be a scattering state with $\psi_i = B_i(f_{i,t})\, \Omega$, where the localization regions $C_i$ are as in the above lemma. Since $B_{n-1}(f_{n-1,t}) \cdots B_1(f_{1,t})\, \Omega - (\psi_{n-1} \times \cdots \times \psi_1)^{\mathrm{out}}$ is of fast decrease in $t$, while $\| B_n(f_{n,t}) \|$ increases at most like $|t|^4$, one concludes as Hepp in [24]:
$$(\psi_n \times \cdots \times \psi_1)^{\mathrm{out}} = \lim_{s \to \infty} B_n(f_{n,s})\, (\psi_{n-1} \times \cdots \times \psi_1)^{\mathrm{out}}. \tag{3.6}$$
Hence
$$U_t\, (\psi_n \times \cdots \times \psi_1)^{\mathrm{out}} = \lim_{s \to \infty} U_t\, B_n(f_{n,s})\, U_t^{-1}\, (\psi_{n-1} \times \cdots \times \psi_1)^{\mathrm{out}}, \tag{3.7}$$
where we have put in the induction hypothesis that $U_t$ acts trivially on $\mathcal{H}^{(n-1)}$. Due to Borchers' result, $U_t$ commutes with the translations, which implies that $U_t\, B_n(f_{n,s})\, U_t^{-1}$ coincides with $(U_t B_n U_t^{-1})(f_{n,s})$. But modular theory and covariance guarantee that $U_t B_n U_t^{-1}$ is, like $B_n$, in $\mathcal{F}(W_1)$. In addition, $(U_t B_n U_t^{-1})(f_{n,s})\, \Omega = U_t \psi_n$, and we conclude from the equation (3.7) that
$$U_t\, (\psi_n \times \cdots \times \psi_1)^{\mathrm{out}} = \big((U_t \psi_n) \times \psi_{n-1} \times \cdots \times \psi_1\big)^{\mathrm{out}}.$$
By assumption of the proposition, $U_t$ acts trivially on $\psi_n$, and hence on the scattering state. By linearity and continuity, the same holds on $\mathcal{H}^{(n)}$, completing the induction.

The hypothesis of this proposition has been shown in Theorem 5 to hold under our assumptions 0), ..., vi). Hence, we have now derived modular covariance from these assumptions and asymptotic completeness. As mentioned, Guido and Longo have shown that modular covariance generally implies covariance of the modular conjugations, and hence the CPT theorem [22, Prop. 2.8, 2.9]. Thus, the proof of Theorem 2 is now completed.

4 The CPT Theorem

We show here that the CPT theorem can also be derived directly from our assumptions in Section 1, via the single particle result and scattering theory. This should in particular turn out useful for a derivation of the CPT theorem in a theory of massive particles with non-Abelian braid group statistics (plektons) in d = 2 + 1,



where the methods of [22] cannot be applied in an obvious way since one has no field algebra.

Recall that incoming scattering states can be constructed as in equation (3.4), where now $t \to -\infty$ and the condition for the limit to exist is that the regions $C_i - |t|\, V_i^\varepsilon$ be mutually spacelike separated for large $|t|$. The following result holds under the assumptions of Section 1, but without restrictions on the degeneracies of the mass eigenvalues in each sector. Like Proposition 7, it is an extension of Landau's argument [28].

Lemma 8 $J_{W_1}$ maps outgoing scattering states to incoming ones and vice versa according to
$$J_{W_1} (\psi_n \times \cdots \times \psi_1)^{\mathrm{out}} = (J_{W_1}\psi_n \times \cdots \times J_{W_1}\psi_1)^{\mathrm{in}}. \tag{4.1}$$
Let us put the statement of the lemma into a more concise form. Recall that the spaces of incoming and outgoing scattering states are isomorphic to an appropriately symmetrized Fock space over $\mathcal{H}^{(1)}$ via the operators $W_{\mathrm{in,out}}$ which map $\psi_n \otimes \cdots \otimes \psi_1$ to $(\psi_n \times \cdots \times \psi_1)^{\mathrm{in,out}}$, respectively. Lemma 8 then asserts that
$$J_{W_1} W_{\mathrm{out}} = W_{\mathrm{in}}\, \Gamma(J_{W_1} E^{(1)}), \tag{4.2}$$
where $\Gamma(U)$ denotes the second quantization of an (anti-)unitary operator $U$ on $\mathcal{H}^{(1)}$. Note that the same equation holds with $W_{\mathrm{out}}$ and $W_{\mathrm{in}}$ interchanged.

Proof. We proceed by induction along the same lines as in the last proposition. Let $\psi_i = B_i(f_{i,t})\, \Omega$ be the single particle states appearing in equation (4.1), with velocity supports contained in compact sets $V_i$, and with localization regions $C_i$, such that $C_i + t V_i^\varepsilon$ are mutually spacelike separated for large $t$ and suitable $\varepsilon > 0$. According to Lemma 6, we may assume that the localization regions satisfy $C_1, \ldots, C_{n-1} \subset W_1'$ and $C_n = W_1$. By the same arguments as in the last proof, we have
$$J_{W_1} (\psi_n \times \cdots \times \psi_1)^{\mathrm{out}} = \lim_{t \to \infty} J_{W_1} B_n(f_{n,t}) J_{W_1}\, (J_{W_1}\psi_{n-1} \times \cdots \times J_{W_1}\psi_1)^{\mathrm{in}}, \tag{4.3}$$
where we have put in the induction hypothesis that $J_{W_1}$ acts as in equation (4.1) on $\mathcal{H}^{(n-1)}$. Now by Borchers' result [4] we know that the commutation relations $J_{W_1} U(x) J_{W_1} = U(jx)$ hold. From these we conclude that the spectral supports of $\psi \in \mathcal{H}$ and $J_{W_1}\psi$ are related by the transformation $-j$, and hence their velocity supports are related by
$$V(J_{W_1}\psi) = -r\, V(\psi), \tag{4.4}$$
where $r$ denotes the inversion of the sign of the $x^1$-coordinate. By virtue of the Reeh-Schlieder theorem and the continuity of the scattering states, we may assume that for $i = 1, \ldots, n-1$ there are $\hat{B}_i \in \mathcal{F}(r\, C_i)$ and $\hat{f}_i$ such that $\hat{B}_i(\hat{f}_{i,-t})\, \Omega = J_{W_1}\psi_i$. Further, $\hat{f}_i$ can be chosen such that $V(\hat{f}_i) \subset V(J_{W_1}\psi_i)^\varepsilon$, which in turn is contained in $-r\, V_i^\varepsilon$ due to equation (4.4). Then $\hat{B}_i(\hat{f}_{i,-t})$ can be approximated by an operator $A_i^\varepsilon(t)$ localized in the region $r\, \{C_i + t V_i^\varepsilon\}$. These regions are mutually spacelike separated for large positive $t$, and hence the incoming $n-1$ particle state in equation (4.3) may be written as $\lim_{t \to -\infty} \hat{B}_{n-1}(\hat{f}_{n-1,t}) \cdots \hat{B}_1(\hat{f}_{1,t})\, \Omega$.

Similarly, Borchers' commutation relations imply that
$$J_{W_1} B_n(f_{n,t}) J_{W_1} = (J_{W_1} B_n J_{W_1})(\hat{f}_{n,-t}), \quad \text{where } \hat{f}_n(x) \doteq \overline{f_n(jx)}.$$
Now $J_{W_1} B_n J_{W_1}$ is in $\mathcal{F}(W_1)'$, and $V(\hat{f}_n) = -r\, V(f_n)$, and therefore the discussion around equation (3.3) implies that the above operator may be approximated by an operator $A_n^\varepsilon(t) \in \mathcal{F}(W_1 + t \cdot r V_n^\varepsilon)'$. Recall that the operators $A_i^\varepsilon(t)$, $i = 1, \ldots, n-1$, are localized in the regions $r\, \{C_i + t V_i^\varepsilon\}$. For large positive $t$, these regions are spacelike to $r\, \{W_1 + t V_n^\varepsilon\}$ and are hence contained in $W_1 + t \cdot r V_n^\varepsilon$. Hence the $A_i^\varepsilon(t)$ commute with $A_n^\varepsilon(t)$ for large $t$. Thus the standard arguments of scattering theory [24, 17] apply, yielding that the vector (4.3) may be written as
$$\lim_{t \to -\infty} (J_{W_1} B_n J_{W_1})(\hat{f}_{n,t})\, \hat{B}_{n-1}(\hat{f}_{n-1,t}) \cdots \hat{B}_1(\hat{f}_{1,t})\, \Omega,$$
and only depends on the single particle vectors. But these are $(J_{W_1} B_n J_{W_1})(\hat{f}_{n,t})\, \Omega = J_{W_1}\psi_n$, and $\hat{B}_i(\hat{f}_{i,t})\, \Omega = J_{W_1}\psi_i$ for $i = 1, \ldots, n-1$. Hence the limit coincides with the right hand side of equation (4.1), completing the induction.

Proposition 9 (CPT) Let $\Theta$ be the anti-unitary involution $\Theta \doteq Z^* J_{W_1}$.
i) If the representation property
$$\Theta\, U(g)\, \Theta = U(jgj) \quad \text{for all } g \in \tilde{P}_+^\uparrow \tag{4.5}$$
holds on $\mathcal{H}^{(1)}$, then it is also satisfied on the space $\mathcal{H}^{(\mathrm{ex})}$ of scattering states.
ii) In this case, and if in addition asymptotic completeness holds, $\Theta$ acts geometrically correctly on the family of wedge algebras $(\mathcal{F}(W))_W$ in the sense of equation (1.2).

Note that equation (4.5) is equivalent to $J_{W_1} U(g) J_{W_1} = U(jgj)$, since $Z$ commutes with $U(g)$ and satisfies $Z^* J_{W_1} = J_{W_1} Z$. In Theorem 5, we have shown that $J_{W_1}$ satisfies this representation property on $\mathcal{H}^{(1)}$ if the assumptions 0), ..., vi) of Section 1 hold. Hence Proposition 9 is a CPT theorem, holding under these assumptions and asymptotic completeness.

Proof. i) Let $J_{W_1}$ have the above representation property on $\mathcal{H}^{(1)}$. As is well known [17], the restriction of $U(\tilde{P}_+^\uparrow)$ to the space of scattering states is equivalent to the second quantization of its restriction to $\mathcal{H}^{(1)}$: $U(g)\, W_{\mathrm{out,in}} = W_{\mathrm{out,in}}\, \Gamma(U(g) E^{(1)})$. By virtue of Lemma 8, see equation (4.2), the assumption thus implies
$$J_{W_1} U(g) J_{W_1}\, W_{\mathrm{out}} = W_{\mathrm{out}}\, \Gamma\big(J_{W_1} U(g) J_{W_1} E^{(1)}\big) = U(jgj)\, W_{\mathrm{out}},$$



which proves the claim. ii) By twisted locality and modular theory, one has
$$\mathcal{F}(W_1') \subset Z^* \mathcal{F}(W_1)' Z = \Theta\, \mathcal{F}(W_1)\, \Theta. \tag{4.6}$$
Now recall that $U(r_2(\pi))\, \mathcal{F}(W_1)\, U(r_2(\pi))^{-1} = \mathcal{F}(W_1')$ and that $j r_2(\pi) j = r_2(-\pi)$, see equation (A.2). One therefore obtains, by applying $\mathrm{Ad}\, U(r_2(\pi))\Theta$ to the inclusion (4.6) and using equation (4.5), the opposite inclusion. Hence equality holds in (4.6). Since every wedge region arises from $W_1$ by a Poincaré transformation, the claimed equation (1.2) follows by covariance of the field algebras and the representation property (4.5) of $\Theta$.

A Single-Particle PT Operator and Geometric Involution

We provide an explicit formula for the group relations jgj and a proof of the representation property of the “PT operator” Um,s (j) defined in equation (2.6). As before, we denote by g → jgj the unique lift [31] of the adjoint action of j on the Poincaré group to an automorphism of the covering group. An explicit formula for jgj follows from the observation that j coincides with the proper Lorentz transformation $-R_1(\pi)$. Hence, for all $A \in SL(2,\mathbb{C})$,

\[ j\,\Lambda(A)\,j = R_1(\pi)\,\Lambda(A)\,R_1(\pi)^{-1} = \Lambda(\sigma_1 A \sigma_1). \]

This shows that the lift jgj is given by

\[ j\,(x, A)\,j = (j x,\ \sigma_1 A \sigma_1) \quad \text{for all } (x, A) \in \tilde{P}_+^{\uparrow}. \tag{A.1} \]

Using equation (2.2), one has in particular the relations

\[ j\, r_2(\omega)\, j = r_2(-\omega), \qquad j\, \lambda_1(t)\, j = \lambda_1(t). \tag{A.2} \]

Lemma 10 The operator Um,s (j) defined in equation (2.6) is anti–unitary and satisfies the representation properties (2.5). Proof. We prove the second of the equations (2.5) for g = (0, A) with A ∈ SL(2, C). The other assertions are shown along the same lines. Recall that the covering homomorphism Λ : SL(2, C) → L↑+ is characterized by Λ(A) p = A p A∗ . We have e Um,s (j) Um,s (g) Um,s (j) ψ (p) (A.3) = V m−2 p σ A¯ (Λ(A−1 )(−j)p) σ ψ(jΛ(A−1 )jp) . s

e

3

3

Using the identity Λ(A−1 )(−j)p = Λ(A−1 iσ1 )p = A−1 σ1 p σ1 (A∗ )−1 ,

e


which follows from $-j = R_1(\pi)$, and the well-known relation

\[ \sigma_2 \bar{A} \sigma_2 = (A^*)^{-1} \quad \text{for } A \in SL(2,\mathbb{C}), \]

one verifies that the argument of Vs in equation (A.3) equals σ1 A σ1 . By equation (A.1), this proves the claim. We now relate the geometric involution Sgeo = U (1) (j)e−πK E (1) with the closable operator S0 defined in equation (2.14). Lemma 11 The closure of S0 is an extension of Sgeo . Proof. Recall that for f ∈ S(R) the bounded operator f (K), where K denotes again the generator of the boosts λ1 (·), may be written as f (K) = dt f˜(t) U (λ1 (t)) . √ Here 2π f˜ is the Fourier transform of f, and the integral is understood in the weak sense. Let now c be a smooth function with compact support, and let ψ = ∞ E (1) BΩ, where B ∈ Fα,k (C) for some C ∈ K1 . Applying the above formula to . −πK c(K) one finds, using that c˜ is analytic and c π (t) = c˜(t − iπ), cπ (K) = e Sgeo c(K) ψ (p) = dt c˜(t − iπ) U (1) (j) U λ1 (t) ψ (p) (A.4) 1 p σ3 e 2t σ1 ψ Λ1 (−t)(−jp) . (A.5) = dt c˜(t − iπ) Vsα mα e The one-parameter group Λ1 (·) extends to an entire analytic function satisfying Λ1 (−t − it ) = Λ1 (−t) jt − i sin t σ , where jt acts as multiplication by cos t on the coordinates x0 and x1 and leaves the other coordinates unchanged, and σ acts as σ1 on (x0 , x1 ) and as the zero projection on (x2 , x3 ) [23]. Note that in particular Λ1 (−t − iπ) = Λ1 (−t) j . Further, one easily verifies that for any q ∈ Hmα , the vector σ q is in the dual cone C ∗ . Hence for all t ∈ (0, π) and all p ∈ Hmα , the complex vector Λ1 (−t−it )(−jp) is in ΓC,α , the domain of analyticity of ψ, (c.f. Lemma 4) and approaches Λ1 (−t)(−p) as t → π. It follows that the integrand in the expression (A.5) is anti–holomorphic in t in the strip 0 < Imt < π, and that (A.5) coincides with 1 1 p σ3 σ1 e 2t σ1 ψ(Λ1 (−t)(−p)) dt c˜(t) Vsα mα e i = dt c˜(t) S0 U λ1 (t) ψ (p) . (A.6)



Here we have used that for all t, U (λ1 (t)) ψ is again in the domain D0 of S0 due to the covariance of the field algebra. This is so because for all t, there is some Ct ∈ K1 such that Λ1 (t) C ⊂ Ct . Let now φ be in the (dense) domain of S0∗ , and let ψ ∈ D0 . We have shown from (A.4) to (A.6), that φ , Sgeo c(K)ψ = dt c˜(t) φ , S0 U λ1 (t) ψ = c(K)ψ , S0∗ φ . Let D denote the set of finite linear combinations of vectors of the form c(K) ψ, where c ∈ C0∞ (R) and ψ ∈ D0 . Then the above equation shows that D is in the domain of S0∗∗ , and that S0∗∗ = Sgeo on D. But D is a core for Sgeo , hence S0∗∗ is an extension of Sgeo .

Acknowledgments

I thank K. Fredenhagen, R. Longo and D. Buchholz for stimulating discussions which have been essential to this work, and K.-H. Rehren for carefully reading the manuscript. Further, I gratefully acknowledge the hospitality extended to me by the Universities of Rome I and II. Last but not least, I acknowledge financial support by the SFB 288 (Berlin), the EU (via TMR networks in Rome), the Graduiertenkolleg “Theoretische Elementarteilchenphysik” (Hamburg), and the DFG (Göttingen).

References

[1] H. Araki, Mathematical theory of quantum fields, Int. Series of Monographs in Physics, no. 101, Oxford University Press, 1999.
[2] J.J. Bisognano and E.H. Wichmann, On the duality condition for a Hermitean scalar field, J. Math. Phys. 16, 985 (1975).
[3] J.J. Bisognano and E.H. Wichmann, On the duality condition for quantum fields, J. Math. Phys. 17, 303 (1976).
[4] H.J. Borchers, The CPT-theorem in two-dimensional theories of local observables, Commun. Math. Phys. 143, 315 (1992).
[5] H.J. Borchers, On modular inclusion and spectrum condition, Lett. Math. Phys. 27, 311 (1993).
[6] H.J. Borchers, On Poincaré transformations and the modular group of the algebra associated with a wedge, Lett. Math. Phys. 46, 295–301 (1998).
[7] H.J. Borchers, D. Buchholz, and B. Schroer, Polarization-free generators and the S-matrix, Commun. Math. Phys. 219, 125–140 (2001), arXiv:hep-th/0003243.
[8] H.J. Borchers and J. Yngvason, On the PCT-theorem in the theory of local observables, arXiv:math-ph/0012020.
[9] O. Bratteli and D.W. Robinson, Operator algebras and quantum statistical mechanics 1, second ed., TMP, Springer, New York, 1987.
[10] R. Brunetti, D. Guido, and R. Longo, Modular structure and duality in conformal field theory, Commun. Math. Phys. 156, 201–219 (1993).
[11] D. Buchholz, O. Dreyer, M. Florig, and S.J. Summers, Geometric modular action and spacetime symmetry groups, Rev. Math. Phys. 12, 475–560 (1998).
[12] D. Buchholz and H. Epstein, Spin and statistics of quantum topological charges, Fysica 17, 329–343 (1985).
[13] D. Buchholz and K. Fredenhagen, Locality and the structure of particle states, Commun. Math. Phys. 84, 1–54 (1982).
[14] D.R. Davidson, Modular covariance and the algebraic PCT/spin-statistics theorem, arXiv:hep-th/9511216.
[15] S. Doplicher, R. Haag, and J.E. Roberts, Fields, observables and gauge transformations I, Commun. Math. Phys. 13, 1–23 (1969).
[16] S. Doplicher, R. Haag, and J.E. Roberts, Local observables and particle statistics I, Commun. Math. Phys. 23, 199 (1971).
[17] S. Doplicher, R. Haag, and J.E. Roberts, Local observables and particle statistics II, Commun. Math. Phys. 35, 49–85 (1974).
[18] S. Doplicher and D. Kastler, Ergodic states in a non commutative ergodic theory, Commun. Math. Phys. 7, 1–20 (1968).
[19] S. Doplicher and J.E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, Commun. Math. Phys. 131, 51–107 (1990).
[20] H. Epstein, CTP-invariance of the S-matrix in a theory of local observables, J. Math. Phys. 8, 750–767 (1967).
[21] D. Guido and R. Longo, Natural energy bounds in quantum thermodynamics, Commun. Math. Phys. 218, 513–536 (2001), arXiv:math.OA/0010019.
[22] D. Guido and R. Longo, An algebraic spin and statistics theorem, Commun. Math. Phys. 172, 517 (1995).
[23] R. Haag, Local quantum physics, second ed., Texts and Monographs in Physics, Springer, Berlin, Heidelberg, 1996.
[24] K. Hepp, On the connection between Wightman and LSZ quantum field theory, Axiomatic Field Theory (M. Chretien and S. Deser, eds.), Brandeis University Summer Institute in Theoretical Physics 1965, vol. 1, Gordon and Breach, 1966, pp. 135–246.
[25] R. Jost, The general theory of quantized fields, American Mathematical Society, Providence, Rhode Island, 1965.
[26] B. Kuckert, Two uniqueness results on the Unruh effect and on PCT-symmetry, Comm. Math. Phys. 221, 77–100 (2001), arXiv:math-ph/0010008.
[27] B. Kuckert, A new approach to spin & statistics, Lett. Math. Phys. 35, 319–331 (1995).
[28] L.J. Landau, Asymptotic locality and the structure of local internal symmetries, Commun. Math. Phys. 17, 156–176 (1970).
[29] B. Schroer and H.-W. Wiesbrock, Modular theory and geometry, Rev. Math. Phys. 12, 139–158 (2000), arXiv:math-ph/9809003.
[30] R.F. Streater and A.S. Wightman, PCT, spin and statistics, and all that, W. A. Benjamin Inc., New York, 1964.
[31] V.S. Varadarajan, Geometry of quantum theory, vol. II, Van Nostrand Reinhold Co., New York, 1970.
[32] J. Yngvason, A note on essential duality, Lett. Math. Phys. 31, 127–141 (1994).

Jens Mund
Institut für theoretische Physik
Universität Göttingen
Bunsenstr. 9
D-37 073 Göttingen
Germany
email: [email protected]

Communicated by Klaus Fredenhagen
submitted 5/02/01, accepted 22/03/01




A Palindromic Half-Line Criterion for Absence of Eigenvalues and Applications to Substitution Hamiltonians

D. Damanik, J.-M. Ghez and L. Raymond

Abstract. We prove a criterion for absence of decaying solutions on the half-line for one-dimensional discrete Schrödinger operators. As necessary inputs, we require infinitely many palindromic prefixes and upper and lower bounds for the traces of associated transfer matrices. We apply this criterion to Schrödinger operators with potentials generated by substitutions.

1 Introduction

The use of local symmetries in the study of spectral properties of one-dimensional Schrödinger operators has a long history dating back at least to the work of Gordon in 1976 [22]. Criteria in this spirit are particularly useful in the study of models centered around the Fibonacci Hamiltonian; see [13] for a review. The general idea is that local symmetries in the potential should be reflected in the solutions of the associated eigenvalue equation, which prevents them from being square-summable. Quite often one can even prove the stronger property that the solutions do not decay at infinity.

In one dimension, there are of course two types of local symmetries: repetition of blocks (i.e., powers) and reflection symmetry of blocks (i.e., palindromes). Moreover, the criteria can be classified as half-line methods and whole-line methods, according to whether they study properties of the potentials and the solutions on a half-line or the whole line. Half-line methods are usually slightly more involved, but they have the advantage that they provide stronger conclusions. Thus, there are in principle four types of criteria for absence of eigenvalues which employ local symmetries.

Based on Gordon’s work, Delyon-Petritis [20] and Sütő [34] have found whole-line and half-line criteria, respectively, using the occurrence of local repetitions in the potential. These criteria have subsequently been applied to large classes of potentials generated by substitutions and circle maps; see, for example, [8, 9, 10, 11, 12, 15, 16, 17, 18, 20, 23, 34]. It is important to note that these criteria do not only exclude eigenvalues, they also establish explicit solution estimates which were crucial in the study of more refined spectral properties such as α-continuity; compare [10, 17].

Based on Jitomirskaya-Simon [25], Hof et al. [24] have found a whole-line criterion for absence of eigenvalues which is based on palindromes. This criterion is applicable to large classes of potentials but it has the slight drawback that it does not exclude decaying solutions. Moreover, its scope is somewhat limited; see [1, 2, 19] for results on non-applicability of palindromic criteria. Let us summarize the current situation:

             powers                                  palindromes
whole-line   Delyon-Petritis [20] (based on [22])    Hof-Knill-Simon [24]
half-line    Sütő [34] (based on [22])               ?

Our motivation for filling in the gap in this table is twofold. On the one hand, it is interesting in its own right to find a palindromic half-line criterion. This is further motivated by the fact that, as mentioned above, half-line methods give stronger conclusions. On the other hand, we will show, in our application of such a criterion to Schrödinger operators with potentials generated by substitutions, how one obtains a more detailed understanding of both the solution behavior and the concrete realizations of the potentials one can treat for many examples (including, e.g., the prominent Thue-Morse case). We remark that the Thue-Morse case displays few power symmetries and is therefore essentially inaccessible to Gordon-type criteria, whereas palindromic symmetries are abundant.

The organization of this article is as follows. In Section 2 we prove a half-line criterion for absence of decaying solutions for general potentials provided that one finds suitable palindromic structures and bounds for the traces of the transfer matrices associated to them. In Section 3 we apply this criterion to potentials generated by substitutions.

2 A Palindromic Criterion for Absence of Decaying Solutions on the Half-Line

In this section we show how one can exclude the presence of decaying solutions for a half-line eigenvalue problem with a potential having infinitely many palindromes as prefixes. The necessary inputs are upper and lower bounds on transfer matrix traces on (not necessarily related) subsequences of these palindromic prefixes. As explained in the introduction, this complements the whole-line palindrome method of Hof et al. and also the several variants of Gordon-type criteria.

Consider a discrete one-dimensional Schrödinger operator

\[ (H\phi)(n) = \phi(n+1) + \phi(n-1) + V(n)\phi(n) \tag{1} \]

in $\ell^2(\mathbb{Z})$ with potential $V : \mathbb{Z} \to \mathbb{R}$. We shall study the solutions to the difference equation

\[ \phi(n+1) + \phi(n-1) + V(n)\phi(n) = E\phi(n) \tag{2} \]

for $E \in \mathbb{R}$. As usual, we introduce the transfer matrices

\[ M_E(n) = T_E(V(n)) \times \cdots \times T_E(V(1)), \]

where for $\zeta \in \mathbb{R}$,

\[ T_E(\zeta) = \begin{pmatrix} E - \zeta & -1 \\ 1 & 0 \end{pmatrix}. \]

Then any solution $\phi$ of (2) obeys $\Phi(n) = M_E(n)\Phi(0)$, where $\Phi(i)$ denotes $(\phi(i+1), \phi(i))^T$. The main result of this section is the following:

Theorem 1 Fix some $E \in \mathbb{R}$. Suppose that
(i) There exists a sequence of integers $n_k \to \infty$ such that for every $k$ and every $1 \le i \le \lfloor n_k/2 \rfloor$, we have $V(i) = V(n_k - i + 1)$.
(ii) There exists a constant $C_1$ such that $|\mathrm{tr}\, M_E(n_k)| \le C_1$ for infinitely many $n_k$.
(iii) There exists a constant $C_2 > 0$ such that $|\mathrm{tr}\, M_E(n_k)| \ge C_2$ for infinitely many $n_k$.
Then no solution $\phi$ of (2) tends to 0 at $+\infty$. In particular, no solution of (2) is square-summable and hence $E$ is not an eigenvalue of $H$.

Remark In other words, if the potential restricted to the right half-line has infinitely many palindromic prefixes and the traces of the transfer matrices associated with these palindromes neither tend to 0 nor to $\infty$ for some energy $E$, then the solutions corresponding to this energy do not tend to 0 at $+\infty$ and hence $E$ is not an eigenvalue of $H$.

Proof. We will first show that for every $k$, the transfer matrix $M_E(n_k)$ has the form

\[ M_E(n_k) = \begin{pmatrix} a_k & -b_k \\ b_k & d_k \end{pmatrix} \tag{3} \]

for suitable numbers $a_k, b_k, d_k$. We consider first the case where $n_k$ is even. Let

\[ T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]

and denote the matrix entries of $M_E(n_k/2)$ by

\[ M_E(n_k/2) = \begin{pmatrix} \alpha_k & \beta_k \\ \gamma_k & \delta_k \end{pmatrix}. \]

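Then, using assumption (i), we get

\begin{align*}
M_E(n_k) &= M_E(n_k/2) \cdot T \cdot (M_E(n_k/2))^{-1} \cdot T \\
&= \begin{pmatrix} \alpha_k & \beta_k \\ \gamma_k & \delta_k \end{pmatrix} \cdot T \cdot \begin{pmatrix} \delta_k & -\beta_k \\ -\gamma_k & \alpha_k \end{pmatrix} \cdot T \\
&= \begin{pmatrix} \alpha_k & \beta_k \\ \gamma_k & \delta_k \end{pmatrix} \cdot \begin{pmatrix} \alpha_k & -\gamma_k \\ -\beta_k & \delta_k \end{pmatrix} \\
&= \begin{pmatrix} \alpha_k^2 - \beta_k^2 & -(\alpha_k\gamma_k - \beta_k\delta_k) \\ \alpha_k\gamma_k - \beta_k\delta_k & -\gamma_k^2 + \delta_k^2 \end{pmatrix} \\
&=: \begin{pmatrix} a_k & -b_k \\ b_k & d_k \end{pmatrix}.
\end{align*}

This proves (3) for $n_k$ even. Let us now consider the case where $n_k$ is odd. Writing

\[ M_E(\lfloor n_k/2 \rfloor) = \begin{pmatrix} \alpha_k & \beta_k \\ \gamma_k & \delta_k \end{pmatrix}, \]

we can proceed with $\zeta = V(\lfloor n_k/2 \rfloor + 1)$ as follows:

\begin{align*}
M_E(n_k) &= M_E(\lfloor n_k/2 \rfloor) \cdot T_E(\zeta) \cdot T \cdot (M_E(\lfloor n_k/2 \rfloor))^{-1} \cdot T \\
&= \begin{pmatrix} \alpha_k & \beta_k \\ \gamma_k & \delta_k \end{pmatrix} \cdot \begin{pmatrix} E - \zeta & -1 \\ 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} \alpha_k & -\gamma_k \\ -\beta_k & \delta_k \end{pmatrix} \\
&= \begin{pmatrix} (E-\zeta)\alpha_k^2 + 2\alpha_k\beta_k & -((E-\zeta)\alpha_k\gamma_k + 2\beta_k\gamma_k + 1) \\ (E-\zeta)\alpha_k\gamma_k + 2\beta_k\gamma_k + 1 & -((E-\zeta)\gamma_k^2 + 2\gamma_k\delta_k) \end{pmatrix} \\
&=: \begin{pmatrix} a_k & -b_k \\ b_k & d_k \end{pmatrix},
\end{align*}

where the off-diagonal entries have been simplified using $\alpha_k\delta_k - \beta_k\gamma_k = 1$, so that $\alpha_k\delta_k + \beta_k\gamma_k = 2\beta_k\gamma_k + 1$. Hence we have (3) also for $n_k$ odd.

Now assume that there exists a decaying solution $\phi$ of (2). Then $\|M_E(n_k)\|$ and thus $\max\{|a_k|, |b_k|, |d_k|\}$ tends to infinity. By assumption (ii), on a subsequence $n_{k_j}$ we have $|a_{k_j} + d_{k_j}| \le C_1$. Using this and the relation $\det M_E(n_k) = a_k d_k + b_k^2 = 1$, we get

\[ \min\{|a_{k_j}|, |b_{k_j}|, |d_{k_j}|\} \to \infty. \tag{4} \]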

Now, with $v = \Phi(0)$, we have $M_E(n_k)v \to 0$ as $k \to \infty$. We see that, as an element of $P(\mathbb{R}^2)$, $v \neq (1, 0)^T$, for otherwise $a_k, b_k \to 0$ and hence $|d_k| \to \infty$, contradicting assumption (ii). Similarly, it follows that in $P(\mathbb{R}^2)$, $v \neq (0, 1)^T$. Therefore $v = (1, \xi)^T$ in $P(\mathbb{R}^2)$ with some $\xi \neq 0$. We get

\[ a_k - \xi b_k \to 0, \qquad b_k + \xi d_k \to 0. \]

Hence,

\[ a_k + \xi^2 d_k \to 0. \]

In particular, on the subsequence $n_{k_j}$ we have both $a_{k_j} + \xi^2 d_{k_j} \to 0$ and $|a_{k_j} + d_{k_j}| \le C_1$, so that (4) implies $\xi^2 = 1$. But then $\mathrm{tr}\, M_E(n_k) = a_k + d_k \to 0$, contradicting assumption (iii).

The above proof shows in fact the following:

Corollary 1 Fix some energy $E \in \mathbb{R}$. Suppose as in Theorem 1 that the potential on the right half-line starts with infinitely many palindromes and suppose further that the traces of the associated transfer matrices are bounded on a subsequence. Then the existence of a decaying solution of (2) implies that these traces converge to zero and the initial condition of the decaying solution is equal to either $(1, 1)^T$ or $(1, -1)^T$ in $P(\mathbb{R}^2)$.

This result can be interpreted as a result for half-line operators on $\ell^2(\mathbb{N})$ where one has to impose a boundary condition at the origin to ensure self-adjointness. On the set of energies where the transfer matrix traces do not diverge to infinity, one therefore has absence of decaying (and thus $\ell^2$-) solutions for all but two boundary conditions.
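The two structural facts used in this section, namely $\Phi(n) = M_E(n)\Phi(0)$ and the shape (3) of the transfer matrix over a palindromic prefix, can be checked on small examples. The following is a minimal sketch, not part of the original argument; the helper names (`transfer_matrix`, `solve`, `has_form_3`) and the sample potentials are illustrative assumptions. Integer inputs are used so that all comparisons are exact.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def T(E, zeta):
    """One-step transfer matrix T_E(zeta) = [[E - zeta, -1], [1, 0]]."""
    return ((E - zeta, -1), (1, 0))


def transfer_matrix(E, V):
    """M_E(n) = T_E(V(n)) ... T_E(V(1)) for the list V = [V(1), ..., V(n)]."""
    M = ((1, 0), (0, 1))
    for v in V:
        M = matmul(T(E, v), M)  # each new factor multiplies from the left
    return M


def solve(E, V, phi0, phi1):
    """Iterate equation (2); returns [phi(0), phi(1), ..., phi(n+1)]."""
    phi = [phi0, phi1]
    for n, v in enumerate(V, start=1):
        phi.append((E - v) * phi[n] - phi[n - 1])
    return phi


def has_form_3(M):
    """Shape (3): the off-diagonal entries are negatives of each other."""
    return M[0][1] == -M[1][0]


# Hypothetical sample data: an even palindrome, an odd palindrome, a non-palindrome.
E = 2
for V in ([1, -2, 3, 3, -2, 1], [1, -2, 3, -2, 1], [1, 2]):
    print(V, transfer_matrix(E, V), has_form_3(transfer_matrix(E, V)))
```

For a potential segment that is not a palindrome, the shape (3) generally fails, which is why assumption (i) of Theorem 1 enters the proof at this point.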

3 Absence of Eigenvalues for Substitution Schrödinger Operators

In this section, we want to apply our criterion to the particular case of Schrödinger operators with potentials generated by primitive substitutions. Some classes of such operators have already been shown to exhibit absence of eigenvalues [6, 9, 11, 12, 15, 16, 21, 23, 24, 34]. Here we will show how the application of our criterion allows one to treat an additional subclass.

Schrödinger operators with potentials generated by primitive substitutions have been studied mainly because of their relevance to quasicrystals and their exotic spectral properties. Since the discovery of quasicrystals by Shechtman et al. in 1984 [33], a lot of effort has been made to find appropriate structural models. The two most heavily studied models are generated by either a cut and project scheme or a substitution process. On the other hand, there has been intense research activity on Schrödinger operators with purely singular continuous spectra over the last decade and it turned out that as a rule, one-dimensional Schrödinger operators with potentials generated by primitive substitutions apparently exhibit this “exotic” spectral type. While absence of absolutely continuous spectrum follows in full generality from Kotani [26] and Last and Simon [27], absence of point spectrum is not known in similar generality. However, no counterexample is known, and there is a huge number of positive partial results; see [13] for a review and [9] for the conjecture that the hypotheses sufficient to prove the singularity of the spectrum, that is, semi-primitivity of a reduced trace map and existence of a square in u (Theorem 2 below), are also sufficient to prove absence of eigenvalues. Apart from results on the spectral type, other interesting results have been obtained for these operators, such as zero-measure Cantor spectrum [6, 8, 9, 35], general gap-labeling (heuristic [28], K-theoretic [7] and constructive [30]), and results on the opening of gaps at low coupling (heuristic [28] and precise [5, 6]) and large coupling [30].

3.1 A class of models with empty point spectrum

Let us first recall some definitions concerning these operators. A substitution [29] is a map $S$ from a finite alphabet $A$ to the set $A^*$ of words on $A$, which can be naturally extended to a map from $A^*$ to $A^*$ and also to a map from $A^{\mathbb{N}}$ to $A^{\mathbb{N}}$. $S$ is said to be primitive if there exists $k \in \mathbb{N}$ such that for every pair $(\alpha, \beta) \in A^2$, $S^k(\alpha)$ contains $\beta$. A substitution sequence $u$ is a fixed point of $S$ given by indefinite iteration of $S$ on a letter $a \in A$ such that $S(a)$ begins with $a$. The hull of $u$, $\Omega_u$, is defined as the set of two-sided infinite sequences over $A$ that have all their finite subwords occurring in $u$. Fix some function $f : A \to \mathbb{R}$. One says that a Schrödinger operator of type (1) is associated with $S$ and $f$ if the sequence $(V(n))_{n \in \mathbb{Z}}$ has the form $V(n) = f(\omega_n)$ for some $\omega \in \Omega_u$. It follows from general principles (see, e.g., [13]) that primitivity of $S$ implies the existence of a closed set $\Sigma \subseteq \mathbb{R}$ such that the spectrum of every associated operator $H$ is equal to $\Sigma$. However, in general the spectral type of $H$ need not be independent of $\omega$. We will always assume in the following that $S$ and $f$ are such that the potentials $V$ are not periodic since the spectral theory of the periodic case is well established.

Let us recall some notions which are crucial to the approach of [9]. Define for any word $w = w_1 \ldots w_m \in A^*$ and every $E \in \mathbb{R}$,

\[ M_E(w) = T_E(f(w_m)) \times \cdots \times T_E(f(w_1)) \]

and

\[ M_E^{(k)}(w) = M_E(S^k(w)). \]

The substitution rule naturally leads to recursive relations between the matrices $M_E^{(k)}(w)$. From these recursions one can obtain an even more useful system of recursive equations for the traces of these matrices. In general there exists a finite subset of words $B \subset A^*$ containing $A$ for which these equations yield a closed set of recursive polynomial equations, which is called the trace map. It turns out that to each trace map one can associate a reduced trace map that is monomial. To this reduced trace map, one can then associate a substitution $\hat{S}$ on $B$ whose properties are ultimately crucial for the spectral analysis. In short, the reduced trace map is obtained by keeping in the recursive equations the terms of highest degree, which determine the behavior of the norms of the transfer matrices at large $n$. Under the hypothesis of the theorem below, this allows one to identify the spectrum with the set of energies with zero Lyapunov exponent and thus to apply Kotani’s theorem [26] to prove the singularity of the spectrum; see [9] for details of the proof. We call such a substitution $\hat{S}$ semi-primitive if:

(i) There exists $C \subset B$ such that $\hat{S}|_C$ is a primitive substitution from $C$ to $C^*$.
(ii) There exists $k$ such that for all $\beta \in B$, $\hat{S}^k(\beta)$ contains at least one letter from $C$.

The following theorem was proven in [9]:

Theorem 2 Assume that $S$ is primitive and has a fixed point $u$, $f$ is such that the associated potentials are aperiodic, and the following conditions are satisfied:
(i) $u$ contains the square of a word in $B$.
(ii) There exists a trace map whose associated substitution $\hat{S}$ is semi-primitive.
Then for every $\omega \in \Omega_u$, the spectrum of the operator $H$ in (1) with $V(n) = f(\omega_n)$ is singular and supported on a set of zero Lebesgue measure.

Of special interest for an application of our criterion from Section 2 to substitution models is the fact that semi-primitivity of the trace map implies the existence of non-divergent subsequences of trace map iterates for energies in the spectrum [9, Lemma 3.4]. It has been shown in [9] that semi-primitivity of the derived substitution $\hat{S}$ holds for many prominent substitutions including (in the case where $A = \{a, b\}$) Fibonacci ($a \to ab$, $b \to a$), period doubling ($a \to ab$, $b \to aa$), binary non-Pisot ($a \to ab$, $b \to aaa$), and Thue-Morse ($a \to ab$, $b \to ba$, to be discussed in detail below).

Let $x_n(E) = \mathrm{tr}(M_E^{(n)}(a))$, where $a$ is the first symbol of $u$. As a consequence, in the case of semi-primitive $\hat{S}$ we get that for every energy $E$ from the spectrum, $|x_n(E)|$ is bounded on a subsequence (which may depend on the energy). For concrete models with a semi-primitive $\hat{S}$ and which display the required palindromic symmetries, we can therefore focus our attention on lower bounds for a subsequence of $|x_n(E)|$. We have the following theorem:

Theorem 3 Assume that $S$ is primitive and has a fixed point $u$, $f$ is such that the associated potentials are aperiodic, and the following conditions are satisfied:
(i) $S^n(a)$ is a palindrome for every $n$, where $a$ is the first symbol of $u$.
(ii) There exists a trace map whose associated substitution $\hat{S}$ is semi-primitive.
(iii) For every $E \in \Sigma$, $x_n(E) \not\to 0$ as $n \to \infty$.
Then there is $\omega \in \Omega_u$ such that $u$ is the restriction of $\omega$ to $\mathbb{N}$ and the operator $H$ in (1) with $V(n) = f(\omega_n)$ has no eigenvalues.

Proof. We only have to construct an element $\omega \in \Omega_u$ whose restriction to the right half-line coincides with $u$. The assertion is then a consequence of Theorem 1. The existence of such an element follows from the repetitivity of $u$ (i.e., its finite subwords occur infinitely often) and compactness of $\Omega_u$ (since it is clearly a closed subset of the compact $A^{\mathbb{Z}}$). Namely, using these two properties, one can construct a sequence of elements of $\Omega_u$ which on the right half-line converges pointwise to $u$ and then choose a converging subsequence by compactness. The limit $\omega$ of this subsequence then coincides with $u$ on the right half-line. By construction, assumptions (i) and (iii) imply conditions (i) and (iii) of Theorem 1, respectively, while, as mentioned above, assumption (ii) implies condition (ii) of Theorem 1 by [9, Lemma 3.4].

If $u$ contains a square, which is the case, for example, if there is a palindrome of even length, the operator $H$ above verifies all the assumptions of Theorem 2. We can therefore state the following corollary to Theorem 3.

Corollary 2 Under the assumptions of Theorem 3, if $u$ contains the square of a word, the spectrum of $H$ is purely singular continuous and supported on a Cantor set of Lebesgue measure zero.

Remark This corollary is a partial answer to the conjecture in [9] mentioned in the introduction of this section.
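The combinatorial notions of this subsection (iterating a substitution, primitivity, palindromic iterates) are easy to experiment with. The following is a minimal sketch under stated assumptions: substitutions are represented as Python dictionaries from letters to words, all helper names are illustrative, and the primitivity test relies on Wielandt's bound $(m-1)^2 + 1$ for the exponent of a primitive $m \times m$ nonnegative matrix, which is a standard fact not taken from the text above.

```python
def apply_substitution(S, word):
    """Extend the letter map S to words by concatenation."""
    return "".join(S[letter] for letter in word)


def iterate(S, letter, n):
    """Return S^n(letter)."""
    word = letter
    for _ in range(n):
        word = apply_substitution(S, word)
    return word


def power(S, k):
    """Return the substitution S^k as a dictionary."""
    return {letter: iterate(S, letter, k) for letter in S}


def is_primitive(S):
    """True if some S^k(alpha) contains every letter, for every alpha.

    Only the sets of letters occurring in S^k(alpha) are tracked, so the
    words themselves never have to be built. It suffices to test k up to
    Wielandt's bound (m - 1)^2 + 1, where m is the alphabet size.
    """
    letters = set(S)
    m = len(letters)
    reach = {c: {c} for c in letters}  # letters occurring in S^0(c)
    for _ in range((m - 1) ** 2 + 1):
        # letters of S^(k+1)(c) = union of letters of S^k(d) over d in S(c)
        reach = {c: set().union(*(reach[d] for d in S[c])) for c in letters}
        if all(reach[c] == letters for c in letters):
            return True
    return False


def is_palindrome(word):
    return word == word[::-1]


fibonacci = {"a": "ab", "b": "a"}
thue_morse = {"a": "ab", "b": "ba"}
thue_morse_squared = power(thue_morse, 2)  # a -> abba, b -> baab
print(is_primitive(fibonacci), is_primitive(thue_morse))
print([is_palindrome(iterate(thue_morse_squared, "a", n)) for n in range(1, 5)])
```

If every letter is mapped to a palindrome, as for the squared Thue-Morse substitution, then the image of a palindrome is again a palindrome, so condition (i) of Theorem 3 can be confirmed by induction; the loop above merely illustrates this for the first few iterates.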

3.2 The Thue-Morse case

As a first example, we consider the Thue-Morse sequence; compare, in particular, [5]. Generic absence of eigenvalues was shown by Delyon-Peyrière [21], in [6] (in an implicit way that will be made explicit here), and by Hof et al. [24]. In fact, as was claimed in [6] (see the remark at the end of Section II), the palindromicity of some $S^n(a)$ is a crucial ingredient in the proof. However, our result additionally yields the absence of decaying solutions at $+\infty$ and, if one considers the symmetric extension of $u$ (the reader may verify that this sequence belongs to $\Omega_u$), we even have an explicit potential from the Thue-Morse hull for which every solution to (2) with $E$ in the spectrum decays neither at $+\infty$ nor at $-\infty$.

Let us recall the definition of the Thue-Morse substitution: It is defined on the alphabet $A = \{a, b\}$ by $S(a) = ab$ and $S(b) = ba$. It is clearly primitive and the sequence $u = \lim_{n \to \infty} S^n(a)$ is called the Thue-Morse sequence. Since one also has $u = \lim_{n \to \infty} S^{2n}(a)$, $S^2$ generates the same hull and hence the same family of associated operators. It is therefore sufficient to verify all the assumptions of Theorem 3 for the substitution $S^2$. Aperiodicity of the associated potentials holds for all non-constant functions $f$. Moreover, we have

(i) The iterates of $S^2$ on $a$ are palindromes of even length since $S^2(a) = abba$ and $S^2(b) = baab$.

(ii) Define $y_n(E) = \mathrm{tr}(M_E^{(n)}(b))$, $z_n(E) = \mathrm{tr}(M_E^{(n)}(ab))$. The trace map for $S^2$ is as follows (we drop $E$ for simplicity of notation):

\begin{align*}
x_{n+1} &= x_n y_n z_n - x_n^2 - y_n^2 + 2 \\
y_{n+1} &= x_n y_n z_n - x_n^2 - y_n^2 + 2 \\
z_{n+1} &= x_n y_n z_n^3 - x_n^2 z_n^2 - y_n^2 z_n^2 + 2
\end{align*}

which gives rise to a semi-primitive reduced trace map: $x \to xyz$, $y \to xyz$, $z \to xyz^3$.

(iii) Since $z_n(E)$ corresponds to the evolution of the traces associated with the word $\gamma_0 = ab$ which contains $a$, Lemma 3.4 of [9] shows that for every $E \in \Sigma$, there is a subsequence of $(|z_n(E)|)_{n \in \mathbb{N}}$ which is bounded from above. Then the above recursions for $x_n = y_n$ show that the sequence $(x_n(E))_{n \in \mathbb{N}}$ cannot converge to 0. Indeed, for $n \ge 1$ they read $x_{n+1} = x_n^2 (z_n - 2) + 2$, so if $x_n \to 0$, then $x_{n+1} \to 2$ along the subsequence on which $|z_n|$ is bounded, a contradiction.

Thus, by Corollary 2, there is $\omega \in \Omega_u$ such that $u$ is the restriction of $\omega$ to $\mathbb{N}$ and for every non-constant $f$, the spectrum of $H$ with $V(n) = f(\omega_n)$ is purely singular continuous and supported on a Cantor set of Lebesgue measure zero.
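The recursion for $x_n$ and $y_n$ in (ii) is an instance of the identity $\mathrm{tr}(A^2 B^2) = xyz - x^2 - y^2 + 2$ for $2 \times 2$ matrices of determinant one with $x = \mathrm{tr}\,A$, $y = \mathrm{tr}\,B$, $z = \mathrm{tr}\,AB$. As a sanity check, the sketch below compares it with traces computed directly from the words $S^{2n}(a)$, $S^{2n}(b)$, $S^{2n}(ab)$. The function names, the energy and the values of $f$ are illustrative assumptions; integers are used so that the comparison is exact.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def word_matrix(E, f, word):
    """M_E(w) = T_E(f(w_m)) ... T_E(f(w_1)) for the word w = w_1 ... w_m."""
    M = ((1, 0), (0, 1))
    for letter in word:
        M = matmul(((E - f[letter], -1), (1, 0)), M)
    return M


def trace(M):
    return M[0][0] + M[1][1]


# Square of the Thue-Morse substitution, as in item (i).
S2 = {"a": "abba", "b": "baab"}


def substitute(word):
    return "".join(S2[letter] for letter in word)


def traces(E, f, n):
    """Return (x_n, y_n, z_n), computed directly from the n-th iterates of S^2."""
    words = ["a", "b", "ab"]
    for _ in range(n):
        words = [substitute(w) for w in words]
    return tuple(trace(word_matrix(E, f, w)) for w in words)


# Hypothetical sample data: energy E = 1 and potential values f(a) = 2, f(b) = -1.
E, f = 1, {"a": 2, "b": -1}
for n in range(3):
    x, y, z = traces(E, f, n)
    x_next, y_next, _ = traces(E, f, n + 1)
    print(n, x_next == x * y * z - x * x - y * y + 2, x_next == y_next)
```

Word lengths grow like $4^n$, so the direct computation is only practical for small $n$; the point of the trace map is precisely that it avoids building the words.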

3.3 A family of examples

We can verify the assumptions of Corollary 2 for a class of two-letter substitutions, namely all those which are defined by $S(a)$ = palindrome of even length beginning with $a$, $S(b) = a^p$, $p \in \mathbb{N}$. First, we remark that this class clearly verifies the hypothesis of semi-primitivity. Second, it is easy to see in this case that if for some $E$ in the spectrum, the sequence $(x_n(E))_{n \in \mathbb{N}}$ converges to 0, then for $y_n(E)$, $z_n(E)$ defined as in the previous subsection, the sequence $(|y_n(E)|)_{n \in \mathbb{N}}$ converges to 0 if $p$ is odd and 2 if $p$ is even. Since there is a subsequence of $(|z_n(E)|)_{n \in \mathbb{N}}$ which is bounded from above, this leads immediately to a contradiction because the specific form of $S(a)$ implies that the trace map expression for $x_n(E)$ contains a constant term of absolute value 2 and no term $y_n(E)$ or $-y_n(E)$.

A natural goal is now the generalization of this result to an arbitrary finite alphabet $A = \{a_1, \ldots, a_m\}$, in the sense that $S(a_1)$ = palindrome of even length beginning with $a_1$ and all the other $S(a_k)$’s are powers of $a_1$, because in this case the convergence of $(x_n(E))_{n \in \mathbb{N}}$ to 0 implies the convergence to 0, 2, or $-2$ of all the other sequences of traces associated to one letter, while $x_n(E)$ contains a constant term of absolute value 2 and no term of the type $y_n(E)$ or $-y_n(E)$ associated to other letters.

3.4 Examples with an invariant

We have seen that for an attempt to apply Theorem 3, the hardest part is to establish condition (iii) since condition (i) can be verified easily and condition (ii) can be checked by a simple algorithmic procedure; compare [9]. We therefore want to point out that condition (iii) can sometimes be established by using some soft arguments involving a trace map invariant. For example, in the Fibonacci case we have that $x_n(E) = \mathrm{tr}(M_E^{(n)}(a))$ obeys

\[ (x_{n+1}(E))^2 + (x_n(E))^2 + (x_{n-1}(E))^2 - x_{n+1}(E)\, x_n(E)\, x_{n-1}(E) = C_f \]

for every $n$ and $E$, with $C_f \neq 0$ if $f$ is not constant [34]. It is clear that this invariant prevents $x_n(E)$ from converging to zero! Models with invariant have been discussed in [3, 4, 14, 31, 32]. We remark that also in the period doubling case, an invariant plays an important role (this is implicit in [6]).
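The Fibonacci invariant can be observed numerically. The following is a minimal sketch, assuming the Fibonacci substitution $a \to ab$, $b \to a$ and the transfer matrix conventions of Section 2; the function names and sample values are illustrative. In this normalization a direct computation with $n = 1$ gives the value $4 + (f(a) - f(b))^2$ for the constant, which the sketch uses as a cross-check.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def word_trace(E, f, word):
    """Trace of M_E(w) = T_E(f(w_m)) ... T_E(f(w_1))."""
    M = ((1, 0), (0, 1))
    for letter in word:
        M = matmul(((E - f[letter], -1), (1, 0)), M)
    return M[0][0] + M[1][1]


FIBONACCI = {"a": "ab", "b": "a"}


def fibonacci_traces(E, f, N):
    """Return [x_0, ..., x_N] with x_n = tr M_E(S^n(a))."""
    word, xs = "a", []
    for _ in range(N + 1):
        xs.append(word_trace(E, f, word))
        word = "".join(FIBONACCI[letter] for letter in word)
    return xs


def invariant(x_prev, x_cur, x_next):
    """Left-hand side of the invariant for three consecutive traces."""
    return x_next ** 2 + x_cur ** 2 + x_prev ** 2 - x_next * x_cur * x_prev


# Hypothetical sample data: energy E = 1, potential values f(a) = 3, f(b) = -1.
E, f = 1, {"a": 3, "b": -1}
xs = fibonacci_traces(E, f, 8)
print([invariant(xs[n - 1], xs[n], xs[n + 1]) for n in range(1, 8)])
```

Since the invariant is independent of $n$, three consecutive traces can never all be small when the constant is nonzero, which is the soft argument for condition (iii) alluded to above.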

Acknowledgments

D. D. would like to thank the Centre de Physique Théorique, Marseille for its warm hospitality and the Université de Toulon et du Var and the German Academic Exchange Service (HSP III, Postdoktoranden) for financial support.

Note added in proof

During the publishing process of this article, we became aware of a preprint by Q-M. Lin, B. Tan, Z-W. Wen and J. Wu, where they claimed that, for any primitive substitution S, there exists a power of S and an associated trace map such that the corresponding induced substitution is semi-primitive.

References [1] J.-P. Allouche, Schr¨ odinger operators with Rudin-Shapiro potentials are not palindromic, J. Math. Phys. 38, 1843–1848 (1997). [2] M. Baake, A note on palindromicity, Lett. Math. Phys. 49, 217–227 (1999). [3] M. Baake, U. Grimm, and D. Joseph, Trace maps, invariants, and some of their applications, Int. J. Mod. Phys. B 7, 1527–1550 (1993). [4] M. Baake and J. A. G. Roberts, Reversing symmetry group of GL(2, Z) and P GL(2, Z) matrices with connections to cat maps and trace maps, J. Phys. A 30, 1549–1573 (1997). [5] J. Bellissard, Spectral properties of Schr¨ odinger’s operator with a ThueMorse potential, in: Number Theory and Physics (Les Houches, 1989), Eds. J. M. Luck, P. Moussa and M. Waldschmidt, Springer, Berlin (1990), pp. 140–150. [6] J. Bellissard, A. Bovier, and J.-M. Ghez, Spectral properties of a tight binding Hamiltonian with period doubling potential, Commun. Math. Phys. 135, 379– 399 (1991). [7] J. Bellissard, A. Bovier, and J.-M. Ghez, Gap labeling theorems for onedimensional discrete Schr¨ odinger operators, Rev. Math. Phys. 4, 1–37 (1992).

Vol. 2, 2001


937

[8] J. Bellissard, B. Iochum, E. Scoppola, and D. Testard, Spectral properties of one-dimensional quasi-crystals, Commun. Math. Phys. 125, 527–543 (1989).
[9] A. Bovier and J.-M. Ghez, Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions, Commun. Math. Phys. 158, 45–66 (1993); Erratum Commun. Math. Phys. 166, 431–432 (1994).
[10] D. Damanik, α-continuity properties of one-dimensional quasicrystals, Commun. Math. Phys. 192, 169–182 (1998).
[11] D. Damanik, Singular continuous spectrum for the period doubling Hamiltonian on a set of full measure, Commun. Math. Phys. 196, 477–483 (1998).
[12] D. Damanik, Singular continuous spectrum for a class of substitution Hamiltonians, Lett. Math. Phys. 46, 303–311 (1998).
[13] D. Damanik, Gordon-type arguments in the spectral theory of one-dimensional quasicrystals, in: Directions in Mathematical Quasicrystals, Eds. M. Baake and R. V. Moody, CRM Monograph Series 13, AMS, Providence, RI, pp. 277–304 (2000).
[14] D. Damanik, Substitution Hamiltonians with bounded trace map orbits, J. Math. Anal. Appl. 249, 393–411 (2000).
[15] D. Damanik, Uniform singular continuous spectrum for the period doubling Hamiltonian, Ann. Henri Poincaré 2, 101–108 (2001).
[16] D. Damanik, Singular continuous spectrum for a class of substitution Hamiltonians II., Lett. Math. Phys. 54, 25–31 (2000).
[17] D. Damanik, R. Killip, and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity, Commun. Math. Phys. 212, 191–204 (2000).
[18] D. Damanik and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, I. Absence of eigenvalues, Commun. Math. Phys. 207, 687–696 (1999).
[19] D. Damanik and D. Zare, Palindrome complexity bounds for primitive substitution sequences, Discrete Math. 222, 259–267 (2000).
[20] F. Delyon and D. Petritis, Absence of localization in a class of Schrödinger operators with quasiperiodic potential, Commun. Math. Phys. 103, 441–444 (1986).
[21] F. Delyon and J. Peyrière, Recurrence of the eigenstates of a Schrödinger operator with automatic potential, J. Stat. Phys. 64, 363–368 (1991).
[22] A. Gordon, On the point spectrum of the one-dimensional Schrödinger operator, Usp. Math. Nauk 31, 257–258 (1976).
[23] M. Hörnquist and M. Johansson, Singular continuous electron spectrum for a class of circle sequences, J. Phys. A 28, 479–495 (1995).
[24] A. Hof, O. Knill, and B. Simon, Singular continuous spectrum for palindromic Schrödinger operators, Commun. Math. Phys. 174, 149–159 (1995).
[25] S. Jitomirskaya and B. Simon, Operators with singular continuous spectrum: III. Almost periodic Schrödinger operators, Commun. Math. Phys. 165, 201–205 (1994).
[26] S. Kotani, Jacobi matrices with random potentials taking finitely many values, Rev. Math. Phys. 1, 129–133 (1989).
[27] Y. Last and B. Simon, Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators, Invent. Math. 135, 329–367 (1999).
[28] J.-M. Luck, Cantor spectra and scaling of gap widths in deterministic aperiodic systems, Phys. Rev. B 39, 5834–5849 (1989).
[29] M. Queffélec, Substitution Dynamical Systems - Spectral Analysis, Lecture Notes in Mathematics, Vol. 1284, Springer, Berlin, Heidelberg, New York (1987).
[30] L. Raymond, A constructive gap labeling for the discrete Schrödinger operator on a quasiperiodic chain, preprint (1997).
[31] J. A. G. Roberts, Escaping orbits in trace maps, Physica A 228, 295–325 (1996).
[32] J. A. G. Roberts and M. Baake, Trace maps as 3D reversible dynamical systems with an invariant, J. Stat. Phys. 74, 829–888 (1994).
[33] D. Shechtman, I. Blech, D. Gratias, and J. W. Cahn, Metallic phase with long-range orientational order and no translational symmetry, Phys. Rev. Lett. 53, 1951–1953 (1984).
[34] A. Sütő, The spectrum of a quasiperiodic Schrödinger operator, Commun. Math. Phys. 111, 409–415 (1987).
[35] A. Sütő, Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian, J. Stat. Phys. 56, 525–531 (1989).

David Damanik
Department of Mathematics 253-37
California Institute of Technology
Pasadena, CA 91125
U.S.A.
email: [email protected]

Jean-Michel Ghez
Centre de Physique Théorique
UPR 7061
Luminy Case 907
F-13288 Marseille Cedex 9, France
and
PHYMAT
Département de Mathématiques
Université de Toulon et du Var
B.P.132
F-83957 La Garde Cedex, France
email: [email protected]

Laurent Raymond
L2MP - UMR 6137
Service 142
Centre Universitaire de Saint-Jérôme
F-13387 Marseille Cedex 20, France
and
Université de Provence
Marseille
France
email: [email protected]

Communicated by Jean Bellissard
submitted 12/10/00, accepted 7/06/01



Nonrelativistic Limit of the Dirac-Fock Equations

M. J. Esteban and E. Séré

Abstract. In this paper, the Hartree-Fock equations are proved to be the non relativistic limit of the Dirac-Fock equations as far as convergence of “stationary states” is concerned. This property is used to derive a meaningful definition of “ground state” energy and “ground state” solutions for the Dirac-Fock model.

1 Introduction

In this paper we prove that solutions of Dirac-Fock equations converge, in a certain sense, towards solutions of the Hartree-Fock equations when the speed of light tends to infinity. This limiting process allows us to define a notion of ground state for the Dirac-Fock equations, valid when the speed of light is large enough.

First of all, we choose units for which $m = \hbar = 1$, where $m$ is the mass of the electron, and $\hbar$ is Planck's constant. We also impose $\frac{e^2}{4\pi\varepsilon_0} = 1$, with $-e$ the charge of an electron, $\varepsilon_0$ the permittivity of the vacuum. The Dirac Hamiltonian can be written as

$$H_c = -i\,c\,\alpha\cdot\nabla + c^2\beta, \tag{1}$$

where $c > 0$ is the speed of light in the above units, $\beta = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}$, $\alpha_k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix}$ $(k = 1, 2, 3)$ and the $\sigma_k$ are the well known Pauli matrices. The operator $H_c$ acts on 4-spinors, i.e. functions from $\mathbb{R}^3$ to $\mathbb{C}^4$, and it is self-adjoint in $L^2(\mathbb{R}^3, \mathbb{C}^4)$, with domain $H^1(\mathbb{R}^3, \mathbb{C}^4)$ and form-domain $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$. Its spectrum is $(-\infty, -c^2] \cup [c^2, +\infty)$.

Let us consider a system of $N$ electrons coupled to a fixed nuclear charge density $eZ\mu$, where $e$ is the charge of the proton, $Z > 0$ the total number of protons and $\mu$ is a probability measure defined on $\mathbb{R}^3$. Note that in the particular case of $m$ point-like nuclei, each one having atomic number $Z_i$ at a fixed location $x_i$,

$$eZ\mu = \sum_{i=1}^{m} eZ_i\,\delta_{x_i} \quad\text{and}\quad Z = \sum_{i=1}^{m} Z_i .$$
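The statement about the spectrum of the free Dirac operator can be illustrated numerically. The following is a minimal sketch, not part of the paper: it builds β and the α_k from the Pauli matrices in pure Python and checks that the Fourier symbol c α·ξ + c²β squares to (c⁴ + c²|ξ|²) times the identity, so that its eigenvalues are ±(c⁴ + c²|ξ|²)^{1/2}. The helper names (`matmul`, `block`, `dirac_symbol`) and the sample values of c and ξ are illustrative assumptions.

```python
# Illustrative check (helper names are ours, not the paper's):
# the symbol of H_c squares to (c^4 + c^2 |xi|^2) * identity.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
NEG_ID2 = [[-1, 0], [0, -1]]
SIGMA = [  # Pauli matrices sigma_1, sigma_2, sigma_3
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
BETA = block(ID2, ZERO2, ZERO2, NEG_ID2)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]

def dirac_symbol(c, xi):
    """Return the 4x4 matrix c * (alpha . xi) + c^2 * beta."""
    return [[c * sum(ALPHA[k][i][j] * xi[k] for k in range(3)) + c * c * BETA[i][j]
             for j in range(4)] for i in range(4)]

c, xi = 3.0, (0.5, -1.25, 2.0)  # arbitrary sample values
h = dirac_symbol(c, xi)
energy_sq = c**4 + c**2 * sum(x * x for x in xi)
h_squared = matmul(h, h)
residual = max(abs(h_squared[i][j] - (energy_sq if i == j else 0))
               for i in range(4) for j in range(4))
print(residual)  # expected to be at the level of rounding errors
```

Since the symbol is Hermitian and squares to a positive multiple of the identity, its eigenvalues have modulus at least c², which is consistent with the spectral gap (−c², c²).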


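In our system of units, the Dirac-Fock equations for such a molecule are given by

$$\left\{\begin{aligned} & H_{c,\Psi}\,\psi_k := H_c\psi_k - Z\Big(\mu * \frac{1}{|x|}\Big)\psi_k + \Big(\rho_\Psi * \frac{1}{|x|}\Big)\psi_k - \int_{\mathbb{R}^3}\frac{R_\Psi(x,y)\,\psi_k(y)}{|x-y|}\,dy = \varepsilon^c_k\,\psi_k \quad (k = 1, \dots, N),\\ & \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \quad \Big(\text{i.e. } \int_{\mathbb{R}^3}\psi_k^*\psi_l = \delta_{kl},\ 1 \le k, l \le N\Big). \end{aligned}\right. \tag{DF$_c$}$$

Here, $\Psi = (\psi_1, \cdots, \psi_N)$, each $\psi_k$ is a 4-spinor in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$ (by bootstrap, $\psi_k$ is also in any $W^{1,p}(\mathbb{R}^3)$ space, $1 \le p < 3/2$), and

$$\rho_\Psi(x) := \sum_{k=1}^{N}\psi_k^*(x)\psi_k(x), \qquad R_\Psi(x,y) := \sum_{k=1}^{N}\psi_k(x)\otimes\psi_k^*(y). \tag{2}$$

We have denoted $\psi^*$ the complex line vector whose components are the conjugates of those of a complex (column) vector $\psi$, and $\psi_1^*\psi_2$ is the inner product of two complex (column) vectors $\psi_1$, $\psi_2$. The $N \times N$ matrix $\mathrm{Gram}_{L^2}\Psi$ is defined by the usual formulas

$$(\mathrm{Gram}_{L^2}\Psi)_{kl} := \int_{\mathbb{R}^3}\psi_k^*(x)\psi_l(x)\,dx. \tag{3}$$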

Finally, $\varepsilon^c_1 \le \dots \le \varepsilon^c_N$ are eigenvalues of $H_{c,\Psi}$. Each one represents the energy of one of the electrons, in the mean field created by the molecule. For physical reasons, we impose $0 < \varepsilon^c_k < c^2$. Note that the scalars $\varepsilon^c_k$ can also be seen as Lagrange multipliers. Indeed, the Dirac-Fock equations are the Euler-Lagrange equations of the Dirac-Fock energy functional

$$\mathcal{E}_c(\Psi) = \sum_{k=1}^{N}\int_{\mathbb{R}^3}\Big(\psi_k^* H_c\psi_k - Z\Big(\mu * \frac{1}{|x|}\Big)\psi_k^*\psi_k\Big)\,dx + \frac12\iint_{\mathbb{R}^3\times\mathbb{R}^3}\frac{\rho_\Psi(x)\rho_\Psi(y) - \mathrm{tr}\big(R_\Psi(x,y)R_\Psi(y,x)\big)}{|x-y|}\,dx\,dy$$

under the constraints $\int_{\mathbb{R}^3}\psi_k^*\psi_l = \delta_{kl}$.

In [6] we proved that under some assumptions on $N$ and $Z$, there exists an infinite sequence of solutions of (DF$_c$). More precisely:

Theorem 1 [6] Let $N < Z + 1$. For any $c > \frac{\pi/2 + 2/\pi}{2}\max(Z, 3N - 1)$, there exists a sequence of solutions of (DF$_c$), $\{\Psi^{c,j}\}_{j \ge 0} \subset \big(H^{1/2}(\mathbb{R}^3)\big)^N$, such that

(i) $0 < \mathcal{E}_c(\Psi^{c,j}) < Nc^2$,

(ii) $\lim_{j\to+\infty}\mathcal{E}_c(\Psi^{c,j}) = Nc^2$,

(iii) $0 < c^2 - \mu_j < \varepsilon_1^{c,j} \le \dots \le \varepsilon_N^{c,j} < c^2 - m_j$, with $\mu_j > m_j > 0$ independent of $c$.

The constant $\frac{\pi/2 + 2/\pi}{2}$ is related to a Hardy-type inequality obtained independently by Tix and Burenkov-Evans (see [15, 3, 16]), and which plays an important role in the proof of Theorem 1. With the physical value $c = 137.037...$ and $Z$ an integer (the total number of protons in the molecule), our conditions become $N \le Z$, $N \le 41$, $Z \le 124$. The constraint $N \le 41$ is technical, and has no physical meaning. Our result was recently improved by Paturel [13], who relaxed the condition on $N$. Paturel obtains the same multiplicity result, assuming only that $N < Z + 1$ and $\frac{\pi/2 + 2/\pi}{2}\max(Z, N) < c$. Taking $c = 137.037...$, Paturel's conditions are $N \le Z \le 124$: they cover all existing neutral atoms. This is an important improvement.

In [6], the critical points $\Psi^{c,j}$ are obtained by a complicated min-max argument involving a family of min-max levels $c_{\nu,p}(F_j)$ (see [6] p. 511). Note that the expression "the critical points" is misleading. Indeed, for each $j$ we can define the min-max level $E^c_{j,DF} := \liminf_{\nu\to 0,\,p\to\infty} c_{\nu,p}(F_j)$, and there exists a critical point $\Psi^{c,j}$ such that $E^c_{j,DF} = \mathcal{E}_c(\Psi^{c,j})$; but we do not know whether this critical point is unique. In the present paper, we do not write the definition of the min-max levels $c_{\nu,p}(F_j)$ in its full detail (the reader is referred to [6] for a complete definition). We just state the minimal information on $E^c_{j,DF}$ needed in the present paper.

Let us denote $E := H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$. Since $\sigma(H_c) = (-\infty, -c^2] \cup [c^2, +\infty)$, the Hilbert space $E$ can be split as $E = E_c^+ \oplus E_c^-$, where $E_c^\pm := \Lambda_c^\pm E$, and $\Lambda_c^\pm := \chi_{\mathbb{R}^\pm}(H_c)$. The projectors $\Lambda_c^\pm$ have a simple expression in the Fourier domain: $\widehat{\Lambda_c^\pm\psi}(\xi) = \hat\Lambda_c^\pm(\xi)\,\hat\psi(\xi)$, with

$$\hat\Lambda_c^\pm(\xi) := \frac12\left(\mathbb{1}_{\mathbb{C}^4} \pm \frac{c\,\alpha\cdot\xi + c^2\beta}{\sqrt{c^4 + c^2|\xi|^2}}\right). \tag{4}$$
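The numerical conditions quoted after Theorem 1 follow from elementary arithmetic, which can be rechecked as below. This is only a sketch: the truncated value 137.037 stands in for the constant written 137.037... in the text, and the names `KAPPA` and `largest_integer_below` are ours.

```python
import math

C_PHYSICAL = 137.037  # truncated value of c quoted in the text
KAPPA = (math.pi / 2 + 2 / math.pi) / 2  # constant of the Hardy-type inequality

def largest_integer_below(bound):
    """Largest integer n with n < bound."""
    return math.ceil(bound) - 1

threshold = C_PHYSICAL / KAPPA
# Condition KAPPA * Z < c gives the largest admissible Z.
z_max = largest_integer_below(threshold)
# Condition KAPPA * (3N - 1) < c gives the largest admissible N.
n_max = largest_integer_below((threshold + 1) / 3)
print(z_max, n_max)  # 124 41
```

The same threshold gives Paturel's condition: max(Z, N) must stay below c/KAPPA, which together with N < Z + 1 reads N ≤ Z ≤ 124.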

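Proposition 2 [6, 13] For every $j \ge 0$, let $V$ be any $(N + j)$ dimensional complex subspace of $E_c^+$. Then, taking the notation of Theorem 1, we have

$$E^c_{j,DF} = \mathcal{E}_c(\Psi^{c,j}) \le \sup_{\substack{\Psi \in (E_c^- \oplus V)^N \\ \mathrm{Gram}_{L^2}\Psi \le \mathbb{1}_N}} \mathcal{E}_c(\Psi). \tag{5}$$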
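The spaces $E_c^\pm$ entering Proposition 2 are defined through the matrices of formula (4). As a sanity check, the sketch below (pure Python, with illustrative helper names and arbitrary sample values that are not taken from the paper) verifies at one frequency ξ that the two matrices of (4) are idempotent, mutually orthogonal, and that the symbol of $H_c$ acts on the range of the "+" matrix as multiplication by $(c^4 + c^2|\xi|^2)^{1/2}$.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dirac_symbol(c, xi):
    """4x4 matrix c * (alpha . xi) + c^2 * beta, written out with sigma . xi."""
    x1, x2, x3 = xi
    s = [[x3, x1 - 1j * x2], [x1 + 1j * x2, -x3]]  # sigma . xi
    return [
        [c * c, 0, c * s[0][0], c * s[0][1]],
        [0, c * c, c * s[1][0], c * s[1][1]],
        [c * s[0][0], c * s[0][1], -c * c, 0],
        [c * s[1][0], c * s[1][1], 0, -c * c],
    ]

def energy(c, xi):
    return math.sqrt(c**4 + c**2 * sum(x * x for x in xi))

def projector(c, xi, sign):
    """Formula (4): (1/2) * (identity + sign * symbol / sqrt(c^4 + c^2 |xi|^2))."""
    h, e = dirac_symbol(c, xi), energy(c, xi)
    return [[0.5 * ((1 if i == j else 0) + sign * h[i][j] / e) for j in range(4)]
            for i in range(4)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

c, xi = 2.0, (0.3, -0.7, 1.1)  # arbitrary sample values
p_plus, p_minus = projector(c, xi, +1), projector(c, xi, -1)
zero = [[0] * 4 for _ in range(4)]

idempotent_error = max_abs_diff(matmul(p_plus, p_plus), p_plus)
orthogonality_error = max_abs_diff(matmul(p_plus, p_minus), zero)
eigen_error = max_abs_diff(matmul(dirac_symbol(c, xi), p_plus),
                           [[energy(c, xi) * v for v in row] for row in p_plus])
print(idempotent_error, orthogonality_error, eigen_error)
```

All three printed numbers should be at the level of rounding errors; the check relies only on the fact that the symbol squares to $(c^4 + c^2|\xi|^2)$ times the identity.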

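In the present paper, we prove three main theorems. We first consider a sequence $c_n \to +\infty$ and a sequence $\{\Psi^n\}_n$ of solutions of (DF$_{c_n}$). For all $n$, $\Psi^n = (\psi_1^n, \dots, \psi_N^n)$, each $\psi_k^n$ is in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, with $\int_{\mathbb{R}^3}\psi_k^*\psi_l\,dx = \delta_{kl}$ and $H_{c_n,\Psi^n}\,\psi_k^n = \varepsilon_k^n\,\psi_k^n$. Using the standard Hardy inequality, one can prove that the functions $\psi_k^n$ are in $H^1(\mathbb{R}^3, \mathbb{C}^4)$ for $c_n$ large enough. We assume that

$$-\infty < \lim_{n\to+\infty}(\varepsilon_1^n - c_n^2) \le \lim_{n\to+\infty}(\varepsilon_N^n - c_n^2) < 0. \tag{6}$$

A (column) vector $\psi \in \mathbb{C}^4$ can be written in block form $\psi = \begin{pmatrix}\varphi\\ \chi\end{pmatrix}$ where $\varphi \in \mathbb{C}^2$ (respectively $\chi \in \mathbb{C}^2$) consists of the two upper (resp. lower) components of $\psi$. This gives the splitting $\psi_k^n = \begin{pmatrix}\varphi_k^n\\ \chi_k^n\end{pmatrix}$ with $\varphi_k^n$ and $\chi_k^n$ in $H^1(\mathbb{R}^3, \mathbb{C}^2)$. Finally, $\Psi^n$ splits as $\begin{pmatrix}\Phi^n\\ \chi^n\end{pmatrix}$, where $\Phi^n := (\varphi_1^n, \dots, \varphi_N^n)$ and $\chi^n := (\chi_1^n, \dots, \chi_N^n)$. Our first result is that $\Psi^n = \begin{pmatrix}\Phi^n\\ \chi^n\end{pmatrix}$ has a subsequence converging, in $H^1$ norm, towards $\bar\Psi = \begin{pmatrix}\bar\Phi\\ 0\end{pmatrix}$, where $\bar\Phi = (\bar\varphi_1, \cdots, \bar\varphi_N) \in \big(H^1(\mathbb{R}^3, \mathbb{C}^2)\big)^N$ is a solution of the Hartree-Fock equations:

$$\left\{\begin{aligned} & H_\Phi\,\varphi_k = -\frac{\Delta\varphi_k}{2} - Z\Big(\mu * \frac{1}{|x|}\Big)\varphi_k + \Big(\rho_\Phi * \frac{1}{|x|}\Big)\varphi_k - \int_{\mathbb{R}^3}\frac{R_\Phi(x,y)\,\varphi_k(y)}{|x-y|}\,dy = \bar\lambda_k\,\varphi_k, \quad k = 1, \dots, N,\\ & \int_{\mathbb{R}^3}\varphi_k^*\varphi_l\,dx = \delta_{kl}, \qquad \bar\lambda_k = \lim_{n\to+\infty}(\varepsilon_k^n - c_n^2). \end{aligned}\right. \tag{HF}$$

Here (as in the Dirac-Fock equations),

$$\rho_\Phi(x) = \sum_{l=1}^{N}\varphi_l^*(x)\varphi_l(x), \qquad R_\Phi(x,y) = \sum_{l=1}^{N}\varphi_l(x)\otimes\varphi_l^*(y).$$

Note that the Hartree-Fock equations are the Euler-Lagrange equations corresponding to critical points in $\big(H^1(\mathbb{R}^3, \mathbb{C}^2)\big)^N$ of the Hartree-Fock energy:

$$\mathcal{E}_{HF}(\Phi) := \sum_{k=1}^{N}\left(\frac12\|\nabla\varphi_k\|_{L^2}^2 - Z\int_{\mathbb{R}^3}\Big(\mu * \frac{1}{|x|}\Big)|\varphi_k|^2\,dx\right) + \frac12\iint_{\mathbb{R}^3\times\mathbb{R}^3}\frac{\rho_\Phi(x)\rho_\Phi(y) - \mathrm{tr}\big(R_\Phi(x,y)R_\Phi(y,x)\big)}{|x-y|}\,dx\,dy, \tag{7}$$

under the constraint $\int_{\mathbb{R}^3}\varphi_k^*\varphi_l = \delta_{kl}$, $k, l = 1, \dots, N$.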

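Theorem 3 Let $N < Z + 1$. Consider a sequence $c_n \to +\infty$ and a sequence $\{\Psi^n\}_n$ of solutions of (DF$_{c_n}$), i.e. $\Psi^n = (\psi_1^n, \cdots, \psi_N^n)$, each $\psi_k^n$ being in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, with $\int_{\mathbb{R}^3}\psi_k^*\psi_l\,dx = \delta_{kl}$ and $H_{c_n,\Psi^n}\,\psi_k^n = \varepsilon_k^n\,\psi_k^n$. Assume that the multipliers $\varepsilon_k^n$, $k = 1, \dots, N$, satisfy (6). Then for $n$ large enough, $\psi_k^n$ is in $H^1(\mathbb{R}^3, \mathbb{C}^4)$, and there exists a solution of (HF), $\bar\Phi = (\bar\varphi_1, \cdots, \bar\varphi_N)$, with negative multipliers, $\bar\lambda_1, \dots, \bar\lambda_N$, such that, after extraction of a subsequence,

$$\lambda_k^n := \varepsilon_k^n - (c_n)^2 \underset{n\to+\infty}{\longrightarrow} \bar\lambda_k, \quad k = 1, \dots, N, \tag{8}$$

$$\psi_k^n = \begin{pmatrix}\varphi_k^n\\ \chi_k^n\end{pmatrix} \underset{n\to+\infty}{\longrightarrow} \begin{pmatrix}\bar\varphi_k\\ 0\end{pmatrix} \quad\text{in } H^1(\mathbb{R}^3, \mathbb{C}^2)\times H^1(\mathbb{R}^3, \mathbb{C}^2), \tag{9}$$

$$\Big\|\chi_k^n + \frac{i}{2c_n}(\sigma\cdot\nabla)\varphi_k^n\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^2)} = O\big(1/(c_n)^3\big), \tag{10}$$

and

$$\mathcal{E}_{c_n}(\Psi^n) - Nc_n^2 \underset{n\to+\infty}{\longrightarrow} \mathcal{E}_{HF}(\bar\Phi). \tag{11}$$

As a particular case, we have

Corollary 4 If $c_n \to +\infty$ and $N$, $Z$, $\mu$ are fixed, then for any $j \ge 0$ the sequence $\{\Psi^{c_n,j}\}_n$ of Theorem 1 satisfies the assumptions of Theorem 3 (see (iii) in Theorem 1). So it is precompact in $\big(H^1(\mathbb{R}^3, \mathbb{C}^4)\big)^N$. Up to extraction of subsequences,

$$\lambda_k^{c_n,j} := \varepsilon_k^{c_n,j} - c_n^2 \underset{n\to+\infty}{\longrightarrow} \bar\lambda_k^j < 0, \quad k = 1, \dots, N, \tag{12}$$

$$\Psi^{c_n,j} \underset{n\to+\infty}{\longrightarrow} \begin{pmatrix}\bar\Phi^j\\ 0\end{pmatrix} \quad\text{in } \big(H^1(\mathbb{R}^3, \mathbb{C}^2)\big)^N\times\big(H^1(\mathbb{R}^3, \mathbb{C}^2)\big)^N, \tag{13}$$

and $\bar\Phi^j = (\bar\varphi_1^j, \cdots, \bar\varphi_N^j)$ is a solution of the Hartree-Fock equations with multipliers $\bar\lambda_1^j, \cdots, \bar\lambda_N^j$. Moreover,

$$\mathcal{E}_{c_n}(\Psi^{c_n,j}) - Nc_n^2 \underset{n\to+\infty}{\longrightarrow} \mathcal{E}_{HF}(\bar\Phi^j). \tag{14}$$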

Particular solutions of the Hartree-Fock equations are the minimizers of $\mathcal{E}_{HF}(\Phi)$ under the constraints $\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N$. They are called ground states. Their existence was proved by Lieb and Simon [10] under the assumption $N < Z + 1$, but the uniqueness question remains unsolved (see also [11] for the existence of excited states).

It is difficult to define the notion of ground state for the Dirac-Fock model, since $\mathcal{E}_c$ has no minimum under the constraints $\int_{\mathbb{R}^3}\psi_k^*\psi_l = \delta_{kl}$. Our second main result asserts that "the" first solution $\Psi^{c,0}$ of (DF$_c$) found in [6], whose energy level is denoted $E^c_{0,DF}$, can be considered, in some (weak) sense, as a ground state for (DF$_c$). Indeed, $E^c_{0,DF} - Nc^2$ converges to the minimum of $\mathcal{E}_{HF}$ as $c$ goes to infinity. Moreover, for $c$ large the multipliers $\varepsilon_k^{c,0}$ associated to $\Psi^{c,0}$ are the $N$ smallest positive eigenvalues of the mean-field operator $H_{c,\Psi^{c,0}}$.

Theorem 5 Let $N < Z + 1$ and $c$ sufficiently large. With the above notations,

$$E^c_{0,DF} = \min_{\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N}\mathcal{E}_{HF}(\Phi) + Nc^2 + o(1)_{c\to+\infty}. \tag{15}$$

Moreover, for any subsequence $\{\Psi^{c_n,0}\}_n$ converging in $\big(H^1(\mathbb{R}^3, \mathbb{C}^4)\big)^N$ to some $\begin{pmatrix}\bar\Phi^0\\ 0\end{pmatrix}$, $\bar\Phi^0$ is a ground state of the Hartree-Fock model, i.e.

$$\mathcal{E}_{HF}(\bar\Phi^0) = \min_{\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N}\mathcal{E}_{HF}(\Phi). \tag{16}$$

Furthermore, for $c$ large, the eigenvalues corresponding to $\Psi^{c,0}$ in (DF$_c$), $\varepsilon_1^{c,0}, \dots, \varepsilon_N^{c,0}$, are the smallest positive eigenvalues of the linear operator $H_{c,\Psi^{c,0}}$ and the $(N + 1)$-th positive eigenvalue of this operator is strictly larger than $\varepsilon_N^{c,0}$.

Finally, we are able to show that, for $c$ large enough, the function $\Psi^{c,0}$ can be viewed as an electronic ground state for the Dirac-Fock equations in the following sense: it minimizes the Dirac-Fock energy among all electronic configurations which are orthogonal to the "Dirac sea".

Theorem 6 Fix $N$, $Z$ with $N < Z + 1$ and take $c$ sufficiently large. Then $\Psi^{c,0}$ is a solution of the following minimization problem:

$$\inf\{\mathcal{E}_c(\Psi)\ ;\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N,\ \Lambda_\Psi^-\Psi = 0\} \tag{17}$$

where $\Lambda_\Psi^- = \chi_{(-\infty,0)}(H_{c,\Psi})$ is the negative spectral projector of the operator $H_{c,\Psi}$, and $\Lambda_\Psi^-\Psi := (\Lambda_\Psi^-\psi_1, \cdots, \Lambda_\Psi^-\psi_N)$.

The constraint $\Lambda_\Psi^-\Psi = 0$ has a physical meaning. Indeed, according to Dirac's original ideas, the vacuum consists of infinitely many electrons which completely fill up the negative space of $H_{c,\Psi}$: these electrons form the "Dirac sea". So, by the Pauli exclusion principle, additional electronic states should be in the positive space of the mean-field Hamiltonian $H_{c,\Psi}$.

The proof of Theorem 6 will be given in Section 4. This proof uses some other interesting min-max characterizations of $\Psi^{c,0}$ (see Lemma 9).

2 The nonrelativistic limit

This section is devoted to the proof of Theorem 3. We first notice that when $N < Z + 1$, $N$, $Z$ fixed, and $c$ is sufficiently large, any solution of (DF$_c$) is actually in $\big(H^1(\mathbb{R}^3)\big)^N$. This follows from the fact that for $\nu$ small, the operator $H_1 - \frac{\nu}{|x|}$ is essentially self-adjoint with domain $H^1(\mathbb{R}^3)$ (see [14]). We can also obtain a priori estimates on $H^1$ norms:

Lemma 7. Fix $N, Z \in \mathbb{Z}^+$, take $c$ large enough, and let $\Psi^c$ be a solution of (DF$_c$). If the multipliers $\varepsilon_k^c$ associated to $\Psi^c$ satisfy $0 \le \varepsilon_k^c \le c^2$ $(k = 1, \dots, N)$, then $\Psi^c \in \big(H^1(\mathbb{R}^3, \mathbb{C}^4)\big)^N$, and the following estimate holds

$$\|\Psi^c\|_{L^2}^2 + \|\nabla\Psi^c\|_{L^2}^2 \le K.$$

The constant $K$ is independent of $c$ (for $c$ large).

Proof. The normalization constraint $\mathrm{Gram}_{L^2}\Psi^c = \mathbb{1}_N$ implies

$$\|\Psi^c\|_{L^2}^2 = N. \tag{18}$$

Using the (DF$_c$) equation and the standard Hardy inequality

$$\int_{\mathbb{R}^3}\frac{|u|^2}{|x|^2} \le 4\int_{\mathbb{R}^3}|\nabla u|^2, \tag{19}$$

one easily proves that $\Psi^c$ is in $H^1$, and satisfies:

$$\begin{aligned}(H_c\Psi^c, H_c\Psi^c) &= c^4\|\Psi^c\|_{L^2}^2 + c^2\|\nabla\Psi^c\|_{L^2}^2\\ &\le c^4\|\Psi^c\|_{L^2}^2 + \ell\,(Z^2 + N^2)\,\|\nabla\Psi^c\|_{L^2}^2 + \ell\,c^2\max(N, Z)\,\|\nabla\Psi^c\|_{L^2},\end{aligned} \tag{20}$$

for some $\ell > 0$ independent of $N$, $Z$ and $c$. The estimates (18) and (20) prove the lemma.

Proof of Theorem 3. Let us split the spinors $\psi_k^n : \mathbb{R}^3 \to \mathbb{C}^4$ in blocks of upper and lower components:

$$\psi_k^n = \begin{pmatrix}\varphi_k^n\\ \chi_k^n\end{pmatrix}, \quad\text{with } \varphi_k^n, \chi_k^n : \mathbb{R}^3 \to \mathbb{C}^2.$$

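We denote $L := -i\,(\sigma\cdot\nabla)$. Then we can rewrite (DF$_{c_n}$) in the following way:

$$\left\{\begin{aligned} & c_n L\chi_k^n - Z\Big(\mu * \frac{1}{|x|}\Big)\varphi_k^n + \Big(\sum_{l=1}^{N}\big(|\varphi_l^n|^2 + |\chi_l^n|^2\big) * \frac{1}{|x|}\Big)\varphi_k^n + (c_n)^2\varphi_k^n\\ & \qquad - \sum_{l=1}^{N}\varphi_l^n(x)\int_{\mathbb{R}^3}\frac{(\varphi_l^n)^*(y)\varphi_k^n(y) + (\chi_l^n)^*(y)\chi_k^n(y)}{|x-y|}\,dy = \varepsilon_k^n\varphi_k^n,\\ & c_n L\varphi_k^n - Z\Big(\mu * \frac{1}{|x|}\Big)\chi_k^n + \Big(\sum_{l=1}^{N}\big(|\varphi_l^n|^2 + |\chi_l^n|^2\big) * \frac{1}{|x|}\Big)\chi_k^n - (c_n)^2\chi_k^n\\ & \qquad - \sum_{l=1}^{N}\chi_l^n(x)\int_{\mathbb{R}^3}\frac{(\varphi_l^n)^*(y)\varphi_k^n(y) + (\chi_l^n)^*(y)\chi_k^n(y)}{|x-y|}\,dy = \varepsilon_k^n\chi_k^n,\\ & \int_{\mathbb{R}^3}\big((\varphi_k^n)^*\varphi_l^n + (\chi_k^n)^*\chi_l^n\big)\,dx = \delta_{kl}. \end{aligned}\right. \tag{21}$$

Note that $\|L\chi\|_{L^2} = \|\nabla\chi\|_{L^2}$ for all $\chi \in H^1(\mathbb{R}^3, \mathbb{C}^2)$. So, dividing by $c_n$ the first equation of (21), we get

$$\|\nabla\chi_k^n\|_{L^2(\mathbb{R}^3, \mathbb{C}^2)} = O(1/c_n). \tag{22}$$

Dividing by $2(c_n)^2$ the second equation of (21), and using the fact that $\varepsilon_k^n - (c_n)^2$ is a bounded sequence, we get

$$\Big\|\chi_k^n - \frac{1}{2c_n}L\varphi_k^n\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^2)} = O\Big(\frac{1}{(c_n)^2}\sum_{l=1}^{N}\|\chi_l^n\|_{H^1(\mathbb{R}^3, \mathbb{C}^2)}\Big). \tag{23}$$

The estimate (23) together with Lemma 7 implies

$$\|\chi_k^n\|_{L^2(\mathbb{R}^3, \mathbb{C}^2)} = O(1/c_n). \tag{24}$$

Combining this with (22), we obtain

$$\|\chi_k^n\|_{H^1(\mathbb{R}^3, \mathbb{C}^2)} = O(1/c_n). \tag{25}$$

So $\sum_{l=1}^{N}\|\chi_l^n\|_{H^1(\mathbb{R}^3, \mathbb{C}^2)} = O(1/c_n)$, and (23) gives the estimate

$$\Big\|\chi_k^n - \frac{1}{2c_n}L\varphi_k^n\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^2)} = O\big(1/(c_n)^3\big). \tag{26}$$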
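The way the Laplacian of the Hartree-Fock operator emerges from the first equation of (21) can be made explicit. The computation below is a sketch added for the reader, not part of the original argument; the remainder $\eta_k^n$ is a name introduced here, and the only ingredients are the Pauli relations $\sigma_j\sigma_k + \sigma_k\sigma_j = 2\delta_{jk}$ and the estimate (26).

```latex
% The Pauli relations give (\sigma\cdot\nabla)^2 = \Delta, hence
L^2 = \bigl(-i\,\sigma\cdot\nabla\bigr)^2
    = -\sum_{j,k=1}^{3}\sigma_j\sigma_k\,\partial_j\partial_k
    = -\Delta .
% Write \chi^n_k = \frac{1}{2c_n}L\varphi^n_k + \eta^n_k,
% with \|\eta^n_k\|_{L^2} = O(c_n^{-3}) by (26). Then
c_n L\chi^n_k
    = \tfrac12 L^2\varphi^n_k + c_n L\eta^n_k
    = -\tfrac12\Delta\varphi^n_k + c_n L\eta^n_k ,
\qquad
\|c_n L\eta^n_k\|_{H^{-1}} \le c_n\,\|\eta^n_k\|_{L^2} = O(c_n^{-2}) .
```

Moving $(c_n)^2\varphi_k^n$ to the right-hand side of the first equation of (21) then replaces $\varepsilon_k^n$ by $\varepsilon_k^n - (c_n)^2$, while the terms involving $\chi_l^n$ in the potentials are small by (25).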

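Now, the first equation of (21), combined with (26), implies

$$\left\{\begin{aligned} & -\frac{\Delta\varphi_k^n}{2} - Z\Big(\mu * \frac{1}{|x|}\Big)\varphi_k^n + \Big(\sum_{l=1}^{N}|\varphi_l^n|^2 * \frac{1}{|x|}\Big)\varphi_k^n - \sum_{l=1}^{N}\varphi_l^n(x)\int_{\mathbb{R}^3}\frac{(\varphi_l^n)^*(y)\varphi_k^n(y)}{|x-y|}\,dy = \lambda_k^n\varphi_k^n + h_k^n,\\ & \int_{\mathbb{R}^3}(\varphi_k^n)^*\varphi_l^n = \delta_{kl} + r_{kl}^n, \end{aligned}\right.$$

with $\lambda_k^n := \varepsilon_k^n - (c_n)^2$, and

$$\lim_{n\to+\infty}\|h_k^n\|_{H^{-1}(\mathbb{R}^3)} = 0, \qquad \lim_{n\to+\infty}|r_{kl}^n| = 0 \quad\text{for all } k, l \in \{1, \dots, N\}.$$

Therefore $\Phi^n := (\varphi_1^n, \dots, \varphi_N^n)$ is a Palais-Smale sequence for the Hartree-Fock problem, and the multipliers $\lambda_k^n$ satisfy $\lim_{n\to+\infty}\lambda_k^n < 0$. At this point, we just invoke an argument used in [11] to obtain the convergence in $H^1$ norm of some subsequence $\{\Phi^{n'}\}$ towards $\bar\Phi = (\bar\varphi_1, \cdots, \bar\varphi_N)$, a solution of the Hartree-Fock equations

$$\left\{\begin{aligned} & H_{\bar\Phi}\,\bar\varphi_k = \bar\lambda_k\,\bar\varphi_k, \quad k = 1, \dots, N,\\ & \int_{\mathbb{R}^3}\bar\varphi_k^*\bar\varphi_l = \delta_{kl}, \end{aligned}\right.$$

where $\bar\lambda_k = \lim_{n'\to+\infty}\lambda_k^{n'}$.

Finally, let us prove that $\mathcal{E}_{c_n}(\Psi^n) - N(c_n)^2$ converges to $\mathcal{E}_{HF}(\bar\Phi)$. From Lemma 7 and the estimate (26), one easily gets

$$\mathcal{E}_{c_n}(\Psi^n) - Nc_n^2 = \mathcal{E}_{HF}(\Phi^n) + O\big(1/(c_n)^2\big). \tag{27}$$

Since $\Phi^n$ converges in $H^1$ norm to $\bar\Phi$, the energy level $\mathcal{E}_{HF}(\Phi^n)$ converges to $\mathcal{E}_{HF}(\bar\Phi)$. So (27) implies the desired convergence. This ends the proof of Theorem 3.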

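3 Ground state for Dirac-Fock equations in the nonrelativistic limit

The aim of this section is to prove Theorem 5. The estimate given in Proposition 2 on the energy $\mathcal{E}_c(\Psi^{c,j})$ and the expression of $\hat\Lambda_c^\pm$ given in (4), will be crucial.

Proof of Theorem 5. By Corollary 4, for any sequence $c_n$ going to infinity, $\Psi^{c_n,0}$ is precompact in $H^1$ norm. If it converges, its limit is of the form $\begin{pmatrix}\bar\Phi^0\\ 0\end{pmatrix}$, and $\mathcal{E}_{c_n}(\Psi^{c_n,0}) - N(c_n)^2$ converges to $\mathcal{E}_{HF}(\bar\Phi^0)$. As a consequence,

$$\lim_{c\to+\infty}\big(\mathcal{E}_c(\Psi^{c,0}) - Nc^2\big) \ge \inf_{\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N}\mathcal{E}_{HF}(\Phi). \tag{28}$$

In order to prove (15) and (16) of Theorem 5, we just have to show that

$$\lim_{c\to+\infty}\big(\mathcal{E}_c(\Psi^{c,0}) - Nc^2\big) \le \inf_{\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N}\mathcal{E}_{HF}(\Phi). \tag{29}$$

Take $\Phi = (\varphi_1, \cdots, \varphi_N) \in \big(H^1(\mathbb{R}^3, \mathbb{C}^2)\big)^N$, with $\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N$. Let $V_c$ be the complex subspace of $E_c^+$ defined by

$$V_c := \mathrm{Span}\Big\{\Lambda_c^+\begin{pmatrix}\varphi_1\\ 0\end{pmatrix}, \dots, \Lambda_c^+\begin{pmatrix}\varphi_N\\ 0\end{pmatrix}\Big\}. \tag{30}$$

From formula (4) and Lebesgue's convergence theorem, one easily gets, for $k = 1, \dots, N$,

$$\lim_{c\to+\infty}\Big\|\Lambda_c^-\begin{pmatrix}\varphi_k\\ 0\end{pmatrix}\Big\|_{H^1} = 0. \tag{31}$$

So, for $c$ sufficiently large, we have

$$\dim V_c = N. \tag{32}$$

Hence, by (5),

$$E^c_{0,DF} = \mathcal{E}_c(\Psi^{c,0}) \le \sup_{\substack{\Psi \in (E_c^- \oplus V_c)^N \\ \mathrm{Gram}_{L^2}\Psi \le \mathbb{1}_N}}\mathcal{E}_c(\Psi). \tag{33}$$

Let $\Psi^+ \in (E_c^+)^N$, $\Psi^- \in (E_c^-)^N$ such that $\mathrm{Gram}_{L^2}(\Psi^+ + \Psi^-) \le \mathbb{1}_N$. By the concavity property of $\mathcal{E}_c$ in the $E_c^-$ direction (see [6], Lemma 2.2), if $c$ is large enough, we have

$$\begin{aligned}\mathcal{E}_c(\Psi^+ + \Psi^-) &\le \mathcal{E}_c(\Psi^+) + \mathcal{E}_c'(\Psi^+)\cdot\Psi^- - \frac14\sum_{k=1}^{N}\big(\psi_k^-, \sqrt{-c^2\Delta + c^4}\,\psi_k^-\big)\\ &\le \mathcal{E}_c(\Psi^+) + M\|\Psi^-\|_{L^2} - \frac{c^2}{4}\|\Psi^-\|_{L^2}^2,\end{aligned} \tag{34}$$

for some constant $M > 0$ independent of $c$. Hence, for $c$ large,

$$E^c_{0,DF} \le \sup_{\Psi^+ \in D(V_c)}\mathcal{E}_c(\Psi^+) + o(1)_{c\to+\infty}, \tag{35}$$

where $D(V_c) := \{\Psi^+ \in (V_c)^N\ ;\ \mathrm{Gram}_{L^2}\Psi^+ \le \mathbb{1}_N\}$.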

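If $c$ is large enough, it follows from Hardy's inequality (19) that the map $\Psi^+ \to \mathcal{E}_c(\Psi^+)$ is strictly convex on the convex set

$$A^+ := \{\Psi^+ \in (E_c^+)^N\ ;\ \mathrm{Gram}_{L^2}\Psi^+ \le \mathbb{1}_N\}.$$

Indeed, its second derivative at any point $\Psi^+$ of $A^+$ is of the form

$$\mathcal{E}_c''(\Psi^+)[d\Psi^+]^2 = 2\sum_{k=1}^{N}\big(d\psi_k, \sqrt{c^4 - c^2\Delta}\,d\psi_k\big)_{L^2} + Q(d\Psi^+)$$

with $Q$ a quadratic form on $\big(H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)\big)^N$ bounded independently of $c$ and $\Psi^+ \in A^+$. As a consequence, $\sup_{\Psi^+ \in D(V_c)}\mathcal{E}_c(\Psi^+)$ is achieved by an extremal point $\Psi^+_{max}$ of the convex set $D(V_c) = A^+ \cap (V_c)^N$. Being extremal in $D(V_c)$, the point $\Psi^+_{max}$ satisfies

$$\mathrm{Gram}_{L^2}\Psi^+_{max} = \mathbb{1}_N. \tag{36}$$

Since $\psi^+_{k,max} \in V_c$, there is a matrix $A = (a_{kl})_{1 \le k,l \le N}$ such that, for all $l$, $\psi^+_{l,max} = \sum_{1 \le k \le N} a_{kl}\,\Lambda_c^+\begin{pmatrix}\varphi_k\\ 0\end{pmatrix}$. Then

$$A^*\,\mathrm{Gram}_{L^2}\Big(\Lambda_c^+\begin{pmatrix}\Phi\\ 0\end{pmatrix}\Big)\,A = \mathrm{Gram}_{L^2}\Psi^+_{max} = \mathbb{1}_N. \tag{37}$$

Using the $U(N)$ invariance of $D(V_c)$ and $\mathcal{E}_c$, and the polar decomposition of square matrices, one can assume, without restricting the generality, that $A = A^*$ and $A$ is positive definite. Recalling that $\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N$, we see, from (31), that $\mathrm{Gram}_{L^2}\Big(\Lambda_c^+\begin{pmatrix}\Phi\\ 0\end{pmatrix}\Big) = \mathbb{1}_N + o(1)$. So (37) implies $A^2 = \mathbb{1}_N + o(1)$, hence $A = \mathbb{1}_N + o(1)$. Combining this with (31), we get

$$\Big\|\psi^+_{k,max} - \begin{pmatrix}\varphi_k\\ 0\end{pmatrix}\Big\|_{H^1} = o(1)_{c\to+\infty}.$$

Now, since $\psi^+_{k,max} \in E_c^+$, $H_c\,\psi^+_{k,max} = \sqrt{c^4 - c^2\Delta}\;\psi^+_{k,max}$. But

$$\sqrt{c^4 - c^2\Delta} \le c^2 - \frac{\Delta}{2}.$$

This inequality is easily obtained in the Fourier domain: it follows from $\sqrt{1 + x} \le 1 + \frac{x}{2}$ $(\forall x \ge 0)$. So we get

$$\sum_{k=1}^{N}\big(H_c\,\psi^+_{k,max}, \psi^+_{k,max}\big)_{L^2} \le Nc^2 + \frac12\sum_{k=1}^{N}\|\nabla\psi^+_{k,max}\|_{L^2}^2.$$

Combining this with (31), we find

$$\mathcal{E}_c(\Psi^+_{max}) \le Nc^2 + \mathcal{E}_{HF}(\Phi) + o(1)_{c\to+\infty}. \tag{38}$$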
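The operator inequality used just before (38) reduces to a one-line computation on the Fourier multipliers. The following worked step is added as an illustration under the convention that $-\Delta$ has Fourier symbol $|\xi|^2$:

```latex
% Scalar inequality: (1 + x/2)^2 = 1 + x + x^2/4 \ge 1 + x for x \ge 0,
% so \sqrt{1+x} \le 1 + x/2. With x = |\xi|^2/c^2:
\sqrt{c^4 + c^2|\xi|^2}
   = c^2\sqrt{1 + \tfrac{|\xi|^2}{c^2}}
   \le c^2\Bigl(1 + \tfrac{|\xi|^2}{2c^2}\Bigr)
   = c^2 + \tfrac{|\xi|^2}{2} .
% By Plancherel, for every \psi \in H^1(\mathbb{R}^3,\mathbb{C}^4):
\bigl(\psi, \sqrt{c^4 - c^2\Delta}\,\psi\bigr)_{L^2}
   \le c^2\,\|\psi\|_{L^2}^2 + \tfrac12\,\|\nabla\psi\|_{L^2}^2 .
```

Summing over $k$ and using the normalization (36), which gives $\sum_k\|\psi^+_{k,max}\|_{L^2}^2 = N$, yields the bound $Nc^2 + \frac12\sum_k\|\nabla\psi^+_{k,max}\|_{L^2}^2$ stated above.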

Finally, (35) and (38) imply

$$E^c_{0,DF} \le \mathcal{E}_c(\Psi^+_{max}) + o(1)_{c\to+\infty} \le Nc^2 + \mathcal{E}_{HF}(\Phi) + o(1)_{c\to+\infty}. \tag{39}$$

Since $\Phi$ is arbitrary, (39) implies (29). The formulas (15), (16) of Theorem 5 are thus proved.

We now check the last assertion about the $\varepsilon_k^{c,0}$, $k = 1, \dots, N$, being the smallest eigenvalues of the operator $H_{c,\Psi^{c,0}}$ for $c$ large. By Corollary 4, we can translate this statement in the language of sequences. We take a sequence $c_n \to +\infty$ such that $\{\Psi^{c_n,0}\}_n$ converges in $\big(H^1(\mathbb{R}^3, \mathbb{C}^4)\big)^N$ to some $\begin{pmatrix}\bar\Phi^0\\ 0\end{pmatrix}$. Let $H^n := H_{c_n,\Psi^{c_n,0}}$ and $H_\infty := H_{\bar\Phi^0}$. We have $H^n\psi_k^{c_n,0} = \varepsilon_k^n\,\psi_k^{c_n,0}$ and $H_\infty\bar\varphi_k^0 = \bar\lambda_k\,\bar\varphi_k^0$, with

$$0 < \varepsilon_1^n \le \dots \le \varepsilon_N^n < (c_n)^2, \qquad \bar\lambda_1 \le \dots \le \bar\lambda_N < 0, \qquad \bar\lambda_k = \lim_{n\to+\infty}\big(\varepsilon_k^n - (c_n)^2\big).$$

Let us denote $e_1^n \le \dots \le e_i^n \le \cdots$ the sequence of eigenvalues of $H^n$, in the interval $(0, c_n^2)$, counted with multiplicity. Similarly, we shall denote $\bar\nu_1 \le \dots \le \bar\nu_i \le \cdots$ the sequence of eigenvalues of $H_\infty$ in the interval $(-\infty, 0)$, counted with multiplicity. Let $z \in \mathbb{C}\setminus\sigma(H_\infty)$. Then for $n$ large enough, $z + (c_n)^2 \in \mathbb{C}\setminus\sigma(H^n)$, and the resolvent

$$R_n\big(z + (c_n)^2\big) := \Big(\big(z + (c_n)^2\big)I - H^n\Big)^{-1}$$

converges in norm towards the operator $\bar L(z) : \begin{pmatrix}\varphi\\ \chi\end{pmatrix} \to \begin{pmatrix}\bar R(z)\varphi\\ 0\end{pmatrix}$, where $\bar R(z) := \big(zI - H_\infty\big)^{-1}$ is the resolvent of $H_\infty$. So, by the standard spectral theory,

$$\lim_{n\to+\infty}\big(e_i^n - (c_n)^2\big) = \bar\nu_i \quad\text{for all } i \ge 1.$$

We know that $\bar\Phi^0$ is a ground state of the Hartree-Fock model. So a result proved in [1] tells us that $\bar\nu_k = \bar\lambda_k$ for all $1 \le k \le N$, and $\bar\nu_{N+1} > \bar\lambda_N$. But $(\varepsilon_N^n - (c_n)^2)$ converges to $\bar\lambda_N$, and $(e_{N+1}^n - (c_n)^2)$ converges to $\bar\nu_{N+1}$, as $n$ goes to infinity. So, for $n$ large enough, $e_{N+1}^n > \varepsilon_N^n$, hence $\varepsilon_k^n = e_k^n$ for all $1 \le k \le N$. This ends the proof of Theorem 5.

4 Proof of Theorem 6.

In this section, both $\Phi$ and $\Psi$ will denote $N$-tuples of 4-spinors (i.e. $N$-tuples of functions from $\mathbb{R}^3$ into $\mathbb{C}^4$). As explained in the Introduction of the present paper, "the" solution $\Psi^{c,0}$ was obtained in [6] by a complicated min-max argument. Note that we are not able to prove that this min-max argument leads to a unique critical point (this is not surprising: even in the simpler case of nonrelativistic Hartree-Fock, no uniqueness result is known for "the" ground state). However, the min-max level $E^c_{0,DF} = \mathcal{E}_c(\Psi^{c,0})$ is well defined and unique. For $c$ large, we will show that the definition of $E^c_{0,DF}$ can be simplified.

First of all, we introduce the notion of projector "ε-close to $\Lambda_c^+$", where $\Lambda_c^+ = \frac12|H_c|^{-1}\big(H_c + |H_c|\big)$ is the positive free-energy projector.

Definition 8 Let $P^+$ be an orthogonal projector in $L^2(\mathbb{R}^3, \mathbb{C}^4)$, whose restriction to $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$ is a bounded operator on $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$. Given $\varepsilon > 0$, $P^+$ is ε-close to $\Lambda_c^+$ if and only if, for all $\psi \in H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$,

$$\Big\|\big(-c^2\Delta + c^4\big)^{1/4}\big(P^+ - \Lambda_c^+\big)\psi\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)} \le \varepsilon\,\Big\|\big(-c^2\Delta + c^4\big)^{1/4}\psi\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)}.$$

An obvious example of projector ε-close to $\Lambda_c^+$ is $\Lambda_c^+$ itself. More interesting examples will be given below. Let us now give a min-max principle associated to $P^+$:

Lemma 9 Fix $N$, $Z$ with $N < Z + 1$. Take $c > 0$ large enough, and $P^+$ a projector ε-close to $\Lambda_c^+$, for $\varepsilon > 0$ small enough. Let $P^- = \mathbb{1}_{L^2} - P^+$, and define

$$E(P^+) := \inf_{\substack{\Phi^+ \in (P^+H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Phi^+ = \mathbb{1}_N}}\ \sup_{\substack{\Psi \in (P^-H^{1/2}\,\oplus\,\mathrm{Span}(\Phi^+))^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N}}\mathcal{E}_c(\Psi).$$

Then $E(P^+)$ does not depend on $P^+$ and $\mathcal{E}_c(\Psi^{c,0}) \le E(P^+)$.

Remark In the case $N = 1$, $\mathcal{E}_c$ is the quadratic form $(\psi, H\psi)_{L^2}$ associated to the operator $H = H_c - Z\mu * \frac{1}{|x|}$. Then $E(\Lambda_c^+)$ coincides with the min-max level $\lambda_1(V)$, defined in [4], for $V = -Z\mu * \frac{1}{|x|}$. By Theorem 3.1 of [4], if $c > \frac{\pi/2 + 2/\pi}{2}$ then $\lambda_1(V)$ is the first positive eigenvalue of $H$.

Proof of Lemma 9. The idea behind this lemma is inspired by [2]. Note that, under our assumptions, $E(P^+) < Nc^2(1 + K\varepsilon)$ for some $K > 0$ independent of $c$ and $\varepsilon$. This follows from arguments similar to those used in the proof of Lemma 5.3 of [6]. In [6] the free energy projectors $\Lambda_c^\pm$ were used. With these projectors, it was seen that $E(\Lambda_c^+) < Nc^2$ (thanks to a careful choice of $\Phi^+$). When $P^+$ is ε-close to $\Lambda_c^+$, we then get $E(P^+) < Nc^2(1 + K\varepsilon)$.

To continue the proof of the lemma we perform a change of physical units. In mathematical language, this change corresponds to a dilation in space by the factor $c$, and to dividing the energies by $c^2$. Let $(d_c\varphi)(x) = c^{3/2}\varphi(cx)$ and define the rescaled energy

$$\begin{aligned}E_c(\Phi) :={}& \frac{1}{c^2}\,\mathcal{E}_c(d_c\Phi)\\ ={}& \sum_{k=1}^{N}\int_{\mathbb{R}^3}\Big(\big(\varphi_k, (-i\alpha\cdot\nabla + \beta)\varphi_k\big) - \frac{Z}{c}\Big(\tilde\mu * \frac{1}{|x|}\Big)|\varphi_k|^2\Big) + \frac{1}{2c}\iint_{\mathbb{R}^3\times\mathbb{R}^3}\frac{\rho_\Phi(x)\rho_\Phi(y) - |R_\Phi(x,y)|^2}{|x-y|}\,d^3x\,d^3y\end{aligned} \tag{40}$$

where $\tilde\mu(E) = \mu(c^{-1}E)$ for any Borel subset $E$ of $\mathbb{R}^3$.
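The powers of $c$ in (40) come from three elementary scaling identities, written out here as a sketch for the reader (not part of the original text), with $(d_c\varphi)(x) = c^{3/2}\varphi(cx)$ and the substitution $y = cx$:

```latex
% d_c is unitary on L^2:
\|d_c\varphi\|_{L^2}^2 = \int_{\mathbb{R}^3} c^3\,|\varphi(cx)|^2\,dx
                       = \int_{\mathbb{R}^3} |\varphi(y)|^2\,dy .
% Conjugation of the free Dirac operator:
H_c\,(d_c\varphi)(x)
   = c^{3/2}\bigl[-i\,c\cdot c\,(\alpha\cdot\nabla\varphi)(cx) + c^2\beta\,\varphi(cx)\bigr]
   = c^2\,\bigl(d_c\,(-i\alpha\cdot\nabla + \beta)\varphi\bigr)(x) .
% Coulomb potential of the nuclei, with \tilde\mu(E) = \mu(c^{-1}E):
\Bigl(\mu * \tfrac{1}{|\cdot|}\Bigr)\!\bigl(\tfrac{y}{c}\bigr)
   = \int \frac{d\mu(z)}{|y/c - z|}
   = c\int \frac{d\mu(z)}{|y - cz|}
   = c\,\Bigl(\tilde\mu * \tfrac{1}{|\cdot|}\Bigr)(y) .
```

After division by $c^2$, the kinetic term becomes independent of $c$, the nuclear attraction carries a factor $Z/c$, and the same substitution applied to the kernel $1/|x - y|$ gives the factor $1/(2c)$ in front of the two-body term.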

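The interest of this rescaled energy $E_c$ is that for $c$ large and $\mathrm{Gram}_{L^2}\Psi \le \mathbb{1}_N$, we have

$$E_c(\Psi) = \sum_{k=1}^{N}\int_{\mathbb{R}^3}\big(\psi_k, (-i\alpha\cdot\nabla + \beta)\psi_k\big) + O\Big(\frac{1}{c}\,\|\Psi\|^2_{(H^{1/2})^N}\Big). \tag{41}$$

Let us denote $\tilde P^\pm := d_c^{-1}\circ P^\pm\circ d_c$, $\tilde\Lambda^\pm := d_c^{-1}\circ\Lambda_c^\pm\circ d_c = \chi_{\mathbb{R}^\pm}\big(-i\alpha\cdot\nabla + \beta\big)$. Note that $\tilde\Lambda^\pm$ does not depend on $c$. Now, $P^+$ is ε-close to $\Lambda_c^+$ if and only if

$$\Big\|\big(-\Delta + 1\big)^{1/4}\big(\tilde P^+ - \tilde\Lambda^+\big)\psi\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)} \le \varepsilon\,\Big\|\big(-\Delta + 1\big)^{1/4}\psi\Big\|_{L^2(\mathbb{R}^3, \mathbb{C}^4)}, \quad \forall\psi \in H^{1/2}(\mathbb{R}^3, \mathbb{C}^4). \tag{42}$$

We denote $\Phi\bullet A$ the right action of an $N \times N$ matrix $A = (a_{kl})_{1 \le k,l \le N}$ on an $N$-tuple $\Phi = (\varphi_1, \dots, \varphi_N) \in \big(L^2(\mathbb{R}^3, \mathbb{C}^4)\big)^N$. More precisely,

$$(\Phi\bullet A) := \Big(\sum_{k=1}^{N}a_{k1}\varphi_k, \dots, \sum_{k=1}^{N}a_{kN}\varphi_k\Big). \tag{43}$$

Given $\Phi^+ = (\varphi_1^+, \dots, \varphi_N^+) \in \big(\tilde P^+H^{1/2}\big)^N$ such that $\mathrm{Gram}_{L^2}\Phi^+ = \mathbb{1}_N$, and $\Phi^- \in \big(\tilde P^-H^{1/2}\big)^N$, we define

$$g_{\Phi^+}(\Phi^-) := (\Phi^+ + \Phi^-)\bullet\big[\mathrm{Gram}_{L^2}(\Phi^+ + \Phi^-)\big]^{-1/2} = (\Phi^+ + \Phi^-)\bullet\big[\mathbb{1}_N + \mathrm{Gram}_{L^2}\Phi^-\big]^{-1/2}. \tag{44}$$

We obtain a smooth map $g_{\Phi^+}$, from $\big(\tilde P^-H^{1/2}\big)^N$ to

$$\Sigma_{\Phi^+} := \Big\{\Psi \in \big(\tilde P^-H^{1/2}\oplus\mathrm{Span}(\varphi_1^+, \dots, \varphi_N^+)\big)^N\ /\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N\Big\}.$$

In fact, the values of $g_{\Phi^+}$ lie in the following subset of $\Sigma_{\Phi^+}$:

$$\Sigma'_{\Phi^+} := \big\{\Psi \in \Sigma_{\Phi^+}\ /\ \mathrm{Gram}_{L^2}\tilde P^+\Psi > 0\big\}.$$

Now, take an arbitrary $\Psi \in \Sigma'_{\Phi^+}$. Then there is an invertible $N \times N$ matrix $B$ such that $\tilde P^+\Psi = \Phi^+\bullet B$. So we may write $\Psi\bullet B^{-1} = \Phi^+ + \tilde P^-\Psi\bullet B^{-1}$. As a consequence,

$$g_{\Phi^+}\big(\tilde P^-\Psi\bullet B^{-1}\big) = (\Psi\bullet B^{-1})\bullet\big[\mathrm{Gram}_{L^2}(\Psi\bullet B^{-1})\big]^{-1/2}.$$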
Vol. 2, 2001


955

One easily computes

GramL2 (Ψ • B −1 ) = (B ∗ )−1 GramL2 Ψ B −1 = (B B ∗ )−1 . Hence gΦ+ (P − Ψ • B −1 ) = (Ψ • B −1 ) • (B B ∗ )1/2 = Ψ • (B −1 (B B ∗ )1/2 ) , and finally

Ψ = gΦ+ (P− Ψ • B −1 ) • U ,

where U := (B B ∗ )−1/2 B ∈ U(N ) is the unitary matrix appearing in the polar decomposition of B . So we have proved that gΦ+ (Φ− ) • U . ΣΦ+ =

e

1

Φ− ∈(P − H 2 )N U ∈ U (N )
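The matrix identities used in this computation can be checked directly from the definition (43) of the right action. The lines below are a short verification added for convenience, using only (43) and the fact that $BB^*$ is positive definite when $B$ is invertible:

```latex
% Gram matrix and composition rule for the right action (43):
\mathrm{Gram}_{L^2}(\Phi\bullet A) = A^{*}\,\bigl(\mathrm{Gram}_{L^2}\Phi\bigr)\,A ,
\qquad
(\Phi\bullet A)\bullet C = \Phi\bullet(AC) .
% With \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N and A = B^{-1}:
\mathrm{Gram}_{L^2}(\Psi\bullet B^{-1}) = (B^{-1})^{*}B^{-1} = (BB^{*})^{-1} .
% Unitarity of U = (BB^*)^{-1/2} B:
U U^{*} = (BB^{*})^{-1/2}\,B B^{*}\,(BB^{*})^{-1/2} = \mathbb{1}_N .
% Hence the constraint is preserved by the action of U:
\mathrm{Gram}_{L^2}\bigl(g_{\Phi^+}(\Phi^-)\bullet U\bigr) = U^{*}\,\mathbb{1}_N\,U = \mathbb{1}_N .
```

In particular $\big(B^{-1}(BB^*)^{1/2}\big)^{-1} = (BB^*)^{-1/2}B = U$, which is the step from the formula for $g_{\Phi^+}(\tilde P^-\Psi\bullet B^{-1})$ to $\Psi = g_{\Phi^+}(\tilde P^-\Psi\bullet B^{-1})\bullet U$.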

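Now, $E_c$ is invariant under the $U(N)$ action "$\bullet$", and $\Sigma'_{\Phi^+}$ is dense in $\Sigma_{\Phi^+}$ for the norm of $\big(H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)\big)^N$. Hence

$$\sup_{\substack{\Psi \in (\tilde P^-H^{1/2}\,\oplus\,\mathrm{Span}(\Phi^+))^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N}}E_c(\Psi) = \sup_{\Phi^- \in (\tilde P^-H^{1/2})^N}E_c\big(g_{\Phi^+}(\Phi^-)\big). \tag{45}$$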

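We now prove Lemma 9 in three steps.

Step 1. Let $\Phi^+ \in \big(\tilde P^+H^{1/2}\big)^N$ be such that $\mathrm{Gram}_{L^2}\Phi^+ = \mathbb{1}_N$ and such that $E_c(\Phi^+) \le N + \delta$, for some $\delta > 0$ small. For $\varepsilon$ small and $c$ large, there is a unique $\Phi^- \in \big(\tilde P^-H^{1/2}\big)^N$ maximizing $E_c\circ g_{\Phi^+}$ and lying in a small neighborhood of 0. If we denote $k(\Phi^+)$ this maximizer, the map $k$ is smooth from

$$S_\delta^+ = \Big\{\Phi^+ \in \big(\tilde P^+H^{1/2}\big)^N\ ;\ \mathrm{Gram}_{L^2}\Phi^+ = \mathbb{1}_N,\ E_c(\Phi^+) \le N + \delta\Big\}$$

to $\big(\tilde P^-H^{1/2}\big)^N$, and equivariant for the $U(N)$ action.

Proof of Step 1. Take $r > 0$. For $\varepsilon$, $\delta$ small and $c$ large, if $\Phi^+ \in S_\delta^+$, $\Phi^- \in \big(\tilde P^-H^{1/2}\big)^N$, and $\|\Phi^-\|_{H^{1/2}}$ is not smaller than $r$, then

$$E_c\big(g_{\Phi^+}(\Phi^-)\big) < N - \delta,$$

by (41). On the other hand, for $c$ large enough, using (41) once again, one has

$$E_c\big(g_{\Phi^+}(0)\big) = E_c(\Phi^+) \ge N - \frac{\delta}{2}.$$

So, if we define $V_r := \big\{\Phi^- \in \big(\tilde P^-H^{1/2}\big)^N\ ;\ \|\Phi^-\|_{H^{1/2}} \le r\big\}$, no maximizer of $E_c\circ g_{\Phi^+}$ can be outside $V_r$. Moreover, choosing $r$ small, and then taking $c$ large and $\varepsilon$ small, the map $\Phi^- \in V_r \longmapsto E_c\circ g_{\Phi^+}(\Phi^-)$ is strictly concave. Indeed, its second derivative at $\Phi^- \in V_r$ is very close in norm to the negative form

$$\Psi^- \in \big(\tilde P^-H^{1/2}\big)^N \longmapsto -2\sum_{i=1}^{N}\|\psi_i^-\|^2_{H^{1/2}} - 2\sum_{1 \le i,j \le N}\big(\varphi_j^+, \varphi_i^+\big)_{H^{1/2}}\big(\psi_i^-, \psi_j^-\big)_{L^2}.$$

Step 1 immediately follows from these facts.

Step 2. The min-max level $E(P^+)$ does not depend on $P^+$.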

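Proof of Step 2. Take two projectors $P_1^+$, $P_2^+$, both ε-close to $\Lambda_c^+$. For $i = 1, 2$, and $\Phi_i^+ \in \big(\tilde P_i^+H^{1/2}\big)^N$, with $\mathrm{Gram}_{L^2}\Phi_i^+ = \mathbb{1}_N$ and $E_c(\Phi_i^+) \le N + \delta$, let

$$J^i(\Phi_i^+) := \max_{\substack{\Phi^- \in (\tilde P_i^-H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Phi^- = \mathbb{1}_N}}E_c\big(g^i_{\Phi_i^+}(\Phi^-)\big) = E_c\circ g^i_{\Phi_i^+}\big(k^i(\Phi_i^+)\big). \tag{46}$$

Here, $g^i_{\Phi_i^+}$ and $k^i$ are the maps associated to $P_i^+$ in Step 1.

By Ekeland's variational principle [5], there is a minimizing sequence $\big(\Phi^+_{1,n}\big)_{n \ge 0}$ for $J^1$, such that $(J^1)'\big(\Phi^+_{1,n}\big) \underset{n\to+\infty}{\longrightarrow} 0$ in $\big(H^{-1/2}\big)^N$. Let $\Psi_n := g^1_{\Phi^+_{1,n}}\big(k^1(\Phi^+_{1,n})\big)$. Then $\Psi_n$ is a Palais-Smale sequence for $E_c$ in the manifold

$$\Sigma := \Big\{\Psi \in \big(H^{1/2}\big)^N\ ;\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N\Big\},$$

with $E_c(\Psi_n) \ge N - \frac{\delta}{2}$, where $\delta > 0$ is the constant of the first step. So $\mathrm{Gram}_{L^2}\tilde P_2^+\Psi_n > 0$. We denote

$$\left\{\begin{aligned} \Phi^+_{2,n} &:= \tilde P_2^+\Psi_n\bullet\big[\mathrm{Gram}_{L^2}\tilde P_2^+\Psi_n\big]^{-1/2},\\ \Phi^-_{2,n} &:= \tilde P_2^-\Psi_n\bullet\big[\mathrm{Gram}_{L^2}\tilde P_2^+\Psi_n\big]^{-1/2}. \end{aligned}\right. \tag{47}$$

One easily checks that $\Psi_n = g^2_{\Phi^+_{2,n}}\big(\Phi^-_{2,n}\big)$. Since $E_c(\Psi_n) \ge N - \frac{\delta}{2}$, we have $\|\Phi^-_{2,n}\|_{H^{1/2}} \le r$, where $r > 0$ is the same as in the proof of step 1. Since $\Psi_n$ is a Palais-Smale sequence for $E_c$, the derivative of $E_c\circ g^2_{\Phi^+_{2,n}}$ at the point $\Phi^-_{2,n}$ converges to 0 as $n$ goes to infinity. So, by the concavity properties of $E_c\circ g^2_{\Phi^+_{2,n}}$ in the domain

$$V_{2,r} := \Big\{\Phi^- \in \big(\tilde P_2^-H^{1/2}\big)^N\ ;\ \|\Phi^-\|_{H^{1/2}} \le r\Big\}$$

(see the proof of step 1), we get

$$\big\|\Phi^-_{2,n} - k^2(\Phi^+_{2,n})\big\|_{H^{1/2}} \underset{n\to+\infty}{\longrightarrow} 0 \quad\text{and}\quad E_c(\Psi_n) - J^2\big(\Phi^+_{2,n}\big) \underset{n\to+\infty}{\longrightarrow} 0.$$

As a consequence,

$$E(P_1^+) = \inf_{\substack{\Phi_1^+ \in (\tilde P_1^+H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Phi_1^+ = \mathbb{1}_N}}J^1(\Phi_1^+) \ge \inf_{\substack{\Phi_2^+ \in (\tilde P_2^+H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Phi_2^+ = \mathbb{1}_N}}J^2(\Phi_2^+) = E(P_2^+).$$

Since 1, 2 play symmetric roles in the above arguments, we conclude that $E(P^+)$ does not depend on $P^+$, for $c$ large enough and $\varepsilon$ small enough.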

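Step 3. $\mathcal{E}_c(\Psi^{c,0}) \le E(\Lambda_c^+)$, where $\Psi^{c,0}$ is "the" first solution of (D-F) found in [E-S].

Proof of Step 3. For $c$ large enough, if $\Psi^- \in \big(\Lambda_c^-H^{1/2}\big)^N$ satisfies $\mathrm{Gram}_{L^2}\Psi^- \le \mathbb{1}_N$, it follows from Hardy's inequality that the map $\Psi^+ \to \mathcal{E}_c(\Psi^+ + \Psi^-)$ is strictly convex on

$$W(\Psi^-) := \big\{\Psi^+ \in \big(\Lambda_c^+H^{1/2}\big)^N\ ;\ \mathrm{Gram}_{L^2}(\Psi^+ + \Psi^-) \le \mathbb{1}_N\big\}.$$

As a consequence, for an arbitrary $N$-dimensional subspace $V$ of $\Lambda_c^+H^{1/2}$,

$$S_V(\Psi^-) := \sup_{\Psi^+ \in W(\Psi^-)\cap V^N}\mathcal{E}_c(\Psi^+ + \Psi^-)$$

is achieved by an extremal point $\Psi^+_{max}$ of the convex set $W(\Psi^-)\cap V^N$. Being extremal, $\Psi^+_{max}$ must satisfy the constraints $\mathrm{Gram}_{L^2}(\Psi^+_{max} + \Psi^-) = \mathbb{1}_N$. So we have

$$\sup_{\substack{\Psi \in (\Lambda_c^-H^{1/2}\oplus V)^N \\ \mathrm{Gram}_{L^2}\Psi \le \mathbb{1}_N}}\mathcal{E}_c(\Psi) = \sup_{\substack{\Psi^- \in (\Lambda_c^-H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi^- \le \mathbb{1}_N}}S_V(\Psi^-) = \sup_{\substack{\Psi \in (\Lambda_c^-H^{1/2}\oplus V)^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N}}\mathcal{E}_c(\Psi).$$

By Proposition 2,

$$\mathcal{E}_c(\Psi^{c,0}) \le \sup_{\substack{\Psi \in (\Lambda_c^-H^{1/2}\oplus V)^N \\ \mathrm{Gram}_{L^2}\Psi \le \mathbb{1}_N}}\mathcal{E}_c(\Psi).$$

Finally we get, for $c$ large,

$$\mathcal{E}_c(\Psi^{c,0}) \le \inf_{\substack{\Phi^+ \in (\Lambda_c^+H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Phi^+ = \mathbb{1}_N}}\ \sup_{\substack{\Psi \in (\Lambda_c^-H^{1/2}\,\oplus\,\mathrm{Span}(\Phi^+))^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N}}\mathcal{E}_c(\Psi) = E(\Lambda_c^+).$$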

(The correspondence between $\Phi^+$ and $V$ is $V = \mathrm{Span}(\Phi^+)$.) This ends the proof of Step 3 and of Lemma 9.

Thanks to Lemma 9, we are able to write the following inequalities for $c$ large, and $P^+$ ε-close to $\Lambda_c^+$, $\varepsilon$ small:

$$E(P^+) = E(\Lambda_c^+) \ge \mathcal{E}_c(\Psi^{c,0}) \ge \inf_{\substack{\Psi \text{ solution of (DF}_c) \\ \Lambda_\Psi^-\Psi = 0}}\mathcal{E}_c(\Psi) \ge \inf_{\substack{\Psi \in (H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \\ \Lambda_\Psi^-\Psi = 0}}\mathcal{E}_c(\Psi). \tag{48}$$

As announced before, we now give some important examples of projectors ε-close to $\Lambda_c^+$:

N Lemma 10 Fix N, Z, and take c large enough. Then, for any Φ ∈ H 1/2 , with

is ε-close to Λ+ = χ H GramL2 Φ ≤ 1IN , the projector Λ+ c,Φ (0,+∞) c . Φ Proof of Lemma 10. We adapt a method of Griesemer, Lewis, Siedentop [7] to the Hamiltonian H c,Φ . Once again, it is more convenient to work in a system of units such that H c,Φ becomes

1 ˜ : ψ "→ dc−1 ◦ H c,Φ ◦ dc (ψ) = −iα · ∇ + β ψ − Z µ ∗ ψ H c,Φ c |x| 1

1 1 ψ(y) + ρΦ˜ ∗ ψ− dy RΦ˜ (x, y) c |x| c R3 |x − y| with µ (E) = µ(c−1 E), Φ(x) = c−3/2 Φ(c−1 x).

˜ ,Λ + := χ(0,∞) (H1 ), K ˜ := + := χ(0,∞) H Denoting H1 := −iα · ∇ + β, Λ ˜ Φ c,Φ Φ

c Hc,Φ˜ − H1 , we find, as in the proof of Lemma 1 of [7],

+∞ −1

2 −1 1 + + ˜ −z 2 K ˜ ˜ +z 2 ΛΦ˜ − Λ ψ = H1 KΦ˜ H H dz H12 +z 2 ψ, Φ c,Φ c,Φ πc 0


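and for any $\chi \in L^2(\mathbb{R}^3, \mathbb{C}^4)$, following [7] (proof of Lemma 3), we get
$$\left| \left\langle \chi ,\ (-\Delta + 1)^{1/4} \big(\tilde\Lambda^+_{\tilde\Phi} - \tilde\Lambda^+\big)\psi \right\rangle_{L^2} \right| \le \frac{M}{c}\, \|\chi\|_{L^2}\, \big\|(-\Delta + 1)^{1/4}\psi\big\|_{L^2}$$
for $c$ large enough ($M$ is a constant independent of $c$). As a consequence, if $c$ is large enough and bigger than $M/\varepsilon$, then $\Lambda_\Phi^+$ is $\varepsilon$-close to $\Lambda_c^+$. This ends the proof of Lemma 10.

Now, to end the proof of Theorem 6, we just need the following result: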

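Lemma 11 Fix $N$, $Z$ and take $c > 0$ large enough. If $\Phi \in (H^{1/2})^N$, $\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N$, $\Lambda_\Phi^- \Phi = 0$ and $E_c(\Phi) \le N c^2$, then
$$E_c(\Phi) = \max\left\{ E_c(\Psi) \,;\ \Psi \in \big(\Lambda_\Phi^- H^{1/2} \oplus \mathrm{Span}(\Phi)\big)^N ,\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \right\}.$$

Proof of Lemma 11. If $\Lambda_\Phi^- \Phi = 0$ and $\mathrm{Gram}_{L^2}\Phi = \mathbb{1}_N$, then 0 is a critical point of the map
$$\Psi^- \in \big(\Lambda_\Phi^- H^{1/2}\big)^N \ \mapsto\ E_c\big(g_\Phi(\Psi^-)\big),$$
with $g_\Phi(\Psi^-) = (\Phi + \Psi^-) \bullet \big(\mathbb{1}_N + \mathrm{Gram}_{L^2}\Psi^-\big)^{-1/2}$. Take $\varepsilon > 0$ small. By Lemma 10, $\Lambda_\Phi^+$ is $\varepsilon$-close to $\Lambda_c^+$ for $c$ large enough. From the proof of Lemma 9 (Step 1), there is a unique critical point of $E_c \circ g_\Phi$ in a small neighborhood $V_r$ of 0 in $\Lambda_\Phi^-(H^{1/2})$ and this critical point is the unique maximizer of $E_c \circ g_\Phi$ in $\Lambda_\Phi^-(H^{1/2})$. So, 0 is this maximizer. This proves Lemma 11.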

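Let us explain why Theorem 6 is now proved. We know that, for $c$ large enough,
$$N c^2 > E(\Lambda_c^+) \ge E_c(\Psi^{c,0}) \ge \inf_{\substack{\Psi \in (H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \\ \Lambda_\Psi^- \Psi = 0}} E_c(\Psi),$$
hence
$$\inf_{\substack{\Psi \in (H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \\ \Lambda_\Psi^- \Psi = 0}} E_c(\Psi) \;=\; \inf_{\substack{\Psi \in (H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \\ \Lambda_\Psi^- \Psi = 0 \\ E_c(\Psi) \le N c^2}} E_c(\Psi).$$
Take $\varepsilon > 0$. By Lemma 10, for any $\Psi \in (H^{1/2})^N$ with $\mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N$, the projector $\Lambda_\Psi^+$ is $\varepsilon$-close to $\Lambda_c^+$, if $c$ is large. Hence $E(\Lambda_\Psi^+) = E(\Lambda_c^+)$ (by Lemma 9), if we have chosen $\varepsilon$ small enough. But if $\Psi$ also satisfies $\Lambda_\Psi^- \Psi = 0$ and $E_c(\Psi) \le N c^2$, then, from Lemma 11 and from the definition of $E(\Lambda_\Psi^+)$, we have $E(\Lambda_c^+) = E(\Lambda_\Psi^+) \le E_c(\Psi)$. So
$$E(\Lambda_c^+) \le \inf_{\substack{\Psi \in (H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \\ \Lambda_\Psi^- \Psi = 0}} E_c(\Psi),$$
and therefore,
$$E(\Lambda_c^+) = E_c(\Psi^{c,0}) = \inf_{\substack{\Psi \in (H^{1/2})^N \\ \mathrm{Gram}_{L^2}\Psi = \mathbb{1}_N \\ \Lambda_\Psi^- \Psi = 0}} E_c(\Psi),$$
and Theorem 6 is proved.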

5 Acknowledgements The authors are grateful to Boris Buffoni for explaining to them the work [2], and suggesting that it might be useful in the study of the Dirac-Fock functional. The proof of Lemma 9 is inspired by this paper.

References

[1] V. Bach, E.H. Lieb, M. Loss, J.P. Solovej, There are no unfilled shells in unrestricted Hartree-Fock theory, Phys. Rev. Lett. 72(19), 2981–2983 (1994).

[2] B. Buffoni, L. Jeanjean, Minimax characterization of solutions for a semilinear elliptic equation with lack of compactness, Ann. Inst. H. Poincaré 10(4), 377–404 (1993).

[3] V.I. Burenkov, W.D. Evans, On the evaluation of the norm of an integral operator associated with the stability of one-electron atoms, Proc. Roy. Soc. Edinburgh, sect. A 128(5), 993–1005 (1998).

[4] J. Dolbeault, M.J. Esteban, E. Séré, Variational characterization for eigenvalues of Dirac operators, Cal. Var. and PDE 10(4), 321–347 (2000).

[5] I. Ekeland, On the variational principle, J. Math. Anal. 47, 324–353 (1974).

[6] M.J. Esteban, E. Séré, Solutions for the Dirac-Fock equations for atoms and molecules, Comm. Math. Phys. 203, 499–530 (1999).

[7] M. Griesemer, R.T. Lewis, H. Siedentop, A minimax principle for eigenvalues in spectral gaps: Dirac operators with Coulomb potentials, Doc. Math. 4, 275–283 (1999).

[8] I.W. Herbst, Spectral theory of the operator $(p^2 + m^2)^{1/2} - ze^2/r$, Comm. Math. Phys. 53, 285–294 (1977).

[9] Y.-K. Kim, Relativistic self-consistent field theory for closed-shell atoms, Phys. Rev. 154, 17–39 (1967).

[10] E. H. Lieb, B. Simon, The Hartree-Fock theory for Coulomb systems, Comm. Math. Phys. 53, 185–194 (1977).

[11] P.-L. Lions, Solutions of Hartree-Fock equations for Coulomb systems, Comm. Math. Phys. 109, 33–97 (1987).

[12] A. Messiah, Mécanique quantique. Dunod, 1965.

[13] E. Paturel, Solutions of the Dirac-Fock Equations without Projector, Ann. Henri Poincaré 1, 1123–1157 (2000).

[14] B. Thaller, The Dirac equation. Springer-Verlag, 1992.

[15] C. Tix, Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall, Bull. London Math. Soc. 30(3), 283–290 (1998).

[16] C. Tix, Lower bound for the ground state energy of the no-pair Hamiltonian, Phys. Lett. B 405, 293–296 (1997).

Maria J. Esteban and Eric Séré
CEREMADE (UMR C.N.R.S. 7534)
Université Paris IX-Dauphine
Place du Maréchal de Lattre de Tassigny
F-75775 Paris Cedex 16
France

Communicated by Rafael D. Benguria
submitted 3/01/01, accepted 15/05/01




Convergent Perturbative Solutions of the Schrödinger Equation for Two-Level Systems with Hamiltonians Depending Periodically on Time

J. C. A. Barata

Abstract. We study the Schrödinger equation of a class of two-level systems under the action of a periodic time-dependent external field in the situation where the energy difference $2\epsilon$ between the free energy levels is sufficiently small with respect to the strength of the external interaction. Under suitable conditions we show that this equation has a solution in terms of converging power series expansions in $\epsilon$. In contrast to other expansion methods, such as the Dyson expansion, the method we present is not plagued by the presence of “secular terms”. Due to this feature we were able to prove uniform convergence of the Fourier series involved in the computation of the wave functions and to prove absolute convergence of the expansions leading to the “secular frequency” and to the coefficients of the Fourier expansion of the wave function.

I Introduction

This paper is dedicated to the mathematical study of a class of periodically time-dependent two-level systems. It is well known that the usual perturbative approach, based, for instance, on the Dyson series, leads to difficulties involving secular terms and (for quasi-periodic interactions) small divisors. In [1] a new algorithm has been devised to overcome the secular terms in the general case of quasi-periodic interactions. Roughly speaking it involves an inductive “renormalization” of an effective field introduced via an exponential Ansatz (the function $g$ to be introduced below). Here we apply that algorithm to the case of periodic interactions in the strong coupling regime, a situation of particular interest in several branches of physics (for references, see [2] or below). As we will show, our method not only recovers the Floquet form of the solution of the time-dependent Schrödinger equation, but also allows the computation of the secular frequency and of the Fourier coefficients in terms of explicit convergent $\epsilon$-expansions, which is a distinguishing feature of our algorithm compared to other expansion methods.

Let us describe more precisely the systems we will study. Consider the following Hamiltonian for a two-level system under the action of an external time-dependent field
$$H_1(t) = H_0 + H_I(t) = \epsilon\sigma_3 - f(t)\sigma_1 \tag{I.1}$$

and the corresponding Schrödinger equation$^1$
$$i\partial_t \Psi(t) = H_1(t)\Psi(t), \tag{I.2}$$
with $\Psi : \mathbb{R} \to \mathbb{C}^2$. Here $f(t)$ is a function of time $t$ and $\epsilon \in \mathbb{R}$ is a parameter representing half of the energy difference between the “free” (i.e., for $f \equiv 0$) energy levels. The symbols $\sigma_1$, $\sigma_2$ and $\sigma_3$ denote the Pauli matrices in their usual representations:
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
The “interaction Hamiltonian” $H_I(t) := -f(t)\sigma_1$ represents a time-dependent external interaction coupled to the system inducing transitions between the two eigenstates of the free Hamiltonian $H_0 := \epsilon\sigma_3$. Since the Schrödinger equation (I.2) can be read as
$$i\partial_\tau \Psi_0(\tau) = \left( \sigma_3 - \epsilon^{-1} f(\epsilon^{-1}\tau)\, \sigma_1 \right) \Psi_0(\tau), \tag{I.3}$$
where $\tau \equiv \epsilon t$ and $\Psi_0(\tau) \equiv \Psi(\epsilon^{-1}\tau)$, the situation where $\epsilon$ is “small” characterizes the “strong coupling” and, for periodic $f$, “large frequency” regime [3, 4].

The system described above is certainly one of the simplest non-trivial time-dependent quantum systems and the study of the solutions of (I.2) is of basic importance for many physical applications as, e.g., in quantum optics, in the theory of spin resonance or in problems of quantum tunneling. Equation (I.2) has been analyzed by many authors in various approximations. In the wide literature on the subject of time-dependent two-level systems we mention the pioneering works of Rabi [5], of Bloch and Siegert [6] and of Autler and Townes [7]. In [7] the authors studied the solutions of (I.2) for the case where, in our notation, $f(t) = -2\beta\cos(\omega t)$, $\beta \in \mathbb{R}$. Their work is exact but non-rigorous and involved a combination of the method of continued fractions, for relating the coefficients of the Fourier decomposition of the wave functions, with numerical analysis. No proof has been exhibited that the continued fractions converge and further unjustified restrictions have been made in order to transform some transcendental equations into low order algebraic equations, which are then solved either exactly or, especially for strong fields, numerically.
For related treatments using different approaches and for related systems, see [8, 9, 10, 11, 12, 13, 14] and other references therein. For a recent review on the mathematical theory of quantum systems submitted to time-dependent periodic and quasi-periodic perturbations see [3]. For an introduction to the subjects of “quantum chaos” and quantum stability, two subjects strongly linked to the problems considered here, see [15]. See also [4] for results on the spectral analysis of the quasi-energy operator for two-level atoms in the quasi-periodic case.

In [1] we studied the system described by (I.2) in the situation where $f$ is a quasi-periodic function of time and a special perturbative expansion (power series expansion in $\epsilon$) has been developed. Its main virtue is to be free of the so-called “secular terms”, i.e., polynomials in $t$ that appear order by order in perturbation theory and that spoil the analysis of convergence of the series and the proofs of quasi-periodicity of the perturbative terms. Although we have not been able to prove convergence of our power series expansion in the general case where $f$ is quasi-periodic, it has been established that the coefficients of the expansion are indeed quasi-periodic functions of time.

One of the obstacles found in the attempt to prove convergence of our expansion in the general case of quasi-periodic $f$ is the presence of “small denominators”. This typical feature of perturbative approximations for solutions of differential equations with quasi-periodic coefficients is well known as one of the main sources of problems in the mathematically precise treatment of such equations. As far as proofs of convergence are concerned, it should therefore be expected that better results could be obtained if the function $f$ were restricted to be periodic since, in this case, no problems with small denominators should afflict our expansions. In the present paper we show how the difficulties analyzed in [1] can be circumvented in the case of periodic $f$ and establish convergence of our perturbative expansion for that case.

$^1$ For simplicity we shall adopt here a system of units with $\hbar = 1$.

*

By a time-independent unitary transformation, representing a rotation of $\pi/2$ around the 2-axis, we may replace $H_1(t)$ by
$$H_2(t) := e^{-i\pi\sigma_2/4}\, H_1(t)\, e^{i\pi\sigma_2/4} = \epsilon\sigma_1 + f(t)\sigma_3 \tag{I.4}$$
and the Schrödinger equation becomes
$$i\partial_t \Phi(t) = H_2(t)\Phi(t), \tag{I.5}$$
with
$$\Phi(t) := e^{-i\pi\sigma_2/4}\, \Psi(t). \tag{I.6}$$
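The rotation in (I.4) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original paper; the helper names (`matmul`, `combine`, `rotation`, `h1`, `h2`) are illustrative assumptions, and the check uses the closed form $e^{-i\theta\sigma_2} = \cos\theta\,\mathbb{1} - i\sin\theta\,\sigma_2$.

```python
import math

# Pauli matrices as nested lists of complex numbers.
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(x, a, y, b):
    """Return x*a + y*b for scalars x, y and 2x2 matrices a, b."""
    return [[x * a[i][j] + y * b[i][j] for j in range(2)] for i in range(2)]


def rotation(angle):
    """exp(-i*angle*sigma_2) = cos(angle)*1 - i*sin(angle)*sigma_2."""
    return combine(math.cos(angle), IDENTITY, -1j * math.sin(angle), SIGMA2)


def h1(eps, f):
    """H_1 = eps*sigma_3 - f*sigma_1, as in (I.1), at a fixed time."""
    return combine(eps, SIGMA3, -f, SIGMA1)


def h2(eps, f):
    """H_2 = eps*sigma_1 + f*sigma_3, as in (I.4), at a fixed time."""
    return combine(eps, SIGMA1, f, SIGMA3)


def rotated_h1(eps, f):
    """exp(-i*pi*sigma_2/4) * H_1 * exp(+i*pi*sigma_2/4)."""
    left = rotation(math.pi / 4)
    right = rotation(-math.pi / 4)
    return matmul(matmul(left, h1(eps, f)), right)


def max_abs_diff(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Calling `max_abs_diff(rotated_h1(eps, f), h2(eps, f))` for sample values of `eps` and `f` should return a number at the level of floating-point rounding, since the identity holds pointwise in $t$.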

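The theorem below, proven in [1], presents the solution of the Schrödinger equation (I.5) in terms of particular solutions of a generalized Riccati equation.

Theorem I.1 Let $f : \mathbb{R} \to \mathbb{R}$, $f \in C^1(\mathbb{R})$ and $\epsilon \in \mathbb{R}$ and let $g : \mathbb{R} \to \mathbb{C}$, $g \in C^1(\mathbb{R})$, be a particular solution of the generalized Riccati equation
$$G' - iG^2 - 2ifG + i\epsilon^2 = 0. \tag{I.7}$$
Then, the function $\Phi : \mathbb{R} \to \mathbb{C}^2$ given by
$$\Phi(t) = \begin{pmatrix} \phi_+(t) \\ \phi_-(t) \end{pmatrix} = U(t)\Phi(0) = U(t, 0)\Phi(0), \tag{I.8}$$
where
$$U(t) := \begin{pmatrix} R(t)\,\big(1 + i g(0) S(t)\big) & -i\epsilon R(t) S(t) \\[4pt] -i\epsilon\, \overline{R(t)}\; \overline{S(t)} & \overline{R(t)}\,\big(1 - i\, \overline{g(0)}\; \overline{S(t)}\big) \end{pmatrix}, \tag{I.9}$$
with
$$R(t) := \exp\left( -i \int_0^t \big(f(\tau) + g(\tau)\big)\, d\tau \right) \tag{I.10}$$
and
$$S(t) := \int_0^t R(\tau)^{-2}\, d\tau, \tag{I.11}$$
is a solution of (I.5) with initial value $\Phi(0) = \begin{pmatrix} \phi_+(0) \\ \phi_-(0) \end{pmatrix} \in \mathbb{C}^2$.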
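As a sanity check of Theorem I.1, one can look at the constant case $f \equiv F_0$, where everything is explicit. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it takes the constant root $g = -F_0 + \sqrt{F_0^2 + \epsilon^2}$ of (I.7), builds $R$, $S$ and $U$ from (I.9)–(I.11) (with the complex conjugates in the second row), and compares with the exact propagator $\exp(-it(\epsilon\sigma_1 + F_0\sigma_3))$. The function names are hypothetical, and $F_0^2 + \epsilon^2 > 0$ is assumed.

```python
import cmath
import math


def propagator_from_riccati(t, eps, f0):
    """U(t) of (I.9) for constant f = f0, using a constant solution g of (I.7)."""
    big_omega = math.sqrt(f0 * f0 + eps * eps)   # assumed non-zero
    g = -f0 + big_omega                          # root of g^2 + 2*f0*g - eps^2 = 0
    r = cmath.exp(-1j * (f0 + g) * t)            # (I.10) with constant integrand
    # (I.11): integral of R(tau)^(-2) = exp(2i*big_omega*tau) from 0 to t.
    s = (cmath.exp(2j * big_omega * t) - 1) / (2j * big_omega)
    rc, sc = r.conjugate(), s.conjugate()
    gc = complex(g).conjugate()
    return [
        [r * (1 + 1j * g * s), -1j * eps * r * s],
        [-1j * eps * rc * sc, rc * (1 - 1j * gc * sc)],
    ]


def exact_propagator(t, eps, f0):
    """exp(-i*t*(eps*sigma_1 + f0*sigma_3)) in closed form."""
    big_omega = math.sqrt(f0 * f0 + eps * eps)
    c, s = math.cos(big_omega * t), math.sin(big_omega * t)
    return [
        [c - 1j * f0 * s / big_omega, -1j * eps * s / big_omega],
        [-1j * eps * s / big_omega, c + 1j * f0 * s / big_omega],
    ]


def max_entry_diff(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

In this constant case the frequency $\sqrt{F_0^2 + \epsilon^2}$ plays the role of the secular frequency, and the two matrices agree up to rounding for any real `t`.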

For a proof of Theorem I.1, see [1]. Let us briefly describe some of the ideas leading to Theorem I.1 and to other results of [1]. As we saw in [1], the solutions of the Schrödinger equation (I.5) can be studied in terms of the solutions of a particular complex version of Hill’s equation:
$$\phi''(t) + \left( i f'(t) + \epsilon^2 + f(t)^2 \right) \phi(t) = 0. \tag{I.12}$$
In fact, a simple computation (see [1]) shows that the components $\phi_\pm$ of $\Phi(t)$ satisfy precisely
$$\begin{aligned} \phi_+'' + \left( +i f' + \epsilon^2 + f^2 \right) \phi_+ &= 0, \\ \phi_-'' + \left( -i f' + \epsilon^2 + f^2 \right) \phi_- &= 0. \end{aligned} \tag{I.13}$$
As a side remark we note that equations (I.13) are simpler and more convenient than the equations obtained by separating $\psi_+$ and $\psi_-$ from (I.2):
$$\begin{aligned} f \psi_+'' - f' \psi_+' + \left( \epsilon^2 f + f^3 - i\epsilon f' \right) \psi_+ &= 0, \\ f \psi_-'' - f' \psi_-' + \left( \epsilon^2 f + f^3 + i\epsilon f' \right) \psi_- &= 0. \end{aligned} \tag{I.14}$$
These equations, mentioned (but not used) in [7], are mathematically less convenient because they may be non-regular, since $f$ may have zeros in typical cases, like the simple monochromatic case $f(t) = -2\beta\cos(\omega t)$, analyzed in [7]. In [1] we attempted to solve (I.12) using the Ansatz
$$\phi(t) = \exp\left( -i \int_0^t \big(f(\tau) + g(\tau)\big)\, d\tau \right). \tag{I.15}$$

It follows that $g$ has to satisfy the generalized Riccati equation (I.7) and we tried to find solutions for $g$ in terms of a power expansion in $\epsilon$ like
$$g(t) = q(t) \sum_{n=1}^{\infty} \epsilon^n c_n(t), \tag{I.16}$$
where
$$q(t) := \exp\left( i \int_0^t f(\tau)\, d\tau \right). \tag{I.17}$$
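The passage from the Ansatz (I.15) to the Riccati equation (I.7), and the first order of the expansion (I.16), can be written out as a short worked derivation. This is a sketch added for the reader's convenience; it uses only (I.12), (I.15), (I.16) and (I.17), and the constant $\alpha_1$ is the one appearing later in the relation $C^{(1)}_{\mathbf m} = \alpha_1 Q_{\mathbf m}$.

```latex
\begin{align*}
% Derivatives of the Ansatz (I.15):
\phi' &= -i\,(f+g)\,\phi, \qquad
\phi'' = \bigl[-i\,(f'+g') - (f+g)^2\bigr]\phi . \\
% Insert into Hill's equation (I.12):
0 &= \phi'' + \bigl(i f' + \epsilon^2 + f^2\bigr)\phi
   = \bigl[-i g' - 2 f g - g^2 + \epsilon^2\bigr]\phi . \\
% Multiply by i to obtain (I.7) with G = g:
0 &= g' - i g^2 - 2 i f g + i\epsilon^2 . \\
% Insert (I.16), using q' = i f q from (I.17):
0 &= q \sum_{n\ge 1} \epsilon^n \bigl(c_n' - i f c_n\bigr)
     - i q^2 \Bigl(\sum_{n\ge 1} \epsilon^n c_n\Bigr)^{2} + i\epsilon^2 . \\
% Order epsilon^1:
c_1' &= i f c_1 \quad\Longrightarrow\quad c_1 = \alpha_1\, q
\quad (\alpha_1 \text{ a constant}).
\end{align*}
```

At order $\epsilon^2$ the same expansion gives $(c_2/q)' = i\alpha_1^2 q^2 - i q^{-2}$, which is where the requirement of a vanishing mean value (absence of secular terms) first constrains $\alpha_1$.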

The heuristic idea behind the Ansätze (I.15) and (I.16) is the following. For $\epsilon \equiv 0$ a solution of (I.12) is given by $\exp\left( -i \int_0^t f(\tau)\, d\tau \right)$. Thus, in (I.15) and (I.16) we are searching for solutions in terms of an “effective external field” of the form $f + g$, with $g$ vanishing for $\epsilon = 0$. Note that a solution of the form (I.15) leads to only one of the two independent solutions of the second order Hill’s equation (I.12). The complete solution of the Schrödinger equation (I.5) in terms of solutions of the generalized Riccati equation (I.7) is that described in Theorem I.1.

As mentioned above, perturbative solutions of quasi-periodically time-dependent systems are usually plagued by small denominators and by the presence of the so-called “secular terms”. In [1] we discovered a particular way to eliminate completely the secular terms from the perturbative expansion of $g$ (see Appendix A for a brief description of the strategy developed in [1]) and we were able to show, under some special assumptions, that the coefficients $c_n(t)$ are all quasi-periodic functions. In [1] we proved convergence of our perturbative solution in the somewhat trivial case where $f(t)$ is a non-zero constant function. Unfortunately no conclusion could be drawn about the convergence of the perturbative expansion for $g$ in the general case of quasi-periodic $f$. We conjectured, however, that our expansion is uniformly convergent in the situation where $f(t)$ has small fluctuations about its mean value.

The technically central result of the present paper is the proof that, under suitable assumptions, the series (I.16) converges uniformly on $\mathbb{R}$ as a function of time for $|\epsilon|$ small enough and $f$ periodic. This is the content of Theorem III.1. Moreover, we show that the functions $c_n$ and, hence, $g$, have uniformly converging Fourier series representations. We use this fact together with the solution (I.9) to find the Floquet representation of the components $\phi_\pm$ of the wave function in terms of uniformly converging Fourier series representations. This is the content of Theorem I.2. Absolutely converging power series in $\epsilon$ for the Fourier coefficients and for the secular frequency are also presented.

We believe that the methods employed in this paper are also of importance for the general theory of Hill’s equation. It would be of great interest to know whether the ideas described in [1] and here can be generalized and applied to a larger class of Hill’s equations than those we studied so far.

I.1 The Main Result

Concerning the solutions of the Schrödinger equation (I.5), the next theorem summarizes our main results.

Theorem I.2 Let $f$ be a real $T_\omega$-periodic function of time ($T_\omega := 2\pi/\omega$) whose Fourier decomposition
$$f(t) = \sum_{n \in \mathbb{Z}} F_n e^{in\omega t}, \tag{I.18}$$
with $\omega > 0$, contains only a finite number of terms, i.e., the set of integers $\{n \in \mathbb{Z} \mid F_n \neq 0\}$ is a finite set. We also assume that either $F_0 = 0$ or $2F_0 \in \mathbb{R} \setminus \{k\omega,\ k \in \mathbb{Z}\}$. Consider the two following mutually exclusive conditions on $f$:

I) $M(q^2) \neq 0$.

II) $M(q^2) = 0$ but $M(Q_1) \neq 0$, where
$$Q_1(t) := q(t)^2 \int_0^t q^{-2}(\tau)\, d\tau. \tag{I.19}$$

Then, for each $f$ as above, satisfying condition I or II, there exists a constant $K > 0$ (depending on the Fourier coefficients $\{F_n,\ n \in \mathbb{Z},\ n \neq 0\}$ and on $\omega > 0$) such that, for each $\epsilon$ with $|\epsilon| < K$, there exist $\Omega \in \mathbb{R}$ and $T_\omega$-periodic functions $u_{11}^\pm$ and $u_{12}^\pm$ such that the propagator $U(t)$ of (I.8) can be written as
$$U(t) = \begin{pmatrix} U_{11}(t) & U_{12}(t) \\ U_{21}(t) & U_{22}(t) \end{pmatrix} = \begin{pmatrix} U_{11}(t) & U_{12}(t) \\ -\overline{U_{12}(t)} & \overline{U_{11}(t)} \end{pmatrix}, \tag{I.20}$$
with
$$U_{11}(t) = e^{-i\Omega t}\, u_{11}^-(t) + e^{i\Omega t}\, u_{11}^+(t), \tag{I.21}$$
$$U_{12}(t) = e^{-i\Omega t}\, u_{12}^-(t) + e^{i\Omega t}\, u_{12}^+(t). \tag{I.22}$$
The functions $u_{11}^\pm$ and $u_{12}^\pm$ have absolutely and uniformly converging Fourier expansions
$$u_{11}^\pm(t) = \sum_{n \in \mathbb{Z}} U_{11}^\pm(n)\, e^{in\omega t}, \qquad u_{12}^\pm(t) = \sum_{n \in \mathbb{Z}} U_{12}^\pm(n)\, e^{in\omega t}.$$
Moreover, under the same assumptions, $\Omega$ and the Fourier coefficients $U_{11}^\pm(n)$ and $U_{12}^\pm(n)$ can be expressed in terms of absolutely converging power series in $\epsilon$.

Remarks on Theorem I.2

1. Expressions (I.21) and (I.22) represent the so-called “Floquet form” of the matrix elements $U_{11}(t)$ and $U_{12}(t)$. The frequency $\Omega$ is sometimes called the “secular frequency”. The existence of the Floquet form is, of course, guaranteed by the well-known Floquet theorem. Hence, our algorithm not only recovers the Floquet form but also allows the explicit computation of the secular frequency and the Fourier coefficients in terms of convergent expansions.

2. For a discussion of some physical implications of the solution described in the last theorem, see [2].

3. The physically realistic condition that the Fourier decomposition of $f$ contains only a finite number of terms can be weakened. The only condition we use is the fast decay for $|m| \to \infty$ of the Fourier coefficients $Q_m$ of the function $q(t)$ (defined in (I.17)), as found in Proposition II.2.

4. The second equality in (I.20) is due to (I.9).

5. It is important to stress that conditions I and II are restrictions on the function $f$ and not on the parameter $\epsilon$.

6. Possibly there are other conditions beyond I and II which could be considered, but they have not been explored so far. They are relevant in some cases. Theorem I.2 still does not provide a complete solution of (I.5) for all possible periodic functions $f$, but examples and some qualitative arguments show that the remaining cases are rather exceptional. For instance, for the monochromatic case where $f(t) = \varphi_1 \cos(\omega t) + \varphi_2 \sin(\omega t)$ condition I covers all pairs $(\varphi_1, \varphi_2) \in \mathbb{R}^2$, except the countable family of circles centered at the origin with radius $x_a \omega/2$, $a = 1, 2, \ldots$, where $x_a$ is the $a$-th zero of $J_0$ in $\mathbb{R}_+$ ($J_0$ is the Bessel function of order zero). However, on these circles condition II is nowhere fulfilled. See the discussion in Section VI.

7. From the computational point of view the solution given by our method can be easily implemented in numerical programs and has been successfully tested, providing ways to study our two-level system for large times with controllable errors (due to the uniform convergence).

8. Unitarity of $U(t)$ for all $t \in \mathbb{R}$ is a well-known consequence of Dyson’s expansion (see for instance [18]).

9. Conditions I and II define, in principle, distinct solutions of the generalized Riccati equation (I.7) and, hence, of the Schrödinger equation (I.5). To fix a name we will call these solutions “classes” of solutions.

10. As we will discuss, condition I is mostly important for the case $F_0 = 0$, while condition II is mostly important for the case $F_0 \neq 0$. There are, however, particular cases where condition I holds for $F_0 \neq 0$ and condition II for $F_0 = 0$, but examples indicate that such situations are rather exceptional. See Section VI.1.

For the proof of Theorem I.2 we have to consider two distinct cases, the case where $F_0 = 0$ and the case where $F_0 \neq 0$. The former will be considered in Section III and the latter in Section IV.
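The monochromatic example of Remark 6 can be explored numerically. The sketch below is an illustration under stated assumptions rather than code from the paper: for $f(t) = \varphi_1\cos(\omega t) + \varphi_2\sin(\omega t)$ it averages $q(t)^2$ over one period (which, for a periodic function, equals the mean value $M(q^2)$) and compares $|M(q^2)|$ with $|J_0(2r/\omega)|$, $r = \sqrt{\varphi_1^2 + \varphi_2^2}$, a relation that follows from the integral representation of $J_0$. The function names and the sample count are hypothetical choices.

```python
import cmath
import math


def q_squared(t, phi1, phi2, omega):
    """q(t)^2 = exp(2i * int_0^t f) for f = phi1*cos(omega t) + phi2*sin(omega t)."""
    integral = (phi1 * math.sin(omega * t)
                - phi2 * (math.cos(omega * t) - 1.0)) / omega
    return cmath.exp(2j * integral)


def mean_of_q_squared(phi1, phi2, omega, samples=2000):
    """Average of q^2 over one period, using equally spaced sample points."""
    period = 2.0 * math.pi / omega
    total = sum(q_squared(k * period / samples, phi1, phi2, omega)
                for k in range(samples))
    return total / samples


def bessel_j0(x, terms=40):
    """Power series J_0(x) = sum_k (-1)^k (x/2)^(2k) / (k!)^2, for moderate x."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total


def condition_one_margin(phi1, phi2, omega):
    """|M(q^2)|; condition I requires this to be non-zero."""
    return abs(mean_of_q_squared(phi1, phi2, omega))
```

With this sketch, `condition_one_margin` is expected to vanish (up to rounding) when $2r/\omega$ is a zero of $J_0$, which corresponds to the circles of radius $x_a\omega/2$ mentioned in Remark 6, and to stay away from zero elsewhere.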

I.2 Some Definitions and Some Remarks on the Notation

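Let us make some remarks on the notation we use here and recall the notation used in [1]. Given the Fourier representation$^2$
$$f(t) = \sum_{\underset{\sim}{m} \in \mathbb{Z}^B} F_{\underset{\sim}{m}}\, e^{i \underset{\sim}{m} \cdot \underset{\sim}{\omega}_f t} \tag{I.23}$$
of a quasi-periodic function $f$, we denote (as in [1]) by $\boldsymbol{\omega}$ the vector of frequencies defined by
$$\boldsymbol{\omega} := \begin{cases} \underset{\sim}{\omega}_f \in \mathbb{R}^B, & \text{if } F_0 = 0, \\ (\underset{\sim}{\omega}_f,\ F_0) \in \mathbb{R}^{B+1}, & \text{if } F_0 \neq 0. \end{cases} \tag{I.24}$$
Since we assume that $\underset{\sim}{\omega}_f \in \mathbb{R}^B_+$, the definition above says that all components of $\boldsymbol{\omega}$ are always non-zero. Moreover, we denote
$$A := \begin{cases} B, & \text{if } F_0 = 0, \\ B + 1, & \text{if } F_0 \neq 0. \end{cases} \tag{I.25}$$
We will frequently use $F_0 \equiv F_{\underset{\sim}{0}}$. We will denote vectors in $\mathbb{Z}^B$ (or $\mathbb{R}^B$) by $\underset{\sim}{v}$ and vectors in $\mathbb{Z}^A$ (or $\mathbb{R}^A$) by $\mathbf{v}$. The symbol $|\mathbf{n}|$ denotes the $l^1(\mathbb{Z}^A)$ norm of a vector $\mathbf{n} = (n_1, \ldots, n_A) \in \mathbb{Z}^A$: $|\mathbf{n}| := |n_1| + \cdots + |n_A|$.

We denote by $\lfloor x \rfloor$ the largest integer less than or equal to $x \in \mathbb{R}$ and by $\lceil x \rceil$ the smallest integer larger than or equal to $x \in \mathbb{R}$. For $m \in \mathbb{Z}$ we denote by $\langle\!\langle m \rangle\!\rangle$ the following function:
$$\langle\!\langle m \rangle\!\rangle := \begin{cases} |m|, & \text{for } m \neq 0, \\ 1, & \text{for } m = 0. \end{cases} \tag{I.26}$$
In the case where $f$ is a quasi-periodic function as in (I.23) we will denote by $Q_{\mathbf{m}}$ the Fourier coefficients of the function $q$, defined in (I.17):
$$q(t) = \sum_{\mathbf{m} \in \mathbb{Z}^A} Q_{\mathbf{m}}\, e^{i \mathbf{m} \cdot \boldsymbol{\omega} t}, \tag{I.27}$$
and by $Q^{(2)}_{\mathbf{m}}$ the Fourier coefficients of the function $q^2$:
$$q(t)^2 = \sum_{\mathbf{m} \in \mathbb{Z}^A} Q^{(2)}_{\mathbf{m}}\, e^{i \mathbf{m} \cdot \boldsymbol{\omega} t}. \tag{I.28}$$
Finally, for an almost periodic function $h$ we denote by $M(h)$ its “mean value”, defined as
$$M(h) := \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} h(t)\, dt.$$
See, e.g., [16, 17]. The mean value $M(h)$ equals the constant term in the Fourier expansion of $h$. One has, for instance, $M(q^2) = Q^{(2)}_{\mathbf{0}}$.

$^2$ For convenience we adopt here a notation different from that found in [1], where the Fourier decomposition of $f$ was written as $f(t) = \sum_{\underset{\sim}{m} \in \mathbb{Z}^B} f_{\underset{\sim}{m}}\, e^{i \underset{\sim}{m} \cdot \underset{\sim}{\omega}_f t}$.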

II Some Previous Results

In [1] some results could be proven about the nature of some particular solutions of (I.7) for the case where $f$ is a quasi-periodic function subjected to some additional restrictions. These results are described in Theorem II.1.

Theorem II.1 Let $f : \mathbb{R} \to \mathbb{R}$ be quasi-periodic with
$$f(t) = \sum_{\underset{\sim}{n} \in \mathbb{Z}^B} F_{\underset{\sim}{n}}\, e^{i \underset{\sim}{\omega}_f \cdot \underset{\sim}{n}\, t},$$
and such that the sum above contains only a finite number of terms. Assume that the vector $\boldsymbol{\omega}$ (defined in (I.24)) satisfies Diophantine conditions, i.e., assume the existence of constants $\Delta > 0$ and $\sigma > 0$ such that, for all $\mathbf{n} \in \mathbb{Z}^A$, $\mathbf{n} \neq \mathbf{0}$,
$$|\mathbf{n} \cdot \boldsymbol{\omega}| \ge \Delta^{-1} |\mathbf{n}|^{-\sigma}.$$

I. Assume that $f$ satisfies the condition $M(q^2) \neq 0$. Then, there exists a formal power series
$$g(t) = q(t) \sum_{n=1}^{\infty} c_n(t)\, \epsilon^n, \tag{II.1}$$
representing a particular solution of the generalized Riccati equation (I.7) such that all coefficients $c_n$ can be chosen to be quasi-periodic and can be represented as
$$c_n(t) = \sum_{\mathbf{m} \in \mathbb{Z}^A} C^{(n)}_{\mathbf{m}}\, e^{i \mathbf{m} \cdot \boldsymbol{\omega} t}, \tag{II.2}$$
where, for the Fourier coefficients $C^{(n)}_{\mathbf{m}}$, we have
$$|C^{(n)}_{\mathbf{m}}| \le K_n\, e^{-\chi_0 |\mathbf{m}|},$$
where $\chi_0 > 0$ is a constant and $K_n \ge 0$.

II. Assume that $f$ satisfies the conditions
$$M(q^2) = 0 \qquad \text{and} \qquad M(Q_1) \neq 0,$$
where $Q_1$ is defined in (I.19). Then, there exists a formal power series
$$g(t) = q(t) \sum_{n=1}^{\infty} e_n(t)\, \epsilon^{2n}, \tag{II.3}$$
representing a particular solution of the generalized Riccati equation (I.7) such that all coefficients $e_n$ can be chosen to be quasi-periodic and can be represented as
$$e_n(t) = \sum_{\mathbf{m} \in \mathbb{Z}^A} E^{(n)}_{\mathbf{m}}\, e^{i \mathbf{m} \cdot \boldsymbol{\omega} t}, \tag{II.4}$$
where, for the Fourier coefficients $E^{(n)}_{\mathbf{m}}$, we have
$$|E^{(n)}_{\mathbf{m}}| \le L_n\, e^{-\chi_0 |\mathbf{m}|},$$

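where $\chi_0 > 0$ is a constant and $L_n \ge 0$.

There are other conditions beyond I and II which could be considered, but they have not been explored so far. See the discussion in Section VI.

The statements of this last theorem are not sufficient for proving convergence of the power series expansions in $\epsilon$ for $g$ in the general case of quasi-periodic $f$. Unfortunately, as discussed in [1], the behavior for large $n$ of the constants $K_n$ and $L_n$ is too bad to guarantee absolute convergence of the formal power series above. For the restricted case where $f$ is periodic we will in the present paper prove stronger results (Theorem III.1 below) than those implied by Theorem II.1. As we will see, these stronger results, in contrast, imply convergence of the $\epsilon$-power series for $g$ (Theorem III.3 below).

Some of the more technical results of [1] have been obtained through the analysis of the Fourier coefficients of the functions $c_n$ and $e_n$ defined in Theorem II.1 above. Especially important for us are the recursion relations found in [1] for the Fourier coefficients $C^{(n)}_{\mathbf{m}}$ and $E^{(n)}_{\mathbf{m}}$ defined in (II.2) and (II.4), respectively. Those recursion relations follow by imposing the generalized Riccati equation (I.7) on the power expansions (II.1) and (II.3). In Appendix A we reproduce some of the main ideas of [1] leading to a power series expansion for $g$ free of secular terms and leading to the recursion relations below. It is important for our present purposes to reproduce those recursive relations here, which we shall do now.

As in (I.27)–(I.28), we denote by $Q_{\mathbf{m}}$ the Fourier coefficients of the function $q$ and by $Q^{(2)}_{\mathbf{m}}$ the Fourier coefficients of the function $q^2$. For the Fourier coefficients of the functions $c_n$ we have the following relations:
$$C^{(1)}_{\mathbf{m}} = \alpha_1 Q_{\mathbf{m}}, \tag{II.5}$$
$$C^{(2)}_{\mathbf{m}} = \sum_{\substack{\mathbf{n} \in \mathbb{Z}^A \\ \mathbf{n} \neq \mathbf{0}}} \frac{\alpha_1^2\, Q^{(2)}_{\mathbf{n}} - \overline{Q^{(2)}_{-\mathbf{n}}}}{\mathbf{n} \cdot \boldsymbol{\omega}} \left[ Q_{\mathbf{m} - \mathbf{n}} - \frac{Q_{\mathbf{m}}\, Q^{(2)}_{-\mathbf{n}}}{Q^{(2)}_{\mathbf{0}}} \right], \tag{II.6}$$
$$C^{(n)}_{\mathbf{m}} = \sum_{\substack{\mathbf{n}_1, \mathbf{n}_2 \in \mathbb{Z}^A \\ \mathbf{n}_1 + \mathbf{n}_2 \neq \mathbf{0}}} \left[ \sum_{p=1}^{n-1} C^{(p)}_{\mathbf{n}_1} C^{(n-p)}_{\mathbf{n}_2} \right] \frac{1}{(\mathbf{n}_1 + \mathbf{n}_2) \cdot \boldsymbol{\omega}} \left[ Q_{\mathbf{m} - (\mathbf{n}_1 + \mathbf{n}_2)} - \frac{Q_{\mathbf{m}}\, Q^{(2)}_{-\mathbf{n}_1 - \mathbf{n}_2}}{Q^{(2)}_{\mathbf{0}}} \right] - \frac{Q_{\mathbf{m}}}{2 \alpha_1 Q^{(2)}_{\mathbf{0}}} \sum_{\mathbf{n} \in \mathbb{Z}^A} \sum_{p=2}^{n-1} C^{(p)}_{\mathbf{n}} C^{(n+1-p)}_{-\mathbf{n}}, \quad \text{for } n \ge 3. \tag{II.7}$$
Above $\mathbf{m} \in \mathbb{Z}^A$ and $\alpha_1^2 = \dfrac{\overline{M(q^2)}}{M(q^2)}$.

For the Fourier coefficients of the functions $e_n$ we have the following relations:
$$E^{(1)}_{\mathbf{m}} = \sum_{\substack{\mathbf{n} \in \mathbb{Z}^A \\ \mathbf{n} \neq \mathbf{0}}} \frac{Q_{\mathbf{m} + \mathbf{n}}\, \overline{Q^{(2)}_{\mathbf{n}}}}{\mathbf{n} \cdot \boldsymbol{\omega}} + \frac{Q_{\mathbf{m}}}{2i M(Q_1)} \sum_{\substack{\mathbf{n}_1, \mathbf{n}_2 \in \mathbb{Z}^A \\ \mathbf{n}_1 \neq \mathbf{0},\ \mathbf{n}_2 \neq \mathbf{0}}} \frac{Q^{(2)}_{\mathbf{n}_1 + \mathbf{n}_2}\, \overline{Q^{(2)}_{\mathbf{n}_1}}\; \overline{Q^{(2)}_{\mathbf{n}_2}}}{(\mathbf{n}_1 \cdot \boldsymbol{\omega})(\mathbf{n}_2 \cdot \boldsymbol{\omega})}, \tag{II.8}$$
$$E^{(n)}_{\mathbf{m}} = \sum_{\substack{\mathbf{n}_1, \mathbf{n}_2 \in \mathbb{Z}^A \\ \mathbf{n}_1 + \mathbf{n}_2 \neq \mathbf{0}}} \left[ Q_{\mathbf{m} - \mathbf{n}_1 - \mathbf{n}_2} + \frac{Q_{\mathbf{m}}}{i M(Q_1)} \left( Q^{(2)}_{-\mathbf{n}_1 - \mathbf{n}_2}\, R + \sum_{\substack{\mathbf{n} \in \mathbb{Z}^A \\ \mathbf{n} \neq \mathbf{0}}} \frac{Q^{(2)}_{\mathbf{n} - \mathbf{n}_1 - \mathbf{n}_2}\, \overline{Q^{(2)}_{\mathbf{n}}}}{\mathbf{n} \cdot \boldsymbol{\omega}} \right) \right] \times \sum_{p=1}^{n-1} \frac{E^{(p)}_{\mathbf{n}_1} E^{(n-p)}_{\mathbf{n}_2}}{(\mathbf{n}_1 + \mathbf{n}_2) \cdot \boldsymbol{\omega}} + \frac{Q_{\mathbf{m}}}{2i M(Q_1)} \sum_{\mathbf{n} \in \mathbb{Z}^A} \sum_{p=2}^{n-1} E^{(p)}_{\mathbf{n}} E^{(n+1-p)}_{-\mathbf{n}}, \quad n \ge 2. \tag{II.9}$$
Above $\mathbf{m} \in \mathbb{Z}^A$, $Q_1$ is defined in (I.19) and
$$R := \frac{1}{2i M(Q_1)} \sum_{\substack{\mathbf{n}_1, \mathbf{n}_2 \in \mathbb{Z}^A \\ \mathbf{n}_1 \neq \mathbf{0},\ \mathbf{n}_2 \neq \mathbf{0}}} \frac{Q^{(2)}_{\mathbf{n}_1 + \mathbf{n}_2}\, \overline{Q^{(2)}_{\mathbf{n}_1}}\; \overline{Q^{(2)}_{\mathbf{n}_2}}}{(\mathbf{n}_1 \cdot \boldsymbol{\omega})(\mathbf{n}_2 \cdot \boldsymbol{\omega})}. \tag{II.10}$$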

The above expressions for the Fourier coefficients are somewhat complex but two important features can be distinguished. The first is the inevitable presence of “small denominators”, represented by the various factors of the form $(\mathbf{n} \cdot \boldsymbol{\omega})^{-1}$ (with $\mathbf{n} \neq \mathbf{0}$) appearing above. The second is the presence of convolution products (a consequence, ultimately, of the quadratic character of the generalized Riccati equation). The presence of the latter is the additional source of complications mentioned before, for they also, together with the small denominators, contribute to spoil the decay of the Fourier coefficients needed to prove convergence of the $\epsilon$-expansions.

II.1 The Fourier Coefficients $Q_{\mathbf{m}}$ and $Q^{(2)}_{\mathbf{m}}$

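For future purposes, it is important now to look more closely at the Fourier coefficients $Q_{\mathbf{m}}$ and $Q^{(2)}_{\mathbf{m}}$.

By assumption, the set $\{\underset{\sim}{n} \in \mathbb{Z}^B,\ \underset{\sim}{n} \neq \underset{\sim}{0} \mid F_{\underset{\sim}{n}} \neq 0\}$ is a finite set and, by the condition that $f$ is real, it contains an even number of elements, say $2J$ with $J \ge 1$. Let us write this set as $\{\underset{\sim}{n}_1, \ldots, \underset{\sim}{n}_{2J}\}$ with the convention $\underset{\sim}{n}_a = -\underset{\sim}{n}_{2J-a+1} \neq \underset{\sim}{0}$, $1 \le a \le J$, and let us write $f$ in the form
$$f(t) = F_0 + \sum_{a=1}^{2J} f_a\, e^{i \underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f t}, \tag{II.11}$$
with $f_a \equiv F_{\underset{\sim}{n}_a}$. Clearly $\overline{f_a} = f_{2J-a+1}$, $1 \le a \le J$, since $f$ is real.

A simple computation [1] shows that
$$q(t) = e^{i\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty}\ \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right] \exp\left( i \left[ F_0 + \underset{\sim}{\omega}_f \cdot \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b \right] t \right), \tag{II.12}$$
with
$$\gamma_f := i \sum_{a=1}^{2J} \frac{f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f}. \tag{II.13}$$
One sees that $\gamma_f \in \mathbb{R}$. The function $q^2$ is obtained by the replacement $f \to 2f$:
$$q(t)^2 = e^{i 2\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty}\ \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{2 f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right] \exp\left( i \left[ 2F_0 + \underset{\sim}{\omega}_f \cdot \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b \right] t \right).$$
From (II.12) we conclude that, if $F_0$ is not of the form $F_0 = \underset{\sim}{\omega}_f \cdot \underset{\sim}{k}$, for some vector of integers $\underset{\sim}{k}$, one has
$$q(t) = \sum_{\mathbf{m} \in \mathbb{Z}^A} Q_{\mathbf{m}}\, e^{i \mathbf{m} \cdot \boldsymbol{\omega} t}$$
with $\boldsymbol{\omega}$ defined in (I.24) and
$$Q_{\mathbf{m}} = e^{i\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta(\mathbf{P}, \mathbf{m}) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right], \tag{II.14}$$
where
$$\mathbf{P} \equiv \mathbf{P}(p_1, \ldots, p_{2J}, \underset{\sim}{n}_1, \ldots, \underset{\sim}{n}_{2J}) := \begin{cases} \displaystyle \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b \in \mathbb{Z}^B, & \text{if } F_0 = 0, \\[10pt] \displaystyle \left( \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b,\ 1 \right) \in \mathbb{Z}^{B+1}, & \text{if } F_0 \neq 0, \end{cases} \tag{II.15}$$
and where
$$\delta(\mathbf{P}, \mathbf{m}) := \begin{cases} 1, & \text{if } \mathbf{P} = \mathbf{m}, \\ 0, & \text{else.} \end{cases} \tag{II.16}$$
For $q^2$, and if $F_0$ is not of the form $2F_0 = \underset{\sim}{\omega}_f \cdot \underset{\sim}{k}$, for some vector of integers $\underset{\sim}{k}$, we have
$$q^2(t) = \sum_{\mathbf{m} \in \mathbb{Z}^A} Q^{(2)}_{\mathbf{m}}\, e^{i \mathbf{m} \cdot \boldsymbol{\omega} t},$$
where
$$Q^{(2)}_{\mathbf{m}} = e^{i 2\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta\big(\mathbf{P}^{(2)}, \mathbf{m}\big) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{2 f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right], \tag{II.17}$$
with
$$\mathbf{P}^{(2)} \equiv \mathbf{P}^{(2)}(p_1, \ldots, p_{2J}, \underset{\sim}{n}_1, \ldots, \underset{\sim}{n}_{2J}) := \begin{cases} \displaystyle \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b \in \mathbb{Z}^B, & \text{if } F_0 = 0, \\[10pt] \displaystyle \left( \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b,\ 2 \right) \in \mathbb{Z}^{B+1}, & \text{if } F_0 \neq 0. \end{cases}$$
Let us now study the condition $M(q^2) = Q^{(2)}_{\mathbf{0}} \neq 0$ for $F_0 \neq 0$, $F_0$ not of the form $2F_0 = \underset{\sim}{\omega}_f \cdot \underset{\sim}{k}$, for some vector of integers $\underset{\sim}{k}$. We have from (II.17)
$$M(q^2) = e^{i 2\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta\big(\mathbf{P}^{(2)}, \mathbf{0}\big) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{2 f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right]. \tag{II.18}$$
Since the last component of $\mathbf{P}^{(2)}$ equals 2 for $F_0 \neq 0$, we always have $\delta(\mathbf{P}^{(2)}, \mathbf{0}) = 0$ in the sum above and, hence, $M(q^2) = 0$. This means that, for $F_0 \neq 0$, condition I never happens, except perhaps for the case where $2F_0 = \underset{\sim}{\omega}_f \cdot \underset{\sim}{k}$, $\underset{\sim}{k} \in \mathbb{Z}^B$, much in contrast to the case $F_0 = 0$, where condition I holds almost everywhere in the space of the functions $f$ (see Section VI.1).

From (II.14) and (II.17) it is clear that for $F_0 \neq 0$, and $2F_0 \neq \underset{\sim}{\omega}_f \cdot \underset{\sim}{k}$, with $\underset{\sim}{k} \in \mathbb{Z}^B$, one has, writing $\mathbf{m} = (\underset{\sim}{m}, m_A)$,
$$Q_{\mathbf{m}} = Q_{\underset{\sim}{m}}\, \delta_{m_A, 1} \qquad \text{and} \qquad Q^{(2)}_{\mathbf{m}} = Q^{(2)}_{\underset{\sim}{m}}\, \delta_{m_A, 2}, \tag{II.19}$$
where $\delta$ is the usual Kronecker delta and where
$$Q_{\underset{\sim}{m}} := e^{i\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta\big(\underset{\sim}{P}, \underset{\sim}{m}\big) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right], \tag{II.20}$$
and
$$Q^{(2)}_{\underset{\sim}{m}} := e^{2i\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta\big(\underset{\sim}{P}, \underset{\sim}{m}\big) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{2 f_a}{\underset{\sim}{n}_a \cdot \underset{\sim}{\omega}_f} \right)^{p_a} \right], \tag{II.21}$$
with
$$\underset{\sim}{P} := \sum_{b=1}^{2J} p_b\, \underset{\sim}{n}_b \in \mathbb{Z}^B. \tag{II.22}$$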

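The observation taken from (II.19) that $Q_{\mathbf{m}}$ and $Q^{(2)}_{\mathbf{m}}$ are zero except if $m_A = 1$, respectively, if $m_A = 2$, will be of crucial importance for the analysis of the case $F_0 \neq 0$, given in Section IV. This is because these restrictions propagate in a specific way to the Fourier coefficients $E^{(n)}_{\mathbf{m}}$.

Below we will make use of the following proposition on the decay of the coefficients $Q_m$ and $Q^{(2)}_m$:

Proposition II.2 Let $f : \mathbb{R} \to \mathbb{R}$ be periodic and be represented by a finite Fourier series as in (I.18). Then, for any $\chi > 0$ there is a positive constant $\mathcal{Q} \equiv \mathcal{Q}(\chi)$ such that
$$|Q_m| \le \mathcal{Q}\, \frac{e^{-\chi |m|}}{\langle\!\langle m \rangle\!\rangle^2} \qquad \text{and} \qquad |Q^{(2)}_m| \le \mathcal{Q}\, \frac{e^{-\chi |m|}}{\langle\!\langle m \rangle\!\rangle^2} \tag{II.23}$$
for all $m \in \mathbb{Z}$, where the symbol $\langle\!\langle m \rangle\!\rangle$ is defined in (I.26).

The proof is found in Appendix B. Finally, we mention the following important lemma, whose proof is given in Appendix C.

Lemma II.3 For $\chi > 0$ and $m \in \mathbb{Z}$ define
$$B(m) \equiv B(m, \chi) := \sum_{n \in \mathbb{Z}} \frac{e^{-\chi(|m - n| + |n|)}}{\langle\!\langle m - n \rangle\!\rangle^2\, \langle\!\langle n \rangle\!\rangle^2}. \tag{II.24}$$
Then one has
$$B(m) \le B_0\, \frac{e^{-\chi |m|}}{\langle\!\langle m \rangle\!\rangle^2}, \tag{II.25}$$
for some constant $B_0 \equiv B_0(\chi) > 0$ and for all $m \in \mathbb{Z}$.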
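The content of Lemma II.3 is that convolving two sequences with the decay of (II.23) reproduces the same decay, up to a constant. The following sketch illustrates this numerically; it is not the proof of Appendix C. The names `bracket`, `convolution_sum`, `bound_ratio` and the truncation `cutoff` are assumptions made for this example, with `bracket` playing the role of the function defined in (I.26).

```python
import math


def bracket(m):
    """The function of (I.26): |m| for m != 0 and 1 for m = 0."""
    return abs(m) if m != 0 else 1


def convolution_sum(m, chi, cutoff=200):
    """Truncated version of B(m, chi) from (II.24), summing over |n| <= cutoff."""
    return sum(
        math.exp(-chi * (abs(m - n) + abs(n)))
        / (bracket(m - n) ** 2 * bracket(n) ** 2)
        for n in range(-cutoff, cutoff + 1)
    )


def bound_ratio(m, chi, cutoff=200):
    """B(m) divided by exp(-chi*|m|)/bracket(m)^2; (II.25) says this stays bounded."""
    return convolution_sum(m, chi, cutoff) * bracket(m) ** 2 * math.exp(chi * abs(m))
```

Evaluating `bound_ratio(m, chi)` over a range of `m` gives an empirical lower estimate for the constant $B_0(\chi)$; the ratio is expected to remain of order one as $|m|$ grows, which is what makes the inductive bounds on the recursion relations possible.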

We are ready now to start the analysis of the recursion relations (II.5)–(II.7) and (II.8)–(II.9) for the periodic case. As already mentioned, we have to consider two separate cases: the case where $F_0 = 0$, which we will deal with now, and the case $F_0 \neq 0$, which will be treated in Section IV.

III The Periodic Case With $F_0 = 0$

In [1] the recursion relations (II.5)–(II.7) and (II.8)–(II.9) have been used to prove inductively exponential bounds for the Fourier coefficients. As mentioned before, two main difficulties have to be faced in this enterprise: the presence of “small denominators” and of convolution products in the recursion relations. Both are responsible for reducing the rate of decay of the Fourier coefficients at each induction step.

Vol. 2, 2001


Let us consider the origin of the “small denominators problem” in our expansions. It comes from the many factors of the form $(n\cdot\omega)^{-1}$ (with $n \neq 0$) appearing in the recursion relations. In the case where f is a periodic function with frequency ω and with $F_0 \neq 0$, we have A = 2, $n = (n_1, n_2) \in \mathbb{Z}^2$ and $n\cdot\omega = n_1\omega + n_2F_0$. On the other hand, in the case where f is a periodic function with frequency ω and with $F_0 = 0$, we have A = 1, $n \in \mathbb{Z}$ and $n\cdot\omega = n\omega$. To avoid the quasi-resonant situation where $n_1\omega + n_2F_0$ is small we will first consider the case where $F_0 = 0$.

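III.1 The Recursive Relations in the Periodic Case for $F_0 = 0$

Under the hypothesis, the recursive relations for the Fourier coefficients of the functions $c_n$ become

$$C^{(1)}_m = \alpha_1 Q_m, \tag{III.1}$$

$$C^{(2)}_m = \sum_{\substack{n_1\in\mathbb{Z}\\ n_1\neq 0}} \frac{\alpha_1^2\,Q^{(2)}_{n_1} - \overline{Q^{(2)}_{-n_1}}}{n_1\omega}\left(Q_{m-n_1} - \frac{Q_m\,Q^{(2)}_{-n_1}}{Q^{(2)}_0}\right), \tag{III.2}$$

$$C^{(n)}_m = \sum_{p=1}^{n-1}\sum_{\substack{n_1,n_2\in\mathbb{Z}\\ n_1+n_2\neq 0}} \frac{1}{(n_1+n_2)\,\omega}\left(Q_{m-(n_1+n_2)} - \frac{Q_m\,Q^{(2)}_{-n_1-n_2}}{Q^{(2)}_0}\right) C^{(p)}_{n_1} C^{(n-p)}_{n_2} \;-\; \frac{Q_m}{2\alpha_1 Q^{(2)}_0}\sum_{n_1\in\mathbb{Z}}\sum_{p=2}^{n-1} C^{(p)}_{n_1} C^{(n+1-p)}_{-n_1}, \quad\text{for } n\ge 3. \tag{III.3}$$

Above $m\in\mathbb{Z}$ and $\alpha_1^2 = \overline{Q^{(2)}_0}\,/\,Q^{(2)}_0$.

For the Fourier coefficients of the functions $e_n$ we have: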

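$$E^{(1)}_m = \sum_{\substack{n_1\in\mathbb{Z}\\ n_1\neq 0}} \frac{Q_{m+n_1}\,\overline{Q^{(2)}_{n_1}}}{n_1\omega} + \frac{Q_m}{2iM(Q_1)}\sum_{\substack{n_1,n_2\in\mathbb{Z}\\ n_1\neq 0,\ n_2\neq 0}} \frac{Q^{(2)}_{n_1+n_2}\,\overline{Q^{(2)}_{n_1}}\;\overline{Q^{(2)}_{n_2}}}{(n_1\omega)(n_2\omega)}, \tag{III.4}$$

$$E^{(n)}_m = \sum_{p=1}^{n-1}\sum_{\substack{n_1,n_2\in\mathbb{Z}\\ n_1+n_2\neq 0}} \left[Q_{m-n_1-n_2} + \frac{Q_m}{iM(Q_1)}\left(Q^{(2)}_{-n_1-n_2}\,R + \sum_{\substack{n_3\in\mathbb{Z}\\ n_3\neq 0}} \frac{Q^{(2)}_{n_3-n_1-n_2}\,\overline{Q^{(2)}_{n_3}}}{n_3\omega}\right)\right] \frac{E^{(p)}_{n_1}E^{(n-p)}_{n_2}}{(n_1+n_2)\,\omega} \;+\; \frac{Q_m}{2iM(Q_1)}\sum_{p=2}^{n-1}\sum_{n_1\in\mathbb{Z}} E^{(p)}_{n_1}E^{(n+1-p)}_{-n_1}, \qquad n\ge 2. \tag{III.5}$$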

It is clear that no “small denominators” appear in this case, since now $|(n\omega)^{-1}| \le \omega^{-1}$ for $n \neq 0$. Hence, the convolution products are the only remaining factors forcing the reduction of the decay rate of the Fourier coefficients at the successive induction steps. In Section III.2 we will show how the effect of the convolution products can be brought under control. The result is expressed in the following three theorems.

Theorem III.1 Let $f : \mathbb{R}\to\mathbb{R}$ be periodic with a finite Fourier decomposition as in (I.18) and with $F_0 = 0$.

Case I. Consider the Fourier coefficients $C^{(n)}_m$ satisfying the recursion relations (III.1), (III.2) and (III.3). Under the hypothesis that $M(q^2) \neq 0$ we have

$$|C^{(n)}_m| \le K_n\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{III.6}$$

for all $n\in\mathbb{N}$ and all $m\in\mathbb{Z}$, where $\chi > 0$ is a constant and $\langle m\rangle$ is defined in (I.26). Above, the coefficients $K_n$ do not depend on m and satisfy the recursion relation

$$K_n = C_2\left(\sum_{p=1}^{n-1} K_pK_{n-p} + \sum_{p=2}^{n-1} K_pK_{n+1-p}\right), \tag{III.7}$$

with $K_1 = K_2 = C_1$, where $C_1$ and $C_2$ are positive constants which can be chosen larger than or equal to 1.

Case II. Consider the Fourier coefficients $E^{(n)}_m$ satisfying the recursion relations (III.4) and (III.5). Under the hypothesis that $M(q^2) = 0$ and $M(Q_1) \neq 0$ we have

$$|E^{(n)}_m| \le K_n\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{III.8}$$

for all $n\in\mathbb{N}$ and all $m\in\mathbb{Z}$, where $\chi > 0$ is a constant. Above, the coefficients $K_n$ do not depend on m and satisfy the recursion relation

$$K_n = E_2\left(\sum_{p=1}^{n-1} K_pK_{n-p} + \sum_{p=2}^{n-1} K_pK_{n+1-p}\right), \tag{III.9}$$

with $K_1 = K_2 = E_1$, where $E_1$ and $E_2$ are positive constants which can be chosen larger than or equal to 1.

Theorem III.1 will be proven in Section III.2. The importance of the recursive definition of the constants $K_n$ given in (III.7) or (III.9) is expressed in the following crucial theorem, which says that the constants $K_n$ grow at most exponentially with n.

Theorem III.2 Let the constants $K_n$ be defined through the recurrence relations (III.7) or (III.9). Then there exist constants $K > 0$ and $K_0 > 0$ (possibly depending on f) such that $K_n \le K_0 K^n$ for all $n\in\mathbb{N}$.
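The exponential growth claimed in Theorem III.2 can be watched in a small experiment. The sketch below iterates the recursion (III.7) with exact integer arithmetic, taking the recursion to start at n = 3 as in (III.3); the function name and the choice $C_1 = C_2 = 1$ are hypothetical and serve only as the simplest admissible example. It illustrates the statement and does not replace the argument of Appendix D.

```python
def growth_constants(c1: int, c2: int, n_max: int) -> list[int]:
    """K_1 .. K_{n_max} from the recursion (III.7), with K_1 = K_2 = c1."""
    K = [0, c1, c1]  # K[0] is a dummy entry so that K[n] matches K_n
    for n in range(3, n_max + 1):
        first = sum(K[p] * K[n - p] for p in range(1, n))       # sum over p = 1 .. n-1
        second = sum(K[p] * K[n + 1 - p] for p in range(2, n))  # sum over p = 2 .. n-1
        K.append(c2 * (first + second))
    return K

K = growth_constants(1, 1, 30)
ratios = [K[n + 1] / K[n] for n in range(1, 30)]
print(K[1:7])
print(f"largest ratio K_(n+1)/K_n up to n = 29: {max(ratios):.4f}")
```

Bounded successive ratios are exactly what a bound of the form $K_n \le K_0 K^n$ requires; for other values of $C_1$ and $C_2$ the same loop can be rerun to get a feeling for the size of K.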

The proof of Theorem III.2 is found in Appendix D and makes interesting use of properties of the Catalan sequence. Theorems III.1 and III.2 have the following immediate corollary:

Theorem III.3 The power series expansions in (II.1) and (II.3) are absolutely convergent for all $\epsilon\in\mathbb{C}$ with $|\epsilon| < K^{-1}$ for all $t\in\mathbb{R}$ and, hence, (II.1) and (II.3) define particular solutions of the generalized Riccati equation (I.7) in cases I and II, respectively, of Theorem III.1. The function g can be expressed in terms of an absolutely and uniformly converging Fourier series whose coefficients can be expressed in terms of absolutely converging power series in $\epsilon$ for all $\epsilon\in\mathbb{C}$ with $|\epsilon| < K^{-1}$.

Proof of Theorem III.3. We prove the statement for case I. Case II is analogous. The first step is to determine the Fourier expansion of the function g, as given in (I.16), and to study some of its properties. One clearly has

$$g(t) = \sum_{m\in\mathbb{Z}} G_m\,e^{im\omega t}, \tag{III.10}$$

with

$$G_m \equiv G_m(\epsilon) = \sum_{n=1}^{\infty} \epsilon^n\,G^{(n)}_m, \tag{III.11}$$

where

$$G^{(n)}_m := \sum_{l\in\mathbb{Z}} Q_{m-l}\,C^{(n)}_l. \tag{III.12}$$

We have the following proposition:

Proposition III.4 For all $\chi > 0$ there exists a constant $C_g \equiv C_g(\chi) > 0$ such that

$$|G^{(n)}_m| \le C_g\,K^n\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{III.13}$$

for all $m\in\mathbb{Z}$ and all $n\in\mathbb{N}$. Consequently, for $|\epsilon| < K^{-1}$ one has

$$|G_m| \le C_g\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{III.14}$$

for some constant $C_g(\chi,\epsilon) > 0$ and for all $m\in\mathbb{Z}$.

Proof of Proposition III.4. Inserting (II.23) and (III.6) into (III.12) we have, for any $\chi > 0$,

$$\left|G^{(n)}_m\right| \le Q\,K_n\,B(m,\chi), \tag{III.15}$$

where $B(m,\chi)$ is defined in (II.24). Relation (III.13) now follows from Lemma II.3 and Theorem III.2.

From this, the proof of Theorem III.3 follows immediately.

The solutions for the generalized Riccati equation (I.7) mentioned in Theorem III.3 are, through (I.9), the main ingredient for the solution of the Schrödinger equation (I.5). This will be further discussed in Section V. Now we have to prove Theorem III.1.

III.2 Inductive Bounds for the Fourier Coefficients

In this section we will prove Theorem III.1 in cases I and II. We will make use of Proposition II.2 on the decay of the Fourier coefficients $Q_m$ and $Q^{(2)}_m$ of the functions q and $q^2$, respectively.

III.2.1 Case I

In this section we will prove Theorem III.1 in case I. Making use of Proposition II.2 and of relations (III.1)–(III.3) we easily derive the following estimates:

$$|C^{(1)}_m| \le Q\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}, \tag{III.16}$$

$$|C^{(2)}_m| \le \sum_{n_1\in\mathbb{Z}} \frac{2Q}{\omega}\,\frac{e^{-\chi|n_1|}}{\langle n_1\rangle^2}\left[\frac{e^{-\chi|m-n_1|}}{\langle m-n_1\rangle^2} + \frac{Q}{|Q^{(2)}_0|}\,\frac{e^{-\chi(|m|+|n_1|)}}{\langle m\rangle^2\langle n_1\rangle^2}\right], \tag{III.17}$$

$$|C^{(n)}_m| \le \frac{Q}{\omega}\sum_{n_1,n_2\in\mathbb{Z}}\sum_{p=1}^{n-1} |C^{(p)}_{n_1}|\,|C^{(n-p)}_{n_2}| \left[\frac{e^{-\chi|m-(n_1+n_2)|}}{\langle m-(n_1+n_2)\rangle^2} + \frac{Q}{|Q^{(2)}_0|}\,\frac{e^{-\chi(|m|+|n_1+n_2|)}}{\langle m\rangle^2\langle n_1+n_2\rangle^2}\right] + \frac{Q}{2|Q^{(2)}_0|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\sum_{n_1\in\mathbb{Z}}\sum_{p=2}^{n-1} |C^{(p)}_{n_1}|\,|C^{(n+1-p)}_{-n_1}|, \quad\text{for } n\ge 3. \tag{III.18}$$

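It follows from (III.17), from the definition of B(m) in (II.24) and from Lemma II.3 that

$$|C^{(2)}_m| \le 2\omega^{-1}Q\left(B(m) + \frac{Q}{|Q^{(2)}_0|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\sum_{n_1\in\mathbb{Z}}\frac{e^{-2\chi|n_1|}}{\langle n_1\rangle^4}\right) \le K_2\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{III.19}$$

for some convenient choice of the constant $K_2$. Now, we will use an induction argument to establish (III.6) for all $n\ge 3$. Let us assume that, for a given $n\in\mathbb{N}$, $n\ge 3$, one has

$$|C^{(p)}_m| \le K_p\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}, \qquad \forall m\in\mathbb{Z}, \tag{III.20}$$

for all p such that $1\le p\le n-1$, for some convenient constants $K_p$. We will establish that this implies the same sort of bound for p = n. Note, by taking $K_1 \ge Q$, that relation (III.16) guarantees (III.20) for p = 1 and that relation (III.19) guarantees the case p = 2. From (III.18) and from the induction hypothesis,

$$|C^{(n)}_m| \le \omega^{-1}Q\sum_{p=1}^{n-1}K_pK_{n-p}\left[\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|m-(n_1+n_2)|+|n_1|+|n_2|)}}{\langle m-(n_1+n_2)\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2} + \frac{Q}{|Q^{(2)}_0|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2|+|n_1|+|n_2|)}}{\langle n_1+n_2\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2}\right] + \frac{Q}{2|Q^{(2)}_0|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\sum_{p=2}^{n-1}K_pK_{n+1-p}\sum_{n_1\in\mathbb{Z}}\frac{e^{-2\chi|n_1|}}{\langle n_1\rangle^4}. \tag{III.21}$$

Now,

$$\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2|+|n_1|+|n_2|)}}{\langle n_1+n_2\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2} \quad\text{and}\quad \sum_{n_1\in\mathbb{Z}}\frac{e^{-2\chi|n_1|}}{\langle n_1\rangle^4}$$

are just finite constants and

$$\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|m-(n_1+n_2)|+|n_1|+|n_2|)}}{\langle m-(n_1+n_2)\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2} = \sum_{n_1\in\mathbb{Z}}\frac{e^{-\chi|n_1|}}{\langle n_1\rangle^2}\,B(m-n_1) \le B_0\sum_{n_1\in\mathbb{Z}}\frac{e^{-\chi(|n_1|+|m-n_1|)}}{\langle n_1\rangle^2\langle m-n_1\rangle^2} = B_0\,B(m) \le (B_0)^2\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}, \tag{III.22}$$

where we again used Lemma II.3. Therefore, we conclude

$$|C^{(n)}_m| \le \left[C_a\sum_{p=1}^{n-1}K_pK_{n-p} + C_b\sum_{p=2}^{n-1}K_pK_{n+1-p}\right]\frac{e^{-\chi|m|}}{\langle m\rangle^2}, \tag{III.23}$$

for two positive constants $C_a$ and $C_b$. Taking $C_2 := \max\{C_a, C_b, 1\}$, relation (III.7) is proven with $C_2 \ge 1$. Note that, without loss, we are allowed to choose $K_1 = K_2 \ge 1$ by choosing both equal to $\max\{K_1, K_2, 1\}$.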

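III.2.2 Case II

In this section we will prove Theorem III.1 in case II. From (III.4)–(III.5), from Proposition II.2 and from the assumption (III.8) we have

$$\left|E^{(1)}_m\right| \le \frac{Q^2}{\omega}\sum_{n_1\in\mathbb{Z}}\frac{e^{-\chi(|m+n_1|+|n_1|)}}{\langle m+n_1\rangle^2\langle n_1\rangle^2} + \frac{Q^4}{2\omega^2|M(Q_1)|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2|+|n_1|+|n_2|)}}{\langle n_1+n_2\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2},$$

$$\left|E^{(n)}_m\right| \le \left[\frac{Q}{\omega}\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|m-n_1-n_2|+|n_1|+|n_2|)}}{\langle m-n_1-n_2\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2} + \frac{Q^2}{|M(Q_1)|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\left(|R|\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2|+|n_1|+|n_2|)}}{\langle n_1+n_2\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2} + \frac{Q}{\omega}\sum_{n_1,n_2,n_3\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2+n_3|+|n_1|+|n_2|+|n_3|)}}{\langle n_1+n_2+n_3\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2\langle n_3\rangle^2}\right)\right]\sum_{p=1}^{n-1}K_pK_{n-p} + \frac{Q}{2|M(Q_1)|}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}\sum_{p=2}^{n-1}K_pK_{n+1-p}\sum_{n_1\in\mathbb{Z}}\frac{e^{-2\chi|n_1|}}{\langle n_1\rangle^4}, \qquad n\ge 2.$$

Since sums like

$$\sum_{n_1,n_2\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2|+|n_1|+|n_2|)}}{\langle n_1+n_2\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2} \quad\text{and}\quad \sum_{n_1,n_2,n_3\in\mathbb{Z}}\frac{e^{-\chi(|n_1+n_2+n_3|+|n_1|+|n_2|+|n_3|)}}{\langle n_1+n_2+n_3\rangle^2\langle n_1\rangle^2\langle n_2\rangle^2\langle n_3\rangle^2}$$

are just finite constants, and by applying Lemma II.3, we get

$$|E^{(1)}_m| \le E_a\,\frac{e^{-\chi|m|}}{\langle m\rangle^2}, \qquad |E^{(n)}_m| \le \left[E_b\sum_{p=1}^{n-1}K_pK_{n-p} + E_c\sum_{p=2}^{n-1}K_pK_{n+1-p}\right]\frac{e^{-\chi|m|}}{\langle m\rangle^2}, \quad n\ge 2,$$

where $E_a$, $E_b$ and $E_c$ are constants. The rest of the proof follows the same steps as the proof of Theorem III.1 in case I.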

IV The Periodic Case With $F_0 \neq 0$

Now we will consider the case where f is periodic with $F_0 \neq 0$, for which we have A = 2. The denominators $n\cdot\omega$ are of the form $n_1\omega + n_2F_0$, with $n_1, n_2\in\mathbb{Z}$, and one has to fear the presence of small denominators in the recursion relations if both $n_1$ and $n_2$ can be arbitrarily large. Due to (II.19), we will see, however, that the range of values of $n_2$ is limited to one single value. Hence, no small divisors appear and we are back to a situation analogous to the case $F_0 = 0$.

IV.1 The Structure of the Coefficients $\mathbf{E}^{(n)}_{\mathbf m}$

Let us now return to the periodic case with B = 1, $F_0 \neq 0$ and $2F_0 \neq k\omega$ for any $k\in\mathbb{Z}$. Recalling relations (II.19) let us first prove the following theorem:

Theorem IV.1 For periodic f with a finite Fourier decomposition as above and with $F_0 \neq 0$ and $2F_0 \neq k\omega$, $k\in\mathbb{Z}$, the Fourier coefficients $\mathbf{E}^{(n)}_{\mathbf m}$, $n\ge 1$, are given by

$$\mathbf{E}^{(n)}_{\mathbf m} = E^{(n)}_{m_1}\,\delta_{m_2,\,-1}, \tag{IV.1}$$

for all $\mathbf m = (m_1, m_2)\in\mathbb{Z}^2$, where, for $m\in\mathbb{Z}$,

$$E^{(1)}_m := \sum_{a_1\in\mathbb{Z}}\frac{Q_{m+a_1}\,\overline{Q^{(2)}_{a_1}}}{a_1\omega + 2F_0}, \tag{IV.2}$$

and

$$E^{(n)}_m := \sum_{p=1}^{n-1}\sum_{a_1,b_1\in\mathbb{Z}}\frac{Q_{m-a_1-b_1}\,E^{(p)}_{a_1}E^{(n-p)}_{b_1}}{(a_1+b_1)\omega - 2F_0}, \qquad n\ge 2. \tag{IV.3}$$

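Proof. Let us first consider the case n = 1. The other cases will follow by induction. From (II.8), using (II.19) and writing $a = (a_1, a_2)$, $b = (b_1, b_2)$ and $c = (c_1, c_2)\in\mathbb{Z}^2$, we get

$$\mathbf{E}^{(1)}_{\mathbf m} = \sum_{\substack{a\in\mathbb{Z}^2\\ a\neq 0}}\frac{Q_{m_1+a_1}\,\overline{Q^{(2)}_{a_1}}}{a\cdot\omega}\left(\delta_{m_2+a_2,\,1}\,\delta_{a_2,\,2}\right) + \frac{Q_{m_1}\,\delta_{m_2,\,1}}{2iM(Q_1)}\sum_{\substack{b,c\in\mathbb{Z}^2\\ b\neq 0,\ c\neq 0}}\frac{Q^{(2)}_{b_1+c_1}\,\overline{Q^{(2)}_{b_1}}\;\overline{Q^{(2)}_{c_1}}}{(b\cdot\omega)(c\cdot\omega)}\left(\delta_{b_2+c_2,\,2}\,\delta_{b_2,\,2}\,\delta_{c_2,\,2}\right) = \left[\sum_{a_1\in\mathbb{Z}}\frac{Q_{m_1+a_1}\,\overline{Q^{(2)}_{a_1}}}{a_1\omega + 2F_0}\right]\delta_{m_2,\,-1}, \tag{IV.4}$$

since $\delta_{b_2+c_2,\,2}\,\delta_{b_2,\,2}\,\delta_{c_2,\,2} = \delta_{4,\,2}\,\delta_{b_2,\,2}\,\delta_{c_2,\,2} = 0$. This proves Theorem IV.1 for n = 1.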

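For any $n\ge 2$ relation (II.9) is very much simplified with the observation that, for $F_0$ as above, one has R = 0. To see this, write R according to the definition (II.10) and use (II.19) to get

$$R = \frac{1}{2iM(Q_1)}\sum_{\substack{a,b\in\mathbb{Z}^A\\ a\neq 0,\ b\neq 0}}\frac{Q^{(2)}_{a_1+b_1}\,\overline{Q^{(2)}_{a_1}}\;\overline{Q^{(2)}_{b_1}}}{(a\cdot\omega)(b\cdot\omega)}\left(\delta_{a_2+b_2,\,2}\,\delta_{a_2,\,2}\,\delta_{b_2,\,2}\right) = 0, \tag{IV.5}$$

since $\delta_{a_2+b_2,\,2}\,\delta_{a_2,\,2}\,\delta_{b_2,\,2} = \delta_{4,\,2}\,\delta_{a_2,\,2}\,\delta_{b_2,\,2} = 0$. The proof is now done by induction. Let $n\ge 2$ and assume that for all p with $1\le p\le n-1$ one has

$$\mathbf{E}^{(p)}_{\mathbf m} = E^{(p)}_{m_1}\,\delta_{m_2,\,-1} \tag{IV.6}$$

for all $\mathbf m = (m_1, m_2)\in\mathbb{Z}^2$. According to (II.9) we have

$$\mathbf{E}^{(n)}_{\mathbf m} = \sum_{p=1}^{n-1}\left[A^{(n,p)}_{\mathbf m} + \frac{\mathbf{Q}_{\mathbf m}}{iM(Q_1)}\,B^{(n,p)}\right] + \frac{\mathbf{Q}_{\mathbf m}}{2iM(Q_1)}\sum_{p=2}^{n-1}C^{(n,p)}, \tag{IV.7}$$

where

$$A^{(n,p)}_{\mathbf m} := \sum_{\substack{a,b\in\mathbb{Z}^2\\ a+b\neq 0}}\frac{\mathbf{Q}_{\mathbf m-a-b}\,\mathbf{E}^{(p)}_a\,\mathbf{E}^{(n-p)}_b}{(a+b)\cdot\omega}, \tag{IV.8}$$

$$B^{(n,p)} := \sum_{\substack{a\in\mathbb{Z}^2\\ a\neq 0}}\ \sum_{\substack{b,c\in\mathbb{Z}^2\\ b+c\neq 0}}\frac{\mathbf{Q}^{(2)}_{a-b-c}\,\overline{\mathbf{Q}^{(2)}_a}\,\mathbf{E}^{(p)}_b\,\mathbf{E}^{(n-p)}_c}{(a\cdot\omega)\,((b+c)\cdot\omega)}, \tag{IV.9}$$

and

$$C^{(n,p)} := \sum_{a\in\mathbb{Z}^2}\mathbf{E}^{(p)}_a\,\mathbf{E}^{(n+1-p)}_{-a}. \tag{IV.10}$$

By (II.19) and by the induction hypothesis,

$$A^{(n,p)}_{\mathbf m} = \sum_{\substack{a,b\in\mathbb{Z}^2\\ a+b\neq 0}}\frac{Q_{m_1-a_1-b_1}\,E^{(p)}_{a_1}E^{(n-p)}_{b_1}}{(a_1+b_1)\omega + (a_2+b_2)F_0}\left[\delta_{m_2-a_2-b_2,\,1}\,\delta_{a_2,\,-1}\,\delta_{b_2,\,-1}\right] = \left[\sum_{a_1,b_1\in\mathbb{Z}}\frac{Q_{m_1-a_1-b_1}\,E^{(p)}_{a_1}E^{(n-p)}_{b_1}}{(a_1+b_1)\omega - 2F_0}\right]\delta_{m_2,\,-1}. \tag{IV.11}$$

Moreover,

$$B^{(n,p)} = \sum_{\substack{a\in\mathbb{Z}^2\\ a\neq 0}}\ \sum_{\substack{b,c\in\mathbb{Z}^2\\ b+c\neq 0}}\frac{Q^{(2)}_{a_1-b_1-c_1}\,\overline{Q^{(2)}_{a_1}}\,E^{(p)}_{b_1}E^{(n-p)}_{c_1}}{(a_1\omega + a_2F_0)\,((b_1+c_1)\omega + (b_2+c_2)F_0)}\left[\delta_{a_2-b_2-c_2,\,2}\,\delta_{a_2,\,2}\,\delta_{b_2,\,-1}\,\delta_{c_2,\,-1}\right]$$

equals zero, since $\delta_{a_2-b_2-c_2,\,2}\,\delta_{a_2,\,2}\,\delta_{b_2,\,-1}\,\delta_{c_2,\,-1} = \delta_{4,\,2}\,\delta_{a_2,\,2}\,\delta_{b_2,\,-1}\,\delta_{c_2,\,-1} = 0$. Finally,

$$C^{(n,p)} = \sum_{a\in\mathbb{Z}^2}E^{(p)}_{a_1}E^{(n+1-p)}_{-a_1}\left(\delta_{a_2,\,-1}\,\delta_{-a_2,\,-1}\right) = 0. \tag{IV.12}$$

Hence, for $n\ge 2$,

$$\mathbf{E}^{(n)}_{\mathbf m} = \sum_{p=1}^{n-1}A^{(n,p)}_{\mathbf m} = \left[\sum_{p=1}^{n-1}\sum_{a_1,b_1\in\mathbb{Z}}\frac{Q_{m_1-a_1-b_1}\,E^{(p)}_{a_1}E^{(n-p)}_{b_1}}{(a_1+b_1)\omega - 2F_0}\right]\delta_{m_2,\,-1}, \tag{IV.13}$$

completing the proof of Theorem IV.1.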

IV.2 Inductive Upper Bounds and Convergence

Theorem IV.1 is of crucial importance, since it shows that actually no problems with small denominators are present in the recursion relations defining the Fourier coefficients $\mathbf{E}^{(n)}_{\mathbf m}$. This allows us to find upper bounds for the absolute values of the coefficients $\mathbf{E}^{(n)}_{\mathbf m}$ in essentially the same way as was done for the case $F_0 = 0$. This is what we do now.

As we already mentioned, the coefficients $Q_m$ and $Q^{(2)}_m$ can be bounded as in Proposition II.2. Moreover, we have

$$|a_1\omega + 2F_0| \ge \min_{a\in\mathbb{Z}}\bigl|\,|a|\omega - 2|F_0|\,\bigr| =: \eta > 0. \tag{IV.14}$$

Note that $\eta = 2|F_0|$ for $|F_0| \le \omega/4$ (for larger $|F_0|$ the minimum is attained at some $a \neq 0$) and, hence, $\eta\to 0$ when $F_0\to 0$. This remark will be relevant in Section VI.3.

Using Proposition II.2 and Lemma II.3,

$$\left|\mathbf{E}^{(1)}_{\mathbf m}\right| = \left|\sum_{a_1\in\mathbb{Z}}\frac{Q_{m_1+a_1}\,\overline{Q^{(2)}_{a_1}}}{a_1\omega + 2F_0}\right|\delta_{m_2,\,-1} \le \frac{Q^2}{\eta}\,B(m_1)\,\delta_{m_2,\,-1} \le \frac{Q^2B_0}{\eta}\,\frac{e^{-\chi|m_1|}}{\langle m_1\rangle^2}\,\delta_{m_2,\,-1}, \tag{IV.15}$$

where B(m) is defined in (II.24). Defining $K_1 := Q^2B_0/\eta$, taking $n\ge 2$ and assuming the induction hypothesis

$$\left|\mathbf{E}^{(p)}_{\mathbf m}\right| \le K_p\,\frac{e^{-\chi|m_1|}}{\langle m_1\rangle^2}\,\delta_{m_2,\,-1}, \tag{IV.16}$$

for all p with $1\le p\le n-1$, where the $K_p$ are constants independent of m, we have from (IV.13),

$$\left|\mathbf{E}^{(n)}_{\mathbf m}\right| \le \frac{1}{\eta}\left[\sum_{p=1}^{n-1}\sum_{a_1,b_1\in\mathbb{Z}}|Q_{m_1-a_1-b_1}|\,\bigl|E^{(p)}_{a_1}\bigr|\,\bigl|E^{(n-p)}_{b_1}\bigr|\right]\delta_{m_2,\,-1} \le \frac{Q}{\eta}\left[\sum_{p=1}^{n-1}K_pK_{n-p}\sum_{a_1,b_1\in\mathbb{Z}}\frac{e^{-\chi(|m_1-a_1-b_1|+|a_1|+|b_1|)}}{\langle m_1-a_1-b_1\rangle^2\langle a_1\rangle^2\langle b_1\rangle^2}\right]\delta_{m_2,\,-1} \le \frac{QB_0^2}{\eta}\left(\sum_{p=1}^{n-1}K_pK_{n-p}\right)\frac{e^{-\chi|m_1|}}{\langle m_1\rangle^2}\,\delta_{m_2,\,-1}, \tag{IV.17}$$

where, above, we used Lemma II.3. Defining inductively

$$K_n := \frac{QB_0^2}{\eta}\sum_{p=1}^{n-1}K_pK_{n-p} \tag{IV.18}$$

we have proven that

$$\left|\mathbf{E}^{(n)}_{\mathbf m}\right| \le K_n\,\frac{e^{-\chi|m_1|}}{\langle m_1\rangle^2}\,\delta_{m_2,\,-1}, \tag{IV.19}$$

for all $n\in\mathbb{N}$ and all $\mathbf m = (m_1, m_2)\in\mathbb{Z}^2$. With the same methods employed in Appendix D, we can show that $K_n \le K_0\,(K')^n$ for all $n\in\mathbb{N}$, where $K_0$ and $K'$ are positive constants. From all this, it follows that, for all n,

$$|e_n(t)| \le K_0\,(K')^n\sum_{m_1\in\mathbb{Z}}\frac{e^{-\chi|m_1|}}{\langle m_1\rangle^2} = K_0'\,(K')^n \tag{IV.20}$$

where $K_0'$ is a constant and

$$|g(t)| \le K_0''\sum_{n=1}^{\infty}|\epsilon^2|^n\,(K')^n. \tag{IV.21}$$

We have thus established that the Fourier series of the functions $e_n$ converge absolutely and uniformly and that, for $|\epsilon|^2 < (K')^{-1}$, the power series (II.3), which defines the solution g, is absolutely convergent. The Fourier expansion for g is also absolutely and uniformly convergent.

We conclude from the lines above that the true radius of convergence R of the $\epsilon$-expansion of g is bounded from below by $(K')^{-1/2}$. Note that $K'$ is proportional to $\eta^{-1}$ and, hence, $(K')^{-1/2}$ shrinks to zero when $F_0\to 0$ (see the definition of η in equation (IV.14)). As we will remark in Section VI.3, there are indications that R also shrinks to zero when $F_0\to 0$.
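The two quantities that control this section, η from (IV.14) and the constants $K_n$ from (IV.18), are easy to tabulate. The following sketch is a toy computation under stated assumptions: the names `eta`, `K_sequence` and `catalan`, the abbreviation c for the prefactor $QB_0^2/\eta$, and all numerical values are hypothetical. It also compares the recursion with the Catalan-number closed form $K_n = \mathrm{Cat}(n-1)\,c^{\,n-1}K_1^{\,n}$, which is a standard property of this type of convolution recursion and is not quoted from Appendix D.

```python
import math

def eta(omega: float, F0: float, a_max: int = 50) -> float:
    """min over integers a of | |a|*omega - 2|F0| |, cf. (IV.14).

    ASSUMPTION: the minimiser lies in [-a_max, a_max], i.e. 2|F0| < a_max*omega.
    """
    return min(abs(abs(a) * omega - 2 * abs(F0)) for a in range(-a_max, a_max + 1))

def K_sequence(K1: float, c: float, n_max: int) -> list[float]:
    """K_1 .. K_{n_max} from (IV.18), writing c for the prefactor Q*B0^2/eta."""
    K = [0.0, K1]  # K[0] is a dummy entry so that K[n] matches K_n
    for n in range(2, n_max + 1):
        K.append(c * sum(K[p] * K[n - p] for p in range(1, n)))
    return K

def catalan(k: int) -> int:
    """k-th Catalan number: 1, 1, 2, 5, 14, ..."""
    return math.comb(2 * k, k) // (k + 1)

K = K_sequence(K1=1.0, c=1.0, n_max=10)
closed_form = [float(catalan(n - 1)) for n in range(1, 11)]
print(K[1:])
print(eta(1.0, 0.1), eta(1.0, 0.4))
```

Since the Catalan numbers satisfy $\mathrm{Cat}(k)\le 4^k$, one admissible choice in this toy setting is $K' = 4cK_1$; the second call to `eta` shows a case where the minimum in (IV.14) is attained at $|a| = 1$ rather than at $a = 0$.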

Let us finish this section with a closer look at the Fourier expansion of g. Theorem IV.1 says that the functions $e_n$ have the following Fourier decomposition:

$$e_n(t) = e^{-iF_0t}\sum_{m\in\mathbb{Z}}E^{(n)}_m\,e^{im\omega t}, \tag{IV.22}$$

while for q(t) we have

$$q(t) = e^{iF_0t}\sum_{m\in\mathbb{Z}}Q_m\,e^{im\omega t}. \tag{IV.23}$$

Thus,

$$g(t) = \sum_{m\in\mathbb{Z}}G_m\,e^{im\omega t} \tag{IV.24}$$

where

$$G_m \equiv G_m(\epsilon) = \sum_{n=1}^{\infty}\lambda^n\,G^{(n)}_m \tag{IV.25}$$

with $\lambda = \epsilon^2$ and

$$G^{(n)}_m = \sum_{l\in\mathbb{Z}}Q_{m-l}\,E^{(n)}_l. \tag{IV.26}$$

Note by (IV.24) that $F_0$ is present in g only in the Fourier coefficients $G_m$ and not in the frequencies.

For the coefficients $G^{(n)}_m$ we have the following expressions, which we will need when we discuss the $\epsilon$-expansion of Ω in Section VI.3:

$$G^{(1)}_m = \sum_{a_1\in\mathbb{Z}}\frac{Q^{(2)}_{m+a_1}\,\overline{Q^{(2)}_{a_1}}}{a_1\omega + 2F_0} \tag{IV.27}$$

and

$$G^{(n)}_m = \sum_{p=1}^{n-1}\sum_{a_1,b_1\in\mathbb{Z}}\frac{Q^{(2)}_{m-a_1-b_1}\,E^{(p)}_{a_1}E^{(n-p)}_{b_1}}{(a_1+b_1)\omega - 2F_0}, \qquad n\ge 2. \tag{IV.28}$$

V The Fourier Expansion for the Wave Function

Now we return to the discussion of the solution (I.9) of the Schrödinger equation (I.5). Our intention is to find the Fourier expansion of the wave function Φ(t).

V.1 The Floquet Form of the Wave Function. The Fourier Decomposition and the Secular Frequency

As explained in [1] and in Section I, the components $\phi_\pm$ of the wave function Φ(t) are solutions of Hill’s equation (I.13). For periodic f the classical theorem of Floquet (see e.g. [21] and [22]) states that there are particular solutions of equations like (I.13) with the general form $e^{i\Omega t}u(t)$, where u(t) is periodic with the same period as f. In order to preserve unitarity we must have $\Omega\in\mathbb{R}$. This form of the particular solutions is called the “Floquet form” and the frequencies Ω are called “secular frequencies”.

In this section we will recover the Floquet form of the wave function in terms of Fourier expansions and we will find expansions for the secular frequencies as converging power series in $\epsilon$.

According to the solution expressed in relations (I.8) and (I.9), we first have to find the Fourier expansions of the functions R and S defined in (I.10) and (I.11), respectively. We start with the function R. The Fourier expansion of the function f + g is

$$f(t) + g(t) = \Omega + \sum_{\substack{n\in\mathbb{Z}\\ n\neq 0}}\bigl(F_n + G_n(\epsilon)\bigr)\,e^{in\omega t}, \tag{V.1}$$

where

$$\Omega \equiv \Omega(\epsilon) := F_0 + G_0(\epsilon). \tag{V.2}$$

One has,

$$R(t) = e^{-i\gamma_f(\epsilon)}\,e^{-i\Omega t}\exp\left(-\sum_{n\in\mathbb{Z}}H_n\,e^{in\omega t}\right) \tag{V.3}$$

with

$$H_n \equiv H_n(\epsilon) := \begin{cases}\dfrac{F_n + G_n(\epsilon)}{n\omega}, & \text{for } n\neq 0,\\[2mm] 0, & \text{for } n = 0,\end{cases} \tag{V.4}$$

and

$$\gamma_f(\epsilon) := i\sum_{m\in\mathbb{Z}}H_m. \tag{V.5}$$

Note that $\gamma_f(0) = \gamma_f$, where $\gamma_f$ is defined in (B.4). Since we are assuming that there are only finitely many non-vanishing coefficients $F_n$, we have the following proposition as an obvious corollary of Proposition III.4:

Proposition V.1 For all $\chi > 0$ and $|\epsilon|$ small enough, there exists a constant $C_H \equiv C_H(\chi,\epsilon) > 0$ such that

$$|H_m| \le C_H\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{V.6}$$

for all $m\in\mathbb{Z}$.

Writing now the Fourier expansion of R(t) in the form

$$R(t) = e^{-i\Omega t}\sum_{n\in\mathbb{Z}}R_n\,e^{in\omega t} \tag{V.7}$$

we find from (V.3)

$$R_0 = e^{-i\gamma_f(\epsilon)}\left[1 + \sum_{p=1}^{\infty}\frac{(-1)^{p+1}}{(p+1)!}\sum_{n_1,\ldots,n_p\in\mathbb{Z}}H_{n_1}\cdots H_{n_p}\,H_{-N_p}\right], \tag{V.8}$$

$$R_n = e^{-i\gamma_f(\epsilon)}\left[-H_n + \sum_{p=1}^{\infty}\frac{(-1)^{p+1}}{(p+1)!}\sum_{n_1,\ldots,n_p\in\mathbb{Z}}H_{n_1}\cdots H_{n_p}\,H_{n-N_p}\right], \tag{V.9}$$

for $n\neq 0$, with

$$N_p := \sum_{a=1}^{p}n_a, \tag{V.10}$$

for $p\ge 1$. In order to compute the Fourier expansion of S we first have to compute the Fourier expansion of $R^{-2}$. This is now an easy task, since the replacement $R(t)\to R(t)^{-2}$ corresponds to the replacement $(f+g)\to -2(f+g)$ and, hence, to $H_n\to -2H_n$. We get

$$R(t)^{-2} = e^{2i\Omega t}\sum_{n\in\mathbb{Z}}R^{(-2)}_n\,e^{in\omega t}, \tag{V.11}$$

with

$$R^{(-2)}_0 = e^{2i\gamma_f(\epsilon)}\left[1 + \sum_{p=1}^{\infty}\frac{2^{p+1}}{(p+1)!}\sum_{n_1,\ldots,n_p\in\mathbb{Z}}H_{n_1}\cdots H_{n_p}\,H_{-N_p}\right],$$

$$R^{(-2)}_n = e^{2i\gamma_f(\epsilon)}\left[2H_n + \sum_{p=1}^{\infty}\frac{2^{p+1}}{(p+1)!}\sum_{n_1,\ldots,n_p\in\mathbb{Z}}H_{n_1}\cdots H_{n_p}\,H_{n-N_p}\right],$$

for $n\neq 0$. The following proposition will be used below.

Proposition V.2 For all $\chi > 0$ and $|\epsilon|$ small enough, there exist constants $C_R \equiv C_R(\chi,\epsilon) > 0$ and $C_{R^{(-2)}} \equiv C_{R^{(-2)}}(\chi,\epsilon) > 0$ such that

$$|R_m| \le C_R\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{V.12}$$

$$|R^{(-2)}_m| \le C_{R^{(-2)}}\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{V.13}$$

for all $m\in\mathbb{Z}$.

Proof of Proposition V.2. Using Proposition V.1 we have, for any $p\ge 1$,

$$\left|\sum_{n_1,\ldots,n_p\in\mathbb{Z}}H_{n_1}\cdots H_{n_p}\,H_{n-N_p}\right| \le (C_H)^{p+1}\sum_{n_1,\ldots,n_p\in\mathbb{Z}}\frac{\exp\bigl(-\chi(|n_1|+\cdots+|n_p|+|n-n_1-\cdots-n_p|)\bigr)}{\bigl(\langle n_1\rangle\cdots\langle n_p\rangle\,\langle n-n_1-\cdots-n_p\rangle\bigr)^2}.$$

Making repeated use of Lemma II.3, we get

$$\left|\sum_{n_1,\ldots,n_p\in\mathbb{Z}}H_{n_1}\cdots H_{n_p}\,H_{n-N_p}\right| \le \frac{(C_HB_0)^{p+1}}{B_0}\,\frac{e^{-\chi|n|}}{\langle n\rangle^2}. \tag{V.14}$$

Inserting this into (V.8)–(V.9) gives (since $B_0 > 1$)

$$|R_n| \le \frac{e^{|\mathrm{Im}(\gamma_f(\epsilon))| + C_HB_0}}{B_0}\,\frac{e^{-\chi|n|}}{\langle n\rangle^2} \tag{V.15}$$

for all $n\in\mathbb{Z}$, as desired. The proof for $R^{(-2)}_n$ is analogous.

Assuming for a while

$$n\omega + 2\Omega \neq 0 \quad\text{for all } n\in\mathbb{Z}, \tag{V.16}$$

we have³

$$S(t) = \sigma_0 + e^{2i\Omega t}\sum_{n\in\mathbb{Z}}S_n\,e^{in\omega t} \tag{V.17}$$

with

$$S_n := -i\,\frac{R^{(-2)}_n}{n\omega + 2\Omega} \quad\text{and}\quad \sigma_0 := -\sum_{n\in\mathbb{Z}}S_n. \tag{V.18}$$

Assumption (V.16) is actually a consequence of unitarity, as will be discussed in Section V.2. The following proposition is an elementary corollary of Proposition V.2:

Proposition V.3 For all $\chi > 0$ and $|\epsilon|$ small enough, there exists a constant $C_S \equiv C_S(\chi,\epsilon) > 0$ such that

$$|S_m| \le C_S\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{V.19}$$

for all $m\in\mathbb{Z}$.

³ For the case n = 0, (V.16) says that $\Omega\neq 0$. This must hold except for $\epsilon = 0$, when Ω = 0.

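Writing

$$U(t) = \begin{pmatrix}U_{11}(t) & U_{12}(t)\\ U_{21}(t) & U_{22}(t)\end{pmatrix} = \begin{pmatrix}U_{11}(t) & U_{12}(t)\\ -\overline{U_{12}(t)} & \overline{U_{11}(t)}\end{pmatrix}, \tag{V.20}$$

we have for $U_{11}$ and $U_{12}$:

$$U_{11}(t) = e^{-i\Omega t}\,u^-_{11}(t) + e^{i\Omega t}\,u^+_{11}(t), \tag{V.21}$$

$$U_{12}(t) = e^{-i\Omega t}\,u^-_{12}(t) + e^{i\Omega t}\,u^+_{12}(t), \tag{V.22}$$

with

$$u^-_{11}(t) := (1 + ig(0)\sigma_0)\,r(t), \qquad u^+_{11}(t) := ig(0)\,v(t), \qquad u^-_{12}(t) := -i\sigma_0\,r(t), \qquad u^+_{12}(t) := -i\,v(t), \tag{V.23}$$

for

$$r(t) := \sum_{n\in\mathbb{Z}}R_n\,e^{in\omega t} \quad\text{and}\quad v(t) := \sum_{n\in\mathbb{Z}}V_n\,e^{in\omega t}, \tag{V.24}$$

with

$$V_n := \sum_{m\in\mathbb{Z}}S_{n-m}\,R_m. \tag{V.25}$$

This provides the desired Floquet form for the components of the wave function Φ(t). We note from the expressions above that the secular frequencies are ±Ω. For Ω we have the $\epsilon$-expansion

$$\Omega = \sum_{n=1}^{\infty}\epsilon^n\,G^{(n)}_0, \tag{V.26}$$

for $F_0 = 0$ or

$$\Omega = F_0 + \sum_{n=1}^{\infty}\epsilon^{2n}\,G^{(n)}_0, \tag{V.27}$$

for $F_0\neq 0$, where the coefficients $G^{(n)}_0$ are given by (III.12) or (IV.26), according to the case. Analogously, we have for g(0)

$$g(0) = \sum_{m\in\mathbb{Z}}G_m = \sum_{n=1}^{\infty}\epsilon^n\sum_{m\in\mathbb{Z}}G^{(n)}_m, \tag{V.28}$$

for $F_0 = 0$ or

$$g(0) = \sum_{m\in\mathbb{Z}}G_m = \sum_{n=1}^{\infty}\epsilon^{2n}\sum_{m\in\mathbb{Z}}G^{(n)}_m, \tag{V.29}$$

for $F_0\neq 0$. All these series converge absolutely for $|\epsilon|$ small enough. As before, we have the following corollary of Propositions V.2, V.3 and Lemma II.3: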

Proposition V.4 For all $\chi > 0$ and $|\epsilon|$ small enough, there exists a constant $C_V \equiv C_V(\chi,\epsilon) > 0$ such that

$$|V_m| \le C_V\,\frac{e^{-\chi|m|}}{\langle m\rangle^2} \tag{V.30}$$

for all $m\in\mathbb{Z}$.

This last proposition closes the proof of Theorem I.2.

V.2 Remarks on the Unitarity of the Propagator. Crossings

The unitarity of the propagator U(t) means $U(t)^*U(t) = \mathbb{1}$. By (V.20), this means

$$|U_{11}(t)|^2 + |U_{12}(t)|^2 = 1. \tag{V.31}$$

Looking at relations (V.21) and (V.22), two conclusions can be drawn from (V.31). The first is the following proposition:

Proposition V.5 For $\epsilon\in\mathbb{R}$ and under the hypotheses leading to (V.21) and (V.22) one has $\Omega\in\mathbb{R}$.

The proof follows from the obvious observation that (V.31) would be violated for |t| large enough if Ω had a non-vanishing imaginary part. Unfortunately a proof of this fact using directly the $\epsilon$-expansions of Ω, (V.26) or (V.27), is difficult and has not been found yet.

The second conclusion is that (V.16) must indeed hold. For, without this assumption there would be a term linear in t in (V.17), violating (V.31) for large |t|. As in the case of Proposition V.5, no direct proof of this fact out of the expansions for Ω, (V.26) or (V.27), has been found yet. The proof will probably follow from the fact that |Ω| has to be smaller than 2ω in the region of convergence.

Finally, note that our results say that the spectrum of the quasi-energy operator is a subset of $\{\pm\Omega + k\omega \mid k\in\mathbb{Z}\}$. Hence, the condition (V.16), $2\Omega\neq k\omega$, $k\in\mathbb{Z}$, implies the absence of crossings in the spectrum of the quasi-energy operator when $\epsilon$ varies within the convergence region. This is, of course, relevant for the adiabatic limit of systems where $\epsilon$ is a slowly varying function of time.

VI Discussion on the Classes of Solutions

Let us now discuss some aspects of conditions I and II of Theorem I.2 for the case $F_0 = 0$. As in (II.11) or (B.1), let us write the Fourier decomposition of f as

$$f(t) = \sum_{a=1}^{2J}f_a\,e^{in_a\omega t}, \tag{VI.1}$$

with $n_a = -n_{2J-a+1}$ and $f_a = \overline{f_{2J-a+1}}$ for all a with $1\le a\le J$. Comparing with (I.18) one has $f_a \equiv F_{n_a}$, $1\le a\le J$. Hence, for $F_0 = 0$ and for fixed J and ω, there are J independent complex coefficients $f_a$ and we can identify the parameter space $\mathbb{R}^{2J}$ with the set $\mathcal{F}_{J,\omega}$ of all possible functions f with a given J and ω.

Condition $M(q^2) = 0$ determines a (2J − 1) or (2J − 2)-dimensional subset of $\mathcal{F}_{J,\omega}$ and there condition II applies. It is also on this subset that the more restrictive condition $M(q^2) = M(Q_1) = 0$ should hold, restricting the parameter space of f to a (2J − 2), (2J − 3) or (2J − 4)-dimensional subset. Hence, successive conditions like I and II would eventually exhaust completely the parameter space $\mathcal{F}_{J,\omega}$.

Conditions beyond I and II have not yet been analyzed and many questions concerning the classes of solutions are still open. For instance, will further conditions like I and II really exhaust the parameter space of the functions f? Will the subtraction method of [1] and the convergence proofs of the present paper also work under these further conditions? What are the physically qualitative distinctions between the classes? Are these classes of solutions in some sense analytic continuations of each other? In Section VI.3 we give indications that the answer to the last question is no.

A distinction between class I and II may be pointed out with the observation that in class I we have power expansions in $\epsilon$ while in II we have power expansions in $\epsilon^2$. Compare relations (II.1) and (II.3) of Theorem II.1. See also Section VI.3.

VI.1 An Explicit Example

In order to illustrate these ideas and point to some problems let us consider the important example where f represents a monochromatic interaction given by

$$f(t) = \varphi_1\cos(\omega t) + \varphi_2\sin(\omega t), \tag{VI.2}$$

$\varphi_1, \varphi_2\in\mathbb{R}$. We have $f(t) = f_1e^{-i\omega t} + f_2e^{i\omega t}$ with $f_1 = (\varphi_1 + i\varphi_2)/2$, $f_2 = \overline{f_1}$, J = 1, $n_1 = -1$, $n_2 = 1$. Applying now (II.17) for this case with m = 0 we get

$$M(q^2) = Q^{(2)}_0 = e^{2i\gamma_f}\sum_{p=0}^{\infty}\frac{(-1)^p}{(p!)^2}\left(\frac{4|f_1|}{2\omega}\right)^{2p} = e^{2i\gamma_f}\,J_0\!\left(\frac{2\varphi_0}{\omega}\right), \tag{VI.3}$$

where $\varphi_0 := \sqrt{\varphi_1^2 + \varphi_2^2}$ and where $J_0$ is the Bessel function of first kind and order zero. In this case $\gamma_f = \varphi_2/\omega$.

Relation (VI.3) shows that condition I is not empty and that the locus in the $(\varphi_1, \varphi_2)$-space of the condition $M(q^2) = 0$ (necessary for condition II) is the countable family of circles centered at the origin with radius $x_a\omega/2$, a = 1, 2, . . ., where $x_a$ is the a-th zero of $J_0$ in $\mathbb{R}_+$. One shows analogously that

$$Q_m = e^{i\gamma_f}\left(\frac{f_1}{|f_1|}\right)^m J_m\!\left(\frac{2|f_1|}{\omega}\right) \tag{VI.4}$$

and

$$Q^{(2)}_m = e^{2i\gamma_f}\left(\frac{f_1}{|f_1|}\right)^m J_m\!\left(\frac{4|f_1|}{\omega}\right), \tag{VI.5}$$

for all $m\in\mathbb{Z}$, where $J_m$ is the Bessel function of first kind and order m.

For $Q^{(2)}_0 = 0$ the function $Q_1$ is periodic and we have in general

$$M(Q_1) = \frac{i}{\omega}\sum_{\substack{m\in\mathbb{Z}\\ m\neq 0}}\frac{\bigl|Q^{(2)}_m\bigr|^2}{m} = \frac{i}{\omega}\sum_{m=1}^{\infty}\frac{\bigl|Q^{(2)}_m\bigr|^2 - \bigl|Q^{(2)}_{-m}\bigr|^2}{m}. \tag{VI.6}$$

Since $|J_m(x)| = |J_{-m}(x)|$ for all $x\in\mathbb{R}$, $\forall m\in\mathbb{Z}$, it follows that $|Q^{(2)}_m| = |Q^{(2)}_{-m}|$, $\forall m\in\mathbb{Z}$. Hence, for functions f like (VI.2)

$$M(Q_1) = 0. \tag{VI.7}$$

Therefore, condition II is nowhere fulfilled. For a complete solution of the problem for functions like (VI.2), including the circles mentioned above, higher restrictions than that implied by condition II are necessary.
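Relations (VI.3) and (VI.7) lend themselves to a direct numerical cross-check. The sketch below assumes that $q(t) = \exp\bigl(i\int_0^t f(s)\,ds\bigr)$, standing in for the definition (I.17), which is not reproduced in this section; the helper names and the values of $\varphi_1$, $\varphi_2$ and ω are hypothetical. It computes the Fourier coefficients of $q^2$ by the trapezoidal rule, compares the mean value with the Bessel series of (VI.3), and evaluates the antisymmetric sum appearing in (VI.6).

```python
import cmath
import math

def q_squared(t, phi1, phi2, omega):
    # ASSUMPTION: q(t) = exp(i * integral_0^t f(s) ds), standing in for (I.17),
    # with f(t) = phi1*cos(omega t) + phi2*sin(omega t) as in (VI.2).
    primitive = (phi1 * math.sin(omega * t) - phi2 * (math.cos(omega * t) - 1.0)) / omega
    return cmath.exp(2j * primitive)

def fourier_coefficient(m, phi1, phi2, omega, samples=512):
    """m-th Fourier coefficient of q^2, by the trapezoidal rule over one period."""
    period = 2 * math.pi / omega
    total = 0j
    for k in range(samples):
        t = period * k / samples
        total += q_squared(t, phi1, phi2, omega) * cmath.exp(-1j * m * omega * t)
    return total / samples

def bessel_j0(x, terms=40):
    """Power series of J_0, the same series that appears in (VI.3)."""
    return sum((-1) ** p * (x / 2) ** (2 * p) / math.factorial(p) ** 2 for p in range(terms))

phi1, phi2, omega = 0.7, -0.4, 1.3           # hypothetical parameter values
phi0 = math.hypot(phi1, phi2)
gamma_f = phi2 / omega

lhs = fourier_coefficient(0, phi1, phi2, omega)               # M(q^2), numerically
rhs = cmath.exp(2j * gamma_f) * bessel_j0(2 * phi0 / omega)   # right-hand side of (VI.3)
asym = sum(abs(fourier_coefficient(m, phi1, phi2, omega)) ** 2 / m
           for m in range(-25, 26) if m != 0)                 # sum in (VI.6), truncated

print(abs(lhs - rhs), abs(asym))
```

Both printed numbers should be at rounding-error level under the stated assumption on q; note that the antisymmetric sum cancels for any monochromatic f, not only on the circles where $M(q^2) = 0$.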

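VI.2 A Second Example

For functions f with J > 1 the situation leading to (VI.7) is not expected in general and condition II, and possibly others, may hold in non-empty regions of the parameter space of f. This can be seen in the following example with J = 2. Let us take $f(t) = f_1(t) + f_2(t)$ with

$$f_1(t) = f_1e^{-i\omega t} + \overline{f_1}\,e^{i\omega t}, \qquad f_2(t) = f_2e^{-i2\omega t} + \overline{f_2}\,e^{i2\omega t},$$

$f_i\in\mathbb{C}$, i = 1, 2. We have $q(t) = q_1(t)q_2(t)$, where

$$q_1(t) := e^{i\gamma_{f_1}}\sum_{n\in\mathbb{Z}}e^{in\zeta_1}\,J_n\!\left(\frac{2|f_1|}{\omega}\right)e^{in\omega t}, \qquad q_2(t) := e^{i\gamma_{f_2}}\sum_{n\in\mathbb{Z}}e^{in\zeta_2}\,J_n\!\left(\frac{|f_2|}{\omega}\right)e^{in2\omega t},$$

with

$$e^{i\zeta_i} = \frac{f_i}{|f_i|}, \qquad i = 1, 2.$$

It follows that

$$Q_m = e^{i(\gamma_{f_1}+\gamma_{f_2})}\sum_{k\in\mathbb{Z}}e^{i((m-2k)\zeta_1 + k\zeta_2)}\,J_{m-2k}\!\left(\frac{2|f_1|}{\omega}\right)J_k\!\left(\frac{|f_2|}{\omega}\right),$$

$$Q^{(2)}_m = e^{2i(\gamma_{f_1}+\gamma_{f_2})}\sum_{k\in\mathbb{Z}}e^{i((m-2k)\zeta_1 + k\zeta_2)}\,J_{m-2k}\!\left(\frac{4|f_1|}{\omega}\right)J_k\!\left(\frac{2|f_2|}{\omega}\right).$$

From this we see (using $J_{-n}(x) = (-1)^nJ_n(x)$) that

$$\overline{Q^{(2)}_{-m}} = (-1)^m\,e^{-4i(\gamma_{f_1}+\gamma_{f_2})}\left[e^{2i(\gamma_{f_1}+\gamma_{f_2})}\sum_{k\in\mathbb{Z}}(-1)^k\,e^{i((m-2k)\zeta_1 + k\zeta_2)}\,J_{m-2k}\!\left(\frac{4|f_1|}{\omega}\right)J_k\!\left(\frac{2|f_2|}{\omega}\right)\right].$$

The factor between brackets differs from $Q^{(2)}_m$ due to the presence of the factor $(-1)^k$ in the sum over $k\in\mathbb{Z}$. Hence, we should rather expect $|Q^{(2)}_m| \neq |Q^{(2)}_{-m}|$ in this case, which most likely implies $M(Q_1)\neq 0$ for $M(q^2) = 0$, leading to a non-empty condition II.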

VI.3 The Secular Frequency

For $F_0 = 0$, case I, relation (V.26) says that

$$\Omega = \epsilon\,\bigl|Q^{(2)}_0\bigr| + \epsilon^2\,G^{(2)}_0 + \sum_{n=3}^{\infty}\epsilon^n\,G^{(n)}_0. \tag{VI.8}$$

Because of condition I, the first order contribution in $\epsilon$ is non-vanishing. However, as one easily checks, $G^{(2)}_0 = 0$ and, hence, the second order contribution to Ω is always zero. As we will see, this no longer happens in the case $F_0\neq 0$. For $F_0\neq 0$ we have from (V.27), (IV.27) and (IV.28)

$$\Omega = F_0 + \sum_{n=1}^{\infty}\epsilon^{2n}\,G^{(n)}_0 = F_0 + \epsilon^2\left[\sum_{a_1\in\mathbb{Z}}\frac{\bigl|Q^{(2)}_{a_1}\bigr|^2}{a_1\omega + 2F_0}\right] + \sum_{n=2}^{\infty}\epsilon^{2n}\left[\sum_{p=1}^{n-1}\sum_{a_1,b_1\in\mathbb{Z}}\frac{Q^{(2)}_{-a_1-b_1}\,E^{(p)}_{a_1}E^{(n-p)}_{b_1}}{(a_1+b_1)\omega - 2F_0}\right]. \tag{VI.9}$$

It is interesting to study the limit $F_0\to 0$ of Ω given in (VI.9). If $Q^{(2)}_0\neq 0$ the limit $F_0\to 0$ of Ω given in (VI.9) is termwise singular, in contrast to the expression for Ω obtained under the condition $F_0 = 0$.

For $Q^{(2)}_0 = 0$ the situation is analogous, as we discuss briefly now. For $E^{(1)}_m$ we have

$$E^{(1)}_m := \sum_{\substack{a_1\in\mathbb{Z}\\ a_1\neq 0}}\frac{Q_{m+a_1}\,\overline{Q^{(2)}_{a_1}}}{a_1\omega + 2F_0} \;\Longrightarrow\; \lim_{F_0\to 0}E^{(1)}_m = \mathcal{E}^{(1)}_m := \sum_{\substack{a_1\in\mathbb{Z}\\ a_1\neq 0}}\frac{Q_{m+a_1}\,\overline{Q^{(2)}_{a_1}}}{a_1\omega}, \tag{VI.10}$$

and hence $\lim_{F_0\to 0}E^{(1)}_m$ exists and is well defined for all $m\in\mathbb{Z}$. However, for $E^{(2)}_m$, we have

$$E^{(2)}_m = \sum_{a_1,b_1\in\mathbb{Z}}\frac{Q_{m-a_1-b_1}}{(a_1+b_1)\omega - 2F_0}\,E^{(1)}_{a_1}E^{(1)}_{b_1} = S_0 + S_1 \tag{VI.11}$$

with

$$S_0 := -\frac{Q_m}{2F_0}\sum_{a_1\in\mathbb{Z}}E^{(1)}_{a_1}E^{(1)}_{-a_1}, \qquad S_1 := \sum_{\substack{a_1,b_1\in\mathbb{Z}\\ a_1+b_1\neq 0}}\frac{Q_{m-a_1-b_1}}{(a_1+b_1)\omega - 2F_0}\,E^{(1)}_{a_1}E^{(1)}_{b_1}. \tag{VI.12}$$

The limit $F_0\to 0$ exists for $S_1$, but not for $S_0$. One easily sees that

$$\lim_{F_0\to 0}G^{(1)}_0 = \sum_{\substack{a_1\in\mathbb{Z}\\ a_1\neq 0}}\frac{\bigl|Q^{(2)}_{a_1}\bigr|^2}{a_1\omega} \tag{VI.13}$$

and that

$$\lim_{F_0\to 0}G^{(2)}_0 = \sum_{\substack{a_1,b_1\in\mathbb{Z}\\ a_1+b_1\neq 0}}\frac{Q^{(2)}_{-a_1-b_1}\,\mathcal{E}^{(1)}_{a_1}\mathcal{E}^{(1)}_{b_1}}{(a_1+b_1)\omega}, \tag{VI.14}$$

where $\mathcal{E}^{(1)}_m$ is defined in (VI.10). However,

$$G^{(3)}_0 = \sum_{a_1,b_1\in\mathbb{Z}}\frac{Q^{(2)}_{-a_1-b_1}\,E^{(1)}_{a_1}E^{(2)}_{b_1}}{(a_1+b_1)\omega - 2F_0} \tag{VI.15}$$

and the limit $F_0\to 0$ of the right hand side does not exist, since it does not exist for $E^{(2)}_{b_1}$. The same must hold for $G^{(n)}_0$ with n > 3. The conclusion is, thus, the same as in the case $Q^{(2)}_0\neq 0$.

The remarks above indicate that the limit $F_0\to 0$ of the solution of (I.5) obtained here is singular and does not converge to the solution corresponding to the case $F_0 = 0$. All this strongly suggests that the radius of convergence of the expansions for the case $F_0\neq 0$ shrinks to zero when the limit $F_0\to 0$ is performed. An indication of this was already discussed in the paragraphs following equation (IV.19). More generally, the same must happen when $2F_0$ approaches an integer multiple of ω.

All this should not be surprising since there is no reason to expect analyticity or even continuity of, for instance, the secular frequency Ω as a function of the parameters defining f. Recall that, generically, we have $Q^{(2)}_0\neq 0$ for $F_0 = 0$ but $Q^{(2)}_0 = 0$ for $F_0\neq 0$ and, hence, both expansions can be rather different.

Appendices A

Short Description of the Strategy Followed in [1]

For convenience of the reader we reproduce the main steps of the strategy developed in [1] for finding a power series solution of the generalized Riccati equation (I.7) without secular terms. As discussed in Section I, a natural proposal is to express g, a particular solution of (I.7), as a formal power expansion on which vanishes at = 0. For convenience, we write this expansion as in (I.16) where q(t) is defined in (I.17). This would give the desired solution, provided the infinite sum converges. Inserting (I.16) into (I.7) leads to ∞ n−1 q 2 cp cn−p − 2if qcn n + i2 = 0. (A.1) (qcn ) − i n=1

p=1

Assuming that the coefficients vanish order by order we conclude (qc1 ) − 2if qc1 = 0,

(A.2)

(qc2 ) − iq 2 c21 − 2if qc2 + i = 0, n−1 (qcn ) − i q 2 cp cn−p − 2if qcn = 0,

(A.3) n ≥ 3.

(A.4)

p=1

The solutions of (A.2)–(A.3) are c1 (t) = α1 q(t), ! t " 2 2 α1 q(t ) − q(t )−2 dt + α2 , c2 (t) = q(t) i 0 n−1 t cp (t )cn−p (t ) dt + αn , cn (t) = q(t) i p=1

0

(A.5) (A.6) for n ≥ 3, (A.7)

where the $\alpha_n$'s above, $n = 1, 2, \ldots$, are arbitrary integration constants. The key idea is to fix the integration constants $\alpha_i$ in such a way as to eliminate the constant terms from the integrands in (A.6) and (A.7). The remaining terms involve sums of exponentials like $e^{in\omega t}$, $n \neq 0$, which do not develop secular terms when integrated, in contrast to the constant terms. For instance, fixing $\alpha_1$ such that $M(\alpha_1^2 q^2 - q^{-2}) = 0$, that is, $\alpha_1^2 = M(q^{-2})/M(q^2)$, prevents secular terms in (A.6). As shown in [1], this procedure can be implemented in all orders, fixing all constants $\alpha_i$ and preventing secular terms in all functions $c_n(t)$. In case I, relations (II.5)–(II.7) represent precisely relations (A.5)–(A.7) in Fourier space with the integration constants fixed as explained above. Case II is analogous.
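The choice of $\alpha_1$ can be illustrated numerically. The following is only a minimal sketch under assumptions that are not taken from the paper: it assumes that $M(\cdot)$ denotes the mean value over one period, and it uses the sample field $f(t) = A\cos(\omega t)$ (so $F_0 = 0$ and $q(t) = \exp(i (A/\omega)\sin\omega t)$) with arbitrarily chosen values of `A` and `omega`. All function names are hypothetical.

```python
import cmath
import math

def mean_over_period(g, omega, samples=2048):
    """Mean value of a (2*pi/omega)-periodic function g (rectangle rule)."""
    period = 2.0 * math.pi / omega
    return sum(g(k * period / samples) for k in range(samples)) / samples

# Illustrative field (an assumption, not from the paper): f(t) = A*cos(omega*t).
A, omega = 0.5, 1.0

def q(t):
    # q(t) = exp(i * int_0^t f(s) ds) for the sample field above.
    return cmath.exp(1j * (A / omega) * math.sin(omega * t))

# alpha_1^2 = M(q^-2) / M(q^2): removes the constant term of the integrand in (A.6).
mean_q2 = mean_over_period(lambda t: q(t) ** 2, omega)
mean_qm2 = mean_over_period(lambda t: q(t) ** -2, omega)
alpha1_sq = mean_qm2 / mean_q2

def integral_over_periods(periods, a_sq):
    """Integral of a_sq*q^2 - q^-2 over a whole number of periods.

    For a periodic integrand this equals (number of periods) * period * mean,
    so a non-zero mean produces linear (secular) growth.
    """
    period = 2.0 * math.pi / omega
    mean = mean_over_period(lambda t: a_sq * q(t) ** 2 - q(t) ** -2, omega)
    return periods * period * mean

secular_free = integral_over_periods(50, alpha1_sq)  # stays at rounding level
secular = integral_over_periods(50, 2.0)             # grows with the number of periods
```

With the mean-cancelling value of `alpha1_sq` the integral over 50 periods stays at rounding level, whereas an arbitrary other value (here 2.0) gives a contribution proportional to the elapsed time.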

B The Decay of the Fourier Coefficients of $q$ and $q^2$

To prove our main results on the Fourier coefficients of the functions $c_n$ and $e_n$ we have to establish some results on the decay of the Fourier coefficients of $q$ and $q^2$. For periodic $f$ we write the Fourier series (I.18) in the form

$$f(t) = F_0 + \sum_{n \in \mathbb{Z},\ n \neq 0} F_n e^{in\omega t},$$

with $\overline{F_n} = F_{-n}$, since $f$ is real. In order to simplify our analysis we will consider here the case where the sum above is a finite sum. This situation is physically more realistic anyway. By assumption, the set of integers $\{n \in \mathbb{Z},\ n \neq 0 \mid F_n \neq 0\}$ is a finite set and, by the condition that $f$ is real and $F_0 = 0$, it contains an even number of elements, say $2J$ with $J \geq 1$. Let us write this set of integers as $\{n_1, \ldots, n_{2J}\}$ and write

$$f(t) = F_0 + \sum_{a=1}^{2J} f_a e^{i n_a \omega t}, \tag{B.1}$$

with the convention that $n_a = -n_{2J-a+1}$, for all $1 \leq a \leq J$, with $f_a \equiv F_{n_a}$. Clearly $\overline{f_a} = f_{2J-a+1}$, $1 \leq a \leq J$. Relation (II.20) becomes

$$Q_m = e^{i\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta(P, m) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{f_a}{n_a \omega} \right)^{p_a} \right], \tag{B.2}$$

where

$$P \equiv P(p_1, \ldots, p_{2J}, n_1, \ldots, n_{2J}) := \sum_{b=1}^{2J} p_b n_b \in \mathbb{Z}, \tag{B.3}$$

and where

$$\gamma_f := i \sum_{a=1}^{2J} \frac{f_a}{n_a \omega}.$$

As one easily sees, $\gamma_f \in \mathbb{R}$. Above, $\delta(P, m)$ is the Kronecker delta:

$$\delta(P, m) := \begin{cases} 1, & \text{if } P = m, \\ 0, & \text{else.} \end{cases} \tag{B.4}$$

Relation (II.21) becomes

$$Q^{(2)}_m = e^{2i\gamma_f} \sum_{p_1, \ldots, p_{2J} = 0}^{\infty} \delta(P, m) \prod_{a=1}^{2J} \left[ \frac{1}{p_a!} \left( \frac{2 f_a}{n_a \omega} \right)^{p_a} \right]. \tag{B.5}$$

The coefficients $Q_m$ and $Q^{(2)}_m$ can also be expressed in terms of Bessel functions of the first kind and integer order. See Section VI for some examples. As in [1], define

$$\varphi := \max_{1 \leq a \leq 2J} \left| \frac{f_a}{n_a \omega} \right| \qquad \text{and} \qquad N := \sum_{b=1}^{2J} |n_b|.$$

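Note that, since the $n_b$'s are fixed by the choice of $f$, $N$ is non-zero. The following important bounds have been proven in [1], Appendix D:

$$|Q_m| \leq 2J e^{(2J-1)\varphi} \, \frac{\varphi^{\lceil N^{-1}|m| \rceil}}{\lceil N^{-1}|m| \rceil !} \left( 1 - \frac{\varphi}{\lceil N^{-1}|m| \rceil + 1} \right)^{-1}, \tag{B.6}$$

and

$$|Q^{(2)}_m| \leq 2J e^{(2J-1)2\varphi} \, \frac{(2\varphi)^{\lceil N^{-1}|m| \rceil}}{\lceil N^{-1}|m| \rceil !} \left( 1 - \frac{2\varphi}{\lceil N^{-1}|m| \rceil + 1} \right)^{-1}, \tag{B.7}$$

for all $m$ with $\lceil N^{-1}|m| \rceil + 1 > 2\varphi$. Above, $\lceil x \rceil$ is the lowest integer larger than or equal to $x$. In [1] we derived from (B.6) a simple exponential bound for $|Q_m|$, namely,

$$|Q_m| \leq Q e^{-\chi |m|}, \tag{B.8}$$

where $Q$ and $\chi$ are some positive constants. For the purposes of this paper a sharper bound than (B.8) is needed and we have to study relation (B.6) more carefully. The result is expressed in Proposition II.2, whose proof we present now.

Proof of Proposition II.2. Let us consider first the coefficients $Q_m$. Due to the dominating factor $\lceil N^{-1}|m| \rceil !$, one has

$$\lim_{|m| \to \infty} \frac{\langle m \rangle^2 \, \varphi^{\lceil N^{-1}|m| \rceil}}{e^{-\chi |m|} \, \lceil N^{-1}|m| \rceil !} = 0$$

for any constant $\chi > 0$. Hence, one can choose a constant $M_1 > 0$ depending on $\chi$ such that

$$\frac{\varphi^{\lceil N^{-1}|m| \rceil}}{\lceil N^{-1}|m| \rceil !} \leq M_1 \frac{e^{-\chi |m|}}{\langle m \rangle^2}$$

for all $m \in \mathbb{Z}$. Therefore, there exists a positive constant $Q_1 > 0$ (depending on $\chi$) such that $|Q_m| \leq Q_1 \langle m \rangle^{-2} e^{-\chi |m|}$ for all $m \in \mathbb{Z}$. For $Q^{(2)}_m$ we proceed in the same way and get the bound $|Q^{(2)}_m| \leq Q_2 \langle m \rangle^{-2} e^{-\chi |m|}$ for all $m \in \mathbb{Z}$. In (II.23) we adopt $Q = \max\{Q_1, Q_2\}$.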
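The series (B.2) and the bound (B.6) can be checked numerically in the simplest case. The sketch below is an illustration under assumptions not taken from the paper: it uses the sample field $f(t) = A\cos(\omega t)$, for which $J = 1$, $n_1 = 1$, $n_2 = -1$, $f_1 = f_2 = A/2$, $\gamma_f = 0$, $\varphi = A/(2\omega)$ and $N = 2$, and it compares (B.2) with the Fourier coefficients of $q(t) = \exp(i(A/\omega)\sin\omega t)$ computed by quadrature. The function names and the truncation parameters are hypothetical.

```python
import cmath
import math

def Q_series(m, x, terms=40):
    """Q_m from (B.2) for f(t) = A*cos(omega*t), with x = A/omega.

    The Kronecker delta forces p_1 - p_2 = m; the sum over p_2 is truncated.
    """
    total = 0.0
    for p2 in range(terms):
        p1 = p2 + m
        if p1 < 0:
            continue
        total += ((x / 2) ** p1 * (-x / 2) ** p2
                  / (math.factorial(p1) * math.factorial(p2)))
    return total

def Q_quadrature(m, x, samples=256):
    """m-th Fourier coefficient of exp(i*x*sin(s)) in the phase s = omega*t."""
    acc = 0j
    for k in range(samples):
        s = 2.0 * math.pi * k / samples
        acc += cmath.exp(1j * (x * math.sin(s) - m * s))
    return acc / samples

def bound_B6(m, phi, N, J):
    """Right-hand side of (B.6); K is the ceiling of |m|/N."""
    K = -(-abs(m) // N)
    return (2 * J * math.exp((2 * J - 1) * phi) * phi ** K
            / math.factorial(K) / (1.0 - phi / (K + 1)))

x = 1.0                      # sample value of A/omega
phi, N, J = x / 2, 2, 1      # parameters of (B.6) for this field
table = [(m, Q_series(m, x), bound_B6(m, phi, N, J)) for m in range(-8, 9)]
```

For this field the coefficients reduce to Bessel functions of the first kind (this is the Jacobi-Anger expansion), in line with the remark made after (B.5).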

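C Bounds on Convolutions

Here we will prove Lemma II.3. Consider for $\chi > 0$ and $m \in \mathbb{Z}$

$$B(m) \equiv B(m, \chi) := \sum_{n \in \mathbb{Z}} \frac{e^{-\chi(|m-n| + |n|)}}{\langle m - n \rangle^2 \langle n \rangle^2}. \tag{C.1}$$

First, note that $B(m) = B(-m)$ for all $m \in \mathbb{Z}$. Choosing $B_0$ to be such that

$$B_0 \geq \sum_{n \in \mathbb{Z}} \frac{e^{-2\chi |n|}}{\langle n \rangle^4}$$

the statement of the lemma becomes trivially correct for $m = 0$. Hence, it is enough to consider the case where $m > 0$. In (C.1), the sum over all $n \in \mathbb{Z}$ can be split into three sums:

$$B(m) = e^{-\chi m} \sum_{n=-\infty}^{-1} \frac{e^{2\chi n}}{(m-n)^2 n^2} + e^{-\chi m} \sum_{n=0}^{m} \frac{1}{\langle m-n \rangle^2 \langle n \rangle^2} + e^{\chi m} \sum_{n=m+1}^{\infty} \frac{e^{-2\chi n}}{(m-n)^2 n^2}. \tag{C.2}$$

In the first sum above we perform the change of variables $n \to -n$ and in the third sum we perform the change of variables $n \to n + m$. The result is

$$B(m) = e^{-\chi m} \left[ 2 \sum_{n=1}^{\infty} \frac{e^{-2\chi n}}{(m+n)^2 n^2} + \sum_{n=0}^{m} \frac{1}{\langle m-n \rangle^2 \langle n \rangle^2} \right]. \tag{C.3}$$

Now we will study separately each of the sums in (C.3). Since for $n \geq 1$ one has $m + n \geq m$, one has for the first sum

$$\sum_{n=1}^{\infty} \frac{e^{-2\chi n}}{(m+n)^2 n^2} \leq \frac{B_1}{m^2}, \tag{C.4}$$

where

$$B_1 := \sum_{n=1}^{\infty} \frac{e^{-2\chi n}}{n^2}.$$

The second sum in (C.3) is a little more involved. We have

$$\sum_{n=0}^{m} \frac{1}{\langle m-n \rangle^2 \langle n \rangle^2} = \sum_{n=0}^{\lfloor m/2 \rfloor} \frac{1}{\langle m-n \rangle^2 \langle n \rangle^2} + \sum_{n=\lfloor m/2 \rfloor + 1}^{m} \frac{1}{\langle m-n \rangle^2 \langle n \rangle^2}. \tag{C.5}$$

For the first sum in the right hand side of (C.5) we have $\langle m - n \rangle \geq m - n \geq m - \lfloor m/2 \rfloor \geq m/2$. For the second sum in the right hand side of (C.5) we have $n \geq \lfloor m/2 \rfloor + 1 \geq m/2$. Hence, for $m > 0$,

$$\sum_{n=0}^{m} \frac{1}{\langle m-n \rangle^2 \langle n \rangle^2} \leq \left( \frac{2}{m} \right)^2 \left[ \sum_{n=0}^{\lfloor m/2 \rfloor} \frac{1}{\langle n \rangle^2} + \sum_{n=\lfloor m/2 \rfloor + 1}^{m} \frac{1}{\langle m-n \rangle^2} \right] \leq 2 \left( \frac{2}{m} \right)^2 \sum_{n=0}^{\infty} \frac{1}{\langle n \rangle^2}. \tag{C.6}$$

Therefore, choosing

$$B_0 = 2 B_1 + 8 \sum_{n=0}^{\infty} \frac{1}{\langle n \rangle^2} \tag{C.7}$$

the lemma is proven.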
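The estimate obtained in this appendix can be probed numerically. The sketch below rests on two assumptions that are not stated in this appendix: that the bracket means $\langle n \rangle = |n|$ for $n \neq 0$ and $\langle 0 \rangle = 1$, and that the statement being proven reads $B(m) \leq B_0 e^{-\chi|m|}/\langle m \rangle^2$, which is what (C.3), (C.4), (C.6) and (C.7) combine to give. Infinite sums are truncated, so this is a sanity check and not a proof; all names and cutoffs are hypothetical.

```python
import math

def bracket(n):
    # Assumed convention: <n> = |n| for n != 0 and <0> = 1.
    return max(abs(n), 1)

def B(m, chi, cutoff=500):
    """Truncated version of the convolution sum (C.1)."""
    return sum(
        math.exp(-chi * (abs(m - n) + abs(n))) / (bracket(m - n) ** 2 * bracket(n) ** 2)
        for n in range(-cutoff, cutoff + 1)
    )

def B0(chi, cutoff=20000):
    """Truncated version of the constant (C.7): 2*B_1 + 8*sum_{n>=0} 1/<n>^2."""
    B1 = sum(math.exp(-2.0 * chi * n) / n ** 2 for n in range(1, cutoff + 1))
    S = sum(1.0 / bracket(n) ** 2 for n in range(0, cutoff + 1))
    return 2.0 * B1 + 8.0 * S

def lemma_holds(chi, m_max=30):
    b0 = B0(chi)
    return all(
        B(m, chi) <= b0 * math.exp(-chi * abs(m)) / bracket(m) ** 2
        for m in range(-m_max, m_max + 1)
    )

checks = {chi: lemma_holds(chi) for chi in (0.1, 1.0)}
```

In this experiment the inequality holds with a comfortable margin, which reflects the rather generous constant 8 in (C.7).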

D Catalan Numbers. Bounds on the Constants $K_n$

Here we will prove the crucial Theorem III.2. Let us start by recalling that we have chosen $K_1 = K_2 = C_1$ for some constant $C_1$ which, in turn, can be chosen without loss of generality to be larger than or equal to 1. The proof of Theorem III.2 will be presented in four steps.

Step 1. In this step we show that the sequence $K_n$, defined in (III.7), is an increasing sequence. First, note that $K_3 = C_2 (2 K_1 K_2 + (K_2)^2) = 3 C_2 (K_2)^2$. Since $K_1 = K_2 \geq 1$ and $C_2 \geq 1$, we have $K_1 = K_2 < K_3$. Let us now suppose that

$$K_1 = K_2 < K_3 < \cdots < K_n \tag{D.1}$$

for some $n \geq 3$. We will show that $K_{n+1} > K_n$. We have

$$K_{n+1} - K_n = C_2 \left[ \sum_{p=1}^{n} K_p K_{n-p+1} + \sum_{p=2}^{n} K_p K_{n-p+2} - \sum_{p=1}^{n-1} K_p K_{n-p} - \sum_{p=2}^{n-1} K_p K_{n-p+1} \right]$$

$$= C_2 \left[ 2 K_1 K_n + \sum_{p=2}^{n} K_p K_{n-p+2} - \sum_{p=1}^{n-1} K_p K_{n-p} \right]$$

$$= C_2 \left[ 2 K_1 K_n + (K_n - K_{n-1}) K_1 + (K_3 - K_1) K_{n-1} + \cdots + (K_n - K_{n-2}) K_2 \right],$$

where in the last equality we used $K_1 = K_2$. Now, from hypothesis (D.1) we conclude that $K_{n+1} > K_n$, thus proving that $K_n$ is an increasing sequence.



Step 2. Here we show that the sequence $K_n$ defined in (III.7) satisfies

$$K_n \leq 3 C_2 \sum_{p=2}^{n-1} K_p K_{n-p+1} \tag{D.2}$$

for all $n \geq 3$. We have already shown that $K_3 = 3 C_2 (K_2)^2$. Hence, (D.2) is obeyed for $n = 3$. Assume now that (D.2) is satisfied for all $K_p$ with $p \in \{1, \ldots, n-1\}$, for some $n \geq 4$. We will show that it is also satisfied for $K_n$. In fact, we have from (III.7)

$$K_n = C_2 \left[ K_1 K_{n-1} + \sum_{a=2}^{n-1} K_a (K_{n-a} + K_{n-a+1}) \right]. \tag{D.3}$$

From this and from the fact proven in step 1 that the sequence $K_n$ is increasing, it follows that

$$K_n \leq C_2 \left[ K_1 K_{n-1} + 2 \sum_{a=2}^{n-1} K_a K_{n-a+1} \right]. \tag{D.4}$$

Now, using the obvious relation

$$K_1 K_{n-1} = K_2 K_{n-1} \leq \sum_{a=2}^{n-1} K_a K_{n-a+1}$$

we get finally from (D.4)

$$K_n \leq 3 C_2 \sum_{p=2}^{n-1} K_p K_{n-p+1}, \tag{D.5}$$

thus proving (D.2).

Step 3. Here we will prove the following statement. Let $L_n$ be defined as the sequence such that $L_1 = L_2 = K_1 = K_2 = C_1$ and

$$L_n = 3 C_2 \sum_{p=2}^{n-1} L_p L_{n-p+1}. \tag{D.6}$$

Then, one has

$$K_n \leq L_n, \qquad \forall n \in \mathbb{N}. \tag{D.7}$$

First, note that $K_3 = 3 C_2 (K_1)^2 = 3 C_2 (L_1)^2 = L_3$. Hence, (D.7) is valid for $n \in \{1, 2, 3\}$. Now suppose $K_p \leq L_p$ for all $p \in \{1, \ldots, n-1\}$ for some $n \geq 4$. One has from (D.2)

$$K_n \leq 3 C_2 \sum_{p=2}^{n-1} K_p K_{n-p+1} \leq 3 C_2 \sum_{p=2}^{n-1} L_p L_{n-p+1} = L_n, \tag{D.8}$$

thus proving (D.7).



Step 4. Consider the sequence $c_n$ defined as follows: $c_1 = c_2 = 1$ and

$$c_n = \sum_{p=2}^{n-1} c_p c_{n-p+1} \tag{D.9}$$

for $n \geq 3$. The numbers $c_n$ so defined are called "Catalan numbers", after the mathematician Eugène C. Catalan. The Catalan numbers arise in several combinatorial problems (for a historical account with proofs, see [19]) and can be expressed in closed form as

$$c_n = \frac{(2n-4)!}{(n-1)! \, (n-2)!}, \qquad n \geq 2 \tag{D.10}$$

(see, for instance, [19] or [20]). Using Stirling's formula we get the following asymptotic behaviour for the Catalan numbers:

$$c_n \approx \frac{1}{16 \sqrt{\pi}} \, \frac{4^n}{n^{3/2}}, \qquad n \text{ large.} \tag{D.11}$$

The existence of a connection between the Catalan numbers and the sequence $L_n$ defined above is evident. Two distinctions are the factor $3 C_2$ appearing in (D.6) and the fact that $L_1 = L_2 = C_1$ is not necessarily equal to 1. Nevertheless, using the definition of the Catalan numbers in (D.9), it is easy to prove the following closed expression for the numbers $L_n$:

$$L_n = (C_1)^{n-1} (3 C_2)^{n-2} \, \frac{(2n-4)!}{(n-1)! \, (n-2)!}, \qquad n \geq 2. \tag{D.12}$$

We omit the proof here. Hence, the following asymptotic behaviour can be established:

$$L_n \approx \frac{1}{144 \, C_1 C_2^2 \sqrt{\pi}} \, \frac{(12 \, C_1 C_2)^n}{n^{3/2}}, \qquad n \text{ large.} \tag{D.13}$$

From the inequality $K_n \leq L_n$, proven in step 3, it follows that $K_n \leq K_0 (12 \, C_1 C_2)^n$ for some constant $K_0 > 0$, for all $n \in \mathbb{N}$. Theorem III.2 is now proven.
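The four steps above lend themselves to a quick check in exact integer arithmetic. The sketch below is only an illustration: it takes the recursion for $K_n$ in the form (D.3), the recursion (D.6) for $L_n$ and (D.9) for $c_n$, and compares them with the closed forms (D.10) and (D.12). The sample values `C1 = 2` and `C2 = 3` and all function names are arbitrary choices made for this example.

```python
from math import factorial

def K_sequence(C1, C2, n_max):
    """K_n from the recursion (D.3) with K_1 = K_2 = C1 (index 0 unused)."""
    K = [None, C1, C1]
    for n in range(3, n_max + 1):
        inner = sum(K[a] * (K[n - a] + K[n - a + 1]) for a in range(2, n))
        K.append(C2 * (K[1] * K[n - 1] + inner))
    return K

def L_sequence(C1, C2, n_max):
    """L_n from the recursion (D.6) with L_1 = L_2 = C1 (index 0 unused)."""
    L = [None, C1, C1]
    for n in range(3, n_max + 1):
        L.append(3 * C2 * sum(L[p] * L[n - p + 1] for p in range(2, n)))
    return L

def L_closed(C1, C2, n):
    """Closed expression (D.12), valid for n >= 2."""
    return (C1 ** (n - 1) * (3 * C2) ** (n - 2) * factorial(2 * n - 4)
            // (factorial(n - 1) * factorial(n - 2)))

def c_sequence(n_max):
    """c_n from the recursion (D.9) with c_1 = c_2 = 1 (index 0 unused)."""
    c = [None, 1, 1]
    for n in range(3, n_max + 1):
        c.append(sum(c[p] * c[n - p + 1] for p in range(2, n)))
    return c

C1, C2, n_max = 2, 3, 12          # sample constants with C1 >= 1 and C2 >= 1
K = K_sequence(C1, C2, n_max)
L = L_sequence(C1, C2, n_max)
c = c_sequence(n_max)
```

Here `c[1:8]` evaluates to `[1, 1, 1, 2, 5, 14, 42]`, so $c_n$ is the Catalan number with index $n - 2$ in the more common indexing that starts at zero.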

Acknowledgments

I am very indebted to Walter F. Wreszinski for enthusiastically supporting this work and for many important suggestions. I am also grateful to Paulo A. F. da Veiga for discussions and to César R. de Oliveira for asking the right questions. Partial financial support from CNPq is herewith recognized.



References

[1] J. C. A. Barata, On Formal Quasi-Periodic Solutions of the Schrödinger Equation for a Two-Level System with a Hamiltonian Depending Quasi-Periodically on Time, Rev. Math. Phys. 12, 25–64 (2000).
[2] J. C. A. Barata and W. F. Wreszinski, Strong Coupling Theory of Two Level Atoms in Periodic Fields, Phys. Rev. Lett. 84, 2112–2115 (2000).
[3] W. F. Wreszinski, Atoms and Oscillators in Quasi-Periodic External Fields, Helv. Phys. Acta 70, 109–123 (1997).
[4] W. F. Wreszinski and S. Casmeridis, Models of Two Level Atoms in Quasiperiodic External Fields, J. Stat. Phys. 90, 1061 (1998).
[5] I. I. Rabi, Space Quantization in a Gyrating Magnetic Field, Phys. Rev. 51, 652–654 (1937).
[6] F. Bloch and A. Siegert, Magnetic Resonance for Nonrotating Fields, Phys. Rev. 57, 522–527 (1940).
[7] S. H. Autler and C. H. Townes, Stark Effect in Rapidly Varying Fields, Phys. Rev. 100, 703–722 (1955).
[8] L. H. Eliasson, Absolutely Convergent Series Expansions for Quasi Periodic Motions, Mathematical Physics Electronic Journal 2, No. 4 (1996). URL: http://www.ma.utexas.edu/mpej/MPEJ.html
[9] L. H. Eliasson, Floquet Solutions for the 1-Dimensional Quasi-Periodic Schrödinger Equation, Comm. Math. Phys. 146, 447–482 (1992).
[10] G. Gentile and V. Mastropietro, Methods for the Analysis of the Lindstedt Series for KAM Tori and Renormalizability in Classical Mechanics. A Review with Some Applications, Rev. Math. Phys. 8, 393–444 (1996).
[11] G. Benfatto, G. Gentile and V. Mastropietro, Electrons in a Lattice with an Incommensurate Potential, J. Stat. Phys. 89, 655–708 (1997).
[12] M. Frasca, Duality in Perturbation Theory and the Quantum Adiabatic Approximation, Phys. Rev. A 58, 3439–3442 (1998).
[13] W. Scherer, Superconvergent Perturbation Method in Quantum Mechanics, Phys. Rev. Lett. 74, 1495 (1995).
[14] J. Feldman and E. Trubowitz, Renormalization in Classical Mechanics and Many Body Quantum Field Theory, Journal d'Analyse Mathématique 58, 213 (1992).
[15] H. R. Jauslin, Stability and Chaos in Classical and Quantum Hamiltonian Systems, P. Garrido and J. Marro (editors). II Granada Seminar on Computational Physics – World Scientific, Singapore (1993).
[16] Yitzhak Katznelson, An Introduction to Harmonic Analysis, Dover Publications, Inc. (1978).
[17] C. Corduneanu, Almost Periodic Functions, Interscience Publishers – John Wiley & Sons (1968).
[18] Michael Reed and Barry Simon, Methods of Modern Mathematical Physics Vol. 2. Fourier Analysis, Self-Adjointness, Academic Press, New York (1972–1979).
[19] Heinrich Dörrie, 100 Great Problems of Elementary Mathematics. Their History and Solution, Dover Publications, Inc. (1965). Originally published in German under the title of "Triumph der Mathematik. Hundert berühmte Probleme aus zwei Jahrtausenden mathematischer Kultur", Physica-Verlag, Würzburg (1958).
[20] Ronald L. Graham, Donald E. Knuth and Oren Patashnik, Concrete Mathematics – A Foundation for Computer Science, Addison-Wesley Publishing Company (1994).
[21] Harro Heuser, Gewöhnliche Differentialgleichungen, B. G. Teubner, Stuttgart (1991).
[22] Harry Hochstadt, The Functions of Mathematical Physics, Dover Publications, Inc. (1986).

João C. A. Barata
Universidade de São Paulo
Instituto de Física
Caixa Postal 66 318
05315 970 São Paulo SP
Brasil
email: [email protected]

Communicated by Rafael D. Benguria
submitted 19/07/00, accepted 09/04/01




Future Global in Time Einsteinian Spacetimes with U(1) Isometry Group

Y. Choquet-Bruhat and V. Moncrief ∗

Abstract. We prove that spacetimes satisfying the vacuum Einstein equations on a manifold of the form Σ × U (1) × R where Σ is a compact surface of genus G > 1 and where the Cauchy data is invariant with respect to U(1) and sufficiently small exist for an infinite proper time in the expanding direction.

1 Introduction

In this paper we prove a global in time existence theorem, in the expanding direction, for a family of spatially compact vacuum spacetimes having spacelike U(1) isometry groups. The 4-manifolds we consider have the form V = M × R where M is an (orientable) circle bundle over a compact higher genus surface Σ and where the spacetime metric is assumed to be invariant with respect to the natural action of U(1) along the bundle’s circle fibers. We reduce Einstein’s equations, à la Kaluza-Klein, to a system on the base Σ × R where it takes the form of the 2+1 dimensional Einstein equations coupled to a wave map matter source whose target space is the hyperbolic plane. This wave map represents the true gravitational wave degrees of freedom that have descended from 3+1 dimensions to appear as “matter” degrees of freedom in 2+1 dimensions. The 2+1 metric itself contributes only a finite number of additional, Teichmüller parameter, degrees of freedom which couple to the wave map and control the conformal geometry of Σ. After the constraints have been solved and coordinate conditions imposed, through a well defined elliptic system, nothing remains but the evolution problem for the wave map / Teichmüller parameter system, though the latter has now become non local in the sense that the “background” metric in which the wave map is propagating is now a non local functional of the wave map itself, given by the solution of the elliptic system mentioned above. Thus even in the special “polarized” case which we concentrate on here, in which the wave map reduces to a pure wave equation, this wave equation is now both non linear and non local. In addition to the simplifying assumption of polarization (which obliges us here to treat only trivial bundles, $M = \Sigma \times S^1$) we shall need a smallness condition on the initial data, an assumption that the genus of Σ is greater than 1 and a restriction on the initial values allowed for the Teichmüller parameters.
It seems straightforward to remove each of these restrictions except for the smallness condition on the initial data. In particular we believe that the methods developed herein can be extended to the treatment of non polarized solutions on non trivial bundles over surfaces including the torus (but not $S^2$) with no restriction on the initial values of the Teichmüller parameters. Some preliminary work in this direction has already been carried out. We do not know how to remove the small data restriction even in the polarized case but conjecture that long time existence should hold for arbitrary large data since the U(1) isometry assumption seems to suppress the formation of black holes (note that U(1) is here essentially a “translational” and not a “rotational” symmetry since the existence of an axis of rotation would destroy the bundle structure). Of course there is as yet no large data global existence result for smooth wave maps in 2+1 dimensions even on a given background so there is no immediate hope for such a result in our still more non linear (and non local) problem but the polarized case, though non linear and non local as well, seems more promising. One knows how to control the Teichmüller parameters in pure 2+1 gravity and a wave equation on a given curved background offers no special difficulty. But now the “background metric” is instead a functional of the evolving scalar field and one needs to control this along with the Teichmüller parameters. Serious progress on this problem would represent a “quantum jump” forward in one’s understanding of long time existence problems for Einstein’s equations since, up to now, the only large data global results require simplifying assumptions that effectively reduce the number of spatial dimensions to one (e.g., Gowdy models and their generalizations, plane symmetric gravitational waves, spherically symmetric matter coupled to gravity) or zero (e.g., Bianchi models, 2+1 gravity). We hope that this work on small data global existence will lay the groundwork for such an eventual quantum jump.

But why assume a Killing field if only small data results are aimed for in the current project? A small data global existence result already exists (Andersson-Moncrief, in preparation) for Einstein equations on different 3-manifolds of negative Yamabe class which makes no symmetry assumption whatsoever. Shouldn’t those methods be applicable to our problem, in which case the U(1) symmetry assumption could be removed? The answer to this question is far from obvious for a somewhat subtle reason. In those cases where small data global existence can be established the conformal geometry of the spatial slices (which represents the propagating gravitational wave degrees of freedom) is tending to a well behaved limit. Therefore the various Sobolev “constants” (which are in fact functionals of the geometry) which are needed in the associated energy estimates are tending to well behaved limits as well. This simplifying feature is however missing in the current problem since, during the course of our evolution, the conformal geometry of the circle bundles under study is undergoing a kind of Cheeger-Gromov collapse in which the circular fibers shrink to zero length and the various related Sobolev “constants” may careen out of control, making even small data energy estimates much more difficult.

∗ Partially supported by the NSF contract n° PHY-9732629 to Yale University


Of the various Thurston types of 3-geometries which compactify to negative or zero Yamabe class manifolds $\{H^3, H^2 \times \mathbb{R}, SL(2,\mathbb{R}), \mathrm{Sol}, \mathrm{Nil}, \mathbb{R}^3\}$ only the hyperbolics are immune from such degenerations and the remaining (positive Yamabe class) Thurston types $\{S^3, S^2 \times \mathbb{R}\}$ not only are subject to Cheeger-Gromov type collapse but also to recollapse of the actual physical geometry to “big crunch” singularities in the future direction. By focusing on negative (or zero) Yamabe class manifolds which exclude (due to Einstein’s equations) the occurrence of maximal hypersurfaces, that would signal the onset of recollapse to a big crunch, we thereby concentrate on spacetimes that can expand indefinitely. That such Cheeger-Gromov collapse can be expected in solutions of Einstein’s equations can be seen already in the basic compactified Bianchi models wherein all the known solutions of negative Yamabe type except $H^3$ exhibit conformal collapse either along circular fibers $\{H^2 \times \mathbb{R}, SL(2,\mathbb{R})\}$, or collapse along $T^2$ fibers $\{\mathrm{Sol}\}$, or even total collapse with non zero but bounded curvature of the Gromov “almost flat” variety $\{\mathrm{Nil}\}$. The solutions we are considering here (of Thurston type $H^2 \times \mathbb{R}$ or eventually $SL(2,\mathbb{R})$ in non polarized generalizations) extend results exhibiting such behavior to a large family of spatially inhomogeneous spacetimes. We sidestep the extra complication of degenerating Sobolev constants by imposing U(1) symmetry and carrying out Kaluza-Klein reduction to work on a spatial manifold of hyperbolic type (though now a 2-dimensional one) for which, as we shall show, collapse and the corresponding degeneracy of the needed Sobolev constants is suppressed. The reason why we avoid the base $\Sigma = T^2$ is that the 2-tori themselves tend to collapse under the Einstein flow whereas the higher genus surfaces do not. On the other hand one can probably compute the explicit dependence of the needed Sobolev constants on the Teichmüller parameters for the torus and eventually exploit this to treat the Thurston cases $\{\mathbb{R}^3, \mathrm{Nil}\}$ which compactify typically to trivial and non trivial $S^1$-bundles over $T^2$. The Sol case (which compactifies to $T^2$ bundles over $S^1$) tends to collapse (as seen from the Bianchi models) the entire $T^2$ fibers. Thus to avoid degenerating Sobolev “constants” in this case it seems necessary to impose a full $T^2 = U(1) \times U(1)$ isometry group and Kaluza-Klein reduce to an $S^1$ spatial base manifold. This leads to a certain nice generalization of the Gowdy models defined on the “Sol-twisted torus” but has effectively only one space dimension remaining. We exclude the Thurston types $\{S^2 \times \mathbb{R}, S^3\}$ which correspond to trivial and non trivial $S^1$ bundles over $S^2$ respectively since they belong to the positive Yamabe class as we have mentioned and should not exhibit infinite expansion but rather recollapse to big crunch singularities.

The eight Thurston types are the basic building blocks from which other (and conjecturally all) compact 3-manifolds can be built by glueing together along so called incompressible 2-tori or (to obtain non prime manifolds) along essential 2-spheres. Very little is known about the Einstein “flow” on such more general manifolds but it seems that a natural first step in this direction may be made by studying the Einstein flow on the basic building block manifolds themselves. This program seems tractable provided that a U(1) symmetry is imposed in the




$H^2 \times \mathbb{R}$, $SL(2,\mathbb{R})$ and perhaps Nil and $\mathbb{R}^3$ cases, and provided that a U(1) × U(1) symmetry is imposed in the Sol case. No symmetries are needed in the $H^3$ case due to the absence of Cheeger-Gromov collapse but one can hope to remove the symmetry hypothesis in the other cases by learning how to handle degenerating Sobolev “constants”. In this respect the Nil and $\mathbb{R}^3$ cases may provide some guidance since they seem to require a treatment of degenerating Sobolev constants but only in the setting of 2 dimensions (when U(1) symmetry is imposed).

The basic methods we use involve the construction of higher order energies to control the Sobolev norms of the scalar wave degrees of freedom combined with an application of the “Dirichlet energy” function in Teichmüller space to control the Teichmüller parameter degrees of freedom. A subtlety is that the most obvious definition of wave equation (or, more generally, wave map) energies does not lead to a well defined rate of decay so that corrected energies must be introduced which exploit “information” about the lowest eigenvalue of the spatial laplacian which enters into the wave equation. Since the lowest eigenvalues vary with position in Teichmüller space we find it convenient to choose initial data such that, during the course of the evolution, the lowest eigenvalue avoids a well known gap in the spectrum for an arbitrary higher genus surface. If no eigenvalue drifts into this gap (which we enforce by suitable restriction on the initial data) then one can establish a universal rate of decay for the energies. If the lowest eigenvalue drifts into this gap and remains there asymptotically then the rate of decay of these energies will depend upon the asymptotic value of the lowest eigenvalue and will no longer be universal. While it is straightforward to modify the definitions of the corrected energies to take this refinement into account we shall not do so here to avoid further complication of an already involved analysis. An extension of the definition of our corrected energies to the non polarized case and to the treatment of non trivial $S^1$ bundles is also relatively straightforward but for simplicity we shall not pursue that here either.

The sense in which our solutions are global in the expanding direction is that they exhaust the maximal range allowed for the mean curvature function on a manifold of negative Yamabe type, for which a zero mean curvature can only be asymptotically approached. The normal trajectories to our space slices all have an infinite proper time length. We do not attempt to prove causal geodesic completeness but that would be straightforward to do given the estimates we obtain.

Another question concerns the behavior of our solutions in the collapsing direction. Since our energies are decaying in the expanding direction they are growing in the collapsing direction and will eventually escape the region in which we can control their behavior. In particular we cannot use these arguments to show that our solutions extend to their conjectured natural limit as the mean curvature function tends to $-\infty$. There is another approach to the U(1) problem however which, although local in nature, can describe a large family of U(1)-symmetric spacetimes by convergent expansions about the big-bang singularities themselves.

Vol. 2, 2001


This method, which is based on work by S. Kichenassamy and its extensions by A. Rendall and J. Isenberg, can handle vacuum spacetimes that are “velocity dominated” at their big-bang singularities. Work by J. Isenberg and one of us (V.M.) shows that the polarized vacuum solutions on $T^3 \times \mathbb{R}$ are amenable to this analysis. In fact there are two larger families of “half-polarized” solutions that can also be rigorously treated and shown to have velocity dominated singularities. By contrast the general (non polarized) solution does not seem to be amenable to this kind of analysis and indeed numerical work by B. Berger shows that such solutions should have generically “oscillatory” rather than velocity dominated singularities. The expansion methods which produce these solutions near their velocity dominated singularities are essentially local and should be readily adaptable to other manifolds such as circle bundles over higher genus surfaces. Thus one should be able to generate a large collection of initial data sets for the problem dealt with in this paper, which treats the further evolution globally in the expanding direction. Thus the machinery seems to be at hand for treating a large family of U(1) symmetric solutions from their big-bang initial singularities to the limit of infinite expansion.

2 Equations

The spacetime manifold V is a principal fiber bundle with one dimensional Lie group G and base Σ × R, with Σ a smooth 2 dimensional manifold which we suppose here to be compact. The spacetime metric is invariant under the action of G, the orbits are the fibers of V and are supposed to be space like. We write it in the form

$${}^{(4)}g = e^{-2\gamma} \, {}^{(3)}g + e^{2\gamma} (\theta)^2,$$

where $\gamma$ is a scalar function and ${}^{(3)}g$ a lorentzian metric on Σ × R which reads:

$${}^{(3)}g = -N^2 dt^2 + g_{ab} (dx^a + \nu^a dt)(dx^b + \nu^b dt).$$

$N$ and $\nu$ are respectively the lapse and shift of ${}^{(3)}g$, while $g = g_{ab} dx^a dx^b$ is a riemannian metric on Σ, depending on t. The 1-form $\theta$ is a connection on the fiber bundle V, represented in coordinates $(x^3, x^\alpha)$ adapted to the bundle structure by

$$\theta = dx^3 + A_\alpha dx^\alpha.$$

Note that A is a locally defined 1-form on Σ × R.

2.1 Twist potential

The curvature of the connection locally represented by A is a 2-form F on Σ × R, given by

$$F_{\alpha\beta} = (1/2) e^{-3\gamma} \eta_{\alpha\beta\lambda} E^\lambda$$

where E is an arbitrary closed 1-form if the equations ${}^{(4)}R_{\alpha 3} = 0$ are satisfied. Hence if Σ is compact

$$E = d\omega + H$$

where $\omega$ is a scalar function on V, called the twist potential, and H a representative of the 1-cohomology class of Σ × R, for instance defined by a 1-form on Σ, harmonic for some given riemannian metric m.

2.2 Wave map equation

The fact that F is a closed form together with the equation ${}^{(4)}R_{33} = 0$ imply (with the choice H = 0) that the pair $u \equiv (\gamma, \omega)$ satisfies a wave map equation from $(\Sigma \times \mathbb{R}, {}^{(3)}g)$ into the hyperbolic 2-space, i.e. $\mathbb{R}^2$ endowed with the riemannian metric $2(d\gamma)^2 + (1/2) e^{4\gamma} (d\omega)^2$. This wave map equation is a system of hyperbolic type when ${}^{(3)}g$ is a known lorentzian metric. In this article we will consider only the polarized case, that is, we take $\omega$ and H to be zero. Some of the computations and partial results hold however in the general case. This is why we keep the wave map notation wherever possible, since we intend to extend our final result to the general case in later work. In the polarized case the wave map equation reduces to the wave equation for $\gamma$ in the metric ${}^{(3)}g$.

2.3 3-dimensional Einstein equations

When ${}^{(4)}R_{3\alpha} = 0$ and ${}^{(4)}R_{33} = 0$ the Einstein equations ${}^{(4)}R_{\alpha\beta} = 0$ are equivalent to Einstein equations on the 3-manifold Σ × R for the metric ${}^{(3)}g$ with source the stress energy tensor of the wave map:

$${}^{(3)}R_{\alpha\beta} = \partial_\alpha u . \partial_\beta u \tag{1}$$

where a dot denotes a scalar product in the metric of the hyperbolic 2-space. We continue to use the same notation in the polarized case, that is we set $\gamma = u$ and $\partial_\alpha u . \partial_\beta u \equiv 2 \partial_\alpha \gamma \partial_\beta \gamma$. These Einstein equations decompose into

a. Constraints.
b. Equations for lapse and shift to be satisfied on each $\Sigma_t$. These equations, as well as the constraints, are of elliptic type.
c. Evolution equations for the Teichmüller parameters, which are ordinary differential equations.

2.3.1 Constraints on $\Sigma_t$

One denotes by k the extrinsic curvature of $\Sigma_t$ as submanifold of $(\Sigma \times \mathbb{R}, {}^{(3)}g)$; then, with $\nabla$ the covariant derivative in the metric g,

$$k_{ab} \equiv (2N)^{-1} (-\partial_t g_{ab} + \nabla_a \nu_b + \nabla_b \nu_a).$$

The equations (momentum constraint)

$${}^{(3)}R_{0a} \equiv N (-\nabla_b k^b_a + \partial_a \tau) = \partial_0 u . \partial_a u \tag{2}$$

and (hamiltonian constraint, with ${}^{(3)}S_{00} \equiv {}^{(3)}R_{00} + \frac{1}{2} N^2 \, {}^{(3)}R$)

$$2 N^{-2} \, {}^{(3)}S_{00} \equiv R(g) - k^b_a k^a_b + \tau^2 = N^{-2} \partial_0 u . \partial_0 u + g^{ab} \partial_a u . \partial_b u \tag{3}$$

do not contain second derivatives transversal to $\Sigma_t$ of g or u; they are the constraints. To transform the constraints into an elliptic system one uses the conformal method. We set

$$g_{ab} = e^{2\lambda} \sigma_{ab},$$

where $\sigma$ is a riemannian metric on Σ, depending on t, on which we will comment later, and

$$k_{ab} = h_{ab} + \frac{1}{2} g_{ab} \tau$$

where $\tau$ is the g-trace of k, hence h is traceless. We denote by D a covariant derivation in the metric $\sigma$. From now on, unless otherwise specified, all operators are in the metric $\sigma$, and indices are raised or lowered in this metric. We set

$$u' = N^{-1} \partial_0 u$$

with $\partial_0$ the Pfaff derivative of u, namely

$$\partial_0 = \partial_t - \nu^a \partial_a \qquad \text{with} \qquad \partial_a = \frac{\partial}{\partial x^a},$$

and

$$\dot{u} = e^{2\lambda} u'.$$

The momentum constraint reads, if $\tau$ is constant in space (a choice which we will make),

$$D_b h^b_a = L_a, \qquad L_a \equiv -D_a u . \dot{u}. \tag{4}$$

This is a linear equation for h, independent of $\lambda$. The general solution is the sum of a transverse traceless tensor $h_{TT} \equiv q$ and a conformal Lie derivative r. Such tensors are $L^2$-orthogonal on $(\Sigma, \sigma)$. The hamiltonian constraint reads as the semilinear elliptic equation in $\lambda$:

$$\Delta \lambda = f(x, \lambda) \equiv p_1 e^{2\lambda} - p_2 e^{-2\lambda} + p_3,$$

with

$$p_1 \equiv \frac{1}{4} \tau^2, \qquad p_2 \equiv \frac{1}{2} (|\dot{u}|^2 + |h|^2), \qquad p_3 \equiv \frac{1}{2} (R(\sigma) - |Du|^2). \tag{5}$$
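As a small worked case of the hamiltonian constraint in the form (5), one can look at spatially constant data, for which a constant $\lambda$ solves the equation exactly when $f(\lambda) = 0$. The sketch below is only an illustration under that assumption (constant $p_1$, $p_2$, $p_3$ on a compact Σ); the function names and the sample values are hypothetical and not taken from the paper. Setting $X = e^{2\lambda}$ turns $f = 0$ into the quadratic $p_1 X^2 + p_3 X - p_2 = 0$, whose positive root is used.

```python
import math

def coefficients(tau, R_sigma=-1.0, h_sq=0.0, udot_sq=0.0, Du_sq=0.0):
    """Coefficients p1, p2, p3 of (5) for spatially constant data (an assumption)."""
    p1 = 0.25 * tau ** 2
    p2 = 0.5 * (udot_sq + h_sq)
    p3 = 0.5 * (R_sigma - Du_sq)
    return p1, p2, p3

def constant_lambda(p1, p2, p3):
    """Constant solution of p1*e^{2L} - p2*e^{-2L} + p3 = 0.

    Needs p1 > 0 and a positive root X = e^{2L}, i.e. p2 > 0 or p3 < 0.
    """
    X = (-p3 + math.sqrt(p3 ** 2 + 4.0 * p1 * p2)) / (2.0 * p1)
    return 0.5 * math.log(X)

def f(lam, p1, p2, p3):
    """Right-hand side of the hamiltonian constraint in (5)."""
    return p1 * math.exp(2.0 * lam) - p2 * math.exp(-2.0 * lam) + p3

# Vacuum-like sample: u = 0, h = 0, R(sigma) = -1 and mean curvature tau = -2.
p = coefficients(tau=-2.0)
lam0 = constant_lambda(*p)

# Sample with non-zero (constant) sources, still with R(sigma) = -1.
p_src = coefficients(tau=-1.0, h_sq=0.3, udot_sq=0.2, Du_sq=0.1)
lam_src = constant_lambda(*p_src)
```

For $u = 0$, $h = 0$ and $R(\sigma) = -1$ this gives $e^{2\lambda} = 2/\tau^2$, so the physical metric $g = e^{2\lambda}\sigma$ grows as $\tau$ increases towards zero.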

2.3.2 Equations for lapse and shift

The lapse and shift are gauge parameters for which we obtain elliptic equations on each $\Sigma_t$ as follows. We require that the $\Sigma_t$'s have constant (in space) mean curvature, namely that $\tau$ is a given negative increasing function of t. The lapse N then satisfies the linear elliptic equation

$$\Delta N - \alpha N = -e^{2\lambda} \partial_t \tau$$

with

$$\alpha \equiv e^{-2\lambda} (|h|^2 + |\dot{u}|^2) + \frac{1}{2} e^{2\lambda} \tau^2.$$

The equation to be satisfied by the shift $\nu$ results from the knowledge of $\sigma_t$. Indeed the definition of k implies that $\nu$ satisfies a linear differential equation with an operator L, the conformal Lie derivative, which we first write in the metric g:

$$(L_g \nu)_{ab} \equiv \nabla_a \nu_b + \nabla_b \nu_a - g_{ab} \nabla_c \nu^c = \phi_{ab}$$

with

$$\phi_{ab} \equiv 2 N h_{ab} + \partial_t g_{ab} - \frac{1}{2} g_{ab} g^{cd} \partial_t g_{cd},$$

then in the metric $\sigma$

$$(L_\sigma n)_{ab} \equiv D_a n_b + D_b n_a - \sigma_{ab} D_c n^c = f_{ab}$$

with $n_a \equiv \nu_a e^{-2\lambda}$, where

$$f_{ab} \equiv 2 N e^{-2\lambda} h_{ab} + \partial_t \sigma_{ab} - \frac{1}{2} \sigma_{ab} \sigma^{cd} \partial_t \sigma_{cd}.$$

The kernel of the dual of L is the space of transverse traceless symmetric 2-tensors, i.e. symmetric 2-tensors T such that

$$g^{ab} T_{ab} = 0, \qquad \nabla^a T_{ab} = 0. \tag{6}$$

These tensors are usually called TT tensors. The spaces of TT tensors are the same for two conformal metrics.

2.3.3 Teichmüller parameters

On a compact 2-dimensional manifold of genus $G \geq 2$ the space Teich of conformally inequivalent riemannian metrics, called Teichmüller space, can be identified (cf. Fisher and Tromba) with $M_{-1}/D_0$, the quotient of the space of metrics with scalar curvature −1 by the group of diffeomorphisms homotopic to the identity. $M_{-1} \to$ Teich is a trivial fiber bundle whose base can be endowed with the structure of the manifold $\mathbb{R}^n$, with $n = 6G - 6$. We require the metric $\sigma_t$ to be in some chosen cross section $Q \to \psi(Q)$ of the above fiber bundle. Let $Q^I$, $I = 1, \ldots, n$ be coordinates in Teich; then $\partial\psi/\partial Q^I$ is a known tangent vector to $M_{-1}$ at $\psi(Q)$, that is a traceless symmetric 2-tensor field

Vol. 2, 2001


on Σ, sum of a transverse traceless tensor field XI (Q) and of the Lie derivative of a vector field on the manifold (Σ, ψ(Q)). The tensor fields XI (Q), I = 1, ...n span the space of transverse traceless tensor fields on (Σ, ψ(Q)). The matrix with elements XIab XJab µψ(Q) Σ

is invertible. Lemma 1 If we impose to the metric σt to lie in the chosen cross section, i.e. σt ≡ ψ(Q(t)), the solvability condition for the shift equation determines dQI /dt in terms of ht . Proof. The time derivative of σ is given by ∂t σ = (dQI /dt)∂ψ/∂QI hence it is of the form ∂t σab =

dQI XIab + Cab dt

where C is a Lie derivative, L2 orthogonal to TT tensors. The shift equation on Σt is solvable if and only if its right hand side f is L2 orthogonal to TT tensors, i.e. to each tensor field XI . Theses conditions read fab XJab µσt = 0 . Σt

We have seen that h is the sum of a tensor r which is in the range of the conformal Killing operator, hence $L^2$ orthogonal to TT tensors, and a TT tensor. This last tensor can be written with the use of the basis $X_I$ of such tensors, the coefficients $P^I$ depending only on t:

$h^{TT}_{ab} = P^I(t)X_{I,ab} .$

The orthogonality conditions read, using the fact that the transverse tensors $X_I$ are orthogonal to Lie derivatives and are traceless:

$\int_{\Sigma_t}[2N e^{-2\lambda}(r_{ab} + P^I X_{I,ab}) + (dQ^I/dt)X_{I,ab}]X_J^{ab}\,\mu_\sigma = 0 .$

The tangent vector $dQ^I/dt$ to the curve t → Q(t) and the tangent vector $P^I(t)$ to Teich are therefore linked by the linear system

$X_{IJ}\frac{dQ^I}{dt} + Y_{IJ}P^I + Z_J = 0$

with

$X_{IJ} \equiv \int_{\Sigma_t} X_I^{ab}X_{J,ab}\,\mu_{\sigma_t}, \qquad Y_{IJ} \equiv \int_{\Sigma_t} 2N e^{-2\lambda}X_I^{ab}X_{J,ab}\,\mu_{\sigma_t}, \qquad Z_J \equiv \int_{\Sigma_t} 2N e^{-2\lambda}r_{ab}X_J^{ab}\,\mu_{\sigma_t} .$

We will now construct an ordinary differential system for the evolution of the $Q^I$ and $P^I$ by considering the as yet unsolved 3-dimensional Einstein equations

${}^{(3)}R_{ab} = \rho_{ab} \equiv \partial_a u\cdot\partial_b u .$

Lemma 2 The constraint equations together with the lapse and the wave map equations imply that $N({}^{(3)}R_{ab} - \rho_{ab})$ with $\rho_{ab} \equiv \partial_a u\cdot\partial_b u$ is a transverse traceless tensor on each $\Sigma_t$.

Proof. The equations

${}^{(3)}S_{00} = T_{00}$ (7)

and

${}^{(3)}R_{00} = \rho_{00}$ (8)

imply

${}^{(3)}R = \rho$ (9)

since

${}^{(3)}S_{00} - T_{00} \equiv {}^{(3)}R_{00} - \frac12 g_{00}\,{}^{(3)}R - (\rho_{00} - \frac12 g_{00}\rho) ,$ (10)

hence

${}^{(3)}R_{ab} - \rho_{ab} = {}^{(3)}S_{ab} - T_{ab} .$

The equations 2 and 3 imply

$g^{ab}({}^{(3)}R_{ab} - \rho_{ab}) = 0 .$

On the other hand the Bianchi identity in the 3-metric ${}^{(3)}g$ gives

${}^{(3)}\nabla_\alpha({}^{(3)}S^{\alpha b} - T^{\alpha b}) = 0 .$

An elementary calculation using the connexion coefficients of ${}^{(3)}g$ and g shows that, due to equations previously satisfied, this equation reduces to the following divergence in the metric g:

$\nabla_a[N({}^{(3)}R^{ab} - \rho^{ab})] = 0 .$

The tensor $N({}^{(3)}R_{ab} - \rho_{ab})$ is therefore a traceless and transverse tensor on (Σ, g), and hence also on (Σ, σ), by conformal invariance of this property for symmetric 2-covariant tensors.

We deduce from this lemma that a necessary and sufficient condition for the previous equations to imply ${}^{(3)}R_{ab} - \rho_{ab} = 0$ is that the tensor $N({}^{(3)}R_{ab} - \rho_{ab})$ be $L^2$ orthogonal to transverse traceless tensors on $(\Sigma_t, \sigma_t)$, i.e. to each of the TT tensors $X_I$ defined above through the cross section ψ where we choose $\sigma_t$, that is

$\int_{\Sigma_t} N({}^{(3)}R_{ab} - \rho_{ab})X_J^{ab}\,\mu_{\sigma_t} = 0$, for J = 1, 2, ..., 6G − 6.

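We recall that

${}^{(3)}R_{ab} \equiv R_{ab} - N^{-1}\bar\partial_0 k_{ab} - 2k_{ac}k_b^c + \tau k_{ab} - N^{-1}\nabla_a\partial_b N$

with

$k_{ab} \equiv P^I X_{I,ab} + r_{ab} + \frac12 g_{ab}\tau$

and $\bar\partial_0$ is an operator on time dependent space tensors (cf. C-B and York) defined by, with $L_\nu$ the Lie derivative in the direction of ν,

$\bar\partial_0 \equiv \partial_t - L_\nu .$

We thus obtain an ordinary differential system of the form

$X_{IJ}\frac{dP^I}{dt} + \Phi_J(P, \frac{dQ}{dt}) = 0 ,$

where Φ is a polynomial of degree 2 in P and dQ/dt with coefficients depending smoothly on Q and directly but continuously on t through the other unknowns, namely:

$A_{JIK} \equiv \int_{\Sigma_t} 2N e^{2\lambda}X_{I,a}^c X_{K,bc}X_J^{ab}\,\mu_{\sigma_t}$

$B_{JIK} \equiv \int_{\Sigma_t} \frac{\partial X_{I,ab}}{\partial Q^K}X_J^{ab}\,\mu_{\sigma_t}$

$C_{JI} \equiv \int_{\Sigma_t} [(-L_\nu X_I)_{ab} + 4N e^{-2\lambda}r_b^c X_{I,ac} - \tau N X_{I,ab}]X_J^{ab}\,\mu_{\sigma_t}$

and, using integration by parts and the transverse property of the $X_I$ to eliminate second derivatives of N (recall that $\nabla_a\partial_b N \equiv D_a\partial_b N - 2\partial_a\lambda\,\partial_b N$),

$D_J \equiv \int_{\Sigma_t} (-\bar\partial_0 r_{ab} - 2N e^{-2\lambda}r_{ac}r_b^c + \tau N r_{ab} + 2\partial_a\lambda\,\partial_b N - \partial_a u\cdot\partial_b u)X_J^{ab}\,\mu_{\sigma_t} .$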

3 Homogeneous solution

Theorem 3 A particular solution, obtained by taking for u a constant wave map and for h the zero tensor, is given by:

${}^{(4)}g = -4dt^2 + 2t^2\sigma_{ab}dx^a dx^b + \theta^2$

with σ a metric on Σ independent of t and of scalar curvature −1, and θ a flat connexion 1-form on the bundle.

Proof. The wave map equation is satisfied by any constant map. Such a map has zero stress energy tensor. The momentum constraint is then satisfied by h = 0, hence

$k_{ab} = \frac12\tau g_{ab} .$

The hamiltonian constraint is satisfied by a constant in space λ given, if R(σ) = −1, by

$e^{2\lambda} = \frac{2}{\tau^2} ;$

the shift equation is then satisfied by ν = 0 and the lapse equation by N = 2. A straightforward computation shows that ${}^{(3)}R_{ab} = 0$. All the equations Ricci(${}^{(4)}g$) = 0 are satisfied.

Remark 4 The hypotheses imply that the bundle M → Σ is a trivial bundle.
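The two scalar equations used in this proof can be checked numerically. The following is a minimal sketch, assuming the time choice τ = −1/t adopted later in the paper (equation (12)); the function name `homogeneous_residuals` is ours and purely illustrative. It evaluates the residuals of the hamiltonian constraint (5) and of the lapse equation for h = 0, a constant u, $e^{2\lambda} = 2/\tau^2$ and N = 2.

```python
def homogeneous_residuals(t):
    """Residuals of the hamiltonian constraint and of the lapse equation
    for the homogeneous solution, assuming the time choice tau = -1/t."""
    tau = -1.0 / t
    dtau_dt = 1.0 / t ** 2            # derivative of tau = -1/t
    e2lam = 2.0 / tau ** 2            # conformal factor e^{2 lambda}
    lapse = 2.0                       # N, constant in space
    r_sigma = -1.0                    # scalar curvature of sigma

    # Constant wave map and h = 0: p2 = 0 and |Du|^2 = 0.
    p1, p2, p3 = 0.25 * tau ** 2, 0.0, 0.5 * r_sigma

    # lambda is constant in space, so its Laplacian vanishes.
    hamiltonian = 0.0 - (p1 * e2lam - p2 / e2lam + p3)

    # N is constant in space, so its Laplacian vanishes;
    # alpha reduces to (1/2) e^{2 lambda} tau^2.
    alpha = 0.5 * e2lam * tau ** 2
    lapse_equation = (0.0 - alpha * lapse) + e2lam * dtau_dt
    return hamiltonian, lapse_equation


if __name__ == "__main__":
    for t in (1.0, 2.5, 10.0):
        print(t, homogeneous_residuals(t))
```

Both residuals vanish up to rounding, and $e^{2\lambda} = 2/\tau^2 = 2t^2$ reproduces the factor in front of σ in the metric of Theorem 3.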

4 Local existence theorem

4.1 Cauchy problem

The unknowns which permit the reconstruction of the spacetime metric in the gauge τ ≡ τ(t), given some smooth cross section Q → ψ(Q) of Teichmüller space Teich, are on the one hand u = γ satisfying the wave equation in the metric ${}^{(3)}g$, on the other hand λ, N and ν, which satisfy elliptic equations on each $\Sigma_t$, and also a curve Q(t) in Teich which determines the metric $\sigma_t \equiv \psi(Q(t))$ on $\Sigma_t$. An intermediate unknown is the traceless tensor h which splits into a transverse part and a conformal Lie derivative of $\sigma_t$ in the direction of a vector Y which also satisfies an elliptic system on $\Sigma_t$. The transverse part is determined through a field of tangent vectors to Teich at the points of Q(t).

Definition 5 The Cauchy data on $\Sigma_{t_0}$, denoted $\Sigma_0$, are:

1. A $C^\infty$ riemannian metric $\sigma_0$ which projects onto a point $Q(t_0)$ of Teich and a $C^\infty$ tensor $q_0$ transverse and traceless in the metric $\sigma_0$.

2. Cauchy data for u and $\dot u$ on $\Sigma_0$, i.e.

$u(t_0, .) = u_0 \in H_2, \qquad \dot u(t_0, .) = \dot u_0 \in H_1 ,$

where $H_s$ is the usual Sobolev space on (Σ, $\sigma_0$).

4.2 Functional spaces

Definition 6 Let $\sigma_t$ be a curve of $C^\infty$ riemannian metrics on Σ, uniformly equivalent to the metric $\sigma_0$ for t ∈ [t0, T] and $C^1$ in t. Such metrics are called regular for t ∈ [t0, T].

1. The spaces $W^p_s(t)$ are the usual Sobolev spaces of tensor fields on the riemannian manifold (Σ, $\sigma_t$). By the hypothesis on $\sigma_t$ the norms in $W^p_s(t)$ are uniformly equivalent for t ∈ [t0, T] to the norm in $W^p_s(t_0)$. We set $W^2_s(t) = H_s(t)$. When working on one slice $\Sigma_t$ we will often omit reference to the t dependence of the norm.

2. The spaces $E^p_s(T)$ are the Banach spaces of t dependent tensor fields f on Σ

$E^p_s(T) \equiv C^0([t_0,T], W^p_s) \cap C^1([t_0,T], W^p_{s-1})$

with norm

$\|f\|_{E^p_s(T)} = \sup_{t_0\le t\le T}(\|f\|_{W^p_s(t)} + \|\partial_t f\|_{W^p_{s-1}(t)}) .$

We set $E^2_s(T) = E_s(T)$. We will proceed in two steps:

Case a. $Du_0, \dot u_0 \in H_2$.

Case b. $Du_0, \dot u_0 \in H_1$.

We will need the following lemma (we set $E_s = E^2_s$).

Lemma 7 Let $\sigma_t$ be a regular metric on Σ × [t0, T]; then

a. If $Du, \dot u \in E_2(T)$ then $Du\cdot Du,\ Du\cdot\dot u,\ \dot u\cdot\dot u \in E_2(T)$,

b. If $Du, \dot u \in E_1(T)$ then $Du\cdot Du,\ Du\cdot\dot u,\ \dot u\cdot\dot u \in E^p_1(T)\cap E^q_0(T)$, 1 ≤ p < 2, 1 ≤ q < ∞.

Proof. a. Since in dimension 2 the space $H_2$ is an algebra one has

$Du\cdot Du,\ \dot u \in C^0([t_0,T], H_2) .$

On the other hand we have

$|\partial_t(Du\cdot Du)| = 2|\partial_t Du\cdot Du| \le 2|\partial_t Du||Du| ,$

hence by multiplication properties of Sobolev spaces

$\partial_t(Du\cdot Du) \in C^0([t_0,T], H_1) .$

b. If $Du \in E_1$ then $Du \in E^q_0 \equiv C^0([t_0,T], L^q)$ for all q < ∞ by the standard Sobolev embedding theorem, and so does $Du\cdot Du$. We have

$|D(Du\cdot Du)| = 2|D^2u\cdot Du| \le 2|D^2u||Du| ,$

hence $D(Du\cdot Du) \in E^p_0$ for all 1 ≤ p < 2 since $D^2u \in E_0$ and $Du \in E^q_0$, 1 ≤ q < ∞. An analogous proof gives the result for the other products. Using again $|\partial_t(Du\cdot Du)| \le 2|\partial_t Du||Du|$ we obtain $\partial_t(Du\cdot Du) \in E^p_0$ for 1 ≤ p < 2 since we have by definition $\partial_t Du \in E^2_0$. Analogous reasoning with $\dot u$ completes the proof.

4.3 Resolution of the elliptic equations for given Q(t), P(t) and u

We suppose chosen a smooth cross section Q → ψ(Q) of $M_{-1}$ over the Teichmüller space Teich. We suppose given a $C^1$ curve t → Q(t) contained, when t ∈ [t0, T], in a compact subset of Teich, and a continuous set of tangent vectors P to Teich at points of this curve. We are then given by lift to $M_{-1}$ a regular metric $\sigma_t$ for t ∈ [t0, T], with scalar curvature −1, together with a smooth symmetric 2-tensor $h^{TT}_t \equiv q_t$ transverse and traceless in the metric $\sigma_t$ and depending continuously on t.

4.3.1 Determination of h

We have set

$h_{ab} = q_{ab} + r_{ab}$

where q and r are traceless, q is transverse and r is a conformal Lie derivative, i.e.

$D_a q^a_b = 0$ and $q^a_a = 0$, $\qquad r_{ab} = D_a Y_b + D_b Y_a - \sigma_{ab}D_c Y^c .$

Determination of q. The traceless transverse tensor q on $(\Sigma_t, \sigma_t)$ is deduced by lifting its given projection onto the tangent space to Teichmüller space at the point Q(t). It is smooth and depends continuously on t ∈ [t0, T]. Let us denote by $X_I(Q)$, I = 1, ..., 6G − 6, a basis of traceless transverse tensor fields on (Σ, ψ(Q)); then

$q_t = X_I(Q(t))P^I(t) .$

Determination of r. The vector Y satisfies on each $\Sigma_t$ the elliptic system with zero kernel (in accordance with the fact that (Σ, σ) does not admit conformal Killing fields when R(σ) < 0),

$D^a r_{ab} \equiv D^a D_a Y_b + \frac12 R(\sigma)Y_b = L_b \equiv -D_b u\cdot\dot u .$

Case a. $L \in E_2(T)$. It results from elliptic theory that the system satisfied by Y has for each t ∈ [t0, T] one and only one solution in $H_4(t)$ and there exists a constant depending only on $\sigma_t$ such that

$\|r_t\|_{H_3(t)} \le C_{\sigma_t}\|Du\cdot\dot u\|_{H_2(t)} .$

The constant $C_{\sigma_t}$ is invariant under diffeomorphisms acting on $\sigma_t$, that is it depends only on its projection on the Teichmüller space of Σ, hence is uniformly bounded under the hypothesis made on $\sigma_t$. We denote by $M_{\sigma,T}$ such a constant. We have, since the norms $W^p_s(t)$ and $W^p_s$ are uniformly equivalent,

$\|r_t\|_{H_3} \le M_{\sigma,T}\|(Du\cdot\dot u)_t\|_{H_2} .$

Differentiation with respect to t of the equation for Y shows that for a regular $\sigma_t$ we have

$r \in E_3(T), \qquad \|r\|_{E_3(T)} \le M_{\sigma,T}\|Du\|_{E_2(T)} \times \|\dot u\|_{E_2(T)} .$

Case b. $L \in E^p_1(T)$. The system for Y has one and only one solution in $W^p_3(t)$ for each t ∈ [t0, T]; then $r_t$ is in $W^p_2(t)$ and there exists a constant $C_{\sigma_t}$ such that

$\|r_t\|_{W^p_2(t)} \le C_{\sigma_t}\|(Du\cdot\dot u)_t\|_{W^p_1(t)} .$

One proves also that $\partial_t r \in W^p_1(t)$, hence $r \in E^p_2(T)$ and there exists a constant $M_{\sigma,T}$ such that

$\|r\|_{E^p_2(T)} \le M_{\sigma,T}\|Du\|_{E_1(T)} \times \|\dot u\|_{E_1(T)} .$

4.3.2 Case of initial values

On the initial manifold $\Sigma_{t_0}$ we have given $q_0 \in C^\infty$, and $r_0$ satisfies the inequality (we abbreviate to $\|.\|$ the $L^2$ norm on (Σ, $\sigma_0$))

$\|r_0\| \le C_{\sigma_0}\|Du_0\cdot\dot u_0\| ,$

hence $h_0$ is small in $L^2$ norm if $q_0$ is small in $L^2$ norm while $Du_0$ and $\dot u_0$ are small in $H_1$ norm.

4.3.3 Determination of the conformal factor λ

On each $\Sigma_t$ the conformal factor $\lambda_t$ satisfies the equation, with $\Delta \equiv \Delta_{\sigma_t}$ the Laplacian in the metric $\sigma_t$ (we omit the writing of t to simplify the notation)

$\Delta\lambda = f(\lambda) \equiv p_1 e^{2\lambda} - p_2 e^{-2\lambda} + p_3$

where the coefficients p are given by, with R(σ) = −1,

$p_1 = \frac14\tau^2, \qquad p_2 = \frac12(|h|^2 + |\dot u|^2), \qquad p_3 = \frac12(R(\sigma) - |Du|^2) .$

Case a. We suppose that the coefficients p are given functions in $E_2(T)$. This hypothesis is consistent with $Du, \dot u \in E_2(T)$ and $h \in E_2(T)$. We know from elliptic theory that the semilinear elliptic equation for λ on $(\Sigma_t, \sigma_t)$ admits a solution in $H_4(t)$, which is included in $C^2$, if it admits a subsolution $\lambda_-$ and a supersolution $\lambda_+$, i.e. $C^2$ functions such that

$\Delta\lambda_+ \le f(\lambda_+)$, and $\Delta\lambda_- \ge f(\lambda_-)$, $\qquad \lambda_- \le \lambda_+ .$

We construct sub and super solutions as follows. We define the number ω to be the real root of the equation

$P_1 e^{2\omega} - P_2 e^{-2\omega} + P_3 = 0 ,$

where the P's are the integrals of the p's on $\Sigma_t$.

By the Gauss Bonnet theorem the volume of $(\Sigma_t, \sigma_t)$ is a constant if $R(\sigma_t)$ is constant. We have here $R(\sigma_t) = -1$, hence:

$V_\sigma \equiv \int_{\Sigma_t}\mu_\sigma = -\int_{\Sigma_t}R(\sigma)\mu_\sigma = -4\pi\chi .$

We find:

$P_2 = \frac12(\|h\|^2 + \|\dot u\|^2) \ge 0, \qquad P_1 = \frac14 V_\sigma\tau^2 > 0, \qquad P_3 = -\frac12(V_\sigma + \|Du\|^2) < 0 ,$

hence $e^{2\omega}$ exists, is unique and satisfies

$e^{2\omega} \ge 2\tau^{-2} .$

We define $v \in H_4$ as the solution with mean value zero on $\Sigma_t$ of the linear equation

$\Delta v = f(\omega) \equiv p_1 e^{2\omega} - p_2 e^{-2\omega} + p_3 .$

Such a solution exists and is unique, because f(ω) has mean value zero on $\Sigma_t$.

Lemma 8 The functions $\lambda_+ = \omega + v - \min_\Sigma v$ and $\lambda_- = \omega + v - \max_\Sigma v$ are respectively a super and sub solution of the equation for λ.

Proof. We have $\lambda_+ \ge \omega$ and $\lambda_- \le \omega$, hence $f(\lambda_+) \ge f(\omega) \ge f(\lambda_-)$, since f is an increasing function of λ, while $\Delta\lambda_+ = \Delta\lambda_- = \Delta v = f(\omega)$.

The solution $\lambda \in H_4$ thus obtained for each t ∈ [t0, T] is unique, due to the monotonicity of the function f. Its $H_4$ norm depends continuously on t. Differentiation with respect to t of the equation satisfied by λ shows that $\partial_t\lambda \in C^0([t_0,T], H_3)$. We have proved:

Theorem 9 The equation for λ has one and only one solution $\lambda \in E_4(T)$ under the hypothesis a (where $p_i \in E_2(T)$).

Case b.

Theorem 10 The equation for λ has one and only one solution $\lambda \in E^p_3(T)$ under the hypothesis b (where $p_i \in E^p_1(T)$).

Proof. Consider a Cauchy sequence of functions $p_2^{(n)} \ge 0$, $p_3^{(n)} + \frac12 \le 0$, both in $E_2(T)$, converging in $E^p_1(T)$ to functions $p_2$, $p_3 + \frac12$. For each n there is a solution $\lambda_{(n)} \in E_4(T)$ of the conformal factor equation. The difference $\lambda_{(n)} - \lambda_{(m)}$ satisfies the equation

$\Delta(\lambda_{(n)} - \lambda_{(m)}) = p_1(e^{2\lambda_{(n)}} - e^{2\lambda_{(m)}}) - p_2^{(n)}(e^{-2\lambda_{(n)}} - e^{-2\lambda_{(m)}}) + (p_2^{(m)} - p_2^{(n)})e^{-2\lambda_{(m)}} + p_3^{(n)} - p_3^{(m)} .$

Applying elementary calculus inequalities to the estimate of $(a-b)^{-1}(e^a - e^b)$ and $(a-b)^{-1}(e^{-a} - e^{-b})$ one obtains a well posed linear elliptic equation for $\lambda_{(n)} - \lambda_{(m)}$ and an inequality for its norm in $W^p_3$ for each t ∈ [t0, T]. We thus have shown the convergence of the sequence to a limit λ which satisfies the required equation. One can prove similarly that $\lambda \in E^p_3(T)$. The uniqueness of the solution results from the monotonicity of f(λ).

Bounds for λ. When $\lambda \in C^2$ one obtains a lower bound by using the maximum principle: at a minimum of λ we have ∆λ ≥ 0. Hence a minimum $\lambda_m$ of λ satisfies the inequality

$e^{2\lambda_m} \ge \frac{2}{\tau^2}$, i.e. $e^{-2\lambda_m} \le \frac12\tau^2 .$

When $\lambda \in E^p_3$ is a solution of the equation it satisfies the same inequality since $W^p_3 \subset C^0$ and λ can be obtained as a limit in $W^p_3$ of functions satisfying this inequality. An analogous argument shows that

$\lambda_- \le \lambda \le \lambda_+$ with $\lambda_- = \omega - \max v + v$, and $\lambda_+ = \omega + v - \min v ,$

where $v \in E_2\cap E^p_3$ is the solution with mean value zero on $\Sigma_t$ of the linear equation

$\Delta v = f(\omega) \equiv p_1 e^{2\omega} - p_2 e^{-2\omega} + p_3$

with $e^{2\omega}$ the positive solution of the equation

$P_1 e^{4\omega} + P_3 e^{2\omega} - P_2 = 0 .$

Case of initial values. The above construction applies in particular on the initial surface $\Sigma_0$. In this case the functions $u_0$ and $\dot u_0$ are considered as given. We have

$\Delta v_0 = f(\omega_0) \equiv p_{1,0}e^{2\omega_0} - p_{2,0}e^{-2\omega_0} + p_{3,0}$

with

$p_{1,0} = \frac14\tau_0^2, \qquad p_{2,0} = \frac12(|h_0|^2 + |\dot u_0|^2), \qquad p_{3,0} = -\frac12(1 + |Du_0|^2) .$

We have

$e^{2\omega_0} = \frac{(V_\sigma + \|Du_0\|^2) + \sqrt{(V_\sigma + \|Du_0\|^2)^2 + 2V_\sigma\tau_0^2(\|h_0\|^2 + \|\dot u_0\|^2)}}{V_\sigma\tau_0^2}$

and we see that $e^{2\omega_0}$ tends to $2/\tau_0^2$ and $f(\omega_0)$ tends to zero when $q_0$ tends to zero as well as the $H_1$ norms of $Du_0$ and $\dot u_0$ (then the $L^2$ norm of $h_0$ also tends to zero).

4.3.4 Determination of the lapse N

The lapse N satisfies the equation

$\Delta N - \alpha N = -e^{2\lambda}\partial_t\tau$ (11)

with

$\alpha e^{-2\lambda} = \frac12\tau^2 + e^{-4\lambda}(|\dot u|^2 + |h|^2) > 0 .$

It is a well posed elliptic equation on $(\Sigma_t, \sigma_t)$, when u, h and λ are known, which has one and only one solution, always positive, in $E_4(T)$ in case a, in $E^p_3(T)$ in case b. Indeed:

Case a. We have $\dot u \in E_2(T)$, $h \in E_3(T)$, $\lambda \in E_4(T)$, hence also $e^{2\lambda} \in E_4(T)$ and $\alpha \in E_2(T)$. The equation then has a solution $N \in E_4(T)$.

Case b. We have $|\dot u|^2 + |h|^2 \in E^p_1(T)$, $e^{2\lambda}, e^{-2\lambda} \in E^p_3(T)$. The equation has a solution $N \in E^p_3(T)$.

Upper bound of N. At a maximum $x_M$ of $N \in C^2$ we have $(\Delta N)(x_M) \le 0$, hence this maximum $N_M$ is such that

$N_M \le (\alpha^{-1}e^{2\lambda}\partial_t\tau)(x_M) ,$

a fortiori

$N_M \le \frac{2\partial_t\tau}{\tau^2} .$

A reasoning analogous to that given for λ shows that this upper bound also holds in case b.

4.3.5 Determination of the shift ν

The definition of k implies that $n \equiv e^{-2\lambda}\nu$ satisfies a linear differential equation involving an operator L, the conformal Lie derivative, with injective symbol:

$(L_{\sigma_t}n)_{ab} \equiv D_a n_b + D_b n_a - \sigma_{ab}D_c n^c = f_{ab}$

with

$f_{ab} \equiv 2N e^{-2\lambda}h_{ab} + \partial_t\sigma_{ab} - \frac12\sigma_{ab}\sigma^{cd}\partial_t\sigma_{cd} .$

The kernel of the dual of L is the space of transverse traceless symmetric 2-tensors in the metric $\sigma_t$; the equation for ν admits a solution if and only if f is $L^2$-orthogonal to all such tensors, i.e.

$\int_{\Sigma_t} f_{ab}X_I^{ab}\,\mu_{\sigma_t} = 0$, for I = 1, ..., 6G − 6.

This integrability condition will not in general be satisfied with an arbitrary choice of P(t), that is of $q_t \equiv h^{TT}_t$. In this subsection P(t) is not considered as given. We set in the expression of $f_{ab}$

$h_t = P^I(t)X_I(Q(t)) + r_t .$

When $\sigma_t$ is a known $C^1$ function of t the integrability condition determines P(t) as a continuous field of tangent vectors to Teich by an invertible system of ordinary linear equations. When h is so chosen the equation for n has a solution, unique since $L_\sigma$ has a trivial kernel on manifolds with R(σ) = −1. It results from elliptic theory that $n \in E_4(T)$ in case a, and $n \in E^p_3(T)$ in case b. The same properties hold for ν.

4.4 Wave equation, local solution

The wave equation on (Σ × R) in the metric ${}^{(3)}g$ reads

$-N^{-1}\partial_0(N^{-1}\partial_0 u) + N^{-1}g^{ab}\nabla_a(N\partial_b u) + N^{-1}\tau\partial_0 u = 0 .$

We suppose that $\sigma_t$ is a given regular riemannian metric for t ∈ [t0, T] and that λ, N, ν are given in $E^p_3(T)$ with p > 1 and N > 0. Then we have ${}^{(3)}g \in E^p_3(T) \subset C^1(\Sigma\times[t_0,T])$ and ${}^{(3)}g$ has hyperbolic signature. It is easy to prove along standard lines that the Cauchy problem with data $(u_0, (\partial_t u)_0) \in H_2\times H_1$ has a solution such that $(u, \partial_t u) \in E_2(T)\times E_1(T)$ on Σ × [t0, T]. The initial value $(\partial_t u)_0$ is the product of the datum $\dot u_0$ by $e^{-2\lambda_0}$; it belongs to $H_1$ under the hypothesis made in section 1.1 on the Cauchy data.

4.5 Teichmüller parameters

We suppose known $h \in E^p_2(T)$, $\lambda, N, \nu \in E^p_3(T)$, $u \in E_2(T)$, and we suppose given Q → ψ(Q), a smooth cross section of $M_{-1}$ over Teich. The unknown is the curve t → Q(t). We have $\sigma_t \equiv \psi(Q(t))$ and

$\partial_t\sigma_{ab} = \frac{dQ^I}{dt}X_{I,ab} + C_{ab}$

with $X_I(Q)$ a basis of the space of TT tensors on (Σ, ψ(Q)) and C a conformal Lie derivative, $L^2$ orthogonal to TT tensors. The curve t → Q(t) and the tangent vector $P^I(t)$ to Teich satisfy the ordinary differential system (cf. section 2.3.3)

$X_{IJ}\frac{dQ^I}{dt} + Y_{IJ}P^I + Z_J = 0$

and

$X_{IJ}\frac{dP^I}{dt} + \Phi_J(P, \frac{dQ}{dt}) = 0 .$

This quasilinear first order system for P and Q has coefficients continuous in t and smooth in Q and P. The matrix of the principal terms, $X_{IJ}$, is invertible. There exists therefore a number T > 0 such that the system has one and only one solution in $C^1([t_0,T])$ with given initial data $P_0$, $Q_0$.

4.6 Local existence theorem

We can now prove the following theorem.

Theorem 11 The Cauchy problem with data $(u_0, \dot u_0) \in H_2\times H_1$ on $\Sigma_{t_0}$ (denoted $\Sigma_0$) and $Q_0$, a point in Teich, $P_0$ a tangent vector to Teich, for the Einstein equations with U(1) isometry group (polarized case) has a solution with $\sigma_t$ a regular metric on $\Sigma_t$ for t ∈ [t0, T] and $u \in E_2(T)$, T > t0, if T − t0 is small enough. This solution is unique when τ, depending only on t, is chosen together with a cross section of $M_{-1}$ over Teich.

Remark 12 One has, for this solution, $\lambda, N, \nu \in E^p_3(T)$, 1 < p < 2, and N > 0.

Proof. The proof is straightforward, using iteration to solve alternately the elliptic systems, the wave equation and the differential system satisfied by the Teichmüller parameters, with τ a given function of t and $\sigma_t$ required to remain in a chosen cross section of $M_{-1}$ over Teich. The iteration converges if T − t0 is small enough. The limit can be shown to be a solution of the Einstein equations with ${}^{(3)}g$ in constant mean curvature gauge by standard arguments; the 2-metric g is conformal, with the factor $e^{2\lambda}$, to a metric in the chosen cross section by construction.

This local existence theorem can be extended to the non polarized case.

5 Scheme for a global existence theorem

As is well known, we will deduce from our local existence theorem a global one, i.e. on Σ × [t0, ∞), if we can prove that the curve Q(t) remains in a compact subset of Teich and that neither the $H_2\times H_1$ norm of $(u(.,t), \dot u(.,t))$ nor the $E^p_3$ norms of λ(., t), N(., t), ν(., t) blow up when t ∈ [t0, ∞) while N remains strictly positive. If the spacetime we construct is supported by the manifold M × [t0, ∞) it will reach a moment of maximum expansion. It will be after an infinite proper time for observers moving along orthogonal trajectories of the hypersurfaces $M_t \equiv M\times\{t\}$ if the lapse function is uniformly bounded below by a strictly positive number. Our proof of this fact will rely on various refined estimates, using in particular corrected energies. The correction of the energies poses special problems in the non polarized case, which we will treat in another paper.

5.1 Notations

$|.|$ and $|.|_g$: pointwise norms of scalars or tensors on Σ, in the σ or g metric.

$\|.\|$ and $\|.\|_p$: $L^2$ and $L^p$ norms in the σ metric.

$\|.\|_g$: $L^2$ norm in the g metric.

A lower case index m or M denotes respectively the lower or upper bound of a scalar function on $\Sigma_t$. It may depend on t. When we have to make a choice of the time parameter t we will set

$t = -\tau^{-1} ;$ (12)

then t will increase from t0 > 0 to infinity when, $\Sigma_t$ expanding, τ(t) increases from τ0 < 0 to zero. With this choice the upper bound on N of subsection 4.3.4 reads

$N \le 2 .$ (13)
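The bound (13) follows in one line from the time choice (12); the short derivation below only combines (12) with the a fortiori estimate $N_M \le 2\partial_t\tau/\tau^2$ of subsection 4.3.4 and is meant as a reading aid.

```latex
\begin{align*}
t = -\tau^{-1}
  \;&\Longrightarrow\; \tau = -\frac{1}{t},
  \qquad \partial_t\tau = \frac{1}{t^{2}} = \tau^{2},\\
N_M \;&\le\; \frac{2\,\partial_t\tau}{\tau^{2}} \;=\; \frac{2\,\tau^{2}}{\tau^{2}} \;=\; 2 .
\end{align*}
```

The same relation $\partial_t\tau = \tau^2$ is the one invoked when the lapse equation is rewritten for 2 − N in the estimates for N.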

Remark 13 Other admissible choices of t, for instance τ = t, t ∈ [t0 , 0), t0 = τ0 < 0, would lead to the same geometrical conclusions.

5.2 Fundamental inequalities

Lemma 1 Let f be a scalar function on Σ. The following inequalities hold:

1. $\|f\|_q \le e^{-2\lambda_m/q}\|f\|_{L^q(g)}, \qquad \|f\|_{L^q(g)} \le e^{2\lambda_M/q}\|f\|_q .$

2. $|Df|_g = e^{-\lambda}|Df|, \qquad \|Df\|_{L^\infty(g)} \le e^{-\lambda_m}\|Df\|_\infty ,$

and if q ≥ 2

$\|Df\|_{L^q(g)} \le e^{-\lambda_m(q-2)/q}\|Df\|_q ,$

in particular $\|Df\| = \|Df\|_g$.

3a. $|D^2f| = e^{2\lambda}|D^2f|_g, \qquad \|D^2f\|_{L^q(g)} \le e^{-2\frac{q-1}{q}\lambda_m}\|D^2f\|_q .$

3b. $\|D^2f\| \le e^{\lambda_M}\|\Delta_g f\|_g + \frac{1}{\sqrt2}\|Df\|_g .$

Proof. The inequalities 1, 2, 3a are trivial consequences of the identities:

$\int_\Sigma f^q\mu_\sigma = \int_\Sigma f^q e^{-2\lambda}\mu_g$ since $\mu_g = e^{2\lambda}\mu_\sigma$

and

$g^{ab}D_a f D_b f = e^{-2\lambda}\sigma^{ab}D_a f D_b f$

and a corresponding equality for $D^2f$ or, more generally, for covariant 2-tensors.
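Item 1 can be illustrated on sampled data. The sketch below is only a discrete analogue, not part of the paper: the surface is replaced by a finite set of cells with hypothetical positive areas `mu_sigma`, and the names `weighted_lq_norm` and `check_item_1` are ours. It uses nothing but the identity $\mu_g = e^{2\lambda}\mu_\sigma$ quoted in the proof.

```python
import math
import random


def weighted_lq_norm(values, weights, q):
    """Discrete L^q norm: (sum_i w_i |v_i|^q)^(1/q)."""
    return sum(w * abs(v) ** q for v, w in zip(values, weights)) ** (1.0 / q)


def check_item_1(f, lam, mu_sigma, q, slack=1e-12):
    """Return the truth values of the two inequalities of item 1.

    f, lam   : samples of the function f and of the conformal factor lambda
    mu_sigma : positive cell areas in the sigma metric (hypothetical weights)
    slack    : relative tolerance absorbing floating point rounding
    """
    # Area element of g = e^{2 lambda} sigma.
    mu_g = [math.exp(2.0 * l) * m for l, m in zip(lam, mu_sigma)]
    lam_m, lam_M = min(lam), max(lam)
    norm_sigma = weighted_lq_norm(f, mu_sigma, q)
    norm_g = weighted_lq_norm(f, mu_g, q)
    first = norm_sigma <= math.exp(-2.0 * lam_m / q) * norm_g * (1.0 + slack)
    second = norm_g <= math.exp(2.0 * lam_M / q) * norm_sigma * (1.0 + slack)
    return first, second


if __name__ == "__main__":
    random.seed(0)
    n = 200
    f = [random.uniform(-1.0, 1.0) for _ in range(n)]
    lam = [random.uniform(-0.5, 1.5) for _ in range(n)]
    mu_sigma = [random.uniform(0.1, 1.0) for _ in range(n)]
    print(check_item_1(f, lam, mu_sigma, q=4))
```

Both inequalities hold for any sample, because each weight satisfies $e^{2\lambda_m}\mu_\sigma \le \mu_g \le e^{2\lambda_M}\mu_\sigma$ cell by cell.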

To prove 3b we use the identity obtained by two successive partial integrations and the Ricci formula with R(σ) = −1

$\|D^2u\|^2 = \|\Delta u\|^2 + \frac12\|Du\|^2 .$

We have

$\Delta u = e^{2\lambda}\Delta_g u$, and $\|e^{2\lambda}\Delta_g u\| = \|e^{\lambda}\Delta_g u\|_g .$

The given result follows.

Lemma 2 We denote by $C_\sigma$ any positive number depending only on (Σ, σ).

1. Let f be a scalar function on Σ. There exists $C_\sigma$ such that the $L^4$ norms of f and Df are estimated by:

$\|f\|_4 \le C_\sigma\{e^{-\lambda_m}\|f\|_g + e^{-\frac12\lambda_m}\|f\|_g^{1/2}\|Df\|_g^{1/2}\}$

and

$\|Df\|_4 \le C_\sigma\{\|Df\|_g + \|Df\|_g^{1/2}e^{\frac12\lambda_M}\|\Delta_g f\|_g^{1/2}\} .$

2. For any q such that 1 ≤ q < ∞ there exists $C_\sigma$ such that

$\|f\|_q \le C_\sigma\|f\|_{H_1} .$

Proof. 1. By the Sobolev inequalities there exists $C_\sigma$ such that

$\|f\|_4^2 = \||f|^2\| \le C_\sigma(\||f|^2\|_1 + \|D|f|^2\|_1) .$

Using

$D|f|^2 = 2f\cdot Df$

we obtain

$\||f|^2\| \le C_\sigma\|f\|(\|f\| + 2\|Df\|)$

which gives the first result using Lemma 1. Analogously

$\|Df\|_4^2 \equiv \||Df|^2\| \le C_\sigma\{\|Df\|^2 + \|D|Df|^2\|_1\}$

leads to the second inequality.

2. We use the Sobolev embedding theorem and the compactness of Σ.

6 Energy estimates

6.1 Bound of the first energy

The 2+1 dimensional Einstein equations with source the stress energy tensor of the wave map u contain the following equation (hamiltonian constraint)

$2N^{-2}(T_{00} - {}^{(3)}S_{00}) = N^{-2}\partial_0u\cdot\partial_0u + g^{ab}\partial_au\cdot\partial_bu + g^{ab}g^{cd}k_{cb}k_{da} - R - \tau^2 = 0 .$ (14)

Recall the splitting of the covariant 2-tensor k into a trace and a traceless part:

$k_{ab} = h_{ab} + \frac12 g_{ab}\tau ,$ (15)

hence

$|k|_g^2 = g^{ac}g^{bd}k_{ab}k_{cd} = |h|_g^2 + \frac12\tau^2$ (16)

and the hamiltonian constraint equation reads

$|u'|^2 + |Du|_g^2 + |h|_g^2 = R(g) + \frac12\tau^2$ (17)

with

$u' \equiv N^{-1}\partial_0 u .$

We define the first energy by the following formula (recall that $|.|_g$ and $\|.\|_g$ denote respectively the pointwise norm and the $L^2$ norm in the metric g)

$E(t) = \frac12\int_{\Sigma_t}(|u'|^2 + |Du|_g^2 + |h|_g^2)\mu_g \equiv \frac12\{\|u'\|_g^2 + \|Du\|_g^2 + \|h\|_g^2\} .$ (18)

This energy is the first energy of the wave map u completed by the $L^2(g)$ norm of h. We integrate the hamiltonian constraint on $(\Sigma_t, g)$ using the constancy of τ and the Gauss Bonnet theorem which reads, with χ the Euler characteristic of Σ,

$\int_{\Sigma_t}R(g)\mu_g = 4\pi\chi .$

We have then

$E(t) = \frac{\tau^2}{4}Vol_g(\Sigma_t) + 2\pi\chi$

with

$Vol_g(\Sigma_t) = \int_{\Sigma_t}\mu_g .$

We know from elementary calculus that on a compact manifold

$\frac{dVol_g(\Sigma_t)}{dt} = \frac12\int_{\Sigma_t}g^{ab}\frac{\partial g_{ab}}{\partial t}\mu_g = -\tau\int_{\Sigma_t}N\mu_g$

since

$g^{ab}\partial_t g_{ab} = -2N\tau + 2\nabla_a\nu^a .$ (19)

We use the equation

$N^{-1}\,{}^{(3)}R_{00} = \Delta_g N - N|k|_g^2 + \partial_t\tau = N|u'|^2$

together with the splitting of k to write after integration, since τ is constant in space,

$\frac{d\tau}{dt}Vol_g(\Sigma_t) - \frac12\tau^2\int_{\Sigma_t}N\mu_g = \int_{\Sigma_t}N(|h|_g^2 + |u'|^2)\mu_g .$

We use these results to compute the derivative of E(t) and we find that it simplifies to:

$\frac{dE(t)}{dt} = \frac12\tau\int_{\Sigma_t}(|h|_g^2 + |u'|^2)N\mu_g .$

We see that E(t) is a non increasing function of t if τ is negative. The absence of the term $|Du|_g^2$ on the right hand side does not permit an estimate of the rate of decay of E(t). We will estimate this decay in a forthcoming section. Note in addition the appearance of N in the right hand side.
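The simplification of dE/dt can be spelled out from the three relations of this subsection: the expression of E(t) through the volume, the derivative of the volume, and the integrated lapse equation. The following lines are a reading aid that uses only those relations.

```latex
\begin{align*}
E(t) &= \tfrac14\,\tau^{2}\,\mathrm{Vol}_g(\Sigma_t) + 2\pi\chi ,\\
\frac{dE}{dt}
 &= \tfrac12\,\tau\,\frac{d\tau}{dt}\,\mathrm{Vol}_g(\Sigma_t)
    + \tfrac14\,\tau^{2}\,\frac{d\,\mathrm{Vol}_g(\Sigma_t)}{dt}
  = \tfrac12\,\tau\Big(\frac{d\tau}{dt}\,\mathrm{Vol}_g(\Sigma_t)
    - \tfrac12\,\tau^{2}\int_{\Sigma_t} N\,\mu_g\Big)\\
 &= \tfrac12\,\tau\int_{\Sigma_t} N\big(|h|_g^{2} + |u'|^{2}\big)\,\mu_g .
\end{align*}
```

The second equality uses $d\,\mathrm{Vol}_g/dt = -\tau\int N\mu_g$, and the last one is the integrated lapse equation.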

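6.2 Second energy estimates

In this paragraph indices are raised with g. We denote by $h_g^{ab}$ the contravariant components of $h_{ab}$ computed with the metric g. We define the energy of gradient u by the formula

$E^{(1)}(t) \equiv \int_{\Sigma_t}(J_0 + J_1)\mu_g$

with

$J_1 = \frac12|\Delta_g u|^2, \qquad J_0 = \frac12|Du'|^2 .$

We have for an arbitrary function f:

$\frac{d}{dt}\int_{\Sigma_t}f\mu_g = \int_{\Sigma_t}\{\partial_t f + \frac12 f g^{ab}\partial_t g_{ab}\}\mu_g ,$

that is, due to the definition of $k_{ab}$,

$\frac{d}{dt}\int_{\Sigma_t}f\mu_g = \int_{\Sigma_t}\{\partial_t f - (N\tau - \nabla_a\nu^a)f\}\mu_g ,$

hence after integration by parts on the compact manifold Σ, using the expression of $\partial_0$ and replacing f by $J_0 + J_1$, the following formula where the shift does not appear explicitly:

$\frac{d}{dt}\int_{\Sigma_t}(J_1 + J_0)\mu_g = \int_{\Sigma_t}\{\partial_0(J_1 + J_0) - N\tau(J_1 + J_0)\}\mu_g .$

We first compute

$\int_{\Sigma_t}\partial_0 J_1\,\mu_g = \int_{\Sigma_t}\partial_0\Delta_g u\cdot\Delta_g u\,\mu_g .$

We define the operator $\bar\partial_0$ on time dependent space tensors by

$\bar\partial_0 = \partial_t - L_\nu$

where $L_\nu$ denotes the Lie derivative in the direction of the shift ν. We have

$\bar\partial_0\Delta_g u = g^{ab}\bar\partial_0\nabla_a\partial_b u + \bar\partial_0 g^{ab}\,\nabla_a\partial_b u .$

Therefore using

$\bar\partial_0 g^{ab} = 2N k^{ab} \equiv 2N h_g^{ab} + N g^{ab}\tau$

we find

$\int_{\Sigma_t}\partial_0 J_1\,\mu_g = \int_{\Sigma_t}g^{ab}\bar\partial_0\nabla_a\partial_b u\cdot\Delta_g u\,\mu_g + X_1$

with

$X_1 = \int_{\Sigma_t}\{2N h_g^{ab}\nabla_a\partial_b u\cdot\Delta_g u + 2N\tau J_1\}\mu_g .$

Analogously

$\int_{\Sigma_t}\partial_0 J_0\,\mu_g = \int_{\Sigma_t}g^{ab}\partial_0\partial_a u'\cdot\partial_b u'\,\mu_g + X_0$

with

$X_0 = \int_{\Sigma_t}\{N h_g^{ab}\partial_a u'\cdot\partial_b u' + N\tau J_0\}\mu_g .$

We use the commutation of the operator $\bar\partial_0$ with the partial derivative $\partial_a$ (cf. C.B-York 1995) together with partial integration to obtain

$\int_{\Sigma_t}g^{ab}\partial_0\partial_a u'\cdot\partial_b u'\,\mu_g = -\int_{\Sigma_t}\partial_0 u'\cdot\Delta_g u'\,\mu_g .$

The function u satisfies the wave equation on $(\Sigma\times R, {}^{(3)}g)$, namely:

$\partial_0 u' = N\Delta_g u + \partial^a N\partial_a u + \tau N u' ,$

which gives

$\int_{\Sigma_t}g^{ab}\partial_0\partial_a u'\cdot\partial_b u'\,\mu_g = -\int_{\Sigma_t}N\Delta_g u\cdot\Delta_g u'\,\mu_g + Y_0$

with, after another integration by parts,

$Y_0 \equiv \int_{\Sigma_t}\{(\nabla_b(\partial^a N\partial_a u) + \tau\partial_b N u')\cdot(\partial^b u') + 2\tau N J_0\}\mu_g .$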

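On the other hand:

$g^{ab}\bar\partial_0\nabla_a\partial_b u \equiv \Delta_g\partial_0 u - g^{ab}\bar\partial_0\Gamma^c_{ab}\partial_c u$

with

$\Delta_g\partial_0 u \equiv \Delta_g(Nu') \equiv N\Delta_g u' + 2\partial^a N\partial_a u' + u'\Delta_g N ,$

therefore

$\int_{\Sigma_t}g^{ab}\bar\partial_0\nabla_a\partial_b u\cdot\Delta_g u\,\mu_g = \int_{\Sigma_t}N\Delta_g u'\cdot\Delta_g u\,\mu_g + Y_1$

with

$Y_1 = \int_{\Sigma_t}\{-g^{ab}\bar\partial_0\Gamma^c_{ab}\partial_c u + 2\partial^a N\partial_a u' + u'\Delta_g N\}\cdot\Delta_g u\,\mu_g$

which can be written, using the identity

$\bar\partial_0\Gamma^c_{ab} = \nabla^c(N k_{ab}) - \nabla_a(N k_b^c) - \nabla_b(N k_a^c)$

together with the equation

$\nabla_a k^a_b = -\partial_b u\cdot u' ,$

as

$Y_1 = \int_{\Sigma_t}\{(2\partial_a N h_g^{ac} - 2N\partial^c u\cdot u')\partial_c u + 2\partial^a N\partial_a u' + u'\Delta_g N\}\cdot\Delta_g u\,\mu_g .$ (20)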

We see that the terms in third derivatives of u disappear in the derivative of $E^{(1)}(t)$. We have obtained

$\int_{\Sigma_t}\partial_0(J_0 + J_1)\mu_g = X_0 + X_1 + Y_0 + Y_1$

where the X's and Y's are given by the above formulas. We read from these formulas the following theorem.

Theorem 14 The time derivative of the second energy $E^{(1)}$ satisfies the equality

$\frac{dE^{(1)}}{dt} - 2\tau E^{(1)} = \tau\int_{\Sigma_t}\{N J_0 + (N-2)(J_0 + J_1)\}\mu_g + Z .$ (21)

The quantity Z is given by:

$Z \equiv \int_{\Sigma_t}\{N h_g^{ab}\partial_a u'\cdot\partial_b u' + 2N h_g^{ab}\nabla_a\partial_b u\cdot\Delta_g u$ (22)

$\qquad + (\nabla_b(\partial^a N\partial_a u) + \tau\partial_b N u')\cdot(\partial^b u')\}\mu_g + Y_1 .$ (23)
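The way the equality of Theorem 14 is read off from the formulas for $X_0$, $X_1$, $Y_0$, $Y_1$ can be made explicit by collecting the terms proportional to τ. The bookkeeping below is a reading aid; it uses only the expressions displayed above, and every term not proportional to $\tau N J_0$ or $\tau N J_1$ is gathered in Z.

```latex
\begin{align*}
\frac{dE^{(1)}}{dt}
 &= X_0 + X_1 + Y_0 + Y_1 - \tau\int_{\Sigma_t} N\,(J_0 + J_1)\,\mu_g ,\\
\text{terms in } \tau N J:\quad
 & \underbrace{\tau N J_0}_{X_0} + \underbrace{2\tau N J_1}_{X_1}
   + \underbrace{2\tau N J_0}_{Y_0} - \tau N\,(J_0 + J_1)
   \;=\; \tau N\,(2J_0 + J_1),\\
\tau N\,(2J_0 + J_1) - 2\tau\,(J_0 + J_1)
 &= \tau\,\{\,N J_0 + (N-2)(J_0 + J_1)\,\} .
\end{align*}
```

Integrating the last identity over $\Sigma_t$ and adding Z gives (21), since $2\tau E^{(1)} = 2\tau\int_{\Sigma_t}(J_0 + J_1)\mu_g$.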

For τ ≤ 0, and 0 < N ≤ 2, the right hand side of (19) is less than Z, which can be estimated with non linear terms in the energies: all the terms which are only quadratic in the derivatives of u, i.e. linear in energy densities, have coefficients which contain N − 2, $\partial_a N$ or $h_g^{ab}$, or their derivatives. To estimate these terms we need bounds which will be deduced from estimates on the conformal factor and the lapse N. In the following paragraphs we will set

$E(t) \equiv \varepsilon^2$, and $\tau^{-2}E^{(1)}(t) \equiv \varepsilon_1^2 .$

7 Estimates for h in $H_1$

7.1 Estimate of $\|h\|$

We have defined the auxiliary unknown h by

$h_{ab} \equiv k_{ab} - \frac12 g_{ab}\tau .$

Its $L^2$ norm on (Σ, σ) is bounded in terms of the first energy and an upper bound $\lambda_M$ of the conformal factor since we have

$\|h\|^2 = \int_{\Sigma_t}\sigma^{ac}\sigma^{bd}h_{ab}h_{cd}\,\mu_\sigma = \int_{\Sigma_t}e^{2\lambda}g^{ac}g^{bd}h_{ab}h_{cd}\,\mu_g \le e^{2\lambda_M}\|h\|^2_{L^2(g)}$

which implies on $\Sigma_t$, by the definition of E(t),

$\|h\| \le e^{\lambda_M}\varepsilon$ with $\varepsilon \equiv E^{1/2}(t) .$

7.2 Estimate of $\|Dh\|$

The tensor h satisfies the equations

$D^a h_{ab} = L_b \equiv -\partial_b u\cdot\dot u .$

It is the sum of a TT tensor $h^{TT} \equiv q$ and a conformal Lie derivative r:

$h \equiv q + r .$

It results from elliptic theory that on each $\Sigma_t$ the tensor r satisfies the estimate

$\|r\|_{H_1} \le C_\sigma\|Du\cdot\dot u\| \le C_\sigma\||Du|^2\|^{1/2}\,\||\dot u|^2\|^{1/2} .$

We will bound the right hand side of this inequality in terms of the first and second energies of u. We have:

$\||\dot u|^2\| \le e^{4\lambda_M}\||u'|^2\| ;$

we have proven in section 4 that

$\||u'|^2\| \le C_\sigma e^{-\lambda_m}\|u'\|_{L^2(g)}(e^{-\lambda_m}\|u'\|_{L^2(g)} + \|Du'\|_{L^2(g)}) .$

We have set

$\varepsilon_1 \equiv |\tau|^{-1}\{E^{(1)}(t)\}^{1/2} ,$

hence, using the lower bound on λ and the definitions of ε and $\varepsilon_1$ we obtain

$\||u'|^2\| \le C_\sigma\tau^2(\varepsilon^2 + \varepsilon\varepsilon_1) .$

On the other hand

$\||Du|^2\| \le C_\sigma\|Du\|_{L^2(g)}(\|Du\|_{L^2(g)} + e^{\lambda_M}\|\Delta_g u\|_{L^2(g)}) ,$

hence

$\||Du|^2\| \le C_\sigma\{\varepsilon^2 + \varepsilon\varepsilon_1 e^{\lambda_M}|\tau|\} .$

It results from these inequalities that

$\|r\|^2_{H_1} \le C_\sigma e^{4\lambda_M}\tau^2\varepsilon^2(\varepsilon + \varepsilon_1)\{\varepsilon + \varepsilon_1 e^{\lambda_M}|\tau|\} .$

We now estimate the transverse part $h^{TT} = q$. It is known (cf. Andersson and Moncrief) that in dimension 2 the equation

$D_a q^a_b = 0$, with $q^a_a = 0$,

implies

$D^c D_c q_{ab} = R(\sigma)q_{ab} .$

When R(σ) = −1 this equation gives by integration on $\Sigma_t$ of its contracted product with $q^{ab}$ the following relation

$\|Dq\| = \|q\| ;$

more generally any $H_s$ norm of q is a multiple of its $L^2$ norm. We have

$\|q\| \le \|h\| + \|r\| ,$

therefore

$\|Dh\| \le \|Dq\| + \|Dr\| \le \|h\| + \|r\|_{H_1} .$

In other words

$\|Dh\| \le e^{\lambda_M}\varepsilon\{1 + C_\sigma e^{\lambda_M}|\tau|(\varepsilon + \varepsilon_1 e^{\lambda_M}|\tau|)^{1/2}(\varepsilon + \varepsilon_1)^{1/2}\} .$

8 Estimates for the conformal factor

8.1 First estimates

Recall that we denote respectively by $\|.\|$ and $\|.\|_p$ the $L^2(\sigma)$ and $L^p(\sigma)$ norms on Σ and by $\|.\|_g$ an $L^2(g)$ norm on Σ.

The conformal factor λ satisfies the equation

$\Delta\lambda = f(\lambda) \equiv p_1 e^{2\lambda} - p_2 e^{-2\lambda} + p_3$

where the coefficients $p_i$ are functions in $E_0\cap E^p_1$, 1 < p < 2 (a hypothesis consistent with $Du, \dot u, h \in E_1$), given by

$p_1 = \frac14\tau^2, \qquad p_2 = \frac12(|h|^2 + |\dot u|^2), \qquad p_3 = \frac12(R(\sigma) - |Du|^2) .$

Having chosen R(σ) = −1 we have seen that a lower bound $\lambda_m$ for λ is such that

$e^{-2\lambda_m} \le \frac12\tau^2 .$

Also

$\lambda_- \le \lambda \le \lambda_+, \qquad \lambda_- = \omega - \max v + v$, and $\lambda_+ = \omega + v - \min v ,$

where $v \in E_2\cap E^p_3$ is the solution with mean value zero on $\Sigma_t$ of the linear equation

$\Delta v = f(\omega) \equiv p_1 e^{2\omega} - p_2 e^{-2\omega} + p_3$

where $e^{2\omega}$, positive solution of the equation

$P_1 e^{4\omega} + P_3 e^{2\omega} - P_2 = 0 ,$

is given by, since $P_3 < 0$, $P_2 \ge 0$, $P_1 = \frac14\tau^2 V_\sigma$,

$e^{2\omega} = \frac{-P_3 + \sqrt{P_3^2 + 4P_1P_2}}{2P_1} \equiv \frac{-P_3(1 + \sqrt{1 + 4P_3^{-2}P_1P_2})}{2P_1} .$

This formula will permit an estimate of $e^{2\omega} - \frac{2}{\tau^2}$, a positive quantity, in terms of the energies. Indeed using the elementary algebra inequality

$\sqrt{1+a} \le 1 + \frac12 a$, when a ≥ 0,

we obtain

$e^{2\omega} \le -\frac{P_3}{P_1} - \frac{P_2}{P_3}$

and, using the expressions of $P_2$, $P_3$ and $P_1 = \frac14\tau^2V_\sigma$, together with

$\|\dot u\|^2 \le e^{2\lambda_M}\|u'\|_g^2$, and $\|h\|^2 \le e^{2\lambda_M}\|h\|_g^2 ,$

we find

$0 \le \frac12\tau^2e^{2\omega} - 1 \le V_\sigma^{-1}\{\|Du\|^2 + \frac{\tau^2}{2}e^{2\lambda_M}(\|u'\|_g^2 + \|h\|_g^2)\} .$
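The closed formula for $e^{2\omega}$ and the two bounds that frame it can be evaluated numerically. The sketch below is illustrative only: the function names and the sample values of τ, $V_\sigma$ and of the squared norms are hypothetical inputs, not data from the paper. It computes the positive root of $P_1x^2 + P_3x - P_2 = 0$ and compares it with the lower bound $2/\tau^2$ and with the upper bound $-P_3/P_1 - P_2/P_3$.

```python
import math


def exp_2omega(P1, P2, P3):
    """Positive root x = e^{2 omega} of P1*x**2 + P3*x - P2 = 0,
    assuming P1 > 0, P2 >= 0 and P3 < 0."""
    return (-P3 + math.sqrt(P3 * P3 + 4.0 * P1 * P2)) / (2.0 * P1)


def omega_bounds(tau, V, Du2, h2, udot2):
    """Return (x, lower, upper, residual) for given squared L2(sigma) norms.

    tau   : mean curvature (negative), V : volume of (Sigma, sigma)
    Du2, h2, udot2 : squared norms of Du, h and the dotted u (sample inputs)
    """
    P1 = 0.25 * V * tau ** 2
    P2 = 0.5 * (h2 + udot2)
    P3 = -0.5 * (V + Du2)
    x = exp_2omega(P1, P2, P3)
    lower = 2.0 / tau ** 2               # e^{2 omega} >= 2 / tau^2
    upper = -P3 / P1 - P2 / P3           # from sqrt(1 + a) <= 1 + a / 2
    residual = P1 * x - P2 / x + P3      # defining equation for omega
    return x, lower, upper, residual


if __name__ == "__main__":
    # Genus 2 surface with R(sigma) = -1: V = -4 * pi * chi = 8 * pi.
    print(omega_bounds(tau=-0.5, V=8.0 * math.pi, Du2=0.3, h2=0.2, udot2=0.1))
```

When the three squared norms are set to zero the root coincides with the lower bound $2/\tau^2$, which is the value found for the homogeneous solution.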

We have set $\varepsilon^2 \equiv E(t)$ and therefore we have

$0 \le \frac12\tau^2e^{2\omega} - 1 \equiv \varepsilon_\omega \le V_\sigma^{-1}\{(1 + \frac{\tau^2}{2}e^{2\lambda_M})\varepsilon^2\} .$

We will now give estimates for λ.

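Lemma 15 Denote by $\lambda_M$ the maximum of λ; one has

$0 \le \lambda_M - \omega \le 2\|v\|_{L^\infty}, \qquad 0 \le \omega - \lambda_m \le 2\|v\|_{L^\infty} .$

Proof. The result follows from the expressions of $\lambda_-$ and $\lambda_+$:

$\lambda_M \le \sup\lambda_+ = \omega + \max v - \min v$, and $\lambda_m \ge \inf\lambda_- = \omega + \min v - \max v .$

Also

$\lambda_M - \lambda_m \le 2\max v - 2\min v \le 4\max v \le 4\|v\|_{L^\infty} .$

Corollary 16 The following inequalities hold

$1 \le e^{\lambda_M - \omega} \le 1 + 2\|v\|_{L^\infty}e^{2\|v\|_{L^\infty}} ,$ (24)

$1 \le e^{\lambda_M - \lambda_m} \le 1 + 4\|v\|_{L^\infty}e^{4\|v\|_{L^\infty}} .$ (25)

Proof. Elementary calculus.

We set $\varepsilon_v \equiv \|v\|_{L^\infty}$.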
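The elementary calculus behind Corollary 16 can be written out as follows. This is one possible argument, given as a reading aid; it uses only Lemma 15 and the monotonicity of $x \mapsto xe^x$ on $x \ge 0$.

```latex
\begin{align*}
e^{x} - 1 &= \int_0^{x} e^{s}\,ds \;\le\; x\,e^{x}
  \qquad (x \ge 0),\\
x = \lambda_M - \omega \in [0,\, 2\|v\|_{L^\infty}]
  \;&\Longrightarrow\;
  1 \le e^{\lambda_M-\omega} \le 1 + x\,e^{x}
    \le 1 + 2\|v\|_{L^\infty}\,e^{2\|v\|_{L^\infty}},\\
x = \lambda_M - \lambda_m \in [0,\, 4\|v\|_{L^\infty}]
  \;&\Longrightarrow\;
  1 \le e^{\lambda_M-\lambda_m} \le 1 + 4\|v\|_{L^\infty}\,e^{4\|v\|_{L^\infty}} .
\end{align*}
```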

Denote by $\varepsilon_{v_0}$ the $L^\infty$ norm of the function $v$ computed with initial data. We have shown in the section on local existence that $\varepsilon_{v_0}$ tends to zero with the initial data $q_0$, $Du_0$ and $\dot u_0$.

Hypothesis $H_c$. We say that $v$ satisfies the hypothesis $H_c$ if there exists a number $c > \varepsilon_{v_0}$, independent of $t$, such that $\varepsilon_v \le c$. We suppose also that the initial data are such that $E(t_0) \equiv \varepsilon_0^2$ verifies the inequality (we chose $\tfrac12$ for simplicity of notations)
$$V_\sigma^{-1}\varepsilon_0^2(1 + 2ce^{2c})^2 < \tfrac12.$$
Then, since $E(t)$ is non increasing and the volume $V_\sigma$ of $(\Sigma, \sigma)$ is constant by the Gauss Bonnet theorem, it holds for all $t$ that

Vσ−1 ε2 (1 + 2ce2c )2
1,

and the estimate of $e^{\lambda_M}|\tau|$.

2.
$$\|h\|_{L^\infty(g)} = \sup_\Sigma\{g^{ac}g^{bd}h_{ab}h_{cd}\}^{1/2} \le e^{-2\lambda_m}\|h\|_\infty \le \tfrac12\tau^2\|h\|_\infty.$$

9.2 $W_3^p$ estimates for $N$

9.2.1 $H_2$ estimates of $N$

Theorem 26 There exist numbers $C = C(c)$ and $C_\sigma$ such that the $H_2$ norm of $N$ satisfies the inequality
$$\|2 - N\|_{H_2} \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1).$$

Corollary 27 The minimum $N_m$ of $N$ is such that
$$0 \le 2 - N_m \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1).$$

Proof. We write the equation satisfied by $N$ in the form
$$\Delta(2 - N) - (2 - N) = \beta \qquad (29)$$
with, having chosen the parameter $t$ such that $\partial_t\tau = \tau^2$,
$$\beta \equiv (2 - N)\left(\tfrac12 e^{2\lambda}\tau^2 - 1\right) - N\left(e^{2\lambda}|u'|^2 + e^{-2\lambda}|h|^2\right).$$
The standard elliptic estimate applied to the form given to the lapse equation gives
$$\|2 - N\|_{H_2} \le C_\sigma\|\beta\|. \qquad (30)$$
Since $0 < N \le 2$ and $e^{-2\lambda} \le \tfrac12\tau^2$ it holds that
$$\|\beta\| \le 2\left(\tfrac12 e^{2\lambda_M}\tau^2 - 1\right)V_\sigma^{1/2} + 2\left(e^{2\lambda_M}\|\,|u'|^2\| + \tfrac12\tau^2\|\,|h|^2\|\right). \qquad (31)$$
The $L^4$ norms of $h$ and $u'$ as well as $\tfrac12 e^{2\lambda_M}\tau^2 - 1$ have been estimated in the section on conformal factor estimates. We deduce from these estimates the bound
$$\|\beta\| \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1),$$
which gives the result of the theorem. The corollary is a consequence of the Sobolev embedding theorem.

Theorem 28 Under the hypothesis $H_c$ there exist numbers $C$ depending only on $c$ and $C_\sigma$ such that if $1 < p < 2$, for instance $p = \tfrac43$,
$$\varepsilon_{DN} \equiv \|2 - N\|_{W_3^p} \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1).$$

Corollary 29 The gradient of $N$ satisfies the inequality
$$\|DN\|_{L^\infty(g)} \le CC_\sigma|\tau|(\varepsilon^2 + \varepsilon\varepsilon_1).$$


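Proof. We have
$$|\beta| \le (2 - N_m)\left(\tfrac12 e^{2\lambda_M}\tau^2 - 1\right) + 2\left(e^{2\lambda_M}|u'|^2 + \tfrac12\tau^2|h|^2\right). \qquad (32)$$
We apply the standard elliptic estimate
$$\|2 - N\|_{W_{s+2}^p} \le C_\sigma\|\beta\|_{W_s^p} \qquad (33)$$
with now $1 < p < 2$, $s = 1$. We have for any $p \le 2$,
$$\|\beta\|_p \le V_\sigma^{\frac1p - \frac12}\|\beta\|.$$
We have already estimated $\|\beta\|$. To estimate $\|\beta\|_{W_1^p}$ we compute
$$D\beta \equiv \left[(2 - N)e^{2\lambda}\tau^2 - 2N\left(e^{2\lambda}|u'|^2 - e^{-2\lambda}|h|^2\right)\right]D\lambda - DN\left[\tfrac12 e^{2\lambda}\tau^2 - 1 + e^{2\lambda}|u'|^2 + e^{-2\lambda}|h|^2\right] - N\left[e^{2\lambda}D|u'|^2 + e^{-2\lambda}D|h|^2\right].$$
We have therefore, with $\frac1q + \frac1{q'} = \frac1p$, and using estimates obtained for $\lambda$ under the $H$ hypothesis,
$$\|D\beta\|_p \le CC_\sigma\Big\{(2 - N_m)\|D\lambda\|_p + (\varepsilon^2 + \varepsilon\varepsilon_1)\|DN\|_p + \left[e^{2\lambda_M}\|\,|u'|^2\|_{q'} + \tfrac12\tau^2\|\,|h|^2\|_{q'}\right]\left[4\|D\lambda\|_q + \|DN\|_q\right] + A\Big\}$$
with
$$A \equiv 2\left[e^{2\lambda_M}\|D|u'|^2\|_p + \tfrac12\tau^2\|D|h|^2\|_p\right].$$
To bound the first line we recall that the $L^p$ norms of $D\lambda$ and $DN$ are bounded by their $L^2$ norms estimated before. To estimate the second line (except for $A$) we choose $p = \tfrac43$, $q = 4$, $q' = 2$. We find quantities bounded before and the $L^4$ norms of $D\lambda$ and $DN$, which can be estimated in terms of their $H_1$ norms bounded before. To bound $A$ we write again, with $p = \tfrac43$:
$$\|D|u'|^2\|_p \le 2\|u'\|_4\|Du'\|, \quad \text{since} \quad \tfrac14 + \tfrac12 = \tfrac1p.$$
This inequality and corresponding estimates for $h$ give:
$$A \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1).$$
The $H_1$ bound found above for $DN$ and $D\lambda$ then yields the stated result. The corollary is a consequence of the Sobolev embedding theorem and the relation between $\sigma$ and $g$ norms:
$$\|DN\|_{L^\infty(g)} \le e^{-\lambda_m}\|DN\|_\infty \le e^{-\lambda_m}C_\sigma\|DN\|_{W_2^p} \le CC_\sigma|\tau|(\varepsilon^2 + \varepsilon\varepsilon_1).$$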



10 Corrected energy estimates

We have obtained in section 6 a bound for the first energy and a decay for the second energy. These bounds prove insufficient to control the behaviour in time of the Teichmüller parameters. The right hand side of the first energy inequality is non positive, as well as the quadratic term of the right hand side of the second energy inequality, but the space derivatives are lacking in those right hand sides which would make them negative definite. The introduction of corrected energies enables one to obtain such a definiteness, compensating some terms by others, and leading to better decay estimates.

10.1 Corrected first energy

10.1.1 Definition and lower bound

One defines as follows a corrected first energy, where $\alpha$ is a constant which we will choose positive:
$$E_\alpha(t) = E(t) - \alpha\tau\int_{\Sigma_t}(u - \bar u).u'\,\mu_g \qquad (34)$$
where we have denoted by $\bar u$ the mean value of $u$, a scalar function, on $\Sigma_t$:
$$\bar u = \frac{1}{\mathrm{Vol}_g\Sigma_t}\int_{\Sigma_t}u\,\mu_g.$$
An estimate of $E_\alpha$ will give estimates of the $L^2$ norms of the derivatives of $u$ and of $h$ if there exists a $K > 0$, independent of $t$, such that
$$E(t) \le KE_\alpha(t) \equiv K\left[E(t) - \alpha\tau\int_{\Sigma_t}(u - \bar u).u'\,\mu_g\right]. \qquad (35)$$
We set
$$I_0 \equiv \tfrac12|u'|^2, \quad \text{and} \quad I_1 \equiv \tfrac12|Du|_g^2$$
and
$$x_0 = \int_{\Sigma_t}I_0\,\mu_g \equiv \tfrac12\|u'\|_g^2, \quad \text{and} \quad x_1 = \tfrac12\|Du\|_g^2. \qquad (36)$$
We estimate the complementary term through the Cauchy-Schwarz inequality
$$\left|\int_{\Sigma_t}(u - \bar u).u'\,\mu_g\right| \le \|u - \bar u\|_g\,\|u'\|_g.$$
We will use the Poincaré inequality on the compact manifold $(\Sigma, \sigma)$ to estimate the $L^2(\sigma)$ norm of $u - \bar u$:
$$\|u - \bar u\|_g \le e^{\lambda_M}\|u - \bar u\| \le e^{\lambda_M}\Lambda_\sigma^{-1/2}\|Du\| \qquad (37)$$


where $\|\cdot\|$ denotes the $L^2$ norm on $\Sigma$ in the metric $\sigma$, $\lambda_M$ is an upper bound of the conformal factor $\lambda$ and $\Lambda_\sigma$ is the first positive eigenvalue of the operator $-\Delta \equiv -\Delta_\sigma$ acting on functions with mean value zero. Note that $\|Du\| = \|Du\|_g$. The inequality (35) to satisfy is implied by the two following ones:
$$K \ge 1 \qquad (38)$$
and (to be satisfied by all $x_0, x_1 \ge 0$)
$$(K - 1)(x_0 + x_1) - 2|\alpha\tau|Ke^{\lambda_M}\Lambda_\sigma^{-1/2}x_0^{1/2}x_1^{1/2} \ge 0; \qquad (39)$$
this quadratic form in the $x$'s will be always non negative if $K \ge 1$ and its discriminant is non positive. This last condition reads
$$aK \le K - 1$$
with
$$a \equiv \frac{\alpha|\tau|e^{\lambda_M}}{\Lambda_\sigma^{1/2}}. \qquad (40)$$
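The equivalence between non-negativity of the form (39) and the condition $aK \le K - 1$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and $a$ is treated as a given number in $[0, 1)$ rather than computed from $\alpha$, $\tau$, $\lambda_M$ and $\Lambda_\sigma$.

```python
import math

def form_39(K, a, x0, x1):
    """Left-hand side of (39) written with a = alpha*|tau|*e^{lambda_M}/Lambda_sigma^{1/2}."""
    return (K - 1.0) * (x0 + x1) - 2.0 * a * K * math.sqrt(x0 * x1)

def smallest_K(a):
    """Smallest K with a*K <= K - 1, i.e. K = 1/(1 - a); requires 0 <= a < 1."""
    if not 0.0 <= a < 1.0:
        raise ValueError("a finite K >= 1 exists only for 0 <= a < 1")
    return 1.0 / (1.0 - a)

def form_is_nonnegative(K, a, grid=50, tol=1e-12):
    """Check (39) on a grid of x0, x1 in [0, 1]; the form is homogeneous of degree 1 in (x0, x1)."""
    pts = [k / grid for k in range(grid + 1)]
    return all(form_39(K, a, x0, x1) >= -tol for x0 in pts for x1 in pts)

a = 0.5                      # hypothetical value of a
K = smallest_K(a)            # gives K = 2
print(K, form_is_nonnegative(K, a), form_is_nonnegative(0.75 * K, a))
```

In the variables $X = x_0^{1/2}$, $Y = x_1^{1/2}$ the form is $(K-1)(X^2 + Y^2) - 2aKXY$, whose worst case is $X = Y$; this is why the threshold is exactly $aK = K - 1$.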

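A necessary and sufficient condition for the existence of $K \ge 1$ and $K$ finite is therefore
$$a^2 \equiv \alpha^2\tau^2e^{2\lambda_M}\Lambda_\sigma^{-1} < 1. \qquad (41)$$
[...] $> 1$ there is an open subset of Teichmüller space such that for metrics $\sigma \in M_{-1}$ projecting on this open set it holds
$$8\Lambda_\sigma = 1 + \delta_\sigma, \quad \text{with} \quad \delta_\sigma > 0. \qquad (43)$$
We now choose
$$\alpha = \tfrac14.$$
The condition $a < 1$ then reads
$$\left(\frac{\tau^2e^{2\lambda_M}}{2}\right)\left(\frac{1}{1 + \delta_\sigma}\right) < 1,$$
that is, using estimates on the conformal factor,
$$C(\varepsilon^2 + C_\sigma\varepsilon\varepsilon_1) < \delta_\sigma. \qquad (44)$$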



10.1.2 Time derivative of the corrected energy

We set:
$$\frac{dE_\alpha}{dt} = \frac{dE}{dt} - R_\alpha$$
with (the terms explicitly containing the shift $\nu$ give an exact divergence which integrates to zero)
$$R_\alpha = \alpha\tau\int_{\Sigma_t}\{\partial_0u'.(u - \bar u) + u'.\partial_0(u - \bar u) - N\tau u'.(u - \bar u)\}\mu_g + \alpha\frac{d\tau}{dt}\int_{\Sigma_t}u'.(u - \bar u)\mu_g. \qquad (45)$$
The function $\gamma \equiv u$ satisfies the wave equation
$$-N^{-1}\partial_0(N^{-1}\partial_0u) + N^{-1}\nabla^a(N\partial_au) + N^{-1}\tau\partial_0u = 0. \qquad (46)$$
Some elementary computations and integration by parts show that
$$R_\alpha = \alpha\tau\int_{\Sigma_t}\{|u'|^2 - |Du|_g^2\}N\mu_g - \alpha\tau\int_{\Sigma_t}u'.\partial_t\bar u\,\mu_g + \alpha\frac{d\tau}{dt}\int_{\Sigma_t}(u - \bar u).u'\,\mu_g.$$

Lemma If $u$ satisfies the wave equation the quantity
$$\int_{\Sigma_t}u'\mu_g$$
is conserved in time.

Proof. Integration on $(\Sigma_t, g)$ of the wave equation (multiplied by $N$) shows that on a compact manifold, where exact divergences integrate to zero, one has
$$\frac{d}{dt}\int_{\Sigma_t}u'\mu_g = \int_{\Sigma_t}(\partial_0u' - N\tau u')\mu_g = 0.$$
To simplify the proofs we will suppose in all that follows that
$$\int_{\Sigma_t}u'\mu_g = 0. \qquad (47)$$
Then $R_\alpha$ reduces to, since $\partial_t\bar u$ is constant on $\Sigma_t$,
$$R_\alpha = \alpha\tau\int_{\Sigma_t}\{|u'|^2 - |Du|_g^2\}N\mu_g + \alpha\frac{d\tau}{dt}\int_{\Sigma_t}(u - \bar u).u'\,\mu_g. \qquad (48)$$


Remark If we do not make the hypothesis (47) we can bound the term containing $\partial_t\bar u$ as follows. We have, using previous computations,
$$\partial_t\bar u = \frac{1}{V_g}\int_\Sigma(\partial_0u - N\tau u)\mu_g + \tau\bar N\bar u$$
which we write, since $u' \equiv N^{-1}\partial_0u$,
$$\partial_t\bar u = \frac{1}{V_g}\int_\Sigma\{2u' + (N - 2)u'\}\mu_g - \tau\left[\overline{(N - 2)u} - \overline{(N - 2)}\,\bar u\right];$$
we deduce from this expression an estimate of $\alpha\tau\int_{\Sigma_t}u'.\partial_t\bar u\,\mu_g$ (recall that $\alpha > 0$, $\tau < 0$) by non linear terms.

10.1.3 Decay of the corrected first energy

In the corrected energy inequality we have seen appear the quantity $d\tau/dt$. To obtain a differential inequality we have to make a choice of $\tau$ as a function of $t$. We wish to work in the expanding direction of our spacetime, where $\tau$, with our sign convention for the extrinsic curvature, starts from a negative value $\tau_0$ and increases, eventually up to the moment of maximum expansion where $\tau = 0$. We have made (section 5, notations) the choice
$$\tau = -t^{-1}, \quad t \in [t_0, \infty), \quad t_0 > 0, \quad \frac{d\tau}{dt} = \frac{1}{t^2} = \tau^2. \qquad (49)$$
We obtain, using the value of $dE/dt$ and $R_\alpha$, that
$$\frac{dE_\alpha}{dt} = \tau\int_{\Sigma_t}\left\{\left[\tfrac12|h|^2 + \left(\tfrac12 - \alpha\right)|u'|^2 + \alpha|Du|_g^2\right]N - \alpha\tau u'.(u - \bar u)\right\}\mu_g.$$
We look for a positive number $k$ such that the difference
$$\frac{dE_\alpha}{dt} - k\tau E_\alpha$$
can be estimated with higher order terms. We choose
$$\alpha = \tfrac14, \quad k = 1.$$
We have then
$$\frac{dE_{1/4}}{dt} - \tau E_{1/4} = \tau\int_{\Sigma_t}\left\{\tfrac12|h|_g^2(N - 1) + \left[\tfrac12N - 1\right](I_0 + I_1)\right\}\mu_g,$$
which we write
$$\frac{dE_{1/4}}{dt} - \tau E_{1/4} = \tau\int_{\Sigma_t}\left\{\tfrac12|h|_g^2(1 + N - 2) + \tfrac12(N - 2)(I_0 + I_1)\right\}\mu_g. \qquad (50)$$



The right hand side is the sum of a negative term and a term which can be considered as a non linear term in the energies because we have proved that (cf. section 9.2 on $N$ estimates):
$$0 \le 2 - N \le 2 - N_m \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1).$$
Therefore we obtain the following theorem (remember that $\tau < 0$):

Theorem 30 The corrected first energy with $\alpha = \tfrac14$ satisfies the differential equation
$$\frac{dE_{1/4}}{dt} = \tau E_{1/4} + |\tau|A \quad \text{with} \quad A \le CC_\sigma\varepsilon^2(\varepsilon^2 + \varepsilon\varepsilon_1). \qquad (51)$$

10.2 Corrected second energy

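10.2.1 Definition and lower bound

We define a corrected second energy $E_\alpha^{(1)}$ by the formula, with $\alpha$ some constant,
$$E_\alpha^{(1)}(t) = E^{(1)} + C_\alpha \quad \text{with} \quad C_\alpha = \alpha\tau\int_{\Sigma_t}\Delta_gu.u'\,\mu_g.$$
This corrected second energy will give bounds on the derivatives of $Du$ and $u'$ if there exists a number $K > 0$ such that:
$$E^{(1)} \le KE_\alpha^{(1)}. \qquad (52)$$
The hypothesis $\overline{u'} = 0$ is not necessary here because on a compact manifold $\int_{\Sigma_t}\Delta_gu.\overline{u'}\,\mu_g = 0$. We obtain the estimate, analogous to one obtained in the previous section,
$$\left|\int_{\Sigma_t}\Delta_gu.u'\,\mu_g\right| \le \|\Delta_gu\|_g\,\|u' - \overline{u'}\|_g \le \|\Delta_gu\|_g\,e^{\lambda_M}\Lambda_\sigma^{-1/2}\|Du'\|_g.$$
The same $K$ as in the previous section satisfies the required inequality when we choose $\alpha = \tfrac14$.

10.2.2 Time derivative of the corrected second energy

We have
$$\frac{dC_\alpha}{dt} = \alpha\tau\int_{\Sigma_t}[\partial_0\Delta_gu.u' + \Delta_gu.\partial_0u' - N\tau\Delta_gu.u']\mu_g + \alpha\frac{d\tau}{dt}\int_{\Sigma_t}\Delta_gu.u'\,\mu_g.$$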


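We recall that (indices are raised with $g$ in the next few lines)
$$\partial_0\Delta_gu = \Delta_g(Nu') + N\tau\Delta_gu + 2Nh_g^{ab}\nabla_a\partial_bu + \partial_cu[2\nabla_a(Nk^{ac}) - \tau\partial^cN].$$
Partial integration together with the splitting $k_{ab} = h_{ab} + \tfrac12g_{ab}\tau$, and the equation
$${}^{(3)}R_0{}^c \equiv -N\nabla_ak^{ac} = \partial_0u.\partial^cu$$
gives:
$$\int_{\Sigma_t}\partial_0\Delta_gu.u'\,\mu_g = \int_{\Sigma_t}\{-N|Du'|_g^2 - \partial^aN\partial_au'.u' + N\tau\Delta_gu.u' + 2Nh_g^{ab}\nabla_a\partial_bu.u' + 2u'.\partial_cu(\partial_aNh_g^{ac} - \partial^cu.u')\}\mu_g.$$
On the other hand, if $u$ satisfies the wave equation, we find
$$\int_{\Sigma_t}\Delta_gu.\partial_0u'\,\mu_g = \int_{\Sigma_t}\{N|\Delta_gu|^2 + \partial^aN\partial_au.\Delta_gu + N\tau u'.\Delta_gu\}\mu_g.$$
These equalities give, if we make the choice $\tau = -\frac1t$, hence $\frac{d\tau}{dt} = \tau^2$:
$$\frac{dC_\alpha}{dt} = \alpha\tau\int_{\Sigma_t}\{-N|Du'|^2 + N|\Delta_gu|^2 + \partial^aN(\partial_au.\Delta_gu + u'.\partial_au') + 2Nh_g^{ab}\nabla_a\partial_bu.u' + 2u'.\partial_cu(\partial_aNh_g^{ac} - u'.\partial^cu) + (N + 1)\tau\Delta_gu.u'\}\mu_g.$$
We have found an equality of the form
$$\frac{dE_\alpha^{(1)}}{dt} \equiv \frac{dE^{(1)}}{dt} + \frac{dC_\alpha}{dt} = \int_{\Sigma_t}\{\tau P_\alpha + \alpha\tau Q\}\mu_g + Z$$
with
$$P_\alpha = N[2(1 - \alpha)J_0 + (1 + 2\alpha)J_1] + (N + 1)\alpha\tau\Delta_gu.u'$$
and
$$Q \equiv \partial^aN(\partial_au.\Delta_gu + u'.\partial_au') + 2Nh_g^{ab}\nabla_a\partial_bu.u' + 2u'.\partial_cu(\partial_aNh_g^{ac} - u'.\partial^cu).$$
We see that $Q$ contains also terms only quadratic in the first and second derivatives of $u$, but its integral will be bounded by non linear terms in the energies through previous estimates on $DN$ and $h$. We choose $\alpha = \tfrac14$. We split the integral of $P_{1/4}$ into linear and non linear terms in the energies by writing
$$\int_{\Sigma_t}P_{1/4}\,\mu_g = 3\int_{\Sigma_t}\left(J_0 + J_1 + \tfrac14\tau\Delta_gu.u'\right)\mu_g + U \equiv 3E_{1/4}^{(1)} + U$$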



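with non linear terms $U$ given by
$$U = \int_{\Sigma_t}(N - 2)\left[\tfrac32(J_0 + J_1) + \tfrac14\tau\Delta_gu.u'\right]\mu_g.$$
We are ready to prove the following theorem.

Theorem 31 With the choice $\alpha = \tfrac14$ and $\tau = -\frac1t$, $t > 0$, the corrected second energy satisfies
$$\frac{dE_{1/4}^{(1)}}{dt} = 3\tau E_{1/4}^{(1)} + |\tau|^3B$$
where $B$ is bounded by a polynomial in $\varepsilon$ and $\varepsilon_1$ with all terms of order at least 3 and coefficients of the form $CC_\sigma$.

Proof. We have shown that
$$\frac{dE_{1/4}^{(1)}}{dt} = 3\tau E_{1/4}^{(1)} + Z + \tfrac14\tau\int_{\Sigma_t}Q\mu_g + \tau U.$$
We will estimate the various terms in the right hand side. We obtain, using the bound of $2 - N$ and the definition of $\varepsilon_1$,
$$|\tau U| \le CC_\sigma|\tau|^3(\varepsilon^2 + \varepsilon\varepsilon_1)(\varepsilon_1^2 + \varepsilon\varepsilon_1).$$
We now estimate $\tau\int Q\mu_g$, using its expression and the estimates (cf. section 9)
$$\|h\|_{L^\infty(g)} \le |\tau|\varepsilon_h, \quad \text{with} \quad \varepsilon_h = CC_\sigma\{\varepsilon + \varepsilon^{1/2}(\varepsilon + \varepsilon_1)^{3/2}\},$$
$$\|DN\|_{L^\infty(g)} \le |\tau|C_\sigma\varepsilon_{DN}, \quad \text{with} \quad \varepsilon_{DN} = CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1);$$
we have, with $C_0$ a fixed number,
$$\left|\tau\int_{\Sigma_t}\{\partial^aN(\partial_au.\Delta_gu + u'.\partial_au') + 2u'.\partial_cu\,\partial_aNp^{ac}\}\mu_g\right| \le C_0|\tau|^3\{(\varepsilon_{DN} + \varepsilon_h)\varepsilon\varepsilon_1 + \varepsilon_{DN}\varepsilon_h\varepsilon^2\}$$
while
$$\left|\tau\int_{\Sigma_t}2Np^{ab}\nabla_a\partial_bu.u'\,\mu_g\right| \le 4\tau^2\varepsilon_h\varepsilon\|\nabla^2u\|_g.$$
It holds on a 2 dimensional compact manifold
$$\|\nabla^2u\|_g^2 = \|\Delta_gu\|_g^2 - \tfrac12\int_{\Sigma_t}R(g)|Du|_g^2\,\mu_g.$$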


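Recall that
$$R(g) = -\tfrac12\tau^2 + |p|_g^2 + |u'|^2 + |Du|_g^2,$$
and
$$\int_{\Sigma_t}R(g)|Du|_g^2\,\mu_g = \int_{\Sigma_t}R(g)|Du|^2\,\mu_\sigma \quad \text{with} \quad |Du|_g^2 \le \frac{\tau^2}{2}|Du|^2,$$
therefore
$$\|\nabla^2u\|_g^2 \le C_0\tau^2[\varepsilon_1^2 + (1 + \varepsilon_h)\varepsilon^2] + \int_{\Sigma_t}\left[|u'|^2 + \tau^2|Du|^2\right]|Du|^2\,\mu_\sigma.$$
The bounds on $L^4$ norms of $u'$ and $Du$ give
$$\|\nabla^2u\|_g \le |\tau|\varepsilon_{\nabla^2u}, \qquad \varepsilon_{\nabla^2u} = C_0\{\varepsilon_1^2 + (1 + \varepsilon_h)\varepsilon^2 + CC_\sigma\varepsilon^2[\varepsilon_1 + \varepsilon]^2\}^{1/2}.$$
Finally
$$\left|\tau\int_{\Sigma_t}\{2(u'.\partial_cu)(u'.\partial^cu)\}\mu_g\right| \le 2|\tau|\,\|u'\|_{L^4(g)}^2\|Du\|_{L^4(g)}^2.$$
We have, using previous estimates,
$$\|u'\|_{L^4(g)}^2 \le e^{\lambda_M}\|u'\|_4^2 \le CC_\sigma e^{\lambda_M}\tau^2(\varepsilon^2 + \varepsilon\varepsilon_1)$$
hence
$$\|u'\|_{L^4(g)}^2 \le CC_\sigma|\tau|(\varepsilon^2 + \varepsilon\varepsilon_1).$$
An inequality of the same type holds for $\|Du\|_{L^4(g)}$. The estimate of $|\tau\int Q\mu_g|$ by the product of $|\tau|^3$ with higher than 2 powers of the $\varepsilon$'s follows.

We now estimate $Z$. We recall that
$$Z \equiv \int_{\Sigma_t}\{Np^{ab}\partial_au'.\partial_bu' + 2Np^{ab}\nabla_a\partial_bu.\Delta_gu + (\nabla_b(\partial^aN\partial_au) + \tau\partial_bNu').(\partial^bu')\}\mu_g + Y_1. \qquad (53)$$
Previous estimates give
$$|Z| \le |\tau|^3\{C_0\varepsilon_h\varepsilon_1^2 + \varepsilon_{DN}(\varepsilon_1^2 + \varepsilon\varepsilon_1) + 4\varepsilon_1\varepsilon_{\nabla^2u}\} + Y_2 + |Y_1| \qquad (54)$$
with
$$Y_2 \equiv \left|\int_{\Sigma_t}\{(\nabla_b\partial^aN)\partial_au.(\partial^bu')\}\mu_g\right|.$$
To bound $Y_2$ we use the $L^4$ norm of $\nabla^2N$ estimated in terms of its $W_3^p$ norm in the section on lapse estimates. Indeed
$$Y_2 \le |\tau|\varepsilon_1\|\nabla^2N\|_{L^4(g)}\|Du\|_{L^4(g)}.$$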


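We have
$$|\nabla^2N|_g = e^{-2\lambda}|\nabla^2N|$$
hence
$$\|\nabla^2N\|_{L^4(g)} \le e^{-\frac32\lambda_m}\|\nabla^2N\|_4 \le C_0|\tau|^{3/2}\|\nabla^2N\|_4.$$
On the other hand we recall the identity
$$\nabla_a\partial_bN \equiv D_a\partial_bN + \sigma_{ab}\sigma^{cd}\partial_cN\partial_d\lambda - \delta_a^c\partial_b\lambda\partial_cN - \delta_b^c\partial_a\lambda\partial_cN.$$
By the Sobolev embedding theorem, with $p = \tfrac43$,
$$\|D^2N\|_4 \le C_\sigma\|D^2N\|_{W_1^p} \le C_\sigma\varepsilon_{DN}.$$
We also bound
$$\|D\lambda\|_4 \le C_\sigma\|D\lambda\|_{H_1} \quad \text{with} \quad \|D\lambda\|_{H_1} \le CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1)$$
and we obtain
$$\|\nabla^2N\|_4 \le C_\sigma\varepsilon_{DN}(1 + C(\varepsilon^2 + \varepsilon\varepsilon_1)).$$
Recall that
$$\|Du\|_{L^4(g)} \le C_0|\tau|^{1/2}\|Du\|_4 \le CC_\sigma|\tau|^{1/2}(\varepsilon + \varepsilon^{1/2}\varepsilon_1^{1/2}).$$
Finally
$$Y_2 \le CC_\sigma|\tau|^3\varepsilon_{DN}[1 + C(\varepsilon^2 + \varepsilon\varepsilon_1)]\varepsilon_1(\varepsilon + \varepsilon^{1/2}\varepsilon_1^{1/2}).$$
Recall that
$$Y_1 = \int_{\Sigma_t}\{(2\partial_aNp^{ac} - 2N\partial^cu.u')\partial_cu + 2\partial^aN\partial_au' + u'\Delta_gN\}.\Delta_gu\,\mu_g \qquad (55)$$
hence
$$|Y_1| \le |\tau|^3CC_\sigma\{\varepsilon_{DN}\varepsilon_h\varepsilon\varepsilon_1 + \varepsilon_{DN}\varepsilon_1^2\} + Y_3 + Y_4$$
with
$$Y_3 = \left|\int_{\Sigma_t}\{(-2N\partial^cu.u')\partial_cu\}.\Delta_gu\,\mu_g\right|.$$
The term $Y_3$ can be estimated using the Hölder inequality,
$$Y_3 \le 4|\tau|\varepsilon_1\|Du\|_{L^6(g)}^2\|u'\|_{L^6(g)}.$$
Elementary calculus gives
$$\|Du\|_{L^6(g)} \le e^{-\frac23\lambda_m}\|Du\|_6 \le C_0|\tau|^{2/3}\|Du\|_6$$
and
$$\|u'\|_{L^6(g)} \le e^{\frac13\lambda_M}\|u'\|_6. \qquad (56)$$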


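The $L^6$ norms can be estimated with $H_1$ norms using the Sobolev inequality
$$\|f\|_6 \le C_\sigma(\|f\| + \|Df\|)$$
applied to $f = u'$ and $f = |Du|$ together with the inequality $|D|f|| \le |Df|$. We obtain
$$Y_3 \le 4|\tau|\varepsilon_1C_\sigma e^{\frac13(\lambda_M - \lambda_m) - \lambda_m}[\|u'\| + \|Du'\|][\|Du\|^2 + \|D^2u\|^2]$$
hence, going back to the energies,
$$Y_3 \le CC_\sigma|\tau|^3\varepsilon_1[\varepsilon + \varepsilon_1][\varepsilon^2 + \varepsilon_1^2].$$
Finally
$$Y_4 \equiv \left|\int_{\Sigma_t}u'\Delta_gN.\Delta_gu\,\mu_g\right| \le |\tau|\varepsilon_1\|u'\|_{L^4(g)}\|\Delta_gN\|_{L^4(g)}$$
therefore, using Laplacian and norms in conformal metrics and the previous estimate of $\|u'\|_{L^4(g)}$,
$$Y_4 \le C_\sigma|\tau|^{3/2}e^{-2\lambda_m}e^{\frac12\lambda_M}\varepsilon_1(\varepsilon^2 + \varepsilon\varepsilon_1)\|\Delta N\|_4.$$
The bound we have just computed of $\|D^2N\|_4$ gives also a bound of $\|\Delta N\|_4$, hence
$$Y_4 \le |\tau|^3CC_\sigma\varepsilon_1(\varepsilon^2 + \varepsilon\varepsilon_1)\varepsilon_{DN}(1 + C(\varepsilon^2 + \varepsilon\varepsilon_1)).$$
Gathering the results gives the theorem.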

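11 Decay of the total energy

We call total energy the quantity
$$E_{tot}(t) \equiv E(t) + \tau^{-2}E^{(1)}(t) \equiv \varepsilon^2 + \varepsilon_1^2.$$
We define $y(t)$ to be the total corrected energy, namely:
$$y(t) \equiv E_{1/4}(t) + \tau^{-2}E_{1/4}^{(1)}.$$
We have
$$E_{tot}(t) \le \frac{1}{1 - a_t}\,y(t)$$
with on each $\Sigma_t$
$$a_t \equiv \frac{|\tau|e^{\lambda_M}}{4\Lambda_t^{1/2}}, \qquad \Lambda_t \equiv \Lambda_{\sigma_t}. \qquad (57)$$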



The inequalities obtained for the corrected energies imply, with $\tau = -t^{-1}$,
$$\frac{dy}{dt} = \frac1t[-y + A + B] \qquad (58)$$
where $A$ and $B$ are bounded by polynomials in $\varepsilon$ and $\varepsilon_1$ with terms of degree at least 3.

Lemma 32 Suppose that on $(\Sigma, \sigma)$ there is $\delta_\sigma > 0$ such that the first positive eigenvalue $\Lambda_\sigma$ satisfies
$$\tfrac14\Lambda_\sigma^{-1/2} = \frac{1 - \delta_\sigma}{\sqrt2}.$$
Then if the energies are such that
$$CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1) \le \frac{\delta_\sigma}{2}$$
then
$$1 - a_\sigma \ge \frac{\delta_\sigma}{2}.$$
The numbers $C$ and $C_\sigma$ are known numbers depending respectively on the number $c$ of the hypothesis $H_c$ and on the metric $\sigma$.

Proof. By the definition of $a \equiv a_\sigma$ it holds that
$$1 - a_\sigma \ge 1 - \frac{|\tau|e^{\lambda_M}}{\sqrt2} + \frac{\delta_\sigma|\tau|e^{\lambda_m}}{\sqrt2}$$
which gives, using the lower bound of $\lambda$ and the lemma 3 of the section 8 "conformal factor estimates",
$$1 - a_\sigma \ge \delta_\sigma - CC_\sigma(\varepsilon^2 + \varepsilon\varepsilon_1)$$
from which the result follows.

Hypothesis $H_\sigma$:
1. The numbers $C_\sigma$ are uniformly bounded by a constant $M$ for all $t \ge t_0$ for which they exist.
2. There exists a constant $\delta > 0$ such that the numbers $\Lambda_\sigma$, the first positive eigenvalues of $-\Delta_{\sigma_t}$ for functions with mean value zero, are such that
$$\tfrac14\Lambda_\sigma^{-1/2} = \frac{1 - \delta_\sigma}{\sqrt2}, \quad \text{with} \quad \delta_\sigma \ge \delta.$$

Hypothesis $H_E$. The energies $\varepsilon_t^2$ and $\varepsilon_{1,t}^2$ satisfy as long as they exist an inequality of the form
$$C(c, M)(\varepsilon^2 + \varepsilon\varepsilon_1) \le \frac{\delta}{2}$$
where $C$ is a number depending only on the numbers $c$ and $M$. We will prove the following theorem.


Theorem 33 Under the hypotheses $H_c$, $H_E$ and $H_\sigma$ there exists a number $\eta$ such that if the total energy is bounded at time $t_0$ by $\eta$ then it satisfies at time $t = -\tau^{-1} \ge t_0 > 0$ an inequality of the form
$$tE_{tot}(t) \equiv t(\varepsilon^2 + \varepsilon_1^2) \le M_{tot}E_{tot}(t_0)$$
where $M_{tot}$ depends only on $\delta$.

Proof. Under the hypotheses we have made, the polynomials $A$ and $B$ are bounded by polynomials in $y^{1/2}$ with terms of degree at least 3 and bounded coefficients depending only on $c$, $M$, $\delta$. Take $\eta$ such that $y_0 \equiv y(t_0) < 1$. Then all powers of $y_0$ greater than $3/2$ are less than $y_0^{3/2}$ and there exists a constant $M_1$, depending only on $c$, $\delta$ and $M$, such that
$$(A + B)_{t = t_0} \le M_1y_0^{3/2}.$$
Take $\eta$ such that moreover $y_0^{1/2} <$ [...] $> 0$ and, consequently, the differential inequality
$$\frac{dy}{y(1 - M_1y^{1/2})} + \frac{dt}{t} \le 0,$$
equivalently
$$\frac{dz}{z(1 - M_1z)} + \frac{dt}{2t} \le 0, \quad \text{with} \quad y = z^2,$$
which gives by integration
$$\log\left\{\frac{z(1 - M_1z_0)}{(1 - M_1z)z_0}\right\} + \log\left(\frac{t}{t_0}\right)^{1/2} \le 0,$$
that is
$$\frac{t^{1/2}z(1 - M_1z_0)}{(1 - M_1z)\,t_0^{1/2}z_0} \le 1,$$


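in other words
$$t^{1/2}z + M_1z\,\frac{t_0^{1/2}z_0}{1 - M_1z_0} \le \frac{t_0^{1/2}z_0}{1 - M_1z_0},$$
a fortiori
$$ty \le \frac{t_0y_0}{(1 - M_1z_0)^2}.$$
We suppose for instance
$$z_0 \le \frac{1}{2M_1},$$
then
$$ty \le 4t_0y_0.$$
Recall that under the $H_c$, $H_E$ and $H_\sigma$ hypotheses
$$E_{tot}(t) \le \frac{1}{1 - a_t}\,y(t) \le \frac{2}{\delta_t}\,y(t),$$
also
$$y_0 \le \frac{1}{1 - a_0}\,E_{tot}(t_0) \le \frac{2}{\delta_0}\,E_{tot}(t_0).$$
The inequality for $y$ implies therefore
$$tE_{tot}(t) \le M_tE_{tot}(t_0)$$
with, as announced, $M_t$ uniformly bounded:
$$M_t = \frac{4t_0}{(1 - a_t)(1 - a_0)} \le \frac{16t_0}{\delta_t\delta_{t_0}} \le \frac{16t_0}{\delta^2}.$$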
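The integration step in the proof of Theorem 33 can be illustrated on the equality case of the differential inequality. The sketch below is an illustration only: it uses the closed-form solution of $dz/(z(1 - M_1z)) + dt/(2t) = 0$, obtained from the conserved quantity $t^{1/2}z/(1 - M_1z)$, and the numerical values of $t_0$, $z_0$, $M_1$ are hypothetical.

```python
import math

def z_equality_case(t, t0, z0, M1):
    """Solution of dz/(z(1 - M1 z)) + dt/(2t) = 0 with z(t0) = z0 (requires M1*z0 < 1)."""
    c = math.sqrt(t0) * z0 / (1.0 - M1 * z0)   # conserved value of sqrt(t) z / (1 - M1 z)
    return c / (math.sqrt(t) + M1 * c)

def ode_residual(t, t0, z0, M1, h=1e-5):
    """Central-difference check of dz/dt + z(1 - M1 z)/(2t) = 0 at time t."""
    dz = (z_equality_case(t + h, t0, z0, M1) - z_equality_case(t - h, t0, z0, M1)) / (2.0 * h)
    z = z_equality_case(t, t0, z0, M1)
    return dz + z * (1.0 - M1 * z) / (2.0 * t)

# Hypothetical data satisfying z0 <= 1/(2*M1), as assumed in the text.
t0, z0, M1 = 1.0, 0.1, 2.0
for t in (1.0, 10.0, 100.0, 1.0e4):
    z = z_equality_case(t, t0, z0, M1)
    y = z * z                                   # y = z^2
    print(t, t * y, 4.0 * t0 * z0 * z0)         # t*y stays below 4*t0*y0
```

Solutions of the inequality itself stay below this equality case, so the printed comparison is the extremal situation for the bound $ty \le 4t_0y_0$.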

12 Teichmüller parameters

12.1 Dirichlet energy

Let $s$ and $\sigma$ be two given metrics on $\Sigma$ and $\Phi$ be a mapping from $\Sigma$ into $\Sigma$. The energy of the mapping $\Phi : (\Sigma, \sigma) \to (\Sigma, s)$ is by definition the positive quantity:
$$E(\sigma, \Phi) \equiv \int_\Sigma\sigma^{ab}\frac{\partial\Phi^A}{\partial x^a}\frac{\partial\Phi^B}{\partial x^b}s_{AB}(\Phi)\mu_\sigma.$$
Consider the metric $s$ as fixed. Elementary calculus shows that the energy $E(\sigma, \Phi)$ is invariant under a diffeomorphism $f$ of $\Sigma$ in the following sense
$$E(\sigma, \Phi) = E(f_*\sigma, \Phi \circ f).$$
In the case where $s$ and $\sigma$ both have negative curvature it has been proved by Eells and Sampson that there exists one and only one harmonic map $\Phi_\sigma : (\Sigma, \sigma) \to (\Sigma, s)$ which is a diffeomorphism homotopic to the identity, i.e. $\Phi_\sigma \in D_0$. Such a harmonic map is equivariant under diffeomorphisms homotopic to the identity, i.e. $\Phi_{f_*\sigma} = \Phi_\sigma \circ f$, with $f \in D_0$. One is then led to the definition:

Definition 34 Given a metric $s \in M_{-1}$ the Dirichlet energy $D(\sigma)$ of the metric $\sigma \in M_{-1}$ is the energy of the harmonic map $\Phi_\sigma \in D_0$:
$$D(\sigma) \equiv E(\sigma, \Phi_\sigma).$$
It depends on the choice of the fixed metric $s$, but is invariant under the action of diffeomorphisms included in $D_0$, hence defines a positive functional on the Teichmüller space $\mathrm{Teich} \equiv M_{-1}/D_0$.

Remark 35 The energy of the mapping $\Phi : (\Sigma, \sigma) \to (\Sigma, s)$ as well as the harmonic map $\Phi_\sigma$ are also invariant under conformal rescalings of $\sigma$. They can be used on the space of Riemannian metrics of negative curvature before the rescaling which restricts them to metrics of curvature $-1$.

The importance of the Dirichlet energy rests on the following theorem, which says that if $D(\sigma)$ remains in a bounded set of $\mathbb{R}$ then the equivalence class of $\sigma$ remains in a bounded set of Teich.

Theorem 36 (Eells and Sampson) The Dirichlet energy is a proper function on Teichmüller space.

12.2 Estimate of the Dirichlet energy

We will require of the metric $\sigma_t$ that it remains, when $t$ varies, in some cross section of $M_{-1}$ (space of $C^\infty$ metrics with scalar curvature $-1$) over the Teichmüller space, diffeomorphic to $\mathbb{R}^{6G-6}$, $G$ the genus of $\Sigma$.

Remark Following Andersson-Moncrief one can choose the cross section as follows, having given some metric $s \in M_{-1}$. To an arbitrary metric $\zeta \in M_{-1}$ we associate another such metric by its pull back through $\Phi_\zeta^{-1}$:
$$\psi(\zeta) = (\Phi_\zeta^{-1})_*\zeta.$$
For any $f \in D_0$ we have
$$\psi(f_*\zeta) = (\Phi_{f_*\zeta}^{-1})_*f_*\zeta = \psi(\zeta),$$
hence the metric $\psi$ depends only on the equivalence class $Q$ of $\zeta$ through $D_0$. Thus one gets a cross section of $M_{-1}$ over Teichmüller space, $Q \in \mathrm{Teich} \to \psi(Q) \in M_{-1}$. If $Q$ remains in a bounded set of Teich then $\psi(Q)$ remains in a bounded set of $M_{-1}$, i.e. all these metrics are uniformly equivalent.



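We will estimate the Dirichlet energy $D(\sigma) \equiv E(\sigma, \Phi_\sigma)$. We have, with $g_{ab} = e^{2\lambda}\sigma_{ab}$,
$$E(\sigma, \Phi_\sigma) \equiv \int_\Sigma\sigma^{ab}\partial_a\Phi_\sigma^A\partial_b\Phi_\sigma^Bs_{AB}(\Phi_\sigma)\mu_\sigma = \int_\Sigma g^{ab}\partial_a\Phi_\sigma^A\partial_b\Phi_\sigma^Bs_{AB}(\Phi_\sigma)\mu_g \equiv E(g, \Phi_\sigma).$$
If $\Phi_\sigma$ is a harmonic map from $(\Sigma, \sigma)$ into $(\Sigma, s)$ it is an extremal of the mapping $\Phi \to E(\sigma, \Phi)$ and also an extremal of the mapping $\Phi \to E(g, \Phi)$. We have, with $\frac{\partial E}{\partial g}$ and $\frac{\partial E}{\partial\Phi}$ denoting functional derivatives (linear maps acting respectively on $\frac{dg}{dt}$ and $\frac{d\Phi}{dt}$),
$$\frac{d}{dt}E(g, \Phi) = \frac{\partial E}{\partial g}.\frac{dg}{dt} + \frac{\partial E}{\partial\Phi}.\frac{d\Phi}{dt}.$$
We compute this derivative at a point $(\sigma, \Phi_\sigma)$; we have, by the extremality of $\Phi_\sigma$, $\left(\frac{\partial E}{\partial\Phi}\right)(\sigma, \Phi_\sigma) = 0$. Therefore
$$\frac{d}{dt}D(\sigma) \equiv \left\{\frac{d}{dt}E(g, \Phi)\right\}_{(g, \Phi_\sigma)} = \left\{\frac{\partial E}{\partial g}.\frac{dg}{dt}\right\}_{(g, \Phi_\sigma)}$$
which gives, using previous notations and the vanishing of the integral of a divergence on a compact manifold,
$$\frac{d}{dt}D(\sigma) = \int_{\Sigma_t}\{\partial_0g^{ab}\partial_a\Phi_\sigma^A\partial_b\Phi_\sigma^B - N\tau g^{ab}\partial_a\Phi_\sigma^A\partial_b\Phi_\sigma^B\}s_{AB}(\Phi_\sigma)\mu_g.$$
Recall that
$$\partial_0g^{ab} = 2Ng^{ac}g^{bd}k_{cd} = 2Ne^{-4\lambda}h^{ab} + Ne^{-2\lambda}\sigma^{ab}\tau$$
hence
$$\frac{d}{dt}D(\sigma) = \int_{\Sigma_t}2Ne^{-2\lambda}h^{ab}\partial_a\Phi_\sigma^A\partial_b\Phi_\sigma^Bs_{AB}(\Phi_\sigma)\mu_\sigma.$$
Using $0 < N \le 2$ and $e^{-2\lambda} \le \frac{\tau^2}{2}$ we find
$$\left|\frac{d}{dt}D(\sigma)\right| \le 2\tau^2\|h\|_\infty D(\sigma).$$
The bound of $\|h\|_\infty$ found in the section on $h$ estimates gives:
$$\left|\frac{d}{dt}D(\sigma)\right| \le |\tau|CC_\sigma[\varepsilon + (\varepsilon + \varepsilon_1)^2]D(\sigma).$$
We recall the following lemmas.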


Lemma 37 There exists an open subset $\Omega$ of Teich such that if the equivalence class of $\sigma$ is in $\Omega$ and $\sigma$ is in a smooth cross section of Teich, then there exists a number $\delta > 0$ such that $\Lambda(\sigma) \ge \tfrac18 + \delta$ and all constants $C_\sigma$ are bounded by a fixed number $M$.

Lemma 38 There exists an interval $I \equiv (a, b)$ of $\mathbb{R}$ such that if the Dirichlet energy (taken with some metric $s$) $D(\sigma) \in I$ then $\sigma$ projects into $\Omega$. More precisely, there exists $\sigma_0$ projecting in $\Omega$ and given $\sigma_0$ there exists a number $D$ such that if $|D(\sigma) - D(\sigma_0)| \le D$ then the hypothesis $H_\sigma$ is satisfied.

We will prove the following theorem.

Theorem 39 Under the hypotheses $H_c$, $H_E$ and $H_\sigma$ there exists a number $M_D$ depending only on the bounds in these hypotheses such that the Dirichlet energy satisfies the inequality
$$|D(\sigma_t) - D(\sigma_0)| \le M_Dx_0^{1/2} \quad \text{with} \quad x_0 \equiv E_{tot}(t_0).$$

Proof. Under the hypotheses that we have made the Dirichlet energy satisfies the differential inequality (we have set $\tau = -t^{-1}$)
$$\left|\frac{d}{dt}D(\sigma)\right| \le D(\sigma)\,CM\,\frac{t^{1/2}[\varepsilon + (\varepsilon + \varepsilon_1)^2]}{t^{3/2}}.$$
We recall the decay found for the total energy
$$tE_{tot}(t) \equiv t(\varepsilon^2 + \varepsilon_1^2) \le M_{tot}E_{tot}(t_0) \quad \text{with} \quad M_{tot} \le \frac{16t_0}{\delta^2}.$$
We have, using $t \ge t_0$ and $(\varepsilon + \varepsilon_1)^2 \le 2E_{tot}$,
$$t^{1/2}(\varepsilon + (\varepsilon + \varepsilon_1)^2) \le t^{1/2}E_{tot}^{1/2}(t) + 2t_0^{-1/2}tE_{tot}(t).$$
Using the decay of the total energy (section 11) and the assumption $x_0 \equiv E_{tot}(t_0) < 1$ we find that there exists a number $M_2$ depending only on $c$, $M$ and $\delta$ such that
$$\left|\frac{d}{dt}D(\sigma)\right| \le D(\sigma)\frac{M_2x_0^{1/2}}{t^{3/2}}.$$
We deduce from this inequality, by elementary calculus, abbreviating $D(\sigma)$ to $D$ and $D(\sigma_0)$ to $D_0$,
$$\frac{d}{dt}|D - D_0| \le \left|\frac{d}{dt}(D - D_0)\right| \le [|D - D_0| + D_0]\frac{M_2x_0^{1/2}}{t^{3/2}}.$$



By the Gronwall lemma $|D - D_0|$ is for $t \ge t_0$ bounded by the solution of the associated differential equality with initial value zero, which gives
$$|D - D_0| \le \left[D_0M_2x_0^{1/2}\int_{t_0}^tt^{-3/2}dt\right]\exp\left(M_2x_0^{1/2}\int_{t_0}^tt^{-3/2}dt\right)$$
hence, as announced,
$$|D_{\sigma_t} - D_{\sigma_0}| \le M_Dx_0^{1/2}$$
with (recall that $x_0 \le 1$)
$$M_D = D_{\sigma_0}\,2M_2t_0^{-1/2}\exp(2M_2t_0^{-1/2}).$$
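The Gronwall step above can be checked on the equality case $w' = (w + D_0)\,k\,t^{-3/2}$, $w(t_0) = 0$, where $k$ stands for $M_2x_0^{1/2}$. The following sketch is an illustration under assumptions: the function names and the numerical values of $t_0$, $D_0$ and $k$ are hypothetical, and the closed form $w = D_0(e^{kI(t)} - 1)$ with $I(t) = \int_{t_0}^ts^{-3/2}ds$ is used in place of a numerical ODE solver.

```python
import math

def tail_integral(t0, t):
    """I(t) = integral of s^(-3/2) from t0 to t, equal to 2*(t0^(-1/2) - t^(-1/2))."""
    return 2.0 * (t0 ** -0.5 - t ** -0.5)

def gronwall_equality_case(t, t0, D0, k):
    """Solution of w' = (w + D0) * k * t^(-3/2) with w(t0) = 0."""
    return D0 * (math.exp(k * tail_integral(t0, t)) - 1.0)

def uniform_bound(t0, D0, k):
    """Time-independent bound D0 * 2k t0^(-1/2) * exp(2k t0^(-1/2)) stated in the text."""
    limit = 2.0 * t0 ** -0.5            # supremum of I(t) as t -> infinity
    return D0 * k * limit * math.exp(k * limit)

# Hypothetical values; k plays the role of M2 * x0^(1/2).
t0, D0, k = 1.0, 2.0, 0.3
for t in (1.0, 2.0, 10.0, 1.0e6):
    print(t, gronwall_equality_case(t, t0, D0, k), uniform_bound(t0, D0, k))
```

The bound is not sharp: it replaces $e^x - 1$ by the larger quantity $xe^x$ and $I(t)$ by its limit $2t_0^{-1/2}$, which is all the proof requires.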

13 Global existence

Theorem 40 Let $(\sigma_0, q_0) \in C^\infty(\Sigma_0)$ and $(u_0, \dot u_0) \in H_2(\Sigma_0, \sigma_0) \times H_1(\Sigma_0, \sigma_0)$ be initial data for the polarized Einstein equations with U(1) isometry group on the initial manifold $M_0 \equiv \Sigma_0 \times U(1)$; suppose that $\sigma_0$ is such that $R(\sigma_0) = -1$ and the first positive eigenvalue $\Lambda_0$ of $-\Delta_{\sigma_0}$ (for functions with mean value zero) is such that
$$\Lambda_0 > \tfrac18.$$
Then there exists a number $\eta > 0$ such that if
$$E_{tot}(t_0) < \eta$$
these Einstein equations have a solution on $M \times [t_0, \infty)$, with initial values determined by $\sigma_0$, $q_0$, $u_0$, $\dot u_0$. The orthogonal trajectories to the space sections $M \times \{t\}$ have an infinite proper length.

Proof. It results from the local existence theorem that we only have to prove that $E_{tot}(t)$ does not blow up. We have in the previous sections made the following hypotheses, to hold for all $t \ge t_0$ for which the involved quantities exist.

Hypothesis $H_c$. There exists a number $c > c_0 = \varepsilon_{v_0} > 0$ such that
1. $\varepsilon_v \le c$,
2. $\varepsilon_0 \le \dfrac{1}{2(1 + 2ce^{2c})}$.

Hypothesis $H_D$. The Dirichlet energy is such that
$$|D(\sigma) - D(\sigma_0)| \le d$$
where $d > 0$ is a given number such that the above inequality implies the hypothesis $H_\sigma$.


Hypothesis $H_E$. The total energy is such that
$$E_{tot}(t) \le c_E$$
where $c_E$ is a number depending only on $c$ and $d$.

Under these hypotheses we have obtained the following result: there are numbers $A_i$ depending only on $c$ and $d$ such that
$$\varepsilon_v \le A_1E_{tot}(t_0), \quad tE_{tot}(t) \le A_2E_{tot}(t_0) \quad \text{and} \quad |D(\sigma) - D(\sigma_0)| \le A_3E_{tot}^{1/2}(t_0).$$
Now consider the triple of numbers
$$\{X_t \equiv \varepsilon_{v_t},\ x_t \equiv E_{tot}(t),\ Z_t \equiv |D(\sigma_t) - D(\sigma_0)|\}.$$
We have shown that the hypotheses $X_t \le c$, $x_t \le c_E$, $Z_t \le d$ and smallness conditions on $x_0$ imply the existence of numbers $A_i$ depending only on $c$, $c_E$ and $d$ such that
$$X_t \le A_1x_0, \quad tx_t \le A_2x_0, \quad Z_t \le A_3x_0^{1/2}.$$
Therefore there exists $\eta > 0$ such that $x_0 \le \eta$ implies that the triple belongs to the subset $U_1 \subset \mathbb{R}^3$ defined by the inequalities:
$$U_1 \equiv \{X_t < c,\ x_t < c_E,\ Z_t < d\}.$$
For such an $\eta$ the triple either belongs to $U_1$ or to the subset $U_2$ defined by
$$U_2 \equiv \{X_t > c \ \text{or}\ x_t > c_E \ \text{or}\ Z_t > d\}.$$
These subsets are disjoint. We have supposed that for $t = t_0$ it holds that $(X_0, x_0, Z_0) \in U_1$ hence, by continuity in $t$, $(X_t, x_t, Z_t) \in U_1$ for all $t$. We have proved the required a priori bounds. The orthogonal trajectories to the space sections $M \times \{t\}$ have an infinite proper length since the lapse $N$ is bounded below by a strictly positive number.

Acknowledgments

We thank L. Andersson for suggesting the use of corrected energies. We thank the University Paris VI, the ITP in Santa Barbara, the University of the Aegean in Samos and the IHES in Bures for their hospitality during our collaboration.

References

V. Moncrief, Reduction of Einstein equations for vacuum spacetimes with U(1) spacelike isometry group, Annals of Physics 167, 118–142 (1986).

A. Fisher and A. Tromba, Teichmüller spaces, Math. Ann. 267, 311–345 (1984). Cf. also Y. Choquet-Bruhat and C. DeWitt-Morette, Analysis, Manifolds and Physics, Part II, supplemented edition, North Holland, 2000.

J. Cameron and V. Moncrief, The reduction of Einstein's vacuum equations on space times with U(1) isometry group, Contemporary Mathematics 132, 143–169 (1992).

D. Christodoulou and S. Klainerman, The global nonlinear stability of the Minkowski space, Princeton University Press, 1992.

Y. Choquet-Bruhat and V. Moncrief, Existence theorem for solutions of Einstein equations with 1 parameter spacelike isometry group, Proc. Symposia in Pure Math. 59, H. Brezis and I.E. Segal ed., 67–80 (1994).

Y. Choquet-Bruhat and J. W. York, Well posed system for the Einstein equations, C. R. Acad. Sci. Paris 321, 1089–1095 (1995).

L. Andersson, V. Moncrief and A. Tromba, On the global evolution problem in 2+1 gravity, J. Geom. Phys. 23, n° 3–4, 191–205 (1997).

Yvonne Choquet-Bruhat
Tour 22–12, 4ème étage
Place Jussieu
F-75252 Paris Cedex 05
France
email: [email protected]

Vincent Moncrief
Department of Physics
Yale University
PO Box 08120
New Haven 06520
USA
email: [email protected]

Communicated by Sergiu Klainerman
submitted 9/02/01, accepted 9/07/01




Scattering and Bound States in Euclidean Lattice Quantum Field Theories F. Auil∗ and J. C. A. Barata†

Abstract. In this paper we study the property of asymptotic completeness in (massive) Euclidean lattice quantum field theories. We use the methods of Spencer and Zirilli [2] to prove, under suitable hypothesis, two-body asymptotic completeness, i.e., for the energy range just above the two-particle threshold.

1 Introduction The analysis of the particle content of relativistic quantum field theories remains one of the most elusive problems of theoretical physics. Crucial to the particle interpretation of relativistic quantum field theoretical models is the problem of asymptotic completeness, i.e., the question whether all (pure) states can be interpreted in terms of scattering states of particles. In the framework of constructive relativistic quantum field theory asymptotic completeness has been analyzed in some models (see [1, 2, 12, 13, 21]). Although those works represent true technical masterpieces, their results are relatively modest, being essentially restricted to finite energy ranges. Effective quantum field theories are frequently considered in the literature, partially due to the belief, shared by some, that the quantum field theoretical description of the physics of the elementary particles is limited to a range of low energies. According to this circle of ideas, quantum field theoretical models for the elementary particles are just low energy limits of more general theories (whatever this means), that should effectively hold at very high energy scales (f.i., up to or beyond the Planck energy). It remains unclear, however, what kind of physics effective quantum field theories describe. One is, in particular, interested to know something about the particle interpretation of those theories. For well-known reasons Euclidean lattice quantum field theories play a special role among effective quantum field theories and there have been many studies concerning the existence of particles in such models. The existence of one-particle states, for instance, was established in works like [24]–[47]. In [5] (see also [6, 7]) a full Haag-Ruelle scattering theory was developed for lattice models exhibiting massive one-particle states, thus proving the existence of multi-particle states for those systems. ∗ Work

supported by CNPq.

† Partially supported by CNPq.


This paper is dedicated to the problem of asymptotic completeness (AC) in Euclidean lattice quantum field theories. In our analysis we followed closely the methods of Spencer and Zirilli [2], who proved, probably for the first time, asymptotic completeness for a true relativistic quantum field theoretical model for a limited range of energies. In spite of the higher technical difficulties posed by the lack of Lorentz invariance and other problems we are able to reproduce the same results of [2] in our lattice context. We have namely proven two-body asymptotic completeness, i.e., asymptotic completeness up to energies just above the two-particle threshold. These methods rely basically on the exponential decay of the Bethe-Salpeter kernel in space variables, and can be applied to the analysis of bound states and resonances for weakly coupled models, as in [15, 16] in the continuum, or [44, 47, 45, 46] in lattice models. Also, and with more work, the methods can be extended to the analysis of three-particle bound states and AC following, for instance, [17] and [13], respectively. Limitations on the energy range are found, unfortunately, in all the proofs of asymptotic completeness performed along these lines on the continuum. This deplorable state of affairs urges the QFT community to develop new ideas and techniques to deal with the spectral problems of QFT, but this is not our subject here. See, however, [8] (and also [9, 10]) for a proposal involving the nuclearity criterion. For a discussion of the problem of asymptotic completeness in QED in the context of perturbation theory, see [11]. We believe our results are interesting not only due to their connection to QFT problems, as mentioned above. 
Some of the models we consider are, in fact, models of classical statistical mechanical spin systems (for instance, the Ising model) and the spectral properties of the transfer matrix reflect on corrections to the exponential decay of correlations as, for instance, the Ornstein-Zernike corrections (see [22, 33, 34, 35]). In our work many adaptations to the lattice context were necessary, which increased considerably the technical complications involved. Let us briefly discuss some of them. In their work, for instance, Spencer and Zirilli [2] restricted a good part of their analysis to the zero-momentum sector of the energy-momentum spectrum and then invoked Lorentz covariance to extend their results to nonzero momenta. This strategy of argumentation simplifies many computations but cannot be applied to a situation where Lorentz covariance is lacking. Another major source of complications involved a series of space-time changes of variables intended to express some four-point functions and the Bethe-Salpeter kernel in terms of “centre of mass” and relative coordinates (the variables τ, η and ξ of [2]). The transformations used in [2] present no problem for a continuum space-time. On the lattice, however, they result in variables defined not on a lattice of integers $\mathbb{Z}^{d+1}$ but on a lattice of half-integers $(\mathbb{Z}/2)^{d+1}$. Adaptations are, therefore, necessary to stay on a lattice of integers $\mathbb{Z}^{d+1}$ and, what is of crucial importance, to preserve the form of relation (4.4) (the analogue of expression (2.5) in [2]), relevant for the analysis of the Bethe-Salpeter equation. The proof of (4.4) for lattice theories is surprisingly involved and is presented in detail in Appendix A. Another source of complications, also related to the lack of Lorentz invariance,

Vol. 2, 2001

Scattering and Bound States in Lattice Quantum Field Theories


involves the one-particle dispersion relations. For lattice systems little is a priori known about the dispersion relation of a particle1 and, hence, many of our proofs have to be performed without assuming a particular form for them. In [2], however, some computations use the explicit expression of the relativistic dispersion relation as a function of the momentum. In our case we are often forced to find more general arguments and computational methods. This paper is organized as follows. In Section 2 we present the basic setting we will work with and the main assumptions. In Section 3 we present the main result, whose proof starts in Section 4. In Section 5 an important technical lemma is proven and Appendix A is dedicated to the Bethe-Salpeter equation on the lattice.

2 Background and Notation

Our basic object is the lattice of integers in $d+1$ dimensions, $\mathbb{Z}^{d+1}$, with $d \geq 1$, whose sites will be denoted by $x = (x_0, x_1, \ldots, x_d)$ or $(x_0, \mathbf{x})$ for short. The canonical basis in $\mathbb{Z}^{d+1}$ will be denoted $\{e_i\}_{i=0}^{d}$. A finite subset of sites will be denoted generically by $\Lambda$. We introduce a conveniently topologised set $S \subseteq \mathbb{C}$ and define the set of all configurations in $\Lambda$ as the set $S^{\Lambda} := \{\varphi : \Lambda \to S\}$, with the product topology. For instance, for the Ising Model we have $S := \{-1, 1\}$ with the discrete topology. Restrictions on $S$ and its topology will be opportunely quoted, but are really neither serious nor relevant in the present context (see [3]). The set of complex-valued, continuous functions on configurations will be denoted by $C(S^{\Lambda})$. Examples are the projections at site $x$, $\varphi_x : \varphi \mapsto \varphi(x)$, and the function identically equal to 1, denoted here by $\mathbb{1}$. For each $i \in \{0, 1, \ldots, d\}$ and $a \in \mathbb{Z}$ we define the semi-spaces $\Lambda_{i,a} := \{(x_0, x_1, \ldots, x_d) \in \mathbb{Z}^{d+1} : x_i \geq a\}$. We define also the lattice translations
$$\tau_x : y \mapsto y + x, \qquad \forall\, x, y \in \mathbb{Z}^{d+1},$$
and reflections
$$\theta_{i,a} : y \mapsto (y_0, y_1, \ldots, y_{i-1}, 2a - y_i, y_{i+1}, \ldots, y_d), \qquad a \in \mathbb{Z}/2.$$
A relevant example is the “temporal reflection” $\theta := \theta_{0,0} : (x_0, \mathbf{x}) \mapsto (-x_0, \mathbf{x})$. We assume the existence of a state $\mu$ (i.e., a linear, positive functional with norm 1) on the algebra $C(S^{\mathbb{Z}^{d+1}})$, satisfying

A1 Invariance: $\mu = \mu \circ \theta_{i,a} = \mu \circ \tau_x$, $\forall\, i \in \{1, \ldots, d\}$, $\forall\, x \in \mathbb{Z}^{d+1}$ and $\forall\, a \in \mathbb{Z}/2$.

A2 Reflection Positivity: $\mu(\overline{\theta_{0,a} f}\, f) \geq 0$, $\forall\, f \in P(\Lambda_{0,a})$ and $\forall\, a \in \mathbb{Z}/2$.

1 It can be computed, however, through converging expansions. See f.i. [28, 29].



Here, the horizontal bar indicates complex conjugation and $P(\Lambda)$ denotes the algebra generated by all projections $\varphi_x$ with $x \in \Lambda$. The state $\mu$ is interpreted as the vacuum state and is typically obtained as the thermodynamic limit of finite volume Gibbs states. By the last two properties we are allowed to construct, by standard procedures (see, f.i., [19, 20]), the Hilbert space of physical states $\mathcal{H}$ as the completion of the quotient $P(\Lambda_{0,0})/\{f : \langle f, f\rangle_0 = 0\}$. Here, $\langle f, g\rangle_0 := \mu(\overline{\theta f}\, g)$ is a sesquilinear, Hermitian, non-negative form (by A2) on $P(\Lambda_{0,0})$. The standard procedure gives, additionally, a canonical inclusion $i : P(\Lambda_{0,0}) \to \mathcal{H}$ with dense image. The functions $f \in P(\Lambda_{0,0})$ act as multiplication operators on $P(\Lambda_{0,0})$, $f : g \mapsto f g$, and this action can be extended to $\mathcal{H}$ via the canonical inclusion $i$. Some of its images are relevant for the formalism of the theory, receiving special names:

$i(\mathbb{1}) =: \Omega$, the vacuum state vector,
$i(\varphi_x) =: \Phi(x)$, the local fields, for $x \in \Lambda_{0,0}$,
$i(\tau_{e_0}) =: T$, the transfer matrix,
$i(\tau_{e_n}) =: T_n$, the generators of space translations, for $n = 1, \ldots, d$.

The local fields are bounded operators if $S$ is a compact set, with $\|\Phi(x)\| \leq \sup\{|s| : s \in S\}$. The transfer matrix is a self-adjoint, positive (by A2) operator with norm equal to 1 and the operators $T_n$, the generators of elementary space translations on the lattice, are unitaries, so they can be expressed as $T_n = e^{iP_n}$ for certain self-adjoint operators $P_n$, with spectrum in $(-\pi, \pi]$. We can identify $\mathbf{P} = (P_1, \ldots, P_d)$ with the momentum operator. The Hamilton operator is defined on $\mathrm{Ker}(T)^{\perp}$ by $H := -\ln(T \restriction \mathrm{Ker}(T)^{\perp})$. Notice, however, that $\mathrm{Ker}(T) = \{0\}$ in many of the more interesting models (see [20]) and in such cases $H$ will be defined on the whole Hilbert space. Finally, $(H, \mathbf{P})$ is the energy-momentum (e-m) operator. We define the n-point Euclidean, or Schwinger, functions as $S_n(x_1, \ldots, x_n) := \mu(\varphi_{x_1} \cdots \varphi_{x_n})$. As a consequence of translation invariance A1, we can express $S_n$ in terms of difference variables
$$S_n(x_1, \ldots, x_n) = S_n(x_1 - x_n, \ldots, x_{n-1} - x_n). \tag{2.1}$$
Directly from definitions we get the following trivial, but relevant property
$$S_n(x_1, \ldots, x_n) = \langle \Omega, \Phi(0)\, T_{x_2 - x_1}\, \Phi(0)\, T_{x_3 - x_2}\, \Phi(0) \cdots T_{x_n - x_{n-1}}\, \Phi(0)\, \Omega \rangle$$
for $(x_n)_0 \geq (x_{n-1})_0 \geq \cdots \geq (x_1)_0$. Above, we denote $T_x = T^{x_0} \prod_{i=1}^{d} T_i^{x_i} = e^{-x_0 H} e^{i \mathbf{x}\cdot\mathbf{P}}$ with $x_0 \geq 0$. By time-reflection invariance of $\mu$, one has $S_2((x_0, \mathbf{x}), 0) = S_2((|x_0|, \mathbf{x}), 0)$. Hence,
$$S_2(x_0, \mathbf{x}) \equiv S_2((x_0, \mathbf{x}), 0) = \langle \Omega, \Phi(0)\, e^{-|x_0| H} e^{i \mathbf{x}\cdot\mathbf{P}}\, \Phi(0)\, \Omega \rangle.$$

The above expression, sometimes called the Gell-Mann-Low formula, is the starting point for the study of spectral properties of the e-m operator restricted to the “one-particle” subspace (see [26]). The method employed in the present work is based on an analogous formula for the four-point function (see (4.6)). For functions $f$ on the lattice, the Fourier transform $\hat{f}$ and the inverse Fourier transform $\check{f}$ are defined respectively by
$$\hat{f}(p) = (2\pi)^{-\frac{d+1}{2}} \sum_{x \in \mathbb{Z}^{d+1}} e^{-i p\cdot x} f(x) \qquad \text{and} \qquad \check{f}(x) = (2\pi)^{-\frac{d+1}{2}} \int_{\mathbb{T}^{d+1}} e^{i x\cdot p} f(p)\, dp.$$
Above, $\mathbb{T}^{d+1}$ is the $(d+1)$-dimensional torus $\mathbb{T}^{d+1} = (-\pi, \pi]^{d+1}$. Note that $\hat{\ }$ can be seen as an operator transforming functions on the lattice into functions on momentum space, and vice-versa for $\check{\ }$. The scattering theory for massive lattice field theories was developed in [5] along the same lines of the Haag-Ruelle scattering theory. It can be seen as a mathematical machine, starting with the input hypothesis

A3 Lower mass gap (or exponential decay of the truncated two-point function): There exists a constant $m > 0$ such that
$$|\mu(\varphi_{x_1} \varphi_{x_2}) - \mu(\varphi_{x_1})\, \mu(\varphi_{x_2})| \leq \mathrm{const}\; e^{-m|x_1 - x_2|}, \qquad \forall\, x_1, x_2.$$

A4 Upper mass gap (or stronger exponential decay of the inverse of the truncated two-point function, or existence of “one-particle states”): There exists ω(p), real analytic (called the dispersion curve) and ω (p), continuous with m ω(p) < ω (p), such that the Fourier transform of the two-point function S2 (p0 , p) can be analytically extended to a meromorphic function in {p0 : Im p0 < ω (p)}, with a simple pole at p0 = iω(p). Furthermore, det(∂ 2 ω(p)/∂pi ∂pj ) = 0, ∀ p. to produce several output results [5, 6, 7]: the existence of asymptotic subspaces Hin and Hout ⊆ H provided with a natural well-known Fock space structure, reduction formulae, clustering of S-matrix elements, etc. In intuitive terms the property of AC says that the space of all states of the theory can be spanned by the scattering states, interpreted as states which, at large (positive or negative) times, represent spatially separated free particles. The (strong) AC condition can be mathematically expressed as H = Hin = Hout . On a physical basis, we should expect to have AC in every reasonable theory. In other words, we should expect that every state of a physical system can be regarded either as composed of particles or else as decaying into such a state as time progresses. Implicit in this statement is the idea that the free dynamics has no bound states.




3 Main Result

Consider the following hypotheses.

H1 Existence of “one-particle states”: The Fourier transform of the two-point function is given by
$$\hat{S}_2(p_0, \mathbf{p}) = \frac{Z(\mathbf{p}) \sinh \omega(\mathbf{p})}{\cosh \omega(\mathbf{p}) - \cos p_0} + \int_{m+2\delta_0}^{\infty} \frac{\sinh \lambda_0}{\cosh \lambda_0 - \cos p_0}\, d\rho(\lambda_0, \mathbf{p}),$$
where $Z(\mathbf{p})$ is a positive, $C^{\infty}$ function; $\omega(\mathbf{p})$ is real analytic, $m \leq \omega(\mathbf{p}) < m + 2\delta_0$; $0 < \delta_0 \leq m$ and $\rho$ is a positive measure.

H2 Exponential decay of the Bethe-Salpeter kernel: The Bethe-Salpeter kernel$^2$ in momentum space $K(k, p, q)$ is analytic in the region
$$|\mathrm{Im}\, p_i| < \delta_i + \varepsilon \ \ (i = 0, 1, \ldots, d), \qquad |\mathrm{Im}\, q_i| < \delta_i + \varepsilon \ \ (i = 0, 1, \ldots, d), \qquad |\mathrm{Im}\, k_0| < m + \delta_0,$$
for certain $\delta_i > 0$ $(i = 1, \ldots, d)$, and $\varepsilon > 0$.

H3 “Repulsive interaction”: $K(k, p, q) = \eta\, \mathbb{1} + \eta^2 K_1(k, p, q)$, with $K_1$ satisfying H2. Here $\mathbb{1}$ is the function identically equal to 1 and $\eta$ is a non-negative constant.

H4 “Two-particle states” filling certain energy interval: $\mathrm{span}\{\Theta(\check{f}) : f \in A^1_{\delta}\} \supseteq \mathcal{H}_{2(m+\delta_0)}$. Here, $\Theta(\check{f})$ are basically the “two-particle states” and $\mathcal{H}_{2(m+\delta_0)}$ are the states having energy less than $2(m + \delta_0)$. The precise definitions are given in the next section (see eq. (4.7)).

Then, denoting $\mathcal{H}^{2(m+\delta_0)}_{\substack{\mathrm{in}\\ \mathrm{out}}} := \mathcal{H}_{\substack{\mathrm{in}\\ \mathrm{out}}} \cap \mathcal{H}_{2(m+\delta_0)}$, our main result can be described as follows:

Theorem 3.1 Assuming space dimension $d = 1$ and under H1, H2 and H4, it follows that
$$\mathcal{H}_{2(m+\delta_0)} = \mathcal{H}^{2(m+\delta_0)}_{bs} \oplus \mathcal{H}^{2(m+\delta_0)}_{\substack{\mathrm{in}\\ \mathrm{out}}}, \tag{3.1}$$
where $\mathcal{H}^{2(m+\delta_0)}_{bs}$ is the subspace containing one-particle bound states with energy less than $2(m + \delta_0)$. If in addition H3 holds, bound states are excluded and we have
$$\mathcal{H}_{2(m+\delta_0)} = \mathcal{H}^{2(m+\delta_0)}_{\substack{\mathrm{in}\\ \mathrm{out}}}. \tag{3.2}$$
✷

2 Defined in the next section.

Remark 1. The hypothesis H1 was verified in various models by many authors (see refs. [24]–[47]). Hypotheses H2 and H3 were verified in continuous RQFT for $P(\varphi)_2$ models by T. Spencer [1]. There are proofs of H2 for a wide class of spin systems [43] and for the high temperature Ising model [4] based on polymer expansions and inspired by Spencer’s work [1]. In the case of [4], one has $\delta_0 = \delta_i = m - \varepsilon$, with $\varepsilon$ an arbitrary positive constant. The Bethe-Salpeter kernel is the analogue in QFT of the potential in Quantum Mechanics, and the positive constant $\eta$ in H3 is basically a coupling constant. So, H3 says basically that, at first order in the coupling constant, the interaction is repulsive, which makes the exclusion of bound states plausible. This is a feature of certain $P(\varphi)_2$ models studied by Spencer [1] (see also in this context [14, 15, 16, 17]) and may not be satisfied in other QFT models. In any case, it is a sufficient condition to exclude the presence of bound states.

Remark 2. Most of the results presented in the course of the proof of Theorem 3.1, in the next sections, are valid in any space dimension. The restriction to $d = 1$ is purely technical and enables us to explicitly compute the Radon-Nikodym derivative of the spectral measure by a local inversion of the dispersion function $\omega(p)$, which is only possible in $d = 1$.

Remark 3. Notice that, by the hypothesis H1, one has $m + 2\delta_0 > \omega(\mathbf{p})$ and, hence, $2(m + \delta_0) > m + \omega(\mathbf{p})$, which is the lowest energy of a two-particle state with momentum $\mathbf{p}$. Therefore, the range of energies $2(m + \delta_0)$ of Theorem 3.1 includes states with all possible momenta $\mathbf{p} \in (-\pi, \pi]^d$.
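As a sanity check on the form of H1, the $p_0$-dependence of the pole term can be recognised as the lattice Fourier sum of a pure exponential decay in Euclidean time. The following Python sketch compares the truncated sum over $x_0$ with the closed form $\sinh\omega/(\cosh\omega - \cos p_0)$. The function names, the cutoff and the sample value of $\omega$ are illustrative assumptions, not taken from the paper, and the normalisation factor of the Fourier transform is left out.

```python
import cmath
import math


def lattice_kernel(omega: float, p0: float, cutoff: int = 200) -> complex:
    """Truncated sum over x0 in Z of exp(-omega*|x0|) * exp(-i*p0*x0)."""
    return sum(
        math.exp(-omega * abs(x0)) * cmath.exp(-1j * p0 * x0)
        for x0 in range(-cutoff, cutoff + 1)
    )


def closed_form(omega: float, p0: float) -> float:
    """sinh(omega) / (cosh(omega) - cos(p0)): the p0-dependence of the pole term in H1."""
    return math.sinh(omega) / (math.cosh(omega) - math.cos(p0))


omega = 0.7  # illustrative value of the dispersion curve at a fixed spatial momentum
for p0 in (0.0, 0.5, 2.0, math.pi):
    diff = abs(lattice_kernel(omega, p0) - closed_form(omega, p0))
    print(f"p0 = {p0:.3f}: |sum - closed form| = {diff:.2e}")
```

The closed form is singular only where $\cos p_0 = \cosh\omega$, that is at $p_0 = \pm i\omega$ modulo $2\pi$, which is consistent with the simple pole at $p_0 = i\omega(\mathbf{p})$ required in A4.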

4 Proof of the Main Result

4.1 Relation Between Resolvent and Spectrum

We define the connected part of the truncated four-point function as
$$D(x_1, x_2, x_3, x_4) := S_4(x_1, x_2, x_3, x_4) - S_2(x_1, x_2)\, S_2(x_3, x_4)$$
and the unconnected part as
$$D_0(x_1, x_2, x_3, x_4) := S_2(x_1, x_3)\, S_2(x_2, x_4) + S_2(x_1, x_4)\, S_2(x_2, x_3).$$
In terms of the new variables
$$\xi := x_1 - x_2, \qquad \eta := x_3 - x_4, \qquad \tau := x_1 + x_2 - (x_3 + x_4), \tag{4.1}$$
and expressing the two-point and four-point functions in terms of difference variables (2.1), we get
$$D(x_1, x_2, x_3, x_4) = S_4\!\left(\frac{\tau + \xi + \eta}{2}, \frac{\tau - \xi + \eta}{2}, \eta\right) - S_2(\xi)\, S_2(\eta) =: D(\tau, \xi, \eta) \tag{4.2}$$
and
$$D_0(x_1, x_2, x_3, x_4) = S_2\!\left(\frac{\tau + (\xi - \eta)}{2}\right) S_2\!\left(\frac{\tau - (\xi - \eta)}{2}\right) + S_2\!\left(\frac{\tau + (\xi + \eta)}{2}\right) S_2\!\left(\frac{\tau - (\xi + \eta)}{2}\right) =: D_0(\tau, \xi, \eta). \tag{4.3}$$
The change of variables (4.1) is the same as in [2], except for a factor 1/2, which is present in [2] but must be avoided here. In the continuum framework this factor is incidental, but here its absence is necessary for the transformation to map the lattice of integers $\mathbb{Z}^{d+1}$ again into the lattice of integers $\mathbb{Z}^{d+1}$ (see also Appendix A below). Denote $R(k, p, q) := \hat{D}(k, p, q)$ and $R_0(k, p, q) := \hat{D}_0(k, p, q)$, considering them as a family of integral operators indexed by $k$: $[R(k)f](p) = \int_{\mathbb{T}^{d+1}} R(k, p, q)\, f(q)\, dq$, and the same for $R_0(k)$. Direct calculations show that the kernel of the integral operator $R_0(k)$ acting on symmetrical (i.e., $f(p) = f(-p)$) functions is given by
$$R_0(k, p, q) = 2(2\pi)^{\frac{d+1}{2}}\, \hat{S}_2(k + p)\, \hat{S}_2(k - p)\, \delta(p + q). \tag{4.4}$$
This expression is totally analogous to expression (2.5) of [2], and this fact is of crucial importance to our analysis. The proof of (4.4) for the lattice is surprisingly intricate and is presented in Appendix A. We will denote
$$R_0(k, p) = 2(2\pi)^{\frac{d+1}{2}}\, \hat{S}_2(k + p)\, \hat{S}_2(k - p). \tag{4.5}$$
On the other hand, straightforward calculations show that
$$\langle f, R(k) g\rangle_{L^2(\mathbb{T}^{d+1})} = (2\pi)^{\frac{d+1}{2}} \int_0^{\infty}\!\!\int_{\mathbb{T}^d} \frac{\sinh(\lambda_0/2)}{\cosh(\lambda_0/2) - \cos k_0}\, \delta\!\left(\frac{\boldsymbol{\lambda}}{2} + \mathbf{k}\right) dE(\lambda), \tag{4.6}$$
where $f$ and $g$ are symmetrical with purely spatial dependence, $\lambda = (\lambda_0, \boldsymbol{\lambda}) \in [0, \infty) \times \mathbb{T}^d$ and $E_{\lambda}$ denotes the spectral family of the e-m operator whose spectral measure is denoted $dE(\lambda) = d\langle \Theta(\check{f}), E_{\lambda} \Theta(\check{g})\rangle$, with the “two-particle states” given by
$$\Theta(h) := \sum_{x \in \mathbb{Z}^{d+1}} h(-x)\, e^{-\frac{1}{2} i \mathbf{x}\cdot\mathbf{P}}\, [\Phi((0, \mathbf{x}))\Phi(0) - \mu(\varphi_{(0,\mathbf{x})} \varphi_0)\, \mathbb{1}]\, \Omega. \tag{4.7}$$
Equation (4.6) implies the following important relation between four-point functions and the spectral measure of the e-m operator. For $k_0 = x_0 + i y_0 \in \mathbb{C}$, $x_0, y_0 \in \mathbb{R}$, denote $\cos k_0 - 1 =: x + iy$ with $x, y \in \mathbb{R}$. Then
$$\lim_{y \to 0^+} \int_0^{\infty} \mathrm{Im}\, \langle f, R((k_0, \mathbf{k})) g\rangle\, h(x)\, dx = \pi (2\pi)^{\frac{d+1}{2}} \int_0^{\infty} h(a)\, d\nu(a) \tag{4.8}$$
for all $h \in C_0^{\infty}(0, +\infty)$, where
$$d\nu(a) := \sqrt{(a + 1)^2 - 1}\;\, d\langle \Theta(\check{f}), E_{(2\,\mathrm{arcosh}(a+1),\, -2\mathbf{k})}\, \Theta(\check{g})\rangle.$$
In other words, we have the following result, whose proof is omitted because it is basically a simple paraphrase of equation (4.8):

Lemma 4.1 If the distribution given by the left hand side of (4.8) vanishes in the open set $(\alpha, \beta)$, then the spectral measure of the e-m operator vanishes in the open set $(2\,\mathrm{arcosh}(\alpha + 1), 2\,\mathrm{arcosh}(\beta + 1))$. ✷

4.2 Regularity of Resolvents

We learn from (4.8) and from Lemma 4.1 that the study of the spectral measure of the e-m operator can be reduced to the study of the distribution given by the l.h.s. of (4.8) which, in turn, can be reduced to the study of the operator $R(k)$. How to do this? We start by introducing the Bethe-Salpeter (B-S) equation
$$R(k) = R_0(k) - R_0(k) K(k) R(k).$$
The B-S kernel $K(k)$ is defined by this equation. In quantum mechanical scattering, this is the starting point for a perturbative method. Here we explore other features. The B-S equation has the formal solution $R = R_0 (\mathbb{1} + K R_0)^{-1}$, provided this inverse exists.$^3$ As in [2], we introduce the following notations:
$$\delta := (\delta_0, \delta_1, \ldots, \delta_d), \qquad I_{\delta} := \{(\alpha_0, \alpha_1, \ldots, \alpha_d) \in \mathbb{R}^{d+1} : |\alpha_i| \leq \delta_i, \ \forall\, i = 0, 1, \ldots, d\},$$
$$\|f\|_{\delta}^2 := \sup_{\alpha \in I_{\delta}} \int_{\mathbb{T}^{d+1}} |f(p + i\alpha)|^2\, dp,$$
$$A_{\delta} := \{f : f \text{ is analytic in } |\mathrm{Im}\, p_i| \leq \delta_i, \ \|f\|_{\delta} < \infty, \ f(p) = f(-p)\}.$$
What we can say about the invertibility of $\mathbb{1} + K R_0$ is condensed in the next result, whose proof is highly technical and will be delayed until Section 5:

Lemma 4.2

(a) For each fixed $\mathbf{k}$ and for $k_0$ belonging to
$$D_1 := \{|\mathrm{Re}\, k_0| < \pi/2 \wedge |\mathrm{Im}\, k_0| < m + \delta_0\} \setminus \{\mathrm{Re}\, k_0 = 0 \wedge m \leq |\mathrm{Im}\, k_0| < m + \delta_0\},$$
$K(k_0, \mathbf{k}) R_0(k_0, \mathbf{k})$ is an analytic family of Hilbert-Schmidt operators in $A_{\delta}$. Therefore, the inverse operator $(\mathbb{1} + K R_0)^{-1}$ exists for $k_0$ in $D_1$, except for a discrete set of poles $P \subset D_1$.

3 Observe that taking formally inverses in this last relation we get $K = R^{-1} - R_0^{-1}$, a convenient expression for the B-S kernel.

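(b) If the constant $\eta$ in H3 is sufficiently small, then there exists $\rho > 0$ such that $R(k_0, \mathbf{k})$ has no poles for $k_0$ in $D_1' \cup B_{\rho}(im)$, where
$$D_1' := \{|\mathrm{Re}\, k_0| < \pi/2 \wedge 0 < |\mathrm{Im}\, k_0| < m\},$$
and $B_{\rho}(im)$ is a ball of radius $\rho$ centered at $im$.

(c) When $R$ is well defined, the distribution in the l.h.s. of (4.8) is given by
$$h \mapsto (2\pi)^{\frac{d+1}{2}} \pi^2 \int_{\mathbb{T}^d} Z(\mathbf{p} + \mathbf{k})\, Z(\mathbf{p} - \mathbf{k})\, \sinh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})]\; \overline{W f}(0, \mathbf{p})\, W f(0, \mathbf{p})\, \frac{h(\theta(\mathbf{p}) - 1)}{\theta(\mathbf{p})}\, d\mathbf{p}, \tag{4.9}$$
where
$$\theta(\mathbf{p}) := \sqrt{\frac{\cosh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})] + 1}{2}}, \tag{4.10}$$
and
$$W f := \lim_{\substack{y \to 0^+ \\ x \to \theta(\mathbf{p}) - 1}} [\mathbb{1} + K(k_0, \mathbf{k}) R_0(k_0, \mathbf{k})]^{-1} f. \tag{4.11}$$
✷

Figure 1: Left: The region $D_1$ of Lemma 4.2 (a) is as shown plus its reflection on the real axis. Right: The region $D_1'$ of Lemma 4.2 (b).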
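The algebra behind the formal solution of the B-S equation can be illustrated in finite dimensions. The sketch below replaces the integral operators by 2×2 matrices with arbitrarily chosen entries (an assumption made purely for illustration, not a model of the actual kernels) and checks that $R = R_0(\mathbb{1} + K R_0)^{-1}$ satisfies $R = R_0 - R_0 K R$ and that $K = R^{-1} - R_0^{-1}$ is recovered.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(a, b, sign=1.0):
    """Entrywise a + sign * b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def max_abs(a):
    """Largest absolute value among the entries."""
    return max(abs(entry) for row in a for entry in row)


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
R0 = [[0.9, 0.2], [0.1, 0.5]]    # illustrative stand-in for R_0(k)
K = [[0.3, -0.4], [0.25, 0.6]]   # illustrative stand-in for K(k)

# Formal solution R = R0 (1 + K R0)^{-1}.
R = matmul(R0, inverse(matadd(IDENTITY, matmul(K, R0))))

# Residual of the B-S equation: R - (R0 - R0 K R).
bs_residual = matadd(R, matadd(R0, matmul(matmul(R0, K), R), sign=-1.0), sign=-1.0)

# Kernel recovered from K = R^{-1} - R0^{-1}.
K_recovered = matadd(inverse(R), inverse(R0), sign=-1.0)

print("B-S residual:", max_abs(bs_residual))
print("kernel mismatch:", max_abs(matadd(K_recovered, K, sign=-1.0)))
```

Both printed quantities should vanish up to rounding. The check uses the identity $R_0(\mathbb{1} + K R_0)^{-1} = (\mathbb{1} + R_0 K)^{-1} R_0$, which holds whenever the inverses exist.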

4.3 Energy Spectrum

If $A = (r, s) \times \{\mathbf{k}\} \subset (0, +\infty) \times (-\pi, \pi]^d$ is a Borel set and $E$ denotes the spectral family of the e-m operator $(H, \mathbf{P})$, we define $\mathcal{H}_{(r,s)} := E(A)\mathcal{H}$. In the particular case $r = 0$ we denote it by $\mathcal{H}_s$ rather than $\mathcal{H}_{(0,s)}$. We also denote by $A^1_{\delta}$ the subspace of functions in $A_{\delta}$ with purely spatial dependence. Let $\sigma$ be the spectrum of the e-m operator restricted to the subspace
$$\mathrm{span}\{\Theta(\check{f}) : f \in A^1_{\delta}\} \cap \mathcal{H}_{2(m+\delta_0)}. \tag{4.12}$$
The following result proves (3.1).

Proposition 4.1

(a) Below $2m$, the e-m spectrum is contained in the poles of $R(k)$. In more precise terms, this means $\sigma \cap (0, 2m) \subseteq -2i\, [P \cap \{k_0 = i y_0 : y_0 \in (0, m)\}]$.

(b) Assume space dimension $d$ equal to 1. Then, except for the poles of $R(k)$, the e-m spectrum above $2m$ is absolutely continuous having multiplicity 1. In more precise terms, this means that
$$\sigma' := \sigma \cap (2m, 2(m + \delta_0)) \cap \big(-2i\, [P \cap \{k_0 = i y_0 : y_0 \in (m, m + \delta_0)\}]\big)^c \tag{4.13}$$
is absolutely continuous and has multiplicity 1. ✷

Figure 2: The points × represent the discrete set of poles $P$. Bold lines and dots in the horizontal axis represent the spectrum $\sigma$ in a general situation.

Proof. For part (a) it is sufficient to prove the statement in $(\varepsilon, 2m - \varepsilon)$, with $\varepsilon > 0$ arbitrarily small. By reductio ad absurdum, assume the existence of $\lambda \in \sigma \cap (\varepsilon, 2m - \varepsilon)$ but with $i\lambda/2 \notin P$ (see Figure 2). In this case, since $P$ is a discrete subset of $D_1$, there exists a neighbourhood of $i\lambda/2$ containing no poles. In this neighbourhood, the distribution of the l.h.s. of (4.8) is well defined and given by expression (4.9). Denote by $(\alpha, \beta)$ the overlap of this neighbourhood with the imaginary axis. We can assume, without loss of generality, that $\beta < m$, choosing the original neighbourhood smaller, if necessary. Hence, the distribution is well defined in $C_0^{\infty}(\cosh \alpha - 1, \cosh \beta - 1)$ and vanishes there (note that expression (4.9) and the fact that $\omega(q) \geq m$ imply that the distribution has support contained in $[\cosh m - 1, +\infty)$). Therefore, according to Lemma 4.1, the spectral measure vanishes in $(2\alpha, 2\beta) \ni \lambda$, but this is a contradiction.

For part (b), assume that $(2\alpha, 2\beta)$ is contained in $\sigma'$ (defined in (4.13)). In this case, $i(\alpha + \varepsilon, \beta - \varepsilon) \cap \{k_0 = i y_0 : y_0 \in (m, m + \delta_0)\}$ has a neighbourhood containing no poles, for any $\varepsilon > 0$. As in the previous case, the distribution in (4.8) is well defined in this neighbourhood and given by (4.9). If $d = 1$, and denoting consequently $\mathbf{p} = p_1$, equation (4.9) reads
$$2\pi^3 \int_{-\pi}^{\pi} Z(p_1 + k_1)\, Z(p_1 - k_1)\, \sinh[\omega(p_1 + k_1) + \omega(p_1 - k_1)]\; \overline{W f}(0, p_1)\, W f(0, p_1)\, \frac{h(\theta(p_1) - 1)}{\theta(p_1)}\, dp_1, \tag{4.14}$$
where $\theta(p_1)$ and $W f$ are as in (4.10)-(4.11), respectively. To keep the analogy with equation (4.8), we define a new variable $a$ $(= a(p_1))$ by $a := \theta(p_1) - 1$. Assume that the function $F$ defined by
$$F(p_1) := \omega(p_1 + k_1) + \omega(p_1 - k_1) = \mathrm{arcosh}[2(a + 1)^2 - 1] =: \mu(a)$$
is invertible and that the derivative of $F^{-1}$ is non-negative. Note that $2(a + 1)^2 = \cosh F(p_1) + 1$ and, hence, $\sinh(F(p_1))\, dp_1 = 4(a + 1)\, (F^{-1})'(\mu(a))\, da$. Therefore, with this change of variables, (4.14) reads
$$8\pi^3 \int_{-\pi}^{\pi} Z(F^{-1}(\mu(a)) + k_1)\, Z(F^{-1}(\mu(a)) - k_1)\; \overline{W f}[0, F^{-1}(\mu(a))]\, W f[0, F^{-1}(\mu(a))]\, (F^{-1})'(\mu(a))\, h(a)\, da. \tag{4.15}$$
Comparison of (4.15) with (4.8) shows that
$$d\langle \Theta(\check{f}), E_{(2\,\mathrm{arcosh}(a+1),\, -2k_1)}\, \Theta(\check{f})\rangle = 4\pi\, Z(F^{-1}(\mu(a)) + k_1)\, Z(F^{-1}(\mu(a)) - k_1)\; \overline{W f}\, W f[0, F^{-1}(\mu(a))]\, (F^{-1})'(\mu(a))\, \frac{da}{\sqrt{(a + 1)^2 - 1}},$$
for $a \in (\cosh(\alpha + \varepsilon) - 1, \cosh(\beta - \varepsilon) - 1)$. Or
$$d\langle \Theta(\check{f}), E_{(\lambda_0,\, -2k_1)}\, \Theta(\check{f})\rangle_{\mathcal{H}} = |L_f(\lambda_0)|^2\, d\lambda_0, \tag{4.16}$$
for $\lambda_0 \in (2(\alpha + \varepsilon), 2(\beta - \varepsilon))$, where
$$L_f(\lambda_0) = \big[2\pi\, Z(F^{-1}(\lambda_0) + k_1)\, Z(F^{-1}(\lambda_0) - k_1)\, (F^{-1})'(\lambda_0)\big]^{1/2}\, W f(0, F^{-1}(\lambda_0)).$$
Above, we have defined the new variable $\lambda_0 := 2\,\mathrm{arcosh}(a + 1)$. So, $(a + 1) = \cosh(\lambda_0/2)$, and $\mu(a) = \lambda_0$. Therefore, according to (4.16), the mapping
$$\{\Theta(\check{f}) : f \in A^1_{\delta}\} \cap \mathcal{H}_{(2(\alpha+\varepsilon),\, 2(\beta-\varepsilon))} \to L^2((2(\alpha + \varepsilon), 2(\beta - \varepsilon)), d\lambda_0)$$
given by $\Theta(\check{f}) \mapsto L_f$ is unitary and $\|H \Theta(\check{f})\|_{\mathcal{H}} = \|\lambda_0 L_f(\lambda_0)\|_{L^2}$, i.e., $H$ acts in $L^2$ as a multiplication operator by $\lambda_0$. Notice that we do not exclude the existence of poles embedded in the continuum. For this we need hypothesis H3, as we discuss below.
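The two elementary facts used in passing from the variable $a$ to the variable $\lambda_0$, namely $\mu(a) = \lambda_0$ and $da = \tfrac{1}{2}\sqrt{(a+1)^2 - 1}\; d\lambda_0$, can be checked numerically. The following sketch does so for a few sample values of $a$; the helper names, the sample values and the finite-difference step are illustrative assumptions.

```python
import math


def mu(a: float) -> float:
    """mu(a) = arcosh[2 (a + 1)^2 - 1], as defined before (4.15)."""
    return math.acosh(2.0 * (a + 1.0) ** 2 - 1.0)


def lambda0(a: float) -> float:
    """lambda_0 = 2 arcosh(a + 1), the variable introduced after (4.16)."""
    return 2.0 * math.acosh(a + 1.0)


def a_from_lambda0(lam: float) -> float:
    """Inverse relation: a = cosh(lambda_0 / 2) - 1."""
    return math.cosh(lam / 2.0) - 1.0


step = 1e-6  # finite-difference step for the Jacobian check
for a in (0.1, 0.5, 2.0):
    lam = lambda0(a)
    numeric_jacobian = (a_from_lambda0(lam + step) - a_from_lambda0(lam - step)) / (2.0 * step)
    exact_jacobian = 0.5 * math.sqrt((a + 1.0) ** 2 - 1.0)
    print(f"a = {a}: mu(a) - lambda0 = {mu(a) - lambda0(a):.1e}, "
          f"da/dlambda0 = {numeric_jacobian:.6f} (expected {exact_jacobian:.6f})")
```

The Jacobian factor $\tfrac{1}{2}\sqrt{(a+1)^2 - 1}$ cancels the square root in the denominator of the measure, which is how the prefactor $4\pi$ becomes the $2\pi$ appearing in $L_f$.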

4.4 Absence of Bound States

Lemma 4.2 (b) and Proposition 4.1 imply $\mathcal{H}_{2(m+\rho)} = \mathcal{H}^{2(m+\rho)}_{\mathrm{in}} = \mathcal{H}^{2(m+\rho)}_{\mathrm{out}}$. This is the AC condition, but only for energies a little above $2m$ because, in general, $\rho$ will be small. We refer the reader to the proof of Lemma 4.2 in Section 5. There, integrals like (5.7) are reduced, in the $d = 1$ case, to integrals over the real line, whose integration path can be deformed into the complex plane to avoid the cut in $D_1$, as in [2]. This allows us to increase the energy range. Let us now introduce the following

Assumption: There exists $\gamma > 0$ such that the dispersion curve $\omega(p_1)$ admits an analytic extension $\omega(p_1 + i q_1)$ to the strip $|q_1| < \gamma$ with the following property: there exists a path $t : I \to \mathbb{C}$ contained in this strip, homotopic to the real line, with $t(I) \cap \mathbb{R} \subseteq \{0\}$, such that
$$F(I) \cap \{k_0 = i y_0 : 0 < |y_0| < m + \delta_0\} \subseteq \{im\}, \qquad \forall\, k_1 \in (-\pi, \pi).$$
Above, $I$ is some open interval of the real line and for the sake of brevity we denote $F(I) = i\, [\omega(t(I) + k_1) + \omega(t(I) - k_1)]/2$. We have:

Lemma 4.3 Under the above assumption, Lemma 4.2 (a) is valid for $k_0$ in a neighbourhood of $\{k_0 = i y_0 : |y_0| < m + \delta_0\}$, except for a neighbourhood of $k_0 = im$, which can be chosen arbitrarily small. ✷

Proof. The proof is analogous to the proof of Lemma 4.2 (a). We have to verify the analyticity of
$$\int_{-\pi}^{\pi} \frac{Z(p_1 + k_1)\, Z(p_1 - k_1)\, \sinh[\omega(p_1 + k_1) + \omega(p_1 - k_1)]}{\theta^2(p_1) - \cos^2 k_0}\, g(0, p_1)\, dp_1. \tag{4.17}$$

Figure 3: Left: An example of the path $t(I)$ in the complex plane. Right: The image of the relativistic $F(I)$ along this path (zero momentum case).

Here, as there, the idea consists in displacing the path in the $dp_1$ integration to the path $t(s)$ in the complex plane. The analyticity region would exclude the $k_0$'s lying on this path. By the Assumption we are allowed to admit the set $\{k_0 = i y_0 : m \leq |y_0| < m + \delta_0\}$ in this region, except for a neighbourhood of $k_0 = im$. In fact, using the trigonometrical identity $\cos^2 \alpha - \cos^2 \beta = -\sin(\alpha + \beta) \sin(\alpha - \beta)$, the denominator in (4.17) can be written as
$$\cosh^2\!\big([\omega(t(I) + k_1) + \omega(t(I) - k_1)]/2\big) - \cos^2 k_0 = \cos^2 F(I) - \cos^2 k_0 = \sin(k_0 + F(I))\, \sin(k_0 - F(I)),$$
and, by the Assumption, this expression is non-zero for these $k_0$'s, except possibly for $k_0 = im$. Analogously, Lemma 4.2 (b) is valid in $D_1'$ given by $D_1' := \{|\mathrm{Re}\, k_0| < \pi/2 \wedge 0 < |\mathrm{Im}\, k_0| < m + \delta_0\}$. This gives (3.2).

The assumption above is best understood when the dispersion curve is explicitly known. See [2] for an example in the case $\omega(p_1) = \sqrt{4m^2 + p_1^2}$. In fact, our assumption was stated having this as a paradigm.
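The factorisation of the denominator of (4.17) used in the proof of Lemma 4.3 can be checked at complex arguments. In the sketch below, `s` stands for the sum $\omega(t + k_1) + \omega(t - k_1)$ evaluated along the deformed path, so that $F = i s/2$; the sample values and function names are arbitrary, illustrative choices.

```python
import cmath


def denominator(s: complex, k0: complex) -> complex:
    """cosh^2(s/2) - cos^2(k0), with s playing the role of omega(t+k1) + omega(t-k1)."""
    return cmath.cosh(s / 2) ** 2 - cmath.cos(k0) ** 2


def factorised(s: complex, k0: complex) -> complex:
    """sin(k0 + F) * sin(k0 - F) with F = i s / 2."""
    big_f = 1j * s / 2
    return cmath.sin(k0 + big_f) * cmath.sin(k0 - big_f)


s = 1.3 + 0.2j    # illustrative complex value of the summed dispersion curve
k0 = 0.4 + 0.9j   # illustrative complex value of k0
print("mismatch:", abs(denominator(s, k0) - factorised(s, k0)))

# The product vanishes when k0 coincides with F, here s = 2 and k0 = i.
print("value at k0 = F:", abs(factorised(2.0, 1j)))
```

The second printed value illustrates why the points $k_0 \in \pm F(I)$ have to be excluded from the analyticity region.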

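5 Proof of Lemma 4.2

We start by introducing two auxiliary results.

Lemma 5.1
$$\|f\|'^{\,2}_{\delta} := \sum_{x \in \mathbb{Z}^{d+1}} e^{2 \sum_{i=0}^{d} \delta_i |x_i|}\, |\check{f}(x)|^2$$
is a norm in $A_{\delta}$ equivalent to $\|\cdot\|_{\delta}$, and $U : f \mapsto e^{\sum_{i=0}^{d} \delta_i |x_i|}\, \check{f}(x)$ is a unitary map from $(A_{\delta}, \|\cdot\|'_{\delta})$ into $\ell^2(\mathbb{Z}^{d+1})$ with
$$U^{-1} f(p) = (2\pi)^{-\frac{d+1}{2}} \sum_{x \in \mathbb{Z}^{d+1}} e^{-i p\cdot x}\, e^{-\sum_{i=0}^{d} \delta_i |x_i|}\, f(x). \tag{5.1}$$
✷

Proof. Using the fact that $f(p + i\alpha) = (e^{\alpha\cdot x} \check{f}(x))^{\wedge}(p)$, and the Plancherel identity we have
$$\int |f(p + i\alpha)|^2\, dp = \sum_{x} e^{2\alpha\cdot x}\, |\check{f}(x)|^2 \leq \sum_{x} e^{2\delta |x|}\, |\check{f}(x)|^2, \qquad \forall\, |\alpha| \leq \delta.$$
Therefore, $\|f\|_{\delta} \leq \|f\|'_{\delta}$. On the other hand, one has for all $|\alpha| \leq \delta$
$$\sum_{x \geq 0} e^{2\alpha\cdot x}\, |\check{f}(x)|^2 \leq \sum_{x} e^{2\alpha\cdot x}\, |\check{f}(x)|^2 = \int |f(p + i\alpha)|^2\, dp \leq \sup_{|\alpha| \leq \delta} \int |f(p + i\alpha)|^2\, dp = \|f\|_{\delta}^2. \tag{5.2}$$
In particular, for $\alpha = \delta$ in (5.2) we have
$$\sum_{x \geq 0} e^{2\delta\cdot x}\, |\check{f}(x)|^2 \leq \|f\|_{\delta}^2. \tag{5.3}$$
Analogously (but taking now $\alpha = -\delta$), we have
$$\sum_{x < 0} e^{-2\delta\cdot x}\, |\check{f}(x)|^2 \leq \|f\|_{\delta}^2. \tag{5.4}$$
[…]

Lemma 5.2 […] $a > 0$, then
$$\lim_{y \to 0^+} \int_0^{\infty} \frac{2xy}{(a^2 - x^2)^2 + 2y^2 (a^2 + x^2) + y^4}\, h(x)\, dx = \frac{\pi}{2a}\, h(a).$$
✷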
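The limit in Lemma 5.2 can be observed numerically. The sketch below evaluates the integral by a midpoint rule for decreasing values of $y$ and compares it with $\pi h(a)/(2a)$. The test function $h$ (a smooth bump supported in $(0.5, 1.5)$), the value $a = 1$ and the grid size are illustrative assumptions, chosen only so that $h$ is continuous with compact support, as the proof requires.

```python
import math


def bump(x: float) -> float:
    """Smooth test function supported in (0.5, 1.5); an illustrative choice of h."""
    if x <= 0.5 or x >= 1.5:
        return 0.0
    return math.exp(-1.0 / ((x - 0.5) * (1.5 - x)))


def lemma_integral(a: float, y: float, n: int = 200_000) -> float:
    """Midpoint rule for the integral of Lemma 5.2, restricted to the support of h."""
    lower, upper = 0.5, 1.5
    step = (upper - lower) / n
    total = 0.0
    for j in range(n):
        x = lower + (j + 0.5) * step
        denom = (a * a - x * x) ** 2 + 2.0 * y * y * (a * a + x * x) + y ** 4
        total += 2.0 * x * y / denom * bump(x)
    return total * step


a = 1.0
target = math.pi / (2.0 * a) * bump(a)
for y in (1e-1, 1e-2, 1e-3):
    print(f"y = {y:g}: integral = {lemma_integral(a, y):.6f}, limit = {target:.6f}")
```

The grid spacing is kept well below the smallest $y$ so that the narrow peak of the kernel around $x = a$ is resolved.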

Proof. By the change of variables $x = \sqrt{2ayv + a^2 - y^2}$, one has
$$\int_0^{\infty} \frac{2xy}{(a^2 - x^2)^2 + 2y^2 (a^2 + x^2) + y^4}\, h(x)\, dx = \frac{1}{2a} \int_{-\frac{a^2 - y^2}{2ya}}^{\infty} \frac{1}{v^2 + 1}\, h\big(\sqrt{2ayv + a^2 - y^2}\big)\, dv. \tag{5.5}$$
Defining $h_1(v) := h(v) - h(a)$, the last integral in (5.5) can be written as
$$h(a) \int_{-\frac{a^2 - y^2}{2ya}}^{\infty} \frac{1}{v^2 + 1}\, dv + \int_{-\frac{a^2 - y^2}{2ya}}^{\infty} \frac{1}{v^2 + 1}\, h_1\big(\sqrt{2ayv + a^2 - y^2}\big)\, dv. \tag{5.6}$$
Note that in the first integral above the lower integration limit tends to $-\infty$ as $y \to 0^+$ and, due to the regularity of the integrand, the integral tends to an integral over the whole real line, whose value, by an elementary computation, is $\pi$. Therefore, it suffices to prove that the second term in (5.6) tends to 0 when $y \to 0^+$. For this, observe that
$$\left| \int_{-\frac{a^2 - y^2}{2ya}}^{\infty} \frac{1}{v^2 + 1}\, h_1\big(\sqrt{2ayv + a^2 - y^2}\big)\, dv \right| \leq \pi \sup_{v \in \mathbb{R}} \big| h_1\big(\sqrt{2ayv + a^2 - y^2}\big) \big|;$$
and that, by continuity of $h_1$, it follows that $\lim_{y \to 0^+} h_1\big(\sqrt{2ayv + a^2 - y^2}\big) = h_1(a) = 0$. This limit is uniform in $v$, by the compactness of the support of $h$.

The hard work in the proof of Lemma 4.2 really concerns the proof of the analyticity of the family of operators. To do this, using (4.4) and (4.5), note first that
$$[K(k) R_0(k) f](p) = \int K(k, p, q)\, R_0(k, q)\, f(q)\, dq.$$
So, it is sufficient to prove that the map
$$k_0 \mapsto \int R_0((k_0, \mathbf{k}), p)\, f_1(p)\, f_2(p)\, dp; \qquad f_1, f_2 \in A_{\delta}, \tag{5.7}$$
is analytic in $D_1$, since it is not hard to prove, using H2 and Lemma 5.1, that $q \mapsto K(k, p, q)$ is in $A_{\delta}$. Replacing $\hat{S}_2$ in (4.5) by the expression in H1, we can write
$$R_0(k, p) = R_{00}(k, p) + R_{01}(k, p), \tag{5.8}$$
where
$$R_{00}(k, p) := \frac{2(2\pi)^{\frac{d+1}{2}}\, Z(\mathbf{p} + \mathbf{k}) \sinh \omega(\mathbf{p} + \mathbf{k})\; Z(\mathbf{p} - \mathbf{k}) \sinh \omega(\mathbf{p} - \mathbf{k})}{[\cosh \omega(\mathbf{p} + \mathbf{k}) - \cos(p_0 + k_0)]\, [\cosh \omega(\mathbf{p} - \mathbf{k}) - \cos(p_0 - k_0)]}, \tag{5.9}$$

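and $R_{01} := R_0 - R_{00}$. Note that, alternatively,
$$R_{00}(k, p) = \frac{2(2\pi)^{\frac{d+1}{2}}\, Z(\mathbf{p} + \mathbf{k}) \sinh \omega(\mathbf{p} + \mathbf{k})\; Z(\mathbf{p} - \mathbf{k}) \sinh \omega(\mathbf{p} - \mathbf{k})}{[\cosh \omega(\mathbf{p} - \mathbf{k}) - \cosh \omega(\mathbf{p} + \mathbf{k})] - 2 \sin k_0 \sin p_0} \times \left( \frac{1}{\cosh \omega(\mathbf{p} + \mathbf{k}) - \cos(p_0 + k_0)} - \frac{1}{\cosh \omega(\mathbf{p} - \mathbf{k}) - \cos(p_0 - k_0)} \right). \tag{5.10}$$
For convenience, we denote $g(p) := f_1(p) f_2(p)$, and define
$$g_1(p) = g(p) - g(0, \mathbf{p}). \tag{5.11}$$
Introducing (5.11) and (5.8) into (5.7), we get
$$\int_{\mathbb{T}^{d+1}} R_0((k_0, \mathbf{k}), p)\, g(p)\, dp = \int_{\mathbb{T}^{d+1}} R_{00}((k_0, \mathbf{k}), p)\, g(0, \mathbf{p})\, dp + \int_{\mathbb{T}^{d+1}} R_{00}((k_0, \mathbf{k}), p)\, g_1(p)\, dp + \int_{\mathbb{T}^{d+1}} R_{01}((k_0, \mathbf{k}), p)\, g(p)\, dp. \tag{5.12}$$
The idea is to prove the lemma for each term in (5.12) separately. Let us start considering the first term in (5.12). Assuming $|\mathrm{Im}\, k_0| < m$ and using (5.9), the first term can be written as
$$\int_{\mathbb{T}^d} 2(2\pi)^{\frac{d+1}{2}}\, Z(\mathbf{p} + \mathbf{k}) \sinh \omega(\mathbf{p} + \mathbf{k})\; Z(\mathbf{p} - \mathbf{k}) \sinh \omega(\mathbf{p} - \mathbf{k}) \left[ \int_{-\pi}^{\pi} \frac{dp_0}{[\cosh \omega(\mathbf{p} + \mathbf{k}) - \cos(p_0 + k_0)]\, [\cosh \omega(\mathbf{p} - \mathbf{k}) - \cos(p_0 - k_0)]} \right] g(0, \mathbf{p})\, d\mathbf{p}. \tag{5.13}$$
Denoting $z = e^{i p_0}$, the integral in brackets can be written as
$$\oint_{|z| = 1} \frac{-4iz}{(z - \alpha_+)(z - \alpha_-)(z - \beta_+)(z - \beta_-)}\, dz,$$
where $\alpha_{\pm} = e^{\pm \omega(\mathbf{p} + \mathbf{k})} e^{-i k_0}$ and $\beta_{\pm} = e^{\pm \omega(\mathbf{p} - \mathbf{k})} e^{i k_0}$. This integral can be evaluated by the method of residues (by noticing that the only poles contributing are $\alpha_-$ and $\beta_-$), giving
$$\frac{2\pi \sinh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})]}{\sinh \omega(\mathbf{p} + \mathbf{k})\, \sinh \omega(\mathbf{p} - \mathbf{k})\, \big[\cosh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})] - \cos 2k_0\big]}.$$
Therefore, (5.13) equals
$$2(2\pi)^{\frac{d+3}{2}} \int_{\mathbb{T}^d} \frac{Z(\mathbf{p} + \mathbf{k})\, Z(\mathbf{p} - \mathbf{k})\, \sinh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})]}{\cosh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})] - \cos 2k_0}\, g(0, \mathbf{p})\, d\mathbf{p}. \tag{5.14}$$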
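The residue evaluation of the $p_0$ integral in (5.13) can be cross-checked by direct quadrature. In the sketch below, `a` and `b` stand for $\omega(\mathbf{p} + \mathbf{k})$ and $\omega(\mathbf{p} - \mathbf{k})$, and `k0` is a complex number with $|\mathrm{Im}\, k_0| < \min(a, b)$, so that only $\alpha_-$ and $\beta_-$ lie inside the unit circle. All names and sample values are illustrative assumptions.

```python
import cmath
import math


def p0_integral(a: float, b: float, k0: complex, n: int = 4000) -> complex:
    """Trapezoidal rule over one period for the bracketed p0 integral in (5.13)."""
    step = 2.0 * math.pi / n
    total = 0j
    for j in range(n):
        p0 = -math.pi + j * step
        total += 1.0 / ((math.cosh(a) - cmath.cos(p0 + k0))
                        * (math.cosh(b) - cmath.cos(p0 - k0)))
    return total * step


def residue_formula(a: float, b: float, k0: complex) -> complex:
    """2 pi sinh(a + b) / (sinh a sinh b [cosh(a + b) - cos 2 k0])."""
    return (2.0 * math.pi * math.sinh(a + b)
            / (math.sinh(a) * math.sinh(b) * (math.cosh(a + b) - cmath.cos(2.0 * k0))))


a, b = 0.8, 1.1        # illustrative values of omega(p + k) and omega(p - k)
k0 = 0.3 + 0.4j        # satisfies |Im k0| < min(a, b)
print("quadrature:", p0_integral(a, b, k0))
print("residues:  ", residue_formula(a, b, k0))
```

For a periodic analytic integrand the trapezoidal rule converges geometrically, so a moderate number of nodes is enough for close agreement between the two values.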

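Writing $k_0 = x_0 + i y_0 \in \mathbb{C}$, $x_0, y_0 \in \mathbb{R}$, the denominator in (5.14) vanishes in $D_1$ if and only if $x_0 = 0$ and $y_0 \geq m$ (because $\omega(q) \geq m$). Hence, expression (5.14) can be analytically extended to $D_1$. We can also compute the contribution of this first term to the distribution in the l.h.s. of (4.8). Note that
$$\lim_{y \to 0^+} \int_0^{\infty} \mathrm{Im}\, \langle f, R(k_0, \mathbf{k}) f\rangle\, h(x)\, dx = \lim_{y \to 0^+} \int_0^{\infty} \mathrm{Im}\, \langle W f, R_0(k_0, \mathbf{k}) W f\rangle\, h(x)\, dx$$
$$= \lim_{y \to 0^+} \int_0^{\infty} \mathrm{Im} \left[ \int_{\mathbb{T}^{d+1}} R_0((k_0, \mathbf{k}), p)\, \overline{W f}(p)\, W f(p)\, dp \right] h(x)\, dx.$$
In the first equality we have used Lemma 5.3 of [2]. The integral in $dp$ in the second line above is as in (5.7) with $f_1 = \overline{W f}$ and $f_2 = W f$. Using (5.14), the first term in (5.12) gives the contribution
$$2(2\pi)^{\frac{d+3}{2}} \lim_{y \to 0^+} \int_0^{\infty} \mathrm{Im} \left[ \int_{\mathbb{T}^d} \frac{Z(\mathbf{p} + \mathbf{k})\, Z(\mathbf{p} - \mathbf{k})\, \sinh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})]}{\cosh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})] - \cos 2k_0}\; \overline{W f}\, W f(0, \mathbf{p})\, d\mathbf{p} \right] h(x)\, dx$$
$$= 2(2\pi)^{\frac{d+3}{2}} \int_{\mathbb{T}^d} Z(\mathbf{p} + \mathbf{k})\, Z(\mathbf{p} - \mathbf{k})\, \sinh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})]\; \overline{W f}\, W f(0, \mathbf{p}) \left[ \lim_{y \to 0^+} \int_0^{\infty} \mathrm{Im}\, \frac{1}{\cosh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})] - \cos 2k_0}\, h(x)\, dx \right] d\mathbf{p}. \tag{5.15}$$
In the last equality above we have used Fubini’s and dominated convergence theorems to interchange the integrals. Writing $\cos k_0 - 1 =: x + iy$, $x, y \in \mathbb{R}$, straightforward calculations show that
$$\mathrm{Im}\, \frac{1}{\cosh[\omega(\mathbf{p} + \mathbf{k}) + \omega(\mathbf{p} - \mathbf{k})] - \cos 2k_0} = \frac{1}{2}\, \mathrm{Im}\, \frac{1}{\theta^2(\mathbf{p}) - \cos^2 k_0} = \frac{1}{2}\, \frac{2(x + 1)y}{[\theta^2 - (x + 1)^2]^2 + 2y^2 [\theta^2 + (x + 1)^2] + y^4},$$
(here we have used $\theta = \theta(\mathbf{p})$ for simplicity). From Lemma 5.2, the term in brackets in (5.15) is given by $\pi\, h(\theta - 1)/4\theta$. Hence, the first term in (5.12) gives the total contribution to (4.9).

Consider now the second term in (5.12) (the proof for the third one is analogous). We split the integral into two regions: $\int_{\mathbb{T}^{d+1}} R_{00}((k_0, \mathbf{k}), p)\, g_1(p)\, dp = \int + \int$. Denoting $|p|$ […] $> 0$ such that
$$0 < c \leq \frac{|p \cdot Bp|}{|p|^2}, \qquad \forall\, p \neq 0,$$
(see 3.2.12 in [23]) and this controls the first term in the denominator of (5.32).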

A

The B-S Kernel on the Lattice

We have started with n-point functions defined on the lattice but, using Fourier transforms, we translated the problem to momentum space. This is done passing from D to R via D. So, we have introduced the B-S equation directly in momentum

1088



space. This can be done in an earlier stage, directly on the lattice, by taking D = D0 − D0 N D, or, in expanded form, D(x1 , x2 , x3 , x4 ) = D0 (x1 , x2 , x3 , x4 ) − D0 (x1 , x2 , y1 , y2 )N (y1 , y2 , y3 , y4 )D(y3 , y4 , x3 , x4 ), (A.1) y1 ,y2 ,y3 ,y4 ∈Zd+1

and using the Fourier transform and appropriate coordinate changes to write down (A.1) in momentum space. In this appendix we carry on this work, showing an alternative approach to [43], that enables us to prove the relation (A.14). Considered as a kernel, N has several symmetries, among them translation invariance, from which there exists N such that N (y1 , y2 , y3 , y4 ) = N (y1 − y4 , y2 − y4 , y3 − y4 ). In fact, we need only to define N (y1 , y2 , y3 ) := N (y1 , y2 , y3 , 0). In addition to the variables (4.1) we introduce u1 := x1 + x2 − (y1 + y2 ), u3 := y3 + y4 − (x3 + x4 ),

u2 := y1 − y2 , u4 := y3 − y4 .

(A.2)

In analogy to (4.2) and (4.3) we have

u1 + (ξ − u2 ) u1 − (ξ − u2 ) D0 (x1 , x2 , y1 , y2 ) = S2 S2 2 2 u1 + (ξ + u2 ) u1 − (ξ + u2 ) + S2 S2 = D0 (u1 , ξ, u2 ), (A.3) 2 2 D(y3 , y4 , x3 , x4 ) = S4

u3 + u4 + η u3 − u4 + η , , η − S2 (u4 )S2 (η) 2 2 = D(u3 , u4 , η) (A.4)

and N (y1 , y2 , y3 , y4 ) = N (y1 − y4 , y2 − y4 , y3 − y4 ) (τ − u1 − u3 ) + u2 + u4 (τ − u1 − u3 ) − u2 + u4 , , u4 = N 2 2 ˇ − u1 − u3 , u2 , u4 ). (A.5) =: K(τ Difficulties arise from the fact that transformations (4.1) and (A.2) are not bijective. If we wish that new coordinates vary freely in Zd+1 we have to restrict ˇ to the image of this transformation. To consider this the new functions D, D0 , K image, denote by ξi , ηi , τi the i-th coordinate of the vectors ξ, η and τ , respectively. Now, if a and b are integer numbers, then the parity of a + b is the same of a − b. Hence, we have from the definitions that τi is even if and only if ξi and ηi have


the same parity, and, analogously, τi is odd if and only if ξi and ηi have different parity. Therefore, the characteristic function of the image of the i-th coordinate of the transformation (4.1) is given by

\[
\Xi(\xi_i,\eta_i,\tau_i) := \left[\chi_E(\xi_i)\chi_E(\eta_i) + \chi_O(\xi_i)\chi_O(\eta_i)\right]\chi_E(\tau_i) + \left[\chi_O(\xi_i)\chi_E(\eta_i) + \chi_E(\xi_i)\chi_O(\eta_i)\right]\chi_O(\tau_i),
\]

where χE and χO denote the characteristic functions of the sets of even and odd numbers, respectively. As this is valid for every coordinate, the characteristic function of the image is given by

\[
I(\tau,\xi,\eta) := \prod_{i=0}^{d} \Xi(\xi_i,\eta_i,\tau_i). \tag{A.6}
\]

Hence, for generic τ, ξ, η ∈ Z^{d+1} we have to define D(τ, ξ, η) = I(τ, ξ, η) D(τ, ξ, η), D0(τ, ξ, η) = I(τ, ξ, η) D0(τ, ξ, η), Ǩ(τ, ξ, η) = I(τ, ξ, η) Ǩ(τ, ξ, η). Incidentally, the function I defined in (A.6) is totally symmetric under permutations of the variables. To consider the image of the transformation (A.2), observe that (A.2) has the form y → u = T(y) := Ay + b, with b = b1 ⊕ b2 and A = A1 ⊕ A2, where

\[
A_1 = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix}, \quad b_1 = \begin{pmatrix} x_1+x_2 \\ 0 \end{pmatrix}, \qquad
A_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad b_2 = \begin{pmatrix} -(x_3+x_4) \\ 0 \end{pmatrix}.
\]

Here, by abuse of notation, the symbol 1 denotes the (d + 1) × (d + 1) identity matrix. Hence, the characteristic function of the image of T is given by the product of the characteristic functions of the images of T1 and T2: χT(u1, u2, u3, u4) = χT1(u1, u2) χT2(u3, u4), where we define Ti(y) := Ai y + bi, i = 1, 2. By analogy with the former case it is easy to verify, by observing that the sets of variables (u1, ξ, u2) and (u3, u4, η) are formally analogous to (τ, ξ, η), that χT1(u1, u2) = I(u1, ξ, u2) and χT2(u3, u4) = I(u3, u4, η). Introducing (4.2)-(4.3) and (A.3)-(A.5) into (A.1), we get

\[
D(\tau,\xi,\eta) = D_0(\tau,\xi,\eta) - \sum_{u_1,u_2,u_3,u_4\in\mathbb{Z}^{d+1}} \chi_T(u_1,u_2,u_3,u_4)\, D_0(u_1,\xi,u_2)\, \check K(\tau-u_1-u_3,\,u_2,\,u_4)\, D(u_3,u_4,\eta).
\]




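Now, denoting R := \hat{D} and R0 := \hat{D}_0, taking the Fourier transform above, we have, after the change of variable τ′ := τ − u1 − u3 and a reordering of summations, that

\[
\begin{aligned}
R(k,p,q) = R_0(k,p,q) - (2\pi)^{-\frac{3(d+1)}{2}} \sum_{\tau',u_2,u_4\in\mathbb{Z}^{d+1}} e^{-i k\cdot\tau'}\, \check K(\tau',u_2,u_4)
&\times \Bigg( \sum_{\xi,u_1\in\mathbb{Z}^{d+1}} e^{-i k\cdot u_1} e^{-i p\cdot\xi}\, \chi_{T_1}(u_1,u_2,\xi)\, D_0(u_1,\xi,u_2) \Bigg) \\
&\times \Bigg( \sum_{\eta,u_3\in\mathbb{Z}^{d+1}} e^{-i k\cdot u_3} e^{-i q\cdot\eta}\, \chi_{T_2}(u_3,u_4,\eta)\, D(u_3,u_4,\eta) \Bigg).
\end{aligned} \tag{A.7}
\]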

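Writing down D0 as the inverse Fourier transform of R0, we have, after interchanging summations and integrals, that the first factor in parentheses in the sum above is given by a factor (2π)^{−3(d+1)/2} times

\[
\int_{\mathbb{T}^{d+1}}\!\int_{\mathbb{T}^{d+1}}\!\int_{\mathbb{T}^{d+1}} \Bigg( \sum_{\xi,u_1\in\mathbb{Z}^{d+1}} e^{-i(p-\beta)\cdot\xi}\, e^{-i(k-\alpha)\cdot u_1}\, I(u_1,\xi,u_2) \Bigg) e^{i u_2\cdot\gamma}\, R_0(\alpha,\beta,\gamma)\, d\alpha\, d\beta\, d\gamma. \tag{A.8}
\]

If we define

\[
\lambda_i := e^{-i(p_i-\beta_i)\xi_i}\, e^{-i(k_i-\alpha_i)u_1^i}\, \Xi(\xi_i, u_2^i, u_1^i),
\]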

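where u_1^i and u_2^i denote the i-th coordinates of u1 and u2, respectively, the factor in parentheses in the integrand in (A.8) becomes

\[
\sum_{\xi_0\in\mathbb{Z}} \cdots \sum_{\xi_d\in\mathbb{Z}}\ \sum_{u_1^0\in\mathbb{Z}} \cdots \sum_{u_1^d\in\mathbb{Z}}\ \prod_{i=0}^{d} \lambda_i = \prod_{i=0}^{d} \Bigg( \sum_{\xi_i\in\mathbb{Z}} \sum_{u_1^i\in\mathbb{Z}} \lambda_i \Bigg),
\]

because each λi depends only on ξi and on u_1^i. Therefore, (A.8) results in

\[
\int_{\mathbb{T}^{d+1}} d\gamma\, e^{i u_2\cdot\gamma} \int_{-\pi}^{\pi}\!\cdots\!\int_{-\pi}^{\pi} d\alpha_0 \cdots d\alpha_d\, d\beta_0 \cdots d\beta_d \Bigg( \prod_{i=0}^{d} \Lambda_i \Bigg) R_0(\alpha,\beta,\gamma), \tag{A.9}
\]

where

\[
\Lambda_i := \sum_{\xi_i\in\mathbb{Z}} \sum_{u_1^i\in\mathbb{Z}} \lambda_i.
\]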


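To proceed, we have unfortunately to write Λi more explicitly. We have

\[
\begin{aligned}
\Lambda_i ={}& \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{even}}} e^{-i(k_i-\alpha_i)u_1^i} \sum_{\xi_i\in\mathbb{Z}} e^{-i(p_i-\beta_i)\xi_i} \left[\chi_E(\xi_i)\chi_E(u_2^i) + \chi_O(\xi_i)\chi_O(u_2^i)\right] \\
&+ \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{odd}}} e^{-i(k_i-\alpha_i)u_1^i} \sum_{\xi_i\in\mathbb{Z}} e^{-i(p_i-\beta_i)\xi_i} \left[\chi_O(\xi_i)\chi_E(u_2^i) + \chi_E(\xi_i)\chi_O(u_2^i)\right] \\
={}& \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{even}}} e^{-i(k_i-\alpha_i)u_1^i} \Bigg[ \chi_E(u_2^i) \sum_{n\in\mathbb{Z}} e^{-i(p_i-\beta_i)2n} + \chi_O(u_2^i)\, e^{-i(p_i-\beta_i)} \sum_{n\in\mathbb{Z}} e^{-i(p_i-\beta_i)2n} \Bigg] \\
&+ \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{odd}}} e^{-i(k_i-\alpha_i)u_1^i} \Bigg[ \chi_E(u_2^i)\, e^{-i(p_i-\beta_i)} \sum_{n\in\mathbb{Z}} e^{-i(p_i-\beta_i)2n} + \chi_O(u_2^i) \sum_{n\in\mathbb{Z}} e^{-i(p_i-\beta_i)2n} \Bigg] \\
={}& \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{even}}} e^{-i(k_i-\alpha_i)u_1^i} \left[\chi_E(u_2^i) + \chi_O(u_2^i)\, e^{-i(p_i-\beta_i)}\right] \sum_{n\in\mathbb{Z}} e^{-i(p_i-\beta_i)2n} \\
&+ \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{odd}}} e^{-i(k_i-\alpha_i)u_1^i} \left[\chi_E(u_2^i)\, e^{-i(p_i-\beta_i)} + \chi_O(u_2^i)\right] \sum_{n\in\mathbb{Z}} e^{-i(p_i-\beta_i)2n}
\end{aligned}
\]

and, hence,

\[
\begin{aligned}
\Lambda_i ={}& \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{even}}} e^{-i(k_i-\alpha_i)u_1^i} \left[\chi_E(u_2^i) + \chi_O(u_2^i)\, e^{-i(p_i-\beta_i)}\right] \pi\,\delta(p_i-\beta_i) \\
&+ \sum_{\substack{u_1^i\in\mathbb{Z}\\ u_1^i\ \mathrm{odd}}} e^{-i(k_i-\alpha_i)u_1^i} \left[\chi_E(u_2^i)\, e^{-i(p_i-\beta_i)} + \chi_O(u_2^i)\right] \pi\,\delta(p_i-\beta_i).
\end{aligned}
\]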




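If we replace Λi in (A.9) by the expression in the last line above we get, after integration in β,

\[
\begin{aligned}
&\int_{\mathbb{T}^{d+1}} d\gamma\, e^{i u_2\cdot\gamma} \int_{-\pi}^{\pi} d\alpha_0 \cdots \int_{-\pi}^{\pi} d\alpha_d\ \prod_{i=0}^{d} \Bigg[ \pi \sum_{u_1^i\in\mathbb{Z}} e^{-i(k_i-\alpha_i)u_1^i} \Bigg] R_0(\alpha,p,\gamma) \\
&\qquad = \pi^{d+1} \int_{\mathbb{T}^{d+1}} d\gamma\, e^{i u_2\cdot\gamma} \int_{-\pi}^{\pi} d\alpha_0 \cdots \int_{-\pi}^{\pi} d\alpha_d\ \prod_{i=0}^{d} \left[ 2\pi\,\delta(k_i-\alpha_i) \right] R_0(\alpha,p,\gamma) \\
&\qquad = 2^{d+1}\pi^{2(d+1)} \int_{\mathbb{T}^{d+1}} e^{i u_2\cdot\gamma}\, R_0(k,p,\gamma)\, d\gamma.
\end{aligned} \tag{A.10}
\]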

Recall that this expression corresponds to (A.8) which, in turn, corresponds to (2π)^{3(d+1)/2} times the first factor in parentheses of (A.7). Analogously, the second factor in parentheses in (A.7) is given by

\[
\left(\frac{\pi}{2}\right)^{\frac{d+1}{2}} \int_{\mathbb{T}^{d+1}} e^{i u_4\cdot\beta}\, R(k,\beta,q)\, d\beta. \tag{A.11}
\]

Introducing (A.10) and (A.11) into (A.7), and changing the order of sums and integrals, we obtain

\[
R(k,p,q) = R_0(k,p,q) - \int_{\mathbb{T}^{d+1}}\!\int_{\mathbb{T}^{d+1}} R_0(k,p,\alpha)\, K(k,\alpha,\beta)\, R(k,\beta,q)\, d\alpha\, d\beta. \tag{A.12}
\]

Here, by abuse of notation, we adopted K(k, α, β) := (2^5 π)^{−(d+1)/2} K(k, −α, −β). The analyticity properties of the functions on the left and on the right above will be basically the same. If in L²(T^{d+1}, dx) we define integral operators A(k) by (A(k)(f))(p) := ∫_{T^{d+1}} A(k, p, q) f(q) dq, where A = R, R0 or K, we can rewrite (A.12) as R(k) = R0(k) − R0(k)K(k)R(k). Identity (A.12) is the lattice analog of expression (2.6) in [2]. Note that (2π)^{3(d+1)/2} R0(k, p, q) is given by

\[
\begin{aligned}
&\sum_{\tau,\xi,\eta\in\mathbb{Z}^{d+1}} e^{-i(k\cdot\tau+p\cdot\xi+q\cdot\eta)}\, I(\tau,\xi,\eta)\, S_2\!\left(\frac{\tau+(\xi-\eta)}{2}\right) S_2\!\left(\frac{\tau-(\xi-\eta)}{2}\right) \\
&\quad + \sum_{\tau,\xi,\eta\in\mathbb{Z}^{d+1}} e^{-i(k\cdot\tau+p\cdot\xi+q\cdot\eta)}\, I(\tau,\xi,\eta)\, S_2\!\left(\frac{\tau+(\xi+\eta)}{2}\right) S_2\!\left(\frac{\tau-(\xi+\eta)}{2}\right).
\end{aligned} \tag{A.13}
\]

In the first term above we now perform the change of variables given by

\[
\tau = u + v, \qquad \xi = u - v + w, \qquad \eta = w.
\]


In terms of these variables we have

\[
\begin{aligned}
I(\tau,\xi,\eta) &= \prod_{i=0}^{d} \Xi(u_i - v_i + w_i,\ w_i,\ u_i + v_i) \\
&= \prod_{i=0}^{d} \Big\{ \left[\chi_E(u_i - v_i + w_i)\chi_E(w_i) + \chi_O(u_i - v_i + w_i)\chi_O(w_i)\right] \chi_E(u_i + v_i) \\
&\qquad\quad + \left[\chi_O(u_i - v_i + w_i)\chi_E(w_i) + \chi_E(u_i - v_i + w_i)\chi_O(w_i)\right] \chi_O(u_i + v_i) \Big\} =: J(u,v,w).
\end{aligned}
\]

Note that ui + vi is even if and only if ui and vi have the same parity, and in this case ui − vi + wi and wi will have the same parity (that will be the parity of wi). Analogously, ui + vi is odd if and only if ui and vi have different parity, in which case ui − vi + wi and wi will have different parity, independently of the parity of wi. Therefore, J defined above is identically equal to 1: J(u, v, w) = 1. Note also that I(τ, ξ, η) is not identically equal to 1, because the three variables are independent. In the case of J(u, v, w), although itself a function of three independent variables, two of them arise in the combinations u + v and u − v, which have the same parity. Hence, from the point of view of parity, there remain only two independent variables. It is for this reason that the values of J(u, v, w) are restricted, to the point of being identically one. With these observations, the first term in (A.13) becomes

\[
\sum_{u,v,w\in\mathbb{Z}^{d+1}} e^{-i[(k+p)\cdot u + (k-p)\cdot v + (p+q)\cdot w]}\, S_2(u)\, S_2(v) = (2\pi)^{2(d+1)}\, \delta(p+q)\, S_2(k+p)\, S_2(k-p).
\]
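The parity argument used above can be checked by brute force. The following Python sketch (function names are illustrative and not taken from the paper) implements Ξ as defined before (A.6) and verifies, on a small range of integers, that the substituted function equals 1 for the change of variables τ = u + v, ξ = u − v + w, η = w, and also for the variant ξ = u − v − w used for the second term of (A.13), whereas Ξ itself is not identically 1.

```python
from itertools import product


def chi_even(n: int) -> int:
    """Characteristic function of the even integers."""
    return 1 if n % 2 == 0 else 0


def chi_odd(n: int) -> int:
    """Characteristic function of the odd integers."""
    return 1 - chi_even(n)


def xi_char(xi: int, eta: int, tau: int) -> int:
    """One-coordinate function Xi(xi, eta, tau) entering (A.6)."""
    same = chi_even(xi) * chi_even(eta) + chi_odd(xi) * chi_odd(eta)
    diff = chi_odd(xi) * chi_even(eta) + chi_even(xi) * chi_odd(eta)
    return same * chi_even(tau) + diff * chi_odd(tau)


def j_first(u: int, v: int, w: int) -> int:
    """Xi after tau = u + v, xi = u - v + w, eta = w."""
    return xi_char(u - v + w, w, u + v)


def j_second(u: int, v: int, w: int) -> int:
    """Xi after tau = u + v, xi = u - v - w, eta = w."""
    return xi_char(u - v - w, w, u + v)


GRID = range(-4, 5)  # small test range, an arbitrary choice
triples = list(product(GRID, repeat=3))

all_one_first = all(j_first(u, v, w) == 1 for u, v, w in triples)
all_one_second = all(j_second(u, v, w) == 1 for u, v, w in triples)
not_identically_one = any(xi_char(a, b, c) == 0 for a, b, c in triples)

print(all_one_first, all_one_second, not_identically_one)
```

A finite check of this kind is of course no proof, but since Ξ depends only on the parities of its arguments, a range covering both parities in each variable already exhausts all cases.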

The second term in (A.13) can be handled in the same way, with the change of variables given now by

\[
\tau = u + v, \qquad \xi = u - v - w, \qquad \eta = w,
\]

which gives in this case (2π)^{2(d+1)} δ(p − q) S2(k + p) S2(k − p). Therefore,

\[
R_0(k,p,q) = (2\pi)^{\frac{d+1}{2}}\, S_2(k+p)\, S_2(k-p) \left[\delta(p+q) + \delta(p-q)\right].
\]

In particular, acting on symmetric functions, i.e., f(p) = f(−p), the integral operator R0(k) is given by the kernel R0(k, p, q) = R0(k, p) δ(p + q), where

\[
R_0(k,p) := 2\,(2\pi)^{\frac{d+1}{2}}\, S_2(k+p)\, S_2(k-p). \tag{A.14}
\]

This last identity coincides with expression (2.5) of [2], a fact that is crucial to our analysis.




Acknowledgment. We are grateful to M. O’Carroll and R. S. Schor for discussions and suggestions.

References

[1] T. Spencer, The Decay of Bethe-Salpeter Kernel in P(φ)_2 Quantum Field Models, Comm. Math. Phys. 44, 143–164 (1975).
[2] T. Spencer and F. Zirilli, Scattering and Bounded States in λP(φ)_2, Comm. Math. Phys. 49, 1–16 (1976).
[3] F. Auil, Completeza Assintótica em Teoria Quântica de Campos na Rede, PhD Thesis, Universidade de São Paulo, 2000.
[4] F. Auil, in preparation.
[5] J. C. A. Barata and K. Fredenhagen, Particle Scattering in Euclidean Lattice Field Theories, Comm. Math. Phys. 138, 507–519 (1991).
[6] J. C. A. Barata, Reduction Formulae for Euclidean Lattice Theories, Comm. Math. Phys. 143, 545–558 (1992).
[7] J. C. A. Barata, S-Matrix Elements in Euclidean Lattice Theories, Rev. Math. Phys. 6 (3), 497–513 (1994).
[8] D. Buchholz, On Particles, Infraparticles and the Problem of Asymptotic Completeness, VIII-th International Congress on Mathematical Physics, Marseille 1986. Eds. Mebkhout, Sénéor. World Scientific, Singapore, 1987.
[9] D. Buchholz, Harmonic Analysis of Local Operators, Commun. Math. Phys. 129, 631–641 (1990).
[10] D. Buchholz, M. Porrmann and U. Stein, Dirac versus Wigner. Towards a Universal Particle Concept in Local Quantum Field Theory, Phys. Lett. B 267, 377–381 (1991).
[11] O. Steinmann, Asymptotic Completeness in QED (I). Quasilocal States, Nucl. Phys. B350, 355–374 (1991).
[12] M. Combescure and F. Dunlop, n-Particle-Irreducible Functions in Euclidean Quantum Field Theory, Ann. Phys. 122, 102–150 (1979).
[13] M. Combescure and F. Dunlop, Three Body Asymptotic Completeness for P(φ)_2 Models, Comm. Math. Phys. 85, 381–418 (1982).
[14] T. Spencer, The Absence of Even Bound States for λ(ϕ^4)_2, Comm. Math. Phys. 39, 77–79 (1974).


[15] J. Dimock and J.-P. Eckmann, On the Bound State in Weakly Coupled λ(ϕ^6 − ϕ^4)_2, Comm. Math. Phys. 51, 41–54 (1976).
[16] J. Dimock and J.-P. Eckmann, Spectral Properties and Bound-State Scattering for Weakly Coupled λP(φ)_2 Models, Ann. Phys. 103, 289–314 (1977).
[17] R. Neves da Silva, Three Particle Bound States in even λP(φ)_2 Models, Helv. Phys. Acta 54, 131–190 (1981).
[18] N. Dunford and J. T. Schwartz, Linear Operators. Part I: General Theory, Interscience, New York, 1958.
[19] K. Osterwalder and E. Seiler, Gauge Field Theories on the Lattice, Ann. Phys. 110, 440–471 (1978).
[20] K. Fredenhagen, On the Existence of the Real Time Evolution in Euclidean Lattice Gauge Theories, Comm. Math. Phys. 101, 579–587 (1985).
[21] D. Iagolnitzer and J. Magnen, Asymptotic Completeness and Multiparticle Structure in Field Theories, Comm. Math. Phys. 110, 51–74 (1987).
[22] P. Paes-Leme, Ornstein-Zernike and Analyticity Properties of Classical Lattice Spin Systems, Ann. Physics 115, 367–387 (1978).
[23] G. K. Pedersen, Analysis Now, Springer-Verlag, New York, 1989.
[24] E. Lieb, D. Mattis and T. Shultz, Two-dimensional Ising Model as a Soluble Problem of Many Fermions, Rev. Mod. Phys. 36, 856–871 (1964).
[25] R. A. Minlos and Ya. G. Sinai, Investigation of the Spectra of Stochastic Operators Arising in Lattice Models of a Gas, Theoret. and Math. Phys. 2, 167–176 (1970).
[26] R. S. Schor, The Particle Structure of ν-Dimensional Ising Models at Low Temperatures, Comm. Math. Phys. 59, 213–233 (1978).
[27] R. S. Schor, Existence of Glueballs in Strongly Coupled Lattice Gauge Theories, Nuclear Phys. B222, 71–82 (1983).
[28] M. O’Carroll, Analyticity Properties and a Convergent Expansion for the Inverse Correlation Length of the High Temperature d-dimensional Ising Model, J. Stat. Phys. 34, 597–608 (1984).
[29] M. O’Carroll and W. D. Barbosa, Analyticity Properties and a Convergent Expansion for the Inverse Correlation Length of the Low Temperature d-dimensional Ising Model, J. Stat. Phys. 34, 609–614 (1984).
[30] M. O’Carroll and G. Braga, Analyticity Properties and a Convergent Expansion for the Glueball Mass and Dispersion Relation Curve of Strongly Coupled Euclidean 2+1 Lattice Gauge Theories, J. Math. Phys. 25, 2741–2743 (1984).




[31] R. S. Schor, Glueball Spectroscopy in Strongly Coupled Lattice Gauge Theories, Commun. Math. Phys. 92, 369–395 (1985).
[32] R. S. Schor and M. O’Carroll, On the Mass Spectrum of the 2+1 Gauge-Higgs Lattice Quantum Field Theory, Commun. Math. Phys. 103, 569–597 (1986).
[33] J. Bricmont and J. Fröhlich, Statistical Mechanical Methods in Particle Structure Analysis of Lattice Field Theories. Part I: General Results, Nucl. Phys. B251 [FS13], 517 (1985).
[34] J. Bricmont and J. Fröhlich, Statistical Mechanical Methods in Particle Structure Analysis of Lattice Field Theories. Part II: Scalar and Surface Models, Commun. Math. Phys. 98, 553–578 (1985).
[35] J. Bricmont and J. Fröhlich, Statistical Mechanical Methods in Particle Structure Analysis of Lattice Field Theories. Part III: Confinement and Bound States in Gauge Theories, Nucl. Phys. B280 [FS18], 385–444 (1987).
[36] R. A. Minlos, Spectral Expansion of the Transfer Matrix of Gibbs Fields, Sov. Sci. Rev. C. Math. Phys. 7, 235–280 (1988).
[37] V. A. Malyshev and R. A. Minlos, Linear Infinite-Particle Operators, Translations of Mathematical Monographs 143, AMS (1995).
[38] R. A. Minlos and E. A. Zhizhina, Meson States in Lattice QCD, Advances in Soviet Mathematics 5, 113–137 (1991).
[39] Yu. G. Kontraiev and R. A. Minlos, One-Particle Subspaces in the Stochastic XY Model, J. Stat. Phys. 87, 613–642 (1997).
[40] J. Fröhlich and P.-A. Marchetti, Soliton Quantization in Lattice Gauge Theories, Commun. Math. Phys. 112, 343 (1987).
[41] J. C. A. Barata and K. Fredenhagen, Charged Particles in Z_2 Gauge Theories, Commun. Math. Phys. 113, 403–417 (1987).
[42] J. C. A. Barata and F. Nill, Electrically and Magnetically Charged States and Particles in the 2+1-dimensional Z_N-Higgs Gauge Model, Commun. Math. Phys. 171, 27–86 (1995).
[43] R. S. Schor and M. O’Carroll, Decay of the Bethe-Salpeter Kernel and Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature, J. Stat. Phys. 99, 1265–1279 (2000).
[44] R. S. Schor, J. C. A. Barata, P. A. Faria da Veiga and E. Pereira, Spectral Properties of Weakly Coupled Landau-Ginzburg Stochastics Models, Phys. Rev. E 59, 2689–2694 (1999).


[45] R. S. Schor and M. O’Carroll, Bound States in the Transfer Matrix Spectrum for General Lattice Ferromagnetic Spin Systems at High Temperature, Phys. Rev. E 62, 1521–1525 (2000).
[46] R. S. Schor and M. O’Carroll, Transfer Matrix Spectrum and Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature, J. Stat. Phys. 99, 1207–1223 (2000).
[47] M. O’Carroll, P. A. Faria da Veiga, E. Pereira and R. Schor, Spectral Analysis of Weakly Coupled Stochastic Lattice Landau-Ginzburg Type Models, Commun. Math. Phys. 220, 377–402 (2001).

F. Auil and J. C. A. Barata
Universidade de São Paulo
Instituto de Física
Caixa Postal 66318
São Paulo - 05315 970 - SP
Brasil

Communicated by Klaus Fredenhagen
submitted 28/05/01, accepted 7/08/01




The Low-Temperature Limit of Transfer Operators in Fixed Dimension J. Schach Møller∗

Abstract. We construct the 0’th order low-temperature WKB-phase for the first eigenfunction of a transfer operator in a domain around a non-degenerate critical point for the potential. The 0’th order low-temperature phase is shown to solve the eikonal equation in the strong-coupling limit and we obtain estimates on the 0’th order phase, which are preserved in the limit. We furthermore use the IMS localization technique to study the two highest eigenvalues of the transfer operator in the case where the potential is allowed to have many non-degenerate global minima.

I Introduction

I.1 Short presentation of the results

This paper is concerned with three problems related to the low-temperature limit of transfer operators. A transfer operator is a bounded positive operator on L²(R^m) with integral kernel

\[
K(x,y) = (\beta J)^{\frac{m}{2}}\, e^{-\frac{\beta}{2} V(x)}\, e^{-\frac{\beta J}{2}|x-y|^2}\, e^{-\frac{\beta}{2} V(y)}, \tag{I.1.1}
\]

where β > 0 is the inverse temperature, J is a coupling constant and V ∈ C^∞(R^m) is a non-negative potential. We will always assume J > 0 (ferromagnetism), V has a finite number of non-degenerate global minima (where V equals 0) and inf_{|x|>R} V(x) > 0, for some R > 0. One can also consider other dispersion relations than |x − y|², for example ω(x − y) where ω is some strictly convex function. The transfer operator can be viewed as a continuous spin analogue of the transfer matrix, which plays a central role in the study of the Ising model. It was used by Helfand and Kac in [HK] to treat the mean-field limit of some discrete spin-systems. The mean-field parameter becomes a ferromagnetic coupling constant for the corresponding transfer operator. See [K] for an exposition. The study undertaken here, by analogy with the Ising model, relates to the statistical mechanics of continuous spin-systems. See [F], [He5] and [K] for details. This well-known relation is sketched for spin-chains in Subsection III.4.

(∗ Supported by TMR grant FMRX-960001.)

There are two limits which are of a semiclassical nature: the strong-coupling limit (SC) J → +∞ and the low-temperature limit (LT) β → +∞. For β or J sufficiently large at least one eigenvalue will appear at the top of the spectrum of K, which has a structure similar to the inverse of the spectrum of a Schrödinger operator. Note that

\[
K = (2\pi)^{\frac{m}{2}}\, e^{-\frac{\beta}{2} V}\, e^{\frac{1}{2\beta J}\Delta}\, e^{-\frac{\beta}{2} V}.
\]

The eigenvalues will be numbered in decreasing order, making the first eigenvalue the highest. By the Perron-Frobenius Theorem, the first eigenvalue is nondegenerate and its eigenfunction can be chosen strictly positive. For simplicity we assume V has a non-degenerate critical point at 0. In the LT limit we make the following ansatz for the first eigenfunction, ψ1, of K

\[
\psi_1(x) = e^{-\beta \varphi^{(LT)}(x;\beta)}, \qquad \varphi^{(LT)}(x;\beta) \sim \sum_{k=0}^{\infty} \varphi_k^{(LT)}(x)\, \beta^{-k}. \tag{I.1.2}
\]

Note that the low-temperature phases ϕ_k^{(LT)} will depend on the coupling constant J. (By ∼ we mean equality in the sense of formal power series.) In the SC limit we put β = 1 and make the ansatz

\[
\psi_1(x) = e^{-J^{\frac12} \varphi^{(SC)}(x;J)}, \qquad \varphi^{(SC)}(x;J) \sim \sum_{k=0}^{\infty} \varphi_k^{(SC)}(x)\, J^{-\frac{k}{2}}. \tag{I.1.3}
\]

As for the highest eigenvalue λ1 we make the ansatz

\[
\lambda_1(\beta) = e^{-F^{(LT)}(\beta)}, \qquad F^{(LT)}(\beta) \sim \sum_{k=0}^{\infty} F_k^{(LT)}\, \beta^{-k} \tag{I.1.4}
\]

in the LT limit, and

\[
\lambda_1(J) = e^{-F^{(SC)}(J)}, \qquad F^{(SC)}(J) \sim \sum_{k=0}^{\infty} F_k^{(SC)}\, J^{-\frac{k}{2}} \tag{I.1.5}
\]

for the SC limit. Determining the phases ϕ_k^{(LT)} and ϕ_k^{(SC)}, and the coefficients F_k^{(LT)} and F_k^{(SC)}, constitutes a WKB-construction for the transfer operator, in the LT and SC limit respectively. In [He3] and [He4], Helffer showed how to make such constructions in some sufficiently small neighbourhood of a critical point for the potential V. In Subsections I.3 and I.4 we follow Helffer and derive the equations which determine the phases. We will in this paper be interested in the following problems.

P1) A non-local WKB-construction of the first eigenfunction of K in the low-temperature limit.

The 0’th order LT phase is determined as the (non-negative) generator of a Lagrangian submanifold of R^{2m}, that is, the manifold is the graph of ∇ϕ_0^{(LT)}. The manifold arises as the stable incoming manifold of a symplectomorphism, κ = κ(J), with a hyperbolic fixed point at (0, 0) ∈ R^{2m}.


The construction of ϕ_0^{(LT)} given by Helffer relied on an application of the local stable manifold Theorem, which only gives the Lagrangian manifold (and hence the 0’th order phase) in a sufficiently small neighbourhood of (0, 0). In this paper we give a more detailed analysis of the problem. We construct ϕ_0^{(LT)} in certain maximal neighbourhoods of 0 (see Subsection I.3) and provide explicit estimates on its Hessian. In particular, the construction is global if V is globally convex.

P2) Study the behavior of ϕ_0^{(LT)}, as a function of the coupling constant J, and its relation with ϕ_0^{(SC)}.

The 0’th order strong-coupling phase, ϕ_0^{(SC)}, satisfies the eikonal equation

\[
|\nabla\varphi|^2 = 2V, \tag{I.1.6}
\]

and hence is the generator of a Lagrangian submanifold of R^{2m}; namely, the outgoing stable manifold of the Hamiltonian flow, Ψ_t, generated by the Hamiltonian

\[
H(x,\xi) = \frac{1}{2}\,\xi^2 - V(x). \tag{I.1.7}
\]

Note that (0, 0) is a hyperbolic fixed point for Ψ_t. We show that the symplectomorphisms κ, which determine the LT phase, are discretizations of the continuous dynamical system Ψ_t. An iteration of (a symplectically rescaled) κ will, in the limit J → ∞, constitute an infinitesimal step (backwards in time) in the direction of the Hamiltonian vector-field (ξ, ∇V(x)). It should be noted that this particular discretization preserves symplectic invariance, as opposed to standard schemes. We prove that the (rescaled) stable incoming manifold for κ converges to the stable outgoing manifold for Ψ_t, in the limit J → ∞. This connection between the LT limit and the eikonal equation (I.1.6) enables us to construct and estimate solutions to (I.1.6) as a byproduct of our analysis of the 0’th order LT phase. In particular, we get a global construction and estimates, if V is globally convex. We note that the eikonal equation enters at leading order in the WKB-analysis of the Schrödinger operator and in semiclassical expansions of Laplace integrals. See [He1], [Sj1] and [Sj2]. The analysis for P1) and P2) can be done in the framework of standard functions (control with respect to dimension), see [Sj2], which will be the topic of future work. We note that a global construction for the SC limit, with control with respect to dimension, will resolve the remaining problem in [HeRa], for a class of globally convex potentials. (The problem in [HeRa] is the lack of control of a localization error.)

P3) Use localizations to study the full transfer operator in the low-temperature limit, when V has many non-degenerate global minima.

In [He5] a harmonic approximation was discussed which showed that − ln λ1 is given, to leading order in β^{−1}, by the coefficient F_0^{(LT)}, obtained from the transfer operator localized near the shallowest well. In this paper we use an IMS type localization technique, which is similar to (but simpler than) the one employed in [Si] to analyze the Schrödinger operator in



the semiclassical limit. We show that the full expansion, F (LT) (β), is given by the expansion of [He3] and [He4], for the localized transfer operator, up to an O(β −∞ ) error. As for the splitting between the two first eigenvalues we consider two cases. For a symmetric double well potential we show that the splitting vanishes exponentially in the LT limit. In the case of a unique global minimum we show that, up to an exponentially small error, the splitting is given by the splitting of the localized problem, which can be treated by harmonic approximation as in [He5].

I.2 Overview and acknowledgments

The paper is divided into three parts. The present part is an introduction, which is primarily concerned with an exposition of the WKB-construction in the LT and SC limits. In Subsection I.3 the crucial step of determining the LT phase to 0’th order is explained and in Subsection I.4 the corresponding step for the SC limit is recalled. A relation between these two limits is exploited in Subsection I.5 to indicate how one can obtain solutions to the eikonal equation as limits of 0’th order LT phases. In Subsection I.6 we derive the equations determining the higher order LT phases and in I.7 we discuss how to compute the expansion coefficients F_k^{(LT)} explicitly, in terms of J and derivatives of V evaluated at zero.

Section II of the paper is concerned with the construction of the WKB-phases based on the ideas presented in Section I. In Subsection II.1 we construct the 0’th order LT phase and provide bounds on its Hessian. In Subsection II.2 we approximate a Hamiltonian flow which we use in Subsection II.3 to obtain solutions to the eikonal equation, following the idea outlined in Subsection I.5. We give some extra results on the 0’th order LT phase and the solution to the eikonal equation in Subsection II.4.

The last part of the paper connects the expansion obtained by Helffer (and recalled in Section I) for the localized transfer operator, with the problem of determining ln λ1. In Subsection III.1 we show that the restriction of the transfer operators to neighbourhoods of local minima of the potential contains all the low-temperature information. In Subsection III.2 we prove that the first eigenvalue of the restricted transfer operator is correctly described by the WKB-construction and we analyze the second eigenvalue in the non-symmetric case. In Subsection III.3 we comment on global constructions in the case where the potential is globally convex. In Subsection III.4 we discuss the consequences of the results for classical spin-chains and we compare with the recent result of [BJS].

The references to work related to WKB-constructions for Schrödinger operators are chosen based on relevance to control with respect to dimension and are not intended to represent the subject as a whole. Studying the low-temperature limit of the transfer operator using semiclassical techniques was suggested to the author by B. Helffer. We would like to thank V. Bach, H. Cornean, G. M. Graf, B. Helffer, T. Ramond and E. Skibsted for discussions and comments.

I.3 The low-temperature limit

Throughout the paper we will for the sake of brevity use the notation f′ for ∇f, the gradient of f. In the case where f depends on more than one variable we will write (∇_x f)(x, y) in order to avoid confusion. The same notation will be used for Hessians. We furthermore drop the superscripts (LT) and (SC) when it is clear from the context which one is implied. In the derivation of the equations we simplify the exposition by assuming that the potential V ∈ C^∞(R^m) is globally convex, V″ > 0, and has a unique global minimum at 0, with V(0) = 0 and V′(0) = 0. We define the transfer operator K by its kernel

\[
K(x,y) = (\pi h)^{-\frac{m}{2}}\, e^{-\frac{1}{2h} V(x)}\, e^{-\frac{J}{2h}|x-y|^2}\, e^{-\frac{1}{2h} V(y)}. \tag{I.3.1}
\]

Notice that, compared to (I.1.1), we have replaced the inverse temperature by a semiclassical parameter h = β^{−1} and chosen a more natural prefactor for the limit considered. The ansatz (I.1.2) now reads

\[
\psi_1(x;h) = e^{-\frac{1}{h}\varphi(x;h)}, \qquad \varphi(x;h) \sim \sum_{k=0}^{\infty} \varphi_k(x)\, h^k. \tag{I.3.2}
\]

This type of ansatz was used for Schrödinger operators in [Sj1] instead of the more traditional ansatz, ψ1(x) = a(x; h) e^{−ϕ0(x)/h}, where the amplitude a is a formal power series in h. The reason is that the control of terms with respect to dimension is difficult with the latter approach. As for the eigenvalue we get from (I.1.4)

\[
\lambda_1(h) = e^{-F(h)}, \qquad F(h) \sim \sum_{k=0}^{\infty} F_k\, h^k. \tag{I.3.3}
\]

The idea is to determine ϕ and F inductively by requiring that the integral below is equal to 1, or rather e^{O(h^∞)}:

\[
e^{F(h)} \int_{\mathbb{R}^m} e^{\frac{1}{h}\varphi(x;h)}\, K(x,y)\, e^{-\frac{1}{h}\varphi(y;h)}\, dy = 1 \quad \text{for all } x \in \mathbb{R}^m. \tag{I.3.4}
\]

The strategy is to make a coordinate transformation to bring the integrand into a certain form depending on the scaling of h. We consider a change of coordinates y → ξ(y; x, h), where ξ has a formal expansion, which gives the equation (I.3.4) above the form

\[
(\pi h)^{-\frac{m}{2}} \int_{\mathbb{R}^m} e^{-\frac{1}{h}|\xi(y;x,h)|^2 + \ln\det\nabla_y \xi(y;x,h)}\, dy = 1 \quad \text{for all } x \in \mathbb{R}^m. \tag{I.3.5}
\]



This will imply that F(h) is formally an expansion of the logarithm of the first eigenvalue. We make the further assumption that ξ is the gradient with respect to y of a function, in order to make its derivative symmetric. That is, we look for

\[
\xi(y;x,h) = \nabla_y f(x,y;h), \qquad f(x,y;h) \sim \sum_{k=0}^{\infty} f_k(x,y)\, h^k. \tag{I.3.6}
\]

Here f_k = f_k^{(LT)}. We make the following restriction on the class of f0’s we want to work with.

Condition I.3.1 We require f0 ≥ 0, ∇²_y f0 > 0 and for each x ∈ R^m there exists y = Φ_J(x), such that f0(x, y) = 0 (and hence ∇_y f0(x, y) = 0).

We later verify that the f0 we obtain does indeed satisfy Condition I.3.1. We note that this condition ensures that for each N there exists h0 > 0 such that det(∇²_y Σ_{k=0}^{N} h^k f_k) > 0 for h < h0. Using the explicit form of the integral kernel of the transfer operator we find the equation between formal power series

\[
\varphi(y;h) - \varphi(x;h) + \frac{1}{2}\left(V(x) + V(y)\right) + \frac{J}{2}|x-y|^2 = |\nabla_y f(x,y;h)|^2 - h \ln\det\nabla_y^2 f(x,y;h) + h F(h). \tag{I.3.7}
\]

Consider the 0’th order equation

\[
\varphi_0(y) - \varphi_0(x) + \frac{1}{2}\left(V(x) + V(y)\right) + \frac{J}{2}|x-y|^2 = |\nabla_y f_0(x,y)|^2. \tag{I.3.8}
\]

Notice that we have to determine the two unknowns ϕ0 and f0 at the same time. Fix ϕ0(0) = 0 (can be chosen freely). By Condition I.3.1, ∇_y f0(x, Φ_J(x)) = 0 and hence the gradient of the right hand side, with respect to both x and y, vanishes at the critical manifold (x, Φ_J(x)). This gives the following equations (for y = Φ_J(x))

\[
\begin{aligned}
\varphi_0'(y) + \tfrac{1}{2} V'(y) + J(y - x) &= 0, \\
-\varphi_0'(x) + \tfrac{1}{2} V'(x) + J(x - y) &= 0.
\end{aligned} \tag{I.3.9}
\]

Given (x, ϕ0′(x)) we can determine (Φ_J(x), ϕ0′(Φ_J(x))). Inspired by this observation we introduce a diffeomorphism of the symplectic space R^m_x × R^m_ξ, κ(x, ξ; J) = (κ_x(x, ξ; J), κ_ξ(x, ξ; J)) = (y, η), given by the set of equations

\[
\begin{aligned}
\eta + \tfrac{1}{2} V'(y) + J(y - x) &= 0, \\
-\xi + \tfrac{1}{2} V'(x) + J(x - y) &= 0.
\end{aligned} \tag{I.3.10}
\]


The 0’th order phase ϕ0 will now by (I.3.9) satisfy the relation

\[
\kappa(x, \varphi_0'(x); J) = \left(\Phi_J(x),\ \varphi_0'(\Phi_J(x))\right). \tag{I.3.11}
\]
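To make the map κ concrete, here is a minimal one-dimensional Python sketch. It assumes m = 1 and reads V′ as the derivative of V; the two test potentials and all function names are illustrative choices, not taken from the paper. The second equation in (I.3.10) is solved for y, after which the first gives η. For the quadratic potential V(x) = x²/2 with J = 2, a short hand computation shows that the line ξ = 1.5 x is mapped into itself and that points on it move towards the origin. The sketch also checks numerically that the Jacobian determinant of κ is 1, which in one dimension is the same as being symplectic.

```python
def kappa(x, xi, J, dV):
    """One application of the map (x, xi) -> (y, eta) defined by (I.3.10), m = 1.

    dV is the derivative V' of the potential (an assumption of this sketch).
    """
    y = x + (0.5 * dV(x) - xi) / J        # from  -xi + V'(x)/2 + J (x - y) = 0
    eta = -0.5 * dV(y) - J * (y - x)      # from  eta + V'(y)/2 + J (y - x) = 0
    return y, eta


def jacobian_det(x, xi, J, dV, eps=1e-6):
    """Central-difference estimate of the Jacobian determinant of kappa."""
    y_xp, e_xp = kappa(x + eps, xi, J, dV)
    y_xm, e_xm = kappa(x - eps, xi, J, dV)
    y_sp, e_sp = kappa(x, xi + eps, J, dV)
    y_sm, e_sm = kappa(x, xi - eps, J, dV)
    dy_dx = (y_xp - y_xm) / (2 * eps)
    de_dx = (e_xp - e_xm) / (2 * eps)
    dy_dxi = (y_sp - y_sm) / (2 * eps)
    de_dxi = (e_sp - e_sm) / (2 * eps)
    return dy_dx * de_dxi - dy_dxi * de_dx


def dV_quadratic(x):
    """V(x) = x^2 / 2 (hypothetical test potential)."""
    return x


def dV_quartic(x):
    """V(x) = x^2 / 2 + x^4 / 4 (hypothetical convex test potential)."""
    return x + x ** 3


# A point on the line xi = 1.5 x stays on that line and moves towards 0.
step = kappa(1.0, 1.5, 2.0, dV_quadratic)   # -> (0.5, 0.75)

# Area preservation, checked at an arbitrary point for a non-quadratic V.
det = jacobian_det(0.7, -0.3, 2.0, dV_quartic)

print(step, det)
```

The slope 1.5 is specific to this toy case: inserting ξ = a x and η = a y into (I.3.10) with V′(x) = x and J = 2 gives a² = 9/4.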

In other words the graph of ϕ0′ is a stable manifold for the fixed point (0, 0) of the discrete dynamical system κ. One can see by convexity that the critical point Φ_J(x) will be closer to 0 than x (if ϕ0 is convex) which indicates that the graph is the incoming manifold. One can check that (0, 0) is a hyperbolic fixed point and that κ is a symplectic transformation which implies that the stable manifolds are Lagrangian submanifolds of R^{2m}. From (I.3.8) one can also see that the 0’th order phase will be a fixed point of the transformation

\[
\varphi \mapsto \frac{1}{2} V(x) + \inf_{y\in\mathbb{R}^m} \left( \frac{1}{2} V(y) + \varphi(y) + \frac{J}{2}|x-y|^2 \right), \tag{I.3.12}
\]

which turns out to be a contraction on convex functions. This observation will be the key to the non-local construction given in Subsection II.1. Notice that the stable manifold Theorem always gives a local construction. This approach was taken in [He3], and in [He4] control with respect to dimension is achieved. Iterating the transformation (I.3.12) starting with the function ϕ = ½V gives a pointwise non-decreasing sequence which is bounded from above. Hence it converges to some function which is locally Lipschitz. This is true for any non-negative potential but we lack explicit control of the limit. See [He2], [He3] or [He5].

We end this section by formulating our main result concerning the construction of LT 0’th order phases. We introduce some notation before we state the theorem. Let V ∈ C^∞(R^m) have a non-degenerate local minimum at x = 0 with V(0) = 0 (and V′(0) = 0). Suppose V is convex in a neighbourhood of 0 and define Ω = {x ∈ R^m : V″(x) > 0}. Let D∞ ⊂ Ω be an open set containing 0 and let d ∈ C^∞(D∞) be a smooth non-negative convex function with d(0) = 0 (and hence d′(0) = 0). For R > 0 we consider the sets

\[
D_R = d^{-1}([0,R]). \tag{I.3.13}
\]

By assumption D_R is convex and closed for small R but this need not be the case for large R (unless for example Ω = R^m). Let

\[
R_0 = \sup\{R > 0 : D_R \text{ is convex and closed, and } V'(x)\cdot d'(x) > 0,\ x \in D_R\setminus\{0\}\}. \tag{I.3.14}
\]

The function d will be called a comparison function for V, if R0 > 0. We can choose d such that R0 = R0(d) > 0 since both d = V|_Ω and d = x²|_Ω are comparison functions. For R < R0 the boundary of D_R is a level set for d:

\[
\partial D_R = d^{-1}(\{R\}).
\]



We redefine for notational simplicity ◦

D∞ = DR0 .

(I.3.15)

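For a given comparison function $d$ we define lower and upper spectral bounds of $V''$:
\[
\begin{aligned}
\lambda_l(x) &= \sup\{v : V''(y) \ge vI \text{ for } y \in D_{d(x)}\},\\
\lambda_u(x) &= \inf\{v : V''(y) \le vI \text{ for } y \in D_{d(x)}\}.
\end{aligned} \tag{I.3.16}
\]
Given a comparison function $d$ we define an adapted version of the transformation (I.3.12):
\[ \tilde T\varphi(x) = \frac12 V(x) + \inf_{y \in D_\infty}\Bigl(\frac12 V(y) + \varphi(y) + \frac J2 |x - y|^2\Bigr). \tag{I.3.17} \]
Our answer to P1), of Subsection I.1, is

Theorem I.3.2 The set
\[ \tilde S = \{\varphi \in C^\infty(D_\infty) : \varphi(0) = 0,\ \varphi'' \ge 0 \text{ and } \varphi' \cdot d' \ge 0\} \tag{I.3.18} \]
is invariant under $\tilde T$ and there exists a unique fixed point $\varphi_0^{(LT)} \in \tilde S$. The graph of $\nabla\varphi_0^{(LT)}$ is the stable incoming manifold (over $D_\infty$) for the transformation $\kappa$ and
\[ a_l(x) I \le \nabla^2\varphi_0^{(LT)}(x) \le a_u(x) I, \quad x \in D_\infty, \tag{I.3.19} \]
where
\[ a_l = \sqrt{J\lambda_l + \tfrac14\lambda_l^2} \quad\text{and}\quad a_u = \sqrt{J\lambda_u + \tfrac14\lambda_u^2}. \tag{I.3.20} \]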
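As an illustration only, the transformation (I.3.17) can be evaluated by brute force in the simplest case $m = 1$, $V(y) = \omega y^2/2$, $\varphi(y) = s y^2/2$. The sketch below assumes this quadratic setting, replaces $D_\infty$ by a bounded interval, and uses hypothetical names (`t_tilde_quadratic`, `rho`); it is not taken from the paper. For quadratics the infimum can be computed by hand, so $\tilde T\varphi$ is again quadratic with Hessian $\frac12\omega + J(\frac12\omega + s)/(J + \frac12\omega + s)$, and $a = \sqrt{J\omega + \omega^2/4}$ is left unchanged, in agreement with (I.3.20).

```python
import math

def t_tilde_quadratic(s, omega, J, x, grid=20001, half_width=5.0):
    """Evaluate (I.3.17) for V(y) = omega*y^2/2 and phi(y) = s*y^2/2 (m = 1).

    The infimum over D_infty is approximated by a minimum over a uniform grid
    on [-half_width, half_width]; both the grid and the interval are assumptions
    made for this sketch.
    """
    best = math.inf
    for i in range(grid):
        y = -half_width + 2.0 * half_width * i / (grid - 1)
        value = 0.25 * omega * y * y + 0.5 * s * y * y + 0.5 * J * (x - y) ** 2
        best = min(best, value)
    return 0.25 * omega * x * x + best

def rho(s, omega, J):
    """Hessian of T~phi when phi has Hessian s and V has Hessian omega."""
    b = 0.5 * omega + s
    return 0.5 * omega + J * b / (J + b)
```

Comparing `t_tilde_quadratic(s, omega, J, x)` with `0.5 * rho(s, omega, J) * x**2` checks the closed form; evaluating `rho` at `math.sqrt(J*omega + 0.25*omega**2)` returns the same value up to rounding.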

Here one should modify Condition I.3.1, which gives the existence of a critical point. We replace it by

Condition I.3.3 There exists an open set $O \subset \mathbb{R}^m \times \mathbb{R}^m$, with $\{x : \exists y \in \mathbb{R}^m \text{ s.t. } (x, y) \in O\} = D_\infty$, such that $f_0$ is defined as a function on $O$. Furthermore, $f_0 \ge 0$, $\nabla_y^2 f_0 > 0$ and for any $x \in D_\infty$ there exists $y = \Phi_J(x) \in \mathbb{R}^m$ with the property that $(x, y) \in O$ and $f_0(x, y) = 0$.

I.4 The strong-coupling limit

This limit has been treated by Helffer in [He4] and by Helffer and Ramond in [HeRa]. We present here the results. The semiclassical parameter will be $h = J^{-\frac12}$. We put $\beta = 1$ (or alternatively transform it into the potential). The ansatz (I.1.3) and (I.1.5), with this choice of $h$, takes the same form as in the LT case; see (I.3.2) and (I.3.3).

Vol. 2, 2001


One can proceed here as in the previous section and look for a suitable change of coordinates. The scaling is however different and one should aim at bringing the equation (I.3.4) on the form
\[ (\pi h^2)^{-\frac m2} \int_{\mathbb{R}^m} e^{-\frac{1}{h^2}|\xi(y;x,h)|^2 + \ln\det\nabla_y\xi(y;x,h)}\,dy = 1 \quad\text{for all } x \in \mathbb{R}^m \]
instead of the form (I.3.5). As in the LT case we assume $\xi = \nabla_y f$, where
\[ f(x, y; h) \sim \sum_{k=0}^\infty f_k(x, y) h^k \]
and in order to distinguish from the LT case we sometimes write $f_k = f_k^{(SC)}$. (Notice that the transfer operator in this limit has an extra factor of $h^{-\frac m2}$.) We get the following equation, between formal power series,
\[ h\varphi(y; h) - h\varphi(x; h) + \frac{h^2}{2}(V(x) + V(y)) + \frac12|x - y|^2 = |\nabla_y f(x, y; h)|^2 - h^2\ln\det\nabla_y^2 f(x, y; h) + h^2 F(h). \tag{I.4.1} \]
In order to obtain an equation for $\varphi_0$ we identify the $h^0$, $h^1$ and $h^2$ parts of (I.4.1). In the process we will determine $f_0$ and $F_0$ as well. We have at 0'th order
\[ |\nabla_y f_0(x, y)|^2 = \frac12|x - y|^2, \tag{I.4.2} \]
at 1'st order
\[ \varphi_0(y) - \varphi_0(x) = 2\nabla_y f_0(x, y) \cdot \nabla_y f_1(x, y), \tag{I.4.3} \]
and finally at 2'nd order, with $x = y$,
\[ V(x) = 2\nabla_y f_0(x, x) \cdot \nabla_y f_2(x, x) + |\nabla_y f_1(x, x)|^2 - \ln\det\nabla_y^2 f_0(x, x) + F_0. \tag{I.4.4} \]
The 0'th order equation (I.4.2), which is an eikonal equation with parameter, (together with the constraints $f_0(0, 0) = 0$ and $f_0 \ge 0$) determines $f_0$:
\[ f_0(x, y) = \frac{1}{2\sqrt2}|x - y|^2. \]
Here we know $f_0$ explicitly and thus need not impose any a priori conditions on it, as we did in the previous section with Condition I.3.1. We note that $\nabla_y f_0(x, x) = 0$ and choose
\[ F_0 = F_0^{(SC)} = \ln\det\Bigl(\frac{1}{\sqrt2} I\Bigr), \]
which is just an overall additive constant in the free energy. (In [HeRa] it is scaled to zero.) The equation (I.4.4) simplifies to
\[ V(x) = |\nabla_y f_1(x, x)|^2. \tag{I.4.5} \]


Taking the gradient of (I.4.3) with respect to $y$, and subsequently evaluating on the diagonal $x = y$, yields $\varphi_0'(x) = \sqrt2\,\nabla_y f_1(x, x)$. Inserting this into (I.4.5) shows that $\varphi_0$ must satisfy the eikonal equation (I.1.6), that is: $|\varphi_0'(x)|^2 = 2V(x)$. One can also describe $\varphi_0$ in a way more parallel to what was done in the previous section. Notice that $(x, \varphi_0'(x))$ lies on the zero-energy manifold for the Hamiltonian (I.1.7). In fact $(x, \varphi_0'(x))$ will be the outgoing stable manifold at the hyperbolic fixed point $(0, 0)$ for the continuous symplectic dynamical system given by the Hamiltonian flow $\Psi_t(x, \xi)$ for the Hamiltonian (I.1.7):
\[
\begin{aligned}
\dot\Psi_t^{x}(x, \xi) &= (\nabla_\xi H)(\Psi_t(x, \xi)),\\
\dot\Psi_t^{\xi}(x, \xi) &= -(\nabla_x H)(\Psi_t(x, \xi)),\\
\Psi_0(x, \xi) &= (x, \xi),
\end{aligned} \tag{I.4.6}
\]
where the superscripts denote the $x$- and $\xi$-components of $\Psi_t$.

I.5 A connection between the two limits

In this subsection we will explain a central point of this paper which links the LT limit with the SC limit. We see from Theorem I.3.2 that $\varphi_0^{(LT)}$ scales as $J^{\frac12}$ in the SC limit. Inspired by this observation we make a symplectic rescaling of the symplectomorphism $\kappa$, see (I.3.10). This gives a new symplectomorphism
\[ \tau(x, \xi; J) = \bigl(\kappa_x(x, J^{\frac12}\xi; J),\ J^{-\frac12}\kappa_\xi(x, J^{\frac12}\xi; J)\bigr). \tag{I.5.1} \]
The corresponding dynamical system has as incoming manifold the graph of $J^{-\frac12}\nabla\varphi_0^{(LT)}$, and $\tau = (\tau_x, \tau_\xi)$ can be written, using Taylor expansion, as
\[ \tau(x, \xi; J) = \begin{pmatrix} x \\ \xi \end{pmatrix} - J^{-\frac12}\begin{pmatrix} \xi \\ V'(x) \end{pmatrix} + \frac12 J^{-1}\begin{pmatrix} V'(x) \\ V''(x)\xi \end{pmatrix} + J^{-\frac32}\begin{pmatrix} 0 \\ r_J(x, \xi) \end{pmatrix}, \tag{I.5.2} \]
where the remainder
\[ r_J(x, \xi) = \frac14 V''(x)V'(x) + \frac14\int_0^1 \Bigl\langle \nabla^3 V(z_t),\ \bigl(\xi - \tfrac12 J^{-\frac12}V'(x)\bigr) \otimes \bigl(\xi - \tfrac12 J^{-\frac12}V'(x)\bigr) \Bigr\rangle\,dt \]
and $z_t = tx + (1 - t)(x - \tau_x)$. The remainder is clearly bounded uniformly in large $J$. Applying $\tau$ once, in the limit of large $J$, corresponds to taking an infinitesimal step along the Hamiltonian flow (I.4.6) for the Hamiltonian (I.1.7), and in fact by iterating $\tau$ one recovers the Euler-Cauchy method (up to the remainder term) for integrating the differential equation (I.4.6). Thus it is no miracle that one can obtain the 0'th order SC phase as a limit of 0'th order LT phases, and we will in fact adopt some ideas from theoretical numerical analysis to achieve this end (see [D]). We have the following answer to P2) of Subsection I.1

The Low-Temperature Limit of Transfer Operators in Fixed Dimension

Theorem I.5.1 Let $d$ be a comparison function for $V$. The sequences $J^{-\frac12}\varphi_0^{(LT)}$ and $J^{-\frac12}\nabla\varphi_0^{(LT)}$ converge locally uniformly in $D_\infty$ and there exists a function $\varphi_0^{(SC)} \in \tilde S$ such that for $x \in D_\infty$
\[ \lim_{J\to\infty} J^{-\frac12}\varphi_0^{(LT)}(x) = \varphi_0^{(SC)}(x) \quad\text{and}\quad \lim_{J\to\infty} J^{-\frac12}\nabla\varphi_0^{(LT)}(x) = \nabla\varphi_0^{(SC)}(x). \]
The function $\varphi_0^{(SC)}$ solves the eikonal equation (I.1.6) and its Hessian satisfies the estimates
\[ \sqrt{\lambda_l(x)}\,I \le \nabla^2\varphi_0^{(SC)}(x) \le \sqrt{\lambda_u(x)}\,I, \quad\text{for } x \in D_\infty. \]

This seems to be a new way of constructing solutions to the eikonal equation in a non-local way. Notice that the proof of the local stable manifold theorem for continuous dynamical systems goes through the discrete case (see [I]) and in some sense this is also what is done here. What is more important is probably that our discretization preserves symplectic invariance, and as a byproduct we can get estimates on solutions to the eikonal equation as SC limits of corresponding estimates on the 0'th order LT phase, which in some sense is easier to handle. We note that a more detailed analysis of the higher order derivatives, which we omit here, shows that all derivatives of $J^{-\frac12}\varphi_0^{(LT)}$ converge to the corresponding derivative of $\varphi_0^{(SC)}$, locally uniformly in $D_\infty$.
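The limit in Theorem I.5.1 can be checked by hand in the simplest case, which the following sketch does numerically. It assumes $m = 1$ and the quadratic potential $V(x) = \omega x^2/2$, for which $\lambda_l = \lambda_u = \omega$, so the bounds (I.3.19) force $\varphi_0^{(LT)}(x) = \frac12\sqrt{J\omega + \omega^2/4}\,x^2$. The function names are hypothetical and the example is only meant as a sanity check of the scaling.

```python
import math

def lt_hessian(omega, J):
    """a_l = a_u from (I.3.20) when V'' = omega is constant."""
    return math.sqrt(J * omega + 0.25 * omega ** 2)

def rescaled_lt_phase(x, omega, J):
    """J^(-1/2) * phi_0^(LT)(x) for the quadratic potential."""
    return 0.5 * lt_hessian(omega, J) * x * x / math.sqrt(J)

def sc_phase(x, omega):
    """Candidate SC phase phi_0^(SC)(x) = sqrt(omega) * x^2 / 2."""
    return 0.5 * math.sqrt(omega) * x * x

def eikonal_residual(x, omega):
    """|phi'(x)|^2 - 2 V(x) for the candidate SC phase; should vanish."""
    grad = math.sqrt(omega) * x
    return grad * grad - 2.0 * (0.5 * omega * x * x)
```

In this setting the difference between `rescaled_lt_phase` and `sc_phase` decays like $1/J$, since $\sqrt{\omega + \omega^2/(4J)} - \sqrt\omega \approx \omega^{3/2}/(8J)$.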

I.6 Higher order phases

In this subsection we return to the equations (I.3.7) and (I.4.1) and obtain equations for higher order phases (in the LT limit only). See [He4] for a more complete exposition. From the discussion in Subsections I.3 and I.4 we know how to produce 0'th order phases. First of all we get an equation for the coordinate change to 0'th order now that we know $\varphi_0^{(LT)}$ (see (I.3.8)):
\[ |\nabla_y f_0(x, y)|^2 = W_x(y) = \varphi_0(y) - \varphi_0(x) + \frac12(V(y) + V(x)) + \frac J2|x - y|^2. \tag{I.6.1} \]
This is an eikonal equation for a convex potential with critical point at $y = \Phi_J(x)$ and it can be solved (globally if $V$ is globally convex) by Theorem I.5.1. Here $x$ enters as a parameter in the potential $W_x$. We note that Theorem I.5.1 implies that this solution is within the a priori class given by Condition I.3.1. Before we proceed we need a lemma which appeared in [Sj1], [He1], [He4] and [HeRa]. It is used to determine which terms of $\ln\det\nabla_y^2 f(x, y; h)$ one should use at a given order.

Lemma I.6.1 Let $h \to M(h)$ be a smooth family of positive $m \times m$ matrices with an asymptotic expansion $M(h) \sim \sum_{k\ge0} M_k h^k$ and suppose $M_0 > 0$. Then $L(h) = \ln\det M(h)$ has an asymptotic expansion $L(h) \sim \sum_{k\ge0} L_k h^k$ where $L_0 = \ln\det M_0$


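and for $k \ge 1$
\[ L_k = \sum_{n=1}^{k}\ \sum_{\substack{j_1,\dots,j_n \ge 1\\ j_1+\cdots+j_n = k}} \frac{(-1)^{n-1}}{n}\operatorname{tr}\Bigl\{\prod_{i=1}^{n} M_0^{-1}M_{j_i}\Bigr\}. \]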
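The coefficient formula of Lemma I.6.1 can be tested numerically. The following sketch does so for $2 \times 2$ matrices in pure Python; the helper names (`compositions`, `log_det_coefficients`, `log_det_exact`) and the restriction to $2 \times 2$ matrices are choices made here for illustration. It enumerates the index tuples $(j_1, \dots, j_n)$ with $j_1 + \cdots + j_n = k$ and sums the corresponding traces.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def trace(a):
    return a[0][0] + a[1][1]

def compositions(k, n):
    """All tuples (j_1, ..., j_n) of integers >= 1 with j_1 + ... + j_n = k."""
    if n == 1:
        return [(k,)] if k >= 1 else []
    out = []
    for first in range(1, k - n + 2):
        for rest in compositions(k - first, n - 1):
            out.append((first,) + rest)
    return out

def log_det_coefficients(ms, order):
    """Return [L_0, ..., L_order] for M(h) = sum_k ms[k] * h^k (missing M_k are zero)."""
    m0_inv = inverse(ms[0])
    zero = [[0.0, 0.0], [0.0, 0.0]]
    coeffs = [math.log(det(ms[0]))]
    for k in range(1, order + 1):
        total = 0.0
        for n in range(1, k + 1):
            for js in compositions(k, n):
                prod = [[1.0, 0.0], [0.0, 1.0]]
                for j in js:
                    mj = ms[j] if j < len(ms) else zero
                    prod = matmul(prod, matmul(m0_inv, mj))
                total += (-1) ** (n - 1) / n * trace(prod)
        coeffs.append(total)
    return coeffs

def log_det_exact(ms, h):
    """ln det M(h) evaluated directly, for comparison."""
    m = [[sum(mk[i][j] * h ** k for k, mk in enumerate(ms)) for j in range(2)] for i in range(2)]
    return math.log(det(m))
```

For small $h$, truncating the expansion after $L_2 h^2$ should leave an error of order $h^3$, which is what a comparison with `log_det_exact` shows.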

As for the 1'st order phase we get
\[ \varphi_1(y) - \varphi_1(x) = 2\nabla_y f_0(x, y) \cdot \nabla_y f_1(x, y) - \ln\det\nabla_y^2 f_0(x, y) + F_0. \tag{I.6.2} \]
Considering this equation along the critical manifold $y = \Phi_J(x)$ gives the following discrete transport equation
\[ \varphi_1(x) = \varphi_1(\Phi_J(x)) + \ln\det\nabla_y^2 f_0(x, \Phi_J(x)) - F_0. \tag{I.6.3} \]
The idea is to choose
\[ F_0 = \ln\det\nabla_y^2 f_0(0, 0) \]
as the eigenvalue to 0'th order and then iterate, keeping in mind that $\Phi_J^{(k)}(x) \to 0$ as $k \to \infty$. Subtracting the constant part of $\ln\det(\nabla_y^2 f_0)(x, \Phi_J(x))$ will ensure convergence:
\[ \varphi_1(x) = \sum_{k=0}^{\infty}\Bigl[\ln\det(\nabla_y^2 f_0)\bigl(\Phi_J^{(k)}(x), \Phi_J^{(k+1)}(x)\bigr) - F_0\Bigr]. \]
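The structure of this series can be illustrated with a toy computation. In the sketch below, `g` is a hypothetical stand-in for $x \mapsto \ln\det\nabla_y^2 f_0(x, \Phi_J(x))$ and `phi_map` for the contraction $\Phi_J$; neither is computed from an actual potential. The point is only that summing $g(\Phi^{(k)}(x)) - g(0)$ along the orbit produces a solution of the discrete transport equation (I.6.3) with $F_0 = g(0)$.

```python
import math

def solve_discrete_transport(g, phi_map, x, tol=1e-17, max_terms=10_000):
    """Sum the series sum_k [g(Phi^k(x)) - g(0)] along the orbit of x.

    g and phi_map are user-supplied (hypothetical) functions; phi_map is
    assumed to contract towards 0 so that the terms decay geometrically.
    """
    g0 = g(0.0)
    total, point = 0.0, x
    for _ in range(max_terms):
        total += g(point) - g0
        point = phi_map(point)
        if abs(point) < tol:
            break
    return total
```

With, say, `g = lambda y: math.log(2.0 + math.sin(y))` and `phi_map = lambda y: 0.6 * y`, the value at `x` minus the value at `phi_map(x)` reproduces `g(x) - g(0.0)` up to rounding.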

In order to control the regularity of this construction we will need to have good estimates of iterates of $\Phi_J$ and its derivatives. The coordinate change to 1'st order satisfies a standard transport equation
\[ 2\nabla_y f_1(x, y) \cdot \nabla_y f_0(x, y) = \varphi_1(y) - \varphi_1(x) + \ln\det\nabla_y^2 f_0(x, y) - F_0 \tag{I.6.4} \]
which can be solved explicitly around $y = \Phi_J(x)$:
\[ f_1(x, y) = \frac12\int_{-\infty}^{0}\Bigl[\varphi_1(\Psi_{t,x}(y)) - \varphi_1(x) + \ln\det(\nabla_y^2 f_0)(x, \Psi_{t,x}(y)) - F_0\Bigr]\,dt, \]
where, for each $x$, $\Psi_{t,x}$ solves
\[ \frac{d}{dt}\Psi_{t,x}(y) = \nabla_y f_0(x, \Psi_{t,x}(y)) \quad\text{and}\quad \Psi_{0,x}(y) = y. \]
In particular it can be solved in the same domains for which we construct the 0'th order coordinate change. (See for example [He1], [He4] or [Sj1].) The rest of the way is simply a repetition of the procedure described for the 1'st order phase and coordinate change. First set the eigenvalue at $k$'th order, $F_k$, equal to the known term in the $(k + 1)$'th order equation, obtained from (I.3.7), evaluated at $(0, 0)$. Next iterate towards $(0, 0)$ along the critical manifold, to solve the discrete

transport equation (see (I.6.3)) and obtain the $k$'th order phase. Finally solve the corresponding continuous transport equation (see (I.6.4)) and obtain the coordinate change to $k$'th order.

We have described above how to obtain the LT expansion under the assumption that we can work in the full space $\mathbb{R}^m$. In practice we want to work in (a subset of) $D_\infty$, see (I.3.15), which gives rise to a localization error, see (I.3.4). We treat the localization problem in Section III and state here the main result in the case where $V$ has a unique global minimum at 0. In case of several global minima, one needs to compare expansion coefficients from each of the minima to determine where the operator localizes. (For the symmetric double well problem we furthermore show that the gap between the first and the second eigenvalue, $\ln(\lambda_2/\lambda_1)$, is exponentially small.) Our answer to P3) of Subsection I.1 is

Theorem I.6.2 Suppose $V \in C^\infty(\mathbb{R}^m)$ has a unique global minimum at 0, where $V(0) = 0$. Let $\lambda_1$ be the highest eigenvalue of the operator with kernel (I.3.1). For any $N \ge 0$ there exists $\beta_0 > 0$ such that, for $\beta > \beta_0$, we have
\[ \ln\lambda_1 = -\sum_{k=0}^{N} F_k^{(LT)}\beta^{-k} + O(\beta^{-N-1}). \]
The first two expansion coefficients are given by
\[ F_0^{(LT)} = \ln\det\nabla_y^2 f_0^{(LT)}(0, 0) \]
and
\[ F_1^{(LT)} = -|\nabla_y f_1^{(LT)}(0, 0)|^2 + \operatorname{tr}\bigl(\nabla_y^2 f_0^{(LT)}(0, 0)^{-1}\nabla_y^2 f_1^{(LT)}(0, 0)\bigr). \]

I.7 Computing the LT expansion coefficients

The expansion coefficients $F_k^{(LT)}$ are expressed a priori in terms of derivatives of the $f_k^{(LT)}$'s evaluated at $(0, 0)$. This follows from (I.3.7) and Lemma I.6.1. We end the introduction with some remarks on how to obtain an expression in terms of $J$ and derivatives of $V$, evaluated at 0. In particular we aim at computing all the ingredients needed to express $F_0$ and $F_1$ in terms of $J$ and derivatives of $V$, evaluated at zero. As for the eikonal equation and the discrete and continuous transport equation, one can compute the derivatives of solutions at 0 by Taylor expanding the equations.

Before we continue we introduce some convenient notation. Let $H^{(n)} = \Gamma^{(n)}_{\mathrm{sym}}(\mathbb{R}^m)$ denote the vector space of real-valued fully symmetric $n$-tensors. That is, elements of the form $T = \{T_{i_1,\dots,i_n}\}_{i_k \in \{1,\dots,m\}}$, with $T_{i_1,\dots,i_n} = T_{\sigma(i_1),\dots,\sigma(i_n)}$ for any permutation $\sigma$. We equip $H^{(n)}$ with an inner product
\[ \langle T, S\rangle_{(n)} = \sum_{i_1,\dots,i_n} T_{i_1,\dots,i_n} S_{i_1,\dots,i_n}. \]


Let $A : \mathbb{R}^m \to \mathbb{R}^m$ be a symmetric matrix; $A^t = A$. We write $\Gamma^{(n)}(A)$ for the linear map on $H^{(n)}$ given by
\[ (\Gamma^{(n)}(A)T)_{i_1,\dots,i_n} = \sum_{j_1,\dots,j_n} T_{j_1,\dots,j_n} A_{j_1,i_1} \cdots A_{j_n,i_n} \]
and $d\Gamma^{(n)}(A)$ for the map
\[ (d\Gamma^{(n)}(A)T)_{i_1,\dots,i_n} = \sum_{k=1}^{n}\sum_{j_k=1}^{m} T_{i_1,\dots,i_{k-1},j_k,i_{k+1},\dots,i_n} A_{j_k,i_k}. \]
The spectra of these operators are
\[ \sigma(\Gamma^{(n)}(A)) = \Bigl\{\prod_{k=1}^{n}\lambda_k : \lambda_k \in \sigma(A)\Bigr\} \quad\text{and}\quad \sigma(d\Gamma^{(n)}(A)) = \Bigl\{\sum_{k=1}^{n}\lambda_k : \lambda_k \in \sigma(A)\Bigr\}. \]
For our purpose $A$ will always be a positive operator. In this case, $\Gamma^{(n)}(A)$ and $d\Gamma^{(n)}(A)$ are positive operators on $H^{(n)}$, and if $A$ is a strict contraction, then so is $\Gamma^{(n)}(A)$ (in the norm coming from the inner product). These observations will be used below to invert certain linear operators on tensors.

Let $T^{(n)} \in H^{(n)}$ and $S^{(l)} \in H^{(l)}$. We extend the matrix calculus to tensors as follows; the $(n + l - 2)$-tensor $T^{(n)}S^{(l)}$ is given by
\[ (T^{(n)}S^{(l)})_{i_1,\dots,i_{n+l-2}} = \sum_{j=1}^{m} T^{(n)}_{i_1,\dots,i_{n-1},j} S^{(l)}_{j,i_n,\dots,i_{n+l-2}}. \]
This is however not (generally) a symmetric tensor so we will instead use $T^{(n)} * S^{(l)} = \Sigma^{(n+l-2)}(T^{(n)}S^{(l)})$, where $\Sigma^{(n)}$ maps $n$-tensors into symmetric $n$-tensors as follows:
\[ \Sigma^{(n)}(T)_{i_1,\dots,i_n} = \frac{1}{n!}\sum_{\sigma \in \Sigma_n} T_{\sigma(i_1),\dots,\sigma(i_n)}. \]
Here $\Sigma_n$ is the symmetric group on $n$ elements. (One can write $d\Gamma^{(n)}(A)T = nA * T$, for $T \in H^{(n)}$.) We note that the notation introduced here is that of bosonic Fock spaces. We abbreviate
\[ T_k^{(n)} = \nabla_y^n f_k(0, 0) \in H^{(n)} \quad\text{and}\quad S_k^{(n)} = \nabla^n\varphi_k(0) \in H^{(n)}, \]
and note that $S_0^{(1)} = T_0^{(1)} = \Phi_J(0) = 0$. As for the $T_k^{(n)}$'s we compute, using

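(I.6.1),
\[
\begin{aligned}
T_0^{(2)} &= \sqrt{\tfrac12\nabla_y^2 W_0(0)} = \sqrt{\tfrac12\bigl(S_0^{(2)} + \tfrac12 V''(0) + J\bigr)},\\
T_0^{(3)} &= \tfrac12\, d\Gamma^{(3)}(T_0^{(2)})^{-1}\nabla_y^3 W_0(0) = \tfrac12\, d\Gamma^{(3)}(T_0^{(2)})^{-1}\bigl(S_0^{(3)} + \tfrac12\nabla^3 V(0)\bigr),\\
T_0^{(4)} &= \tfrac12\, d\Gamma^{(4)}(T_0^{(2)})^{-1}\bigl(\nabla_y^4 W_0(0) - 6\,T_0^{(3)} * T_0^{(3)}\bigr)\\
&= \tfrac12\, d\Gamma^{(4)}(T_0^{(2)})^{-1}\bigl(S_0^{(4)} + \tfrac12\nabla^4 V(0) - 6\,T_0^{(3)} * T_0^{(3)}\bigr).
\end{aligned}
\]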

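The $T_1^{(n)}$'s we need are (see (I.6.4))
\[
\begin{aligned}
T_1^{(1)} &= \tfrac12\,(T_0^{(2)})^{-1}\bigl(S_1^{(1)} + (\nabla_y\ln\det\nabla_y^2 f_0)(0, 0)\bigr),\\
T_1^{(2)} &= \tfrac14\,(T_0^{(2)})^{-1}\bigl(S_1^{(2)} + (\nabla_y^2\ln\det\nabla_y^2 f_0)(0, 0) - 2\,T_0^{(3)}T_1^{(1)}\bigr).
\end{aligned}
\]
We notice that we need the first two derivatives of $\varphi_1$. Let $U(x) = \nabla_y^2 f_0(x, \Phi_J(x))$. We start by computing the first two derivatives of $U$ at 0. We abbreviate, for $n \ge 1$,
\[ T_0^{(l,n)} = \nabla_x^l\nabla_y^n f_0(0, 0) \in H^{(l)} \otimes H^{(n)} \]
and interpret $T_k^{(n)}, S_k^{(n)} \in H^{(0)} \otimes H^{(n)}$. We extend the composition $*$ to the tensor product as follows. Let $T^i \in H^{(l_i)}$ and $S^i \in H^{(n_i)}$, with $n_i \ge 1$ and $i \in \{1, 2\}$. Then $(T^1 \otimes S^1) * (T^2 \otimes S^2) \in H^{(l_1+l_2)} \otimes H^{(n_1+n_2-2)}$ is defined by
\[ (T^1 \otimes S^1) * (T^2 \otimes S^2) = (\Sigma^{(l_1+l_2)} T^1 \otimes T^2) \otimes (S^1 * S^2). \]
Then
\[
\begin{aligned}
\nabla U(0) &= T_0^{(1,2)} + \Phi_J'(0)T_0^{(3)} \in H^{(1)} \otimes H^{(2)},\\
\nabla^2 U(0) &= T_0^{(2,2)} + 2(\Sigma^{(2)} \otimes I)\bigl(T_0^{(1,3)}\Phi_J'(0)\bigr) + \nabla^2\Phi_J(0)T_0^{(3)} + \Phi_J'(0)T_0^{(4)}\Phi_J'(0) \in H^{(2)} \otimes H^{(2)}.
\end{aligned}
\]
(Here one should group the indices in the tensor product in the obvious way.) As for the $T^{(l,n)}$'s we have, see (I.6.1),
\[
\begin{aligned}
T_0^{(1,1)} &= -\tfrac12 J\,(T_0^{(2)})^{-1},\\
T_0^{(1,2)} &= -\bigl(I \otimes d\Gamma^{(2)}(T_0^{(2)})^{-1}\bigr)\bigl(T_0^{(3)} * T_0^{(1,1)}\bigr),\\
T_0^{(1,3)} &= -\bigl(I \otimes d\Gamma^{(3)}(T_0^{(2)})^{-1}\bigr)\bigl(3\,T_0^{(3)} * T_0^{(1,2)} + T_0^{(4)} * T_0^{(1,1)}\bigr),\\
T_0^{(2,1)} &= -2\bigl(I \otimes (T_0^{(2)})^{-1}\bigr)\bigl(T_0^{(1,2)} * T_0^{(1,1)}\bigr),\\
T_0^{(2,2)} &= -\bigl(I \otimes d\Gamma^{(2)}(T_0^{(2)})^{-1}\bigr)\bigl(2\,T_0^{(1,2)} * T_0^{(1,2)} + 2\,T_0^{(1,3)} * T_0^{(1,1)} + T_0^{(3)} * T_0^{(2,1)}\bigr).
\end{aligned}
\]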


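We now compute the $S_1^{(n)}$'s, using (I.6.2),
\[
\begin{aligned}
S_1^{(1)} &= (I - \Phi_J'(0))^{-1}(\nabla\ln\det U)(0),\\
S_1^{(2)} &= \bigl(I - \Gamma^{(2)}(\Phi_J'(0))\bigr)^{-1}\bigl(\nabla^2\Phi_J(0)S_1^{(1)} + (\nabla^2\ln\det U)(0)\bigr).
\end{aligned}
\]
Using that $\ln\det = \operatorname{tr}\ln$ and $U(0) = T_0^{(2)}$ we find
\[
\begin{aligned}
(\partial_{x_i}\ln\det U)(0) &= \operatorname{tr}\bigl((T_0^{(2)})^{-1}\partial_{x_i}U(0)\bigr),\\
(\partial_{y_i}\ln\det\nabla_y^2 f_0)(0, 0) &= \operatorname{tr}\bigl((T_0^{(2)})^{-1}(T_0^{(3)}e_i)\bigr),\\
(\partial_{x_j}\partial_{x_i}\ln\det U)(0) &= -\operatorname{tr}\bigl((T_0^{(2)})^{-1}(\partial_{x_j}U(0))(T_0^{(2)})^{-1}\partial_{x_i}U(0)\bigr) + \operatorname{tr}\bigl((T_0^{(2)})^{-1}\partial_{x_i}\partial_{x_j}U(0)\bigr),\\
(\partial_{y_j}\partial_{y_i}\ln\det\nabla_y^2 f_0)(0, 0) &= -\operatorname{tr}\bigl((T_0^{(2)})^{-1}(T_0^{(3)}e_j)(T_0^{(2)})^{-1}(T_0^{(3)}e_i)\bigr) + \operatorname{tr}\bigl((T_0^{(2)})^{-1}(e_i T_0^{(4)} e_j)\bigr),
\end{aligned}
\]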

where $e_j \in H^{(1)}$ is the $j$'th standard basis vector in $\mathbb{R}^m$.

We are now left with computing $S_0^{(n)}$ and $\nabla^{n-1}\Phi_J(0)$. Here we will use formulas from Section II, namely (II.1.7) and (II.1.8). We furthermore exploit that $\varphi_0'$ is a fixed point of the map $T$, introduced in (II.1.6). First of all we get (notice that (II.1.8) at $x = 0$ becomes an equation for $\varphi_0''(0)$, since $\Phi_J(0) = 0$)
\[ S_0^{(2)} = \sqrt{JV''(0) + \tfrac14 V''(0)^2} \]
and
\[ \Phi_J'(0) = \frac{J}{J + \frac12 V''(0) + S_0^{(2)}}. \tag{I.7.1} \]
Taking derivatives of the equations in (II.1.6) yields the following
\[ \nabla^2\Phi_J(0) = -J^{-1}\Gamma^{(3)}(\Phi_J'(0))\bigl(\tfrac12\nabla^3 V(0) + S_0^{(3)}\bigr) \]
and hence
\[ S_0^{(3)} = \frac12\,\frac{I + \Gamma^{(3)}(\Phi_J'(0))}{I - \Gamma^{(3)}(\Phi_J'(0))}\,\nabla^3 V(0). \]
By taking an extra derivative of (II.1.6) we find the remaining 4-tensor
\[ S_0^{(4)} = \frac12\,\frac{I + \Gamma^{(4)}(\Phi_J'(0))}{I - \Gamma^{(4)}(\Phi_J'(0))}\,\nabla^4 V(0) + 3\bigl(I - \Gamma^{(4)}(\Phi_J'(0))\bigr)^{-1}R^{(4)}, \]
where $R^{(4)} \in H^{(4)}$ is given by
\[ \langle R^{(4)}, t_1 \otimes \cdots \otimes t_4\rangle_{(n)} = \bigl\langle \tfrac12\nabla^3 V(0) + S_0^{(3)},\ \Phi_J'(0)t_1 \otimes \Phi_J'(0)t_2 \otimes (t_3\nabla^2\Phi_J(0)t_4)\bigr\rangle_{(n)}. \]

The expression for $F_0$ is
\[ F_0^{(LT)} = \ln\det\sqrt{\tfrac12\Bigl(\sqrt{JV''(0) + \tfrac14 V''(0)^2} + \tfrac12 V''(0) + J\Bigr)}. \]
One can combine the computations above to get an expression for $F_1$ explicit (but lengthy) in terms of $J$ and the first 4 derivatives of $V$ evaluated at zero. We instead give an expression for $F_1$ under the simplifying assumption $\nabla^3 V(0) = 0$, which holds for reflection invariant potentials $V(-x) = V(x)$. In this case
\[ S_0^{(3)} = T_0^{(3)} = \nabla^2\Phi_J(0) = S_1^{(1)} = T_1^{(1)} = \nabla U(0) = T_0^{(1,2)} = T_0^{(2,1)} = R^{(4)} = 0. \]
We simplify further by assuming $m = 1$, which reduces tensor composition to multiplication of numbers and removes traces and determinants. Note that in this case $\Gamma^{(n)}(A) = A^n$ and $d\Gamma^{(n)}(A) = nA$, where $A \in \mathbb{R}$. We express the result in terms of $\Phi_J'(0)$, see (I.7.1). Under these assumptions we have
\[ F_1^{(LT)} = \frac{1}{32J^2}\,\frac{\Phi_J'(0)^2}{(1 - \Phi_J'(0)^2)^2}\,\partial^4 V(0). \]
One could continue and compute higher order coefficients as well. The same type of arguments gives the SC coefficients.

II Constructing the 0’th order phases II.1 The 0’th order low-temperature phase In this subsection we construct the 0’th order WKB-phase in the low-temperature limit. See P1) of Subsection I.1. Let the potential V be as in Subsection I.3 and let d be a comparison function for V . See (I.3.13), (I.3.14) and (I.3.15). We will consider the family of discrete dynamical systems on Rm × Rm introduced in (I.3.10). It has the form κ(x, ξ; J) = (κx (x, ξ; J), κξ (x, ξ; J)) x − J −1 ξ + J −1 12 V (x) . = ξ − 12 J −1 (V (κx (x, ξ; J)) − V (x))

(II.1.1)

The aim is to construct a branch of the incoming manifold near the hyperbolic fixed point $(0, 0)$, more precisely in the open convex set $D_\infty$, as the graph of the gradient of a function which is a fixed point for the map $\tilde T$, see (I.3.17). The map $\tilde T$ is closely connected with the Legendre transform in the sense that
\[ [\tilde T\varphi](x) = \frac12 V(x) + \frac J2 x^2 - \Bigl[L\Bigl(\frac12 V + \varphi + \frac J2 y^2\Bigr)\Bigr](Jx), \]


where
\[ [L\psi](x) = \sup_{y \in D_\infty}(x \cdot y - \psi(y)) \]
is the Legendre transform of $\psi$. Consider now the function class $\tilde S$ given by (I.3.18). We wish to show that $\tilde S$ is invariant under $\tilde T$. Obviously $[\tilde T\varphi](0) = 0$. The first assertion is that the infimum (in the transformation $\tilde T$) is attained in $D_\infty$. For $x \in D_\infty$ we consider $z \in \partial D_{d(x)}$ and $t \ge 0$. Compute for $z_t = z - td'(z) \in D_{d(x)}$
\[ \frac{d}{dt}\Bigl(\frac12 V(z_t) + \varphi(z_t) + \frac J2|z_t - x|^2\Bigr)\Big|_{t=0} = -\Bigl(\frac12 V'(z) + \varphi'(z) + J(z - x)\Bigr) \cdot d'(z). \]
Since $x \in D_{d(x)}$, $D_{d(x)}$ is convex and $d'(z)$ is normal to $\partial D_{d(x)}$ at $z$, we find that $(z - x) \cdot d'(z)$ is positive. By this argument, and the assumption that $V'(z) \cdot d'(z) > 0$, we conclude that the infimum, $\Phi(x)$, is attained at a point in the interior of $D_{d(x)} \subset D_\infty$. In particular this implies that $\Phi$ satisfies the critical equation
\[ J(x - \Phi(x)) = \frac12 V'(\Phi(x)) + \varphi'(\Phi(x)) \tag{II.1.2} \]
and the inequality
\[ d(\Phi(x)) < d(x), \quad\text{for } x \ne 0. \tag{II.1.3} \]

By the Implicit Function Theorem we see that the map $x \to \Phi(x)$ is smooth. We can now compute
\[ [\tilde T\varphi]'(x) = \frac12 V'(x) + J(x - \Phi(x)). \tag{II.1.4} \]
Using (II.1.3), and the convexity argument again, we find that $[\tilde T\varphi]'(x) \cdot d'(x) \ge 0$, for $x \in D_\infty$. Hence $\tilde S$ is invariant under $\tilde T$. Since $\Phi$ only depends on $\varphi'$, we obtain a new fixed point problem on the following space of vector valued functions with symmetric derivatives
\[ S = \{\xi \in C^\infty(D_\infty; \mathbb{R}^m) : \xi(0) = 0,\ \xi' \ge 0, \text{ and } \xi \cdot d' \ge 0\}. \tag{II.1.5} \]
Notice that elements of $S$ are gradients of convex functions (from $\tilde S$). The operation is (see (II.1.2) and (II.1.4))
\[ [T\xi](x) = \frac12 V'(x) + J(x - \Phi(x)), \quad\text{where}\quad J(x - \Phi(x)) = \frac12 V'(\Phi(x)) + \xi(\Phi(x)). \tag{II.1.6} \]
Taking the derivative of these relations we find that
\[ \Phi' = \frac{J}{J + \frac12 V''(\Phi) + \xi'(\Phi)} \tag{II.1.7} \]
and
\[ [T\xi]' = \frac12 V'' + \frac{J\bigl(\frac12 V''(\Phi) + \xi'(\Phi)\bigr)}{J + \frac12 V''(\Phi) + \xi'(\Phi)}. \tag{II.1.8} \]

This shows that functions $\xi$ with non-negative derivative are mapped into functions $T\xi$ with positive derivative. Hence $T$ maps gradients of convex functions into gradients of convex functions. We also see by (II.1.7) that $\Phi$'s derivative is positive. We will now show that the map $T$ has a unique fixed point in $S$. Let $\xi_1, \xi_2 \in S$ and write $\Phi_1$ and $\Phi_2$ for the corresponding vector-fields. By (II.1.6) and Taylor's formula, $T\xi_1 - T\xi_2 = J(\Phi_2 - \Phi_1)$ and
\[
\begin{aligned}
J(\Phi_2 - \Phi_1) &= \frac12\bigl(V'(\Phi_1) - V'(\Phi_2)\bigr) + \xi_1(\Phi_1) - \xi_2(\Phi_2)\\
&= \int_0^1\Bigl\{\frac12 V''(x_t) + \xi_1'(x_t)\Bigr\}\,dt\,(\Phi_1 - \Phi_2) + \xi_1(\Phi_2) - \xi_2(\Phi_2),
\end{aligned}
\]
where (by (II.1.3)) $x_t = t\Phi_1(x) + (1 - t)\Phi_2(x) \in D_{d(x)}$. This shows that
\[ J|\Phi_1 - \Phi_2|^2 \le -\frac12\lambda_l(x)|\Phi_1 - \Phi_2|^2 + |\xi_1(\Phi_2) - \xi_2(\Phi_2)|\,|\Phi_1 - \Phi_2|, \tag{II.1.9} \]
where $\lambda_l$ is given by (I.3.16). We thus get the following inequality, for any $0 < R < R_0$,
\[ \sup_{x \in \mathring{D}_R}|(T\xi_1)(x) - (T\xi_2)(x)| \le \frac{J}{J + \frac12\inf_{x \in D_R}\lambda_l(x)}\ \sup_{x \in \mathring{D}_R}|\xi_1(x) - \xi_2(x)|. \]
This shows that $T$ restricted to functions on $\mathring{D}_R$ is a strict contraction, with respect to the supremum norm. Hence it extends to a contraction on the closure (with respect to the supremum norm) and thus has a fixed point in the closure (of $S$ with $R_0 = R$). Obviously choosing $R$ larger provides an extension of the fixed point to a larger domain and hence the fixed point extends to an $L^\infty_{\mathrm{loc}}$ function on $D_\infty$. We will write $\xi_J$ for this fixed point.

Theorem II.1.1 The set $S$ is invariant under the transformation $T$ and there exists a unique fixed point $\xi_J \in S$ whose graph is the incoming manifold over $D_\infty$ for the symplectomorphism $\kappa$ at the hyperbolic fixed point $(0, 0)$. The derivative of $\xi_J$ satisfies
\[ a_l(x)I \le \xi_J'(x) \le a_u(x)I, \quad x \in D_\infty, \tag{II.1.10} \]
where $a_l$ and $a_u$ are given by (I.3.20).


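Proof. For $\xi \in S$ we define spectral bounds
\[
\begin{aligned}
\sigma_l^\xi(x) &= \sup\{\lambda \ge 0 : \xi'(y) \ge \lambda I,\ y \in D_{d(x)}\},\\
\sigma_u^\xi(x) &= \inf\{\lambda \ge 0 : \xi'(y) \le \lambda I,\ y \in D_{d(x)}\}.
\end{aligned}
\]
By (II.1.7) and (II.1.8) we have the a priori bounds
\[ J\bigl(J + \tfrac12\lambda_u + \sigma_u^\xi\bigr)^{-1}I \le \Phi' \le J\bigl(J + \tfrac12\lambda_l + \sigma_l^\xi\bigr)^{-1}I \tag{II.1.11} \]
and
\[ \frac12\lambda_l + \frac{J(\frac12\lambda_l + \sigma_l^\xi)}{J + \frac12\lambda_l + \sigma_l^\xi} \le [T\xi]' \le \frac12\lambda_u + \frac{J(\frac12\lambda_u + \sigma_u^\xi)}{J + \frac12\lambda_u + \sigma_u^\xi}. \tag{II.1.12} \]
This leads us to investigate the (contraction) map $\rho_t : [0, \infty) \to [0, \infty)$, given by
\[ \rho_t(s) = \frac12 t + \frac{J(\frac12 t + s)}{J + \frac12 t + s}, \]
for $t > 0$. Solving the equation $\rho_t(s) = s$ we find the fixed point
\[ s_f(t) = \sqrt{Jt + \tfrac14 t^2}. \]
The estimate (II.1.12) now implies that the set
\[ S_{a_l,a_u} = \{\xi \in S : a_l(x)I \le \xi'(x) \le a_u(x)I,\ x \in D_\infty\} \tag{II.1.13} \]
is invariant under $T$ and since $a_l(0)x \in S_{a_l,a_u}$, we find that it is a non-empty subset of $S$.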
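The scalar map $\rho_t$ is easy to iterate, and doing so makes the fixed point $s_f(t)$ concrete. The following is a minimal sketch with hypothetical function names; the parameter values used when calling it are arbitrary.

```python
import math

def rho(s, t, J):
    """The map rho_t(s) = t/2 + J (t/2 + s) / (J + t/2 + s)."""
    b = 0.5 * t + s
    return 0.5 * t + J * b / (J + b)

def iterate_rho(t, J, s0=0.0, steps=200):
    """Apply rho_t repeatedly, starting from s0."""
    s = s0
    for _ in range(steps):
        s = rho(s, t, J)
    return s

def s_f(t, J):
    """Closed-form fixed point sqrt(J t + t^2 / 4)."""
    return math.sqrt(J * t + 0.25 * t * t)
```

Since $\rho_t'(s) = J^2/(J + \frac12 t + s)^2 < 1$ for $s \ge 0$, the iterates converge geometrically from any non-negative starting value. For large $J$ the contraction factor at the fixed point approaches 1, so more steps are needed.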

Hence the fixed point $\xi_J$ restricted to $\mathring{D}_R$ lies in the closure of $S_{a_l,a_u}$ and is in particular a global Lipschitz function. Pick a sequence $\{\eta_n\} \subset S_{a_l,a_u}$ such that $\eta_n \to \xi_J$. Then the corresponding sequence of vector-fields $\Psi_n$ is Cauchy by (II.1.9) (with respect to the supremum norm on $\mathring{D}_R$) and we write $\Phi_J$ for the limit, which is also Lipschitz and satisfies the critical equation in (II.1.6) with $\xi_J$. This implies, since $\xi_J$ is a fixed point for $T$,
\[ \kappa(x, \xi_J(x); J) = (\Phi_J(x), \xi_J(\Phi_J(x))). \tag{II.1.14} \]
We now estimate using (II.1.11)
\[ |\Phi_J| = \lim_{n\to\infty}|\Psi_n| \le \frac{J}{J + \frac12\lambda_l}|x|, \]

which in turn implies that (recall $\xi_J(0) = 0$)
\[ \kappa^{(k)}(x, \xi_J(x); J) \to (0, 0), \quad\text{for } k \to +\infty. \]
This shows, together with (II.1.14), that the graph of $\xi_J$ is contained in the stable incoming manifold for $\kappa$ at the fixed point $(0, 0)$. From the global stable manifold theorem, see [I], we know that the incoming manifold has dimension $m$ and the same regularity as $V$. This implies that $\xi_J$ is smooth and its graph, over $\mathring{D}_R$, coincides with the local incoming manifold for $\kappa$. Since $0 < R < R_0$ was arbitrary we find $\xi_J \in S$. ✷

Let $\xi \in S_{a_l,a_u}$ and write $\Phi$ for the corresponding vector-field. From (II.1.10) and (II.1.11) we get the estimates
\[ J\bigl(J + \tfrac12\lambda_u + a_u\bigr)^{-1}I \le \Phi' \le J\bigl(J + \tfrac12\lambda_l + a_l\bigr)^{-1}I, \tag{II.1.15} \]
which imply the following estimate
\[ |\Phi(x)| \le \frac{J}{J + \frac12\lambda_l(x) + a_l(x)}|x|. \tag{II.1.16} \]
We note that Theorem I.3.2 follows from Theorem II.1.1 by choosing $\varphi_0^{(LT)}$ such that $\varphi_0^{(LT)}(0) = 0$ and $\nabla\varphi_0^{(LT)} = \xi_J$. Notice that the graph of $-\xi_J$ is the local outgoing manifold. We remark that the global incoming manifold can have several layers in $D_\infty$ and the global outgoing and incoming manifolds might even coincide. See the monograph [I] for the theory of stable manifolds for hyperbolic fixed points.

II.2 Approximating the Hamiltonian system

We saw in the last subsection, as discussed in Subsection I.5, that the incoming manifolds for $\kappa$ scale as $J^{-\frac12}$. We therefore consider the symplectically scaled transformation $\tau$ introduced in (I.5.1). This symplectomorphism again has $(0, 0)$ as a hyperbolic fixed point and the graph of $J^{-\frac12}\xi_J$ is the local incoming manifold for $(0, 0)$ in $D_\infty$. In this subsection we will prove that $\tau$, in the limit $J \to \infty$, gives the Hamiltonian flow (I.4.6) for the Hamiltonian (I.1.7). Here $V$ will be any smooth function on $\mathbb{R}^m$. The ideas employed here are common in theoretical numerical analysis, see for example [D]. We introduce a family of diffeomorphisms
\[ \Psi_t^n(x, \xi) = \tau^{(n)}\Bigl(x, \xi; \frac{n^2}{t^2}\Bigr), \quad t < 0, \tag{II.2.1} \]
and put $\Psi_0^n(x, \xi) = (x, \xi)$. Obviously $\Psi_t^n(x, \xi)$ is continuous at $t = 0$.


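Let $P \subset \mathbb{R}^m_x \times \mathbb{R}^m_\xi \times \mathbb{R}_t$ be the set of $(x, \xi, t)$ for which the Hamiltonian flow (I.4.6) starting at $(x, \xi)$ can be extended to time $t$. If $V''$ is globally bounded the system is globally Lipschitz and $P = \mathbb{R}^m_x \times \mathbb{R}^m_\xi \times \mathbb{R}_t$. If $V$ grows too rapidly the flow can go to infinity in finite time. For $t \in \mathbb{R}$ we define
\[ \tilde\tau(x, \xi; t, n) = \begin{pmatrix} x \\ \xi \end{pmatrix} + \frac tn\begin{pmatrix} \xi \\ V'(x) \end{pmatrix} + \frac{t^2}{2n^2}\begin{pmatrix} V'(x) \\ V''(x)\xi \end{pmatrix} \tag{II.2.2} \]
and
\[ \tilde\Psi_t^n(x, \xi) = \tilde\tau^{(n)}(x, \xi; t, n). \tag{II.2.3} \]
Then we have

Theorem II.2.1 Let $V \in C^\infty(\mathbb{R}^m)$. The diffeomorphisms $\tilde\Psi_t^n$ converge to the Hamiltonian flow (I.4.6), for the Hamiltonian (I.1.7), locally uniformly in $P$. In particular for any compact $K \subset P$ there exists $C_K > 0$ such that
\[ \sup_{(x,\xi,t) \in K}|\tilde\Psi_t^n(x, \xi) - \Psi_t(x, \xi)| \le C_K n^{-2}. \]

Proof. Let $K \subset P$ be compact. We can assume that if $(x, \xi, t) \in K$, $t \ge 0$ ($t \le 0$), then $(\Psi_{t-s}(x, \xi), r) \in K$ for $0 \le r \le s \le t$ ($0 \ge r \ge s \ge t$). Using Taylor's formula on the Hamiltonian flow around $t = 0$ we find
\[ \Psi_t(x, \xi) = \begin{pmatrix} x \\ \xi \end{pmatrix} + t\begin{pmatrix} \xi \\ V'(x) \end{pmatrix} + \frac{t^2}{2}\begin{pmatrix} V'(x) \\ V''(x)\xi \end{pmatrix} + t^3 R_t(x, \xi), \]
where the remainder $R_t(x, \xi)$ is bounded uniformly in $K$. By this computation we conclude that the error after one iteration is
\[ \sup_{(x,\xi,t) \in K}|\Psi_{\frac tn}(x, \xi) - \tilde\tau(x, \xi; t, n)| \le C_0 n^{-3}. \]
The next step will be to estimate how much $\tilde\tau$ distorts phase-space. We compute for $(x, \xi)$ and $(y, \eta)$ in $\mathbb{R}^m \times \mathbb{R}^m$. Using (II.2.2) we find the estimate
\[ |\tilde\tau(x, \xi; t, n) - \tilde\tau(y, \eta; t, n)| \le \Bigl(1 + \frac{C_1}{n}\Bigr)\left|\begin{pmatrix} x \\ \xi \end{pmatrix} - \begin{pmatrix} y \\ \eta \end{pmatrix}\right|, \tag{II.2.4} \]
where $C_1$ can be chosen locally uniformly in $(x, \xi, y, \eta, t)$ and $n \ge 1$. Pick $C_1$ such that the estimate holds uniformly in $(x, \xi, y, \eta, t, n)$ with $(x, \xi, t) \in K$, $(y, \eta, t) \in K$ and $n \ge 1$. We can now make the estimate
\[
\begin{aligned}
|\Psi_t(x, \xi) - \tilde\Psi_t^n(x, \xi)| &\le |\Psi_t(x, \xi) - \tilde\tau(\Psi_{t-\frac tn}(x, \xi); t, n)| + |\tilde\tau(\Psi_{t-\frac tn}(x, \xi); t, n) - \tilde\tau^{(n)}(x, \xi; t, n)|\\
&\le \Bigl(1 + \frac{C_1}{n}\Bigr)|\Psi_{t-\frac tn}(x, \xi) - \tilde\tau^{(n-1)}(x, \xi; t, n)| + \frac{C_0}{n^3}.
\end{aligned}
\]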

Iterating this estimate using the choice of $C_1$ and the inequality
\[ \Bigl(1 + \frac{C_1}{n}\Bigr)^k \le \Bigl(1 + \frac{C_1}{n}\Bigr)^n, \quad\text{for } 0 \le k \le n, \]
gives
\[ |\Psi_t(x, \xi) - \tilde\Psi_t^n(x, \xi)| \le \frac{C_0}{n^2}\Bigl(1 + \frac{C_1}{n}\Bigr)^n. \tag{II.2.5} \]
Since $(1 + \frac{C_1}{n})^n \to e^{C_1}$ as $n \to \infty$, we conclude the result. ✷

Let $P_- = \{(x, \xi, t) \in P : t \le 0\}$. Then we have

Corollary II.2.2 The family of diffeomorphisms $\Psi_t^n$, $t \le 0$, converges to the Hamiltonian flow (I.4.6), for the Hamiltonian (I.1.7), locally uniformly in $P_-$. In particular for any compact $K \subset P_-$ there exists $C_K > 0$ such that
\[ \sup_{(x,\xi,t) \in K}|\Psi_t(x, \xi) - \Psi_t^n(x, \xi)| \le C_K n^{-2}. \]
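The $n^{-2}$ rate of Theorem II.2.1 can be observed numerically. The sketch below assumes $m = 1$ and $V(x) = x^2/2$, for which the equations $\dot x = \xi$, $\dot\xi = V'(x)$ read off from the Taylor expansion in the proof have the explicit solution $x(t) = x\cosh t + \xi\sinh t$, $\xi(t) = x\sinh t + \xi\cosh t$. The function names and the initial data are illustrative choices, not part of the paper.

```python
import math

def tau_tilde(x, xi, t, n, dV, d2V):
    """One step of the map (II.2.2) for m = 1."""
    step = t / n
    new_x = x + step * xi + 0.5 * step ** 2 * dV(x)
    new_xi = xi + step * dV(x) + 0.5 * step ** 2 * d2V(x) * xi
    return new_x, new_xi

def psi_tilde(x, xi, t, n, dV, d2V):
    """n-fold iterate of tau_tilde, as in (II.2.3)."""
    for _ in range(n):
        x, xi = tau_tilde(x, xi, t, n, dV, d2V)
    return x, xi

def exact_flow_quadratic(x, xi, t):
    """Exact flow of x' = xi, xi' = x, i.e. V(x) = x^2 / 2."""
    return (x * math.cosh(t) + xi * math.sinh(t),
            x * math.sinh(t) + xi * math.cosh(t))

def scheme_error(n, x=1.0, xi=-0.5, t=-1.0):
    """Max-norm distance between the n-step scheme and the exact flow at time t."""
    ax, axi = psi_tilde(x, xi, t, n, lambda y: y, lambda y: 1.0)
    ex, exi = exact_flow_quadratic(x, xi, t)
    return max(abs(ax - ex), abs(axi - exi))
```

Doubling `n` should reduce `scheme_error(n)` by a factor close to 4, consistent with a bound of the form $C_K n^{-2}$.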

II.3 Solving the eikonal equation

In this subsection we return to the potentials considered in Subsection II.1 and the family of fixed points $\eta_J = J^{-\frac12}\xi_J$ connected to the rescaled diffeomorphism $\tau$ (see (I.5.1)). We know that
\[ b_l(x)I \le \eta_J'(x) \le b_u(x)I, \]
where
\[ b_l(x) = J^{-\frac12}a_l(x) \quad\text{and}\quad b_u(x) = J^{-\frac12}a_u(x). \]
Consider an $0 < R < R_0$. Since the derivative of $\eta_J$ is bounded from above and below uniformly in $J \ge 1$ and $x \in D_R$ we find that the sequence $\eta_J(x)$ is uniformly bounded for $x \in D_R$ and $J \ge 1$. We pick a compact set $\Sigma_R \subset \mathbb{R}^m$ such that
\[ (x, \eta_J(x)) \in D_R \times \Sigma_R, \quad\text{for } x \in D_R \text{ and } J \ge 1. \]
In the following $x$ will be a point in $D_R$ and $\xi \in \Sigma_R$ an accumulation point for $\eta_J(x)$. We pick a sequence $\{J_k\}$ (with $1 \le J_k \to \infty$) such that $\eta_{J_k}(x) \to \xi$ for $k \to \infty$.

Lemma II.3.1 Let $t \le 0$ and $\varepsilon > 0$. There exist $N_0 > 0$ and $C > 0$ such that
\[ |\Psi_t^n(x, \eta) - \Psi_t^n(x, \eta')| < \varepsilon + C|\eta - \eta'| \]
uniformly in $n \ge N_0$ and $\eta, \eta' \in \Sigma_R$.


Proof. We estimate as for (II.2.4) using (I.5.2),
\[ \Bigl|\tau\Bigl(x, \eta; \frac{n^2}{t^2}\Bigr) - \tau\Bigl(y, \eta'; \frac{n^2}{t^2}\Bigr)\Bigr| \le \Bigl(1 + \frac{C_1|t|}{n} + \frac{C_2 t^2}{n^2}\Bigr)\left|\begin{pmatrix} x \\ \eta \end{pmatrix} - \begin{pmatrix} y \\ \eta' \end{pmatrix}\right| + \frac{C_3|t|^3}{n^3}, \]
where the constants are chosen uniformly in $D_R \times \Sigma_R$. By iterating this estimate we get (see the argument for (II.2.5))
\[ |\Psi_t^n(x, \eta) - \Psi_t^n(x, \eta')| \le \Bigl(1 + \frac{C_1|t|}{n} + \frac{C_2 t^2}{n^2}\Bigr)^n\Bigl(\frac{C_3|t|^3}{n^2} + |\eta - \eta'|\Bigr) \]
and the lemma follows. ✷

The following lemma is the first example of a statement which is proved by choosing invariant subsets of $S_{a_l,a_u}$ (see (II.1.13)), thereby implying statements for the fixed point. This idea will be central for the next section.

Lemma II.3.2 The map $J \to \eta_J$ is smooth and we have the following bound for all $0 < R < R_0$ and $J \ge 1$
\[ \sup_{x \in D_R}\Bigl|\frac{d}{dJ}\eta_J(x)\Bigr| \le C_R J^{-1}, \]
for some (non-decreasing family) $C_R > 0$.

Proof. That $\xi_J$ (and hence $\eta_J$) is smooth with respect to $J > 0$ follows from the stable manifold theorem, see [I]. It can be verified independently however by applying the idea of this proof to higher order derivatives in $J$. This means that $T$ can be viewed as a map on the set $S^1 = C^\infty((0, \infty); S_{a_l,a_u})$ and it has a unique fixed point $\xi_J$. In analogy with the proof of Theorem II.1.1 we write, for $\xi \in S^1$ with associated vector-field $\Phi$,
\[ \sigma_\xi(J; x) = \sup_{y \in D_{d(x)}}\Bigl|\frac{d}{dJ}\xi(J; y)\Bigr|. \]
Taking the derivative with respect to $J$ of the critical relation in (II.1.6), we get
\[ \frac{d}{dJ}\Phi = J^{-1}\Phi'\Bigl(x - \Phi - \frac{d}{dJ}\xi(\Phi; J)\Bigr). \]
This implies
\[ \frac{d}{dJ}T\xi = (I - \Phi')(x - \Phi) + \Phi'\frac{d}{dJ}\xi(\Phi; J) \]
which together with (II.1.3), (II.1.7) and (II.1.16) gives the estimate
\[ \sigma_{T\xi}(J; x) \le t_J(x) + \theta(x)\sigma_\xi(J; x), \]

where
\[ \theta = \frac{J}{J + \frac12\lambda_l + a_l} \quad\text{and}\quad t_J(x) = \Bigl(\frac{\frac12\lambda_u(x) + a_u(x)}{J + \frac12\lambda_u(x) + a_u(x)}\Bigr)^2\sup_{y \in D_{d(x)}}|y|. \]
As in the proof of Theorem II.1.1, we can consider the (contraction) map $\rho_{t,\theta}(s) = t + \theta s$ and conclude that the fixed point $\xi_J$ must satisfy the estimate $\sigma_{\xi_J} \le s_f(t_J, \theta)$, where
\[ s_f(t_J, \theta) = \frac{t_J}{1 - \theta} \]
is the fixed point of $\rho_{t_J,\theta}$. Note that for $J$ large we have by (I.3.20)
\[ t_J \sim J^{-1}\lambda_u\sup_{y \in D_{d(x)}}|y| \quad\text{and}\quad 1 - \theta \sim J^{-\frac12}\sqrt{\lambda_l} \]
uniformly in $D_R$, $0 < R < R_0$. The result now follows since $\eta_J = J^{-\frac12}\xi_J$. ✷

We now combine the first two lemmas to get control of the Hamiltonian flow for $t < 0$.

Lemma II.3.3 The Hamiltonian flow beginning at $(x, \xi)$ can be extended to $t = -\infty$. Furthermore
\[ \Psi_t(x, \xi) \in D_{d(x)} \times \Sigma_{d(x)}, \quad\text{for } t \le 0. \]

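Proof. Assume the Hamiltonian flow exists for $0 \ge t > t_c > -\infty$. Let $\varepsilon > 0$. By Corollary II.2.2 there exists $N_1 > 0$ such that
\[ |\Psi_t(x, \xi) - \Psi_t^n(x, \xi)| \le \frac\varepsilon3, \]
for $n \ge N_1$. By Lemma II.3.1, applied with $(\eta, \eta') = (\xi, \eta_{J_k})$, we thus get an $N_0$ and a $K_0$ such that
\[ |\Psi_t(x, \xi) - \Psi_t^n(x, \eta_{J_k}(x))| \le \frac{2\varepsilon}{3}, \tag{II.3.1} \]
for $n \ge \max\{N_0, N_1\}$ and $k \ge K_0$. Let
\[ \tilde J_k = \frac{([t\sqrt{J_k}] + 1)^2}{t^2}, \quad t < 0. \]
Then $n_{k,t} = t\sqrt{\tilde J_k}$ is an integer and
\[ 0 \le \tilde J_k - J_k \le \frac{4}{t^2}\bigl(1 + t\sqrt{J_k}\bigr). \]
Using Lemma II.3.2 we thus get (uniformly in $k$ and $D_{d(x)}$)
\[ |\eta_{J_k} - \eta_{\tilde J_k}| \le \int_{J_k}^{\tilde J_k}\Bigl|\frac{d}{dJ}\eta_J\Bigr|\,dJ \le C\,\frac{1 + t\sqrt{J_k}}{t^2 J_k}. \]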


This estimate shows, in conjunction with (II.3.1), that for each $0 > t > t_c$ there exists $K_0 > 0$ large enough such that
\[ |\Psi_t(x, \xi) - \Psi_t^{n_{k,t}}(x, \eta_{\tilde J_k}(x))| \le \varepsilon, \]
for $k \ge K_0$. Here we have used Lemma II.3.1 with $(\eta, \eta') = (\eta_{J_k}, \eta_{\tilde J_k})$. This implies that $\Psi_t(x, \xi)$ can be approximated by elements of $D_{d(x)} \times \Sigma_{d(x)}$, see (II.1.14). Since this set is closed, $\Psi_t(x, \xi)$ must be there itself. We have now shown $\Psi_t(x, \xi) \in D_{d(x)} \times \Sigma_{d(x)}$, for $0 \ge t > t_c$. The flow is thus contained in a compact set and we find by general theory that it can be extended beyond $t_c$, see [HiSm]. This concludes the proof. ✷

Corollary II.3.4 The accumulation point $(x, \xi)$ lies on the outgoing manifold for the hyperbolic fixed point $(0, 0)$ of the Hamiltonian flow.

Proof. Let $x \in D_\infty \setminus \{0\}$. We write $(x(t), \xi(t))$ for the flow $\Psi_t(x, \xi)$ with $\Psi_0(x, \xi) = (x, \xi)$. Suppose there exist $\varepsilon > 0$ and a sequence $\{t_n\}_{n \in \mathbb{N}}$ with $t_n \to -\infty$ such that $x(t_n)^2 + \xi(t_n)^2 \ge \varepsilon$. First notice that by Hamilton's equations (I.4.6) and convexity of $V$, the quantities
\[ h_1(t) = x(t) \cdot \xi(t) \quad\text{and}\quad h_2(t) = \xi(t) \cdot V'(x(t)) \]
are increasing in $t$ (and positive at $t = 0$). Now compute for $t, t_0 < 0$
\[
\begin{aligned}
x(t)^2 + \xi(t)^2 &= \int_{t_0}^{t}\int_{t_0}^{s}\frac{d}{dr}\bigl(h_1(r) + h_2(r)\bigr)\,dr\,ds + (t - t_0)\bigl(h_1(t_0) + h_2(t_0)\bigr) + x(t_0)^2 + \xi(t_0)^2\\
&\ge -|t_0 - t|\bigl(h_1(0) + h_2(0)\bigr) + x(t_0)^2 + \xi(t_0)^2.
\end{aligned}
\]
This shows that for $\delta = \frac{\varepsilon}{2(h_1(0) + h_2(0))}$ we have
\[ x^2(t) + \xi^2(t) \ge \frac\varepsilon2 \quad\text{for } |t_n - t| \le \delta,\ n \in \mathbb{N}. \]
By Lemma II.3.3 we have $V''(x(s)) \ge \lambda_l(x)I$, $s \le 0$. We thus get (with $\rho = \min\{1, \lambda_l(x)\}$)
\[
\begin{aligned}
h_1(t_n) &= -\int_{t_n}^{0}\bigl(\xi(s)^2 + x(s) \cdot V'(x(s))\bigr)\,ds + h_1(0)\\
&\le -\rho\int_{t_n}^{0}\bigl(x(s)^2 + \xi(s)^2\bigr)\,ds + h_1(0)\\
&\le -n\rho\varepsilon\delta + h_1(0),
\end{aligned}
\]

which contradicts the requirement given by Lemma II.3.3 that the trajectory stays in the compact set $D_{d(x)} \times \Sigma_{d(x)}$. ✷

With this result established we proceed to the

Proof of Theorem I.5.1. As mentioned earlier the outgoing manifold may have many branches in $D_\infty \times \mathbb{R}^m$. We will argue that there is at most one point $(x,\xi) \in D_\infty \times \mathbb{R}^m$ on the outgoing manifold which, propagated by $\Psi_t$, does not leave this set on its way to $(0,0)$. If this is the case then Corollary II.3.4 shows that for each $x$ such a $\xi = \eta_\infty(x)$ exists and since it is unique it is the limit of $\eta_J(x)$ as $J \to \infty$ and hence Lipschitz. Since the outgoing manifold of the Hamiltonian flow is $m$ dimensional and smooth, we conclude that $\eta_\infty$ is smooth and its graph is the branch of the outgoing manifold in $D_\infty \times \mathbb{R}^m$ going through $(0,0)$. The function $\psi$ in the theorem is chosen such that $\psi(0) = 0$ and $\psi' = \eta_\infty$, which is possible since $\eta_\infty'$ is symmetric. That $\psi$ solves the eikonal equation follows from conservation of energy.

Let $x \in D_\infty$ and suppose there exist $\xi_1, \xi_2 \in \mathbb{R}^m$ such that $\Psi_{-t}(x,\xi_1) = (x_1(-t),\xi_1(-t))$ and $\Psi_{-t}(x,\xi_2) = (x_2(-t),\xi_2(-t))$ stay in $D_{d(x)} \times \mathbb{R}^m$ and converge to $(0,0)$ as $t \to +\infty$. Let $h(t) = (x_1(-t) - x_2(-t))\cdot(\xi_1(-t) - \xi_2(-t))$. We have
\[ \dot h(t) = -|\xi_1(-t) - \xi_2(-t)|^2 - (x_1(-t) - x_2(-t))\cdot\big(V'(x_1(-t)) - V'(x_2(-t))\big). \tag{II.3.2} \]
Since $x_1(-t), x_2(-t) \in D_{d(x)}$ and $V$ is convex on the convex set $D_{d(x)}$ with Hessian bounded from below, we find by Taylor's formula
\[ \dot h(t) \le -|\xi_1(-t) - \xi_2(-t)|^2 - \rho\,|x_1(-t) - x_2(-t)|^2 \]
for some $\rho > 0$. This shows on one hand that $h(t)$ is non-increasing as $t \to +\infty$. On the other hand we know that
\[ h(0) = 0 \quad\text{and}\quad \lim_{t\to\infty} h(t) = 0 \]
by assumption. Hence $h(t) = 0$ for any $t \ge 0$. Since the right-hand side of (II.3.2) is now forced to vanish for all $t \ge 0$, we have in particular that $\xi_1 = \xi_2$. This concludes the proof. ✷

We note that the estimates provided by $a_l$ and $a_u$ are not optimal since they are required to be monotone, see (I.3.16). Only in the case where $V$ is of quadratic type do they even capture the growth rate. Around the minimum, however, these bounds are optimal.

II.4 Further results for the 0’th order phases

In this subsection we discuss some additional properties of the constructions presented in Section II. The first observation concerns symmetries. Let $G \subset O(m)$ be



a subgroup of the orthogonal group. We say a function $f : \mathbb{R}^m \to \mathbb{R}$ is $G$-invariant if $f(gx) = f(x)$, for $g \in G$. Notice that if $\tilde d$ is a comparison function for a $G$-invariant potential $V$ then
\[ d(x) = \int_G \tilde d(gx)\,dg \]
is a $G$-invariant comparison function for $V$. Here $dg$ is the Haar measure on $G$. Now let $d$ be $G$-invariant (then $gD_R = D_R$, for $g \in G$). The following result follows by replacing $\tilde S$ and $S$ in Subsection II.1 by
\[ \tilde S_G = \{\varphi \in \tilde S : \varphi(gx) = \varphi(x),\ g \in G,\ x \in D_\infty\} \]
and
\[ S_G = \{\xi \in S : g^{-1}\xi(gx) = \xi(x),\ g \in G,\ x \in D_\infty\}, \]
noting that the construction using functions from these classes is invariant under $G$.

Proposition II.4.1 Let $G$ be a subgroup of $O(m)$ and suppose $V$ is $G$-invariant. Then the low-temperature phase constructed in Theorem II.1.1 and the solution to the eikonal equation given by Theorem I.5.1 are $G$-invariant.

The next observation is concerned with extending the domain for which we have a solution. Let $\mathcal{D} = \mathcal{D}(V)$ denote the class of comparison functions, $d$, for a potential $V$ in the sense of Subsection II.1. For each $d$ we have a maximal convex domain $D_\infty(d)$ in which the construction works. Patching these domains together we get solutions in domains of the form
\[ \bigcup_{d\in\mathcal{D}} D_\infty(d). \tag{II.4.1} \]
This can be used in applications to get solutions in some non-convex domains and we will use it in a later paper to construct solutions in $\ell^\infty$ neighbourhoods of the critical point with control in high dimension. See [Sj1] and [He4]. In [So] local solutions to the eikonal equation are constructed in $\ell^2$ balls with radius scaling like the square root of the dimension.

Let $d \ge 0$ be a smooth convex function on an open subset of $\mathbb{R}^m$ containing zero. Suppose $d(0) = d'(0) = 0$. Let $D_\infty = \mathring{D}_R$ for some $R > 0$ for which $D_\infty$ is convex. For such $R$ and $\sigma > 0$ we consider the potential class
\[ \mathcal{V}_{R,\sigma} = \{V \in C^\infty(D_\infty) : V \ge 0,\ V(0) = 0,\ V'' \ge \sigma I,\ V'\cdot d' > 0 \text{ in } D_\infty\setminus\{0\}\}. \]
Write
\[ \theta(\sigma) = \frac{J}{J + \frac12\sigma + \sqrt{J\sigma + \frac14\sigma^2}}. \]


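Proposition II.4.2 Let $V_1, V_2 \in \mathcal{V}_{R,\sigma}$. We have
\[ \sup_{x\in D_\infty} \|\nabla(\varphi_{0,1} - \varphi_{0,2})\| \le \frac{1}{2(1-\theta(\sigma))} \sup_{x\in D_\infty} \|\nabla(V_1 - V_2)\|, \]
and for the associated vector-fields, see (II.1.6), we have
\[ \sup_{x\in D_\infty} \|\Phi_1 - \Phi_2\| \le \frac{\theta(\sigma)+1}{2J(1-\theta(\sigma))} \sup_{x\in D_\infty} \|\nabla(V_1 - V_2)\|. \]
As for the solutions to the eikonal equation, $\psi_1$ and $\psi_2$, we have
\[ \sup_{x\in D_\infty} \|\nabla(\psi_1 - \psi_2)\| \le \frac{1}{2\sqrt{\sigma}} \sup_{x\in D_\infty} \|\nabla(V_1 - V_2)\|. \]

Proof. Write
\[ S_\sigma = \Big\{\xi \in S : \xi' \ge \sqrt{J\sigma + \tfrac14\sigma^2}\, I\Big\}. \]
Then $S_\sigma$ is invariant under $T_1$ and $T_2$ by the choice of $\mathcal{V}_{R,\sigma}$ and the proof of the lower bound in Theorem II.1.1. Here $T_i$ will denote the map $T$ associated with the potential $V_i$. Let $\xi_1, \xi_2 \in S_\sigma$ and write $\Phi_1$ and $\Phi_2$ for the vector-fields obtained from $\xi_1$ and $\xi_2$ using the critical equation in (II.1.6) with potentials $V_1$ and $V_2$ respectively. We compute as for the argument which gave the contraction property of $T$ on $S$, see (II.1.9), and get for $\xi_1, \xi_2 \in S_\sigma$
\[ \sup_{x\in D_\infty} \|\Phi_1 - \Phi_2\| \le \frac{\theta(\sigma)}{J} \sup_{x\in D_\infty} \|\xi_1 - \xi_2\|. \]
Choosing $\xi_1$ and $\xi_2$ as the fixed points of $T_1$ and $T_2$ restricted to $S_\sigma$ thus implies
\[ \sup_{x\in D_\infty} \|\xi_1 - \xi_2\| \le \frac{1}{2(1-\theta(\sigma))} \sup_{x\in D_\infty} \|\nabla(V_1 - V_2)\|. \]
This gives the first estimate since $\nabla\varphi_{0,i} = \xi_i$. The second estimate now follows directly from the first and the last follows from Theorem I.5.1 by noting that
\[ \lim_{J\to\infty} J^{-\frac12}\,\frac{1}{2(1-\theta(\sigma))} = \frac{1}{2\sqrt{\sigma}}. \]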

✷ This proposition is a special case of a more general result for parameter dependent potentials Vz , which gives continuity of mixed derivatives of the LT phase with respect to the potential (provided the potentials are taken from some a priori class like VR,σ ). We do not elaborate further but refer the reader to a coming paper in which we will control the constructions presented here with respect to the dimension m, using the framework of standard function calculus (see [Sj2]).
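The limit used at the end of the proof of Proposition II.4.2 is easy to check numerically. The following Python sketch evaluates $J^{-1/2}/(2(1-\theta(\sigma)))$ for growing $J$, assuming that $\theta(\sigma) = J/(J + \frac12\sigma + \sqrt{J\sigma + \frac14\sigma^2})$; the function names and the sample value $\sigma = 2$ are ours and purely illustrative.

```python
import math

def one_minus_theta(J, sigma):
    """1 - theta(sigma), written without cancellation for large J."""
    root = math.sqrt(J * sigma + 0.25 * sigma ** 2)
    return (0.5 * sigma + root) / (J + 0.5 * sigma + root)

def scaled_constant(J, sigma):
    """J^(-1/2) / (2 (1 - theta(sigma))), the quantity whose J -> infinity limit is taken."""
    return J ** -0.5 / (2.0 * one_minus_theta(J, sigma))

sigma = 2.0                                  # illustrative lower bound on the Hessian
limit = 1.0 / (2.0 * math.sqrt(sigma))       # claimed limit 1 / (2 sqrt(sigma))
for J in (1e2, 1e4, 1e6, 1e8):
    print(f"J = {J:.0e}: {scaled_constant(J, sigma):.8f}  (limit {limit:.8f})")
```

In this sketch the values decrease towards the limit, with a relative deviation of order $\sqrt{\sigma/J}$.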



III Global analysis

III.1 Restriction to global minima

In this part of the paper we wish to analyze the localization properties of the transfer operator in the LT limit and use the WKB-construction of Helffer, as presented in Subsections I.3 and I.6, of the localized first eigenfunction to gain information on the corresponding eigenvalue. We note that the methods presented here do not directly extend to give control with respect to high dimension.

Let $V \in C^\infty(\mathbb{R}^m)$ be a non-negative potential with a finite number of non-degenerate global minima, $\{z_1,\dots,z_k\}$, with $V(z_i) = 0$. Choose $R$ small enough such that $V$ is convex in each of the $k$ disjoint balls $B_i = B(z_i,R) = \{x \in \mathbb{R}^m : |x - z_i| \le R\}$. Write $B_0 = \mathbb{R}^m \setminus \cup_{i=1}^k B_i$ and assume
\[ \rho = \inf_{x\in B_0} V(x) > 0. \]
We write
\[ r_0 = \min_{1\le i<j\le k} \operatorname{dist}(B_i,B_j). \]
Let $K_i$, $1 \le i \le k$, be the restriction of the full transfer operator $K$, see (I.3.1), to the ball $B_i$. That is the bounded operator with the kernel $K_i(x,y) = K(x,y)|_{B_i\times B_i}$ on the Hilbert space $L^2(B_i)$. One can verify that $K$ becomes trace class if $V$ satisfies some growth estimate at infinity (see [He5]). That property is however not needed for the analysis here. Let $\chi_i$ be the characteristic function for the ball $B_i$, $0 \le i \le k$. First we estimate the contribution from the region away from the global minima
\[ \|\chi_0 K\| \le \sup_{x\in B_0} e^{-\frac{1}{2h}V(x)} = e^{-\frac{\rho}{2h}}. \tag{III.1.1} \]
Let $\varphi_1, \varphi_2 \in L^2(\mathbb{R}^m)$. For $1 \le i < j \le k$ we have
\begin{align*}
|\langle\chi_i\varphi_1, K\chi_j\varphi_2\rangle| &\le (\pi h)^{-\frac m2} \int_{\mathbb{R}^m}\int_{\mathbb{R}^m} \chi_i(x)\chi_j(y)\, e^{-\frac{J}{2h}|x-y|^2}\, |\varphi_1(x)||\varphi_2(y)|\,dx\,dy \\
&\le (\pi h)^{-\frac m2}\, |B|\, e^{-\frac{J r_0^2}{2h}}\, \|\varphi_1\|\,\|\varphi_2\|,
\end{align*}
where $|B|$ denotes the volume of the balls $B_i$. This implies the following estimate on the coupling between minima
\[ \|\chi_i K \chi_j\| \le (\pi h)^{-\frac m2} e^{-\frac{J r_0^2}{2h}} \quad\text{for } 1 \le i < j \le k. \tag{III.1.2} \]
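The passage from the double integral to the factor $|B|$ is a Cauchy–Schwarz step. A short sketch, assuming only that $|x - y| \ge r_0$ whenever $x \in B_i$ and $y \in B_j$ with $i \ne j$, and that all balls have the common volume $|B|$:

```latex
\begin{align*}
\int\!\!\int \chi_i(x)\chi_j(y)\, e^{-\frac{J}{2h}|x-y|^2}\, |\varphi_1(x)|\,|\varphi_2(y)|\,dx\,dy
  &\le e^{-\frac{J r_0^2}{2h}}\, \|\chi_i\varphi_1\|_{L^1}\, \|\chi_j\varphi_2\|_{L^1} \\
  &\le e^{-\frac{J r_0^2}{2h}}\, |B_i|^{1/2}\,|B_j|^{1/2}\, \|\varphi_1\|_{L^2}\, \|\varphi_2\|_{L^2} \\
  &= e^{-\frac{J r_0^2}{2h}}\, |B|\, \|\varphi_1\|_{L^2}\, \|\varphi_2\|_{L^2}.
\end{align*}
```

The first inequality bounds the Gaussian factor pointwise on $B_i \times B_j$; the second is Cauchy–Schwarz applied to $\chi_i\cdot|\varphi_1|$ and $\chi_j\cdot|\varphi_2|$ separately.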


In the following $C$ will denote positive constants and $\delta$ will denote positive exponents. Both $C$ and $\delta$ may vary within the same calculation but are uniform with respect to small $h > 0$. We choose the first eigenfunction $\psi_1$ normalized and positive and we write $\lambda^i_1$, $1 \le i \le k$, for the first eigenvalue of the operator $K_i$. Let
\[ \mu_1 = \max_{1\le i\le k} \lambda^i_1. \]
The estimates (III.1.1) and (III.1.2) show on one hand that
\begin{align*}
\lambda_1 &\le \sum_{i=1}^{k} \langle\chi_i\psi_1, K\chi_i\psi_1\rangle + Ce^{-\frac{\delta}{h}} \\
&\le \sum_{i=1}^{k} \lambda^i_1 \|\chi_i\psi_1\|^2 + Ce^{-\frac{\delta}{h}} \\
&\le \mu_1 + Ce^{-\frac{\delta}{h}}. \tag{III.1.3}
\end{align*}
On the other hand we write $\psi^i_1$ for the normalized first eigenfunction of the operator $K_i$ extended to the whole space by the 0 function. Again we choose it positive. Then by the variational principle
\begin{align*}
\lambda_1 &\ge \max_{1\le i\le k} \langle\psi^i_1, K\psi^i_1\rangle \\
&= \max_{1\le i\le k} \langle\psi^i_1, K_i\psi^i_1\rangle \\
&= \mu_1. \tag{III.1.4}
\end{align*}
We have the a priori estimate (see [He2] and [He5])
\[ 0 < C_1 \le \lambda_1 \le C_2, \tag{III.1.5} \]
where $C_1$ and $C_2$ can be chosen uniformly in small $h > 0$. The lower bound follows from an application of Segal’s Lemma (or harmonic approximation which also applies to the $\mu^i_1$’s). The estimates (III.1.3) - (III.1.5) give

Theorem III.1.1 There exists $h_0 > 0$ such that
\[ \mu_1 \le \lambda_1 \le \mu_1\big(1 + Ce^{-\frac{\delta}{h}}\big) \]
for $0 < h \le h_0$.

The estimates (III.1.1) and (III.1.2) also yield

Proposition III.1.2 Let $V$ be a symmetric double well (i.e. $z_1 = -z_2$ and $V(x) = V(-x)$). There exists $h_0 > 0$ such that
\[ 0 \le \ln\frac{\lambda_1}{\lambda_2} \le Ce^{-\frac{\delta}{h}} \]
for $0 < h \le h_0$.



Proof. Choose $B_2 = -B_1$ disjoint and let $\tilde\psi_2 = \chi_1\psi_1 - \chi_2\psi_1$. Then by symmetry $\tilde\psi_2 \perp \psi_1$, and by (III.1.1) and (III.1.5)
\begin{align*}
1 \ge \|\tilde\psi_2\|^2 &= 1 - \|\chi_0\psi_1\|^2 \\
&= 1 - \|\lambda_1^{-1}\chi_0 K\psi_1\|^2 \\
&\ge 1 - Ce^{-\frac{\delta}{h}}.
\end{align*}
The result now follows from the variational principle, (III.1.1) and (III.1.2).

It would be interesting to have a more detailed understanding of the optimal exponent $\delta$ in Proposition III.1.2. In the case of a Schrödinger operator the optimal exponent is given by the Agmon distance between the two wells. A less ambitious problem would be to determine a lower bound for the splitting to complement the upper bound given here. See also [He2] for estimates on the splitting.

III.2 The localized operator and the WKB-construction

The aim of this subsection is to analyze the first eigenvalue $\lambda^i_1$ of the localized transfer operator $K_i$. First we choose the $B_i$’s of the previous section so small that the WKB-construction (at each critical point $1 \le i \le k$) of Subsection I.6 is defined in $B_i$. Let $f^i_N \in L^2(B_i)$ be given by
\[ f^i_N(x;h) = e^{-\frac1h\varphi^i_N(x;h)}, \qquad \varphi^i_N(x;h) = \sum_{k=0}^{N} \varphi^i_k(x)h^k. \]
Here $\varphi^i_k$ are the WKB-phases associated with the critical point $z_i$. By the variational principle
\[ \lambda^i_1\|f^i_N\|^2_{L^2(B_i)} \ge \langle f^i_N, K_i f^i_N\rangle_{L^2(B_i)} \ge e^{-F^i_N(h) - C_N h^{N+1}}\|f^i_N\|^2_{L^2(B_i)}, \]
where
\[ F^i_N(h) = \sum_{k=0}^{N} F^i_k h^k \]
and the $F^i_k$’s are the expansion coefficients obtained during the construction of the $\varphi^i_k$’s. We will now employ a trick due to Helffer and Ramond [HeRa] to get the opposite inequality. The following estimate holds for all strictly positive $\psi$
\[ \|K_i\|_{B(L^2(B_i))} \le \sup_{x\in B_i} \int_{B_i} K(x,y)\frac{\psi(y)}{\psi(x)}\,dy. \]


This estimate follows from the proof of the Schur Lemma. Choosing $\psi = f^i_N$ gives
\[ \lambda^i_1 \le e^{-F^i_N(h) + C_N h^{N+1}}. \]
In other words

Proposition III.2.1 For each $N \in \mathbb{N}$ there exists $h_0 > 0$ such that
\[ \ln\lambda^i_1 = -F^i_N(h) + O(h^{N+1}) \]
for $0 < h \le h_0$.

Let $\mathcal{F}^{-1} = \{1,\dots,k\}$ and define for $N \ge 0$ sets
\[ \mathcal{F}^N = \{i \in \mathcal{F}^{N-1} : F^j_N \le F^i_N \text{ for all } j \in \mathcal{F}^{N-1}\} \]
and
\[ \mathcal{F}^\infty = \cap_{N=0}^{\infty}\mathcal{F}^N \ne \emptyset. \]
Notice that $F^i_N(h) = F^j_N(h)$ for $i, j \in \mathcal{F}^N$ (for $N = \infty$ in the sense of formal power series) and there exists $h_0 > 0$ such that $F^i_N(h) > F^j_N(h)$ for $i \in \mathcal{F}^N$, $j \notin \mathcal{F}^N$ and $0 < h \le h_0$. Combining this with Theorem III.1.1 and Proposition III.2.1 we find the following two results.

Theorem III.2.2 For each $N \in \mathbb{N}$ there exists $h_0 > 0$ such that
\[ \ln\lambda_1 = -F^i_N(h) + O(h^{N+1}) \]
for some $i \in \mathcal{F}^N$ and $0 < h \le h_0$.

Corollary III.2.3 There exists $h_0 > 0$ such that for $0 < h \le h_0$
\[ \|(1 - \chi_\infty)\psi_1\| \le Ce^{-\frac{\delta}{h}}, \quad\text{where } \chi_\infty = \sum_{i\in\mathcal{F}^\infty}\chi_i. \]

Proof. Let $\mu_2' = \max_{i\notin\mathcal{F}^\infty}\lambda^i_1$ and estimate, using (III.1.1), (III.1.2) and an argument similar to the one that gave (III.1.3),
\begin{align*}
\lambda_1 = \langle\psi_1, K\psi_1\rangle &\le \mu_1\|\chi_\infty\psi_1\|^2 + \mu_2'\|(1-\chi_\infty)\psi_1\|^2 + Ce^{-\frac{\delta}{h}} \\
&= \mu_1 + (\mu_2' - \mu_1)\|(1-\chi_\infty)\psi_1\|^2 + Ce^{-\frac{\delta}{h}}.
\end{align*}
By Proposition III.2.1 we find that $\mu_1 - \mu_2' \ge Ch^N$, for some $C > 0$ and $N \in \mathbb{N}$, and this together with Theorem III.1.1 proves the result.



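In the proof of the next result we will need another version of the $\mu_2'$ used in the proof of the previous result. Namely
\[ \mu_2'' = \max_{i\in\mathcal{F}^\infty}\{\lambda^i_2\}, \]
where the $\lambda^i_2$’s are the second eigenvalues of the restricted operators $K_i$ (see Theorem III.2.5). One can use harmonic approximation to verify that $\lambda^i_1 - \lambda^i_2 \ge C > 0$ uniformly in small $h$, see [He5]. Hence we have
\[ \mu_1 - \mu_2'' \ge C > 0. \tag{III.2.1} \]

Proposition III.2.4 Let $h_0 > 0$ be small enough. For all $0 < h \le h_0$ there exists $\omega = \omega(h) = \{\omega_i\}_{i\in\mathcal{F}^\infty}$ with $\omega_i > 0$ and $\sum_{i\in\mathcal{F}^\infty}\omega_i^2 = 1$ such that
\[ \|\psi_1 - \psi_{1,\omega}\| \le Ce^{-\frac{\delta}{h}}, \quad\text{where } \psi_{1,\omega} = \sum_{i\in\mathcal{F}^\infty}\omega_i\psi^i_1. \]

Proof. Let, for $i \in \mathcal{F}^\infty$,
\[ \tilde\omega_i = \langle\psi_1,\psi^i_1\rangle \quad\text{and}\quad \omega_i = \frac{\tilde\omega_i}{\sqrt{\sum_{i\in\mathcal{F}^\infty}\tilde\omega_i^2}}. \]
We introduce the subspace $H_0 \subset L^2(\mathbb{R}^m)$ of linear combinations of the functions $\psi^i_1$, $i \in \mathcal{F}^\infty$ (extended to 0 outside $B_i$). Notice that $\varphi \in H_0^\perp$ will satisfy that $\varphi|_{B_i} \perp \psi^i_1$, for $i \in \mathcal{F}^\infty$. We wish to prove that
\[ |\langle\varphi,\psi_1\rangle| \le Ce^{-\frac{\delta}{h}} \tag{III.2.2} \]
for $\varphi \in H_0^\perp$. This will imply the result since $\psi_{1,\omega}$ is exactly the orthogonal projection of $\psi_1$ onto $H_0$. The estimate (III.2.2) holds by (III.2.3) for $\varphi$ with support outside $B_\infty = \cup_{i\in\mathcal{F}^\infty}B_i$. As for functions with support inside $B_\infty$ we introduce the projection onto $H_0$
\[ P\varphi = \sum_{i\in\mathcal{F}^\infty}\langle\psi^i_1,\varphi\rangle\psi^i_1 \]
and the operator (on $L^2(B_\infty)$)
\[ K_\infty = \sum_{i\in\mathcal{F}^\infty}K_i. \]
Notice that $PK_\infty = K_\infty P$ and $I - P$ is the projection onto functions in $H_0^\perp$ restricted to $B_\infty$. We can now estimate using $\|K_\infty|_{\mathrm{Range}(I-P)}\| \le \mu_2''$
\begin{align*}
\|(I-P)\chi_\infty\psi_1\|^2 &= \langle(I-P)(\mu_1 - K_\infty)^{-1}(I-P)\chi_\infty\psi_1,\ (\mu_1 - K_\infty)\chi_\infty\psi_1\rangle \\
&\le (\mu_1 - \mu_2'')^{-1}\|(I-P)\chi_\infty\psi_1\|\,\|(\mu_1 - K_\infty)\chi_\infty\psi_1\|. \tag{III.2.3}
\end{align*}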


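By (III.1.2) we have
\[ \|K_\infty - \chi_\infty K\chi_\infty\| \le Ce^{-\frac{\delta}{h}}, \]
which we combine with (III.2.1) and (III.2.3) to get the estimate
\begin{align*}
\|(I-P)\chi_\infty\psi_1\| &\le C\|(\mu_1 - K_\infty)\chi_\infty\psi_1\| \\
&\le C\|(\mu_1 - K)\chi_\infty\psi_1\| + Ce^{-\frac{\delta}{h}} \\
&\le C\|(\mu_1 - K)\psi_1\| + Ce^{-\frac{\delta}{h}},
\end{align*}
where we used Corollary III.2.3 in the last step. The result now follows from an application of Theorem III.1.1. ✷

If at least two of the weights $\omega_i$ vanish at most polynomially in $h$ then the splitting between the two first eigenvalues of the transfer operator is $O(h^\infty)$. We omit the proof which follows from Proposition III.2.4. See also Proposition III.1.2.

Theorem III.2.5 Suppose $|\mathcal{F}^\infty| = 1$. There exists $h_0 > 0$ such that for $0 < h \le h_0$
\[ \lambda_2 = \mu_2\big(1 + O(1)e^{-\frac{\delta}{h}}\big), \]
where $\mu_2 = \max\{\mu_2', \mu_2''\}$.

Proof. Write $\mathcal{F}^\infty = \{i_0\}$. The first step of the proof is an analogue of the estimate (III.1.3)
\begin{align*}
\lambda_2 = \langle\psi_2, K\psi_2\rangle &\le \langle\chi_{i_0}\psi_2, K\chi_{i_0}\psi_2\rangle + \sum_{i\ne i_0}\langle\chi_i\psi_2, K\chi_i\psi_2\rangle + Ce^{-\frac{\delta}{h}} \\
&\le \mu_2''\|\chi_{i_0}\psi_2\|^2 + \mu_2'\|(1 - \chi_{i_0})\psi_2\|^2 + Ce^{-\frac{\delta}{h}} \\
&\le \mu_2 + Ce^{-\frac{\delta}{h}}. \tag{III.2.4}
\end{align*}
As for the other inequality we use the variational principle. Let $\tilde\psi^{i_0}_2 = \psi^{i_0}_2 - \langle\psi_1,\psi^{i_0}_2\rangle\psi_1$ and for the other critical points, $i \ne i_0$, we choose $\tilde\psi^i_2 = \psi^i_1 - \langle\psi_1,\psi^i_1\rangle\psi_1$. Then $\tilde\psi^i_2 \perp \psi_1$, $1 \le i \le k$, and by Proposition III.2.4
\[ 1 \ge \|\tilde\psi^{i_0}_2\|^2 = 1 - |\langle\psi^{i_0}_2,\psi_1\rangle|^2 = 1 - |\langle\psi^{i_0}_2,\psi_1 - \psi^{i_0}_1\rangle|^2 \ge 1 - Ce^{-\frac{\delta}{h}}. \]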



For $i \ne i_0$ we estimate using Corollary III.2.3
\[ 1 \ge \|\tilde\psi^i_2\|^2 = 1 - |\langle\psi_1,\psi^i_1\rangle|^2 \ge 1 - Ce^{-\frac{\delta}{h}}. \]
We thus get
\[ \lambda_2 \ge \mu_2 - Ce^{-\frac{\delta}{h}}, \]
which together with (III.2.4) and the a priori estimate $\mu_2 \ge C > 0$ uniformly in small $h > 0$ implies the result. One can use harmonic approximation to show the a priori estimate for $\mu^{i_0}_2$, which is a one well problem (see [He5]), and the estimate $\mu^i_1 \ge C$ is covered by (III.1.5) (see Theorem III.1.1).

III.3 Globally convex potentials

In the case where the potential $V$ is globally convex the constructions given in this paper are global as well. In particular the 0’th order phases (in the LT and SC limits) are also globally convex and if the Hessian of $V$ is bounded uniformly from below or above then so are the 0’th order phases (see Theorem II.1.1 and Theorem I.5.1). In the special case where the Hessian of the potential is bounded uniformly from both above and below then one can estimate all the higher order phases globally as well, in the sense that they all grow at most linearly at infinity and have uniformly bounded derivatives (provided all derivatives of the potential of order 3 and higher are uniformly bounded). We will not prove this result here for two reasons. The proof is rather technical and lengthy and we will need an extended version of the proof in a later paper in order to show that the WKB-constructions considered here are stable with respect to the standard function class introduced by Sjöstrand (see [Sj2]). Notice that this result underscores that the choice of ansatz for the first eigenfunction employed here is a natural one.

We note that one can use such a result to give an alternative proof of Theorem III.2.2. In the case of the SC limit it seems to be necessary to know something about the decay of the first eigenfunction outside wells, in order to make a localization argument work. Alternatively one can make the assumption of uniform bounds (from above and below) on the Hessian of $V$ and prove the analogue of Theorem III.2.2 in the SC limit, using the uniform control of higher order phases. This would complete the work of [HeRa] for a class of globally convex potentials, which includes the Kac potential for temperatures above the Curie temperature.

It is natural to ask whether or not the 0’th order phases can be extended smoothly beyond a region of convexity into a region of attraction where
\[ x\cdot V'(x) > 0 \quad\text{for } x \ne 0. \]

In one dimension the eikonal equation (the 0’th order phase in the strong-coupling limit) has the smooth solution x ϕ(x) = sign(x) V (y)dy. 0


However a numerical test for the one-dimensional potential
\[ V(x) = x^2 + \frac{3}{4}\,x\sin(x) \]
shows that the incoming manifold for the corresponding diffeomorphism $\kappa$ ceases to be a graph (over configuration space) outside an interval which grows with $J$ (to include more and more regions of non-convexity). The 0’th order LT phase is therefore not a globally smooth function in this example.
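The two properties of this example that matter for the discussion (attraction towards the origin, but no global convexity) can be checked directly. The sketch below assumes the potential reads $V(x) = x^2 + \frac34 x\sin(x)$; this is only one reading of the formula, and the grid and variable names are ours. It does not reproduce the numerical test on the incoming manifold itself.

```python
import math

def V(x):
    """Assumed form of the example potential: x^2 + (3/4) x sin(x)."""
    return x * x + 0.75 * x * math.sin(x)

def dV(x):
    """First derivative V'(x)."""
    return 2.0 * x + 0.75 * math.sin(x) + 0.75 * x * math.cos(x)

def d2V(x):
    """Second derivative V''(x)."""
    return 2.0 + 1.5 * math.cos(x) - 0.75 * x * math.sin(x)

# V is even, so it suffices to scan 0 < x <= 20.
xs = [k * 0.001 for k in range(1, 20001)]
attracting = all(x * dV(x) > 0.0 for x in xs)            # region of attraction: x V'(x) > 0
first_nonconvex = next(x for x in xs if d2V(x) < 0.0)    # first grid point where V'' < 0
print(attracting, first_nonconvex)
```

Under this reading, $x\,V'(x) \ge \frac12 x^2$ for all $x$, while $V''$ first changes sign slightly beyond $x = 2$, so the region of convexity around the minimum is a bounded interval.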

III.4 Chains of continuous spins

In this subsection we will briefly review the consequences of the results of Section III in the context of a ferromagnetic, $J > 0$, system of continuous spins arranged on the one-dimensional lattice $\mathbb{Z}$. See [K] for a more detailed analysis.

Let $V \in C^\infty(\mathbb{R}^n)$ be a self-energy. It should satisfy the conditions introduced in Subsection III.1 and we suppose for simplicity that it grows at least linearly at infinity. The energy of a spin-distribution at finite volume $\Lambda = \{-L,\dots,L\} \subset \mathbb{Z}$, $L \in \mathbb{N}$, is given by
\[ H_\Lambda(\sigma) = \sum_{i\in\Lambda} V(\sigma_i) + \frac{J}{2}\sum_{i\sim j}|\sigma_i - \sigma_j|^2, \]
where $i \sim j$ means nearest neighbour with the periodic boundary condition $-L \sim L$. The spin distribution $\sigma$ takes values in the one-particle space $\mathbb{R}^n$. We note that the quantities discussed here are independent of the choice of boundary condition. The partition function and free energy of the system are
\[ Z_\Lambda(\beta) = \int_{\mathbb{R}^{n|\Lambda|}} e^{-\beta H_\Lambda(\sigma)}\,d^{n|\Lambda|}\sigma \quad\text{and}\quad F_\Lambda(\beta) = -\frac{\ln Z_\Lambda(\beta)}{\beta|\Lambda|}. \]
It is well known that in the thermodynamic limit ($L \to \infty$) we have
\[ -\beta\lim_{L\to\infty} F_\Lambda(\beta) = \frac{n}{2}\ln\pi - \frac{n}{2}\ln\beta + \ln\lambda_1(\beta), \]
where $\lambda_1$ is the highest eigenvalue of the transfer operator with kernel (I.3.1), potential $V$ and coupling constant $J$. In other words Theorem III.2.2 gives a low-temperature expansion of the free energy in the thermodynamic limit. We recall that $F_0$ and $F_1$, the 0’th and 1’st order parts of $\ln\lambda_1$, are computed explicitly in Subsection I.7.
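The relation between the free energy and $\lambda_1$ can be illustrated numerically for a single spin component ($n = 1$). The sketch below is not the construction of the paper: it assumes that the kernel (I.3.1) has the form $K(x,y) = (\pi h)^{-1/2}\exp\big(-\tfrac1h[\tfrac12 V(x) + \tfrac{J}{2}|x-y|^2 + \tfrac12 V(y)]\big)$ with $h = 1/\beta$, discretizes it on a uniform grid, and finds the top eigenvalue by power iteration. For the harmonic self-energy $V(x) = \tfrac12\sigma x^2$ a Gaussian integration (our own check, not a formula from the text) gives $\lambda_1 = \sqrt{2/(J + \tfrac12\sigma + \sqrt{J\sigma + \tfrac14\sigma^2})}$, which serves as a reference value. All names and parameter values are illustrative.

```python
import math

def top_eigenvalue(V, J, h, half_width=5.0, n=161, iters=60):
    """Largest eigenvalue of the discretized kernel
    K(x, y) = (pi h)^(-1/2) exp(-(V(x)/2 + J (x - y)^2 / 2 + V(y)/2) / h)
    on [-half_width, half_width], computed by power iteration."""
    dx = 2.0 * half_width / (n - 1)
    xs = [-half_width + i * dx for i in range(n)]
    pref = (math.pi * h) ** -0.5
    # Nystrom matrix: kernel values times the uniform quadrature weight dx
    K = [[pref * math.exp(-(0.5 * V(x) + 0.5 * J * (x - y) ** 2 + 0.5 * V(y)) / h) * dx
          for y in xs] for x in xs]
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(kij * vj for kij, vj in zip(row, v)) for row in K]
        lam = sum(wi * vi for wi, vi in zip(w, v)) / sum(vi * vi for vi in v)  # Rayleigh quotient
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return lam

sigma, J, beta = 1.0, 1.0, 2.0
h = 1.0 / beta                     # assumption: h plays the role of the temperature 1/beta
lam_num = top_eigenvalue(lambda x: 0.5 * sigma * x * x, J, h)
# Reference value for the harmonic case (Gaussian integration, see the lead-in)
lam_exact = math.sqrt(2.0 / (J + 0.5 * sigma + math.sqrt(J * sigma + 0.25 * sigma ** 2)))
# Thermodynamic-limit formula with n = 1
minus_beta_F = 0.5 * math.log(math.pi) - 0.5 * math.log(beta) + math.log(lam_num)
print(lam_num, lam_exact, minus_beta_F)
```

Replacing the lambda by another self-energy (for instance a double well) gives a quick way to explore the dependence of $\ln\lambda_1$ on $\beta$, at the cost of choosing `half_width` and `n` so that the grid resolves the Gaussian factor.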



Other quantities of interest are expectations of the Gibbs measure, which at finite volume is given by
\[ d\mu_{\beta,\Lambda} = Z_\Lambda(\beta)^{-1} e^{-\beta H_\Lambda}\,d^{n|\Lambda|}\sigma. \]
Of particular interest is the truncated two-point correlation function
\[ \mathrm{cor}_{\beta,\Lambda}(i,j) = \langle\sigma_i\cdot\sigma_j\rangle - \langle\sigma_i\rangle\cdot\langle\sigma_j\rangle, \quad\text{where } \langle f\rangle = \int_{\mathbb{R}^{n|\Lambda|}} f(\sigma)\,d\mu_{\beta,\Lambda}. \tag{III.4.1} \]
Another well known connection between the transfer operator and the thermodynamic limit of one-dimensional spin-systems is the following relation
\[ \lim_{L\to\infty}\ln\mathrm{cor}_{\beta,\Lambda}(i,j) \sim -\ln\Big(\frac{\lambda_1}{\lambda_2}\Big)\,|i - j|, \]
asymptotically for large $|i - j|$. In other words the inverse correlation length of the system is given by the logarithm of the splitting between the two first eigenvalues of the transfer operator. In the case where the self-energy is a symmetric double well we see from Proposition III.1.2 that the correlation length can become exponentially large in the low-temperature limit. In the case where $|\mathcal{F}^\infty| = 1$ (in particular in the case of a unique global minimum) one can apply Theorem III.2.5 to localize the problem and compute the asymptotics of the correlation length at low temperatures using for example harmonic approximation or WKB-analysis. In [BJS] the LT limit of the correlation function (III.4.1) is determined completely (for any lattice dimension and one-dimensional spins), under assumptions which include a unique critical point for $V$. Here we get the correlation length at low temperatures under much weaker conditions on $V$ (for a one-dimensional lattice). We refer the reader to [F], [K], [He4] and [He5] for a more thorough discussion of the connection between transfer operators and spin-systems.

References

[BJS] V. Bach, T. Jecko and J. Sjöstrand, Correlation asymptotics of classical spin systems with nonconvex Hamilton function at low temperature, Ann. Henri Poincaré 1, 59–100 (2000).

[D] J.P. Demailly, Analyze numérique et équation differentielles, Presses Universitaire de Grenoble (1991).

[F] J. Fröhlich, Phase transitions, Goldstone bosons and topological superselection rules, Acta Phys. Austriaca, Suppl. XV 133–269 (1976).

[He1] B. Helffer, Around a stationary phase theorem in large dimension, J. Func. Anal. 119, 217–252 (1994).

[He2] B. Helffer, On Laplace integrals and transfer operators in large dimension: Examples in the non-convex case, Letters in Math. Phys. 38, 297–312 (1996).

[He3] B. Helffer, Semi-classical analysis for the transfer operator: WKB constructions in dimension 1, Partial differential equations and mathematical physics. Birkhäuser Verlag, PNLDE. 21, 168–180 (1996).

[He4] B. Helffer, Semiclassical analysis for transfer operators: WKB constructions in large dimension, Commun. Math. Phys. 187, 81–113 (1997).

[He5] B. Helffer, Semiclassical analysis and statistical mechanics, Unpublished lecture notes (1998).

[HeRa] B. Helffer and T. Ramond, Semiclassical study of the thermodynamical limit of the ground state energy of Kac’s operator, to appear in Long time behaviour of classical and quantum systems (Ed. S. Graffi, A. Martinez), World Scientific Publishing.

[HiSm] M.W. Hirsch and S. Smale, Differential equations, dynamical systems, and linear algebra, Academic Press (1974).

[HK] E. Helfand and M. Kac, Study of several lattice systems with long-range forces, J. Math. Phys. 4, 1078–1088 (1963).

[I] M.C. Irwin, Smooth dynamical systems, Academic Press (1970).

[K] M. Kac, Mathematical mechanisms of phase transitions, Brandeis lectures, Gordon and Breach (1966).

[Si] B. Simon, Semiclassical analysis of low-lying eigenvalues I, Ann. Inst. Poincaré 38, 295–307 (1983).

[Sj1] J. Sjöstrand, Potential wells in high dimensions I, Ann. Inst. Poincaré 58, 1–41 (1993).

[Sj2] J. Sjöstrand, Evolution equations in a large number of variables, Math. Nachr. 166, 17–53 (1994).

[So] V. Sordoni, Schrödinger operators in high dimension: Convex potentials, Asymptotic Anal. 13, 109–129 (1996).

Jacob Schach Møller Université Paris-Sud Département de Mathématique F-91405 Orsay France email: [email protected] Communicated by Gian Michele Graf submitted 20/04/01, accepted 1/06/01



Band Gap of the Schrödinger Operator with a Strong δ-Interaction on a Periodic Curve

P. Exner and K. Yoshitomi

Abstract. In this paper we study the operator $H_\beta = -\Delta - \beta\delta(\cdot - \Gamma)$ in $L^2(\mathbb{R}^2)$, where $\Gamma$ is a smooth periodic curve in $\mathbb{R}^2$. We obtain the asymptotic form of the band spectrum of $H_\beta$ as $\beta$ tends to infinity. Furthermore, we prove the existence of the band gap of $\sigma(H_\beta)$ for sufficiently large $\beta > 0$. Finally, we also derive the spectral behaviour for $\beta \to \infty$ in the case when $\Gamma$ is non-periodic and asymptotically straight.

1 Introduction

In this paper we are going to discuss some geometrically induced spectral properties of singular Schrödinger operators which can be formally written as $H_\beta = -\Delta - \beta\delta(\cdot - \Gamma)$, where $\Gamma$ is an infinite curve in the plane. This problem stems from physical interest in the quantum mechanics of electrons confined to narrow tubelike regions usually dubbed “quantum wires”. Such systems are often modeled by means of Schrödinger operators on curves, or more generally, on graphs. This is an idealization, however, because in reality the electrons are confined in a potential well of a finite depth, and therefore one can find them also in the exterior of such a “wire”, even if not too far since this is a classically forbidden region. The generalized Schrödinger operators mentioned above provide us with a simple model which can take such tunneling effects into account.

Singular interactions have been studied by numerous authors – see the classical monograph [AGHH], and the recent volume [AK] for an up-to-date bibliography. While the general concepts are well known, the particular case of a δ-interaction supported by a curve attracted much less attention; we can mention [BT, BEKŠ] and a recent article [EI], where a nontrivial relation between spectral properties and the geometry of the curve $\Gamma$ was found for the first time. It was followed by our previous paper [EY], where we posed the question about the strong coupling asymptotic behaviour, $\beta \to \infty$, of the eigenvalues of $H_\beta$ in the case when $\Gamma$ was a loop. We have shown there that the asymptotics is given by the spectrum of the Schrödinger operator on $L^2(\Gamma)$ with a curvature-induced potential.

Here we are going to discuss a similar problem in the situation when $\Gamma$ is an infinite smooth curve without self-intersections. We pay most attention to the case of a periodic $\Gamma$ where we find the asymptotic form of the spectral bands and prove existence of open band gaps for $\beta > 0$ large enough provided $\Gamma$ is not a straight line.
We also treat the case of a non-straight Γ which is straight asymptotically, and



thus by [EI] it gives rise to a nonempty discrete spectrum; we find the behaviour of these eigenvalues for $\beta \to \infty$. While the basic idea is the same as in [EY], namely combination of a bracketing argument with the use of suitable curvilinear coordinates in the vicinity of $\Gamma$, the periodic case requires several more tools.

Let us review briefly the contents of the paper. In the following section we present a formulation of the problem and state the results. Section 3 is devoted to the proof of our main result, Theorem 2.1. We perform the Floquet-Bloch reduction and estimate the discrete spectrum of the fiber operator $H_{\beta,\theta}$ using a Dirichlet-Neumann bracketing and approximate operators with separated variables. As a corollary we obtain the existence of open gaps for $\beta$ large enough. To get more specific information on the last question, we derive in Section 4 a sufficient condition under which the $n$th gap is open for a given $n$. The final section deals with the case of an asymptotically straight $\Gamma$.

2 Main results

Let us first introduce the needed notation and formulate the problem. The main topic of this paper is the Schrödinger operator with a δ-interaction on a periodic curve. Let $\Gamma : \mathbb{R} \ni s \mapsto (\Gamma_1(s),\Gamma_2(s)) \in \mathbb{R}^2_{x,y}$ be a curve which is parametrized by its arc length. Let $\gamma : \mathbb{R} \to \mathbb{R}$ be the signed curvature of $\Gamma$, i.e. $\gamma(s) := (\Gamma_1''\Gamma_2' - \Gamma_2''\Gamma_1')(s)$. We impose on it the following assumptions:

(A.1) $\gamma \in C^2(\mathbb{R})$.

(A.2) There exists $L > 0$ such that $\gamma(\cdot + L) = \gamma(\cdot)$ on $\mathbb{R}$.

(A.3) $\int_0^L \gamma(t)\,dt = 0$.

Given $\beta > 0$, we define
\[ q_\beta(f,f) = \|\nabla f\|^2_{L^2(\mathbb{R}^2)} - \beta\int_\Gamma |f(x)|^2\,dS \quad\text{for } f \in H^1(\mathbb{R}^2). \]
By $H_\beta$ we denote the self-adjoint operator associated with the form $q_\beta$. The operator $H_\beta$ can be formally written as $-\Delta - \beta\delta(\cdot - \Gamma)$. Our main purpose is to study the asymptotic behaviour of the band spectrum of $H_\beta$ as $\beta$ tends to infinity.

Let $\alpha \in [0,2\pi)$ be the angle between the vectors $\Gamma'(0)$ and $(1,0)$: $\Gamma'(0) = (\cos\alpha,\sin\alpha)$. We define new coordinates $(x',y')$ by
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x - \Gamma_1(0) \\ y - \Gamma_2(0) \end{pmatrix}. \]
From now on, we work in the coordinates $(x',y')$, where the curve $\Gamma$ assumes the form
\[ \Gamma_1(s) = \int_0^s \cos\Big(-\int_0^t \gamma(u)\,du\Big)\,dt, \qquad \Gamma_2(s) = \int_0^s \sin\Big(-\int_0^t \gamma(u)\,du\Big)\,dt. \]


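Combining these relations with (A.3), we have $\Gamma(\cdot + L) - \Gamma(\cdot) = (K_1,K_2)$ on $\mathbb{R}$, where
\[ K_1 = \int_0^L \cos\Big(-\int_0^t \gamma(u)\,du\Big)\,dt, \qquad K_2 = \int_0^L \sin\Big(-\int_0^t \gamma(u)\,du\Big)\,dt. \tag{2.1} \]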
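The reconstruction of $\Gamma$ from its curvature and the period shift $(K_1,K_2)$ in (2.1) can be illustrated with a small numerical experiment. The sketch below integrates $\Gamma'(s) = (\cos\phi(s),\sin\phi(s))$ with $\phi(s) = -\int_0^s\gamma(u)\,du$ by the midpoint rule; the curvature $\gamma(s) = \cos s$ with $L = 2\pi$ is a hypothetical example chosen because it satisfies (A.1)–(A.3), and the function and variable names are ours.

```python
import math

def trace_curve(gamma, L, steps=2000, periods=2):
    """Integrate Gamma'(s) = (cos phi(s), sin phi(s)), phi(s) = -int_0^s gamma(u) du.
    Returns the points Gamma(k L / steps) and the largest |phi| over the first period."""
    ds = L / steps
    phi, x, y = 0.0, 0.0, 0.0
    points = [(x, y)]
    max_turn = 0.0
    for k in range(periods * steps):
        phi_next = phi - gamma((k + 0.5) * ds) * ds   # midpoint rule for the turning angle
        phi_mid = 0.5 * (phi + phi_next)              # angle used for this arc-length step
        x += math.cos(phi_mid) * ds
        y += math.sin(phi_mid) * ds
        phi = phi_next
        if k < steps:
            max_turn = max(max_turn, abs(phi))
        points.append((x, y))
    return points, max_turn

L = 2.0 * math.pi
steps = 2000
points, max_turn = trace_curve(lambda s: math.cos(s), L, steps)  # illustrative mean-zero curvature
K1, K2 = points[steps]                                           # Gamma(L) - Gamma(0)
# Largest deviation of Gamma(s + L) - Gamma(s) from (K1, K2) over one period
shift_error = max(math.hypot(points[k + steps][0] - points[k][0] - K1,
                             points[k + steps][1] - points[k][1] - K2)
                  for k in range(steps + 1))
print(K1, K2, max_turn, shift_error)
```

For this particular curvature $\int_0^t\gamma = \sin t$, so the turning angle stays below $\pi/2$ and $K_1 = \int_0^{2\pi}\cos(\sin t)\,dt > 0$, while $K_2$ vanishes by symmetry.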

In the vicinity of $\Gamma$ one can introduce the natural locally orthogonal system of curvilinear coordinates. By $\Phi$ we denote the map
\[ \mathbb{R}^2 \ni (s,u) \mapsto (\Phi_1(s,u),\Phi_2(s,u)) = (\Gamma_1(s) - u\Gamma_2'(s),\ \Gamma_2(s) + u\Gamma_1'(s)) \in \mathbb{R}^2. \]
We further impose the following assumptions on $\Gamma$:

(A.4) $K_1 > 0$.

(A.5) There exists $a_0 > 0$ such that the map $\Phi|_{[0,L)\times(-a,a)}$ is injective and $\Phi((0,L)\times(-a,a)) \subset (0,K_1)\times\mathbb{R}$ for all $a \in (0,a_0)$.

As in the proof of [Yo, Proposition 3.5], we notice that the assumptions (A.4) and (A.5) are satisfied, e.g., if $\max_{t\in[0,L]}\big|\int_0^t\gamma(s)\,ds\big| < \pi/2$; on the other hand, this condition is by no means necessary. Let us also remark that in general the choice of the initial point $s = 0$ is important in checking the assumptions (A.4) and (A.5). We put $\Lambda = (0,K_1)\times\mathbb{R}$. For $\theta \in [0,2\pi)$, we define
\begin{align*}
Q_\theta &= \{u \in H^1(\Lambda);\ u(K_1, K_2 + \cdot) = e^{i\theta}u(0,\cdot) \text{ on } \mathbb{R}\}, \\
q_{\beta,\theta}(f,f) &= \|\nabla f\|^2_{L^2(\Lambda)} - \beta\int_{\Gamma((0,L))}|f(x)|^2\,dS \quad\text{for } f \in Q_\theta.
\end{align*}
By $H_{\beta,\theta}$ we denote the self-adjoint operator associated with the form $q_{\beta,\theta}$. We shall prove in Lemma 3.1 the unitary equivalence
\[ H_\beta \cong \int_0^{2\pi}\oplus H_{\beta,\theta}\,d\theta. \tag{2.2} \]
By Lemma 3.3 this implies
\[ \sigma(H_\beta) = \bigcup_{\theta\in[0,2\pi)}\sigma(H_{\beta,\theta}). \tag{2.3} \]
Since $\Gamma((0,L))$ is compact, we infer by Lemma 3.2 that
\[ \sigma_{\mathrm{ess}}(H_{\beta,\theta}) = [0,\infty). \tag{2.4} \]




Next we need a comparison operator on the curve. For a fixed $\theta \in [0,2\pi)$ we define
\[ S_\theta = -\frac{d^2}{ds^2} - \frac14\gamma(s)^2 \quad\text{in } L^2((0,L)) \]
with the domain
\[ P_\theta = \{u \in H^2((0,L));\ u(L) = e^{i\theta}u(0),\ u'(L) = e^{i\theta}u'(0)\}. \]
For $j \in \mathbb{N}$, we denote by $\mu_j(\theta)$ the $j$th eigenvalue of the operator $S_\theta$ counted with multiplicity. This allows us to formulate our main result.

Theorem 2.1 Let $n$ be an arbitrary integer. There exists $\beta(n) > 0$ such that
\[ \sharp\sigma_d(H_{\beta,\theta}) \ge n \quad\text{for } \beta \ge \beta(n) \text{ and } \theta \in [0,2\pi). \]
For $\beta \ge \beta(n)$ we denote by $\lambda_n(\beta,\theta)$ the $n$th eigenvalue of $H_{\beta,\theta}$ counted with multiplicity. Then $\lambda_n(\beta,\theta)$ admits an asymptotic expansion of the form
\[ \lambda_n(\beta,\theta) = -\frac14\beta^2 + \mu_n(\theta) + O(\beta^{-1}\log\beta) \quad\text{as } \beta \to \infty, \]
where the error term is uniform with respect to $\theta \in [0,2\pi)$.

Combining this result with Borg’s theorem on the inverse problem for Hill’s equation, we obtain the following corollary about the existence of the band gap of $\sigma(H_\beta)$.

Corollary 2.2 Assume that $\gamma \ne 0$, i.e. that $\Gamma$ is not a straight line. Then there exist $m \in \mathbb{N}$ and $G_m > 0$ such that
\[ \lim_{\beta\to\infty}\Big(\min_{\theta\in[0,2\pi)}\lambda_{m+1}(\beta,\theta) - \max_{\theta\in[0,2\pi)}\lambda_m(\beta,\theta)\Big) = G_m. \]

We would like to know, of course, which gaps in the spectrum open as $\beta \to \infty$. To this aim we prove a sufficient condition which guarantees this property for a fixed gap index $n$. Let $\{c_j\}_{j=1}^\infty$ and $\{d_j\}_{j=0}^\infty$ be the Fourier coefficients of $\frac14\gamma(s)^2$:
\[ \frac14\gamma(s)^2 = \sum_{j=1}^{\infty} c_j\sin\frac{2\pi j}{L}s + \sum_{j=0}^{\infty} d_j\cos\frac{2\pi j}{L}s \quad\text{in } L^2((0,L)). \]

Proposition 2.3 Let $n \in \mathbb{N}$. Assume that $0 < \sqrt{c_n^2 + d_n^2} < \frac{12\pi^2}{L^2}n^2$ and
\[ \max_{s\in[0,L]}\Big|\frac14\gamma(s)^2 - d_0 - c_n\sin\frac{2n\pi}{L}s - d_n\cos\frac{2n\pi}{L}s\Big| < \frac14\sqrt{c_n^2 + d_n^2}, \]
then we have
\[ \lim_{\beta\to\infty}\Big(\min_{\theta\in[0,2\pi)}\lambda_{n+1}(\beta,\theta) - \max_{\theta\in[0,2\pi)}\lambda_n(\beta,\theta)\Big) > 0. \tag{2.5} \]


In particular, it is obvious that if the effective curvature-induced potential has a dominating Fourier component in the expansion (2.5), the band with the same index opens as β → ∞. We also see that the second assumption of Proposition 2.3 is more difficult to satisfy as the index n increases.
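The hypotheses of Proposition 2.3 are straightforward to test numerically for a given curvature. The sketch below assumes that they read $0 < \sqrt{c_n^2 + d_n^2} < 12\pi^2n^2/L^2$ and $\max_s|\frac14\gamma(s)^2 - d_0 - c_n\sin\frac{2n\pi}{L}s - d_n\cos\frac{2n\pi}{L}s| < \frac14\sqrt{c_n^2 + d_n^2}$; the helper names, the sampling resolution and the example curvature $\gamma(s) = \cos s$ with $L = 2\pi$ are ours and purely illustrative.

```python
import math

def fourier_coefficients(f, L, n_max, samples=4096):
    """Coefficients in f(s) = sum_j c_j sin(2 pi j s / L) + sum_j d_j cos(2 pi j s / L),
    approximated by the rectangle rule on a uniform periodic grid."""
    ds = L / samples
    grid = [k * ds for k in range(samples)]
    c = [0.0]                                   # c[0] is unused; keeps c[j] aligned with index j
    d = [sum(f(s) for s in grid) * ds / L]      # d_0 is the mean value of f
    for j in range(1, n_max + 1):
        w = 2.0 * math.pi * j / L
        c.append(2.0 / L * ds * sum(f(s) * math.sin(w * s) for s in grid))
        d.append(2.0 / L * ds * sum(f(s) * math.cos(w * s) for s in grid))
    return c, d

def gap_condition(gamma, L, n, samples=4096):
    """Check the two assumed hypotheses for the gap index n on a sample grid."""
    W = lambda s: 0.25 * gamma(s) ** 2          # effective curvature-induced potential
    c, d = fourier_coefficients(W, L, n, samples)
    amplitude = math.hypot(c[n], d[n])
    w = 2.0 * math.pi * n / L
    grid = [k * L / samples for k in range(samples + 1)]
    remainder = max(abs(W(s) - d[0] - c[n] * math.sin(w * s) - d[n] * math.cos(w * s))
                    for s in grid)
    first = 0.0 < amplitude < 12.0 * math.pi ** 2 * n ** 2 / L ** 2
    second = remainder < 0.25 * amplitude
    return first and second

L = 2.0 * math.pi
print(gap_condition(math.cos, L, 2), gap_condition(math.cos, L, 1))
```

For $\gamma(s) = \cos s$ one has $\frac14\gamma^2 = \frac18(1 + \cos 2s)$, so the only nonzero coefficients are $d_0 = d_2 = \frac18$; the check succeeds for $n = 2$ and fails for $n = 1$, where the $n$th Fourier component is absent.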

3 Proof of Theorem 2.1

We first prove the unitary equivalence (2.2) by using the Floquet-Bloch reduction scheme (see, e.g., [RS, XIII.16]). For $u\in C_0^\infty(\mathbb{R}^2)$ and $\theta\in[0,2\pi)$, we define
\[ U_0u(x,y,\theta) = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} e^{im\theta}u(x-mK_1,\,y-mK_2), \quad (x,y)\in\Lambda. \]
Then $U_0$ extends uniquely to a unitary operator from $L^2(\mathbb{R}^2)$ to $\int_0^{2\pi}\oplus L^2(\Lambda)\,d\theta$, which we denote by $U$. In addition, $U$ is unitary also as an operator from $H^1(\mathbb{R}^2)$ to $\int_0^{2\pi}\oplus H^1(\Lambda)\,d\theta$. Let us check the following claim.

Lemma 3.1 We have
\[ UH_\beta U^{-1} = \int_0^{2\pi}\oplus H_{\beta,\theta}\,d\theta. \tag{3.1} \]

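Proof. We shall first show that
\[ q_\beta(f,g) = \int_0^{2\pi} q_{\beta,\theta}\bigl((Uf)(\cdot,\cdot,\theta),(Ug)(\cdot,\cdot,\theta)\bigr)\,d\theta \quad\text{for } f,g\in H^1(\mathbb{R}^2). \tag{3.2} \]
Let $u,v\in C_0^\infty(\mathbb{R}^2)$. The quadratic form
\[ q_\beta(u,v) = (\nabla u,\nabla v)_{L^2(\mathbb{R}^2)} - \beta\int_\Gamma u(x)\overline{v(x)}\,dS \]
can be written, in view of (2.1), as
\[ \sum_{m=-\infty}^{\infty}\bigl((\nabla u)(x-mK_1,y-mK_2),(\nabla v)(x-mK_1,y-mK_2)\bigr)_{L^2(\Lambda)} - \beta\sum_{m=-\infty}^{\infty}\int_{\Gamma((0,L))} u(x-mK_1,y-mK_2)\,\overline{v(x-mK_1,y-mK_2)}\,dS, \]
and since $\{\frac{1}{\sqrt{2\pi}}e^{in\theta}\}_{n=-\infty}^{\infty}$ is a complete orthonormal system of $L^2((0,2\pi))$, we have
\begin{align*}
q_\beta(u,v) &= \int_0^{2\pi}\Bigl(\frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty}e^{im\theta}(\nabla u)(x-mK_1,y-mK_2),\ \frac{1}{\sqrt{2\pi}}\sum_{n=-\infty}^{\infty}e^{in\theta}(\nabla v)(x-nK_1,y-nK_2)\Bigr)_{L^2(\Lambda)}d\theta \\
&\quad - \beta\int_0^{2\pi}\Bigl(\frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty}e^{im\theta}u(x-mK_1,y-mK_2),\ \frac{1}{\sqrt{2\pi}}\sum_{n=-\infty}^{\infty}e^{in\theta}v(x-nK_1,y-nK_2)\Bigr)_{L^2(\Gamma((0,L)))}d\theta \\
&= \int_0^{2\pi} q_{\beta,\theta}\bigl((Uu)(\cdot,\cdot,\theta),(Uv)(\cdot,\cdot,\theta)\bigr)\,d\theta. \tag{3.3}
\end{align*}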

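Let $f,g\in H^1(\mathbb{R}^2)$. Since $C_0^\infty(\mathbb{R}^2)$ is dense in $H^1(\mathbb{R}^2)$, we can choose in it two sequences $\{u_j\}_{j=1}^\infty$ and $\{v_j\}_{j=1}^\infty$ such that
\[ u_j\to f,\quad v_j\to g \quad\text{in } H^1(\mathbb{R}^2) \text{ as } j\to\infty. \]
The form $q_\beta$ is bounded in $H^1(\mathbb{R}^2)$, hence we get
\[ \lim_{j\to\infty} q_\beta(u_j,v_j) = q_\beta(f,g). \tag{3.4} \]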

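Notice that there exists a constant $C>0$ such that for any $\theta\in[0,2\pi)$ and $u,v\in Q_\theta$, we have
\[ |q_{\beta,\theta}(u,v)| \le C\|u\|_{H^1(\Lambda)}\|v\|_{H^1(\Lambda)}. \tag{3.5} \]
Since $U$ is a unitary operator from $H^1(\mathbb{R}^2)$ to $\int_0^{2\pi}\oplus H^1(\Lambda)\,d\theta$, we have
\[ Uu_j\to Uf,\quad Uv_j\to Ug \quad\text{in } \int_0^{2\pi}\oplus H^1(\Lambda)\,d\theta. \]
Combining these relations with (3.5), we have
\[ \lim_{j\to\infty}\int_0^{2\pi} q_{\beta,\theta}\bigl((Uu_j)(\cdot,\cdot,\theta),(Uv_j)(\cdot,\cdot,\theta)\bigr)\,d\theta = \int_0^{2\pi} q_{\beta,\theta}\bigl((Uf)(\cdot,\cdot,\theta),(Ug)(\cdot,\cdot,\theta)\bigr)\,d\theta. \tag{3.6} \]
Putting (3.3), (3.4), and (3.6) together, we get (3.2). Next we shall show that
\[ U^{-1}\Bigl(\int_0^{2\pi}\oplus H_{\beta,\theta}\,d\theta\Bigr)U \subset H_\beta. \tag{3.7} \]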

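Let $u\in L^2(\mathbb{R}^2)$ and $Uu\in D\bigl(\int_0^{2\pi}\oplus H_{\beta,\theta}\,d\theta\bigr)$. By definition of the direct integral we have
\[ (Uu)(\cdot,\cdot,\theta)\in D(H_{\beta,\theta}) \text{ for a.e. } \theta\in[0,2\pi), \qquad \int_0^{2\pi}\|H_{\beta,\theta}Uu(\cdot,\cdot,\theta)\|^2_{L^2(\Lambda)}\,d\theta < \infty. \tag{3.8} \]
The first named property means in particular that $(Uu)(\cdot,\cdot,\theta)\in D(H_{\beta,\theta})$ for a.e. $\theta\in[0,2\pi)$, thus we have
\[ q_{\beta,\theta}\bigl((Uu)(\cdot,\cdot,\theta),g\bigr) = \bigl(H_{\beta,\theta}Uu(\cdot,\cdot,\theta),g\bigr)_{L^2(\Lambda)} \quad\text{for all } g\in Q_\theta. \tag{3.9} \]
Note that there exists a constant $b>0$ such that for all $\theta\in[0,2\pi)$ and $f\in Q_\theta$, we have
\[ q_{\beta,\theta}(f,f) + b\|f\|^2_{L^2(\Lambda)} \ge \frac12\|f\|^2_{H^1(\Lambda)}. \tag{3.10} \]
It follows from (3.9) that
\[ |q_{\beta,\theta}(Uu(\cdot,\cdot,\theta),Uu(\cdot,\cdot,\theta))| = |(H_{\beta,\theta}Uu(\cdot,\cdot,\theta),Uu(\cdot,\cdot,\theta))| \le \frac12\Bigl(\|H_{\beta,\theta}Uu(\cdot,\cdot,\theta)\|^2_{L^2(\Lambda)} + \|Uu(\cdot,\cdot,\theta)\|^2_{L^2(\Lambda)}\Bigr). \]
This together with (3.8) and (3.10) implies that $Uu\in\int_0^{2\pi}\oplus H^1(\Lambda)\,d\theta$, so we have $u\in H^1(\mathbb{R}^2)$. We pick any $v\in H^1(\mathbb{R}^2)$. Its image by $U$ satisfies
\[ (Uv)(\cdot,\cdot,\theta)\in Q_\theta \quad\text{for a.e. } \theta\in[0,2\pi). \]
We put $w(\theta) = H_{\beta,\theta}Uu(\cdot,\cdot,\theta)$. From (3.2) we have
\[ q_\beta(u,v) = \int_0^{2\pi} q_{\beta,\theta}\bigl((Uu)(\cdot,\cdot,\theta),(Uv)(\cdot,\cdot,\theta)\bigr)\,d\theta, \]
which can be rewritten using (3.9) as
\[ q_\beta(u,v) = \int_0^{2\pi}\bigl(w(\theta),(Uv)(\cdot,\cdot,\theta)\bigr)_{L^2(\Lambda)}\,d\theta = (U^{-1}w,v)_{L^2(\mathbb{R}^2)}. \]
Using (3.8), we get $U^{-1}w\in L^2(\mathbb{R}^2)$. Thus we have $u\in D(H_\beta)$ and
\[ U^{-1}\Bigl(\int_0^{2\pi}\oplus H_{\beta,\theta}\,d\theta\Bigr)Uu = H_\beta u, \]
which proves (3.7). Since the two operators in this inclusion are self-adjoint, we arrive at (3.1).

Next we have to locate the essential spectrum of our operator.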

Lemma 3.2 We have $\sigma_{\mathrm{ess}}(H_{\beta,\theta}) = [0,\infty)$.

Proof. We define
\[ c_\theta(u,v) = \int_{\Gamma((0,L))} u(x)\overline{v(x)}\,dS, \quad u,v\in Q_\theta, \]
which allows us to write $q_{\beta,\theta} = q_{0,\theta} - \beta c_\theta$ on $Q_\theta$. Let $C_\theta$ be the self-adjoint operator associated with the form $c_\theta$. In view of the quadratic form version of Weyl's theorem (see [RS, XIII.4, Corollary 4]), it suffices to demonstrate that the operator $(H_{0,\theta}+1)^{-1}C_\theta(H_{0,\theta}+1)^{-1}$ is compact on $L^2(\Lambda)$. Let $\{u_n\}_{n=1}^\infty\subset L^2(\Lambda)$ be a sequence which converges weakly to the zero vector in $L^2(\Lambda)$. We put $v_n = (H_{0,\theta}+1)^{-1}u_n$. Since $(H_{0,\theta}+1)^{-1}$ is a bounded operator from $L^2(\Lambda)$ to $H^2(\Lambda)$ and the operator $H^2(\Lambda)\ni f\mapsto f|_{\Gamma((0,L))}\in L^2(\Gamma((0,L)))$ is compact, we have
\[ \|C_\theta^{1/2}(H_{0,\theta}+1)^{-1}u_n\|^2_{L^2(\Lambda)} = c_\theta(v_n,v_n) = \|v_n\|^2_{L^2(\Gamma((0,L)))} \to 0 \quad\text{as } n\to\infty. \]
Thus $C_\theta^{1/2}(H_{0,\theta}+1)^{-1}$ is a compact operator on $L^2(\Lambda)$, and consequently
\[ (H_{0,\theta}+1)^{-1}C_\theta(H_{0,\theta}+1)^{-1} = \bigl[C_\theta^{1/2}(H_{0,\theta}+1)^{-1}\bigr]^*\bigl[C_\theta^{1/2}(H_{0,\theta}+1)^{-1}\bigr] \]
is a compact operator on $L^2(\Lambda)$.

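Lemma 3.3 We have
\[ \sigma(H_\beta) = \bigcup_{\theta\in[0,2\pi)}\sigma(H_{\beta,\theta}). \]

Proof. We put
\[ K_\beta = \int_0^{2\pi}\oplus H_{\beta,\theta}\,d\theta. \]
In view of Lemma 3.1, it suffices to prove that
\[ \sigma(K_\beta) = \bigcup_{\theta\in[0,2\pi)}\sigma(H_{\beta,\theta}). \tag{3.11} \]
Combining Lemma 3.2 with [RS, Theorem XIII.85(d)], we have
\[ \sigma(K_\beta)\cap[0,\infty) = \Bigl(\bigcup_{\theta\in[0,2\pi)}\sigma(H_{\beta,\theta})\Bigr)\cap[0,\infty) = [0,\infty). \tag{3.12} \]
Next we shall show that
\[ \sigma(K_\beta)\cap(-\infty,0) = \Bigl(\bigcup_{\theta\in[0,2\pi)}\sigma(H_{\beta,\theta})\Bigr)\cap(-\infty,0). \tag{3.13} \]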

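For $n\in\mathbb{N}$, we put
\[ \alpha_n(\beta,\theta) = \sup_{v_1,\dots,v_{n-1}\in L^2(\Lambda)}\ \inf_{\phi\in P(v_1,\dots,v_{n-1})} q_{\beta,\theta}(\phi,\phi), \]
where $P(v_1,\dots,v_{n-1}) := \{\phi;\ \phi\in Q_\theta,\ \|\phi\|_{L^2(\Lambda)}=1, \text{ and } (\phi,v_j)_{L^2(\Lambda)}=0 \text{ for } 1\le j\le n-1\}$. In order to prove (3.13), we shall show that the functions $\alpha_n(\beta,\cdot)$ are continuous on $[0,2\pi]$. Let $\theta,\theta_0\in[0,2\pi]$. We define
\[ (V_{\theta,\theta_0}f)(x,y) = \exp\Bigl(i\frac{\theta-\theta_0}{K_1}x\Bigr)f(x,y) \quad\text{for } f\in L^2(\Lambda). \]
Then $V_{\theta,\theta_0}$ is a unitary operator on $L^2(\Lambda)$ which maps $Q_{\theta_0}$ onto $Q_\theta$ bijectively. We have
\[ q_{\beta,\theta}(V_{\theta,\theta_0}g,V_{\theta,\theta_0}g) - q_{\beta,\theta_0}(g,g) = \frac{(\theta-\theta_0)^2}{K_1^2}\|g\|^2_{L^2(\Lambda)} + 2\,\mathrm{Re}\Bigl(i\frac{\theta-\theta_0}{K_1}V_{\theta,\theta_0}g,\ e^{i\frac{\theta-\theta_0}{K_1}x}\frac{\partial g}{\partial x}\Bigr)_{L^2(\Lambda)} \tag{3.14} \]
for $g\in Q_{\theta_0}$. Note that there exists $\alpha>0$ such that
\[ \Bigl\|\frac{\partial g}{\partial x}\Bigr\|^2_{L^2(\Lambda)} \le \frac32\,q_{\beta,\theta_0}(g,g) + \alpha\|g\|^2_{L^2(\Lambda)} \quad\text{for } g\in Q_{\theta_0}. \]
Combining this with (3.14), we obtain
\[ |q_{\beta,\theta}(V_{\theta,\theta_0}g,V_{\theta,\theta_0}g) - q_{\beta,\theta_0}(g,g)| \le \frac{(\theta-\theta_0)^2}{K_1^2}\|g\|^2_{L^2(\Lambda)} + \frac{|\theta-\theta_0|}{K_1}\Bigl[(1+\alpha)\|g\|^2_{L^2(\Lambda)} + \frac32\,q_{\beta,\theta_0}(g,g)\Bigr] \]
for $g\in Q_{\theta_0}$. This proves the continuity of $\alpha_n(\beta,\cdot)$ on $[0,2\pi]$. Combining this with the min-max principle and [RS, Theorem XIII.85(d)], we arrive at (3.13). The relations (3.12) and (3.13) together give (3.11), which completes the proof.

The most important part of the proof is the analysis of the discrete spectrum of $H_{\beta,\theta}$. The tool we use is the Dirichlet-Neumann bracketing. Given $a>0$, we put $\Sigma_a = \Phi((0,L)\times(-a,a))$. Note that $\Sigma_a$ is a domain obtained by transporting a segment of length $2a$ perpendicular to $\Gamma$ along the curve. Since $\Gamma'(0) = \Gamma'(L) = (1,0)$, we have $\Phi_1(0,\cdot) = 0$ and $\Phi_1(L,\cdot) = K_1$ on $\mathbb{R}$. This together with (A.5) implies, for $|a|<a_0$, that $\Sigma_a\subset\Lambda$ and that $\Lambda\setminus\Sigma_a$ consists of two connected components, which we denote by $\Lambda_a^1$ and $\Lambda_a^2$. For $\theta\in[0,2\pi)$, we define
\begin{align*}
R^+_{a,\theta} &= \{u\in H^1(\Sigma_a);\ u=0 \text{ on } \partial\Sigma_a\cap\Lambda,\ u(K_1,\cdot)=e^{i\theta}u(0,\cdot) \text{ on } (-a,a)\}, \\
R^-_{a,\theta} &= \{u\in H^1(\Sigma_a);\ u(K_1,\cdot)=e^{i\theta}u(0,\cdot) \text{ on } (-a,a)\}, \\
q^+_{a,\beta,\theta}(f,f) &= \|\nabla f\|^2_{L^2(\Sigma_a)} - \beta\int_{\Gamma((0,L))}|f(x)|^2\,dS \quad\text{for } f\in R^+_{a,\theta}, \\
q^-_{a,\beta,\theta}(f,f) &= \|\nabla f\|^2_{L^2(\Sigma_a)} - \beta\int_{\Gamma((0,L))}|f(x)|^2\,dS \quad\text{for } f\in R^-_{a,\theta}.
\end{align*}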

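Let $L^+_{a,\beta,\theta}$ and $L^-_{a,\beta,\theta}$ be the self-adjoint operators associated with the forms $q^+_{a,\beta,\theta}$ and $q^-_{a,\beta,\theta}$, respectively. For $j=1,2$, we define
\begin{align*}
K^+_{a,j,\theta} &= \{f\in H^1(\Lambda_a^j);\ f(K_1,K_2+u)=e^{i\theta}f(0,u) \text{ if } (0,u)\in\partial\Lambda_a^j,\ f=0 \text{ on } \partial\Lambda_a^j\cap\Lambda\}, \\
K^-_{a,j,\theta} &= \{f\in H^1(\Lambda_a^j);\ f(K_1,K_2+u)=e^{i\theta}f(0,u) \text{ if } (0,u)\in\partial\Lambda_a^j\}, \\
e^\pm_{a,j,\theta}(f,f) &= \|\nabla f\|^2_{L^2(\Lambda_a^j)} \quad\text{for } f\in K^\pm_{a,j,\theta}.
\end{align*}
Let $E^\pm_{a,j,\theta}$ be the self-adjoint operators associated with the forms $e^\pm_{a,j,\theta}$. By the bracketing bounds (see [RS, XIII.15, Proposition 4]) we obtain
\[ E^-_{a,1,\theta}\oplus L^-_{a,\beta,\theta}\oplus E^-_{a,2,\theta} \le H_{\beta,\theta} \le E^+_{a,1,\theta}\oplus L^+_{a,\beta,\theta}\oplus E^+_{a,2,\theta} \tag{3.15} \]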

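in $L^2(\Lambda_a^1)\oplus L^2(\Sigma_a)\oplus L^2(\Lambda_a^2)$. In order to estimate the negative eigenvalues of $H_{\beta,\theta}$, it is sufficient to estimate those of $L^+_{a,\beta,\theta}$ and $L^-_{a,\beta,\theta}$, because the other operators involved in (3.15) are non-negative.

To this aim we introduce two operators in $L^2((0,L)\times(-a,a))$ which are unitarily equivalent to $L^+_{a,\beta,\theta}$ and $L^-_{a,\beta,\theta}$, respectively. We define
\begin{align*}
Q^+_{a,\theta} &= \{\varphi\in H^1((0,L)\times(-a,a));\ \varphi(K_1,\cdot)=e^{i\theta}\varphi(0,\cdot) \text{ on } (-a,a),\ \varphi(\cdot,a)=\varphi(\cdot,-a)=0 \text{ on } (0,L)\}, \\
Q^-_{a,\theta} &= \{\varphi\in H^1((0,L)\times(-a,a));\ \varphi(K_1,\cdot)=e^{i\theta}\varphi(0,\cdot) \text{ on } (-a,a)\},
\end{align*}
\begin{align*}
b^+_{a,\beta,\theta}(f,f) &= \int_0^L\!\!\int_{-a}^a (1+u\gamma(s))^{-2}\Bigl|\frac{\partial f}{\partial s}\Bigr|^2 du\,ds + \int_0^L\!\!\int_{-a}^a \Bigl|\frac{\partial f}{\partial u}\Bigr|^2 du\,ds \\
&\quad + \int_0^L\!\!\int_{-a}^a V(s,u)|f|^2\,ds\,du - \beta\int_0^L |f(s,0)|^2\,ds \quad\text{for } f\in Q^+_{a,\theta}, \\
b^-_{a,\beta,\theta}(f,f) &= \int_0^L\!\!\int_{-a}^a (1+u\gamma(s))^{-2}\Bigl|\frac{\partial f}{\partial s}\Bigr|^2 du\,ds + \int_0^L\!\!\int_{-a}^a \Bigl|\frac{\partial f}{\partial u}\Bigr|^2 du\,ds \\
&\quad + \int_0^L\!\!\int_{-a}^a V(s,u)|f|^2\,ds\,du - \beta\int_0^L |f(s,0)|^2\,ds \\
&\quad - \frac12\int_0^L \frac{\gamma(s)}{1+a\gamma(s)}|f(s,a)|^2\,ds + \frac12\int_0^L \frac{\gamma(s)}{1-a\gamma(s)}|f(s,-a)|^2\,ds \quad\text{for } f\in Q^-_{a,\theta},
\end{align*}
where
\[ V(s,u) = \frac12(1+u\gamma(s))^{-3}u\gamma''(s) - \frac54(1+u\gamma(s))^{-4}u^2\gamma'(s)^2 - \frac14(1+u\gamma(s))^{-2}\gamma(s)^2. \]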

Let $B^+_{a,\beta,\theta}$ and $B^-_{a,\beta,\theta}$ be the self-adjoint operators associated with the forms $b^+_{a,\beta,\theta}$ and $b^-_{a,\beta,\theta}$, respectively. Acting as in the proof of Lemma 2.2 in [EY], we arrive at the following result.

Lemma 3.4 The operators $B^+_{a,\beta,\theta}$ and $B^-_{a,\beta,\theta}$ are unitarily equivalent to $L^+_{a,\beta,\theta}$ and $L^-_{a,\beta,\theta}$, respectively.

Next we estimate $B^+_{a,\beta,\theta}$ and $B^-_{a,\beta,\theta}$ by operators with separated variables. We put
\[ \gamma_+ = \max_{[0,L]}|\gamma(\cdot)|, \quad \gamma'_+ = \max_{[0,L]}|\gamma'(\cdot)|, \quad \gamma''_+ = \max_{[0,L]}|\gamma''(\cdot)|, \]

and V+ (s)

=

V− (s)

=

If 0 < a
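$\frac83$. Then $T^+_{a,\beta}$ has only one negative eigenvalue, which we denote by $\zeta^+_{a,\beta}$. It satisfies the inequalities
\[ -\frac14\beta^2 < \zeta^+_{a,\beta} < -\frac14\beta^2 + 2\beta^2\exp\Bigl(-\frac12\beta a\Bigr). \]

Proposition 3.7 Let $\beta a>8$ and $\beta>\frac83\gamma_+$. Then $T^-_{a,\beta}$ has a unique negative eigenvalue $\zeta^-_{a,\beta}$, and moreover, we have
\[ -\frac14\beta^2 - \frac{2205}{16}\beta^2\exp\Bigl(-\frac12\beta a\Bigr) < \zeta^-_{a,\beta} < -\frac14\beta^2. \]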
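The two-sided bound of Proposition 3.6 can be checked numerically in a simple model. The sketch below assumes (this is our assumption for illustration) that $T^+_{a,\beta}$ acts as $-d^2/du^2$ on $(-a,a)$ with Dirichlet conditions at $u=\pm a$ and an attractive $\delta$-interaction of strength $\beta$ at $u=0$. In that model an eigenfunction $\sinh(\kappa(a-|u|))$ gives the matching condition $2\kappa=\beta\tanh(\kappa a)$ and the eigenvalue $-\kappa^2$. The function names are hypothetical.

```python
import math


def delta_dirichlet_eigenvalue(beta, a, tol=1e-14):
    """Negative eigenvalue of -d^2/du^2 - beta*delta(u) on (-a, a), Dirichlet at +-a.

    Assumed model: the eigenvalue is -kappa^2, where kappa > 0 solves
    2*kappa = beta*tanh(kappa*a). Such a root exists only if beta*a > 2.
    """
    if beta * a <= 2.0:
        raise ValueError("no negative eigenvalue unless beta*a > 2")

    def g(k):
        return 2.0 * k - beta * math.tanh(k * a)

    # g is negative just above 0 and non-negative at beta/2, so bisect in between.
    lo, hi = 1e-9 * beta, 0.5 * beta
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    kappa = 0.5 * (lo + hi)
    return -kappa * kappa


def proposition_3_6_bounds(beta, a):
    """Lower and upper bounds quoted in Proposition 3.6."""
    lower = -0.25 * beta ** 2
    upper = lower + 2.0 * beta ** 2 * math.exp(-0.5 * beta * a)
    return lower, upper


if __name__ == "__main__":
    for beta, a in [(4.0, 1.0), (10.0, 1.0), (50.0, 0.5)]:
        zeta = delta_dirichlet_eigenvalue(beta, a)
        lower, upper = proposition_3_6_bounds(beta, a)
        print(beta, a, lower < zeta < upper)
```

Bisection is used instead of a fixed-point iteration because the sign change of $2\kappa-\beta\tanh(\kappa a)$ on $(0,\beta/2]$ is guaranteed whenever $\beta a>2$.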


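Now we are ready to prove our main result.

Proof of Theorem 2.1. We put $a(\beta) = 6\beta^{-1}\log\beta$. Let $\xi^\pm_{\beta,j}$ be the $j$th eigenvalue of $T^\pm_{a(\beta),\beta}$. From Propositions 3.6 and 3.7, we have
\[ \xi^\pm_{\beta,1} = \zeta^\pm_{a(\beta),\beta} \quad\text{and}\quad \xi^\pm_{\beta,2}\ge 0. \]
From (3.18), we infer that $\{\xi^\pm_{\beta,j} + \mu^\pm_k(a(\beta),\theta)\}_{j,k\in\mathbb{N}}$, properly ordered, is the sequence of all eigenvalues of $\tilde H^\pm_{a(\beta),\beta,\theta}$ counted with multiplicity. Using Proposition 3.5, we find
\[ \xi^\pm_{\beta,j} + \mu^\pm_k(a(\beta),\theta) \ge \mu^\pm_1(a(\beta),\theta) = \mu_1(\theta) + O(\beta^{-1}\log\beta) \tag{3.21} \]
for $j\ge 2$ and $k\ge 1$, where the error term is uniform with respect to the quasimomentum $\theta\in[0,2\pi)$. For $k\in\mathbb{N}$ and $\theta\in[0,2\pi)$, we define
\[ \tau^\pm_{\beta,k,\theta} = \zeta^\pm_{a(\beta),\beta} + \mu^\pm_k(a(\beta),\theta). \tag{3.22} \]
From Propositions 3.5–3.7 we get
\[ \tau^\pm_{\beta,k,\theta} = -\frac14\beta^2 + \mu_k(\theta) + O(\beta^{-1}\log\beta) \quad\text{as } \beta\to\infty, \tag{3.23} \]
where the error term is uniform with respect to $\theta\in[0,2\pi)$. Let $n\in\mathbb{N}$. Combining (3.21) with (3.23), we claim that there exists $\beta(n)>0$ such that
\[ \tau^+_{\beta,n,\theta} < 0, \quad \tau^+_{\beta,n,\theta} < \xi^+_{\beta,j} + \mu^+_k(a(\beta),\theta), \quad\text{and}\quad \tau^-_{\beta,n,\theta} < \xi^-_{\beta,j} + \mu^-_k(a(\beta),\theta) \]
for $\beta\ge\beta(n)$, $j\ge 2$, $k\ge 1$, and $\theta\in[0,2\pi)$. Hence the $j$th eigenvalue of $\tilde H^\pm_{a(\beta),\beta,\theta}$ counted with multiplicity is $\tau^\pm_{\beta,j,\theta}$ for $j\le n$, $\beta\ge\beta(n)$, and $\theta\in[0,2\pi)$. Let $\beta\ge\beta(n)$ and denote by $\kappa^\pm_j(\beta,\theta)$ the $j$th eigenvalue of $L^\pm_{a(\beta),\beta,\theta}$. From (3.16), (3.17), and the min-max principle, we obtain
\[ \tau^-_{\beta,j,\theta} \le \kappa^-_j(\beta,\theta) \quad\text{and}\quad \kappa^+_j(\beta,\theta) \le \tau^+_{\beta,j,\theta} \quad\text{for } 1\le j\le n, \tag{3.24} \]
so we have $\kappa^+_n(\beta,\theta) < 0$. Hence the min-max principle and (3.15) imply that $H_{\beta,\theta}$ has at least $n$ eigenvalues in $(-\infty,\kappa^+_n(\beta,\theta))$. For $1\le j\le n$, we denote by $\lambda_j(\beta,\theta)$ the $j$th eigenvalue of $H_{\beta,\theta}$. We have
\[ \kappa^-_j(\beta,\theta) \le \lambda_j(\beta,\theta) \le \kappa^+_j(\beta,\theta) \quad\text{for } 1\le j\le n. \]
This together with (3.23) and (3.24) implies that
\[ \lambda_j(\beta,\theta) = -\frac14\beta^2 + \mu_j(\theta) + O(\beta^{-1}\log\beta) \quad\text{as } \beta\to\infty \text{ for } 1\le j\le n, \]
where the error term is uniform with respect to $\theta\in[0,2\pi)$. This completes the proof of Theorem 2.1.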
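The choice $a(\beta)=6\beta^{-1}\log\beta$ can be motivated by a short computation: it turns the exponentially small corrections in Propositions 3.6 and 3.7 into terms of order $\beta^{-1}$, while the strip half-width itself stays of order $\beta^{-1}\log\beta$. The following worked lines are our own bookkeeping under the bounds quoted in those propositions (valid once $\beta a(\beta)=6\log\beta>8$), not part of the original argument.

```latex
\[
\exp\!\Bigl(-\tfrac12\,\beta\,a(\beta)\Bigr) = \exp(-3\log\beta) = \beta^{-3},
\qquad\text{hence}\qquad
\beta^{2}\exp\!\Bigl(-\tfrac12\,\beta\,a(\beta)\Bigr) = \beta^{-1}.
\]
\[
0 < \zeta^{+}_{a(\beta),\beta} + \tfrac14\beta^{2} < 2\beta^{-1},
\qquad
-\tfrac{2205}{16}\,\beta^{-1} < \zeta^{-}_{a(\beta),\beta} + \tfrac14\beta^{2} < 0 .
\]
```

Both corrections are therefore $O(\beta^{-1})$, which is dominated by the $O(\beta^{-1}\log\beta)$ remainder appearing in (3.21) and (3.23).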

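Our next aim is to prove Corollary 2.2. As a preliminary, we denote by $B_j$ and $G_j$, respectively, the length of the $j$th band and the $j$th gap of the spectrum of the operator $-\frac{d^2}{ds^2}-\frac14\gamma(s)^2$ in $L^2(\mathbb{R})$ with the domain $H^2(\mathbb{R})$:
\[ B_j = \begin{cases} \mu_j(\pi)-\mu_j(0) & \text{for odd } j, \\ \mu_j(0)-\mu_j(\pi) & \text{for even } j, \end{cases} \qquad G_j = \begin{cases} \mu_{j+1}(\pi)-\mu_j(\pi) & \text{for odd } j, \\ \mu_{j+1}(0)-\mu_j(0) & \text{for even } j. \end{cases} \]
Since $\mu_j(\cdot)$ is continuous on $[0,2\pi]$, we immediately obtain from Theorem 2.1 the following claim.

Lemma 3.8 For $n\in\mathbb{N}$, we have
\begin{align*}
|\lambda_n(\beta,[0,2\pi))| &= B_n + O(\beta^{-1}\log\beta) \quad\text{as } \beta\to\infty, \\
\min_{\theta\in[0,2\pi)}\lambda_{n+1}(\beta,\theta) - \max_{\theta\in[0,2\pi)}\lambda_n(\beta,\theta) &= G_n + O(\beta^{-1}\log\beta) \quad\text{as } \beta\to\infty.
\end{align*}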

Now we recall Borg's theorem (see [Bo, Ho, Un]).

Theorem 3.9 (Borg) Suppose that $W$ is a real-valued, piecewise continuous function on $[0,L]$. Let $\alpha^\pm_j$ be the $j$th eigenvalue of the following operator counted with multiplicity:
\[ -\frac{d^2}{ds^2} + W(s) \quad\text{in } L^2((0,L)) \]
with the domain
\[ \{v\in H^2((0,L));\ v(L)=\pm v(0),\ v'(L)=\pm v'(0)\}. \]
Suppose that
\[ \alpha^+_j = \alpha^+_{j+1} \text{ for all even } j \quad\text{and}\quad \alpha^-_j = \alpha^-_{j+1} \text{ for all odd } j. \]
Then $W$ is constant on $[0,L]$.

Proof of Corollary 2.2. Assume that $\gamma$ is not identically zero. Then it follows from (A.3) that $\gamma$ is not constant on $[0,L]$. Combining this with Borg's theorem, we infer that there exists $m\in\mathbb{N}$ such that $G_m>0$. From Lemma 3.8 we get
\[ \lim_{\beta\to\infty}\Bigl[\min_{\theta\in[0,2\pi)}\lambda_{m+1}(\beta,\theta) - \max_{\theta\in[0,2\pi)}\lambda_m(\beta,\theta)\Bigr] = G_m > 0. \]
This completes the proof.

4 The gaps of Hill's equation

It follows from Lemma 3.8 that if the $m$th gap of $-\frac{d^2}{ds^2}-\frac14\gamma(s)^2$ in $L^2(\mathbb{R})$ is open, so is the $m$th gap of $H_\beta$ for sufficiently large $\beta>0$. It is thus useful to find a sufficient condition under which the $m$th gap of our comparison operator is open for a given $m\in\mathbb{N}$. Since a particular form of the effective potential is not essential, we will do that for gaps of the Hill operator with a general bounded potential. Let $V\in L^\infty((-a/2,a/2))$ and denote by $\{a_j\}_{j=1}^\infty$ and $\{b_j\}_{j=0}^\infty$ the sequences of its Fourier coefficients:
\[ V(x) = \sum_{j=1}^\infty a_j\sin\frac{2\pi j}{a}x + \sum_{j=0}^\infty b_j\cos\frac{2\pi j}{a}x \quad\text{in } L^2((-a/2,a/2)), \]
where
\[ a_j = \frac2a\int_{-a/2}^{a/2}V(x)\sin\frac{2\pi j}{a}x\,dx, \qquad b_j = \frac2a\int_{-a/2}^{a/2}V(x)\cos\frac{2\pi j}{a}x\,dx. \]

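Let $\kappa_j$ be the $j$th eigenvalue of the operator
\[ -\frac{d^2}{dx^2} + V(x) \quad\text{in } L^2((-a/2,a/2)) \text{ with periodic b.c.}, \tag{4.1} \]
and similarly, let $\nu_j$ be the $j$th eigenvalue of the operator
\[ -\frac{d^2}{dx^2} + V(x) \quad\text{in } L^2((-a/2,a/2)) \text{ with antiperiodic b.c.} \]
We are going to prove the following result.

Theorem 4.1 Let $n\in\mathbb{N}$. Assume that
\[ 0 < \sqrt{a_n^2+b_n^2} < \frac{12\pi^2}{a^2}n^2 \]
and
\[ \Bigl\|V(x) - b_0 - a_n\sin\frac{2\pi n}{a}x - b_n\cos\frac{2\pi n}{a}x\Bigr\|_{L^\infty((-a/2,a/2))} < \frac14\sqrt{a_n^2+b_n^2}. \]
Then we have
\[ \nu_{n+1}-\nu_n > 0 \text{ when } n \text{ is odd}, \quad\text{and}\quad \kappa_{n+1}-\kappa_n > 0 \text{ when } n \text{ is even}. \tag{4.2} \]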

Proposition 2.3 immediately follows from Theorem 4.1. In order to prove the latter, we shall estimate the length of the first gap of the Mathieu operator. For $\alpha\in\mathbb{R}$, we define
\[ M_\alpha = -\frac{d^2}{dx^2} + 2\alpha\cos\frac{2\pi}{a}x \quad\text{in } L^2((-a/2,a/2)) \]
with the domain
\[ D = \{u\in H^2((-a/2,a/2));\ u(a/2)=-u(-a/2),\ u'(a/2)=-u'(-a/2)\}. \]
By $m_j(\alpha)$ we denote the $j$th eigenvalue of $M_\alpha$ counted with multiplicity. The sought estimate looks as follows:

Theorem 4.2 We have
\[ m_2(\alpha)-m_1(\alpha) \ge |\alpha| \quad\text{provided that}\quad |\alpha| < \frac{6\pi^2}{a^2}. \]

Proof. We prove the assertion for $\alpha<0$ only, since the argument for $\alpha>0$ is similar. We put
\begin{align*}
D^+ &= \{u\in H^2((0,a/2));\ u'(0)=u(a/2)=0\}, \\
D^- &= \{u\in H^2((0,a/2));\ u(0)=u'(a/2)=0\}
\end{align*}
and define
\[ L^\pm_\alpha = -\frac{d^2}{dx^2} + 2\alpha\cos\frac{2\pi}{a}x \quad\text{in } L^2((0,a/2)) \text{ with the domain } D^\pm. \]

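By $\mu^\pm_1(\alpha)$ we denote the first eigenvalue of $L^\pm_\alpha$. Since the function $\cos\frac{2\pi}{a}x$ is even, we infer that $M_\alpha$ is unitarily equivalent to the operator $L^+_\alpha\oplus L^-_\alpha$ in $L^2((0,a/2))\oplus L^2((0,a/2))$. We put
\[ \varphi_j(x) = \frac{2}{\sqrt a}\sin\frac{\pi}{a}(2j-1)x \quad\text{and}\quad \psi_j(x) = \frac{2}{\sqrt a}\cos\frac{\pi}{a}(2j-1)x. \]
It is clear that
\[ \{\varphi_j\}_{j=1}^\infty\subset D^- \quad\text{and}\quad \{\psi_j\}_{j=1}^\infty\subset D^+, \]
and, in addition, $\{\varphi_j\}_{j=1}^\infty$ and $\{\psi_j\}_{j=1}^\infty$ are complete orthonormal systems of $L^2((0,a/2))$. We first estimate $\mu^+_1(\alpha)$ from above. By the min-max principle, we obtain
\[ \mu^+_1(\alpha) \le (L^+_\alpha\psi_1,\psi_1) = \Bigl(\frac{\pi}{a}\Bigr)^2 + \alpha. \tag{4.3} \]
Next we estimate $\mu^-_1(\alpha)$ from below. Let $\phi\in D^-$ and $\|\phi\|_{L^2((0,a/2))}=1$. Since $\{\varphi_j\}_{j=1}^\infty$ is a complete orthonormal system of $L^2((0,a/2))$, we have
\[ \phi(x) = \sum_{j=1}^\infty s_j\varphi_j, \qquad \sum_{j=1}^\infty s_j^2 = 1, \]
where $s_j = (\phi,\varphi_j)_{L^2((0,a/2))}$ are the Fourier coefficients. We have
\begin{align*}
(L^-_\alpha\phi,\phi)_{L^2((0,a/2))} - \Bigl(\frac{\pi}{a}\Bigr)^2\|\phi\|^2_{L^2((0,a/2))}
&= \Bigl(\frac{\pi}{a}\Bigr)^2\sum_{j=2}^\infty 4j(j-1)s_j^2 + \alpha\Bigl[2\sum_{j=1}^\infty s_js_{j+1} - s_1^2\Bigr] \\
&= \Bigl(\frac{\pi}{a}\Bigr)^2\sum_{j=2}^\infty 4j(j-1)s_j^2 + \alpha\Bigl[2\sum_{j=2}^\infty s_js_{j+1} - (s_1-s_2)^2 + s_2^2\Bigr] \\
&\ge \Bigl(\frac{\pi}{a}\Bigr)^2\sum_{j=2}^\infty 4j(j-1)s_j^2 + \alpha\Bigl[2\sum_{j=2}^\infty s_js_{j+1} + s_2^2\Bigr] \\
&\ge \Bigl(\frac{\pi}{a}\Bigr)^2\sum_{j=2}^\infty 4j(j-1)s_j^2 + \alpha\Bigl[\sum_{j=2}^\infty\Bigl(\frac13s_j^2 + 3s_{j+1}^2\Bigr) + s_2^2\Bigr] \\
&= \Bigl[8\Bigl(\frac{\pi}{a}\Bigr)^2 + \frac43\alpha\Bigr]s_2^2 + \sum_{j=3}^\infty\Bigl[4j(j-1)\Bigl(\frac{\pi}{a}\Bigr)^2 + \frac{10}{3}\alpha\Bigr]s_j^2 \\
&\ge 0 \qquad\text{for } -\frac{6\pi^2}{a^2} < \alpha < 0.
\end{align*}
This together with the min-max principle implies that
\[ \mu^-_1(\alpha) \ge \Bigl(\frac{\pi}{a}\Bigr)^2 \quad\text{for } -\frac{6\pi^2}{a^2} < \alpha < 0. \tag{4.4} \]
Combining (4.4) with (4.3), we obtain the assertion of the theorem.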
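The estimate of Theorem 4.2 can be illustrated numerically. The sketch below is an illustration under our own choices, not the authors' method: it expands antiperiodic functions on $(-a/2,a/2)$ in the exponentials $e^{i(2k+1)\pi x/a}$, in which the Mathieu operator becomes a real symmetric tridiagonal matrix with diagonal entries $((2k+1)\pi/a)^2$ and off-diagonal entries $\alpha$. The two lowest eigenvalues of a finite truncation are then located by Sturm-sequence bisection. The truncation size `n_modes` and all function names are hypothetical.

```python
import math


def sturm_count(diag, off, x):
    """Number of eigenvalues below x of the symmetric tridiagonal matrix (diag, off)."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-300  # avoid dividing by zero in the next step
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 1, 2, ...) by bisection on the Sturm count."""
    radius = 2.0 * max(abs(o) for o in off)  # Gershgorin bound
    lo, hi = min(diag) - radius, max(diag) + radius
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if sturm_count(diag, off, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def first_gap(alpha, a=1.0, n_modes=30):
    """Approximate m_2(alpha) - m_1(alpha) for the antiperiodic Mathieu operator."""
    diag = [((2 * k + 1) * math.pi / a) ** 2 for k in range(-n_modes, n_modes)]
    off = [alpha] * (len(diag) - 1)
    return kth_eigenvalue(diag, off, 2) - kth_eigenvalue(diag, off, 1)


if __name__ == "__main__":
    for alpha in (-20.0, -3.0, 5.0):
        print(alpha, first_gap(alpha) >= abs(alpha))
```

For small $|\alpha|$ the computed gap is close to $2|\alpha|$, the first-order splitting of the doubly degenerate level $(\pi/a)^2$, so the bound $|\alpha|$ holds with room to spare in that regime.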

Now we are ready to prove the main result of this section.

Proof of Theorem 4.1. We prove the assertion for odd $n$ only, since the argument for even $n$ is similar. We extend $V$ to an $a$-periodic function which we denote by $\tilde V$. Let $\tau\in[0,2\pi)$ be such that
\[ \cos\tau = \frac{b_n}{\sqrt{a_n^2+b_n^2}} \quad\text{and}\quad \sin\tau = -\frac{a_n}{\sqrt{a_n^2+b_n^2}}. \]
We have
\[ a_n\sin\frac{2\pi nx}{a} + b_n\cos\frac{2\pi nx}{a} = \sqrt{a_n^2+b_n^2}\,\cos\frac{2n\pi}{a}\Bigl(x+\frac{a}{2n\pi}\tau\Bigr). \]
Let $d_j$ be the $j$th eigenvalue of the operator with this potential,
\[ -\frac{d^2}{dx^2} + \sqrt{a_n^2+b_n^2}\,\cos\frac{2n\pi}{a}\Bigl(x+\frac{a}{2n\pi}\tau\Bigr) \quad\text{in } L^2\Bigl(\Bigl(-\frac a2-\frac{a}{2n\pi}\tau,\ \frac a2-\frac{a}{2n\pi}\tau\Bigr)\Bigr) \]
with antiperiodic boundary condition. Since a coordinate shift amounts to a unitary transformation and does not change the spectrum, $d_{n+1}-d_n$ is equal to the difference of the first two eigenvalues of the operator
\[ -\frac{d^2}{dx^2} + \sqrt{a_n^2+b_n^2}\,\cos\frac{2n\pi x}{a} \quad\text{in } L^2\Bigl(\Bigl(-\frac{a}{2n},\frac{a}{2n}\Bigr)\Bigr) \]
with antiperiodic boundary condition. Thus it follows from Theorem 4.2 that
\[ d_{n+1}-d_n \ge \frac12\sqrt{a_n^2+b_n^2}. \tag{4.5} \]
Let $e_j$ be the $j$th eigenvalue of the operator
\[ -\frac{d^2}{dx^2} + \tilde V(x) \quad\text{in } L^2\Bigl(\Bigl(-\frac a2-\frac{a}{2n\pi}\tau,\ \frac a2-\frac{a}{2n\pi}\tau\Bigr)\Bigr) \]
with antiperiodic boundary condition. By the min-max principle, we get
\[ |d_j-e_j| \le \Bigl\|\tilde V(x) - b_0 - \sqrt{a_n^2+b_n^2}\,\cos\frac{2n\pi}{a}\Bigl(x+\frac{a}{2n\pi}\tau\Bigr)\Bigr\|_{L^\infty((-\frac a2-\frac{a}{2n\pi}\tau,\ \frac a2-\frac{a}{2n\pi}\tau))}. \tag{4.6} \]
Notice that $\nu_j = e_j$ for all $j\in\mathbb{N}$. This together with (4.5) and (4.6) implies that $\nu_{n+1}-\nu_n>0$, and therefore completes the proof of Theorem 4.1.
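The last step of the proof can be written out as a short chain of inequalities. This is a sketch of the bookkeeping, under the reading that the constant $b_0$ is added to the model potential defining $d_j$; a constant shifts every eigenvalue by the same amount, so it leaves $d_{n+1}-d_n$ and hence (4.5) unchanged.

```latex
\begin{align*}
\nu_{n+1}-\nu_n = e_{n+1}-e_n
 &\ge (d_{n+1}-d_n) - |d_{n+1}-e_{n+1}| - |d_n-e_n| \\
 &> \tfrac12\sqrt{a_n^2+b_n^2} \;-\; 2\cdot\tfrac14\sqrt{a_n^2+b_n^2} \;=\; 0 .
\end{align*}
```

The first inequality is the triangle inequality, the bound on $d_{n+1}-d_n$ is (4.5), and the strict bound on each $|d_j-e_j|$ combines (4.6) with the second assumption of Theorem 4.1.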

5 Asymptotically straight curves

Finally, we are going to discuss briefly the case when $\Gamma$ is non-periodic and asymptotically straight. We impose the following assumptions on $\gamma$:

(A.6) $\gamma\in C^2(\mathbb{R})$.
(A.7) The function $\gamma$ is not identically zero.
(A.8) There exists $c\in(0,1)$ such that $|\Gamma(s)-\Gamma(t)|\ge c|t-s|$ for $s,t\in\mathbb{R}$.
(A.9) There exist $\tau>\frac54$ and $K>0$ such that $|\gamma(s)|\le K|s|^{-\tau}$ for $s\in\mathbb{R}$.

From [EI, Proposition 5.1 and Theorem 5.2] we know that under these conditions
\[ \sigma_{\mathrm{ess}}(H_\beta) = \bigl[-\tfrac14\beta^2,\infty\bigr) \quad\text{and}\quad \sigma_d(H_\beta)\neq\emptyset. \]
We define
\[ S = -\frac{d^2}{ds^2} - \frac14\gamma(s)^2 \quad\text{in } L^2(\mathbb{R}) \]
with the domain $H^2(\mathbb{R})$. Since $\gamma$ is not identically zero on $\mathbb{R}$, we have $\sigma_d(S)\neq\emptyset$ (see, e.g., [BGS] and [Si]). We put $n = \sharp\sigma_d(S)$. For $1\le j\le n$, we denote by $\mu_j$ the $j$th eigenvalue of $S$ counted with multiplicity.

Theorem 5.1 There exists $\beta_0>0$ such that $\sharp\sigma_d(H_\beta) = n$ for $\beta\ge\beta_0$. For $\beta\ge\beta_0$ and $1\le j\le n$, we denote by $\lambda_j(\beta)$ the $j$th eigenvalue of $H_\beta$ counted with multiplicity. Then we have
\[ \lambda_j(\beta) = -\frac14\beta^2 + \mu_j + O(\beta^{-1}\log\beta) \quad\text{as } \beta\to\infty \text{ for } 1\le j\le n. \]

We omit the proof, since it is analogous to those of Theorem 2.1 and [EY, Theorem 1].

Acknowledgments

K.Y. appreciates the hospitality in NPI, Academy of Sciences, in Řež, where a part of this work was done. The research has been partially supported by GAAS and the Czech Ministry of Education within the projects A1048101 and ME170.

References

[AGHH] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, Solvable Models in Quantum Mechanics, Springer, Heidelberg 1988.

[AK] S. Albeverio and P. Kurasov, Singular Perturbations of Differential Operators, London Mathematical Society Lecture Note Series 271, Cambridge Univ. Press 1999.

[BGS] R. Blankenbecler, M. Goldberger and B. Simon, The bound state of weakly coupled long-range one-dimensional quantum Hamiltonians, Ann. Phys. 108, 69–77 (1977).

[Bo] G. Borg, Eine Umkehrung der Sturm-Liouvilleschen Eigenwertaufgabe. Bestimmung der Differentialgleichung durch die Eigenwerte, Acta Math. 78, 1–96 (1946).

[BEKS] J.F. Brasche, P. Exner, Yu.A. Kuperin and P. Šeba, Schrödinger operators with singular interactions, J. Math. Anal. Appl. 184, 112–139 (1994).

[BT] J.F. Brasche, A. Teta, Spectral analysis and scattering theory for Schrödinger operators with an interaction supported by a regular curve, in Ideas and Methods in Quantum and Statistical Physics, Cambridge Univ. Press 1992, 197–211.

[EI] P. Exner and T. Ichinose, Geometrically induced spectrum in curved leaky wires, J. Phys. A34, 1439–1450 (2001).

[EY] P. Exner and K. Yoshitomi, Asymptotics of eigenvalues of the Schrödinger operator with a strong δ-interaction on a loop, math-ph/0103029 and mp_arc 01-108 (http://www.ma.utexas.edu/mp_arc/); J. Geom. Phys., to appear.

[Ho] H. Hochstadt, On the determination of a Hill's equation from its spectrum, Arch. Rat. Mech. Anal. 19, 353–362 (1965).

[Ka] T. Kato, Perturbation Theory for Linear Operators, 2nd edition, Springer, Heidelberg 1976.

[RS] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV. Analysis of Operators, Academic Press, New York 1978.

[Si] B. Simon, The bound state of weakly coupled Schrödinger operators in one and two dimensions, Ann. Phys. 97, 279–287 (1976).

[Un] P. Unger, Stable Hill equations, Comm. Pure Appl. Math. 14, 707–710 (1961).

[Yo] K. Yoshitomi, Band gap of the spectrum in periodically curved quantum waveguides, J. Diff. Eq. 142, 123–166 (1998).

P. Exner
a) Department of Theoretical Physics, Nuclear Physics Institute, Academy of Sciences, 25068 Řež, Czech Republic
b) Doppler Institute, Czech Technical University, Břehová 7, 11519 Prague, Czech Republic
email: [email protected]

K. Yoshitomi
Graduate School of Mathematics, Kyushu University, Hakozaki, Fukuoka 812-8581, Japan
email: [email protected]

Communicated by Gian Michele Graf
submitted 23/06/01, accepted 18/08/01




Enhanced Binding Through Coupling to a Quantum Field F. Hiroshima and H. Spohn Abstract. We consider an electron coupled to the quantized radiation field in the dipole approximation. Even if the Hamiltonian has no ground state at zero coupling, α = 0, the interaction with the quantized radiation field may induce binding. Under suitable assumptions on the external potential, we prove that there exists a critical constant α∗ ≥ 0 such that the Hamiltonian has a unique ground state for arbitrary |α| > α∗ . Moreover an explicit lower bound αc of α∗ is given.

1 Introduction

Atoms consist of charged particles and they are necessarily coupled to the quantized radiation field. In the lowest approximation, this interaction can be ignored and one is led to a Schrödinger operator of the form $\frac{1}{2m}p^2+V$ for the particles only. Under suitable conditions on $V$ the Schrödinger operator has a state of the lowest energy, the ground state of the atom. There has been renewed interest within mathematical physics to understand whether this ground state persists when the coupling to the radiation field is included [3, 5, 6, 8, 9, 10, 13, 14, 12, 18, 19]. We will investigate here a related, but distinct problem. To formulate it properly, we first have to introduce our Hamiltonian. In the non-relativistic approximation, the coupling to the radiation field is described by the Pauli-Fierz Hamiltonian. We will simplify through the dipole approximation, which neglects the variation of the vector potential over the size of the atoms. Thereby the Hamiltonian becomes
\[ H = \frac{1}{2m}(p\otimes I - \alpha I\otimes A)^2 + V\otimes I + I\otimes H_f \tag{1.1} \]
acting on the Hilbert space $L^2(\mathbb{R}^3)\otimes\mathcal{F}$. Here $p=-i\nabla$ is the momentum operator canonically conjugate to the position operator $x$ in $L^2(\mathbb{R}^3)$, and $V$ is an external potential for which precise conditions will be specified below. In essence, $V$ is short ranged and sufficiently shallow, i.e. $\lim_{|x|\to\infty}V(x)=0$, $V\le 0$. $A$ is the quantized electromagnetic vector potential at the origin defined by
\[ A = \sum_{j=1,2}\int\frac{1}{\sqrt{2\omega(k)}}\,e_j(k)\bigl\{\hat\varphi(k)a^*(k,j) + \hat\varphi(k)^*a(k,j)\bigr\}\,d^3k. \]
Here $\hat\varphi^*$ denotes the complex conjugate of $\hat\varphi$, the vectors $e_1(k)$, $e_2(k)$, $k/|k|$ are a standard dreibein, and $a^*(k,j)$, $a(k,j)$, $j=1,2$, are Bose fields with commutation relations
\[ [a(k,j),a^*(k',j')] = \delta_{jj'}\delta(k-k'), \quad [a^*(k,j),a^*(k',j')] = 0, \quad [a(k,j),a(k',j')] = 0 \tag{1.2} \]

and acting on the Boson Fock space $\mathcal{F}$ over $L^2(\mathbb{R}^3)\otimes\mathbb{C}^2$. The function $\hat\varphi$ cuts off the high frequencies. The dispersion relation of the photons is
\[ \omega(k) = |k| \tag{1.3} \]
and therefore the field energy is
\[ H_f = \sum_{j=1,2}\int\omega(k)a^*(k,j)a(k,j)\,d^3k. \]
For $H$ to be a well defined self-adjoint operator, $\hat\varphi$ has to satisfy certain integrability conditions which will be stated in Section 2. Also, if clear from the context, we omit factors $\otimes I$ and $I\otimes$ in the following. The problem of the existence of the ground state for $H$ is usually regarded as a stability property. One assumes that $H$ has a ground state for $\alpha=0$, which amounts to the existence of a ground state for $\frac{1}{2m}p^2+V$, and proves that $H$ also has a ground state for $\alpha\neq 0$. It is then necessarily unique, since $e^{-tH}$ has a positivity improving kernel in a suitable function space [14]. In contrast, in our contribution, we assume that $H$ has no ground state for $\alpha=0$. In fact, this will be the case for a sufficiently shallow $V$. We expect the interaction with the quantized radiation field to enhance binding. The non-binding potential should become binding at a sufficiently strong coupling strength. This is precisely our main result. The physical reasoning behind such a result is simple. As the particle binds photons it acquires an effective mass $m_{\mathrm{eff}} = m_{\mathrm{eff}}(\alpha^2)$ which is increasing in $|\alpha|$. Roughly speaking, $H$ may be replaced by $H_{\mathrm{eff}}$ =
p2 + V, 2meff

which binds for sufficiently strong α. We have the two main results. In Theorem 3.1 we prove that there is a critical coupling α∗ ≥ αc such that H has a unique ground state for arbitrary |α| > α∗ . Here −1 ˆ (1.4) αc = m(µ0 − 1)ϕ/ω with some µ0 > 1 determined by the external potential V . In Theorem 3.4 we give examples of potentials V for which H has the ground state for arbitrary |α| > αc . The proof of Theorem 3.1 is based on combination of a Bogoliubov transformation and a momentum lattice approximation. Let us start with Hel =

p2 + V, 2m


where $V(x)\le 0$, $\lim_{|x|\to\infty}V(x)=0$, such that $H_{\mathrm{el}}$ has only the absolutely continuous spectrum $[0,\infty)$. We first prove that
\[ H \cong \frac{p^2}{2m_{\mathrm{eff}}} + V + H_f + \alpha^2g + \delta V, \tag{1.5} \]
where $\cong$ denotes unitary equivalence, $g$ is a constant,
\[ m_{\mathrm{eff}} = m_{\mathrm{eff}}(\alpha^2) = m + \alpha^2\,\frac23\int_{\mathbb{R}^3}\frac{|\hat\varphi(k)|^2}{\omega(k)^2}\,d^3k, \tag{1.6} \]
and $\delta V$ is an error term. Note that $m_{\mathrm{eff}}$ rather than $m$ appears in the transformed Hamiltonian. The operator $\frac{p^2}{2m_{\mathrm{eff}}}+V$ has a ground state for $|\alpha|>\alpha_c$. Through a momentum lattice approximation we can prove that for $|\alpha|>\alpha_*$ with some $\alpha_*\ge\alpha_c$, the approximate Hamiltonian (1.5) has a ground state.

The proof of Theorem 3.4 is based on a scaling limit argument: the dilated $H_{\mathrm{el}}$,
\[ H_{\mathrm{el}}(\kappa) = \kappa^2\frac{p^2}{2m} + V\Bigl(\frac{x}{\kappa}\Bigr), \tag{1.7} \]
has the same spectrum as $H_{\mathrm{el}}$. We couple to the Bose field as
\[ \kappa^2\Bigl\{\frac{1}{2m}(p-\alpha A)^2 + \frac{1}{\kappa^2}V\Bigl(\frac{x}{\kappa}\Bigr) + H_f\Bigr\} \cong \frac{1}{2m}(p-\kappa\alpha A)^2 + V + \kappa^2H_f = H(\kappa) \tag{1.8} \]
and prove that, for sufficiently large $z>0$,
\[ \operatorname*{s-lim}_{\kappa\to\infty}\bigl(H(\kappa)-\kappa^2\alpha^2g+z\bigr)^{-1} = \Bigl(\frac{p^2}{2m_{\mathrm{eff}}}+V+z\Bigr)^{-1}\otimes P_g \tag{1.9} \]
with $P_g$ the projection onto the ground state of
\[ H_f + \frac{\alpha^2}{2m}A^2. \]
Note that $m_{\mathrm{eff}}$ rather than $m$ appears in the limit Hamiltonian. For $|\alpha|>\alpha_c$, the limit Hamiltonian has a ground state. If we can prove that this ground state persists for large $\kappa$, we are done. In other words, $H$ with the external potential $\frac{1}{\kappa^2}V(\frac{x}{\kappa})$, $\kappa\gg 1$, has a ground state for arbitrary $|\alpha|>\alpha_c$.

This paper is organized as follows. In Section 2 we introduce our assumptions on $V$ and $\hat\varphi$, and consider the $\kappa$-dependence of the ground state in the case of massive photons. In Section 3 we establish the main theorems, Theorems 3.1 and 3.4. As corollaries, we provide examples of potentials $V$ such that the zero-coupling Hamiltonian has no ground state, but for sufficiently large $|\alpha|$, $H$ has a unique ground state.

2 The Pauli-Fierz Hamiltonian in the dipole approximation

2.1 Hamiltonian

Let us assume that the electron moves in $d$ dimensions. We set $L^2 = L^2(\mathbb{R}^d)$ and denote by $\mathcal{F}$ the Boson Fock space over $L^2\otimes\mathbb{C}^{d-1}$. The Hilbert space $\mathcal{H}$ of the coupled system is then $\mathcal{H} = L^2\otimes\mathcal{F}$. Let $\mathcal{F}_{\mathrm{fin}}$ denote the finite particle subspace of $\mathcal{F}$. The Fock representation of the Bose field is denoted by $a^*(k,j)$, $a(k,j)$, $j=1,\dots,d-1$, which satisfy the commutation relations (1.2). We use the shorthand $a^*(\lambda,j) = \int\lambda(k)a^*(k,j)\,d^dk$ and $a(\lambda,j) = \int\lambda(k)^*a(k,j)\,d^dk$. Note that $[a(f,j),a^*(g,j')] = \delta_{jj'}(f,g)$ on $\mathcal{F}_{\mathrm{fin}}$, where $(f,g)$ is the scalar product on $L^2$. The $d$-dimensional polarization vectors are written as $e_j = e_j(k) = (e_j^1(k),\dots,e_j^d(k))$, $j=1,\dots,d-1$, which satisfy $e_i(k)\cdot e_j(k) = \delta_{ij}$ and $e_j(k)\cdot k = 0$ almost everywhere on $\mathbb{R}^d$. The quantized vector potential is defined by (1)
\[ A = \int\frac{1}{\sqrt{2\omega(k)}}\,e_j(k)\bigl\{\hat\varphi(k)a^*(k,j) + \hat\varphi(k)^*a(k,j)\bigr\}\,d^dk, \]
and the quantized electric field, as its canonical conjugate, by
\[ \Pi = i\int\sqrt{\frac{\omega(k)}{2}}\,e_j(k)\bigl\{\hat\varphi(k)a^*(k,j) - \hat\varphi(k)^*a(k,j)\bigr\}\,d^dk. \]
Let $H_f$ be the field Hamiltonian and $N_f$ the number operator in $\mathcal{F}$,
\[ H_f = \int\omega(k)a^*(k,j)a(k,j)\,d^dk, \qquad N_f = \int a^*(k,j)a(k,j)\,d^dk. \]
The Hamiltonian $H$ acting in $\mathcal{H}$ is then given by
\[ H = \frac{1}{2m}(p-\alpha A)^2 + V + H_f. \]
We first state the self-adjointness of $H$, which is established in [1, 15, 16] for arbitrary $\alpha\in\mathbb{R}$.

Proposition 2.1 Let $\hat\varphi/\omega,\ \sqrt{\omega}\hat\varphi\in L^2$. Moreover suppose that $V$ is relatively bounded with respect to $p^2=-\Delta$ with a sufficiently small relative bound. Then $H$ is self-adjoint on $D(p^2)\cap D(H_f)$ and bounded below for arbitrary $\alpha\in\mathbb{R}$.

(1) The summation over repeated indices is automatically understood unless otherwise stated.

We need in addition some technical assumptions on ϕˆ which we list as Definition 2.2 We say that ϕˆ is in E if (1) ϕ(k) ˆ ∗ = ϕ(−k) ˆ and ϕˆ is rotation invariant, i.e. ϕ(k) ˆ = ϕ(|k|), ˆ (2) ϕ/ω ˆ √5/2 , ω 1/2 ϕˆ ∈ L2 , (3) |ϕ( ˆ s)|2 s(d−2)/2 ∈ L ([0, ∞), ds), 0 < " < 1, and Lipschitz continuous on [0, ∞) with an order strictly less than one, ˆ (d−1)/2 ∞ < ∞, where f ∞ = supk∈Rd |f (k)|, (4) ϕω ˆ (d−3)/2 ∞ < ∞ and ϕω (5) ϕ(k) ˆ = 0 for k = 0. H for V = 0 is quadratic and can therefore be diagonalized explicitly, which is carried out in the Appendix. We also need the corresponding unitary U so as to control the error δV = U −1 V U − V in the unitarily transformed potential. Proposition 2.3 Let ϕˆ ∈ E and V be relatively bounded with respect to −∆ with a sufficiently small relative bound. Then, for each α ∈ R, there exists a unitary operator U on H such that U maps D(p2 ) ∩ D(Hf ) onto itself and U −1 HU = Heff + Hf + δV + α2 g, where

p2 + V, 2meff 2 meff = meff (α2 ) = m + α2 qϕ/ω ˆ , ∞ 2 2 2 2 d−1 t ϕ/(t ˆ + ω ) √ g= dt, 2π −∞ m + α2 qϕ/ ˆ t2 + ω 2 2 d−1 q= . d Proof. See Theorem 4.13 of the Appendix. Heff =

Proposition 2.4 We have δV = T −1 V T − V, where α p·K , T = exp −i meff µ 1 ∗ Λj (k)a∗ (k, j) + Λµj (k) a(k, j) dd k, Kµ = √ 2 µ and Λj satisfies ˆ ω n/2 Λµj ≤ C1 ω (n−3)/2 ϕ,

Λµj ∈ L2 (Rd ),

n = −2, −1, 0,

with some constant C1 . In particular, Kµ Ψ ≤ C2 (2ϕ/ω ˆ 2 + ϕ/ω ˆ 3/2 )(Hf 1/2 Ψ + Ψ) √ with C2 = C1 / 2. Proof. See Theorem 4.14 of the Appendix.

(2.1)

2.2 External potentials

We introduce the following conditions on V . Condition 2.5 (1) V is relatively bounded with respect to −∆ with a sufficiently small relative bound. (2) V ∈ C 1 (Rd ) and ∇V ∈ L∞ (Rd ). (3) There exist µ0 ≥ 1 and r0 > 0 such that, for all η > µ0 , 2 2 p p + ηV ≤ −r0 , σess + ηV = [0, ∞). infσ 2m 2m Let V satisfy Condition 2.5 and let −1 αc = m(µ0 − 1)ϕ/ω ˆ . Then Heff has a ground state for |α| > αc , since 2 p meff m Heff = + V , meff 2m m

meff > µ0 . m
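The reasoning behind this statement can be spelled out. The lines below are a sketch under two assumptions taken from the surrounding text: the effective mass has the form $m_{\mathrm{eff}} = m + \alpha^2 q\|\hat\varphi/\omega\|^2$ with the $L^2$ norm, as in Proposition 2.3, and Condition 2.5 (3) holds with constants $\mu_0$ and $r_0$. The symbol $\eta$ for the ratio $m_{\mathrm{eff}}/m$ is introduced here only for illustration.

```latex
\[
H_{\mathrm{eff}} = \frac{p^{2}}{2m_{\mathrm{eff}}} + V
 = \frac{m}{m_{\mathrm{eff}}}\Bigl(\frac{p^{2}}{2m} + \eta V\Bigr),
 \qquad \eta = \frac{m_{\mathrm{eff}}}{m}
 = 1 + \frac{\alpha^{2} q}{m}\,\|\hat\varphi/\omega\|^{2}.
\]
\[
\eta > \mu_0
 \iff
 \alpha^{2} > \frac{m(\mu_0-1)}{q\,\|\hat\varphi/\omega\|^{2}},
\]
\[
\eta > \mu_0 \;\Longrightarrow\;
 \inf\sigma(H_{\mathrm{eff}}) \le -\frac{m\,r_0}{m_{\mathrm{eff}}}
 < 0 = \inf\sigma_{\mathrm{ess}}(H_{\mathrm{eff}}).
\]
```

Since the bottom of the spectrum then lies strictly below the essential spectrum, it is an isolated eigenvalue, which is the ground state referred to above. How the factor $q$ enters the normalization of $\alpha_c$ is not settled by this sketch.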

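If we set $\Sigma = \inf\sigma(H_{\mathrm{eff}})$, then
\[ V_\infty \le \Sigma \le -\frac{mr_0}{m_{\mathrm{eff}}} \quad\text{for } |\alpha|>\alpha_c, \qquad V_\infty = \inf_{x\in\mathbb{R}^d}V(x), \tag{2.2} \]
and
\[ \lim_{|\alpha|\to\infty}\Sigma = V_\infty > -\infty, \tag{2.3} \]
since $H_{\mathrm{eff}}\to V$ as $|\alpha|\to\infty$ in the norm resolvent sense. In what follows we assume $\hat\varphi\in E$ and $V$ to satisfy Condition 2.5.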

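2.3 Scaled Hamiltonian

Let $D(\kappa)$ be the dilation unitary on $L^2$ defined by
\[ D(\kappa)f(k) = \kappa^{d/2}f\Bigl(\frac{k}{\kappa}\Bigr). \]
We introduce the scaled Hamiltonian by
\[ H(\kappa) = \kappa^2D(\kappa)^{-1}\Bigl\{\frac{1}{2m}(p-\alpha A)^2 + H_f + \frac{1}{\kappa^2}V\Bigl(\frac{x}{\kappa}\Bigr)\Bigr\}D(\kappa) = \frac{1}{2m}(p-\kappa\alpha A)^2 + V + \kappa^2H_f, \]
which means that $H(\kappa)$ can be obtained from $H$ by the substitution
\[ \omega\to\kappa^2\omega, \qquad \hat\varphi\to\kappa^2\hat\varphi. \tag{2.4} \]
If we make the corresponding substitution in $U$, $\delta V$, denoted by $U(\kappa)$, $\delta V(\kappa)$, then
\[ H(\kappa) \cong U(\kappa)^{-1}H(\kappa)U(\kappa) = H_{\mathrm{eff}} + \kappa^2H_f + \delta V(\kappa) + \kappa^2\alpha^2g. \]
Note that $H_{\mathrm{eff}}$ and $g$ are independent of $\kappa$, whereas $\alpha K/m_{\mathrm{eff}}$ is scaled as
\[ \frac{\alpha}{m_{\mathrm{eff}}}K \to \frac{1}{\kappa}\frac{\alpha}{m_{\mathrm{eff}}}K, \]
see (4.32) of the Appendix.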

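We prove the existence of a ground state of $\widetilde{H}(\kappa)$.

Lemma 2.6 Let $\Psi \in D(p^2) \cap D(H_f^{1/2})$. Then $\Psi \in D(\delta V(\kappa))$ for all $\kappa \in \mathbb{R}$ and
\[ \|\delta V(\kappa)\Psi\| \le \frac{D}{\kappa}\bigl(\|H_f^{1/2}\Psi\| + \|\Psi\|\bigr), \tag{2.5} \]
where
\[ D = D(\alpha) = \frac{C_2|\alpha|}{m_{\mathrm{eff}}(\alpha^2)}\bigl(2\|\hat\varphi/\omega^2\| + \|\hat\varphi/\omega^{3/2}\|\bigr)\|\nabla V\|_\infty, \]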

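and $C_2$ is given in (2.1).

Proof. Let $\phi_\mu = \alpha K_\mu/(\kappa m_{\mathrm{eff}})$ for instance. Taking the Q-space representation of [15], we see that $\mathcal{F}$ can be identified with the probability measure space $L^2(Q)$ and $\phi_\mu$ with a multiplication operator in $L^2(Q)$. We set $\phi = \phi(q) = (\phi_1(q), \dots, \phi_d(q))$, $q \in Q$. Moreover we regard $L^2 \otimes L^2(Q)$ as the set of $L^2(Q)$-valued $L^2$-functions, i.e. for $\Psi \in L^2 \otimes L^2(Q)$, $\Psi(x) \in L^2(Q)$ for almost all $x \in \mathbb{R}^d$. Let $\rho \in C_0^\infty(\mathbb{R}^d)$ be such that $\rho(x) \ge 0$, $\mathrm{supp}\,\rho \subset \{x \in \mathbb{R}^d \mid |x| \le 1\}$ and $\int_{\mathbb{R}^d}\rho(x)\,dx = 1$. Define $\rho_\epsilon(x) = \rho(x/\epsilon)/\epsilon^d$, $\epsilon > 0$, and $V_\epsilon = \rho_\epsilon * V$, where $*$ denotes the convolution. Then it follows that $V_\epsilon \in C_0^\infty(\mathbb{R}^d)$ and $V_\epsilon \to V$, $\nabla V_\epsilon \to \nabla V$ uniformly on compact sets. Let $\Phi \in C_0^\infty(\mathbb{R}^d) \otimes_{\mathrm{alg}} [\mathcal{F}_{\mathrm{fin}} \cap D(H_f)]$, where $\otimes_{\mathrm{alg}}$ denotes the algebraic tensor product. Since $T^{-1}\rho_\epsilon T = \rho_\epsilon(\cdot + \phi)$ as a bounded operator in $L^2 \otimes L^2(Q)$, we have $T^{-1}V_\epsilon T\Phi = V_\epsilon(\cdot + \phi)\Phi$. Fix $q \in Q$. We may assume $\phi_\mu(q) \ge 0$, $\mu = 1, \dots, d$, without loss of generality. It then follows that
\[ V_\epsilon(x + \phi(q)) - V_\epsilon(x) = \phi(q)\cdot\nabla V_\epsilon(\xi) \]
with some $\xi_\mu \in (x_\mu, x_\mu + \phi_\mu(q))$ for arbitrary $x \in \mathbb{R}^d$. Thus
\[ \|V_\epsilon(\cdot + \phi(q)) - V_\epsilon\|_{L^\infty(C)} \le \|\nabla V_\epsilon\|_{L^\infty(C)}\,|\phi(q)| \]
for any compact set $C \subset \mathbb{R}^d$. Hence we have
\[ \|(V_\epsilon(\cdot + \phi) - V_\epsilon)\Phi\|_{L^2\otimes L^2(Q)} \le \|\nabla V_\epsilon\|_{L^\infty(C)}\,\|\,|\phi|\Phi\|_{L^2\otimes L^2(Q)}, \]
where $C = \mathrm{supp}\,\|\Phi(\cdot)\|_{L^2(Q)}$ is a compact set. Taking $\epsilon \to 0$ on both sides, we conclude
\[ \|(V(\cdot + \phi) - V)\Phi\|_{L^2\otimes L^2(Q)} \le \|\nabla V\|_{L^\infty(C)}\,\|\,|\phi|\Phi\|_{L^2\otimes L^2(Q)} \le \|\nabla V\|_{L^\infty(\mathbb{R}^d)}\,\|\,|\phi|\Phi\|_{L^2\otimes L^2(Q)}. \]
Thus (2.5) holds by (2.1) for $\Phi \in C_0^\infty(\mathbb{R}^d) \otimes_{\mathrm{alg}} [\mathcal{F}_{\mathrm{fin}} \cap D(H_f)]$. By a simple limiting argument, the lemma follows for arbitrary $\Phi \in D(p^2) \cap D(H_f)$.

From the definition of $m_{\mathrm{eff}}$ it follows that
\[ \lim_{|\alpha|\to\infty} D(\alpha) = 0. \tag{2.6} \]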
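The decay in (2.6) comes from the prefactor $|\alpha|/m_{\mathrm{eff}}(\alpha^2)$, which behaves like $1/|\alpha|$ for large coupling. The following is a minimal numerical sketch, assuming the formula $m_{\mathrm{eff}} = m + \alpha^2 q\|\hat\varphi/\omega\|^2$ of Proposition 2.3. The function names and the value chosen for $\|\hat\varphi/\omega\|$ are illustrative assumptions and are not taken from the paper.

```python
def effective_mass(alpha, m=1.0, d=3, norm_phi_over_omega=0.5):
    """m_eff(alpha^2) = m + alpha^2 * q * ||phi_hat/omega||^2 with q = (d-1)/d.

    norm_phi_over_omega is a hypothetical value of ||phi_hat/omega||.
    """
    q = (d - 1) / d
    return m + alpha**2 * q * norm_phi_over_omega**2


def coupling_ratio(alpha, **kwargs):
    """|alpha| / m_eff(alpha^2), the alpha-dependent prefactor of D(alpha)."""
    return abs(alpha) / effective_mass(alpha, **kwargs)


if __name__ == "__main__":
    for alpha in (1.0, 10.0, 100.0, 1000.0):
        print(alpha, effective_mass(alpha), coupling_ratio(alpha))
```

In this sketch the product `alpha * coupling_ratio(alpha)` approaches $1/(q\|\hat\varphi/\omega\|^2)$, so the ratio itself tends to zero.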

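We write $A \le B$ if $D(B) \subset D(A)$ and if, for $\psi \in D(B)$, $(\psi, A\psi) \le (\psi, B\psi)$. Thus, by Lemma 2.6, we conclude the inequalities
\[ -\frac{1}{\kappa}\Bigl(\frac{D}{2}H_f + \frac{3D}{2}\Bigr) \le \delta V(\kappa) \le \frac{1}{\kappa}\Bigl(\frac{D}{2}H_f + \frac{3D}{2}\Bigr). \tag{2.7} \]
In particular we have
\[ \inf\sigma(\widetilde{H}(\kappa)) \le \Sigma + \frac{1}{\kappa}\frac{3D}{2} + \kappa^2\alpha^2 g. \tag{2.8} \]

2.4 Estimates for massive ground states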

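The ground state of our massless model is approximated through the ground state of massive models and we first develop some preparatory lemmas. Let $\nu > 0$. We set
\[ m^\nu_{\mathrm{eff}} = m + \alpha^2 q\,\|\hat\varphi/(\omega + \nu)\|^2 \]
and $D^\nu$ equal to $D$ with $m_{\mathrm{eff}}$ replaced by $m^\nu_{\mathrm{eff}}$. As can be easily seen
\[ m^\nu_{\mathrm{eff}} \uparrow m_{\mathrm{eff}}, \quad \nu \to 0^+, \qquad \text{and} \qquad D^\nu \downarrow D, \quad \nu \to 0^+. \tag{2.9} \]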


Let Γ(l, a), l = (l1 , · · · , ld ) ∈ Zd , a > 0, be the momentum lattice with spacing 1/a Γ(l, a) = [l1 /a, (l1 + 1)/a) × · · · × [ld /a, (ld + 1)/a) and χΓ(l,a) be the characteristic function of Γ(l, a). We define La : L2 → L2 by (La f )(k) = f (l)χΓ(l,a) (k). |l|≤2πL

Set La L2 = 7a and denote by La : F → F the second quantization of La . Let La F = Fa ⊂ F and Ha = L2 ⊗ Fa ⊂ H. We set a (κ) = Heff + κ2 Hfν + δVa (κ), H a where

δVa (κ) = Ta−1 V Ta − V, α a Ta = exp −i ν p · K , meff a 2 ˆ mνeff a = m + α2 q(La ϕ)/L a (ω + ν) , 1 Kµa = √ La Λµj (k)a∗ (k, j) + La Λµj (k)∗ a(k, j) dd k, 2 Hfνa = La (ω + ν)(k)a∗ (k, j)a(k, j)dd k.

a (κ) is reduced by Ha and infσ(H a (κ)H⊥ ) > ν + infσ(H a (κ)). Lemma 2.7 H a Proof. The first statement is proved easily. We shall prove the second statement. d−1 Let Pa be the projection onto the vacuum state of F (7⊥ ), where F (7⊥ a ⊗C a ⊗ d−1 ⊥ d−1 C ) is the Boson Fock space over 7a ⊗ C . Under the identification Ha⊥ ∼ = d−1 ), we have Ha ⊗ Pa⊥ F (7⊥ a ⊗C a (κ)H⊥ ∼ a (κ)) + ν, a (κ)Ha ⊗I + I ⊗ H ν ≥ infσ(H H =H fa a

which implies the desired results. Lemma 2.8 Let a be sufficiently large, and κ, α, and ν be such that |Σ| >

3Dν mr0 + ν. > mνeff a 2κ

(2.10)

a (κ) has a ground state. Then H Proof. By Lemma 2.7 it is enough to prove that the dimension of the spectrum of a (κ)Ha on the interval [infσ(H a (κ)), infσ(H a (κ)) + ν) is finite. On Ha , by (2.7), H a (κ) − infσ(H a (κ)) − ν ≥ Heff + θ Hfν − θ, H a




where Heff = Heff − Σ, θ = 3Daν /(2κ) + ν and θ = κ2 − Daν /(2κ). As before Daν is defined by Dν with ω + ν and ϕˆ replaced by La (ω + ν) and La ϕ, ˆ respectively. Let EA be the spectral projection of Heff on a measurable set A ⊂ Rd . Then a (κ) − infσ(H a (κ)) − ν ≥ (|Σ| − θ)E[|Σ|,∞) + E[0,|Σ|) ⊗ (θ H ν − θ) H fa ≥ E[0,|Σ|) ⊗ (θ Hfνa − θ). Here we used |Σ| − θ > 0. Since Hfνa and E[0,|Σ|) have purely discrete spectrum, the lemma follows. 2 2

As can be proved directly, Ha (κ) converges to H(κ)+νNf −κ α gν as a → ∞ and L → ∞ in the norm resolvent sense, where gν is defined by g with ω replaced

by ω + ν. Combined with Lemma 2.8, we see that H(κ) + νNf − κ2 α2 gν has a

ν (κ) = H(κ) + νNf has a ground state, which is denoted by ground state, i.e. H Ψν = Ψν (κ). Lemma 2.9 On D(p2 ) ∩ D(Hf ) we have [δV (κ), a(f, j)] =

1 α (f, Λj ) · T (κ)−1 (∇V )T (κ). κ meff

Proof. Since T maps D(p2 )∩D(Hf ) onto itself, one can check that T −1 V T a(f, j)Ψ and a(f, j)T −1 V T Ψ for Ψ ∈ D(p2 ) ∩ D(Hf ) are well defined. Thus [δV (κ), a(f, j)] is well defined on D(p2 ) ∩ D(Hf ). We have [δV (κ), a(f, j)] = T (κ)−1 [V, T (κ)a(f, j)T (κ)−1 ]T (κ). Since

T (κ)a(f, j)T (κ)−1 = a(f, j) + iα(f, Λj ) ·

p , meff κ

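the lemma follows.

Lemma 2.10 We have
\[ \frac{\|N_f^{1/2}\Psi_\nu\|}{\|\Psi_\nu\|} \le \frac{1}{\kappa^3}\frac{|\alpha|}{m_{\mathrm{eff}}}\,\epsilon, \]
where $\epsilon = C\,\|\hat\varphi/\omega^{5/2}\|\,\|\nabla V\|_\infty$ with some constant $C$.
Proof. Note that $\Psi_\nu \in D(H_\nu) = D(p^2) \cap D(H_f)$. We have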

ν (κ) − infσ(H

ν (κ)))a(f, j)Ψν ) 0 ≤ (a(f, j)Ψν , (H = −κ2 (a∗ ((ω + ν)f, j)a(f, j)Ψν , Ψν ) + (a(f, j)Ψν , [δV (κ), a(f, j)]Ψν ). By Lemma 2.9, κ2 (Nf Ψ, Ψ) ≤ which is our claim.

1 |α| κ meff

Λj , j Ψν , T (κ)−1 (∇V )T (κ)Ψν , a ω+ν


Lemma 2.11 Let PΩ be the projection onto the vacuum state of F . Let Q = E[0,∞) ⊗ PΩ . Let κ, α and ν be such that mr0 3Dν > 0. − ν meff 2κ Then QΨν ≤ Ψν

(2.11)

3Dν /2 . κ2 (|Σ| − 3Dν /(2κ))

Proof. We have

ν (κ)) + κ2 α2 gν )Ψν ) = −(Ψν , QδV (κ)Ψν ). (Ψν , Q(Heff − infσ(H Hence

ν (κ)))QΨν 2 ≤ δV (κ)QΨν Ψν . (κ2 α2 gν − infσ(H

Note that, by (2.5) and (2.8), ν ν

ν (κ)) ≥ |Σ| − 3D > mr0 − 3D > 0, κ2 α2 gν − infσ(H 2κ mνeff 2κ

δV (κ)QΨν ≤

3Dν Ψν . 2κ

Thus the lemma follows.

3 Binding Our next task is to establish that for sufficiently large α and/or sufficiently large κ the ground state of H(κ) exists.

3.1 Existence of a ground state for large coupling constant

Theorem 3.1 There exists $\alpha_* \ge \alpha_c$ such that, for arbitrary $|\alpha| > \alpha_*$, the ground state of $H$ exists and is unique.

Proof. The uniqueness follows from the ergodic property of $e^{-tH}$, see [14]. Let $a > 0$ be sufficiently small, then $P = E_{[\Sigma,\Sigma+a)} \otimes P_\Omega$ is a finite rank operator. We denote by $\Psi$ the weak limit of the normalized $\Psi_\nu$ along a subsequence $\nu \to 0^+$. Since $P \ge I - N_f - Q$, we have, by Lemmas 2.10 and 2.11,
\[ (\Psi, P\Psi) \ge \|\Psi\|^2 - \Bigl(\frac{|\alpha|\epsilon}{m_{\mathrm{eff}}}\Bigr)^2\|\Psi\|^2 - \frac{3D/2}{|\Sigma| - 3D/2}\,\|\Psi\|^2. \tag{3.1} \]
Since
\[ \lim_{|\alpha|\to\infty}\frac{|\alpha|}{m_{\mathrm{eff}}} = 0, \qquad \lim_{|\alpha|\to\infty} D = 0, \qquad \lim_{|\alpha|\to\infty}\Sigma = V_\infty < 0, \]
we see that there exists $\alpha_* > 0$ such that, for arbitrary $|\alpha| > \alpha_*$, the right-hand side of (3.1) is strictly positive. Thus $\Psi \neq 0$ for $|\alpha| > \alpha_*$, which implies that $\Psi$ is a ground state of $\widetilde{H}(1)$. Since $\widetilde{H}(1)$ is unitarily equivalent to $H$, the existence of the ground state of $H$ follows.

3.2 A lower bound of the critical coupling constant

By assumption $H_{\mathrm{eff}}$ has a ground state for $|\alpha| > \alpha_c$. We have to make sure that $H$ shares the same property. Our key is

Lemma 3.2 For arbitrary $\alpha \in \mathbb{R}$, $H_f + \alpha^2A^2/2m$ has a unique ground state, and
\[ \inf\sigma\Bigl(H_f + \frac{\alpha^2}{2m}A^2\Bigr) = \alpha^2 g. \]
Let $P_g$ be the projection onto the ground state of $H_f + \alpha^2A^2/2m$. Then, for sufficiently large $z > 0$,
\[ \mathrm{s}\text{-}\lim_{\kappa\to\infty}\bigl(H(\kappa) - \alpha^2\kappa^2 g + z\bigr)^{-1} = (H_{\mathrm{eff}} + z)^{-1} \otimes P_g. \]
Proof. See Theorem 4.11 of the Appendix.

Lemma 3.2 suggests that $H(\kappa)$ has a ground state for sufficiently large $\kappa$, since $H_{\mathrm{eff}}$ does so.

Lemma 3.3 Fix a sufficiently large $\kappa$. Then, for arbitrary $|\alpha| > \alpha_c$, the ground state of $\widetilde{H}(\kappa)$ exists.

Proof. As in Theorem 3.1, we have
\[ (\Psi, P\Psi) \ge \|\Psi\|^2 - \Bigl(\frac{1}{\kappa^3}\frac{|\alpha|\epsilon}{m_{\mathrm{eff}}}\Bigr)^2\|\Psi\|^2 - \frac{1}{\kappa^2}\frac{3D/(2\kappa)}{|\Sigma| - 3D/(2\kappa)}\,\|\Psi\|^2. \tag{3.2} \]
Since $\kappa$ is sufficiently large, the right-hand side of (3.2) is strictly positive. Therefore $\Psi$ is a ground state of $\widetilde{H}(\kappa)$.

Theorem 3.4 Let $\kappa$ be sufficiently large. Set
\[ V_\kappa(x) = \frac{1}{\kappa^2}V\Bigl(\frac{x}{\kappa}\Bigr). \]
Then the ground state of
\[ H_\kappa = \frac{1}{2m}(p - \alpha A)^2 + V_\kappa + H_f \]
exists for arbitrary $|\alpha| > \alpha_c$. In particular, if $H_{\mathrm{el}}$ has a ground state, then $H_\kappa$ has a unique ground state for all $\alpha$.

Proof. We have
\[ H_\kappa = \frac{1}{\kappa^2}D(\kappa)H(\kappa)D(\kappa)^{-1}. \]
Since $\widetilde{H}(\kappa)$ is unitarily equivalent to $H(\kappa)$, $H_\kappa$ has a unique ground state by Lemma 3.3. If $H_{\mathrm{eff}}$ has a ground state, then $\alpha_c = 0$ and the zero-coupling $H_\kappa$ has a ground state. Thus $H_\kappa$ has a ground state for all $\alpha$.

3.3 Examples

We provide some examples of shallow external potentials. Let $d \ge 3$. We assume that $V$ is nonpositive ($V \not\equiv 0$) and satisfies Condition 2.5. Moreover
\[ N(V) = a_d\int_{\mathbb{R}^d}|mV(x)|^{d/2}\,dx < 1, \tag{3.3} \]
where $a_d$ is a universal constant [17]. Note that, for all $\kappa > 0$, $N(V) = N(V_\kappa)$. Thus $\frac{1}{2m}p^2 + V_\kappa$ has no ground state for all $\kappa > 0$.
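The invariance $N(V) = N(V_\kappa)$ follows from the change of variables $x \to \kappa x$ in (3.3), since $V_\kappa(x) = \kappa^{-2}V(x/\kappa)$. The sketch below checks this numerically for $d = 3$ with a radial Gaussian well. The potential, the function names and the quadrature parameters are illustrative assumptions, and the constants $a_d$ and $m^{d/2}$ are omitted because they factor out of the comparison.

```python
import math


def potential_integral(potential, r_max, steps=20000):
    """Midpoint-rule value of the integral of |V(x)|^(3/2) over R^3 for radial V."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += 4.0 * math.pi * r * r * abs(potential(r)) ** 1.5
    return total * h


def gaussian_well(r):
    """Hypothetical nonpositive potential V(r) = -exp(-r^2)."""
    return -math.exp(-r * r)


def scaled(potential, kappa):
    """V_kappa(x) = kappa^-2 * V(x / kappa)."""
    return lambda r: potential(r / kappa) / kappa**2


def scaled_integral(kappa):
    """Integral of |V_kappa|^(3/2), truncated where the Gaussian is negligible."""
    return potential_integral(scaled(gaussian_well, kappa), r_max=12.0 * kappa)
```

For this particular well the integral has the closed form $(\pi/1.5)^{3/2}$, which gives an independent check of the quadrature.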

Example 3.5 We assume that $\kappa$ is sufficiently large. By virtue of Theorem 3.4, we see that $H_\kappa$ has a unique ground state for $|\alpha| > \alpha_c$. On the other hand, (3.3) says that $H_\kappa$ has no ground state for $\alpha = 0$.

Let
\[ H_{\mathrm{el}}(\gamma) = \frac{1}{2m}p^2 + \gamma V \]
and $\gamma_* = \sup\{\gamma \in \mathbb{R} \mid \sigma(H_{\mathrm{el}}(\gamma)) = [0,\infty)\}$. Since $E(\gamma) = \inf\sigma(H_{\mathrm{el}}(\gamma))$ is continuous in $\gamma$, we have $E(\gamma_*) = 0$. Thus, for any $\epsilon > 0$, we have $E(\gamma_* + \epsilon) < 0$ and then $H_{\mathrm{el}}(\gamma_* + \epsilon)$ has a ground state with a spectral gap, which yields the following

Example 3.6 Let arbitrary $\epsilon > 0$ be given. We assume that $\kappa$ is sufficiently large. Then
\[ \frac{1}{2m}(p - \alpha A)^2 + \gamma_* V_\kappa + H_f \]
has a unique ground state for arbitrary $|\alpha| > \epsilon$, but $H_{\mathrm{el}}(\gamma_*)$ has no spectral gap.




4 Appendix

4.1 Bogoliubov transformation

To estimate $\delta V$ and $m_{\mathrm{eff}}$ explicitly, we need the exact form of the transformed Hamiltonian $H_{\mathrm{eff}} + H_f + \delta V + \alpha^2 g$ for arbitrary $\alpha \in \mathbb{R}$. Let us define an operator in $\mathcal{H}$ by
\[ H_{\mathrm{dip}} = \frac{1}{2m}(p - \alpha A)^2 + H_f. \]
A. Arai diagonalized $H_{\mathrm{dip}}$ exactly in [1]. However, this is not enough for our purposes and we have to improve the results in [1]. Following [1] we review the Bogoliubov transform of $H_{\mathrm{dip}}$. The operator $H_{\mathrm{dip}}$ is decomposable for all $\alpha \in \mathbb{R}$, i.e.
\[ H_{\mathrm{dip}} = \int^\oplus_{\mathbb{R}^d} H_{\mathrm{dip}}(p)\,dp, \]
where
\[ H_{\mathrm{dip}}(p) = \frac{1}{2m}(p - \alpha A)^2 + H_f, \qquad p \in \mathbb{R}^d. \]
We shall prove that there exists a unitary operator $U(p)$ such that
\[ U(p)^{-1}H_{\mathrm{dip}}(p)U(p) = E(p) + H_f, \]
where $E(p) = E(p, \alpha^2)$ is a constant. Let
\[ D_\pm(s) = m - \alpha^2 q\int_{\mathbb{R}^d}\frac{|\hat\varphi(k)|^2}{s - \omega(k)^2 \pm i0}\,d^dk, \qquad s \in [0,\infty). \]

Then we have

α2 q|Sd−1 | × 2 √ √ 2 (d−2)/2 |ϕ( ˆ x)|2 |x|(d−2)/2 × lim , dx ∓ 2πi|ϕ( ˆ s)| s ↓0 |s−x|> ,x≥0 s−x

D± (s) = m −

(4.1)

where |Sd−1 | denotes the volume of the d − 1-dimensional unit sphere Sd−1 . In particular, there exists " > 0 such that sup |D± (s)| > " s∈[0,∞)

by Definition 2.2 (3), (4). Define f (k ) dd k . Gf (k) = 2 2 (d−2)/2 Rd (ω(k) − ω(k ) + i0)(ω(k)ω(k )) It is seen that Gf (k) =

1 × 2ω(k)(d−2)/2

× lim ↓0

|ω(k)2 −x|> ,x≥0

where [f ](r) =

Sd−1


√ [f ]( x) (d−2)/4 (d−2)/2 x , (4.2) dx − 2πi[f ](ω(k))ω(k) ω(k)2 − x

f (r, φ)dφ and φ is the volume element of Sd−1 . Define

Tµν f = Tµν (α2 )f = δµν f + α2 Qω (d−2)/2 Gω (d−2)/2 dµν ϕf, ˆ where dµν (k) = δµν − kµ kν /|k|2 and Q(k) = Q(k, α2 ) =

ϕ(k) ˆ . D+ (ω(k)2 )

It is seen that for rotation invariant functions f and g, (dµν f, g) = δµν q(f, g).

(4.3)

Since G is a bounded operator on L2 , we see that, by (2) of Definition 2.2, ω n/2 Tµν f ≤ Cω n/2 f ,

n = −1, 0, 1,

∗ with some constant C. Let Tµν = (Tµν )∗ . We define p · e √ ∗ 1 ν 1 ∗ √ ν 1 j Aµ √ Tµν ωej f + iΠµ Bp (f, j)= √ ωTµν √ ej f − α Q, f , ω ω ω 3/2 2 p · e √ ν √ ∗ 1 ν 1 j ¯ µ √1 T ∗ ωe µ √ Q, f e A f − i Π ωT f − α , Bp∗ (f, j)= √ µν j ω µν ω j ω 3/2 2

where Xf = X f¯, f¯ denotes the complex conjugate of f , and ej (k) A(λ) = λ(k)a∗ (k, j) + λ(k)a(k, j) dd k, 2ω(k) ω(k) ∗ √ ej (k) λ(k)a Π(λ) = (k, j) − λ(k)a(k, j) dd k. 2 Here λ(k) = λ(−k). Then A(λ) and Π(λ) are complex linear in λ. We have the following commutation relations on Ffin ∩ D(Hf 3/2 ) ν (ρ)] = i dµν (k)η(k)ρ(k)dd k, [A µ (η), A ν (ρ)] = 0, [Π µ (η), Π ν (ρ)] = 0, µ (η), Π [A µ (η)] = −iΠ µ (η), [Hf , A

µ (η)] = iA µ (ωη). [Hf , Π

We have Bp (f, j) = a∗ (W+ij f, i) + a(W−ij f , i) − α

p · ej √ Q, f 2ω 3/2

(4.4) ,



Bp∗ (f, j) = a∗ (W +ij f, i) + a(W −ij f , i) − α where


p·e ¯ √ j Q, f 2ω 3/2

,

1 µ −1/2 ∗ 1/2 ∗ ei ω Tµν ω + ω 1/2 Tµν ω −1/2 eνj f, 2 1 ∗ ∗ νf. W−ij f = eµi ω −1/2 Tµν ω 1/2 − ω 1/2 Tµν ω −1/2 e j 2 W+ij f =

Let

  W± = 

···

W±11 .. .

W±1 d−1 .. .

··· ···

W±d−1 1

  .

W±d−1 d−1

Remark 4.1 From the substitution ω → κ2 ω and ϕˆ → κϕ, ˆ we infer that Tµν , W+ , and W− are independent of κ. By using (4.1) and (4.2), as operator equations on L2 ⊗ Cd−1 , we see a symplectic group structure W+∗ W+ − W−∗ W− = I, ∗

W+ W+∗ − W − W − = I,

∗

∗

W + W− − W − W+ = 0, ∗

W− W+∗ − W + W − = 0.

(4.5) (4.6)

By (4.5) we have [Bp (f, j), Bp∗ (g, j )] = δjj (f¯, g), [Bp∗ (f, j), Bp∗ (g, j )] = 0, [Bp (f, j), Bp (g, j )] = 0, and by (4.6) p · ej ϕˆ α √ + − ,f , a(f, j) = meff 2ω 3/2 p · ej ϕˆ α ∗ ∗ ∗ ∗ √ ,f . a (f, j) = Bp (W + ij f, i) − Bp (W− ij f, i) − meff 2ω 3/2 ∗ −Bp∗ (W − ij f , i)

Bp (W+∗ ij f , i)

(4.7) (4.8)

Lemma 4.2 We have on a dense domain [Hdip (p), Bp (f, j)] = −Bp (ωf, j), [Hdip (p), Bp∗ (f, j)] = Bp∗ (ωf, j). Proof. A direct calculation, see [1]. We define b = B0 and b∗ = B0∗ .


Lemma 4.3 There exists a unitary operator R of F such that R−1 b∗ (f, j)R = a∗ (f, j), R−1 b(f, j)R = a(f , j).

Proof. See [1]. Let

ϕˆ α S(p) = exp −i p·Π meff ω2

and U (p) = S(p)R. Lemma 4.4 For all α ∈ R and all p ∈ Rd , U (p) maps D(Hf ) onto itself and U (p)−1 Bp∗ (f, j)U (p) = a∗ (f, j), U (p)−1 Bp (f, j)U (p) = a(f , j).

(4.9)

There exists E(p) = E(p, α ) ∈ R such that 2

U (p)−1 Hdip (p)U (p) = E(p) + Hf .

(4.10)

Proof. As is easily seen U (p) maps Ffin ∩ D(Hf ) to D(Hf ). By a limiting argument U (p) maps D(Hf ) onto itself. (4.9) easily follows from Lemma 4.3. By Lemma 4.2, (4.10) holds.

4.2 Effective mass and ground state energy

In the previous subsection it is established that $U(p)^{-1}H_{\mathrm{dip}}(p)U(p) = E(p) + H_f$ for all $\alpha \in \mathbb{R}$. Then $E(p)$ is the ground state energy of $H_{\mathrm{dip}}(p)$. In the present subsection we give the explicit form of $E(p)$. Since a momentum lattice approximated $H_{\mathrm{dip}}(p)$ can be identified with a harmonic oscillator in $L^2(\mathbb{R}^D)$ for some $D$, $E(p)$ can be obtained through calculating the ground state energy of the harmonic oscillator. First $\omega$ is replaced by
\[ \omega_\epsilon(k) = \omega(k) + \epsilon, \qquad \epsilon > 0. \]
For $l = (l_1, \dots, l_d) \in \mathbb{R}^d$, let $|l| = \max_j |l_j|$. For the time being we suppose
\[ l \in (2\pi\mathbb{Z}/a)^d, \qquad |l| \le 2\pi L \tag{4.11} \]
with some $a$ and $L$; $l$ is a lattice point with the width $2\pi/a$ of the $d$-dimensional rectangle centered at the origin with the width $4\pi L$. The lattice points are named $l_1, l_2, \dots, l_{(2[aL]+1)^d}$, where $[z]$ denotes the integer part of $z \in \mathbb{R}$. For $l$ with (4.11) we define
\[ \Gamma(l) = \Bigl[l_1, l_1 + \frac{2\pi}{a}\Bigr) \times \cdots \times \Bigl[l_d, l_d + \frac{2\pi}{a}\Bigr). \]
Let

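\[ q^{jl}_{\mathrm{rad}} = \frac{1}{\sqrt{2}}\frac{1}{\sqrt{\omega_\epsilon(l)}}\bigl(a^*(\chi_{\Gamma(l)}, j) + a(\chi_{\Gamma(l)}, j)\bigr), \qquad p^{jl}_{\mathrm{rad}} = \frac{i}{\sqrt{2}}\sqrt{\omega_\epsilon(l)}\bigl(a^*(\chi_{\Gamma(l)}, j) - a(\chi_{\Gamma(l)}, j)\bigr). \]
Then the Weyl relations hold,
\[ \exp(itp^{jl}_{\mathrm{rad}})\exp(isq^{j'l'}_{\mathrm{rad}}) = \exp(its\,\delta_{l_1l_1'}\cdots\delta_{l_dl_d'}\delta_{jj'})\exp(isq^{j'l'}_{\mathrm{rad}})\exp(itp^{jl}_{\mathrm{rad}}), \qquad t, s \in \mathbb{R}. \tag{4.12} \]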
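The lattice (4.11) contains $2[aL]+1$ points along each axis, and each lattice point carries $d-1$ polarization modes. The sketch below enumerates the lattice and counts the resulting oscillator degrees of freedom. The function names are illustrative assumptions, and positive $a$ and $L$ are assumed.

```python
import itertools
import math


def lattice_points(d, a, L):
    """All l in (2*pi*Z/a)^d with max_j |l_j| <= 2*pi*L."""
    # |2*pi*n/a| <= 2*pi*L is equivalent to |n| <= a*L for integer n.
    n_max = math.floor(a * L)
    axis = [2.0 * math.pi * n / a for n in range(-n_max, n_max + 1)]
    return list(itertools.product(axis, repeat=d))


def oscillator_dimension(d, a, L):
    """Number of oscillator coordinates: d-1 polarizations per lattice point."""
    return (d - 1) * len(lattice_points(d, a, L))
```

For example, with $d = 3$, $a = 2$ and $L = 1.5$ there are $7^3 = 343$ lattice points and $2 \cdot 343 = 686$ oscillator coordinates.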

Let $D = (d-1)(2[aL]+1)^d$. We define the $D \times D$ diagonal matrix by
\[ A_0 = \mathrm{diag}\bigl(\omega_\epsilon(l_1)^2 I_{d-1},\ \omega_\epsilon(l_2)^2 I_{d-1},\ \dots,\ \omega_\epsilon(l_{(2[aL]+1)^d})^2 I_{d-1}\bigr), \]

where $I_{d-1}$ denotes the $(d-1) \times (d-1)$ identity matrix. Since $\epsilon > 0$, $A_0$ is a strictly positive matrix. We denote by $(f, g)_D$ the $D$-dimensional scalar product. Let $v^\mu_{jl} = \hat\varphi(l)e^\mu_j(l)$, and
\[ \vec{v}^\mu = (v^\mu_{jl})_{1\le j\le d-1,\,|l|\le 2\pi L} = \bigl(v^\mu_{1\,l_1}, v^\mu_{2\,l_1}, \dots, v^\mu_{d-1\,l_1}, \dots, v^\mu_{1\,l_{(2[aL]+1)^d}}, v^\mu_{2\,l_{(2[aL]+1)^d}}, \dots, v^\mu_{d-1\,l_{(2[aL]+1)^d}}\bigr)^{T} \in \mathbb{R}^D, \qquad \mu = 1, \dots, d. \]


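For a linear operator $T$, let
\[ \langle T\rangle_D = \sum_{\mu=1}^{d}(\vec{v}^\mu, T\vec{v}^\mu)_D. \]
Suppose that $T : \mathbb{R}^d \to \mathbb{R}$ is a rotation invariant function. Let $T_{\mathrm{diag}}$ be the $D \times D$ diagonal matrix with diagonal elements $T(l)$:
\[ T_{\mathrm{diag}} = \mathrm{diag}\bigl(T(l_1)I_{d-1},\ T(l_2)I_{d-1},\ \dots,\ T(l_{(2[aL]+1)^d})I_{d-1}\bigr). \]
Then
\[ (\vec{v}^\mu, T_{\mathrm{diag}}\vec{v}^\nu) = \delta_{\mu\nu}\,q\sum_{|l|\le 2\pi L} T(l)|\hat\varphi(l)|^2. \tag{4.13} \]
See (4.3). Let
\[ p_{\mathrm{rad}} = (p^{jl}_{\mathrm{rad}})_{1\le j\le d-1,\,|l|\le 2\pi L}, \qquad q_{\mathrm{rad}} = (q^{jl}_{\mathrm{rad}})_{1\le j\le d-1,\,|l|\le 2\pi L}. \]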

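Then the momentum lattice approximated $H_{\mathrm{dip}}(p)$ is written as
\[ H^\epsilon_{L,a}(p) = \frac{1}{2m}\sum_{\mu=1}^{d}\{p_\mu - \alpha(\vec{v}^\mu, q_{\mathrm{rad}})_D\}^2 + \frac{1}{2}\{(p_{\mathrm{rad}}, p_{\mathrm{rad}})_D + (q_{\mathrm{rad}}, A_0q_{\mathrm{rad}})_D\} - \frac{1}{2}\mathrm{tr}\sqrt{A_0}. \]
Lemma 4.5 Suppose that $\epsilon > 0$. Let $\vec{p}$ and $\vec{q}$ be the momentum operator and its canonical position operator in $L^2(\mathbb{R}^D)$, respectively. Then there exist a $D \times D$ nonnegative symmetric matrix $A$ and $\vec{f} \in \mathbb{R}^D$ such that
\[ H^\epsilon_{L,a}(p) \cong \frac{1}{2}(\vec{p}, \vec{p})_D + \frac{1}{2}(\vec{q}, A\vec{q})_D + \frac{1}{2m}p^2 - \frac{1}{2}(\vec{f}, A\vec{f})_D - \frac{1}{2}\mathrm{tr}\sqrt{A_0}. \tag{4.14} \]
Proof. Define the $D \times D$ matrix by
\[ P = \sum_{\mu=1}^{d}|\vec{v}^\mu\rangle\langle\vec{v}^\mu|. \]
Set
\[ \lambda = \frac{\alpha^2}{m}. \]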

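Then let us define $A = A_0 + \lambda P$. Note that $A$ is a strictly positive symmetric matrix, since $A_0$ is strictly positive and $P$ is nonnegative. In particular, $(A + a)^{-1}$ exists for $a \ge 0$. Let
\[ \vec{f} = \vec{f}(p) = \frac{\alpha}{m}A^{-1}\sum_{\mu=1}^{d}p_\mu\vec{v}^\mu \in \mathbb{R}^D. \]
Then we have
\[ H^\epsilon_{L,a}(p) = \frac{1}{2}(p_{\mathrm{rad}}, p_{\mathrm{rad}})_D + \frac{1}{2}((q_{\mathrm{rad}} - \vec{f}), A(q_{\mathrm{rad}} - \vec{f}))_D + \frac{1}{2m}p^2 - \frac{1}{2}(\vec{f}, A\vec{f})_D - \frac{1}{2}\mathrm{tr}\sqrt{A_0}. \]
By (4.12) and the von Neumann uniqueness theorem, there exists a unitary operator $\vartheta : \mathcal{F} \to L^2(\mathbb{R}^D)$ implementing
\[ \vartheta p^{jl}_{\mathrm{rad}}\vartheta^{-1} = -i\frac{\partial}{\partial x_{jl}}, \qquad \vartheta q^{jl}_{\mathrm{rad}}\vartheta^{-1} = x_{jl}. \]
Then $H^\epsilon_{L,a}(p)$ is unitarily equivalent with the harmonic oscillator
\[ \frac{1}{2}(\vec{p}, \vec{p})_D + \frac{1}{2}((\vec{q} - \vec{f}), A(\vec{q} - \vec{f}))_D + \frac{1}{2m}p^2 - \frac{1}{2}(\vec{f}, A\vec{f})_D - \frac{1}{2}\mathrm{tr}\sqrt{A_0} \]
in $L^2(\mathbb{R}^D)$. By the shift $\vec{q} \to \vec{q} + \vec{f}$ implemented by a unitary operator, we obtain (4.14).

Lemma 4.6 Suppose the same assumptions as in Lemma 4.5. Then
\[ \inf\sigma(H^\epsilon_{L,a}(p)) = \frac{1}{2m}p^2 - \frac{1}{2}(\vec{f}, A\vec{f})_D + \frac{1}{2}\mathrm{tr}\bigl(\sqrt{A} - \sqrt{A_0}\bigr). \tag{4.15} \]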

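Proof. Generally for the harmonic oscillator
\[ H_T = \frac{1}{2}(\vec{p}, \vec{p})_D + \frac{1}{2}(\vec{q}, T\vec{q})_D \]
with a symmetric nonnegative matrix $T$, it follows that
\[ \inf\sigma(H_T) = \frac{1}{2}\mathrm{tr}\sqrt{T}. \]
Hence
\[ \inf\sigma\Bigl(\frac{1}{2}(\vec{p}, \vec{p})_D + \frac{1}{2}(\vec{q}, A\vec{q})_D\Bigr) = \frac{1}{2}\mathrm{tr}\sqrt{A}. \]
Thus the ground state energy of $H^\epsilon_{L,a}(p)$ is given by (4.15).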
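The formula $\inf\sigma(H_T) = \frac{1}{2}\mathrm{tr}\sqrt{T}$ says that the ground state energy is half the sum of the normal-mode frequencies, which are the square roots of the eigenvalues of $T$. The following is a small sketch for two coupled modes, restricted to $2 \times 2$ matrices so that the eigenvalues are available in closed form. The function names are illustrative assumptions.

```python
import math


def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius


def ground_energy(a, b, c):
    """(1/2) tr sqrt(T) for T = [[a, b], [b, c]], assumed nonnegative."""
    low, high = eigenvalues_sym2(a, b, c)
    return 0.5 * (math.sqrt(low) + math.sqrt(high))


def ground_energy_from_invariants(a, b, c):
    """Same quantity from (tr sqrt T)^2 = tr T + 2 sqrt(det T), valid for 2x2 T >= 0."""
    return 0.5 * math.sqrt(a + c + 2.0 * math.sqrt(a * c - b * b))
```

For $T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ the eigenvalues are $1$ and $3$, so both functions return $\frac{1}{2}(1 + \sqrt{3})$.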


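We calculate $(\vec{f}, A\vec{f})_D$ and $\mathrm{tr}(\sqrt{A} - \sqrt{A_0})$ as follows. By (4.13) we note that
\[ (\vec{v}^\mu, A_0^{-1}\vec{v}^\nu) = \delta_{\mu\nu}\,q\sum_{|l|\le 2\pi L}\frac{|\hat\varphi(l)|^2}{\omega_\epsilon(l)^2}, \tag{4.16} \]
\[ (\vec{v}^\mu, (s^2 + A_0)^{-1}\vec{v}^\nu) = \delta_{\mu\nu}\,q\sum_{|l|\le 2\pi L}\frac{|\hat\varphi(l)|^2}{s^2 + \omega_\epsilon(l)^2}, \tag{4.17} \]
\[ (\vec{v}^\mu, (s^2 + A_0)^{-1}A_0\vec{v}^\nu) = \delta_{\mu\nu}\,q\sum_{|l|\le 2\pi L}\frac{\omega_\epsilon(l)^2|\hat\varphi(l)|^2}{s^2 + \omega_\epsilon(l)^2}. \tag{4.18} \]
Furthermore
\[ A^{-1} = \mathrm{s}\text{-}\lim_{N\to\infty}\sum_{n=1}^{N}(-\lambda)^{n-1}(A_0^{-1}P)^{n-1}A_0^{-1}. \tag{4.19} \]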

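Lemma 4.7 Suppose the same assumptions as in Lemma 4.5. Then
\[ \frac{1}{2m}p^2 - \frac{1}{2}(\vec{f}, A\vec{f})_D = \frac{1}{2m}\frac{p^2}{1 + \lambda\theta}, \tag{4.20} \]
where
\[ \theta = \theta(a, L, \epsilon) = q\sum_{|l|\le 2\pi L}\frac{|\hat\varphi(l)|^2}{\omega_\epsilon(l)^2}. \]
Proof. By (4.16) we have
\[ \begin{aligned}
(\vec{f}, A\vec{f})_D &= \frac{\lambda}{m}\sum_{\mu,\nu=1}^{d}p_\mu p_\nu(\vec{v}^\mu, A^{-1}\vec{v}^\nu)_D \\
&= \frac{\lambda}{m}\sum_{\mu,\nu=1}^{d}\sum_{n=1}^{\infty}p_\mu p_\nu(-\lambda)^{n-1}(\vec{v}^\mu, (A_0^{-1}P)^{n-1}A_0^{-1}\vec{v}^\nu)_D \\
&= \frac{\lambda}{m}\sum_{n=1}^{\infty}\sum_{\mu,\mu_1,\dots,\mu_{n-1},\nu=1}^{d}p_\mu p_\nu(-\lambda)^{n-1}(\vec{v}^\mu, A_0^{-1}\vec{v}^{\mu_1})_D(\vec{v}^{\mu_1}, A_0^{-1}\vec{v}^{\mu_2})_D\cdots(\vec{v}^{\mu_{n-1}}, A_0^{-1}\vec{v}^\nu)_D \\
&= \frac{\lambda}{m}\sum_{n=1}^{\infty}\sum_{\mu,\mu_1,\dots,\mu_{n-1},\nu=1}^{d}p_\mu p_\nu\,\delta_{\mu\mu_1}\delta_{\mu_1\mu_2}\cdots\delta_{\mu_{n-1}\nu}(-\lambda)^{n-1}\theta^n \\
&= \frac{\lambda}{m}\sum_{n=1}^{\infty}(-\lambda)^{n-1}p^2\theta^n = \frac{\lambda\theta}{1 + \lambda\theta}\frac{p^2}{m}.
\end{aligned} \]
Hence (4.20) follows.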
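The resummation in this proof is a Sherman-Morrison type identity: for a rank-one perturbation $A = A_0 + \lambda|v\rangle\langle v|$ one has $(v, A^{-1}v) = \theta/(1 + \lambda\theta)$ with $\theta = (v, A_0^{-1}v)$. The sketch below checks this for two modes and a single vector $v$, which is a simplification of the paper's setting with $d$ mutually orthogonal vectors $\vec{v}^\mu$. The function names and test values are illustrative assumptions.

```python
def theta(v, w):
    """theta = sum_l v_l^2 / w_l^2 = (v, A0^{-1} v) for A0 = diag(w_l^2)."""
    return sum(vi * vi / (wi * wi) for vi, wi in zip(v, w))


def resolvent_form_2x2(v, w, lam):
    """(v, A^{-1} v) for A = A0 + lam |v><v| with two modes, via the explicit inverse."""
    a11 = w[0] ** 2 + lam * v[0] * v[0]
    a22 = w[1] ** 2 + lam * v[1] * v[1]
    a12 = lam * v[0] * v[1]
    det = a11 * a22 - a12 * a12
    # A^{-1} = (1/det) * [[a22, -a12], [-a12, a11]]
    return (a22 * v[0] ** 2 - 2.0 * a12 * v[0] * v[1] + a11 * v[1] ** 2) / det


def neumann_sum(th, lam, terms=200):
    """Partial sum of sum_{n>=1} (-lam)^(n-1) * th^n, convergent for |lam*th| < 1."""
    return sum((-lam) ** (n - 1) * th ** n for n in range(1, terms + 1))
```

With $v = (0.3, 0.4)$, $w = (1, 2)$ and $\lambda = 0.7$ one gets $\theta = 0.13$, and both the explicit inverse and the series reproduce $\theta/(1 + \lambda\theta)$. The closed form stays valid for every $\lambda\theta > -1$, while the series only converges for $|\lambda\theta| < 1$.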




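Lemma 4.8 Suppose the same assumptions as in Lemma 4.5. Then
\[ \frac{1}{2}\mathrm{tr}\bigl(\sqrt{A} - \sqrt{A_0}\bigr) = \frac{dq}{2\pi}\int_{-\infty}^{\infty}\frac{\lambda s^2}{1 + \lambda\xi}\sum_{|l|\le 2\pi L}\frac{|\hat\varphi(l)|^2}{(s^2 + \omega_\epsilon(l)^2)^2}\,ds, \tag{4.21} \]
where
\[ \xi = \xi(a, L, \epsilon) = q\sum_{|l|\le 2\pi L}\frac{|\hat\varphi(l)|^2}{s^2 + \omega_\epsilon(l)^2}. \]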
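The integral in (4.21) rests on the scalar identity $\sqrt{a} = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{a}{s^2 + a}\,ds$ for $a > 0$, applied to the eigenvalues of $A$ and $A_0$. The sketch below verifies this identity numerically. The substitution $s = \tan u$ and the function name are choices made here for illustration, not taken from the paper.

```python
import math


def sqrt_via_resolvent(a, steps=2000):
    """Evaluate (1/pi) * integral over R of a / (s^2 + a) ds for a > 0.

    With s = tan(u) the integral becomes the integral over (-pi/2, pi/2) of
    a / (sin(u)^2 + a * cos(u)^2) du, a smooth periodic integrand for which
    the midpoint rule converges quickly.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        u = -0.5 * math.pi + (i + 0.5) * h
        total += a / (math.sin(u) ** 2 + a * math.cos(u) ** 2)
    return total * h / math.pi
```

Applying the function to two values $a$ and $a_0$ and subtracting gives the scalar version of $\mathrm{tr}\sqrt{A} - \mathrm{tr}\sqrt{A_0}$.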

Proof. We see that

Let A∞

√ 1 ∞ tr A − tr A0 = tr A(s2 + A)−1 − A0 (s2 + A0 )−1 ds. π −∞ n ∞ = n=1 −λP (s2 + A0 )−1 . We have

A(s2 + A)−1 − A0 (s2 + A0 )−1 = λP (s2 + A0 )−1 + A(s2 + A0 )−1 A∞ . It follows that trλP (s2 + A0 )−1 = λ

d

(φ, Bvµ )D (Bvµ , (s2 + A0 )−1 φ)D ,

µ=1 φ:CONS

where φ:CONS means to sum up all the vectors φn in a complete orthonormal system (CONS). Take a CONS such that Bvµ , φ2 , φ3 , · · · . φ1 = Bvµ Then we have by (4.17) = λ (s2 + A0 )−1 !D = dλξ.

(4.22)

We see that A(s2 + A0 )−1 A∞ = A0 (s2 + A0 )−1 A∞ + λP (s2 + A0 )−1 A∞ . It follows that 2

−1

trA0 (s + A0 )

A∞ =

∞ n=1

=

∞

d

(s2 + A0 )−1 A0 φ, (P (s2 + A0 )−1 )n φ D

(−λ)n

φ:CONS

(−λ)n

n=1 µ1 ,··· ,µn =1

((s2 + A0 )−1 A0 φ, Bvµ1 )D ×

φ:CONS

×((s2 + A0 )−1Bvµ1 , Bvµ2 )D · · · ((s2 + A0 )−1Bvµn , φ)D .


Take a CONS such that (s2 + A0 )−1Bvµn φ1 = , φ2 , φ3 , · · · , . (s2 + A0 )−1Bvµn From (4.18) it follows that =

∞

d

(−λ)n ((s2 + A0 )−1Bvµn , (s2 + A0 )−1 A0Bvµ1 )D ×

n=1 µ1 ,··· ,µn =1

×((s2 + A0 )−1Bvµ1 , Bvµ2 )D · · · ((s2 + A0 )−1Bvµn−1 , Bvµn )D =

∞

d

(−λ) δµ1 µ2 · · · δµn−1 µn ξ n−1 (s2 + A0 )−2 A0 !D n

n=1 µ1 ,··· ,µn =1

= −λ

−dλ (s2 + A0 )−2 A0 !D = 1 + λξ 1 + λξ

|l|≤2πL

ω (l)2 |ϕ(l)| ˆ 2 , (s2 + ω (l)2 )2

(4.23)

and trλP (s2 + A0 )−1 A∞ =

∞

(−λ)n λ

n=1

=

∞

d

n φ, P (s2 + A0 )−1 P (s2 + A0 )−1 φ

φ:CONS

D

(−λ)n λ

n=1 µ1 ,··· ,µn+1 =1

(φ, Bvµ1 )D (Bvµ1 , (s2 + A0 )−1Bvµ2 )D · · · (Bvµn+1 , (s2 + A0 )−1 φ)D .

φ:CONS

Take a CONS such that Bvµ1 , φ2 , φ3 , · · · , . φ1 = Bvµ1 Then we see that =−

∞

n+1

(−λ)

n=1

d

δµ1 µ2 · · · δµn µn+1 δµn+1 µ1 ξ n+1

µ1 ,··· ,µn+1 =1

=d

−λ2 ξ 2 . 1 + λξ

(4.24)




Hence we have by (4.22),(4.23) and (4.24), tr A(s2 + A)−1 − A0 (s2 + A0 )−1 (λξ) (s2 + A0 )−2 A0 !D (−λ) − d + dλξ 1 + λξ 1 + λξ |ϕ(l)| ˆ 2 dq ω (l)2 |ϕ(l)| ˆ 2 =λ − 1 + λξ (s2 + ω (l)2 ) (s2 + ω (l)2 )2 2

=

|l|≤2πL

=

dqλs2 1 + λξ

|l|≤2πL

(s2

|ϕ(l)| ˆ 2 . + ω (l)2 )2

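Thus the lemma follows.

Lemma 4.9 Suppose the same assumptions as in Lemma 4.5. Then
\[ \inf\sigma(H^\epsilon_{L,a}(p)) = \frac{p^2}{2(m + \alpha^2\theta)} + \frac{d-1}{2\pi}\int_{-\infty}^{\infty}\frac{\alpha^2s^2}{m + \alpha^2\xi}\sum_{|l|\le 2\pi L}\frac{|\hat\varphi(l)|^2}{(s^2 + \omega_\epsilon(l)^2)^2}\,ds. \]
Proof. It follows from Lemmas 4.7 and 4.8.

Lemma 4.10 We have
\[ E(p) = \frac{p^2}{2m_{\mathrm{eff}}} + \alpha^2 g, \tag{4.25} \]
where
\[ m_{\mathrm{eff}} = m + \alpha^2 q\,\|\hat\varphi/\omega\|^2, \qquad g = \frac{d-1}{2\pi}\int_{-\infty}^{\infty}\frac{t^2\,\|\hat\varphi/(t^2+\omega^2)\|^2}{m + \alpha^2 q\,\|\hat\varphi/\sqrt{t^2+\omega^2}\|^2}\,dt. \]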

Proof. We set meff (a, L, ") = m + α2 θ, and g(a, L, ") =

d−1 2π

∞ −∞

α2 s2 m + α2 ξ

|l|≤2πL

(s2

|ϕ(l)| ˆ 2 ds. + ω (l)2 )2

Note that meff (a, L, ") → meff and g(a, L, ") → g as a → ∞, L → ∞, " → 0. Taking a → ∞ and then L → ∞, we see that (p) → Hdip (p) + "Nf HL,a


uniformly in the resolvent sense, which yields that (p)) → infσ(Hdip (p) + "Nf ). infσ(HL,a

Hence

infσ(Hdip (p) + "Nf ) = lim lim

L→∞ a→∞

=

p2 + α2 g(a, L, ") 2meff (a, L, ")

p2 + α2 g("), 2meff (")

where meff (") and g(") are defined by meff and g with ω replaced by ω , respectively. Since (4.26) Hdip (p) + "Nf → Hdip (p) strongly on D(Hf ) as " → 0, (4.26) holds in the strong resolvent sense. Then it follows (4.27) lim sup infσ(Hdip (p) + "Nf ) ≤ infσ(Hdip (p)). →0

Furthermore, since Nf ≥ 0, we have lim inf infσ(Hdip (p) + "Nf ) ≥ infσ(Hdip (p)). →0

(4.28)

Combining (4.27) and (4.28) we have infσ(Hdip (p) + "Nf ) → infσ(Hdip (p)) = E(p) as " → 0. Then

E(p) = lim

→0

p2 p2 + α2 g(") = + α2 g. 2meff (") 2meff

Hence the lemma follows.

Corollary 4.11 $H_{\mathrm{dip}}(p)$ is self-adjoint on $D(H_f)$ for all $\alpha \in \mathbb{R}$ and $p \in \mathbb{R}^d$, and bounded below. Moreover $U(p)\Omega$ is a ground state of $H_{\mathrm{dip}}(p)$ with eigenvalue $\frac{1}{2m_{\mathrm{eff}}}p^2 + \alpha^2 g$.

Proof. The corollary follows from Lemmas 4.4 and 4.10.

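4.3 Effective potential and scaling limit

We define the unitary operator on $\mathcal{H} \cong \int^\oplus_{\mathbb{R}^d}\mathcal{F}\,dx$ by
\[ U = \Bigl(\int^\oplus_{\mathbb{R}^d}S(p)\,dp\Bigr)Re^{i\frac{\pi}{2}N_f}. \tag{4.29} \]
Lemma 4.12 We define $U(\kappa)$ by the substitution $\omega \to \kappa^2\omega$ and $\hat\varphi \to \kappa\hat\varphi$. Then
\[ \mathrm{s}\text{-}\lim_{\kappa\to\infty}U(\kappa) = I \otimes R. \]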




Proof. By the definition of R, we see that R is independent of κ, while S(p) is scaled as ϕˆ 1 α S(p) → S(p, κ) = exp −i p·Π , p ∈ Rd . κ meff ω2 Hence s − lim S(p, κ) = I,

p ∈ Rd .

s − lim U (p, κ) = R,

p ∈ Rd .

κ→∞

Thus κ→∞

The desired result follows from the definition (4.29).

Proof of Lemma 3.2. We apply [2, Theorem 2.2]. We check (1) D(δV (k)) ⊃ D(Heff ) and δV (κ)(Heff + λ)−1 is bounded in H for large λ > 0 with limλ→∞ δV (κ)(Heff + λ)−1 H = 0 uniformly in κ, (2) δV (κ)(Heff + λ)−1 is strongly continuous in κ, (3) s − limκ→∞ δV (κ)(Heff + λ)−1 = 0. Thus (1) to (3) imply that −1

− α2 κ2 g + z s − lim H(κ) = (Heff + z)−1 ⊗ PΩ . κ→∞

Since, by Lemma 4.12, we have −1 −1

= s − lim U (κ) H(κ) − α2 κ2 g + z U (κ)−1 s − lim H(κ) − α2 κ2 g + z κ→∞

κ→∞

−1

= s − lim (Heff + z) κ→∞

⊗ (RPΩ R−1 ).

Corollary 4.11 tells us that RPΩ R−1 = Pg . Thus the lemma follows.

Theorem 4.13 Let V be relatively bounded with respect to −∆ with a sufficiently small relative bound. Then U maps D(p2 ) ∩ D(Hf ) onto itself and U −1 (Hdip + V )U = Heff + Hf + α2 g + δV.

(4.30)

Proof. By Lemma 4.4, we have U −1 Hdip U = Heff + Hf + α2 g

(4.31)

on a core of the right-hand side above, e.g., C0∞ (Rd ) ⊗alg [Ffin ∩ D(Hf )]. Since Hdip is self-adjoint on D(p2 ) ∩ D(Hf ), a limiting argument tells us that U maps D(p2 ) ∩ D(Hf ) onto itself and (4.31) is valid on D(p2 ) ∩ D(Hf ). Thus the theorem follows.


Theorem 4.14 We have δV := T −1 V T − V, where α T = exp −i pK , meff µ 1 Λj (k)a∗ (k, j) + Λµj (k)∗ a(k, j) dd k, Kµ = √ 2 Λµj =

eµj Q ω 3/2

.

(4.32)

Proof. We see that π π α ϕˆ U −1 e−ikx U = e−ikx e−i 2 Nf R−1 exp −i kΠ Rei 2 Nf . meff ω2 Let ϕ/ω ˆ 3/2 = f . By (4.7), (4.8), Lemma 4.2, and f = f¯ = f , we have ϕˆ i −1 ∗ µ µ −1 √ a R R Πµ (e f, j) R f , j) − a(e R = j j ω2 2 i ∗ = √ R−1 b∗ (W + ij eµj f, i) − b(W−∗ ij eµj f, i) 2 ∗ eµ f¯, i) − b(W ∗ eµ f¯, i) R +b∗ (W − ij j

+ ij j

i ∗ ∗ = √ a∗ (W + ij eµj f + W − ij eµj f, i) − a(W−∗ ij eµj f + W+∗ ij eµj f , j) 2 i ∗ 1/2 α −1/2 f , i) . = √ a (ω ei Tαβ dβµ ω −1/2 f , i) − a(ω 1/2 eα i Tαβ dβµ ω 2

Note the following algebraic relation eµj Q ω 3/2 Since

−1/2 = ω 1/2 eα ϕ. ˆ j Tαβ dβµ ω

e−i 2 Nf i {a∗ (g) − a(g)} ei 2 Nf = a∗ (g) + a(g), π

π

we have e Hence

−i π 2 Nf

R

−1

Πµ

ϕˆ ω2

π

Rei 2 Nf = Kµ .

(4.33)

U −1 e−ikx U = e−ikx e−iαkK/meff = T −1 e−ikx T.

Let ρ and V be in the proof of Lemma 2.6. We see that −1 −d/2 −1 −ikx d ρˇ(k)U e Ud k = ρˇ(k)T −1 e−ikx T dd k = T −1 ρT. U ρU = (2π) Rd

Rd




Thus, for $\Psi \in C_0^\infty(\mathbb{R}^d) \otimes_{\mathrm{alg}} [\mathcal{F}_{\mathrm{fin}} \cap D(H_f)]$, $U^{-1}VU\Psi = T^{-1}VT\Psi$. Hence (4.30) holds on $C_0^\infty(\mathbb{R}^d) \otimes_{\mathrm{alg}} [\mathcal{F}_{\mathrm{fin}} \cap D(H_f)]$. By a limiting argument, we obtain (4.30) on $D(p^2) \cap D(H_f)$.

Acknowledgment F. H. thanks the Graduiertenkolleg “Mathematik in ihrer Wechselbeziehung zur Physik” of the LMU Munich and Grant-in-Aid 13740106 for Encouragement of Young Scientists from the Ministry of Education, Science, Sports and Culture for financial support.

References

[1] A. Arai, Rigorous theory of spectra and radiation for a model in quantum electrodynamics, J. Math. Phys. 24, 1896–1910 (1983).
[2] A. Arai, An asymptotic analysis and its application to the nonrelativistic limit of the Pauli-Fierz and a spin-boson model, J. Math. Phys. 31, 2653–2663 (1990).
[3] A. Arai and M. Hirokawa, On the existence and uniqueness of ground states of a generalized spin-boson model, J. Funct. Anal. 151, 455–503 (1997).
[4] V. Bach, J. Fröhlich and I. M. Sigal, Mathematical theory of non-relativistic matter and radiation, Lett. Math. Phys. 34, 183–201 (1995).
[5] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137, 205–298 (1998).
[6] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation fields, Commun. Math. Phys. 207, 249–290 (1999).
[7] E. A. Berezin, The Method of Second Quantization, Academic Press, 1966.
[8] V. Betz, F. Hiroshima, J. Lorinczi, R. A. Minlos and H. Spohn, Properties of the ground state of a scalar quantum field model: A Gibbs measure-based approach, TU-München preprint, 2001.
[9] C. Gérard, On the existence of ground states for massless Pauli-Fierz Hamiltonians, Ann. H. Poincaré 1, 443–460 (2000).
[10] M. Griesemer, E. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Los Alamos Preprint Archive, math-ph/0007014, 2000.
[11] F. Hiroshima, Scaling limit of a model of quantum electrodynamics, J. Math. Phys. 34, 4478–4518 (1993). 9, 201–225 (1997).
[12] F. Hiroshima, Ground states and spectrum of quantum electrodynamics of non-relativistic particles, Trans. Amer. Math. Soc. 353, 4497–4528 (2001).
[13] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics I, J. Math. Phys. 40, 6209–6222 (1999).
[14] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics II, J. Math. Phys. 41, 661–674 (2000).
[15] F. Hiroshima, Essential self-adjointness of translation invariant quantum field models for arbitrary coupling constants, Commun. Math. Phys. 211, 585–613 (2000).
[16] F. Hiroshima, The self-adjointness of the Pauli-Fierz Hamiltonian in quantum electrodynamics for arbitrary coupling constants, mp-arc 01-097, 2001.
[17] E. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, Studies in Mathematical Physics, Princeton Univ. Press, pp. 269–303 (1976).
[18] H. Spohn, Ground state(s) of the spin-boson Hamiltonian, Commun. Math. Phys. 123, 277–304 (1989).
[19] H. Spohn, Ground state of quantum particle coupled to a scalar boson field, Lett. Math. Phys. 44, 9–16 (1998).

Fumio Hiroshima
Department of Mathematics and Physics
Setsunan University
572-8508, Osaka
Japan

Herbert Spohn
Zentrum Mathematik and Physik Department
TU München
D-80290, München
Germany

Communicated by Bernard Helffer
submitted 21/01/01, accepted 20/07/01




On the Semiclassical Asymptotics of the Current and Magnetic Moment of a Non-Interacting Electron Gas at Zero Temperature in a Strong Constant Magnetic Field S. Fournais ∗

Abstract. We calculate the asymptotic form of the quantum current and magnetic moment of a non-interacting electron gas at zero temperature. The calculation uses coherent states and a novel commutator identity for the current operator.

1 Introduction In recent years a lot of mathematical research has been focused on understanding quantum mechanics in magnetic fields. The semiclassical results obtained so far in this area have concentrated on the energy (i.e. the sum of the negative eigenvalues) and the density. Nevertheless, in the presence of a magnetic field, the current (and the magnetic moment) is as natural a quantity as the density, but it has not received the same attention in the mathematical community. There are two possible reasons for this: The current vanishes for a Schrödinger operator without magnetic field, i.e. current is truly a property of problems with magnetic fields. Secondly, the current of a classical electron gas at equilibrium vanishes, and therefore, as was proved in [Fou98], in a standard semiclassical limit the leading (Weyl-like) term for the current is zero. There exists, however, another semiclassical limit, introduced by Lieb, Solovej and Yngvason in [LSY94], in which the magnetic field strength µ is allowed to vary as the semiclassical parameter h tends to zero. The new semiclassical limit was introduced in order to study ground state properties of large atoms in magnetic fields as strong as those which exist on the surface of a neutron star. The purpose of this paper is to study the current in this semiclassical limit, applications to the calculation of the current/magnetic moment of large atoms in strong magnetic fields will be given in a later paper. When attacking semiclassical problems in strong magnetic fields there are two different approaches possible: One can use the very precise pseudodifferential machinery developed by Ivrii and others (see [Ivr98] and [Sob94]). This will give very good remainder estimates and can be applied quite directly to the current. The drawback of the method is that it is technically involved and requires a certain degree of smoothness of the potentials. 
An alternative approach is the variational approach used by Lieb, Solovej and Yngvason in the paper [LSY94] to calculate the energy and the density. This method uses coherent states to approximate the true ground state and (magnetic) Lieb-Thirring inequalities to bound the error terms. Here we will apply this latter technique to calculate the current. As will be explained below, some new ideas are necessary in order to do so since the current operator is a priori too big to fit in the scheme. We get around this difficulty by applying a novel commutator formula for the current. This method unfortunately only works for magnetic fields which are not too strong. In a later paper [Fou99] we will apply Ivrii’s microlocal techniques to the current and thereby improve the error estimates and enlarge the range of allowed magnetic field strengths. Notice, however, that it is necessary to use the commutator formula in order to calculate the current: an approximate ground state does not necessarily have the right current. This is illustrated in Appendix A where we construct a trial density matrix that gives the correct semiclassical energy but fails to give the right current.

In this paper we study the current and magnetic moment of an electron gas in a strong constant magnetic field. Suppose the dynamics of an electron is governed by the Pauli operator
\[ P = P(\mu\vec{A}, V) = (-ih\nabla + \mu\vec{A})^2 + V(x) + h\mu\,\vec\sigma\cdot\vec{B}, \]
acting in $L^2(\mathbb{R}^3; \mathbb{C}^2)$. Here $V$ is a real potential, $\vec\sigma = (\sigma_1, \sigma_2, \sigma_3)$ is the vector of Pauli spin matrices:
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \]
and $\vec{B} = \nabla \times \vec{A}$. The operator $P$ contains two parameters $h, \mu \in \mathbb{R}_+$, where $h$ is a semi-classical parameter, which we will let tend to zero, and $\mu$ is a parameter measuring the strength of the magnetic field. We will let $\mu \to +\infty$ as $h \to 0$ in such a way that the product $\mu h$ remains bounded below, i.e. $\mu h \ge c > 0$. Let $\psi$ be any state, then the current in the state $\psi$ is the distribution $\vec{j}_\psi$ given by
\[ \vec{j}_\psi\cdot\vec{a} = \langle\psi|J(\vec{a})|\psi\rangle, \]
where $J(\vec{a})$ is the operator
\[ J(\vec{a}) = \vec{a}\cdot(-ih\nabla + \mu\vec{A}) + (-ih\nabla + \mu\vec{A})\cdot\vec{a} + h\,\vec\sigma\cdot\vec{b}, \]
with $\vec{b} = \mathrm{curl}\,\vec{a}$.

∗ Partially supported by the European Union, grant FMRX-960001.
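The spin term $h\mu\,\vec\sigma\cdot\vec{B}$ is controlled by the algebra of the Pauli matrices: they satisfy $\sigma_j\sigma_k + \sigma_k\sigma_j = 2\delta_{jk}I$, hence $(\vec\sigma\cdot\vec{B})^2 = |\vec{B}|^2 I$ and $\vec\sigma\cdot\vec{B}$ has eigenvalues $\pm|\vec{B}|$. The sketch below checks these standard relations with plain nested tuples. The helper names are illustrative assumptions.

```python
SIGMA = (
    ((0, 1), (1, 0)),        # sigma_1
    ((0, -1j), (1j, 0)),     # sigma_2
    ((1, 0), (0, -1)),       # sigma_3
)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def anticommutator(a, b):
    """a*b + b*a for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] + ba[i][j] for j in range(2)) for i in range(2))


def sigma_dot(field):
    """The 2x2 matrix sigma . B for a real 3-vector B."""
    return tuple(
        tuple(sum(field[n] * SIGMA[n][i][j] for n in range(3)) for j in range(2))
        for i in range(2)
    )
```

For instance, with $\vec{B} = (1, 2, 2)$ the square of `sigma_dot((1, 2, 2))` is $9I$, in agreement with $|\vec{B}|^2 = 9$.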

Vol. 2, 2001

On Semiclassical Asymptotics for a Non-Interacting Electron Gas


The quantity that we will study in this paper is the total current of a noninteracting electron gas at zero temperature, i.e. we sum the current of all eigenfunctions below zero (we set the chemical potential equal to zero). Thus the definition of the current (as a distribution) is:
$$\int \vec j\cdot\vec a\,dx = \mathrm{tr}\big[J(\vec a)\,1_{(-\infty,0]}(P)\big].$$
The current is given as the curl of the magnetic moment, $\vec j = \mathrm{curl}\,\vec m$; thus results for the current translate directly into results for the magnetic moment.

Remark 1.1. The quantity $\int \vec j\cdot\vec a\,dx$ should only depend on the magnetic field generated by $\vec a$, i.e. on $\mathrm{curl}\,\vec a$. That this is indeed the case can easily be seen from the fact that
$$[P,\phi] = ihJ(\nabla\phi),$$
and therefore $\langle\psi|J(\nabla\phi)|\psi\rangle = 0$ for all $\psi$ which are eigenfunctions of $P$.

1.1 Statement of the results

The energy of the electron gas is given by:
$$E(\mu\vec A,V) = \mathrm{tr}\big[P(\mu\vec A,V)\,1_{(-\infty,0]}(P(\mu\vec A,V))\big].$$
Notice that this is clearly a negative quantity. The semiclassical asymptotics of the energy has been calculated by [LSY94] (constant magnetic field) and [ES97] (non-constant fields), and it was found, under very general conditions on $V$, $\vec A$, $\mu$, that
$$E(\mu\vec A,V)\approx\frac{\mu}{h^2}E_{scl}(\mu\vec A,V),$$
where
$$E_{scl}(\mu\vec A,V) = -\frac{2}{3\pi}\sum_{n=0}^{\infty} d_n\int |\vec B|\,\big[2n\mu h|\vec B| + V(x)\big]_-^{3/2}\,dx, \qquad (1.1)$$
with $d_0 = \frac{1}{2\pi}$ and $d_n = \frac{1}{\pi}$ for $n\ge1$. Here and in what follows we use the notation
$$[x]_- = \begin{cases} -x, & x\le 0,\\ 0, & x>0.\end{cases}$$

Remark 1.2. If we study $(-ih\nabla+\mu\vec A)^2 - \mu h + V(x)$ on $L^2(\mathbb{R}^3)$ instead of $P$ (i.e. restrict to the spin-down subspace), then the only change is that we have to put $d_n = 1/(2\pi)$ for all $n$ in (1.1).
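To make formula (1.1) concrete, the following sketch evaluates its integrand at a single point. It is only an illustration under stated assumptions: the function names are ours, $\mu h$ is replaced by a fixed number `beta` > 0, and `b` stands for $|\vec B|$ at the chosen point. Because $[\,\cdot\,]_-$ vanishes for positive arguments, only finitely many Landau levels contribute.

```python
import math


def neg_part(x):
    """[x]_- as defined in the text: -x for x <= 0, and 0 for x > 0."""
    return -x if x <= 0 else 0.0


def d(n):
    """Weights d_0 = 1/(2 pi) and d_n = 1/pi for n >= 1."""
    return 1.0 / (2.0 * math.pi) if n == 0 else 1.0 / math.pi


def escl_density(v, beta, b=1.0):
    """Integrand of (1.1) where V(x) = v, |B| = b and mu*h = beta (beta*b > 0)."""
    total, n = 0.0, 0
    # Terms vanish as soon as 2*n*beta*b + v > 0, so the sum is finite.
    while 2 * n * beta * b + v <= 0:
        total += d(n) * b * neg_part(2 * n * beta * b + v) ** 1.5
        n += 1
    return -2.0 / (3.0 * math.pi) * total
```

For instance, with `v = -1.0` and `beta = 1.0` only the level $n=0$ lies below zero, so the value reduces to $-\frac{2}{3\pi}\cdot\frac{1}{2\pi}$.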

Now, formally¹, $\vec j = \frac{\delta E}{\delta\vec A}$, so we would expect that
$$\frac{h^2}{\mu}\int \vec j\cdot\mu\vec a\,dx \approx \frac{d}{ds}\Big|_{s=0}E_{scl}(\mu(\vec A+s\vec a),V) = -\frac{2}{3\pi}\sum_{n=0}^{\infty} d_n\int \vec b\cdot\vec B\,\Big([2nh\mu+V(x)]_-^{3/2} - 3nh\mu\,[2nh\mu+V(x)]_-^{1/2}\Big)\,dx,$$
as $h$ tends to zero. This is indeed the result of the paper:

Theorem 1.3. Suppose that $\vec A = \frac12(-x_2,x_1,0)$, that $\vec a = (a_1,a_2,0)\in C_0^\infty(\mathbb{R}^3)$, and that $V,\ \tilde a\cdot\nabla V\in L^{3/2}(\mathbb{R}^3)\cap L^{5/2}(\mathbb{R}^3)$, where $\tilde a = (-a_2,a_1,0)$. Suppose furthermore that $\mu h\to\beta\in(0,+\infty)$ as $h\to0$. Then
$$\lim_{h\to0}\frac{h^2}{\mu}\int \vec j\cdot\mu\vec a\,dx = -\frac{2}{3\pi}\sum_{\nu=0}^{\infty} d_\nu\int(\partial_{x_1}a_2-\partial_{x_2}a_1)\Big([2\nu\beta+V(x)]_-^{3/2} - 3\nu\beta\,[2\nu\beta+V(x)]_-^{1/2}\Big)\,dx.$$

Theorem 1.3 only deals with the current perpendicular to the magnetic field. It turns out that the current parallel to the field is more difficult to analyze. We have the following result:

Theorem 1.4. Let the assumptions be as in Theorem 1.3 except that $\vec a\in C_0^\infty$ is arbitrary, i.e. $a_3$ is not necessarily vanishing, but still $\tilde a = (-a_2,a_1,0)$. Suppose $V$ satisfies the following additional symmetry constraint: $V(x_1,x_2,-x_3) = V(x_1,x_2,x_3)$. Then
$$\lim_{h\to0}\frac{h^2}{\mu}\int \vec j\cdot\mu\vec a\,dx = -\frac{2}{3\pi}\sum_{\nu=0}^{\infty} d_\nu\int(\partial_{x_1}a_2-\partial_{x_2}a_1)\Big([2\nu\beta+V(x)]_-^{3/2} - 3\nu\beta\,[2\nu\beta+V(x)]_-^{1/2}\Big)\,dx.$$
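The formal derivative behind these formulas can be sanity-checked numerically. The sketch below is an illustration under our own assumptions: $\vec B = (0,0,1)$, $\mu h$ frozen at a number `beta`, and hypothetical function names. It compares a finite-difference derivative in $s$ of one summand $|\vec B+s\vec b|\,[2n\beta|\vec B+s\vec b|+v]_-^{3/2}$ with the closed form $\vec b\cdot\vec B\,([w]_-^{3/2}-3n\beta[w]_-^{1/2})$, where $w = 2n\beta+v$.

```python
import math


def neg_part(x):
    """[x]_- as defined in the text."""
    return -x if x <= 0 else 0.0


def level_term(field_strength, n, beta, v):
    """|B| [2 n beta |B| + v]_-^{3/2}: one summand of (1.1) without d_n."""
    return field_strength * neg_part(2 * n * beta * field_strength + v) ** 1.5


def perturbed_strength(s, b_vec):
    """|B + s b| for the unit field B = (0, 0, 1)."""
    return math.sqrt((s * b_vec[0]) ** 2 + (s * b_vec[1]) ** 2
                     + (1 + s * b_vec[2]) ** 2)


def derivative_numeric(n, beta, v, b_vec, eps=1e-6):
    """Central finite difference in s at s = 0."""
    plus = level_term(perturbed_strength(eps, b_vec), n, beta, v)
    minus = level_term(perturbed_strength(-eps, b_vec), n, beta, v)
    return (plus - minus) / (2 * eps)


def derivative_closed_form(n, beta, v, b_vec):
    """b.B ([w]_-^{3/2} - 3 n beta [w]_-^{1/2}) with w = 2 n beta + v, B = e3."""
    w = 2 * n * beta + v
    return b_vec[2] * (neg_part(w) ** 1.5 - 3 * n * beta * neg_part(w) ** 0.5)
```

Only the component $b_3 = \vec b\cdot\vec B$ enters to first order, which is why the theorems involve $\partial_{x_1}a_2-\partial_{x_2}a_1$ alone.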

As mentioned before, the local magnetic moment $\vec m$ is defined by $\mathrm{curl}\,\vec m = \vec j$, or equivalently, $\vec m$ is the distribution:
$$\int \vec m\cdot\vec b\,dx = \mathrm{tr}\big[J(\vec a)\,1_{(-\infty,0]}(P)\big],$$
where $\vec a$ is any test function satisfying $\mathrm{curl}\,\vec a = \vec b$. Thus the results for the current (Thms 1.3 and 1.4) translate directly into results for the local magnetic moment. To be precise we state explicitly what Thm. 1.4 implies for the magnetic moment:

Corollary 1.5. Let $\vec b = \mathrm{curl}\,\vec a$ with $\vec a\in C_0^\infty(\mathbb{R}^3;\mathbb{R}^3)$. Suppose $V$ (and $\vec A$) satisfies the assumptions of Thm. 1.4, then
$$\lim_{h\to0}\frac{h^2}{\mu}\int \vec m\cdot\mu\vec b\,dx = -\frac{2}{3\pi}\sum_{\nu=0}^{\infty} d_\nu\int b_3\Big([2\nu\beta+V(x)]_-^{3/2} - 3\nu\beta\,[2\nu\beta+V(x)]_-^{1/2}\Big)\,dx.$$

¹ It is easy to prove that $\mathrm{tr}[J(\mu\vec a)1_{(-\infty,0]}(P)] = \frac{d}{dt}\big|_{t=0}E(\mu(\vec A+t\vec a),V)$, if the derivative on the right hand side exists.

1.2 Difficulties

Let us recall how the density is calculated [LS77]: The density $\rho$ is defined as
$$\int\rho\phi\,dx = \mathrm{tr}\big[\phi\,1_{(-\infty,0]}(P(\mu\vec A,V))\big],$$
for all $\phi\in C_0^\infty(\mathbb{R}^3)$. Formally, $\rho$ is the variational derivative of the energy with respect to $V$, i.e. formally $\rho = \frac{\delta E}{\delta V}$. To calculate the asymptotics of the density we use the following variational principle:
$$E(\mu\vec A,V) = \inf_{0\le\gamma\le1}\mathrm{tr}[\gamma P(\mu\vec A,V)],$$
where the infimum is taken over all density matrices, i.e. all operators satisfying the inequality $0\le\gamma\le1$ in the quadratic form sense. Here we need to apply the convention that if $\gamma P(\mu\vec A,V)$ is not trace class, then the trace is $+\infty$ by definition. Let $H(s) = P(\mu\vec A,V+s\phi)$ and let $E(s)$ be the corresponding energy. Then, by using $1_{(-\infty,0]}(P(\mu\vec A,V))$ in the variational principle for $E(s)$, we obtain:
$$E(s)-E(0)\le s\int\rho\phi\,dx.$$
If we now divide by $s\ne0$ on both sides of the inequality, multiply by $h^2/\mu$, and let $h$ and $s$ tend successively to zero, then we get:
$$\frac{h^2}{\mu}\int\rho\phi\,dx\to\int\frac{\delta E_{scl}}{\delta V}\phi\,dx.$$

Unfortunately, this technique does not work for the current: If we define
$$\tilde H(s) = P(\mu\vec A,V)+sJ(\mu\vec a),$$
and let $\tilde E(s)$ be the corresponding energy, then we get
$$\tilde E(s)\le\mathrm{tr}\big[1_{(-\infty,0]}(P(\mu(\vec A+s\vec a),V))\tilde H(s)\big] = E(\mu(\vec A+s\vec a),V) - s^2\mu^2\,\mathrm{tr}\big[\vec a^{\,2}\,1_{(-\infty,0]}(P(\mu(\vec A+s\vec a),V))\big].$$
The first term on the right hand side is known to be of order $\mu/h^2$, but the second term is of order $\mu^2\cdot\mu/h^2$! Thus, this term, which is quadratic in $s$ and therefore without interest for us, spoils the asymptotic picture. The moral of this calculation is that the operator $J$ is too big: adding just a bit of it changes the energy dramatically. The way out of this problem is to realize that $\langle\psi|J|\psi\rangle$ is 'small' for all $\psi$ which are eigenfunctions of $P(\mu\vec A,V)$. The commutator formula (see Section 2 below) will exactly give us that smallness of $\langle\psi|J|\psi\rangle$.
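The variational principle and the trial-state inequality used above have an exact finite-dimensional analogue, which the following sketch illustrates. This is a toy model under our own assumptions: random symmetric matrices stand in for the operators, NumPy is used, and all names are hypothetical. For a Hermitian matrix $H$, the infimum of $\mathrm{tr}[\gamma H]$ over $0\le\gamma\le1$ is the sum of the negative eigenvalues, and inserting the projector of $H(0)$ into the principle for $H(s)$ gives $E(s)\le E(0)+s\,\mathrm{tr}[W\gamma_0]$.

```python
import numpy as np


def negative_energy(H):
    """Sum of the negative eigenvalues of a Hermitian matrix H."""
    eigenvalues = np.linalg.eigvalsh(H)
    return float(eigenvalues[eigenvalues <= 0].sum())


def negative_projector(H):
    """Spectral projector of H onto the eigenvalues in (-inf, 0]."""
    eigenvalues, vectors = np.linalg.eigh(H)
    occupied = vectors[:, eigenvalues <= 0]
    return occupied @ occupied.conj().T


rng = np.random.default_rng(0)  # fixed seed, purely illustrative data


def random_symmetric(size):
    A = rng.standard_normal((size, size))
    return (A + A.T) / 2


H0 = random_symmetric(6)  # stands in for the unperturbed operator
W = random_symmetric(6)   # stands in for the perturbation
s = 0.1
gamma0 = negative_projector(H0)

# Trial-state inequality: E(s) <= tr[(H0 + s W) gamma0] = E(0) + s tr[W gamma0].
lhs = negative_energy(H0 + s * W)
rhs = negative_energy(H0) + s * float(np.trace(W @ gamma0))
```

The inequality holds for both signs of `s`, which is what turns it into a two-sided bound on the derivative once the energy asymptotics are known.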

1.3 Discussion

As can be seen from the statements of the main results (Thms 1.3 and 1.4) we restrict ourselves to the consideration of constant magnetic fields. This is motivated by the fact that in most physically relevant situations the magnetic field does not change over the length scale of the quantum mechanical system in question. However, the methods in this paper are generally robust (coherent states, Lieb-Thirring inequalities, and the commutator formula from Section 2), so it is likely that the methods from [ES97] could be applied to the current as well, thereby generalizing the results of this paper to non-constant fields. That would introduce a number of technical difficulties that are beyond the motivation for the present paper.

A further issue is the magnetic field strength. 'Normal' semiclassical analysis is the limit $h\to0$, $\mu = 1$. In this paper we allow for magnetic field strengths $\mu$ such that $\mu h$ is bounded as $h\to0$. One would expect the result of the present paper to hold also in the case where $\mu h\to+\infty$. However, that is beyond the techniques of this paper. When $\mu h\to+\infty$, the lower bound in Thm. 4.5 below becomes the (not very informative) statement:
$$\liminf_{h\to0}\frac{h^2}{\mu}E(t)\ge-\infty.$$

The paper [Fou99] will (at least partially) address the problem of very strong magnetic fields.

1.4 Organization of the paper

In Section 2 we prove a commutator formula for J. This formula expresses J as a commutator with P plus an operator which is a factor µ smaller than J. Since the commutator does not contribute to the trace, we hereby reduce the problem of calculating the current considerably. Unfortunately, this commutator formula only gives information about the current orthogonal to the magnetic field; this is the reason why the parallel current is more difficult. Then, in Sections 4 and 5 we use the 'variational principle' (i.e. the method used above to calculate the density) to calculate the orthogonal current. Using symmetry, gauge invariance and the result on the orthogonal current, we can prove Theorem 1.4. Finally, in Appendix A, we give some arguments to support the necessity of using our commutator formula: we construct a density matrix which has asymptotically (as h → 0) the same energy and density as the ground state, but does not have the right current.

1.5 Notation and preliminaries

The results in Section 3 and the calculations in Sections 4 and 5 are only for a constant magnetic field and there we fix the choice of the vector potential as $\vec A(x) = \frac12(-x_2,x_1,0)$. The commutator formula in Section 2 is valid for general, everywhere nonvanishing magnetic fields, so in that section $\vec A$ denotes an arbitrary vector potential. We will denote the magnetic momentum operator as $p_A = (-ih\nabla+\mu\vec A)$. Furthermore, we will denote the closed ball of radius $r$ centered around the point $x$ by $B(x,r)$, and by $D\vec a$ the Jacobian matrix of the vector function $\vec a$. All through the paper we will apply the standard convention that $c$ or $C$ denote appropriate constants, the value of which we will not try to calculate.

Finally a few words on the Pauli operator in a constant magnetic field: With our choice of $\vec A$, the magnetic field is parallel to the 3rd unit vector $e_3$ and therefore
$$P = \begin{pmatrix} p_A^2+\mu h+V(x) & 0\\ 0 & p_A^2-\mu h+V(x)\end{pmatrix},$$
and thus
$$\mathrm{tr}[J(\mu\vec a)1_{(-\infty,0]}(P)] = \mu\,\mathrm{tr}\big[(\vec a\cdot p_A+p_A\cdot\vec a+hb_3)\,1_{(-\infty,0]}(p_A^2+\mu h+V(x))\big] + \mu\,\mathrm{tr}\big[(\vec a\cdot p_A+p_A\cdot\vec a-hb_3)\,1_{(-\infty,0]}(p_A^2-\mu h+V(x))\big].$$
We therefore can (and will) calculate the current as the sum of the two terms on the right hand side.

2 Commutator identity

In this section we will prove the commutator identity of Lemma 2.1 below. Define
$$H = (-ih\nabla+\mu\vec A)^2+V(x),$$
and write $J_p(\vec a) = \vec a\cdot(-ih\nabla+\mu\vec A)+(-ih\nabla+\mu\vec A)\cdot\vec a$. Let furthermore
$$B = \begin{pmatrix} 0 & B_3 & -B_2\\ -B_3 & 0 & B_1\\ B_2 & -B_1 & 0\end{pmatrix} = \{\partial_{x_j}A_k-\partial_{x_k}A_j\}_{j,k}.$$
Let us first give the result of a formal calculation:

Lemma 2.1. Suppose $\vec A\in C^{1,1}(\mathbb{R}^3;\mathbb{R}^3)$ (i.e. $\partial^2_{j,k}A_l\in L^\infty(\mathbb{R}^3)$). Define $\vec a$ by $\vec a = B\tilde a$. Then formally (i.e. as a calculation on $C_0^\infty(\mathbb{R}^3)$):
$$[H,J_p(\tilde a)] = 2ih\,\tilde a\cdot\nabla V - 2ih\mu J_p(\vec a) - 2ih\,p_A\cdot(D\tilde a+(D\tilde a)^t)p_A - ih^3\Delta\,\mathrm{div}(\tilde a).$$

Proof. The proof is just a calculation, so let us only state the ingredients: First of all
$$[V,J_p(\tilde a)] = \tilde a\cdot[V,-ih\nabla]+[V,-ih\nabla]\cdot\tilde a.$$
That gives the first term on the right hand side in the result. So we are left with
$$[p_A^2,J_p(\tilde a)] = \tilde a\cdot[p_A^2,p_A]+[p_A^2,p_A]\cdot\tilde a+p_A\cdot[p_A^2,\tilde a]+[p_A^2,\tilde a]\cdot p_A.$$
Here the first term on the right can be calculated (using that $[p_{A,j},p_{A,k}] = -i\mu hB_{j,k}$ and $B\tilde a = \vec a$) to give $-2ih\mu J_p(\vec a)$. The second term on the right will give derivatives in $\tilde a$ and contribute with the last two terms in the lemma.

In order to use the lemma for our purposes we need to solve the equation $B\tilde a = \vec a$ for a given $\vec a$. This can be done in general if $\vec B(x)\cdot\vec a(x) = 0$ and $\vec B(x)\ne0$ for all $x$. In that case
$$\tilde a = \frac{\vec B\times\vec a}{|\vec B|^2}$$
gives a solution. Notice that $\ker B(x) = \mathrm{span}\,\vec B(x)$, so we have some freedom in the choice of $\tilde a$.

Remark 2.2. Notice that if $\vec B = (0,0,1)$ and $\vec a = (a_1,a_2,0)$ then $\tilde a = (-a_2,a_1,0)$. However, all we need for $\tilde a$ is that $B\tilde a = \vec a$. Therefore, in the case of $\vec B = (0,0,1)$, we actually have the freedom to choose $\tilde a = (-a_2,a_1,\tilde a_3)$, with any (smooth, compactly supported) $\tilde a_3$. Of course, the final result for the current does not depend on this choice, and we will take $\tilde a_3\equiv0$ all through this paper.

Remark 2.3. We would like to combine the formal calculation in Lemma 2.1 with the fact that (at least for matrices $H$, $J$)
$$\langle\psi;[H,J]\psi\rangle = 0,$$
for all eigenfunctions (eigenvectors) $\psi$ of $H$. However, as the paper [GG99] shows, one has to be a bit careful. In the case we are interested in (constant magnetic field, $\tilde a\in C_0^\infty$, $V\in L^p$) it is well known that there are no problems: One can for instance use the formal calculation on a sequence $\psi_j\in C_0^\infty$ that converges to $\psi$ in graph norm. Furthermore, this is a technical issue, which it (in most cases) should be possible to overcome. Thus Cor. 2.4 below is only stated for the needed case; however, it is true in much more generality.

Corollary 2.4. Let us impose the assumptions of Thm. 1.3. Let $\psi$ be an eigenfunction for $H$, i.e. $H\psi = \lambda\psi$, then
$$\mu\langle\psi;J_p(\vec a)\psi\rangle = \langle\psi;\tilde a\cdot\nabla V\psi\rangle - \langle\psi;p_A\cdot(D\tilde a+(D\tilde a)^t)p_A\psi\rangle - \frac12h^2\langle\psi;\Delta\,\mathrm{div}(\tilde a)\psi\rangle.$$
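The algebraic step of this section, solving $B\tilde a = \vec a$ by $\tilde a = \vec B\times\vec a/|\vec B|^2$, is easy to check numerically at a single point. The sketch below assumes NumPy, and the helper names `field_matrix` and `solve_a_tilde` are ours. It builds the antisymmetric matrix $B$ from the field vector and confirms both the solution formula and the fact that $\vec B$ spans the kernel.

```python
import numpy as np


def field_matrix(B):
    """Antisymmetric matrix B_{jk} = d_j A_k - d_k A_j, expressed via B = curl A."""
    B1, B2, B3 = B
    return np.array([[0.0, B3, -B2],
                     [-B3, 0.0, B1],
                     [B2, -B1, 0.0]])


def solve_a_tilde(B, a):
    """Candidate solution (B x a) / |B|^2, valid when B.a = 0 and B != 0."""
    B = np.asarray(B, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.cross(B, a) / np.dot(B, B)


# Constant field along e3, as in Remark 2.2 (the values of a are arbitrary).
B = np.array([0.0, 0.0, 1.0])
a = np.array([0.4, -1.3, 0.0])
a_tilde = solve_a_tilde(B, a)

# A generic field with an orthogonal a, to test the general formula.
B_general = np.array([1.0, 2.0, 2.0])
a_general = np.array([2.0, -1.0, 0.0])
a_tilde_general = solve_a_tilde(B_general, a_general)
```

Adding any multiple of $\vec B$ to `a_tilde` leaves $B\tilde a$ unchanged, which is the freedom in $\tilde a_3$ mentioned in Remark 2.2.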

3 Known results

In this section we will recall some results on semiclassics of the energy and density in a constant magnetic field. These are all taken from [LSY94]. First we have a magnetic Lieb-Thirring inequality for constant magnetic field:

Theorem 3.1. Let $[V]_-\in L^{3/2}(\mathbb{R}^3)\cap L^{5/2}(\mathbb{R}^3)$ and let $e_j(\mu,V)$ denote the negative eigenvalues of the operator $p_A^2-\mu h+V(x)$. Then
$$\sum_j|e_j(\mu,V)|\le L_1\mu h^{-2}\int[V(x)]_-^{3/2}\,dx + L_2h^{-3}\int[V(x)]_-^{5/2}\,dx,$$
where the constants $L_1$, $L_2$ are independent of $h$, $\mu$ and $V$.

The result on the semiclassics of the energy in a constant magnetic field is:

Theorem 3.2. Suppose $[V]_-\in L^{3/2}(\mathbb{R}^3)\cap L^{5/2}(\mathbb{R}^3)$ and let $E(\vec A,V)$ and $E_{scl}(\vec A,V)$ be as given in Section 1. Then
$$\lim_{h\to0}\frac{E(\vec A,V)}{\frac{\mu}{h^2}E_{scl}(\vec A,V)} = 1,$$
uniformly in the magnetic field strength $\mu$, where $E_{scl}$ was defined in (1.1).

By the variational principle, we get as in Section 1.2:

Corollary 3.3. Let us keep the assumptions from Theorem 3.2. Suppose $\phi\in L^{5/2}(\mathbb{R}^3)\cap L^{3/2}(\mathbb{R}^3)$, then
$$\frac{h^2}{\mu}\mathrm{tr}\big[\phi\,1_{(-\infty,0]}(P(\mu\vec A,V))\big] = \frac1\pi\sum_{n=0}^{\infty}d_n\int[2n\mu h+V(x)]_-^{1/2}\phi(x)\,dx + o(1),$$
as $h\to0$.
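The density formula of Corollary 3.3 is the pointwise derivative of the integrand of (1.1) with respect to $V$. The following sketch checks this by a finite difference. It is an illustration under our own assumptions: unit field strength, $\mu h$ frozen at `beta`, and hypothetical function names.

```python
import math


def neg_part(x):
    """[x]_- as defined in the text."""
    return -x if x <= 0 else 0.0


def d(n):
    """Weights d_0 = 1/(2 pi) and d_n = 1/pi for n >= 1."""
    return 1.0 / (2.0 * math.pi) if n == 0 else 1.0 / math.pi


def level_sum(v, beta, power):
    """Sum over n of d_n [2 n beta + v]_-^power (finitely many nonzero terms)."""
    total, n = 0.0, 0
    while 2 * n * beta + v <= 0:
        total += d(n) * neg_part(2 * n * beta + v) ** power
        n += 1
    return total


def energy_density(v, beta):
    """Integrand of (1.1) for |B| = 1."""
    return -2.0 / (3.0 * math.pi) * level_sum(v, beta, 1.5)


def density(v, beta):
    """Integrand of Corollary 3.3 (the coefficient of phi)."""
    return level_sum(v, beta, 0.5) / math.pi


def energy_derivative(v, beta, eps=1e-6):
    """Central finite difference of the energy density in v."""
    return (energy_density(v + eps, beta) - energy_density(v - eps, beta)) / (2 * eps)
```

The comparison is only meaningful away from the thresholds $v = -2n\beta$, where the square root is not differentiable.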

4 Lower bound

Let
$$M = \begin{pmatrix} M_{11} & M_{12} & M_{13}\\ M_{21} & M_{22} & M_{23}\\ M_{31} & M_{32} & 0\end{pmatrix}\in C_0^\infty(\mathbb{R}^3)$$
be a real, symmetric matrix, and let $m(x) = \frac{M_{11}(x)+M_{22}(x)}{2} = \mathrm{tr}[M]/2$. In this section we prove a semi-classical lower bound for the energy of the operator
$$H(t) = p_A\cdot S_t(x)p_A - \mu h(1+tm(x)) + V(x),$$
where $S_t(x) = 1+tM(x)$. We will need the following easily proved version of the IMS-localization formula:

Lemma 4.1. Let $g\in C^\infty(\mathbb{R}^3;\mathbb{R})$ be a bounded function with bounded derivatives, and let $S(x)$ be any symmetric, real matrix, such that $x\mapsto S(x)$ is a bounded function. Then
$$2\langle fg|p_A\cdot S(x)p_A|gf\rangle = \langle f|g^2p_A\cdot S(x)p_A|f\rangle + \langle f|p_A\cdot S(x)p_Ag^2|f\rangle + 2h^2\langle f|\nabla g\cdot S(x)\nabla g|f\rangle,$$
for all $f$ in the quadratic form domain of $p_A^2$.

We will also need to diagonalise the 'kinetic energy part' of $H(t)$. For constant (in $x$) matrices $S_t$, this is the content of the next lemma, the proof of which is a simple change of variables:

Lemma 4.2. Let $S_t = I+tM$, where $M$ is a constant real, symmetric matrix, and $t$ is small. Define $N_t = \sqrt{I+tM}$, and define a unitary operator $U_t$ on $L^2(\mathbb{R}^3)$ by:
$$(U_tf)(x) = \Lambda_t^{1/2}f(N_tx),\qquad \Lambda_t = |\det N_t|.$$
Then
$$U_t\,p_A\cdot S_tp_A\,U_t^{-1} = (-ih\nabla+\mu\tilde A_t)^2,$$
where $\tilde A_t(x) = N_t\vec A(N_tx)$.

Remark 4.3. Since we have chosen $\tilde a = (-a_2,a_1,0)$, and we will apply the results in this section with $M = -(D\tilde a+(D\tilde a)^t)$, we do have $M_{33} = 0$. The analysis goes through for arbitrary matrices, but the result will then depend on $M_{33}$. For simplicity, we only state and prove results for matrices of the type we need for our application.

Remark 4.4. If $M$ is a constant matrix of the above form (with $M_{33} = 0$), then $|\mathrm{curl}\,\tilde A_t| = 1+\frac12t(M_{11}+M_{22})+t^2c+O(t^3)$, where $c$ is a negative constant (depending on $M$). Thus, we see from the above that for $t\ne0$:
$$\inf\mathrm{Spec}\Big(p_A\cdot S_tp_A - \mu h\big(1+\tfrac12t(M_{11}+M_{22})\big)\Big)\to-\infty,$$
as $\mu h\to\infty$. This is the reason why the lower bound below does not work in that case.

Theorem 4.5. Suppose that $[V]_-\in L^{3/2}(\mathbb{R}^3)\cap L^{5/2}(\mathbb{R}^3)$ and that
$$M(x) = \begin{pmatrix} M_{11} & M_{12} & M_{13}\\ M_{21} & M_{22} & M_{23}\\ M_{31} & M_{32} & 0\end{pmatrix}\in C_0^\infty.$$
Let $E(t) = \mathrm{tr}[H(t)1_{(-\infty,0]}(H(t))]$, and suppose that $\mu h\to\beta$ as $h\to0$. Then we have the following lower bound on $E(t)$, for sufficiently small (independent of $h$, $\mu$) $t$:
$$\liminf_{h\to0}\frac{h^2}{\mu}E(t)\ge-\frac{1}{3\pi^2}\sum_\nu\int\frac{b_{u,t}}{\Lambda_{u,t}}\big[(2\nu+1)\beta b_{u,t}-\beta(1+tm(u))+V(u)\big]_-^{3/2}\,du,$$
where
$$b_{u,t} = \Big|\mathrm{curl}_x\,\sqrt{1+tM(u)}\,\vec A\big(\sqrt{1+tM(u)}\,x\big)\Big|,\qquad \Lambda_{u,t} = \Big|\det\sqrt{1+tM(u)}\Big|.$$
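The expansion of the transformed field strength in Remark 4.4 can be checked numerically for a constant matrix $M$. The sketch below is an illustration under our own assumptions: NumPy, the constant-field potential $\vec A(x) = \frac12(-x_2,x_1,0)$ written as a matrix, hypothetical helper names, and arbitrary test values for $M$ with $M_{33}=0$. Since $\tilde A_t(x) = N_t\vec A(N_tx)$ is linear in $x$, its curl is a constant vector read off from a $3\times3$ matrix.

```python
import numpy as np

# A(x) = (1/2)(-x2, x1, 0) written as A(x) = A_MAT @ x.
A_MAT = 0.5 * np.array([[0.0, -1.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [0.0, 0.0, 0.0]])


def sqrt_sym(S):
    """Square root of a symmetric positive definite matrix via eigendecomposition."""
    eigenvalues, vectors = np.linalg.eigh(S)
    return vectors @ np.diag(np.sqrt(eigenvalues)) @ vectors.T


def curl_of_linear_field(G):
    """Curl of the linear vector field x -> G @ x (a constant vector)."""
    return np.array([G[2, 1] - G[1, 2], G[0, 2] - G[2, 0], G[1, 0] - G[0, 1]])


def transformed_field_strength(M, t):
    """|curl A_tilde_t| for A_tilde_t(x) = N_t A(N_t x), N_t = sqrt(I + t M)."""
    N = sqrt_sym(np.eye(3) + t * M)
    return float(np.linalg.norm(curl_of_linear_field(N @ A_MAT @ N)))


def determinant_factor(M, t):
    """Lambda_t = |det N_t|."""
    return float(abs(np.linalg.det(sqrt_sym(np.eye(3) + t * M))))


# Arbitrary symmetric test matrix with M_33 = 0.
M = np.array([[0.3, 0.1, -0.2],
              [0.1, -0.5, 0.4],
              [-0.2, 0.4, 0.0]])
t = 1e-3
strength = transformed_field_strength(M, t)
first_order = 1.0 + 0.5 * t * (M[0, 0] + M[1, 1])
ratio = strength / determinant_factor(M, t)
```

To first order in $t$ the field strength and the determinant factor agree, so their ratio is $1+O(t^2)$; this is the property used later in the proof of Lemma 5.3.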

Remark 4.6. Notice that, due to the 2nd order discrepancy between $b_{u,t}$ and $1+tm(u)$ (see Remark 4.4), we really need the matrix $M(x)$ to have compact support, since this assures the convergence of the integral in the lower bound for $\nu = 0$.

Remark 4.7. For the case of $M = -(D\tilde a+(D\tilde a)^t)$, we have $m(x) = \mathrm{tr}[M(x)]/2 = b_3(x)$.

Proof. It is clear that we get a lower energy by replacing $V(x)$ by $-[V(x)]_-$, so we will assume $V(x) = -[V(x)]_-$ in the proof. Let us first define some necessary tools for the argument:

Coherent states: If $\vec B$ is a (constant) vector, then the projection onto the $\nu$th Landau level of $(-ih\nabla+\mu\frac12\vec B\times x)^2$ has integral kernel [LSY94, p.95]:
$$\Pi_\nu^{(2)}(x_\perp,y_\perp) = \frac{\mu b}{2\pi h}\exp\Big\{i(x_\perp\times y_\perp)\cdot\frac{\mu\vec B}{2h} - |x_\perp-y_\perp|^2\frac{\mu b}{4h}\Big\}L_\nu\Big(|x_\perp-y_\perp|^2\frac{\mu b}{2h}\Big), \qquad (4.1)$$
where we have written $x\in\mathbb{R}^3$ as $(x_\perp,x_\parallel)$, with $x_\perp\perp\vec B$ and $x_\parallel\parallel\vec B$. Furthermore, we have written $b = |\vec B|$ and $L_\nu$ are Laguerre polynomials normalized by $L_\nu(0) = 1$. Let us now write
$$\Pi_{\nu,p}(x,y) = \Pi_\nu^{(2)}(x_\perp,y_\perp)e^{ip(x_\parallel-y_\parallel)},$$
with $p\in\mathbb{R}$, then
$$\sum_{\nu=0}^{\infty}(2\pi)^{-1}\int_{\mathbb{R}}\Pi_{\nu,p}\,dp = 1,\qquad \Big(-ih\nabla+\mu\frac12\vec B\times x\Big)^2\Pi_{\nu,p}(x,y) = \varepsilon_{\nu,p}(h,b)\Pi_{\nu,p}(x,y),$$
with $\varepsilon_{\nu,p}(h,b) = (2\nu+1)h\mu b+h^2p^2$, and
$$\Pi_{\nu,p}(x,x) = \frac{\mu b}{2\pi h}.$$
Let us finally introduce a localization function $g\in C_0^\infty(\mathbb{R}^3)$, $\int g^2 = 1$, and write $g_r(x) = r^{-3/2}g(x/r)$, where $r = h^{1-\alpha}$, $\alpha<1$. Then, we write
$$Q(\nu,u,p,t) = g_r(x-u)U_{t,u}^{-1}\Pi_{\nu,p,u,t}U_{t,u}g_r(x-u),$$
where $U_{t,u}$ is the unitary operator described in Lemma 4.2, with $N_t = N_{t,u}$ satisfying $N_{t,u}^2 = I+tM(u)$, and where $\Pi_{\nu,p,u,t} = \Pi_{\nu,p}$ with $\vec B = \vec B_{t,u}$ being the magnetic field generated by $N_{t,u}\vec A(N_{t,u}x)$. Below, we will in general insert an extra index $u$ on the quantities, where this is needed, as exemplified here by $N_{t,u}$ and $\vec B_{t,u}$.

Useful identities: We find:
$$\begin{aligned}\mathrm{tr}[p_A\cdot S(u)p_A\,Q(\nu,u,p,t)] &= \mathrm{tr}\big[p_A\cdot S(u)p_A\,U_{t,u}^{-1}g_r(N_{t,u}x-u)\Pi_{\nu,p,u,t}g_r(N_{t,u}x-u)U_{t,u}\big]\\ &= \mathrm{tr}\big[(-ih\nabla+\mu\tilde A_{t,u})^2g_r(N_{t,u}x-u)\Pi_{\nu,p,u,t}g_r(N_{t,u}x-u)\big]\\ &= \frac{\mu b_{t,u}}{2\pi h\Lambda_{t,u}}\Big(\varepsilon_{p,\nu}(h,b_{t,u})+h^2\int(\nabla g_r(N_{t,u}x-u))^2\,dx\Big).\end{aligned}\qquad(4.2)$$

Here we used the localization formula in the last equality. For a normalized function $f\in L^2$ we get:
$$\begin{aligned}\langle f|p_A\cdot S(x)p_A|f\rangle &\ge \int\langle f|g_r(x-u)p_A\cdot S(x)p_Ag_r(x-u)|f\rangle\,du - h^2C\int(\nabla g_r)^2\\ &\ge \int\langle f|g_r(x-u)p_A\cdot S(u)p_Ag_r(x-u)|f\rangle\,du\\ &\quad+\int\langle f|g_r(x-u)p_A\cdot(S(x)-S(u))p_Ag_r(x-u)|f\rangle\,du - Ch^2r^{-2}.\end{aligned}\qquad(4.3)$$
The second term on the right can be estimated as follows, where we use that $t$ is small so $S(u)\ge1/2$ as a matrix:
$$\Big|\int\langle f|g_r(x-u)p_A\cdot(S(x)-S(u))p_Ag_r(x-u)|f\rangle\,du\Big| \le \int F_r(u)\langle f|g_r(x-u)p_A\cdot S(u)p_Ag_r(x-u)|f\rangle\,du, \qquad(4.4)$$
where $F_r(u) = 2\sup_{x\in B(u,r)}|S(x)-S(u)|$. Thus, the first two terms in (4.3) can be estimated as:
$$\begin{aligned}&\int\langle f|g_r(x-u)p_A\cdot S(u)p_Ag_r(x-u)|f\rangle\,du + \int\langle f|g_r(x-u)p_A\cdot(S(x)-S(u))p_Ag_r(x-u)|f\rangle\,du\\ &\qquad\ge\sum_\nu\iint\frac{1-F_r(u)}{2\pi}\varepsilon_{p,\nu}(h,b_{t,u})\langle f|Q(\nu,u,p,t)|f\rangle\,dp\,du.\end{aligned}\qquad(4.5)$$
So we finally get:
$$\langle f|p_A\cdot S(x)p_A|f\rangle \ge (2\pi)^{-1}\sum_\nu\iint\big((1-F_r(u))\varepsilon_{p,\nu}(h,b_{t,u})-Ch^2r^{-2}\big)\langle f|Q(\nu,u,p,t)|f\rangle\,dp\,du. \qquad(4.6)$$
For the potential we get:
$$\langle f|V*g_r^2|f\rangle = \int V(u)\langle f|g_r^2(x-u)|f\rangle\,du = \frac{1}{2\pi}\sum_\nu\iint V(u)\langle f|Q(\nu,u,p,t)|f\rangle\,dp\,du. \qquad(4.7)$$

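Lower bound: Now we are ready to prove the lower bound on the energy. We have to bound the sum $\sum_{j=1}^N\langle f_j|H(t)|f_j\rangle$ from below, where the $f_j$'s are orthonormal, with a bound independent of $N$ and of the $f_j$'s. Let us take a (small) $\delta>0$ and write
$$H(t) = \delta(p_A^2-\mu h)+(1-\delta)(p_A^2-\mu h)+t(p_A\cdot Mp_A-\mu hm(x))+V(x).$$
Let us furthermore take an $\varepsilon>0$ and choose $R$ such that
$$\int_{|x|\ge R}|V(x)|^{3/2}\,dx<\varepsilon\quad\text{and}\quad\int_{|x|\ge R}|V(x)|^{5/2}\,dx<\varepsilon.$$
Since $M(x)\in C_0^\infty$ we will assume that $M(x) = 0$ for $|x|\ge R$. Choose finally a partition of unity $\theta_1^2$, $\theta_2^2$ of positive real functions, satisfying: $\theta_1(x) = 0$ for $|x|\ge2R$ and $\theta_2(x) = 0$ for $|x|\le R$. Then
$$\begin{aligned}\sum_{j=1}^N\langle f_j|H(t)|f_j\rangle &= \sum_{j=1}^N\langle f_j|\theta_1H(t)\theta_1|f_j\rangle+\sum_{j=1}^N\langle f_j|\theta_2H(t)\theta_2|f_j\rangle - h^2\sum_{j=1}^N\langle f_j|(\nabla\theta_1)^2+(\nabla\theta_2)^2|f_j\rangle\\ &= (1-\delta)\sum_{j=1}^N\Big\langle f_j\Big|\theta_1\Big[p_A\cdot\Big(I+\frac{t}{1-\delta}M(x)\Big)p_A - \mu h\Big(1+\frac{t}{1-\delta}m*g_r^2\Big)+\frac{V(x)}{1-\delta}*g_r^2\Big]\theta_1\Big|f_j\Big\rangle\\ &\quad+\sum_{j=1}^N\big\langle f_j\big|\theta_1\big[\delta(p_A^2-\mu h)+(V-V*g_r^2)-\mu ht(m-m*g_r^2)-h^2(\nabla\theta_1)^2-h^2(\nabla\theta_2)^2\big]\theta_1\big|f_j\big\rangle\\ &\quad+\sum_{j=1}^N\big\langle f_j\big|\theta_2\big[p_A^2-\mu h+V(x)-h^2(\nabla\theta_1)^2-h^2(\nabla\theta_2)^2\big]\theta_2\big|f_j\big\rangle.\end{aligned}\qquad(4.8)$$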

Let us now, in order to simplify some expressions below, introduce the notation $\tau = \frac{t}{1-\delta}$. The first term above can be bounded below using (4.6) and (4.7) by
$$\frac{1-\delta}{2\pi}\sum_\nu\iint\Big[(1-F_r(u))\varepsilon_{p,\nu}(h,b_{u,\tau})-Ch^2r^{-2}-\mu h(1+\tau m(u))+\frac{V(u)}{1-\delta}\Big]\sum_{j=1}^N\langle f_j|\theta_1Q(\nu,u,p,\tau)\theta_1|f_j\rangle\,dp\,du.$$
Since
$$0\le\sum_{j=1}^N\langle f_j|\theta_1Q(\nu,u,p,\tau)\theta_1|f_j\rangle\le\frac{\mu b_{u,\tau}}{2\pi h\Lambda_{\tau,u}},$$
and $\sum_{j=1}^N\langle f_j|\theta_1Q(\nu,u,p,\tau)\theta_1|f_j\rangle = 0$ if $|u|\ge3R+r$, we get a lower bound by replacing $\sum_{j=1}^N\langle f_j|\theta_1Q(\nu,u,p,\tau)\theta_1|f_j\rangle$ by a function $M(\nu,u,p,\tau)$, which is the characteristic function of the set
$$\Big\{(\nu,u,p)\ \Big|\ (1-F_r(u))\varepsilon_{p,\nu}(h,b_{u,\tau})-Ch^2r^{-2}-\mu h(1+\tau m(u))+\frac{V(u)}{1-\delta}\le0\ \text{and}\ |u|\le3R+r\Big\},$$
times $\frac{\mu b_{u,\tau}}{2\pi h\Lambda_{\tau,u}}$. Thus, the lower bound becomes
$$-\frac{1-\delta}{2\pi}\sum_\nu\int_{\{|u|\le3R+r\}}\int\frac{\mu b_{u,\tau}}{2\pi h\Lambda_{\tau,u}}\Big[(1-F_r(u))\varepsilon_{p,\nu}(h,b_{u,\tau})-Ch^2r^{-2}-\mu h(1+\tau m(u))+\frac{V(u)}{1-\delta}\Big]_-\,dp\,du.$$
We do the $p$ integration explicitly and get:
$$-(1-\delta)\sum_\nu\int_{\{|u|\le3R+r\}}\frac{1-F_r(u)}{2\pi}\,\frac{4\mu b_{u,\tau}}{6\pi h^2\Lambda_{\tau,u}}\Big[(2\nu+1)h\mu b_{u,\tau}+\frac{1}{1-F_r(u)}\Big(-Ch^2r^{-2}-\mu h(1+\tau m(u))+\frac{V(u)}{1-\delta}\Big)\Big]_-^{3/2}\,du.$$
The last two terms in (4.8) are error terms, and will be bounded below using the magnetic Lieb-Thirring inequality (Thm 3.1). Since $r = h^{1-\alpha}$ and $\mu h\le C$, we have for small $h$ that
$$\int\big|(V-V*g_r^2)-\mu h(m-m*g_r^2)-h^2((\nabla\theta_1)^2+(\nabla\theta_2)^2)\big|^q\,dx<\varepsilon,$$
for $q = 3/2$ and $q = 5/2$. Therefore, we get by application of the magnetic Lieb-Thirring inequality that the first error term in (4.8) can be bounded below by
$$-C\varepsilon h^{-3}(\delta^{-1/2}+\delta^{-3/2}).$$
We can use the Lieb-Thirring inequality directly to bound the second error term from below by $-C\varepsilon h^{-3}$, where we used the definition of $\theta_2$.
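The explicit $p$ integration in the proof above rests on the elementary identity $\int_{\mathbb{R}}[a h^2p^2+c]_-\,dp = a\,\frac{4}{3h}\,[c/a]_-^{3/2}$ for $a>0$. The sketch below compares this closed form with a midpoint-rule quadrature. It is only an illustration: here `a` plays the role of $1-F_r(u)$ and `c` collects all $p$-independent terms, and both names, as well as the function names, are ours.

```python
import math


def neg_part(x):
    """[x]_- as defined in the text."""
    return -x if x <= 0 else 0.0


def p_integral_closed_form(a, c, h):
    """Closed form of the integral over p of [a h^2 p^2 + c]_- for a > 0."""
    return a * 4.0 / (3.0 * h) * neg_part(c / a) ** 1.5


def p_integral_numeric(a, c, h, steps=20_000):
    """Midpoint rule on the interval where the integrand is nonzero."""
    if c >= 0:
        return 0.0
    p_max = math.sqrt(-c / a) / h
    width = 2.0 * p_max / steps
    total = 0.0
    for k in range(steps):
        p = -p_max + (k + 0.5) * width
        total += neg_part(a * h * h * p * p + c)
    return total * width
```

Multiplying the closed form by $\frac{\mu b}{2\pi h\Lambda}\cdot\frac{1}{2\pi}$ reproduces the prefactor $\frac{1-F_r(u)}{2\pi}\,\frac{4\mu b}{6\pi h^2\Lambda}$ appearing in the bound.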

5 Calculation of the current

In this section we will finally find the asymptotics of the current, i.e. prove Thm 1.3. The assumptions of that theorem will be standing assumptions in the entire section. By applying the commutator identity from Section 2, we get
$$\mathrm{tr}[J(\mu\vec a)1_{(-\infty,0]}(P)] = \mathrm{tr}\big[(p_A\cdot M(x)p_A+\mu h\sigma_3b_3)1_{(-\infty,0]}(P)\big] + \mathrm{tr}\big[(\tilde a\cdot\nabla V-h^2\Delta\,\mathrm{div}(\tilde a))1_{(-\infty,0]}(P)\big], \qquad(5.1)$$
where $M(x) = -(D\tilde a(x)+(D\tilde a(x))^t)$. The asymptotics of the second term in (5.1) is easy to calculate using the results on the density from Cor. 3.3 and integration by parts:

Lemma 5.1. When $[V]_-,\ \tilde a\cdot\nabla V\in L^{3/2}\cap L^{5/2}$, then
$$\mathrm{tr}\big[(\tilde a\cdot\nabla V-h^2\Delta\,\mathrm{div}(\tilde a))1_{(-\infty,0]}(P)\big] = -\frac{2\mu}{3\pi h^2}\sum_{\nu=0}^{\infty}d_\nu\int(\partial_{x_1}a_2-\partial_{x_2}a_1)[2\nu\mu h+V(x)]_-^{3/2}\,dx + o(h^{-3}+\mu h^{-2}).$$

Proof. The second term (the term with $h^2\Delta\,\mathrm{div}(\tilde a)$) can readily be estimated using Cor. 3.3 to give:
$$|\mathrm{tr}[h^2\Delta\,\mathrm{div}(\tilde a)1_{(-\infty,0]}(P)]|\le h^2O(\mu/h^2).$$
So that term can be neglected. From the same corollary we get:
$$\mathrm{tr}[(\tilde a\cdot\nabla V)1_{(-\infty,0]}(P)] = \frac{\mu}{\pi h^2}\sum_{n=0}^{\infty}d_n\int[2n\mu h+V(x)]_-^{1/2}(\tilde a\cdot\nabla V)(x)\,dx + o(\mu/h^2).$$
Now we can use the identity
$$[2n\mu h+V(x)]_-^{1/2}\nabla V = \nabla\Big(-\frac23[2n\mu h+V(x)]_-^{3/2}\Big),$$
and integration by parts to finish the proof.

Remark 5.2. Notice that another choice of $\tilde a_3$ would have led to an extra term in the above lemma. This extra term would, however, have been compensated by an extra term in Lemmas 5.3, 5.4 below, in order to make the final result independent of $\tilde a_3$.

For the first term in (5.1) we need the result from Section 4. Let us write the term as
$$\mathrm{tr}\big[(p_A\cdot M(x)p_A-\mu hb_3)1_{(-\infty,0]}(p_A^2-\mu h+V(x))\big] + \mathrm{tr}\big[(p_A\cdot M(x)p_A+\mu hb_3)1_{(-\infty,0]}(p_A^2+\mu h+V(x))\big], \qquad(5.2)$$
and analyse each term separately.

Lemma 5.3. Suppose $[V]_-\in L^{3/2}\cap L^{5/2}$ and $\tilde a = (-a_2,a_1,0)\in C_0^\infty(\mathbb{R}^3)$. Write $M(x) = -(D\tilde a(x)+(D\tilde a(x))^t)$. Suppose furthermore that $\mu h\to\beta\in(0,+\infty)$ as $h\to0$. Then
$$\lim_{h\to0}\frac{h^2}{\mu}\mathrm{tr}\big[(p_A\cdot M(x)p_A-\mu hb_3)1_{(-\infty,0]}(p_A^2-\mu h+V(x))\big] = \sum_{\nu=0}^{\infty}\frac{2\nu\beta}{2\pi^2}\int(\partial_{x_1}a_2-\partial_{x_2}a_1)[2\nu\mu h+V(x)]_-^{1/2}\,dx.$$

Proof. The proof is easy, using the variational principle for the energy and the lower bound from Section 4. We write
$$H(t) = p_A^2-\mu h+V(x)+t(p_A\cdot M(x)p_A-\mu hb_3),$$
and $E(t) = \mathrm{tr}[H(t)1_{(-\infty,0]}(H(t))] = \inf_{0\le\gamma\le1}\mathrm{tr}[\gamma H(t)]$. Then we get the following inequality:
$$t\,\mathrm{tr}\big[(p_A\cdot M(x)p_A-\mu hb_3)1_{(-\infty,0]}(H(0))\big] = \mathrm{tr}[H(t)1_{(-\infty,0]}(H(0))]-\mathrm{tr}[H(0)1_{(-\infty,0]}(H(0))]\ge E(t)-E(0).$$
Now we invoke the lower bound from Thm. 4.5 together with the known upper (and lower) bound from Thm. 3.2 to get:
$$\liminf_{h\to0}\frac{h^2}{\mu}\,t\,\mathrm{tr}\big[(p_A\cdot M(x)p_A-\mu hb_3)1_{(-\infty,0]}(H(0))\big] \ge \sum_\nu\int\frac{-b_{u,t}}{3\pi^2\Lambda_{u,t}}\big[(2\nu+1)\beta b_{u,t}-\beta(1+tb_3(u))+V(u)\big]_-^{3/2}\,du - \sum_\nu\int\frac{-1}{3\pi^2}[2\nu\beta+V(u)]_-^{3/2}\,du.$$
Notice that the right hand side vanishes for $t = 0$. If we now assume that $t>0$, divide by $t$ on both sides and let $t$ tend to 0 (from above), we will get a lower bound on $\liminf_{h\to0}$. If, on the other hand, we take $t<0$, then pulling $t$ outside the $\liminf$ will change it to a $\limsup$, and dividing by $t$ will change the direction of the inequality sign. By doing so and letting $t$ tend to 0 (from below), we will get an upper bound on $\limsup_{h\to0}$. The combined result of these two processes is:
$$\lim_{h\to0}\frac{h^2}{\mu}\mathrm{tr}\big[(p_A\cdot M(x)p_A-\mu hb_3)1_{(-\infty,0]}(H(0))\big] = \sum_\nu\int\frac{d}{dt}\Big|_{t=0}\Big\{\frac{-b_{u,t}}{3\pi^2\Lambda_{u,t}}\big[(2\nu+1)\beta b_{u,t}-\beta(1+tb_3(u))+V(u)\big]_-^{3/2}\Big\}\,du.$$
Now we obtain the result, if we remember that $b_{u,t} = 1+tb_3(u)+O(t^2)$, and that $\frac{b_{u,t}}{\Lambda_{u,t}} = 1+O(t^2)$.

Now, since $\mu h$ is bounded, it is easy to use the same methods to treat the spin-up part. The result is similar:

Lemma 5.4. Suppose $[V]_-\in L^{3/2}\cap L^{5/2}$ and $\tilde a = (-a_2,a_1,0)\in C_0^\infty(\mathbb{R}^3)$. Suppose furthermore that $\mu h\to\beta\in(0,+\infty)$ as $h\to0$. Then
$$\lim_{h\to0}\frac{h^2}{\mu}\mathrm{tr}\big[(p_A\cdot M(x)p_A+\mu hb_3)1_{(-\infty,0]}(p_A^2+\mu h+V(x))\big] = \sum_{\nu=0}^{\infty}\frac{2(\nu+1)\beta}{2\pi^2}\int(\partial_{x_1}a_2-\partial_{x_2}a_1)[2(\nu+1)\mu h+V(x)]_-^{1/2}\,dx.$$

If we put the three lemmas together we obtain Theorem 1.3. Finally, we prove Theorem 1.4:

Proof. We will use the linearity of $\vec a\mapsto J(\vec a)$. Therefore, using Theorem 1.3, we may assume $\vec a = (0,0,a_3)$, with $a_3\in C_0^\infty$. Let $U$ be the unitary operator on $L^2(\mathbb{R}^3,\mathbb{C}^2)$ defined by $Uf(x_1,x_2,x_3) = f(x_1,x_2,-x_3)$, and write $a_3$ as $a_3 = a_{3,even}+a_{3,odd}$, where $a_{3,even}$ ($a_{3,odd}$) is even (odd) under the reflection $x_3\mapsto-x_3$. We define $\vec a_{even}$ and $\vec a_{odd}$ similarly. Now, since $V$ is invariant under conjugation by $U$ we get:
$$U(p_A^2+\mu h+V(x))U^* = p_A^2+\mu h+V(x).$$
Since $Up_{A,3}U^* = -p_{A,3}$, we have
$$UJ(\mu(\vec a_{even}+\vec a_{odd}))U^* = J(\mu(-\vec a_{even}+\vec a_{odd})).$$
Therefore we get:
$$\mathrm{tr}[J(\mu\vec a)1_{(-\infty,0]}(P)] = \frac12\mathrm{tr}\big[(J(\mu\vec a)+UJ(\mu\vec a)U^*)1_{(-\infty,0]}(P)\big] = \mathrm{tr}[J(\mu\vec a_{odd})1_{(-\infty,0]}(P)].$$
Now we will use gauge invariance to 'move $a_{3,odd}$ up to the first two components': We can easily find a function $\bar a = (\bar a_1,\bar a_2,0)\in C_0^\infty$ ($a_3$ being odd assures the compact support) such that $\mathrm{curl}\,\bar a = \mathrm{curl}\,(0,0,a_{3,odd})$. We thus finish the proof by appealing to Theorem 1.3 and the gauge invariance of the current.
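The bookkeeping behind "putting the three lemmas together" can be checked pointwise. The sketch below is an illustration under our own assumptions: the function names are ours, Lemma 5.1 is taken after multiplication by $h^2/\mu$, and $\mu h$ is replaced by its limit `beta` everywhere. It compares the sum of the three coefficients of $(\partial_{x_1}a_2-\partial_{x_2}a_1)(x)$ at a point with $V(x)=v$ against the integrand of Theorem 1.3.

```python
import math


def neg_part(x):
    """[x]_- as defined in the text."""
    return -x if x <= 0 else 0.0


def d(n):
    """Weights d_0 = 1/(2 pi) and d_n = 1/pi for n >= 1."""
    return 1.0 / (2.0 * math.pi) if n == 0 else 1.0 / math.pi


def occupied_levels(v, beta):
    """All n >= 0 with 2*n*beta + v <= 0; higher levels contribute nothing."""
    n, result = 0, []
    while 2 * n * beta + v <= 0:
        result.append(n)
        n += 1
    return result


def lemma_5_1(v, beta):
    return -2.0 / (3.0 * math.pi) * sum(
        d(n) * neg_part(2 * n * beta + v) ** 1.5
        for n in occupied_levels(v, beta))


def lemma_5_3(v, beta):
    return sum(
        2 * n * beta / (2 * math.pi ** 2) * neg_part(2 * n * beta + v) ** 0.5
        for n in occupied_levels(v, beta))


def lemma_5_4(v, beta):
    # Spin-up levels sit at 2*(nu + 1)*beta, i.e. at n = nu + 1 >= 1.
    return sum(
        2 * n * beta / (2 * math.pi ** 2) * neg_part(2 * n * beta + v) ** 0.5
        for n in occupied_levels(v, beta) if n >= 1)


def theorem_1_3(v, beta):
    return -2.0 / (3.0 * math.pi) * sum(
        d(n) * (neg_part(2 * n * beta + v) ** 1.5
                - 3 * n * beta * neg_part(2 * n * beta + v) ** 0.5)
        for n in occupied_levels(v, beta))
```

The spin-down and spin-up contributions each carry weight $\frac{1}{2\pi}$; they coincide at every level $n\ge1$, which is how the weights $d_n = \frac1\pi$ for $n\ge1$ arise.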

A A density matrix with a strong current

We want to argue that it is necessary to use something like our commutator argument in order to calculate the current. Therefore we will produce an example of a density matrix $\gamma$ that gives the right energy to highest order, but gives a current of too high order. This proves that the energy does not control the current on general states, only on eigenstates. We will work with $\mu h = 1$, i.e. $\mu = h^{-1}$, and will only look at one spin component, i.e.
$$H = (-ih\nabla+\mu\vec A)^2-\mu h+V(x).$$

Lemma A.1. There exists a potential $V(x)\in C_0^\infty(\mathbb{R}^3)$ and a test function $\vec\phi = (\phi_1,\phi_2,0)\in C_0^\infty(\mathbb{R}^3)$ together with a density matrix, i.e. an operator $\gamma$ satisfying $0\le\gamma\le1$, such that
$$\mathrm{tr}[H\gamma] = E_{scl}+o\Big(\frac{\mu}{h^2}\Big),$$
and
$$\frac{h^2}{\mu}|\mathrm{tr}[J(\mu\vec\phi)\gamma]|\to\infty,$$
as $h\to0$.

Thus the lemma says that a density matrix that gives the right energy does not necessarily give the right current. This is unlike the situation for the density, since it is easy to prove that a density matrix that gives the right energy also gives the right density. The trial density matrix $\gamma$ will be constructed as a perturbation of the density matrix used in [LSY94]. The key to the construction is the following: The current operator, as opposed to the energy operator (the Hamiltonian), mixes the Landau levels. In fact, the main part of the current operator does not respect the Landau levels; the part that does is much smaller². Thus, a density matrix that gives the right energy but contains a small part which mixes neighboring Landau levels should have too large a current. As the proof below shows, this turns out to be the case.

² This can be seen from the commutator formula.

Proof. Let us choose $V\in C_0^\infty(\mathbb{R}^3)$, which satisfies $[V(x)]_- = 10$ for all $x\in B(0,2)$. We will choose a test vector $\vec\phi = (\phi_1,\phi_2,0)$, which is supported in $B(0,1)$. The density matrix $\gamma$ constructed in [LSY94] is
$$\gamma = \frac{1}{2\pi}\sum_{\nu=0}^{\infty}\iint M(\nu,u,p)\Pi(\nu,u,p)\,du\,dp,$$
where $M(\nu,u,p)$ is the characteristic function of the set (in $(\mathbb{N}_+\cup\{0\})\times\mathbb{R}^3_u\times\mathbb{R}_p$)
$$\{(\nu,u,p)\ |\ 2\nu\mu h+h^2p^2+V(u)\le0\},$$
and where $\Pi(\nu,u,p)$ is an operator with kernel
$$\Pi(\nu,u,p)(x,y) = g_r(x-u)\Pi_\nu^{(2)}(x_\perp,y_\perp)e^{ip(x_3-y_3)}g_r(y-u).$$
In this last expression $g_r$ is a localization function $g_r(x) = r^{-3/2}g(x/r)$, $0\le g\in C_0^\infty(\mathbb{R}^3)$, $\int g^2 = 1$ and $r = h^{1-\alpha}$ for some $0<\alpha<1$. Furthermore, $\Pi_\nu^{(2)}(x_\perp,y_\perp)$ is the (two-dimensional) integral kernel of the projection to the $\nu$-th Landau level defined in (4.1), where we choose $\vec B = (0,0,1)$ and therefore $b = 1$. Since the direction of $\vec B$ is fixed, we will write $x_3$ instead of $x_\parallel$.

Let now $\tilde M$ be the characteristic function of $B(0,1)_u\times[-h^{-1},h^{-1}]_p$, and write
$$\tilde\gamma = \varepsilon\iint\tilde M(u,p)\tilde\Pi(u,p)\,du\,dp,$$
where $\varepsilon\to0$ as $h\to0$ and where
$$\tilde\Pi(u,p)(x,y) = g_r(x-u)P(x_\perp,y_\perp)e^{ip(x_3-y_3)}g_r(y-u).$$
In this final expression $P$ is the operator
$$P = \Pi_1^{(2)}a^*\Pi_0^{(2)}+\Pi_0^{(2)}a\Pi_1^{(2)},$$
with $a = p_{A,1}-ip_{A,2}$, $a^* = p_{A,1}+ip_{A,2}$ being the raising and lowering operators that define the Landau levels. We finally define $\gamma' = \gamma+\tilde\gamma$. Since the operator $P$ satisfies (remember $\mu h = 1$)
$$-c(\Pi_0^{(2)}+\Pi_1^{(2)})\le P\le c(\Pi_0^{(2)}+\Pi_1^{(2)}),$$
it is easy to see that $0\le\gamma'$ for sufficiently small $\varepsilon$. In order to get $\gamma'\le1$ we should multiply by a factor $\frac{1}{1+\delta}$, where $\delta\to0$ as $h\to0$. We will not do this, since it will not affect order of magnitude estimates and only obscure notation. We need to calculate
$$\mathrm{tr}[H\gamma'] = E_{scl}+\mathrm{tr}[H\tilde\gamma],\quad\text{and}\quad\mathrm{tr}[h^{-1}J_p(\vec\phi)\gamma'].$$
Notice that since $\gamma'$ gives the right density to highest order, we do not need to calculate the spin current, i.e. $\mathrm{tr}[\mu hb_3\gamma']$, since we know this to be of order $\mu/h^2$ once we have proved that $\gamma'$ gives the right energy. Furthermore, we may assume that $\gamma$ does not satisfy the requirements of the lemma; if it does we do not have to construct anything.

The energy: The idea of the argument is as follows:
$$\mathrm{tr}[H\tilde\gamma] = \varepsilon\iint\tilde M(u,p)\,\mathrm{tr}[H\tilde\Pi(u,p)]\,du\,dp;$$
we use the IMS-localization formula:
$$2gp_A^2g-(p_A^2g^2+g^2p_A^2) = [[g,p_A^2],g].$$
Let us first look at the potential energy. This term will be small (i.e. $o(\mu/h)$) since $\Pi_1^{(2)}f\Pi_0^{(2)}$ is small for $f\in C_0^\infty$, as the following calculation shows:
$$\mathrm{tr}[V\tilde\Pi(u,p)] = \int\mathrm{tr}_{L^2(\mathbb{R}^2_\perp)}\big[g_r((\cdot,x_3)-u)^2V(\cdot,x_3)P\big]\,dx_3.$$

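Now, $g_r((\cdot,x_3)-u)^2V(\cdot,x_3)$ is a smooth function with compact support; let us write it as $f(x_\perp)$ for shortness. Let us choose $\tilde f\in C_0^\infty(\mathbb{R}^2)$ such that $\tilde ff = f$. Then
$$\mathrm{tr}[fP] = \mathrm{tr}\big([\Pi_0^{(2)},f]\Pi_1^{(2)}a^*\Pi_0^{(2)}\big)+\mathrm{tr}\big([f,\Pi_0^{(2)}]\Pi_0^{(2)}a\Pi_1^{(2)}\big).$$
The two terms are similar, so let us only estimate the first. This will be done using Lemma A.2 below.
$$\begin{aligned}\mathrm{tr}\big([\Pi_0^{(2)},f]\Pi_1^{(2)}a^*\Pi_0^{(2)}\big) &= \mathrm{tr}\big([\Pi_0^{(2)},\tilde ff]\Pi_1^{(2)}a^*\Pi_0^{(2)}\big)\\ &= \mathrm{tr}\big([\Pi_0^{(2)},f]\Pi_1^{(2)}a^*\Pi_0^{(2)}\tilde f\big)+\mathrm{tr}\big([\Pi_0^{(2)},\tilde f]f\Pi_1^{(2)}a^*\Pi_0^{(2)}\big)\\ &\le \|[\Pi_0^{(2)},f]\|_{HS}\,\|\Pi_1^{(2)}a^*\Pi_0^{(2)}\|\,\|\Pi_0^{(2)}\tilde f\|_{HS}+\|[\Pi_0^{(2)},\tilde f]\|_{HS}\,\|f\Pi_1^{(2)}\|_{HS}\,\|a^*\Pi_0^{(2)}\|.\end{aligned}$$
The operator norms $\|\cdot\|$ are bounded, and the Hilbert-Schmidt norms $\|\cdot\|_{HS}$ can be estimated using Lemma A.2. For the kinetic energy term we get:
$$\begin{aligned}\mathrm{tr}[(p_A^2-\mu h)\tilde\Pi(u,p)] &= \frac12\mathrm{tr}\Big[\big((p_A^2-\mu h)g_r^2(\cdot-u)+g_r^2(\cdot-u)(p_A^2-\mu h)\big)P(x_\perp,y_\perp)e^{ip(x_3-y_3)}\\ &\qquad\quad+2[[g_r(\cdot-u),p_A^2],g_r(\cdot-u)]P(x_\perp,y_\perp)e^{ip(x_3-y_3)}\Big]\\ &= \frac12\mathrm{tr}\Big[(4\mu h+2h^2p^2)g_r^2(\cdot-u)P(x_\perp,y_\perp)e^{ip(x_3-y_3)}+2h^2(\nabla g_r(\cdot-u))^2P(x_\perp,y_\perp)e^{ip(x_3-y_3)}\Big].\end{aligned}$$
This term is small for the same reason as above. Thus we may choose $\varepsilon$ to go to zero slowly with $h$; for definiteness let us take $\varepsilon = |\log h|^{-1}$.

The current: In order to calculate the current we write
$$\mathrm{tr}[J_p(\vec\phi)\tilde\gamma] = 2\,\mathrm{Re}\,\mathrm{tr}[\vec\phi\cdot(-ih\nabla+\mu\vec A)\tilde\gamma],$$
so we only need to consider
$$\begin{aligned}\mathrm{tr}[\vec\phi\cdot(-ih\nabla+\mu\vec A)\tilde\gamma] &= \varepsilon\iint\tilde M(u,p)\,\mathrm{tr}[\vec\phi\cdot(-ih\nabla+\mu\vec A)\tilde\Pi(u,p)]\,du\,dp\\ &= \varepsilon\iint\tilde M(u,p)\Big\{\mathrm{tr}\big[\vec\phi\cdot(-ih\nabla g_r(\cdot-u))P(x_\perp,y_\perp)e^{ip(x_3-y_3)}g_r(\cdot-u)\big]\\ &\qquad\quad+\mathrm{tr}\big[\vec\phi\,g_r(\cdot-u)\cdot(-ih\nabla+\mu\vec A)P(x_\perp,y_\perp)e^{ip(x_3-y_3)}g_r(\cdot-u)\big]\Big\}\,du\,dp.\end{aligned}$$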

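Since $\Pi_j^{(2)}f\Pi_k^{(2)}$ is small when $j\ne k$, $f\in C_0^\infty$, we get that the highest order contribution comes from a part of the second term, namely:
$$\begin{aligned}&\varepsilon\iint\tilde M(u,p)\,\mathrm{tr}\Big[g_r(\cdot-u)\,\vec\phi\cdot\begin{pmatrix}(a+a^*)/2\\ (a^*-a)/(2i)\\ 0\end{pmatrix}P(x_\perp,y_\perp)e^{ip(x_3-y_3)}g_r(\cdot-u)\Big]\,du\,dp\\ &\quad\approx\varepsilon\iint\tilde M(u,p)\,\mathrm{tr}\Big[g_r^2(\cdot-u)\Big\{\vec\phi\cdot\begin{pmatrix}\mu h\\ i\mu h\\ 0\end{pmatrix}\Pi_0^{(2)}(x_\perp,y_\perp)e^{ip(x_3-y_3)}+\vec\phi\cdot\begin{pmatrix}\mu h\\ -i\mu h\\ 0\end{pmatrix}\Pi_1^{(2)}(x_\perp,y_\perp)e^{ip(x_3-y_3)}\Big\}\Big]\,du\,dp.\end{aligned}$$
If we remember that $\mu h = 1$ and choose $\phi_2 = 0$ we can calculate the trace as:
$$\begin{aligned}&\varepsilon\iiint\tilde M(u,p)g_r^2(x-u)\phi_1(x)\big[\Pi_0^{(2)}(x_\perp,x_\perp)+\Pi_1^{(2)}(x_\perp,x_\perp)\big]\,dx\,du\,dp\\ &\quad= \frac{\varepsilon\mu}{\pi h}\iiint\tilde M(u,p)g_r^2(x-u)\phi_1(x)\,dx\,du\,dp\\ &\quad= \frac{\varepsilon\mu}{\pi h}\int_{-h^{-1}}^{h^{-1}}dp\int_{|u|\le1}\int r^{-3}g^2((x-u)/r)\phi_1(x)\,dx\,du\\ &\quad\approx\frac{\varepsilon\mu}{\pi h^2}\int\phi_1(x)\,dx.\end{aligned}$$
If we remember that this term has to be multiplied by $h^{-1}$ it is easy to see that we have reached our aim.

Lemma A.2. Let $f\in C_0^1(\mathbb{R}^2)$, and let $\|\cdot\|_{HS}$ denote the Hilbert-Schmidt norm in $L^2(\mathbb{R}^2)$. Then
$$\|\Pi_j^{(2)}f\|_{HS}^2\le c\|f\|_2^2\,\mu/h,\qquad \|[\Pi_j^{(2)};f]\|_{HS}^2\le c_1\|\nabla f\|_\infty^2\big(\mathrm{diam}(\mathrm{supp}\,f)+c_2\sqrt{h/\mu}\big)^2.$$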

Proof. We will only prove the second statement, since the proof of the first is similar (and easier!). Remember that the Hilbert-Schmidt norm of an operator is given as the $L^2$-norm of the integral kernel. The operator $K = [\Pi_j^{(2)}; f]$ has integral kernel

$$
K(x_\perp, y_\perp) = \Pi_j^{(2)}(x_\perp, y_\perp)\,\{f(y_\perp) - f(x_\perp)\},
$$

where $\Pi_j^{(2)}(x_\perp, y_\perp)$ is seen from (4.1) to have the following structure:

$$
\big|\Pi_j^{(2)}(x_\perp, y_\perp)\big| = \frac{\mu}{h}\, F\big(|x_\perp - y_\perp| \sqrt{\mu/h}\,\big),
$$


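with $F$ a rapidly decreasing function (Gaussian times a polynomial) independent of $\mu$, $h$. We can furthermore estimate

$$
|f(y_\perp) - f(x_\perp)| \le G(x_\perp, |x_\perp - y_\perp|)\, |x_\perp - y_\perp| ,
$$

where

$$
G(x_\perp, r) = \sup_{z \in B(x_\perp, r)} |\nabla f(z)| .
$$

Here $G$ satisfies the following relations:

$$
\begin{aligned}
|G(x_\perp, r)| &\le \|\nabla f\|_\infty , \\
\operatorname{supp} G(\cdot, r) &\subset \operatorname{supp} f + B(0, r) \subset B\big(0, \operatorname{diam}(\operatorname{supp} f) + r\big) .
\end{aligned}
$$

Therefore, we get:

$$
\begin{aligned}
\iint |K(x_\perp, y_\perp)|^2\, dx_\perp\, dy_\perp
&\le \iint \Big( \frac{\mu}{h}\, F\big(|z| \sqrt{\mu/h}\,\big)\, |z|\, G(x_\perp, |z|) \Big)^2 dx_\perp\, dz \\
&\le c\, \|\nabla f\|_\infty^2\, (\mu/h)^2 \int F\big(|z| \sqrt{\mu/h}\,\big)^2 |z|^2 \big(\operatorname{diam}(\operatorname{supp} f) + |z|\big)^2 dz .
\end{aligned}
$$

Changing variables to $z' = \sqrt{\mu/h}\, z$ now finishes the proof.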
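The final change of variables in the proof can be spelled out as follows. This is only a sketch: the abbreviation $d = \operatorname{diam}(\operatorname{supp} f)$ and the primed variable $z'$ are notation introduced here, and the resulting constants (in particular $c_2 = 1$) are one admissible choice, not necessarily the author's. The second display indicates why the first statement of the lemma is the easier one, under the assumption that the kernel has the constant diagonal $\Pi_j^{(2)}(y_\perp, y_\perp) = \mu/(2\pi h)$, which matches the factor $\mu/(\pi h)$ obtained for two Landau levels in the trace computation before the lemma.

```latex
% Substitution z' = \sqrt{\mu/h}\, z in R^2, so that dz = (h/\mu)\, dz'
% and |z| = \sqrt{h/\mu}\, |z'|. Here d = diam(supp f) (notation assumed).
\begin{align*}
\Big(\frac{\mu}{h}\Big)^{2} \int F\big(|z|\sqrt{\mu/h}\big)^{2}\, |z|^{2}\, (d + |z|)^{2}\, dz
 &= \int F(|z'|)^{2}\, |z'|^{2}\, \big(d + \sqrt{h/\mu}\,|z'|\big)^{2}\, dz' \\
 &\le \big(d + \sqrt{h/\mu}\big)^{2} \int F(|z'|)^{2}\, |z'|^{2}\, (1 + |z'|)^{2}\, dz' .
\end{align*}
% The inequality uses d + s|z'| \le (1 + |z'|)(d + s) with s = \sqrt{h/\mu} \ge 0.
% The remaining integral is finite and independent of \mu, h since F is rapidly decreasing.

% First statement: \Pi_j^{(2)} f has kernel \Pi_j^{(2)}(x_\perp, y_\perp) f(y_\perp).
% Since \Pi_j^{(2)} is an orthogonal projection,
% \int |\Pi_j^{(2)}(x_\perp, y_\perp)|^2 dx_\perp = \Pi_j^{(2)}(y_\perp, y_\perp).
\begin{align*}
\big\|\Pi_j^{(2)} f\big\|_{HS}^{2}
 &= \iint \big|\Pi_j^{(2)}(x_\perp, y_\perp)\big|^{2}\, |f(y_\perp)|^{2}\, dx_\perp\, dy_\perp \\
 &= \int \Pi_j^{(2)}(y_\perp, y_\perp)\, |f(y_\perp)|^{2}\, dy_\perp
  = \frac{\mu}{2\pi h}\, \|f\|_{2}^{2} .
\end{align*}
```

Combined with the prefactor $c\,\|\nabla f\|_\infty^2$ from the last estimate of the proof, the first display gives the second statement of the lemma.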

Acknowledgments. The author wishes to thank the Schrödinger Institute in Vienna for hospitality during the fall term 1999, especially Thomas and Maria Hoffmann-Ostenhof. Furthermore, the author acknowledges many useful discussions with Thomas Østergaard Sørensen and Jan Philip Solovej. Finally, the author is very grateful to the patient referee who directed his attention to a number of misprints and imprecisions.

References

[ES97] L. Erdös and J.P. Solovej, Semiclassical Eigenvalue Estimates for the Pauli Operator with Strong non-homogeneous magnetic fields. II. Leading order asymptotic estimates, Commun. Math. Phys. 188, 599–656 (1997).

[Fou98] S. Fournais, Semiclassics of the Quantum Current, Comm. in P.D.E 23, no. 3-4, 601–628 (1998).

[Fou99] S. Fournais, Semiclassics of the quantum current in a strong constant magnetic field, University of Aarhus Preprint, no. 9 (1999).

[GG99] V. Georgescu and C. Gérard, On the virial theorem in quantum mechanics, Commun. Math. Phys. 208, no. 2, 275–281 (1999).

[Ivr98] V. Ivrii, Microlocal Analysis and Semiclassical Spectral Asymptotics, Springer Verlag, 1998.

[LS77] E.H. Lieb and B. Simon, The Thomas-Fermi Theory of Atoms, Molecules and Solids, Adv. Math., no. 23, 22–116 (1977).

[LSY94] E. Lieb, J.P. Solovej, and J. Yngvason, Asymptotics of heavy atoms in high magnetic fields: II. Semiclassical regions, Commun. Math. Phys., no. 161, 77–124 (1994).

[Sob94] A.V. Sobolev, The quasi-classical asymptotics of local Riesz means for the Schroedinger operator in a strong homogeneous magnetic field, Duke Math. J. 74, no. 2, 319–429 (1994).

S. Fournais
Department of Mathematical Sciences and MaPhySto³
University of Aarhus
Denmark
email: [email protected]

Communicated by Bernard Helffer
submitted 7/03/01, accepted 19/06/01

³ Centre for Mathematical Physics and Stochastics, funded by a grant from the Danish National Research Foundation.
