Commun. Math. Phys. 220, 1 – 12 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On the Definiti...

Author:
M. Aizenman (Chief Editor)



On the Definition of SRB-Measures for Coupled Map Lattices

Esa Järvenpää, Maarit Järvenpää

University of Jyväskylä, Department of Mathematics, P.O. Box 35, 40351 Jyväskylä, Finland. E-mail: [email protected]; [email protected]

Received: 23 June 2000 / Accepted: 4 January 2001

Abstract: We consider SRB-measures of coupled map lattices. The emphasis is given to a definition according to which a SRB-measure is an invariant probability measure whose projections onto finite-dimensional subsystems are absolutely continuous with respect to the Lebesgue measure. We show that coupled map lattices which are close to an uncoupled expanding map have typically an infinite number of SRB-measures. In particular, we give a counterexample to the Bricmont–Kupiainen conjecture.

1. Introduction

The SRB-measure (Sinai, Ruelle, Bowen) is by definition a “natural” invariant probability measure of a dynamical system (X, T), where X is a manifold and T : X → X is a differentiable mapping. The meaning of the word “natural” comes from the interpretation that the dynamical system is a model of some physical system. The natural measure should tell how typical points behave asymptotically, that is, what the long time behaviour of the system is for typical initial values. Typical points are determined by the set-up of the actual experiment. If the phase space of the system is a manifold then one may argue that the Lebesgue measure or some smooth modification of it is the right distribution for the initial values. Having found an invariant measure µ and a set A ⊂ X with positive Lebesgue measure such that the Birkhoff averages $\frac{1}{n}\sum_{i=1}^{n}\delta_{T^i(x)}$ tend to µ in the weak∗-topology as $n \to \infty$ for all x ∈ A, it is reasonable to say that µ is a SRB-measure. Here $\delta_x$ is the probability measure concentrated at the point x. The existence of several other definitions for the SRB-measure found in the literature stems from the fact that this is a difficult condition to test. One definition is that the SRB-measure is an invariant probability measure whose conditional distributions on unstable leaves are absolutely continuous with respect to the corresponding Lebesgue measure. According to another definition it is an equilibrium state for a certain potential function obtained from the derivative of the map. A third definition states that the SRB-measure is a limit of the


E. Järvenpää, M. Järvenpää

Lebesgue measure under the iteration of the dynamics. For nice finite-dimensional systems like expanding maps on compact manifolds or axiom A systems all these definitions agree and give the same unique SRB-measure. When adopting the aforementioned definitions into the infinite-dimensional setting of coupled map lattices, one should take into consideration that in an experiment it is possible to measure only a finite number of quantities, in particular, a finite number of coordinates. Thus it seems quite natural to demand that the finite dimensional projections of a SRB-measure are absolutely continuous with respect to the corresponding Lebesgue measure. The extension of the equilibrium state definition to the infinite dimensional setting is not trivial because of the difficulties caused by infinite determinants and matrices. The third definition is obtained by studying finite dimensional approximations of the whole system, taking the limit of the (finite) Lebesgue measure under these approximations, and letting the subsystem size tend to infinity. Even for expanding maps one possibility is to demand that finite dimensional conditional distributions are absolutely continuous. All of the above definitions have been used in the literature. Bunimovich and Sinai [BS] studied expanding maps of the unit interval with a special diffusive coupling over one-dimensional lattice Z. They showed that the system has an invariant Gibbs state whose projections onto finite-dimensional subsystems are absolutely continuous with respect to the Lebesgue measure. In [BK1] Bricmont and Kupiainen used the first mentioned definition, proved the existence of a SRB-measure for analytic expanding circle maps in the regime of small analytic coupling over d-dimensional lattice Zd , and conjectured the uniqueness of this SRB-measure. They extended the existence result for special Hölder continuous functions in [BK2]. 
They also verified that the SRB-measure is unique in the class of measures for which the logarithm of the density is Hölder continuous. In [J] it was shown that all these results remain true if one replaces the circle by any compact Riemannian manifold. Jiang and Pesin [JP] considered weakly coupled Anosov maps. They managed to extend the equilibrium state definition to this setting and proved the existence and uniqueness of the SRB-measure. Recently, Keller and Zweimüller [KZ] studied piecewise expanding interval maps with a special unidirectional coupling using the last mentioned definition. They established the existence and uniqueness of the SRB-measure in this setting. Finally, the proofs of [BK2, JP] give the uniqueness of the SRB-measure given as in the third definition above. The purpose of this paper is to show that the first mentioned definition is not equivalent to the second and third ones in an infinite dimensional setting. We will construct a coupled map lattice which has an infinite number of SRB-measures according to the first mentioned definition (see Theorem 3.4). (Three of these are also (space) translation invariant.) We also argue that our example is not just a curious artificial system but it manifests a typical behaviour. Thus, although being perhaps the most natural of the above definitions at the heuristic level, this definition has the drawback of being non-unique. Our results also imply that for each finite subsystem X one can find a set A of positive Lebesgue measure such that for each x ∈ A there are boundary conditions $y_1(x)$ and $y_2(x)$ such that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\delta_{T^i(x\vee y_i)} = \mu_i,$$

where $\mu_1 \ne \mu_2$ and x ∨ y is the natural element of the phase space X. Hence the boundary conditions do have an effect. Note that one cannot draw the conclusion that there is a physical phase transition since for each x ∈ A one has to choose the boundary

Non-Uniqueness of SRB-Measures for Coupled Map Lattices


condition in a very special way in order to see another SRB-measure than the one whose existence was proved in [BK2].

2. Preliminaries

Our main motivation comes from the well-known projection results in $\mathbb{R}^n$ stating that the projections of a Radon measure µ onto almost all m-planes are absolutely continuous with respect to the m-dimensional Lebesgue measure provided that the m-energy of µ is finite [M, Theorem 9.7]. Our strategy is to use the fact that expanding maps have small invariant sets (and measures) in the sense that their dimensions are less than the dimension of the ambient manifold. For example, the $\frac{1}{3}$-Cantor set is invariant under the map $x \mapsto 3x \bmod 1$. If one takes a finite n-fold product of these Cantor sets, one will obtain a set which is invariant under the corresponding n-fold product map. Of course, the dimension of this product set is less than n, and so the natural Hausdorff measure living on the set, although being invariant, is not a SRB-measure since it is not absolutely continuous with respect to the n-dimensional Lebesgue measure. However, as n grows, the dimension of the product Cantor set grows. In particular, for each integer m one can find n such that the dimension of the n-fold Cantor set is greater than m. By the above mentioned projection result typical projections of the n-fold Hausdorff measure onto m-dimensional subspaces are absolutely continuous with respect to the m-dimensional Lebesgue measure. Of course, for this system the m-dimensional subsystems are atypical and the projections onto them are not absolutely continuous. Our idea is that a small coupling will make these coordinate planes typical ones. However, one has to be careful since in [HK] Hunt and Kaloshin proved that these projection results are not valid in infinite dimensional spaces. The projection theorems have also the reversed statements according to which the set of exceptional directions may have positive dimension although having zero measure (see [F]). Thus one cannot expect anything more than “almost all”-results.
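The dimension count in this paragraph can be checked directly. The following is a minimal Python sketch, not taken from the paper; the function name `smallest_product_size` is our own. It assumes that the 1/3-Cantor set has dimension s = log 2 / log 3 and that the n-fold product has dimension ns, so that ns > m is equivalent to 2^n > 3^m, which can be tested in exact integer arithmetic.

```python
import math

# Hausdorff dimension of the 1/3-Cantor set (approximately 0.6309).
s = math.log(2) / math.log(3)


def smallest_product_size(m: int) -> int:
    """Return the smallest n with n * s > m.

    The comparison n * log 2 > m * log 3 is carried out exactly as 2**n > 3**m,
    so no floating-point rounding can affect the answer.
    """
    n = 1
    while 2 ** n <= 3 ** m:
        n += 1
    return n


for m in (1, 2, 3, 10):
    n = smallest_product_size(m)
    print(f"m = {m}: n = {n}, n * s = {n * s:.4f}")
```

For a projection onto m coordinates, this gives the least number of Cantor factors whose product has dimension above m.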
We adopt the very general formulation of the projection theorem due to Peres and Schlag [PS]. We begin by recalling the notation from [PS] which we will use later. Definition 2.1. Let (X, d) be a compact metric space, Q ⊂ Rn an open connected set, and : Q × X → Rm a continuous map with n ≥ m. For any multi-index |η| η = (η1 , . . . , ηn ) ∈ Nn , let |η| = ni=1 ηi be the length of it, and ∂ η = (∂ε1 )η1∂...(∂εn )ηn , where = (ε1 , . . . , εn ) ∈ Q. Let L be a positive integer and δ ∈ [0, 1). We say that ∈ C L,δ (Q) if for any compact set Q ⊂ Q and for any multi-index η with |η| ≤ L there exist constants Cη,Q and Cδ,Q such that

|∂ η (, x)| ≤ Cη,Q and sup |∂ η (, x) − ∂ η ( , x)| ≤ Cδ,Q | − |δ |η |=L

for all , ∈ Q and x ∈ X. Next we will give a definition of a subclass of C L,δ (Q) from [PS]. Definition 2.2. Let ∈ C L,δ (Q) for some L and δ. Define for all x = y ∈ X, x,y () =

(, x) − (, y) . d(x, y)


Let β ∈ [0, 1). The set Q is a region of transversality of order β for if there exists a constant Cβ such that for all ∈ Q and for all x = y ∈ X the condition |x,y ()| ≤ Cβ d(x, y)β implies det(Dx,y ()(Dx,y ())T ) ≥ Cβ2 d(x, y)2β . Here the derivative with respect to is denoted by D and AT is the transpose of a matrix A. Further, is (L, δ)-regular on Q if there exists a constant Cβ,L,δ and for all multiindices η with |η| ≤ L there exists a constant Cβ,η such that for all , ∈ Q and for all distinct x, y ∈ X, |∂ η x,y ()| ≤ Cβ,η d(x, y)−β|η| and

sup |∂ η x,y () − ∂ η x,y ( )| ≤ Cβ,L,δ | − |δ d(x, y)−β(L+δ) .

|η |=L

Remark 2.3. Note that if the determinant in Definition 2.2 is bounded away from zero then Q is a region of transversality of order β for all β ∈ [0, 1).

Definition 2.4. Let µ be a Borel measure on X and α ∈ R. The α-energy of µ is

$$E_\alpha(\mu) = \int_X \int_X d(x,y)^{-\alpha}\, d\mu(x)\, d\mu(y).$$

We denote the image of a measure µ under a map f : X → Y by f∗ µ, that is, f∗ µ(A) = µ(f −1 (A)) for all A ⊂ Y . The following theorem from [PS] gives a relation between Sobolev-norms of images of measures under C L,δ (Q)-mappings and energies of original measures. Theorem 2.5. Let Q ⊂ Rn and ∈ C L,δ (Q) such that L + δ > 1. Let β ∈ [0, 1). Assume that Q is a region of transversality of order β for and that is (L, δ)-regular on Q. Let µ be a finite Borel measure on X such that Eα (µ) < ∞ for some α > 0. Then there exist a constant a0 depending only on m, n, and δ such that for any compact Q ⊂ Q, ∗ µ22,γ dLn () ≤ Cγ Eα (µ) Q

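for some constant $C_\gamma$ provided that $0 < (m + 2\gamma)(1 + a_0\beta) \le \alpha$ and $2\gamma < L + \delta - 1$. Here $\|\cdot\|_{2,\gamma}$ is the Sobolev norm, that is,

$$\|\nu\|_{2,\gamma}^2 = \int_{\mathbb{R}^m} |\hat\nu(\xi)|^2\, |\xi|^{2\gamma}\, d\mathcal{L}^m(\xi)$$

for any finite compactly supported Borel measure ν on $\mathbb{R}^m$, where

$$\hat\nu(\xi) = \int_{\mathbb{R}^m} e^{-i\xi\cdot x}\, d\nu(x)$$

is the Fourier transform of ν.

Proof. [PS, Theorem 7.3].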


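Remark 2.6. Let ν be a finite compactly supported Borel measure on $\mathbb{R}^n$. If $\|\nu\|_{2,0} < \infty$ then ν is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^n$ and its Radon-Nikodym derivative is $L^2$-integrable, that is, $D(\nu, \mathcal{L}^n) \in L^2(\mathbb{R}^n)$ (see (3.5)). Indeed, if $\hat\nu \in L^2(\mathbb{R}^n)$ then by the surjectivity of the Fourier transform [SW, Theorem 2.3, p. 17] there exists $f \in L^2(\mathbb{R}^n)$ such that $\hat f = \hat\nu$. Thus by [T, Definition 1.7, p. 262] f = ν as a distribution, meaning that $f = D(\nu, \mathcal{L}^n)$. Note also that $\|\nu\|_{2,\gamma} < \infty$ for $\gamma \ge n + 2$ implies that $D(\nu, \mathcal{L}^n)$ has $L^2$-integrable derivatives of order γ, that is, $D(\nu, \mathcal{L}^n) \in W_2^\gamma(\mathbb{R}^n)$. So by [SW, Lemma 3.17, p. 26] $D(\nu, \mathcal{L}^n)$ is continuously differentiable.

3. Results

Let $\Omega = \prod_{\mathbb{Z}^d} S^1$, where d ≥ 1 is an integer and $S^1 \subset \mathbb{C}$ is the unit circle. We use the notation $\Omega_\Lambda = \prod_{\Lambda} S^1$ for all $\Lambda \subset \mathbb{Z}^d$. For $\Lambda \subset \tilde\Lambda \subset \mathbb{Z}^d$ let $\pi_\Lambda : \Omega \to \Omega_\Lambda$ and $\pi_{\tilde\Lambda,\Lambda} : \Omega_{\tilde\Lambda} \to \Omega_\Lambda$ be the natural projections. Let $\varepsilon_0 > 0$ and let $A_\varepsilon : \Omega \to \Omega$ be such that its lift $\tilde A_\varepsilon : \tilde\Omega \to \tilde\Omega$, where $\tilde\Omega = \prod_{\mathbb{Z}^d} \mathbb{R}$, is

$$\tilde A_\varepsilon(x)_i = x_i + \sum_{l\in\mathbb{Z}^d} \varepsilon_{il}\, 2^{-|i-l|}\, g(x_l) \tag{3.1}$$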

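for all $i \in \mathbb{Z}^d$, where | · | is a metric on $\mathbb{Z}^d$, $\varepsilon_{il} \in (-\varepsilon_0, \varepsilon_0)$ for all $i, l \in \mathbb{Z}^d$ and g is continuously differentiable and 1-periodic. (We use the covering map $p : \tilde\Omega \to \Omega$ such that $\prod_{\mathbb{Z}^d} [0, 1]$ is a covering domain. Then $A_\varepsilon = p \circ \tilde A_\varepsilon \circ p^{-1}$.) For the discussion of the explicit form of the conjugacy $A_\varepsilon$, see Remarks 3.5. Set $E = \prod_{\mathbb{Z}^d\times\mathbb{Z}^d} (-\varepsilon_0, \varepsilon_0)$ and denote by $\mathcal{L}$ the product over $\mathbb{Z}^d \times \mathbb{Z}^d$ of normalized Lebesgue measures on $(-\varepsilon_0, \varepsilon_0)$. It is not difficult to see that $A_\varepsilon$ is invertible for all $\varepsilon \in E$ provided $\varepsilon_0$ is small enough (depending on $|g'|$). We fix such $\varepsilon_0$ and set $T_\varepsilon = A_\varepsilon \circ F \circ A_\varepsilon^{-1}$, where $F : \Omega \to \Omega$ is the product over $\mathbb{Z}^d$ of maps $z \mapsto z^3$ (or $t \mapsto 3t \bmod 1$ if $S^1$ is viewed as [0, 1]). Let $\mathbf{K} = \prod_{\mathbb{Z}^d} K$ and $\mu = \prod_{\mathbb{Z}^d} \mathcal{H}^s|_K$, where K is the $\frac{1}{3}$-Cantor set on $S^1$ (or [0, 1]) and $\mathcal{H}^s|_K$ is the restriction of the s-dimensional Hausdorff measure to K with $s = \frac{\log 2}{\log 3}$. (Note that s is the Hausdorff dimension of K.) Now $(A_\varepsilon)_*\mu$ is clearly $T_\varepsilon$-invariant, that is, $(T_\varepsilon)_*(A_\varepsilon)_*\mu = (A_\varepsilon)_*\mu$. Our aim is to show that for $\mathcal{L}$-almost all ε the projection $(\pi_\Lambda)_*(A_\varepsilon)_*\mu$ is absolutely continuous with respect to the Lebesgue measure on $\Omega_\Lambda$ for all finite $\Lambda \subset \mathbb{Z}^d$. Let $\Lambda \subset \mathbb{Z}^d$. We denote the restriction of $A_\varepsilon$ to $\Omega_\Lambda$ by $A_{\varepsilon,\Lambda}$, that is,

$$A_{\varepsilon,\Lambda}(x)_i = x_i + \sum_{l\in\Lambda} \varepsilon_{il}\, 2^{-|i-l|}\, g(x_l)$$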

˜ ⊂ Zd be finite for all i ∈ . Set µ = Hs |K and K = K. Let ⊂ ˜ such that | |s > | |, where the number of elements in is denoted by | |. Let ˜ E × ˜ = × ˜ (−ε0 , ε0 ) and let L × be the restriction of L to E × ˜ . We will first ˜

show that for L × -almost all ∈ E × ˜ the measure (π , ◦ Aε, ˜ )∗ µ ˜ is absolutely ˜ continuous with respect to the Lebesgue measure on ' . As it will be indicated in the proof of Proposition 3.2 this claim follows from Theorem 2.5. In order to apply Theorem 2.5 we have to give some conditions on g. Since g is 1-periodic and continuously differentiable there necessarily exists t0 ∈ [0, 1] such that


g (t0 ) = 0. In order to satisfy the transversality assumption in Theorem 2.5, we demand that g = 0 on K. More precisely, let b > 0 and let g be increasing on [0, 1/6] such that g(0) = 0 and g (t) ≥ b for all t ∈ [0, t1 ] for some 1/9 < t1 < 1/6. Define g(t + 1/6) = g(1/6 − t) for t ∈ [0, 1/6] and g(1 − t) = −g(t) for t ∈ [0, 1/3]. We extend g to the interval [1/3, 2/3] such that g is continuously differentiable, g([0, 1]) ⊂ [−1, 1], for some B ≥ b we have |g (t)| ≤ B for all t ∈ [0, 1], and |g (t)| ≥ b for all t ∈ [1/3, 1/3 + t2 ] ∪ [2/3 − t2 , 2/3], where 0 < t2 < 1/9. Consider the second step in the construction of the Cantor set K. Call the chosen intervals Ii , i = 1, . . . , 4, that is, I1 = [0, 1/9], I2 = [2/9, 1/3], I3 = [2/3, 7/9], and I4 = [8/9, 1]. Let x ∈ K and ⊂ Zd . Define x˜ ∈ K in the following way: For all i ∈ , let x˜i = xi . For j ∈ c = Zd \ set x˜j = xj if xj ∈ I1 ∪ I4 , x˜j = 1/6 − (xj − 1/6) if xj ∈ I2 , and x˜j = 5/6 + 5/6 − xj if xj ∈ I3 . Note that with these definitions g(x˜j ) = g(xj ) for all j ∈ Zd implying that π ◦ A (x) ˜ = π ◦ A (x). Further, if / [−t1 , t1 ] for some j ∈ c then x˜j ∈ [−t1 , t1 ]. xj ∈ Let x, y ∈ K such that xi ∈ I1 and yi ∈ I2 for some i ∈ . Then A (y)i − A (x)i ≥ yi − xi − εil 2−|i−l| |g(yl ) − g(xl )| l∈Zd

≥ yi − xi −

εil 2−|i−l| B|yl − xl | ≥ yi − xi − Cε0 ≥

l∈Zd

1 (3.2) 18

for ε0 small enough since yi − xi ≥ 1/9. Thus the cubes at the second stage of the construction of K with i th side I1 will not overlap with cubes with i th side I2 under the projection π ◦ A provided that i ∈ . (The same argument works in other cases as well, see 3.3 below.) More precisely, there exists a constant c > 0 such that |π ◦ A (x) − π ◦ A (y)| ≥ c

(3.3)

for all x, y ∈ K with xi ∈ I1 ∪ I4 and yi ∈ I2 ∪ I3 (or xi ∈ I2 and yi ∈ I3 ) for some i ∈ . Further, as in (3.2) we see that there exists c˜ > 0 such that |A (x)i − 1/6| ≥ c˜ for all i ∈ and x ∈ K, giving the existence of δ > 0 such that 1 1 − δ, + δ = ∅ (3.4) π{i} ◦ A (K) ∩ 6 6 for all i ∈ . We fix ε0 and δ such that the above results hold.

˜ ⊂ Zd be finite such that | |s ˜ > | |. Set X ˜ = ˜ [−t1 , t1 ]. Lemma 3.1. Let ⊂

Define : E × ˜ × X ˜ → ' by (, x) = π , ◦ A, ˜ (x). Then the assumptions ˜ of Theorem 2.5 are valid for δ = 0, β = 0, and for all integers L > 1. Further, ˜ Eα (µ ˜ ) < ∞ for any | | < α < | |s.

Proof. We may replace ' by Rm , where m = | |. Let i0 ∈ . Note that X ˜ is a compact metric space equipped with the metric 2−2|i0 −l| |xl − yl |2 . d(x, y)2 = ˜ l∈

Clearly ∈ C L,0 (E × ˜ ) for all positive integers L since all the first order partial derivatives are constants. Note that Q in Definition 2.1 will not play any role here since all the estimates are independent of Q .


To check the transversality assumption in Definition 2.2, define for all x = y ∈ X ˜ , x,y () =

(, x) − (, y) . d(x, y)

˜ and x, y ∈ X ˜ such that x = y. Then Fix i ∈ , k = (k1 , k2 ) ∈ × ,

Dx,y ()i,k = δi,k1 2−|i−k2 |

g(xk2 ) − g(yk2 ) , d(x, y)

where δi,j is the Kronecker’s delta. Thus for i, j ∈ , (Dx,y ()Dx,y ()T )i,j =

δi,j −|i−l|−|j −l| 2 (g(xl ) − g(yl ))2 d(x, y)2 ˜ l∈ 2 −|i−i0 |−|j −i0 |

≥ δi,j b 2

.

By Remark 2.3 the transversality assumption is valid for β = 0 with the constant C0 = bm 2− i∈ |i−i0 | . Finally, is obviously (L, 0)-regular (in fact (L, δ)-regular for all δ ∈ [0, 1)) on E × ˜ for all positive integers L. The last assertion follows from the well-known properties of the Hausdorff measure Hs |K (see [M, Chapter 8]). The following absolute continuity result follows from Theorem 2.5 and Lemma 3.1. ˜

˜ > | |. Then for L × ˜ ⊂ Zd be finite such that | |s Proposition 3.2. Let ⊂ almost all ∈ E × ˜ the measure (π , ◦ A ) µ is absolutely continuous with ˜ ∗ ˜ ˜ , respect to the Lebesgue measure on ' . Proof. By the arguments given before stating Lemma 3.1 we may replace ' ˜ by X ˜ = ˜

× ˜ )∗ µ ˜ [−t1 , t1 ]. Lemma 3.1 and Theorem 2.5 give (π , ˜ ◦A, ˜ 2,0 < ∞ for L

almost all ∈ E × ˜ which by Remark 2.6 implies the claim. In Proposition 3.3 we will prove that one may replace A, ˜ by A and µ ˜ by µ in Proposition 3.2. For this purpose we use differentiation theory of measures. Let ν and λ be Radon measures on Rn . Recall that the lower derivative of ν with respect to λ at a point x ∈ Rn is defined by D(ν, λ, x) = lim inf r→0

ν(B(x, r)) , λ(B(x, r))

(3.5)

where B(x, r) is the closed ball with centre at x and with radius r. If the limit exists it is called the Radon-Nikodym derivative of ν with respect to λ and is denoted by D(ν, λ, x). Further, ν is absolutely continuous with respect to λ if and only if D(ν, λ, x) < ∞ for ν-almost all x ∈ Rn [M, Theorem 2.12]. ˜ > | | and let 1 ∈ E ˜ ˜ ⊂ Zd be finite such that | |s Proposition 3.3. Let ⊂

× such that the conclusion of Proposition 3.2 is valid. Then for all ∈ E with × ˜ = 1 we have D((π ◦ A )∗ µ, L , x) < ∞ for (π ◦ A )∗ µ-almost all x ∈ ' . Here L is the Lebesgue measure on ' and × ˜ = (εij )(i,j )∈ × ˜ .


Proof. Let , 0 ∈ E such that × ˜ = 1 , × ˜c = ˜ ˜ = (0 ) × ˜ ˜ , and (0 )Zd × (0 ) ˜ c ×Zd = 0. Set ν = (π ◦ A )∗ µ and ν0 = (π , ◦ A ) µ . Then ν and ν 0 are ˜ ∗ ˜ ˜ 0 , Radon measures with compact supports [M, Theorem 1.18]. It follows directly from (3.1) that (A0 , ˜ )∗ µ ˜ = (π ˜ ◦ A0 )∗ µ, meaning that ν0 = (π ◦ A0 )∗ µ. By Proposition 3.2 the measure ν0 is absolutely continuous with respect to L . Set m = | |. We will first show that there exists a constant C > 0 such that for all r > 0, √ ν (B(x, r))dν (x) ≤ C ν0 (B(x, mr))dν0 (x). (3.6) '

'

By [FO, Lemma 2.6] it is enough to prove that ν (Q)2 ≤ C Q∈D (r, )

ν0 (Q)2 ,

(3.7)

Q∈D (r, )

where D(r, ) is the family of r-mesh cubes in R , that is, cubes of the form [l1 r, (l1 + 1)r) × · · · × [lm r, (lm + 1)r), where li ∈ Z for all i = 1, . . . , m. Let r > 0. Consider the cubes at the nth stage of the construction of K, where 3−n < r. Call this nth stage approximation K(n). Setting V0 = A0 , ˜ (K ˜ (n)) × K ˜ c (n) = A, ˜ (K ˜ (n)) × K ˜ c (n), we get A0 (spt µ) ⊂ V0 implying that spt ν0 ⊂ π (V0 ). Here the support of a measure λ is denoted by spt λ. ˜ and x, y ∈ X = Zd [−t1 , t1 ] such that xk = yk for all k ∈ , ˜ then If i ∈ A (x)i − A (y)i = εil 2−|i−l| (g(xl ) − g(yl )). (3.8) ˜c l∈

(Recall the discussion before Lemma 3.1 according to which we can assume that xi ∈ ˜ c. [−t1 , t1 ] for all i ∈ Zd ). Note that the difference in (3.8) depends only on xj for j ∈ Defining V = A (K(n)), we have spt ν ⊂ π (V ). Further, A (x)i = A, ˜ (x)i for ˜ c meaning that the restriction of V to the subspace ˜ if xj = 0 for all j ∈ all i ∈ ' ˜ ⊂ ' equals A, ˜ (K ˜ (n)) = A0 , ˜ (K ˜ (n)). So by (3.8) V is obtained from V0 by tilting the rows of “cubes” above each “cube” in A, ˜ (K ˜ (n)) in such a way that the ˜ Thus ν is obtained from ν0 by amount of translation does not depend on xi for i ∈ . spreading around the “cubes” defining ν0 . Let Q ∈ D(r, ). If there is Q ∈ D(r, ) such that a part of the “cubes” above it in V0 are tilted above Q then the corresponding “cubes” above Q (in V0 ) are removed away by (3.8). Define AQ = {Q ∈ D(r, ) | π (A (A−1˜ (Q × X \ ) × X ˜ c )) ∩ Q = ∅}. ˜ ,

Then for all Q ∈ AQ with π (V ) ∩ π (A (A−1˜ (Q × X \ ) × X ˜ c )) ∩ Q = ∅ we ˜ , c have V0 ∩ (Q × X ) = ∅. Further, Q × X c = PQ (Q ), (3.9) Q ∈D (r, ) Q∈AQ

where

PQ (Q ) = {x ∈ Q × X c | π (A (A−1˜ (x ˜ ) × x ˜ c )) ∈ Q }. ,


Observe that (A0 )∗ µ(PQ (Q )) = (A )∗ µ(A (A−1 0 (PQ (Q )))).

(3.10)

Note that by (3.8) the geometric shape of this partition is independent of Q, that is, if Q1 ∈ D(r, ) with Q1 × X c =

PQ1 (Q ),

Q ∈D (r, ) Q1 ∈AQ

then for all Q2 = τ (Q1 ) ∈ D(r, ) (τ is a translation) we have

Q2 × X c =

τ (PQ1 (Q )).

Q ∈D (r, ) Q1 ∈AQ

Naturally, this partition can be restricted to V0 . Hence for all Q ∈ D(r, ) there are 1 non-negative numbers pQ (Q ) = ν0 (Q) (A0 )∗ µ(PQ (Q )) adding to 1 such that

ν0 (Q) = (A0 )∗ µ(Q × X c ) =

(A0 )∗ µ(PQ (Q ))

Q ∈D (r, ) Q∈AQ

=

(3.11)

pQ (Q )ν0 (Q).

Q ∈D (r, ) Q∈AQ

This gives by (3.10) that ν (Q) =

Q ∈AQ

(A0 )∗ µ(PQ (Q)) =

pQ (Q)ν0 (Q ).

(3.12)

Q ∈AQ

The numbers pQ (Q ) depend on both Q and PQ (Q ). Enumerating the partition of Q×X c given in (3.9) we get Q×X c = ∪i PQ (i), where the geometric shape of PQ (i) ∈ D(r, ) we have PQ may vary as i varies. However, for all i and Q, Q (i) = τ (PQ (i)), = τ (Q). Hence the differences in PQ (i) as Q varies where τ is the translation with Q and i is kept fixed are due to the fact that the measure is not evenly distributed inside horizontal | |-dimensional slices of Q × X c . Note that if such a horizontal slice intersects an element PQ (Q ) of the partition (3.9), then, by (3.8), it may intersect only the elements PQ (Q ), where Q is a neighbour of Q in D(r, ). Let N = 3| | be the number of neighbours. We say that Q and Q are related (Q ∼ Q ) if there exists Q


such that Q , Q ∈ AQ . Then by (3.11) and (3.12) N

ν0 (Q)2 −

Q∈D (r, )

=N

ν (Q)2

Q∈D (r, )

pQ (Q )pQ (Q )ν0 (Q)2

Q∈D (r, ) Q ∈D (r, ) Q ∈D (r, ) Q∈AQ Q∈AQ

−

pQ (Q)pQ (Q)ν0 (Q )ν0 (Q )

Q∈D (r, ) Q ∈AQ Q ∈AQ

=

pQ (Q)pQ (Q)(ν0 (Q ) − ν0 (Q ))2 + P ≥ 0

Q ,Q ∈D (r, ) Q∈D (r, ) Q ,Q ∈AQ Q ∼Q

since the remainder P (which is due to the occasionally very generous compensation factor N ) is non-negative. This concludes the proof of (3.7). Let α be the L -measure of the m-dimensional unit ball. By [M, Theorem 2.12] D(ν0 , L , x) exists and is finite for L -almost all x. By Proposition 3.2 the same is true for ν0 -almost all x. By Remark 2.6 we can choose D(ν0 , L ) as smooth as we like by ˜ In particular, it can be chosen to be uniformly continuous so that one can increasing . find r0 > 0 such that ν0 (B(x, r))α −1 r −m ≤ max{2D(ν0 , L , x), 1} for all 0 < r < r0 and x ∈ ' . Thus using Fatou’s lemma, inequality 3.6, the theorem of dominated convergence, and Theorem 2.5 together with Plancharel’s formula [SW, Theorem 2.1, p. 16], we have D(ν , L , x)dν (x) = lim inf ν (B(x, r))α −1 r −m dν (x) r→0 ≤ lim inf ν (B(x, r))α −1 r −m dν (x) r→0 √ ≤ lim inf C ν0 (B(x, mr))α −1 r −m dν0 (x) r→0 √ m = C( m) D(ν0 , L , x)dν0 (x) = C D(ν0 , L , x)2 dL (x) < ∞. Thus D(ν , L , x) is finite for ν -almost all x.

Theorem 3.4. For L-almost all the map T has infinitely many SRB-measures. Proof. For all finite ⊂ Zd , let Eg ( ) = { ∈ E | (π ◦ A )∗ µ is absolutely continuous with respect to L }. By Propositions 3.2 and 3.3 and [M, Theorem 2.12] we get for all finite ⊂ Zd , L(Eg ( )) = 1.


Defining Eg =


Eg ( )

⊂Zd

| | 0), and has a phase transition, if 1 < c < 2 (d > 2) [4]. It has been widely believed without proof that the hierarchical Ising model in d ≥ 4 dimensions has a critical trajectory converging to the Gaussian fixed point and that the “continuum limit” of the hierarchical Ising model in d ≥ 4 dimensions will be trivial. In this paper, we prove this fact. In the present analysis, it is crucial that the critical Ising model is mapped into a weak coupling regime after a small number of renormalization group transformations (in fact, 70 iterations for d = 4). Moreover, using a framework essentially different from that of [16, 7], we see in the weak coupling regime that the “effective coupling constant” of a critical model decays as $c_1/(N + c_2)$ after N iterations in d = 4 dimensions (exponentially for d > 4). Our framework in the weak coupling regime is designed especially for a critical trajectory starting at the strong coupling regime so that the criterion of convergence to the Gaussian fixed point can be checked numerically with mathematical rigor. Corresponding results, triviality of the $\phi_4^4$ spin model on a regular lattice (“full model”), are far harder, and a proof of triviality of the Ising model on the 4 dimensional regular lattice is, though widely believed, still open. We should here note the excellent and hard work of [9, 10] where the existence of a critical trajectory in the weak coupling regime (near the Gaussian fixed point; “weak triviality”) is solved by rigorous block spin renormalization group transformation.

Our main theorem is the following:

Theorem 1.1. If d ≥ 4 (i.e. $c \ge \sqrt{2}$), there exists a “critical trajectory” converging to the Gaussian fixed point starting from the hierarchical Ising models. Namely, there exists a positive real number $s_c$ such that if $h_N$, N = 0, 1, 2, · · · , are defined by (1.5) with $h_0 = h_{I,s_c}$, then the sequence of measures $h_N(x)\,dx$, N = 0, 1, 2, · · · , converges weakly to the massless Gaussian measure $h_G(x)\,dx$.

Remark.
Our proof is partially computer-aided and shows for d = 4 that sc ∈ [1.7925671170092624, 1.7925671170092625]. In the following sections, we give a proof of Theorem 1.1. We will concentrate on the case d = 4, since the cases d > 4 can be proved along similar lines (with weaker bounds).


T. Hara, T. Hattori, H. Watanabe

2. Strategy The proof of Theorem 1.1 is decomposed into two parts: Theorem 2.1(analysis in the weak coupling regime) and Theorem 2.2 (analysis in the strong coupling regime). They are stated in Sect. 2.3, and their proofs are given in Sect. 4 and Sect. 5, respectively. Theorem 1.1 is proved at the end of this section assuming them. (1) In Theorem 2.1, we control the renormalization group flow in a weak coupling regime by means of a finite number of truncated correlations (Taylor coefficients of logarithm of characteristic functions), and, in terms of the truncated correlations, we give a criterion, a set of sufficient conditions, for the measure to be in a domain of attraction of the Gaussian fixed point. (2) In Theorem 2.2, we prove, by rigorous computer-aided calculations, that there is a trajectory whose initial point is an Ising measure and for which the criterion in Theorem 2.1 is satisfied after a small number of iterations. The first part (Theorem 2.1) is essentially the Bleher–Sinai argument [1, 2, 16]. However, the criteria introduced in the references [16, 7] seem to be difficult to handle when “strong coupling constants” are present in the model, as in the Ising models. In order to overcome this difficulty, we use characteristic functions of single spin distributions and Newman’s inequalities for truncated correlations. The second part (Theorem 2.2) is basically simple numerical calculations of truncated correlations up to 8 points to ensure the criterion. The results are double checked by Mathematica and C++ programs, and furthermore they are made mathematically rigorous by means of Newman’s inequalities. It should be noted that rigorous computer-aided proofs are employed in [14] to Dyson’s hierarchical model in d = 3 dimensions, to prove, with [13], an existence of a non-Gaussian fixed point. (The “physics” are of course different between d = 3 and d = 4.) 
We also focus on a complete mathematical proof, by combining rigorous computer-aided bounds with mathematical methods such as Newman’s inequalities and the Bleher–Sinai arguments.

2.1. Characteristic function. Denote the characteristic function of the single spin distribution $h_N$ as

$$\hat h_N(\xi) = F h_N(\xi) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\xi x}\, h_N(x)\, dx. \tag{2.1}$$

The renormalization group transformation for $\hat h_N$ is

$$\hat h_{N+1} = F R F^{-1} \hat h_N, \tag{2.2}$$

which has a decomposition

$$F R F^{-1} = T S, \tag{2.3}$$

where

$$S g(\xi) = g\left(\frac{\sqrt{c}}{2}\,\xi\right)^2, \tag{2.4}$$

$$T g(\xi) = \text{const.}\, \exp\left(-\frac{\beta}{2}\,\Delta\right) g(\xi), \tag{2.5}$$

Triviality of Hierarchical Ising Model in Four Dimensions


and the constant is so defined that $T g(0) = 1$. The transformation (2.2) has the same form as the N = 2 case of the Gallavotti hierarchical model [5, 11, 12]. Note that only for N = 2 the Gallavotti model is equivalent (by Fourier transform) to the Dyson’s hierarchical model. We introduce a “potential” $V_N$ for the characteristic function $\hat h_N$ and its Taylor coefficients $\mu_{n,N}$ by

$$\hat h_N(\xi) = e^{-V_N(\xi)}, \tag{2.6}$$

$$V_N(\xi) = \sum_{n=1}^{\infty} \mu_{n,N}\, \xi^n. \tag{2.7}$$

(Note that $\hat h_N(0) = 1$.) The coefficient $\mu_{n,N}$ is called a truncated n point correlation. They are functions of the Ising parameter s in $h_0 = h_{I,s}$, but to simplify expressions, we will always suppress the dependences on s in the following. In particular, for the initial condition $h_0 = h_{I,s}$, we have

$$\hat h_0(\xi) = \hat h_{I,s}(\xi) = F h_{I,s}(\xi) = \cos(s\xi),$$

$$\mu_{2,0} = \frac{1}{2}s^2, \quad \mu_{4,0} = \frac{1}{12}s^4, \quad \mu_{6,0} = \frac{1}{45}s^6, \quad \mu_{8,0} = \frac{17}{2520}s^8, \quad \text{etc.},$$

and

$$h_1(x) = R h_{I,s}(x) = \text{const.}\left(e^{\beta c s^2/2}\left(\delta(x - s\sqrt{c}) + \delta(x + s\sqrt{c})\right) + 2\delta(x)\right),$$

$$\hat h_1(\xi) = \frac{1}{1+k}\left(1 + k\cos(\sqrt{c}\,s\xi)\right), \quad \text{with } k = e^{\beta c s^2/2},$$

$$\mu_{2,1} = k\varepsilon, \quad \mu_{4,1} = \frac{k}{6}(2k - 1)\varepsilon^2, \quad \mu_{6,1} = \frac{k}{90}(16k^2 - 13k + 1)\varepsilon^3,$$

$$\mu_{8,1} = \frac{k}{2520}(272k^3 - 297k^2 + 60k - 1)\varepsilon^4, \quad \text{etc.}, \quad \text{with } \varepsilon = \frac{c s^2}{2(k+1)}.$$
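The truncated correlations listed above for N = 0 and N = 1 can be reproduced with exact rational arithmetic. The following is a minimal Python sketch, not the authors' Mathematica or C++ code; the helper names `cos_coeffs` and `truncated_correlations` are our own. It expands the logarithm of the characteristic function as a power series in t = ξ². Since the N = 1 formulas are polynomial identities in k and in the auxiliary quantity cs²/(2(k+1)), the sketch checks them at arbitrary illustrative rational values rather than at k = exp(βcs²/2).

```python
from fractions import Fraction
from math import factorial


def cos_coeffs(a2, order):
    """Taylor coefficients of cos(a*xi) in the variable t = xi**2, with a2 = a**2."""
    return [Fraction((-1) ** j) * a2 ** j / factorial(2 * j) for j in range(order + 1)]


def truncated_correlations(h, order):
    """Return [mu_2, mu_4, ..., mu_{2*order}] for V = -log(h_hat).

    h[j] is the coefficient of t**j in h_hat, and h[0] must equal 1.
    Uses the recurrence j*h[j] = sum_{i=1..j} i*L[i]*h[j-i] for L = log(h_hat).
    """
    L = [Fraction(0)] * (order + 1)
    for j in range(1, order + 1):
        acc = sum((i * L[i] * h[j - i] for i in range(1, j)), Fraction(0))
        L[j] = h[j] - acc / j
    return [-L[j] for j in range(1, order + 1)]


# N = 0 with s = 1: h_hat_0(xi) = cos(xi).
mu0 = truncated_correlations(cos_coeffs(Fraction(1), 4), 4)

# N = 1: h_hat_1 = (1 + k*cos(a*xi)) / (1 + k), with a**2 = c*s**2.
# The values of k and a2 below are illustrative assumptions only.
k, a2 = Fraction(3, 2), Fraction(5, 7)
c1 = cos_coeffs(a2, 4)
h1 = [Fraction(1)] + [k * cj / (1 + k) for cj in c1[1:]]
mu1 = truncated_correlations(h1, 4)

x = a2 / (2 * (k + 1))  # the auxiliary quantity c*s**2 / (2*(k+1))
expected1 = [
    k * x,
    k * (2 * k - 1) * x ** 2 / 6,
    k * (16 * k ** 2 - 13 * k + 1) * x ** 3 / 90,
    k * (272 * k ** 3 - 297 * k ** 2 + 60 * k - 1) * x ** 4 / 2520,
]

print(mu0)
print(mu1 == expected1)
```

Working in exact fractions avoids any rounding question when comparing the series coefficients with the closed forms.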

2.2. Newman’s inequalities. The function $V_N$ has a remarkable positivity property and its Taylor coefficients obey Newman’s inequalities (for a brief review of relevant part, see Appendix A):

$$0 \le \mu_{2n,N} \le \frac{1}{n}\,(2\mu_{4,N})^{n/2}, \quad n = 3, 4, 5, \cdots. \tag{2.8}$$

These inequalities follow from [15, Theorem 3, 6], since we have chosen the Ising spin distribution $h_0 = h_{I,s}$ and the function of η defined by

$$\int e^{\eta x}\, h_N(x)\, dx = \left\langle \exp\left(\eta \left(\frac{\sqrt{c}}{2}\right)^{N} \sum_{\theta} \phi_\theta\right) \right\rangle_{N,\,h_{I,s}} \tag{2.9}$$

has only pure imaginary zeros as is shown in [15, Theorem 1]. Note also that (1.2) and (1.6) imply

$$\mu_{2n+1,N} = 0, \quad n = 0, 1, 2, \cdots. \tag{2.10}$$
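As a sanity check of the bound (2.8), one can test it on the initial distribution, where the characteristic function is cos(sξ). The sketch below is our own illustration, with hypothetical helper names `minus_log_cos_coeffs` and `newman_bound_holds`; it is not part of the paper's computer-aided proof. Both sides of (2.8) scale as s^(2n), so it is enough to take s = 1, and the inequality is compared after squaring to stay in exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def minus_log_cos_coeffs(order):
    """Coefficients mu_{2n}, n = 1..order, of -log(cos(xi)) = sum mu_{2n} * xi**(2n)."""
    # Taylor coefficients of cos(xi) in the variable t = xi**2.
    h = [Fraction((-1) ** j, factorial(2 * j)) for j in range(order + 1)]
    L = [Fraction(0)] * (order + 1)  # coefficients of log(cos(xi))
    for j in range(1, order + 1):
        acc = sum((i * L[i] * h[j - i] for i in range(1, j)), Fraction(0))
        L[j] = h[j] - acc / j
    return {2 * n: -L[n] for n in range(1, order + 1)}


def newman_bound_holds(mu, n):
    """Check 0 <= mu_{2n} <= (1/n) * (2*mu_4)**(n/2) exactly, by comparing squares."""
    return mu[2 * n] >= 0 and (n * mu[2 * n]) ** 2 <= (2 * mu[4]) ** n


mu = minus_log_cos_coeffs(8)
for n in range(3, 9):
    print(n, newman_bound_holds(mu, n))
```

For n = 2 the same comparison reduces to an equality, which is a quick way to confirm that the normalization in `newman_bound_holds` matches (2.8).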


The bounds (2.8) are extensively used in this paper. We here note the following facts:
(1) The right-hand side of (2.7) has a nonzero radius of convergence.
(2) It suffices to prove $\lim_{N\to\infty} \mu_{4,N} = 0$ in order to ensure that $\mu_{2n,N}$, $n \ge 3$, converges to zero, hence the trajectory converges to the Gaussian fixed point.

2.3. Proof of Theorem 1.1. Let $h_0 = h_{I,s}$ and d = 4. Note the following simple observations on the “mass term” $\mu_{2,N}$, which is the variance of $h_N(x)\,dx$.
(1) $\mu_{2,N}$ is continuous in the Ising parameter s, because $h_N(x)\,dx$ is the result of a finite number of renormalization group transformations (1.2).
(2) $\mu_{2,N}$ is increasing in s, vanishes at s = 0, and diverges as $s \to \infty$.
We then put, for N = 0, 1, 2, ...,

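\[ \underline{s}_N = \inf\left\{ s > 0 \mid \mu_{2,N} \ge 1 \right\}, \tag{2.11} \]

\[ \overline{s}_N = \inf\left\{ s > 0 \;\middle|\; \mu_{2,N} \ge \min\left\{ 1 + \frac{3}{\sqrt{2}}\mu_{4,N},\; 2 + \sqrt{2} \right\} \right\}. \tag{2.12} \]

Obviously, we have $0 < \underline{s}_N \le \overline{s}_N < \infty$. Note also that

\[ 1 \le \mu_{2,N} \le 1 + \frac{3}{\sqrt{2}}\mu_{4,N} \tag{2.13} \]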

holds for $s \in [\underline{s}_N, \overline{s}_N]$. As is seen in Sect. 4, (2.13) is necessary for the model to be critical. We call this a critical mass condition. The following theorem states our result in the weak coupling regime and is proved in Sect. 4.

Theorem 2.1. Let $h_0 = h_{I,s}$ and d = 4. Assume that there exist integers $N_0$ and $N_1$, satisfying $N_0 \le N_1$, such that, for $s \in [\underline{s}_{N_1}, \overline{s}_{N_1}]$, the bounds

\[ 0 \le \mu_{4,N_0} \le 0.0045, \tag{2.14} \]

\[ 1.6\,\mu_{4,N_0}^2 \le \mu_{6,N_0} \le 6.07\,\mu_{4,N_0}^2, \tag{2.15} \]

\[ 0 \le \mu_{8,N_0} \le 48.469\,\mu_{4,N_0}^3, \tag{2.16} \]

and

\[ \mu_{2,N} < 2 + \sqrt{2}, \quad N_0 \le N < N_1, \tag{2.17} \]

hold. Then there exists an $s_c \in [\underline{s}_{N_1}, \overline{s}_{N_1}]$ such that if $s = s_c$ then

\[ \lim_{N\to\infty} \mu_{4,N} = 0, \qquad \lim_{N\to\infty} \mu_{2,N} = 1. \]

Triviality of Hierarchical Ising Model in Four Dimensions

Fig. 2.1. A schematic view of trajectories in the $(\mu_2, \mu_4)$-plane in Theorem 2.1 (horizontal axis $\mu_2$, vertical axis $\mu_4$). Trajectories for $s = \underline{s}_{N_1}$ and for $s = \overline{s}_{N_1}$ (solid lines) and the critical trajectory for $s = s_c$ (broken line) are shown. The Gaussian fixed point corresponds to the point (1.0, 0). The region defined by inequalities for $(\mu_2, \mu_4)$ analogous to (2.13) and (2.14) (and (2.17)) is shaded.

Remark. The original Bleher–Sinai argument takes $N_0 = N_1$. We include the $N_0 < N_1$ case, which makes it possible to complete our proof by evaluating various quantities only at the 2 endpoints of the interval under consideration for the Ising parameter s, instead of at all values in the interval, as is implicit in the assumptions of Theorem 2.1. This point will be clarified at the end of Sect. 5.3.

The following theorem states our result in the strong coupling regime and is proved in Sect. 5.

Theorem 2.2. The assumptions of Theorem 2.1 are satisfied for $N_0 = 70$ and $N_1 = 100$, where $\underline{s}_{N_1}$ and $\overline{s}_{N_1}$ satisfy

\[ 1.7925671170092624 \le \underline{s}_{N_1}, \qquad \overline{s}_{N_1} \le 1.7925671170092625. \]

Proof of Theorem 1.1 for d = 4 assuming Theorem 2.1 and Theorem 2.2. Theorem 2.1 and Theorem 2.2 imply that there exists $s_c \in [\underline{s}_{N_1}, \overline{s}_{N_1}]$ such that, for $s = s_c$, $\lim_{N\to\infty}\mu_{4,N} = 0$ and $\lim_{N\to\infty}\mu_{2,N} = 1$ hold. Then (2.6), (2.7), and (2.8) imply

\[ \lim_{N\to\infty} \hat h_N(\xi) = e^{-\xi^2}, \]

uniformly in $\xi$ on any closed interval in $\mathbb{R}$. It is easy to see that $e^{-\xi^2}$ is the characteristic function of the massless Gaussian measure $h_G$, hence Theorem 1.1 holds for d = 4. The bounds on $\underline{s}_{N_1}$ and $\overline{s}_{N_1}$ in Theorem 2.2 imply

\[ 1.7925671170092624 \le s_c \le 1.7925671170092625. \]


3. Truncated Correlations

In this section, we prepare basic (recursive) bounds on the truncated correlations that will be used in Sect. 4. The renormalization group transformation is decomposed as (2.3). Since the mapping $\mathcal{S}$ is simple, the essential part of our work is an analysis of $\mathcal{T}$. The outcome of this section is Proposition 3.1.

3.1. Recursions. Note first that in terms of $V_N$ the mapping $\mathcal{S}$ can be expressed as

\[ \mathcal{S}e^{-V_N}(\xi) = e^{-2V_N\left(\frac{\sqrt{c}}{2}\xi\right)}. \tag{3.1} \]

Using (2.7), (2.10), (1.4) we also have

\[ 2V_N\left(\frac{\sqrt{c}}{2}\xi\right) = \sum_{n=1}^{\infty} 2^{1-(1+2/d)n}\,\mu_{2n,N}\,\xi^{2n}. \tag{3.2} \]

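Next, write (2.5) as

\[ \mathcal{T}g = \mathrm{const.}\; g_{\beta/2}, \qquad g_t = \exp(-t\Delta)\,g, \tag{3.3} \]

where $\Delta g(\xi) = \dfrac{d^2 g}{d\xi^2}(\xi)$, and $\beta = \frac{1}{2}(\sqrt{2} - 1)$ for d = 4. $g_t$ is a solution to

\[ \frac{\partial g_t}{\partial t} = -\Delta g_t, \qquad g_0 = g. \]

Hence, if we put $g_t(\xi) = \exp(-V_t(\xi))$, then $V_t$ satisfies

\[ \frac{d}{dt}V_t = (\nabla V_t)^2 - \Delta V_t, \tag{3.4} \]

where $\nabla V_t(\xi) = \dfrac{\partial V_t}{\partial \xi}(\xi)$. In other words, $V_{N+1}$ is given as a solution of (3.4) at $t = \beta/2$ (modulo a constant term), with the initial condition (3.2) at t = 0. If we write

\[ V_t(\xi) = \sum_{n=0}^{\infty} \mu_{2n}(t)\,\xi^{2n}, \]

then (3.4) implies

\[ \frac{d}{dt}\mu_{2n}(t) = -(2n+2)(2n+1)\,\mu_{2n+2}(t) + \sum_{\ell=1}^{n} (2\ell)(2n-2\ell+2)\,\mu_{2\ell}(t)\,\mu_{2n-2\ell+2}(t). \tag{3.5} \]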


In particular, we have

\[ \frac{d}{dt}\mu_2(t) = 4\mu_2(t)^2 - 12\mu_4(t), \tag{3.6} \]

\[ \frac{d}{dt}\mu_4(t) = 16\mu_2(t)\mu_4(t) - 30\mu_6(t), \tag{3.7} \]

\[ \frac{d}{dt}\mu_6(t) = 24\mu_2(t)\mu_6(t) + 16\mu_4(t)^2 - 56\mu_8(t), \tag{3.8} \]

\[ \frac{d}{dt}\mu_8(t) = 32\mu_2(t)\mu_8(t) + 48\mu_4(t)\mu_6(t) - 90\mu_{10}(t). \tag{3.9} \]

Thus, $\mu_{2n,N}$ and $\mu_{2n,N+1}$ are related for d = 4 by, e.g.,

\[ \mu_2(0) = \frac{1}{\sqrt{2}}\mu_{2,N}, \quad \mu_4(0) = \frac{1}{4}\mu_{4,N}, \quad \mu_6(0) = \frac{1}{8\sqrt{2}}\mu_{6,N}, \quad \mu_8(0) = \frac{1}{32}\mu_{8,N}, \]

\[ \mu_{2,N+1} = \mu_2\left(\frac{\beta}{2}\right), \quad \mu_{4,N+1} = \mu_4\left(\frac{\beta}{2}\right), \quad \mu_{6,N+1} = \mu_6\left(\frac{\beta}{2}\right), \quad \mu_{8,N+1} = \mu_8\left(\frac{\beta}{2}\right). \]

3.2. Bounds. We first note that the quantities $\mu_n(t)$ obey Newman’s inequalities: by comparing (2.5) and (3.3) we see that the correspondence $V_N \to V(t)$ is obtained by a replacement $\beta \to 2t$ in (1.2). Therefore $\mu_n(t)$ also is a truncated n point correlation of a measure to which the arguments in [15] apply, hence an analogue of (2.8) holds:

\[ 0 \le \mu_{2n}(t) \le \frac{1}{n}\,(2\mu_4(t))^{n/2}, \quad n = 3, 4, 5, \dots. \tag{3.10} \]
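The general recursion (3.5) and its special cases (3.6)–(3.9) can be cross-checked with a few lines of code. The sketch below is an illustration only; `flow_rhs` is a hypothetical helper name, and the dictionary keyed by the even index 2k is just one convenient way to hold the values $\mu_{2k}(t)$.

```python
def flow_rhs(mu, n):
    """Right-hand side of (3.5) for d mu_{2n} / dt.

    `mu` maps the even index 2k to the current value mu_{2k}(t);
    it must contain the keys 2, 4, ..., 2n + 2.
    """
    linear = -(2 * n + 2) * (2 * n + 1) * mu[2 * n + 2]
    quadratic = sum(
        (2 * l) * (2 * n - 2 * l + 2) * mu[2 * l] * mu[2 * n - 2 * l + 2]
        for l in range(1, n + 1)
    )
    return linear + quadratic


# Arbitrary integer test values for mu_2, ..., mu_10 (not physical data).
mu = {2: 3, 4: 5, 6: 7, 8: 11, 10: 13}
explicit = [
    4 * mu[2] ** 2 - 12 * mu[4],                             # (3.6)
    16 * mu[2] * mu[4] - 30 * mu[6],                         # (3.7)
    24 * mu[2] * mu[6] + 16 * mu[4] ** 2 - 56 * mu[8],       # (3.8)
    32 * mu[2] * mu[8] + 48 * mu[4] * mu[6] - 90 * mu[10],   # (3.9)
]
general = [flow_rhs(mu, n) for n in range(1, 5)]
print(general == explicit)
```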

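We have to show decay of $\mu_{4,N}$ as $N \to \infty$. In case d > 4, the decay follows from (3.6) and (3.7) with d-dependent coefficients; namely, if we throw out the negative contributions $-\mu_4(t)$ and $-\mu_6(t)$ to the right-hand sides of (3.6) and (3.7), respectively, then we have upper bounds on $\mu_2(t)$ and $\mu_4(t)$. This argument eventually yields exponential decay of $\mu_{4,N}$. In case d = 4, the situation is more subtle, since the decay of $\mu_{4,N}$ is weak, i.e., powerlike instead of exponential. In order to derive the delicate bound on $\mu_4(t)$, a lower bound for $\mu_6(t)$ must be incorporated, which in turn needs an upper bound on $\mu_8(t)$. Thus, we have to deal with Eqs. (3.6)–(3.9). This is the principle of our estimation. The result is the following:

Proposition 3.1. Let d = 4 and N be a positive integer, and put

\[ r_N = \frac{1}{1 - (\sqrt{2}-1)(\mu_{2,N} - 1)} = \frac{1}{\sqrt{2} - (\sqrt{2}-1)\mu_{2,N}}, \tag{3.11} \]

\[ \zeta_N = \frac{\sqrt{2}\,r_N - 1}{\sqrt{2}\,\mu_{2,N}} = \frac{r_N}{\mu_{2,N}} - \frac{1}{\sqrt{2}\,\mu_{2,N}}. \tag{3.12} \]

(i) If

\[ \mu_{2,N} < 2 + \sqrt{2}, \tag{3.13} \]

then

\[ \mu_{2,N+1} \le r_N\,\mu_{2,N}, \tag{3.14} \]

\[ \mu_{2,N+1} \ge r_N\,\mu_{2,N} - 3r_N^2\,\zeta_N\,\mu_{4,N}. \tag{3.15} \]

(ii) If, furthermore,

\[ \frac{\mu_{4,N}}{4} \ge \frac{15}{8\sqrt{2}}\,\zeta_N\mu_{6,N} + \frac{21}{4}\,\zeta_N^2\mu_{4,N}^2, \tag{3.16} \]

\[ \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{1}{2}\,\zeta_N\mu_{4,N}^2 \ge 24\,\zeta_N^3\mu_{4,N}^3 + \frac{123}{8\sqrt{2}}\,\zeta_N^2\mu_{4,N}\mu_{6,N} + \frac{7}{8}\,\zeta_N\mu_{8,N}, \tag{3.17} \]

\[ \frac{3}{2}\,\zeta_N\mu_{4,N} \ge 12\,\zeta_N^3\mu_{4,N}^2 + \frac{45}{8\sqrt{2}}\,\zeta_N^2\mu_{6,N}, \tag{3.18} \]

then

\[ \mu_{2,N+1} \le r_N\,\mu_{2,N} - 3r_N^2\left( \zeta_N\mu_{4,N} - 8\zeta_N^3\mu_{4,N}^2 - \frac{15}{4\sqrt{2}}\,\zeta_N^2\mu_{6,N} \right), \tag{3.19} \]

\[ \mu_{4,N+1} \ge r_N^4\left( \mu_{4,N} - \frac{15}{2\sqrt{2}}\,\zeta_N\mu_{6,N} - 21\zeta_N^2\mu_{4,N}^2 \right), \tag{3.20} \]

\[ \mu_{4,N+1} \le r_N^4\left( \mu_{4,N} - \frac{15}{2\sqrt{2}}\,\zeta_N\mu_{6,N} - 21\zeta_N^2\mu_{4,N}^2 + \frac{705}{2\sqrt{2}}\,\zeta_N^3\mu_{4,N}\mu_{6,N} + 447\,\zeta_N^4\mu_{4,N}^3 + \frac{105}{4}\,\zeta_N^2\mu_{8,N} \right), \tag{3.21} \]

\[ \mu_{6,N+1} \le r_N^6\left( \frac{\mu_{6,N}}{\sqrt{2}} + 4\zeta_N\mu_{4,N}^2 \right), \tag{3.22} \]

\[ \mu_{6,N+1} \ge r_N^6\left( \frac{\mu_{6,N}}{\sqrt{2}} + 4\zeta_N\mu_{4,N}^2 - 192\,\zeta_N^3\mu_{4,N}^3 - \frac{123}{\sqrt{2}}\,\zeta_N^2\mu_{4,N}\mu_{6,N} - 7\zeta_N\mu_{8,N} \right), \tag{3.23} \]

\[ \mu_{8,N+1} \le r_N^8\left( \frac{\mu_{8,N}}{2} + \frac{12}{\sqrt{2}}\,\zeta_N\mu_{4,N}\mu_{6,N} + 24\,\zeta_N^2\mu_{4,N}^3 \right). \tag{3.24} \]

The rest of this section is devoted to a proof of Proposition 3.1.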
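To make the shape of these recursive bounds concrete, here is a small floating point sketch of (3.11), (3.12) and a few of the one-step bounds of Proposition 3.1. It is only an illustration under stated assumptions: the function names are hypothetical, ordinary floats are used (so nothing here is a rigorous bound, unlike the interval arithmetic of Sect. 5.3), and the bounds (3.20) and (3.22) are valid only when the hypotheses (3.16)–(3.18) hold, which this sketch does not check.

```python
from math import sqrt

SQRT2 = sqrt(2.0)


def r_and_zeta(mu2):
    """r_N and zeta_N of (3.11)-(3.12) for d = 4; requires 0 < mu2 < 2 + sqrt(2), cf. (3.13)."""
    if not 0.0 < mu2 < 2.0 + SQRT2:
        raise ValueError("condition (3.13) is violated")
    r = 1.0 / (SQRT2 - (SQRT2 - 1.0) * mu2)
    zeta = (SQRT2 * r - 1.0) / (SQRT2 * mu2)
    return r, zeta


def one_step_bounds(mu2, mu4, mu6):
    """Evaluate the right-hand sides of (3.14), (3.15), (3.20), (3.22)."""
    r, z = r_and_zeta(mu2)
    return {
        "mu2_upper": r * mu2,                                  # (3.14)
        "mu2_lower": r * mu2 - 3.0 * r ** 2 * z * mu4,         # (3.15)
        "mu4_lower": r ** 4 * (mu4 - 15.0 / (2.0 * SQRT2) * z * mu6
                               - 21.0 * z ** 2 * mu4 ** 2),    # (3.20)
        "mu6_upper": r ** 6 * (mu6 / SQRT2 + 4.0 * z * mu4 ** 2),  # (3.22)
    }


# At the Gaussian fixed point (mu2, mu4, mu6) = (1, 0, 0) the bounds collapse.
print(one_step_bounds(1.0, 0.0, 0.0))
```

At $\mu_{2,N} = 1$ one gets $r_N = 1$ and $\zeta_N = 1 - 1/\sqrt{2}$, so the upper and lower bounds on $\mu_{2,N+1}$ coincide when $\mu_{4,N} = 0$.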

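Proof. Observe that $\bar\mu_2(t)$ defined by

\[ \frac{d}{dt}\bar\mu_2(t) = 4\bar\mu_2(t)^2, \qquad \bar\mu_2(0) = \frac{1}{\sqrt{2}}\mu_{2,N}, \tag{3.25} \]

is an upper bound of $\mu_2(t)$:

\[ \mu_2(t) \le \bar\mu_2(t) = \frac{\mu_{2,N}}{\sqrt{2}}\;\frac{1}{1 - 2\sqrt{2}\,\mu_{2,N}\,t}. \tag{3.26} \]

This, at $t = \dfrac{\beta}{2} = \dfrac{\sqrt{2}-1}{4}$ for d = 4, implies (3.14).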


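Put

\[ M(t) = \frac{1}{1 - 2\sqrt{2}\,\mu_{2,N}\,t}, \qquad m(t) = \bar\mu_2(t) - \mu_2(t). \]

We have $m(t) \ge 0$, and (3.13) implies that M(t) is increasing in $t \in [0, \beta/2]$. By a change of variable $z = M(t) - 1$ ($dz = 2\sqrt{2}\,\mu_{2,N}M(t)^2\,dt$) and by putting

\[ \hat m(z) = m(t)/M(t)^2, \quad \hat\mu_4(z) = \mu_4(t)/M(t)^4, \quad \hat\mu_6(z) = \mu_6(t)/M(t)^6, \quad \hat\mu_8(z) = \mu_8(t)/M(t)^8, \]

we have, from (3.6)–(3.9),

\[ \hat\mu_4(z) = \frac{\mu_{4,N}}{4} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \left( -8\hat m(z)\hat\mu_4(z) - 15\hat\mu_6(z) \right) dz, \tag{3.27} \]

\[ \hat\mu_6(z) = \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \left( 8\hat\mu_4(z)^2 - 12\hat m(z)\hat\mu_6(z) - 28\hat\mu_8(z) \right) dz, \tag{3.28} \]

\[ \hat\mu_8(z) = \frac{\mu_{8,N}}{32} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \left( 24\hat\mu_4(z)\hat\mu_6(z) - 16\hat m(z)\hat\mu_8(z) - 45\hat\mu_{10}(z) \right) dz, \tag{3.29} \]

\[ \hat m(z) = \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \left( 6\hat\mu_4(z) - 2\hat m(z)^2 \right) dz. \tag{3.30} \]

Eqs. (3.27)–(3.30) with positivity of $\mu_{2n}(t)$ imply

\[ \hat\mu_4(z) \le \frac{\mu_{4,N}}{4}, \tag{3.31} \]

\[ \hat\mu_6(z) \le \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z 8\hat\mu_4(z)^2\,dz \le \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{\mu_{4,N}^2}{2\sqrt{2}\,\mu_{2,N}}\,z, \tag{3.32} \]

\[ \hat\mu_8(z) \le \frac{\mu_{8,N}}{32} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z 24\hat\mu_4(z)\hat\mu_6(z)\,dz \le \frac{\mu_{8,N}}{32} + \frac{3}{8}\,\frac{\mu_{4,N}\mu_{6,N}}{\mu_{2,N}}\,z + \frac{3}{4}\,\frac{\mu_{4,N}^3}{\mu_{2,N}^2}\,z^2, \tag{3.33} \]

\[ \hat m(z) \le \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z 6\hat\mu_4(z)\,dz \le \frac{3\mu_{4,N}}{2\sqrt{2}\,\mu_{2,N}}\,z. \tag{3.34} \]

In particular, (3.34) at $t = \beta/2$ ($z = M(\beta/2) - 1 = \sqrt{2}\,r_N - 1$ for d = 4) implies (3.15). Using (3.31), (3.32), (3.34) in (3.27), we have

\[ \hat\mu_4(z) \ge \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\,z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\,z^2. \tag{3.35} \]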


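Using (3.32), (3.33), (3.34), (3.35) in (3.28) and (3.30) we further have

\[ \hat\mu_6(z) \ge \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{\mu_{4,N}^2}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{12\mu_{4,N}^3}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{123\mu_{4,N}\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2 - \frac{7\mu_{8,N}}{8\sqrt{2}\,\mu_{2,N}}\,z, \tag{3.36} \]

\[ \hat m(z) \ge \frac{3\mu_{4,N}}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{6\mu_{4,N}^2}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{45\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2. \tag{3.37} \]

When d = 4, $\dfrac{\beta}{2} = \dfrac{\sqrt{2}-1}{4}$ and $z = M\left(\dfrac{\beta}{2}\right) - 1 = \sqrt{2}\,r_N - 1$ $\left( M\left(\dfrac{\beta}{2}\right) = \sqrt{2}\,r_N \right)$. Then the assumptions (3.16)–(3.18) of Proposition 3.1 imply that the right-hand sides of (3.35), (3.36), and (3.37) are non-negative at $t = \beta/2$. On the other hand, they are concave in z for $z \ge 0$. Recall also that $z = M(t) - 1$ is increasing in $t \in [0, \beta/2]$. Therefore, they are non-negative for all $t \in [0, \beta/2]$. Using (3.35), (3.36), and (3.37) in (3.27), we therefore have

\[ \begin{aligned} \hat\mu_4(z) \le{} & \frac{\mu_{4,N}}{4} - \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \Bigg[ 8\left( \frac{3\mu_{4,N}}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{6\mu_{4,N}^2}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{45\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2 \right) \left( \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\,z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\,z^2 \right) \\ & \quad + 15\left( \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{\mu_{4,N}^2}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{12\mu_{4,N}^3}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{123\mu_{4,N}\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2 - \frac{7\mu_{8,N}}{8\sqrt{2}\,\mu_{2,N}}\,z \right) \Bigg] dz \\ \le{} & \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\,z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\,z^2 + \frac{705\mu_{4,N}\mu_{6,N}}{32\mu_{2,N}^3}\,z^3 + \frac{447\mu_{4,N}^3}{16\mu_{2,N}^4}\,z^4 + \frac{105\mu_{8,N}}{32\mu_{2,N}^2}\,z^2. \end{aligned} \tag{3.38} \]

Recalling that at $t = \beta/2$ ($z = M(\beta/2) - 1 = \sqrt{2}\,r_N - 1$) we have

\[ \bar\mu_2\left(\frac{\beta}{2}\right) = r_N\,\mu_{2,N}, \]

\[ \mu_{2,N+1} = r_N\,\mu_{2,N} - \hat m(\sqrt{2}\,r_N - 1)\,M\left(\frac{\beta}{2}\right)^2, \]

\[ \mu_{4,N+1} = \hat\mu_4(\sqrt{2}\,r_N - 1)\,M\left(\frac{\beta}{2}\right)^4, \quad \mu_{6,N+1} = \hat\mu_6(\sqrt{2}\,r_N - 1)\,M\left(\frac{\beta}{2}\right)^6, \quad \mu_{8,N+1} = \hat\mu_8(\sqrt{2}\,r_N - 1)\,M\left(\frac{\beta}{2}\right)^8, \]

we see that (3.37), (3.35), (3.38), (3.32), (3.36), (3.33) imply (3.19)–(3.24), respectively. This completes the proof of Proposition 3.1.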


4. Bleher–Sinai Argument

In order to show Theorem 2.1, we confirm the existence of a critical parameter $s = s_c$ by means of the Bleher–Sinai argument, and, at the same time, we derive the expected decay of $\mu_{4,N}$. In the Bleher–Sinai argument, monotonicity of $\underline{s}_N$ and $\overline{s}_N$ with respect to N is essential.

Proposition 4.1. Let d = 4. Then the following hold:
(1) If $\mu_{2,N} - 1 < 0$ then $\mu_{2,N+1} < \mu_{2,N}$.
(2) If $\dfrac{1}{4} > \mu_{2,N} - 1 \ge \dfrac{3}{\sqrt{2}}\mu_{4,N}$ then $\mu_{2,N+1} \ge \mu_{2,N}$.

Proof. Note that for both cases in the statement, the assumption (3.13) in Proposition 3.1 holds. Hence, (3.14), with (3.11) and monotonicity of $\mu_{2,N}$, implies

\[ \mu_{2,N} - 1 < 0 \;\Rightarrow\; r_N < 1 \;\Rightarrow\; \mu_{2,N+1} < \mu_{2,N}. \tag{4.1} \]

Next we see that (3.15), with (3.11) and (3.12), implies

\[ \frac{\mu_{2,N} - 1}{\mu_{4,N}} \ge \frac{3r_N(\sqrt{2}\,r_N - 1)}{(2 - \sqrt{2})\,\mu_{2,N}^2} \;\Rightarrow\; \mu_{2,N+1} \ge \mu_{2,N}. \tag{4.2} \]

Put

\[ L_1(x) = \frac{3}{\sqrt{2}\,x\,(\sqrt{2} - (\sqrt{2}-1)x)^2}. \]

Then by straightforward calculation we see

\[ 1 \le x \le \frac{5}{4} \;\Rightarrow\; L_1(x) \le L_1(1) = \frac{3}{\sqrt{2}}, \]

and (3.11) implies

\[ L_1(\mu_{2,N}) = \frac{3r_N(\sqrt{2}\,r_N - 1)}{(2 - \sqrt{2})\,\mu_{2,N}^2}. \]

Therefore (4.2) implies that

\[ \frac{1}{4} > \mu_{2,N} - 1 \ge \frac{3}{\sqrt{2}}\mu_{4,N} \;\Rightarrow\; \mu_{2,N+1} \ge \mu_{2,N}. \tag{4.3} \]

Corollary 4.2. Let d = 4. Then, for the $\underline{s}_N$ defined in (2.11), it holds that $\underline{s}_N \le \underline{s}_{N+1}$.

Proof. Since $\mu_{2,N}$ is increasing in s, if $s < \underline{s}_N$ then $\mu_{2,N} < 1$, hence Proposition 4.1 implies $\mu_{2,N+1} < \mu_{2,N} < 1$, further implying $s < \underline{s}_{N+1}$. Hence the statement holds.
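The “straightforward calculation” for $L_1$ used in the proof of Proposition 4.1 can be spot-checked numerically. The following sketch is not a proof: it assumes the form $L_1(x) = 3/(\sqrt{2}\,x(\sqrt{2} - (\sqrt{2}-1)x)^2)$ as read from the text, samples the interval $[1, 5/4]$ on a grid, and compares the largest sampled value with $L_1(1) = 3/\sqrt{2}$.

```python
from math import sqrt


def L1(x):
    """L_1(x) = 3 / (sqrt(2) x (sqrt(2) - (sqrt(2) - 1) x)^2), as in the proof of Proposition 4.1."""
    s = sqrt(2.0)
    return 3.0 / (s * x * (s - (s - 1.0) * x) ** 2)


# Uniform grid on [1, 5/4]; the grid size 1000 is an arbitrary choice.
grid = [1.0 + 0.25 * k / 1000 for k in range(1001)]
worst = max(L1(x) for x in grid)
print(worst, L1(1.0), 3.0 / sqrt(2.0))
```

The denominator $x(\sqrt{2} - (\sqrt{2}-1)x)^2$ equals 1 at x = 1, first increases and then decreases, and is still slightly above 1 at x = 5/4, which is consistent with the choice of the endpoint 5/4.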


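For later convenience, define

\[ r_N^{*} = \frac{1}{1 - (\sqrt{2}-1)\dfrac{3}{\sqrt{2}}\mu_{4,N}}, \tag{4.4} \]

\[ \zeta_{*N} = 1 - \frac{1}{\sqrt{2}}, \tag{4.5} \]

\[ \zeta_N^{*} = \frac{\sqrt{2}\,r_N^{*} - 1}{\sqrt{2}\left( 1 + \dfrac{3}{\sqrt{2}}\mu_{4,N} \right)}. \tag{4.6} \]

Then we see that if (2.13) holds, then we have, from (3.11) and (3.12), …

\[ b_{n,N} = \left(\frac{c}{4}\right)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} a_{\ell,N}\,a_{n-\ell,N} \le \left(\frac{c}{4}\right)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} a_{1,N}^{n} \times \frac{a_{M,N}}{a_{1,N}^{M}} = \left(\frac{c\,a_{1,N}}{2}\right)^{n} \frac{a_{M,N}}{a_{1,N}^{M}}. \tag{5.26} \]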


Therefore 2a¯ ",N ≤

aM,N c a1,N " M 2 a1,N

≤

aM,N M a1,N

=

aM,N M a1,N

=

aM,N M a1,N

∞

m (2m + 2" − 1)!! (2m)!! (2" − 1)!! m=2M+1−"

∞ c a " m m + " 1,N βc a1,N " 2 m=2M+1−"

∞ c a " 2M+1−" k 2M + 1 + k 1,N βc a1,N βc a1,N " 2 k=0

"

∞ 2M+1 k 2M + 1 + k 1 . (5.27) βc a1,N βc a1,N " 2β βc a1,N

k=0

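Here,

\[ T_{2M+1,\ell}(r) = \sum_{k=0}^{\infty} (\beta c\,a_{1,N})^{k} \binom{2M+1+k}{\ell} = \sum_{k=0}^{\infty} r^{k} \binom{2M+1+k}{\ell} = \frac{1}{1-r} \sum_{m=0}^{\ell} \binom{2M+1}{\ell-m} q^{m}, \tag{5.28} \]

where $r = \beta c\,a_{1,N}$, and $q = \dfrac{r}{1-r}$. By assumption $r < \frac{1}{2}$. The binomial coefficient in the summand is largest when m = 0, because $2M + 1 > 2M \ge 2\ell$. Therefore,

\[ T_{2M+1,\ell}(r) \le \frac{1}{1-r} \binom{2M+1}{\ell} \sum_{m=0}^{\ell} q^{m} \le \frac{1}{1-r}\,\frac{1}{1-q} \binom{2M+1}{\ell} = \frac{1}{1-2r} \binom{2M+1}{\ell}. \tag{5.29} \]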
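The resummation in (5.28) and the bound (5.29) are easy to test numerically. The sketch below is an illustration under assumptions: the function names are hypothetical, the infinite series is replaced by a long partial sum, and the sample values M = 3, ℓ = 3, r = 0.3 are arbitrary choices satisfying $2\ell \le 2M$ and $r < 1/2$.

```python
from math import comb


def T_series(n, l, r, terms=2000):
    """Partial sum of sum_{k>=0} r^k C(n + k, l), the series in (5.28) with n = 2M + 1."""
    return sum(r ** k * comb(n + k, l) for k in range(terms))


def T_closed(n, l, r):
    """Closed form (1 / (1 - r)) sum_{m=0}^{l} C(n, l - m) q^m with q = r / (1 - r)."""
    q = r / (1.0 - r)
    return sum(comb(n, l - m) * q ** m for m in range(l + 1)) / (1.0 - r)


def T_bound(n, l, r):
    """Right-hand side of (5.29): C(n, l) / (1 - 2r), valid for r < 1/2 and 2l < n."""
    return comb(n, l) / (1.0 - 2.0 * r)


M, l, r = 3, 3, 0.3
n = 2 * M + 1
print(T_series(n, l, r), T_closed(n, l, r), T_bound(n, l, r))
```

The identity behind the closed form is Vandermonde’s convolution $\binom{n+k}{\ell} = \sum_m \binom{n}{\ell-m}\binom{k}{m}$ combined with $\sum_k \binom{k}{m} r^k = r^m/(1-r)^{m+1}$.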

This proves

2a¯ ",N ≤

1 2β

"

2M+1

βc a1,N aM,N 2M + 1 × M ≤ 2a¯ ",N , " 1 − 2βc a1,N a1,N

where 2a¯ ",N is defined in (5.14). This proves a˜ n,N ≤ a˜¯ n,N .

(5.30)

Remark. We can “improve” Proposition 5.1 by employing (correct) bounds, in a similar way as the term proportional to $\left( \dfrac{c\,\bar a_{1,N}}{2} \right)^{n}$ in (5.9). In actual calculations, we improve $\bar a_{n,N+1}$, n = 1, 2, ..., M, in (5.12), the upper bounds for the $a_{n,N+1}$’s, using (A.6) (as well as its special case (5.5)). To be more specific, we compare $\bar a_{4,N+1}$ in (5.12) with $\bar a_{2,N+1}^{2}$ and replace the definition if the latter is smaller. Then we go on to “improve” $\bar a_{6,N+1}$ by comparing with $\bar a_{2,N+1}\,\bar a_{4,N+1}$, and so on. Conceptually there is nothing really new here, but this procedure improves the actual value of the bounds in Proposition 5.1.


5.3. Computer results. In this subsection we prove Theorem 2.2 on computers using Proposition 5.1. We double checked the results with Mathematica and C++ programs based on interval arithmetic. Here we give results from the C++ programs.

Our program employs interval arithmetic, which gives rigorous bounds numerically. The idea is to express a number by a pair of “vectors”, each of which consists of an array of length M of “digits”, taking values in {0, 1, 2, ..., 9}, and an integer corresponding to the “exponent”. To give a simple example, let M = 2. One can view 0.0523 as expressed in the program, for example, as $I_1 = [5.2 \times 10^{-2}, 5.3 \times 10^{-2}]$, and 3 as $I_2 = [3.0 \times 10^{0}, 3.0 \times 10^{0}]$. When the division $I_1/I_2$ is performed, our program routines are designed so that they give correct bounds as an output. Namely, the computer output of $I_1/I_2$ will be $[1.7 \times 10^{-2}, 1.8 \times 10^{-2}]$. We may occasionally lose the best possible bounds, but the program is designed so that we never lose the correctness of the bounds. Thus all the outputs are rigorous bounds of the corresponding quantities. In the actual calculation we took M = 70 digits, which turned out to be sufficient.

We also note that interval arithmetic is employed in [14] for the hierarchical model in d = 3 dimensions. We took an independent approach in programming (we focused on ease of implementing the interval arithmetic in main programs developed for standard floating point calculations), so that the structure and details of the programs are quite different. However, our numerical calculations are not heavy enough to require anything special. For the program which we used for our proof, see the supplement to [17].

As will be explained below, we only need to consider 2 values for the initial Ising parameter s: $s_- = 1.7925671170092624$, and $s_+ = 1.7925671170092625$. We perform explicit recursion on computers for each $s = s_\pm$ using Proposition 5.1. We summarize what is left to be proved:

(1) $\bar a_{1,N} < \dfrac{1}{2\beta c}$, $0 \le s \le \overline{s}_{N_1}$, $0 \le N \le N_1$, where $N_1 = 100$. This condition is from (5.15), imposed because we are going to do the evaluation using Proposition 5.1. Note that this condition is stronger than (2.17) in the assumptions of Theorem 2.1, because $\dfrac{1}{2\beta c} = \dfrac{1}{2}(2 + \sqrt{2}) = 1.707\cdots$ for d = 4.

(2) $s_- \le \underline{s}_{N_1}$ and $\overline{s}_{N_1} \le s_+$. To prove this, it is sufficient (as seen from the definitions (2.11) and (2.12)) to prove

\[ \mu_{2,N_1} < 1, \ \text{when } s = s_-, \qquad \mu_{2,N_1} \ge 1 + \frac{3}{\sqrt{2}}\mu_{4,N_1}, \ \text{when } s = s_+. \tag{5.31} \]

(3) For any s satisfying $s_- \le s \le s_+$, the bounds

\[ (0 \le)\ \mu_{4,N_0} \le 0.0045, \tag{5.32} \]

\[ 1.6\,\mu_{4,N_0}^2 \le \mu_{6,N_0} \le 6.07\,\mu_{4,N_0}^2, \tag{5.33} \]

\[ (0 \le)\ \mu_{8,N_0} \le 48.469\,\mu_{4,N_0}^3, \tag{5.34} \]

hold for $N_0 = 70$. This condition comes from the assumptions in Theorem 2.1 (sufficient, if $s_- \le \underline{s}_{N_1}$ and $\overline{s}_{N_1} \le s_+$).

We now summarize our results from explicit calculations.
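Before turning to the numbers, the outward-rounding idea described in this subsection can be sketched in a few lines. This is not the authors’ C++ program: it is a minimal illustration using Python’s standard `decimal` module, the helper name `interval_div` is hypothetical, and only division by a strictly positive interval is handled. The lower endpoint is rounded toward $-\infty$ and the upper endpoint toward $+\infty$, so the true quotient is always enclosed.

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING


def interval_div(a, b, digits):
    """Enclose {x / y : x in a, y in b} with `digits` significant digits.

    `a` and `b` are (lo, hi) pairs of Decimal; `b` must be strictly positive.
    """
    (alo, ahi), (blo, bhi) = a, b
    if blo <= 0:
        raise ValueError("this sketch only handles strictly positive divisors")
    down = Context(prec=digits, rounding=ROUND_FLOOR)    # rounds toward -infinity
    up = Context(prec=digits, rounding=ROUND_CEILING)    # rounds toward +infinity
    # Smallest quotient: divide the lower endpoint by the divisor that minimises it.
    lo = down.divide(alo, bhi if alo >= 0 else blo)
    # Largest quotient: divide the upper endpoint by the divisor that maximises it.
    hi = up.divide(ahi, blo if ahi >= 0 else bhi)
    return lo, hi


I1 = (Decimal("0.052"), Decimal("0.053"))   # encloses 0.0523 with M = 2 digits
I2 = (Decimal("3.0"), Decimal("3.0"))
print(interval_div(I1, I2, digits=2))
```

With two significant digits this reproduces the enclosure $[1.7 \times 10^{-2}, 1.8 \times 10^{-2}]$ quoted in the text; raising `digits` to 70 would mimic the precision used in the actual calculation.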


(1) We have $\bar a_{1,N} \le \dfrac{1}{2}s_+^2 = 1.6066\cdots$, $0 \le s \le s_+$, $0 \le N \le N_1$. The largest value for $\bar a_{1,N}$ in the range of parameters is actually obtained at $s = s_+$ and N = 0.

(2) Our calculations turned out to be accurate enough to obtain more than 40 digits below the decimal point correctly for $\mu_{2,100}$ and $\mu_{4,100}$ at $s = s_\pm$, which is more than enough to prove (5.31). In fact, we have

\[ 0.99609586499804791366176669341357334889503943 \le \underline{a}_{1,100} \le \mu_{2,100} \le \bar a_{1,100} \le 0.99609586499804791366176669341357334889503972, \]

at $s = s_-$, and

\[ 1.0131857903720691722396611098376636943838027 \le \underline{a}_{1,100} \le \mu_{2,100} \le \bar a_{1,100} \le 1.0131857903720691722396611098376636943838031, \]

\[ 0.00281027097809098768088795100753480139767915 \le \tfrac{1}{2}\left( -\bar a_{2,100} + \underline{a}_{1,100}^{2} \right) \le \mu_{4,100} \le \tfrac{1}{2}\left( -\underline{a}_{2,100} + \bar a_{1,100}^{2} \right) \le 0.00281027097809098768088795100753480139767969, \]

at $s = s_+$.

(3) To prove (5.32)–(5.34), we note the following. Let us write the s dependences of $a_{n,N}$ and $\mu_{n,N}$ explicitly, like $a_{n,N}(s)$ and $\mu_{n,N}(s)$. For any integer N and for any s satisfying $s_- \le s \le s_+$, the monotonicity of $a_{n,N}(s)$ with respect to s implies

\[ \mu_{4,N}(s) = \frac{1}{2}\left( -a_{2,N}(s) + a_{1,N}(s)^2 \right) \le \frac{1}{2}\left( -a_{2,N}(s_-) + a_{1,N}(s_+)^2 \right) =: \bar\mu_{4,N}. \tag{5.35} \]

Hence if we can prove $\bar\mu_{4,70} \le 0.0045$, then we have proved (5.32). In a similar way, sufficient conditions for (5.33) and (5.34) are

\[ 1.6 \le \frac{\underline{\mu}_{6,70}}{\bar\mu_{4,70}^{2}}, \qquad \frac{\bar\mu_{6,70}}{\underline{\mu}_{4,70}^{2}} \le 6.07, \qquad \frac{\bar\mu_{8,70}}{\underline{\mu}_{4,70}^{3}} \le 48.469, \]

with obvious definitions (as in (5.35) for $\bar\mu_{4,N}$) for $\underline{\mu}_{n,70}$ and $\bar\mu_{n,70}$. The bounds we have for these quantities are (we shall not waste space by writing too many digits):

\[ \bar\mu_{4,70} \le 0.004144, \qquad 3.6459 \le \frac{\underline{\mu}_{6,70}}{\bar\mu_{4,70}^{2}}, \qquad \frac{\bar\mu_{6,70}}{\underline{\mu}_{4,70}^{2}} \le 3.7542, \qquad \frac{\bar\mu_{8,70}}{\underline{\mu}_{4,70}^{3}} \le 38.488. \]
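The final comparisons are plain numerical inequalities, so they can be re-checked from the digits printed above. The sketch below only re-verifies the reported endpoint values; it does not reproduce the recursion itself. It assumes that the two conditions to be checked at $s = s_\mp$ are $\mu_{2,100} < 1$ and $\mu_{2,100} \ge 1 + \frac{3}{\sqrt{2}}\mu_{4,100}$, as suggested by the definitions (2.11) and (2.12).

```python
from decimal import Decimal as D, getcontext

getcontext().prec = 60  # more digits than the reported bounds carry

# Reported endpoint values (copied from the text above).
mu2_upper_at_s_minus = D("0.99609586499804791366176669341357334889503972")
mu2_lower_at_s_plus = D("1.0131857903720691722396611098376636943838027")
mu4_upper_at_s_plus = D("0.00281027097809098768088795100753480139767969")

three_over_sqrt2 = D(3) / D(2).sqrt()

checks = {
    "mu2 < 1 at s-": mu2_upper_at_s_minus < 1,
    "mu2 >= 1 + (3/sqrt2) mu4 at s+":
        mu2_lower_at_s_plus >= 1 + three_over_sqrt2 * mu4_upper_at_s_plus,
    "(5.32)": D("0.004144") <= D("0.0045"),
    "(5.33) lower": D("1.6") <= D("3.6459"),
    "(5.33) upper": D("3.7542") <= D("6.07"),
    "(5.34)": D("38.488") <= D("48.469"),
}
for name, ok in checks.items():
    print(name, ok)
```

The margins are comfortable: for instance $1 + \frac{3}{\sqrt{2}} \times 0.00281\ldots \approx 1.00596$, well below the reported lower bound $1.01318\ldots$ for $\mu_{2,100}$ at $s = s_+$.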

This completes the proof of Theorem 2.2, and therefore Theorem 1.1 is proved.

Acknowledgement. The authors would like to thank Yoichiro Takahashi for his interest in the present work and for discussions. Part of this work was done while T. Hara was at the Department of Mathematics, Tokyo Institute of Technology. The research of T. Hara and T. Hattori is partially supported by a Grant-in-Aid for Scientific Research (C) of the Ministry of Education, Science, Sports and Culture.


A. Newman’s Inequalities

Let X be a stochastic variable which is in class $\mathcal{L}$ of [15]. $X \in \mathcal{L}$ has the Lee-Yang property, which states that the zeros of the moment generating function $E\left[ e^{HX} \right]$ are pure imaginary. In fact, it is shown in [15, Prop. 2] using Hadamard’s Theorem that $E\left[ e^{HX} \right]$ has the following expression:

\[ E\left[ e^{HX} \right] = e^{bH^2} \prod_{j} \left( 1 + \frac{H^2}{\alpha_j^2} \right), \tag{A.1} \]

where b is a non-negative constant and $\alpha_j$, j = 1, 2, 3, ..., is a positive nondecreasing sequence satisfying $\sum_{j=1}^{\infty} \alpha_j^{-2} < \infty$.

Consequences of (A.1) in terms of inequalities among moments (n point functions) are given in [15], among which we note the following:

1. Positivity [15, Theorem 3]. Put

\[ \mu_{2n} = -\frac{1}{(2n)!} \left. \frac{d^{2n}}{d\xi^{2n}} \log E\left[ e^{\sqrt{-1}\,\xi X} \right] \right|_{\xi=0}. \tag{A.2} \]

Then,

\[ \mu_{2n} \ge 0, \quad n = 0, 1, 2, \dots. \tag{A.3} \]

(Note that (A.1) implies $\mu_{2n+1} = 0$.)

2. Newman’s bound [15, Theorem 6]. Put $v_{2n} = n\mu_{2n}$. Then,

\[ v_{4n} \le v_4^{\,n}, \qquad v_6 \le \sqrt{v_4 v_8}, \qquad v_{4n+2} \le v_6\,v_4^{\,n-1}, \tag{A.4} \]

where the first and third inequalities follow from (2.10) of [15], while the second one is (2.12) of [15]. These imply $v_{2n} \le v_4^{\,n/2}$, $n \ge 2$, and therefore

\[ \mu_{2n} \le \frac{(2\mu_4)^{n/2}}{n}, \quad n = 2, 3, 4, \dots. \tag{A.5} \]

Furthermore, we will prove the following.

Proposition A.1. Put $a_N = \dfrac{N!}{(2N)!}\,E\left[ X^{2N} \right]$, $N \in \mathbb{Z}_+$. Then,

\[ a_{M+N} \le a_M\,a_N, \quad N, M = 0, 1, 2, \dots. \tag{A.6} \]

Proof. Put $y_j = \alpha_j^{-2} > 0$. Then

\[ E\left[ e^{HX} \right] = e^{bH^2} \prod_{j} \left( 1 + H^2 y_j \right). \tag{A.7} \]


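Expand the infinite product to obtain

\[ \prod_{j} \left( 1 + H^2 y_j \right) = 1 + H^2 \sum_{j} y_j + \frac{H^4}{2!} {\sum_{i,j}}' y_i y_j + \frac{H^6}{3!} {\sum_{i,j,k}}' y_i y_j y_k + \dots = \sum_{n=0}^{\infty} \frac{H^{2n}}{n!}\,c_n, \tag{A.8} \]

with

\[ c_n = {\sum_{i_1, i_2, \dots, i_n}}' y_{i_1} y_{i_2} y_{i_3} \cdots y_{i_n}, \tag{A.9} \]

where primed summations denote summations over non-coinciding indices. Hence we have

\[ E\left[ e^{HX} \right] = \sum_{N=0}^{\infty} H^{2N} \sum_{m,n:\, m+n=N} \frac{b^m}{m!}\,\frac{c_n}{n!} = \sum_{N=0}^{\infty} H^{2N} \sum_{n=0}^{N} \frac{b^{N-n}\,c_n}{(N-n)!\,n!}. \tag{A.10} \]

Comparing with $E\left[ e^{HX} \right] = \sum_{N=0}^{\infty} \dfrac{a_N}{N!} H^{2N}$, we obtain

\[ a_N = \sum_{n=0}^{N} \binom{N}{n} b^{N-n} c_n. \]

Note that (A.9) implies

\[ c_{n+m} \le c_m\,c_n, \tag{A.11} \]

because the conditions of primed summations are weaker for the left-hand side. This with $b \ge 0$ implies

\[ \begin{aligned} a_M a_N &= \sum_{m=0}^{M} \sum_{n=0}^{N} \binom{M}{m} \binom{N}{n} b^{M+N-m-n}\,c_m c_n \\ &\ge \sum_{m=0}^{M} \sum_{n=0}^{N} \binom{M}{m} \binom{N}{n} b^{M+N-m-n}\,c_{m+n} \\ &= \sum_{\ell=0}^{M+N} b^{M+N-\ell}\,c_\ell \sum_{m:\, 0 \le m \le M,\ 0 \le \ell-m \le N} \binom{M}{m} \binom{N}{\ell-m} \\ &= \sum_{\ell=0}^{M+N} b^{M+N-\ell}\,c_\ell \binom{M+N}{\ell} = a_{M+N}, \end{aligned} \]

where, in the last line, we also used

\[ \sum_{m:\, 0 \le m \le M,\ 0 \le \ell-m \le N} \binom{M}{m} \binom{N}{\ell-m} = \binom{M+N}{\ell}, \tag{A.12} \]

which is seen to hold if we compare the coefficients of $x^{\ell}$ in the identity $(1 + x)^{M+N} = (1 + x)^M (1 + x)^N$.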
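Proposition A.1 can be illustrated on a finite product. The sketch below is an example under assumptions, not part of the paper: it truncates the product in (A.7) to finitely many factors with hypothetical values $y_j = 1/j^2$ and $b = 1/3$, computes $c_n$ as $n!$ times the n-th elementary symmetric polynomial of the $y_j$ (which is what the primed sum in (A.9) gives), and then checks (A.6) in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb, factorial


def elementary_symmetric(ys, nmax):
    """Return [e_0, ..., e_nmax], the elementary symmetric polynomials of ys."""
    e = [Fraction(1)] + [Fraction(0)] * nmax
    for y in ys:
        # Multiply the generating polynomial by (1 + y x), highest degree first.
        for n in range(nmax, 0, -1):
            e[n] += y * e[n - 1]
    return e


def a_coeffs(b, ys, nmax):
    """a_N = sum_{n=0}^{N} C(N, n) b^(N-n) c_n with c_n = n! e_n(ys), for N = 0..nmax."""
    e = elementary_symmetric(ys, nmax)
    c = [factorial(n) * e[n] for n in range(nmax + 1)]
    return [
        sum(comb(N, n) * b ** (N - n) * c[n] for n in range(N + 1))
        for N in range(nmax + 1)
    ]


# Hypothetical sample data: b = 1/3 and y_j = 1/j^2 for j = 1..5.
b = Fraction(1, 3)
ys = [Fraction(1, j * j) for j in range(1, 6)]
a = a_coeffs(b, ys, 8)
submultiplicative = all(
    a[M + N] <= a[M] * a[N] for M in range(9) for N in range(9 - M)
)
print(submultiplicative)
```

Since $c_n$ vanishes for n larger than the number of factors, the truncated example still satisfies (A.11), so the proof above applies to it verbatim.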


References

1. Bleher, P.M. and Sinai, Ya.G.: Investigation of the critical point in models of the type of Dyson’s hierarchical model. Commun. Math. Phys. 33, 23–42 (1973)
2. Bleher, P.M. and Sinai, Ya.G.: Critical indices for Dyson’s asymptotically hierarchical models. Commun. Math. Phys. 45, 247–278 (1975)
3. Collet, P. and Eckmann, J.-P.: A renormalization group analysis of the hierarchical model in statistical physics. Springer Lecture Note in Physics 74, 1978
4. Dyson, F.J.: Existence of a phase-transition in a one-dimensional Ising ferromagnet. Commun. Math. Phys. 12, 91–107 (1969)
5. Gallavotti, G.: Some aspects of the renormalization problems in statistical mechanics. Memorie dell’ Accademia dei Lincei 15, 23–59 (1978)
6. Gawędzki, K. and Kupiainen, A.: Triviality of $\phi^4_4$ and all that in a hierarchical model approximation. J. Stat. Phys. 29, 683–699 (1982)
7. Gawędzki, K. and Kupiainen, A.: Non-Gaussian fixed points of the block spin transformation. Hierarchical model approximation. Commun. Math. Phys. 89, 191–220 (1983)
8. Gawędzki, K. and Kupiainen, A.: Nongaussian Scaling limits. Hierarchical model approximation. J. Stat. Phys. 35, 267–284 (1984)
9. Gawędzki, K. and Kupiainen, A.: Asymptotic freedom beyond perturbation theory. In: K. Osterwalder and R. Stora, eds., Critical Phenomena, Random Systems, Gauge Theories. Les Houches 1984, Amsterdam: North-Holland, 1986
10. Gawędzki, K. and Kupiainen, A.: Massless lattice $\phi^4_4$ Theory: Rigorous control of a renormalizable asymptotically free model. Commun. Math. Phys. 99, 199–252 (1985)
11. Koch, H. and Wittwer, P.: A non-Gaussian renormalization group fixed point for hierarchical scalar lattice field theories. Commun. Math. Phys. 106, 495–532 (1986)
12. Koch, H. and Wittwer, P.: On the renormalization group transformation for scalar hierarchical models. Commun. Math. Phys. 138, 537–568 (1991)
13. Koch, H. and Wittwer, P.: A nontrivial renormalization group fixed point for the Dyson–Baker hierarchical model. Commun. Math. Phys. 164, 627–647 (1994)
14. Koch, H. and Wittwer, P.: Bounds on the zeros of a renormalization group fixed point. Mathematical Physics Electronic Journal 1, No. 6 (24pp.) (1995)
15. Newman, C.M.: Inequalities for Ising models and field theories which obey the Lee–Yang theorem. Commun. Math. Phys. 41, 1–9 (1975)
16. Sinai, Ya.G.: Theory of phase transition: Rigorous results. New York: Pergamon Press, 1982
17. Hara, T., Hattori, T., and Watanabe, H.: Triviality of hierarchical Ising Model in four dimensions. Archived in mp_arc (Mathematical Physics Preprint Archive, http://www.ma.utexas.edu/mp_arc/) 00-397

Communicated by D. C. Brydges

Commun. Math. Phys. 220, 41 – 67 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Geometric Optics and Long Range Scattering for One-Dimensional Nonlinear Schrödinger Equations

Rémi Carles

Antenne de Bretagne de l’ENS Cachan and IRMAR, Campus de Ker Lann, 35 170 Bruz, France.

Received: 23 May 2000 / Accepted: 8 January 2001

Abstract: With the methods of geometric optics used in [2], we provide a new proof of some results of [11], to construct modified wave operators for the one-dimensional cubic Schrödinger equation. We improve the rate of convergence of the nonlinear solution towards the simplified evolution, and get better control of the loss of regularity in Sobolev spaces. In particular, using the results of [9], we deduce the existence of a modified scattering operator with small data in some Sobolev spaces. We show that in terms of geometric optics, this gives rise to a “random phase shift” at a caustic.

Contents
1. Introduction
2. Formal Computations
3. Estimates on Some Oscillatory Integrals
4. Energy Estimates
5. Justification of Nonlinear Geometric Optics Before the Caustic
6. Interpretation
7. Construction of the Modified Scattering Operator and Application

1. Introduction

In this article, we consider the nonlinear Schrödinger equation in one space dimension,

\[ i\partial_t\psi + \frac{1}{2}\partial_x^2\psi = \lambda|\psi|^p\psi, \quad \lambda \in \mathbb{R}, \tag{1.1} \]

in the particular case where p = 2. Define the Fourier transform by

\[ \mathcal{F}v(\xi) = \hat v(\xi) = \int e^{-ix\cdot\xi}\,v(x)\,dx. \]


R. Carles

For p > 2, it is well known that to any asymptotic state $\psi_- \in H^1 \cap \mathcal{F}(H^1) =: \Sigma$, one can associate a solution $\psi$ of (1.1) that behaves asymptotically as the free evolution of $\psi_-$, that is,

\[ \left\| U_0(-t)\psi(t) - \psi_- \right\|_{\Sigma} \underset{t\to-\infty}{\longrightarrow} 0, \]

where $U_0(t) := e^{i\frac{t}{2}\partial_x^2}$ denotes the unitary group of the free Schrödinger equation. The operator $W_- : \psi_- \mapsto \psi|_{t=0}$ is called a wave operator. The case p = 2 is different (long range case). It is proved (see [1,12,13,5]) that if $\psi_- \in L^2$ and $U_0(-t)\psi(t) - \psi_- \to 0$ in $L^2$ as $t \to -\infty$, where $\psi$ solves

\[ i\partial_t\psi + \frac{1}{2}\partial_x^2\psi = |\psi|^2\psi, \]

then $\psi = \psi_- = 0$. One cannot compare the nonlinear dynamics with the free dynamics.

In [11], the author constructs modified wave operators that allow one to compare the nonlinear dynamics of (1.1) when p = 2 with a simpler one, yet more complicated than the free dynamics. Assuming the asymptotic state $\psi_-$ is sufficiently smooth and small in a certain Hilbert space, Ozawa defines a new operator (that depends on $\psi_-$) such that the evolution of $\psi_-$ under this dynamics can be compared to the asymptotic behavior of a certain solution of

\[ i\partial_t\psi + \frac{1}{2}\partial_x^2\psi = \lambda|\psi|^2\psi. \tag{1.2} \]

Using the methods of geometric optics as in [2], we rediscover these modified operators, and improve some convergence estimates (Corollary 1). Moreover, we have better control of the (possible) loss of regularity, which, along with the results of [9], makes it possible to define a modified scattering operator ($S = W_+^{-1}W_-$) for small data in $\Sigma$ (Corollary 2). This enables us to describe the validity of nonlinear geometric optics with focusing initial data. In particular, we show that the caustic crossing is described in terms of the scattering operator (as in [2]), plus a “random phase shift” (Corollary 3).

In [8], Ginibre and Velo construct modified wave operators in Gevrey spaces. They make no size restriction on the data, but require analyticity for the asymptotic states. In the present article, we cannot leave out the smallness assumption, but our asymptotic states are less regular. Denote

\[ \mathcal{H} := \{ f \in H^3(\mathbb{R});\ xf \in H^2(\mathbb{R}) \} = \{ f \in \mathcal{S}'(\mathbb{R});\ \|f\|_{\mathcal{H}} := \|(1 + x^2)^{1/2}(1 - \partial_x^2)f\|_{L^2} + \|(1 - \partial_x^2)^{3/2}f\|_{L^2} < \infty \}. \]

Recall one of the results in [11].

Theorem 1 ([11], Theorem 2). There exists $\gamma > 0$ with the following properties. For any $\psi_- \in \mathcal{F}(\mathcal{H})$ with $\|\hat\psi_-\|_{L^\infty} < \gamma$, (1.2) has a unique solution $\psi \in C(\mathbb{R}; H^1) \cap L^4_{\mathrm{loc}}(\mathbb{R}; W^{1,\infty})$ such that for any $\alpha$ with $1/2 < \alpha < 1$,

\[ \left\| \psi(t) - e^{iS^-(t)}U_0(t)\psi_- \right\|_{H^1} = O(|t|^{-\alpha}), \tag{1.3} \]

\[ \left( \int_{-\infty}^{t} \left\| \psi(\tau) - e^{iS^-(\tau)}U_0(\tau)\psi_- \right\|_{W^{1,\infty}}^{4} d\tau \right)^{1/4} = O(|t|^{-\alpha}) \quad \text{as } t \to -\infty, \tag{1.4} \]

Geometric Optics and Long Range Scattering for NLS


where the phase shift $S^-$ is defined by

\[ S^-(t, x) := \frac{\lambda}{2\pi} \left| \hat\psi_-\left( \frac{x}{t} \right) \right|^2 \log|t|. \tag{1.5} \]

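Remark 1. Theorem 1 in [11] gives an asymptotic in $L^2$ instead of $H^1$, and requires less regularity on the asymptotic state $\psi_-$. Yet, it is still required to be small in the same space as in Theorem 2.

Now we recall why the method of geometric optics can be closely related to scattering theory in the case of the nonlinear Schrödinger equation. In [2], we consider the initial value problem

\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \dfrac{1}{2}\varepsilon^2\partial_x^2 u^\varepsilon = \lambda\varepsilon^{\alpha}|u^\varepsilon|^{\beta}u^\varepsilon, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon|_{t=0} = e^{-i\frac{x^2}{2\varepsilon}} f(x), \end{cases} \tag{1.6} \]

where $\alpha \ge 1$, $\beta > 0$ and $0 < \varepsilon \le 1$ is a parameter going to zero. With the initial phase $-x^2/2$, rays of geometric optics (which are the projection on the (t, x) space of the bicharacteristics) focus at the point (t, x) = (1, 0). We proved in [2] that in the case where $\beta = 2\alpha > 2$ (“nonlinear caustic”), the asymptotic behavior, as $\varepsilon$ goes to zero, of the solution near t = 1 is easily expressed in terms of f and the wave operator $W_-$. To see that point, we introduced the scaling

\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}}\,\psi^\varepsilon\left( \frac{t-1}{\varepsilon}, \frac{x}{\varepsilon} \right), \tag{1.7} \]

that satisfies

\[ U_0\left( \frac{1}{\varepsilon} \right)\psi^\varepsilon\left( -\frac{1}{\varepsilon} \right) \underset{\varepsilon\to0}{\longrightarrow} \psi_- := \frac{1}{\sqrt{2i\pi}}\,\hat f(x). \tag{1.8} \]

Define the function $\psi$ by

\[ \begin{cases} i\partial_t\psi + \dfrac{1}{2}\partial_x^2\psi = \lambda|\psi|^{\beta}\psi, \\ \psi|_{t=0} = W_-\psi_-. \end{cases} \]

Then $\psi$ is a concentrating profile for $u^\varepsilon$, that is

\[ u^\varepsilon(t, x) \underset{\varepsilon\to0}{\sim} \frac{1}{\sqrt{\varepsilon}}\,\psi\left( \frac{t-1}{\varepsilon}, \frac{x}{\varepsilon} \right). \]

In this paper, we treat the limiting case of (1.6), that is $\alpha = 1$, $\beta = 2$. We study the validity of nonlinear geometric optics, for positive times, for the solutions of the following initial value problem,

\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \dfrac{1}{2}\varepsilon^2\partial_x^2 u^\varepsilon = \lambda\varepsilon|u^\varepsilon|^2u^\varepsilon, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon|_{t=0} = e^{-i\frac{x^2}{2\varepsilon} + i\lambda|f(x)|^2\log\frac{1}{\varepsilon}} f(x). \end{cases} \tag{1.9} \]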


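We altered the initial data by adding the term $e^{i\lambda|f(x)|^2\log\frac{1}{\varepsilon}}$ in order to recover the same modified wave operator as in [11]. Nonlinear geometric optics could be justified as well without this term, by the same methods as those which follow in this article, but this would not make it possible to deduce the existence of modified wave operators for (1.2). From now on, the function f in the initial data is supposed to belong to $\mathcal{H}$, and to be nonzero. Then for every (fixed) $\varepsilon > 0$, (1.9) has a unique global solution, which belongs to $C(\mathbb{R}_t, \Sigma)$ (see for instance [5,6]). The following definition, which follows the spirit of [2], will be motivated in Sect. 2.

Definition 1. Let $g^\varepsilon$ be defined for t < 1 by

\[ g^\varepsilon(t, \xi) := \lambda|f(-\xi)|^2\log\frac{1-t}{\varepsilon}. \tag{1.10} \]

The approximate solution $u^\varepsilon_{\mathrm{app}}$ is defined for t < 1 by

\[ u^\varepsilon_{\mathrm{app}}(t, x) := \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\cdot\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\,a_0(\xi)\,\frac{d\xi}{2\pi}, \tag{1.11} \]

with

\[ a_0(\xi) := \sqrt{\frac{2\pi}{i}}\,f(-\xi). \tag{1.12} \]

We define the symbol $a^\varepsilon(t, \xi)$ by

\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\cdot\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\,a^\varepsilon(t, \xi)\,\frac{d\xi}{2\pi}, \tag{1.13} \]

which makes sense since $u^\varepsilon \in L^2$ and

\[ a^\varepsilon(t, \xi) = \frac{1}{\sqrt{\varepsilon}}\,e^{i\frac{t-1}{2\varepsilon}\xi^2 - ig^\varepsilon(t,\xi)}\,\hat u^\varepsilon\left( t, \frac{\xi}{\varepsilon} \right). \tag{1.14} \]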
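As a consistency check linking (1.12) to (1.8) and (1.5), here is a short computation under the Fourier convention stated in the Introduction. It is a sketch added for the reader, and it assumes the readings $\psi_- = \hat f/\sqrt{2i\pi}$ in (1.8) and $a_0(\xi) = \sqrt{2\pi/i}\,f(-\xi)$ in (1.12):

```latex
% With \mathcal{F}v(\xi) = \int e^{-ix\xi} v(x)\,dx one has \mathcal{F}(\mathcal{F}f)(\xi) = 2\pi f(-\xi), hence
\hat\psi_-(\xi) = \frac{1}{\sqrt{2i\pi}}\,\mathcal{F}(\hat f)(\xi)
               = \frac{2\pi}{\sqrt{2i\pi}}\,f(-\xi)
               = \sqrt{\frac{2\pi}{i}}\,f(-\xi) = a_0(\xi),
\qquad
\frac{\lambda}{2\pi}\,\bigl|\hat\psi_-(\xi)\bigr|^2 = \lambda\,|f(-\xi)|^2 .
```

Under these assumptions the coefficient of the logarithm in $g^\varepsilon$ of (1.10) is the same as the coefficient $\frac{\lambda}{2\pi}|\hat\psi_-|^2$ of $\log|t|$ in the phase shift (1.5), and the smallness condition $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$ is the same as $\|\hat\psi_-\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$.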

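We can now state the main result.

Theorem 2. Let $f \in \mathcal{H}$. There exist $C^* = C^*(f)$ and $\varepsilon^* = \varepsilon^*(f) > 0$ such that for $0 < \varepsilon \le \varepsilon^*$, nonlinear geometric optics is valid before the focus, with the following distinctions.

– If $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$, then in $\{1 - t \ge C^*\varepsilon\}$ and for any $0 \le s \le 1$,

\[ \left\| a^\varepsilon(t, \cdot) - a_0 \right\|_{H^s \cap \mathcal{F}(H^s)} = O\left( \frac{\varepsilon}{1-t} \left( \log\frac{1-t}{\varepsilon} \right)^{2+s} \right). \]

– If $\|f\|_{L^\infty} \ge |2\lambda|^{-1/2}$, then denote $C_0 := 2|\lambda|\,\|f\|_{L^\infty}^2$. For any $\alpha > 0$, there exists $C^*_\alpha$ such that in $\left\{ 1 - t \ge C^*_\alpha \left( \varepsilon|\log\varepsilon|^{5/2} \right)^{1/(C_0+\alpha)} \right\}$, and for any $0 \le s \le 1$,

\[ \left\| a^\varepsilon(t, \cdot) - a_0 \right\|_{H^s \cap \mathcal{F}(H^s)} = O\left( \frac{\varepsilon\,|\log\varepsilon|^{2+s}}{(1-t)^{C_0+2s\alpha}} \right). \]

The above estimates are uniform on the time intervals we consider.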


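Define the Galilean operator J (see for instance [5,6]) by $J(t) := x + it\partial_x$.

Corollary 1. Let $\psi_- \in \mathcal{F}(\mathcal{H})$ with $\|\hat\psi_-\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$. Then there exists a unique $\psi \in C(\mathbb{R}, \Sigma)$ solution of (1.2) such that for any $0 \le s \le 1$, as $t \to -\infty$,

\[ \left\| \psi(t) - e^{iS^-(t)}U_0(t)\psi_- \right\|_{H^s} = O\left( \frac{(\log|t|)^{2+s}}{|t|} \right), \tag{1.15} \]

\[ \left\| J(t)\psi - J(t)e^{iS^-(t)}U_0(t)\psi_- \right\|_{L^2} = O\left( \frac{(\log|t|)^{3}}{|t|} \right). \tag{1.16} \]

In particular, we have

\[ \left\| \psi(t) - e^{iS^-(t)}U_0(t)\psi_- \right\|_{L^\infty} = O\left( \frac{(\log|t|)^{5/2}}{|t|^{3/2}} \right). \tag{1.17} \]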

Actually, we will prove uniqueness under weaker conditions, as stated in the following proposition. Recall that f and $\psi_-$ are related by (1.8).

Proposition 1. Let $f \in H^2(\mathbb{R})$. Suppose $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$. Then there exists at most one function $\psi \in C(\mathbb{R}_t, L^2 \cap L^\infty)$ solution of (1.2) satisfying the following property: there exists $1/2 < \alpha < 1$, with $\alpha > 2|\lambda|\,\|f\|_{L^\infty}^2$, such that, as $t \to -\infty$,

\[ \left\| \psi(t) - e^{iS^-(t)}U_0(t)\psi_- \right\|_{L^2 \cap L^\infty} = O\left( \frac{1}{|t|^{\alpha}} \right). \]

Remark 2. Our method does not recover the convergence in $L^4_t(L^\infty_x)$ of the derivatives, stated in Theorem 1. However, we recover all the others, with a better convergence rate.

Remark 3. The other improvement involves the regularity of the function $\psi$ thus constructed. We get some regularity of the momenta of $\psi$, namely $x\psi \in L^2$, which did not appear in [11]. Thanks to this regularity, we can use the results of asymptotic completeness stated in [9], in order to define a long range scattering operator for small data.

Corollary 2. We can define a modified scattering operator for (1.2), for small data in $\mathcal{H}$. There exists $\delta > 0$ such that to any $\psi_- \in \mathcal{F}(\mathcal{H})$ satisfying $\|\psi_-\| \le \delta$, we can associate a unique $\psi \in C(\mathbb{R}_t, \Sigma)$ solution of (1.2) and $\psi_+ \in L^2$ such that

\[ \psi(t) \underset{t\to\pm\infty}{\sim} e^{iS^{\pm}(t)}U_0(t)\psi_{\pm} \quad \text{in } L^2, \tag{1.18} \]

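where $S^\pm$ are defined by (1.5). The map $S : \psi_- \mapsto \psi_+$ is the modified scattering operator.

Corollary 3. Let $f \in H$. Assume $f$ is sufficiently small. Then nonlinear geometric optics is valid in $L^2$ for the problem (1.9), before and after the caustic. The caustic crossing is described by the modified scattering operator $S$ and a “random phase shift”. One has the following asymptotics in $L^2$,

– if $t < 1$, then
\[
u^\varepsilon(t,x) \mathop{\sim}_{\varepsilon\to 0} \frac{e^{i\pi/4}}{\sqrt{2\pi(1-t)}}\,
\exp\left(i\frac{x^2}{2\varepsilon(t-1)} + i\frac{\lambda}{2\pi}\Big|\widehat{\psi}_-\Big(\frac{x}{t-1}\Big)\Big|^2\log\frac{1-t}{\varepsilon}\right)
\widehat{\psi}_-\Big(\frac{x}{t-1}\Big),
\]
– if $t > 1$, then
\[
u^\varepsilon(t,x) \mathop{\sim}_{\varepsilon\to 0} \frac{e^{-i\pi/4}}{\sqrt{2\pi(t-1)}}\,
\exp\left(i\frac{x^2}{2\varepsilon(t-1)} + i\frac{\lambda}{2\pi}\Big|\widehat{\psi}_+\Big(\frac{x}{t-1}\Big)\Big|^2\log\frac{t-1}{\varepsilon}\right)
\widehat{\psi}_+\Big(\frac{x}{t-1}\Big),
\]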

where $\psi_-$ is defined by (1.8) and $\psi_+ = S\psi_-$.

Remark 4. The phase shift of $-\pi/2$ between the two asymptotics is classical, and appears even in the linear case ([4]). The change in the profile, measured by a scattering operator, was proved in [2]. The new phenomenon here is the phase shift
\[
\frac{\lambda}{2\pi}\Big|\widehat{\psi}_+\Big(\frac{x}{t-1}\Big)\Big|^2\log\frac{|t-1|}{\varepsilon}
- \frac{\lambda}{2\pi}\Big|\widehat{\psi}_-\Big(\frac{x}{t-1}\Big)\Big|^2\log\frac{|t-1|}{\varepsilon},
\]
which is “very nonlinear”, and depends on $\varepsilon$, hence can be called “random”.

Remark 5. From a physical point of view, the nonlinearity $\lambda|\psi|^2\psi$ appears as the first term of a Taylor expansion of a more general nonlinearity $h(|\psi|^2)\psi$. For instance, $h$ may be bounded (to model the phenomenon of saturation). For large times, $\psi$ is small and we can write
\[
h(|\psi|^2)\psi = \lambda|\psi|^2\psi + R(|\psi|^2)\psi, \tag{1.19}
\]
with $R(|\psi|^2) = O(|\psi|^4)$. One can check that replacing $\lambda|\psi|^2\psi$ with the right-hand side of (1.19), Corollary 1 still holds, as well as Corollary 2, since the results in [9] still hold with (1.19).

Notations. We will denote $\bar{d}\xi := \dfrac{d\xi}{2\pi}$, so that the Fourier inverse formula writes
\[
\mathcal{F}^{-1}f(x) = \int e^{ix\xi} f(\xi)\,\bar{d}\xi.
\]
For $x \in \mathbb{R}$, we denote $\langle x\rangle := (1 + x^2)^{1/2}$.

2. Formal Computations

In this section, we recall how the oscillatory integrals were introduced in the nonlinear short range case ([2]), and give a formal argument that leads to Definition 1 before the focus, that is for $t < 1$. Suppose $u^\varepsilon$ solves the initial value problem
\[
\begin{cases}
i\varepsilon\partial_t u^\varepsilon + \dfrac{1}{2}\varepsilon^2\partial_x^2 u^\varepsilon = 0, & (t,x) \in \mathbb{R}_+ \times \mathbb{R},\\[1ex]
u^\varepsilon_{|t=0} = e^{-i\frac{x^2}{2\varepsilon}} f(x).
\end{cases} \tag{2.1}
\]
For $t < 1$, the asymptotics when $\varepsilon$ goes to zero is given by WKB methods,
\[
u^\varepsilon(t,x) \mathop{\sim}_{\varepsilon\to 0} \frac{1}{\sqrt{1-t}}\, f\Big(\frac{x}{1-t}\Big)\, e^{i\frac{x^2}{2\varepsilon(t-1)}}. \tag{2.2}
\]


Near the focus, this description fails to be valid. Neither the profile nor the phase in (2.2) are defined for $t = 1$. For much more general cases, Duistermaat showed that a uniform description can be obtained in terms of oscillatory integrals ([4]), that is, in this case,
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, a^\varepsilon(\xi)\,\bar{d}\xi. \tag{2.3}
\]
It is easy to check that $a^\varepsilon$ has an asymptotic expansion in powers of $\varepsilon$, and in particular, $a^\varepsilon \to a_0$ as $\varepsilon \to 0$, with $a_0$ defined by (1.12). For $t < 1$ the usual stationary phase formula applied to the above integral with $a^\varepsilon$ replaced by $a_0$ gives the asymptotics (2.2). For $t > 1$, one has almost the same asymptotics, the main difference is a phase shift of $-\pi/2$ due to the caustic crossing.

For the nonlinear case (1.6), we generalized the previous representation as follows ([2]),
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, a^\varepsilon(t,\xi)\,\bar{d}\xi. \tag{2.4}
\]
This formula makes sense as soon as $u^\varepsilon \in L^2$, since
\[
a^\varepsilon(t,\xi) = \frac{1}{\sqrt{\varepsilon}}\, e^{i\frac{t-1}{2\varepsilon}\xi^2}\, \widehat{u^\varepsilon}\Big(t, \frac{\xi}{\varepsilon}\Big).
\]
The nonlinear term $\varepsilon^\alpha|u^\varepsilon|^\beta u^\varepsilon$ is negligible when $\partial_t a^\varepsilon$ goes to zero. With this natural definition, we proved that the nonlinear term can have different influences away from the caustic, and near $t = 1$, which led us to use the same vocabulary as in [10], linear/nonlinear propagation, linear/nonlinear caustic. We also proved that the four cases can be encountered. When the propagation is nonlinear ($\alpha = 1$), a formal computation based on the stationary phase formula suggests as a limit transport equation for the symbol $a^\varepsilon$,
\[
i\partial_t a(t,\xi) = \frac{\lambda}{|2\pi(1-t)|^{\beta/2}}\,|a|^\beta a(t,\xi), \tag{2.5}
\]
at least away from the caustic, with initial data $a_{|t=0} = a_0(\xi)$. Multiplying (2.5) by $\bar{a}$, one notices that the modulus of $a$ is constant. If we write $a = a_0 e^{ig(t,\xi)}$, the equation for $g$ is:
\[
\partial_t g(t,\xi) = -\frac{\lambda}{|1-t|^{\beta/2}}\,|f(-\xi)|^\beta. \tag{2.6}
\]
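The passage from (2.5) to (2.6) can be written out as the short computation below. It is a sketch under two assumptions that are not stated in this part of the text: $\lambda$ is real, and $|a_0(\xi)|^2 = 2\pi|f(-\xi)|^2$, which is what comparing (2.5) with (2.6) requires (the definition (1.12) of $a_0$ is not reproduced here).

```latex
% Step 1: conservation of |a|. Multiply (2.5) by \bar a and take the imaginary part
% (\lambda real is assumed):
\[
  \operatorname{Im}\big(i\,\bar a\,\partial_t a\big) = \tfrac12\,\partial_t |a|^2
  = \operatorname{Im}\left(\frac{\lambda}{|2\pi(1-t)|^{\beta/2}}\,|a|^{\beta+2}\right) = 0,
  \qquad\text{so } |a(t,\xi)| = |a_0(\xi)| .
\]
% Step 2: with a = a_0 e^{ig}, g real, one has i\partial_t a = -(\partial_t g)\,a, hence
\[
  \partial_t g(t,\xi) = -\frac{\lambda\,|a_0(\xi)|^{\beta}}{|2\pi(1-t)|^{\beta/2}}
  = -\frac{\lambda}{|1-t|^{\beta/2}}\,|f(-\xi)|^{\beta}
  \quad\text{if } |a_0(\xi)|^2 = 2\pi |f(-\xi)|^2 .
\]
% Step 3: for \beta = 2 and 0 \le t < 1 the right-hand side is integrable in time:
\[
  g(t,\xi) = g(0,\xi) - \lambda |f(-\xi)|^2 \int_0^t \frac{ds}{1-s}
           = g(0,\xi) + \lambda |f(-\xi)|^2 \log(1-t).
\]
```

The logarithm in Step 3 diverges as $t \to 1$, which is the formal reason why the phase $g$ can only be used before the focus when $\beta = 2$.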

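If we wish to get as a limit transport equation the relation $\partial_t\tilde{a} = 0$, it seems natural to define a modified symbol $\tilde{a}^\varepsilon$ as
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + ig(t,\xi)}\,\tilde{a}^\varepsilon(t,\xi)\,\bar{d}\xi, \tag{2.7}
\]
with $g_{|t=0} = 0$. In the case of a linear caustic ($\beta < 2$), we proved that indeed, $\tilde{a}^\varepsilon(t,\xi) \to a_0(\xi)$ in $L^\infty_{t,\mathrm{loc}}(\Sigma_x)$ as $\varepsilon \to 0$.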


In the case we want to study now, $\beta = 2$, the integration of (2.6) is possible only for $t < 1$. With the initial data $g_{|t=0} = \lambda|f(-\xi)|^2\log\frac{1}{\varepsilon}$, it gives the result introduced in Definition 1. As in the cases recalled above, the transport equation for the modified symbol $\tilde{a}^\varepsilon$ must be, for $t < 1$, $\partial_t\tilde{a}^\varepsilon \to 0$ as $\varepsilon \to 0$, which leads us to the definition of the approximate solution (1.11). From now on, we will leave out the tilde symbol for $a$, and adopt the notation (1.13).

Remark 6. The function $g$ is defined only for $t < 1$, not near $t = 1$. One must remember that the formal computations that lead to the definition of $g$ are based on the application of the stationary phase formula. When the phase $\frac{1-t}{2}\xi^2 + x\xi$ does not have non-degenerate critical points, one must not expect this formal argument to be valid in the general case. On the other hand, recall that the case we study ($\alpha = 1$ and $\beta = 2$) corresponds to a nonlinear propagation and a nonlinear caustic. The phase $g$ takes the nonlinear effects of the propagation before the caustic into account. To take the nonlinear effects of the caustic into account, one has to define a (long range) scattering operator for the cubic Schrödinger equation (see Sect. 7).

For $t < 1$, the function $u^\varepsilon_{app}$ satisfies the equation
\[
\begin{aligned}
i\varepsilon\partial_t u^\varepsilon_{app} + \frac{1}{2}\varepsilon^2\partial_x^2 u^\varepsilon_{app}
&= -\sqrt{\varepsilon}\int \partial_t g^\varepsilon(t,\xi)\, e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\, a_0(\xi)\,\bar{d}\xi\\
&= \lambda\varepsilon\,\frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\,\frac{|f(-\xi)|^2}{1-t}\, a_0(\xi)\,\bar{d}\xi.
\end{aligned} \tag{2.8}
\]
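The first equality in (2.8) comes from letting the Schrödinger operator act on the integrand. A minimal sketch of that computation follows; the name $\Phi$ for the total phase is introduced here for convenience and is not the paper's notation.

```latex
% Total phase of the integrand defining u^\varepsilon_{app} (notation \Phi is an assumption of this sketch):
\[
  \Phi(t,x,\xi) := -\frac{t-1}{2\varepsilon}\,\xi^2 + \frac{x\xi}{\varepsilon} + g^\varepsilon(t,\xi).
\]
% Time and space derivatives of e^{i\Phi}:
\[
  i\varepsilon\,\partial_t e^{i\Phi} = \Big(\frac{\xi^2}{2} - \varepsilon\,\partial_t g^\varepsilon\Big)e^{i\Phi},
  \qquad
  \frac{\varepsilon^2}{2}\,\partial_x^2 e^{i\Phi} = -\frac{\xi^2}{2}\,e^{i\Phi}.
\]
% The \xi^2/2 terms cancel, so
\[
  \Big(i\varepsilon\partial_t + \frac{\varepsilon^2}{2}\partial_x^2\Big)
  \frac{1}{\sqrt{\varepsilon}}\int e^{i\Phi}a_0(\xi)\,\bar d\xi
  = -\sqrt{\varepsilon}\int \partial_t g^\varepsilon(t,\xi)\,e^{i\Phi}a_0(\xi)\,\bar d\xi .
\]
% The second equality in (2.8) uses (2.6) with \beta = 2 and t < 1:
\[
  \partial_t g^\varepsilon(t,\xi) = -\frac{\lambda\,|f(-\xi)|^2}{1-t}.
\]
```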

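For $t < 1$, one can formally apply the stationary phase formula to the integral defining $u^\varepsilon_{app}$,
\[
u^\varepsilon_{app}(t,x) \mathop{\sim}_{\varepsilon\to 0} \frac{1}{(1-t)^{1/2}}\, f\Big(\frac{x}{1-t}\Big)\, e^{i\frac{x^2}{2\varepsilon(t-1)} + ig^\varepsilon\left(t,\frac{x}{t-1}\right)} =: u^\varepsilon_1(t,x). \tag{2.9}
\]
On the other hand, if one applies the stationary phase formula to the right-hand side of (2.8), it comes $\lambda\varepsilon|u^\varepsilon_1|^2u^\varepsilon_1(t,x)$, so formally, $u^\varepsilon_{app}$ is an approximate solution of (1.9). In the following section, we estimate precisely the remainders when one applies the stationary phase formula as above.

3. Estimates on Some Oscillatory Integrals

3.1. The fundamental estimate. We first estimate precisely the remainder of the usual stationary phase formula applied to the first order, in $L^2$.

Lemma 1. Let $\sigma(t,\xi)$ be locally bounded in time with values in $L^2(\mathbb{R})$. Denote
\[
H^\varepsilon(t,x) := \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\,\sigma(t,\xi)\,\bar{d}\xi,
\]
and $\Lambda^\varepsilon$ the first term given by the stationary phase formula,
\[
\Lambda^\varepsilon(t,x) := \sqrt{\frac{i}{2\pi(1-t)}}\; e^{-i\frac{x^2}{2\varepsilon(1-t)}}\,\sigma\Big(t,\frac{x}{t-1}\Big).
\]
1. There exists a continuous function $h$, with $h(0) = 0$, such that
\[
\|H^\varepsilon(t,.) - \Lambda^\varepsilon(t,.)\|_{L^2} = h\Big(\frac{\varepsilon}{1-t}\Big).
\]
2. If $\sigma(t,.) \in H^2(\mathbb{R})$, the rate of continuity of $h$ can be estimated,
\[
\|H^\varepsilon(t,.) - \Lambda^\varepsilon(t,.)\|_{L^2} \le C\,\frac{\varepsilon}{|1-t|}\,\|\sigma(t,.)\|_{H^2_\xi}.
\]

Proof. From the definition of $H^\varepsilon$,
\[
\begin{aligned}
H^\varepsilon(t,x) &= e^{-i\frac{x^2}{2\varepsilon(1-t)}}\,\frac{1}{\sqrt{\varepsilon}}\int e^{i\frac{1-t}{2\varepsilon}\left(\xi + \frac{x}{1-t}\right)^2}\sigma(t,\xi)\,\bar{d}\xi\\
&= e^{-i\frac{x^2}{2\varepsilon(1-t)}}\,\frac{1}{\sqrt{\varepsilon}}\int e^{i\frac{1-t}{2\varepsilon}\xi^2}\,\sigma\Big(t,\xi - \frac{x}{1-t}\Big)\,\bar{d}\xi,
\end{aligned}
\]
hence from Parseval formula,
\[
\begin{aligned}
H^\varepsilon(t,x) &= e^{-i\frac{x^2}{2\varepsilon(1-t)}}\sqrt{\frac{i}{2\pi(1-t)}}\int e^{i\frac{\varepsilon}{2(1-t)}y^2}\, e^{i\frac{xy}{1-t}}\,\mathcal{F}^{-1}_{\xi\to y}\sigma(t,y)\,dy\\
&= \sqrt{\frac{i}{2\pi(1-t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}}\,\sigma\Big(t,\frac{x}{t-1}\Big)\\
&\quad + \sqrt{\frac{i}{2\pi(1-t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}}\int\Big(e^{i\frac{\varepsilon}{2(1-t)}y^2} - 1\Big)e^{i\frac{xy}{1-t}}\,\mathcal{F}^{-1}\sigma(t,y)\,dy,
\end{aligned}
\]
and the last term can also be written as
\[
\sqrt{\frac{i}{2\pi(1-t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}}\,\mathcal{F}\Big[\Big(e^{i\frac{\varepsilon}{2(1-t)}y^2} - 1\Big)\mathcal{F}^{-1}\sigma\Big]\Big(t,\frac{x}{t-1}\Big).
\]
Now from the Plancherel formula,
\[
\|H^\varepsilon(t,.) - \Lambda^\varepsilon(t,.)\|_{L^2_x} = h\Big(t,\frac{\varepsilon}{1-t}\Big),
\quad\text{with}\quad
h(t,z) = \Big\|\Big(e^{i\frac{z}{2}y^2} - 1\Big)\mathcal{F}^{-1}\sigma(t,.)\Big\|_{L^2_y}.
\]
Then the first point follows from the dominated convergence theorem. When $\sigma(t,.) \in H^2$, we have
\[
h(t,z) = \Big\|2\sin\Big(\frac{z}{4}y^2\Big)\mathcal{F}^{-1}\sigma(t,.)\Big\|_{L^2_y}
\le 2\,\Big\|\frac{z}{4}\,y^2\,\mathcal{F}^{-1}\sigma(t,.)\Big\|_{L^2_y}
\le |z|\,\big\|y^2\mathcal{F}^{-1}\sigma(t,.)\big\|_{L^2_y} \le C|z|\,\|\sigma(t,.)\|_{H^2}.
\]
This inequality completes the proof of Lemma 1.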
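The two ingredients used at the end of the proof of Lemma 1 are elementary and can be spelled out as below. This is a sketch using the Fourier convention fixed in the Notations ($\mathcal{F}^{-1}f(y) = \int e^{iy\xi}f(\xi)\,d\xi/2\pi$); the variable $\theta$ is introduced only for this computation.

```latex
% (a) Pointwise bound on the oscillating factor, for real \theta:
\[
  |e^{i\theta} - 1| = 2\,\Big|\sin\frac{\theta}{2}\Big| \le \min\big(2, |\theta|\big).
\]
% With \theta = z y^2/2 the bound "2" gives the domination 4|\mathcal{F}^{-1}\sigma|^2 of the
% squared integrand needed for the dominated convergence theorem (point 1),
% and the bound "|\theta|" gives the quantitative estimate (point 2):
\[
  \Big|\big(e^{i\frac{z}{2}y^2} - 1\big)\,\mathcal{F}^{-1}\sigma(t,y)\Big|
  \le \frac{|z|}{2}\,y^2\,\big|\mathcal{F}^{-1}\sigma(t,y)\big| .
\]
% (b) The weight y^2 corresponds to two derivatives in \xi (integrate by parts twice):
\[
  y^2\,\mathcal{F}^{-1}\sigma(t,y) = -\,\mathcal{F}^{-1}\big(\partial_\xi^2\sigma\big)(t,y),
  \qquad
  \big\|y^2\mathcal{F}^{-1}\sigma(t,.)\big\|_{L^2_y}
  = \frac{1}{\sqrt{2\pi}}\,\big\|\partial_\xi^2\sigma(t,.)\big\|_{L^2_\xi}
  \le C\,\|\sigma(t,.)\|_{H^2}.
\]
```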


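3.2. Convergence of the initial data. To obtain asymptotics in $\Sigma$ for the symbols as stated in Theorem 2, we have to notice the following properties. If
\[
v^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, b^\varepsilon(t,\xi)\,\bar{d}\xi,
\]
then, up to the unimodular factor $-i$, multiplication by $\xi$ and differentiation in $\xi$ of the symbol correspond to $\varepsilon\partial_x$ and $J^\varepsilon(t)$ respectively,
\[
\frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\,\xi\, b^\varepsilon(t,\xi)\,\bar{d}\xi = -i\,\varepsilon\partial_x v^\varepsilon(t,x),
\]
and
\[
\frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\,\partial_\xi b^\varepsilon(t,\xi)\,\bar{d}\xi = -i\,J^\varepsilon(t)v^\varepsilon(t,x),
\]
where we denoted
\[
J^\varepsilon(t) := \frac{x}{\varepsilon} + i(t-1)\partial_x. \tag{3.1}
\]
The operator $J^\varepsilon$ is nothing else than the usual Galilean operator, rescaled according to our problem.

Lemma 2. The operator $J^\varepsilon$ satisfies the following properties.

– The commutation relation,
\[
\Big[J^\varepsilon(t),\; i\varepsilon\partial_t + \frac{1}{2}\varepsilon^2\partial_x^2\Big] = 0. \tag{3.2}
\]
– Denote $M^\varepsilon(t) = e^{i\frac{x^2}{2\varepsilon(t-1)}}$, then $J^\varepsilon(t)$ writes
\[
J^\varepsilon(t) = i(t-1)\,M^\varepsilon(t)\,\partial_x\,M^\varepsilon(2-t). \tag{3.3}
\]
– The modified Sobolev inequality,
\[
\|w(t)\|_{L^\infty} \le \frac{C}{\sqrt{|1-t|}}\,\|w(t)\|_{L^2}^{1/2}\,\|J^\varepsilon(t)w(t)\|_{L^2}^{1/2}. \tag{3.4}
\]
– For any function $F \in C^1(\mathbb{C},\mathbb{C})$ satisfying the gauge invariance condition
\[
\exists G \in C^1(\mathbb{R}_+,\mathbb{R}),\quad F(z) = zG(|z|^2),
\]
one has
\[
J^\varepsilon(t)F(w) = \partial_zF(w)\,J^\varepsilon(t)w - \partial_{\bar z}F(w)\,\overline{J^\varepsilon(t)w}. \tag{3.5}
\]

The first step to prove Theorem 2 is to study the convergence of the initial value of the symbol $a^\varepsilon$.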
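The factorization (3.3) and the way it yields the modified Sobolev inequality (3.4) can be checked directly. The sketch below assumes only the definition (3.1) and the one-dimensional Gagliardo–Nirenberg inequality; the auxiliary function $v$ is introduced for this computation only.

```latex
% Since M^\varepsilon(2-t) = e^{i x^2/(2\varepsilon(1-t))} = e^{-i x^2/(2\varepsilon(t-1))} = M^\varepsilon(t)^{-1},
\[
  \partial_x\big(M^\varepsilon(2-t)\,w\big)
  = M^\varepsilon(t)^{-1}\Big(\partial_x w - \frac{i x}{\varepsilon (t-1)}\,w\Big),
\]
% and therefore
\[
  i(t-1)\,M^\varepsilon(t)\,\partial_x\big(M^\varepsilon(2-t)\,w\big)
  = i(t-1)\,\partial_x w + \frac{x}{\varepsilon}\,w = J^\varepsilon(t)\,w ,
\]
% which is (3.3). For (3.4), set v := M^\varepsilon(2-t)\,w. Then |v| = |w| pointwise and
\[
  \|\partial_x v\|_{L^2} = \frac{1}{|t-1|}\,\|J^\varepsilon(t) w\|_{L^2},
  \qquad
  \|w\|_{L^\infty} = \|v\|_{L^\infty}
  \le C\,\|v\|_{L^2}^{1/2}\,\|\partial_x v\|_{L^2}^{1/2}
  = \frac{C}{\sqrt{|1-t|}}\,\|w\|_{L^2}^{1/2}\,\|J^\varepsilon(t) w\|_{L^2}^{1/2}.
\]
```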


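Lemma 3. The following convergence holds in $\Sigma$, $a^\varepsilon(0,\xi) \to a_0(\xi)$ as $\varepsilon \to 0$. More precisely, there exists $C = C(\|f\|_{H})$ such that
\[
\begin{aligned}
\|a^\varepsilon(0,.) - a_0\|_{L^2} &\le C\varepsilon\Big(\log\frac{1}{\varepsilon}\Big)^2,\\
\|\xi(a^\varepsilon(0,\xi) - a_0(\xi))\|_{L^2},\ \|\partial_\xi(a^\varepsilon(0,\xi) - a_0(\xi))\|_{L^2} &\le C\varepsilon\Big(\log\frac{1}{\varepsilon}\Big)^3.
\end{aligned} \tag{3.6}
\]
Moreover, the same estimates hold with $a^\varepsilon(0,\xi) - a_0(\xi)$ replaced with $(a^\varepsilon(0,\xi) - a_0(\xi))\,e^{-i\lambda|f(-\xi)|^2\log\varepsilon}$.

Proof. From (1.14) and the initial value of $u^\varepsilon$, one has
\[
a^\varepsilon(0,\xi) = e^{i\lambda|f(-\xi)|^2\log\varepsilon}\,\frac{1}{\sqrt{\varepsilon}}\int e^{-\frac{i}{2\varepsilon}(x+\xi)^2 + i\lambda|f(x)|^2\log\frac{1}{\varepsilon}}\, f(x)\,dx.
\]
Denote $h^\varepsilon(x) := e^{i\lambda|f(x)|^2\log\frac{1}{\varepsilon}}f(x)$. From Parseval formula, one also has
\[
a^\varepsilon(0,\xi)\,e^{-i\lambda|f(-\xi)|^2\log\varepsilon} = \frac{1}{\sqrt{2i\pi}}\int e^{-iy\xi - i\varepsilon\frac{y^2}{2}}\,\widehat{h^\varepsilon}(y)\,dy,
\]
hence
\[
\big(a^\varepsilon(0,\xi) - a_0(\xi)\big)\,e^{-i\lambda|f(-\xi)|^2\log\varepsilon} = \frac{1}{\sqrt{2i\pi}}\int e^{-iy\xi}\Big(e^{-i\varepsilon\frac{y^2}{2}} - 1\Big)\widehat{h^\varepsilon}(y)\,dy.
\]
Following the proof of Lemma 1, one then proves that the $L^2$-norm of the above quantity is $O(\varepsilon|\log\varepsilon|^2)$, and its $\Sigma$-norm is $O(\varepsilon|\log\varepsilon|^3)$. The estimates (3.6) are then straightforward.

3.3. Estimating the approximate solution. To estimate the remainder $u^\varepsilon - u^\varepsilon_{app}$, we will need some information on the $L^\infty$-norm of the approximate solution. The following lemma provides some.

Lemma 4. Let $\beta > 0$. There exists $C_* = C_*(\beta, \|f\|_{H^2})$ such that in the region $\{1 - t \ge C_*\varepsilon\}$, $u^\varepsilon_{app}(t)$ satisfies almost the same estimate as $u^\varepsilon_1(t)$ in $L^\infty$, that is,
\[
\|u^\varepsilon_{app}(t)\|_{L^\infty} \le \frac{\|f\|_{L^\infty} + \beta}{\sqrt{1-t}}.
\]

Proof. Write $\|u^\varepsilon_{app}(t)\|_{L^\infty} \le \|u^\varepsilon_1(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t) - u^\varepsilon_1(t)\|_{L^\infty}$, and denote $d^\varepsilon(t,x) := u^\varepsilon_{app}(t,x) - u^\varepsilon_1(t,x)$. From the modified Sobolev inequality,
\[
\|d^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\|d^\varepsilon(t)\|_{L^2}^{1/2}\,\|J^\varepsilon(t)d^\varepsilon\|_{L^2}^{1/2}. \tag{3.7}
\]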


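Now the $L^2$-norms can be estimated thanks to Lemma 1, with $\sigma^\varepsilon(t,\xi) := e^{ig^\varepsilon(t,\xi)}a_0(\xi)$. Thus,
\[
\|d^\varepsilon(t)\|_{L^2} \le C\,\frac{\varepsilon}{1-t}\,\|\sigma^\varepsilon(t,.)\|_{H^2_\xi}.
\]
It is a straightforward computation to see that since $H^1(\mathbb{R}) \subset L^\infty(\mathbb{R})$, there are some constants such that
\[
\|d^\varepsilon(t)\|_{L^2} \le C(\|f\|_{H^2})\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^2.
\]
Since $J^\varepsilon(t)$ acts as the differentiation with respect to $\xi$ on the symbols, the first part of Lemma 1 gives
\[
\|J^\varepsilon(t)d^\varepsilon\|_{L^2} = \log\frac{1-t}{\varepsilon}\; h\Big(\frac{\varepsilon}{1-t}\Big),
\]
where $h \in C(\mathbb{R})$ satisfies $h(0) = 0$. Then from (3.7),
\[
\|d^\varepsilon(t)\|_{L^\infty} \le \frac{C(\|f\|_{H^2})}{\sqrt{1-t}}\Big(\frac{\varepsilon}{1-t}\Big)^{1/2}\Big(\log\frac{1-t}{\varepsilon}\Big)^{3/2} h\Big(\frac{\varepsilon}{1-t}\Big)^{1/2}.
\]
Hence, for $1 - t \gg \varepsilon$, $d^\varepsilon(t)$ is negligible compared to $u^\varepsilon_1(t)$ in $L^\infty$. This completes the proof of Lemma 4.

The proof of the next lemma is similar, and uses the regularity $f \in H$.

Lemma 5. There exists $C_* = C_*(\|f\|_{H})$ such that in the region $\{1 - t \ge C_*\varepsilon\}$, the derivatives of $u^\varepsilon_{app}$ satisfy almost the same estimates as the derivatives of $u^\varepsilon_1$ in $L^\infty$, that is, there exists $C = C(\|f\|_{H})$ such that
\[
\|\varepsilon\partial_x u^\varepsilon_{app}(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}, \tag{3.8}
\]
\[
\|J^\varepsilon(t)u^\varepsilon_{app}\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\log\frac{1-t}{\varepsilon}. \tag{3.9}
\]

3.4. The equation satisfied by the approximate solution. From Sect. 2 and more precisely from Eq. (2.8), the approximate solution $u^\varepsilon_{app}$ solves the cubic nonlinear Schrödinger equation up to the error term
\[
\Delta^\varepsilon(t,x) := |u^\varepsilon_{app}|^2u^\varepsilon_{app}(t,x) - \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\,\frac{|f(-\xi)|^2}{1-t}\,a_0(\xi)\,\bar{d}\xi. \tag{3.10}
\]

Lemma 6. There exist $C = C(\|f\|_{H^2})$ and $C_* = C_*(\|f\|_{H^2})$ such that uniformly in the region $\{1 - t \ge C_*\varepsilon\}$,
\[
\|\Delta^\varepsilon(t)\|_{L^2_x} \le C\,\frac{\varepsilon}{(1-t)^2}\Big(\log\frac{1-t}{\varepsilon}\Big)^2. \tag{3.11}
\]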


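Proof. Write $\Delta^\varepsilon(t,x) = \Delta^\varepsilon(t,x) + |u^\varepsilon_1|^2u^\varepsilon_1(t,x) - |u^\varepsilon_1|^2u^\varepsilon_1(t,x)$, and introduce
\[
\tilde{\Delta}^\varepsilon(t,x) := |u^\varepsilon_{app}|^2u^\varepsilon_{app}(t,x) - |u^\varepsilon_1|^2u^\varepsilon_1(t,x).
\]
We prove that $\tilde{\Delta}^\varepsilon$ satisfies the estimate stated in Lemma 6. The other estimate needed to complete the proof of Lemma 6 is easier and will be left out. First remark that
\[
\|\tilde{\Delta}^\varepsilon(t,.)\|_{L^2_x} \le C\Big(\|u^\varepsilon_{app}(t)\|_{L^\infty_x}^2 + \|u^\varepsilon_1(t)\|_{L^\infty_x}^2\Big)\|(u^\varepsilon_{app} - u^\varepsilon_1)(t)\|_{L^2_x}.
\]
One has obviously
\[
\|u^\varepsilon_1(t)\|_{L^\infty_x}^2 \le \frac{1}{1-t}\,\|f\|_{L^\infty}^2.
\]
From Lemma 4, $u^\varepsilon_{app}$ satisfies the same estimate in the region we are considering. Hence,
\[
\|\tilde{\Delta}^\varepsilon(t,.)\|_{L^2_x} \le C(\|f\|_{H^2})\,\frac{1}{1-t}\,\|(u^\varepsilon_{app} - u^\varepsilon_1)(t)\|_{L^2_x}. \tag{3.12}
\]
From Lemma 1 with $\sigma^\varepsilon = e^{ig^\varepsilon(t,\xi)}a_0(\xi)$, we finally have
\[
\|(u^\varepsilon_{app} - u^\varepsilon_1)(t)\|_{L^2_x} \le C\,\frac{\varepsilon}{1-t}\,\big\|e^{ig^\varepsilon(t,\xi)}a_0(\xi)\big\|_{H^2_\xi},
\]
and it is easy to check that $\tilde{\Delta}^\varepsilon$ satisfies the estimate announced in Lemma 6.

The following lemma is the extension of Lemma 6 we will need for the proof of Theorem 2, and its proof is similar.

Lemma 7. There exist $C = C(\|f\|_{H})$ and $C_* = C_*(\|f\|_{H})$ such that uniformly in the region $\{1 - t \ge C_*\varepsilon\}$,
\[
\begin{aligned}
\|J^\varepsilon(t)\Delta^\varepsilon(t)\|_{L^2_x} &\le C\,\frac{\varepsilon}{(1-t)^2}\Big(\log\frac{1-t}{\varepsilon}\Big)^3,\\
\|\varepsilon\partial_x\Delta^\varepsilon(t)\|_{L^2_x} &\le C\,\frac{\varepsilon}{(1-t)^2}\Big(\log\frac{1-t}{\varepsilon}\Big)^3.
\end{aligned} \tag{3.13}
\]

4. Energy Estimates

In this section, we derive the three energy estimates we will use to justify nonlinear geometric optics. Recall that the exact solution $u^\varepsilon$ and the approximate solution $u^\varepsilon_{app}$ satisfy
\[
\begin{aligned}
i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2\partial_x^2u^\varepsilon &= \lambda\varepsilon|u^\varepsilon|^2u^\varepsilon,\\
i\varepsilon\partial_t u^\varepsilon_{app} + \frac{1}{2}\varepsilon^2\partial_x^2u^\varepsilon_{app} &= \lambda\varepsilon|u^\varepsilon_{app}|^2u^\varepsilon_{app} - \varepsilon\Delta^\varepsilon,
\end{aligned}
\]
where $\Delta^\varepsilon$ is defined by (3.10) and is estimated in Lemmas 6 and 7. Introduce the remainder $w^\varepsilon := u^\varepsilon - u^\varepsilon_{app}$. Subtracting the previous two equations, one has
\[
i\varepsilon\partial_t w^\varepsilon + \frac{1}{2}\varepsilon^2\partial_x^2w^\varepsilon = \lambda\varepsilon\big(|u^\varepsilon|^2u^\varepsilon - |u^\varepsilon_{app}|^2u^\varepsilon_{app}\big) + \varepsilon\Delta^\varepsilon. \tag{4.1}
\]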


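Multiplying the previous equation by $\overline{w^\varepsilon}$ and taking the imaginary part of the result integrated in $x$, it follows
\[
\begin{aligned}
\partial_t\|w^\varepsilon(t)\|_{L^2} &\le C\big(\|u^\varepsilon(t)\|_{L^\infty}^2 + \|u^\varepsilon_{app}(t)\|_{L^\infty}^2\big)\|w^\varepsilon(t)\|_{L^2} + C\|\Delta^\varepsilon(t)\|_{L^2}\\
&\le C\big(\|w^\varepsilon(t)\|_{L^\infty}^2 + \|u^\varepsilon_{app}(t)\|_{L^\infty}^2\big)\|w^\varepsilon(t)\|_{L^2} + C\|\Delta^\varepsilon(t)\|_{L^2}.
\end{aligned} \tag{4.2}
\]
Differentiating (4.1) with respect to $x$ and multiplying by $\varepsilon\partial_x\overline{w^\varepsilon}$, one has similarly
\[
\begin{aligned}
\partial_t\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2} &\le C\big(\|w^\varepsilon(t)\|_{L^\infty}^2 + \|u^\varepsilon_{app}(t)\|_{L^\infty}^2\big)\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2}\\
&\quad + C\|w^\varepsilon(t)\|_{L^2}\|u^\varepsilon_{app}(t)\|_{L^\infty}\|\varepsilon\partial_xu^\varepsilon_{app}(t)\|_{L^\infty}\\
&\quad + C\|w^\varepsilon(t)\|_{L^\infty}^2\|\varepsilon\partial_xu^\varepsilon_{app}(t)\|_{L^2} + C\|\varepsilon\partial_x\Delta^\varepsilon(t)\|_{L^2}.
\end{aligned} \tag{4.3}
\]
Finally, since from Lemma 2 $J^\varepsilon$ commutes with the Schrödinger operator and acts on the nonlinearity we are considering as a differentiation, we also have
\[
\begin{aligned}
\partial_t\|J^\varepsilon(t)w^\varepsilon\|_{L^2} &\le C\big(\|w^\varepsilon(t)\|_{L^\infty}^2 + \|u^\varepsilon_{app}(t)\|_{L^\infty}^2\big)\|J^\varepsilon(t)w^\varepsilon\|_{L^2}\\
&\quad + C\|w^\varepsilon(t)\|_{L^2}\|u^\varepsilon_{app}(t)\|_{L^\infty}\|J^\varepsilon(t)u^\varepsilon_{app}\|_{L^\infty}\\
&\quad + C\|w^\varepsilon(t)\|_{L^\infty}^2\|J^\varepsilon(t)u^\varepsilon_{app}\|_{L^2} + C\|J^\varepsilon(t)\Delta^\varepsilon\|_{L^2}.
\end{aligned} \tag{4.4}
\]
The main idea to justify nonlinear geometric optics is to integrate those three energy estimates so long as $\|w^\varepsilon(t)\|_{L^\infty}$ is not greater than $\|u^\varepsilon_{app}(t)\|_{L^\infty}$. Since $w^\varepsilon$ is expected to be a remainder, this case actually occurs in “sufficiently” large regions, as we will see in the next section.

5. Justification of Nonlinear Geometric Optics Before the Caustic

We now illustrate the method announced above. From Lemma 4, the “so long” condition writes for instance
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{4\|f\|_{L^\infty}}{\sqrt{1-t}}. \tag{5.1}
\]
From Inequality (3.4) and Lemma 3,
\[
\|w^\varepsilon(0)\|_{L^\infty} \le C\|w^\varepsilon(0)\|_{L^2}^{1/2}\|J^\varepsilon(0)w^\varepsilon\|_{L^2}^{1/2} \le C\varepsilon|\log\varepsilon|^{5/2}.
\]
Hence there exists $\varepsilon^* = \varepsilon^*(\|f\|_{H}) > 0$ such that for $0 < \varepsilon \le \varepsilon^*$, $\|w^\varepsilon(0)\|_{L^\infty} \le 2\|f\|_{L^\infty}$. By continuity, Condition (5.1) is satisfied for $0 \le t \le T_\varepsilon$ for some $T_\varepsilon > 0$. Then so long as (5.1) holds, we can integrate the three energy estimates using Gronwall lemma. Since (4.3) and (4.4) are very similar, introduce the following norm,
\[
\|w^\varepsilon(t)\|_{Y} := \max\big(\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2},\ \|J^\varepsilon(t)w^\varepsilon\|_{L^2}\big). \tag{5.2}
\]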


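Now we can write (3.4) as
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\|w^\varepsilon(t)\|_{L^2}^{1/2}\,\|w^\varepsilon(t)\|_{Y}^{1/2}.
\]
So long as (5.1) holds, estimate (4.2) can be written as follows,
\[
\partial_t\|w^\varepsilon(t)\|_{L^2} \le C_1\,\frac{\|f\|_{L^\infty}^2}{1-t}\,\|w^\varepsilon(t)\|_{L^2} + C\|\Delta^\varepsilon(t)\|_{L^2}, \tag{5.3}
\]
where $C_1$ is a universal constant that does not depend on $f$. Denote $C_0 := C_1\|f\|_{L^\infty}^2$. From the Gronwall lemma, we can integrate the previous inequality as follows,
\[
\|w^\varepsilon(t)\|_{L^2} \le \frac{\|w^\varepsilon(0)\|_{L^2}}{(1-t)^{C_0}} + C\int_0^t\Big(\frac{1-s}{1-t}\Big)^{C_0}\|\Delta^\varepsilon(s)\|_{L^2}\,ds. \tag{5.4}
\]
From Lemma 5, and from Lemma 7, Inequalities (4.3) and (4.4) can also be written
\[
\begin{aligned}
\partial_t\|w^\varepsilon(t)\|_{Y} &\le \frac{C_0}{1-t}\,\|w^\varepsilon(t)\|_{Y} + \frac{C}{1-t}\,\|w^\varepsilon(t)\|_{L^2}\log\frac{1-t}{\varepsilon}\\
&\quad + \frac{C}{1-t}\,\|w^\varepsilon(t)\|_{L^2}\|w^\varepsilon(t)\|_{Y}\log\frac{1-t}{\varepsilon}
+ C\,\frac{\varepsilon}{(1-t)^2}\Big(\log\frac{1-t}{\varepsilon}\Big)^3,
\end{aligned} \tag{5.5}
\]
where we possibly increased the value of $C_1$ (this question will be addressed more precisely in Sect. 6). To estimate the integral of the right-hand side of (5.4), we use Lemmas 6 and 7. The integral is not greater than
\[
\frac{C}{(1-t)^{C_0}}\int_0^t\frac{\varepsilon}{(1-s)^{2-C_0}}\Big(\log\frac{1-s}{\varepsilon}\Big)^2ds. \tag{5.6}
\]
For $j > 0$, we are thus led to study
\[
\int_0^t\frac{1}{(1-s)^{2-C_0}}\Big(\log\frac{1-s}{\varepsilon}\Big)^jds.
\]
We take $j > 0$ and not only $j = 2$ because to estimate $\|w^\varepsilon(t)\|_{Y}$, we will have to deal with similar integrals with $j = 3$. With the substitution $\sigma = \frac{1-s}{\varepsilon}$, it becomes
\[
\varepsilon^{C_0-1}\int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma. \tag{5.7}
\]
Since in Lemmas 4, 6 and 7, we had to restrict our attention to the region $1 - t \gg \varepsilon$, we can replace $\log\sigma$ with $|\log\sigma|$ in the previous integral with no change in the asymptotics, and we have to study
\[
\int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma,\quad j > 0. \tag{5.8}
\]
To estimate these integrals, we have to distinguish two cases, namely $C_0 < 1$ and $C_0 \ge 1$.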
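The change of variable leading to (5.7) is written out below as a quick check. The substitution is taken to be $\sigma = (1-s)/\varepsilon$ in the integration variable $s$, which is the reading that produces the stated limits of integration.

```latex
% Substitution \sigma = (1-s)/\varepsilon, i.e. s = 1 - \varepsilon\sigma, ds = -\varepsilon\,d\sigma;
% s = 0 corresponds to \sigma = 1/\varepsilon and s = t to \sigma = (1-t)/\varepsilon.
\[
  \int_0^t \frac{1}{(1-s)^{2-C_0}}\Big(\log\frac{1-s}{\varepsilon}\Big)^{j}\,ds
  = \int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}
    \frac{(\log\sigma)^{j}}{(\varepsilon\sigma)^{2-C_0}}\;\varepsilon\,d\sigma
  = \varepsilon^{C_0-1}\int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}
    \frac{(\log\sigma)^{j}}{\sigma^{2-C_0}}\,d\sigma .
\]
```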


5.1. Case $C_0 < 1$. In this case, one has obviously $2 - C_0 > 1$, hence the integral (5.8) is convergent. More precisely, we have to estimate the remainder of a converging integral. Integration by parts shows that for $b > a \gg 1$,
\[
\int_a^b\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma = O\Big(\frac{(\log a)^j}{a^{1-C_0}}\Big).
\]
With $a = \frac{1-t}{\varepsilon}$, it follows that the energy estimate (5.4) becomes
\[
\|w^\varepsilon(t)\|_{L^2} \le C_2\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^2. \tag{5.9}
\]
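The integration by parts invoked for $C_0 < 1$ can be sketched as follows. The shorthands $\gamma := 1 - C_0 > 0$ and $I(a,b)$ are introduced here for the computation, and the threshold on $a$ is one convenient choice rather than an optimal one.

```latex
% Notation for this sketch: \gamma := 1 - C_0 > 0,
% I(a,b) := \int_a^b (\log\sigma)^j \sigma^{-1-\gamma}\,d\sigma, with j > 0 and b > a > 1.
\[
  I(a,b) = \Big[-\frac{(\log\sigma)^j}{\gamma\,\sigma^{\gamma}}\Big]_a^b
  + \frac{j}{\gamma}\int_a^b \frac{(\log\sigma)^{j-1}}{\sigma^{1+\gamma}}\,d\sigma
  \le \frac{(\log a)^j}{\gamma\,a^{\gamma}} + \frac{j}{\gamma\,\log a}\,I(a,b),
\]
% using (\log\sigma)^{j-1} \le (\log\sigma)^j/\log a for \sigma \ge a. If \log a \ge 2j/\gamma,
% the last term is at most I(a,b)/2 and can be absorbed:
\[
  I(a,b) \le \frac{2}{\gamma}\,\frac{(\log a)^j}{a^{1-C_0}} .
\]
% Inserting this in (5.6)--(5.7) with a = (1-t)/\varepsilon and j = 2:
\[
  \frac{C\,\varepsilon}{(1-t)^{C_0}}\;\varepsilon^{C_0-1}\,
  \frac{2}{\gamma}\Big(\frac{\varepsilon}{1-t}\Big)^{1-C_0}\Big(\log\frac{1-t}{\varepsilon}\Big)^{2}
  = \frac{2C}{\gamma}\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^{2}.
\]
```

The bound is uniform in the upper limit $b$, which is why the value $b = 1/\varepsilon$ plays no role in (5.9).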

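Let $\alpha > 0$. Then if $1 - t \ge C_*\varepsilon$ where $C_*$ is such that
\[
C_2\,\frac{(\log C_*)^3}{C_*} = \alpha,
\]
where $C_2$ is the constant in (5.9), Inequality (5.5) becomes
\[
\partial_t\|w^\varepsilon(t)\|_{Y} \le \frac{C_0+\alpha}{1-t}\,\|w^\varepsilon(t)\|_{Y} + \frac{C\varepsilon}{(1-t)^2}\Big(\log\frac{1-t}{\varepsilon}\Big)^3. \tag{5.10}
\]
Taking $\alpha > 0$ sufficiently small, we have $C_0 + \alpha < 1$, hence we can replace $C_0 + \alpha$ with $C_0$ with no change in the result. Now we can apply Gronwall lemma to (5.10),
\[
\|w^\varepsilon(t)\|_{Y} \le \frac{\|w^\varepsilon(0)\|_{Y}}{(1-t)^{C_0}} + \frac{C\varepsilon}{(1-t)^{C_0}}\int_0^t\frac{1}{(1-s)^{2-C_0}}\Big(\log\frac{1-s}{\varepsilon}\Big)^3ds.
\]
The previous estimate with $j = 3$ yields
\[
\|w^\varepsilon(t)\|_{Y} \le C\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^3. \tag{5.11}
\]
From (3.4),
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^{5/2}.
\]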
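The exponent $5/2$ and the way condition (5.1) is recovered can be made explicit with the short computation below. It only combines (3.4) in its $Y$-norm form with (5.9) and (5.11); the shorthand $X := (1-t)/\varepsilon$ is introduced for the sketch.

```latex
% Shorthand for this sketch: X := (1-t)/\varepsilon \ge C_* .
\[
  \|w^\varepsilon(t)\|_{L^\infty}
  \le \frac{C}{\sqrt{1-t}}\,\|w^\varepsilon(t)\|_{L^2}^{1/2}\,\|w^\varepsilon(t)\|_{Y}^{1/2}
  \le \frac{C}{\sqrt{1-t}}\Big(\frac{(\log X)^2}{X}\Big)^{1/2}\Big(\frac{(\log X)^3}{X}\Big)^{1/2}
  = \frac{C}{\sqrt{1-t}}\,\frac{(\log X)^{5/2}}{X}.
\]
% Since (\log X)^{5/2}/X \to 0 as X \to \infty, choosing C_* large enough gives
\[
  C\,\frac{(\log X)^{5/2}}{X} \le 4\,\|f\|_{L^\infty}\quad\text{for all } X \ge C_*,
\]
% which is exactly condition (5.1) in the region \{1 - t \ge C_*\varepsilon\}.
```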

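Hence, there exists $C_* = C_*(f)$ such that for $0 < \varepsilon \le \varepsilon^*$ and in the region $\{1 - t \ge C_*\varepsilon\}$, condition (5.1) is always satisfied, and estimates (5.9) and (5.11) hold, which we can summarize in the following proposition.

Proposition 2. Define $\delta := C_1^{-1/2}$. If $f \in H$ satisfies $\|f\|_{L^\infty} < \delta$, then nonlinear geometric optics is uniformly valid in the region $\{1 - t \ge C_*\varepsilon\}$ for some (large) $C_* = C_*(f)$, with the estimates,
\[
\begin{aligned}
\|a^\varepsilon(t,.) - a_0\|_{L^2} &\le C\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^2,\\
\|\partial_\xi(a^\varepsilon(t,.) - a_0)\|_{L^2},\ \|\xi(a^\varepsilon(t,\xi) - a_0(\xi))\|_{L^2} &\le C\,\frac{\varepsilon}{1-t}\Big(\log\frac{1-t}{\varepsilon}\Big)^3.
\end{aligned}
\]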


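5.2. Case $C_0 \ge 1$. Now the integral (5.8) is divergent. Integration by parts shows that for $b > a \gg 1$,
\[
\int_a^b\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma = O\big(b^{C_0-1}(\log b)^j\big).
\]
Then the energy estimate (5.4) becomes
\[
\|w^\varepsilon(t)\|_{L^2} \le C\,\frac{\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^2. \tag{5.12}
\]
Let $\alpha > 0$. First, we restrict our study to the region
\[
\frac{\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^2\log\frac{1-t}{\varepsilon} \le 2\alpha. \tag{5.13}
\]
Then the energy estimate (5.5) becomes, when we take only the “worst” terms into account,
\[
\partial_t\|w^\varepsilon(t)\|_{Y} \le \frac{C_0+2\alpha}{1-t}\,\|w^\varepsilon(t)\|_{Y} + \frac{C\varepsilon}{(1-t)^{1+C_0}}\,|\log\varepsilon|^2\log\frac{1-t}{\varepsilon}. \tag{5.14}
\]
Applying the Gronwall lemma and proceeding as for the $L^2$-norm yields
\[
\|w^\varepsilon(t)\|_{Y} \le C\,\frac{\varepsilon|\log\varepsilon|^3}{(1-t)^{C_0+2\alpha}}. \tag{5.15}
\]
From (3.4),
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\frac{\varepsilon}{(1-t)^{C_0+\alpha}}\,|\log\varepsilon|^{5/2}.
\]
Hence, for $1 - t \gg (\varepsilon|\log\varepsilon|^{5/2})^{1/(C_0+\alpha)}$ and $\varepsilon$ sufficiently small, condition (5.1) is always satisfied, and estimates (5.12) and (5.15) hold. Notice that in this region, for $\varepsilon$ sufficiently small, (5.13) is automatically satisfied.

Proposition 3. Take $\delta$ as in Proposition 2. Assume $f \in H$ satisfies $\|f\|_{L^\infty} \ge \delta$. Let $\alpha > 0$. Then there exists $C^*_\alpha$ such that nonlinear geometric optics is uniformly valid in the region $\{1 - t \ge C^*_\alpha(\varepsilon|\log\varepsilon|^{5/2})^{1/(C_0+\alpha)}\}$, where $C_0 = \|f\|_{L^\infty}^2/\delta^2$, with the estimates,
\[
\begin{aligned}
\|a^\varepsilon(t,.) - a_0\|_{L^2} &\le C\,\frac{\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^2,\\
\|\partial_\xi(a^\varepsilon(t,.) - a_0)\|_{L^2},\ \|\xi(a^\varepsilon(t,\xi) - a_0(\xi))\|_{L^2} &\le C\,\frac{\varepsilon}{(1-t)^{C_0+2\alpha}}\,|\log\varepsilon|^3.
\end{aligned}
\]
Propositions 2 and 3 imply Theorem 2, up to the computation of the smallness constant we find with this method, which we shall perform in the next section.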


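6. Interpretation

6.1. Computation of $\delta$. We now focus on the case $\|f\|_{L^\infty} < \delta$, and compute the best constant given by our method. From Sect. 5.1, we have to compute the coefficient in the factor of $\|w^\varepsilon(t)\|_{L^2}$ in Inequality (4.2), and the constant that appears in the first line of the right-hand side of (4.3) and (4.4). Indeed, for these last two inequalities, we proved that the other terms can be either absorbed (provided we remain sufficiently “far” from the caustic), or considered as a small source term.

For Inequality (4.2), we multiplied (4.1) by $\overline{w^\varepsilon}$, then took the imaginary part of the result integrated in space. Write
\[
|u^\varepsilon|^2u^\varepsilon - |u^\varepsilon_{app}|^2u^\varepsilon_{app} = |u^\varepsilon|^2w^\varepsilon + \big(|u^\varepsilon|^2 - |u^\varepsilon_{app}|^2\big)u^\varepsilon_{app}.
\]
With the method of energy estimates, the first term will vanish, and the second is written
\[
\big(|w^\varepsilon|^2 + 2\operatorname{Re}(\overline{w^\varepsilon}u^\varepsilon_{app})\big)u^\varepsilon_{app}.
\]
Hence, we can rewrite (4.2) more precisely as
\[
\partial_t\|w^\varepsilon(t)\|_{L^2} \le 2|\lambda|\big(2\|w^\varepsilon(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t)\|_{L^\infty}\big)\|u^\varepsilon_{app}(t)\|_{L^\infty}\|w^\varepsilon(t)\|_{L^2} + \text{source term}. \tag{6.1}
\]
For Inequality (4.3), we differentiate $|u^\varepsilon|^2u^\varepsilon - |u^\varepsilon_{app}|^2u^\varepsilon_{app}$, with the result
\[
(u^\varepsilon)^2\varepsilon\partial_x\overline{u^\varepsilon} - (u^\varepsilon_{app})^2\varepsilon\partial_x\overline{u^\varepsilon_{app}} + 2\big(|u^\varepsilon|^2\varepsilon\partial_xu^\varepsilon - |u^\varepsilon_{app}|^2\varepsilon\partial_xu^\varepsilon_{app}\big).
\]
The very last term will be considered as a source term. The term before is written
\[
|u^\varepsilon|^2\varepsilon\partial_xu^\varepsilon = |u^\varepsilon|^2\varepsilon\partial_xw^\varepsilon + |u^\varepsilon|^2\varepsilon\partial_xu^\varepsilon_{app}.
\]
When we take the imaginary part, the term $|u^\varepsilon|^2\varepsilon\partial_xw^\varepsilon$ vanishes, and the other term is made of source terms and of “absorbed” terms. Finally, the only relevant term will be $(u^\varepsilon)^2\varepsilon\partial_x\overline{w^\varepsilon}$, and we can rewrite (4.3) as
\[
\partial_t\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2} \le 2|\lambda|\big(\|w^\varepsilon(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t)\|_{L^\infty}\big)^2\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2} + \text{absorbed terms} + \text{source terms}. \tag{6.2}
\]
Since from (3.5), $J^\varepsilon$ acts on the nonlinearity as a differentiation, the computation is exactly the same as with $\varepsilon\partial_x$, and we have
\[
\partial_t\|J^\varepsilon(t)w^\varepsilon\|_{L^2} \le 2|\lambda|\big(\|w^\varepsilon(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t)\|_{L^\infty}\big)^2\|J^\varepsilon(t)w^\varepsilon\|_{L^2} + \text{absorbed terms} + \text{source terms}. \tag{6.3}
\]
Now notice that in Lemma 4, we could have obtained the estimate
\[
\|u^\varepsilon_{app}(t)\|_{L^\infty} \le \frac{1+\beta}{\sqrt{1-t}}\,\|f\|_{L^\infty},
\]
for any $\beta > 0$, provided that we take $C_*$ sufficiently large.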


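Similarly, for Condition (5.1), we could have taken
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{\beta}{\sqrt{1-t}}\,\|f\|_{L^\infty}.
\]
Obviously, the smaller $\beta$ is, the smaller $\varepsilon^*$ is, and the larger $C^*$ in Theorem 2. We see that for any $\beta > 0$, we can take $C_0 = 2(1+2\beta)^2|\lambda|\,\|f\|_{L^\infty}^2$, which proves that we can take $\delta = |2\lambda|^{-1/2}$, and completes the proof of Theorem 2.

6.2. Proof of Corollary 1. Existence. Recall the scaling (1.7). For every $\varepsilon > 0$, $\psi^\varepsilon$ is the unique solution in $C(\mathbb{R}, \Sigma)$ of the initial value problem
\[
\begin{cases}
i\partial_t\psi^\varepsilon + \dfrac{1}{2}\partial_x^2\psi^\varepsilon = \lambda|\psi^\varepsilon|^2\psi^\varepsilon,\\[1ex]
\psi^\varepsilon_{|t=-1/\varepsilon} = \sqrt{\varepsilon}\,f(\varepsilon x)\,e^{-i\varepsilon\frac{x^2}{2} - i\lambda|f(\varepsilon x)|^2\log\varepsilon}.
\end{cases} \tag{6.4}
\]
For $t < 0$, define $\psi_{app}$ by
\[
\psi_{app}(t,x) := \int e^{-i\frac{t}{2}\xi^2 + ix\xi + i\lambda|f(-\xi)|^2\log|t|}\,a_0(\xi)\,\bar{d}\xi.
\]
From Theorem 2, if $\|f\|_{L^\infty} < |\lambda|^{-1/2}$, then there exist $\varepsilon^*$ and $C^*$ such that for $0 < \varepsilon \le \varepsilon^*$ and $0 \le s \le 1$,
\[
\|\psi^\varepsilon(t) - \psi_{app}(t)\|_{H^s} \le C\,\frac{(\log|t|)^{2+s}}{|t|}, \tag{6.5}
\]
uniformly for $-1/\varepsilon \le t \le -C^*$. Moreover, since the operator $J^\varepsilon$ is nothing but the classical Galilean operator $J(t)$ up to the scaling (1.7), we also have
\[
\|J(t)\psi^\varepsilon - J(t)\psi_{app}\|_{L^2} \le C\,\frac{(\log|t|)^3}{|t|}, \tag{6.6}
\]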

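uniformly for $-1/\varepsilon \le t \le -C^*$.

Proposition 4. Assume $\|f\|_{L^\infty} < |\lambda|^{-1/2}$. Then there exists $C^* > 0$ such that $(\psi^\varepsilon(-C^*))_{0}$ […] $0$ such that for any $t \le -C^*$,
\[
\|\psi_{app}(t)\|_{L^\infty} \le \frac{\|f\|_{L^\infty} + \beta}{|t|^{1/2}}.
\]
Then for $t \le -C^*$ and from (6.17), (6.18) becomes
\[
\partial_t\|\varphi(t)\|_{L^2} \le \Big(\frac{\tilde{C}}{|t|} + \frac{C}{|t|^{\alpha+1/2}}\Big)\|\varphi(t)\|_{L^2},
\]
with $\tilde{C} := |\lambda|\,(\|f\|_{L^\infty} + \beta)^2 < 1$. For $t_0 \le t \le -C^*$, the Gronwall lemma gives
\[
\|\varphi(t)\|_{L^2} \le C\,\|\varphi(t_0)\|_{L^2}\Big|\frac{t_0}{t}\Big|^{\tilde{C}}.
\]
Using the assumptions again, we have
\[
\|\varphi(t)\|_{L^2} \le C\,\frac{1}{|t_0|^{\alpha}}\Big|\frac{t_0}{t}\Big|^{\tilde{C}}.
\]
Given our choice for $\beta$, $\alpha > \tilde{C}$. Fix $t = -C^*$. The right-hand side goes to zero when $t_0$ goes to $-\infty$. Hence $\varphi(-C^*) = 0$, and $\varphi \equiv 0$ from the uniqueness for (1.2) in $C(\mathbb{R}_t, L^2 \cap L^\infty)$ (see [7]). This proves Proposition 1 and completes the proof of Corollary 1.

Remark 9. For Proposition 1, we need the assumption $f \in H^2(\mathbb{R})$ because it is the minimum regularity we assumed for Lemma 4.

7. Construction of the Modified Scattering Operator and Application

7.1. Proof of Corollary 2. We first recall the main result in [9] for nonlinear Schrödinger equation, which corresponds to the notion of asymptotic completeness of the modified wave operators introduced in [11].

Theorem 3 ([9], Theorem 1.2, case $n = 1$). Let $\phi \in \Sigma$, with $\|\phi\|_{\Sigma} = \delta' \le \delta$, where $\delta$ is sufficiently small. Let $\psi \in C(\mathbb{R}_t, \Sigma)$ be the solution of the initial value problem (6.14), with $C^* = 0$. Then there exist unique functions $W \in L^2 \cap L^\infty$ and $\varphi \in L^\infty$ such that for $t \ge 1$,
\[
\Big\|\mathcal{F}\big(U_0(-t)\psi\big)(t)\,\exp\Big(-i\frac{\lambda}{2\pi}\int_1^t|\widehat{\psi}(\tau)|^2\frac{d\tau}{\tau}\Big) - W\Big\|_{L^2 \cap L^\infty} \le C\delta'\,t^{-\alpha + C(\delta')^2}, \tag{7.1}
\]
\[
\Big\|\frac{\lambda}{2\pi}\int_1^t|\widehat{\psi}(\tau)|^2\frac{d\tau}{\tau} - \lambda|W|^2\log t - \varphi\Big\|_{L^\infty} \le C\delta'\,t^{-\alpha + C(\delta')^2}, \tag{7.2}
\]
where $C\delta' < \alpha < 1/4$, and $\varphi$ is a real valued function. Furthermore we have the asymptotic formula for large time $t$,
\[
\psi(t,x) = \frac{1}{(it)^{1/2}}\,W\Big(\frac{x}{t}\Big)\exp\Big(i\frac{x^2}{2t} + i\lambda\Big|W\Big(\frac{x}{t}\Big)\Big|^2\log t + i\varphi\Big(\frac{x}{t}\Big)\Big) + O\big(\delta'\,t^{-1/2-\alpha+C(\delta')^2}\big) \tag{7.3}
\]
and the estimate
\[
\big\|\mathcal{F}\big(U_0(-t)\psi\big)(t) - W\exp\big(i\lambda|W|^2\log t + i\varphi\big)\big\|_{L^2 \cap L^\infty} \le C\delta'\,t^{-\alpha + C(\delta')^2}. \tag{7.4}
\]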

Remark 10. Uniqueness follows from (7.1) and (7.2), which make it possible to define $W$ and $\varphi$. The asymptotics (7.3) and (7.4) are immediate consequences of (7.1) and (7.2). In particular, if we replace $(W, \varphi)$ with $(We^{i\tilde{\varphi}}, \varphi - \tilde{\varphi})$, where $\tilde{\varphi} \in L^\infty$, then (7.3) and (7.4) still hold.

Remark 11. This theorem states “almost” asymptotic completeness for small data for the modified wave operators introduced in [11]. Indeed, no regularity for the momenta of $\psi$ is proved in [11]. In Corollary 1, we limit the loss of regularity, and in particular obtain for $\psi$ that required in Theorem 3.

Proof of Corollary 2. Let $\psi_- \in \mathcal{F}(H)$, with $\|\widehat{\psi}_-\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$. From Corollary 1, there exists a unique $\psi \in C(\mathbb{R}, \Sigma)$ solution of (1.2) satisfying (1.15), (1.16). The first step is then to check that for $\psi_-$ sufficiently small, $\|\psi(0)\| < \delta$, so that we can use the results of Theorem 3. The second step consists in defining $\psi_+$. From Duhamel’s formula, one has
\[
\psi(0) = U_0(C^*)\psi(-C^*) - i\lambda\int_{-C^*}^0U_0(-s)|\psi|^2\psi(s)\,ds.
\]
On the other hand, we saw that for $C^* \gg 1$,
\[
\begin{aligned}
\|U_0(C^*)\psi(-C^*)\| &\le \|U_0(C^*)\psi_{app}(-C^*)\| + \|U_0(C^*)(\psi - \psi_{app})(-C^*)\|\\
&\le C\|\psi_-\|\log C^* + C\,\frac{(\log C^*)^4}{C^*}.
\end{aligned}
\]
From local estimates for (1.2), we see that there exist functions $h_j$, $j = 1, 2, 3$, with $h_1(x) \to 0$ as $x \to +\infty$, $h_2$ increasing, and $h_3(x) \to 0$ as $x \to 0$, such that
\[
\|\psi(0)\| \le h_1(C^*) + h_2(C^*)\,h_3(\|\psi_-\|). \tag{7.5}
\]
Taking first $C^*$ sufficiently large, then $\psi_-$ sufficiently small, we see that we can have $\|\psi(0)\| < \delta$. Then Theorem 3 provides (unique) functions $W$ and $\varphi$. Define $\psi_+$ by
\[
\psi_+ := \mathcal{F}^{-1}\big(We^{i\varphi}\big) \in L^2(\mathbb{R}). \tag{7.6}
\]


From (7.3) and (7.4), we have, in $L^2$,
\[
\psi(t) \mathop{\sim}_{t\to+\infty} e^{i\lambda|\psi_+(\frac{x}{t})|^2\log t}\,U_0(t)\psi_+,
\]
which, along with Corollary 1, yields Corollary 2.

7.2. Consequences for nonlinear geometric optics. In Sect. 2, we mentioned the fact that to describe the asymptotics of $u^\varepsilon$ after the caustic, one needs a modified scattering operator. Now that we have one, we can describe $u^\varepsilon$ globally. We first give a heuristic approach, then prove Corollary 3.

We already noticed that the phase $g^\varepsilon$ (hence the symbol $a^\varepsilon$) is defined only for $t < 1$. If we want a global description, we have to replace $g^\varepsilon$ with a phase $\phi^\varepsilon$ which is defined for all $t$, and coincides asymptotically with $g^\varepsilon$ for $t < 1$. To guess which possible $\phi^\varepsilon$ we can choose, recall Scaling (1.7). The function $\psi^\varepsilon$ solves (1.2), and we saw that, for $t \in\, ]-\infty, T]$, where $T$ is finite, $\psi^\varepsilon(t) \to \psi(t)$ in $L^2 \cap L^\infty$ as $\varepsilon \to 0$. Hence we have
\[
i\partial_t\psi^\varepsilon + \frac{1}{2}\partial_x^2\psi^\varepsilon = \lambda|\psi|^2\psi^\varepsilon + \text{small}. \tag{7.7}
\]
Forget the “small” term. We now have to study a linear Schrödinger equation, with a time-dependent potential $\lambda|\psi|^2$. According to the vocabulary used in [3], this is not a short range potential, for it does not belong to $L^1_t(L^\infty_x)$. A scattering theory for long range potentials is available (see for instance [3]). The first idea is due to Dollard and consists in studying
\[
\psi^\varepsilon(t,x)\exp\Big(-i\lambda\int_0^t|\psi|^2(s,s\xi)\,ds\Big)
\]
in order to get rid of the long range part. In our context, this means that we can replace $g^\varepsilon$ with
\[
\phi^\varepsilon(t,\xi) := -\lambda\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}}|\psi|^2(s,s\xi)\,ds + \lambda|f(-\xi)|^2\log\frac{1}{\varepsilon}. \tag{7.8}
\]
The symbol $a^\varepsilon$ is now defined (globally in time) by
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + i\phi^\varepsilon(t,\xi)}\,a^\varepsilon(t,\xi)\,\bar{d}\xi. \tag{7.9}
\]
Now from Corollary 1, one has, for $t < 1$,
\[
|\psi(s,s\xi)| = \frac{1}{|2\pi s|^{1/2}}\,|\widehat{\psi}_-(\xi)| + o(1)\quad\text{in } L^1_t(L^\infty_x),
\]
hence, in $L^\infty_{t,\mathrm{loc}}(0,1; L^\infty_x)$,
\[
\phi^\varepsilon(t,\xi) = g^\varepsilon(t,\xi) + o(1). \tag{7.10}
\]
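The leading-order computation behind (7.10) can be sketched as below. Two relations are assumed here because they are not restated in this part of the text: $|\widehat{\psi}_-(\xi)|^2 = 2\pi|f(-\xi)|^2$ (consistent with the thresholds in Theorem 2 and Corollary 1) and $g^\varepsilon(t,\xi) = \lambda|f(-\xi)|^2\log\frac{1-t}{\varepsilon}$ (the result of integrating (2.6) for $\beta = 2$ with the initial data given in Sect. 2).

```latex
% Replace |\psi(s,s\xi)|^2 in (7.8) by its leading term |\widehat\psi_-(\xi)|^2/(2\pi|s|), for 0 \le t < 1:
\[
  -\lambda\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}} \frac{|\widehat\psi_-(\xi)|^2}{2\pi |s|}\,ds
  = -\lambda\,|f(-\xi)|^2 \int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}} \frac{dr}{r}
  = -\lambda\,|f(-\xi)|^2\Big(\log\frac{1}{\varepsilon} - \log\frac{1-t}{\varepsilon}\Big)
  = \lambda\,|f(-\xi)|^2 \log(1-t).
\]
% Adding the constant term of (7.8):
\[
  \lambda\,|f(-\xi)|^2\log(1-t) + \lambda\,|f(-\xi)|^2\log\frac{1}{\varepsilon}
  = \lambda\,|f(-\xi)|^2\log\frac{1-t}{\varepsilon} = g^\varepsilon(t,\xi).
\]
```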


Therefore, even with this new definition of $a^\varepsilon$, we have, for $t < 1$, $a^\varepsilon(t,\xi) \to a_0(\xi)$ in $L^2$ as $\varepsilon \to 0$. Similarly, for $t > 1$ and from Theorem 3, there exists a function $H$ (that depends on $\psi$) such that in $L^\infty_{t,\mathrm{loc}}(1,2; L^\infty_x)$,
\[
\phi^\varepsilon(t,\xi) = -\lambda|W(\xi)|^2\log\frac{t-1}{\varepsilon} + H(\xi) + o(1). \tag{7.11}
\]
In particular, since
\[
a^\varepsilon(t,\xi) = e^{-i\phi^\varepsilon(t,\xi)}\,\mathcal{F}\Big(U_0\Big(\frac{1-t}{\varepsilon}\Big)\psi^\varepsilon\Big(\frac{t-1}{\varepsilon}\Big)\Big)
\]
and the map $\phi \mapsto (W, \varphi)$ in Theorem 3 is continuous,
\[
a^\varepsilon(t,\xi) \mathop{\longrightarrow}_{\varepsilon\to 0} e^{-iH(\xi) + i\varphi(\xi)}\,W(\xi) = e^{-iH(\xi)}\,\widehat{\psi}_+,\quad\text{in } L^2. \tag{7.12}
\]
Apparently, the limit of $a^\varepsilon$ depends on this function $H$. One must bear in mind that this function $H$ is closely related to our choice in the definition of the new phase $\phi^\varepsilon$. For instance, one can check that replacing $\phi^\varepsilon$ with
\[
\phi^\varepsilon(t,\xi) + h_1(\xi)\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}}h_2(s)\,ds,
\]
where $h_1 \in L^\infty$, $h_2 \in L^1$, would just alter the definition of $H$. Thus this function appears as a parameter in the definition of $a^\varepsilon$. Nevertheless, the asymptotics for $u^\varepsilon$ is independent of $H$. It is given, in $L^2$, by (7.12), (7.11) and the first part of Lemma 1. This leads to the asymptotics given in Corollary 3 for $t > 1$. The asymptotics for $t < 1$ is a simple consequence of Theorem 2 and (7.10). This completes the proof of Corollary 3.

Acknowledgements. I would like to thank Professor A. Bressan for his invitation at SISSA, where this work was achieved. This research was supported by the European TMR ERBFMRXCT960033.

References

1. Barab, J. E.: Nonexistence of asymptotically free solutions for nonlinear Schrödinger equation. J. Math. Phys. 25, 3270–3273 (1984)
2. Carles, R.: Geometric optics with caustic crossing for some nonlinear Schrödinger equations. Indiana Univ. Math. J. 49, 475–551 (2000)
3. Dereziński, J., and Gérard, C.: Scattering theory of quantum and classical N-particle systems. Texts and Monographs in Physics, Berlin–Heidelberg: Springer Verlag, 1997
4. Duistermaat, J. J.: Oscillatory integrals, Lagrangian immersions and unfolding of singularities. Comm. Pure Appl. Math. 27, 207–281 (1974)
5. Ginibre, J.: Introduction aux équations de Schrödinger non linéaires. Cours de DEA, Paris Onze Édition (1995)
6. Ginibre, J.: An introduction to nonlinear Schrödinger equations. In: Nonlinear waves (Sapporo, 1995). Gakkōtosho, R. Agemi and Y. Giga and T. Ozawa (eds.), GAKUTO International Series, Math. Sciences and Appl., 1997, pp. 85–133
7. Ginibre, J., and Velo, G.: On a class of nonlinear Schrödinger equations. III. Special theories in dimensions 1, 2 and 3. Annales de l’Institut Henri Poincaré. Section A. Physique Théorique. Nouvelle Série 28, 287–316 (1978)
8. Ginibre, J., and Velo, G.: Long Range Scattering and Modified Wave Operators for some Hartree Type Equations III. Gevrey spaces and low dimensions. J. Diff. Eq., to appear
9. Hayashi, N., and Naumkin, P.: Asymptotics for large time of solutions to the nonlinear Schrödinger and Hartree equations. Am. J. Math. 120, 369–389 (1998)
10. Hunter, J., and Keller, J.: Caustics of nonlinear waves. Wave Motion 9, 429–443 (1987)
11. Ozawa, T.: Long range scattering for nonlinear Schrödinger equations in one space dimension. Commun. Math. Phys. 139, 479–493 (1991)
12. Strauss, W.: Nonlinear scattering theory. In: Scattering theory in mathematical physics, J. Lavita and J. P. Marchands (eds.), Dordrecht: Reidel, 1974
13. Strauss, W.: Nonlinear scattering theory at low energy. J. Funct. Anal. 41, 110–133 (1981)

Communicated by A. Kupiainen

Commun. Math. Phys. 220, 69 – 94 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Regularized Products and Determinants

Georg Illies

IHES, Le Bois-Marie, 35, Route de Chartres, 91440 Bures-sur-Yvette, France. E-mail: [email protected]
Present address: Algebra und Zahlentheorie, Fachbereich Mathematik, Universität – Gesamthochschule Siegen, Walter-Flex-Str. 3, 57068 Siegen, Germany. E-mail: [email protected]

Received: 4 April 2000 / Accepted: 15 January 2001

Abstract: Zeta-regularized products are used to define determinants of operators in infinite dimensional spaces. This article provides a general theory of regularized products and determinants which delivers a better approach to their existence and explicit determination.

1. Introduction

The zeta-regularized product of a sequence $a_k \in \mathbb{C}^*$ is defined by

$$\prod_{k=1}^{\infty} a_k := \exp\Bigl(-\frac{d}{ds}\sum_{k=1}^{\infty} a_k^{-s}\Big|_{s=0}\Bigr), \tag{1.1}$$

provided that the Dirichlet series converges absolutely in a half plane and can be meromorphically continued to the left of $\Re(s) = 0$; the evaluation at $s = 0$ means the constant term of the Laurent expansion. This obviously generalizes the ordinary finite product. Zeta-regularization was first used to define analytic torsion [RS] and since then has played a role in global analysis, the theory of dynamical zeta functions, and Arakelov theory. Theoretical physicists use zeta-regularization as a method for renormalization in quantum field theories [EORBZ] and various papers (e.g. [Ef, Ko1, Ko2, Ko3, Sa]) have calculated the regularized determinant of Laplacians. Zeta-regularization also appeared in a conjectural cohomological approach to motivic L-functions ([De1, De2, De3, Ma]). In that context the question appeared as to which meromorphic functions of finite order (e.g. motivic L-functions) are zeta-regularized, i.e. can be represented as

$$f(z) = \prod_{\rho} (z - \rho)^{\pm 1} \tag{1.2}$$


where the product is over all zeroes and poles $\rho$ of $f(z)$ with multiplicities, the sign of the exponent being positive for zeroes and negative for poles. This turns out to be the basic problem of zeta-regularized determinants, and it was the starting point of the following investigation which, we hope, gives a satisfying answer to the question.

Regularization entails several technical problems because of the meromorphic continuation of the Dirichlet series. For example, the regularized product of all primes does not exist as $\sum_p p^{-s}$ has the natural boundary $\Re(s) = 0$ ([LW]). The aim of this paper is to give a better approach to regularized products improving the formalism in [Vo, CV, QHS] and [JL1] (compare Sect. 6 below) which is based on the representation of the Dirichlet series as the Mellin transform of the series

$$\theta(t) := \sum_{k=1}^{\infty} e^{a_k t}. \tag{1.3}$$

In many applications $\arg(a_k)$ varies in such a way that this series does not converge for any $t$, for example in the product (1.2). This problem can be solved by instead using a kind of Hankel integral of the Dirichlet series (see Sect. 7). To treat the product (1.2) one has by definition to regard the function $\zeta(s, z) := \sum_{\rho} \pm (z - \rho)^{-s}$. This paper is also thought of as an examination of the analytic and asymptotic properties of this generalized Hurwitz zeta function, which should be interesting for its own sake.

Before giving a short overview we introduce the notion of a divisor, which is basic for all that follows. A divisor $D$ is given by a function $m_D : \mathbb{C} \to \mathbb{Z}$ such that there is a $\beta > 0$ with

$$\sum_{\rho \in \mathbb{C}^*} \frac{|m_D(\rho)|}{|\rho|^{\beta}} < \infty. \tag{1.4}$$

Condition (1.4) reflects that the Dirichlet series in Definition (1.1) must converge absolutely in a half plane. We recall a fundamental fact from the theory of entire functions of finite order (compare [Ti] for the proof): A function $m_D : \mathbb{C} \to \mathbb{Z}$ gives rise to a divisor if and only if there is a meromorphic function $f(z)$ of finite order (i.e. $f(z)$ is the quotient of two entire functions of finite order) such that

$$m_D(\rho) = \operatorname{ord}_{z=\rho} f(z), \qquad \rho \in \mathbb{C},$$

thus $D$ is the divisor of $f(z)$ in the usual sense. This function $f(z)$ is determined by $D$ up to an exponential polynomial, i.e. a function $g(z)$ is meromorphic of finite order with divisor $D$ if and only if there is a polynomial $P(z)$ with $g(z) = e^{P(z)} f(z)$.

After introducing some notation (Sect. 2) we define a general class of regularized products in Sect. 3, zeta-regularization being just an example; the rhs of (1.2) with the multiplicities $m_D(\rho)$, in the case of its existence, is called regularized determinant and denoted by

$$\Delta_D(z) := \prod_{\rho} (z - \rho)^{\pm 1}. \tag{1.5}$$

We prove that $\Delta_D(z)$ is a meromorphic function of finite order with divisor $D$, thus equals $e^{P(z)} f(z)$ for a certain polynomial $P(z)$. Regularization means finding this polynomial. Section 4, a sort of theoretical excursion, discusses axiomatic generalizations of the regularization process and shows that a theory of regularization should deal with quasi-directed divisors (defined in Sect. 2).


If the Dirichlet series in the definition of regularized products also satisfies certain exponential estimates and does not have too many poles we speak of bounded regularizability (Sect. 5). In that case one can apply certain integral transformations, especially the Mellin transform, the mentioned Hankel integral and the Laplace transform, to get Theorems 3, 4 and 6 of Sects. 6, 7 and 9. They give the equivalence of bounded regularizability with certain asymptotics for $\theta_D(t)$, $\zeta_D(s, z)$ and for the function $\theta_D(t, s)$ which is defined as the Laplace transform of $\zeta_D(s, z)$. As a corollary of Theorem 4 one gets Theorem 5 in Sect. 7, which is the fundamental theorem of the theory of regularization. It states that $D$ is bounded regularizable if and only if for some $0 < \psi_i < \pi$, $i = 1, 2$, and $\varepsilon > 0$ an asymptotic

$$\log f(z) = \sum_{i} z^{\alpha_i} \log^{n_i} z + o(|z|^{-\varepsilon}), \qquad |z| \to \infty, \tag{1.6}$$

with finite sum, $\alpha_i \in \mathbb{C}$ and $n_i \in \mathbb{N}_0$, is valid for $-\psi_2 < \arg(z) < \psi_1$. $\Delta_D(z)$ exists in that case, and the polynomial $P(z)$ can also be determined in terms of this asymptotic of $\log f(z)$, which is very intrinsic.

These results deliver a satisfying theory of regularization and apply to a large class of examples. In Sect. 8.1, for instance, it is shown that every meromorphic function of finite order representable by a Dirichlet series is regularized; this improves results of Jorgenson and Lang ([JL1, JL2]) who had to assume that it also satisfies a functional equation. In Sect. 8.2 we regularize higher $\Gamma$-functions. Thus Sect. 8 is applicable to various kinds of zeta and L-functions.

The function $\theta_D(t, s)$, introduced in Sect. 9, is a type of multivalued theta function and plays a central role in [Il2]. There is also an alternative approach to regularization via renormalizing certain divergent integrals (Sect. 10). The following three sections contain technical proofs which were postponed.

The article reproduces the main results of Chapter 2 of my thesis [Il1] in a more special context. In some cases we only give sketches of the proofs; for complete proofs, generalizations and further results the reader is referred to [Il1]. In [Il2] it is shown how to apply the theory of regularization to generalize results of Cramér ([Cr]) and Guinand ([Gui]), thus improving results of [JL2].

2. Notation

In the sequel $f(z)$ denotes a meromorphic function of finite order and $D$ its divisor. We define two important parameters: The exponent $r$ of $f$ and $D$ is the infimum of all $\beta > 0$ satisfying (1.4); the genus $g$ of $f$ and $D$ is the smallest $n \in \mathbb{N}_0$ such that (1.4) is satisfied for $\beta = n + 1$; note $g + 1 \ge r \ge g$. We will say that $D$ lies in a set $M \subset \mathbb{C}$ if $m_D(\rho) \ne 0$ implies that $\rho \in M$. Let $0 < \varphi_i < \pi$, $i = 1, 2$; then we define open connected sets

$$W^r_{\varphi_1,\varphi_2} := \{z \in \mathbb{C}^* \mid -\varphi_2 < \arg(z) < \varphi_1\}, \qquad W^l_{\varphi_1,\varphi_2} := \mathbb{C}^* \setminus \overline{W^r_{\varphi_1,\varphi_2}}$$

and a contour $C_{\varphi_1,\varphi_2}$ consisting of the ray from $e^{-\varphi_2 i}\infty$ to $0$ and the ray from $0$ to $e^{\varphi_1 i}\infty$; thus $\mathbb{C} = W^l_{\varphi_1,\varphi_2} \cup C_{\varphi_1,\varphi_2} \cup W^r_{\varphi_1,\varphi_2}$ is a disjoint union.


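[Figure: the complex $z$-plane with axes $\Re(z)$ and $\Im(z)$, showing the contour $C_{\varphi_1,\varphi_2}$ (two rays through $0$ at the angles $\varphi_1$ and $-\varphi_2$), the sector $W^r_{\varphi_1,\varphi_2}$ and the region $W^l_{\varphi_1,\varphi_2}$.]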

A divisor $D$ is called directed if it lies in a $W^l_{\varphi_1,\varphi_2}$. It is called quasi-directed if it is directed with the exception of finitely many $\rho$, and it is called strictly directed if it lies in a $W^l_{\varphi_1,\varphi_2}$ with $\varphi_1 > \frac{\pi}{2}$ and $\varphi_2 > \frac{\pi}{2}$. We will also write $\rho \in D$ instead of $m_D(\rho) \ne 0$ and use the following notation:

$$\sum_{\rho \in D} \varphi(\rho) := \sum_{\rho \in \mathbb{C}} m_D(\rho)\,\varphi(\rho).$$

3. Xi Functions and Regularization

Definition 3.1. If $D$ is a directed divisor, $U_D := \{z \in \mathbb{C} \mid |z| < |\rho| \text{ for all } \rho \in D\}$ and the argument is chosen so that $-\pi < \arg(z - \rho) < \pi$, then

$$\xi_D(s, z) := \sum_{\rho \in D} \frac{\Gamma(s)}{(z - \rho)^s}, \qquad \Re(s) > r, \quad z \in U_D, \tag{3.1}$$

is called the Hurwitz xi function of $D$; $\xi_D(s) := \xi_D(s, 0)$ is called the xi function of $D$. Convergence is absolute and $\xi_D(s, z)$ is holomorphic in both variables.

Proposition 3.2. $\xi_D(s, z)$ satisfies the following differential equation:

$$\frac{d}{dz}\,\xi_D(s, z) = -\xi_D(s + 1, z). \tag{3.2}$$

A function $f(z)$ is meromorphic of finite order with divisor $D$ if and only if for some $l \ge g$:

$$\Bigl(\frac{d}{dz}\Bigr)^{l+1} \log f(z) = (-1)^l\, \xi_D(l + 1, z). \tag{3.3}$$

Proof. Equation (3.2) follows by taking the term by term derivative. For (3.3) check that $\log \Delta^{\mathrm{Wei},l}_{D,a}(z)$ defined in (3.6) below satisfies (3.3) and observe that the operation $\bigl(\frac{d}{dz}\bigr)^{l+1}$ exactly kills exponential polynomials of degree $\le l$.
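The two identities of Proposition 3.2 can be checked numerically on a toy example. The following is a minimal Python sketch, assuming a small finite divisor (for which $g = 0$, so (3.3) is tested with $l = 0$); the divisor `DIVISOR`, the sample values `z0`, `s0` and all helper names are illustrative choices, not taken from the article.

```python
import cmath
import math

# Hypothetical finite divisor {rho: multiplicity}; every rho has negative real part.
DIVISOR = {-1 + 0j: 1, -2 + 1j: 2, -3 - 2j: -1}

def xi(s, z):
    """Hurwitz xi function: sum over rho of m(rho) * Gamma(s) / (z - rho)^s, real s > 0."""
    return sum(m * math.gamma(s) * cmath.exp(-s * cmath.log(z - rho))
               for rho, m in DIVISOR.items())

def log_f(z):
    """log f(z) for f(z) = prod (z - rho)^m(rho), taken term by term (principal branch)."""
    return sum(m * cmath.log(z - rho) for rho, m in DIVISOR.items())

def derivative(func, z, h=1e-5):
    """Central finite difference approximation of func'(z)."""
    return (func(z + h) - func(z - h)) / (2 * h)

z0 = 0.7 + 0.3j   # sample point; z0 - rho stays away from the branch cut
s0 = 2.5          # sample exponent

# (3.2): d/dz xi(s, z) = -xi(s + 1, z)
lhs_32 = derivative(lambda z: xi(s0, z), z0)
rhs_32 = -xi(s0 + 1, z0)

# (3.3) with l = 0: d/dz log f(z) = xi(1, z)
lhs_33 = derivative(log_f, z0)
rhs_33 = xi(1, z0)

print(abs(lhs_32 - rhs_32), abs(lhs_33 - rhs_33))
```

Both printed differences should be at the level of the finite difference error; this only illustrates the identities and is no substitute for the proof.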


Proposition 3.3. For $z \in U_D$ the following absolutely convergent Taylor series expansion is valid:

$$\xi_D(s, z) = \sum_{m=0}^{\infty} (-1)^m\, \xi_D(s + m)\, \frac{z^m}{m!}. \tag{3.4}$$

If $\xi_D(s)$ is meromorphic for $\Re(s) > -p$ then $\xi_D(s, z)$ is also meromorphic for $\Re(s) > -p$ and holomorphic for $z \in U$ for any simply connected $U \subset \mathbb{C}$ with $U_D \subset U$ and $\rho \notin U$ for all $\rho \in D$.

Proof. The Taylor series follows from (3.2); for the meromorphy in $s$ observe that shifting the coefficients does not change the convergence radius. The continuation in $z$ is obtained by treating finitely many $\rho \in D$ separately.

Definition 3.4. A regularization sequence $\delta$ is a sequence of complex numbers $\delta_0, \delta_1, \dots$ with $\delta_0 = 1$. Formally let $\delta(s) := \delta_0 + \delta_1 s + \delta_2 s^2 + \dots$. A directed divisor $D$ is called regularizable if $\xi_D(s)$ is meromorphic in a half plane $\Re(s) > -\varepsilon$ with $\varepsilon > 0$. For $z \in U$

$$\Delta_D(z) := \exp\bigl(-\mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s, z))\bigr) \tag{3.5}$$

is called the $\delta$-regularized determinant of $D$. One calls $\Delta_D(0)$ the $\delta$-regularized product of $D$.

Note that $\mathrm{CT}_{s=0}$ means the constant term in the Laurent expansion at $s = 0$. If $\delta(s)$ is a divergent series, then one has to develop $\xi_D(s, z)$ in a Laurent series and multiply it formally with the formal series for $\delta(s)$. In the sequel there will often appear formulas which must be interpreted in this formal sense.

Examples.
1) xi-regularized determinant (Jorgenson, Lang): $\delta(s) = 1$,
2) zeta-regularized determinant: $\delta(s) = \Gamma^{-1}(s + 1)$,
3) zero-renormalized determinant: $\delta(s) = \Gamma(1 - s)$.

Remark. The factor $\delta(s)$ is introduced for several reasons. First, one wants to handle "scaled" products $\prod a(z - \rho)$ (compare [De1, De2, De3]). It also turned out that the canonical way of renormalization (see Theorem 7 in Sect. 10) differs from zeta-regularization. A further reason is that in [JL1] xi-regularization was used, which is technically the simplest regularization. While zeta-regularization as well as zero-renormalization generalize the ordinary finite product (as every regularization with $\delta_1 = \gamma$ does, $\gamma$ the Euler-Mascheroni constant), xi-regularization does not. Zeta-regularization satisfies the product rule $\prod \rho^n = (\prod \rho)^n$, so it comes closest to what one would expect for a product.

Fix $a \in \mathbb{C}$ with $m_D(a) = 0$ and $l \ge g$; then we define the absolutely convergent Weierstrass product

$$\Delta^{\mathrm{Wei},l}_{D,a}(z) := \prod_{\rho \in \mathbb{C}} \Bigl[\Bigl(1 - \frac{z - a}{\rho - a}\Bigr) \exp\Bigl(\sum_{k=1}^{l} \frac{1}{k} \Bigl(\frac{z - a}{\rho - a}\Bigr)^{k}\Bigr)\Bigr]^{m_D(\rho)}, \tag{3.6}$$

which is a meromorphic function of finite order with divisor $D$. For $a = 0$ and $g = l$ one has the usual canonical Weierstrass product (compare [Ti]).


Theorem 1. $\Delta_D(z)$ is a meromorphic function of finite order with divisor $D$. The explicit relation to Weierstrass products is given by

$$\Delta_D(z) = e^{P(z)}\, \Delta^{\mathrm{Wei},l}_{D,a}(z), \tag{3.7}$$

where with a suitable branch of the logarithm

$$P(z) = \sum_{m=0}^{l} \frac{(z - a)^m}{m!}\, \log^{(m)} \Delta_D(a), \tag{3.8}$$

$$\log^{(m)} \Delta_D(z) = (-1)^{m+1}\, \mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s + m, z)), \qquad m = 0, 1, \dots.$$

In the sequel a meromorphic function of finite order $f(z)$ with divisor $D$ will be called $\delta$-regularized if it equals $\Delta_D(z)$.

Proof of Theorem 1. We have

$$\Bigl(\frac{d}{dz}\Bigr)^{m+1} \log \Delta_D(z) = (-1)^m\, \mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s + m + 1, z)) \tag{3.9}$$

$$= (-1)^m\, \xi_D(m + 1, z) \quad \text{for } m \ge g; \tag{3.10}$$

the first equation holds because of (3.2) and is valid for all $m \in \mathbb{N}_0$. It is also true that $\xi_D(s, z)$ is holomorphic for $s = m + 1$ if $m \ge g$ by the definition of $g$. Comparison with (3.3) proves the first assertion. One also easily checks

$$\Bigl(\frac{d}{dz}\Bigr)^{m+1} \log \Delta^{\mathrm{Wei},l}_{D,a}(z)\Big|_{z=a} = \begin{cases} 0 & \text{for } m = -1, 0, \dots, l - 1, \\ (-1)^m\, \xi_D(m + 1, a) & \text{for } m \ge l. \end{cases}$$

Using this as well as (3.9) and (3.10), the explicit relation to Weierstrass products follows by subtracting the Taylor series expansion around $z = a$ for $\log \Delta^{\mathrm{Wei},l}_{D,a}(z)$ from that for $\log \Delta_D(z)$.

4. Determinant Systems

We will call a function $f(z)$ associated to a divisor $D$ if there is a polynomial $P(z)$ such that

$$f(z) = e^{P(z)}\, \Delta^{\mathrm{Wei},g}_{D,a}(z) \quad \text{and} \quad \deg P \le g,$$

or equivalently that (3.3) is satisfied for $l = g$ (compare the proof of (3.3)). Observe that in this definition we have set $l = g$ in (3.6). Note also that if $\Delta^{\mathrm{Wei},g}_{D,a}(z)$ is, in addition, entire then its order is exactly $r$ and no entire function with divisor $D$ can have a smaller order (compare [Ti]). So associated functions have minimal order because of $g \le r$.

By Theorem 1 regularization means picking out a certain associated function to a divisor. Now we ask for extensions of this process to non-regularizable divisors. For $\alpha \in \mathbb{C}$ we define the translated divisor $D|_{+\alpha}$ by $m_{D|_{+\alpha}}(z + \alpha) := m_D(z)$ and the sum $D_1 + D_2$ by $m_{D_1 + D_2}(z) := m_{D_1}(z) + m_{D_2}(z)$. Let $\mathcal{D}_{\mathrm{fin}}$ be the abelian group of all finite divisors, $\mathcal{D}_{\mathrm{breg}}$ that of all bounded regularizable quasi-directed divisors (compare Sect. 5), $\mathcal{D}_{\mathrm{reg}}$ that of all regularizable quasi-directed divisors, i.e. those which are regularizable directed after eliminating finitely many $\rho$, $\mathcal{D}_{\mathrm{qd}}$ that of all quasi-directed divisors, and $\mathcal{D}$ that of all divisors. These are all translation-invariant with proper inclusions $\mathcal{D}_{\mathrm{fin}} \subset \mathcal{D}_{\mathrm{breg}} \subset \mathcal{D}_{\mathrm{reg}} \subset \mathcal{D}_{\mathrm{qd}} \subset \mathcal{D}$. The following definition arises from the demand for generalizations of the characteristic polynomial to the infinite dimensional case.


Definition 4.1. Let $\mathcal{D}' \subseteq \mathcal{D}$ be a translation-invariant subgroup. A determinant $\Delta$ on $\mathcal{D}'$ attaches to every $D \in \mathcal{D}'$ an associated function $\Delta_D(z)$, such that:
i) $\Delta_{D|_{+\alpha}}(z + \alpha) = \Delta_D(z)$ (translation-invariance)
ii) $\Delta_{D_1 + D_2}(z) = \Delta_{D_1}(z)\,\Delta_{D_2}(z)$ (linearity)
for $D, D_1, D_2 \in \mathcal{D}'$, $\alpha \in \mathbb{C}$. $(\mathcal{D}', \Delta)$ is called a determinant system.

Examples.
1) $(\mathcal{D}_{\mathrm{fin}}, \Delta)$ with characteristic “polynomial” which is a rational function defined by $\Delta_D(z) := \prod_{\rho \in \mathbb{C}} (z - \rho)^{m_D(\rho)}$.
2) $(\mathcal{D}_{\mathrm{reg}}, \Delta)$ with the $\delta$-regularized determinant $\Delta$. For a $D \in \mathcal{D}_{\mathrm{reg}}$ which is not directed, $\Delta_D(z)$ can be defined by translation-invariance.

Theorem 2 answers the question of how large determinant systems can be.

Theorem 2. a) There is no determinant system $(\mathcal{D}, \Delta)$. b) For every regularization sequence $\delta$ there is a determinant system $(\mathcal{D}_{\mathrm{qd}}, \Delta)$ which is an extension of the $\delta$-regularized determinant system $(\mathcal{D}_{\mathrm{reg}}, \Delta)$.

Proof. a) Let $D$ be the divisor that consists only of zeroes of order one lying at the lattice points $\rho = m + ni$, $m, n \in \mathbb{Z}$. From translation invariance one gets $\Delta_D(z + 1) = \Delta_D(z)$ and $\Delta_D(z + i) = \Delta_D(z)$. Hence $\Delta_D(z)$ must be a doubly-periodic entire non-constant function, which is impossible by Liouville’s theorem. b) (Idea) One has to choose the exponential polynomials consistently with linearity and translation-invariance. This leads to a system of infinitely many linear equations with infinitely many variables which can be reduced to finite systems by Zorn’s Lemma. (See Sect. 11 for a complete proof.)

Remark. Not every determinant system is extendable, so b) is an aesthetic property of regularization. The proof is non-constructive and its extensions are not uniquely determined. The meaning of regularization is that it gives large constructively defined determinant systems.

5. Bounded Regularizability, Singularities and Asymptotics

In this section we introduce the special case of bounded regularizability of divisors and give all the necessary technical definitions to formulate the results of Sects. 6, 7 and 9 which state its equivalence to various asymptotics.

Definition 5.1. Let $D$ be a directed divisor, $0 < \sigma_i < \pi$ for $i = 1, 2$ and $p \in \mathbb{R} \cup \{\infty\}$; then $D$ resp. $\xi_D(s)$ are called $(\sigma_1, \sigma_2)$-bounded $p$-regular if:
i) $\xi_D(s)$ is meromorphic for $\Re(s) > -p$.
ii) $\xi_D(s)$ has only finitely many poles in the strip $\alpha_1 < \Re(s) < \alpha_2$ for any $-p < \alpha_1 < \alpha_2$.
iii) For all $-p < \alpha_1 < \alpha_2$ and $\sigma_1' < \sigma_1$, $\sigma_2' < \sigma_2$,

$$\xi_D(s) = \begin{cases} O\bigl(e^{(\frac{\pi}{2} - \sigma_2')\Im(s)}\bigr) & \text{for } \Im(s) \to \infty, \\ O\bigl(e^{-(\frac{\pi}{2} - \sigma_1')\Im(s)}\bigr) & \text{for } \Im(s) \to -\infty \end{cases}$$

in the strip $\alpha_1 < \Re(s) < \alpha_2$.


We simply say bounded $p$-regular if there are $0 < \sigma_i < \pi$ such that $(\sigma_1, \sigma_2)$-bounded $p$-regularity is valid. We have bounded regularizability if $p > 0$. Note that every directed divisor $D$ in $W^l_{\varphi_1,\varphi_2}$ is $(\varphi_1, \varphi_2)$-bounded $(-r)$-regular, as follows from Stirling’s formula.

Definition 5.2. A pB-system consists of:
1. A finite or infinite sequence of pairs $(p_n, B_n(z))_{n=0,1,2,\dots}$ with complex numbers $p_n$ satisfying $\Re(p_0) \le \Re(p_1) \le \dots \le \Re(p_n) \le \dots$ and polynomials $B_n(z) \in \mathbb{C}[z]$, $B_n(z) = \sum_k b_{n,k} z^k$.
2. An abscissa $p \in \mathbb{R} \cup \{\infty\}$ such that $p > \Re(p_n)$ for all $n$, and in addition for infinite sequences: $p = \lim_{n \to \infty} \Re(p_n)$.

pB-systems capture the simultaneous information about the occurring singular part distributions and the occurring asymptotics.

Example. If the divisor $D$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular, then there is a pB-system $(p_n, B_n(z))_{n=0,1,2,\dots}$ with abscissa $p$ such that the poles of $\xi_D(s)$ in the half plane $\Re(s) > -p$ lie exactly at the values $s = -p_n$ and the Laurent expansions have the singular parts

$$B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr] = \sum_k b_{n,k}\, \frac{(-1)^k k!}{(s + p_n)^{k+1}}.$$

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\dots}$$

is called the singular part distribution of $\xi_D(s)$ in that case.

Definition 5.3. Let $(p_n, B_n(z))_{n=0,1,\dots}$ be a pB-system with abscissa $p$ as above and $0 < \sigma_i \le \varphi_i < \pi$, $i = 1, 2$. A function $\theta : W^r_{\varphi_1,\varphi_2} \to \mathbb{C}$ satisfies the Cramér asymptotic with abscissa $p$ in $W^r_{\sigma_1,\sigma_2}$,

$$\theta(t) \sim \sum_{n=0}^{\infty} t^{p_n} B_n(\log t) \quad \text{for } |t| \to 0,$$

if the estimate for $t \in W^r_{\sigma_1,\sigma_2}$

$$\theta(t) - \sum_{\Re(p_n) < q} t^{p_n} B_n(\log t) = O(|t|^q) \quad \text{for } |t| \to 0$$

[...]

[...] for $\Re(s) > r$,

$$\xi_D(s) = \int_0^{\infty} \theta_D(t)\, t^{s-1}\, dt, \tag{6.2}$$

and its inverse for $t \in W^r_{(\varphi_2 - \frac{\pi}{2}),(\varphi_1 - \frac{\pi}{2})}$ and $c > r$

$$\theta_D(t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \xi_D(s)\, t^{-s}\, ds \tag{6.3}$$

with absolute convergence of the integrals.

Proof. Because of the theorem about Mellin inversion it suffices to prove (6.3), and by majorized convergence this is reduced to the case of a one-point-divisor. In that case (6.3) is the inverse formula for Euler’s Mellin integral for $\Gamma(s)$.

This approach is only possible for strictly directed divisors, as otherwise the defining series for $\theta_D(t)$ does not converge for any $t$.

Theorem 3. Let $\frac{\pi}{2} < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be strictly directed in $W^l_{\varphi_1,\varphi_2}$, and let $(p_n, B_n(z))_{n=0,1,\dots}$ be a pB-system with abscissa $p$. Then the following statements are equivalent:

A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\dots}.$$

C) $\theta_D(t)$ satisfies a Cramér asymptotic with abscissa $p$ of the form

$$\theta_D(t) \sim \sum_{n=0}^{\infty} t^{p_n} B_n(\log t) \quad \text{for } |t| \to 0$$

in $W^r_{(\sigma_2 - \frac{\pi}{2}),(\sigma_1 - \frac{\pi}{2})}$.

Proof (sketch). C) ⇒ A) is shown by (6.2): The poles and singular parts of $\xi_D(s)$ arise by integrating the terms of the Cramér asymptotic, and the exponential estimate for $\xi_D(s)$ in vertical strips can be shown by rotating the ray of integration in (6.2) in $W^r_{(\sigma_2 - \frac{\pi}{2}),(\sigma_1 - \frac{\pi}{2})}$. A) ⇒ C) follows from (6.3) by replacing the abscissa $c$ of the line of integration by a smaller $c > -p$ and applying the residue theorem. The residues of the integrand produce the terms of the Cramér asymptotic.


Remark 1. This theorem is well known in the context of the Mellin and Laplace transform (e.g. [Do, II, Chap. 5], where a complete proof can be found). Using it one can decide whether a strictly directed divisor is bounded regularizable or not, by checking for the existence of a suitable Cramér asymptotic for the partition function. For example, the regularized product of the eigenvalues of Laplacians on manifolds exists because of Cramér asymptotics arising from heat kernel expansions (comp. [Ef, Ko1, Ko2, Ko3, Sa]).

Remark 2. In the case of strictly directed divisors also the implication A) ⇒ B’) of Theorem 4 and, in particular, the asymptotic (7.8) can be obtained by the Mellin integral

$$\xi_D(s, z) = \int_0^{\infty} \theta_D(t)\, e^{-zt}\, t^{s-1}\, dt$$

using

$$\int_0^{\infty} t^{p_n} B_n(\log t)\, e^{-zt}\, t^{s-1}\, dt = B_n(\partial_s)\Bigl[\frac{\Gamma(s + p_n)}{z^{s + p_n}}\Bigr].$$

The Mellin transform approach to regularized products and determinants (the details can be found in Sect. 2.4 in [Il1]) was also extensively studied by Jorgenson and Lang ([JL1]).
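For a one-point divisor the Mellin integral of Remark 2 reduces to Euler's integral for $\Gamma(s)$, which is easy to test numerically. The sketch below is an illustration under stated assumptions: the divisor consists of a single simple zero at the hypothetical point `RHO = -2`, so that $\theta_D(t) = e^{\rho t}$ and $\xi_D(s, z) = \Gamma(s)/(z - \rho)^s$; the quadrature helper `simpson` and the truncation of the integral at `upper` are choices made here, not part of the article.

```python
import math

RHO = -2.0  # hypothetical one-point divisor: a simple zero at rho = -2

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def theta(t):
    """Partition function of the one-point divisor: theta_D(t) = exp(rho * t)."""
    return math.exp(RHO * t)

def xi_mellin(s, z=0.0, upper=60.0):
    """Mellin integral of theta_D(t) * exp(-z t) * t^(s-1), truncated at t = upper."""
    return simpson(lambda t: theta(t) * math.exp(-z * t) * t ** (s - 1), 0.0, upper)

def xi_closed(s, z=0.0):
    """Closed form Gamma(s) / (z - rho)^s of the Hurwitz xi function."""
    return math.gamma(s) / (z - RHO) ** s

print(xi_mellin(3.0), xi_closed(3.0))
print(xi_mellin(3.0, z=1.5), xi_closed(3.0, z=1.5))
```

The truncation is harmless here because the integrand decays exponentially; for divisors that are not strictly directed the series for $\theta_D(t)$ need not converge and this check is not available.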

7. Hankel Integrals and Stirling Asymptotics

The Mellin integral method has two shortcomings: It is possible only for strictly directed divisors, and the partition function is a non-intrinsic construction, whereas one wants criteria in terms of associated functions. In this section we solve these problems, postponing the technical proofs until Sect. 12. For powers $a^s$ and $\log(a)$ we always use $-\pi < \arg(a) < \pi$.

Proposition 7.1. Let $D$ be a directed divisor in $W^l_{\varphi_1,\varphi_2}$ ($0 < \varphi_i < \pi$).

a)

$$\xi_D(s, z) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{\Gamma(s - s')}{z^{s - s'}}\, \xi_D(s')\, ds' \tag{7.1}$$

for $z \in W^r_{\varphi_1,\varphi_2}$ and $\Re(s) > c > r$ with absolute convergence of the integral.

b) Let $0 < \sigma_i < \varphi_i$ for $i = 1, 2$, $C = C_{\sigma_1,\sigma_2}$ and $z \in W^r_{\sigma_1,\sigma_2}$. Then for $\Re(s) > r$ and $\Re(s_0) > r$ one has the absolutely convergent integral representation

$$\xi_D(s, z) = \frac{1}{2\pi i} \int_C \frac{\Gamma(s - s_0 + 1)}{(z - w)^{s - s_0 + 1}}\, \xi_D(s_0, w)\, dw. \tag{7.2}$$

In the sequel the representations (7.1) and (7.2) play a similar role as (6.2) and (6.3) in Sect. 6. For the explicit description of the Stirling asymptotics we need the following definition.


Definition 7.2. Let $\delta$ be a fixed regularization sequence. Then for any $q \in \mathbb{C}$ we define the linear map

$$[q] : \mathbb{C}[z] \to \mathbb{C}[z], \qquad B(z) \mapsto B^{[q]}(z),$$

by

$$\mathrm{CT}_{s=0}\Bigl(\delta(s)\, B(\partial_s)\Bigl[\frac{\Gamma(s + q)}{z^{s + q}}\Bigr]\Bigr) = z^{-q} B^{[q]}(\log z). \tag{7.3}$$

For $P_k(z) := z^k$ we get:

$$P_k^{[q]}(z) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\, \Gamma^{(k-j)}(q)\, z^j \tag{7.4}$$

in the case that $q \ne -n$ for all $n \in \mathbb{N}_0$, while for $q = -n$,

$$P_k^{[q]}(z) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\, \mathrm{CT}_{s=q}\bigl(\Gamma^{(k-j)}(s)\bigr)\, z^j + \frac{(-1)^n}{n!} \Bigl(\frac{(-1)^{k+1}}{k+1}\, z^{k+1} + (-1)^k\, k!\, \delta_{k+1}\Bigr). \tag{7.5}$$

Special case $B(z) = b_0$. Easy calculations using the fact that $\Gamma(z)$ is holomorphic for $(-z) \notin \mathbb{N}_0$ as well as the expansion $\Gamma(s) = \frac{1}{s} - \gamma + \dots$ ($\gamma$ the Euler–Mascheroni constant) and $\Gamma(s - n) = \Gamma(s)\bigl((s - 1)(s - 2) \cdots (s - n)\bigr)^{-1}$ deliver

$$B^{[q]}(z) = \begin{cases} b_0\, \Gamma(q) & \text{for } q \ne -n, \\ \dfrac{(-1)^{n+1}}{n!}\, b_0 \Bigl(z + \gamma - \delta_1 - \sum_{j=1}^{n} \dfrac{1}{j}\Bigr) & \text{for } q = -n \end{cases} \tag{7.6}$$

(for zeta-regularization as well as for zero-renormalization one has $\delta_1 = \gamma$).

The following basic properties of $[q]$ are clear by (7.4), (7.5) and the definition.

Proposition 7.3.
a) $[q]$ is bijective for $q \ne 0, -1, -2, \dots$ with $\deg B^{[q]} = \deg B$.
b) $[q]$ is injective for $q = 0, -1, -2, \dots$ with $\deg B^{[q]} = \deg B + 1$ and with $\dim \operatorname{Coker}([q]) = 1$.
c)

$$\frac{d}{dz}\Bigl(z^{-q} B^{[q]}(\log z)\Bigr) = -z^{-(q+1)} B^{[q+1]}(\log z). \tag{7.7}$$

Remark 3. In particular, every Stirling asymptotic with abscissa $p$ can be represented as a linear combination of terms of the form $z^{-q} B^{[q]}$ up to a polynomial $P(z)$ with¹ $P(z) z^p \to 0$ for $|z| \to 0$, and this polynomial is uniquely determined. This shows that the asymptotics in B’) of Theorem 4 and B) of Theorem 5 are general Stirling asymptotics which are written in a special manner. This also means that the Stirling asymptotic (7.8) is an effective method to determine the regularized determinant among all associated functions for $D$.

¹ Observe: Terms $z^{-q}$ with $\Re(q) \ge p$ make no sense in Stirling asymptotics with abscissa $p$.


Remark 4. Part c) of the proposition together with (3.2) shows that the Stirling asymptotics in B’) and B) can be differentiated term by term.

Theorem 4. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $W^l_{\varphi_1,\varphi_2}$, let $p \in \mathbb{R} \cup \{\infty\}$ and $\xi_D(s)$ be meromorphic for $\Re(s) > -p$ (compare Prop. 3.3). Then for any regularization sequence $\delta$, $s_0$ with $\Re(s_0) > -p$ and a pB-system $(p_n, B_n(z))_{n=0,1,\dots}$ with abscissa $p \in \mathbb{R} \cup \{\infty\}$ the following statements are equivalent:

A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with the singular part distribution

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\dots}.$$

B’) There is a polynomial $P_{s_0}(z)$ with $P_{s_0}(z)\, z^{p + s_0} \to 0$ for $|z| \to 0$ and such that the Stirling asymptotic with abscissa $p + \Re(s_0)$

$$\mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s + s_0, z)) \sim P_{s_0}(z) + \sum_{n=0}^{\infty} z^{-(p_n + s_0)} B_n^{[p_n + s_0]}(\log z) \quad \text{for } |z| \to \infty$$

is valid in $W^r_{\sigma_1,\sigma_2}$.

The polynomial in B’) is then uniquely determined: $P_{s_0}(z) = 0$.

The idea of the proof given in Sect. 12 is rather similar to the proof of Theorem 3. To get the Stirling asymptotic B’) from the singular part distribution A) one uses (7.1), shifts the line of integration and applies the residue theorem. The other direction is a little more difficult, but the basic idea is of course to use (7.2) and integrate the Stirling asymptotic term by term. Some technical difficulties arise because (7.2) is not valid for $z = 0$.

Using Eqs. (3.2), (3.3) and (3.5) one obtains the following theorem as an easy corollary of Theorem 4.

Theorem 5. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $W^l_{\varphi_1,\varphi_2}$; let $f(z)$ be a meromorphic function of finite order with divisor $D$. Then for a regularization sequence $\delta$, $m \in \mathbb{N}_0$ and a pB-system $(p_n, B_n(z))_{n=0,1,\dots}$ with abscissa $p \in \mathbb{R} \cup \{\infty\}$ the following statements are equivalent:

A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\dots}.$$

B) There is a polynomial $P_f(z)$ with $P_f(z)\, z^p \to 0$ for $|z| \to 0$, such that the Stirling asymptotic with abscissa $(p + m)$

$$\log^{(m)} f(z) \sim P_f^{(m)}(z) + (-1)^{m-1} \sum_{n=0}^{\infty} z^{-(p_n + m)} B_n^{[p_n + m]}(\log z) \quad \text{for } |z| \to \infty$$

is valid in $W^r_{\sigma_1,\sigma_2}$.


$P_f(z)$ in B) can then be chosen independent of $m$; it is (up to the choice of the logarithm) uniquely determined. If, in addition, $p > 0$, so $D$ is bounded regularizable, then for the $\delta$-regularized determinant one has $P_{\Delta_D} = 0$, i.e. the following Stirling asymptotic with abscissa $p$ in $W^r_{\sigma_1,\sigma_2}$ is valid:

$$\log \Delta_D(z) \sim -\sum_{n=0}^{\infty} z^{-p_n} B_n^{[p_n]}(\log z) \quad \text{for } |z| \to +\infty. \tag{7.8}$$

Theorem 5 can be regarded as the fundamental theorem about bounded regularizability: by Remark 3 it states that whenever a $\log^{(m)} f(z)$ satisfies any Stirling asymptotic with abscissa greater than zero, then $f(z)$ and its divisor $D$ are bounded regularizable, and (7.8) allows one to determine its regularized determinant, i.e. the polynomial $P(z)$ mentioned in the introduction.

The triple equivalence A) ⇔ B) (⇔ C)) given by Theorems 3 and 5, where the latter equivalence is valid only for strictly directed divisors, will be generalized in Sect. 9 (Theorems 4 and 6) to an equivalence A) ⇔ B’) ⇔ C’) valid for all directed divisors, which summarizes all information about singular part distributions and asymptotics of $\xi_D(s, z)$ and $\theta_D(t, s)$.

8. Examples

8.1. Dirichlet series.

Corollary 8.1. If $f(z)$ is meromorphic of finite order and has an absolutely convergent Dirichlet series representation

$$f(z) = 1 + \sum_{n=0}^{\infty} \frac{\beta_n}{\alpha_n^{z}}, \qquad \Re(z) > \sigma_0,$$

with $\beta_n \in \mathbb{C}$ and $\alpha_n \in \mathbb{R}_{>1}$ with $\lim_{n \to \infty} \alpha_n = \infty$, then $f(z)$ is $\delta$-regularized for every regularization sequence $\delta$, i.e. $f(z) = \Delta_D(z)$; in particular, $f(z)$ is associated to its divisor (compare Sect. 4). $\xi_D(s, z)$ is holomorphic for $s \in \mathbb{C}$.

Proof. By the Taylor series expansion for $\log(1 + x)$ it is clear that the trivial Stirling asymptotic $\log f(z) \sim 0$ as $|z| \to \infty$ with abscissa $+\infty$ is valid in $W^r_{(\frac{\pi}{2} - \varepsilon),(\frac{\pi}{2} - \varepsilon)}$, so by Theorem 5 and Proposition 3.3 the assertion is clear.

Remark 5. Using only the Mellin integral method one needs to assume that $f(z)$ also satisfies a functional equation and examines

$$\theta_D^{+}(t) := \sum_{\rho \in D,\ \Im(\rho) > 0} e^{\rho t}, \qquad \theta_D^{-}(t) := \sum_{\rho \in D,\ \Im(\rho) \le 0} e^{\rho t}$$

separately. For $f(z) = \zeta(z)$, the Riemann zeta function, a classical result of Cramér ([Cr]) delivers the Cramér asymptotics for $\theta_D^{+}(t)$ and $\theta_D^{-}(t)$ (with logarithmic terms, in contrast to the examples from the spectra of Laplacians mentioned in Sect. 6) and thus regularizability of $\zeta(z)$ ([So, ScSo]). In [JL2] Cramér’s result was generalized to a class of $f(z)$ as in the above corollary which in addition satisfies a functional equation, and their result implies regularizability of all these functions. Corollary 8.1 shows regularizability for a much larger class, and moreover one no longer needs Cramér’s result. The methods of this section of [Il2] also apply to the “polynomial Bessel fundamental class” introduced in [JL3]. Nevertheless a functional equation is necessary if one wants to get information about $\theta_D^{+}(t)$ (compare [Il2]).

Remark 6. Theorem 5 gives satisfying criteria for deciding whether a function is bounded regularizable or not. For example, consider the function

$$f(z) = \frac{1}{\sqrt{2\pi}}\,(z^2 + 1)(1 + e^{-z})\,\Gamma(z) + e^{-z^2}\,\Gamma'(z).$$

It can be immediately seen that it is zeta-regularized: $(\sqrt{2\pi})^{-1}\Gamma(z)$ is zeta-regularized because of (8.5) below, and this is true for $(z^2 + 1)$ because it is a characteristic polynomial, and this holds for $(1 + e^{-z})$ because it is a Dirichlet series; the second summand is small (in an angular domain) compared to the first and does not change the Stirling asymptotic of $\log f(z)$.

8.2. $\Gamma$-functions.

In §2.8 of [Vi] the functions $\Gamma_n(z)$ were defined which appear in the functional equations of Selberg zeta functions and which are special cases of the general higher $\Gamma$-functions introduced by Barnes ([Bar]). They are simple examples for regularization with non-trivial Stirling asymptotics, and their zeta-regularization can already be found in [Va, Ku] and [Ma, §3.3]. We give the following definition which is equivalent to that of Vigneras.

Definition 8.2. The sequence $(\Gamma_n(z))_{n=0,1,\dots}$ of $\Gamma$-functions of order $n$ is defined by the following conditions:
0) $\Gamma_0(z) = \frac{1}{z}$.
1) $\Gamma_n^{-1}(z)$, $n \in \mathbb{N}$, is an entire function of finite order and the divisor $D_n$ of $\Gamma_n(z)$ consists exactly of the $\rho = -k$, $k \in \mathbb{N}_0$, with multiplicity $-\binom{n+k-1}{n-1}$.
2) $\Gamma_n(1) = 1$ for all $n \in \mathbb{N}_0$.
3) For all $n \in \mathbb{N}_0$ the following functional equation is valid:

$$\Gamma_{n+1}(z + 1) = \frac{\Gamma_{n+1}(z)}{\Gamma_n(z)}.$$

Using higher Bernoulli polynomials ([No]) one has for $n \in \mathbb{N}_0$,

$$\theta_{D_n}(t) = -(-\theta_{D_1}(t))^n = -\frac{1}{(1 - e^{-t})^n} = -\sum_{\nu=0}^{\infty} \frac{(-1)^{\nu} B_{\nu}^{n}(0)}{\nu!}\, t^{\nu - n} \quad \text{for } 0 < |t| < 2\pi, \tag{8.1}$$

thus by Theorem 3 the $D_n$ are bounded regularizable. Applying (7.6) and (7.8) one gets the Stirling asymptotic with abscissa $+\infty$ for the $\delta$-regularized determinant

$$\log \Delta_{D_n}(z) \sim (-1)^{n+1} \sum_{k=0}^{n} \frac{B_{n-k}^{n}(0)}{(n - k)!\, k!} \Bigl(\log z + \gamma - \delta_1 - \sum_{j=1}^{k} \frac{1}{j}\Bigr) z^k + \sum_{k=1}^{\infty} (-1)^{n+k}\, \frac{(k - 1)!\, B_{n+k}^{n}(0)}{(n + k)!}\, z^{-k} \quad \text{for } |z| \to \infty \tag{8.2}$$

in $W^r_{(\pi - \varepsilon),(\pi - \varepsilon)}$.

Proposition 8.3. The functions $\Gamma_n(z)$ are well defined; one has $\Gamma_n(z) = e^{-P_n(z)}\, \Delta_{D_n}(z)$ with polynomials $P_n(z)$ of degree $\le n$ which are determined (e.g. using Lagrange interpolation) by the relations

$$P_n(j) = \sum_{i=0}^{j-1} (-1)^i \binom{j-1}{i} \log \Delta_{D_{n-i}}(1), \qquad j = 1, \dots, n + 1. \tag{8.3}$$

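The values $\log \Delta_{D_n}(1) := -\mathrm{CT}_{s=0}(\delta(s)\,\xi_{D_n}(s, 1))$ can be expressed in terms of the Riemann zeta function:

$$\log \Delta_{D_n}(1) = \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta'(-l) + (\delta_1 - \gamma) \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta(-l) \tag{8.4}$$

for $n \ge 1$ and $\log \Delta_{D_0}(1) = \delta_1 - \gamma$, with the Euler-Mascheroni constant $\gamma$ and $\tau_{n,l}$ from the development

$$\binom{n + x - 1}{n - 1} = \sum_{l=0}^{n-1} \tau_{n,l}\, (x + 1)^l.$$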

Proof. One shows that there exists exactly one choice of polynomials $P_n(z)$ with

\[
P_{n+1}(z+1) + P_n(z) - P_{n+1}(z) = 0, \qquad P_n(1) = \log \Delta_{D_n}(1)
\]

for $n \in \mathbb{N}_0$, with $\deg P_0 = 0$. (Because of $\deg P_0 = 0$ and the first equation one gets $\deg P_n \le n$; by induction it is easy to prove that the $P_n(j)$ are given as in the proposition, and in the other direction, that the uniquely determined $P_n(z)$ with $\deg P_n \le n$ and with these $P_n(1)$ satisfy the two equations.) The expression for $\log \Delta_{D_n}(1)$ follows from $\delta(s)\Gamma(s) = \frac{1}{s} + (\delta_1 - \gamma) + \dots$ and

\[
\xi_{D_n}(s, 1) = -\Gamma(s) \sum_{k=0}^{\infty} \binom{n+k-1}{n-1} (k+1)^{-s} = -\Gamma(s) \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta(s-l).
\]

$\Gamma_1(z)$ is the usual $\Gamma$-function, $\Gamma_2^{-1}(z) = G(z)$ is known as Barnes' G-function. For these two functions we will give the result more explicitly. It is well known that $\zeta(0) = -\frac{1}{2}$, $\zeta'(0) = -\log\sqrt{2\pi}$ and $\zeta(-1) = -\frac{1}{12}$. With the Kinkelin-Glaisher constant $A$ one can express $\zeta'(-1) = \frac{1}{12} - \log A$ (compare [Vo, pp. 461–464], [Al, p. 357]), but we use just $\zeta'(-1)$.

Corollary 8.4. For $n = 1, 2$ one has

\[
\Delta_{D_1}(z) = \frac{\Gamma_1(z)}{\sqrt{2\pi}}\, e^{-(\delta_1-\gamma)(z-\frac{1}{2})}, \tag{8.5}
\]

\[
\Delta_{D_2}(z) = \frac{\Gamma_2(z)}{\sqrt{2\pi}}\, e^{\zeta'(-1) + z\log\sqrt{2\pi} + \frac{\delta_1-\gamma}{2}\left((z-1)^2 - \frac{1}{6}\right)}. \tag{8.6}
\]

By combining (8.5) and (8.2) one gets the usual Stirling formula for $\Gamma(z)$.
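The corollary can be cross-checked numerically. The sketch below is not from the article and rests on stated assumptions: it reads the exponential prefactors in (8.5) and (8.6) as $e^{P_1(z)}$ and $e^{P_2(z)}$, takes the values at $z = 1$ from (8.4) with $\tau_{1,0} = \tau_{2,1} = 1$ and $\tau_{2,0} = 0$, and tests the interpolation relations (8.3). The last function compares the first terms of the Stirling formula obtained from (8.2) and (8.5) with `math.lgamma`. All function names are hypothetical, and `c` stands for $\delta_1 - \gamma$.

```python
import math
from math import comb

LOG_GLAISHER = math.log(1.2824271291006226)      # log A (Kinkelin-Glaisher constant)
ZETA = {0: -0.5, 1: -1.0 / 12.0}                 # zeta(0), zeta(-1), keyed by l
ZETA_PRIME = {0: -0.5 * math.log(2 * math.pi),   # zeta'(0)
              1: 1.0 / 12.0 - LOG_GLAISHER}      # zeta'(-1)


def log_det_at_1(n, c):
    """Value (8.4) at z = 1 for n = 0, 1, 2; c = delta_1 - gamma."""
    if n == 0:
        return c
    l = n - 1  # for n = 1, 2 the only nonzero tau_{n,l} is tau_{n,n-1} = 1
    return ZETA_PRIME[l] + c * ZETA[l]


def relation_8_3(n, j, c):
    """Right-hand side of (8.3): alternating binomial sum of the values at 1."""
    return sum((-1) ** i * comb(j - 1, i) * log_det_at_1(n - i, c) for i in range(j))


def p1(z, c):
    """Logarithm of the prefactor in (8.5) (assumed to be P_1)."""
    return -0.5 * math.log(2 * math.pi) - c * (z - 0.5)


def p2(z, c):
    """Logarithm of the prefactor in (8.6) (assumed to be P_2)."""
    return (-0.5 * math.log(2 * math.pi) + ZETA_PRIME[1]
            + 0.5 * z * math.log(2 * math.pi)
            + 0.5 * c * ((z - 1) ** 2 - 1.0 / 6.0))


def stirling_log_gamma(z):
    """Leading terms of log Gamma(z) as obtained from (8.2) with n = 1 and (8.5)."""
    return (z - 0.5) * math.log(z) - z + 0.5 * math.log(2 * math.pi) + 1.0 / (12.0 * z)


if __name__ == "__main__":
    c = 0.3  # arbitrary test value for delta_1 - gamma
    print([p1(j, c) - relation_8_3(1, j, c) for j in (1, 2)])
    print([p2(j, c) - relation_8_3(2, j, c) for j in (1, 2, 3)])
    print(math.lgamma(30.0) - stirling_log_gamma(30.0))
```

Since $\deg P_n \le n$, agreement at the $n + 1$ interpolation points determines each polynomial, so the check covers the full statement for $n = 1, 2$ under the assumptions above.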


9. The Function $\theta_D(t, s)$

We now define and examine the function $\theta_D(t, s)$ for a directed divisor. This function is a Laplace transform of $\xi_D(s, z)$ for the variable $z$ which turns out to be a "mixture" of $\theta_D(t)$ and $\xi_D(s)$ and is an essential tool in [Il2]. We give without proof a sort of generalization of Theorem 3 to directed divisors in terms of this function.

In the sequel $Wl_{\varphi_1,\varphi_2}$ is regarded as the subset of all those $t \in \mathbb{C}^*$ with $\varphi_1 < \arg(t) < -\varphi_2 + 2\pi$ (and we use these arguments for $\log t$). Then $Wl_{\varphi_1,\varphi_2}$ is also defined for $\varphi_i \le 0$, which is needed in what follows.

Definition 9.1. For $\rho \in \mathbb{C}\setminus\mathbb{R}_{\ge 0}$, $s \in \mathbb{C}$ with $(-s) \notin \mathbb{N}_0$ and $t \in \mathbb{C}^*$ we define

\[
\widetilde{\exp}(\rho, t, s) := \frac{e^{-\pi i(s-1)}}{2\pi i}\, \Gamma(s)\, \Gamma(1-s, \rho t) \cdot e^{\rho t}. \tag{9.1}
\]

For a directed divisor $D$ in $Wl_{\varphi_1,\varphi_2}$ (with $0 < \varphi_i < \pi$) we define

\[
\theta_D(t, s) := \sum_{\rho \in D} \widetilde{\exp}(\rho, t, s) \quad \text{for } t \in Wl_{(\frac{\pi}{2}-\varphi_1),(\frac{\pi}{2}-\varphi_2)},\ \Re(s) > r. \tag{9.2}
\]

In the definition of $\widetilde{\exp}(\rho, t, s)$, a type of multivalued exponential function, the incomplete Gamma function (obviously holomorphic in $\alpha$ and $z$)

\[
\Gamma(\alpha, z) := \int_z^{\infty} e^{-\tau} \tau^{\alpha-1}\, d\tau, \qquad \alpha \in \mathbb{C},\ z \in \mathbb{C}^*
\]

is used. Properties of $\Gamma(\alpha, z)$ are well known (e.g. [EMOT, II, Chap. 9]). We state the needed properties of $\widetilde{\exp}(\rho, t, s)$ in Lemma 13.1 and give a self-contained proof. In particular, by the lemma one can see that the defining sum for $\theta_D(t, s)$ converges absolutely and is holomorphic in the given domains.

Proposition 9.2. With $D$ as in the above definition, $t \in Wl_{(\frac{\pi}{2}-\varphi_1),(\frac{\pi}{2}-\varphi_2)}$ and $\Re(s) > r$ one has

\[
\theta_D(t, s) = \frac{t^{-(s-1)}}{2\pi i} \int_0^{e^{i\alpha}\infty} e^{wt}\, \xi_D(s, w)\, dw \tag{9.3}
\]

for every $\alpha \in\, ]-\varphi_2, \varphi_1[$ satisfying $\Re(e^{i\alpha} t) < 0$, and the integral converges absolutely. $\theta_D(t, s)$ satisfies the following functional equations:

\[
\theta_D(t, s+1) - \theta_D(t, s) = \frac{t^{-s}}{2\pi i} \cdot \xi_D(s) \tag{9.4}
\]

and if $D$ is strictly directed in $Wl_{\varphi_1,\varphi_2}$ with $\frac{\pi}{2} < \varphi_i < \pi$,

\[
\theta_D(t, s) - e^{2\pi i(s-1)}\, \theta_D(\exp(2\pi i)t, s) = \theta_D(t), \qquad t \in Wr_{(\varphi_2-\frac{\pi}{2}),(\varphi_1-\frac{\pi}{2})}, \tag{9.5}
\]

where the overlap in $Wl_{(\frac{\pi}{2}-\varphi_1),(\frac{\pi}{2}-\varphi_2)}$, the domain of definition for $\theta_D(t, s)$, is identified with $Wr_{(\varphi_2-\frac{\pi}{2}),(\varphi_1-\frac{\pi}{2})}$, which is the domain of definition for the partition function $\theta_D(t)$ (compare Definition 6.1).

Proof. By majorized convergence using Lemma 12.1 the Laplace integral representation is obtained from (13.4). The functional equations follow from the corresponding ones for $\widetilde{\exp}(\rho, t, s)$ given in Lemma 13.1.
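The functional equation (9.4) can also be made plausible directly from the Laplace representation (9.3). The following is only a formal sketch and not the proof given in the article, which goes through Lemma 13.1. It assumes that $\partial_w \xi_D(s, w) = -\xi_D(s+1, w)$ and $\xi_D(s) = \xi_D(s, 0)$, as one would expect if $\xi_D(s, w)$ is built from terms of the form $\Gamma(s)(w-\rho)^{-s}$, and that the boundary term at infinity vanishes because $\Re(e^{i\alpha}t) < 0$:

\begin{align*}
\theta_D(t, s+1) &= \frac{t^{-s}}{2\pi i} \int_0^{e^{i\alpha}\infty} e^{wt}\, \xi_D(s+1, w)\, dw
 = -\frac{t^{-s}}{2\pi i} \int_0^{e^{i\alpha}\infty} e^{wt}\, \partial_w \xi_D(s, w)\, dw \\
&= -\frac{t^{-s}}{2\pi i} \Big[ e^{wt}\, \xi_D(s, w) \Big]_{w=0}^{w=e^{i\alpha}\infty}
 + \frac{t^{-(s-1)}}{2\pi i} \int_0^{e^{i\alpha}\infty} e^{wt}\, \xi_D(s, w)\, dw \\
&= \frac{t^{-s}}{2\pi i}\, \xi_D(s) + \theta_D(t, s).
\end{align*}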


Remark 7. The proposition shows that θD (t, s) behaves like ξD (s) in the variable s and like θD (t) in the variable t. In particular, θD (t, s) is meromorphic for (s) > −p if and only if this is true for ξD (s). For q ∈ C, a regularization sequence δ and B(z) ∈ C[z] we define the polynomial B [[q]] (z) ∈ C[z] by π(−z)s+q 1 CTs=0 δ(s)B(∂s ) = B [[q]] (log z)zq , − 2πi sin(π(s + q)) with arg(−z) := arg(z) − π. Theorem 6. Let 0 < σi ≤ ϕi < π for i = 1, 2 and D be a directed divisor in Wlϕ1 ,ϕ2 such that ξD (s) is meromorphic for (s0 ) > −p (compare Remark 1). For s0 ∈ C with (s0 ) > −p , a regularization sequence δ and a pB-system (pn , Bn (z))n=0,1,... with abscissa p < ∞, the following statements are equivalent: A) ξD (s) is (σ1 , σ2 )-bounded p-regular with singular part distribution 1 . − pn , Bn (∂s ) s + pn n=0,1,... s0 (t, t −1 ) with P s0 (t, t −1 )t −(p+s0 −1) → 0 for |t| → ∞ C’) There exists a polynomial P and such that the Cramér asymptotic with abscissa p + (s0 ) − 1, s0 (t, t −1 ) θD (t, s + s0 ) ∼ P CTs=0 δ(s)t s+s0 −1 +

∞

[[pn +s0 −1]]

t pn +s0 −1 Bn

(log t)

(9.6)

n=0

for |t| → 0 is valid in Wl( π2 −σ1 ),( π2 −σ2 ) . The polynomial in C’) is then uniquely determined: n−1 s0 (t, t −1 ) = 1 P CTs=0 (δ(s)ξD (s + s0 − k − 1))t k 2π i k=0

with $n$ such that $n - 1 < p + \Re(s_0) - 1 \le n$. As this theorem has no direct application to regularization we omit the analogue of Proposition 7.3 for $[[q]]$, which shows that the Cramér asymptotic in C') is a general one written in a special manner, and we give only the idea of the proof of Theorem 6.

Proof. (idea) By Theorem 4 it suffices to prove B') ⇔ C'). This equivalence can be shown using (9.3) and its inversion by a Hankel integral ((2.6.11) in [Il2]), integrating the asymptotics term by term. For details see Sect. 2.6.1 in [Il2].

Remark 8. The identity $B^{[[q]]}(\log t)\,t^q - B^{[[q]]}(\log(\exp(2\pi i)t))\,(\exp(2\pi i)t)^q = B(\log t)\,t^q$ and (9.5) lead one to rediscover the implication A) ⇒ C) in Theorem 3, but now one has with Theorem 4 the general equivalence A) ⇔ B') ⇔ C') for all directed divisors already mentioned in Sect. 7. Because of (9.3), C') is an explicit determination of the Cramér asymptotic of the Laplace transform of $\xi_D(s, z)$, which is the basic meaning of Theorem 6.


10. Renormalized Determinants

The following is a generalization of ideas from §5 of [Vo]. Because of (3.2) and (3.3) every meromorphic function $\Delta_D(z)$ of finite order with divisor $D$ has representations of the form

\[
\log \Delta_D(z) = \int_{a_{\ell+1}}^{z} \int_{a_\ell}^{\lambda_\ell} \cdots \int_{a_1}^{\lambda_1} (-1)^{\ell}\, \xi_D(\ell+1, \lambda_0)\, d\lambda_0\, d\lambda_1 \cdots d\lambda_\ell \tag{10.1}
\]

for certain $\ell \ge g$ and $a_i \in \mathbb{C}$, e.g. $\Delta_{\mathrm{Wei},D,a}(z)$ defined by Eq. (3.6) for $a_i = a$. Easy considerations show that one must have $|a_i| = \infty$ in order to get a determinant (compare Sect. 4) by this. But then (10.1) is divergent, so one has to renormalize the divergent integral. If $D$ is quasi-directed and bounded regularizable, according to Theorem 4 one has a Stirling asymptotic for $\xi_D(\ell+1, z)$, and (10.1) with $a_i = \infty$ can be renormalized if one has a renormalization for every integral of the form $\int_{\infty}^{z} \lambda^{-q} B(\log\lambda)\, d\lambda$, $q \in \mathbb{C}$, $B(z) \in \mathbb{C}[z]$ (taking of course the value of the integral in case of absolute convergence).

In the sequel for $B(z) \in \mathbb{C}[z]$ and $q \in \mathbb{C}$ we define $B^{\{q-1\}}(z) \in \mathbb{C}[z]$ by

\[
\frac{d}{dz}\Big( z^{-(q-1)} B^{\{q-1\}}(\log z) \Big) = z^{-q} B(\log z), \qquad B^{\{0\}}(0) = 0.
\]

Thus the $z^{-(q-1)} B^{\{q-1\}}(\log z)$ are just those primitives of the $z^{-q} B(\log z)$ whose constant terms are zero.

Definition 10.1. A renormalization sequence $\omega$ is a sequence $(\omega_n)_{n=0,1,\dots}$ of complex numbers. For such a renormalization sequence $\omega$ and $D \in \mathcal{D}_{breg}$, i.e. $D$ is a quasi-directed bounded regularizable divisor, the $\omega$-renormalized determinant $\Delta_D(z)$ of $D$ is defined by

\[
\log \Delta_D(z) = \int_{\infty}^{z} \int_{\infty}^{\lambda_\ell} \cdots \int_{\infty}^{\lambda_1} (-1)^{\ell}\, \xi_D(\ell+1, \lambda_0)\, d\lambda_0\, d\lambda_1 \cdots d\lambda_\ell
\]

for $\ell \ge g$, integrating $(\ell+1)$ times using the Stirling asymptotic for $\xi_D(\ell+1, z)$ from Theorem 4 and following the renormalization rule:

\[
\int_{\infty}^{z} \lambda^{-q} B(\log\lambda)\, d\lambda :=
\begin{cases}
z^{-(q-1)} B^{\{q-1\}}(\log z) & \text{for } q \ne 1, \\
B^{\{0\}}(\log z) + \omega(B) & \text{for } q = 1,
\end{cases} \tag{10.2}
\]

with $\omega(B) := \sum_k \omega_k b_k$ for $B(z) = \sum_k b_k z^k$.

One can easily prove that the definition of $\Delta_D(z)$ is independent of $\ell \ge g$ and of the lines of integration and that it indeed delivers a determinant system on $\mathcal{D}_{breg}$. The Stirling asymptotic that determines $\log \Delta_D(z)$ in the same way as (7.8) for the δ-regularized determinant is derived by integrating the Stirling asymptotic for $\xi_D(\ell+1, z)$ term by term following (10.2). The next theorem shows that renormalization and regularization are in fact essentially the same.
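As a small worked illustration of the renormalization rule (10.2), consider the two simplest polynomials. This is a sketch added for orientation and not part of the original text; it assumes that $\omega(B)$ reduces to $\omega_0$ for the constant polynomial $B = 1$. Writing $C = B^{\{q-1\}}$, the defining relation amounts to $(1-q)\,C(w) + C'(w) = B(w)$ in the variable $w = \log z$, which gives:

\begin{align*}
B(z) = 1,\ q \ne 1: &\quad B^{\{q-1\}}(z) = \frac{1}{1-q}, &
 \int_{\infty}^{z} \lambda^{-q}\, d\lambda &:= \frac{z^{-(q-1)}}{1-q}, \\
B(z) = 1,\ q = 1: &\quad B^{\{0\}}(z) = z, &
 \int_{\infty}^{z} \lambda^{-1}\, d\lambda &:= \log z + \omega_0, \\
B(z) = z,\ q \ne 1: &\quad B^{\{q-1\}}(z) = \frac{z}{1-q} - \frac{1}{(1-q)^2}, &
 \int_{\infty}^{z} \lambda^{-q} \log\lambda\, d\lambda &:= z^{-(q-1)} \Big( \frac{\log z}{1-q} - \frac{1}{(1-q)^2} \Big).
\end{align*}

For $\Re(q) > 1$ the first integral converges absolutely and the rule reproduces its actual value $z^{1-q}/(1-q)$; only the case $q = 1$ involves the free constant coming from the renormalization sequence.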


Theorem 7. There is a bijection between the set of regularization sequences δ and the set of renormalization sequences ω such that the δ-regularized determinant and the ω-renormalized determinant deliver the same determinant system on Dbreg . The ω0 -renormalized determinant with ωn0 := 0 for all n ∈ N0 delivers the zerorenormalization as defined in Example 3 in Sect. 3. Proof. By Theorem 5 and the properties of the map [q], in particular, (7.7) and the fact that Stirling asymptotics for log(m) f (z) can be differentiated term by term (Remark 2 in Sect. 7) one easily sees that it is sufficient to observe the following: 1. δ(s) = "(1 − s) is a regularization sequence with B [0] (0) = 0 for all B(z) ∈ C[z]. This follows from (7.5) as then one has CTs=0 (" (k) (s)) = (−1)k+1 k!δk for all k ∈ N0 . 2. Let >1 be the C-vector space of all renormalization sequences and >2 that of all regularization sequences. Define ? to be the C-vector space of all C-linear maps from C[z] to C. Then regard the maps α1 : >1 −→ ?, ω −→ (B → ω(B)), α2 : >2 −→ ?, δ −→ (B → (B [0]δ − B [0]0 )), where in the latter definition [q]δ means the map [q] for the regularization sequence δ while [q]0 means [q] for the special regularization sequence of zero renormalization (δ(s) = "(1 − s)). These maps are obviously isomorphisms and α1−1 ◦ α2 is the demanded isomorphism between >2 and >1 . 11. Proof of Theorem 2b) Given a system of relations

\[
\sum_k D_1|_{+\alpha^{(i)}_{1,k}} + \dots + \sum_k D_n|_{+\alpha^{(i)}_{n,k}} = D^{(i)}, \qquad i \in I, \tag{11.1}
\]

with $D^{(i)} \in \mathcal{D}_{reg}$, $D_1, \dots, D_n \in \mathcal{D}_{qd}\setminus\mathcal{D}_{reg}$, $\alpha^{(i)}_{m,k} \in \mathbb{C}$ for $i \in I$, $m = 1, \dots, n$ and with finite sums over the index $k$. We first regard logarithms of associated functions for large real $z$. We choose the logarithms of the regularized determinants $\log \Delta_{D^{(i)}}(z) := -CT_{s=0}(\delta(s)\xi_{D^{(i)}}(s, z))$ and logarithms $\log \Delta^{in}_{D_m}(z)$ of certain associated functions $\Delta^{in}_{D_m}(z)$. We search for polynomials

\[
P_m(z) = \sum_{l=0}^{g_m} (-1)^{l+1}\, \frac{x_{m,l}}{l!}\, z^l, \qquad m = 1, \dots, n
\]

such that $\log \Delta_{D_m}(z) = P_m(z) + \log \Delta^{in}_{D_m}(z)$ is consistent with i) and ii) of Definition 4.1 under (11.1). With the polynomials

\[
P_{(i)}(z) := \log \Delta_{D^{(i)}}(z) - \sum_k \log \Delta^{in}_{D_1}\big(z - \alpha^{(i)}_{1,k}\big) - \dots - \sum_k \log \Delta^{in}_{D_n}\big(z - \alpha^{(i)}_{n,k}\big) \tag{11.2}
\]

this is equivalent to

\[
P_{(i)}(z) = \sum_k P_1\big(z - \alpha^{(i)}_{1,k}\big) + \dots + \sum_k P_n\big(z - \alpha^{(i)}_{n,k}\big), \qquad i \in I, \tag{11.3}
\]

and this is equivalent to a system of linear equations for the $x_{m,l}$. Now by Zorn's lemma a system of linear equations has a solution if every finite subsystem has one. Thus it suffices to prove that there is always a solution if $|I| < \infty$. So wlog we may assume that $\delta(s)$ is a polynomial (as the finitely many $\log \Delta_{D^{(i)}}(z)$ depend only on finitely many $\delta_n$). And after a trivial translation we also assume that all divisors are directed and that there is an $\alpha > 0$ such that $|z| < \alpha$ implies $|z - \alpha^{(i)}_{m,k}| < |\rho|$ for all $\rho$ that occur in the $D^{(i)}$, $D_m$; we always assume $|z| < \alpha$. If we choose $\log \Delta^{in}_{D_m}(z) = \log \Delta^{\mathrm{Wei},g_m}_{D_m,0}(z)$ (compare Eq. (3.6)), then we have with the coefficients given in the proof of Theorem 1 b) and using Proposition 3.3:

\[
\log \Delta^{in}_{D_m}\big(z - \alpha^{(i)}_{m,k}\big) = -CT_{s=0}\Big( \delta(s) \Big[ \xi_{D_m}\big(s, z - \alpha^{(i)}_{m,k}\big) - \sum_{l=0}^{g_m} (-1)^l\, \frac{\big(z - \alpha^{(i)}_{m,k}\big)^l}{l!}\, \xi_{D_m}(s+l) \Big] \Big)
\]

and thus by Eq. (11.2)

\[
P_{(i)}(z) = -CT_{s=0}\Big( \delta(s) \Big[ \sum_k \sum_{l=0}^{g_1} (-1)^l\, \frac{\big(z - \alpha^{(i)}_{1,k}\big)^l}{l!}\, \xi_{D_1}(s+l) + \dots + \sum_k \sum_{l=0}^{g_n} (-1)^l\, \frac{\big(z - \alpha^{(i)}_{n,k}\big)^l}{l!}\, \xi_{D_n}(s+l) \Big] \Big), \qquad i \in I \tag{11.4}
\]

(as $\sum_k \xi_{D_1}(s, z - \alpha^{(i)}_{1,k}) + \dots + \sum_k \xi_{D_n}(s, z - \alpha^{(i)}_{n,k}) = \xi_{D^{(i)}}(s, z)$). Comparison of (11.4) and (11.3) leads one to introduce the functions

\[
\begin{aligned}
x_{m,l}(s) &:= \delta(s)\, \xi_{D_m}(s+l), \\
P_m(s, z) &:= \sum_{l=0}^{g_m} (-1)^{l+1}\, \frac{x_{m,l}(s)}{l!}\, z^l, \\
P_{(i)}(s, z) &:= \sum_k P_1\big(s, z - \alpha^{(i)}_{1,k}\big) + \dots + \sum_k P_n\big(s, z - \alpha^{(i)}_{n,k}\big),
\end{aligned} \tag{11.5}
\]

where the $x_{m,l}(s)$ and the $P_m(s, z)$ are all holomorphic for $\Re(s) > r_{max} := \max r_m$ while $P_{(i)}(s, z)$ is meromorphic for $\Re(s) > 0$ with

\[
CT_{s=0}\big(P_{(i)}(s, z)\big) = P_{(i)}(z), \qquad i \in I, \tag{11.6}
\]

for $|z| < \alpha$ as is seen from Eq. (11.4). We now expand (11.5) and (11.3) by powers of $z$:

\[
P_{(i)}(s, z) = \sum_{l=0}^{g_{max}} p_{(i),l}(s)\, z^l
\]

and $P_{(i)}(z) = \sum_{l=0}^{g_{max}} p_{(i),l}\, z^l$. Regard the $p_{(i),l}(s)$ and correspondingly the $p_{(i),l}$ as the components of vectors $p(s), p \in \mathbb{C}^M$, and regard the $x_{m,l}(s)$ and $x_{m,l}$ as the components of vectors $x(s), x \in \mathbb{C}^N$. The expansion of (11.3) and (11.5) by powers of $z$ delivers a matrix $B \in \mathrm{Mat}(N \times M, \mathbb{C})$ such that

\[
p(s) = B \cdot x(s) \quad \text{for } \Re(s) > r_{max}, \tag{11.7}
\]

and it has to be shown that there is an $x \in \mathbb{C}^N$ such that $p = B \cdot x$. But there is a matrix $\hat{B} \in \mathrm{Mat}(M \times N, \mathbb{C})$ such that a solution exists if and only if $\hat{B} \cdot p = 0$. With this $\hat{B}$ one has

\[
\hat{B} \cdot p = \hat{B} \cdot CT_{s=0}(p(s)) = CT_{s=0}(\hat{B} \cdot p(s)) = 0,
\]

where the first equality is obtained from (11.6) and the last from (11.7).

Remark. Observe that in the proof the operation $CT_{s=0}$ is applied to functions $f(s) = f_1(s) + f_2(s)$ with $f_1(s)$ being meromorphic around $s = 0$ and $f_2(s)$ defined only for $\Re(s) > 0$ but continuous at $s = 0$.

12. Proof of Theorem 4 The following estimate is needed to apply majorized convergence to integrals over ξD (s, z). Lemma 12.1. Let 0 < ϕi < ϕi < π , i + 1, 2 and let D be a directed divisor in Wlϕ1 ,ϕ2 and given r ≥ r such that c := ρ∈D |mD (ρ)ρk−r | < ∞. Then for (s0 ) > r, mD (ρ) r −(s )

0 (z − ρ)s0 = O |z|

(12.1)

ρ∈D

for z ∈ Wrϕ1 ,ϕ2 and |z| → ∞. Proof. We split the series in

|ρ|< 21 |z| and

|ρ|≥ 21 |z| and treat these two series separately. 1 r |ρ|<x |mD (ρ)| ≤ cx for any 2 |z| and use

For the first series we estimate |z − ρ| > x > 0 which follows immediately from the definition of c. This last inequality on the other hand implies x1 ≤|ρ|<x2

x2 mD (ρ) 1 ≤ cx r 1 + cr y r −1 α dy 1 α ρα x1 y x1

(12.2)

x for all α ∈ R>0 and 0 < x1 < x2 (as the rhs obviously maximizes x12 y −α dµ(y) under x the condition x1 dµ(y) ≤ cx r for all x ∈ [x1 , x2 ]). Observing that there is a β > 0 such that |z − ρ| > β|ρ| for all ρ ∈ D and z ∈ Wrϕ1 ,ϕ2 and using (12.2) we get the estimate also for the second series.


In the sequel we often tacitly use the following estimate: For $0 \le \varphi < \frac{\pi}{2}$, $\alpha_1 < \alpha_2$ and $m \in \mathbb{N}_0$,

\[
\Gamma^{(m)}(s) = O\big(e^{-\varphi|\Im(s)|}\big), \qquad |\Im(s)| \to \infty \tag{12.3}
\]

for $\alpha_1 < \Re(s) < \alpha_2$. For $m = 0$ this is part of the Stirling formula, for $m > 0$ it follows by applying Cauchy's inequalities.

Proof of Proposition 7.1. By majorized convergence (for b) apply the above Lemma) the two integral representations have to be proved only for one-point-divisors. In the sequel for expressions $a^s$ we always use $\arg(a) \in\, ]-\pi, \pi[$.

a) Let $\Re(s) > c > 0$ and $\Re(\rho) < 0$, $\Re(z) > 0$, then by Euler's Mellin integral for $\Gamma(s)$ and its inversion one has

\begin{align*}
\frac{\Gamma(s)}{(z-\rho)^s} &= \int_0^{\infty} e^{-zt}\, t^{s-1}\, e^{\rho t}\, dt \\
&= \int_0^{\infty} e^{-zt}\, t^{s-1}\, \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \frac{\Gamma(s')}{(-\rho)^{s'}}\, t^{-s'}\, ds'\, dt \\
&= \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \frac{\Gamma(s-s')}{z^{s-s'}}\, \frac{\Gamma(s')}{(-\rho)^{s'}}\, ds',
\end{align*}

the last equation by interchanging the integrations (Fubini). Using the identity theorem one gets this formula as needed for $\rho \in Wl_{\varphi_1,\varphi_2}$ and $z \in Wr_{\varphi_1,\varphi_2}$ because both sides are holomorphic in the variables $\rho$ and $z$.

b) For $\rho \in Wl_{\varphi_1,\varphi_2}$, $z \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s) > 0$ we will prove

\[
\frac{\Gamma(s)}{(z-\rho)^s} = \frac{1}{2\pi i} \int_C \frac{\Gamma(s-s_0+1)}{(z-w)^{s-s_0+1}}\, \frac{\Gamma(s_0)}{(w-\rho)^{s_0}}\, dw. \tag{12.4}
\]

For $z_0 \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s_0) < 1$ one has

\begin{align*}
\frac{1}{2\pi i} \int_C \frac{1}{(z_0-w)^{s-s_0+1}}\, \frac{1}{w^{s_0}}\, dw
&= z_0^{-s}\, \frac{e^{i\pi s_0} - e^{-i\pi s_0}}{2\pi i} \int_0^{\infty} \frac{v^{-s_0}}{(1+v)^{s-s_0+1}}\, dv \\
&= z_0^{-s}\, \frac{\sin \pi s_0}{\pi}\, \frac{\Gamma(1-s_0)\Gamma(s)}{\Gamma(s-s_0+1)} \\
&= \frac{1}{z_0^{s}}\, \frac{\Gamma(s)}{\Gamma(s_0)\Gamma(s-s_0+1)},
\end{align*}

the first equation by substituting $w \to -z_0 v$ and deforming the contour (residue theorem), the second because of the representation 1.5 (2) in [EMOT] for Euler's beta function $B(u, v) = \Gamma(u)\Gamma(v)\Gamma^{-1}(u+v)$ and the third because of the equation $\Gamma(1-s_0)\Gamma(s_0) = \pi \sin^{-1} \pi s_0$. Now for $\rho \in Wl_{\varphi_1,\varphi_2}$ such that $z = z_0 + \rho \in Wr_{\varphi_1,\varphi_2}$, replace the contour $C$ by the shifted contour $C - \rho$. The value of this integral is independent of $\rho$ (residue theorem) and (by majorized convergence for $\rho \to 0$) equals the value of the above integral. Applying the substitution $w \to (w - \rho)$ and the identity theorem yields (12.4) in the demanded generality.
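The second and third equalities in the chain above (the beta integral and the reflection formula) can be checked numerically for real parameters. The following is a minimal sketch, not part of the article: it only covers real $s > 0$ and real $s_0 < 1$, uses a plain midpoint rule, and the function names are hypothetical. The substitution $v = u/(1-u)$ turns the integral over $[0, \infty)$ into $\int_0^1 u^{-s_0}(1-u)^{s-1}\,du$.

```python
import math


def beta_integral(s, s0, n=200_000):
    """Midpoint-rule value of the integral of v**(-s0) / (1 + v)**(s - s0 + 1)
    over [0, inf), computed after substituting v = u / (1 - u)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += u ** (-s0) * (1.0 - u) ** (s - 1.0)
    return total * h


def chain_values(s, s0):
    """Return the three expressions of the chain with the common factor
    z0**(-s) removed, for real s > 0 and real s0 < 1."""
    prefactor = math.sin(math.pi * s0) / math.pi
    first = prefactor * beta_integral(s, s0)
    second = prefactor * math.gamma(1.0 - s0) * math.gamma(s) / math.gamma(s - s0 + 1.0)
    third = math.gamma(s) / (math.gamma(s0) * math.gamma(s - s0 + 1.0))
    return first, second, third


if __name__ == "__main__":
    print(chain_values(2.5, -0.5))
```

For $s = 2.5$ and $s_0 = -0.5$ the three values should agree up to the quadrature error of the midpoint rule.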


Proof of Theorem 4. A) ⇒ B’). Let −p < −q < r < c with (pn ) = q for all n. Then by the residue theorem (7.1) for (s) > c and z ∈ Wrϕ1 ,ϕ2 becomes −q+i∞ "(s − s ) 1 ξD (s, z) = ξD (s )ds 2π i −q−i∞ zs−s "(s − s ) + Ress =−pn ξD (s ) zs−s (pn ) −q as is seen by the identity theorem. From this B’) easily follows for (s0 ) > −p. If −p < (s0 ) ≤ −p then first take the Stirling asymptotic for s0 = s0 + k with k ∈ N such that −p < (s0 ) ≤ −p + 1 and integrate k times. B’) ⇒ A). First note that we just need to show that ξD (s) is (σ1 , σ2 )-bounded pregular, but we do not need to determine the singular part distribution as then because of A) ⇒ B’) and the properties of the map [q] it must be the demanded one. Let now 0 < σi < σi < σ . We have for (s) > r and C = Cσ1 ,σ2 , 1 "(s − s0 + 1) CTs1 =0 (δ(s1 )ξD (s1 + s0 , w)) dw, (12.5) ξD (s, z) = 2πi C (z − w)s−s0 +1 which is obtained by applying partial integration to (7.2) using (3.2) where the necessary estimates for |w| → ∞ are derived by integrating (12.1). Now as (12.5) is not valid for z = 0 one has to use a little trick: One deforms the contour C and uses a "shifted" Stirling asymptotics. Let ε > 0 then by Taylor series expansion one obtains a pB-systems n ) with abscissa p such that ( pn , B q (z) := CTs1 =0 (δ(s1 )ξD (s1 + s0 , z)) R B n (log(z + ε)) s0 (z) − −P (z + ε)pn +s0 ( pn ) max(r, −(p0 ), deg P 1 Bn (log(w + ε)) "(s − s0 + 1) ξD (s) = dw 2π i C (w + ε)pn +s0 (−w)s−s0 +1 ( pn ) 0, e xp(ρ, t, s) − e2πi(s−1) e xp(ρ, exp(2π i)t, s) i(α+δ) ei(α−δ) ∞ wt e ∞ t −(s−1) e = − dw "(s)eρt 2πi ws 0 0 ! " (e−π i(s−1) −eπ i(s−1) )t s−1 "(1−s)

=

1 "(s)"(1 − s)2i sin(π s)eρt = eρt , 2πi

thus (13.1), by the identity theorem also in general. It remains to prove (13.3). We assume $\arg(\rho t) \in\, ]-\varepsilon', \varepsilon'[$ for $0 < \varepsilon' < \frac{\pi}{2}$; the general case follows then by rotating the ray of integration,

\[
\widetilde{\exp}(\rho, t, s) = \frac{\Gamma(s)}{2\pi i}\, (-\rho t)^{-(s-1)} \int_0^{\infty} \frac{e^{(-\rho t)w}}{(w+1)^s}\, dw,
\]

which immediately gives the estimate $|\widetilde{\exp}(\rho, t, s)| < c_1 |\rho t|^{-(\Re(s)-1)}$ for $|\rho t| \ge 1$ for a suitable $c_1 > 0$ and thus (13.3) by (13.2). For $|\rho t| \le 1$ on the other hand with $0 < \alpha := \Re(\rho t) \le 1$ and the trivial estimate $e^{-x} \le \frac{1}{x+\alpha}$ for $x \in \mathbb{R}_{\ge 0}$ one has

\[
\int_0^{\infty} \frac{e^{-\Re(\rho t)w}}{|(w+1)^s|}\, dw = \alpha^{\Re(s)-1} \int_0^{\infty} \frac{e^{-x}}{(x+\alpha)^{\Re(s)}}\, dx \le \alpha^{\Re(s)-1} \int_0^{\infty} \frac{dx}{(x+\alpha)^{\Re(s)+1}},
\]

and the assertion easily follows also for $|\rho t| \le 1$.

14. Miscellanea

In Chapter 2 of [Il1] the formalism of regularized determinants was developed more generally: Following Jorgenson and Lang ([JL1]), divisors with non-integer multiplicities $m_D : \mathbb{C} \to \mathbb{C}$ (instead of $\mathbb{C} \to \mathbb{Z}$) were regarded; then everything can be carried out with almost no difficulties, except that the associated functions become multivalued with the $\rho \in D$ as branch points. Also essential singularities for $\xi_D(s)$ were allowed. In that case it is necessary that the formal power series $\delta(s)$ is convergent near zero. With this assumption almost everything can be done in general although some not completely trivial convergence problems occur.

The maps $[q]$ and $[[q]]$ defined in Sects. 7 and 9 are special cases of the following construction: For $q \in \mathbb{C}$, a regularization sequence $\delta$ and a function $h(s)$ which is meromorphic in a neighborhood of $q$, we define a linear map $[h, q] : \mathbb{C}[z] \to \mathbb{C}[z]$ (notation: $B(z) \mapsto B^{[h,q]}(z)$) by

\[
CT_{s=0}\big(\delta(s) B(\partial_s)[h(s) z^{s+q}]\big) = B^{[h,q]}(\log z)\, z^q.
\]

If $h_1(s)$ and $h_2(s)$ are two such functions, then if $h_1(s)$ is, in addition, holomorphic at $s = q$ the composition law $[h_2, q] \circ [h_1, q] = [h_1 \cdot h_2, q]$ is easily checked. For example this implies $[[q]] = -\frac{1}{2\pi i}[1-q] \circ [q]$ for $q \ne -n$, $n \in \mathbb{N}_0$. Also a sort of inverse of $[q]$ can be defined (compare Satz 2.3.6 in [Il1]).

Acknowledgements. I would like to thank C. Deninger for supervising my Ph.D. thesis as well as M. Schröter, I. Vardi, C. Bree, C. Soulé, A. Voros, J. B. Bost and J. Jorgenson for helpful discussions and improvements. Parts of the article were written during a visit at the IHES.

References

[Al] Almquist, G.: Asymptotic Formulas and Generalized Dedekind Sums. Exp. Math. 7, 343–359 (1998)
[Bar] Barnes, E.W.: On the Theory of the Multiple Gamma Function. Phil. Trans. of the Royal Soc. (A) 19, 374–439 (1904)
[Cr] Cramér, H.: Studien über die Nullstellen der Riemannschen Zetafunktion. Math. Zeitschrift 4, 104–130 (1919)
[CV] Cartier, P., Voros, A.: Une nouvelle interpretation de la formule des traces de Selberg. In: The Grothendieck Festschrift, Vol. 2, Basel–Boston: Birkhäuser, 1991, pp. 1–67
[De1] Deninger, C.: Motivic L-functions and regularized determinants. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 707–743
[De2] Deninger, C.: Motivic L-functions and regularized determinants II. In: F. Catanese (ed.) Proc. Arithmetic Geometry, Cortona, 1994
[De3] Deninger, C.: Some Analogies between Number Theory and dynamical Systems on foliated Spaces. Documenta Mathematica, extra vol. ICM 1998, I, Plenary Talks, pp. 23–46
[Do] Doetsch, G.: Handbuch der Laplacetransformation I/II, Basel: Birkhäuser, 1950/1955
[Ef] Efrat, L.: Determinants of Laplacians on surfaces of finite volume. Commun. Math. Phys. 119, 443–451 (1988); Erratum. Commun. Math. Phys. 138, 607 (1991)
[EMOT] Erdelyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher transcendental functions I, II, III. New York: McGraw-Hill, 1953
[EORBZ] Elizalde, E., Odintsov, S.D., Romeo, A., Bytsenko, A.A., Zerbini, S.: Zeta regularization techniques with applications. Singapore: World Scientific, 1994
[Gui] Guinand, A.D.: Fourier reciprocities and the Riemann zeta function. Proc. London Math. Soc. (2) 51, 401–414 (1950)
[Il1] Illies, G.: Regularized products, trace formulas and Cramér functions. Ph.D.-thesis (in German), Schriftenreihe des mathematischen Instituts der Universität Münster, 3. Serie, Heft 22, 1998
[Il2] Illies, G.: Cramér functions and Guinand equations. IHES-preprint 1999
[JL1] Jorgenson, J., Lang, S.: Basic Analysis of regularized series and products. LNM 1564, Berlin: Springer, 1994
[JL2] Jorgenson, J., Lang, S.: On Cramér's theorem for general Euler products with functional equation. Math. Ann. 297/3, 383–416 (1993)
[JL3] Jorgenson, J., Lang, S.: Extension of analytic number theory and the theory of regularized harmonic series from Dirichlet series to Bessel series. Math. Ann. 306, 75–124 (1996)
[Ko1] Koyama, S.Y.: Determinant expressions of Selberg zeta functions I. Trans. AMS 324, 149–168 (1991)
[Ko2] Koyama, S.Y.: Determinant expressions of Selberg zeta functions II. Trans. AMS 329, 755–772 (1992)
[Ko3] Koyama, S.Y.: Determinant expressions of Selberg zeta functions III. Proc. AMS 113, 303–311 (1991)
[Ku] Kurokawa, N.: Multiple sine functions and Selberg zeta functions. Proc. Japan Acad. 67A, 61–64 (1991)
[LW] Landau, E., Walfisz, A.: Über die Nichtfortsetzbarkeit einiger durch Dirichletsche Reihen definierter Funktionen. Rend. di Palermo 44, 82–86 (1919)
[Ma] Manin, Y.I.: Lectures on zeta functions and motives. Preprint MPI Bonn, 1992
[No] Norlund, N.E.: Memoire sur les polynomes de Bernoulli. Acta Mathematica 43, 121–196 (1920)
[QHS] Quine, J.R., Heydari, S.H., Song, R.Y.: Zeta-regularized products. Trans. of the AMS 338, 1, 213–231 (1993)
[RS] Ray, D., Singer, I.: Analytic torsion for analytic manifolds. Ann. Math. 98, 154–177 (1973)
[Sa] Sarnak, P.: Determinants of Laplacians. Commun. Math. Phys. 110, 113–120 (1987)
[ScSo] Schröter, M., Soulé, C.: On a Result of Deninger Concerning Riemann's Zeta Function. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 745–747
[So] Soulé, C.: Letter to C. Deninger, 13.2.1991, as: M. Schröter, S. Soulé: On a result of Deninger concerning Riemann's zeta function. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 745–747
[Ti] Titchmarsh, E.C.: The Theory of Functions. 2nd ed., Oxford: Oxford University Press, 1939
[Va] Vardi, I.: Determinants of Laplacians and multiple Gamma Functions. Siam J. Math. Anal. 19, 1, 493–507 (1988)
[Vi] Vigneras, M.F.: L'equation fonctionelle de la fonction zeta de Selberg du groupe modulaire SL(2, Z). Asterisque 61, 235–249 (1979)
[Vo] Voros, A.: Spectral Functions, Special Functions and the Selberg Zeta Function. Commun. Math. Phys. 110, 439–465 (1987)

Communicated by P. Sarnak

Commun. Math. Phys. 220, 95 – 104 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Super Brockett Equations: A Graded Gradient Integrable System

R. Felipe (1), F. Ongay (2)

(1) ICIMAF, Havana, Cuba, and Universidad de Antioquia, Medellín, Colombia
(2) CIMAT, Guanajuato, Mexico. E-mail: [email protected]

Partially supported by CONACYT, Mexico, project 28-492E and CODI project "Complete integrability of Brockett type equations", University of Antioquia, Colombia.

Received: 9 February 2000 / Accepted: 18 January 2001

Abstract: Rather recently, equations of Lax type defined by a double commutator, the so-called Brockett equations, have received considerable attention. In this paper we prove that a supersymmetric version of a Brockett hierarchy is an infinite dimensional integrable gradient system. As far as we know, this is the only graded system of this type existing in the literature.

0. Introduction

Ever since the discovery in 1968 by Gardner, Green, Kruskal and Miura of the inverse scattering method to solve the KdV equations, the theory of infinite dimensional integrable systems, sometimes also known as the theory of soliton equations, has been the subject of a great deal of work, and many results and applications have stemmed from this newfound attention to the subject. As is well known, one of the first major developments came with the realization that these systems can be put in the so-called Lax form, $\dot{L} = [L, N]$, since this description is particularly well suited to stress some of the geometrical interpretations of the equations, in particular allowing one to place them into a Hamiltonian framework.

On the other hand, some ten years ago, ODEs of Lax type defined by more than one Lie bracket were introduced by R. Brockett (see [B1] and [B2]), in connection with some least squares matching and sorting problems. Surprisingly enough, these so-called Brockett systems exhibit many remarkable features besides the original intended ones: to name one, it was discovered by A. Bloch, R. Brockett and T. Ratiu (see e.g. [B-B-R]) that the equations corresponding to the celebrated Toda lattice can be cast into this mold. But moreover, another property of these equations, still more relevant to our purposes, was also proved in [B-B-R], where it was shown that these finite dimensional systems are completely integrable, but of gradient type (the existence of a suitable Hamiltonian structure remaining an open question).

Quite recently, the theory of Brockett equations was adapted by one of us for PDEs (reference [F]), and it was proved that many important properties, such as the complete integrability and the property of being a gradient system, were still valid in this infinite dimensional context, but also that this analog of the Brockett equation belongs to a hierarchy, similar to the well known KdV or KP hierarchies.

In this work we consider yet another extension of the Brockett system: Following the approach to supersymmetric (i.e., $\mathbb{Z}_2$-graded) versions of the KP hierarchy, studied for example by Manin and Radul ([M-R]), Mulase ([Mu2]), or Rabin ([R]), we define and study a supersymmetric extension of the Brockett hierarchy introduced in [F]. In particular, our main results will show that the properties of being completely integrable and a gradient flow also extend to this graded hierarchy; to the best of our knowledge, this is the first example of a graded system possessing these properties. Furthermore, the flows associated to this new hierarchy naturally "live" on a flag in the space of gauge operators, and we conjecture that this geometric feature of our construction might be of some use in the algebro-geometric study of deformations of line bundles over algebraic curves, both in the classical and graded case.

1. A $\mathbb{Z}_2$-Graded Brockett Hierarchy

We will consider in this work a rather standard (1, 1) dimensional setting, namely, the one studied by Manin and Radul, which we now briefly recall, referring the reader to the basic reference [M-R] for more details (see also [Mu2]): First of all, let $x$ denote an even variable, $\xi$ an odd one (the parity of an object will be denoted by a tilde, so that for instance $\tilde{x} = 0$; $\tilde{\xi} = 1$), and fix some ring of "superfunctions" in these variables (for instance, we may take the ring of formal power series in $x$ and $\xi$), $B$, where the operator $\theta = \partial_\xi + \xi\partial_x$ acts as an odd derivation (recall that $\theta^2 = \partial_x$). Then one considers the ring of (formal) super pseudo-differential operators, $B((\theta^{-1}))$, with coefficients in $B$. To avoid confusion with the action of the derivations on the operators, the product in this ring will be denoted by $\circ$, and by $\theta^{-1}$ we will denote the (formal) inverse of $\theta$. Thus, every operator $L \in B((\theta^{-1}))$ can be written as a formal series

\[
L = \sum_{i \le m} b_i\, \theta^i,
\]

and, as usual, we will write

\[
L_+ = \sum_{i \ge 0} b_i\, \theta^i; \qquad L_- = \sum_{i < 0} b_i\, \theta^i, \tag{1}
\]
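The identity $\theta^2 = \partial_x$ recalled above can be checked on a toy model. The following is a minimal sketch and not part of the original paper: it assumes superfunctions of the form $f = f_0(x) + \xi f_1(x)$ whose components are polynomials with ordinary (commuting) numerical coefficients, stored as coefficient lists, and it takes $\theta = \partial_\xi + \xi\partial_x$. The function names are hypothetical.

```python
def d_dx(poly):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]
    of 1, x, x**2, ...; the zero polynomial is represented as [0]."""
    return [k * c for k, c in enumerate(poly)][1:] or [0]


def theta(f):
    """Apply theta = d/dxi + xi * d/dx to f = f0(x) + xi * f1(x),
    stored as the pair (f0, f1)."""
    f0, f1 = f
    # d/dxi picks out f1; xi * d/dx contributes xi * f0' because xi * xi = 0.
    return (list(f1), d_dx(f0))


def d_dx_super(f):
    """Apply d/dx componentwise to f = f0(x) + xi * f1(x)."""
    f0, f1 = f
    return (d_dx(f0), d_dx(f1))


if __name__ == "__main__":
    f = ([1, 2, 3], [4, 5])        # f = 1 + 2x + 3x^2 + xi * (4 + 5x)
    print(theta(theta(f)))          # theta applied twice
    print(d_dx_super(f))            # d/dx applied once
```

In this model $\theta$ maps $(f_0, f_1)$ to $(f_1, f_0')$, so applying it twice gives $(f_0', f_1')$, which is the componentwise $x$-derivative.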

i 0, Eq. (7) gives the flow of the gradient of the graded Adler functional Fk (S) on the affine subspace 1 + E (−k−2) . Proof. Indeed, to end the proof of our claim, it remains only to observe that, from Lemma 2, we have θk S −1 = −S −1 ◦ θk S ◦ S −1 = (−1)k+1 S −1 ◦ [, k+1 − ]. Therefore, modulo an inessential sign, the right-hand side of (15) is in fact equivalent to the right-hand side of (7), which we have already shown to be equivalent to the super Brockett system. Remark. The graded hierarchy that we have constructed in this paper preserves, and in a definite sense generalizes, several of the remarkable features of the standard Brockett equation. But moreover, we have also seen that these super Brockett equations will induce a flow on an infinite Grassmannian, of a different type to that given by the known super KP flows. We conjecture, therefore, that this hierarchy might also be of value, for instance, for the algebro-geometric study of deformations of superline bundles over supercurves, etc. (and it is clear that this remark also applies to the non-graded case; see also [F]). We hope to clarify some of these questions in a future work. Acknowledgements. Both authors wish to express their indebtedness to Prof. J. Rabin, who patiently listened to our expositions of a preliminary version of this work, and made several valuable comments. The bulk of this paper was done during reciprocal visits by each author to his coauthor’s respective institution; both of us thankfully acknowledge their hospitality during these stays. Finally, we are grateful to one of the referees, who pointed out an error in the original manuscript.

References

[B-B-R] Bloch, A.M., Brockett, R.W., and Ratiu, T.S.: Completely integrable gradient flows. Commun. Math. Phys. 147, 57–74 (1992)
[B1] Brockett, R.W.: Least squares matching problems. Linear Algebra Appl. 122, 761–777 (1989)
[B2] Brockett, R.W.: Dynamical systems that sort lists, diagonalize matrices, and solve linear programming problems. Linear Algebra Appl. 146, 79–91 (1991)
[D] Dickey, L.A.: Soliton equations and Hamiltonian systems. Advanced Series in Math. Phys. 12, Singapore: World Scientific, 1991
[F] Felipe, R.: Algebraic aspects of Brockett type equations. Physica D 132, 287–297 (1999)
[M-R] Manin, Yu.I., and Radul, O.A.: A supersymmetric extension of the Kadomtsev–Petviashvili hierarchy. Commun. Math. Phys. 98, 65–77 (1985)
[Mu1] Mulase, M.: Complete integrability of the Kadomtsev–Petviashvili equation. Adv. Math. 54, 57–66 (1984)
[Mu2] Mulase, M.: A new super KP system and a characterization of the Jacobians of arbitrary algebraic supercurves. J. Diff. Geom. 34, 651–680 (1991)
[R] Rabin, J. M.: The geometry of super KP flows. Commun. Math. Phys. 137, 533–552 (1991)

Communicated by T. Miwa

Commun. Math. Phys. 220, 105 – 164 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Fermionic Formulas for Level-Restricted Generalized Kostka Polynomials and Coset Branching Functions Anne Schilling1, , Mark Shimozono2, 1 Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge,

MA 02139, USA. E-mail: [email protected]

2 Department of Mathematics, Virginia Tech, Blacksburg, VA 24061-0123, USA.

E-mail: [email protected] Received: 9 April 2000 / Accepted: 26 January 2001

⋆ New address as of July 2001: Department of Mathematics, University of California, One Shields Ave., Davis, CA 95616-8633, USA. E-mail: [email protected]
⋆⋆ Partially supported by NSF grant DMS-9800941.

Abstract: Level-restricted paths play an important rôle in crystal theory. They correspond to certain highest weight vectors of modules of quantum affine algebras. We show that the recently established bijection between Littlewood–Richardson tableaux and rigged configurations is well-behaved with respect to level-restriction and give an explicit characterization of level-restricted rigged configurations. As a consequence a new general fermionic formula for the level-restricted generalized Kostka polynomial is obtained. Some coset branching functions of type A are computed by taking limits of these fermionic formulas.

1. Introduction

Generalized Kostka polynomials [26, 33, 35–38] are q-analogues of the tensor product multiplicity

$$c_R^{\lambda} = \dim \operatorname{Hom}_{sl_n}\bigl(V^{\lambda},\, V^{R_1} \otimes \cdots \otimes V^{R_L}\bigr), \tag{1.1}$$

where $\lambda$ is a partition, $R = (R_1, \ldots, R_L)$ is a sequence of rectangles and $V^{\lambda}$ is the irreducible integrable highest weight module of highest weight $\lambda$ over the quantized enveloping algebra $U_q(sl_n)$. The generalized Kostka polynomials can be expressed as generating functions of classically restricted paths [30, 33, 37]. In terms of the theory of $U_q(sl_n)$-crystals [16, 17] these paths correspond to the highest weight vectors of tensor products of perfect crystals. The statistic is given by the energy function on paths.

The $U_q(sl_n)$-crystal structure on paths can be extended to a $U_q(\widehat{sl}_n)$-crystal structure [18]. The level-restricted paths are the subset of classically restricted paths which,


after tensoring with the crystal graph of a suitable integrable highest weight $U_q(\widehat{sl}_n)$-module, are affine highest weight vectors. Hence it is natural to consider the generating functions of level-restricted paths, giving rise to level-restricted generalized Kostka polynomials, which will take a lead rôle in this paper. The notion of level-restriction is also very important in the context of restricted-solid-on-solid (RSOS) models in statistical mechanics [3] and fusion models in conformal field theory [39]. The one-dimensional configuration sums of RSOS models are generating functions of level-restricted paths (see for example [2, 9, 14]). The structure constants of the fusion algebras of Wess–Zumino–Witten conformal field theories are exactly the level-restricted analogues of the Littlewood–Richardson coefficients in (1.1), as shown by Kac [15, Exercise 13.35] and Walton [40, 41]. q-Analogues of these level-restricted Littlewood–Richardson coefficients in terms of ribbon tableaux were proposed in ref. [10].

The generalized Kostka polynomial admits a fermionic (or quasi-particle) formula [25]. Fermionic formulas originate from the Bethe Ansatz [4], which is a technique to construct eigenvectors and eigenvalues of row-to-row transfer matrices of statistical mechanical models. Under certain assumptions (the string hypothesis) it is possible to count the solutions of the Bethe equations, resulting in fermionic expressions which look like sums of products of binomial coefficients. The Kostka numbers arise in the study of the XXX model in this way [22–24]. Fermionic formulas are of interest in physics since they reflect the particle structure of the underlying model [20, 21] and also reveal information about the exclusion statistics of the particles [5–7]. The fermionic formula of the Kostka polynomial can be combinatorialized by taking a weighted sum over sets of rigged configurations [22–24]. In ref. [25] the fermionic formula for the generalized Kostka polynomial was proven by establishing a statistic-preserving bijection between Littlewood–Richardson tableaux and rigged configurations.

In this paper we show that this bijection is well-behaved with respect to level-restriction and we give an explicit characterization of level-restricted rigged configurations (see Definition 5.5 and Theorem 8.2). This enables us to obtain a combinatorial formula for the level-restricted generalized Kostka polynomials as the generating function of level-restricted rigged configurations (see Theorem 5.7). As an immediate consequence this proves a new general fermionic formula for the level-restricted generalized Kostka polynomial (see Theorem 6.2 and Eq. (6.7)). Special cases of this formula were conjectured in refs. [8, 12, 13, 27, 33, 42]. As opposed to some definitions of “fermionic formulas”, the expression of Theorem 6.2 involves in general explicit negative signs. However, we would like to point out that because of the equivalent combinatorial formulation in terms of rigged configurations as given in Theorem 5.7, the fermionic sum is manifestly positive (i.e., a polynomial with positive coefficients).

The branching functions of type A can be described in terms of crystal graphs of irreducible integrable highest weight $U_q(\widehat{sl}_n)$-modules. For certain triples of weights they can be expressed as limits of level-restricted generalized Kostka polynomials. The structure of the rigged configurations allows one to take this limit, thereby yielding a fermionic formula for the corresponding branching functions (see Eq. (7.10)). The derivation of this formula requires the knowledge of the ground state energy, which is obtained from the explicit construction of certain local isomorphisms of perfect crystals (see Theorem 7.3). A more complete set of branching functions can be obtained by considering “skew” level-restricted generalized Kostka polynomials. We conjecture that rigged configurations are also well-behaved with respect to skew shapes (see Conjecture 8.3).


The paper is structured as follows. Section 2 sets out notation used in the paper. In Sect. 3 we review some crystal theory, in particular the definition of level-restricted paths, which are used to define the level-restricted generalized Kostka polynomials. Littlewood–Richardson tableaux and their level-restricted counterparts are defined in Sect. 4. The formulation of the generalized Kostka polynomials in terms of Littlewood–Richardson tableaux with charge statistic is necessary for the proof of the fermionic formula, which makes use of the bijection between Littlewood–Richardson tableaux and rigged configurations. The latter are the subject of Sect. 5, which also contains the new definition of level-restricted rigged configurations and our main Theorem 5.7. The proof of this theorem is reserved for Sect. 8. The fermionic formulas for the level-restricted Kostka polynomial and the type A branching functions are given in Sects. 6 and 7, respectively.

2. Notation

All partitions are assumed to have $n$ parts, some of which may be zero. Let $R = (R_1, R_2, \ldots, R_L)$ be a sequence of partitions whose Ferrers diagrams are rectangles. Let $R_j$ have $\mu_j$ columns and $\eta_j$ rows for $1 \le j \le L$. We adopt the English notation for partitions and tableaux. Unless otherwise specified, all tableaux are assumed to be column-strict (that is, the entries in each row weakly increase from left to right and in each column strictly increase from top to bottom).

3. Paths

The main goal of this section is to define the level-restricted generalized Kostka polynomials. These polynomials are defined in terms of certain finite $U_q(\widehat{sl}_n)$-crystal graphs whose elements are called paths. The theory of crystal graphs was invented by Kashiwara [16], who showed that the quantized universal enveloping algebras of Kac–Moody algebras and their integrable highest weight modules admit special bases whose structure at $q = 0$ is specified by a colored graph known as the crystal graph. The crystal graphs for the finite-dimensional irreducible modules for the classical Lie algebras were computed explicitly by Kashiwara and Nakashima [17]. The theory of perfect crystals gave a realization of the crystal graphs of the irreducible integrable highest weight modules for affine Kac–Moody algebras, as certain eventually periodic sequences of elements taken from finite crystal graphs [19]. This realization is used for the main application, some new explicit formulas for coset branching functions of type A.

3.1. Crystal graphs. Let $U_q(\mathfrak{g})$ be the quantized universal enveloping algebra for the Kac–Moody algebra $\mathfrak{g}$. Let $I$ be an indexing set for the Dynkin diagram of $\mathfrak{g}$, $P$ the weight lattice of $\mathfrak{g}$, $P^*$ the dual lattice, $\{\alpha_i \mid i \in I\}$ the (not necessarily linearly independent) simple roots, $\{h_i \mid i \in I\}$ the simple coroots, and $\{\Lambda_i \mid i \in I\}$ the fundamental weights. Let $\langle \cdot\,, \cdot \rangle$ denote the natural pairing of $P^*$ and $P$. Suppose $V$ is a $U_q(\mathfrak{g})$-module with crystal graph $B$. Then $B$ is a directed graph whose vertex set (also denoted $B$) indexes a basis of weight vectors of $V$, and has directed edges colored by the elements of the set $I$. The edges may be viewed as a combinatorial version of the action of Chevalley generators. This graph has the property that for every $b \in B$ and $i \in I$, there is at most one edge colored $i$ entering (resp. leaving) $b$. If there is an edge $b \to b'$ colored $i$, denote this by $f_i(b) = b'$ and $e_i(b') = b$. If there is no edge


colored $i$ leaving $b$ (resp. entering $b'$) then say that $f_i(b)$ (resp. $e_i(b')$) is undefined. The $f_i$ and $e_i$ are called Kashiwara lowering and raising operators. Define $\phi_i(b)$ (resp. $\epsilon_i(b)$) to be the maximum $m \in \mathbb{N}$ such that $f_i^m(b)$ (resp. $e_i^m(b)$) is defined. There is a weight function $\mathrm{wt} : B \to P$ that satisfies the following properties:

$$\mathrm{wt}(f_i(b)) = \mathrm{wt}(b) - \alpha_i, \qquad \mathrm{wt}(e_i(b)) = \mathrm{wt}(b) + \alpha_i, \qquad \langle h_i, \mathrm{wt}(b) \rangle = \phi_i(b) - \epsilon_i(b). \tag{3.1}$$

$B$ is called a $P$-weighted $I$-crystal. Let $P^+ = \{\Lambda \in P \mid \langle h_i, \Lambda \rangle \ge 0,\ \forall i \in I\}$ be the set of dominant integral weights. For $\Lambda \in P^+$ denote by $V(\Lambda)$ the irreducible integrable highest weight $U_q(\mathfrak{g})$-module of highest weight $\Lambda$. Let $B(\Lambda)$ be its crystal graph. Say that an element $b \in B$ of the $P$-weighted $I$-crystal $B$ is a highest weight vector if $\epsilon_i(b) = 0$ for all $i \in I$. Let $u_\Lambda$ be the highest weight vector in $B(\Lambda)$. By (3.1), for all $i \in I$,

$$\epsilon_i(u_\Lambda) = 0, \qquad \phi_i(u_\Lambda) = \langle h_i, \Lambda \rangle. \tag{3.2}$$

Let $B'$ be the crystal graph of a $U_q(\mathfrak{g})$-module $V'$. A morphism of $P$-weighted $I$-crystals is a map $\tau : B \to B'$ such that $\mathrm{wt}(\tau(b)) = \mathrm{wt}(b)$ and $\tau(f_i(b)) = f_i(\tau(b))$ for all $b \in B$ and $i \in I$. In particular $f_i(b)$ is defined if and only if $f_i(\tau(b))$ is.

Suppose $V$ and $V'$ are $U_q(\mathfrak{g})$-modules with crystal graphs $B$ and $B'$ respectively. Then $V \otimes V'$ admits a crystal graph denoted $B \otimes B'$ which is equal to the direct product $B \times B'$ as a set. We use the opposite of the convention used in the literature. Define

$$f_i(b \otimes b') = \begin{cases} b \otimes f_i(b') & \text{if } \phi_i(b') > \epsilon_i(b), \\ f_i(b) \otimes b' & \text{if } \phi_i(b') \le \epsilon_i(b) \text{ and } \phi_i(b) > 0, \\ \text{undefined} & \text{otherwise.} \end{cases} \tag{3.3}$$

Equivalently,

$$e_i(b \otimes b') = \begin{cases} e_i(b) \otimes b' & \text{if } \phi_i(b') < \epsilon_i(b), \\ b \otimes e_i(b') & \text{if } \phi_i(b') \ge \epsilon_i(b) \text{ and } \epsilon_i(b') > 0, \\ \text{undefined} & \text{otherwise.} \end{cases} \tag{3.4}$$

One has

$$\phi_i(b \otimes b') = \phi_i(b) + \max\{0, \phi_i(b') - \epsilon_i(b)\}, \qquad \epsilon_i(b \otimes b') = \max\{0, \epsilon_i(b) - \phi_i(b')\} + \epsilon_i(b'). \tag{3.5}$$

Finally $\mathrm{wt} : B \otimes B' \to P$ is defined by $\mathrm{wt}(b \otimes b') = \mathrm{wt}_B(b) + \mathrm{wt}_{B'}(b')$, where $\mathrm{wt}_B : B \to P$ and $\mathrm{wt}_{B'} : B' \to P$ are the weight functions for $B$ and $B'$. This construction is “associative”, that is, the $P$-weighted $I$-crystals form a tensor category.

Remark 3.1. It follows from (3.4) that if $b = b_L \otimes \cdots \otimes b_1$ and $e_i(b)$ is defined, then $e_i(b) = b_L \otimes \cdots \otimes b_{j+1} \otimes e_i(b_j) \otimes b_{j-1} \otimes \cdots \otimes b_1$ for some $1 \le j \le L$.
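The tensor product rule can be checked numerically for a single color $i$. The following Python sketch is only an illustration: it records each tensor factor by its pair $(\phi_i, \epsilon_i)$ and assumes that $f_i$ moves a factor one step down its $i$-string ($\phi_i$ drops by one, $\epsilon_i$ grows by one). The names `Factor`, `f_on_tensor`, `phi_tensor`, `eps_tensor` and `count_lowerings` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Optional, Tuple


class Factor(NamedTuple):
    """A crystal element seen through one fixed colour i: only (phi_i, eps_i) is kept."""
    phi: int
    eps: int


def lower(x: Factor) -> Factor:
    # Assumption of this sketch: f_i moves one step down the i-string.
    return Factor(x.phi - 1, x.eps + 1)


def f_on_tensor(b: Factor, bp: Factor) -> Optional[Tuple[Factor, Factor]]:
    """Rule (3.3) for f_i(b (x) b'); returns None when f_i is undefined."""
    if bp.phi > b.eps:
        return b, lower(bp)      # f_i acts on the right factor b'
    if b.phi > 0:                # here bp.phi <= b.eps
        return lower(b), bp      # f_i acts on the left factor b
    return None


def phi_tensor(b: Factor, bp: Factor) -> int:
    """First formula of (3.5)."""
    return b.phi + max(0, bp.phi - b.eps)


def eps_tensor(b: Factor, bp: Factor) -> int:
    """Second formula of (3.5)."""
    return max(0, b.eps - bp.phi) + bp.eps


def count_lowerings(b: Factor, bp: Factor) -> int:
    """Number of times f_i can be applied to b (x) b' using rule (3.3)."""
    steps = 0
    state = f_on_tensor(b, bp)
    while state is not None:
        steps += 1
        state = f_on_tensor(*state)
    return steps
```

In this model, counting how often rule (3.3) can be applied reproduces the first formula of (3.5), and each application raises the value given by the second formula by one.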


3.2. $U_q(sl_n)$-crystal graphs on tableaux. Let $J = \{1, 2, \ldots, n-1\}$ be the indexing set for the Dynkin diagram of type $A_{n-1}$, with weight lattice $P_{\mathrm{fin}}$, simple roots $\{\bar\alpha_i \mid i \in J\}$, fundamental weights $\{\bar\Lambda_i \mid i \in J\}$, and simple coroots $\{h_i \mid i \in J\}$. Let $\lambda = (\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n) \in \mathbb{N}^n$ be a partition. There is a natural projection $\mathbb{Z}^n \to P_{\mathrm{fin}}$ denoted $\lambda \mapsto \bar\lambda = \sum_{i=1}^{n-1} (\lambda_i - \lambda_{i+1}) \bar\Lambda_i$. Let $V(\bar\lambda)$ be the irreducible integrable highest weight module of highest weight $\bar\lambda$ over the quantized universal enveloping algebra $U_q(sl_n)$ [17]. By abuse of notation we shall write $V^\lambda = V(\bar\lambda)$ and denote the crystal graph of $V^\lambda$ by $B_\lambda$. As a set $B_\lambda$ may be realized as the set of tableaux of shape $\lambda$ over the alphabet $\{1, 2, \ldots, n\}$. Define the content of $b \in B_\lambda$ by $\mathrm{content}(b) = (c_1, \ldots, c_n) \in \mathbb{N}^n$, where $c_j$ is the number of times the letter $j$ appears in $b$. The weight function $\mathrm{wt} : B_\lambda \to P_{\mathrm{fin}}$ is given by sending $b$ to the image of $\mathrm{content}(b)$ under the projection $\mathbb{Z}^n \to P_{\mathrm{fin}}$. The row-reading word of $b$ is defined by $\mathrm{word}(b) = \cdots w_2 w_1$, where $w_r$ is the word obtained by reading the $r$th row of $b$ from left to right. This definition is useful even in the context that $b$ is a skew tableau.

The edges of $B_\lambda$ are given as follows. First let $v$ be a word in the alphabet $\{1, 2, \ldots, n\}$. View each letter $i$ (resp. $i+1$) of $v$ as a closing (resp. opening) parenthesis, ignoring other letters. Now iterate the following step: declare each adjacent pair of matched parentheses to be invisible. Repeat this until there are no matching pairs of visible parentheses. At the end the result must be a sequence of closing parentheses (say $p$ of them) followed by a sequence of opening parentheses (say $q$ of them). The unmatched (visible) subword is of the form $i^p (i+1)^q$. If $p > 0$ (resp. $q > 0$) then $f_i(v)$ (resp. $e_i(v)$) is obtained from $v$ by replacing the unmatched subword $i^p (i+1)^q$ by $i^{p-1} (i+1)^{q+1}$ (resp. $i^{p+1} (i+1)^{q-1}$). Then $\phi_i(v) = p$, $\epsilon_i(v) = q$, and $f_i(v)$ (resp. $e_i(v)$) is defined if and only if $p > 0$ (resp. $q > 0$).

For the tableau $b \in B_\lambda$, let $f_i(b)$ be undefined if $f_i(\mathrm{word}(b))$ is; otherwise define $f_i(b)$ to be the unique (not necessarily column-strict) tableau of shape $\lambda$ such that $\mathrm{word}(f_i(b)) = f_i(\mathrm{word}(b))$. It is easy to verify that when defined, $f_i(b)$ is a column-strict tableau. Consequently $\phi_i(b) = \phi_i(\mathrm{word}(b))$. The operator $e_i$ and the quantity $\epsilon_i(b)$ are defined similarly.

3.3. $U_q(\widehat{sl}_n)$-crystal structure on rectangular tableaux. There is an inclusion of algebras $U_q(sl_n) \subset U_q(\widehat{sl}_n)$, where $U_q(\widehat{sl}_n)$ is the quantized universal enveloping algebra corresponding to the derived subalgebra $\widehat{sl}_n'$ of the affine Kac–Moody algebra $\widehat{sl}_n$ [15]. Let $I = \{0, 1, 2, \ldots, n-1\}$ be the index set for the Dynkin diagram of $A_{n-1}^{(1)}$. Let $P_{\mathrm{cl}}$ be the weight lattice of $\widehat{sl}_n'$, with (linearly dependent) simple roots $\{\alpha_i^{\mathrm{cl}} \mid i \in I\}$, simple coroots $\{h_i \mid i \in I\}$, and fundamental weights $\{\Lambda_i^{\mathrm{cl}} \mid i \in I\}$. The simple roots satisfy the relation $\alpha_0^{\mathrm{cl}} = -\sum_{i \in J} \alpha_i^{\mathrm{cl}}$. There is a natural projection $P_{\mathrm{cl}} \to P_{\mathrm{fin}}$ with kernel $\mathbb{Z}\Lambda_0^{\mathrm{cl}}$ such that $\Lambda_i^{\mathrm{cl}} \mapsto \bar\Lambda_i$ for $i \in J$ and $\Lambda_0^{\mathrm{cl}} \mapsto 0$. Let $\mathrm{cl} : P_{\mathrm{fin}} \to P_{\mathrm{cl}}$ be the section of the above projection defined by $\mathrm{cl}(\bar\Lambda_i) = \Lambda_i^{\mathrm{cl}} - \Lambda_0^{\mathrm{cl}}$ for $i \in J$. Let $c \in \widehat{sl}_n'$ be the canonical central element. The level of a weight $\Lambda \in P_{\mathrm{cl}}$ is defined by $\langle c, \Lambda \rangle$. Let $(P_{\mathrm{cl}}^+)_\ell = \{\Lambda \in P_{\mathrm{cl}}^+ \mid \langle c, \Lambda \rangle = \ell\}$.

Suppose $V$ is a finite-dimensional $U_q(\widehat{sl}_n)$-module that has a crystal graph $B$ (not all do); $B$ is a $P_{\mathrm{cl}}$-weighted $I$-crystal. A weight function $\mathrm{wt}_{\mathrm{cl}} : B \to P_{\mathrm{cl}}$ may be given by $\mathrm{wt}_{\mathrm{cl}}(b) = \mathrm{cl}(\mathrm{wt}(b))$, where $\mathrm{wt} : B \to P_{\mathrm{fin}}$ is the weight function on the set $B$ viewed as a $U_q(sl_n)$-crystal graph. In addition to being a $U_q(sl_n)$-crystal graph, $B$ also has some


edges colored 0. The action of $U_q(sl_n)$ on $V^\lambda$ extends to an action of $U_q(\widehat{sl}_n)$ which admits a crystal structure, if and only if the partition $\lambda$ is a rectangle [18, 30]. If $\lambda$ is the rectangle with $k$ rows and $m$ columns, then write $V^{k,m}$ for the $U_q(\widehat{sl}_n)$-module with $U_q(sl_n)$-structure $V^\lambda$ and denote its crystal graph by $B^{k,m}$. If one of $m$ or $k$ is 1, then it is easy to give $e_0$ and $f_0$ explicitly on $B^{k,m}$, for in this case the weight spaces of $V^{k,m}$ are one-dimensional, and the zero edges can be deduced from (3.1) [18].

The general case is given as follows [37]. We shall first define a content-rotating bijection $\psi^{-1} : B^{k,m} \to B^{k,m}$. Let $b \in B^{k,m}$ be a tableau, say of content $(c_1, c_2, \ldots, c_n)$. Then $\psi^{-1}(b)$ will have content $(c_2, c_3, \ldots, c_n, c_1)$. Remove all the letters 1 from $b$, leaving a vacant horizontal strip of size $c_1$ in the northwest corner of $b$. Compute Schensted’s P tableau [34] of the row-reading word of this skew subtableau. It can be shown that this yields a tableau of the shape obtained by removing $c_1$ cells from the last row of the rectangle $(m^k)$. Subtract one from the value of each entry of this tableau, and then fill in the $c_1$ vacant cells in the last row of the rectangle $(m^k)$ with the letter $n$. It can be shown that $\psi^{-1}$ is a well-defined bijection, whose inverse $\psi$ can be given by a similar algorithm. Then

$$f_i = \psi^{-1} \circ f_{i+1} \circ \psi, \qquad e_i = \psi^{-1} \circ e_{i+1} \circ \psi \tag{3.6}$$

for all $i$, where indices are taken modulo $n$; in particular for $i = 0$ this defines explicitly the operators $e_0$ and $f_0$.

3.4. Sequences of rectangular tableaux. For a sequence of rectangles $R$, consider the tensor product $V^{R_L} \otimes \cdots \otimes V^{R_1}$. Its $U_q(\widehat{sl}_n)$-crystal graph has underlying set $\mathcal{P}_R = B_{R_L} \otimes \cdots \otimes B_{R_1}$, where the tensor symbols denote the Cartesian product of sets. A typical element of $\mathcal{P}_R$ is called a path and is written $b = b_L \otimes \cdots \otimes b_2 \otimes b_1$, where $b_j \in B_{R_j}$ is a tableau of shape $R_j$. The edges of the crystal graph $\mathcal{P}_R$ are given explicitly as follows. Define the word of a path $b$ by $\mathrm{word}(b) = \mathrm{word}(b_L) \cdots \mathrm{word}(b_2)\,\mathrm{word}(b_1)$. Then for $i = 1, 2, \ldots, n-1$ (as in the definition of $f_i$ for $b \in B_\lambda$), if $f_i(\mathrm{word}(b))$ is undefined, let $f_i(b)$ be undefined; otherwise it is not hard to see that there is a unique path $f_i(b) \in \mathcal{P}_R$ such that $\mathrm{word}(f_i(b)) = f_i(\mathrm{word}(b))$. To define $f_0$, let $\psi(b) = \psi(b_L) \otimes \cdots \otimes \psi(b_1)$ and $f_0 = \psi^{-1} \circ f_1 \circ \psi$. This definition is equivalent to that given by taking the above definition of $f_i$ on the crystals $B_{R_j}$ and then applying the rule for lowering operators on tensor products (3.3). The action of $e_i$ for $i \in I$ is defined analogously.

3.5. Integrable affine crystals. Consider the affine Kac–Moody algebra $\widehat{sl}_n$, with weight lattice $P_{\mathrm{af}}$, independent simple roots $\{\alpha_i \mid i \in I\}$, simple coroots $\{h_i \mid i \in I\}$, and fundamental weights $\{\Lambda_i \mid i \in I\}$. Let $\delta \in P_{\mathrm{af}}$ be the null root. There is a natural projection which we shall by abuse of notation also call $\mathrm{cl} : P_{\mathrm{af}} \to P_{\mathrm{cl}}$ such that $\mathrm{cl}(\delta) = 0$ and $\mathrm{cl}(\Lambda_i) = \Lambda_i^{\mathrm{cl}}$ for $i \in I$. Write $\mathrm{af} : P_{\mathrm{cl}} \to P_{\mathrm{af}}$ for the section of $\mathrm{cl}$ given by $\mathrm{af}(\Lambda_i^{\mathrm{cl}}) = \Lambda_i$ for $i \in I$.
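The content-rotating map $\psi^{-1}$ of Sect. 3.3 is described as an algorithm, so a short Python sketch may help. It is a minimal illustration under stated assumptions, not the authors' implementation: a tableau is a list of rows, Schensted's P tableau is computed by ordinary row insertion of the word read left to right, and the helper names `row_insert`, `p_tableau` and `psi_inverse` are hypothetical.

```python
from bisect import bisect_right
from typing import List

Tableau = List[List[int]]


def row_insert(P: Tableau, x: int) -> None:
    """Schensted row insertion of the letter x into the tableau P (in place)."""
    for row in P:
        j = bisect_right(row, x)      # position of the first entry strictly larger than x
        if j == len(row):
            row.append(x)             # x fits at the end of this row
            return
        row[j], x = x, row[j]         # x replaces that entry, which is bumped downwards
    P.append([x])


def p_tableau(word: List[int]) -> Tableau:
    """P tableau of a word, inserting its letters from left to right."""
    P: Tableau = []
    for x in word:
        row_insert(P, x)
    return P


def psi_inverse(b: Tableau, n: int) -> Tableau:
    """Sketch of psi^{-1} on a rectangular tableau b with k rows over {1, ..., n}."""
    k = len(b)
    c1 = sum(row.count(1) for row in b)
    # Row-reading word of b with the letters 1 removed: last row first.
    word = [x for row in reversed(b) for x in row if x != 1]
    P = [[x - 1 for x in row] for row in p_tableau(word)]
    P += [[] for _ in range(k - len(P))]   # pad with empty rows up to k rows
    P[-1] = P[-1] + [n] * c1               # fill the vacant cells of the last row with n
    return P
```

For instance, with $n = 3$ the tableau with rows 11 and 23 has content $(2, 1, 1)$ and is sent to the tableau with rows 12 and 33, of content $(1, 1, 2)$. Applying the map $n = 3$ times returns the original tableau, in line with the statement in Sect. 3.8 that $\psi$ has order $n$.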


Let $\Lambda \in P_{\mathrm{cl}}^+$ be a dominant integral weight and $B(\Lambda)$ the crystal graph of the irreducible integrable highest weight $U_q(\widehat{sl}_n)$-module of highest weight $\Lambda$. If $\Lambda \ne 0$ then $B(\Lambda)$ is infinite. The set of weights in $P_{\mathrm{af}}$ that project by $\mathrm{cl}$ to $\Lambda$ is given by $\mathrm{cl}^{-1}(\Lambda) = \{\mathrm{af}(\Lambda) + j\delta \mid j \in \mathbb{Z}\}$. Now fix $j$. The irreducible integrable highest weight $U_q(\widehat{sl}_n)$-crystal graph $B(\mathrm{af}(\Lambda) + j\delta)$ may be identified with $B(\Lambda)$ as sets and as $I$-crystals (independent of $j$). The weight functions for $B(\mathrm{af}(\Lambda) + j\delta)$ and $B(\mathrm{af}(\Lambda))$ differ by the global constant $j\delta$. The weight function $B(\Lambda) \to P_{\mathrm{cl}}$ is obtained by composing the weight function for $B(\mathrm{af}(\Lambda) + j\delta)$ with the projection $\mathrm{cl} : P_{\mathrm{af}} \to P_{\mathrm{cl}}$. The set $B(\Lambda)$ is then endowed with an induced $\mathbb{Z}$-grading $E : B(\Lambda) \to \mathbb{N}$ defined by $E(b) = -\langle d, \mathrm{wt}(b) \rangle$, where $B(\Lambda)$ is identified with $B(\mathrm{af}(\Lambda))$, $\mathrm{wt} : B(\mathrm{af}(\Lambda)) \to P_{\mathrm{af}}$ is the weight function and $d \in P_{\mathrm{af}}^*$ is the degree generator. The map $\langle d, \cdot \rangle$ takes the coefficient of the element $\delta$ of an element in $P_{\mathrm{af}}$ when written in the basis $\{\Lambda_i \mid i \in I\} \cup \{\delta\}$.

3.6. Energy function on finite paths. The set of paths $\mathcal{P}_R$ has a natural statistic called the energy function. The definitions here follow [30].

Consider first the case that $R = (R_1, R_2)$ is a sequence of two rectangles. Let $B_j = B_{R_j}$ for $1 \le j \le 2$. Since $B_2 \otimes B_1$ is a connected crystal graph, there is a unique $U_q(\widehat{sl}_n)$-crystal graph isomorphism

$$\sigma : B_2 \otimes B_1 \cong B_1 \otimes B_2. \tag{3.7}$$

This is called the local isomorphism (see Sect. 4.4 for an explicit construction). Write $\sigma(b_2 \otimes b_1) = b_1' \otimes b_2'$. Then there is a unique (up to a global additive constant) map $H : B_2 \otimes B_1 \to \mathbb{Z}$ such that

$$H(e_i(b_2 \otimes b_1)) = H(b_2 \otimes b_1) + \begin{cases} -1 & \text{if } i = 0,\ e_0(b_2 \otimes b_1) = e_0 b_2 \otimes b_1 \text{ and } e_0(b_1' \otimes b_2') = e_0 b_1' \otimes b_2', \\ 1 & \text{if } i = 0,\ e_0(b_2 \otimes b_1) = b_2 \otimes e_0 b_1 \text{ and } e_0(b_1' \otimes b_2') = b_1' \otimes e_0 b_2', \\ 0 & \text{otherwise.} \end{cases} \tag{3.8}$$

This map is called the local energy function. By definition it is invariant under the local isomorphism and under $f_i$ and $e_i$ for $i \in J$. Let us normalize it by the condition that $H(u_2 \otimes u_1) = |R_1 \cap R_2|$, where $u_j$ is the $U_q(sl_n)$ highest weight vector of $B_j$ for $1 \le j \le 2$, $R_1 \cap R_2$ is the intersection of the Ferrers diagrams of $R_1$ and $R_2$, and $|R_1 \cap R_2|$ is the number of cells in this intersection. Explicitly $|R_1 \cap R_2| = \min\{\eta_1, \eta_2\} \min\{\mu_1, \mu_2\}$. If $\eta_1 + \eta_2 \le n$ then the local energy function attains precisely the values from 0 to $|R_1 \cap R_2|$.

Now let $R = (R_1, \ldots, R_L)$ be a sequence of rectangles and $b = b_L \otimes \cdots \otimes b_1 \in \mathcal{P}_R$. For $1 \le p \le L-1$ let $\sigma_p$ denote the local isomorphism that exchanges the tensor factors in the $p$th and $(p+1)$th positions. For $1 \le i < j \le L$, let $b_j^{(i+1)}$ be the $(i+1)$th tensor factor in $\sigma_{i+1} \sigma_{i+2} \cdots \sigma_{j-1}(b)$. Then define the energy function

$$E(b) = \sum_{1 \le i < j \le L} H\bigl(b_j^{(i+1)} \otimes b_i\bigr). \tag{3.9}$$

The value of the energy function is unchanged under local isomorphisms and under $e_i$ and $f_i$ for $i \in J$, since the local energy function has this property. The next lemma follows from the definition of the local energy function.


Lemma 3.2. Suppose $b = b_L \otimes \cdots \otimes b_1 \in \mathcal{P}_R$ is such that $e_0(b)$ is defined and for any image $b' = b_L' \otimes \cdots \otimes b_1'$ of $b$ under a composition of local isomorphisms, $e_0(b') = b_L' \otimes \cdots \otimes b_{j+1}' \otimes e_0(b_j') \otimes b_{j-1}' \otimes \cdots \otimes b_1'$, where $j \ne 1$. Then $E(e_0(b)) = E(b) - 1$.

If all rectangles $R_j$ are the same then each of the local isomorphisms is the identity and

$$E(b) = \sum_{1 \le i \le L-1} (L - i)\, H(b_{i+1} \otimes b_i). \tag{3.10}$$

Say that $b \in \mathcal{P}_R$ is classically restricted if it is an $sl_n$-highest weight vector, that is, $\epsilon_i(b) = 0$ for all $i \in J$. Equivalently, $\mathrm{word}(b)$ is a (reverse) lattice permutation (every final subword has partition content). Let $\mathcal{P}_{\Lambda R}$ be the set of classically restricted paths in $\mathcal{P}_R$ of weight $\Lambda \in P_{\mathrm{cl}}$. It was shown in [37] that the generalized Kostka polynomial (which was originally defined in terms of Littlewood–Richardson tableaux; see (4.3)) can be expressed as

$$K_{\lambda R}(q) = \sum_{b \in \mathcal{P}_{\mathrm{cl}(\lambda) R}} q^{E(b)}. \tag{3.11}$$

This extends the path formulation of the Kostka polynomial by Nakayashiki and Yamada [30].

3.7. Level-restricted paths. Let $B$ be any $P_{\mathrm{cl}}$-weighted $I$-crystal and $\Lambda \in P_{\mathrm{cl}}^+$. Say that $b \in B$ is $\Lambda$-restricted if $b \otimes u_\Lambda$ is a highest weight vector in the $P_{\mathrm{cl}}$-weighted $I$-crystal $B \otimes B(\Lambda)$, that is, $\epsilon_i(b \otimes u_\Lambda) = 0$ for all $i \in I$. Equivalently $\epsilon_i(b) \le \langle h_i, \Lambda \rangle$ for all $i \in I$, by (3.5) and (3.2). Denote by $\mathcal{H}(\Lambda, B)$ the set of elements $b \in B$ that are $\Lambda$-restricted. If $\Lambda' \in P_{\mathrm{cl}}^+$ has the same level as $\Lambda$, define $\mathcal{H}(\Lambda, B, \Lambda')$ to be the set of $b \in \mathcal{H}(\Lambda, B)$ such that $\mathrm{wt}(b) = \Lambda' - \Lambda \in P_{\mathrm{cl}}$, that is, the set of $b \in B$ such that $b \otimes u_\Lambda$ is a highest weight vector of weight $\Lambda'$.

Say that the element $b$ is restricted of level $\ell$ if it is $(\ell\Lambda_0)$-restricted. Such paths are also classically restricted since $\langle h_i, \ell\Lambda_0 \rangle = 0$ for $i \in J$. Let $\mathcal{P}^\ell_{\Lambda R}$ denote the set of paths in $\mathcal{P}_{\Lambda R}$ that are restricted of level $\ell$. Letting $B = \mathcal{P}_R$, this is the same as saying $\mathcal{P}^\ell_{\Lambda R} = \mathcal{H}(\ell\Lambda_0, B, \Lambda + \ell\Lambda_0)$. Define the level-restricted generalized Kostka polynomial by

$$K^\ell_{\lambda R}(q) = \sum_{b \in \mathcal{P}^\ell_{\mathrm{cl}(\lambda) R}} q^{E(b)}. \tag{3.12}$$
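The classical part of these definitions can be made concrete with the parenthesis rule of Sect. 3.2 applied to the word of a path (Sect. 3.4). The Python sketch below is an illustration only: words are lists of letters in $\{1, \ldots, n\}$, a path is listed as $[b_L, \ldots, b_1]$ with each tableau given by its rows, and all function names are hypothetical. It covers only the colors $i \in J$; the color 0, which is needed for level-restriction, would additionally require the map $\psi$ and is not treated here.

```python
from typing import List, Optional, Tuple


def unmatched(word: List[int], i: int) -> Tuple[List[int], List[int]]:
    """Positions of the unmatched letters i (closing) and i+1 (opening) in the word."""
    openers: List[int] = []   # positions of currently unmatched letters i+1
    closers: List[int] = []   # positions of unmatched letters i
    for pos, x in enumerate(word):
        if x == i + 1:
            openers.append(pos)
        elif x == i:
            if openers:
                openers.pop()         # this i closes the nearest open i+1 to its left
            else:
                closers.append(pos)
    return closers, openers


def phi(word: List[int], i: int) -> int:
    return len(unmatched(word, i)[0])


def eps(word: List[int], i: int) -> int:
    return len(unmatched(word, i)[1])


def f(word: List[int], i: int) -> Optional[List[int]]:
    """i^p (i+1)^q -> i^(p-1) (i+1)^(q+1): change the rightmost unmatched i."""
    closers, _ = unmatched(word, i)
    if not closers:
        return None
    out = list(word)
    out[closers[-1]] = i + 1
    return out


def e(word: List[int], i: int) -> Optional[List[int]]:
    """i^p (i+1)^q -> i^(p+1) (i+1)^(q-1): change the leftmost unmatched i+1."""
    _, openers = unmatched(word, i)
    if not openers:
        return None
    out = list(word)
    out[openers[0]] = i
    return out


def word_of_tableau(b: List[List[int]]) -> List[int]:
    """Row-reading word ... w2 w1: last row first, each row left to right."""
    return [x for row in reversed(b) for x in row]


def word_of_path(path: List[List[List[int]]]) -> List[int]:
    """word(b_L) ... word(b_1) for a path given as the list [b_L, ..., b_1]."""
    return [x for b in path for x in word_of_tableau(b)]


def is_reverse_lattice(word: List[int], n: int) -> bool:
    """Every final subword contains at least as many letters j as j+1."""
    counts = [0] * (n + 2)
    for x in reversed(word):
        counts[x] += 1
        if any(counts[j] < counts[j + 1] for j in range(1, n)):
            return False
    return True


def is_classically_restricted(word: List[int], n: int) -> bool:
    return all(eps(word, i) == 0 for i in range(1, n))
```

On short words the two tests `is_reverse_lattice` and `is_classically_restricted` agree, which is the equivalence stated in Sect. 3.6.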

3.8. Perfect crystals. This section is needed to compute the coset branching functions in Sect. 7. We follow [19], stating the definitions in the case of $\widehat{sl}_n$. For any $U_q(\widehat{sl}_n)$-crystal $B$, define $\epsilon, \phi : B \to P_{\mathrm{cl}}$ by $\epsilon(b) = \sum_{i \in I} \epsilon_i(b) \Lambda_i$ and $\phi(b) = \sum_{i \in I} \phi_i(b) \Lambda_i$. Now let $\ell$ be a positive integer and $B$ the crystal graph of a finite dimensional irreducible $U_q(\widehat{sl}_n)$-module $V$. Say that $B$ is perfect of level $\ell$ if

(1) $B \otimes B$ is connected.
(2) There is a weight $\Lambda' \in P_{\mathrm{cl}}$ such that $B$ has a unique vector of weight $\Lambda'$ and all other vectors in $B$ have lower weight in the Chevalley order, that is, $\mathrm{wt}(B) \subset \Lambda' - \sum_{i \in J} \mathbb{N} \alpha_i$.


(3) $\ell = \min_{b \in B} \langle c, \epsilon(b) \rangle$.
(4) The maps $\epsilon$ and $\phi$ restrict to bijections $B_{\min} \to (P_{\mathrm{cl}}^+)_\ell$, where $B_{\min} \subset B$ is the set of $b \in B$ achieving the minimum in (3).

For $\widehat{sl}_n$ the perfect crystals of level $\ell$ are precisely those of the form $B^{k,\ell}$ for $1 \le k \le n-1$ [18, 30]. Let $B = B^{k,\ell}$. The weight $\Lambda'$ can be taken to be $\ell(\Lambda_k^{\mathrm{cl}} - \Lambda_0^{\mathrm{cl}})$.

Example 3.3. We describe the bijections $\epsilon, \phi : B_{\min} \to (P_{\mathrm{cl}}^+)_\ell$ in this example. Let $B = B^{k,\ell}$. For this example let $n = 6$, $k = 3$, $\ell = 5$, and consider the weight $\Lambda = 2\Lambda_0 + \Lambda_1 + \Lambda_2 + \Lambda_4$. As usual subscripts are identified modulo $n$. The unique tableau $b \in B^{k,\ell}$ such that $\phi(b) = \Lambda$ is constructed as follows. First let $T$ be the following tableau of shape $(\ell^k)$. Its bottom row contains $\langle h_i, \Lambda \rangle$ copies of the letter $i$ for $1 \le i \le n$ (here it is 12466 since the sequence of $\langle h_i, \Lambda \rangle$ for $1 \le i \le 6$ is $(1, 1, 0, 1, 0, 2)$). Let every letter in $T$ have value one smaller than the letter directly below it. Here we have

$$T = \begin{array}{rrrrr} -1 & 0 & 2 & 4 & 4 \\ 0 & 1 & 3 & 5 & 5 \\ 1 & 2 & 4 & 6 & 6 \end{array}$$

Let $T_-$ be the subtableau of $T$ consisting of the entries that are nonpositive and $T_+$ the rest. Say $T_-$ has shape $\nu$ (here $\nu = (2, 1)$). Let $\tilde\nu = (\ell^k) - (\nu_k, \nu_{k-1}, \ldots, \nu_1)$ (here $\tilde\nu = (5, 4, 3)$). The desired tableau $b$ is defined as follows. The restriction of $b$ to the shape $\tilde\nu$ is $P(T_+)$, or equivalently, the tableau obtained by taking the skew tableau $T_+$ and first pushing all letters straight upwards to the top of the bounding rectangle $(\ell^k)$, and then pushing all letters straight to the left inside $(\ell^k)$. The restriction of $b$ to $(\ell^k)/\tilde\nu$ is the tableau of that skew shape in the alphabet $\{1, 2, \ldots, n\}$ with maximal entries, that is, its bottom row is filled with the letter $n$, the next-to-bottom row is filled with the letter $n-1$, etc. In the example,

$$b = \begin{array}{rrrrr} 1 & 1 & 2 & 4 & 4 \\ 2 & 3 & 5 & 5 & 5 \\ 4 & 6 & 6 & 6 & 6 \end{array}$$

To construct the unique element $b' \in B^{k,\ell}$ such that $\epsilon(b') = \Lambda$, let $U$ be the tableau whose first row has $\langle h_i, \Lambda \rangle$ copies of the letter $i+1$ for $1 \le i \le n$, again identifying subscripts modulo $n$; here $U$ has first row 11235. Now let the rest of $U$ be defined by letting each entry have value one greater than the entry above it. So

$$U = \begin{array}{rrrrr} 1 & 1 & 2 & 3 & 5 \\ 2 & 2 & 3 & 4 & 6 \\ 3 & 3 & 4 & 5 & 7 \end{array}$$

Let $U_-$ be the subtableau of $U$ consisting of the values that are at most $n$. Let $\mu$ be the shape of $U_-$ and $\tilde\mu = (\ell^k) - (\mu_k, \mu_{k-1}, \ldots, \mu_1)$. Here $\mu = (5, 5, 4)$ and $\tilde\mu = (1, 0, 0)$. The element $b'$ is defined as follows. Its restriction to the skew shape $(\ell^k)/\tilde\mu$ is the unique skew tableau $V$ of that shape such that $P(V) = U_-$, or equivalently, this restriction is obtained by taking the tableau $U_-$, pushing all letters directly down within the rectangle $(\ell^k)$ and then pushing all letters to the right within $(\ell^k)$. The restriction of $b'$ to the shape $\tilde\mu$ is filled with the smallest letters possible, so that the first row of this subtableau consists of ones, the second row consists of twos, etc. Here

$$b' = \begin{array}{rrrrr} 1 & 1 & 1 & 2 & 3 \\ 2 & 2 & 3 & 4 & 5 \\ 3 & 3 & 4 & 5 & 6 \end{array}$$

The main theorem for perfect crystals is:

Theorem 3.4 ([19]). Let $B$ be a perfect crystal of level $\ell'$ and $\Lambda \in (P_{\mathrm{cl}}^+)_\ell$ with $\ell \ge \ell'$. Then there is an isomorphism of $U_q(\widehat{sl}_n)$-crystals

$$B \otimes B(\Lambda) \cong \bigoplus_{b \in \mathcal{H}(\Lambda, B)} B(\Lambda + \mathrm{wt}(b)). \tag{3.13}$$

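Suppose now that $B$ is perfect of level $\ell$ and $\Lambda \in (P_{\mathrm{cl}}^+)_\ell$. Write $b(\Lambda)$ for the unique element of $B$ such that $\phi(b(\Lambda)) = \Lambda$. Theorem 3.4 (with $\Lambda$ therein replaced by $\Lambda' = \epsilon(b(\Lambda))$) says that $B \otimes B(\epsilon(b(\Lambda))) \cong B(\Lambda)$ with corresponding highest weight vectors $b(\Lambda) \otimes u_{\epsilon(b(\Lambda))} \mapsto u_\Lambda$. This isomorphism can be iterated. Let $\sigma : B_{\min} \to B_{\min}$ be the unique bijection defined by $\phi \circ \sigma = \epsilon$. Then there are isomorphisms $B^{\otimes N} \otimes B(\phi(\sigma^N(b(\Lambda)))) \cong B(\Lambda)$ such that the highest weight vector of the left-hand side is given by $b(\Lambda) \otimes \sigma(b(\Lambda)) \otimes \sigma^2(b(\Lambda)) \otimes \cdots \otimes \sigma^{N-1}(b(\Lambda)) \otimes u_{\phi(\sigma^N(b(\Lambda)))}$. For the $U_q(\widehat{sl}_n)$ perfect crystals $B^{k,\ell}$, it can be shown that the map $\sigma$ is none other than the power $\psi^{-k}$ of the content rotating map $\psi$. Moreover if $\sigma$ is extended to a bijection $\sigma : B^{k,\ell} \to B^{k,\ell}$ by defining $\sigma = \psi^{-k}$, then the extended function also satisfies $\phi(\sigma(b)) = \epsilon(b)$ for all $b \in B^{k,\ell}$, not just for $b \in B_{\min}$. Since the bijection $\psi$ on $B^{k,\ell}$ has order $n$, the bijection $\sigma$ has order $n/\gcd(n, k)$.

The ground state path for the pair $(\Lambda, B)$ is by definition the infinite periodic sequence $\bar b = \bar b_1 \otimes \bar b_2 \otimes \cdots$, where $\bar b_i = \sigma^{i-1}(b(\Lambda))$. Let $\mathcal{P}(\Lambda, B)$ be the set of all semi-infinite sequences $b = b_1 \otimes b_2 \otimes \cdots$ of elements in $B$ such that $b$ eventually agrees with the ground state path $\bar b$ for $(\Lambda, B)$. Then the set $\mathcal{P}(\Lambda, B)$ has the structure of the crystal $B(\Lambda)$ with highest weight vector $u_\Lambda = \bar b$ and weight function $\mathrm{wt}(b) = \sum_{i \ge 1} (\mathrm{wt}(b_i) - \mathrm{wt}(\bar b_i))$. To recover the weight function of the $U_q(\widehat{sl}_n)$-crystal $B(\mathrm{af}(\Lambda))$, define the energy function on $\mathcal{P}(\Lambda, B)$ by

$$E(b) = \sum_{i \ge 1} i \bigl( H(b_i \otimes b_{i+1}) - H(\bar b_i \otimes \bar b_{i+1}) \bigr) \tag{3.14}$$

and define the map $B(\mathrm{af}(\Lambda)) \to P_{\mathrm{af}}$ by $b \mapsto \mathrm{wt}(b) - E(b)\delta$, where $\mathrm{wt} : B(\Lambda) \to P_{\mathrm{cl}}$.

$\mathcal{P}(\Lambda, B)$ can be regarded as a direct limit of the finite crystals $B^{\otimes N}$. Define the embedding $i_N : B^{\otimes N} \to \mathcal{P}(\Lambda, B)$ by $b_1 \otimes \cdots \otimes b_N \mapsto b_1 \otimes b_2 \otimes \cdots \otimes b_N \otimes \bar b_{N+1} \otimes \bar b_{N+2} \otimes \cdots$. Define $E_N : B^{\otimes N} \to \mathbb{Z}$ by $E_N(b_1 \otimes \cdots \otimes b_N) = E(b_1 \otimes \cdots \otimes b_N \otimes \bar b_{N+1})$, where the $E$ on the right-hand side is the energy function for the finite path space $B^{\otimes N+1}$. By definition, for all $p = b_1 \otimes \cdots \otimes b_N \in B^{\otimes N}$, $E(i_N(p)) = E_N(p) - E_N(\bar b_1 \otimes \cdots \otimes \bar b_N)$. Note that the last fixed step $\bar b_{N+1}$ is necessary to make the energy function on the finite paths stable under the embeddings into $\mathcal{P}(\Lambda, B)$.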
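The first steps of Example 3.3 are mechanical enough to sketch in Python. The code below is an illustration under stated assumptions rather than the authors' procedure: the weight is passed as the list of coefficients $\langle h_i, \Lambda \rangle$ for $i = 1, \ldots, n$ (with index $n$ identified with 0), tableaux are lists of rows, and the helper names `build_T`, `build_U` and `b_from_T` are hypothetical. Only the "push up, then push left" description of $b$ is implemented; the analogous construction of $b'$ from $U$ is omitted.

```python
from typing import List

Tableau = List[List[int]]


def build_T(h: List[int], k: int) -> Tableau:
    """Bottom row has h[i-1] copies of i; each entry is one less than the entry below it."""
    bottom = [i for i, mult in enumerate(h, start=1) for _ in range(mult)]
    return [[x - (k - 1 - r) for x in bottom] for r in range(k)]


def build_U(h: List[int], k: int) -> Tableau:
    """First row has h[i-1] copies of i+1 (taken modulo n); each entry is one more than the entry above it."""
    n = len(h)
    top = sorted((i % n) + 1 for i, mult in enumerate(h, start=1) for _ in range(mult))
    return [[x + r for x in top] for r in range(k)]


def b_from_T(T: Tableau, n: int) -> Tableau:
    """Keep the positive entries of T, push them up and then left, and fill the rest maximally."""
    k, width = len(T), len(T[0])
    # Pushing up: the positive entries of each column, in top-to-bottom order.
    cols = [[T[r][c] for r in range(k) if T[r][c] > 0] for c in range(width)]
    # Pushing left: row r collects the r-th surviving entry of every column that has one.
    rows = [[col[r] for col in cols if len(col) > r] for r in range(k)]
    # Remaining cells: bottom row gets n, the row above gets n-1, and so on.
    return [row + [n - (k - 1 - r)] * (width - len(row)) for r, row in enumerate(rows)]
```

With `h = [1, 1, 0, 1, 0, 2]` and `k = 3` (the data of Example 3.3, where $n = 6$), `build_T` and `build_U` return the tableaux $T$ and $U$ displayed there, and `b_from_T(build_T(h, 3), 6)` returns the tableau $b$ with rows 11244, 23555 and 46666.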


3.9. Standardization embeddings. We require certain embeddings of finite path spaces. Given a sequence of rectangles $R$, let $r(R)$ denote the sequence of rectangles given by splitting the rectangles of $R$ into their constituent rows. For example, if $R = ((1), (2, 2))$, then $r(R) = ((1), (2), (2))$. There is a unique embedding

$$i_R : \mathcal{P}_R \hookrightarrow \mathcal{P}_{r(R)} \tag{3.15}$$

defined as follows. Its explicit computation is based on transforming $R$ into $r(R)$ using two kinds of steps.

(1) Suppose $R_1$ has more than one row ($\eta_1 > 1$). Then use the transformation $R \to R^< = ((\mu_1), (\mu_1^{\eta_1 - 1}), R_2, R_3, \ldots, R_L)$. Informally, $R^<$ is obtained from $R$ by splitting off the first row of $R_1$. There is an associated embedding of $U_q(\widehat{sl}_n)$-crystal graphs $i_R^< : \mathcal{P}_R \to \mathcal{P}_{R^<}$ defined by the property that $\mathrm{word}(i_R^<(b)) = \mathrm{word}(b)$ for all $b \in \mathcal{P}_R$. Here it is crucial that the rectangle being split horizontally is the first one, for otherwise the embedding does not preserve the edges labeled by 0.
(2) If $\eta_1 = 1$, then use a transformation of the form $R \to s_p R$ for some $p$. Here $s_p R$ denotes the sequence of rectangles obtained by exchanging the $p$th and $(p+1)$th rectangles in $R$. The associated isomorphism of $U_q(\widehat{sl}_n)$-crystal graphs is the local isomorphism $\sigma_p : \mathcal{P}_R \to \mathcal{P}_{s_p R}$ defined before.

It is clear that one can transform $R$ into $r(R)$ using these two kinds of steps. Now fix one such sequence of steps leading from $R$ to $r(R)$, say $R = R^{(0)} \to R^{(1)} \to \cdots \to R^{(N)} = r(R)$, where each $R^{(m)}$ is a sequence of rectangles and each step $R^{(m-1)} \to R^{(m)}$ is one of the two types defined above. Define the map $i^{(m)} : \mathcal{P}_{R^{(m-1)}} \hookrightarrow \mathcal{P}_{R^{(m)}}$ by $i^{(m)} = i_R^<$ [...]

[...]

$$|\nu^{(k)}| = \sum_{j > k} \lambda_j - \sum_{a=1}^{L} \mu_a \max\{\eta_a - k, 0\} \tag{5.1}$$

for $k \ge 0$, where by convention $\nu^{(0)}$ is the empty partition. If $\lambda$ has at most $n$ parts, all partitions $\nu^{(k)}$ for $k \ge n$ are empty. For a partition $\rho$, define $m_i(\rho)$ to be the number of parts equal to $i$ and

$$Q_i(\rho) = \rho_1^t + \rho_2^t + \cdots + \rho_i^t = \sum_{j \ge 1} \min\{i, \rho_j\},$$

the size of the first $i$ columns of $\rho$. Let $\xi^{(k)}(R)$ be the partition whose parts are the widths of the rectangles in $R$ of height $k$. The vacancy numbers for the $(\lambda; R)$-configuration $\nu$ are the numbers (indexed by $k \ge 1$ and $i \ge 0$) defined by

$$P_i^{(k)}(\nu) = Q_i\bigl(\nu^{(k-1)}\bigr) - 2 Q_i\bigl(\nu^{(k)}\bigr) + Q_i\bigl(\nu^{(k+1)}\bigr) + Q_i\bigl(\xi^{(k)}(R)\bigr). \tag{5.2}$$

In particular $P_0^{(k)}(\nu) = 0$ for all $k \ge 1$. The $(\lambda; R)$-configuration $\nu$ is said to be admissible if $P_i^{(k)}(\nu) \ge 0$ for all $k, i \ge 1$, and the set of admissible $(\lambda; R)$-configurations is denoted by $C(\lambda; R)$. Following [26, (3.2)], set

$$cc(\nu) = \sum_{k, i \ge 1} \alpha_i^{(k)} \bigl( \alpha_i^{(k)} - \alpha_i^{(k+1)} \bigr),$$


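where $\alpha_i^{(k)}$ is the size of the $i$th column in $\nu^{(k)}$. Define the charge $c(\nu)$ of a configuration $\nu \in C(\lambda; R)$ by $c(\nu) = \|R\| - cc(\nu) - |P|$ with

$$\|R\| = \sum_{1 \le i < j \le L} |R_i \cap R_j| \qquad \text{and} \qquad |P| = \sum_{k, i \ge 1} m_i\bigl(\nu^{(k)}\bigr) P_i^{(k)}(\nu).$$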

Observe that c(ν) depends on both ν and R but cc(ν) depends only on ν. Example 5.1. Let λ = (3, 2, 2, 1) and R = ((2), (2, 2), (1, 1)). Then ν = ((2), (2, 1), (1)) is a (λ; R)-configuration with ξ (1) (R) = (2) and ξ (2) (R) = (2, 1). The configuration ν may be represented as 0

1

0

0

where the vacancy numbers are indicated to the left of each part. In addition cc(ν) = 3, !R! = 5, |P | = 1 and c(ν) = 1. Define the q-binomial by

(q)m+p m+p = (q)m (q)p m

for m, p ∈ N and zero otherwise, where $(q)_m = (1 - q)(1 - q^2) \cdots (1 - q^m)$. The following fermionic or quasi-particle expression of the generalized Kostka polynomials is a variant of [25, Theorem 2.10].

Theorem 5.2. For λ a partition and R a sequence of rectangles,

\[ K_{\lambda R}(q) = \sum_{\nu \in C(\lambda; R)} q^{c(\nu)} \prod_{k, i \ge 1} \begin{bmatrix} P_i^{(k)}(\nu) + m_i(\nu^{(k)}) \\ m_i(\nu^{(k)}) \end{bmatrix}. \tag{5.3} \]
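The quantities quoted in Example 5.1 can be recomputed directly from (5.2) and the definitions of cc(ν), ‖R‖ and |P|. The following is a minimal sketch, not part of the original paper; the function names and the encoding of a rectangle as a list of equal rows (so `[2, 2]` is the 2 × 2 rectangle) are assumptions made for illustration.

```python
def Q(i, rho):
    """Q_i(rho): number of cells in the first i columns of the partition rho."""
    return sum(min(i, part) for part in rho)


def column_sizes(rho):
    """Column lengths of rho (the parts of the transpose partition)."""
    width = rho[0] if rho else 0
    return [sum(1 for part in rho if part >= i) for i in range(1, width + 1)]


def nu_at(nu, k):
    """nu^(k), with nu^(0) and every nu^(k) beyond the list taken to be empty."""
    return nu[k - 1] if 1 <= k <= len(nu) else []


def vacancy(nu, R, k, i):
    """Vacancy number P_i^(k)(nu) as in (5.2)."""
    xi_k = [rect[0] for rect in R if len(rect) == k]  # widths of height-k rectangles
    return (Q(i, nu_at(nu, k - 1)) - 2 * Q(i, nu_at(nu, k))
            + Q(i, nu_at(nu, k + 1)) + Q(i, xi_k))


def cocharge(nu):
    """cc(nu) = sum over k, i of alpha_i^(k) * (alpha_i^(k) - alpha_i^(k+1))."""
    total = 0
    for k in range(1, len(nu) + 1):
        here = column_sizes(nu_at(nu, k))
        below = column_sizes(nu_at(nu, k + 1))
        for i, a in enumerate(here):
            total += a * (a - (below[i] if i < len(below) else 0))
    return total


def norm_R(R):
    """||R||: sum over pairs i < j of the number of cells in R_i intersect R_j."""
    return sum(min(len(A), len(B)) * min(A[0], B[0])
               for x, A in enumerate(R) for B in R[x + 1:])


def weighted_vacancy(nu, R):
    """|P|: sum of m_i(nu^(k)) P_i^(k)(nu), written as a sum over all parts."""
    return sum(vacancy(nu, R, k, part)
               for k in range(1, len(nu) + 1) for part in nu_at(nu, k))


def charge(nu, R):
    """c(nu) = ||R|| - cc(nu) - |P|."""
    return norm_R(R) - cocharge(nu) - weighted_vacancy(nu, R)


# Data of Example 5.1.
nu = [[2], [2, 1], [1]]
R = [[2], [2, 2], [1, 1]]
summary = (cocharge(nu), norm_R(R), weighted_vacancy(nu, R), charge(nu, R))
print(summary)
```

With the data of Example 5.1 the tuple `summary` reproduces the values cc(ν) = 3, ‖R‖ = 5, |P| = 1 and c(ν) = 1 stated in the text.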

Expression (5.3) can be reformulated as the generating function over rigged configurations. To this end we need to define certain labelings of the rows of the partitions in a configuration. For this purpose one should view a partition as a multiset of positive integers. A rigged partition is by definition a finite multiset of pairs (i, x), where i is a positive integer and x is a nonnegative integer. The pairs (i, x) are referred to as strings; i is referred to as the length of the string and x as the label or quantum number of the string. A rigged partition is said to be a rigging of the partition ρ if the multiset consisting of the lengths of the strings is the partition ρ. So a rigging of ρ is a labeling of the parts of ρ by nonnegative integers, where one identifies labelings that differ only by permuting labels among equal-sized parts of ρ. A rigging J of the (λ; R)-configuration ν is a sequence of riggings of the partitions $\nu^{(k)}$ such that for every part of $\nu^{(k)}$ of length i and label x,

\[ 0 \le x \le P_i^{(k)}(\nu). \tag{5.4} \]

The pair (ν, J) is called a rigged configuration. The set of riggings of admissible (λ; R)-configurations is denoted by RC(λ; R). Let $(\nu, J)^{(k)}$ be the k-th rigged partition of (ν, J).

A string $(i, x) \in (\nu, J)^{(k)}$ is said to be singular if $x = P_i^{(k)}(\nu)$, that is, its label takes on the maximum value. Observe that the definition of the set RC(λ; R) is completely insensitive to the order of the rectangles in the sequence R. However the notation involving the sequence R is useful when discussing the bijection between LR tableaux and rigged configurations, since the ordering on R is essential in the definition of LR tableaux. Define the cocharge and charge of (ν, J) ∈ RC(λ; R) by

\[ cc(\nu, J) = cc(\nu) + |J|, \qquad c(\nu, J) = c(\nu) + |J|, \qquad |J| = \sum_{k, i \ge 1} |J_i^{(k)}|, \]

where $J_i^{(k)}$ is the partition inside the rectangle of height $m_i(\nu^{(k)})$ and width $P_i^{(k)}(\nu)$ given by the labels of the parts of $\nu^{(k)}$ of size i. Since the q-binomial $\begin{bmatrix} m+p \\ m \end{bmatrix}$ is the generating function of partitions with at most m parts each not exceeding p [1, Theorem 3.1], Theorem 5.2 is equivalent to the following theorem.

Theorem 5.3. For λ a partition and R a sequence of rectangles,

\[ K_{\lambda R}(q) = \sum_{(\nu, J) \in RC(\lambda; R)} q^{c(\nu, J)}. \tag{5.5} \]
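The step from Theorem 5.2 to Theorem 5.3 rests on the q-binomial being the generating function of partitions that fit in an m × p box. The sketch below checks this for small m and p; it is an illustration only, it computes the q-binomial through the standard q-Pascal recurrence rather than through the quotient of $(q)_m$ factors, and all function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def q_binomial(n, k):
    """Coefficients (constant term first) of the Gaussian binomial [n choose k]."""
    if k < 0 or k > n:
        return (0,)
    if k == 0 or k == n:
        return (1,)
    a = q_binomial(n - 1, k - 1)
    b = q_binomial(n - 1, k)  # enters the recurrence multiplied by q^k
    out = [0] * max(len(a), len(b) + k)
    for degree, coeff in enumerate(a):
        out[degree] += coeff
    for degree, coeff in enumerate(b):
        out[degree + k] += coeff
    return tuple(out)


def partitions_in_box(parts_left, max_part):
    """Yield partitions with at most parts_left parts, each part at most max_part."""
    yield ()
    if parts_left > 0:
        for first in range(1, max_part + 1):
            for rest in partitions_in_box(parts_left - 1, first):
                yield (first,) + rest


def box_counts(m, p):
    """counts[d] = number of partitions of d with at most m parts, each at most p."""
    counts = [0] * (m * p + 1)
    for partition in partitions_in_box(m, p):
        counts[sum(partition)] += 1
    return counts


# [m+p choose m] should agree with the box count for every small m and p.
agreement = all(list(q_binomial(m + p, m)) == box_counts(m, p)
                for m in range(5) for p in range(5))
print(agreement)
```

In the rigged-configuration language, a rigging of the $m_i(\nu^{(k)})$ parts of size i is exactly such a partition with $p = P_i^{(k)}(\nu)$, which is why the product of q-binomials in (5.3) becomes the sum over riggings in (5.5).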

5.2. Switching between quantum and coquantum numbers. Let $\theta_R : RC(\lambda; R) \to RC(\lambda; R)$ be the involution that complements quantum numbers. More precisely, for (ν, J) ∈ RC(λ; R), replace every string $(i, x) \in (\nu, J)^{(k)}$ by $(i, P_i^{(k)}(\nu) - x)$. The notation here differs from that in [25], in which $\theta_R$ is an involution on $RC(\lambda^t; R^t)$.

Lemma 5.4. $c(\theta_R(\nu, J)) = \|R\| - cc(\nu, J)$ for all (ν, J) ∈ RC(λ; R).

Proof. Let $\theta_R(\nu, J) = (\nu', J')$. It follows immediately from the definitions that ν' = ν. In particular ν' and ν have the same vacancy numbers and |J'| = |P| − |J|. Then

\[ c(\theta_R(\nu, J)) = c(\nu', J') = \|R\| - cc(\nu') - |P| + |J'| = \|R\| - cc(\nu) - |J| = \|R\| - cc(\nu, J). \]
□

There is a bijection $\mathrm{tr}_{RC} : RC(\lambda; R) \to RC(\lambda^t; R^t)$ that has the property

\[ cc(\mathrm{tr}_{RC}(\nu, J)) = \|R\| - cc(\nu, J) \tag{5.6} \]

for all (ν, J) ∈ RC(λ; R); see the proof of [26, Prop. 11].


5.3. RC's and level-restriction. Here we introduce the most important new definition in this paper, namely, that of a level-restricted rigged configuration. Say that a partition λ is restricted of level ℓ if $\lambda_1 - \lambda_n \le \ell$, recalling that it is assumed that all partitions have at most n parts, some of which may be zero. Fix a shape λ and a sequence of rectangles R that are all restricted of level ℓ. Define $\ell' = \ell - (\lambda_1 - \lambda_n)$, which is nonnegative by assumption. Set $\lambda' = (\lambda_1 - \lambda_n, \ldots, \lambda_{n-1} - \lambda_n)^t$ and denote the set of all column-strict tableaux of shape λ' over the alphabet $\{1, 2, \ldots, \lambda_1 - \lambda_n\}$ by CST(λ'). Define a table of modified vacancy numbers depending on ν ∈ C(λ; R) and t ∈ CST(λ') by

\[ P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k - \lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1} - \lambda_n} \chi(i \ge \ell' + t_{j,k+1}) \tag{5.7} \]

for all $i, k \ge 1$, where χ(S) = 1 if the statement S is true and χ(S) = 0 otherwise, and $t_{j,k}$ is the (j, k)-th entry of t. Finally let $x_i^{(k)}$ be the largest part of the partition $J_i^{(k)}$; if $J_i^{(k)}$ is the empty set, $x_i^{(k)} = 0$.

Definition 5.5. Say that (ν, J) ∈ RC(λ; R) is restricted of level ℓ provided that

(1) $\nu_1^{(k)} \le \ell$ for all k.
(2) There exists a tableau t ∈ CST(λ') such that for every $i, k \ge 1$, $x_i^{(k)} \le P_i^{(k)}(\nu, t)$.

Let $C_\ell(\lambda; R)$ be the set of all ν ∈ C(λ; R) such that the first condition holds, and denote by $RC_\ell(\lambda; R)$ the set of (ν, J) ∈ RC(λ; R) that are restricted of level ℓ.

Note in particular that the second condition requires that $P_i^{(k)}(\nu, t) \ge 0$ for all $i, k \ge 1$.

Example 5.6. Let us consider Definition 5.5 for two classes of shapes λ more closely:

(1) Vacuum case: Let $\lambda = (a^n)$ be rectangular with n rows. Then λ' = ∅ and $P_i^{(k)}(\nu, \emptyset) = P_i^{(k)}(\nu)$ for all $i, k \ge 1$, so that the modified vacancy numbers are equal to the vacancy numbers.
(2) Two-corner case: Let $\lambda = (a^\alpha, b^\beta)$ with α + β = n and a > b. Then $\lambda' = (\alpha^{a-b})$ and there is only one tableau t in CST(λ'), namely the Yamanouchi tableau of shape λ'. Since $t_{j,k} = j$ for $1 \le k \le \alpha$ we find that

\[ P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \delta_{k,\alpha} \max\{i - \ell', 0\} \]

for $1 \le i \le \ell$ and $1 \le k < n$. We wish to thank Anatol Kirillov for communicating this formula to us [27].

Our main result is the following formula for the level-restricted generalized Kostka polynomial:

Theorem 5.7. Let ℓ be a positive integer. For λ a partition and R a sequence of rectangles both restricted of level ℓ,

\[ K^{\ell}_{\lambda R}(q) = \sum_{(\nu, J) \in RC_\ell(\lambda; R)} q^{c(\nu, J)}. \]


The proof of this theorem is given in Sect. 8.

Example 5.8. Consider n = 3, ℓ = 2, λ = (3, 2, 1) and $R = ((2), (1)^4)$. Then the two configurations

[Display (5.8): Young diagrams of two configurations, each part labeled by its vacancy number.]

are in $C_\ell(\lambda; R)$, where again the vacancy numbers are indicated to the left of each part. The set CST(λ') consists of the two elements

\[ \begin{array}{cc} 1 & 1 \\ 2 & \end{array} \qquad\text{and}\qquad \begin{array}{cc} 1 & 2 \\ 2 & \end{array}. \]

Since ℓ' = 0 the three rigged configurations

[Display: three rigged configurations, each part labeled by its rigging.]

are restricted of level 2 with charges 2, 3, 4, respectively. The riggings are given on the right of each part. Hence $K^2_{\lambda R}(q) = q^2 + q^3 + q^4$. In contrast to this, the Kostka polynomial $K_{\lambda\mu}(q)$ is obtained by summing over both configurations in (5.8) with all possible riggings below the vacancy numbers. This amounts to $K_{\lambda\mu}(q) = q^2 + 2q^3 + 2q^4 + 2q^5 + q^6$.

In Sect. 7 we will use Theorem 5.7 to obtain explicit expressions for type A branching functions. The results suggest that it is also useful to consider the following sets of rigged configurations with imposed minima on the set of riggings. Let ρ ⊂ λ be a partition and $R_\rho = ((1^{\rho^t_1}), (1^{\rho^t_2}), \ldots, (1^{\rho^t_n}))$, the sequence of single columns of height $\rho^t_i$. Set $\rho' = (\rho_1 - \rho_n, \ldots, \rho_{n-1} - \rho_n)^t$ and

\[ M_i^{(k)}(t) = \sum_{j=1}^{\rho_k - \rho_n} \chi(i \le \rho_1 - \rho_n - t_{j,k}) - \sum_{j=1}^{\rho_{k+1} - \rho_n} \chi(i \le \rho_1 - \rho_n - t_{j,k+1}) \]

for all t ∈ CST(ρ'). Then define $RC_\ell(\lambda, \rho; R)$ to be the set of all $(\nu, J) \in RC_\ell(\lambda; R_\rho \cup R)$ such that there exists a t ∈ CST(ρ') such that $M_i^{(k)}(t) \le x$ for $(i, x) \in (\nu, J)^{(k)}$ and $M_i^{(k)}(t) \le P_i^{(k)}(\nu)$ for all $i, k \ge 1$. Note that the second condition is redundant if i occurs as a part in $\nu^{(k)}$, since by definition $M_i^{(k)}(t) \le x \le P_i^{(k)}(\nu)$ for all $(i, x) \in (\nu, J)^{(k)}$. Conjecture 8.3 asserts that the set $RC_\ell(\lambda, \rho; R)$ corresponds to the set of all level-ℓ restricted Littlewood–Richardson tableaux with a fixed subtableau of shape ρ.
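The two polynomials of Example 5.8 can be recovered by brute-force enumeration of rigged configurations, using (5.1) for the sizes $|\nu^{(k)}|$, (5.2) for the vacancy numbers, and Definition 5.5 with (5.7) for the level restriction. The sketch below is an illustration under stated assumptions, not the authors' algorithm: rectangles are encoded as lists of equal rows, all function names are hypothetical, and the level condition is only tested for 1 ≤ i ≤ ℓ and 1 ≤ k < n.

```python
from collections import Counter
from itertools import combinations_with_replacement as multisets, product

def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing lists (none if n < 0)."""
    largest = n if largest is None else largest
    if n == 0:
        yield []
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

def Q(i, rho):
    return sum(min(i, part) for part in rho)

def columns(rho):
    """Column lengths of rho, i.e. the transpose partition."""
    return [sum(part >= i for part in rho) for i in range(1, max(rho, default=0) + 1)]

def vacancy(nu, R, k, i):
    """P_i^(k)(nu) as in (5.2)."""
    get = lambda j: nu[j - 1] if 1 <= j <= len(nu) else []
    xi_k = [rect[0] for rect in R if len(rect) == k]
    return Q(i, get(k - 1)) - 2 * Q(i, get(k)) + Q(i, get(k + 1)) + Q(i, xi_k)

def charge(nu, R):
    """c(nu) = ||R|| - cc(nu) - |P|."""
    cc = 0
    for k, rho in enumerate(nu):
        below = columns(nu[k + 1]) if k + 1 < len(nu) else []
        for i, a in enumerate(columns(rho)):
            cc += a * (a - (below[i] if i < len(below) else 0))
    norm = sum(min(len(A), len(B)) * min(A[0], B[0])
               for x, A in enumerate(R) for B in R[x + 1:])
    P = sum(vacancy(nu, R, k + 1, part) for k, rho in enumerate(nu) for part in rho)
    return norm - cc - P

def configurations(lam, R):
    """Admissible (lam; R)-configurations: sizes from (5.1), all P_i^(k) >= 0."""
    n, bound = len(lam), max(1, sum(lam))
    sizes = [sum(lam[k:]) - sum(rect[0] * max(len(rect) - k, 0) for rect in R)
             for k in range(1, n)]
    for nu in product(*[list(partitions(s)) for s in sizes]):
        if all(vacancy(nu, R, k, i) >= 0 for k in range(1, n) for i in range(1, bound + 1)):
            yield nu

def cst(shape, size):
    """Column-strict tableaux of the given shape over {1..size}, as {(row, col): entry}."""
    cells = [(r, c) for r, length in enumerate(shape, 1) for c in range(1, length + 1)]
    for values in product(range(1, size + 1), repeat=len(cells)):
        t = dict(zip(cells, values))
        if all((c == 1 or t[r, c - 1] <= v) and (r == 1 or t[r - 1, c] < v)
               for (r, c), v in t.items()):
            yield t

def modified_vacancy(nu, R, lam, level, t, k, i):
    """P_i^(k)(nu, t) as in (5.7)."""
    shift = level - (lam[0] - lam[-1])  # ell' in the text
    column = lambda c: [v for (row, col), v in t.items() if col == c]
    return (vacancy(nu, R, k, i) - sum(i >= shift + v for v in column(k))
            + sum(i >= shift + v for v in column(k + 1)))

def kostka(lam, R, level=None):
    """{exponent: coefficient} of the sum of q^{c(nu, J)} over rigged configurations."""
    n, poly = len(lam), Counter()
    tableaux = []
    if level is not None:
        tableaux = list(cst(columns([x - lam[-1] for x in lam[:-1]]), lam[0] - lam[-1]))
    for nu in configurations(lam, R):
        if level is not None and any(part > level for rho in nu for part in rho):
            continue
        blocks = sorted({(k + 1, part) for k, rho in enumerate(nu) for part in rho})
        choices = [list(multisets(range(vacancy(nu, R, k, i) + 1), nu[k - 1].count(i)))
                   for k, i in blocks]
        for J in product(*choices):
            top = {block: max(labels) for block, labels in zip(blocks, J)}
            # Assumption: testing 1 <= i <= level and 1 <= k < n is sufficient.
            if level is not None and not any(
                    all(top.get((k, i), 0) <= modified_vacancy(nu, R, lam, level, t, k, i)
                        for k in range(1, n) for i in range(1, level + 1))
                    for t in tableaux):
                continue
            poly[charge(nu, R) + sum(map(sum, J))] += 1
    return dict(poly)

lam, R = [3, 2, 1], [[2], [1], [1], [1], [1]]
unrestricted = kostka(lam, R)
restricted = kostka(lam, R, level=2)
```

For the data of Example 5.8, `unrestricted` encodes $q^2 + 2q^3 + 2q^4 + 2q^5 + q^6$ and `restricted` encodes $q^2 + q^3 + q^4$, in agreement with the text. The enumeration is exponential and is only meant for checking small cases.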


6. Fermionic Expression of Level-Restricted Generalized Kostka Polynomials

6.1. Fermionic expression. Similarly to the Kostka polynomial case, one can rewrite the expression of the level-restricted generalized Kostka polynomials of Theorem 5.7 in fermionic form.

Lemma 6.1. For all $\nu \in C_\ell(\lambda; R)$, t ∈ CST(λ') and $1 \le k < n$, we have $P_i^{(k)}(\nu, t) = 0$ for $i \ge \ell$.

Proof. Since $\nu_1^{(k)} \le \ell$ it follows from [26, (11.2)] that $P_i^{(k)}(\nu) = \lambda_k - \lambda_{k+1}$ for $i \ge \ell$. Since t is over the alphabet $\{1, 2, \ldots, \lambda_1 - \lambda_n\}$ this implies for $i \ge \ell$,

\begin{align*}
P_i^{(k)}(\nu, t) &= P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k - \lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1} - \lambda_n} \chi(i \ge \ell' + t_{j,k+1}) \\
&= \lambda_k - \lambda_{k+1} - (\lambda_k - \lambda_n) + (\lambda_{k+1} - \lambda_n) = 0.
\end{align*}
□

Let SCST(λ') be the set of all nonempty subsets of CST(λ'). Furthermore set $P_i^{(k)}(\nu, S) = \min\{P_i^{(k)}(\nu, t) \mid t \in S\}$ for S ∈ SCST(λ'). Then by inclusion-exclusion the set of allowed riggings for a given configuration $\nu \in C_\ell(\lambda; R)$ is given by

\[ \sum_{S \in SCST(\lambda')} (-1)^{|S|+1} \{ J \mid x_i^{(k)} \le P_i^{(k)}(\nu, S) \}. \]

Since the q-binomial $\begin{bmatrix} m+p \\ m \end{bmatrix}$ is the generating function of partitions with at most m parts each not exceeding p, and since $P_\ell^{(k)}(\nu, S) = 0$ by Lemma 6.1, the level-ℓ restricted generalized Kostka polynomial has the following fermionic form.

Theorem 6.2.

\[ K^{\ell}_{\lambda R}(q) = \sum_{S \in SCST(\lambda')} (-1)^{|S|+1} \sum_{\nu \in C_\ell(\lambda; R)} q^{c(\nu)} \prod_{i=1}^{\ell-1} \prod_{k=1}^{n-1} \begin{bmatrix} m_i(\nu^{(k)}) + P_i^{(k)}(\nu, S) \\ m_i(\nu^{(k)}) \end{bmatrix}. \]

In Sect. 7 we will derive new expressions for branching functions of type A as limits of the level-restricted generalized Kostka polynomials. To this end we need to reformulate the fermionic formula of Theorem 6.2 in terms of a so-called (m, n)-system. Set

\[ m_i^{(a)} = P_i^{(a)}(\nu, S) = P_i^{(a)}(\nu) + f_i^{(a)}(S), \qquad n_i^{(a)} = m_i(\nu^{(a)}), \]

and $L_i^{(a)} = \sum_{j=1}^{L} \chi(i = \mu_j) \chi(a = \eta_j)$ for $1 \le i \le \ell$ and $1 \le a \le n$, which is the number of rectangles in R of shape $(i^a)$. Then

\begin{align*}
& -m_{i-1}^{(a)} + 2 m_i^{(a)} - m_{i+1}^{(a)} - n_i^{(a-1)} + 2 n_i^{(a)} - n_i^{(a+1)} \\
&\quad = \alpha_i^{(a-1)} - 2\alpha_i^{(a)} + \alpha_i^{(a+1)} - \bigl( \alpha_{i+1}^{(a-1)} - 2\alpha_{i+1}^{(a)} + \alpha_{i+1}^{(a+1)} \bigr) \\
&\qquad + \sum_{k=1}^{L} \delta_{a,\eta_k} \bigl( -\min\{i-1, \mu_k\} + 2\min\{i, \mu_k\} - \min\{i+1, \mu_k\} \bigr) \\
&\qquad - f_{i-1}^{(a)}(S) + 2 f_i^{(a)}(S) - f_{i+1}^{(a)}(S) \\
&\qquad - \bigl( \alpha_i^{(a-1)} - \alpha_{i+1}^{(a-1)} \bigr) + 2 \bigl( \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \bigr) - \bigl( \alpha_i^{(a+1)} - \alpha_{i+1}^{(a+1)} \bigr) \\
&\quad = L_i^{(a)} - f_{i-1}^{(a)}(S) + 2 f_i^{(a)}(S) - f_{i+1}^{(a)}(S).
\end{align*}
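The last step of this computation uses the fact that the second difference of $i \mapsto \min\{i, \mu\}$ is the indicator of $i = \mu$, so that the sum over rectangles of height a collapses to $L_i^{(a)}$. A small sketch checking this on the rectangles of Example 5.1 follows; here each rectangle is encoded as a hypothetical (width, height) pair $(\mu_k, \eta_k)$, and the function names are illustrative.

```python
def second_difference_of_min(i, mu):
    """-min{i-1, mu} + 2 min{i, mu} - min{i+1, mu}."""
    return -min(i - 1, mu) + 2 * min(i, mu) - min(i + 1, mu)


def count_rectangles(R, i, a):
    """L_i^(a): number of rectangles in R of width i and height a."""
    return sum(1 for width, height in R if width == i and height == a)


def count_via_second_difference(R, i, a):
    """The sum over rectangles of height a appearing in the derivation."""
    return sum(second_difference_of_min(i, width)
               for width, height in R if height == a)


# Rectangles of Example 5.1 as (width, height): (2), (2, 2), (1, 1).
R = [(2, 1), (2, 2), (1, 2)]
consistent = all(count_rectangles(R, i, a) == count_via_second_difference(R, i, a)
                 for i in range(1, 6) for a in range(1, 4))
print(consistent)
```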

At this stage it is convenient to introduce vector notation. For a matrix $v_i^{(a)}$ with indices $1 \le i \le \ell - 1$ and $1 \le a \le n - 1$ define

\[ v = \sum_{i=1}^{\ell-1} \sum_{a=1}^{n-1} v_i^{(a)} \, e_i \otimes e_a, \]

where $e_i$ and $e_a$ are the canonical basis vectors of $\mathbb{Z}^{\ell-1}$ and $\mathbb{Z}^{n-1}$, respectively. Define

\[ u_i^{(a)}(S) = -f_{i-1}^{(a)}(S) + 2 f_i^{(a)}(S) - f_{i+1}^{(a)}(S), \]

which in vector notation reads

\[ u(S) = (C \otimes I) f(S) + \sum_{a=1}^{n-1} (\lambda_a - \lambda_{a+1}) \, e_{\ell-1} \otimes e_a, \tag{6.1} \]

where C is the Cartan matrix of type A and I is the identity matrix. Since $n_i^{(0)} = n_i^{(n)} = m_0^{(k)} = 0$ and $m_\ell^{(k)} = 0$ by Lemma 6.1 it follows that

\[ (C \otimes I) m + (I \otimes C) n = L + u(S). \tag{6.2} \]

In terms of the new variables the condition (5.1) on $|\nu^{(a)}|$ becomes

\[ n_\ell^{(a)} = -e_{\ell-1} \otimes e_a \, (C^{-1} \otimes I) n - \frac{1}{\ell} \sum_{j=1}^{a} \lambda_j + \frac{1}{\ell} \sum_{i=1}^{\ell} \sum_{b=1}^{n} i \min\{a, b\} L_i^{(b)}, \tag{6.3} \]

where we used $C^{-1}_{ij} = \min\{i, j\} - ij/\ell$ if C is (ℓ − 1) × (ℓ − 1)-dimensional and $\sum_{b=1}^{n} \sum_{i=1}^{\ell} i b L_i^{(b)} = |\lambda|$.

Lemma 6.3. In terms of the above (m, n)-system

\[ c(\nu) = \tfrac{1}{2} m (C \otimes C^{-1}) m - m (I \otimes C^{-1}) u(S) + \tfrac{1}{2} u(S) (C^{-1} \otimes C^{-1}) u(S) + g(R, \lambda), \tag{6.4} \]

where

\[ g(R, \lambda) = \|R\| - \frac{1}{2} \sum_{a,b=1}^{n-1} \sum_{j=1}^{\ell} C^{-1}_{ab} L_j^{(a)} \overline{L}_j^{(b)} + \frac{1}{2\ell} \sum_{j=1}^{n} \Bigl( \lambda_j - \frac{1}{n} |\lambda| \Bigr)^2 \]

and $\overline{L}_i^{(a)} = \sum_{j=1}^{\ell} \min\{i, j\} L_j^{(a)}$.
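The closed form $C^{-1}_{ij} = \min\{i, j\} - ij/\ell$ used in (6.3) can be checked with exact rational arithmetic. The following sketch multiplies the (ℓ − 1) × (ℓ − 1) Cartan matrix of type A by the claimed inverse for a few values of ℓ; the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def cartan(size):
    """Cartan matrix of type A_size: 2 on the diagonal, -1 next to it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(size)] for i in range(size)]


def claimed_inverse(level):
    """Matrix with entries min{i, j} - i*j/level for 1 <= i, j <= level - 1."""
    return [[Fraction(min(i, j)) - Fraction(i * j, level)
             for j in range(1, level)] for i in range(1, level)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def is_identity(M):
    return all(M[i][j] == (1 if i == j else 0)
               for i in range(len(M)) for j in range(len(M)))


# One check per level ell = 2, ..., 8.
checks = {level: is_identity(matmul(cartan(level - 1), claimed_inverse(level)))
          for level in range(2, 9)}
print(checks)
```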

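Proof. By definition $c(\nu) = \|R\| - cc(\nu) - |P|$. Note that

\begin{align*}
|P| &= \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} m_i(\nu^{(k)}) P_i^{(k)}(\nu) \\
&= \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} \bigl( \alpha_i^{(k)} - \alpha_{i+1}^{(k)} \bigr) \Bigl( \sum_{j=1}^{i} \bigl( \alpha_j^{(k-1)} - 2\alpha_j^{(k)} + \alpha_j^{(k+1)} \bigr) + \overline{L}_i^{(k)} \Bigr) \\
&= -2 cc(\nu) + \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} n_i^{(k)} \overline{L}_i^{(k)}.
\end{align*}

Hence eliminating cc(ν) in favor of |P| yields

\[ c(\nu) = \|R\| - \frac{1}{2} |P| - \frac{1}{2} \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} n_i^{(k)} \overline{L}_i^{(k)}. \]

On the other hand, using $n_i^{(k)} = m_i(\nu^{(k)})$ and $P_\ell^{(k)}(\nu) = \lambda_k - \lambda_{k+1}$,

\[ |P| = n (I \otimes I) P(\nu) + \sum_{k=1}^{n-1} n_\ell^{(k)} (\lambda_k - \lambda_{k+1}) \]

so that

\[ c(\nu) = \|R\| - \frac{1}{2} n (I \otimes I) \bigl( P(\nu) + \overline{L} \bigr) - \frac{1}{2} \sum_{k=1}^{n-1} n_\ell^{(k)} \bigl( \lambda_k - \lambda_{k+1} + \overline{L}_\ell^{(k)} \bigr). \tag{6.5} \]

Eliminating n in favor of m using (6.2) and substituting P(ν) = m − f(S) yields

\begin{align*}
-\frac{1}{2} n (I \otimes I) \bigl( P(\nu) + \overline{L} \bigr) &= \frac{1}{2} m \bigl\{ C \otimes C^{-1} (m + \overline{L} - f(S)) - I \otimes C^{-1} (L + u(S)) \bigr\} \\
&\quad - \frac{1}{2} (L + u(S)) (I \otimes C^{-1}) (\overline{L} - f(S)).
\end{align*}

Similarly, replacing n by m in (6.3) we obtain

\[ n_\ell^{(a)} = e_{\ell-1} \otimes e_a \bigl( I \otimes C^{-1} m - C^{-1} \otimes C^{-1} u(S) \bigr) - \frac{1}{\ell} \sum_{j=1}^{a} \Bigl( \lambda_j - \frac{1}{n} |\lambda| \Bigr) + \sum_{b=1}^{n-1} C^{-1}_{ab} L_\ell^{(b)}. \tag{6.6} \]

Inserting these equations into (6.5), trading f(S) for u(S) by (6.1) and using

\[ (C \otimes I) \overline{L} - L - \sum_{a=1}^{n-1} e_{\ell-1} \otimes e_a \, \overline{L}_\ell^{(a)} = 0 \]

results in the claim of the lemma. □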

As a corollary of Lemma 6.3 and Theorem 6.2 we obtain the following expression for the level-restricted generalized Kostka polynomial:

\[ K^{\ell}_{\lambda R}(q) = q^{g(R,\lambda)} \sum_{S \in SCST(\lambda')} (-1)^{|S|+1} q^{\frac{1}{2} u(S) C^{-1} \otimes C^{-1} u(S)} \times \sum_{m} q^{\frac{1}{2} m C \otimes C^{-1} m - m I \otimes C^{-1} u(S)} \begin{bmatrix} m+n \\ m \end{bmatrix}, \tag{6.7} \]

where n is determined by (6.2), the sum over m is such that

\[ e_{\ell-1} \otimes e_a \bigl( I \otimes C^{-1} m - C^{-1} \otimes C^{-1} u(S) \bigr) - \frac{1}{\ell} \sum_{j=1}^{a} \Bigl( \lambda_j - \frac{1}{n} |\lambda| \Bigr) + \sum_{b=1}^{n-1} C^{-1}_{ab} L_\ell^{(b)} \in \mathbb{Z} \]

for all $1 \le a \le n - 1$, and

\[ \begin{bmatrix} m+n \\ m \end{bmatrix} = \prod_{i=1}^{\ell-1} \prod_{k=1}^{n-1} \begin{bmatrix} m_i^{(k)} + n_i^{(k)} \\ m_i^{(k)} \end{bmatrix}. \]

Now consider the second case of Example 5.6, namely $\lambda = (a^\alpha, b^\beta)$ with a > b and α + β = n. Then SCST(λ') only contains the element S = {t}, where t is the Yamanouchi tableau of shape λ', and $u(S) = e_{\ell'} \otimes e_\alpha$. In the vacuum case, that is, when $\lambda = ((|\lambda|/n)^n)$, the set SCST(λ') only contains S = {∅} and u(S) = f(S) = 0. In this case (6.7) simplifies to

\[ K^{\ell}_{\lambda R}(q) = q^{g(R,\lambda)} \sum_{m} q^{\frac{1}{2} m C \otimes C^{-1} m} \begin{bmatrix} m+n \\ m \end{bmatrix}. \]

When R is a sequence of single boxes this proves [8, Theorem 1]¹. When R is a sequence of single rows or single columns this settles [12, Conjecture 5.7].

6.2. Polynomial Rogers–Ramanujan-type identities. Let W be the Weyl group of $sl_n$, $M = \{\beta \in \mathbb{Z}^n \mid \sum_{i=1}^{n} \beta_i = 0\}$ be the root lattice, ρ the half-sum of the positive roots, and (·|·) the standard symmetric bilinear form. Recall the energy function (3.9). It was shown in [31] that

\[ K^{\ell}_{\lambda R}(q) = \sum_{\tau \in W} \sum_{\beta \in M} \sum_{\substack{b \in P_R \\ \mathrm{wt}(b) = -\rho + \tau^{-1}(\lambda - (\ell+n)\beta + \rho)}} (-1)^{\tau} q^{-\frac{1}{2}(\ell+n)(\beta|\beta) + (\lambda+\rho|\beta) + E(b)}. \tag{6.8} \]

Equating (6.7) and (6.8) gives rise to polynomial Rogers–Ramanujan-type identities. For the vacuum case, that is, when the partition λ is rectangular with n rows, this proves [33, Eq. (9.2)]².

¹ We believe that the proof given in [8] is incomplete.
² The definition of level-restricted path as given in [33, p. 394] only works when R (or µ therein) consists of single rows; otherwise the description of Sect. 3.7 should be used.


7. New Expressions for Type A Branching Functions

The coset branching functions $b^{\Lambda}_{\Lambda'\Lambda''}$ labeled by the three weights Λ, Λ', Λ'' have a natural finitization in terms of (Λ' + Λ'')-restricted crystals. For certain triples of weights these can be reformulated in terms of level-restricted paths, which in turn yield an expression of the type A branching functions as a limit of the level-restricted generalized Kostka polynomials. Together with the results of the last section this implies new fermionic expressions for type A branching functions at certain triples of weights.

7.1. Branching function in terms of paths. Let , , ∈ Pcl be dominant integral weights of levels *, * , and * respectively, where * = * + * . The branching function b (z) is the formal power series defined by af()−mδ b zm caf( ),af( ) , (z) = m≥0

af()−mδ

where caf( ),af( ) is the multiplicity of the irreducible integrable highest weight n )-module V(af() − mδ) in the tensor product V(af( )) ⊗ V(af( )). Uq (sl n -highest weight vectors of weight The desired multiplicity is equal to the number of sl af()−mδ in the tensor product B(af( ))⊗B(af( )), that is, the number of elements b ⊗b ∈ B(af( ))⊗B(af( )) such that wt(b ⊗b ) = af()−mδ and i (b ⊗b ) = 0 for all i ∈ I . By (3.5), b = u , b is -restricted, and wt(b ) = af( − ) − mδ. Let B be a perfect crystal of level * . Using the isomorphism B( ) ∼ = P( , B) let b = b1 ⊗ b2 ⊗ · · · and b ∈ P( , B) be the ground state path. Suppose N is such that . In type A(1) the period of the ground for all j > N, bj = bj . Write b = b1 ⊗ · · · ⊗ bN n−1 state path b always divides n. Choose N to be a multiple of n, so that b = b ⊗ b and bN+1 = b1 . Then the above desired highest weight vectors have the form b ⊗ b = (b ⊗ u ) ⊗ u ∈ B ⊗N ⊗ B(af( )) ⊗ B(af( )). But there is an embedding B(af( + )) 7→ B(af( )) ⊗ B(af( )) defined by u + → u ⊗ u . With this rephrasing of the conditions on b and taking limits, we have −EN (b1 ⊗···⊗bN ) b zEN (b) , (7.1) (z) = lim z N→∞ N∈nZ

b∈H( + ,B⊗N ,)

where EN : B ⊗N → Z is given by EN (b) = E(b ⊗ bN+1 ) = E(b ⊗ b1 ) and E is the energy function on finite paths. Our goal is to express (7.1) in terms of level-restricted generalized Kostka polynomials. We find that this is possible for certain triples of weights. Using the results of Sect. 6 this provides explicit formulas for the branching functions. 7.2. Reduction to level-restricted paths. The first step in the transformation of (7.1) is to replace the condition of ( + )-restrictedness by level * restrictedness. This is achieved at the cost of appending a fixed inhomogeneous path. Consider any tensor product B of perfect crystals each of which has level at most * (the level of ), such that there is an element y ∈ H(* 0 , B , ). We indicate how such a B and y can be constructed explicitly. Let λ be the partition with strictly


less than n rows with hi , columns of length i for 1 ≤ i ≤ n − 1. Let Yλ be the Yamanouchi tableau of shape λ. Then any factorization (in the plactic monoid) of Yλ into a sequence of rectangular tableaux, yields such a B and y . Example 7.1. Let n = 6, * = 5, = 0 + 22 + 3 + 4 . Then λ = (4, 4, 2, 1) (its transpose is λt = (4, 3, 2, 2)) and 1 1 1 1 Yλ =

2 2 2 2 3 3

.

4 One way is to factorize into single columns: B = B 2,1 ⊗ B 2,1 ⊗ B 3,1 ⊗ B 4,1 and y = y4 ⊗ y3 ⊗ y2 ⊗ y1 , where each yj is an sln highest weight vector, namely, the j th column of Yλ . Another way is to factorize into the minimum number of rectangles by slicing Yλ vertically. This yields B = B 2,2 ⊗ B 3,1 ⊗ B 4,1 ; again the factors of y = y3 ⊗ y2 ⊗ y1 are the sln highest weight vectors, namely,

y3 =

1 1 2 2

1 ,

y2 = 2 , 3

1 2 y1 = . 3 4

Consider also a tensor product B of perfect crystals such that there is an element ∈ H(* 0 , B , ). Then y = y ⊗ y ∈ H(*0 , B ⊗ B , + ). Instead of b ∈ H( + , B ⊗N , ), we work with b ⊗ y, where b ⊗ y is restricted of level *. This trick doesn’t help unless one can recover the correct energy function directly from b ⊗ y. Let p be the first N steps of the ground state path b ∈ P( , B). Define the normalized energy function on B ⊗N by E(b) = E(b ⊗ y ) − E(p ⊗ y ). A priori it depends on , B, and y . The energy function occurring in the branching function is E (b) = E(b ⊗ b1 ) − E(p ⊗ b1 ). y

Lemma 7.2. E = E . Proof. It suffices to show that the function B ⊗N → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is constant. Using the definition (3.9) and the fact that b is homogeneous of length N, we have E(b ⊗ y ) = E(b) + N E(bN ⊗ y ) − (N − 1)E(y ). Similarly E(b ⊗ b1 ) = E(b) + N E(bN ⊗ b1 ). Therefore E(b ⊗ y ) − E(b ⊗ b1 ) = N(E(bN ⊗ y ) − E(bN ⊗ b1 )) − (N − 1)E(y ). Thus it suffices to show that the function B → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is a constant function. Suppose first that i (b ) > hi , for some 1 ≤ i ≤ n − 1. By the construction of y and b1 , φi (y ) = hi , = φi (b1 ) for 1 ≤ i ≤ n − 1, since φ(b1 ) = . Then ei (b ⊗ y ) = ei (b ) ⊗ y and ei (b ⊗ b1 ) = ei (b ) ⊗ b1 by (3.4). Passing from b to ei (b ) repeatedly, the values of the energy functions are constant, so it may be assumed that b ⊗ y is a sln highest weight vector; in particular, i (b ) ≤ hi , for all 1 ≤ i ≤ n − 1.


Next suppose that 0 (b ) > h0 , . Now φ0 (y ) = 0 and φ0 (b1 ) = h0 , . By (3.4) e0 (b ⊗ b1 ) = e0 (b ) ⊗ b1 and e0 (b ⊗ y ) = e0 (b ) ⊗ y . By (3.8) and the fact that the local isomorphism on B ⊗B is the identity, we have E(e0 (b ⊗b1 )) = E(b ⊗b1 )−1. To show that E(e0 (b ⊗y )) = E(b ⊗y )−1 we check the conditions of Lemma 3.2. By (3.1) 0 (y ) = φ0 (y ) − h0 , wt(y ) = 0 − h0 , − * 0 = * − h0 , . Also by (3.5), since φ0 (y ) = 0, we have 0 (b ⊗ y ) = 0 (b ) + 0 (y ) > h0 , + * − h0 , = * . Let z ⊗ x be the image of b ⊗ y under an arbitrary composition of local isomorphisms. Since b ⊗ y is an sln highest weight vector, so is z ⊗ x and x. Now x is the sln -highest weight vector in a perfect crystal of level at most * , so φ0 (x) = 0 and 0 (x) ≤ * . But * < 0 (b ⊗ y ) = 0 (z ⊗ x) = 0 (z) + 0 (x) so that 0 (z) > 0. By (3.4) e0 (z ⊗ x) = e0 (z) ⊗ x. So E(e0 (b ⊗ y )) = E(b ⊗ y ) − 1 by Lemma 3.2. ) ≤ h , . But then ) ≤ (b (b By induction we may now assume that 0 0 i i i hi , , or c , (b ) ≤ c , = * . Since b ∈ B and B is a perfect crystal of level * , b must be the unique element of B such that (b ) = . Thus the function B → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is constant on B if it is constant on the singleton set { −1 ( )}, which it obviously is. " # 7.3. Explicit ground state energy. To go further, an explicit formula for the value E(p ⊗ y ) is required. This is achieved in (7.2). The derivation makes use of the following explicit construction of the local isomorphism. Theorem 7.3. Let B = B k,* be a perfect crystal of level *, , ∈ (Pcl+ )* , B a perfect crystal of level * ≤ *, and b ∈ H( , B , ). Let x ∈ B (resp. y ∈ B) be the unique element such that (x) = (resp. (y) = ). Then under the local isomorphism B ⊗ B ∼ = ψ k (b) ⊗ y. = B ⊗ B, we have x ⊗ b ∼ The proof requires several technical lemmas and is given in the next section. Example 7.4. Let n = 5, * = 4, k = 2, = 0 +1 +3 +4 , = 0 +1 +2 + 4 , * = 2, B = B 2,2 . 
Here the set H( , B , ) consists of two elements, namely, 1 2 4 5

and

1 4 2 5.

Let b be the second tableau. The theorem says that 1 1 2 3 2 3 4 5

⊗

1 1 2 4 1 4 ∼ 1 3 ⊗ = 2 3 5 5. 2 4 2 5

Proposition 7.5. Let ∈ (Pcl+ )* , B = B k,* a perfect crystal of level *, b ∈ P(, B) the ground state path, p a finite path (say of length N , where N is a multiple of n) such that p ⊗ b = b, B the tensor product of perfect crystals each of level at most *, and y ∈ H(*0 , B , ). Let p be the path of length N such that p ⊗ b = b , where b ∈ P(*0 , B) is the ground state path. Then under the composition of local isomorphisms B ⊗N ⊗ B ∼ = y ⊗ p . = B ⊗ B ⊗N we have p ⊗ y ∼ Proof. Induct on the length of the path y. Suppose B = B1 ⊗ B2 and y = y1 ⊗ y2 , where yj ∈ Bj and Bj is a perfect crystal. Let = − wt(y1 ). By the definitions y2 ∈ H(*0 , B2 , ). By induction the first N steps p of the ground state path of


∼ y2 ⊗ p under the composition of local isomorphisms P( , B) satisfy p ⊗ y2 = ⊗N ⊗N ∼ B ⊗ B2 = B2 ⊗ B . Tensoring on the left with y1 , it remains to show that p ⊗ y1 ∼ = y1 ⊗ p under the composition of local isomorphisms B ⊗N ⊗ B1 ∼ = B1 ⊗ B ⊗N . Now ∈ B are the unique elements such that (p ) = and (p ) = . pN ∈ B and pN N N . Now p ⊗ y ∈ H( , B ⊗ Applying Theorem 7.3 we obtain pN ⊗ y1 ∼ = ψ k (y1 ) ⊗ pN N 1 ∈ H( , B ⊗ B, φ(p )). This implies that ψ k (y ) ∈ B1 , φ(pN )) so that ψ k (y1 ) ⊗ pN 1 N 1 ) and (p H(φ(pN ), B1 , φ(pN )). Now by definition (pN−1 ) = φ(pN N−1 ) = φ(pN ). . Continuing in Applying Theorem 7.3 we obtain pN−1 ⊗ ψ k (y1 ) ∼ = ψ 2k (y1 ) ⊗ pN−1 j k (j +1)k ∼ (y1 ) ⊗ pN−j for 0 ≤ j ≤ N − 1. this manner it follows that pN−j ⊗ ψ (y1 ) = ψ Composing these local isomorphisms it follows that p ⊗ y1 ∼ = ψ Nk (y1 ) ⊗ p . But ψ N is the identity since the order of ψ divides n which divides N . Therefore p ⊗ y1 ∼ = y1 ⊗ p under the composition of local isomorphisms and we are done. " # In the notation in the previous section, E(p ⊗ y ) = E(y ⊗ p ), where p is the first N steps of the ground state path of P(* 0 , B). Write N = nM and B = B k,* . Then using the generalized cocyclage one may calculate explicitly the generalized charge of the LR tableau corresponding to the level * restricted (and hence classically restricted) path y ⊗ p . Let |y | denote the total number of cells in the tableaux comprising y . Then kM . (7.2) E(y ⊗ p ) = E(y ) + |y |kM + n* 2 Example 7.6. Let n = 5, * = 3, = 0 + 3 + 4 , k = 2 and M = 1. Then p is the path 4 4 4 5 5 5

⊗

2 2 2 3 3 3

⊗

1 1 1 5 5 5

⊗

3 3 3 4 4 4

⊗

1 1 1 2 2 2.

The element y can be taken to be the tensor product 1

1

2

2⊗ 3

3 4.

Let λ = (8, 8, 8, 7, 6). Then the tableau Q ∈ LR(λ; R) (resp. Y ) that records the path y ⊗ p (resp. y ) is given by

\[ Q = \begin{array}{cccccccc}
1 & 1 & 1 & 5 & 5 & 5 & 11 & 15 \\
2 & 2 & 2 & 7 & 7 & 7 & 12 & 16 \\
3 & 3 & 3 & 8 & 8 & 8 & 13 & 17 \\
4 & 4 & 4 & 9 & 9 & 9 & 14 & \\
6 & 6 & 6 & 10 & 10 & 10 & &
\end{array}, \qquad
Y = \begin{array}{cc}
1 & 5 \\
2 & 6 \\
3 & 7 \\
4 &
\end{array} \]

with R = ((3, 3), (3, 3), (3, 3), (3, 3), (3, 3), (1, 1, 1, 1), (1, 1, 1)) and subalphabets {1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}, {11, 12, 13, 14}, {15, 16, 17}. The generalized charge


cR (Q) is equal to the energy E(y ⊗ p ) [37, Theorem 23]. Here the widest rectangle in the path is of width * . For any tableau T ∈ LR(ρ; R) for some partition ρ, define V (T ) = P ((w0R Te )(w0R Tw )), where P is the Schensted P tableau, w0R is the automorphism of conjugation that reverses each of the subalphabets, and Tw and Te are the west and east subtableaux obtained by slicing T between the * th and (* + 1)th columns. It can be shown that there is a composition of |Te | generalized R-cocyclages leading from T to V (T ), where |Te | denotes the number of cells in Te . It follows from the ideas in [35, Sect. 3] and the intrinsic characterization of cR in [35, Theorem 21] that cR (T ) = cR (V (T )) + |Te |.

(7.3)

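For the above tableau Q we have

\[ Q_w = \begin{array}{ccc}
1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \\ 4 & 4 & 4 \\ 6 & 6 & 6
\end{array}, \qquad
w_0^R Q_w = \begin{array}{ccc}
1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \\ 4 & 4 & 4 \\ 5 & 5 & 5
\end{array} \]

and

\[ Q_e = \begin{array}{ccccc}
5 & 5 & 5 & 11 & 15 \\ 7 & 7 & 7 & 12 & 16 \\ 8 & 8 & 8 & 13 & 17 \\ 9 & 9 & 9 & 14 & \\ 10 & 10 & 10 & &
\end{array}, \qquad
w_0^R Q_e = \begin{array}{ccccc}
6 & 6 & 6 & 11 & 15 \\ 7 & 7 & 7 & 12 & 16 \\ 8 & 8 & 8 & 13 & 17 \\ 9 & 9 & 9 & 14 & \\ 10 & 10 & 10 & &
\end{array}. \]

Then

\[ V(Q) = \begin{array}{ccccc}
1 & 1 & 1 & 11 & 15 \\ 2 & 2 & 2 & 12 & 16 \\ 3 & 3 & 3 & 13 & 17 \\ 4 & 4 & 4 & 14 & \\ 5 & 5 & 5 & & \\ 6 & 6 & 6 & & \\ 7 & 7 & 7 & & \\ 8 & 8 & 8 & & \\ 9 & 9 & 9 & & \\ 10 & 10 & 10 & &
\end{array}
\qquad\text{and}\qquad
V(V(Q)) = \begin{array}{ccc}
1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \\ 4 & 4 & 4 \\ 5 & 5 & 5 \\ 6 & 6 & 6 \\ 7 & 7 & 7 \\ 8 & 8 & 8 \\ 9 & 9 & 9 \\ 10 & 10 & 10 \\ 11 & 15 & \\ 12 & 16 & \\ 13 & 17 & \\ 14 & &
\end{array}. \]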


We have cR (V (V (Q))) = cR (Y ) = E(y ) by [35, Theorem 21] and cR (Q) = cR (V (Q)) + |Qe | = cR (V (Q)) + * n + |Y |, and cR (V (Q)) = cR (V (V (Q))) + |Y | by (7.3). This implies cR (Q) = * n + E(y ) + 2|Y |. 7.4. Proof of Theorem 7.3. The proof of Theorem 7.3 requires several lemmas. Words of length L in the alphabet {1, 2, . . . , n} are identified with the elements of the crystal basis of the L-fold tensor product (B 1,1 )⊗L . Lemma 7.7. Let u and v be words such that uv is an An−1 highest weight vector. Then v is an An−1 highest weight vector and j (u) ≤ φj (v) for all 1 ≤ j ≤ n − 1. Proof. Let uv be an An−1 highest weight vector and 1 ≤ j ≤ n − 1. By (3.5) 0 = j (uv) = j (v) + max{0, j (u) − φj (v)}. Since both summands on the right-hand side are nonnegative and sum to zero they must both be zero. " # Lemma 7.8. Let w be a word in the alphabet {1, 2} and w a word obtained by removing a letter i of w. Then w ) ≤ 1 (w) + 1 with equality only if i = 1. (1) 1 ( w ) + 1 with equality only if i = 2. (2) 1 (w) ≤ 1 ( Proof. Write w = uiv and w = uv. By (3.5) 1 (ui) = 1 (i) + max{0, 1 (u) − φ1 (i)} max{0, 1 (u) − 1} if i = 1 = 1 + 1 (u) if i = 2.

(7.4)

In particular $\varepsilon_1(ui) \ge \varepsilon_1(u) - 1$. Applying (3.5) to both $\varepsilon_1(uv)$ and $\varepsilon_1(uiv)$ and subtracting, we obtain

\begin{align*}
\varepsilon_1(uv) - \varepsilon_1(uiv) &= \max\{0, \varepsilon_1(u) - \varphi_1(v)\} - \max\{0, \varepsilon_1(ui) - \varphi_1(v)\} \\
&\le \max\{0, \varepsilon_1(u) - \varphi_1(v)\} - \max\{0, \varepsilon_1(u) - 1 - \varphi_1(v)\} \le 1.
\end{align*}

Moreover if $\varepsilon_1(uv) - \varepsilon_1(uiv) = 1$ then all of the inequalities are equalities. In particular it must be the case that $\varepsilon_1(ui) = \varepsilon_1(u) - 1$, which by (7.4) implies that i = 1, proving the first assertion. On the other hand, (7.4) also implies $\varepsilon_1(ui) \le 1 + \varepsilon_1(u)$. Subtracting $\varepsilon_1(uv)$ from $\varepsilon_1(uiv)$ and computing as before, the second part follows. □

Say that w is an almost highest weight vector with defect i if there is an index $1 \le i \le n - 1$ such that $\varepsilon_j(w) = \delta_{ij}$ for $1 \le j \le n - 1$, and also $\varepsilon_{i-1}(e_i(w)) = 0$ if i > 1.

Lemma 7.9. Let w be an almost highest weight vector with defect i for $1 \le i \le n - 1$. Then $e_i(w)$ is either an $A_{n-1}$ highest weight vector or an almost highest weight vector of defect i + 1.
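The tensor product rule (3.5), as it is used above, together with Lemmas 7.7 and 7.8, can be tested by brute force on short words. The sketch below is an illustration only: it assumes that $\varepsilon_j$ and $\varphi_j$ of a word are computed by pairing each letter j + 1 with a letter j to its right (the convention consistent with (7.4)), and all function names are hypothetical.

```python
from itertools import product


def eps_phi(word, j):
    """Return (eps_j, phi_j) of a word, pairing each j+1 with a j to its right."""
    unmatched_j = unmatched_j1 = 0
    for letter in reversed(word):
        if letter == j:
            unmatched_j += 1
        elif letter == j + 1:
            if unmatched_j:
                unmatched_j -= 1      # this j+1 is paired with a j to its right
            else:
                unmatched_j1 += 1     # this j+1 stays unpaired
    return unmatched_j1, unmatched_j


def words(alphabet, max_len):
    for length in range(max_len + 1):
        yield from product(alphabet, repeat=length)


def tensor_rule_holds(n=3, max_len=3):
    """eps_j(uv) = eps_j(v) + max(0, eps_j(u) - phi_j(v)) for all short u, v."""
    letters = tuple(range(1, n + 1))
    return all(eps_phi(u + v, j)[0]
               == eps_phi(v, j)[0] + max(0, eps_phi(u, j)[0] - eps_phi(v, j)[1])
               for u in words(letters, max_len) for v in words(letters, max_len)
               for j in range(1, n))


def lemma_7_7_holds(n=3, max_len=6):
    """If uv is highest weight, then so is v, and eps_j(u) <= phi_j(v)."""
    letters = tuple(range(1, n + 1))
    for w in words(letters, max_len):
        if any(eps_phi(w, j)[0] for j in range(1, n)):
            continue  # w is not a highest weight vector
        for cut in range(len(w) + 1):
            u, v = w[:cut], w[cut:]
            for j in range(1, n):
                if eps_phi(v, j)[0] != 0 or eps_phi(u, j)[0] > eps_phi(v, j)[1]:
                    return False
    return True


def lemma_7_8_holds(max_len=8):
    """Both inequalities of Lemma 7.8, including the 'equality only if' clauses."""
    for w in words((1, 2), max_len):
        eps_w = eps_phi(w, 1)[0]
        for pos, letter in enumerate(w):
            eps_hat = eps_phi(w[:pos] + w[pos + 1:], 1)[0]
            if eps_hat > eps_w + 1 or (eps_hat == eps_w + 1 and letter != 1):
                return False
            if eps_w > eps_hat + 1 or (eps_w == eps_hat + 1 and letter != 2):
                return False
    return True


print(tensor_rule_holds(), lemma_7_7_holds(), lemma_7_8_holds())
```

An exhaustive check on short words is of course no substitute for the proofs; it only guards against a misreading of the conventions.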


Proof. For $j \notin \{i - 1, i, i + 1\}$, the restrictions of the words w and $e_i(w)$ to the alphabet {j, j + 1} are identical, so that $\varepsilon_j(e_i(w)) = \varepsilon_j(w) = 0$ by the definition of an almost highest weight vector. Also $\varepsilon_i(w) = 1$ implies that $\varepsilon_i(e_i(w)) = 0$. Again by the definition of an almost highest weight vector, $\varepsilon_{i-1}(e_i(w)) = 0$. If i = n − 1 we have shown that $e_i(w)$ is an $A_{n-1}$ highest weight vector. So it may be assumed that i < n − 1. It is enough to show that one of the two following possibilities occurs.

(1) $\varepsilon_{i+1}(e_i(w)) = 0$.
(2) $\varepsilon_{i+1}(e_i(w)) = 1$ and $\varepsilon_i(e_{i+1} e_i(w)) = 0$.

Recall that $e_i(w)$ is obtained from w by changing an i + 1 into an i. Write w = u(i + 1)v such that $e_i(w) = uiv$. In this notation we have $\varphi_i(v) = 0$ and $\varepsilon_i(u) = 0$. By Lemma 7.8 point 1 with {1, 2} replaced by {i + 1, i + 2} and using that w is an almost highest weight vector of defect i, we have $\varepsilon_{i+1}(e_i(w)) \le \varepsilon_{i+1}(w) + 1 = 1$. It is now enough to assume that $\varepsilon_{i+1}(e_i(w)) = 1$ and to show that $\varepsilon_i(e_{i+1} e_i(w)) = 0$. By (3.5)

\[ 0 = \varepsilon_{i+1}(w) = \varepsilon_{i+1}(u(i+1)v) = \varepsilon_{i+1}(v) + \max\{0, \varepsilon_{i+1}(u) - \varphi_{i+1}((i+1)v)\}. \]

In particular $\varepsilon_{i+1}(v) = 0$. Hence $e_{i+1}(e_i(w)) = e_{i+1}(uiv) = e_{i+1}(u)iv$. Similar computations starting with $\varepsilon_i(w) = 1$ and which use the fact that $\varepsilon_i(u) = \varphi_i(v) = 0$ yield $\varepsilon_i(v) = 0$. We have

\begin{align*}
\varepsilon_i(e_{i+1} e_i(w)) &= \varepsilon_i(e_{i+1}(u) iv) = \varepsilon_i(iv) + \max\{0, \varepsilon_i(e_{i+1}(u)) - \varphi_i(iv)\} \\
&= 0 + \max\{0, \varepsilon_i(e_{i+1}(u)) - 1\}.
\end{align*}

But $\varepsilon_i(u) = 0$ and in passing from u to $e_{i+1}(u)$ an i + 2 is changed into an i + 1. By Lemma 7.8 point 2 applied to the restriction of u to the alphabet {i, i + 1}, we have $\varepsilon_i(e_{i+1}(u)) \le \varepsilon_i(u) + 1 = 1$. It follows that $\varepsilon_i(e_{i+1} e_i(w)) = 0$, and that $e_i(w)$ is an almost highest weight vector of defect i + 1. □

Lemma 7.10. Suppose w is an $A_{n-1}$ highest weight vector and $\hat{w}$ is a word obtained by removing a letter (say i) from w. Then there is an index r such that $i \le r \le n$ and $e_{r-1} e_{r-2} \cdots e_i(\hat{w})$ is an $A_{n-1}$ highest weight vector.

Proof.
By Lemma 7.9 it suffices to show that $\hat{w}$ is either an $A_{n-1}$ highest weight vector or an almost highest weight vector of defect i. First it is shown that $\varepsilon_j(\hat{w}) = 0$ for $j \ne i$. For $j \notin \{i - 1, i\}$, the restrictions of w and $\hat{w}$ to the alphabet {j, j + 1} are the same, so that $\varepsilon_j(\hat{w}) = \varepsilon_j(w) = 0$. For j = i − 1, by Lemma 7.8 point 1 and the assumption that w is an $A_{n-1}$ highest weight vector, it follows that $\varepsilon_{i-1}(\hat{w}) \le \varepsilon_{i-1}(w) + 1 = 1$. But equality cannot hold since the removed letter is i as opposed to i − 1. Thus $\varepsilon_{i-1}(\hat{w}) = 0$. Next we observe that $\varepsilon_i(\hat{w}) \le \varepsilon_i(w) + 1 = 1$ by Lemma 7.8 point 1 and the fact that w is an $A_{n-1}$ highest weight vector. If $\varepsilon_i(\hat{w}) = 0$ then $\hat{w}$ is an $A_{n-1}$ highest weight vector. So it may be assumed that $\varepsilon_i(\hat{w}) = 1$. It suffices to show that $\varepsilon_{i-1}(e_i(\hat{w})) = 0$. Write w = uiv and $\hat{w} = uv$. Now $\varepsilon_j(v) = 0$ for all $1 \le j \le n - 1$ by Lemma 7.7 since w is an $A_{n-1}$ highest weight vector. In particular $\varepsilon_i(v) = 0$ so that $e_i(\hat{w}) = e_i(uv) = e_i(u)v$. We have

\begin{align*}
\varepsilon_{i-1}(e_i(\hat{w})) &= \varepsilon_{i-1}(e_i(u)v) = \varepsilon_{i-1}(v) + \max\{0, \varepsilon_{i-1}(e_i(u)) - \varphi_{i-1}(v)\} \\
&= \max\{0, \varepsilon_{i-1}(e_i(u)) - \varphi_{i-1}(v)\},
\end{align*}

since $\varepsilon_{i-1}(v) = 0$ by Lemma 7.7. It is enough to show that $\varepsilon_{i-1}(e_i(u)) \le \varphi_{i-1}(v)$. But

\[ \varepsilon_{i-1}(e_i(u)) \le \varepsilon_{i-1}(u) + 1 = \varepsilon_{i-1}(ui) \le \varphi_{i-1}(v). \]

The first inequality holds by an application of Lemma 7.8 point 2 since the restrictions of u and $e_i(u)$ to the alphabet {i − 1, i} differ by inserting a letter i. The last inequality holds by Lemma 7.7 since w = uiv is an $A_{n-1}$ highest weight vector. □

Lemma 7.11. Let B = B k,* be a perfect crystal of level * ≤ *, ∈ (Pcl+ )* , B a finite (possibly empty) tensor product of perfect crystals of level at most *, x ∈ B and b ∈ B such that x ⊗ b ∈ H(, B ⊗ B). Let i ∈ J such that hi , > 0 and set = − i + i−1 . Then there is an index 0 ≤ s ≤ k such that ei+s−1 · · · ei+1 ei (x ⊗ b) = x ⊗ ei+s−1 · · · ei+1 ei (b)

(7.5)

and ei+s−1 · · · ei (b) ∈ H( , B), where the subscripts are taken modulo n. Moreover if * = * then s = k. (1)

Proof. Since the Dynkin diagram An−1 has an automorphism given by rotation, it may be assumed that i = 1. Let λ be the partition of length less than n, given by hj , = λj − λj +1 for 1 ≤ j ≤ n − 1 and λn = 0. Since h1 , > 0 it follows that λ has t

a column of size 1. Let m = λ1 and yi be the An−1 -highest weight vector in B λj ,1 for 1 ≤ j ≤ m. Write y = ym ⊗ · · · ⊗ y1 and y = ym−1 ⊗ · · · ⊗ y1 . Observe that t t ,1 λ m y ⊗ u*0 is an affine highest weight vector in B ⊗ · · · ⊗ B λ1 ,1 ⊗ B(*0 ) and has weight so its connected component is isomorphic to B(). A similar statement holds for y ⊗ u*0 and B( ). In particular, b ⊗ y is an An−1 highest weight vector. The map x ⊗ b ⊗ y → word(x)word(b)word(y) gives an embedding of An−1 -crystals into a tensor product of crystals B 1,1 . By Lemma 7.10, there exists an index 1 ≤ r ≤ n such that er−1 er−2 · · · e1 (word(x)word(b)word( y )) is an An−1 highest weight vector. Since y is an An−1 highest weight vector it follows that er−1 · · · e1 (word(x)word(b)word( y )) = er−1 · · · e1 (word(x)word(b))word( y ). Let pj be the position of the letter in ej −1 . . . e1 (word(x)word(b)) that changes from a j + 1 to j upon the application of ej , for 1 ≤ j ≤ r − 1. It follows from the proof of Lemma 7.9 that pr−1 < pr−2 < · · · < p2 < p1 .

(7.6) b

Let s be the maximal index such that ps is located in word(b). Write = es · · · e1 (b). It follows that es es−1 · · · e1 (x ⊗ b) = x ⊗ b and that b ⊗ y is an An−1 highest weight vector. It remains to show that 0 (b ⊗ y ⊗ u*0 ) = 0 and that s ≤ k with equality if * = *.

(7.7)


Consider the corresponding positions in the tableau b. Since b → word(b) is an An−1 crystal morphism, es · · · e1 (word(b)) = word(es · · · e1 (b)). Let (i1 , j1 ) be the position in the tableau b corresponding to the position p1 in word(b), and analogously define (i2 , j2 ), (i3 , j3 ), and so on. Since the rows of all tableaux (and in particular b, e1 (b), e2 e1 (b), etc.) are weakly increasing and (7.6) holds, it follows that i1 < i2 < i3 < · · · < is . But b has k rows, so s ≤ k. The next goal is to prove (7.7). Suppose first that s < n − 1. In this case the letters 1 and n are undisturbed in passing from e1 (b) to es · · · e1 (b). Using this and the Dynkin diagram rotation it follows that y ⊗ u*0 ) = 0 (e1 (b) ⊗ u ) 0 (es · · · e2 e1 (b) ⊗ = max{0, 0 (e1 (b)) − φ0 (u )} = max{0, 0 (e1 (b)) − φ0 (u ) − 1}.

(7.8)

But φ0 (u ) ≥ 0 (b) ≥ 0 (e1 (b)) − 1 by the fact that 0 (b ⊗ u ) = 0 and Lemma 7.8 point 7.8 applied after rotation of the Dynkin diagram. By (7.8) the desired result (7.7) follows. Otherwise assume s = n − 1. Here k = n − 1 since s ≤ k < n with the inequality holding by the perfectness of B. By (7.6) and the fact that b is a tableau, it must be the case that e1 acting on b changes a 2 in the first row of b into a 1, e2 acting on e1 (b) changes a 3 in the second row of e1 (b) into a 2, etc. Since b is a tableau with n − 1 rows with entries between 1 and n, there are integers 0 ≤ νn−1 ≤ νn−2 ≤ · · · ≤ ν1 < * such that the i th row of b consists of νi copies of the letter i and * − νi copies of the letter i + 1. For tableaux b of this very special form, the explicit formula for e0 in [37, (3.11)] yields 0 (b) = * − mn (b), where mn (b) is the number of occurrences of the letter n in b. Since b = en−1 · · · e1 (b) also has the same form (with νi replaced by νi + 1 for 1 ≤ i ≤ n − 1) and mn (b ) = mn (b) − 1, it follows that 0 (b ) = 0 (b) + 1. We have y ⊗ u*0 ) = 0 (b ⊗ u ) 0 (b ⊗

= max{0, 0 (b ) − φ0 (u )} = max{0, 0 (b) + 1 − (φ0 (u ) + 1)} = 0

since b ∈ H(, B). Finally, assuming * = *, it must be shown that s = k. Since the level of B is the same as that of the weights and , it follows from the perfectness of B that both b and b are uniquely defined by the property that (b) = and (b ) = . Let = n−1 i=0 zi i . By the explicit construction of b in Example 3.3, wt(b) =

n−1 k j =1 i=0

zi (i+j − i+j −1 ) =

n−1

zi (i+k − i )

i=0

with indices taken modulo n. Subtracting the analogous formula for wt(b ), wt(b) − wt(b ) = − kj =1 αj . Using (3.1) it follows that k = s. " # Proof of Theorem 7.3. First observe that x ⊗ b ∈ H( , B ⊗ B , φ(x)) by (3.1), b ∈ H( , B , ), and (x) = . Let c ∈ B and z ∈ B be such that x ⊗ b ∼ = c⊗z under the local isomorphism. Then c ⊗ z ∈ H( , B ⊗ B, φ(x)) which means that z is -restricted. Hence z ∈ H( , B, φ(z)) and c ∈ H(φ(z), B , φ(x)). The former together with the perfectness of B implies that y = z. From the latter it follows that


ψ −k (c) ∈ H( , B , ). However the set H( , B , ) might have multiplicities so it is not obvious why b = ψ −k (c) or equivalently c = ψ k (b). The proof proceeds by an induction that changes the weight to a weight that is “closer to" *0 . Suppose first that there is a root direction i = 0 such that = − i + i−1 . By Lemma 7.11 applied for the weight hi , > 0 and , simple root αi , and element x ⊗ b ∈ H( , B ⊗ B ), there is an 0 ≤ s < n such , B , ), where = − s+i + s+i−1 and that b = ei+s−1 · · · ei+1 ei (b) ∈ H( ei+s−1 · · · ei (x ⊗ b) = x ⊗ b. Applying Lemma 7.11 with , αs+i , and x ∈ H(, B), , B). it follows that x = ek+s+i−1 · · · es+i (x) ∈ H( , B ⊗ B ). The above computations imply ek+s+i−1 · · · ei (x ⊗ b) = x ⊗ b ∈ H( , B ⊗ B) since x ⊗ b → c ⊗ y under We have ek+s+i−1 · · · ei+1 ei (c ⊗ y) ∈ H( the local isomorphism. It must be seen which of these raising operators act on the tensor factor in B and which act in B. By Lemma 7.11 applied with , αi , and c ⊗ y ∈ , B) and that ek+i−1 · · · ei (c⊗ H( , B ⊗B), it follows that y = ek+i−1 · · · ei (y) ∈ H( (1) y) = c⊗ y . Since y ⊗u is an An−1 highest weight vector, the rest of the raising operators es+k−1 · · · ek+i must act on the first tensor factor. Let c = ek+s+i−1 · · · ek+i (c). Then ek+s+i−1 · · · ei (c ⊗ y) = c ⊗ y . But the local isomorphism is a crystal morphism so it sends x ⊗ b → c ⊗ y . By induction c = ψ k ( b). By (3.6) it follows that c = ψ k (b). Otherwise there is no index i = 0 such that hi , > 0. This means = *0 . But the sets H(*0 , B, ) and H(*0 , B , φ(y)) are singletons whose lone elements are given by the An−1 highest weight vectors in B and B respectively. Since B ⊗ B is An−1 multiplicity-free it follows that the sets H(φ(y), B , φ(x)) and H(, B, φ(x)) are singletons. In this case it follows directly that c = ψ k (b) since both c and ψ k (b) are elements of the set H(φ(y), B , φ(x)). " # 7.5. Branching function by restricted generalized Kostka polynomials. 
The appropriate map from LR tableaux to rigged configurations, sends the generalized charge of the LR tableau to the charge of the rigged configuration. Unfortunately in general it is not clear what happens when one uses the statistic coming from the energy function E(b ⊗ y ) but using the path b ⊗ y ⊗ y . It is only known that the statistic E(b ⊗ y ⊗ y ) on the path b ⊗ y ⊗ y is well-behaved. So to continue the computation we require that y = ∅. This is achieved when = * 0 . So let us assume this. The other problem is that we do not consider all paths in H(*0 , B ⊗N ⊗ B , ), but only those of the form b ⊗ y , where y ∈ B is a fixed path. Passing to LR tableaux, this is equivalent to imposing an additional condition that the subtableaux corresponding to the first several rectangles must be in fixed positions. Conjecture 8.3 asserts that the corresponding sets of rigged configurations are well-behaved. The special case that requires no extra work is when B consists of a single perfect crystal. This is achievable when has the form = rs + (* − r)0 ; in this case B = B s,r and y is the sln -highest weight element of B s,r . This is the same as requiring that the first subtableau of the LR tableau be fixed. But this is always the case. Let R (M) consist of the single rectangle (r s ) followed by N = Mn copies of the rectangle (* k ), where B = B k,* . Let λ(M) be the partition of the same size as the total size of R (M) , (M) such that λ projects to − *0 . Then the set of paths H(*0 , B ⊗N ⊗ B s,r , ) is * equal to P−* ,R (M) . This is summarized by 0

kM

−rskM−n* ( 2 ) * b Kλ(M) ,R (M) (q), (q) = lim q M→∞

where is arbitrary, = rs + (* − r)0 , and = * 0 .

(7.9)


Inserting expression (6.7) for the generalized Kostka polynomial in (7.9) and taking the limit yields the following fermionic expression for the branching function: b (q) = q

×

rs(s−n) 1 2n + 2*

n

|λ| 2 j =1 (λj − n )

(−1)|S|+1 q 2 u(S)C 1

−1 ⊗C −1 u(S)

S∈SCST(λ )

q 2 mC⊗C 1

−1 m−mI ⊗C −1 u(S)

m

*−1 n−1 m(a) +n(a) n−1 i

i=1 a=1 i=*

(a) mi

i

a=1

1 , (q)m(a)

(7.10)

*

where λ is any partition which projects to − *0 and u(S) as defined in (6.1). The n−1 (a) (a) sum over m runs over all m = *−1 a=1 mi ei ⊗ ea such that mi ∈ Z and i=1 e*−1 ⊗ ea (I ⊗ C −1 m − C −1 ⊗ C −1 u(S)) −

1 1 λj − |λ| ∈ Z * n a

j =1

(a)

for all 1 ≤ a ≤ n − 1. The variables ni (a)

ni

are given by

= ei ⊗ ea −C ⊗ C −1 m + I ⊗ C −1 (u(s) + er ⊗ es )

for all 1 ≤ a < n and 1 ≤ i < *, i = * . 8. Proof of Theorem 5.7 To prove Theorem 5.7 it clearly suffices to show that there is a bijection ψ R : RLR* (λ; R) → RC* (λ; R) that is charge-preserving, that is, cR (T ) = c(ψ R (T )) for all T ∈ RLR* (λ; R). Here we identify LR(λ; R) with RLR(λ; R) via the standardization bijec : CLR(λ; R) → N by c = c ◦ γ , where c : RLR(λ; R) → tion std. Also define cR R R R R N. It will be shown that one of the standard bijections ψ R : RLR(λ; R) → RC(λ; R) is charge-preserving, and that it restricts to a bijection RLR* (λ; R) → RC* (λ; R). With this in mind let us review the bijections from LR tableaux to rigged configurations. 8.1. Bijections from LR tableaux to rigged configurations. A bijection φ R : CLR(λ; R) → RC(λt ; R t ) was defined recursively in [25, Definition-Proposition 4.1]. It is one of four natural bijections from LR tableaux to rigged configurations: (1) Column index quantum: φ R : CLR(λ; R) → RC(λt ; R t ), R : CLR(λ; R) → RC(λt ; R t ), defined by φ R = (2) Column index coquantum: φ θR t ◦ φ R , (3) Row index quantum: ψ R : RLR(λ; R) → RC(λ; R), defined by ψ R = φ R t ◦ tr, and R : RLR(λ; R) → RC(λ; R), defined by ψ R = θR ◦ ψ R . (4) Row index coquantum: ψ Of these four, the one that is compatible with level-restriction is ψ. First we show that it is charge-preserving. This fact is a corollary of the difficult result [25, Theorem 9.1]. Proposition 8.1. c(ψ R (T )) = cR (T ) for all T ∈ RLR(λ; R).


Proof. Consider the following diagram, which commutes by the definitions and [25, Theorem 7.1]:

[Commutative diagram with vertices RLR(λ; R), CLR(λ; R), CLR(λ^t; R^t), RC(λ^t; R^t) and RC(λ; R), and arrows labelled γ_R^{-1}, tr, tr_LR, φ_R, φ_{R^t}, ψ_R and tr_RC.]

In particular ψ R = tr RC ◦ φ R ◦ γR−1 . Let T ∈ RLR(λ; R) and Q = γR−1 (T ). Then, using tr RC ◦ θR t = θR ◦ tr RC , R (Q))). ψ R (T ) = θR (tr RC (φ R (Q)). Then Let (ν, J ) = tr RC (φ c(ψ R (T )) = c(θR (ν, J )) = ||R|| − cc(ν, J ) R (Q))) = cc(φ R (Q)) = cR (Q) = cR (T ) = ||R|| − cc(tr RC (φ . by Lemma 5.4, (5.6) and [25, Theorem 9.1] to pass from cc to cR

# "

In light of Proposition 8.1, to prove Theorem 5.7 it suffices to establish the following result. Theorem 8.2. The bijection ψ R : RLR(λ; R) → RC(λ; R) restricts to a well-defined bijection ψ R : RLR* (λ; R) → RC* (λ; R). Computer data suggests that the bijection ψ R is not only well-behaved with respect to level-restriction, but also with respect to fixing certain subtableaux. It was argued in Sect. 7.5 that the branching functions can be expressed in terms of generating functions of tableaux with certain fixed subtableaux.t t Let ρ ⊂ λ be partitions, Rρ = ((1ρ1 ), . . . , (1ρn )) and Tρ the unique tableau in RLR(ρ; Rρ ). Define RLR* (λ, ρ; R) to be the set of tableaux T ∈ RLR* (λ; Rρ ∪ R) such that T restricted to shape ρ equals Tρ . Recall the set of rigged configurations RC* (λ, ρ; R) defined in Sect. 5.3. Conjecture 8.3. The bijection ψ R : RLR(λ; R) → RC(λ; R) restricts to a well-defined bijection ψ R : RLR* (λ, ρ; R) → RC* (λ, ρ; R). 8.2. Reduction to single rows. In this section it is shown that to prove Theorem 8.2 it suffices to consider the case where R consists of single rows. Recall the nontrivial embedding iR : LR(λ; R) 7→ LR(λ; r(R)). We identify LR(λ; R) and RLR(λ; R) via std, and therefore have an embedding iR : RLR(λ; R) 7→ RLR(λ; r(R)). Define a map jR : RC(λ; R) → RC(λ; r(R)) as follows. Let (ν, J ) ∈ RC(λ; R). For each rectangle of R having k rows and m columns, add k − j strings (m, 0) of length m and label zero to the rigged partition (ν, J )(j ) for 1 ≤ j ≤ k − 1. The resulting rigged configuration is jR (ν, J ).
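The definition of j_R is purely combinatorial and can be sketched in a few lines of Python. The data layout below is an assumption made for illustration, not the paper's: a rigged configuration is a list whose (j−1)-th entry holds the strings of the j-th rigged partition as (length, label) pairs, and each rectangle of R is given as a pair (k, m) of its number of rows and columns.

```python
def j_R(rigged_configuration, rectangles):
    """Add, for each k x m rectangle, k - j strings (m, 0) to the j-th rigged partition.

    rigged_configuration: list of lists of (length, label) pairs (hypothetical encoding).
    rectangles: list of (rows, columns) pairs describing R.
    """
    result = [list(partition) for partition in rigged_configuration]
    for rows, columns in rectangles:
        for j in range(1, rows):  # 1 <= j <= k - 1
            result[j - 1].extend([(columns, 0)] * (rows - j))
    # Keep the strings of each rigged partition sorted by length, then by label.
    return [sorted(partition, reverse=True) for partition in result]


# One rectangle with 3 rows and 2 columns, three rigged partitions (n = 4).
example = [[(2, 1)], [], []]
image = j_R(example, [(3, 2)])
```

In this example two strings (2, 0) are added to the first rigged partition and one to the second, while the third is unchanged.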


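Proposition 8.4. The following diagram commutes:

$$\begin{array}{ccc}
\mathrm{RLR}(\lambda; R) & \xrightarrow{\; i_R \;} & \mathrm{RLR}(\lambda; r(R)) \\
\downarrow{\scriptstyle \psi_R} & & \downarrow{\scriptstyle \psi_{r(R)}} \\
\mathrm{RC}(\lambda; R) & \xrightarrow{\; j_R \;} & \mathrm{RC}(\lambda; r(R)).
\end{array}$$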

It must be shown that similar diagrams commute in which iR is replaced by either iR< or sp , the maps that occur in the definition of iR . Let jR< : RC(λ; R) → RC(λ; R < ) be defined by adding a string (µ1 , 0) to each of the first η1 − 1 rigged partitions in (ν, J ) ∈ RC(λ; R). Lemma 8.5. jR< is well-defined and the following diagram commutes: iR

, ∂pj ∂p/ 4 4 1 2 E0 p − E0 0 > 1 − √ , so that, since |Eλ (p) − E0 (p)| ≤ constλ , we also have

2

Eλ p − Eλ (0) + Eλ (0) − E0 (0) ≤ 2d + constλ2 . As Eλ (p) is real analytic in p, the ∂ 2 E0 (p) analytic implicit function theorem and Cauchy estimates are used to control ∂pj ∂p/ and the remainder. & ' Proof. For λ = 0, we have

Spectral Analysis Stochastic Lattice Ginzburg–Landau Models


4.2. The ladder approximation. The first part of this subsection is devoted to showing the existence or absence of two-particle bound states in the ladder approximation and follows [15]. We use the mixed coordinates of Eq. (2.7) to analyze the kernels in the BS equation. The kernel of $\tilde D^0_\lambda$ is given by

$$\tilde D^0_\lambda(p, q, k) = \delta(p^0 + q^0)\,\tilde S^{(2)}_\lambda\Big(\frac{k^0}{2} - p^0, \vec p\Big)\,\tilde S^{(2)}_\lambda\Big(\frac{k^0}{2} + p^0, \vec q\Big)\,\delta(\vec p + \vec q - \vec k) + \tilde S^{(2)}_\lambda\Big(\frac{k^0}{2} + p^0, \vec p\Big)\,\tilde S^{(2)}_\lambda\Big(\frac{k^0}{2} - p^0, \vec k - \vec p\Big)\,\delta(p - q). \tag{4.3}$$

Recall that $\tilde D_\lambda(k^0)$ means $\tilde D_\lambda$ taken at zero spatial momentum, i.e., $\tilde D_\lambda((k^0, \vec 0))$. The action of $\tilde D^0_\lambda(k^0)$ on energy independent functions $f(\vec p)$, which depend only on $\vec p$, is

$$(\tilde D^0_\lambda(k^0) f)(\vec p) = (2\pi)^{d+1}\,\tilde S^{(2)}_\lambda\Big(\frac{k^0}{2} + p^0, \vec p\Big)\,\tilde S^{(2)}_\lambda\Big(\frac{k^0}{2} - p^0, \vec p\Big)\,[f(\vec p) + f(-\vec p)]. \tag{4.4}$$

In the ladder approximation, $\tilde K_\lambda$ is replaced by its first order term $\lambda \tilde L$ of Eq. (3.2), which is local in time and so

$$\tilde L(p, q, k) = -\frac{3}{4}\,a_2\,[E_0(\vec p) + E_0(\vec q) + E_0(\vec p - \vec k) + E_0(\vec q - \vec k)],$$

i.e., its Fourier transform does not depend on $p^0$, $q^0$ and $k^0$. Hence, at zero total spatial momentum $\vec k$,

$$\tilde L(p, q, (k^0, \vec 0)) = -\frac{3}{2}\,a_2\,[E_0(\vec p) + E_0(\vec q)],$$

which shows that the operator $\tilde L(k^0, \vec 0)$ has rank two (in a scalar local field theory the rank is one). Solving the Bethe–Salpeter equation (2.9) for $\tilde D_\lambda$, in the ladder approximation, yields

$$\tilde D_\lambda(k^0) = \big[1 - (2\pi)^{-2(d+1)} \lambda \tilde D^0_\lambda(k^0) \tilde L(k^0)\big]^{-1} \tilde D^0_\lambda(k^0) = \tilde D^0_\lambda(k^0)\,\big[1 - (2\pi)^{-2(d+1)} \lambda \tilde L_\lambda(k^0) \tilde D^0_\lambda(k^0)\big]^{-1} \tag{4.5}$$

with all quantities taken at zero spatial momentum as in (4.4). The action of $\tilde L_\lambda \tilde D^0_\lambda$ is given by

$$\tilde L_\lambda(k^0) \tilde D^0_\lambda(k^0) f(p) = -3a_2 (2\pi)^{d+1} \int [E_0(\vec p) + E_0(\vec q)]\,\tilde S_\lambda\Big(\frac{k^0}{2} - q^0, \vec q\Big)\,\tilde S_\lambda\Big(\frac{k^0}{2} + q^0, \vec q\Big)\,[f(-q) + f(-q^0, \vec q)]\,dq. \tag{4.6}$$

Hence, if the test function $f$ depends only on $\vec p$, we have

$$(\tilde L_\lambda(k^0) \tilde D^0_\lambda(k^0) f)(\vec p) = -3a_2 (2\pi)^{d+1}\,[\rho_0(f) + \rho_1(f) E_0(\vec p)],$$


P. A. Faria da Veiga, M. O’Carroll, E. Pereira, R. Schor

where

$$\rho_n(f) = \frac{1}{2} \int_{T^d} G(\vec q, k^0)\,E_0(\vec q)^{\delta_{0n}}\,[f(\vec q) + f(-\vec q)]\,d\vec q; \qquad n = 0, 1,$$

$$G(\vec q, k^0) = \int_{-\infty}^{\infty} \tilde S^{(2)}_\lambda(q)\,\tilde S^{(2)}_\lambda(k^0 - q_0, \vec q)\,dq_0.$$

It follows from (4.1) and from a simple analytic continuation argument that $G(\vec q, k^0)$ is analytic on $\mathrm{Im}\,k^0 < 2E_\lambda(\vec 0)$. This result depends on the fact that $E_\lambda(\vec 0) \le E_\lambda(\vec p)$ for any $\vec p \in T^d$, proven in Proposition 4.2. Recall, from (2.8), that the basic object we want to analyze is $(f, \tilde D_\lambda(k^0) f)$, which has the form

$$(f, \tilde D_\lambda(k^0) f) = 2(2\pi)^{d+1} \int_{T^d} f(\vec p)\,G(\vec p, k^0)\,g(\vec p, k^0)\,d\vec p, \tag{4.7}$$

where

$$g(\cdot, k^0) = \big[1 - (2\pi)^{-2(d+1)} \lambda \tilde L(k^0) \tilde D^0_\lambda(k^0)\big]^{-1} f(\cdot).$$

The only singularities of (4.7) on $\mathrm{Im}\,k^0 < 2E_\lambda(\vec 0)$ must come from those of $g(\cdot, k^0)$. But, in turn, these come from the zeroes of $1 - \mu_\pm(k^0)$, where $\mu_\pm(k^0)$ are the eigenvalues of $(2\pi)^{-2(d+1)} \lambda \tilde L(k^0) \tilde D^0_\lambda(k^0)$ on the space generated by the functions $1$ and $E_0(\vec p)$. We find

$$\mu_\pm(k^0) = -3a_2 (2\pi)^{-(d+1)} \lambda\,\big[\alpha(k^0) \pm (\beta(k^0)\gamma(k^0))^{1/2}\big] \tag{4.8}$$

with the eigenfunction corresponding to $\mu_+$ given by

$$\psi_+(\vec p) = 1 + \Big(\frac{\beta}{\gamma}\Big)^{1/2} E_0(\vec p),$$

where

$$\alpha(k^0) = \int_{T^d} E_0(\vec q)\,G(\vec q, k^0)\,d\vec q, \qquad \beta(k^0) = \int_{T^d} G(\vec q, k^0)\,d\vec q, \qquad \gamma(k^0) = \int_{T^d} E_0(\vec q)^2\,G(\vec q, k^0)\,d\vec q. \tag{4.9}$$
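As a consistency check on (4.8), the eigenvalues can be recovered from the rank-two action on the span of 1 and E_0. The short derivation below is a sketch under two assumptions that are not stated in the text: E_0 is even in its argument, so the symmetrization in ρ_n acts trivially on 1 and E_0, and γ > 0. The abbreviations c and T are introduced here for convenience only.

```latex
% Abbreviations (not in the original):
%   c = -3 a_2 (2\pi)^{-(d+1)} \lambda,
%   T = (2\pi)^{-2(d+1)} \lambda \tilde L(k^0) \tilde D^0_\lambda(k^0).
% Assumptions: E_0(-\vec q) = E_0(\vec q) and \gamma > 0.
\begin{align*}
(Tf)(\vec p) &= c\,[\rho_0(f) + \rho_1(f)\,E_0(\vec p)],\\
\rho_0(1) &= \alpha, \quad \rho_1(1) = \beta, \quad
\rho_0(E_0) = \gamma, \quad \rho_1(E_0) = \alpha,\\
T\,1 &= c\,(\alpha + \beta E_0), \qquad T\,E_0 = c\,(\gamma + \alpha E_0),\\
0 &= \det\begin{pmatrix} c\alpha - \mu & c\gamma \\ c\beta & c\alpha - \mu \end{pmatrix}
   = (c\alpha - \mu)^2 - c^2 \beta\gamma
   \;\Longrightarrow\; \mu_\pm = c\,[\alpha \pm (\beta\gamma)^{1/2}],\\
T(1 + x E_0) &= c\,[(\alpha + x\gamma) + (\beta + x\alpha) E_0] = \mu_+ (1 + x E_0)
   \;\Longleftrightarrow\; x = (\beta/\gamma)^{1/2}.
\end{align*}
```

The last line reproduces the eigenfunction for μ_+, since xγ = (βγ)^{1/2} and β + xα = x(α + (βγ)^{1/2}).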

Now, from (4.1), $G(\vec q, k^0)$ can be written as

$$G(\vec q, k^0) = \frac{\pi\,c_\lambda(\vec q)^2}{2 E_\lambda(\vec q)\,\big[E_\lambda(\vec q)^2 + \frac{1}{4}(k^0)^2\big]} + G_1(\vec q, k^0),$$

where $G_1(\vec q, k^0)$ is analytic on $\mathrm{Im}\,k^0 < E_\lambda(\vec 0) + 2M_0$. From general principles, the singularities of (4.7) can only be located on the imaginary $k^0$ axis. Writing $k^0 = i\kappa$ with $\kappa \ge 0$ and using (4.1), one can show that $G(\vec q, i\kappa) > 0$


It follows then that α(iκ), β(iκ) and γ (iκ) are positive and, by for 0 ≤ κ < 2Eλ (0). Cauchy-Schwarz’s inequality, α ≤ [βγ ]1/2 on 0 ≤ κ < 2Eλ (0). For space dimension d ≥ 3, then α(iκ), β(iκ) and γ (iκ) increase to a finite limit as because the singularity generated by G( κ → 2Eλ (0) q , iκ) is quadratic and therefore integrable. Thus, if λ is small enough, 1 − µ± (iκ) cannot be zero on 0 < κ < 2Eλ (0) so that, in the ladder approximation, there are no bound states. but α − [βγ ]1/2 remains finite. This If d < 3, α, β and γ diverge as κ → 2Eλ (0), yields the nonvanishing of 1 − µ− (iκ) . Finally, 1 − µ+ (iκ) is nonzero if a2 > 0, if a2 < 0. This implies the and has a unique zero on the interval 0 < κ < 2Eλ (0), existence of a single bound state for the later case. be the mass for a single quasiparticle in the interacting theLet Mλ = Eλ (0) ory. The mass ML of the bound state, in the ladder approximation, is the solution of (assuming a2 < 0) F (λ, iML ) = −(2π )d+1 /3a2 λ, where F (λ, k 0 ) = α(λ, k 0 ) + [β(λ, k 0 )γ (λ, k 0 )]1/2 , and we have made explicit the λ dependence of α, β and γ . Let E = 2Mλ − ML . Performing an asymptotic analysis of the coefficients α, β, and γ we find 9 λ2 2 a [1 + O(λ)] ; if d = 1 4 m4 2 (4.10) E(λ) = 4π m2 exp − [1 + O(λ)] ; if d = 2. 3 |a2 | λ To go beyond the ladder approximation, let us introduce some function spaces. We define a weighted Hardy space Hδ (see [3, 21]) as functions f analytic in the strip | Imp j |< δ1 such that f (p) = f (−p), with norm given by, with α = α 0 , α , | w(p + iα)f (p + iα) |2 dp, sup f 2δ = | Imp 0 |< δ0 ;

|α0 | M0 . For | q 0 |≤ M0 , w(q)−1 Bδ (q −q )w(q )−1 is clearly bounded so that we have the required bound c$(κ). For | q 0 |> M0 , write q 0 = (q 0 − q 0 ) + q 0 , so that

2α

α α w(q)−1 = (q 0 )2 + 16Mλ2 ≤ 2 | q 0 − q 0 |2α +2 q 0 + 16Mλ2 , using the /p triangle inequality with p = α −1 . As r00 κ, q is O (q 0 )−4 , the result follows. & ' Let H∗ be the dual space to H, determined by the L2 inner product. We have Lemma 4.6. R0λ : H → H∗ is analytic in 0 < Reκ < 2Mλ , |Imκ| < Mλ and with norm bounded by c$(κ)2 , c > 0. Proof. From Eq. (4.3),

(g, R0λ f )2 ≤ sup S˜λ p 0 + iκ S˜λ p 0 − iκ w(p)−2 w(p) |g(p)f (p)| dp, 2 2 p and using (4.1) the result follows.

' &


4.4. Complete model: Existence of bound states. For the complete model, following [2], here we show the existence of mass spectrum in the interval $\kappa \in (0, 2M_\lambda)$ when $d < 3$ and $a_2 < 0$. We will prove there is a unique bound state near the ladder bound state $M_L$. In the next subsection, absence of bound states in $(0, 2M_\lambda)$ will be proven both for $d \ge 3$ and $a_2 < 0$ and for $a_2 > 0$. Essentially, this is done by showing the existence or absence of an eigenvalue 1 of $K_\lambda(\kappa) R^0_\lambda(\kappa)$. Multiplicity one is checked for the former case. Before we go to the technical details, we give a description of the strategy employed in both cases. For the repulsive case $a_2 > 0$ and for the attractive case $a_2 < 0$ and $d \ge 3$, with $K_\lambda = \lambda L + \lambda^2 K^{(2)}_\lambda$, we write

$$D_\lambda = D_L + D_\lambda \lambda^2 K^{(2)}_\lambda D_L = D_L \big[1 - \lambda^2 K^{(2)}_\lambda D_L\big]^{-1},$$

where

$$D_L = D^0_\lambda + \lambda D_L L D^0_\lambda = D^0_\lambda \big[1 - \lambda L D^0_\lambda\big]^{-1}.$$

Using an explicit representation for $D_L$, we show that $D_L$ has no singularities in $(0, 2M_\lambda)$, and also that $K^{(2)}_\lambda D_L$ has norm less than one in $(0, 2M_\lambda)$. Hence, the resolvent $\big[1 - \lambda^2 K^{(2)}_\lambda D_L\big]^{-1}$ is well defined by its Neumann series and $D_\lambda$ does not have singularities in $(0, 2M_\lambda)$. For the attractive case $a_2 < 0$ and $d < 3$, in order to show existence of a bound state we write

$$D_\lambda = D^0_\lambda \big[1 - K_\lambda D^0_\lambda\big]^{-1}$$

and consider the family of compact operators, $\mu \in \mathbb{C}$, defined by $T_\lambda(\mu, \kappa) = -\lambda T_1(\kappa) + \mu T_2(\kappa)$, where $T_1$ and $T_2$ are defined in (4.16). We remark that $\mu = \lambda^2$ corresponds to the value of interest (the physical one), that is [see (4.15)] $T_\lambda(\lambda^2, \kappa) = K_\lambda(\kappa) R^0_\lambda(\kappa)$. This family is shown to be compact and jointly analytic in $\kappa$ and $\mu$, for $0 < \mathrm{Re}\,\kappa < 2M_\lambda$ and $|\mu| < 2\lambda^2$. Without further analysis, the analytic Fredholm theory implies that $\big[1 - K_\lambda D^0_\lambda\big]^{-1}$ exists, except for $\kappa$ in a discrete set. As $D^0_\lambda$ is not singular in the same domain, it follows that the mass spectrum is discrete in $(0, 2M_\lambda)$. However, we show more. The point $\mu = 0$ is called the ladder approximation which was solved explicitly in Subsect. 4.2, and leads to a bound state at some $\kappa = \kappa_L \in (0, 2M_\lambda)$. This is the only mass spectral point in $(0, 2M_\lambda)$. As $\mu T_2$ is an analytic perturbation, it is shown that there is an isolated bound state of multiplicity one at $\kappa_b \in (0, 2M_\lambda)$, where $\kappa_b$ lies in the interval $|\kappa_b - \kappa_L| \le \frac{1}{2} b\lambda^2$, for $b$ sufficiently small, uniform in $\lambda$, such that $\kappa_b$ is the unique mass spectral point in the interval. For $\kappa$ in the intervals $\big(0, \kappa_L - \frac{1}{2} b\lambda^2\big)$ or $\big(\kappa_L + \frac{1}{2} b\lambda^2, 2M_\lambda - \lambda^{5/2}\big)$, the mass spectrum is excluded by showing that $\big[1 - K_\lambda D^0_\lambda\big]^{-1}$ exists. Thus, as $D^0_\lambda$ is not singular, the same holds for $D = D^0_\lambda \big[1 - K_\lambda D^0_\lambda\big]^{-1}$. For $\kappa$ near $M_L$, the resolvent $(-\lambda T_1(\kappa) - w)^{-1}$ of $-\lambda T_1(\kappa)$ is constructed explicitly and $\mu T_2(\kappa)$ is shown to be an analytic perturbation to this ladder operator. The resolvent $(T_\lambda(\mu, \kappa) - w)^{-1}$ is defined through its Neumann series and is shown to exist for $w$ in


the complement of | w |−1 , with | w |−1 < 4. This means that the spectrum of Tλ (µ, κ) is contained in | w |≤ 1/4, | w − 1 |≤ 1/4. Consequently, by analytic perturbation theory, there is a unique multiplicity one eigenvalue αλ (µ, κ) of Tλ (µ, κ) which is analytic both in κ and µ, and satisfies αλ (0, κ) = 1. However, we do not know that for real µ > 0 and small the eigenvalue takes the value one. To show that indeed it does, we compute the derivative [∂αλ /∂κ] (0, κ) (see Lemma 4.11), which is large positive for small λ. This is shown to be the dominating contribution to [∂αλ /∂κ] (µ, κ). Thus, for small real µ, αλ (µ, κ) is strictly monotone increasing in µ. In this way, we show: Lemma 4.7. Let µ and κ be real. For | µ |< 2λ2 and c sufficiently small, there is a unique κ = κλ (µ) in | κ − ML |≤ 21 cλ2 such that αλ (µ, κλ (µ)) = 1. Remark 4.8. Recall that µ = λ2 is the physical value of interest so that αλ λ2 , κλ (λ2 ) = 1 is the eigenvalue of Tλ (λ2 , κ) = Kλ (κ)R0λ (κ), where κ = κλ (λ2 ) ≡ Mb is the bound state mass given in (1.5). In order for the analysis of [2, Lemmas 2.7–2.11] to go through, it suffices to show the two lemmas below Lemma 4.9. Let µ± be defined as in (4.8). Then, for some positive c, 1 1 1 1 1 −1 ≤ c max , , Tλ (0, ML ) . [w − Tλ (0, ML )] w w − 1 w w − 1 w − µ− (ML ) Remark 4.10. We recall, for κ = ML , that µ+ (ML ) = 1 and | µ− (ML ) |≤ c | λ |. Note that the ladder bound state satisfies αλ (0, ML ) = 1. Lemma 4.11. For κ such that | κ − ML |≤ 21 cλ2 , with a sufficiently small c > 0, set αλ (0, κ) = ρ (α + βγ )1/2 . Then, there exist positive constants c1 and c2 such that ∂αλ (0, κ) ≥ λc1 $(κ)3 ≥ c2 λ−2 , for $(κ) as defined in (4.13). ∂κ Proof of Lemma 4.9. Using the representation (4.12), the resolvent [w − Tλ (0, ML )]−1 is bounded using Lemma 4.5. & ' Proof of Lemma 4.11. From the representations for α, β and γ [see (4.9)], we see that they are all strictly positive as well as their κ derivatives. 
From the bounds of Lemma 4.4, ∂αλ it follows that ' (0, κ) ≥ λc1 $(κ)3 . & ∂κ 4.5. Complete model: Absence of bound states. Here, considering the complete model and using the strategy described in Sect. 4.4, we show the absence of mass spectrum in (0, 2Mλ ) in the two-particle sector for the repulsive case a2 > 0 and d < 3, as well as for d ≥ 3. A variant of the method is used to complete the proof that excludes spectrum between the bound state Mb and the two-particle threshold 2Mλ . We treat the repulsive case following the method of [19] and the attractive one following [2]. As before, the λ dependence is omitted unless deemed necessary. To control the spectrum, we treat D = D 0 + DKD 0 as a perturbation about the ladder approximation. For this, we set DL for the (λ dependent) D solution of the ladder BS equation, that is, DL = D 0 + λDL LD 0 .

(4.18)


Then, D is given by

$$D = D_L + D \lambda^2 K^{(2)} D_L = D_L \big[1 - \lambda^2 K^{(2)} D_L\big]^{-1}.$$
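The algebra behind this identity, and the role of the Neumann series when the perturbation has norm less than one, can be illustrated in finite dimensions. The Python sketch below replaces the operators by small matrices; the entries of `D_L` and `K2` and the value of `lam` are arbitrary illustrative numbers, not quantities from the model.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_inverse(m, terms=200):
    """Partial sum I + M + M^2 + ..., which approximates (I - M)^{-1} when ||M|| < 1."""
    n = len(m)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, m)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


# Toy stand-ins for D_L, K^(2) and lambda (illustrative values only).
D_L = [[1.0, 0.2], [0.2, 0.5]]
K2 = [[0.3, -0.1], [-0.1, 0.4]]
lam = 0.5

# M = lambda^2 K^(2) D_L has small norm here, so the Neumann series converges.
M = [[lam ** 2 * x for x in row] for row in matmul(K2, D_L)]
D = matmul(D_L, neumann_inverse(M))

# Residual of the fixed-point form D = D_L + D * lambda^2 K^(2) D_L.
DM = matmul(D, M)
residual = max(abs(D[i][j] - D_L[i][j] - DM[i][j])
               for i in range(2) for j in range(2))
```

The residual is at the level of rounding error, which reflects that D = D_L (1 − M)^{-1} solves the fixed-point equation whenever the series converges.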

In the repulsive case a2 > 0, we show that DL has no singularity in 0 < κ < 2Mλ , and that the bound state of mass Mb is isolated with isolation radius rb . We now show that there is no spectrum in (Mb + rb , 2Mb ) again by showing that K (2) DL has norm less than one in this interval. The starting point of the analysis is an explicit representation for DL . Using (4.6) and (4.17) in (4.18), and suppressing the κ dependence, gives q )X(p), DL (p, q) = r0 (p)δ(p + q) − 3λa2 r0 (q)Y (p) − 3λa2 r(q)E0 (

where X(p) =

(4.19)

DL (p, q)dq

,

Y (p) =

DL (p, q)E0 ( q )dq.

Multiplying (4.19) by the function 1 and E0 ( q ), integrating over q and solving for X(p) and Y (p), leads to 3λa2 DL (p, q) = r0 (p)δ(p + q) − + E0 ( q ) + 3λa2 E0 (p) D × (E0 (p) + E0 ( q )) α − γ − βE0 (p)E 0 ( q) ≡ r0 (p)δ(p + q) + c(p, q)r0 (p)r0 (q), where D = D(w = 1) [see (4.11)], that is

D = (1 − µ+ ) (1 − µ− ) = 1 + 3λa2 α + (3λa2 )2 α 2 − βγ .

(4.20)

To establish our result, it is sufficient to use the bound of Lemma 4.3 for the Hilbert– (2) Schmidt norm of λ2 Kλ R, with, following (4.14), R(κ) ≡ D˜ L (κ), and the bound (uniformly in p )

I ≡ R(p, q)f (p)g(q)dpdq ≤ O λ−1 w(q ) . (4.21)

In (4.21), suppressing the p and q dependence, (2)

f (p) = Kλ (κ, p + iδ, p); As the κ behavior of

J ≡

g(q) = w(q)−1 Bδ (q − q).

D˜ L (p, q)dpdq = β/D

(4.22)

(4.23)

is easily controlled (see Lemma 4.14 below), it is convenient to write I of (4.21) as I= r0 (p) [f (p)g(q) − f (0)g(0)] dpdq + [f (p) − f (0)] g(0)c(p, q)r0 (p)r0 (q)dpdq + f (p) [g(p) − g(0)] c(p, q)r0 (p)r0 (q)dpdq + Jf (0)g(0) ≡ X1 + X2 + X3 + X4 .


The terms X2 and X3 are bounded by a combination of the methods used for bounding X1 and X4 (see [2]). We now bound X4 . Following [19], we write [h(p) ≡ f (p)g(p)] − h(0) as h(p) − h(0) = h(p) − h(0, p) + h(0, p) − h(0) ≡ δh1 (p) + δh2 (p). (4.24) The δh1 (p) and δh2 (p) terms are bounded in the lemmas below. Lemma 4.12. Recalling the definitions given in (4.22) and (4.24), the bound

δh1 (p)r00 (κ, p)dp ≤ O λ−1 w

is satisfied. Proof. Write [see (2.2) and (4.3)] o −1

r00 (κ, p) = (iκp )

−1 p/2 m2 κ2 + + (p ) − iκp − 4 2 2 −1 2 p/2 m2 κ 0 2 0 . + + − (p ) + iκp − 4 2 2

0 2

0

The singularity in p0 at zero is cancelled and for the first (second) terms we make the contour shift p 0 → p 0 ± iδ0 , with δ0 < δ0 . Thus, the denominators become 2 2 2 2 p2 (p0 )2 ± 2i δ0 p 0 ∓ κp 0 + κδ0 − δ02 + κ4 + m2 + 2/ , which is zero for κ4 = m2 + 2 √ m2 p/ 0 2 + δ0 κ − δ0 > 2 for δ0 < κ. Thus, κ > 2m. Hence, we have no p singularity 0 3 and the rest of the bound is carried out using the 1/(p ) falloff of the term of r00 (κ, p) as in the proof of Lemma 4.5. & '

Lemma 4.13. The bound δh2 (p)r00 (κ, p)dp

≤ O λ−1 w(q ) holds. Proof. Writing h(0, p) − h(0) = p.∇ u h(0, u ) |u=0 +

1

t

dt

0

dt

0

∂2 h(0, t p) ∂t 2

and doing the p0 integration, we get

cλ (p) 2 pj pk Eλ (p) 4Eλ (p )2 − 4Eλ (0 )2 + 4Mλ2 − κ 2

1 0

t

dt 0

dt

∂2 h(0, u = t p)d p, ∂uj uk

where the p terms integrate to zero by parity. The integral over p is finite for 0 < κ < 2Mλ . Concerning the derivatives of h(0, p), with respect to p j , we see that they are (2) (2) bounded by Bδ and those of Kλ . Using the analyticity of Kλ , the derivatives are uniformly bounded. Proceeding as in the proof of Lemma 4.5, the bound is completed. ' &


Lemma 4.14. There exist positive constants c1 , c2 , c3 and c4 such that i) For a2 > 0, and uniformly for 0 < κ < 2Mλ , −1 J ≤ c1 $(κ) 1 + 3λa2 c2 $(κ) − c3 λ2 . ii) For a2 < 0, and uniformly for 2Mλ − λ5/2 κ < 2Mλ , λJ ≤ c4 . Proof. i) From (4.20) and (4.23), we get 2 −1 J = β 1 − µ+ 1 − µ− = 1 + 3λa2 α + 3λa2 α 2 − βγ . The first bound follows from α, β, γ < c$(κ) but α 2 −βγ < 0, by the Cauchy-Schwarz inequality. However, by separating out the constant term in the numerators of α, β and γ , the p/2 singularity in the denominator is cancelled and α 2 − βγ < c uniformly in 0 < κ < 2Mλ . For ii) see Sect. 3 of [2]. & ' 5. Concluding Remarks We have determined the low-lying e − m spectrum for dynamic stochastic lattice Landau–Ginzburg models with small polynomial interaction and such that the equilibrium state is in the single phase region. The determination of the spectrum for models with equilibrium states in the multi-phase region is of interest. Also the question of the effect of large noise on the spectrum is relevant and is currently being investigated [13]. References 1. Dimock, J.: A Cluster Expansion for Stochastic Lattice Fields. J. Stat. Phys. (1990)

58, 1181–1207
2. Dimock, J., Eckmann, J.-P.: On the Bound State in Weakly Coupled λ(φ^6 − φ^4)_2 Models. Commun. Math. Phys. 51, 41–54 (1976)
3. Duren, P.: Theory of H^p Spaces. Pure and Applied Mathematics Vol. 38, New York: Academic Press, 1970
4. Glimm, J., Jaffe, A.: Quantum Physics: A Functional Integral Point of View. New York: Springer Verlag, 1986
5. Gammaitoni, L., Hanggi, P., Jung, P., Marchesoni, F.: Stochastic Resonance. Rev. Mod. Phys. 70, 223–287 (1998)
6. Hohenberg, P. C., Halperin, B. I.: Theory of Dynamic Critical Phenomena. Rev. Mod. Phys. 49, 435–479 (1977)
7. Horsthemke, W., Lefever, R.: Noise-induced Transitions. Berlin: Springer Verlag, 1984
8. Itzykson, C., Zuber, J.-B.: Quantum Field Theory. New York: McGraw-Hill, 1980
9. Jona-Lasinio, G., Mitter, P. K.: On the Stochastic Quantization of Field Theory. Commun. Math. Phys. 101, 409–436 (1985)
10. Jona-Lasinio, G., Sénéor, R.: Study of Stochastic Differential Equations by Constructive Methods I. J. Stat. Phys. 83, 1109–1148 (1996)
11. Kondratiev, Yu. G., Minlos, R. A.: One-Particle Subspaces in the Stochastic XY Model. J. Stat. Phys. 87, 613–642 (1997)
12. Minlos, R. A., Suhov, Y. M.: On the Spectrum of the Generator of an Infinite System of Interacting Diffusions. Commun. Math. Phys. 206, 463–489 (1999)
13. Pereira, E.: Noise Induced Bound States. Phys. Lett. A 282, 169–174 (2001)
14. Reed, M., Simon, B.: Analysis of Operators. Modern Methods of Mathematical Physics Vol. IV, New York: Academic Press, 1978
15. Schor, R., Barata, J. C. A., Faria da Veiga, P. A., Pereira, E.: Spectral Properties of Weakly Coupled Landau-Ginzburg Stochastic Models. Phys. Rev. E 59, Issue 3, 2689–2694 (1999)


16. Schor, R., O’Carroll, M.: Decay of the Bethe–Salpeter Kernel and Absence of Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature. J. Stat. Phys. 99, 1207–1223 (2000); Transfer Matrix Spectrum and Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature. J. Stat. Phys. 99, 1265–1279 (2000)
17. Simon, B.: Statistical Mechanics of Lattice Models. Princeton, NJ: Princeton University Press, 1994
18. Spencer, T.: The Decay of the Bethe–Salpeter Kernel in P(ϕ)_2 Quantum Field Models. Commun. Math. Phys. 44, 143–164 (1975)
19. Spencer, T., Zirilli, F.: Scattering States and Bound States in λP(φ)_2 Models. Commun. Math. Phys. 49, 1–16 (1976)
20. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer Verlag, 1991
21. Stein, E.M.: Harmonic Analysis. Princeton, NJ: Princeton University Press, 1993
22. Zhizhina, E.A.: Two-Particle Spectrum of the Generator for Stochastic Model of Planar Rotators at High Temperature. J. Stat. Phys. 91, 343–366 (1998)
23. Zinn-Justin, J.: Quantum Field Theory and Critical Phenomena. Oxford: Oxford University Press, 1993

Communicated by Ya. G. Sinai

Commun. Math. Phys. 220, 403 – 428 (2001)


© Springer-Verlag 2001

Global Properties of Gravitational Lens Maps in a Lorentzian Manifold Setting

Volker Perlick

Albert Einstein Institute, 14476 Golm, Germany. E-mail: [email protected]

Received: 16 October 2000 / Accepted: 18 January 2001

Abstract: In a general-relativistic spacetime (Lorentzian manifold), gravitational lensing can be characterized by a lens map, in analogy to the lens map of the quasi-Newtonian approximation formalism. The lens map is defined on the celestial sphere of the observer (or on part of it) and it takes values in a two-dimensional manifold representing a two-parameter family of worldlines. In this article we use methods from differential topology to characterize global properties of the lens map. Among other things, we use the mapping degree (also known as Brouwer degree) of the lens map as a tool for characterizing the number of images in gravitational lensing situations. Finally, we illustrate the general results with gravitational lensing (a) by a static string, (b) by a spherically symmetric body, (c) in asymptotically simple and empty spacetimes, and (d) in weakly perturbed Robertson–Walker spacetimes.

Permanent address: TU Berlin, Sekr. PN 7-1, 10623 Berlin, Germany. E-mail: [email protected]

1. Introduction

Gravitational lensing is usually studied in a quasi-Newtonian approximation formalism which is essentially based on the assumptions that the gravitational fields are weak and that the bending angles are small; see Schneider, Ehlers and Falco [1] for a comprehensive discussion. This formalism has proven to be very powerful for the calculation of special models. In addition, it has also been used for proving general theorems on the qualitative features of gravitational lensing, such as the possible number of images in a multiple imaging situation. As to the latter point, it is interesting to inquire whether the results can be reformulated in a Lorentzian manifold setting, i.e., to inquire to what extent the results depend on the approximations involved.

In the quasi-Newtonian approximation formalism one considers light rays in Euclidean 3-space that go from a fixed point (observer) to a point that is allowed to vary over a 2-dimensional plane (source plane). The rays are assumed to be straight lines with the only exception that they may have a sharp bend at a 2-dimensional plane (deflector plane) that is parallel to the source plane. (There is also a variant with several deflector planes to model deflectors which are not “thin”.) For each concrete mass distribution, the deflecting angles are to be calculated with the help of Einstein’s field equation, or rather of those remnants of Einstein’s field equation that survive the approximations involved. Hence, at each point of the deflector plane the deflection angle is uniquely determined by the mass distribution. As a consequence, following light rays from the observer into the past always gives a unique “lens map” from the deflector plane to the source plane. There is “multiple imaging” whenever this lens map fails to be injective.

In this article we want to inquire whether an analogous lens map can be introduced in a spacetime setting, without using quasi-Newtonian approximations. According to the rules of general relativity, a spacetime is to be modeled by a Lorentzian manifold $(M, g)$ and the light rays are to be modeled by the lightlike geodesics in $M$. We shall assume that $(M, g)$ is time-oriented, i.e., that the timelike and lightlike vectors can be distinguished into future-pointing and past-pointing in a globally consistent way. To define a general lens map, we have to fix a point $p \in M$ as the event where the observation takes place and we have to look for an analogue of the deflector plane and for an analogue of the source plane. As to the deflector plane, there is an obvious candidate, namely the celestial sphere $S_p$ at $p$. This can be defined as the set of all one-dimensional lightlike subspaces of the tangent space $T_p M$ or, equivalently, as the totality of all light rays issuing from $p$ into the past. As to the source plane, however, there is no natural candidate.
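As a concrete illustration of a quasi-Newtonian lens map that fails to be injective, the following sketch uses the standard lens equation of a point mass, restricted to one axis through the lens: an image at angle θ on the deflector plane corresponds to a source at β = θ − θ_E²/θ. This equation, the Einstein angle `theta_e`, and the function names are assumptions introduced here for illustration; they are not taken from this article.

```python
import math

def point_lens_map(theta, theta_e=1.0):
    """Quasi-Newtonian lens map of a point mass (illustrative assumption).

    Maps an image position theta on the deflector plane to the source
    position beta on the source plane. Both are angles along one axis
    through the lens, in the same units as theta_e.
    """
    if theta == 0.0:
        raise ValueError("the ray through the point mass itself is excluded")
    return theta - theta_e**2 / theta

def images_of(beta, theta_e=1.0):
    """Return all theta with point_lens_map(theta, theta_e) == beta."""
    # theta - theta_e^2/theta = beta  <=>  theta^2 - beta*theta - theta_e^2 = 0
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return ((beta + root) / 2.0, (beta - root) / 2.0)

if __name__ == "__main__":
    beta = 0.5
    for theta in images_of(beta):
        print(f"theta = {theta:+.4f}  ->  beta = {point_lens_map(theta):+.4f}")
```

In this toy model every source position has two distinct preimages, one on each side of the lens, so the lens map is not injective and there is multiple imaging in the sense described above.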
Following Frittelli, Newman and Ehlers [2–4], one might consider any timelike 3-dimensional submanifold $T$ of the spacetime manifold as a substitute for the source plane. The idea is to view such a submanifold as ruled by worldlines of light sources. To make this more explicit, one could restrict to the case that $T$ is a fiber bundle over a two-dimensional manifold $N$, with fibers timelike and diffeomorphic to $\mathbb{R}$. Each fiber is to be interpreted as the worldline of a light source, and the set $N$ may be identified with the set of all those worldlines. In this situation we wish to define a lens map $f_p : S_p \to N$ by extending each light ray from $p$ into the past until it meets $T$ and then projecting onto $N$. In general, this prescription does not give a well-defined map since neither existence nor uniqueness of the target value is guaranteed. As to existence, there might be some past-pointing lightlike geodesics from $p$ that never reach $T$. As to uniqueness, one and the same light ray might intersect $T$ several times. The uniqueness problem could be circumvented by considering, on each past-pointing lightlike geodesic from $p$, only the first intersection with $T$, thereby willfully excluding some light rays from the discussion. This amounts to ignoring every image that is hidden behind some other image of a light source with a worldline $\xi \in N$. For the existence problem, however, there is no general solution. Unless one restricts to special situations, the lens map will be defined only on some subset $D_p$ of $S_p$ (which may even be empty). Also, one would like the lens map to be differentiable or at least continuous. This is guaranteed if one further restricts the domain $D_p$ of the lens map by considering only light rays that meet $T$ transversely. Following this line of thought, we give a precise definition of lens maps in Sect. 2.
We will be a little bit more general than outlined above insofar as the source surface need not be timelike; we also allow for the limiting case of a lightlike source surface. This has the advantage that we may choose the source surface “at infinity” in the case of an asymptotically simple and empty spacetime. In Sect. 3 we briefly discuss some general properties of the caustic of the lens map. In Sect. 4 we introduce the mapping degree (Brouwer degree) of the lens map as an important tool from differential topology.


This will then give us some theorems on the possible number of images in gravitational lensing situations, in particular in the case that we have a “simple lensing neighborhood”. The latter notion will be introduced and discussed in Sect. 5. We conclude by applying the general results to some examples in Sect. 6. Our investigation will be purely geometrical in the sense that we discuss the influence of the spacetime geometry on the propagation of light rays but not the influence of the matter distribution on the spacetime geometry. In other words, we use only the geometrical background of general relativity but not Einstein’s field equation. For this reason the “deflector”, i.e., the matter distribution that is the cause of gravitational lensing, never explicitly appears in our investigation. However, information on whether the deflectors are transparent or non-transparent will implicitly enter into our considerations.

2. Definition of the Lens Map

As a preparation for precisely introducing the lens map in a spacetime setting, we first specify some terminology. By a manifold we shall always mean what is more fully called a “real, finite-dimensional, Hausdorff, second countable (and thus paracompact) $C^\infty$-manifold without boundary”. Whenever we have a $C^\infty$ vector field $X$ on a manifold $M$, we may consider two points in $M$ as equivalent if they lie on the same integral curve of $X$. We shall denote the resultant quotient space, which may be identified with the set of all integral curves of $X$, by $M/X$. We call $X$ a regular vector field if $M/X$ can be given the structure of a manifold in such a way that the natural projection $\pi_X : M \to M/X$ becomes a $C^\infty$-submersion. It is easy to construct examples of non-regular vector fields. E.g., if $X$ has no zeros and is defined on $\mathbb{R}^n \setminus \{0\}$, then $M/X$ cannot satisfy the Hausdorff property, so it cannot be a manifold according to our terminology.
Palais [5] has proven a useful result which, in our terminology, can be phrased in the following way. If none of $X$’s integral curves is closed or almost closed, and if $M/X$ satisfies the Hausdorff property, then $X$ is regular.

We are going to use the following terminology. A Lorentzian manifold is a manifold $M$ together with a $C^\infty$ metric tensor field $g$ of Lorentzian signature $(+ \cdots + -)$. A Lorentzian manifold is time-orientable if the set of all timelike vectors $\{ Z \in TM \mid g(Z, Z) < 0 \}$ has exactly two connected components. Choosing one of those connected components as future-pointing defines a time-orientation for $(M, g)$. A spacetime is a connected 4-dimensional time-orientable Lorentzian manifold together with a time-orientation. We are now ready to define what we will call a “source surface” in a spacetime. This will provide us with the target space for lens maps.

Definition 1. $(T, W)$ is called a source surface in a spacetime $(M, g)$ if (a) $T$ is a 3-dimensional $C^\infty$ submanifold of $M$; (b) $W$ is a nowhere vanishing regular $C^\infty$ vector field on $T$ which is everywhere causal, $g(W, W) \le 0$, and future-pointing; (c) $\pi_W : T \to N = T/W$ is a fiber bundle with fiber diffeomorphic to $\mathbb{R}$ and the quotient manifold $N = T/W$ is connected and orientable.

We want to interpret the integral curves of $W$ as the worldlines of light sources. Thus, one should assume that they are not only causal but even timelike, $g(W, W) < 0$, since a light source should move at subluminal velocity. For technical reasons, however, we


allow for the possibility that an integral curve of $W$ is lightlike (everywhere or at some points), because such curves may appear as ($C^1$-)limits of timelike curves. This will give us the possibility to apply the resulting formalism to asymptotically simple and empty spacetimes in a convenient way, see Subsect. 6.2 below. Actually, the causal character of $W$ will have little influence upon the results we want to establish. What really matters is a transversality condition that enters into the definition of the lens map below.

Please note that, in the situation of Def. 1, the bundle $\pi_W : T \to N$ is necessarily trivializable, i.e., $T \simeq N \times \mathbb{R}$. To prove this, let us assume that the flow of $W$ is defined on all of $\mathbb{R} \times T$, so it makes $\pi_W : T \to N$ into a principal fiber bundle. (This is no restriction of generality since it can always be achieved by multiplying $W$ with an appropriate function. This function can be determined in the following way. Owing to a famous theorem of Whitney [6], also see Hirsch [7], p. 55, paracompactness guarantees that $T$ can be embedded as a closed submanifold into $\mathbb{R}^n$ for some $n$. Pulling back the Euclidean metric gives a complete Riemannian metric $h$ on $T$ and the flow of the vector field $h(W, W)^{-1/2}\, W$ is defined on all of $\mathbb{R} \times T$, cf. Abraham and Marsden [8], Prop. 2.1.21.) Then the result follows from the well known facts that any fiber bundle whose typical fiber is diffeomorphic to $\mathbb{R}^n$ admits a global section (see, e.g., Kobayashi and Nomizu [9], p. 58), and that a principal fiber bundle is trivializable if and only if it admits a global section (see again [9], p. 57). Also, it is interesting to note the following. If $T$ is any 3-dimensional submanifold of $M$ that is foliated into timelike curves, then time orientability guarantees that these are the integral curves of a timelike vector field $W$.
If we assume, in addition, that $T$ contains no closed timelike curves, then it can be shown that $\pi_W : T \to N$ is necessarily a fiber bundle with fiber diffeomorphic to $\mathbb{R}$, provided $N$ satisfies the Hausdorff property, see Harris [10], Theorem 2. This shows that there is little room for relaxing the conditions of Def. 1.

Choosing a source surface in a spacetime will give us the target space $N = T/W$ for the lens map. To specify the domain of the lens map, we consider, at any point $p \in M$, the set $S_p$ of all lightlike directions at $p$, i.e., the set of all one-dimensional lightlike subspaces of $T_p M$. We shall refer to $S_p$ as the celestial sphere at $p$. This is justified since, obviously, $S_p$ is in natural one-to-one relation with the set of all light rays arriving at $p$. As it is more convenient to work with vectors rather than with directions, we shall usually represent $S_p$ as a submanifold of $T_p M$. To that end we fix a future-pointing timelike vector $V_p$ in the tangent space $T_p M$. The vector $V_p$ may be interpreted as the 4-velocity of an observer at $p$. We now consider the set

$$S_p = \bigl\{\, Y_p \in T_p M \;\big|\; g(Y_p, Y_p) = 0 \ \text{and} \ g(Y_p, V_p) = 1 \,\bigr\}. \tag{1}$$

It is an elementary fact that (1) defines an embedded submanifold of $T_p M$ which is diffeomorphic to the standard 2-sphere $S^2$. As indicated by our notation, the set (1) can be identified with the celestial sphere at $p$, just by relating each vector to the direction spanned by it. Representation (1) of the celestial sphere gives a convenient way of representing the light rays through $p$. We only have to assign to each $Y_p \in S_p$ the lightlike geodesic $s \mapsto \exp_p(s Y_p)$, where $\exp_p : W_p \subseteq T_p M \to M$ denotes the exponential map at the point $p$ of the Levi-Civita connection of the metric $g$. Please note that this geodesic is past-pointing, because $V_p$ was chosen future-pointing, and that it passes through $p$ at the parameter value $s = 0$. The lens map is defined in the following way.
After fixing a source surface $(T, W)$ and choosing a point $p \in M$, we denote by $D_p \subseteq S_p$ the subset of all lightlike directions at $p$ such that the geodesic to which this direction is tangent meets $T$ (at least once) if sufficiently extended to the past, and if at the first intersection point $q$ with $T$ this geodesic is transverse to $T$. By projecting $q$ to $N = T/W$ we get the lens map $f_p : D_p \to N = T/W$, see Fig. 1.

Fig. 1. Illustration of the lens map

If we use the representation (1) for $S_p$, the definition of the lens map can be given in more formal terms in the following way.

Definition 2. Let $(T, W)$ be a source surface in a spacetime $(M, g)$. Then, for each $p \in M$, the lens map $f_p : D_p \to N = T/W$ is defined in the following way. In the notation of Eq. (1), let $D_p$ be the set of all $Y_p \in S_p$ such that there is a real number $w_p(Y_p) > 0$ with the properties (a) $s Y_p$ is in the maximal domain of the exponential map for all $s \in [0, w_p(Y_p)]$; (b) the curve $s \mapsto \exp_p(s Y_p)$ intersects $T$ at the value $s = w_p(Y_p)$ transversely; (c) $\exp_p(s Y_p) \notin T$ for all $s \in [0, w_p(Y_p)[$. This defines a map $w_p : D_p \to \mathbb{R}$. The lens map at $p$ is then, by definition, the map

$$f_p : D_p \to N = T/W, \qquad f_p(Y_p) = \pi_W\bigl(\exp_p(w_p(Y_p)\, Y_p)\bigr). \tag{2}$$

Here $\pi_W : T \to N$ denotes the natural projection. The transversality condition in part (b) of Def. 2 guarantees that the domain $D_p$ of the lens map is an open subset of $S_p$. The case $D_p = \emptyset$ is, of course, not excluded. In particular, $D_p = \emptyset$ whenever $p \in T$, owing to part (c) of Def. 2.


Moreover, the transversality condition in part (b) of Definition 2, in combination with the implicit function theorem, makes sure that the map $w_p : D_p \to \mathbb{R}$ is a $C^\infty$ map. As the exponential map of a $C^\infty$ metric is again $C^\infty$, and $\pi_W$ is a $C^\infty$ submersion by assumption, this proves the following.

Proposition 1. The lens map is a $C^\infty$ map.

Please note that without the transversality condition the lens map need not even be continuous. Although our Def. 2 made use of the representation (1), which refers to a timelike vector $V_p$, the lens map is, of course, independent of which future-pointing $V_p$ has been chosen. We decided to index the lens map only with $p$ although, strictly speaking, it depends on $T$, on $W$, and on $p$. Our philosophy is to keep a source surface $(T, W)$ fixed, and then to consider the lens map for all points $p \in M$.

In view of gravitational lensing, the lens map admits the following interpretation. For $\xi \in N$, each point $Y_p \in D_p$ with $f_p(Y_p) = \xi$ corresponds to a past-pointing lightlike geodesic from $p$ to the worldline $\xi$ in $M$, i.e., it corresponds to an image at the celestial sphere of $p$ of the light source with worldline $\xi$. If $f_p$ is not injective, we are in a multiple imaging situation. The converse need not be true as the lens map does not necessarily cover all images. There might be a past-pointing lightlike geodesic from $p$ reaching $\xi$ after having met $T$ before, or being tangential to $T$ on its arrival at $\xi$. In either case, the corresponding image is ignored by the lens map. The reader might be inclined to view this as a disadvantage. However, in Sect. 6 below we discuss some situations where the existence of such additional light rays can be excluded (e.g., asymptotically simple and empty spacetimes) and situations where it is desirable, on physical grounds, to disregard such additional light rays (e.g., weakly perturbed Robertson–Walker spacetimes with compact spatial sections).
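The construction of Def. 2 can be made concrete in a simple setting. The following sketch assumes Minkowski spacetime with metric diag(+1, +1, +1, −1), an observer with $V_p = \partial_t$, and a source surface $T$ given by a static sphere of radius `radius` around the spatial origin with $W = \partial_t$, so that $N$ is that sphere. These choices, and the function name `lens_map_minkowski`, are hypothetical and serve only as an illustration. With them, Eq. (1) gives $Y_p = (n, -1)$ with a unit 3-vector $n$, and the geodesic $s \mapsto \exp_p(sY_p)$ has the spatial part $p + s\,n$.

```python
import math

def lens_map_minkowski(p_xyz, n, radius, tol=1e-12):
    """Toy lens map in Minkowski spacetime (all modelling choices are assumptions).

    p_xyz  : spatial coordinates of the observation event p
    n      : unit 3-vector; Y_p = (n, -1) is the past-pointing lightlike vector
    radius : radius of the static sphere that serves as source surface T

    Returns (w, xi), where w plays the role of w_p(Y_p) and xi = f_p(Y_p) is
    a point of the sphere N, or None if Y_p is not in the domain D_p.
    """
    b = sum(pi * ni for pi, ni in zip(p_xyz, n))      # p . n
    c = sum(pi * pi for pi in p_xyz) - radius**2      # |p|^2 - R^2
    if abs(c) <= tol:
        return None        # p lies on T: excluded by condition (c)
    disc = b * b - c
    if disc <= tol:
        return None        # ray misses T, or only grazes it (not transverse)
    roots = (-b - math.sqrt(disc), -b + math.sqrt(disc))
    positive = [s for s in roots if s > tol]
    if not positive:
        return None        # the ray never meets T towards the past
    w = min(positive)      # first intersection with T
    xi = tuple(pi + w * ni for pi, ni in zip(p_xyz, n))
    return w, xi

if __name__ == "__main__":
    print(lens_map_minkowski((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 2.0))
    print(lens_map_minkowski((3.0, 0.0, 0.0), (0.0, 1.0, 0.0), 1.0))
```

For an observer inside the sphere every direction yields a value, so the domain is the whole celestial sphere in this toy setting. For an observer outside the sphere, only the directions pointing towards it do.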
It was already mentioned that the domain $D_p$ of the lens map might be empty; this is, of course, the worst case that could happen. The best case is that the domain is all of the celestial sphere, $D_p = S_p$. We shall see in the following sections that many interesting results are true just in this case. However, there are several cases of interest where $D_p$ is a proper subset of $S_p$. If the domain of the lens map $f_p$ is the whole celestial sphere, none of the light rays issuing from $p$ into the past is blocked or trapped before it reaches $T$. In view of applications to gravitational lensing, this excludes the possibility that these light rays meet a non-transparent deflector. In other words, it is a typical feature of gravitational lensing situations with non-transparent deflectors that $D_p$ is not all of $S_p$. Two simple examples, viz., a non-transparent string and a non-transparent spherical body, will be considered in Subsect. 6.1 below.

3. Regular and Critical Values of the Lens Map

Please recall that, for a differentiable map $F : M_1 \to M_2$ between two manifolds, $Y \in M_1$ is called a regular point of $F$ if the differential $T_Y F : T_Y M_1 \to T_{F(Y)} M_2$ has maximal rank, otherwise $Y$ is called a critical point. Moreover, $\xi \in M_2$ is called a regular value of $F$ if all $Y \in F^{-1}(\xi)$ are regular points, otherwise $\xi$ is called a critical value. Please note that, according to this definition, any $\xi \in M_2$ that is not in the image of $F$ is regular. The well-known (Morse-)Sard theorem (see, e.g., Hirsch [7], p. 69) says that the set of regular values of $F$ is residual (i.e., it contains the intersection of countably many sets that are open and dense in $M_2$) and thus dense in $M_2$ and the critical values of $F$ make up a set of measure zero in $M_2$.
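The notions of regular and critical value can be illustrated with a one-dimensional toy map, here $F(x) = x^3 - 3x$ on the real line. This example and the helper `preimages` are assumptions chosen for illustration and are unrelated to any particular lens map. The critical points are $x = \pm 1$, so the only critical values are $F(-1) = 2$ and $F(1) = -2$, a set of measure zero as the Sard theorem requires.

```python
def F(x):
    """Toy map F(x) = x^3 - 3x (illustrative assumption)."""
    return x**3 - 3.0 * x

def preimages(xi, lo=-3.0, hi=3.0, steps=6000):
    """Approximate all x in [lo, hi] with F(x) = xi by bracketing and bisection."""
    roots = []
    h = (hi - lo) / steps
    for i in range(steps):
        a, b = lo + i * h, lo + (i + 1) * h
        fa, fb = F(a) - xi, F(b) - xi
        if fa == 0.0:
            roots.append(a)                 # grid point happens to be a root
        elif fa * fb < 0.0:
            for _ in range(60):             # bisection on the bracket [a, b]
                m = 0.5 * (a + b)
                if (F(a) - xi) * (F(m) - xi) <= 0.0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots

if __name__ == "__main__":
    print("critical values:", [F(-1.0), F(1.0)])
    for xi in (0.5, 3.0):
        print(f"xi = {xi}: {len(preimages(xi))} preimage(s)")
```

Regular values between the two critical values have three preimages, those outside have one, so the number of preimages changes only when a critical value is crossed.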

For the lens map $f_p : D_p \to N$, we call the set

$$\mathrm{Caust}(f_p) = \bigl\{\, \xi \in N \;\big|\; \xi \ \text{is a critical value of} \ f_p \,\bigr\} \tag{3}$$

the caustic of $f_p$. The Sard theorem then implies the following result.

Proposition 2. The caustic $\mathrm{Caust}(f_p)$ is a set of measure zero in $N$ and its complement $N \setminus \mathrm{Caust}(f_p)$ is residual and thus dense in $N$.

Please note that $\mathrm{Caust}(f_p)$ need not be closed in $N$. Counter-examples can be constructed easily by starting with situations where the caustic is closed and then excising points from spacetime. For lens maps defined on the whole celestial sphere, however, we have the following result.

Proposition 3. If $D_p = S_p$, the caustic $\mathrm{Caust}(f_p)$ is compact in $N$.

This is an obvious consequence of the fact that $S_p$ is compact and that $f_p$ and its first derivative are continuous. As the domain and the target space of $f_p$ have the same dimension, $Y_p \in D_p$ is a regular point of $f_p$ if and only if the differential $T_{Y_p} f_p : T_{Y_p} S_p \to T_{f_p(Y_p)} N$ is an isomorphism. In this case $f_p$ maps a neighborhood of $Y_p$ diffeomorphically onto a neighborhood of $f_p(Y_p)$. The differential $T_{Y_p} f_p$ may be either orientation-preserving or orientation-reversing. To make this notion precise we have to choose an orientation for $S_p$ and an orientation for $N$. For the celestial sphere $S_p$ it is natural to choose the orientation according to which the origin of the tangent space $T_p M$ is to the inner side of $S_p$. The target manifold $N$ is orientable by assumption, but in general there is no natural choice for the orientation. Clearly, choosing an orientation for $N$ fixes an orientation for $T$, because the vector field $W$ gives us an orientation for the fibers. We shall say that the orientation of $N$ is adapted to some point $Y_p \in D_p$ if the geodesic with initial vector $Y_p$ meets $T$ at the inner side. If $D_p$ is connected, the orientation of $N$ that is adapted to some $Y_p \in D_p$ is automatically adapted to all other elements of $D_p$. Using this terminology, we may now introduce the following definition.

Definition 3.
A regular point $Y_p \in D_p$ of the lens map $f_p$ is said to have even parity (or odd parity, respectively) if $T_{Y_p} f_p$ is orientation-preserving (or orientation-reversing, respectively) with respect to the natural orientation on $S_p$ and the orientation adapted to $Y_p$ on $N$. For a regular value $\xi \in N$ of the lens map, we denote by $n_+(\xi)$ (or $n_-(\xi)$, respectively) the number of elements in $f_p^{-1}(\xi)$ with even parity (or odd parity, respectively).

Please note that $n_+(\xi)$ and $n_-(\xi)$ may be infinite, see the Schwarzschild example in Subsect. 6.1 below. A criterion for $n_\pm(\xi)$ to be finite will be given in Prop. 8 below. Definition 3 is relevant for gravitational lensing in the following sense. The assumption that $Y_p$ is a regular point of $f_p$ implies that an observer at $p$ sees a neighborhood of $\xi = f_p(Y_p)$ in $N$ as a neighborhood of $Y_p$ at his or her celestial sphere. If we compare the case that $Y_p$ has odd parity with the case that $Y_p$ has even parity, then the appearance of the neighborhood in the first case is the mirror image of its appearance in the second case. This difference is observable for a light source that is surrounded by some irregularly shaped structure, e.g. a galaxy with curved jets or with lobes. If $\xi$ is a regular value of $f_p$, it is obvious that the points in $f_p^{-1}(\xi)$ are isolated, i.e., any $Y_p$ in $f_p^{-1}(\xi)$ has a neighborhood in $D_p$ that contains no other point in $f_p^{-1}(\xi)$. This follows immediately from the fact that $f_p$ maps a neighborhood of $Y_p$ diffeomorphically


onto its image. In the next section we shall formulate additional assumptions such that the set $f_p^{-1}(\xi)$ is finite, i.e., such that the numbers $n_\pm(\xi)$ introduced in Def. 3 are finite. It is the main purpose of the next section to demonstrate that then the difference $n_+(\xi) - n_-(\xi)$ has some topological invariance properties. As a preparation for that we notice the following result which is an immediate consequence of the fact that the lens map is a local diffeomorphism near each regular point.

Proposition 4. $n_+$ and $n_-$ are constant on each connected component of $f_p(D_p) \setminus \mathrm{Caust}(f_p)$.

Hence, along any continuous curve in $f_p(D_p)$ that does not meet the caustic of the lens map, the numbers $n_+$ and $n_-$ remain constant, i.e., the observer at $p$ sees the same number of images for all light sources on this curve. If a curve intersects the caustic, the number of images will jump. In the next section we shall prove that $n_+$ and $n_-$ always jump by the same amount (under conditions making sure that these numbers are finite), i.e., the total number of images always jumps by an even number. This is well known in the quasi-Newtonian approximation formalism, see, e.g., Schneider, Ehlers and Falco [1], Sect. 6. If $\mathrm{Caust}(f_p)$ is empty, transversality guarantees that $f_p(D_p)$ is open in $N$ and thus a manifold. Proposition 4 implies that, in this case, $f_p$ gives a $C^\infty$ covering map from $D_p$ onto $f_p(D_p)$. As a $C^\infty$ covering map onto a simply connected manifold must be a global diffeomorphism, this implies the following result.

Proposition 5. Assume that $\mathrm{Caust}(f_p)$ is empty and that $f_p(D_p)$ is simply connected. Then $f_p$ gives a global diffeomorphism from $D_p$ onto $f_p(D_p)$.

In other words, the formation of a caustic is necessary for multiple imaging provided that $f_p(D_p)$ is simply connected. In Subsect. 6.1 below we shall consider the spacetime of a non-transparent string. This will demonstrate that the conclusion of Prop.
5 is not true without the assumption of $f_p(D_p)$ being simply connected.

In the rest of this section we want to relate the caustic of the lens map to the caustic of the past light cone of $p$. The past light cone of $p$ can be defined as the image set in $M$ of the map

$$F_p : (s, Y_p) \mapsto \exp_p(s Y_p) \tag{4}$$

considered on its maximal domain in $]0, \infty[ \, \times S_p$, and its caustic can be defined as the set of critical values of $F_p$. In other words, $q \in M$ is in the caustic of the past light cone of $p$ if and only if there is an $s_0 \in \, ]0, \infty[$ and a $Y_p \in S_p$ such that the differential $T_{(s_0, Y_p)} F_p$ has rank $k < 3$. In that case one says that the point $q = \exp_p(s_0 Y_p)$ is conjugate to $p$ along the geodesic $s \mapsto \exp_p(s Y_p)$, and one calls the number $m = 3 - k$ the multiplicity of this conjugate point. As $F_p(\,\cdot\,, Y_p)$ is always an immersion, the multiplicity can take the values 1 and 2 only. (This formulation is equivalent to the definition of conjugate points and their multiplicities in terms of Jacobi vector fields which may be more familiar to the reader.) It is well known, but far from trivial, that along every lightlike geodesic conjugate points are isolated. Hence, in a compact parameter interval there are only finitely many points that are conjugate to a fixed point $p$. A proof can be found, e.g., in Beem, Ehrlich and Easley [11], Theorem 10.77. After these preparations we are now ready to establish the following proposition. We use the notation introduced in Def. 2.


Proposition 6. An element $Y_p \in D_p$ is a regular point of the lens map if and only if the point $\exp_p(w_p(Y_p)\, Y_p)$ is not conjugate to $p$ along the geodesic $s \mapsto \exp_p(s Y_p)$. A regular point $Y_p \in D_p$ has even parity (or odd parity, respectively) if and only if the number of points conjugate to $p$ along the geodesic $[0, w_p(Y_p)] \to M$, $s \mapsto \exp_p(s Y_p)$ is even (or odd, respectively). Here each conjugate point is to be counted with its multiplicity.

Proof. In terms of the function (4), the lens map can be written in the form

$$f_p(Y_p) = \pi_W\bigl(F_p(w_p(Y_p), Y_p)\bigr). \tag{5}$$

As $s \mapsto F_p(s, Y_p)$ is an immersion transverse to $T$ at $s = w_p(Y_p)$ and $\pi_W$ is a submersion, the differential of $f_p$ at $Y_p$ has rank 2 if and only if the differential of $F_p$ at $(w_p(Y_p), Y_p)$ has rank 3. This proves the first claim. For proving the second claim define, for each $s \in [0, w_p(Y_p)]$, a map

$$\Phi_s : T_{Y_p} S_p \to T_{f_p(Y_p)} N \tag{6}$$

by applying to each vector in $T_{Y_p} S_p$ the differential $T_{(s, Y_p)} F_p$, parallel-transporting the result along the geodesic $F_p(\,\cdot\,, Y_p)$ to the point $q = F_p(w_p(Y_p), Y_p)$ and then projecting down to $T_{f_p(Y_p)} N$. In the last step one uses the fact that, by transversality, any vector in $T_q M$ can be uniquely decomposed into a vector tangent to $T$ and a vector tangent to the geodesic $F_p(\,\cdot\,, Y_p)$. For $s = 1$, this map $\Phi_s$ gives the differential of the lens map. We now choose a basis in $T_{Y_p} S_p$ and a basis in $T_{f_p(Y_p)} N$, thereby representing the map $\Phi_s$ as a $(2 \times 2)$-matrix. We choose the first basis right-handed with respect to the natural orientation on $S_p$ and the second basis right-handed with respect to the orientation on $N$ that is adapted to $Y_p$. Then $\det(\Phi_0)$ is positive as the parallel transport gives an orientation-preserving isomorphism. The function $s \mapsto \det(\Phi_s)$ has a single zero whenever $F_p(s, Y_p)$ is a conjugate point of multiplicity one and it has a double zero whenever $F_p(s, Y_p)$ is a conjugate point of multiplicity two. Hence, the sign of $\det(\Phi_1)$ can be determined by counting the conjugate points.

This result implies that $\xi \in N$ is a regular value of the lens map $f_p$ whenever the worldline $\xi$ does not pass through the caustic of the past light cone of $p$. The relation between parity and the number of conjugate points is geometrically rather evident because each conjugate point is associated with a “crossover” of infinitesimally neighboring light rays.

4. The Mapping Degree of the Lens Map

The mapping degree (also known as Brouwer degree) is one of the most powerful tools in differential topology. In this section we want to investigate what kind of information could be gained from the mapping degree of the lens map, provided it can be defined. For the reader’s convenience we briefly summarize the definition and main properties of the mapping degree, following closely Choquet-Bruhat, Dewitt-Morette, and Dillard-Bleick [12], pp. 477.
For a more abstract approach, using homology theory, the reader may consult Dold [13], Spanier [14] or Bredon [15]. In this article we shall not use homology theory with the exception of the proof of Prop. 11. The definition of the mapping degree is based on the following observation.


Proposition 7. Let $F : \overline{D} \subseteq M_1 \to M_2$ be a continuous map, where $M_1$ and $M_2$ are oriented connected manifolds of the same dimension, $D$ is an open subset of $M_1$ with compact closure $\overline{D}$ and $F|_D$ is a $C^\infty$ map. (Actually, $C^1$ would do.) Then for every $\xi \in M_2 \setminus F(\partial D)$ which is a regular value of $F|_D$, the set $F^{-1}(\xi)$ is finite.

Proof. By contradiction, let us assume that there is a sequence $(y_i)_{i \in \mathbb{N}}$ with pairwise different elements in $F^{-1}(\xi)$. By compactness of $\overline{D}$, we can choose an infinite subsequence of $(y_i)_{i \in \mathbb{N}}$ that converges towards some point $y_\infty \in \overline{D}$. By continuity of $F$, $F(y_\infty) = \xi$, so the hypotheses of the proposition imply that $y_\infty \notin \partial D$. As a consequence, $y_\infty$ is a regular point of $F|_D$, so it must have an open neighborhood in $D$ that does not contain any other element of $F^{-1}(\xi)$. This contradicts the fact that a subsequence of $(y_i)_{i \in \mathbb{N}}$ converges towards $y_\infty$.

If we have a map $F$ that satisfies the hypotheses of Prop. 7, we can thus define, for every $\xi \in M_2 \setminus F(\partial D)$ which is a regular value of $F|_D$,

$$\deg(F, \xi) = \sum_{y \in F^{-1}(\xi)} \operatorname{sgn}(y), \tag{7}$$

where $\operatorname{sgn}(y)$ is defined to be $+1$ if the differential $T_y F$ preserves orientation and $-1$ if $T_y F$ reverses orientation. If $F^{-1}(\xi)$ is the empty set, the right-hand side of (7) is set equal to zero. The number $\deg(F, \xi)$ is called the mapping degree of $F$ at $\xi$. Roughly speaking, $\deg(F, \xi)$ tells how often the image of $F$ covers the point $\xi$, counting each “layer” positive or negative depending on orientation. The mapping degree has the following properties (for proofs see Choquet-Bruhat, Dewitt-Morette, and Dillard-Bleick [12], pp. 477).

Property A. $\deg(F, \xi) = \deg(F, \xi')$ whenever $\xi$ and $\xi'$ are in the same connected component of $M_2 \setminus F(\partial D)$.

Property B. $\deg(F, \xi) = \deg(F', \xi)$ whenever $F$ and $F'$ are homotopic, i.e., whenever there is a continuous map $\Psi : [0, 1] \times \overline{D} \to M_2$, $(s, y) \mapsto \Psi_s(y)$ with $\Psi_0 = F$ and $\Psi_1 = F'$ such that $\deg(\Psi_s, \xi)$ is defined for all $s \in [0, 1]$.

Property A can be used to extend the definition of $\deg(F, \xi)$ to the non-regular values $\xi \in M_2 \setminus F(\partial D)$. Given the fact that, by the Sard theorem, the regular values are dense in $M_2$, this can be done just by continuous extension. Property B can be used to extend the definition of $\deg(F, \xi)$ to continuous maps $F : \overline{D} \to M_2$ which are not necessarily differentiable on $D$. Given the fact that the $C^\infty$ maps are dense in the continuous maps with respect to the $C^0$-topology, this can be done again just by continuous extension.

We now apply these general results to the lens map $f_p : D_p \to N$. In the case $D_p \neq S_p$ it is necessary to extend the domain of the lens map onto a compact set to define the degree of the lens map. We introduce the following definition.

Definition 4. A map $\overline{f}_p : \overline{D}_p \subseteq M_1 \to M_2$ is called an extension of the lens map $f_p : D_p \to N$ if (a) $M_1$ is an orientable manifold that contains $D_p$ as an open submanifold; (b) $M_2$ is an orientable manifold that contains $N$ as an open submanifold; (c) the closure $\overline{D}_p$ of $D_p$ in $M_1$ is compact; (d) $\overline{f}_p$ is continuous and the restriction of $\overline{f}_p$ to $D_p$ is equal to $f_p$.
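The signed count in Eq. (7) can be tried out on a one-dimensional compact example. The sketch below assumes the self-map of the circle whose lift is $\theta \mapsto \theta + a \sin\theta$ with $a = 2$. The map, the sampling-based root detection and all function names are illustrative assumptions, not part of the article. Each preimage of $\xi$ is counted as $+1$ where the map is locally increasing and $-1$ where it is locally decreasing.

```python
import math

TWO_PI = 2.0 * math.pi

def F(theta, a=2.0):
    """A smooth self-map of the circle, given by the lift theta + a*sin(theta)."""
    return (theta + a * math.sin(theta)) % TWO_PI

def wrap(x):
    """Represent an angle difference in the interval [-pi, pi)."""
    return (x + math.pi) % TWO_PI - math.pi

def signed_preimage_count(xi, a=2.0, samples=20000):
    """Return (n_plus, n_minus) for the value xi, found by sampling the circle."""
    n_plus = n_minus = 0
    h_prev = wrap(F(0.0, a) - xi)
    for i in range(1, samples + 1):
        h = wrap(F(TWO_PI * i / samples, a) - xi)
        # A genuine preimage: sign change with both values close to zero.
        # Sign changes near +-pi come from the wrap-around and are skipped.
        if h_prev * h < 0.0 and abs(h_prev) < 1.0 and abs(h) < 1.0:
            if h > h_prev:
                n_plus += 1      # orientation-preserving at this preimage
            else:
                n_minus += 1     # orientation-reversing at this preimage
        h_prev = h
    return n_plus, n_minus

if __name__ == "__main__":
    for xi in (1.0, 3.0):
        n_plus, n_minus = signed_preimage_count(xi)
        print(f"xi = {xi}: n+ = {n_plus}, n- = {n_minus}, degree = {n_plus - n_minus}")
```

The two sample values have different numbers of preimages, but the signed count agrees, as Property A leads one to expect for a map on a compact manifold without boundary.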

Global Properties of Gravitational Lens Maps in Lorentzian Setting

If the lens map is defined on the whole celestial sphere, $D_p = S_p$, then the lens map is an extension of itself, $\overline{f_p} = f_p$, with $M_1 = S_p$ and $M_2 = N$. If $D_p \neq S_p$, one may try to continuously extend $f_p$ onto the closure of $D_p$ in $S_p$, thereby getting an extension with $M_1 = S_p$ and $M_2 = N$. If this does not work, one may try to find some other extension. The string spacetime in Subsect. 6.1 below will provide us with an example where an extension exists although $f_p$ cannot be continuously extended from $D_p$ onto its closure in $S_p$. The spacetime around a spherically symmetric body with $R_o < 3m$ will provide us with an example where the lens map admits no extension at all, see Subsect. 6.1 below. Applying Prop. 7 to the case $F = \overline{f_p}$ immediately gives the following result.

Proposition 8. If the lens map $f_p : D_p \to N$ admits an extension $\overline{f_p} : \overline{D_p} \subseteq M_1 \to M_2$, then for all regular values $\xi \in N \setminus \overline{f_p}(\partial D_p)$ the set $f_p^{-1}(\xi)$ is finite, so the numbers $n_+(\xi)$ and $n_-(\xi)$ introduced in Def. 3 are finite.

If $\overline{f_p}$ is an extension of the lens map $f_p$, the number $\deg(\overline{f_p},\xi)$ is a well-defined integer for all $\xi \in N \setminus \overline{f_p}(\partial D_p)$, provided that we have chosen an orientation on $M_1$ and on $M_2$. The number $\deg(\overline{f_p},\xi)$ changes sign if we change the orientation on $M_1$ or on $M_2$. This sign ambiguity can be removed if $D_p$ is connected. Then we know from the preceding section that $N$ admits an orientation that is adapted to all $Y_p \in D_p$. As $N$ is connected, this determines an orientation for $M_2$. Moreover, the natural orientation on $S_p$ induces an orientation on $D_p$ which, for $D_p$ connected, gives an orientation for $M_1$. In the rest of this paper we shall only be concerned with the situation that $D_p$ is connected, and we shall always tacitly assume that the orientations have been chosen as indicated above, thereby fixing the sign of $\deg(\overline{f_p},\xi)$. Now comparison of (7) with Def. 3 shows that

$$\deg(\overline{f_p},\xi) = n_+(\xi) - n_-(\xi) \tag{8}$$

for all regular values in $N \setminus \overline{f_p}(\partial D_p)$. Owing to Property A, this has the following consequence.

Proposition 9. Assume that $D_p$ is connected and that the lens map admits an extension $\overline{f_p} : \overline{D_p} \subseteq M_1 \to M_2$. Then $n_+(\xi) - n_-(\xi) = n_+(\xi') - n_-(\xi')$ for any two regular values $\xi$ and $\xi'$ which are in the same connected component of $N \setminus \overline{f_p}(\partial D_p)$. In particular, $n_+(\xi) + n_-(\xi)$ is odd if and only if $n_+(\xi') + n_-(\xi')$ is odd.

We know already from Prop. 4 that the numbers $n_+$ and $n_-$ remain constant along each continuous curve in $f_p(D_p)$ that does not meet the caustic of $f_p$. Now let us consider a continuous curve $\alpha : \,]-\varepsilon_0, \varepsilon_0[\, \to f_p(D_p)$ that meets the caustic at $\alpha(0)$ whereas $\alpha(\varepsilon)$ is a regular value of $f_p$ for all $\varepsilon \neq 0$. Under the additional assumptions that $D_p$ is connected, that $f_p$ admits an extension, and that $\alpha(0) \notin \overline{f_p}(\partial D_p)$, Prop. 9 tells us that $n_+(\alpha(\varepsilon)) - n_-(\alpha(\varepsilon))$ remains constant when $\varepsilon$ passes through zero. In other words, $n_+$ and $n_-$ are allowed to jump only by the same amount. As a consequence, the total number of images $n_+ + n_-$ is allowed to jump only by an even number.

We now specialize to the case that the lens map is defined on the whole celestial sphere, $D_p = S_p$. Then the assumption of $f_p$ admitting an extension is trivially satisfied, with $\overline{f_p} = f_p$, and the degree $\deg(f_p,\xi)$ is a well-defined integer for all $\xi \in N$. Moreover,

V. Perlick

$\deg(f_p,\xi)$ is a constant with respect to $\xi$, owing to Property A. It is then usual to write simply $\deg(f_p)$ instead of $\deg(f_p,\xi)$. Using this notation, (8) simplifies to

$$\deg(f_p) = n_+(\xi) - n_-(\xi) \tag{9}$$

for all regular values $\xi$ of $f_p$. Thus, the total number of images

$$n_+(\xi) + n_-(\xi) = \deg(f_p) + 2n_-(\xi) \tag{10}$$

is either even for all regular values $\xi$ or odd for all regular values $\xi$, depending on whether $\deg(f_p)$ is even or odd. In some gravitational lensing situations it might be possible to show that there is one light source $\xi \in N$ for which $f_p^{-1}(\xi)$ consists of exactly one point, i.e., $\xi$ is not multiply imaged. This situation is characterized by the following proposition.

Proposition 10. Assume that $D_p = S_p$ and that there is a regular value $\xi$ of $f_p$ such that $f_p^{-1}(\xi)$ is a single point. Then $|\deg(f_p)| = 1$. In particular, $f_p$ must be surjective and $N$ must be diffeomorphic to the sphere $S^2$.

Proof. The result $|\deg(f_p)| = 1$ can be read directly from (9), choosing the regular value $\xi$ which has exactly one pre-image point under $f_p$. This implies that $f_p$ must be surjective since a non-surjective map has degree zero. So $N$, being the continuous image of the compact set $S_p$ under the continuous map $f_p$, must be compact. It is well known (see, e.g., Hirsch [7], p. 130, Exercise 5) that for $n \geq 2$ the existence of a continuous map $F : S^n \to M_2$ with $\deg(F) = 1$ onto a compact oriented $n$-manifold $M_2$ implies that $M_2$ must be simply connected. As the lens map gives us such a map onto $N$ (after changing the orientation of $N$, if necessary), we have thus found that $N$ must be simply connected. Owing to the well-known classification theorem of compact orientable two-dimensional manifolds (see, e.g., Hirsch [7], Chapter 9), this implies that $N$ must be diffeomorphic to the sphere $S^2$.

In the situation of Prop. 10 we have $n_+(\xi) + n_-(\xi) = 2n_-(\xi) \pm 1$ for all $\xi \in N \setminus \mathrm{Caust}(f_p)$, i.e., the total number of images is odd for all light sources $\xi \in N \simeq S^2$ that do not lie on the caustic of $f_p$. The idea to use the mapping degree for proving an odd number theorem in this way was published apparently for the first time in the introduction of McKenzie [16]. In Prop. 10 one would, of course, like to drop the rather restrictive assumption that $f_p^{-1}(\xi)$ is a single point for some $\xi$.
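The parity statement that follows from Eqs. (9) and (10) can be checked with a very small sketch. The function name `total_images` is a hypothetical helper introduced only for this illustration; it takes the degree and the number $n_-$ of odd-parity images as given inputs and does not compute either from a spacetime.

```python
def total_images(degree: int, n_minus: int) -> int:
    """Total image count n_+ + n_- = deg + 2 n_-, using Eqs. (9) and (10)."""
    if n_minus < 0:
        raise ValueError("n_minus must be non-negative")
    n_plus = degree + n_minus  # Eq. (9): deg = n_+ - n_-
    if n_plus < 0:
        raise ValueError("this (degree, n_minus) pair would give n_plus < 0")
    return n_plus + n_minus


# With |deg| = 1 the total number of images is odd for every admissible n_-.
parities_deg_plus_one = {total_images(1, k) % 2 for k in range(6)}
parities_deg_minus_one = {total_images(-1, k) % 2 for k in range(1, 6)}

# With an even degree the total is always even.
parities_deg_two = {total_images(2, k) % 2 for k in range(6)}

print(parities_deg_plus_one, parities_deg_minus_one, parities_deg_two)
```

Note that for $\deg(f_p) = -1$ the case $n_- = 0$ is excluded, because Eq. (9) would then force $n_+ = -1$.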
In the next section we consider a special situation where the result |deg(fp )| = 1 can be derived without this assumption. 5. Simple Lensing Neighborhoods In this section we investigate a special class of spacetime regions that will be called “simple lensing neighborhoods”. Although the assumption of having a simple lensing neighborhood is certainly rather special, we shall demonstrate in Sect. 6 below that sufficiently many examples of physical interest exist. We define simple lensing neighborhoods in the following way. Definition 5. (U, T , W ) is called a simple lensing neighborhood in a spacetime (M, g) if (a) U is an open connected subset of M and T is the boundary of U in M; (b) ( T = ∂U, W ) is a source surface in the sense of Def. 1;

(c) for all $p \in U$, the lens map $f_p : D_p \to N = \partial U/W$ is defined on the whole celestial sphere, $D_p = S_p$; (d) $U$ does not contain an almost periodic lightlike geodesic.

Here the notion of being “almost periodic” is defined in the following way. Any immersed curve $\lambda : I \to U$, defined on a real interval $I$, induces a curve $\hat\lambda : I \to PU$ in the projective tangent bundle $PU$ over $U$ which is defined by $\hat\lambda(s) = \{\, c\,\dot\lambda(s) \mid c \in \mathbb{R} \,\}$. The curve $\lambda$ is called almost periodic if there is a strictly monotonic sequence of parameter values $(s_i)_{i\in\mathbb{N}}$ such that the sequence $\bigl(\hat\lambda(s_i)\bigr)_{i\in\mathbb{N}}$ has an accumulation point in $PU$. Please note that Condition (d) of Def. 5 is certainly true if the strong causality condition holds everywhere on $U$, i.e., if there are no closed or almost closed causal curves in $U$. Also, Condition (d) is certainly true if every future-inextendible lightlike geodesic in $U$ has a future end-point in $M$.

Condition (d) should be viewed as adding a fairly mild assumption on the future-behavior of lightlike geodesics to the fairly strong assumptions on their past-behavior that are contained in Condition (c). In particular, Condition (c) excludes the possibility that past-oriented lightlike geodesics are blocked or trapped inside $U$, i.e., it excludes the case that $U$ contains non-transparent deflectors. Condition (c) requires, in addition, that the past-pointing lightlike geodesics are transverse to $\partial U$ when leaving $U$.

In the situation of a simple lensing neighborhood, we have for each $p \in U$ a lens map that is defined on the whole celestial sphere, $f_p : S_p \to N = \partial U/W$. We have, thus, Eq. (9) at our disposal which relates the numbers $n_+(\xi)$ and $n_-(\xi)$, for any regular value $\xi \in N$, to the mapping degree of $f_p$. (Please recall that, by Prop. 8, $n_+(\xi)$ and $n_-(\xi)$ are finite.) It is our main goal to prove that, in a simple lensing neighborhood, the mapping degree of the lens map equals $\pm 1$, so $n(\xi) = n_+(\xi) + n_-(\xi)$ is odd for all regular values $\xi$.
Also, we shall prove that a simple lensing neighborhood must be contractible and that its boundary must be diffeomorphic to $S^2 \times \mathbb{R}$. The latter result reflects the fact that the notion of simple lensing neighborhoods generalizes the notion of asymptotically simple and empty spacetimes, with $\partial U$ corresponding to past lightlike infinity $\mathcal{J}^-$, as will be detailed in Subsect. 6.2 below. When proving the desired properties of simple lensing neighborhoods we may therefore use several techniques that have been successfully applied to asymptotically simple and empty spacetimes before. As a preparation we need the following lemma.

Lemma 1. Let $(U, T, W)$ be a simple lensing neighborhood in a spacetime $(M, g)$. Then there is a diffeomorphism $\Psi$ from the sphere bundle $S = \{\, Y_p \in S_p \mid p \in U \,\}$ of lightlike directions over $U$ onto the space $TN \times \mathbb{R}^2$ such that the following diagram commutes.

$$\begin{array}{ccc} S & \stackrel{\Psi}{\longrightarrow} & TN \times \mathbb{R}^2 \\ {\scriptstyle i_p}\,\uparrow & & \downarrow\,{\scriptstyle \mathrm{pr}} \\ S_p & \stackrel{f_p}{\longrightarrow} & N \end{array} \tag{11}$$

Here $i_p$ denotes the inclusion map and $\mathrm{pr}$ is defined by dropping the second factor and projecting to the foot-point.

Proof. We fix a trivialization for the bundle $\pi_W : T \to N$ and identify $T$ with $N \times \mathbb{R}$. Then we consider the bundle $B = \{\, X_q \in B_q \mid q \in T \,\}$ over $T$, where $B_q \subset S_q$ is, by definition, the subspace of all lightlike directions that are tangent to past-oriented

lightlike geodesics that leave $U$ transversely at $q$. Now we choose for each $q \in T$ a vector $Q_q \in T_qM$, smoothly depending on $q$, which is non-tangent to $T$ and outward pointing. With the help of this vector field $Q$ we may identify $B$ and $TN \times \mathbb{R}$ as bundles over $T \simeq N \times \mathbb{R}$ in the following way. Fix $\xi \in N$, $X_\xi \in T_\xi N$ and $s \in \mathbb{R}$ and view the tangent space $T_\xi N$ as a natural subspace of $T_q(N \times \mathbb{R})$, where $q = (\xi, s)$. Then the desired identification is given by associating the pair $(X_\xi, s)$ with the direction spanned by $Z_q = X_\xi + Q_q - \alpha\, W(q)$, where the number $\alpha$ is uniquely determined by the requirement that $Z_q$ should be lightlike and past-pointing.

Now we consider the map

$$\pi : S \to B \simeq TN \times \mathbb{R} \tag{12}$$

given by following each lightlike geodesic from a point $p \in U$ into the past until it reaches $T$, and assigning the tangent direction at the end-point to the tangent direction at the initial point. As a matter of fact, (12) gives a principal fiber bundle with structure group $\mathbb{R}$. To prove this, we first observe that the geodesic spray induces a vector field without zeros on $S$. By multiplying this vector field with an appropriate function we get a vector field whose flow is defined on all of $\mathbb{R} \times S$ (see the second paragraph after Def. 1 for how to find such a function). The flow of this rescaled vector field defines an $\mathbb{R}$-action on $S$ such that (12) can be identified with the projection onto the space of orbits. Conditions (c) and (d) of Def. 5 guarantee that no orbit is closed or almost closed. Owing to a general result of Palais [5], this is sufficient to prove that this action makes (12) into a principal fiber bundle with structure group $\mathbb{R}$. However, any such bundle is trivializable, see, e.g., Kobayashi and Nomizu [9], pp. 57/58. Choosing a trivialization for (12) gives us the desired diffeomorphism $\Psi$ from $S$ to $B \times \mathbb{R} \simeq TN \times \mathbb{R}^2$. The commutativity of the diagram (11) follows directly from the definition of the lens map $f_p$.

With the help of this lemma we will now prove the following proposition which is at the center of this section.

Proposition 11. Let $(U, T, W)$ be a simple lensing neighborhood in a spacetime $(M, g)$. Then
(a) $N = T/W$ is diffeomorphic to the standard 2-sphere $S^2$;
(b) $U$ is contractible;
(c) for all $p \in U$, the lens map $f_p : S_p \simeq S^2 \to N \simeq S^2$ has $|\deg(f_p)| = 1$; in particular, $f_p$ is surjective.

Proof. In the proof of parts (a) and (b) we shall adapt techniques used by Newman and Clarke [17, 18] in their study of asymptotically simple and empty spacetimes. To that end it will be necessary to assume that the reader is familiar with homology theory. With the sphere bundle $S$, introduced in Lemma 1, we may associate the Gysin homology sequence

$$\cdots \to H_m(S) \to H_m(U) \to H_{m-3}(U) \to H_{m-1}(S) \to \cdots, \tag{13}$$

where Hm (X ) denotes the mth homology group of the space X with coefficients in a field F. For any choice of F, the Gysin sequence is an exact sequence of abelian groups, see, e.g., Spanier [14], p. 260 or, for the analogous sequence of cohomology groups, Bredon [15], p. 390. By Lemma 1, S and N have the same homotopy type, so Hm (S) and Hm (N ) are isomorphic. Upon inserting this into (13), we use the fact

that $H_m(U) = 1$ (= trivial group consisting of the unit element only) for $m > 4$ and $H_m(N) = 1$ for $m > 2$ because $\dim(U) = 4$ and $\dim(N) = 2$. Also, we know that $H_0(U) = F$ and $H_0(N) = F$ since $U$ and $N$ are connected. Then the exactness of the Gysin sequence implies that

$$H_m(U) = 1 \quad \text{for } m > 0 \tag{14}$$

and

$$H_1(N) = 1, \qquad H_2(N) = F. \tag{15}$$

From (15) we read that N is compact since otherwise H2 (N ) = 1. Moreover, we observe that N has the same homology groups and thus, in particular, the same Euler characteristic as the 2-sphere. It is well known that any two compact and orientable 2-manifolds are diffeomorphic if and only if they have the same Euler characteristic (or, equivalently, the same genus), see, e.g., Hirsch [7], Chapter 9. We have thus proven part (a) of the proposition. – To prove part (b) we consider the end of the exact homotopy sequence of the fiber bundle S over U, see, e.g., Frankel [19], p. 600, . . . −→ π1 (S) −→ π1 (U) −→ 1.

(16)

As $S$ has the same homotopy type as $N \simeq S^2$, we may replace $\pi_1(S)$ with $\pi_1(S^2) = 1$, so the exactness of (16) implies that $\pi_1(U) = 1$, i.e., that $U$ is simply connected. If, for some $m > 1$, the homotopy group $\pi_m(U)$ were different from $1$, the Hurewicz isomorphism theorem (see, e.g., Spanier [14], p. 394 or Bredon [15], p. 479, Corollary 10.10) would give a contradiction to (14). Thus, $\pi_m(U) = 1$ for all $m \in \mathbb{N}$, i.e., $U$ is contractible.

We now prove part (c). Since $U$ is contractible, the tangent bundle $TU$ and thus the sphere bundle $S$ over $U$ admits a global trivialization, $S \simeq U \times S^2$. Fixing such a trivialization and choosing a contraction that collapses $U$ onto some point $p \in U$ gives a contraction $\tilde{i}_p : S \to S_p$. Together with the inclusion map $i_p : S_p \to S$ this gives us a homotopy equivalence between $S_p$ and $S$. (Please recall that a homotopy equivalence between two topological spaces $X$ and $Y$ is a pair of continuous maps $\varphi : X \to Y$ and $\tilde\varphi : Y \to X$ such that $\varphi \circ \tilde\varphi$ can be continuously deformed into the identity on $Y$ and $\tilde\varphi \circ \varphi$ can be continuously deformed into the identity on $X$.) On the other hand, the projection $\mathrm{pr}$ from (11), together with the zero section $\widetilde{\mathrm{pr}} : N \to TN \times \mathbb{R}^2$, gives a homotopy equivalence between $TN \times \mathbb{R}^2$ and $N$. As a consequence, the diagram (11) tells us that the lens map $f_p = \mathrm{pr} \circ \Psi \circ i_p$ together with the map $\tilde{f}_p = \tilde{i}_p \circ \Psi^{-1} \circ \widetilde{\mathrm{pr}}$ gives a homotopy equivalence between $S_p \simeq S^2$ and $N \simeq S^2$, so $f_p \circ \tilde{f}_p$ is homotopic to the identity. Since the mapping degree is a homotopy invariant (please recall Property B of the mapping degree from Sect. 4), this implies that $\deg(f_p \circ \tilde{f}_p) = 1$. Now the product theorem for the mapping degree (see, e.g., Choquet-Bruhat, Dewitt-Morette, and Dillard-Bleick [12], p. 483) yields $\deg(f_p)\,\deg(\tilde{f}_p) = 1$. As the mapping degree is an integer, this can be true only if $\deg(f_p) = \deg(\tilde{f}_p) = \pm 1$. In particular, $f_p$ must be surjective since otherwise $\deg(f_p) = 0$.
In all simple examples to which this proposition applies the degree of fp is, actually, equal to +1, and it is hard to see whether examples with deg(fp ) = −1 do exist. The following consideration is quite instructive. If we start with a simple lensing neighborhood in a flat spacetime (or, more generally, in a conformally flat spacetime), then

conjugate points cannot occur, so it is clear that the case deg(fp ) = −1 is impossible. If we now perturb the metric in such a way that the simple-lensing-neighborhood property is maintained during the perturbation, then, by Property B of the degree, the equation deg(fp ) = +1 is preserved. This demonstrates that the case deg(fp ) = −1 cannot occur for weak gravitational fields (or for small perturbations of conformally flat spacetimes such as Robertson–Walker spacetimes). Among other things, Proposition 11 gives a good physical motivation for studying degree-one maps from S 2 to S 2 . In particular, it is an interesting problem to characterize the caustics of such maps. Please note that, by parts (a) and (c) of Proposition 11, fp (Dp ) is simply connected for all p ∈ U. Hence, Proposition 5 applies which says that the formation of a caustic is necessary for multiple imaging. Owing to (10), part (c) of Proposition 11 implies in particular that n(ξ ) = n+ (ξ ) + n− (ξ ) is odd for all worldlines of light sources ξ ∈ N that do not pass through the caustic of the past light cone of p, i.e., if only light rays within U are taken into account the observer at p sees an odd number of images of such a worldline. It is now our goal to prove a similar “odd number theorem” for a light source with worldline inside U. As a preparation we establish the following lemma. Lemma 2. Let (U, T , W ) be a simple lensing neighborhood in a spacetime (M, g) and p ∈ U. Let J − (p, U) denote, as usual, the causal past of p in U, i.e., the set of all points in M that can be reached from p along a past-pointing causal curve in U. Let ∂U J − (p, U) denote the boundary of J − (p, U) in U. Then (a) every point q ∈ ∂U J − (p, U) can be reached from p along a past-pointing lightlike geodesic in U; (b) ∂U J − (p, U) is relatively compact in M. Proof. 
As usual, let I − (p, U) denote the chronological past of p in U, i.e., the set of all points that can be reached from p along a past-pointing timelike curve in U. To prove part (a), fix a point q ∈ ∂U J − (p, U). Choose a sequence (pi )i∈N of points in U that converge towards p in such a way that p ∈ I − (pi , U) for all i ∈ N. This implies that we can find for each i ∈ N a past-pointing timelike curve λi from pi to q. Then the λi are past-inextendible in U \ {q}. Owing to a standard lemma (see, e.g., Wald [20], Lemma 8.1.5) this implies that the λi have a causal limit curve λ through p that is pastinextendible in U \ {q}. We want to show that λ is the desired lightlike geodesic. Assume that λ is not a lightlike geodesic. Then λ enters into the open set I − (p, U) (see Hawking and Ellis [21], Prop. 4.5.10), so λi enters into I − (p, U) for i sufficiently large. This, however, is impossible since all λi have past end-point on ∂U J − (p, U), so λ must be a lightlike geodesic. It remains to show that λ has past end-point at q. Assume that this is not true. Since λ is past-inextendible in U \ {q} this assumption implies that λ is pastinextendible in U, so by condition (c) of Def. 5 λ has past end-point on ∂U and meets ∂U transversely. As a consequence, for i sufficiently large λi has to meet ∂U which gives a contradiction to the fact that all λi are within U. – To prove part (b), we have to show that any sequence (qi )i∈N in ∂U J − (p, U) has an accumulation point in M. So let us choose such a sequence. From part (a) we know that there is a past-pointing lightlike geodesic µi from p to qi in U for all i ∈ N. By compactness of Sp S 2 , the tangent directions to these geodesics at p have an accumulation point in Sp . Let µ be the past-pointing lightlike geodesic from p which is determined by this direction. By condition (c) of Definition 5, this geodesic µ and each of the geodesics µi must have a past end-point on ∂U if maximally extended inside U. 
We may choose an affine parametrization for each of those geodesics with the parameter ranging from the value 0 at p to the value 1 at ∂U.

Then our sequence $(q_i)_{i\in\mathbb{N}}$ in $U$ determines a sequence $(s_i)_{i\in\mathbb{N}}$ in the interval $[0, 1]$ by setting $q_i = \mu_i(s_i)$. By compactness of $[0, 1]$, this sequence must have an accumulation point $s \in [0, 1]$. This demonstrates that the $q_i$ must have an accumulation point in $M$, namely the point $\mu(s)$.

We are now ready to prove the desired odd-number theorem for light sources with worldline in $U$.

Proposition 12. Let $(U, T, W)$ be a simple lensing neighborhood in a spacetime $(M, g)$ and assume that $U$ does not contain a closed timelike curve. Fix a point $p \in U$ and a timelike embedded $C^\infty$ curve $\gamma$ in $U$ whose image is a closed topological subset of $M$. (The latter condition excludes the case that $\gamma$ has an end-point on $\partial U$.) Then the following is true.
(a) If $\gamma$ does not meet the point $p$, then there is a past-pointing lightlike geodesic from $p$ to $\gamma$ that lies completely within $U$ and contains no conjugate points in its interior. (The end-point may be conjugate to the initial-point.) If this geodesic meets $\gamma$ at the point $q$, say, then all points on $\gamma$ that lie to the future of $q$ cannot be reached from $p$ along a past-pointing lightlike geodesic in $U$.
(b) If $\gamma$ meets neither the point $p$ nor the caustic of the past light cone of $p$, then the number of past-pointing lightlike geodesics from $p$ to $\gamma$ that are completely contained in $U$ is finite and odd.

Proof. In the first step we construct a $C^\infty$ vector field $V$ on $M$ that is timelike on $U$, has $\gamma$ as an integral curve, and coincides with $W$ on $T = \partial U$. To that end we first choose any future-pointing timelike $C^\infty$ vector field $V_1$ on $M$. (Existence is guaranteed by our assumption of time-orientability.) Then we extend the vector field $W$ to a $C^\infty$ vector field $V_2$ onto some neighborhood V of $T$. Since $W$ is causal and future-pointing, $V_2$ may be chosen timelike and future-pointing on V $\setminus\, T$. (Here we make use of the fact that $T = \partial U$ is a closed subset of $M$.)
Finally we choose a timelike and future-pointing vector field V3 on some neighborhood W of γ that is tangent to γ at all points of γ . (Here we make use of the fact that the image of γ is a closed subset of M.) We choose the neighborhoods V and W disjoint which is possible since γ is completely contained in U and closed in M. With the help of a partition of unity we may now combine the three vector fields V1 , V2 , V3 into a vector field V with the desired properties. In the second step we consider the quotient space M/V . This space contains the open subset U/V whose boundary T /V = N is, by Prop. 11, a manifold diffeomorphic to S 2 . We want to show that U/V is a manifold (which, according to our terminology, in particular requires that U/V is a Hausdorff space). To that end we consider the map jp : ∂U J − (p, U) −→ U/V which assigns to each point q ∈ ∂U J − (p, U) the integral curve of V passing through that point. (In this proof overlining always means closure in M.) Clearly, jp is continuous with respect to the topology ∂U J − (p, U) inherits as a subspace of M and the quotient topology on U/V . Moreover, ∂U J − (p, U) intersects each integral curve of V at most once, and if it intersects one integral curve then it also intersects all neighbboring integral curves in U; this follows from Wald [20], Theorem 8.1.3. Hence, jp is injective and its image is open in U/V . On the other hand, part (b) of Lemma 2 implies that the image of jp is closed. Since the image of jp is non-empty and connected, it must be all of U/V . (The domain of jp and, thus, the image of jp is non-empty because U does not contain a closed timelike curve. The domain and, thus, the image of jp is connected since U is connected.) We have, thus, proven that jp

is a homeomorphism. This implies that the Hausdorff condition is satisfied on $\overline{U}/V$ and, in particular, on $U/V$. Since $V$ is timelike and $U$ contains no closed timelike curves, this makes sure that $U/V$ is a manifold according to our terminology, see Harris [10], Theorem 2.

In the third step we use these results to prove part (a) of the proposition. Our result that $j_p$ is a homeomorphism implies, in particular, that $\gamma$ has an intersection with $\partial_U J^-(p, U)$ at some point $q$. Now part (a) of Lemma 2 shows that there is a past-pointing lightlike geodesic from $p$ to $q$ in $U$. This geodesic cannot contain conjugate points in its interior since otherwise a small variation would give a timelike curve from $p$ to $q$, see Hawking and Ellis [21], Prop. 5.4.12, thereby contradicting $q \in \partial_U J^-(p, U)$. The rest of part (a) is clear since all past-pointing lightlike geodesics in $U$ that start at $p$ are confined to $J^-(p, U)$.

In the last step we prove part (b). To that end we choose on the tangent space $T_pM$ a Lorentz basis $(E_p^1, E_p^2, E_p^3, E_p^4)$ with $E_p^4$ future-pointing, and we identify each $x = (x^1, x^2, x^3) \in \mathbb{R}^3$ with the past-pointing lightlike vector $Y_p = x^1E_p^1 + x^2E_p^2 + x^3E_p^3 - |x|\,E_p^4$. With this identification, the lens map takes the form $f_p : S^2 \to N = \partial U/V$, $x \mapsto \pi_V\bigl(\exp_p(w_p(x)\,x)\bigr)$. We now define a continuous map $F : B \to M/V$ on the closed ball $B = \{\, x \in \mathbb{R}^3 \mid |x| \leq 1 \,\}$ by setting $F(x) = \pi_V\bigl(\exp_p(w_p(\tfrac{x}{|x|})\,x)\bigr)$ for $x \neq 0$ and $F(0) = \pi_V(p)$. The restriction of $F$ to the interior of $B$ is a $C^\infty$ map onto the manifold $U/V$, with the exception of the origin where $F$ is not differentiable. The latter problem can be circumvented by approximating $F$ in the $C^0$-sense, on an arbitrarily small neighborhood of the origin, by a $C^\infty$ map. Then the mapping degree $\deg(F)$ can be calculated (see, e.g., Choquet-Bruhat, Dewitt-Morette, and Dillard-Bleick [12], pp. 477) with the help of the integral formula

$$\int_B F^*\omega = \deg(F) \int_{U/V} \omega, \tag{17}$$

where $\omega$ is any 3-form on $U/V$ and the star denotes the pull-back of forms. For any 2-form $\psi$ on $U/V$, we may apply this formula to the form $\omega = d\psi$. With the help of the Stokes theorem we then find

$$\int_{S^2} F^*\psi = \deg(F) \int_N \psi. \tag{18}$$
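The integral formula for the degree used in (17) and (18) can be illustrated in the simplest possible setting. The sketch below is a one-dimensional analogue and not the three-dimensional situation of the proof: it assumes a circle map given through the derivative of its lift, takes $\omega = d\theta$, and evaluates $\frac{1}{2\pi}\int_0^{2\pi} F'(\theta)\,d\theta$ with a midpoint rule. The function name `degree_by_integral` and the two sample maps are hypothetical choices made for this illustration.

```python
import math
from typing import Callable


def degree_by_integral(lift_derivative: Callable[[float], float],
                       samples: int = 2000) -> float:
    """Degree of a circle map from the pulled-back form F* d(theta) = F'(theta) d(theta).

    Computes (1 / 2 pi) * integral of F'(theta) over [0, 2 pi] with the
    midpoint rule; the result should be close to an integer.
    """
    step = 2.0 * math.pi / samples
    integral = sum(lift_derivative((i + 0.5) * step) for i in range(samples)) * step
    return integral / (2.0 * math.pi)


# Assumed example: lift F(theta) = 2 theta + 0.5 sin(theta), which wraps twice.
deg_double = degree_by_integral(lambda t: 2.0 + 0.5 * math.cos(t))

# Assumed example: lift F(theta) = -theta, an orientation-reversing map.
deg_flip = degree_by_integral(lambda t: -1.0)

print(round(deg_double), round(deg_flip))
```

As in the text, the value of the integral does not depend on the details of the map but only on its homotopy class, which is why the perturbation $0.5\sin\theta$ leaves the result unchanged.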

However, the restriction of F to ∂B = S 2 gives the lens map, so on the left-hand side of (18) we may replace F ∗ ψ by fp∗ ψ. Then comparison with the integral formula for the degree of fp shows that deg(F ) = deg(fp ) which, according to Prop. 11, is equal to ±1. For every ζ ∈ U/V that is a regular value of F , the result deg(F ) = ±1 implies that the number of elements in F −1 (ζ ) is finite and odd. By assumption, the worldline γ ∈ U/V meets neither the point p nor the caustic of the past light cone of p. The first condition makes sure that our perturbation of F near the origin can be done without influencing the set F −1 (γ ); the second condition implies that γ is a regular value of F , please recall our discussion at the end of Sect. 3. This completes the proof. If only light rays within U are taken into account, then Prop. 12 can be summarized by saying that, for light sources in a simple lensing neighborhood, the “youngest image” has always even parity and the total number of images is finite and odd. In the quasi-Newtonian approximation formalism it is a standard result that a transparent gravitational lens produces an odd number of images, see Schneider, Ehlers and

Falco [1], Section 5.4, for a detailed discussion. Proposition 12 may be viewed as a reformulation of this result in a Lorentzian geometry setting. It is quite likely that an alternative proof of Prop. 12 can be given by using the Morse theoretical results of Giannoni, Masiello and Piccione [22, 23]. Also, the reader should compare our results with the work of McKenzie [16] who used Morse theory for proving an odd-number theorem in certain globally hyperbolic spacetimes. Contrary to McKenzie’s theorem, our Prop. 12 requires mathematical assumptions which can be physically interpreted rather easily.

6. Examples

6.1. Two simple examples with non-transparent deflectors.

6.1.1. Non-transparent string. As a simple example, we consider gravitational lensing in the spacetime $(M, g)$ where $M = \mathbb{R}^2 \times (\mathbb{R}^2 \setminus \{0\})$ and

$$g = -dt^2 + dz^2 + dr^2 + k^2 r^2\, d\varphi^2 \tag{19}$$

with some constant $0 < k < 1$. Here $(t, z)$ denote Cartesian coordinates on $\mathbb{R}^2$ and $(r, \varphi)$ denote polar coordinates on $\mathbb{R}^2 \setminus \{0\}$. This can be interpreted as the spacetime around a static non-transparent string, see Vilenkin [24], Hiscock [25] and Gott [26]. One should think of the string as being situated at the $z$-axis. Since the latter is not part of the spacetime, it is indeed justified to speak of a non-transparent string. As $\partial/\partial t$ is a Killing vector field normalized to $-1$, the lightlike geodesics in $(M, g)$ correspond to the geodesics of the space part. The latter is a metrical product of a real line with coordinate $z$ and a cone with polar coordinates $(r, \varphi)$. So the geodesics are straight lines if we cut the cone open along some radius $\varphi = \mathrm{const.}$ and flatten it out in a plane. Owing to this simple form of the lightlike geodesics, the investigation of lens maps in this string spacetime is quite easy.

To work this out, choose some constant $R > 0$ and let $T$ denote the hypercylinder $r = R$ in $M$. Let $W$ denote the restriction of the vector field $\partial/\partial t$ to $T$. Then $(T, W)$ is a source surface, with $N = T/W \simeq S^1 \times \mathbb{R}$. Henceforth we discuss the lens map $f_p$ for any point $p \in M$ at a radius $r < R$. There are no past-pointing lightlike geodesics from $p$ that intersect $T$ more than once or touch $T$ tangentially, so the lens map $f_p$ gives full information about all images at $p$ of each light source $\xi \in N$. The domain $D_p$ of the lens map is given by excising a curve segment, namely a meridian including both end-points at the “poles”, from the celestial sphere $S_p$, so $D_p \simeq \mathbb{R}^2$ is connected. The boundary of $D_p$ in $S_p$ corresponds to light rays that are blocked by the string before reaching $T$. It is easy to see that the lens map cannot be continuously extended onto $S_p$ (= closure of $D_p$ in $S_p$). Nonetheless, the lens map admits an extension in the sense of Def. 4. We may choose $M_1 = S^2$ and $M_2 = S^2$. Here $D_p$ is embedded into the sphere in such a way that it covers a region $(\theta, \varphi) \in \,]0, \pi[\, \times \,]\varepsilon, 2\pi - \varepsilon[\,$, i.e., in comparison with the embedding into $S_p$ the curve segment excised from the sphere has been “widened” a bit. The embedding of $N \simeq S^1 \times \mathbb{R}$ into $S^2$ is made via Mercator projection. As the string spacetime has vanishing curvature, the light cones in $M$ have no caustics. Owing to our general results of Sect. 3, this implies that the caustic of the lens map is empty and that all images have even parity, so (8) gives $\deg(\overline{f_p}, \xi) = n_+(\xi) = n(\xi)$ for all $\xi \in N \setminus \overline{f_p}(\partial D_p)$. The actual value of $n(\xi)$ depends on the parameter $k$ that enters into the metric (19). If $i = 1/k$ is an integer, $N \setminus \overline{f_p}(\partial D_p)$ is connected and $n(\xi) = i$ everywhere on this set. If

$i < 1/k < i + 1$ for some integer $i$, $N \setminus \overline{f_p}(\partial D_p)$ has two connected components, with $n(\xi) = i$ on one of them and $n(\xi) = i + 1$ on the other. Thus, the string produces multiple imaging and the number of images is (finite but) arbitrarily large if $k$ is sufficiently small. For all $k \in \,]0, 1[\,$, the lens map is surjective, $f_p(D_p) = N \simeq S^1 \times \mathbb{R}$. So this example shows that the assumption of $f_p(D_p)$ being simply connected was essential in Prop. 5.

6.1.2. Non-transparent spherical body. We consider the Schwarzschild metric

$$g = \Bigl(1 - \frac{2m}{r}\Bigr)^{-1} dr^2 + r^2\bigl(d\theta^2 + \sin^2\theta\, d\varphi^2\bigr) - \Bigl(1 - \frac{2m}{r}\Bigr) dt^2 \tag{20}$$

on the manifold $M = \,]R_o, \infty[\, \times S^2 \times \mathbb{R}$. In (20), $r$ is the coordinate ranging over $]R_o, \infty[$, $t$ is the coordinate ranging over $\mathbb{R}$, and $\theta$ and $\varphi$ are spherical coordinates on $S^2$. This gives the static vacuum spacetime around a spherically symmetric body of mass $m$ and radius $R_o$. Restricting the spacetime manifold to the region $r > R_o$ is a way of treating the central body as non-transparent. In the following we keep a value $R_o > 0$ fixed and we allow $m$ to vary between $m = 0$ (flat space) and $m = R_o/2$ (black hole). For discussing lens maps in this spacetime we fix a constant $R > 3R_o/2$. We denote by $T$ the set of all points in $M$ with coordinate $r = R$ and we denote by $W$ the restriction of $\partial/\partial t$ to $T$. Then $(T, W)$ is a source surface, with $N = T/W \simeq S^2$. It is our goal to discuss the properties of the lens map $f_p : D_p \to N$ for a point $p \in M$ with a radius coordinate $r < R$ in dependence of the mass parameter $m$. To that end we make use of well-known properties of the lightlike geodesics in the Schwarzschild metric, see, e.g., Chandrasekhar [28], Sect. 20, for a comprehensive discussion. For determining the relevant features of the lens map it will be sufficient to concentrate on qualitative aspects of image positions. For quantitative aspects the reader may consult Virbhadra and Ellis [27].

We first observe that, for any $m \in [0, R_o/2]$, there is no past-pointing lightlike geodesic from $p$ that intersects $T$ more than once or touches $T$ tangentially. This follows from the fact that in the region $r > 3m$ the radius coordinate has no local maximum along any light ray. So the lens map $f_p$ gives full information about all images at $p$ of light sources $\xi \in N$. For $m = 0$, the light rays are straight lines. The domain $D_p$ of the lens map is given by excising a disc, including the boundary, from the celestial sphere $S_p$, i.e., $D_p \simeq \mathbb{R}^2$.
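The statement that the radius coordinate has no local maximum along a light ray in the region $r > 3m$ can be made plausible with a short sketch. It assumes the standard textbook form of the radial equation for lightlike geodesics in the metric (20), $\dot r^2 + L^2 V(r) = E^2$ with $V(r) = (1 - 2m/r)/r^2$, where $E$ and $L$ are the conserved energy and angular momentum; this equation and the function names below are assumptions of the illustration and are not taken from the paper. Differentiating gives $\ddot r = -\tfrac{1}{2} L^2 V'(r)$, so a turning point at $r > 3m$ has $\ddot r > 0$ and is a local minimum of $r$.

```python
def effective_potential(r: float, m: float) -> float:
    """V(r) = (1 - 2m/r) / r^2 for lightlike geodesics (assumed standard form)."""
    return (1.0 - 2.0 * m / r) / r**2


def potential_slope(r: float, m: float) -> float:
    """Analytic derivative V'(r) = 2 (3m - r) / r^4."""
    return 2.0 * (3.0 * m - r) / r**4


def radial_acceleration_at_turning_point(r: float, m: float, L: float = 1.0) -> float:
    """Second derivative of r along the ray where dr/dlambda = 0: -(L^2 / 2) V'(r)."""
    return -0.5 * L**2 * potential_slope(r, m)


m = 1.0
# Outside r = 3m every turning point is a local minimum of r (positive acceleration).
outside = [radial_acceleration_at_turning_point(r, m) for r in (3.1, 4.0, 10.0, 100.0)]
# Inside r = 3m a turning point would be a local maximum of r.
inside = radial_acceleration_at_turning_point(2.5, m)

print(all(a > 0 for a in outside), inside < 0)
```

For $m = 0$ the potential reduces to $1/r^2$, consistent with the straight light rays of flat space, which also have only minima of $r$.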
The boundary of Dp corresponds to light rays grazing the surface of the central body, so fp can be continuously extended onto the closure of Dp in Sp , thereby giving an extension of fp , in the sense of Def. 4, fp : Dp ⊆ Sp −→ N . In Fig. 2, fp (∂Dp ) can be represented as a “circle of equal latitude” on the sphere r = R, with the image of fp


On the Definition of SRB-Measures for Coupled Map Lattices Esa Järvenpää, Maarit Järvenpää University of Jyväskylä, Department of Mathematics, P.O. Box 35, 40351 Jyväskylä, Finland. E-mail: [email protected]; [email protected] Received: 23 June 2000 / Accepted: 4 January 2001

Abstract: We consider SRB-measures of coupled map lattices. The emphasis is given to a definition according to which a SRB-measure is an invariant probability measure whose projections onto finite-dimensional subsystems are absolutely continuous with respect to the Lebesgue measure. We show that coupled map lattices which are close to an uncoupled expanding map have typically an infinite number of SRB-measures. In particular, we give a counterexample to the Bricmont–Kupiainen conjecture.

1. Introduction

The SRB-measure (Sinai, Ruelle, Bowen) is by definition a “natural” invariant probability measure of a dynamical system $(X, T)$, where $X$ is a manifold and $T : X \to X$ is a differentiable mapping. The meaning of the word “natural” comes from the interpretation that the dynamical system is a model of some physical system. The natural measure should tell how typical points behave asymptotically, that is, what the long time behaviour of the system is for typical initial values. Typical points are determined by the set-up of the actual experiment. If the phase space of the system is a manifold then one may argue that the Lebesgue measure or some smooth modification of it is the right distribution for the initial values. Having found an invariant measure $\mu$ and a set $A \subset X$ with positive Lebesgue measure such that the Birkhoff average $\frac{1}{n}\sum_{i=1}^{n} \delta_{T^i(x)}$ tends to $\mu$ in the weak$^*$-topology as $n \to \infty$ for all $x \in A$, it is reasonable to say that $\mu$ is a SRB-measure. Here $\delta_x$ is the probability measure concentrated at the point $x$. The existence of several other definitions for the SRB-measure found in the literature stems from the fact that this is a difficult condition to test. One definition is that the SRB-measure is an invariant probability measure whose conditional distributions on unstable leaves are absolutely continuous with respect to the corresponding Lebesgue measure. According to another definition it is an equilibrium state for a certain potential function obtained from the derivative of the map. A third definition states that the SRB-measure is a limit of the


Lebesgue measure under the iteration of the dynamics. For nice finite-dimensional systems like expanding maps on compact manifolds or axiom A systems all these definitions agree and give the same unique SRB-measure. When adopting the aforementioned definitions into the infinite-dimensional setting of coupled map lattices, one should take into consideration that in an experiment it is possible to measure only a finite number of quantities, in particular, a finite number of coordinates. Thus it seems quite natural to demand that the finite dimensional projections of a SRB-measure are absolutely continuous with respect to the corresponding Lebesgue measure. The extension of the equilibrium state definition to the infinite dimensional setting is not trivial because of the difficulties caused by infinite determinants and matrices. The third definition is obtained by studying finite dimensional approximations of the whole system, taking the limit of the (finite) Lebesgue measure under these approximations, and letting the subsystem size tend to infinity. Even for expanding maps one possibility is to demand that finite dimensional conditional distributions are absolutely continuous. All of the above definitions have been used in the literature. Bunimovich and Sinai [BS] studied expanding maps of the unit interval with a special diffusive coupling over one-dimensional lattice Z. They showed that the system has an invariant Gibbs state whose projections onto finite-dimensional subsystems are absolutely continuous with respect to the Lebesgue measure. In [BK1] Bricmont and Kupiainen used the first mentioned definition, proved the existence of a SRB-measure for analytic expanding circle maps in the regime of small analytic coupling over d-dimensional lattice Zd , and conjectured the uniqueness of this SRB-measure. They extended the existence result for special Hölder continuous functions in [BK2]. 
They also verified that the SRB-measure is unique in the class of measures for which the logarithm of the density is Hölder continuous. In [J] it was shown that all these results remain true if one replaces the circle by any compact Riemannian manifold. Jiang and Pesin [JP] considered weakly coupled Anosov maps. They managed to extend the equilibrium state definition to this setting and proved the existence and uniqueness of the SRB-measure. Recently, Keller and Zweimüller [KZ] studied piecewise expanding interval maps with a special unidirectional coupling using the last mentioned definition. They established the existence and uniqueness of the SRB-measure in this setting. Finally, the proofs of [BK2, JP] give the uniqueness of the SRB-measure given as in the third definition above. The purpose of this paper is to show that the first mentioned definition is not equivalent to the second and third ones in an infinite dimensional setting. We will construct a coupled map lattice which has an infinite number of SRB-measures according to the first mentioned definition (see Theorem 3.4). (Three of these are also (space) translation invariant.) We also argue that our example is not just a curious artificial system but it manifests a typical behaviour. Thus, although being perhaps the most natural of the above definitions at the heuristic level, this definition has the drawback of being non-unique. Our results also imply that for each finite subsystem $X$ one can find a set $A$ of positive Lebesgue measure such that for each $x \in A$ there are boundary conditions $y_1(x)$ and $y_2(x)$ such that

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \delta_{T^i(x \vee y_i)} = \mu_i,$$

where $\mu_1 \neq \mu_2$ and $x \vee y$ is the natural element of the phase space $X$. Hence the boundary conditions do have an effect. Note that one cannot draw the conclusion that there is a physical phase transition since for each $x \in A$ one has to choose the boundary


condition in a very special way in order to see another SRB-measure than the one whose existence was proved in [BK2].

2. Preliminaries

Our main motivation comes from the well-known projection results in $\mathbb{R}^n$ stating that the projections of a Radon measure $\mu$ onto almost all $m$-planes are absolutely continuous with respect to the $m$-dimensional Lebesgue measure provided that the $m$-energy of $\mu$ is finite [M, Theorem 9.7]. Our strategy is to use the fact that expanding maps have small invariant sets (and measures) in the sense that their dimensions are less than the dimension of the ambient manifold. For example, the $\frac{1}{3}$-Cantor set is invariant under the map $x \mapsto 3x \bmod 1$. If one takes a finite $n$-fold product of these Cantor sets, one will obtain a set which is invariant under the corresponding $n$-fold product map. Of course, the dimension of this product set is less than $n$, and so the natural Hausdorff measure living on the set, although being invariant, is not a SRB-measure since it is not absolutely continuous with respect to the $n$-dimensional Lebesgue measure. However, as $n$ grows, the dimension of the product Cantor set grows. In particular, for each integer $m$ one can find $n$ such that the dimension of the $n$-fold Cantor set is greater than $m$. By the above mentioned projection result typical projections of the $n$-fold Hausdorff measure onto $m$-dimensional subspaces are absolutely continuous with respect to the $m$-dimensional Lebesgue measure. Of course, for this system the $m$-dimensional subsystems are atypical and the projections onto them are not absolutely continuous. Our idea is that a small coupling will make these coordinate planes typical ones. However, one has to be careful since in [HK] Hunt and Kaloshin proved that these projection results are not valid in infinite dimensional spaces. The projection theorems have also the reversed statements according to which the set of exceptional directions may have positive dimension although having zero measure (see [F]). Thus one cannot expect anything more than “almost all”-results.
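The two facts used in this strategy, invariance of the $\frac{1}{3}$-Cantor set under $x \mapsto 3x \bmod 1$ and the growth of the dimension $ns$ of the $n$-fold product with $s = \log 2/\log 3$, can be checked with a short sketch. The helper names (`tripling_map`, `in_cantor_set`, `min_factors`) are illustrative and do not come from the paper, and membership is only tested to a finite ternary depth, which is adequate for the rational sample points used here because their orbits are periodic.

```python
from fractions import Fraction
import math

# Hausdorff dimension of the 1/3-Cantor set.
S = math.log(2) / math.log(3)

def tripling_map(x: Fraction) -> Fraction:
    """The expanding map x -> 3x mod 1, in exact rational arithmetic."""
    return (3 * x) % 1

def in_cantor_set(x: Fraction, depth: int = 60) -> bool:
    """Finite-depth membership test: at every step x must avoid the open middle third."""
    third, two_thirds = Fraction(1, 3), Fraction(2, 3)
    for _ in range(depth):
        if not 0 <= x <= 1:
            return False
        if third < x < two_thirds:
            return False
        # Zoom into the left or right third (a shift of the ternary expansion).
        x = 3 * x if x <= third else 3 * x - 2
    return True

def min_factors(m: int) -> int:
    """Smallest n such that the n-fold product Cantor set has dimension n*S > m."""
    return math.floor(m / S) + 1
```

For instance, $x = 1/4$ (ternary expansion $0.0202\ldots$) lies in the Cantor set and is mapped to $3/4$, which lies in it again, while `min_factors(1)` and `min_factors(2)` give the number of coordinates needed before the product set exceeds dimension 1 and 2, respectively.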
We adopt the very general formulation of the projection theorem due to Peres and Schlag [PS]. We begin by recalling the notation from [PS] which we will use later. Definition 2.1. Let (X, d) be a compact metric space, Q ⊂ Rn an open connected set, and : Q × X → Rm a continuous map with n ≥ m. For any multi-index |η| η = (η1 , . . . , ηn ) ∈ Nn , let |η| = ni=1 ηi be the length of it, and ∂ η = (∂ε1 )η1∂...(∂εn )ηn , where = (ε1 , . . . , εn ) ∈ Q. Let L be a positive integer and δ ∈ [0, 1). We say that ∈ C L,δ (Q) if for any compact set Q ⊂ Q and for any multi-index η with |η| ≤ L there exist constants Cη,Q and Cδ,Q such that

|∂ η (, x)| ≤ Cη,Q and sup |∂ η (, x) − ∂ η ( , x)| ≤ Cδ,Q | − |δ |η |=L

for all , ∈ Q and x ∈ X. Next we will give a definition of a subclass of C L,δ (Q) from [PS]. Definition 2.2. Let ∈ C L,δ (Q) for some L and δ. Define for all x = y ∈ X, x,y () =

(, x) − (, y) . d(x, y)


Let β ∈ [0, 1). The set Q is a region of transversality of order β for if there exists a constant Cβ such that for all ∈ Q and for all x = y ∈ X the condition |x,y ()| ≤ Cβ d(x, y)β implies det(Dx,y ()(Dx,y ())T ) ≥ Cβ2 d(x, y)2β . Here the derivative with respect to is denoted by D and AT is the transpose of a matrix A. Further, is (L, δ)-regular on Q if there exists a constant Cβ,L,δ and for all multiindices η with |η| ≤ L there exists a constant Cβ,η such that for all , ∈ Q and for all distinct x, y ∈ X, |∂ η x,y ()| ≤ Cβ,η d(x, y)−β|η| and

sup |∂ η x,y () − ∂ η x,y ( )| ≤ Cβ,L,δ | − |δ d(x, y)−β(L+δ) .

|η |=L

Remark 2.3. Note that if the determinant in Definition 2.2 is bounded away from zero then $Q$ is a region of transversality of order $\beta$ for all $\beta \in [0, 1)$.

Definition 2.4. Let $\mu$ be a Borel measure on $X$ and $\alpha \in \mathbb{R}$. The $\alpha$-energy of $\mu$ is

$$E_\alpha(\mu) = \int_X \int_X d(x, y)^{-\alpha}\, d\mu(x)\, d\mu(y).$$

We denote the image of a measure µ under a map f : X → Y by f∗ µ, that is, f∗ µ(A) = µ(f −1 (A)) for all A ⊂ Y . The following theorem from [PS] gives a relation between Sobolev-norms of images of measures under C L,δ (Q)-mappings and energies of original measures. Theorem 2.5. Let Q ⊂ Rn and ∈ C L,δ (Q) such that L + δ > 1. Let β ∈ [0, 1). Assume that Q is a region of transversality of order β for and that is (L, δ)-regular on Q. Let µ be a finite Borel measure on X such that Eα (µ) < ∞ for some α > 0. Then there exist a constant a0 depending only on m, n, and δ such that for any compact Q ⊂ Q, ∗ µ22,γ dLn () ≤ Cγ Eα (µ) Q

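for some constant $C_\gamma$ provided that $0 < (m + 2\gamma)(1 + a_0\beta) \leq \alpha$ and $2\gamma < L + \delta - 1$. Here $\|\cdot\|_{2,\gamma}$ is the Sobolev norm, that is,

$$\|\nu\|_{2,\gamma}^2 = \int_{\mathbb{R}^m} |\hat{\nu}(\xi)|^2 |\xi|^{2\gamma}\, d\mathcal{L}^m(\xi)$$

for any finite compactly supported Borel measure $\nu$ on $\mathbb{R}^m$, where

$$\hat{\nu}(\xi) = \int_{\mathbb{R}^m} e^{-i\xi\cdot x}\, d\nu(x)$$

is the Fourier transform of $\nu$.

Proof. [PS, Theorem 7.3].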


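Remark 2.6. Let $\nu$ be a finite compactly supported Borel measure on $\mathbb{R}^n$. If $\|\nu\|_{2,0} < \infty$ then $\nu$ is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^n$ and its Radon-Nikodym derivative is $L^2$-integrable, that is, $D(\nu, \mathcal{L}^n) \in L^2(\mathbb{R}^n)$ (see 3.5). Indeed, if $\hat{\nu} \in L^2(\mathbb{R}^n)$ then by the surjectivity of the Fourier transform [SW, Theorem 2.3, p. 17] there exists $f \in L^2(\mathbb{R}^n)$ such that $\hat{f} = \hat{\nu}$. Thus by [T, Definition 1.7, p. 262] $f = \nu$ as a distribution meaning that $f = D(\nu, \mathcal{L}^n)$. Note also that $\|\nu\|_{2,\gamma} < \infty$ for $\gamma \geq n + 2$ implies that $D(\nu, \mathcal{L}^n)$ has $L^2$-integrable derivatives of order $\gamma$, that is, $D(\nu, \mathcal{L}^n) \in W_2^\gamma(\mathbb{R}^n)$. So by [SW, Lemma 3.17, p. 26] $D(\nu, \mathcal{L}^n)$ is continuously differentiable.

3. Results

Let $\Omega = \prod_{\mathbb{Z}^d} S^1$, where $d \geq 1$ is an integer and $S^1 \subset \mathbb{C}$ is the unit circle. We use the notation $\Omega_\Lambda = \prod_{\Lambda} S^1$ for all $\Lambda \subset \mathbb{Z}^d$. For $\Lambda \subset \tilde{\Lambda} \subset \mathbb{Z}^d$ let $\pi_\Lambda : \Omega \to \Omega_\Lambda$ and $\pi_{\tilde{\Lambda},\Lambda} : \Omega_{\tilde{\Lambda}} \to \Omega_\Lambda$ be the natural projections. Let $\varepsilon_0 > 0$ and let $A_\varepsilon : \Omega \to \Omega$ be such that its lift $\tilde{A}_\varepsilon : \tilde{\Omega} \to \tilde{\Omega}$, where $\tilde{\Omega} = \prod_{\mathbb{Z}^d} \mathbb{R}$, is

$$\tilde{A}_\varepsilon(x)_i = x_i + \sum_{l \in \mathbb{Z}^d} \varepsilon_{il}\, 2^{-|i-l|} g(x_l) \tag{3.1}$$

for all $i \in \mathbb{Z}^d$, where $|\cdot|$ is a metric on $\mathbb{Z}^d$, $\varepsilon_{il} \in (-\varepsilon_0, \varepsilon_0)$ for all $i, l \in \mathbb{Z}^d$ and $g$ is continuously differentiable and 1-periodic. (We use the covering map $p : \tilde{\Omega} \to \Omega$ such that $\prod_{\mathbb{Z}^d} [0, 1]$ is a covering domain. Then $A_\varepsilon = p \circ \tilde{A}_\varepsilon \circ p^{-1}$.) For the discussion of the explicit form of the conjugacy $A_\varepsilon$, see Remarks 3.5. Set $E = \prod_{\mathbb{Z}^d \times \mathbb{Z}^d} (-\varepsilon_0, \varepsilon_0)$ and denote by $L$ the product over $\mathbb{Z}^d \times \mathbb{Z}^d$ of normalized Lebesgue measures on $(-\varepsilon_0, \varepsilon_0)$. It is not difficult to see that $A_\varepsilon$ is invertible for all $\varepsilon \in E$ provided $\varepsilon_0$ is small enough (depending on $|g'|$). We fix such $\varepsilon_0$ and set $T_\varepsilon = A_\varepsilon \circ F \circ A_\varepsilon^{-1}$, where $F : \Omega \to \Omega$ is the product over $\mathbb{Z}^d$ of maps $z \mapsto z^3$ (or $t \mapsto 3t \bmod 1$ if $S^1$ is viewed as $[0, 1]$). Let $\mathbf{K} = \prod_{\mathbb{Z}^d} K$ and $\mu = \prod_{\mathbb{Z}^d} \mathcal{H}^s|_K$, where $K$ is the $\frac{1}{3}$-Cantor set on $S^1$ (or $[0, 1]$) and $\mathcal{H}^s|_K$ is the restriction of the $s$-dimensional Hausdorff measure to $K$ with $s = \frac{\log 2}{\log 3}$. (Note that $s$ is the Hausdorff dimension of $K$.) Now $(A_\varepsilon)_*\mu$ is clearly $T_\varepsilon$-invariant, that is, $(T_\varepsilon)_*(A_\varepsilon)_*\mu = (A_\varepsilon)_*\mu$. Our aim is to show that for $L$-almost all $\varepsilon$ the projection $(\pi_\Lambda)_*(A_\varepsilon)_*\mu$ is absolutely continuous with respect to the Lebesgue measure on $\Omega_\Lambda$ for all finite $\Lambda \subset \mathbb{Z}^d$. Let $\Lambda \subset \mathbb{Z}^d$. We denote the restriction of $A_\varepsilon$ to $\Omega_\Lambda$ by $A_{\varepsilon,\Lambda}$, that is,

$$A_{\varepsilon,\Lambda}(x)_i = x_i + \sum_{l \in \Lambda} \varepsilon_{il}\, 2^{-|i-l|} g(x_l)$$

for all $i \in \Lambda$. Set $\mu_\Lambda = \prod_{\Lambda} \mathcal{H}^s|_K$ and $\mathbf{K}_\Lambda = \prod_{\Lambda} K$. Let $\Lambda \subset \tilde{\Lambda} \subset \mathbb{Z}^d$ be finite such that $|\tilde{\Lambda}|s > |\Lambda|$, where the number of elements in $\Lambda$ is denoted by $|\Lambda|$. Let $E_{\Lambda\times\tilde{\Lambda}} = \prod_{\Lambda\times\tilde{\Lambda}} (-\varepsilon_0, \varepsilon_0)$ and let $L^{\Lambda\times\tilde{\Lambda}}$ be the restriction of $L$ to $E_{\Lambda\times\tilde{\Lambda}}$. We will first show that for $L^{\Lambda\times\tilde{\Lambda}}$-almost all $\varepsilon \in E_{\Lambda\times\tilde{\Lambda}}$ the measure $(\pi_{\tilde{\Lambda},\Lambda} \circ A_{\varepsilon,\tilde{\Lambda}})_*\mu_{\tilde{\Lambda}}$ is absolutely continuous with respect to the Lebesgue measure on $\Omega_\Lambda$. As it will be indicated in the proof of Proposition 3.2 this claim follows from Theorem 2.5. In order to apply Theorem 2.5 we have to give some conditions on $g$. Since $g$ is 1-periodic and continuously differentiable there necessarily exists $t_0 \in [0, 1]$ such that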


$g'(t_0) = 0$. In order to satisfy the transversality assumption in Theorem 2.5, we demand that $g' \neq 0$ on $K$. More precisely, let $b > 0$ and let $g$ be increasing on $[0, 1/6]$ such that $g(0) = 0$ and $g'(t) \geq b$ for all $t \in [0, t_1]$ for some $1/9 < t_1 < 1/6$. Define $g(t + 1/6) = g(1/6 - t)$ for $t \in [0, 1/6]$ and $g(1 - t) = -g(t)$ for $t \in [0, 1/3]$. We extend $g$ to the interval $[1/3, 2/3]$ such that $g$ is continuously differentiable, $g([0, 1]) \subset [-1, 1]$, for some $B \geq b$ we have $|g'(t)| \leq B$ for all $t \in [0, 1]$, and $|g'(t)| \geq b$ for all $t \in [1/3, 1/3 + t_2] \cup [2/3 - t_2, 2/3]$, where $0 < t_2 < 1/9$. Consider the second step in the construction of the Cantor set $K$. Call the chosen intervals $I_i$, $i = 1, \ldots, 4$, that is, $I_1 = [0, 1/9]$, $I_2 = [2/9, 1/3]$, $I_3 = [2/3, 7/9]$, and $I_4 = [8/9, 1]$. Let $x \in \mathbf{K}$ and $\Lambda \subset \mathbb{Z}^d$. Define $\tilde{x} \in \mathbf{K}$ in the following way: For all $i \in \Lambda$, let $\tilde{x}_i = x_i$. For $j \in \Lambda^c = \mathbb{Z}^d \setminus \Lambda$ set $\tilde{x}_j = x_j$ if $x_j \in I_1 \cup I_4$, $\tilde{x}_j = 1/6 - (x_j - 1/6)$ if $x_j \in I_2$, and $\tilde{x}_j = 5/6 + 5/6 - x_j$ if $x_j \in I_3$. Note that with these definitions $g(\tilde{x}_j) = g(x_j)$ for all $j \in \mathbb{Z}^d$ implying that $\pi_\Lambda \circ A_\varepsilon(\tilde{x}) = \pi_\Lambda \circ A_\varepsilon(x)$. Further, if $x_j \notin [-t_1, t_1]$ for some $j \in \Lambda^c$ then $\tilde{x}_j \in [-t_1, t_1]$.

Let $x, y \in \mathbf{K}$ such that $x_i \in I_1$ and $y_i \in I_2$ for some $i \in \Lambda$. Then

$$\begin{aligned}
A_\varepsilon(y)_i - A_\varepsilon(x)_i &\geq y_i - x_i - \sum_{l \in \mathbb{Z}^d} \varepsilon_{il}\, 2^{-|i-l|} |g(y_l) - g(x_l)| \\
&\geq y_i - x_i - \sum_{l \in \mathbb{Z}^d} \varepsilon_{il}\, 2^{-|i-l|} B |y_l - x_l| \geq y_i - x_i - C\varepsilon_0 \geq \frac{1}{18}
\end{aligned} \tag{3.2}$$

for $\varepsilon_0$ small enough since $y_i - x_i \geq 1/9$. Thus the cubes at the second stage of the construction of $\mathbf{K}$ with $i$th side $I_1$ will not overlap with cubes with $i$th side $I_2$ under the projection $\pi_\Lambda \circ A_\varepsilon$ provided that $i \in \Lambda$. (The same argument works in other cases as well, see 3.3 below.) More precisely, there exists a constant $c > 0$ such that

$$|\pi_\Lambda \circ A_\varepsilon(x) - \pi_\Lambda \circ A_\varepsilon(y)| \geq c \tag{3.3}$$

for all $x, y \in \mathbf{K}$ with $x_i \in I_1 \cup I_4$ and $y_i \in I_2 \cup I_3$ (or $x_i \in I_2$ and $y_i \in I_3$) for some $i \in \Lambda$. Further, as in (3.2) we see that there exists $\tilde{c} > 0$ such that $|A_\varepsilon(x)_i - 1/6| \geq \tilde{c}$ for all $i \in \Lambda$ and $x \in \mathbf{K}$, giving the existence of $\delta > 0$ such that

$$\pi_{\{i\}} \circ A_\varepsilon(\mathbf{K}) \cap \left(\frac{1}{6} - \delta, \frac{1}{6} + \delta\right) = \emptyset \tag{3.4}$$

for all $i \in \Lambda$. We fix $\varepsilon_0$ and $\delta$ such that the above results hold.
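The separation estimate (3.2) can be illustrated numerically on a finite one-dimensional lattice. The following is only a sketch under stated assumptions: the sites are $0, \ldots, n-1$ with $|i - l|$ the usual distance, the coupling function is the stand-in $g(t) = \sin(2\pi t)$ (1-periodic, continuously differentiable, with values in $[-1, 1]$, but without the symmetries imposed on $g$ above), and the cruder bound $|g| \leq 1$ is used in place of the Lipschitz constant $B$. Since $\sum_{l \in \mathbb{Z}} 2^{-|l|} = 3$, each coordinate moves by less than $3\varepsilon_0$, so a gap $y_i - x_i \geq 1/9$ shrinks to no less than $1/9 - 6\varepsilon_0$, which is at least $1/18$ once $\varepsilon_0 \leq 1/108$. The function names are hypothetical.

```python
import math
import random

def g(t: float) -> float:
    """Stand-in coupling function (an assumption): 1-periodic, C^1, values in [-1, 1]."""
    return math.sin(2.0 * math.pi * t)

def coupled_lift(x, eps):
    """Finite-lattice analogue of (3.1): A(x)_i = x_i + sum_l eps[i][l] * 2^{-|i-l|} * g(x_l)."""
    n = len(x)
    return [
        x[i] + sum(eps[i][l] * 2.0 ** (-abs(i - l)) * g(x[l]) for l in range(n))
        for i in range(n)
    ]

def random_couplings(n, eps0, rng):
    """Coupling matrix with independent entries drawn from [-eps0, eps0]."""
    return [[rng.uniform(-eps0, eps0) for _ in range(n)] for _ in range(n)]

def separation_at_site(x, y, eps, i):
    """Signed gap A(y)_i - A(x)_i after applying the coupling."""
    return coupled_lift(y, eps)[i] - coupled_lift(x, eps)[i]
```

With $\varepsilon_0 = 1/108$, any $x_i \in I_1$ and $y_i \in I_2$ remain at least $1/18$ apart after the coupling, whatever the other coordinates are.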

˜ ⊂ Zd be finite such that | |s ˜ > | |. Set X ˜ = ˜ [−t1 , t1 ]. Lemma 3.1. Let ⊂

Define : E × ˜ × X ˜ → ' by (, x) = π , ◦ A, ˜ (x). Then the assumptions ˜ of Theorem 2.5 are valid for δ = 0, β = 0, and for all integers L > 1. Further, ˜ Eα (µ ˜ ) < ∞ for any | | < α < | |s.

Proof. We may replace ' by Rm , where m = | |. Let i0 ∈ . Note that X ˜ is a compact metric space equipped with the metric 2−2|i0 −l| |xl − yl |2 . d(x, y)2 = ˜ l∈

Clearly ∈ C L,0 (E × ˜ ) for all positive integers L since all the first order partial derivatives are constants. Note that Q in Definition 2.1 will not play any role here since all the estimates are independent of Q .


To check the transversality assumption in Definition 2.2, define for all x = y ∈ X ˜ , x,y () =

(, x) − (, y) . d(x, y)

˜ and x, y ∈ X ˜ such that x = y. Then Fix i ∈ , k = (k1 , k2 ) ∈ × ,

Dx,y ()i,k = δi,k1 2−|i−k2 |

g(xk2 ) − g(yk2 ) , d(x, y)

where δi,j is the Kronecker’s delta. Thus for i, j ∈ , (Dx,y ()Dx,y ()T )i,j =

δi,j −|i−l|−|j −l| 2 (g(xl ) − g(yl ))2 d(x, y)2 ˜ l∈ 2 −|i−i0 |−|j −i0 |

≥ δi,j b 2

.

By Remark 2.3 the transversality assumption is valid for β = 0 with the constant C0 = bm 2− i∈ |i−i0 | . Finally, is obviously (L, 0)-regular (in fact (L, δ)-regular for all δ ∈ [0, 1)) on E × ˜ for all positive integers L. The last assertion follows from the well-known properties of the Hausdorff measure Hs |K (see [M, Chapter 8]). The following absolute continuity result follows from Theorem 2.5 and Lemma 3.1. ˜

˜ > | |. Then for L × ˜ ⊂ Zd be finite such that | |s Proposition 3.2. Let ⊂ almost all ∈ E × ˜ the measure (π , ◦ A ) µ is absolutely continuous with ˜ ∗ ˜ ˜ , respect to the Lebesgue measure on ' . Proof. By the arguments given before stating Lemma 3.1 we may replace ' ˜ by X ˜ = ˜

× ˜ )∗ µ ˜ [−t1 , t1 ]. Lemma 3.1 and Theorem 2.5 give (π , ˜ ◦A, ˜ 2,0 < ∞ for L

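almost all $\varepsilon \in E_{\Lambda\times\tilde{\Lambda}}$ which by Remark 2.6 implies the claim.

In Proposition 3.3 we will prove that one may replace $A_{\varepsilon,\tilde{\Lambda}}$ by $A_\varepsilon$ and $\mu_{\tilde{\Lambda}}$ by $\mu$ in Proposition 3.2. For this purpose we use differentiation theory of measures. Let $\nu$ and $\lambda$ be Radon measures on $\mathbb{R}^n$. Recall that the lower derivative of $\nu$ with respect to $\lambda$ at a point $x \in \mathbb{R}^n$ is defined by

$$\underline{D}(\nu, \lambda, x) = \liminf_{r \to 0} \frac{\nu(B(x, r))}{\lambda(B(x, r))}, \tag{3.5}$$

where $B(x, r)$ is the closed ball with centre at $x$ and with radius $r$. If the limit exists it is called the Radon-Nikodym derivative of $\nu$ with respect to $\lambda$ and is denoted by $D(\nu, \lambda, x)$. Further, $\nu$ is absolutely continuous with respect to $\lambda$ if and only if $\underline{D}(\nu, \lambda, x) < \infty$ for $\nu$-almost all $x \in \mathbb{R}^n$ [M, Theorem 2.12].

Proposition 3.3. Let $\Lambda \subset \tilde{\Lambda} \subset \mathbb{Z}^d$ be finite such that $|\tilde{\Lambda}|s > |\Lambda|$ and let $\varepsilon^1 \in E_{\Lambda\times\tilde{\Lambda}}$ such that the conclusion of Proposition 3.2 is valid. Then for all $\varepsilon \in E$ with $\varepsilon_{\Lambda\times\tilde{\Lambda}} = \varepsilon^1$ we have $\underline{D}((\pi_\Lambda \circ A_\varepsilon)_*\mu, \mathcal{L}^\Lambda, x) < \infty$ for $(\pi_\Lambda \circ A_\varepsilon)_*\mu$-almost all $x \in \Omega_\Lambda$. Here $\mathcal{L}^\Lambda$ is the Lebesgue measure on $\Omega_\Lambda$ and $\varepsilon_{\Lambda\times\tilde{\Lambda}} = (\varepsilon_{ij})_{(i,j)\in\Lambda\times\tilde{\Lambda}}$.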


Proof. Let , 0 ∈ E such that × ˜ = 1 , × ˜c = ˜ ˜ = (0 ) × ˜ ˜ , and (0 )Zd × (0 ) ˜ c ×Zd = 0. Set ν = (π ◦ A )∗ µ and ν0 = (π , ◦ A ) µ . Then ν and ν 0 are ˜ ∗ ˜ ˜ 0 , Radon measures with compact supports [M, Theorem 1.18]. It follows directly from (3.1) that (A0 , ˜ )∗ µ ˜ = (π ˜ ◦ A0 )∗ µ, meaning that ν0 = (π ◦ A0 )∗ µ. By Proposition 3.2 the measure ν0 is absolutely continuous with respect to L . Set m = | |. We will first show that there exists a constant C > 0 such that for all r > 0, √ ν (B(x, r))dν (x) ≤ C ν0 (B(x, mr))dν0 (x). (3.6) '

'

By [FO, Lemma 2.6] it is enough to prove that ν (Q)2 ≤ C Q∈D (r, )

ν0 (Q)2 ,

(3.7)

Q∈D (r, )

where D(r, ) is the family of r-mesh cubes in R , that is, cubes of the form [l1 r, (l1 + 1)r) × · · · × [lm r, (lm + 1)r), where li ∈ Z for all i = 1, . . . , m. Let r > 0. Consider the cubes at the nth stage of the construction of K, where 3−n < r. Call this nth stage approximation K(n). Setting V0 = A0 , ˜ (K ˜ (n)) × K ˜ c (n) = A, ˜ (K ˜ (n)) × K ˜ c (n), we get A0 (spt µ) ⊂ V0 implying that spt ν0 ⊂ π (V0 ). Here the support of a measure λ is denoted by spt λ. ˜ and x, y ∈ X = Zd [−t1 , t1 ] such that xk = yk for all k ∈ , ˜ then If i ∈ A (x)i − A (y)i = εil 2−|i−l| (g(xl ) − g(yl )). (3.8) ˜c l∈

(Recall the discussion before Lemma 3.1 according to which we can assume that xi ∈ ˜ c. [−t1 , t1 ] for all i ∈ Zd ). Note that the difference in (3.8) depends only on xj for j ∈ Defining V = A (K(n)), we have spt ν ⊂ π (V ). Further, A (x)i = A, ˜ (x)i for ˜ c meaning that the restriction of V to the subspace ˜ if xj = 0 for all j ∈ all i ∈ ' ˜ ⊂ ' equals A, ˜ (K ˜ (n)) = A0 , ˜ (K ˜ (n)). So by (3.8) V is obtained from V0 by tilting the rows of “cubes” above each “cube” in A, ˜ (K ˜ (n)) in such a way that the ˜ Thus ν is obtained from ν0 by amount of translation does not depend on xi for i ∈ . spreading around the “cubes” defining ν0 . Let Q ∈ D(r, ). If there is Q ∈ D(r, ) such that a part of the “cubes” above it in V0 are tilted above Q then the corresponding “cubes” above Q (in V0 ) are removed away by (3.8). Define AQ = {Q ∈ D(r, ) | π (A (A−1˜ (Q × X \ ) × X ˜ c )) ∩ Q = ∅}. ˜ ,

Then for all Q ∈ AQ with π (V ) ∩ π (A (A−1˜ (Q × X \ ) × X ˜ c )) ∩ Q = ∅ we ˜ , c have V0 ∩ (Q × X ) = ∅. Further, Q × X c = PQ (Q ), (3.9) Q ∈D (r, ) Q∈AQ

where

PQ (Q ) = {x ∈ Q × X c | π (A (A−1˜ (x ˜ ) × x ˜ c )) ∈ Q }. ,


Observe that (A0 )∗ µ(PQ (Q )) = (A )∗ µ(A (A−1 0 (PQ (Q )))).

(3.10)

Note that by (3.8) the geometric shape of this partition is independent of Q, that is, if Q1 ∈ D(r, ) with Q1 × X c =

PQ1 (Q ),

Q ∈D (r, ) Q1 ∈AQ

then for all Q2 = τ (Q1 ) ∈ D(r, ) (τ is a translation) we have

Q2 × X c =

τ (PQ1 (Q )).

Q ∈D (r, ) Q1 ∈AQ

Naturally, this partition can be restricted to V0 . Hence for all Q ∈ D(r, ) there are 1 non-negative numbers pQ (Q ) = ν0 (Q) (A0 )∗ µ(PQ (Q )) adding to 1 such that

ν0 (Q) = (A0 )∗ µ(Q × X c ) =

(A0 )∗ µ(PQ (Q ))

Q ∈D (r, ) Q∈AQ

=

(3.11)

pQ (Q )ν0 (Q).

Q ∈D (r, ) Q∈AQ

This gives by (3.10) that ν (Q) =

Q ∈AQ

(A0 )∗ µ(PQ (Q)) =

pQ (Q)ν0 (Q ).

(3.12)

Q ∈AQ

The numbers pQ (Q ) depend on both Q and PQ (Q ). Enumerating the partition of Q×X c given in (3.9) we get Q×X c = ∪i PQ (i), where the geometric shape of PQ (i) ∈ D(r, ) we have PQ may vary as i varies. However, for all i and Q, Q (i) = τ (PQ (i)), = τ (Q). Hence the differences in PQ (i) as Q varies where τ is the translation with Q and i is kept fixed are due to the fact that the measure is not evenly distributed inside horizontal | |-dimensional slices of Q × X c . Note that if such a horizontal slice intersects an element PQ (Q ) of the partition (3.9), then, by (3.8), it may intersect only the elements PQ (Q ), where Q is a neighbour of Q in D(r, ). Let N = 3| | be the number of neighbours. We say that Q and Q are related (Q ∼ Q ) if there exists Q


such that Q , Q ∈ AQ . Then by (3.11) and (3.12) N

ν0 (Q)2 −

Q∈D (r, )

=N

ν (Q)2

Q∈D (r, )

pQ (Q )pQ (Q )ν0 (Q)2

Q∈D (r, ) Q ∈D (r, ) Q ∈D (r, ) Q∈AQ Q∈AQ

−

pQ (Q)pQ (Q)ν0 (Q )ν0 (Q )

Q∈D (r, ) Q ∈AQ Q ∈AQ

=

pQ (Q)pQ (Q)(ν0 (Q ) − ν0 (Q ))2 + P ≥ 0

Q ,Q ∈D (r, ) Q∈D (r, ) Q ,Q ∈AQ Q ∼Q

since the remainder P (which is due to the occasionally very generous compensation factor N ) is non-negative. This concludes the proof of (3.7). Let α be the L -measure of the m-dimensional unit ball. By [M, Theorem 2.12] D(ν0 , L , x) exists and is finite for L -almost all x. By Proposition 3.2 the same is true for ν0 -almost all x. By Remark 2.6 we can choose D(ν0 , L ) as smooth as we like by ˜ In particular, it can be chosen to be uniformly continuous so that one can increasing . find r0 > 0 such that ν0 (B(x, r))α −1 r −m ≤ max{2D(ν0 , L , x), 1} for all 0 < r < r0 and x ∈ ' . Thus using Fatou’s lemma, inequality 3.6, the theorem of dominated convergence, and Theorem 2.5 together with Plancharel’s formula [SW, Theorem 2.1, p. 16], we have D(ν , L , x)dν (x) = lim inf ν (B(x, r))α −1 r −m dν (x) r→0 ≤ lim inf ν (B(x, r))α −1 r −m dν (x) r→0 √ ≤ lim inf C ν0 (B(x, mr))α −1 r −m dν0 (x) r→0 √ m = C( m) D(ν0 , L , x)dν0 (x) = C D(ν0 , L , x)2 dL (x) < ∞. Thus D(ν , L , x) is finite for ν -almost all x.

Theorem 3.4. For L-almost all the map T has infinitely many SRB-measures. Proof. For all finite ⊂ Zd , let Eg ( ) = { ∈ E | (π ◦ A )∗ µ is absolutely continuous with respect to L }. By Propositions 3.2 and 3.3 and [M, Theorem 2.12] we get for all finite ⊂ Zd , L(Eg ( )) = 1.


Defining

$$E_g = \bigcap_{\substack{\Lambda \subset \mathbb{Z}^d \\ |\Lambda| < \infty}} E_g(\Lambda)$$

[...]

[...] 0), and has a phase transition, if $1 < c < 2$ ($d > 2$) [4]. It has been widely believed without proof that the hierarchical Ising model in $d \geq 4$ dimensions has a critical trajectory converging to the Gaussian fixed point and that the “continuum limit” of the hierarchical Ising model in $d \geq 4$ dimensions will be trivial. In this paper, we prove this fact. In the present analysis, it is crucial that the critical Ising model is mapped into a weak coupling regime after a small number of renormalization group transformations (in fact, 70 iterations for $d = 4$). Moreover, using a framework essentially different from that of [16, 7], we see in the weak coupling regime that the “effective coupling constant” of a critical model decays as $c_1/(N + c_2)$ after $N$ iterations in $d = 4$ dimensions (exponentially for $d > 4$). Our framework in the weak coupling regime is designed especially for a critical trajectory starting at the strong coupling regime so that the criterion of convergence to the Gaussian fixed point can be checked numerically with mathematical rigor. Corresponding results, such as triviality of the $\phi_4^4$ spin model on a regular lattice (“full model”), are far harder, and a proof of triviality of the Ising model on the 4-dimensional regular lattice is, though widely believed, still open. We should here note the excellent and hard work of [9, 10] where the existence of a critical trajectory in the weak coupling regime (near the Gaussian fixed point; “weak triviality”) is solved by rigorous block spin renormalization group transformation. Our main theorem is the following:

Theorem 1.1. If $d \geq 4$ (i.e. $c \geq \sqrt{2}$), there exists a “critical trajectory” converging to the Gaussian fixed point starting from the hierarchical Ising models. Namely, there exists a positive real number $s_c$ such that if $h_N$, $N = 0, 1, 2, \cdots$, are defined by (1.5) with $h_0 = h_{I,s_c}$, then the sequence of measures $h_N(x)\,dx$, $N = 0, 1, 2, \cdots$, converges weakly to the massless Gaussian measure $h_G(x)\,dx$.

Remark.
Our proof is partially computer-aided and shows for d = 4 that sc ∈ [1.7925671170092624, 1.7925671170092625]. In the following sections, we give a proof of Theorem 1.1. We will concentrate on the case d = 4, since the cases d > 4 can be proved along similar lines (with weaker bounds).


T. Hara, T. Hattori, H. Watanabe

2. Strategy The proof of Theorem 1.1 is decomposed into two parts: Theorem 2.1(analysis in the weak coupling regime) and Theorem 2.2 (analysis in the strong coupling regime). They are stated in Sect. 2.3, and their proofs are given in Sect. 4 and Sect. 5, respectively. Theorem 1.1 is proved at the end of this section assuming them. (1) In Theorem 2.1, we control the renormalization group flow in a weak coupling regime by means of a finite number of truncated correlations (Taylor coefficients of logarithm of characteristic functions), and, in terms of the truncated correlations, we give a criterion, a set of sufficient conditions, for the measure to be in a domain of attraction of the Gaussian fixed point. (2) In Theorem 2.2, we prove, by rigorous computer-aided calculations, that there is a trajectory whose initial point is an Ising measure and for which the criterion in Theorem 2.1 is satisfied after a small number of iterations. The first part (Theorem 2.1) is essentially the Bleher–Sinai argument [1, 2, 16]. However, the criteria introduced in the references [16, 7] seem to be difficult to handle when “strong coupling constants” are present in the model, as in the Ising models. In order to overcome this difficulty, we use characteristic functions of single spin distributions and Newman’s inequalities for truncated correlations. The second part (Theorem 2.2) is basically simple numerical calculations of truncated correlations up to 8 points to ensure the criterion. The results are double checked by Mathematica and C++ programs, and furthermore they are made mathematically rigorous by means of Newman’s inequalities. It should be noted that rigorous computer-aided proofs are employed in [14] to Dyson’s hierarchical model in d = 3 dimensions, to prove, with [13], an existence of a non-Gaussian fixed point. (The “physics” are of course different between d = 3 and d = 4.) 
We also focus on a complete mathematical proof, by combining rigorous computer-aided bounds with mathematical methods such as Newman’s inequalities and the Bleher–Sinai arguments.

2.1. Characteristic function. Denote the characteristic function of the single spin distribution $h_N$ as

$$\hat{h}_N(\xi) = \mathcal{F}h_N(\xi) = \int_{\mathbb{R}} e^{\sqrt{-1}\,\xi x}\, h_N(x)\, dx. \tag{2.1}$$

The renormalization group transformation for $\hat{h}_N$ is

$$\hat{h}_{N+1} = \mathcal{F}\mathcal{R}\mathcal{F}^{-1}\hat{h}_N, \tag{2.2}$$

which has a decomposition

$$\mathcal{F}\mathcal{R}\mathcal{F}^{-1} = TS, \tag{2.3}$$

where

$$Sg(\xi) = g\!\left(\frac{\sqrt{c}}{2}\,\xi\right)^{2}, \tag{2.4}$$

$$Tg(\xi) = \text{const.}\, \exp\!\left(-\frac{\beta}{2}\Delta\right) g(\xi), \tag{2.5}$$

Triviality of Hierarchical Ising Model in Four Dimensions


and the constant is so defined that $Tg(0) = 1$. The transformation (2.2) has the same form as the $N = 2$ case of the Gallavotti hierarchical model [5, 11, 12]. Note that only for $N = 2$ the Gallavotti model is equivalent (by Fourier transform) to the Dyson’s hierarchical model. We introduce a “potential” $V_N$ for the characteristic function $\hat{h}_N$ and its Taylor coefficients $\mu_{n,N}$ by

$$\hat{h}_N(\xi) = e^{-V_N(\xi)}, \tag{2.6}$$

$$V_N(\xi) = \sum_{n=1}^{\infty} \mu_{n,N}\, \xi^n. \tag{2.7}$$

(Note that $\hat{h}_N(0) = 1$.) The coefficient $\mu_{n,N}$ is called a truncated $n$ point correlation. They are functions of Ising parameter $s$ in $h_0 = h_{I,s}$, but to simplify expressions, we will always suppress the dependences on $s$ in the following. In particular, for the initial condition $h_0 = h_{I,s}$, we have

$$\hat{h}_0(\xi) = \hat{h}_{I,s}(\xi) = \mathcal{F}h_{I,s}(\xi) = \cos(s\xi),$$

$$\mu_{2,0} = \frac{1}{2}s^2, \quad \mu_{4,0} = \frac{1}{12}s^4, \quad \mu_{6,0} = \frac{1}{45}s^6, \quad \mu_{8,0} = \frac{17}{2520}s^8, \quad \text{etc.},$$

and

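$$h_1(x) = \mathcal{R}h_{I,s}(x) = \text{const.} \left\{ e^{\beta c s^2/2} \left[ \delta(x - s\sqrt{c}) + \delta(x + s\sqrt{c}) \right] + 2\delta(x) \right\},$$

$$\hat{h}_1(\xi) = \frac{1}{1+k}\left( 1 + k\cos(\sqrt{c}\, s\xi) \right), \quad \text{with } k = e^{\beta c s^2/2},$$

$$\mu_{2,1} = k\Pi, \quad \mu_{4,1} = \frac{k}{6}(2k - 1)\Pi^2, \quad \mu_{6,1} = \frac{k}{90}(16k^2 - 13k + 1)\Pi^3,$$

$$\mu_{8,1} = \frac{k}{2520}(272k^3 - 297k^2 + 60k - 1)\Pi^4, \quad \text{etc.}, \quad \text{with } \Pi = \frac{cs^2}{2(k+1)}.$$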
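The truncated correlations listed above are the Taylor coefficients of $V = -\log \hat{h}$, so they can be cross-checked by expanding the logarithm in exact rational arithmetic. The sketch below is an independent check, not the authors' program: the function names are ours, the series is written in the variable $t = \xi^2$ (odd coefficients vanish), and `z` stands for the abbreviation $cs^2/(2(k+1))$ in which the coefficients after one step are expressed. The recursion comes from $\hat{h}' = -V'\hat{h}$.

```python
from fractions import Fraction

def cos_series(a2, order):
    """Coefficients in t = xi^2 of cos(a*xi), where a2 = a^2."""
    coeffs, term = [], Fraction(1)
    for j in range(order + 1):
        coeffs.append(term)
        term = term * (-a2) / ((2 * j + 1) * (2 * j + 2))
    return coeffs

def neg_log_series(h):
    """Coefficients of V = -log h for a power series h with h[0] == 1.

    Uses h' = -V' h, i.e. j*h[j] = -sum_{i=1..j} i*v[i]*h[j-i].
    """
    v = [Fraction(0)] * len(h)
    for j in range(1, len(h)):
        acc = j * h[j] + sum(i * v[i] * h[j - i] for i in range(1, j))
        v[j] = -acc / j
    return v

def mu_initial(s, order=4):
    """Entry j is mu_{2j,0}, obtained from hat h_0 = cos(s*xi)."""
    return neg_log_series(cos_series(Fraction(s) ** 2, order))

def mu_first_step(k, a2, order=4):
    """Entry j is mu_{2j,1}, from hat h_1 = (1 + k*cos(a*xi)) / (1 + k) with a2 = c*s^2."""
    k = Fraction(k)
    c = cos_series(Fraction(a2), order)
    h = [(k * cj + (1 if j == 0 else 0)) / (1 + k) for j, cj in enumerate(c)]
    return neg_log_series(h)
```

For example, `mu_initial(1)` returns the list `[0, 1/2, 1/12, 1/45, 17/2520]` as exact fractions, in agreement with the coefficients of $s^2, s^4, s^6, s^8$ stated for $N = 0$.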

2.2. Newman’s inequalities. The function $V_N$ has a remarkable positivity property and its Taylor coefficients obey Newman’s inequalities (for a brief review of relevant part, see Appendix A):

$$0 \leq \mu_{2n,N} \leq \frac{1}{n}\left(2\mu_{4,N}\right)^{n/2}, \quad n = 3, 4, 5, \cdots. \tag{2.8}$$

These inequalities follow from [15, Theorem 3, 6], since we have chosen the Ising spin distribution $h_0 = h_{I,s}$ and the function of $\eta$ defined by

$$\int e^{\eta x} h_N(x)\, dx = \left\langle \exp\!\left( \left(\frac{\sqrt{c}}{2}\right)^{N} \eta \sum_{\theta} \phi_\theta \right) \right\rangle_{N, h_{I,s}} \tag{2.9}$$

has only pure imaginary zeros as is shown in [15, Theorem 1]. Note also that (1.2) and (1.6) imply

$$\mu_{2n+1,N} = 0, \quad n = 0, 1, 2, \cdots. \tag{2.10}$$
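As a sanity check, the bound (2.8) can be evaluated on the closed-form coefficients given for $N = 0$ and $N = 1$. The sketch below is illustrative only: it uses floating point rather than rigorous interval arithmetic, the function names are ours, `z` again denotes $cs^2/(2(k+1))$, and $k$ is treated as a free parameter with $k \geq 1$ (which corresponds to assuming $\beta \geq 0$, since the value of $\beta$ is not restated in this section).

```python
def newman_upper(mu4: float, n: int) -> float:
    """Right-hand side of (2.8): (1/n) * (2*mu4)^(n/2)."""
    return (2.0 * mu4) ** (n / 2.0) / n

def mu_ising(s: float) -> dict:
    """Closed forms of mu_{4,0}, mu_{6,0}, mu_{8,0} for the Ising start h_0 = h_{I,s}."""
    return {4: s**4 / 12, 6: s**6 / 45, 8: 17 * s**8 / 2520}

def mu_after_one_step(k: float, z: float) -> dict:
    """Closed forms of mu_{4,1}, mu_{6,1}, mu_{8,1} in terms of k and z = c*s^2 / (2*(k+1))."""
    return {
        4: k * (2 * k - 1) * z**2 / 6,
        6: k * (16 * k**2 - 13 * k + 1) * z**3 / 90,
        8: k * (272 * k**3 - 297 * k**2 + 60 * k - 1) * z**4 / 2520,
    }

def satisfies_newman(mu: dict) -> bool:
    """Check 0 <= mu_{2n} <= (1/n)(2*mu_4)^(n/2) for n = 3, 4."""
    return all(0.0 <= mu[2 * n] <= newman_upper(mu[4], n) for n in (3, 4))
```

For the Ising start both sides of (2.8) scale as $s^{2n}$, so the check is independent of $s$; for instance $\mu_{6,0}/s^6 = 1/45 \approx 0.0222$ against the bound $\frac{1}{3}(1/6)^{3/2} \approx 0.0227$, which shows how little room the inequality leaves.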


The bounds (2.8) are extensively used in this paper. We here note the following facts:
(1) The right-hand side of (2.7) has a nonzero radius of convergence.
(2) It suffices to prove $\lim_{N\to\infty} \mu_{4,N} = 0$ in order to ensure that $\mu_{2n,N}$, n ≥ 3, converges to zero, hence the trajectory converges to the Gaussian fixed point.

2.3. Proof of Theorem 1.1. Let $h_0 = h_{I,s}$ and d = 4. Note the following simple observations on the “mass term” $\mu_{2,N}$, which is the variance of $h_N(x)\,dx$.
(1) $\mu_{2,N}$ is continuous in the Ising parameter s, because $h_N(x)\,dx$ is a result of a finite number of renormalization group transformations (1.2).
(2) $\mu_{2,N}$ is increasing in s, vanishes at s = 0, and diverges as s → ∞.

We then put, for N = 0, 1, 2, · · · ,

\[ \underline{s}_N = \inf\bigl\{ s > 0 \mid \mu_{2,N} \ge 1 \bigr\}, \tag{2.11} \]

\[ \bar{s}_N = \inf\Bigl\{ s > 0 \Bigm| \mu_{2,N} \ge \min\Bigl\{ 1 + \frac{3}{\sqrt{2}}\,\mu_{4,N},\ 2 + \sqrt{2} \Bigr\} \Bigr\}. \tag{2.12} \]

Obviously, we have $0 < \underline{s}_N \le \bar{s}_N < \infty$. Note also that

\[ 1 \le \mu_{2,N} \le 1 + \frac{3}{\sqrt{2}}\,\mu_{4,N} \tag{2.13} \]

holds for $s \in [\underline{s}_N, \bar{s}_N]$. As is seen in Sect. 4, (2.13) is necessary for the model to be critical. We call this a critical mass condition. The following theorem states our result in the weak coupling regime and is proved in Sect. 4.

Theorem 2.1. Let $h_0 = h_{I,s}$ and d = 4. Assume that there exist integers $N_0$ and $N_1$, satisfying $N_0 \le N_1$, such that, for $s \in [\underline{s}_{N_1}, \bar{s}_{N_1}]$, the bounds

\[ 0 \le \mu_{4,N_0} \le 0.0045, \tag{2.14} \]

\[ 1.6\,\mu_{4,N_0}^2 \le \mu_{6,N_0} \le 6.07\,\mu_{4,N_0}^2, \tag{2.15} \]

\[ 0 \le \mu_{8,N_0} \le 48.469\,\mu_{4,N_0}^3, \tag{2.16} \]

and

\[ \mu_{2,N} < 2 + \sqrt{2}, \quad N_0 \le N < N_1, \tag{2.17} \]

hold. Then there exists an $s_c \in [\underline{s}_{N_1}, \bar{s}_{N_1}]$ such that if $s = s_c$ then

\[ \lim_{N\to\infty} \mu_{4,N} = 0, \qquad \lim_{N\to\infty} \mu_{2,N} = 1. \]

Triviality of Hierarchical Ising Model in Four Dimensions


Fig. 2.1. A schematic view of trajectories on the $(\mu_2, \mu_4)$-plane in Theorem 2.1. Trajectories for $s = \underline{s}_{N_1}$ and for $s = \bar{s}_{N_1}$ (solid lines) and the critical trajectory for $s = s_c$ (broken line) are shown. The Gaussian fixed point corresponds to the point (1.0, 0). The region defined by inequalities for $(\mu_2, \mu_4)$ analogous to (2.13) and (2.14) (and (2.17)) is shaded

Remark. The original Bleher–Sinai argument takes $N_0 = N_1$. We include the $N_0 < N_1$ case, which makes it possible to complete our proof by evaluating various quantities only at the two endpoints of the interval under consideration for the Ising parameter s, instead of at all values in the interval, as is implicit in the assumptions of Theorem 2.1. This point will be clarified at the end of Sect. 5.3.

The following theorem states our result in the strong coupling regime and is proved in Sect. 5.

Theorem 2.2. The assumptions of Theorem 2.1 are satisfied for $N_0 = 70$ and $N_1 = 100$, where $\underline{s}_{N_1}$ and $\bar{s}_{N_1}$ satisfy

\[ 1.7925671170092624 \le \underline{s}_{N_1}, \qquad \bar{s}_{N_1} \le 1.7925671170092625. \]

Proof of Theorem 1.1 for d = 4 assuming Theorem 2.1 and Theorem 2.2. Theorem 2.1 and Theorem 2.2 imply that there exists $s_c \in [\underline{s}_{N_1}, \bar{s}_{N_1}]$ such that, for $s = s_c$, $\lim_{N\to\infty} \mu_{4,N} = 0$ and $\lim_{N\to\infty} \mu_{2,N} = 1$ hold. Then (2.6), (2.7), and (2.8) imply

\[ \lim_{N\to\infty} \hat h_N(\xi) = e^{-\xi^2}, \]

uniformly in $\xi$ on any closed interval in R. It is easy to see that $e^{-\xi^2}$ is the characteristic function of the massless Gaussian measure $h_G$, hence Theorem 1.1 holds for d = 4. The bounds on $\underline{s}_{N_1}$ and $\bar{s}_{N_1}$ in Theorem 2.2 imply

\[ 1.7925671170092624 \le s_c \le 1.7925671170092625. \]


3. Truncated Correlations

In this section, we prepare basic (recursive) bounds on the truncated correlations that will be used in Sect. 4. The renormalization group transformation is decomposed as (2.3). Since the mapping S is simple, the essential part of our work is an analysis of T. The result of this section is Proposition 3.1.

3.1. Recursions. Note first that in terms of $V_N$ the mapping S can be expressed as

\[ S e^{-V_N}(\xi) = e^{-2V_N\left(\frac{\sqrt{c}}{2}\xi\right)}. \tag{3.1} \]

Using (2.7), (2.10), (1.4) we also have

\[ 2V_N\Bigl(\frac{\sqrt{c}}{2}\,\xi\Bigr) = \sum_{n=1}^{\infty} 2^{1-(1+2/d)n}\,\mu_{2n,N}\,\xi^{2n}. \tag{3.2} \]

Next, write (2.5) as

\[ Tg = \mathrm{const.}\; g_{\beta/2}, \qquad g_t = \exp(-t\Delta)\,g, \tag{3.3} \]

where $\Delta g(\xi) = \frac{d^2 g}{d\xi^2}(\xi)$, and $\beta = \frac{1}{2}(\sqrt{2} - 1)$ for d = 4. $g_t$ is a solution to

\[ \frac{\partial g_t}{\partial t} = -\Delta g_t, \qquad g_0 = g. \]

Hence, if we put $g_t(\xi) = \exp(-V_t(\xi))$, then $V_t$ satisfies

\[ \frac{d}{dt} V_t = (\nabla V_t)^2 - \Delta V_t, \tag{3.4} \]

where $\nabla V_t(\xi) = \frac{\partial V_t}{\partial \xi}(\xi)$. In other words, $V_{N+1}$ is given as a solution of (3.4) at $t = \beta/2$ (modulo constant term), with the initial condition (3.2) at t = 0. If we write

\[ V_t(\xi) = \sum_{n=0}^{\infty} \mu_{2n}(t)\,\xi^{2n}, \]

then (3.4) implies

\[ \frac{d}{dt}\mu_{2n}(t) = -(2n+2)(2n+1)\,\mu_{2n+2}(t) + \sum_{\ell=1}^{n} (2\ell)(2n - 2\ell + 2)\,\mu_{2\ell}(t)\,\mu_{2n-2\ell+2}(t). \tag{3.5} \]


In particular, we have

\[ \frac{d}{dt}\mu_2(t) = 4\mu_2(t)^2 - 12\mu_4(t), \tag{3.6} \]

\[ \frac{d}{dt}\mu_4(t) = 16\mu_2(t)\mu_4(t) - 30\mu_6(t), \tag{3.7} \]

\[ \frac{d}{dt}\mu_6(t) = 24\mu_2(t)\mu_6(t) + 16\mu_4(t)^2 - 56\mu_8(t), \tag{3.8} \]

\[ \frac{d}{dt}\mu_8(t) = 32\mu_2(t)\mu_8(t) + 48\mu_4(t)\mu_6(t) - 90\mu_{10}(t). \tag{3.9} \]
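As a consistency check, the special cases (3.6)–(3.9) can be recovered from the general recursion (3.5) by direct evaluation. The sketch below assumes nothing beyond (3.5) itself; the helper names `flow_rhs` and `explicit_rhs` and the sample coefficient values are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def flow_rhs(mu, n):
    """Right-hand side of (3.5) for d mu_{2n}/dt.

    mu maps an even index 2k to the value of mu_{2k}(t).
    """
    linear = -(2 * n + 2) * (2 * n + 1) * mu[2 * n + 2]
    quadratic = sum(
        (2 * l) * (2 * n - 2 * l + 2) * mu[2 * l] * mu[2 * n - 2 * l + 2]
        for l in range(1, n + 1)
    )
    return linear + quadratic


def explicit_rhs(mu):
    """Right-hand sides of (3.6)-(3.9), keyed by n = 1, 2, 3, 4."""
    m2, m4, m6, m8, m10 = (mu[k] for k in (2, 4, 6, 8, 10))
    return {
        1: 4 * m2 ** 2 - 12 * m4,
        2: 16 * m2 * m4 - 30 * m6,
        3: 24 * m2 * m6 + 16 * m4 ** 2 - 56 * m8,
        4: 32 * m2 * m8 + 48 * m4 * m6 - 90 * m10,
    }


# Arbitrary rational sample values (not taken from the paper).
SAMPLE = {2: Fraction(1, 2), 4: Fraction(1, 3), 6: Fraction(1, 5),
          8: Fraction(1, 7), 10: Fraction(1, 11)}

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, flow_rhs(SAMPLE, n), explicit_rhs(SAMPLE)[n])
```

Since both sides are polynomial in the coefficients, agreement for a few generic rational samples is strong evidence that the coefficients 4, 12, 16, 30, 24, 16, 56, 32, 48, 90 are transcribed correctly.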

Thus, $\mu_{2n,N}$ and $\mu_{2n,N+1}$ are related for d = 4 by, e.g.,

\[ \mu_2(0) = \frac{1}{\sqrt{2}}\mu_{2,N}, \quad \mu_4(0) = \frac{1}{4}\mu_{4,N}, \quad \mu_6(0) = \frac{1}{8\sqrt{2}}\mu_{6,N}, \quad \mu_8(0) = \frac{1}{32}\mu_{8,N}, \]

\[ \mu_{2,N+1} = \mu_2\Bigl(\frac{\beta}{2}\Bigr), \quad \mu_{4,N+1} = \mu_4\Bigl(\frac{\beta}{2}\Bigr), \quad \mu_{6,N+1} = \mu_6\Bigl(\frac{\beta}{2}\Bigr), \quad \mu_{8,N+1} = \mu_8\Bigl(\frac{\beta}{2}\Bigr). \]

3.2. Bounds. We first note that the quantities $\mu_n(t)$ obey Newman’s inequalities: by comparing (2.5) and (3.3) we see that the correspondence $V_N \to V(t)$ is obtained by a replacement $\beta \to 2t$ in (1.2). Therefore $\mu_n(t)$ also is a truncated n point correlation of a measure to which the arguments in [15] apply, hence an analogue of (2.8) holds:

\[ 0 \le \mu_{2n}(t) \le \frac{1}{n}\,(2\mu_4(t))^{n/2}, \quad n = 3, 4, 5, \cdots. \tag{3.10} \]

We have to show decay of $\mu_{4,N}$ as N → ∞. In case d > 4, the decay follows from (3.6) and (3.7) with d-dependent coefficients, namely, if we throw out the negative contributions of $\mu_4(t)$ and $\mu_6(t)$ to the right-hand sides of (3.6) and (3.7), respectively, then we have upper bounds on $\mu_2(t)$ and $\mu_4(t)$. This argument eventually yields exponential decay of $\mu_{4,N}$. In case d = 4, the situation is more subtle, since the decay of $\mu_{4,N}$ is weak, i.e., powerlike instead of exponential. In order to derive the delicate bound on $\mu_4(t)$, a lower bound for $\mu_6(t)$ must be incorporated, which in turn needs an upper bound on $\mu_8(t)$. Thus, we have to deal with Eqs. (3.6)–(3.9). This is the principle of our estimation. The result is the following:

Proposition 3.1. Let d = 4 and N be a positive integer, and put

\[ r_N = \frac{1}{1 - (\sqrt{2} - 1)(\mu_{2,N} - 1)} = \frac{1}{\sqrt{2} - (\sqrt{2} - 1)\mu_{2,N}}, \tag{3.11} \]

\[ \zeta_N = \frac{\sqrt{2}\,r_N - 1}{\sqrt{2}\,\mu_{2,N}} = \frac{r_N}{\mu_{2,N}} - \frac{1}{\sqrt{2}\,\mu_{2,N}}. \tag{3.12} \]

(i) If

\[ \mu_{2,N} < 2 + \sqrt{2}, \tag{3.13} \]


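then

\[ \mu_{2,N+1} \le r_N\,\mu_{2,N}, \tag{3.14} \]

\[ \mu_{2,N+1} \ge r_N\,\mu_{2,N} - 3 r_N^2\,\zeta_N\,\mu_{4,N}. \tag{3.15} \]

(ii) If, furthermore,

\[ \frac{\mu_{4,N}}{4} \ge \frac{15}{8\sqrt{2}}\,\zeta_N\,\mu_{6,N} + \frac{21}{4}\,\zeta_N^2\,\mu_{4,N}^2, \tag{3.16} \]

\[ \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{1}{2}\,\zeta_N\,\mu_{4,N}^2 \ge 24\,\zeta_N^3\,\mu_{4,N}^3 + \frac{123}{8\sqrt{2}}\,\zeta_N^2\,\mu_{4,N}\,\mu_{6,N} + \frac{7}{8}\,\zeta_N\,\mu_{8,N}, \tag{3.17} \]

\[ \frac{3}{2}\,\zeta_N\,\mu_{4,N} \ge 12\,\zeta_N^3\,\mu_{4,N}^2 + \frac{45}{8\sqrt{2}}\,\zeta_N^2\,\mu_{6,N}, \tag{3.18} \]

then

\[ \mu_{2,N+1} \le r_N\,\mu_{2,N} - 3 r_N^2 \Bigl( \zeta_N\,\mu_{4,N} - 8\,\zeta_N^3\,\mu_{4,N}^2 - \frac{15}{4\sqrt{2}}\,\zeta_N^2\,\mu_{6,N} \Bigr), \tag{3.19} \]

\[ \mu_{4,N+1} \ge r_N^4 \Bigl( \mu_{4,N} - \frac{15}{2\sqrt{2}}\,\zeta_N\,\mu_{6,N} - 21\,\zeta_N^2\,\mu_{4,N}^2 \Bigr), \tag{3.20} \]

\[ \mu_{4,N+1} \le r_N^4 \Bigl( \mu_{4,N} - \frac{15}{2\sqrt{2}}\,\zeta_N\,\mu_{6,N} - 21\,\zeta_N^2\,\mu_{4,N}^2 + \frac{705}{2\sqrt{2}}\,\zeta_N^3\,\mu_{4,N}\,\mu_{6,N} + 447\,\zeta_N^4\,\mu_{4,N}^3 + \frac{105}{4}\,\zeta_N^2\,\mu_{8,N} \Bigr), \tag{3.21} \]

\[ \mu_{6,N+1} \le r_N^6 \Bigl( \frac{\mu_{6,N}}{\sqrt{2}} + 4\,\zeta_N\,\mu_{4,N}^2 \Bigr), \tag{3.22} \]

\[ \mu_{6,N+1} \ge r_N^6 \Bigl( \frac{\mu_{6,N}}{\sqrt{2}} + 4\,\zeta_N\,\mu_{4,N}^2 - 192\,\zeta_N^3\,\mu_{4,N}^3 - \frac{123}{\sqrt{2}}\,\zeta_N^2\,\mu_{4,N}\,\mu_{6,N} - 7\,\zeta_N\,\mu_{8,N} \Bigr), \tag{3.23} \]

\[ \mu_{8,N+1} \le r_N^8 \Bigl( \frac{\mu_{8,N}}{2} + \frac{12}{\sqrt{2}}\,\zeta_N\,\mu_{4,N}\,\mu_{6,N} + 24\,\zeta_N^2\,\mu_{4,N}^3 \Bigr). \tag{3.24} \]

The rest of this section is devoted to a proof of Proposition 3.1.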
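The two-sided bound (3.14)–(3.15) can be sanity-checked against the explicit first step of Sect. 2.1, where $\mu_{2,0} = s^2/2$, $\mu_{4,0} = s^4/12$ and $\mu_{2,1} = k\ell$ are known in closed form. This is a minimal numerical sketch, not part of the proof: it assumes d = 4 with $c = \sqrt{2}$ (consistent with the value $1/(2\beta c) = (2+\sqrt{2})/2$ quoted in Sect. 5.3) and $\beta = (\sqrt{2}-1)/2$, and it applies the bounds at the step N = 0 → 1 although the proposition is stated for positive N. All function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)
C = SQRT2                    # assumed value of c for d = 4
BETA = (SQRT2 - 1.0) / 2.0   # beta for d = 4, as stated after (3.3)


def r_and_zeta(mu2):
    """r_N and zeta_N of (3.11)-(3.12) as functions of mu_{2,N}."""
    r = 1.0 / (SQRT2 - (SQRT2 - 1.0) * mu2)
    zeta = (SQRT2 * r - 1.0) / (SQRT2 * mu2)
    return r, zeta


def ising_first_step(s):
    """Closed-form mu_{2,0}, mu_{4,0} and mu_{2,1} for the Ising start h_{I,s}."""
    mu2_0 = s ** 2 / 2.0
    mu4_0 = s ** 4 / 12.0
    k = math.exp(BETA * C * s ** 2 / 2.0)
    ell = C * s ** 2 / (2.0 * (k + 1.0))
    return mu2_0, mu4_0, k * ell


def mass_bounds(s):
    """Return (lower bound (3.15), exact mu_{2,1}, upper bound (3.14))."""
    mu2_0, mu4_0, mu2_1 = ising_first_step(s)
    if not mu2_0 < 2.0 + SQRT2:
        raise ValueError("assumption (3.13) is violated")
    r, zeta = r_and_zeta(mu2_0)
    upper = r * mu2_0
    lower = r * mu2_0 - 3.0 * r ** 2 * zeta * mu4_0
    return lower, mu2_1, upper


if __name__ == "__main__":
    for s in (0.5, 1.0, 1.79):
        print(s, mass_bounds(s))
```

At the Gaussian fixed point value $\mu_{2,N} = 1$ the helper returns $r_N = 1$ and $\zeta_N = 1 - 1/\sqrt{2}$, which is the lower value used later in (4.5).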

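Proof. Now, observe that $\bar\mu_2(t)$ defined by

\[ \frac{d}{dt}\bar\mu_2(t) = 4\bar\mu_2(t)^2, \qquad \bar\mu_2(0) = \frac{1}{\sqrt{2}}\,\mu_{2,N}, \tag{3.25} \]

is an upper bound of $\mu_2(t)$:

\[ \mu_2(t) \le \bar\mu_2(t) = \frac{\mu_{2,N}}{\sqrt{2}}\;\frac{1}{1 - 2\sqrt{2}\,\mu_{2,N}\,t}. \tag{3.26} \]

This, at $t = \frac{\beta}{2} = \frac{\sqrt{2}-1}{4}$ for d = 4, implies (3.14).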


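Put

\[ M(t) = \frac{1}{1 - 2\sqrt{2}\,\mu_{2,N}\,t}, \qquad m(t) = \bar\mu_2(t) - \mu_2(t). \]

We have m(t) ≥ 0, and (3.13) implies that M(t) is increasing in $t \in [0, \beta/2]$. By a change of variable $z = M(t) - 1$ ($dz = 2\sqrt{2}\,\mu_{2,N}\,M(t)^2\,dt$) and by putting

\[ \hat m(z) = m(t)/M(t)^2, \quad \hat\mu_4(z) = \mu_4(t)/M(t)^4, \quad \hat\mu_6(z) = \mu_6(t)/M(t)^6, \quad \hat\mu_8(z) = \mu_8(t)/M(t)^8, \]

we have, from (3.6)–(3.9),

\[ \hat\mu_4(z) = \frac{\mu_{4,N}}{4} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \bigl( -8\hat m(z)\hat\mu_4(z) - 15\hat\mu_6(z) \bigr)\,dz, \tag{3.27} \]

\[ \hat\mu_6(z) = \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \bigl( 8\hat\mu_4(z)^2 - 12\hat m(z)\hat\mu_6(z) - 28\hat\mu_8(z) \bigr)\,dz, \tag{3.28} \]

\[ \hat\mu_8(z) = \frac{\mu_{8,N}}{32} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \bigl( 24\hat\mu_4(z)\hat\mu_6(z) - 16\hat m(z)\hat\mu_8(z) - 45\hat\mu_{10}(z) \bigr)\,dz, \tag{3.29} \]

\[ \hat m(z) = \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \bigl( 6\hat\mu_4(z) - 2\hat m(z)^2 \bigr)\,dz. \tag{3.30} \]

Eqs. (3.27)–(3.30) with positivity of $\mu_{2n}(t)$ imply

\[ \hat\mu_4(z) \le \frac{\mu_{4,N}}{4}, \tag{3.31} \]

\[ \hat\mu_6(z) \le \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z 8\hat\mu_4(z)^2\,dz \le \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{\mu_{4,N}^2}{2\sqrt{2}\,\mu_{2,N}}\,z, \tag{3.32} \]

\[ \hat\mu_8(z) \le \frac{\mu_{8,N}}{32} + \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z 24\hat\mu_4(z)\hat\mu_6(z)\,dz \le \frac{\mu_{8,N}}{32} + \frac{3}{8}\,\frac{\mu_{4,N}\,\mu_{6,N}}{\mu_{2,N}}\,z + \frac{3}{4}\,\frac{\mu_{4,N}^3}{\mu_{2,N}^2}\,z^2, \tag{3.33} \]

\[ \hat m(z) \le \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z 6\hat\mu_4(z)\,dz \le \frac{3\mu_{4,N}}{2\sqrt{2}\,\mu_{2,N}}\,z. \tag{3.34} \]

In particular, (3.34) at $t = \frac{\beta}{2}$ ($z = M(\frac{\beta}{2}) - 1 = \sqrt{2}\,r_N - 1$ for d = 4) implies (3.15). Using (3.31), (3.32), (3.34) in (3.27), we have

\[ \hat\mu_4(z) \ge \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\,z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\,z^2. \tag{3.35} \]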


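Using (3.32), (3.33), (3.34), (3.35) in (3.28) and (3.30) we further have

\[ \hat\mu_6(z) \ge \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{\mu_{4,N}^2}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{12\mu_{4,N}^3}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{123\mu_{4,N}\,\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2 - \frac{7\mu_{8,N}}{8\sqrt{2}\,\mu_{2,N}}\,z, \tag{3.36} \]

\[ \hat m(z) \ge \frac{3\mu_{4,N}}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{6\mu_{4,N}^2}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{45\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2. \tag{3.37} \]

When d = 4, $\beta = \frac{\sqrt{2}-1}{2}$, $M(\frac{\beta}{2}) = \sqrt{2}\,r_N$ and $z = M(\frac{\beta}{2}) - 1 = \sqrt{2}\,r_N - 1$. Then the assumptions (3.16)–(3.18) of Proposition 3.1 imply that the right-hand sides of (3.35), (3.36), and (3.37) are non-negative at $t = \frac{\beta}{2}$. On the other hand, they are concave in z for z ≥ 0. Recall also that $z = M(t) - 1$ is increasing in $t \in [0, \beta/2]$. Therefore, they are non-negative for all $t \in [0, \beta/2]$. Using (3.35), (3.36), and (3.37) in (3.27), we therefore have

\[ \begin{aligned} \hat\mu_4(z) \le \frac{\mu_{4,N}}{4} - \frac{1}{\sqrt{2}\,\mu_{2,N}} \int_0^z \Bigl[\, & 8 \Bigl( \frac{3\mu_{4,N}}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{6\mu_{4,N}^2}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{45\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2 \Bigr) \Bigl( \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\,z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\,z^2 \Bigr) \\ & + 15 \Bigl( \frac{\mu_{6,N}}{8\sqrt{2}} + \frac{\mu_{4,N}^2}{2\sqrt{2}\,\mu_{2,N}}\,z - \frac{12\mu_{4,N}^3}{\sqrt{2}\,\mu_{2,N}^3}\,z^3 - \frac{123\mu_{4,N}\,\mu_{6,N}}{16\sqrt{2}\,\mu_{2,N}^2}\,z^2 - \frac{7\mu_{8,N}}{8\sqrt{2}\,\mu_{2,N}}\,z \Bigr) \Bigr]\,dz \\ \le \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\,z & - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\,z^2 + \frac{705\mu_{4,N}\,\mu_{6,N}}{32\mu_{2,N}^3}\,z^3 + \frac{447\mu_{4,N}^3}{16\mu_{2,N}^4}\,z^4 + \frac{105\mu_{8,N}}{32\mu_{2,N}^2}\,z^2. \end{aligned} \tag{3.38} \]

Recalling that at $t = \beta/2$ ($z = M(\frac{\beta}{2}) - 1 = \sqrt{2}\,r_N - 1$) we have

\[ \bar\mu_2\Bigl(\frac{\beta}{2}\Bigr) = r_N\,\mu_{2,N}, \]

\[ \mu_{2,N+1} = r_N\,\mu_{2,N} - \hat m(\sqrt{2}\,r_N - 1)\,M\Bigl(\frac{\beta}{2}\Bigr)^2, \]

\[ \mu_{4,N+1} = \hat\mu_4(\sqrt{2}\,r_N - 1)\,M\Bigl(\frac{\beta}{2}\Bigr)^4, \quad \mu_{6,N+1} = \hat\mu_6(\sqrt{2}\,r_N - 1)\,M\Bigl(\frac{\beta}{2}\Bigr)^6, \quad \mu_{8,N+1} = \hat\mu_8(\sqrt{2}\,r_N - 1)\,M\Bigl(\frac{\beta}{2}\Bigr)^8, \]

we see that (3.37), (3.35), (3.38), (3.32), (3.36), (3.33) imply (3.19)–(3.24), respectively. This completes the proof of Proposition 3.1.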


4. Bleher–Sinai Argument

In order to show Theorem 2.1, we confirm the existence of a critical parameter $s = s_c$ by means of the Bleher–Sinai argument, and, at the same time, we derive the expected decay of $\mu_{4,N}$. In the Bleher–Sinai argument, monotonicity of $\underline{s}_N$ and $\bar{s}_N$ with respect to N is essential.

Proposition 4.1. Let d = 4. Then the following hold:
(1) If $\mu_{2,N} - 1 < 0$ then $\mu_{2,N+1} < \mu_{2,N}$.
(2) If $\frac{1}{4} > \mu_{2,N} - 1 \ge \frac{3}{\sqrt{2}}\,\mu_{4,N}$ then $\mu_{2,N+1} \ge \mu_{2,N}$.

Proof. Note that for both cases in the statement, the assumption (3.13) in Proposition 3.1 holds. Hence, (3.14), with (3.11) and monotonicity of $\mu_{2,N}$, implies

\[ \mu_{2,N} - 1 < 0 \;\Rightarrow\; r_N < 1 \;\Rightarrow\; \mu_{2,N+1} < \mu_{2,N}. \tag{4.1} \]

Next we see that (3.15), with (3.11) and (3.12), implies

\[ \mu_{2,N} - 1 \ge \frac{3 r_N (\sqrt{2}\,r_N - 1)}{(2 - \sqrt{2})\,\mu_{2,N}^2}\,\mu_{4,N} \;\Rightarrow\; \mu_{2,N+1} \ge \mu_{2,N}. \tag{4.2} \]

Put

\[ L_1(x) = \frac{3}{\sqrt{2}\,x\,\bigl( \sqrt{2} - (\sqrt{2} - 1)x \bigr)^2}. \]

Then by straightforward calculation we see

\[ 1 \le x \le \frac{5}{4} \;\Rightarrow\; L_1(x) \le L_1(1) = \frac{3}{\sqrt{2}}, \]

and (3.11) implies

\[ L_1(\mu_{2,N}) = \frac{3 r_N (\sqrt{2}\,r_N - 1)}{(2 - \sqrt{2})\,\mu_{2,N}^2}. \]

Therefore (4.2) implies that

\[ \frac{1}{4} > \mu_{2,N} - 1 \ge \frac{3}{\sqrt{2}}\,\mu_{4,N} \;\Rightarrow\; \mu_{2,N+1} \ge \mu_{2,N}. \tag{4.3} \]

Corollary 4.2. Let d = 4. Then, for the $\underline{s}_N$ defined in (2.11), it holds that $\underline{s}_N \le \underline{s}_{N+1}$.

Proof. Since $\mu_{2,N}$ is increasing in s, if $s < \underline{s}_N$ then $\mu_{2,N} < 1$, hence Proposition 4.1 implies $\mu_{2,N+1} < \mu_{2,N} < 1$, further implying $s < \underline{s}_{N+1}$. Hence the statement holds.
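The “straightforward calculation” behind the bound $L_1(x) \le L_1(1) = 3/\sqrt{2}$ on $1 \le x \le 5/4$ can be illustrated numerically. The sketch below assumes the form $L_1(x) = 3 / \bigl(\sqrt{2}\,x\,(\sqrt{2} - (\sqrt{2}-1)x)^2\bigr)$ used in the proof of Proposition 4.1; the grid scan is only a plausibility check, not a rigorous bound, and the helper names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def L1(x):
    """L_1(x) = 3 / (sqrt(2) x (sqrt(2) - (sqrt(2) - 1) x)^2)."""
    return 3.0 / (SQRT2 * x * (SQRT2 - (SQRT2 - 1.0) * x) ** 2)


def max_on_grid(lo=1.0, hi=1.25, steps=10_000):
    """Largest value of L_1 on a uniform grid over [lo, hi]."""
    return max(L1(lo + (hi - lo) * i / steps) for i in range(steps + 1))


if __name__ == "__main__":
    print("L1(1)        =", L1(1.0))
    print("3/sqrt(2)    =", 3.0 / SQRT2)
    print("grid maximum =", max_on_grid())
```

The denominator $x(\sqrt{2} - (\sqrt{2}-1)x)^2$ equals 1 at x = 1 and stays at least 1 up to x = 5/4, which is why the maximum of $L_1$ on this interval sits at the left endpoint.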


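For later convenience, define

\[ r_N^* = \frac{1}{1 - (\sqrt{2} - 1)\,\frac{3}{\sqrt{2}}\,\mu_{4,N}}, \tag{4.4} \]

\[ \underline{\zeta}_N^* = 1 - \frac{1}{\sqrt{2}}, \tag{4.5} \]

\[ \bar{\zeta}_N^* = \frac{\sqrt{2}\,r_N^* - 1}{\sqrt{2}\,\bigl( 1 + \frac{3}{\sqrt{2}}\,\mu_{4,N} \bigr)}. \tag{4.6} \]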

Then we see that if (2.13) holds, then we have, from (3.11) and (3.12), 1 2M, n n c n c n n n n aM,N a",N an−",N ≤ a1,N × M bn,N = " " 4 4 a1,N "=0 "=0 c a n a 1,N M,N = . M 2 a1,N

(5.26)


Therefore 2a¯ ",N ≤

aM,N c a1,N " M 2 a1,N

≤

aM,N M a1,N

=

aM,N M a1,N

=

aM,N M a1,N

∞

m (2m + 2" − 1)!! (2m)!! (2" − 1)!! m=2M+1−"

∞ c a " m m + " 1,N βc a1,N " 2 m=2M+1−"

∞ c a " 2M+1−" k 2M + 1 + k 1,N βc a1,N βc a1,N " 2 k=0

"

∞ 2M+1 k 2M + 1 + k 1 . (5.27) βc a1,N βc a1,N " 2β βc a1,N

k=0

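Here,

\[ T_{2M+1,\ell}(r) = \sum_{k=0}^{\infty} (\beta c\,a_{1,N})^k \binom{2M+1+k}{\ell} = \sum_{k=0}^{\infty} r^k \binom{2M+1+k}{\ell} = \frac{1}{1-r} \sum_{m=0}^{\ell} \binom{2M+1}{\ell-m}\,q^m, \tag{5.28} \]

where $r = \beta c\,a_{1,N}$, and $q = \frac{r}{1-r}$. By assumption $r < \frac{1}{2}$. The binomial coefficient in the summand is largest when m = 0, because $2M + 1 > 2M \ge 2\ell$. Therefore,

\[ T_{2M+1,\ell}(r) \le \frac{1}{1-r} \binom{2M+1}{\ell} \sum_{m=0}^{\ell} q^m \le \frac{1}{1-r}\,\frac{1}{1-q} \binom{2M+1}{\ell} = \frac{1}{1-2r} \binom{2M+1}{\ell}. \tag{5.29} \]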

This proves

2a¯ ",N ≤

1 2β

"

2M+1

βc a1,N aM,N 2M + 1 × M ≤ 2a¯ ",N , " 1 − 2βc a1,N a1,N

where 2a¯ ",N is defined in (5.14). This proves a˜ n,N ≤ a˜¯ n,N .

(5.30)

Remark. We can “improve” Proposition 5.1 by employing (correct) bounds, in a similar ca¯

n

1,N way as the term proportional to in (5.9). In actual calculations, we improve 2 a¯ n,N+1 , n = 1, 2, · · · , M, in (5.12), the upper bounds for an,N+1 ’s, using (A.6) (as well 2 as its special case (5.5)). To be more specific, we compare a¯ 4,N+1 in (5.12) with a¯ 2,N+1 and replace the definition if the latter is smaller. Then we go on to “improve” a¯ 6,N+1 by comparing with a¯ 2,N+1 a¯ 4,N+1 , and so on. Conceptually there is nothing really new here, but this procedure improves the actual value of the bounds in Proposition 5.1.


5.3. Computer results. In this subsection we prove Theorem 2.2 on computers using Proposition 5.1. We double-checked the results with Mathematica and C++ programs based on interval arithmetic. Here we give results from the C++ programs.

Our program employs interval arithmetic, which gives rigorous bounds numerically. The idea is to express a number by a pair of “vectors”, each of which consists of an array of length M of “digits”, taking values in {0, 1, 2, · · · , 9}, and an integer corresponding to the “exponent”. To give a simple example, let M = 2. One can view 0.0523 as being expressed in the program, for example, as $I_1 = [5.2 \times 10^{-2}, 5.3 \times 10^{-2}]$, and 3 as $I_2 = [3.0 \times 10^{0}, 3.0 \times 10^{0}]$. When the division $I_1/I_2$ is performed, our program routines are so designed that they give correct bounds as an output. Namely, the computer output of $I_1/I_2$ will be $[1.7 \times 10^{-2}, 1.8 \times 10^{-2}]$. We may occasionally lose the best possible bounds, but the program is so designed that we never lose the correctness of the bounds. Thus all the outputs are rigorous bounds of the corresponding quantities. In the actual calculation we took M = 70 digits, which turned out to be sufficient.

We also note that interval arithmetic is employed in [14] for the hierarchical model in d = 3 dimensions. We took an independent approach in programming – we focused on ease in implementing the interval arithmetic in main programs developed for standard floating point calculations – so that the structure and details of the programs are quite different. However, our numerical calculations are not heavy enough to require anything special. For the program which we used for our proof, see the supplement to [17].

As will be explained below, we only need to consider 2 values for the initial Ising parameter s: $s_- = 1.7925671170092624$ and $s_+ = 1.7925671170092625$. We perform explicit recursion on computers for each $s = s_\pm$ using Proposition 5.1. We summarize what is left to be proved:

(1) $\bar a_{1,N} < \frac{1}{2\beta c}$, $0 \le s \le \bar{s}_{N_1}$, $0 \le N \le N_1$, where $N_1 = 100$. This condition is from (5.15), imposed because we are going to do the evaluation using Proposition 5.1. Note that this condition is stronger than (2.17) in the assumptions of Theorem 2.1, because $\frac{1}{2\beta c} = \frac{1}{2}(2 + \sqrt{2}) = 1.707\cdots$ for d = 4.

(2) $s_- \le \underline{s}_{N_1}$ and $\bar{s}_{N_1} \le s_+$. To prove this, it is sufficient (as seen from the definitions (2.11) and (2.12)) to prove

\[ \mu_{2,N_1} < 1 \ \text{ when } s = s_-, \qquad \mu_{2,N_1} > 1 + \frac{3}{\sqrt{2}}\,\mu_{4,N_1} \ \text{ when } s = s_+. \tag{5.31} \]

(3) For any s satisfying $s_- \le s \le s_+$, the bounds

\[ (0 \le)\ \mu_{4,N_0} \le 0.0045, \tag{5.32} \]

\[ 1.6\,\mu_{4,N_0}^2 \le \mu_{6,N_0} \le 6.07\,\mu_{4,N_0}^2, \tag{5.33} \]

\[ (0 \le)\ \mu_{8,N_0} \le 48.469\,\mu_{4,N_0}^3, \tag{5.34} \]

hold for $N_0 = 70$. This condition comes from the assumptions in Theorem 2.1 (sufficient, if $s_- \le \underline{s}_{N_1}$ and $\bar{s}_{N_1} \le s_+$).

We now summarize our results from explicit calculations.
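The two-digit division example of Sect. 5.3 ($I_1/I_2$ with M = 2) can be imitated with directed rounding. The following is a minimal sketch using Python's standard `decimal` module, restricted to intervals with non-negative numerator and positive denominator; it is not the authors' C++ implementation (for which the text refers to the supplement to [17]), and the function name `divide_positive_intervals` is hypothetical.

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR


def divide_positive_intervals(num, den, digits):
    """Enclose num/den for intervals num = (lo, hi) >= 0 and den = (lo, hi) > 0.

    The lower endpoint is rounded toward -infinity and the upper endpoint
    toward +infinity, each to the given number of significant digits, so the
    returned interval always contains the exact quotient set.
    """
    num_lo, num_hi = num
    den_lo, den_hi = den
    if num_lo < 0 or den_lo <= 0:
        raise ValueError("sketch handles only num >= 0 and den > 0")
    down = Context(prec=digits, rounding=ROUND_FLOOR)
    up = Context(prec=digits, rounding=ROUND_CEILING)
    return down.divide(num_lo, den_hi), up.divide(num_hi, den_lo)


# The example from the text: 0.0523 and 3 represented with M = 2 digits.
I1 = (Decimal("0.052"), Decimal("0.053"))
I2 = (Decimal("3.0"), Decimal("3.0"))

if __name__ == "__main__":
    print(divide_positive_intervals(I1, I2, digits=2))
```

Dividing the smallest numerator by the largest denominator (rounded down) and the largest numerator by the smallest denominator (rounded up) is what keeps the enclosure correct even when the best possible digits are lost.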


(1) We have $\bar a_{1,N} \le \frac{1}{2}s_+^2 = 1.6066\cdots$, $0 \le s \le s_+$, $0 \le N \le N_1$. The largest value for $\bar a_{1,N}$ in the range of parameters is actually obtained at $s = s_+$ and N = 0.

(2) Our calculations turned out to be accurate enough to obtain more than 40 digits below the decimal point correctly for $\mu_{2,100}$ and $\mu_{4,100}$ at $s = s_\pm$, which is more than enough to prove (5.31). In fact, we have

\[ 0.99609586499804791366176669341357334889503943 \le \underline{a}_{1,100} \le \mu_{2,100} \le \bar a_{1,100} \le 0.99609586499804791366176669341357334889503972, \]

at $s = s_-$, and

\[ 1.0131857903720691722396611098376636943838027 \le \underline{a}_{1,100} \le \mu_{2,100} \le \bar a_{1,100} \le 1.0131857903720691722396611098376636943838031, \]

\[ 0.00281027097809098768088795100753480139767915 \le \tfrac{1}{2}\bigl( -\bar a_{2,100} + \underline{a}_{1,100}^2 \bigr) \le \mu_{4,100} \le \tfrac{1}{2}\bigl( -\underline{a}_{2,100} + \bar a_{1,100}^2 \bigr) \le 0.00281027097809098768088795100753480139767969, \]

at $s = s_+$.

(3) To prove (5.32)–(5.34), we note the following. Let us write the s dependence of $a_{n,N}$ and $\mu_{n,N}$ explicitly as $a_{n,N}(s)$ and $\mu_{n,N}(s)$. For any integer N and for any s satisfying $s_- \le s \le s_+$, the monotonicity of $a_{n,N}(s)$ with respect to s implies

\[ \mu_{4,N}(s) = \frac{1}{2}\bigl( -a_{2,N}(s) + a_{1,N}(s)^2 \bigr) \le \frac{1}{2}\bigl( -a_{2,N}(s_-) + a_{1,N}(s_+)^2 \bigr) =: \bar\mu_{4,N}. \tag{5.35} \]

Hence if we can prove $\bar\mu_{4,70} \le 0.0045$, then we have proved (5.32). In a similar way, sufficient conditions for (5.33) and (5.34) are

\[ 1.6 \le \frac{\underline{\mu}_{6,70}}{\bar\mu_{4,70}^2}, \qquad \frac{\bar\mu_{6,70}}{\underline{\mu}_{4,70}^2} \le 6.07, \qquad \frac{\bar\mu_{8,70}}{\underline{\mu}_{4,70}^3} \le 48.469, \]

with obvious definitions (as in (5.35) for $\bar\mu_{4,N}$) for $\underline{\mu}_{n,70}$ and $\bar\mu_{n,70}$. The bounds we have for these quantities are (we shall not waste space by writing too many digits):

\[ \bar\mu_{4,70} \le 0.004144, \qquad 3.6459 \le \frac{\underline{\mu}_{6,70}}{\bar\mu_{4,70}^2}, \qquad \frac{\bar\mu_{6,70}}{\underline{\mu}_{4,70}^2} \le 3.7542, \qquad \frac{\bar\mu_{8,70}}{\underline{\mu}_{4,70}^3} \le 38.488. \]

This completes the proof of Theorem 2.2, and therefore Theorem 1.1 is proved.

Acknowledgement. The authors would like to thank Yoichiro Takahashi for his interest in the present work and for discussions. Part of this work was done while T. Hara was at the Department of Mathematics, Tokyo Institute of Technology. The research of T. Hara and T. Hattori is partially supported by a Grant-in-Aid for Scientific Research (C) of the Ministry of Education, Science, Sports and Culture.


A. Newman’s Inequalities

Let X be a stochastic variable which is in class $\mathcal{L}$ of [15]. $X \in \mathcal{L}$ has the Lee–Yang property, which states that the zeros of the moment generating function $E\bigl[e^{HX}\bigr]$ are pure imaginary. In fact, it is shown in [15, Prop. 2] using Hadamard’s Theorem that $E\bigl[e^{HX}\bigr]$ has the following expression:

\[ E\bigl[ e^{HX} \bigr] = e^{bH^2} \prod_{j} \Bigl( 1 + \frac{H^2}{\alpha_j^2} \Bigr), \tag{A.1} \]

where b is a non-negative constant and $\alpha_j$, j = 1, 2, 3, · · · , is a positive nondecreasing sequence satisfying $\sum_{j=1}^{\infty} \alpha_j^{-2} < \infty$.

Consequences of (A.1) in terms of inequalities among moments (n point functions) are given in [15], among which we note the following:

1. Positivity [15, Theorem 3]. Put

\[ \mu_{2n} = -\frac{1}{(2n)!}\,\frac{d^{2n}}{d\xi^{2n}} \log E\bigl[ e^{\sqrt{-1}\,\xi X} \bigr] \Big|_{\xi=0}. \tag{A.2} \]

Then,

\[ \mu_{2n} \ge 0, \quad n = 0, 1, 2, \cdots. \tag{A.3} \]

(Note that (A.1) implies $\mu_{2n+1} = 0$.)

2. Newman’s bound [15, Theorem 6]. Put $v_{2n} = n\mu_{2n}$. Then,

\[ v_{4n} \le v_4^{\,n}, \qquad v_6 \le \sqrt{v_4 v_8}, \qquad v_{4n+2} \le v_6\,v_4^{\,n-1}, \tag{A.4} \]

where the first and third inequalities follow from (2.10) of [15], while the second one is (2.12) of [15]. These imply $v_{2n} \le v_4^{\,n/2}$, n ≥ 2, and therefore

\[ \mu_{2n} \le \frac{(2\mu_4)^{n/2}}{n}, \quad n = 2, 3, 4, \cdots. \tag{A.5} \]

Furthermore, we will prove the following.

Proposition A.1. Put $a_N = \frac{N!}{(2N)!}\,E\bigl[ X^{2N} \bigr]$, $N \in \mathbb{Z}_+$. Then,

\[ a_{M+N} \le a_M\,a_N, \quad N, M = 0, 1, 2, \cdots. \tag{A.6} \]

Proof. Put $y_j = \alpha_j^{-2} > 0$. Then

\[ E\bigl[ e^{HX} \bigr] = e^{bH^2} \prod_{j} \bigl( 1 + H^2 y_j \bigr). \tag{A.7} \]


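Expand the infinite product to obtain

\[ \prod_{j} \bigl( 1 + H^2 y_j \bigr) = 1 + H^2 \sum_{j} y_j + \frac{H^4}{2!} \sum_{i,j}{}^{\prime}\, y_i y_j + \frac{H^6}{3!} \sum_{i,j,k}{}^{\prime}\, y_i y_j y_k + \cdots = \sum_{n=0}^{\infty} \frac{H^{2n}}{n!}\,c_n, \tag{A.8} \]

with

\[ c_n = \sum_{i_1, i_2, \ldots, i_n}{}^{\prime}\; y_{i_1} y_{i_2} y_{i_3} \cdots y_{i_n}, \tag{A.9} \]

where primed summations denote summations over non-coinciding indices. Hence we have,

\[ E\bigl[ e^{HX} \bigr] = \sum_{N=0}^{\infty} H^{2N} \sum_{m,n:\, m+n=N} \frac{b^m}{m!}\,\frac{c_n}{n!} = \sum_{N=0}^{\infty} H^{2N} \sum_{n=0}^{N} \frac{b^{N-n}\,c_n}{(N-n)!\,n!}. \tag{A.10} \]

Comparing with $E\bigl[ e^{HX} \bigr] = \sum_{N=0}^{\infty} \frac{a_N}{N!}\,H^{2N}$, we obtain

\[ a_N = \sum_{n=0}^{N} \binom{N}{n}\,b^{N-n}\,c_n. \]

Note that (A.9) implies

\[ c_{n+m} \le c_m\,c_n, \tag{A.11} \]

because the conditions of primed summations are weaker for the left-hand side. This with b ≥ 0 implies

\[ \begin{aligned} a_M a_N &= \sum_{m=0}^{M} \sum_{n=0}^{N} \binom{M}{m} \binom{N}{n}\,b^{M+N-m-n}\,c_m c_n \\ &\ge \sum_{m=0}^{M} \sum_{n=0}^{N} \binom{M}{m} \binom{N}{n}\,b^{M+N-m-n}\,c_{m+n} \\ &= \sum_{\ell=0}^{M+N} b^{M+N-\ell}\,c_\ell \sum_{m:\, 0 \le m \le M,\ 0 \le \ell-m \le N} \binom{M}{m} \binom{N}{\ell-m} \\ &= \sum_{\ell=0}^{M+N} b^{M+N-\ell}\,c_\ell \binom{M+N}{\ell} = a_{M+N}, \end{aligned} \]

where, in the last line, we also used

\[ \sum_{m:\, 0 \le m \le M,\ 0 \le \ell-m \le N} \binom{M}{m} \binom{N}{\ell-m} = \binom{M+N}{\ell}, \tag{A.12} \]

which is seen to hold if we compare the coefficients of $x^\ell$ in the identity $(1 + x)^{M+N} = (1 + x)^M (1 + x)^N$.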
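Proposition A.1 and the identity (A.12) can be illustrated on the simplest member of the class considered here, a symmetric Ising variable $X = \pm s$, for which $E[X^{2N}] = s^{2N}$ and hence $a_N = \frac{N!}{(2N)!}s^{2N}$. The sketch below uses exact rational arithmetic; the choice of this example and the helper names are assumptions made for illustration, not part of the original proof.

```python
from fractions import Fraction
from math import comb, factorial


def a_ising(N, s2=Fraction(1)):
    """a_N = N!/(2N)! * E[X^(2N)] for X = +-s with equal weights, s2 = s**2."""
    return Fraction(factorial(N), factorial(2 * N)) * s2 ** N


def vandermonde_holds(M, N, ell):
    """Check (A.12): sum_m C(M, m) C(N, ell - m) == C(M + N, ell)."""
    lhs = sum(
        comb(M, m) * comb(N, ell - m)
        for m in range(max(0, ell - N), min(M, ell) + 1)
    )
    return lhs == comb(M + N, ell)


def submultiplicative(max_index, s2=Fraction(1)):
    """Check (A.6), a_{M+N} <= a_M a_N, for all 0 <= M, N <= max_index."""
    return all(
        a_ising(M + N, s2) <= a_ising(M, s2) * a_ising(N, s2)
        for M in range(max_index + 1)
        for N in range(max_index + 1)
    )


if __name__ == "__main__":
    print(submultiplicative(8, Fraction(3, 2)))
    print(all(vandermonde_holds(4, 5, ell) for ell in range(10)))
```

For this example (A.6) reduces to $\binom{2M+2N}{2M} \ge \binom{M+N}{M}$, with equality exactly when M = 0 or N = 0.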


References

1. Bleher, P.M. and Sinai, Ya.G.: Investigation of the critical point in models of the type of Dyson’s hierarchical model. Commun. Math. Phys. 33, 23–42 (1973)
2. Bleher, P.M. and Sinai, Ya.G.: Critical indices for Dyson’s asymptotically hierarchical models. Commun. Math. Phys. 45, 247–278 (1975)
3. Collet, P. and Eckmann, J.-P.: A renormalization group analysis of the hierarchical model in statistical physics. Springer Lecture Note in Physics 74, 1978
4. Dyson, F.J.: Existence of a phase-transition in a one-dimensional Ising ferromagnet. Commun. Math. Phys. 12, 91–107 (1969)
5. Gallavotti, G.: Some aspects of the renormalization problems in statistical mechanics. Memorie dell’ Accademia dei Lincei 15, 23–59 (1978)
6. Gawędzki, K. and Kupiainen, A.: Triviality of $\phi^4_4$ and all that in a hierarchical model approximation. J. Stat. Phys. 29, 683–699 (1982)
7. Gawędzki, K. and Kupiainen, A.: Non-Gaussian fixed points of the block spin transformation. Hierarchical model approximation. Commun. Math. Phys. 89, 191–220 (1983)
8. Gawędzki, K. and Kupiainen, A.: Nongaussian Scaling limits. Hierarchical model approximation. J. Stat. Phys. 35, 267–284 (1984)
9. Gawędzki, K. and Kupiainen, A.: Asymptotic freedom beyond perturbation theory. In: K. Osterwalder and R. Stora, eds., Critical Phenomena, Random Systems, Gauge Theories. Les Houches 1984, Amsterdam: North-Holland, 1986
10. Gawędzki, K. and Kupiainen, A.: Massless lattice $\phi^4_4$ Theory: Rigorous control of a renormalizable asymptotically free model. Commun. Math. Phys. 99, 199–252 (1985)
11. Koch, H. and Wittwer, P.: A non-Gaussian renormalization group fixed point for hierarchical scalar lattice field theories. Commun. Math. Phys. 106, 495–532 (1986)
12. Koch, H. and Wittwer, P.: On the renormalization group transformation for scalar hierarchical models. Commun. Math. Phys. 138, 537–568 (1991)
13. Koch, H. and Wittwer, P.: A nontrivial renormalization group fixed point for the Dyson–Baker hierarchical model. Commun. Math. Phys. 164, 627–647 (1994)
14. Koch, H. and Wittwer, P.: Bounds on the zeros of a renormalization group fixed point. Mathematical Physics Electronic Journal 1, No. 6 (24pp.) (1995)
15. Newman, C.M.: Inequalities for Ising models and field theories which obey the Lee–Yang theorem. Commun. Math. Phys. 41, 1–9 (1975)
16. Sinai, Ya.G.: Theory of phase transition: Rigorous results. New York: Pergamon Press, 1982
17. Hara, T., Hattori, T., and Watanabe, H.: Triviality of hierarchical Ising Model in four dimensions. Archived in mp_arc (Mathematical Physics Preprint Archive, http://www.ma.utexas.edu/mp_arc/) 00-397

Communicated by D. C. Brydges

Commun. Math. Phys. 220, 41 – 67 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Geometric Optics and Long Range Scattering for One-Dimensional Nonlinear Schrödinger Equations

Rémi Carles

Antenne de Bretagne de l’ENS Cachan and IRMAR, Campus de Ker Lann, 35 170 Bruz, France. E-mail: [email protected]

Received: 23 May 2000 / Accepted: 8 January 2001

Abstract: With the methods of geometric optics used in [2], we provide a new proof of some results of [11], to construct modified wave operators for the one-dimensional cubic Schrödinger equation. We improve the rate of convergence of the nonlinear solution towards the simplified evolution, and get better control of the loss of regularity in Sobolev spaces. In particular, using the results of [9], we deduce the existence of a modified scattering operator with small data in some Sobolev spaces. We show that in terms of geometric optics, this gives rise to a “random phase shift” at a caustic.

Contents
1. Introduction
2. Formal Computations
3. Estimates on Some Oscillatory Integrals
4. Energy Estimates
5. Justification of Nonlinear Geometric Optics Before the Caustic
6. Interpretation
7. Construction of the Modified Scattering Operator and Application

1. Introduction

In this article, we consider the nonlinear Schrödinger equation in one space dimension,

\[ i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = \lambda |\psi|^p \psi, \quad \lambda \in \mathbb{R}, \tag{1.1} \]

in the particular case where p = 2. Define the Fourier transform by

\[ \mathcal{F}v(\xi) = \hat v(\xi) = \int e^{-ix.\xi}\,v(x)\,dx. \]


R. Carles

For p > 2, it is well known that to any asymptotic state $\psi_- \in H^1 \cap \mathcal{F}(H^1) =: \Sigma$, one can associate a solution $\psi$ of (1.1) that behaves asymptotically as the free evolution of $\psi_-$, that is,

\[ \| U_0(-t)\psi(t) - \psi_- \|_{\Sigma} \underset{t\to-\infty}{\longrightarrow} 0, \]

where $U_0(t) := e^{i\frac{t}{2}\partial_x^2}$ denotes the unitary group of the free Schrödinger equation. The operator $W_- : \psi_- \mapsto \psi_{|t=0}$ is called a wave operator. The case p = 2 is different (long range case). It is proved (see [1,12,13,5]) that if $\psi_- \in L^2$ and $U_0(-t)\psi(t) - \psi_- \to 0$ in $L^2$ as $t \to -\infty$, where $\psi$ solves

\[ i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = |\psi|^2 \psi, \]

then $\psi = \psi_- = 0$. One cannot compare the nonlinear dynamics with the free dynamics. In [11], the author constructs modified wave operators that make it possible to compare the nonlinear dynamics of (1.1) when p = 2 with a simpler one, yet more complicated than the free dynamics. Assuming the asymptotic state $\psi_-$ is sufficiently smooth and small in a certain Hilbert space, Ozawa defines a new operator (that depends on $\psi_-$) such that the evolution of $\psi_-$ under this dynamics can be compared to the asymptotic behavior of a certain solution of

\[ i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = \lambda |\psi|^2 \psi. \tag{1.2} \]

Using the methods of geometric optics as in [2], we rediscover these modified operators, and improve some convergence estimates (Corollary 1). Moreover, we have better control of the (possible) loss of regularity, which, along with the results of [9], makes it possible to define a modified scattering operator ($S = W_+^{-1} W_-$) for small data in $\mathcal{H}$ (Corollary 2). This enables us to describe the validity of nonlinear geometric optics with focusing initial data. In particular, we show that the caustic crossing is described in terms of the scattering operator (as in [2]), plus a “random phase shift” (Corollary 3).

In [8], Ginibre and Velo construct modified wave operators in Gevrey spaces. They make no size restriction on the data, but require analyticity for the asymptotic states. In the present article, we cannot leave out the smallness assumption, but our asymptotic states are less regular. Denote

\[ \mathcal{H} := \{ f \in H^3(\mathbb{R});\ xf \in H^2(\mathbb{R}) \} = \{ f \in \mathcal{S}'(\mathbb{R});\ \|f\|_{\mathcal{H}} := \|(1 + x^2)^{1/2}(1 - \partial_x^2) f\|_{L^2} + \|(1 - \partial_x^2)^{3/2} f\|_{L^2} < \infty \}. \]

Recall one of the results in [11].

Theorem 1 ([11], Theorem 2). There exists $\gamma > 0$ with the following properties. For any $\psi_- \in \mathcal{F}(\mathcal{H})$ with $\|\hat\psi_-\|_{L^\infty} < \gamma$, (1.2) has a unique solution $\psi \in C(\mathbb{R}; H^1) \cap L^4_{loc}(\mathbb{R}; W^{1,\infty})$ such that for any $\alpha$ with $1/2 < \alpha < 1$,

\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{H^1} = O(|t|^{-\alpha}), \tag{1.3} \]

\[ \Bigl( \int_{-\infty}^{t} \| \psi(\tau) - e^{iS^-(\tau)} U_0(\tau)\psi_- \|_{W^{1,\infty}}^4\,d\tau \Bigr)^{1/4} = O(|t|^{-\alpha}) \quad \text{as } t \to -\infty, \tag{1.4} \]

Geometric Optics and Long Range Scattering for NLS


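where the phase shift $S^-$ is defined by

\[ S^-(t, x) := \frac{\lambda}{2\pi} \Bigl| \hat\psi_-\Bigl( \frac{x}{t} \Bigr) \Bigr|^2 \log |t|. \tag{1.5} \]

Remark 1. Theorem 1 in [11] gives an asymptotic in $L^2$ instead of $H^1$, and requires less regularity on the asymptotic state $\psi_-$. Yet, it is still required to be small in the same space as in Theorem 2.

Now we recall why the method of geometric optics can be closely related to scattering theory in the case of the nonlinear Schrödinger equation. In [2], we consider the initial value problem

\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2\partial_x^2 u^\varepsilon = \lambda\varepsilon^\alpha |u^\varepsilon|^\beta u^\varepsilon, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon_{|t=0} = e^{-i\frac{x^2}{2\varepsilon}} f(x), \end{cases} \tag{1.6} \]

where $\alpha \ge 1$, $\beta > 0$ and $0 < \varepsilon \le 1$ is a parameter going to zero. With the initial phase $-x^2/2$, rays of geometric optics (which are the projection on the (t, x) space of the bicharacteristics) focus at the point (t, x) = (1, 0). We proved in [2] that in the case where $\beta = 2\alpha > 2$ (“nonlinear caustic”), the asymptotic behavior, as $\varepsilon$ goes to zero, of the solution near t = 1 is easily expressed in terms of f and the wave operator $W_-$. To see that point, we introduced the scaling

\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}}\,\psi^\varepsilon\Bigl( \frac{t-1}{\varepsilon}, \frac{x}{\varepsilon} \Bigr), \tag{1.7} \]

that satisfies

\[ U_0\Bigl( \frac{1}{\varepsilon} \Bigr) \psi^\varepsilon\Bigl( -\frac{1}{\varepsilon} \Bigr) \underset{\varepsilon\to0}{\longrightarrow} \psi_- := \frac{1}{\sqrt{2i\pi}}\,\hat f(x). \tag{1.8} \]

Define the function $\psi$ by

\[ \begin{cases} i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = \lambda |\psi|^\beta \psi, \\ \psi_{|t=0} = W_- \psi_-. \end{cases} \]

Then $\psi$ is a concentrating profile for $u^\varepsilon$, that is

\[ u^\varepsilon(t, x) \underset{\varepsilon\to0}{\sim} \frac{1}{\sqrt{\varepsilon}}\,\psi\Bigl( \frac{t-1}{\varepsilon}, \frac{x}{\varepsilon} \Bigr). \]

In this paper, we treat the limiting case of (1.6), that is $\alpha = 1$, $\beta = 2$. We study the validity of nonlinear geometric optics, for positive times, for the solutions of the following initial value problem,

\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2\partial_x^2 u^\varepsilon = \lambda\varepsilon |u^\varepsilon|^2 u^\varepsilon, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon_{|t=0} = e^{-i\frac{x^2}{2\varepsilon} + i\lambda |f(x)|^2 \log\frac{1}{\varepsilon}} f(x). \end{cases} \tag{1.9} \]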


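We altered the initial data by adding the term $e^{i\lambda |f(x)|^2 \log\frac{1}{\varepsilon}}$ in order to recover the same modified wave operator as in [11]. Nonlinear geometric optics could be justified as well without this term, by the same methods as those which follow in this article, but this would not make it possible to deduce the existence of modified wave operators for (1.2). From now on, the function f in the initial data is supposed to belong to $\mathcal{H}$, and to be nonzero. Then for every (fixed) $\varepsilon > 0$, (1.9) has a unique global solution, which belongs to $C(\mathbb{R}_t, \Sigma)$ (see for instance [5,6]). The following definition, which follows the spirit of [2], will be motivated in Sect. 2.

Definition 1. Let $g^\varepsilon$ be defined for t < 1 by

\[ g^\varepsilon(t, \xi) := \lambda |f(-\xi)|^2 \log\frac{1-t}{\varepsilon}. \tag{1.10} \]

The approximate solution $u^\varepsilon_{app}$ is defined for t < 1 by

\[ u^\varepsilon_{app}(t, x) := \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + i g^\varepsilon(t,\xi)}\,a_0(\xi)\,\frac{d\xi}{2\pi}, \tag{1.11} \]

with

\[ a_0(\xi) := \sqrt{\frac{2\pi}{i}}\,f(-\xi). \tag{1.12} \]

We define the symbol $a^\varepsilon(t, \xi)$ by

\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + i g^\varepsilon(t,\xi)}\,a^\varepsilon(t, \xi)\,\frac{d\xi}{2\pi}, \tag{1.13} \]

which makes sense since $u^\varepsilon \in L^2$ and

\[ a^\varepsilon(t, \xi) = \frac{1}{\sqrt{\varepsilon}}\,e^{i\frac{t-1}{2\varepsilon}\xi^2 - i g^\varepsilon(t,\xi)}\,\hat u^\varepsilon\Bigl( t, \frac{\xi}{\varepsilon} \Bigr). \tag{1.14} \]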

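We can now state the main result.

Theorem 2. Let $f \in \mathcal{H}$. There exist $C^* = C^*(f)$ and $\varepsilon^* = \varepsilon^*(f) > 0$ such that for $0 < \varepsilon \le \varepsilon^*$, nonlinear geometric optics is valid before the focus, with the following distinctions.

– If $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$, then in $\{1 - t \ge C^*\varepsilon\}$ and for any $0 \le s \le 1$,

\[ \| a^\varepsilon(t, .) - a_0 \|_{H^s \cap \mathcal{F}(H^s)} = O\Bigl( \frac{\varepsilon}{1-t} \Bigl( \log\frac{1-t}{\varepsilon} \Bigr)^{2+s} \Bigr). \]

– If $\|f\|_{L^\infty} \ge |2\lambda|^{-1/2}$, then denote $C_0 := 2|\lambda|\,\|f\|_{L^\infty}^2$. For any $\alpha > 0$, there exists $C_\alpha^*$ such that in $\bigl\{ 1 - t \ge C_\alpha^* \bigl( \varepsilon |\log\varepsilon|^{5/2} \bigr)^{1/(C_0+\alpha)} \bigr\}$, and for any $0 \le s \le 1$,

\[ \| a^\varepsilon(t, .) - a_0 \|_{H^s \cap \mathcal{F}(H^s)} = O\Bigl( \frac{\varepsilon}{(1-t)^{C_0 + 2s\alpha}}\,|\log\varepsilon|^{2+s} \Bigr). \]

The above estimates are uniform on the time intervals we consider.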


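Define the Galilean operator J (see for instance [5,6]) by $J(t) := x + it\partial_x$.

Corollary 1. Let $\psi_- \in \mathcal{F}(\mathcal{H})$ with $\|\hat\psi_-\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$. Then there exists a unique $\psi \in C(\mathbb{R}, \Sigma)$ solution of (1.2) such that for any $0 \le s \le 1$, as $t \to -\infty$,

\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{H^s} = O\Bigl( \frac{(\log|t|)^{2+s}}{|t|} \Bigr), \tag{1.15} \]

\[ \| J(t)\psi - J(t) e^{iS^-(t)} U_0(t)\psi_- \|_{L^2} = O\Bigl( \frac{(\log|t|)^{3}}{|t|} \Bigr). \tag{1.16} \]

In particular, we have

\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{L^\infty} = O\Bigl( \frac{(\log|t|)^{5/2}}{|t|^{3/2}} \Bigr). \tag{1.17} \]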

Actually, we will prove uniqueness under weaker conditions, as stated in the following proposition. Recall that $f$ and $\psi_-$ are related by (1.8).

Proposition 1. Let $f \in H^2(\mathbb{R})$. Suppose $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$. Then there exists at most one function $\psi \in C(\mathbb{R}_t, L^2 \cap L^\infty)$ solution of (1.2) satisfying the following property: There exists $1/2 < \alpha < 1$, with $\alpha > 2|\lambda|\,\|f\|^2_{L^\infty}$, such that, as $t \to -\infty$,
\[
\bigl\|\psi(t) - e^{iS^-(t)}U_0(t)\psi_-\bigr\|_{L^2 \cap L^\infty} = O\left(\frac{1}{|t|^{\alpha}}\right).
\]

Remark 2. Our method does not recover the convergence in $L^4_t(L^\infty_x)$ of the derivatives, stated in Theorem 1. However, we recover all the others, with a better convergence rate.

Remark 3. The other improvement involves the regularity of the function $\psi$ thus constructed. We get some regularity of the momenta of $\psi$, namely $x\psi \in L^2$, which did not appear in [11]. Thanks to this regularity, we can use the results of asymptotic completeness stated in [9], in order to define a long range scattering operator for small data.

Corollary 2. We can define a modified scattering operator for (1.2), for small data in $H$. There exists $\delta > 0$ such that to any $\psi_- \in \mathcal{F}(H)$ satisfying $\|\psi_-\|_{\Sigma} \le \delta$, we can associate unique $\psi \in C(\mathbb{R}_t, \Sigma)$ solution of (1.2) and $\psi_+ \in L^2$ such that
\[
\psi(t) \mathop{\sim}_{t \to \pm\infty} e^{iS^{\pm}(t)}U_0(t)\psi_{\pm} \quad \text{in } L^2, \tag{1.18}
\]

where $S^{\pm}$ are defined by (1.5). The map $S : \psi_- \mapsto \psi_+$ is the modified scattering operator.

Corollary 3. Let $f \in H$. Assume $f$ is sufficiently small. Then nonlinear geometric optics is valid in $L^2$ for the problem (1.9), before and after the caustic. The caustic crossing is described by the modified scattering operator $S$ and a “random phase shift”. One has the following asymptotics in $L^2$,

– if $t < 1$, then
\[
u^\varepsilon(t,x) \mathop{\sim}_{\varepsilon \to 0} \frac{e^{i\pi/4}}{\sqrt{2\pi(1-t)}}\,
\exp\left(i\frac{x^2}{2\varepsilon(t-1)} + i\frac{\lambda}{2\pi}\Bigl|\widehat{\psi}_-\Bigl(\frac{x}{t-1}\Bigr)\Bigr|^2 \log\frac{1-t}{\varepsilon}\right)
\widehat{\psi}_-\Bigl(\frac{x}{t-1}\Bigr),
\]


R. Carles

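– if $t > 1$, then
\[
u^\varepsilon(t,x) \mathop{\sim}_{\varepsilon \to 0} \frac{e^{-i\pi/4}}{\sqrt{2\pi(t-1)}}\,
\exp\left(i\frac{x^2}{2\varepsilon(t-1)} + i\frac{\lambda}{2\pi}\Bigl|\widehat{\psi}_+\Bigl(\frac{x}{t-1}\Bigr)\Bigr|^2 \log\frac{t-1}{\varepsilon}\right)
\widehat{\psi}_+\Bigl(\frac{x}{t-1}\Bigr),
\]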

where $\psi_-$ is defined by (1.8) and $\psi_+ = S\psi_-$.

Remark 4. The phase shift of $-\pi/2$ between the two asymptotics is classical, and appears even in the linear case ([4]). The change in the profile, measured by a scattering operator, was proved in [2]. The new phenomenon here is the phase shift
\[
\frac{\lambda}{2\pi}\Bigl|\widehat{\psi}_+\Bigl(\frac{x}{t-1}\Bigr)\Bigr|^2 \log\frac{|t-1|}{\varepsilon}
- \frac{\lambda}{2\pi}\Bigl|\widehat{\psi}_-\Bigl(\frac{x}{t-1}\Bigr)\Bigr|^2 \log\frac{|t-1|}{\varepsilon},
\]
which is “very nonlinear”, and depends on $\varepsilon$, hence can be called “random”.

Remark 5. From a physical point of view, the nonlinearity $\lambda|\psi|^2\psi$ appears as the first term of a Taylor expansion of a more general nonlinearity $h(|\psi|^2)\psi$. For instance, $h$ may be bounded (to model the phenomenon of saturation). For large times, $\psi$ is small and we can write
\[
h(|\psi|^2)\psi = \lambda|\psi|^2\psi + R(|\psi|^2)\psi, \tag{1.19}
\]
with $R(|\psi|^2) = O(|\psi|^4)$. One can check that replacing $\lambda|\psi|^2\psi$ with the right-hand side of (1.19), Corollary 1 still holds, as well as Corollary 2, since the results in [9] still hold with (1.19).

Notations. We will denote
\[
\bar{d}\xi := \frac{d\xi}{2\pi},
\]
so that the Fourier inverse formula writes
\[
\mathcal{F}^{-1}f(x) = \int e^{ix\xi} f(\xi)\,\bar{d}\xi .
\]
For $x \in \mathbb{R}$, we denote $\langle x\rangle := (1 + x^2)^{1/2}$.

2. Formal Computations

In this section, we recall how the oscillatory integrals were introduced in the nonlinear short range case ([2]), and give a formal argument that leads to Definition 1 before the focus, that is for $t < 1$. Suppose $u^\varepsilon$ solves the initial value problem
\[
\begin{cases}
i\varepsilon\partial_t u^\varepsilon + \dfrac12\varepsilon^2\partial_x^2 u^\varepsilon = 0, & (t,x) \in \mathbb{R}_+ \times \mathbb{R}, \\[4pt]
u^\varepsilon|_{t=0} = e^{-i\frac{x^2}{2\varepsilon}} f(x).
\end{cases} \tag{2.1}
\]
For $t < 1$, the asymptotics when $\varepsilon$ goes to zero is given by WKB methods,
\[
u^\varepsilon(t,x) \mathop{\sim}_{\varepsilon \to 0} \frac{1}{\sqrt{1-t}}\, e^{i\frac{x^2}{2\varepsilon(t-1)}} f\Bigl(\frac{x}{1-t}\Bigr). \tag{2.2}
\]


Near the focus, this description fails to be valid. Neither the profile nor the phase in (2.2) are defined for $t = 1$. For much more general cases, Duistermaat showed that a uniform description can be obtained in terms of oscillatory integrals ([4]), that is, in this case,
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, a^\varepsilon(\xi)\,\bar{d}\xi . \tag{2.3}
\]
It is easy to check that $a^\varepsilon$ has an asymptotic expansion in powers of $\varepsilon$, and in particular, $a^\varepsilon \to a_0$ as $\varepsilon \to 0$, where $a_0$ is defined by (1.12). For $t < 1$ the usual stationary phase formula applied to the above integral with $a^\varepsilon$ replaced by $a_0$ gives the asymptotics (2.2). For $t > 1$, one has almost the same asymptotics, the main difference is a phase shift of $-\pi/2$ due to the caustic crossing.

For the nonlinear case (1.6), we generalized the previous representation as follows ([2]),
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, a^\varepsilon(t,\xi)\,\bar{d}\xi . \tag{2.4}
\]
This formula makes sense as soon as $u^\varepsilon \in L^2$, since
\[
a^\varepsilon(t,\xi) = \frac{1}{\sqrt{\varepsilon}}\, e^{i\frac{t-1}{2\varepsilon}\xi^2}\, \widehat{u^\varepsilon}\Bigl(t, \frac{\xi}{\varepsilon}\Bigr).
\]
The nonlinear term $\varepsilon^\alpha|u^\varepsilon|^\beta u^\varepsilon$ is negligible when $\partial_t a^\varepsilon$ goes to zero. With this natural definition, we proved that the nonlinear term can have different influences away from the caustic, and near $t = 1$, which led us to use the same vocabulary as in [10], linear/nonlinear propagation, linear/nonlinear caustic. We also proved that the four cases can be encountered.

When the propagation is nonlinear ($\alpha = 1$), a formal computation based on the stationary phase formula suggests as a limit transport equation for the symbol $a^\varepsilon$,
\[
i\partial_t a(t,\xi) = \frac{\lambda}{|2\pi(1-t)|^{\beta/2}}\, |a|^\beta a(t,\xi), \tag{2.5}
\]
at least away from the caustic, with initial data $a|_{t=0} = a_0(\xi)$. Multiplying (2.5) by $\bar a$, one notices that the modulus of $a$ is constant. If we write $a = a_0 e^{ig(t,\xi)}$, the equation for $g$ is:
\[
\partial_t g(t,\xi) = -\frac{\lambda}{|1-t|^{\beta/2}}\, |f(-\xi)|^\beta . \tag{2.6}
\]
If we wish to get as a limit transport equation the relation $\partial_t \tilde a = 0$, it seems natural to define a modified symbol $\tilde a^\varepsilon$ as
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + ig(t,\xi)}\, \tilde a^\varepsilon(t,\xi)\,\bar{d}\xi , \tag{2.7}
\]
with $g|_{t=0} = 0$. In the case of a linear caustic ($\beta < 2$), we proved that indeed,
\[
\tilde a^\varepsilon(t,\xi) \mathop{\longrightarrow}_{\varepsilon \to 0} a_0(\xi) \quad \text{in } L^\infty_{t,\mathrm{loc}}(\Sigma_x).
\]
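The two formal claims above (constant modulus, and an explicit phase) can be checked by hand. The following is only a sketch: it assumes that $\lambda$ is real and that $|a_0(\xi)| = \sqrt{2\pi}\,|f(-\xi)|$, which is the normalization under which (2.5) and (2.6) agree.

```latex
% Modulus: multiply (2.5) by \bar a and take the imaginary part.
\[
\partial_t |a|^2 = 2\,\mathrm{Re}\,(\bar a\,\partial_t a)
 = 2\,\mathrm{Im}\,(\bar a\, i\partial_t a)
 = 2\,\mathrm{Im}\left(\frac{\lambda}{|2\pi(1-t)|^{\beta/2}}\,|a|^{\beta+2}\right) = 0 .
\]
% Phase, linear caustic (beta < 2), with g(0) = 0 and t < 1:
\[
g(t,\xi) = -\lambda |f(-\xi)|^{\beta} \int_0^t \frac{ds}{(1-s)^{\beta/2}}
 = -\frac{\lambda |f(-\xi)|^{\beta}}{1-\beta/2}\,\Bigl(1 - (1-t)^{1-\beta/2}\Bigr),
\]
% which stays bounded as t -> 1.
% Phase, beta = 2: the same integral diverges logarithmically at t = 1,
\[
g(t,\xi) - g(0,\xi) = -\lambda |f(-\xi)|^{2} \int_0^t \frac{ds}{1-s}
 = \lambda |f(-\xi)|^{2} \log(1-t), \qquad t < 1 .
\]
```

With the initial value $\lambda|f(-\xi)|^2\log\frac1\varepsilon$ used in the text for $\beta = 2$, the last line gives $\lambda|f(-\xi)|^2\log\frac{1-t}{\varepsilon}$, which is the expression consistent with the right-hand side of (2.8).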


In the case we want to study now, $\beta = 2$, the integration of (2.6) is possible only for $t < 1$. With the initial data $g|_{t=0} = \lambda|f(-\xi)|^2 \log\frac{1}{\varepsilon}$, it gives the result introduced in Definition 1. As in the cases recalled above, the transport equation for the modified symbol $\tilde a^\varepsilon$ must be, for $t < 1$,
\[
\partial_t \tilde a^\varepsilon \mathop{\longrightarrow}_{\varepsilon \to 0} 0,
\]
which leads us to the definition of the approximate solution (1.11). From now on, we will leave out the tilde symbol for $a$, and adopt the notation (1.13).

Remark 6. The function $g$ is defined only for $t < 1$, not near $t = 1$. One must remember that the formal computations that lead to the definition of $g$ are based on the application of the stationary phase formula. When the phase $\frac{1-t}{2}\xi^2 + x\xi$ does not have non-degenerate critical points, one must not expect this formal argument to be valid in the general case. On the other hand, recall that the case we study ($\alpha = 1$ and $\beta = 2$) corresponds to a nonlinear propagation and a nonlinear caustic. The phase $g$ takes the nonlinear effects of the propagation before the caustic into account. To take the nonlinear effects of the caustic into account, one has to define a (long range) scattering operator for the cubic Schrödinger equation (see Sect. 7).

For $t < 1$, the function $u^\varepsilon_{\mathrm{app}}$ satisfies the equation
\[
\begin{aligned}
i\varepsilon\partial_t u^\varepsilon_{\mathrm{app}} + \frac12\varepsilon^2\partial_x^2 u^\varepsilon_{\mathrm{app}}
 &= -\sqrt{\varepsilon} \int \partial_t g^\varepsilon(t,\xi)\, e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\, a_0(\xi)\,\bar{d}\xi \\
 &= \lambda\varepsilon\, \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + ig^\varepsilon(t,\xi)}\, \frac{|f(-\xi)|^2}{1-t}\, a_0(\xi)\,\bar{d}\xi .
\end{aligned} \tag{2.8}
\]
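Equation (2.8) follows from a direct computation under the integral sign. As a sketch, assume (the definition (1.11) is not restated here) that $u^\varepsilon_{\mathrm{app}}$ is the oscillatory integral with symbol $a_0$ and phase $g^\varepsilon$, and write $E$ for the exponential factor; the name $E$ is introduced only for this check.

```latex
\[
E(t,x,\xi) := \exp\Bigl(-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + i g^\varepsilon(t,\xi)\Bigr),
\qquad
u^\varepsilon_{\mathrm{app}}(t,x) = \frac{1}{\sqrt{\varepsilon}}\int E(t,x,\xi)\,a_0(\xi)\,\bar{d}\xi .
\]
\[
i\varepsilon\,\partial_t E = \Bigl(\frac{\xi^2}{2} - \varepsilon\,\partial_t g^\varepsilon\Bigr) E,
\qquad
\frac12\varepsilon^2\,\partial_x^2 E = -\frac{\xi^2}{2}\, E,
\]
\[
\Bigl(i\varepsilon\,\partial_t + \frac12\varepsilon^2\,\partial_x^2\Bigr) E = -\varepsilon\,\partial_t g^\varepsilon\, E .
\]
% Multiplying by a_0 / sqrt(eps) and integrating in xi gives the first line of (2.8).
% The second line uses (2.6) with beta = 2 and t < 1:
\[
-\partial_t g^\varepsilon(t,\xi) = \lambda\,\frac{|f(-\xi)|^2}{1-t}.
\]
```

The quadratic terms $\pm\xi^2/2$ cancel exactly, which is why only the time derivative of the phase $g^\varepsilon$ survives on the right-hand side.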

For t < 1, one can formally apply the stationary phase formula to the integral defining uεapp , x2 x 1 x i 2ε(t−1) +ig ε t, t−1 ε uapp (t, x) ∼ e f (2.9) =: uε1 (t, x). ε→0 (1 − t)1/2 1−t On the other hand, if one applies the stationary phase formula to the right-hand side of (2.8), it comes λε|uε1 |2 uε1 (t, x), so formally, uεapp is an approximate solution of (1.9). In the following section, we estimate precisely the remainders when one applies the stationary phase formula as above. 3. Estimates on Some Oscillatory Integrals 3.1. The fundamental estimate. We first estimate precisely the remainder of the usual stationary phase formula applied to the first order, in L2 . Lemma 1. Let σ (t, ξ ) be locally bounded in time with values in L2 (R). Denote xξ t−1 2 1 H ε (t, x) := √ e−i 2ε ξ +i ε σ (t, ξ )d¯ξ, ε and .ε the first term given by the stationary phase formula, x2 x i −i ε 2ε(1−t) σ t, . . (t, x) := e 2π(1 − t) t −1


1. There exists a continuous function h, with h(0) = 0, such that ε ε H (t, .) − .ε (t, .) 2 = h . L 1−t 2. If σ (t, .) ∈ H 2 (R), the rate of continuity of h can be estimated, ε H (t, .) − .ε (t, .) 2 ≤ C ε σ (t, .) 2 . Hξ L |1 − t| Proof. From the definition of H ,

1−t x 2 1 i ξ + 1−t σ (t, ξ )d¯ξ H (t, x) = e e 2ε √ ε x2 1−t 2 1 x ei 2ε ξ σ t, ξ − = e−i 2ε(1−t) √ d¯ξ, 1−t ε 2

x −i 2ε(1−t)

ε

hence from Parseval formula,

ε xy 2 i ei 2(1−t) y ei 1−t Fξ−1 H (t, x) = e →y σ (t, y)dy 2π(1 − t) x2 x i −i 2ε(1−t) = e σ t, 2π(1 − t) t −1 xy x2 ε 2 i + e−i 2ε(1−t) ei 2(1−t) y − 1 ei 1−t F −1 σ (t, y)dy, 2π(1 − t) ε

2

x −i 2ε(1−t)

and the last term can also be written as ε x2 i x −i 2ε(1−t) i 2(1−t) y2 −1 −1 F σ t, F e . e 2π(1 − t) t −1 Now from the Plancherel formula, ε H (t, .) − .(t, .) 2 = h t, L x

with

ε 1−t

z 2 h(t, z) = ei 2 y − 1 F −1 σ

L2y

,

.

Then the first point follows from the dominated convergence theorem. When σ (t, .) ∈ H 2 , we have z h(t, z) = 2 sin y 2 F −1 σ (t, .) 4 L2y z ≤ 2 y 2 F −1 σ (t, .) 4 L2y ≤ |z| y 2 F −1 σ (t, .) 2 = C|z|σ (t, .)H 2 . Ly

This inequality completes the proof of Lemma 1.
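The last chain of inequalities rests on the elementary identity $|e^{i\theta} - 1| = 2|\sin(\theta/2)| \le |\theta|$, applied with $\theta = z y^2/2$. A small numerical sanity check is sketched below; the sample values of `z` and `y` and all function names are arbitrary choices made for illustration, not taken from the paper.

```python
import cmath
import math


def multiplier_modulus(z, y):
    """|exp(i z y^2 / 2) - 1|, the Fourier multiplier appearing in h(t, z)."""
    return abs(cmath.exp(1j * z * y * y / 2) - 1)


def sine_form(z, y):
    """2 |sin(z y^2 / 4)|, the closed form of the same modulus."""
    return 2 * abs(math.sin(z * y * y / 4))


def linear_bound(z, y):
    """|z| y^2 / 2, the bound that produces the H^2 norm of sigma."""
    return abs(z) * y * y / 2


# Arbitrary sample grid (illustrative values only).
samples = [(z, y) for z in (1e-3, 0.05, 0.7, 3.0) for y in (0.0, 0.3, 1.0, 2.5, 10.0)]

# Largest discrepancy between the modulus and its sine form.
max_gap = max(abs(multiplier_modulus(z, y) - sine_form(z, y)) for z, y in samples)

# The linear bound should dominate the modulus at every sample point.
all_bounded = all(
    multiplier_modulus(z, y) <= linear_bound(z, y) + 1e-12 for z, y in samples
)
```

The bound is sharp only for small $z y^2$; for large $y$ it is very crude, which is why two derivatives of $\sigma$ (the weight $y^2$ on the Fourier side) are needed to obtain a rate linear in $z$.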


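3.2. Convergence of the initial data. To obtain asymptotics in $\Sigma$ for the symbols as stated in Theorem 2, we have to notice the following properties. If
\[
v^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, b^\varepsilon(t,\xi)\,\bar{d}\xi ,
\]
then
\[
\frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, \xi\, b^\varepsilon(t,\xi)\,\bar{d}\xi = -i\,\varepsilon\partial_x v^\varepsilon(t,x),
\]
and
\[
\frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, \partial_\xi b^\varepsilon(t,\xi)\,\bar{d}\xi = -i\,J^\varepsilon(t) v^\varepsilon(t,x),
\]
where we denoted
\[
J^\varepsilon(t) := \frac{x}{\varepsilon} + i(t-1)\partial_x . \tag{3.1}
\]
The unimodular factors $-i$ play no role in the norm estimates that follow.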

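The operator $J^\varepsilon$ is nothing else than the usual Galilean operator, rescaled according to our problem.

Lemma 2. The operator $J^\varepsilon$ satisfies the following properties.

– The commutation relation,
\[
\Bigl[J^\varepsilon(t),\; i\varepsilon\partial_t + \frac12\varepsilon^2\partial_x^2\Bigr] = 0. \tag{3.2}
\]

– Denote $M^\varepsilon(t) = e^{i\frac{x^2}{2\varepsilon(t-1)}}$, then $J^\varepsilon(t)$ writes
\[
J^\varepsilon(t) = i(t-1)\,M^\varepsilon(t)\,\partial_x\, M^\varepsilon(2-t). \tag{3.3}
\]

– The modified Sobolev inequality,
\[
\|w(t)\|_{L^\infty} \le \frac{C}{\sqrt{|1-t|}}\, \|w(t)\|_{L^2}^{1/2}\, \|J^\varepsilon(t)w(t)\|_{L^2}^{1/2}. \tag{3.4}
\]

– For any function $F \in C^1(\mathbb{C},\mathbb{C})$ satisfying the gauge invariance condition
\[
\exists\, G \in C^1(\mathbb{R}_+,\mathbb{R}), \quad F(z) = zG(|z|^2),
\]
one has
\[
J^\varepsilon(t)F(w) = \partial_z F(w)\,J^\varepsilon(t)w - \partial_{\bar z}F(w)\,\overline{J^\varepsilon(t)w}. \tag{3.5}
\]

The first step to prove Theorem 2 is to study the convergence of the initial value of the symbol $a^\varepsilon$.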
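The first two properties of Lemma 2 can be verified by a short computation. The sketch below uses only the definitions (3.1) and $M^\varepsilon(t) = e^{i x^2/(2\varepsilon(t-1))}$, and acts on a generic smooth function $w$ (a name chosen for this check).

```latex
% (3.3): note that M^eps(2-t) = e^{-i x^2 / (2 eps (t-1))} is the conjugate of M^eps(t).
\[
i(t-1)\,M^\varepsilon(t)\,\partial_x\bigl(M^\varepsilon(2-t)\,w\bigr)
 = i(t-1)\Bigl(-\frac{i x}{\varepsilon(t-1)}\,w + \partial_x w\Bigr)
 = \frac{x}{\varepsilon}\,w + i(t-1)\,\partial_x w
 = J^\varepsilon(t)\,w .
\]
% (3.2): the only non-trivial commutators are
\[
\Bigl[\frac{x}{\varepsilon},\; \frac12\varepsilon^2\partial_x^2\Bigr] = -\varepsilon\,\partial_x ,
\qquad
\bigl[\,i(t-1)\partial_x,\; i\varepsilon\partial_t\,\bigr] = +\varepsilon\,\partial_x ,
\]
% while x/eps commutes with d/dt and (t-1) d/dx commutes with d^2/dx^2, so the sum vanishes.
```

The factorization (3.3) is also what yields the modified Sobolev inequality (3.4): since $|M^\varepsilon| = 1$, one applies the usual Gagliardo–Nirenberg inequality to $M^\varepsilon(2-t)w$ and reads $\|\partial_x(M^\varepsilon(2-t)w)\|_{L^2} = |1-t|^{-1}\|J^\varepsilon(t)w\|_{L^2}$.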


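Lemma 3. The following convergence holds in $\Sigma$,
\[
a^\varepsilon(0,\xi) \mathop{\longrightarrow}_{\varepsilon \to 0} a_0(\xi).
\]
More precisely, there exists $C = C(\|f\|_H)$ such that
\[
\begin{aligned}
\|a^\varepsilon(0,.) - a_0\|_{L^2} &\le C\varepsilon\Bigl(\log\frac{1}{\varepsilon}\Bigr)^2, \\
\|\xi(a^\varepsilon(0,\xi) - a_0(\xi))\|_{L^2},\ \|\partial_\xi(a^\varepsilon(0,\xi) - a_0(\xi))\|_{L^2} &\le C\varepsilon\Bigl(\log\frac{1}{\varepsilon}\Bigr)^3.
\end{aligned} \tag{3.6}
\]
Moreover, the same estimates hold with $a^\varepsilon(0,\xi) - a_0(\xi)$ replaced with $(a^\varepsilon(0,\xi) - a_0(\xi))\,e^{-i\lambda|f(-\xi)|^2\log\varepsilon}$.

Proof. From (1.14) and the initial value of $u^\varepsilon$, one has
\[
a^\varepsilon(0,\xi) = e^{i\lambda|f(-\xi)|^2\log\varepsilon}\, \frac{1}{\sqrt{\varepsilon}} \int e^{-\frac{i}{2\varepsilon}(x+\xi)^2 + i\lambda|f(x)|^2\log\frac{1}{\varepsilon}}\, f(x)\,dx .
\]
Denote $h^\varepsilon(x) := e^{i\lambda|f(x)|^2\log\frac{1}{\varepsilon}} f(x)$. From Parseval formula, one also has
\[
a^\varepsilon(0,\xi)\,e^{-i\lambda|f(-\xi)|^2\log\varepsilon} = \frac{1}{\sqrt{2i\pi}} \int e^{-iy\xi - i\varepsilon\frac{y^2}{2}}\, \widehat{h^\varepsilon}(y)\,dy ,
\]
hence
\[
\bigl(a^\varepsilon(0,\xi) - a_0(\xi)\bigr)\,e^{-i\lambda|f(-\xi)|^2\log\varepsilon} = \frac{1}{\sqrt{2i\pi}} \int e^{-iy\xi}\Bigl(e^{-i\varepsilon\frac{y^2}{2}} - 1\Bigr)\widehat{h^\varepsilon}(y)\,dy .
\]
Following the proof of Lemma 1, one then proves that the $L^2$-norm of the above quantity is $O(\varepsilon|\log\varepsilon|^2)$, and its $\Sigma$-norm is $O(\varepsilon|\log\varepsilon|^3)$. The estimates (3.6) of Lemma 3 are then straightforward.

3.3. Estimating the approximate solution. To estimate the remainder $u^\varepsilon - u^\varepsilon_{\mathrm{app}}$, we will need some information on the $L^\infty$-norm of the approximate solution. The following lemma provides some.

Lemma 4. Let $\beta > 0$. There exists $C_* = C_*(\beta, \|f\|_{H^2})$ such that in the region $\{1 - t \ge C_*\varepsilon\}$, $u^\varepsilon_{\mathrm{app}}(t)$ satisfies almost the same estimate as $u^\varepsilon_1(t)$ in $L^\infty$, that is,
\[
\|u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty} \le \frac{\|f\|_{L^\infty} + \beta}{\sqrt{1-t}} .
\]

Proof. Write
\[
\|u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty} \le \|u^\varepsilon_1(t)\|_{L^\infty} + \|u^\varepsilon_{\mathrm{app}}(t) - u^\varepsilon_1(t)\|_{L^\infty},
\]
and denote $d^\varepsilon(t,x) := u^\varepsilon_{\mathrm{app}}(t,x) - u^\varepsilon_1(t,x)$. From the modified Sobolev inequality,
\[
\|d^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\, \|d^\varepsilon(t)\|_{L^2}^{1/2}\, \|J^\varepsilon(t)d^\varepsilon\|_{L^2}^{1/2}. \tag{3.7}
\]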


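Now the $L^2$-norms can be estimated thanks to Lemma 1, with
\[
\sigma^\varepsilon(t,\xi) := e^{ig^\varepsilon(t,\xi)} a_0(\xi).
\]
Thus,
\[
\|d^\varepsilon(t)\|_{L^2} \le C\, \frac{\varepsilon}{1-t}\, \|\sigma^\varepsilon(t,.)\|_{H^2_\xi} .
\]
It is a straightforward computation to see that since $H^1(\mathbb{R}) \subset L^\infty(\mathbb{R})$, there are some constants such that
\[
\|d^\varepsilon(t)\|_{L^2} \le C(\|f\|_{H^2})\, \frac{\varepsilon}{1-t} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^2 .
\]
Since $J^\varepsilon(t)$ acts as the differentiation with respect to $\xi$ on the symbols, the first part of Lemma 1 gives
\[
\|J^\varepsilon(t)d^\varepsilon\|_{L^2} = \log\frac{1-t}{\varepsilon}\; h\Bigl(\frac{\varepsilon}{1-t}\Bigr),
\]
where $h \in C(\mathbb{R})$ satisfies $h(0) = 0$. Then from (3.7),
\[
\|d^\varepsilon(t)\|_{L^\infty} \le \frac{C(\|f\|_{H^2})}{\sqrt{1-t}} \Bigl(\frac{\varepsilon}{1-t}\Bigr)^{1/2} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^{3/2} h\Bigl(\frac{\varepsilon}{1-t}\Bigr)^{1/2} .
\]
Hence, for $1 - t \gg \varepsilon$, $d^\varepsilon(t)$ is negligible compared to $u^\varepsilon_1(t)$ in $L^\infty$. This completes the proof of Lemma 4.

The proof of the next lemma is similar, and uses the regularity $f \in H$.

Lemma 5. There exists $C_* = C_*(\|f\|_H)$ such that in the region $\{1 - t \ge C_*\varepsilon\}$, the derivatives of $u^\varepsilon_{\mathrm{app}}$ satisfy almost the same estimates as the derivatives of $u^\varepsilon_1$ in $L^\infty$, that is, there exists $C = C(\|f\|_H)$ such that
\[
\|\varepsilon\partial_x u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}, \tag{3.8}
\]
\[
\|J^\varepsilon(t)u^\varepsilon_{\mathrm{app}}\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\, \log\frac{1-t}{\varepsilon} . \tag{3.9}
\]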

3.4. The equation satisfied by the approximate solution. From Sect. 2 and more precisely from Eq. (2.8), the approximate solution uεapp solves the cubic nonlinear Schrödinger equation up to the error term 6ε (t, x) := |uεapp |2 uεapp (t, x) xξ t−1 2 1 |f (−ξ )|2 ε −√ a0 (ξ )d¯ξ. e−i 2ε ξ +i ε +ig (t,ξ ) 1−t ε

(3.10)

Lemma 6. There exist C = C(f H 2 ) and C∗ = C∗ (f H 2 ) such that uniformly in the region {1 − t ≥ C∗ ε}, 1−t 2 ε log 6ε (t)L2x ≤ C . (3.11) (1 − t)2 ε


Proof. Write 6ε (t, x) = 6ε (t, x) + |uε1 |2 uε1 (t, x) − |uε1 |2 uε1 (t, x), and introduce ε (t, x) := |uεapp |2 uεapp (t, x) − |uε1 |2 uε1 (t, x). 6 ε satisfies the estimate stated in Lemma 6. The other estimate to complete We prove that 6 the proof of Lemma 6 would be easier and will be left out. First remark that ε (t, .)L2 ≤ C uεapp (t)2L∞ + uε1 (t)2L∞ (uεapp − uε1 )(t)L2 . 6 x x x One has obviously 1 f 2L∞ . 1−t From Lemma 4, uεapp satisfies the same estimate in the region we are considering. Hence, uε1 (t)2L∞ ≤ x

ε (t, .)L2 ≤ C(f H 2 ) 6 From Lemma 1 with σ ε = eig

1 (uεapp − uε1 )(t)L2x . 1−t

(3.12)

ε (t,ξ )

a0 (ξ ), we finally have ε ig ε (t,ξ ) (uεapp − uε1 )(t)L2x ≤ C a0 (ξ ) 2 , e Hξ 1−t

ε satisfies the estimate announced in Lemma 4. and it is easy to check that 6

The following lemma is the extension of Lemma 6 we will need for the proof of Theorem 2, and its proof is similar. Lemma 7. There exist C = C(f H ) and C∗ = C∗ (f H ) such that uniformly in the region {1 − t ≥ C∗ ε}, 1−t 3 ε ε ε log , J (t)6 (t)L2x ≤ C (1 − t)2 ε (3.13) 1−t 3 ε ε log . ε∂x 6 (t)L2x ≤ C (1 − t)2 ε 4. Energy Estimates In this section, we derive the three energy estimates we will use to justify nonlinear geometric optics. Recall that the exact solution uε and the approximate solution uεapp satisfy 1 iε∂t uε + ε 2 ∂x2 uε = λε|uε |2 uε , 2 1 iε∂t uεapp + ε 2 ∂x2 uεapp = λε|uεapp |2 uεapp − ε6ε , 2 where 6ε is defined by (3.10) and is estimated in Lemmas 6 and 7. Introduce the remainder w ε := uε − uεapp . Subtracting the previous two equations, one has 1 iε∂t w ε + ε 2 ∂x2 w ε = λε |uε |2 uε − |uεapp |2 uεapp + ε6ε . 2

(4.1)


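Multiplying the previous equation by $\overline{w^\varepsilon}$ and taking the imaginary part of the result integrated in $x$, it follows
\[
\begin{aligned}
\partial_t \|w^\varepsilon(t)\|_{L^2} &\le C\bigl(\|u^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{\mathrm{app}}(t)\|^2_{L^\infty}\bigr)\|w^\varepsilon(t)\|_{L^2} + C\|6^\varepsilon(t)\|_{L^2} \\
&\le C\bigl(\|w^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{\mathrm{app}}(t)\|^2_{L^\infty}\bigr)\|w^\varepsilon(t)\|_{L^2} + C\|6^\varepsilon(t)\|_{L^2} .
\end{aligned} \tag{4.2}
\]
Differentiating (4.1) with respect to $x$ and multiplying by $\varepsilon\partial_x \overline{w^\varepsilon}$, one has similarly
\[
\begin{aligned}
\partial_t \|\varepsilon\partial_x w^\varepsilon(t)\|_{L^2} \le{}& C\bigl(\|w^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{\mathrm{app}}(t)\|^2_{L^\infty}\bigr)\|\varepsilon\partial_x w^\varepsilon(t)\|_{L^2} \\
&+ C\|w^\varepsilon(t)\|_{L^2}\,\|u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty}\,\|\varepsilon\partial_x u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty} \\
&+ C\|w^\varepsilon(t)\|^2_{L^\infty}\,\|\varepsilon\partial_x u^\varepsilon_{\mathrm{app}}(t)\|_{L^2} + C\|\varepsilon\partial_x 6^\varepsilon(t)\|_{L^2} .
\end{aligned} \tag{4.3}
\]
Finally, since from Lemma 2 $J^\varepsilon$ commutes with the Schrödinger operator and acts on the nonlinearity we are considering as a differentiation, we also have
\[
\begin{aligned}
\partial_t \|J^\varepsilon(t) w^\varepsilon\|_{L^2} \le{}& C\bigl(\|w^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{\mathrm{app}}(t)\|^2_{L^\infty}\bigr)\|J^\varepsilon(t) w^\varepsilon\|_{L^2} \\
&+ C\|w^\varepsilon(t)\|_{L^2}\,\|u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty}\,\|J^\varepsilon(t) u^\varepsilon_{\mathrm{app}}\|_{L^\infty} \\
&+ C\|w^\varepsilon(t)\|^2_{L^\infty}\,\|J^\varepsilon(t) u^\varepsilon_{\mathrm{app}}\|_{L^2} + C\|J^\varepsilon(t) 6^\varepsilon\|_{L^2} .
\end{aligned} \tag{4.4}
\]
The main idea to justify nonlinear geometric optics is to integrate those three energy estimates so long as $\|w^\varepsilon(t)\|_{L^\infty}$ is not greater than $\|u^\varepsilon_{\mathrm{app}}(t)\|_{L^\infty}$. Since $w^\varepsilon$ is expected to be a remainder, this case actually occurs in “sufficiently” large regions, as we will see in the next section.

5. Justification of Nonlinear Geometric Optics Before the Caustic

We now illustrate the method announced above. From Lemma 4, the “so long” condition writes for instance
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{4\|f\|_{L^\infty}}{\sqrt{1-t}} . \tag{5.1}
\]
From Inequality (3.4) and Lemma 3,
\[
\|w^\varepsilon(0)\|_{L^\infty} \le C\|w^\varepsilon(0)\|_{L^2}^{1/2}\,\|J^\varepsilon(0)w^\varepsilon\|_{L^2}^{1/2} \le C\varepsilon|\log\varepsilon|^{5/2} .
\]
Hence there exists $\varepsilon^* = \varepsilon^*(\|f\|_H) > 0$ such that for $0 < \varepsilon \le \varepsilon^*$,
\[
\|w^\varepsilon(0)\|_{L^\infty} \le 2\|f\|_{L^\infty} .
\]
By continuity, Condition (5.1) is satisfied for $0 \le t \le T_\varepsilon$ for some $T_\varepsilon > 0$. Then so long as (5.1) holds, we can integrate the three energy estimates using Gronwall lemma. Since (4.3) and (4.4) are very similar, introduce the following norm,
\[
\|w^\varepsilon(t)\|_Y := \max\bigl(\|\varepsilon\partial_x w^\varepsilon(t)\|_{L^2},\ \|J^\varepsilon(t)w^\varepsilon\|_{L^2}\bigr). \tag{5.2}
\]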


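Now we can write (3.4) as
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\, \|w^\varepsilon(t)\|_{L^2}^{1/2}\, \|w^\varepsilon(t)\|_Y^{1/2} .
\]
So long as (5.1) holds, estimate (4.2) can be written as follows,
\[
\partial_t \|w^\varepsilon(t)\|_{L^2} \le C_1 \frac{\|f\|^2_{L^\infty}}{1-t}\, \|w^\varepsilon(t)\|_{L^2} + C\|6^\varepsilon(t)\|_{L^2}, \tag{5.3}
\]
where $C_1$ is a universal constant that does not depend on $f$. Denote $C_0 := C_1\|f\|^2_{L^\infty}$. From the Gronwall lemma, we can integrate the previous inequality as follows,
\[
\|w^\varepsilon(t)\|_{L^2} \le \frac{\|w^\varepsilon(0)\|_{L^2}}{(1-t)^{C_0}} + C \int_0^t \Bigl(\frac{1-s}{1-t}\Bigr)^{C_0} \|6^\varepsilon(s)\|_{L^2}\,ds . \tag{5.4}
\]
From Lemma 5, and from Lemma 7, Inequalities (4.3) and (4.4) can also be written
\[
\begin{aligned}
\partial_t \|w^\varepsilon(t)\|_Y \le{}& \frac{C_0}{1-t}\, \|w^\varepsilon(t)\|_Y + \frac{C}{1-t}\, \|w^\varepsilon(t)\|_{L^2} \log\frac{1-t}{\varepsilon} \\
&+ \frac{C}{1-t}\, \|w^\varepsilon(t)\|_{L^2}\, \|w^\varepsilon(t)\|_Y \log\frac{1-t}{\varepsilon} \\
&+ C\, \frac{\varepsilon}{(1-t)^2} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^3 ,
\end{aligned} \tag{5.5}
\]
where we possibly increased the value of $C_1$ (this question will be addressed more precisely in Sect. 6). To estimate the integral of the right-hand side of (5.4), we use Lemmas 6 and 7. The integral is not greater than
\[
\frac{C}{(1-t)^{C_0}} \int_0^t \frac{\varepsilon}{(1-s)^{2-C_0}} \Bigl(\log\frac{1-s}{\varepsilon}\Bigr)^2 ds . \tag{5.6}
\]
For $j > 0$, we are thus led to study
\[
\int_0^t \frac{1}{(1-s)^{2-C_0}} \Bigl(\log\frac{1-s}{\varepsilon}\Bigr)^j ds .
\]
We take $j > 0$ and not only $j = 2$ because to estimate $\|w^\varepsilon(t)\|_Y$, we will have to deal with similar integrals with $j = 3$. With the substitution $\sigma = \frac{1-s}{\varepsilon}$, it becomes
\[
\varepsilon^{C_0-1} \int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}} \frac{(\log\sigma)^j}{\sigma^{2-C_0}}\, d\sigma . \tag{5.7}
\]
Since in Lemmas 4, 6 and 7, we had to restrict our attention to the region $1 - t \gg \varepsilon$, we can replace $\log\sigma$ with $|\log\sigma|$ in the previous integral with no change in the asymptotics, and we have to study
\[
\int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}} \frac{(\log\sigma)^j}{\sigma^{2-C_0}}\, d\sigma, \quad j > 0. \tag{5.8}
\]
To estimate these integrals, we have to distinguish two cases, namely $C_0 < 1$ and $C_0 \ge 1$.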
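The passage from (5.3) to (5.4) and the change of variable leading to (5.7) are both one-line computations. In the sketch below, $y(t)$ stands for $\|w^\varepsilon(t)\|_{L^2}$ and $S(t)$ for the source term on the right-hand side of (5.3); these two names are introduced only for this illustration.

```latex
% Gronwall with the integrating factor (1-t)^{C_0}:
\[
y'(t) \le \frac{C_0}{1-t}\,y(t) + S(t)
\;\Longrightarrow\;
\frac{d}{dt}\Bigl[(1-t)^{C_0}\,y(t)\Bigr]
 = (1-t)^{C_0}\Bigl(y'(t) - \frac{C_0}{1-t}\,y(t)\Bigr)
 \le (1-t)^{C_0}\,S(t),
\]
\[
y(t) \le \frac{y(0)}{(1-t)^{C_0}} + \int_0^t \Bigl(\frac{1-s}{1-t}\Bigr)^{C_0} S(s)\,ds .
\]
% Change of variable sigma = (1-s)/eps, so ds = -eps d(sigma):
\[
\int_0^t \frac{1}{(1-s)^{2-C_0}}\Bigl(\log\frac{1-s}{\varepsilon}\Bigr)^{j} ds
 = \varepsilon^{C_0-1}\int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}
   \frac{(\log\sigma)^{j}}{\sigma^{2-C_0}}\,d\sigma .
\]
```

The exponent $2 - C_0$ of $\sigma$ is what separates the two regimes: the integral converges at infinity exactly when $2 - C_0 > 1$, that is when $C_0 < 1$.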


5.1. Case $C_0 < 1$. In this case, one has obviously $2 - C_0 > 1$, hence the integral (5.8) is convergent. More precisely, we have to estimate the remainder of a converging integral. Integration by parts shows that for $b > a \gg 1$,
\[
\int_a^b \frac{(\log\sigma)^j}{\sigma^{2-C_0}}\, d\sigma = O\left(\frac{(\log a)^j}{a^{1-C_0}}\right).
\]
With $a = \frac{1-t}{\varepsilon}$, it follows that the energy estimate (5.4) becomes
\[
\|w^\varepsilon(t)\|_{L^2} \le C_2\, \frac{\varepsilon}{1-t} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^2 . \tag{5.9}
\]
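The integration-by-parts bound used for $C_0 < 1$ can be probed numerically. The sketch below is illustrative only: the value `C0 = 0.5`, the sample points and every function name are arbitrary choices. It integrates $(\log\sigma)^j/\sigma^{2-C_0}$ after the substitution $\sigma = e^u$ and compares the result with the claimed size $(\log a)^j / a^{1-C_0}$.

```python
import math


def tail_integral(a, c0, j, span=400.0, n=40000):
    """Simpson approximation of the integral of (log s)^j / s^(2 - c0) over [a, a*e^span].

    With s = e^u the integrand becomes u^j * exp(-(1 - c0) * u); for c0 < 1 the
    part beyond the upper limit a*e^span is negligible. n must be even.
    """
    lo = math.log(a)
    h = span / n

    def integrand(u):
        return u ** j * math.exp(-(1.0 - c0) * u)

    total = integrand(lo) + integrand(lo + span)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return total * h / 3.0


def ratio(a, c0, j):
    """Integral divided by the claimed order of magnitude (log a)^j / a^(1 - c0)."""
    return tail_integral(a, c0, j) / (math.log(a) ** j / a ** (1.0 - c0))


def closed_form_ratio(a, c0):
    """Exact value of the same ratio for j = 2 and upper limit infinity."""
    c, big_l = 1.0 - c0, math.log(a)
    return 1.0 / c + 2.0 / (c * c * big_l) + 2.0 / (c ** 3 * big_l * big_l)


C0 = 0.5  # hypothetical sample value with C0 < 1
ratios = [ratio(a, C0, 2) for a in (1e2, 1e4, 1e8)]
```

The ratios stay bounded and decrease toward $1/(1 - C_0)$ as $a$ grows, which is the content of the $O(\cdot)$ statement; the constant degenerates as $C_0 \to 1$, in line with the separate treatment of the case $C_0 \ge 1$.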

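Let $\alpha > 0$. Then if $1 - t \ge C_*\varepsilon$ where $C_*$ is such that
\[
C_2\, \frac{(\log C_*)^3}{C_*} = \alpha,
\]
where $C_2$ is the constant in (5.9), Inequality (5.5) becomes
\[
\partial_t \|w^\varepsilon(t)\|_Y \le \frac{C_0 + \alpha}{1-t}\, \|w^\varepsilon(t)\|_Y + \frac{C\varepsilon}{(1-t)^2} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^3 . \tag{5.10}
\]
Taking $\alpha > 0$ sufficiently small, we have $C_0 + \alpha < 1$, hence we can replace $C_0 + \alpha$ with $C_0$ with no change in the result. Now we can apply Gronwall lemma to (5.10),
\[
\|w^\varepsilon(t)\|_Y \le \frac{\|w^\varepsilon(0)\|_Y}{(1-t)^{C_0}} + \frac{C\varepsilon}{(1-t)^{C_0}} \int_0^t \frac{1}{(1-s)^{2-C_0}} \Bigl(\log\frac{1-s}{\varepsilon}\Bigr)^3 ds .
\]
The previous estimate with $j = 3$ yields
\[
\|w^\varepsilon(t)\|_Y \le C\, \frac{\varepsilon}{1-t} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^3 . \tag{5.11}
\]
From (3.4),
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\, \frac{\varepsilon}{1-t} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^{5/2} .
\]
Hence, there exists $C_* = C_*(f)$ such that for $0 < \varepsilon \le \varepsilon^*$ and in the region $\{1 - t \ge C_*\varepsilon\}$, condition (5.1) is always satisfied, and estimates (5.9) and (5.11) hold, which we can summarize in the following proposition.

Proposition 2. Define $\delta := C_1^{-1/2}$. If $f \in H$ satisfies $\|f\|_{L^\infty} < \delta$, then nonlinear geometric optics is uniformly valid in the region $\{1 - t \ge C_*\varepsilon\}$ for some (large) $C_* = C_*(f)$, with the estimates,
\[
\begin{aligned}
\|a^\varepsilon(t,.) - a_0\|_{L^2} &\le C\, \frac{\varepsilon}{1-t} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^2, \\
\|\partial_\xi(a^\varepsilon(t,.) - a_0)\|_{L^2},\ \|\xi(a^\varepsilon(t,\xi) - a_0(\xi))\|_{L^2} &\le C\, \frac{\varepsilon}{1-t} \Bigl(\log\frac{1-t}{\varepsilon}\Bigr)^3 .
\end{aligned}
\]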


5.2. Case $C_0 \ge 1$. Now the integral (5.8) is divergent. Integration by parts shows that for $b > a \gg 1$,
\[
\int_a^b \frac{(\log\sigma)^j}{\sigma^{2-C_0}}\, d\sigma = O\bigl(b^{C_0-1}(\log b)^j\bigr).
\]
Then the energy estimate (5.4) becomes
\[
\|w^\varepsilon(t)\|_{L^2} \le C\, \frac{\varepsilon}{(1-t)^{C_0}}\, |\log\varepsilon|^2 . \tag{5.12}
\]
Let $\alpha > 0$. First, we restrict our study to the region
\[
\frac{\varepsilon}{(1-t)^{C_0}}\, |\log\varepsilon|^2 \log\frac{1-t}{\varepsilon} \le 2\alpha . \tag{5.13}
\]
Then the energy estimate (5.5) becomes, when we take only the “worst” terms into account,
\[
\partial_t \|w^\varepsilon(t)\|_Y \le \frac{C_0 + 2\alpha}{1-t}\, \|w^\varepsilon(t)\|_Y + \frac{C\varepsilon}{(1-t)^{1+C_0}}\, |\log\varepsilon|^2 \log\frac{1-t}{\varepsilon} . \tag{5.14}
\]
Applying the Gronwall lemma and proceeding as for the $L^2$-norm yields
\[
\|w^\varepsilon(t)\|_Y \le C\, \frac{\varepsilon|\log\varepsilon|^3}{(1-t)^{C_0+2\alpha}} . \tag{5.15}
\]
From (3.4),
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\, \frac{\varepsilon}{(1-t)^{C_0+\alpha}}\, |\log\varepsilon|^{5/2} .
\]
Hence, for $1 - t \gg (\varepsilon|\log\varepsilon|^{5/2})^{1/(C_0+\alpha)}$ and $\varepsilon$ sufficiently small, condition (5.1) is always satisfied, and estimates (5.12) and (5.15) hold. Notice that in this region, for $\varepsilon$ sufficiently small, (5.13) is automatically satisfied.

Proposition 3. Take $\delta$ as in Proposition 2. Assume $f \in H$ satisfies $\|f\|_{L^\infty} \ge \delta$. Let $\alpha > 0$. Then there exists $C^*_\alpha$ such that nonlinear geometric optics is uniformly valid in the region $\{1 - t \ge C^*_\alpha (\varepsilon|\log\varepsilon|^{5/2})^{1/(C_0+\alpha)}\}$, where $C_0 = \|f\|^2_{L^\infty}/\delta^2$, with the estimates,
\[
\begin{aligned}
\|a^\varepsilon(t,.) - a_0\|_{L^2} &\le C\, \frac{\varepsilon}{(1-t)^{C_0}}\, |\log\varepsilon|^2, \\
\|\partial_\xi(a^\varepsilon(t,.) - a_0)\|_{L^2},\ \|\xi(a^\varepsilon(t,\xi) - a_0(\xi))\|_{L^2} &\le C\, \frac{\varepsilon}{(1-t)^{C_0+2\alpha}}\, |\log\varepsilon|^3 .
\end{aligned}
\]


6. Interpretation 6.1. Computation of δ. We now focus on the case f L∞ < δ, and compute the best constant given by our method. From Sect. 5.1, we have to compute the coefficient in the factor of wε (t)L2 in Inequality (4.2), and the constant that appears in the first line of the right-hand side of (4.3) and (4.4). Indeed, for these last two inequalities, we proved that the other terms can be either absorbed (provided we remain sufficiently “far” from the caustic), or considered as a small source term. For Inequality (4.2), we multiplied (4.1) by w ε , then took the imaginary part of the result integrated in space. Write |uε |2 uε − |uεapp |2 uεapp = |uε |2 w ε + (|uε |2 − |uεapp |2 )uεapp . With the method of energy estimates, the first term will vanish, and the second is written |wε |2 + 2 Re(w ε uεapp ) uεapp . Hence, we can rewrite (4.2) more precisely as ∂t w ε (t)L2 ≤ 2|λ| 2w ε (t)L∞ + uεapp (t)L∞ uεapp (t)L∞ w ε (t)L2 + source term.

(6.1)

For Inequality (4.3), we differentiate |uε |2 uε − |uεapp |2 uεapp , with the result (uε )2 ε∂x uε − (uεapp )2 ε∂x uεapp + 2(|uε |2 ε∂x uε − |uεapp |2 ε∂x uεapp ). The very last term will be considered as a source term. The term before is written |uε |2 ε∂x uε = |uε |2 ε∂x w ε + |uε |2 ε∂x uεapp . When we take the imaginary part, the term |uε |2 ε∂x w ε vanishes, and the other term is made of source terms and of “absorbed” terms. Finally, the only relevant term will be (uε )2 ε∂x w ε , and we can rewrite (4.3) as 2 ∂t ε∂x w ε (t)L2 ≤ 2|λ| w ε (t)L∞ + uεapp (t)L∞ ε∂x w ε (t)L2 (6.2) + absorbed terms + source terms. Since from (3.5), J ε acts on the nonlinearity as a differentiation, the computation is exactly the same as with ε∂x , and we have 2 ∂t J ε (t)w ε L2 ≤ 2|λ| w ε (t)L∞ + uεapp (t)L∞ J ε (t)w ε L2 (6.3) + absorbed terms + source terms. Now notice that in Lemma 4, we could have obtained the estimate 1+β uεapp (t)L∞ ≤ √ f L∞ , 1−t for any β > 0, provided that we take C∗ sufficiently large.
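The algebra behind the refined estimates (6.1)–(6.3) can be spelled out as follows. This is a sketch using only $u^\varepsilon = u^\varepsilon_{\mathrm{app}} + w^\varepsilon$; superscripts $\varepsilon$ are dropped for readability, and $\lambda$ is assumed real.

```latex
% Splitting of the cubic difference:
\[
|u|^2 u - |u_{\mathrm{app}}|^2 u_{\mathrm{app}}
 = |u|^2 w + \bigl(|u|^2 - |u_{\mathrm{app}}|^2\bigr)u_{\mathrm{app}},
\qquad
|u|^2 - |u_{\mathrm{app}}|^2 = |w|^2 + 2\,\mathrm{Re}\,(w\,\overline{u_{\mathrm{app}}}) .
\]
% The first term drops out of the L^2 energy estimate, since its pairing with \bar w is real:
\[
\mathrm{Im}\int |u|^2\, w\,\overline{w}\;dx = \mathrm{Im}\int |u|^2 |w|^2\,dx = 0 .
\]
% Derivative of the cubic term, from |u|^2 u = u^2 \bar u:
\[
\varepsilon\partial_x\bigl(|u|^2 u\bigr) = u^2\,\varepsilon\partial_x \overline{u} + 2|u|^2\,\varepsilon\partial_x u .
\]
```

The same cancellation applies to $|u|^2\,\varepsilon\partial_x w$ when it is paired with $\varepsilon\partial_x\overline{w}$, which is why only the term $u^2\,\varepsilon\partial_x\overline{w}$ contributes to the leading coefficient in (6.2).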


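Similarly, for Condition (5.1), we could have taken
\[
\|w^\varepsilon(t)\|_{L^\infty} \le \frac{\beta}{\sqrt{1-t}}\, \|f\|_{L^\infty} .
\]
Obviously, the smaller $\beta$ is, the smaller $\varepsilon^*$ is, and the larger $C^*$ in Theorem 2. We see that for any $\beta > 0$, we can take $C_0 = 2(1 + 2\beta)^2|\lambda|\,\|f\|^2_{L^\infty}$, which proves that we can take $\delta = |2\lambda|^{-1/2}$, and completes the proof of Theorem 2.

6.2. Proof of Corollary 1. Existence. Recall the scaling (1.7). For every $\varepsilon > 0$, $\psi^\varepsilon$ is the unique solution in $C(\mathbb{R}, \Sigma)$ of the initial value problem
\[
\begin{cases}
i\partial_t \psi^\varepsilon + \dfrac12\partial_x^2 \psi^\varepsilon = \lambda|\psi^\varepsilon|^2\psi^\varepsilon, \\[4pt]
\psi^\varepsilon|_{t=-1/\varepsilon} = \sqrt{\varepsilon}\, f(\varepsilon x)\, e^{-i\varepsilon\frac{x^2}{2} - i\lambda|f(\varepsilon x)|^2\log\varepsilon} .
\end{cases} \tag{6.4}
\]
For $t < 0$, define $\psi_{\mathrm{app}}$ by
\[
\psi_{\mathrm{app}}(t,x) := \int e^{-i\frac{t}{2}\xi^2 + ix\xi + i\lambda|f(-\xi)|^2\log|t|}\, a_0(\xi)\,\bar{d}\xi .
\]
From Theorem 2, if $\|f\|_{L^\infty} < |\lambda|^{-1/2}$, then there exist $\varepsilon^*$ and $C^*$ such that for $0 < \varepsilon \le \varepsilon^*$ and $0 \le s \le 1$,
\[
\|\psi^\varepsilon(t) - \psi_{\mathrm{app}}(t)\|_{H^s} \le C\, \frac{(\log|t|)^{2+s}}{|t|}, \tag{6.5}
\]
uniformly for $-1/\varepsilon \le t \le -C^*$. Moreover, since the operator $J^\varepsilon$ is nothing but the classical Galilean operator $J(t)$ up to the scaling (1.7), we also have
\[
\|J(t)\psi^\varepsilon - J(t)\psi_{\mathrm{app}}\|_{L^2} \le C\, \frac{(\log|t|)^3}{|t|}, \tag{6.6}
\]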

uniformly for −1/ε ≤ t ≤ −C ∗ . Proposition 4. Assume f L∞ < |λ|−1/2 . Then there exists C ∗ > 0 such that (ψ ε (−C ∗ ))0 0 such that for any t ≤ −C ∗ , ψapp (t)L∞ ≤

f L∞ + β . |t|1/2

Then for t ≤ −C ∗ and from (6.17), (6.18) becomes C C ∂t φ(t)L2 ≤ + φ(t)L2 , |t| |t|α+1/2 with C := |λ| (f L∞ + β)2 < 1. For t0 ≤ t ≤ −C ∗ , the Gronwall lemma gives φ(t)L2

C t0 ≤ Cφ(t0 )L2 . t

Using the assumptions again, we have φ(t)L2

1 ≤C α |t0 |

C t0 . t

Given our choice for $\beta$, $\alpha > C$. Fix $t = -C^*$. The right-hand side goes to zero when $t_0$ goes to $-\infty$. Hence $\phi(-C^*) = 0$, and $\phi \equiv 0$ from the uniqueness for (1.2) in $C(\mathbb{R}_t, L^2 \cap L^\infty)$ (see [7]). This proves Proposition 1 and completes the proof of Corollary 1.

Remark 9. For Proposition 1, we need the assumption $f \in H^2(\mathbb{R})$ because it is the minimum regularity we assumed for Lemma 4.

7. Construction of the Modified Scattering Operator and Application

7.1. Proof of Corollary 2. We first recall the main result in [9] for nonlinear Schrödinger equation, which corresponds to the notion of asymptotic completeness of the modified wave operators introduced in [11].

Theorem 3 ([9], Theorem 1.2, case $n = 1$). Let $\varphi \in \Sigma$, with $\|\varphi\|_\Sigma = \delta' \le \delta$, where $\delta$ is sufficiently small. Let $\psi \in C(\mathbb{R}_t, \Sigma)$ be the solution of the initial value problem (6.14), with $C^* = 0$. Then there exist unique functions $W \in L^2 \cap L^\infty$ and $\phi \in L^\infty$ such that for $t \ge 1$,
\[
\Bigl\|\bigl(\mathcal{F}U_0(-t)\psi\bigr)(t)\, \exp\Bigl(-i\frac{\lambda}{2\pi}\int_1^t |\widehat{\psi}(\tau)|^2\,\frac{d\tau}{\tau}\Bigr) - W\Bigr\|_{L^2 \cap L^\infty} \le C\delta'\, t^{-\alpha + C(\delta')^2}, \tag{7.1}
\]
\[
\Bigl\|\frac{\lambda}{2\pi}\int_1^t |\widehat{\psi}(\tau)|^2\,\frac{d\tau}{\tau} - \lambda|W|^2\log t - \phi\Bigr\|_{L^\infty} \le C\delta'\, t^{-\alpha + C(\delta')^2}, \tag{7.2}
\]
where $C\delta' < \alpha < 1/4$, and $\phi$ is a real valued function. Furthermore we have the asymptotic formula for large time $t$,
\[
\psi(t,x) = \frac{1}{(it)^{1/2}}\, W\Bigl(\frac{x}{t}\Bigr) \exp\Bigl(i\frac{x^2}{2t} + i\lambda\Bigl|W\Bigl(\frac{x}{t}\Bigr)\Bigr|^2\log t + i\phi\Bigl(\frac{x}{t}\Bigr)\Bigr) + O\bigl(\delta'\, t^{-1/2-\alpha+C(\delta')^2}\bigr) \tag{7.3}
\]
and the estimate
\[
\bigl\|\bigl(\mathcal{F}U_0(-t)\psi\bigr)(t) - W\exp\bigl(i\lambda|W|^2\log t + i\phi\bigr)\bigr\|_{L^2 \cap L^\infty} \le C\delta'\, t^{-\alpha + C(\delta')^2} . \tag{7.4}
\]

Remark 10. Uniqueness follows from (7.1) and (7.2), which make it possible to define $W$ and $\phi$. The asymptotics (7.3) and (7.4) are immediate consequences of (7.1) and (7.2). In particular, if we replace $(W, \phi)$ with $(We^{i\tilde\phi}, \phi - \tilde\phi)$, where $\tilde\phi \in L^\infty$, then (7.3) and (7.4) still hold.

Remark 11. This theorem states “almost” asymptotic completeness for small data for the modified wave operators introduced in [11]. Indeed, no regularity for the momenta of $\psi$ is proved in [11]. In Corollary 1, we limit the loss of regularity, and in particular obtain for $\psi$ that required in Theorem 3.

Proof of Corollary 2. Let $\psi_- \in \mathcal{F}(H)$, with $\|\widehat{\psi}_-\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$. From Corollary 1, there exists a unique $\psi \in C(\mathbb{R}, \Sigma)$ solution of (1.2) satisfying (1.15), (1.16). The first step is then to check that for $\psi_-$ sufficiently small, $\|\psi(0)\|_\Sigma < \delta$, so that we can use the results of Theorem 3. The second step consists in defining $\psi_+$. From Duhamel’s formula, one has
\[
\psi(0) = U_0(C^*)\psi(-C^*) - i\lambda \int_{-C^*}^0 U_0(-s)\,|\psi|^2\psi(s)\,ds .
\]

On the other hand, we saw that for C ∗ 1,

U0 (C ∗ )ψ(−C ∗ ) ≤ U0 (C ∗ )ψapp (−C ∗ ) + U0 (C ∗ )(ψ − ψapp )(−C ∗ ) ≤ Cψ− log C ∗ + C

log C ∗ 4 . C∗

From local estimates for (1.2), we see that there exist functions hj , j = 1, 2, 3, with h1 (x) −→ 0, h2 is increasing, and h3 (x) −→ 0, such that x→+∞

x→0

ψ(0) ≤ h1 (C ∗ ) + h2 (C ∗ )h3 (ψ− ).

(7.5)

Taking first C ∗ sufficiently large, then ψ− sufficiently small, we see that we can have ψ(0) < δ. Then Theorem 3 provides (unique) functions W and φ. Define ψ+ by ψ+ := F −1 W eiφ ∈ L2 (R).

(7.6)


From (7.3) and (7.4), we have, in $L^2$,
\[
\psi(t) \mathop{\sim}_{t \to +\infty} e^{i\lambda|\widehat{\psi}_+(\frac{x}{t})|^2\log t}\, U_0(t)\psi_+ ,
\]
which, along with Corollary 1, yields Corollary 2.

7.2. Consequences for nonlinear geometric optics. In Sect. 2, we mentioned the fact that to describe the asymptotics of $u^\varepsilon$ after the caustic, one needs a modified scattering operator. Now that we have one, we can describe $u^\varepsilon$ globally. We first give a heuristic approach, then prove Corollary 3.

We already noticed that the phase $g^\varepsilon$ (hence the symbol $a^\varepsilon$) is defined only for $t < 1$. If we want a global description, we have to replace $g^\varepsilon$ with a phase $\phi^\varepsilon$ which is defined for all $t$, and coincides asymptotically with $g^\varepsilon$ for $t < 1$. To guess which possible $\phi^\varepsilon$ we can choose, recall Scaling (1.7). The function $\psi^\varepsilon$ solves (1.2), and we saw that, for $t \in\, ]-\infty, T]$, where $T$ is finite,
\[
\psi^\varepsilon(t) \mathop{\longrightarrow}_{\varepsilon \to 0} \psi(t) \quad \text{in } L^2 \cap L^\infty .
\]
Hence we have
\[
i\partial_t\psi^\varepsilon + \frac12\partial_x^2\psi^\varepsilon = \lambda|\psi|^2\psi^\varepsilon + \text{small}. \tag{7.7}
\]
Forget the “small” term. We now have to study a linear Schrödinger equation, with a time-dependent potential $\lambda|\psi|^2$. According to the vocabulary used in [3], this is not a short range potential, for it does not belong to $L^1_t(L^\infty_x)$. A scattering theory for long range potentials is available (see for instance [3]). The first idea is due to Dollard and consists in studying
\[
\psi^\varepsilon(t,x)\, \exp\Bigl(-i\lambda\int_0^t |\psi|^2(s, s\xi)\,ds\Bigr)
\]
in order to get rid of the long range part. In our context, this means that we can replace $g^\varepsilon$ with
\[
\phi^\varepsilon(t,\xi) := -\lambda\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}} |\psi|^2(s, s\xi)\,ds + \lambda|f(-\xi)|^2\log\frac{1}{\varepsilon} . \tag{7.8}
\]
The symbol $a^\varepsilon$ is now defined (globally in time) by
\[
u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + i\phi^\varepsilon(t,\xi)}\, a^\varepsilon(t,\xi)\,\bar{d}\xi . \tag{7.9}
\]
Now from Corollary 1, one has, for $t < 1$,
\[
|\psi(s, s\xi)| = \frac{1}{|2\pi s|^{1/2}}\, |\widehat{\psi}_-(\xi)| + o(1) \quad \text{in } L^1_t(L^\infty_x),
\]
hence, in $L^\infty_{t,\mathrm{loc}}(0,1; L^\infty_x)$,
\[
\phi^\varepsilon(t,\xi) = g^\varepsilon(t,\xi) + o(1). \tag{7.10}
\]


Therefore, even with this new definition of $a^\varepsilon$, we have, for $t < 1$,
\[
a^\varepsilon(t,\xi) \mathop{\longrightarrow}_{\varepsilon \to 0} a_0(\xi) \quad \text{in } L^2 .
\]
Similarly, for $t > 1$ and from Theorem 3, there exists a function $H$ (that depends on $\psi$) such that in $L^\infty_{t,\mathrm{loc}}(1,2; L^\infty_x)$,
\[
\phi^\varepsilon(t,\xi) = -\lambda|W(\xi)|^2\log\frac{t-1}{\varepsilon} + H(\xi) + o(1). \tag{7.11}
\]
In particular, since
\[
a^\varepsilon(t,\xi) = e^{-i\phi^\varepsilon(t,\xi)}\, \mathcal{F}\Bigl(U_0\Bigl(\frac{1-t}{\varepsilon}\Bigr)\psi^\varepsilon\Bigl(\frac{t-1}{\varepsilon}\Bigr)\Bigr)
\]
and the map $\varphi \mapsto (W, \phi)$ in Theorem 3 is continuous,
\[
a^\varepsilon(t,\xi) \mathop{\longrightarrow}_{\varepsilon \to 0} e^{-iH(\xi) + i\phi(\xi)}\, W(\xi) = e^{-iH(\xi)}\, \widehat{\psi}_+ \quad \text{in } L^2 . \tag{7.12}
\]
Apparently, the limit of $a^\varepsilon$ depends on this function $H$. One must bear in mind that this function $H$ is closely related to our choice in the definition of the new phase $\phi^\varepsilon$. For instance, one can check that replacing $\phi^\varepsilon$ with
\[
\phi^\varepsilon(t,\xi) + h_1(\xi)\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}} h_2(s)\,ds ,
\]
where $h_1 \in L^\infty$, $h_2 \in L^1$, would just alter the definition of $H$. Thus this function appears as a parameter in the definition of $a^\varepsilon$. Nevertheless, the asymptotics for $u^\varepsilon$ is independent of $H$. It is given, in $L^2$, by (7.12), (7.11) and the first part of Lemma 1. This leads to the asymptotics given in Corollary 3 for $t > 1$. The asymptotics for $t < 1$ is a simple consequence of Theorem 2 and (7.10). This completes the proof of Corollary 3.

Acknowledgements. I would like to thank Professor A. Bressan for his invitation at SISSA, where this work was achieved. This research was supported by the European TMR ERBFMRXCT960033.

References

1. Barab, J. E.: Nonexistence of asymptotically free solutions for nonlinear Schrödinger equation. J. Math. Phys. 25, 3270–3273 (1984)
2. Carles, R.: Geometric optics with caustic crossing for some nonlinear Schrödinger equations. Indiana Univ. Math. J. 49, 475–551 (2000)
3. Dereziński, J., and Gérard, C.: Scattering theory of quantum and classical N-particle systems. Texts and Monographs in Physics, Berlin–Heidelberg: Springer Verlag, 1997
4. Duistermaat, J. J.: Oscillatory integrals, Lagrangian immersions and unfolding of singularities. Comm. Pure Appl. Math. 27, 207–281 (1974)
5. Ginibre, J.: Introduction aux équations de Schrödinger non linéaires. Cours de DEA, Paris Onze Édition (1995)
6. Ginibre, J.: An introduction to nonlinear Schrödinger equations. In: Nonlinear waves (Sapporo, 1995). Gakkōtosho, R. Agemi and Y. Giga and T. Ozawa (eds.), GAKUTO International Series, Math. Sciences and Appl., 1997, pp. 85–133
7. Ginibre, J., and Velo, G.: On a class of nonlinear Schrödinger equations. III. Special theories in dimensions 1, 2 and 3. Annales de l’Institut Henri Poincaré. Section A. Physique Théorique. Nouvelle Série 28, 287–316 (1978)
8. Ginibre, J., and Velo, G.: Long Range Scattering and Modified Wave Operators for some Hartree Type Equations III. Gevrey spaces and low dimensions. J. Diff. Eq., to appear
9. Hayashi, N., and Naumkin, P.: Asymptotics for large time of solutions to the nonlinear Schrödinger and Hartree equations. Am. J. Math. 120, 369–389 (1998)
10. Hunter, J., and Keller, J.: Caustics of nonlinear waves. Wave Motion 9, 429–443 (1987)
11. Ozawa, T.: Long range scattering for nonlinear Schrödinger equations in one space dimension. Commun. Math. Phys. 139, 479–493 (1991)
12. Strauss, W.: Nonlinear scattering theory. In: Scattering theory in mathematical physics, J. Lavita and J. P. Marchands (eds.), Dordrecht: Reidel, 1974
13. Strauss, W.: Nonlinear scattering theory at low energy. J. Funct. Anal. 41, 110–133 (1981)

Communicated by A. Kupiainen

Commun. Math. Phys. 220, 69 – 94 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Regularized Products and Determinants Georg Illies IHES, Le Bois-Marie, 35, Route de Chartres, 91440 Bures-sur-Yvette, France. E-mail: [email protected] Received: 4 April 2000 / Accepted: 15 January 2001

Abstract: Zeta-regularized products are used to define determinants of operators in infinite dimensional spaces. This article provides a general theory of regularized products and determinants which delivers a better approach to their existence and explicit determination.

1. Introduction

The zeta-regularized product of a sequence $a_k \in \mathbb{C}^*$ is defined by

$$\prod_{k=1}^{\infty} a_k := \exp\Bigl(-\frac{d}{ds}\sum_{k=1}^{\infty} a_k^{-s}\,\Big|_{s=0}\Bigr), \tag{1.1}$$

provided that the Dirichlet series converges absolutely in a half plane and can be meromorphically continued to the left of $\Re(s) = 0$; the evaluation at $s = 0$ means the constant term of the Laurent expansion. This obviously generalizes the ordinary finite product. Zeta-regularization was first used to define analytic torsion [RS] and since then has played a role in global analysis, the theory of dynamical zeta functions, and Arakelov theory. Theoretical physicists use zeta-regularization as a method for renormalization in quantum field theories [EORBZ] and various papers (e.g. [Ef, Ko1, Ko2, Ko3, Sa]) have calculated the regularized determinant of Laplacians. Zeta-regularization also appeared in a conjectural cohomological approach to motivic L-functions ([De1, De2, De3, Ma]). In that context the question appeared as to which meromorphic functions of finite order (e.g. motivic L-functions) are zeta-regularized, i.e. can be represented as

$$f(z) = \prod_{\rho} (z - \rho)^{\pm 1}, \tag{1.2}$$

where the product is over all zeroes and poles $\rho$ of $f(z)$ with multiplicities and the sign of the exponent being positive for zeroes and negative for poles. This turns out to be the basic problem of zeta-regularized determinants and it was the starting point of the following investigation which, we hope, gives a satisfying answer to the question.

(Present address: Algebra und Zahlentheorie, Fachbereich Mathematik, Universität – Gesamthochschule Siegen, Walter-Flex-Str. 3, 57068 Siegen, Germany. E-mail: [email protected])

Regularization entails several technical problems because of the meromorphic continuation of the Dirichlet series. For example, the regularized product of all primes does not exist as $\sum_p p^{-s}$ has the natural boundary $\Re(s) = 0$ ([LW]). The aim of this paper is to give a better approach to regularized products improving the formalism in [Vo, CV, QHS] and [JL1] (compare Sect. 6 below) which is based on the representation of the Dirichlet series as the Mellin transform of the series

$$\theta(t) := \sum_{k=1}^{\infty} e^{a_k t}. \tag{1.3}$$

In many applications $\arg(a_k)$ varies in such a way that this series does not converge for any $t$, for example in the product (1.2). This problem can be solved by instead using a kind of Hankel integral of the Dirichlet series (see Sect. 7). To treat the product (1.2) one has by definition to regard the function $\zeta(s, z) := \sum_{\rho} \pm (z - \rho)^{-s}$. This paper is also intended as an examination of the analytic and asymptotic properties of this generalized Hurwitz zeta function, which should be interesting in its own right.

Before giving a short overview we introduce the notion of a divisor which is basic for all that follows. A divisor $D$ is given by a function $m_D : \mathbb{C} \to \mathbb{Z}$ such that there is a $\beta > 0$ with

$$\sum_{\rho \in \mathbb{C}} \frac{|m_D(\rho)|}{|\rho|^{\beta}} < \infty. \tag{1.4}$$

Condition (1.4) reflects that the Dirichlet series in Definition (1.1) must converge absolutely in a half plane. We recall a fundamental fact from the theory of entire functions of finite order (compare [Ti] for the proof): A function $m_D : \mathbb{C} \to \mathbb{Z}$ gives rise to a divisor if and only if there is a meromorphic function $f(z)$ of finite order (i.e. $f(z)$ is the quotient of two entire functions of finite order) such that

$$m_D(\rho) = \operatorname*{ord}_{z=\rho} f(z), \qquad \rho \in \mathbb{C},$$

thus $D$ is the divisor of $f(z)$ in the usual sense. And this function $f(z)$ is determined by $D$ up to an exponential polynomial, i.e. a function $g(z)$ is meromorphic of finite order with divisor $D$ if and only if there is a polynomial $P(z)$ with $g(z) = e^{P(z)} f(z)$.

After introducing some notation (Sect. 2) we define a general class of regularized products in Sect. 3, zeta-regularization being just an example; the rhs of (1.2) with the multiplicities $m_D(\rho)$, in the case of its existence, is called the regularized determinant and denoted by

$$\Delta_D(z) := \prod_{\rho} (z - \rho)^{\pm 1}. \tag{1.5}$$

We prove that $\Delta_D(z)$ is a meromorphic function of finite order with divisor $D$, thus equals $e^{P(z)} f(z)$ for a certain polynomial $P(z)$. Regularization means finding this polynomial. Section 4, a sort of theoretical excursion, discusses axiomatic generalizations of the regularization process and shows that a theory of regularization should deal with quasi-directed divisors (defined in Sect. 2).


If the Dirichlet series in the definition of regularized products also satisfies certain exponential estimates and does not have too many poles we speak of bounded regularizability (Sect. 5). In that case one can apply certain integral transformations, especially the Mellin transform, the mentioned Hankel integral and the Laplace transform, to get Theorems 3, 4 and 6 of Sects. 6, 7 and 9. They give the equivalence of bounded regularizability with certain asymptotics for $\theta_D(t)$, $\zeta_D(s, z)$ and for the function $\theta_D(t, s)$ which is defined as the Laplace transform of $\zeta_D(s, z)$. As a corollary of Theorem 4 one gets Theorem 5 in Sect. 7 which is the fundamental theorem of the theory of regularization. It states that $D$ is bounded regularizable if and only if for some $0 < \psi_i < \pi$, $i = 1, 2$ and $\varepsilon > 0$ an asymptotic

$$\log f(z) = \sum_{i} z^{\alpha_i} \log^{n_i} z + o(|z|^{-\varepsilon}), \qquad |z| \to \infty, \tag{1.6}$$

with finite sum, $\alpha_i \in \mathbb{C}$ and $n_i \in \mathbb{N}_0$, is valid for $-\psi_2 < \arg(z) < \psi_1$. $\Delta_D(z)$ exists in that case and also the polynomial $P(z)$ can be determined in terms of this asymptotic of $\log f(z)$ which is very intrinsic. These results deliver a satisfying theory of regularization and apply to a large class of examples. In Sect. 8.1 for instance it is shown that every meromorphic function of finite order representable by a Dirichlet series is regularized; this improves results of Jorgenson and Lang ([JL1, JL2]) who had to assume that it also satisfies a functional equation. In 8.2 we regularize higher $\Gamma$-functions. Thus Sect. 8 is applicable to various kinds of zeta and L-functions. The function $\theta_D(t, s)$, introduced in Sect. 9, is a type of multivalued theta function and plays a central role in [Il2]. There is also an alternative approach to regularization via renormalizing certain divergent integrals (Sect. 10). The following three sections contain technical proofs which were postponed.

The article reproduces the main results of Chapter 2 of my thesis [Il1] in a more special context. In some cases we only give sketches of the proofs; for complete proofs, generalizations and further results the reader is referred to [Il1]. In [Il2] it is shown how to apply the theory of regularization to generalize results of Cramér ([Cr]) and Guinand ([Gui]) thus improving results of [JL2].

2. Notation

In the sequel $f(z)$ denotes a meromorphic function of finite order and $D$ its divisor. We define two important parameters: The exponent $r$ of $f$ and $D$ is the infimum of all $\beta > 0$ satisfying (1.4); the genus $g$ of $f$ and $D$ is the smallest $n \in \mathbb{N}_0$ such that (1.4) is satisfied for $\beta = n + 1$; note $g + 1 \ge r \ge g$. We will say that $D$ lies in a set $M \subset \mathbb{C}$ if $m_D(\rho) \ne 0$ implies that $\rho \in M$. Let $0 < \varphi_i < \pi$, $i = 1, 2$, then we define open connected sets

$$Wr_{\varphi_1,\varphi_2} := \{ z \in \mathbb{C}^* \mid -\varphi_2 < \arg(z) < \varphi_1 \}, \qquad Wl_{\varphi_1,\varphi_2} := \mathbb{C}^* \setminus \overline{Wr_{\varphi_1,\varphi_2}}$$

and a contour $C_{\varphi_1,\varphi_2}$ consisting of the ray from $e^{-\varphi_2 i}\infty$ to $0$ and the ray from $0$ to $e^{\varphi_1 i}\infty$; thus $\mathbb{C} = Wl_{\varphi_1,\varphi_2} \cup C_{\varphi_1,\varphi_2} \cup Wr_{\varphi_1,\varphi_2}$ is a disjoint union.


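[Figure: the complex plane with horizontal axis $\Re(z)$ and vertical axis $\Im(z)$, showing the contour $C_{\varphi_1,\varphi_2}$, the angles $\varphi_1$ and $\varphi_2$, the sector $Wr_{\varphi_1,\varphi_2}$ on the right and the sector $Wl_{\varphi_1,\varphi_2}$ on the left.]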

A divisor $D$ is called directed if it lies in a $Wl_{\varphi_1,\varphi_2}$. It is called quasi-directed if it is directed with the exception of finitely many $\rho$, and it is called strictly directed if it lies in a $Wl_{\varphi_1,\varphi_2}$ with $\varphi_1 > \frac{\pi}{2}$ and $\varphi_2 > \frac{\pi}{2}$. We will also write $\rho \in D$ instead of $m_D(\rho) \ne 0$ and use the following notation:

$$\sum_{\rho \in D} \varphi(\rho) := \sum_{\rho \in \mathbb{C}} m_D(\rho)\,\varphi(\rho).$$

3. Xi Functions and Regularization

Definition 3.1. If $D$ is a directed divisor, $U_D := \{z \in \mathbb{C} \mid |z| < |\rho|,\ \rho \in D\}$ and the argument is chosen so that $-\pi < \arg(z - \rho) < \pi$ then

$$\xi_D(s, z) := \sum_{\rho \in D} \frac{\Gamma(s)}{(z - \rho)^s}, \qquad \Re(s) > r, \quad z \in U_D, \tag{3.1}$$

is called the Hurwitz xi function of $D$; $\xi_D(s) := \xi_D(s, 0)$ is called the xi function of $D$. Convergence is absolute and $\xi_D(s, z)$ is holomorphic in both variables.

Proposition 3.2. $\xi_D(s, z)$ satisfies the following differential equation:

$$\frac{d}{dz}\,\xi_D(s, z) = -\xi_D(s + 1, z). \tag{3.2}$$

A function $f(z)$ is meromorphic of finite order with divisor $D$ if and only if for some $l \ge g$:

$$\Bigl(\frac{d}{dz}\Bigr)^{l+1} \log f(z) = (-1)^l\, \xi_D(l + 1, z). \tag{3.3}$$

Proof. Equation (3.2) follows by taking the term by term derivative. For (3.3) check that $\log \Delta^{\mathrm{Wei},l}_{D,a}(z)$ defined in (3.6) below satisfies (3.3) and observe that the operation $\bigl(\frac{d}{dz}\bigr)^{l+1}$ exactly kills exponential polynomials of degree $\le l$.
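As a sanity check on Proposition 3.2, both identities can be verified by hand for a divisor consisting of a single simple zero $\rho$, so that the xi function reduces to $\Gamma(s)(z-\rho)^{-s}$. The computation below is only a sketch of the term by term differentiation mentioned in the proof; the one-point divisor is an illustrative assumption, not part of the original argument.

```latex
% One-point divisor: m_D(\rho) = 1, so \xi_D(s,z) = \Gamma(s)(z-\rho)^{-s}.
\begin{align*}
\frac{d}{dz}\,\frac{\Gamma(s)}{(z-\rho)^{s}}
  &= -\,\frac{s\,\Gamma(s)}{(z-\rho)^{s+1}}
   = -\,\frac{\Gamma(s+1)}{(z-\rho)^{s+1}}
  && \text{(this is (3.2))},\\
\Bigl(\frac{d}{dz}\Bigr)^{l+1}\log(z-\rho)
  &= \frac{(-1)^{l}\,l!}{(z-\rho)^{l+1}}
   = (-1)^{l}\,\frac{\Gamma(l+1)}{(z-\rho)^{l+1}}
  && \text{(this is (3.3) for } f(z)=z-\rho\text{)}.
\end{align*}
```

Summing over $\rho$ with multiplicities $m_D(\rho)$ gives the general statements wherever the series converge, which in the second identity requires $l \ge g$.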


Proposition 3.3. For $z \in U_D$ the following absolutely convergent Taylor series expansion is valid:

$$\xi_D(s, z) = \sum_{m=0}^{\infty} (-1)^m\, \xi_D(s + m)\, \frac{z^m}{m!}. \tag{3.4}$$

If $\xi_D(s)$ is meromorphic for $\Re(s) > -p$ then $\xi_D(s, z)$ is also meromorphic for $\Re(s) > -p$ and holomorphic for $z \in U$ for any simply connected $U \subset \mathbb{C}$ with $U_D \subset U$ and $\rho \notin U$ for all $\rho \in D$.

Proof. The Taylor series follows from (3.2); for the meromorphy in $s$ observe that shifting the coefficients does not change the convergence radius. The continuation in $z$ is obtained by treating finitely many $\rho \in D$ separately.

Definition 3.4. A regularization sequence $\delta$ is a sequence of complex numbers $\delta_0, \delta_1, \ldots$ with $\delta_0 = 1$. Formally let $\delta(s) := \delta_0 + \delta_1 s + \delta_2 s^2 + \ldots$. A directed divisor $D$ is called regularizable if $\xi_D(s)$ is meromorphic in a half plane $\Re(s) > -\varepsilon$ with $\varepsilon > 0$. For $z \in U$

$$\Delta_D(z) := \exp\bigl(-\mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s, z))\bigr) \tag{3.5}$$

is called the $\delta$-regularized determinant of $D$. One calls $\Delta_D(0)$ the $\delta$-regularized product of $D$. Note that $\mathrm{CT}_{s=0}$ means the constant term in the Laurent expansion at $s = 0$. If $\delta(s)$ is a divergent series, then one has to develop $\xi_D(s, z)$ in a Laurent series and multiply it formally with the formal series for $\delta(s)$. In the sequel there will often appear formulas which must be interpreted in this formal sense.

Examples. 1) xi-regularized determinant (Jorgenson, Lang): $\delta(s) = 1$, 2) zeta-regularized determinant: $\delta(s) = \Gamma^{-1}(s + 1)$, 3) zero-renormalized determinant: $\delta(s) = \Gamma(1 - s)$.

Remark. The factor $\delta(s)$ is introduced for several reasons. First one wants to handle "scaled" products $\prod a(z - \rho)$ (compare [De1, De2, De3]). It also turned out that the canonical way of renormalization (see Theorem 7 in Sect. 10) differs from zeta-regularization. A further reason is that in [JL1] xi-regularization was used which is technically the simplest regularization. While zeta-regularization as well as zero-renormalization generalize the ordinary finite product (as every regularization with $\delta_1 = \gamma$ does, $\gamma$ the Euler-Mascheroni constant), xi-regularization does not. Zeta-regularization satisfies the product rule $\prod \rho^n = (\prod \rho)^n$ so comes closest to what one would expect for a product.

Fix $a \in \mathbb{C}$ with $m_D(a) = 0$ and $l \ge g$, then we define the absolutely convergent Weierstrass product

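$$\Delta^{\mathrm{Wei},l}_{D,a}(z) := \prod_{\rho \in \mathbb{C}} \Bigl[\Bigl(1 - \frac{z - a}{\rho - a}\Bigr) \exp\Bigl(\sum_{k=1}^{l} \frac{1}{k}\Bigl(\frac{z - a}{\rho - a}\Bigr)^{k}\Bigr)\Bigr]^{m_D(\rho)}, \tag{3.6}$$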

which is a meromorphic function of finite order with divisor D. For a = 0 and g = l one has the usual canonical Weierstrass product (compare [Ti]).
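The statement in the Remark above, that a regularization with $\delta_1 = \gamma$ reproduces the ordinary finite product, can be checked by hand. The following sketch assumes a finite divisor $D$, so that the Hurwitz xi function of (3.1) is a finite sum, writes $\Delta_D(z)$ for the $\delta$-regularized determinant of (3.5), and uses $N := \sum_\rho m_D(\rho)$, a symbol introduced here only for illustration.

```latex
% Laurent expansions at s = 0 (with \delta_0 = 1):
%   \delta(s)\Gamma(s) = 1/s + (\delta_1 - \gamma) + O(s),
%   (z-\rho)^{-s}      = 1 - s\log(z-\rho) + O(s^2).
\begin{align*}
\mathrm{CT}_{s=0}\Bigl(\delta(s)\,\frac{\Gamma(s)}{(z-\rho)^{s}}\Bigr)
  &= (\delta_1-\gamma) - \log(z-\rho),\\
\Delta_D(z)
  &= \exp\Bigl(-\sum_{\rho} m_D(\rho)\bigl[(\delta_1-\gamma)-\log(z-\rho)\bigr]\Bigr)
   = e^{-(\delta_1-\gamma)N}\prod_{\rho}(z-\rho)^{m_D(\rho)}.
\end{align*}
```

So the ordinary product is recovered exactly when $\delta_1 = \gamma$ (or $N = 0$); for xi-regularization, where $\delta_1 = 0$, the extra factor $e^{\gamma N}$ remains.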


Theorem 1. $\Delta_D(z)$ is a meromorphic function of finite order with divisor $D$. The explicit relation to Weierstrass products is given by

$$\Delta_D(z) = e^{P(z)}\, \Delta^{\mathrm{Wei},l}_{D,a}(z), \tag{3.7}$$

where with a suitable branch of the logarithm

$$P(z) = \sum_{m=0}^{l} \frac{(z - a)^m}{m!}\, \log^{(m)} \Delta_D(a), \tag{3.8}$$

$$\log^{(m)} \Delta_D(z) = (-1)^{m+1}\, \mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s + m, z)), \qquad m = 0, 1, \ldots.$$

In the sequel a meromorphic function of finite order $f(z)$ with divisor $D$ will be called $\delta$-regularized if it equals $\Delta_D(z)$.

Proof of Theorem 1. We have

$$\Bigl(\frac{d}{dz}\Bigr)^{m+1} \log \Delta_D(z) = (-1)^m\, \mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s + m + 1, z)) \tag{3.9}$$

$$= (-1)^m\, \xi_D(m + 1, z) \quad \text{for } m \ge g; \tag{3.10}$$

the first equation holds because of (3.2) and is valid for all $m \in \mathbb{N}_0$. It is also true that $\xi_D(s, z)$ is holomorphic for $s = m + 1$ if $m \ge g$ by the definition of $g$. Comparison with (3.3) proves the first assertion. One also easily checks

$$\Bigl(\frac{d}{dz}\Bigr)^{m+1} \log \Delta^{\mathrm{Wei},l}_{D,a}(z)\Big|_{z=a} = \begin{cases} 0 & \text{for } m = -1, 0, \ldots, l - 1,\\ (-1)^m\, \xi_D(m + 1, a) & \text{for } m \ge l. \end{cases}$$

Using this as well as (3.9) and (3.10) the explicit relation to Weierstrass products follows by subtracting the Taylor series expansion around $z = a$ for $\log \Delta^{\mathrm{Wei},l}_{D,a}(z)$ from that for $\log \Delta_D(z)$.

4. Determinant Systems

We will call a function $f(z)$ associated to a divisor $D$ if there is a polynomial $P(z)$ such that

$$f(z) = e^{P(z)}\, \Delta^{\mathrm{Wei},g}_{D,a}(z) \quad \text{and} \quad \deg P \le g$$

or equivalently that (3.3) is satisfied for $l = g$ (compare the proof of (3.3)). Observe that in this definition we have set $l = g$ in (3.6). Note also that if $\Delta^{\mathrm{Wei},g}_{D,a}(z)$ is, in addition, entire then its order is exactly $r$ and no entire function with divisor $D$ can have a smaller order (compare [Ti]). So associated functions have minimal order because of $g \le r$. By Theorem 1 regularization means picking out a certain associated function to a divisor. Now we ask for extensions of this process to non-regularizable divisors.

For $\alpha \in \mathbb{C}$ we define the translated divisor $D|_{+\alpha}$ by $m_{D|_{+\alpha}}(z + \alpha) := m_D(z)$ and the sum $D_1 + D_2$ by $m_{D_1 + D_2}(z) := m_{D_1}(z) + m_{D_2}(z)$. Let $\mathcal{D}_{\mathrm{fin}}$ be the abelian group of all finite divisors, $\mathcal{D}_{\mathrm{breg}}$ that of all bounded regularizable quasi-directed divisors (compare Sect. 5), $\mathcal{D}_{\mathrm{reg}}$ that of all regularizable quasi-directed divisors, i.e. those which are regularizable directed after eliminating finitely many $\rho$, $\mathcal{D}_{\mathrm{qd}}$ that of all quasi-directed divisors, and $\mathcal{D}$ that of all divisors. These are all translation-invariant with proper inclusions $\mathcal{D}_{\mathrm{fin}} \subset \mathcal{D}_{\mathrm{breg}} \subset \mathcal{D}_{\mathrm{reg}} \subset \mathcal{D}_{\mathrm{qd}} \subset \mathcal{D}$. The following definition arises from the demand for generalizations of the characteristic polynomial to the infinite dimensional case.


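Definition 4.1. Let $\mathcal{D}' \subseteq \mathcal{D}$ be a translation-invariant subgroup. A determinant $\Delta$ on $\mathcal{D}'$ attaches to every $D \in \mathcal{D}'$ an associated function $\Delta_D(z)$, such that:
i) $\Delta_{D|_{+\alpha}}(z + \alpha) = \Delta_D(z)$ (translation-invariance)
ii) $\Delta_{D_1 + D_2}(z) = \Delta_{D_1}(z)\,\Delta_{D_2}(z)$ (linearity)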

for $D, D_1, D_2 \in \mathcal{D}'$, $\alpha \in \mathbb{C}$. $(\mathcal{D}', \Delta)$ is called a determinant system.

Examples. 1) $(\mathcal{D}_{\mathrm{fin}}, \Delta)$ with the characteristic “polynomial”, which is a rational function defined by $\Delta_D(z) := \prod_{\rho \in \mathbb{C}} (z - \rho)^{m_D(\rho)}$. 2) $(\mathcal{D}_{\mathrm{reg}}, \Delta)$ with the $\delta$-regularized determinant $\Delta$. For a $D \in \mathcal{D}_{\mathrm{reg}}$ which is not directed, $\Delta_D(z)$ can be defined by translation-invariance.

Theorem 2 answers the question of how large determinant systems can be.

Theorem 2. a) There is no determinant system $(\mathcal{D}, \Delta)$. b) For every regularization sequence $\delta$ there is a determinant system $(\mathcal{D}_{\mathrm{qd}}, \Delta)$ which is an extension of the $\delta$-regularized determinant system $(\mathcal{D}_{\mathrm{reg}}, \Delta)$.

Proof. a) Let $D$ be the divisor that consists only of zeroes of order one lying at the lattice points $\rho = m + ni$, $m, n \in \mathbb{Z}$. From translation invariance one gets $\Delta_D(z + 1) = \Delta_D(z)$ and $\Delta_D(z + i) = \Delta_D(z)$. Hence $\Delta_D(z)$ must be a doubly-periodic entire non-constant function which is impossible by Liouville’s theorem. b) (Idea) One has to choose the exponential polynomials consistently with linearity and translation-invariance. This leads to a system of infinitely many linear equations with infinitely many variables which can be reduced to finite systems by Zorn’s Lemma. (See Sect. 11 for a complete proof.)

Remark. Not every determinant system is extendable, so b) is an aesthetic property of regularization. The proof is non-constructive and its extensions are not uniquely determined. The meaning of regularization is that it gives large constructively defined determinant systems.

5. Bounded Regularizability, Singularities and Asymptotics

In this section we introduce the special case of bounded regularizability of divisors and give all the necessary technical definitions to formulate the results of Sects. 6, 7 and 9 which state its equivalence to various asymptotics.

Definition 5.1. Let $D$ be a directed divisor, $0 < \sigma_i < \pi$ for $i = 1, 2$ and $p \in \mathbb{R} \cup \{\infty\}$, then $D$ resp. $\xi_D(s)$ are called $(\sigma_1, \sigma_2)$-bounded $p$-regular if:
i) $\xi_D(s)$ is meromorphic for $\Re(s) > -p$.
ii) $\xi_D(s)$ has only finitely many poles in the strip $\alpha_1 < \Re(s) < \alpha_2$ for any $-p < \alpha_1 < \alpha_2$.
iii) For all $-p < \alpha_1 < \alpha_2$ and $\sigma_1' < \sigma_1$, $\sigma_2' < \sigma_2$,

$$\xi_D(s) = \begin{cases} O\bigl(e^{(\frac{\pi}{2} - \sigma_2')\,\Im(s)}\bigr) & \text{for } \Im(s) \to \infty,\\ O\bigl(e^{-(\frac{\pi}{2} - \sigma_1')\,\Im(s)}\bigr) & \text{for } \Im(s) \to -\infty \end{cases}$$

in the strip $\alpha_1 < \Re(s) < \alpha_2$.


We simply say bounded $p$-regular, if there are $0 < \sigma_i < \pi$ such that $(\sigma_1, \sigma_2)$-bounded $p$-regularity is valid. We have bounded regularizability if $p > 0$. Note that every directed divisor $D$ in $Wl_{\varphi_1,\varphi_2}$ is $(\varphi_1, \varphi_2)$-bounded $(-r)$-regular as follows from Stirling’s formula.

Definition 5.2. A pB-system consists of:
1. A finite or infinite sequence of pairs $(p_n, B_n(z))_{n=0,1,2,\ldots}$ with complex numbers $p_n$ satisfying $\Re(p_0) \le \Re(p_1) \le \ldots \le \Re(p_n) \le \ldots$ and polynomials $B_n(z) \in \mathbb{C}[z]$, $B_n(z) = \sum_k b_{n,k} z^k$.
2. An abscissa $p \in \mathbb{R} \cup \{\infty\}$ such that $p > \Re(p_n)$ for all $n$, and in addition for infinite sequences: $p = \lim_{n\to\infty} \Re(p_n)$.

pB-systems capture the simultaneous information about the occurring singular part distributions and the occurring asymptotics.

Example. If the divisor $D$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular, then there is a pB-system $(p_n, B_n(z))_{n=0,1,2,\ldots}$ with abscissa $p$ such that the poles of $\xi_D(s)$ in the half plane $\Re(s) > -p$ lie exactly at the values $s = -p_n$ and the Laurent expansions have the singular parts

$$B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr] = \sum_{k} b_{n,k}\, \frac{(-1)^k\, k!}{(s + p_n)^{k+1}}.$$

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\ldots}$$

is called the singular part distribution of $\xi_D(s)$ in that case.

Definition 5.3. Let $(p_n, B_n(z))_{n=0,1,\ldots}$ be a pB-system with abscissa $p$ as above and $0 < \sigma_i \le \varphi_i < \pi$, $i = 1, 2$. A function $\theta : Wr_{\varphi_1,\varphi_2} \to \mathbb{C}$ satisfies the Cramér asymptotic with abscissa $p$ in $Wr_{\sigma_1,\sigma_2}$,

$$\theta(t) \sim \sum_{n=0}^{\infty} t^{p_n} B_n(\log t) \quad \text{for } |t| \to 0,$$

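if the estimate for $t \in Wr_{\sigma_1,\sigma_2}$

$$\theta(t) - \sum_{\Re(p_n) < q} t^{p_n} B_n(\log t) = O(|t|^q) \quad \text{for } |t| \to 0$$

[...] $\Re(s) > r$,

$$\xi_D(s) = \int_0^{\infty} \theta_D(t)\, t^{s-1}\, dt, \tag{6.2}$$

and its inverse for $t \in Wr_{(\varphi_2 - \frac{\pi}{2}),(\varphi_1 - \frac{\pi}{2})}$ and $c > r$

$$\theta_D(t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \xi_D(s)\, t^{-s}\, ds \tag{6.3}$$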

with absolute convergence of the integrals.

Proof. Because of the theorem about Mellin inversion it suffices to prove (6.3) and by majorized convergence, this is reduced to the case of a one-point-divisor. In that case (6.3) is the inverse formula for Euler’s Mellin integral for $\Gamma(s)$.

This approach is only possible for strictly directed divisors as otherwise the defining series for $\theta_D(t)$ does not converge for any $t$.

Theorem 3. Let $\frac{\pi}{2} < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be strictly directed in $Wl_{\varphi_1,\varphi_2}$, and let $(p_n, B_n(z))_{n=0,1,\ldots}$ be a pB-system with abscissa $p$. Then the following statements are equivalent:

A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\ldots}.$$

C) $\theta_D(t)$ satisfies a Cramér asymptotic with abscissa $p$ of the form

$$\theta_D(t) \sim \sum_{n=0}^{\infty} t^{p_n} B_n(\log t) \quad \text{for } |t| \to 0$$

in $Wr_{(\sigma_2 - \frac{\pi}{2}),(\sigma_1 - \frac{\pi}{2})}$.

Proof (sketch). C) ⇒ A) is shown by (6.2): The poles and singular parts of $\xi_D(s)$ arise by integrating the terms of the Cramér asymptotic, and the exponential estimation for $\xi_D(s)$ in vertical strips can be shown by rotating the ray of integration in (6.2) in $Wr_{(\sigma_2 - \frac{\pi}{2}),(\sigma_1 - \frac{\pi}{2})}$. A) ⇒ C) follows from (6.3) by replacing the abscissa $c$ of the line of integration by a smaller $c > -p$ and applying the residue theorem. The residues of the integrand produce the terms of the Cramér asymptotic.
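The reduction to a one-point divisor used in the proof of (6.3) can be made explicit. The sketch below assumes that the partition function has the form $\theta_D(t) = \sum_{\rho \in D} e^{\rho t}$ (the form that appears in Remark 5 and in (8.1)) and takes a single point $\rho$ with $\Re(\rho) < 0$ and $t > 0$; these restrictions are made here only to keep the example short.

```latex
% One-point divisor \rho with \Re(\rho) < 0, \theta(t) = e^{\rho t}, t > 0.
\begin{align*}
\int_0^{\infty} e^{\rho t}\,t^{s-1}\,dt
  &= \frac{\Gamma(s)}{(-\rho)^{s}}, && \Re(s) > 0
  && \text{(Euler's integral, cf. (6.2))},\\
\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{\Gamma(s)}{(-\rho)^{s}}\,t^{-s}\,ds
  &= e^{\rho t}, && c > 0
  && \text{(Mellin inversion, cf. (6.3))}.
\end{align*}
```

Here the poles of $\Gamma(s)(-\rho)^{-s}$ at $s = -n$ have residues $\rho^n/n!$, which are exactly the coefficients of the Taylor expansion $e^{\rho t} = \sum_n \rho^n t^n / n!$. This is the correspondence A) ⇔ C) of Theorem 3 in its simplest instance, with $p_n = n$ and constant $B_n = \rho^n/n!$.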


Remark 1. This theorem is well known in the context of the Mellin and Laplace transform (e.g. [Do, II, Chap. 5], where a complete proof can be found). Using it one can decide whether a strictly directed divisor is bounded regularizable or not, by checking for the existence of a suitable Cramér asymptotic for the partition function. For example the regularized product of the eigenvalues of Laplacians on manifolds exists because of Cramér asymptotics arising from heat kernel expansions (compare [Ef, Ko1, Ko2, Ko3, Sa]).

Remark 2. In the case of strictly directed divisors also the implication A) ⇒ B’) of Theorem 4 and, in particular, the asymptotic (7.8) can be obtained by the Mellin integral

$$\xi_D(s, z) = \int_0^{\infty} \theta_D(t)\, e^{-zt}\, t^{s-1}\, dt$$

using

$$\int_0^{\infty} t^{p_n} B_n(\log t)\, e^{-zt}\, t^{s-1}\, dt = B_n(\partial_s)\Bigl[\frac{\Gamma(s + p_n)}{z^{s + p_n}}\Bigr].$$

The Mellin transform approach to regularized products and determinants (the details can be found in Sect. 2.4 in [Il1]) was also extensively studied by Jorgenson and Lang ([JL1]).

7. Hankel Integrals and Stirling Asymptotics

The Mellin integral method has two shortcomings: It is possible only for strictly directed divisors and the partition function is a non-intrinsic construction; one wants criteria in terms of associated functions. In this section we solve these problems, postponing the technical proofs until Sect. 12. For powers $a^s$ and $\log(a)$ we always use $-\pi < \arg(a) < \pi$.

Proposition 7.1. Let $D$ be a directed divisor in $Wl_{\varphi_1,\varphi_2}$ ($0 < \varphi_i < \pi$).

a)

$$\xi_D(s, z) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{\Gamma(s - s')}{z^{s - s'}}\, \xi_D(s')\, ds' \tag{7.1}$$

for $z \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s) > c > r$ with absolute convergence of the integral.

b) Let $0 < \sigma_i < \varphi_i$ for $i = 1, 2$, $C = C_{\sigma_1,\sigma_2}$ and $z \in Wr_{\sigma_1,\sigma_2}$. Then for $\Re(s) > r$ and $\Re(s_0) > r$ one has the absolutely convergent integral representation

$$\xi_D(s, z) = \frac{1}{2\pi i} \int_{C} \frac{\Gamma(s - s_0 + 1)}{(z - w)^{s - s_0 + 1}}\, \xi_D(s_0, w)\, dw. \tag{7.2}$$

In the sequel the representations (7.1) and (7.2) play a similar role as (6.2) and (6.3) in Sect. 6. For the explicit description of the Stirling asymptotics we need the following definition.


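Definition 7.2. Let $\delta$ be a fixed regularization sequence. Then for any $q \in \mathbb{C}$ we define the linear map

$$[q] : \mathbb{C}[z] \to \mathbb{C}[z], \qquad B(z) \mapsto B^{[q]}(z),$$

by

$$\mathrm{CT}_{s=0}\Bigl(\delta(s)\, B(\partial_s)\Bigl[\frac{\Gamma(s + q)}{z^{s + q}}\Bigr]\Bigr) = z^{-q} B^{[q]}(\log z). \tag{7.3}$$

For $P_k(z) := z^k$ we get:

$$P_k^{[q]}(z) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\, \Gamma^{(k-j)}(q)\, z^j \tag{7.4}$$

in the case that $q \ne -n$ for all $n \in \mathbb{N}_0$, while for $q = -n$,

$$P_k^{[q]}(z) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\, \mathrm{CT}_{s=q}\bigl(\Gamma^{(k-j)}(s)\bigr)\, z^j + \frac{(-1)^n}{n!}\Bigl(\frac{(-1)^{k+1}}{k+1}\, z^{k+1} + (-1)^k\, k!\,\delta_{k+1}\Bigr). \tag{7.5}$$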

Special case $B(z) = b_0$. Easy calculations using the fact that $\Gamma(z)$ is holomorphic for $(-z) \notin \mathbb{N}_0$ as well as the expansion $\Gamma(s) = \frac{1}{s} - \gamma + \ldots$ ($\gamma$ the Euler-Mascheroni constant) and $\Gamma(s - n) = \Gamma(s)\bigl((s - 1)(s - 2) \cdots (s - n)\bigr)^{-1}$ deliver

$$B^{[q]}(z) = \begin{cases} b_0\, \Gamma(q) & \text{for } q \ne -n,\\[2pt] b_0\, \dfrac{(-1)^{n+1}}{n!}\Bigl(z + \gamma - \delta_1 - \sum_{j=1}^{n} \dfrac{1}{j}\Bigr) & \text{for } q = -n \end{cases} \tag{7.6}$$

(for zeta-regularization as well as for zero-renormalization one has $\delta_1 = \gamma$). The following basic properties of $[q]$ are clear by (7.4), (7.5) and the definition.

Proposition 7.3. a) $[q]$ is bijective for $q \ne 0, -1, -2, \ldots$ with $\deg B^{[q]} = \deg B$. b) $[q]$ is injective for $q = 0, -1, -2, \ldots$ with $\deg B^{[q]} = \deg B + 1$ and with $\dim \mathrm{Coker}([q]) = 1$. c)

$$\frac{d}{dz}\bigl(z^{-q} B^{[q]}(\log z)\bigr) = -z^{-(q+1)} B^{[q+1]}(\log z). \tag{7.7}$$
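The case $q = -n$ of (7.6) can be recovered from Definition 7.2 by expanding each factor at $s = 0$. The sketch below uses the standard Laurent expansion of $\Gamma$ at its poles and the abbreviation $H_n := \sum_{j=1}^{n} 1/j$, which is notation introduced here for brevity and not used in the original.

```latex
% Expansions at s = 0 for B(z) = b_0 and q = -n, n \in \mathbb{N}_0:
%   \Gamma(s-n) = \frac{(-1)^n}{n!}\Bigl(\frac{1}{s} + H_n - \gamma + O(s)\Bigr),
%   z^{-(s-n)}  = z^{n}\bigl(1 - s\log z + O(s^2)\bigr),
%   \delta(s)   = 1 + \delta_1 s + O(s^2).
\begin{align*}
\mathrm{CT}_{s=0}\Bigl(\delta(s)\,b_0\,\frac{\Gamma(s-n)}{z^{s-n}}\Bigr)
  &= b_0\,\frac{(-1)^{n}}{n!}\,z^{n}\bigl(H_n-\gamma+\delta_1-\log z\bigr)\\
  &= z^{n}\cdot b_0\,\frac{(-1)^{n+1}}{n!}
     \Bigl(\log z+\gamma-\delta_1-\sum_{j=1}^{n}\frac{1}{j}\Bigr).
\end{align*}
% Comparing with (7.3) for q = -n gives
%   B^{[-n]}(x) = b_0 \frac{(-1)^{n+1}}{n!}\bigl(x + \gamma - \delta_1 - H_n\bigr).
```

The degree of $B^{[-n]}$ is one higher than that of $B = b_0$, in line with part b) of Proposition 7.3.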

Remark 3. In particular, every Stirling asymptotic with abscissa $p$ can be represented as a linear combination of terms of the form $z^{-q} B^{[q]}$ up to a polynomial $P(z)$ with $P(z) z^p \to 0$ for $|z| \to 0$, and this polynomial is uniquely determined. (Observe: Terms $z^{-q}$ with $q \ge p$ make no sense in Stirling asymptotics with abscissa $p$.) This shows that the asymptotics in B’) of Theorem 4 and B) of Theorem 5 are general Stirling asymptotics which are written in a special manner. And this also means that the Stirling asymptotic (7.8) is an effective method to determine the regularized determinant among all associated functions for $D$.


Remark 4. Part c) of the proposition together with (3.2) shows that the Stirling asymptotics in B’) and B) can be differentiated term by term.

Theorem 4. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $Wl_{\varphi_1,\varphi_2}$, let $p \in \mathbb{R} \cup \{\infty\}$ and $\xi_D(s)$ be meromorphic for $\Re(s) > -p$ (compare Prop. 3.3). Then for any regularization sequence $\delta$, $s_0$ with $\Re(s_0) > -p$ and a pB-system $(p_n, B_n(z))_{n=0,1,\ldots}$ with abscissa $p \in \mathbb{R} \cup \{\infty\}$ the following statements are equivalent:

A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with the singular part distribution

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\ldots}.$$

B’) There is a polynomial $P_{s_0}(z)$ with $P_{s_0}(z)\, z^{p + s_0} \to 0$ for $|z| \to 0$ and such that the Stirling asymptotic with abscissa $p + \Re(s_0)$

$$\mathrm{CT}_{s=0}(\delta(s)\,\xi_D(s + s_0, z)) \sim P_{s_0}(z) + \sum_{n=0}^{\infty} z^{-(p_n + s_0)} B_n^{[p_n + s_0]}(\log z) \quad \text{for } |z| \to \infty$$

is valid in $Wr_{\sigma_1,\sigma_2}$. The polynomial in B’) is then uniquely determined: $P_{s_0}(z) = 0$.

The idea of the proof given in Sect. 12 is rather similar to the proof of Theorem 3. To get the Stirling asymptotic B’) from the singular part distribution A) one uses (7.1), shifts the line of integration and applies the residue theorem. The other direction is a little bit more difficult but the basic idea is of course to use (7.2) and integrate the Stirling asymptotic term by term. Some technical difficulties arise because (7.2) is not valid for $z = 0$. Using Eqs. (3.2), (3.3) and (3.5) one obtains the following theorem as an easy corollary of Theorem 4.

Theorem 5. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $Wl_{\varphi_1,\varphi_2}$; let $f(z)$ be a meromorphic function of finite order with divisor $D$. Then for a regularization sequence $\delta$, $m \in \mathbb{N}_0$ and a pB-system $(p_n, B_n(z))_{n=0,1,\ldots}$ with abscissa $p \in \mathbb{R} \cup \{\infty\}$ the following statements are equivalent:

A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution

$$\Bigl(-p_n,\ B_n(\partial_s)\Bigl[\frac{1}{s + p_n}\Bigr]\Bigr)_{n=0,1,\ldots}.$$

B) There is a polynomial $P_f(z)$ with $P_f(z)\, z^p \to 0$ for $|z| \to 0$, such that the Stirling asymptotic with abscissa $(p + m)$

$$\log^{(m)} f(z) \sim P_f^{(m)}(z) + (-1)^{m-1} \sum_{n=0}^{\infty} z^{-(p_n + m)} B_n^{[p_n + m]}(\log z) \quad \text{for } |z| \to \infty$$

is valid in $Wr_{\sigma_1,\sigma_2}$.


$P_f(z)$ in B) can then be chosen independent of $m$; it is (up to the choice of the logarithm) uniquely determined. If, in addition, $p > 0$, so $D$ is bounded regularizable, then for the $\delta$-regularized determinant one has $P_{\Delta_D} = 0$, i.e. the following Stirling asymptotic with abscissa $p$ in $Wr_{\sigma_1,\sigma_2}$ is valid:

$$\log \Delta_D(z) \sim -\sum_{n=0}^{\infty} z^{-p_n} B_n^{[p_n]}(\log z) \quad \text{for } |z| \to +\infty. \tag{7.8}$$

Theorem 5 can be regarded as the fundamental theorem about bounded regularizability: by Remark 3 it states that whenever a $\log^{(m)} f(z)$ satisfies any Stirling asymptotic with abscissa greater than zero, then $f(z)$ and its divisor $D$ are bounded regularizable, and (7.8) allows one to determine its regularized determinant, i.e. the polynomial $P(z)$ mentioned in the introduction. The triple equivalence A) ⇔ B) (⇔ C)) given by Theorems 3 and 5, where the latter equivalence is valid only for strictly directed divisors, will be generalized in Sect. 9 (Theorems 4 and 6) to an equivalence A) ⇔ B’) ⇔ C’) valid for all directed divisors, which summarizes all information about singular part distributions and asymptotics of $\xi_D(s, z)$ and $\theta_D(t, s)$.

8. Examples

8.1. Dirichlet series.

Corollary 8.1. If $f(z)$ is meromorphic of finite order and has an absolutely convergent Dirichlet series representation

$$f(z) = 1 + \sum_{n=0}^{\infty} \frac{\beta_n}{\alpha_n^{z}}, \qquad \Re(z) > \sigma_0,$$

with $\beta_n \in \mathbb{C}$ and $\alpha_n \in \mathbb{R}_{>1}$ with $\lim_{n\to\infty} \alpha_n = \infty$, then $f(z)$ is $\delta$-regularized for every regularization sequence $\delta$, i.e. $f(z) = \Delta_D(z)$, in particular, $f(z)$ is associated to its divisor (compare to Sect. 4). $\xi_D(s, z)$ is holomorphic for $s \in \mathbb{C}$.

Proof. By the Taylor series expansion for $\log(1 + x)$ it is clear that the trivial Stirling asymptotic $\log f(z) \sim 0$ as $|z| \to \infty$ with abscissa $+\infty$ is valid in $Wr_{(\frac{\pi}{2} - \varepsilon),(\frac{\pi}{2} - \varepsilon)}$, so by Theorem 5 and Proposition 3.3 the assertion is clear.

Remark 5. Using only the Mellin integral method one needs to assume that $f(z)$ also satisfies a functional equation and examines

$$\theta_{D_+}(t) := \sum_{\rho \in D,\ \Im(\rho) > 0} e^{\rho t}, \qquad \theta_{D_-}(t) := \sum_{\rho \in D,\ \Im(\rho) \le 0} e^{\rho t}$$

separately. For $f(z) = \zeta(z)$ the Riemann zeta function a classical result of Cramér ([Cr]) delivers the Cramér asymptotics for $\theta_{D_+}(t)$ and $\theta_{D_-}(t)$ (with logarithmic terms in contrast to the examples from the spectra of Laplacians mentioned in Sect. 6) and thus regularizability of $\zeta(z)$ ([So, ScSo]). In [JL2] Cramér’s result was generalized to a class of $f(z)$ as in the above corollary which in addition satisfies a functional equation, and their result implies regularizability


of all these functions. Corollary 8.1 shows regularizability for a much larger class and moreover one no longer needs Cramér’s result. The methods of this section of [Il2] also apply to the “polynomial Bessel fundamental class” introduced in [JL3]. Nevertheless a functional equation is necessary if one wants to get information about $\theta_{D_+}(t)$ (compare [Il2]).

Remark 6. Theorem 5 gives satisfying criteria for deciding whether a function is bounded regularizable or not. For example, consider the function

$$f(z) = \frac{1}{\sqrt{2\pi}}\,(z^2 + 1)(1 + e^{-z})\,\Gamma(z) + e^{-z^2}\,\Gamma'(z).$$

It can be immediately seen that it is zeta-regularized: $(\sqrt{2\pi})^{-1}\,\Gamma(z)$ is zeta-regularized because of (8.5) below, and this is true for $(z^2 + 1)$ because it is a characteristic polynomial, and this holds for $(1 + e^{-z})$ because it is a Dirichlet series; the second summand is small (in an angular domain) compared to the first and does not change the Stirling asymptotic of $\log f(z)$.

8.2. $\Gamma$-functions.

In §2.8 of [Vi] the functions $\Gamma_n(z)$ were defined which appear in the functional equations of Selberg zeta functions and which are special cases of the general higher $\Gamma$-functions introduced by Barnes ([Bar]). They are simple examples for regularization with non-trivial Stirling asymptotics and their zeta-regularization can already be found in [Va, Ku] and [Ma, §3.3]. We give the following definition which is equivalent to that of Vigneras.

Definition 8.2. The sequence $(\Gamma_n(z))_{n=0,1,\ldots}$ of $\Gamma$-functions of order $n$ is defined by the following conditions:
0) $\Gamma_0(z) = \frac{1}{z}$.
1) $\Gamma_n^{-1}(z)$, $n \in \mathbb{N}$, is an entire function of finite order and the divisor $D_n$ of $\Gamma_n(z)$ consists exactly of the $\rho = -k$, $k \in \mathbb{N}_0$ with multiplicity $-\binom{n + k - 1}{n - 1}$.
2) $\Gamma_n(1) = 1$ for all $n \in \mathbb{N}_0$.
3) For all $n \in \mathbb{N}_0$ the following functional equation is valid:

$$\Gamma_{n+1}(z + 1) = \frac{\Gamma_{n+1}(z)}{\Gamma_n(z)}.$$

Using higher Bernoulli polynomials ([No]) one has for $n \in \mathbb{N}_0$,

$$\theta_{D_n}(t) = -(-\theta_{D_1}(t))^n = -\frac{1}{(1 - e^{-t})^n} = -\sum_{\nu=0}^{\infty} \frac{(-1)^{\nu} B_{\nu}^{n}(0)}{\nu!}\, t^{\nu - n} \quad \text{for } 0 < |t| < 2\pi, \tag{8.1}$$

thus by Theorem 3 the $D_n$ are bounded regularizable. Applying (7.6) and (7.8) one gets the Stirling asymptotic with abscissa $+\infty$ for the $\delta$-regularized determinant

$$\log \Delta_{D_n}(z) \sim (-1)^{n+1} \sum_{k=0}^{n} \frac{B_{n-k}^{n}(0)}{(n - k)!\,k!}\Bigl(\log z + \gamma - \delta_1 - \sum_{j=1}^{k} \frac{1}{j}\Bigr) z^k + \sum_{k=1}^{\infty} (-1)^{n+k}\, \frac{(k - 1)!\, B_{n+k}^{n}(0)}{(n + k)!}\, z^{-k} \quad \text{for } |z| \to \infty \tag{8.2}$$


in $Wr_{(\pi - \varepsilon),(\pi - \varepsilon)}$.

Proposition 8.3. The functions $\Gamma_n(z)$ are well defined; one has $\Gamma_n(z) = e^{-P_n(z)}\, \Delta_{D_n}(z)$ with polynomials $P_n(z)$ of degree $\le n$ which are determined (e.g. using Lagrange interpolation) by the relations

$$P_n(j) = \sum_{i=0}^{j-1} (-1)^i \binom{j - 1}{i} \log \Delta_{D_{n-i}}(1), \qquad j = 1, \ldots, n + 1. \tag{8.3}$$

The values $\log \Delta_{D_n}(1) := -\mathrm{CT}_{s=0}(\delta(s)\,\xi_{D_n}(s, 1))$ can be expressed in terms of the Riemann zeta function:

$$\log \Delta_{D_n}(1) = \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta'(-l) + (\delta_1 - \gamma) \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta(-l) \tag{8.4}$$

for $n \ge 1$ and $\log \Delta_{D_0}(1) = \delta_1 - \gamma$, with the Euler-Mascheroni constant $\gamma$ and $\tau_{n,l}$ from the development

$$\binom{n + x - 1}{n - 1} = \sum_{l=0}^{n-1} \tau_{n,l}\,(x + 1)^l.$$

Proof. One shows that there exists exactly one choice of polynomials $P_n(z)$ with

\[
P_{n+1}(z+1) + P_n(z) - P_{n+1}(z) = 0, \qquad P_n(1) = \log D_n(1)
\]

for $n \in \mathbb{N}_0$, with $\deg P_0 = 0$. (Because of $\deg P_0 = 0$ and the first equation one gets $\deg P_n \le n$; by induction it is easy to prove that the $P_n(j)$ are given as in the proposition, and in the other direction, that the uniquely determined $P_n(z)$ with $\deg P_n \le n$ and with these $P_n(1)$ satisfy the two equations.) The expression for $\log D_n(1)$ follows from $\delta(s)\Gamma(s) = \frac{1}{s} + (\delta_1 - \gamma) + \dots$ and

\[
\xi_{D_n}(s,1) = -\Gamma(s) \sum_{k=0}^{\infty} \binom{n+k-1}{n-1} (k+1)^{-s} = -\Gamma(s) \sum_{l=0}^{n-1} \tau_{n,l}\,\zeta(s-l).
\]

$\Gamma_1(z)$ is the usual $\Gamma$-function, $\Gamma_2^{-1}(z) = G(z)$ is known as Barnes' G-function. For these two functions we will give the result more explicitly. It is well known that $\zeta(0) = -\frac{1}{2}$, $\zeta'(0) = -\log\sqrt{2\pi}$ and $\zeta(-1) = -\frac{1}{12}$. With the Kinkelin-Glaisher constant $A$ one can express $\zeta'(-1) = \frac{1}{12} - \log A$ (compare [Vo, pp. 461–464], [Al, p. 357]), but we use just $\zeta'(-1)$.

Corollary 8.4. For $n = 1, 2$ one has

\[
D_1(z) = \frac{\Gamma_1(z)}{\sqrt{2\pi}}\, e^{-(\delta_1-\gamma)\left(z-\frac{1}{2}\right)}, \tag{8.5}
\]

\[
D_2(z) = \frac{\Gamma_2(z)}{\sqrt{2\pi}}\, e^{\zeta'(-1) + z\log\sqrt{2\pi} + \frac{\delta_1-\gamma}{2}\left((z-1)^2 - \frac{1}{6}\right)}. \tag{8.6}
\]

By combining (8.5) and (8.2) one gets the usual Stirling formula for $\Gamma(z)$.
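The exponents in Corollary 8.4 can be cross-checked against the interpolation relations (8.3). The sketch below is an illustration under stated assumptions, not the author's computation: each quantity is stored as an exact coefficient vector in the basis $(\zeta'(-1),\ \log\sqrt{2\pi},\ \delta_1-\gamma)$, the values $\log D_n(1)$ for $n = 0, 1, 2$ are taken from (8.4) with $\zeta(0) = -\frac12$, $\zeta'(0) = -\log\sqrt{2\pi}$, $\zeta(-1) = -\frac1{12}$, and the names `p_at`, `p1_closed`, `p2_closed` are hypothetical.

```python
from fractions import Fraction as F
from math import comb

# Coefficient vectors in the basis (zeta'(-1), log sqrt(2*pi), delta_1 - gamma).
LOG_D_AT_1 = {
    0: (F(0), F(0), F(1)),          # log D_0(1) = delta_1 - gamma
    1: (F(0), F(-1), F(-1, 2)),     # zeta'(0) + (delta_1 - gamma) * zeta(0)
    2: (F(1), F(0), F(-1, 12)),     # zeta'(-1) + (delta_1 - gamma) * zeta(-1)
}


def p_at(n, j):
    """P_n(j) computed from relation (8.3)."""
    total = [F(0), F(0), F(0)]
    for i in range(j):
        weight = (-1) ** i * comb(j - 1, i)
        total = [t + weight * c for t, c in zip(total, LOG_D_AT_1[n - i])]
    return tuple(total)


def p1_closed(z):
    """P_1(z) = -log sqrt(2 pi) - (delta_1 - gamma)(z - 1/2), read off from (8.5)."""
    return (F(0), F(-1), -(F(z) - F(1, 2)))


def p2_closed(z):
    """P_2(z) read off from (8.6); the prefactor 1/sqrt(2 pi) contributes -log sqrt(2 pi)."""
    return (F(1), F(z) - 1, F(1, 2) * ((F(z) - 1) ** 2 - F(1, 6)))


if __name__ == "__main__":
    print(all(p_at(1, j) == p1_closed(j) for j in (1, 2)))
    print(all(p_at(2, j) == p2_closed(j) for j in (1, 2, 3)))
```

Since $\deg P_n \le n$, agreement at $n+1$ points determines the polynomial, so the check at $j = 1, \dots, n+1$ is a complete consistency test for $n = 1, 2$.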


9. The Function θD (t, s)

We now define and examine the function $\theta_D(t,s)$ for a directed divisor. This function is a Laplace transform of $\xi_D(s,z)$ in the variable $z$; it turns out to be a "mixture" of $\theta_D(t)$ and $\xi_D(s)$ and is an essential tool in [Il2]. We give without proof a sort of generalization of Theorem 3 to directed divisors in terms of this function.

In the sequel $Wl_{\varphi_1,\varphi_2}$ is regarded as the subset of all those $t \in \mathbb{C}^*$ with $\varphi_1 < \arg(t) < -\varphi_2 + 2\pi$ (and we use these arguments for $\log t$). Then $Wl_{\varphi_1,\varphi_2}$ is also defined for $\varphi_i \le 0$, which is needed in what follows.

Definition 9.1. For $\rho \in \mathbb{C} \setminus \mathbb{R}_{\ge 0}$, $s \in \mathbb{C}$ with $(-s) \notin \mathbb{N}_0$ and $t \in \mathbb{C}^*$ we define

\[
\widetilde{\exp}(\rho,t,s) := \frac{e^{-\pi i(s-1)}}{2\pi i}\, \Gamma(s)\,\Gamma(1-s,\rho t)\cdot e^{\rho t}. \tag{9.1}
\]

For a directed divisor $D$ in $Wl_{\varphi_1,\varphi_2}$ (with $0 < \varphi_i < \pi$) we define

\[
\theta_D(t,s) := \sum_{\rho \in D} \widetilde{\exp}(\rho,t,s) \qquad \text{for } t \in Wl_{(\frac{\pi}{2}-\varphi_1),(\frac{\pi}{2}-\varphi_2)},\ \Re(s) > r. \tag{9.2}
\]

In the definition of $\widetilde{\exp}(\rho,t,s)$, a type of multivalued exponential function, the incomplete Gamma function (obviously holomorphic in $\alpha$ and $z$)

\[
\Gamma(\alpha,z) := \int_z^{\infty} e^{-\tau}\,\tau^{\alpha-1}\,d\tau, \qquad \alpha \in \mathbb{C},\ z \in \mathbb{C}^*
\]

is used. Properties of $\Gamma(\alpha,z)$ are well known (e.g. [EMOT, II, Chap. 9]). We state the needed properties of $\widetilde{\exp}(\rho,t,s)$ in Lemma 13.1 and give a self-contained proof. In particular, by the lemma one can see that the defining sum for $\theta_D(t,s)$ converges absolutely and is holomorphic in the given domains.

Proposition 9.2. With $D$ as in the above definition, $t \in Wl_{(\frac{\pi}{2}-\varphi_1),(\frac{\pi}{2}-\varphi_2)}$ and $\Re(s) > r$ one has

\[
\theta_D(t,s) = \frac{t^{-(s-1)}}{2\pi i} \int_0^{e^{i\alpha}\infty} e^{wt}\,\xi_D(s,w)\,dw \tag{9.3}
\]

for every $\alpha \in\, ]-\varphi_2, \varphi_1[$ satisfying $\Re(e^{i\alpha}t) < 0$, and the integral converges absolutely. $\theta_D(t,s)$ satisfies the following functional equations:

\[
\theta_D(t,s+1) - \theta_D(t,s) = \frac{t^{-s}}{2\pi i}\cdot \xi_D(s) \tag{9.4}
\]

and if $D$ is strictly directed in $Wl_{\varphi_1,\varphi_2}$ with $\frac{\pi}{2} < \varphi_i < \pi$,

\[
\theta_D(t,s) - e^{2\pi i(s-1)}\,\theta_D(\exp(2\pi i)t, s) = \theta_D(t), \qquad t \in Wr_{(\varphi_2-\frac{\pi}{2}),(\varphi_1-\frac{\pi}{2})}, \tag{9.5}
\]

where the overlap in $Wl_{(\frac{\pi}{2}-\varphi_1),(\frac{\pi}{2}-\varphi_2)}$, the domain of definition for $\theta_D(t,s)$, is identified with $Wr_{(\varphi_2-\frac{\pi}{2}),(\varphi_1-\frac{\pi}{2})}$, which is the domain of definition for the partition function $\theta_D(t)$ (compare Definition 6.1).

Proof. By majorized convergence using Lemma 12.1 the Laplace integral representation is obtained from (13.4). The functional equations follow from the corresponding ones for $\widetilde{\exp}(\rho,t,s)$ given in Lemma 13.1.


Remark 7. The proposition shows that θD (t, s) behaves like ξD (s) in the variable s and like θD (t) in the variable t. In particular, θD (t, s) is meromorphic for (s) > −p if and only if this is true for ξD (s). For q ∈ C, a regularization sequence δ and B(z) ∈ C[z] we define the polynomial B [[q]] (z) ∈ C[z] by π(−z)s+q 1 CTs=0 δ(s)B(∂s ) = B [[q]] (log z)zq , − 2πi sin(π(s + q)) with arg(−z) := arg(z) − π. Theorem 6. Let 0 < σi ≤ ϕi < π for i = 1, 2 and D be a directed divisor in Wlϕ1 ,ϕ2 such that ξD (s) is meromorphic for (s0 ) > −p (compare Remark 1). For s0 ∈ C with (s0 ) > −p , a regularization sequence δ and a pB-system (pn , Bn (z))n=0,1,... with abscissa p < ∞, the following statements are equivalent: A) ξD (s) is (σ1 , σ2 )-bounded p-regular with singular part distribution 1 . − pn , Bn (∂s ) s + pn n=0,1,... s0 (t, t −1 ) with P s0 (t, t −1 )t −(p+s0 −1) → 0 for |t| → ∞ C’) There exists a polynomial P and such that the Cramér asymptotic with abscissa p + (s0 ) − 1, s0 (t, t −1 ) θD (t, s + s0 ) ∼ P CTs=0 δ(s)t s+s0 −1 +

∞

[[pn +s0 −1]]

t pn +s0 −1 Bn

(log t)

(9.6)

n=0

for |t| → 0 is valid in Wl( π2 −σ1 ),( π2 −σ2 ) . The polynomial in C’) is then uniquely determined: n−1 s0 (t, t −1 ) = 1 P CTs=0 (δ(s)ξD (s + s0 − k − 1))t k 2π i k=0

with $n$ such that $n - 1 < p + \Re(s_0) - 1 \le n$. As this theorem has no direct application to regularization we omit the analogue of Proposition 7.3 for $[[q]]$ which shows that the Cramér asymptotic in C') is a general one written in a special manner, and we give only the idea of the proof of Theorem 6.

Proof. (idea) By Theorem 4 it suffices to prove B') ⇔ C'). This equivalence can be shown using (9.3) and its inversion by a Hankel integral ((2.6.11) in [Il2]), integrating the asymptotics term by term. For details see Sect. 2.6.1 in [Il2].

Remark 8. The identity $B^{[[q]]}(\log t)\,t^q - B^{[[q]]}(\log(\exp(2\pi i)t))\,(\exp(2\pi i)t)^q = B(\log t)\,t^q$ and (9.5) lead one to rediscover the implication A) ⇒ C) in Theorem 3, but now one has with Theorem 4 the general equivalence A) ⇔ B') ⇔ C') for all directed divisors already mentioned in Sect. 7. Because of (9.3), C') is an explicit determination of the Cramér asymptotic of the Laplace transform of $\xi_D(s,z)$, which is the basic meaning of Theorem 6.


10. Renormalized Determinants

The following is a generalization of ideas from §5 of [Vo]. Because of (3.2) and (3.3) every meromorphic function D (z) of finite order with divisor $D$ has representations of the form

\[
\log D(z) = \int_{a_{\ell+1}}^{z} \int_{a_\ell}^{\lambda_\ell} \cdots \int_{a_1}^{\lambda_1} (-1)^{\ell}\, \xi_D(\ell+1,\lambda_0)\, d\lambda_0\, d\lambda_1 \cdots d\lambda_\ell \tag{10.1}
\]

for certain $\ell \ge g$ and $a_i \in \mathbb{C}$, e.g. Wei,D,a (z) defined by Eq. (3.6) for $a_i = a$. Easy considerations show that one must have $|a_i| = \infty$ in order to get a determinant (compare Sect. 4) by this. But then (10.1) is divergent, so one has to renormalize the divergent integral. If $D$ is quasi-directed and bounded regularizable, then according to Theorem 4 one has a Stirling asymptotic for $\xi_D(\ell+1,z)$, and (10.1) with $a_i = \infty$ can be renormalized if one has a renormalization for every integral of the form $\int_\infty^z \lambda^{-q} B(\log\lambda)\,d\lambda$, $q \in \mathbb{C}$, $B(z) \in \mathbb{C}[z]$ (taking of course the value of the integral in case of absolute convergence).

In the sequel for $B(z) \in \mathbb{C}[z]$ and $q \in \mathbb{C}$ we define $B^{\{q-1\}}(z) \in \mathbb{C}[z]$ by

\[
\frac{d}{dz}\Bigl( z^{-(q-1)} B^{\{q-1\}}(\log z) \Bigr) = z^{-q} B(\log z), \qquad B^{\{0\}}(0) = 0.
\]

Thus the $z^{-(q-1)} B^{\{q-1\}}(\log z)$ are just those primitives of the $z^{-q}B(\log z)$ whose constant terms are zero.

Definition 10.1. A renormalization sequence $\omega$ is a sequence $(\omega_n)_{n=0,1,\dots}$ of complex numbers. For such a renormalization sequence $\omega$ and $D \in D_{breg}$, i.e. $D$ is a quasi-directed bounded regularizable divisor, the $\omega$-renormalized determinant D (z) of $D$ is defined by

\[
\log D(z) = \int_{\infty}^{z} \int_{\infty}^{\lambda_\ell} \cdots \int_{\infty}^{\lambda_1} (-1)^{\ell}\, \xi_D(\ell+1,\lambda_0)\, d\lambda_0\, d\lambda_1 \cdots d\lambda_\ell
\]

for $\ell \ge g$, integrating $(\ell+1)$ times using the Stirling asymptotic for $\xi_D(\ell+1,z)$ from Theorem 4 and following the renormalization rule

\[
\int_\infty^z \lambda^{-q} B(\log\lambda)\,d\lambda :=
\begin{cases}
z^{-(q-1)}\, B^{\{q-1\}}(\log z) & \text{for } q \ne 1,\\[2pt]
B^{\{0\}}(\log z) + \omega(B) & \text{for } q = 1,
\end{cases} \tag{10.2}
\]

with $\omega(B) := \sum_k \omega_k b_k$ for $B(z) = \sum_k b_k z^k$.
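The polynomials $B^{\{q-1\}}$ and the rule (10.2) can be computed coefficient by coefficient. The following is a minimal sketch under assumptions that are not in the paper: $q$ and the coefficients are restricted to rational numbers so that the arithmetic is exact, and the names `primitive_coeffs` and `renormalized_coeffs` are hypothetical. Writing $u = \log z$ and $C = B^{\{q-1\}}$, the defining equation becomes $C'(u) - (q-1)\,C(u) = B(u)$, which is solved from the top coefficient downwards for $q \ne 1$ and by direct integration with $C(0) = 0$ for $q = 1$.

```python
from fractions import Fraction


def primitive_coeffs(b, q):
    """Coefficients (lowest degree first) of C = B^{q-1}, where
    d/dz [ z**(-(q-1)) * C(log z) ] = z**(-q) * B(log z)."""
    b = [Fraction(x) for x in b]
    q = Fraction(q)
    if q == 1:
        # C' = B with C(0) = 0.
        return [Fraction(0)] + [bk / (k + 1) for k, bk in enumerate(b)]
    # C' - (q-1) C = B: compare coefficients of u**k, highest k first.
    c = [Fraction(0)] * len(b)
    higher = Fraction(0)                 # holds (k+1) * c[k+1]
    for k in range(len(b) - 1, -1, -1):
        c[k] = (higher - b[k]) / (q - 1)
        higher = k * c[k]
    return c


def renormalized_coeffs(b, q, omega):
    """Right-hand side of rule (10.2) as a polynomial in log z
    (to be multiplied by z**(-(q-1))); omega needs at least len(b) entries."""
    c = primitive_coeffs(b, q)
    if Fraction(q) == 1:
        c[0] += sum(Fraction(w) * Fraction(bk) for w, bk in zip(omega, b))
    return c


if __name__ == "__main__":
    # Integral of lambda**-2 * log(lambda): -(log z + 1) / z.
    print(primitive_coeffs([0, 1], 2))
    # Integral of lambda**-1 * log(lambda): (log z)**2 / 2 + omega_1.
    print(renormalized_coeffs([0, 1], 1, [5, 7]))
```

For $q \ne 1$ the renormalization sequence does not enter at all; it only fixes the otherwise undetermined constant of integration in the logarithmic case $q = 1$.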

One can easily prove that the definition of D (z) is independent of $\ell \ge g$ and of the lines of integration and that it indeed delivers a determinant system on $D_{breg}$. The Stirling asymptotic that determines log D (z) in the same way as (7.8) for the δ-regularized determinant is derived by integrating the Stirling asymptotic for $\xi_D(\ell+1,z)$ term by term following (10.2). The next theorem shows that renormalization and regularization are in fact essentially the same.


Theorem 7. There is a bijection between the set of regularization sequences δ and the set of renormalization sequences ω such that the δ-regularized determinant and the ω-renormalized determinant deliver the same determinant system on Dbreg . The ω0 -renormalized determinant with ωn0 := 0 for all n ∈ N0 delivers the zerorenormalization as defined in Example 3 in Sect. 3. Proof. By Theorem 5 and the properties of the map [q], in particular, (7.7) and the fact that Stirling asymptotics for log(m) f (z) can be differentiated term by term (Remark 2 in Sect. 7) one easily sees that it is sufficient to observe the following: 1. δ(s) = "(1 − s) is a regularization sequence with B [0] (0) = 0 for all B(z) ∈ C[z]. This follows from (7.5) as then one has CTs=0 (" (k) (s)) = (−1)k+1 k!δk for all k ∈ N0 . 2. Let >1 be the C-vector space of all renormalization sequences and >2 that of all regularization sequences. Define ? to be the C-vector space of all C-linear maps from C[z] to C. Then regard the maps α1 : >1 −→ ?, ω −→ (B → ω(B)), α2 : >2 −→ ?, δ −→ (B → (B [0]δ − B [0]0 )), where in the latter definition [q]δ means the map [q] for the regularization sequence δ while [q]0 means [q] for the special regularization sequence of zero renormalization (δ(s) = "(1 − s)). These maps are obviously isomorphisms and α1−1 ◦ α2 is the demanded isomorphism between >2 and >1 . 11. Proof of Theorem 2b) Given a system of relations

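\[
\sum_k D_1\big|_{+\alpha^{(i)}_{1,k}} + \dots + \sum_k D_n\big|_{+\alpha^{(i)}_{n,k}} = D^{(i)}, \qquad i \in I, \tag{11.1}
\]

with $D^{(i)} \in \mathcal{D}_{reg}$, $D_1, \dots, D_n \in \mathcal{D}_{qd} \setminus \mathcal{D}_{reg}$, $\alpha^{(i)}_{m,k} \in \mathbb{C}$ for $i \in I$, $m = 1, \dots, n$ and with finite sums over the index $k$. We first regard logarithms of associated functions for large real $z$. We choose the logarithms of the regularized determinants $\log D^{(i)}(z) := -\mathrm{CT}_{s=0}\bigl(\delta(s)\,\xi_{D^{(i)}}(s,z)\bigr)$ and logarithms $\log \mathrm{in}_{D_m}(z)$ of certain associated functions $\mathrm{in}_{D_m}(z)$. We search for polynomials

\[
P_m(z) = \sum_{l=0}^{g_m} (-1)^{l+1}\, \frac{x_{m,l}}{l!}\, z^l, \qquad m = 1, \dots, n
\]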

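such that $\log D_m(z) = P_m(z) + \log \mathrm{in}_{D_m}(z)$ is consistent with i) and ii) of Definition 4.1 under (11.1). With the polynomials

\[
P_{(i)}(z) := \log D^{(i)}(z) - \sum_k \log \mathrm{in}_{D_1}\bigl(z - \alpha^{(i)}_{1,k}\bigr) - \dots - \sum_k \log \mathrm{in}_{D_n}\bigl(z - \alpha^{(i)}_{n,k}\bigr) \tag{11.2}
\]

this is equivalent to

\[
P_{(i)}(z) = \sum_k P_1\bigl(z - \alpha^{(i)}_{1,k}\bigr) + \dots + \sum_k P_n\bigl(z - \alpha^{(i)}_{n,k}\bigr), \qquad i \in I, \tag{11.3}
\]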

and this is equivalent to a system of linear equations for the xm,l . Now by Zorn’s lemma a system of linear equations has a solution if every finite subsystem has one. Thus it suffices to prove that there is always a solution if |I | < ∞. So wlog we may assume that δ(s) is a polynomial (as the finitely many log D (i) (z) depend only on finitely many δn ). And after a trivial translation we also assume that all (i) divisors are directed and that there is an α > 0 such that |z| < α implies |z − αm,k | < |ρ| for all ρ that occur in the D (i) , Dm ; we always assume |z| < α. If we choose log inm (z) = Wei,g log Dm ,0 m (z) (compare Eq. (3.6)), then we have with the coefficients given in the proof of Theorem 1 b) and using Proposition 3.3: (i) log inDm z − αm,k (i) l gm z − αm,k (i) − CTs=0 δ(s) ξDm s, z − αm,k − ξDm (s + l) (−1)l l! l=0

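and thus by Eq. (11.2)

\[
P_{(i)}(z) = -\mathrm{CT}_{s=0}\Bigl( \delta(s) \Bigl[ \sum_k \sum_{l=0}^{g_1} (-1)^l\, \frac{\bigl(z - \alpha^{(i)}_{1,k}\bigr)^l}{l!}\, \xi_{D_1}(s+l) + \dots + \sum_k \sum_{l=0}^{g_n} (-1)^l\, \frac{\bigl(z - \alpha^{(i)}_{n,k}\bigr)^l}{l!}\, \xi_{D_n}(s+l) \Bigr] \Bigr), \qquad i \in I \tag{11.4}
\]

(as $\sum_k \xi_{D_1}\bigl(s, z - \alpha^{(i)}_{1,k}\bigr) + \dots + \sum_k \xi_{D_n}\bigl(s, z - \alpha^{(i)}_{n,k}\bigr) = \xi_{D^{(i)}}(s,z)$). Comparison of (11.4) and (11.3) leads one to introduce the functions

\[
\begin{aligned}
x_{m,l}(s) &:= \delta(s)\,\xi_{D_m}(s+l),\\
P_m(s,z) &:= \sum_{l=0}^{g_m} (-1)^{l+1}\, \frac{x_{m,l}(s)}{l!}\, z^l,\\
P_{(i)}(s,z) &:= \sum_k P_1\bigl(s, z - \alpha^{(i)}_{1,k}\bigr) + \dots + \sum_k P_n\bigl(s, z - \alpha^{(i)}_{n,k}\bigr),
\end{aligned} \tag{11.5}
\]

where the $x_{m,l}(s)$ and the $P_m(s,z)$ are all holomorphic for $\Re(s) > r_{max} := \max r_m$ while $P_{(i)}(s,z)$ is meromorphic for $\Re(s) > 0$ with

\[
\mathrm{CT}_{s=0}\bigl(P_{(i)}(s,z)\bigr) = P_{(i)}(z), \qquad i \in I, \tag{11.6}
\]

for $|z| < \alpha$ as is seen from Eq. (11.4). We now expand (11.5) and (11.3) by powers of $z$:

\[
P_{(i)}(s,z) = \sum_{l=0}^{g_{max}} p_{(i),l}(s)\, z^l
\]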

and $P_{(i)}(z) = \sum_{l=0}^{g_{max}} p_{(i),l}\, z^l$. Regard the $p_{(i),l}(s)$ and correspondingly the $p_{(i),l}$ as the components of vectors $p(s), p \in \mathbb{C}^M$, and regard the $x_{m,l}(s)$ and $x_{m,l}$ as the components of vectors $x(s), x \in \mathbb{C}^N$. The expansion of (11.3) and (11.5) by powers of $z$ delivers a matrix $B \in \mathrm{Mat}(N \times M, \mathbb{C})$ such that

\[
p(s) = B \cdot x(s) \qquad \text{for } \Re(s) > r_{max}, \tag{11.7}
\]

and it has to be shown that there is an $x \in \mathbb{C}^N$ such that $p = B \cdot x$. But there is a matrix $\hat{B} \in \mathrm{Mat}(M \times N, \mathbb{C})$ such that a solution exists if and only if $\hat{B} \cdot p = 0$. With this $\hat{B}$ one has

\[
\hat{B} \cdot p = \hat{B} \cdot \mathrm{CT}_{s=0}(p(s)) = \mathrm{CT}_{s=0}(\hat{B} \cdot p(s)) = 0,
\]

where the first equality is obtained from (11.6) and the last from (11.7).
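The solvability criterion used in the last step can be made concrete. The sketch below is an illustration only, with hypothetical names `left_null_basis` and `is_solvable`: it assumes exact rational entries, takes $B$ as a list of rows with as many rows as $p$ has components, and builds one possible choice of $\hat{B}$ whose rows span $\{y : y^{T}B = 0\}$, so that $p = B\cdot x$ has a solution exactly when $\hat{B}\cdot p = 0$.

```python
from fractions import Fraction


def left_null_basis(B):
    """Rows y with y^T B = 0, found by row-reducing the augmented matrix [B | I]."""
    rows, cols = len(B), len(B[0])
    aug = [[Fraction(v) for v in B[r]] + [Fraction(int(r == c)) for c in range(rows)]
           for r in range(rows)]
    pivot_row = 0
    for col in range(cols):
        pr = next((r for r in range(pivot_row, rows) if aug[r][col] != 0), None)
        if pr is None:
            continue
        aug[pivot_row], aug[pr] = aug[pr], aug[pivot_row]
        piv = aug[pivot_row][col]
        aug[pivot_row] = [v / piv for v in aug[pivot_row]]
        for r in range(rows):
            if r != pivot_row and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[pivot_row])]
        pivot_row += 1
    # Rows whose B-part was eliminated record the combinations y with y^T B = 0.
    return [row[cols:] for row in aug[pivot_row:]]


def is_solvable(B, p):
    """True iff p = B x has a solution, i.e. every left null vector annihilates p."""
    return all(sum(y_i * Fraction(p_i) for y_i, p_i in zip(y, p)) == 0
               for y in left_null_basis(B))


if __name__ == "__main__":
    B = [[1, 0], [0, 1], [1, 1]]
    print(left_null_basis(B))          # one relation: row3 = row1 + row2
    print(is_solvable(B, [2, 3, 5]))   # consistent right-hand side
    print(is_solvable(B, [2, 3, 6]))   # inconsistent right-hand side
```

In the proof the same relations annihilate $p(s) = B\cdot x(s)$ identically for $\Re(s) > r_{max}$, and taking the constant term at $s = 0$ preserves linear relations, which is why $\hat{B}\cdot p = 0$ follows.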

Remark. Observe that in the proof the operation CTs=0 is applied to functions f (s) = f1 (s) + f2 (s) with f1 (s) being meromorphic around s = 0 and f2 (s) defined only for (s) > 0 but continuous at s = 0.

12. Proof of Theorem 4 The following estimate is needed to apply majorized convergence to integrals over ξD (s, z). Lemma 12.1. Let 0 < ϕi < ϕi < π , i + 1, 2 and let D be a directed divisor in Wlϕ1 ,ϕ2 and given r ≥ r such that c := ρ∈D |mD (ρ)ρk−r | < ∞. Then for (s0 ) > r, mD (ρ) r −(s )

0 (z − ρ)s0 = O |z|

(12.1)

ρ∈D

for z ∈ Wrϕ1 ,ϕ2 and |z| → ∞. Proof. We split the series in

|ρ|< 21 |z| and

|ρ|≥ 21 |z| and treat these two series separately. 1 r |ρ|<x |mD (ρ)| ≤ cx for any 2 |z| and use

For the first series we estimate |z − ρ| > x > 0 which follows immediately from the definition of c. This last inequality on the other hand implies x1 ≤|ρ|<x2

x2 mD (ρ) 1 ≤ cx r 1 + cr y r −1 α dy 1 α ρα x1 y x1

(12.2)

x for all α ∈ R>0 and 0 < x1 < x2 (as the rhs obviously maximizes x12 y −α dµ(y) under x the condition x1 dµ(y) ≤ cx r for all x ∈ [x1 , x2 ]). Observing that there is a β > 0 such that |z − ρ| > β|ρ| for all ρ ∈ D and z ∈ Wrϕ1 ,ϕ2 and using (12.2) we get the estimate also for the second series.


In the sequel we often tacitly use the following estimate: For $0 \le \varphi < \frac{\pi}{2}$, $\alpha_1 < \alpha_2$ and $m \in \mathbb{N}_0$,

\[
\Gamma^{(m)}(s) = O\bigl(e^{-\varphi|\Im(s)|}\bigr), \qquad |\Im(s)| \to \infty, \tag{12.3}
\]

for $\alpha_1 < \Re(s) < \alpha_2$. For $m = 0$ this is part of the Stirling formula, for $m > 0$ it follows by applying Cauchy's inequalities.

Proof of Proposition 7.1. By majorized convergence (for b) apply the above Lemma) the two integral representations have to be proved only for one-point-divisors. In the sequel for expressions $a^s$ we always use $\arg(a) \in\, ]-\pi, \pi[$.

a) Let $\Re(s) > c > 0$ and $\Re(\rho) < 0$, $\Re(z) > 0$, then by Euler's Mellin integral for $\Gamma(s)$ and its inversion one has

\[
\begin{aligned}
\frac{\Gamma(s)}{(z-\rho)^s} &= \int_0^\infty e^{-zt}\, t^{s-1}\, e^{\rho t}\, dt\\
&= \int_0^\infty e^{-zt}\, t^{s-1}\, \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \frac{\Gamma(s')}{(-\rho)^{s'}}\, t^{-s'}\, ds'\, dt\\
&= \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \frac{\Gamma(s-s')}{z^{s-s'}}\, \frac{\Gamma(s')}{(-\rho)^{s'}}\, ds',
\end{aligned}
\]

the last equation by interchanging the integrations (Fubini). Using the identity theorem one gets this formula as needed for $\rho \in Wl_{\varphi_1,\varphi_2}$ and $z \in Wr_{\varphi_1,\varphi_2}$ because both sides are holomorphic in the variables $\rho$ and $z$.

b) For $\rho \in Wl_{\varphi_1,\varphi_2}$, $z \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s) > 0$ we will prove

\[
\frac{\Gamma(s)}{(z-\rho)^s} = \frac{1}{2\pi i} \int_C \frac{\Gamma(s-s_0+1)}{(z-w)^{s-s_0+1}}\, \frac{\Gamma(s_0)}{(w-\rho)^{s_0}}\, dw. \tag{12.4}
\]

For $z_0 \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s_0) < 1$ one has

\[
\begin{aligned}
\frac{1}{2\pi i} \int_C \frac{1}{(z_0-w)^{s-s_0+1}\, w^{s_0}}\, dw
&= z_0^{-s}\, \frac{e^{i\pi s_0} - e^{-i\pi s_0}}{2\pi i} \int_0^\infty \frac{v^{-s_0}}{(1+v)^{s-s_0+1}}\, dv\\
&= z_0^{-s}\, \frac{\sin \pi s_0}{\pi}\, \frac{\Gamma(1-s_0)\,\Gamma(s)}{\Gamma(s-s_0+1)}\\
&= \frac{1}{z_0^{s}}\, \frac{\Gamma(s)}{\Gamma(s_0)\,\Gamma(s-s_0+1)},
\end{aligned}
\]

the first equation by substituting $w \to -z_0 v$ and deforming the contour (residue theorem), the second because of the representation 1.5 (2) in [EMOT] for Euler's beta function $B(u,v) = \Gamma(u)\Gamma(v)\Gamma^{-1}(u+v)$ and the third because of the equation $\Gamma(1-s_0)\Gamma(s_0) = \pi \sin^{-1} \pi s_0$. Now for $\rho \in Wl_{\varphi_1,\varphi_2}$ such that $z = z_0 + \rho \in Wr_{\varphi_1,\varphi_2}$, replace the contour $C$ by the shifted contour $C - \rho$. The value of this integral is independent of $\rho$ (residue theorem) and (by majorized convergence for $\rho \to 0$) equals the value of the above integral. Applying the substitution $w \to (w - \rho)$ and the identity theorem yields (12.4) in the demanded generality.
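The two special-function identities used in part b) can be checked numerically for real parameters. This is a minimal sketch under stated assumptions and not part of the proof: it takes real $s_0 < 1$ and $s > 0$, maps the half-line integral to $[0,1]$ by the substitution $v = u/(1-u)$ (which turns it into $\int_0^1 u^{-s_0}(1-u)^{s-1}\,du$), and uses a plain midpoint rule; `beta_integral` is a hypothetical helper name.

```python
import math


def beta_integral(s0, s, n=20000):
    """Midpoint rule for int_0^oo v**(-s0) / (1+v)**(s-s0+1) dv,
    rewritten via v = u/(1-u) as int_0^1 u**(-s0) * (1-u)**(s-1) du."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += u ** (-s0) * (1.0 - u) ** (s - 1.0)
    return total * h


if __name__ == "__main__":
    s0, s = -0.5, 2.5          # sample real values with s0 < 1 and s > 0
    numeric = beta_integral(s0, s)
    # Second equality: Euler's beta function B(1 - s0, s).
    beta_form = math.gamma(1 - s0) * math.gamma(s) / math.gamma(s - s0 + 1)
    # Third equality: reflection formula Gamma(1 - s0) * Gamma(s0) = pi / sin(pi * s0).
    lhs = math.sin(math.pi * s0) / math.pi * beta_form
    rhs = math.gamma(s) / (math.gamma(s0) * math.gamma(s - s0 + 1))
    print(numeric, beta_form)
    print(lhs, rhs)
```

The sample values were chosen so that the transformed integrand is bounded on $[0,1]$; for parameters closer to the endpoints of the admissible range a quadrature rule adapted to the endpoint singularities would be needed.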


Proof of Theorem 4. A) ⇒ B’). Let −p < −q < r < c with (pn ) = q for all n. Then by the residue theorem (7.1) for (s) > c and z ∈ Wrϕ1 ,ϕ2 becomes −q+i∞ "(s − s ) 1 ξD (s, z) = ξD (s )ds 2π i −q−i∞ zs−s "(s − s ) + Ress =−pn ξD (s ) zs−s (pn ) −q as is seen by the identity theorem. From this B’) easily follows for (s0 ) > −p. If −p < (s0 ) ≤ −p then first take the Stirling asymptotic for s0 = s0 + k with k ∈ N such that −p < (s0 ) ≤ −p + 1 and integrate k times. B’) ⇒ A). First note that we just need to show that ξD (s) is (σ1 , σ2 )-bounded pregular, but we do not need to determine the singular part distribution as then because of A) ⇒ B’) and the properties of the map [q] it must be the demanded one. Let now 0 < σi < σi < σ . We have for (s) > r and C = Cσ1 ,σ2 , 1 "(s − s0 + 1) CTs1 =0 (δ(s1 )ξD (s1 + s0 , w)) dw, (12.5) ξD (s, z) = 2πi C (z − w)s−s0 +1 which is obtained by applying partial integration to (7.2) using (3.2) where the necessary estimates for |w| → ∞ are derived by integrating (12.1). Now as (12.5) is not valid for z = 0 one has to use a little trick: One deforms the contour C and uses a "shifted" Stirling asymptotics. Let ε > 0 then by Taylor series expansion one obtains a pB-systems n ) with abscissa p such that ( pn , B q (z) := CTs1 =0 (δ(s1 )ξD (s1 + s0 , z)) R B n (log(z + ε)) s0 (z) − −P (z + ε)pn +s0 ( pn ) max(r, −(p0 ), deg P 1 Bn (log(w + ε)) "(s − s0 + 1) ξD (s) = dw 2π i C (w + ε)pn +s0 (−w)s−s0 +1 ( pn ) 0, e xp(ρ, t, s) − e2πi(s−1) e xp(ρ, exp(2π i)t, s) i(α+δ) ei(α−δ) ∞ wt e ∞ t −(s−1) e = − dw "(s)eρt 2πi ws 0 0 ! " (e−π i(s−1) −eπ i(s−1) )t s−1 "(1−s)

=

1 "(s)"(1 − s)2i sin(π s)eρt = eρt , 2πi

thus (13.1), by the identity theorem also in general. It remains to prove (13.3). We assume arg(ρt) ∈] − ε , ε [ for 0 < ε < π2 , the general case follows then by rotating the ray of integration, ∞ (−ρt)w "(s) e e xp(ρ, t, s) = dw, (−ρt)−(s−1) 2π i (w + 1)s 0


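which immediately gives the estimate $|\widetilde{\exp}(\rho,t,s)| < c_1 |\rho t|^{-(\Re(s)-1)}$ for $|\rho t| \ge 1$ for a suitable $c_1 > 0$ and thus (13.3) by (13.2). For $|\rho t| \le 1$ on the other hand with $0 < \alpha := \Re(\rho t) \le 1$ and the trivial estimate $e^{-x} \le \frac{1}{x+\alpha}$ for $x \in \mathbb{R}_{\ge 0}$ one has

\[
\int_0^\infty \frac{e^{-\Re(\rho t)w}}{|(w+1)^s|}\, dw = \alpha^{\Re(s)-1} \int_0^\infty \frac{e^{-x}}{(x+\alpha)^{\Re(s)}}\, dx \le \alpha^{\Re(s)-1} \int_0^\infty \frac{dx}{(x+\alpha)^{\Re(s)+1}},
\]

and the assertion easily follows also for $|\rho t| \le 1$.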

14. Miscellanea

In Chapter 2 of [Il1] the formalism of regularized determinants was developed more generally: Following Jorgenson and Lang ([JL1]), divisors with non-integer multiplicities $m_D : \mathbb{C} \to \mathbb{C}$ (instead of $\mathbb{C} \to \mathbb{Z}$) were regarded; then everything can be carried out with almost no difficulties, except that the associated functions become multivalued with the $\rho \in D$ as branch points. Also essential singularities for $\xi_D(s)$ were allowed. In that case it is necessary that the formal power series $\delta(s)$ is convergent near zero. With this assumption almost everything can be done in general although some not completely trivial convergence problems occur.

The maps $[q]$ and $[[q]]$ defined in Sects. 7 and 8 are special cases of the following construction: For $q \in \mathbb{C}$, a regularization sequence $\delta$ and a function $h(s)$ which is meromorphic in a neighborhood of $q$ we define a linear map $[h,q] : \mathbb{C}[z] \to \mathbb{C}[z]$ (notation: $B(z) \to B^{[h,q]}(z)$) by

\[
\mathrm{CT}_{s=0}\bigl(\delta(s)\,B(\partial_s)[h(s)\,z^{s+q}]\bigr) = B^{[h,q]}(\log z)\,z^q.
\]

If $h_1(s)$ and $h_2(s)$ are two such functions, then if $h_1(s)$ is, in addition, holomorphic at $s = q$ the composition law $[h_2,q] \circ [h_1,q] = [h_1 \cdot h_2, q]$ is easily checked. For example this implies $[[q]] = -\frac{1}{2\pi i}[1-q] \circ [q]$ for q = −n, n ∈ N0. Also a sort of inverse of $[q]$ can be defined (compare Satz 2.3.6 in [Il1]).

Acknowledgements. I would like to thank C. Deninger for supervising my Ph.D. thesis as well as M. Schröter, I. Vardi, C. Bree, C. Soulé, A. Voros, J. B. Bost and J. Jorgenson for helpful discussions and improvements. Parts of the article were written during a visit at the IHES.

References

[Al] Almquist, G.: Asymptotic Formulas and Generalized Dedekind Sums. Exp. Math. 7, 343–359 (1998)
[Bar] Barnes, E.W.: On the Theory of the Multiple Gamma Function. Phil. Trans. of the Royal Soc. (A) 19, 374–439 (1904)
[Cr] Cramér, H.: Studien über die Nullstellen der Riemannschen Zetafunktion. Math. Zeitschrift 4, 104–130 (1919)
[CV] Cartier, P., Voros, A.: Une nouvelle interpretation de la formule des traces de Selberg. In: The Grothendieck Festschrift, Vol. 2, Basel–Boston: Birkhäuser, 1991, pp. 1–67
[De1] Deninger, C.: Motivic L-functions and regularized determinants. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 707–743
[De2] Deninger, C.: Motivic L-functions and regularized determinants II. In: F. Catanese (ed.) Proc. Arithmetic Geometry, Cortona, 1994
[De3] Deninger, C.: Some Analogies between Number Theory and dynamical Systems on foliated Spaces. Documenta Mathematica, extra vol. ICM 1998, I, Plenary Talks, pp. 23–46
[Do] Doetsch, G.: Handbuch der Laplacetransformation I/II, Basel: Birkhäuser, 1950/1955
[Ef] Efrat, L.: Determinants of Laplacians on surfaces of finite volume. Commun. Math. Phys. 119, 443–451 (1988); Erratum. Commun. Math. Phys. 138, 607 (1991)
[EMOT] Erdelyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher transcendental functions I, II, III. New York: McGraw-Hill, 1953
[EORBZ] Elizalde, E., Odintsov, S.D., Romeo, A., Bytsenko, A.A., Zerbini, S.: Zeta regularization techniques with applications. Singapore: World Scientific, 1994
[Gui] Guinand, A.D.: Fourier reciprocities and the Riemann zeta function. Proc. London Math. Soc. (2) 51, 401–414 (1950)
[Il1] Illies, G.: Regularized products, trace formulas and Cramér functions. Ph.D.-thesis (in German), Schriftenreihe des mathematischen Instituts der Universität Münster, 3. Serie, Heft 22, 1998
[Il2] Illies, G.: Cramér functions and Guinand equations. IHES-preprint 1999
[JL1] Jorgenson, J., Lang, S.: Basic Analysis of regularized series and products. LNM 1564, Berlin: Springer, 1994
[JL2] Jorgenson, J., Lang, S.: On Cramér's theorem for general Euler products with functional equation. Math. Ann. 297/3, 383–416 (1993)
[JL3] Jorgenson, J., Lang, S.: Extension of analytic number theory and the theory of regularized harmonic series from Dirichlet series to Bessel series. Math. Ann. 306, 75–124 (1996)
[Ko1] Koyama, S.Y.: Determinant expressions of Selberg zeta functions I. Trans. AMS 324, 149–168 (1991)
[Ko2] Koyama, S.Y.: Determinant expressions of Selberg zeta functions II. Trans. AMS 329, 755–772 (1992)
[Ko3] Koyama, S.Y.: Determinant expressions of Selberg zeta functions III. Proc. AMS 113, 303–311 (1991)
[Ku] Kurokawa, N.: Multiple sine functions and Selberg zeta functions. Proc. Japan Acad. 67A, 61–64 (1991)
[LW] Landau, E., Walfisz, A.: Über die Nichtfortsetzbarkeit einiger durch Dirichletsche Reihen definierter Funktionen. Rend. di Palermo 44, 82–86 (1919)
[Ma] Manin, Y.I.: Lectures on zeta functions and motives. Preprint MPI Bonn, 1992
[No] Norlund, N.E.: Memoire sur les polynomes de Bernoulli. Acta Mathematica 43, 121–196 (1920)
[QHS] Quine, J.R., Heydari, S.H., Song, R.Y.: Zeta-regularized products. Trans. of the AMS 338, 1, 213–231 (1993)
[RS] Ray, D., Singer, I.: Analytic torsion for analytic manifolds. Ann. Math. 98, 154–177 (1973)
[Sa] Sarnak, P.: Determinants of Laplacians. Commun. Math. Phys. 110, 113–120 (1987)
[ScSo] Schröter, M., Soulé, C.: On a Result of Deninger Concerning Riemann's Zeta Function. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 745–747
[So] Soulé, C.: Letter to C. Deninger, 13.2.1991, as: M. Schröter, S. Soulé: On a result of Deninger concerning Riemann's zeta function. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 745–747
[Ti] Titchmarsh, E.C.: The Theory of Functions. 2nd ed., Oxford: Oxford University Press, 1939
[Va] Vardi, I.: Determinants of Laplacians and multiple Gamma Functions. SIAM J. Math. Anal. 19, 1, 493–507 (1988)
[Vi] Vigneras, M.F.: L'equation fonctionelle de la fonction zeta de Selberg du groupe modulaire SL(2, Z). Asterisque 61, 235–249 (1979)
[Vo] Voros, A.: Spectral Functions, Special Functions and the Selberg Zeta Function. Commun. Math. Phys. 110, 439–465 (1987)

Communicated by P. Sarnak

Commun. Math. Phys. 220, 95 – 104 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Super Brockett Equations: A Graded Gradient Integrable System

R. Felipe1, F. Ongay2

1 ICIMAF, Havana, Cuba, and Universidad de Antioquia, Medellín, Colombia
2 CIMAT, Guanajuato, Mexico. E-mail: [email protected]

Received: 9 February 2000 / Accepted: 18 January 2001

Abstract: Rather recently equations of Lax type defined by a double commutator, the so-called Brockett equations, have received considerable attention. In this paper we prove that a supersymmetric version of a Brockett hierarchy is an infinite dimensional integrable gradient system. As far as we know, this is the only graded system of this type existing in the literature.

0. Introduction

Ever since the discovery in 1968 by Gardner, Green, Kruskal and Miura of the inverse scattering method to solve the KdV equations, the theory of infinite dimensional integrable systems, sometimes also known as the theory of soliton equations, has been the subject of a great deal of work, and many results and applications have stemmed from this newfound attention to the subject. As is well known, one of the first major developments came with the realization that these systems can be put in the so-called Lax form, $\dot{L} = [L, N]$, since this description is particularly well suited to stress some of the geometrical interpretations of the equations, in particular allowing one to place them into a Hamiltonian framework.

On the other hand, some ten years ago, ODEs of Lax type defined by more than one Lie bracket were introduced by R. Brockett (see [B1] and [B2]), in connection with some least squares matching and sorting problems. Surprisingly enough, these so-called Brockett systems exhibit many remarkable features besides the original intended ones: to name one, it was discovered by A. Bloch, R. Brockett and T. Ratiu (see e.g. [B-B-R]) that the equations corresponding to the celebrated Toda lattice can be cast into this mold. But moreover, another property of these equations, still more relevant to our purposes, was also proved in [B-B-R], where it was shown that these finite dimensional systems are completely integrable, but of gradient type (the existence of a suitable Hamiltonian structure remaining an open question).

Quite recently, the theory of Brockett equations was adapted by one of us for PDEs (reference [F]), and it was proved that many important properties, such as the complete integrability and the property of being a gradient system, were still valid in this infinite dimensional context, but also that this analog of the Brockett equation belongs to a hierarchy, similar to the well known KdV or KP hierarchies.

In this work we consider yet another extension of the Brockett system: Following the approach to supersymmetric (i.e., $\mathbb{Z}_2$-graded) versions of the KP hierarchy, studied for example by Manin and Radul ([M-R]), Mulase ([Mu2]), or Rabin ([R]), we define and study a supersymmetric extension of the Brockett hierarchy introduced in [F]. In particular, our main results will show that the properties of being completely integrable and a gradient flow also extend to this graded hierarchy; to the best of our knowledge, this is the first example of a graded system possessing these properties. Furthermore, the flows associated to this new hierarchy naturally "live" on a flag in the space of gauge operators, and we conjecture that this geometric feature of our construction might be of some use in the algebro-geometric study of deformations of line bundles over algebraic curves, both in the classical and graded case.

Partially supported by CONACYT, Mexico, project 28-492E and CODI project "Complete integrability of Brockett type equations", University of Antioquia, Colombia.

1. A Z2-Graded Brockett Hierarchy

We will consider in this work a rather standard (1, 1) dimensional setting, namely, the one studied by Manin and Radul, which we now briefly recall, referring the reader to the basic reference [M-R] for more details (see also [Mu2]): First of all, let $x$ denote an even variable, $\xi$ an odd one (the parity of an object will be denoted by a tilde, so that for instance $\tilde{x} = 0$; $\tilde{\xi} = 1$), and fix some ring of "superfunctions" in these variables (for instance, we may take the ring of formal power series in $x$ and $\xi$), $B$, where the operator $\theta = \partial_\xi + \xi\partial_x$ acts as an odd derivation (recall that $\theta^2 = \partial_x$). Then one considers the ring of (formal) super pseudo-differential operators, $B((\theta^{-1}))$, with coefficients in $B$. To avoid confusion with the action of the derivations on the operators, the product in this ring will be denoted by $\circ$, and by $\theta^{-1}$ we will denote the (formal) inverse of $\theta$. Thus, every operator $L \in B((\theta^{-1}))$ can be written as a formal series

\[
L = \sum_{i \le m} b_i\,\theta^i,
\]

and, as usual, we will write

\[
L_+ = \sum_{i \ge 0} b_i\,\theta^i; \qquad L_- = \sum_{i < 0} b_i\,\theta^i, \tag{1}
\]
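The identity $\theta^2 = \partial_x$ recalled above can be checked on superfunctions $f = f_0(x) + \xi f_1(x)$. The following is a minimal sketch and not part of the paper: it assumes the standard Manin-Radul form $\theta = \partial_\xi + \xi\,\partial_x$, represents $f_0, f_1$ as polynomial coefficient lists (lowest degree first), and the names `d_dx` and `theta` are hypothetical. Using $\xi^2 = 0$ one finds $\theta f = f_1 + \xi f_0'$.

```python
def d_dx(poly):
    """Derivative of a polynomial given as a coefficient list [a0, a1, ...]."""
    return [k * a for k, a in enumerate(poly)][1:] or [0]


def theta(f):
    """Apply theta = d/dxi + xi * d/dx to f = (f0, f1), meaning f0(x) + xi * f1(x).

    d/dxi removes the factor xi from the odd part, while xi * d/dx differentiates
    the even part and moves it to the odd part; xi * xi = 0 kills the remaining term."""
    f0, f1 = f
    return (f1, d_dx(f0))


if __name__ == "__main__":
    f = ([1, 2, 3], [0, 4])            # f = 1 + 2x + 3x^2 + xi * 4x
    print(theta(f))                    # 4x + xi * (2 + 6x)
    print(theta(theta(f)))             # (2 + 6x) + xi * 4, i.e. d/dx applied to f
```

Applying `theta` twice differentiates both components with respect to $x$, which is the statement $\theta^2 = \partial_x$ on this class of functions.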

i 0, Eq. (7) gives the flow of the gradient of the graded Adler functional Fk (S) on the affine subspace 1 + E (−k−2) . Proof. Indeed, to end the proof of our claim, it remains only to observe that, from Lemma 2, we have θk S −1 = −S −1 ◦ θk S ◦ S −1 = (−1)k+1 S −1 ◦ [, k+1 − ]. Therefore, modulo an inessential sign, the right-hand side of (15) is in fact equivalent to the right-hand side of (7), which we have already shown to be equivalent to the super Brockett system. Remark. The graded hierarchy that we have constructed in this paper preserves, and in a definite sense generalizes, several of the remarkable features of the standard Brockett equation. But moreover, we have also seen that these super Brockett equations will induce a flow on an infinite Grassmannian, of a different type to that given by the known super KP flows. We conjecture, therefore, that this hierarchy might also be of value, for instance, for the algebro-geometric study of deformations of superline bundles over supercurves, etc. (and it is clear that this remark also applies to the non-graded case; see also [F]). We hope to clarify some of these questions in a future work. Acknowledgements. Both authors wish to express their indebtedness to Prof. J. Rabin, who patiently listened to our expositions of a preliminary version of this work, and made several valuable comments. The bulk of this paper was done during reciprocal visits by each author to his coauthor’s respective institution; both of us thankfully acknowledge their hospitality during these stays. Finally, we are grateful to one of the referees, who pointed out an error in the original manuscript.

References

[B-B-R] Bloch, A.M., Brockett, R.W., and Ratiu, T.S.: Completely integrable gradient flows. Commun. Math. Phys. 147, 57–54 (1992)
[B1] Brockett, R.W.: Least squares matching problems. Linear Algebra Appl. 122, 761–777 (1989)
[B2] Brockett, R.W.: Dynamical systems that sort lists, diagonalize matrices, and solve linear programming problems. Linear Algebra Appl. 146, 79–91 (1991)
[D] Dickey, L.A.: Soliton equations and Hamiltonian systems. Advanced Series in Math. Phys. 12, Singapore: World Scientific, 1991
[F] Felipe, R.: Algebraic aspects of Brockett type equations. Physica D 132, 287–297 (1999)
[M-R] Manin, Yu.I., and Radul, O.A.: A supersymmetric extension of the Kadomtsev–Petviashvili hierarchy. Commun. Math. Phys. 98, 65–77 (1985)
[Mu1] Mulase, M.: Complete integrability of the Kadomtsev–Petviashvili equation. Adv. Math. 54, 57–66 (1984)
[Mu2] Mulase, M.: A new super KP system and a characterization of the Jacobians of arbitrary algebraic supercurves. J. Diff. Geom. 34, 651–680 (1991)
[R] Rabin, J. M.: The geometry of super KP flows. Commun. Math. Phys. 137, 533–552 (1991)

Communicated by T. Miwa

Commun. Math. Phys. 220, 105 – 164 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Fermionic Formulas for Level-Restricted Generalized Kostka Polynomials and Coset Branching Functions

Anne Schilling1, Mark Shimozono2

1 Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA. E-mail: [email protected]
2 Department of Mathematics, Virginia Tech, Blacksburg, VA 24061-0123, USA. E-mail: [email protected]

Received: 9 April 2000 / Accepted: 26 January 2001

Abstract: Level-restricted paths play an important rôle in crystal theory. They correspond to certain highest weight vectors of modules of quantum affine algebras. We show that the recently established bijection between Littlewood–Richardson tableaux and rigged configurations is well-behaved with respect to level-restriction and give an explicit characterization of level-restricted rigged configurations. As a consequence a new general fermionic formula for the level-restricted generalized Kostka polynomial is obtained. Some coset branching functions of type A are computed by taking limits of these fermionic formulas.

1. Introduction

Generalized Kostka polynomials [26, 33, 35–38] are q-analogues of the tensor product multiplicity

$$c_R^\lambda = \dim \mathrm{Hom}_{sl_n}(V^\lambda, V^{R_1} \otimes \cdots \otimes V^{R_L}),$$ (1.1)

where $\lambda$ is a partition, $R = (R_1, \ldots, R_L)$ is a sequence of rectangles and $V^\lambda$ is the irreducible integrable highest weight module of highest weight $\lambda$ over the quantized enveloping algebra $U_q(sl_n)$. The generalized Kostka polynomials can be expressed as generating functions of classically restricted paths [30, 33, 37]. In terms of the theory of $U_q(sl_n)$-crystals [16, 17] these paths correspond to the highest weight vectors of tensor products of perfect crystals. The statistic is given by the energy function on paths.

The $U_q(sl_n)$-crystal structure on paths can be extended to a $U_q(\widehat{sl}_n)$-crystal structure [18]. The level-restricted paths are the subset of classically restricted paths which, after tensoring with the crystal graph of a suitable integrable highest weight $U_q(\widehat{sl}_n)$-module, are affine highest weight vectors. Hence it is natural to consider the generating functions of level-restricted paths, giving rise to level-restricted generalized Kostka polynomials which will take a lead rôle in this paper. The notion of level-restriction is also very important in the context of restricted-solid-on-solid (RSOS) models in statistical mechanics [3] and fusion models in conformal field theory [39]. The one-dimensional configuration sums of RSOS models are generating functions of level-restricted paths (see for example [2, 9, 14]). The structure constants of the fusion algebras of Wess–Zumino–Witten conformal field theories are exactly the level-restricted analogues of the Littlewood–Richardson coefficients in (1.1) as shown by Kac [15, Exercise 13.35] and Walton [40, 41]. q-Analogues of these level-restricted Littlewood–Richardson coefficients in terms of ribbon tableaux were proposed in ref. [10].

The generalized Kostka polynomial admits a fermionic (or quasi-particle) formula [25]. Fermionic formulas originate from the Bethe Ansatz [4] which is a technique to construct eigenvectors and eigenvalues of row-to-row transfer matrices of statistical mechanical models. Under certain assumptions (the string hypothesis) it is possible to count the solutions of the Bethe equations resulting in fermionic expressions which look like sums of products of binomial coefficients. The Kostka numbers arise in the study of the XXX model in this way [22–24]. Fermionic formulas are of interest in physics since they reflect the particle structure of the underlying model [20, 21] and also reveal information about the exclusion statistics of the particles [5–7]. The fermionic formula of the Kostka polynomial can be combinatorialized by taking a weighted sum over sets of rigged configurations [22–24]. In ref. [25] the fermionic formula for the generalized Kostka polynomial was proven by establishing a statistic-preserving bijection between Littlewood–Richardson tableaux and rigged configurations.

In this paper we show that this bijection is well-behaved with respect to level-restriction and we give an explicit characterization of level-restricted rigged configurations (see Definition 5.5 and Theorem 8.2). This enables us to obtain a combinatorial formula for the level-restricted generalized Kostka polynomials as the generating function of level-restricted rigged configurations (see Theorem 5.7). As an immediate consequence this proves a new general fermionic formula for the level-restricted generalized Kostka polynomial (see Theorem 6.2 and Eq. (6.7)). Special cases of this formula were conjectured in refs. [8, 12, 13, 27, 33, 42]. As opposed to some definitions of “fermionic formulas” the expression of Theorem 6.2 involves in general explicit negative signs. However, we would like to point out that because of the equivalent combinatorial formulation in terms of rigged configurations as given in Theorem 5.7 the fermionic sum is manifestly positive (i.e., a polynomial with positive coefficients).

The branching functions of type A can be described in terms of crystal graphs of irreducible integrable highest weight $U_q(\widehat{sl}_n)$-modules. For certain triples of weights they can be expressed as limits of level-restricted generalized Kostka polynomials. The structure of the rigged configurations allows one to take this limit, thereby yielding a fermionic formula for the corresponding branching functions (see Eq. (7.10)). The derivation of this formula requires the knowledge of the ground state energy, which is obtained from the explicit construction of certain local isomorphisms of perfect crystals (see Theorem 7.3). A more complete set of branching functions can be obtained by considering “skew” level-restricted generalized Kostka polynomials. We conjecture that rigged configurations are also well-behaved with respect to skew shapes (see Conjecture 8.3).

New address as of July 2001: Department of Mathematics, University of California, One Shields Ave., Davis, CA 95616-8633, USA. E-mail: [email protected]
Partially supported by NSF grant DMS-9800941.


The paper is structured as follows. Section 2 sets out notation used in the paper. In Sect. 3 we review some crystal theory, in particular the definition of level-restricted paths, which are used to define the level-restricted generalized Kostka polynomials. Littlewood–Richardson tableaux and their level-restricted counterparts are defined in Sect. 4. The formulation of the generalized Kostka polynomials in terms of Littlewood–Richardson tableaux with charge statistic is necessary for the proof of the fermionic formula which makes use of the bijection between Littlewood–Richardson tableaux and rigged configurations. The latter are the subject of Sect. 5 which also contains the new definition of level-restricted rigged configurations and our main Theorem 5.7. The proof of this theorem is reserved for Sect. 8. The fermionic formulas for the level-restricted Kostka polynomial and the type A branching functions are given in Sects. 6 and 7, respectively.

2. Notation

All partitions are assumed to have n parts, some of which may be zero. Let $R = (R_1, R_2, \ldots, R_L)$ be a sequence of partitions whose Ferrers diagrams are rectangles. Let $R_j$ have $\mu_j$ columns and $\eta_j$ rows for $1 \le j \le L$. We adopt the English notation for partitions and tableaux. Unless otherwise specified, all tableaux are assumed to be column-strict (that is, the entries in each row weakly increase from left to right and in each column strictly increase from top to bottom).

3. Paths

The main goal of this section is to define the level-restricted generalized Kostka polynomials. These polynomials are defined in terms of certain finite $U_q(\widehat{sl}_n)$-crystal graphs whose elements are called paths. The theory of crystal graphs was invented by Kashiwara [16], who showed that the quantized universal enveloping algebras of Kac–Moody algebras and their integrable highest weight modules admit special bases whose structure at q = 0 is specified by a colored graph known as the crystal graph. The crystal graphs for the finite-dimensional irreducible modules for the classical Lie algebras were computed explicitly by Kashiwara and Nakashima [17]. The theory of perfect crystals gave a realization of the crystal graphs of the irreducible integrable highest weight modules for affine Kac–Moody algebras, as certain eventually periodic sequences of elements taken from finite crystal graphs [19]. This realization is used for the main application, some new explicit formulas for coset branching functions of type A.

3.1. Crystal graphs. Let $U_q(\mathfrak{g})$ be the quantized universal enveloping algebra for the Kac–Moody algebra $\mathfrak{g}$. Let $I$ be an indexing set for the Dynkin diagram of $\mathfrak{g}$, $P$ the weight lattice of $\mathfrak{g}$, $P^*$ the dual lattice, $\{\alpha_i \mid i \in I\}$ the (not necessarily linearly independent) simple roots, $\{h_i \mid i \in I\}$ the simple coroots, and $\{\Lambda_i \mid i \in I\}$ the fundamental weights. Let $\langle \cdot\,, \cdot \rangle$ denote the natural pairing of $P^*$ and $P$.

Suppose $V$ is a $U_q(\mathfrak{g})$-module with crystal graph $B$. Then $B$ is a directed graph whose vertex set (also denoted $B$) indexes a basis of weight vectors of $V$, and has directed edges colored by the elements of the set $I$. The edges may be viewed as a combinatorial version of the action of Chevalley generators. This graph has the property that for every $b \in B$ and $i \in I$, there is at most one edge colored $i$ entering (resp. leaving) $b$. If there is an edge $b \to b'$ colored $i$, denote this by $f_i(b) = b'$ and $e_i(b') = b$. If there is no edge colored $i$ leaving $b$ (resp. entering $b'$) then say that $f_i(b)$ (resp. $e_i(b')$) is undefined. The $f_i$ and $e_i$ are called Kashiwara lowering and raising operators. Define $\varphi_i(b)$ (resp. $\epsilon_i(b)$) to be the maximum $m \in \mathbb{N}$ such that $f_i^m(b)$ (resp. $e_i^m(b)$) is defined. There is a weight function $\mathrm{wt} : B \to P$ that satisfies the following properties:

$$\mathrm{wt}(f_i(b)) = \mathrm{wt}(b) - \alpha_i, \qquad \mathrm{wt}(e_i(b)) = \mathrm{wt}(b) + \alpha_i, \qquad \langle h_i, \mathrm{wt}(b) \rangle = \varphi_i(b) - \epsilon_i(b).$$ (3.1)

$B$ is called a $P$-weighted $I$-crystal. Let $P^+ = \{\Lambda \in P \mid \langle h_i, \Lambda \rangle \ge 0, \ \forall i \in I\}$ be the set of dominant integral weights. For $\Lambda \in P^+$ denote by $V(\Lambda)$ the irreducible integrable highest weight $U_q(\mathfrak{g})$-module of highest weight $\Lambda$. Let $B(\Lambda)$ be its crystal graph. Say that an element $b \in B$ of the $P$-weighted $I$-crystal $B$ is a highest weight vector if $\epsilon_i(b) = 0$ for all $i \in I$. Let $u_\Lambda$ be the highest weight vector in $B(\Lambda)$. By (3.1), for all $i \in I$,

$$\epsilon_i(u_\Lambda) = 0, \qquad \varphi_i(u_\Lambda) = \langle h_i, \Lambda \rangle.$$ (3.2)

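Let $B'$ be the crystal graph of a $U_q(\mathfrak{g})$-module $V'$. A morphism of $P$-weighted $I$-crystals is a map $\tau : B \to B'$ such that $\mathrm{wt}(\tau(b)) = \mathrm{wt}(b)$ and $\tau(f_i(b)) = f_i(\tau(b))$ for all $b \in B$ and $i \in I$. In particular $f_i(b)$ is defined if and only if $f_i(\tau(b))$ is.

Suppose $V$ and $V'$ are $U_q(\mathfrak{g})$-modules with crystal graphs $B$ and $B'$ respectively. Then $V \otimes V'$ admits a crystal graph denoted $B \otimes B'$ which is equal to the direct product $B \times B'$ as a set. We use the opposite of the convention used in the literature. Define

$$f_i(b \otimes b') = \begin{cases} b \otimes f_i(b') & \text{if } \varphi_i(b') > \epsilon_i(b), \\ f_i(b) \otimes b' & \text{if } \varphi_i(b') \le \epsilon_i(b) \text{ and } \varphi_i(b) > 0, \\ \text{undefined} & \text{otherwise.} \end{cases}$$ (3.3)

Equivalently,

$$e_i(b \otimes b') = \begin{cases} e_i(b) \otimes b' & \text{if } \varphi_i(b') < \epsilon_i(b), \\ b \otimes e_i(b') & \text{if } \varphi_i(b') \ge \epsilon_i(b) \text{ and } \epsilon_i(b') > 0, \\ \text{undefined} & \text{otherwise.} \end{cases}$$ (3.4)

One has

$$\varphi_i(b \otimes b') = \varphi_i(b) + \max\{0, \varphi_i(b') - \epsilon_i(b)\}, \qquad \epsilon_i(b \otimes b') = \max\{0, \epsilon_i(b) - \varphi_i(b')\} + \epsilon_i(b').$$ (3.5)

Finally $\mathrm{wt} : B \otimes B' \to P$ is defined by $\mathrm{wt}(b \otimes b') = \mathrm{wt}_B(b) + \mathrm{wt}_{B'}(b')$, where $\mathrm{wt}_B : B \to P$ and $\mathrm{wt}_{B'} : B' \to P$ are the weight functions for $B$ and $B'$. This construction is “associative”, that is, the $P$-weighted $I$-crystals form a tensor category.

Remark 3.1. It follows from (3.4) that if $b = b_L \otimes \cdots \otimes b_1$ and $e_i(b)$ is defined, then $e_i(b) = b_L \otimes \cdots \otimes b_{j+1} \otimes e_i(b_j) \otimes b_{j-1} \otimes \cdots \otimes b_1$ for some $1 \le j \le L$.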
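The tensor product rule (3.3)–(3.5) only needs the numbers $\varphi_i$ and $\epsilon_i$ of the two factors. The following Python sketch illustrates this; the function names and the convention of passing the four numbers as plain integers are assumptions made for illustration and are not part of the paper.

```python
def tensor_phi_eps(phi_b, eps_b, phi_bp, eps_bp):
    """Return (phi_i, eps_i) of b (x) b' from the values on b and b', as in (3.5)."""
    phi = phi_b + max(0, phi_bp - eps_b)
    eps = max(0, eps_b - phi_bp) + eps_bp
    return phi, eps


def f_acts_on(phi_b, eps_b, phi_bp):
    """Factor changed by f_i on b (x) b' according to (3.3).

    Returns "right" (b' is lowered), "left" (b is lowered) or None (undefined).
    """
    if phi_bp > eps_b:
        return "right"
    if phi_b > 0:
        return "left"
    return None


def e_acts_on(eps_b, phi_bp, eps_bp):
    """Factor changed by e_i on b (x) b' according to (3.4)."""
    if phi_bp < eps_b:
        return "left"
    if eps_bp > 0:
        return "right"
    return None
```

In this sketch `f_acts_on` returns `None` exactly when the value of $\varphi_i(b \otimes b')$ computed by `tensor_phi_eps` is zero, which is the consistency between (3.3) and (3.5).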


3.2. $U_q(sl_n)$-crystal graphs on tableaux. Let $J = \{1, 2, \ldots, n-1\}$ be the indexing set for the Dynkin diagram of type $A_{n-1}$, with weight lattice $P_{fin}$, simple roots $\{\bar\alpha_i \mid i \in J\}$, fundamental weights $\{\bar\Lambda_i \mid i \in J\}$, and simple coroots $\{h_i \mid i \in J\}$. Let $\lambda = (\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n) \in \mathbb{N}^n$ be a partition. There is a natural projection $\mathbb{Z}^n \to P_{fin}$ denoted $\lambda \mapsto \bar\lambda = \sum_{i=1}^{n-1} (\lambda_i - \lambda_{i+1}) \bar\Lambda_i$. Let $V(\bar\lambda)$ be the irreducible integrable highest weight module of highest weight $\bar\lambda$ over the quantized universal enveloping algebra $U_q(sl_n)$ [17]. By abuse of notation we shall write $V^\lambda = V(\bar\lambda)$ and denote the crystal graph of $V^\lambda$ by $B_\lambda$. As a set $B_\lambda$ may be realized as the set of tableaux of shape $\lambda$ over the alphabet $\{1, 2, \ldots, n\}$. Define the content of $b \in B_\lambda$ by $\mathrm{content}(b) = (c_1, \ldots, c_n) \in \mathbb{N}^n$, where $c_j$ is the number of times the letter $j$ appears in $b$. The weight function $\mathrm{wt} : B_\lambda \to P_{fin}$ is given by sending $b$ to the image of $\mathrm{content}(b)$ under the projection $\mathbb{Z}^n \to P_{fin}$. The row-reading word of $b$ is defined by $\mathrm{word}(b) = \cdots w_2 w_1$, where $w_r$ is the word obtained by reading the $r$-th row of $b$ from left to right. This definition is useful even in the context that $b$ is a skew tableau.

The edges of $B_\lambda$ are given as follows. First let $v$ be a word in the alphabet $\{1, 2, \ldots, n\}$. View each letter $i$ (resp. $i+1$) of $v$ as a closing (resp. opening) parenthesis, ignoring other letters. Now iterate the following step: declare each adjacent pair of matched parentheses to be invisible. Repeat this until there are no matching pairs of visible parentheses. At the end the result must be a sequence of closing parentheses (say $p$ of them) followed by a sequence of opening parentheses (say $q$ of them). The unmatched (visible) subword is of the form $i^p (i+1)^q$. If $p > 0$ (resp. $q > 0$) then $f_i(v)$ (resp. $e_i(v)$) is obtained from $v$ by replacing the unmatched subword $i^p (i+1)^q$ by $i^{p-1} (i+1)^{q+1}$ (resp. $i^{p+1} (i+1)^{q-1}$). Then $\varphi_i(v) = p$, $\epsilon_i(v) = q$, and $f_i(v)$ (resp. $e_i(v)$) is defined if and only if $p > 0$ (resp. $q > 0$). For the tableau $b \in B_\lambda$, let $f_i(b)$ be undefined if $f_i(\mathrm{word}(b))$ is; otherwise define $f_i(b)$ to be the unique (not necessarily column-strict) tableau of shape $\lambda$ such that $\mathrm{word}(f_i(b)) = f_i(\mathrm{word}(b))$. It is easy to verify that when defined, $f_i(b)$ is a column-strict tableau. Consequently $\varphi_i(b) = \varphi_i(\mathrm{word}(b))$. The operator $e_i$ and the quantity $\epsilon_i(b)$ are defined similarly.

3.3. $U_q(\widehat{sl}_n)$-crystal structure on rectangular tableaux. There is an inclusion of algebras $U_q(sl_n) \subset U_q(\widehat{sl}_n)$, where $U_q(\widehat{sl}_n)$ is the quantized universal enveloping algebra corresponding to the derived subalgebra $\widehat{sl}_n'$ of the affine Kac–Moody algebra $\widehat{sl}_n$ [15]. Let $I = \{0, 1, 2, \ldots, n-1\}$ be the index set for the Dynkin diagram of $A_{n-1}^{(1)}$. Let $P_{cl}$ be the weight lattice of $\widehat{sl}_n'$, with (linearly dependent) simple roots $\{\alpha_i^{cl} \mid i \in I\}$, simple coroots $\{h_i \mid i \in I\}$, and fundamental weights $\{\Lambda_i^{cl} \mid i \in I\}$. The simple roots satisfy the relation $\alpha_0^{cl} = -\sum_{i \in J} \alpha_i^{cl}$. There is a natural projection $P_{cl} \to P_{fin}$ with kernel $\mathbb{Z}\Lambda_0^{cl}$ such that $\Lambda_i^{cl} \mapsto \bar\Lambda_i$ for $i \in J$ and $\Lambda_0^{cl} \mapsto 0$. Let $\mathrm{cl} : P_{fin} \to P_{cl}$ be the section of the above projection defined by $\mathrm{cl}(\bar\Lambda_i) = \Lambda_i^{cl} - \Lambda_0^{cl}$ for $i \in J$. Let $c \in \widehat{sl}_n'$ be the canonical central element. The level of a weight $\Lambda \in P_{cl}$ is defined by $\langle c, \Lambda \rangle$. Let $(P_{cl}^+)_\ell = \{\Lambda \in P_{cl}^+ \mid \langle c, \Lambda \rangle = \ell\}$.

Suppose $V$ is a finite-dimensional $U_q(\widehat{sl}_n)$-module that has a crystal graph $B$ (not all do); $B$ is a $P_{cl}$-weighted $I$-crystal. A weight function $\mathrm{wt}_{cl} : B \to P_{cl}$ may be given by $\mathrm{wt}_{cl}(b) = \mathrm{cl}(\mathrm{wt}(b))$, where $\mathrm{wt} : B \to P_{fin}$ is the weight function on the set $B$ viewed as a $U_q(sl_n)$-crystal graph. In addition to being a $U_q(sl_n)$-crystal graph, $B$ also has some edges colored 0. The action of $U_q(sl_n)$ on $V^\lambda$ extends to an action of $U_q(\widehat{sl}_n)$ which admits a crystal structure, if and only if the partition $\lambda$ is a rectangle [18, 30]. If $\lambda$ is the rectangle with $k$ rows and $m$ columns, then write $V^{k,m}$ for the $U_q(\widehat{sl}_n)$-module with $U_q(sl_n)$-structure $V^\lambda$ and denote its crystal graph by $B^{k,m}$. If one of $m$ or $k$ is 1, then it is easy to give $e_0$ and $f_0$ explicitly on $B^{k,m}$, for in this case the weight spaces of $V^{k,m}$ are one-dimensional, and the zero edges can be deduced from (3.1) [18].

The general case is given as follows [37]. We shall first define a content-rotating bijection $\psi^{-1} : B^{k,m} \to B^{k,m}$. Let $b \in B^{k,m}$ be a tableau, say of content $(c_1, c_2, \ldots, c_n)$. $\psi^{-1}(b)$ will have content $(c_2, c_3, \ldots, c_n, c_1)$. Remove all the letters 1 from $b$, leaving a vacant horizontal strip of size $c_1$ in the northwest corner of $b$. Compute Schensted’s P tableau [34] of the row-reading word of this skew subtableau. It can be shown that this yields a tableau of the shape obtained by removing $c_1$ cells from the last row of the rectangle $(m^k)$. Subtract one from the value of each entry of this tableau, and then fill in the $c_1$ vacant cells in the last row of the rectangle $(m^k)$ with the letter $n$. It can be shown that $\psi^{-1}$ is a well-defined bijection, whose inverse $\psi$ can be given by a similar algorithm. Then

$$f_i = \psi^{-1} \circ f_{i+1} \circ \psi, \qquad e_i = \psi^{-1} \circ e_{i+1} \circ \psi$$ (3.6)

for all $i$ where indices are taken modulo $n$; in particular for $i = 0$ this defines explicitly the operators $e_0$ and $f_0$.

3.4. Sequences of rectangular tableaux. For a sequence of rectangles $R$, consider the tensor product $V^{R_L} \otimes \cdots \otimes V^{R_1}$. Its $U_q(\widehat{sl}_n)$-crystal graph has underlying set $P_R = B_{R_L} \otimes \cdots \otimes B_{R_1}$, where the tensor symbols denote the Cartesian product of sets. A typical element of $P_R$ is called a path and is written $b = b_L \otimes \cdots \otimes b_2 \otimes b_1$, where $b_j \in B_{R_j}$ is a tableau of shape $R_j$. The edges of the crystal graph $P_R$ are given explicitly as follows. Define the word of a path $b$ by $\mathrm{word}(b) = \mathrm{word}(b_L) \cdots \mathrm{word}(b_2)\,\mathrm{word}(b_1)$. Then for $i = 1, 2, \ldots, n-1$ (as in the definition of $f_i$ for $b \in B_\lambda$), if $f_i(\mathrm{word}(b))$ is undefined, let $f_i(b)$ be undefined; otherwise it is not hard to see that there is a unique path $f_i(b) \in P_R$ such that $\mathrm{word}(f_i(b)) = f_i(\mathrm{word}(b))$. To define $f_0$, let $\psi(b) = \psi(b_L) \otimes \cdots \otimes \psi(b_1)$ and $f_0 = \psi^{-1} \circ f_1 \circ \psi$. This definition is equivalent to that given by taking the above definition of $f_i$ on the crystals $B_{R_j}$ and then applying the rule for lowering operators on tensor products (3.3). The action of $e_i$ for $i \in I$ is defined analogously.

3.5. Integrable affine crystals. Consider the affine Kac–Moody algebra $\widehat{sl}_n$, with weight lattice $P_{af}$, independent simple roots $\{\alpha_i \mid i \in I\}$, simple coroots $\{h_i \mid i \in I\}$, and fundamental weights $\{\Lambda_i \mid i \in I\}$. Let $\delta \in P_{af}$ be the null root. There is a natural projection which we shall by abuse of notation also call $\mathrm{cl} : P_{af} \to P_{cl}$ such that $\mathrm{cl}(\delta) = 0$ and $\mathrm{cl}(\Lambda_i) = \Lambda_i^{cl}$ for $i \in I$. Write $\mathrm{af} : P_{cl} \to P_{af}$ for the section of cl given by $\mathrm{af}(\Lambda_i^{cl}) = \Lambda_i$ for $i \in I$.
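The parenthesis-matching rule of Sect. 3.2, which Sect. 3.4 applies to the word of a path for $i = 1, \ldots, n-1$, can be sketched in Python as follows. Words are represented as lists of integers; the function names are hypothetical, and the operators with index 0 (which require $\psi$) are not covered by this sketch.

```python
def unmatched(word, i):
    """Positions of the unmatched letters i and i+1 in word.

    A letter i+1 is an opening parenthesis, a letter i is a closing one.
    Returns (closing, opening): positions of the visible i's and (i+1)'s,
    which form the subword i^p (i+1)^q.
    """
    opening = []   # positions of currently unmatched letters i+1
    closing = []   # positions of unmatched letters i
    for pos, letter in enumerate(word):
        if letter == i + 1:
            opening.append(pos)
        elif letter == i:
            if opening:
                opening.pop()        # matches the nearest unmatched i+1 to its left
            else:
                closing.append(pos)
    return closing, opening


def phi(word, i):
    return len(unmatched(word, i)[0])


def eps(word, i):
    return len(unmatched(word, i)[1])


def f(word, i):
    """Replace i^p (i+1)^q by i^(p-1) (i+1)^(q+1); None if p = 0."""
    closing, _ = unmatched(word, i)
    if not closing:
        return None
    result = list(word)
    result[closing[-1]] = i + 1      # rightmost unmatched i becomes i+1
    return result


def e(word, i):
    """Replace i^p (i+1)^q by i^(p+1) (i+1)^(q-1); None if q = 0."""
    _, opening = unmatched(word, i)
    if not opening:
        return None
    result = list(word)
    result[opening[0]] = i           # leftmost unmatched i+1 becomes i
    return result
```

For the word 1 2 1 1 2 and $i = 1$ the unmatched subword is $1^2\,2^1$, so `phi` returns 2 and `eps` returns 1, and `e(f(word, 1), 1)` recovers the original word.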


Let $\Lambda \in P_{cl}^+$ be a dominant integral weight and $B(\Lambda)$ the crystal graph of the irreducible integrable highest weight $U_q(\widehat{sl}_n)$-module of highest weight $\Lambda$. If $\Lambda \ne 0$ then $B(\Lambda)$ is infinite. The set of weights in $P_{af}$ that project by cl to $\Lambda$ are given by $\mathrm{cl}^{-1}(\Lambda) = \{\mathrm{af}(\Lambda) + j\delta \mid j \in \mathbb{Z}\}$. Now fix $j$. The irreducible integrable highest weight $U_q(\widehat{sl}_n)$-crystal graph $B(\mathrm{af}(\Lambda) + j\delta)$ may be identified with $B(\Lambda)$ as sets and as $I$-crystals (independent of $j$). The weight functions for $B(\mathrm{af}(\Lambda) + j\delta)$ and $B(\mathrm{af}(\Lambda))$ differ by the global constant $j\delta$. The weight function $B(\Lambda) \to P_{cl}$ is obtained by composing the weight function for $B(\mathrm{af}(\Lambda) + j\delta)$ with the projection $\mathrm{cl} : P_{af} \to P_{cl}$. The set $B(\Lambda)$ is then endowed with an induced $\mathbb{Z}$-grading $E : B(\Lambda) \to \mathbb{N}$ defined by $E(b) = -\langle d, \mathrm{wt}(b) \rangle$, where $B(\Lambda)$ is identified with $B(\mathrm{af}(\Lambda))$, $\mathrm{wt} : B(\mathrm{af}(\Lambda)) \to P_{af}$ is the weight function and $d \in P_{af}^*$ is the degree generator. The map $\langle d, \cdot \rangle$ takes the coefficient of the element $\delta$ of an element in $P_{af}$ when written in the basis $\{\Lambda_i \mid i \in I\} \cup \{\delta\}$.

3.6. Energy function on finite paths. The set of paths $P_R$ has a natural statistic called the energy function. The definitions here follow [30]. Consider first the case that $R = (R_1, R_2)$ is a sequence of two rectangles. Let $B_j = B_{R_j}$ for $1 \le j \le 2$. Since $B_2 \otimes B_1$ is a connected crystal graph, there is a unique $U_q(\widehat{sl}_n)$-crystal graph isomorphism

$$\sigma : B_2 \otimes B_1 \cong B_1 \otimes B_2.$$ (3.7)

This is called the local isomorphism (see Sect. 4.4 for an explicit construction). Write $\sigma(b_2 \otimes b_1) = b_1' \otimes b_2'$. Then there is a unique (up to a global additive constant) map $H : B_2 \otimes B_1 \to \mathbb{Z}$ such that

$$H(e_i(b_2 \otimes b_1)) = H(b_2 \otimes b_1) + \begin{cases} -1 & \text{if } i = 0,\ e_0(b_2 \otimes b_1) = e_0 b_2 \otimes b_1 \text{ and } e_0(b_1' \otimes b_2') = e_0 b_1' \otimes b_2', \\ 1 & \text{if } i = 0,\ e_0(b_2 \otimes b_1) = b_2 \otimes e_0 b_1 \text{ and } e_0(b_1' \otimes b_2') = b_1' \otimes e_0 b_2', \\ 0 & \text{otherwise.} \end{cases}$$ (3.8)

This map is called the local energy function. By definition it is invariant under the local isomorphism and under $f_i$ and $e_i$ for $i \in J$. Let us normalize it by the condition that $H(u_2 \otimes u_1) = |R_1 \cap R_2|$, where $u_j$ is the $U_q(sl_n)$ highest weight vector of $B_j$ for $1 \le j \le 2$, $R_1 \cap R_2$ is the intersection of the Ferrers diagrams of $R_1$ and $R_2$, and $|R_1 \cap R_2|$ is the number of cells in this intersection. Explicitly $|R_1 \cap R_2| = \min\{\eta_1, \eta_2\} \min\{\mu_1, \mu_2\}$. If $\eta_1 + \eta_2 \le n$ then the local energy function attains precisely the values from 0 to $|R_1 \cap R_2|$.

Now let $R = (R_1, \ldots, R_L)$ be a sequence of rectangles and $b = b_L \otimes \cdots \otimes b_1 \in P_R$. For $1 \le p \le L-1$ let $\sigma_p$ denote the local isomorphism that exchanges the tensor factors in the $p$-th and $(p+1)$-th positions. For $1 \le i < j \le L$, let $b_j^{(i+1)}$ be the $(i+1)$-th tensor factor in $\sigma_{i+1} \sigma_{i+2} \cdots \sigma_{j-1}(b)$. Then define the energy function

$$E(b) = \sum_{1 \le i < j \le L} H(b_j^{(i+1)} \otimes b_i).$$ (3.9)

The value of the energy function is unchanged under local isomorphisms and under $e_i$ and $f_i$ for $i \in J$, since the local energy function has this property. The next lemma follows from the definition of the local energy function.


Lemma 3.2. Suppose $b = b_L \otimes \cdots \otimes b_1 \in P_R$ is such that $e_0(b)$ is defined and for any image $b' = b_L' \otimes \cdots \otimes b_1'$ of $b$ under a composition of local isomorphisms, $e_0(b') = b_L' \otimes \cdots \otimes b_{j+1}' \otimes e_0(b_j') \otimes b_{j-1}' \otimes \cdots \otimes b_1'$, where $j \ne 1$. Then $E(e_0(b)) = E(b) - 1$.

If all rectangles $R_j$ are the same then each of the local isomorphisms is the identity and

$$E(b) = \sum_{1 \le i \le L-1} (L - i)\, H(b_{i+1} \otimes b_i).$$ (3.10)

Say that $b \in P_R$ is classically restricted if it is an $sl_n$-highest weight vector, that is, $\epsilon_i(b) = 0$ for all $i \in J$. Equivalently, $\mathrm{word}(b)$ is a (reverse) lattice permutation (every final subword has partition content). Let $P_{\Lambda R}$ be the set of classically restricted paths in $P_R$ of weight $\Lambda \in P_{cl}$. It was shown in [37] that the generalized Kostka polynomial (which was originally defined in terms of Littlewood–Richardson tableaux; see (4.3)) can be expressed as

$$K_{\lambda R}(q) = \sum_{b \in P_{\mathrm{cl}(\lambda) R}} q^{E(b)}.$$ (3.11)

This extends the path formulation of the Kostka polynomial by Nakayashiki and Yamada [30].

3.7. Level-restricted paths. Let $B$ be any $P_{cl}$-weighted $I$-crystal and $\Lambda \in P_{cl}^+$. Say that $b \in B$ is $\Lambda$-restricted if $b \otimes u_\Lambda$ is a highest weight vector in the $P_{cl}$-weighted $I$-crystal $B \otimes B(\Lambda)$, that is, $\epsilon_i(b \otimes u_\Lambda) = 0$ for all $i \in I$. Equivalently $\epsilon_i(b) \le \langle h_i, \Lambda \rangle$ for all $i \in I$ by (3.5) and (3.2). Denote by $\mathcal{H}(\Lambda, B)$ the set of elements $b \in B$ that are $\Lambda$-restricted. If $\Lambda' \in P_{cl}^+$ has the same level as $\Lambda$, define $\mathcal{H}(\Lambda, B, \Lambda')$ to be the set of $b \in \mathcal{H}(\Lambda, B)$ such that $\mathrm{wt}(b) = \Lambda' - \Lambda \in P_{cl}$, that is, the set of $b \in B$ such that $b \otimes u_\Lambda$ is a highest weight vector of weight $\Lambda'$.

Say that the element $b$ is restricted of level $\ell$ if it is $(\ell\Lambda_0)$-restricted. Such paths are also classically restricted since $\langle h_i, \ell\Lambda_0 \rangle = 0$ for $i \in J$. Let $P_{\Lambda R}^\ell$ denote the set of paths in $P_{\Lambda R}$ that are restricted of level $\ell$. Letting $B = P_R$, this is the same as saying $P_{\Lambda R}^\ell = \mathcal{H}(\ell\Lambda_0, B, \Lambda + \ell\Lambda_0)$. Define the level-restricted generalized Kostka polynomial by

$$K_{\lambda R}^\ell(q) = \sum_{b \in P_{\mathrm{cl}(\lambda) R}^\ell} q^{E(b)}.$$ (3.12)

3.8. Perfect crystals. This section is needed to compute the coset branching functions in Sect. 7. We follow [19], stating the definitions in the case of $\widehat{sl}_n$. For any $U_q(\widehat{sl}_n)$-crystal $B$, define $\epsilon, \varphi : B \to P_{cl}$ by $\epsilon(b) = \sum_{i \in I} \epsilon_i(b) \Lambda_i$ and $\varphi(b) = \sum_{i \in I} \varphi_i(b) \Lambda_i$. Now let $\ell$ be a positive integer and $B$ the crystal graph of a finite dimensional irreducible $U_q(\widehat{sl}_n)$-module $V$. Say that $B$ is perfect of level $\ell$ if

(1) $B \otimes B$ is connected.
(2) There is a weight $\Lambda' \in P_{cl}$ such that $B$ has a unique vector of weight $\Lambda'$ and all other vectors in $B$ have lower weight in the Chevalley order, that is, $\mathrm{wt}(B) \subset \Lambda' - \sum_{i \in J} \mathbb{N}\alpha_i$.
(3) $\ell = \min_{b \in B} \langle c, \epsilon(b) \rangle$.
(4) The maps $\epsilon$ and $\varphi$ restrict to bijections $B_{min} \to (P_{cl}^+)_\ell$, where $B_{min} \subset B$ is the set of $b \in B$ achieving the minimum in (3).

For $\widehat{sl}_n$ the perfect crystals of level $\ell$ are precisely those of the form $B^{k,\ell}$ for $1 \le k \le n-1$ [18, 30]. Let $B = B^{k,\ell}$. The weight $\Lambda'$ can be taken to be $\ell(\Lambda_k^{cl} - \Lambda_0^{cl})$.

Example 3.3. We describe the bijections $\epsilon, \varphi : B_{min} \to (P_{cl}^+)_\ell$ in this example. Let $B = B^{k,\ell}$. For this example let $n = 6$, $k = 3$, $\ell = 5$, and consider the weight $\Lambda = 2\Lambda_0 + \Lambda_1 + \Lambda_2 + \Lambda_4$. As usual subscripts are identified modulo $n$. The unique tableau $b \in B^{k,\ell}$ such that $\varphi(b) = \Lambda$ is constructed as follows. First let $T$ be the following tableau of shape $(\ell^k)$. Its bottom row contains $\langle h_i, \Lambda \rangle$ copies of the letter $i$ for $1 \le i \le n$ (here it is 12466 since the sequence of $\langle h_i, \Lambda \rangle$ for $1 \le i \le 6$ is (1, 1, 0, 1, 0, 2)). Let every letter in $T$ have value one smaller than the letter directly below it. Here we have

T =
-1  0  2  4  4
 0  1  3  5  5
 1  2  4  6  6

Let $T_-$ be the subtableau of $T$ consisting of the entries that are nonpositive and $T_+$ the rest. Say $T_-$ has shape $\nu$ (here $\nu = (2, 1)$). Let $\tilde\nu = (\ell^k) - (\nu_k, \nu_{k-1}, \ldots, \nu_1)$ (here $\tilde\nu = (5, 4, 3)$). The desired tableau $b$ is defined as follows. The restriction of $b$ to the shape $\tilde\nu$ is $P(T_+)$, or equivalently, the tableau obtained by taking the skew tableau $T_+$ and first pushing all letters straight upwards to the top of the bounding rectangle $(\ell^k)$, and then pushing all letters straight to the left inside $(\ell^k)$. The restriction of $b$ to $(\ell^k)/\tilde\nu$ is the tableau of that skew shape in the alphabet $\{1, 2, \ldots, n\}$ with maximal entries, that is, its bottom row is filled with the letter $n$, the next-to-bottom row is filled with the letter $n-1$, etc. In the example,

b =
1 1 2 4 4
2 3 5 5 5
4 6 6 6 6

To construct the unique element $b' \in B^{k,\ell}$ such that $\epsilon(b') = \Lambda$, let $U$ be the tableau whose first row has $\langle h_i, \Lambda \rangle$ copies of the letter $i+1$ for $1 \le i \le n$, again identifying subscripts modulo $n$; here $U$ has first row 11235. Now let the rest of $U$ be defined by letting each entry have value one greater than the entry above it. So

U =
1 1 2 3 5
2 2 3 4 6
3 3 4 5 7

Let $U_-$ be the subtableau of $U$ consisting of the values that are at most $n$. Let $\mu$ be the shape of $U_-$ and $\tilde\mu = (\ell^k) - (\mu_k, \mu_{k-1}, \ldots, \mu_1)$. Here $\mu = (5, 5, 4)$ and $\tilde\mu = (1, 0, 0)$. The element $b'$ is defined as follows. Its restriction to the skew shape $(\ell^k)/\tilde\mu$ is the unique skew tableau $V$ of that shape such that $P(V) = U_-$, or equivalently, this restriction is obtained by taking the tableau $U_-$, pushing all letters directly down within the rectangle $(\ell^k)$ and then pushing all letters to the right within $(\ell^k)$. The restriction of $b'$ to the shape $\tilde\mu$ is filled with the smallest letters possible, so that the first row of this subtableau consists of ones, the second row consists of twos, etc. Here

b' =
1 1 1 2 3
2 2 3 4 5
3 3 4 5 6

The main theorem for perfect crystals is:

Theorem 3.4 ([19]). Let $B$ be a perfect crystal of level $\ell'$ and $\Lambda \in (P_{cl}^+)_\ell$ with $\ell \ge \ell'$. Then there is an isomorphism of $U_q(\widehat{sl}_n)$-crystals

$$B \otimes B(\Lambda) \cong \bigoplus_{b \in \mathcal{H}(\Lambda, B)} B(\Lambda + \mathrm{wt}(b)).$$ (3.13)
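The construction of the tableaux $T$, $U$ and $b$ in Example 3.3 is mechanical and can be sketched in Python. The sketch assumes that a weight is given as a dictionary mapping the index $i$ of $\Lambda_i$ to its coefficient, and that tableaux are lists of rows; these representations and the function names are illustrative choices, not notation from the paper. For $b$ it uses the "push up, then push left" description rather than Schensted insertion.

```python
def coroot_pairings(weight, n):
    """The sequence <h_i, Lambda> for i = 1..n, with the index n identified with 0."""
    return [weight.get(i % n, 0) for i in range(1, n + 1)]


def tableau_T(weight, n, k):
    """Bottom row: <h_i, Lambda> copies of i; each entry is one less than the entry below."""
    pairings = coroot_pairings(weight, n)
    bottom = [i for i, c in zip(range(1, n + 1), pairings) for _ in range(c)]
    return [[x - (k - 1 - r) for x in bottom] for r in range(k)]


def tableau_U(weight, n, k):
    """First row: <h_i, Lambda> copies of i+1 (mod n); each entry is one more than the entry above."""
    pairings = coroot_pairings(weight, n)
    top = sorted((i % n) + 1 for i, c in zip(range(1, n + 1), pairings) for _ in range(c))
    return [[x + r for x in top] for r in range(k)]


def b_from_T(T, n):
    """Push the positive entries of T up and then left, and fill the remaining
    cells of row r (0-based, k rows) with the letter n - (k - 1 - r)."""
    k, width = len(T), len(T[0])
    # positive entries of each column, in order from top to bottom (pushed up)
    cols = [[T[r][c] for r in range(k) if T[r][c] > 0] for c in range(width)]
    # reading row r across the columns that are long enough pushes the letters left
    rows = [[col[r] for col in cols if len(col) > r] for r in range(k)]
    return [row + [n - (k - 1 - r)] * (width - len(row)) for r, row in enumerate(rows)]
```

With `weight = {0: 2, 1: 1, 2: 1, 4: 1}`, `n = 6` and `k = 3`, these functions reproduce the tableaux $T$, $U$ and $b$ displayed in Example 3.3.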

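Suppose now that $B$ is perfect of level $\ell$ and $\Lambda \in (P_{cl}^+)_\ell$. Write $b(\Lambda)$ for the unique element of $B$ such that $\varphi(b(\Lambda)) = \Lambda$. Theorem 3.4 (with $\Lambda$ therein replaced by $\Lambda' = \epsilon(b(\Lambda))$) says that $B \otimes B(\epsilon(b(\Lambda))) \cong B(\Lambda)$ with corresponding highest weight vectors $b(\Lambda) \otimes u_{\epsilon(b(\Lambda))} \mapsto u_\Lambda$. This isomorphism can be iterated. Let $\sigma : B_{min} \to B_{min}$ be the unique bijection defined by $\varphi \circ \sigma = \epsilon$. Then there are isomorphisms $B^{\otimes N} \otimes B(\varphi(\sigma^N(b(\Lambda)))) \cong B(\Lambda)$ such that the highest weight vector of the left-hand side is given by $b(\Lambda) \otimes \sigma(b(\Lambda)) \otimes \sigma^2(b(\Lambda)) \otimes \cdots \otimes \sigma^{N-1}(b(\Lambda)) \otimes u_{\varphi(\sigma^N(b(\Lambda)))}$. For the $U_q(\widehat{sl}_n)$ perfect crystals $B^{k,\ell}$, it can be shown that the map $\sigma$ is none other than the power $\psi^{-k}$ of the content rotating map $\psi$. Moreover if $\sigma$ is extended to a bijection $\sigma : B^{k,\ell} \to B^{k,\ell}$ by defining $\sigma = \psi^{-k}$, then the extended function also satisfies $\varphi(\sigma(b)) = \epsilon(b)$ for all $b \in B^{k,\ell}$, not just for $b \in B_{min}$. Since the bijection $\psi$ on $B^{k,\ell}$ has order $n$, the bijection $\sigma$ has order $n/\gcd(n, k)$.

The ground state path for the pair $(\Lambda, B)$ is by definition the infinite periodic sequence $\bar b = \bar b_1 \otimes \bar b_2 \otimes \cdots$, where $\bar b_i = \sigma^{i-1}(b(\Lambda))$. Let $P(\Lambda, B)$ be the set of all semi-infinite sequences $b = b_1 \otimes b_2 \otimes \cdots$ of elements in $B$ such that $b$ eventually agrees with the ground state path $\bar b$ for $(\Lambda, B)$. Then the set $P(\Lambda, B)$ has the structure of the crystal $B(\Lambda)$ with highest weight vector $u_\Lambda = \bar b$ and weight function $\mathrm{wt}(b) = \sum_{i \ge 1} (\mathrm{wt}(b_i) - \mathrm{wt}(\bar b_i))$. To recover the weight function of the $U_q(\widehat{sl}_n)$-crystal $B(\mathrm{af}(\Lambda))$, define the energy function on $P(\Lambda, B)$ by

$$E(b) = \sum_{i \ge 1} i \left( H(b_i \otimes b_{i+1}) - H(\bar b_i \otimes \bar b_{i+1}) \right)$$ (3.14)

and define the map $B(\mathrm{af}(\Lambda)) \to P_{af}$ by $b \mapsto \mathrm{wt}(b) - E(b)\delta$, where $\mathrm{wt} : B(\Lambda) \to P_{cl}$.

$P(\Lambda, B)$ can be regarded as a direct limit of the finite crystals $B^{\otimes N}$. Define the embedding $i_N : B^{\otimes N} \to P(\Lambda, B)$ by $b_1 \otimes \cdots \otimes b_N \mapsto b_1 \otimes \cdots \otimes b_N \otimes \bar b_{N+1} \otimes \bar b_{N+2} \otimes \cdots$. Define $E_N : B^{\otimes N} \to \mathbb{Z}$ by $E_N(b_1 \otimes \cdots \otimes b_N) = E(b_1 \otimes \cdots \otimes b_N \otimes \bar b_{N+1})$, where the $E$ on the right-hand side is the energy function for the finite path space $B^{\otimes N+1}$. By definition for all $p = b_1 \otimes \cdots \otimes b_N \in B^{\otimes N}$, $E(i_N(p)) = E_N(p) - E_N(\bar b_1 \otimes \cdots \otimes \bar b_N)$. Note that the last fixed step $\bar b_{N+1}$ is necessary to make the energy function on the finite paths stable under the embeddings into $P(\Lambda, B)$.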


3.9. Standardization embeddings. We require certain embeddings of finite path spaces. Given a sequence of rectangles $R$, let $r(R)$ denote the sequence of rectangles given by splitting the rectangles of $R$ into their constituent rows. For example, if $R = ((1), (2, 2))$, then $r(R) = ((1), (2), (2))$. There is a unique embedding

$$i_R : P_R \hookrightarrow P_{r(R)}$$ (3.15)

defined as follows. Its explicit computation is based on transforming $R$ into $r(R)$ using two kinds of steps.

(1) Suppose $R_1$ has more than one row ($\eta_1 > 1$). Then use the transformation $R \to R^< = ((\mu_1), (\mu_1^{\eta_1 - 1}), R_2, R_3, \ldots, R_L)$. Informally, $R^<$ is obtained from $R$ by splitting off the first row of $R_1$. There is an associated embedding of $U_q(\widehat{sl}_n)$-crystal graphs $i_R^< : P_R \to P_{R^<}$ defined by the property that $\mathrm{word}(i_R^<(b)) = \mathrm{word}(b)$ for all $b \in P_R$. Here it is crucial that the rectangle being split horizontally is the first one, for otherwise the embedding does not preserve the edges labeled by 0.
(2) If $\eta_1 = 1$, then use a transformation of the form $R \to s_p R$ for some $p$. Here $s_p R$ denotes the sequence of rectangles obtained by exchanging the $p$-th and $(p+1)$-th rectangles in $R$. The associated isomorphism of $U_q(\widehat{sl}_n)$-crystal graphs is the local isomorphism $\sigma_p : P_R \to P_{s_p R}$ defined before.

It is clear that one can transform $R$ into $r(R)$ using these two kinds of steps. Now fix one such sequence of steps leading from $R$ to $r(R)$, say $R = R^{(0)} \to R^{(1)} \to \cdots \to R^{(N)} = r(R)$, where each $R^{(m)}$ is a sequence of rectangles and each step $R^{(m-1)} \to R^{(m)}$ is one of the two types defined above. Define the map $i^{(m)} : P_{R^{(m-1)}} \hookrightarrow P_{R^{(m)}}$ by $i^{(m)} = i_R$ …

… $\sum_{a=1}^{L} \mu_a \max\{\eta_a - k, 0\}$ (5.1)

for $k \ge 0$, where by convention $\nu^{(0)}$ is the empty partition. If $\lambda$ has at most $n$ parts all partitions $\nu^{(k)}$ for $k \ge n$ are empty. For a partition $\rho$, define $m_i(\rho)$ to be the number of parts equal to $i$ and

$$Q_i(\rho) = \rho_1^t + \rho_2^t + \cdots + \rho_i^t = \sum_{j \ge 1} \min\{i, \rho_j\},$$

the size of the first $i$ columns of $\rho$. Let $\xi^{(k)}(R)$ be the partition whose parts are the widths of the rectangles in $R$ of height $k$. The vacancy numbers for the $(\lambda; R)$-configuration $\nu$ are the numbers (indexed by $k \ge 1$ and $i \ge 0$) defined by

$$P_i^{(k)}(\nu) = Q_i(\nu^{(k-1)}) - 2 Q_i(\nu^{(k)}) + Q_i(\nu^{(k+1)}) + Q_i(\xi^{(k)}(R)).$$ (5.2)
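As an illustration of (5.2), the following Python sketch computes $Q_i$, $\xi^{(k)}(R)$ and the vacancy numbers. It assumes that a configuration is stored as a list `nu` with `nu[k-1]` equal to $\nu^{(k)}$, and that each rectangle of $R$ is stored as a tuple of equal parts, so that its height is the number of parts and its width is the first part; these conventions and the function names are choices made for the sketch only. The sample data is the configuration considered in Example 5.1.

```python
def Q(i, rho):
    """Q_i(rho): the number of cells in the first i columns of the partition rho."""
    return sum(min(i, part) for part in rho)


def xi(k, R):
    """xi^(k)(R): widths of the rectangles in R that have height k."""
    return sorted((rect[0] for rect in R if len(rect) == k), reverse=True)


def vacancy(k, i, nu, R):
    """P_i^(k)(nu) as in (5.2); nu^(0) and all nu^(j) beyond the list are empty."""
    def part(j):
        return nu[j - 1] if 1 <= j <= len(nu) else ()
    return (Q(i, part(k - 1)) - 2 * Q(i, part(k))
            + Q(i, part(k + 1)) + Q(i, xi(k, R)))


# Sample data: R = ((2), (2,2), (1,1)) and nu = ((2), (2,1), (1)).
R = [(2,), (2, 2), (1, 1)]
nu = [(2,), (2, 1), (1,)]
vacancies = {(k, i): vacancy(k, i, nu, R)
             for k, rho in enumerate(nu, start=1) for i in set(rho)}
```

Here `vacancies` holds one vacancy number for each pair $(k, i)$ such that $\nu^{(k)}$ has a part of size $i$.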

In particular P0 (ν) = 0 for all k ≥ 1. The (λ; R)-configuration ν is said to be admissible (k) if Pi (ν) ≥ 0 for all k, i ≥ 1, and the set of admissible (λ; R)-configurations is denoted by C(λ; R). Following [26, (3.2)], set (k) (k) (k+1) αi αi − α i , cc(ν) = k,i≥1

Fermionic Formulas for Level-Restricted Generalized Kostka Polynomials

121

(k)

where αi is the size of the i th column in ν (k) . Define the charge c(ν) of a configuration ν ∈ C(λ; R) by c(ν) = ||R|| − cc(ν) − |P | with ||R|| =

|Ri ∩ Rj |

and

|P | =

1≤i<j ≤L

k,i≥1

(k)

mi (ν)Pi (ν).

Observe that c(ν) depends on both ν and R but cc(ν) depends only on ν.

Example 5.1. Let λ = (3, 2, 2, 1) and R = ((2), (2, 2), (1, 1)). Then ν = ((2), (2, 1), (1)) is a (λ; R)-configuration with $\xi^{(1)}(R) = (2)$ and $\xi^{(2)}(R) = (2, 1)$. By (5.2) the vacancy numbers attached to the parts of ν are as follows: the part 2 of $\nu^{(1)}$ has vacancy number 1, the parts 2 and 1 of $\nu^{(2)}$ both have vacancy number 0, and the part 1 of $\nu^{(3)}$ has vacancy number 0. In addition cc(ν) = 3, $\|R\| = 5$, |P| = 1 and c(ν) = 1.

Define the q-binomial by

$$\begin{bmatrix} m+p \\ m \end{bmatrix} = \frac{(q)_{m+p}}{(q)_m (q)_p}$$

for m, p ∈ N and zero otherwise, where $(q)_m = (1-q)(1-q^2)\cdots(1-q^m)$. The following fermionic or quasi-particle expression of the generalized Kostka polynomials is a variant of [25, Theorem 2.10].

Theorem 5.2. For λ a partition and R a sequence of rectangles

$$K_{\lambda R}(q) = \sum_{\nu \in C(\lambda;R)} q^{c(\nu)} \prod_{k,i \ge 1} \begin{bmatrix} P_i^{(k)}(\nu) + m_i(\nu^{(k)}) \\ m_i(\nu^{(k)}) \end{bmatrix}. \tag{5.3}$$
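The quantities quoted in Example 5.1 can be recomputed directly from (5.2) and from the definitions of cc(ν), ||R|| and |P|. The following Python sketch does this. The function names and the encoding of a rectangle as a (width, height) pair are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations

def Q(i, rho):
    """Q_i(rho): number of cells in the first i columns of the partition rho."""
    return sum(min(i, part) for part in rho)

def vacancy(nu, R, k, i):
    """P_i^(k)(nu) from (5.2); nu[k-1] is nu^(k), R lists rectangles as (width, height)."""
    part = lambda a: nu[a - 1] if 1 <= a <= len(nu) else ()
    xi_k = [width for (width, height) in R if height == k]
    return Q(i, part(k - 1)) - 2 * Q(i, part(k)) + Q(i, part(k + 1)) + Q(i, xi_k)

def column(rho, i):
    """Size of the i-th column of rho (i >= 1)."""
    return sum(1 for part in rho if part >= i)

def cocharge(nu):
    """cc(nu): sum over k, i of alpha_i^(k) * (alpha_i^(k) - alpha_i^(k+1))."""
    width = max((p for rho in nu for p in rho), default=0)
    padded = list(nu) + [()]
    return sum(column(padded[k], i) * (column(padded[k], i) - column(padded[k + 1], i))
               for k in range(len(nu)) for i in range(1, width + 1))

def norm(R):
    """||R||: total size of the pairwise intersections of the rectangles."""
    return sum(min(w1, w2) * min(h1, h2) for (w1, h1), (w2, h2) in combinations(R, 2))

def total_vacancy(nu, R):
    """|P|: the vacancy numbers summed over all parts of all nu^(k)."""
    return sum(vacancy(nu, R, k, part) for k, rho in enumerate(nu, start=1) for part in rho)

def charge(nu, R):
    """c(nu) = ||R|| - cc(nu) - |P|."""
    return norm(R) - cocharge(nu) - total_vacancy(nu, R)

# Data of Example 5.1: R = ((2), (2,2), (1,1)) and nu = ((2), (2,1), (1)).
R = [(2, 1), (2, 2), (1, 2)]
nu = [(2,), (2, 1), (1,)]
vacancies = [[vacancy(nu, R, k, part) for part in rho] for k, rho in enumerate(nu, start=1)]
print(vacancies, cocharge(nu), norm(R), total_vacancy(nu, R), charge(nu, R))
```

Under these conventions the script reproduces cc(ν) = 3, ||R|| = 5, |P| = 1 and c(ν) = 1.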

Expression (5.3) can be reformulated as the generating function over rigged configurations. To this end we need to define certain labelings of the rows of the partitions in a configuration. For this purpose one should view a partition as a multiset of positive integers. A rigged partition is by definition a finite multiset of pairs (i, x), where i is a positive integer and x is a nonnegative integer. The pairs (i, x) are referred to as strings; i is referred to as the length of the string and x as the label or quantum number of the string. A rigged partition is said to be a rigging of the partition ρ if the multiset consisting of the lengths of the strings is the partition ρ. So a rigging of ρ is a labeling of the parts of ρ by nonnegative integers, where one identifies labelings that differ only by permuting labels among equal-sized parts of ρ. A rigging J of the (λ; R)-configuration ν is a sequence of riggings of the partitions $\nu^{(k)}$ such that for every part of $\nu^{(k)}$ of length i and label x,

$$0 \le x \le P_i^{(k)}(\nu). \tag{5.4}$$

The pair (ν, J) is called a rigged configuration. The set of riggings of admissible (λ; R)-configurations is denoted by RC(λ; R). Let $(\nu, J)^{(k)}$ be the kth rigged partition of (ν, J).


A. Schilling, M. Shimozono

A string $(i, x) \in (\nu, J)^{(k)}$ is said to be singular if $x = P_i^{(k)}(\nu)$, that is, its label takes on the maximum value. Observe that the definition of the set RC(λ; R) is completely insensitive to the order of the rectangles in the sequence R. However the notation involving the sequence R is useful when discussing the bijection between LR tableaux and rigged configurations, since the ordering on R is essential in the definition of LR tableaux. Define the cocharge and charge of (ν, J) ∈ RC(λ; R) by

$$cc(\nu, J) = cc(\nu) + |J|, \qquad c(\nu, J) = c(\nu) + |J|, \qquad |J| = \sum_{k,i \ge 1} |J_i^{(k)}|,$$

where $J_i^{(k)}$ is the partition inside the rectangle of height $m_i(\nu^{(k)})$ and width $P_i^{(k)}(\nu)$ given by the labels of the parts of $\nu^{(k)}$ of size i. Since the q-binomial $\begin{bmatrix} m+p \\ m \end{bmatrix}$ is the generating function of partitions with at most m parts each not exceeding p [1, Theorem 3.1], Theorem 5.2 is equivalent to the following theorem.

Theorem 5.3. For λ a partition and R a sequence of rectangles

$$K_{\lambda R}(q) = \sum_{(\nu, J) \in RC(\lambda;R)} q^{c(\nu, J)}. \tag{5.5}$$
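The step from Theorem 5.2 to Theorem 5.3 rests on the fact that the q-binomial counts partitions inside an m × p box. The following Python sketch checks this for small m and p. It represents a polynomial in q as a list of coefficients and uses the standard Pascal-type recurrence for the q-binomial, which is an implementation choice made here and not something taken from the paper.

```python
from itertools import combinations_with_replacement

def qbinomial(m, p):
    """Coefficient list of the q-binomial [m+p choose m]; [] (zero) unless m, p >= 0."""
    if m < 0 or p < 0:
        return []
    if m == 0 or p == 0:
        return [1]
    # Recurrence: [m+p; m] = [m+p-1; m-1] + q^m [m+p-1; m].
    left = qbinomial(m - 1, p)
    right = [0] * m + qbinomial(m, p - 1)
    size = max(len(left), len(right))
    left += [0] * (size - len(left))
    right += [0] * (size - len(right))
    return [a + b for a, b in zip(left, right)]

def box_partitions(m, p):
    """Number of partitions with at most m parts, each part <= p, listed by size."""
    counts = [0] * (m * p + 1)
    # A multiset of m values in {0, ..., p} is a partition with at most m parts <= p.
    for parts in combinations_with_replacement(range(p + 1), m):
        counts[sum(parts)] += 1
    return counts

print(qbinomial(2, 2), box_partitions(2, 2))
```

For m = p = 2 both functions give the coefficients of 1 + q + 2q^2 + q^3 + q^4.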

5.2. Switching between quantum and coquantum numbers. Let $\theta_R : RC(\lambda; R) \to RC(\lambda; R)$ be the involution that complements quantum numbers. More precisely, for (ν, J) ∈ RC(λ; R), replace every string $(i, x) \in (\nu, J)^{(k)}$ by $(i, P_i^{(k)}(\nu) - x)$. The notation here differs from that in [25], in which $\theta_R$ is an involution on $RC(\lambda^t; R^t)$.

Lemma 5.4. $c(\theta_R(\nu, J)) = \|R\| - cc(\nu, J)$ for all (ν, J) ∈ RC(λ; R).

Proof. Let $\theta_R(\nu, J) = (\nu', J')$. It follows immediately from the definitions that ν' = ν. In particular ν' and ν have the same vacancy numbers and |J'| = |P| − |J|. Then

$$c(\theta_R(\nu, J)) = c(\nu', J') = \|R\| - cc(\nu') - |P| + |J'| = \|R\| - cc(\nu) - |J| = \|R\| - cc(\nu, J). \qquad \square$$

There is a bijection $\mathrm{tr}_{RC} : RC(\lambda; R) \to RC(\lambda^t; R^t)$ that has the property

$$cc(\mathrm{tr}_{RC}(\nu, J)) = \|R\| - cc(\nu, J) \tag{5.6}$$

for all (ν, J) ∈ RC(λ; R); see the proof of [26, Prop. 11].


5.3. RC's and level-restriction. Here we introduce the most important new definition in this paper, namely, that of a level-restricted rigged configuration. Say that a partition λ is restricted of level ℓ if $\lambda_1 - \lambda_n \le \ell$, recalling that it is assumed that all partitions have at most n parts, some of which may be zero. Fix a shape λ and a sequence of rectangles R that are all restricted of level ℓ. Define $\ell' = \ell - (\lambda_1 - \lambda_n)$, which is nonnegative by assumption. Set $\lambda' = (\lambda_1 - \lambda_n, \dots, \lambda_{n-1} - \lambda_n)^t$ and denote the set of all column-strict tableaux of shape λ' over the alphabet $\{1, 2, \dots, \lambda_1 - \lambda_n\}$ by CST(λ'). Define a table of modified vacancy numbers depending on ν ∈ C(λ; R) and t ∈ CST(λ') by

$$P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k - \lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1} - \lambda_n} \chi(i \ge \ell' + t_{j,k+1}) \tag{5.7}$$

for all i, k ≥ 1, where χ(S) = 1 if the statement S is true and χ(S) = 0 otherwise, and $t_{j,k}$ is the (j, k)th entry of t. Finally let $x_i^{(k)}$ be the largest part of the partition $J_i^{(k)}$; if $J_i^{(k)}$ is the empty set, $x_i^{(k)} = 0$.

Definition 5.5. Say that (ν, J) ∈ RC(λ; R) is restricted of level ℓ provided that

(1) $\nu_1^{(k)} \le \ell$ for all k.
(2) There exists a tableau t ∈ CST(λ') such that for every i, k ≥ 1, $x_i^{(k)} \le P_i^{(k)}(\nu, t)$.

Let $C_\ell(\lambda; R)$ be the set of all ν ∈ C(λ; R) such that the first condition holds, and denote by $RC_\ell(\lambda; R)$ the set of (ν, J) ∈ RC(λ; R) that are restricted of level ℓ.

Note in particular that the second condition requires that $P_i^{(k)}(\nu, t) \ge 0$ for all i, k ≥ 1.

Example 5.6. Let us consider Definition 5.5 for two classes of shapes λ more closely:

(1) Vacuum case: Let $\lambda = (a^n)$ be rectangular with n rows. Then λ' = ∅ and $P_i^{(k)}(\nu, \emptyset) = P_i^{(k)}(\nu)$ for all i, k ≥ 1, so that the modified vacancy numbers are equal to the vacancy numbers.
(2) Two-corner case: Let $\lambda = (a^\alpha, b^\beta)$ with α + β = n and a > b. Then $\lambda' = (\alpha^{a-b})$ and there is only one tableau t in CST(λ'), namely the Yamanouchi tableau of shape λ'. Since $t_{j,k} = j$ for 1 ≤ k ≤ α we find that

$$P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \delta_{k,\alpha} \max\{i - \ell', 0\}$$

for 1 ≤ i ≤ ℓ and 1 ≤ k < n. We wish to thank Anatol Kirillov for communicating this formula to us [27].

Our main result is the following formula for the level-restricted generalized Kostka polynomial:

Theorem 5.7. Let ℓ be a positive integer. For λ a partition and R a sequence of rectangles both restricted of level ℓ,

$$K^{\ell}_{\lambda R}(q) = \sum_{(\nu, J) \in RC_\ell(\lambda;R)} q^{c(\nu, J)}.$$


The proof of this theorem is given in Sect. 8.

Example 5.8. Consider n = 3, ℓ = 2, λ = (3, 2, 1) and $R = ((2), (1)^4)$. Then the two configurations

ν = ((1, 1, 1), (1)), with vacancy numbers 0, 0, 0 on the parts of $\nu^{(1)}$ and 1 on the part of $\nu^{(2)}$, and
ν = ((2, 1), (1)), with vacancy numbers 1, 2 on the parts 2, 1 of $\nu^{(1)}$ and 0 on the part of $\nu^{(2)}$ (5.8)

are in $C_\ell(\lambda; R)$. The set CST(λ') consists of the two elements

1 1        1 2
2     and  2

Since ℓ' = 0 the three rigged configurations given by ((1, 1, 1), (1)) with all riggings 0, by ((2, 1), (1)) with all riggings 0, and by ((2, 1), (1)) with rigging 1 on the part of size 1 of $\nu^{(1)}$ and riggings 0 elsewhere, are restricted of level 2 with charges 2, 3, 4, respectively. Hence $K^{2}_{\lambda R}(q) = q^2 + q^3 + q^4$. In contrast to this, the Kostka polynomial $K_{\lambda\mu}(q)$ is obtained by summing over both configurations in (5.8) with all possible riggings below the vacancy numbers. This amounts to $K_{\lambda\mu}(q) = q^2 + 2q^3 + 2q^4 + 2q^5 + q^6$.

In Sect. 7 we will use Theorem 5.7 to obtain explicit expressions for type A branching functions. The results suggest that it is also useful to consider the following sets of rigged configurations with imposed minima on the set of riggings. Let ρ ⊂ λ be a partition and $R_\rho = ((1^{\rho^t_1}), (1^{\rho^t_2}), \dots, (1^{\rho^t_n}))$, the sequence of single columns of height $\rho^t_i$. Set $\rho' = (\rho_1 - \rho_n, \dots, \rho_{n-1} - \rho_n)^t$ and

$$M_i^{(k)}(t) = \sum_{j=1}^{\rho_k - \rho_n} \chi(i \le \rho_1 - \rho_n - t_{j,k}) - \sum_{j=1}^{\rho_{k+1} - \rho_n} \chi(i \le \rho_1 - \rho_n - t_{j,k+1})$$

for all t ∈ CST(ρ'). Then define $RC_\ell(\lambda, \rho; R)$ to be the set of all $(\nu, J) \in RC_\ell(\lambda; R_\rho \cup R)$ such that there exists a t ∈ CST(ρ') such that $M_i^{(k)}(t) \le x$ for $(i, x) \in (\nu, J)^{(k)}$ and $M_i^{(k)}(t) \le P_i^{(k)}(\nu)$ for all i, k ≥ 1. Note that the second condition is redundant if i occurs as a part in $\nu^{(k)}$, since by definition $M_i^{(k)}(t) \le x \le P_i^{(k)}(\nu)$ for all $(i, x) \in (\nu, J)^{(k)}$. Conjecture 8.3 asserts that the set $RC_\ell(\lambda, \rho; R)$ corresponds to the set of all level-ℓ restricted Littlewood–Richardson tableaux with a fixed subtableau of shape ρ.
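The unrestricted polynomial quoted in Example 5.8 can be reproduced from the fermionic formula (5.3) by enumerating all admissible configurations. The following Python sketch does this under a few assumptions that are not part of the paper: rectangles are encoded as (width, height) pairs, polynomials in q are stored as Counter objects mapping exponents to coefficients, the sizes of the partitions in a configuration are taken to be the sum of the λ_j with j > k minus the sum in (5.1), and admissibility is tested only for i up to a bound beyond which the vacancy numbers no longer change.

```python
from collections import Counter
from itertools import combinations, product

def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples (nothing if n < 0)."""
    if n == 0:
        yield ()
        return
    if largest is None or largest > n:
        largest = n
    for first in range(largest, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def Q(i, rho):
    return sum(min(i, part) for part in rho)

def qbinomial(m, p):
    """q-binomial [m+p choose m] as {exponent: coefficient}; zero unless m, p >= 0."""
    if m < 0 or p < 0:
        return Counter()
    if m == 0 or p == 0:
        return Counter({0: 1})
    out = Counter(qbinomial(m - 1, p))
    for e, c in qbinomial(m, p - 1).items():
        out[e + m] += c
    return out

def times(a, b):
    out = Counter()
    for (e1, c1), (e2, c2) in product(a.items(), b.items()):
        out[e1 + e2] += c1 * c2
    return out

def kostka(lam, R, n):
    """K_{lam R}(q) computed from (5.3); R is a list of rectangles (width, height)."""
    lam = tuple(lam) + (0,) * (n - len(lam))
    sizes = [sum(lam[k:]) - sum(w * max(h - k, 0) for w, h in R) for k in range(1, n)]
    bound = max([w for w, _ in R] + sizes + [1])   # vacancy numbers are constant beyond this
    norm = sum(min(w1, w2) * min(h1, h2) for (w1, h1), (w2, h2) in combinations(R, 2))
    total = Counter()
    for nu in product(*[list(partitions(s)) for s in sizes]):
        padded = ((),) + nu + ((),)                # padded[k] is nu^(k); empty for k = 0, n
        def P(k, i):
            xi = [w for w, h in R if h == k]
            return Q(i, padded[k - 1]) - 2 * Q(i, padded[k]) + Q(i, padded[k + 1]) + Q(i, xi)
        if any(P(k, i) < 0 for k in range(1, n) for i in range(1, bound + 1)):
            continue                               # configuration is not admissible
        col = lambda k, i: sum(1 for part in padded[k] if part >= i)
        cc = sum(col(k, i) * (col(k, i) - col(k + 1, i))
                 for k in range(1, n) for i in range(1, bound + 1))
        weight = sum(P(k, part) for k in range(1, n) for part in padded[k])
        term = Counter({norm - cc - weight: 1})    # q^{c(nu)}
        for k in range(1, n):
            for i, mult in Counter(padded[k]).items():
                term = times(term, qbinomial(mult, P(k, i)))
        total.update(term)
    return dict(total)

# Example 5.8: n = 3, lambda = (3, 2, 1), R = ((2), (1)^4).
print(kostka((3, 2, 1), [(2, 1)] + [(1, 1)] * 4, 3))
```

With the data of Example 5.8 the only admissible configurations are the two listed in (5.8), and the result is q^2 + 2q^3 + 2q^4 + 2q^5 + q^6.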


6. Fermionic Expression of Level-Restricted Generalized Kostka Polynomials

6.1. Fermionic expression. Similarly to the Kostka polynomial case, one can rewrite the expression of the level-restricted generalized Kostka polynomials of Theorem 5.7 in fermionic form.

Lemma 6.1. For all $\nu \in C_\ell(\lambda; R)$, t ∈ CST(λ') and 1 ≤ k < n, we have $P_i^{(k)}(\nu, t) = 0$ for i ≥ ℓ.

Proof. Since $\nu_1^{(k)} \le \ell$ it follows from [26, (11.2)] that $P_i^{(k)}(\nu) = \lambda_k - \lambda_{k+1}$ for i ≥ ℓ. Since t is over the alphabet $\{1, 2, \dots, \lambda_1 - \lambda_n\}$, every entry of t satisfies $\ell' + t_{j,k} \le \ell$, and this implies for i ≥ ℓ,

$$P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k - \lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1} - \lambda_n} \chi(i \ge \ell' + t_{j,k+1}) = \lambda_k - \lambda_{k+1} - (\lambda_k - \lambda_n) + (\lambda_{k+1} - \lambda_n) = 0. \qquad \square$$

Let SCST(λ') be the set of all nonempty subsets of CST(λ'). Furthermore set $P_i^{(k)}(\nu, S) = \min\{P_i^{(k)}(\nu, t) \mid t \in S\}$ for S ∈ SCST(λ'). Then by inclusion-exclusion the set of allowed riggings for a given configuration $\nu \in C_\ell(\lambda; R)$ is given by

$$\sum_{S \in SCST(\lambda')} (-1)^{|S|+1} \{ J \mid x_i^{(k)} \le P_i^{(k)}(\nu, S) \}.$$

Since the q-binomial $\begin{bmatrix} m+p \\ m \end{bmatrix}$ is the generating function of partitions with at most m parts each not exceeding p and since $P_\ell^{(k)}(\nu, S) = 0$ by Lemma 6.1, the level-ℓ restricted generalized Kostka polynomial has the following fermionic form.

Theorem 6.2.

$$K^{\ell}_{\lambda R}(q) = \sum_{S \in SCST(\lambda')} (-1)^{|S|+1} \sum_{\nu \in C_\ell(\lambda;R)} q^{c(\nu)} \prod_{i=1}^{\ell-1} \prod_{k=1}^{n-1} \begin{bmatrix} m_i(\nu^{(k)}) + P_i^{(k)}(\nu, S) \\ m_i(\nu^{(k)}) \end{bmatrix}.$$
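As a sanity check, the inclusion-exclusion sum of Theorem 6.2 can be evaluated on the data of Example 5.8. The following Python sketch hard-codes the two configurations in (5.8) and the two tableaux of CST(λ') from that example, and computes everything else from (5.2), (5.7) and the definition of the charge. The dictionary encoding of a tableau and all function names are assumptions made for this illustration only.

```python
from collections import Counter
from itertools import combinations, product

n, level = 3, 2
lam = (3, 2, 1)
R = [(2, 1)] + [(1, 1)] * 4                      # rectangles as (width, height)
configs = [((1, 1, 1), (1,)), ((2, 1), (1,))]    # the two elements of C_l(lam; R) in (5.8)
# CST(lam'): tableaux of shape (2, 1) over {1, 2}, stored as {(row j, column k): entry}.
tableaux = [{(1, 1): 1, (2, 1): 2, (1, 2): 1}, {(1, 1): 1, (2, 1): 2, (1, 2): 2}]
lprime = level - (lam[0] - lam[-1])              # l' = l - (lam_1 - lam_n)

def Q(i, rho):
    return sum(min(i, part) for part in rho)

def P(nu, k, i):
    """Vacancy number (5.2)."""
    padded = ((),) + nu + ((),)
    xi = [w for w, h in R if h == k]
    return Q(i, padded[k - 1]) - 2 * Q(i, padded[k]) + Q(i, padded[k + 1]) + Q(i, xi)

def P_mod(nu, t, k, i):
    """Modified vacancy number (5.7) for the tableau t."""
    def shift(col):
        height = lam[col - 1] - lam[-1]          # length of column col of lam'
        return sum(1 for j in range(1, height + 1) if i >= lprime + t[(j, col)])
    return P(nu, k, i) - shift(k) + shift(k + 1)

def charge(nu):
    """c(nu) = ||R|| - cc(nu) - |P|."""
    padded = ((),) + nu + ((),)
    width = max(p for rho in nu for p in rho)
    col = lambda k, i: sum(1 for part in padded[k] if part >= i)
    norm = sum(min(w1, w2) * min(h1, h2) for (w1, h1), (w2, h2) in combinations(R, 2))
    cc = sum(col(k, i) * (col(k, i) - col(k + 1, i))
             for k in range(1, n) for i in range(1, width + 1))
    return norm - cc - sum(P(nu, k, part) for k in range(1, n) for part in padded[k])

def qbinomial(m, p):
    """q-binomial [m+p choose m] as {exponent: coefficient}; zero unless m, p >= 0."""
    if m < 0 or p < 0:
        return Counter()
    if m == 0 or p == 0:
        return Counter({0: 1})
    out = Counter(qbinomial(m - 1, p))
    for e, c in qbinomial(m, p - 1).items():
        out[e + m] += c
    return out

def times(a, b):
    out = Counter()
    for (e1, c1), (e2, c2) in product(a.items(), b.items()):
        out[e1 + e2] += c1 * c2
    return out

K = Counter()
for size in range(1, len(tableaux) + 1):
    for S in combinations(tableaux, size):       # nonempty subsets S of CST(lam')
        for nu in configs:
            term = Counter({charge(nu): (-1) ** (size + 1)})
            for i, k in product(range(1, level), range(1, n)):
                p = min(P_mod(nu, t, k, i) for t in S)
                term = times(term, qbinomial(nu[k - 1].count(i), p))
            for e, c in term.items():
                K[e] += c
result = sorted((e, c) for e, c in K.items() if c)
print(result)
```

The nonzero contributions come from the single-tableau subsets only, and the total agrees with the value q^2 + q^3 + q^4 stated in Example 5.8.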

In Sect. 7 we will derive new expressions for branching functions of type A as limits of the level-restricted generalized Kostka polynomials. To this end we need to reformulate the fermionic formula of Theorem 6.2 in terms of a so-called (m, n)-system. Set

$$m_i^{(a)} = P_i^{(a)}(\nu, S) = P_i^{(a)}(\nu) + f_i^{(a)}(S), \qquad n_i^{(a)} = m_i(\nu^{(a)}),$$

and $L_i^{(a)} = \sum_{j=1}^{L} \chi(i = \mu_j)\chi(a = \eta_j)$ for 1 ≤ i ≤ ℓ and 1 ≤ a ≤ n, which is the number of rectangles in R of shape $(i^a)$. Then

$$\begin{aligned}
&-m_{i-1}^{(a)} + 2m_i^{(a)} - m_{i+1}^{(a)} - n_i^{(a-1)} + 2n_i^{(a)} - n_i^{(a+1)} \\
&\quad= \alpha_i^{(a-1)} - 2\alpha_i^{(a)} + \alpha_i^{(a+1)} - \bigl( \alpha_{i+1}^{(a-1)} - 2\alpha_{i+1}^{(a)} + \alpha_{i+1}^{(a+1)} \bigr) \\
&\qquad+ \sum_{k=1}^{L} \delta_{a,\eta_k} \bigl( -\min\{i-1, \mu_k\} + 2\min\{i, \mu_k\} - \min\{i+1, \mu_k\} \bigr) \\
&\qquad- f_{i-1}^{(a)}(S) + 2f_i^{(a)}(S) - f_{i+1}^{(a)}(S) \\
&\qquad- \bigl( \alpha_i^{(a-1)} - \alpha_{i+1}^{(a-1)} \bigr) + 2\bigl( \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \bigr) - \bigl( \alpha_i^{(a+1)} - \alpha_{i+1}^{(a+1)} \bigr) \\
&\quad= L_i^{(a)} - f_{i-1}^{(a)}(S) + 2f_i^{(a)}(S) - f_{i+1}^{(a)}(S).
\end{aligned}$$

At this stage it is convenient to introduce vector notation. For a matrix $v_i^{(a)}$ with indices 1 ≤ i ≤ ℓ − 1 and 1 ≤ a ≤ n − 1 define

$$v = \sum_{i=1}^{\ell-1} \sum_{a=1}^{n-1} v_i^{(a)} \, e_i \otimes e_a,$$

where $e_i$ and $e_a$ are the canonical basis vectors of $\mathbb{Z}^{\ell-1}$ and $\mathbb{Z}^{n-1}$, respectively. Define

$$u_i^{(a)}(S) = -f_{i-1}^{(a)}(S) + 2f_i^{(a)}(S) - f_{i+1}^{(a)}(S),$$

which in vector notation reads

$$u(S) = (C \otimes I) f(S) + \sum_{a=1}^{n-1} (\lambda_a - \lambda_{a+1}) \, e_{\ell-1} \otimes e_a, \tag{6.1}$$

where C is the Cartan matrix of type A and I is the identity matrix. Since $n_i^{(0)} = n_i^{(n)} = m_0^{(k)} = 0$ and $m_\ell^{(k)} = 0$ by Lemma 6.1 it follows that

$$(C \otimes I) m + (I \otimes C) n = L + u(S). \tag{6.2}$$

In terms of the new variables the condition (5.1) on $|\nu^{(a)}|$ becomes

$$n_\ell^{(a)} = -e_{\ell-1} \otimes e_a \,(C^{-1} \otimes I)\, n - \frac{1}{\ell} \sum_{j=1}^{a} \lambda_j + \frac{1}{\ell} \sum_{i=1}^{\ell} \sum_{b=1}^{n} i \min\{a, b\} L_i^{(b)}, \tag{6.3}$$

where we used $C^{-1}_{ij} = \min\{i, j\} - ij/\ell$ if C is (ℓ − 1) × (ℓ − 1)-dimensional and $\sum_{b=1}^{n} \sum_{i=1}^{\ell} i b L_i^{(b)} = |\lambda|$.

Lemma 6.3. In terms of the above (m, n)-system

$$c(\nu) = \tfrac{1}{2} m (C \otimes C^{-1}) m - m (I \otimes C^{-1}) u(S) + \tfrac{1}{2} u(S) (C^{-1} \otimes C^{-1}) u(S) + g(R, \lambda), \tag{6.4}$$

where

$$g(R, \lambda) = \|R\| - \frac{1}{2} \sum_{a,b=1}^{n-1} \sum_{j=1}^{\ell} C^{-1}_{ab} \overline{L}_j^{(a)} L_j^{(b)} + \frac{1}{2\ell} \sum_{j=1}^{n} \Bigl( \lambda_j - \frac{1}{n} |\lambda| \Bigr)^2$$

and $\overline{L}_i^{(a)} = \sum_{j=1}^{\ell} \min\{i, j\} L_j^{(a)}$.

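Proof. By definition $c(\nu) = \|R\| - cc(\nu) - |P|$. Note that

$$\begin{aligned}
|P| &= \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} m_i(\nu^{(k)}) P_i^{(k)}(\nu) \\
&= \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} \bigl( \alpha_i^{(k)} - \alpha_{i+1}^{(k)} \bigr) \Bigl( \sum_{j=1}^{i} \bigl( \alpha_j^{(k-1)} - 2\alpha_j^{(k)} + \alpha_j^{(k+1)} \bigr) + \overline{L}_i^{(k)} \Bigr) \\
&= -2cc(\nu) + \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} n_i^{(k)} \overline{L}_i^{(k)}.
\end{aligned}$$

Hence eliminating cc(ν) in favor of |P| yields

$$c(\nu) = \|R\| - \frac{1}{2} |P| - \frac{1}{2} \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} n_i^{(k)} \overline{L}_i^{(k)}.$$

On the other hand, using $n_i^{(k)} = m_i(\nu^{(k)})$ and $P_\ell^{(k)}(\nu) = \lambda_k - \lambda_{k+1}$,

$$|P| = n (I \otimes I) P(\nu) + \sum_{k=1}^{n-1} n_\ell^{(k)} (\lambda_k - \lambda_{k+1})$$

so that

$$c(\nu) = \|R\| - \frac{1}{2} n (I \otimes I) \bigl( P(\nu) + \overline{L} \bigr) - \frac{1}{2} \sum_{k=1}^{n-1} n_\ell^{(k)} \bigl( \lambda_k - \lambda_{k+1} + \overline{L}_\ell^{(k)} \bigr). \tag{6.5}$$

Eliminating n in favor of m using (6.2) and substituting P(ν) = m − f(S) yields

$$-\frac{1}{2} n (I \otimes I) \bigl( P(\nu) + \overline{L} \bigr) = \frac{1}{2} m \bigl\{ C \otimes C^{-1} \bigl( m + \overline{L} - f(S) \bigr) - I \otimes C^{-1} \bigl( L + u(S) \bigr) \bigr\} - \frac{1}{2} \bigl( L + u(S) \bigr) (I \otimes C^{-1}) \bigl( \overline{L} - f(S) \bigr).$$

Similarly, replacing n by m in (6.3) we obtain

$$n_\ell^{(a)} = e_{\ell-1} \otimes e_a \bigl( I \otimes C^{-1} m - C^{-1} \otimes C^{-1} u(S) \bigr) - \frac{1}{\ell} \sum_{j=1}^{a} \Bigl( \lambda_j - \frac{1}{n} |\lambda| \Bigr) + \sum_{b=1}^{n-1} C^{-1}_{ab} L_\ell^{(b)}. \tag{6.6}$$

Inserting these equations into (6.5), trading f(S) for u(S) by (6.1) and using

$$(C \otimes I) \overline{L} - L - \sum_{a=1}^{n-1} e_{\ell-1} \otimes e_a \, \overline{L}_\ell^{(a)} = 0$$

results in the claim of the lemma. $\square$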
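The closed form $C^{-1}_{ij} = \min\{i, j\} - ij/\ell$ used in (6.3) is easy to confirm by machine for small ℓ. The following Python sketch multiplies the (ℓ − 1) × (ℓ − 1) Cartan matrix of type A by the claimed inverse in exact rational arithmetic. The helper names are illustrative and the code is only a numerical check, not part of the proof.

```python
from fractions import Fraction

def cartan(size):
    """Cartan matrix of type A_size: 2 on the diagonal, -1 next to it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(size)]
            for i in range(size)]

def claimed_inverse(level):
    """Entries min{i, j} - i*j/level for 1 <= i, j <= level - 1."""
    return [[Fraction(min(i, j)) - Fraction(i * j, level) for j in range(1, level)]
            for i in range(1, level)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def is_identity(matrix):
    return all(matrix[i][j] == (1 if i == j else 0)
               for i in range(len(matrix)) for j in range(len(matrix)))

for level in range(2, 9):
    assert is_identity(matmul(cartan(level - 1), claimed_inverse(level)))

# The last row of the inverse has entries j/level, which is the fact used to
# rewrite the sum of i * n_i over i < level as a matrix product in (6.3).
print(claimed_inverse(5)[-1])
```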

As a corollary of Lemma 6.3 and Theorem 6.2 we obtain the following expression for the level-restricted generalized Kostka polynomial

$$K^{\ell}_{\lambda R}(q) = q^{g(R,\lambda)} \sum_{S \in SCST(\lambda')} (-1)^{|S|+1} q^{\frac{1}{2} u(S) C^{-1} \otimes C^{-1} u(S)} \times \sum_{m} q^{\frac{1}{2} m C \otimes C^{-1} m - m I \otimes C^{-1} u(S)} \begin{bmatrix} m+n \\ m \end{bmatrix}, \tag{6.7}$$

where n is determined by (6.2), the sum over m is such that

$$e_{\ell-1} \otimes e_a \bigl( I \otimes C^{-1} m - C^{-1} \otimes C^{-1} u(S) \bigr) - \frac{1}{\ell} \sum_{j=1}^{a} \Bigl( \lambda_j - \frac{1}{n} |\lambda| \Bigr) + \sum_{b=1}^{n-1} C^{-1}_{ab} L_\ell^{(b)} \in \mathbb{Z}$$

for all 1 ≤ a ≤ n − 1, and

$$\begin{bmatrix} m+n \\ m \end{bmatrix} = \prod_{i=1}^{\ell-1} \prod_{k=1}^{n-1} \begin{bmatrix} m_i^{(k)} + n_i^{(k)} \\ m_i^{(k)} \end{bmatrix}.$$

Now consider the second case of Example 5.6, namely $\lambda = (a^\alpha, b^\beta)$ with a > b and α + β = n. Then SCST(λ') only contains the element S = {t}, where t is the Yamanouchi tableau of shape λ', and $u(S) = e_{\ell'} \otimes e_\alpha$. In the vacuum case, that is, when $\lambda = ((|\lambda|/n)^n)$, the set SCST(λ') only contains S = {∅} and u(S) = f(S) = 0. In this case (6.7) simplifies to

$$K^{\ell}_{\lambda R}(q) = q^{g(R,\lambda)} \sum_{m} q^{\frac{1}{2} m C \otimes C^{-1} m} \begin{bmatrix} m+n \\ m \end{bmatrix}.$$

When R is a sequence of single boxes this proves [8, Theorem 1]^1. When R is a sequence of single rows or single columns this settles [12, Conjecture 5.7].

6.2. Polynomial Rogers–Ramanujan-type identities. Let W be the Weyl group of $sl_n$, $M = \{\beta \in \mathbb{Z}^n \mid \sum_{i=1}^{n} \beta_i = 0\}$ be the root lattice, ρ the half-sum of the positive roots, and (·|·) the standard symmetric bilinear form. Recall the energy function (3.9). It was shown in [31] that

$$K^{\ell}_{\lambda R}(q) = \sum_{\tau \in W} \sum_{\beta \in M} \sum_{\substack{b \in P_R \\ \mathrm{wt}(b) = -\rho + \tau^{-1}(\lambda - (\ell+n)\beta + \rho)}} (-1)^{\tau} q^{-\frac{1}{2}(\ell+n)(\beta|\beta) + (\lambda+\rho|\beta) + E(b)}. \tag{6.8}$$

Equating (6.7) and (6.8) gives rise to polynomial Rogers–Ramanujan-type identities. For the vacuum case, that is, when the partition λ is rectangular with n rows, this proves [33, Eq. (9.2)]^2.

1 We believe that the proof given in [8] is incomplete.
2 The definition of level-restricted path as given in [33, p. 394] only works when R (or µ therein) consists of single rows; otherwise the description of Sect. 3.7 should be used.


7. New Expressions for Type A Branching Functions

The coset branching functions $b^{\Lambda}_{\Lambda'\Lambda''}$ labeled by the three weights Λ, Λ', Λ'' have a natural finitization in terms of (Λ' + Λ'')-restricted crystals. For certain triples of weights these can be reformulated in terms of level-restricted paths, which in turn yield an expression of the type A branching functions as a limit of the level-restricted generalized Kostka polynomials. Together with the results of the last section this implies new fermionic expressions for type A branching functions at certain triples of weights.

7.1. Branching function in terms of paths. Let $\Lambda, \Lambda', \Lambda'' \in P_{cl}^{+}$ be dominant integral weights of levels ℓ, ℓ', and ℓ'' respectively, where ℓ = ℓ' + ℓ''. The branching function $b^{\Lambda}_{\Lambda'\Lambda''}(z)$ is the formal power series defined by

$$b^{\Lambda}_{\Lambda'\Lambda''}(z) = \sum_{m \ge 0} z^m \, c^{\mathrm{af}(\Lambda) - m\delta}_{\mathrm{af}(\Lambda'),\mathrm{af}(\Lambda'')},$$

where $c^{\mathrm{af}(\Lambda) - m\delta}_{\mathrm{af}(\Lambda'),\mathrm{af}(\Lambda'')}$ is the multiplicity of the irreducible integrable highest weight $U_q(\widehat{sl}_n)$-module V(af(Λ) − mδ) in the tensor product V(af(Λ')) ⊗ V(af(Λ'')). The desired multiplicity is equal to the number of $\widehat{sl}_n$-highest weight vectors of weight af(Λ) − mδ in the tensor product B(af(Λ')) ⊗ B(af(Λ'')), that is, the number of elements b' ⊗ b'' ∈ B(af(Λ')) ⊗ B(af(Λ'')) such that wt(b' ⊗ b'') = af(Λ) − mδ and $\epsilon_i(b' \otimes b'') = 0$ for all i ∈ I. By (3.5), $b'' = u_{\Lambda''}$, b' is Λ''-restricted, and wt(b') = af(Λ − Λ'') − mδ.

Let B be a perfect crystal of level ℓ'. Using the isomorphism B(Λ') ≅ P(Λ', B) let $b' = b_1 \otimes b_2 \otimes \cdots$ and let $\bar{b} \in P(\Lambda', B)$ be the ground state path. Suppose N is such that for all j > N, $b_j = \bar{b}_j$. Write $b = b_1 \otimes \cdots \otimes b_N$. In type $A^{(1)}_{n-1}$ the period of the ground state path $\bar{b}$ always divides n. Choose N to be a multiple of n, so that $b' = b \otimes \bar{b}$ and $\bar{b}_{N+1} = \bar{b}_1$. Then the above desired highest weight vectors have the form $b' \otimes b'' = (b \otimes u_{\Lambda'}) \otimes u_{\Lambda''} \in B^{\otimes N} \otimes B(\mathrm{af}(\Lambda')) \otimes B(\mathrm{af}(\Lambda''))$. But there is an embedding B(af(Λ' + Λ'')) ↦ B(af(Λ')) ⊗ B(af(Λ'')) defined by $u_{\Lambda'+\Lambda''} \to u_{\Lambda'} \otimes u_{\Lambda''}$. With this rephrasing of the conditions on b and taking limits, we have

$$b^{\Lambda}_{\Lambda'\Lambda''}(z) = \lim_{\substack{N \to \infty \\ N \in n\mathbb{Z}}} z^{-E_N(\bar{b}_1 \otimes \cdots \otimes \bar{b}_N)} \sum_{b \in H(\Lambda' + \Lambda'', B^{\otimes N}, \Lambda)} z^{E_N(b)}, \tag{7.1}$$

where EN : B ⊗N → Z is given by EN (b) = E(b ⊗ bN+1 ) = E(b ⊗ b1 ) and E is the energy function on finite paths. Our goal is to express (7.1) in terms of level-restricted generalized Kostka polynomials. We find that this is possible for certain triples of weights. Using the results of Sect. 6 this provides explicit formulas for the branching functions. 7.2. Reduction to level-restricted paths. The first step in the transformation of (7.1) is to replace the condition of ( + )-restrictedness by level * restrictedness. This is achieved at the cost of appending a fixed inhomogeneous path. Consider any tensor product B of perfect crystals each of which has level at most * (the level of ), such that there is an element y ∈ H(* 0 , B , ). We indicate how such a B and y can be constructed explicitly. Let λ be the partition with strictly


less than n rows with hi , columns of length i for 1 ≤ i ≤ n − 1. Let Yλ be the Yamanouchi tableau of shape λ. Then any factorization (in the plactic monoid) of Yλ into a sequence of rectangular tableaux, yields such a B and y . Example 7.1. Let n = 6, * = 5, = 0 + 22 + 3 + 4 . Then λ = (4, 4, 2, 1) (its transpose is λt = (4, 3, 2, 2)) and 1 1 1 1 Yλ =

2 2 2 2 3 3

.

4 One way is to factorize into single columns: B = B 2,1 ⊗ B 2,1 ⊗ B 3,1 ⊗ B 4,1 and y = y4 ⊗ y3 ⊗ y2 ⊗ y1 , where each yj is an sln highest weight vector, namely, the j th column of Yλ . Another way is to factorize into the minimum number of rectangles by slicing Yλ vertically. This yields B = B 2,2 ⊗ B 3,1 ⊗ B 4,1 ; again the factors of y = y3 ⊗ y2 ⊗ y1 are the sln highest weight vectors, namely,

y3 =

1 1 2 2

1 ,

y2 = 2 , 3

1 2 y1 = . 3 4

Consider also a tensor product B of perfect crystals such that there is an element ∈ H(* 0 , B , ). Then y = y ⊗ y ∈ H(*0 , B ⊗ B , + ). Instead of b ∈ H( + , B ⊗N , ), we work with b ⊗ y, where b ⊗ y is restricted of level *. This trick doesn’t help unless one can recover the correct energy function directly from b ⊗ y. Let p be the first N steps of the ground state path b ∈ P( , B). Define the normalized energy function on B ⊗N by E(b) = E(b ⊗ y ) − E(p ⊗ y ). A priori it depends on , B, and y . The energy function occurring in the branching function is E (b) = E(b ⊗ b1 ) − E(p ⊗ b1 ). y

Lemma 7.2. E = E . Proof. It suffices to show that the function B ⊗N → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is constant. Using the definition (3.9) and the fact that b is homogeneous of length N, we have E(b ⊗ y ) = E(b) + N E(bN ⊗ y ) − (N − 1)E(y ). Similarly E(b ⊗ b1 ) = E(b) + N E(bN ⊗ b1 ). Therefore E(b ⊗ y ) − E(b ⊗ b1 ) = N(E(bN ⊗ y ) − E(bN ⊗ b1 )) − (N − 1)E(y ). Thus it suffices to show that the function B → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is a constant function. Suppose first that i (b ) > hi , for some 1 ≤ i ≤ n − 1. By the construction of y and b1 , φi (y ) = hi , = φi (b1 ) for 1 ≤ i ≤ n − 1, since φ(b1 ) = . Then ei (b ⊗ y ) = ei (b ) ⊗ y and ei (b ⊗ b1 ) = ei (b ) ⊗ b1 by (3.4). Passing from b to ei (b ) repeatedly, the values of the energy functions are constant, so it may be assumed that b ⊗ y is a sln highest weight vector; in particular, i (b ) ≤ hi , for all 1 ≤ i ≤ n − 1.


Next suppose that 0 (b ) > h0 , . Now φ0 (y ) = 0 and φ0 (b1 ) = h0 , . By (3.4) e0 (b ⊗ b1 ) = e0 (b ) ⊗ b1 and e0 (b ⊗ y ) = e0 (b ) ⊗ y . By (3.8) and the fact that the local isomorphism on B ⊗B is the identity, we have E(e0 (b ⊗b1 )) = E(b ⊗b1 )−1. To show that E(e0 (b ⊗y )) = E(b ⊗y )−1 we check the conditions of Lemma 3.2. By (3.1) 0 (y ) = φ0 (y ) − h0 , wt(y ) = 0 − h0 , − * 0 = * − h0 , . Also by (3.5), since φ0 (y ) = 0, we have 0 (b ⊗ y ) = 0 (b ) + 0 (y ) > h0 , + * − h0 , = * . Let z ⊗ x be the image of b ⊗ y under an arbitrary composition of local isomorphisms. Since b ⊗ y is an sln highest weight vector, so is z ⊗ x and x. Now x is the sln -highest weight vector in a perfect crystal of level at most * , so φ0 (x) = 0 and 0 (x) ≤ * . But * < 0 (b ⊗ y ) = 0 (z ⊗ x) = 0 (z) + 0 (x) so that 0 (z) > 0. By (3.4) e0 (z ⊗ x) = e0 (z) ⊗ x. So E(e0 (b ⊗ y )) = E(b ⊗ y ) − 1 by Lemma 3.2. ) ≤ h , . But then ) ≤ (b (b By induction we may now assume that 0 0 i i i hi , , or c , (b ) ≤ c , = * . Since b ∈ B and B is a perfect crystal of level * , b must be the unique element of B such that (b ) = . Thus the function B → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is constant on B if it is constant on the singleton set { −1 ( )}, which it obviously is. " # 7.3. Explicit ground state energy. To go further, an explicit formula for the value E(p ⊗ y ) is required. This is achieved in (7.2). The derivation makes use of the following explicit construction of the local isomorphism. Theorem 7.3. Let B = B k,* be a perfect crystal of level *, , ∈ (Pcl+ )* , B a perfect crystal of level * ≤ *, and b ∈ H( , B , ). Let x ∈ B (resp. y ∈ B) be the unique element such that (x) = (resp. (y) = ). Then under the local isomorphism B ⊗ B ∼ = ψ k (b) ⊗ y. = B ⊗ B, we have x ⊗ b ∼ The proof requires several technical lemmas and is given in the next section. Example 7.4. Let n = 5, * = 4, k = 2, = 0 +1 +3 +4 , = 0 +1 +2 + 4 , * = 2, B = B 2,2 . 
Here the set H( , B , ) consists of two elements, namely, 1 2 4 5

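and 1 4 / 2 5, where a tableau is written row by row with rows separated by a slash (so the first tableau is 1 2 / 4 5 in the same notation).

Let b be the second tableau. The theorem says that

(1 1 2 3 / 2 3 4 5) ⊗ (1 4 / 2 5) ≅ (1 3 / 2 4) ⊗ (1 1 2 4 / 2 3 5 5).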

Proposition 7.5. Let ∈ (Pcl+ )* , B = B k,* a perfect crystal of level *, b ∈ P(, B) the ground state path, p a finite path (say of length N , where N is a multiple of n) such that p ⊗ b = b, B the tensor product of perfect crystals each of level at most *, and y ∈ H(*0 , B , ). Let p be the path of length N such that p ⊗ b = b , where b ∈ P(*0 , B) is the ground state path. Then under the composition of local isomorphisms B ⊗N ⊗ B ∼ = y ⊗ p . = B ⊗ B ⊗N we have p ⊗ y ∼ Proof. Induct on the length of the path y. Suppose B = B1 ⊗ B2 and y = y1 ⊗ y2 , where yj ∈ Bj and Bj is a perfect crystal. Let = − wt(y1 ). By the definitions y2 ∈ H(*0 , B2 , ). By induction the first N steps p of the ground state path of


∼ y2 ⊗ p under the composition of local isomorphisms P( , B) satisfy p ⊗ y2 = ⊗N ⊗N ∼ B ⊗ B2 = B2 ⊗ B . Tensoring on the left with y1 , it remains to show that p ⊗ y1 ∼ = y1 ⊗ p under the composition of local isomorphisms B ⊗N ⊗ B1 ∼ = B1 ⊗ B ⊗N . Now ∈ B are the unique elements such that (p ) = and (p ) = . pN ∈ B and pN N N . Now p ⊗ y ∈ H( , B ⊗ Applying Theorem 7.3 we obtain pN ⊗ y1 ∼ = ψ k (y1 ) ⊗ pN N 1 ∈ H( , B ⊗ B, φ(p )). This implies that ψ k (y ) ∈ B1 , φ(pN )) so that ψ k (y1 ) ⊗ pN 1 N 1 ) and (p H(φ(pN ), B1 , φ(pN )). Now by definition (pN−1 ) = φ(pN N−1 ) = φ(pN ). . Continuing in Applying Theorem 7.3 we obtain pN−1 ⊗ ψ k (y1 ) ∼ = ψ 2k (y1 ) ⊗ pN−1 j k (j +1)k ∼ (y1 ) ⊗ pN−j for 0 ≤ j ≤ N − 1. this manner it follows that pN−j ⊗ ψ (y1 ) = ψ Composing these local isomorphisms it follows that p ⊗ y1 ∼ = ψ Nk (y1 ) ⊗ p . But ψ N is the identity since the order of ψ divides n which divides N . Therefore p ⊗ y1 ∼ = y1 ⊗ p under the composition of local isomorphisms and we are done. " # In the notation in the previous section, E(p ⊗ y ) = E(y ⊗ p ), where p is the first N steps of the ground state path of P(* 0 , B). Write N = nM and B = B k,* . Then using the generalized cocyclage one may calculate explicitly the generalized charge of the LR tableau corresponding to the level * restricted (and hence classically restricted) path y ⊗ p . Let |y | denote the total number of cells in the tableaux comprising y . Then kM . (7.2) E(y ⊗ p ) = E(y ) + |y |kM + n* 2 Example 7.6. Let n = 5, * = 3, = 0 + 3 + 4 , k = 2 and M = 1. Then p is the path 4 4 4 5 5 5

⊗ (2 2 2 / 3 3 3) ⊗ (1 1 1 / 5 5 5) ⊗ (3 3 3 / 4 4 4) ⊗ (1 1 1 / 2 2 2),

where each factor is a tableau with two rows of length three, written row by row with rows separated by a slash; in this notation the first factor is (4 4 4 / 5 5 5). The element y can be taken to be the tensor product of single columns (1 / 2 / 3) ⊗ (1 / 2 / 3 / 4).

Let λ = (8, 8, 8, 7, 6). Then the tableau Q ∈ LR(λ; R) (resp. Y) that records the path y ⊗ p (resp. y) is given by

Q =
1 1 1 5 5 5 11 15
2 2 2 7 7 7 12 16
3 3 3 8 8 8 13 17
4 4 4 9 9 9 14
6 6 6 10 10 10

Y =
1 5
2 6
3 7
4

with R = ((3, 3), (3, 3), (3, 3), (3, 3), (3, 3), (1, 1, 1, 1), (1, 1, 1)) and subalphabets {1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}, {11, 12, 13, 14}, {15, 16, 17}. The generalized charge


$c_R(Q)$ is equal to the energy E(y ⊗ p) [37, Theorem 23]. Here the widest rectangle in the path is of width ℓ'. For any tableau T ∈ LR(ρ; R) for some partition ρ, define $V(T) = P((w_0^R T_e)(w_0^R T_w))$, where P is the Schensted P tableau, $w_0^R$ is the automorphism of conjugation that reverses each of the subalphabets, and $T_w$ and $T_e$ are the west and east subtableaux obtained by slicing T between the ℓ'th and (ℓ' + 1)th columns. It can be shown that there is a composition of $|T_e|$ generalized R-cocyclages leading from T to V(T), where $|T_e|$ denotes the number of cells in $T_e$. It follows from the ideas in [35, Sect. 3] and the intrinsic characterization of $c_R$ in [35, Theorem 21] that

$$c_R(T) = c_R(V(T)) + |T_e|. \tag{7.3}$$

For the above tableau Q we have

Q_w =
1 1 1
2 2 2
3 3 3
4 4 4
6 6 6

w_0^R Q_w =
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5

and

Q_e =
5 5 5 11 15
7 7 7 12 16
8 8 8 13 17
9 9 9 14
10 10 10

w_0^R Q_e =
6 6 6 11 15
7 7 7 12 16
8 8 8 13 17
9 9 9 14
10 10 10

Then

V(Q) =
1 1 1 11 15
2 2 2 12 16
3 3 3 13 17
4 4 4 14
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10

and

V(V(Q)) =
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10
11 15
12 16
13 17
14


We have $c_R(V(V(Q))) = c_R(Y) = E(y)$ by [35, Theorem 21] and $c_R(Q) = c_R(V(Q)) + |Q_e| = c_R(V(Q)) + \ell' n + |Y|$, and $c_R(V(Q)) = c_R(V(V(Q))) + |Y|$ by (7.3). This implies $c_R(Q) = \ell' n + E(y) + 2|Y|$.

7.4. Proof of Theorem 7.3. The proof of Theorem 7.3 requires several lemmas. Words of length L in the alphabet {1, 2, . . . , n} are identified with the elements of the crystal basis of the L-fold tensor product $(B^{1,1})^{\otimes L}$.

Lemma 7.7. Let u and v be words such that uv is an $A_{n-1}$ highest weight vector. Then v is an $A_{n-1}$ highest weight vector and $\epsilon_j(u) \le \varphi_j(v)$ for all 1 ≤ j ≤ n − 1.

Proof. Let uv be an $A_{n-1}$ highest weight vector and 1 ≤ j ≤ n − 1. By (3.5)

$$0 = \epsilon_j(uv) = \epsilon_j(v) + \max\{0, \epsilon_j(u) - \varphi_j(v)\}.$$

Since both summands on the right-hand side are nonnegative and sum to zero they must both be zero. $\square$

Lemma 7.8. Let w be a word in the alphabet {1, 2} and $\hat{w}$ a word obtained by removing a letter i of w. Then

(1) $\epsilon_1(\hat{w}) \le \epsilon_1(w) + 1$ with equality only if i = 1.
(2) $\epsilon_1(w) \le \epsilon_1(\hat{w}) + 1$ with equality only if i = 2.

Proof. Write w = uiv and $\hat{w} = uv$. By (3.5)

$$\epsilon_1(ui) = \epsilon_1(i) + \max\{0, \epsilon_1(u) - \varphi_1(i)\} = \begin{cases} \max\{0, \epsilon_1(u) - 1\} & \text{if } i = 1, \\ 1 + \epsilon_1(u) & \text{if } i = 2. \end{cases} \tag{7.4}$$

In particular $\epsilon_1(ui) \ge \epsilon_1(u) - 1$. Applying (3.5) to both $\epsilon_1(uv)$ and $\epsilon_1(uiv)$ and subtracting, we obtain

$$\epsilon_1(uv) - \epsilon_1(uiv) = \max\{0, \epsilon_1(u) - \varphi_1(v)\} - \max\{0, \epsilon_1(ui) - \varphi_1(v)\} \le \max\{0, \epsilon_1(u) - \varphi_1(v)\} - \max\{0, \epsilon_1(u) - 1 - \varphi_1(v)\} \le 1.$$

Moreover if $\epsilon_1(uv) - \epsilon_1(uiv) = 1$ then all of the inequalities are equalities. In particular it must be the case that $\epsilon_1(ui) = \epsilon_1(u) - 1$, which by (7.4) implies that i = 1, proving the first assertion. On the other hand, (7.4) also implies $\epsilon_1(ui) \le 1 + \epsilon_1(u)$. Subtracting $\epsilon_1(uv)$ from $\epsilon_1(uiv)$ and computing as before, the second part follows. $\square$

Say that w is an almost highest weight vector with defect i if there is an index 1 ≤ i ≤ n − 1 such that $\epsilon_j(w) = \delta_{ij}$ for 1 ≤ j ≤ n − 1, and also $\epsilon_{i-1}(e_i(w)) = 0$ if i > 1.

Lemma 7.9. Let w be an almost highest weight vector with defect i for 1 ≤ i ≤ n − 1. Then $e_i(w)$ is either an $A_{n-1}$ highest weight vector or an almost highest weight vector of defect i + 1.
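Lemma 7.8, which is used repeatedly in the arguments that follow, can be sanity-checked by brute force. The following Python sketch computes ε_1 of a word in the alphabet {1, 2} by applying the two cases of (7.4) letter by letter, starting from the value 0 for the empty word (an assumption made here), and then tests both inequalities and both equality conditions for every word up to a given length. The function names are illustrative.

```python
from itertools import product

def eps(word):
    """epsilon_1 of a word in {1, 2}, built one letter at a time from (7.4)."""
    value = 0
    for letter in word:
        # Appending 1 cancels one unpaired 2 if there is one; appending 2 adds one.
        value = max(0, value - 1) if letter == 1 else value + 1
    return value

def check_lemma_7_8(max_length):
    """Check both parts of Lemma 7.8 for every word of length <= max_length."""
    for length in range(1, max_length + 1):
        for w in product((1, 2), repeat=length):
            for pos, letter in enumerate(w):
                w_hat = w[:pos] + w[pos + 1:]        # remove the letter at position pos
                # Part (1): eps(w_hat) <= eps(w) + 1, with equality only if letter == 1.
                if eps(w_hat) > eps(w) + 1 or (eps(w_hat) == eps(w) + 1 and letter != 1):
                    return False
                # Part (2): eps(w) <= eps(w_hat) + 1, with equality only if letter == 2.
                if eps(w) > eps(w_hat) + 1 or (eps(w) == eps(w_hat) + 1 and letter != 2):
                    return False
    return True

print(check_lemma_7_8(8))
```

A finite check of this kind does not replace the proof, but it guards against sign or indexing slips when the lemma is applied to the alphabets {i − 1, i} and {i + 1, i + 2}.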


Proof. For j ∉ {i − 1, i, i + 1}, the restrictions of the words w and $e_i(w)$ to the alphabet {j, j + 1} are identical, so that $\epsilon_j(e_i(w)) = \epsilon_j(w) = 0$ by the definition of an almost highest weight vector. Also $\epsilon_i(w) = 1$ implies that $\epsilon_i(e_i(w)) = 0$. Again by the definition of an almost highest weight vector, $\epsilon_{i-1}(e_i(w)) = 0$. If i = n − 1 we have shown that $e_i(w)$ is an $A_{n-1}$ highest weight vector. So it may be assumed that i < n − 1. It is enough to show that one of the two following possibilities occurs.

(1) $\epsilon_{i+1}(e_i(w)) = 0$.
(2) $\epsilon_{i+1}(e_i(w)) = 1$ and $\epsilon_i(e_{i+1} e_i(w)) = 0$.

Recall that $e_i(w)$ is obtained from w by changing an i + 1 into an i. Write w = u(i + 1)v such that $e_i(w) = uiv$. In this notation we have $\varphi_i(v) = 0$ and $\epsilon_i(u) = 0$. By Lemma 7.8 (1) with {1, 2} replaced by {i + 1, i + 2} and using that w is an almost highest weight vector of defect i, we have $\epsilon_{i+1}(e_i(w)) \le \epsilon_{i+1}(w) + 1 = 1$. It is now enough to assume that $\epsilon_{i+1}(e_i(w)) = 1$ and to show that $\epsilon_i(e_{i+1} e_i(w)) = 0$. By (3.5)

$$0 = \epsilon_{i+1}(w) = \epsilon_{i+1}(u(i+1)v) = \epsilon_{i+1}(v) + \max\{0, \epsilon_{i+1}(u) - \varphi_{i+1}((i+1)v)\}.$$

In particular $\epsilon_{i+1}(v) = 0$. Hence $e_{i+1}(e_i(w)) = e_{i+1}(uiv) = e_{i+1}(u)iv$. Similar computations starting with $\epsilon_i(w) = 1$, and which use the fact that $\epsilon_i(u) = \varphi_i(v) = 0$, yield $\epsilon_i(v) = 0$. We have

$$\epsilon_i(e_{i+1} e_i(w)) = \epsilon_i(e_{i+1}(u)iv) = \epsilon_i(iv) + \max\{0, \epsilon_i(e_{i+1}(u)) - \varphi_i(iv)\} = 0 + \max\{0, \epsilon_i(e_{i+1}(u)) - 1\}.$$

But $\epsilon_i(u) = 0$ and in passing from u to $e_{i+1}(u)$ an i + 2 is changed into an i + 1. By Lemma 7.8 (2) applied to the restriction of u to the alphabet {i, i + 1}, we have $\epsilon_i(e_{i+1}(u)) \le \epsilon_i(u) + 1 = 1$. It follows that $\epsilon_i(e_{i+1} e_i(w)) = 0$, and that $e_i(w)$ is an almost highest weight vector of defect i + 1. $\square$

Lemma 7.10. Suppose w is an $A_{n-1}$ highest weight vector and $\hat{w}$ is a word obtained by removing a letter (say i) from w. Then there is an index r such that i ≤ r ≤ n and $e_{r-1} e_{r-2} \cdots e_i(\hat{w})$ is an $A_{n-1}$ highest weight vector.

Proof. By Lemma 7.9 it suffices to show that $\hat{w}$ is either an $A_{n-1}$ highest weight vector or an almost highest weight vector of defect i. First it is shown that $\epsilon_j(\hat{w}) = 0$ for j ≠ i. For j ∉ {i − 1, i}, the restrictions of w and $\hat{w}$ to the alphabet {j, j + 1} are the same, so that $\epsilon_j(\hat{w}) = \epsilon_j(w) = 0$. For j = i − 1, by Lemma 7.8 (1) and the assumption that w is an $A_{n-1}$ highest weight vector, it follows that $\epsilon_{i-1}(\hat{w}) \le \epsilon_{i-1}(w) + 1 = 1$. But equality cannot hold since the removed letter is i as opposed to i − 1. Thus $\epsilon_{i-1}(\hat{w}) = 0$. Next we observe that $\epsilon_i(\hat{w}) \le \epsilon_i(w) + 1 = 1$ by Lemma 7.8 (1) and the fact that w is an $A_{n-1}$ highest weight vector. If $\epsilon_i(\hat{w}) = 0$ then $\hat{w}$ is an $A_{n-1}$ highest weight vector. So it may be assumed that $\epsilon_i(\hat{w}) = 1$. It suffices to show that $\epsilon_{i-1}(e_i(\hat{w})) = 0$. Write w = uiv and $\hat{w} = uv$. Now $\epsilon_j(v) = 0$ for all 1 ≤ j ≤ n − 1 by Lemma 7.7 since w is an $A_{n-1}$ highest weight vector. In particular $\epsilon_i(v) = 0$ so that $e_i(\hat{w}) = e_i(uv) = e_i(u)v$. We have

$$\epsilon_{i-1}(e_i(\hat{w})) = \epsilon_{i-1}(e_i(u)v) = \epsilon_{i-1}(v) + \max\{0, \epsilon_{i-1}(e_i(u)) - \varphi_{i-1}(v)\} = \max\{0, \epsilon_{i-1}(e_i(u)) - \varphi_{i-1}(v)\},$$

since $\epsilon_{i-1}(v) = 0$ by Lemma 7.7. It is enough to show that $\epsilon_{i-1}(e_i(u)) \le \varphi_{i-1}(v)$. But

$$\epsilon_{i-1}(e_i(u)) \le \epsilon_{i-1}(u) + 1 = \epsilon_{i-1}(ui) \le \varphi_{i-1}(v).$$

The first inequality holds by an application of Lemma 7.8 (2) since the restrictions of u and $e_i(u)$ to the alphabet {i − 1, i} differ by inserting a letter i. The last inequality holds by Lemma 7.7 since w = uiv is an $A_{n-1}$ highest weight vector. $\square$

Lemma 7.11. Let B = B k,* be a perfect crystal of level * ≤ *, ∈ (Pcl+ )* , B a finite (possibly empty) tensor product of perfect crystals of level at most *, x ∈ B and b ∈ B such that x ⊗ b ∈ H(, B ⊗ B). Let i ∈ J such that hi , > 0 and set = − i + i−1 . Then there is an index 0 ≤ s ≤ k such that ei+s−1 · · · ei+1 ei (x ⊗ b) = x ⊗ ei+s−1 · · · ei+1 ei (b)

(7.5)

and ei+s−1 · · · ei (b) ∈ H( , B), where the subscripts are taken modulo n. Moreover if * = * then s = k. (1)

Proof. Since the Dynkin diagram An−1 has an automorphism given by rotation, it may be assumed that i = 1. Let λ be the partition of length less than n, given by hj , = λj − λj +1 for 1 ≤ j ≤ n − 1 and λn = 0. Since h1 , > 0 it follows that λ has t

a column of size 1. Let m = λ1 and yi be the An−1 -highest weight vector in B λj ,1 for 1 ≤ j ≤ m. Write y = ym ⊗ · · · ⊗ y1 and y = ym−1 ⊗ · · · ⊗ y1 . Observe that t t ,1 λ m y ⊗ u*0 is an affine highest weight vector in B ⊗ · · · ⊗ B λ1 ,1 ⊗ B(*0 ) and has weight so its connected component is isomorphic to B(). A similar statement holds for y ⊗ u*0 and B( ). In particular, b ⊗ y is an An−1 highest weight vector. The map x ⊗ b ⊗ y → word(x)word(b)word(y) gives an embedding of An−1 -crystals into a tensor product of crystals B 1,1 . By Lemma 7.10, there exists an index 1 ≤ r ≤ n such that er−1 er−2 · · · e1 (word(x)word(b)word( y )) is an An−1 highest weight vector. Since y is an An−1 highest weight vector it follows that er−1 · · · e1 (word(x)word(b)word( y )) = er−1 · · · e1 (word(x)word(b))word( y ). Let pj be the position of the letter in ej −1 . . . e1 (word(x)word(b)) that changes from a j + 1 to j upon the application of ej , for 1 ≤ j ≤ r − 1. It follows from the proof of Lemma 7.9 that pr−1 < pr−2 < · · · < p2 < p1 .

(7.6) b

Let s be the maximal index such that ps is located in word(b). Write = es · · · e1 (b). It follows that es es−1 · · · e1 (x ⊗ b) = x ⊗ b and that b ⊗ y is an An−1 highest weight vector. It remains to show that 0 (b ⊗ y ⊗ u*0 ) = 0 and that s ≤ k with equality if * = *.

(7.7)

Consider the corresponding positions in the tableau b. Since b → word(b) is an An−1 crystal morphism, es · · · e1 (word(b)) = word(es · · · e1 (b)). Let (i1 , j1 ) be the position in the tableau b corresponding to the position p1 in word(b), and analogously define (i2 , j2 ), (i3 , j3 ), and so on. Since the rows of all tableaux (and in particular b, e1 (b), e2 e1 (b), etc.) are weakly increasing and (7.6) holds, it follows that i1 < i2 < i3 < · · · < is . But b has k rows, so s ≤ k. The next goal is to prove (7.7). Suppose first that s < n − 1. In this case the letters 1 and n are undisturbed in passing from e1 (b) to es · · · e1 (b). Using this and the Dynkin diagram rotation it follows that y ⊗ u*0 ) = 0 (e1 (b) ⊗ u ) 0 (es · · · e2 e1 (b) ⊗ = max{0, 0 (e1 (b)) − φ0 (u )} = max{0, 0 (e1 (b)) − φ0 (u ) − 1}.

(7.8)

But φ0 (u ) ≥ 0 (b) ≥ 0 (e1 (b)) − 1 by the fact that 0 (b ⊗ u ) = 0 and Lemma 7.8 point 7.8 applied after rotation of the Dynkin diagram. By (7.8) the desired result (7.7) follows. Otherwise assume s = n − 1. Here k = n − 1 since s ≤ k < n with the inequality holding by the perfectness of B. By (7.6) and the fact that b is a tableau, it must be the case that e1 acting on b changes a 2 in the first row of b into a 1, e2 acting on e1 (b) changes a 3 in the second row of e1 (b) into a 2, etc. Since b is a tableau with n − 1 rows with entries between 1 and n, there are integers 0 ≤ νn−1 ≤ νn−2 ≤ · · · ≤ ν1 < * such that the i th row of b consists of νi copies of the letter i and * − νi copies of the letter i + 1. For tableaux b of this very special form, the explicit formula for e0 in [37, (3.11)] yields 0 (b) = * − mn (b), where mn (b) is the number of occurrences of the letter n in b. Since b = en−1 · · · e1 (b) also has the same form (with νi replaced by νi + 1 for 1 ≤ i ≤ n − 1) and mn (b ) = mn (b) − 1, it follows that 0 (b ) = 0 (b) + 1. We have y ⊗ u*0 ) = 0 (b ⊗ u ) 0 (b ⊗

= max{0, 0 (b ) − φ0 (u )} = max{0, 0 (b) + 1 − (φ0 (u ) + 1)} = 0

since b ∈ H(, B). Finally, assuming * = *, it must be shown that s = k. Since the level of B is the same as that of the weights and , it follows from the perfectness of B that both b and b are uniquely defined by the property that (b) = and (b ) = . Let = n−1 i=0 zi i . By the explicit construction of b in Example 3.3, wt(b) =

n−1 k j =1 i=0

zi (i+j − i+j −1 ) =

n−1

zi (i+k − i )

i=0

with indices taken modulo n. Subtracting the analogous formula for wt(b ), wt(b) − wt(b ) = − kj =1 αj . Using (3.1) it follows that k = s. " # Proof of Theorem 7.3. First observe that x ⊗ b ∈ H( , B ⊗ B , φ(x)) by (3.1), b ∈ H( , B , ), and (x) = . Let c ∈ B and z ∈ B be such that x ⊗ b ∼ = c⊗z under the local isomorphism. Then c ⊗ z ∈ H( , B ⊗ B, φ(x)) which means that z is -restricted. Hence z ∈ H( , B, φ(z)) and c ∈ H(φ(z), B , φ(x)). The former together with the perfectness of B implies that y = z. From the latter it follows that

ψ −k (c) ∈ H( , B , ). However the set H( , B , ) might have multiplicities so it is not obvious why b = ψ −k (c) or equivalently c = ψ k (b). The proof proceeds by an induction that changes the weight to a weight that is “closer to" *0 . Suppose first that there is a root direction i = 0 such that = − i + i−1 . By Lemma 7.11 applied for the weight hi , > 0 and , simple root αi , and element x ⊗ b ∈ H( , B ⊗ B ), there is an 0 ≤ s < n such , B , ), where = − s+i + s+i−1 and that b = ei+s−1 · · · ei+1 ei (b) ∈ H( ei+s−1 · · · ei (x ⊗ b) = x ⊗ b. Applying Lemma 7.11 with , αs+i , and x ∈ H(, B), , B). it follows that x = ek+s+i−1 · · · es+i (x) ∈ H( , B ⊗ B ). The above computations imply ek+s+i−1 · · · ei (x ⊗ b) = x ⊗ b ∈ H( , B ⊗ B) since x ⊗ b → c ⊗ y under We have ek+s+i−1 · · · ei+1 ei (c ⊗ y) ∈ H( the local isomorphism. It must be seen which of these raising operators act on the tensor factor in B and which act in B. By Lemma 7.11 applied with , αi , and c ⊗ y ∈ , B) and that ek+i−1 · · · ei (c⊗ H( , B ⊗B), it follows that y = ek+i−1 · · · ei (y) ∈ H( (1) y) = c⊗ y . Since y ⊗u is an An−1 highest weight vector, the rest of the raising operators es+k−1 · · · ek+i must act on the first tensor factor. Let c = ek+s+i−1 · · · ek+i (c). Then ek+s+i−1 · · · ei (c ⊗ y) = c ⊗ y . But the local isomorphism is a crystal morphism so it sends x ⊗ b → c ⊗ y . By induction c = ψ k ( b). By (3.6) it follows that c = ψ k (b). Otherwise there is no index i = 0 such that hi , > 0. This means = *0 . But the sets H(*0 , B, ) and H(*0 , B , φ(y)) are singletons whose lone elements are given by the An−1 highest weight vectors in B and B respectively. Since B ⊗ B is An−1 multiplicity-free it follows that the sets H(φ(y), B , φ(x)) and H(, B, φ(x)) are singletons. In this case it follows directly that c = ψ k (b) since both c and ψ k (b) are elements of the set H(φ(y), B , φ(x)). " # 7.5. Branching function by restricted generalized Kostka polynomials. 
The appropriate map from LR tableaux to rigged configurations, sends the generalized charge of the LR tableau to the charge of the rigged configuration. Unfortunately in general it is not clear what happens when one uses the statistic coming from the energy function E(b ⊗ y ) but using the path b ⊗ y ⊗ y . It is only known that the statistic E(b ⊗ y ⊗ y ) on the path b ⊗ y ⊗ y is well-behaved. So to continue the computation we require that y = ∅. This is achieved when = * 0 . So let us assume this. The other problem is that we do not consider all paths in H(*0 , B ⊗N ⊗ B , ), but only those of the form b ⊗ y , where y ∈ B is a fixed path. Passing to LR tableaux, this is equivalent to imposing an additional condition that the subtableaux corresponding to the first several rectangles must be in fixed positions. Conjecture 8.3 asserts that the corresponding sets of rigged configurations are well-behaved. The special case that requires no extra work is when B consists of a single perfect crystal. This is achievable when has the form = rs + (* − r)0 ; in this case B = B s,r and y is the sln -highest weight element of B s,r . This is the same as requiring that the first subtableau of the LR tableau be fixed. But this is always the case. Let R (M) consist of the single rectangle (r s ) followed by N = Mn copies of the rectangle (* k ), where B = B k,* . Let λ(M) be the partition of the same size as the total size of R (M) , (M) such that λ projects to − *0 . Then the set of paths H(*0 , B ⊗N ⊗ B s,r , ) is * equal to P−* ,R (M) . This is summarized by 0

kM

−rskM−n* ( 2 ) * b Kλ(M) ,R (M) (q), (q) = lim q M→∞

where is arbitrary, = rs + (* − r)0 , and = * 0 .

(7.9)


Inserting expression (6.7) for the generalized Kostka polynomial in (7.9) and taking the limit yields the following fermionic expression for the branching function: b (q) = q

×

rs(s−n) 1 2n + 2*

n

|λ| 2 j =1 (λj − n )

(−1)|S|+1 q 2 u(S)C 1

−1 ⊗C −1 u(S)

S∈SCST(λ )

q 2 mC⊗C 1

−1 m−mI ⊗C −1 u(S)

m

*−1 n−1 m(a) +n(a) n−1 i

i=1 a=1 i=*

(a) mi

i

a=1

1 , (q)m(a)

(7.10)

*

where λ is any partition which projects to − *0 and u(S) as defined in (6.1). The n−1 (a) (a) sum over m runs over all m = *−1 a=1 mi ei ⊗ ea such that mi ∈ Z and i=1 e*−1 ⊗ ea (I ⊗ C −1 m − C −1 ⊗ C −1 u(S)) −

1 1 λj − |λ| ∈ Z * n a

j =1

(a)

for all 1 ≤ a ≤ n − 1. The variables ni (a)

ni

are given by

= ei ⊗ ea −C ⊗ C −1 m + I ⊗ C −1 (u(s) + er ⊗ es )

for all 1 ≤ a < n and 1 ≤ i < *, i = * .

8. Proof of Theorem 5.7

To prove Theorem 5.7 it clearly suffices to show that there is a bijection $\psi_R : \mathrm{RLR}^\ell(\lambda; R) \to \mathrm{RC}^\ell(\lambda; R)$ that is charge-preserving, that is, $c_R(T) = c(\psi_R(T))$ for all $T \in \mathrm{RLR}^\ell(\lambda; R)$. Here we identify $\mathrm{LR}(\lambda; R)$ with $\mathrm{RLR}(\lambda; R)$ via the standardization bijection std. Also define $c_R : \mathrm{CLR}(\lambda; R) \to \mathbb{N}$ by $c_R = c_R \circ \gamma_R$, where $c_R : \mathrm{RLR}(\lambda; R) \to \mathbb{N}$. It will be shown that one of the standard bijections $\psi_R : \mathrm{RLR}(\lambda; R) \to \mathrm{RC}(\lambda; R)$ is charge-preserving, and that it restricts to a bijection $\mathrm{RLR}^\ell(\lambda; R) \to \mathrm{RC}^\ell(\lambda; R)$. With this in mind let us review the bijections from LR tableaux to rigged configurations.

8.1. Bijections from LR tableaux to rigged configurations. A bijection $\phi_R : \mathrm{CLR}(\lambda; R) \to \mathrm{RC}(\lambda^t; R^t)$ was defined recursively in [25, Definition-Proposition 4.1]. It is one of four natural bijections from LR tableaux to rigged configurations:
(1) Column index quantum: $\phi_R : \mathrm{CLR}(\lambda; R) \to \mathrm{RC}(\lambda^t; R^t)$,
(2) Column index coquantum: $\tilde\phi_R : \mathrm{CLR}(\lambda; R) \to \mathrm{RC}(\lambda^t; R^t)$, defined by $\tilde\phi_R = \theta_{R^t} \circ \phi_R$,
(3) Row index quantum: $\psi_R : \mathrm{RLR}(\lambda; R) \to \mathrm{RC}(\lambda; R)$, defined by $\psi_R = \phi_{R^t} \circ \mathrm{tr}$, and
(4) Row index coquantum: $\tilde\psi_R : \mathrm{RLR}(\lambda; R) \to \mathrm{RC}(\lambda; R)$, defined by $\tilde\psi_R = \theta_R \circ \psi_R$.
Of these four, the one that is compatible with level-restriction is ψ. First we show that it is charge-preserving. This fact is a corollary of the difficult result [25, Theorem 9.1].

Proposition 8.1. $c(\psi_R(T)) = c_R(T)$ for all $T \in \mathrm{RLR}(\lambda; R)$.


Proof. Consider the following diagram, which commutes by the definitions and [25, Theorem 7.1]:

[Commutative diagram on the sets RLR(λ; R), CLR(λ; R), CLR(λ^t; R^t), RC(λ^t; R^t) and RC(λ; R), with arrows labelled γ_R^{-1}, tr, tr_LR, φ_R, φ_{R^t}, ψ_R and tr_RC.]

In particular ψ R = tr RC ◦ φ R ◦ γR−1 . Let T ∈ RLR(λ; R) and Q = γR−1 (T ). Then, using tr RC ◦ θR t = θR ◦ tr RC , R (Q))). ψ R (T ) = θR (tr RC (φ R (Q)). Then Let (ν, J ) = tr RC (φ c(ψ R (T )) = c(θR (ν, J )) = ||R|| − cc(ν, J ) R (Q))) = cc(φ R (Q)) = cR (Q) = cR (T ) = ||R|| − cc(tr RC (φ . by Lemma 5.4, (5.6) and [25, Theorem 9.1] to pass from cc to cR

# "

In light of Proposition 8.1, to prove Theorem 5.7 it suffices to establish the following result.

Theorem 8.2. The bijection $\psi_R : \mathrm{RLR}(\lambda; R) \to \mathrm{RC}(\lambda; R)$ restricts to a well-defined bijection $\psi_R : \mathrm{RLR}^\ell(\lambda; R) \to \mathrm{RC}^\ell(\lambda; R)$.

Computer data suggests that the bijection $\psi_R$ is not only well-behaved with respect to level-restriction, but also with respect to fixing certain subtableaux. It was argued in Sect. 7.5 that the branching functions can be expressed in terms of generating functions of tableaux with certain fixed subtableaux. Let $\rho \subset \lambda$ be partitions, $R_\rho = ((1^{\rho^t_1}), \ldots, (1^{\rho^t_n}))$ and $T_\rho$ the unique tableau in $\mathrm{RLR}(\rho; R_\rho)$. Define $\mathrm{RLR}^\ell(\lambda, \rho; R)$ to be the set of tableaux $T \in \mathrm{RLR}^\ell(\lambda; R_\rho \cup R)$ such that $T$ restricted to shape $\rho$ equals $T_\rho$. Recall the set of rigged configurations $\mathrm{RC}^\ell(\lambda, \rho; R)$ defined in Sect. 5.3.

Conjecture 8.3. The bijection $\psi_R : \mathrm{RLR}(\lambda; R) \to \mathrm{RC}(\lambda; R)$ restricts to a well-defined bijection $\psi_R : \mathrm{RLR}^\ell(\lambda, \rho; R) \to \mathrm{RC}^\ell(\lambda, \rho; R)$.

8.2. Reduction to single rows. In this section it is shown that to prove Theorem 8.2 it suffices to consider the case where $R$ consists of single rows. Recall the nontrivial embedding $i_R : \mathrm{LR}(\lambda; R) \hookrightarrow \mathrm{LR}(\lambda; r(R))$. We identify $\mathrm{LR}(\lambda; R)$ and $\mathrm{RLR}(\lambda; R)$ via std, and therefore have an embedding $i_R : \mathrm{RLR}(\lambda; R) \hookrightarrow \mathrm{RLR}(\lambda; r(R))$. Define a map $j_R : \mathrm{RC}(\lambda; R) \to \mathrm{RC}(\lambda; r(R))$ as follows. Let $(\nu, J) \in \mathrm{RC}(\lambda; R)$. For each rectangle of $R$ having $k$ rows and $m$ columns, add $k - j$ strings $(m, 0)$ of length $m$ and label zero to the rigged partition $(\nu, J)^{(j)}$ for $1 \le j \le k - 1$. The resulting rigged configuration is $j_R(\nu, J)$.
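The description of j_R above is a simple recipe on strings, which can be sketched as follows. This is only an illustration under stated assumptions: a rigged configuration is stored as a list whose entry j-1 holds the rigged partition (ν, J)^(j) as a list of (length, label) pairs, rectangles are given as (rows, columns) pairs, and missing rigged partitions are treated as empty. The name `j_R` and this data layout are hypothetical and are not taken from the paper.

```python
def j_R(rigged, rectangles):
    """Add the strings prescribed by the map j_R.

    rigged     : list of rigged partitions; rigged[j - 1] is a list of
                 (length, label) pairs representing (nu, J)^(j)
    rectangles : list of (k, m) pairs, one per rectangle with k rows and m columns
    """
    result = [list(part) for part in rigged]
    for k, m in rectangles:
        for j in range(1, k):                 # 1 <= j <= k - 1
            while len(result) < j:            # assumption: absent partitions are empty
                result.append([])
            # k - j strings of length m with label zero
            result[j - 1].extend([(m, 0)] * (k - j))
    # list the strings of each rigged partition by decreasing length, then label
    return [sorted(part, reverse=True) for part in result]
```

A rectangle with a single row contributes nothing, while a rectangle with k rows contributes k(k-1)/2 strings in total, spread over the first k-1 rigged partitions.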


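Proposition 8.4. The following diagram commutes:
\[
\begin{array}{ccc}
\mathrm{RLR}(\lambda; R) & \xrightarrow{\; i_R \;} & \mathrm{RLR}(\lambda; r(R)) \\
\psi_R \Big\downarrow & & \Big\downarrow \psi_{r(R)} \\
\mathrm{RC}(\lambda; R) & \xrightarrow{\; j_R \;} & \mathrm{RC}(\lambda; r(R)).
\end{array}
\]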

It must be shown that similar diagrams commute in which iR is replaced by either iR< or sp , the maps that occur in the definition of iR . Let jR< : RC(λ; R) → RC(λ; R < ) be defined by adding a string (µ1 , 0) to each of the first η1 − 1 rigged partitions in (ν, J ) ∈ RC(λ; R). Lemma 8.5. jR< is well-defined and the following diagram commutes: iR

, ∂pj ∂p/ 4 4 1 2 E0 p − E0 0 > 1 − √ , so that, since |Eλ (p) − E0 (p)| ≤ constλ , we also have

2

Eλ p − Eλ (0) + Eλ (0) − E0 (0) ≤ 2d + constλ2 . As Eλ (p) is real analytic in p, the ∂ 2 E0 (p) analytic implicit function theorem and Cauchy estimates are used to control ∂pj ∂p/ and the remainder. & ' Proof. For λ = 0, we have


4.2. The ladder approximation. The first part of this subsection is devoted to showing the existence or absence of two-particle bound states in the ladder approximation and follows [15]. We use the mixed coordinates of Eq. (2.7) to analyze the kernels in the BS equation. The kernel of D˜ λ0 is given by 0 0

(2) k (2) k 0 0 0 0 0 ˜ ˜ ˜ Dλ (p, q, k) = δ p + q Sλ − p , p Sλ + p , q δ p + q − k 2 2 0 0 k k (2) (2) + S˜λ + p 0 , p S˜λ − p 0 , k − p δ (p − q) . 2 2 (4.3) The Recall that D˜ λ (k 0 ) means D˜ λ taken at zero spatial momentum, i.e., D˜ λ ((k 0 , 0)). action of D˜ λ0 (k 0 ) on energy independent functions f (p), which depend only on p, is 0 0 (2) k (2) k (D˜ λ0 (k 0 )f )(p) = (2π)d+1 S˜λ + f (−p)]. + p 0 , p S˜λ − p 0 , p [f (p) 2 2 (4.4) In the ladder approximation, K˜ λ is replaced by its first order term λL˜ of Eq. (3.2), which is local in time and so 3 ˜ + E0 ( L(p, q, k) = − a2 [E0 (p) + E0 ( q ) + E0 (p − k) q − k)], 4 i.e., its Fourier transform does not depend on p0 , q 0 and k 0 . Hence, at zero total spatial momentum k, ˜ = − 3 a2 [E0 (p) L(p, q, (k 0 , 0)) + E0 ( q )], 2 ˜ 0 , 0) has rank two (in a scalar local field theory the which shows that the operator L(k rank is one). Solving the Bethe–Salpeter equation (2.9) for D˜ λ , in the ladder approximation, yields −1 ˜ 0) D˜ λ0 (k 0 ) D˜ λ (k 0 ) = 1 − (2π )−2(d+1) λD˜ λ0 (k 0 )L(k (4.5) −1 = D˜ λ0 (k 0 ) 1 − (2π )−2(d+1) λL˜ λ (k 0 )D˜ λ0 (k 0 ) with all quantities taken at zero spatial momentum as in (4.4). The action of L˜ λ D˜ λ0 is given by

L˜ λ (k 0 )D˜ λ0 (k 0 )f (p) = − 3a2 (2π )d+1 E0 (p) + E0 ( q) 0 0 k k (4.6) × S˜λ − q 0 , q S˜λ + q 0 , q 2 2 × f (−q) + f (−q 0 , q ) dq. Hence, if the test function f depends only on p, we have (L˜ λ (k 0 )D˜ λ0 (k 0 )f )(p) = −3a2 (2π )d+1 ρ0 (f ) + ρ1 (f )E0 (p) ,

where
\[
\rho_n(f) = \frac{1}{2} \int_{T^d} G(\vec q, k^0)\, E_0(\vec q)^{\delta_{0n}} \left[ f(\vec q) + f(-\vec q) \right] d\vec q, \qquad n = 0, 1,
\]
\[
G(\vec q, k^0) = \int_{-\infty}^{\infty} \tilde S^{(2)}_\lambda(q)\, \tilde S^{(2)}_\lambda(k^0 - q^0, \vec q)\, dq^0 .
\]
It follows from (4.1) and from a simple analytic continuation argument that $G(\vec q, k^0)$ is analytic on $\mathrm{Im}\, k^0 < 2E_\lambda(\vec 0)$. This result depends on the fact that $E_\lambda(\vec 0) \le E_\lambda(\vec p)$ for any $\vec p \in T^d$, proven in Proposition 4.2. Recall, from (2.8), that the basic object we want to analyze is $(f, \tilde D_\lambda(k^0) f)$, which has the form
\[
(f, \tilde D_\lambda(k^0) f) = 2 (2\pi)^{d+1} \int_{T^d} f(\vec p)\, G(\vec p, k^0)\, g(\vec p, k^0)\, d\vec p, \qquad (4.7)
\]
where
\[
g(\cdot, k^0) = \left[ 1 - (2\pi)^{-2(d+1)} \lambda \tilde L(k^0) \tilde D^0_\lambda(k^0) \right]^{-1} f(\cdot).
\]
The only singularities of (4.7) on $\mathrm{Im}\, k^0 < 2E_\lambda(\vec 0)$ must come from those of $g(\cdot, k^0)$. But, in turn, these come from the zeroes of $1 - \mu_\pm(k^0)$, where $\mu_\pm(k^0)$ are the eigenvalues of $(2\pi)^{-2(d+1)} \lambda \tilde L(k^0) \tilde D^0_\lambda(k^0)$ on the space generated by the functions $1$ and $E_0(\vec p)$. We find
\[
\mu_\pm(k^0) = -3 a_2 (2\pi)^{-(d+1)} \lambda \left[ \alpha(k^0) \pm \left( \beta(k^0) \gamma(k^0) \right)^{1/2} \right] \qquad (4.8)
\]
with the eigenfunction corresponding to $\mu_+$ given by
\[
\psi_+(\vec p) = 1 + \left( \frac{\beta}{\gamma} \right)^{1/2} E_0(\vec p),
\]
where
\[
\alpha(k^0) = \int_{T^d} E_0(\vec q)\, G(\vec q, k^0)\, d\vec q, \qquad
\beta(k^0) = \int_{T^d} G(\vec q, k^0)\, d\vec q, \qquad
\gamma(k^0) = \int_{T^d} E_0(\vec q)^2\, G(\vec q, k^0)\, d\vec q. \qquad (4.9)
\]
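The structure behind (4.8) and (4.9) is that of a rank-two kernel proportional to E_0(p) + E_0(q) with weight G(q), acting on the span of 1 and E_0. The sketch below checks numerically, on a one-dimensional grid, that the bracket α ± (βγ)^{1/2} is an eigenvalue of such a kernel with eigenfunction 1 ± (β/γ)^{1/2} E_0. The overall prefactor of (4.8) is omitted, and the dispersion `E`, the weight `G`, the value of `kappa` and all function names are illustrative assumptions, not quantities taken from the paper.

```python
import math


def moments(E, G, dq):
    """Discretised versions of alpha, beta, gamma from (4.9)."""
    alpha = sum(e * g for e, g in zip(E, G)) * dq
    beta = sum(G) * dq
    gamma = sum(e * e * g for e, g in zip(E, G)) * dq
    return alpha, beta, gamma


def apply_kernel(E, G, dq, f):
    """(T f)(p) = sum over q of (E(p) + E(q)) G(q) f(q) dq."""
    s0 = sum(g * fq for g, fq in zip(G, f)) * dq
    s1 = sum(e * g * fq for e, g, fq in zip(E, G, f)) * dq
    return [e * s0 + s1 for e in E]


# Hypothetical data: a lattice-type dispersion on [-pi, pi) and a positive weight
# modelled on the pole term of G at k^0 = i*kappa with kappa < 2 E(0).
N = 200
dq = 2.0 * math.pi / N
grid = [-math.pi + j * dq for j in range(N)]
E = [1.0 + 2.0 * (1.0 - math.cos(q)) for q in grid]
kappa = 1.0
G = [1.0 / (e * (e * e - kappa * kappa / 4.0)) for e in E]

alpha, beta, gamma = moments(E, G, dq)
root = math.sqrt(beta * gamma)
psi_plus = [1.0 + math.sqrt(beta / gamma) * e for e in E]
psi_minus = [1.0 - math.sqrt(beta / gamma) * e for e in E]

T_plus = apply_kernel(E, G, dq, psi_plus)
T_minus = apply_kernel(E, G, dq, psi_minus)
err_plus = max(abs(t - (alpha + root) * p) for t, p in zip(T_plus, psi_plus))
err_minus = max(abs(t - (alpha - root) * p) for t, p in zip(T_minus, psi_minus))
print(alpha, beta, gamma, err_plus, err_minus)
```

Since the weight is positive, the Cauchy-Schwarz inequality gives α ≤ (βγ)^{1/2}, so the two brackets have opposite signs (or one vanishes); this is the mechanism used in the text to decide which of 1 − μ_± can vanish.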

Now, from (4.1), $G(\vec q, k^0)$ can be written as
\[
G(\vec q, k^0) = \frac{\pi\, c_\lambda(\vec q)^2}{2 E_\lambda(\vec q) \left[ E_\lambda(\vec q)^2 + \tfrac{1}{4} (k^0)^2 \right]} + G_1(\vec q, k^0),
\]
where $G_1(\vec q, k^0)$ is analytic on $\mathrm{Im}\, k^0 < E_\lambda(\vec 0) + 2M_0$. From general principles, the singularities of (4.7) can only be located on the imaginary $k^0$ axis. Writing $k^0 = i\kappa$ with $\kappa \ge 0$ and using (4.1), one can show that $G(\vec q, i\kappa) > 0$ for $0 \le \kappa < 2E_\lambda(\vec 0)$. It follows then that $\alpha(i\kappa)$, $\beta(i\kappa)$ and $\gamma(i\kappa)$ are positive and, by the Cauchy-Schwarz inequality, $\alpha \le [\beta\gamma]^{1/2}$ on $0 \le \kappa < 2E_\lambda(\vec 0)$.

For space dimension $d \ge 3$, $\alpha(i\kappa)$, $\beta(i\kappa)$ and $\gamma(i\kappa)$ increase to a finite limit as $\kappa \to 2E_\lambda(\vec 0)$ because the singularity generated by $G(\vec q, i\kappa)$ is quadratic and therefore integrable. Thus, if $\lambda$ is small enough, $1 - \mu_\pm(i\kappa)$ cannot be zero on $0 < \kappa < 2E_\lambda(\vec 0)$ so that, in the ladder approximation, there are no bound states.

If $d < 3$, $\alpha$, $\beta$ and $\gamma$ diverge as $\kappa \to 2E_\lambda(\vec 0)$, but $\alpha - [\beta\gamma]^{1/2}$ remains finite. This yields the nonvanishing of $1 - \mu_-(i\kappa)$. Finally, $1 - \mu_+(i\kappa)$ is nonzero if $a_2 > 0$, and has a unique zero on the interval $0 < \kappa < 2E_\lambda(\vec 0)$ if $a_2 < 0$. This implies the existence of a single bound state for the latter case.

Let $M_\lambda = E_\lambda(\vec 0)$ be the mass for a single quasiparticle in the interacting theory. The mass $M_L$ of the bound state, in the ladder approximation, is the solution of (assuming $a_2 < 0$)
\[
F(\lambda, i M_L) = -(2\pi)^{d+1} / (3 a_2 \lambda),
\]
where $F(\lambda, k^0) = \alpha(\lambda, k^0) + [\beta(\lambda, k^0) \gamma(\lambda, k^0)]^{1/2}$, and we have made explicit the $\lambda$ dependence of $\alpha$, $\beta$ and $\gamma$. Let $E = 2M_\lambda - M_L$. Performing an asymptotic analysis of the coefficients $\alpha$, $\beta$ and $\gamma$ we find
\[
E(\lambda) =
\begin{cases}
\dfrac{9}{4} \dfrac{\lambda^2}{m^4}\, a_2^2\, [1 + O(\lambda)], & \text{if } d = 1, \\[2ex]
\exp\left( - \dfrac{4\pi m^2}{3 |a_2| \lambda}\, [1 + O(\lambda)] \right), & \text{if } d = 2.
\end{cases}
\qquad (4.10)
\]
To go beyond the ladder approximation, let us introduce some function spaces. We define a weighted Hardy space Hδ (see [3, 21]) as functions f analytic in the strip | Imp j |< δ1 such that f (p) = f (−p), with norm given by, with α = α 0 , α , | w(p + iα)f (p + iα) |2 dp, sup f 2δ = | Imp 0 |< δ0 ;

|α0 | M0 . For | q 0 |≤ M0 , w(q)−1 Bδ (q −q )w(q )−1 is clearly bounded so that we have the required bound c$(κ). For | q 0 |> M0 , write q 0 = (q 0 − q 0 ) + q 0 , so that

2α

α α w(q)−1 = (q 0 )2 + 16Mλ2 ≤ 2 | q 0 − q 0 |2α +2 q 0 + 16Mλ2 , using the /p triangle inequality with p = α −1 . As r00 κ, q is O (q 0 )−4 , the result follows. & ' Let H∗ be the dual space to H, determined by the L2 inner product. We have Lemma 4.6. R0λ : H → H∗ is analytic in 0 < Reκ < 2Mλ , |Imκ| < Mλ and with norm bounded by c$(κ)2 , c > 0. Proof. From Eq. (4.3),

(g, R0λ f )2 ≤ sup S˜λ p 0 + iκ S˜λ p 0 − iκ w(p)−2 w(p) |g(p)f (p)| dp, 2 2 p and using (4.1) the result follows.

' &


4.4. Complete model: Existence of bound states. For the complete model, following [2], we show the existence of mass spectrum in the interval $\kappa \in (0, 2M_\lambda)$ when $d < 3$ and $a_2 < 0$. We will prove there is a unique bound state near the ladder bound state $M_L$. In the next subsection, absence of bound states in $(0, 2M_\lambda)$ will be proven both for $d \ge 3$ and $a_2 < 0$ and for $a_2 > 0$. Essentially, this is done by showing the existence or absence of an eigenvalue 1 of $K_\lambda(\kappa) R^0_\lambda(\kappa)$. Multiplicity one is checked for the former case. Before we go to the technical details, we give a description of the strategy employed in both cases.

For the repulsive case $a_2 > 0$ and for the attractive case $a_2 < 0$ and $d \ge 3$, with $K_\lambda = \lambda L + \lambda^2 K^{(2)}_\lambda$, we write
\[
D_\lambda = D_L + D_\lambda \lambda^2 K^{(2)}_\lambda D_L = D_L \left[ 1 - \lambda^2 K^{(2)}_\lambda D_L \right]^{-1},
\]
where
\[
D_L = D^0_\lambda + \lambda D_L L D^0_\lambda = D^0_\lambda \left[ 1 - \lambda L D^0_\lambda \right]^{-1}.
\]
Using an explicit representation for $D_L$, we show that $D_L$ has no singularities in $(0, 2M_\lambda)$, and also that $K^{(2)}_\lambda D_L$ has norm less than one in $(0, 2M_\lambda)$. Hence, the resolvent $\left[ 1 - \lambda^2 K^{(2)}_\lambda D_L \right]^{-1}$ is well defined by its Neumann series and $D_\lambda$ does not have singularities in $(0, 2M_\lambda)$.

For the attractive case $a_2 < 0$ and $d < 3$, in order to show existence of a bound state we write
\[
D_\lambda = D^0_\lambda \left[ 1 - K_\lambda D^0_\lambda \right]^{-1}
\]
and consider the family of compact operators, $\mu \in \mathbb{C}$, defined by
\[
T_\lambda(\mu, \kappa) = -\lambda T_1(\kappa) + \mu T_2(\kappa),
\]
where $T_1$ and $T_2$ are defined in (4.16). We remark that $\mu = \lambda^2$ corresponds to the value of interest (the physical one), that is [see (4.15)] $T_\lambda(\lambda^2, \kappa) = K_\lambda(\kappa) R^0_\lambda(\kappa)$. This family is shown to be compact and jointly analytic in $\kappa$ and $\mu$, for $0 < \mathrm{Re}\,\kappa < 2M_\lambda$ and $|\mu| < 2\lambda^2$. Without further analysis, the analytic Fredholm theory implies that $\left[ 1 - K_\lambda D^0_\lambda \right]^{-1}$ exists, except for $\kappa$ in a discrete set. As $D^0_\lambda$ is not singular in the same domain, it follows that the mass spectrum is discrete in $(0, 2M_\lambda)$. However, we show more. The point $\mu = 0$ is called the ladder approximation, which was solved explicitly in Subsect. 4.2 and leads to a bound state at some $\kappa = \kappa_L \in (0, 2M_\lambda)$. This is the only mass spectral point in $(0, 2M_\lambda)$. As $\mu T_2$ is an analytic perturbation, it is shown that there is an isolated bound state of multiplicity one at $\kappa_b \in (0, 2M_\lambda)$, where $\kappa_b$ lies in the interval $|\kappa_b - \kappa_L| \le \tfrac{1}{2} b \lambda^2$, for $b$ sufficiently small, uniform in $\lambda$, such that $\kappa_b$ is the unique mass spectral point in the interval.

For $\kappa$ in the intervals $\left( 0, \kappa_L - \tfrac{1}{2} b \lambda^2 \right)$ or $\left( \kappa_L + \tfrac{1}{2} b \lambda^2, 2M_\lambda - \lambda^{5/2} \right)$, the mass spectrum is excluded by showing that $\left[ 1 - K_\lambda D^0_\lambda \right]^{-1}$ exists. Thus, as $D^0_\lambda$ is not singular, the same holds for $D = D^0_\lambda \left[ 1 - K_\lambda D^0_\lambda \right]^{-1}$. For $\kappa$ near $M_L$, the resolvent $(-\lambda T_1(\kappa) - w)^{-1}$ of $-\lambda T_1(\kappa)$ is constructed explicitly and $\mu T_2(\kappa)$ is shown to be an analytic perturbation to this ladder operator. The resolvent $(T_\lambda(\mu, \kappa) - w)^{-1}$ is defined through its Neumann series and is shown to exist for w in


the complement of | w |−1 , with | w |−1 < 4. This means that the spectrum of Tλ (µ, κ) is contained in | w |≤ 1/4, | w − 1 |≤ 1/4. Consequently, by analytic perturbation theory, there is a unique multiplicity one eigenvalue αλ (µ, κ) of Tλ (µ, κ) which is analytic both in κ and µ, and satisfies αλ (0, κ) = 1. However, we do not know that for real µ > 0 and small the eigenvalue takes the value one. To show that indeed it does, we compute the derivative [∂αλ /∂κ] (0, κ) (see Lemma 4.11), which is large positive for small λ. This is shown to be the dominating contribution to [∂αλ /∂κ] (µ, κ). Thus, for small real µ, αλ (µ, κ) is strictly monotone increasing in µ. In this way, we show: Lemma 4.7. Let µ and κ be real. For | µ |< 2λ2 and c sufficiently small, there is a unique κ = κλ (µ) in | κ − ML |≤ 21 cλ2 such that αλ (µ, κλ (µ)) = 1. Remark 4.8. Recall that µ = λ2 is the physical value of interest so that αλ λ2 , κλ (λ2 ) = 1 is the eigenvalue of Tλ (λ2 , κ) = Kλ (κ)R0λ (κ), where κ = κλ (λ2 ) ≡ Mb is the bound state mass given in (1.5). In order for the analysis of [2, Lemmas 2.7–2.11] to go through, it suffices to show the two lemmas below Lemma 4.9. Let µ± be defined as in (4.8). Then, for some positive c, 1 1 1 1 1 −1 ≤ c max , , Tλ (0, ML ) . [w − Tλ (0, ML )] w w − 1 w w − 1 w − µ− (ML ) Remark 4.10. We recall, for κ = ML , that µ+ (ML ) = 1 and | µ− (ML ) |≤ c | λ |. Note that the ladder bound state satisfies αλ (0, ML ) = 1. Lemma 4.11. For κ such that | κ − ML |≤ 21 cλ2 , with a sufficiently small c > 0, set αλ (0, κ) = ρ (α + βγ )1/2 . Then, there exist positive constants c1 and c2 such that ∂αλ (0, κ) ≥ λc1 $(κ)3 ≥ c2 λ−2 , for $(κ) as defined in (4.13). ∂κ Proof of Lemma 4.9. Using the representation (4.12), the resolvent [w − Tλ (0, ML )]−1 is bounded using Lemma 4.5. & ' Proof of Lemma 4.11. From the representations for α, β and γ [see (4.9)], we see that they are all strictly positive as well as their κ derivatives. 
From the bounds of Lemma 4.4, ∂αλ it follows that ' (0, κ) ≥ λc1 $(κ)3 . & ∂κ 4.5. Complete model: Absence of bound states. Here, considering the complete model and using the strategy described in Sect. 4.4, we show the absence of mass spectrum in (0, 2Mλ ) in the two-particle sector for the repulsive case a2 > 0 and d < 3, as well as for d ≥ 3. A variant of the method is used to complete the proof that excludes spectrum between the bound state Mb and the two-particle threshold 2Mλ . We treat the repulsive case following the method of [19] and the attractive one following [2]. As before, the λ dependence is omitted unless deemed necessary. To control the spectrum, we treat D = D 0 + DKD 0 as a perturbation about the ladder approximation. For this, we set DL for the (λ dependent) D solution of the ladder BS equation, that is, DL = D 0 + λDL LD 0 .

(4.18)

Then, $D$ is given by
\[
D = D_L + D \lambda^2 K^{(2)} D_L = D_L \left[ 1 - \lambda^2 K^{(2)} D_L \right]^{-1}.
\]
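For orientation, the step from the implicit equation for $D$ to the closed form, and the role of the norm bound on $K^{(2)} D_L$, can be written out as follows. This is a short worked derivation under the assumption that all operators act on a common space on which $\| \lambda^2 K^{(2)} D_L \| < 1$; it only restates the argument sketched in the text.

```latex
\begin{align*}
D &= D_L + D\,\lambda^2 K^{(2)} D_L
  &&\Longrightarrow&
D \left[ 1 - \lambda^2 K^{(2)} D_L \right] &= D_L , \\
& &&\Longrightarrow&
D &= D_L \left[ 1 - \lambda^2 K^{(2)} D_L \right]^{-1}
   = D_L \sum_{n \ge 0} \left( \lambda^2 K^{(2)} D_L \right)^{n} ,
\end{align*}
% the series converges in norm whenever \| \lambda^2 K^{(2)} D_L \| < 1, so in that
% range D can be singular only where D_L itself is singular.
```

The same manipulation applied to (4.18) gives $D_L = D^0 \left[ 1 - \lambda L D^0 \right]^{-1}$.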

In the repulsive case a2 > 0, we show that DL has no singularity in 0 < κ < 2Mλ , and that the bound state of mass Mb is isolated with isolation radius rb . We now show that there is no spectrum in (Mb + rb , 2Mb ) again by showing that K (2) DL has norm less than one in this interval. The starting point of the analysis is an explicit representation for DL . Using (4.6) and (4.17) in (4.18), and suppressing the κ dependence, gives q )X(p), DL (p, q) = r0 (p)δ(p + q) − 3λa2 r0 (q)Y (p) − 3λa2 r(q)E0 (

where X(p) =

(4.19)

DL (p, q)dq

,

Y (p) =

DL (p, q)E0 ( q )dq.

Multiplying (4.19) by the function 1 and E0 ( q ), integrating over q and solving for X(p) and Y (p), leads to 3λa2 DL (p, q) = r0 (p)δ(p + q) − + E0 ( q ) + 3λa2 E0 (p) D × (E0 (p) + E0 ( q )) α − γ − βE0 (p)E 0 ( q) ≡ r0 (p)δ(p + q) + c(p, q)r0 (p)r0 (q), where D = D(w = 1) [see (4.11)], that is

D = (1 − µ+ ) (1 − µ− ) = 1 + 3λa2 α + (3λa2 )2 α 2 − βγ .

(4.20)

To establish our result, it is sufficient to use the bound of Lemma 4.3 for the Hilbert– (2) Schmidt norm of λ2 Kλ R, with, following (4.14), R(κ) ≡ D˜ L (κ), and the bound (uniformly in p )

I ≡ R(p, q)f (p)g(q)dpdq ≤ O λ−1 w(q ) . (4.21)

In (4.21), suppressing the p and q dependence, (2)

f (p) = Kλ (κ, p + iδ, p); As the κ behavior of

J ≡

g(q) = w(q)−1 Bδ (q − q).

D˜ L (p, q)dpdq = β/D

(4.22)

(4.23)

is easily controlled (see Lemma 4.14 below), it is convenient to write I of (4.21) as I= r0 (p) [f (p)g(q) − f (0)g(0)] dpdq + [f (p) − f (0)] g(0)c(p, q)r0 (p)r0 (q)dpdq + f (p) [g(p) − g(0)] c(p, q)r0 (p)r0 (q)dpdq + Jf (0)g(0) ≡ X1 + X2 + X3 + X4 .


The terms X2 and X3 are bounded by a combination of the methods used for bounding X1 and X4 (see [2]). We now bound X4 . Following [19], we write [h(p) ≡ f (p)g(p)] − h(0) as h(p) − h(0) = h(p) − h(0, p) + h(0, p) − h(0) ≡ δh1 (p) + δh2 (p). (4.24) The δh1 (p) and δh2 (p) terms are bounded in the lemmas below. Lemma 4.12. Recalling the definitions given in (4.22) and (4.24), the bound

δh1 (p)r00 (κ, p)dp ≤ O λ−1 w

is satisfied. Proof. Write [see (2.2) and (4.3)] o −1

r00 (κ, p) = (iκp )

−1 p/2 m2 κ2 + + (p ) − iκp − 4 2 2 −1 2 p/2 m2 κ 0 2 0 . + + − (p ) + iκp − 4 2 2

0 2

0

The singularity in p0 at zero is cancelled and for the first (second) terms we make the contour shift p 0 → p 0 ± iδ0 , with δ0 < δ0 . Thus, the denominators become 2 2 2 2 p2 (p0 )2 ± 2i δ0 p 0 ∓ κp 0 + κδ0 − δ02 + κ4 + m2 + 2/ , which is zero for κ4 = m2 + 2 √ m2 p/ 0 2 + δ0 κ − δ0 > 2 for δ0 < κ. Thus, κ > 2m. Hence, we have no p singularity 0 3 and the rest of the bound is carried out using the 1/(p ) falloff of the term of r00 (κ, p) as in the proof of Lemma 4.5. & '

Lemma 4.13. The bound δh2 (p)r00 (κ, p)dp

≤ O λ−1 w(q ) holds. Proof. Writing h(0, p) − h(0) = p.∇ u h(0, u ) |u=0 +

1

t

dt

0

dt

0

∂2 h(0, t p) ∂t 2

and doing the p0 integration, we get

cλ (p) 2 pj pk Eλ (p) 4Eλ (p )2 − 4Eλ (0 )2 + 4Mλ2 − κ 2

1 0

t

dt 0

dt

∂2 h(0, u = t p)d p, ∂uj uk

where the p terms integrate to zero by parity. The integral over p is finite for 0 < κ < 2Mλ . Concerning the derivatives of h(0, p), with respect to p j , we see that they are (2) (2) bounded by Bδ and those of Kλ . Using the analyticity of Kλ , the derivatives are uniformly bounded. Proceeding as in the proof of Lemma 4.5, the bound is completed. ' &


Lemma 4.14. There exist positive constants c1 , c2 , c3 and c4 such that i) For a2 > 0, and uniformly for 0 < κ < 2Mλ , −1 J ≤ c1 $(κ) 1 + 3λa2 c2 $(κ) − c3 λ2 . ii) For a2 < 0, and uniformly for 2Mλ − λ5/2 κ < 2Mλ , λJ ≤ c4 . Proof. i) From (4.20) and (4.23), we get 2 −1 J = β 1 − µ+ 1 − µ− = 1 + 3λa2 α + 3λa2 α 2 − βγ . The first bound follows from α, β, γ < c$(κ) but α 2 −βγ < 0, by the Cauchy-Schwarz inequality. However, by separating out the constant term in the numerators of α, β and γ , the p/2 singularity in the denominator is cancelled and α 2 − βγ < c uniformly in 0 < κ < 2Mλ . For ii) see Sect. 3 of [2]. □

5. Concluding Remarks

We have determined the low-lying e-m spectrum for dynamic stochastic lattice Landau–Ginzburg models with small polynomial interaction and such that the equilibrium state is in the single phase region. The determination of the spectrum for models with equilibrium states in the multi-phase region is of interest. Also the question of the effect of large noise on the spectrum is relevant and is currently being investigated [13].

References

1. Dimock, J.: A Cluster Expansion for Stochastic Lattice Fields. J. Stat. Phys. 58, 1181–1207 (1990)
2. Dimock, J., Eckmann, J.-P.: On the Bound State in Weakly Coupled λ(φ^6 − φ^4)_2 Models. Commun. Math. Phys. 51, 41–54 (1976)
3. Duren, P.: Theory of H^p Spaces. Pure and Applied Mathematics Vol. 38, New York: Academic Press, 1970
4. Glimm, J., Jaffe, A.: Quantum Physics: A Functional Integral Point of View. New York: Springer Verlag, 1986
5. Gammaitoni, L., Hanggi, P., Jung, P., Marchesoni, F.: Stochastic Resonance. Rev. Mod. Phys. 70, 223–287 (1998)
6. Hohenberg, P. C., Halperin, B. I.: Theory of Dynamic Critical Phenomena. Rev. Mod. Phys. 49, 435–479 (1977)
7. Horsthemke, W., Lefever, R.: Noise-induced Transitions. Berlin: Springer Verlag, 1984
8. Itzykson, C., Zuber, J.-B.: Quantum Field Theory. New York: McGraw-Hill, 1980
9. Jona-Lasinio, G., Mitter, P. K.: On the Stochastic Quantization of Field Theory. Commun. Math. Phys. 101, 409–436 (1985)
10. Jona-Lasinio, G., Sénéor, R.: Study of Stochastic Differential Equations by Constructive Methods I. J. Stat. Phys. 83, 1109–1148 (1996)
11. Kondratiev, Yu. G., Minlos, R. A.: One-Particle Subspaces in the Stochastic XY Model. J. Stat. Phys. 87, 613–642 (1997)
12. Minlos, R. A., Suhov, Y. M.: On the Spectrum of the Generator of an Infinite System of Interacting Diffusions. Commun. Math. Phys. 206, 463–489 (1999)
13. Pereira, E.: Noise Induced Bound States. Phys. Lett. A 282, 169–174 (2001)
14. Reed, M., Simon, B.: Analysis of Operators. Modern Methods of Mathematical Physics Vol. IV, New York: Academic Press, 1978
15. Schor, R., Barata, J. C. A., Faria da Veiga, P. A., Pereira, E.: Spectral Properties of Weakly Coupled Landau-Ginzburg Stochastic Models. Phys. Rev. E 59, Issue 3, 2689–2694 (1999)
16. Schor, R., O’Carroll, M.: Decay of the Bethe–Salpeter Kernel and Absence of Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature. J. Stat. Phys. 99, 1207–1223 (2000); Transfer Matrix Spectrum and Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature. J. Stat. Phys. 99, 1265–1279 (2000)
17. Simon, B.: Statistical Mechanics of Lattice Models. Princeton, NJ: Princeton University Press, 1994
18. Spencer, T.: The Decay of the Bethe–Salpeter Kernel in P(ϕ)_2 Quantum Field Models. Commun. Math. Phys. 44, 143–164 (1975)
19. Spencer, T., Zirilli, F.: Scattering States and Bound States in λP(φ)_2 Models. Commun. Math. Phys. 49, 1–16 (1976)
20. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer Verlag, 1991
21. Stein, E.M.: Harmonic Analysis. Princeton, NJ: Princeton University Press, 1993
22. Zhizhina, E.A.: Two-Particle Spectrum of the Generator for Stochastic Model of Planar Rotators at High Temperature. J. Stat. Phys. 91, 343–366 (1998)
23. Zinn-Justin, J.: Quantum Field Theory and Critical Phenomena. Oxford: Oxford University Press, 1993

Communicated by Ya. G. Sinai

Commun. Math. Phys. 220, 403 – 428 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Global Properties of Gravitational Lens Maps in a Lorentzian Manifold Setting

Volker Perlick

Albert Einstein Institute, 14476 Golm, Germany. E-mail: [email protected]

Received: 16 October 2000 / Accepted: 18 January 2001

Abstract: In a general-relativistic spacetime (Lorentzian manifold), gravitational lensing can be characterized by a lens map, in analogy to the lens map of the quasi-Newtonian approximation formalism. The lens map is defined on the celestial sphere of the observer (or on part of it) and it takes values in a two-dimensional manifold representing a two-parameter family of worldlines. In this article we use methods from differential topology to characterize global properties of the lens map. Among other things, we use the mapping degree (also known as Brouwer degree) of the lens map as a tool for characterizing the number of images in gravitational lensing situations. Finally, we illustrate the general results with gravitational lensing (a) by a static string, (b) by a spherically symmetric body, (c) in asymptotically simple and empty spacetimes, and (d) in weakly perturbed Robertson–Walker spacetimes.

Permanent address: TU Berlin, Sekr. PN 7-1, 10623 Berlin, Germany. E-mail: [email protected]

1. Introduction

Gravitational lensing is usually studied in a quasi-Newtonian approximation formalism which is essentially based on the assumptions that the gravitational fields are weak and that the bending angles are small; see Schneider, Ehlers and Falco [1] for a comprehensive discussion. This formalism has proven to be very powerful for the calculation of special models. In addition, it has been used for proving general theorems on the qualitative features of gravitational lensing, such as the possible number of images in a multiple imaging situation. As to the latter point, it is interesting to inquire whether the results can be reformulated in a Lorentzian manifold setting, i.e., to inquire to what extent the results depend on the approximations involved. In the quasi-Newtonian approximation formalism one considers light rays in Euclidean 3-space that go from a fixed point (observer) to a point that is allowed to vary


over a 2-dimensional plane (source plane). The rays are assumed to be straight lines with the only exception that they may have a sharp bend at a 2-dimensional plane (deflector plane) that is parallel to the source plane. (There is also a variant with several deflector planes to model deflectors which are not “thin”.) For each concrete mass distribution, the deflecting angles are to be calculated with the help of Einstein’s field equation, or rather of those remnants of Einstein’s field equation that survive the approximations involved. Hence, at each point of the deflector plane the deflection angle is uniquely determined by the mass distribution. As a consequence, following light rays from the observer into the past always gives a unique “lens map” from the deflector plane to the source plane. There is “multiple imaging” whenever this lens map fails to be injective. In this article we want to inquire whether an analogous lens map can be introduced in a spacetime setting, without using quasi-Newtonian approximations. According to the rules of general relativity, a spacetime is to be modeled by a Lorentzian manifold (M, g) and the light rays are to be modeled by the lightlike geodesics in M. We shall assume that (M, g) is time-oriented, i.e., that the timelike and lightlike vectors can be distinguished into future-pointing and past-pointing in a globally consistent way. To define a general lens map, we have to fix a point p ∈ M as the event where the observation takes place and we have to look for an analogue of the deflector plane and for an analogue of the source plane. As to the deflector plane, there is an obvious candidate, namely the celestial sphere Sp at p. This can be defined as the set of all one-dimensional lightlike subspaces of the tangent space Tp M or, equivalently, as the totality of all light rays issuing from p into the past. As to the source plane, however, there is no natural candidate. 
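The quasi-Newtonian lens map described above can be made concrete with the textbook point-mass lens. The following is a minimal sketch, not part of the original argument: it assumes that angles are measured in units of the Einstein radius, so that the map from the deflector plane to the source plane reads β = θ − θ/|θ|², and all function names are hypothetical. It shows that this map is not injective (two images of one source) and that the two images have Jacobian determinants of opposite sign.

```python
import math

def lens_map(theta_x, theta_y):
    """Point-mass lens map (deflector plane -> source plane), Einstein-radius units."""
    r2 = theta_x**2 + theta_y**2
    return theta_x - theta_x / r2, theta_y - theta_y / r2

def images(b):
    """Both solutions theta on the x-axis of lens_map(theta, 0) = (b, 0), for b > 0."""
    root = math.sqrt(b * b + 4.0)
    return (b + root) / 2.0, (b - root) / 2.0

def jacobian_det(theta_x, theta_y):
    """Determinant of the differential of lens_map: (1 + 1/r^2)(1 - 1/r^2)."""
    r4 = (theta_x**2 + theta_y**2) ** 2
    return 1.0 - 1.0 / r4

b = 0.5  # source position on the x-axis (illustrative value)
for theta in images(b):
    beta_x, beta_y = lens_map(theta, 0.0)
    det = jacobian_det(theta, 0.0)
    print(f"theta = {theta:+.4f} -> beta = ({beta_x:+.4f}, {beta_y:+.4f}), det = {det:+.4f}")
```

Both image positions are sent to the same source position, which is exactly the failure of injectivity that the text calls multiple imaging.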
Following Frittelli, Newman and Ehlers [2–4], one might consider any timelike 3-dimensional submanifold T of the spacetime manifold as a substitute for the source plane. The idea is to view such a submanifold as ruled by worldlines of light sources. To make this more explicit, one could restrict to the case that T is a fiber bundle over a two-dimensional manifold N , with fibers timelike and diffeomorphic to R. Each fiber is to be interpreted as the worldline of a light source, and the set N may be identified with the set of all those worldlines. In this situation we wish to define a lens map fp : Sp −→ N by extending each light ray from p into the past until it meets T and then projecting onto N . In general, this prescription does not give a well-defined map since neither existence nor uniqueness of the target value is guaranteed. As to existence, there might be some past-pointing lightlike geodesics from p that never reach T . As to uniqueness, one and the same light ray might intersect T several times. The uniqueness problem could be circumvented by considering, on each past-pointing lightlike geodesic from p, only the first intersection with T , thereby willfully excluding some light rays from the discussion. This comes up to ignoring every image that is hidden behind some other image of a light source with a worldline ξ ∈ N . For the existence problem, however, there is no general solution. Unless one restricts to special situations, the lens map will be defined only on some subset Dp of Sp (which may even be empty). Also, one would like the lens map to be differentiable or at least continuous. This is guaranteed if one further restricts the domain Dp of the lens map by considering only light rays that meet T transversely. Following this line of thought, we give a precise definition of lens maps in Sect. 2. 
We will be a little bit more general than outlined above insofar as the source surface need not be timelike; we also allow for the limiting case of a lightlike source surface. This has the advantage that we may choose the source surface “at infinity” in the case of an asymptotically simple and empty spacetime. In Sect. 3 we briefly discuss some general properties of the caustic of the lens map. In Sect. 4 we introduce the mapping degree (Brouwer degree) of the lens map as an important tool from differential topology.


This will then give us some theorems on the possible number of images in gravitational lensing situations, in particular in the case that we have a “simple lensing neighborhood”. The latter notion will be introduced and discussed in Sect. 5. We conclude with applying the general results to some examples in Sect. 6.

Our investigation will be purely geometrical in the sense that we discuss the influence of the spacetime geometry on the propagation of light rays but not the influence of the matter distribution on the spacetime geometry. In other words, we use only the geometrical background of general relativity but not Einstein’s field equation. For this reason the “deflector”, i.e., the matter distribution that is the cause of gravitational lensing, never explicitly appears in our investigation. However, information on whether the deflectors are transparent or non-transparent will implicitly enter into our considerations.

2. Definition of the Lens Map

As a preparation for precisely introducing the lens map in a spacetime setting, we first specify some terminology. By a manifold we shall always mean what is more fully called a “real, finite-dimensional, Hausdorff, second countable (and thus paracompact) C∞-manifold without boundary”. Whenever we have a C∞ vector field X on a manifold M, we may consider two points in M as equivalent if they lie on the same integral curve of X. We shall denote the resultant quotient space, which may be identified with the set of all integral curves of X, by M/X. We call X a regular vector field if M/X can be given the structure of a manifold in such a way that the natural projection πX : M −→ M/X becomes a C∞-submersion. It is easy to construct examples of non-regular vector fields. E.g., if X has no zeros and is defined on Rn \ {0}, then M/X cannot satisfy the Hausdorff property, so it cannot be a manifold according to our terminology.
Palais [5] has proven a useful result which, in our terminology, can be phrased in the following way. If none of X’s integral curves is closed or almost closed, and if M/X satisfies the Hausdorff property, then X is regular.

We are going to use the following terminology. A Lorentzian manifold is a manifold M together with a C∞ metric tensor field g of Lorentzian signature (+ · · · + −). A Lorentzian manifold is time-orientable if the set of all timelike vectors {Z ∈ TM | g(Z, Z) < 0} has exactly two connected components. Choosing one of those connected components as future-pointing defines a time-orientation for (M, g). A spacetime is a connected 4-dimensional time-orientable Lorentzian manifold together with a time-orientation. We are now ready to define what we will call a “source surface” in a spacetime. This will provide us with the target space for lens maps.

Definition 1. (T, W) is called a source surface in a spacetime (M, g) if
(a) T is a 3-dimensional C∞ submanifold of M;
(b) W is a nowhere vanishing regular C∞ vector field on T which is everywhere causal, g(W, W) ≤ 0, and future-pointing;
(c) πW : T −→ N = T/W is a fiber bundle with fiber diffeomorphic to R and the quotient manifold N = T/W is connected and orientable.

We want to interpret the integral curves of W as the worldlines of light sources. Thus, one should assume that they are not only causal but even timelike, g(W, W) < 0, since a light source should move at subluminal velocity. For technical reasons, however, we


allow for the possibility that an integral curve of W is lightlike (everywhere or at some points), because such curves may appear as (C1-)limits of timelike curves. This will give us the possibility to apply the resulting formalism to asymptotically simple and empty spacetimes in a convenient way, see Subsect. 6.2 below. Actually, the causal character of W will have little influence upon the results we want to establish. What really matters is a transversality condition that enters into the definition of the lens map below.

Please note that, in the situation of Def. 1, the bundle πW : T −→ N is necessarily trivializable, i.e., T ≅ N × R. To prove this, let us assume that the flow of W is defined on all of R × T, so it makes πW : T −→ N into a principal fiber bundle. (This is no restriction of generality since it can always be achieved by multiplying W with an appropriate function. This function can be determined in the following way. Owing to a famous theorem of Whitney [6], also see Hirsch [7], p. 55, paracompactness guarantees that T can be embedded as a closed submanifold into Rn for some n. Pulling back the Euclidean metric gives a complete Riemannian metric h on T and the flow of the vector field h(W, W)^(−1/2) W is defined on all of R × T, cf. Abraham and Marsden [8], Prop. 2.1.21.) Then the result follows from the well known facts that any fiber bundle whose typical fiber is diffeomorphic to Rn admits a global section (see, e.g., Kobayashi and Nomizu [9], p. 58), and that a principal fiber bundle is trivializable if and only if it admits a global section (see again [9], p. 57).

Also, it is interesting to note the following. If T is any 3-dimensional submanifold of M that is foliated into timelike curves, then time orientability guarantees that these are the integral curves of a timelike vector field W.
If we assume, in addition, that T contains no closed timelike curves, then it can be shown that πW : T −→ N is necessarily a fiber bundle with fiber diffeomorphic to R, provided N satisfies the Hausdorff property, see Harris [10], Theorem 2. This shows that there is little room for relaxing the conditions of Def. 1.

Choosing a source surface in a spacetime will give us the target space N = T/W for the lens map. To specify the domain of the lens map, we consider, at any point p ∈ M, the set Sp of all lightlike directions at p, i.e., the set of all one-dimensional lightlike subspaces of Tp M. We shall refer to Sp as the celestial sphere at p. This is justified since, obviously, Sp is in natural one-to-one relation with the set of all light rays arriving at p. As it is more convenient to work with vectors rather than with directions, we shall usually represent Sp as a submanifold of Tp M. To that end we fix a future-pointing timelike vector Vp in the tangent space Tp M. The vector Vp may be interpreted as the 4-velocity of an observer at p. We now consider the set

\[ S_p = \{\, Y_p \in T_pM \mid g(Y_p, Y_p) = 0 \ \text{and}\ g(Y_p, V_p) = 1 \,\}. \tag{1} \]

It is an elementary fact that (1) defines an embedded submanifold of Tp M which is diffeomorphic to the standard 2-sphere S2. As indicated by our notation, the set (1) can be identified with the celestial sphere at p, just by relating each vector to the direction spanned by it. Representation (1) of the celestial sphere gives a convenient way of representing the light rays through p. We only have to assign to each Yp ∈ Sp the lightlike geodesic s −→ expp(sYp), where expp : Wp ⊆ Tp M −→ M denotes the exponential map at the point p of the Levi-Civita connection of the metric g. Please note that this geodesic is past-pointing, because Vp was chosen future-pointing, and that it passes through p at the parameter value s = 0. The lens map is defined in the following way.
After fixing a source surface (T, W) and choosing a point p ∈ M, we denote by Dp ⊆ Sp the subset of all lightlike directions at p such that the geodesic to which this direction is tangent meets T (at least once) if sufficiently extended to the past, and such that at the first intersection point q with T this geodesic is transverse to T. By projecting q to N = T/W we get the lens map fp : Dp −→ N = T/W, see Fig. 1.

Fig. 1. Illustration of the lens map [figure labels: p, Yp, W, T, πW, fp(Yp), N]

If we use the representation (1) for Sp, the definition of the lens map can be given in more formal terms in the following way.

Definition 2. Let (T, W) be a source surface in a spacetime (M, g). Then, for each p ∈ M, the lens map fp : Dp −→ N = T/W is defined in the following way. In the notation of Eq. (1), let Dp be the set of all Yp ∈ Sp such that there is a real number wp(Yp) > 0 with the properties
(a) sYp is in the maximal domain of the exponential map for all s ∈ [0, wp(Yp)];
(b) the curve s −→ expp(sYp) intersects T at the value s = wp(Yp) transversely;
(c) expp(sYp) ∉ T for all s ∈ [0, wp(Yp)[.
This defines a map wp : Dp −→ R. The lens map at p is then, by definition, the map

\[ f_p : D_p \longrightarrow N = T/W, \qquad f_p(Y_p) = \pi_W\bigl(\exp_p(w_p(Y_p)\,Y_p)\bigr). \tag{2} \]

Here πW : T −→ N denotes the natural projection. The transversality condition in part (b) of Def. 2 guarantees that the domain Dp of the lens map is an open subset of Sp . The case Dp = ∅ is, of course, not excluded. In particular, Dp = ∅ whenever p ∈ T , owing to part (c) of Def. 2.


Moreover, the transversality condition in part (b) of Definition 2, in combination with the implicit function theorem, makes sure that the map wp : Dp −→ R is a C ∞ map. As the exponential map of a C ∞ metric is again C ∞ , and πW is a C ∞ submersion by assumption, this proves the following. Proposition 1. The lens map is a C ∞ map. Please note that without the transversality condition the lens map need not even be continuous. Although our Def. 2 made use of the representation (1), which refers to a timelike vector Vp , the lens map is, of course, independent of which future-pointing Vp has been chosen. We decided to index the lens map only with p although, strictly speaking, it depends on T , on W , and on p. Our philosophy is to keep a source surface (T , W ) fixed, and then to consider the lens map for all points p ∈ M. In view of gravitational lensing, the lens map admits the following interpretation. For ξ ∈ N , each point Yp ∈ Dp with fp (Yp ) = ξ corresponds to a past-pointing lightlike geodesic from p to the worldline ξ in M, i.e., it corresponds to an image at the celestial sphere of p of the light source with worldline ξ . If fp is not injective, we are in a multiple imaging situation. The converse need not be true as the lens map does not necessarily cover all images. There might be a past-pointing lightlike geodesic from p reaching ξ after having met T before, or being tangential to T on its arrival at ξ . In either case, the corresponding image is ignored by the lens map. The reader might be inclined to view this as a disadvantage. However, in Sect. 6 below we discuss some situations where the existence of such additional light rays can be excluded (e.g., asymptotically simple and empty spacetimes) and situations where it is desirable, on physical grounds, to disregard such additional light rays (e.g., weakly perturbed Robertson–Walker spacetimes with compact spatial sections). 
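Definition 2 can be illustrated in the simplest possible setting. The following is a minimal sketch under assumptions that are not taken from the text: the spacetime is flat Minkowski space with coordinates (x, y, z, t) and metric diag(+1, +1, +1, −1), the source surface T is the timelike cylinder |x| = radius with W = ∂/∂t, so that N = T/W is a 2-sphere, and the observer sits at a spatial position x0 inside the cylinder with Vp = ∂/∂t. All function names are hypothetical.

```python
import math

def g(u, v):
    """Minkowski metric of signature (+, +, +, -) on 4-vectors (x, y, z, t)."""
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2] - u[3]*v[3]

V_P = (0.0, 0.0, 0.0, 1.0)  # future-pointing observer 4-velocity, g(V_P, V_P) = -1

def celestial_vector(polar, azimuth):
    """Element Y_p of the set (1): lightlike with g(Y_p, V_P) = 1, hence past-pointing."""
    s = math.sin(polar)
    return (s*math.cos(azimuth), s*math.sin(azimuth), math.cos(polar), -1.0)

def lens_map(x0, Y, radius=1.0):
    """Return (w_p(Y), f_p(Y)) for an observer at spatial position x0, |x0| < radius.

    The geodesic is s -> (x0 + s*n, t0 - s); f_p(Y) is the point of the 2-sphere N
    that is hit first, which labels the source worldline x = const.
    """
    n = Y[:3]
    b = sum(a*c for a, c in zip(x0, n))
    c = sum(a*a for a in x0) - radius**2   # negative inside the cylinder
    w = -b + math.sqrt(b*b - c)            # only positive root of |x0 + s*n| = radius
    return w, tuple(a + w*m for a, m in zip(x0, n))

x0 = (0.3, 0.0, 0.0)
for polar, azimuth in [(math.pi/2, 0.0), (math.pi/2, math.pi), (0.0, 0.0)]:
    Y = celestial_vector(polar, azimuth)
    w, xi = lens_map(x0, Y)
    print(f"g(Y,Y)={g(Y, Y):+.1e}  g(Y,V)={g(Y, V_P):+.3f}  w={w:.4f}  f_p(Y)={xi}")
```

In this toy case every past-pointing light ray from p reaches T, and it does so transversely because n · (x0 + w n) = b + w > 0, so Dp = Sp. There is no multiple imaging, as expected in flat spacetime.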
It was already mentioned that the domain Dp of the lens map might be empty; this is, of course, the worst case that could happen. The best case is that the domain is all of the celestial sphere, Dp = Sp. We shall see in the following sections that many interesting results are true just in this case. However, there are several cases of interest where Dp is a proper subset of Sp. If the domain of the lens map fp is the whole celestial sphere, none of the light rays issuing from p into the past is blocked or trapped before it reaches T. In view of applications to gravitational lensing, this excludes the possibility that these light rays meet a non-transparent deflector. In other words, it is a typical feature of gravitational lensing situations with non-transparent deflectors that Dp is not all of Sp. Two simple examples, viz., a non-transparent string and a non-transparent spherical body, will be considered in Subsect. 6.1 below.

3. Regular and Critical Values of the Lens Map

Please recall that, for a differentiable map F : M1 −→ M2 between two manifolds, Y ∈ M1 is called a regular point of F if the differential TY F : TY M1 −→ TF(Y) M2 has maximal rank, otherwise Y is called a critical point. Moreover, ξ ∈ M2 is called a regular value of F if all Y ∈ F⁻¹(ξ) are regular points, otherwise ξ is called a critical value. Please note that, according to this definition, any ξ ∈ M2 that is not in the image of F is regular. The well-known (Morse-)Sard theorem (see, e.g., Hirsch [7], p. 69) says that the set of regular values of F is residual (i.e., it contains the intersection of countably many sets that are open and dense in M2) and thus dense in M2, and that the critical values of F make up a set of measure zero in M2.
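The definitions of regular and critical values can be tried out on a simple planar map. The following is a minimal sketch, assuming the fold map F(x, y) = (x², y) on R², which is chosen here purely for illustration and is not one of the lens maps of this paper; the function names are hypothetical. Its critical points are the points with x = 0, and its critical values form the line u = 0.

```python
import math

def F(x, y):
    """Fold map R^2 -> R^2 (illustrative toy example)."""
    return (x * x, y)

def jacobian_det(x, y):
    """Determinant of the differential [[2x, 0], [0, 1]] of F."""
    return 2.0 * x

def preimages(u, v):
    """All points (x, y) with F(x, y) = (u, v)."""
    if u < 0.0:
        return []
    if u == 0.0:
        return [(0.0, v)]
    r = math.sqrt(u)
    return [(r, v), (-r, v)]

def is_regular_value(u, v):
    """True if every preimage is a regular point; vacuously true if there is none."""
    return all(jacobian_det(x, y) != 0.0 for x, y in preimages(u, v))

for value in [(1.0, 0.3), (0.0, 0.3), (-1.0, 0.3)]:
    print(value, len(preimages(*value)), is_regular_value(*value))
```

The value (−1, 0.3) has no preimage and is therefore regular, in line with the convention noted above. The critical values fill a line, which has measure zero in the plane, as the Sard theorem requires.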

For the lens map fp : Dp −→ N, we call the set

\[ \mathrm{Caust}(f_p) = \{\, \xi \in N \mid \xi \ \text{is a critical value of}\ f_p \,\} \tag{3} \]

the caustic of fp. The Sard theorem then implies the following result.

Proposition 2. The caustic Caust(fp) is a set of measure zero in N and its complement N \ Caust(fp) is residual and thus dense in N.

Please note that Caust(fp) need not be closed in N. Counter-examples can be constructed easily by starting with situations where the caustic is closed and then excising points from spacetime. For lens maps defined on the whole celestial sphere, however, we have the following result.

Proposition 3. If Dp = Sp, the caustic Caust(fp) is compact in N.

This is an obvious consequence of the fact that Sp is compact and that fp and its first derivative are continuous. As the domain and the target space of fp have the same dimension, Yp ∈ Dp is a regular point of fp if and only if the differential TYp fp : TYp Sp −→ Tfp(Yp) N is an isomorphism. In this case fp maps a neighborhood of Yp diffeomorphically onto a neighborhood of fp(Yp). The differential TYp fp may be either orientation-preserving or orientation-reversing. To make this notion precise we have to choose an orientation for Sp and an orientation for N. For the celestial sphere Sp it is natural to choose the orientation according to which the origin of the tangent space Tp M is to the inner side of Sp. The target manifold N is orientable by assumption, but in general there is no natural choice for the orientation. Clearly, choosing an orientation for N fixes an orientation for T, because the vector field W gives us an orientation for the fibers. We shall say that the orientation of N is adapted to some point Yp ∈ Dp if the geodesic with initial vector Yp meets T at the inner side. If Dp is connected, the orientation of N that is adapted to some Yp ∈ Dp is automatically adapted to all other elements of Dp. Using this terminology, we may now introduce the following definition.

Definition 3. A regular point Yp ∈ Dp of the lens map fp is said to have even parity (or odd parity, respectively) if TYp fp is orientation-preserving (or orientation-reversing, respectively) with respect to the natural orientation on Sp and the orientation adapted to Yp on N. For a regular value ξ ∈ N of the lens map, we denote by n+(ξ) (or n−(ξ), respectively) the number of elements in fp⁻¹(ξ) with even parity (or odd parity, respectively).

Please note that n+(ξ) and n−(ξ) may be infinite, see the Schwarzschild example in Subsect. 6.1 below. A criterion for n±(ξ) to be finite will be given in Prop. 8 below. Definition 3 is relevant for gravitational lensing in the following sense. The assumption that Yp is a regular point of fp implies that an observer at p sees a neighborhood of ξ = fp(Yp) in N as a neighborhood of Yp at his or her celestial sphere. If we compare the case that Yp has odd parity with the case that Yp has even parity, then the appearance of the neighborhood in the first case is the mirror image of its appearance in the second case. This difference is observable for a light source that is surrounded by some irregularly shaped structure, e.g. a galaxy with curved jets or with lobes. If ξ is a regular value of fp, it is obvious that the points in fp⁻¹(ξ) are isolated, i.e., any Yp in fp⁻¹(ξ) has a neighborhood in Dp that contains no other point in fp⁻¹(ξ). This follows immediately from the fact that fp maps a neighborhood of Yp diffeomorphically


onto its image. In the next section we shall formulate additional assumptions such that the set fp⁻¹(ξ) is finite, i.e., such that the numbers n±(ξ) introduced in Def. 3 are finite. It is the main purpose of the next section to demonstrate that then the difference n+(ξ) − n−(ξ) has some topological invariance properties. As a preparation for that we notice the following result which is an immediate consequence of the fact that the lens map is a local diffeomorphism near each regular point.

Proposition 4. n+ and n− are constant on each connected component of fp(Dp) \ Caust(fp).

Hence, along any continuous curve in fp(Dp) that does not meet the caustic of the lens map, the numbers n+ and n− remain constant, i.e., the observer at p sees the same number of images for all light sources on this curve. If a curve intersects the caustic, the number of images will jump. In the next section we shall prove that n+ and n− always jump by the same amount (under conditions making sure that these numbers are finite), i.e., the total number of images always jumps by an even number. This is well known in the quasi-Newtonian approximation formalism, see, e.g., Schneider, Ehlers and Falco [1], Sect. 6. If Caust(fp) is empty, transversality guarantees that fp(Dp) is open in N and thus a manifold. Proposition 4 implies that, in this case, fp gives a C∞ covering map from Dp onto fp(Dp). As a C∞ covering map onto a simply connected manifold must be a global diffeomorphism, this implies the following result.

Proposition 5. Assume that Caust(fp) is empty and that fp(Dp) is simply connected. Then fp gives a global diffeomorphism from Dp onto fp(Dp).

In other words, the formation of a caustic is necessary for multiple imaging provided that fp(Dp) is simply connected. In Subsect. 6.1 below we shall consider the spacetime of a non-transparent string. This will demonstrate that the conclusion of Prop. 5 is not true without the assumption of fp(Dp) being simply connected.

In the rest of this section we want to relate the caustic of the lens map to the caustic of the past light cone of p. The past light cone of p can be defined as the image set in M of the map

\[ F_p : (s, Y_p) \longmapsto \exp_p(s Y_p) \tag{4} \]

considered on its maximal domain in ] 0 , ∞ [ × Sp , and its caustic can be defined as the set of critical values of Fp . In other words, q ∈ M is in the caustic of the past light cone of p if and only if there is an s0 ∈ ] 0 , ∞ [ and a Yp ∈ Sp such that the differential T(s0 ,Yp ) Fp has rank k < 3. In that case one says that the point q = expp (s0 Yp ) is conjugate to p along the geodesic s −→ expp (sYp ), and one calls the number m = 3 − k the multiplicity of this conjugate point. As Fp ( · , Yp ) is always an immersion, the multiplicity can take the values 1 and 2 only. (This formulation is equivalent to the definition of conjugate points and their multiplicities in terms of Jacobi vector fields which may be more familiar to the reader.) It is well known, but far from trivial, that along every lightlike geodesic conjugate points are isolated. Hence, in a compact parameter interval there are only finitely many points that are conjugate to a fixed point p. A proof can be found, e.g., in Beem, Ehrlich and Easley [11], Theorem 10.77. After these preparations we are now ready to establish the following proposition. We use the notation introduced in Def. 2.


Proposition 6. An element Yp ∈ Dp is a regular point of the lens map if and only if the point expp(wp(Yp)Yp) is not conjugate to p along the geodesic s −→ expp(sYp). A regular point Yp ∈ Dp has even parity (or odd parity, respectively) if and only if the number of points conjugate to p along the geodesic [0, wp(Yp)] −→ M, s −→ expp(sYp) is even (or odd, respectively). Here each conjugate point is to be counted with its multiplicity.

Proof. In terms of the function (4), the lens map can be written in the form

\[ f_p(Y_p) = \pi_W\bigl(F_p(w_p(Y_p), Y_p)\bigr). \tag{5} \]

As s −→ Fp(s, Yp) is an immersion transverse to T at s = wp(Yp) and πW is a submersion, the differential of fp at Yp has rank 2 if and only if the differential of Fp at (wp(Yp), Yp) has rank 3. This proves the first claim. For proving the second claim define, for each s ∈ [0, wp(Yp)], a map

\[ \Psi_s : T_{Y_p} S_p \longrightarrow T_{f_p(Y_p)} N \tag{6} \]

by applying to each vector in TYp Sp the differential T(s,Yp) Fp, parallel-transporting the result along the geodesic Fp( · , Yp) to the point q = Fp(wp(Yp), Yp) and then projecting down to Tfp(Yp) N. In the last step one uses the fact that, by transversality, any vector in Tq M can be uniquely decomposed into a vector tangent to T and a vector tangent to the geodesic Fp( · , Yp). For s = 1, this map Ψs gives the differential of the lens map. We now choose a basis in TYp Sp and a basis in Tfp(Yp) N, thereby representing the map Ψs as a (2 × 2)-matrix. We choose the first basis right-handed with respect to the natural orientation on Sp and the second basis right-handed with respect to the orientation on N that is adapted to Yp. Then det(Ψ0) is positive as the parallel transport gives an orientation-preserving isomorphism. The function s −→ det(Ψs) has a single zero whenever Fp(s, Yp) is a conjugate point of multiplicity one and it has a double zero whenever Fp(s, Yp) is a conjugate point of multiplicity two. Hence, the sign of det(Ψ1) can be determined by counting the conjugate points.

This result implies that ξ ∈ N is a regular value of the lens map fp whenever the worldline ξ does not pass through the caustic of the past light cone of p. The relation between parity and the number of conjugate points is geometrically rather evident because each conjugate point is associated with a “crossover” of infinitesimally neighboring light rays.

4. The Mapping Degree of the Lens Map

The mapping degree (also known as Brouwer degree) is one of the most powerful tools in differential topology. In this section we want to investigate what kind of information could be gained from the mapping degree of the lens map, provided it can be defined. For the reader’s convenience we briefly summarize the definition and main properties of the mapping degree, following closely Choquet-Bruhat, DeWitt-Morette, and Dillard-Bleick [12], pp. 477.
For a more abstract approach, using homology theory, the reader may consult Dold [13], Spanier [14] or Bredon [15]. In this article we shall not use homology theory with the exception of the proof of Prop. 11. The definition of the mapping degree is based on the following observation.


Proposition 7. Let F : D̄ ⊆ M1 −→ M2 be a continuous map, where M1 and M2 are oriented connected manifolds of the same dimension, D is an open subset of M1 with compact closure D̄ and F|D is a C∞ map. (Actually, C1 would do.) Then for every ξ ∈ M2 \ F(∂D) which is a regular value of F|D, the set F⁻¹(ξ) is finite.

Proof. By contradiction, let us assume that there is a sequence (yi)i∈N with pairwise different elements in F⁻¹(ξ). By compactness of D̄, we can choose an infinite subsequence of (yi)i∈N that converges towards some point y∞ ∈ D̄. By continuity of F, F(y∞) = ξ, so the hypotheses of the proposition imply that y∞ ∉ ∂D. As a consequence, y∞ is a regular point of F|D, so it must have an open neighborhood in D that does not contain any other element of F⁻¹(ξ). This contradicts the fact that a subsequence of (yi)i∈N converges towards y∞.

If we have a map F that satisfies the hypotheses of Prop. 7, we can thus define, for every ξ ∈ M2 \ F(∂D) which is a regular value of F|D,

\[ \deg(F, \xi) = \sum_{y \in F^{-1}(\xi)} \operatorname{sgn}(y), \tag{7} \]

where sgn(y) is defined to be +1 if the differential Ty F preserves orientation and −1 if Ty F reverses orientation. If F⁻¹(ξ) is the empty set, the right-hand side of (7) is set equal to zero. The number deg(F, ξ) is called the mapping degree of F at ξ. Roughly speaking, deg(F, ξ) tells how often the image of F covers the point ξ, counting each “layer” positive or negative depending on orientation. The mapping degree has the following properties (for proofs see Choquet-Bruhat, DeWitt-Morette, and Dillard-Bleick [12], pp. 477).

Property A. deg(F, ξ) = deg(F, ξ′) whenever ξ and ξ′ are in the same connected component of M2 \ F(∂D).

Property B. deg(F, ξ) = deg(F′, ξ) whenever F and F′ are homotopic, i.e., whenever there is a continuous map Φ : [0, 1] × D̄ −→ M2, (s, y) −→ Φs(y) with Φ0 = F and Φ1 = F′ such that deg(Φs, ξ) is defined for all s ∈ [0, 1].

Property A can be used to extend the definition of deg(F, ξ) to the non-regular values ξ ∈ M2 \ F(∂D). Given the fact that, by the Sard theorem, the regular values are dense in M2, this can be done just by continuous extension. Property B can be used to extend the definition of deg(F, ξ) to continuous maps F : D̄ −→ M2 which are not necessarily differentiable on D. Given the fact that the C∞ maps are dense in the continuous maps with respect to the C0-topology, this can be done again just by continuous extension.

We now apply these general results to the lens map fp : Dp −→ N. In the case Dp ≠ Sp it is necessary to extend the domain of the lens map onto a compact set to define the degree of the lens map. We introduce the following definition.

Definition 4. A map f̄p : D̄p ⊆ M1 −→ M2 is called an extension of the lens map fp : Dp −→ N if
(a) M1 is an orientable manifold that contains Dp as an open submanifold;
(b) M2 is an orientable manifold that contains N as an open submanifold;
(c) the closure D̄p of Dp in M1 is compact;
(d) f̄p is continuous and the restriction of f̄p to Dp is equal to fp.
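The degree formula (7) and Property A can be checked numerically on a toy example. The following is a minimal sketch, assuming the circle map F(φ) = φ + 2 sin φ (mod 2π) with D = M1 = M2 = S¹, so that ∂D is empty; this map and the function names are illustrative assumptions and not taken from the paper. Its critical values are approximately 2.46 and 3.83, and regular values between them have more preimages than regular values outside that interval.

```python
import math

TWO_PI = 2.0 * math.pi

def lift(phi):
    """Lift to R of the circle map F(phi) = phi + 2 sin(phi) (mod 2 pi)."""
    return phi + 2.0 * math.sin(phi)

def dlift(phi):
    """Derivative of the lift; its sign plays the role of sgn(y) in formula (7)."""
    return 1.0 + 2.0 * math.cos(phi)

def signed_preimages(xi, steps=2000):
    """All (phi, sgn) with phi in [0, 2 pi) and F(phi) = xi, for a regular value 0 < xi < 2 pi."""
    found = []
    h = TWO_PI / steps
    for k in (-1, 0, 1):                      # branches xi + 2 pi k of the target angle
        target = xi + k * TWO_PI
        for i in range(steps):
            a, b = i * h, (i + 1) * h
            ga, gb = lift(a) - target, lift(b) - target
            if ga == 0.0:
                found.append((a, 1 if dlift(a) > 0 else -1))
            elif ga * gb < 0.0:
                for _ in range(60):           # bisection on the bracketing interval
                    m = 0.5 * (a + b)
                    if (lift(m) - target) * ga <= 0.0:
                        b = m
                    else:
                        a = m
                phi = 0.5 * (a + b)
                found.append((phi, 1 if dlift(phi) > 0 else -1))
    return found

def degree(xi):
    """Mapping degree at xi as the signed count of preimages, formula (7)."""
    return sum(sign for _, sign in signed_preimages(xi))

for xi in (1.0, 3.0):
    print(xi, len(signed_preimages(xi)), degree(xi))
```

The number of preimages differs between the two sample values, while the signed count is the same, as Property A predicts for a connected target with empty F(∂D).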


If the lens map is defined on the whole celestial sphere, Dp = Sp, then the lens map is an extension of itself, f̄p = fp, with M1 = Sp and M2 = N. If Dp ≠ Sp, one may try to continuously extend fp onto the closure of Dp in Sp, thereby getting an extension with M1 = Sp and M2 = N. If this does not work, one may try to find some other extension. The string spacetime in Subsect. 6.1 below will provide us with an example where an extension exists although fp cannot be continuously extended from Dp onto its closure in Sp. The spacetime around a spherically symmetric body with Ro < 3m will provide us with an example where the lens map admits no extension at all, see Subsect. 6.1 below. Applying Prop. 7 to the case F = f̄p immediately gives the following result.

Proposition 8. If the lens map fp : Dp −→ N admits an extension f̄p : D̄p ⊆ M1 −→ M2, then for all regular values ξ ∈ N \ f̄p(∂Dp) the set fp^{-1}(ξ) is finite, so the numbers n+(ξ) and n−(ξ) introduced in Def. 3 are finite.

If f̄p is an extension of the lens map fp, the number deg(f̄p, ξ) is a well-defined integer for all ξ ∈ N \ f̄p(∂Dp), provided that we have chosen an orientation on M1 and on M2. The number deg(f̄p, ξ) changes sign if we change the orientation on M1 or on M2. This sign ambiguity can be removed if Dp is connected. Then we know from the preceding section that N admits an orientation that is adapted to all Yp ∈ Dp. As N is connected, this determines an orientation for M2. Moreover, the natural orientation on Sp induces an orientation on Dp which, for Dp connected, gives an orientation for M1. In the rest of this paper we shall only be concerned with the situation that Dp is connected, and we shall always tacitly assume that the orientations have been chosen as indicated above, thereby fixing the sign of deg(f̄p, ξ). Now comparison of (7) with Def. 3 shows that

$$\deg(\bar{f}_p, \xi) = n_+(\xi) - n_-(\xi) \tag{8}$$

for all regular values in N \ f̄p(∂Dp). Owing to Property A, this has the following consequence.

Proposition 9. Assume that Dp is connected and that the lens map admits an extension f̄p : D̄p ⊆ M1 −→ M2. Then n+(ξ) − n−(ξ) = n+(ξ′) − n−(ξ′) for any two regular values ξ and ξ′ which are in the same connected component of N \ f̄p(∂Dp). In particular, n+(ξ) + n−(ξ) is odd if and only if n+(ξ′) + n−(ξ′) is odd.

We know already from Prop. 4 that the numbers n+ and n− remain constant along each continuous curve in fp(Dp) that does not meet the caustic of fp. Now let us consider a continuous curve α : ]−ε0, ε0[ −→ fp(Dp) that meets the caustic at α(0) whereas α(ε) is a regular value of fp for all ε ≠ 0. Under the additional assumptions that Dp is connected, that fp admits an extension, and that α(0) ∉ f̄p(∂Dp), Prop. 9 tells us that n+(α(ε)) − n−(α(ε)) remains constant when ε passes through zero. In other words, n+ and n− are allowed to jump only by the same amount. As a consequence, the total number of images n+ + n− is allowed to jump only by an even number.

We now specialize to the case that the lens map is defined on the whole celestial sphere, Dp = Sp. Then the assumption of fp admitting an extension is trivially satisfied, with f̄p = fp, and the degree deg(fp, ξ) is a well-defined integer for all ξ ∈ N. Moreover,


deg(fp, ξ) is a constant with respect to ξ, owing to Property A. It is then usual to write simply deg(fp) instead of deg(fp, ξ). Using this notation, (8) simplifies to

$$\deg(f_p) = n_+(\xi) - n_-(\xi) \tag{9}$$

for all regular values ξ of fp. Thus, the total number of images

$$n_+(\xi) + n_-(\xi) = \deg(f_p) + 2\, n_-(\xi) \tag{10}$$

is either even for all regular values ξ or odd for all regular values ξ, depending on whether deg(fp) is even or odd. In some gravitational lensing situations it might be possible to show that there is one light source ξ ∈ N for which fp^{-1}(ξ) consists of exactly one point, i.e., ξ is not multiply imaged. This situation is characterized by the following proposition.

Proposition 10. Assume that Dp = Sp and that there is a regular value ξ of fp such that fp^{-1}(ξ) is a single point. Then |deg(fp)| = 1. In particular, fp must be surjective and N must be diffeomorphic to the sphere S^2.

Proof. The result |deg(fp)| = 1 can be read directly from (9), choosing the regular value ξ which has exactly one pre-image point under fp. This implies that fp must be surjective since a non-surjective map has degree zero. So N, being the continuous image of the compact set Sp under the continuous map fp, must be compact. It is well known (see, e.g., Hirsch [7], p. 130, Exercise 5) that for n ≥ 2 the existence of a continuous map F : S^n −→ M2 with deg(F) = 1 onto a compact oriented n-manifold M2 implies that M2 must be simply connected. As the lens map gives us such a map onto N (after changing the orientation of N, if necessary), we have thus found that N must be simply connected. Owing to the well-known classification theorem of compact orientable two-dimensional manifolds (see, e.g., Hirsch [7], Chapter 9), this implies that N must be diffeomorphic to the sphere S^2.

In the situation of Prop. 10 we have n+(ξ) + n−(ξ) = 2n−(ξ) ± 1 for all ξ ∈ N \ Caust(fp), i.e., the total number of images is odd for all light sources ξ ∈ N ≅ S^2 that do not lie on the caustic of fp. The idea to use the mapping degree for proving an odd number theorem in this way was published apparently for the first time in the introduction of McKenzie [16]. In Prop. 10 one would, of course, like to drop the rather restrictive assumption that fp^{-1}(ξ) is a single point for some ξ. In the next section we consider a special situation where the result |deg(fp)| = 1 can be derived without this assumption.

5. Simple Lensing Neighborhoods

In this section we investigate a special class of spacetime regions that will be called “simple lensing neighborhoods”. Although the assumption of having a simple lensing neighborhood is certainly rather special, we shall demonstrate in Sect. 6 below that sufficiently many examples of physical interest exist. We define simple lensing neighborhoods in the following way.

Definition 5. (U, T, W) is called a simple lensing neighborhood in a spacetime (M, g) if
(a) U is an open connected subset of M and T is the boundary of U in M;
(b) (T = ∂U, W) is a source surface in the sense of Def. 1;


(c) for all p ∈ U, the lens map fp : Dp −→ N = ∂U/W is defined on the whole celestial sphere, Dp = Sp;
(d) U does not contain an almost periodic lightlike geodesic.

Here the notion of being “almost periodic” is defined in the following way. Any immersed curve λ : I −→ U, defined on a real interval I, induces a curve λ̂ : I −→ PU in the projective tangent bundle PU over U which is defined by λ̂(s) = { c λ̇(s) | c ∈ R }. The curve λ is called almost periodic if there is a strictly monotonic sequence of parameter values (si)i∈N such that the sequence (λ̂(si))i∈N has an accumulation point in PU. Please note that Condition (d) of Def. 5 is certainly true if the strong causality condition holds everywhere on U, i.e., if there are no closed or almost closed causal curves in U. Also, Condition (d) is certainly true if every future-inextendible lightlike geodesic in U has a future end-point in M.

Condition (d) should be viewed as adding a fairly mild assumption on the future-behavior of lightlike geodesics to the fairly strong assumptions on their past-behavior that are contained in Condition (c). In particular, Condition (c) excludes the possibility that past-oriented lightlike geodesics are blocked or trapped inside U, i.e., it excludes the case that U contains non-transparent deflectors. Condition (c) requires, in addition, that the past-pointing lightlike geodesics are transverse to ∂U when leaving U.

In the situation of a simple lensing neighborhood, we have for each p ∈ U a lens map that is defined on the whole celestial sphere, fp : Sp −→ N = ∂U/W. We have, thus, Eq. (9) at our disposal which relates the numbers n+(ξ) and n−(ξ), for any regular value ξ ∈ N, to the mapping degree of fp. (Please recall that, by Prop. 8, n+(ξ) and n−(ξ) are finite.) It is our main goal to prove that, in a simple lensing neighborhood, the mapping degree of the lens map equals ±1, so n(ξ) = n+(ξ) + n−(ξ) is odd for all regular values ξ.
Also, we shall prove that a simple lensing neighborhood must be contractible and that its boundary must be diffeomorphic to S^2 × R. The latter result reflects the fact that the notion of simple lensing neighborhoods generalizes the notion of asymptotically simple and empty spacetimes, with ∂U corresponding to past lightlike infinity J−, as will be detailed in Subsect. 6.2 below. When proving the desired properties of simple lensing neighborhoods we may therefore use several techniques that have been successfully applied to asymptotically simple and empty spacetimes before. As a preparation we need the following lemma.

Lemma 1. Let (U, T, W) be a simple lensing neighborhood in a spacetime (M, g). Then there is a diffeomorphism Ψ from the sphere bundle S = { Yp ∈ Sp | p ∈ U } of lightlike directions over U onto the space TN × R^2 such that the following diagram commutes.

$$\begin{array}{ccc} S & \stackrel{\Psi}{\longrightarrow} & TN \times \mathbb{R}^2 \\ {\scriptstyle i_p}\,\uparrow & & \downarrow\,{\scriptstyle \mathrm{pr}} \\ S_p & \stackrel{f_p}{\longrightarrow} & N \end{array} \tag{11}$$

Here ip denotes the inclusion map and pr is defined by dropping the second factor and projecting to the foot-point.

Proof. We fix a trivialization for the bundle πW : T −→ N and identify T with N × R. Then we consider the bundle B = { Xq ∈ Bq | q ∈ T } over T, where Bq ⊂ Sq is, by definition, the subspace of all lightlike directions that are tangent to past-oriented


lightlike geodesics that leave U transversely at q. Now we choose for each q ∈ T a vector Qq ∈ Tq M, smoothly depending on q, which is non-tangent to T and outward pointing. With the help of this vector field Q we may identify B and TN × R as bundles over T ≅ N × R in the following way. Fix ξ ∈ N, Xξ ∈ Tξ N and s ∈ R and view the tangent space Tξ N as a natural subspace of Tq(N × R), where q = (ξ, s). Then the desired identification is given by associating the pair (Xξ, s) with the direction spanned by Zq = Xξ + Qq − α W(q), where the number α is uniquely determined by the requirement that Zq should be lightlike and past-pointing.

Now we consider the map

$$\pi : S \longrightarrow B \cong TN \times \mathbb{R} \tag{12}$$

given by following each lightlike geodesic from a point p ∈ U into the past until it reaches T, and assigning the tangent direction at the end-point to the tangent direction at the initial point. As a matter of fact, (12) gives a principal fiber bundle with structure group R. To prove this, we first observe that the geodesic spray induces a vector field without zeros on S. By multiplying this vector field with an appropriate function we get a vector field whose flow is defined on all of R × S (see the second paragraph after Def. 1 for how to find such a function). The flow of this rescaled vector field defines an R-action on S such that (12) can be identified with the projection onto the space of orbits. Conditions (c) and (d) of Def. 5 guarantee that no orbit is closed or almost closed. Owing to a general result of Palais [5], this is sufficient to prove that this action makes (12) into a principal fiber bundle with structure group R. However, any such bundle is trivializable, see, e.g., Kobayashi and Nomizu [9], pp. 57/58. Choosing a trivialization for (12) gives us the desired diffeomorphism Ψ from S to B × R ≅ TN × R^2. The commutativity of the diagram (11) follows directly from the definition of the lens map fp.

With the help of this lemma we will now prove the following proposition which is at the center of this section.

Proposition 11. Let (U, T, W) be a simple lensing neighborhood in a spacetime (M, g). Then
(a) N = T/W is diffeomorphic to the standard 2-sphere S^2;
(b) U is contractible;
(c) for all p ∈ U, the lens map fp : Sp ≅ S^2 −→ N ≅ S^2 has |deg(fp)| = 1; in particular, fp is surjective.

Proof. In the proof of parts (a) and (b) we shall adapt techniques used by Newman and Clarke [17, 18] in their study of asymptotically simple and empty spacetimes. To that end it will be necessary to assume that the reader is familiar with homology theory. With the sphere bundle S, introduced in Lemma 1, we may associate the Gysin homology sequence

$$\dots \longrightarrow H_m(S) \longrightarrow H_m(U) \longrightarrow H_{m-3}(U) \longrightarrow H_{m-1}(S) \longrightarrow \dots, \tag{13}$$

where Hm (X ) denotes the mth homology group of the space X with coefficients in a field F. For any choice of F, the Gysin sequence is an exact sequence of abelian groups, see, e.g., Spanier [14], p. 260 or, for the analogous sequence of cohomology groups, Bredon [15], p. 390. By Lemma 1, S and N have the same homotopy type, so Hm (S) and Hm (N ) are isomorphic. Upon inserting this into (13), we use the fact


that Hm(U) = 1 (= trivial group consisting of the unit element only) for m > 4 and Hm(N) = 1 for m > 2 because dim(U) = 4 and dim(N) = 2. Also, we know that H0(U) = F and H0(N) = F since U and N are connected. Then the exactness of the Gysin sequence implies that

$$H_m(U) = 1 \quad \text{for } m > 0 \tag{14}$$

and

$$H_1(N) = 1, \qquad H_2(N) = F. \tag{15}$$

From (15) we read that N is compact since otherwise H2(N) = 1. Moreover, we observe that N has the same homology groups and thus, in particular, the same Euler characteristic as the 2-sphere. It is well known that any two compact and orientable 2-manifolds are diffeomorphic if and only if they have the same Euler characteristic (or, equivalently, the same genus), see, e.g., Hirsch [7], Chapter 9. We have thus proven part (a) of the proposition.

To prove part (b) we consider the end of the exact homotopy sequence of the fiber bundle S over U, see, e.g., Frankel [19], p. 600,

$$\dots \longrightarrow \pi_1(S) \longrightarrow \pi_1(U) \longrightarrow 1. \tag{16}$$

As S has the same homotopy type as N ≅ S^2, we may replace π1(S) with π1(S^2) = 1, so the exactness of (16) implies that π1(U) = 1, i.e., that U is simply connected. If, for some m > 1, the homotopy group πm(U) were different from 1, the Hurewicz isomorphism theorem (see, e.g., Spanier [14], p. 394 or Bredon [15], p. 479, Corollary 10.10.) would give a contradiction to (14). Thus, πm(U) = 1 for all m ∈ N, i.e., U is contractible.

We now prove part (c). Since U is contractible, the tangent bundle TU and thus the sphere bundle S over U admits a global trivialization, S ≅ U × S^2. Fixing such a trivialization and choosing a contraction that collapses U onto some point p ∈ U gives a contraction ĩp : S −→ Sp. Together with the inclusion map ip : Sp −→ S this gives us a homotopy equivalence between Sp and S. (Please recall that a homotopy equivalence between two topological spaces X and Y is a pair of continuous maps ϕ : X −→ Y and ϕ̃ : Y −→ X such that ϕ ◦ ϕ̃ can be continuously deformed into the identity on Y and ϕ̃ ◦ ϕ can be continuously deformed into the identity on X.) On the other hand, the projection pr from (11), together with the zero section p̃r : N −→ TN × R^2, gives a homotopy equivalence between TN × R^2 and N. As a consequence, the diagram (11) tells us that the lens map fp = pr ◦ Ψ ◦ ip together with the map f̃p = ĩp ◦ Ψ^{-1} ◦ p̃r gives a homotopy equivalence between Sp ≅ S^2 and N ≅ S^2, so fp ◦ f̃p is homotopic to the identity. Since the mapping degree is a homotopy invariant (please recall Property B of the mapping degree from Sect. 4), this implies that deg(fp ◦ f̃p) = 1. Now the product theorem for the mapping degree (see, e.g., Choquet-Bruhat, Dewitt-Morette, and Dillard-Bleick [12], p. 483) yields deg(fp) deg(f̃p) = 1. As the mapping degree is an integer, this can be true only if deg(fp) = deg(f̃p) = ±1. In particular, fp must be surjective since otherwise deg(fp) = 0.
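The last step of the proof uses only the product theorem and the fact that degrees are integers. As a low-dimensional analogue (maps of the circle S^1 rather than of S^2, so this is not the situation of the proposition itself), the following sketch computes degrees numerically as winding numbers, checks that they multiply under composition, and lists the integer pairs whose product is 1. The sample maps, the function names and the sampling density are assumptions made for illustration only.

```python
import math

TWO_PI = 2.0 * math.pi


def winding_number(circle_map, samples=2000):
    """Degree of a continuous map S^1 -> S^1, given as angle -> angle.

    The angle increments between neighbouring samples are reduced to the
    interval [-pi, pi) and summed; the total change divided by 2*pi is the
    degree, provided the sampling is fine enough.
    """
    total = 0.0
    prev = circle_map(0.0)
    for j in range(1, samples + 1):
        cur = circle_map(TWO_PI * j / samples)
        step = (cur - prev + math.pi) % TWO_PI - math.pi
        total += step
        prev = cur
    return round(total / TWO_PI)


def f(theta):
    """Hypothetical circle map of degree 3."""
    return 3.0 * theta + 0.5 * math.sin(theta)


def g(theta):
    """Hypothetical circle map of degree -2."""
    return -2.0 * theta + 0.3 * math.cos(theta)


def f_after_g(theta):
    """Composition f o g, whose degree should be the product 3 * (-2)."""
    return f(g(theta))


# Integer solutions of a * b = 1 in a small search window.
unit_pairs = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if a * b == 1]

print(winding_number(f), winding_number(g), winding_number(f_after_g))
print(unit_pairs)
```

The search window for `unit_pairs` is arbitrary; the only integer solutions of a·b = 1 are a = b = 1 and a = b = −1, which is the arithmetic fact used to conclude deg(fp) = ±1.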
In all simple examples to which this proposition applies the degree of fp is, actually, equal to +1, and it is hard to see whether examples with deg(fp ) = −1 do exist. The following consideration is quite instructive. If we start with a simple lensing neighborhood in a flat spacetime (or, more generally, in a conformally flat spacetime), then


conjugate points cannot occur, so it is clear that the case deg(fp ) = −1 is impossible. If we now perturb the metric in such a way that the simple-lensing-neighborhood property is maintained during the perturbation, then, by Property B of the degree, the equation deg(fp ) = +1 is preserved. This demonstrates that the case deg(fp ) = −1 cannot occur for weak gravitational fields (or for small perturbations of conformally flat spacetimes such as Robertson–Walker spacetimes). Among other things, Proposition 11 gives a good physical motivation for studying degree-one maps from S 2 to S 2 . In particular, it is an interesting problem to characterize the caustics of such maps. Please note that, by parts (a) and (c) of Proposition 11, fp (Dp ) is simply connected for all p ∈ U. Hence, Proposition 5 applies which says that the formation of a caustic is necessary for multiple imaging. Owing to (10), part (c) of Proposition 11 implies in particular that n(ξ ) = n+ (ξ ) + n− (ξ ) is odd for all worldlines of light sources ξ ∈ N that do not pass through the caustic of the past light cone of p, i.e., if only light rays within U are taken into account the observer at p sees an odd number of images of such a worldline. It is now our goal to prove a similar “odd number theorem” for a light source with worldline inside U. As a preparation we establish the following lemma. Lemma 2. Let (U, T , W ) be a simple lensing neighborhood in a spacetime (M, g) and p ∈ U. Let J − (p, U) denote, as usual, the causal past of p in U, i.e., the set of all points in M that can be reached from p along a past-pointing causal curve in U. Let ∂U J − (p, U) denote the boundary of J − (p, U) in U. Then (a) every point q ∈ ∂U J − (p, U) can be reached from p along a past-pointing lightlike geodesic in U; (b) ∂U J − (p, U) is relatively compact in M. Proof. 
As usual, let I−(p, U) denote the chronological past of p in U, i.e., the set of all points that can be reached from p along a past-pointing timelike curve in U. To prove part (a), fix a point q ∈ ∂U J−(p, U). Choose a sequence (pi)i∈N of points in U that converge towards p in such a way that p ∈ I−(pi, U) for all i ∈ N. This implies that we can find for each i ∈ N a past-pointing timelike curve λi from pi to q. Then the λi are past-inextendible in U \ {q}. Owing to a standard lemma (see, e.g., Wald [20], Lemma 8.1.5) this implies that the λi have a causal limit curve λ through p that is past-inextendible in U \ {q}. We want to show that λ is the desired lightlike geodesic. Assume that λ is not a lightlike geodesic. Then λ enters into the open set I−(p, U) (see Hawking and Ellis [21], Prop. 4.5.10), so λi enters into I−(p, U) for i sufficiently large. This, however, is impossible since all λi have past end-point on ∂U J−(p, U), so λ must be a lightlike geodesic. It remains to show that λ has past end-point at q. Assume that this is not true. Since λ is past-inextendible in U \ {q}, this assumption implies that λ is past-inextendible in U, so by condition (c) of Def. 5 λ has past end-point on ∂U and meets ∂U transversely. As a consequence, for i sufficiently large λi has to meet ∂U, which gives a contradiction to the fact that all λi are within U.

To prove part (b), we have to show that any sequence (qi)i∈N in ∂U J−(p, U) has an accumulation point in M. So let us choose such a sequence. From part (a) we know that there is a past-pointing lightlike geodesic µi from p to qi in U for all i ∈ N. By compactness of Sp ≅ S^2, the tangent directions to these geodesics at p have an accumulation point in Sp. Let µ be the past-pointing lightlike geodesic from p which is determined by this direction. By condition (c) of Definition 5, this geodesic µ and each of the geodesics µi must have a past end-point on ∂U if maximally extended inside U.
We may choose an affine parametrization for each of those geodesics with the parameter ranging from the value 0 at p to the value 1 at ∂U.


Then our sequence (qi)i∈N in U determines a sequence (si)i∈N in the interval [0, 1] by setting qi = µi(si). By compactness of [0, 1], this sequence must have an accumulation point s ∈ [0, 1]. This demonstrates that the qi must have an accumulation point in M, namely the point µ(s).

We are now ready to prove the desired odd-number theorem for light sources with worldline in U.

Proposition 12. Let (U, T, W) be a simple lensing neighborhood in a spacetime (M, g) and assume that U does not contain a closed timelike curve. Fix a point p ∈ U and a timelike embedded C^∞ curve γ in U whose image is a closed topological subset of M. (The latter condition excludes the case that γ has an end-point on ∂U.) Then the following is true.
(a) If γ does not meet the point p, then there is a past-pointing lightlike geodesic from p to γ that lies completely within U and contains no conjugate points in its interior. (The end-point may be conjugate to the initial-point.) If this geodesic meets γ at the point q, say, then all points on γ that lie to the future of q cannot be reached from p along a past-pointing lightlike geodesic in U.
(b) If γ meets neither the point p nor the caustic of the past light cone of p, then the number of past-pointing lightlike geodesics from p to γ that are completely contained in U is finite and odd.

Proof. In the first step we construct a C^∞ vector field V on M that is timelike on U, has γ as an integral curve, and coincides with W on T = ∂U. To that end we first choose any future-pointing timelike C^∞ vector field V1 on M. (Existence is guaranteed by our assumption of time-orientability.) Then we extend the vector field W to a C^∞ vector field V2 onto some neighborhood V of T. Since W is causal and future-pointing, V2 may be chosen timelike and future-pointing on V \ T. (Here we make use of the fact that T = ∂U is a closed subset of M.)
Finally we choose a timelike and future-pointing vector field V3 on some neighborhood W of γ that is tangent to γ at all points of γ. (Here we make use of the fact that the image of γ is a closed subset of M.) We choose the neighborhoods V and W disjoint, which is possible since γ is completely contained in U and closed in M. With the help of a partition of unity we may now combine the three vector fields V1, V2, V3 into a vector field V with the desired properties.

In the second step we consider the quotient space M/V. This space contains the open subset U/V whose boundary T/V = N is, by Prop. 11, a manifold diffeomorphic to S^2. We want to show that U/V is a manifold (which, according to our terminology, in particular requires that U/V is a Hausdorff space). To that end we consider the map jp : $\overline{\partial_U J^-(p, U)}$ −→ Ū/V which assigns to each point q ∈ $\overline{\partial_U J^-(p, U)}$ the integral curve of V passing through that point. (In this proof overlining always means closure in M.) Clearly, jp is continuous with respect to the topology $\overline{\partial_U J^-(p, U)}$ inherits as a subspace of M and the quotient topology on Ū/V. Moreover, $\overline{\partial_U J^-(p, U)}$ intersects each integral curve of V at most once, and if it intersects one integral curve then it also intersects all neighboring integral curves in Ū; this follows from Wald [20], Theorem 8.1.3. Hence, jp is injective and its image is open in Ū/V. On the other hand, part (b) of Lemma 2 implies that the image of jp is closed. Since the image of jp is non-empty and connected, it must be all of Ū/V. (The domain of jp and, thus, the image of jp is non-empty because U does not contain a closed timelike curve. The domain and, thus, the image of jp is connected since U is connected.) We have, thus, proven that jp


is a homeomorphism. This implies that the Hausdorff condition is satisfied on Ū/V and, in particular, on U/V. Since V is timelike and U contains no closed timelike curves, this makes sure that U/V is a manifold according to our terminology, see Harris [10], Theorem 2.

In the third step we use these results to prove part (a) of the proposition. Our result that jp is a homeomorphism implies, in particular, that γ has an intersection with ∂U J−(p, U) at some point q. Now part (a) of Lemma 2 shows that there is a past-pointing lightlike geodesic from p to q in U. This geodesic cannot contain conjugate points in its interior since otherwise a small variation would give a timelike curve from p to q, see Hawking and Ellis [21], Prop. 5.4.12, thereby contradicting q ∈ ∂U J−(p, U). The rest of part (a) is clear since all past-pointing lightlike geodesics in U that start at p are confined to J−(p, U).

In the last step we prove part (b). To that end we choose on the tangent space Tp M a Lorentz basis (Ep1, Ep2, Ep3, Ep4) with Ep4 future-pointing, and we identify each x = (x^1, x^2, x^3) ∈ R^3 with the past-pointing lightlike vector Yp = x^1 Ep1 + x^2 Ep2 + x^3 Ep3 − |x| Ep4. With this identification, the lens map takes the form fp : S^2 −→ N = ∂U/V, x ↦ πV(expp(wp(x) x)). We now define a continuous map F : B −→ M/V on the closed ball B = { x ∈ R^3 | |x| ≤ 1 } by setting F(x) = πV(expp(wp(x/|x|) x)) for x ≠ 0 and F(0) = πV(p). The restriction of F to the interior of B is a C^∞ map onto the manifold U/V, with the exception of the origin where F is not differentiable. The latter problem can be circumvented by approximating F in the C^0-sense, on an arbitrarily small neighborhood of the origin, by a C^∞ map. Then the mapping degree deg(F) can be calculated (see, e.g., Choquet-Bruhat, Dewitt-Morette, and Dillard-Bleick [12], pp. 477) with the help of the integral formula

$$\int_B F^*\omega = \deg(F) \int_{U/V} \omega, \tag{17}$$

where ω is any 3-form on U/V and the star denotes the pull-back of forms. For any 2-form ψ on U/V, we may apply this formula to the form ω = dψ. With the help of the Stokes theorem we then find

$$\int_{S^2} F^*\psi = \deg(F) \int_N \psi. \tag{18}$$

However, the restriction of F to ∂B = S 2 gives the lens map, so on the left-hand side of (18) we may replace F ∗ ψ by fp∗ ψ. Then comparison with the integral formula for the degree of fp shows that deg(F ) = deg(fp ) which, according to Prop. 11, is equal to ±1. For every ζ ∈ U/V that is a regular value of F , the result deg(F ) = ±1 implies that the number of elements in F −1 (ζ ) is finite and odd. By assumption, the worldline γ ∈ U/V meets neither the point p nor the caustic of the past light cone of p. The first condition makes sure that our perturbation of F near the origin can be done without influencing the set F −1 (γ ); the second condition implies that γ is a regular value of F , please recall our discussion at the end of Sect. 3. This completes the proof. If only light rays within U are taken into account, then Prop. 12 can be summarized by saying that, for light sources in a simple lensing neighborhood, the “youngest image” has always even parity and the total number of images is finite and odd. In the quasi-Newtonian approximation formalism it is a standard result that a transparent gravitational lens produces an odd number of images, see Schneider, Ehlers and


Falco [1], Section 5.4, for a detailed discussion. Proposition 12 may be viewed as a reformulation of this result in a Lorentzian geometry setting. It is quite likely that an alternative proof of Prop. 12 can be given by using the Morse theoretical results of Giannoni, Masiello and Piccione [22, 23]. Also, the reader should compare our results with the work of McKenzie [16] who used Morse theory for proving an odd-number theorem in certain globally hyperbolic spacetimes. In contrast to McKenzie’s theorem, our Prop. 12 requires mathematical assumptions which can be physically interpreted rather easily.

6. Examples

6.1. Two simple examples with non-transparent deflectors.

6.1.1. Non-transparent string. As a simple example, we consider gravitational lensing in the spacetime (M, g) where M = R^2 × (R^2 \ {0}) and

$$g = -dt^2 + dz^2 + dr^2 + k^2 r^2\, d\varphi^2 \tag{19}$$

with some constant 0 < k < 1. Here (t, z) denote Cartesian coordinates on R^2 and (r, ϕ) denote polar coordinates on R^2 \ {0}. This can be interpreted as the spacetime around a static non-transparent string, see Vilenkin [24], Hiscock [25] and Gott [26]. One should think of the string as being situated at the z-axis. Since the latter is not part of the spacetime, it is indeed justified to speak of a non-transparent string. As ∂/∂t is a Killing vector field normalized to −1, the lightlike geodesics in (M, g) correspond to the geodesics of the space part. The latter is a metrical product of a real line with coordinate z and a cone with polar coordinates (r, ϕ). So the geodesics are straight lines if we cut the cone open along some radius ϕ = const. and flatten it out in a plane. Owing to this simple form of the lightlike geodesics, the investigation of lens maps in this string spacetime is quite easy.

To work this out, choose some constant R > 0 and let T denote the hypercylinder r = R in M. Let W denote the restriction of the vector field ∂/∂t to T. Then (T, W) is a source surface, with N = T/W ≅ S^1 × R. Henceforth we discuss the lens map fp for any point p ∈ M at a radius r < R. There are no past-pointing lightlike geodesics from p that intersect T more than once or touch T tangentially, so the lens map fp gives full information about all images at p of each light source ξ ∈ N. The domain Dp of the lens map is given by excising a curve segment, namely a meridian including both end-points at the “poles”, from the celestial sphere Sp, so Dp ≅ R^2 is connected. The boundary of Dp in Sp corresponds to light rays that are blocked by the string before reaching T. It is easy to see that the lens map cannot be continuously extended onto Sp (= closure of Dp in Sp). Nonetheless, the lens map admits an extension in the sense of Def. 4. We may choose M1 = S^2 and M2 = S^2.
Here Dp is embedded into the sphere in such a way that it covers a region (θ, ϕ) ∈ ]0, π[ × ]ε, 2π − ε[, i.e., in comparison with the embedding into Sp the curve segment excised from the sphere has been “widened” a bit. The embedding of N ≅ S^1 × R into S^2 is made via Mercator projection. As the string spacetime has vanishing curvature, the light cones in M have no caustics. Owing to our general results of Sect. 3, this implies that the caustic of the lens map is empty and that all images have even parity, so (8) gives deg(f̄p, ξ) = n+(ξ) = n(ξ) for all ξ ∈ N \ f̄p(∂Dp). The actual value of n(ξ) depends on the parameter k that enters into the metric (19). If i = 1/k is an integer, N \ f̄p(∂Dp) is connected and n(ξ) = i everywhere on this set. If i < 1/k < i + 1 for some integer i, N \ f̄p(∂Dp) has two connected components, with n(ξ) = i on one of them and n(ξ) = i + 1 on the other. Thus, the string produces multiple imaging and the number of images is (finite but) arbitrarily large if k is sufficiently small. For all k ∈ ]0, 1[, the lens map is surjective, fp(Dp) = N ≅ S^1 × R. So this example shows that the assumption of fp(Dp) being simply connected was essential in Prop. 5.

6.1.2. Non-transparent spherical body. We consider the Schwarzschild metric

$$g = \Bigl(1 - \frac{2m}{r}\Bigr)^{-1} dr^2 + r^2 \bigl(d\theta^2 + \sin^2\theta\, d\varphi^2\bigr) - \Bigl(1 - \frac{2m}{r}\Bigr) dt^2 \tag{20}$$

on the manifold M = ]Ro, ∞[ × S^2 × R. In (20), r is the coordinate ranging over ]Ro, ∞[, t is the coordinate ranging over R, and θ and ϕ are spherical coordinates on S^2. This gives the static vacuum spacetime around a spherically symmetric body of mass m and radius Ro. Restricting the spacetime manifold to the region r > Ro is a way of treating the central body as non-transparent. In the following we keep a value Ro > 0 fixed and we allow m to vary between m = 0 (flat space) and m = Ro/2 (black hole). For discussing lens maps in this spacetime we fix a constant R > 3Ro/2. We denote by T the set of all points in M with coordinate r = R and we denote by W the restriction of ∂/∂t to T. Then (T, W) is a source surface, with N = T/W ≅ S^2. It is our goal to discuss the properties of the lens map fp : Dp −→ N for a point p ∈ M with a radius coordinate r < R in dependence of the mass parameter m. To that end we make use of well-known properties of the lightlike geodesics in the Schwarzschild metric, see, e.g., Chandrasekhar [28], Sect. 20, for a comprehensive discussion. For determining the relevant features of the lens map it will be sufficient to concentrate on qualitative aspects of image positions. For quantitative aspects the reader may consult Virbhadra and Ellis [27].

We first observe that, for any m ∈ [0, Ro/2], there is no past-pointing lightlike geodesic from p that intersects T more than once or touches T tangentially. This follows from the fact that in the region r > 3m the radius coordinate has no local maximum along any light ray. So the lens map fp gives full information about all images at p of light sources ξ ∈ N. For m = 0, the light rays are straight lines. The domain Dp of the lens map is given by excising a disc, including the boundary, from the celestial sphere Sp, i.e., Dp ≅ R^2.
The boundary of Dp corresponds to light rays grazing the surface of the central body, so fp can be continuously extended onto the closure of Dp in Sp, thereby giving an extension of fp, in the sense of Def. 4, f̄p : D̄p ⊆ Sp −→ N. In Fig. 2, f̄p(∂Dp) can be represented as a “circle of equal latitude” on the sphere r = R, with the image of fp