Commun. Math. Phys. 211, 1 – 43 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Algebraic Coset Conformal Field Theories

Feng Xu

Department of Mathematics, University of Oklahoma, 601 Elm Ave, Room 423, Norman, OK 73019, USA. E-mail: [email protected]

Received: 5 November 1998 / Accepted: 18 October 1999
Abstract: All unitary Rational Conformal Field Theories (RCFT) are conjectured to be related to unitary coset Conformal Field Theories, i.e., gauged Wess–Zumino–Witten (WZW) models with compact gauge groups. In this paper we use subfactor theory and ideas of algebraic quantum field theory to approach coset Conformal Field Theories. Two conjectures are formulated and their consequences are discussed. Some results are presented which prove the conjectures in special cases. In particular, one of the results states that a class of representations of coset $W_N$ ($N \geq 3$) algebras with critical parameters are irreducible, and under the natural compositions (Connes’ fusion), they generate a finite dimensional fusion ring whose structure constants are completely determined, thus proving a long-standing conjecture about the representations of these algebras.

1. Introduction

Conformal field theories (CFT) in two dimensions (cf. [MS]) have attracted great attention among mathematicians in recent years. A large class of CFT known as Rational CFT (RCFT) are more amenable than general CFT and the classification of RCFT is an outstanding open problem. A unitary coset CFT is a gauged Wess–Zumino–Witten (WZW) model with a compact Lie group G as a gauge group, in which a subgroup H of G is gauged (cf. [KS]). It has been conjectured (cf. [MS, Witten]) that all unitary RCFT (except perhaps orbifolds, which are relatively better understood and also similar to the coset CFT, cf. p. 428 of [MS]) are related to unitary coset CFT.

In the literature there have been several different mathematical approaches to CFT; see, for example, [DL] for an approach using vertex operator algebras (cf. [B or FLM]). However, in the case of the WZW model with a compact gauge group G, there is a manifestly unitary formulation of these unitary CFT using subfactor theory (cf. [J, W1 and L1]) and ideas from algebraic quantum field theory (cf. [Haag]). This formulation has various advantages besides producing the expected results; see, for example, [X1 and X2] for some results on certain rings which seem
to be invisible in other approaches. In view of the importance of coset CFT among all RCFT, it is natural to use this unitary formulation to study unitary coset CFT, and this is the main purpose of this paper. To illustrate the new ideas in this paper we will focus on the case when G is a simply connected semisimple compact Lie group of type A, i.e., $G = SU(N_1) \times SU(N_2) \times \cdots \times SU(N_n)$. The ideas of this paper can be applied to all compact semisimple and simply connected Lie groups and we plan to consider them in separate publications.

We now describe the paper in more detail. In Subsect. 2.1, we define conformal precosheaves and their covariant representations as in [GL]. We then show how a coset H ⊂ G naturally gives rise to an irreducible conformal precosheaf (Prop. 2.2) together with a class of covariant representations. As a first step we prove in Theorem 2.3 an irreducibility result in our setting by using vertex operators (cf. [FZ] or [Kacv]). To do this we show in Prop. 2.3 that certain smeared vertex operators are affiliated (cf. [Mv]) with some von Neumann algebras. Note that these smeared vertex operators are generally unbounded operators and one must be careful with formal manipulations of them. Proposition 2.3 is proved by using a series of lemmas (Lemmas 1–6). It can be seen from the proof that the vertex operator algebra for the coset H ⊂ G as defined on p. 67 of [Kacv] or §5 of [FZ] can be thought of as “germs” of our irreducible conformal precosheaf.

Based on the physicists’ argument and well known examples, in Subsect. 2.3 two conjectures are formulated about these representations. The first conjecture states that the representations generate a finite dimensional ring under certain compositions; in other words, the CFT is really “rational”. The second conjecture is a formula about the square root of the minimal index or the statistical dimension (cf.
4.1) of these representations in terms of certain limits of characters, referred to as the “Kac–Wakimoto” formula in [L5]. Both conjectures are highly nontrivial. In particular, the second conjecture implies the Kac–Wakimoto conjecture in §2 of [KW].

The rest of the paper is devoted to the proof of the two conjectures in special cases. Section 3 is about proving the finite index by using certain commuting squares which play an important role in the type $II_1$ subfactor theory (cf. [Po, We]). However, we will consider factors of type $III_1$. First we introduce a notion of cofiniteness for a pair H ⊂ G. This notion is motivated by the conjectures of 2.3. Proposition 3.1, which follows from the commuting squares in Lemma 3.1, is a key observation of this paper. It implies that if $H_1 \subset H_2 \subset G$ and $H_1 \subset G$ is cofinite, then $H_1 \subset H_2$ is cofinite. This leads to an infinite series of cofinite pairs in Cor. 3.1.

Section 4 can be considered as an application of [X1 and X2] given the results of 3.1. We first recall some of the results of [X1] which are used in 4.2 to determine certain ring structures. These results are summarized in Theorem 4.1. Proposition 4.2 is the key observation in Sect. 4. It allows one to study the representations of the coset H ⊂ G by using the theory which has been developed for G and H in some cases. Then we prove Theorem 4.2, which states that for a pair H ⊂ G which is cofinite, if results similar to those in Chap. V of [W2] hold for H, then Conj. 1 is also true. This result and Cor. 3.1 imply Conj. 1 for the infinite series in Cor. 4.2. In Subsect. 4.3 we study the cosets corresponding to diagonal inclusions by using Theorem 4.2 and prove in Theorem 4.3 that Conj. 2 is true for these cosets. We also prove in Theorem 4.3 a certain irreducibility result. The results in Theorem 4.3 in the special case of coset $W_N$ algebras with critical parameters have been long anticipated in both the mathematics and physics literature.
For example these results are related to the conjectures in §3 and 4 of [FKW], which
are conjectured for what is known as “W-algebras”, and they are closely related to coset W-algebras in the vertex operator algebra sense (cf. [Watts]). In 4.4 we present more examples by using Theorem 4.2. The first example is the coset H ⊂ G, where H is the Cartan subgroup of G. This coset is related to Parafermions (cf. [DL]). We then consider a “Maverick” coset model considered in [DJ1] and determine the relevant ring structure.

All the groups considered in this paper are assumed to be connected compact Lie groups unless stated otherwise. As noted above, G always denotes a simply connected semisimple compact Lie group of type A, i.e., $G = SU(N_1) \times SU(N_2) \times \cdots \times SU(N_n)$.

2. Coset CFT from Algebraic QFT Point of View

2.1. The irreducible conformal precosheaf and its representations. In this subsection we recall the basic properties enjoyed by the family of the von Neumann algebras associated with a conformal quantum field theory on $S^1$ (cf. [GL and FG]).

By an interval in this paper we shall always mean an open connected subset $I$ of $S^1$ such that $I$ and the interior $I'$ of its complement are non-empty. We shall denote by $\mathcal{I}$ the set of intervals in $S^1$. We shall denote by $PSL(2,\mathbb{R})$ the group of conformal transformations on the complex plane that preserve the orientation and leave the unit circle $S^1$ globally invariant. Denote by $\mathbf{G}$ the universal covering group of $PSL(2,\mathbb{R})$. Notice that $\mathbf{G}$ is a simple Lie group and has a natural action on the unit circle $S^1$.

Denote by $R(\vartheta)$ the (lifting to $\mathbf{G}$ of the) rotation by an angle $\vartheta$. This one-parameter subgroup of $\mathbf{G}$ will be referred to as the rotation group (denoted by Rot) in the following. We may associate a one-parameter group with any interval $I$ in the following way. Let $I_1$ be the upper semi-circle, i.e. the interval $\{e^{i\vartheta},\ \vartheta \in (0,\pi)\}$. By using the Cayley transform $C : S^1 \to \mathbb{R} \cup \{\infty\}$ given by $z \mapsto -i(z-1)(z+1)^{-1}$, we may identify $I_1$ with the positive real line $\mathbb{R}_+$.
Then we consider the one-parameter group $\Lambda_{I_1}(s)$ of diffeomorphisms of $S^1$ such that
$$C\,\Lambda_{I_1}(s)\,C^{-1}x = e^{s}x, \quad s, x \in \mathbb{R}.$$
We also associate with $I_1$ the reflection $r_{I_1}$ given by $r_{I_1}z = \bar{z}$, where $\bar{z}$ is the complex conjugate of $z$. It follows from the definition that $\Lambda_{I_1}$ restricts to an orientation preserving diffeomorphism of $I_1$, and $r_{I_1}$ restricts to an orientation reversing diffeomorphism of $I_1$ onto $I_1'$. Then, if $I$ is an interval and we choose $g \in \mathbf{G}$ such that $I = gI_1$, we may set
$$\Lambda_I = g\Lambda_{I_1}g^{-1}, \quad r_I = g\,r_{I_1}g^{-1}.$$
Let $r$ be an orientation reversing isometry of $S^1$ with $r^2 = 1$ (e.g. $r_{I_1}$). The action of $r$ on $PSL(2,\mathbb{R})$ by conjugation lifts to an action $\sigma_r$ on $\mathbf{G}$, therefore we may consider the semidirect product $\mathbf{G} \times_{\sigma_r} \mathbb{Z}_2$. Since $\mathbf{G} \times_{\sigma_r} \mathbb{Z}_2$ is a covering of the group generated by $PSL(2,\mathbb{R})$ and $r$, $\mathbf{G} \times_{\sigma_r} \mathbb{Z}_2$ acts on $S^1$. We call (anti-)unitary a representation $U$ of $\mathbf{G} \times_{\sigma_r} \mathbb{Z}_2$ by operators on $\mathcal{H}$ such that $U(g)$ is unitary, resp. antiunitary, when $g$ is orientation preserving, resp. orientation reversing.

Now we are ready to define a conformal precosheaf.
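As a quick check of the identification of $I_1$ with the positive real line, the Cayley transform and the resulting action of $\Lambda_{I_1}(s)$ can be written out in the angle coordinate. The following is only a sketch; the notation $\vartheta_s$ for the image angle is introduced here for illustration and does not appear in the text.

```latex
% Assumption: \vartheta \in (0,\pi) parametrizes the upper semicircle I_1;
% \vartheta_s is illustrative notation for the image angle under \Lambda_{I_1}(s).
\begin{align*}
C(e^{i\vartheta})
  &= -i\,\frac{e^{i\vartheta}-1}{e^{i\vartheta}+1}
   = -i\,\frac{e^{i\vartheta/2}-e^{-i\vartheta/2}}{e^{i\vartheta/2}+e^{-i\vartheta/2}}
   = -i\cdot i\tan(\vartheta/2)
   = \tan(\vartheta/2) \in (0,\infty),\\
\Lambda_{I_1}(s)\,e^{i\vartheta}
  &= C^{-1}\bigl(e^{s}\tan(\vartheta/2)\bigr) = e^{i\vartheta_s},
  \qquad \vartheta_s = 2\arctan\bigl(e^{s}\tan(\vartheta/2)\bigr) \in (0,\pi),\\
C(r_{I_1}e^{i\vartheta}) &= C(e^{-i\vartheta}) = -\tan(\vartheta/2).
\end{align*}
```

In this coordinate $\Lambda_{I_1}(s)$ is the dilation $x \mapsto e^{s}x$ of $\mathbb{R}_+$ and fixes the endpoints $\vartheta = 0, \pi$ of $I_1$, while $r_{I_1}$ becomes $x \mapsto -x$, which exchanges $I_1$ and $I_1'$.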
An irreducible conformal precosheaf $A$ of von Neumann algebras on the intervals of $S^1$ is a map $I \to A(I)$ from $\mathcal{I}$ to the von Neumann algebras on a separable Hilbert space $\mathcal{H}$ that verifies the following properties:

A. Isotony. If $I_1$, $I_2$ are intervals and $I_1 \subset I_2$, then $A(I_1) \subset A(I_2)$.

B. Conformal invariance. There is a nontrivial unitary representation $U$ of $\mathbf{G}$ on $\mathcal{H}$ such that $U(g)A(I)U(g)^* = A(gI)$, $g \in \mathbf{G}$, $I \in \mathcal{I}$.

C. Positivity of the energy. The generator of the rotation subgroup $U(R(\vartheta))$ is positive.

D. Locality. If $I_0$, $I$ are disjoint intervals then $A(I_0)$ and $A(I)$ commute.

The lattice symbol $\vee$ will denote “the von Neumann algebra generated by”.

E. Existence of the vacuum. There exists a unit vector $\Omega$ (vacuum vector) which is $U(\mathbf{G})$-invariant and cyclic for $\vee_{I \in \mathcal{I}} A(I)$.

F. Irreducibility. The only $U(\mathbf{G})$-invariant vectors are the scalar multiples of $\Omega$.

The term irreducibility is due to the fact (cf. Prop. 1.2 of [GL]) that under the assumption of F, $\vee_{I \in \mathcal{I}} A(I) = B(\mathcal{H})$. We have the following (cf. Prop. 1.1 of [GL]):

Proposition 2.1. Let $A$ be an irreducible conformal precosheaf. The following hold:
(a) Reeh–Schlieder theorem: $\Omega$ is cyclic and separating for each von Neumann algebra $A(I)$, $I \in \mathcal{I}$.
(b) Bisognano–Wichmann property: $U$ extends to an (anti-)unitary representation of $\mathbf{G} \times_{\sigma_r} \mathbb{Z}_2$ such that, for any $I \in \mathcal{I}$,
$$U(\Lambda_I(2\pi t)) = \Delta_I^{it}, \quad U(r_I) = J_I,$$
where $\Delta_I$, $J_I$ are the modular operator and the modular conjugation associated with $(A(I), \Omega)$. For each $g \in \mathbf{G} \times_{\sigma_r} \mathbb{Z}_2$, $U(g)A(I)U(g)^* = A(gI)$.
(c) Additivity: if a family of intervals $I_i$ covers the interval $I$, then $A(I) \subset \vee_i A(I_i)$.
(d) Haag duality: $A(I)' = A(I')$.
Assume $A$ is an irreducible conformal precosheaf as defined above. By Cor. B.2 of [GL], the only invariant vectors under the action of the one-parameter group $U(\Lambda_I(2\pi t))$ are the scalar multiples of $\Omega$. This fact and (b) of Prop. 2.1 are usually expressed by saying that the action of the modular group is ergodic and geometric, respectively. It follows that $A(I)$ is a type $III_1$ factor for any $I \in \mathcal{I}$ (cf. Prop. 1.2 of [GL]).

A covariant representation $\pi$ of $A$ is a family of representations $\pi_I$ of the von Neumann algebras $A(I)$, $I \in \mathcal{I}$, on a separable Hilbert space $\mathcal{H}_\pi$ and a unitary representation $U_\pi$ of the covering group $\mathbf{G}$ of $PSL(2,\mathbb{R})$ such that the following properties hold:
$$I \subset \bar{I} \Rightarrow \pi_{\bar{I}}|_{A(I)} = \pi_I \quad \text{(isotony)},$$
$$\mathrm{ad}\,U_\pi(g) \cdot \pi_I = \pi_{gI} \cdot \mathrm{ad}\,U(g) \quad \text{(covariance)}.$$
A covariant representation $\pi$ is called irreducible if $\vee_{I \in \mathcal{I}} \pi(A(I)) = B(\mathcal{H}_\pi)$. By our definition the irreducible conformal precosheaf is in fact an irreducible representation of itself and we will call this representation the vacuum representation. Note that by Cor. B.2 of [GL], the vacuum representation is the unique (up to unitary equivalence) irreducible covariant representation which contains an eigenvector of the generator of the rotation group with lowest eigenvalue 0.

A unitary equivalence class of covariant representations of $A$ is called a superselection sector (cf. p. 17 of [GL]). The composition of two superselection sectors, similar to Connes’ fusion (cf. Chap. V of [W2]), can be defined (cf. §IV.2 of [FG]). The composition is manifestly unitary, and this is one of the virtues of the above formulation.

Now let $G$ be the group as in the introduction. Let $g = \mathrm{Lie}(G)$, $g_{\mathbb{C}} := g \otimes \mathbb{C}$. Denote by $\hat{g}$ the affine Kac–Moody algebra (cf. p. 163 of [KW]) associated to $g_{\mathbb{C}}$. Recall $\hat{g} = g_{\mathbb{C}} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c$, where $\mathbb{C}c$ is the 1-dimensional center of $\hat{g}$. Denote by $LG$ the group of smooth maps $f : S^1 \to G$ under pointwise multiplication.
The diffeomorphism group of the circle $\mathrm{Diff}\,S^1$ is naturally a subgroup of $\mathrm{Aut}(LG)$ with the action given by reparametrization. In particular $\mathbf{G}$ acts on $LG$. We will be interested in the projective unitary representations (cf. Chap. 9 of [PS]) $\pi$ of $LG$ that are both irreducible and have positive energy. This implies that $\pi$ should extend to $LG \rtimes \mathrm{Rot}$ so that the generator of the rotation group Rot is positive. It follows from Chap. 9 of [PS] or p. 490 of [W2] that for fixed level (a finite set of positive integers, see footnote 1), there are only a finite number of such irreducible projective representations (cf. 4.3 for a list in the case $G = SU(N)$). These irreducible projective representations can be obtained by exponentiating the integrable representations of $\hat{g}$ at the same level by Theorem 4.4 of [GW]. We refer the reader to III.7 of [FG] for a summary of the properties of these representations.

Now let $H \subset G$ be a connected Lie subgroup. Let $\pi^i$ be an irreducible projective representation of $LG$ with positive energy at level $L$ (see footnote 1) on Hilbert space $H^i$. When restricting to $LH$, $\pi^i$ is a projective representation of $LH$ with positive energy. By the proposition on p. 484 of [W2], $\pi^i$ is a direct sum of irreducible projective representations of $LH$. Suppose when restricting to $LH$, $H^i$ decomposes as:
$$H^i = \sum_{\alpha} H_{i,\alpha} \otimes H_\alpha,$$
and $\pi_\alpha$ are irreducible projective representations of $LH$ on Hilbert space $H_\alpha$. The set of $(i, \alpha)$ which appears in the above decompositions will be denoted by exp.

Footnote 1: When $G$ is the direct product of simple groups, $L$ is a multi-index, i.e., $L = (L_1, \ldots, L_n)$, where $L_i \in \mathbb{N}$ corresponds to the level of the $i$-th simple group. To save some writing we write the coset as $H \subset G_L$, or as $H \subset G$ when the level is kept fixed in the question.

The above
decomposition can also be obtained via exponentiation (cf. Theorem 4.4 of [GW]) from a similar decomposition (cf. (1.6.2) of [KW]) of the integrable representation of $\hat{g}$ when restricting to $\hat{h}$ (the affine Kac–Moody algebra associated with $H$).

We shall use $\pi^0$ (resp. $\pi_0$) to denote the vacuum representation of $LG$ (resp. $LH$) on $H^0$ (resp. $H_0$). This is the unique projective representation of $LG$ (resp. $LH$) which contains a nonzero vector, unique up to multiplication by a nonzero scalar, with the property that it is an eigenvector of the generator of the rotation group with eigenvalue 0. Such vectors will be called vacuum vectors. By Theorem 3.2 of [FG] (also cf. §17 of [W2]) the vacuum representation $\pi^0$ of $LG$ gives rise to an irreducible conformal precosheaf by the map $I \in \mathcal{I} \to \pi^0(L_IG)''$ on $H^0$. Similarly $I \in \mathcal{I} \to \pi_0(L_IH)''$ is an irreducible conformal precosheaf on $H_0$. Let $\Omega$ (resp. $\Omega_0$) be a vacuum vector in $\pi^0$ (resp. $\pi_0$) and assume $\Omega = \Omega_{0,0} \otimes \Omega_0$ with $\Omega_{0,0} \in H_{0,0}$.

We shall always assume that $H \subset G_L$ is not a conformal inclusion since the case of conformal inclusions has been considered in [X1]. For the definition of conformal inclusion, we refer the reader to p. 210 of [KW].

Lemma 2.2. (1) $\pi^0(L_IH)''$ is strongly additive, i.e., if $I_1$, $I_2$ are the connected components of the interval $I$ with one internal point removed, then:
$$\pi^0(L_IH)'' = \pi^0(L_{I_1}H)'' \vee \pi^0(L_{I_2}H)'';$$
(2) $\pi^0(L_IH)' \cap \pi^0(L_{I'}H)' = \pi^0(LH)'$.

Proof. Ad (1): Let $P_0$ be the projection from $H^0$ onto $\Omega_{0,0} \otimes H_0$. Since the action of the modular group of $\pi^0(L_JG)''$ with respect to $\Omega$ is geometric and ergodic, it fixes globally $\pi^0(L_JH)''$, and by Takesaki’s theorem (cf. [MT] or p. 495 of [W2]), $\pi^0(L_JH)''$ is a factor. So the map $x \in \pi^0(L_JH)'' \to xP_0$ is a $*$-isomorphism (cf. p. 492 of [W2]). Hence to prove (1) we just need to show
$$\pi_0(L_IH)'' = \pi_0(L_{I_1}H)'' \vee \pi_0(L_{I_2}H)''.$$
By Haag duality in Prop. 2.1 it is enough to show
$$\pi_0(L_{I'}H)'' = \pi_0(L_{I_1'}H)'' \cap \pi_0(L_{I_2'}H)''.$$
By the Reeh–Schlieder theorem in Prop. 2.1, the closed space spanned by $\pi^0(L_JH)\Omega$ is $P_0H^0$ for any interval $J$, and by Takesaki’s theorem (cf. [MT] or (c) of Theorem on p. 495 of [W2]),
$$\pi^0(L_JH)'' = \{\mathbb{C}P_0\}' \cap \pi^0(L_JG)'',$$
so to prove (1) we just have to show that
$$\pi^0(L_{I'}G)'' = \pi^0(L_{I_1'}G)'' \cap \pi^0(L_{I_2'}G)''.$$
It is sufficient to show the above in the case when $G$ is simple, and this follows from Theorem E of [W2].
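The reduction "by Haag duality" used in Ad (1) can be spelled out as follows. This is a sketch using the shorthand $B(I) := \pi_0(L_IH)''$, which is introduced here only for illustration; it relies on the general identity $(M \vee N)' = M' \cap N'$ for von Neumann algebras, the bicommutant theorem, and Haag duality $B(I)' = B(I')$ from Prop. 2.1 applied to the precosheaf $I \to \pi_0(L_IH)''$ on $H_0$.

```latex
% Assumption: B(I) := \pi_0(L_I H)'' is shorthand introduced for this sketch.
\begin{align*}
B(I) = B(I_1)\vee B(I_2)
  &\iff B(I)' = B(I_1)'\cap B(I_2)'
     && \text{(take commutants; bicommutant theorem)}\\
  &\iff B(I') = B(I_1')\cap B(I_2')
     && \text{(Haag duality, Prop.\ 2.1(d))}.
\end{align*}
```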
Ad (2): Let $I_1 = I, I_2, I_3, I_4$ be four consecutive disjoint intervals on $S^1$ such that the closure of $I \cup I_2 \cup I_3 \cup I_4$ is $S^1$. Let $J_1 = I_1'$, $J_3 = I_3'$. For any $a \in LH$, since $H$ is connected, we can always choose $a_1 \in L_{J_1}H$ such that
$$a_1|_{\tilde{I}_1} = e, \quad a_1|_{\tilde{I}_3} = a|_{\tilde{I}_3},$$
where $e$ is the identity of $G$, $I_1 \subset \tilde{I}_1$, $I_3 \subset \tilde{I}_3$, and the closure of $\tilde{I}_1$ and the closure of $\tilde{I}_3$ are disjoint. Let $a_2 = a a_1^{-1}$; then $a_2 \in L_{J_3}H$, and $a_2a_1 = a$. Hence $L_{J_1}H$ and $L_{J_3}H$ generate $LH$ algebraically, and
$$\pi^0(L_{J_1}H)'' \vee \pi^0(L_{J_3}H)'' = \pi^0(LH)''.$$
By (1)
$$\pi^0(L_{I_2}H)'' \vee \pi^0(L_{I_3}H)'' \vee \pi^0(L_{I_4}H)'' = \pi^0(L_{J_1}H)'';$$
$$\pi^0(L_{I_1}H)'' \vee \pi^0(L_{I_2}H)'' \vee \pi^0(L_{I_4}H)'' = \pi^0(L_{J_3}H)''.$$
Hence
$$\pi^0(L_{J_1}H)'' \vee \pi^0(L_{J_3}H)'' \subset \pi^0(L_IH)'' \vee \pi^0(L_{I'}H)'',$$
and
$$\pi^0(LH)'' = \pi^0(L_IH)'' \vee \pi^0(L_{I'}H)''.$$
By taking the commutant of the above equality, the proof of (2) is complete. □

For each interval $I \subset S^1$, we define:
$$A(I) := \pi^0(L_IH)' \cap \pi^0(L_IG)''\,P,$$
where $P$ is the projection from $H^0$ to the closed subspace $\mathcal{H}$ spanned by $\vee_{J \in \mathcal{I}}\, \pi^0(L_JH)' \cap \pi^0(L_JG)''\,\Omega$. Note that
$$\pi^0(L_IH)' \cap \pi^0(L_IG)'' \supset \pi^0(LH)' \cap \pi^0(L_IG)''.$$
On the other hand if $x \in \pi^0(L_IH)' \cap \pi^0(L_IG)''$, then $x \in \pi^0(L_IH)' \cap \pi^0(L_{I'}H)'$, but $\pi^0(L_IH)' \cap \pi^0(L_{I'}H)' = \pi^0(LH)'$ by (2) of Lemma 2.1, so $x \in \pi^0(LH)'$. This shows that
$$\pi^0(L_IH)' \cap \pi^0(L_IG)'' = \pi^0(LH)' \cap \pi^0(L_IG)''.$$
It follows that if $x \in \pi^0(L_IH)' \cap \pi^0(L_IG)''$, then as an operator on $H^0$, it takes the form $\oplus_{0,\alpha} B(H_{0,\alpha}) \otimes \mathrm{id}_\alpha$, and $x\Omega \in H_{0,0} \otimes \Omega_0$, so we have $\mathcal{H} \subset H_{0,0} \otimes \Omega_0$.

Proposition 2.3. The map $I \in \mathcal{I} \to A(I)$ as defined above is an irreducible conformal precosheaf.

Proof. We have to check conditions A to F. A follows from
$$\pi^0(L_IH)' \cap \pi^0(L_IG)'' = \pi^0(LH)' \cap \pi^0(L_IG)'',$$
which is proved above. B and C follow from a lemma on p. 485 of [W2], except that we need to show that the action of $\mathbf{G}$ is nontrivial. Denote by $U(t)$ the action of the rotation group on $H_{0,0}$ and assume $U(t) = \exp(2\pi i L_0^{g,h} t)$, where $L_0^{g,h}$ is a positive self-adjoint operator. Fix $\tau \in \mathbb{C}$ with $\mathrm{Im}\,\tau > 0$. Then
$$b_{00}(\tau) := \exp\bigl((1/24)\,2\pi i\tau\,(z_m - \dot{z}_{\dot{m}})\bigr)\,\mathrm{tr}_{H_{0,0}} \exp(2\pi i\tau L_0^{g,h})$$
is a branching function (cf. 3.2.7 of [KW]), where $(z_m - \dot{z}_{\dot{m}})$ is a nonnegative number defined as in 1.4.2 of [KW]. We will not need the detailed expression of the branching
function in our argument; all we need to know is that if $H \subset G$ is not conformal, then $z_m - \dot{z}_{\dot{m}} > 0$ (cf. p. 210 of [KW]). Let us show that if $H \subset G$ is not conformal, which is assumed throughout this paper, then $L_0^{g,h} \neq 0$. If $L_0^{g,h} = 0$, then
$$b_{00}(\tau) = \exp\bigl((1/24)\,2\pi i\tau\,(z_m - \dot{z}_{\dot{m}})\bigr)\dim(H_{0,0}).$$
Since $b_{00}(\tau)$ is well defined for $\mathrm{Im}\,\tau > 0$ (cf. p. 170 of [KW]), we must have $\dim(H_{0,0}) < \infty$, but by (a) of Theorem B of [KW] (see footnote 2),
$$b_{00}(\tau) \sim b(0,0)\exp\Bigl(\frac{\pi i (z_m - \dot{z}_{\dot{m}})}{12\tau}\Bigr)$$
as $\tau \to 0$, where $b(0,0) > 0$. This is a contradiction, and shows that $L_0^{g,h} \neq 0$. So the action of $\mathbf{G}$ is nontrivial. F follows from the uniqueness of the vacuum (up to multiplication by a non-zero scalar) for $LG$. D and E follow from the definitions. □

The irreducible conformal precosheaf as in Prop. 2.2 is defined to be the irreducible conformal precosheaf of the coset $H \subset G_L$. Note that when $H = \{e\}$ is a trivial subgroup ($e$ denotes the identity element in $G$), the irreducible conformal precosheaf defined above coincides with the one defined in III.8 of [FG].

Since the action of the modular group of $\pi^0(L_IG)''$ with respect to $\Omega$ is geometric and ergodic, it fixes globally $\pi^0(L_IH)''$, hence $\pi^0(L_IG)'' \cap \pi^0(L_IH)'$, and by Takesaki’s theorem (cf. [MT] or p. 495 of [W2]), $\pi^0(L_IG)'' \cap \pi^0(L_IH)'$ is a factor. It follows that the map
$$x \in \pi^0(L_IG)'' \cap \pi^0(L_IH)' \to xP \in \pi^0(L_IG)'' \cap \pi^0(L_IH)'\,P$$
is a $*$-isomorphism (cf. p. 492 of [W2]), and can be implemented by a unitary $U_1 : H^0 \to \mathcal{H}$, i.e., $x = U_1^*\,xP\,U_1$, since $\pi^0(L_IG)'' \cap \pi^0(L_IH)'\,P$ is a type $III_1$ factor by Prop. 2.2 and the remarks after Prop. 2.1.

Let us define a class of covariant representations of $A(I)$ coming from the decompositions of irreducible projective representations $\pi^i$ of $LG$ with respect to $LH$. By the remarks after Theorem B on p. 502 of [W2], for any fixed interval $I$, there exists a unitary map $U : H^i \to H^0$ such that
$$\pi^i(a) = U^*\pi^0(a)U, \quad \forall a \in L_IG.$$
For $y = xP \in \pi^0(L_IH)' \cap \pi^0(L_IG)''\,P$, we define
$$\pi^i(y) = U^*U_1^*\,y\,U_1U.$$
This gives a factor representation of $A(I)$. Let $P_{i,\alpha}$ be a projection from $H^i$ to a subspace $H_{i,\alpha} \otimes \Omega_\alpha$, where $\Omega_\alpha$ is a unit vector in $H_\alpha$. Then
$$y \in A(I) \to \pi_{i,\alpha}(y) := \pi^i(y)P_{i,\alpha}$$
is a subrepresentation of the factor representation $\pi^i$, and so the map above is a $*$-isomorphism (cf. p. 492 of [W2]). Denote by $\pi^i(g)$ the action of $g \in \mathbf{G}$ on $H^i$. By the lemma on p. 485 of [W2], $\pi^i(g)$ can be written as
$$\pi^i(g) = \oplus_\alpha\, \pi_{i,\alpha}(g) \otimes \pi_\alpha(g).$$
One checks by using the definitions that the representations $\pi_{i,\alpha}$ of $A(I)$ and the representations $\pi_{i,\alpha}$ of $\mathbf{G}$ satisfy the covariance condition, and so $\pi_{i,\alpha}$ are covariant representations of $A(I)$. The study of these representations is the main purpose of this paper.

Footnote 2: The branching function here corresponds to $b^0_0$ in the notations of [KW] on p. 187, where 0 always denotes the vacuum representations. The assumption of Theorem B of [KW] follows from the definition 2.5.4 of [KW] when $\Lambda = 0$, $\lambda = 0$.
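The contradiction used in the proof of Prop. 2.3 to show $L_0^{g,h} \neq 0$ can be made explicit by evaluating both expressions for the branching function along the imaginary axis. The sketch below assumes the limit $\tau \to 0$ is taken with $\tau = it$, $t \downarrow 0$ (an assumption made here for illustration), and uses $z_m - \dot{z}_{\dot{m}} > 0$ for a non-conformal inclusion.

```latex
% Assumption: the limit \tau \to 0 is taken along \tau = it with t \downarrow 0.
\begin{align*}
\text{If } L_0^{g,h}=0:\quad
  b_{00}(it) &= e^{-\frac{2\pi t}{24}\,(z_m-\dot z_{\dot m})}\,\dim(H_{0,0})
  \;\longrightarrow\; \dim(H_{0,0}) < \infty
  \quad (t\downarrow 0),\\
\text{Theorem B of [KW]:}\quad
  b_{00}(it) &\sim b(0,0)\,
     \exp\Bigl(\frac{\pi i\,(z_m-\dot z_{\dot m})}{12\,it}\Bigr)
   = b(0,0)\,\exp\Bigl(\frac{\pi\,(z_m-\dot z_{\dot m})}{12\,t}\Bigr)
  \;\longrightarrow\; \infty
  \quad (t\downarrow 0).
\end{align*}
```

The first line stays bounded while the second diverges, so the two cannot both hold; this is the contradiction invoked in the proof.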
By the same argument as in the proof of (1) of Lemma 2.1 one can show that $A(I)$ as in Prop. 2.2 is strongly additive, i.e., if $I_1$, $I_2$ are the connected components of the interval $I$ with one internal point removed, then:
$$A(I) = A(I_1) \vee A(I_2).$$
In fact, by Haag duality in Prop. 2.1, it is sufficient to show that $A(I') = A(I_1') \cap A(I_2')$, which is equivalent to
$$\pi^0(L_{I'}H)' \cap \pi^0(L_{I'}G)'' = \bigl(\pi^0(L_{I_1'}H)' \cap \pi^0(L_{I_1'}G)''\bigr) \cap \bigl(\pi^0(L_{I_2'}H)' \cap \pi^0(L_{I_2'}G)''\bigr)$$
by the paragraph after Prop. 2.2. Let $P$ be the projection defined before Prop. 2.2. By the Reeh–Schlieder theorem in Prop. 2.1 the closed space spanned by $\pi^0(L_{I'}H)' \cap \pi^0(L_{I'}G)''\,\Omega$ is $PH^0$ for any interval $I$, and by Takesaki’s theorem (cf. [MT] or (c) of Theorem on p. 495 of [W2]),
$$\pi^0(L_{I'}H)' \cap \pi^0(L_{I'}G)'' = \{\mathbb{C}P\}' \cap \pi^0(L_{I'}G)'',$$
so we just have to show that
$$\pi^0(L_{I'}G)'' = \pi^0(L_{I_1'}G)'' \cap \pi^0(L_{I_2'}G)''.$$
It is sufficient to show the above equation in the case $G$ is simple, and in this case it follows from Theorem E of [W2].

Note that the inclusion
$$\bigl(\pi^0(L_IG)'' \cap \pi^0(L_IH)'\bigr) \vee \pi^0(L_IH)'' \subset \pi^0(L_IG)''$$
is irreducible. In fact
$$\bigl((\pi^0(L_IG)'' \cap \pi^0(L_IH)') \vee \pi^0(L_IH)''\bigr)' \cap \pi^0(L_IG)'' = \bigl(\pi^0(L_IG)'' \cap \pi^0(L_IH)'\bigr)' \cap \bigl(\pi^0(L_IG)'' \cap \pi^0(L_IH)'\bigr) = \mathbb{C},$$
since $\pi^0(L_IG)'' \cap \pi^0(L_IH)'$ is a factor by the paragraph after Prop. 2.2. This fact can also be proved by using the fact that in $H_{0,0}$, the vacuum representation of $A(I)$ appears once and only once. We shall see in the next subsection that $\pi_{0,0}$ is in fact the vacuum representation under general conditions.

2.2. $\pi_{0,0}$ is the vacuum representation. As a first step in the study of the representations $\pi_{i,\alpha}$ we show that the representation $\pi_{0,0}$ on $H_{0,0}$ is the vacuum representation. This is equivalent to $\mathcal{H} = H_{0,0} \otimes \Omega_0$ by definition.

Theorem 2.4. Suppose $H \subset G$, and $H$ is simply connected. Then $\mathcal{H} = H_{0,0} \otimes \Omega_0$. Hence $\pi_{0,0}$ is the vacuum representation.
The idea of the proof is to use smeared vertex operators. From the proof one can also see the close relation between our irreducible conformal precosheaf and the definition of the coset vertex operator algebra in §5 of [FZ]. In fact, the coset vertex operator algebra in §5 of [FZ] can be thought of as “germs” of ours.

Let $g$ (resp. $h$) be the Lie algebra of $G$ (resp. $H$). Choose a basis $e_\alpha, e_{-\alpha}, h_\alpha$ in $g_{\mathbb{C}} := g \otimes \mathbb{C}$ with $\alpha$ ranging over the set of roots as in §2.5 of [PS]. Let $X_\alpha := e_\alpha + e_{-\alpha}$, $Y_\alpha := i(e_\alpha - e_{-\alpha})$. Denote by $\hat{g}$ the affine Kac–Moody algebra (cf. p. 163 of [KW]) associated to $g_{\mathbb{C}}$. Note $\hat{g} = g_{\mathbb{C}} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c$, where $\mathbb{C}c$ is the 1-dimensional center of $\hat{g}$. For $X \in g$, define $X(n) := X \otimes t^n$, $X(z) := \sum_n X(n)z^{-n-1}$ as on p. 312 of [KT] and $X^+(z) := \sum_{n}$ […] $1$ can be expressed in terms of linear combinations of the form $X(-1)Y(-1)\cdots Z(-1)$.
Because $\Omega$ is the vacuum, $X\Omega = 0$, $\forall X \in g$, so in $X_{i_1}(-n_1)\cdots X_{i_t}(-n_t)\Omega$, if a certain $X \in g$ appears, we can always move $X$ to the right until it vanishes when acting on $\Omega$. For the above two reasons, we can assume that $n_1 = \cdots = n_t = 1$. The vertex operator $V(\psi, z)$ is then given by:
$$V(\psi, z) = \sum C_{i_1,\ldots,i_t} : X_{i_1}(z) \cdots X_{i_t}(z) :,$$
where $:\ ,\ :$ are normal ordered products by property (2) above.

Recall $V(\psi, z) = \sum_m \psi(m)z^{-m-1}$. Define
$$V(m) := \psi(m + n - 1)$$
so we have $V(\psi, z) = \sum_m V(m)z^{-m-n}$. This expression for $V(\psi, z)$ is in accordance with the convention of [KT]. Note that $V(-n)\Omega = \psi(-1)\Omega = \psi$ by property (1) above.

Let $f = \sum_m f(m)z^m$ be a test function with only a finite number of non-zero $f(m)$. Such $f$ will be referred to as finite energy functions. Define
$$\|f\|_s = \sum_{m \in \mathbb{Z}} (1 + |m|)^s |f(m)|.$$
The smeared vertex operator $V(\psi, f)$ is defined to be:
$$V(\psi, f) = \frac{1}{2\pi i}\int_{S^1} V(\psi, z)f\,dz = \sum_m f(m + n - 1)V(m).$$
$V(\psi, f)$ is a well defined operator on $H^0_0$. Let $V(\psi, f)^{FA}$ be the formal adjoint of $V(\psi, f)$ on $H^0_0$. It is defined by the equation
$$\langle V(\psi, f)x, y\rangle = \langle x, V(\psi, f)^{FA}y\rangle, \quad \forall x, y \in H^0_0,$$
where $\langle\ ,\ \rangle$ is the inner product on the Hilbert space $H^0$. When no confusion arises, we will write $V(\psi, f)$ simply as $V(f)$. Similarly for $X \in g$, we define $X(f) := \sum_n X(n)f(n)$.

When the level is 1, $H^0$ admits a fermionic representation (cf. §13.3 of [PS] or I.6 of [W2]), and we will denote the underlying Hilbert space by $F$. In fact, to each simple component $G_i$ of $G$ there is a level 1 vacuum fermionic representation of $LG_i$ on $F_i$, $i = 1, \ldots, m$, and $F = F_1 \otimes F_2 \otimes \cdots \otimes F_m$.

Lemma 2.5. (1) Let $\xi \in H^0_0$, and let $f$ be a finite energy function. There exist a positive integer $a$ and $c > 0$ which are independent of $f$ and $\xi$ such that
$$\|V(\psi, f)\xi\|_s \leq c\,\|f\|_{|s|+a}\,\|\xi\|_{s+a};$$
(2) (1) is also true for $V(\psi, f)^{FA}$ for the same constants $c$ and $a$.
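The mode expansion of the smeared vertex operator quoted before Lemma 2.5 follows from a one-line residue computation. The following is a formal sketch for a finite energy function $f = \sum_k f(k)z^k$, understood on finite energy vectors; the summation index $k$ is introduced here for illustration.

```latex
% Formal computation for a finite energy function f = \sum_k f(k) z^k,
% using V(\psi,z) = \sum_m V(m) z^{-m-n} as defined in the text.
\begin{align*}
V(\psi,f)
  &= \frac{1}{2\pi i}\oint_{S^1}
     \Bigl(\sum_m V(m)\,z^{-m-n}\Bigr)\Bigl(\sum_k f(k)\,z^{k}\Bigr)\,dz
   = \sum_{m,k} f(k)\,V(m)\,\frac{1}{2\pi i}\oint_{S^1} z^{\,k-m-n}\,dz\\
  &= \sum_{m,k} f(k)\,V(m)\,\delta_{k,\,m+n-1}
   = \sum_m f(m+n-1)\,V(m).
\end{align*}
```

In particular, for the monomial $f(z) = z^{l+n-1}$ this gives $V(\psi, f) = V(l)$, which is the special case to which the proof of Lemma 2.5 reduces.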
Proof. Note by definition
$$V(\psi, f) = \frac{1}{2\pi i}\int_{S^1} V(\psi, z)f\,dz = \sum_m f(m + n - 1)V(m).$$
If we can show (1) for the case $V(\psi, f) = V(l)$, $\forall l$, then
$$\|V(\psi, f)\xi\|_s = \Bigl\|\sum_{l \in \mathbb{Z}} f(l + n - 1)V(l)\xi\Bigr\|_s \leq c \sum_{l \in \mathbb{Z}} (1 + |l|)^{|s|+a}\,|f(l + n - 1)|\,\|\xi\|_{s+a} \leq c\,(1 + |n|)^{|s|+a}\,\|f\|_{|s|+a}\,\|\xi\|_{s+a},$$
where the last step uses $1 + |l| \leq (1 + |n|)(1 + |l + n - 1|)$, valid for $n \geq 1$. So we just need to show (1) for the case $V(\psi, f) = V(l)$, $\forall l$. Also note that $V(\psi, f)$ is linear in $\psi$, so it is sufficient to prove (1) in the case when
$$V(\psi, z) = : X_1(z) \cdots X_n(z) :,$$
and $\xi$ is an eigenvector of $D$ with eigenvalue $\mu$. Similarly to prove (2) we just need to prove (2) in the case $V(\psi, f)^{FA} = V(l)^{FA}$, and $\psi$, $\xi$ are as above.

Note that $: X_1(z) \cdots X_n(z) :$ is a summation of $2^n$ expressions of the form $X_i^+ \cdots X_j^+ X_{i'}^- \cdots X_{j'}^-$. We will in fact prove the inequality in Lemma 2.5 for such expressions. This will finish the proof of Lemma 2.5 by definitions.

We first prove this when the level is 1, $G$ is simple, and the representation is on $F$. To avoid too many subscripts we will denote the fermionic creation or annihilation operators simply by $a(m)$, since we only need to use the fact that $a(m)$ increases the energy by $m$, i.e., $[D, a(m)] = m\,a(m)$ on $H^0_0$, and that its norm is 1 in our proof. Note in terms of $a(m)$,
$$X(k) = \sum_{m > 0} a(m - k)a(-m) - \sum_{m \geq 0} a(m)a(-m - k)$$
when acting on finite energy vectors, cf. the expression in a theorem on p. 486 of [W2]. Notice that we have dropped all the subscripts here for simplicity.

We will prove the lemma for
$$V(z) = X^+(z) \cdots X^+(z)\,Y^-(z) \cdots Y^-(z),$$
where there are $n$ $X$, $Y$ and we have dropped the subscripts for simplicity. Then $V(l)\xi$ is a sum of $2^n$ expressions of the form:
$$\sum a(m_1) \cdots a(m_{2n})\xi,$$
where the sum is over a finite number of $m_1, \ldots, m_{2n}$'s, subject to a certain constraint. Let us now show by induction on $n$ that
$$0 \leq |m_i| \leq e(1 + |l| + \mu), \quad i = 1, \ldots, 2n, \tag{1}$$
where $e \geq 1$ depends only on $V(z)$. When $n = 1$, it is contained in a proof on p. 488 of [W2]. Assume that the statement is true for $k < n$. Suppose $V(z) = Y(z)X^-(z)$. Then:
$$\sum_{k \geq 0} Y(l - k)X(k) = \sum_{k \geq 0,\, m > 0} Y(l - k)a(m - k)a(-m) - \sum_{k \geq 0,\, m \geq 0} Y(l - k)a(m)a(-m - k).$$
$\sum_{k \geq 0,\, m > 0} Y(l - k)a(m - k)a(-m)\xi$ is a sum of $2^{(n-1)}$ expressions of the form
$$\sum_{k \geq 0,\, m > 0,\, m_1, \ldots, m_{2n-2}} a(m_1) \cdots a(m_{2n-2})\,a(m - k)a(-m)\xi,$$
where $0 \leq |m_i| \leq e'(1 + |\mu - k| + |l - k|)$ by induction hypothesis, and $e' \geq 1$ depends only on $Y(z)$. Note that the above expression is nonzero only if $0 < m \leq \mu$, $0 \leq k \leq \mu$. It follows that the expression is the sum of
$$\sum_{m_1, \ldots, m_{2n}} a(m_1) \cdots a(m_{2n})\xi$$
with $0 \leq |m_i| \leq e(1 + |l| + \mu)$, $i = 1, \ldots, 2n$, where $e = 4e'$. The same conclusion holds for $\sum_{k \geq 0,\, m > 0,\, m_1, \ldots, m_{2n-2}} a(m_1) \cdots a(m_{2n-2})\,a(m)a(-m - k)\xi$.

Suppose $V(z) = X^+(z)Y(z)$. Then
$$\sum_{k} X(k)Y(l - k) = \sum a(m - k)a(-m)Y(l - k)$$
[…] $0$ is as in Lemma 2.5. By definition we have
$$\langle V(f)x, y_m\rangle = \langle x, V(f)^{FA}y_m\rangle,$$
and by (2) of Lemma 2.5
$$\|V(f)^{FA}(y_m - y_{m'})\| \leq c\,\|f\|_a\,\|y_m - y_{m'}\|_a.$$
So $\{V(f)^{FA}y_m\}_{m \geq 0}$ is a Cauchy sequence with a limit defined to be $V(f)^{FA}y$. We can choose $w = V(f)^{FA}y$. Also note that
$$\|V(f)^{FA}y\| \leq c\,\|f\|_a\,\|y\|_a.$$
Now let $f$ be a smooth function, and choose a sequence $f_n$ of finite energy functions such that
$$\lim_{n \to \infty} \|f_n - f\|_a = 0,$$
where the constant $a$ is as in Lemma 2.5. By the proof in the finite energy function case we have
$$\langle V(f_n)x, y\rangle = \langle x, V(f_n)^{FA}y\rangle.$$
By Lemma 2.5 and the note above
$$\|V(f_n - f)x\| \leq c\,\|f_n - f\|_a\,\|x\|_a, \quad \|V(f_n - f_{n'})^{FA}y\| \leq c\,\|f_n - f_{n'}\|_a\,\|y\|_a.$$
It follows that the sequence $V(f_n)^{FA}y$, $\forall n > 0$, is a Cauchy sequence with a limit denoted by $w$ and
$$\langle V(f)x, y\rangle = \langle x, w\rangle, \quad \forall x \in H^0_0.$$
Ad (2): By (1) $V(f)$ is closable. Let $x \in H^0_\infty$. Then one can find $x_n \in H^0_0$ such that $\|x_n - x\|_a \to 0$ with $a > 0$ as in Lemma 2.5, and by Lemma 2.5 $V(f)x_n \to V(f)x$. This shows $H^0_0$ is a core for $V(f)$.

Ad (3): Suppose $y_n \to y \in M$ in the strong topology and
$$V(f)y_nx = y_nV(f)x, \quad \forall s \in S,$$
and for all $x$ in the domain of $V(f)$; it follows immediately that $yx$ is in the domain of $V(f)$ and
$$V(f)yx = yV(f)x, \quad \forall s \in S,$$
and for all $x$ in the domain of $V(f)$. Since
$$V(f)sx = sV(f)x, \quad \forall s \in S,\ x \in H^0_0,$$
and $H^0_0$ is a core for $V_j(f)$, it follows that
$$V(f)sx = sV(f)x, \quad \forall s \in S,$$
and for all $x$ in the domain of $V(f)$. So
$$V(f)s_1s_2 \cdots s_nx = s_1s_2 \cdots s_nV(f)x, \quad \forall s_i \in S,\ i = 1, \ldots, n,$$
and for all $x$ in the domain of $V(f)$ and finite $n$. (3) now follows from the definition. □

Suppose $p^* = -p$ is a smooth test function. Assume that $X \in g$. It follows from §3 of [GW] (also cf. p. 489 of [W2]) that $X(p)$ is essentially skew-self adjoint with core $H^0_0$, and $X(p)$ maps $H^0_\infty$ to $H^0_\infty$.

Lemma 2.7. Let $X \in g$. Then:
(1)
$$[X(p), V(\psi, f)]x = \sum_{0 \leq j \leq n} V\Bigl(X(j)\psi,\ \frac{1}{j!}\frac{d^jp}{dz^j}f(z)\Bigr)x$$
for any smooth functions $p$, $f$ and $x \in H^0_\infty$, where $n$ is the energy of $\psi$;
(2) If $\exp(tX(p))H^0_0 \subset H^0_\infty$, $-1 \leq t \leq 1$, and $\sup\{\|\exp(tX(p))x\|_s,\ -1 \leq t \leq 1\} < \infty$ for any $x \in H^0_0$, $s > 0$, then
$$\langle[\exp(X(p)), V(\psi, f)]x, y\rangle = \int_0^1 \langle\exp(tX(p))\,[X(p), V(\psi, f)]\,\exp((1 - t)X(p))x, y\rangle\,dt,$$
for any smooth functions $p = -p^*$, $f$ and $x \in H^0_0$, $y \in H^0_0$.

Proof. Note for $x, y \in H^0_0$, $\langle X(k)V(\psi, z)x - V(\psi, z)X(k)x, y\rangle$ is a polynomial in $z$, $z^{-1}$. We have (cf. for example p. 327 of [KT]): for $z \neq 0$,
$$\langle X(k)V(\psi, z)x - V(\psi, z)X(k)x, y\rangle = \frac{1}{2\pi i}\int_{C_z} dw\,w^k\,\langle X(w)V(\psi, z)x, y\rangle$$
$$= \frac{1}{2\pi i}\int_{C_z} dw\,w^k \sum_m (w - z)^{-m-1}\langle V(X(m)\psi, z)x, y\rangle$$
$$= \frac{1}{2\pi i}\int_{C_z} dw\,w^k \sum_{m \leq n} (w - z)^{-m-1}\langle V(X(m)\psi, z)x, y\rangle$$
$$= \sum_{0 \leq j \leq n} \frac{1}{j!}\frac{d^jz^k}{dz^j}\,\langle V(X(j)\psi, z)x, y\rangle,$$
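where $C_z$ is the boundary of a disk centered at $z$ with radius $\frac{1}{2}|z|$, and in the last equation we used the fact that
\[ \frac{1}{2\pi i}\int_{C_z} dw\, w^k (w - z)^{-j-1} = \frac{1}{j!}\frac{d^j z^k}{dz^j} \]
for $0 \le j \le n$. Since $H^0_0$ is dense, we have:
\begin{align*}
[X(k), V(\psi, f)]x
&= \sum_{0 \le j \le n} \frac{1}{2\pi i}\int_{S^1} \frac{1}{j!}\frac{d^j z^k}{dz^j} f(z)\, V(X(j)\psi, z) \\
&= \sum_{0 \le j \le n} V\Bigl(X(j)\psi,\ \frac{1}{j!}\frac{d^j z^k}{dz^j} f(z)\Bigr)x,
\end{align*}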
which is true for any finite energy function $f$ and $x \in H^0_0$, and so it is true for any smooth function $f$ and $x \in H^0_\infty$ by using approximation and Lemma 2.5. Let $p = \sum_k p(k) z^k$ be a finite energy function and $x \in H^0_\infty$. By definition
\[ X(p) = \sum_k p(k) X(k), \]
and so (remember $p = \sum_k p(k) z^k$)
\[ [X(p), V(\psi, f)]x = \sum_{0 \le j \le n} V\Bigl(X(j)\psi,\ \frac{1}{j!}\frac{d^j p}{dz^j} f(z)\Bigr)x. \]
By Lemma 2.5 the above is also true for any smooth function $p$, since we can always choose a sequence of functions $p_m$, each $p_m$ a finite energy function, with $\|p_m - p\|_s \to 0$ as $m \to \infty$ for $s$ greater than a given number which may depend on $X$, $\psi$, $X(j)\psi$, $j = 0, \ldots, n$.

Ad (2): Note $H^0_\infty$ is a subset of $C^\infty$ vectors of $X(p)$ and $V(f) := V(\psi, f)$. Let us check that the map
\[ s \in [0, 1] \to A(s) := \langle \exp(sX(p))\,V(f)\,\exp((1 - s)X(p))x, y\rangle \]
is a differentiable function with continuous derivative
\[ B(s) := \langle \exp(sX(p))\,[X(p), V(f)]\,\exp((1 - s)X(p))x, y\rangle. \]
Define
\[ C(s, t) := \langle \exp(sX(p))\,V(f)\,\exp((1 - t)X(p))x, y\rangle, \quad (s, t) \in [0, 1] \times [0, 1]. \]
We shall repeatedly use the following elementary fact about $C^\infty$ vectors (cf. p. 488 of [W2]): if $\xi$ is a $C^\infty$ vector of $X(p)$, i.e., $\xi$ is in the domain of $X(p)^n$, $\forall n \ge 1$, then the function $u \in \mathbb{R} \to \langle \exp(uX(p))\xi, \eta\rangle$ is a smooth function of $u$ for any $\eta \in H^0$.
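The integral formula in part (2) of Lemma 2.7 has an elementary finite-dimensional analogue, which may help to fix ideas before the domain questions are treated. The following is a minimal numerical sketch, not part of the proof: it replaces $X(p)$ by a random skew-Hermitian matrix `X` and $V(\psi, f)$ by an arbitrary matrix `V` (both are illustrative stand-ins, as are the matrix size and the quadrature order), and checks that $[e^{X}, V] = \int_0^1 e^{tX}[X, V]e^{(1-t)X}\,dt$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # illustrative matrix size (assumption)

# Random skew-Hermitian X (stand-in for X(p)) and arbitrary V (stand-in for V(psi, f)).
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
X = (A - A.conj().T) / 2
V = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# X = i*H with H Hermitian, so exp(tX) = U diag(exp(i t w)) U^*.
w, U = np.linalg.eigh(-1j * X)

def exp_tX(t):
    """Return the unitary matrix exp(t X)."""
    return (U * np.exp(1j * t * w)) @ U.conj().T

comm = X @ V - V @ X                      # [X, V]
lhs = exp_tX(1.0) @ V - V @ exp_tX(1.0)   # [exp(X), V]

# Gauss-Legendre quadrature on [0, 1] for the right-hand side.
nodes, weights = np.polynomial.legendre.leggauss(40)
ts = 0.5 * (nodes + 1.0)
ws = 0.5 * weights
rhs = sum(wt * exp_tX(t) @ comm @ exp_tX(1.0 - t) for t, wt in zip(ts, ws))

err = np.max(np.abs(lhs - rhs))
print("max entrywise difference:", err)
```

In finite dimensions the identity is just the Fundamental Theorem of Calculus applied to $s \mapsto e^{sX} V e^{(1-s)X}$; the argument in the text is needed because the operators involved are unbounded.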
Since $V(f)\exp((1 - t)X(p))x \in H^0_\infty$, which is a subset of $C^\infty$ vectors of $X(p)$, it follows that $C(s, t)$ is a smooth function of $s$ for fixed $t$. Also note
\[ C(s, t) = \langle \exp(sX(p))V(f)\exp((1 - t)X(p))x, y\rangle = \langle \exp((1 - t)X(p))x, V(f)^*\exp(-sX(p))y\rangle, \]
where we have used (1) of Lemma 2.6 and our assumption that $\exp(-sX(p))y \in H^0_\infty$, $0 \le s \le 1$. So $C(s, t)$ is a smooth function of $t$ for fixed $s$. We have (use (1) of Lemma 2.6 when computing derivatives with respect to $t$) the following partial derivatives:
\begin{align*}
C_s(s, t) &= \langle \exp(sX(p))X(p)V(f)\exp((1 - t)X(p))x, y\rangle, \\
C_{ss}(s, t) &= \langle \exp(sX(p))X(p)^2 V(f)\exp((1 - t)X(p))x, y\rangle, \\
C_t(s, t) &= \langle \exp(sX(p))V(f)(-X(p))\exp((1 - t)X(p))x, y\rangle, \\
C_{tt}(s, t) &= \langle \exp(sX(p))V(f)X(p)^2\exp((1 - t)X(p))x, y\rangle, \\
C_{st}(s, t) = C_{ts}(s, t) &= \langle \exp(sX(p))X(p)V(f)(-X(p))\exp((1 - t)X(p))x, y\rangle.
\end{align*}
Note that all the derivatives above are smooth functions of one variable when the other variable is fixed. We have
\begin{align*}
|C_{ss}(s, t)| &\le \|\exp(sX(p))X(p)^2 V(f)\exp((1 - t)X(p))x\|\,\|y\| \\
&\le \|X(p)^2 V(f)\exp((1 - t)X(p))x\|\,\|y\| \\
&\le C_1 \|p\|^2_{a_1}\|f\|_{a_2}\|\exp((1 - t)X(p))x\|_{a_3}\|y\| \\
&\le C_2,
\end{align*}
where we used Lemma 2.5 in the third $\le$, the assumption in the last $\le$, and $C_1, C_2, a_1, a_2, a_3$ are independent of $s, t$ by Lemma 2.5 and the assumption. We can obtain similar estimates for other second partial derivatives, and so there exists a constant $C' > 0$ such that
\[ \mathrm{Sup}\{|C_{ss}(s, t)|,\ |C_{st}(s, t) = C_{ts}(s, t)|,\ |C_{tt}(s, t)|,\ \forall (s, t) \in [0, 1] \times [0, 1]\} \le C'. \]
By using the uniform bound for second partial derivatives and Taylor's theorem in calculus, we have
\begin{align*}
A(s + \Delta s) - A(s) &= C(s + \Delta s, s + \Delta s) - C(s, s) \\
&= C(s + \Delta s, s + \Delta s) - C(s + \Delta s, s) + C(s + \Delta s, s) - C(s, s) \\
&= C_t(s + \Delta s, s)\Delta s + \tfrac{1}{2}C_{tt}(s + \Delta s, \theta_1)(\Delta s)^2 + C_s(s, s)\Delta s + \tfrac{1}{2}C_{ss}(\theta_2, s)(\Delta s)^2 \\
&= \bigl(C_t(s, s) + \tfrac{1}{2}C_{ts}(\theta_3, s)\Delta s\bigr)\Delta s + \tfrac{1}{2}C_{tt}(s + \Delta s, \theta_1)(\Delta s)^2 \\
&\quad + C_s(s, s)\Delta s + \tfrac{1}{2}C_{ss}(\theta_2, s)(\Delta s)^2 \\
&= B(s)\Delta s + O((\Delta s)^2),
\end{align*}
where $\theta_i$, $i = 1, 2, 3$ are between $s$ and $s + \Delta s$,
\[ |O((\Delta s)^2)| \le \frac{3C'}{2}|(\Delta s)^2|, \]
and we have used $B(s) = C_t(s, s) + C_s(s, s)$, which follows from definitions. It follows immediately that the derivative of $A(s)$ is $B(s)$ on $[0, 1]$. A similar elementary exercise in calculus shows that $B(s)$ is continuous on $[0, 1]$. (2) now follows by the Fundamental Theorem of Calculus. □

Let $T$ be the maximal torus of $G$ and $\mathfrak{t} = \mathrm{Lie}(T)$. By §13.3 of [PS], the level 1 vacuum representation of $LT$ on Hilbert space $F$ is also an irreducible representation of $LG$. Denote by $\pi$ the representation of $LG$ on $F$.

Lemma 2.8. (1) Let $k \in \mathbb{N}$ and $x \in F_0$, where $F_0$ denotes the set of finite energy vectors, and let $v = \exp(w(p))$ with $w \in \mathfrak{t}$, $p = -p^*$, $\|p\|_{k+1} < M$. Then there exists a constant $C$ which only depends on $w, M, k$ and $x \in F_0$ such that $\|\pi(v)x\|_k \le C$;
(2) Let $u = \exp(X(p))$, where $X = X_\alpha$ or $X = Y_\alpha$ and $p = -p^*$, $\|p\|_{k+1} < M$. Then there exists a constant $C'$ which only depends on $X, M, k$ and $x \in F_0$ such that $\|\pi(u)x\|_k \le C'$;
(3) Let $u = \exp(X(p))$, where $X = X_\alpha$ or $X = Y_\alpha$ and $p = -p^*$, $\|p\|_{k+1} < M$. Denote by $\pi^0$ the vacuum representation of $LG$ on $H^0$. Then there exists a constant $C''$ which only depends on $X, M, k$ and $x \in H^0_0$ such that $\|\pi^0(u)x\|_k \le C''$.

Proof. Ad (1): The basic idea is contained in Prop. 9.5.15 of [PS], and we will recall the notations and facts in §9.5 of [PS]. We can write $LT \simeq \mathrm{Hom}(S^1, T) \times T \times V$, where $T$ is the subgroup of constant loops, and $V$ is the vector space of maps $a : S^1 \to \mathfrak{t}$ with integral 0, which is regarded as a subgroup of $LT$ by the exponential map $a \to \exp(ia)$. The identity component of the central extension of $LT$ in our case is canonically a product $T \times \tilde{V}$, where $\tilde{V}$ is the Heisenberg group associated to a skew form $S$ defined on p. 63 of [PS]. Write $V \otimes \mathbb{C} = A \oplus \bar{A}$, where $A$ is spanned by $z^k\,\mathfrak{t} \otimes \mathbb{C}$ for $k > 0$. For $a \in V \otimes \mathbb{C}$, let $a = \sum_n a_n z^n$, $a_n \in \mathfrak{t} \otimes \mathbb{C}$ be its Fourier series. We define
\[ \|a\|_s := \sum_n (1 + |n|)^s |a_n|, \quad s \in \mathbb{R}, \]
and $|\cdot|$ is the norm on $\mathfrak{t} \otimes \mathbb{C}$ induced from the restriction of the Killing form on $\mathfrak{t}$. The Hermitian form $\langle\,,\rangle$ on $A$ defined by $\langle a, a'\rangle = -2iS(\bar{a}, a')$ is positive definite, where the skew form $S$ is defined on p. 63 of [PS]. The only property we need about $S$ is
\[ |S(a, a')| \le \|a\|_0\,\|a'\|_1, \quad \forall a, a' \in V, \]
which follows from its definition.
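The weighted norm $\|a\|_s$ just defined is simple to evaluate for a trigonometric polynomial. The following is a minimal sketch under a simplifying assumption that is not in the text: the coefficients $a_n$ are taken to be complex scalars (as if $\mathfrak{t}$ were one-dimensional with $|\cdot|$ the absolute value). The function name `weighted_norm` and the sample coefficients are hypothetical.

```python
def weighted_norm(coeffs, s):
    """Return ||a||_s = sum_n (1 + |n|)^s |a_n|.

    coeffs: dict mapping the Fourier index n to a scalar coefficient a_n
            (a scalar stand-in for an element of t (x) C; assumption).
    """
    return sum((1 + abs(n)) ** s * abs(a_n) for n, a_n in coeffs.items())

# Hypothetical finite-energy loop with three nonzero Fourier modes.
a = {1: 0.5, 2: 0.25j, -3: 0.1}

norm0 = weighted_norm(a, 0)   # plain l^1 norm of the coefficients
norm1 = weighted_norm(a, 1)   # each mode weighted by (1 + |n|)
print(norm0, norm1)
```

Since $(1 + |n|)^s$ is nondecreasing in $s$, the norms are nondecreasing in $s$, which is why a bound on $\|p\|_{k+1}$ controls all the lower norms used in the estimates that follow.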
The Hilbert space $F$ is the completion of the symmetric algebra $S(A)$ with respect to the Hermitian form above, which is extended from $A$ to $S(A)$ by the formula
\[ \langle a_1 a_2 \cdots a_n,\ a'_1 a'_2 \cdots a'_n\rangle = \sum \langle a_1, a'_{i_1}\rangle \cdots \langle a_n, a'_{i_n}\rangle, \]
where the sum is over all permutations $\{i_1, \ldots, i_n\}$ of $\{1, \ldots, n\}$. Note that for any $a \in A$,
\[ e^a := \sum_{n \ge 0} \frac{a^n}{n!} \]
belongs to $F$. The vacuum vector in $F$ is denoted by 1. As in the proof of Prop. 9.5.15 of [PS], it is sufficient to prove (1) for the case when $v \in \tilde{V}$ and $x$ is the vacuum vector. Note that $v = \exp(w(p))$ is identified with $\tilde{w}(p) = ipw \in \tilde{V}$ under the isomorphism $LT \simeq \mathrm{Hom}(S^1, T) \times T \times V$ above. The action of $v$ on the vacuum vector 1 is given by (cf. p. 194 of [PS]):
\[ v.1 = e^{-\frac{1}{2}\langle a, a\rangle} e^a, \]
where $\tilde{w}(p) = a + \bar{a}$, and $a(z) = \sum_{i > 0} a_i z^i$. Let
\[ a^{(s)}(z) := \sum_{i > 0} i^s a_i z^i, \quad s \in \mathbb{N}. \]
Note that $\|a^{(s)}\|_0 \le \|a\|_s \le \|p\|_s |w|$. We have:
\[ D^k a^n = \sum_{s_1 \ge 0, \ldots, s_n \ge 0,\ s_1 + \cdots + s_n = k} \frac{k!}{s_1! \cdots s_n!}\, a^{(s_1)} \cdots a^{(s_n)}, \]
and so
\[ \|D^k a^n\| \le \sum_{s_1 \ge 0, \ldots, s_n \ge 0,\ s_1 + \cdots + s_n = k} \frac{k!}{s_1! \cdots s_n!}\, \|a^{(s_1)} \cdots a^{(s_n)}\|. \]
Note that for $0 \le s_1, t_1 \le k$,
\[ |\langle a^{(s_1)}, a^{(t_1)}\rangle| = |2S(a^{(s_1)}, a^{(t_1)})| \le 2\|a^{(s_1)}\|_0\,\|a^{(t_1)}\|_1 \le 2\|a\|_{s_1}\|a\|_{t_1 + 1} \le 2\|p\|^2_{k+1}|w|^2, \]
hence
\[ \|a^{(s_1)} \cdots a^{(s_n)}\|^2 = \langle a^{(s_1)} \cdots a^{(s_n)},\ a^{(s_1)} \cdots a^{(s_n)}\rangle \le n!\,2^n |w|^{2n} \|p\|^{2n}_{k+1}, \]
where we used the definition of $\langle\,,\rangle$ on $F$ as completion of $S(A)$. So
\[ \|D^k a^n\| \le (n!)^{\frac{1}{2}} \|p\|^n_{k+1}\, n^k (\sqrt{2}|w|)^n, \]
and
\begin{align*}
\|D^k v.1\| &\le e^{-\frac{1}{2}\langle a, a\rangle} \sum_{n=1}^{\infty} \frac{1}{(n!)^{\frac{1}{2}}} \|p\|^n_{k+1}\, n^k (\sqrt{2}|w|)^n \\
&\le \sum_{n=1}^{\infty} \frac{1}{(n!)^{\frac{1}{2}}} \|p\|^n_{k+1}\, n^k (\sqrt{2}|w|)^n.
\end{align*}
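The last series has the form $\sum_{n \ge 1} n^k x^n / (n!)^{1/2}$ with $x = \sqrt{2}\,|w|\,\|p\|_{k+1}$, and it converges for every $x \ge 0$ because $(n!)^{1/2}$ eventually dominates $n^k x^n$. A minimal numerical sketch of this point follows; the function name `series_bound`, the truncation lengths and the sample values of $x$ and $k$ are illustrative assumptions, and logarithms are used only to avoid overflow.

```python
import math

def series_bound(x, k, n_terms=200):
    """Partial sum of sum_{n>=1} n^k x^n / sqrt(n!) for x > 0, via logs."""
    total = 0.0
    for n in range(1, n_terms + 1):
        log_term = k * math.log(n) + n * math.log(x) - 0.5 * math.lgamma(n + 1)
        total += math.exp(log_term)
    return total

# Hypothetical values: x plays the role of sqrt(2)|w| ||p||_{k+1}.
b200 = series_bound(3.0, 2, n_terms=200)
b400 = series_bound(3.0, 2, n_terms=400)
print(b200, b400)
```

Doubling the number of terms does not change the partial sum to double precision, and the value is increasing in $x$. Since $\|p\|_{k+1} < M$, the final series above is therefore bounded by a constant depending only on $w$, $M$ and $k$.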
This implies (1) by the definition of $\|\cdot\|_k$.

Ad (2): By the observation on p. 267 of [PS] there exists an element $q \in G$ such that $q\exp(w(p))q^{-1} = \exp(X(p))$ up to a scalar as operators on $F$. Since the action of $q$ commutes with the action of rotation, (2) follows from (1).

Ad (3): It is enough to consider the case when $G$ is simple. Since any level $L$ vacuum representation appears as a direct summand of $F^{\otimes L}$, we just have to prove (3) for the representation $\pi^{\otimes L}$, but this follows immediately from (2). □

Lemma 2.9. (1) If $f$, $p = -p^*$ are smooth functions, then
\[ \pi^0(\exp(X(p)))\,\pi^0(V(\psi, f))x = \pi^0(V(\psi, f))\,\pi^0(\exp(X(p)))x \]
for any $x \in H^0_0$, $X \in \mathrm{Lie}(H)$;
(2) Let $f$, $p = -p^*$ be smooth functions on $S^1$ with support $f \subset I$ and support $p \subset I'$. If $X = X_\alpha$ or $X = Y_\alpha$, then:
\[ \pi^0(V(\psi, f))\,\pi^0(\exp(X(p)))x = \pi^0(\exp(X(p)))\,\pi^0(V(\psi, f))x \]
for any $x \in H^0_0$.

Proof. Ad (1): If $X \in \mathrm{Lie}(H)$, then $X(i)\psi = 0$ for any $i \ge 0$ by the definition of $\psi$, so it follows from (1) of Lemma 2.7 that
\[ [X(p), V(\psi, f)]x = 0 \]
for any $x \in H^0_\infty$. By (3) of Lemma 2.8, the condition of (2) of Lemma 2.7 (note in (2) of Lemma 2.7 $tX(p) = X(tp)$ by definition) is satisfied, and the identity follows by using (2) of Lemma 2.7 and the fact that $H^0_0$ is norm dense in $H^0$.

Ad (2): Since the support of $f$ and the support of $p$ are disjoint, by (1) of Lemma 2.7
\[ [X(p), V(\psi, f)]x = 0 \]
for any $x \in H^0_\infty$. By (3) of Lemma 2.8, the condition of (2) of Lemma 2.7 is satisfied, and the identity follows by using (2) of Lemma 2.7 and the fact that $H^0_0$ is norm dense in $H^0$. □
Lemma 2.10. Let $S$ be the set which consists of elements $\pi^0(\exp(X(p)))$ with $p = -p^*$ smooth if $X \in \mathrm{Lie}(H)$, and $p = -p^*$ smooth, support $p \subset I'$ if $X = X_\alpha$ or $X = Y_\alpha$. Then the $C^*$ algebra generated by $S$ is strongly dense in $\pi^0(LH)'' \vee \pi^0(L_{I'}G)''$ if $H$ is simply connected.

Proof. Note $S = S^*$. Since every element of $L_{I'}G$ (resp. $LH$) is a product of exponentials in $L_{I'}g$ (resp. $Lh$), cf. p. 487 of [W2] (we use the fact that $G$, $H$ are simply connected here), we just have to show every element of the form $\pi^0(\exp(X(p)))$ with $p = -p^*$ smooth, support $p \subset I'$, and $X \in g$ is in the von Neumann algebra $M$ generated by $S$. Assume $X = \sum_i c_i X_i$, where $c_i \in \mathbb{R}$ and $X_i$ is either $X_\alpha$ or $Y_\alpha$. Note that $\pi^0(X(p))$ and $\pi^0(X_i(p))$ are essentially skew-self-adjoint operators with a common core $H^0_0$. By abuse of notation, we will use the same symbols to denote their closures. Let $a \in M'$. Then we have:
\[ \pi^0(X_i(p))\,ax = a\,\pi^0(X_i(p))x \]
for any $x \in H^0_0$, so
\[ \pi^0(X(p))\,ax = a\,\pi^0(X(p))x \]
for any $x \in H^0_0$, and it follows that the closure of $\pi^0(X(p))$ is affiliated with $M$, so $\pi^0(\exp(X(p)))$ is in $M$. □

Proposition 2.11. Suppose $H \subset G$, $H$ is simply connected, and $\pi^0$ is the vacuum representation of $LG$. Let $f$ be a smooth function with support $f \subset I$. Then $\pi^0(V(\psi, f))$ is affiliated with the von Neumann algebra $\pi^0(LH)' \cap \pi^0(L_I G)''$.

Proof. By Lemma 2.9,
\[ \pi^0(V(\psi, f))\,ux = u\,\pi^0(V(\psi, f))x \]
for any $u \in S$, where $S$ is the generating set as in Lemma 2.10, and $x \in H^0_0$. Note by Haag duality in Prop. 2.1, $\pi^0(L_I G)' = \pi^0(L_{I'}G)''$. The proposition now follows from (3) of Lemma 2.6 and Lemma 2.10. □

Now we can finish the proof of Theorem 2.3.

Proof of Theorem 2.3. Let $\psi \in H_{0,0} \otimes \Omega_0$ be an eigenvector of $D$ with eigenvalue $n \in \mathbb{N}$. Note $\psi = V(-n)\Omega = V(\psi, p)\Omega$ with $p = z^{-1}$. Choose two smooth functions $f_1$ and $f_2$, with support $f_1 \subset I_1 \in \mathcal{I}$ and support $f_2 \subset I_2 \in \mathcal{I}$, and $f_1 + f_2 = 1$. Then
\[ \psi = V(\psi, p)\Omega = V(\psi, pf_1)\Omega + V(\psi, pf_2)\Omega. \]
By Prop. 2.11, the closed operator $V(\psi, f)$ is affiliated with $\pi^0(LH)' \cap \pi^0(L_J G)''$ if the support $f \subset J \in \mathcal{I}$. Let $V(\psi, f) = U|V|$ be the polar decomposition. By Lemma 4.4.1 of [Mv], $U$ and $\exp(it|V|)$, $\forall t \in \mathbb{R}$ are in $\pi^0(LH)' \cap \pi^0(L_J G)''$. Since $\Omega$ is in the domain of $|V|$, it follows by Stone's theorem (cf. p. 266 of [RS]) that as $t \to 0$,
\[ (e^{it|V|}\Omega - \Omega)/t \to i|V|\Omega, \]
and this shows
\[ V(\psi, f)\Omega = U|V|\Omega \in \overline{\bigl(\pi^0(LH)' \cap \pi^0(L_J G)''\bigr)\Omega} \subset H. \]
So $V(\psi, pf_1)\Omega \in H$, $V(\psi, pf_2)\Omega \in H$, and it follows that $\psi \in H$ by the expression given at the beginning of the proof. □
2.3. Two conjectures. We will use the notations in 2.1.

Conjecture 2.12. The covariant representations $\pi_{i,\alpha}$ can be decomposed into a direct sum of a finite number of irreducible representations, and the localized sectors corresponding to these irreducible representations generate a finite dimensional ring over $\mathbb{C}$ under the product of sectors (cf. 4.1).

Conjecture 2.12 comes from the physicists' argument that the coset $H \subset G$ CFT is a rational CFT: there are only a finite number of primary fields. Here the primary fields correspond to the representations or sectors. In some cases, e.g., $G \subset G \times G$ where the inclusion is diagonal, there are also conjectures on the structure constants of the ring (cf. [FKW] and [BBSS]). More precisely, the conjectures in §§3 and 4 of [FKW] are about certain representations of W-algebras with critical parameters. The W-algebras defined in [FKW] are closely related to the coset $G \subset G_1 \times G_m$ (cf. [Watts]); for example, the representations of W-algebras in [FKW] have the same characters as those which come from the coset. We shall call the irreducible conformal precosheaves of the cosets $SU(N) \subset SU(N)_1 \times SU(N)_m$ coset $W_N$-algebras with critical parameters. Note that coset $W_2$ algebras with critical parameters are the irreducible conformal precosheaves corresponding to Virasoro algebras studied in [GKO and Luke].

To state Conjecture 2.13, let $L_0^{g,h}$ be the generator of the rotation group for the coset as in the proof of Prop. 2.3. Then $e^{-\beta L_0^{g,h}}$, $\beta > 0$ is a trace-class operator on $H_{i,\alpha}$ by Theorem B of [KW]. Denote by $d_{i,\alpha}$ the statistical dimension (cf. 4.1) of $\pi_{i,\alpha}$. Then we have (cf. [L5] and (4.27) of [FG]):

Conjecture 2.13 (also known as Kac–Wakimoto formula in [L5]).
\[ d_{i,\alpha} = \lim_{\beta \to 0} \frac{\mathrm{Tr}_{H_{i,\alpha}}\, e^{-\beta L_0^{g,h}}}{\mathrm{Tr}_{H_{0,0}}\, e^{-\beta L_0^{g,h}}}. \]

Both of these conjectures are highly nontrivial. The results in [W2] prove these conjectures in the case $G$ is of type A and $H$ is a trivial group. For the case of coset $W_2$ algebras with critical parameters the above conjectures follow from the results of [Luke]. Note that Conjecture 2.13 immediately implies the Kac–Wakimoto conjecture (cf. Conj. 2.5 in [KW]). In fact, Conjecture 2.13 can also be stated as:
\[ d_{i,\alpha} = \frac{b(i, \alpha)}{b(0, 0)}, \]
where $b(i, \alpha)$ is defined as in §2 of [KW] with our $(i, \alpha)$ identified with $(\Lambda, \lambda)$ in (2.5.4) of [KW]. Conjecture 2.5 in [KW] states that $b(i, \alpha) > 0$. Conjecture 2.13 is stronger than this since $d_{i,\alpha} \ge 1$ and $b(0, 0) > 0$ by definitions.

Note that the Kac–Wakimoto hypothesis (cf. p. 161 of [KW]) also implies the Kac–Wakimoto conjecture, but the first counter-example to the Kac–Wakimoto hypothesis has been found in [X2] by considering subfactors associated with conformal inclusions. So far Conjecture 2.13, and hence the Kac–Wakimoto conjecture, have been checked to be true in all known examples.

3. Commuting Squares

We will use the notations of 2.1. All the cosets considered in this section are assumed to verify the assumptions of Theorem 2.4 unless stated otherwise. For the definitions and properties of statistical dimensions and minimal index, see 4.1.
Definition (Cofiniteness). The coset $H \subset G_L$ is called cofinite if the inclusion
\[ (\pi^0(L_I G)'' \cap \pi^0(L_I H)') \vee \pi^0(L_I H)'' \subset \pi^0(L_I G)'' \]
has finite statistical dimension. The statistical dimension of the inclusion is denoted by $d(G/H)$.

Note that $d(G/H)$ does not depend on the choice of $I$ by the covariance property of representations (cf. Prop. 2.1 of [GL]), and we can replace $\pi^0$ by any level $L$ representation of $LG$ in the above definition due to the local equivalence of these representations (cf. Theorem II. B of [W2]).

Let $\pi^i$ be an irreducible projective representation of $LG$ with positive energy at level $L$ on Hilbert space $H^i$. Recall (cf. 2.1) when restricting to $LH$, $H^i$ decomposes as:
\[ H^i = \sum_\alpha H_{i,\alpha} \otimes H_\alpha, \]
and $\pi_\alpha$ are irreducible projective representations of $LH$ on the Hilbert space $H_\alpha$, and the sum is over $\alpha$ such that $(i, \alpha) \in \mathrm{exp}$. Consider the following inclusions:
\[ (\pi^i(L_I G)'' \cap \pi^i(L_I H)') \vee \pi^i(L_I H)'' \subset \pi^i(L_I G)'' \subset \pi^i(L_{I'} G)' \subset \bigl((\pi^i(L_{I'} G)'' \cap \pi^i(L_{I'} H)') \vee \pi^i(L_{I'} H)''\bigr)'. \]
Note that
\[ \pi^i(L_{I'} G)' \subset \bigl((\pi^i(L_{I'} G)'' \cap \pi^i(L_{I'} H)') \vee \pi^i(L_{I'} H)''\bigr)' \]
has the same statistical dimension as
\[ (\pi^i(L_{I'} G)'' \cap \pi^i(L_{I'} H)') \vee \pi^i(L_{I'} H)'' \subset \pi^i(L_{I'} G)'', \]
which is $d(G/H)$. By the multiplicativity of statistical dimensions (cf. 4.1) the statistical dimension of the inclusion
\[ (\pi^i(L_I G)'' \cap \pi^i(L_I H)') \vee \pi^i(L_I H)'' \subset \bigl((\pi^i(L_{I'} G)'' \cap \pi^i(L_{I'} H)') \vee \pi^i(L_{I'} H)''\bigr)' \]
is $d_i\, d(G/H)^2$, where $d_i$ is the statistical dimension of $\pi^i(L_I G)'' \subset \pi^i(L_{I'} G)'$. On the other hand, by the additivity of the statistical dimension (cf. 4.1) the statistical dimension of the above inclusion is $\sum_\alpha d_{(i,\alpha)} d_\alpha$, where $d_{(i,\alpha)}$ and $d_\alpha$ are the statistical dimensions of $\pi_{i,\alpha}$ and $\pi_\alpha$ respectively. So we have
\[ d_i\, d(G/H)^2 = \sum_\alpha d_{(i,\alpha)}\, d_\alpha. \tag{3.1} \]
When any of the statistical dimensions in formula (3.1) are $\infty$, then (3.1) is understood as the statement that both sides of the equation are $\infty$. See the paragraph before Prop. 4.2 for a slightly different derivation of formula (3.1).

When $i = 0$ is the vacuum representation, the statistical dimension $d_0$ of the inclusion $\pi^0(L_I G)'' \subset \pi^0(L_{I'} G)'$ is 1 by Haag duality in Prop. 2.1. It follows from formula (3.1) that $H \subset G_L$ is cofinite if and only if $d_{(0,\alpha)} d_\alpha < \infty$ for all $(0, \alpha) \in \mathrm{exp}$. Hence Conjecture 2.13 implies the cofiniteness for any coset.
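Formula (3.1) can be read as a recipe for $d(G/H)$ once the statistical dimensions of the sectors in the decomposition of $H^i$ are known. The following is a minimal sketch of that arithmetic, including the convention that an infinite dimension makes both sides infinite. The function name `coset_index` and the numerical inputs are hypothetical and serve only to illustrate the formula.

```python
import math

def coset_index(d_i, sector_dims):
    """Solve formula (3.1), d_i * d(G/H)^2 = sum_alpha d_(i,alpha) * d_alpha, for d(G/H).

    d_i:         statistical dimension of pi^i
    sector_dims: list of pairs (d_(i,alpha), d_alpha), one pair per alpha
    """
    total = sum(d_coset * d_h for d_coset, d_h in sector_dims)
    if math.isinf(total) or math.isinf(d_i):
        return math.inf  # (3.1) is then read as "both sides are infinite"
    return math.sqrt(total / d_i)

# Hypothetical vacuum sector (d_0 = 1) splitting into four pieces, all of dimension 1.
example = coset_index(1.0, [(1.0, 1.0)] * 4)
print(example)
```

With $d_0 = 1$, finiteness of every product $d_{(0,\alpha)} d_\alpha$ over a finite index set is exactly what makes the returned value finite, which matches the cofiniteness criterion stated above.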
For simplicity we will drop the subscript $L$ in the following when no confusion arises. Note that if $H = \{e\}$ is the trivial group, then $H \subset G$ is cofinite. So the statement “if $H_1 \subset G$ is cofinite and $H_1 \subset H_2 \subset G$, then $H_2 \subset G$ is cofinite” is as difficult to prove as the statement “$H_2 \subset G$ is cofinite” by simply taking $H_1 = \{e\}$. But we have:

Proposition 3.1. Suppose $H_1 \subset H_2 \subset G$.
(1) If $H_1 \subset G$ is cofinite, then $H_1 \subset H_2$ is cofinite, and $d(G/H_1) \ge d(H_2/H_1)$;
(2) If $H_1 \subset H_2$ and $H_2 \subset G$ are cofinite, then $H_1 \subset G$ is cofinite and $d(G/H_1) \le d(G/H_2) \times d(H_2/H_1)$.

This proposition is proved below by using commuting squares. Commuting squares where all the algebras are of finite type can be found in [We, Po]. But we will consider the case where all the algebras are of type III.

Since the action of the modular group of $\pi^0(L_I G)''$ with respect to the vacuum vector $\Omega$ is geometric and ergodic (cf. 2.1), it follows from Takesaki's theorem (cf. [MT] or p. 495 of [W2]) that the von Neumann algebras
\[ \pi^0(L_I H_i)'' \vee (\pi^0(L_I H_i)' \cap \pi^0(L_I G)''), \quad i = 1, 2, \]
and
\[ \pi^0(L_I H_1)'' \vee (\pi^0(L_I H_1)' \cap \pi^0(L_I H_2)'') \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'') \]
are factors, and there exist normal faithful conditional expectations
\[ \epsilon_i : \pi^0(L_I G)'' \to \pi^0(L_I H_i)'' \vee (\pi^0(L_I H_i)' \cap \pi^0(L_I G)''), \quad i = 1, 2, \]
and
\[ \epsilon : \pi^0(L_I G)'' \to \pi^0(L_I H_1)'' \vee (\pi^0(L_I H_1)' \cap \pi^0(L_I H_2)'') \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)''). \]
Moreover, these conditional expectations preserve the state $\omega$ on $\pi^0(L_I G)''$ defined by $\omega(x) = (x\Omega, \Omega)$, i.e., $\omega(x) = \omega(\epsilon'(x))$, $\forall x \in \pi^0(L_I G)''$, when $\epsilon' = \epsilon_1, \epsilon_2, \epsilon$ respectively. Then we have:

Lemma 3.2 (Commuting Square). (1) $\epsilon_1 \cdot \epsilon_2 = \epsilon_2 \cdot \epsilon_1 = \epsilon$; (2) $\epsilon_1, \epsilon_2$ are minimal if $\epsilon$ has finite index.

Proof. Assume the vacuum representation $\pi^0$ of $LG$ decomposes with respect to $LH_2$ as:
\[ H^0 = \oplus_\alpha H_{G/H_2,0,\alpha} \otimes H_{H_2,\alpha}, \]
then with respect to $LH_1$ the decomposition is:
\[ H^0 = \oplus_\alpha H_{G/H_2,0,\alpha} \otimes H_{H_2,\alpha} = \oplus_\alpha H_{G/H_2,0,\alpha} \otimes (\oplus_\beta H_{H_2/H_1,\alpha,\beta} \otimes H_{H_1,\beta}). \]
Let $P_1$ be the projection from $H^0$ onto $\oplus_\alpha H_{G/H_2,0,\alpha} \otimes H_{H_2/H_1,\alpha,0} \otimes H_{H_1,0}$, $P_2$ the projection from $H^0$ onto $H_{G/H_2,0,0} \otimes \oplus_\beta (H_{H_2/H_1,0,\beta} \otimes H_{H_1,\beta})$, and $P$ the projection from $H^0$ onto $H_{G/H_2,0,0} \otimes H_{H_2/H_1,0,0} \otimes H_{H_1,0}$. It follows from definitions that $P_1 P_2 = P_2 P_1 = P$. By Theorem 2.4 and the Reeh–Schlieder Theorem in Prop. 2.1,
\[ P_i H^0 = \overline{\bigl(\pi^0(L_I H_i)'' \vee (\pi^0(L_I H_i)' \cap \pi^0(L_I G)'')\bigr)\Omega}, \quad i = 1, 2, \]
and
\[ P H^0 = \overline{\bigl(\pi^0(L_I H_1)'' \vee (\pi^0(L_I H_1)' \cap \pi^0(L_I H_2)'') \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'')\bigr)\Omega}. \]
So for any $x \in \pi^0(L_I G)''$, we have:
\[ \epsilon_1(\epsilon_2(x))\Omega = P_1 \epsilon_2(x)\Omega = P_1 P_2 x\Omega = P x\Omega = \epsilon(x)\Omega, \]
and similarly $\epsilon_2(\epsilon_1(x))\Omega = \epsilon(x)\Omega$. Since $\Omega$ is separating for $\pi^0(L_I G)''$, (1) of the lemma is proved. Note by the remark at the end of 2.2, the inclusions
\[ \pi^0(L_I H_i)'' \vee (\pi^0(L_I H_i)' \cap \pi^0(L_I G)'') \subset \pi^0(L_I G)'' \]
are irreducible, so by Prop. 4.3 of [L4] the $\epsilon_i$ are unique and must be the minimal conditional expectations, $i = 1, 2$, if the index of $\epsilon$ is finite. □

Proof of Proposition 3.1. (1) As in the proof of Lemma 3.2, suppose the vacuum representation $\pi^0$ of $LG$ decomposes with respect to $LH_2$ as:
\[ H^0 = \oplus_\alpha H_{G/H_2,0,\alpha} \otimes H_{H_2,\alpha}, \]
and let $P_0$ (resp. $P_0'$) be the projection from $H^0$ onto $H_{G/H_2,0,0} \otimes H_{H_2,0}$ (resp. $H_{G/H_2,0,0} \otimes \Omega_0$, where $\Omega_0$ is the vacuum vector in $H_{H_2,0}$). Then
\begin{align*}
\pi^0(L_I H_2)'' \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'')
&\simeq \pi^0(L_I H_2)'' \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'')P_0 \\
&\simeq \pi_0(L_I H_2)'' \otimes (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'')P_0' \\
&\simeq \pi^0(L_I H_2)'' \otimes (\pi^0(L_I H_2)' \cap \pi^0(L_I G)''),
\end{align*}
where $\otimes$ is the tensor product of von Neumann algebras and $A \simeq B$ means $A$ and $B$ are *-isomorphic, since all the algebras above are factors. Note the *-isomorphism above from $\pi^0(L_I H_2)'' \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'')$ to $\pi^0(L_I H_2)'' \otimes (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'')$ maps $\pi^0(x)$ to $\pi^0(x) \otimes 1$, $\forall x \in L_I H_2$, and $y$ to $1 \otimes y$, $\forall y \in \pi^0(L_I H_2)' \cap \pi^0(L_I G)''$. So this *-isomorphism maps
\[ \pi^0(L_I H_1)'' \vee (\pi^0(L_I H_1)' \cap \pi^0(L_I H_2)'') \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'') \]
onto
\[ \bigl(\pi^0(L_I H_1)'' \vee (\pi^0(L_I H_1)' \cap \pi^0(L_I H_2)'')\bigr) \otimes (\pi^0(L_I H_2)' \cap \pi^0(L_I G)''). \]
It follows that the inclusion
\[ \pi^0(L_I H_1)'' \vee (\pi^0(L_I H_1)' \cap \pi^0(L_I H_2)'') \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'') \subset \pi^0(L_I H_2)'' \vee (\pi^0(L_I H_2)' \cap \pi^0(L_I G)'') \]
is conjugate to (π 0 (LI H1 )00 ∨ (π 0 (LI H1 )0 ∩ π 0 (LI H2 )00 )) ⊗ (π 0 (LI H2 )0 ∩ π 0 (LI G)00 ) ⊂ π 0 (LI H2 )00 ⊗ (π 0 (LI H2 )0 ∩ π 0 (LI G)00 ), hence it is irreducible and by Cor. 2.2 of [L6], its statistical dimension is d(H2 /H1 ). By Lemma 3.2, the minimal normal faithful conditional expectation 1 restricts to a normal faithful conditional expectation η from π 0 (LI H2 )00 ∨ (π 0 (LI H2 )0 ∩ π 0 (LI G)00 ) to
π 0 (LI H1 )00 ∨ (π 0 (LI H1 )0 ∩ π 0 (LI H2 )00 ) ∨ (π 0 (LI H2 )0 ∩ π 0 (LI G)00 ).
By Prop. 4.3 of [L4], η is also minimal, and so dη = d(H2 /H1 ) ≤ d1 = d(G/H1 ) by the definition of the statistical dimension (cf. 4.1). To prove (2), note that η1 = is a minimal conditional expectation by Cor. 2.2 of [L6], and d ≥ d1 , so we have d(H2 /H1 )d(G/H2 ) = d ≥ d1 = d(G/H1 ), where we have used multiplicativity of statistical dimensions (cf. 4.1). u t We consider some examples when Prop. 3.1 can be applied. The conformal inclusion SU (n)m × SU (m)n ⊂ SU (nm)1 has been considered in [X2] and the decomposition is given in Theorem 1 of [ABI]. Let H = SU (n) be the first factor in the above inclusion. Lemma 3.3. π 0 (LI SU (nm))00 ∩ π 0 (LI SU (n))0 = π 0 (LI SU (m))00 . So the irreducible conformal precosheaf of coset SU (n) ⊂ SU (nm)1 is the irreducible conformal precosheaf of LSU (m) at level n. Proof. From the definition we have: π 0 (LI SU (nm))00 ∩ π 0 (LI SU (n))0 ⊃ π 0 (LI SU (m))00 . Since (cf. the remark after Prop. 2.2) the action of the modular group of π 0 (LI SU (nm))00 ∩ π 0 (LI SU (n))0 with respect to the vacuum vector is geometric and fixes globally π 0 (LI SU (m))00 , by Takesaki’s theorem (cf. [MT] or p. 495 of [W2]), we just have to show that π 0 (LI SU (nm))00 ∩ π 0 (LI SU (n))0 ⊂ π 0 (LI SU (m))00 . By the decomposition of H 0 with respect to LSU (n) × LSU (m) in Theorem 1 of [ABI], = 0,0 ⊗ 0 ∈ H0,0 ⊗ H0 , where H0,0 and H0 are vacuum representations of LSU (m) and LSU (n) respectively, and 0,0 , 0 are vacuum vectors for LSU (m) and LSU (n) respectively. By the Reeh–Schlieder Theorem in Prop. 2.1, π 0 (LI SU (m))00 = H0,0 ⊗ 0 , but by the observation before Prop. 2.3 we have π 0 (LI SU (nm))00 ∩ π 0 (LI SU (n))0 ⊂ H0,0 ⊗ 0 . It follows that π 0 (LI SU (nm))00 ∩ π 0 (LI SU (n))0 ⊂ π 0 (LI SU (m))00 , and the lemma is proved. u t
By Lemma 3.3, Theorem 1.2 of [X1] and formula (3.1), the inclusion SU (n)k+l ⊂ SU (n(k +l))1 is cofinite. Since SU (n)k+l ⊂ SU (n)k ×SU (n)l ⊂ SU (n(k +l))1 , where the first inclusion is diagonal, by (1) of Prop. 3.1 the diagonal inclusion SU (n)k+l ⊂ SU (n)k × SU (n)l is cofinite, and using (2) of Prop. 3.1 repeatedly we conclude that the diagonal inclusion SU (n)k ⊂ SU (n)1 × · · · × SU (n)1 is also cofinite, where there are k factors in the product. It follows by (1) Prop. 3.1 that SU (n)k1 +···+km ⊂ SU (n)k1 × · · · × SU (n)km is cofinite, ki ∈ N, i = 1, . . . , m, since SU (n)k1 +···+km ⊂ SU (n)k1 × · · · × SU (n)km ⊂ SU (n)1 × . . . SU (n)1 , where there are k1 + · · · + km factors in the last group. Suppose Hk ⊂ G1 is a conformal inclusion, H is simple and of type A, G is simple and k is the Dynkin index (cf. p. 170 of [KW]). An infinite list can be found in [X2]. Let l ∈ N. Since Hkl ⊂ Hk × · · · × Hk is cofinite by the previous paragraph and Hk × · · · × Hk ⊂ G1 × · · · × G1 is cofinite by Prop. 2.4 of [X1], it follows by (2) of Prop. 3.1 that Hkl ⊂ G1 × · · · × G1 is cofinite, and by (1) of Prop. 3.1 Hkl ⊂ Gl is cofinite. Finally let us consider the case H ⊂ Gm with G = SU (l) and H is the Cartan subalgebra of G, a l − 1 dimensional torus. We will first consider the inclusion H ⊂ G1 × · · · × G1 , ˜ := where there are m factors in the product, and the inclusion is diagonal. Define G G × G · · · × G, where there are m factors in the product. The irreducible projective representations of LH at level m have been classified in Prop. 9.5.10 of [PS]. Let us describe this result in our case. These irreducible projective representations are in fact representations of LH , which is a central extension of LH induced from the central extension LG of LG (cf. p. 483 of [W2] or Chap. 4 of [PS]). 
Write $LH \simeq \mathrm{Hom}(S^1, H) \times H \times V$, where $H$ is the subgroup of constant loops, and $V$ is the vector space of maps $f : S^1 \to \mathrm{Lie}(H)$ with integral 0, which is regarded as a subgroup of $LH$ by the exponential map. The identity component of $LH$ is canonically a product $H \times \tilde{V}$, where $\tilde{V}$ is the Heisenberg group defined by a skew form on $V$. The center of the identity component of $LH$ is $H \times S^1$. Let
\[ \xi = (\xi_1, \ldots, \xi_{l-1}, (\xi_1 \cdots \xi_{l-1})^{-1}) \in LH, \]
where each $\xi_i \in C^\infty(S^1, S^1)$ has winding number $x_i$, $i = 1, \ldots, l - 1$. The conjugate action of $\xi$ on the center $H \times S^1$ of the identity component of $LH$ is given by (cf. p. 192 of [PS])
\[ (t, u) \to \bigl(t,\ u\, t_1^{x_1 + (x_1 + \cdots + x_{l-1})} \cdots t_{l-1}^{x_{l-1} + (x_1 + \cdots + x_{l-1})}\bigr), \]
where $t = (t_1, \ldots, t_{l-1}, (t_1 \cdots t_{l-1})^{-1}) \in H$. Introduce an equivalence relation on $\mathbb{Z}^{l-1}$ by: $(n_1, \ldots, n_{l-1}) \sim (n'_1, \ldots, n'_{l-1})$ iff there exists $(m_1, \ldots, m_{l-1}) \in \mathbb{Z}^{l-1}$ with $m_1 + \cdots + m_{l-1} \in l\mathbb{Z}$ such that $(n'_1, \ldots, n'_{l-1}) = (n_1 + m m_1, \ldots, n_{l-1} + m m_{l-1})$. Denote the equivalence class of $(n_1, \ldots, n_{l-1})$ by $[n_1, \ldots, n_{l-1}]$ or simply $[n]$. The irreducible representation of $LH$ at level $m$ on Hilbert space $H_{[n]}$ has the following form (cf. p. 192 of [PS]):
\[ H_{[n]} = \oplus_{(a_1, \ldots, a_{l-1}) \sim (n_1, \ldots, n_{l-1})} H_{(a_1, \ldots, a_{l-1})}, \]
where on $H_{(a_1, \ldots, a_{l-1})}$, the center $H \times S^1$ of the identity component of $LH$ acts as $(t, u) \to t_1^{a_1} \cdots t_{l-1}^{a_{l-1}} u \times \mathrm{id}$, and on $H_{(a_1, \ldots, a_{l-1})}$ the representation of the Heisenberg group $\tilde{V}$ is irreducible (and unique by Prop. 9.5.10 of [PS]).
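The equivalence relation on $\mathbb{Z}^{l-1}$ introduced above is easy to test mechanically: two tuples are equivalent when their difference is $m$ times an integer vector whose entries sum to a multiple of $l$. The following is a minimal sketch of that test; the function name `equivalent` and the sample values of $l$, $m$ and the tuples are illustrative assumptions.

```python
def equivalent(n, n_prime, m, l):
    """Test (n_1..n_{l-1}) ~ (n'_1..n'_{l-1}) at level m for H in SU(l).

    The tuples are equivalent iff n' - n = m * (m_1, ..., m_{l-1})
    with integers m_i such that m_1 + ... + m_{l-1} is divisible by l.
    """
    if len(n) != l - 1 or len(n_prime) != l - 1:
        raise ValueError("tuples must have length l - 1")
    diffs = [b - a for a, b in zip(n, n_prime)]
    if any(d % m != 0 for d in diffs):
        return False                      # difference is not a multiple of m
    return sum(d // m for d in diffs) % l == 0

# Hypothetical example with l = 3 (H a 2-torus) and level m = 2.
same_class = equivalent((0, 0), (2, 4), m=2, l=3)    # shifts (1, 2), sum 3
other_class = equivalent((0, 0), (2, 0), m=2, l=3)   # shifts (1, 0), sum 1
print(same_class, other_class)
```

Each summand $H_{(a_1, \ldots, a_{l-1})}$ of $H_{[n]}$ is labeled by a tuple for which this test succeeds against $(n_1, \ldots, n_{l-1})$.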
When restricting to $LH$, the vacuum representation $\pi^0$ of $L\tilde{G} := LG \times \cdots \times LG$ (there are $m$ factors in the product) on $(H_v)^{\otimes m}$ decomposes as:
\[ (H_v)^{\otimes m} = \sum_{l \,|\, \sum_i n_i} H_{0,[n]} \otimes H_{[n_1, \ldots, n_{l-1}]}, \]
and $\pi_{[n]}$ are irreducible projective representations of $LH$ on $H_{[n_1, \ldots, n_{l-1}]}$ (cf. §2.6 of [KW]). Let $\alpha \in LH \times \cdots \times LH$ be a loop of the form $\xi \times 1 \times \cdots \times 1$, where
\[ \xi = (\xi_1, \ldots, \xi_{l-1}, (\xi_1 \cdots \xi_{l-1})^{-1}) \in LH, \]
and each $\xi_i \in C^\infty(S^1, S^1)$ has winding number $x_i$, $i = 1, \ldots, l - 1$. We can assume that $\alpha$ is localized on $I$. Define
\[ \mathrm{Ad}_\alpha.y := \alpha y \alpha^{-1}, \quad \mathrm{Ad}_\alpha.\pi^0(y) = \pi^0(\alpha)\pi^0(y)\pi^0(\alpha^{-1}), \quad \forall y \in L\tilde{G}. \]
Note (remember that $LH$ is diagonally included in $L\tilde{G}$)
\[ \mathrm{Ad}_\alpha.y \in LH, \quad \mathrm{Ad}_\alpha.\pi^0(y) \in \pi^0(LH), \quad \forall y \in LH. \]
Then by definitions we have:
\[ \pi_{[n]}(\mathrm{Ad}_\alpha y) \simeq \pi_{[n+b]}(y), \quad \forall y \in LH, \]
where $[n + b] = [n_1 + b_1, \ldots, n_{l-1} + b_{l-1}]$, with $b_i = x_i + (x_1 + \cdots + x_{l-1})$, $i = 1, \ldots, l - 1$. Note that this implies that $\pi_{[n]}$ has statistical dimension 1 since $\mathrm{Ad}_\alpha$ is a localized automorphism. Also note that since $\mathrm{Ad}_\alpha$ is an automorphism of $\pi^0(LH)$, it is also an automorphism of $\pi^0(LH)' \cap \pi^0(L_J \tilde{G})''$ for any interval $J$.

We claim that
\[ \pi_{0,[n]}(\mathrm{Ad}_\alpha.y) \simeq \pi_{0,[n+b]}(y) \]
for any $y \in A(J)$, where $A(J)$ is the conformal precosheaf for the coset $H \subset \tilde{G}$. In fact let $U_{[n]} : H_{[n]} \to H_{[n+b]}$ be a unitary map intertwining the action of $LH$ and the action of $\mathrm{Ad}_\alpha.LH$, and let $W = V \otimes U : H^0 \to H^0$ be a unitary map such that
\[ V \otimes U(z \otimes y) = V_{[n]} z \otimes U_{[n]} y \in H_{0,[n+b]} \otimes H_{[n+b]} \]
for any $z \otimes y \in H_{0,[n]} \otimes H_{[n]}$. It follows that
\[ W^* \pi^0(\alpha) \in \pi^0(LH)' = \oplus_{[n]} B(H_{0,[n]}) \otimes \mathrm{id}_{H_{[n]}}, \]
so we have
\[ \pi^0(\alpha) = \oplus_{[n]} V'_{[n]} \otimes U_{[n]}, \]
where $V'_{[n]} : H_{0,[n]} \to H_{0,[n+b]}$ is unitary. Note $\pi^0(\mathrm{Ad}_\alpha.y) = \pi^0(\alpha)\pi^0(y)\pi^0(\alpha)^{-1}$, and
\[ \pi^0(y) \in \oplus_{[n]} B(H_{0,[n]}) \otimes \mathrm{id}_{H_{[n]}}, \quad \forall y \in A(J). \]
Hence
\[ \pi_{0,[n]}(\mathrm{Ad}_\alpha.y) = V'^{*}_{[n]}\, \pi_{0,[n+b]}(y)\, V'_{[n]}, \quad \forall y \in A(J). \]
Now choose $x$ so that $b_i = -n_i$, $i = 1, \ldots, l - 1$; we get
\[ \pi_{0,[n]}(\mathrm{Ad}_\alpha.y) = V'^{*}_{[n]}\, \pi_{0,[0]}(y)\, V'_{[n]}. \]
So $\pi_{0,[n]}$ has the same statistical dimension as $\pi_{0,[0]}$ since $\mathrm{Ad}_\alpha$ is a localized automorphism.
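The choice of winding numbers with $b_i = -n_i$ can be made explicit. Summing $b_i = x_i + (x_1 + \cdots + x_{l-1})$ over $i = 1, \ldots, l - 1$ gives $\sum_i b_i = l \sum_i x_i$, so integer solutions exist exactly when $l$ divides $\sum_i n_i$, which is the condition under which $H_{0,[n]}$ appears in the decomposition above; then $x_i = b_i - \frac{1}{l}\sum_j b_j$. The following is a minimal sketch of this computation, with the function name `winding_numbers` and the sample input being hypothetical.

```python
def winding_numbers(n, l):
    """Solve x_i + (x_1 + ... + x_{l-1}) = -n_i for integers x_1..x_{l-1}.

    n: tuple (n_1, ..., n_{l-1}); requires l | (n_1 + ... + n_{l-1}).
    """
    if len(n) != l - 1:
        raise ValueError("n must have length l - 1")
    total_b = -sum(n)               # sum of the targets b_i = -n_i
    if total_b % l != 0:
        raise ValueError("l must divide n_1 + ... + n_{l-1}")
    sum_x = total_b // l            # since sum_i b_i = l * sum_i x_i
    return tuple(-n_i - sum_x for n_i in n)

# Hypothetical example with l = 3 and (n_1, n_2) = (1, 2).
x = winding_numbers((1, 2), l=3)
b = tuple(x_i + sum(x) for x_i in x)
print(x, b)
```

For the sample input the solution is $x = (0, -1)$, and recomputing $b$ from it returns $(-1, -2) = -n$, as required.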
If $\pi_{0,[0]}$ is the vacuum representation, then $\pi_{0,[n]}$ has statistical dimension 1. Note $\pi_{[n]}$ also has statistical dimension 1, so by formula (3.1) we conclude that the diagonal inclusion $H \subset G_1 \times \cdots \times G_1$ is cofinite.

We claim that $\pi_{0,[0]}$ is indeed the vacuum representation. Note this does not follow directly from Theorem 2.3 since we assume $H$ is simply connected in the theorem. However, the assumption that $H$ is simply connected is only used in the proof of Lemma 2.10. From the proof of Lemma 2.10, we see that the smeared vertex operators in Prop. 2.4 are affiliated with the von Neumann algebra
\[ \pi^0((LH)_0)' \cap \pi^0(L_I \tilde{G})'', \]
where $(LH)_0$ is the connected component of $LH$ that contains the identity. Note that $LH$ is generated as a group by $(LH)_0$ and a set of elements with non-trivial winding numbers, and we can certainly choose these elements to be in $L_{I'} H \subset L_{I'} \tilde{G}$. So
\[ \pi^0(LH)'' \subset \pi^0((LH)_0)'' \vee \pi^0(L_{I'} \tilde{G})''. \]
Hence if $p \in \pi^0((LH)_0)' \cap \pi^0(L_I \tilde{G})''$, then $p \in \pi^0(LH)' \cap \pi^0(L_I \tilde{G})''$. On the other hand
\[ \pi^0((LH)_0)' \cap \pi^0(L_I \tilde{G})'' \supset \pi^0(LH)' \cap \pi^0(L_I \tilde{G})''. \]
So
\[ \pi^0((LH)_0)' \cap \pi^0(L_I \tilde{G})'' = \pi^0(LH)' \cap \pi^0(L_I \tilde{G})''. \]
This shows that Prop. 2.11, and therefore Theorem 2.4, hold for any pair $H \subset \tilde{G}$ as long as $\tilde{G}$ is semisimple and simply connected. It follows now that Prop. 3.1 can be applied to the present case for $H \subset G_m \subset G_1 \times \cdots \times G_1$ since we only use Theorem 2.4 in its proof. So we conclude that $H \subset G_m$ is cofinite.

To summarize, we have proved the following:

Corollary 3.4. The following inclusions are cofinite:
(1) $G_{k_1 + k_2 + \cdots + k_m} \subset G_{k_1} \times \cdots \times G_{k_m}$, where the inclusion is diagonal, $k_i \in \mathbb{N}$, $i = 1, \ldots, m$ and $G = SU(n)$;
(2) $H_{lk} \subset G_l$, if $H_k \subset G_1$ is a conformal inclusion where $k$ is the Dynkin index, $l \in \mathbb{N}$, $H$ is simple and of type A and $G$ is simple;
(3) $H \subset G_m$, where $H$ is the Cartan subgroup of $G$.

4. Braided Endomorphisms

All the cosets considered in this section are assumed to verify the assumptions of Theorem 2.4 unless stated otherwise.

4.1. Some results from [X1]. In this subsection we recall some of the results from [X1] which will be used in 4.2. We start with some preliminaries on sectors to set up notations.

Let $M$ be a properly infinite factor and $\mathrm{End}(M)$ the semigroup of unit preserving endomorphisms of $M$. In this paper $M$ will always be a type III$_1$ factor. Let $\mathrm{Sect}(M)$ denote the quotient of $\mathrm{End}(M)$ modulo unitary equivalence in $M$. It follows from [L3 and L4] that $\mathrm{Sect}(M)$ is endowed with a natural involution $\theta \to \bar{\theta}$, and $\mathrm{Sect}(M)$ is a semiring: i.e., there are two operations $+, \times$ on $\mathrm{Sect}(M)$ which verify the usual axioms. The multiplication of sectors is simply the composition of sectors. Hence if $\theta_1, \theta_2$ are
two sectors, we shall write $\theta_1 \times \theta_2$ as $\theta_1\theta_2$. In [X1], the image of $\theta \in \mathrm{End}(M)$ in $\mathrm{Sect}(M)$ is denoted by $[\theta]$. However, since we will be mainly concerned with the ring structure of certain sectors in Sect. 4, we will denote $[\theta]$ simply by $\theta$ if no confusion arises.

Assume $\theta \in \mathrm{End}(M)$, and there exists a normal faithful conditional expectation $\epsilon : M \to \theta(M)$. We define a number $d_\epsilon$ (possibly $\infty$) by:
\[ d_\epsilon^{-2} := \mathrm{Max}\{\lambda \in [0, +\infty) \mid \epsilon(m_+) \geq \lambda m_+, \ \forall m_+ \in M_+\} \]
(cf. [PP]). If $d_\epsilon < \infty$ for some $\epsilon$, we say $\theta$ has finite index or statistical dimension. In this case we define $d_\theta = \mathrm{Min}_\epsilon\{d_\epsilon \mid d_\epsilon < \infty\}$. $d_\theta$ is called the statistical dimension of $\theta$, and $d_\theta^2$ is called the minimal index of $\theta$. In fact in this case there exists a unique $\epsilon_\theta$ such that $d_{\epsilon_\theta} = d_\theta$; $\epsilon_\theta$ is called the minimal conditional expectation. It is clear from the definition that the statistical dimension of $\theta$ depends only on the unitary equivalence class of $\theta$. When $N \subset M$ with $N \simeq M$, we choose $\theta \in \mathrm{End}(M)$ such that $\theta(M) = N$. The statistical dimension (resp. minimal index) of the inclusion $N \subset M$ is defined to be the statistical dimension (resp. minimal index) of $\theta$.

Let $\theta_1, \theta_2 \in \mathrm{Sect}(M)$. By Theorem 5.5 of [L3], $d_{\theta_1+\theta_2} = d_{\theta_1} + d_{\theta_2}$, and by Cor. 2.2 of [L6], $d_{\theta_1\theta_2} = d_{\theta_1} d_{\theta_2}$. These two properties are usually referred to as the additivity and multiplicativity of statistical dimensions. Also note by Prop. 4.12 of [L4] $d_\theta = d_{\bar{\theta}}$. If a sector does not have finite statistical dimension in any of the above three equations, then the equation is understood as the statement that both sides of the equation are $\infty$.

Assume $\lambda$, $\mu$, and $\nu \in \mathrm{End}(M)$ have finite statistical dimensions. Let $\mathrm{Hom}(\lambda, \mu)$ denote the space of intertwiners from $\lambda$ to $\mu$, i.e. $a \in \mathrm{Hom}(\lambda, \mu)$ iff $a\lambda(p) = \mu(p)a$ for any $p \in M$. $\mathrm{Hom}(\lambda, \mu)$ is a finite dimensional vector space and we use $\langle\lambda, \mu\rangle$ to denote the dimension of this space. Note that $\langle\lambda, \mu\rangle$ depends only on $[\lambda]$ and $[\mu]$. Moreover we have
\[ \langle\nu\lambda, \mu\rangle = \langle\lambda, \bar{\nu}\mu\rangle, \qquad \langle\nu\lambda, \mu\rangle = \langle\nu, \mu\bar{\lambda}\rangle, \]
which follows from Frobenius duality (see [L2, Y]).
We will also use the following notation: if $\mu$ is a subsector of $\lambda$, we will write $\mu \prec \lambda$ or $\lambda \succ \mu$. A sector is said to be irreducible if it has only one subsector. Let $\theta_i$, $i = 1, \dots, n$ be a set of irreducible sectors with finite index. The ring generated by $\theta_i$, $i = 1, \dots, n$ under compositions is defined to be a vector space (possibly infinite dimensional) over $\mathbb{C}$ with a basis $\{\xi_j, j \geq 1\}$, such that $\xi_j$ are irreducible sectors, $\xi_j \neq \xi_{j'}$ if $j \neq j'$, and the set $\{\xi_j, j \geq 1\}$ is a list of all irreducible sectors which appear as subsectors of finite products of $\theta_i$, $i = 1, \dots, n$. The ring multiplication on the vector space is obtained naturally from that of $\mathrm{Sect}(M)$.

Let $M(J)$, $J \in \mathcal{I}$ be an irreducible conformal precosheaf on Hilbert space $\mathcal{H}^0$. Suppose $N(J)$, $J \in \mathcal{I}$ is an irreducible conformal precosheaf and $\pi^0$ is a covariant representation of $N(J)$ on $\mathcal{H}^0$ such that $\pi^0(N(J)) \subset M(J)$ is a directed standard net as defined in Definition 3.1 of [LR] for any directed set of intervals. Fix an interval $I$ and denote by $N := N(I)$, $M := M(I)$. For any covariant representation $\pi_\lambda$ (resp. $\pi^i$) of the irreducible conformal precosheaf $N(J)$, $J \in \mathcal{I}$ (resp. $M(J)$, $J \in \mathcal{I}$), let $\lambda$ (resp. $i$) be the corresponding endomorphism of $N$ (resp. $M$) as defined in Sect. 2.1 of [GL]. These endomorphisms are obtained by localization in Sect. 2.1 of [GL] and will be referred to as localized endomorphisms for convenience. The corresponding sectors will be called localized sectors. See the paragraph after the proof of Lemma 4.2 for examples. We will use $d_\lambda$ and $d_i$ to denote the statistical dimensions of $\lambda$ and $i$ respectively. $d_\lambda$ and $d_i$ are also called the statistical dimensions of $\pi_\lambda$ and $\pi^i$ respectively, and they are independent of the choice of $I$ (cf. Prop. 2.1 of [GL]).
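The additivity and multiplicativity of statistical dimensions and the Frobenius duality recalled in this subsection can be checked numerically on a standard example. The following is only an illustrative sketch and is not taken from [X1]: it assumes the well-known fusion rules and quantum dimensions of SU(2) at level k (labels a = 0, ..., k, all self-conjugate), and the helper names `fusion`, `dim` and `check` are hypothetical.

```python
import math

def fusion(a, b, c, k):
    """Assumed SU(2)_k fusion coefficient N_{ab}^c for labels 0..k."""
    ok = abs(a - b) <= c <= min(a + b, 2 * k - a - b) and (a + b + c) % 2 == 0
    return 1 if ok else 0

def dim(a, k):
    """Assumed statistical (quantum) dimension of the label a at level k."""
    return math.sin((a + 1) * math.pi / (k + 2)) / math.sin(math.pi / (k + 2))

def check(k):
    """Check d_a d_b = sum_c N_{ab}^c d_c and <ab, c> = <b, (a-bar) c>."""
    labels = range(k + 1)
    for a in labels:
        for b in labels:
            # additivity and multiplicativity of statistical dimensions
            total = sum(fusion(a, b, c, k) * dim(c, k) for c in labels)
            if abs(dim(a, k) * dim(b, k) - total) > 1e-9:
                return False
            for c in labels:
                # Frobenius duality; every label is self-conjugate here
                if fusion(a, b, c, k) != fusion(a, c, b, k):
                    return False
    return True

print(all(check(k) for k in range(1, 9)))
```

In this toy model the irreducible sectors are the labels themselves, so the pairing of a product with an irreducible sector reduces to a single fusion coefficient.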
Let $\pi^i$ be a covariant representation of $M(J)$, $J \in \mathcal{I}$ which decomposes as:
\[ \pi^i = \sum_\lambda b_{i\lambda} \pi_\lambda \]
when restricted to $N(J)$, $J \in \mathcal{I}$, where the sum is finite and $b_{i\lambda} \in \mathbb{N}$. Let $\gamma_i := \sum_\lambda b_{i\lambda} \lambda$ be the corresponding sector of $N$. It is shown (cf. (1) of Prop. 2.8 in [X1]) that there are sectors $\rho, \sigma_i \in \mathrm{Sect}(N)$ such that:
\[ \rho\sigma_i\bar{\rho} = \gamma_i. \]
Notice that $\sigma_i$ are in one-to-one correspondence with covariant representations $\pi^i$, and in fact the map $i \to \sigma_i$ is an isomorphism of the ring generated by $i$ and the ring generated by $\sigma_i$. The subfactor $\bar{\rho}(N) \subset N$ is conjugate to $\pi^0(N(I)) \subset M(I)$ (cf. (2) of Prop. 2.6 in [X1]). Now we assume $\pi^0(N(I)) \subset M(I)$ has finite index. Then for each localized sector $\lambda$ of $N$ there exists a sector denoted by $a_\lambda$ of $N$ such that the following theorem is true (cf. [X1]):

Theorem 4.1.
(1) The map $\lambda \to a_\lambda$ is a ring homomorphism;
(2) $\rho a_\lambda = \lambda\rho$, $a_\lambda\bar{\rho} = \bar{\rho}\lambda$, $d_\lambda = d_{a_\lambda}$;
(3) $\langle\rho a_\lambda, \rho a_\mu\rangle = \langle a_\lambda, a_\mu\rangle = \langle a_\lambda\bar{\rho}, a_\mu\bar{\rho}\rangle$;
(4) $\langle\rho a_\lambda, \rho\sigma_i\rangle = \langle a_\lambda, \sigma_i\rangle = \langle a_\lambda\bar{\rho}, \sigma_i\bar{\rho}\rangle$;
(5) (3) (resp. (4)) remains valid if $a_\lambda, a_\mu$ (resp. $a_\lambda$) is replaced by any of its subsectors;
(6) $a_\lambda\sigma_i = \sigma_i a_\lambda$.

Proof. (1) to (4) follow from Theorem 3.2, 3.3, Cor. 3.4, Lemma 3.4, 3.5 of [X1], (5) is proved on p. 9 of [X2], and (6) is proved on p. 387 of [X1]. It should be noted that these results in Sect. 3 of [X1] are stated for conformal inclusions, but all the proofs there apply verbatim to the present setting. □

4.2. The ring structure. We will apply the results of 4.1 to the case when $N(I) = A(I) \otimes \pi_0(L_I H)''$ and $M(I) = \pi^0(L_I G)''$ under the assumption that $H \subset G_L$ is cofinite, where $A(I)$ is as in Prop. 2.11 for the coset $H \subset G_L$, and $\pi_0$ denotes the vacuum representation of $LH$. Note that if $H \subset G_L$ is cofinite, then $\pi^0(N(I)) \subset M(I)$ has finite index. By Theorem 4.1, for every localized endomorphism $\lambda$ of $N(I)$ we have a map $a : \lambda \to a_\lambda$ which verifies (1) to (6) in Theorem 4.1.

Tensor Notation. Let $\theta \in \mathrm{End}(A(I) \otimes \pi_0(L_I H)'')$.
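We will denote $\theta$ by $\rho_1 \otimes \rho_2$ if
\[ \theta(p \otimes 1) = \rho_1(p) \otimes 1, \ \forall p \in A(I), \qquad \theta(1 \otimes p') = 1 \otimes \rho_2(p'), \ \forall p' \in \pi_0(L_I H)'', \]
where $\rho_1 \in \mathrm{End}(A(I))$, $\rho_2 \in \mathrm{End}(\pi_0(L_I H)'')$.

Lemma 4.2.
(1) If $\theta = \rho_1 \otimes \rho_2$, and
\[ [\rho_1] = \sum_i [\rho_{1i}], \qquad [\rho_2] = \sum_j [\rho_{2j}], \]
where all the summations are finite, then:
\[ [\theta] = \sum_{i,j} [\rho_{1i} \otimes \rho_{2j}]; \]
(2) $\langle\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2\rangle = \langle\rho_1, \sigma_1\rangle\langle\rho_2, \sigma_2\rangle$, where $\rho_1, \sigma_1$ are in $\mathrm{End}(A(I))$, and $\rho_2, \sigma_2$ are in $\mathrm{End}(\pi_0(L_I H)'')$.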
Proof. (1) follows immediately from the definitions. By (1), we just have to show (2) in the case that $\rho_1, \sigma_1, \rho_2, \sigma_2$ are irreducible sectors. It is obvious that if $\rho_1 \simeq \sigma_1$, $\rho_2 \simeq \sigma_2$ as sectors, then $\rho_1 \otimes \rho_2 \simeq \sigma_1 \otimes \sigma_2$ as sectors of $N(I)$. Now suppose $\rho_1 \otimes \rho_2 \simeq \sigma_1 \otimes \sigma_2$ as sectors of $N(I)$. This means there exists a unitary $u \in N(I)$ such that:
\[ u\,(\rho_1(p) \otimes \rho_2(p')) = (\sigma_1(p) \otimes \sigma_2(p'))\,u, \]
for any $p \in A(I)$, $p' \in \pi_0(L_I H)''$. By the statement on p. 123 of [Stra], there exists a normal conditional expectation $E : N(I) \to A(I) \otimes 1$ such that $E(u) \neq 0$. Applying $E$ to the above equation and setting $p' = 1$, we have:
\[ E(u)\rho_1(p) = \sigma_1(p)E(u). \]
Since $\rho_1, \sigma_1$ are irreducible and $E(u) \neq 0$, it follows that $\rho_1 \simeq \sigma_1$ as sectors. Similarly one can show that $\rho_2 \simeq \sigma_2$ as sectors. This proves (2). □

Recall from 2.1 that the representations $\pi_{i,\alpha}$ of $A(I)$ are obtained in the decompositions of $\pi^i$ of $LG$ with respect to the subgroup $LH$, and we denote the set of such $(i, \alpha)$ by exp. For any $J \in \mathcal{I}$, let $U(J)$ be a unitary operator from $\mathcal{H}_{i,\alpha}$ to $\mathcal{H}_{0,0}$ such that:
\[ \pi_{i,\alpha}(p) = U(J)^* \pi_{0,0}(p) U(J), \ \forall p \in A(J). \]
Recall $I$ is a fixed interval. Identifying $\mathcal{H}_{i,\alpha}$ with $\mathcal{H}_{0,0}$ by $U(I')$, we may choose a representation unitarily equivalent to $\pi_{i,\alpha}$, still denoted by $\pi_{i,\alpha}$ on $\mathcal{H}_{0,0}$, with the property that $\pi_{i,\alpha}(p') = p'$, $\forall p' \in A(I')$. It follows that $\pi_{i,\alpha}(A(I))$ commutes with $A(I')$. By Haag duality in Prop. 2.1, $\pi_{i,\alpha}(p) \in A(I)$, $\forall p \in A(I)$, and so $\pi_{i,\alpha}|_{A(I)} \in \mathrm{End}(A(I))$. We will denote $\pi_{i,\alpha}|_{A(I)}$ by $(i, \alpha)$. The corresponding sector in $\mathrm{Sect}(A(I))$ is also denoted by $(i, \alpha)$ when no confusion arises.

Note that $(i, \alpha)$ is an irreducible sector if and only if $\pi_{i,\alpha}$ is an irreducible covariant representation, since the coset conformal precosheaf $A(J)$, $J \in \mathcal{I}$ is strongly additive by the remarks in 2.1 after Prop. 2.3. In fact suppose $(i, \alpha)$ is an irreducible sector. Let $p \in (\vee_{J \in \mathcal{I}} \pi_{i,\alpha}(A(J)))'$. Then $p \in \pi_{i,\alpha}(A(I'))' = A(I)$. It follows that $p \in \mathrm{Hom}((i, \alpha), (i, \alpha)) = \mathbb{C}$, since $(i, \alpha)$ is irreducible. On the other hand, suppose $\pi_{i,\alpha}$ is irreducible and $p \in \mathrm{Hom}((i, \alpha), (i, \alpha))$. Then $p \in A(I)$ and so $p \in (\pi_{i,\alpha}(A(I')) \vee \pi_{i,\alpha}(A(I)))'$. But $\pi_{i,\alpha}(A(I')) \vee \pi_{i,\alpha}(A(I)) = \vee_{J \in \mathcal{I}} \pi_{i,\alpha}(A(J))$ by the strong additivity of the coset conformal precosheaf, so $p \in (\vee_{J \in \mathcal{I}} \pi_{i,\alpha}(A(J)))' = \mathbb{C}$, since $\pi_{i,\alpha}$ is irreducible. Similarly one can show that $(i, \alpha) \succ (j, \beta)$ if and only if $\pi_{j,\beta}$ appears as a direct summand of $\pi_{i,\alpha}$, and $(i, \alpha)$ is equal to $(j, \beta)$ as sectors if and only if $\pi_{i,\alpha}$ is unitarily equivalent to $\pi_{j,\beta}$.

Given $(i, \alpha) \in \mathrm{End}(A(I))$ as above, we define $(i, \alpha) \otimes 1 \in \mathrm{End}(N(I))$ so that:
\[ ((i, \alpha) \otimes 1)(p \otimes p') = (i, \alpha)(p) \otimes p', \ \forall p \in A(I), \ p' \in \pi_0(L_I H)''. \]
It is easy to see that $(i, \alpha) \otimes 1$ corresponds to the covariant representation $\pi_{i,\alpha} \otimes \pi_0$ of $N(I)$. Note that this notation agrees with our tensor notation above. Also note that for
any covariant representation $\pi_x$ of $A(I)$, we can define a localized sector $x \otimes 1$ of $N(I)$ in the same way as in the case when $\pi_x = \pi_{i,\alpha}$. Each covariant representation $\pi^i$ of $LG$ gives rise to an endomorphism $\sigma_i \in \mathrm{End}(N(I))$ and (cf. Subsect. 4.1)
\[ \rho\sigma_i\bar{\rho} = \gamma_i = \sum_\alpha (i, \alpha) \otimes (\alpha), \]
where the summation is over those $\alpha$ such that $(i, \alpha) \in$ exp. So by the properties of statistical dimensions (cf. 4.1) $d_i d_\rho^2 = \sum_\alpha d_{(i,\alpha)} d_\alpha$. Note that this is in fact formula (3.1), with $d_\rho = d(G/H)$ by definition.

Proposition 4.3. Assume $H \subset G_L$ is cofinite. We have:
(1) Let $x, y$ be localized sectors of $A(I)$ with finite index. Then $\langle x, y\rangle = \langle a_{x\otimes 1}, a_{y\otimes 1}\rangle$;
(2) If $(i, \alpha) \in$ exp, then $a_{(i,\alpha)\otimes 1} \prec a_{1\otimes\bar{\alpha}}\sigma_i$;
(3) Denote by $d_{(i,\alpha)}$ the statistical dimension of $(i, \alpha)$. Then $d_{(i,\alpha)} \leq d_i d_\alpha$, where $d_i$ (resp. $d_\alpha$) is the statistical dimension of $i$ (resp. $\alpha$).

Proof. Ad (1): By the assumption and Theorem 4.1, we have
\[
\begin{aligned}
\langle a_{x\otimes 1}, a_{y\otimes 1}\rangle &= \langle\rho a_{x\otimes 1}, \rho a_{y\otimes 1}\rangle = \langle(x \otimes 1)\rho, (y \otimes 1)\rho\rangle \\
&= \langle x \otimes 1, (y \otimes 1)\rho\bar{\rho}\rangle \\
&= \langle x \otimes 1, (y \otimes 1)\sum_\delta (0, \delta) \otimes \delta\rangle \\
&= \langle x \otimes 1, \sum_\delta y(0, \delta) \otimes \delta\rangle \\
&= \sum_\delta \langle x, y(0, \delta)\rangle \times \langle 1, \delta\rangle,
\end{aligned}
\]
where in the last identity we used (2) of Lemma 4.2. Note $\langle 1, \delta\rangle$ is equal to 1 iff $\delta$ corresponds to the vacuum representation of $LH$. When $\delta$ is the vacuum representation, $(0, \delta)$ corresponds to the representation $\pi_{0,0}$ of $A(I)$, which by Th. 2.4 is the vacuum representation of $A(I)$, and corresponds to the identity sector. So we have:
\[ \sum_\delta \langle x, y(0, \delta)\rangle \times \langle 1, \delta\rangle = \langle x, y\rangle, \]
and the proof of (1) is complete.

Ad (2): Since $\pi^i$ has finite statistical dimension and $H \subset G$ is cofinite, by formula (3.1) $d_{(i,\alpha)} < \infty$, $d_\alpha < \infty$, $\forall (i, \alpha) \in$ exp. So we can assume $(i, \alpha) = \sum_j m_j x_j$, where the sum is finite, $m_j \in \mathbb{N}$, and $x_j$ is irreducible and has finite index. Note that $x_j$ is a localized sector of $A(I)$ (cf. Prop. 2.2 of [GL]), so $a_{x_j\otimes 1}$ is well defined, and it follows from (1) that $a_{x_j\otimes 1}$ is also irreducible. By Theorem 4.1,
\[ a_{(i,\alpha)\otimes 1} = \sum_j m_j a_{x_j\otimes 1}. \]
By using Theorem 4.1 and Lemma 4.2 we have:
\[
\begin{aligned}
\langle a_{x_i\otimes 1}, a_{1\otimes\bar{\alpha}}\sigma_i\rangle &= \langle a_{x_i\otimes 1}a_{1\otimes\alpha}, \sigma_i\rangle = \langle a_{x_i\otimes\alpha}, \sigma_i\rangle \\
&= \langle a_{x_i\otimes\alpha}\bar{\rho}, \sigma_i\bar{\rho}\rangle \\
&= \langle\bar{\rho}\,(x_i \otimes \alpha), \sigma_i\bar{\rho}\rangle \\
&= \langle x_i \otimes \alpha, \rho\sigma_i\bar{\rho}\rangle \\
&= \langle x_i \otimes \alpha, \sum_\beta (i, \beta) \otimes \beta\rangle \\
&= \langle x_i, (i, \alpha)\rangle = m_i.
\end{aligned}
\]
This shows
\[ a_{(i,\alpha)\otimes 1} = \sum_j m_j a_{x_j\otimes 1} \prec a_{1\otimes\bar{\alpha}}\sigma_i. \]
(3) follows immediately from (2) and the fact that $(i, \alpha)$ and $a_{(i,\alpha)\otimes 1}$ have the same statistical dimension by (2) of Theorem 4.1. □

Theorem 4.4. Suppose $H \subset G_L$ is cofinite, and every irreducible representation $\pi_\alpha$ of $LH$ has finite index, and the localized sectors $\{\alpha\}$ generate a finite dimensional ring over $\mathbb{C}$ under compositions. Then Conj. 2.12 of 2.3 is true.

Proof. By (3) of Prop. 4.3, $d_{(i,\alpha)} < \infty$, $\forall (i, \alpha) \in$ exp, since $d_i < \infty$, $d_\alpha < \infty$ by our assumption. Hence each $(i, \alpha)$ decomposes into a direct sum of a finite number of irreducible sectors. By Prop. 2.2 of [GL], each $\pi_{i,\alpha}$ decomposes into a direct sum of a finite number of irreducible covariant representations. By (6) of Theorem 4.1, $a_{1\otimes\bar{\alpha}}\sigma_i = \sigma_i a_{1\otimes\bar{\alpha}}$, so the ring $Y$ generated by irreducible subsectors of $a_{1\otimes\bar{\alpha}}\sigma_i$, $\forall\alpha$, $\forall i$ has finite dimension over $\mathbb{C}$ by (1) of Theorem 4.1 and the assumption of the theorem. Denote by $X$ the ring generated (under compositions) by the set of all irreducible sectors which appear as subsectors of $(i, \alpha)$, $\forall (i, \alpha) \in$ exp. By Prop. 4.3, the map $x \in X \to a_{x\otimes 1}$ is an injective homomorphism from $X$ into $Y$. It follows that $X$ is finite dimensional over $\mathbb{C}$, and Conj. 2.12 of Subsect. 2.3 is proved. □

By Cor. 3.4, Theorem 4.4 and the theorem on p. 535 of [W2], we immediately have the following:

Corollary 4.5. Conjecture 2.12 of Subsect. 2.3 is true for the following inclusions:
(1) $G_{k_1+k_2+\cdots+k_m} \subset G_{k_1} \times \cdots \times G_{k_m}$, where the inclusion is diagonal, $k_i \in \mathbb{N}$, $i = 1, \dots, m$ and $G = SU(n)$;
(2) $H_{lk} \subset G_l$, if $H_k \subset G_1$ is a conformal inclusion, where $k$ is the Dynkin index, $l \in \mathbb{N}$, $l > 1$, $H$ and $G$ are simple and of type A;
(3) $H \subset G_m$, where $H$ is the Cartan subgroup of $G$, $m \in \mathbb{N}$, $m > 1$ and $G$ is a simple type A group.

Note in (2) and (3) of Cor. 4.5, we restrict $l > 1$ and $m > 1$ respectively to avoid the trivial case of conformal inclusions.
4.3. $SU(N) \subset SU(N)_{m'} \times SU(N)_{m''}$. In this subsection we consider the coset $H \subset G_L$ with $H := SU(N)$, $G_L := SU(N)_{m'} \times SU(N)_{m''}$, where the embedding $H \subset G_L$ is diagonal. Let $\Lambda_1, \dots, \Lambda_{N-1}$ be the fundamental weights of $sl(N)$. Let $k \in \mathbb{N}$. Recall that the set of integrable weights of the affine algebra $\widehat{sl(N)}$ at level $k$ is the following subset of the weight lattice of $sl(N)$:
\[ P_{++}^{(h)} = \{\lambda = \lambda_1\Lambda_1 + \cdots + \lambda_{N-1}\Lambda_{N-1} \mid \lambda_i \in \mathbb{N}, \ \lambda_1 + \cdots + \lambda_{N-1} < h\}, \]
where $h = k + N$. This set admits a $\mathbb{Z}_N$ automorphism generated by
\[ \sigma_1 : \lambda = (\lambda_1, \lambda_2, \dots, \lambda_{N-1}) \to \sigma_1(\lambda) = \Big(h - \sum_{j=1}^{N-1}\lambda_j, \ \lambda_1, \dots, \lambda_{N-2}\Big). \]
We define the color $\tau(\lambda) :\equiv \sum_i (\lambda_i - 1)\,i \ \mathrm{mod}(N)$ and $Q$ to be the root lattice of $\widehat{sl(N)}$ (cf. §1.3 of [KW]). Note that $\lambda \in Q$ if and only if $\tau(\lambda) \equiv 0 \ \mathrm{mod}(N)$.

As in 2.1, we use $i$ (resp. $\alpha$) to denote the irreducible positive energy representations of $LG$ (resp. $LH$). To compare our notations with that of §2.7 in [KW], note that our $i$ is $(\Lambda', \Lambda'')$ of [KW], and our $\alpha$ is $\Lambda$ of [KW]. We will identify $i = (\Lambda', \Lambda'')$ and $\alpha = \Lambda$, where $\Lambda', \Lambda'', \Lambda$ are the weights of $sl(N)$ at levels $m'$, $m''$, $m' + m''$ respectively. Denote by $0', 0'', 0$ the vacuum representations of $\widehat{sl(N)}$ at level $m'$, $m''$ and $m' + m''$ respectively. Note that for $i = (\Lambda', \Lambda'')$, $\alpha = \Lambda$, by Theorem 1.2 of [X1] the statistical dimensions of $i$ and $\alpha$ are given by
\[ d_i = \frac{a(\Lambda')a(\Lambda'')}{a(0')a(0'')}, \qquad d_\alpha = \frac{a(\Lambda)}{a(0)}, \]
where the positive numbers $a(\Lambda)$, $a(\Lambda')$ and $a(\Lambda'')$ are defined as in (0.4b) of [KW] ($a(\Lambda)$ is also equal to $S^{(\Lambda)}_{\Lambda_0}$ as defined on p. 362 of [X1]). Suppose $i = (\Lambda_1', \Lambda_1'')$, $j = (\Lambda_2', \Lambda_2'')$, $k = (\Lambda_3', \Lambda_3'')$, $\alpha = \Lambda_1$, $\beta = \Lambda_2$, $\delta = \Lambda_3$. Then the fusion coefficients $N_{ij}^k := N_{\Lambda_1'\Lambda_2'}^{\Lambda_3'} N_{\Lambda_1''\Lambda_2''}^{\Lambda_3''}$ (resp. $N_{\alpha\beta}^\delta := N_{\Lambda_1\Lambda_2}^{\Lambda_3}$) of $LG$ (resp. $LH$) are well known and they are given by the Verlinde formula (cf. Cor. 1 on p. 536 of [W2] and p. 288 of [Kac]).

Recall $\pi_{i,\alpha}$ are the covariant representations of the coset $H \subset G_L$. The set of all $(i, \alpha) := (\Lambda', \Lambda'', \Lambda)$ which appear in the decompositions of $\pi^i$ of $LG$ with respect to $LH$ is denoted by exp. This set is determined on p. 194 of [KW] to be: $(\Lambda', \Lambda'', \Lambda) \in$ exp iff $\Lambda' + \Lambda'' - \Lambda \in Q$. The $\mathbb{Z}_N$ action on $(i, \alpha)$, $\forall i$, $\forall\alpha$ is denoted by
\[ \sigma(i, \alpha) := (\sigma(i), \sigma(\alpha)) = (\sigma(\Lambda'), \sigma(\Lambda''), \sigma(\Lambda)), \quad \sigma \in \mathbb{Z}_N. \]
These are also known as diagram automorphisms since they correspond to the automorphisms of Dynkin diagrams. Note that $d_{\sigma(i)} = d_i$, $d_{\sigma(\alpha)} = d_\alpha$ by (3.2) of [Wal] and the formula for statistical dimensions above. Also note that this $\mathbb{Z}_N$ action preserves exp and therefore induces a $\mathbb{Z}_N$ action on exp. For each $(i, \alpha) \in$ exp, we will denote by $[i, \alpha]$ its orbit in exp under the $\mathbb{Z}_N$ action.

Theorem 4.6. Let $H \subset G_L$ be as in the previous paragraph. Then:
(1) $a_{(i,\alpha)\otimes 1} = \sigma_i a_{1\otimes\bar{\alpha}}$ for all $(i, \alpha) \in$ exp;
(2) Assume the action of $\mathbb{Z}_N$ on exp is faithful, i.e., if $\sigma(i) = i$, $\sigma(\alpha) = \alpha$ for some $(i, \alpha) \in$ exp, then $\sigma = \mathrm{id}$. Then the covariant representations $\pi_{i,\alpha}$ are irreducible and $\pi_{i,\alpha}$ is unitarily equivalent to $\pi_{j,\beta}$ as covariant representations iff $\sigma(i) = j$, $\sigma(\alpha) = \beta$ for some $\sigma \in \mathbb{Z}_N$;
(3) Suppose the conditions of (2) hold. Denote by EXP the set of all irreducible localized sectors corresponding to the covariant representations $\pi_{i,\alpha}$, $\forall (i, \alpha) \in$ exp. Then the set EXP is in one to one correspondence with the set $\{[i, \alpha], \forall (i, \alpha) \in \mathrm{exp}\}$. Denote the elements of EXP by $[i, \alpha]$. Define
\[ C_{[i,\alpha][j,\beta]}^{[k,\delta]} := \sum_{\sigma \in \mathbb{Z}_N} N_{ij}^{\sigma(k)} N_{\alpha\beta}^{\sigma(\delta)}. \]
Then the compositions are given by:
\[ [i, \alpha][j, \beta] = \sum_{[k,\delta]} C_{[i,\alpha][j,\beta]}^{[k,\delta]} [k, \delta]; \]
(4) Conjecture 2.13 of 2.3 is true for $H \subset G_L$.

Proof. By Cor. 4.5 and Theorem 4.1, for any $(i, \alpha)$, $(i', \alpha')$ we have:
\[
\begin{aligned}
\langle a_{1\otimes\bar{\alpha}}\sigma_i, a_{1\otimes\bar{\alpha}'}\sigma_{i'}\rangle &= \langle a_{1\otimes\bar{\alpha}}a_{1\otimes\alpha'}, \sigma_{\bar{i}}\sigma_{i'}\rangle = \langle a_{1\otimes\bar{\alpha}\alpha'}, \sigma_{\bar{i}}\sigma_{i'}\rangle \\
&= \Big\langle\sum_\beta N_{\bar{\alpha}\alpha'}^\beta a_{1\otimes\beta}, \sum_j N_{\bar{i}i'}^j \sigma_j\Big\rangle \\
&= \Big\langle\sum_\beta N_{\bar{\alpha}\alpha'}^\beta \, 1 \otimes \beta, \sum_{j,\delta} N_{\bar{i}i'}^j (j, \delta) \otimes \delta\Big\rangle \\
&= \sum_\beta N_{\bar{\alpha}\alpha'}^\beta \sum_j N_{\bar{i}i'}^j \langle 1, (j, \beta)\rangle,
\end{aligned}
\]
where 1 in the last $=$ stands for the identity sector of $A(I)$, which by Theorem 2.4 corresponds to the representation $\pi_{0,0}$. So $\langle 1, (j, \beta)\rangle \neq 0$ if and only if $\pi_{0,0}$ appears as an irreducible summand in $\pi_{j,\beta}$ by the remarks after Lemma 4.2. Note that the vacuum vector (unique up to a nonzero scalar) of $\pi_{0,0}$ has lowest energy (the eigenvalue of the generator of the rotation group) 0, and $\pi_{0,0}$ is the unique (up to unitary equivalence) irreducible representation with this property (cf. remarks after Prop. 2.1). So $\pi_{0,0}$ appears as an irreducible summand in $\pi_{j,\beta}$ if and only if there exists a nonzero (vacuum) vector in $\mathcal{H}_{j,\beta}$ with lowest energy 0. The set of such $(j, \beta)$ was introduced on p. 186 in [KW], where our $(j, \beta)$ corresponds to $(M, \mu \ \mathrm{mod}(\delta))$ in the notation of [KW] (in the notation on p. 186 of [KW], $h_M - h_\mu \geq 0$ is the eigenvalue of the generator of the rotation group in the coset Hilbert space by definition). This set in our case of diagonal inclusions is determined in (2.7.12) of [KW]. Translating (2.7.12) of [KW] into the notations of this paper, the statement is that $\pi_{0,0}$ appears as an irreducible
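summand in $\pi_{j,\beta}$ iff there exists $\sigma \in \mathbb{Z}_N$ such that $\sigma(0, 0) = (j, \beta)$, where 0 is used to denote the vacuum representation of $G$ and $H$. Since (cf. (3.3) of [Wal])
\[ N_{\bar{i}i'}^{\sigma(0)} = \delta_{\sigma(i),i'}, \qquad N_{\bar{\alpha}\alpha'}^{\sigma(0)} = \delta_{\sigma(\alpha),\alpha'}, \]
we have:
\[ \langle a_{1\otimes\bar{\alpha}}\sigma_i, a_{1\otimes\bar{\alpha}'}\sigma_{i'}\rangle = \sum_{\sigma \in \mathbb{Z}_N} \delta_{\sigma(i),i'}\,\delta_{\sigma(\alpha),\alpha'}\,\langle\sigma(0, 0), (0, 0)\rangle. \qquad (**) \]

Ad (1): We will prove (1) by using an "exhaustion" trick similar to the one used in §3 of [X3]. Suppose $(i, \alpha) = (i', \alpha') = (0, \alpha)$, where 0 denotes the vacuum sector of $LG$. Note that $\sigma(0) = 0$ iff $\sigma = 1$. So we conclude from (**) that $a_{1\otimes\bar{\alpha}}$ is irreducible. By (2) of Prop. 4.3, $a_{1\otimes\bar{\alpha}} = a_{(0,\alpha)\otimes 1}$ if $(0, \alpha) \in$ exp, and so $d_{(0,\alpha)} = d_\alpha$. Note that (cf. the paragraph before Prop. 4.3) for fixed $i$, the statistical dimension of $\rho\sigma_i\bar{\rho}$ is given by $\sum_\alpha d_{(i,\alpha)} d_\alpha$, where the sum is over those $\alpha$ with $(i, \alpha) \in$ exp; the set of such $\alpha$ will be denoted by $\mathrm{exp}_i$. Note that if $i = (\Lambda', \Lambda'')$, $\alpha = \Lambda$, then $\mathrm{exp}_i$ is a congruence class of $P_{++}^{m'+m''+N} \ \mathrm{mod}\, Q$ (congruent to $\Lambda' + \Lambda''$) by definition. So $d_\rho^2 = \frac{1}{d_i}\sum_{\alpha \in \mathrm{exp}_i} d_{(i,\alpha)} d_\alpha$. By (3) of Prop. 4.3, $d_{(i,\alpha)} \leq d_i d_\alpha$, $\forall (i, \alpha) \in$ exp, hence
\[ d_\rho^2 \leq \sum_{\alpha \in \mathrm{exp}_i} d_\alpha d_\alpha = \sum_{\alpha \in \mathrm{exp}_0} d_\alpha d_\alpha, \]
where the last $=$ follows from Cor. 2.7 of [KW]. (Footnote 3: Note that our $\alpha$ corresponds to $\Lambda$ in Cor. 2.7 of [KW], $d_\alpha = \frac{a(\Lambda)}{a(0)}$, where 0 denotes the vacuum representation, and $\mathrm{exp}_i$ is a congruence class of $P_{++}^{m'+m''+N} \ \mathrm{mod}\, Q$.) But since $d_{(0,\alpha)} = d_\alpha$, we have:
\[ d_\rho^2 = \sum_{\alpha \in \mathrm{exp}_0} d_\alpha d_\alpha. \]
It follows that all the $\leq$'s above are actually $=$, in particular $d_{(i,\alpha)} = d_i d_\alpha$, and it follows from (2) of Prop. 4.3 that $a_{(i,\alpha)\otimes 1} = \sigma_i a_{1\otimes\bar{\alpha}}$. So we have (see the paragraph before Theorem 4.6):
\[ d_{\sigma(0,0)} = d_{\sigma(0)} d_{\sigma(0)} = 1. \]
On the other hand
\[ \langle\sigma(0, 0), (0, 0)\rangle \geq 1, \]
so it follows by comparing statistical dimensions that $\langle\sigma(0, 0), (0, 0)\rangle = 1$. So we can improve (**) to
\[ \langle a_{1\otimes\bar{\alpha}}\sigma_i, a_{1\otimes\bar{\alpha}'}\sigma_{i'}\rangle = \sum_{\sigma \in \mathbb{Z}_N} \delta_{\sigma(i),i'}\,\delta_{\sigma(\alpha),\alpha'}. \qquad (*) \]

Ad (2): Since the action is faithful by assumption, at most one $\sigma \in \mathbb{Z}_N$ contributes a nonzero term to (*). If $(i, \alpha) = (i', \alpha')$, then $\sigma(i) = i'$, $\sigma(\alpha) = \alpha'$ iff $\sigma = \mathrm{id}$, and we conclude from (*) that
\[ \langle a_{1\otimes\bar{\alpha}}\sigma_i, a_{1\otimes\bar{\alpha}}\sigma_i\rangle = 1, \]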
i.e., $a_{1\otimes\bar{\alpha}}\sigma_i$ is irreducible. By (1) of Prop. 4.3 and (1) of Theorem 4.6 we conclude that if $(i, \alpha) \in$ exp, then $(i, \alpha)$ is irreducible, i.e., $\pi_{i,\alpha}$ is irreducible by the remarks after Lemma 4.2. If $\sigma(i) = j$, $\sigma(\alpha) = \beta$, then
\[ 1 \geq \langle a_{1\otimes\bar{\alpha}}\sigma_i, a_{1\otimes\bar{\beta}}\sigma_j\rangle \geq 1, \]
where the first $\geq$ follows from the fact that $a_{1\otimes\bar{\alpha}}\sigma_i$, $a_{1\otimes\bar{\beta}}\sigma_j$ are irreducible, and the second $\geq$ follows from (*). So we must have $a_{1\otimes\bar{\alpha}}\sigma_i = a_{1\otimes\bar{\beta}}\sigma_j$. By (1) of Prop. 4.3 and (1) of Theorem 4.6,
\[ \langle(i, \alpha), (j, \beta)\rangle = 1, \]
so $(i, \alpha)$ is identical to $(j, \beta)$ as sectors, since both sectors are irreducible. Hence $\pi_{i,\alpha}$ is unitarily equivalent to $\pi_{j,\beta}$ as covariant representations by the remarks after Lemma 4.2. On the other hand if $\pi_{i,\alpha}$ is unitarily equivalent to $\pi_{j,\beta}$ as covariant representations, then $(i, \alpha)$ is equal to $(j, \beta)$ as sectors by the remarks after Lemma 4.2. By (1) of Prop. 4.3 and (1) of Theorem 4.6, $a_{1\otimes\bar{\alpha}}\sigma_i = a_{1\otimes\bar{\beta}}\sigma_j$, so we must have $\sigma(i) = j$, $\sigma(\alpha) = \beta$ for some $\sigma \in \mathbb{Z}_N$ by (*). This proves (2).

Ad (3): Note that if $(i, \alpha) \in$ exp, $(j, \beta) \in$ exp, by (1) and Theorem 4.1 we have:
\[ a_{(i,\alpha)\otimes 1}a_{(j,\beta)\otimes 1} = \sigma_i\sigma_j a_{1\otimes\bar{\alpha}}a_{1\otimes\bar{\beta}} = \sum_{k,\delta} N_{ij}^k N_{\alpha\beta}^\delta \sigma_k a_{1\otimes\bar{\delta}}. \]
Note by the characterization of exp (cf. p. 194 of [KW]) and 4.3.4 of [FKW], if $N_{ij}^k N_{\alpha\beta}^\delta \neq 0$, then $(k, \delta) \in$ exp. Using (1) again we have:
\[ a_{(i,\alpha)\otimes 1}a_{(j,\beta)\otimes 1} = \sum_{(k,\delta) \in \mathrm{exp}} N_{ij}^k N_{\alpha\beta}^\delta a_{(k,\delta)\otimes 1}. \]
(3) now follows from (1) of Prop. 4.3 and (2).

Ad (4): By (1) we need to show that
\[ \frac{b(i, \alpha)}{b(0, 0)} = d_i d_\alpha, \]
where under the identification of $i$ with $(\Lambda', \Lambda'')$ and $\alpha$ with $\Lambda$, $b(i, \alpha)$ is given by (2.7.14) of [KW], and using the notations there we have
\[ b(i, \alpha) = b(\Lambda', \Lambda''; \Lambda) = N a(\Lambda')a(\Lambda'')a(\Lambda). \]
Note that $d(\Lambda) = \frac{a(\Lambda)}{a(0)}$, and we have similar identities when $\Lambda, 0$ is replaced by $\Lambda', 0'$ and $\Lambda'', 0''$. The proof of (4) is now complete by definitions. □
Note that by (1) of Theorem 4.6 and formula (*) if $\sigma(i) = i$, $\sigma(\alpha) = \alpha$ for some $\sigma \neq \mathrm{id}$, $(i, \alpha) \in$ exp, then $a_{(i,\alpha)\otimes 1}$ is not irreducible. Hence $(i, \alpha)$ is not irreducible by (1) of Prop. 4.3. This happens for example when $N = 2$, $m' \in 2\mathbb{N}$ and $m'' = 2$, which is related to supersymmetric conformal algebras (cf. p. 195 of [KW]).

Recall from Subsect. 2.3 that coset $W_N$ algebras with critical parameters are defined to be the irreducible conformal precosheaves of cosets $SU(N)_{m+1} \subset SU(N)_m \times SU(N)_1$, which obviously satisfy (1) to (4) of Theorem 4.6. For $N = 2$, Theorem 4.6 is obtained in [Luke] by different methods. So we obtain a proof of a long-standing conjecture about representations of coset $W_N$ ($N > 2$) algebras with critical parameters, which is similar in the type A case to Conjecture 3.4 and Theorem 4.3 stated in §§3 and 4 of [FKW]. To compare (3) of Theorem 4.6 to Theorem 4.3 of [FKW], note that $(p, p')$ in [FKW] is identified with $(m', m' + 1)$ in our case, and $(\lambda_i, \lambda_i')$ in Theorem 4.3 of [FKW] is identified with $[\Lambda_i', \Lambda_i'', \Lambda_i]$ in our case with $\lambda_i = \Lambda_i'$, $\lambda_i' = \Lambda_i$, and $\Lambda_i''$ is the unique element such that $\Lambda_i' + \Lambda_i'' - \Lambda_i \in Q$. Then the compositions in Theorem 4.3 of [FKW] can be easily checked to be the same as (3) of Theorem 4.6 under the assumptions of Theorem 4.3 in [FKW].

4.4. More examples. Let us consider the case $H \subset G_m$ with $G = SU(l)$, where $H$ is the Cartan subgroup of $G$, an $(l-1)$-dimensional torus. This coset does not verify the conditions of Theorem 2.4, but Theorem 2.4 applies to this coset by the remark before Cor. 3.4. So we can apply the results of Sect. 4. Recall the equivalence relation on $\mathbb{Z}^{l-1}$ in 3.1 defined by: $(n_1, \dots, n_{l-1}) \sim (n_1', \dots, n_{l-1}')$ iff there exists $(m_1, \dots, m_{l-1}) \in \mathbb{Z}^{l-1}$ with $m_1 + \cdots + m_{l-1} \in l\mathbb{Z}$ such that $(n_1', \dots, n_{l-1}') = (n_1 + mm_1, \dots, n_{l-1} + mm_{l-1})$. Denote the equivalence class of $(n_1, \dots, n_{l-1})$ by $[n_1, \dots, n_{l-1}]$, or simply $[n]$; these classes are used to denote irreducible projective representations of $LH$. Let $\Lambda$ be an irreducible weight of $SU(l)$ at level $m$. Let $\tau(\Lambda)$ be the color of $\Lambda$, see 4.3. Then:
\[ \mathcal{H}^\Lambda = \bigoplus_{\{[n] \,\mid\, (\sum_i n_i - \tau(\Lambda)) \in l\mathbb{Z}\}} \mathcal{H}_{\Lambda,[n]} \otimes \mathcal{H}_{[n]}, \]
cf. §2.6 of [KW]. So the set $\mathrm{exp} = \{(\Lambda, [n]) \mid (\sum_i n_i - \tau(\Lambda)) \in l\mathbb{Z}\}$ is determined, and $(i, \alpha) = (\Lambda, [n])$ in the present case. Note that the sector $[n]$ corresponds to an automorphism and has statistical dimension 1 (cf. Sect. 3 before Cor. 3.4). So $a_{1\otimes[n]}$ has statistical dimension 1, and it follows that $\bar{a}_{1\otimes[n]}a_{1\otimes[n]}$ is the identity sector. So
\[ \langle a_{1\otimes[n]}\sigma_\Lambda, a_{1\otimes[n]}\sigma_\Lambda\rangle = \langle\bar{a}_{1\otimes[n]}a_{1\otimes[n]}\sigma_\Lambda, \sigma_\Lambda\rangle = \langle\sigma_\Lambda, \sigma_\Lambda\rangle = 1. \]
Hence $a_{1\otimes[n]}\sigma_\Lambda$ is irreducible for any $[n]$, $\Lambda$. So by Cor. 3.4 and (2) of Prop. 4.3 if $(\Lambda, [n]) \in$ exp, $a_{(\Lambda,[n])\otimes 1} = a_{1\otimes[n]}\sigma_\Lambda$, and by (1) of Prop. 4.3 we have the following fusion rules:
\[ (\Lambda, [n])(\Lambda', [n']) = \sum_{\Lambda''} N_{\Lambda\Lambda'}^{\Lambda''} (\Lambda'', [n + n']), \]
for any $(\Lambda, [n]), (\Lambda', [n']) \in$ exp. By using (2.6.14) of [KW], one can immediately check that Conj. 2.13 of 2.3 is true in this case, as we did in the proof of (4) of Theorem 4.6. Note this coset is related to Parafermions, see [DL] for an approach using vertex operator algebras.

Next we consider the case $SU(2)_{4k} \subset SU(3)_k$. When $k = 1$, $SU(2)_4 \subset SU(3)_1$ is a conformal inclusion, so by Cor. 4.5, the coset $SU(2)_{4k} \subset SU(3)_k$, $k > 1$ is rational, i.e., it verifies Conj. 2.12 of 2.3. In this particular example, we will use the notations in §6 of [DJ1]. Thus the weights of $\widehat{sl(2)}$ are labeled by an integer $l$ with 0 denoting the vacuum representation and the weights of $\widehat{sl(3)}$ are labeled by a pair of integers $(pq)$ with $(00)$ denoting the vacuum representation. So $(i, \alpha) := (pq, l)$. In general, the identifications of $(i, \alpha) = (j, \beta)$ as sectors are related to certain Dynkin diagram automorphisms, as in (2) of Theorem 4.6. See [SY] for more examples. However, in [DJ1], a "Maverick" coset is given which violates the above identification rule. This is the coset $SU(2)_8 \subset SU(3)_2$. From the first line of the Table on p. 4117 of [DJ1], we have:
\[ (00, 0) = (00, 8) = (11, 4) \]
as sectors by Theorem 2.4 and the fact that the vacuum representation is the unique (up to unitary equivalence) representation which contains a nonzero (vacuum) vector with lowest energy 0. By using the above equation, Theorem 4.1 and Prop. 4.3, we find that the irreducible subsectors of $a_{(pq,l)\otimes 1}$ generate a 6 dimensional ring whose basis is:
\[
\begin{aligned}
& 1, \\
x &:= a_{(00,4)\otimes 1} = a_{1\otimes 4} - \sigma_{11}, \\
y &:= a_{(10,2)} = \sigma_{10}a_{1\otimes 2} - \sigma_{02}a_{1\otimes 2}, \\
\bar{y} &:= a_{(01,2)} = \bar{a}_{(10,2)}, \\
z &:= a_{(10,4)} = \sigma_{02}, \\
\bar{z} &:= a_{(01,4)} = \bar{a}_{(10,4)},
\end{aligned}
\]
where $\bar{a}$ is the conjugate of $a$. The statistical dimensions of $a_{(00,4)\otimes 1}$, $a_{(10,2)}$, $a_{(10,4)}$ are given by $\frac{\sqrt{5}+1}{2}$, $\frac{\sqrt{5}+1}{2}$, 1 respectively. The fusion rules are also completely determined by the above formula, and we have
\[ [x^2] = [1] + [x], \quad [y\bar{y}] = [1] + [x], \quad [z^3] = [1], \quad [y] = [xz]. \]
By (1) of Prop. 4.3, the above formula determines the structure of the ring generated by the coset sectors. Conjecture 2.13 of 2.3 is also easily verified in this case. For more such "Maverick" cosets, see [DJ2].

Acknowledgement. I'd like to thank the referee for helpful suggestions. This work is partially supported by NSF grant DMS-9820935.
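The six dimensional ring of the $SU(2)_8 \subset SU(3)_2$ example in 4.4 can be sanity-checked mechanically. The sketch below is an illustration only and rests on assumptions that are not spelled out in the text: the ring is treated as commutative, the basis element $x^a z^b$ is encoded as the pair `(a, b)`, and the conjugates are taken to be $\bar{z} = z^2$ and $\bar{y} = xz^2$ (inferred from $[y] = [xz]$, $[z^3] = [1]$ and $x$ being self-conjugate). All names are hypothetical.

```python
from collections import Counter

# Basis element x^a z^b is encoded as (a, b), a in {0, 1}, b in {0, 1, 2}.
ONE, X, Z, ZBAR, Y, YBAR = (0, 0), (1, 0), (0, 1), (0, 2), (1, 1), (1, 2)
PHI = (1 + 5 ** 0.5) / 2  # statistical dimension of x

def mul_basis(p, q):
    """Product of two basis elements, using x^2 = 1 + x and z^3 = 1."""
    (a, b), (c, d) = p, q
    e = (b + d) % 3
    if a + c < 2:
        return Counter({(a + c, e): 1})
    return Counter({(0, e): 1, (1, e): 1})

def mul(u, v):
    """Bilinear extension of mul_basis to non-negative integer combinations."""
    out = Counter()
    for p, m in u.items():
        for q, n in v.items():
            for r, c in mul_basis(p, q).items():
                out[r] += m * n * c
    return out

def dim(u):
    """Statistical dimension: d_x = PHI, d_z = 1, extended additively."""
    return sum(n * (PHI if a else 1.0) for (a, _), n in u.items())

y, ybar = Counter({Y: 1}), Counter({YBAR: 1})
print(mul(y, ybar) == Counter({ONE: 1, X: 1}))
```

With this encoding the product of `y` and `ybar` is `1 + x`, in agreement with $[y\bar{y}] = [1] + [x]$, and `dim` is multiplicative on the basis because $\left(\frac{\sqrt{5}+1}{2}\right)^2 = 1 + \frac{\sqrt{5}+1}{2}$.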
References [ABI] [B] [BBSS] [D]
Altschuler, D., Bauer, M. and Itzykson, C.: The branching rules of conformal embeddings. Commun. Math. Phys. 132, 349–364 (1990) Borcherds, R.: Vertex algebras, Kac–Moody algebras and the Monster. Proc. Nat. Acad. Sci. USA 83, 3068–3071 (1986) Bais, F.A., Bouwknegt, P., Schoutens, K., Surridge, M.: Coset construction for extended Virasoro algebras. Nucl. Phys. B 304, 371–391 (1988) Dong, C.: Introduction to vertex operator algebras I: S¯urikaisekikenky¯usho K¯o ky¯uroku No. 904, 1–25 (1995). Also see q-alg/9504017
42
[DL] [DJ1] [DJ2] [Dix] [FG] [FKW] [FLM] [FZ] [GKO] [GL] [GW] [Haag] [J] [KW] [Kacv] [Kac] [KS] [KT] [L1] [L2] [L3] [L4] [L5] [L6] [LR] [Luke] [MT] [MS] [Mv] [Po] [PP] [PS] [RS] [Stra] [SY] [Wal] [We] [W1]
F. Xu
Dong, C. and Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Progress in Mathematics 112 (1993) Dunbar, D. and Joshi, K.: Characters for Coset conformal field theories and Maverick examples. Inter. J. Mod. Phys. A 8 No. 23, 4103–4121 (1993) Dunbar, D. and Joshi, K.: Maverick examples of Coset conformal field theories. Mod. Phys. Letters A 8 29, 2803–2814 (1993) Dixmier, J.: von Neumann Algebras. Amsterdam: North-Holland Publishing Company, 1981 Fröhlich, J. and Gabbiani, F.: Operator algebras and Conformal field theory. Commun. Math. Phys. 155, 569–640 (1993) Frenkel, E., Kac, V. and Wakimoto, M.: Characters and Fusion rules for W-algebras via Quantized Drinfeld–Sokolov Reductions. Commun. Math. Phys. 147, 295–328 (1992) Frenkel, I.B., Lepowsky, J. and Ries, J.: Vertex operator algebras and the Monster. New York: Academic, 1988 Frenkel, I. and Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66 1, 123–168 (1992) Goddard, P., Kent,A. and Olive, D.: Unitary representations ofVirasoro and super-Virasoro algebras. Commun. Math. Phys. 103 1, 105–119 (1986) Guido, D. and Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996) Goodman, R. and Wallach, N.: Structure and unitary cocycle representations of loop groups and the group of diffeomorphisms of the circle. J. Reine Angew. Math. 347, 69–133 (1984) Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, 1992 Jones, V.: Fusion en al´gebres de Von Neumann et groupes de lacets (d’aprés A. Wassermann). Seminarie Bourbaki 800, 1–20 (1995) Kac, V.G. and Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Adv. in Math. 70, 156–234 (1988) Kac, V.G.: Vertex algebras for beginners. Providence, RI: AMS, 1997 Kac, V.G.: Infinite dimensional Lie algebras. 3rd Edition, Cambridge: Cambridge University Press, 1990 Karabali, D. 
and Schnitzer, H.J.: BRST quantization of the gauged WZW action and coset conformal field theories. Nucl. Phys. B 329 3, 649–666 (1990) Tsuchiya, A. and Kanie, Y.: Vertex Operators in conformal field theory on P 1 and monodromy representations of braid group. Adv. Studies in Pure Math. 16, 297–372 (1988) Longo, R.: Proceedings of International Congress of Mathematicians. Cambridge, MA: International Press, 1994, pp. 1281–1291 Longo, R.: Duality for Hopf algebras and for subfactors, I. Commun. Math. Phys. 159, 133–150 (1994) Longo, R.: Index of subfactors and statistics of quantum fields, I. Commun. Math. Phys. 126, 217–247 (1989) Longo, R.: Index of subfactors and statistics of quantum fields II. Commun. Math. Phys. 130, 285–309 (1990) Longo, R.: An analogue of the Kac–Wakimoto formula and black hole conditional entropy. Commun. Math. Phys. 186, 451–479 (1997) Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992) Longo, R. and Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) Luke, T.: Operator algebras and Conformal Field Theory of the Discrete series representations of Diff(S 1 ). Dissertation, Cambridge (1994) Takesaki, M.: Conditional expectation in von Neumann algebra. J. Funct. Anal. 9, 306–321 (1972) Moore, G. and Seiberg, N.: Taming the conformal zoo. Lett. Phys. B 220, 422–430 (1989) Murray, F.J. and v. Neumann, J.: On rings of Operators. Ann. Math. 37, 116–229 (1936) Popa, S.: Classification of amenable subfactors of type II. Acta Math. 172, 352–445 (1994) Pimsner, M. and Popa, S.: Entropy and index for subfactors. Ann. Éc. Norm. Sup. 19, 57–106 (1986) Pressly, A. and Segal, G.: Loop Groups, Oxford: Oxford University Press, 1986 Reed, M. and Simon, B.: Methods of Mathematical Physics I: Functional Analysis. London–New York: Academic Press, 1980 Stratila, S.: Modular Theory in Operator Algebras. Editura Academiei, 1981 Schellekens, A.N. 
and Yankielowicz, S.: Field identification fixed points in the coset construction. Nucl. Phys. B 334, 67–102 (1990) Walton, M.: Fusion rules of Wess–Zumino–Witten Models. Nuc. Phys. B 340, 777–789 (1990) Wenzl, H.: Hecke algebras of type A and subfactors. Invent. Math. 92, 345–383 (1988) Wassermann, A.: Proceedings of International Congress of Mathematicians, Cambridge, Ma: Inernational Press, 1994, pp. 966–979
[W2] Wassermann, A.: Operator algebras and Conformal field theories III. Invent. Math. 133, 467–538 (1998)
[Watts] Watts, G.M.T.: W-algebras and coset models. Phys. Lett. B 245, 65–71 (1990)
[Witten] Witten, E.: The central charge in three dimensions. V.G. Knizhnik's Memorial Volume, 1989, pp. 530–559
[X1] Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 349–403 (1998)
[X2] Xu, F.: Applications of braided endomorphisms from conformal inclusions. Inter. Math. Res. Notice. 1, 5–23 (1998); see also q-alg/9708013, and Erratum, Inter. Math. Res. Notice 8 (1998)
[X3] Xu, F.: Jones–Wassermann subfactors for disconnected intervals. q-alg/9704003
[Y] Yamagami, S.: A note on Ocneanu's approach to Jones index theory. Internat. J. Math. 4, 859–871 (1993)

Communicated by H. Araki
Commun. Math. Phys. 211, 45 – 61 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Global Weak Solutions for a Shallow Water Equation Adrian Constantin, Luc Molinet Institut für Mathematik, Universität Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland Received: 26 June 1999 / Accepted: 21 October 1999
Abstract: We show the existence and uniqueness of global weak solutions for an equation describing the motion of waves at the free surface of shallow water under the influence of gravity.

1. Introduction

The nonlinear partial differential equation
\[
\begin{cases}
u_t - u_{txx} + 3uu_x = 2u_x u_{xx} + u u_{xxx}, & t > 0,\ x \in \mathbb{R}, \\
u(0,x) = u_0(x), & x \in \mathbb{R},
\end{cases}
\tag{1.1}
\]
was obtained by the method of recursion operators by Fokas and Fuchssteiner [14] as a bi-Hamiltonian equation. Its physical derivation as a model for the unidirectional propagation of waves at the free surface of a shallow layer of water (with $u(t,x)$ representing the water's free surface above a flat bottom) is due to Camassa and Holm [3]. Equation (1.1) was also found independently by Dai [13] as a model for nonlinear waves in cylindrical hyperelastic rods with $u(t,x)$ representing the radial stretch relative to a pre-stressed state. Camassa and Holm [3] found that the solitary waves for (1.1) are
\[
u_c(x,t) = c\,\varphi(x - ct), \qquad x \in \mathbb{R},
\tag{1.2}
\]
where $\varphi(x) := e^{-|x|}$, $x \in \mathbb{R}$. Moreover, the solitary waves are solitons: they retain their individuality under interaction and eventually emerge with their original shapes and speeds (see [2–4]). The solitons are peaked waves and have to be understood (see [6]) as weak solutions of this equation: one can write (1.1) as a conservation law
\[
u_t + u u_x + \partial_x p * \Big[u^2 + \frac{1}{2} u_x^2\Big] = 0,
\tag{1.3}
\]
with $p(x) := \frac{1}{2} e^{-|x|}$, $x \in \mathbb{R}$. According to Whitham [23], the solitary waves which are found in the shallow water theory have a symmetrical peaking of the crests with a finite angle there.

In [6] results on the existence of classical solutions to (1.1) are obtained. The only way that a classical solution of Eq. (1.1) fails to exist for all time is that the wave breaks [5]: the solution remains bounded while its slope becomes unbounded in finite time. For results on wave breaking we refer to [8,19]. A model for waves on shallow water which presents wave breaking and soliton interaction phenomena was long sought after [23]. Other interesting aspects of Eq. (1.1) are its integrability as an infinite dimensional Hamiltonian system¹ and the fact that the shallow water flow is just a transcription of the geodesic flow in the group of proper diffeomorphisms of the line modelled on $H^1(\mathbb{R})$, see [15,18,20].

For initial profiles $u_0 \in H^s(\mathbb{R})$ with $s > \frac{3}{2}$, it is known [16,22] that (1.1) has a unique solution in $C([0,T); H^s(\mathbb{R}))$ for some $T > 0$. The fact that the solitons do not belong to the spaces $H^s(\mathbb{R})$ with $s > \frac{3}{2}$ motivates the study of weak solutions to (1.1) that are suitable to treat soliton interaction. Some first results in this direction were recently obtained. In [7] it is proved that if $u_0 \in H^1(\mathbb{R})$ is such that $y_0 := u_0 - u_{0,xx} \in \mathcal{M}^+(\mathbb{R})$,² then there exists a global weak solution $u \in L^\infty((0,\infty); H^1(\mathbb{R}))$ of (1.1). It would be better to have some regularity in time for these weak solutions. Note that the solitons belong to the space $C^1(\mathbb{R}_+; L^2(\mathbb{R})) \cap C(\mathbb{R}_+; H^1(\mathbb{R}))$. The aim of this paper is to prove the following

Theorem. If $u_0 \in H^1(\mathbb{R})$ is such that $y_0 := u_0 - u_{0,xx} \in \mathcal{M}^+(\mathbb{R})$, then equation (1.3) has a unique solution $u \in C^1(\mathbb{R}_+; L^2(\mathbb{R})) \cap C(\mathbb{R}_+; H^1(\mathbb{R}))$ with initial data $u(0) = u_0$ and such that the total variation of $y(t,\cdot) := u(t,\cdot) - u_{xx}(t,\cdot) \in \mathcal{M}^+(\mathbb{R})$ is uniformly bounded on $\mathbb{R}_+$. Moreover, $E(u) := \int_{\mathbb{R}} (u^2 + u_x^2)\,dx$ and $F(u) := \int_{\mathbb{R}} (u^3 + u u_x^2)\,dx$ are conservation laws.

The method of the proof relies on the approximation of the initial data $u_0 \in H^1(\mathbb{R})$ by smooth functions producing a sequence of global solutions of (1.1) in $H^3(\mathbb{R})$. Suitable a priori estimates enable us to extract a subsequence of these solutions that converges weakly in $H^1(\mathbb{R})$. We use Helly's theorem [21] to pass to the limit in the nonlocal nonlinear term from (1.3). By a regularization technique we then show that the obtained solution depends continuously on time.

Note that we deal with the same class of initial data as in [7]. Therein the compensated compactness method was used to pass to the limit in the nonlocal nonlinear term from (1.3) in order to show the existence of a solution $u \in L^\infty(\mathbb{R}_+; H^1(\mathbb{R}))$. We use here a different approach that enables us to obtain a better time-dependence and we also show the uniqueness of the weak solutions within the considered class. The example of the solitons (1.2), where $y_0 = 2c\,\delta$ (here $\delta$ is the Dirac mass at zero), shows that our notion of weak solution is the appropriate one for Eq. (1.1).

In Sect. 3 we comment about the relevance of our result in the study of the orbital stability of the solitons. The last section is devoted to a discussion of the problem of weak solutions for (1.1) in the periodic case.

¹ The spectral and inverse spectral problem in the periodic case is treated in [10]. For the scattering problem see [1,10]; the inverse problem on the line is not fully understood.
² $\mathcal{M}(\mathbb{R})$ is the space of Radon measures on $\mathbb{R}$ with bounded total variation and $\mathcal{M}^+(\mathbb{R})$ is the subset of positive measures.
2. Proof of the Theorem

Before proceeding with the proof, let us first present some lemmas that will be of use in our approach.

Lemma 1 ([6,7,11]). Let $u_0 \in H^3(\mathbb{R})$ and assume that $y_0 := u_0 - u_{0,xx}$ is nonnegative and belongs to $L^1(\mathbb{R})$. Then the initial value problem (1.1) has a unique solution $u \in C(\mathbb{R}_+; H^3(\mathbb{R})) \cap C^1(\mathbb{R}_+; H^2(\mathbb{R}))$. Moreover, $E(u) := \int_{\mathbb{R}} (u^2 + u_x^2)\,dx$ and $F(u) := \int_{\mathbb{R}} (u^3 + u u_x^2)\,dx$ are conservation laws and if $y(t,\cdot) := u(t,\cdot) - u_{xx}(t,\cdot)$, then for every $t \ge 0$ we have
(a) $u(t,\cdot) - u_{xx}(t,\cdot) \ge 0$ and $|u_x(t,\cdot)| \le u(t,\cdot)$ on $\mathbb{R}$,
(b) $\|u(t,\cdot)\|_{L^1(\mathbb{R})} = \|y(t,\cdot)\|_{L^1(\mathbb{R})} = \|y_0\|_{L^1(\mathbb{R})}$,
(c) $\|u_x(t,\cdot)\|_{L^\infty(\mathbb{R})} \le \|y_0\|_{L^1(\mathbb{R})}$ and $\|u(t,\cdot)\|_{H^1(\mathbb{R})} = \|u_0\|_{H^1(\mathbb{R})}$.

As the previous result plays an important role in our approach, let us explain informally the importance of the positivity assumption on $y_0$, without paying attention to technical details. Note that if $g = f - f_{xx}$, then
\[
f(x) = \frac{1}{2} \int_{-\infty}^{x} e^{\xi - x} g(\xi)\,d\xi + \frac{1}{2} \int_{x}^{\infty} e^{x - \xi} g(\xi)\,d\xi, \qquad x \in \mathbb{R},
\]
while
\[
f_x(x) = -\frac{1}{2} \int_{-\infty}^{x} e^{\xi - x} g(\xi)\,d\xi + \frac{1}{2} \int_{x}^{\infty} e^{x - \xi} g(\xi)\,d\xi, \qquad x \in \mathbb{R}.
\]
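The two representation formulas above can be explored numerically. The sketch below is only an illustration under assumptions of its own: the test functions `g_pos` and `g_odd`, the cutoff, and the midpoint quadrature are not taken from the paper. It evaluates both half-line integrals and shows that a nonnegative $g$ makes both integrals nonnegative, so that $|f_x| \le f$, whereas a sign-changing $g$ can violate this bound.

```python
import math

def half_line_integrals(g, x, cutoff=12.0, steps=6000):
    """Midpoint-rule values of the two integrals appearing in f and f_x.

    left  ~ integral over (-cutoff, x) of e^{xi - x} g(xi)
    right ~ integral over (x, cutoff)  of e^{x - xi} g(xi)
    The cutoff replaces the infinite limits (an assumption of this sketch).
    """
    h = (x + cutoff) / steps
    left = h * sum(
        math.exp((-cutoff + (k + 0.5) * h) - x) * g(-cutoff + (k + 0.5) * h)
        for k in range(steps)
    )
    h = (cutoff - x) / steps
    right = h * sum(
        math.exp(x - (x + (k + 0.5) * h)) * g(x + (k + 0.5) * h)
        for k in range(steps)
    )
    return left, right

def f_and_fx(g, x):
    """Return (f(x), f_x(x)) from the two representation formulas."""
    left, right = half_line_integrals(g, x)
    return 0.5 * (left + right), 0.5 * (right - left)

# Hypothetical test data: a nonnegative g and an odd (sign-changing) g.
g_pos = lambda xi: math.exp(-xi * xi)
g_odd = lambda xi: xi * math.exp(-xi * xi)

points = [-3.0, -1.5, -0.5, 0.0, 0.5, 1.5, 3.0]
values_pos = [f_and_fx(g_pos, x) for x in points]
positivity_bound_holds = all(abs(fx) <= f for f, fx in values_pos)

# For the odd g, f(0) vanishes by symmetry while f_x(0) does not.
f0_odd, fx0_odd = f_and_fx(g_odd, 0.0)

print("|f_x| <= f for g >= 0:", positivity_bound_holds)
print("odd g at x = 0: f =", f0_odd, " f_x =", fx0_odd)
```

With a nonnegative $g$ the bound holds at every sample point because $f \pm f_x$ is one of the two nonnegative integrals; the odd example shows that the sign condition cannot simply be dropped.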
If $g(x) \ge 0$ on $\mathbb{R}$, we see that $f_x^2(x) \le f^2(x)$ on $\mathbb{R}$. As the positivity of $y(t,\cdot)$ is preserved by the flow of (1.1), cf. [6], we see that $y_0(x) \ge 0$ on $\mathbb{R}$ ensures that $u_x^2(t,x) \le u^2(t,x)$ on $\mathbb{R}$ as long as the corresponding solution $u(t,\cdot)$ to (1.1) exists. The conservation law $E(u)$ guarantees that the solution $u(t,\cdot)$ remains bounded as long as it exists and, in view of the previous considerations, the slope of the solution will also be bounded. But in classical solutions to (1.1) singularities can arise only in the form of wave breaking, i.e. the solution remains bounded while its slope becomes unbounded in finite time [5]. This reasoning shows that the positivity of $y_0$ ensures global existence for the corresponding solution to (1.1). Note that if for some $x_0 \in \mathbb{R}$ we have $y_0(x) \le 0$ for $x \le x_0$ while $y_0(x) \ge 0$ for $x \ge x_0$, with a proper change of sign, then the corresponding solution of (1.1) will blow up in finite time [5].

Let us now recall a partial integration result for Bochner spaces (below $\langle \cdot, \cdot \rangle$ is the $H^{-1}(\mathbb{R})$, $H^1(\mathbb{R})$ duality bracket).

Lemma 2 ([17]). Let $T > 0$. If $f, g \in L^2((0,T); H^1(\mathbb{R}))$ and
\[
\frac{df}{dt},\ \frac{dg}{dt} \in L^2((0,T); H^{-1}(\mathbb{R})),
\]
then $f, g$ are a.e. equal to a function continuous from $[0,T]$ into $L^2(\mathbb{R})$ and
\[
\langle f(t), g(t) \rangle - \langle f(s), g(s) \rangle = \int_s^t \Big\langle \frac{d f(\tau)}{d\tau}, g(\tau) \Big\rangle\,d\tau + \int_s^t \Big\langle \frac{d g(\tau)}{d\tau}, f(\tau) \Big\rangle\,d\tau
\]
for all $s, t \in [0,T]$.
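Throughout this paper, we will denote by $\{\rho_n\}_{n \ge 1}$ the mollifiers
\[
\rho_n(x) := \Big( \int_{\mathbb{R}} \rho(\xi)\,d\xi \Big)^{-1} n\,\rho(nx), \qquad x \in \mathbb{R},\ n \ge 1,
\]
where $\rho \in C_c^\infty(\mathbb{R})$ is defined by
\[
\rho(x) := \begin{cases} e^{1/(x^2 - 1)} & \text{for } |x| < 1, \\ 0 & \text{for } |x| \ge 1. \end{cases}
\]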
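As a quick sanity check of this definition, the following sketch evaluates $\rho$ and $\rho_n$ numerically and confirms that each $\rho_n$ has unit mass and support in $[-\frac{1}{n}, \frac{1}{n}]$. The helper `midpoint`, the constant `RHO_MASS` and the chosen values of `n` are illustrative names and choices, not part of the paper.

```python
import math

def rho(x):
    """The bump function rho(x) = exp(1/(x^2 - 1)) for |x| < 1, else 0."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def midpoint(func, a, b, steps=20000):
    """Composite midpoint rule on [a, b] (a simple quadrature choice)."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# Normalizing constant: the integral of rho over the real line.
RHO_MASS = midpoint(rho, -1.0, 1.0)

def rho_n(x, n):
    """The mollifier rho_n(x) = n * rho(n x) / (integral of rho)."""
    return n * rho(n * x) / RHO_MASS

# Each rho_n should integrate to 1 over its support [-1/n, 1/n].
masses = {n: midpoint(lambda x: rho_n(x, n), -1.0 / n, 1.0 / n) for n in (1, 4, 16)}

print("integral of rho:", RHO_MASS)
for n, mass in masses.items():
    print(f"n = {n:2d}: mass of rho_n = {mass:.12f}, rho_n(1/n) = {rho_n(1.0 / n, n)}")
```

The peak value $\rho_n(0)$ grows like $n$ while the mass stays fixed, which is the concentration property used in the approximation lemmas.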
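The next two approximation results will also be very useful.

Lemma 3. Let $f : \mathbb{R} \to \mathbb{R}$ be uniformly continuous and bounded. If $\mu \in \mathcal{M}(\mathbb{R})$, then
\[
\rho_n * (f\mu) - (\rho_n * f)(\rho_n * \mu) \xrightarrow[n \to \infty]{} 0 \quad \text{in } L^1(\mathbb{R}).
\]

Proof. Since $\operatorname{supp}(\rho_n) \subset [-\frac{1}{n}, \frac{1}{n}]$ for $n \ge 1$, we have
\[
\begin{aligned}
[\rho_n * (f\mu)](y) &= \int_{\mathbb{R}} \rho_n(y - x) f(x)\,d\mu(x) \\
&= \int_{y - \frac{1}{n}}^{y + \frac{1}{n}} \rho_n(y - x)\,[f(y) + \theta_y(x)]\,d\mu(x) \\
&= f(y) \int_{y - \frac{1}{n}}^{y + \frac{1}{n}} \rho_n(y - x)\,d\mu(x) + \int_{y - \frac{1}{n}}^{y + \frac{1}{n}} \rho_n(y - x)\,\theta_y(x)\,d\mu(x),
\end{aligned}
\]
where $\theta_y(x) := f(x) - f(y)$, $x \in \mathbb{R}$. We infer that for $n \ge 1$,
\[
[\rho_n * (f\mu)] - (\rho_n * f)(\rho_n * \mu) = (f - \rho_n * f)(\rho_n * \mu) + \Psi_n,
\tag{2.1}
\]
where
\[
\Psi_n(y) = \int_{y - \frac{1}{n}}^{y + \frac{1}{n}} \rho_n(y - x)\,\theta_y(x)\,d\mu(x), \qquad y \in \mathbb{R},\ n \ge 1.
\]
For $n \ge 1$, note that
\[
\begin{aligned}
\|\rho_n * \mu\|_{L^1(\mathbb{R})} &= \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} \varphi(y)\,(\rho_n * \mu)(y)\,dy \\
&= \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} \varphi(y) \int_{\mathbb{R}} \rho_n(y - x)\,d\mu(x)\,dy \\
&= \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} (\rho_n * \varphi)(x)\,d\mu(x) \\
&\le \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \|\rho_n * \varphi\|_{L^\infty(\mathbb{R})}\,\|\mu\|_{\mathcal{M}} \\
&\le \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \|\rho_n\|_{L^1(\mathbb{R})}\,\|\varphi\|_{L^\infty(\mathbb{R})}\,\|\mu\|_{\mathcal{M}} = \|\mu\|_{\mathcal{M}},
\end{aligned}
\tag{2.2}
\]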
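using Fubini's theorem and Young's inequality. Also, observe that
\[
\big| [\rho_n * f](y) - f(y) \big| = \Big| \int_{y - \frac{1}{n}}^{y + \frac{1}{n}} \rho_n(y - x)\,\theta_y(x)\,dx \Big| \le \sup_{x \in [y - \frac{1}{n},\, y + \frac{1}{n}]} |\theta_y(x)|, \qquad y \in \mathbb{R},\ n \ge 1.
\]
From the above relation we infer by the uniform continuity of $f$ that
\[
\|\rho_n * f - f\|_{L^\infty(\mathbb{R})} \xrightarrow[n \to \infty]{} 0.
\tag{2.3}
\]
On the other hand,
\[
\begin{aligned}
\|\Psi_n\|_{L^1(\mathbb{R})} &= \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} \varphi(y)\,\Psi_n(y)\,dy \\
&= \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} \int_{\mathbb{R}} \varphi(y)\,\rho_n(y - x)\,\theta_y(x)\,d\mu(x)\,dy \\
&\le \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} \int_{\mathbb{R}} |\varphi(y)|\,\rho_n(y - x)\,|\theta_y(x)|\,d|\mu|(x)\,dy \\
&\le \varepsilon_n \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} \int_{\mathbb{R}} |\varphi(y)|\,\rho_n(y - x)\,d|\mu|(x)\,dy,
\end{aligned}
\]
where
\[
\varepsilon_n = \sup_{y \in \mathbb{R}}\ \sup_{x \in [y - \frac{1}{n},\, y + \frac{1}{n}]} |\theta_y(x)|, \qquad n \ge 1,
\]
and $|\mu|$ is the total variation measure. We obtain that for $n \ge 1$,
\[
\begin{aligned}
\|\Psi_n\|_{L^1(\mathbb{R})} &\le \varepsilon_n \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \int_{\mathbb{R}} [\rho_n * |\varphi|](x)\,d|\mu|(x) \\
&\le \varepsilon_n \sup_{\substack{\varphi \in L^\infty(\mathbb{R}) \\ \|\varphi\|_{L^\infty(\mathbb{R})} \le 1}} \|\rho_n * |\varphi|\,\|_{L^\infty(\mathbb{R})}\,\|\mu\|_{\mathcal{M}} \le \varepsilon_n \|\mu\|_{\mathcal{M}},
\end{aligned}
\tag{2.4}
\]
using again Fubini's theorem and Young's inequality. Combining (2.1)–(2.4) we get the statement. □

Lemma 4. Let $f : \mathbb{R} \to \mathbb{R}$ be uniformly continuous and bounded. If $g \in L^\infty(\mathbb{R})$, then
\[
\rho_n * (fg) - (\rho_n * f)(\rho_n * g) \xrightarrow[n \to \infty]{} 0 \quad \text{in } L^\infty(\mathbb{R}).
\]

Proof. Going through the first steps of the proof of Lemma 3, we get
\[
[\rho_n * (fg)] - (\rho_n * f)(\rho_n * g) = (f - \rho_n * f)(\rho_n * g) + \Phi_n, \qquad n \ge 1,
\tag{2.5}
\]
where
\[
\Phi_n(y) = \int_{y - \frac{1}{n}}^{y + \frac{1}{n}} \rho_n(y - x)\,\theta_y(x)\,g(x)\,dx, \qquad y \in \mathbb{R},\ n \ge 1,
\]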
and, as before, $\theta_y(x) := f(x) - f(y)$, $x \in \mathbb{R}$. Note that by Young's inequality
\[
\|\rho_n * g\|_{L^\infty(\mathbb{R})} \le \|g\|_{L^\infty(\mathbb{R})}, \qquad n \ge 1,
\]
whereas
\[
|\Phi_n(y)| \le \|g\|_{L^\infty(\mathbb{R})} \sup_{|x - y| \le \frac{1}{n}} |f(x) - f(y)|, \qquad y \in \mathbb{R},\ n \ge 1.
\]
The previous two relations combined with (2.5) complete the proof in view of (2.3) and the uniform continuity of $f$. □

Remark. In Lemma 4 we showed that the difference of two sequences of smooth functions converges to zero in $L^\infty(\mathbb{R})$. It is interesting to note that for $g \in L^\infty(\mathbb{R})$ that is not continuous, none of the sequences is a Cauchy sequence in $L^\infty(\mathbb{R})$.

We split the proof of the theorem in two parts.

Existence proof. To show the existence of global weak solutions we proceed in several steps; namely
i) a suitable approximation of the initial data $u_0 \in H^1(\mathbb{R})$ by smooth functions $u_0^n$ produces a sequence of global solutions $u^n(t,\cdot)$ of (1.1) in $H^3(\mathbb{R})$;
ii) proving that there is a subsequence of $\{u^n\}$ which converges pointwise a.e. to a function $u \in H^1_{loc}(\mathbb{R}_+ \times \mathbb{R})$ that satisfies (1.3) in the sense of distributions;
iii) showing that $u \in C_w(\mathbb{R}_+; H^1(\mathbb{R}))$, the space of continuous functions from $\mathbb{R}_+$ with values in $H^1(\mathbb{R})$ when the latter space is equipped with its weak topology;
iv) establishing the strong continuity of the solution with respect to the temporal variable and the validity of the conservation laws $E$ and $F$.
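Before the steps are carried out, Lemma 4 and the remark following it can be illustrated numerically. The sketch below rests on assumptions that are not in the paper: the test functions $f(x) = \sin 3x$ and $g = \operatorname{sgn}$, the grid spacing `H`, and a discrete, renormalized version of $\rho_n$. It reports the sup-norm of the commutator $\rho_n * (fg) - (\rho_n * f)(\rho_n * g)$, which shrinks as $n$ grows, and the sup-distance between $\rho_n * g$ and $\rho_{2n} * g$, which does not.

```python
import math

def rho(x):
    """Bump function used to build the mollifiers."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

H = 0.0025  # grid spacing, an arbitrary choice for this sketch

def weights(n):
    """Samples of rho_n on the grid k*H, renormalized to sum to one."""
    m = int(round(1.0 / (n * H)))
    raw = [rho(n * k * H) for k in range(-m, m + 1)]
    total = sum(raw)
    return m, [r / total for r in raw]

def mollify(func, j, m, w):
    """Discrete stand-in for (rho_n * func)(j*H)."""
    return sum(w[k + m] * func((j - k) * H) for k in range(-m, m + 1))

# Hypothetical test data: f is smooth and bounded, g is a bounded jump.
f = lambda x: math.sin(3.0 * x)
g = lambda x: 1.0 if x >= 0 else -1.0
fg = lambda x: f(x) * g(x)

GRID = range(-200, 201)  # sample points y = j*H in [-0.5, 0.5]

def commutator_sup(n):
    """Sup over the grid of |rho_n*(fg) - (rho_n*f)(rho_n*g)|."""
    m, w = weights(n)
    return max(
        abs(mollify(fg, j, m, w) - mollify(f, j, m, w) * mollify(g, j, m, w))
        for j in GRID
    )

def cauchy_gap(n):
    """Sup over the grid of |rho_n*g - rho_{2n}*g|."""
    m1, w1 = weights(n)
    m2, w2 = weights(2 * n)
    return max(abs(mollify(g, j, m1, w1) - mollify(g, j, m2, w2)) for j in GRID)

commutator_sups = [commutator_sup(n) for n in (5, 10, 20, 40)]
cauchy_gaps = [cauchy_gap(n) for n in (5, 10, 20)]

print("sup |commutator| for n = 5, 10, 20, 40:", commutator_sups)
print("sup |rho_n*g - rho_2n*g| for n = 5, 10, 20:", cauchy_gaps)
```

The second list stays at a comparable size for every $n$ because $\rho_n * \operatorname{sgn}$ is a rescaling of one fixed profile, which is the phenomenon the remark points to.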
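Step i) Let $u_0 \in H^1(\mathbb{R})$ and assume that $y_0 := u_0 - u_{0,xx} \in \mathcal{M}^+(\mathbb{R})$. Define $u_0^n := \rho_n * u_0 \in H^\infty(\mathbb{R})$ for $n \ge 1$. Clearly
\[
u_0^n \xrightarrow[n \to \infty]{} u_0 \quad \text{in } H^1(\mathbb{R}).
\tag{2.6}
\]
Note that
\[
y_0^n := u_0^n - u_{0,xx}^n = \rho_n * y_0 \ge 0, \qquad n \ge 1,
\]
and
\[
\|y_0^n\|_{L^1(\mathbb{R})} \le \|y_0\|_{\mathcal{M}}, \qquad n \ge 1,
\tag{2.7}
\]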
as one can see comparing with (2.2). Let $u^n$ be the global solution of (1.1) with initial data $u_0^n$, cf. Lemma 1.

Step ii) For fixed $T > 0$, the estimates provided by Lemma 1 and the form (1.3) of the equation show by means of Young's inequality that the sequence $\{u^n\}_{n \ge 1}$ is uniformly bounded in the space $H^1((0,T) \times \mathbb{R})$. Therefore, it has a subsequence such that
\[
u^{n_k} \rightharpoonup u \quad \text{weakly in } H^1((0,T) \times \mathbb{R}) \text{ for } n_k \to \infty,
\tag{2.8}
\]
and
\[
u^{n_k} \xrightarrow[n_k \to \infty]{} u \quad \text{a.e. on } (0,T) \times \mathbb{R},
\tag{2.9}
\]
for some $u \in H^1((0,T) \times \mathbb{R})$. Note that for fixed $t \in (0,T)$, we have by Lemma 1 and (2.7) that the sequence $u_x^{n_k}(t,\cdot) \in BV(\mathbb{R})$ with
\[
\mathbb{V}[u_x^{n_k}(t,\cdot)] = \|u_{xx}^{n_k}(t,\cdot)\|_{L^1(\mathbb{R})} \le \|u^{n_k}(t,\cdot)\|_{L^1(\mathbb{R})} + \|y^{n_k}(t,\cdot)\|_{L^1(\mathbb{R})} \le 2\,\|y_0\|_{\mathcal{M}}
\]
and
\[
\|u_x^{n_k}(t,\cdot)\|_{L^\infty(\mathbb{R})} \le \|y_0^{n_k}\|_{L^1(\mathbb{R})} \le \|y_0\|_{\mathcal{M}}.
\]
Here $BV(\mathbb{R})$ is the space of functions with bounded variation and $\mathbb{V}(f)$ is the total variation of $f \in BV(\mathbb{R})$, cf. [21]. By Helly's theorem (see [21]), there exists a subsequence, denoted again $\{u_x^{n_k}(t,\cdot)\}$, which converges at every point to some function $v(t,\cdot)$ of finite variation with $\mathbb{V}(v(t,\cdot)) \le 2\,\|y_0\|_{\mathcal{M}}$. From (2.9) we have that for almost all $t \in (0,T)$, $u_x^{n_k}(t,\cdot) \to u_x(t,\cdot)$ in $\mathcal{D}'(\mathbb{R})$. This enables us to identify $v(t,\cdot)$ with $u_x(t,\cdot)$ for a.e. $t \in (0,T)$. Therefore
\[
u_x^{n_k} \xrightarrow[n_k \to \infty]{} u_x \quad \text{a.e. on } (0,T) \times \mathbb{R},
\tag{2.10}
\]
and for a.e. $t \in (0,T)$,
\[
\mathbb{V}(u_x(t,\cdot)) = \|u_{xx}(t,\cdot)\|_{\mathcal{M}} \le 2\,\|y_0\|_{\mathcal{M}}.
\tag{2.11}
\]
Let us again fix $t \in (0,T)$. Note that the sequence $\{[u^n(t,\cdot)]^2 + \frac{1}{2}[u_x^n(t,\cdot)]^2\}_{n \ge 1}$ is uniformly bounded in $L^2(\mathbb{R})$, cf. Lemma 1 and (2.6). Therefore, it has a subsequence, denoted again $\{[u^{n_k}(t,\cdot)]^2 + \frac{1}{2}[u_x^{n_k}(t,\cdot)]^2\}$, converging weakly in $L^2(\mathbb{R})$. For a.e. $t \in (0,T)$, we deduce from (2.9)–(2.10) that the weak $L^2(\mathbb{R})$-limit is $[u(t,\cdot)]^2 + \frac{1}{2}[u_x(t,\cdot)]^2$. As $p_x \in L^2(\mathbb{R})$, a.e. on $(0,T) \times \mathbb{R}$ we have
\[
\partial_x p * \Big[(u^{n_k})^2 + \frac{1}{2}(u_x^{n_k})^2\Big] \xrightarrow[n_k \to \infty]{} \partial_x p * \Big(u^2 + \frac{1}{2} u_x^2\Big).
\tag{2.12}
\]
From the relations (2.9)–(2.10) and (2.12) we obtain that $u$ satisfies Eq. (1.3) in $\mathcal{D}'((0,T) \times \mathbb{R})$.

Step iii) From Lemma 1, Young's inequality and the PDE, we infer that the sequence $u_t^{n_k}(t,\cdot)$ is uniformly bounded in $L^2(\mathbb{R})$ as $t \in \mathbb{R}_+$. We also have a uniform bound on $\|u^{n_k}(t,\cdot)\|_{H^1(\mathbb{R})}$ for all $t \in \mathbb{R}_+$ and all $n_k$. Hence the family $t \mapsto u^{n_k}(t,\cdot) \in H^1(\mathbb{R})$ is weakly equicontinuous on $[0,T]$ for any $T > 0$. It follows from the Arzelà–Ascoli theorem that $\{u^{n_k}\}$ contains a subsequence, which we denote again by $\{u^{n_k}\}$, which converges weakly in $H^1(\mathbb{R})$, uniformly in $t$. The limit function is $u$ and is weakly continuous from $\mathbb{R}_+$ into $H^1(\mathbb{R})$. Since for a.e. $t \in \mathbb{R}_+$, $u^{n_k}(t,\cdot) \rightharpoonup u(t,\cdot)$ weakly in $H^1(\mathbb{R})$, we have
\[
\|u(t,\cdot)\|_{H^1(\mathbb{R})} \le \liminf_{n_k \to \infty} \|u^{n_k}(t,\cdot)\|_{H^1(\mathbb{R})} = \liminf_{n_k \to \infty} \|u_0^{n_k}\|_{H^1(\mathbb{R})} \le \liminf_{n_k \to \infty} \|y_0^{n_k}\|_{L^1(\mathbb{R})} \le \|y_0\|_{\mathcal{M}}
\]
for a.e. $t \in \mathbb{R}_+$ in view of Lemma 1 and Young's inequality. The previous relation implies that $u \in L^\infty(\mathbb{R}_+ \times \mathbb{R})$.
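Note that by Lemma 1, for all $t \in \mathbb{R}_+$, we have
\[
\|u_x^n(t,\cdot)\|_{L^\infty(\mathbb{R})} \le \|u^n(t,\cdot)\|_{L^\infty(\mathbb{R})} \le \|u^n(t,\cdot)\|_{H^1(\mathbb{R})} \le \|y_0^n\|_{L^1(\mathbb{R})} \le \|y_0\|_{\mathcal{M}}, \qquad n \ge 1.
\]
Combining this with (2.10), we deduce that $u_x \in L^\infty(\mathbb{R}_+ \times \mathbb{R})$.

Step iv) As $u \in C_w(\mathbb{R}_+; H^1(\mathbb{R}))$, to conclude that $u \in C(\mathbb{R}_+; H^1(\mathbb{R}))$ it is enough to show that the functional $E(u(t)) = \|u(t,\cdot)\|^2_{H^1(\mathbb{R})}$ is conserved in time. Indeed, if this holds, then
\[
\begin{aligned}
\|u(t) - u(s)\|^2_{H^1(\mathbb{R})} &= \|u(t)\|^2_{H^1(\mathbb{R})} - 2\,\big(u(t), u(s)\big)_{H^1(\mathbb{R})} + \|u(s)\|^2_{H^1(\mathbb{R})} \\
&= 2\,\|u_0\|^2_{H^1(\mathbb{R})} - 2\,\big(u(t), u(s)\big)_{H^1(\mathbb{R})}, \qquad t, s \in \mathbb{R}_+,
\end{aligned}
\]
and the scalar product in the last line converges to $\|u(t)\|^2_{H^1(\mathbb{R})} = \|u_0\|^2_{H^1(\mathbb{R})}$ as $s \to t$.

The conservation of $E(u)$ in time is proved by a regularization technique. As $u$ solves (1.3) in distribution sense, we see that for a.e. $t \in \mathbb{R}_+$,
\[
\rho_n * u_t + \rho_n * (u u_x) + \rho_n * p_x * \Big(u^2 + \frac{1}{2} u_x^2\Big) = 0, \qquad n \ge 1.
\tag{2.13}
\]
Multiplying with $\rho_n * u$, we obtain by integration and in view of Lemma 2 that for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}} (\rho_n * u)^2\,dx + \int_{\mathbb{R}} (\rho_n * u)\,[\rho_n * (u u_x)]\,dx + \int_{\mathbb{R}} (\rho_n * u) \Big[\rho_n * p_x * \Big(u^2 + \frac{1}{2} u_x^2\Big)\Big]\,dx = 0.
\tag{2.14}
\]
By differentiation of (2.13) we obtain a relation which, multiplied by $\rho_n * u_x$, yields after integration and in view of Lemma 2 that for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}} (\rho_n * u_x)^2\,dx + \int_{\mathbb{R}} (\rho_n * u_x)\,[\rho_{n,x} * (u u_x)]\,dx + \int_{\mathbb{R}} (\rho_n * u_x) \Big[\partial_x^2 \rho_n * p * \Big(u^2 + \frac{1}{2} u_x^2\Big)\Big]\,dx = 0.
\tag{2.15}
\]
Note that
\[
\partial_x^2 (p * f) = p * f - f, \qquad f \in L^2(\mathbb{R}).
\]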
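The identity $\partial_x^2(p * f) = p * f - f$ just stated can be checked numerically. The sketch below is only an illustration under assumptions of its own: a Gaussian test function `f`, a truncated Riemann sum for the convolution, and a centered second difference for $\partial_x^2$.

```python
import math

H = 0.01        # quadrature step (a choice made for this sketch)
CUTOFF = 1500   # grid indices -CUTOFF..CUTOFF, i.e. xi in [-15, 15]
STRIDE = 5      # the second difference uses the step STRIDE * H = 0.05

def p(x):
    """The kernel p(x) = e^{-|x|} / 2 from (1.3)."""
    return 0.5 * math.exp(-abs(x))

def f(x):
    """Hypothetical test function in L^2: a Gaussian."""
    return math.exp(-x * x)

def p_conv_f(i):
    """Riemann sum for (p * f)(i*H); the kink of p sits on a grid node."""
    return H * sum(p((i - k) * H) * f(k * H) for k in range(-CUTOFF, CUTOFF + 1))

def residual(i):
    """|second difference of p*f  -  (p*f - f)| at the point x = i*H."""
    delta = STRIDE * H
    center = p_conv_f(i)
    second_diff = (p_conv_f(i + STRIDE) - 2.0 * center + p_conv_f(i - STRIDE)) / delta**2
    return abs(second_diff - (center - f(i * H)))

sample_indices = (-100, 0, 50, 120)  # x = -1.0, 0.0, 0.5, 1.2
residuals = [residual(i) for i in sample_indices]
print("max residual of the identity:", max(residuals))
```

The residual is dominated by the truncation error of the second difference, so it shrinks further if `STRIDE * H` is reduced together with `H`.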
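As $u^2(t,\cdot) + \frac{1}{2} u_x^2(t,\cdot) \in L^2(\mathbb{R})$, being a weak $L^2(\mathbb{R})$-limit, we can rewrite relation (2.15) as
\[
\begin{aligned}
\frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}} (\rho_n * u_x)^2\,dx &+ \int_{\mathbb{R}} (\rho_n * u_x)\,[\rho_{n,x} * (u u_x)]\,dx + \int_{\mathbb{R}} (\rho_n * u_x) \Big[\rho_n * p * \Big(u^2 + \frac{1}{2} u_x^2\Big)\Big]\,dx \\
&- \int_{\mathbb{R}} (\rho_n * u_x) \Big[\rho_n * \Big(u^2 + \frac{1}{2} u_x^2\Big)\Big]\,dx = 0
\end{aligned}
\tag{2.16}
\]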
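for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$. Adding (2.14) and (2.16), integration by parts shows that for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\begin{aligned}
\frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}} \big[(\rho_n * u)^2 + (\rho_n * u_x)^2\big]\,dx &- \frac{3}{2} \int_{\mathbb{R}} (\rho_n * u_x)(\rho_n * u^2)\,dx \\
&+ \int_{\mathbb{R}} [\rho_n * u_x]\,[\rho_{n,x} * (u u_x)]\,dx - \frac{1}{2} \int_{\mathbb{R}} (\rho_n * u_x)(\rho_n * u_x^2)\,dx = 0.
\end{aligned}
\tag{2.17}
\]
Observe that
\[
\lim_{n \to \infty} \|\rho_n * u_x - u_x\|_{L^2(\mathbb{R})} = \lim_{n \to \infty} \|\rho_n * u^2 - u^2\|_{L^2(\mathbb{R})} = 0.
\]
Therefore for a.e. $t \in \mathbb{R}_+$ the second term in (2.17) converges to zero for $n \to \infty$, as $u(t,\cdot) \in H^1(\mathbb{R})$ for a.e. $t \in \mathbb{R}_+$. Similarly, for a.e. $t \in \mathbb{R}_+$ the last term in (2.17) converges to $-\frac{1}{2} \int_{\mathbb{R}} u_x^3\,dx$ for $n \to \infty$, as $u_x \in L^\infty(\mathbb{R}_+ \times \mathbb{R})$. Further, note that
\[
\begin{aligned}
\int_{\mathbb{R}} [\rho_{n,x} * (u u_x)]\,[\rho_n * u_x]\,dx = &- \int_{\mathbb{R}} (\rho_{n,xx} * u)(\rho_n * u)(\rho_n * u_x)\,dx \\
&+ \int_{\mathbb{R}} (\rho_{n,xx} * u) \big[(\rho_n * u)(\rho_n * u_x) - \rho_n * (u u_x)\big]\,dx.
\end{aligned}
\tag{2.18}
\]
Integration by parts shows that
\[
\int_{\mathbb{R}} (\rho_{n,xx} * u)(\rho_n * u)(\rho_{n,x} * u)\,dx = -\frac{1}{2} \int_{\mathbb{R}} (\rho_n * u_x)^3\,dx \xrightarrow[n \to \infty]{} -\frac{1}{2} \int_{\mathbb{R}} u_x^3\,dx.
\tag{2.19}
\]
On the other hand,
\[
\|\rho_{n,xx} * u\|_{L^1(\mathbb{R})} \le \|u_{xx}\|_{\mathcal{M}} \le 2\,\|y_0\|_{\mathcal{M}}, \qquad n \ge 1,
\tag{2.20}
\]
in view of (2.2) and (2.11), whereas
\[
\|(\rho_n * u)(\rho_n * u_x) - \rho_n * (u u_x)\|_{L^\infty(\mathbb{R})} \xrightarrow[n \to \infty]{} 0,
\tag{2.21}
\]
by Lemma 4 since $u_x(t,\cdot) \in L^\infty(\mathbb{R})$ and $u(t,\cdot) \in H^1(\mathbb{R}) \subset L^\infty(\mathbb{R})$ is uniformly continuous on $\mathbb{R}$. Therefore
\[
\int_{\mathbb{R}} (\rho_{n,xx} * u) \big[(\rho_n * u)(\rho_n * u_x) - \rho_n * (u u_x)\big]\,dx \xrightarrow[n \to \infty]{} 0,
\]
and we showed that for a.e. $t \in \mathbb{R}_+$ the second to the last term in the relation (2.17) converges to $\frac{1}{2} \int_{\mathbb{R}} u_x^3\,dx$ for $n \to \infty$.

Let us now denote
\[
E_n(t) := \int_{\mathbb{R}} \big[(\rho_n * u)^2 + (\rho_n * u_x)^2\big]\,dx, \qquad t \in \mathbb{R}_+,\ n \ge 1,
\tag{2.22}
\]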
and define
\[
G_n(t) := 3 \int_{\mathbb{R}} (\rho_n * u_x)(\rho_n * u^2)\,dx - 2 \int_{\mathbb{R}} [\rho_n * u_x]\,[\rho_{n,x} * (u u_x)]\,dx + \int_{\mathbb{R}} (\rho_n * u_x)(\rho_n * u_x^2)\,dx, \qquad t \in \mathbb{R}_+,\ n \ge 1.
\]
We proved above that for a.e. $t \in \mathbb{R}_+$,
\[
\frac{d}{dt} E_n(t) = G_n(t), \qquad n \ge 1,
\tag{2.23}
\]
and
\[
G_n(t) \xrightarrow[n \to \infty]{} 0.
\tag{2.24}
\]
From Lemma 2 applied to both $\rho_n * u$ and $\rho_n * u_x$, we have by (2.23) that
\[
E_n(t) - E_n(0) = \int_0^t G_n(s)\,ds, \qquad t \in \mathbb{R}_+,\ n \ge 1.
\tag{2.25}
\]
We re-express the second term in $G_n(t)$ by means of relations (2.18) and (2.19). Since $\|\rho_n\|_{L^1(\mathbb{R})} = 1$ for $n \ge 1$, using appropriate combinations of Hölder's and Young's inequality and recalling (2.20), we obtain the existence of a constant $K > 0$ such that
\[
|G_n(t)| \le K, \qquad t \in \mathbb{R}_+,\ n \ge 1,
\tag{2.26}
\]
as $u, u_x \in L^\infty(\mathbb{R}_+ \times \mathbb{R})$. Relations (2.24)–(2.26) force
\[
\lim_{n \to \infty} [E_n(t) - E_n(0)] = 0, \qquad t \in \mathbb{R}_+,
\]
on account of Lebesgue's dominated convergence theorem. For fixed $t \in \mathbb{R}_+$, we therefore have
\[
E(u(t)) = \lim_{n \to \infty} E_n(t) = E(u_0),
\tag{2.27}
\]
in view of (2.6). This proves that $u \in C(\mathbb{R}_+; H^1(\mathbb{R}))$ and $E$ is conserved along our solution. From (1.3) and Young's inequality, it is now obvious that $u \in C^1(\mathbb{R}_+; L^2(\mathbb{R}))$.

From Step iii) we know that for a.e. $t \in \mathbb{R}_+$, $u^{n_k}(t,\cdot) \rightharpoonup u(t,\cdot)$ weakly in $H^1(\mathbb{R})$. But
\[
\begin{aligned}
\|u^{n_k}(t,\cdot) - u(t,\cdot)\|^2_{H^1(\mathbb{R})} &= \|u^{n_k}(t,\cdot)\|^2_{H^1(\mathbb{R})} - 2\,\big(u^{n_k}(t,\cdot), u(t,\cdot)\big)_{H^1(\mathbb{R})} + \|u(t,\cdot)\|^2_{H^1(\mathbb{R})} \\
&= \|u_0^{n_k}\|^2_{H^1(\mathbb{R})} - 2\,\big(u^{n_k}(t,\cdot), u(t,\cdot)\big)_{H^1(\mathbb{R})} + \|u_0\|^2_{H^1(\mathbb{R})},
\end{aligned}
\]
in view of Lemma 1 and (2.27). Therefore, relations (2.6) and (2.27) ensure
\[
\|u^{n_k}(t,\cdot) - u(t,\cdot)\|_{H^1(\mathbb{R})} \xrightarrow[n_k \to \infty]{} 0
\]
for a.e. $t \in \mathbb{R}_+$. This relation shows that for a.e. $t \in \mathbb{R}_+$
\[
F(u^{n_k}(t,\cdot)) \xrightarrow[n_k \to \infty]{} F(u(t,\cdot)),
\]
as $F$ is continuous in the $H^1(\mathbb{R})$-norm. By Lemma 1,
\[
F(u^{n_k}(t,\cdot)) = F(u_0^{n_k}) \xrightarrow[n_k \to \infty]{} F(u_0),
\]
so that for a.e. $t \in \mathbb{R}_+$, $F(u(t)) = F(u_0)$. Since $t \mapsto F(u(t))$ is continuous, we infer that $F(u(t)) = F(u_0)$ for all $t \in \mathbb{R}_+$. □

Remark. Convoluting (1.3) with the family of mollifiers $\{\rho_n\}_{n \ge 1}$ and passing to the limit after integration for $n \to \infty$, it is easy to see that $\int_{\mathbb{R}} u(t,x)\,dx$ is also a conservation law for our solution.
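For the soliton (1.2) the three conserved quantities can be evaluated directly, which gives a simple numerical illustration of the conservation laws. The closed-form values $\int u\,dx = 2c$, $E = 2c^2$ and $F = \frac{4}{3}c^3$ used below are elementary integrals of $c\,e^{-|x - ct|}$ computed for this sketch; they are not quoted from the paper, and the quadrature parameters are arbitrary choices.

```python
import math

def peakon(c, t, x):
    """Return u and its a.e. derivative u_x for u(t, x) = c * exp(-|x - c t|)."""
    s = x - c * t
    u = c * math.exp(-abs(s))
    ux = -math.copysign(1.0, s) * u
    return u, ux

def invariants(c, t, half_width=40.0, h=0.01):
    """Midpoint sums for the integrals of u, u^2 + u_x^2 and u^3 + u u_x^2."""
    steps = int(round(2.0 * half_width / h))
    total_u = total_e = total_f = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        u, ux = peakon(c, t, x)
        total_u += u * h
        total_e += (u * u + ux * ux) * h
        total_f += (u ** 3 + u * ux * ux) * h
    return total_u, total_e, total_f

C = 2.0  # hypothetical wave speed
at_start = invariants(C, 0.0)
at_later = invariants(C, 3.0)
expected = (2.0 * C, 2.0 * C ** 2, 4.0 * C ** 3 / 3.0)

print("t = 0:", at_start)
print("t = 3:", at_later)
print("closed form:", expected)
```

The values at the two times agree because the profile is merely translated, which is the simplest instance of the conservation of $\int u\,dx$, $E$ and $F$.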
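Proof of uniqueness. Let $u$ and $v$ be two weak solutions of (1.3) within the class $\{f \in C^1(\mathbb{R}_+; L^2(\mathbb{R})) \cap C(\mathbb{R}_+; H^1(\mathbb{R}))$ with the total variation of $[f(t,\cdot) - f_{xx}(t,\cdot)] \in \mathcal{M}^+(\mathbb{R})$ uniformly bounded on $\mathbb{R}_+\}$. If
\[
M := \sup_{t \ge 0} \big\{ \|u(t,\cdot) - u_{xx}(t,\cdot)\|_{\mathcal{M}} + \|v(t,\cdot) - v_{xx}(t,\cdot)\|_{\mathcal{M}} \big\},
\]
then for all $(t,x) \in \mathbb{R}_+ \times \mathbb{R}$,
\[
\begin{aligned}
|u(t,x)| &= |p * [u(t,\cdot) - u_{xx}(t,\cdot)](x)| \le \|p\|_{L^\infty(\mathbb{R})}\,\|u(t,\cdot) - u_{xx}(t,\cdot)\|_{\mathcal{M}} \le \frac{1}{2} M, \\
|u_x(t,x)| &= |p_x * [u(t,\cdot) - u_{xx}(t,\cdot)](x)| \le \frac{1}{2} M,
\end{aligned}
\tag{2.28}
\]
and similarly
\[
|v(t,x)| \le \frac{1}{2} M, \qquad |v_x(t,x)| \le \frac{1}{2} M, \qquad (t,x) \in \mathbb{R}_+ \times \mathbb{R}.
\tag{2.29}
\]
On the other hand, following the same procedure as in (2.2), we find that
\[
\begin{aligned}
\|u(t,\cdot)\|_{L^1(\mathbb{R})} &= \|p * [u(t,\cdot) - u_{xx}(t,\cdot)]\|_{L^1(\mathbb{R})} \le M, \qquad t \ge 0, \\
\|u_x(t,\cdot)\|_{L^1(\mathbb{R})} &= \|p_x * [u(t,\cdot) - u_{xx}(t,\cdot)]\|_{L^1(\mathbb{R})} \le M, \qquad t \ge 0, \\
\|v(t,\cdot)\|_{L^1(\mathbb{R})} &\text{ and } \|v_x(t,\cdot)\|_{L^1(\mathbb{R})} \le M, \qquad t \ge 0.
\end{aligned}
\tag{2.30}
\]
Let us define
\[
w(t,x) := u(t,x) - v(t,x), \qquad (t,x) \in \mathbb{R}_+ \times \mathbb{R}.
\]
Then for a.e. $t \in \mathbb{R}_+$, we claim that
\[
\frac{d}{dt} \int_{\mathbb{R}} |\rho_n * w|\,dx = \int_{\mathbb{R}} (\rho_n * w_t)\,\operatorname{sgn}(\rho_n * w)\,dx,
\tag{2.31}
\]
and
\[
\frac{d}{dt} \int_{\mathbb{R}} |\rho_n * w_x|\,dx = \int_{\mathbb{R}} (\rho_n * w_{xt})\,\operatorname{sgn}(\rho_n * w_x)\,dx.
\tag{2.32}
\]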
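We justify (2.32) as the argument for (2.31) is similar. Let $\eta : \mathbb{R}_+ \to \mathbb{R}$ be a decreasing $C^2$ function such that $\eta(s) = 1$ for $s \in [0, \frac{1}{2}]$, $\eta(s) = e^{-s}$ for $s \ge 1$ and $\eta$ is a polynomial on $[\frac{1}{2}, 1]$. For $R > 0$, we put
\[
\eta_R(x) = \eta\Big(\frac{|x|}{R}\Big), \qquad x \in \mathbb{R}.
\]
Applying Lemma 2 to the function $|\rho_n * w_x|$ and $\eta_R$ we infer that for all $t \ge 0$ and $n \ge 1$,
\[
\int_{\mathbb{R}} |\rho_n * w_x|(t,x)\,\eta_R(x)\,dx - \int_{\mathbb{R}} |\rho_n * w_x|(0,x)\,\eta_R(x)\,dx = \int_0^t \int_{\mathbb{R}} \partial_t |\rho_n * w_x|(s,x)\,\eta_R(x)\,dx\,ds.
\tag{2.33}
\]
But $\partial_t |\rho_n * w_x| = (\rho_n * w_{xt})\,\operatorname{sgn}(\rho_n * w_x)$ and $t \mapsto (\rho_n * w_{xt}) = (\rho_{n,x} * w_t)$ is uniformly bounded in $L^1(\mathbb{R})$, as one can see from the PDE. Letting $R \to \infty$ in (2.33), we obtain from Lebesgue's dominated convergence theorem that for all $t \in \mathbb{R}_+$,
\[
\int_{\mathbb{R}} |\rho_n * w_x|(t,x)\,dx - \int_{\mathbb{R}} |\rho_n * w_x|(0,x)\,dx = \int_0^t \int_{\mathbb{R}} \partial_t |\rho_n * w_x|(s,x)\,dx\,ds, \qquad n \ge 1.
\]
Differentiating the above relation with respect to time, one obtains (2.32).

Convoluting the PDEs for $u$ and $v$ with $\rho_n$ and using (2.31), we get that for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\begin{aligned}
\frac{d}{dt} \int_{\mathbb{R}} |\rho_n * w|\,dx &= \int_{\mathbb{R}} (\rho_n * w_t)\,\operatorname{sgn}(\rho_n * w)\,dx \\
&= -\int_{\mathbb{R}} [\rho_n * (w u_x)]\,\operatorname{sgn}(\rho_n * w)\,dx - \int_{\mathbb{R}} [\rho_n * (v w_x)]\,\operatorname{sgn}(\rho_n * w)\,dx \\
&\quad - \int_{\mathbb{R}} \rho_n * p_x * [w(u + v)]\,\operatorname{sgn}(\rho_n * w)\,dx \\
&\quad - \frac{1}{2} \int_{\mathbb{R}} \rho_n * p_x * [w_x(u_x + v_x)]\,\operatorname{sgn}(\rho_n * w)\,dx.
\end{aligned}
\tag{2.34}
\]
Using (2.28)–(2.30) and Young's inequality, we infer that for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\frac{d}{dt} \int_{\mathbb{R}} |\rho_n * w|\,dx \le 2M \int_{\mathbb{R}} |\rho_n * w|\,dx + 2M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + R_n(t),
\tag{2.35}
\]
where
\[
R_n(t) \to 0 \text{ as } n \to \infty, \qquad |R_n(t)| \le K, \quad n \ge 1,\ t \in \mathbb{R}_+.
\tag{2.36}
\]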
Here and henceforth $K > 0$ is a constant depending on $M$ and the $H^1(\mathbb{R})$-norms of $u(0)$ and $v(0)$; recall that the $H^1(\mathbb{R})$-norm is conserved for the class of solutions we are considering here. For example,
\[
\begin{aligned}
\Big| \int_{\mathbb{R}} [\rho_n * (v w_x)]\,\operatorname{sgn}(\rho_n * w)\,dx \Big| &\le \int_{\mathbb{R}} |\rho_n * (v w_x)|\,dx \\
&\le \int_{\mathbb{R}} |\rho_n * w_x|\,|\rho_n * v|\,dx + \int_{\mathbb{R}} |\rho_n * (v w_x) - (\rho_n * v)(\rho_n * w_x)|\,dx \\
&\le \frac{1}{2} M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + \int_{\mathbb{R}} |\rho_n * (v w_x) - (\rho_n * v)(\rho_n * w_x)|\,dx,
\end{aligned}
\]
and the second term on the right enters into the framework of Lemma 3. The uniform bound on the error term is attained by Young's inequality and relations (2.28)–(2.30). The first and third term on the right-hand side of (2.34) can be dealt with analogously. The bound on the last term in (2.34) is slightly more complicated. As the procedure runs along the same lines as the bound on the first term on the right-hand side of the inequality (2.37), we prefer to give the details below for the latter one, which is slightly more cumbersome.

In the same way, convoluting the PDEs for $u$ and $v$ with $\rho_{n,x}$ and using (2.32), we get for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\begin{aligned}
\frac{d}{dt} \int_{\mathbb{R}} |\rho_n * w_x|\,dx &= \int_{\mathbb{R}} (\rho_n * w_{xt})\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \\
&= -\int_{\mathbb{R}} [\rho_n * (w_x(u_x + v_x))]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx - \int_{\mathbb{R}} [\rho_n * (w v_{xx})]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \\
&\quad - \int_{\mathbb{R}} [\rho_n * (u w_{xx})]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \\
&\quad - \int_{\mathbb{R}} \rho_n * p_{xx} * \Big(u^2 - v^2 + \frac{1}{2} u_x^2 - \frac{1}{2} v_x^2\Big)\,\operatorname{sgn}(\rho_{n,x} * w)\,dx.
\end{aligned}
\tag{2.37}
\]
Using the identity $\partial_x^2(p * f) = p * f - f$ for $f \in L^2(\mathbb{R})$, we easily estimate the last term of the right-hand side by
\[
\int_{\mathbb{R}} \Big| \rho_n * p_{xx} * \Big(u^2 - v^2 + \frac{1}{2} u_x^2 - \frac{1}{2} v_x^2\Big) \Big|\,dx \le M \int_{\mathbb{R}} |\rho_n * w|\,dx + M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + R_n(t),
\tag{2.38}
\]
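with $R_n$ in the class (2.36). To treat the first term of the right-hand side of (2.37), observe that in view of (2.28)–(2.30),
\[
\begin{aligned}
-\int_{\mathbb{R}} [\rho_n * (w_x(u_x + v_x))]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx &\le \int_{\mathbb{R}} \rho_n * \big(|w_x|\,(|u_x| + |v_x|)\big)\,dx \\
&\le \big(\|u_x\|_{L^\infty(\mathbb{R})} + \|v_x\|_{L^\infty(\mathbb{R})}\big) \int_{\mathbb{R}} \rho_n * |w_x|\,dx \le M \int_{\mathbb{R}} \rho_n * |w_x|\,dx \\
&\le M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + M \Big[ \int_{\mathbb{R}} \big(\rho_n * |w_x| - |\rho_n * w_x|\big)\,dx \Big] \\
&= M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + R_n(t), \qquad t \in \mathbb{R}_+,\ n \ge 1,
\end{aligned}
\tag{2.39}
\]
with $R_n$ in the class (2.36). Observe that for a.e. $t \in \mathbb{R}_+$ and all $n \ge 1$,
\[
\begin{aligned}
-\int_{\mathbb{R}} [\rho_n * (u w_{xx})]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx &= -\int_{\mathbb{R}} (\rho_n * u)(\rho_n * w_{xx})\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \\
&\quad - \int_{\mathbb{R}} [\rho_n * (u w_{xx}) - (\rho_n * u)(\rho_n * w_{xx})]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \\
&\le -\int_{\mathbb{R}} (\rho_n * u)\,\frac{\partial}{\partial x} |\rho_n * w_x|\,dx + \int_{\mathbb{R}} |\rho_n * (u w_{xx}) - (\rho_n * u)(\rho_n * w_{xx})|\,dx \\
&= \int_{\mathbb{R}} |\rho_n * w_x|\,(\rho_n * u_x)\,dx + \int_{\mathbb{R}} |\rho_n * (u w_{xx}) - (\rho_n * u)(\rho_n * w_{xx})|\,dx
\end{aligned}
\]
after integration by parts. From the previous estimate and Lemma 3 we easily infer that for $n \ge 1$ and a.e. $t \in \mathbb{R}_+$,
\[
\Big| \int_{\mathbb{R}} [\rho_n * (u w_{xx})]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \Big| \le M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + R_n(t)
\tag{2.40}
\]
with $R_n$ belonging to the class (2.36). We have now to deal with the second term on the right-hand side of (2.37). For $n \ge 1$ and a.e. $t \in \mathbb{R}_+$, note that
\[
\Big| \int_{\mathbb{R}} [\rho_n * (w v_{xx})]\,\operatorname{sgn}(\rho_{n,x} * w)\,dx \Big| \le \int_{\mathbb{R}} |(\rho_n * w)(\rho_n * v_{xx})|\,dx + \int_{\mathbb{R}} |\rho_n * (w v_{xx}) - (\rho_n * w)(\rho_n * v_{xx})|\,dx.
\]
In view of Lemma 3, the last expression can be estimated by a function $R_n(t)$ in the class (2.36). On the other hand, for all $n \ge 1$ and a.e. $t \in \mathbb{R}_+$,
\[
\int_{\mathbb{R}} |(\rho_n * w)(\rho_n * v_{xx})|\,dx \le \|\rho_n * w\|_{L^\infty(\mathbb{R})}\,\|\rho_n * v_{xx}\|_{L^1(\mathbb{R})} \le \|\rho_n * w\|_{W^{1,1}(\mathbb{R})}\,\|v_{xx}\|_{\mathcal{M}}
\]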
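in view of (2.2). Therefore, for all $n \ge 1$ and $t \in \mathbb{R}_+$ we have
\[
\Big| \int_{\mathbb{R}} [\rho_n * (w v_{xx})]\,\operatorname{sgn}(\rho_n * w_x)\,dx \Big| \le M \int_{\mathbb{R}} |\rho_n * w|\,dx + M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + R_n(t),
\tag{2.41}
\]
with $R_n$ in the class (2.36). From (2.37)–(2.41) we infer that for $n \ge 1$ and a.e. $t \in \mathbb{R}_+$,
\[
\frac{d}{dt} \int_{\mathbb{R}} |\rho_n * w_x|\,dx \le 4M \int_{\mathbb{R}} |\rho_n * w|\,dx + 4M \int_{\mathbb{R}} |\rho_n * w_x|\,dx + R_n(t),
\tag{2.42}
\]
with $R_n$ in the class (2.36). Adding (2.35) and (2.42) we obtain by Gronwall's inequality that for all $t \in \mathbb{R}_+$ and $n \ge 1$,
\[
\int_{\mathbb{R}} \big(|\rho_n * w| + |\rho_n * w_x|\big)(t,x)\,dx \le \int_0^t e^{6M(t - s)} R_n(s)\,ds + \Big[ \int_{\mathbb{R}} \big(|\rho_n * w| + |\rho_n * w_x|\big)(0,x)\,dx \Big] e^{6Mt}.
\]
Fix $t > 0$ and let $n \to \infty$ in the above inequality. As $w, w_x \in L^1(\mathbb{R})$ by (2.28)–(2.30) and relation (2.36) holds, we obtain by Lebesgue's dominated convergence theorem that
\[
\int_{\mathbb{R}} \big(|w| + |w_x|\big)(t,x)\,dx \le \Big[ \int_{\mathbb{R}} \big(|w| + |w_x|\big)(0,x)\,dx \Big] e^{6Mt}, \qquad t \in \mathbb{R}_+.
\]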
Uniqueness is now plain. □

Example (Solitons). Corresponding to the initial peaked profile of height $c \in \mathbb{R}_+$, $u_0(x) := c\,e^{-|x|}$, $x \in \mathbb{R}$, Eq. (1.3) has a unique weak solution $u(t,x) = c\,e^{-|x - ct|}$, $(t,x) \in \mathbb{R}_+ \times \mathbb{R}$. This solution is a solitary wave traveling with speed $c$.

Example (The interaction of two solitons). As discovered by Camassa and Holm (see [3]), the solitary waves of Eq. (1.1) are solitons. Consider a weak solution that far out in the past is formed by two solitary waves with the taller one to the left of the shorter:
\[
u(t,x) \simeq c_1 e^{-|x - c_1 t|} + c_2 e^{-|x - c_2 t|} \quad \text{as } t \to -\infty.
\]
Then it can be seen (for details we refer to [4,9]) that for suitable $p_1, p_2, q_1, q_2 \in W^{1,\infty}(\mathbb{R})$, Eq. (1.1) has the following solution in distribution sense:
\[
u(t,x) = p_1(t)\,e^{-|x - q_1(t)|} + p_2(t)\,e^{-|x - q_2(t)|}, \qquad t, x \in \mathbb{R}.
\]
Moreover,
\[
u(t,x) \simeq c_1 e^{-\left|x - c_1 t - 2 \ln \frac{c_1}{c_1 - c_2}\right|} + c_2 e^{-\left|x - c_2 t - 2 \ln \frac{c_1 - c_2}{c_2}\right|} \quad \text{as } t \to \infty.
\]
The taller wave catches the shorter and they collide. No overlapping of the peaks occurs. After the collision the taller wave reappears to the right of the shorter one and eventually the two waves reemerge with their initial shapes and speeds. This shows that the solitary waves are solitons, retaining their identities after interaction.
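The two-soliton interaction can be reproduced with a short simulation. The paper does not write down the equations for $p_1, p_2, q_1, q_2$ (it refers to [4,9]); the sketch below assumes the standard two-peakon Hamiltonian system from the Camassa–Holm literature, $\dot q_i = \sum_j p_j e^{-|q_i - q_j|}$ and $\dot p_i = p_i \sum_j p_j \operatorname{sgn}(q_i - q_j)\,e^{-|q_i - q_j|}$, and integrates it with a classical Runge–Kutta scheme. The initial data, step size and final time are arbitrary choices, and the initial separation is only an approximation of the $t \to -\infty$ regime.

```python
import math

def rhs(state):
    """Assumed two-peakon ODE system (not written out in the paper)."""
    q1, q2, p1, p2 = state
    e = math.exp(-abs(q1 - q2))
    s = (q1 > q2) - (q1 < q2)  # sgn(q1 - q2)
    return (p1 + p2 * e, p2 + p1 * e, s * p1 * p2 * e, -s * p1 * p2 * e)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(slope, scale):
        return tuple(x + scale * k for x, k in zip(state, slope))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2.0))
    k3 = rhs(shifted(k2, dt / 2.0))
    k4 = rhs(shifted(k3, dt))
    return tuple(
        x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def energy(state):
    """H = (p1^2 + p2^2)/2 + p1 p2 exp(-|q1 - q2|)."""
    q1, q2, p1, p2 = state
    return 0.5 * (p1 * p1 + p2 * p2) + p1 * p2 * math.exp(-abs(q1 - q2))

C1, C2 = 2.0, 1.0      # speeds of the taller and the shorter wave
T, DT = 25.0, 0.005    # final time and time step (arbitrary choices)
state = (-10.0, 0.0, C1, C2)  # taller wave starts 10 units to the left

initial_energy = energy(state)
min_gap = state[1] - state[0]
for _ in range(int(round(T / DT))):
    state = rk4_step(state, DT)
    min_gap = min(min_gap, state[1] - state[0])

q1, q2, p1, p2 = state
final_energy = energy(state)
fast_shift = q2 - (C1 * T - 10.0)  # compare with 2 ln(c1 / (c1 - c2))
slow_shift = q1 - C2 * T           # compare with 2 ln((c1 - c2) / c2)

print("final amplitudes:", p1, p2)
print("smallest distance between the peaks:", min_gap)
print("phase shifts:", fast_shift, slow_shift)
print("predicted:", 2 * math.log(C1 / (C1 - C2)), 2 * math.log((C1 - C2) / C2))
```

In this run the peaks never overlap, the amplitudes are exchanged between the two peak positions, and the measured phase shifts agree with the asymptotic formula above up to the small error caused by the finite initial separation.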
3. Stability of the Solitons

Due to the weak interaction of the solitons with each other it is reasonable to expect that they are stable. The appropriate notion of stability is orbital stability: a wave starting close to a solitary wave always remains close to some translate of it at all later times (the shape of the wave remains approximately the same for all times). For waves $u_0 \in H^3(\mathbb{R})$ that approximate the solitons in a special way, a stability result was proved in [11]. The stability of the solitons is settled by

Theorem A ([12]). If $u \in C([0,T); H^1(\mathbb{R}))$ is a solution to (1.1) with
\[
\|u(0,\cdot) - c\,\varphi\|_{H^1(\mathbb{R})} < \frac{\varepsilon^4}{81\,c^4} \quad \text{and} \quad 0 < \varepsilon < c,
\]
then
\[
\|u(t,\cdot) - c\,\varphi(\cdot - \xi(t))\|_{H^1(\mathbb{R})} < \varepsilon \quad \text{for } t \in (0,T),
\]
where $\xi(t) \in \mathbb{R}$ is any point where the function $u(t,\cdot)$ attains its maximum.

In Theorem A, $u \in C([0,T); H^1(\mathbb{R}))$ is a solution to (1.1) if $u$ is a solution of (1.3) in the sense of distributions and $E$ and $F$ are conserved. The results in [22] show that if $u_0 \in H^s(\mathbb{R})$ with $s > \frac{3}{2}$ is such that $y_0 := u_0 - u_{0,xx} \ge 0$, then the profile $u_0$ develops into a solution defined for all $t \in \mathbb{R}_+$ and entering in the framework of Theorem A. We therefore have the stability of the solitons with respect to perturbations within that class. However, a natural perturbation of a soliton is another nearby soliton and in this case the results in [16,22] are not applicable to guarantee that a solution of (1.1) exists even locally in time. If the initial profile $u_0 \in H^1(\mathbb{R})$ with $y_0 := u_0 - u_{0,xx} \in \mathcal{M}^+(\mathbb{R})$ is $H^1(\mathbb{R})$-close to a soliton, then the corresponding solution (whose existence and uniqueness is guaranteed by our theorem) remains close to some translate of the soliton at all later times, in view of Theorem A.

4. The Periodic Case

One can also look for spatially periodic weak solutions for Eq. (1.1). Our methods apply also to this case: one can follow exactly the same steps. Identifying all spaces of periodic functions with function spaces over the unit circle $\mathbb{S}$ in $\mathbb{R}^2$, we obtain the following

Proposition. If $u_0 \in H^1(\mathbb{S})$ is such that $y_0 := u_0 - u_{0,xx} \in \mathcal{M}^+(\mathbb{S})$, then equation (1.3) has a unique solution $u \in C^1(\mathbb{R}_+; L^2(\mathbb{S})) \cap C(\mathbb{R}_+; H^1(\mathbb{S}))$ with initial data $u(0) = u_0$ and such that the total variation of $y(t,\cdot) := u(t,\cdot) - u_{xx}(t,\cdot) \in \mathcal{M}^+(\mathbb{S})$ is uniformly bounded on $\mathbb{R}_+$. Moreover, $I(u) := \int_{\mathbb{S}} u\,dx$, $E(u) := \int_{\mathbb{S}} (u^2 + u_x^2)\,dx$ and $F(u) := \int_{\mathbb{S}} (u^3 + u u_x^2)\,dx$ are conservation laws.

Remark. (a) The problem of weak solutions to (1.1) for initial data within the class $\{u_0 \in H^1(\mathbb{S}) : (u_0 - u_{0,xx}) \in \mathcal{M}^+(\mathbb{S})\}$ was also considered in [9]. Therein it was proved that for every initial data in the above defined class, Eq. (1.3) has a weak solution $u \in C(\mathbb{R}_+; H^1(\mathbb{S}))$ with $u(0) = u_0$ and such that $I(u)$ and $E(u)$ are conserved in time. The proof of the uniqueness of this weak solution given in [9] contains a flaw. However, the present approach shows that the statement in [9] is correct and that the weak solution actually enjoys better regularity.
(b) The example of the periodic peakon
\[
u(t,x) = \frac{\cosh\big(x - t - [x - t] - \frac{1}{2}\big)}{\sinh\big(\frac{1}{2}\big)}, \qquad (t,x) \in \mathbb{R}_+ \times \mathbb{R},
\]
(here $[\cdot]$ stands for the integer part) ensures that the regularity of the weak solutions provided by the proposition is the optimal one.

Acknowledgement. The authors are grateful for the useful suggestions made by the referee.
References
1. Beals, R., Sattinger, D. and Szmigielski, J.: Acoustic scattering and the extended Korteweg–de Vries hierarchy. Adv. in Math. 140, 190–206 (1998)
2. Beals, R., Sattinger, D. and Szmigielski, J.: Multipeakons and a theorem of Stieltjes. Inverse Problems 15, 1–4 (1999)
3. Camassa, R. and Holm, D.: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664 (1993)
4. Camassa, R., Holm, D. and Hyman, J.: A new integrable shallow water equation. Adv. Appl. Mech. 31 (1994)
5. Constantin, A.: Existence of permanent and breaking waves for a shallow water equation: A geometric approach. Ann. Inst. Fourier (Grenoble), in press
6. Constantin, A. and Escher, J.: Global existence and blow-up for a shallow water equation. Annali Sc. Norm. Sup. Pisa 26, 303–328 (1998)
7. Constantin, A. and Escher, J.: Global weak solutions for a shallow water equation. Indiana Univ. Math. J. 47, 1527–1545 (1998)
8. Constantin, A. and Escher, J.: Wave breaking for nonlinear nonlocal shallow water equations. Acta Mathematica 181, 229–243 (1998)
9. Constantin, A. and Escher, J.: Well-posedness, global existence, and blow-up phenomena for a periodic quasi-linear hyperbolic equation. Comm. Pure Appl. Math. 51, 475–504 (1998)
10. Constantin, A. and McKean, H.P.: A shallow water equation on the circle. Comm. Pure Appl. Math. 52, 949–982 (1999)
11. Constantin, A. and Molinet, L.: Orbital stability of the solitary waves for a shallow water equation. Preprint, University of Zürich (1998)
12. Constantin, A. and Strauss, W.: Stability of peakons. Comm. Pure Appl. Math., in press
13. Dai, H.-H.: Model equations for nonlinear dispersive waves in a compressible Mooney–Rivlin rod. Acta Mechanica 127, 293–308 (1998)
14. Fokas, A.S. and Fuchssteiner, B.: Symplectic structures, their Bäcklund transformation and hereditary symmetries. Physica D 4, 47–66 (1981)
15. Kouranbaeva, S.: The Camassa–Holm equation as a geodesic flow on the diffeomorphism group. J. Math. Phys. 40, 857–868 (1999)
16. Li, Yi and Olver, P.: Well-posedness and blow-up solutions for an integrable nonlinearly dispersive model wave equation. J. Differ. Eqs., in press
17. Malek, J., Necas, J., Rokyta, M. and Ruzicka, M.: Weak and Measure-valued Solutions to Evolutionary PDEs. London: Chapman & Hall, 1996
18. McKean, H.P.: Shallow water and the diffeomorphism group. Preprint, Courant Institute of Mathematical Sciences, New York, 1998
19. McKean, H.P.: Breakdown of a shallow water equation. Asian J. Math., to appear
20. Misiolek, G.: A shallow water equation as a geodesic flow on the Bott–Virasoro group. J. Geom. Phys. 24, 203–208 (1998)
21. Natanson, I.P.: Theory of Functions of a Real Variable. New York: F. Ungar Publ. Co., 1964
22. Rodriguez-Blanco, G.: On the Cauchy problem for the Camassa–Holm equation. Preprint IMPA, Rio de Janeiro 155, 1998
23. Whitham, G.B.: Linear and Nonlinear Waves. New York: J. Wiley & Sons, 1980

Communicated by A. Kupiainen
Commun. Math. Phys. 211, 63 – 83 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Interactions in Noncommutative Dynamics
William Arveson⋆
Department of Mathematics, University of California, Berkeley, CA 94720, USA
Received: 12 October 1999 / Accepted: 21 October 1999
To the memory of Irving Segal
Abstract: A mathematical notion of interaction is introduced for noncommutative dynamical systems, i.e., for one-parameter groups of $*$-automorphisms of $B(H)$ endowed with a certain causal structure. With any interaction there is a well-defined “state of the past” and a well-defined “state of the future”. We describe the construction of many interactions involving cocycle perturbations of the CAR/CCR flows and show that they are nontrivial. The proof of nontriviality is based on a new inequality, relating the eigenvalue lists of the “past” and “future” states to the norm of a linear functional on a certain $C^*$-algebra.
Introduction, Summary of Results
In this paper we are concerned with one-parameter groups of $*$-automorphisms of the algebra $B(H)$ of all bounded operators on a Hilbert space $H$ which carry a particular kind of causal structure. More precisely, a history is a pair $(U, M)$ consisting of a one-parameter group $U = \{U_t : t \in \mathbb{R}\}$ of unitary operators acting on a separable infinite-dimensional Hilbert space $H$ and a type I subfactor $M \subseteq B(H)$ which is invariant under the automorphisms $\gamma_t(X) = U_t X U_t^*$ for negative $t$, and which has the following two properties:
(irreducibility)
$$\Big(\bigcup_{t\in\mathbb{R}} \gamma_t(M)\Big)'' = B(H), \tag{0.1}$$
⋆ On appointment as a Miller Research Professor in the Miller Institute for Basic Research in Science. Support is also acknowledged from NSF grant DMS-9802474.
(trivial infinitely remote past)
$$\bigcap_{t\in\mathbb{R}} \gamma_t(M) = \mathbb{C}\cdot 1. \tag{0.2}$$
We find it useful to think of the group $\{\gamma_t : t \in \mathbb{R}\}$ as representing the flow of time in the Heisenberg picture, and the von Neumann algebra $M$ as representing bounded observables that are associated with the “past”. However, this paper is concerned with purely mathematical issues concerning the dynamical properties of histories, with problems concerning their existence and construction, and especially with the issue of nontriviality (to be defined momentarily).
An $E_0$-semigroup is a one-parameter semigroup $\alpha = \{\alpha_t : t \ge 0\}$ of unit-preserving $*$-endomorphisms of a type $\mathrm{I}_\infty$ factor $M$, which is continuous in the natural sense [2–8,10,11,29–33]. The subfactors $\alpha_t(M)$ decrease as $t$ increases, and $\alpha$ is called pure if $\cap_t\, \alpha_t(M) = \mathbb{C}\cdot 1$. There are two $E_0$-semigroups $\alpha^-$, $\alpha^+$ associated with any history, $\alpha^-$ being the one associated with the “past” by restricting $\gamma_{-t}$ to $M$ for $t \ge 0$ and $\alpha^+$ being the one associated with the “future” by restricting $\gamma_t$ to the commutant $M'$ for $t \ge 0$. By an interaction we mean a history with the additional property that there are normal states $\omega_-$, $\omega_+$ of $M$, $M'$ respectively such that $\omega_-$ is invariant under the action of $\alpha^-$ and $\omega_+$ is invariant under the action of $\alpha^+$. Both $\alpha^-$ and $\alpha^+$ are pure $E_0$-semigroups, and when a pure $E_0$-semigroup has a normal invariant state then that state is uniquely determined, see (4.1) below. Thus $\omega_-$ (resp. $\omega_+$) is the unique normal invariant state of $\alpha^-$ (resp. $\alpha^+$).
Remarks. Since the state space of any unital $C^*$-algebra is weak$^*$-compact, the Markov–Kakutani fixed point theorem implies that every $E_0$-semigroup has invariant states. But there is no reason to expect that there is a normal invariant state. Indeed, we have examples (unpublished) of pure $E_0$-semigroups which have no normal invariant states. Notice too that $\omega_-$, for example, is defined only on the algebra $M$ of the past.
Of course, $\omega_-$ has many extensions to normal states of $B(H)$, but none of these normal extensions need be invariant under the action of the group $\gamma$. In fact, we will see that if there is a normal $\gamma$-invariant state defined on all of $B(H)$ then the interaction must be trivial.
In order to define a trivial interaction we must introduce a $C^*$-algebra of “local observables”. For every compact interval $[s,t] \subseteq \mathbb{R}$ there is an associated von Neumann algebra
$$\mathcal{A}_{[s,t]} = \gamma_t(M) \cap \gamma_s(M)'. \tag{0.3}$$
Notice that since $\gamma_s(M) \subseteq \gamma_t(M)$ are both type I factors, so is the relative commutant $\mathcal{A}_{[s,t]}$. Clearly $\mathcal{A}_I \subseteq \mathcal{A}_J$ if $I \subseteq J$, and for adjacent intervals $[r,s]$, $[s,t]$, $r \le s \le t$ we have
$$\mathcal{A}_{[r,t]} = \mathcal{A}_{[r,s]} \otimes \mathcal{A}_{[s,t]}, \tag{0.4}$$
in the sense that the two factors $\mathcal{A}_{[r,s]}$ and $\mathcal{A}_{[s,t]}$ mutually commute and generate $\mathcal{A}_{[r,t]}$ as a von Neumann algebra. The automorphism group $\gamma$ permutes the algebras $\mathcal{A}_I$ covariantly,
$$\gamma_t(\mathcal{A}_I) = \mathcal{A}_{I+t}, \qquad t \in \mathbb{R}. \tag{0.5}$$
Finally, we define the local $C^*$-algebra $\mathcal{A}$ to be the norm closure of the union of all the $\mathcal{A}_I$, $I \subseteq \mathbb{R}$. $\mathcal{A}$ is a $C^*$-subalgebra of $B(H)$ which is strongly dense and invariant under the action of the automorphism group $\gamma$.
Remarks. It may be of interest to compare the local structure of the $C^*$-algebra $\mathcal{A}$ to its commutative counterpart, namely the local algebras associated with a stationary random distribution with independent values at every point [19]. More precisely, suppose that we are given a random distribution $\phi$; i.e., a linear map from the space of real-valued test functions on $\mathbb{R}$ to the space of real-valued random variables on some probability space $(\Omega, P)$. With every compact interval $I = [s,t]$ with $s < t$ one may consider the weak$^*$-closed subalgebra $\mathcal{A}_I$ of $L^\infty(\Omega, P)$ generated by random variables of the form $e^{i\phi(f)}$, $f$ ranging over all test functions supported in $I$. When the random distribution $\phi$ is stationary and has independent values at every point, this family of subalgebras of $L^\infty(\Omega, P)$ has properties analogous to (0.4) and (0.5), in that there is a one-parameter group of measure preserving automorphisms $\gamma = \{\gamma_t : t \in \mathbb{R}\}$ of $L^\infty(\Omega, P)$ which satisfies (0.5), and instead of (0.4) we have the assertion that the algebras $\mathcal{A}_{[r,s]}$ and $\mathcal{A}_{[s,t]}$ are probabilistically independent and generate $\mathcal{A}_{[r,t]}$ as a weak$^*$-closed algebra.
One should keep in mind, however, that this commutative analogy has serious limitations. For example, we have already pointed out that in the case of interactions there is typically no normal $\gamma$-invariant state on $B(H)$, and there is no reason to expect any normal state of $B(H)$ to decompose as a product state relative to the decompositions of (0.4).
There is also some common ground with the Boolean algebras of type I factors of Araki and Woods [1], but here too there are significant differences. For example, the local algebras of (0.3) and (0.4) are associated with intervals (and more generally with finite unions of intervals), but not with more general Borel sets as in [1]. Moreover, here the translation group acts as automorphisms of the given structure whereas in [1] there is no assumption of “stationarity” with respect to translations.
For our purposes, the local $C^*$-algebra $\mathcal{A}$ has two important features. First, it gives us a way of comparing $\omega_-$ and $\omega_+$. Indeed, both states $\omega_-$ and $\omega_+$ extend uniquely to $\gamma$-invariant states $\bar\omega_-$ and $\bar\omega_+$ of $\mathcal{A}$. We sketch the proof for $\omega_-$.
Proposition 0.1. There is a unique $\gamma$-invariant state $\bar\omega_-$ of $\mathcal{A}$ such that $\bar\omega_-\restriction_{\mathcal{A}_I} = \omega_-\restriction_{\mathcal{A}_I}$ for every compact interval $I \subseteq (-\infty, 0]$.
Proof. For existence of the extension, choose any compact interval $I = [a,b]$ and any operator $X \in \mathcal{A}_I$. Then for sufficiently large $s > 0$ we have $I - s \subseteq (-\infty, 0]$ and for these values of $s$, $\omega_-(\gamma_{-s}(X))$ does not depend on $s$ because $\omega_-$ is invariant under the action of $\{\gamma_t : t \le 0\}$. Thus we can define $\bar\omega_-(X)$ unambiguously by
$$\bar\omega_-(X) = \lim_{t\to-\infty} \omega_-(\gamma_t(X)).$$
This defines a positive linear functional $\bar\omega_-$ on the unital $*$-algebra $\cup_I\, \mathcal{A}_I$, and now we extend $\bar\omega_-$ to all of $\mathcal{A}$ by norm-continuity. The extended state is clearly invariant under the action of $\gamma_t$, $t \in \mathbb{R}$. The proof of uniqueness of the extension is straightforward, and we omit it. □
It is clear from the proof of Proposition 0.1 that these extensions of $\omega_-$ and $\omega_+$ are locally normal in the sense that their restrictions to any localized subalgebra $\mathcal{A}_I$ define normal states on that type I factor.
Definition 0.2. The interaction $(U, M)$, with past and future states $\omega_-$ and $\omega_+$, is said to be trivial if $\bar\omega_- = \bar\omega_+$.
More generally, the norm $\|\bar\omega_- - \bar\omega_+\|$ gives some measure of the “strength” of the interaction, and of course we have $0 \le \|\bar\omega_- - \bar\omega_+\| \le 2$.
If there is a normal state $\rho$ of $B(H)$ which is invariant under the action of $\gamma$, then since $\omega_-$ (resp. $\omega_+$) is the unique normal invariant state of $\alpha_-$ (resp. $\alpha_+$) we must have $\rho\restriction_M = \omega_-$, $\rho\restriction_{M'} = \omega_+$, and hence $\bar\omega_- = \bar\omega_+ = \rho\restriction_{\mathcal{A}}$ by the uniqueness part of Proposition 0.1. In particular, if the interaction is nontrivial then neither $\bar\omega_-$ nor $\bar\omega_+$ can be extended from $\mathcal{A}$ to a normal state of its strong closure $B(H)$.
The second important feature of $\mathcal{A}$ is that there is a definite “state of the past” and a definite “state of the future” in the following sense.
Proposition 0.3. For every $X \in \mathcal{A}$ and every normal state $\rho$ of $B(H)$ we have
$$\lim_{t\to-\infty} \rho(\gamma_t(X)) = \bar\omega_-(X), \qquad \lim_{t\to+\infty} \rho(\gamma_t(X)) = \bar\omega_+(X).$$
Proof. Consider the first limit formula. The set of all $X \in \mathcal{A}$ for which this formula holds is clearly closed in the operator norm, hence it suffices to show that it contains $\mathcal{A}_I$ for every compact interval $I \subseteq \mathbb{R}$. We will make use of the fact (discussed more fully at the beginning of Sect. 5) that if $\rho$ is any normal state of $M$ and $A$ is an operator in $M$ then
$$\lim_{t\to-\infty} \rho(\gamma_t(A)) = \omega_-(A),$$
see formula (4.1). Choosing a real number $T$ sufficiently negative that $I + T \subseteq (-\infty, 0]$, the preceding remark shows that for the operator $A = \gamma_T(X) \in M$ we have $\lim_{t\to-\infty} \rho(\gamma_t(A)) = \omega_-(A)$, and hence
$$\lim_{t\to-\infty} \rho(\gamma_t(X)) = \lim_{t\to-\infty} \rho(\gamma_{t-T}(\gamma_T(X))) = \omega_-(\gamma_T(X)) = \bar\omega_-(X).$$
The proof of the second limit formula is similar. □
Thus, whatever (normal) state $\rho$ one chooses to watch evolve over time on operators in $\mathcal{A}$, it settles down to become $\bar\omega_+$ in the distant future, it must have come from $\bar\omega_-$ in the remote past, and the limit states do not depend on the choice of $\rho$. For a trivial interaction, nothing happens over the long term: for fixed $X$ and $\rho$ the function $t \in \mathbb{R} \mapsto \rho(\gamma_t(X))$ starts out very near some value (namely $\bar\omega_-(X)$), exhibits transient fluctuations over some period of time, and then settles down near the same value again. For a nontrivial interaction, there will be a definite change from the limit at $-\infty$ to the limit at $+\infty$ (for some choices of $X \in \mathcal{A}$).
A number of questions arise naturally.
1) How does one construct examples of interactions?
2) How does one determine if a given interaction is nontrivial?
3) What $C^*$-dynamical systems can occur as the $C^*$-algebras of local observables associated with an interaction?
The purpose of this paper is to provide an effective partial solution of problem 1) and a complete solution of problem 2). The latter involves an inequality which we feel is of some interest in its own right. These results are summarized as follows.
By an eigenvalue list we mean a decreasing sequence of nonnegative real numbers $\lambda_1 \ge \lambda_2 \ge \dots$ with finite sum. Every normal state $\omega$ of a type I factor is associated with a positive operator of trace 1, whose eigenvalues counting multiplicity can be arranged
into an eigenvalue list which will be denoted $\Lambda(\omega)$. If the factor is finite dimensional, we still consider $\Lambda(\omega)$ to be an infinite list by adjoining zeros in the obvious way. Given two eigenvalue lists $\Lambda = \{\lambda_1 \ge \lambda_2 \ge \dots\}$ and $\Lambda' = \{\lambda'_1 \ge \lambda'_2 \ge \dots\}$, we will write
$$\|\Lambda - \Lambda'\| = \sum_{k=1}^{\infty} |\lambda_k - \lambda'_k|$$
for the $\ell^1$-distance from one list to the other. A classical result implies that if $\rho$ and $\sigma$ are normal states of a type I factor $M$, then we have $\|\Lambda(\rho) - \Lambda(\sigma)\| \le \|\rho - \sigma\|$ (see Sect. 3). Combining the results of [7] with the results of Sect. 1 below, we obtain the following result on the existence of interactions having arbitrary finite eigenvalue lists.
Theorem A. Let $n = 1, 2, \dots, \infty$ and let $\Lambda_-$ and $\Lambda_+$ be two eigenvalue lists, each of which has only finitely many nonzero terms. There is an interaction $(U, M)$ whose past and future states $\omega_-$, $\omega_+$ have eigenvalue lists $\Lambda_-$ and $\Lambda_+$, and whose past and future $E_0$-semigroups are both cocycle perturbations of the CAR/CCR flow of index $n$.
Remarks. Theorem A is established in Sect. 3. We conjecture that the finiteness hypothesis of Theorem A can be dropped. Theorem A gives examples of interactions, but it provides no information about whether or not these interactions are nontrivial. We will show that this is the case whenever the eigenvalue lists of $\omega_-$ and $\omega_+$ are different. That conclusion depends on the following, which is the main result of this paper (and which applies to interactions with arbitrary, i.e., not necessarily finitely nonzero, eigenvalue lists).
Theorem B. Let $(U, M)$ be an interaction with past and future states $\omega_-$ and $\omega_+$, and let $\bar\omega_-$ and $\bar\omega_+$ denote their extensions to $\gamma$-invariant states of $\mathcal{A}$. Then
$$\|\bar\omega_- - \bar\omega_+\| \ge \|\Lambda(\omega_- \otimes \omega_-) - \Lambda(\omega_+ \otimes \omega_+)\|.$$
Remarks. Theorem B is proved in Sect. 4. Notice the tensor product of states on the right. For example, $\Lambda(\omega_- \otimes \omega_-)$ is obtained from the eigenvalue list $\Lambda(\omega_-) = \{\lambda_1 \ge \lambda_2 \ge \dots\}$ of $\omega_-$ by rearranging the doubly infinite sequence of all products $\lambda_i\lambda_j$, $i, j = 1, 2, \dots$ into decreasing order. It can be an unpleasant combinatorial chore to calculate $\Lambda(\omega_- \otimes \omega_-)$ even when $\Lambda(\omega_-)$ is relatively simple and finitely nonzero; but we also show in Sect. 4 that if $A$ and $B$ are two positive trace class operators such that $\Lambda(A \otimes A) = \Lambda(B \otimes B)$, then $\Lambda(A) = \Lambda(B)$.
Thus we may conclude the following.
Corollary 0.4. Let $(U, M)$, $\omega_-$, $\omega_+$ be as in Theorem B, and let $\Lambda_-$ and $\Lambda_+$ be the eigenvalue lists of $\omega_-$ and $\omega_+$ respectively. If $\Lambda_- \ne \Lambda_+$, then the interaction is nontrivial.
The following implies that “strong” interactions exist.
Corollary 0.5. Let $n = 1, 2, \dots, \infty$ and choose $\epsilon > 0$. There is an interaction $(U, M)$ having past and future states $\omega_-$, $\omega_+$, such that $\alpha^-$ and $\alpha^+$ are cocycle perturbations of the CAR/CCR flow of index $n$, for which $\|\bar\omega_- - \bar\omega_+\| \ge 2 - \epsilon$.
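The lower bound of Theorem B is easy to evaluate for finitely nonzero lists. The following Python sketch is an illustration only; the function names and the choice of lists are assumptions made here, not taken from the paper. It computes the $\ell^1$-distance between the tensor squares of two eigenvalue lists, and evaluates it for a past state with list $\{1, 0, \dots\}$ and a future state with $r$ equal eigenvalues $1/r$. In that case the bound is $2 - 2/r^2$, which is consistent with the existence of “strong” interactions asserted in Corollary 0.5 once Theorems A and B are granted; the paper's own proof is not reproduced here.

```python
from fractions import Fraction
from itertools import zip_longest


def eigenvalue_list(values):
    """Nonzero eigenvalues in decreasing order (the trailing zeros are implicit)."""
    return sorted((Fraction(v) for v in values if v != 0), reverse=True)


def tensor(list_a, list_b):
    """Eigenvalue list of a tensor product: all products, rearranged in decreasing order."""
    return eigenvalue_list(a * b for a in list_a for b in list_b)


def l1_distance(list_a, list_b):
    """l^1-distance between two lists, padding the shorter one with zeros."""
    pairs = zip_longest(list_a, list_b, fillvalue=Fraction(0))
    return sum(abs(a - b) for a, b in pairs)


def theorem_b_lower_bound(past, future):
    """Right-hand side of the inequality of Theorem B for finitely nonzero lists."""
    return l1_distance(tensor(past, past), tensor(future, future))


def pure_vs_uniform(r):
    """Bound for a past list {1, 0, ...} and a future list with r entries equal to 1/r."""
    past = eigenvalue_list([1])
    future = eigenvalue_list([Fraction(1, r)] * r)
    return theorem_b_lower_bound(past, future)


if __name__ == "__main__":
    print(pure_vs_uniform(10))  # 99/50, i.e. 2 - 2/10**2
```

Exact rational arithmetic is used so that repeated eigenvalues of the tensor square are compared without rounding error.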
Theorem B depends on a more general result concerning the asymptotic behavior of eigenvalue lists, which may be of some interest on its own. Let $\alpha = \{\alpha_t : t \ge 0\}$ be an $E_0$-semigroup acting on $B(H)$, which is pure in the sense defined above. The commutants $N_t = \alpha_t(B(H))'$ are type I subfactors which increase with $t$, and because of purity their union is strongly dense in $B(H)$. Let $\rho$ be a normal state of $B(H)$. We require the following information concerning the behavior of the eigenvalue lists of the restrictions $\rho\restriction_{N_t}$ for large $t$.
Theorem C. Let $\alpha$ be a pure $E_0$-semigroup acting on $B(H)$, which has a normal invariant state $\omega$. Then for every normal state $\rho$ of $B(H)$ we have
$$\lim_{t\to\infty} \|\Lambda(\rho\restriction_{\alpha_t(M)'}) - \Lambda(\rho \otimes \omega)\| = 0.$$
Remarks. One might expect that since the $N_t$ increase to $B(H)$, the restriction of a normal state to $N_t$ should look like $\rho$ itself when $t$ is large. Indeed, if the invariant state $\omega$ is a vector state then its only nonzero eigenvalue is 1 and $\Lambda(\rho \otimes \omega) = \Lambda(\rho)$; in this case Theorem C implies that the restriction of $\rho$ to $N_t$ has almost the same list as $\rho$ when $t$ is large. On the other hand, if $\omega$ is not a vector state then $\Lambda(\rho \otimes \omega)$ is very different from $\Lambda(\rho)$, and Theorem C shows that this intuition is wrong. We also remark that Theorem C is itself a special case of a more general result that is independent of the theory of $E_0$-semigroups (see [9]).
1. Existence of Dynamics
Flows on spaces are described infinitesimally by vector fields. Flows on Hilbert spaces (that is to say, one-parameter unitary groups) are described infinitesimally by unbounded self-adjoint operators. In practice, one is usually presented with a symmetric operator $A$ that is not known to be self-adjoint (much like being presented with a differential equation that is not known to possess solutions for all time), and one wants to know if there is a one-parameter unitary group that can be associated with it. Precisely, one wants to know if $A$ can be extended to a self-adjoint operator.
This problem of the existence of dynamics was solved by von Neumann as follows. Every densely defined symmetric operator $A$ has an adjoint $A^*$ with dense domain $D^*$, and using $A^*$ one defines two deficiency spaces $E_-$, $E_+$ by $E_\pm = \{\xi \in D^* : A^*\xi = \pm i\xi\}$. von Neumann's result is that $A$ has self-adjoint extensions iff $\dim E_- = \dim E_+$ (see [15, Sect. XII.4]). Moreover, when $E_-$ and $E_+$ have the same dimension, von Neumann showed that for every unitary operator from $E_-$ to $E_+$ there is an associated self-adjoint extension of $A$.
The purpose of this section is to establish an analogous result which locates the obstruction to the existence of dynamics for pairs of $E_0$-semigroups of the simplest kind (Corollary 1.1 below).
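To make von Neumann's criterion concrete, here is a standard textbook illustration. It is not part of the paper's argument, and the operator and its domain are chosen here only for the example. Take $A = i\,\frac{d}{dx}$ on the smooth compactly supported functions in $L^2(0,\infty)$. Its adjoint acts as $A^*\xi = i\xi'$ on the absolutely continuous $\xi \in L^2(0,\infty)$ with $\xi' \in L^2(0,\infty)$, so the deficiency equations become
$$A^*\xi = i\xi \iff \xi' = \xi \iff \xi(x) = c\,e^{x}, \qquad A^*\xi = -i\xi \iff \xi' = -\xi \iff \xi(x) = c\,e^{-x}.$$
Only $e^{-x}$ is square integrable on $(0,\infty)$, hence
$$\dim E_+ = 0, \qquad \dim E_- = 1,$$
and this $A$ has no self-adjoint extension. On the bounded interval $(0,1)$ both exponentials are square integrable, so $\dim E_- = \dim E_+ = 1$ and the self-adjoint extensions are parametrized by the unitary operators from $E_-$ to $E_+$, that is, by a single phase.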
Corollary 1.1 is based on the following more general result. Let $M$ be a type I subfactor of $B(H)$, and let $\alpha$, $\beta$ be two $E_0$-semigroups acting, respectively, on $M$ and its commutant $M'$. We want to examine conditions under which there is a one-parameter unitary group $U = \{U_t : t \in \mathbb{R}\}$ acting on $H$ whose associated automorphism group $\gamma_t(A) = U_t A U_t^*$ has $\alpha$ as its past and $\beta$ as its future in the sense that
$$\gamma_{-t}\restriction_M = \alpha_t, \qquad \gamma_t\restriction_{M'} = \beta_t, \qquad t \ge 0. \tag{1.1}$$
The following result asserts that there is such a unitary group $U$ if and only if the product systems of $\alpha$ and $\beta$ are anti-isomorphic.
Theorem. Let $E^\alpha = \{E^\alpha(t) : t > 0\}$ and $E^\beta = \{E^\beta(t) : t > 0\}$ be the respective product systems of $\alpha$ and $\beta$,
$$E^\alpha(t) = \{x \in M : \alpha_t(y)x = xy,\ y \in M\}, \qquad E^\beta(t) = \{x' \in M' : \beta_t(y')x' = x'y',\ y' \in M'\},$$
and assume that there is a one-parameter unitary group $U = \{U_t : t \in \mathbb{R}\}$ whose associated automorphism group satisfies (1.1). Then $E^\alpha$ and $E^\beta$ are anti-isomorphic. Indeed, for every $t > 0$ we have $U_t E^\alpha(t) = E^\beta(t)$, and the map $\theta : E^\alpha \to E^\beta$ defined by
$$\theta(v) = U_t v, \qquad v \in E^\alpha(t),\ t > 0, \tag{1.2}$$
is an anti-isomorphism of product systems (i.e., it is a Borel-measurable map which is unitary on fibers, and which satisfies $\theta(vw) = \theta(w)\theta(v)$ for every $v \in E^\alpha(s)$, $w \in E^\alpha(t)$, $s, t > 0$).
Conversely, if $\theta : E^\alpha \to E^\beta$ is any anti-isomorphism of product systems, then for every $t > 0$ there is a unique unitary operator $U_t \in B(H)$ which satisfies (1.2) for every $v \in E^\alpha(t)$. $\{U_t : t > 0\}$ is a strongly continuous semigroup of unitary operators tending strongly to the identity as $t \to 0+$, and its natural extension to a one-parameter unitary group gives rise to an automorphism group $\gamma$ which satisfies (1.1).
Proof. Assume that $\gamma_t(A) = U_t A U_t^*$, $t \in \mathbb{R}$ satisfies (1.1). Fix $t > 0$. We claim first that $U_t E^\alpha(t) \subseteq M'$. Indeed, if $x \in M$ then for every $v \in E^\alpha(t)$ we have
$$x U_t v = U_t \gamma_{-t}(x) v = U_t \alpha_t(x) v = U_t v x.$$
Next, we claim that $U_t E^\alpha(t) \subseteq E^\beta(t)$. For $v \in E^\alpha(t)$, the preceding shows that $U_t v \in M'$, so it suffices to show that $\beta_t(y) U_t v = U_t v y$ for every $y \in M'$. For that, write
$$\beta_t(y) U_t v = \gamma_t(y) U_t v = U_t y U_t^* U_t v = U_t y v = U_t v y,$$
the last equality because $v \in M$ commutes with $y \in M'$.
Next, note that $E^\beta(t) \subseteq U_t E^\alpha(t)$. Choosing $w \in E^\beta(t)$, set $v = U_t^* w$. Note that $v \in M$ because for every $y \in M'$ we have
$$y v = y U_t^* w = U_t^* \gamma_t(y) w = U_t^* \beta_t(y) w = U_t^* w y = v y.$$
Note next that the element $v = U_t^* w \in M$ actually belongs to $E^\alpha(t)$. Indeed, for every $x \in M$ we have $\alpha_t(x) v = \alpha_t(x) U_t^* w$.
Since $\gamma_{-t}$ restricts to $\alpha_t$ on $M$, we have $\gamma_t(\alpha_t(x)) = x$ and the right side can be written
$$U_t^* \gamma_t(\alpha_t(x)) w = U_t^* x w = U_t^* w x = v x.$$
The above shows that for every $t > 0$ we have a linear map $\theta_t : E^\alpha(t) \to E^\beta(t)$ defined by $\theta_t(v) = U_t v$. By assembling these maps we get a Borel-measurable map $\theta : E^\alpha \to E^\beta$ which is linear on fibers. Notice that $\theta_t$ is actually unitary, since for $v_1, v_2 \in E^\alpha(t)$ we have
$$\langle v_1, v_2\rangle 1 = v_2^* v_1 = (U_t v_2)^*(U_t v_1) = \theta(v_2)^*\theta(v_1) = \langle \theta(v_1), \theta(v_2)\rangle 1.$$
Finally, $\theta$ is an anti-isomorphism, because for $v \in E^\alpha(s)$, $w \in E^\alpha(t)$ we have
$$\theta(vw) = U_{s+t} v w = U_t (U_s v) w = U_t \theta(v) w = U_t w \theta(v) = \theta(w)\theta(v).$$
To prove the converse, fix an anti-isomorphism $\theta : E^\alpha \to E^\beta$. For every $t > 0$ pick an orthonormal basis $e_1(t), e_2(t), \dots$ for $E^\alpha(t)$ (we will have to choose more carefully presently, but for the moment we choose an arbitrary orthonormal basis for each fiber space). For every $t > 0$ define an operator $U_t \in B(H)$ by
$$U_t = \sum_{n=1}^{\infty} \theta(e_n(t))\, e_n(t)^*.$$
One checks easily that $U_t U_t^* = U_t^* U_t = 1$, hence $U_t$ is unitary. $U_t$ also satisfies (1.2), for if $v \in E^\alpha(t)$ then we have $e_n(t)^* v = \langle v, e_n(t)\rangle 1$ and hence
$$U_t v = \sum_{n=1}^{\infty} \langle v, e_n(t)\rangle\, \theta(e_n(t)) = \theta\Big(\sum_{n} \langle v, e_n(t)\rangle\, e_n(t)\Big) = \theta(v).$$
Note too that since the ranges of the operators in $E^\alpha(t)$ span $H$, any operator $U_t$ that satisfies (1.2) is determined uniquely. In particular, $U_t$ does not depend on the choice of orthonormal basis $\{e_n(t)\}$ for $E^\alpha(t)$.
We may choose the orthonormal basis $\{e_n(t)\}$ so that each section $t \mapsto e_n(t) \in E^\alpha(t)$ is Borel measurable (because of the measurability axiom of product systems [2, Property 1.8 (iii)]), and once this is done we find that the function $t \in (0, \infty) \mapsto U_t \in B(H)$ is Borel measurable.
We claim next that $\{U_t : t > 0\}$ is a semigroup. Indeed, if $w \in E^\alpha(s)$, $v \in E^\alpha(t)$, then since $\theta(v) \in M'$ commutes with $w \in M$ we have
$$U_s U_t v w = U_s \theta(v) w = U_s w \theta(v) = \theta(w)\theta(v) = \theta(vw) = U_{s+t} v w.$$
Since $E^\alpha(s+t)$ is spanned by such products $vw$ and since $E^\alpha(s+t)H$ spans $H$, we conclude that $U_s U_t = U_{s+t}$.
At this point, we use the measurability proposition [2, Prop. 2.5 (ii)] (stated there for the more general case of cocycles) to conclude that a) $U_t$ is strongly continuous in $t$ for $t > 0$, and b) $U_t$ tends strongly to 1 as $t \to 0+$. Now extend $U$ in the obvious way to obtain a strongly continuous one-parameter unitary group acting on $H$.
Let $\gamma_t(A) = U_t A U_t^*$, $A \in B(H)$, $t \in \mathbb{R}$. It remains to show that for every $t > 0$ we have $\gamma_{-t}\restriction_M = \alpha_t$ and $\gamma_t\restriction_{M'} = \beta_t$. Choose $x \in M$. To show that $\gamma_{-t}(x) = \alpha_t(x)$, it suffices to show that $\gamma_{-t}(x)v = \alpha_t(x)v$ for every $v \in E^\alpha(t)$ (because $H$ is spanned by the ranges of the operators in $E^\alpha(t)$). But for such a $v$ we have
$$\gamma_{-t}(x) v = U_{-t} x U_t v = U_{-t} x \theta(v) = U_{-t} \theta(v) x = v x = \alpha_t(x) v.$$
Choose $y \in M'$. To show that $\gamma_t(y) = \beta_t(y)$ it suffices to show that $\gamma_t(y)w = \beta_t(y)w$ for all $w \in E^\beta(t)$. For such a $w$ we have $w = \theta(v) = U_t v$ for some $v \in E^\alpha(t)$, hence
$$\gamma_t(y) w = U_t y U_t^* U_t v = U_t y v = U_t v y = w y = \beta_t(y) w,$$
and the proof is complete. □
We view the following result as a counterpart for noncommutative dynamics of von Neumann's theorem on the existence of self-adjoint extensions of symmetric operators in terms of deficiency indices.
Corollary 1.1. Let $\alpha$ and $\beta$ be two $E_0$-semigroups, acting on $B(H)$ and $B(K)$ respectively, each of which is a cocycle perturbation of a CCR/CAR flow. There is a one-parameter group of automorphisms of $B(H \otimes K)$ which satisfies the condition of (1.1) if, and only if, $\alpha$ and $\beta$ have the same numerical index.
Proof. Consider the type I subfactor $M$ of $B(H \otimes K)$ defined by $M = B(H) \otimes 1_K$. We have $M' = 1_H \otimes B(K)$, and $\alpha$ (resp. $\beta$) is conjugate to the action on $M$ (resp. $M'$) defined by $A \otimes 1_K \mapsto \alpha_t(A) \otimes 1_K$ (resp. $1_H \otimes B \mapsto 1_H \otimes \beta_t(B)$), $t \ge 0$.
Now the product system of any CAR/CCR flow is anti-isomorphic to itself. This follows, for example, from the structural results on divisible product systems of [2, Sect. 6]. Alternatively, one can simply write down explicit anti-automorphisms of the product systems described on pp. 12–14 of [2]. Since the structure of the product system of any $E_0$-semigroup is stable under cocycle perturbations, the same is true of cocycle perturbations of CAR/CCR flows.
The preceding theorem implies that there is a one-parameter group of automorphisms $\gamma = \{\gamma_t : t \in \mathbb{R}\}$ of $B(H \otimes K)$ satisfying
$$\gamma_{-t}(A \otimes 1_K) = \alpha_t(A) \otimes 1_K, \qquad \gamma_t(1_H \otimes B) = 1_H \otimes \beta_t(B)$$
for every $t \ge 0$ iff the product systems $E^\alpha$ and $E^\beta$ are anti-isomorphic. The preceding paragraph shows that this is true iff $E^\alpha$ and $E^\beta$ are isomorphic; and since $\alpha$ and $\beta$ are simply cocycle perturbations of CAR/CCR flows, the latter holds iff $\alpha$ and $\beta$ have the same numerical index. □
Corollary 1.2. Let $\alpha$ and $\beta$ be two pure $E_0$-semigroups which are cocycle-conjugate to the CAR/CCR flow of index $n = 1, 2, \dots, \infty$. Then there is a history $(U, M)$ whose past and future semigroups are conjugate, respectively, to $\alpha$ and $\beta$.
Remarks on the existence and nonexistence of dynamics: general case. It is natural to ask if every $E_0$-semigroup $\alpha$ can represent both the past and future of some history. More precisely, is there a history whose past and future $E_0$-semigroups are both conjugate to cocycle perturbations of $\alpha$? This is certainly the case for the CAR/CCR flows, by Corollary 1.2. But in general the answer can be no. We have recently received a manuscript of Boris Tsirelson [34] in which examples of product systems are constructed which are not anti-isomorphic to themselves. It is shown in [3] that every abstract product system is isomorphic to the product system of some $E_0$-semigroup. It follows that there are $E_0$-semigroups $\alpha$ whose product systems are not anti-isomorphic to themselves. Since cocycle perturbations of $E_0$-semigroups must have isomorphic product systems, the theorem proved above implies that such $E_0$-semigroups (i.e., those whose product systems are not anti-isomorphic to themselves) cannot serve as both the past and future of any history.
On the other hand, for every $E_0$-semigroup $\alpha$ acting on $B(H)$, which is pure in the sense that $\cap_t\, \alpha_t(B(H)) = \mathbb{C}\cdot 1$, there is a history whose past is conjugate to $\alpha$. To see why this is so, let $E$ be its product system. Let $E^*$ be the product system opposite to $E$ ($E^*$ is defined as the same structure as $E$ except for multiplication, and in $E^*$ multiplication is defined by reversing the multiplication of $E$).
Since $E^*$ is a product system, the results of [3] imply that there is an $E_0$-semigroup $\beta$, acting on $B(K)$, whose product system is isomorphic to $E^*$ and therefore anti-isomorphic to $E$. We may conclude from the
theorem proved above that there is a one-parameter group of automorphisms $\gamma$ acting on $B(H \otimes K)$ which satisfies (1.1) by having $\alpha \otimes 1$ (acting on $B(H) \otimes 1$) as its past and $1 \otimes \beta$ (acting on $1 \otimes B(K)$) as its future.
2. Eigenvalue Lists of Normal States
In this section we emphasize the importance of the “eigenvalue list” invariant that can be associated with normal states of type I factors, and we summarize its basic properties. An eigenvalue list is a decreasing sequence $\lambda_1 \ge \lambda_2 \ge \dots$ of nonnegative real numbers satisfying $\sum_n \lambda_n < \infty$. If $\Lambda = \{\lambda_1 \ge \lambda_2 \ge \dots\}$ and $\Lambda' = \{\lambda'_1 \ge \lambda'_2 \ge \dots\}$ are two such lists we write
$$\|\Lambda - \Lambda'\| = \sum_{n=1}^{\infty} |\lambda_n - \lambda'_n|$$
for the $\ell^1$-distance from $\Lambda$ to $\Lambda'$, thereby making the space of all eigenvalue lists into a complete metric space.
Let $A$ be a positive trace class operator acting on a separable Hilbert space $H$. The positive eigenvalues of $A$ (counting multiplicity) can be arranged in decreasing order, and if there are only finitely many nonzero eigenvalues then we extend the list by appending zeros in the obvious way. This defines the eigenvalue list $\Lambda(A)$ of $A$. Notice that even when $H$ is finite dimensional, $\Lambda(A)$ is an infinite list. The following basic properties of eigenvalue lists will be used repeatedly.
Proposition 2.1.
2.1.1 For every positive trace class operator $A$ we have $\Lambda(A) = \Lambda(A \oplus 0_\infty)$, $0_\infty$ denoting the infinite dimensional zero operator.
2.1.2 For positive trace class operators $A$ and $B$, $\Lambda(A) = \Lambda(B)$ iff $A \oplus 0_\infty$ is unitarily equivalent to $B \oplus 0_\infty$.
2.1.3 If $L$ is any Hilbert–Schmidt operator from a Hilbert space $H_1$ to a Hilbert space $H_2$, then $\Lambda(L^*L) = \Lambda(LL^*)$.
2.1.4 For positive trace class operators $A$, $B$ we have $\Lambda(A) = \Lambda(B)$ iff $\operatorname{trace}(A^n) = \operatorname{trace}(B^n)$ for every $n = 1, 2, \dots$
Proof. The assertion (2.1.1) is obvious, and (2.1.2) follows after a routine application of the spectral theorem for self-adjoint compact operators.
Proof of (2.1.3). Let $K_1 \subseteq H_1$ be the initial space of $L$ and let $K_2 = LK_1 \subseteq H_2$ be its closed range. The polar decomposition implies that $L^*L\restriction_{K_1}$ and $LL^*\restriction_{K_2}$ are unitarily equivalent. Hence $L^*L \oplus 0_\infty$ and $LL^* \oplus 0_\infty$ are unitarily equivalent and the assertion (2.1.3) follows from (2.1.2). □
Proof of (2.1.4). If $\Lambda(A) = \{\lambda_1 \ge \lambda_2 \ge \dots\}$ then
$$\operatorname{trace}(A^n) = \sum_{k=1}^{\infty} \lambda_k^n, \qquad n = 1, 2, \dots$$
Thus $\Lambda(A) = \Lambda(B)$ implies that $\operatorname{trace}(A^n) = \operatorname{trace}(B^n)$ for every $n \ge 1$. Conversely, suppose that $\operatorname{trace}(A^n) = \operatorname{trace}(B^n)$ for every $n = 1, 2, \dots$ Choose a positive number $M$ so large that the interval $[0, M]$ contains the spectra of both operators
$A$ and $B$. The linear functional $f \mapsto \operatorname{trace}(Af(A))$ defined on the commutative $C^*$-algebra $C[0, M]$ is positive, hence there is a unique finite positive measure $\mu_A$ defined on $[0, M]$ such that
$$\int_0^M f(x)\, d\mu_A(x) = \operatorname{trace}(Af(A)), \qquad f \in C[0, M].$$
The restriction of $\mu_A$ to $(0, M]$ is concentrated on $\sigma(A) \cap (0, M]$, and for every positive eigenvalue $\lambda$ of $A$ we have $\mu_A(\{\lambda\}) = \lambda \cdot (\text{multiplicity of } \lambda)$. Doing the same for the operator $B$, we find that by hypothesis
$$\int_0^M x^n\, d\mu_A(x) = \int_0^M x^n\, d\mu_B(x), \qquad n = 0, 1, 2, \dots,$$
and hence by the Weierstrass approximation theorem $\mu_A$ and $\mu_B$ define the same linear functional on $C[0, M]$. It follows that $\mu_A = \mu_B$, and the preceding observations lead us to conclude that $\Lambda(A) = \Lambda(B)$. □
We will also make use of the following classical result, originating in work of Hermann Weyl around 1912.
Proposition 2.2. If $A$, $B$ are positive trace class operators acting on the same Hilbert space $H$, then
$$\|\Lambda(A) - \Lambda(B)\| \le \operatorname{trace}|A - B|.$$
Proof. A proof can be found in the appendix of [29]. □
Remarks. Notice that since $\Lambda(A)$ depends only on the unitary equivalence class of $A$, Proposition 2.2 actually implies that
$$\|\Lambda(A) - \Lambda(B)\| \le \inf_{A', B'} \operatorname{trace}|A' - B'|,$$
where $A'$ (resp. $B'$) ranges over all operators unitarily equivalent to $A$ (resp. $B$). Indeed, though we do not require the fact, it is not hard to show that $\|\Lambda(A) - \Lambda(B)\|$ is exactly the distance (relative to the trace norm) from the unitary equivalence class of $A \oplus 0_\infty$ to the unitary equivalence class of $B \oplus 0_\infty$. Thus the eigenvalue list $\Lambda(A)$ provides a more-or-less complete invariant for classifying positive trace class operators up to unitary equivalence.
On the other hand, the eigenvalue list is also a subtle invariant. To illustrate the point, suppose that $A$ has only two positive eigenvalues $3/4$ and $1/4$, and that $B$ has only three positive eigenvalues $3/5$, $1/5$, $1/5$. The spectrum of $A \oplus B$ is the union of the spectra and the spectrum of $A \otimes B$ is the set of products of elements from the two spectra; however, both of these sets must be rearranged in decreasing order. Thus
$$\Lambda(A \oplus B) = \{3/4,\, 3/5,\, 1/4,\, 1/5,\, 1/5,\, 0, \dots\},$$
$$\Lambda(A \otimes B) = \{9/20,\, 3/20,\, 3/20,\, 3/20,\, 1/20,\, 1/20,\, 0, \dots\}.$$
Notice that $A$ has only eigenvalues of multiplicity 1, $B$ has eigenvalues of multiplicities 1 and 2, but that $A \otimes B$ has an eigenvalue of “peculiar” multiplicity 3. In the case of
larger spectra, the relation between say $\Lambda(A \otimes B)$ and the individual lists $\Lambda(A)$ and $\Lambda(B)$ depends in a complex way on the relative sizes of eigenvalues, and the problem of rearranging the set of products into decreasing order can be a difficult combinatorial chore.
Turning now to normal states, let $M$ be a type $\mathrm{I}_n$ factor, $n = 1, 2, \dots, \infty$ (one can assume without essential loss that $M$ is concretely represented as a subfactor of $B(H)$ for some Hilbert space $H$), and let $\rho$ be a normal state of $M$. There is a Hilbert space $K$ of dimension $n$ such that $M$ is isomorphic as a $*$-algebra to $B(K)$, and in this case any such $*$-isomorphism must be isometric and normal. Thus we may identify $\rho$ with a normal state of $B(K)$, and consequently there is a positive operator $R \in B(K)$ of trace 1 such that $\rho(T) = \operatorname{trace}(RT)$, $T \in B(K)$. The eigenvalue list of $\rho$ is defined by $\Lambda(\rho) = \Lambda(R)$. The preceding discussion leads immediately to the following.
Proposition 2.3.
2.3.1 If $\rho_1$ and $\rho_2$ are normal states of type I factors $M_1$ and $M_2$, and if $\rho_1$ and $\rho_2$ are conjugate in the sense that there is a $*$-isomorphism $\theta$ of $M_1$ onto $M_2$ such that $\rho_2 \circ \theta = \rho_1$, then $\Lambda(\rho_1) = \Lambda(\rho_2)$.
2.3.2 If $\rho_1$ and $\rho_2$ are two normal states of a type I factor $M$, then $\|\Lambda(\rho_1) - \Lambda(\rho_2)\| \le \|\rho_1 - \rho_2\|$.
Proof. The first assertion is apparent after we realize $M_k$ as $B(H_k)$, $k = 1, 2$, use the fact that a $*$-isomorphism of $B(H_1)$ onto $B(H_2)$ is implemented by a unitary operator from $H_1$ to $H_2$, and make use of (2.1.2). The second assertion is the inequality of Proposition 2.2. □
3. CP Semigroups and the Existence of Interactions
The corollary of Sect. 1 implies that any pair of pure $E_0$-semigroups $\alpha_-$, $\alpha_+$, which are both cocycle conjugate to the same CAR/CCR flow, can be assembled so as to obtain a history $(U, M)$ whose past and future $E_0$-semigroups are conjugate to $\alpha_-$ and $\alpha_+$. Moreover, if both $\alpha_-$ and $\alpha_+$ have normal invariant states then $(U, M)$ is in fact an interaction. Thus we are led to ask what the possibilities are.
More precisely, suppose we are given an eigenvalue list $\Lambda = \{\lambda_1 \ge \lambda_2 \ge \dots\}$ with $\sum_n \lambda_n = 1$ and a number $n = 1, 2, \dots, \infty$. Does there exist a cocycle perturbation $\alpha$ of the CAR/CCR flow of index $n$ which is pure, and which leaves invariant a normal state whose eigenvalue list is $\Lambda$? We do not know the answer in general, but we conjecture that it is yes. The purpose of this section is to provide an affirmative answer for the cases in which $\Lambda$ has only a finite number of nonzero terms (Theorem A). This is essentially the main result of [7] (together with Corollary 1.1), and we merely summarize the main ideas so as to emphasize the role of dilation theory and semigroups of completely positive maps (sometimes called quantum dynamical semigroups) acting on matrix algebras, for such constructions.
Suppose that $\alpha = \{\alpha_t : t \ge 0\}$ is an $E_0$-semigroup acting on $B(H)$, and assume further that there is a normal state $\omega$ of $B(H)$ which is invariant, $\omega \circ \alpha_t = \omega$, $t \ge 0$. Letting $\Omega$ be the density operator of $\omega$, $\omega(T) = \operatorname{trace}(\Omega T)$, $T \in B(H)$, then the projection $P$ on the closed range of $\Omega$ is the support projection of $\omega$, i.e., $P^\perp$ is the largest projection on which $\omega$ vanishes. Using $\omega \circ \alpha_t = \omega$, we find that $\omega(1 - \alpha_t(P)) = \omega(\alpha_t(P^\perp)) = \omega(P^\perp) = 0$, hence $1 - \alpha_t(P) \le 1 - P$, hence
\[ \alpha_t(P) \ge P, \qquad t \ge 0. \tag{3.1} \]
The inequality (3.1) has the following consequence. If we identify $B(PH)$ with the corner $PB(H)P$, then for every $t \ge 0$ we can compress $\alpha_t$ so as to obtain a completely positive map $\phi_t$ on $B(PH)$,
\[ \phi_t(X) = P\alpha_t(X)\restriction_{PH}, \qquad X \in PB(H)P. \]
More significantly, because of (3.1) we have the semigroup property $\phi_s \circ \phi_t = \phi_{s+t}$, as one can easily verify using $P\alpha_s(A)P = P\alpha_s(PAP)P$ for $A \in B(H)$. Thus we have defined a semigroup $\phi = \{\phi_t : t \ge 0\}$ of normal completely positive maps of $B(PH)$ satisfying $\phi_t(1) = 1$ for $t \ge 0$, together with the natural continuity property
\[ \lim_{t \to t_0} \langle \phi_t(X)\xi, \eta \rangle = \langle \phi_{t_0}(X)\xi, \eta \rangle, \qquad \xi, \eta \in PH,\ X \in B(PH). \]

We appear to have lost ground, in that we started with a semigroup of $*$-endomorphisms and now have merely a semigroup of completely positive maps. However, notice that the restriction of $\omega$ to $B(PH) = PB(H)P$ is a faithful normal state which is invariant under the action of $\phi$, $\omega \circ \phi_t = \omega$, $t \ge 0$. Notice too that in case there are only a finite number of positive eigenvalues in the list $\Lambda(\omega)$ then $PH$ is finite dimensional, and thus $\phi = \{\phi_t : t \ge 0\}$ is a CP semigroup acting essentially on a matrix algebra, which leaves invariant a faithful state with prescribed eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r > 0$. If $\alpha$ began life as a pure $E_0$-semigroup then $\omega$ is an absorbing state for $\phi$ in the sense that for every normal state $\rho$ of $B(PH)$,
\[ \lim_{t \to \infty} \|\rho \circ \phi_t - \omega\| = 0. \tag{3.2} \]
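The three properties just listed (unitality, invariance of a faithful state, and the absorbing property (3.2)) can be seen in a toy example on $M_2(\mathbb{C})$. The sketch below is not the construction of [7]; it assumes the simple family $\phi_t(X) = e^{-t}X + (1 - e^{-t})\,\omega(X)\,1$, which is completely positive because it is a convex combination of the identity map and the map $X \mapsto \omega(X)1$. All names (`phi`, `state`, `OMEGA`, `RHO`, `X`) are hypothetical and chosen only for this illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def state(density, x):
    """Value of the state with the given density matrix: trace(density * x)."""
    return trace(matmul(density, x))

def phi(t, x, density):
    """Toy map phi_t(X) = e^{-t} X + (1 - e^{-t}) omega(X) 1 (an assumption,
    not the semigroup constructed in the paper)."""
    n = len(x)
    w = state(density, x)
    d = math.exp(-t)
    return [[d * x[i][j] + ((1.0 - d) * w if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

OMEGA = [[0.75, 0.0], [0.0, 0.25]]  # faithful state, eigenvalue list {3/4, 1/4}
RHO = [[0.0, 0.0], [0.0, 1.0]]      # some other state
X = [[1.0, 2.0], [3.0, 4.0]]        # an arbitrary test matrix

for t in (0.0, 1.0, 5.0, 20.0):
    gap = abs(state(RHO, phi(t, X, OMEGA)) - state(OMEGA, X))
    print(t, gap)  # the gap shrinks like e^{-t}
```

Here $\rho \circ \phi_t - \omega = e^{-t}(\rho - \omega)$, so the decay in (3.2) is exponential; semigroups arising from a general pure $E_0$-semigroup need not converge at this rate.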
Conversely and most significantly, if we can create a pair $(\phi, \omega)$ satisfying the conditions described above then it is possible to reconstruct a pair $(\alpha, \omega)$ consisting of an $E_0$-semigroup $\alpha$ having an invariant normal state $\omega$ with the expected eigenvalue list by a “dilation” procedure which reverses the “compression” procedure we have described above. Moreover, if the CP semigroup $\phi$ has a bounded generator (as it will surely have in the case where $PH$ is finite dimensional), then its dilation to an $E_0$-semigroup will be cocycle-conjugate to a CAR/CCR flow whose index can be calculated directly in terms of $\phi$ (the details can be found in [6] and [7]). The following summarizes the result of the construction of $(\phi, \omega)$ for finite eigenvalue lists given in [7, Theorem 5.1].
Theorem 3.1. Let $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r > 0$ be a list of positive numbers and let $\omega$ be a state of the matrix algebra $M_r(\mathbb{C})$ whose density operator has this ordered list of eigenvalues. There is a semigroup $\phi = \{\phi_t : t \ge 0\}$ of unital completely positive maps on $M_r(\mathbb{C})$ which leaves $\omega$ invariant, satisfies (3.2), and which can be dilated to a pure cocycle perturbation of a CAR/CCR flow having a normal invariant state whose eigenvalue list has exactly $\lambda_1 \ge \dots \ge \lambda_r$ as its nonzero elements.

Theorem 3.1 leads to the following (see pp. 40–42 of [7]).

Corollary 3.2. Let $n = 1, 2, \dots, \infty$ and let $\Lambda = \{\lambda_1 \ge \lambda_2 \ge \dots\}$ be an eigenvalue list which has only a finite number of nonzero terms. There is a cocycle perturbation $\alpha$ of the CAR/CCR flow of index $n$ which is pure, and which has an invariant normal state with eigenvalue list $\Lambda$.

Using Corollary 1.1, we deduce Theorem A of the introduction.

Theorem A. Let $n = 1, 2, \dots, \infty$ and let $\Lambda_-$ and $\Lambda_+$ be two eigenvalue lists having only a finite number of nonzero terms. There is an interaction $(U, M)$ whose past and future normal states $\omega_-$, $\omega_+$ have eigenvalue lists $\Lambda_-$, $\Lambda_+$ respectively, and whose past and future $E_0$-semigroups are cocycle conjugate to the CAR/CCR flow of index $n$.

4. The Interaction Inequality

Theorem A provides many examples of interactions, but it says nothing about whether or not these interactions are nontrivial. For that we need the inequality of Theorem B of the introduction. The purpose of this section is to prove Theorem B and discuss its consequences for interactions. Theorem B is based on the following more general result about $E_0$-semigroups. An $E_0$-semigroup $\alpha = \{\alpha_t : t \ge 0\}$ acting on $B(H)$ is said to be pure if
\[ \bigcap_{t \ge 0} \alpha_t(B(H)) = \mathbb{C} \cdot 1. \]
Purity implies that for any two normal states $\rho_1$ and $\rho_2$,
\[ \lim_{t \to \infty} \|\rho_1 \circ \alpha_t - \rho_2 \circ \alpha_t\| = 0, \]
see Proposition 1.1 of [7]. In particular, if there is a normal state $\omega$ which is invariant under $\alpha$ in the sense that $\omega \circ \alpha_t = \omega$ for every $t \ge 0$ then $\omega$ must be an absorbing state in the sense that for every normal state $\rho$ of $B(H)$ we have
\[ \lim_{t \to \infty} \|\rho \circ \alpha_t - \omega\| = 0. \tag{4.1} \]
Thus, if a pure $E_0$-semigroup has a normal invariant state then it is unique, and in particular the eigenvalue list $\Lambda(\omega)$ of a normal invariant state $\omega$ provides a conjugacy invariant of pure $E_0$-semigroups.

Given a pure $E_0$-semigroup acting on $B(H)$, the commutants $N_t = \alpha_t(B(H))'$ are type I subfactors of $B(H)$ which increase with $t$, and by purity their union is a strongly dense $*$-subalgebra of $B(H)$. Let $\rho$ be any normal state of $B(H)$. Since $N_t$ is a type I factor, the restriction of $\rho$ to $N_t$ has an eigenvalue list, defined as in Sect. 2. The following result shows how these eigenvalue lists behave for large $t$.
Theorem C. Let $\alpha = \{\alpha_t : t \ge 0\}$ be a pure $E_0$-semigroup having a normal invariant state $\omega$, and let $N_t$ be the commutant $\alpha_t(B(H))'$. Then for every normal state $\rho$ of $B(H)$ we have
\[ \lim_{t \to \infty} \|\Lambda(\rho\restriction_{N_t}) - \Lambda(\rho \otimes \omega)\| = 0. \]
The proof of Theorem C requires some preparation.

Lemma 4.1. Let $\{A_i : i \in I\}$ be a net of positive trace class operators acting on a Hilbert space $H$ and let $B$ be a positive trace class operator such that $\operatorname{trace}(A_i) = \operatorname{trace}(B)$ for every $i \in I$. Suppose there is a set $S \subseteq H$, having $H$ as its closed linear span, such that
\[ \lim_i \langle A_i\xi, \eta \rangle = \langle B\xi, \eta \rangle, \qquad \xi, \eta \in S. \]
Then $\operatorname{trace}|A_i - B| \to 0$, as $i \to \infty$.

Proof. By Proposition 1.6 of [7] it suffices to show that
\[ \lim_{i \to \infty} \operatorname{trace}(A_iK) = \operatorname{trace}(BK) \]
for every compact operator $K \in B(H)$. The set $\mathcal{S}$ of compact operators $K$ for which the assertion is true is a norm-closed linear space which contains all rank-one operators of the form $\zeta \mapsto \langle \zeta, \xi \rangle \eta$, with $\xi, \eta \in S$. Since $S$ spans $H$, it follows that $\mathcal{S}$ is the space of all compact operators. □

The next three lemmas relate to the following situation. We are given a normal $*$-endomorphism $\alpha$ of $B(H)$ satisfying $\alpha(1) = 1$. Let $E$ be the linear space of operators
\[ E = \{v \in B(H) : \alpha(x)v = vx,\ x \in B(H)\}. \]
If $u, v$ are any two elements of $E$ then $v^*u$ is a scalar multiple of the identity operator, and in fact $E$ is a Hilbert space relative to the inner product defined on it by $v^*u = \langle u, v \rangle_E 1$. For any orthonormal basis $v_1, v_2, \dots$ of $E$ we have
\[ \alpha(x) = \sum_n v_n x v_n^*, \qquad x \in B(H). \]
Let $\rho$ be a normal state of $B(H)$. It is clear that $u, v \in E \mapsto \rho(uv^*)$ defines a bounded sesquilinear form on the Hilbert space $E$, hence by the Riesz lemma there is a unique bounded operator $A \in B(E)$ such that $\langle Au, v \rangle_E = \rho(uv^*)$, $u, v \in E$. $A$ is obviously a positive operator and in fact we have $\operatorname{trace} A = 1$, since for any orthonormal basis $v_1, v_2, \dots$ for $E$,
\[ \operatorname{trace} A = \sum_n \langle Av_n, v_n \rangle = \sum_n \rho(v_n v_n^*) = \rho(\alpha(1)) = \rho(1) = 1. \]
The following result shows how to compute the eigenvalue list of the restriction of $\rho$ to the commutant of $\alpha(B(H))$ in terms of this “correlation” operator $A$.
Lemma 4.2. Let $\rho$ be a normal state of $B(H)$ and let $A$ be the positive trace class operator on $E$ defined by $\langle Au, v \rangle_E = \rho(uv^*)$, $u, v \in E$. Then $\Lambda(\rho\restriction_{\alpha(B(H))'}) = \Lambda(A)$.

Proof. By Proposition 2.3.1, it suffices to exhibit a normal $*$-isomorphism $\theta$ of $B(E)$ onto $\alpha(B(H))'$ with the property that
\[ \rho(\theta(T)) = \operatorname{trace}(AT), \qquad T \in B(E). \tag{4.2} \]
Consider the tensor product of Hilbert spaces $E \otimes H$. In order to define $\theta$ we claim first that there is a unique unitary operator $W : E \otimes H \to H$ which satisfies $W(v \otimes \xi) = v\xi$, $v \in E$, $\xi \in H$. Indeed, for $v, w \in E$, $\xi, \eta \in H$ we have
\[ \langle v\xi, w\eta \rangle_H = \langle w^*v\xi, \eta \rangle = \langle v, w \rangle_E \langle \xi, \eta \rangle = \langle v \otimes \xi, w \otimes \eta \rangle_{E \otimes H}. \]
It follows that there is a unique isometry $W : E \otimes H \to H$ with the stated property. $W$ is unitary because its range spans all of $H$ (indeed, any vector $\zeta$ orthogonal to the range of $W$ has the property $v^*\zeta = 0$ for every $v \in E$, hence $\zeta = \alpha(1)\zeta = \sum_n v_n v_n^* \zeta = 0$).

For every $X \in B(H)$ we have
\[ W(1 \otimes X)(v \otimes \xi) = W(v \otimes X\xi) = vX\xi = \alpha(X)v\xi = \alpha(X)W(v \otimes \xi), \]
hence $W(1 \otimes X)W^* = \alpha(X)$. It follows that $\alpha(B(H))' = W(B(E) \otimes 1)W^*$, and thus we can define a $*$-isomorphism $\theta : B(E) \to \alpha(B(H))'$ by $\theta(T) = W(T \otimes 1)W^*$. Writing $u \times \bar v$ for the rank-one operator on $E$ defined by $u \times \bar v : w \mapsto \langle w, v \rangle_E u$, we claim that
\[ \theta(u \times \bar v) = uv^*, \tag{4.3} \]
for every $u, v \in E$. Indeed, if we pick a vector in $H$ of the form $\eta = w\xi = W(w \otimes \xi)$, where $w \in E$ and $\xi \in H$ then we have
\begin{align*}
\theta(u \times \bar v)\eta &= \theta(u \times \bar v)W(w \otimes \xi) = W((u \times \bar v) \otimes 1)(w \otimes \xi) = W((u \times \bar v)w \otimes \xi) \\
&= \langle w, v \rangle_E W(u \otimes \xi) = \langle w, v \rangle_E u\xi = uv^*w\xi = uv^*\eta,
\end{align*}
and (4.3) follows because $H$ is spanned by all such vectors $\eta$.

Now for every rank-one operator $T = u \times \bar v \in B(E)$ we have
\[ \rho(\theta(T)) = \rho(\theta(u \times \bar v)) = \rho(uv^*) = \langle Au, v \rangle_E = \operatorname{trace}(AT). \]
Formula (4.2) follows for finite rank $T \in B(E)$ by taking linear combinations, and the general case follows by approximating an arbitrary operator $T \in B(E)$ in the strong operator topology with finite dimensional compressions $PTP$, $P$ ranging over an increasing sequence of finite dimensional projections with limit 1. □

The following formulas provide a key step.

Lemma 4.3. Let $\alpha$, $E$ be as above, let $\rho$ be a normal state of $B(H)$ and let $R \in L^1(H)$ be its density operator, $\rho(X) = \operatorname{trace}(RX)$, $X \in B(H)$. Define a linear operator $L$ from $E$ into the Hilbert space $L^2(H)$ of all Hilbert-Schmidt operators on $H$ by $Lv = R^{1/2}v$, $v \in E$. Then
4.3.1 $\langle L^*Lu, v \rangle_E = \rho(uv^*)$, $u, v \in E$, and
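4.3.2 for all $\xi_1, \xi_2, \eta_1, \eta_2 \in H$ we have
\[ \langle LL^*(\xi_1 \times \bar\xi_2), \eta_1 \times \bar\eta_2 \rangle_{L^2(H)} = \langle \alpha(\eta_2 \times \bar\xi_2)R^{1/2}\xi_1, R^{1/2}\eta_1 \rangle_H. \]

Proof of (4.3.1). Simply write
\[ \langle L^*Lu, v \rangle_E = \langle Lu, Lv \rangle_{L^2(H)} = \langle R^{1/2}u, R^{1/2}v \rangle_{L^2(H)} = \operatorname{trace}(v^*Ru) = \rho(uv^*). \]
□

Proof of (4.3.2). We have
\[ \langle LL^*(\xi_1 \times \bar\xi_2), \eta_1 \times \bar\eta_2 \rangle_{L^2(H)} = \langle L^*(\xi_1 \times \bar\xi_2), L^*(\eta_1 \times \bar\eta_2) \rangle_E. \tag{4.4} \]
Pick an orthonormal basis $v_1, v_2, \dots$ for $E$. Then the right side of (4.4) can be rewritten as follows:
\begin{align*}
\sum_n \langle L^*(\xi_1 \times \bar\xi_2), v_n \rangle_E \langle v_n, L^*(\eta_1 \times \bar\eta_2) \rangle_E
&= \sum_n \langle \xi_1 \times \bar\xi_2, R^{1/2}v_n \rangle_{L^2(H)} \langle R^{1/2}v_n, \eta_1 \times \bar\eta_2 \rangle_{L^2(H)} \\
&= \sum_n \operatorname{trace}(v_n^*R^{1/2}\,\xi_1 \times \bar\xi_2)\,\operatorname{trace}(R^{1/2}v_n\,\eta_2 \times \bar\eta_1) \\
&= \sum_n \langle v_n^*R^{1/2}\xi_1, \xi_2 \rangle_H \langle R^{1/2}v_n\eta_2, \eta_1 \rangle_H.
\end{align*}
On the other hand,
\begin{align*}
\langle \alpha(\eta_2 \times \bar\xi_2)R^{1/2}\xi_1, R^{1/2}\eta_1 \rangle_H
&= \sum_n \langle v_n(\eta_2 \times \bar\xi_2)v_n^*R^{1/2}\xi_1, R^{1/2}\eta_1 \rangle_H \\
&= \sum_n \langle (\eta_2 \times \bar\xi_2)v_n^*R^{1/2}\xi_1, v_n^*R^{1/2}\eta_1 \rangle_H \\
&= \sum_n \langle v_n^*R^{1/2}\xi_1, \xi_2 \rangle_H \langle \eta_2, v_n^*R^{1/2}\eta_1 \rangle_H,
\end{align*}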
and the last expression agrees with the bottom line of the previous formula. □

Lemma 4.4. For a pair $A, B$ of self-adjoint compact operators on $H$, let $A \circ B$ be the bounded operator defined on the Hilbert space $L^2(H)$ of Hilbert-Schmidt operators by $A \circ B(T) = ATB$. Then $A \circ B$ is unitarily equivalent to $A \otimes B \in B(H \otimes H)$.

Proof. Pick orthonormal bases $e_1, e_2, \dots$ and $f_1, f_2, \dots$ for $H$ consisting of eigenvectors of $A$ and $B$, $Ae_n = \alpha_n e_n$, $Bf_n = \beta_n f_n$, $n = 1, 2, \dots$. Letting $e_m \times \bar f_n$ be the rank-one operator $\zeta \mapsto \langle \zeta, f_n \rangle e_m$, then $\{e_m \times \bar f_n : m, n = 1, 2, \dots\}$ is an orthonormal basis for $L^2(H)$ and we have
\[ A \circ B(e_m \times \bar f_n) = \alpha_m \beta_n\, e_m \times \bar f_n, \qquad m, n = 1, 2, \dots \]
Thus the unitary operator $W : L^2(H) \to H \otimes H$ defined by $W(e_m \times \bar f_n) = e_m \otimes f_n$, $m, n = 1, 2, \dots$ satisfies $W(A \circ B)(e_m \times \bar f_n) = (A \otimes B)W(e_m \times \bar f_n)$ for every $m, n = 1, 2, \dots$, and hence $W(A \circ B)W^* = A \otimes B$. □
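Lemma 4.4 reduces the eigenvalue list of $A \circ B$ to that of $A \otimes B$, whose nonzero entries are the products $\alpha_m\beta_n$ rearranged in decreasing order. The following minimal sketch carries out that rearrangement with exact rational arithmetic for the two operators used as an illustration in Sect. 2 (eigenvalues $3/4, 1/4$ and $3/5, 1/5, 1/5$). The function names `eigenvalue_list`, `tensor` and `direct_sum` are hypothetical helpers introduced here, and trailing zeros of the lists are omitted.

```python
from fractions import Fraction as F

def eigenvalue_list(values):
    """Nonzero eigenvalues, repeated by multiplicity, in decreasing order."""
    return sorted((v for v in values if v != 0), reverse=True)

def tensor(list_a, list_b):
    """Eigenvalue list of A (x) B: all pairwise products, re-sorted."""
    return eigenvalue_list(a * b for a in list_a for b in list_b)

def direct_sum(list_a, list_b):
    """Eigenvalue list of A (+) B: the union of the two lists, re-sorted."""
    return eigenvalue_list(list(list_a) + list(list_b))

A = [F(3, 4), F(1, 4)]
B = [F(3, 5), F(1, 5), F(1, 5)]

print([str(x) for x in direct_sum(A, B)])
print([str(x) for x in tensor(A, B)])
```

For larger spectra the sort is the only nontrivial step, which is the “combinatorial chore” mentioned in Sect. 2.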
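Proof of Theorem C. Let $R \in B(H)$ be the density operator of the normal state $\rho$, $\operatorname{trace}(RT) = \rho(T)$, $T \in B(H)$. For every $t > 0$ let $E_t$ be the Hilbert space of intertwining operators associated with $\alpha_t$,
\[ E_t = \{T \in B(H) : \alpha_t(A)T = TA,\ A \in B(H)\}, \]
and let $L_t : E_t \to L^2(H)$ be the operator of Lemma 4.3, $L_tv = R^{1/2}v$, $v \in E_t$. Lemma 4.3.1 implies that $\rho(uv^*) = \langle L_t^*L_tu, v \rangle_{E_t}$, hence the correlation operator of $\rho\restriction_{\alpha_t(B(H))'}$ is $L_t^*L_t$. By Lemma 4.2
\[ \Lambda(L_t^*L_t) = \Lambda(\rho\restriction_{\alpha_t(B(H))'}). \]
On the other hand, (2.1.3) implies that $\Lambda(L_t^*L_t) = \Lambda(L_tL_t^*)$. Thus it suffices to show that the eigenvalue lists of the operators $L_tL_t^* \in B(L^2(H))$ converge to $\Lambda(\rho \otimes \omega)$, as $t \to \infty$, in the metric of eigenvalue lists. By (4.3.2) we have
\[ \langle L_tL_t^*(\xi_1 \times \bar\xi_2), \eta_1 \times \bar\eta_2 \rangle_{L^2(H)} = \langle \alpha_t(\eta_2 \times \bar\xi_2)R^{1/2}\xi_1, R^{1/2}\eta_1 \rangle_H, \tag{4.5} \]
for all $\xi_1, \xi_2, \eta_1, \eta_2 \in H$. Now since $\alpha$ is pure, $\alpha_t(X)$ converges in the weak$^*$-topology to $\omega(X)1$ as $t \to \infty$ (indeed, for every normal state $\sigma$, $\sigma(\alpha_t(X))$ converges to $\omega(X) = \sigma(\omega(X)1)$, and the assertion follows because every element of the predual of $B(H)$ is a linear combination of normal states). Thus if we take the limit on $t$ in the right side of (4.5) we obtain
\begin{align*}
\lim_{t \to \infty} \langle \alpha_t(\eta_2 \times \bar\xi_2)R^{1/2}\xi_1, R^{1/2}\eta_1 \rangle_H
&= \omega(\eta_2 \times \bar\xi_2) \langle R^{1/2}\xi_1, R^{1/2}\eta_1 \rangle_H \\
&= \langle \eta_2, \Omega\xi_2 \rangle_H \langle R\xi_1, \eta_1 \rangle_H,
\end{align*}
where $\Omega$ is the density operator of $\omega$, $\omega(T) = \operatorname{trace}(\Omega T)$, $T \in B(H)$. Let $R \circ \Omega$ be the operator on $L^2(H)$ defined in Lemma 4.4, and notice that the right side of the preceding expression is $\langle R \circ \Omega(\xi_1 \times \bar\xi_2), \eta_1 \times \bar\eta_2 \rangle_{L^2(H)}$. Indeed, by definition of $R \circ \Omega$ we have $R \circ \Omega(\xi_1 \times \bar\xi_2) = R\xi_1 \times \overline{\Omega\xi_2}$, and
\begin{align*}
\langle R\xi_1 \times \overline{\Omega\xi_2}, \eta_1 \times \bar\eta_2 \rangle_{L^2(H)}
&= \operatorname{trace}(\eta_2 \times \bar\eta_1 \cdot R\xi_1 \times \overline{\Omega\xi_2}) \\
&= \langle R\xi_1, \eta_1 \rangle_H \operatorname{trace}(\eta_2 \times \overline{\Omega\xi_2}) = \langle R\xi_1, \eta_1 \rangle_H \langle \eta_2, \Omega\xi_2 \rangle_H,
\end{align*}
which, as asserted, agrees with the right side of the previous expression. Thus we have shown that
\[ \lim_{t \to \infty} \langle L_tL_t^*(A), B \rangle_{L^2(H)} = \langle R \circ \Omega(A), B \rangle_{L^2(H)} \]
for rank-one operators $A, B \in L^2(H)$. Now Lemma 4.4 implies that $R \circ \Omega$ is unitarily equivalent to $R \otimes \Omega \in B(H \otimes H)$, and hence $R \circ \Omega$ is a positive trace class operator for which $\Lambda(R \circ \Omega) = \Lambda(R \otimes \Omega) = \Lambda(\rho \otimes \omega)$. On the other hand, Lemma 4.1 implies that
\[ \lim_{t \to \infty} \operatorname{trace}|L_tL_t^* - R \circ \Omega| = 0. \]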
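By the inequality (2.3.2) we conclude that
\[ \limsup_{t \to \infty} \|\Lambda(L_tL_t^*) - \Lambda(R \circ \Omega)\| \le \lim_{t \to \infty} \operatorname{trace}|L_tL_t^* - R \circ \Omega| = 0. \]
We have already seen that $\Lambda(R \circ \Omega) = \Lambda(\rho \otimes \omega)$, and that $\Lambda(L_tL_t^*) = \Lambda(\rho\restriction_{\alpha_t(B(H))'})$. Thus Theorem C is proved. □

We now readily deduce the interaction inequality.

Theorem B. Let $(U, M)$ be an interaction with past and future states $\omega_-$ and $\omega_+$, and let $\bar\omega_-$ and $\bar\omega_+$ be their natural extensions to the local $C^*$-algebra $\mathcal{A}$. Then
\[ \|\bar\omega_- - \bar\omega_+\| \ge \|\Lambda(\omega_- \otimes \omega_-) - \Lambda(\omega_+ \otimes \omega_+)\|. \]

Proof. Fix $\epsilon > 0$. By Theorem C we can find $T > 0$ large enough so that for all $t > T$ we have
\[ \|\Lambda(\omega_+\restriction_{\mathcal{A}[0,t]}) - \Lambda(\omega_+ \otimes \omega_+)\| \le \epsilon \]
as well as
\[ \|\Lambda(\omega_-\restriction_{\mathcal{A}[-t,0]}) - \Lambda(\omega_- \otimes \omega_-)\| \le \epsilon. \]
Now for $t \ge T$,
\begin{align*}
\|\bar\omega_+ - \bar\omega_-\| = \|\bar\omega_+ \circ \gamma_t - \bar\omega_- \circ \gamma_{-t}\|
&\ge \|\bar\omega_+ \circ \gamma_t\restriction_{\mathcal{A}[-t,t]} - \bar\omega_- \circ \gamma_{-t}\restriction_{\mathcal{A}[-t,t]}\| \\
&= \|\omega_+ \circ \gamma_t\restriction_{\mathcal{A}[-t,t]} - \omega_- \circ \gamma_{-t}\restriction_{\mathcal{A}[-t,t]}\|. \tag{4.6}
\end{align*}
Since $\gamma_t$ gives rise to a $*$-isomorphism of $\mathcal{A}[-t,t]$ onto $\mathcal{A}[0,2t]$ while $\gamma_{-t}$ gives rise to a $*$-isomorphism of $\mathcal{A}[-t,t]$ onto $\mathcal{A}[-2t,0]$, (2.3.1) implies that $\Lambda(\omega_+ \circ \gamma_t\restriction_{\mathcal{A}[-t,t]}) = \Lambda(\omega_+\restriction_{\mathcal{A}[0,2t]})$, and $\Lambda(\omega_- \circ \gamma_{-t}\restriction_{\mathcal{A}[-t,t]}) = \Lambda(\omega_-\restriction_{\mathcal{A}[-2t,0]})$. Thus by Proposition 2.3 the last term of (4.6) is at least
\[ \|\Lambda(\omega_+\restriction_{\mathcal{A}[0,2t]}) - \Lambda(\omega_-\restriction_{\mathcal{A}[-2t,0]})\|, \]
which by our initial choice of $T$ is at least $\|\Lambda(\omega_+ \otimes \omega_+) - \Lambda(\omega_- \otimes \omega_-)\| - 2\epsilon$. Since $\epsilon$ is arbitrary, the asserted inequality follows. □

Corollary 4.5. Let $(U, M)$ be an interaction with past and future states $\omega_-$, $\omega_+$. If $\Lambda(\omega_-) \ne \Lambda(\omega_+)$, then the interaction is nontrivial.

Proof. Contrapositively, suppose that the interaction is trivial and let $\Omega_-$ and $\Omega_+$ be the respective density operators of $\omega_-$ and $\omega_+$. Theorem B implies that $\Omega_- \otimes \Omega_-$ and $\Omega_+ \otimes \Omega_+$ must have the same eigenvalue list. (2.1.4) of Proposition 2.1 implies that for every $n = 1, 2, \dots$ we have
\[ \operatorname{trace}(\Omega_-^n)^2 = \operatorname{trace}((\Omega_- \otimes \Omega_-)^n) = \operatorname{trace}((\Omega_+ \otimes \Omega_+)^n) = \operatorname{trace}(\Omega_+^n)^2. \]
Taking the square root we find that $\operatorname{trace}(\Omega_-^n) = \operatorname{trace}(\Omega_+^n)$ for every $n = 1, 2, \dots$ and another application of (2.1.4) leads to $\Lambda(\Omega_-) = \Lambda(\Omega_+)$. □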
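The lower bound in Theorem B depends only on the two eigenvalue lists, so it can be evaluated directly. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes the norm on eigenvalue lists to be the $\ell^1$ distance after padding with zeros, and uses hypothetical helper names (`tensor_square`, `l1_distance`, `interaction_lower_bound`, `power_trace`, `uniform`). It also checks the identity $\operatorname{trace}((\Omega \otimes \Omega)^n) = \operatorname{trace}(\Omega^n)^2$ used in the proof of Corollary 4.5.

```python
from fractions import Fraction as F

def tensor_square(eigs):
    """Nonzero eigenvalue list of Omega (x) Omega, in decreasing order."""
    return sorted((a * b for a in eigs for b in eigs), reverse=True)

def l1_distance(xs, ys):
    """l^1 distance between two decreasing lists, padding with zeros."""
    n = max(len(xs), len(ys))
    xs = list(xs) + [F(0)] * (n - len(xs))
    ys = list(ys) + [F(0)] * (n - len(ys))
    return sum(abs(x - y) for x, y in zip(xs, ys))

def interaction_lower_bound(past, future):
    """Right-hand side of the inequality in Theorem B."""
    return l1_distance(tensor_square(past), tensor_square(future))

def power_trace(eigs, n):
    """trace(Omega^n) computed from the eigenvalue list of Omega."""
    return sum(e ** n for e in eigs)

def uniform(k):
    """Eigenvalue list with 1/k repeated k times."""
    return [F(1, k)] * k

bound = interaction_lower_bound(uniform(2), uniform(3))
print(bound)  # 10/9
```

With `uniform(p)` and `uniform(q)`, $p < q$, the bound equals $2 - 2p^2/q^2$, which approaches the maximal possible value 2 as $q/p$ grows.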
Corollary 4.6. Let $n = 1, 2, \dots, \infty$ and choose $\epsilon > 0$. There is an interaction $(U, M)$ whose past and future $E_0$-semigroups are cocycle-conjugate to the CAR/CCR flow of index $n$ such that $\|\bar\omega_+ - \bar\omega_-\| \ge 2 - \epsilon$.

Proof. Choose positive integers $p < q$ and consider the eigenvalue lists
\[ \Lambda_- = \{1/p, 1/p, \dots, 1/p, 0, 0, \dots\}, \qquad \Lambda_+ = \{1/q, 1/q, \dots, 1/q, 0, 0, \dots\}, \]
where $1/p$ is repeated $p$ times and $1/q$ is repeated $q$ times. Theorem A implies that there is an interaction $(U, M)$ whose past and future $E_0$-semigroups are cocycle-conjugate to the CAR/CCR flow of index $n$, for which $\Lambda(\omega_-) = \Lambda_-$ and $\Lambda(\omega_+) = \Lambda_+$. By Theorem B,
\[ \|\bar\omega_+ - \bar\omega_-\| \ge \|\Lambda(\omega_+ \otimes \omega_+) - \Lambda(\omega_- \otimes \omega_-)\|. \]
If we neglect zeros, the eigenvalue list of $\omega_- \otimes \omega_-$ consists of the single eigenvalue $1/p^2$, repeated $p^2$ times, and that of $\omega_+ \otimes \omega_+$ consists of $1/q^2$ repeated $q^2$ times. Thus
\[ \|\Lambda(\omega_+ \otimes \omega_+) - \Lambda(\omega_- \otimes \omega_-)\| = p^2(1/p^2 - 1/q^2) + (q^2 - p^2)/q^2 = 2 - 2p^2/q^2, \]
and the inequality of Corollary 4.6 follows whenever $q$ is larger than $p\sqrt{2/\epsilon}$. □

References
1. Araki, H. and Woods, E. J.: Complete Boolean algebras of type I factors. Publ. RIMS (Kyoto University) 2, ser. A, no. 2, 157–242 (1966)
2. Arveson, W.: Continuous analogues of Fock space. Memoirs Am. Math. Soc. 80 no. 3 (1989)
3. Arveson, W.: Continuous analogues of Fock space IV: Essential states. Acta Math. 164, 265–300 (1990)
4. Arveson, W.: An addition formula for the index of semigroups of endomorphisms of $B(H)$. Pac. J. Math. 137, 19–36 (1989)
5. Arveson, W.: Quantizing the Fredholm index. In: Operator Theory: Proceedings of the 1988 GPOTS-Wabash conference, Conway, J. B. and Morrel, B. B. eds., Pitman Research Notes in Mathematics Series, London: Longman, 1990
6. Arveson, W.: Dynamical invariants for noncommutative flows. In: Operator algebras and Quantum Field Theory, Proceedings of the Rome conference, Doplicher et al., eds., 1996
7. Arveson, W.: Pure $E_0$-semigroups and absorbing states. Commun. Math. Phys. 187, 19–43 (1997)
8. Arveson, W.: On the index and dilations of completely positive semigroups. Int. J. Math., to appear
9. Arveson, W.: Eigenvalue lists of noncommutative probability distributions. Unpublished lecture notes, available from http://www.math.berkeley.edu/~arveson
10. Bhat, B.V.R.: Minimal dilations of quantum dynamical semigroups to semigroups of endomorphisms of $C^*$-algebras. Trans. A.M.S., to appear
11. Bhat, B.V.R.: On minimality of Evans-Hudson flows. Preprint
12. Chebotarev, A.M., Fagnola, F.: Sufficient conditions for conservativity of quantum dynamical semigroups. J. Funct. Anal. 153, 2, 131–153 (1993)
13. Davies, E. B.: Quantum Theory of Open Systems. London–New York: Academic Press, 1976
14. Davies, E. B.: Generators of dynamical semigroups. J. Funct. Anal. 34, 421–432 (1979)
15. Dunford, N. and Schwartz, J.: Linear Operators. II. New York: Interscience, 1963
16. Evans, D.: Conditionally completely positive maps on operator algebras. Quart J. Math. Oxford 28 (2), 271–284 (1977)
17. Evans, D.: Quantum dynamical semigroups, symmetry groups, and locality. Acta Appl. Math. 2, 333–352 (1984)
18. Evans, D. and Lewis, J. T.: Dilations of irreversible evolutions in algebraic quantum theory. Comm. Dubl. Inst. Adv. Studies, Ser. A 24, 1977, 104 p.
19. Gelfand, I.M. and Vilenkin, N.Ya.: Generalized functions 4: Applications of harmonic analysis. New York: Academic Press, 1964
20. Gorini, V., Kossakowski, A. and Sudarshan, E.C.G.: Completely positive semigroups on N-level systems. J. Math. Phys. 17, 821–825 (1976)
21. Haag, R.: Local Quantum Physics. Berlin: Springer-Verlag, 1992
22. Hudson, R.L. and Parthasarathy, K.R.: Stochastic dilations of uniformly continuous completely positive semigroups. Acta Appl. Math. 2, 353–378 (1984)
23. Kümmerer, B.: Markov dilations on $W^*$-algebras. J. Funct. Anal. 63, 139–177 (1985)
24. Kümmerer, B.: Survey on a theory of non-commutative stationary Markov processes. In: Quantum Probability and Applications III, Lecture Notes in Mathematics 1303, Berlin–Heidelberg–New York: Springer, 1987, pp. 154–182
25. Lindblad, G.: On the generators of quantum dynamical semigroups. Commun. Math. Phys. 48, 119 (1976)
26. Mohari, A., Sinha, Kalyan B.: Stochastic dilation of minimal quantum dynamical semigroups. Proc. Ind. Acad. Sci. 102, 159–173 (1992)
27. Parthasarathy, K.R.: An introduction to quantum stochastic calculus. Basel: Birkhäuser Verlag, 1991
28. Pedersen, G.K.: $C^*$-algebras and their automorphism groups. London–New York: Academic Press, 1979
29. Powers, R. T.: Representations of uniformly hyperfinite algebras and their associated von Neumann rings. Ann. Math. 86, 138–171 (1967)
30. Powers, R. T.: An index theory for semigroups of endomorphisms of $B(H)$ and type $\mathrm{II}_1$ factors. Can. J. Math. 40, 86–114 (1988)
31. Powers, R. T.: A non-spatial continuous semigroup of $*$-endomorphisms of $B(H)$. Publ. RIMS (Kyoto University) 23, 1053–1069 (1987)
32. Powers, R. T.: New examples of continuous spatial semigroups of endomorphisms of $B(H)$. Preprint 1994
33. Powers, R. T. and Price, G.: Continuous spatial semigroups of $*$-endomorphisms of $B(H)$. Trans. A.M.S. 321, 347–361 (1990)
34. Tsirelson, B.: From random sets to continuous tensor products: Answers to three questions of W. Arveson. Preprint

Communicated by H. Araki
Commun. Math. Phys. 211, 85 – 109 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Rational Solutions of the Master Symmetries of the KdV Equation

Jorge P. Zubelli$^1$, D. S. Valerio Silva$^2$

$^1$ IMPA, Est. D. Castorina 110, RJ 22460-320, Brazil. E-mail: [email protected]
$^2$ Matemática, UFES – Vitória – ES 29060-900, Brazil. E-mail: [email protected]

Received: 28 January 1999 / Accepted: 22 October 1999
Abstract: In this article we characterize a certain class of rational solutions of the hierarchy of master symmetries for KdV. The result is that the generic rational potentials that decay at infinity and remain rational by all the flows of the master-symmetry KdV hierarchy are bispectral potentials for the Schrödinger operator. By bispectral potentials we mean that the corresponding Schrödinger operators possess families of eigenfunctions that are also eigenfunctions of a differential operator in the spectral variable. This complements certain results of Airault–McKean–Moser [4], Duistermaat–Grünbaum [10], and Magri–Zubelli [40]. As a consequence of bispectrality, the rational solutions of the master symmetries turn out to be solutions of a (generalized) string equation.

1. Introduction

One remarkable feature of soliton theory is the fact that it brings to the forefront links between fields that are apparently unrelated. This can be traced as far back as the seminal work of Gardner, Greene, Kruskal and Miura [15] on the conserved quantities for the Korteweg–de Vries (KdV) equation. In their work, the presence of infinitely many conserved quantities was demonstrated making use of the scattering theory of Schrödinger operators. Deeply related to the existence of infinitely many conserved quantities is the remark that KdV is a bi-Hamiltonian system. If we denote by $X_k$ the $k$-th flow of the KdV hierarchy, we have
\[ X_k(u) = \partial_x \frac{\delta H_k}{\delta u} = (-\partial_x^3 + 4u\partial_x + 2u_x)\frac{\delta H_{k-1}}{\delta u}, \]
where $H_k$ is the $k$-th conserved quantity of KdV. Furthermore, the two Poisson structures mentioned above are compatible in the sense that their linear combination gives again a Poisson structure. This circle of ideas was highlighted in the work of F. Magri [24,25], which could be considered the catalytic force in the work presented here.
By making use of the two structures above one is led naturally to the idea of writing a recursion operator for the KdV hierarchy. This can be done by introducing the operator
\[ N_u \stackrel{\mathrm{def}}{=} -\partial_x^2 + 4u + 2u_x\partial_x^{-1}. \]
Indeed, it is not hard to see that, at least formally, we have $X_k = N_u^k X_0$, where $X_0(u) = u_x$. For example, $k = 1$ gives the usual KdV flow $X_1(u) = 6uu_x - u_{xxx}$.

The present article lies at the crossroads of many different areas. It concerns the characterization of the rational functions, decaying at infinity, that remain rational by the flows of a certain hierarchy $\{\tau_j\}_{j=0}^\infty$ of nonlinear evolution equations, which is called the master symmetry hierarchy for KdV. This hierarchy, which plays an important role in the bi-Hamiltonian aspects of the KdV equation, is defined as follows: Start from the infinitesimal generator of the scaling transformations
\[ \tau_0(u) \stackrel{\mathrm{def}}{=} \tfrac{1}{2}xu_x + u, \]
and set, for $k = 0, 1, \dots$,
\[ \tau_{k+1} \stackrel{\mathrm{def}}{=} N_u\tau_k. \]
It can be shown [40] that the KdV hierarchy $\{X_k\}_{k=0}^\infty$ and the master-symmetry hierarchy $\{\tau_j\}_{j=0}^\infty$ satisfy
\[ [\tau_j, \tau_l] = (l - j)\tau_{j+l}, \tag{1} \]
and
\[ [\tau_l, X_j] = (j + \tfrac{1}{2})X_{l+j}. \tag{2} \]
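As a check on the definitions, the first flows can be computed by hand from the recursion operator. The following worked computation is a sketch under one assumption that the text does not spell out: $\partial_x^{-1}$ is treated as a formal antiderivative with zero constant of integration, so that $\partial_x^{-1}u_x = u$ and $\partial_x^{-1}(\tfrac12 xu_x + u) = \tfrac12 xu + \tfrac12\partial_x^{-1}u$.

```latex
% Assumption: \partial_x^{-1} is the formal antiderivative, integration constants set to zero.
\begin{align*}
X_1(u) = N_u u_x
  &= -u_{xxx} + 4u\,u_x + 2u_x\,\partial_x^{-1}u_x \\
  &= -u_{xxx} + 4u\,u_x + 2u_x\,u \;=\; 6u\,u_x - u_{xxx}, \\[4pt]
\tau_1(u) = N_u\bigl(\tfrac12 x u_x + u\bigr)
  &= -\bigl(\tfrac12 x u_{xxx} + 2u_{xx}\bigr) + \bigl(2x\,u\,u_x + 4u^2\bigr)
     + 2u_x\bigl(\tfrac12 x u + \tfrac12\,\partial_x^{-1}u\bigr) \\
  &= \tfrac12 x\,\bigl(6u\,u_x - u_{xxx}\bigr) - 2u_{xx} + 4u^2 + u_x\,\partial_x^{-1}u .
\end{align*}
```

The first line reproduces the KdV flow quoted above. The second shows that $\tau_1$ is $\tfrac12 x X_1(u)$ plus lower-order terms, one of them nonlocal. As a sanity check, for $u = c/x^2$ one finds $\tau_0(u) = \tfrac12 x(-2c/x^3) + c/x^2 = 0$, and hence $\tau_1(u) = N_u\tau_0(u) = 0$ under the same convention.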
Equation (1) corresponds to the celebrated (zero central charge) Virasoro relations. In fact, the master-symmetry vector fields under consideration are only the positive part of the Virasoro algebra. Equation (2) is telling us that the master-symmetry vector fields produce higher order flows of the KdV hierarchy by means of Lie derivatives. The connection between the bi-Hamiltonian structure of the KdV hierarchy and the master-symmetries goes further. It can be shown that the Lie derivative of the first Poisson structure $\partial_x$ of the KdV hierarchy with respect to the vector field $\tau_1$ leads to the second structure [3].

Another instance of an exceptional link between different areas in soliton theory was displayed in the work of Duistermaat and Grünbaum [10] with the so-called “bispectral problem”. It was originally motivated by some questions in signal processing and computerized tomography [16–20]. The version of the bispectral problem that concerns us here goes as follows: Characterize the potentials $u(x)$ for the Schrödinger operator $L = -\partial_x^2 + u$ whose eigenfunctions $\varphi(x, \lambda)$ also satisfy a differential equation in the spectral parameter $\lambda$ of the form
\[ A(\lambda, \partial_\lambda)\varphi = \Theta(x)\varphi, \tag{3} \]
where $A(\lambda, \partial_\lambda)$ is a differential operator of positive order, independent of $x$, and $\Theta(x)$ is a smooth function. The solution of the bispectral problem for Schrödinger operators turned out to be directly connected to the theory of solitons, more specifically to the rational solutions of the KdV equation, which in turn were characterized in the work of Airault, McKean, and Moser [4]. See also [2,9]. Later on, it became clear that the connection between the bispectral property and other soliton equations went much deeper [34,38,39].

The existence of large families of rational solutions is yet another remarkable property possessed by many integrable PDEs. The rational solutions of the KdV hierarchy, and more generally of the Kadomtsev–Petviashvili equation, are connected to Calogero–Moser systems and to many other important aspects of soliton theory [1,7,22,35]. It was shown in [10] that the rational solutions of the KdV equation are bispectral. However, they comprise only “half” of the bispectral potentials. The remaining potentials, which were also described in [10], do not remain rational by the flows of the KdV hierarchy.

Another piece of the puzzle was added in [40]. It was shown therein that the master-symmetries of the KdV hierarchy are tangent to all the manifolds of bispectral potentials. Those manifolds were constructed by successive Darboux transformations in [10]. As a consequence of the results in [40] it followed that the bispectral potentials, decaying at infinity, remain rational by the flows of the master-symmetries for KdV.

The present work addresses a natural follow-up question to the results of [40]: Are the rational potentials that remain rational by the KdV master-symmetries necessarily bispectral potentials? Our answer to this question requires some technical discussion on what is meant by remaining rational by the KdV master symmetries. In Sect. 1.2 we address this issue and discuss the concept that we call generic rational potentials.
Our main results are the following:

Theorem 1. The generic rational potentials decaying at infinity that remain rational by all the flows of the KdV master-symmetry hierarchy are bispectral.

This is obtained as a consequence of a result in [10], which we recall in Sect. 1.2, and the following result which we prove herein:

Theorem 2. Let $u(x)$ be a rational potential that decays at infinity and remains rational by the master-symmetry flow $\tau_1$. If we denote by $P$ the set of nonzero poles of $u$, then
\[ u(x) = \frac{c}{x^2} + \sum_{p \in P} \frac{2}{(x - p)^2}, \tag{4} \]
where $P$ satisfies the constraints
\[ \frac{c}{p^3} + \sum_{q \in P \setminus \{p\}} \frac{2}{(p - q)^3} = 0, \qquad p \in P. \tag{5} \]
Furthermore, if the set of nonzero poles $P$ is nonempty and $u$ is a generic rational solution of all flows in the KdV master-symmetry hierarchy then either $c = \nu(\nu + 1)$ with $\nu \in \mathbb{Z}_{\ge 0}$, or $c = l^2 - 1/4$ with $l \in \mathbb{Z}$.

Using again the results of [10], the characterization of the vanishing rational solutions of the master-symmetry hierarchy given in Theorem 2 yields a description in terms of the Darboux method, namely:
Corollary 1. The class of generic rational solutions, decaying at infinity, of the KdV master-symmetry hierarchy consists of potentials obtained by successive rational Darboux transformations starting at $u = 0$ or
\[ u = \frac{-1/4}{x^2}. \]
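The locus constraints (5) of Theorem 2 are easy to test numerically for a given value of $c$ and a given set of nonzero poles. The sketch below is an illustration, not part of the argument. It evaluates the left-hand side of (5) for two pole configurations taken from the examples of Sect. 1.2: the zeros of $x^3 + 12t$ with $c = 0$, and the zeros of $1 + sx^2$ with $c = -1/4$ (the case $l = 1$ of Example 2). In both cases the fact used is that $-2\partial_x^2 \log\theta(x)$ has a double pole with coefficient 2 at each simple zero of $\theta$. The function name `locus_residuals` and the sample values $t = 1$, $s = 2$ are assumptions made for the illustration.

```python
import cmath

def locus_residuals(c, poles):
    """Left-hand side of constraint (5), evaluated at each nonzero pole p."""
    return [
        c / p**3 + sum(2 / (p - q)**3 for q in poles if q != p)
        for p in poles
    ]

# c = 0: the three zeros of x^3 + 12 t, here with the sample value t = 1.
r = 12 ** (1 / 3)
cube_roots = [-r * cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

# c = -1/4: the two zeros of 1 + s x^2, here with the sample value s = 2.
s = 2.0
pair = [1j / s**0.5, -1j / s**0.5]

print(max(abs(z) for z in locus_residuals(0.0, cube_roots)))
print(max(abs(z) for z in locus_residuals(-0.25, pair)))
```

Both printed values are zero up to floating-point rounding, whereas changing $c$ for a fixed pole set (for instance `locus_residuals(1.0, pair)`) gives residuals of order one. For the symmetric pair $\{p, -p\}$ the constraint reads $c/p^3 + 2/(2p)^3 = 0$, which forces $c = -1/4$.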
1.1. Plan of the article. We conclude this introductory section with a little bit of background information on the bispectral problem, Darboux transformations, and the KdV master-symmetry hierarchy. In Sect. 2 we start the study of the rational solutions of the master-symmetry hierarchy for KdV. We characterize the constraints on the poles and their dynamics. The constraints in the present situation are similar to the ones of the Airault–McKean–Moser locus [4], with the difference that here the pole at zero has to be treated differently and introduces an extra parameter. This parameter is exactly the $c$ that appears in Eq. (4). Section 3 is dedicated to the study of the compatibility between the locus constraints and the vector fields defined by the master symmetries. The conclusion is that if the potential remains rational by all the flows of the KdV master-symmetry hierarchy, then such flows are compatible with the locus conditions. In order to achieve our goal of characterizing the (vanishing) rational solutions of the hierarchy $\{\tau_j\}_{j \ge 0}$ we will show in the next sections that the constant $c$ of Eq. (4) takes only a certain set of discrete values, provided the set of nonzero poles is nonempty. The discreteness of the values of $c$ is in fact the difficult part of our result. Our initial attempts at proving this using only the asymptotics at $x = 0$ or at $x = \infty$ failed. We eventually had to connect the two asymptotics, which led us to the use of the squared eigenfunction method. Section 4 is concerned with the effect of the master symmetries on their rational solutions. Those properties will then be used in Sect. 5 to conclude that the (vanishing) rational solutions of the master-symmetry flows are bispectral. We finish with a section containing some conclusions and suggestions for further research.
1.2. Background. In this section we shall collect background information on the bispectral problem for the Schrödinger operator, so as to make the paper accessible to a wider audience. For further information on the relationship between the bispectral problem and other areas we refer the reader to [21] and references therein. We shall first make precise what we mean by generic rational solutions of evolution equations. Let us consider an equation of the form
\[ u_t = K(u). \tag{6} \]
In this paper we are concerned with rational solutions decaying at infinity. We explain this predilection in Sect. 6. Since we are interested in solutions that vanish at infinity, we write $u$ in partial fraction decomposition form as
\[ u = \sum_{p \in Q} \sum_{\nu=1}^{n_p} \frac{c_{\nu,p}}{(x - p)^\nu}. \tag{7} \]
Rational Solutions of Master Symmetries of KdV Equation
Here, as usual, $Q$ denotes the set of distinct poles of $u$ and $n_p$ the leading order of each pole $p$. It is now natural to require the coefficients $c_{\nu,p}$ and the poles $p \in Q$ to be smooth functions of time. We shall also require that the cardinality of the set of poles remains constant and that the leading coefficient $c_{n_p,p}$ of each such singularity does not vanish. We shall call $u$ satisfying such conditions generic. We elaborate on what we are missing with the generic assumption in Sect. 6.

A slight, though convenient, abuse of language will be used: we shall say that $u = u_0(x)$ is a rational solution of $K$ if there exists an integral curve $u = u(x,t)$ of (6), of the form (7), which is well defined in a sufficiently small neighborhood of $t = 0$ and such that $u(x,0) = u_0(x)$. Likewise, we shall say that $u_0$ is a rational solution of a hierarchy $\{\sigma_j\}$ if it is a rational solution of each flow $\sigma_j$ in the hierarchy.

The main technique for producing all the bispectral Schrödinger potentials is the Darboux transformation. It has been widely used in the soliton literature and generalized in many directions. More information on the geometric aspects of the Darboux method can be found in [26,27] and references therein. The (classical) Darboux process starts off from an operator $L = -\partial_x^2 + u$ for which we know how to solve the eigenvalue problem $L\phi = \lambda\phi$ for all $\lambda$. We factorize $L$ as a product of two first order operators
\[ L = A^{*}A, \tag{8} \]
where $A = \partial_x - v$ and $A^{*}$ is its formal adjoint. We then construct $\tilde L = AA^{*}$. The function $v$ satisfies $u = v_x + v^2$, which is a Riccati equation that can be linearized by means of the transformation $v = \chi'/\chi$. It turns out that the linearized equation is $L\chi = 0$, which by hypothesis we know how to solve. The new potential is given by $\tilde u = u - 2v_x$. It is easy to see that $\tilde\phi(x,\lambda) = A\phi(x,\lambda)$ is an eigenfunction of $\tilde L$, with the possible exception of $\lambda = 0$. The upshot is that the process can be iterated; each time one introduces a new complex parameter. The final potential after $N$ iterations depends (in general) on $N$ parameters. The process of going from $u$ to $\tilde u$ is called a Darboux transformation. This transformation is said to be rational if it starts from a rational potential and yields another rational potential.

Example 1. If one starts with the potential $u = 0$, one gets after one Darboux transformation the potential $u_1 = 2/(x + t_1)^2$, where $t_1$ is an arbitrary constant. Another transformation yields (modulo translation) $u_2 = -2\partial_x^2 \log(x^3 + 12t)$. One can easily check that $u_2$ is a rational solution of the KdV hierarchy. It is in fact a stationary solution of the flow $X_2$. Successive iterations of the method lead to the Adler–Moser sequence of polynomials [2].

Example 2. If one starts from $u_0 = (l^2 - 1/4)/x^2$, where $l \in \mathbb{Z}_{>0}$, one gets after applying Darboux
\[ u(x,t) = \frac{(l-1)^2 - 1/4}{x^2} - 2\partial_x^2 \log(1 + s x^{2l}). \]
The two examples above are the first nontrivial families of bispectral potentials. The potential in Example 2 does not remain rational by the KdV hierarchy flows. It is, however, an example of a stationary solution of $\tau_1$.

Let us now return to the bispectral problem. A few elementary symmetries of the bispectral problem are the following:
J. P. Zubelli, D. S. Valerio Silva
Scaling. The map $x \mapsto \epsilon x$, which induces at the level of potentials $u(x) \mapsto \epsilon^2 u(\epsilon x)$.
Translation. The map $x \mapsto x + t_1$, which induces at the level of potentials $u(x) \mapsto u(x + t_1)$.
Addition of a constant. The map $u(x) \mapsto u(x) + C$.

It can be shown [10] that bispectral potentials are necessarily rational functions of $x$. Furthermore, the only bispectral potentials that are unbounded are those of the form $u = x$, modulo the above symmetries. We shall, from now on, concentrate on the bispectral potentials decaying at infinity. The characterization of the bispectral potentials is given by

Theorem 3 (Duistermaat and Grünbaum [10]). Let $u$ be a bispectral potential decaying at infinity. Then, modulo the elementary symmetries above, $u$ is of the form $u = c/x^2$ or it can be obtained from $u = 0$ or $u = -1/(4x^2)$ by iterations of rational Darboux transformations.

We remark that the potentials obtained by rational Darboux transformations starting from $0$ are exactly those that vanish at infinity and remain rational by the flows of the KdV hierarchy. On the other hand, it was shown in [40] that those potentials obtained starting from $u = -1/(4x^2)$ remain rational by the flows of the master symmetries of the KdV hierarchy.

Theorem 4 (Prop. 4.3 of [10]). Let $P$ be a balanced set and $c = l^2 - 1/4$ with $l \in \mathbb{Z}_{>0}$. Suppose that $u$ is of the form (4) with the configuration of nonzero poles $P$ satisfying (5). Then, $u(x)$ can be obtained by $\mu$ rational Darboux transformations starting from
\[ u_0(x) = \frac{(l+\mu)^2 - 1/4}{x^2}. \]

A natural question, which was addressed in [10], is what are the possible configurations of poles of a rational function in order for it to be bispectral. The next result gives an answer to this question. It is actually a particular case of a more general one in [10]. We shall use it in order to prove that the rational solutions of the master symmetries for KdV are bispectral.

Theorem 5 (Duistermaat and Grünbaum [10]). Suppose that $u$ is of the form
\[ u(x) = \frac{c}{x^2} + \sum_{p\in P}\frac{2}{(x-p)^2}, \tag{9} \]
where $P$ is the set of nonzero poles of $u$.
1. If $c = l^2 - 1/4$, $P$ is balanced, and satisfies
\[ \frac{l^2 - 1/4}{p^3} + \sum_{q\in P\setminus\{p\}}\frac{2}{(p-q)^3} = 0, \quad \forall p \in P, \tag{10} \]
then $u$ is bispectral.
2. If $c = \nu(\nu+1)$, $\nu \in \mathbb{Z}_{\ge 0}$, and $P$ satisfies
\[ \frac{\nu(\nu+1)}{p^3} + \sum_{q\in P\setminus\{p\}}\frac{2}{(p-q)^3} = 0, \quad \forall p \in P, \tag{11} \]
\[ \sum_{p\in P}\frac{2}{p^{2k+1}} = 0, \quad \forall k,\ 1 \le k \le \nu, \tag{12} \]
then $u$ is bispectral.

We close this section with a few remarks on the KdV master-symmetry hierarchy. The concept of master-symmetry was introduced in the work of Fokas and Fuchssteiner [11] and explored in a number of ensuing works [8, 12–14, 28–30]. The flow $\tau_0(u) = (1/2)xu_x + u$ is the infinitesimal generator of the one parameter group of scaling symmetries
\[ u(x) \mapsto e^{s}u(e^{s/2}x), \quad s \in \mathbb{R}. \]
A short computation gives that the first (nontrivial) flow of the hierarchy is
\[ \tau_1(u) = -\frac{x}{2}\,(u_{xxx} - 6uu_x) - 2u_{xx} + 4u^2 + u_x\partial_x^{-1}u. \]
For our purposes, we are concerned only with rational functions. We define therefore $\partial_x^{-1}$ as an integral along a path of the complex plane joining $\infty$ to $x$ and avoiding all the poles of the function under consideration. Since we are looking for rational solutions, the rational functions under consideration must have zero residues at all poles. This makes the choice of the integral well defined.

2. Motion of the Poles Under $\tau_1$

2.1. Preliminaries. This section focuses on the rational solutions of the first flow of the master-symmetry hierarchy, namely
\[ u_t = \tau_1[u], \tag{13} \]
where $\tau_1 = N_u\tau_0$.

2.2. The form of the rational solutions. The first key remark in understanding the rational solutions of the flow $\tau_1$ is that they must have the form
\[ u(x,t) = \sum_{p\in Q}\frac{c_p}{(x-p)^2}. \]
This is a consequence of the following:

Proposition 1. If $u$ is a rational solution of (13), of the form (7), then we have
1. If $p \in Q$, then $n_p = 2$.
2. The coefficient $c_{j,p} = 0$ unless $j = 2$.
Proof. The claim follows from a straightforward analysis of the leading order term of the asymptotic expansion, near a pole $p$, of both sides of (13). The LHS gives (denoting the time derivative by a dot)
\[ \dot u = \sum_{p\in Q}\sum_{j=1}^{n_p}\left[\frac{\dot c_{j,p}}{(x-p)^{j}} + \frac{j\,\dot p\,c_{j,p}}{(x-p)^{j+1}}\right]. \]
The RHS is given by $(N_u\tau_0)(u)$, where
\[
\begin{aligned}
\tau_0[u] &= \frac12\,xu_x + u \\
&= \frac{x}{2}\sum_p\sum_j\frac{c_{j,p}(-j)}{(x-p)^{j+1}} + \sum_p\sum_j\frac{c_{j,p}}{(x-p)^{j}} \\
&= \sum_p\sum_j\left[\frac{c_{j,p}(-j/2+1)}{(x-p)^{j}} - \frac{j\,p\,c_{j,p}}{2(x-p)^{j+1}}\right].
\end{aligned}
\]
After plugging $\tau_0(u)$ given above into $\tau_1 = N_u\tau_0$ we get $\tau_1[u] = A + B + C$, with
\[
\begin{aligned}
A &= \sum_p\sum_j\left[-\frac{j(j+1)(-(j/2)+1)\,c_{j,p}}{(x-p)^{j+2}} + \frac{j(j+1)(j+2)\,p\,c_{j,p}}{2(x-p)^{j+3}}\right], \\
B &= 4\left(\sum_p\sum_j\frac{c_{j,p}}{(x-p)^{j}}\right)\left(\sum_p\sum_j\left[\frac{c_{j,p}(-j/2+1)}{(x-p)^{j}} - \frac{j\,p\,c_{j,p}}{2(x-p)^{j+1}}\right]\right), \\
C &= 2\left(\sum_p\sum_j\frac{c_{j,p}(-j)}{(x-p)^{j+1}}\right)\left(\sum_p\sum_j\left[\frac{c_{j,p}(-j/2+1)}{(x-p)^{j-1}(-j+1)} + \frac{p\,c_{j,p}}{2(x-p)^{j}}\right]\right).
\end{aligned}
\]

Lemma 1. If $p \neq 0$, then $n_p \le 2$.

The leading term of $A$ in the vicinity of a given pole $p$ is
\[ \frac{j(j+1)(j+2)\,p\,c_{j,p}}{2(x-p)^{j+3}} \tag{14} \]
with $j = n_p$. On the other hand, the combined contribution of $B$ and $C$ in the vicinity of $p \neq 0$ is given by
\[ \frac{-3\,j\,p\,c_{j,p}^{2}}{(x-p)^{2j+1}} \tag{15} \]
with $j = n_p$. If $n_p > 2$, then the leading term on the RHS would be the contribution from $B$ and $C$, since in this case $2n_p + 1 > n_p + 3$. But this cannot be balanced by the LHS, whose highest order pole is $n_p + 1$, which is strictly less than $2n_p + 1$. This proves the lemma.
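The partial-fraction form of $\tau_0[u]$ used at the start of the proof can be checked exactly on sample data. The following is a minimal sketch in Python using exact rational arithmetic; the pole data in `terms` and the evaluation points are hypothetical test values chosen only for illustration, and the function names are not from the paper.

```python
from fractions import Fraction


def u(x, terms):
    """u(x) = sum of c / (x - p)**j over terms (p, j, c)."""
    return sum(c / (x - p) ** j for p, j, c in terms)


def u_x(x, terms):
    """Derivative of u with respect to x."""
    return sum(-j * c / (x - p) ** (j + 1) for p, j, c in terms)


def tau0_direct(x, terms):
    """tau_0[u] = (1/2) x u_x + u, evaluated directly."""
    return Fraction(1, 2) * x * u_x(x, terms) + u(x, terms)


def tau0_partial_fractions(x, terms):
    """The partial-fraction form of tau_0[u] used in the proof."""
    total = Fraction(0)
    for p, j, c in terms:
        total += c * (1 - Fraction(j, 2)) / (x - p) ** j
        total -= j * p * c / (2 * (x - p) ** (j + 1))
    return total


# Hypothetical test data: (pole p, order j, coefficient c_{j,p}).
terms = [
    (Fraction(0), 2, Fraction(3, 4)),
    (Fraction(1), 2, Fraction(2)),
    (Fraction(-2), 3, Fraction(5)),
    (Fraction(1, 3), 1, Fraction(-7)),
]
sample_points = [Fraction(5), Fraction(-7, 2), Fraction(11, 3)]
mismatches = [x for x in sample_points
              if tau0_direct(x, terms) != tau0_partial_fractions(x, terms)]
print(mismatches)
```

Because the arithmetic is exact, an empty list of mismatches means the two expressions agree identically at the sampled points; it is a sanity check of the algebra, not a substitute for the proof.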
In a similar vein we can show:

Lemma 2. If $p = 0$, then $n_p = 2$.

The fact that $c_{1,p} = 0$ is an obvious consequence of the fact that any nonzero coefficient for $(x-p)^{-1}$ would lead to logarithms on the RHS. This concludes the proof of the proposition. □

We are then left with
\[ u = \sum_{p}\frac{c_{2,p}}{(x-p)^2}. \]
Henceforth, we set $c_p = c_{2,p}$. Now, since $n_p = 2$, the leading coefficients of $A$, $B$ and $C$ combine to give
\[ \frac{p\,(6c_p - 12)\,c_p}{(x-p)^5} = 0. \]
Since we are assuming that $p$ is indeed a pole of $u$, it follows that $c_p = 2$ for $p \neq 0$. We have thus proved

Lemma 3. Under the assumptions of Proposition 1, the coefficient $c_p = 2$ whenever $p \neq 0$.

We are now ready to state

Proposition 2. If $u$ is a generic rational solution of $u_t = \tau_1[u]$, then
\[ u = \frac{c}{x^2} + \sum_{0\neq p\in Q}\frac{2}{(x-p)^2}, \tag{16} \]
where $c$ is independent of $t$.

Proof. It remains to show that $\dot c = 0$. The value of $\dot c$ is the coefficient of the term in $x^{-2}$ of $A + B + C$. However, $A$ does not have any term in $x^{-2}$. After a partial fraction decomposition of $B$ and $C$, we get $\mathrm{Coeff}(B + C, x^{-2}) = 0$. Hence, the coefficient of the term in $x^{-2}$ of the RHS vanishes. □

Remark 1. We are now left with the task of studying rational functions of the form (16). Most of this paper is dedicated to showing that either $c = \nu(\nu+1)$ with $\nu \in \mathbb{Z}_{\ge 0}$ or $c = l^2 - 1/4$ for $l \in \mathbb{Z}$. This turns out to be much harder than the proof of Proposition 1. It is rather disturbing that we had to add the hypothesis that $u(x)$ remains rational by the higher order flows $\{\tau_k\}_{k\ge 2}$ in order to get such a result. We are naturally led to formulate the following:

Conjecture 1. The conclusion of Theorem 1 also holds if we assume that $u(x)$ remains rational under the flow of $\tau_1$ only and not necessarily by all the flows of the master-symmetry hierarchy.
2.3. The motion of the poles. We now consider the evolution equation satisfied by the poles of a rational solution $u(x,t)$ of $u_t = \tau_1(u)$. Once again they will be obtained by balancing the powers of $(x-p)$ on both sides of this equation. As in the case of the rational solutions of KdV studied in [2,4,9], we encounter two types of consequences from the balancing of the powers of $(x-p)$. The first ones are evolution equations for the poles. The second ones are constraints on the pole locations. Now, one major difference between the present situation and the rational solutions of KdV is the fact that here the equation is not invariant by translations. This is a direct consequence of the fact that $x$ multiplies three of the terms in $\tau_1$. As a result we have to consider two possibilities, namely, zero as a stationary pole of $u(x,t)$ and zero as a moving pole. If $p = 0$ at a certain time, say $t = 0$, and after that $p(t) \neq 0$, then we can assume, without loss of generality, that $c = 0$ in Eq. (16). This gives
\[ u(x,t) = \sum_{p\in Q}\frac{2}{(x-p(t))^2}. \]
After plugging $u$ of this form into $u_t = \tau_1(u)$ and comparing the terms in $(x-p)^{-3}$ we obtain
\[ \sum_{q\in Q\setminus\{p\}}\frac{2}{(p-q)^3} = 0. \]
But these are exactly the celebrated locus conditions of Airault–McKean–Moser. It is known that the (Airault–McKean–Moser) locus is empty, unless the total number of poles is of the form $d(d+1)/2$, where $d$ is a positive integer. It follows again from the results of [10] (Thm. 3.4) that if the locus is not empty, then $u$ can be obtained by Darboux transformations from $u_0 = 0$. Hence, $u$ is bispectral. It turns out that in this case, $u$ happens to remain rational by the flows of the KdV hierarchy. We have thus proved:

Proposition 3. If a generic rational potential $u(x)$, decaying at $\infty$, remains rational by the flow of $\tau_1$ and does not have $x = 0$ as a stationary pole, then it also remains rational by the flows of the KdV hierarchy.

We now focus on the case that $x = 0$ is stationary. From now on, we refer to $P$ as the set of nonzero poles of $u$. We shall also denote by $P_p$ the set
\[ P_p \overset{\mathrm{def}}{=} P \setminus \{p\}. \]

Proposition 4. Let $u$ be of the form
\[ u = \frac{c}{x^2} + \sum_{p\in P}\frac{2}{(x-p)^2}, \tag{17} \]
where each nonzero pole $p(t)$ of $u(\cdot,t)$ depends smoothly on $t$. Then, $u$ is a solution of
\[ u_t = \tau_1(u) \tag{18} \]
if, and only if, the configuration of nonzero poles $P(t)$ for $u(\cdot,t)$ satisfies
\[ c\sum_{p\in P}\frac{1}{p} = 0, \tag{19} \]
\[ \frac{c}{p^3} + \sum_{q\in P_p}\frac{2}{(p-q)^3} = 0, \quad \forall p \in P, \tag{20} \]
and
\[ \dot p = -2\left(\frac{c}{p} + \sum_{q\in P_p}\frac{2p+q}{(p-q)^2}\right), \quad p \in P. \tag{21} \]

Proof. The proof of this result is a straightforward, albeit laborious, exercise in partial fraction decomposition. The evolution of the poles, which is given by conditions (21), is obtained by balancing the powers of $(x-p)^{-3}$ on the left- and right-hand side of (18). Condition (19) appears by balancing the terms in $x^{-3}$. Condition (20) results from the analysis of the powers of $(x-p)^{-2}$. Details can be found in [32]. □

Remark 2. If we sum the condition (20) over all the poles, we get
\[ c\sum_{p\in P}\frac{1}{p^3} = 0. \tag{22} \]
This equation appears also by balancing the terms in $x^{-1}$.

Recall that the $n$-fold symmetric product $S^n(\mathbb{C})$ is the set of unordered $n$-tuples of distinct complex numbers. This set has the structure of a complex variety with local charts on $\mathbb{C}^n \setminus \Delta$, where $\Delta = \{z = (z_1,\dots,z_n) \in \mathbb{C}^n \mid \exists\, i \neq j \text{ s.t. } z_i = z_j\}$. Let $n$ be the number of nonzero poles of $u$. When $c = 0$ (or $c = 2$) the subset of points in $\mathbb{C}^n$ (resp. $\mathbb{C}^{n+1}$) described by condition (20) plays an important role in the theory of the rational solutions of KdV. It was called the locus by Airault, McKean, and Moser [4]. In analogy with the terminology of [4] we have the following:

Definition 1. The set of points in $S^n(\mathbb{C})$ satisfying the constraints (19) and (20) will be called the locus. We shall denote it by $L_c$.

A few natural questions arise:
1. For which values of $c$ is the locus nonempty?
2. Is the vector field defined by (21) consistent with the locus conditions?
3. What is the relation between the locus $L_c$ and the bispectral problem?

We do not have the answer to the first question in its full generality. In the case $c = 0$, the locus is void unless $n = d(d+1)/2$ for some $d = 1, 2, \dots$. See [4]. When $c = l^2 - 1/4$, then for certain values of $n$ the locus is certainly non-void; this follows from the results of [40]. Indeed, in [40] it is shown that the master-symmetry flows are tangential to all the manifolds of bispectral potentials. In particular, it follows that the configurations of poles for the potentials obtained by successive applications of Darboux to $u = -1/(4x^2)$ are in the locus $L_c$ for $c = l^2 - 1/4$.
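The locus conditions (19) and (20) and the pole dynamics (21) are easy to test numerically on a candidate configuration. The sketch below does this for the poles of the potential of Example 2, that is, the roots of $1 + s x^{2l} = 0$ with $c = (l-1)^2 - 1/4$, and for the three poles of $u_2$ in Example 1 with $c = 0$. The sample values of `l`, `s` and `t` and all function names are assumptions made for illustration; the check is in floating point, so residuals are only expected to vanish up to rounding error.

```python
import cmath


def nth_roots(n, w):
    """All n complex roots of x**n = w (w must be nonzero)."""
    base = cmath.exp(cmath.log(complex(w)) / n)
    return [base * cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def locus_residuals(c, poles):
    """Left-hand sides of (19) and (20) for a pole configuration."""
    r19 = c * sum(1 / p for p in poles)
    r20 = [
        c / p**3 + sum(2 / (p - q) ** 3 for j, q in enumerate(poles) if j != i)
        for i, p in enumerate(poles)
    ]
    return r19, r20


def pole_velocities(c, poles):
    """Right-hand side of (21): the velocity of each pole under tau_1."""
    return [
        -2 * (c / p + sum((2 * p + q) / (p - q) ** 2
                          for j, q in enumerate(poles) if j != i))
        for i, p in enumerate(poles)
    ]


# Poles of the potential in Example 2: roots of 1 + s*x**(2l) = 0.
l, s = 3, 0.5 + 0.3j          # hypothetical sample values
c = (l - 1) ** 2 - 0.25
poles = nth_roots(2 * l, -1 / s)

r19, r20 = locus_residuals(c, poles)
velocities = pole_velocities(c, poles)
worst = max([abs(r19)] + [abs(r) for r in r20] + [abs(v) for v in velocities])
print(worst)

# Poles of u_2 in Example 1: roots of x**3 + 12*t = 0, with c = 0.
t = 0.4                        # hypothetical sample value
_, amm = locus_residuals(0.0, nth_roots(3, -12 * t))
print(max(abs(r) for r in amm))
```

Vanishing velocities in the first case are consistent with the potential being a stationary solution of $\tau_1$, and the second case is the classical Airault–McKean–Moser condition with $n = 3 = d(d+1)/2$ for $d = 2$.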
Example 3. Take $l \in \mathbb{Z}_{>0}$ and any nonzero $s \in \mathbb{C}$. The potential
\[ u(x,t) = \frac{(l-1)^2 - 1/4}{x^2} - 2\partial_x^2\log(1 + s x^{2l}) \]
is a stationary rational solution of (18). In particular, its pole configuration satisfies the locus conditions with $c = (l-1)^2 - 1/4$ and $2l$ nonzero poles. The pole configuration consists of the roots of $1 + s x^{2l} = 0$, that is, the $2l$-th roots of $-1$ multiplied by $s^{-1/(2l)}$.

As for the second question, we shall address it in the next section, where we treat the problem of consistency of the locus condition with the vector fields. Finally, the third problem can be answered in part by stating the main result of this paper. Namely, if a generic rational potential $u(x)$ decaying at infinity conserves such properties by all the flows of the KdV master-symmetries, then it is bispectral.

3. The Locus and Higher Order Flows

In the previous section we saw that the generic vanishing rational solutions of $\tau_1$ are potentials of the form
\[ u = \frac{c}{x^2} + \sum_{p\in P}\frac{2}{(x-p)^2}. \tag{23} \]
Furthermore, the configuration of nonzero poles $P$ satisfies the constraints given by the locus conditions (19) and (20), and evolves according to
\[ \dot p = -2\left(\frac{c}{p} + \sum_{q\in P_p}\frac{2p+q}{(p-q)^2}\right). \]
The main question we address in this section is whether the evolution above is indeed consistent with the locus conditions. To do that, we shall assume that we are working with a regular point of a nonempty locus. If this is the case, then a vector $(\dot p)_{p\in P}$ belongs to the tangent space $T_P L_c$ iff the following conditions, obtained by differentiating (19) and (20), are satisfied:
\[ c\sum_{p\in P}\frac{\dot p}{p^2} = 0, \]
\[ c\,\frac{\dot p}{p^4} + \sum_{q\in P_p}\frac{2(\dot p - \dot q)}{(p-q)^4} = 0, \quad \forall p \in P. \]
We remark that a tangent vector $(\dot p)_{p\in P}$ to $S^n(\mathbb{C})$ induces a tangent vector on the space of functions of the form (23) according to
\[ \dot v = \sum_{p\in P}\frac{4\dot p}{(x-p)^3}. \tag{24} \]
The main goal of this section is to show that the master-symmetry vector fields induce vector fields on the locus $L_c$. More precisely, we will show that
Proposition 5. If $u(x)$ is a generic rational solution, decaying at infinity, of the KdV master-symmetry flows $\{\tau_i\}_{i\ge 1}$, then each $\tau_i$ induces a tangent vector to the locus, in the sense that
\[ \tau_i[u] = \sum_{p\in P}\frac{4\dot p}{(x-p)^3}, \]
where $(\dot p)_{p\in P} \in TL_c$.

Proof. The generic assumption implies that $u$ has the form (23) and that it makes sense to talk about the derivative of the poles along the flow of $\tau_i$ and $\tau_{i+1}$. We shall use the notation $\tau_j(p)$ to denote the time derivative of a pole $p$ along the flow $\tau_j$. The fact that $\tau_{i+1} = N_u\tau_i$ and that the rational solution of $u_t = \tau_{i+1}(u)$ has the form (23) implies that
\[ \sum_{p\in P} -2\mu'(x-p)\,\tau_{i+1}(p) = \tau_{i+1}(u) = N_u\tau_i(u), \tag{25} \]
where $\mu(x) \overset{\mathrm{def}}{=} x^{-2}$. On the other hand, we have
\[
\begin{aligned}
N_u\tau_i[u] &= \left(-\partial_x^2 + 4u + 2u_x\partial_x^{-1}\right)\sum_{p\in P} -2\mu'(x-p)\,\tau_i(p) \\
&= \sum_{p\in P} 2\mu'''(x-p)\,\tau_i(p)
 + 4\Big(c\mu(x) + \sum_{p\in P}2\mu(x-p)\Big)\sum_{q\in P} -2\mu'(x-q)\,\tau_i(q) \\
&\quad + 2\Big(c\mu'(x) + \sum_{p\in P}2\mu'(x-p)\Big)\sum_{q\in P} -2\mu(x-q)\,\tau_i(q).
\end{aligned}
\]
After rearranging the terms we get
\[
\begin{aligned}
N_u\tau_i[u] &= \sum_{p\in P}2\mu'''(x-p)\,\tau_i(p) - 24\sum_{p\in P}\mu(x-p)\mu'(x-p)\,\tau_i(p) \\
&\quad - 8c\sum_{p\in P}\mu(x)\mu'(x-p)\,\tau_i(p) - 4c\sum_{p\in P}\mu'(x)\mu(x-p)\,\tau_i(p) \\
&\quad - 16\sum_{\substack{p,q\in P\\ p\neq q}}\mu(x-q)\mu'(x-p)\,\tau_i(p) - 8\sum_{\substack{p,q\in P\\ p\neq q}}\mu(x-q)\mu'(x-p)\,\tau_i(q).
\end{aligned}
\]
Hence, using that $\mu''' = 12\mu\mu'$ and swapping the rôles of $p$ and $q$, we have
\[
\begin{aligned}
N_u(\tau_i[u]) &= -8c\sum_{p\in P}\mu(x)\mu'(x-p)\,\tau_i(p) - 4c\sum_{p\in P}\mu'(x)\mu(x-p)\,\tau_i(p) \\
&\quad - 8\sum_{\substack{p,q\in P\\ p\neq q}}\mu(x-q)\mu'(x-p)\,[2\tau_i(p) + \tau_i(q)].
\end{aligned}
\]
Now, a partial fraction decomposition yields
\[
\begin{aligned}
\Delta \overset{\mathrm{def}}{=} N_u\tau_i = {}& -8c\sum_{p\in P}\Big[\mu''(p)x^{-1} - \mu'(p)x^{-2} - \mu''(p)(x-p)^{-1} - 2\mu'(p)\mu(x-p) + \mu(p)\mu'(x-p)\Big]\tau_i(p) \\
& -4c\sum_{p\in P}\Big[-\mu''(p)x^{-1} + 2\mu'(p)x^{-2} + \mu(p)\mu'(x) + \mu''(p)(x-p)^{-1} + \mu'(p)\mu(x-p)\Big]\tau_i(p) \\
& -8\sum_{p\neq q}\Big[-\mu''(p-q)(x-p)^{-1} - 2\mu'(p-q)\mu(x-p) + \mu(p-q)\mu'(x-p) \\
& \qquad\qquad + \mu''(p-q)(x-q)^{-1} - \mu'(p-q)\mu(x-q)\Big]\big[2\tau_i(p) + \tau_i(q)\big].
\end{aligned}
\]
Substituting this into Eq. (25) and comparing the terms of the same power we get, denoting $\dot p = \tau_i(p)$,
\[ \mathrm{Coeff}(\Delta, x^{-1}):\quad -24c\sum_{p\in P}p^{-4}\dot p = 0, \tag{26} \]
\[ \mathrm{Coeff}(\Delta, (x-p)^{-1}):\quad 24\Big[c\,p^{-4}\dot p + \sum_{q\in P_p}2(p-q)^{-4}(\dot p - \dot q)\Big] = 0, \tag{27} \]
\[ \mathrm{Coeff}(\Delta, (x-p)^{-2}):\quad -24\Big[c\,p^{-3} + \sum_{q\in P_p}2(p-q)^{-3}\Big]\dot p = 0, \tag{28} \]
\[ \mathrm{Coeff}(\Delta, x^{-3}):\quad 8c\sum_{p\in P}p^{-2}\dot p = 0, \tag{29} \]
\[ \mathrm{Coeff}(\Delta, (x-p)^{-3}):\quad 16\Big[c\,p^{-2}\dot p + \sum_{q\in P_p}(p-q)^{-2}(2\dot p + \dot q)\Big] = 4\tau_{i+1}(p). \tag{30} \]
Equations (26), (27), (29) show that $\tau_i$ is indeed tangent to the locus, whereas Eq. (30) gives us the effect of $\tau_{i+1}$ on the poles. □

Although the next two propositions will not be used in the sequel, we believe they provide further information about the constraints that the poles of $u$ have to satisfy in order for it to be a rational solution of the master-symmetry hierarchy. The first one is a byproduct of the proof of Proposition 5 and is the analogue, for the case at hand, of a key result in [4] (Prop. 3). Namely,

Proposition 6. Let $u$ be of the form (23), with $P$ a regular point in $L_c$. Then, $(\dot p) \in TL_c$ iff the residues of
\[ F \overset{\mathrm{def}}{=} N_u\sum_{p\in P}\frac{4\dot p}{(x-p)^3} \]
all vanish and
\[ \sum_{p\in P}\frac{\dot p}{p^2} = 0. \]
Furthermore, if this is the case then
\[ F = 16\sum_{p\in P}\left[\frac{c\,\dot p}{p^2} + \sum_{q\in P_p}\frac{2\dot p + \dot q}{(p-q)^2}\right](x-p)^{-3}. \]

Proof. Given a tangent vector $(\dot p)_{p\in P}$ to $S^n(\mathbb{C})$, let us consider the corresponding $\dot v$ as given by Eq. (24). The action of $N_u$ on the vector $\dot v$, after a computation similar to the one in the previous result, is given by an expression of the form $\Delta$. After looking into the coefficients of the terms in $(x-p)^{-1}$ we get residues proportional to those in the RHS of Eqs. (26), (27), and (29). The first assertion then follows, since $P$ is a regular point in the locus. Using Eqs. (29) and (30) gives the second assertion. □

Proposition 7. Let $u$ be of the form (23). Then $u$ belongs to the locus $L_c$ iff
\[ c\sum_{p\in P}\frac{1}{p} = 0, \]
and for any closed pole-free path $\Gamma$ we have
\[ \frac{1}{2\pi i}\oint_{\Gamma}u^2(x)\,dx = 0. \tag{31} \]

Proof. Just compute the residues of $u^2$ using partial fraction decomposition. These residues are (except for numerical factors) the expressions on the LHS of Eqs. (19) and (20), which define the locus. The claim then follows by Cauchy's formula. □

In a similar vein, one can show that if $u$ is a vanishing rational solution of $\tau_2$, then for any closed pole-free path $\Gamma$,
\[ \frac{1}{2\pi i}\oint_{\Gamma}\Big(\frac12\,u_x^2 + u^3\Big)\,dx = 0. \tag{32} \]
The integrands for Eqs. (31) and (32) are the currents for the corresponding conserved quantities of the associated KdV hierarchy.

4. Asymptotic Expansions and Recursion

Let us assume that the potential $u$ is of the form
\[ u = \frac{c}{x^2} + \sum_{p\in P}\frac{2}{(x-p)^2} \tag{33} \]
and that it remains rational by the flows of the KdV master-symmetry hierarchy. The first goal of this section is to show, in that case, that for sufficiently high $i$ we have $\tau_i(u) = 0$. More precisely, we have the following:
Proposition 8. Let $u$ be a potential of the form (33), where $n$ denotes the number of nonzero poles. If $u$ remains rational by all the KdV master symmetries $\tau_i$, $i \ge 0$, then
\[ \tau_i(u) = 0, \quad i > (n-1)/2. \tag{34} \]

The second goal is to show that if $c$ is not of the form $\nu(\nu+1)$, $\nu \in \mathbb{Z}_{\ge 0}$, then the configuration of nonzero poles is balanced in the sense that $P = -P$.

Proposition 9. Under the same assumptions as in Proposition 8, if $c$ is not of the form $\nu(\nu+1)$, with $\nu = 0, 1, 2, \dots$, then the set of nonzero poles is balanced.

In order to prove Propositions 8 and 9, we shall make use of the asymptotic expansion of the vector fields $\tau_i(u)$ for $x \to 0$ as well as that for $x \to \infty$. Let us denote by
\[ \pi_k \overset{\mathrm{def}}{=} \sum_{p\in P}p^k, \quad k = \pm 1, \pm 2, \dots. \]
The first lemma gives us the asymptotic behavior of $\tau_i(u)$ as $x \to \infty$.

Lemma 4. Let $u$ be given by Eq. (33). Then, in a neighborhood of $x = \infty$,
\[ \tau_i(u) = \sum_{k\ge 1}\beta_k^i\,x^{-k-2-2i}, \]
where $\beta_k^0 = -k(k+1)\pi_k$ and, for $i \ge 1$,
\[ \beta_k^i = \frac{k+2i}{k+2i-1}\big[4c + 8n - (k+2i-1)(k+2i+1)\big]\beta_k^{i-1} + \sum_{\substack{j+l=k\\ j\ge 1}}\frac{4(2l+j+4i)}{l+2i-1}\,(j+1)\pi_j\,\beta_l^{i-1}. \]

Proof. The asymptotic expansion of $u$ near $x = \infty$ is given by
\[
\begin{aligned}
u(x) &= (c+2n)x^{-2} + 4\pi_1x^{-3} + 6\pi_2x^{-4} + \dots + 2(j+1)\pi_jx^{-j-2} + \dots \\
&= c\,x^{-2} + \sum_{j=0}^{\infty}\alpha_j\,x^{-j-2},
\end{aligned}
\]
with $\alpha_j = 2(j+1)\pi_j$. Applying $\tau_0$ we get
\[ \tau_0(u) = \sum_{j=1}^{\infty}\beta_j^0\,x^{-j-2}, \]
where $\beta_j^0 = -j(j+1)\pi_j$. The next step is to apply $N_u$, to get
\[ \tau_1(u) = \sum_{k=0}^{\infty}\bigg\{\frac{k+2}{k+1}\big[4c + 8n - (k+1)(k+3)\big]\beta_k^0 + \sum_{\substack{j+l=k\\ j\ge 1}}\frac{4(2l+j+4)}{l+1}\,(j+1)\pi_j\,\beta_l^0\bigg\}x^{-k-4}. \]
A straightforward induction gives the claimed formula for $\beta_k^i$. □
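The case $i = 1$ of Lemma 4 can be checked numerically by comparing the closed form of $\tau_1(u)$ with its truncated expansion at a point outside the disc containing the poles. The sketch below assumes $u$ of the form (33) and the formula $\tau_1(u) = -\frac{x}{2}(u_{xxx} - 6uu_x) - 2u_{xx} + 4u^2 + u_x\partial_x^{-1}u$ with $\partial_x^{-1}$ vanishing at infinity; the values of `c`, `poles` and `x` are hypothetical and need not lie on the locus, since the expansion is an identity for any $u$ of this form.

```python
def tau1(x, c, poles):
    """tau_1(u) at the point x for u = c/x^2 + sum_p 2/(x - p)^2.

    Uses tau_1(u) = -(x/2)(u_xxx - 6 u u_x) - 2 u_xx + 4 u^2 + u_x D^{-1}u,
    with D^{-1} the antiderivative that vanishes at infinity.
    """
    u = c / x**2 + sum(2 / (x - p) ** 2 for p in poles)
    ux = -2 * c / x**3 - sum(4 / (x - p) ** 3 for p in poles)
    uxx = 6 * c / x**4 + sum(12 / (x - p) ** 4 for p in poles)
    uxxx = -24 * c / x**5 - sum(48 / (x - p) ** 5 for p in poles)
    inv_u = -c / x - sum(2 / (x - p) for p in poles)
    return -(x / 2) * (uxxx - 6 * u * ux) - 2 * uxx + 4 * u * u + ux * inv_u


def power_sum(k, poles):
    """pi_k = sum of p**k over the nonzero poles."""
    return sum(p**k for p in poles)


def beta0(k, poles):
    """beta_k^0 = -k (k + 1) pi_k."""
    return -k * (k + 1) * power_sum(k, poles)


def beta1(k, c, poles):
    """beta_k^1 from the recursion of Lemma 4 with i = 1."""
    n = len(poles)
    diagonal = ((k + 2) / (k + 1)
                * (4 * c + 8 * n - (k + 1) * (k + 3)) * beta0(k, poles))
    cross = sum(
        4 * (2 * l + j + 4) / (l + 1) * (j + 1)
        * power_sum(j, poles) * beta0(l, poles)
        for j in range(1, k + 1)
        for l in [k - j]
    )
    return diagonal + cross


def tau1_series(x, c, poles, terms=40):
    """Truncated expansion sum_{k=1}^{terms} beta_k^1 x^(-k-4)."""
    return sum(beta1(k, c, poles) * x ** (-k - 4) for k in range(1, terms + 1))


# Hypothetical data: any c and any distinct nonzero poles will do here.
c = 0.7
poles = [1.0 + 0j, -0.5 + 0.5j, 0.3j]
x = 10.0 + 2.0j
direct = tau1(x, c, poles)
series = tau1_series(x, c, poles)
relative_error = abs(direct - series) / abs(direct)
print(relative_error)
```

A relative error at rounding level supports the recursion for $i = 1$; for a single pole and $c = 0$ both sides vanish identically, which is a convenient special case to try.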
Proof of Proposition 8. We have shown in Sect. 3 that the flows $\tau_i$ are tangent to the locus $L_c$. For $(p_1,\dots,p_n) \in \mathbb{C}^n\setminus\Delta$ the $n$-tuple $(\pi_1,\dots,\pi_n)$ serves as coordinates for the ambient space of the locus. Furthermore, it passes to the quotient under the permutations of $p_1,\dots,p_n$. We shall denote by $\tau_i(\pi_k)$ the effect of the vector field $\tau_i$ on such coordinates. We expand $u$ in the neighborhood of $x = \infty$ and look at the effect of $\tau_i$ on $\pi_k$. Using Lemma 4 we get
\[
\begin{aligned}
\tau_i(u) &= 4\tau_i(\pi_1)x^{-3} + 6\tau_i(\pi_2)x^{-4} + \dots + 2(k+1)\tau_i(\pi_k)x^{-k-2} + \dots \\
&= \beta_1^i\,x^{-2i-3} + \beta_2^i\,x^{-2i-4} + \dots + \beta_k^i\,x^{-k-2-2i} + \dots.
\end{aligned}
\]
Hence, comparing the powers of $x$, we have that $\tau_i(\pi_k) = 0$, provided $i > (k-1)/2$. However, the knowledge of the numbers $\tau_i(\pi_k)$, for $k = 1,\dots,n$, uniquely determines the effect of $\tau_i$ on the poles $p_1,\dots,p_n$. Hence, $\tau_i = 0$ on the locus $L_c$ for $i > (n-1)/2$. □

Remark 3. The argument used in the proof of Proposition 8 was inspired by the proof of Corollary 1, p. 109, of [4].

Our next goal is to prove Proposition 9. This will be done with the help of the asymptotic behavior of $\tau_j(u)$ at $x = 0$. Before delving into the more technical Lemma 6, we mention yet another elementary remark, namely:

Lemma 5. Let $P \overset{\mathrm{def}}{=} \{x_1,\dots,x_n\}$ be a set of $n$ distinct nonzero complex numbers. Assume that
\[ x_1^{-2j+1} + \dots + x_n^{-2j+1} = 0, \quad j = 1, 2, \dots, n. \tag{35} \]
Then, $n$ is even and the set $P$ is balanced, i.e., $P = -P$.

Proof. Let us consider the matrix
\[ A = \begin{pmatrix} x_1^{-1} & x_2^{-1} & \dots & x_n^{-1} \\ \vdots & \vdots & & \vdots \\ x_1^{-2n+1} & x_2^{-2n+1} & \dots & x_n^{-2n+1} \end{pmatrix}. \]
Because of Eq. (35), the vector $(1,\dots,1) \in \ker(A)$. Thus $\det(A) = 0$. On the other hand, $x_1\cdots x_n\det(A)$ can be computed as a Vandermonde determinant in the variables $x_i^{-2}$, giving, up to a sign,
\[ \det(A) = \frac{1}{x_1\cdots x_n}\prod_{1\le i < j\le n}\left(\frac{1}{x_i^2} - \frac{1}{x_j^2}\right). \]
Since $\det(A) = 0$, some factor vanishes, so $x_i^2 = x_j^2$ for some $i \neq j$; since the numbers $x_1,\dots,x_n$ are distinct, this pair satisfies $x_i + x_j = 0$. A simple induction yields the result. □

Remark 4. Using the same techniques of Sect. 2 one can show that if $u$ is a generic and vanishing rational solution of the flow $\tau_k$, $k \ge 2$ (but not necessarily of the others), then: the order of each singularity is precisely 2, no residue appears, and the coefficient $c_{2,p}$ for any nonzero pole $p$ is of the form $l_p(l_p+1)$ with $l_p \in \mathbb{Z}_{\ge 1}$. In other words, the solution $u$ has the form
\[ u = \frac{c}{x^2} + \sum_{p\in P}\frac{l_p(l_p+1)}{(x-p)^2}. \]
Furthermore, the value of the coefficient associated to the pole $x = 0$ is kept constant along the flow of $\tau_k$.
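The determinant identity behind Lemma 5 can be illustrated with exact rational arithmetic. The sketch below builds the matrix of odd negative powers, compares its determinant with the Vandermonde-type product (written here with the sign convention $\prod_{i<j}(x_j^{-2} - x_i^{-2})$, which is the one that matches the row ordering used in the code), and confirms that a balanced set gives a vanishing determinant. The sample sets and helper names are hypothetical, and the Leibniz-formula determinant is only meant for small $n$.

```python
from fractions import Fraction
from itertools import permutations


def det(matrix):
    """Exact determinant via the Leibniz formula (fine for small n)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = Fraction(sign)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def lemma5_matrix(xs):
    """Rows k = 1..n hold the powers x_i**(-(2k - 1))."""
    n = len(xs)
    return [[1 / x ** (2 * k - 1) for x in xs] for k in range(1, n + 1)]


def product_formula(xs):
    """prod(1/x_i) * prod_{i<j} (x_j**-2 - x_i**-2)."""
    value = Fraction(1)
    for x in xs:
        value /= x
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            value *= 1 / xs[j] ** 2 - 1 / xs[i] ** 2
    return value


def odd_power_sums(xs):
    """The sums in (35) for j = 1..n."""
    n = len(xs)
    return [sum(1 / x ** (2 * j - 1) for x in xs) for j in range(1, n + 1)]


generic = [Fraction(1), Fraction(2), Fraction(-3), Fraction(1, 2)]
balanced = [Fraction(1), Fraction(-1), Fraction(2), Fraction(-2)]

print(det(lemma5_matrix(generic)) == product_formula(generic))
print(det(lemma5_matrix(balanced)), odd_power_sums(balanced))
```

For the generic set the determinant is nonzero, so (35) cannot hold; for the balanced set all odd power sums vanish and the determinant is zero, in line with the lemma.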
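We are now ready to state and prove:

Lemma 6. Let $u$ be a rational solution of the KdV master-symmetry hierarchy of the form (33), such that $c \neq \nu(\nu+1)$ for every $\nu \in \mathbb{Z}_{\ge 0}$. Then,
\[ \pi_{-2k-1} = 0, \quad k = 1, 2, \dots. \]
Furthermore,
\[ \tau_k(u) = \sum_{r\ge 0}\lambda_r^k\,x^r, \]
where $\lambda_r^0 = (r+1)(r+2)\pi_{-r-2}$ and the coefficients $\lambda_r^k$ are computed recursively by means of
\[ \lambda_r^k = \frac{r+2}{r+3}\big[4c - (r+1)(r+3)\big]\lambda_{r+2}^{k-1} + 4\sum_{\substack{j+l=r\\ j,l\ge 0}}\frac{2(l+1)+j}{l+1}\,(j+1)\pi_{-j-2}\,\lambda_l^{k-1}, \]
for $k \ge 1$.

Proof. Start from the Laurent expansion of $u$ at $x = 0$,
\[ u(x) = c\,x^{-2} + \sum_{k=0}^{\infty}\gamma_k\,x^k, \]
where $\gamma_k \overset{\mathrm{def}}{=} 2(k+1)\pi_{-k-2}$. Applying $\tau_0$ we get
\[ \tau_0(u) = \Big(1 + \frac12\,x\,\partial_x\Big)u(x) = \sum_{k=0}^{\infty}\lambda_k^0\,x^k, \]
with $\lambda_k^0 \overset{\mathrm{def}}{=} (k+1)(k+2)\pi_{-k-2}$. Hence,
\[
\begin{aligned}
\tau_1(u) = N_u\tau_0(u) &= \left(-\partial_x^2 + 4u + 2u_x\partial_x^{-1}\right)\tau_0(u) \\
&= \sum_{k=0}^{\infty} -k(k-1)\lambda_k^0\,x^{k-2}
 + 4\Big(\frac{c}{x^2} + \sum_{j=0}^{\infty}\gamma_j\,x^j\Big)\Big(\sum_{l=0}^{\infty}\lambda_l^0\,x^l\Big) \\
&\quad + 2\Big(-\frac{2c}{x^3} + \sum_{j=0}^{\infty}j\,\gamma_j\,x^{j-1}\Big)\Big(\sum_{l=0}^{\infty}\frac{\lambda_l^0\,x^{l+1}}{l+1}\Big).
\end{aligned}
\]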
We remark that an integration constant could, in principle, be added to the last summation. A closer look, however, indicates that such a constant must vanish. Indeed, $\tau_1(u)$ cannot have a pole of order 3 at $x = 0$. Thus,
\[
\begin{aligned}
\tau_1(u) &= \sum_{k=0}^{\infty}\frac{k}{k+1}\big[4c - (k-1)(k+1)\big]\lambda_k^0\,x^{k-2} + \sum_{k=0}^{\infty}\sum_{j+l=k}\frac{2[2(l+1)+j]}{l+1}\,\gamma_j\,\lambda_l^0\,x^k \\
&= \sum_{r=-1}^{\infty}\lambda_r^1\,x^r,
\end{aligned}
\]
with
\[ \lambda_r^1 = \frac{r+2}{r+3}\big[4c - (r+1)(r+3)\big]\lambda_{r+2}^0 + \sum_{\substack{j+l=r\\ r\ge 0}}\frac{2[2(l+1)+j]}{l+1}\,\gamma_j\,\lambda_l^0. \]
Since $u$ is a rational solution of $\tau_1$, the residue of $\tau_1(u)$ at $x = 0$ vanishes. Hence, $\lambda_{-1}^1 = 0$ and $2c\lambda_1^0 = 12c\,\pi_{-3} = 0$. Thus, $\pi_{-3} = 0$, since $c \neq 0$. This gives
\[ \tau_1(u) = \sum_{r\ge 0}\lambda_r^1\,x^r. \]
The induction procedure now becomes clear. Assume that for all $s$, $1 \le s \le k$,
\[ \pi_{-2s-1} = 0 \quad\text{and}\quad \tau_s(u) = \sum_{r\ge 0}\lambda_r^s\,x^r, \]
with
\[ \lambda_r^s = \frac{r+2}{r+3}\big[4c - (r+1)(r+3)\big]\lambda_{r+2}^{s-1} + \sum_{j+l=r}\frac{4[2(l+1)+j]}{l+1}\,(j+1)\pi_{-j-2}\,\lambda_l^{s-1}. \]
Then,
\[ \tau_{k+1}(u) = \sum_{r\ge -1}\lambda_r^{k+1}\,x^r, \]
where
\[ \lambda_r^{k+1} = \frac{r+2}{r+3}\big[4c - (r+1)(r+3)\big]\lambda_{r+2}^{k} + \sum_{\substack{j+l=r\\ r\ge 0}}\frac{2[2(l+1)+j]}{l+1}\,\gamma_j\,\lambda_l^{k}. \]
Here again, since $u$ is a rational solution of $\{\tau_j\}_{j\ge 1}$, $\tau_{k+1}(u)$ cannot have a pole of order 3 at $x = 0$. Thus the possible integration constant was omitted.
Furthermore, the residue of $\tau_{k+1}(u)$ vanishes at $x = 0$. This gives us that
\[
\begin{aligned}
0 = 2c\,\lambda_1^k &= 2c\,\bigg\{\frac34\,[4c - 2\cdot 4]\,\lambda_3^{k-1} + \sum_{j+l=1}\frac{2[2(l+1)+j]}{l+1}\,\gamma_j\,\lambda_l^{k-1}\bigg\} \\
&= 6c(c-2)\lambda_3^{k-1} + 48c\,\pi_{-3}\lambda_0^{k-1} + 16c\,\pi_{-2}\lambda_1^{k-1} \\
&= 6c(c-2)\lambda_3^{k-1},
\end{aligned}
\]
since
\[ \pi_{-3} = 0 \quad\text{and}\quad 2c\,\lambda_1^{k-1} = \mathrm{Res}_{x=0}\{\tau_k(u)\} = 0. \]
Using the induction hypothesis $\pi_{-5} = \dots = \pi_{-2k-1} = 0$ and the fact that $\mathrm{Res}_{x=0}\{\tau_{k-1}(u)\} = \dots = \mathrm{Res}_{x=0}\{\tau_1(u)\} = 0$, we have
\[ \lambda_1^k = \frac34\cdot\frac56\cdots\frac{2k+1}{2k+2}\;4^k\,(c-2)(c-6)\cdots(c-k(k+1))\,\lambda_{2k+1}^0 = 0. \]
Since $c \neq \nu(\nu+1)$, $\nu \in \mathbb{Z}_{\ge 0}$, we have $\lambda_{2k+1}^0 = (2k+2)(2k+3)\pi_{-2k-3} = 0$. Hence, $\pi_{-2k-3} = 0$, which concludes the induction. □

Proof of Proposition 9. If $c$ is not of the form $\nu(\nu+1)$, $\nu = 0, 1, 2, \dots$, and $u$ is a rational solution of the master-symmetry hierarchy, then Lemma 6 ensures that $\pi_{-1} = \pi_{-3} = \dots = 0$. This, in light of Lemma 5, gives us that the configuration $P$ of nonzero poles is balanced. □

It now remains to analyze the situation when $c = \nu(\nu+1)$ with $\nu$ a positive integer. As a corollary of the proof of Lemma 6 we get

Corollary 2. If $u$ is of the form (33), with $c = \nu(\nu+1)$, $\nu = 0, 1, \dots$, and remains rational by the flows of the master-symmetry hierarchy for KdV, then $u$ is bispectral.

Proof. If $c = \nu(\nu+1)$, with $\nu$ a positive integer, then the argument in the proof of Lemma 6 gives that $\pi_{-3} = \pi_{-5} = \dots = \pi_{-2\nu-1} = 0$. Since the poles of $u$ satisfy the locus condition, using Theorems 3.4 and 3.5 of [10] (recalled in Theorem 5) it follows that $u$ is bispectral. □
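The first step of the induction in Lemma 6 rests on the identity $\mathrm{Res}_{x=0}\,\tau_1(u) = 2c\lambda_1^0 = 12c\,\pi_{-3}$, which holds for any $u$ of the form (33), on or off the locus. As a rough numerical illustration, the sketch below evaluates $\tau_1(u)$ from its closed form and extracts the residue with a trapezoid rule on a small circle around the origin. The sample `c`, `poles`, `radius` and the helper names are assumptions made for this example only.

```python
import cmath


def tau1(x, c, poles):
    """tau_1(u) at the point x for u = c/x^2 + sum_p 2/(x - p)^2.

    Uses tau_1(u) = -(x/2)(u_xxx - 6 u u_x) - 2 u_xx + 4 u^2 + u_x D^{-1}u,
    with D^{-1} the antiderivative that vanishes at infinity.
    """
    u = c / x**2 + sum(2 / (x - p) ** 2 for p in poles)
    ux = -2 * c / x**3 - sum(4 / (x - p) ** 3 for p in poles)
    uxx = 6 * c / x**4 + sum(12 / (x - p) ** 4 for p in poles)
    uxxx = -24 * c / x**5 - sum(48 / (x - p) ** 5 for p in poles)
    inv_u = -c / x - sum(2 / (x - p) for p in poles)
    return -(x / 2) * (uxxx - 6 * u * ux) - 2 * uxx + 4 * u * u + ux * inv_u


def residue_at_zero(f, radius, samples=256):
    """Residue at 0 by the trapezoid rule on the circle |x| = radius.

    (1 / 2 pi i) * contour integral of f dx equals the mean of f(x) * x
    over equally spaced points of the circle.
    """
    total = 0
    for m in range(samples):
        x = radius * cmath.exp(2j * cmath.pi * m / samples)
        total += f(x) * x
    return total / samples


# Hypothetical data; the circle must not enclose any nonzero pole.
c = 0.7
poles = [1.0 + 0j, -0.8 + 0.6j, 1.5j]
numeric = residue_at_zero(lambda x: tau1(x, c, poles), radius=0.4)
predicted = 12 * c * sum(p ** -3 for p in poles)
print(abs(numeric - predicted))
```

For a configuration with $\pi_{-3} = 0$, for instance a balanced one, the same computation returns a residue at rounding level, which is the situation the lemma establishes.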
5. The Squared Eigenfunction Method

In this section we complete the proof that the generic rational potentials that decay at infinity and remain rational by the KdV master-symmetry hierarchy are bispectral. In order to do that we shall make use of a technique inspired by the squared eigenfunction method of soliton theory. More precisely, we shall link the eigenfunctions of the Schrödinger operator
\[ L\phi \overset{\mathrm{def}}{=} -\partial_x^2\phi + u(x)\phi = k^2\phi \tag{36} \]
with the solutions of the equation
\[ -F_{xxx} + 4uF_x + 2u_xF = 4k^2F_x. \tag{37} \]
In fact, we shall be concerned only with the case $k = 0$. We recall that the operator on the LHS of this equation, $K_u = -\partial_x^3 + 4u\,\partial_x + 2u_x$, is exactly the operator associated to the second symplectic structure of the KdV hierarchy. Furthermore, it is related to $N_u$ by means of $K_u = N_u\partial_x$. First, the classical result, which goes as follows:

Proposition 10. Let $u$ be analytic in a domain, and suppose that $\phi$ satisfies Eq. (36) in this domain. Then, $F = \phi^2$ satisfies (37) in the same domain.

The proof is a straightforward computation, and will be omitted. Let us concentrate, however, on a partial converse to such a statement that will play a crucial role in the sequel.

Lemma 7. Let $F$ be a solution of $K_uF = 0$, where $u$ is analytic in a domain $D \subset \mathbb{C}$. Take $x_0$ a point in $D$ and define, in a sector $S$ near $x_0$, the function $\phi \overset{\mathrm{def}}{=} F^{1/2}$, where the choice of the square root is arbitrary. Then, inside $S$, $\phi$ is a solution of
\[ L\phi = \alpha\phi^{-3} \tag{38} \]
for some $\alpha \in \mathbb{C}$, which depends on the choice of the square root.

Proof. Since $K_uF = 0$ we have that
\[ -\phi\phi_{xxx} - 3\phi_x\phi_{xx} + 4u\phi\phi_x + u_x\phi^2 = 0. \]
Hence,
\[ \phi(-\phi_{xxx} + u\phi_x + u_x\phi) + 3(L\phi)\phi_x = 0, \]
or better
\[ \phi(-\phi_{xx} + u\phi)_x + 3\phi_xL\phi = 0, \quad\text{i.e.,}\quad \phi(L\phi)_x + 3\phi_xL\phi = 0. \]
The left-hand side equals $\phi^{-2}(\phi^3L\phi)_x$, so $\phi^3L\phi$ is constant, which gives Eq. (38) upon integration. □

Before establishing the crucial lemma to conclude the proof that the rational solutions of the master-symmetry hierarchy are bispectral, we need an elementary remark, namely:
Remark 5. If $u = c/x^2$, then obviously $u$ is bispectral and in this case $\tau_0(u) = 0$. In fact, the converse holds: $\tau_0(u) = 0$ iff $u = c/x^2$, for some $c \in \mathbb{C}$.

Lemma 8. Let $u$ be of the form (33), where $P$ is nonempty, and $c \neq \nu(\nu + 1)$ for every $\nu = 0, 1, 2, \dots$. If $u$ is a rational solution of the master-symmetry hierarchy, decaying at infinity, then $c = l^2 - 1/4$ with $l \in \mathbb{Z}$.

Proof. Because of Remark 5, we may assume that $\tau_0(u) \neq 0$ since $P$ is nonempty. From Proposition 8 it follows that there exists $s \geq 0$ such that
$$\tau_s(u) \neq 0 \quad\text{and}\quad \tau_{s+1}(u) = 0.$$
Proposition 9 ensures that $u$ is an even function and furthermore there exists $l \in \mathbb{Z}_{\geq 0}$ such that
$$\tau_s(u) = \lambda x^{2l} + O(x^{2l+2}), \quad x \to 0, \tag{39}$$
where $\lambda \neq 0$. We now integrate the rational function $\tau_s(u)$ and choose a constant of integration so that
$$F \overset{\mathrm{def}}{=} \partial_x^{-1}\tau_s(u) = \lambda x^{2l+1} + O(x^{2l+2}).$$
The function $F$ belongs to the kernel of $K_u$, since $K_u(F) = N_u \partial_x \partial_x^{-1}\tau_s(u) = \tau_{s+1}(u) = 0$. Hence, because of Lemma 7 it follows that there exists $\alpha$ such that
$$L\phi = \alpha\phi^{-3}. \tag{40}$$
Indeed, one chooses a point $x_0$ sufficiently close to 0 so that Eq. (39) holds, and then uses analytic continuation in a suitable sector with vertex at 0. We now have to distinguish between two cases, depending on whether $l$ vanishes or not. If $l \neq 0$ we look at both sides of Eq. (40) to get, as $x \to 0$, that
$$[-(l^2 - 1/4) + c]\,\lambda x^{l-3/2}(1 + O(x^2)) = \alpha\lambda^{-3/2} x^{-3l-3/2}(1 + O(x^2)). \tag{41}$$
Hence, $\alpha = 0$ and $c = l^2 - 1/4$, for some $l \in \mathbb{Z} \setminus \{0\}$. On the other hand, if $l = 0$, let's consider the analytic continuation of $\phi$ along a path joining 0 to $\infty$ and avoiding the poles of $u$. By the principle of the permanence of the functional identities, Eq. (40) remains valid along this path. Furthermore, since $\tau_s(u) = O(x^{-3})$ as $x \to \infty$, it follows that $F(x) = \partial_x^{-1}\tau_s(u)(x) = O(x^{-2}) + C_\infty$, where $C_\infty$ is an integration constant. Hence, $F$ is bounded, and so is $\phi = F^{1/2}$. We claim now that $\alpha$, once again, is necessarily zero. Indeed, if $C_\infty = 0$, then the RHS of Eq. (40) blows up as $x \to \infty$, unless $\alpha = 0$. If $C_\infty \neq 0$, the LHS of (40) goes to zero as $x \to \infty$, and hence $\alpha = 0$. Once we know that $\alpha = 0$, we look again at the behavior of $x$ close to zero, which is given in this case by the LHS of Eq. (41). Since $\alpha = 0$, it follows that $c = -1/4$. □
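The power counting behind Eq. (41) can be sketched as follows. The sketch assumes, as in the proof, that $L = -\partial_x^2 + u$, and additionally that $u = c/x^2 + O(1)$ near $x = 0$; constant factors in front of the leading terms are suppressed, so this illustrates the exponents only and is not a restatement of (41):

```latex
L\,x^{l+1/2}
  = -\bigl(l+\tfrac12\bigr)\bigl(l-\tfrac12\bigr)\,x^{l-3/2} + c\,x^{l-3/2} + O\bigl(x^{l+1/2}\bigr)
  = \bigl[\,c-\bigl(l^{2}-\tfrac14\bigr)\bigr]\,x^{l-3/2} + O\bigl(x^{l+1/2}\bigr),
\qquad
\bigl(x^{l+1/2}\bigr)^{-3} = x^{-3l-3/2}.
```

For $l \geq 1$ the exponent $-3l - 3/2$ is strictly smaller than $l - 3/2$, so the right-hand side of (40) would dominate as $x \to 0$ unless $\alpha = 0$. With $\alpha = 0$ the leading coefficient on the left must vanish as well, which is the condition $c = l^2 - 1/4$.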
6. Conclusion and Suggestions for Further Research

We conclude this article with some final remarks and suggestions for further research.

Concerning the restriction to vanishing rational potentials, one might wonder about the necessity of imposing this decay condition at infinity. However, if we look at the expression of the recursion operator $N_u$, we realize that in order for the operator $\partial_x^{-1}$ to be well defined we have to fix an integration constant. That was done by defining the integration from $-\infty$ to $x$. It is precisely for this choice that the tangency result of [40] holds. Any other choice would lead to extra KdV flows which are not in general tangent to the manifolds of bispectral potentials. Furthermore, it is shown in [10] that any bispectral potential that does not vanish at infinity is of the form $ax + b$. Besides forming a rather degenerate and small class, such potentials do not remain rational by the flows of $\tau_1$ in any reasonable sense unless $a = 0$. In fact, it is not hard to see that rational potentials unbounded at infinity do not remain rational by $\tau_1$ no matter what choice one makes of the integration constant in $\partial_x^{-1}$.

The requirement that the potentials under consideration would vanish at infinity comes in the spirit of the usual normalization of bispectral potentials that are bounded at infinity. However, we did not analyze in detail the remaining possibility of rational solutions to the master-symmetries that are bounded though not vanishing at infinity. This would be an interesting follow-up of the present work.

The assumption that we are dealing with generic rational solutions (decaying at infinity) rules out a number of degenerate situations that would make this work excessively technical. Unfortunately, it also rules out many bispectral potentials that remain rational by the flows of hierarchies such as KdV or the master symmetry hierarchy.
In all examples we analyzed of such degenerate cases we had situations such as many poles colliding or splitting from a single one. The simplest example of such a situation for the KdV hierarchy is the rational solution $u = -2\partial_x^2 \log(x^3 + 12t)$. For $t = 0$ we have a potential that does not satisfy our requirement, although for any other $t$ it does fit in our scheme. Example 3 provides an instance of the same kind of phenomenon for the KdV master-symmetry hierarchy. The issue of extending our results to non-generic rational solutions is a deep one since it is related to the collision of poles and the closure of the locus. For the Kadomtsev-Petviashvili hierarchy a recent article of G. Wilson [35] addresses many subtle aspects of such problems. We believe that results similar to the ones stated in the present work would also be true with the generic assumption relaxed, but the method of looking into the flows of the poles and their constraints employed throughout this paper would require a substantial adaptation.

The topic of Darboux transformations, which was used by Duistermaat and Grünbaum to address the bispectral problem for Schrödinger operators [10], is still a central one in the solution of the bispectral problem. It was used also in the context of matrix operators to produce classes of bispectral differential operators [36,37,39]. This connection can also be found in further developments such as [5,23]. A natural follow-up question to the present work would be the characterization of the rational operators that remain rational by certain classes of Darboux transformations and its relation to the bispectral problem. In this respect an important reference would be the recent work [6]. In the latter, the authors generalize the results of [40] to bispectral operators of arbitrary rank and order.
Given that the techniques of [6] are both sophisticated and powerful, they would be a natural starting point for the study of a similar question to the one addressed in the present work in the higher order case. Another possible avenue associated to the work presented here is the fact that the master-symmetry hierarchy for the KdV is a subset of the W∞ symmetries studied by
Orlov and Schulman [31]. The latter set of symmetries corresponds to a realization of the algebra generated by $\left\{ z^j \frac{d^k}{dz^k} \right\}_{j \in \mathbb{Z},\, k \geq 0}$ in terms of vector fields on the algebra of pseudo-differential operators in one variable. The vector field associated to $z^j$, $j > 0$, corresponds to the $j$th flow of the KP hierarchy, see [33] for details. Furthermore, the hierarchy of master symmetries for KdV discussed here corresponds only to a sub-algebra of $W_\infty$, namely, the positive part of a certain Virasoro algebra. The fact that the rational solutions, decaying at infinity, of the master-symmetry hierarchy are bispectral potentials leads naturally to the question: Would the operators with rational coefficients that remain rational by the positive part of the $W_\infty$ algebra be also bispectral? Or, should one impose some extra condition such as a string-type equation? We plan to address some of these questions in a future publication.

Acknowledgements. We would like to thank Franco Magri for helpful conversations. Both authors were supported by the Brazilian National Research Council, CNPq. JPZ was also supported by PRONEX-FINEP through grant 76.97.1008-00. In a regrettable oversight, due credit was not given to A. Fokas in [40] for the introduction of the master-symmetry concept in his joint work with B. Fuchssteiner [11]. One of the authors (JPZ) takes the occasion to correct the mistake. The authors thank both referees for their detailed comments, which substantially improved the final version of the paper.
References 1. Ablowitz, M. J. and Segur, H.: Solitons and the Inverse Scattering Transform. SIAM Studies in Applied Mathematics, Philadelphia, Pennsylvania, 1981 2. Adler, M. and Moser, J.: On a class of polynomials connected with the KdV equations. Commun. Math. Phys. 61, 1–30 (1978) 3. Adler M. and van Moerbeke, P.: Compatible Poisson structures and the Virasoro algebra. Comm. Pure Appl. Math. 47(1), 5–37 (1994) 4. Airault, H., Mc Kean, H. and Moser, J.: Rational and elliptic solutions of the KdV equation and a related many body problem. Comm. Pure Appl. Math. 30, 95–148 (1977) 5. Bakalov, B., Horozov, E. and Yakimov, M.: General methods for constructing bispectral operators. Phys. Lett. A 222(1-2), 59–66 (1996) 6. Bakalov, B., Horozov, E. and Yakimov, M.: Highest weight modules over the W1+∞ algebra and the bispectral problem. Duke Math. J. 93(1), 41–72 (1998) 7. Calogero, F. and Degasperis, A.: Spectral Transform and Solitons. I. Amsterdam: North-Holland, 1982 8. Carillo, S. and Fuchssteiner, B.: The abundant symmetry structure of hierarchies of nonlinear equations obtained by reciprocal links. J. Math. Phys. 30 (7), 1606–1613 (1989) 9. Choodnovsky, D. V. and Choodnovsky, G. V.: Pole expansion of nonlinear differential equations. Nuovo Cimento 40B, 339–353 (1977) 10. Duistermaat, J. J. and Grünbaum, F. A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103 (2), 177–240 (1986) 11. Fokas, A. S. and Fuchssteiner, B.: The hierarchy of the Benjamin - Ono equation. Phys. Lett. A 86 (6-7), 341–345 (1981) 12. Fuchssteiner, B.: Mastersymmetries, higher order time-dependent symmetries and conserved densities of nonlinear evolution equations. Prog. Theor. Phys. 70 (6), 1508–1522 (1983) 13. Fuchssteiner, B.: Mastersymmetries for Completely Integrable Systems in Statistical Mechanics. In: L. Garrido, ed., Proc. Sitges Conference, Springer Lecture Notes in Physics 216, 1984, pp. 305–315 14. Fuchssteiner B. 
and Carillo, S.: The Action-Angle transformation for soliton equations. Physica A 166, 651–675 (1990) 15. Gardner, C.S., Greene, J.M., Kruskal, M.D. and Miura R. M.: Method for solving the Korteweg- de Vries equation. Phys. Rev. Lett. 19, 1095–1097 (1967)
16. Grünbaum, F. A.: The limited angle reconstruction problem in computed tomography. In: Proc. Symp. Appl. Math. Vol.27, L. Shepp, ed., Providence, RI: AMS 1982, pp. 43–61 17. Grünbaum, F. A.: A new property of reproducing kernels for classical orthogonal polynomials. J. Math. Anal. Appl. 95, 491–500 (1983) 18. Grünbaum, F. A.: Band and time limiting, recursion relations, and some nonlinear evolution equations. In: W. Schempp R. Askey, T. Koorwinder, ed., Special Functions: Group Theoretical Aspects and Applications, Dordrecht: Reidel, 1984, pp. 271–286 19. Grünbaum, F. A.: Some new explorations into the mystery of time and band limiting. Adv. in Appl. Math. 13(3), 328–349 (1992) 20. Grünbaum, F.A.: Time-band limiting and the bispectral problem. Comm. PureAppl. Math. 47(3), 307–328 (1994) 21. Harnad, J. and Kasman, A.: The Bispectral Problem, CRM Proceedings and Lecture Notes. Providence, Rhode Island: American Mathematical Society, 1998 22. Kasman, A.: Bispectral KP solutions and linearization of Calogero-Moser particle systems. Commun. Math. Phys. 172(2), 427–448 (1995) 23. Kasman, A. and Rothstein, M.: Bispectral Darboux transformations: The generalized Airy case. Physica D 102(3–4), 157–176 (1997) 24. Magri, F.: Equivalence transformations for nonlinear evolution equations. J. Math. Phys. 18(7), 1405– 1411 (1977) 25. Magri, F.: A simple model of the integrable Hamiltonian equation. J. Math. Phys. 19, 1156–1162 (1978) 26. Magri, F., Pedroni, M. and Zubelli, J. P.: On the geometry of Darboux transformations for the KP hierarchy and its connection with the discrete KP hierarchy. Commun. Math. Phys. 188(2), 305–325 (1997) 27. Magri, F. and Zubelli, J. P.: Bi-Hamiltonian formalism and the Darboux-Crum method. I. From the KP to the mKP hierarchy. Inverse Problems 13(3), 755–780 (1997) 28. Oevel, G., Fuchssteiner, B. and Blaszak, M.: Action-Angle representation of multisolitons by potentials of mastersymmetries. Prog. Theor. Phys. 83(3), 395–413 (1990) 29. 
Oevel, W.: A geometrical approach to integrable systems admitting time dependent invariants. In: Topics in soliton theory and exactly solvable nonlinear equations (Oberwolfach, 1986), Singapore: World Sci. Publishing, 1987, pp. 108–124 30. Oevel, W. and Fuchssteiner, B.: Explicit formulas for symmetries and conservation laws of the KadomtsevPetviashvili equation. Phys. Lett. A 88(7), 323–327 (1982) 31. Orlov, A. and Schulman, E.: Additional symmetries for integrable equations and conformal algebra representation. Lett. Math. Phys. 12(3), 171–179 (1986) 32. Valerio Silva, D. S.: “Rational Solutions of the Master-Symmetries for KdV and the Bispectral Problem” (in portuguese). PhD thesis, IMPA, Rio de Janeiro, 1998 33. van Moerbeke, P.: Integrable Foundations of String Theory. CIMPA - Summer School at Sophia-Antipolis. In: O. Babelon et al., ed., Lectures on Integrable Systems, Singapore World Scientific, 1994, pp. 163–267 34. Wilson, G.: Bispectral Commutative Ordinary Differential Operators. J. reine angew Math. 442, 177–204 (1993) 35. Wilson, G.: Collisions of Calogero-Moser particles and an adelic Grassmannian. Invent. Math. 133(1), 1–41 (1998), with an appendix by I. G. Macdonald 36. Zubelli, J. P.: Differential Equations in the Spectral Parameter for Matrix Differential Operators of AKNS Type. PhD thesis, University of California at Berkeley, 1989 37. Zubelli, J. P.: Differential equations in the spectral parameter for matrix differential operators. Physica D 43, 269–287 (1990) 38. Zubelli, J. P.: On the polynomial τ -functions for the KP hierarchy and the bispectral property. Lett. Math. Phys. 24, 41–48 (1992) 39. Zubelli, J. P. Rational solutions of nonlinear evolution equations, vertex operators and bispectrality. J. Differ. Eq. 97(1), 71–98 (1992) 40. Zubelli, J. P. and Magri, F.: Differential equations in the spectral parameter, Darboux transformations, and a hierarchy of master symmetries for KdV. Commun. Math. Phys. 
141(2), 329–351 (1991) Communicated by T. Miwa
Commun. Math. Phys. 211, 111 – 136 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On the Existence of Force-Free Magnetic Fields with Small Nonconstant α in Exterior Domains
R. Kaiser, M. Neudert, W. von Wahl
Department of Mathematics, University of Bayreuth, 95440 Bayreuth, Germany
Received: 20 June 1999 / Accepted: 25 October 1999
Abstract: The existence of force-free magnetic fields in the exterior domain of some compact simply connected surface $S$ is proved via an iteration scheme. The iteration starts with an arbitrary exterior vacuum field, which contains flux tubes originating and ending on $S$. At one cross-section of such a flux tube with $S$ an arbitrary function $\alpha$ is prescribed. For small values of $\alpha$ (in the Hölder-norm $1,\lambda$; $0 < \lambda < 1$) the iteration is shown to converge to a force-free field with the prescribed values of $\alpha$ in a flux tube which is close to the vacuum flux tube and $\alpha = 0$ outside. The force-free field is close (in the Hölder-norm $1,\lambda$) to the starting vacuum field, in particular, it has the same field line topology, the same boundary values on $S$ and satisfies the same decay conditions in spatial infinity. It is in general three-dimensional and requires no continuous symmetries.

1. Introduction

In the framework of a magnetohydrodynamic description of plasmas force-free magnetic fields play a prominent role. Their characteristic property is the alignment of magnetic field and electric current and thus the vanishing of the Lorentz-force. Magnetic field $B$ and current density $j$ then satisfy (in a suitable normalization) the equations
$$j = \alpha B, \qquad \operatorname{curl} B = j, \qquad \operatorname{div} B = 0. \tag{1}$$
As a consequence of Eqs. (1) the scalar function α(x) is constant on magnetic field lines but may vary from field line to field line. Force-free fields with α constant in space are a special case; for obvious reasons they are sometimes termed linear force-free fields. The interest in force-free fields has (at least) two sources. The first is “magnetic relaxation”: Suppose there is a viscous but perfectly conducting plasma together with a magnetic field contained in a volume V . In general, the Lorentz-force does not vanish and
Fig. 1. A magnetic flux tube in the solar corona
the plasma, even if initially at rest, is set into motion. If there are no external forces the total energy in V must decrease because of viscous dissipation. Eventually, the plasma comes to rest, i.e. the system relaxes to a state of magnetostatic equilibrium. During this process the topology of knots and links of the magnetic lines remains conserved since by virtue of the perfect-conductivity assumption magnetic lines are frozen in the plasma. As a consequence magnetic helicity in any flux tube is a conserved quantity. The relaxed state is thus obtained by minimizing the total energy with the constraint of conserved helicity. These minimizing solutions are always force-free magnetic fields ([Wol], [Tay]). If the constraint of conserved helicity in any flux tube is weakened to conservation of helicity in the total plasma volume the relaxed state turns out to be a linear force-free field. However, besides the obvious advantage of a simplified problem, theoretical rationalizations of this replacement are doubtful ([Gra]). The other reasons for interest in force-free magnetic fields are astrophysical applications. Because of the high electrical conductivity of stellar material of even low density, stellar magnetic fields are usually accompanied by large electric currents ([LS], [Ch1]). If, moreover, the magnetic energy density dominates the plasma pressure which is the typical situation in stellar atmospheres, stationary magnetic fields have to be force-free. So, a variety of magnetic structures which have been observed in active regions of the solar corona (coronal loops, magnetic arcades, coronal mass ejections, etc.) have been modelled on the basis of force-free fields, see for instance [Pri] or [Bra]. Solar flares, for example, are nowadays attributed to a magnetic origin ([Al2]): Magnetic flux emerges from the solar convection zone into the coronal space. The simplest geometry is a looptype flux tube with footpoints anchored in the photosphere (see Fig. 1). 
Photospheric motions then cause a shearing and twisting of the magnetic flux tube up to the point where the flux tube becomes unstable. Finally, a substantial fraction of the free energy stored in the magnetic structure is released (probably by reconnection of magnetic field lines) in an eruptive event. Applications like these determine the geometric setting we will consider below. For linear force-free fields a well-posed boundary value problem of Neumann's type can be formulated. Using integral equations ([Kr1]) or Hilbert space methods ([Sak], [YG]) this problem has been solved under quite general conditions. In the nonlinear case comparable results exist only if the problem is modified or restricted: A modified problem is obtained if the divergence-free condition is abandoned. In that case the above-mentioned methods still work ([Pic], [Kr2]); these solutions, however, are obviously not appropriate for the description of magnetic fields and are merely of mathematical interest. The restricted problem refers to situations with plane symmetry or axisymmetry. In that case the problem can be reduced to a single, in general nonlinear equation of elliptic
type for a scalar quantity describing the symmetric field ([LS]). For equations of such type an elaborate mathematical theory is available. Beyond these results there are yet a few special solutions developed for astrophysical purposes which are nonlinear and nonsymmetric ([CC], [Low]) and some nonexistence results in the whole space ([Ch2]) and in exterior domains ([Al1]) but no general existence results. What comes so far closest to a general solution has been obtained via an iteration scheme which works for small $\alpha$ and special initial configurations. This scheme has been proposed by [GR] and is here called $\alpha$-iteration in analogy to the $\beta$-iteration, which has been devised for the more general problem of the existence of magnetohydrostatic equilibria ([Spi], [Lor]). The $n$th step of the $\alpha$-iteration has the form
$$\langle B_n, \nabla\alpha_n\rangle = 0 \ \text{ in } G, \qquad \alpha_n|_M = \alpha_0, \quad M \subset \partial G, \tag{2}$$
$$\operatorname{curl} B_{n+1} = \alpha_n B_n \ \text{ in } G, \qquad \operatorname{div} B_{n+1} = 0 \ \text{ in } G, \qquad \langle \nu, B_{n+1}\rangle = g \ \text{ on } \partial G. \tag{3}$$
Here, $G$ denotes an open domain in $\mathbb{R}^3$, $\langle \cdot\,, \cdot\rangle$ the Euclidean scalar product in $\mathbb{R}^3$, $\nu$ is the exterior unit normal on $\partial G$, the function $g$ is prescribed on $\partial G$ and the function $\alpha_0$ on such a part $M$ of $\partial G$ where $\langle \nu, B_0\rangle \neq 0$. Each step of the iteration requires the solution of an initial value problem for a linear first order differential equation (2) and of a Neumann-boundary value problem for inhomogeneous-harmonic vector fields (3). In case of convergence the iteration furnishes obviously a force-free magnetic field. In case of a bounded domain $G$ Bineau ([Bin]) proved convergence of the iteration for small $\alpha_0$ and a harmonic initial field $B_0$. A nonzero lower bound on the field strength of $B_0$ ensures finite length of all field lines in the flux tube emanating from $M$. As a result Bineau obtains a force-free magnetic field close to $B_0$ with the prescribed values of $\alpha$ in a flux tube close to the initial one emanating from $M$. The proof of convergence, however, is at least incomplete. Bineau assumes a-priori bounds on $\|B\|$ and $\|\nabla B\|$ as well as on the field line parameter without controlling these bounds in the course of the iteration. But in a rigorous treatment these bounds have to be controlled at every step of the iteration and, in particular, the bounds have to be uniform with respect to the iteration number $n$. This is no trivial matter since the bounds on the magnetic field and on the field line parameter depend on each other. The present paper furnishes also a convergence proof for the $\alpha$-iteration, but differs in two respects from the work of Bineau. First, motivated by the astrophysical applications described above, the underlying domain is not bounded but the exterior of a bounded simply connected domain. That is why we need some nonstandard potential-theoretic estimates in exterior domains (these are cited in Sect. 3) and why we have to formulate carefully the notion of an "admissible field configuration" (Sect. 4).
The guideline is here again the astrophysical situation. Second, our convergence proof (Sect. 5) avoids the above mentioned shortcomings and presents in some detail the required estimates. One should note that the proof can easily be carried over to the simpler case of a bounded domain as well as to the case of the exterior of a bounded but multiply connected domain (see Remarks 3.3 and 5.6).
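One way to see why the two halves of an iteration step fit together is the following short computation. It is only a sketch: it uses Eqs. (2) and (3) and assumes $\operatorname{div} B_n = 0$, which holds for a harmonic starting field $B_0$ and, by (3), for every later iterate.

```latex
\operatorname{div}(\alpha_n B_n)
  = \langle \nabla\alpha_n , B_n \rangle + \alpha_n \operatorname{div} B_n
  = 0 + 0 = 0 .
```

So the prescribed right-hand side of $\operatorname{curl} B_{n+1} = \alpha_n B_n$ is divergence-free, as any curl must be. If, in addition, $B_n \to B$ and $\alpha_n \to \alpha$ in a sufficiently strong sense, passing to the limit in (2) and (3) formally yields $\langle B, \nabla\alpha\rangle = 0$, $\operatorname{curl} B = \alpha B$, $\operatorname{div} B = 0$, which is the system (1).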
2. General Notations

In the following, we assume $G \subset \mathbb{R}^3$ to be a bounded domain with smooth boundary; by "smooth" we mean sufficient regularity of $\partial G$ without fixing the exact class of regularity. In all cases considered here the class $C^6$ will be sufficient. Furthermore, $\hat{G}$ means the exterior of $G$, i.e.
$$\hat{G} := \mathbb{R}^3 \setminus \overline{G}.$$
$\nu$ denotes, if not explicitly defined otherwise, the outer normal with respect to $G$. A closed surface $S$ is here the smooth boundary $\partial D$ of some bounded domain $D \subset \mathbb{R}^3$, where $D$ always lies locally at one side of the boundary. Then, in particular, $S = \partial D$ is orientable. For $\delta > 0$ we define
$$G_{-\delta} := \{\, x \in G \mid \operatorname{dist}(x, \partial G) > \delta \,\}, \qquad \partial G_{-\delta} := \partial\bigl(G_{-\delta}\bigr).$$
The Euclidean scalar product in $\mathbb{R}^3$ is denoted by $\langle \cdot\,, \cdot\rangle$, i.e.
$$\langle x, y\rangle := \sum_{j=1}^{3} x_j y_j.$$
For some differentiable vector field $B$, the Jacobian matrix will be denoted by $DB$. For multiindices $a = (a_1, a_2, a_3) \in \mathbb{N}_0^3$ we set
$$D^a f := \frac{\partial^{|a|} f}{\partial x_1^{a_1}\,\partial x_2^{a_2}\,\partial x_3^{a_3}}, \qquad |a| := a_1 + a_2 + a_3, \qquad a! := a_1!\,a_2!\,a_3!.$$
For a continuous function or vector field $f$ in some (bounded or unbounded) domain $G$ we set
$$\|f\|_{C^0(G)} := \sup_{x \in G} |f(x)|.$$
Furthermore, for some function or vector field $f$ and $k \in \mathbb{N}$
$$\|D^k f\| := \max_{|a| = k} \|D^a f\|.$$
If some matrix $A = (a_{ij}) : G \subset \mathbb{R}^3 \to \mathbb{R}^{3\times 3}$ is continuous, we set
$$\|A\|_{C^0(G)} := \Bigl(\sum_{i,j=1}^{3} \|a_{ij}\|^2_{C^0(G)}\Bigr)^{1/2}.$$
3. Hölder-Spaces and the Neumann-Problem for Inhomogeneous-Harmonic Vector Fields in Exterior Domains

In classical potential theory the so-called Hölder spaces are the appropriate function spaces. Let $G \subset \mathbb{R}^3$ be a domain. For a function or a vector field $f \in C^k(G)$ and $0 < \lambda < 1$ we define then
$$\|f\|_{C^{k,\lambda}(G)} := \sum_{j=0}^{k} \|D^j f\|_{C^0(G)} + [D^k f]_{C^\lambda(G)}, \tag{4}$$
where
$$[D^k f]_{C^\lambda(G)} := \max_{|a|=k}\ \sup_{\substack{x,y \in G \\ x \neq y}} \frac{|D^a f(x) - D^a f(y)|}{|x - y|^\lambda}.$$
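The seminorm above can be explored numerically on a finite point set. The following is a minimal sketch, not part of the paper: the helper names `holder_seminorm` and `holder_norm_c0`, the one-dimensional test function $f(x) = |x|^{1/2}$ and the grid are all illustrative assumptions, and a finite grid only gives a lower bound for the true supremum.

```python
import itertools


def holder_seminorm(f, points, lam):
    """Largest quotient |f(x) - f(y)| / |x - y|**lam over all pairs of distinct points."""
    best = 0.0
    for x, y in itertools.combinations(points, 2):
        quotient = abs(f(x) - f(y)) / abs(x - y) ** lam
        best = max(best, quotient)
    return best


def holder_norm_c0(f, points, lam):
    """Discrete analogue of the C^{0,lam} norm: sup norm plus Hoelder seminorm."""
    sup_norm = max(abs(f(x)) for x in points)
    return sup_norm + holder_seminorm(f, points, lam)


# Uniform grid on [-1, 1] that contains 0 exactly (hypothetical test setup).
n = 200
grid = [i / n for i in range(-n, n + 1)]


def f(x):
    # Test function: Hoelder continuous with exponent 1/2, but no better, at x = 0.
    return abs(x) ** 0.5


seminorm_half = holder_seminorm(f, grid, 0.5)  # stays bounded when the grid is refined
seminorm_09 = holder_seminorm(f, grid, 0.9)    # grows like n**0.4 when the grid is refined
print(seminorm_half, seminorm_09)
```

The comparison of the two exponents illustrates why the choice of $\lambda$ matters: the same function has a finite seminorm for $\lambda = 1/2$, while the discrete quotient for $\lambda = 0.9$ keeps growing as grid points approach the origin.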
Distinguishing between local and global Hölder continuity, the following notation is customary in the literature (see, for instance, [GT]): In the local case the condition $\|f\|_{C^{k,\lambda}(K)} < \infty$ holds for every compact subset $K \subset G$ and $f$ belongs therefore to the space $C^{k,\lambda}(G) := C^{k,\lambda}_{\mathrm{loc}}(G)$. If, however, the uniform condition $\|f\|_{C^{k,\lambda}(G)} < \infty$ holds, $f$ is said to be
$$f \in C^{k,\lambda}(\overline{G}) := C^{k,\lambda}_{\mathrm{unif}}(G),$$
and $C^{k,\lambda}_{\mathrm{unif}}(G)$ is a Banach space with the norm defined in (4). In order to simplify our notation and because we are not interested in the spaces $C^{k,\lambda}_{\mathrm{loc}}(G)$, deviating from this convention we set
$$C^{1,\lambda}(\hat{G}) := C^{1,\lambda}_{\mathrm{unif}}(\hat{G}), \qquad C^{\lambda}(\hat{G}) := C^{0,\lambda}_{\mathrm{unif}}(\hat{G}),$$
where $\hat{G} := \mathbb{R}^3 \setminus \overline{G}$ for some bounded domain $G \subset \mathbb{R}^3$.

For every $f \in C^{1,\lambda}(\hat{G})$ there exists an extension $\bar{f} \in C^{1,\lambda}(\mathbb{R}^3)$, $\bar{f}|_{\hat{G}} = f$, such that
$$\|\bar{f}\|_{C^{1,\lambda}(\mathbb{R}^3)} \leq c_1 \|f\|_{C^{1,\lambda}(\hat{G})}, \qquad \|\bar{f}\|_{C^0(\mathbb{R}^3)} \leq c_1 \|f\|_{C^0(\hat{G})} \tag{5}$$
with some constant $c_1 > 0$ independent of $f$ (see [GT], Lemma 6.37; the unboundedness of $\hat{G}$ is not relevant here). We note that this construction is linear.

Later on the following properties of the Hölder norm will turn out to be useful:
$$[f \cdot g]_{C^\lambda(G)} \leq \|f\|_{C^0(G)} \cdot [g]_{C^\lambda(G)} + [f]_{C^\lambda(G)} \cdot \|g\|_{C^0(G)}, \tag{6}$$
$$\|f \cdot g\|_{C^\lambda(G)} \leq \|f\|_{C^\lambda(G)} \cdot \|g\|_{C^\lambda(G)}, \tag{7}$$
$$[1/f]_{C^\lambda(G)} \leq \|1/f\|^2_{C^0(G)} \cdot [f]_{C^\lambda(G)}. \tag{8}$$

Concerning the boundary $\partial G$ of $G$ (or $\hat{G}$), $f \in C^{1,\lambda}(\partial G)$ means $f \circ \mu \in C^{1,\lambda}(U)$, where $\mu : U \to \partial G$ is a chart of $\partial G$. The corresponding norm $\|\cdot\|_{C^{1,\lambda}(\partial G)}$ therefore depends on the chosen atlas. For a subset $M \subset \partial G$ which is open in the topology of $\partial G$ and some $f \in C^{1,\lambda}(M)$ with compact support in $M$ we also write $f \in C_0^{1,\lambda}(M)$.

In addition to the Hölder-norm we also need in exterior domains a weighted norm characterizing the asymptotic behaviour to obtain global estimates for potential theoretic problems. Here for a function or a vector field $f$ in $\hat{G}$ and $\varrho > 0$ we set
$$|\!|\!| f |\!|\!|_\varrho := \sup_{x \in \hat{G}} |x|^\varrho\, |f(x)|.$$
If $|\!|\!| f |\!|\!|_\varrho < \infty$ we also write $|f(x)| = O(|x|^{-\varrho})$, $|x| \to \infty$.

The following potential theoretic results on the Neumann problem for inhomogeneous-harmonic vector fields in exterior domains are cited without proof. The proofs can be found in [NvW].
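Theorem 3.1 (Solvability and asymptotic behaviour). Let $G \subset \mathbb{R}^3$ be a bounded domain with smooth boundary and trivial topology (i.e. first and second Betti number are zero), $\hat{G} := \mathbb{R}^3 \setminus \overline{G}$, $1 < \varrho < 3$. Let, furthermore,
$$f \in C^\lambda(\hat{G}, \mathbb{R}), \qquad |f(x)| = O(|x|^{-\varrho}), \quad |x| \to \infty,$$
$$w \in C^{1,\lambda}(\hat{G}, \mathbb{R}^3) \cap C^\lambda(\hat{G}, \mathbb{R}^3), \quad w \text{ with zero flux}, \qquad |w(x)| = O(|x|^{-\varrho}), \quad |x| \to \infty,$$
$$g \in C^0(\partial G, \mathbb{R}).$$
Then the Neumann problem
$$\operatorname{div} v = f, \quad \operatorname{curl} v = w \ \text{ in } \hat{G}, \qquad \langle v, \nu\rangle = g \ \text{ on } \partial G, \qquad |v(x)| = O(|x|^{1-\varrho}), \quad |x| \to \infty,$$
has a unique solution. If $f = 0$, $w = 0$ then $|v(x)| = O(|x|^{-2})$, $|x| \to \infty$. If $f = 0$, $w = 0$ and $\int_{\partial G} g \, d\Omega = 0$ then even $|v(x)| = O(|x|^{-3})$, $|x| \to \infty$.

The condition "$w$ with zero flux" is necessary for the solvability of the problem and means that for all closed oriented surfaces $S \subset \hat{G}$ with outer normal $\tilde{\nu}$ the integral $\int_S \langle w, \tilde{\nu}\rangle \, d\Omega$ vanishes. This implies $\operatorname{div} w = 0$.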
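The two decay rates for $f = 0$, $w = 0$ can be illustrated by an explicit example. This example is not taken from the paper: it assumes that $G$ is the open unit ball, so that $\nu(x) = x$ on $\partial G$, and it considers constant data $g = c$ and dipole-type data $g = \langle m, x\rangle$ with a fixed vector $m \in \mathbb{R}^3$.

```latex
\begin{aligned}
g = c:&\qquad
v(x) = c\,\frac{x}{|x|^{3}} = -c\,\nabla\frac{1}{|x|}, &
\langle v,\nu\rangle\big|_{|x|=1} &= c, &
\int_{\partial G} g\,d\Omega &= 4\pi c, &
|v(x)| &= O(|x|^{-2});\\[4pt]
g = \langle m,x\rangle:&\qquad
v(x) = \frac{3\langle m,x\rangle\,x - |x|^{2}\,m}{2\,|x|^{5}}
     = -\tfrac12\,\nabla\frac{\langle m,x\rangle}{|x|^{3}}, &
\langle v,\nu\rangle\big|_{|x|=1} &= \langle m,x\rangle, &
\int_{\partial G} g\,d\Omega &= 0, &
|v(x)| &= O(|x|^{-3}).
\end{aligned}
```

Both fields are gradients of functions that are harmonic for $|x| > 1$, so they are divergence-free and curl-free there. The faster decay in the second case goes together with the vanishing mean value of $g$, in line with the last statement of Theorem 3.1.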
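Theorem 3.2 (Hölder-Estimates). Suppose that $1 < \varrho < 3$, $g \in C^{1,\lambda}(\partial G)$ and let $v$ be the unique solution of the Neumann problem 3.1. Then $v \in C^{1,\lambda}(\hat{G})$ and there exists a constant $c_0 > 0$ only depending on $\lambda$, $\varrho$, $G$ with
$$\|v\|_{C^{1,\lambda}(\hat{G})} + |\!|\!| v |\!|\!|_{\varrho-1} \leq c_0 \cdot \Bigl( \|f\|_{C^\lambda(\hat{G})} + \|w\|_{C^\lambda(\hat{G})} + |\!|\!| f |\!|\!|_\varrho + |\!|\!| w |\!|\!|_\varrho + \|g\|_{C^{1,\lambda}(\partial G)} \Bigr).$$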
Remarks 3.3. (a) The corresponding Neumann-problem in the interior domain is solvable under the additional restriction
$$\int_G f \, dx = \int_{\partial G} g \, d\Omega.$$
The asymptotic conditions, of course, are to be omitted.
(b) If the exterior domain $\hat{G}$ is multiply connected, i.e. if the first Betti-number $\tilde{n}$ of $\hat{G}$, which is the number of handles of $G$ or $\hat{G}$, is different from zero, the Neumann-problem is still uniquely solvable under the same conditions, if the so-called generalized circulations
$$\Gamma_j := \int_{\partial G} \langle \nu \times v, z_j\rangle \, d\Omega, \qquad j = 1, \dots, \tilde{n},$$
are prescribed, where $\{z_1, \dots, z_{\tilde{n}}\}$ is an appropriate basis of the space
$$Z_R(G) := \{\, v \in C^1(G, \mathbb{R}^3) \cap C^0(\overline{G}, \mathbb{R}^3) \mid \operatorname{div} v = 0,\ \operatorname{curl} v = 0 \text{ in } G,\ \langle \nu, v\rangle = 0 \text{ on } \partial G \,\}.$$
The asymptotic behaviour of the solution then is the same as stated in Theorem 3.1, but to obtain $|v(x)| = O(|x|^{-3})$, $|x| \to \infty$, the condition $\Gamma_1 = \dots = \Gamma_{\tilde{n}} = 0$ is necessary. The estimate in Theorem 3.2 holds, if the term $\sum_{j=1}^{\tilde{n}} |\Gamma_j|$ is added in the brackets on the right-hand side.

4. Field Lines and Admissible Configurations

Field lines are here considered as curves with direction always parallel to the field vector and orientation which is induced by the vector field. Taking into account the invariance under reparametrization they can be described as solutions of the so-called field line equation
$$\dot{\gamma}(t) = B(\gamma(t)),$$
where $B$ stands for the vector field, $t$ for the curve parameter and "$\dot{\ }$" means differentiation with respect to $t$. In the following we are interested in field line configurations which contain flux tubes in an exterior domain $\hat{G}$ with footpoints anchored on $\partial G$.

Definition 4.1 (Assumptions and further notations).
(1) Let $G \subset \mathbb{R}^3$ be a bounded domain with smooth boundary, $\hat{G} := \mathbb{R}^3 \setminus \overline{G}$, $U \subset \mathbb{R}^2$ open, $M \subset \partial G$ open, $(U, \mu, \mu(U) = M)$ a local coordinate system for $M$, i.e. in particular $\mu : U \to M$ is a homeomorphism and $\mu : U \to \mathbb{R}^3$ is twice continuously differentiable. Let $\rho_1 > 0$ such that
$$\forall s = (s^{(1)}, s^{(2)}) \in U : \quad \left| \frac{\partial\mu}{\partial s^{(1)}}(s) \times \frac{\partial\mu}{\partial s^{(2)}}(s) \right| \geq \rho_1.$$
(2) Let $B \in C^{1,\lambda}(\hat{G}, \mathbb{R}^3)$ and $\rho_0 > 0$ such that $\langle B, \nu\rangle|_M \geq \rho_0$, i.e. in particular, the field lines of $B$ emanating from $M$ have non-zero normal component. For $s \in U$ let $\gamma(\cdot, s)$ denote the solution of the initial value problem
$$\dot{\gamma} = B(\gamma) \ \text{ in } \hat{G}, \qquad \gamma(0, s) = \mu(s) \in M.$$
Let $]0, T(s)[$ be the maximum interval of existence of $\gamma(\cdot, s)$ in $\hat{G}$, $T(s) \in \mathbb{R}^+ \cup \{+\infty\}$. We define now
$$\Gamma(M, B) := \{\, (t, s) \in \mathbb{R}^3 \mid s \in U,\ 0 < t < T(s) \,\}, \qquad \Lambda(M, B) := \{\, \gamma(t, s) \mid (t, s) \in \Gamma(M, B) \,\}.$$
$\Lambda(M, B)$ may be considered as the flux tube of $B$ in $\hat{G}$ emanating from $M$. The parameters $t, s$ will be denoted as field line coordinates. For $T > 0$ we define furthermore
$$\Gamma(M, B, T) := \{\, (t, s) \in \mathbb{R}^3 \mid s \in U,\ 0 < t < \min\{T, T(s)\} \,\}, \qquad \Lambda(M, B, T) := \{\, \gamma(t, s) \mid (t, s) \in \Gamma(M, B, T) \,\}.$$
Observe that the solutions of $\dot{\gamma} = B(\gamma)$ are restrictions of the solutions of $\dot{\gamma} = \bar{B}(\gamma)$, since $\bar{B}|_{\hat{G}} = B$, where $\bar{B}$ denotes the extension of $B$ (see Sect. 3). From the theory of ordinary differential equations we shall use the following results:

Lemma 4.2. Let $G \subset \mathbb{R}^{1+3}$ be open, $f : G \to \mathbb{R}^3$ continuous, $I \subset \mathbb{R}$ an interval with $0 \in I$, $\varphi_1, \varphi_2 \in C^1(I, \mathbb{R}^3)$ with $\forall t \in I : (t, \varphi_i(t)) \in G$, $i = 1, 2$, and $\varepsilon_0, \delta_0, L > 0$. Let, furthermore, the following conditions be satisfied:
(i) $|\varphi_1(0) - \varphi_2(0)| \leq \varepsilon_0$,
(ii) $\forall t \in I : |\dot{\varphi}_i(t) - f(t, \varphi_i(t))| \leq \delta_0$, $i = 1, 2$,
(iii) $\forall t \in I : |f(t, \varphi_1(t)) - f(t, \varphi_2(t))| \leq L\,|\varphi_1(t) - \varphi_2(t)|$.
Then for all $t \in I$ the estimate holds:
$$|\varphi_1(t) - \varphi_2(t)| \leq \bigl(\varepsilon_0 + 2\delta_0 |t|\bigr)\, e^{L|t|}.$$

The following lemma describes the dependence of solutions of an ordinary differential equation on the initial values.

Lemma 4.3. Let $G \subset \mathbb{R}^{1+3}$ be open, $f : G \to \mathbb{R}^3$ continuously differentiable with respect to $x \in \mathbb{R}^3$. Let $\varphi(\cdot, 0, y)$ denote the solution of the initial value problem $\dot{x} = f(t, x)$, $x(0) = y$. Then there exists
$$v_j(t) := \frac{\partial\varphi}{\partial y_j}(t, 0, y_0), \qquad j = 1, 2, 3,$$
for $(0, y_0) \in G$, and $v_j$ solves the initial value problem
$$\dot{v}_j(t) = D_x f(t, \varphi(t, 0, y_0)) \cdot v_j(t), \qquad v_j(0) = e_j, \qquad j = 1, 2, 3.$$

Using these propositions and Definition 4.1 the following lemma is easy to prove.

Lemma 4.4. $\Lambda(M, B)$ is open in $\hat{G}$.

Definition 4.5 (Admissible configuration). Assume $G$, $\hat{G}$, $U$, $\mu$, $M$ as in Definition 4.1, $\rho_0, \delta, T > 0$, $B \in C^{1,\lambda}(\hat{G}, \mathbb{R}^3)$ with extension $\bar{B} \in C^{1,\lambda}(\mathbb{R}^3, \mathbb{R}^3)$. We denote the pair $(M, B)$ as an admissible configuration with parameters $(\rho_0, T, \delta)$, iff the following conditions are satisfied:
(i) $\langle B, \nu\rangle \geq \rho_0 > 0$ in $M$.
(ii) For the solution $\gamma(\cdot, s)$ of the initial value problem $\dot{\gamma} = \bar{B}(\gamma)$ in $\mathbb{R}^3$, $\gamma(0, s) = \mu(s) \in M$, we have
$$\forall s \in U : \quad \exists\, T_0(s) \in\, ]0; \tfrac{T}{2}[\ : \ \gamma(T_0(s), s) \in \partial G, \qquad \exists\, T_\delta(s) \in\, ]0; \tfrac{T}{2}[\ : \ \gamma(T_\delta(s), s) \in \partial G_{-\delta}.$$
On the Existence of Force-Free Magnetic Fields
Fig. 2. Admissible field configuration
The second condition in the definition above has the following meaning: Every field line of $B$ in $\hat G$ starting from $M$ returns to the surface $\partial G$ with a finite value of the parameter $t$. There is a uniform upper bound $T/2$ for the values $t = T_0(s)$ and $t = T_\delta(s)$ where the line $\gamma(t, s)$ starting in $\mu(s)$ penetrates the surfaces $\partial G$ and $\partial G_{-\delta}$ (see Fig. 2). A non-vanishing penetration depth of the field lines is guaranteed in particular, if the absolute value of the normal component of $B$ on the corresponding part of the surface is bounded from below.

Lemma 4.6. Suppose that the configuration $(M, B)$ is admissible with parameters $(\rho_0, T, \delta)$. Then
$$\operatorname{diam} \Lambda(M, B) < \|B\|_{C^0(\hat G)} \cdot T + \operatorname{diam} M.$$

Proof. We have
$$\gamma(t, s) - \gamma(0, s) = \int_0^t \dot\gamma(\tau, s)\, d\tau = \int_0^t B(\gamma(\tau, s))\, d\tau,$$
so
$$|\gamma(t_1, s_1) - \gamma(t_2, s_2)| \le |\gamma(t_1, s_1) - \gamma(0, s_1)| + |\gamma(0, s_1) - \gamma(0, s_2)| + |\gamma(0, s_2) - \gamma(t_2, s_2)|$$
$$\le |t_1| \cdot \|B\|_{C^0(\hat G)} + \operatorname{diam} M + |t_2| \cdot \|B\|_{C^0(\hat G)} < \|B\|_{C^0(\hat G)} \cdot T + \operatorname{diam} M. \qquad \Box$$
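The displacement estimate used in the proof of Lemma 4.6 can be checked numerically. The following is a minimal sketch, not the authors' construction: the field `B` below is a hypothetical example chosen so that $|B| = 1$ everywhere, and the field line equation $\dot\gamma = B(\gamma)$ is integrated with a classical Runge-Kutta step. It only illustrates that $|\gamma(t, s) - \gamma(0, s)| \le t \cdot \|B\|_{C^0}$.

```python
import math

def B(p):
    # Hypothetical smooth field with |B| = 1 everywhere (an assumption of this sketch).
    x, y, z = p
    n = math.sqrt(1.0 + x * x + y * y)
    return (-y / n, x / n, 1.0 / n)

def trace_field_line(start, t_end, steps=2000):
    # Integrate gamma' = B(gamma), gamma(0) = start, with classical RK4.
    h = t_end / steps
    p = tuple(start)
    for _ in range(steps):
        k1 = B(p)
        k2 = B(tuple(p[i] + 0.5 * h * k1[i] for i in range(3)))
        k3 = B(tuple(p[i] + 0.5 * h * k2[i] for i in range(3)))
        k4 = B(tuple(p[i] + h * k3[i] for i in range(3)))
        p = tuple(p[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
                  for i in range(3))
    return p

def dist(p, q):
    return math.sqrt(sum((p[i] - q[i]) ** 2 for i in range(3)))

SUP_B = 1.0                      # sup-norm of the example field
start = (0.5, 0.0, 0.0)          # plays the role of mu(s)
t = 3.0
end = trace_field_line(start, t)
displacement = dist(end, start)
print(displacement, "<=", t * SUP_B)
```

In the lemma the same bound is applied twice (for $t_1$ and $t_2$, both below $T/2$) and combined with $\operatorname{diam} M$ via the triangle inequality.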
Lemma 4.7. Assume $G, \hat G, \mu, M, \gamma$ as in Definition 4.1, $\rho_0, T, \delta > 0$, $B \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$, the configuration $(M, B)$ admissible with parameters $(\rho_0, T, \delta)$, $\eta \in\ ]0, 1[$, $B' \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$ satisfying
(i) $\langle B', \nu\rangle = \langle B, \nu\rangle$ in $M$ and
(ii) $\|B' - B\|_{C^0(\hat G)} < \dfrac{(1 - \eta)\delta}{c_1 T} \cdot \exp\big(-\tfrac{1}{2} c_1 \|B\|_{C^1(\hat G)} \cdot T\big)$.
Then $(M, B')$ is admissible with parameters $(\rho_0, T, \eta\delta)$.

Proof. Let $\gamma$ denote the solution of the field line equation with respect to $B$, $\gamma'$ the same with respect to $B'$ according to Definition 4.1. Then, using the estimate (5) and Lemma 4.2 (with $\varepsilon_0 = 0$, $f = B$, $L = \|DB\|_{C^0(\hat G)}$) we get for $0 \le t \le \tfrac{T}{2}$,
$$|\gamma'(t, s) - \gamma(t, s)| \le 2\,\|B' - B\|_{C^0(\mathbb{R}^3)} \cdot t \cdot \exp\{\|DB\|_{C^0(\mathbb{R}^3)} \cdot t\} \le 2 c_1 \|B' - B\|_{C^0(\hat G)} \cdot t \cdot \exp\{c_1 \|B\|_{C^1(\hat G)} \cdot t\}.$$
Thus, assumption (ii) yields
$$|\gamma'(T_\delta(s), s) - \gamma(T_\delta(s), s)| \le (1 - \eta)\delta.$$
For all $x \in \partial G$ we have therefore
$$|\gamma'(T_\delta(s), s) - x| \ge |x - \gamma(T_\delta(s), s)| - |\gamma'(T_\delta(s), s) - \gamma(T_\delta(s), s)| \ge \delta - (1 - \eta)\delta = \eta\delta.$$
The lemma follows then by continuity of $\gamma'(\cdot, s)$. $\Box$

In the proof of the lemma to follow the field line coordinates will turn out to be useful to solve the first order linear partial differential equation
$\langle \nabla\psi, B\rangle = \varphi$ in flux tubes of the field $B$.

Theorem 4.8 (Solvability of the initial value problem). Let $G, \hat G, \mu, M, \Lambda(M, B)$ be as in Definition 4.1, $\rho_0 > 0$, $B \in C^1(\hat G, \mathbb{R}^3)$ with $\|B\|_{C^1(\hat G)} < \infty$ and $\langle B, \nu\rangle \ge \rho_0$ in $M$. Furthermore, assume
$$\psi_0 \in C^1(M, \mathbb{R}) \quad \text{and} \quad \varphi \in C^0(\Lambda(M, B), \mathbb{R}).$$
Then the initial value problem
$$\langle \nabla\psi, B\rangle = \varphi, \qquad \psi|_M = \psi_0 \tag{9}$$
has a unique solution in $\Lambda(M, B)$.

Proof. (a) Let $\gamma(\cdot, s)$ and $T(s)$ be as in Definition 4.1. Since $|B|$ is uniformly bounded, we have either $T(s) = \infty$, which is here also admitted, or
$$\lim_{t \nearrow T(s)} \gamma(t, s) \in \partial G$$
in the case $T(s) < \infty$. From the uniqueness theorem for ordinary differential equations with Lipschitz condition we know that every point lies on exactly one field line and that two different field lines cannot cross each other.
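(b) We show next that a field line of $B$ in $\hat G$ starting from $M$ cannot return to $M$. Let $\tilde\nu(s) := \nu(\mu(s))$ be the outer normal with respect to $G$ in the point $\mu(s) \in M \subset \partial G$. We consider some solution $\gamma(\cdot, s_0)$ of the field line equation according to Definition 4.1 and assume that there is some $t_1 > 0$ with
$$\gamma(t_1, s_0) := \lim_{t \nearrow t_1} \gamma(t, s_0) \in M$$
and $\gamma(t, s_0) \in \hat G$ for $t \in\ ]0, t_1[$. Moreover, for $\tau > 0$ sufficiently small, there exist continuous functions $s(\cdot) :\ ]t_1 - \tau, t_1[\ \to U$ and $\vartheta(\cdot) :\ ]t_1 - \tau, t_1[\ \to \mathbb{R}$ such that
$$\gamma(t, s_0) = \mu(s(t)) + \vartheta(t) \cdot \tilde\nu(s(t)). \tag{10}$$
We may assume $s(\cdot), \vartheta(\cdot)$ to be differentiable, and (without restriction)
$$\dot\vartheta(t) < 0 \ \text{on } ]t_1 - \tau, t_1[ \quad \text{and} \quad \vartheta(t) \searrow 0 \ \text{for } t \nearrow t_1.$$
Differentiating Eq. (10) with respect to $t$ yields
$$\dot\gamma(t, s_0) = \dot s^{(1)}(t) \cdot \frac{\partial\mu}{\partial s^{(1)}}(s(t)) + \dot s^{(2)}(t) \cdot \frac{\partial\mu}{\partial s^{(2)}}(s(t)) + \dot\vartheta(t) \cdot \tilde\nu(s(t)) + \vartheta(t)\Big[\dot s^{(1)}(t) \cdot \frac{\partial\tilde\nu}{\partial s^{(1)}}(s(t)) + \dot s^{(2)}(t) \cdot \frac{\partial\tilde\nu}{\partial s^{(2)}}(s(t))\Big]. \tag{11}$$
Taking the scalar product of Eq. (11) with $\tilde\nu(s(t))$ we get
$$\langle \dot\gamma(t, s_0), \tilde\nu(s(t))\rangle = \dot\vartheta(t) < 0,$$
and thus
$$\langle B(\gamma(t_1, s_0)), \tilde\nu(s(t_1))\rangle \le 0,$$
which contradicts our assumption.

(c) We first consider the initial value problem with homogeneous differential equation
$$\langle \nabla\psi_{hom}, B\rangle = 0, \qquad \psi_{hom}|_M = \psi_0. \tag{12}$$
Equation $(12)_1$ means that $\psi_{hom}$ is constant along the field lines of $B$:
$$\frac{\partial}{\partial t}\psi_{hom}(\gamma(t, s)) = \langle \nabla\psi_{hom}(\gamma(t, s)), \dot\gamma(t, s)\rangle = \langle \nabla\psi_{hom}(\gamma(t, s)), B(\gamma(t, s))\rangle = 0.$$
Therefore any solution of problem (12) satisfies
$$\psi_{hom}(\gamma(t, s)) = \psi_{hom}(\gamma(0, s)) = \psi_0(\mu(s)). \tag{13}$$
From (a) and (b) we know that $\gamma(\cdot, \cdot) : \Gamma(M, B) \to \Lambda(M, B)$ is a bijective mapping. Thus, conversely, $\psi_{hom} : \Lambda(M, B) \to \mathbb{R}$ defined by Eq. (13) solves the problem (12).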
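Therefore $\psi_{hom}$ is the unique solution of (12). For the inhomogeneous problem
$$\langle \nabla\psi_{inh}, B\rangle = \varphi, \qquad \psi_{inh}|_M = 0$$
we can easily find the solution
$$\psi_{inh}(\gamma(t, s)) = \int_0^t \varphi(\gamma(t', s))\, dt'$$
by differentiating both sides with respect to $t$. Finally,
$$\psi(\gamma(t, s)) = \psi_{hom}(\gamma(t, s)) + \psi_{inh}(\gamma(t, s)) = \psi_0(\mu(s)) + \int_0^t \varphi(\gamma(t', s))\, dt'$$
is the unique implicitly given solution of the initial value problem (9). $\Box$

Considering the preceding proof and Lemma 4.3 the next lemma follows easily.

Lemma 4.9. Let $G, \hat G, \mu, M, \gamma, \Lambda(M, B)$ be as in Definition 4.1, $\rho_0, T, \delta > 0$, $B \in C^1(\hat G)$, $\alpha^0 \in C_0^{1,\lambda}(M, \mathbb{R})$ and the configuration $(M, B)$ admissible with parameters $(\rho_0, T, \delta)$. Furthermore, let $\alpha$ be the solution of the initial value problem
$$\langle \nabla\alpha, B\rangle = 0, \qquad \alpha|_M = \alpha^0$$
in $\Lambda(M, B)$. Then
$$\operatorname{supp} \alpha \cap \hat G \subset \Lambda(M, B).$$
Therefore $\alpha$ is trivially extendable to $\hat G$.

Lemma 4.10. In addition to the assumptions given in Lemma 4.9 suppose $\operatorname{div} B = 0$. Then $\alpha B$ has zero flux in $\hat G$.

Proof. Because of Lemma 4.9 we have
$$\operatorname{supp} \alpha B \cap \hat G \subset \Lambda(M, B),$$
and Lemma 4.6 yields $\operatorname{diam} \Lambda(M, B) < \infty$. For $R > 0$ sufficiently large we have thus
$$G \cup \operatorname{supp}(\alpha B) \subset K_R(0).$$
With
$$\alpha B|_{\partial K_R(0)} = 0, \qquad \operatorname{div} \alpha B = \alpha \operatorname{div} B + \langle \nabla\alpha, B\rangle = 0 \quad \text{in } K_R(0) \setminus G$$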
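and applying Gauß's theorem to $K_R(0) \setminus G$ we obtain
$$\int_{\partial G} \langle \alpha B, \nu\rangle\, d\sigma = \int_{\partial K_R(0)} \langle \alpha B, \nu\rangle\, d\sigma - \int_{K_R(0) \setminus G} \operatorname{div} \alpha B\, dx = 0, \tag{14}$$
where $\nu$ denotes the outer normal with respect to $G$ on the left-hand side and with respect to $K_R(0)$ on the right-hand side of Eq. (14).

Now let $S$ be a closed surface in $\hat G$ and $S = \partial D$ with a domain $D \subset \mathbb{R}^3$. In the case $D \subset \hat G$ Gauß's theorem yields
$$\int_S \langle \alpha B, \nu\rangle\, d\sigma = \int_D \operatorname{div} \alpha B\, dx = 0,$$
in the other case, $G \subset D$, using Eq. (14) and Gauß's theorem applied to $D \setminus G$ leads to
$$\int_S \langle \alpha B, \nu\rangle\, d\sigma = \int_{\partial G} \langle \alpha B, \nu\rangle\, d\sigma + \int_{D \setminus G} \operatorname{div} \alpha B\, dx = 0. \tag{15}$$
Here, $\nu$ denotes the outer normal with respect to $D$ at $S = \partial D$ on the left-hand side of the first equation of (15) and with respect to $G$ at $\partial G$ on the right-hand side. $\Box$

5. Convergence of the α-Iteration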
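Lemma 5.1. Let $G, \hat G, \mu, M$ be as in Definition 4.1, $\rho_0, T > 0$, $B \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$ with $\langle B, \nu\rangle \ge \rho_0$. Let $\gamma(\cdot, s)$ be the solution of the field line equation in Definition 4.1 with respect to $B$ and let
$$D\gamma := \Big(\frac{\partial\gamma}{\partial t}, \frac{\partial\gamma}{\partial s^{(1)}}, \frac{\partial\gamma}{\partial s^{(2)}}\Big)$$
denote the Jacobian matrix of $\gamma$. Then there exist (in both variables) monotonically increasing functions $\kappa_1, \kappa_2 : \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}_0^+$ depending on $G, \rho_0, M, \mu$ and $\lambda$ but not on $B, T$, such that
$$\|D\gamma\|_{C^\lambda(\Gamma)} \le \kappa_1\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big) \quad \text{and} \quad \|(D\gamma)^{-1}\|_{C^\lambda(\Gamma)} \le \kappa_2\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big),$$
with $\Gamma := \Gamma(M, B, T)$.

Proof. For simplicity we do not distinguish between $T$ and $T(s)$ where $T(s) < T$ (cf. Def. 4.1). Otherwise we would have to replace $T$ by $\min(T(s), T)$. This simplification is possible since $B$ can always be extended to the entire space $\mathbb{R}^3$ with the consequence $T(s) = \infty$.

(a) According to Lemma 4.3, $\frac{\partial\gamma}{\partial s^{(1)}}(\cdot, s)$ and $\frac{\partial\gamma}{\partial s^{(2)}}(\cdot, s)$ are solutions of the linear equation
$$\dot\omega(t) = DB(\gamma(t, s)) \cdot \omega(t), \tag{16}$$
where $DB$ is the Jacobian matrix of $B$. Since
$$\frac{\partial}{\partial t}\frac{\partial\gamma}{\partial t} = \frac{\partial}{\partial t} B(\gamma) = DB(\gamma) \cdot \frac{\partial\gamma}{\partial t},$$
$\dot\gamma = \frac{\partial\gamma}{\partial t}$ is also a solution of Eq. (16). Furthermore, we have
$$|\det D\gamma(0, s)| = \Big|\Big\langle \frac{\partial\gamma}{\partial t}(0, s), \frac{\partial\gamma}{\partial s^{(1)}}(0, s) \times \frac{\partial\gamma}{\partial s^{(2)}}(0, s)\Big\rangle\Big| \ge \rho_1 \langle B(\mu(s)), \tilde\nu(s)\rangle \ge \rho_1 \rho_0 > 0. \tag{17}$$
Thus
$$\frac{\partial\gamma}{\partial t}(\cdot, s), \quad \frac{\partial\gamma}{\partial s^{(1)}}(\cdot, s), \quad \frac{\partial\gamma}{\partial s^{(2)}}(\cdot, s)$$
is a fundamental system of solutions of Eq. (16) in $\Gamma(M, B)$.

(b) Applying Lemma 4.2 (with $\delta_0 = 0$, $\varphi_1 = 0$) we get
$$\Big|\frac{\partial\gamma}{\partial s^{(j)}}(t, s)\Big| \le \Big|\frac{\partial\gamma}{\partial s^{(j)}}(0, s)\Big| \cdot \exp\big(\|DB\|_{C^0(\hat G)} \cdot t\big) \le \|D\mu\|_{C^0(U)} \cdot \exp\big(\|DB\|_{C^0(\hat G)} \cdot T\big)$$
for $0 \le t \le T$, $j = 1, 2$. Obviously there is
$$\Big|\frac{\partial\gamma}{\partial t}(t, s)\Big| \le \|B\|_{C^0(\hat G)}.$$

(c) Now we estimate $[D\gamma]_{C^\lambda(\Gamma)}$. For $\omega = \frac{\partial\gamma}{\partial t}, \frac{\partial\gamma}{\partial s^{(1)}}, \frac{\partial\gamma}{\partial s^{(2)}}$ we have Eq. (16) and thus for $s_1, s_2 \in U$,
$$\dot\omega(t, s_2) - DB(\gamma(t, s_1)) \cdot \omega(t, s_2) = \big[DB(\gamma(t, s_2)) - DB(\gamma(t, s_1))\big] \cdot \omega(t, s_2),$$
so
$$|\dot\omega(t, s_2) - DB(\gamma(t, s_1)) \cdot \omega(t, s_2)| \le \|D\gamma\|_{C^0(\Gamma)} \cdot \|DB\|_{C^\lambda(\hat G)} \cdot |\gamma(t, s_2) - \gamma(t, s_1)|^\lambda.$$
Lemma 4.2 yields then
$$|\omega(t, s_2) - \omega(t, s_1)| \le \Big(|\omega(0, s_2) - \omega(0, s_1)| + 2\,\|D\gamma\|_{C^0(\Gamma)} \cdot \|DB\|_{C^\lambda(\hat G)} \cdot \sup_{\tau \in [0, T[} |\gamma(\tau, s_2) - \gamma(\tau, s_1)|^\lambda \cdot T\Big) \cdot \exp\big(\|DB\|_{C^0(\hat G)} \cdot T\big),$$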
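where $|\gamma(\tau, s_2) - \gamma(\tau, s_1)| \le \|D\gamma\|_{C^0(\Gamma)} \cdot |s_2 - s_1|$. In the case $\omega = \frac{\partial\gamma}{\partial t}$ there is furthermore
$$|\omega(0, s_2) - \omega(0, s_1)| = |B(\mu(s_2)) - B(\mu(s_1))| \le \|DB\|_{C^0(\hat G)} \cdot \|\mu\|_{C^\lambda(U)} \cdot |s_2 - s_1|^\lambda,$$
otherwise ($\omega = \frac{\partial\gamma}{\partial s^{(j)}}$, $j = 1, 2$)
$$|\omega(0, s_2) - \omega(0, s_1)| \le \|D\mu\|_{C^\lambda(U)} \cdot |s_2 - s_1|^\lambda.$$
So, in any case
$$\frac{|\omega(t, s_2) - \omega(t, s_1)|}{|s_2 - s_1|^\lambda} \le \Big(\|D\mu\|_{C^\lambda(U)} + \|DB\|_{C^0(\hat G)} \cdot \|\mu\|_{C^\lambda(U)} + 2\,\|D\gamma\|_{C^0(\Gamma)}^{1+\lambda} \cdot \|DB\|_{C^\lambda(\hat G)} \cdot T\Big) \cdot \exp\big(\|DB\|_{C^0(\hat G)} \cdot T\big)$$
for $s_1 \ne s_2$, $0 \le t \le T$. Since $\omega$ satisfies the linear differential equation (16), we have, moreover,
$$\frac{|\omega(t_2, s) - \omega(t_1, s)|}{|t_2 - t_1|^\lambda} \le T^{1-\lambda} \cdot \|\dot\omega\|_{C^0(\Gamma)} \le T^{1-\lambda} \cdot \|DB\|_{C^0(\hat G)} \cdot \|D\gamma\|_{C^0(\Gamma)}$$
for $0 \le t_1, t_2 \le T$, $t_1 \ne t_2$, $s \in U$. Obviously, for $t_1, t_2 \in [0, T]$, $t_1 \ne t_2$, $s_1, s_2 \in U$, $s_1 \ne s_2$ we have
$$\frac{|\omega(t_2, s_2) - \omega(t_1, s_1)|}{\big((t_2 - t_1)^2 + |s_2 - s_1|^2\big)^{\lambda/2}} \le \frac{|\omega(t_2, s_2) - \omega(t_2, s_1)|}{|s_2 - s_1|^\lambda} + \frac{|\omega(t_2, s_1) - \omega(t_1, s_1)|}{|t_2 - t_1|^\lambda}.$$
Therefore
$$[\omega]_{C^\lambda(\Gamma)} \le \kappa\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big),$$
and applying the result of (b) we obtain
$$\|D\gamma\|_{C^\lambda(\Gamma)} \le \kappa_1\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big).$$

(d) From the theory of linear ordinary differential equations it is well known that
$$\det D\gamma(t, s) = \det D\gamma(0, s) \cdot \exp\Big(\int_0^t \operatorname{trace} DB(\gamma(t', s))\, dt'\Big),$$
and with the estimate (17) we have thus
$$|\det D\gamma(t, s)| \ge \rho_1 \rho_0 \cdot \exp\big(-3\,\|DB\|_{C^0(\hat G)} \cdot T\big)$$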
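for $(t, s) \in \Gamma(M, B, T)$. For the inverse of $D\gamma$ there holds the formula
$$(D\gamma)^{-1} = \frac{1}{\det D\gamma} \cdot \Big(\frac{\partial\gamma}{\partial s^{(1)}} \times \frac{\partial\gamma}{\partial s^{(2)}},\ \frac{\partial\gamma}{\partial s^{(2)}} \times \frac{\partial\gamma}{\partial t},\ \frac{\partial\gamma}{\partial t} \times \frac{\partial\gamma}{\partial s^{(1)}}\Big)^T. \tag{18}$$
Using the result of (b) we easily obtain the desired estimate for $\|(D\gamma)^{-1}\|_{C^0(\Gamma)}$.

(e) Finally we look for an estimate for $[(D\gamma)^{-1}]_{C^\lambda(\Gamma)}$. Applying the estimates (6) and (8) we have
$$\big[(D\gamma)^{-1}\big]_{C^\lambda(\Gamma)} \le \|(\det D\gamma)^{-1}\|_{C^0(\Gamma)} \cdot \big[\det D\gamma \cdot (D\gamma)^{-1}\big]_\lambda + \big[(\det D\gamma)^{-1}\big]_\lambda \cdot \|\det D\gamma \cdot (D\gamma)^{-1}\|_{C^0(\Gamma)}$$
$$\le \|(\det D\gamma)^{-1}\|_{C^0(\Gamma)} \cdot \big[\det D\gamma \cdot (D\gamma)^{-1}\big]_\lambda + \big[\det D\gamma\big]_\lambda \cdot \|(\det D\gamma)^{-1}\|_{C^0(\Gamma)}^2 \cdot \|\det D\gamma \cdot (D\gamma)^{-1}\|_{C^0(\Gamma)},$$
and using Eq. (18), the estimate (7) and the results of (b), (c) and (d) we finally obtain
$$\|(D\gamma)^{-1}\|_{C^\lambda(\Gamma)} \le \kappa_2\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big). \qquad \Box$$

Lemma 5.2. Assume $G, \hat G, \mu, M$ as in Definition 4.1, $T, \rho_0 > 0$, $B \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$, the configuration $(M, B)$ admissible with parameters $(\rho_0, T, \delta)$ and $\alpha^0 \in C_0^{1,\lambda}(M, \mathbb{R})$. Let $\alpha$ be the trivial extension of the solution of the initial value problem (see Lemma 4.9)
$$\langle \nabla\alpha, B\rangle = 0, \qquad \alpha|_M = \alpha^0.$$
Then there exists a monotonically increasing function $\kappa_3 : \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}_0^+$ depending on $G, \mu, M, \rho_0, \lambda$, but not on $B, T, \alpha^0$, with the following property:
$$\|\alpha\|_{C^\lambda(\hat G)},\ \|\alpha\|_{C^{1,\lambda}(\hat G)} \le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_3\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big),$$
where $\|\alpha^0\|_{C^{1,\lambda}(M)} := \|\alpha^0 \circ \mu\|_{C^{1,\lambda}(U)}$.

Proof. Using the notation of Theorem 4.8 we set $\psi_{hom} = \alpha$, $\psi_{inh} = 0$ and define
$$\vartheta(t, s) := \alpha(\gamma(t, s)) = \alpha^0(\mu(s)), \tag{19}$$
where $\gamma$ stands as in Theorem 4.8 for the solution of the field line equation of $B$. We use again the abbreviation $\Gamma := \Gamma(M, B, T)$.

(a) We first estimate $\vartheta$. It is easy to see that
$$\frac{\partial\vartheta}{\partial t} = 0 \quad \text{and} \quad \|\vartheta\|_{C^0(\Gamma)} = \|\alpha^0\|_{C^0(M)},$$
and from (19) we have
$$\|\nabla\vartheta\|_{C^\lambda(\Gamma)} \le \|\alpha^0\|_{C^{1,\lambda}(M)}.$$

(b) Next we estimate $\alpha$. Using the abbreviation $\Lambda := \Lambda(M, B, T) = \Lambda(M, B)$ we immediately see
$$\|\alpha\|_{C^0(\Lambda)} = \|\vartheta\|_{C^0(\Gamma)},$$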
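and with
$$\nabla\vartheta = (D\gamma)^T \cdot (\nabla\alpha) \circ \gamma$$
we get
$$\|\nabla\alpha\|_{C^0(\Lambda)} \le \|(D\gamma)^{-1}\|_{C^0(\Gamma)} \cdot \|\nabla\vartheta\|_{C^0(\Gamma)}.$$
Now consider $[\nabla\alpha]_{C^\lambda(\Lambda)}$. For $t_1, t_2 \in [0, T]$, $s_1, s_2 \in U$ with $(t_1, s_1) \ne (t_2, s_2)$ we obtain
$$\frac{|\nabla\alpha(\gamma(t_1, s_1)) - \nabla\alpha(\gamma(t_2, s_2))|}{|\gamma(t_1, s_1) - \gamma(t_2, s_2)|^\lambda} \le \|(D\gamma)^{-1}\|_{C^0(\Gamma)}^\lambda \cdot \frac{\big|((D\gamma)^T)^{-1} \cdot \nabla\vartheta\,(t_1, s_1) - ((D\gamma)^T)^{-1} \cdot \nabla\vartheta\,(t_2, s_2)\big|}{\big|(t_1 - t_2)^2 + |s_1 - s_2|^2\big|^{\lambda/2}},$$
thus
$$[\nabla\alpha]_{C^\lambda(\Lambda)} \le \|(D\gamma)^{-1}\|_{C^0(\Gamma)}^\lambda \cdot \|(D\gamma)^{-1}\|_{C^\lambda(\Gamma)} \cdot \|\nabla\vartheta\|_{C^\lambda(\Gamma)}.$$
Since $\operatorname{supp} \alpha \subset \Lambda$, Lemma 5.1 and part (a) of the proof imply
$$\|\alpha\|_{C^{1,\lambda}(\hat G)} = \|\alpha\|_{C^{1,\lambda}(\Lambda)} \le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big)$$
with $\kappa$ being a function as described in the lemma. Furthermore, there holds the estimate (note that $\Lambda = \Lambda(M, B)$)
$$\|\alpha\|_{C^\lambda(\hat G)} \le \|\alpha\|_{C^\lambda(\Lambda)} \le \big(1 + (\operatorname{diam} \Lambda)^{1-\lambda}\big) \cdot \|\alpha\|_{C^{1,\lambda}(\Lambda)} \le \big(1 + (\|B\|_{C^0(\hat G)} \cdot T + \operatorname{diam} M)^{1-\lambda}\big) \cdot \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa\big(\|B\|_{C^{1,\lambda}(\hat G)}, T\big).$$
Here use has been made of Lemma 4.6. The last two estimates contain the statement of the lemma. $\Box$

Lemma 5.3. Assume $G, \hat G, \mu, M$ as in Definition 4.1, $\rho_0, T, \delta_I, \delta_{II} > 0$, $B_I, B_{II} \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$, $\alpha^0 \in C_0^{1,\lambda}(M, \mathbb{R})$ and the configuration $(M, B_j)$ to be admissible with parameters $(\rho_0, T, \delta_j)$, $j = I, II$, respectively. Let $\alpha_j$ denote the trivial extension of the solution of the initial value problem
$$\langle \nabla\alpha_j, B_j\rangle = 0 \ \text{in } \Lambda_j := \Lambda(M, B_j), \qquad \alpha_j|_M = \alpha^0, \quad j = I, II.$$
Then there exist monotonically increasing functions $\kappa_4, \kappa_5 : \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}_0^+$ depending on $G, \mu, M, \rho_0$, but not on $B_j, \alpha^0, T$, satisfying
$$\|\alpha_{II} - \alpha_I\|_{C^\lambda(\Lambda_{II})} \le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_4\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \kappa_4\big(\|B_{II}\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|B_I - B_{II}\|_{C^\lambda(\Lambda_{II})},$$
$$\|\alpha_{II} - \alpha_I\|_{C^\lambda(\hat G)} \le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_5\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \kappa_5\big(\|B_{II}\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|B_I - B_{II}\|_{C^{1,\lambda}(\hat G)}.$$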
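Proof. Let $\Lambda_j := \Lambda(M, B_j)$, $\Gamma_j := \Gamma(M, B_j)$ and $\gamma_j$ denote the solution of the field line equation with respect to $B_j$ according to Definitions 4.1 and 4.5, $j = I, II$.

(a) We have
$$(\alpha_{II} - \alpha_I)(\gamma_{II}(t, s)) = \alpha^0(\mu(s)) - \alpha_I(\gamma_{II}(t, s)) = -\int_0^t \langle \nabla\alpha_I(\gamma_{II}(\tau, s)), \dot\gamma_{II}(\tau, s)\rangle\, d\tau = -\int_0^t \langle \nabla\alpha_I(\gamma_{II}(\tau, s)), (B_{II} - B_I)(\gamma_{II}(\tau, s))\rangle\, d\tau$$
and thus (cf. Lemma 5.2)
$$\|\alpha_{II} - \alpha_I\|_{C^0(\Lambda_{II})} \le \frac{T}{2} \cdot \|\nabla\alpha_I\|_{C^0(\hat G)} \cdot \|B_I - B_{II}\|_{C^0(\Lambda_{II})} \le \frac{T}{2} \cdot \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_3\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|B_{II} - B_I\|_{C^\lambda(\Lambda_{II})}.$$

(b) In order to estimate $[\alpha_{II} - \alpha_I]_\lambda$ consider
$$\frac{|(\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_1)) - (\alpha_{II} - \alpha_I)(\gamma_{II}(t_2, s_2))|}{|\gamma_{II}(t_1, s_1) - \gamma_{II}(t_2, s_2)|^\lambda} \le \|(D\gamma_{II})^{-1}\|_{C^0(\Gamma_{II})}^\lambda \cdot \Big\{\frac{|(\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_1)) - (\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_2))|}{|s_1 - s_2|^\lambda} + \frac{|(\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_2)) - (\alpha_{II} - \alpha_I)(\gamma_{II}(t_2, s_2))|}{|t_1 - t_2|^\lambda}\Big\}. \tag{20}$$
Next we estimate the first term in curly brackets on the right-hand side of Eq. (20),
$$\frac{1}{|s_1 - s_2|^\lambda} \cdot |(\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_1)) - (\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_2))|$$
$$= \frac{1}{|s_1 - s_2|^\lambda} \cdot \Big|\int_0^{t_1} \Big(\langle \nabla\alpha_I, (B_I - B_{II})\rangle(\gamma_{II}(\tau, s_1)) - \langle \nabla\alpha_I, (B_I - B_{II})\rangle(\gamma_{II}(\tau, s_2))\Big)\, d\tau\Big|$$
$$\le \|D\gamma_{II}\|_{C^0(\Gamma_{II})}^\lambda \cdot \int_0^{t_1} \frac{1}{|\gamma_{II}(\tau, s_1) - \gamma_{II}(\tau, s_2)|^\lambda} \cdot \Big|\langle \nabla\alpha_I, (B_I - B_{II})\rangle\Big|_{\gamma_{II}(\tau, s_2)}^{\gamma_{II}(\tau, s_1)}\Big|\, d\tau$$
$$\le \|D\gamma_{II}\|_{C^0(\Gamma_{II})}^\lambda \cdot T \cdot \|\nabla\alpha_I\|_{C^\lambda(\hat G)} \cdot \|B_{II} - B_I\|_{C^\lambda(\Lambda_{II})}$$
$$\le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_3\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \kappa_1^\lambda\big(\|B_{II}\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot T \cdot \|B_{II} - B_I\|_{C^\lambda(\Lambda_{II})}.$$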
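For the last inequality we have used Lemmas 5.1 and 5.2. Finally we consider the second term in the curly brackets on the right-hand side of the estimate (20),
$$\frac{1}{|t_1 - t_2|^\lambda} \cdot |(\alpha_{II} - \alpha_I)(\gamma_{II}(t_1, s_2)) - (\alpha_{II} - \alpha_I)(\gamma_{II}(t_2, s_2))| = \frac{1}{|t_1 - t_2|^\lambda} \cdot |\alpha_I(\gamma_{II}(t_1, s_2)) - \alpha_I(\gamma_{II}(t_2, s_2))|$$
$$= \frac{1}{|t_1 - t_2|^\lambda} \cdot \Big|\int_{t_1}^{t_2} \langle \nabla\alpha_I, (B_{II} - B_I)\rangle(\gamma_{II}(\tau, s_2))\, d\tau\Big| \le |t_1 - t_2|^{1-\lambda} \cdot \|\nabla\alpha_I\|_{C^0(\hat G)} \cdot \|B_{II} - B_I\|_{C^0(\Lambda_{II})}$$
$$\le T^{1-\lambda} \cdot \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_3\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|B_{II} - B_I\|_{C^0(\Lambda_{II})}.$$
Again, we have used Lemma 5.2 for the last inequality. From both estimates we conclude
$$[\alpha_{II} - \alpha_I]_{C^\lambda(\Lambda_{II})} \le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_3\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \kappa\big(\|B_{II}\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|B_{II} - B_I\|_{C^\lambda(\Lambda_{II})},$$
with $\kappa$ being a monotonically increasing function. The first estimate in Lemma 5.3 follows now from (a) and (b).

(c) Now let $C$ be the convex hull of $\Lambda_{II}$. Applying the mean value theorem we obtain
$$[B]_{C^\lambda(\Lambda_{II})} \le \sqrt{3} \cdot (\operatorname{diam} \Lambda_{II})^{1-\lambda} \cdot \|B\|_{C^1(C)}.$$
The factor $\sqrt{3}$ is due to the application of the mean value theorem to each of the three components of $B$. Clearly $\|B\|_{C^1(C)} \le \|B\|_{C^{1+\lambda}(\mathbb{R}^3)} \le c_1 \|B\|_{C^{1,\lambda}(\hat G)}$. So we have the estimate (here $\Lambda := \Lambda_{II}$)
$$\|B\|_{C^\lambda(\Lambda)} \le \sqrt{3}\, c_1 \big(1 + (\operatorname{diam} \Lambda)^{1-\lambda}\big) \cdot \|B\|_{C^{1,\lambda}(\hat G)}, \tag{21}$$
and with Lemma 4.6 ($\Lambda_{II} := \Lambda(M, B_{II})$)
$$\|B_I - B_{II}\|_{C^\lambda(\Lambda_{II})} \le \sqrt{3}\, c_1 \big(1 + (\operatorname{diam} \Lambda_{II})^{1-\lambda}\big) \cdot \|B_I - B_{II}\|_{C^{1,\lambda}(\hat G)} \le \sqrt{3}\, c_1 \big(1 + (\|B_{II}\|_{C^0(\hat G)} \cdot T + \operatorname{diam} M)^{1-\lambda}\big) \cdot \|B_I - B_{II}\|_{C^{1,\lambda}(\hat G)}.$$
Thus the first part of Lemma 5.3 yields
$$\|\alpha_I - \alpha_{II}\|_{C^\lambda(\Lambda_{II})} \le \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \kappa_4\big(\|B_I\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \kappa\big(\|B_{II}\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|B_I - B_{II}\|_{C^{1,\lambda}(\hat G)}$$
with $\kappa$ being again a monotonically increasing function. Since $\operatorname{supp}(\alpha_I - \alpha_{II}) \subset \Lambda(M, B_I) \cup \Lambda(M, B_{II})$ and using the corresponding estimate with labels I and II interchanged we finally obtain the second estimate in Lemma 5.3. $\Box$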
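Lemmas 5.2 and 5.3 both rest on the field line representation from Theorem 4.8, $\psi(\gamma(t, s)) = \psi_0(\mu(s)) + \int_0^t \varphi(\gamma(t', s))\, dt'$. A minimal numerical sketch of that representation follows. Everything concrete in it is an assumption made for illustration: the field $B = (-y, x, 1)$, the surface $M = \{z = 0\}$, the data $\psi_0 = x^2 + y^2$ and the right-hand side $\varphi = z$. For this choice $\psi = x^2 + y^2 + z^2/2$ satisfies $\langle \nabla\psi, B\rangle = \varphi$ and $\psi|_M = \psi_0$, so the integrated value can be compared with a closed form.

```python
def B(p):
    # Hypothetical field; its field lines are helices around the z-axis.
    x, y, z = p
    return (-y, x, 1.0)

def phi(p):
    # Hypothetical right-hand side phi(x, y, z) = z.
    return p[2]

def psi0(p):
    # Hypothetical initial data on M = {z = 0}.
    return p[0] ** 2 + p[1] ** 2

def psi_exact(p):
    # Closed-form solution for the choices above.
    x, y, z = p
    return x * x + y * y + 0.5 * z * z

def rhs(p):
    # Coupled system: gamma' = B(gamma), psi' = phi(gamma).
    return B(p) + (phi(p),)

def solve_along_field_line(start, t_end, steps=2000):
    # RK4 for the field line and for psi transported along it.
    h = t_end / steps
    p, psi = tuple(start), psi0(start)
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(tuple(p[i] + 0.5 * h * k1[i] for i in range(3)))
        k3 = rhs(tuple(p[i] + 0.5 * h * k2[i] for i in range(3)))
        k4 = rhs(tuple(p[i] + h * k3[i] for i in range(3)))
        incr = [h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(4)]
        p = tuple(p[i] + incr[i] for i in range(3))
        psi += incr[3]
    return p, psi

point, psi_numeric = solve_along_field_line((1.0, 0.5, 0.0), 2.0)
print(psi_numeric, psi_exact(point))
```

With $\varphi = 0$ the same routine transports the boundary value unchanged along the field line, which is the situation of $\alpha$ in Lemma 5.2.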
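We are now in the position to present the main result of this article, which is the proof of the convergence of the iteration scheme and thereby the construction of a force-free magnetic field in the exterior domain.

Theorem 5.4. Let $G, \hat G, M, \mu$ be as in Definition 4.1, $\rho_0, T, \delta > 0$, $0 \in G$, $B_0 \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$ with
$$\operatorname{div} B_0 = 0 \quad \text{and} \quad \operatorname{curl} B_0 = 0, \qquad |B_0(x)| = O(|x|^{-2}), \ |x| \to \infty.$$
Suppose the configuration $(M, B_0)$ to be admissible with parameters $(\rho_0, T, \delta)$. Then there exists $\eta > 0$ depending on $G, M, \mu, \rho_0, T, \delta$ and $\|B_0\|_{C^{1,\lambda}(\hat G)}$, with the following property: If $\alpha^0 \in C_0^{1,\lambda}(M, \mathbb{R})$ with $\|\alpha^0\|_{C^{1,\lambda}(M)} < \eta$, then the iteration scheme
(i) $\alpha_n$ solves $\langle \nabla\alpha_n, B_{n-1}\rangle = 0$, $\alpha_n|_M = \alpha^0$ according to Th. 4.8,
(ii) $j_n := \alpha_n B_{n-1}$,
(iii) $B_n$ solves $\operatorname{curl} B_n = j_n$, $\operatorname{div} B_n = 0$, $\langle B_n, \nu\rangle|_{\partial G} = \langle B_0, \nu\rangle|_{\partial G}$ according to Theorem 3.1,
converges in the following sense: There exist $B \in C^{1,\lambda}(\hat G, \mathbb{R}^3)$ and $\alpha \in C^\lambda(\hat G, \mathbb{R})$ with
$$\|B_n - B\|_{C^{1,\lambda}(\hat G)} \to 0, \quad \|\alpha_n - \alpha\|_{C^\lambda(\hat G)} \to 0, \quad n \to \infty,$$
$$\operatorname{div} B = 0, \quad \operatorname{curl} B = \alpha B \ \text{in } \hat G, \qquad \langle B, \nu\rangle|_{\partial G} = \langle B_0, \nu\rangle|_{\partial G}, \quad |B(x)| = O(|x|^{-2}), \ |x| \to \infty, \qquad \alpha|_M = \alpha^0.$$

Proof. We use the abbreviation $\Lambda_n := \Lambda(M, B_n)$ for $n \in \mathbb{N}_0$. Let $\varrho \in\ ]1, 3[$, $c_0$ be the constant depending on $\lambda, \varrho, G$ in Theorem 3.2, $c_1$ the constant for the extensions of $B_0, B_n$ to $\mathbb{R}^3$ described at the beginning of Sect. 3,
$$K_1 := 2\,\|B_0\|_{C^{1,\lambda}(\hat G)}, \qquad R_0 := \operatorname{diam} G,$$
and $\eta > 0$ so small that
$$c_0 c_1 \sqrt{3} \cdot \big((R_0 + K_1 T + \operatorname{diam} M)^\varrho + 1\big) \cdot \big(1 + (K_1 T + \operatorname{diam} M)^{1-\lambda}\big) \cdot \eta K_1 \cdot \kappa_3(K_1, T) < \min\Big\{\tfrac{1}{4} K_1,\ \tfrac{\delta}{4 c_1 T}\, e^{-\frac{1}{2} c_1 K_1 T}\Big\} \tag{22}$$
and
$$c_0 c_1 \sqrt{3} \cdot \big((R_0 + K_1 T + \operatorname{diam} M)^\varrho + 1\big) \cdot \big(1 + (2 K_1 T + 2 \operatorname{diam} M)^{1-\lambda}\big) \cdot K_1 \cdot \big(\kappa_5^2(K_1, T) + \kappa_3(K_1, T)\big) \cdot \eta < \frac{1}{2}. \tag{23}$$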
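Now suppose $\alpha^0 \in C_0^{1,\lambda}(M)$ satisfying $\|\alpha^0\|_{C^{1,\lambda}(M)} < \eta$. $\alpha_1$ is the trivial extension of the unique solution of the initial value problem
$$\langle \nabla\alpha_1, B_0\rangle = 0 \ \text{in } \Lambda_0, \qquad \alpha_1|_M = \alpha^0$$
according to Theorem 4.8. $B_1 - B_0$ is then the unique solution of the Neumann problem
$$\operatorname{curl}(B_1 - B_0) = \alpha_1 B_0, \quad \operatorname{div}(B_1 - B_0) = 0, \quad \langle B_1 - B_0, \nu\rangle|_{\partial G} = 0 \tag{24}$$
according to Theorem 3.1. Note that $\alpha_1$ has compact support and thus the asymptotic condition on $w$ in Theorem 3.1 is satisfied. For $x \in \Lambda_n$ we have, while $0 \in G$, $|x| \le \operatorname{diam} G + \operatorname{diam} \Lambda_n$ and therefore for a bounded function $q$,
$$|x|^\varrho\, |q(x)| \le (\operatorname{diam} G + \operatorname{diam} \Lambda_n)^\varrho \cdot \|q\|_\infty. \tag{25}$$
Applying Lemma 4.6 we obtain
$$|\!|\!|\alpha_1 B_0|\!|\!|_\varrho \le (R_0 + K_1 T + \operatorname{diam} M)^\varrho \cdot \|\alpha_1 B_0\|_{C^0(\hat G)}.$$
Because of Theorem 3.2 there holds
$$\|B_1 - B_0\|_{C^{1,\lambda}(\hat G)} \le c_0 \cdot \big(1 + (R_0 + K_1 T + \operatorname{diam} M)^\varrho\big) \cdot \|\alpha_1 B_0\|_{C^\lambda(\hat G)}$$
$$\le c_0 c_1 \sqrt{3}\, \big(1 + (R_0 + K_1 T + \operatorname{diam} M)^\varrho\big)\big(1 + (K_1 T + \operatorname{diam} M)^{1-\lambda}\big) \cdot \kappa_3\big(\|B_0\|_{C^{1,\lambda}(\hat G)}, T\big) \cdot \|\alpha^0\|_{C^{1,\lambda}(M)} \cdot \|B_0\|_{C^{1,\lambda}(\hat G)}.$$
For the last inequality we have applied Lemma 5.2 and the estimate (21). According to assumption (22) we have $\|B_1 - B_0\|_{C^{1,\lambda}(\hat G)}$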
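0, a > 0. Then there exists a $C(\varepsilon, \delta, a) > 0$ such that for all $x \in [0, a]$,
$$|m(z, x) - i z^{1/2}| \le C(\varepsilon, \delta, a), \tag{2.9}$$
where $C(\varepsilon, \delta, a)$ depends on $\varepsilon$, $\delta$, and $\sup_{0 \le x \le a}\big(\int_x^{x+\delta} dy\, |q(y)|\big)$.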
Theorems 2.1 and 2.2 can be proved following arguments of Atkinson [2], who studied the Riccati-type equation satisfied by $m(z, x)$,
$$m'(z, x) + m(z, x)^2 = q(x) - z \quad \text{for a.e. } x \ge 0 \text{ and all } z \in \mathbb{C} \setminus \mathbb{R}. \tag{2.10}$$
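A quick sanity check of (2.10) and of the bound (2.9) is possible in the simplest explicitly solvable case. The sketch below assumes a constant potential $q(x) = c$ on the half-line (a hypothetical example, not taken from the text); then the Weyl solution is $e^{i (z - c)^{1/2} x}$ with $\operatorname{Im}((z - c)^{1/2}) > 0$, so $m(z, x) = i (z - c)^{1/2}$ is independent of $x$.

```python
import cmath

def sqrt_upper(w):
    # Square root with non-negative imaginary part, i.e. the branch Im(z^{1/2}) >= 0.
    r = cmath.sqrt(w)
    return r if r.imag >= 0 else -r

def m_constant_potential(z, c):
    # m(z, x) for the hypothetical constant potential q = c; it does not depend on x.
    return 1j * sqrt_upper(z - c)

z = 50.0 * cmath.exp(0.75j * cmath.pi)   # a point with arg(z) = 3*pi/4
c = 2.0
m = m_constant_potential(z, c)

# Riccati equation (2.10): m' + m^2 = q - z, with m' = 0 in this example.
riccati_residual = 0.0 + m * m - (c - z)

# Quantity bounded in (2.9): |m(z, x) - i z^{1/2}|.
deviation = abs(m - 1j * sqrt_upper(z))
print(abs(riccati_residual), deviation)
```

In this example the deviation even tends to zero like $|c| / (2 |z|^{1/2})$, which is consistent with, and stronger than, the uniform bound in (2.9).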
Next, let $q_j(x)$, $j = 1, 2$ be two potentials satisfying (2.1), with $m_j(z)$ the associated (Dirichlet) m-functions. Combining the a priori bound (2.9) with the differential equation resulting from (2.10),
$$[m_1(z, x) - m_2(z, x)]' = q_1(x) - q_2(x) - [m_1(z, x) + m_2(z, x)][m_1(z, x) - m_2(z, x)], \tag{2.11}$$
permits one to prove the following converse of Theorem 1.1.
Local Borg–Marchenko Results
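Theorem 2.3 ([11]). Let $\arg(z) \in (\varepsilon, \pi - \varepsilon)$ for some $0 < \varepsilon < \pi$ and suppose $a > 0$. If
$$q_1(x) = q_2(x) \ \text{for a.e. } x \in [0, a], \tag{2.12}$$
then
$$|m_1(z) - m_2(z)| \underset{|z| \to \infty}{=} O\big(e^{-2 \operatorname{Im}(z^{1/2}) a}\big). \tag{2.13}$$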
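The exponential closeness in (2.13) can be made concrete with an explicitly solvable pair of potentials. The sketch below is an illustration under assumptions that are not part of the text: $q_1 = 0$ and a step potential $q_2 = c \cdot 1_{(a, \infty)}$, which agree on $[0, a]$. Matching $\psi'/\psi = i (z - c)^{1/2}$ at $x = a$ to a combination of $e^{\pm i z^{1/2} x}$ on $[0, a]$ gives $m_2(z) = i z^{1/2} (1 - r)/(1 + r)$ with $r = e^{2 i z^{1/2} a} (k - \kappa)/(k + \kappa)$, $k = z^{1/2}$, $\kappa = (z - c)^{1/2}$.

```python
import cmath
import math

def sqrt_upper(w):
    # Branch of the square root with Im >= 0.
    r = cmath.sqrt(w)
    return r if r.imag >= 0 else -r

def m_free(z):
    # Dirichlet m-function for q = 0.
    return 1j * sqrt_upper(z)

def reflection(z, a, c):
    # r = e^{2ika} (k - kappa)/(k + kappa) for the hypothetical step potential.
    k, kappa = sqrt_upper(z), sqrt_upper(z - c)
    return cmath.exp(2j * k * a) * (k - kappa) / (k + kappa)

def m_step(z, a, c):
    # Dirichlet m-function for q = 0 on [0, a], q = c on (a, oo).
    r = reflection(z, a, c)
    return 1j * sqrt_upper(z) * (1 - r) / (1 + r)

def m_gap(z, a, c):
    # m_free - m_step in closed form; avoids cancellation when |z| is large.
    r = reflection(z, a, c)
    return 2j * sqrt_upper(z) * r / (1 + r)

def scaled_gap(z, a, c):
    # |m_1 - m_2| * exp(2 Im(z^{1/2}) a), which (2.13) asserts to stay bounded.
    return abs(m_gap(z, a, c)) * math.exp(2 * sqrt_upper(z).imag * a)

a, c = 1.0, 1.0
for radius in (25.0, 100.0, 400.0):
    z = radius * cmath.exp(0.75j * cmath.pi)
    print(radius, abs(m_gap(z, a, c)), scaled_gap(z, a, c))
```

The closed form `m_gap` is used for the decay check because the direct difference `m_free(z) - m_step(z, a, c)` falls below floating-point resolution once $e^{-2 \operatorname{Im}(z^{1/2}) a}$ is smaller than machine precision.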
Lemma 2.4. In addition to the hypotheses of Theorem 2.2 (resp., Theorem 2.3), suppose that $H$ (resp., $H_j$, $j = 1, 2$) is bounded from below. Then (2.9) (resp., (2.13)) extends to all $\arg(z) \in (\varepsilon, \pi]$.

Proof. Since $\inf(\operatorname{spec}(H_x)) \ge \inf(\operatorname{spec}(H))$, where $H_x$ denotes the Schrödinger operator $-\frac{d^2}{dx^2} + q$ in $L^2([x, \infty))$ with a Dirichlet boundary condition at $x_+$ (and the same s.-a. b.c. at $\infty$ as $H$, if any), there is an $E_0 \in \mathbb{R}$ such that for all $x \in [0, a]$, $m(z, x)$ is analytic in $\mathbb{C} \setminus [E_0, \infty)$. Using $\overline{m(z, x)} = m(\bar z, x)$, the estimate (2.9) holds on the boundary of a sector with vertex at $E_0 - 1$, symmetry axis $(-\infty, E_0 - 1]$, and some opening angle $0 < \varepsilon < \pi/2$. An application of the Phragmén-Lindelöf principle (cf. [29, Part III, Sect. 6.5]) then extends (2.9) to all of the interior of that sector and hence in particular along the ray $z \downarrow -\infty$. Since (2.13) results from (2.9) upon integrating (cf. (2.11))
$$[m_1(z, x) - m_2(z, x)]' = -[m_1(z, x) + m_2(z, x)][m_1(z, x) - m_2(z, x)], \quad x \in [0, a] \tag{2.14}$$
from $x = 0$ to $x = a$, the extension of (2.9) to $z$ with $\arg(z) \in (\varepsilon, \pi]$ just proven allows one to estimate
$$m_1(z, x) + m_2(z, x) \underset{|z| \to \infty}{=} 2 i z^{1/2} + O(1), \quad \arg(z) \in (\varepsilon, \pi], \tag{2.15}$$
uniformly with respect to $x \in [0, a]$, and hence to extend (2.13) to $\arg(z) \in (\varepsilon, \pi]$. $\Box$

Next, we briefly recall a few well-known facts on compactly supported $q$. Hence we suppose temporarily that
$$\sup(\operatorname{supp}(q)) = \alpha < \infty. \tag{2.16}$$
In this case, the Jost solution $f(z, x)$ associated with $q(x)$ satisfies
$$f(z, x) = e^{i z^{1/2} x} - \int_x^\alpha dy\, \frac{\sin(z^{1/2}(x - y))}{z^{1/2}}\, q(y) f(z, y) \tag{2.17}$$
$$= e^{i z^{1/2} x} + \int_x^\alpha dy\, K(x, y)\, e^{i z^{1/2} y}, \quad \operatorname{Im}(z^{1/2}) \ge 0, \ x \ge 0, \tag{2.18}$$
where $K(x, y)$ denotes the transformation kernel satisfying (cf. [27, Sect. 3.1])
$$K(x, y) = \frac{1}{2} \int_{(x+y)/2}^\alpha dx'\, q(x') - \int_{(x+y)/2}^\alpha dx' \int_0^{(y-x)/2} dx''\, q(x' - x'')\, K(x' - x'', x' + x''), \quad x \le y, \tag{2.19}$$
$$K(x, y) = 0, \quad x > y, \tag{2.20}$$
F. Gesztesy, B. Simon
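$$|K(x, y)| \le \frac{1}{2} \int_{(x+y)/2}^\alpha dx'\, |q(x')| \exp\Big(\int_x^\alpha dx''\, x'' |q(x'')|\Big). \tag{2.21}$$
Moreover, $f(z, x)$ is a multiple of the Weyl solution, implying
$$m(z, x) = f'(z, x)/f(z, x), \quad z \in \mathbb{C} \setminus \mathbb{R}, \ x \ge 0, \tag{2.22}$$
and the Volterra integral equation (2.17) immediately yields
$$|f(z, x)| \le C e^{-\operatorname{Im}(z^{1/2}) x}, \quad \operatorname{Im}(z^{1/2}) \ge 0, \ x \ge 0, \tag{2.23}$$
$$f(z, x) \underset{\substack{|z| \to \infty \\ \operatorname{Im}(z^{1/2}) \ge 0}}{=} e^{i z^{1/2} x}\big(1 + O(|z|^{-1/2})\big), \quad x \ge 0. \tag{2.24}$$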
Our final ingredient concerns the following result on finite Laplace transforms.

Lemma 2.5 (= Lemma A.2.1 in [33]). Let $g \in L^1([0, a])$ and assume that $\int_0^a dy\, g(y) e^{-xy} \underset{x \uparrow \infty}{=} O(e^{-xa})$. Then $g(y) = 0$ for a.e. $y \in [0, a]$.

Given these facts, the proof of Theorem 1.1 now becomes quite simple.

Proof of Theorem 1.1. By Theorem 2.3 we may assume, without loss of generality, that $q_1$ and $q_2$ are compactly supported such that
$$\operatorname{supp}(q_j) \subseteq [0, a], \quad j = 1, 2, \tag{2.25}$$
and by Lemma 2.4 we may suppose that (1.1) holds along the ray $z \downarrow -\infty$, that is,
$$|m_1(z) - m_2(z)| \underset{z \downarrow -\infty}{=} O\big(e^{-2 |z|^{1/2} a}\big). \tag{2.26}$$
Denoting by $m_j(z, x)$ and $f_j(z, x)$ the m-functions and Jost solutions associated with $q_j$, $j = 1, 2$, integrating the elementary identity
$$\frac{d}{dx} W(f_1(z, x), f_2(z, x)) = -[q_1(x) - q_2(x)] f_1(z, x) f_2(z, x) \tag{2.27}$$
from $x = 0$ to $x = a$, taking into account (2.22), yields
$$\int_0^a dx\, [q_1(x) - q_2(x)] f_1(z, x) f_2(z, x) = f_1(z, x) f_2(z, x)[m_1(z, x) - m_2(z, x)]\Big|_{x=0}^{a}. \tag{2.28}$$
By (2.8), (2.23), and (2.26), the right-hand side of (2.28) is $O(e^{-2 |z|^{1/2} a})$ as $z \downarrow -\infty$ (in fact, the right-hand side of (2.28) is zero at $x = a$ due to the compact support assumption (2.25)), that is,
$$\int_0^a dx\, [q_1(x) - q_2(x)] f_1(z, x) f_2(z, x) \underset{z \downarrow -\infty}{=} O\big(e^{-2 |z|^{1/2} a}\big). \tag{2.29}$$
Denoting by $K_j(x, y)$ the transformation kernels associated with $q_j$, $j = 1, 2$, (2.18) implies
$$f_1(z, x) f_2(z, x) = e^{2 i z^{1/2} x} + \int_x^a dy\, L(x, y)\, e^{2 i z^{1/2} y}, \tag{2.30}$$
where
$$L(x, y) = 2[K_1(x, 2y - x) + K_2(x, 2y - x)] + 2 \int_x^{2y - x} dx'\, K_1(x, x') K_2(x, 2y - x'), \quad x \le y, \tag{2.31}$$
$$L(x, y) = 0, \quad x > y \text{ or } y > a. \tag{2.32}$$
Insertion of (2.30) into (2.29), interchanging the order of integration in the double integral, then yields
$$\int_0^a dx\, [q_1(x) - q_2(x)] f_1(z, x) f_2(z, x) = \int_0^a dy\, \Big\{[q_1(y) - q_2(y)] + \int_0^y dx\, L(x, y)[q_1(x) - q_2(x)]\Big\} e^{-2 |z|^{1/2} y} \underset{z \downarrow -\infty}{=} O\big(e^{-2 |z|^{1/2} a}\big). \tag{2.33}$$
An application of Lemma 2.5 then yields
$$[q_1(y) - q_2(y)] + \int_0^y dx\, L(x, y)[q_1(x) - q_2(x)] = 0 \ \text{for a.e. } y \in [0, a]. \tag{2.34}$$
Since (2.34) is a homogeneous Volterra integral equation with a continuous integral kernel $L(x, y)$, one concludes $q_1 = q_2$ a.e. on $[0, a]$. $\Box$

In particular, one obtains the following strengthened version of the original Borg–Marchenko uniqueness result, Theorem 1.2.

Corollary 2.6. Let $0 < \varepsilon < \pi/2$ and suppose that for all $a > 0$,
$$|m_1(z) - m_2(z)| \underset{|z| \to \infty}{=} O\big(e^{-2 \operatorname{Im}(z^{1/2}) a}\big) \tag{2.35}$$
along the ray $\arg(z) = \pi - \varepsilon$. Then
$$q_1(x) = q_2(x) \ \text{for a.e. } x \in [0, \infty). \tag{2.36}$$

Remark 2.7. The Borg–Marchenko uniqueness result, Theorem 1.2 (but not our strengthened version, Corollary 2.6), under the additional condition of short-range potentials $q_j$ satisfying $q_j \in L^1([0, \infty); (1 + x)\, dx)$, $j = 1, 2$, can also be proved using Property C, a device recently used by Ramm [30,31] in a variety of uniqueness results. In this case, (2.28) for $z = \lambda > 0$ becomes
$$\int_0^\infty dx\, [q_1(x) - q_2(x)] f_1(\lambda, x) f_2(\lambda, x) = -f_1(\lambda, 0) f_2(\lambda, 0)[m_1(\lambda + i0) - m_2(\lambda + i0)] = 0, \quad \lambda > 0 \tag{2.37}$$
since $m_1(z) = m_2(z)$, $z \in \mathbb{C}_+$ extends to $m_1(\lambda + i0) = m_2(\lambda + i0)$, $\lambda > 0$ by continuity in the present short-range case. By definition, Property C stands for completeness of the set $\{f_1(\lambda, x) f_2(\lambda, x)\}_{\lambda > 0}$ in $L^1([0, \infty); (1 + x)\, dx)$ (this extends to $L^1([0, \infty))$) and hence (2.37) yields $q_1 = q_2$ a.e. on $[0, \infty)$.

In the remainder of this section, we consider a variety of generalizations of the result obtained.

Remark 2.8. The ray $\arg(z) = \pi - \varepsilon$, $0 < \varepsilon < \pi/2$ chosen in Theorem 1.1 and Corollary 2.6 is of no particular importance. A limit taken along any non-self-intersecting curve $C$ going to infinity in the sector $\arg(z) \in (\pi/2 + \varepsilon, \pi - \varepsilon)$ will do as we can apply the Phragmén-Lindelöf principle ([29, Part III, Sect. 6.5]) to the region enclosed by $C$ and its complex conjugate $\bar C$ (needed in connection with Lemma 2.4 in order to reduce the general case to the case of spectra bounded from below).

Remark 2.9. For simplicity of exposition, we only discussed the Dirichlet boundary condition
$$g(0_+) = 0 \tag{2.38}$$
in the definition of $H$ in (2.2). Next we replace (2.38) by the general boundary condition
$$\sin(\alpha) g'(0_+) + \cos(\alpha) g(0_+) = 0, \quad \alpha \in [0, \pi) \tag{2.39}$$
in (2.2), denoting the resulting Schrödinger operator by $H_\alpha$, while keeping the boundary condition at infinity (if any) identical for all $\alpha \in [0, \pi)$. Denoting by $m_\alpha(z)$ the Weyl–Titchmarsh function associated with $H_\alpha$, the well-known relation (cf. e.g., Appendix A of [10] for precise details on $H_\alpha$ and $m_\alpha(z)$)
$$m_\alpha(z) = \frac{-\sin(\alpha) + \cos(\alpha) m(z)}{\cos(\alpha) + \sin(\alpha) m(z)}, \quad \alpha \in [0, \pi), \ z \in \mathbb{C} \setminus \mathbb{R} \tag{2.40}$$
reduces the case $\alpha \in (0, \pi)$ to the Dirichlet case $\alpha = 0$. In particular, Theorem 1.1 and Corollary 2.6 remain valid with $m_j(z)$ replaced by $m_{j,\alpha}(z)$, $\alpha \in [0, \pi)$. Indeed, $|m_{1,\alpha}(z) - m_{2,\alpha}(z)| \underset{|z| \to \infty}{=} O(e^{-2 \operatorname{Im}(z^{1/2}) a})$ along the ray $\arg(z) = \pi - \varepsilon$ is easily seen to imply, for all sufficiently small $\delta > 0$,
$$|m_{1,0}(z) - m_{2,0}(z)| \underset{|z| \to \infty}{=} O\big(|z|\, e^{-2 \operatorname{Im}(z^{1/2}) a}\big) \underset{|z| \to \infty}{=} O\big(e^{-2 \operatorname{Im}(z^{1/2})(a - \delta)}\big) \tag{2.41}$$
along the ray $\arg(z) = \pi - \varepsilon$. Hence one infers from Theorem 1.1 that for all $0 < \delta < a$, $q_1 = q_2$ a.e. on $[0, a - \delta]$. Since $\delta > 0$ can be chosen arbitrarily small, one concludes $q_1 = q_2$ a.e. on $[0, a]$. In fact, more is true. Since $m_\alpha(z) \underset{|z| \to \infty}{\to} \cot(\alpha)$ along the ray, one concludes that $|m_{1,\alpha_1}(z) - m_{2,\alpha_2}(z)| \underset{|z| \to \infty}{=} O(e^{-2 \operatorname{Im}(z^{1/2}) a})$ along a ray implies $\alpha_1 = \alpha_2$ and $q_1 = q_2$ a.e. on $[0, a]$.
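The Möbius relation (2.40) and the limit $m_\alpha(z) \to \cot(\alpha)$ used at the end of Remark 2.9 can be illustrated in a few lines. The sketch below assumes the free case $q = 0$, where the Dirichlet m-function is $i z^{1/2}$; the helper names are hypothetical.

```python
import cmath
import math

def m_alpha(m, alpha):
    # Relation (2.40): m-function for boundary parameter alpha from the Dirichlet m.
    return (-math.sin(alpha) + math.cos(alpha) * m) / (math.cos(alpha) + math.sin(alpha) * m)

def dirichlet_from_alpha(m_a, alpha):
    # Inverse Moebius map, recovering the Dirichlet m-function (alpha = 0).
    return (math.sin(alpha) + math.cos(alpha) * m_a) / (math.cos(alpha) - math.sin(alpha) * m_a)

alpha = math.pi / 3
z = 1.0e4 * cmath.exp(0.75j * cmath.pi)      # point on the ray arg(z) = 3*pi/4
m_dirichlet = 1j * cmath.sqrt(z)             # free case q = 0 (assumption of this sketch)
m_a = m_alpha(m_dirichlet, alpha)

cot_alpha = math.cos(alpha) / math.sin(alpha)
print(abs(m_a - cot_alpha))                  # small for large |z|
print(abs(dirichlet_from_alpha(m_a, alpha) - m_dirichlet))
```

Since the map (2.40) is a Möbius transformation with bounded coefficients and $m(z)$ grows like $|z|^{1/2}$, differences of $m_\alpha$ translate into differences of the Dirichlet m-functions up to a polynomial factor in $|z|$, which is the mechanism behind (2.41).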
Remark 2.10. If one is interested in a finite interval $[0, b]$ instead of the half-line $[0, \infty)$ in Theorem 1.1, with $0 < a < b$, one introduces a self-adjoint boundary condition at $x = b_-$ of the type
$$\sin(\beta) g'(b_-) + \cos(\beta) g(b_-) = 0, \quad \beta \in [0, \pi). \tag{2.42}$$
The analog of the Weyl solution $\psi(z, x; b, \beta)$ for the corresponding Schrödinger operator $H(b, \beta)$ in $L^2([0, b])$ defined by
$$H(b, \beta) = -\frac{d^2}{dx^2} + q, \quad \beta \in [0, \pi),$$
$$\operatorname{dom}(H(b, \beta)) = \{g \in L^2([0, b]) \mid g, g' \in AC([0, b]);\ g(0_+) = 0,\ \sin(\beta) g'(b_-) + \cos(\beta) g(b_-) = 0;\ (-g'' + qg) \in L^2([0, b])\} \tag{2.43}$$
is then defined by
$$\sin(\beta) \psi'(z, b_-; b, \beta) + \cos(\beta) \psi(z, b_-; b, \beta) = 0, \quad z \in \mathbb{C} \setminus \mathbb{R},$$
$$-\psi''(z, x; b, \beta) + [q(x) - z] \psi(z, x; b, \beta) = 0. \tag{2.44}$$
Moreover, the analog of (2.18) is then of the type
$$\psi(z, x; b, \beta) = \psi^{(0)}(z, x; b, \beta) + \int_x^b dy\, K(x, y; b, \beta)\, \psi^{(0)}(z, y; b, \beta), \quad z \in \mathbb{C} \setminus \mathbb{R}, \ x \in [0, b], \tag{2.45}$$
where
$$\psi^{(0)}(z, x; b, \beta) = \frac{e^{i z^{1/2} x} + \zeta(\beta, z)\, e^{i z^{1/2}(2b - x)}}{1 + \zeta(\beta, z)\, e^{2 i z^{1/2} b}}, \qquad \zeta(\beta, z) = \frac{-i z^{1/2} - \cot(\beta)}{-i z^{1/2} + \cot(\beta)}, \quad \operatorname{Im}(z^{1/2}) \ge 0 \tag{2.46}$$
is the corresponding Weyl solution in the case $q(x) = 0$, $x \in [0, b]$, and $K(x, y; b, \beta)$ is a transformation kernel analogous to $K(x, t; h)$ discussed in Sect. 1.3 of [27]. Theorem 1.1 then extends to triples $(q_j, b_j, \beta_j)$, $j = 1, 2$ with $a < \min(b_1, b_2)$, replacing $f_j(z, x)$ in (2.29) by $\psi(z, x; b_j, \beta_j)$, $j = 1, 2$. More precisely, if $m(z; b_j, \beta_j)$ denote the m-functions for $H(b_j, \beta_j)$, $j = 1, 2$ with $a < \min(b_1, b_2)$ and
$$|m(z; b_1, \beta_1) - m(z; b_2, \beta_2)| \underset{|z| \to \infty}{=} O\big(e^{-2 \operatorname{Im}(z^{1/2}) a}\big) \tag{2.47}$$
along the ray $\arg(z) = \pi - \varepsilon$, then $q_1 = q_2$ a.e. on $[0, a]$. In fact, it was precisely this version of Theorem 1.1 which was originally proven by one of us [33] in 1998.

One can also derive additional results in the case $a = b_1 = b_2$ (cf. Theorem 1.3 in [33]). Indeed, $\zeta(\beta, z) \underset{|z| \to \infty}{=} 1 + O(|z|^{-1/2})$, so by (2.45) and (2.46),
$$\psi(z, 0; a, \beta) \underset{|z| \to \infty}{=} 1 + O(|z|^{-1/2}), \tag{2.48}$$
$$\psi(z, a; a, \beta) \underset{|z| \to \infty}{=} 2 e^{i z^{1/2} a}\big(1 + O(|z|^{-1/2})\big), \tag{2.49}$$
which are analogous to (2.24). Thus, if $q_1 = q_2$ on $[0, a]$ but $\beta_1 \ne \beta_2$, we have that
$$[m_1(z; a, \beta_1) - m_2(z; a, \beta_2)] = \psi_1(z, 0; a, \beta_1)^{-1} \psi_2(z, 0; a, \beta_2)^{-1}\, W(\psi_2(z, 0; a, \beta_2), \psi_1(z, 0; a, \beta_1))$$
$$= \psi_1(z, 0; a, \beta_1)^{-1} \psi_2(z, 0; a, \beta_2)^{-1}\, W(\psi_2(z, a; a, \beta_2), \psi_1(z, a; a, \beta_1)), \tag{2.50}$$
by the constancy of the Wronskian when $q_1 = q_2$. But $\psi'(a)/\psi(a)$ equals $\cot(\beta)$ by (2.44) and hence
$$m_1(z; a, \beta_1) - m_2(z; a, \beta_2) = \psi_1(z, 0; a, \beta_1)^{-1} \psi_2(z, 0; a, \beta_2)^{-1} \times \psi_1(z, a; a, \beta_1) \psi_2(z, a; a, \beta_2)[\cot(\beta_1) - \cot(\beta_2)]. \tag{2.51}$$
Using (2.48), (2.49), this implies that
$$m_1(z; a, \beta_1) - m_2(z; a, \beta_2) = 4 e^{2 i z^{1/2} a}[\cot(\beta_1) - \cot(\beta_2)][1 + O(|z|^{-1/2})], \tag{2.52}$$
which is Theorem 1.3 in [33].

While we have separately described a few extensions in Remarks 2.8–2.10, it is clear that they can all be combined at once. We also mention the analog of Theorem 1.1 for Schrödinger operators on the real line. Assuming
$$q \in L^1_{loc}(\mathbb{R}), \quad q \text{ real-valued}, \tag{2.53}$$
one introduces the corresponding self-adjoint Schrödinger operator $H$ in $L^2(\mathbb{R})$ by
$$H = -\frac{d^2}{dx^2} + q, \qquad \operatorname{dom}(H) = \{g \in L^2(\mathbb{R}) \mid g, g' \in AC_{loc}(\mathbb{R});\ \text{s. s.-a. b.c. at } \pm\infty;\ (-g'' + qg) \in L^2(\mathbb{R})\}. \tag{2.54}$$
Here “s. s.-a. b.c.” denotes separated self-adjoint boundary conditions at $+\infty$ and/or $-\infty$ (if any). The $2 \times 2$ matrix-valued m-function $M(z)$ associated with $H$ in $L^2(\mathbb{R})$ is then defined by
$$M(z) = (m_-(z) - m_+(z))^{-1} \begin{pmatrix} 1 & (m_-(z) + m_+(z))/2 \\ (m_-(z) + m_+(z))/2 & m_-(z) m_+(z) \end{pmatrix}, \quad z \in \mathbb{C} \setminus \mathbb{R}, \tag{2.55}$$
where $m_\pm(z)$ denote the half-line m-functions associated with $H$ restricted to $[0, \pm\infty)$ and a Dirichlet boundary condition at $x = 0$. Next, let $q_j(x)$, $j = 1, 2$ be two potentials satisfying (2.53) and $H_j$ the corresponding Schrödinger operators (2.54) in $L^2(\mathbb{R})$, with $M_j(z)$, $j = 1, 2$ the associated $2 \times 2$ matrix-valued m-functions. Then the analog of Theorem 1.1 reads as follows.
Theorem 2.11. Let a > 0, 0 < ε < π/2 and suppose that
$$\|M_1(z) - M_2(z)\|_{\mathbb{C}^{2\times 2}} \underset{|z|\to\infty}{=} O\big(e^{-2\,\mathrm{Im}(z^{1/2})a}\big) \tag{2.56}$$
along the ray arg(z) = π − ε. Then
$$q_1(x) = q_2(x) \ \text{ for a.e. } x \in [-a, a]. \tag{2.57}$$
Proof. We denote by $m_{j,\pm}(z)$ the half-line (Dirichlet) m-functions associated with Hj on [0, ±∞), j = 1, 2. Then a straightforward combination of (2.8) and (2.56) yields
$$|m_{1,\pm}(z) - m_{2,\pm}(z)| \underset{|z|\to\infty}{=} O\big(|z|\,e^{-2\,\mathrm{Im}(z^{1/2})a}\big) \tag{2.58}$$
and hence (2.57), applying Theorem 1.1 separately to the two half-lines [0, ∞) and (−∞, 0] (and using the argument following (2.41)). □
Finally, the reader might be interested in the analog of Theorem 1.1 in the case of second-order difference operators, that is, Jacobi operators. Let A be a bounded self-adjoint Jacobi operator in ℓ²(N0) (N0 = N ∪ {0}) of the type
$$A = \begin{pmatrix} b_0 & a_0 & 0 & 0 & \cdots \\ a_0 & b_1 & a_1 & 0 & \cdots \\ 0 & a_1 & b_2 & a_2 & \cdots \\ 0 & 0 & a_2 & \ddots & \ddots \\ \vdots & \vdots & \vdots & \ddots & \ddots \end{pmatrix}, \quad a_k > 0,\ b_k \in \mathbb{R},\ k \in \mathbb{N}_0. \tag{2.59}$$
The corresponding m-function of A is then defined by
$$m(z) = (\delta_0, (A - z)^{-1}\delta_0) = \int_{\mathbb{R}} d\rho(\lambda)\,(\lambda - z)^{-1}, \quad z \in \mathbb{C}\setminus\mathbb{R}, \tag{2.60}$$
where δ0 = (1, 0, 0, . . . ). The analog of Theorem 1.1 in the discrete case then reads as follows. Denote by mj(z) the m-functions for two self-adjoint Jacobi operators Aj, j = 1, 2, denoting the matrix elements of Aj by $a_{j,k}$, $b_{j,k}$, j = 1, 2, k ∈ N0. Then
$$|m_1(z) - m_2(z)| \underset{|z|\to\infty}{=} O(|z|^{-N}) \tag{2.61}$$
for some N ∈ N, N ≥ 3, if and only if
$$a_{1,k} = a_{2,k},\quad b_{1,k} = b_{2,k},\quad 0 \le k \le \frac{N-4}{2} \quad \text{if } N \text{ is even } (N \ge 4) \tag{2.62}$$
and
$$a_{1,k} = a_{2,k},\ \ 0 \le k \le \frac{N-5}{2}, \qquad b_{1,k} = b_{2,k},\ \ 0 \le k \le \frac{N-3}{2} \quad \text{if } N \text{ is odd.} \tag{2.63}$$
The proof is clear from (2.60) and the well-known formulas (cf. [4, Sect. VII.1])
$$a_k = \int_{\mathbb{R}} d\rho(\lambda)\,\lambda P_k(\lambda)P_{k+1}(\lambda), \qquad b_k = \int_{\mathbb{R}} d\rho(\lambda)\,\lambda P_k(\lambda)^2, \quad k \in \mathbb{N}_0, \tag{2.64}$$
where $\{P_k(\lambda)\}_{k\in\mathbb{N}_0}$ is an orthonormal system of polynomials with respect to the spectral measure dρ, with $P_k(z)$ of degree k in z, $P_0(z) = 1$.
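The discrete statement (2.61)–(2.63) can be illustrated numerically. The sketch below is only a sanity check under stated assumptions: it uses finite 12 × 12 truncations of two Jacobi matrices (the text treats bounded operators on ℓ²(N0)), and all function names are hypothetical. It relies on the expansion $m(z) = -\sum_{n\ge 0}\mu_n z^{-n-1}$ with moments $\mu_n = (\delta_0, A^n\delta_0)$, so that $|m_1 - m_2| = O(|z|^{-N})$ corresponds to agreement of the moments up to order N − 2. The two matrices share all $a_k$ and differ only in $b_2$, so by (2.62) one expects N = 6, that is, the first differing moment should be $\mu_5$.

```python
import numpy as np

def jacobi(a, b):
    """Finite Jacobi matrix: diagonal b (length n), off-diagonal a (length n-1)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.diag(b) + np.diag(a, 1) + np.diag(a, -1)

def moments(A, n_max):
    """Moments mu_n = (delta_0, A^n delta_0) for n = 0, ..., n_max."""
    w = np.zeros(A.shape[0])
    w[0] = 1.0
    mu = []
    for _ in range(n_max + 1):
        mu.append(w[0])   # first entry of A^n delta_0
        w = A @ w
    return np.array(mu)

def first_mismatch(A1, A2, n_max=8, tol=1e-12):
    """Index of the first moment in which A1 and A2 differ (None if none up to n_max)."""
    diff = np.abs(moments(A1, n_max) - moments(A2, n_max)) > tol
    return int(np.argmax(diff)) if diff.any() else None

def m_function(A, z):
    """m(z) = (delta_0, (A - z)^{-1} delta_0) for a finite matrix A."""
    n = A.shape[0]
    e0 = np.zeros(n, dtype=complex)
    e0[0] = 1.0
    return np.linalg.solve(A - z * np.eye(n), e0)[0]

# Illustrative data (an assumption, not from the text): equal a_k, b differs first at k = 2.
n = 12
a = np.ones(n - 1)
b1 = np.zeros(n)
b2 = b1.copy()
b2[2] = 0.5
A1, A2 = jacobi(a, b1), jacobi(a, b2)

n_star = first_mismatch(A1, A2)
R = 50.0
scaled_gap = abs(m_function(A1, 1j * R) - m_function(A2, 1j * R)) * R ** (n_star + 1)
print("first differing moment:", n_star)
print("|m1 - m2| * |z|^N at z = 50i:", scaled_gap)
```

The truncation size does not affect the low moments used here, since a path of length n starting at site 0 cannot reach beyond site n/2.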
3. Matrix-Valued Schrödinger Operators
In our final section we extend Theorem 1.1 to matrix-valued potentials (cf. [6, Ch. III], [17,22] and the references therein). Let m ∈ N and denote by $I_m$ the identity matrix in $\mathbb{C}^m$. Assuming
$$Q = Q^* \in L^1([0, R])^{m\times m} \ \text{ for all } R > 0, \tag{3.1}$$
we introduce the corresponding matrix-valued self-adjoint Schrödinger operator H in $L^2([0, \infty))^m$ with a Dirichlet boundary condition at x = 0₊, by
$$H = -\frac{d^2}{dx^2} I_m + Q,$$
$$\mathrm{dom}(H) = \{g \in L^2([0, \infty))^m \mid g, g' \in AC([0, R])^m \text{ for all } R > 0;\ g(0_+) = 0,\ \text{s.-a. b.c. at } \infty;\ (-g'' + Qg) \in L^2([0, \infty))^m\}. \tag{3.2}$$
Here “s.-a. b.c. at ∞” again denotes a self-adjoint boundary condition at ∞ (if Q is not in the limit point case at ∞). For more details about the limit point/limit circle and all the intermediate cases, see [7,13–16,18,19,28,32] and the references therein. Next, let Ψ(z, x) be the unique (up to right multiplication by non-singular constant m × m matrices) m × m matrix-valued Weyl solution associated with H, satisfying
$$\Psi(z, \cdot\,) \in L^2([0, \infty))^{m\times m}, \quad z \in \mathbb{C}\setminus\mathbb{R}, \tag{3.3}$$
$$\Psi(z, x) \text{ satisfies the s.-a. b.c. of } H \text{ at } \infty \text{ (if any)}, \tag{3.4}$$
$$-\Psi''(z, x) + [Q(x) - zI_m]\Psi(z, x) = 0. \tag{3.5}$$
The m × m matrix-valued Weyl–Titchmarsh function M(z) associated with H is then defined by
$$M(z) = \Psi'(z, 0_+)\Psi(z, 0_+)^{-1}, \quad z \in \mathbb{C}\setminus\mathbb{R} \tag{3.6}$$
and similarly, we introduce its x-dependent version, M(z, x), by
$$M(z, x) = \Psi'(z, x)\Psi(z, x)^{-1}, \quad z \in \mathbb{C}\setminus\mathbb{R},\ x \ge 0. \tag{3.7}$$
The matrix Riccati equation satisfied by M(z, x), the analog of (2.10), then reads
$$M'(z, x) + M(z, x)^2 = Q(x) - zI_m \ \text{ for a.e. } x \ge 0 \text{ and all } z \in \mathbb{C}\setminus\mathbb{R}. \tag{3.8}$$
Next, let Qj(x), j = 1, 2 be two self-adjoint matrix-valued potentials satisfying (3.1), and Mj(z), Mj(z, x) the Weyl–Titchmarsh matrices associated with the corresponding (Dirichlet) Schrödinger operators. Then the analog of (2.11) is of the form
$$\begin{aligned}[] [M_1(z, x) - M_2(z, x)]' = Q_1(x) - Q_2(x) &- \tfrac{1}{2}[M_1(z, x) + M_2(z, x)][M_1(z, x) - M_2(z, x)] \\ &- \tfrac{1}{2}[M_1(z, x) - M_2(z, x)][M_1(z, x) + M_2(z, x)]. \end{aligned} \tag{3.9}$$
Combining (3.9) with the elementary fact that any m × m matrix-valued solution U(x) of
$$U'(x) = B(x)U(x) + U(x)B(x) \tag{3.10}$$
is of the form
$$U(x) = V(x)\,C\,W(x), \tag{3.11}$$
where C is a constant m × m matrix and V(x), respectively, W(x), is a fundamental system of solutions of R′(x) = B(x)R(x), respectively, S′(x) = S(x)B(x), one can prove the analogs of Theorems 2.1–2.3 in the present matrix context. More precisely, the matrix analogs of Theorems 2.1 and 2.2 follow from Theorem 4.8 in [7]. The corresponding analog of Theorem 2.3 follows from Theorem 4.5 and Remark 4.7 in [7]. Moreover, in the case that H is bounded from below, Lemma 2.4 generalizes to the matrix-valued context and hence permits one to take the limit z ↓ −∞ in the matrix analog of (2.13). While the scalar case treated in detail in [11] is based on Riccati-type identities such as (2.11) and an a priori bound of the type (2.9) inspired by Atkinson’s 1981 paper [2], the matrix-valued case discussed in depth in [7] is based on corresponding Riccati-type identities such as (3.9) and an a priori bound of the type
$$M(z, x) = iz^{1/2}I_m + o(|z|^{1/2}) \tag{3.12}$$
first obtained by Atkinson in an unpublished manuscript [3]. In the special case of short-range matrix-valued potentials Q(x), m × m matrix analogs of the Jost solution F(z, x) as well as the transformation kernel K(x, y) associated with H as in (2.17)–(2.21) (replacing | · | by an appropriate matrix norm $\|\cdot\|_{\mathbb{C}^{m\times m}}$), have been discussed in great detail in the classical 1963 monograph by Agranovich and Marchenko [1, Ch. I]. Moreover, (2.22)–(2.24) trivially extend to the matrix case. Given these preliminaries, the analog of Theorem 1.1 and Corollary 2.6 reads as follows in the matrix-valued context.
Theorem 3.1. Let a > 0, 0 < ε < π/2 and suppose
$$\|M_1(z) - M_2(z)\|_{\mathbb{C}^{m\times m}} \underset{|z|\to\infty}{=} O\big(e^{-2\,\mathrm{Im}(z^{1/2})a}\big) \tag{3.13}$$
along the ray arg(z) = π − ε. Then
$$Q_1(x) = Q_2(x) \ \text{ for a.e. } x \in [0, a]. \tag{3.14}$$
In particular, if (3.13) holds for all a > 0, then Q1 = Q2 a.e. on [0, ∞).
Sketch of Proof. As in the scalar case, we may assume without loss of generality that
$$\mathrm{supp}(Q_j) \subseteq [0, a], \quad j = 1, 2. \tag{3.15}$$
The fundamental identity (2.27), in the present non-commutative case, needs to be replaced by
$$\frac{d}{dx} W(F_1(\bar z, x)^*, F_2(z, x)) = -F_1(\bar z, x)^*[Q_1(x) - Q_2(x)]F_2(z, x), \tag{3.16}$$
where Fj(z, x) denote the m × m matrix-valued Jost solutions associated with Qj, j = 1, 2, and W(F, G)(x) = F(x)G′(x) − F′(x)G(x) the matrix-valued Wronskian of m × m matrices F and G. Identity (2.28) then becomes
$$\int_0^a dx\, F_1(\bar z, x)^*[Q_1(x) - Q_2(x)]F_2(z, x) = F_1(\bar z, x)^*[M_1(z, x) - M_2(z, x)]F_2(z, x)\Big|_{x=0}^{a}, \tag{3.17}$$
utilizing the fact
$$M_1(\bar z, x)^* = M_1(z, x). \tag{3.18}$$
Fj obeys a transformation kernel representation
$$F_j(z, x) = e^{iz^{1/2}x} I_m + \int_x^a dy\, K_j(x, y)\, e^{iz^{1/2}y} I_m, \quad \mathrm{Im}(z^{1/2}) \ge 0,\ x \ge 0,\ j = 1, 2. \tag{3.19}$$
From this, (3.12), and the hypothesis of (3.13), one concludes by (3.17) that
$$\int_0^a dx\, F_1(\bar z, x)^*[Q_1(x) - Q_2(x)]F_2(z, x) \underset{z\downarrow -\infty}{=} O\big(e^{-2\,\mathrm{Im}(z^{1/2})a}\big). \tag{3.20}$$
Now let $R_A$ be right multiplication by A on n × n matrices and $L_B$ be left multiplication by B. Then
$$\text{LHS of (3.20)} = \int_0^a dy\, \Big\{ Q_1(y) - Q_2(y) + \int_0^y dx\, L(x, y)[Q_1(x) - Q_2(x)] \Big\}\, e^{-2z^{1/2}y}, \tag{3.21}$$
where L is an operator on n × n matrices which is a sum of a left multiplication (by $2K_j(x, 2y - x)$), a right multiplication (by $2K_2(x, 2y - x)$), and a convolution of a left and right multiplication. It follows by Lemma 2.5, (3.20), and (??) that
$$Q_1(y) - Q_2(y) + \int_0^y dx\, L(x, y)[Q_1(x) - Q_2(x)] = 0. \tag{3.22}$$
This is a Volterra equation and the same argument based on
$$\int_0^y dx_1 \int_0^{x_1} dx_2 \cdots \int_0^{x_{n-1}} dx_n = \frac{y^n}{n!}$$
that a Volterra operator has zero spectral radius applies to operator-valued Volterra equations. Thus, (3.22) implies Q1(y) − Q2(y) = 0 for a.e. y ∈ [0, a]. □
Extensions of Theorem 3.1 in the spirit of Remarks 2.8–2.10 and Theorem 2.11 can be made, but we omit the corresponding details at this point.
Acknowledgements. F. G. thanks T. Tombrello for the hospitality of Caltech where this work was done.
References
1. Agranovich, Z.S. and Marchenko, V.A.: The Inverse Problem of Scattering Theory. New York: Gordon and Breach, 1963
2. Atkinson, F.V.: On the location of the Weyl circles. Proc. Roy. Soc. Edinburgh 88A, 345–356 (1981)
3. Atkinson, F.V.: Asymptotics of the Titchmarsh-Weyl function in the matrix case. Unpublished manuscript
4. Berezanskii, Ju.M.: Expansions in Eigenfunctions of Selfadjoint Operators. Providence, RI: Am. Math. Soc. 1968
5. Borg, G.: Uniqueness theorems in the spectral theory of y″ + (λ − q(x))y = 0. In: Proc. 11th Scandinavian Congress of Mathematicians. Oslo: Johan Grundt Tanums Forlag, 1952, pp. 276–287
6. Carmona, R. and Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
7. Clark, S. and Gesztesy, F.: Weyl–Titchmarsh M-function asymptotics for matrix-valued Schrödinger operators. Proc. London Math. Soc., to appear
8. Everitt, W.N.: On a property of the m-coefficient of a second-order linear differential equation. J. London Math. Soc. 4, 443–457 (1972)
9. Gel’fand, I.M. and Levitan, B.M.: On the determination of a differential equation from its spectral function. Izv. Akad. Nauk SSR. Ser. Mat. 15, 309–360 (1951) (Russian); English transl. in Am. Math. Soc. Transl. Ser. 2 1, 253–304 (1955)
10. Gesztesy, F. and Simon, B.: Uniqueness theorems in inverse spectral theory for one-dimensional Schrödinger operators. Trans. Am. Math. Soc. 348, 349–373 (1996)
11. Gesztesy, F. and Simon, B.: A new approach to inverse spectral theory, II. General real potentials and the connection to the spectral measure. Ann. of Math., to appear
12. Gesztesy, F. and Simon, B.: In preparation
13. Hinton, D.B. and Shaw, J.K.: On Titchmarsh–Weyl M(λ)-functions for linear Hamiltonian systems. J. Diff. Eqs. 40, 316–342 (1981)
14. Hinton, D.B. and Shaw, J.K.: Hamiltonian systems of limit point or limit circle type with both endpoints singular. J. Diff. Eqs. 50, 444–464 (1983)
15. Hinton, D.B. and Shaw, J.K.: On boundary value problems for Hamiltonian systems with two singular points. SIAM J. Math. Anal. 15, 272–286 (1984)
16. Kogan, V.I. and Rofe-Beketov, F.S.: On square-integrable solutions of symmetric systems of differential equations of arbitrary order. Proc. Roy. Soc. Edinburgh 74A, 1–40 (1974)
17. Kotani, S. and Simon, B.: Stochastic Schrödinger operators and Jacobi matrices on the strip. Commun. Math. Phys. 119, 403–429 (1988)
18. Krall, A.M.: M(λ) theory for singular Hamiltonian systems with one singular point. SIAM J. Math. Anal. 20, 664–700 (1989)
19. Krall, A.M.: M(λ) theory for singular Hamiltonian systems with two singular points. SIAM J. Math. Anal. 20, 701–715 (1989)
20. Krein, M.G.: Solution of the inverse Sturm-Liouville problem. Doklady Akad. Nauk SSSR 76, 21–24 (1951) (Russian)
21. Krein, M.G.: On the transfer function of a one-dimensional boundary problem of second order. Doklady Akad. Nauk SSSR 88, 405–408 (1953) (Russian)
22. Lacroix, J.: The random Schrödinger operator in a strip. In: Probability Measures on Groups VII, H. Heyer (ed.), Lecture Notes in Math. 1064, Berlin: Springer, 1984, pp. 280–297
23. Levitan, B.M.: Inverse Sturm-Liouville Problems. Utrecht: VNU Science Press, 1987
24. Levitan, B.M. and Gasymov, M.G.: Determination of a differential equation by two of its spectra. Russ. Math. Surveys 19:2, 1–63 (1964)
25. Marchenko, V.A.: Certain problems in the theory of second-order differential operators. Doklady Akad. Nauk SSSR 72, 457–460 (1950) (Russian)
26. Marčenko, V.A.: Some questions in the theory of one-dimensional linear differential operators of the second order. I. Trudy Moskov. Mat. Obšč. 1, 327–420 (1952) (Russian); English transl. in Am. Math. Soc. Transl. (2) 101, 1–104 (1973)
27. Marchenko, V.A.: Sturm-Liouville Operators and Applications. Basel: Birkhäuser, 1986
28. Orlov, S.A.: Nested matrix disks analytically depending on a parameter, and theorems on the invariance of ranks of radii of limiting disks. Math. USSR Izv. 10, 565–613 (1976)
29. Pólya, G. and Szegő, G.: Problems and Theorems in Analysis I. Berlin: Springer, 1972
30. Ramm, A.G.: Property C for ODE and applications to inverse scattering. Z. angew. Analysis 18, 331–348 (1999)
31. Ramm, A.G.: Property C for ODE and applications to inverse problems. Preprint, 1999
32. Rofe-Beketov, F.S.: Selfadjoint extensions of differential operators in a space of vector functions. Sov. Math. Dokl. 10, 188–192 (1969)
33. Simon, B.: A new approach to inverse spectral theory, I. Fundamental formalism. Ann. of Math. 150, 1029–1057 (1999)
34. Tikhonov, A.N.: On the uniqueness of the problem of electric prospecting. Doklady Akad. Nauk SSSR 69, 797–800 (1949) (Russian)
Communicated by A. Jaffe
Commun. Math. Phys. 211, 289 – 302 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Invariant Tori in Hamiltonian Systems with Impacts Vadim Zharnitsky Division of Applied Mathematics, Brown University, Providence, RI 02912, USA. E-mail:
[email protected] Received: 8 September 1999 / Accepted: 16 November 1999
Abstract: It is shown that a large class of solutions in two-degree-of-freedom Hamiltonian systems of billiard type can be described by slowly varying one-degree-of-freedom Hamiltonian systems. Under some non-degeneracy conditions such systems are found to possess a large set of quasiperiodic solutions filling out two dimensional tori, which correspond to caustics in the classical billiard. This provides a unified proof of existence of quasiperiodic solutions in convex billiards and other systems with impacts including classical billiard in electric and magnetic fields, dual billiard, and Fermi–Ulam systems. 1. Introduction 1.1. Billiards and systems with impacts. The classical billiard system describes the free motion of a particle in a planar region bounded by a closed curve. The particle moves along a straight line and is reflected from the boundary according to the rule “the angle of reflection equals the angle of incidence”. The systematic study of classical billiards was started by Birkhoff to illustrate and develop certain concepts in the theory of Hamiltonian Dynamical systems with two degrees of freedom [5]. Since then the billiard has become a basic model in such diverse fields as Foundations of Statistical Mechanics, Ergodic Theory, Quantum Chaos, etc. The billiards represent the simplest systems in Classical Mechanics which still exhibit any kind of behavior observed in two-degree-of-freedom Hamiltonian systems. One of the most important results concerning billiards was the discovery of caustics clustering at the boundary of a smooth convex billiard with non-vanishing curvature by Lazutkin [16,17]. In geometric optics a caustic is defined as the envelope of a light ray trajectory, so that any ray tangent to a caustic remains tangent to it after reflection from the boundary. The presence of a caustic implies non-ergodic behavior as the corresponding invariant curve (family of rays tangent to a caustic) separates the phase space into invariant components. 
On the other hand, caustics can be used to estimate eigenvalues and to construct quasimodes in the corresponding Dirichlet problem, see [16].
In his original proof Lazutkin showed the existence of caustics by reducing the billiard ball map¹ to a near integrable form, and by applying KAM theory to obtain invariant curves. The near integrable behavior in the vicinity of the boundary of a smooth convex billiard can be anticipated by observing that a trajectory nearly tangent to the boundary will experience many collisions with it before the curvature significantly changes. This observation raises the hope of introducing different “time scales” and obtaining an adiabatic invariant. However, in the proofs by Lazutkin and others, see e.g. [3,6,17,18], this simple physical intuition is hidden because the billiard ball map is used. In this paper, we prove the existence of caustics in systems of billiard type by using Arnold’s result on the existence of invariant tori in smooth slowly varying oscillatory Hamiltonian systems [2], which allows us to make the above physical argument rigorous without sacrificing its clarity. In order to apply Arnold’s approach to the systems of billiard type, we use the Hamiltonian formalism for the systems with unilateral constraint developed in [13] and in subsequent papers by Markeev, Ivanov and their coauthors. The main idea of their approach is that the Hamiltonian systems with impacts can be largely treated as smooth Hamiltonian systems. Here, we start with the “billiard” Hamiltonian, which is nonsmooth at the boundary surface, and following Markeev [23] we apply the isoenergetic reduction in the presence of a unilateral constraint. After carrying out a canonical rescaling we obtain a slowly periodically varying Hamiltonian system with one degree of freedom. The obtained nonsmooth Hamiltonian function is reduced to a near integrable form following the well known procedure, see e.g. [2,20]. The vector field generated by the near integrable Hamiltonian induces a near integrable mapping on the surface of the section corresponding to the boundary.
This map is smooth (for the billiard flow is smooth away from the boundary) and satisfies the conditions of Moser’s small twist theorem, which implies the existence of invariant curves corresponding to caustics. In a similar situation, KAM theory has been applied to a system with unilateral constraint in [22]. To summarize, we have applied the approach in [2], using a non-smooth version of the Hamiltonian formalism from [13] to the billiard systems. This is possible because the averaging transformations can be applied to nonsmooth Hamiltonian functions. Next, this approach is applied to a larger class of non-smooth Hamiltonian functions (Subsect. 1.3), which provides a unified stability proof for various systems of billiard type such as the Fermi–Ulam oscillator, dual billiard, and billiard in magnetic and electric fields (Sect. 2). Finally, in Sect. 3, we provide an example of instability due to Halpern [12] in the framework of the Hamiltonian approach for non-smooth systems [13] and give an improved criterion for the billiard flow to be well defined. Whenever appropriate, we assume that Hamiltonian functions are analytic in order not to burden the exposition with the estimates. All statements can be extended to finitely differentiable functions.
¹ The map that associates to an outgoing ray’s pair of the reflection point and the angle of reflection the corresponding data after the next reflection, see e.g. [14].
1.2. Caustics in convex billiards. In this section, we use the above approach to give a new proof of Lazutkin’s theorem on existence of caustics in convex billiards. Introducing the boundary coordinates (r, s) as in [17], where r is the distance from the boundary and s is the natural parameter along the boundary curve, we obtain the Lagrangian of a free particle in the new coordinates²
$$L = \frac{\dot q_1^2}{2} + \frac{\dot q_2^2}{2} = \frac{\dot r^2}{2} + \frac{\dot s^2}{2}(1 - k(s)r)^2.$$
Fig. 1. The trajectory of the billiard ball in the configuration space and in the reduced phase space
Carrying out the Legendre transformation we obtain the Hamiltonian
$$H = \frac{p_r^2}{2} + \frac{p_s^2}{2(1 - k(s)r)^2}, \tag{1}$$
which also describes the motion away from the boundary. Following Ivanov and Markeev [13], we modify the Hamiltonian so that it would describe full dynamics, including the impacts with the boundary, by letting r assume negative values and substituting |r| instead of r,
$$H = \frac{p_r^2}{2} + \frac{p_s^2}{2(1 - k(s)|r|)^2}. \tag{2}$$
The equivalence of these systems is easy to see from Fig. 1, see also [6], where such a description is used for a billiard problem. Since the system is autonomous, the energy does not change and, therefore, it is reasonable to carry out isoenergetic reduction. Using an invariant relation of the Hamiltonian vector field with the 1-form $p_r\,dr + p_s\,ds - H\,dt = p_r\,dr + H\,d(-t) - (-p_s)\,ds$, we choose $K = -p_s$ as the new Hamiltonian and s as the new time
$$K = -(1 - k(s)|r|)\sqrt{2H - p_r^2}, \tag{3}$$
where the sign of the square root is chosen to be positive, which corresponds to the motion in the positive direction of s. Thus, we have obtained a one-degree-of-freedom nonautonomous system. Taking H = 1/2 and rescaling the system³
$$K = \varepsilon^2 F, \quad p_r = \varepsilon P_R, \quad s = \varepsilon S, \quad r = \varepsilon^2 R,$$
we obtain the new Hamiltonian
$$F = -(\varepsilon^{-2} - k(\varepsilon S)|R|)\sqrt{1 - \varepsilon^2 P_R^2}. \tag{4}$$
Expanding the square root in Taylor series and discarding the additive constant $-\varepsilon^{-2}$, which does not affect the equations of motion, we obtain
$$F = \frac{P_R^2}{2} + k(\varepsilon S)|R| + \varepsilon^2 F_2(|R|, P_R, \varepsilon S, \varepsilon), \tag{5}$$
where $F_2(|R|, P_R, \varepsilon S, \varepsilon)$ is a real analytic function for ($R \ne 0$, $|P_R| < \varepsilon^{-1}$).
² This change of coordinates is well defined for $r < 1/k_{\max} < 1/k$, but since we are interested in the solutions staying near the boundary, this is not a serious restriction.
³ The rescaling is motivated by an elementary geometric observation: if the angle of reflection is of order ε (i.e. $p_r \sim \varepsilon$) then the arc length between the two successive collisions is also of order ε (i.e. $s \sim \varepsilon$) and the light ray stays ε²-close to the boundary (i.e. $r \sim \varepsilon^2$).
Using more convenient notation R = x, $P_R$ = y, S = t, F = H, we rewrite the Hamiltonian
$$H = \frac{y^2}{2} + k(\lambda)|x| + \varepsilon^2 H_2(|x|, y, \lambda, \varepsilon), \tag{6}$$
where λ = εt. The leading term of the Hamiltonian represents a slowly varying one-degree-of-freedom Hamiltonian system, which corresponds to the system of a bouncing ball in a slowly varying gravity field. If the Hamiltonian function were smooth we would be able to apply Arnold’s theorem on perpetual conservation of adiabatic invariant [2]. But since the Hamiltonian is not smooth in x we will follow the reduction procedure in [2], pointing out how it modifies for the non-smooth case according to [22]. We first apply the action-angle transformation as in [22] (x, y, λ) → (φ, I, λ) with the Hamiltonian $H_0 = y^2/2 + k(\lambda)|x|$. The details of the derivation can be found in [1] or in the next section. The action variable I is given by the area enclosed with the corresponding trajectory of the autonomous system $H_0(|x|, y, \lambda)$ with the frozen parameter λ,
$$I(x, y, \lambda) = I(H(x, y, \lambda), \lambda) = 4\int_0^{H/k} \sqrt{2}\,\sqrt{H - kx}\,dx = \frac{2\sqrt{2}}{3k} H^{3/2}. \tag{7}$$
The angular variable φ(x, y, λ) is given by the time it takes the solution in the autonomous system to move from the section (x = 0, y > 0) to (x, y), divided by the period of one revolution T(H, λ) in the autonomous system. The new Hamiltonian takes the form
$$H = \left(\frac{3}{2\sqrt{2}}\,k(\varepsilon t)\,I\right)^{2/3} + \varepsilon H_1(I, |\varphi|, \varepsilon t, \varepsilon), \tag{8}$$
where $H_1$ is real analytic (see the next section). Because of the reflectional symmetry in the system and our choice of φ = 0 at x = 0, y > 0 the Hamiltonian depends on |φ|. Even though the obtained Hamiltonian is a small perturbation of an integrable one, KAM theory still cannot be applied due to the explicit time dependence in the leading term. Now, we make the leading term of the Hamiltonian time independent. Since the integral curves of the Hamiltonian system are invariantly associated with the differential form
$$I\,d\varphi - H(I, \varphi, \lambda)\,dt = -\frac{1}{\varepsilon}\{H\,d\lambda - \varepsilon I(H, \lambda, \varphi, \varepsilon)\,d\varphi\},$$
where
$$I(H, \lambda, \varphi, \varepsilon) = \frac{2\sqrt{2}}{3}\,\frac{H^{3/2}}{k(\lambda)} + \varepsilon I_1(H, \lambda, \varphi, \varepsilon) \tag{9}$$
is the inverse function of (8), we can choose (εI(H, λ, φ, ε), φ, H, λ) as a new Hamiltonian, time, momentum, and position, respectively. Now, we introduce a linear “time-dependent” transformation which will make the leading term of the Hamiltonian εI(H, λ, φ, ε) λ-independent⁴,
$$h = \frac{\langle k^{2/3}\rangle}{k^{2/3}(\lambda)}\,H, \qquad \tau = \frac{1}{\langle k^{2/3}\rangle}\int_0^{\lambda} k^{2/3}(\lambda)\,d\lambda.$$
The new Hamiltonian takes the form
$$J(h, \tau, \varphi, \varepsilon) = \varepsilon J_0(h) + \varepsilon^2 J_1(h, \tau, |\varphi|, \varepsilon),$$
where
$$J_0(h) = \frac{2\sqrt{2}}{3}\,\frac{h^{3/2}}{\langle k^{2/3}\rangle^{3/2}}$$
and $J_1$ is 1-periodic in τ. The corresponding equations of motion take the form
$$\frac{d\tau}{d\varphi} = \varepsilon J_0'(h) + \varepsilon^2 \frac{\partial J_1}{\partial h}(h, \tau, |\varphi|, \varepsilon), \qquad \frac{dh}{d\varphi} = -\varepsilon^2 \frac{\partial J_1}{\partial \tau}(h, \tau, |\varphi|, \varepsilon).$$
Finally, we let $\nu = J_0'(h)$ and integrate the equations of motion on φ ∈ (0, 1/2) obtaining a monotone twist map
$$\begin{cases} \tau_1 = \tau_0 + \varepsilon\nu_0 + \varepsilon^2 Q_1(\tau_0, \nu_0, \varepsilon) \\ \nu_0 = \nu_1 + \varepsilon^2 Q_2(\tau_0, \nu_0, \varepsilon), \end{cases} \tag{10}$$
which satisfies the conditions of Moser’s small twist theorem [24]. Applying the theorem in the annulus 1 ≤ ν ≤ 2 we obtain a large set of invariant circles and the measure of the complement of their union tends to zero as ε → 0. Retracing the transformations we find that the subset of the billiard table
$$U^{\varepsilon} = \{(r, s) \in \mathbb{R}^+ \times S^1 \mid \varepsilon^2 C_1 k^{-1/3}(s) \le r \le \varepsilon^2 C_2 k^{-1/3}(s)\},$$
where 0 < C1 < C2, is filled with caustics and the relative measure of the complement of their union tends to zero as ε → 0. Now, consider a sequence of subsets of the billiard table $U^n = U^{\varepsilon_n}$, n ∈ N, where $\varepsilon_n = (\sqrt{C_1/C_2})^n \varepsilon_0$. It is easy to see that $\bigcup_{n\ge 1} U^n$ is a neighborhood of the boundary and $\varepsilon_n \to 0$. Therefore, the caustics accumulate at the boundary and the relative measure of the complement of their union goes to zero near the boundary.
⁴ It is obtained by taking $H = ck^{2/3}h$, so that the leading term in (9) would be λ-independent, using invariance of the 2-form dH ∧ dλ = dh ∧ dτ, and applying a normalized periodicity condition: τ = 0 if λ = 0 and τ = 1 if λ = 1.
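Two steps of the reduction in this section lend themselves to a quick numerical sanity check. The sketch below is illustrative only and rests on assumptions that are not in the text: the curvature profile `curvature` is a hypothetical 1-periodic function, and all function names are made up for the example. The first check confirms that the rescaled Hamiltonian (4), after the constant $-\varepsilon^{-2}$ is set aside, differs from the leading part $P_R^2/2 + k|R|$ of (5) by a term of order ε². The second check confirms that the transformation $(H, \lambda) \to (h, \tau)$ sends λ = 1 to τ = 1 and makes the leading term of (9) independent of λ, equal to $J_0(h)$.

```python
import math

def F_exact(P, R, k, eps):
    """Rescaled Hamiltonian (4) at frozen curvature k."""
    return -(eps ** -2 - k * abs(R)) * math.sqrt(1.0 - (eps * P) ** 2)

def remainder(P, R, k, eps):
    """F + eps^-2 minus the leading part P^2/2 + k|R| of (5); expected to be O(eps^2)."""
    return F_exact(P, R, k, eps) + eps ** -2 - (0.5 * P ** 2 + k * abs(R))

def curvature(lam):
    """Hypothetical positive 1-periodic curvature k(lambda) (an assumption)."""
    return 1.0 + 0.3 * math.cos(2.0 * math.pi * lam)

def integral_k23(lam, n=4000):
    """Midpoint-rule approximation of the integral of k^(2/3) over [0, lam]."""
    step = lam / n
    return step * sum(curvature((j + 0.5) * step) ** (2.0 / 3.0) for j in range(n))

MEAN_K23 = integral_k23(1.0)          # <k^(2/3)>, the average over one period

def tau(lam):
    """New position variable tau(lambda)."""
    return integral_k23(lam) / MEAN_K23

def leading_action(H, lam):
    """Leading term of (9): (2 sqrt(2) / 3) H^(3/2) / k(lambda)."""
    return (2.0 * math.sqrt(2.0) / 3.0) * H ** 1.5 / curvature(lam)

def J0(h):
    """Lambda-independent leading term after the transformation."""
    return (2.0 * math.sqrt(2.0) / 3.0) * h ** 1.5 / MEAN_K23 ** 1.5

# Check 1: remainder / eps^2 stays bounded as eps decreases.
for eps in (1e-1, 1e-2):
    print("eps =", eps, " remainder / eps^2 =", remainder(1.0, 0.5, 2.0, eps) / eps ** 2)

# Check 2: tau(1) = 1 and the leading action depends on h only.
h = 1.3
print("tau(1) =", tau(1.0))
for lam in (0.0, 0.25, 0.6):
    H = curvature(lam) ** (2.0 / 3.0) * h / MEAN_K23   # H = k^(2/3) h / <k^(2/3)>
    print("lambda =", lam, " leading action =", leading_action(H, lam), " J0(h) =", J0(h))
```

Pure standard-library arithmetic is used so that the cancellation of the large constant $\varepsilon^{-2}$ remains visible; for much smaller ε a higher-precision evaluation would be needed.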
1.3. Invariant tori in the systems of billiard type. In this section we state and prove the theorem on existence of invariant tori in the slowly periodically varying Hamiltonian systems generalizing (6) and to which many systems of billiard type can be reduced. The proof follows the argument in the preceding section. We assume that a slowly periodically varying oscillatory conservative system is described by the Hamiltonian function $H = H_0(|x|, y, \lambda) + \varepsilon H_1(|x|, y, \lambda, \varepsilon)$ (λ = εt). Introducing the action-angle variables for $H = H_0(|x|, y, \lambda)$ as before, we let I(x, y, λ) be the area enclosed by the curve $H_0(|\xi|, \eta, \lambda) = H_0(|x|, y, \lambda)$ and φ(x, y, λ) be the time it takes the solution with $H = H_0(x, y, \lambda)$ to travel from ξ = 0 to ξ = x in the autonomous system $H_0(|\xi|, \eta, \lambda)$ divided by the period T(H, λ). More formally,
$$I(x, y, \lambda) = I(H(x, y, \lambda), \lambda) = \oint_{H = H_0(|x|, y, \lambda)} y\,dx, \tag{11}$$
and in a neighborhood of $(x_0, y_0, \lambda_0)$ the transformation can be obtained from the generating function
$$S(x, I, \lambda) = \int_{C(x, x_0)} y\,dx, \tag{12}$$
where C is a part of the level curve $H_0(|x|, y, \lambda) = H_0(I, \lambda)$ and $H_{0y}(|x_0|, y_0, \lambda_0) \ne 0$. If, however, $H_{0y}(|x_0|, y_0, \lambda_0) = 0$, then the generating function S(y, I, λ) has to be used. The generating function and the action-angle transformation are obtained using the relation between the 1-form $y\,dx - H\,dt = I\,d\varphi - K\,dt + dS = -\varphi\,dI - K\,dt + dS(x, I, t)$ and the vector field [1]. Indeed, for fixed I and frozen λ we obtain (12). Using the above differential relation we also obtain $\varphi = \frac{\partial S}{\partial I}(x, I, t)$ and since φ ∈ [0, 1] is an angle we obtain (11) (i.e. S must increase by the value of I over each rotation). The new Hamiltonian takes the form
$$H = H_0(I, \lambda) + \varepsilon H_1(I, \varphi, \lambda, \varepsilon) + \varepsilon S_\lambda(x(I, \varphi, \lambda), I, \lambda) = H_0(I, \lambda) + \varepsilon \tilde H_1(I, \varphi, \lambda, \varepsilon). \tag{13}$$
Theorem 1.1. Assume that the surfaces $H_0(|x|, y, \lambda) = H_0(I, \lambda)$ are homeomorphic to 2-tori on an open interval $I \in (I_1, I_2)$, fill out an open domain Ω and the following conditions hold: H(ρ, y, λ, ε) is 1-periodic in λ and real analytic in all variables in $\Omega^+ \times [0, \varepsilon_0)$, where $\Omega^+ = \Omega \cap \{x > 0\}$,
$$\omega(I, \lambda) = \frac{\partial H_0}{\partial I}(I, \lambda) \ne 0, \qquad \frac{d\bar\omega}{dI} = \oint \frac{\partial^2}{\partial I^2} H_0(I, \lambda)\,d\lambda \ne 0, \qquad \frac{\partial H_0}{\partial y}(0, y, \lambda) \ne 0$$
everywhere in the toroidal layer $I \in (I_1, I_2)$. Then for sufficiently small ε the above layer possesses invariant tori and the relative measure of the complement of their union tends to zero as ε → 0.
Proof. We proceed as in the proof of Lazutkin’s theorem by first showing that the Hamiltonian in the action-angle variables takes the form
$$H(I, \lambda, \varphi, \varepsilon) = H_0(I, \lambda) + \varepsilon \tilde H_1(I, \lambda, |\varphi|, \varepsilon) \tag{14}$$
and is real analytic in (I, λ, |φ|, ε) if φ ≠ 0, 1/2. Then applying the Implicit Function Theorem we obtain that the inverse function
$$I(H, \lambda, \varphi, \varepsilon) = I_0(H, \lambda) + \varepsilon I_1(H, \lambda, |\varphi|, \varepsilon) \tag{15}$$
is real analytic in (H, λ, |φ|, ε) if φ ≠ 0, 1/2. Finally, carrying out an averaging transformation (H, λ) → (h, τ) similar to the one in the previous section we obtain a near integrable Hamiltonian
$$J(h, \tau, \varphi, \varepsilon) = \varepsilon J_0(h) + \varepsilon^2 J_1(h, \tau, |\varphi|, \varepsilon), \tag{16}$$
which is real analytic in (h, τ, |φ|, ε) if φ ≠ 0, 1/2. We start with
Proposition 1.1. Under the conditions of the theorem $H_0(I, \lambda)$ is an analytic function.
Proof. First, consider I(H, λ), given by the area enclosed with $H = H_0(|x|, y, \lambda)$. Fixing $H = H_0$ and $\lambda = \lambda_0$ so that $I_1 < I(H_0, \lambda_0) < I_2$ we show that I(H, λ) is analytic in a neighborhood of $(H_0, \lambda_0)$. Indeed, consider the right part of the boundary (x ≥ 0) as a union of arcs $(z_k, z_{k+1})$, where $z_k = (x_k, y_k)$, k = 1, 2, ..., N, such that $H_y(z_k) \ne 0$ and $H_x(z_k) \ne 0$ and in each arc either $H_x \ne 0$ or $H_y \ne 0$. Therefore, each arc can be represented as y(x, H, λ) or x(y, H, λ) and we have
$$\begin{aligned} \frac{1}{2}[I(H, \lambda) - I(H_0, \lambda_0)] &= \sum_{k, k+1 \in A_y} \int_{x_k}^{x_{k+1}} [y(x, H, \lambda) - y(x, H_0, \lambda_0)]\,dx \\ &+ \sum_{k, k+1 \in A_x} \int_{y_k}^{y_{k+1}} [x(y, H, \lambda) - x(y, H_0, \lambda_0)]\,dy \\ &+ \sum_{k \notin A_x \cup A_y} \int_{x_k}^{x(y_k, H, \lambda)} [y(x, H, \lambda) - y_k]\,dx, \end{aligned}$$
where $k \in A_y$ ($k \in A_x$) if the arcs having $z_k$ as a boundary point are such that $H_y \ne 0$ ($H_x \ne 0$). It is easy to see that all terms in the sums are real analytic functions. Therefore I(H, λ) is also real analytic. This implies analyticity of $H_0(I, \lambda)$ by the Implicit Function Theorem, which can be applied in Ω since $\frac{\partial H_0}{\partial I}(I, \lambda) \ne 0$. □
Now, we show that the Hamiltonian function (14) is analytic. In a neighborhood of $(x_0, y_0, \lambda_0)$ the transformation (x, y, λ) → (φ, I, λ), where $x_0 x \ge 0$ and $H_{0y}(x_0, y_0, \lambda_0) \ne 0$, is defined implicitly by
$$\varphi = \varphi_0 + \frac{\partial S_{x_0}}{\partial I}(x, I, \lambda), \qquad y = \frac{\partial S_{x_0}}{\partial x}(x, I, \lambda),$$
where the generating function $S_{x_0}$ is given by
$$S_{x_0}(x, I, \lambda) = \int_{x_0}^{x} y(|\xi|, H_0(I, \lambda), \lambda)\,d\xi.$$
In case $H_{0y}(x_0, y_0, \lambda_0) = 0$ then $H_{0x}(x_0, y_0, \lambda_0) \ne 0$⁵ and the transformation can be defined implicitly by
$$\varphi = \varphi_0 + \frac{\partial S_{y_0}}{\partial I}(y, I, \lambda), \qquad x = -\frac{\partial S_{y_0}}{\partial y}(y, I, \lambda),$$
where the generating function $S_{y_0}$ is given by
$$S_{y_0}(y, I, \lambda) = \int_{y_0}^{y} x(\eta, H_0(I, \lambda), \lambda)\,d\eta.$$
Thus, we can always choose a generating function which defines the transformation locally [1]. In a neighborhood of any point different from $(0, y_0, \lambda_0)$ the transformation is real analytic. Indeed, the generating function is analytic for it is an integral of an analytic function. If the transformation is generated by $S_{x_0}$, then we can invert the equation for φ to obtain x = x(I, φ, λ) and substitute it in the second equation for y. The obtained explicit canonical transformation is analytic and invertible for its Jacobian is equal to one. The same argument applies to the transformation generated by $S_{y_0}$. Because of the symmetry of the Hamiltonian in x, we obtain that x(I, −φ, λ) = −x(I, φ, λ), therefore |x(I, −φ, λ)| = x(I, |φ|, λ) and $y(I, \varphi, \lambda) = y(|x(I, \varphi, \lambda)|, H_0(I, \lambda), \lambda) = y(|\varphi|, I, \lambda)$. Thus, $H_1(|x|, y, \lambda, \varepsilon)$ will take the form $\varepsilon H_1(|\varphi|, I, \lambda, \varepsilon)$. Since S is antisymmetric in x and x is antisymmetric in φ then $\partial_\lambda S(x, I, \lambda)$ is symmetric in φ. Thus, we have proven that the Hamiltonian in action-angle variables is given by (14) and is real analytic. The averaging transformation (H, λ) → (h, τ) is defined similar to the action-angle transformation: h is given by the area under the curve $I = I_0(H, \lambda)$,
$$h(H, \lambda) = h(I_0(H, \lambda)) = \int_0^1 H_0(I_0(H, \lambda), \alpha)\,d\alpha,$$
and the generating function is given by
$$W(h, \lambda) = \int_0^{\lambda} H(J_0(h), \xi)\,d\xi,$$
where $J_0(h)$ is the inverse of h(I) and $\tau = \partial_h W(h, \lambda)$. It is easy to check that the transformation preserves periodicity: if (λ, H) → (τ, h) then (λ + 1, H) → (τ + 1, h). The transformation is real analytic because $H(I_0, \lambda)$ is analytic and $\partial_I H(I, \lambda) \ne 0$. Therefore, the new Hamiltonian function (16) is also real analytic, if φ ≠ 0, 1/2. Proceeding as in the previous section we obtain the map (10) which satisfies the conditions of Moser’s small twist theorem and thus, has a large set of invariant circles. The rest of the proof follows the argument at the end of the last section.
⁵ The frequency does not vanish in Ω and therefore ∇H ≠ 0.
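The non-degeneracy conditions of the theorem can be probed numerically for the model Hamiltonian $H_0 = y^2/2 + k|x|$ from (6). The sketch below is a hedged illustration, not part of the proof: it computes the area enclosed by a level curve by quadrature, checks only the scaling $I \propto H^{3/2}/k$ (the overall normalization of I plays no role in the sign conditions), and then estimates $\omega = \partial H_0/\partial I$ and the λ-average of $\partial^2 H_0/\partial I^2$ by finite differences. The curvature profile and every function name are assumptions made for the example.

```python
import numpy as np

def enclosed_area(H, k, n=200_000):
    """Area enclosed by the level curve y^2/2 + k|x| = H (midpoint rule on one quadrant)."""
    x_max = H / k
    dx = x_max / n
    x = (np.arange(n) + 0.5) * dx
    y = np.sqrt(2.0 * (H - k * x))
    return 4.0 * y.sum() * dx

# Scaling checks: area ~ H^(3/2) / k.
c = enclosed_area(1.0, 1.0)                       # normalization constant of this sketch
exponent = np.log(enclosed_area(2.0, 1.0) / c) / np.log(2.0)
k_scaling = enclosed_area(1.0, 3.0) * 3.0 / c

def H0_of_I(I, k):
    """Energy as a function of the action, obtained by inverting I = c H^(3/2) / k."""
    return (k * I / c) ** (2.0 / 3.0)

def omega(I, k, d=1e-4):
    """Frequency dH0/dI by a central difference."""
    return (H0_of_I(I + d, k) - H0_of_I(I - d, k)) / (2.0 * d)

def twist(I, k, d=1e-3):
    """Second derivative d^2 H0 / dI^2 by a central difference."""
    return (H0_of_I(I + d, k) - 2.0 * H0_of_I(I, k) + H0_of_I(I - d, k)) / d ** 2

# Average of the twist over one period of a hypothetical curvature k(lambda).
lams = (np.arange(200) + 0.5) / 200
ks = 1.0 + 0.3 * np.cos(2.0 * np.pi * lams)
mean_twist = float(np.mean([twist(1.0, k) for k in ks]))
min_omega = float(min(omega(1.0, k) for k in ks))

print("fitted exponent of H:", exponent)
print("area(1, 3) * 3 / area(1, 1):", k_scaling)
print("min frequency:", min_omega, " averaged twist:", mean_twist)
```

For this model the frequency is positive and the second derivative is negative for every λ, so the averaged twist cannot vanish; a Hamiltonian whose second derivative changes sign along the period would need the average to be checked explicitly.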
Remark 1.1. If the Hamiltonian H (|x|, y, λ) is quasiperiodic in λ then after similar transformations one obtains a monotone twist map (10) which is quasiperiodic in τ . A similar result can then be obtained using a quasiperiodic version of the monotone twist map, see e.g. [32].

Corollary 1.1. Under the conditions of the theorem the action variable is a perpetually conserved adiabatic invariant. More precisely, for any δ > 0 there exists ε0 > 0 such that if ε < ε0 and I1 + δ ≤ I0 ≤ I2 − δ, then |I (t) − I (0)| ≤ δ.

Corollary 1.2. If all conditions of the above theorem are satisfied except for periodicity of H (x, y, λ) in λ, then the action variable is an adiabatic invariant, i.e. the statement in the above Corollary is true for |t| ≤ Cε^{−1}.

2. Applications
In this section we consider various systems to which the above theorem applies.

2.1. Billiard in constant magnetic and electric fields. Billiards in a magnetic field were considered in [27] and later in [3,4]. We use the above theorem to provide a criterion of stability of the solutions near the boundary. The Lagrangian of the problem is given by
$$L = \frac{\dot x^2}{2} + \frac{\dot y^2}{2} + A(x, y)\,\dot x - W(x, y).$$
Introducing the boundary coordinates we obtain the Lagrangian⁶
$$L = \frac{\dot r^2}{2} + (1 - k(s) r)^2\, \frac{\dot s^2}{2} + M(r, s)\,\dot s - V(r, s), \tag{17}$$
where M(r, s) and V (r, s) are related to A(x, y) and W (x, y). Carrying out the Legendre transformation we obtain the Hamiltonian
$$H = \frac{p_r^2}{2} + \frac{(p_s - M)^2}{2(1 - k(s)|r|)^2} + V, \tag{18}$$
where we have exchanged r for |r| as before to account for collision with the boundary. Using the invariance of the form pr dr + ps ds − H dt = pr dr + H d(−t) − (−ps )ds we choose K = −ps as the new Hamiltonian and s as the new time
$$K = -M - (1 - k(s)|r|)\sqrt{2H - p_r^2 - 2V}, \tag{19}$$
where the square root is taken with positive sign. Expanding M in series of r we have M(r, s) = M1 (s)r + M2 (s)r² + · · · , and rescaling the system
$$K = \varepsilon^2 F, \quad p_r = \varepsilon P_R, \quad s = \varepsilon S, \quad r = \varepsilon^2 R,$$
6 We can neglect the other term linear in velocity, N ṙ, for it can be absorbed in M ṡ since addition of a full time-derivative to the Lagrangian does not affect the equations of motion.
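we obtain the new Hamiltonian
$$F = -M_1(\varepsilon S)|R| - \left(\varepsilon^{-2} - k(\varepsilon S)|R|\right)\sqrt{2H - 2V_0(\varepsilon S) - \varepsilon^2 P_R^2 - \varepsilon^2\, 2V_1(\varepsilon S)|R| + O(\varepsilon^4)} + O(\varepsilon^2). \tag{20}$$
Expanding the square root and rearranging the terms we obtain
$$F = \frac{P_R^2}{2a(\varepsilon S)} + b(\varepsilon S)|R| + \varepsilon^2 F_1(|R|, P_R, \varepsilon S, \varepsilon), \tag{21}$$
where
$$a(\varepsilon S) = \sqrt{2H - 2V_0(\varepsilon S)}, \qquad b(\varepsilon S) = k\sqrt{2H - 2V_0(\varepsilon S)} - M_1(\varepsilon S) + \frac{V_1(\varepsilon S)}{\sqrt{2H - 2V_0(\varepsilon S)}}.$$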
To keep the expression under the square root positive we require H > max V0 (s), therefore a > 0. Introducing the standard notation R = x, PR = y, S = t, and F = H we obtain
$$H = \frac{y^2}{2a(\lambda)} + b(\lambda)|x| + \varepsilon H_1(|x|, y, \lambda, \varepsilon). \tag{22}$$
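The leading part of (22), y²/(2a) + b|x| with a and b frozen, is simple enough that the area under a level curve can be checked numerically. The following is a minimal Python sketch, not part of the original argument: it assumes the action is normalized as the area under the curve y(x) = sqrt(2a(H − bx)) over 0 ≤ x ≤ H/b, and the function names and sample values are illustrative only.

```python
import math

def area_under_level_curve(H, a, b, n=200_000):
    """Midpoint-rule area under y(x) = sqrt(2a(H - b x)) for 0 <= x <= H/b,
    i.e. under the level curve y^2/(2a) + b|x| = H in the quadrant x, y >= 0."""
    x_max = H / b
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx          # midpoint of the i-th cell
        total += math.sqrt(2.0 * a * (H - b * x)) * dx
    return total

def action_closed_form(H, a, b):
    """Closed form I = 2 sqrt(2a) H^(3/2) / (3b) of the same area."""
    return 2.0 * math.sqrt(2.0 * a) * H ** 1.5 / (3.0 * b)

def energy_from_action(I, a, b):
    """Inverse relation H0(I) = (1/2) (3 b I / sqrt(a))^(2/3)."""
    return 0.5 * (3.0 * b * I / math.sqrt(a)) ** (2.0 / 3.0)
```

With, say, H = 1.7, a = 0.8, b = 1.3, the quadrature and the closed form agree to many digits, and `energy_from_action` inverts `action_closed_form`.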
Calculating the action as in the previous section
$$I = \frac{2\sqrt{2a}}{3b}\, H^{3/2},$$
we obtain
$$H_0(I, \lambda) = \frac{1}{2}\left(\frac{3b(\lambda)\, I}{\sqrt{a(\lambda)}}\right)^{2/3}.$$
Applying the theorem we obtain a large set of caustics near the boundary for sufficiently large energy, provided b(λ) > 0.

2.2. Caustics in dual billiards. Dual billiard is a dynamical system defined in the exterior X of a convex closed oriented curve Γ in the plane. If x ∈ X then P (x) = y, where [x, y] is tangent to Γ at a point O, oriented as the curve, and |x, O| = |O, y|. The stability problem for this system was formulated in [25] and studied later in [6,10,11,30], see also a recent survey by Tabachnikov [30] for more references. We use a recent result by Boyland [6], where the dual billiard map was shown to be equivalent to an impact oscillator with the Hamiltonian given by
$$K = \frac{p^2}{2} + \frac{q^2}{2} + \rho(t)|q|,$$
where ρ(t) is the curvature radius of the billiard boundary. We apply the theorem to show the existence of invariant curves in the small amplitude limit which corresponds to the caustics near the boundary. Indeed, rescaling the system
$$K = \varepsilon^2 F, \quad p = \varepsilon P, \quad s = \varepsilon S, \quad q = \varepsilon^2 Q,$$
we obtain the new Hamiltonian
$$F = \frac{P^2}{2} + \rho(\varepsilon S)|Q| + \varepsilon^2\, \frac{Q^2}{2},$$
which satisfies the conditions of the theorem, provided ρ(s) > 0 and analytic.

2.3. Fermi–Ulam problem. The problem of stability of a ball bouncing elastically between two walls, one at rest and the other oscillating periodically, was introduced by Fermi in order to explain the origin of the high-energy cosmic radiation [9]. It was further developed by Ulam [31] and others [8,29,18,21]. We slightly generalize the problem by assuming an additional analytic time-dependent potential field. Therefore, in our problem the particle travels between two walls, one at x = 0, the other at x = p(t), according to ẍ + V′(x, t) = 0. Now, we use the transformation stopping the wall, which originated in the theory of the heat equation to solve free boundary problems and was later used for the quantum Fermi–Ulam problem, see [29] and references therein. Introducing the new variable and the new time
$$x = p(t)\, y, \qquad \tau = \int_0^t \frac{ds}{p^2(s)},$$
we obtain a system of a ball bouncing elastically between two walls at y = 0 and y = 1 and moving according to
$$y'' + p^3(t)\, V_x(p(t) y, t) + p^3(t)\, \ddot p(t)\, y = 0,$$
where t = t (τ ). The new system is also periodic in τ with the period $T_\tau = \int_0^T dt / p^2(t)$. We again slightly generalize the problem by considering a particle bouncing elastically between the two stationary walls in an arbitrary analytic potential, y″ + W_y (y, τ ) = 0. The Hamiltonian of the problem is given by
$$K = \frac{p_y^2}{2} + W(y, \tau).$$
This is equivalent to the system of a particle moving on the circle in a potential nonsmooth at two points. If we let y be the angular variable y ∈ (−1, 1) then the Hamiltonian takes the form
$$K = \frac{p_y^2}{2} + W(|y|, \tau).$$
Note that the exactness condition [19]
$$\oint p_y\, dy = \text{constant}$$
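is satisfied because W is periodic in y. Rescaling and introducing more appropriate variables
$$p_y = \varepsilon^{-1} I, \quad \tau = \varepsilon T, \quad y = \varphi, \quad K = \varepsilon^{-2} H,$$
we obtain
$$H = \frac{I^2}{2} + \varepsilon^2\, W(|\varphi|, \varepsilon T).$$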
This system is already in the action-angle variables and ω = I ≠ 0, ω_I = 1 ≠ 0. Proceeding as in the proof of the theorem, we obtain the exact map which satisfies the conditions of the monotone twist theorem and therefore possesses invariant curves. This implies the stability result for the Fermi–Ulam problem.

3. Example of Instability
The billiard dynamics may be ill-defined even in a convex billiard with a continuous non-vanishing curvature. More precisely, Halpern constructed an example of a classical billiard with these properties, yet possessing a trajectory reaching the boundary in finite time [12]. We present an analogous example for the bouncing ball problem in the framework of smooth Hamiltonian systems and indicate how the construction can be carried over to the classical and dual billiards. This approach indicates the connection with the phenomenon of blow-up in finite time in ODEs [7]. We also prove that the billiard flow is well defined for all time if the curvature k > k0 > 0 and it is of bounded variation.

We start by constructing a piecewise constant k(t), 0.5 < k(t) < 1.5, such that the equation ẍ + k(t) sgn(x) = 0 will have a solution coming to rest (ẋ = 0, x = 0) in finite time and then show how k(t) can be made continuous. First, we take a particular trajectory in the autonomous system k = 1 and modify k as follows: when the solution has the largest distance from the origin x, k is decreased by Δk so that the energy changes by ΔH = Δk|x| (since Ḣ = k̇|x|). We change k back to the original value k = 1 when the solution passes through x = 0 so that the energy remains the same at this moment. This procedure can be continued indefinitely. At the nth step the energy decreases by Δk_n x_n, where x_n = H_n/k. Since k = 1 when the energy decreases, ΔH_n = Δk_n H_n. Recalling that 0.5 ≤ k ≤ 1.5 we obtain the estimate on the period of one oscillation T_n ≤ C√H_n.

Since we are constructing a solution which loses its energy in finite time, we let H_n = 1/n³ so that Σ T_n converges. The corresponding Δk_n is given by |Δk_n| = |ΔH_n|/H_n ≤ c/n. Starting with sufficiently large n0 so that |Δk_n| ≤ 0.5 for n ≥ n0 we obtain the example. We can now make k continuous by approximating it in the neighborhoods of the jumps with continuous linear functions. Repeating the construction we obtain an example of instability with continuous k and with the same estimates, since the neighborhoods where k is modified can be chosen arbitrarily small. This construction and its continuous modification carry over to the classical and dual billiards without any changes. Indeed, we have already shown that the bouncing ball problem is asymptotically close to the classical billiard near the boundary.
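The bookkeeping behind the bouncing-ball construction can be sketched numerically. The Python fragment below is an illustration under stated assumptions, not the author's computation: it takes the energies to be exactly 1/n³, uses the full-oscillation time 4·sqrt(2H)/k of ẍ + k sgn(x) = 0 at constant k together with the lower bound k ≥ 0.5 as a stand-in for the bound C√H_n, and all function names are hypothetical.

```python
import math

def energy(n):
    """Prescribed energy after the n-th step, H_n = 1/n^3."""
    return 1.0 / n ** 3

def relative_jump(n):
    """|Delta k_n| = |Delta H_n| / H_n with Delta H_n = H_n - H_{n+1}."""
    return (energy(n) - energy(n + 1)) / energy(n)

def period_bound(n, k_min=0.5):
    """Upper bound for one oscillation: at constant k the period of
    x'' + k sgn(x) = 0 with energy H = x'^2/2 + k|x| is 4 sqrt(2H)/k."""
    return 4.0 * math.sqrt(2.0 * energy(n)) / k_min

def partial_sums(n0, n1):
    """Total elapsed-time bound and total variation of k over steps n0..n1-1."""
    total_time = sum(period_bound(n) for n in range(n0, n1))
    total_variation = sum(relative_jump(n) for n in range(n0, n1))
    return total_time, total_variation
```

Under these assumptions the jumps satisfy |Δk_n| ≤ 0.5 from n0 = 4 onward, the time sum stabilizes as more steps are added, while the accumulated variation of k keeps growing like a multiple of log n.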
Therefore, constructing the sequences {H_n}, {Δk_n}, {T_n}, we will make errors of order o(H_n), o(Δk_n), and o(T_n), which can be checked by direct calculations. Therefore, we will obtain a similar example and its continuous modification. The same argument applies to the dual billiard problem, see also [6]. Note that the constructed k has unbounded variation. This turns out to be a necessary condition for such a construction. We prove this statement for the classical billiard.

Theorem 3.1. In a convex classical billiard with the curvature k ∈ BV (S¹) and k > k0 > 0 there is no trajectory which can reach the boundary in finite time.

Proof. The Hamiltonian function for the classical billiard
$$H = 1 - (1 - k(t)|x|)\sqrt{1 - y^2}$$
is chosen so that near the boundary H ≥ 0 and H = 0 iff x = 0 and y = 0. Assuming y ≤ 0.5, we obtain an estimate for the distance from the boundary
$$|x|_{\max} \le \frac{2}{\sqrt{3}}\, \frac{H}{k_{\min}},$$
and since
$$\dot H = \dot k\, |x| \sqrt{1 - y^2},$$
we obtain the inequality
$$|\dot H| \le |\dot k|\, \frac{2}{\sqrt{3}}\, \frac{H}{k_{\min}},$$
where k̇ exists a.e., since it is the derivative of a function of bounded variation [28]. Integrating, we obtain the energy decay estimate,
$$\frac{H_T}{H_0} \ge \exp\left(-C \int_0^T |\dot k(t)|\, dt\right) > 0,$$
since $\int_0^T |\dot k(t)|\, dt$ is bounded by the variation of k on [0, T ] [28]. Therefore, a trajectory cannot reach the boundary in finite time. □

Remark 3.1. Another property of the system, which is crucial for the above instability examples, is that the period of oscillation vanishes as the solution approaches the boundary. It is these two properties that have been used by Coffman and Ullrich in [7], where an unbounded solution on the finite interval t ∈ [0, t∗) for the equation ẍ + a(t)x³ = 0 has been constructed with a(t) having unbounded variation near t∗. In this system the period of oscillation decays to zero as a solution grows unbounded.

Acknowledgement. I would like to thank N. Berglund, A. Katok, and M. Levi for helpful discussions related to this article. I am also grateful to the referee for bringing my attention to the papers by Ivanov and Markeev, which developed the method used in this paper. This work was supported by the National Science Foundation under Grant No. DMS-9627721.
References
1. Arnold, V.I.: Mathematical methods of classical mechanics. New York: Springer-Verlag, 1978
2. Arnold, V.I.: On the behavior of an adiabatic invariant under slow periodic variation of the Hamiltonian. Sov. Math. Dokl. 3, 136–139 (1962)
3. Berglund, N., Kunz, H.: Integrability and ergodicity of classical billiards in a magnetic field. J. Stat. Phys. 83, 81–126 (1996)
4. Berglund, N., Hansen, A., Hauge, E.H., Piasecki, J.: Can a local repulsive potential trap an electron? Phys. Rev. Lett. 77, 2149–2153 (1996)
5. Birkhoff, G.D.: Dynamical Systems. New York: American Mathematical Society, 1927
6. Boyland, P.: Dual billiards, twist maps and impact oscillators. Nonlinearity 9, 1411–1438 (1996)
7. Coffman, C.V., Ullrich, D.F.: On the continuation of solutions of a certain non-linear differential equation. Monatsh. Math. 71, 385–392 (1967)
8. Douady, R.: PhD Thesis, Ecole Polytechnique, 1988
9. Fermi, E.: On the origin of cosmic radiation. Phys. Rev. 75, 1169–1174 (1949)
10. Gutkin, E.: Dual polygonal billiards and necklace dynamics. Commun. Math. Phys. 143, 431–449 (1992)
11. Gutkin, E., Katok, A.: Caustics for inner and outer billiards. Commun. Math. Phys. 173, 101–133 (1995)
12. Halpern, B.: Strange billiard table. Trans. Am. Math. Soc. 232, 297–305 (1977)
13. Ivanov, A.P., Markeev, A.P.: The dynamics of systems with unilateral constraints. J. Appl. Math. Mech. 48, 448–451 (1984)
14. Katok, A., Hasselblatt, B.: Introduction to the modern theory of dynamical systems. Encyclopedia of Mathematics and its Applications 54, Cambridge: Cambridge University Press, 1995
15. Kunze, M., Kupper, T., You, J.: On the application of KAM theory to discontinuous dynamical systems. J. Differ. Eqs. 139, 1–21 (1997)
16. Lazutkin, V.F.: KAM theory and semiclassical approximations to eigenfunctions. New York: Springer-Verlag, 1993
17. Lazutkin, V.F.: The existence of caustics for a billiard problem in a convex domain. Math. USSR, Isvestija 7, 185–214 (1973)
18. Laederich, S., Levi, M.: Invariant curves and time-dependent potentials. Ergod. Th. & Dynam. Sys. 11, 365–378 (1991)
19. Levi, M.: KAM theory for particles in periodic potentials. Ergod. Th. & Dynam. Sys. 10, 777–785 (1990)
20. Levi, M.: Quasiperiodic motions in superquadratic time-periodic potentials. Commun. Math. Phys. 143, 43–83 (1991)
21. Lichtenberg, A.J., Lieberman, M.A.: Regular and stochastic motion. Berlin: Springer-Verlag, 1983
22. Markeev, A.P.: On the motion of a solid with an ideal non-retaining constraint. J. Appl. Math. Mech. 49, 545–552 (1985)
23. Markeev, A.P.: Qualitative analysis of systems with an ideal non-conservative constraint. J. Appl. Math. Mech. 53, 685–689 (1989)
24. Moser, J.K.: On invariant curves of area preserving mappings of an annulus. Nachr. Acad. Wiss. Gottingen Math. Phys. K1, 1–20 (1962)
25. Moser, J.K.: Stable and random motions in Dynamical Systems. Ann. Math. Stud. 77, Princeton, NJ: Princeton University Press, 1973
26. Ortega, R.: Asymmetric oscillators and twist mappings. J. London Math. Soc. 53, 325–342 (1996)
27. Robnik, M., Berry, M.V.: Classical billiards in magnetic fields. J. Phys. A18, 1361–1378 (1985)
28. Royden, H.L.: Real Analysis. New York: Macmillan Publ. Co., 1988
29. Seba, P.: Quantum chaos in the Fermi-accelerator model. Phys. Rev. A 41, 2306–2310 (1990)
30. Tabachnikov, S.: Dual billiards. Russ. Math. Surv. 48, 75–102 (1993)
31. Ulam, S.: On some statistical properties of dynamical systems. In: Proceedings of the fourth Berkeley symposium on mathematical statistics and probability, University of California: Berkeley, 1961
32. Zharnitsky, V.: Invariant curve theorem for quasiperiodic twist mappings and stability of motion in Fermi–Ulam problem. To appear in Nonlinearity (2000)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 211, 303 – 333 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Smooth Dependence of Thermodynamic Limits of SRB-Measures
Miaohua Jiang1 , Rafael de la Llave2
1 Department of Mathematics and Computer Science, Wake Forest University, Winston-Salem, NC 27109, USA.
2 Department of Mathematics, University of Texas, Austin, TX 78712-1802, USA.
Received: 26 April 1999 / Accepted: 17 November 1999
Abstract: The thermodynamic limits of SRB-measures for uniformly hyperbolic sets are smoothly dependent on the map in an appropriate functional space.
1. Introduction In a recent paper [Rue97], Ruelle showed that, in an appropriate sense, the SRB measure µf of an Axiom A map f depends differentiably on the map f and computed explicit formulas for the derivative. Since the motivation of that paper was to serve eventually as a justification of statistical mechanics out of equilibrium, it seems natural to extend these results to systems of many particles and study the dependence of the derivatives of the measures (out of which one can compute response functions, entropy production, etc.) as the number of particles increases; in particular, to study whether there is a thermodynamic limit as the number of particles tends to infinity. The existence of this thermodynamic limit of the derivatives of the measures implies that one can define the thermodynamic response functions. In this paper we establish differentiability of thermodynamic limits of SRB-measures for lattice systems that have been often used as models in statistical mechanics ([Sim93]). We call attention to the fact that for these systems not only it makes sense to consider the limit of physical quantities as the number of particles tends to infinity, but also one can make sense of the system with an infinite number of particles as a dynamical system on a Banach manifold. Hence, we will show that the systems we consider (uniformly hyperbolic systems) have thermodynamic properties slightly away from equilibrium. Besides the uncontroversial fact that hyperbolic systems are a very natural first step to study thermodynamic behavior of deterministic systems [Kry79], we note that it has been argued by Ruelle and Gallavotti-Cohen that the thermodynamic properties developed under the mathematical hypothesis of hyperbolicity should hold for all the physically relevant systems. This hypothesis is called the “chaotic hypothesis”. For a discussion of
the relevance to physics and supporting evidence for this hypothesis we refer to [Gal98b,Gal98a,Gal99].

We assume that to each point of Z^d we associate a dynamical system given by a phase space manifold M and a map f acting on M. So, the phase space of the whole system is ℳ = ⊗_{Z^d} M and we can define a map on ℳ by F = ⊗_{Z^d} f . These F are referred to as the uncoupled systems. We will be concerned with modifications of these uncoupled systems, which are obtained by small local couplings that are also translation invariant (see later precise definitions).

Recently, it has been shown [JP98] that when f is Axiom A, one can define the SRB measures for such systems. The measure of the infinite system can be obtained by taking the limit (in an appropriate sense) of the SRB measures of finite dimensional systems. Moreover, the SRB measures of the infinite system also satisfy a variational principle that behaves naturally with respect to finite dimensional approximations. Even though some of the characteristics of SRB measures in finite dimensions, such as the approximation by periodic orbits [Bow75], do not even make sense, a certain number of them are still true. Notably, in [JP98] it was established that one can have a variational principle for the thermodynamic limit of SRB measures of finite subsystems (hence, we will refer to these limits as the SRB measures of the infinite system) and, indeed, a significant part of the thermodynamic formalism still holds. This characterization of SRB measures as equilibrium measures of a potential will play a very important role in our study. One question that, to the best of our knowledge, remains open is whether the characterization of SRB measures of attractors as the weak limit of iterated Lebesgue measures carries through for the infinite dimensional system.
In this paper, we will show that indeed, the SRB measures of the infinite system depend smoothly on the map when the space of maps is given an appropriate topology that incorporates smoothness, locality of couplings, and translation invariance. Of course, the differentiable dependence of measures is to be understood in a weak enough sense. If we look at the result of applying the measure to a function which is smooth enough, and which also has a local character, then the result will depend smoothly on the map. The size of the derivatives with respect to the map will depend only on the size of the derivatives with respect to the space variables of the test function. (See Theorem 7.1.) It will happen that the derivatives will also enjoy the natural properties of translation invariance and locality. We will also show in a subsequent paper that the formulas for derivatives in [Rue97] can be adapted to the infinite dimensional system. Moreover, these derivatives are the limit of the derivative operators on a finite number of sites as we take the number of sites considered to infinity.

The availability of the thermodynamic formalism allows us to show that there are thermodynamic limits of many dynamical quantities such as the metric entropy or the dimension of basic sets. We will also show in this paper that these thermodynamic limits depend smoothly on the map. We note that the smooth dependence of these dynamical quantities for finite dimensional systems has been considered in several places (see [Mn90,KKPW90,Wei92,Con92,Con95a,Con95b] as well as [Rue97]).

The proof of our results will be as follows: It turns out that much of the geometric theory of hyperbolic systems (structural stability, persistence of hyperbolicity and the parametric versions of them) goes through for diffeomorphisms of Banach manifolds. Given a map Φ in a C¹ neighborhood of F , we can find a unique homeomorphism h close to the identity that satisfies Φ ◦ h = h ◦ F .
This h will depend smoothly on Φ (the proof of this result in the finite dimensional case was done in [dlLMM86]). Moreover, the invariant exponential splitting of the tangent bundle, $T_{\bar x}\mathcal{M} = E^u_\Phi(\bar x) \oplus E^s_\Phi(\bar x)$, satisfies that $E^u_\Phi(h(\bar x))$ and $E^s_\Phi(h(\bar x))$ depend smoothly on Φ (this was proved in [Mn90] for finite dimensional systems). Modifying the proofs indicated above, we will show that, if Φ has the structure of local couplings, then h, $E^u_\Phi(h(\bar x))$, $E^s_\Phi(h(\bar x))$ and their derivatives with respect to Φ also have similar properties and they depend smoothly on the map when we give them a topology based on a norm that incorporates locality, etc.

By using the result above systematically, we can reduce the study of the SRB measure for Φ to the study of an equilibrium measure for F with a potential function that depends on Φ. As is well known, in finite dimensions, an appropriate potential is just the logarithm of the Jacobian of Φ restricted along the unstable manifold. This potential has the advantage that, besides having an SRB measure as its equilibrium, it admits a clear geometric interpretation. Nevertheless, it is not suitable for the considerations of the limits as the number of sites goes to infinity since it does not have a limit. In order to be able to take limits as the size of the system grows, one has to take advantage of the group structure of the underlying lattice Z^d. The thermodynamic limit of SRB measures on finite dimensional spaces with respect to Z-actions becomes an equilibrium measure for a meaningful potential with respect to a Z^{d+1}-action induced by the dynamics on the infinite dimensional space and the translation maps of the lattice Z^d. An appropriate potential ϕ_Φ suitable for considerations in infinite dimensions is constructed in [JP98]. We will show that, under the conditions of our theorem, the potential function ϕ_Φ depends smoothly on Φ. The rest of the argument is a thermodynamic argument showing that the pressure depends smoothly on the potential.
This result is well-known in finite dimensions since the underlying lattice gas model is one dimensional and thus, the pressure is analytic. In the infinite dimensional system case, the underlying lattice gas model is higher dimensional. The C^∞-smoothness of the pressure for the potential functions was proved in [BK97]. Hence, in our situation, the pressure P (ϕ_Φ) of ϕ_Φ depends smoothly on Φ. The main result follows since the SRB measure is the derivative functional of the pressure and the metric entropy can be expressed in terms of the pressure and the SRB measure.

Since the precise definition of differentiability requires definitions of spaces of smooth diffeomorphisms with local coupling structures, translation invariance as well as the meaning in which derivatives are to be understood, we postpone the precise formulation of our results till these technical definitions are introduced. (See Theorem 7.1 for a precise formulation of the results.)
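The statement that the equilibrium measure is the derivative functional of the pressure has an elementary finite analogue, which the following Python sketch illustrates. It is a toy model only and is not the lattice system of this paper: it assumes a finite alphabet and a potential depending on a single symbol, for which the pressure is log Σ exp(ϕ_i) and the equilibrium measure is the Gibbs vector; all names are illustrative.

```python
import math

def pressure(phi):
    """Toy pressure P(phi) = log sum_i exp(phi_i), computed stably."""
    m = max(phi)
    return m + math.log(sum(math.exp(v - m) for v in phi))

def equilibrium(phi):
    """Gibbs probabilities p_i = exp(phi_i - P(phi))."""
    P = pressure(phi)
    return [math.exp(v - P) for v in phi]

def free_energy(q, phi):
    """Entropy of q plus the integral of phi with respect to q."""
    entropy = -sum(x * math.log(x) for x in q if x > 0.0)
    return entropy + sum(x * v for x, v in zip(q, phi))

def directional_derivative(phi, g, t=1e-6):
    """Central difference of s -> P(phi + s g) at s = 0."""
    up = pressure([v + t * w for v, w in zip(phi, g)])
    down = pressure([v - t * w for v, w in zip(phi, g)])
    return (up - down) / (2.0 * t)
```

In this toy setting the derivative of the pressure in the direction g equals the integral of g against the equilibrium vector, and the equilibrium vector maximizes entropy plus the integral of the potential, mirroring the variational principle.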
Remark 1.1. As it happens often in smooth ergodic theory, we note that the geometric theory of the attractors (structural stability, splittings) goes through assuming only one derivative but without assuming translation invariance. (Of course, if we assume translation invariance in the coupling, we also obtain translation invariance for the results). On the other hand, the part of the theory that deals with invariant measures requires more than one derivative and also translation invariance.
2. Preliminaries In this section, we will collect some results from smooth ergodic theory and from coupled map lattices. We will use them to establish the notation that we will use throughout the paper.
2.1. SRB-measures for finite dimensional maps. Let M be a smooth compact Riemannian manifold and f be a C^r-diffeomorphism of M, r > 1. We assume that f possesses a locally maximal hyperbolic set Λ, i.e., f is uniformly hyperbolic on Λ and there exists an open neighborhood U ⊃ Λ that does not contain any larger invariant hyperbolic set. Two important particular cases of this situation are when f (U ) ⊂ U, Λ = ∩_{n∈N} f^n (U ), i.e., Λ is an attractor and, when Λ = M, i.e., the system is Anosov. Note that strictly speaking, an Anosov system is a particular case of an attractor. When Λ is an attractor, a Sinai-Ruelle-Bowen measure µ for f can be described by the following limit:
$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} g(f^k(x)) = \int g\, d\mu,$$
where dm denotes the Lebesgue measure and the equality holds for any continuous function g on M [Bow75]. More generally, when Λ is a locally maximal hyperbolic set and f is topologically transitive on Λ, the SRB measure µ is the unique invariant measure (called equilibrium state) at which the functional (called the pressure of ν)
$$h_\nu(f) + \int \varphi_f(x)\, d\nu$$
attains its maximal value, which is equal to the topological pressure P (ϕ_f ):
$$P(\varphi_f) = h_\mu(f) + \int \varphi_f(x)\, d\mu, \tag{1}$$
where h_µ (f ) is the measure theoretical entropy, and ϕ_f (x) = − log J^u f (x) (J^u f (x) is the Jacobian of the restriction of f along the unstable manifold at x). The fact that µ satisfies (1) is called the variational principle. To ensure uniqueness of SRB-measures for direct product spaces, we will further assume that f is topologically mixing on Λ. Note that a direct product of two topologically transitive systems is not necessarily topologically transitive and a direct product of two topologically mixing ones is still topologically mixing.

2.2. Coupled map lattices. We will consider a compact manifold M of dimension n and we will assume that there is a C^r diffeomorphism f defined on M, r ∈ N ∪ {∞}. We will also assume that there is a locally maximal hyperbolic set Λ ⊂ M. That is, we can find γ < 1 so that for every x ∈ Λ,
$$T_x M = E^u_f(x) \oplus E^s_f(x), \qquad Df(x) : E^s_f(x) \to E^s_f(f(x)); \quad Df(x) : E^u_f(x) \to E^u_f(f(x)) \tag{2}$$
and that, for an appropriately chosen C^∞ Riemannian metric g (called adapted metric), we have for all x ∈ Λ,
$$\|Df(x)|_{E^s_f(x)}\| \le \gamma; \qquad \|Df(x)^{-1}|_{E^u_f(f(x))}\| \le \gamma. \tag{3}$$
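Conditions (2) and (3) can be made concrete on a standard example. The Python sketch below is an illustration that is not taken from this paper: it assumes the single-site map f is the cat map (x, y) → (2x + y, x + y) mod 1 on the 2-torus, whose derivative is a constant symmetric matrix, so that the flat metric can serve as an adapted metric and γ is the contracting eigenvalue. All names are illustrative.

```python
import math

# Derivative of the cat map (x, y) -> (2x + y, x + y) mod 1; it is constant.
A = ((2.0, 1.0), (1.0, 1.0))
A_INV = ((1.0, -1.0), (-1.0, 2.0))

LAM_U = (3.0 + math.sqrt(5.0)) / 2.0   # expanding eigenvalue
LAM_S = (3.0 - math.sqrt(5.0)) / 2.0   # contracting eigenvalue, equal to 1 / LAM_U

# Eigenvectors (1, lam - 2) span the unstable and stable directions.
E_U = (1.0, LAM_U - 2.0)
E_S = (1.0, LAM_S - 2.0)

GAMMA = LAM_S                           # hyperbolicity constant for the flat metric

def apply(m, v):
    """Multiply the 2x2 matrix m by the vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def norm(v):
    return math.hypot(v[0], v[1])

def stretch(m, v):
    """Factor by which m stretches the direction v."""
    return norm(apply(m, v)) / norm(v)

def cross(u, v):
    """Zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]
```

Here `stretch(A, E_S)` and `stretch(A_INV, E_U)` both equal `GAMMA` (about 0.38), and `cross(apply(A, E_S), E_S)` vanishes up to rounding, which is the invariance of the splitting in (2).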
We will denote by d the distance in M induced by the Riemannian metric g. The phase space of lattice maps will be ℳ = ⊗_{i∈Z^d} M endowed with some structures that we now describe.
We define a distance in ℳ by
$$\rho(\bar x, \bar y) = \sup_{i \in \mathbb{Z}^d} d(x_i, y_i), \tag{4}$$
where x̄ = (x_i ) and ȳ = (y_i ) are two points in ℳ and d is the Riemannian distance on M. As is well known, ℳ can be given a structure of a Banach manifold with a Finsler metric as follows. We will consider
$$T_{\bar x}\mathcal{M} = \oplus_{i\in\mathbb{Z}^d} T_{x_i} M = \{(v_i)_{i\in\mathbb{Z}^d} \mid v_i \in T_{x_i} M,\ \|\bar v\| \equiv \sup_i |v_i| < \infty\}$$
(that is, we model Tℳ on ℓ^∞(R^n), so that, in particular, it is not a separable space). This defines a Finsler metric on Tℳ which is compatible with (4). Note that we can still define the exponential map using the metric g on the manifold M as follows. Given a vector v̄ ∈ T_x̄ ℳ, in the i th copy of M, we follow the geodesic flow for the metric g starting at x_i , in the direction of v_i for a unit of time. We note that using the exponential map, we can give ℳ the structure of a Banach manifold modeled on ℓ^∞(R^n). Since all components of the exponential map are uniformly C^∞ differentiable maps, the exponential map defined above is a C^∞ chart from a neighborhood of T_x̄ ℳ to a neighborhood of the manifold ℳ. (Of course, since we define the topology on ℳ through these charts, the fact that a neighborhood is covered is automatic. The only thing that needs to be checked is that the transitions from charts to charts are C^∞ but this follows easily from the fact that the geodesic flow is uniformly C^j for all j ∈ N.) Note that since the geodesic flow on each coordinate is independent, the exponential map maps a ball in T_x̄ ℳ onto a product of neighborhoods in M of each point x_i .

On the Banach manifold ℳ, we can define the direct product map or uncoupled map by F = ⊗_{i∈Z^d} f_i . The map is C^r on the manifold ℳ. Recall that, since all the component manifolds M are compact and the maps are identical copies, we have that all the maps f_i have derivatives of order up to r uniformly bounded and that the derivatives of order r have a modulus of continuity that is independent of i. The map F not only is a C^r map, but the r th derivative is uniformly continuous. More importantly, the map F possesses an infinite-dimensional hyperbolic set Δ_F = ⊗_{i∈Z^d} Λ_i , where f_i and Λ_i are copies of f and Λ, respectively.
It is easy to check that, setting $E^s_F(\bar x) = \oplus_{i\in\mathbb{Z}^d} E^s_f(x_i)$, $E^u_F(\bar x) = \oplus_{i\in\mathbb{Z}^d} E^u_f(x_i)$, then
$$T_{\bar x}\mathcal{M} = E^s_F(\bar x) \oplus E^u_F(\bar x)$$
is a hyperbolic splitting in Δ_F and, for every x̄ ∈ Δ_F ,
$$DF(\bar x) E^s_F(\bar x) = E^s_F(F(\bar x)); \qquad DF(\bar x) E^u_F(\bar x) = E^u_F(F(\bar x));$$
$$\|DF(\bar x)|_{E^s_F(\bar x)}\| \le \gamma; \qquad \|DF(\bar x)^{-1}|_{E^u_F(F(\bar x))}\| \le \gamma,$$
where the γ is the same number as in (3). We call attention to the fact that, even if the definition of hyperbolic systems and splittings goes through without much problem, the definition of transitivity and the properties of approximation by periodic orbits are not quite straightforward to transport to the infinite dimensional context since, equipped with the Banach manifold structure that we introduced, the phase space is not separable. We recall the definitions of some objects discussed in [JP98].
Let S denote the spatial translations on M induced by the translations on the integer ¯ = (xi+k ). Let the map G lattice Zd , i.e., for any k ∈ Zd and x¯ = (xi ) ∈ M, S k (x) be a C 2 -perturbation of the identity map on M. G is said to be spatially translation invariant if G ◦ S = S ◦ G. It is said to have short range property if G, written in the form G = (Gi )i∈Zd , where Gi : M → Mi , has the following property: there exist a decay constant θ, 0 < θ < 1 and a constant C > 0 such that for any fixed k ∈ Zd and any points x¯ = (xj ), y¯ = (yj ) ∈ M with xj = yj for all j ∈ Zd , j 6= k, d(Gi (x), ¯ Gi (y)) ¯ ≤ Cθ |i−k| d(xk , yk ). Define 8 = G◦F (or equivalently, 8 = F ◦G, since F is also a diffeomorphism). The map 8 is a perturbation of F . The infinite-dimensional dynamical system (M, (8, S)) is called a coupled map lattice. If G = I d, the lattice is called uncoupled . When G is spatially translation invariant, 8 satisfies the same property and the pair (8, S) generates a Zd+1 -action on M. Fix a point x¯ ∗ ∈ 18 , and a finite volume V ⊂ Zd , the map 8V : xV → 8V (xV ) on MV = ⊗i∈V Mi is defined coordinatewise by 8V (xV ) i = 8((xV , x ∗ |Vb ) i , i ∈ V , where the point (xV , x ∗ |Vb ) denotes an element in M whose restrictions to V and its b are x and x ∗ |Vb , respectively. complement V V The map 8V is a diffeomorphism of MV when the perturbation G is sufficiently close to identity and it is C 1 -close to the diffeomorphism FV = ⊗i∈V f . By the structural stability theorem 8V possesses a hyperbolic set 18V since FV has a hyperbolic set 1FV = ⊗i∈V 3. There exists a conjugating homeomorphism hV : 1FV → 18V , 8V ◦ hV = hV ◦ FV . The maps 8V and hV provide finite-dimensional approximations for the infinitedimensional maps 8 and h, respectively. Let µV be the SRB-measure on the hyperbolic set 18V for the map 8V . Then, it is shown in [JP98] that the measure µV “weakly converges” to a measure µ8 on 18 . 
The measure $\mu_\Phi$ is invariant and exponentially mixing under $\Phi$ and spatial translations $S$. It also satisfies the variational principle:

$$P_\tau(\varphi_\Phi) = h_{\mu_\Phi}(\tau) + \int \varphi_\Phi \, d\mu_\Phi, \qquad (5)$$

where $\tau$ denotes the $\mathbb{Z}^{d+1}$ action on $\Delta_\Phi$ induced by $\Phi$ and $S$, $P_\tau(\varphi_\Phi)$ is the topological pressure for the potential function $\varphi_\Phi$, and $h_{\mu_\Phi}(\tau)$ is the measure theoretical entropy of $\mu_\Phi$ with respect to $\tau$. This “weak convergence” is in a rather special sense since the measures are defined on different spaces and we need to consider the convergence on observables that admit natural restrictions to smaller systems. The limiting process is similar to the one in the thermodynamic limit of Gibbs ensembles on lattices when the underlying finite volume tends to infinity. We call $\mu_\Phi$ the SRB-measure of the coupled map lattice $(\Phi, S)$. The main purpose of this paper is to show that the relation $\Phi \to \mu_\Phi$ is differentiable in a proper setup that includes an appropriate Banach space topology and a suitable weak sense of differentiability for measures.
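The short range property defined above can be illustrated numerically. The following is only a sketch under assumptions that are not in the text: the local manifold is replaced by the circle $\mathbb{R}/\mathbb{Z}$, the lattice by a finite window of sites of $\mathbb{Z}$, and the coupling `coupling` (with its parameters `eps` and `theta`) is a hypothetical example chosen for illustration. For this particular coupling the inequality holds with $C = 1 + 2\pi\,\mathrm{eps}$.

```python
import math
import random


def circle_dist(a, b):
    """Distance between two points of the circle R/Z."""
    t = abs(a - b) % 1.0
    return min(t, 1.0 - t)


def coupling(x, eps=0.05, theta=0.5):
    """Hypothetical short-range coupling G on a finite window of sites.

    G_i(x) = x_i + eps * sum_j theta**|i - j| * sin(2*pi*x_j)   (mod 1).
    This map is an assumption made for illustration; it is not taken from the paper.
    """
    n = len(x)
    return [
        (x[i] + eps * sum(theta ** abs(i - j) * math.sin(2 * math.pi * x[j])
                          for j in range(n))) % 1.0
        for i in range(n)
    ]


def short_range_ratio(x, k, delta, eps=0.05, theta=0.5):
    """Largest value over i of d(G_i(x), G_i(y)) / (theta**|i-k| * d(x_k, y_k)),
    where y agrees with x except at site k, which is shifted by delta."""
    y = list(x)
    y[k] = (y[k] + delta) % 1.0
    gx = coupling(x, eps, theta)
    gy = coupling(y, eps, theta)
    base = circle_dist(x[k], y[k])
    return max(
        circle_dist(gx[i], gy[i]) / (theta ** abs(i - k) * base)
        for i in range(len(x))
    )
```

For instance, calling `short_range_ratio(x, 10, 1e-3)` on a random configuration `x` of 21 sites returns a number that should not exceed $1 + 2\pi \cdot 0.05$, which plays the role of the constant $C$ in the definition.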
Smooth Dependence of SRB
3. Structural Stability and Smooth Dependence on the Perturbation

3.1. Introduction. In this section, we will study structural stability and properties of the hyperbolic splitting for lattice systems. As it turns out, one can generalize strategies that worked for finite dimensional systems, but we also have to pay attention to the spatial structure and prove not only regularity properties, but also properties of decay. Parts of this strategy have already been considered in [BS88,PS91,Jia95]. For our purposes, it seems that the most convenient proof of structural stability is that of [Mos69], as modified in [dlLMM86]. This proof reduces structural stability to an application of the implicit function theorem, and then all the work goes into establishing that the operator we consider is differentiable. In our case, we will obtain the spatial properties by introducing spaces that incorporate these local properties. After we study the dependence of structural stability on the perturbation, we will study the properties of the invariant hyperbolic splitting and its dependence on the perturbation. We will do that by formulating the invariance of the splitting as a fixed point problem in an appropriate space of sections.

3.2. Decay functions. In this rather technical section, which perhaps can be omitted on a first reading, we introduce the concept of decay function and collect some of its elementary properties. These are functions which we have found useful to quantify the notions of locality of several geometric objects. They enjoy properties that make it rather convenient to apply tools of functional analysis to geometric problems. Before we describe the space of diffeomorphisms that capture the idea of localized (or short-ranged [PS91,Jia95]) interactions, we will need to introduce a technical device that will simplify some of the arguments in the definitions as well as the proof of the structural stability.

Definition 3.1.
We say that a positive valued function $\Gamma : \mathbb{Z}^d \to \mathbb{R}^+$ is a decay function when:
1. $\sum_{i \in \mathbb{Z}^d} \Gamma(i) < \infty$,
2. $\sum_{j \in \mathbb{Z}^d} \Gamma(i-j)\Gamma(j-k) \le \Gamma(i-k)$.

The importance of decay functions is that infinite matrices $A = (a_{ij})_{i,j \in \mathbb{Z}^d}$ endowed with the norm

$$\|A\| = \sup_{i,j} |a_{ij}| \, \Gamma^{-1}(i-j)$$

form a Banach algebra. This, in turn, will make it possible to define spaces of maps that behave well under composition. Roughly speaking, our spaces of diffeomorphisms will contain maps where the influence of the $i$th coordinate of the argument on the $j$th coordinate of the map is bounded by a decay function. Concrete examples of decay functions are the following:

Proposition 3.2. Given $\alpha > d$, $\theta \ge 0$, for some $a > 0$ (sufficiently small depending on $d$, $\alpha$, $\theta$), the function defined by

$$\Gamma(i) = \begin{cases} a|i|^{-\alpha}\exp(-\theta|i|), & i \neq 0, \\ a, & i = 0, \end{cases}$$

is a decay function.
Proof. When $i = k$, using that, in our case, $\Gamma(i-j) = \Gamma(j-i)$, the desired result amounts to $\sum_j \Gamma^2(i-j) \le a$. Since the left-hand side is independent of $i$ and is a convergent sum multiplied by $a^2$, this can always be achieved by taking $a$ sufficiently small. When $i \neq k$, we write

$$e^{\theta|i-k|} \sum_j \Gamma(i-j)\Gamma(j-k) = a^2 \sum_{j \neq i,\ j \neq k} |i-j|^{-\alpha}|j-k|^{-\alpha} e^{-\theta(|i-j|+|j-k|-|i-k|)} + 2a\,\Gamma(i-k)e^{\theta|i-k|}. \qquad (6)$$

It suffices to show that the right hand side of (6) is smaller than $a|i-k|^{-\alpha}$. Note that the second term in (6), which collects the contributions of $j = i$ and $j = k$, is $2a^2|i-k|^{-\alpha}$. Since $e^{-\theta(|i-j|+|j-k|-|i-k|)} \le 1$, we can bound the first term of (6) by:

$$a^2 \sum_{j \neq i,\ j \neq k} |i-j|^{-\alpha}|j-k|^{-\alpha}.$$
We consider the sets $B = \{j \in \mathbb{Z}^d - \{i,k\} : |i-j| \le |j-k|\}$ and $B^c = \{j \in \mathbb{Z}^d - \{i,k\} : |i-j| > |j-k|\}$. Since $\max(|i-j|, |j-k|) \ge |i-k|/2$, for $j \in B$ we have $|j-k|^{-\alpha} \le 2^\alpha |i-k|^{-\alpha}$. Hence, $|i-j|^{-\alpha}|j-k|^{-\alpha} \le 2^\alpha |i-j|^{-\alpha}|i-k|^{-\alpha}$. Similarly, for $j \in B^c$ we have $|i-j|^{-\alpha}|j-k|^{-\alpha} \le 2^\alpha |j-k|^{-\alpha}|i-k|^{-\alpha}$. Hence, we have

$$\begin{aligned}
a^2 \sum_{j \neq i,\ j \neq k} |i-j|^{-\alpha}|j-k|^{-\alpha} e^{-\theta[|i-j|+|j-k|-|i-k|]}
&\le a^2 \sum_{j \in B} |i-j|^{-\alpha}|j-k|^{-\alpha} + a^2 \sum_{j \in B^c} |i-j|^{-\alpha}|j-k|^{-\alpha} \\
&\le a^2 \sum_{j \in B} 2^\alpha |i-k|^{-\alpha}|i-j|^{-\alpha} + a^2 \sum_{j \in B^c} 2^\alpha |i-k|^{-\alpha}|j-k|^{-\alpha}.
\end{aligned}$$

Bounding the sums over $B$ and $B^c$ by sums over $\mathbb{Z}^d$ we obtain that the right hand side of (6) can be bounded by:

$$a^2 (2 \cdot 2^\alpha K_{d,\alpha} + 2)|i-k|^{-\alpha}, \qquad (7)$$

where $K_{d,\alpha} = \sum_{j \in \mathbb{Z}^d - \{0\}} |j|^{-\alpha}$. Since the constant in (7) contains a factor $a^2$, by taking $a$ sufficiently small, we can achieve that the bound has the desired form. □
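As a numerical sanity check of Proposition 3.2 and of the Banach algebra property stated after Definition 3.1, one can test both inequalities in dimension $d = 1$. The sketch below is not part of the argument: the parameter values ($a = 0.03$, $\alpha = 2$, $\theta = 0.1$), the truncation `cutoff` of the lattice sum, and all function names are assumptions made for illustration. A truncated sum only gives a lower bound for the full convolution, so a ratio below 1 is consistent with the proposition but does not prove it.

```python
import math
import random


def gamma(n, a=0.03, alpha=2.0, theta=0.1):
    """Candidate decay function of Proposition 3.2 on Z (d = 1)."""
    if n == 0:
        return a
    return a * abs(n) ** (-alpha) * math.exp(-theta * abs(n))


def convolution(n, cutoff=500, **params):
    """Truncated version of sum_j Gamma(n - j) * Gamma(j)."""
    return sum(gamma(n - j, **params) * gamma(j, **params)
               for j in range(-cutoff, cutoff + 1))


def worst_ratio(max_n=30, **params):
    """Largest ratio convolution(n) / Gamma(n) over |n| <= max_n.

    Property 2 of Definition 3.1 requires this ratio to be at most 1.
    """
    return max(convolution(n, **params) / gamma(n, **params)
               for n in range(-max_n, max_n + 1))


def decay_norm(mat):
    """Norm sup_{i,j} |a_ij| / Gamma(i - j) of a finite matrix (default Gamma)."""
    size = len(mat)
    return max(abs(mat[i][j]) / gamma(i - j)
               for i in range(size) for j in range(size))


def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]
```

With these choices, `worst_ratio(a=0.03)` stays below 1, whereas `worst_ratio(a=0.5)` exceeds 1 already at $n = 0$, which illustrates why $a$ has to be taken small. For finite matrices `A`, `B` indexed by a window of $\mathbb{Z}$, one should also find `decay_norm(matmul(A, B)) <= decay_norm(A) * decay_norm(B)`.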
Remark 3.3. Note that $\Gamma(i) = \exp(-\theta|i|)$, $0 < \theta < 1$, is not a decay function. Nevertheless, it has been customary in many papers to use spaces in which the dependence on distant coordinates is bounded by exponential decay functions. Of course, since, for $i \neq 0$ and up to a constant factor depending on $\alpha$ and $\epsilon$, $\exp(-(\theta+\epsilon)|i|) \le |i|^{-\alpha}\exp(-\theta|i|) \le \exp(-\theta|i|)$, the results obtained using the decay function $|i|^{-\alpha}\exp(-\theta|i|)$ imply results for exponential decay functions. In the later part of this paper, we will use exponential decay since it is more convenient for some calculations and to be able to use theorems from the literature as stated.

Remark 3.4. Many of the results of this paper, in particular, all the results in Sects. 3, 4, can be generalized to the situation when we have a graph $H$ in place of the lattice $\mathbb{Z}^d$. The only requirement is that we have a decay function $\Gamma(i,j)$ satisfying the properties of Definition 3.1. In some graphs, e.g. the Bethe lattice, where one can define a natural distance $d$ satisfying the ultrametric property $d(j,k) \le \max(d(i,j), d(i,k))$, it is easy to introduce such a $\Gamma$. It suffices to take $\Gamma(i,j) = a\varphi(d(i,j))$, where $\varphi : \mathbb{R}^+ \to \mathbb{R}^+$ is a monotone decreasing function that tends to zero sufficiently fast and $a$ is a sufficiently small constant. If we define the sets $B$ and $B^c$ as above we have, by the ultrametric property of the distance and the monotonicity of $\varphi$, and because all the terms are positive,

$$\begin{aligned}
\sum_j \Gamma(i,j)\Gamma(j,k) &\le \sum_{j \in B} \Gamma(i,j)\Gamma(j,k) + \sum_{j \in B^c} \Gamma(i,j)\Gamma(j,k) \\
&\le \sum_{j \in B} \Gamma(i,j)\Gamma(i,k) + \sum_{j \in B^c} \Gamma(i,k)\Gamma(j,k) \\
&\le \Gamma(i,k)\Big[\sum_{j \in B} \Gamma(i,j) + \sum_{j \in B^c} \Gamma(j,k)\Big].
\end{aligned}$$
Note that if $\varphi$ decreases fast enough, the sums in the brackets converge and, by choosing $a$ sufficiently small, we can make sure that they are smaller than 1.

3.3. Geometry of lattice bundles and spaces of sections.

Definition 3.5. We will consider the following spaces of vector fields (i.e. sections of the tangent bundle):

$$C^0(\mathcal{M}, T\mathcal{M}) = \Big\{ \bar{v}(\bar{x}) : \bar{x} \to \bar{v}(\bar{x}) \text{ is continuous},\ \|\bar{v}\|_{C^0} \equiv \sup_{i \in \mathbb{Z}^d} \sup_{\bar{x} \in \mathcal{M}} |v_i(\bar{x})| < \infty \Big\}. \qquad (8)$$

For $0 < \alpha < 1$, given a trivialization of $TM$ by a finite number of coordinate charts $\{P_1, \ldots, P_l\}$, we denote

$$C^\alpha(\mathcal{M}, T\mathcal{M}) = \Big\{ \bar{v}(\bar{x}) : \|\bar{v}\|_{C^\alpha} \equiv \max\Big(\|\bar{v}\|_{C^0},\ \sup_{i \in \mathbb{Z}^d} \max_k \sup_{\bar{x} \neq \bar{y} \in \mathcal{M},\ x_i, y_i \in P_k} \frac{|v_i(\bar{x}) - v_i(\bar{y})|}{d(\bar{x}, \bar{y})^\alpha}\Big) < \infty \Big\}. \qquad (9)$$
Following the usual convention, we will also allow $\alpha = \mathrm{Lip}$ in the definition above. In that case, we set $\alpha = 1$ in (9). It is convenient and standard to think of Lip as a special symbol such that $\mathrm{Lip} < 1$, $\mathrm{Lip} > \alpha$ for all $\alpha \in \mathbb{R}^+$ with $\alpha < 1$, and Lip enters in arithmetic expressions as 1. For convenience, we assume that the maximum distance on $M$ is 1. Thus, we have $\|v\|_{C^{\alpha_1}} \le \|v\|_{C^{\alpha_2}}$ when $0 \le \alpha_1 \le \alpha_2$.

Remark 3.6. We note that the definition of the Hölder norm depends on the choice of the trivialization. Nevertheless, any two of these norms are equivalent. Note that we choose the trivialization in $M$, which is finite dimensional and compact, and not in $\mathcal{M}$.

Definition 3.7. We will denote by $C^0_S$, $C^\alpha_S$ the subspaces of the spaces $C^0(\mathcal{M}, T\mathcal{M})$, $C^\alpha(\mathcal{M}, T\mathcal{M})$, respectively, that also satisfy $v(\bar{x}) \in T_{\bar{x}}\mathcal{M}$.

Definition 3.8. Let $h : \mathcal{M} \to \mathbb{R}^n$ be a function. For the convenience of later use, $\mathbb{R}^n$ can be considered as a finite dimensional Banach space. Define

$$\gamma_{\alpha,i}(h) = \sup_{(x_j)_{j \neq i}}\ \sup_{(z_j = y_j = x_j)_{j \neq i},\ y_i \neq z_i} \frac{\|h(\bar{y}) - h(\bar{z})\|}{d^\alpha(y_i, z_i)}.$$

Clearly, $\gamma_{\alpha,i}(h)$ is a semi-norm. We use $\alpha = \mathrm{Lip}$ to denote the case when $\alpha = 1$. Later, we will use this semi-norm to define spaces of $C^r$ diffeomorphisms of $\mathcal{M}$ with local interactions. When $h$ is a function from $\mathcal{M}$ to $L(\mathbb{R}^n)$, the space of all linear operators on $\mathbb{R}^n$, the norm $\|\cdot\|$ on $L(\mathbb{R}^n)$ is the one induced by the norm in $\mathbb{R}^n$. The definition can be adapted to the coordinate functions of sections of product bundles by taking a finite number of coordinate charts of $M$. If we fix a coordinate chart in $M_j$, $j \in \mathbb{Z}^d$, given a section $v$ of $T\mathcal{M}$, we can identify $v_j$, the $j$th variable of $v = (v_j)$, with a function from $\mathcal{M}$ to $\mathbb{R}^n$ and take the maximum of the norm of the function corresponding to each chart as the norm of $v_j$. From now on, we assume that a fixed finite coordinate chart of $M$ is chosen.

Definition 3.9. Given a decay function $\Gamma$ (see Definition 3.1), we introduce the following Banach spaces for $0 \le \alpha \le \mathrm{Lip}$,

$$C^\alpha_\Gamma(\mathcal{M}, T\mathcal{M}) = \Big\{ v : \mathcal{M} \to T\mathcal{M},\ v \in C^0_S,\ \|v\|_{C^\alpha_\Gamma} \equiv \max\big\{\|v\|_{C^0},\ \sup_{i,j \in \mathbb{Z}^d} \gamma_{\alpha,j}(v_i)\,\Gamma^{-1}(i-j)\big\} < \infty \Big\}.$$

Remark 3.10. Note that, even though the norm $\|\cdot\|_{C^\alpha_\Gamma}$ depends on the choice of the finite coordinate chart of $M$, all such norms are equivalent and thus define the same Banach space. Again, we emphasize that we choose a finite chart on $M$ and not charts on $\mathcal{M}$.

Now we define the Banach space of differentiable sections.

Definition 3.11. For each $r \in \mathbb{N}$, we denote

$$C^r_\Gamma(\mathcal{M}, T\mathcal{M}) = \Big\{ v \in C^0_S : \mathcal{M} \to T\mathcal{M},\ \|v\|_{C^r_\Gamma} \equiv \max_{0 \le k \le r}\ \sup_{i_1, \ldots, i_k \in \mathbb{Z}^d}\ \sup_{i \in \mathbb{Z}^d} \Big\|\frac{\partial^k}{\partial x_{i_1} \cdots \partial x_{i_k}} v_i(\bar{x})\Big\|_{C^0} \max\{\Gamma^{-1}(i-i_1), \ldots, \Gamma^{-1}(i-i_k)\} < \infty \Big\}.$$
Note that the derivative $\frac{\partial}{\partial x_{i_1}} v_i(\bar{x})$ (which we will denote by $\partial_{i_1} v_i$, for short) is a linear operator on the tangent space $TM$. Its norm can be defined using the norm induced by the Finsler metric. The norm for the multilinear operator $\frac{\partial^k}{\partial x_{i_1} \cdots \partial x_{i_k}} v_i(\bar{x})$ is similarly defined.

Using the Riemannian geometry exponential map, we can define spaces of maps close to the identity completely analogous to those of vector fields. For every map $G : \mathcal{M} \to \mathcal{M}$ close to the identity map, we identify it with the map $\tilde{G} : \mathcal{M} \to T\mathcal{M}$, $\tilde{G}(\bar{x}) \equiv A_{\bar{x}}^{-1} G(\bar{x})$, where $A_{\bar{x}} = \otimes_{i \in \mathbb{Z}^d} A_{x_i}$ denotes the exponential map from $T_{\bar{x}}\mathcal{M}$ to $\mathcal{M}$ induced by the Riemannian metric $g$. It is easy to see that the map $G \to \tilde{G}$ is one-to-one and the identity map corresponds to the zero section. Under the norms just defined, we obtain open sets in these corresponding Banach spaces that are open neighborhoods of the identity map. Similar definitions of Banach spaces work for maps from $\mathcal{M}$ to other bundles over $\mathcal{M}$ that are direct sums of bundles on $M$ on which we have a metric. In particular, the definition can be extended to the Grassmannian of the tangent bundle, which will be used when we prove the smooth dependence of stable and unstable subbundles. We state several simple properties about the norm just defined. The proofs are omitted since they are straightforward verifications.

Lemma 3.12.
1. If $G$ is a $C^1_\Gamma$ map close to identity, then $G$ is $C^\alpha_\Gamma$ for any $0 \le \alpha \le \mathrm{Lip}$.
2. Let $h$ be a $C^\alpha_\Gamma$ map on $\mathcal{M}$ and let $G$ be $C^1_\Gamma$. Then, the composition $G \circ h$ is $C^\alpha_\Gamma$ and $\|G \circ h\|_{C^\alpha_\Gamma} \le \|G\|_{C^1_\Gamma} \|h\|_{C^\alpha_\Gamma}$.
3. $G$ is $C^r_\Gamma$, $r > 1$, if and only if $DG$ is $C^{r-1}_\Gamma$. When $h$ is a $C^\alpha_\Gamma$ vector field on $\mathcal{M}$ and $G$ is $C^2_\Gamma$, the vector field $DG \cdot h$ is $C^\alpha_\Gamma$ and $\|DG \cdot h\|_{C^\alpha_\Gamma} \le \|DG\|_{C^1_\Gamma} \|h\|_{C^\alpha_\Gamma}$.
3.4. Regularity properties of the composition of maps. Next, we start to study the regularity of the composition of maps. The following result establishes continuity. Presumably, it is not optimal and one can use almost one derivative less. Nevertheless, it does not seem worth the effort to investigate this question at the present time.

Lemma 3.13. The mapping $\mathcal{C}$ defined by $\mathcal{C}(G, h) = G \circ h$ is Lipschitz when it is considered as
(1) $\mathcal{C} : C^1_\Gamma \times C^0 \to C^0$;
(2) $\mathcal{C} : C^2_\Gamma \times C^\alpha_\Gamma \to C^\alpha_\Gamma$.

Proof. Lipschitz in $G$ is obvious since the map $\mathcal{C}$ is linear in $G$ and clearly bounded. To establish the first conclusion, we have the following estimation:

$$\|G \circ h - G \circ \tilde{h}\| = \sup_{i,\bar{x}} \|G_i \circ h(\bar{x}) - G_i \circ \tilde{h}(\bar{x})\| \le \sup_{i,\bar{x}} \sum_{j \in \mathbb{Z}^d} \|\partial_j G_i\| \sup_{j,\bar{x}} \|h_j(\bar{x}) - \tilde{h}_j(\bar{x})\|.$$
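Note that $\sum_{j \in \mathbb{Z}^d} \|\partial_j G_i\| \le \sum_{j \in \mathbb{Z}^d} \Gamma(i-j)\|G\|_{C^1_\Gamma}$. Then, we have:

$$\|G \circ h - G \circ \tilde{h}\|_{C^0} \le \sum_{j \in \mathbb{Z}^d} \Gamma(i-j)\|G\|_{C^1_\Gamma} \|h - \tilde{h}\|_{C^0}.$$

To prove the second conclusion, we start by estimating $\|\partial_j G_i(h(\bar{x})) - \partial_j G_i(h(\bar{y}))\|$, where $\bar{x}$, $\bar{y}$ differ only at lattice site $l \in \mathbb{Z}^d$:

$$\begin{aligned}
\|\partial_j G_i(h(\bar{x})) - \partial_j G_i(h(\bar{y}))\| &\le \sum_{k \in \mathbb{Z}^d} \|\partial_{kj} G_i\| \, \|h_k(\bar{x}) - h_k(\bar{y})\| \\
&\le \sum_{k \in \mathbb{Z}^d} \Gamma(i-k)\|G\|_{C^2_\Gamma} \Gamma(k-l)\|h\|_{C^\alpha_\Gamma} d^\alpha(\bar{x}_l, \bar{y}_l) \\
&\le \|G\|_{C^2_\Gamma} \|h\|_{C^\alpha_\Gamma} \Gamma(i-l) d^\alpha(\bar{x}_l, \bar{y}_l).
\end{aligned}$$

Hence, we conclude that $(\partial_j G) \circ h \in C^\alpha_\Gamma$, and $\|(\partial_j G) \circ h\|_{C^\alpha_\Gamma} \le \|G\|_{C^2_\Gamma} \|h\|_{C^\alpha_\Gamma}$.

Now, we use the finite increment formula to estimate $\|G_i(h + \bar{h}) - G_i(h)\|$:

$$\begin{aligned}
\|G_i(h + \bar{h}) - G_i(h)\| &= \Big\|\int_0^1 DG_i \circ (h + s\bar{h})\,\bar{h}\, ds\Big\|_{\alpha,j} \le \sup_s \|DG_i \circ (h + s\bar{h})\,\bar{h}\| \\
&\le \sum_{k \in \mathbb{Z}^d} \|\partial_k G_i \circ (h + s\bar{h})\,\bar{h}_k\| \\
&\le \sum_{k \in \mathbb{Z}^d} \|G\|_{C^2_\Gamma} \|h\|_{C^\alpha_\Gamma} \Gamma(i-k) \|\bar{h}_k\|_{\alpha,j} \\
&\le \sum_{k \in \mathbb{Z}^d} \|G\|_{C^2_\Gamma} \|h\|_{C^\alpha_\Gamma} \Gamma(i-k)\Gamma(k-j) \|\bar{h}\|_{C^\alpha_\Gamma} \\
&\le \|G\|_{C^2_\Gamma} \|h\|_{C^\alpha_\Gamma} \Gamma(i-j) \|\bar{h}\|_{C^\alpha_\Gamma}.
\end{aligned}$$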
This proves the second conclusion. □

Once we have that the composition is continuous in those spaces, it is easy to prove higher order differentiability. Again, we note that the proof just given could go through under somewhat more general hypotheses of regularity. To obtain Lipschitz continuity in the first case, we only need $\sum_i \Gamma(i) < \infty$, and the second property of a decay function $\Gamma$ is not needed. Moreover, we do not really need Lipschitz continuity to prove the differentiability in the lemma below but only continuity.

Lemma 3.14. The mapping $\mathcal{C}$ defined by $\mathcal{C}(G, h) = G \circ h$ is $C^1$ when it is considered as
(1) $\mathcal{C} : C^2_\Gamma \times C^0 \to C^0$;
(2) $\mathcal{C} : C^3_\Gamma \times C^\alpha_\Gamma \to C^\alpha_\Gamma$.
Moreover, we have the following derivative formulas:

$$D_1\mathcal{C}(G, h)\bar{G} = \bar{G} \circ h; \qquad D_2\mathcal{C}(G, h)\bar{h} = (DG \circ h)\bar{h}.$$

Proof. The first conclusion is obvious since the map is linear in the first variable. To prove the second, we note that, by the finite increment formula, we have

$$G(h + \bar{h}) - G(h) = \int_0^1 DG(h + s\bar{h})\,\bar{h}\, ds.$$

Applying Lemma 3.13 to $DG$, we have that $DG$ is Lipschitz in the norm $\|\cdot\|_{C^\alpha_\Gamma}$ when $G \in C^3_\Gamma$. Thus, we have the following estimation:

$$\begin{aligned}
\|G(h + \bar{h}) - G(h) - (DG(h))\bar{h}\|_{C^\alpha_\Gamma} &= \Big\|\int_0^1 (DG(h + s\bar{h}) - DG(h))\,\bar{h}\, ds\Big\|_{C^\alpha_\Gamma} \\
&\le \sup_{0 \le s \le 1} \|DG(h + s\bar{h}) - DG(h)\|_{C^\alpha_\Gamma} \|\bar{h}\|_{C^\alpha_\Gamma} \le L_1 (\|\bar{h}\|_{C^\alpha_\Gamma})^2,
\end{aligned}$$

where $L_1$ denotes the Lipschitz constant for $DG$. This proves the $C^1$-differentiability of composition in the sense indicated. □

Corollary 3.15. The mapping $\mathcal{C}$ defined by $\mathcal{C}(G, h) = G \circ h$ is $C^r$ when it is considered as
(1) $\mathcal{C} : C^{r+1}_\Gamma \times C^0 \to C^0$;
(2) $\mathcal{C} : C^{r+2}_\Gamma \times C^\alpha_\Gamma \to C^\alpha_\Gamma$.
Moreover, the derivatives are what one would obtain by formal computation.

Proof. The proof can be obtained easily by induction based on Lemmas 3.13 and 3.14. □

3.5. Structural stability with smooth dependence on the map. Now we state and prove the result on the smooth dependence of the conjugating map $h_G$.

Theorem 3.16. For any $C^r$, $r \ge 1$, hyperbolic map $f$ of $M$ with a hyperbolic set $\Lambda$ and a decay function $\Gamma$, there exists $\epsilon > 0$ (we will make $\epsilon$ somewhat explicit in terms of properties of $F$ at the end of the proof) such that if $G$ is in the $C^{r+2}_\Gamma$ $\epsilon$-neighborhood of the identity map $\mathrm{Id}$ from $\mathcal{M} \to \mathcal{M}$, then the coupled map $\Phi = G \circ F$ is topologically conjugate to $F$ on the hyperbolic set $\Delta = \otimes_{i \in \mathbb{Z}^d} \Lambda$, where $\mathcal{M} = \otimes_{i \in \mathbb{Z}^d} M$ and $F = \otimes_{i \in \mathbb{Z}^d} f$: i.e., there exists a unique $h_G$ in the $C^0$-neighborhood of $\mathrm{Id}$ satisfying

$$\Phi \circ h_G = h_G \circ F.$$
Moreover,
1. $h_G$ is $C^\alpha_\Gamma$, where $0 < \alpha < 1$ is close to the Hölder exponent of stable and unstable invariant subspaces for the unperturbed map.
2. The map $G \to h_G$ is $C^r$ from $C^{r+2}_\Gamma$ to $C^\alpha_\Gamma$.

Remark 3.17. We are not claiming, at the moment, that $h_G$ is a homeomorphism. In finite dimensions this requires some arguments based either on index theory or on the fact that $\mathrm{Id} - \Phi_*$ is also invertible on $C^0$ sections. This requires some adaptation in the infinite dimensional case, which we will not reproduce here since we do not need this property in the considerations of smooth dependence of invariant measures. For details of the proof that $h_G$ is indeed a homeomorphism even in infinite dimensional situations, see [Jia95].

Proof. The results of the theorem are conclusions from the Implicit Function Theorem. Define a nonlinear function $\mathcal{L} : C^{r+2}_\Gamma \times C^\alpha_\Gamma \to C^\alpha_\Gamma$ by:

$$\mathcal{L}(G, h)(\bar{x}) = h(\bar{x}) - A_{\bar{x}}^{-1} \Phi\big(A_{F^{-1}(\bar{x})} h(F^{-1}(\bar{x}))\big).$$

Note that the first argument of $\mathcal{L}$ is a diffeomorphism and the second is a section. For typographical reasons, identifying sections and the homeomorphisms that they generate via the exponential mapping $A_{\bar{x}}$, we will write the map as:

$$\mathcal{L}(G, h) = h - G \circ F \circ h \circ F^{-1}.$$

Lemma 3.14 implies that $\mathcal{L}$ is differentiable in both arguments $G$ and $h$. Note that $\mathcal{L}(\mathrm{Id}, 0) = 0$. Moreover, we have the following proposition on the invertibility of the derivative operator:

Proposition 3.18. $D_2\mathcal{L}(\mathrm{Id}, 0) = \mathrm{Id} - F_*$ is invertible as an operator on the space of sections $C^\alpha_\Gamma$ when $\alpha \le \alpha^*$, where $0 < \alpha^* \le 1$ is the Hölder exponent of the stable and unstable invariant subbundles for $Df$.

Proof. Note that the equation for $\xi$ given $\eta$,

$$(\mathrm{Id} - F_*)\xi = \eta,$$

can be written component-wise on each of the copies of the manifold as:

$$\xi_i(\ldots, x_{i-1}, x_i, x_{i+1}, \ldots) - Df|_{f^{-1}(x_i)}\, \xi_i(\ldots, x_{i-1}, f^{-1}(x_i), x_{i+1}, \ldots) = \eta_i(\ldots, x_{i-1}, x_i, x_{i+1}, \ldots).$$

In this equation, we can consider the variables $x_j$, $j \neq i$, as parameters. The theory of finite dimensional hyperbolic systems (see e.g. [KKPW90]) tells us that if we fix $x_j$ for $j \neq i$ the equation is solvable and, for $0 \le \alpha \le \alpha^*$, with $\alpha^* > 0$ depending only on $f$, we have

$$\|\xi_i(\ldots, x_{i-1}, \cdot, x_{i+1}, \ldots)\|_{C^\alpha(M, TM)} \le K \|\eta_i(\ldots, x_{i-1}, \cdot, x_{i+1}, \ldots)\|_{C^\alpha(M, TM)}. \qquad (10)$$
Now, recalling that by the definition of $C^\alpha_\Gamma$, we have

$$\|\eta_i(\ldots, x_{i-1}, \cdot, x_{i+1}, \ldots, x_j, \ldots) - \eta_i(\ldots, x_{i-1}, \cdot, x_{i+1}, \ldots, \bar{x}_j, \ldots)\|_{C^0(M, TM)} \le \|\eta\|_{C^\alpha_\Gamma} \Gamma(i-j)\, d(x_j, \bar{x}_j)^\alpha$$

and by (10) we have:

$$\|\xi_i(\ldots, x_{i-1}, \cdot, x_{i+1}, \ldots, x_j, \ldots) - \xi_i(\ldots, x_{i-1}, \cdot, x_{i+1}, \ldots, \bar{x}_j, \ldots)\|_{C^0(M, TM)} \le K \|\eta\|_{C^\alpha_\Gamma} \Gamma(i-j)\, d(x_j, \bar{x}_j)^\alpha. \qquad (11)$$

By the definition of $C^\alpha_\Gamma$, (10) and (11) imply that $\|\xi\|_{C^\alpha_\Gamma} \le K \|\eta\|_{C^\alpha_\Gamma}$. This finishes the proof of Proposition 3.18. □

Thus, the Implicit Function Theorem implies that there exists $\epsilon > 0$ such that, for any $G$ in the $C^{r+2}_\Gamma$ $\epsilon$-neighborhood of $\mathrm{Id}$, there exists a unique map $h_G$ in the $C^\alpha_\Gamma$ neighborhood of $\mathrm{Id}$ such that $\mathcal{L}(G, h_G) = 0$, i.e., $G \circ F \circ h_G = \Phi \circ h_G = h_G \circ F$. By Corollary 3.15, when $G$ is $C^{r+2}$, the map $G \to h_G$ is $C^r$. □

4. Smooth Dependence of Invariant Hyperbolic Splittings

To state and prove our results precisely we will need to endow the space of sections of the Grassmannian bundle $\mathcal{G}$ with a Banach manifold structure that also captures the ideas of locality and translation invariance. Similar treatments can be found in [PS91]. We will follow the standard practice in hyperbolic theory of representing linear subspaces close to a given one as the graph of a linear map from this space to its complement (see e.g., [HP70]). A section of the Grassmannian bundle close to the stable subspace $E^s(\bar{x})$ can be identified with a section of the space of linear bundle maps from $E^s(\bar{x})$ to $E^u(\bar{x})$, the unstable subspace. That is, given a family of linear maps

$$H^{s,u}(\bar{x}) : E^s(\bar{x}) \to E^u(\bar{x}),$$

we associate to it the family of spaces $\mathrm{Gr}(H^{s,u}(\bar{x})) \subset T_{\bar{x}}\mathcal{M}$,

$$\mathrm{Gr}(H^{s,u}(\bar{x})) = \{(v, H^{s,u}(\bar{x})v) \mid v \in E^s(\bar{x})\}.$$

It is easy to see, and indeed standard, that we can identify sections of the Grassmannian in $E^s \oplus E^u$ close to $E^s$ and sections of linear bundle maps from $E^s$ to $E^u$. Therefore, if we give spaces of sections of linear bundle maps appropriate Banach norms, which capture the notions of locality, invariance, regularity, we can define a Banach manifold structure in sections of the Grassmannian bundle which captures the same ideas.
Analogous definitions and considerations work, of course, for perturbations of the standard bundle. So, our next task will be to introduce appropriate spaces of sections of linear bundle maps that capture the desired properties.
Recalling that $T_{\bar{x}}\mathcal{M} = \oplus_{j \in \mathbb{Z}^d} T_{x_j}M$, we decompose the linear map $H^s(\bar{x})$ into blocks $H^s_{ij}(\bar{x}) : T_{x_j}M \to T_{x_i}M$ defined by

$$H^s_{ij}(\bar{x}) = \Pi_{T_{x_i}M} H^s \big|_{T_{x_j}M},$$

where $\Pi_{T_{x_i}M}$ is the projection associated to the direct sum decomposition $T_{\bar{x}}\mathcal{M} = \oplus_{j \in \mathbb{Z}^d} T_{x_j}M$. Given a decay function $\Gamma$ we define

$$\|H^s(\bar{x})\|_\Gamma = \sup_{\bar{x}, i, j} \|H^s_{ij}(\bar{x})\| \, \Gamma^{-1}(i-j).$$

This is clearly a Banach norm in a space of linear maps $E^s(\bar{x}) \to E^u(\bar{x})$. Note that the fact that this norm is finite captures the idea that when $j$ and $i$ are very far apart, the $j$th coordinate of $v$ has very little influence on the $i$th coordinate of $H^s(\bar{x})v$. We denote by $L_\Gamma$ the space of linear maps with finite $\|\cdot\|_\Gamma$. Similarly, we define $L^{s,u}_\Gamma$ as the maps from $E^s$ to $E^u$ with finite norm $\|\cdot\|_\Gamma$. We use similar notation for maps from one subbundle to another, and we use the notation $L^{\alpha,\beta}_\Gamma$ to refer to the possibilities. It will be very important to note that, because of the properties of decay functions, the norm $\|\cdot\|_\Gamma$ makes the space $L_\Gamma$ into a Banach algebra. Moreover, if $A \in L^{\alpha,\beta}_\Gamma$, $B \in L^{\beta,\gamma}_\Gamma$, we have $\|BA\|_\Gamma \le \|B\|_\Gamma \|A\|_\Gamma$.

If we introduce a trivialization of $E^s(\bar{x})$, $E^u(\bar{x})$ induced by a trivialization of $T_{x_i}M$ we can define $C^0(L_\Gamma)$, $C^\alpha(L_\Gamma)$ as the Banach spaces of continuous and Hölder sections, respectively. In particular, the $C^\alpha_\Gamma$ norm is defined by

$$\|H^s(\bar{x})\|_{C^\alpha_\Gamma} = \max\Big\{\sup_{\bar{x}, i, j} \|H^s_{ij}\| \, \Gamma^{-1}(i-j),\ \sup_{k, i, j} \gamma_{\alpha,j}(H^s_{ik}(\bar{x})) \, \Gamma^{-1}(i-j)\Big\}. \qquad (12)$$

We also introduce similarly spaces of Hölder sections for all the other bundles of linear maps considered above. We emphasize that we only use trivializations on finite dimensional objects and make them induce trivializations in the infinite dimensional ones. We also emphasize that the trivializations are only used to introduce Hölder norms. The objects that we are considering – sections of linear bundle maps – have an intrinsic meaning. All the functional equations we will derive are expressed in terms of geometric objects and their natural operations. Of course, the fact that an operator is a contraction, etc. will be expressed in norms that use trivializations. Again, it is important to note that the spaces with the norms (12) are Banach algebras under the pointwise multiplication of operators because of properties of decay functions. This uses the fact that the space of Hölder functions taking values in a Banach algebra, with a Hölder norm such as the one we have introduced, is a Banach algebra under pointwise multiplication.

Translation invariance can be readily incorporated into consideration. We note that the translation $S^k$ introduced in Sect. 2 induces a transformation (still denoted by $S^k$) on linear maps such that

$$(S^k H^s)_{ij}(\bar{x}) = H^s_{i+k,\,j+k}(S^k \bar{x}).$$
Clearly, $S^k$ does not change the $C^0_\Gamma$, $C^\alpha_\Gamma$ norms of a section. The space of sections which are invariant under the translations $S^k$ is a closed subspace of $C^0_\Gamma$, $C^\alpha_\Gamma$. We will denote them by $C^0_{\Gamma,S}$, $C^\alpha_{\Gamma,S}$ respectively.

4.1. Smooth dependence of the hyperbolic splitting. Following the lead of [Mn90], we will find it convenient to study the hyperbolic splittings for $\Phi$ evaluated at the point of the conjugacy given by structural stability. That is, we will study $E^s(h(\bar{x})) \oplus E^u(h(\bar{x}))$. Since $h$ is close to $\mathrm{Id}$, we can use the same trivialization chart containing both points $\bar{x}$ and $h(\bar{x})$. Thus, the tangent space $T_{h(\bar{x})}\mathcal{M}$ is identified with $T_{\bar{x}}\mathcal{M}$, $\bar{x} \in \Delta$. Therefore, we can apply the derivative operator $D\Phi$ to sections over $\Delta$. The action on sections denoted by $\mathcal{T}$ is defined by

$$\mathcal{T}v(\bar{x}) = D\Phi\big|_{h \circ F^{-1}(\bar{x})}\, v(F^{-1}(\bar{x})).$$

This operator has the advantage that $\mathcal{T}v(\bar{x})$ always depends on $v(F^{-1}(\bar{x}))$, independent of what $\Phi$ is. This is an important advantage with respect to the naively more natural push forward. See for example [Mn90,Rue97] for a more geometric justification. The main result of this section is the following theorem.

Theorem 4.1. For any $C^r$, $r \ge 4$, hyperbolic map $f$ of $M$ with a hyperbolic set $\Lambda$ and a decay function $\Gamma$, there exists $\epsilon > 0$ such that if $G$ is in an $\epsilon$ $C^r_\Gamma$-neighborhood of the identity map $\mathrm{Id}$ from $\mathcal{M} \to \mathcal{M}$, then, denoting as usual $F = \otimes_{i \in \mathbb{Z}^d} f$, $\Phi = G \circ F$, we have:
1. there exists a hyperbolic splitting of $T\mathcal{M}$ over the hyperbolic set $\Delta_\Phi = h_G(\Delta_F)$:

$$T\mathcal{M} = E^s_\Phi(h_G(\bar{x})) \oplus E^u_\Phi(h_G(\bar{x}))$$

invariant under the derivative operator $D\Phi = DG \circ F \cdot DF$;
2. $E^u_\Phi(h_G(\bar{x}))$ are $C^\alpha_\Gamma$ sections of the Grassmannian close to $E^u_F(\bar{x})$ in $C^\alpha_\Gamma$ norm. A similar result holds for $E^s_\Phi(h_G(\bar{x}))$;
3. $D\Phi$ is expanding on $E^u_\Phi(h_\Phi(\bar{x}))$ and contracting on $E^s_\Phi(h_\Phi(\bar{x}))$ in both $C^0$ and $C^\alpha_\Gamma$ norms;
4. the map from $G \in C^r_\Gamma \to E^{u,s}_\Phi(h_G(\bar{x}))$ is $C^{r-3}$ when the sections of the Grassmannian are given the $C^\alpha_\Gamma$ norm.
Moreover, if $G$ commutes with translations, so do the invariant subspaces.

Proof. We will just give the proof for the unstable bundle. The result for the stable bundle follows by applying the result on unstable bundles to $F^{-1}$. (Of course, a direct proof for the stable bundle follows along the same lines as the proof for the unstable one.) We consider the operator $D\Phi \circ h$ in the components induced by the coordinates $E^s_F(\bar{x}) \oplus E^u_F(\bar{x})$, $E^s_F(F(\bar{x})) \oplus E^u_F(F(\bar{x}))$ in the domain and the range, respectively. It can be represented by a matrix

$$D\Phi \circ h(\bar{x}) = \begin{pmatrix} A^{ss}_G(\bar{x}) & A^{su}_G(\bar{x}) \\ A^{us}_G(\bar{x}) & A^{uu}_G(\bar{x}) \end{pmatrix}.$$

We note that $A^{su}_F = 0$, $A^{us}_F = 0$ because the splitting is invariant under $F$, and $A^{ss}_F$, $A^{uu}_F$ are block diagonal matrices.
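Notice that since $A_F$ is a direct product of finite dimensional systems, we have that

$$\|A^{uu}_F\|_{C^0} = \|Df^{uu}\|_{C^0(M)}, \quad \|A^{ss}_F\|_{C^0} = \|Df^{ss}\|_{C^0(M)}, \quad \|A^{uu}_F\|_{C^\alpha_\Gamma} = \|Df^{uu}\|_{C^\alpha(M)}, \quad \|A^{ss}_F\|_{C^\alpha_\Gamma} = \|Df^{ss}\|_{C^\alpha(M)}. \qquad (13)$$

Therefore, when $G$ is in a sufficiently small $C^r_\Gamma$-neighborhood, $r \ge 4$, of the identity, we can assume

$$\|A^{su}_G\|_{C^0_\Gamma} \le \epsilon, \quad \|A^{us}_G\|_{C^0_\Gamma} \le \epsilon, \quad \|A^{ss}_G\|_{C^0_\Gamma} \le \mu < 1, \quad \|(A^{uu}_G)^{-1}\|_{C^0_\Gamma} \le \mu < 1 \qquad (14)$$

with $\epsilon$ arbitrarily small and $\mu$ as close to $\lambda$ as desired. We will consider sections of Grassmannians close to $E^u_F$ as the graph of a section of linear maps $E^u_F(\bar{x}) \to E^s_F(\bar{x})$. Given $U_{\bar{x}} \in L^u_\Gamma$, the space of linear maps from $E^u_F(\bar{x}) \to E^s_F(\bar{x})$, $\bar{x} \in \Delta$, we consider

$$\mathrm{Gr}\, U_{\bar{x}} = \{(U_{\bar{x}} v, v) \mid v \in E^u_F(\bar{x})\}.$$

Since

$$D\Phi\big|_{h(\bar{x})}(U_{\bar{x}} v, v) = \big([A^{ss}_G(\bar{x})U_{\bar{x}} + A^{su}_G(\bar{x})]v,\ [A^{uu}_G(\bar{x}) + A^{us}_G(\bar{x})U_{\bar{x}}]v\big),$$

we see that the graph of $U_{\bar{x}}$ is invariant if and only if

$$U_{F(\bar{x})}\,[A^{uu}_G(\bar{x}) + A^{us}_G(\bar{x})U_{\bar{x}}] = [A^{ss}_G(\bar{x})U_{\bar{x}} + A^{su}_G(\bar{x})].$$

Equivalently

$$U_{\bar{x}} = \big[A^{ss}_G(F^{-1}(\bar{x}))\,U_{F^{-1}(\bar{x})} + A^{su}_G(F^{-1}(\bar{x}))\big] \cdot \big[A^{uu}_G(F^{-1}(\bar{x})) + A^{us}_G(F^{-1}(\bar{x}))\,U_{F^{-1}(\bar{x})}\big]^{-1}. \qquad (15)$$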
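The fixed point structure of (15) can be seen in the simplest possible case. The sketch below is a toy illustration under strong assumptions that are not in the text: the base dynamics is trivial (a single fixed point, so $U_{F^{-1}(\bar{x})} = U_{\bar{x}}$), and all four blocks are scalars with hypothetical values satisfying bounds of the type (14). Iterating the right hand side of (15) then converges to the slope $U$ of the invariant unstable line, and $(U, 1)$ is an eigenvector of the block matrix with eigenvalue $A^{uu} + A^{us}U$.

```python
def graph_transform(u, a_ss, a_su, a_us, a_uu):
    """Right hand side of (15) for scalar blocks over a single fixed point."""
    return (a_ss * u + a_su) / (a_uu + a_us * u)


def invariant_slope(a_ss, a_su, a_us, a_uu, steps=200):
    """Iterate the graph transform starting from U = 0 (the unperturbed unstable line)."""
    u = 0.0
    for _ in range(steps):
        u = graph_transform(u, a_ss, a_su, a_us, a_uu)
    return u


# Hypothetical blocks: contraction 0.5, expansion 2.0, small off-diagonal coupling.
slope = invariant_slope(a_ss=0.5, a_su=0.1, a_us=0.05, a_uu=2.0)
expansion = 2.0 + 0.05 * slope  # action of the matrix on the invariant unstable line
```

In this toy case the iteration is a contraction because the stable block is smaller than the unstable one, which is the scalar analogue of the estimate on the operator norm used later in the proof.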
We now define a map $\mathcal{F}$: $(G, U) \in C^r \times C^\alpha(L^u_\Gamma) \to \mathcal{F}(G, U) \in C^\alpha(L^u_\Gamma)$:

$$\mathcal{F}(G, U) = U_{\bar{x}} - \big[A^{ss}_G(F^{-1}(\bar{x}))\,U_{F^{-1}(\bar{x})} + A^{su}_G(F^{-1}(\bar{x}))\big] \big[A^{uu}_G(F^{-1}(\bar{x})) + A^{us}_G(F^{-1}(\bar{x}))\,U_{F^{-1}(\bar{x})}\big]^{-1}. \qquad (16)$$

Note that this map can be defined when $U$ is in a neighborhood of 0 and $G$ is in a neighborhood of the identity: $A^{uu}_G \circ F^{-1}$ is boundedly invertible and $A^{us}_G$ has very small norm, so that, indeed, the second factor in (16) can be inverted. Note that $\mathcal{F}(\mathrm{Id}, 0) = 0$. Therefore, by the Implicit Function Theorem, Theorem 4.1 is proved once we show that the operator $\mathcal{F}$ defined by (16) is differentiable in $G$ and $U$ and $D_2\mathcal{F}$ is invertible at $(\mathrm{Id}, 0)$. We note that, by Corollary 3.15, the mappings

$$G \to A^{ss}_G \circ F^{-1},\ A^{su}_G \circ F^{-1},\ A^{uu}_G \circ F^{-1},\ A^{us}_G \circ F^{-1} \qquad (17)$$

are $C^{r-3}$ when $G$ is considered in the manifold with the $C^r_\Gamma$ topology and the target spaces with the $C^\alpha_\Gamma$ topology.
Note that the map $\mathcal{F}$ defined in (16) is formed out of the maps (17) by applying Banach algebra operations, which are, of course, analytic. Hence, the map $\mathcal{F}$ is $C^{r-3}$ when we consider it as a map of spaces of sections to spaces of sections endowed with the indicated topologies. Since we are assuming $r \ge 4$, this means that the map is $C^1$. To apply the implicit function theorem in the indicated spaces, we only need to check that the derivative with respect to the second argument is invertible at the origin. Given the form of the derivative of composition, and the well known formulas for the derivatives of the Banach algebra operations, we have:

$$D_2\mathcal{F}(\mathrm{Id}, 0)U_{\bar{x}} = U_{\bar{x}} - A^{ss}_F(F^{-1}(\bar{x}))\,U_{F^{-1}(\bar{x})}\,\big(A^{uu}_F(F^{-1}(\bar{x}))\big)^{-1}. \qquad (18)$$

To finish the proof of Theorem 4.1, we just need to show that when $0 < \alpha < 1$ is appropriately chosen, the operator in (18) is invertible. We will do it by showing that the norm of

$$\mathcal{S} : U_{\bar{x}} \to A^{ss}_F(F^{-1}(\bar{x}))\,U_{F^{-1}(\bar{x})}\,\big(A^{uu}_F(F^{-1}(\bar{x}))\big)^{-1}$$

is strictly smaller than 1. We note that the operator $U \to U_{F^{-1}}$ acting on $C^\alpha_\Gamma$ has norm $\|DF^{-1}\|^\alpha_{C^0}$. This norm is equal to $\|Df^{-1}\|^\alpha_{C^0(M)}$. Hence, recalling that $C^\alpha_\Gamma$ is a Banach algebra and the expressions for norms (13), we have that:

$$\|\mathcal{S}\| \le \|DF^{-1}\|^\alpha_{C^0(M)}\, \|(A^{uu}_F)^{-1}\|_{C^\alpha_\Gamma}\, \|A^{ss}_F\|_{C^\alpha_\Gamma} = \|Df^{-1}\|^\alpha_{C^0(M)}\, \|(Df^{uu})^{-1}\|_{C^\alpha(M)}\, \|Df^{ss}\|_{C^\alpha(M)}. \qquad (19)$$

Clearly, for $\alpha > 0$ adequately small, the RHS of (19) is smaller than 1. Indeed, the condition that $\alpha$ makes the RHS of (19) smaller than 1 is the same condition for the regularity of the splitting applying the invariant section theorem of [HP70]. □

To prepare for the proof of smooth dependence of the equilibrium state, we need further to express the action of $D\Phi$ on the invariant unstable subbundle $E^u_\Phi(h_G(\bar{x}))$ in terms of an infinite matrix. Because of the invariance, there exists an infinite matrix $B = (b_{ij})_{i,j \in \mathbb{Z}^d}$ such that

$$\begin{pmatrix} A^{ss}_G(\bar{x}) & A^{su}_G(\bar{x}) \\ A^{us}_G(\bar{x}) & A^{uu}_G(\bar{x}) \end{pmatrix} \begin{pmatrix} U_{\bar{x}} \\ \mathrm{Id} \end{pmatrix} = \begin{pmatrix} U_{F(\bar{x})} \\ \mathrm{Id} \end{pmatrix} B.$$

Clearly, the matrix $B$ is the matrix representation of $D\Phi$ along the invariant unstable subbundle $E^u_\Phi(h_G(\bar{x}))$ under the chosen bases. So, we have

$$B = A^{uu}_G(\bar{x}) + A^{us}_G(\bar{x})U_{\bar{x}}.$$

Note that $D\Phi = DG \cdot DF$. If we write $DG$ blockwise according to the hyperbolic splitting, we have

$$A^{uu}_G(\bar{x}) = G^{uu}(F \circ h_G(\bar{x}))\,(D^u f(h_G(\bar{x}))),$$

where $G^{uu}$ denotes the block matrix corresponding to the action of $DG$ on the unstable subbundle $E^u_F(h_G)$ and $(D^u f(h_G(\bar{x})))$ is a diagonal block matrix with $D^u f((h_G)_i(\bar{x}))$ on the main diagonal. Note that $\|G^{uu} - \mathrm{Id}\|_{C^r_\Gamma}$ is small and $\|h_G - \mathrm{Id}\|_{C^\alpha_\Gamma}$ is small. We can rewrite the matrix in the form

$$B = (D^u f(x_i))(\mathrm{Id} + A_G). \qquad (20)$$
By the theorem just proved, we have that the map G → AG is C r−3 with respect to the C0r norm for G and C0α norm for AG . Further more, kAG kC0α ≤ δ(), where = kGkC0r and lim→0 δ() = 0. 5. Smooth Dependence of Potential Functions on Infinite Matrices We first describe the Banach space containing all the infinite matrices that are matrix representations of D u 8, the restriction of the derivative D8 on the unstable spaces. We denote this space by B. An infinite matrix function A(x), ¯ x¯ ∈ 1F is an element of B if it satisfies the following criteria: ¯ i,j ∈Zd is an infinite matrix. Each entry aij (x) ¯ is a matrix function (1) A(x) ¯ = (aij (x)) of finite size p × p, where p is the dimension of the unstable space of Df . ¯ we define its norm k · k using the following formula: (2) For matrix functions aij (x), ¯ = max kaij (x)k
sup |(aij (x)) ¯ kl |.
1≤k,l≤p x∈1 ¯ F
¯ is defined by (12) in the previous section. We recall the C0α -norm for A(x) ( kA(x)k ¯ = max
¯ sup kaij (x)k0
−1
ij ∈Zd
(i − j ), sup γα,j (aik (x))0 ¯ k,i,j
−1
)
(i − j ) .
An infinite matrix $A(\bar x)$ belongs to $\mathcal{B}$ iff $\|A(\bar x)\| < \infty$. It is easy to check that $\mathcal{B}$ is a Banach space. In order to have a spatial translation invariant SRB measure, it is necessary that $\Phi$ be spatial translation invariant. Note that we do not need to assume the spatial translation invariance until the very last section, when we introduce equilibrium states for potential functions.

Next, we describe the Banach space of potential functions defined on the hyperbolic set $\Delta_F$. Let $\psi(\bar x)$ be a real function on $\Delta_F$. We define its norm

$$\|\psi(\bar x)\| = \max\Big\{ \sup_{\bar x \in \Delta_F} |\psi(\bar x)|,\ \gamma_{\alpha,j}(\psi)\,\Gamma^{-1}(j) \Big\}.$$

We denote by $\mathcal{H}$ the space of all such functions with finite norm. It is also a Banach space. In order to define the map from the infinite matrix space $\mathcal{B}$ to the potential function space $\mathcal{H}$, we need to define a special linear functional on $\mathcal{B}$, $tr_0 : \mathcal{B} \to \mathcal{H}$:

$$tr_0(A(\bar x)) = \mathrm{trace}(a_{00}(\bar x)).$$

The map $L : \mathcal{B} \to \mathcal{H}$ is defined by

$$\psi_A(\bar x) =: L(A(\bar x)) =: \sum_{k=1}^{\infty} \frac{(-1)^k}{k}\, tr_0(A^k(\bar x)). \tag{21}$$
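The series in (21) can be explored numerically on a finite truncation. The following is a minimal sketch, assuming scalar blocks ($p = 1$) and only two lattice sites, so that $A$ is an ordinary $2 \times 2$ matrix; the names `mat_mul` and `psi_series` and the sample matrix are hypothetical and not part of the paper. In this finite setting, summing the per-site series over all sites gives $\sum_k \frac{(-1)^k}{k}\mathrm{trace}(A^k) = -\log\det(Id + A)$, which the sketch uses as a consistency check.

```python
import math

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def psi_series(A, site, terms=200):
    """Partial sum of sum_{k>=1} (-1)^k / k * (A^k)[site][site].

    With scalar blocks this is the analogue of formula (21), where the
    functional tr_0 reads off the diagonal entry at the chosen site.
    """
    power = [row[:] for row in A]  # holds A^k, starting with k = 1
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** k / k * power[site][site]
        power = mat_mul(power, A)
    return total

# Hypothetical small perturbation (norm well below 1, so the series converges).
A = [[0.10, 0.05],
     [0.02, -0.20]]

per_site = [psi_series(A, s) for s in range(2)]
det_I_plus_A = (1 + A[0][0]) * (1 + A[1][1]) - A[0][1] * A[1][0]

# Summed over all sites, the series reproduces -log det(Id + A).
print(sum(per_site), -math.log(det_I_plus_A))
```

The check only covers the finite-dimensional analogue; the paper's statement concerns the infinite matrix space $\mathcal{B}$ with its decay-weighted norm.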
In [JP98], it is proven that when $\Phi = G \cdot F$ is $C^1$-close to $F$ and $G$ is at least $C^2$ and has the short-range property, a condition which is clearly satisfied when $\|G - I\|_{C^r_\Gamma}$ is small for the type of decay functions in Proposition 3.2, there exists a measure $\mu_\Phi$ on the hyperbolic set $\Delta_\Phi$ invariant under both $\Phi$ and the translation $S$. The measure is obtained as a thermodynamic limit of SRB-measures for $\Phi_V$, the finite dimensional approximations of $\Phi$. The pull-back measure

$$\tilde\mu_\Phi \equiv (h_G)^* \mu_\Phi \tag{22}$$

is an equilibrium state for the $\mathbb{Z}^{d+1}$-action generated by $(F, S)$ on $\Delta$ for a potential function $\varphi_\Phi$ and

$$\varphi_\Phi(h_G(\bar x)) = -\log J^u f(x_0) + \psi_{A_G}(\bar x), \tag{23}$$
where $A_G$ is defined by (20) and $\psi_A$ is defined in (21). Since $-\log J^u f(x_0)$ is independent of $\Phi = G \cdot F$, the next step is to show that $\psi_A(\bar x)$ depends on $A$ smoothly. In fact, we have the following theorem.

Theorem 5.1. The map $L$ is well-defined in a small neighborhood of the origin of the Banach space $\mathcal{B}$ and it is analytic.

Proof. Let $t$ be a real number. We have

$$L(tA(\bar x)) = \sum_{k=1}^{\infty} \frac{(-1)^k}{k}\, tr_0((tA(\bar x))^k) = \sum_{k=1}^{\infty} \frac{(-1)^k}{k}\, tr_0(A^k(\bar x))\, t^k.$$

It is clear that $tr_0(A_1 A_2 \cdots A_k)$ is a multilinear operator from $\mathcal{B}$ to $\mathcal{H}$. We need to show that it is bounded. We start with $k = 1$:

$$|tr_0(A(\bar x))| = |\mathrm{trace}(a_{00}(\bar x))| \le p\|A(\bar x)\|.$$

Assume that $\bar x$ and $\bar y$ differ only at $j \in \mathbb{Z}^d$. Then

$$|tr_0(A(\bar x)) - tr_0(A(\bar y))| = |\mathrm{trace}(a_{00}(\bar x) - a_{00}(\bar y))| \le p\|A(\bar x)\|\, d^\alpha(x_j, y_j)\,\Gamma(j).$$
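These two inequalities imply that $\|tr_0\| \le p$. When $k = 2$,

$$|tr_0(AB)| = \Big|\mathrm{trace} \sum_{i \in \mathbb{Z}^d} a_{0i} b_{i0}\Big| \le p\|A\|\|B\|.$$

For $\bar x$ and $\bar y$ with $x_i = y_i$, $i \ne j$, $i \in \mathbb{Z}^d$,

$$
\begin{aligned}
|tr_0(A(\bar x)B(\bar x)) - tr_0(A(\bar y)B(\bar y))|
&= \Big|\mathrm{trace} \sum_{i \in \mathbb{Z}^d} \big(a_{0i}(\bar x) b_{i0}(\bar x) - a_{0i}(\bar y) b_{i0}(\bar y)\big)\Big| \\
&\le \Big|\mathrm{trace} \sum_{i \in \mathbb{Z}^d} a_{0i}(\bar x)\big(b_{i0}(\bar x) - b_{i0}(\bar y)\big)\Big| + \Big|\mathrm{trace} \sum_{i \in \mathbb{Z}^d} \big(a_{0i}(\bar x) - a_{0i}(\bar y)\big) b_{i0}(\bar y)\Big| \\
&\le p \Big( \sum_{i \in \mathbb{Z}^d} \Gamma(i)\Gamma(i-j) + \sum_{i \in \mathbb{Z}^d} \Gamma(i)\Gamma(j) \Big) \|A\|\|B\|\, d^\alpha(x_j, y_j) \\
&\le 2ps\|A\|\|B\|\, d^\alpha(x_j, y_j)\,\Gamma(j),
\end{aligned}
$$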
where $s = \sum_{j \in \mathbb{Z}^d} \Gamma(j)$. Thus, we have $\|tr_0(AB)\| \le 2ps\|A\|\|B\|$. By the Banach algebra property, we have $\|tr_0(A_1 A_2 \cdots A_k)\| \le 2ps\|A_1\|\|A_2\| \cdots \|A_k\|$. Therefore, the operator $L$ is analytic in the open neighborhood $\{A : \|A\| < 1\} \subset \mathcal{B}$. □

6. Smooth Dependence of Equilibrium States on Potential Functions

In this last section we prove the smooth dependence of equilibrium states on their potential functions. The main tool used here is the transition from equilibrium states of coupled map lattices to invariant Gibbs states on higher dimensional lattice spin systems for corresponding potentials. The main results of this section were proved over the last few years in several related articles [BK95,JM96,BK97]. However, we shall give sufficient details to make this article self-contained and refer to [BK95,JM96,BK97] for some technical steps.

We first state the main result of this section. To make better contact with the existing literature in statistical mechanics, we will use Banach spaces of exponentially decaying interactions rather than interactions which decay according to the decay functions in Proposition 3.2, which we used in the geometric part of the argument. (Note that in the statistical mechanics part of the argument, we will not need the property of Banach algebra under multiplication that made the decay functions useful in the geometric arguments.) As we explained earlier, the results are equivalent.

We first define a new family of metrics $\rho_q$ on $\mathcal{M}$: for every $0 < q < 1$ and $\bar x, \bar y \in \mathcal{M}$,

$$\rho_q(\bar x, \bar y) = \sup_{i \in \mathbb{Z}^d} q^{|i|}\, d(x_i, y_i).$$
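Then, the potential function space is defined to be the space of all Hölder continuous functions with respect to the metric $\rho_q$:

$$\mathcal{H} = \Big\{ \varphi(\bar x) : \Delta \to \mathbb{R},\ \sup_{\bar x \ne \bar y} \frac{|\varphi(\bar x) - \varphi(\bar y)|}{\rho_q^\alpha(\bar x, \bar y)} < \infty \Big\}. \tag{24}$$

The norm on this Banach space is

$$\|\varphi\| = \sup_{\bar x \ne \bar y} \frac{|\varphi(\bar x) - \varphi(\bar y)|}{\rho_q^\alpha(\bar x, \bar y)}.$$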
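To make the metric $\rho_q$ and the Hölder quotient concrete, here is a minimal sketch on a finite window of the lattice. It assumes that configurations are stored as dictionaries from sites (tuples in $\mathbb{Z}^d$) to real numbers, that $|i|$ is the max-norm of the site, and that the single-site distance is $|a - b|$; these choices, and the names `rho_q` and `phi0`, are illustrative assumptions rather than definitions taken from the paper.

```python
def rho_q(x, y, q, dist=lambda a, b: abs(a - b)):
    """sup over sites i of q^{|i|} * d(x_i, y_i) on a finite window.

    x, y: dicts mapping sites (tuples of ints) to values.
    |i| is taken to be the max-norm of the site (an assumption).
    """
    sites = set(x) | set(y)
    return max(q ** max(abs(c) for c in i) * dist(x.get(i, 0.0), y.get(i, 0.0))
               for i in sites)

# Two configurations on the window {-3, ..., 3} of Z^1.
x = {(i,): 0.0 for i in range(-3, 4)}
y = dict(x)
y[(0,)] = 0.25   # differs at the origin
y[(3,)] = 0.5    # and at a distant site, which is damped by q^3

q, alpha = 0.5, 0.5
r = rho_q(x, y, q)

# A potential depending only on the coordinate at the origin.
phi0 = lambda cfg: cfg[(0,)]
quotient = abs(phi0(x) - phi0(y)) / r ** alpha

print(r, quotient)
```

The distant discrepancy contributes only $q^3 \cdot 0.5$, so the supremum is attained at the origin; this is the mechanism by which functions that depend weakly on far-away sites have finite norm in (24).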
By choosing an appropriate $0 < q < 1$, the inclusion map from the Banach space defined via a decay function $\Gamma$ to $\mathcal{H}$ is bounded, and thus analytic.

Let $\varphi_0(\bar x) \in \mathcal{H}$ be a potential function whose value depends only on the coordinate $x_0$, which means it is actually a Hölder continuous function on $\Lambda \subset M$. We denote the $\epsilon$-neighborhood of $\varphi_0(\bar x)$ in $\mathcal{H}$ by $O_\epsilon(\varphi_0)$. Let $\tau$ denote the $\mathbb{Z}^{d+1}$-action on $\Delta$ induced by the map $F$ and translations $S$. A measure $\mu$ invariant under $\tau$ is called an equilibrium state for $\varphi \in O_\epsilon(\varphi_0)$ if it satisfies the variational principle equation:

$$P_\tau(\varphi) = h_\tau(\mu) + \int \varphi\, d\mu,$$

where $P_\tau(\varphi)$ and $h_\tau(\mu)$ are the topological pressure for $\varphi$ and the measure theoretical entropy of $\mu$ with respect to $\tau$, respectively. The next theorem is a summary of results from [Jia95,JP98,BK97].

Theorem 6.1. For every $\varphi_0$ and constants $0 < \alpha, q < 1$ in the definition of the space $\mathcal{H}$ (introduced in (24)), there exists an $\epsilon > 0$ such that the equilibrium state $\mu_\varphi$ for every function $\varphi \in O_\epsilon(\varphi_0)$ is unique. This unique equilibrium state, denoted by $\mu_\varphi$, is mixing with respect to both the map $F$ and spatial translations $S$.

By the results of the previous sections, we conclude that the SRB-measure for $\Phi$ is unique and mixing with respect to $\Phi$ and $S$ when $\Phi$ satisfies the conditions in Theorem 4.1. To complete the proof of smooth dependence of the SRB-measure for $\Phi$, we only need to show that the equilibrium state depends smoothly on the potential function in $O_\epsilon(\varphi_0) \subset \mathcal{H}$. Let $\mathcal{H}^*$ denote the dual space of $\mathcal{H}$. Clearly, for any $\varphi \in O_\epsilon(\varphi_0) \subset \mathcal{H}$, $\mu_\varphi \in \mathcal{H}^*$. Now we state the main theorem of this section.

Theorem 6.2. For any $\varphi_0$ and constants $0 < \alpha, q < 1$ in the definition of the space $\mathcal{H}$, there exists an $\epsilon > 0$ such that the map from $O_\epsilon(\varphi_0) \subset \mathcal{H}$ to $\mathcal{H}^*$: $\varphi \to \mu_\varphi$ is $C^\infty$ Frechet differentiable.

The strategy to prove the theorem is to use the symbolic representation of the uncoupled map lattice. To prove the smooth dependence of equilibrium states is then equivalent to proving the smooth dependence of invariant Gibbs states on potentials, which is proven by showing that the topological pressure is $C^\infty$ Frechet differentiable in the corresponding potential space.

6.1. Symbolic representation of the uncoupled map lattice. Since we assume that $f$ possesses a locally maximal hyperbolic set $\Lambda$, there exists a Markov partition that induces a semi-conjugating map $\pi$ from a subshift of finite type $\Sigma_A$ onto $\Lambda$ [Bow75].
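The subshift is determined by an aperiodic matrix $A$ since $f$ is assumed to be topologically mixing. Let $\sigma_t$ denote the left shift map on the subshift. We have $f \circ \pi = \pi \circ \sigma_t$. Using this map $\pi$, we can obtain naturally a semi-conjugating map $\bar\pi = \otimes_{i \in \mathbb{Z}^d} \pi$ from $\otimes_{i \in \mathbb{Z}^d} \Sigma_A$ (denoted by $\Sigma_A^{\mathbb{Z}^d}$) onto the infinite dimensional hyperbolic set $\Delta = \otimes_{i \in \mathbb{Z}^d} \Lambda$ for the uncoupled map $F$. Let $\sigma_s$ denote the maps on $\Sigma_A^{\mathbb{Z}^d}$ induced by translations on $\mathbb{Z}^d$. We have

$$F \circ \bar\pi = \bar\pi \circ \otimes_{i \in \mathbb{Z}^d} \sigma_t, \qquad S \circ \bar\pi = \bar\pi \circ \sigma_s.$$

The corresponding metric $\rho_q$ on $\Sigma_A^{\mathbb{Z}^d}$ is defined by

$$\rho_q(\bar\xi, \bar\eta) = \sup_{i \in \mathbb{Z}^d,\, j \in \mathbb{Z}} q^{|i| + |j|}\, d(\xi_i(j), \eta_i(j)),$$

where $d(\cdot, \cdot)$ denotes the discrete distance on the space of finite symbols.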
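Let $\tilde{\mathcal{H}}$ denote the Banach space of all Hölder continuous functions on $\Sigma_A^{\mathbb{Z}^d}$. The norm is defined similarly:

$$\|\varphi\| = \max\Big\{ \sup_{\bar\xi \in \Sigma_A^{\mathbb{Z}^d}} |\varphi(\bar\xi)|,\ \sup_{\bar\xi \ne \bar\eta} \frac{|\varphi(\bar\xi) - \varphi(\bar\eta)|}{\rho_q^\alpha(\bar\xi, \bar\eta)} \Big\}.$$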
Proposition 6.3 ([Jia95]).
1. The map $\varphi \to \varphi \circ \bar\pi$ is a bounded linear operator.
2. The $\mathbb{Z}^{d+1}$-action topological pressures for both functions $\varphi$ and $\varphi \circ \bar\pi$ are equal.
3. For any $\varphi \in O_\epsilon(\varphi_0) \subset \mathcal{H}$, $\mu_\varphi$ is its unique equilibrium state if and only if $\mu^*_{\varphi \circ \bar\pi} = \bar\pi^*(\mu_\varphi)$, the pull-back measure under $\bar\pi$, is the unique equilibrium state for $\varphi \circ \bar\pi$.

By this proposition, to prove the main theorem, we need only show that the topological pressure $P_\tau(\varphi \circ \bar\pi)$ on $\tilde{\mathcal{H}}$ is $C^\infty$ Frechet differentiable, since the unique equilibrium state $\mu_\varphi$ is the Frechet derivative of $P_\tau(\cdot)$ at the point $\varphi$.

6.2. Localization of the potential functions. To be able to use directly the results in [BK95,BK97] as they were stated in these papers, we introduce the Banach space of potentials that are localizations of potential functions. For simplicity, we shall also drop the map $\bar\pi$ in our notation. We assume that $\varphi_0(\bar\xi)$, $\bar\xi = (\xi_i)_{i \in \mathbb{Z}^d}$, $\xi_i \in \Sigma_A$, is a potential function on $\Sigma_A^{\mathbb{Z}^d}$ whose value depends only on the coordinate $\xi_0$, and that $\varphi, \psi \in \tilde{\mathcal{H}}$.

Localization of $\varphi_0$ and $\varphi$. We consider $\Sigma_A^{\mathbb{Z}^d}$ as a subset of the full shift of dimension $d+1$. The potential $U^0$ obtained from localization of $\varphi_0$ is a translation invariant longitudinal potential on the intervals of $\Sigma_A$. Let $I_n = [-n, n]$ and $\hat I_n = \mathbb{Z} \setminus I_n$. Here $(\xi_I, \eta^*_{\hat I})$ denotes the element in $\Sigma_A$ whose values in $I$ agree with those of $\xi$ and whose values in $\hat I$ agree with those of $\eta^*$. When $n = 0$, $I_0 = \{0\}$; for a configuration $\xi_{I_0}$, choose any $\eta_0^*$ such that $(\xi_{I_0}, (\eta_0^*)_{\hat I_0}) \in \Sigma_A$ and define

$$U^0(\xi_{I_0}) = \varphi_0(\xi_{I_0}, (\eta_0^*)_{\hat I_0}).$$

Assume that $U^0(\xi_{I_{n-1}})$ is defined for all configurations over $I_{n-1}$. For a configuration $\xi_{I_n}$, choose any $\eta_n^*$ (depending on the configuration $\xi_{I_n}$) such that $(\xi_{I_n}, (\eta_n^*)_{\hat I_n}) \in \Sigma_A$ and define

$$
\begin{aligned}
U^0(\xi_{I_n}) &= \varphi_0(\xi_{I_n}, (\eta_n^*)_{\hat I_n}) - \varphi_0(\xi_{I_{n-1}}, (\eta_{n-1}^*)_{\hat I_{n-1}}) \\
&= \varphi_0(\xi_{I_n}, (\eta_n^*)_{\hat I_n}) - \sum_{k=0}^{n-1} U^0(\xi_{I_k}).
\end{aligned}
$$

In this way, we have

$$\varphi_0(\xi) = \sum_{n=0}^{\infty} U^0(\xi_{I_n}).$$
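The recursive localization above is a telescoping construction, which a small sketch can make explicit. The example below assumes the full shift on the symbols $\{0, 1\}$ restricted to the window $[-N, N]$, a fixed reference configuration $\eta^* \equiv 0$ outside each interval, and a hypothetical potential `phi0` with geometrically decaying dependence on distant coordinates; none of these specific choices come from the paper.

```python
def phi0(xi):
    """Hypothetical Hölder-type potential on sequences xi: dict j -> symbol in {0, 1}."""
    return sum(0.5 ** abs(j) * s for j, s in xi.items())

def glue(xi, n, N):
    """Configuration equal to xi on I_n = [-n, n] and to eta* = 0 elsewhere in [-N, N]."""
    return {j: (xi[j] if abs(j) <= n else 0) for j in range(-N, N + 1)}

def localize(xi, N):
    """Return [U0(xi_{I_0}), ..., U0(xi_{I_N})] from the recursive definition:
    U0(xi_{I_n}) = phi0(xi_{I_n}, eta*) - sum_{k<n} U0(xi_{I_k})."""
    U = []
    for n in range(N + 1):
        U.append(phi0(glue(xi, n, N)) - sum(U))
    return U

N = 5
xi = {j: (1 if j % 3 == 0 else 0) for j in range(-N, N + 1)}
U = localize(xi, N)

# The pieces telescope back to phi0, and decay geometrically in n.
print(sum(U), phi0(xi))
print([abs(u) <= 2 * 0.5 ** n for n, u in enumerate(U)])
```

In this toy case the $n$-th term only sees the symbols at $\pm n$, which is why it is bounded by $2 \cdot 2^{-n}$; the exponential decay of the longitudinal potential in the text follows from Hölder continuity in the same way.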
For all other types of configurations over finite volumes, the potential is defined to be zero. Since the function $\varphi_0$ is Hölder continuous, the corresponding longitudinal potential decays exponentially to zero as the length of the interval increases.

The procedure to localize $\varphi$ is similar. The potential $U$ is now defined for all configurations over $(d+1)$-dimensional cubes. Because of the translation invariance, it is completely determined by its values for configurations over cubes centered at the origin, $Q_n = \otimes_{i \in \mathbb{Z}^d,\, j \in \mathbb{Z}} [-n, n]$:

$$\varphi(\bar\xi) = \sum_{n=0}^{\infty} U(\bar\xi_{Q_n}).$$

Let $\mathcal{Q}_n$ denote the space of all configurations over the finite volume $Q_n$. We define a real function $U_n$ on $\mathcal{Q}_n$: $U_n(\bar\xi) \equiv U_n(\bar\xi_{Q_n}) \equiv U(\bar\xi_{Q_n})$. Formally, we can write $U = \sum_{n=0}^{\infty} U_n$.

Banach space of potentials. Let

$$\|U_n\| = \sup_{\bar\xi_{Q_n} \in \mathcal{Q}_n} |U_n(\bar\xi_{Q_n})|.$$

For $0 < \theta < 1$, define a norm for $U$:

$$\|U\| = \sup_{0 \le n} \theta^{-n} \|U_n\|.$$

[...] $\epsilon > 0$ such that the pressure function $P(U^0 + U)$ is $C^\infty$ Frechet differentiable in the $\epsilon$-neighborhood of the origin of the Banach space $\mathcal{P}$.
Proof. The proof of the theorem is based on the following two lemmas.

Lemma 6.5. Let $P(x)$ be a real function in a bounded convex open set $\mathcal{U}$ of a Banach space.
1. If $P(x)$ is Gateaux differentiable and its Gateaux derivative $D_x P$ as a bounded linear operator is continuous in $x$, then $P(x)$ is Frechet differentiable.
2. If the Gateaux derivative $D_x P$ is bounded for all $x \in \mathcal{U}$, then $P(x)$ is Lipschitz continuous in $\mathcal{U}$.

Lemma 6.6. Let $f_n(t)$, $n = 1, 2, \cdots$, and $f_\infty(t)$ be real functions on the interval $[-\delta, \delta]$. If $\lim_{n \to \infty} f_n(t) = f_\infty(t)$ for each $t \in [-\delta, \delta]$ and $\sup_{n,t} \big| \frac{d^k f_n(t)}{dt^k} \big| < \infty$ for each $k$, then $f_\infty(t)$ is $C^\infty$ and

$$\lim_{n \to \infty} \frac{d^k f_n(t)}{dt^k} = \frac{d^k f_\infty(t)}{dt^k}.$$
We shall omit the proofs of these lemmas since they are standard. Lemma 6.6 is taken from [Sim93]. As a direct corollary of the first lemma, if the $(n+2)$th order Gateaux derivative of $P(x)$ is bounded in $\mathcal{U}$, then $P(x)$ is Frechet differentiable up to order $n$, $n \ge 1$. Let

$$P_\Lambda(U^0 + U) = \frac{1}{|\Lambda|} \ln \sum_{\bar\xi_\Lambda \in \Lambda} \exp \sum_{I, Q \subset \Lambda} \big( U^0(\bar\xi_I) + U(\bar\xi_Q) \big).$$

We have that $\lim_{\Lambda \to \mathbb{Z}^{d+1}} P_\Lambda(U^0 + U) = P(U^0 + U)$ for all $U \in \mathcal{P}$. According to the lemmas, in order to show that $P(U^0 + U)$ is Frechet differentiable in a neighborhood of the origin of $\mathcal{P}$, it suffices to prove that

$$\sup_{\Lambda, t} \Big| \frac{d^k P_\Lambda(U^0 + U + tV)}{dt^k} \Big| < \infty$$

[...] $> 0$, where $m$ and $C_0$ may depend on $k$ and $U^0$, but are independent of $U$, $V$, and $\Lambda$, such that the truncated correlation functions satisfy the condition

$$|\langle g_{x_1} \cdots g_{x_j};\ g_{x_{j+1}} \cdots g_{x_k} \rangle_\Lambda| \le C_0 e^{-ml}, \tag{26}$$

as long as there are two coordinate hyperplanes a distance of $l$ apart separating $x_1, \cdots, x_j$ from $x_{j+1}, \cdots, x_k$. We recall that the truncated correlation function $\langle g_1; g_2 \rangle$ for two functions $g_1, g_2$ is defined as $\langle g_1 g_2 \rangle - \langle g_1 \rangle \langle g_2 \rangle$. The integral $\langle \cdot \rangle_\Lambda$ is again with respect to the Gibbs distribution for the potential $U^0 + U$ over the finite space $\Lambda$.

Estimation of the truncated correlation functions. To estimate the truncated correlation functions, we need the following theorem from [BK97] (named in that paper Theorem 1).

Theorem 6.7. For each $U^0$ and $0 < \theta < 1$, there exist $\epsilon > 0$, $m > 0$, $C > 0$ such that if $\|U\| < \epsilon$, $U \in \mathcal{P}$, the truncated correlation functions satisfy, for all functions $h_1 : X_1 \to \mathbb{R}$, $h_2 : X_2 \to \mathbb{R}$, $X_1, X_2 \subset \Lambda \subset \mathbb{Z}^{d+1}$,

$$|\langle h_1 h_2 \rangle_\Lambda - \langle h_1 \rangle_\Lambda \langle h_2 \rangle_\Lambda| \le C \min(|X_1|, |X_2|)\, \|h_1\| \|h_2\|\, e^{-m\, d(X_1, X_2)},$$

where $d(X_1, X_2)$ is the distance between the sets $X_1$ and $X_2$ and $\langle \cdot \rangle_\Lambda$ is the integral with respect to the Gibbs distribution from the potential $U^0 + U$.

We now estimate $|\langle g_{x_1} \cdots g_{x_j};\ g_{x_{j+1}} \cdots g_{x_k} \rangle_\Lambda|$. We rewrite

$$g_x = \sum_{c(Q) = x} V(\bar\xi_Q) = \sum_{n=0}^{\infty} V_n(\bar\xi_{Q_n(x)}),$$

where $Q_n(x)$ are cubes centered at $x$ with $2n + 1$ lattice points on each side and $V_n$, $n = 0, 1, 2, \cdots$, are functions from $\mathcal{Q}_n(x) \to \mathbb{R}$ from the definition of the potential $V$.
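The role of truncated correlation functions in bounding derivatives of the finite-volume pressure can be seen in a toy model. The sketch below is an illustration under stated assumptions: a periodic one-dimensional chain of $N = 6$ sites with symbols $\{0, 1\}$, a hypothetical nearest-neighbour potential standing in for $U^0 + U$, and the perturbation $V$ taken to be the single-site occupation. It checks by finite differences that the first and second $t$-derivatives of $P_\Lambda(U + tV)$ at $t = 0$ equal $\langle g \rangle / |\Lambda|$ and $\langle g; g \rangle / |\Lambda|$, where $g$ is the sum of $V$ over the volume.

```python
import itertools
import math

N = 6      # number of sites of a periodic 1D chain (toy stand-in for Lambda)
J = 0.3    # hypothetical nearest-neighbour coupling

def energy(s):
    """Sum of the unperturbed potential over the volume."""
    return J * sum(s[i] * s[(i + 1) % N] for i in range(N))

def g(s):
    """Sum of the perturbation V over the volume (here: number of occupied sites)."""
    return float(sum(s))

configs = list(itertools.product((0, 1), repeat=N))

def pressure(t):
    """Finite-volume pressure (1/|Lambda|) * ln sum_config exp(energy + t * g)."""
    return math.log(sum(math.exp(energy(s) + t * g(s)) for s in configs)) / N

def gibbs_mean(f):
    """Expectation of f under the Gibbs distribution at t = 0."""
    Z = sum(math.exp(energy(s)) for s in configs)
    return sum(f(s) * math.exp(energy(s)) for s in configs) / Z

mean_g = gibbs_mean(g)
trunc = gibbs_mean(lambda s: g(s) ** 2) - mean_g ** 2   # <g; g> = <g g> - <g><g>

h = 1e-4
d1 = (pressure(h) - pressure(-h)) / (2 * h)
d2 = (pressure(h) - 2 * pressure(0.0) + pressure(-h)) / h ** 2

print(d1, mean_g / N)
print(d2, trunc / N)
```

Higher derivatives are expressed through higher truncated correlations in the same way, which is why a uniform exponential bound such as (26) controls them as the volume grows.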
Note also that we have removed the restriction $Q \subset \Lambda$ in the expression of $g_x$ for the convenience of estimation. Since we assumed $V \in \mathcal{P}$ and $\|V\| = \sup_{n \ge 0} \theta^{-n} \|V_n\| = 1$, we have $\|V_n\| \le \theta^n$. It is clear that $\langle g_{x_1} \cdots g_{x_j};\ g_{x_{j+1}} \cdots g_{x_k} \rangle_\Lambda$ is multi-linear in $(g_{x_i})$. Therefore, we have

$$|\langle g_{x_1} \cdots g_{x_j};\ g_{x_{j+1}} \cdots g_{x_k} \rangle_\Lambda| = \Big| \sum_{n_1, \cdots, n_k = 0}^{\infty} \langle V_{n_1}(\bar\xi_{Q_{n_1}(x_1)}) \cdots V_{n_j}(\bar\xi_{Q_{n_j}(x_j)});\ V_{n_{j+1}}(\bar\xi_{Q_{n_{j+1}}(x_{j+1})}) \cdots V_{n_k}(\bar\xi_{Q_{n_k}(x_k)}) \rangle_\Lambda \Big|.$$
We consider the terms of this sum in two cases:

Case one. When $n_1, \cdots, n_k \le l/4$, where $l$ is the separation constant between $\{x_1, \cdots, x_j\}$ and $\{x_{j+1}, \cdots, x_k\}$. According to Theorem 6.7, we have

$$
\begin{aligned}
|\langle V_{n_1}(\bar\xi_{Q_{n_1}(x_1)}) \cdots V_{n_j}(\bar\xi_{Q_{n_j}(x_j)});\ V_{n_{j+1}}(\bar\xi_{Q_{n_{j+1}}(x_{j+1})}) \cdots V_{n_k}(\bar\xi_{Q_{n_k}(x_k)}) \rangle_\Lambda|
&\le \theta^{n_1 + n_2 + \cdots + n_k}\, C k (l/2 + 1)^{d+1} e^{-ml/2} \\
&\le \theta^{n_1 + n_2 + \cdots + n_k}\, C' e^{-ml/4},
\end{aligned}
$$

where $C' = \sup_{0 \le l} C k (l/2 + 1)^{d+1} e^{-ml/4}$.

Case two. When at least one $n_i > l/4$ in the term determined by the sequence $(n_1, n_2, \cdots, n_k)$. Note that $|\langle h_1; h_2 \rangle_\Lambda| \le 2\|h_1\|\|h_2\|$. So we have

$$|\langle V_{n_1}(\bar\xi_{Q_{n_1}(x_1)}) \cdots V_{n_j}(\bar\xi_{Q_{n_j}(x_j)});\ V_{n_{j+1}}(\bar\xi_{Q_{n_{j+1}}(x_{j+1})}) \cdots V_{n_k}(\bar\xi_{Q_{n_k}(x_k)}) \rangle_\Lambda| \le 2\|V_{n_1}\|\|V_{n_2}\| \cdots \|V_{n_k}\|.$$

If we let $\sum_2$ denote the sum of all such terms in Case two, we have

$$\sum\nolimits_2 \le 2k\,\theta^{l/4} \sum_{n_1, \cdots, n_{k-1} = 0}^{\infty} \theta^{n_1 + n_2 + \cdots + n_{k-1}} = \frac{2k\, e^{(\ln\theta)\, l/4}}{(1 - \theta)^{k-1}}.$$

Combining these two cases, we have the desired estimate (26). It seems to us that the arguments from Theorem 6.7 to the main theorem are standard. However, we could not find exact references. □
7. Conclusions

In this section we state the final result of our research, which explains the meaning of the smooth dependence of the measure. Some parts of the theorem work with less regularity.

Theorem 7.1. Fix a decay function $\Gamma$, a manifold $M$, and a $C^r$, $r > 4$, system $f$ with a locally maximal Axiom A set $\Lambda$. Form the uncoupled system $F : \mathcal{M} \to \mathcal{M}$ defined by $F = \otimes_{i \in \mathbb{Z}^d} f$ ($\mathcal{M} = \otimes_{i \in \mathbb{Z}^d} M$). It contains a locally maximal hyperbolic set $\Delta_F = \otimes_{i \in \mathbb{Z}^d} \Lambda$. Consider the set of coupled lattice maps $\Phi$ in a sufficiently small $C^r_\Gamma$ neighborhood $\mathcal{U}$ of the uncoupled system $F$. Then, for all $\Phi \in \mathcal{U}$,

1) There is a locally maximal hyperbolic set $\Delta_\Phi$ close to $\Delta_F$.
2) There is a map $h_\Phi : \Delta_F \to \Delta_\Phi$ such that $\Phi \circ h_\Phi = h_\Phi \circ F$. The map $h_\Phi$ is unique among those that are $C^0_\Gamma$ close to the identity. It is $C^\alpha_\Gamma$ for some $\alpha > 0$.
3) The map $\Phi \to h_\Phi$ is $C^{r-2}$ considered as a map from $C^r_\Gamma$ to $C^\alpha_\Gamma$.
4) It is possible to define a measure $\mu_\Phi$ on $\Delta_\Phi$, invariant under $\Phi$ and such that it is the limit of the SRB measures of finite approximations of the coupled lattice map. (Hence the measure is called SRB.)
5) Given any observable function $\Psi : \mathcal{M} \to \mathbb{R}$, $\Psi \in C^{r-1}_\Gamma$ (similar definition as in Definition 3.11), the function $A_\Psi : \mathcal{U} \to \mathbb{R}$ defined by

$$A_\Psi(G) = \int \Psi\, d\mu_\Phi$$

is $C^{r-3}$ when $\mathcal{U}$ is given the $C^r_\Gamma$ topology. Moreover,

$$\|D^j A_\Psi\| \le K_j \|\Psi\|_{C^{j+2}_\Gamma}, \qquad 0 \le j \le r - 3,$$

where $K_j$ can be chosen uniformly on $\mathcal{U}$.

Proof. Statements 1), 2), and 3) are parts of Theorem 3.16. Statement 4) is the main result of [JP98]. Statement 5) follows from the following considerations, which are very similar to those of [Rue97] in the finite dimensional case, now that we have developed all the techniques to deal with infinite dimensional systems. First note that, using the structural stability equation in item 2), we can write

$$\int \Psi\, d\mu_\Phi = \int \Psi \circ h_\Phi\, d\tilde\mu_\Phi, \tag{27}$$

where $\tilde\mu_\Phi$ is the measure invariant under the $\mathbb{Z}^{d+1}$-action given by push-forward by $F$ and by translations in the lattice, which is an equilibrium measure for the potential $\varphi_\Phi$ introduced in (23). The result of the theorem follows by considering (27) and observing:

1) The mapping $\Phi \to \Psi \circ h_\Phi$ is $C^{r-2}$ when considered as a map from $C^r_\Gamma$ to $C^\alpha_\Gamma$. (Use the regularity of the map $\Phi \to h_\Phi$ discussed in point 3) of this theorem and the regularity of the composition map in Corollary 3.15.)
2) The map $\Phi \to \varphi_\Phi$ is $C^{r-3}$ when $\Phi$ is given the $C^r_\Gamma$ topology and $\varphi_\Phi$ is given the topology of $\mathcal{H}$ introduced in (24). This follows because the map $\Phi \to A_G$ is $C^{r-3}$ (see Theorem 4.1 4)) and the map $A_G \to \psi_{A_G}$ is analytic (see Theorem 5.1).
3) The function that associates a measure to its potential is $C^\infty$ (see Theorem 6.2). □

Acknowledgement. The authors thank the Institute for Mathematics and its Applications at the University of Minnesota for providing a stimulating environment. This work was initiated while both authors were participating in the year-long program on Emerging Applications of Dynamical Systems. This work has been partially supported by NSF grants.
References

[BK95] Bricmont, J. and Kupiainen, A.: Coupled analytic maps. Nonlinearity 8 (3), 379–396 (1995)
[BK97] Bricmont, J. and Kupiainen, A.: Infinite-dimensional SRB measures. Phys. D 103 (1–4), 18–33 (1997) and Lattice dynamics (Paris, 1995)
[Bow75] Bowen, Rufus: Equilibrium states and the ergodic theory of Anosov diffeomorphisms. Lecture Notes in Mathematics, Vol. 470. Berlin–New York: Springer-Verlag, 1975
[BS88] Bunimovich, L.A. and Sinaĭ, Ya.G.: Spacetime chaos in coupled map lattices. Nonlinearity 1 (4), 491–516 (1988)
[Con92] Contreras, G.: Regularity of topological and metric entropy of hyperbolic flows. Math. Z. 210 (1), 97–111 (1992)
[Con95a] Contreras, G.: Average linking numbers of closed orbits of hyperbolic flows. J. London Math. Soc. (2) 51 (3), 614–624 (1995)
[Con95b] Contreras, G.: The derivatives of equilibrium states. Bol. Soc. Brasil. Mat. (N.S.) 26 (2), 211–228 (1995)
[dlLMM86] de la Llave, R., Marco, J.M. and Moriyón, R.: Canonical perturbation theory of Anosov systems and regularity results for the Livšic cohomology equation. Ann. of Math. (2) 123 (3), 537–611 (1986)
[Gal98a] Gallavotti, G.: Chaotic dynamics, fluctuations, nonequilibrium ensembles. Chaos 8 (2), 384–392 (1998); Chaos and irreversibility (Budapest, 1997)
[Gal98b] Gallavotti, G.: Chaotic hypothesis and universal large deviations properties. In: Proceedings of the International Congress of Mathematicians, Vol. I (Berlin, 1998), number Extra Vol. I, (electronic), 1998, pp. 205–233
[Gal99] Gallavotti, G.: A local fluctuation theorem. Phys. A 263 (1), 39–50 (1999)
[HP70] Hirsch, M.W. and Pugh, C.C.: Stable manifolds and hyperbolic sets. In: Global Analysis (Proc. Sympos. Pure Math., Vol. XIV, Berkeley, Calif., 1968), Providence, RI: Am. Math. Soc., 1970, pp. 133–163
[Jia95] Jiang, M.: Equilibrium states for lattice models of hyperbolic type. Nonlinearity 8 (5), 631–659 (1995)
[JM96] Jiang, M. and Mazel, A.E.: Uniqueness and exponential decay of correlations for some two-dimensional spin lattice systems. J. Statist. Phys. 82 (3–4), 797–821 (1996)
[JP98] Jiang, M. and Pesin, Y.B.: Equilibrium measures for coupled map lattices: existence, uniqueness and finite-dimensional approximations. Commun. Math. Phys. 193 (3), 675–711 (1998)
[KKPW90] Katok, A., Knieper, G., Pollicott, M. and Weiss, H.: Differentiability of entropy for Anosov and geodesic flows. Bull. Am. Math. Soc. (N.S.) 22 (2), 285–293 (1990)
[Kry79] Krylov, N.S.: Works on the foundations of statistical physics. Princeton, NJ: Princeton University Press, 1979. Translated from the Russian by A. B. Migdal, Ya. G. Sinai [Ja. G. Sinaĭ] and Yu. L. Zeeman [Ju. L. Zeeman], With a preface by A. S. Wightman, With a biography of Krylov by V. A. Fock [V. A. Fok], With an introductory article “The views of N. S. Krylov on the foundations of statistical physics” by Migdal and Fok, With a supplementary article “Development of Krylov’s ideas” by Sinaĭ, Princeton Series in Physics
[Mn90] Mañé, R.: The Hausdorff dimension of horseshoes of diffeomorphisms of surfaces. Bol. Soc. Brasil. Mat. (N.S.) 20 (2), 1–24 (1990)
[Mos69] Moser, J.: On a theorem of Anosov. J. Differ. Eqs. 5, 411–440 (1969)
[PS91] Pesin, Ya.B. and Sinaĭ, Ya.G.: Space-time chaos in chains of weakly interacting hyperbolic mappings. In: Dynamical systems and statistical mechanics (Moscow, 1991), Providence, RI: Am. Math. Soc., 1991, pp. 165–198. Translated from the Russian by V. E. Nazaĭkinskiĭ
[Rue97] Ruelle, D.: Differentiation of SRB states. Commun. Math. Phys. 187 (1), 227–241 (1997)
[Sim93] Simon, B.: The statistical mechanics of lattice gases. Vol. I. Princeton, NJ: Princeton University Press, 1993
[Wei92] Weiss, H.: Some variational formulas for Hausdorff dimension, topological entropy, and SRB entropy for hyperbolic dynamical systems. J. Statist. Phys. 69 (3–4), 879–886 (1992)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 211, 335 – 358 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Distributions on Partitions, Point Processes, and the Hypergeometric Kernel
Alexei Borodin¹, Grigori Olshanski²
¹ Department of Mathematics, The University of Pennsylvania, Philadelphia, PA 19104-6395, USA. E-mail: [email protected]
² Dobrushin Mathematics Laboratory, Institute for Problems of Information Transmission, Bolshoy Karetny 19, 101447 Moscow GSP-4, Russia. E-mail: [email protected]; [email protected]
Received: 22 September 1999 / Accepted: 23 November 1999
Abstract: We study a 3-parametric family of stochastic point processes on the one-dimensional lattice originating from a remarkable family of representations of the infinite symmetric group. We prove that the correlation functions of the processes are given by determinantal formulas with a certain kernel. The kernel can be expressed through the Gauss hypergeometric function; we call it the hypergeometric kernel. In a scaling limit our processes approximate the processes describing the decomposition of the representations mentioned above into irreducibles. As we showed in previous works, the correlation functions of these limit processes also have determinantal form with the so-called Whittaker kernel. We show that the scaling limit of the hypergeometric kernel is the Whittaker kernel. The integral operator corresponding to the Whittaker kernel is an integrable operator as defined by Its, Izergin, Korepin, and Slavnov. We argue that the hypergeometric kernel can be considered as a kernel defining a ‘discrete integrable operator’. We also show that the hypergeometric kernel degenerates for certain values of parameters to the Christoffel–Darboux kernel for Meixner orthogonal polynomials. This fact is parallel to the degeneration of the Whittaker kernel to the Christoffel–Darboux kernel for Laguerre polynomials.
0. Introduction

Let $\mathbb{Y}_n$ be the set of Young diagrams with $n$ boxes and $\mathbb{Y} = \mathbb{Y}_0 \sqcup \mathbb{Y}_1 \sqcup \mathbb{Y}_2 \sqcup \ldots$ be the set of all Young diagrams. In this paper we study a remarkable family of probability distributions on $\mathbb{Y}_n$, $n = 0, 1, 2, \ldots$. The whole picture depends on 2 parameters $z$ and $z'$ which satisfy certain conditions, see Sect. 1. For each pair $(z, z')$ we have a probability distribution on every $\mathbb{Y}_n$, $n = 0, 1, 2, \ldots$; we denote it by $M^{(n)}_{z,z'}$. Its value on a Young diagram $\lambda \in \mathbb{Y}_n$ with Frobenius coordinates $(p_1, \ldots, p_d \mid q_1, \ldots, q_d)$ has the form

$$M^{(n)}_{z,z'}(\lambda) = \frac{n!}{(zz')_n}\,(zz')^d \times \prod_{i=1}^{d} \frac{(z+1)_{p_i}(z'+1)_{p_i}(-z+1)_{q_i}(-z'+1)_{q_i}}{p_i!\,p_i!\,q_i!\,q_i!}\ {\det}^2\Big[\frac{1}{p_i + q_j + 1}\Big], \tag{0.1}$$

where $(a)_k$ stands for $a(a+1) \cdots (a+k-1)$.

The distributions $M^{(n)}_{z,z'}$ have a representation-theoretic meaning. Let $S_n$ be the symmetric group of degree $n$, $S(\infty)$ be the union of the groups $S_n$, and for $\lambda \in \mathbb{Y}_n$, let $\chi^\lambda$ denote the irreducible character of $S_n$ corresponding to $\lambda$. According to [KOV], there exists a central positive definite function $\chi^{(z,z')}$ on $S(\infty)$ such that, for any $n$, its restriction to $S_n$ is

$$\chi_n^{(z,z')} = \sum_{|\lambda| = n} M^{(n)}_{z,z'}(\lambda)\, \frac{\chi^\lambda}{\chi^\lambda(e)}.$$

Moreover, the unitary representation $T^{(z,z')}$ corresponding to $\chi^{(z,z')}$ admits a nice geometric description (at least for $z' = \bar z$), see [KOV]. This representation-theoretic aspect was the original motivation of our study, but in the present paper we do not discuss it (see [P.I]).

Let us associate to a Young diagram $\lambda = (p_1, \ldots, p_d \mid q_1, \ldots, q_d)$ a set of $2d$ points in $\mathbb{Z}' = \mathbb{Z} + \frac12$ as follows:

$$\lambda = (p_1, \ldots, p_d \mid q_1, \ldots, q_d) \mapsto \{p_1 + \tfrac12, \ldots, p_d + \tfrac12, -q_1 - \tfrac12, \ldots, -q_d - \tfrac12\}.$$

Then every probability measure on $\mathbb{Y}_n$ provides a probability measure on the set of all point configurations in $\mathbb{Z}'$ with equal number of points in $\mathbb{Z}'_+ = \mathbb{Z}' \cap \mathbb{R}_+$ and $\mathbb{Z}'_- = \mathbb{Z}' \cap \mathbb{R}_-$ and such that the total sum of absolute values of coordinates is equal to $n$. Next, having a distribution on each $\mathbb{Y}_n$, we can mix them using a distribution on the set $\{0, 1, 2, \ldots\}$ of indices $n$; then we get a probability distribution on $\mathbb{Y}$. Thus, we get a probability measure on the set of all balanced point configurations in $\mathbb{Z}'$ (i.e., configurations with equal number of points in $\mathbb{Z}'_+$ and $\mathbb{Z}'_-$). According to standard terminology, we can say that we defined a point process on $\mathbb{Z}'$. Following a certain analogy with statistical physics, one can call the resulting object of the mixing procedure the grand canonical ensemble, see [V].

For our special distributions (0.1) we choose the mixing distribution to be the negative binomial distribution

$$\mathrm{Prob}\{n\} = (1 - \xi)^t\, \frac{(t)_n}{n!}\, \xi^n, \qquad t = zz', \tag{0.2}$$

where $\xi \in (0, 1)$ is an additional parameter. (This choice is explained by the wish to remove the factor $\frac{n!}{(t)_n}$ from the RHS of (0.1).) We shall denote by $\mathcal{P}_{z,z',\xi}$ the point process on $\mathbb{Z}'$ thus obtained.

The main result of this paper is the explicit computation of the correlation functions of $\mathcal{P}_{z,z',\xi}$. It turns out that they are given by determinantal formulas

$$\rho_n(x_1, \ldots, x_n) = \det[K(x_i, x_j)]_{i,j=1}^{n}$$
with a certain kernel $K(x, y)$ on $\mathbb{Z}'$. This kernel can be expressed through the Gauss hypergeometric function. We call it the hypergeometric kernel.

Due to the representation theoretic origin of our problem, the distributions $M^{(n)}_{z,z'}$ have a number of additional properties. In particular, as $n \to \infty$, they converge to a probability measure on a certain limit object called the “Thoma simplex”, see [KOO] and Sect. 5 below.¹ In terms of point processes, this implies that after an appropriate scaling the point processes $\mathcal{P}_{z,z',\xi}$ will converge, as $\xi \nearrow 1$, to a certain point process on $\mathbb{R}^* = \mathbb{R} \setminus \{0\}$ derived from the limit measure on the Thoma simplex (we shall give a rigorous proof of this result in [BO3]). This limit process has been thoroughly studied in our previous papers [P.I–P.V].² Its correlation functions also have determinantal form with the so-called Whittaker kernel, see [P.IV]. In Sect. 5 we show directly that the scaling limit of the hypergeometric kernel is the Whittaker kernel.

The integral operator defined by the Whittaker kernel belongs to the class of integrable operators as defined by Its, Izergin, Korepin, Slavnov [IIKS]. We show that the operator corresponding to the hypergeometric kernel can be considered as an example of a “discrete integrable operator”, see also [B1].

A. Okounkov pointed out that important information can be obtained from consideration of another degeneration of the point process introduced above. Assume that $z, z' \to \infty$ and $\xi = \frac{\theta}{zz'} = \frac{\theta}{t} \to 0$, where $\theta > 0$ is fixed. Then the mixing distribution (0.2) tends to the Poisson distribution with parameter $\theta$:

$$\mathrm{Prob}\{n\} \to e^{-\theta}\, \frac{\theta^n}{n!},$$

while $M^{(n)}_{zz'}$ tends to the Plancherel distribution on $\mathbb{Y}_n$:

$$M^{(n)}_{zz'}(\lambda) \to M^{(n)}_\infty(\lambda) = \frac{\dim^2\lambda}{n!}, \tag{0.3}$$

where $\dim\lambda = \chi^\lambda(e)$ is the dimension of the irreducible representation of the symmetric group $S_n$ corresponding to $\lambda$. Thus, we get an explicit formula for the correlation functions of the process governed by the poissonized Plancherel distributions.³ This formula allows one to prove certain important facts about Plancherel distributions, see [BOO]. In particular, we were able to prove the conjecture by Baik, Deift, and Johansson [BDJ1,BDJ2] that the asymptotic behavior of $\lambda_1, \lambda_2, \ldots$ with respect to the Plancherel distribution is governed by the Airy kernel [TW] and, therefore, coincides with that of the largest eigenvalues of a matrix from the Gaussian Unitary Ensemble. (Other approaches to this conjecture can be found in [O1] and [J2], see also [BO4].)

The paper is organized as follows. In Sect. 1 we introduce our main object of interest, the point process $\mathcal{P}_{z,z',\xi}$. In Sect. 2 we recall some generalities on determinantal point processes. The computation of the correlation functions of $\mathcal{P}_{z,z',\xi}$ and the formulas for the hypergeometric kernel can be found in Sect. 3. In Sect. 4 we show that if one of the parameters $z, z'$ is an integer, the hypergeometric kernel degenerates to the Christoffel–Darboux kernel for the Meixner orthogonal polynomials on $\mathbb{Z}_+$.⁴ In Sect. 5 we discuss the scaling limit of the hypergeometric kernel. Sect. 6 explains the connection with integrable operators. Sect. 7 is an appendix; there we give proofs of certain identities involving the Gauss hypergeometric functions which are used in Sect. 3.

¹ This is a kind of dual object to the infinite symmetric group, see [T,VK,KV]. The limit measure is, actually, a spectral measure for the decomposition of the representation $T^{(z,z')}$ into irreducibles, see [KOV].
² A survey of the results is given in [BO1], see also [B2,B3].
³ The limit relation (0.3) was known since the invention of the distributions $M^{(n)}_{zz'}$, see [KOV], but up to now it was not used.

1. Distributions on Partitions. The Grand Canonical Ensemble

For $n = 1, 2, \ldots$, let $\mathbb{Y}_n$ denote the set of partitions of $n$, which will be identified with Young diagrams with $n$ boxes. We agree that $\mathbb{Y}_0$ consists of a single element, the zero partition or the empty diagram $\emptyset$. Given $\lambda \in \mathbb{Y}_n$, we write $|\lambda| = n$ and denote by $d = d(\lambda)$ the number of diagonal boxes in $\lambda$. We shall use the Frobenius notation [Ma]

$$\lambda = (p_1, \ldots, p_d \mid q_1, \ldots, q_d).$$

Here $p_i = \lambda_i - i$ is the number of boxes in the $i$th row of $\lambda$ on the right of the $i$th diagonal box; likewise, $q_i = \lambda'_i - i$ is the number of boxes in the $i$th column of $\lambda$ below the $i$th diagonal box ($\lambda'$ stands for the transposed diagram). Note that

$$p_1 > \cdots > p_d \ge 0, \qquad q_1 > \cdots > q_d \ge 0, \qquad \sum_{i=1}^{d} (p_i + q_i + 1) = |\lambda|.$$

The numbers $p_i$, $q_i$ are called the Frobenius coordinates of the diagram $\lambda$. Throughout the paper we fix two complex parameters $z, z'$ such that the numbers $(z)_k (z')_k$ and $(-z)_k (-z')_k$ are real and strictly positive for any $k = 1, 2, \ldots$. Here and below $(a)_0 = 1$, $(a)_k = a(a+1) \ldots (a+k-1)$ denotes the Pochhammer symbol. The above assumption on $z, z'$ means that one of the following two conditions holds:
• either $z' = \bar z$ and $z \in \mathbb{C} \setminus \mathbb{Z}$,
• or $z, z' \in \mathbb{R}$ and there exists $m \in \mathbb{Z}$ such that $m < z, z' < m + 1$.
We set
t = zz0
and note that t > 0. For a Young diagram λ let dim λ denote the number of the standard Young tableaux of shape λ. Equivalently, dim λ is the dimension of the irreducible representation (of the symmetric group of degree |λ|) corresponding to λ, see [Ma]. In the Frobenius notation, Q (pi − pj )(qi − qj ) d Y 1 1≤i<j ≤d dim λ Q = |λ|! pi !qi ! (pi + qj + 1) i=1
=
d Y i=1
1≤i,j ≤d
1 1 , det pi !qi ! pi + qj + 1 1≤i,j ≤d
4 After the present paper was completed we learned that the “Meixner kernel” has also arisen in the recent
work [J1], see also [BO4].
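see, e.g., [P.I, Prop. 2.6]. We introduce a function on the Young diagrams depending on the parameters $z, z'$:
\[ M_{z,z'}(\lambda) = \frac{t^d}{(t)_{|\lambda|}} \prod_{i=1}^{d} (z+1)_{p_i} (z'+1)_{p_i} (-z+1)_{q_i} (-z'+1)_{q_i} \cdot \frac{\dim^2 \lambda}{|\lambda|!} \]
\[ = \frac{|\lambda|!}{(t)_{|\lambda|}} \, t^d \prod_{i=1}^{d} \frac{(z+1)_{p_i} (z'+1)_{p_i} (-z+1)_{q_i} (-z'+1)_{q_i}}{p_i!\, p_i!\, q_i!\, q_i!} \cdot \det{}^2 \left[ \frac{1}{p_i + q_j + 1} \right]. \]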
We agree that $M_{z,z'}(\varnothing) = 1$. Thanks to our assumption on the parameters, $M_{z,z'}(\lambda) > 0$ for all $\lambda$.

Proposition 1.1. For any $n$,
\[ \sum_{\lambda \in \mathbb{Y}_n} M_{z,z'}(\lambda) = 1, \]
so that the restriction of $M_{z,z'}$ to $\mathbb{Y}_n$ is a probability distribution on $\mathbb{Y}_n$.

We shall denote this distribution by $M^{(n)}_{z,z'}$.

Comments. This result is the starting point of our investigations. About its origin and representation-theoretic significance, see [KOV]. Several direct proofs of the proposition are known. E.g., a simple proof is given in [P.I, Sect. 7]. About generalizations, see [K, BO2]. Note that
\[ \lim_{|z|, |z'| \to \infty} M_{z,z'}(\lambda) = \frac{\dim^2 \lambda}{|\lambda|!}, \]
so that the limit form of the identity of the proposition is
\[ \sum_{\lambda \in \mathbb{Y}_n} \frac{\dim^2 \lambda}{|\lambda|!} = 1, \]
which is well known.

Let $\mathbb{Y} = \mathbb{Y}_0 \sqcup \mathbb{Y}_1 \sqcup \dots$ denote the set of all Young diagrams. Consider the negative binomial distribution on the nonnegative integers, which depends on $t$ and the additional parameter $\xi$, $0 < \xi < 1$:
\[ \pi_{t,\xi}(n) = (1 - \xi)^t \, \frac{(t)_n}{n!} \, \xi^n, \qquad n = 0, 1, \dots. \]
For $\lambda \in \mathbb{Y}$ we set
\[ M_{z,z',\xi}(\lambda) = M_{z,z'}(\lambda) \, \pi_{t,\xi}(|\lambda|). \]
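Proposition 1.1 can be checked by brute force for small $n$. The following sketch evaluates $M_{z,z'}(\lambda)$ in exact rational arithmetic from the first formula for $M_{z,z'}$ above, computing $\dim \lambda$ by the hook length formula (a standard fact that the paper does not state in this form). The parameter values $z = 1/3$, $z' = 1/2$ are an arbitrary admissible choice (second condition with $m = 0$), and all function names are ours.

```python
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def transpose(lam):
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def frobenius(lam):
    conj = transpose(lam)
    d = sum(1 for i, part in enumerate(lam, start=1) if part >= i)
    return [lam[i] - i - 1 for i in range(d)], [conj[i] - i - 1 for i in range(d)]


def dim(lam):
    """Number of standard Young tableaux of shape lam (hook length formula)."""
    conj = transpose(lam)
    hooks = 1
    for i, part in enumerate(lam):
        for j in range(part):
            hooks *= (part - j - 1) + (conj[j] - i - 1) + 1  # arm + leg + 1
    return factorial(sum(lam)) // hooks


def pochhammer(a, k):
    result = Fraction(1)
    for i in range(k):
        result *= a + i
    return result


def M(lam, z, zp):
    """The weight M_{z,z'}(lam), following the first displayed formula for it."""
    t = z * zp
    n = sum(lam)
    p, q = frobenius(lam)
    weight = t ** len(p)
    for pi, qi in zip(p, q):
        weight *= (pochhammer(z + 1, pi) * pochhammer(zp + 1, pi)
                   * pochhammer(-z + 1, qi) * pochhammer(-zp + 1, qi))
    return weight / pochhammer(t, n) * Fraction(dim(lam) ** 2, factorial(n))


z, zp = Fraction(1, 3), Fraction(1, 2)   # assumed sample parameters
totals = [sum(M(lam, z, zp) for lam in partitions(n)) for n in range(7)]
```

Each entry of `totals` is the total mass of $M^{(n)}_{z,z'}$; exact arithmetic avoids any rounding ambiguity in the comparison with 1.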
By the construction, $M_{z,z',\xi}(\cdot)$ is a probability distribution on $\mathbb{Y}$, which can be viewed as a mixing of the finite distributions $M^{(n)}_{z,z'}$. From the formulas for $M_{z,z'}$ and $\pi_{t,\xi}$ we get an explicit expression for $M_{z,z',\xi}$:
\[ M_{z,z',\xi}(\lambda) = (1 - \xi)^t \, \xi^{\sum_{i=1}^{d} (p_i + q_i + 1)} \, t^d \prod_{i=1}^{d} \frac{(z+1)_{p_i} (z'+1)_{p_i} (-z+1)_{q_i} (-z'+1)_{q_i}}{p_i!\, p_i!\, q_i!\, q_i!} \cdot \det{}^2 \left[ \frac{1}{p_i + q_j + 1} \right]. \tag{1.1} \]
Following a certain analogy with models of statistical physics (cf. [V]) one may call $(\mathbb{Y}, M_{z,z',\xi})$ the grand canonical ensemble.

Let $\mathbb{Z}'$ denote the set of half-integers,
\[ \mathbb{Z}' = \mathbb{Z} + \tfrac{1}{2} = \{ \dots, -\tfrac{3}{2}, -\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{3}{2}, \dots \}, \]
and let $\mathbb{Z}'_+$ and $\mathbb{Z}'_-$ be the subsets of positive and negative half-integers, respectively. It will sometimes be convenient to identify both $\mathbb{Z}'_+$ and $\mathbb{Z}'_-$ with $\mathbb{Z}_+ = \{0, 1, 2, \dots\}$ by making use of the correspondence $\pm(k + \tfrac{1}{2}) \leftrightarrow k$, where $k \in \mathbb{Z}_+$.

Denote by $\mathrm{Conf}(\mathbb{Z}')$ the space of all finite subsets of $\mathbb{Z}'$, which will be called configurations. We define an embedding $\lambda \mapsto X$ of the set $\mathbb{Y}$ of Young diagrams into the set $\mathrm{Conf}(\mathbb{Z}')$ of configurations in $\mathbb{Z}'$ as follows:
\[ \lambda = (p_1, \dots, p_d \mid q_1, \dots, q_d) \mapsto X = \{ p_1 + \tfrac{1}{2}, \dots, p_d + \tfrac{1}{2}, -q_1 - \tfrac{1}{2}, \dots, -q_d - \tfrac{1}{2} \}. \tag{1.2} \]
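A small sketch of the embedding (1.2), assuming the Frobenius coordinates are already known. The helper name `configuration` and the sample coordinates $(3, 1 \mid 2, 0)$ are illustrative choices of ours.

```python
from fractions import Fraction


def configuration(p, q):
    """Map Frobenius coordinates (p | q) to a set of half-integers as in (1.2)."""
    half = Fraction(1, 2)
    return {pi + half for pi in p} | {-qi - half for qi in q}


X = configuration([3, 1], [2, 0])        # the diagram (4, 3, 1) in Frobenius notation
positive = {x for x in X if x > 0}
negative = {x for x in X if x < 0}
boxes = sum(abs(x) for x in X)           # equals sum of (p_i + q_i + 1)
```

The configuration has as many positive as negative points, and the sum of the absolute values of its points recovers the number of boxes of the diagram.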
Under the identification $\mathbb{Z}' \simeq \mathbb{Z}_+ \sqcup \mathbb{Z}_+$, the map $\lambda \mapsto X$ simply associates to $\lambda$ the collection of its Frobenius coordinates. The image of the map consists exactly of the balanced configurations, i.e., the configurations $X$ with the property $|X \cap \mathbb{Z}'_+| = |X \cap \mathbb{Z}'_-|$.

Under the embedding $\lambda \mapsto X$ the probability measure $M_{z,z',\xi}$ on $\mathbb{Y}$ turns into a probability measure on the configurations in $\mathbb{Z}'$. Following the conventional terminology, see [DVJ], we get a point process on $\mathbb{Z}'$; let us denote it as $\mathcal{P}_{z,z',\xi}$. Our primary goal is to compute the correlation functions of this point process.

2. Determinantal Point Processes

Let $\mathfrak{X}$ be a countable set. Its finite subsets will be called configurations and denoted by the letters $X, Y$. The space of all configurations is denoted as $\mathrm{Conf}(\mathfrak{X})$; this is a discrete space. In this section, by a point process on $\mathfrak{X}$ we mean a map from a probability space to $\mathrm{Conf}(\mathfrak{X})$.$^5$

Let $\mathcal{P}$ be a point process on $\mathfrak{X}$. It induces a probability distribution on $\mathrm{Conf}(\mathfrak{X})$, which is a nonnegative function $\pi(X)$ such that $\sum_X \pi(X) = 1$. One may simply identify $\mathcal{P}$ and $\pi$: then the underlying probability space is $\mathrm{Conf}(\mathfrak{X})$ itself. We introduce a related function on $\mathrm{Conf}(\mathfrak{X})$ as follows:
\[ \rho(X) = \sum_{Y \supseteq X} \pi(Y). \]
That is, $\rho(X)$ is the probability that the random configuration contains $X$.

Consider the Hilbert space $\ell^2(\mathfrak{X})$. An operator $L$ in $\ell^2(\mathfrak{X})$ will be viewed as an infinite matrix $L(x, y)$ whose rows and columns are indexed by points of $\mathfrak{X}$. By $L_X$ we denote the finite matrix of format $X \times X$ which is obtained from $L$ by letting $x, y$ range over $X$. Assume $L$ is a trace class operator in $\ell^2(\mathfrak{X})$ such that all the principal minors $\det L_X$ are real nonnegative numbers. We agree that $\det L_\varnothing = 1$. We have
\[ \det(1 + L) = \sum_{n} \operatorname{tr}(\wedge^n L) = \sum_{X} \det L_X \]
5 Actually, such a definition is rather restricted but it suffices for our purpose. For general axiomatics of
point processes, see [DVJ,L1,L2].
(here $\wedge^n L$ stands for the $n$th exterior power of $L$ acting in the $n$th exterior power of the Hilbert space $\ell^2(\mathfrak{X})$). Thus, we can define a point process by
\[ \pi(X) = \frac{\det L_X}{\det(1 + L)}. \]
Let us call it the determinantal point process determined by the operator $L$.$^6$

Proposition 2.1. Let $L$ satisfy the above assumption, $\pi(\cdot)$ be the corresponding point process, and $\rho(\cdot)$ be the associated function as defined above. Set $K = L(1 + L)^{-1}$. Then $\rho(X) = \det K_X$.

Proof. We shall reproduce the argument indicated in [DVJ, Exercise 5.4.7]. Let $f(x)$ be a function on $\mathfrak{X}$ such that $f_0(x) = f(x) - 1$ is finitely supported. For any point process $\pi(\cdot)$,
\[ \sum_{Y} \pi(Y) \prod_{y \in Y} f(y) = \sum_{Y} \pi(Y) \prod_{y \in Y} (1 + f_0(y)) = \sum_{X} \rho(X) \prod_{x \in X} f_0(x). \]
When the process is defined by an operator $L$ then, identifying $f$ with the diagonal matrix with diagonal entries $f(x)$, we get
\[ \sum_{Y} \pi(Y) \prod_{y \in Y} f(y) = \sum_{Y} \det L_Y \prod_{y \in Y} f(y) \, \det{}^{-1}(1 + L) = \det(1 + fL) \, \det{}^{-1}(1 + L) = \det\left( (1 + fL)(1 + L)^{-1} \right) \]
\[ = \det\left( (1 + L + f_0 L)(1 + L)^{-1} \right) = \det(1 + f_0 K) = \sum_{X} \det K_X \prod_{x \in X} f_0(x). \]
Thus,
\[ \sum_{X} \rho(X) \prod_{x \in X} f_0(x) = \sum_{X} \det K_X \prod_{x \in X} f_0(x) \]
for any finitely supported function $f_0$, which implies $\rho(X) = \det K_X$. □

Let $\rho_n$ be the restriction of $\rho(\cdot)$ to the $n$-point configurations. One can view $\rho_n$ as a symmetric function in $n$ variables,
\[ \rho_n(x_1, \dots, x_n) = \rho(\{x_1, \dots, x_n\}), \qquad x_1, \dots, x_n \text{ pairwise distinct}. \]
In this notation, the result of Proposition 2.1 reads as follows:
\[ \rho_n(x_1, \dots, x_n) = \det[K(x_i, x_j)]_{1 \le i, j \le n}. \]
We call $\rho_n$ the $n$-point correlation function.

6 This term is not a conventional one. Such processes, not necessarily on discrete spaces, arise in different topics, in particular, in connection with random matrices. In [DVJ], they are called “fermion processes” but in the random matrix literature no special term is adopted. See [S] for a comprehensive survey on determinantal point processes.
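Proposition 2.1 is easy to test on a finite set. The sketch below takes a three-point set and a toy matrix $L$ (a symmetric positive definite matrix chosen by us so that all principal minors are positive; it does not come from the paper), builds $\pi$ and $\rho$ from their definitions, and compares $\rho(X)$ with $\det K_X$ in exact rational arithmetic. All helper names are ours.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    m = [row[:] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def inverse(m):
    """Exact inverse by Gauss-Jordan elimination (the matrix is assumed invertible)."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def minor(m, subset):
    return [[m[i][j] for j in subset] for i in subset]


# Assumed toy example: the set {0, 1, 2} and a positive definite L.
L = [[Fraction(v) for v in row] for row in [[2, 1, 0], [1, 2, 1], [0, 1, 2]]]
n = len(L)
one_plus_L = [[L[i][j] + int(i == j) for j in range(n)] for i in range(n)]
K = matmul(L, inverse(one_plus_L))

subsets = [s for r in range(n + 1) for s in combinations(range(n), r)]
pi = {s: det(minor(L, s)) / det(one_plus_L) for s in subsets}          # pi(X)
rho = {s: sum(pi[y] for y in subsets if set(s) <= set(y)) for s in subsets}  # rho(X)
```

On a finite set the expansion $\det(1 + L) = \sum_X \det L_X$ is a finite identity, so the comparison is exact rather than approximate.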
From now on we assume that $\mathfrak{X} = \mathfrak{X}_+ \sqcup \mathfrak{X}_-$ (disjoint union of two countable sets) and we write $\ell^2(\mathfrak{X}) = \ell^2(\mathfrak{X}_+) \oplus \ell^2(\mathfrak{X}_-)$. According to this decomposition we write operators in $\ell^2(\mathfrak{X})$ as $2 \times 2$ operator matrices,
\[ L = \begin{pmatrix} L_{++} & L_{+-} \\ L_{-+} & L_{--} \end{pmatrix}, \qquad K = \begin{pmatrix} K_{++} & K_{+-} \\ K_{-+} & K_{--} \end{pmatrix}. \]
Given a configuration $X$, we set $X^{\pm} = X \cap \mathfrak{X}_{\pm}$. We shall deal with operators $L$ such that $L_{++} = 0$, $L_{--} = 0$. Then, as is easily seen, $\pi(X) = 0$ unless $X$ is balanced: $|X^+| = |X^-|$.

Proposition 2.2. The transforms $L \mapsto K = L(1 + L)^{-1}$ and $K \mapsto L = K(1 - K)^{-1}$ define a bijective correspondence between
(i) the operators $L$ of the form
\[ L = \begin{pmatrix} 0 & A \\ -B & 0 \end{pmatrix}, \tag{2.1} \]
where the matrix $1 + AB$ is invertible (equivalently, $1 + BA$ is invertible) and
(ii) the operators $K$ of the form
\[ K = \begin{pmatrix} CD & C \\ DCD - D & DC \end{pmatrix}, \tag{2.2} \]
where $1 - CD$ is invertible (equivalently, $1 - DC$ is invertible).
In terms of the blocks, this correspondence takes the form
\[ C = (1 + AB)^{-1} A = A (1 + BA)^{-1}, \qquad D = B, \tag{2.3} \]
\[ A = C (1 - DC)^{-1} = (1 - CD)^{-1} C, \qquad B = D. \tag{2.4} \]
In particular,
\[ 1 - CD = (1 + AB)^{-1}, \qquad 1 - DC = (1 + BA)^{-1}. \tag{2.5} \]

Proof. The proof is straightforward, see [P.V, Prop. 1.8]. □

Proposition 2.3. Let $K$ be a $J$-Hermitian$^7$ kernel of the form (2.2). Then
\[ L = \begin{pmatrix} 0 & D^* \\ -D & 0 \end{pmatrix}. \]

Proof. By Proposition 2.2, $L$ is given by the formula (2.1) with $B = D$. Since $K$ is $J$-Hermitian, $L$ is $J$-Hermitian, too. This implies $A = B^* = D^*$. □

Remark 2.4. The major source of determinantal point processes is Random Matrix Theory, see [Me]. The first examples of such processes go back to the sixties, see [Dy,L3]. As a separate class, these processes were identified by Macchi [Mac1,Mac2], who used the term fermion point processes, see also [DVJ]. We have encountered certain $J$-Hermitian kernels in our work on harmonic analysis on the infinite symmetric group, see [BO1, P.I–P.V]. At that time we were not aware of the fact that examples of $J$-Hermitian correlation kernels had appeared before in works of mathematical physicists on solvable models of systems with positively and negatively charged particles, see [AF, CJ1, CJ2, G, F1–F3] and references therein. The relation of the present work to [BO1, P.I–P.V] is briefly discussed in Sect. 5.

7 I.e., Hermitian with respect to the indefinite inner product determined by the matrix $J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.
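For the reader's convenience, the block identities of Proposition 2.2 can be verified by a direct computation. The following derivation is our own routine check, not a reproduction of the argument in [P.V]; it uses only the elementary identity $A(1+BA)^{-1} = (1+AB)^{-1}A$ and assumes $1 + AB$ and $1 + BA$ are invertible.

```latex
\[
(1+L)^{-1}
= \begin{pmatrix} 1 & A \\ -B & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} (1+AB)^{-1} & -A(1+BA)^{-1} \\ B(1+AB)^{-1} & (1+BA)^{-1} \end{pmatrix},
\]
\[
K = L(1+L)^{-1} = 1 - (1+L)^{-1}
= \begin{pmatrix} 1-(1+AB)^{-1} & A(1+BA)^{-1} \\ -B(1+AB)^{-1} & 1-(1+BA)^{-1} \end{pmatrix}.
\]
% With C = A(1+BA)^{-1} and D = B:
\[
CD = AB(1+AB)^{-1} = 1-(1+AB)^{-1}, \qquad
DC = BA(1+BA)^{-1} = 1-(1+BA)^{-1},
\]
\[
DCD - D = -(1-DC)\,D = -(1+BA)^{-1}B = -B(1+AB)^{-1}.
\]
```

Comparing the entries with (2.2) gives the form of $K$ together with the relations (2.3) and (2.5); (2.4) follows by solving (2.3) for $A$.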
3. Calculation of the Correlation Functions. The Hypergeometric Kernel

In this section we shall apply the formalism of Sect. 2 to the point processes $\mathcal{P}_{z,z',\xi}$ introduced at the end of Sect. 1. Naturally, we specify $\mathfrak{X} = \mathbb{Z}'$ and $\mathfrak{X}_{\pm} = \mathbb{Z}'_{\pm}$.

Let us introduce two meromorphic functions in $u$ depending on $z, z', \xi$ as parameters:
\[ \psi_{\pm}(u) = t^{1/2} \xi^{u + 1/2} (1 - \xi)^{\pm(z + z')} \, \frac{\Gamma(u + 1 \pm z) \Gamma(u + 1 \pm z')}{\Gamma(1 \pm z) \Gamma(1 \pm z') \Gamma(u + 1) \Gamma(u + 1)} \]
\[ = t^{-1/2} \xi^{u + 1/2} (1 - \xi)^{\pm(z + z')} \, \frac{\Gamma(u + 1 \pm z) \Gamma(u + 1 \pm z')}{\Gamma(\pm z) \Gamma(\pm z') \Gamma(u + 1) \Gamma(u + 1)}. \tag{3.1} \]
An important fact is that the functions $\psi_{\pm}(u)$ have exponential decay as $u$ tends to $+\infty$ along the real axis. (Indeed, since $\xi \in (0, 1)$, the factor $\xi^u$ has exponential decay; as for the remaining expression in (3.1), it behaves as a constant times $u^{\pm(z + z')}$, so that it has at most polynomial growth.)

We shall consider two diagonal matrices of format $\mathbb{Z}_+ \times \mathbb{Z}_+$, depending on the parameters $z, z', \xi$ and denoted as $\Psi_{\pm}$, and a third matrix of the same format denoted as $W$:
\[ \Psi_{\pm} = \operatorname{diag}\left\{ \frac{t^{1/2} \xi^{k + 1/2} (1 - \xi)^{\pm(z + z')} (1 \pm z)_k (1 \pm z')_k}{k!\, k!} \right\}, \qquad W(k, l) = \frac{1}{k + l + 1}, \tag{3.2} \]
where $k, l$ range over $\mathbb{Z}_+$. Note that the $k$th diagonal entry of $\Psi_{\pm}$ equals $\psi_{\pm}(k)$. As the diagonal entries of $\Psi_{\pm}$ are real and positive, we may introduce the diagonal matrices $\Psi_{\pm}^{1/2}$, which are real and positive, too. Note also that $W$ is real and symmetric.

Proposition 3.1. The point process $\mathcal{P}_{z,z',\xi}$ is a determinantal process in the sense of Sect. 2, and the corresponding operator $L$ is as follows:
\[ L = \begin{pmatrix} 0 & \Psi_+^{1/2} W \Psi_-^{1/2} \\ -\Psi_-^{1/2} W \Psi_+^{1/2} & 0 \end{pmatrix}. \tag{3.3} \]
Note that $L$ is real and $J$-symmetric.

Proof. Let $\lambda$ be a Young diagram and $X$ be the corresponding configuration, see (1.2). We must prove that
\[ M_{z,z',\xi}(\lambda) = \frac{\det L_X}{\det(1 + L)}. \tag{3.4} \]
Since $L$ has the form $\begin{pmatrix} 0 & A \\ -A' & 0 \end{pmatrix}$, where the prime sign means transposition, we have, taking account of the exact expression for the matrix $A$ (see (3.3)):
\[ \det L_X = \det{}^2 [A(p_i, q_j)]_{1 \le i, j \le d} = \prod_{i=1}^{d} \psi_+(p_i) \psi_-(q_i) \cdot \det{}^2 \left[ \frac{1}{p_i + q_j + 1} \right]. \]
In the latter product, the factor $(1 - \xi)^{z + z'}$ coming from $\psi_+$ cancels with the factor $(1 - \xi)^{-(z + z')}$ coming from $\psi_-$, and we get
\[ \frac{\det L_X}{\det(1 + L)} = \det(1 + L)^{-1} \, \xi^{\sum_{i=1}^{d} (p_i + q_i + 1)} \, t^d \prod_{i=1}^{d} \frac{(z+1)_{p_i} (z'+1)_{p_i} (-z+1)_{q_i} (-z'+1)_{q_i}}{p_i!\, p_i!\, q_i!\, q_i!} \cdot \det{}^2 \left[ \frac{1}{p_i + q_j + 1} \right]. \]
Comparing this with the expression (1.1) for $M_{z,z',\xi}(\lambda)$, we see that they coincide up to a constant factor which does not depend on $\lambda$. Since both expressions define probability distributions, we conclude that they are identical. □

Remark 3.2. As a by-product of the proof we get the following result:
\[ \det(1 + L) = \det(1 + \Psi_+^{1/2} W \Psi_- W \Psi_+^{1/2}) = (1 - \xi)^{-t}. \]
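The identity of Remark 3.2 lends itself to a quick numerical sanity check. The sketch below truncates the infinite matrices $\Psi_\pm$ and $W$ of (3.2) to a finite size and compares the determinant with $(1 - \xi)^{-t}$ in floating point. The truncation size 40 and the parameter values $z = 1/3$, $z' = 1/2$, $\xi = 0.3$ are assumptions made for this illustration; the truncation is harmless here because the entries of $\Psi_\pm$ decay like $\xi^k$.

```python
from math import sqrt


def pochhammer(a, k):
    result = 1.0
    for i in range(k):
        result *= a + i
    return result


def psi(sign, k, z, zp, xi):
    """k-th diagonal entry of Psi_+ (sign=+1) or Psi_- (sign=-1), following (3.2)."""
    t = z * zp
    k_factorial = pochhammer(1.0, k)
    return (sqrt(t) * xi ** (k + 0.5) * (1 - xi) ** (sign * (z + zp))
            * pochhammer(1 + sign * z, k) * pochhammer(1 + sign * zp, k) / k_factorial ** 2)


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n = len(m)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


z, zp, xi = 1 / 3, 1 / 2, 0.3   # assumed sample parameters
size = 40                        # assumed truncation of the infinite matrices
psi_plus = [psi(+1, k, z, zp, xi) for k in range(size)]
psi_minus = [psi(-1, k, z, zp, xi) for k in range(size)]

# A = Psi_+^{1/2} W Psi_-^{1/2}, the upper right block of L in (3.3).
A = [[sqrt(psi_plus[k]) * sqrt(psi_minus[l]) / (k + l + 1) for l in range(size)]
     for k in range(size)]
# 1 + A A^T = 1 + Psi_+^{1/2} W Psi_- W Psi_+^{1/2}
gram = [[(1.0 if i == j else 0.0) + sum(A[i][m] * A[j][m] for m in range(size))
         for j in range(size)] for i in range(size)]

lhs = det(gram)
rhs = (1 - xi) ** (-(z * zp))
```

Here `lhs` approximates $\det(1 + L)$ and `rhs` is the closed form from the remark.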
By Propositions 3.1 and 2.1, the correlation functions of the point process $\mathcal{P}_{z,z',\xi}$ are given by the determinantal formula involving the operator $K = L(1 + L)^{-1}$. Theorem 3.3 below provides an explicit expression for this operator.

Introduce the functions
\[ R_{\pm}(u) = \psi_{\pm}(u) \, F\left( \mp z, \mp z'; u + 1; \frac{\xi}{\xi - 1} \right), \tag{3.5} \]
\[ S_{\pm}(u) = \frac{t^{1/2} \xi^{1/2} \psi_{\pm}(u)}{1 - \xi} \, \frac{F\left( 1 \mp z, 1 \mp z'; u + 2; \frac{\xi}{\xi - 1} \right)}{u + 1}, \tag{3.6} \]
\[ P_{\pm}(u) = (\psi_{\pm}(u))^{-1/2} R_{\pm}(u), \qquad Q_{\pm}(u) = (\psi_{\pm}(u))^{-1/2} S_{\pm}(u). \tag{3.7} \]
They are all well-defined for $u \ge 0$, because $\psi_{\pm}(u)$ is strictly positive for $u \ge 0$. In particular, they are well-defined at the points $u = k \in \mathbb{Z}_+$. Note that the hypergeometric function that enters (3.5) or (3.6) remains bounded as $u \to +\infty$. Hence, the exponential decay of $\psi_{\pm}(u)$ implies the exponential decay of $R_{\pm}(u)$ and $S_{\pm}(u)$ as $u$ tends to $+\infty$ along the real axis.

Theorem 3.3. Let $K = \begin{pmatrix} K_{++} & K_{+-} \\ K_{-+} & K_{--} \end{pmatrix}$ be the operator in $\ell^2(\mathbb{Z}_+) \oplus \ell^2(\mathbb{Z}_+)$ with the blocks
\[ K_{++}(k, l) = \frac{P_+(k) Q_+(l) - Q_+(k) P_+(l)}{k - l}, \]
\[ K_{+-}(k, l) = \frac{P_+(k) P_-(l) + Q_+(k) Q_-(l)}{k + l + 1}, \]
\[ K_{-+}(k, l) = -\frac{P_-(k) P_+(l) + Q_-(k) Q_+(l)}{k + l + 1}, \]
\[ K_{--}(k, l) = \frac{P_-(k) Q_-(l) - Q_-(k) P_-(l)}{k - l}. \]
Here the functions $P_{\pm}(u)$ and $Q_{\pm}(u)$ are defined in (3.7), and the expressions $K_{\pm\pm}(k, l)|_{k = l}$ are understood according to the L'Hospital rule. Then $K = L(1 + L)^{-1}$, where $L$ is defined in (3.3).
We shall call the kernel $K$ defined above the hypergeometric kernel. Note that $K$ is $J$-symmetric (because of the minus sign in the expression for $K_{-+}$).

Proof. We shall prove that $K$ has the form (2.2) with
\[ C = K_{+-}, \qquad D = -L_{-+} = \Psi_-^{1/2} W \Psi_+^{1/2}. \tag{3.8} \]
I.e.,
\[ K_{++} = CD, \qquad K_{--} = DC, \qquad K_{-+} = DCD - D. \tag{3.9} \]
As $K$ is $J$-symmetric, the desired result will follow from Proposition 2.3.$^8$

To avoid square roots, it is convenient to rewrite the desired relations (3.9) in terms of the operators
\[ N = \Psi_+^{1/2} C \Psi_-^{1/2}, \qquad W = \Psi_-^{-1/2} D \Psi_+^{-1/2} \]
with the kernels
\[ N(k, l) = \frac{R_+(k) R_-(l) + S_+(k) S_-(l)}{k + l + 1}, \qquad W(k, l) = \frac{1}{k + l + 1}. \]
By virtue of the connection between $P_{\pm}, Q_{\pm}$ and $R_{\pm}, S_{\pm}$, the relations (3.9) are equivalent to the following ones:
\[ (NW)(k, l) = \frac{1}{\psi_+(l)} \, \frac{R_+(k) S_+(l) - S_+(k) R_+(l)}{k - l}, \tag{3.10} \]
\[ (WN)(k, l) = \frac{1}{\psi_-(k)} \, \frac{R_-(k) S_-(l) - S_-(k) R_-(l)}{k - l}, \tag{3.11} \]
\[ (WNW - W)(k, l) = -\frac{1}{\psi_-(k) \psi_+(l)} \, \frac{R_-(k) R_+(l) + S_-(k) S_+(l)}{k + l + 1}. \tag{3.12} \]
We note once more that, by agreement, the indeterminacy arising in (3.10) and (3.11) for $k = l$ is removed by making use of the L'Hospital rule. To prove the relations above we shall need certain identities involving the hypergeometric function.

Lemma 3.4. Set
\[ \widehat{R}_{\pm}(u) = \sum_{k=0}^{\infty} \frac{R_{\pm}(k)}{u + k + 1}, \qquad \widehat{S}_{\pm}(u) = \sum_{k=0}^{\infty} \frac{S_{\pm}(k)}{u + k + 1}. \tag{3.13} \]
Then the series absolutely converge for $u \ne -1, -2, \dots$ and the following relations hold:
\[ \widehat{R}_{\pm}(u) = \psi_{\mp}(u)^{-1} S_{\mp}(u), \qquad \widehat{S}_{\pm}(u) = 1 - \psi_{\mp}(u)^{-1} R_{\mp}(u). \tag{3.14} \]

Lemma 3.5.
\[ R_+(u) R_-(-u - 1) + S_+(u) S_-(-u - 1) = \psi_+(u) \psi_-(-u - 1). \tag{3.15} \]

8 In the same way, one could verify directly that the operators $C, D$ obey the relations (2.3), which means that $K$ coincides with $L(1 + L)^{-1}$. However, reference to Proposition 2.3 makes this verification redundant.
The proofs of these two lemmas can be found in the Appendix (Sect. 7).

Let us now check (3.10). By the definition of $N$ and $W$,
\[ (NW)(k, l) = \sum_{j=0}^{\infty} \frac{R_+(k) R_-(j) + S_+(k) S_-(j)}{(k + j + 1)(j + l + 1)}. \tag{3.16} \]
Assume first $k \ne l$. We write
\[ \frac{1}{(k + j + 1)(j + l + 1)} = \frac{1}{k - l} \left( \frac{1}{l + j + 1} - \frac{1}{k + j + 1} \right), \tag{3.17} \]
and plug this into (3.16). Then we get
\[ (NW)(k, l) = \frac{R_+(k)}{k - l} \sum_{j=0}^{\infty} \frac{R_-(j)}{l + j + 1} + \frac{S_+(k)}{k - l} \sum_{j=0}^{\infty} \frac{S_-(j)}{l + j + 1} - \frac{R_+(k)}{k - l} \sum_{j=0}^{\infty} \frac{R_-(j)}{k + j + 1} - \frac{S_+(k)}{k - l} \sum_{j=0}^{\infty} \frac{S_-(j)}{k + j + 1}. \]
By (3.13), this can be written as
\[ (NW)(k, l) = \frac{R_+(k) \widehat{R}_-(l) + S_+(k) \widehat{S}_-(l)}{k - l} - \frac{R_+(k) \widehat{R}_-(k) + S_+(k) \widehat{S}_-(k)}{k - l}. \]
Applying (3.14), we get
\[ R_+(k) \widehat{R}_-(l) + S_+(k) \widehat{S}_-(l) = \frac{R_+(k) S_+(l) - S_+(k) R_+(l)}{\psi_+(l)} + S_+(k), \]
\[ R_+(k) \widehat{R}_-(k) + S_+(k) \widehat{S}_-(k) = \frac{R_+(k) S_+(k) - S_+(k) R_+(k)}{\psi_+(k)} + S_+(k) = S_+(k). \]
This implies (3.10) for $k \ne l$.

To extend the argument above to the case $k = l$, we replace (3.17) by a slightly more complicated expression that makes sense for any $k, l \in \mathbb{Z}_+$:
\[ \frac{1}{(k + j + 1)(j + l + 1)} = \lim_{u \to l} \frac{1}{k - u} \left( \frac{1}{u + j + 1} - \frac{1}{k + j + 1} \right), \tag{3.18} \]
where $u$ is assumed to be nonintegral. Since the functions $R_{\pm}(u), S_{\pm}(u)$ have exponential decay as $u \to +\infty$ (see the paragraph before Theorem 3.3), we may interchange summation and the limit transition. Then we can repeat all the transformations. At the very end we must pass to the limit as $u \to l$, which means that we follow the L'Hospital rule. This concludes the proof of (3.10).

The proof of (3.11) is quite similar, and we proceed to the proof of (3.12). By virtue of the expression (3.10) for $NW$ and our agreement about the L'Hospital rule, we get
\[ (WNW)(k, l) = \lim_{u \to l} \frac{1}{\psi_+(u)} \sum_{j=0}^{\infty} \frac{R_+(j) S_+(u) - S_+(j) R_+(u)}{(k + j + 1)(j - u)}. \tag{3.19} \]
Using the transformation
\[ \frac{1}{(k + j + 1)(j - u)} = -\frac{1}{k + u + 1} \left( \frac{1}{k + j + 1} - \frac{1}{(-u - 1) + j + 1} \right), \]
we rewrite the above sum as follows:
\[ \sum_{j=0}^{\infty} \frac{R_+(j) S_+(u) - S_+(j) R_+(u)}{(k + j + 1)(j - u)} = -\frac{\widehat{R}_+(k) S_+(u) - \widehat{S}_+(k) R_+(u)}{k + u + 1} + \frac{\widehat{R}_+(-u - 1) S_+(u) - \widehat{S}_+(-u - 1) R_+(u)}{k + u + 1}. \]
Next, applying (3.14), we transform this to
\[ -\frac{S_-(k) S_+(u) + R_-(k) R_+(u)}{\psi_-(k)(k + u + 1)} + \frac{R_+(u)}{k + u + 1} + \frac{S_-(-u - 1) S_+(u) + R_-(-u - 1) R_+(u)}{\psi_-(-u - 1)(k + u + 1)} - \frac{R_+(u)}{k + u + 1}. \]
Here the second and the fourth fractions cancel each other, while the third fraction equals $\psi_+(u)/(k + u + 1)$, because of (3.15). Consequently, the whole expression is equal to
\[ -\frac{S_-(k) S_+(u) + R_-(k) R_+(u)}{\psi_-(k)(k + u + 1)} + \frac{\psi_+(u)}{k + u + 1}. \]
Substituting this expression instead of the sum in (3.19), we get
\[ (WNW)(k, l) = \lim_{u \to l} \left( -\frac{S_-(k) S_+(u) + R_-(k) R_+(u)}{\psi_-(k) \psi_+(u)(k + u + 1)} + \frac{1}{k + u + 1} \right) = -\frac{S_-(k) S_+(l) + R_-(k) R_+(l)}{\psi_-(k) \psi_+(l)(k + l + 1)} + \frac{1}{k + l + 1}. \]
Thus,
\[ (WNW - W)(k, l) = -\frac{S_-(k) S_+(l) + R_-(k) R_+(l)}{\psi_-(k) \psi_+(l)(k + l + 1)}, \]
which proves (3.12). This completes the proof of the theorem. □

Remark 3.6. The proof of Theorem 3.3 given above is relatively simple but it does not explain how the correlation kernel has been found. A conceptual proof of Theorem 3.3 will appear in [BO3]. Another, representation-theoretic proof of Theorem 3.3 was proposed recently by Okounkov [O2,O3].
4. Connection with Meixner Polynomials

In this section we shall show that when one of the parameters $z, z'$ becomes an integer, the “++”-block of the hypergeometric kernel defined in Theorem 3.3 turns into the Christoffel–Darboux kernel for Meixner orthogonal polynomials.

The Meixner polynomials form a system $\{M_n(k; \alpha + 1, \xi)\}$ of orthogonal polynomials, which corresponds to the following weight function on $\mathbb{Z}_+$:
\[ f(k) = f_{\alpha,\xi}(k) = \frac{(\alpha + 1)_k \, \xi^k}{k!} = \frac{\Gamma(\alpha + 1 + k) \, \xi^k}{\Gamma(\alpha + 1) \, k!}, \qquad k \in \mathbb{Z}_+. \]
Here $k \in \mathbb{Z}_+$ is the argument and $\alpha > -1$ and $0 < \xi < 1$ are parameters; $\deg M_n(k; \alpha + 1, \xi) = n$. For detailed information about these polynomials see [NSU,KS].$^9$

Meixner polynomials can be expressed through the Gauss hypergeometric function:
\[ M_n(k; \alpha + 1, \xi) = F\left( -n, -k; \alpha + 1; \frac{\xi - 1}{\xi} \right) = \left( \frac{1 - \xi}{\xi} \right)^{n} \frac{k! \, \Gamma(-\alpha - n)}{\Gamma(1 + k - n) \Gamma(-\alpha)} \, F\left( -n, -\alpha - n; 1 + k - n; \frac{\xi}{\xi - 1} \right). \]
Basic constants related to these polynomials have the form
\[ M_n(k; \alpha + 1, \xi) = a_n k^n + \{\text{lower degree terms in } k\}, \qquad a_n = \left( \frac{\xi - 1}{\xi} \right)^{n} \frac{1}{(\alpha + 1)_n}, \]
\[ h_n = \| M_n(k; \alpha + 1, \xi) \|^2 = \sum_{k=0}^{\infty} M_n^2(k; \alpha + 1, \xi) f(k) = \frac{n!}{\xi^n (1 - \xi)^{\alpha + 1} (\alpha + 1)_n}. \]
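The orthogonality relations and the norms $h_n$ quoted above can be checked numerically. The following sketch evaluates $M_n(k; \alpha + 1, \xi)$ from the terminating hypergeometric series and sums against the weight $f(k)$, truncating the sum over $k$. The parameter values $\alpha = 0.5$, $\xi = 0.3$ and the cutoff of 400 terms are assumptions made for the illustration; the function names are ours.

```python
from math import factorial


def pochhammer(a, k):
    result = 1.0
    for i in range(k):
        result *= a + i
    return result


def meixner(n, k, beta, xi):
    """M_n(k; beta, xi) = F(-n, -k; beta; (xi - 1)/xi), a terminating series."""
    x = (xi - 1) / xi
    total = 0.0
    for j in range(min(n, k) + 1):  # terms with j > min(n, k) vanish
        total += (pochhammer(-n, j) * pochhammer(-k, j)
                  / (pochhammer(beta, j) * factorial(j)) * x ** j)
    return total


def inner_product(m, n, beta, xi, terms=400):
    """Truncated sum of M_m(k) M_n(k) f(k) over k = 0, 1, ..., terms - 1."""
    total = 0.0
    f = 1.0  # weight f(0) = 1, updated by f(k+1) = f(k) (beta + k) xi / (k + 1)
    for k in range(terms):
        total += meixner(m, k, beta, xi) * meixner(n, k, beta, xi) * f
        f *= (beta + k) * xi / (k + 1)
    return total


def h(n, alpha, xi):
    """The squared norm h_n as given in the text."""
    return factorial(n) / (xi ** n * (1 - xi) ** (alpha + 1) * pochhammer(alpha + 1, n))


alpha, xi = 0.5, 0.3   # assumed sample parameters
gram = [[inner_product(m, n, alpha + 1, xi) for n in range(4)] for m in range(4)]
```

The weight is updated recursively rather than computed from factorials, which avoids floating-point overflow for large $k$.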
Consider the $N$th Christoffel–Darboux kernel for the Meixner polynomials. It projects the Hilbert space $\ell^2(\mathbb{Z}_+, f(\cdot))$ on the $N$-dimensional subspace spanned by the polynomials of degree $\le N - 1$. Let us pass from $\ell^2(\mathbb{Z}_+, f(\cdot))$ to the ordinary $\ell^2$ space on $\mathbb{Z}_+$, which corresponds to the counting measure. Then the Christoffel–Darboux kernel will be transformed to a certain kernel, which will be called the Meixner kernel and denoted as $M_N(k, l)$. We have:
\[ M_N(k, l) = \sum_{n=0}^{N-1} \frac{M_n(k; \alpha + 1, \xi) \, M_n(l; \alpha + 1, \xi)}{h_n} \sqrt{f(k) f(l)} \]
\[ = \frac{a_{N-1}}{a_N h_{N-1}} \sqrt{f(k) f(l)} \times \frac{M_N(k; \alpha + 1, \xi) M_{N-1}(l; \alpha + 1, \xi) - M_{N-1}(k; \alpha + 1, \xi) M_N(l; \alpha + 1, \xi)}{k - l}. \]
9 Our normalization of the Meixner polynomials coincides with that of [KS] and slightly differs from that of [NSU].
Proposition 4.1. Let $z = N + \alpha$, $z' = N$, and let $K_{++}(k, l)$ be the “++” block of the corresponding hypergeometric kernel. Then
\[ K_{++}(k, l) = M_N(k + N, l + N). \]

Proof. The proof is straightforward. □

Consider the $N$-point “Meixner ensemble” on $\mathbb{Z}_+$ whose joint probability distribution has the form
\[ p(k_1, \dots, k_N) = \mathrm{const} \cdot \prod_{1 \le i < j \le N} (k_i - k_j)^2 \prod_{i=1}^{N} f_{\alpha,\xi}(k_i). \]
A standard argument (see, e.g., [Me]) shows that the correlation functions of this ensemble are given by determinantal formulas with the Meixner kernel:
\[ \rho_n(k_1, \dots, k_n) = \det[M_N(k_i, k_j)]_{i,j=1}^{n}. \]
Then Proposition 4.1 shows that our point process $\mathcal{P}_{z,z',\xi}$ restricted to the first copy of $\mathbb{Z}_+$ for $z = N + \alpha$, $z' = N$ coincides with the trace of the $N$-point Meixner ensemble on the set $\{N, N + 1, \dots\}$. In this subset the number of points of the Meixner ensemble can vary from 0 to $N$, which agrees with our picture.

5. Scaling Limit of the Hypergeometric Kernel: The Whittaker Kernel

Recall that the construction of the point processes $\mathcal{P}_{z,z',\xi}$ was started from certain probability distributions on $\mathbb{Y}_n$ denoted as $M^{(n)}_{z,z'}$, see Sect. 1. These distributions possess an additional important property: they converge, as $n \to \infty$, to a probability distribution on a certain limit object called the Thoma simplex:
\[ \Omega = \Big\{ \alpha_1 \ge \alpha_2 \ge \dots \ge 0; \ \beta_1 \ge \beta_2 \ge \dots \ge 0 \ \Big| \ \sum_{i=1}^{\infty} (\alpha_i + \beta_i) \le 1 \Big\}. \tag{5.1} \]
It is a compact topological space with respect to the topology of coordinate-wise convergence. More precisely, for every $n$ we embed the set $\mathbb{Y}_n$ of partitions of $n$ into $\Omega$ by making use of the map
\[ \mathbb{Y}_n \ni \lambda = (p_1, \dots, p_d \mid q_1, \dots, q_d) \mapsto \left( \frac{p_1 + 1/2}{n}, \dots, \frac{p_d + 1/2}{n}, 0, 0, \dots; \ \frac{q_1 + 1/2}{n}, \dots, \frac{q_d + 1/2}{n}, 0, 0, \dots \right). \tag{5.2} \]
Next, we identify $M^{(n)}_{z,z'}$ with its push-forward under the map (5.2), so that $M^{(n)}_{z,z'}$ turns into a probability measure on $\Omega$ with finite support.

Theorem 5.1. The measures $M^{(n)}_{z,z'}$ weakly converge to a probability measure $P_{z,z'}$ on $\Omega$ as $n \to \infty$.

Proof. This follows from a general theorem, see [KOV]. □
Recall now that to construct the process $\mathcal{P}_{z,z',\xi}$ on the lattice $\mathbb{Z}'$ we have mixed all the distributions $M^{(n)}_{z,z'}$, $n = 0, 1, 2, \dots$, using the negative binomial distribution with suitable parameters, see Sect. 1:
\[ \pi(n) = (1 - \xi)^t \, \frac{(t)_n}{n!} \, \xi^n, \qquad \xi \in (0, 1). \tag{5.3} \]
Let us embed $\mathbb{Z}'$ into the punctured line $\mathbb{R}^* = \mathbb{R} \setminus \{0\}$ and then rescale the process $\mathcal{P}_{z,z',\xi}$ by multiplying the coordinates of its points by $(1 - \xi)$. Then the rescaled point configuration in $\mathbb{R}^*$ that corresponds to $\lambda \in \mathbb{Y}_n$ differs from the image (5.2) of $\lambda$ in $\Omega$ by the scaling factor $(1 - \xi) n$. The discrete distribution on the positive semiaxis with
\[ \mathrm{Prob}\{(1 - \xi) n\} = (1 - \xi)^t \, \frac{(t)_n}{n!} \, \xi^n, \qquad n = 0, 1, 2, \dots, \]
which depends on the parameter $\xi \in (0, 1)$, converges, as $\xi \to 1$, to the gamma distribution with parameter $t$:
\[ \gamma(ds) = \frac{s^{t-1}}{\Gamma(t)} \, e^{-s} \, ds. \tag{5.4} \]
This brings us to the following construction. Consider the space $\widetilde{\Omega} = \Omega \times \mathbb{R}_+$ with the probability measure
\[ \widetilde{P}_{z,z'} = P_{z,z'} \otimes \frac{s^{t-1}}{\Gamma(t)} \, e^{-s} \, ds. \]
To any point $\omega = ((\alpha \mid \beta), s) \in \widetilde{\Omega}$ we associate a point configuration in $\mathbb{R}^*$ as follows:
\[ ((\alpha \mid \beta), s) \mapsto (\alpha_1 s, \alpha_2 s, \dots; \ -\beta_1 s, -\beta_2 s, \dots). \tag{5.5} \]
Thus, the measure $\widetilde{P}_{z,z'}$ defines a point process on $\mathbb{R}^*$ which will be denoted as $\widetilde{\mathcal{P}}_{z,z'}$. Then the considerations above together with Theorem 5.1 suggest the following

Theorem 5.2. The point processes $\mathcal{P}_{z,z',\xi}$ scaled by $(1 - \xi)$ converge, as $\xi \to 1$, to the point process $\widetilde{\mathcal{P}}_{z,z'}$.

We will give a rigorous formulation of this claim and its proof in [BO3]. Meanwhile, we will use this theorem as a hint.

The main result of our previous work [P.I–P.V] was an explicit computation of the correlation functions of $\widetilde{\mathcal{P}}_{z,z'}$.$^{10}$ To formulate this result we shall need the classical Whittaker function $W_{\kappa,\mu}(x)$, $x > 0$. This function can be characterized as the only solution of the Whittaker equation
\[ y'' - \left( \frac{1}{4} - \frac{\kappa}{x} + \frac{\mu^2 - \frac{1}{4}}{x^2} \right) y = 0 \]
such that $y \sim x^{\kappa} e^{-x/2}$ as $x \to +\infty$ (see [E1, Chapter 6]). Here $\kappa$ and $\mu$ are complex parameters. Note that $W_{\kappa,\mu} = W_{\kappa,-\mu}$.

10 About correlation functions of point processes living on a nondiscrete space, see [DVJ,L1,L2].
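We shall employ the Whittaker function for real $\kappa$ and real or pure imaginary $\mu$; then $W_{\kappa,\mu}$ is real. We introduce the functions
\[ \mathcal{P}_{\pm}(x) = \frac{(z z')^{1/4}}{(\Gamma(1 \pm z) \Gamma(1 \pm z') \, x)^{1/2}} \, W_{\frac{\pm(z + z') + 1}{2}, \frac{z - z'}{2}}(x), \qquad \mathcal{Q}_{\pm}(x) = \frac{(z z')^{3/4}}{(\Gamma(1 \pm z) \Gamma(1 \pm z') \, x)^{1/2}} \, W_{\frac{\pm(z + z') - 1}{2}, \frac{z - z'}{2}}(x). \tag{5.6} \]

Theorem 5.3. The correlation functions of the process $\widetilde{\mathcal{P}}_{z,z'}$ have the form
\[ \widetilde{\rho}_n^{(z,z')}(u_1, \dots, u_n) = \det\left[ \mathcal{K}(u_i, u_j) \right]_{i,j=1}^{n}, \qquad n = 1, 2, \dots; \quad u_1, \dots, u_n \in \mathbb{R}^*, \]
where the kernel $\mathcal{K}(u, v)$ is conveniently written in the block form
\[ \mathcal{K}(u, v) = \begin{cases} \mathcal{K}_{++}(u, v), & u, v > 0; \\ \mathcal{K}_{+-}(u, -v), & u > 0, \ v < 0; \\ \mathcal{K}_{-+}(-u, v), & u < 0, \ v > 0; \\ \mathcal{K}_{--}(-u, -v), & u, v < 0; \end{cases} \]
with
\[ \mathcal{K}_{++}(x, y) = \frac{\mathcal{P}_+(x) \mathcal{Q}_+(y) - \mathcal{Q}_+(x) \mathcal{P}_+(y)}{x - y}, \qquad \mathcal{K}_{+-}(x, y) = \frac{\mathcal{P}_+(x) \mathcal{P}_-(y) + \mathcal{Q}_+(x) \mathcal{Q}_-(y)}{x + y}, \]
\[ \mathcal{K}_{-+}(x, y) = -\frac{\mathcal{P}_-(x) \mathcal{P}_+(y) + \mathcal{Q}_-(x) \mathcal{Q}_+(y)}{x + y}, \qquad \mathcal{K}_{--}(x, y) = \frac{\mathcal{P}_-(x) \mathcal{Q}_-(y) - \mathcal{Q}_-(x) \mathcal{P}_-(y)}{x - y}. \]
The kernel $\mathcal{K}(u, v)$ is called the Whittaker kernel, see [P.IV, Th. 2.7, BO1, Th. III].$^{11}$

Clearly, the hypergeometric kernel (see Theorem 3.3) and the Whittaker kernel have the same structure. Theorem 5.2 suggests that the Whittaker kernel is the scaling limit of the hypergeometric one. In the next theorem we establish this fact by a direct computation.

Theorem 5.4. For the hypergeometric kernel $K$ given by Theorem 3.3 and the Whittaker kernel $\mathcal{K}$ given by Theorem 5.3 the following limit relation holds:
\[ \lim_{\xi \nearrow 1} \frac{1}{1 - \xi} \, K_{**}\left( \frac{u}{1 - \xi}, \frac{v}{1 - \xi} \right) = \mathcal{K}_{**}(u, v), \qquad u, v \in \mathbb{R}_+, \]
where the subscript $**$ stands for any of the four symbols $++$, $+-$, $-+$, $--$.

11 In those papers, the term “Whittaker kernel” concerned the block $\mathcal{K}_{++}$ while the kernel $\mathcal{K}$ was called the matrix Whittaker kernel.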
Proof. Take $x, y > 0$ and denote
\[ k = \frac{x}{1 - \xi}, \qquad l = \frac{y}{1 - \xi}. \]
Then $(1 - \xi) k \approx x$, $(1 - \xi) l \approx y$. Since
\[ \frac{1}{k - l} \approx \frac{1 - \xi}{x - y}, \qquad \frac{1}{k + l + 1} \approx \frac{1 - \xi}{x + y}, \]
it is enough to show that
\[ P_{\pm}(k) \approx \mathcal{P}_{\pm}(x), \qquad Q_{\pm}(k) \approx \mathcal{Q}_{\pm}(x). \tag{5.7} \]
We shall employ the following asymptotic relation which connects the hypergeometric function and the Whittaker function:
\[ \lim_{u \to +\infty} F\left( a, b; u; 1 - \frac{u}{x} \right) = x^{\frac{a + b - 1}{2}} \, e^{\frac{x}{2}} \, W_{\frac{-a - b + 1}{2}, \frac{a - b}{2}}(x), \qquad x > 0, \tag{5.8} \]
see [E1, 6.8(1)]. Note that $\frac{\xi}{\xi - 1} = 1 - \frac{1}{1 - \xi}$. Applying (5.8) we get the following limit relations for the hypergeometric functions entering (3.5) and (3.6):
\[ F\left( \mp z, \mp z'; k + 1; \frac{\xi}{\xi - 1} \right) \approx x^{\frac{\mp(z + z') - 1}{2}} \, e^{\frac{x}{2}} \, W_{\frac{\pm(z + z') + 1}{2}, \frac{z - z'}{2}}(x), \]
\[ \frac{F\left( 1 \mp z, 1 \mp z'; k + 2; \frac{\xi}{\xi - 1} \right)}{(1 - \xi)(k + 1)} \approx x^{\frac{\mp(z + z') - 1}{2}} \, e^{\frac{x}{2}} \, W_{\frac{\pm(z + z') - 1}{2}, \frac{z - z'}{2}}(x). \]
Next, the factor $(\psi_{\pm}(u))^{1/2}$ entering (3.7) behaves as follows ($\psi_{\pm}(u)$ was defined in (3.1)):
\[ (\psi_{\pm}(k))^{1/2} = \left( t^{1/2} \xi^{k + 1/2} \, \frac{(1 \pm z)_k (1 \pm z')_k}{k!\, k!} \, (1 - \xi)^{\pm(z + z')} \right)^{1/2} \approx \left( \frac{t^{1/2} \, e^{-x} \, x^{\pm(z + z')}}{\Gamma(1 \pm z) \Gamma(1 \pm z')} \right)^{1/2}. \]
Finally, from (3.7) we obtain
\[ P_{\pm}(k) \approx \frac{t^{1/4}}{(\Gamma(1 \pm z) \Gamma(1 \pm z') \, x)^{1/2}} \, W_{\frac{\pm(z + z') + 1}{2}, \frac{z - z'}{2}}(x) = \mathcal{P}_{\pm}(x), \]
\[ Q_{\pm}(k) \approx \frac{t^{3/4}}{(\Gamma(1 \pm z) \Gamma(1 \pm z') \, x)^{1/2}} \, W_{\frac{\pm(z + z') - 1}{2}, \frac{z - z'}{2}}(x) = \mathcal{Q}_{\pm}(x). \quad \square \]

Remark 5.5. As was demonstrated in Sect. 4, the “++”-block of the hypergeometric kernel turns into the Christoffel–Darboux kernel for Meixner polynomials when $z = N + \alpha$, $z' = N$, $N \in \mathbb{Z}_+$. It is well known that in the scaling limit as $\xi \to 1$, the Meixner polynomials turn into the Laguerre polynomials (see [KS,NSU]). This agrees with the fact that for $z = N + \alpha$, $z' = N$, the restriction of the process $\widetilde{\mathcal{P}}_{z,z'}$ to the positive semiaxis coincides with the $N$-point Laguerre ensemble, see [P.III, Remark 2.4]. Note that the shift by $N$ which we made to match $\mathcal{P}_{z,z',\xi}$ and the Meixner ensemble disappears after we take the limit.

Remark 5.6. A straightforward check shows that the scaling limit of the kernel $L(x, y)$ defined by (3.3) is the kernel $\mathcal{L}(x, y)$ of the operator $\mathcal{L} = \mathcal{K}(1 - \mathcal{K})^{-1}$, where $\mathcal{K}$ is the integral operator in $L^2(\mathbb{R}^*, dx)$ corresponding to the Whittaker kernel (the kernel $\mathcal{L}(x, y)$ was explicitly computed in [P.V, Theorem 2.4]).
6. Integrable Operators

In this section we shall show that the operator given by the Whittaker kernel belongs to the class of integrable operators as defined by Its, Izergin, Korepin and Slavnov [IIKS]. We shall also argue that the hypergeometric kernel might be considered as an example of a discrete kernel giving an “integrable operator”, see also [B1].

We shall follow [De] in our description of integrable operators. Let $\Sigma$ be an oriented contour in $\mathbb{C}$. We call an operator $V$ acting in $L^2(\Sigma, |d\zeta|)$ integrable if its kernel has the form
\[ V(\zeta, \zeta') = \frac{\sum_{j=1}^{N} f_j(\zeta) g_j(\zeta')}{\zeta - \zeta'}, \qquad \zeta, \zeta' \in \Sigma, \]
for some functions $f_j, g_j$, $j = 1, \dots, N$. We shall always assume that
\[ \sum_{j=1}^{N} f_j(\zeta) g_j(\zeta) = 0, \qquad \zeta \in \Sigma, \]
so that the kernel $V(\zeta, \zeta')$ is nonsingular (this assumption is not necessary for the general theory). The notion of an integrable operator was first introduced in [IIKS]. It turns out that for an integrable operator $V$ the operator $R = V(1 + V)^{-1}$ is also integrable.

Proposition 6.1 ([IIKS]). Let $V$ be an integrable operator as described above and $R = 1 - (1 + V)^{-1} = V(1 + V)^{-1}$. Then the kernel $R(\zeta, \zeta')$ has the form
\[ R(\zeta, \zeta') = \frac{\sum_{j=1}^{N} F_j(\zeta) G_j(\zeta')}{\zeta - \zeta'}, \qquad \zeta, \zeta' \in \Sigma, \]
where
\[ F_j = (1 + V)^{-1} f_j, \qquad G_j = (1 + V^t)^{-1} g_j, \qquad j = 1, \dots, N. \]
If $\sum_{j=1}^{N} f_j(\zeta) g_j(\zeta) = 0$ on $\Sigma$, then $\sum_{j=1}^{N} F_j(\zeta) G_j(\zeta) = 0$ on $\Sigma$ as well.

Proof. See [KBI, Ch. XIV, De]. □

It is not difficult to show that for integrable operators $V_1, V_2$, the product $V_1 V_2$ is also integrable. This fact and Proposition 6.1 imply that operators of the form $I + V$, where $V$ is integrable, form a group.

A remarkable fact is that the functions $F_j, G_j$ can be expressed via a suitable Riemann–Hilbert problem, see [IIKS,De] for details.

Now we pass to a much more special situation. Let $\Sigma = \mathbb{R}^*$. According to the splitting $\mathbb{R}^* = \mathbb{R}_+ \sqcup \mathbb{R}_-$ and further identification of $\mathbb{R}_-$ with a second copy of $\mathbb{R}_+$, we shall occasionally write the kernels of operators in $L^2(\mathbb{R}^*)$ in block form; then we shall put the symbol of the operator into square brackets.

Consider an integral operator $V$ on $\mathbb{R}^*$ whose kernel $V(x, y)$ has the following block form:
\[ [V](x, y) = \begin{pmatrix} 0 & \dfrac{h_+(x) h_-(y)}{x + y} \\ -\dfrac{h_-(x) h_+(y)}{x + y} & 0 \end{pmatrix}, \qquad x > 0, \ y > 0, \tag{6.1} \]
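for some functions $h_+(x)$ and $h_-(x)$ defined on the positive semiaxis. Then the operator $V$ is integrable. Indeed,
\[ V(x, y) = \frac{f_1(x) g_1(y) + f_2(x) g_2(y)}{x - y}, \qquad x, y \in \mathbb{R}^*, \tag{6.2} \]
where
\[ f_1(x) = \begin{cases} 0, & x > 0 \\ h_-(-x), & x < 0 \end{cases}, \qquad f_2(x) = \begin{cases} h_+(x), & x > 0 \\ 0, & x < 0 \end{cases}, \]
\[ g_1(x) = \begin{cases} h_+(x), & x > 0 \\ 0, & x < 0 \end{cases}, \qquad g_2(x) = \begin{cases} 0, & x > 0 \\ h_-(-x), & x < 0 \end{cases}. \]
Assume that there exist four functions $A_{\pm}(x), B_{\pm}(x)$ defined on the positive semiaxis such that
\[ \widehat{A}_{\mp} = \frac{B_{\pm}}{h_{\pm}^2}, \qquad \widehat{B}_{\mp} = 1 - \frac{A_{\pm}}{h_{\pm}^2}, \tag{6.3} \]
where
\[ \widehat{\varphi}(x) = \int_{y > 0} \frac{\varphi(y) \, dy}{x + y} \tag{6.4} \]
is the Stieltjes transform. Then Proposition 6.1 implies that the kernel of the operator $R = V(1 + V)^{-1}$ has the form
\[ R(x, y) = \frac{F_1(x) G_1(y) + F_2(x) G_2(y)}{x - y}, \qquad x, y \in \mathbb{R}^*, \]
with
\[ F_1(x) = \begin{cases} -\dfrac{B_+(x)}{h_+(x)}, & x > 0 \\[2mm] \dfrac{A_-(-x)}{h_-(-x)}, & x < 0 \end{cases}; \qquad F_2(x) = \begin{cases} \dfrac{A_+(x)}{h_+(x)}, & x > 0 \\[2mm] \dfrac{B_-(-x)}{h_-(-x)}, & x < 0 \end{cases}; \]
\[ G_1(x) = \begin{cases} \dfrac{A_+(x)}{h_+(x)}, & x > 0 \\[2mm] -\dfrac{B_-(-x)}{h_-(-x)}, & x < 0 \end{cases}; \qquad G_2(x) = \begin{cases} \dfrac{B_+(x)}{h_+(x)}, & x > 0 \\[2mm] \dfrac{A_-(-x)}{h_-(-x)}, & x < 0 \end{cases}. \]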
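In block form the kernel $R(x, y)$ can be written as follows:
\[ [R](x, y) = \begin{pmatrix} \dfrac{A_+(x) B_+(y) - B_+(x) A_+(y)}{h_+(x) h_+(y)(x - y)} & \dfrac{A_+(x) A_-(y) + B_+(x) B_-(y)}{h_+(x) h_-(y)(x + y)} \\[3mm] -\dfrac{A_-(x) A_+(y) + B_-(x) B_+(y)}{h_-(x) h_+(y)(x + y)} & \dfrac{A_-(x) B_-(y) - B_-(x) A_-(y)}{h_-(x) h_-(y)(x - y)} \end{pmatrix}, \qquad x > 0, \ y > 0. \]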
x>0 x0 x, where Tπ < −D >= Tπ ∩ TX < −D > and TX < −D >= HomOX (1X < D >, OX ), where 1X < D > denotes the sheaf of 1-forms with log poles along D. In particular, (d −1 )−1 (L0 ) ⊂ trAE maps to iD∗ AE|D , and L−1 ⊂ (d −1 )−1 (L0 ) is defined as the kernel. Then L−2 = OX . The product struture on trA•E is defined in [1], 2.1.1.2, and coincides with the Lie algebra structure on trA0E = AE,π . Since L−2 = trA−2 E , to see that the product structure stabilizes L• , one just has to see that L0 ⊂trA0E is a subalgebra, which −1 is obvious, and that L0 × L−1 →trA−1 E takes values in L , which is a consequence of Proposition 2.2. As in Sect. 1, we denote by AE/S the relative Atiyah algebra of E, with symbolic part TX/S and by AE,π Beilinson’s subalgebra of the global Atiyah algebra with symbolic part Tπ . If ι : F ⊂ E is a vector bundle, isomorphic to E away of D, then one has an injection of differential operators i
→ Diff(E, E) Diff(E, F ) −
(2.1)
induced by ι on the second argument, and an injection j
→ Diff(F, F ) Diff(E, F ) −
(2.2)
induced by ι on the first argument. One has Definition 2.1. A(E/S,F /S) := AE/S ∩i Diff(E, F ) ∼ = AF /S ∩j Diff(E, F ) Recall γE : trA−1 E → AE/S denotes the map coming from the filtration by the order of poles of OX×X (∗1) on trA−1 E . One has Proposition 2.2. −1 (AE/S,E(−D)/S ) ∼ γE−1 (AE/S,E(−D)/S ) ∼ = γE(−D) = L−1 .
Proof. One considers E(−D) E ◦ (21) + E E ◦ E E ◦ (−1) ◦ E E◦ E(−D) E ◦ E(−D) E (21) ⊕ / = E(−D) E ◦ (−1) E E ◦ (−1) E(−D) E ◦ (−1)
(2.3)
which, via the natural inclusion to E E ◦ (21) (2.4) E E ◦ (−1) is the inverse image γE−1 Diff(E, E(−D)) (here we abuse notations, still denoting by γE the map coming from the filtration), and via the map coming from the natural inclusion E(−D) E ◦ (D)(21) E(−D) E ◦ (21) → E(−D) E ◦ (−1) E(−D) E ◦ (D)(−1) and the identification with the first term of the filtration on
(2.5)
E(−D)E ◦ (D)(21) E(−D)E ◦ (D)(−1)
E E◦ E(−D) E ◦ (D) ∼ = E E ◦ (−1) E(−D) E ◦ (D)(−1) −1 Diff(E, E(−D)) . u t is the inverse image γE(−D)
(2.6)
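The filtration induced by the order of poles of $\mathcal{O}_{X\times X}(*D)$ induces the exact sequences
\[ 0 \to \mathcal{H}om(E, E(-D)) \to A_{(E/S,E(-D)/S)} \to T_{X/S}(-D) \to 0, \tag{2.7} \]
\[ 0 \to \mathcal{E}nd(E) \to A_{E/S} \to T_{X/S} \to 0, \tag{2.8} \]
\[ 0 \to \mathcal{E}nd(E) \to A_{E(-D)/S} \to T_{X/S} \to 0. \tag{2.9} \]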
Now, as one has an injection L• ⊂ trA•E with cokernel Q, and again by looking at the filtration by the order of poles on the sheaf in degree (-1), one obtains Q∼ = End(E)|D [1]
(2.10)
and Theorem 2.3. One has an exact sequence 0 → R 0 π∗ (End(E)|D ) → R 0 π∗ (L• ) → R 0 π∗ (trA•E ) → 0. On the other hand, one has an injection L• ⊂ trA•E(−D) ⊕ iD ∗ AE|D with cokernel P, and, as L• injects into trA•E(−D) , one has an exact sequence 0 → iD ∗ AE|D [0] → P → [trA•E(−D) /L• ] → 0,
(2.11)
where iD : D → X is the closed embedding. We see that the induced filtration on the sheaf in degree (-1) of [trA•E(−D) /L• ] has graded pieces (0, End(E|D ), TX/S |D ), whereas the filtration on the sheaf in degree (0) has graded pieces (0, Tπ /Tπ < −D >∼ = TX/S |D ). This last point comes from the obvious Lemma 2.4. {P ∈ Diff(E, E), P (E(−D)) ⊂ E(−D)} ∼ = {P ∈ Diff(E(−mD), E(−mD)), (P ) ∈ End(E) ⊗ T < −D >} for any m ∈ Z, where is the symbol map.
So Lemma 2.5. [trA•E(−D) /L• ] is quasi-isomorphic to End(E|D )[1]. The connecting morphism R −1 π∗ [trA•E(−D) /L• ] → R 0 π∗ (iD ∗ AE|D )[0] is just the
natural embedding π∗ (End(E|D )) → π∗ (iD ∗ AE|D ) with cokernel π∗ π |−1 D TS . If D is ∼ T irreducible, one has π∗ π|−1 T , and therefore = S D S Proposition 2.6. If D is irreducible, one has an exact sequence 0 → R 0 π∗ L• → R 0 π∗ [trA•E(−D) ] ⊕ R 0 π∗ [iD ∗ AE|D ] → TS → 0 and the image of R 0 π∗ L• is obtained from the direct sum by taking the pull back under the diagonal embedding TS → TS ⊕ TS . On the other hand, still assuming D irreducible, one has the exact sequence 0 → π∗ End(E|D ) → R 0 π∗ [iD∗ AE|D ] → TS ∼ = π∗ π |−1 D TS → 0
(2.12)
and the Atiyah algebra A det(π∗ E|D ) is the push out of R 0 π∗ [iD∗ AE|D ] by the trace map π∗ End(E|D ) → OS . Defining id⊕Tr (2.13) K := Ker OS ⊕ π∗ End(E|D ) −−−→ OS ∼ = π∗ End(E|D ), one thus obtains Theorem 2.7. If D is irreducible, one has an exact sequence 0 → K → R 0 π∗ L• → π∗ (trA•E(−D) ) + Adetπ∗ (E|D ) → 0. It can be easily shown that the embedding π∗ End(E|D ) ⊂ R 0 π∗ L• in Theorems 2.3 and 2.7 is the same embedding of a subsheaf of ideals. It finishes the proof of Theorem 1.1 when D is irreducible. In general, since D is étale over S, its irreducible components are disjoint, thus one proves Theorem 1.1 by adding one component at a time. References 1. Beilinson, A., Schechtman, V.: Determinant Bundles and Virasoro Algebras. Commun. Math. Phys. 118, 651–701 (1988) Communicated by A. Connes
Commun. Math. Phys. 211, 365 – 393 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Quantum Vertex Representations via Finite Groups and the McKay Correspondence Igor B. Frenkel1 , Naihuan Jing2,3 , Weiqiang Wang2,1 1 Department of Mathematics, Yale University, New Haven, CT 06520, USA 2 Department of Mathematics, North Carolina State University, Raleigh, NC 27695-8205, USA.
E-mail:
[email protected];
[email protected] 3 Mathematical Sciences Research Institute, 1000 Centennial Drive, Berkeley, CA 94720, USA
Received: 17 August 1999 / Accepted: 2 December 1999
Abstract: We establish a q-analog of our recent work on vertex representations and the McKay correspondence. For each finite group $\Gamma$ we construct a Fock space and associated vertex operators in terms of wreath products of $\Gamma \times \mathbb{C}^\times$ and the symmetric groups. An important special case is obtained when $\Gamma$ is a finite subgroup of $SU_2$, where our construction yields a group theoretic realization of the representations of the quantum affine and quantum toroidal algebras of ADE type.

1. Introduction

In our previous paper [FJW] (see [W,FJW] for historical remarks and motivations) we have shown that the basic representation of an affine Lie algebra $\hat{\mathfrak{g}}$ of ADE type can be constructed from a finite subgroup $\Gamma$ of $SU_2$ related to the Dynkin diagram of $\hat{\mathfrak{g}}$ via the McKay correspondence. In particular, we have recovered a well-known construction [FK,Se] of the basic representation of $\hat{\mathfrak{g}}$ from the root lattice $Q$ of the corresponding finite dimensional Lie algebra $\mathfrak{g}$. In fact our construction yields naturally the vertex representation of the toroidal Lie algebra $\hat{\hat{\mathfrak{g}}}$ which contains the affine Lie algebra as a distinguished subalgebra.

The main goal of the present paper is to q-deform our construction in [FJW]. Again as in the undeformed case we will naturally obtain the earlier construction [FJ] of the basic representation of the quantum affine algebra $U_q(\hat{\mathfrak{g}})$ from the root lattice $Q$ and its generalization to the quantum toroidal algebra $U_q(\hat{\hat{\mathfrak{g}}})$ [GKV] (also cf. [Sa,J3]). The q-deformation is achieved by replacing consistently the representation theory of $\Gamma$ by that of $\Gamma \times \mathbb{C}^\times$. The representation ring for $\mathbb{C}^\times$ is identified with the ring of Laurent polynomials $\mathbb{C}[q, q^{-1}]$ so that the formal variable $q$ corresponds to the natural one-dimensional representation of $\mathbb{C}^\times$. It turns out that rather complicated expressions for operators in the Drinfeld realization of the quantum affine algebra $U_q(\hat{\mathfrak{g}})$ and the quantum toroidal algebra $U_q(\hat{\hat{\mathfrak{g}}})$ follow instantly from the simple extra factor $\mathbb{C}^\times$.
The idea to use representations of C× to obtain a q-deformation of the basic representation was
mentioned in [Gr] and is widely used in geometric constructions of representations (see e.g. [CG]).

As in the previous paper [FJW] we give the construction of quantum vertex operators starting from an arbitrary finite group $\Gamma$ and a self-dual virtual character $\xi$ of $\Gamma \times \mathbb{C}^\times$. Using the restriction and induction functors in representation theory of wreath products of $\Gamma \times \mathbb{C}^\times$ with the symmetric group $S_n$ for all $n$ we construct two "halves" of quantum vertex operators corresponding to any irreducible character $\gamma$ of $\Gamma \times \mathbb{C}^\times$. Then, choosing an irreducible character of $\mathbb{C}^\times$, i.e. an integer power of $q$, we assemble both halves into one quantum vertex operator.

The special case when $\Gamma$ is a subgroup of $SU_2$ is important for the application to representation theory of $U_q(\hat{\hat{\mathfrak{g}}})$ and for relations [W] to the theory of Hilbert schemes of points on surfaces. To recover the basic representation of $U_q(\hat{\hat{\mathfrak{g}}})$ we choose
\[ \xi = \gamma_0 \otimes (q + q^{-1}) - \pi \otimes 1_{\mathbb{C}^\times}, \]
where $\gamma_0$ and $1_{\mathbb{C}^\times}$ are the trivial characters of $\Gamma$ and $\mathbb{C}^\times$ respectively, $q$ and $q^{-1}$ are the natural character of $\mathbb{C}^\times$ and its dual, and $\pi$ is the natural character of $\Gamma$ in $SU_2$. The fact that the quantum toroidal algebra is intrinsically present in our construction is an additional indication of its importance in representation theory of quantum affine algebras. Moreover, when $\Gamma$ is cyclic of order $r + 1$, $\pi \simeq \gamma \oplus \gamma^{-1}$, where $\gamma$ is the natural character of $\Gamma$, one can modify our virtual character $\xi$ with an extra parameter $p = q^k$, $k \in \mathbb{Z}$, by letting
\[ \xi = \gamma_0 \otimes (q + q^{-1}) - (\gamma \otimes p + \gamma^{-1} \otimes p^{-1}). \]
In the special case when $p = q^{\pm 1}$ the quantum vertex representation of the quantum toroidal algebra $U_q(\hat{\hat{\mathfrak{g}}})$ can be factored to the basic representation of the quantum affine algebra $U_q(\hat{\mathfrak{g}})$. This is a q-analog of the factorization in the undeformed case, which exists for an arbitrary simply-laced affine Lie algebra.
To obtain the basic representations of quantum toroidal and affine algebras we only need the quantum vertex operators corresponding to irreducible representations of $\Gamma$ and their negatives in the Grothendieck ring of this group. We attach two halves of quantum vertex operators using the simplest nontrivial representations of $\mathbb{C}^\times$, namely $q$ and $q^{-1}$. Each of the two choices, and only these two, yields the basic representations of $U_q(\hat{\mathfrak{g}})$ and $U_q(\hat{\hat{\mathfrak{g}}})$, in perfect correspondence with the construction in [FJ]. This choice of an irreducible character of $\mathbb{C}^\times$ is essentially the only freedom that exists in our construction of quantum vertex operators for the quantum affine and toroidal algebras and is fixed by comparison with the algebra relations. However, it raises the question of constructing a "natural" quantum vertex operator corresponding to any virtual character $\gamma$ of $\Gamma$. This question is closely related to the well-known problem of finding a q-deformation of vertex operator algebras associated to the basic representation of an affine Lie algebra.

This paper is organized in a way similar to [FJW]. In Sect. 2 we review the theory of wreath products of $\Gamma$ and extend it to $\Gamma \times \mathbb{C}^\times$. In Sect. 3 we define the weighted bilinear form on $\Gamma \times \mathbb{C}^\times$ and its wreath products. In Sect. 4 we introduce two distinguished q-deformed weight functions associated to subgroups of $SU_2$. In Sect. 5 we define the Heisenberg algebra associated to $\Gamma$ and the weighted bilinear form, and we construct its representation in a Fock space. In Sect. 6 we establish the isometry between the representation ring of wreath products of $\Gamma \times \mathbb{C}^\times$ and the Fock space representation of the Heisenberg algebra. In Sect. 7 we construct quantum vertex operators acting on the representation ring of the wreath products. In Sect. 8 we obtain the basic representations
of quantum toroidal algebras and quantum affine algebras from representation theory of wreath products for $\Gamma \times \mathbb{C}^\times$.

2. Wreath Products and Vertex Representations

2.1. The wreath product $\Gamma_n$. Let $\Gamma$ be a finite group and $n$ a non-negative integer. The wreath product $\Gamma_n$ is the semidirect product of the $n$th direct product $\Gamma^n = \Gamma \times \cdots \times \Gamma$ and the symmetric group $S_n$:
\[ \Gamma_n = \{(g, \sigma) \mid g = (g_1, \ldots, g_n) \in \Gamma^n,\ \sigma \in S_n\} \]
with the group multiplication
\[ (g, \sigma) \cdot (h, \tau) = (g\,\sigma(h), \sigma\tau), \]
where $S_n$ acts on $\Gamma^n$ by permuting the factors.

Let $\Gamma_*$ be the set of conjugacy classes of $\Gamma$ consisting of $c^0 = \{1\}, c^1, \ldots, c^r$ and $\Gamma^*$ be the set of $r + 1$ irreducible characters: $\gamma_0, \gamma_1, \ldots, \gamma_r$. Here we denote the trivial character of $\Gamma$ by $\gamma_0$. The order of the centralizer of an element in the conjugacy class $c$ is denoted by $\zeta_c$, so the order of the conjugacy class $c$ is $|c| = |\Gamma|/\zeta_c$, where $|\Gamma|$ is the order of $\Gamma$.

A partition $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_l)$ is a decomposition of $n = |\lambda| = \lambda_1 + \cdots + \lambda_l$ with nonnegative integers: $\lambda_1 \ge \cdots \ge \lambda_l \ge 1$, where $l = l(\lambda)$ is called the length of the partition $\lambda$ and the $\lambda_i$ are called the parts of $\lambda$. Another notation for $\lambda$ is $\lambda = (1^{m_1} 2^{m_2} \cdots)$ with $m_i$ being the multiplicity of parts equal to $i$ in $\lambda$. Denote by $\mathcal{P}$ the set of all partitions of integers and by $\mathcal{P}(S)$ the set of all partition-valued functions on a set $S$. The weight of a partition-valued function $\rho = (\rho(s))_{s \in S}$ is defined to be $\|\rho\| = \sum_{s \in S} |\rho(s)|$. We also denote by $\mathcal{P}_n$ (resp. $\mathcal{P}_n(S)$) the subset of $\mathcal{P}$ (resp. $\mathcal{P}(S)$) of partitions with weight $n$.

Just as the conjugacy classes of $S_n$ are parameterized by partitions, the conjugacy classes of $\Gamma_n$ are parameterized by partition-valued functions on $\Gamma_*$. Let $x = (g, \sigma) \in \Gamma_n$, where $g = (g_1, \ldots, g_n) \in \Gamma^n$ and $\sigma \in S_n$ is presented as a product of disjoint cycles. For each cycle $(i_1 i_2 \cdots i_k)$ of $\sigma$, we define the cycle-product element $g_{i_k} g_{i_{k-1}} \cdots g_{i_1} \in \Gamma$, which is determined up to conjugacy in $\Gamma$ by $g$ and the cycle. For any conjugacy class $c \in \Gamma_*$ and each integer $i \ge 1$, the number of $i$-cycles in $\sigma$ whose cycle-product lies in $c$ will be denoted by $m_i(c)$. This gives rise to a partition $\rho(c) = (1^{m_1(c)} 2^{m_2(c)} \ldots)$ for $c \in \Gamma_*$. Thus we obtain a partition-valued function $\rho = (\rho(c))_{c \in \Gamma_*} \in \mathcal{P}(\Gamma_*)$ such that $\|\rho\| = \sum_{i,c} i\, m_i(\rho(c)) = n$. This is called the type of the element $(g, \sigma)$.
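The definition of the type can be made concrete with a small sketch. The code below is only an illustration under a simplifying assumption that is not in the text: the group is taken to be cyclic, written additively as Z/m, so that every conjugacy class is a single element and a cycle-product is just a sum modulo m. The function names `cycles` and `element_type` are hypothetical.

```python
def cycles(perm):
    """Disjoint cycles of a permutation given as a list with i -> perm[i] (0-based)."""
    seen, out = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = perm[i]
        out.append(cyc)
    return out


def element_type(g, perm, m):
    """Type of (g, sigma) in the wreath product of Z/m with S_n.

    Assumption: the group is Z/m (abelian), so each class c is one residue
    and the cycle-product of a cycle is the sum of its g-entries mod m.
    Returns {class c: partition rho(c) as a decreasing list of cycle lengths}.
    """
    rho = {}
    for cyc in cycles(perm):
        c = sum(g[i] for i in cyc) % m
        rho.setdefault(c, []).append(len(cyc))
    return {c: sorted(parts, reverse=True) for c, parts in rho.items()}


# Example in (Z/2)_5: sigma has a 2-cycle (0 1) and a 3-cycle (2 3 4).
example = element_type([1, 0, 1, 1, 0], [1, 0, 3, 4, 2], 2)
weight = sum(sum(parts) for parts in example.values())
```

In this example the 2-cycle has cycle-product 1 and the 3-cycle has cycle-product 0, so the type assigns the partition (3) to the class of 0 and (2) to the class of 1, with total weight 5.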
It is known [M] that two elements in the same conjugacy class have the same type and there exists a one-to-one correspondence between the sets $(\Gamma_n)_*$ and $\mathcal{P}_n(\Gamma_*)$. We will freely say that $\rho$ is the type of the conjugacy class of $\Gamma_n$. Given a class $c$ we denote by $c^{-1}$ the class $\{x^{-1} \mid x \in c\}$. For each $\rho \in \mathcal{P}(\Gamma_*)$ we also associate the partition-valued function $\bar{\rho} = (\rho(c^{-1}))_{c \in \Gamma_*}$. Given a partition $\lambda = (1^{m_1} 2^{m_2} \ldots)$, we denote by
\[ z_\lambda = \prod_{i \ge 1} i^{m_i}\, m_i! \]
the order of the centralizer of an element of cycle type $\lambda$ in $S_{|\lambda|}$. The order of the centralizer of an element $x = (g, \sigma) \in \Gamma_n$ of type $\rho = (\rho(c))_{c \in \Gamma_*}$ is given by
\[ Z_\rho = \prod_{c \in \Gamma_*} z_{\rho(c)}\, \zeta_c^{\,l(\rho(c))}. \]
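The two centralizer formulas can be evaluated directly. The following is a minimal sketch; the function names and the dictionary encoding of a partition-valued function are illustrative choices, not notation from the paper.

```python
from collections import Counter
from math import factorial


def z_lambda(parts):
    """z_lambda = prod_i i^{m_i} * m_i!, for a partition given as a list of parts."""
    z = 1
    for i, m in Counter(parts).items():
        z *= i**m * factorial(m)
    return z


def Z_rho(rho, zeta):
    """Z_rho = prod_c z_{rho(c)} * zeta_c^{l(rho(c))}.

    rho:  {class c: list of parts of rho(c)}
    zeta: {class c: order of the centralizer of an element of c}
    """
    Z = 1
    for c, parts in rho.items():
        Z *= z_lambda(parts) * zeta[c] ** len(parts)
    return Z


# Class sizes n!/z_lambda in S_3 add up to |S_3| = 6.
s3_total = sum(factorial(3) // z_lambda(lam) for lam in ([3], [2, 1], [1, 1, 1]))

# Wreath product of Z/2 with S_5, type rho(0) = (3), rho(1) = (2).
Z = Z_rho({0: [3], 1: [2]}, {0: 2, 1: 2})
```

As a sanity check on the second formula, the group in the last example has order 2^5 * 5! = 3840, so the class of that type should contain 3840 / Z elements.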
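2.2. Grothendieck ring $R_{\Gamma \times \mathbb{C}^\times}$. Let $R_{\mathbb{Z}}(\Gamma)$ be the $\mathbb{Z}$-lattice generated by $\gamma_i$, $i = 0, \ldots, r$, and $R(\Gamma) = \mathbb{C} \otimes R_{\mathbb{Z}}(\Gamma)$ be the space of complex class functions on the group $\Gamma$. In our previous work on the McKay correspondence and vertex representations [W,FJW] we studied the Grothendieck ring $R_\Gamma = \bigoplus_{n \ge 0} R(\Gamma_n)$. In the quantum case we need to add the ring $R(\mathbb{C}^\times)$, the space of characters of $\mathbb{C}^\times = \{t \in \mathbb{C} \mid t \ne 0\}$. Let $q$ be the irreducible character of $\mathbb{C}^\times$ that sends $t$ to itself. Then $R(\mathbb{C}^\times)$ is spanned by irreducible multiplicative characters $q^n$, $n \in \mathbb{Z}$, where
\[ q^n(t) = t^n, \qquad t \in \mathbb{C}^\times. \]
Thus $R(\mathbb{C}^\times)$ is identified with the ring $\mathbb{C}[q, q^{-1}]$, and we have $R(\Gamma \times \mathbb{C}^\times) = R(\Gamma) \otimes R(\mathbb{C}^\times)$. An element of $R(\Gamma \times \mathbb{C}^\times)$ can be written as a finite sum:
\[ f = \sum_i f_i \otimes q^{n_i}, \qquad f_i \in R(\Gamma),\ n_i \in \mathbb{Z}. \]
We can also view $f$ as a function on $\Gamma$ with values in the ring of Laurent polynomials $\mathbb{C}[q, q^{-1}]$. In this case we will write $f^q$ to indicate the formal variable $q$; then $f^q(c) = \sum_i f_i(c) q^{n_i} \in \mathbb{C}[q, q^{-1}]$. As a function on $\Gamma \times \mathbb{C}^\times$, we have $f(c, t) = \sum_i f_i(c) t^{n_i}$. Denote by $R_{\Gamma \times \mathbb{C}^\times}$ the following direct sum:
\[ R_{\Gamma \times \mathbb{C}^\times} = \bigoplus_{n \ge 0} R(\Gamma_n \times \mathbb{C}^\times) \simeq R_\Gamma \otimes \mathbb{C}[q, q^{-1}]. \]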
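2.3. Hopf algebra structure on $R_{\Gamma \times \mathbb{C}^\times}$. The multiplication $m$ in $\mathbb{C}^\times$ and the diagonal map $\mathbb{C}^\times \xrightarrow{\;d\;} \mathbb{C}^\times \times \mathbb{C}^\times$ induce the Hopf algebra structure on $R(\mathbb{C}^\times)$:
\[ m_{\mathbb{C}^\times} : R(\mathbb{C}^\times) \otimes R(\mathbb{C}^\times) \xrightarrow{\;\cong\;} R(\mathbb{C}^\times \times \mathbb{C}^\times) \xrightarrow{\;d^*\;} R(\mathbb{C}^\times), \tag{2.1} \]
\[ \Delta_{\mathbb{C}^\times} : R(\mathbb{C}^\times) \xrightarrow{\;m^*\;} R(\mathbb{C}^\times \times \mathbb{C}^\times) \xrightarrow{\;\cong\;} R(\mathbb{C}^\times) \otimes R(\mathbb{C}^\times). \tag{2.2} \]
In terms of the basis $\{q^n\}$ we have
\[ q^i \cdot q^j = q^{i+j}, \qquad \Delta(q^k) = q^k \otimes q^k, \]
where we abbreviate $\Delta_{\mathbb{C}^\times}$ by $\Delta$ and follow the convention of writing $a \cdot b = m_{\mathbb{C}^\times}(a \otimes b)$. The antipode $S_{\mathbb{C}^\times}$ and the counit $\epsilon_{\mathbb{C}^\times}$ are given by
\[ S_{\mathbb{C}^\times}(q^n) = q^{-n}, \qquad \epsilon_{\mathbb{C}^\times}(q^n) = \delta_{n0}. \]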
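We extend the Hopf algebra structures on $R(\mathbb{C}^\times)$ and $R_\Gamma$ [Z] into a Hopf algebra structure on $R_{\Gamma \times \mathbb{C}^\times}$ using a standard procedure in Hopf algebras [A]. The multiplication and comultiplication are given by the respective composition of the following maps:
\[ m : R(\Gamma_n \times \mathbb{C}^\times) \otimes R(\Gamma_m \times \mathbb{C}^\times) \xrightarrow{\;\cong\;} R(\Gamma_n \times \mathbb{C}^\times \times \Gamma_m \times \mathbb{C}^\times) \xrightarrow{\;1 \otimes m_{\mathbb{C}^\times}\;} R(\Gamma_n \times \Gamma_m \times \mathbb{C}^\times) \xrightarrow{\;Ind \otimes 1\;} R(\Gamma_{n+m} \times \mathbb{C}^\times); \tag{2.3} \]
\[ \Delta : R(\Gamma_n \times \mathbb{C}^\times) \xrightarrow{\;Res \otimes 1\;} \bigoplus_{m=0}^{n} R(\Gamma_{n-m} \times \Gamma_m \times \mathbb{C}^\times) \xrightarrow{\;1 \otimes \Delta_{\mathbb{C}^\times}\;} \bigoplus_{m=0}^{n} R(\Gamma_{n-m} \times \Gamma_m \times \mathbb{C}^\times \times \mathbb{C}^\times) \xrightarrow{\;\cong\;} \bigoplus_{m=0}^{n} R(\Gamma_{n-m} \times \mathbb{C}^\times) \otimes R(\Gamma_m \times \mathbb{C}^\times), \tag{2.4} \]
where we have used the identification of $R(\mathbb{C}^\times \times \mathbb{C}^\times)$ with $R(\mathbb{C}^\times) \otimes R(\mathbb{C}^\times)$ in (2.1)-(2.2). Also $Ind : R(\Gamma_n \times \Gamma_m) \to R(\Gamma_{n+m})$ denotes the induction functor and $Res : R(\Gamma_n) \to R(\Gamma_{n-m} \times \Gamma_m)$ denotes the restriction functor. The antipode is given by
\[ S(f(g, t)) = f(g^{-1}, t^{-1}), \qquad g \in \Gamma,\ t \in \mathbb{C}^\times. \]
In particular, $S(\gamma)(c) = \gamma(c^{-1})$ for $\gamma \in \Gamma^*$. As we mentioned earlier, we may write $f \in R_{\Gamma \times \mathbb{C}^\times}$ as
\[ f^q(g) = \sum_i f_i(g) q^{n_i}. \]
Then $S(f^q)(g) = \sum_i f_i(g^{-1}) q^{-n_i}$. The counit $\epsilon$ is defined by
\[ \epsilon(R(\Gamma_n \times \mathbb{C}^\times)) = 0, \qquad \text{if } n \ne 0, \]
and $\epsilon$ on $R(\mathbb{C}^\times)$ is the counit of the Hopf algebra $R(\mathbb{C}^\times)$.

3. A Weighted Bilinear Form on $R(\Gamma_n \times \mathbb{C}^\times)$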
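3.1. A standard bilinear form on $R_{\Gamma \times \mathbb{C}^\times}$. Let $f, g \in R(\Gamma \times \mathbb{C}^\times)$ with $f = \sum_i f_i \otimes q^{n_i}$ and $g = \sum_i g_i \otimes q^{m_i}$. The $\mathbb{C}[q, q^{-1}]$-valued standard $\mathbb{C}$-bilinear form on $R(\Gamma \times \mathbb{C}^\times)$ is defined as
\[ \langle f, g \rangle^q_\Gamma = \sum_{i,j} \langle f_i, g_j \rangle_\Gamma\, q^{n_i - m_j} = \sum_{i,j} \sum_{c \in \Gamma_*} \zeta_c^{-1} f_i(c)\, g_j(c^{-1})\, q^{n_i - m_j}, \]
where we recall that $c^{-1}$ denotes the conjugacy class $\{x^{-1} \mid x \in c\}$ of $\Gamma$, and $\zeta_c$ is the order of the centralizer of an element of the class $c$ in $\Gamma$. Sometimes we will also view the bilinear form as a function of $t \in \mathbb{C}^\times$:
\[ \langle f, g \rangle^q_\Gamma(t) = \sum_{c \in \Gamma_*} \zeta_c^{-1} f(c, t)\, S(g(c, t)). \]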
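The following is a direct consequence of the orthogonality of irreducible characters of $\Gamma$:
\[ \langle \gamma_i \otimes q^k, \gamma_j \otimes q^l \rangle^q_\Gamma = \delta_{ij}\, q^{k-l}, \qquad \sum_{\gamma \in \Gamma^*} \gamma(c')\, S(\gamma)(c) = \delta_{c,c'}\, \zeta_c, \quad c, c' \in \Gamma_*. \tag{3.1} \]
Let $\langle\ ,\ \rangle^q_{\Gamma_n}$ be the $\mathbb{C}[q, q^{-1}]$-valued bilinear form on $R(\Gamma_n \times \mathbb{C}^\times)$. The $\mathbb{C}[q, q^{-1}]$-valued standard bilinear form in $R_{\Gamma \times \mathbb{C}^\times}$ is defined in terms of the bilinear form on $R(\Gamma_n \times \mathbb{C}^\times)$ as follows:
\[ \langle u, v \rangle^q = \sum_{n \ge 0} \langle u_n, v_n \rangle^q_{\Gamma_n}, \]
where $u = \sum_n u_n$ and $v = \sum_n v_n$ with $u_n, v_n \in R(\Gamma_n \times \mathbb{C}^\times)$.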
3.2. A weighted bilinear form on $R(\Gamma \times \mathbb{C}^\times)$. A class function $\xi \in R(\Gamma \times \mathbb{C}^\times)$ is called self-dual if for all $x \in \Gamma$, $t \in \mathbb{C}^\times$,
\[ \xi(x, t) = S(\xi(x, t)), \]
or equivalently $\xi^q(x) = \xi^{q^{-1}}(x^{-1})$. We fix a self-dual class function $\xi$. The tensor product of two representations $\gamma$ and $\beta$ in $R(\Gamma \times \mathbb{C}^\times)$ will be denoted by $\gamma * \beta$. Let $a_{ij} \in \mathbb{C}[q, q^{-1}]$ be the (virtual) multiplicity of $\gamma_j$ in $\xi * \gamma_i$, i.e.,
\[ \xi * \gamma_i = \sum_{j=0}^{r} a_{ij}\, \gamma_j. \tag{3.2} \]
We denote by $A_q$ the $(r + 1) \times (r + 1)$ matrix $(a_{ij})_{0 \le i,j \le r}$. Associated to $\xi$ we introduce the following weighted bilinear form
\[ \langle f, g \rangle^q_\xi = \langle \xi * f, g \rangle^q_\Gamma, \qquad f, g \in R(\Gamma \times \mathbb{C}^\times), \]
where we use the superscript $q$ to indicate the q-dependence. The superscript $q$ is often omitted if the q-variable in characters $f$ and $g$ is clear from the context. The explicit formula of the bilinear form is given as follows:
\[ \langle f, g \rangle^q_\xi = \frac{1}{|\Gamma|} \sum_{x \in \Gamma} \xi^q(x)\, f^q(x)\, g^{q^{-1}}(x^{-1}) = \sum_{c \in \Gamma_*} \zeta_c^{-1}\, \xi^q(c)\, f^q(c)\, g^{q^{-1}}(c^{-1}), \tag{3.3} \]
which is the average of the character $\xi * f * g$ over $\Gamma$. The self-duality of $\xi$ together with (3.3) implies that
\[ a_{ij} = \overline{a_{ji}}, \]
i.e. $A_q$ is a hermitian-like matrix with the bar action given by $\bar{q} = q^{-1}$. The orthogonality (3.1) implies that
\[ a_{ij} = \langle \gamma_i, \gamma_j \rangle^q_\xi. \tag{3.4} \]
Remark 3.1. If $\xi$ is the trivial character $\gamma_0$, then the weighted bilinear form becomes the standard one on $R(\Gamma \times \mathbb{C}^\times)$.
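3.3. A weighted bilinear form on $R(\Gamma_n \times \mathbb{C}^\times)$. Let $V$ be a $\Gamma \times \mathbb{C}^\times$-module which affords a character $\gamma$ in $R(\Gamma \times \mathbb{C}^\times)$. We can decompose $V$ as follows:
\[ V = \bigoplus_i V_i \otimes \mathbb{C}(k_i), \]
where $V_i$ is a (virtual) $\Gamma$-module in $R(\Gamma)$ and $\mathbb{C}(k_i)$ is the one-dimensional $\mathbb{C}^\times$-module afforded by the character $q^{k_i}$. The $n$th outer tensor product $V^{\otimes n}$ of $V$ can be regarded naturally as a representation of the wreath product $(\Gamma \times \mathbb{C}^\times)_n$ via permutation of the factors and the usual direct product action. More precisely, note that $\Gamma_n \times \mathbb{C}^\times$ can be viewed as a subgroup of $(\Gamma \times \mathbb{C}^\times)_n$ by the diagonal inclusion from $\mathbb{C}^\times$ to $(\mathbb{C}^\times)^n$:
\[ \Gamma_n \times \mathbb{C}^\times \longrightarrow (\Gamma^n \times (\mathbb{C}^\times)^n) \rtimes S_n = (\Gamma \times \mathbb{C}^\times)_n. \]
This provides a natural $\Gamma_n \times \mathbb{C}^\times$-module structure on $V^{\otimes n}$. We denote its character by $\eta_n(\gamma)$. Explicitly we have
\[ (g, \sigma, t).(v_1 \otimes \cdots \otimes v_n) = (g_1, t) v_{\sigma^{-1}(1)} \otimes \cdots \otimes (g_n, t) v_{\sigma^{-1}(n)}, \tag{3.5} \]
where $g = (g_1, \ldots, g_n) \in \Gamma^n$. Let $\varepsilon_n$ be the (1-dimensional) sign representation of $\Gamma_n$ so that $\Gamma^n$ acts trivially while letting $S_n$ act as a sign representation. We denote by $\varepsilon_n(\gamma) \in R(\Gamma_n \times \mathbb{C}^\times)$ the character of the tensor product of $\varepsilon_n \otimes 1$ and $V^{\otimes n}$. The weighted bilinear form on $R(\Gamma_n \times \mathbb{C}^\times)$ is now defined by
\[ \langle f, g \rangle^q_{\xi, \Gamma_n} = \langle \eta_n(\xi) * f, g \rangle^q_{\Gamma_n}, \qquad f, g \in R(\Gamma_n \times \mathbb{C}^\times). \]
We shall see in Corollary 6.4 that $\eta_n(\xi)$ is self-dual if the class function $\xi$ is invariant under the antipode $S$. In such a case the matrix of the bilinear form $\langle\ ,\ \rangle^q_\xi$ is equal to its adjoint (transpose and bar action). We can naturally extend $\eta_n$ to a map from $R(\Gamma) \otimes q^k$ to $R(\Gamma \times \mathbb{C}^\times)$ as in the classical case (cf. [W]). In particular, if $\beta$ and $\gamma$ are characters of representations $V$ and $W$ of $\Gamma$ respectively, then
\[ \eta_n(\beta \otimes q^k + \gamma \otimes q^l) = \sum_{m=0}^{n} Ind^{\Gamma_n \times \mathbb{C}^\times}_{\Gamma_{n-m} \times \mathbb{C}^\times \times \Gamma_m \times \mathbb{C}^\times} [\eta_{n-m}(\beta \otimes q^k) \otimes \eta_m(\gamma \otimes q^l)], \tag{3.6} \]
\[ \eta_n(\beta \otimes q^k - \gamma \otimes q^l) = \sum_{m=0}^{n} (-1)^m\, Ind^{\Gamma_n \times \mathbb{C}^\times}_{\Gamma_{n-m} \times \mathbb{C}^\times \times \Gamma_m \times \mathbb{C}^\times} [\eta_{n-m}(\beta \otimes q^k) \otimes \varepsilon_m(\gamma \otimes q^l)]. \tag{3.7} \]
On $R_{\Gamma \times \mathbb{C}^\times} = \bigoplus_n R(\Gamma_n \times \mathbb{C}^\times)$ the weighted bilinear form is given by
\[ \langle u, v \rangle^q_\xi = \sum_{n \ge 0} \langle u_n, v_n \rangle^q_{\xi, \Gamma_n}, \]
where $u = \sum_n u_n$ and $v = \sum_n v_n$ with $u_n, v_n \in R(\Gamma_n \times \mathbb{C}^\times)$. The bilinear form $\langle\ ,\ \rangle^q_\xi$ on $R_{\Gamma \times \mathbb{C}^\times}$ is $\mathbb{C}$-bilinear and takes values in $\mathbb{C}[q, q^{-1}]$. When $n = 1$, it reduces to the weighted bilinear form defined on $R(\Gamma \times \mathbb{C}^\times)$. We will often omit the superscript $q$ and use the notation $\langle\ ,\ \rangle_\xi$ for the weighted bilinear form on $R_{\Gamma \times \mathbb{C}^\times}$.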
4. Quantum McKay Weights

4.1. Quantum McKay correspondence. Let $d_i = \gamma_i(c^0)$ be the dimension of the irreducible representation of $\Gamma$ corresponding to the character $\gamma_i$. The following generalizes a result of McKay [Mc]:

Proposition 4.1. For each class $c \in \Gamma_*$ the column vector
\[ v(c) = (\gamma_0(c), \gamma_1(c), \ldots, \gamma_r(c))^t \]
is an eigenvector of the $(r + 1) \times (r + 1)$-matrix $A_q = (\langle \gamma_i, \gamma_j \rangle^q_\xi)$ with eigenvalue $\xi^q(c)$. In particular $(d_0, d_1, \ldots, d_r)$ is an eigenvector of $A_q$ with eigenvalue $\xi^q(c^0)$.

Proof. We compute directly that
\[ \sum_{k=0}^{r} \langle \gamma_i, \gamma_k \rangle^q_\xi\, \gamma_k(c) = \sum_{k} \sum_{c' \in \Gamma_*} \zeta_{c'}^{-1}\, \xi^q(c')\, \gamma_i(c')\, \gamma_k(c'^{-1})\, \gamma_k(c) \]
\[ = \sum_{c' \in \Gamma_*} \zeta_{c'}^{-1}\, \xi^q(c')\, \gamma_i(c') \sum_{k} \gamma_k(c'^{-1})\, \gamma_k(c) \]
\[ = \sum_{c' \in \Gamma_*} \zeta_{c'}^{-1}\, \xi^q(c')\, \gamma_i(c')\, \zeta_c\, \delta_{cc'} \]
\[ = \xi^q(c)\, \gamma_i(c). \qquad \square \]
Let $\pi$ be an irreducible faithful representation of $\Gamma$ of dimension $d$. For each integer $n$ we define the q-integer $[n]$, which can be viewed as a character of $\mathbb{C}^\times$, by
\[ [n] = \frac{q^n - q^{-n}}{q - q^{-1}} = q^{n-1} + q^{n-3} + \cdots + q^{-n+1}. \]
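The two expressions for the q-integer can be compared numerically. The sketch below stores a Laurent polynomial in q as a dictionary from exponents to coefficients; this encoding and the function names are illustrative assumptions, and the expansion is taken for n >= 1.

```python
def q_int(n):
    """[n] = q^{n-1} + q^{n-3} + ... + q^{-n+1} as {exponent: coefficient}, n >= 1."""
    return {n - 1 - 2 * j: 1 for j in range(n)}


def evaluate(poly, t):
    """Evaluate a Laurent polynomial {exponent: coefficient} at a nonzero number t."""
    return sum(coeff * t**exp for exp, coeff in poly.items())


def q_int_closed(n, t):
    """Closed form (t^n - t^{-n}) / (t - t^{-1}), valid for t != +1, -1."""
    return (t**n - t ** (-n)) / (t - 1 / t)


expanded = evaluate(q_int(3), 2.0)   # 4 + 1 + 1/4
closed = q_int_closed(3, 2.0)        # (8 - 1/8) / (2 - 1/2)
```

Evaluating at t = 1 returns n itself, and for t > 0 with t different from 1 the value exceeds n, which is the inequality used later in the positivity argument.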
We take the following special class function:
\[ \xi = \gamma_0 \otimes [d] - \pi \otimes 1_{\mathbb{C}^\times}, \tag{4.1} \]
where we have also used the symbol $\pi$ for the corresponding character, and $1_{\mathbb{C}^\times} = q^0$ is the trivial character of $\mathbb{C}^\times$.

Proposition 4.2. The weighted bilinear form associated to (4.1) is non-degenerate. If $\pi$ is an embedding of $\Gamma$ into $SU_d$ and $t \ne 1$ is a nonnegative real number, then the weighted bilinear form evaluated on $t$ is positive definite.

Proof. A simple fact of finite group theory says that $\langle f, f \rangle_\pi \le d \langle f, f \rangle$. Assume that $t \in \mathbb{R}_+$. Observe that $t^{d-1} + t^{d-3} + \cdots + t^{-d+1} \ge d$ and the equality holds if and only if $t = 1$.
Let $A_q = (\langle \gamma_i, \gamma_j \rangle^q_\xi) = (a_{ij})$. Note that for any faithful representation $\pi$ of $\Gamma$ we have
\[ \pi * \gamma_i = \sum_j c_{ij}\, \gamma_j, \qquad c_{ij} \in \mathbb{N}. \]
Then it follows that
\[ \langle \gamma_i, \gamma_j \rangle^q_\xi(t) = (t^{d-1} + t^{d-3} + \cdots + t^{-d+1}) \langle \gamma_i, \gamma_j \rangle - \langle \gamma_i, \gamma_j \rangle_\pi = [d](t)\, \delta_{ij} - c_{ij}, \]
that is, $A_q(t) = A_1 + ([d](t) - d) I$. According to Steinberg (see e.g. [FJW]), $A_1$ is positive semi-definite, which generalizes McKay's observation in the case of $d = 2$. This implies that the eigenvalues of $A_q(t)$ are $\ge [d](t) - d \ge 0$. Thus the matrix $A_q(t)$ is positive-definite when $t > 0$ and $t \ne 1$. $\square$

We remark that when $|t| = 1$ and $t$ is close to 1, the signature of $A_q(t)$ is $(-1, 1, \ldots, 1)$ due to $[d](t) \le d$.

Remark 4.3. The matrix $A_1$ is integral, and the entries of $A_q$ are the q-numbers of the corresponding entries in $A_1$ when $r \ge 2$.

4.2. Two quantum McKay weights. Let $\Gamma$ be a finite subgroup of $SU_2$. We introduce the first distinguished self-dual class function
\[ \xi = \gamma_0 \otimes (q + q^{-1}) - \pi \otimes 1_{\mathbb{C}^\times}, \]
where $\pi$ is the character of the embedding of $\Gamma$ in $SU_2$. The matrix of the weighted bilinear form $\langle\ ,\ \rangle_\xi$ (cf. (3.4)) has the following entries:
\[ a_{ij} = \begin{cases} q + q^{-1}, & \text{if } i = j, \\ -1, & \text{if } \langle \gamma_i, \gamma_j \rangle^1_\xi = -1, \\ -2, & \text{if } \langle \gamma_i, \gamma_j \rangle^1_\xi = -2 \text{ and } \Gamma = \mathbb{Z}/2\mathbb{Z}, \\ 0, & \text{otherwise.} \end{cases} \tag{4.2} \]
In particular when $q = 1$ the matrix $(a^1_{ij})$ coincides with the extended Cartan matrix of ADE type according to the five classes of finite subgroups of $SU_2$: the cyclic, binary dihedral, tetrahedral, octahedral, and icosahedral groups. McKay [Mc] gave a direct correspondence between a finite subgroup of $SU_2$ and the affine Dynkin diagram $D$ of ADE type. Each irreducible character $\gamma_i$ corresponds to a vertex of $D$, and the number of edges between $\gamma_i$ and $\gamma_j$ ($i \ne j$) is equal to $|\langle \gamma_i, \gamma_j \rangle^1_\xi|$, where $\langle \gamma_i, \gamma_j \rangle^1_\xi = a^1_{ij}$ are the entries of the matrix $A_1$ of the weighted bilinear form $\langle\ ,\ \rangle^1_\xi$. For this reason we will call our matrix $A_q = (a_{ij}) = (\langle \gamma_i, \gamma_j \rangle^q_\xi)$ the quantum Cartan matrix.

Let $\Gamma$ be the cyclic subgroup of $SU_2$ of order $r + 1$. We can introduce the second deformation parameter in the quantum Cartan matrix. Let $\gamma_i$ ($i = 0, \ldots, r$) be the full set of irreducible characters of $\Gamma$ such that $\gamma_i * \gamma_j = \gamma_{i+j \bmod r+1}$. The embedding of $\Gamma$ in $SU_2$ is given by $\pi = \gamma_1 + \gamma_r$. For $p = q^k \in R(\mathbb{C}^\times)$ we let
\[ \xi = \xi^{q,p} = \gamma_0 \otimes (q + q^{-1}) - (\gamma_1 \otimes p + \gamma_r \otimes p^{-1}). \]
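When $p = 1$ the second choice reduces to the first choice in type A. This class function is self-dual since $S(\gamma_i) = \gamma_{r+1-i}$, $i = 0, 1, \ldots, r$, and $S(q) = q^{-1}$, $S(p) = p^{-1}$. It is easy to see that
\[ a_{ij}(q, p) = \langle \gamma_i, \gamma_j \rangle^{q,p}_\xi = [2]\, \delta_{ij} - p\, \delta_{i+1,j} - p^{-1}\, \delta_{i-1,j}, \tag{4.3} \]
with indices taken modulo $r + 1$.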
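Thus the matrix of the weighted bilinear form $\langle\ ,\ \rangle_\xi$ (cf. (3.4)) has the following form:
\[ \begin{pmatrix} q + q^{-1} & -p & 0 & \cdots & -p^{-1} \\ -p^{-1} & q + q^{-1} & -p & \cdots & 0 \\ 0 & -p^{-1} & q + q^{-1} & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ -p & 0 & \cdots & -p^{-1} & q + q^{-1} \end{pmatrix}, \qquad \text{if } r \ge 2, \tag{4.4} \]
or
\[ \begin{pmatrix} q + q^{-1} & -p - p^{-1} \\ -p - p^{-1} & q + q^{-1} \end{pmatrix}, \qquad \text{if } r = 1. \tag{4.5} \]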
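The structure of this matrix can be checked numerically against Proposition 4.1, which says that the columns of the character table are eigenvectors. The sketch below is an illustration only: it substitutes numerical values for q and p (an assumption, since in the text they are formal variables), builds the matrix from (4.3) with indices modulo r + 1, and tests the vectors with entries ω^{ik} for a primitive (r + 1)th root of unity ω. The function names are hypothetical.

```python
import cmath


def qp_cartan(r, q, p):
    """Matrix with entries [2]*d(i,j) - p*d(i+1,j) - p^{-1}*d(i-1,j), indices mod r+1."""
    n = r + 1
    A = [[0j] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += q + 1 / q
        A[i][(i + 1) % n] -= p
        A[i][(i - 1) % n] -= 1 / p
    return A


def eigen_check(r, q, p, k):
    """Return (lam, residual) for the vector v_i = w^{ik}, w a primitive (r+1)th root of unity.

    lam is the candidate eigenvalue q + 1/q - w^k p - w^{-k}/p and
    residual is the largest entry of |A v - lam v|.
    """
    n = r + 1
    w = cmath.exp(2j * cmath.pi / n)
    A = qp_cartan(r, q, p)
    v = [w ** (i * k) for i in range(n)]
    lam = q + 1 / q - w**k * p - w ** (-k) / p
    residual = max(
        abs(sum(A[i][j] * v[j] for j in range(n)) - lam * v[i]) for i in range(n)
    )
    return lam, residual


residuals = [eigen_check(3, 2.0, 3.0, k)[1] for k in range(4)]
lam_degenerate, _ = eigen_check(3, 2.0, 2.0, 0)  # p = q, k = 0
```

Because the off-diagonal contributions are accumulated, the same function reproduces the 2 x 2 matrix (4.5) when r = 1. With p = q the eigenvalue for k = 0 vanishes, which is the degenerate case discussed next.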
Note that when $\Gamma = 1$, the matrix of the bilinear form $\langle\ ,\ \rangle^{q,p}_\xi$ is $q + q^{-1} - p - p^{-1}$, which is degenerate when $q = p^{\pm 1}$. We will call this matrix the $(q, p)$-Cartan matrix (of type A). The self-duality of $\xi^{q,p}$ transforms into the condition that the $(q, p)$-Cartan matrix is $*$-invariant, where the $*$ action is the composition of transpose and bar action. Namely, $a_{ij}(q, p) = a_{ji}(q^{-1}, p^{-1})$.

Proposition 4.4. If $p \ne q^{\pm 1}$, then the bilinear form $\langle\ ,\ \rangle^{q,p}_\xi$ is non-degenerate. If $p = q^{\pm 1}$, the bilinear form $\langle\ ,\ \rangle^{q,p}_\xi$ is degenerate of rank $r$.
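Proof. Let $A^{q,p} = (\langle \gamma_i, \gamma_j \rangle^{q,p}_\xi)$ be the matrix of the bilinear form $\langle\ ,\ \rangle^{q,p}_\xi$ and let $\omega$ be a primitive $(r + 1)$th root of unity. Then $\gamma_i(c_j) = \omega^{ij}$ and $\gamma_i * \gamma_j = \gamma_{i+j}$. From this and Proposition 4.1 we see that as a matrix over $\mathbb{C}[q, q^{-1}]$ the eigenvalues of $A^{q,p}$ are $q + q^{-1} - \omega^i p - \omega^{-i} p^{-1}$, $i = 0, \ldots, r$. The function $q + q^{-1} - \omega^i p - \omega^{-i} p^{-1} \in R(\mathbb{C}^\times)$ is non-zero except when $i = 0$ and $q = p^{\pm 1}$. $\square$

5. Quantum Heisenberg Algebras and $\Gamma_n$

5.1. Heisenberg algebra $\hat{\mathfrak{h}}_{\Gamma,\xi}$. Let $\hat{\mathfrak{h}}_{\Gamma,\xi}$ be the infinite dimensional Heisenberg algebra over $\mathbb{C}[q, q^{-1}]$, associated with $\Gamma$ and $\xi \in R(\Gamma \times \mathbb{C}^\times)$, with generators $a_m(c)$, $c \in \Gamma_*$, $m \in \mathbb{Z}$, and a central element $C$ subject to the following commutation relations:
\[ [a_m(c^{-1}), a_n(c')] = m\, \delta_{m,-n}\, \delta_{c,c'}\, \zeta_c\, \xi_{q^m}(c)\, C, \qquad c, c' \in \Gamma_*. \tag{5.1} \]
For $m \in \mathbb{Z}$, $\gamma \in \Gamma^*$ and $k \in \mathbb{Z}$ we define
\[ a_m(\gamma \otimes q^k) = \sum_{c \in \Gamma_*} \zeta_c^{-1}\, \gamma(c)\, a_m(c)\, q^{mk}, \]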
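and then extend it to $R(\Gamma \times \mathbb{C}^\times)$ linearly over $\mathbb{C}$. Thus we have for $\gamma \in R(\Gamma \times \mathbb{C}^\times)$
\[ a_m(\gamma) = \sum_{c \in \Gamma_*} \zeta_c^{-1}\, \gamma_{q^m}(c)\, a_m(c). \tag{5.2} \]
In particular we have $a_m(\gamma \otimes q^k) = a_m(\gamma) q^{mk}$. It follows immediately from the orthogonality (3.1) of the irreducible characters of $\Gamma$ that for each $c \in \Gamma_*$,
\[ a_m(c) = \sum_{\gamma \in \Gamma^*} S(\gamma(c))\, a_m(\gamma). \]
Note that this formula is also valid if the summation runs through $\Gamma^* \otimes q^k$ with a fixed $k$.

Proposition 5.1. The Heisenberg algebra $\hat{\mathfrak{h}}_{\Gamma,\xi}$ has a new basis given by $a_n(\gamma)$ and $C$ ($n \in \mathbb{Z}$, $\gamma \in \Gamma^*$) over $\mathbb{C}[q, q^{-1}]$ with the following relations:
\[ [a_m(\gamma), a_n(\gamma')] = m\, \delta_{m,-n}\, \langle \gamma, \gamma' \rangle^{q^m}_\xi\, C. \tag{5.3} \]
Proof. This is proved by a direct computation using Eqs. (5.1), (3.3) and (3.1):
\[ [a_m(\gamma), a_n(\gamma')] = \sum_{c,c' \in \Gamma_*} \zeta_c^{-1} \zeta_{c'}^{-1}\, \gamma(c)\, \gamma'(c')\, [a_m(c), a_n(c')] \]
\[ = m\, \delta_{m,-n} \sum_{c,c' \in \Gamma_*} \zeta_c^{-1} \zeta_{c'}^{-1}\, \gamma(c)\, \gamma'(c')\, \delta_{c^{-1},c'}\, \zeta_c\, \xi_{q^m}(c)\, C \]
\[ = m\, \delta_{m,-n} \sum_{c \in \Gamma_*} \zeta_c^{-1}\, \gamma(c)\, \gamma'(c^{-1})\, \xi_{q^m}(c)\, C \]
\[ = m\, \delta_{m,-n}\, \langle \gamma, \gamma' \rangle^{q^m}_\xi\, C. \qquad \square \]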
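5.2. Action of $\hat{\mathfrak{h}}_{\Gamma,\xi}$ on the space $S_{\Gamma \times \mathbb{C}^\times}$. Let $S_{\Gamma \times \mathbb{C}^\times}$ be the symmetric algebra generated by $a_{-n}(\gamma)$, $n \in \mathbb{N}$, $\gamma \in \Gamma^*$, over $\mathbb{C}[q, q^{-1}]$. We define $a_{-n}(\gamma \otimes q^k) = a_{-n}(\gamma) q^{-kn}$ and the natural degree operator on the space $S_{\Gamma \times \mathbb{C}^\times}$ by
\[ \deg(a_{-n}(\gamma \otimes q^k)) = n, \]
which makes $S_{\Gamma \times \mathbb{C}^\times}$ into a $\mathbb{Z}_+$-graded algebra.

The space $S_{\Gamma \times \mathbb{C}^\times}$ affords a natural realization of the Heisenberg algebra $\hat{\mathfrak{h}}_{\Gamma,\xi}$ with $C = 1$. Since $a_{-n}(\gamma \otimes q^k) = q^{-nk} a_{-n}(\gamma)$, it is enough to describe the action for $a_{-n}(\gamma)$. The central element $C$ acts as the identity operator. For $n > 0$, $a_{-n}(\gamma)$ act as multiplication operators on $S_{\Gamma \times \mathbb{C}^\times}$. The element $a_n(\gamma)$, $n \ge 0$, acts as a differential operator through contraction:
\[ a_n(\gamma).a_{-n_1}(\alpha_1) a_{-n_2}(\alpha_2) \ldots a_{-n_k}(\alpha_k) = \sum_{i=1}^{k} \langle \gamma, \alpha_i \rangle^{q^n}_\xi\, a_{-n_1}(\alpha_1) a_{-n_2}(\alpha_2) \ldots \check{a}_{-n_i}(\alpha_i) \ldots a_{-n_k}(\alpha_k). \]
Here $n_i > 0$, $\alpha_i \in R(\Gamma)$ for $i = 1, \ldots, k$, and $\check{a}_{-n_i}(\alpha_i)$ means that this factor is deleted. In this case $S_{\Gamma \times \mathbb{C}^\times}$ is an irreducible representation of $\hat{\mathfrak{h}}_{\Gamma,\xi}$ with the unit 1 as the highest weight vector.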
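5.3. The bilinear form on $S_{\Gamma \times \mathbb{C}^\times}$. As a $\hat{\mathfrak{h}}_{\Gamma,\xi}$-module, the space $S_{\Gamma \times \mathbb{C}^\times}$ admits a bilinear form $\langle\ ,\ \rangle'_\xi$ over $\mathbb{C}[q, q^{-1}]$ characterized by
\[ \langle 1, 1 \rangle'_\xi = 1, \qquad \langle a u, v \rangle'_\xi = \langle u, a^* v \rangle'_\xi, \qquad a \in \hat{\mathfrak{h}}_{\Gamma,\xi}, \tag{5.4} \]
with the adjoint map $*$ on $\hat{\mathfrak{h}}_{\Gamma,\xi}$ given by
\[ a_n(\gamma \otimes q^k)^* = a_{-n}(\gamma \otimes q^k), \qquad n \in \mathbb{Z}. \tag{5.5} \]
Note that the adjoint map $*$ is a $\mathbb{C}$-linear anti-homomorphism of $\hat{\mathfrak{h}}_{\Gamma,\xi}$, and $q^* = q$. We still use the same symbol $*$ to denote the hermitian-like dual, since it clearly generalizes the $*$-action on the deformed Cartan matrix (4.4).

For any partition $\lambda = (\lambda_1, \lambda_2, \ldots)$ and $\gamma \in \Gamma^*$, we define
\[ a_{-\lambda}(\gamma) = a_{-\lambda_1}(\gamma)\, a_{-\lambda_2}(\gamma) \ldots. \]
For $\rho = (\rho(\gamma))_{\gamma \in \Gamma^*} \in \mathcal{P}(\Gamma^*)$, we define
\[ a_{-\rho \otimes q^k} = q^{-k\|\rho\|} \prod_{\gamma \in \Gamma^*} a_{-\rho(\gamma)}(\gamma). \]
It is clear that for a fixed $k \in \mathbb{Z}$ the elements $a_{-\rho \otimes q^k}$, $\rho \in \mathcal{P}(\Gamma^*)$, form a basis of $S_{\Gamma \times \mathbb{C}^\times}$ over $\mathbb{C}[q, q^{-1}]$.

Given a partition $\lambda = (\lambda_1, \lambda_2, \ldots)$ and $c \in \Gamma_*$, we define
\[ a_{-\lambda}(c \otimes q^k) = q^{-k|\lambda|}\, a_{-\lambda_1}(c)\, a_{-\lambda_2}(c) \ldots. \]
For any $\rho = (\rho(c))_{c \in \Gamma_*} \in \mathcal{P}(\Gamma_*)$ and $k \in \mathbb{Z}$, we define
\[ a'_{-\rho \otimes q^k} = q^{-k\|\rho\|} \prod_{c \in \Gamma_*} a_{-\rho(c)}(c). \]
It follows from Proposition 5.1 that
\[ \langle a'_{-\rho \otimes q^k}, a'_{-\sigma \otimes q^l} \rangle'_\xi = \delta_{\rho,\sigma}\, q^{\|\rho\|(l-k)}\, Z_\rho \prod_{c \in \Gamma_*} \prod_{i \ge 1} \xi_{q^i}(c)^{m_i(\rho(c))}, \tag{5.6} \]
where $\rho, \sigma \in \mathcal{P}(\Gamma_*)$. Note that $S(a'_{-\rho \otimes q^k}) = a'_{-\bar{\rho} \otimes q^{-k}}$, where we recall that $\bar{\rho} \in \mathcal{P}(\Gamma_*)$ is the partition-valued function given by $c \mapsto \rho(c^{-1})$, $c \in \Gamma_*$.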
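6. The Characteristic Map as an Isometry

6.1. The characteristic map ch. Let $\Psi : \Gamma_n \to S_{\Gamma \times \mathbb{C}^\times}$ be the map defined by $\Psi(x) = a'_{-\rho}$ if $x \in \Gamma_n$ is of type $\rho$. We define a $\mathbb{C}$-linear map $ch : R_{\Gamma \times \mathbb{C}^\times} \to S_{\Gamma \times \mathbb{C}^\times}$ by letting
\[ ch(f) = \langle f, \Psi \rangle_{\Gamma_n} = \sum_{\rho \in \mathcal{P}(\Gamma_*)} Z_\rho^{-1}\, S(f(\rho))\, a'_{-\rho}, \tag{6.1} \]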
where $f(\rho) \in \mathbb{C}[q, q^{-1}]$ is the value of $f$ at the elements of type $\rho$. The map $ch$ is called the characteristic map. This generalizes the definition of the characteristic map in the classical setting (cf. [M,FJW]).

The space $S_{\Gamma \times \mathbb{C}^\times}$ can also be interpreted as follows. The element $a_{-n}(\gamma)$, $n > 0$, $\gamma \in \Gamma^*$, is identified as the $n$th power sum in a sequence of variables $y_\gamma = (y_{i\gamma})_{i \ge 1}$. By the commutativity among $a_{-n}(\gamma)$ ($\gamma \in \Gamma^*$, $n > 0$) and dimension counting it is clear that the space $S_{\Gamma \times \mathbb{C}^\times}$ is isomorphic with the space $\Lambda_\Gamma$ of symmetric functions indexed by $\Gamma^*$ tensored with $\mathbb{C}[q, q^{-1}]$ (cf. [M]).

Denote by $c_n$ ($c \in \Gamma_*$) the conjugacy class in $\Gamma_n$ of elements $(x, s) \in \Gamma_n$ such that $s$ is an $n$-cycle and the cycle-product of $x$ lies in $c$. Denote by $\sigma_n(c \otimes q^k)$ the class function on $\Gamma_n \times \mathbb{C}^\times$ which takes values $n \zeta_c t^{-nk}$ (i.e. the order of the centralizer of an element in the class $c_n$ times $t^{-nk}$) on elements in the class $c_n \times t$ and 0 elsewhere. For $\rho = \{i^{m_i(c)}\}_{i \ge 1, c \in \Gamma_*} \in \mathcal{P}_n(\Gamma_*)$ and $k \in \mathbb{Z}$,
\[ \sigma_{\rho \otimes q^k} = q^{-nk} \prod_{i \ge 1,\, c \in \Gamma_*} \sigma_i(c)^{m_i(c)} \]
is the class function on $\Gamma_n \times \mathbb{C}^\times$ which takes value $Z_\rho t^{-nk}$ on the conjugacy class of type $\rho \times t$ and 0 elsewhere. Given $\gamma \in \Gamma^*$ and $k \in \mathbb{Z}$, we denote by $\sigma_n(\gamma \otimes q^k)$ the class function on $\Gamma_n \times \mathbb{C}^\times$ which takes values $n \gamma(c) t^{-nk}$ on elements in the class $c_n \times t$ ($c \in \Gamma_*$) and 0 elsewhere.

Lemma 6.1. The map $ch$ sends $\sigma_{\rho \otimes q^k}$ to $a'_{-\rho \otimes q^k}$. In particular, it sends $\sigma_n(\gamma \otimes q^k)$ to $a_{-n}(\gamma \otimes q^k)$ in $S_{\Gamma \times \mathbb{C}^\times}$.
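Proof. This is verified by the definition of ch (6.1) and the character values of $\sigma_n$ defined above. □

Proposition 6.2. Given $\gamma \in \Gamma^*$, the character value of $\eta_n(\gamma \otimes q^k)$ on the conjugacy class $c_\rho$ of type $\rho = (\rho(c))_{c\in\Gamma_*}$ is given by
\[
\eta_n(\gamma \otimes q^k)(c_\rho) = \prod_{c\in\Gamma_*} \gamma(c)^{l(\rho(c))}\, q^{nk}. \tag{6.2}
\]
In particular, we have $\eta_n(\gamma \otimes q^k) = \eta_n(\gamma)\, q^{nk}$.

Proof. We first let $(g, \sigma)$ be an element of $\Gamma_n$ such that $\sigma$ is a cycle of length $n$, say $\sigma = (12 \cdots n)$. Let $\{e_i\}$ be a basis of $V$, and let $\gamma \otimes q^k$ be afforded by the action
\[
(h, t)e_j = \sum_i c_{ij}(h)\, t^k e_i,
\]
where $h \in \Gamma$. We then have
\[
\begin{aligned}
(g, \sigma, t).(e_{j_1} \otimes e_{j_2} \otimes \cdots \otimes e_{j_n})
&= (g_1, t)e_{j_n} \otimes (g_2, t)e_{j_1} \otimes \cdots \otimes (g_n, t)e_{j_{n-1}} \\
&= t^{kn} \sum_{i_1,\dots,i_n} c_{i_n j_n}(g_1)\, c_{i_1 j_1}(g_2) \cdots c_{i_{n-1} j_{n-1}}(g_n)\, e_{i_n} \otimes e_{i_1} \otimes \cdots \otimes e_{i_{n-1}}.
\end{aligned}
\]
It follows that
\[
\begin{aligned}
\eta_n(\gamma \otimes q^k)(c_\rho, t) &= \operatorname{trace}\,(g, \sigma, t) \\
&= t^{kn} \sum_{j_1,\dots,j_n} c_{j_1 j_n}(g_1)\, c_{j_2 j_1}(g_2) \cdots c_{j_n j_{n-1}}(g_n) \\
&= \operatorname{trace}\, t^{kn} a(g_n) a(g_{n-1}) \cdots a(g_1) \\
&= \operatorname{trace}\, t^{kn} a(g_n g_{n-1} \cdots g_1) = \gamma(c)\, q^{kn}(t).
\end{aligned}
\]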
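The trace computation above can be checked numerically. The following is a minimal sketch, not part of the original argument: it takes arbitrary integer matrices in place of the representing matrices $a(g_1), \dots, a(g_n)$, sets $t = 1$ so that the scalar $t^{kn}$ drops out, and compares the trace of the twisted action on $V^{\otimes n}$ with the trace of the ordered product. The function names `wreath_cycle_trace` and `ordered_product_trace` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product


def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def wreath_cycle_trace(gs):
    """Trace of (g, sigma) on the n-fold tensor power, sigma = (12...n).

    Convention (as in the proof): g e_j = sum_i c_ij e_i, so gs[s][i][j] = c_ij(g_{s+1}),
    and the action sends e_{j1} x ... x e_{jn} to g1 e_{jn} x g2 e_{j1} x ... x gn e_{j(n-1)}.
    """
    n, d = len(gs), len(gs[0])
    total = 0
    for js in product(range(d), repeat=n):
        # Diagonal coefficient: slot s holds g_{s+1} applied to e_{js[s-1]} (cyclically),
        # and we need its component along e_{js[s]}.
        term = 1
        for s in range(n):
            term *= gs[s][js[s]][js[s - 1]]
        total += term
    return total


def ordered_product_trace(gs):
    """Trace of a(g_n) a(g_{n-1}) ... a(g_1)."""
    prod = gs[-1]
    for g in reversed(gs[:-1]):
        prod = matmul(prod, g)
    return sum(prod[i][i] for i in range(len(prod)))


# Arbitrary test matrices (assumed data, not taken from the paper).
gs = [[[1, 2], [3, 4]], [[0, 1], [1, 1]], [[2, 0], [1, 3]]]
assert wreath_cycle_trace(gs) == ordered_product_trace(gs)
```

The check uses only multilinearity, so it holds for any matrices, not only for those coming from a group representation.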
Given $x \times y \in \Gamma_n$, where $x \in \Gamma_r$ and $y \in \Gamma_{n-r}$, by (3.5) we clearly have
\[
\eta_n(\gamma \otimes q^k)(x \times y, t) = \eta_r(\gamma \otimes q^k)(x, t)\, \eta_{n-r}(\gamma \otimes q^k)(y, t).
\]
This immediately implies the formula. □

A similar argument gives that
\[
\varepsilon_n(\gamma \otimes q^k)(x, t) = (-1)^n \prod_{c\in\Gamma_*} (-\gamma(c))^{l(\rho(c))}\, t^{nk}, \tag{6.3}
\]
where $x$ is any element in the conjugacy class of type $\rho = (\rho(c))_{c\in\Gamma_*}$. Formula (6.2) is equivalent to the following:
\[
\eta_n(\gamma \otimes q^k)(c_\rho, t) = \prod_{c\in\Gamma_*} \prod_{i\ge 1} (\gamma \otimes q^k)(c, t^i)^{m_i(\rho(c))}. \tag{6.4}
\]
The following result allows us to extend the map from $\gamma \in \Gamma^*$ to $R(\Gamma_n)$.

Proposition 6.3. For any $\gamma \in R(\Gamma)$, we have
\[
\sum_{n\ge 0} \mathrm{ch}(\eta_n(\gamma \otimes q^k))\, z^n = \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, a_{-n}(\gamma)(q^{-k}z)^n \Big), \tag{6.5}
\]
\[
\sum_{n\ge 0} \mathrm{ch}(\varepsilon_n(\gamma \otimes q^k))\, z^n = \exp\Big( \sum_{n\ge 1} (-1)^{n-1} \frac{1}{n}\, a_{-n}(\gamma)(q^{-k}z)^n \Big). \tag{6.6}
\]
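Proof. It follows from the definition of ch (6.1) and (6.4) that
\[
\begin{aligned}
\sum_{n\ge 0} \mathrm{ch}(\eta_n(\gamma \otimes q^k))\, z^n
&= \sum_\rho Z_\rho^{-1} \prod_{c\in\Gamma_*} \prod_{i\ge 1} S\big(\gamma_{q^{ik}}(c)^{m_i(\rho(c))}\big)\, a_{-\rho(c)}\, z^{\|\rho\|} q^{-\|\rho\|} \\
&= \sum_\rho Z_\rho^{-1} \prod_{c\in\Gamma_*} \gamma(c)^{l(\rho(c))}\, a_{-\rho(c)}\, (q^{-k}z)^{\|\rho\|} \\
&= \prod_{c\in\Gamma_*} \sum_\lambda (\zeta_c^{-1}\gamma(c))^{l(\lambda)}\, z_\lambda^{-1}\, a_{-\lambda}(c)\, (q^{-k}z)^{|\lambda|} \\
&= \exp\Big( \sum_{n\ge 1} \frac{1}{n} \sum_{c\in\Gamma_*} \zeta_c^{-1}\gamma(c)\, a_{-n}(c)\, (q^{-k}z)^n \Big) \\
&= \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, a_{-n}(\gamma)(q^{-k}z)^n \Big).
\end{aligned}
\]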
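The passage from a sum over partitions to an exponential in the computation above rests on the classical identity $\exp\big(\sum_{n\ge1} p_n z^n/n\big) = \sum_\lambda z_\lambda^{-1} p_\lambda z^{|\lambda|}$ for commuting variables $p_n$ (cf. [M]). The sketch below checks this identity degree by degree with exact rational arithmetic; the numerical values assigned to the $p_n$ are arbitrary and serve only as a test, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def z_lambda(lam):
    """z_lambda = prod_i i^{m_i} * m_i!, where m_i is the multiplicity of i in lam."""
    z = 1
    for i in set(lam):
        m = lam.count(i)
        z *= i ** m * factorial(m)
    return z


def exp_series(p, N):
    """Coefficients h_0..h_N of exp(sum_{n>=1} p[n] z^n / n).

    Uses the recurrence n*h_n = sum_{k=1}^{n} p[k]*h_{n-k}, obtained by
    differentiating H(z) = exp(P(z)).
    """
    h = [Fraction(1)]
    for n in range(1, N + 1):
        h.append(sum(p[k] * h[n - k] for k in range(1, n + 1)) / n)
    return h


def partition_sum(p, n):
    """sum over partitions lam of n of p_lam / z_lam, with p_lam = prod of p[part]."""
    total = Fraction(0)
    for lam in partitions(n):
        term = Fraction(1, z_lambda(lam))
        for part in lam:
            term *= p[part]
        total += term
    return total


N = 6
p = {n: Fraction(2 * n + 1, n + 3) for n in range(1, N + 1)}  # arbitrary test values
h = exp_series(p, N)
assert all(h[n] == partition_sum(p, n) for n in range(N + 1))
```

In the proof the role of $p_n$ is played by $\zeta_c^{-1}\gamma(c)\,a_{-n}(c)$ for each class $c$, with $z$ replaced by $q^{-k}z$.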
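Similarly we can prove (6.6) using the following identity:
\[
\begin{aligned}
\varepsilon_n(\gamma \otimes q^k)(x) &= (-1)^n \prod_{c\in\Gamma_*} \prod_{i\ge 1} (-\gamma_{q^{ik}}(c))^{m_i(\rho(c))} \\
&= (-q^k)^n \prod_{c\in\Gamma_*} \prod_{i\ge 1} (-\gamma(c))^{m_i(\rho(c))} \\
&= \varepsilon_n(\gamma)(x)\, q^{nk}.
\end{aligned}
\]
The same argument as in the classical case (cf. [FJW]), using (3.6) and (3.7), shows that the proposition holds for linear combinations of simple characters such as $\gamma \otimes q^k - \beta \otimes q^k$, and thus it is true for any element $\gamma \otimes q^k$, where $\gamma \in R(\Gamma)$. □

Comparing components we obtain
\[
\mathrm{ch}(\eta_n(\gamma \otimes q^k)) = \sum_\lambda \frac{q^{-nk}}{z_\lambda}\, a_{-\lambda}(\gamma), \qquad
\mathrm{ch}(\varepsilon_n(\gamma \otimes q^k)) = \sum_\lambda \frac{q^{-nk}}{z_\lambda}\, (-1)^{|\lambda| - l(\lambda)}\, a_{-\lambda}(\gamma),
\]
where the sum runs over all partitions $\lambda$ of $n$.

Corollary 6.4. The formula (6.4) remains valid when $\gamma \otimes q^k$ is replaced by any element $\xi \in R(\Gamma \times \mathbb{C}^\times)$. In particular $\eta_n(\xi)$ is self-dual provided that $\xi$ is invariant under the antipode $S$.

6.2. Isometry between $R_{\Gamma\times\mathbb{C}^\times}$ and $S_{\Gamma\times\mathbb{C}^\times}$. The symmetric algebra $S_{\Gamma\times\mathbb{C}^\times} = S_\Gamma \otimes \mathbb{C}[q, q^{-1}]$ has the following Hopf algebra structure over $\mathbb{C}$. The multiplication is the usual one, and the comultiplication is given by
\[
\Delta(q^k) = q^k \otimes q^k, \qquad
\Delta(a_n(\gamma \otimes q^k)) = a_n(\gamma \otimes q^k) \otimes q^{nk} + q^{nk} \otimes a_n(\gamma \otimes q^k),
\]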
where $\gamma \in \Gamma^*$. The last formula is equivalent to the following:
\[
\Delta(a_n(c \otimes q^k)) = a_n(c \otimes q^k) \otimes q^{nk} + q^{nk} \otimes a_n(c \otimes q^k), \tag{6.7}
\]
where $c \in \Gamma_*$. The antipode is given by $S(q^k) = q^{-k}$, $S(a_n(\gamma \otimes q^k)) = -a_n(\gamma \otimes q^{-k})$. The antipode commutes with the adjoint (dual) map $*$:
\[
*^2 = S^2 = \mathrm{Id}, \qquad S* = *S. \tag{6.8}
\]
Recall that we have defined a Hopf algebra structure on $R_{\Gamma\times\mathbb{C}^\times}$ in Sect. 2.

Proposition 6.5. The characteristic map $\mathrm{ch} : R_{\Gamma\times\mathbb{C}^\times} \to S_{\Gamma\times\mathbb{C}^\times}$ is an isomorphism of Hopf algebras.

Proof. It follows immediately from the definition of the comultiplication in both Hopf algebras (cf. (2.4) and (6.7)). □

Remark 6.6. The comultiplication (6.7) is in fact induced from that of the classical case in [FJW] and only works for $C = 1$.

Remark 6.7. There is another coproduct, called the Drinfeld comultiplication $\Delta_D$, on the algebra $S_{\Gamma\times\mathbb{C}^\times}$ adjoined by a central element $q^c$. The formula on $S_{\Gamma\times\mathbb{C}^\times}$ at level $c$ is as follows [J2]:
\[
\Delta_D(a_n(\gamma)) = a_n(\gamma) \otimes q^{|n|c/2} + q^{-|n|c/2} \otimes a_n(\gamma). \tag{6.9}
\]
We do not know a conceptual interpretation of the Drinfeld comultiplication in $R_{\Gamma\times\mathbb{C}^\times}$.

Recall that we have defined a bilinear form $\langle\ ,\ \rangle_\xi$ on $R_{\Gamma\times\mathbb{C}^\times}$ and a bilinear form on $S_{\Gamma\times\mathbb{C}^\times}$ denoted by $\langle\ ,\ \rangle'_\xi$, where $\xi$ is a self-dual class function. The following lemma is immediate from our definition of $\langle\ ,\ \rangle'_\xi$ and the comultiplication $\Delta$.

Lemma 6.8. The bilinear form $\langle\ ,\ \rangle'_\xi$ on $S_{\Gamma\times\mathbb{C}^\times}$ can be characterized by the following two properties:

1) $\langle a_{-n}(\beta \otimes q^k), a_{-m}(\gamma \otimes q^l) \rangle'_\xi = n\,\delta_{n,m}\, q^{n(l-k)} \langle \beta, \gamma \rangle_\xi$, for $\beta, \gamma \in \Gamma^*$, $k, l \in \mathbb{Z}$.

2) $\langle fg, h \rangle'_\xi = \langle f \otimes g, \Delta h \rangle'_\xi$, where $f, g, h \in S_{\Gamma\times\mathbb{C}^\times}$, and the bilinear form on $S_{\Gamma\times\mathbb{C}^\times} \otimes S_{\Gamma\times\mathbb{C}^\times}$ is induced from $\langle\ ,\ \rangle'_\xi$ on $S_{\Gamma\times\mathbb{C}^\times}$.

Theorem 6.9. The characteristic map is an isometry from the space $(R_{\Gamma\times\mathbb{C}^\times}, \langle\ ,\ \rangle_\xi)$ to the space $(S_{\Gamma\times\mathbb{C}^\times}, \langle\ ,\ \rangle'_\xi)$.

Proof. By Corollary 6.4, the character value of $\eta_n(\xi)$ at an element $x$ of type $\rho$ is
\[
\eta_n(\xi)(x) = \prod_{c\in\Gamma_*} \prod_{i\ge 1} \xi_{q^i}(c)^{m_i(\rho(c))}.
\]
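Thus it follows from the definition that
\[
\begin{aligned}
\langle \sigma_{\rho\otimes q^k}, \sigma_{\rho'\otimes q^l} \rangle_\xi
&= \sum_{\mu\in\mathcal{P}_n(\Gamma_*)} Z_\mu^{-1}\, q^{n(l-k)}\, \xi_q(c_\mu)\, \sigma_\rho(c_\mu)\, \sigma_{\rho'}(c_\mu) \\
&= \delta_{\rho,\rho'}\, Z_\rho^{-1}\, q^{n(l-k)}\, \xi(c_\rho)\, Z_\rho Z_\rho \\
&= \delta_{\rho,\rho'}\, Z_\rho\, q^{n(l-k)} \prod_{c\in\Gamma_*} \prod_{i\ge 1} \xi_{q^i}(c)^{m_i(\rho(c))}.
\end{aligned}
\]
By Lemma 6.1 and the formula (5.6), we see that
\[
\langle \sigma_{\rho\otimes q^k}, \sigma_{\rho'\otimes q^l} \rangle_\xi
= \langle a_{-\rho\otimes q^k}, a_{-\rho'\otimes q^l} \rangle'_\xi
= \langle \mathrm{ch}(\sigma_{\rho\otimes q^k}), \mathrm{ch}(\sigma_{\rho'\otimes q^l}) \rangle'_\xi.
\]
Since the $\sigma_{\rho\otimes q^k}$, $\rho \in \mathcal{P}(\Gamma_*)$, form a $\mathbb{C}$-basis of $R_{\Gamma\times\mathbb{C}^\times}$, we have shown that $\mathrm{ch} : R_{\Gamma\times\mathbb{C}^\times} \to S_{\Gamma\times\mathbb{C}^\times}$ is an isometry. □

From now on we will not distinguish the bilinear form $\langle\ ,\ \rangle_\xi$ on $R_{\Gamma\times\mathbb{C}^\times}$ from the bilinear form $\langle\ ,\ \rangle'_\xi$ on $S_{\Gamma\times\mathbb{C}^\times}$.

7. Quantum Vertex operators and $R_{\Gamma\times\mathbb{C}^\times}$

7.1. Vertex Operators and Heisenberg algebras in $F_{\Gamma\times\mathbb{C}^\times}$. Let $Q$ be an integral lattice with basis $\alpha_i$, $i = 0, 1, \dots, r$, endowed with a symmetric bilinear form. As in the case of $q = 1$ (cf. [FK]), we fix a 2-cocycle $\epsilon : Q \times Q \to \mathbb{C}^\times$ such that
\[
\epsilon(\alpha, \beta) = \epsilon(\beta, \alpha)\, (-1)^{\langle\alpha,\beta\rangle + \langle\alpha,\alpha\rangle\langle\beta,\beta\rangle}.
\]
We remark that the cocycle $\epsilon$ can be constructed directly by prescribing the values of $\epsilon(\alpha_i, \alpha_j) \in \{\pm 1\}$ ($i < j$).

Let $\xi$ be a self-dual virtual character in $R_{\Gamma\times\mathbb{C}^\times}$. Recall that the lattice $R_{\mathbb{Z}}(\Gamma)$ is a $\mathbb{Z}$-lattice under the bilinear form $\langle\ ,\ \rangle^1_\xi$; here the superscript 1 means $q = 1$. For our purpose we will always associate a 2-cocycle $\epsilon$ as in the previous subsection to the integral lattice $(R_{\mathbb{Z}}(\Gamma), \langle\ ,\ \rangle^1_\xi)$ (and its sublattices).

Let $\mathbb{C}[R_{\mathbb{Z}}(\Gamma)]$ be the group algebra generated by $e^\gamma$, $\gamma \in R_{\mathbb{Z}}(\Gamma)$. We introduce two special operators acting on $\mathbb{C}[R_{\mathbb{Z}}(\Gamma)]$: an ($\epsilon$-twisted) multiplication operator $e^\alpha$ defined by
\[
e^\alpha . e^\beta = \epsilon(\alpha, \beta)\, e^{\alpha+\beta}, \qquad \alpha, \beta \in R_{\mathbb{Z}}(\Gamma),
\]
and a differentiation operator $\partial_\alpha$ given by
\[
\partial_\alpha e^\beta = \langle \alpha, \beta \rangle^1_\xi\, e^\beta, \qquad \alpha, \beta \in R_{\mathbb{Z}}(\Gamma).
\]
These two operators are then extended linearly to the space
\[
F_{\Gamma\times\mathbb{C}^\times} = R_{\Gamma\times\mathbb{C}^\times} \otimes \mathbb{C}[R_{\mathbb{Z}}(\Gamma)] \tag{7.1}
\]
by letting them act on the $R_{\Gamma\times\mathbb{C}^\times}$ part trivially.

We define the Hopf algebra structure on $\mathbb{C}[R_{\mathbb{Z}}(\Gamma)]$ and extend the Hopf algebra structure from $R_{\Gamma\times\mathbb{C}^\times}$ to $F_{\Gamma\times\mathbb{C}^\times}$ as follows:
\[
\Delta(e^\alpha) = e^\alpha \otimes e^\alpha, \qquad S(e^\alpha) = e^{-\alpha}.
\]
The bilinear form $\langle\ ,\ \rangle^q_\xi$ on $R_{\Gamma\times\mathbb{C}^\times}$ is extended to $F_{\Gamma\times\mathbb{C}^\times}$ by $\langle e^\alpha, e^\beta \rangle_\xi = \delta_{\alpha,\beta}$. With respect to this extended bilinear form we have the $*$-action (adjoint action) on the operators $e^\alpha$ and $\partial_\alpha$:
\[
(e^\alpha)^* = e^{-\alpha}, \qquad (z^{\partial_\alpha})^* = z^{-\partial_\alpha}. \tag{7.2}
\]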
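For each $k \in \mathbb{Z}$, we introduce the group theoretic operators $H_{\pm n}(\gamma \otimes q^k)$, $E_{\pm n}(\gamma \otimes q^k)$, $\gamma \in R(\Gamma)$, $n > 0$, as the following compositions of maps:
\[
H_{-n}(\gamma \otimes q^k) : R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \eta_n(\gamma\otimes q^k)\otimes\ } R(\Gamma_n \times \mathbb{C}^\times) \otimes R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \mathrm{Ind}\otimes m_{\mathbb{C}^\times}\ } R(\Gamma_{n+m} \times \mathbb{C}^\times),
\]
\[
E_{-n}(\gamma \otimes q^k) : R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \varepsilon_n(\gamma\otimes q^k)\otimes\ } R(\Gamma_n \times \mathbb{C}^\times) \otimes R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \mathrm{Ind}\otimes m_{\mathbb{C}^\times}\ } R(\Gamma_{n+m} \times \mathbb{C}^\times),
\]
\[
E_n(\gamma \otimes q^k) : R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \mathrm{Res}\ } R(\Gamma_n) \otimes R(\Gamma_{m-n} \times \mathbb{C}^\times)
\xrightarrow{\ \langle \varepsilon_n(\gamma\otimes q^k),\,\cdot\,\rangle_\xi\ } R(\Gamma_{m-n} \times \mathbb{C}^\times),
\]
\[
H_n(\gamma \otimes q^k) : R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \mathrm{Res}\ } R(\Gamma_n) \otimes R(\Gamma_{m-n} \times \mathbb{C}^\times)
\xrightarrow{\ \langle \eta_n(\gamma\otimes q^k),\,\cdot\,\rangle_\xi\ } R(\Gamma_{m-n} \times \mathbb{C}^\times),
\]
where Res and Ind are the restriction and induction functors in $R_\Gamma = \bigoplus_{n\ge 0} R(\Gamma_n)$.

We introduce their generating functions in a formal variable $z$:
\[
H_\pm(\gamma \otimes q^k, z) = \sum_{n\ge 0} H_{\mp n}(\gamma \otimes q^k)\, z^{\pm n}, \qquad
E_\pm(\gamma \otimes q^k, z) = \sum_{n\ge 0} E_{\mp n}(\gamma \otimes q^k)\, (-z)^{\pm n}.
\]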
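We now define the vertex operators $Y_n^\pm(\gamma \otimes q^l, k)$, $\gamma \in \Gamma^*$, $k, l \in \mathbb{Z}$, $n \in \mathbb{Z} + \langle\gamma,\gamma\rangle^1_\xi/2$, as follows.
\[
\begin{aligned}
Y^+(\gamma \otimes q^l, k, z) &= \sum_{n\in\mathbb{Z}+\langle\gamma,\gamma\rangle^1_\xi/2} Y_n^+(\gamma \otimes q^l, k)\, z^{-n-\langle\gamma,\gamma\rangle^1_\xi/2} \\
&= H_+(\gamma \otimes q^l, z)\, E_-(\gamma \otimes q^{l-k}, z)\, e^\gamma (q^{-l}z)^{\partial_\gamma},
\end{aligned} \tag{7.3}
\]
\[
\begin{aligned}
Y^-(\gamma \otimes q^l, k, z) &= \big(Y^+(\gamma \otimes q^l, k, z^{-1})\big)^* \\
&= \sum_{n\in\mathbb{Z}+\langle\gamma,\gamma\rangle^1_\xi/2} Y_n^-(\gamma \otimes q^l, k)\, z^{-n-\langle\gamma,\gamma\rangle^1_\xi/2} \\
&= E_+(\gamma \otimes q^{l-k}, z)\, H_-(\gamma \otimes q^l, z)\, e^{-\gamma} (q^{-l}z)^{-\partial_\gamma}.
\end{aligned} \tag{7.4}
\]
One easily sees that the operators $Y_n^\pm(\gamma \otimes q^l, k)$ are well-defined operators acting on the space $F_{\Gamma\times\mathbb{C}^\times}$.

We extend the $\mathbb{Z}_+$-gradation on $R_{\Gamma\times\mathbb{C}^\times}$ to a $(\tfrac12\langle\gamma,\gamma\rangle^1_\xi + \mathbb{Z}_+)$-gradation on $F_{\Gamma\times\mathbb{C}^\times}$ by letting
\[
\deg a_{-n}(\gamma \otimes q^k) = n, \qquad \deg e^\gamma = \tfrac12 \langle\gamma,\gamma\rangle^1_\xi.
\]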
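We denote by $\bar R_{\Gamma\times\mathbb{C}^\times}$ the subalgebra of $R_{\Gamma\times\mathbb{C}^\times}$ excluding the generators $a_n(\gamma_0)$, $n \in \mathbb{Z}^\times$. The bilinear form $\langle\ ,\ \rangle_\xi$ on $\bar F_{\Gamma\times\mathbb{C}^\times} = \bar R_{\Gamma\times\mathbb{C}^\times} \otimes \bar R_{\mathbb{Z}}(\Gamma)$ will be the restriction of $\langle\ ,\ \rangle_\xi$ on $F_{\Gamma\times\mathbb{C}^\times}$ to $\bar F_{\Gamma\times\mathbb{C}^\times}$. In the case of the second choice of $\xi$ and $p = q^{\pm1}$, the Fock space $\bar F_{\Gamma\times\mathbb{C}^\times}$ can also be obtained as the quotient of $F_{\Gamma\times\mathbb{C}^\times}$ modulo the radical of $\langle\ ,\ \rangle_\xi$.

We define $\tilde a_{-n}(\gamma \otimes q^k)$, $n > 0$, to be a map from $R_{\Gamma\times\mathbb{C}^\times}$ to itself by the following composition:
\[
R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \sigma_n(\gamma\otimes q^k)\otimes\ } R(\Gamma_n \times \mathbb{C}^\times) \otimes R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \mathrm{Ind}\otimes m_{\mathbb{C}^\times}\ } R(\Gamma_{n+m} \times \mathbb{C}^\times).
\]
We also define $\tilde a_n(\gamma \otimes q^k)$, $n > 0$, to be a map from $R_{\Gamma\times\mathbb{C}^\times}$ to itself as the composition
\[
R(\Gamma_m \times \mathbb{C}^\times)
\xrightarrow{\ \mathrm{Res}\otimes\Delta\ } R(\Gamma_n \times \mathbb{C}^\times) \otimes R(\Gamma_{m-n} \times \mathbb{C}^\times)
\xrightarrow{\ \langle \sigma_n(\gamma\otimes q^k),\,\cdot\,\rangle^q_\xi\ } R(\Gamma_{m-n} \times \mathbb{C}^\times).
\]

Proposition 7.1. The operators $\tilde a_n(\gamma)$, $\gamma \in \Gamma^*$, $n \in \mathbb{Z}^\times$, satisfy the Heisenberg algebra relations (5.1) with $C = 1$.

Proof. This is proved in the same way as for the classical setting in [W]. □

7.2. Group theoretic interpretation of vertex operators. To compare the vertex operators $Y^\pm(\gamma \otimes q^l, k, z)$ with the familiar vertex operators acting in the Fock space we introduce the space
\[
V_{\Gamma\times\mathbb{C}^\times} = S_{\Gamma\times\mathbb{C}^\times} \otimes \mathbb{C}[R_{\mathbb{Z}}(\Gamma)].
\]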
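We extend the bilinear form $\langle\ ,\ \rangle^q_\xi$ in $S_{\Gamma\times\mathbb{C}^\times}$ to the space $V_{\Gamma\times\mathbb{C}^\times}$ and also extend the $\mathbb{Z}_+$-gradation on $S_{\Gamma\times\mathbb{C}^\times}$ to a $\tfrac12\mathbb{Z}_+$-gradation on $V_{\Gamma\times\mathbb{C}^\times}$. We extend the characteristic map to the map $\mathrm{ch} : F_{\Gamma\times\mathbb{C}^\times} \to V_{\Gamma\times\mathbb{C}^\times}$ by the identity on $R_{\mathbb{Z}}(\Gamma)$. Then Proposition 6.5 and Theorem 6.9 imply that we have an isometric isomorphism of Hopf algebras. We can now identify the operators from the previous subsections with the operators constructed from the Heisenberg algebra.

Theorem 7.2. For any $\gamma \in R(\Gamma)$ and $k \in \mathbb{Z}$, we have
\[
\mathrm{ch}\, H_+(\gamma \otimes q^k, z) = \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, a_{-n}(\gamma)(q^{-k}z)^n \Big), \tag{7.5}
\]
\[
\mathrm{ch}\, E_+(\gamma \otimes q^k, z) = \exp\Big( -\sum_{n\ge 1} \frac{1}{n}\, a_{-n}(\gamma)(q^{-k}z)^n \Big), \tag{7.6}
\]
\[
\mathrm{ch}\, H_-(\gamma \otimes q^k, z) = \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, a_n(\gamma)(q^{-k}z)^{-n} \Big), \tag{7.7}
\]
\[
\mathrm{ch}\, E_-(\gamma \otimes q^k, z) = \exp\Big( -\sum_{n\ge 1} \frac{1}{n}\, a_n(\gamma)(q^{-k}z)^{-n} \Big). \tag{7.8}
\]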
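Proof. The first and second identities were essentially established in Proposition 6.3 together with Lemma 6.1, where the components are viewed as operators acting on $R_{\Gamma\times\mathbb{C}^\times}$ or $S_{\Gamma\times\mathbb{C}^\times}$. Note that $a_n(\gamma \otimes q^k) = a_n(\gamma)\, q^{kn}$. We observe from the definition that the adjoints of $E_+(\gamma \otimes q^k, z)$ and $H_+(\gamma \otimes q^k, z)$ with respect to the bilinear form $\langle\ ,\ \rangle^q_\xi$ are $E_-(\gamma \otimes q^k, z^{-1})$ and $H_-(\gamma \otimes q^k, z^{-1})$ respectively. The third and fourth identities are obtained by applying the adjoint action $*$ to the first two identities. □

Remark 7.3. Replacing $\gamma$ by $-\gamma$ in (7.5) and (7.7) we obtain the equivalent formulas (7.6) and (7.8) respectively.

Applying the characteristic map to the vertex operators $Y^\pm(\gamma, k, z)$, we obtain the following group theoretical explanation of vertex operators acting on the Fock space $F_{\Gamma\times\mathbb{C}^\times}$.

Theorem 7.4. For any $\gamma \in R_\Gamma$ and $k \in \mathbb{Z}$, we have
\[
\begin{aligned}
Y^+(\gamma, k, z) &= \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, \tilde a_{-n}(\gamma) z^n \Big) \exp\Big( -\sum_{n\ge 1} \frac{1}{n}\, \tilde a_n(\gamma)\, q^{-kn} z^{-n} \Big) e^\gamma z^{\partial_\gamma} \\
&= \mathrm{ch}(H_+(\gamma, z))\, \mathrm{ch}\big(S(H_+(\gamma \otimes q^k, z^{-1})^*)\big)\, e^\gamma z^{\partial_\gamma},
\end{aligned}
\]
\[
\begin{aligned}
Y^-(\gamma, k, z) &= \exp\Big( -\sum_{n\ge 1} \frac{1}{n}\, \tilde a_{-n}(\gamma)\, q^{kn} z^n \Big) \exp\Big( \sum_{n\ge 1} \frac{1}{n}\, \tilde a_n(\gamma) z^{-n} \Big) e^{-\gamma} z^{-\partial_\gamma} \\
&= \mathrm{ch}\big(S(H_+(\gamma \otimes q^k, z^{-1}))\big)\, \mathrm{ch}(H_+(\gamma, z)^*)\, e^{-\gamma} z^{-\partial_\gamma}.
\end{aligned}
\]
We note that for $\gamma \in \Gamma^*$, $l \in \mathbb{Z}$,
\[
Y^\pm(\gamma \otimes q^l, k, z) = Y^\pm(\gamma, k, q^{-l}z). \tag{7.9}
\]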
It follows from Theorem 7.4 that
\[
\begin{aligned}
\mathrm{ch}\, Y^\pm(\gamma, k, z) = X^\pm(\gamma, k, z)
&= \exp\Big( \pm\sum_{n\ge 1} \frac{1}{n}\, a_{-n}(\gamma)\, q^{n(k\mp k)/2} z^n \Big) \\
&\quad\times \exp\Big( \mp\sum_{n\ge 1} \frac{1}{n}\, a_n(\gamma)\, q^{n(-k\mp k)/2} z^{-n} \Big)\, e^{\pm\gamma} z^{\pm\partial_\gamma}.
\end{aligned}
\]
In general the vertex operators $Y^\pm(\gamma, k, z)$ (for $k \in \mathbb{Z}$) generalize the vertex operators considered in [J3] (for $k = \pm1$). When $q = 1$ they specialize to the vertex operators $Y^\pm(\gamma, z)$ studied in [FJW].

8. Basic Representations and the McKay Correspondence

8.1. Quantum toroidal algebras. Let $Q$ be the root lattice of an affine Lie algebra of simply laced type A, D, or E with the invariant form $(\ |\ )$. The quantum toroidal algebra $U_q(\hat{\mathfrak g})$ is the associative algebra generated by $x_i^\pm(n)$, $a_i(m)$, $q^d$, $q^c$, $0 \le i \le r$, $n, m \in \mathbb{Z}$, subject to the following relations [GKV]:
\[
q^d a_i(n) q^{-d} = q^n a_i(n), \qquad q^d x_i^\pm(n) q^{-d} = q^n x_i^\pm(n), \tag{8.1}
\]
\[
[a_i(m), a_j(n)] = \delta_{m,-n}\, \frac{[(\alpha_i|\alpha_j)m]}{m}\, \frac{q^{mc} - q^{-mc}}{q - q^{-1}}, \tag{8.2}
\]
\[
[a_i(m), x_j^\pm(n)] = \pm \frac{[(\alpha_i|\alpha_j)m]}{m}\, q^{\mp|m|c/2}\, x_j^\pm(m+n), \tag{8.3}
\]
\[
(z - q^{\pm(\alpha_i|\alpha_j)} w)\, x_i^\pm(z) x_j^\pm(w) = x_j^\pm(w) x_i^\pm(z)\, (q^{\pm(\alpha_i|\alpha_j)} z - w), \tag{8.4}
\]
\[
[x_i^+(z), x_j^-(w)] = \frac{\delta_{ij}}{q - q^{-1}} \big\{ \delta(zw^{-1}q^{-c})\, \psi_i^+(wq^{c/2}) - \delta(zw^{-1}q^{c})\, \psi_i^-(zq^{c/2}) \big\}, \tag{8.5}
\]
\[
\operatorname{Sym}_{z_1,\dots,z_N} \sum_{s=0}^{N = 1-(\alpha_i|\alpha_j)} (-1)^s \begin{bmatrix} N \\ s \end{bmatrix} x_i^\pm(z_1) \cdots x_i^\pm(z_s)\, x_j^\pm(w)\, x_i^\pm(z_{s+1}) \cdots x_i^\pm(z_N) = 0 \quad \text{for } (\alpha_i|\alpha_j) \le 0, \tag{8.6}
\]
where the generators $a_i(n)$ are related to $\psi_i^\pm(\pm n)$ via
\[
\psi_i^\pm(z) = \sum_{n\ge 0} \psi_i^\pm(\pm n)\, z^{\mp n} = k_i^{\pm1} \exp\Big( \pm(q - q^{-1}) \sum_{n>0} a_i(\pm n)\, z^{\mp n} \Big),
\]
and the Gaussian polynomial is
\[
\begin{bmatrix} m \\ n \end{bmatrix} = \frac{[m]!}{[n]!\,[m-n]!}, \qquad [n]! = [n][n-1] \cdots [1].
\]
The generating function of the $x_i^\pm(n)$ is defined by
\[
x_i^\pm(z) = \sum_{n\in\mathbb{Z}} x_i^\pm(n)\, z^{-n-1}, \qquad i = 0, \dots, r. \tag{8.7}
\]
The quantum toroidal algebra contains a special subalgebra, the quantum affine algebra $U_q(\hat{\mathfrak g})$, which is generated by simply omitting the generators associated to $i = 0$. The relations are called the Drinfeld realization of the quantum affine algebras.

In the case of type A, the quantum toroidal algebra $U_q(\hat{\mathfrak g})$ admits a further deformation $U_{q,p}(\hat{\mathfrak g})$. Let $(b_{ij})$ be the skew-symmetric $(r+1) \times (r+1)$-matrix
\[
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & -1 \\
-1 & 0 & 1 & \cdots & 0 & 0 \\
0 & -1 & 0 & \cdots & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
1 & 0 & 0 & \cdots & -1 & 0
\end{pmatrix}. \tag{8.8}
\]
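For concreteness, the matrix (8.8) can be generated for a given rank. This is a minimal sketch under the assumption that the display reads $b_{i,i+1} = 1$ and $b_{i+1,i} = -1$ with indices taken modulo $r + 1$ (so that $b_{0r} = -1$ and $b_{r0} = 1$); the overall sign convention is read off from the display and is the only input. The function name `toroidal_b_matrix` is hypothetical.

```python
def toroidal_b_matrix(r):
    """Return the (r+1) x (r+1) skew-symmetric matrix (b_ij) for r >= 2.

    Assumed convention: b[i][i+1] = 1 and b[i+1][i] = -1, indices mod r+1.
    """
    if r < 2:
        raise ValueError("the cyclic pattern needs r >= 2")
    n = r + 1
    b = [[0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        b[i][j] = 1
        b[j][i] = -1
    return b


b = toroidal_b_matrix(3)
# Skew-symmetry: b_ij = -b_ji for all i, j.
assert all(b[i][j] == -b[j][i] for i in range(4) for j in range(4))
```

For $r = 3$ this produces the rows $(0, 1, 0, -1)$, $(-1, 0, 1, 0)$, $(0, -1, 0, 1)$, $(1, 0, -1, 0)$.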
The quantum toroidal algebra $U_{q,p}(\hat{\mathfrak g})$ is the associative algebra generated by $x_i^\pm(n)$, $a_i(m)$, $q^{d_1}$, $q^{d_2}$, $q^c$, $0 \le i \le r$, $m, n \in \mathbb{Z}$, subject to the following relations [GKV,VV]:
\[
q^{d_1} a_i(n) q^{-d_1} = q^n a_i(n), \qquad q^{d_1} x_i^\pm(n) q^{-d_1} = q^n x_i^\pm(n), \tag{8.9}
\]
\[
q^{d_2} a_i(n) q^{-d_2} = a_i(n), \tag{8.10}
\]
\[
q^{d_2} x_i^\pm(n) q^{-d_2} = q^{\pm\delta_{n0}}\, x_i^\pm(n), \tag{8.11}
\]
\[
[a_i(m), a_j(n)] = \delta_{m,-n}\, \frac{[(\alpha_i|\alpha_j)m]}{m}\, \frac{q^{mc} - q^{-mc}}{q - q^{-1}}\, p^{m b_{ij}}, \tag{8.12}
\]
\[
[a_i(m), x_j^\pm(n)] = \pm \frac{[(\alpha_i|\alpha_j)m]}{m}\, q^{\mp|m|c/2}\, p^{m b_{ij}}\, x_j^\pm(m+n), \tag{8.13}
\]
\[
(p^{b_{ij}} z - q^{\pm(\alpha_i|\alpha_j)} w)\, x_i^\pm(z) x_j^\pm(w) = x_j^\pm(w) x_i^\pm(z)\, (p^{b_{ij}} q^{\pm(\alpha_i|\alpha_j)} z - w), \tag{8.14}
\]
\[
[x_i^+(z), x_j^-(w)] = \frac{\delta_{ij}}{q - q^{-1}} \big\{ \delta(zw^{-1}q^{-c})\, \psi_i^+(wq^{c/2}) - \delta(zw^{-1}q^{c})\, \psi_i^-(zq^{c/2}) \big\}, \tag{8.15}
\]
\[
\operatorname{Sym}_{z_1,\dots,z_N} \sum_{s=0}^{N = 1-(\alpha_i|\alpha_j)} (-1)^s \begin{bmatrix} N \\ s \end{bmatrix} x_i^\pm(z_1) \cdots x_i^\pm(z_s)\, x_j^\pm(w)\, x_i^\pm(z_{s+1}) \cdots x_i^\pm(z_N) = 0 \quad \text{for } (\alpha_i|\alpha_j) \le 0, \tag{8.16}
\]
where the generators $a_i(n)$ are related to $\psi_i^\pm(\pm m)$ via
\[
\psi_i^\pm(z) = \sum_{n\ge 0} \psi_i^\pm(\pm n)\, z^{\mp n} = k_i^{\pm1} \exp\Big( \pm(q - q^{-1}) \sum_{n>0} a_i(\pm n)\, z^{\mp n} \Big). \tag{8.17}
\]
We recall that the basic module of $U_q(\hat{\mathfrak g})$ is the simple module generated by the highest weight vector $v_0$ such that
\[
a_i(n).v_0 = 0, \quad x_i^\pm(n).v_0 = 0, \quad n \ge 0, \qquad q^c.v_0 = q v_0, \quad q^d.v_0 = v_0.
\]
We say a module is of level one if $q^c$ acts as $q$.
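8.2. A new form of McKay correspondence. In this subsection we let $\Gamma$ be a finite subgroup of $SU_2$ and consider two distinguished choices of the class function $\xi$ in $R_{\Gamma\times\mathbb{C}^\times}$ introduced in Sect. 4.2.

First we consider
\[
\xi = \gamma_0 \otimes (q + q^{-1}) - \pi \otimes 1_{\mathbb{C}^\times},
\]
where $\pi$ is the character of the two-dimensional natural representation of $\Gamma$ in $SU_2$. The Heisenberg algebra in this case has the following relations (cf. Prop. 5.1 and (4.2)):
\[
[a_m(\gamma_i), a_n(\gamma_j)] =
\begin{cases}
m\,\delta_{m,-n}\,(q^m + q^{-m})\,C, & i = j, \\
m\,\delta_{m,-n}\, a^1_{ij}\, C, & i \ne j,
\end{cases} \tag{8.18}
\]
where $a^1_{ij}$ are the entries of the affine Cartan matrix of ADE type (see (3.4) at $d = 2$). When $\Gamma \ne \mathbb{Z}/2\mathbb{Z}$ or 1, the relations (8.18) can be simply written as follows:
\[
[a_m(\gamma_i), a_n(\gamma_j)] = m\,\delta_{m,-n}\,[a^1_{ij}]_{q^m}\, C.
\]
Recall that the matrix $A^1 = (\langle\gamma_i, \gamma_j\rangle^1_\xi) = (a^1_{ij})_{0\le i,j\le r}$ is the Cartan matrix for the corresponding affine Lie algebra [Mc]. In particular $a^1_{ii} = 2$; $a^1_{ij} = 0$ or $-1$ when $i \ne j$ and $\Gamma \ne \mathbb{Z}/2\mathbb{Z}$. In the case of $\Gamma = \mathbb{Z}/2\mathbb{Z}$, $a^1_{01} = a^1_{10} = -2$. Let $\mathfrak g$ (resp. $\hat{\mathfrak g}$) be the corresponding simple Lie algebra (resp. affine Lie algebra) associated to the Cartan matrix $(a^1_{ij})_{1\le i,j\le r}$ (resp. $A^1$). Note that the lattice $R_{\mathbb{Z}}(\Gamma)$ is even in this case.

We define the normal ordered product of vertex operators as follows.
\[
\begin{aligned}
{:}Y^+(\gamma_i, k, z)\, Y^+(\gamma_j, k', w){:}
&= H_+(\gamma_i, z)\, H_+(\gamma_j, w)\, S\big( H_+(\gamma_i \otimes q^k, z^{-1})^*\, H_+(\gamma_j \otimes q^{k'}, w^{-1})^* \big) \\
&\quad\times e^{\gamma_i+\gamma_j}\, z^{\partial_{\gamma_i}} w^{\partial_{\gamma_j}},
\end{aligned}
\]
\[
\begin{aligned}
{:}Y^+(\gamma_i, k, z)\, Y^-(\gamma_j, k', w){:}
&= H_+(\gamma_i, z)\, H_+(-\gamma_j \otimes q^{-k'}, w)\, S\big( H_+(\gamma_i \otimes q^k, z^{-1})^*\, H_+(-\gamma_j \otimes q^{k'}, w^{-1})^* \big) \\
&\quad\times e^{\gamma_i-\gamma_j}\, z^{\partial_{\gamma_i}} w^{-\partial_{\gamma_j}}.
\end{aligned}
\]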
Other normal ordered products are defined similarly. We introduce for $a \in \mathbb{R}$ the following $q$-function:
\[
(1 - z)^a_{q^2} = \frac{(q^{-a+1}z;\, q^2)_\infty}{(q^{a+1}z;\, q^2)_\infty}
= \exp\Big( -\sum_{n=1}^\infty \frac{[an]}{n[n]}\, z^n \Big)
= \sum_{m=0}^\infty \begin{bmatrix} a \\ m \end{bmatrix} (-z)^m, \tag{8.19}
\]
where we expand the power series using the $q$-binomial theorem and
\[
\begin{bmatrix} a \\ m \end{bmatrix} = \frac{(q^a - q^{-a})(q^{a-1} - q^{-a+1}) \cdots (q^{a-m+1} - q^{-a+m-1})}{(q^m - q^{-m})(q^{m-1} - q^{-m+1}) \cdots (q - q^{-1})}, \qquad
(a; q)_\infty = \prod_{n=0}^\infty (1 - aq^n).
\]
When $a$ is a non-negative integer, $\begin{bmatrix} a \\ m \end{bmatrix}$ equals the Gaussian polynomial. The identities in the following theorems are understood as usual by means of correlation functions (cf. e.g. [FJ,J1]).
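The special values of the $q$-function (8.19) that enter the operator product formulas can be verified directly from the $q$-binomial coefficients. The following sketch, with the hypothetical helper names `q_binomial` and `series_coeffs`, evaluates the coefficients of $\sum_m \begin{bmatrix} a \\ m \end{bmatrix}(-z)^m$ at a rational value of $q$ (an arbitrary test value) and compares them with $1 - z$ for $a = 1$, with $(1 - qz)(1 - q^{-1}z)$ for $a = 2$, and with the geometric series $(1 - z)^{-1}$ for $a = -1$.

```python
from fractions import Fraction


def q_binomial(a, m, q):
    """[a choose m] for integer a, as the ratio of products in (8.19)."""
    num = Fraction(1)
    den = Fraction(1)
    for j in range(m):
        num *= q ** (a - j) - q ** (-(a - j))
        den *= q ** (j + 1) - q ** (-(j + 1))
    return num / den


def series_coeffs(a, q, M):
    """Coefficients of z^0..z^M in sum_m [a choose m] (-z)^m."""
    return [q_binomial(a, m, q) * (-1) ** m for m in range(M + 1)]


q = Fraction(3, 2)  # arbitrary generic test value
# a = 1: (1 - z)_{q^2} = 1 - z
assert series_coeffs(1, q, 3) == [1, -1, 0, 0]
# a = 2: (1 - z)^2_{q^2} = (1 - q z)(1 - z/q) = 1 - (q + 1/q) z + z^2
assert series_coeffs(2, q, 3) == [1, -(q + 1 / q), 1, 0]
# a = -1: (1 - z)^{-1}_{q^2} = 1 + z + z^2 + ...
assert series_coeffs(-1, q, 4) == [1, 1, 1, 1, 1]
```

Exact rational arithmetic is used so that the comparisons are equalities rather than floating-point approximations.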
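Theorem 8.1. Let $\xi = \gamma_0 \otimes (q + q^{-1}) - \pi \otimes 1_{\mathbb{C}^\times}$. Then the vertex operators $Y^\pm(\gamma_i, k, z)$, $Y^\pm(-\gamma_j, k, z)$, $\gamma_i \in \Gamma^*$, $k \in \mathbb{Z}$, acting on the group theoretically defined Fock space $F_{\Gamma\times\mathbb{C}^\times}$ satisfy the following relations:
\[
Y^\pm(\gamma_i, k, z)\, Y^\pm(\gamma_j, k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^\pm(\gamma_i, k, z)\, Y^\pm(\gamma_j, k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
(z - q^{\mp k} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - q^{\mp k-1} w)(z - q^{\mp k+1} w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^\pm(\gamma_i, k, z)\, Y^\mp(\gamma_j, k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^\pm(\gamma_i, k, z)\, Y^\mp(\gamma_j, k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
(z - w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - qw)^{-1}(z - q^{-1}w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^\pm(\gamma_i, k, z)\, Y^\pm(-\gamma_j, -k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^\pm(\gamma_i, k, z)\, Y^\pm(-\gamma_j, -k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
(z - q^{\mp k} w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - q^{\mp k-1} w)^{-1}(z - q^{\mp k+1} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^+(\gamma_i, k, z)\, Y^-(-\gamma_j, -k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^+(\gamma_i, k, z)\, Y^-(-\gamma_j, -k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
(z - q^{-2k} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - q^{-2k-1} w)(z - q^{-2k+1} w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^-(\gamma_i, k, z)\, Y^+(-\gamma_j, -k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^-(\gamma_i, k, z)\, Y^+(-\gamma_j, -k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
(z - w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - qw)(z - q^{-1}w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2.
\end{cases}
\]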
Proof. It is a routine computation to see that
\[
E_-(\gamma_i \otimes q^k, z)\, H_+(\gamma_j \otimes q^l, w) = H_+(\gamma_j \otimes q^l, w)\, E_-(\gamma_i \otimes q^k, z)\, \Big( 1 - \frac{w}{z}\, q^{l-k} \Big)^{\langle\gamma_i,\gamma_j\rangle^1_\xi}_{q^2},
\]
where the $q$-analog of the power series $(1 - x)^n_{q^2}$ is defined in (8.19). In particular, we have
\[
(1 - w/z)_{q^2} = 1 - w/z, \qquad (1 - w/z)^2_{q^2} = (1 - qw/z)(1 - q^{-1}w/z).
\]
Then the theorem is proved by observing that $z^{\partial_\gamma} e^\beta = z^{\langle\gamma,\beta\rangle^1_\xi}\, e^\beta z^{\partial_\gamma}$. □

Remark 8.2. Replacing the vertex operator $Y^\pm$ by $X^\pm$ via the characteristic map ch in the above formulas, we get the corresponding formulas for the vertex operators $X^\pm(\gamma, k, z)$ acting on $V_{\Gamma\times\mathbb{C}^\times}$.

Now we consider the second distinguished class function
\[
\xi^{q,p} = \gamma_0 \otimes (q + q^{-1}) - (\gamma_1 \otimes p + \gamma_r \otimes p^{-1}),
\]
when $\Gamma$ is a cyclic group of order $r + 1$. In this case the Heisenberg algebra (5.3) has the following relations according to Prop. 5.1 and (4.3):
\[
[a_m(\gamma_i), a_n(\gamma_j)] = m\,\delta_{m,-n}\,[a^1_{ij}]_{q^m}\, p^{m b_{ij}}\, C, \tag{8.20}
\]
where $a^1_{ij}$ are the entries of the affine Cartan matrix of type A and $r \ge 2$. This is the same Heisenberg subalgebra ($c = 1$) in $U_{q,p}(\hat{\mathfrak g})$ provided that we identify
\[
a_i(n) = \frac{[n]}{n}\, a_n(\gamma_i).
\]
Recall that $(b_{ij})$ is the skew-symmetric matrix given in (8.8). We need to slightly modify the definition of the middle term in the vertex operators. For each $i = 0, 1, \dots, r$ we define the modified operator $z^{\partial_{\gamma_i},p}$ on the group algebra $\mathbb{C}[R_{\mathbb{Z}}(\Gamma)]$ by
\[
z^{\partial_{\gamma_i},p}\, e^\beta = z^{\langle\gamma_i,\beta\rangle^1_\xi}\, p^{-\frac12 \sum_{j=1}^r \langle\gamma_i, m_j\gamma_j\rangle^1_\xi\, b_{ij}}\, e^\beta, \tag{8.21}
\]
where $\beta = \sum_j m_j\gamma_j \in R_{\mathbb{Z}}(\Gamma)$.

We then replace the operator $z^{\pm\partial_{\gamma_i}}$ in the definition of the vertex operators $Y^\pm(\gamma_i, k, z)$ by the operator $z^{\pm\partial_{\gamma_i},p}$. The formulas in Theorem 7.4 remain true after the terms $z^{\pm\partial}$ appearing in them are modified accordingly. The proof of the following theorem is similar to that of Theorem 8.1.
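Theorem 8.3. Let $\Gamma$ be a cyclic group of order $r + 1$ and let $\xi = \gamma_0 \otimes (q + q^{-1}) - (\gamma_1 \otimes p + \gamma_r \otimes p^{-1})$. The vertex operators $Y^\pm(\gamma_i, k, z)$ and $Y^\pm(-\gamma_i, k, z)$, $\gamma_i \in \Gamma^*$, acting on the group theoretically defined Fock space $F_{\Gamma\times\mathbb{C}^\times}$ satisfy the following relations.
\[
Y^\pm(\gamma_i, k, z)\, Y^\pm(\gamma_j, k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^\pm(\gamma_i, k, z)\, Y^\pm(\gamma_j, k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
p^{-\frac12 b_{ij}} (z - q^{\mp k} p^{b_{ij}} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - q^{\mp k-1} w)(z - q^{\mp k+1} w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^\pm(\gamma_i, k, z)\, Y^\mp(\gamma_j, k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^\pm(\gamma_i, k, z)\, Y^\mp(\gamma_j, k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
p^{-\frac12 b_{ij}} (z - p^{b_{ij}} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - qw)(z - q^{-1}w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^\pm(\gamma_i, k, z)\, Y^\pm(-\gamma_j, -k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^\pm(\gamma_i, k, z)\, Y^\pm(-\gamma_j, -k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
p^{-\frac12 b_{ij}} (z - q^{\mp k} p^{b_{ij}} w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - q^{\mp k-1} w)^{-1}(z - q^{\mp k+1} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^+(\gamma_i, k, z)\, Y^-(-\gamma_j, -k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^+(\gamma_i, k, z)\, Y^-(-\gamma_j, -k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
p^{-\frac12 b_{ij}} (z - q^{-2k} p^{b_{ij}} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - q^{-2k-1} w)(z - q^{-2k+1} w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2,
\end{cases}
\]
\[
Y^-(\gamma_i, k, z)\, Y^+(-\gamma_j, -k, w) = \epsilon(\gamma_i, \gamma_j)\, {:}Y^-(\gamma_i, k, z)\, Y^+(-\gamma_j, -k, w){:} \times
\begin{cases}
1, & \langle\gamma_i, \gamma_j\rangle^1_\xi = 0, \\
p^{-\frac12 b_{ij}} (z - p^{b_{ij}} w)^{-1}, & \langle\gamma_i, \gamma_j\rangle^1_\xi = -1, \\
(z - qw)(z - q^{-1}w), & \langle\gamma_i, \gamma_j\rangle^1_\xi = 2.
\end{cases}
\]

Remark 8.4. Replacing the vertex operators $Y^\pm$ by $X^\pm$ via the characteristic map ch we obtain the corresponding results on the space $V_{\Gamma\times\mathbb{C}^\times}$.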
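8.3. Quantum vertex representations of $U_q(\hat{\mathfrak g})$. For each $i = 0, \dots, r$ let
\[
\tilde a_i(n) = \frac{[n]}{n}\, a_n(\gamma_i).
\]
It follows from (5.3) and (8.18) that
\[
[\tilde a_i(m), \tilde a_j(n)] = \delta_{m,-n}\, \frac{[m\langle\gamma_i, \gamma_j\rangle^1_\xi]}{m}\, [m]. \tag{8.22}
\]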
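The passage from (8.18) to (8.22) is a short computation with $q$-integers, using $[-m] = -[m]$ and $[a]_{q^m}[m] = [ma]$. The sketch below checks it numerically at $C = 1$ for the Cartan entries that occur, written in the uniform form $m\,\delta_{m,-n}[a]_{q^m}$ of (8.18); the value of $q$ is an arbitrary rational test value and the function names are hypothetical.

```python
from fractions import Fraction


def q_int(n, q):
    """q-integer [n] = (q^n - q^-n) / (q - q^-1)."""
    return (q ** n - q ** (-n)) / (q - q ** (-1))


def bracket_rescaled(m, a, q):
    """[a~_i(m), a~_j(-m)] computed from [a_m, a_{-m}] = m [a]_{q^m} (C = 1),

    with the rescaling a~_i(n) = ([n]/n) a_n(gamma_i).
    """
    heisenberg = m * q_int(a, q ** m)
    return (q_int(m, q) / m) * (q_int(-m, q) / (-m)) * heisenberg


def rhs_822(m, a, q):
    """Right-hand side of (8.22): [m a] [m] / m."""
    return q_int(m * a, q) * q_int(m, q) / m


q = Fraction(3, 2)  # arbitrary generic test value
for a in (2, -1, 0, -2):  # pairings <gamma_i, gamma_j> occurring in the Cartan matrix
    for m in (1, 2, 3, 4, -1, -2, -3):
        assert bracket_rescaled(m, a, q) == rhs_822(m, a, q)
```

The same computation with the extra factor $p^{m b_{ij}}$ carried along relates (8.20) to (8.12) at $c = 1$.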
According to McKay, the bilinear form $\langle\gamma_i, \gamma_j\rangle^1_\xi$ is exactly the same as the invariant form $(\ |\ )$ of the root lattice of the affine Lie algebra $\hat{\mathfrak g}$. This implies that the commutation relations (8.22) are exactly the commutation relations (8.2) of the Heisenberg algebra in $U_q(\hat{\mathfrak g})$ if we identify $\tilde a_i(n)$ with $a_i(n)$. Thus the Fock space $S_{\Gamma\times\mathbb{C}^\times}$ is a level one representation for the Heisenberg subalgebra in $U_q(\hat{\mathfrak g})$. Under the new variable (by identifying $a_i(n)$ with $\tilde a_i(n)$) and after a $q$-shift we obtain that
\[
X^+(\gamma_i \otimes q^{-k/2}, k, z) = \exp\Big( \sum_{n\ge 1} \frac{a_i(-n)}{[n]}\, q^{kn/2} z^n \Big) \exp\Big( -\sum_{n\ge 1} \frac{a_i(n)}{[n]}\, q^{kn/2} z^{-n} \Big) e^{\gamma} z^{\partial_\gamma},
\]
\[
X^-(\gamma_i \otimes q^{-k/2}, k, z) = \exp\Big( -\sum_{n\ge 1} \frac{a_i(-n)}{[n]}\, q^{-kn/2} z^n \Big) \exp\Big( \sum_{n\ge 1} \frac{a_i(n)}{[n]}\, q^{-kn/2} z^{-n} \Big) e^{-\gamma} z^{-\partial_\gamma}.
\]
The following theorem gives a $q$-deformation of the new form of McKay correspondence in [FJW] and provides a direct connection from a finite subgroup $\Gamma$ of $SU_2$ to the quantum toroidal algebra $U_q(\hat{\mathfrak g})$ of ADE type.

Theorem 8.5. Given a finite subgroup $\Gamma$ of $SU_2$, each of the following correspondences gives a vertex representation of the quantum toroidal algebra $U_q(\hat{\mathfrak g})$ on the Fock space $F_{\Gamma\times\mathbb{C}^\times}$:
\[
x_i^\pm(n) \to Y_n^\pm(\gamma_i, -1), \qquad a_i(n) \to \frac{[n]}{n}\, a_n(\gamma_i), \qquad q^c \to q;
\]
or
\[
x_i^\pm(n) \to Y_n^\mp(-\gamma_i, 1), \qquad a_i(n) \to \frac{[n]}{n}\, a_n(\gamma_i), \qquad q^c \to q,
\]
where $i = 0, \dots, r$, and $n \in \mathbb{Z}$.

Proof. Using the usual method of $q$-vertex operator calculus [FJ,J1] and Theorem 8.1 we see that the vertex operators $Y^\pm(\gamma_i, \pm1, z)$ satisfy relations (8.3), (8.4) and (8.6). Observe further that the above vertex operators at $k = \pm1$ have the same form as those in the basic representations of the quantum affine algebras (see [FJ]). Thus the relations (8.5) and (8.7) are also verified. For each fixed $k = 1$ or $-1$ we have shown that the operators $Y^\pm(\gamma_i, \pm1, z)$ give a level one representation of the quantum toroidal algebra $U_q(\hat{\mathfrak g})$ (see also [Sa,J3]). □
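Remark 8.6. Replacing $Y^\pm$ by $X^\pm$ in the above theorem, we obtain a vertex representation of $U_q(\hat{\mathfrak g})$ in the space $V_{\Gamma\times\mathbb{C}^\times}$.

We can easily get the basic representation of the quantum affine algebra $U_q(\hat{\mathfrak g})$ on a certain distinguished subspace of $F_{\Gamma\times\mathbb{C}^\times}$. Denote by $\bar S_{\Gamma\times\mathbb{C}^\times}$ the symmetric algebra generated by $a_{-n}(\gamma_i)$, $n > 0$, $i = 1, \dots, r$, over $\mathbb{C}[q, q^{-1}]$. $\bar S_{\Gamma\times\mathbb{C}^\times}$ is isometric to $\bar R_{\Gamma\times\mathbb{C}^\times}$. We define
\[
\bar F_{\Gamma\times\mathbb{C}^\times} = \bar R_{\Gamma\times\mathbb{C}^\times} \otimes \mathbb{C}[\bar R_{\mathbb{Z}}(\Gamma)] \cong \bar S_{\Gamma\times\mathbb{C}^\times} \otimes \mathbb{C}[\bar R_{\mathbb{Z}}(\Gamma)].
\]
The space $V_{\Gamma\times\mathbb{C}^\times}$ associated to the lattice $R_{\mathbb{Z}}(\Gamma)$ is isomorphic to the tensor product of the space $\bar R_{\Gamma\times\mathbb{C}^\times}$ and $\bar R_{\mathbb{Z}}(\Gamma)$ as well as the space associated to the rank 1 lattice $\mathbb{Z}\alpha_0$.

Corollary 8.7. Given a finite subgroup $\Gamma$ of $SU_2$, each of the following correspondences gives the basic representation of the quantum affine algebra $U_q(\hat{\mathfrak g})$ on the Fock space $\bar F_{\Gamma\times\mathbb{C}^\times}$:
\[
x_i^\pm(n) \to Y_n^\pm(\gamma_i, -1), \qquad a_i(n) \to \frac{[n]}{n}\, a_n(\gamma_i), \qquad q^c \to q;
\]
or
\[
x_i^\pm(n) \to Y_n^\mp(-\gamma_i, 1), \qquad a_i(n) \to \frac{[n]}{n}\, a_n(\gamma_i), \qquad q^c \to q,
\]
where $i = 1, \dots, r$.

In the case of our second distinguished class function $\xi^{q,p} = \gamma_0 \otimes (q + q^{-1}) - (\gamma_1 \otimes p + \gamma_r \otimes p^{-1})$, we need to consider the Fock space
\[
\widetilde F_{\Gamma\times\mathbb{C}^\times} = R_{\Gamma\times\mathbb{C}^\times} \otimes \mathbb{C}[R_{\mathbb{Z}}(\Gamma)/R^0_{\mathbb{Z}}] \cong S_{\Gamma\times\mathbb{C}^\times} \otimes \mathbb{C}[\bar R_{\mathbb{Z}}(\Gamma)],
\]
where $R^0_{\mathbb{Z}}$ is the radical of the bilinear form $\langle\ ,\ \rangle^1_\xi$. The space corresponding to $\widetilde F_{\Gamma\times\mathbb{C}^\times}$ under the characteristic map ch will be denoted by $\widetilde V_{\Gamma\times\mathbb{C}^\times}$. Using a method similar to that in the proof of Theorem 8.5 we derive the following theorem.

Theorem 8.8. Let $\Gamma$ be a cyclic group of order $r + 1 \ge 2$ and $p = q^{\pm1}$. Each of the following correspondences gives the basic representation of $U_q(\hat{\mathfrak g})$ on $\widetilde F_{\Gamma\times\mathbb{C}^\times}$:
\[
x_i^\pm(n) \to Y_n^\pm(\gamma_i, -1), \qquad a_i(n) \to \frac{[n]}{n}\, a_n(\gamma_i), \qquad q^c \to q;
\]
or
\[
x_i^\pm(n) \to Y_n^\mp(-\gamma_i, 1), \qquad a_i(n) \to \frac{[n]}{n}\, a_n(\gamma_i), \qquad q^c \to q,
\]
where $i = 0, \dots, r$.

Remark 8.9. The algebraic picture obtained by replacing the vertex operator $Y^\pm$ by $X^\pm$ and $F_{\Gamma\times\mathbb{C}^\times}$ by $\widetilde V_{\Gamma\times\mathbb{C}^\times}$ in the above theorem was given by Saito [Sa].

This theorem partly shows why the two-parameter deformation for $U_q(\hat{\mathfrak g})$ is only available in the case of type A. It also singles out the special case of $p = q^{\pm1}$, where the matrix of the bilinear form $\langle\ ,\ \rangle^{q,q^{\pm1}}_\xi$ is positive semi-definite (see Sect. 4.2), which permits the factorization of $F_{\Gamma\times\mathbb{C}^\times}$ into $\bar F_{\Gamma\times\mathbb{C}^\times}$.

We remark that our method can be generalized by replacing $R(\Gamma)$ by any finite dimensional Hopf algebra with a Haar measure. A more general deformation is obtained by replacing $\mathbb{C}^\times$ by any torsion-free abelian group. In another direction one can replace $\mathbb{C}^\times$ by its finite analog $\mathbb{Z}/r\mathbb{Z}$ to study $U_q(\hat{\mathfrak g})$ at $r$th roots of unity.

Acknowledgement. I.F. is supported in part by NSF grant DMS-9700765. N.J. is supported in part by NSA grant MDA904-97-1-0062 and NSF grant DMS-9970493.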
References

[A] Abe, E.: Hopf algebras. Cambridge Tracts in Mathematics, 74, Cambridge–New York: Cambridge University Press, 1980
[CG] Chriss, N. and Ginzburg, V.: Representation theory and complex geometry. Boston: Birkhäuser, 1997
[FJ] Frenkel, I.B. and Jing, N.: Vertex representations of quantum affine algebras. Proc. Natl. Acad. Sci. USA 85, 9373–9377 (1988)
[FJW] Frenkel, I.B., Jing, N. and Wang, W.: Vertex representations via finite groups and the McKay correspondence. Int’l. Math. Res. Notices 4, 195–222 (2000)
[FK] Frenkel, I.B. and Kac, V.G.: Basic representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
[GKV] Ginzburg, V., Kapranov, M. and Vasserot, E.: Langlands reciprocity for algebraic surfaces. Math. Res. Lett. 2, 147–160 (1995)
[Gr] Grojnowski, I.: Instantons and affine algebras I: The Hilbert scheme and vertex operators. Math. Res. Lett. 3, 275–291 (1996)
[J1] Jing, N.: Twisted vertex representations of quantum affine algebras. Invent. Math. 102, 663–690 (1990)
[J2] Jing, N.: Higher level representations of the quantum affine algebra $U_q(\widehat{sl}_2)$. J. Alg. 182, 448–468 (1996)
[J3] Jing, N.: Quantum Kac–Moody algebras and vertex representations. Lett. Math. Phys. 44, 261–271 (1998)
[M] Macdonald, I.G.: Symmetric functions and Hall polynomials. 2nd ed., Oxford: Clarendon Press, 1995
[Mc] McKay, J.: Graphs, singularities and finite groups. Proc. Sympos. Pure Math. 37, AMS, 183–186 (1980)
[Se] Segal, G.: Unitary representations of some infinite dimensional groups. Commun. Math. Phys. 80, 301–342 (1980)
[Sa] Saito, Y.: Quantum toroidal algebras and their vertex representations. Publ. Res. Inst. Math. Sci. 34, 155–177 (1998)
[VV] Varagnolo, M. and Vasserot, E.: Double-loop algebras and the Fock space. Invent. Math. 133, 133–159 (1998)
[W] Wang, W.: Equivariant K-theory and wreath products. MPI preprint # 86, August 1998; Equivariant K-theory, wreath products and Heisenberg algebra. Duke Math. J. 103, 1–23 (2000)
[Z] Zelevinsky, A.: Representations of finite classical groups, A Hopf algebra approach. Lecture Notes in Mathematics, 869. Berlin–New York: Springer-Verlag, 1981

Communicated by T. Miwa
Commun. Math. Phys. 211, 395 – 406 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Canonical Tensor Product Subfactors K.-H. Rehren Institut für Theoretische Physik, Universität Göttingen, Bunsenstrasse 9, 37073 Göttingen, Germany. E-mail:
[email protected] Received: 24 November 1999 / Accepted: 6 December 1999
Abstract: Canonical tensor product subfactors (CTPS’s) describe, among other things, the embedding of chiral observables in two-dimensional conformal quantum field theories. A new class of CTPS’s is constructed some of which are associated with certain modular invariants, thereby establishing the expected existence of the corresponding two-dimensional theories.
1. Introduction and Results There is a common mathematical note which recurs again and again in the areas of conformal quantum field theory and modular invariants on the one hand, and asymptotic subfactors and quantum doubles on the other hand. In all these areas, there arise inclusions of von Neumann factors of the form A⊗B ⊂ C sharing a “canonical” property (see Def. 1.1 below) for which we call them “canonical tensor product subfactors” (CTPS, cf. [17]). E.g., in chiral quantum field theories on S 1 , CTPS’s describe the violation of Haag duality for disjoint intervals (Jones–Wassermann subfactors, cf. [21,10]), or the embedding of a coset model into a given ambient model. In two-dimensional conformal quantum field theories they describe the embedding of chiral subtheories [17] which is (incompletely) reflected also by modular invariant coupling matrices [15]. Ocneanu’s asymptotic subfactors [16] which are sometimes regarded as generalized quantum doubles [5] are also CTPS’s. The main result in this article is the presentation (Thm. 1.4) of a class of new CTPS’s associated with extensions of closed systems of endomorphisms (Def. 1.2, 1.3). Among them there is a subclass of considerable importance for the understanding of modular invariants. Namely, with every modular invariant constructed by a method due to Böckenhauer, Evans and Kawahigashi [3] one can associate one of the new CTPS’s which, if interpreted as a local inclusion of an algebra of chiral observables into an algebra of two-dimensional observables, allows to prove the existence of a complete two-
dimensional local conformal quantum field theory associated with the given modular invariant (Cor. 1.6). The mathematical abstraction of this physical problem as a problem on von Neumann algebras and subfactors is most efficient. It is based on the seminal realization [7] that positive-energy representations (“superselection charges”) and their fusion are conveniently expressed in terms of endomorphisms, promoting particle statistics to a unitary operator representation (braiding) on the physical Hilbert space, and identifying the statistical dimension as (the square root of) a Jones index. An extension or embedding of a quantum field theory gives rise to a subfactor whose canonical endomorphism corresponds to the branching of the extended vacuum representation [12]. This subfactor is conveniently characterized by its “Q-system” [11] which describes the local fields of the extended theory and their correlation functions in terms of superselection charges and charged “exchange amplitudes” of the embedded theory [12,19]. In Sect. 3, we define the subfactors of Thm. 1.4 by their Q-systems. This permits to read off the coupling of left and right chiral charges (the modular invariant coupling matrix), as explained in [19], and the relative amplitudes of conformal blocks contributing to the local correlation functions of the corresponding two-dimensional quantum field theories. CTPS’s are very special cases of “symmetric joint inclusions”, i.e., triples of von Neumann algebras (A, B, C) such that A and B are commuting subalgebras of C. After a survey of some general properties of symmetric joint inclusions in Sect. 4, we give a characterization of “normality” for CTPS’s in Proposition 4.3. This is a maximality property which, in the case of the embedding of chiral observables into a two-dimensional quantum field theory, corresponds to the maximally extended chiral algebras and diagonal or permutation invariants [17]. 
The canonical property mentioned before is a natural feature of the embedding of chiral quantum field theories into a two-dimensional conformal quantum field theory, reflecting the independence of its left and right moving degrees of freedom [17]. It is defined as follows.
Definition 1.1. A tensor product subfactor of the form $A \otimes B \subset C$ is called a canonical tensor product subfactor (CTPS) if either $A$, $B$, $C$ are type II factors and $C$ considered as an $A \otimes B$-$A \otimes B$ bimodule decomposes into irreducibles which are all tensor products of $A$-$A$ bimodules with $B$-$B$ bimodules, or if $A$, $B$, $C$ are type III factors and the dual canonical endomorphism $\theta \equiv \bar\iota \circ \iota \in \mathrm{End}(A \otimes B)$ decomposes into irreducibles which are all tensor products of endomorphisms of $A$ with endomorphisms of $B$. Let, in the type III case,
\[ \theta \simeq \bigoplus_{\alpha,\beta} Z_{\alpha,\beta}\, \alpha \otimes \beta, \]
where the sum extends over two sets of mutually inequivalent irreducible endomorphisms of $A$ and of $B$, respectively. Then we call the matrix of multiplicities $Z$ with non-negative integer entries the coupling matrix of the CTPS. The coupling matrix in the type II case is defined analogously in terms of mutually inequivalent irreducible $A$-$A$ and $B$-$B$ bimodules.
Here, as always throughout this paper, $\iota : A \otimes B \to C$ denotes the inclusion homomorphism of the subfactor under consideration, and $\bar\iota : C \to A \otimes B$ a conjugate homomorphism [13].
In order to state our main result, we have to introduce some further notions. We consider type III von Neumann factors $N$, and denote by $\mathrm{End}_{\mathrm{fin}}(N)$ the set of unital endomorphisms $\lambda$ of $N$ with finite dimension $d(\lambda)$.
Definition 1.2. A closed $N$-system is a set $\Delta \subset \mathrm{End}_{\mathrm{fin}}(N)$ of mutually inequivalent irreducible endomorphisms such that (i) $\mathrm{id}_N \in \Delta$, (ii) if $\lambda \in \Delta$ then there is a conjugate endomorphism $\bar\lambda \in \Delta$, and (iii) if $\lambda, \mu \in \Delta$ then $\lambda\mu$ belongs to $\Sigma(\Delta)$, the set of endomorphisms which are equivalent to finite direct sums of elements from $\Delta$.
Definition 1.3. Let $N \subset M$ be a subfactor with inclusion homomorphism $\iota : N \to M$. An extension of the closed $N$-system $\Delta$ is a pair $(\iota, \alpha)$, where $\iota$ is as above, and $\alpha$ is a map $\Delta \to \mathrm{End}_{\mathrm{fin}}(M)$, $\lambda \mapsto \alpha_\lambda$, such that
(E1) $\iota \circ \lambda = \alpha_\lambda \circ \iota$,
(E2) $\iota(\mathrm{Hom}(\nu, \lambda\mu)) \subset \mathrm{Hom}(\alpha_\nu, \alpha_\lambda\alpha_\mu)$.
After these preliminaries, we can state our main results (proven in Sect. 3).
Theorem 1.4. Let $N_1 \subset M$ and $N_2 \subset M$ be two subfactors of $M$, and $(\iota_1, \alpha^1)$ and $(\iota_2, \alpha^2)$ a pair of extensions of a finite closed $N_1$-system $\Delta_1$ and a finite closed $N_2$-system $\Delta_2$, respectively. Then there exists an irreducible CTPS
\[ A \equiv N_1 \otimes N_2^{\mathrm{opp}} \subset B \]
with dual canonical endomorphism
\[ \theta \equiv \bar\iota \circ \iota \simeq \bigoplus_{\lambda_1 \in \Delta_1,\, \lambda_2 \in \Delta_2} Z_{\lambda_1,\lambda_2}\, \lambda_1 \otimes \lambda_2^{\mathrm{opp}}, \]
whose coupling matrix $Z$ of multiplicities is given by $Z_{\lambda_1,\lambda_2} = \dim \mathrm{Hom}(\alpha^1_{\lambda_1}, \alpha^2_{\lambda_2})$.
The special case when $\Delta_i$ are braided systems is of particular interest for the problem of modular invariants in conformal quantum field theory:
Proposition 1.5. Assume in addition that the closed systems $\Delta_1$ and $\Delta_2$ are braided with unitary braidings $\varepsilon_1$ and $\varepsilon_2$, respectively, turning $\Pi(\Delta_1)$ and $\Pi(\Delta_2)$ (and hence also $\Sigma(\Delta_1)$ and $\Sigma(\Delta_2)$) into braided monoidal categories. If for all $\lambda_i, \mu_i \in \Delta_i$ and all $\phi \in \mathrm{Hom}(\alpha^1_{\lambda_1}, \alpha^2_{\lambda_2})$, $\psi \in \mathrm{Hom}(\alpha^1_{\mu_1}, \alpha^2_{\mu_2})$, one has
(E3) $(\psi \times \phi) \circ \iota_1(\varepsilon_1(\lambda_1, \mu_1)) = \iota_2(\varepsilon_2(\lambda_2, \mu_2)) \circ (\phi \times \psi)$,
then the canonical isometry $w_1 \in \mathrm{Hom}(\theta, \theta^2)$ (defined in Sect. 3 in the proof of Thm. 1.4) and the braiding operator $\varepsilon(\theta, \theta)$ naturally induced by the braidings $\varepsilon_1$ and $\varepsilon_2^{\mathrm{opp}}$ satisfy $\varepsilon(\theta, \theta)\, w_1 = w_1$.
This result answers an open question in quantum field theory, where possible matrices $Z$ are classified which are supposed to describe the restriction of a given two-dimensional modular invariant conformal quantum field theory to its chiral subtheories, while it is actually not clear whether any given solution $Z$ does come from a two-dimensional quantum field theory. This turns out to be true for a large class of solutions, obtained in [3]:
Corollary 1.6. Let $A : I \mapsto A(I)$ be a chiral net of local observables associated with the open intervals $I \subset \mathbb{R}$, such that each $A(I)$ is a type III factor. Let $\Delta_{\mathrm{DHR}}$ be a closed system of mutually inequivalent irreducible DHR-endomorphisms [7] of $A$ with finite dimension, localized in some interval $I_0$, and put $N := A(I_0)$. We assume that $A$ is conformally covariant, implying [6] that the system of restrictions $\Delta := \{\lambda = \lambda_{\mathrm{DHR}}|_N : \lambda_{\mathrm{DHR}} \in \Delta_{\mathrm{DHR}}\}$ is a closed $N$-system. Let $N_1 \subset N$ be a subfactor with canonical endomorphism $\theta \in \Sigma(\Delta)$, and $N \subset M$ its Jones extension. Put $Z_{\lambda,\mu} := \dim \mathrm{Hom}(\alpha^+_\lambda, \alpha^-_\mu)$, where $\alpha^\pm$ are the pair of $\alpha$-inductions [12,20,1] associated with the braidings given by the DHR statistics and its opposite. Then there is a two-dimensional local conformal quantum field theory described by a net $B : O \mapsto B(O)$ of observables associated with the double-cones $O = I \times J$ in $\mathbb{R}^2$, containing subnets of left and right chiral observables $A_L$ and $A_R$ both isomorphic with $A$, such that the local inclusions of chiral observables $A_L(I) \times A_R(J) \subset B(O)$ are CTPS’s with coupling matrix $ZC$. (Here $C$ is the matrix describing sector conjugation in $\Delta$.) Equivalently, the restriction of the vacuum representation of the two-dimensional quantum field theory $B$ to its chiral subtheories is given by
\[ \pi^0|_{A_L \otimes A_R} = \sum_{\lambda,\mu \in \Delta} Z_{\lambda,\bar\mu}\; \pi_\lambda \otimes \pi_\mu . \]
The corollary combines and adapts results from [12,3]. The point is that if the dual canonical endomorphism $\theta$ associated with $N \subset M$ belongs to $\Sigma(\Delta)$, then $\alpha$-induction [12,20,1] provides a pair of extensions $(\iota, \alpha^+)$ and $(\iota, \alpha^-)$ of the closed $N$-system $\Delta$ which satisfies (E1), (E2) as well as (E3) (e.g., [1, I; Def. 3.3, Lemma 3.5 and 3.25]). The associated coupling matrix $Z_{\lambda,\mu}$ is automatically a modular invariant [3]. By the characterization of extensions of local quantum field theories given in [12, Prop. 4.9], the local subfactor $A_L(I_0) \otimes A_R(I_0) \subset B(O_0)$ given by Thm. 1.4 induces an entire net of subfactors, indexed by the double-cones $O$ of two-dimensional Minkowski space. (The charge conjugation $C$ arises due to an anti-isomorphism between $N^{\mathrm{opp}}$ and $N$, cf. [12, Prop. 4.10 ff.].) The statement of Proposition 1.5 is precisely the criterion given in [12] for the resulting two-dimensional quantum field theory to be local. Thus, every modular invariant found by the $\alpha$-induction method given in [3] indeed corresponds to a two-dimensional local conformal quantum field theory extending the given chiral nets of observables.
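To make the phrase "modular invariant coupling matrix" concrete, the following is a minimal numerical sketch, not taken from the paper. It assumes the standard modular data of SU(2) at level k (labels a = 0, ..., k) and checks that a candidate matrix Z has Z_{00} = 1 and commutes with S and T. The function names and the choice of the level 4 example are illustrative assumptions only.

```python
import cmath
import math

def su2_modular_data(k):
    """Assumed standard modular data of SU(2) at level k.

    Returns the S matrix (list of lists) and the diagonal of T (list).
    Labels a = 0..k are twice the spin.
    """
    n = k + 2
    S = [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in range(k + 1)] for a in range(k + 1)]
    central_charge = 3.0 * k / n
    T = [cmath.exp(2j * math.pi * (a * (a + 2) / (4.0 * n) - central_charge / 24.0))
         for a in range(k + 1)]
    return S, T

def is_modular_invariant(Z, S, T, tol=1e-9):
    """Check Z[0][0] == 1, ZS = SZ and ZT = TZ (T diagonal)."""
    n = len(Z)
    for a in range(n):
        for b in range(n):
            zs = sum(Z[a][c] * S[c][b] for c in range(n))
            sz = sum(S[a][c] * Z[c][b] for c in range(n))
            if abs(zs - sz) > tol:
                return False
            # (ZT - TZ)_{ab} = Z_{ab} (T_b - T_a) for diagonal T
            if abs(Z[a][b] * (T[b] - T[a])) > tol:
                return False
    return Z[0][0] == 1

# Illustrative candidates at level 4 (five sectors).
Z_diag = [[1 if a == b else 0 for b in range(5)] for a in range(5)]
Z_d4 = [[0] * 5 for _ in range(5)]
for a in (0, 4):
    for b in (0, 4):
        Z_d4[a][b] = 1          # vacuum block couples sectors 0 and 4
Z_d4[2][2] = 2                  # the middle sector appears with multiplicity 2
Z_bad = [[1 if a == b and a < 2 else 0 for b in range(5)] for a in range(5)]
```

With these definitions, `is_modular_invariant(Z_d4, *su2_modular_data(4))` evaluates the two commutation relations entry by entry. Whether a given modular invariant arises from the α-induction construction of [3] is a separate question that this check does not address.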
2. Extensions of Systems of Endomorphisms
We collect some immediate consequences of the definition of an extension, Def. 1.3, using terminology and notations as in [4,13].
Proposition 2.1. An extension $(\iota, \alpha)$ of a closed $N$-system gives rise to a monoidal functor from the full monoidal C* subcategory of $\mathrm{End}_{\mathrm{fin}}(N)$ with objects $\Pi(\Delta)$ (the set of finite products of elements from $\Delta$) into the monoidal C* category $\mathrm{End}_{\mathrm{fin}}(M)$. In particular, $\lambda\mu \simeq \bigoplus_\nu N^\nu_{\lambda\mu}\, \nu$ implies $\alpha_\lambda\alpha_\mu \simeq \bigoplus_\nu N^\nu_{\lambda\mu}\, \alpha_\nu$ (notwithstanding $\alpha_\lambda$ will be reducible in general), and $\alpha_{\bar\lambda}$ is conjugate to $\alpha_\lambda$.
Proof. The functor maps objects $\lambda_1 \circ \dots \circ \lambda_n$ to $\alpha_{\lambda_1} \circ \dots \circ \alpha_{\lambda_n}$, and intertwiners $T$ to $\iota(T)$ which are again intertwiners by iteration of (E2). It follows from (E2) that $\alpha$ preserves the fusion rules as stated. In particular, $\alpha_{\mathrm{id}_N}$ is an idempotent within $\mathrm{End}_{\mathrm{fin}}(M)$, implying that its dimension is 1, hence it is invertible and must coincide with $\mathrm{id}_M$. Thus the functor preserves the monoidal unit object. It preserves the right monoidal product of intertwiners trivially, and the left monoidal product by (E1). Conjugacy between $\alpha_{\bar\lambda}$ and $\alpha_\lambda$ is a consequence of the following lemma. □
Lemma 2.2. Let $(\iota, \alpha)$ be an extension of a closed $N$-system $\Delta$. Let $R_\lambda \in \mathrm{Hom}(\mathrm{id}_N, \bar\lambda\lambda)$ and $\bar R_\lambda \in \mathrm{Hom}(\mathrm{id}_N, \lambda\bar\lambda)$ be a pair of standard isometries solving the conjugate equations
\[ (1_\lambda \times R_\lambda^*) \circ (\bar R_\lambda \times 1_\lambda) = d(\lambda)^{-1}\, 1_\lambda = (1_{\bar\lambda} \times \bar R_\lambda^*) \circ (R_\lambda \times 1_{\bar\lambda}), \]
and thus implementing the unique left- and right-inverses [13] $\Phi_\lambda$ and $\Psi_\lambda$ for $\lambda \in \Delta$. Then $\iota(R_\lambda)$ and $\iota(\bar R_\lambda)$ induce left- and right-inverses $\Phi_{\alpha_\lambda}$ and $\Psi_{\alpha_\lambda}$ for $\alpha_\lambda$. If either $N \subset M$ has finite index, or $\Delta$ is a finite system, then $d(\alpha_\lambda) = d(\lambda)$, and $\Phi_{\alpha_\lambda}$ and $\Psi_{\alpha_\lambda}$ are the unique standard left- and right-inverses.
Proof. The first statement is an obvious consequence of (E2). If the index $d(\iota)^2$ is finite, then (E1) implies $d(\alpha_\lambda) = d(\lambda)$. If $\Delta$ is finite, then the minimal dimensions $d(\alpha_\lambda)$ are uniquely determined by the fusion rules of $\{\alpha_\lambda, \lambda \in \Delta\}$, and the latter coincide with those of $\{\lambda \in \Delta\}$. Hence again $d(\alpha_\lambda) = d(\lambda)$. Since $d(\lambda)$ are also the dimensions associated with the pair of isometries $\iota(R_\lambda)$, $\iota(\bar R_\lambda)$, the last claim follows by [13, Thm. 3.11]. □
Thus, general properties of standard left- and right-inverses [13] are applicable. We shall in the sequel repeatedly exploit the trace property
\[ d(\rho)\, \Phi_\rho(S^* T) = d(\tau)\, \Phi_\tau(T S^*) \quad \text{if} \quad S, T \in \mathrm{Hom}(\rho, \tau) \]
for standard left-inverses of $\rho, \tau \in \mathrm{End}_{\mathrm{fin}}(M)$, their multiplicativity $\Phi_{\rho\tau} = \Phi_\tau \Phi_\rho$, as well as the equality of standard left- and right-inverses $\Phi_\rho = \Psi_\rho$ on $\mathrm{Hom}(\rho, \rho)$.
3. Construction of the New CTPS’s
We shall prove Theorem 1.4 by the specification of the Q-systems (or “canonical triples”) $(\theta, w, w_1)$, which uniquely determine the asserted subfactors [11]. Longo’s characterization states that $\theta \in \mathrm{End}_{\mathrm{fin}}(A)$ is the dual canonical endomorphism associated with a subfactor $A \subset B$ if (and only if) there is a pair of isometries $w \in \mathrm{Hom}(\mathrm{id}_A, \theta)$ and $w_1 \in \mathrm{Hom}(\theta, \theta^2)$ satisfying
(Q1) $w^* w_1 = \theta(w^*)\, w_1 = d(\theta)^{-1/2}\, \mathbf{1}_A$,
(Q2) $w_1 w_1 = \theta(w_1)\, w_1$, and
(Q3) $w_1 w_1^* = \theta(w_1^*)\, w_1$.
Namely, then the map $w_1^* \theta(\,\cdot\,) w_1$ is the minimal conditional expectation onto its image $A_1 = w_1^* \theta(A) w_1 \subset A$. For $\iota_1 : A_1 \to A$ the inclusion map and $\bar\iota_1 : A \to A_1$ defined by $\theta = \iota_1 \bar\iota_1$, the pair of isometries $w \in \mathrm{Hom}(\mathrm{id}_A, \iota_1\bar\iota_1)$ and $\iota_1^{-1}(w_1) \in \mathrm{Hom}(\mathrm{id}_{A_1}, \bar\iota_1\iota_1)$ achieves the conjugacy between $\iota_1$ and $\bar\iota_1$. By the Jones construction [9], then, the subfactor $A_1 \subset A$ determines its dual subfactor (the Jones extension) $A \subset B$ such that $\theta = \bar\iota\iota$.
Proof of Theorem 1.4. First notice that the multiplicity of $\mathrm{id}_A$ in $\theta$ is $Z_{\mathrm{id}_{N_1},\mathrm{id}_{N_2}} = \dim \mathrm{Hom}(\mathrm{id}_M, \mathrm{id}_M) = 1$, so the asserted subfactor is automatically irreducible.
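In order to show that $\theta$ given in the theorem is the dual canonical endomorphism associated with a subfactor $A \subset B$, we construct the Q-system $(\theta, w, w_1)$ as follows. We first choose a complete system of mutually inequivalent isometries $W_{(\lambda_1,\lambda_2,l)} \equiv W_l \in A \equiv N_1 \otimes N_2^{\mathrm{opp}}$, where $l$ is considered as a multi-index ($\lambda_1 \in \Delta_1$, $\lambda_2 \in \Delta_2$, $l = 1, \dots, Z_{\lambda_1,\lambda_2}$), and put
\[ \theta = \sum_l W_l\, (\lambda_1 \otimes \lambda_2^{\mathrm{opp}})(\,\cdot\,)\, W_l^* . \]
The choice of these isometries is immaterial and affects the subfactor to be constructed only by inner conjugation. Since $\mathrm{Hom}(\mathrm{id}_A, \theta)$ is one-dimensional, the isometry $w$ is already fixed up to an irrelevant complex phase: $w = W_0$, where 0 refers to the multi-index $l = 0 \equiv (\mathrm{id}_{N_1}, \mathrm{id}_{N_2}, 1)$. The second isometry, $w_1$, must be of the form
\[ w_1 = \sum_{l,m,n} (W_l \times W_m) \circ T^n_{lm} \circ W_n^* , \]
where $T^n_{lm} \in \mathrm{Hom}(\nu_1 \otimes \nu_2^{\mathrm{opp}}, (\lambda_1 \otimes \lambda_2^{\mathrm{opp}}) \circ (\mu_1 \otimes \mu_2^{\mathrm{opp}}))$, since these operators span $\mathrm{Hom}(\theta, \theta^2)$. In turn, $T^n_{lm}$ must be of the form
\[ T^n_{lm} = \sum_{e_1,e_2} \zeta^n_{lm,e_1e_2}\; T_{e_1} \otimes (T_{e_2}^*)^{\mathrm{opp}} \qquad (\zeta^n_{lm,e_1e_2} \in \mathbb{C}), \]
where $T_{e_i}$ constitute orthonormal bases of the intertwiner spaces $\mathrm{Hom}(\nu_i, \lambda_i\mu_i)$, since these operators span the spaces $\mathrm{Hom}(\nu_1 \otimes \nu_2^{\mathrm{opp}}, (\lambda_1 \otimes \lambda_2^{\mathrm{opp}}) \circ (\mu_1 \otimes \mu_2^{\mathrm{opp}})) \equiv \mathrm{Hom}(\nu_1, \lambda_1\mu_1) \otimes \mathrm{Hom}(\nu_2^{\mathrm{opp}}, \lambda_2^{\mathrm{opp}}\mu_2^{\mathrm{opp}})$. Note that if $T \in \mathrm{Hom}(\alpha, \beta)$ is isometric in $N$, then $(T^*)^{\mathrm{opp}} \in \mathrm{Hom}(\beta, \alpha)^{\mathrm{opp}} \equiv \mathrm{Hom}(\alpha^{\mathrm{opp}}, \beta^{\mathrm{opp}})$ is isometric in $N^{\mathrm{opp}}$. The labels $e_i$ are again multi-indices ($\lambda$, $\mu$, $\nu$, $e = 1, \dots, \dim \mathrm{Hom}(\nu, \lambda\mu)$).
It remains therefore to determine the complex coefficients $\zeta^n_{lm,e_1e_2}$, such that $w_1$ is an isometry satisfying Longo’s relations (Q1–3) above. To specify these coefficients, we equip the spaces $\mathrm{Hom}(\alpha^1_{\lambda_1}, \alpha^2_{\lambda_2})$ with the non-degenerate scalar products $(\phi, \phi') := \Phi^1_{\lambda_1}(\phi^* \phi')$ (where $\Phi^i_{\lambda_i}$ stand for the induced left-inverses for $\alpha^i_{\lambda_i}$, $i = 1, 2$, cf. Lemma 2.2). With respect to these scalar products, we choose orthonormal bases $\{\phi_l,\ l = 1, \dots, Z_{\lambda_1,\lambda_2}\}$ for all $\lambda_1 \in \Delta_1$, $\lambda_2 \in \Delta_2$, and put
\[ \zeta^n_{lm,e_1e_2} = \sqrt{\frac{d(\lambda_2)\, d(\mu_2)}{d(\theta)\, d(\nu_2)}}\; \Phi^1_{\nu_1}\big[\iota_1(T_{e_1}^*)\, (\phi_l^* \times \phi_m^*)\, \iota_2(T_{e_2})\, \phi_n\big]. \]
This formula is only apparently asymmetric under exchange $1 \leftrightarrow 2$: by the trace property $d(\lambda_2)\, \Phi^2_{\lambda_2}(\phi\phi^*) = d(\lambda_1)\, \Phi^1_{\lambda_1}(\phi^*\phi)$, an orthonormal basis $\psi_l$ of $\mathrm{Hom}(\alpha^2_{\lambda_2}, \alpha^1_{\lambda_1})$ differs from $\phi_l^*$ by a factor $\sqrt{d(\lambda_1)/d(\lambda_2)}$, so that in fact
\[ \zeta^n_{lm,e_1e_2} = \sqrt{\frac{d(\lambda_1)\, d(\mu_1)}{d(\theta)\, d(\nu_1)}}\; \Phi^2_{\nu_2}\big[\iota_2(T_{e_2}^*)\, (\psi_l^* \times \psi_m^*)\, \iota_1(T_{e_1})\, \psi_n\big]. \]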
With these coefficients, condition (Q1) is trivially satisfied, since left multiplication of $w_1$ by $w^*$ singles out the term $l = 0$ due to $W_0^* W_l = \delta_{l0}$. This leaves only terms with $\lambda_i = \mathrm{id}_{N_i}$, hence $\mu_i = \nu_i$, for which $T_{e_i}$ are trivial and $\sqrt{d(\theta)}\, \zeta^n_{0m,e_1e_2} = \delta_{mn}$ (up to cancelling complex phases), so $\sqrt{d(\theta)}\, w^* w_1 = \sum_n W_n W_n^* = \mathbf{1}_A$. For $\theta(w^*) w_1$ the argument is essentially the same.
We turn to the conditions (Q2) and (Q3). Whenever we compute any of the four products occurring, we obtain a Kronecker delta $W_s^* W_t = \delta_{st}$ for one pair of the labels $l, m, n, \dots$ involved, while the remaining operator parts are of the form
\[ (W_l \times W_m \times W_k)\; (T_{e_1} \times 1_{\kappa_1}) T_{f_1} \otimes (((T_{e_2} \times 1_{\kappa_2}) T_{f_2})^*)^{\mathrm{opp}}\; W_n^* , \]
\[ (W_l \times W_m \times W_k)\; (1_{\lambda_1} \times T_{g_1}) T_{h_1} \otimes (((1_{\lambda_2} \times T_{g_2}) T_{h_2})^*)^{\mathrm{opp}}\; W_n^* \]
for the left- and right-hand side of (Q2), $w_1 w_1 = \theta(w_1) w_1$, and in turn,
\[ (W_l \times W_m)\, \big[ T_{e_1} T_{f_1}^* \otimes ((T_{e_2} T_{f_2}^*)^*)^{\mathrm{opp}} \big]\, (W_n \times W_k)^* , \]
\[ (W_l \times W_m)\, \big[ (1_{\lambda_1} \times T_{g_1}^*)(T_{h_1} \times 1_{\kappa_1}) \otimes (((1_{\lambda_2} \times T_{g_2}^*)(T_{h_2} \times 1_{\kappa_2}))^*)^{\mathrm{opp}} \big]\, (W_n \times W_k)^* \]
for the left- and right-hand side of (Q3), $w_1 w_1^* = \theta(w_1^*) w_1$. (In these expressions, we do not specify the respective intertwiner spaces to which the various operators $T$ belong, since these are determined by the context.)
The numerical coefficients of these operators are sums over products of two $\zeta$’s or one $\zeta$ and one $\bar\zeta$, respectively, with a summation over one common label $s = 1, \dots, Z_{\sigma_1,\sigma_2}$ due to the above Kronecker $\delta_{st}$. These sums can be carried out. Namely, the coefficients of the above operators on both sides of (Q2) involve one factor $\zeta^s_{\dots}$ which is a scalar product of the form $\Phi^1_{\sigma_1}(X \phi_s) = (X^*, \phi_s)$ in $\mathrm{Hom}(\alpha^1_{\sigma_1}, \alpha^2_{\sigma_2})$, so summation with the operator $\phi_s^*$ contributing to the other factor $\zeta$ yields $\sum_s (X^*, \phi_s)\, \phi_s^* = X$. Then the coefficients of the above operators on both sides of (Q2) are easily cast into the respective form
\[ \sum_s \zeta^s_{lm,e_1e_2}\, \zeta^n_{sk,f_1f_2} = \sqrt{\frac{d(\lambda_2)\, d(\mu_2)\, d(\kappa_2)}{d(\theta)^2\, d(\nu_2)}} \times \Phi^1_{\nu_1}\big[\iota_1(T_{f_1}^* (T_{e_1}^* \times 1_{\kappa_1}))\, (\phi_l^* \times \phi_m^* \times \phi_k^*)\, \iota_2((T_{e_2} \times 1_{\kappa_2}) T_{f_2})\, \phi_n\big], \]
\[ \sum_s \zeta^s_{mk,g_1g_2}\, \zeta^n_{ls,h_1h_2} = \sqrt{\frac{d(\lambda_2)\, d(\mu_2)\, d(\kappa_2)}{d(\theta)^2\, d(\nu_2)}} \times \Phi^1_{\nu_1}\big[\iota_1(T_{h_1}^* (1_{\lambda_1} \times T_{g_1}^*))\, (\phi_l^* \times \phi_m^* \times \phi_k^*)\, \iota_2((1_{\lambda_2} \times T_{g_2}) T_{h_2})\, \phi_n\big]. \]
Now, since the passage from bases of the form $(T_e \times 1_\kappa) T_f$ to bases $(1_\lambda \times T_g) T_h$ of $\mathrm{Hom}(\nu, \lambda\mu\kappa)$ for any fixed $\nu, \lambda, \mu, \kappa$ is described by unitary matrices, equality of both sides of (Q2) follows.
The case of (Q3) is in the same vein, but slightly more involved. In the coefficients $\sum_s \zeta^s_{lm,e_1e_2}\, \bar\zeta^s_{nk,f_1f_2}$ on the left-hand side, we read again the first factor as a scalar product $(X^*, \phi_s)$ within $\mathrm{Hom}(\alpha^1_{\sigma_1}, \alpha^2_{\sigma_2})$ and perform the summation $\sum_s (X^*, \phi_s)\, \phi_s^* = X$ with the operator $\phi_s^*$ contributing to the second factor. This yields, after application of the trace property for standard left-inverses, the coefficients on the left-hand side of (Q3),
\[ \sum_s \zeta^s_{lm,e_1e_2}\, \bar\zeta^s_{nk,f_1f_2} = \sqrt{\frac{d(\lambda_2)\, d(\mu_2)\, d(\kappa_2)\, d(\nu_2)}{d(\theta)^2\, d(\sigma_2)^2}}\; \frac{d(\lambda_1)\, d(\mu_1)}{d(\sigma_1)} \times \Phi^1_{\mu_1\lambda_1}\big[(\phi_l^* \times \phi_m^*)\, \iota_2(T_{e_2} T_{f_2}^*)\, (\phi_n \times \phi_k)\, \iota_1(T_{f_1} T_{e_1}^*)\big]. \]
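To compute the coefficients $\sum_s \bar\zeta^m_{sk,g_1g_2}\, \zeta^n_{ls,h_1h_2}$ on the right-hand side of (Q3), we first rewrite the second factor as a scalar product $(\phi_s, X)$ within $\mathrm{Hom}(\alpha^1_{\sigma_1}, \alpha^2_{\sigma_2})$. This is achieved by applying the trace property:
\[ \zeta^n_{ls,h_1h_2} = \sqrt{\frac{d(\lambda_2)\, d(\sigma_2)}{d(\theta)\, d(\nu_2)}}\; \frac{d(\lambda_1)\, d(\sigma_1)}{d(\nu_1)}\; \Phi^1_{\sigma_1}\big[\phi_s^*\, \Phi^1_{\lambda_1}\big((\phi_l^* \times 1_{\alpha^2_{\sigma_2}})\, \iota_2(T_{h_2})\, \phi_n\, \iota_1(T_{h_1}^*)\big)\big]. \]
Now the sum over $s$ with $\phi_s$ contributing to $\bar\zeta^m_{sk,g_1g_2}$ can be performed as before, yielding the coefficients on the right-hand side of (Q3) in the form
\[ \sum_s \bar\zeta^m_{sk,g_1g_2}\, \zeta^n_{ls,h_1h_2} = \sqrt{\frac{d(\lambda_2)\, d(\kappa_2)\, d(\sigma_2)^2}{d(\theta)^2\, d(\nu_2)\, d(\mu_2)}}\; \frac{d(\lambda_1)\, d(\sigma_1)}{d(\nu_1)} \times \Phi^1_{\mu_1\lambda_1}\big[(\phi_l^* \times \phi_m^*)\, \iota_2((1_{\lambda_2} \times T_{g_2}^*)(T_{h_2} \times 1_{\kappa_2}))\, (\phi_n \times \phi_k)\, \iota_1((T_{h_1}^* \times 1_{\kappa_1})(1_{\lambda_1} \times T_{g_1}))\big]. \]
Noting that the passage from bases $\sqrt{d(\mu)/d(\sigma)}\; T_e T_f^*$ to bases $\sqrt{d(\sigma)/d(\nu)}\; (1_\lambda \times T_g^*)(T_h \times 1_\kappa)$ of $\mathrm{Hom}(\nu\kappa, \lambda\mu)$ is again described by unitary matrices for any fixed $\nu, \kappa, \lambda, \mu$, we obtain equality of both sides of (Q3).
It remains to show that $w_1$ is an isometry, $w_1^* w_1 = 1$. Performing the multiplication $w_1^* w_1$ yields two Kronecker deltas from the factors $W_l \times W_m$, and two more Kronecker deltas from the factors $T_{e_1} \otimes (T_{e_2}^*)^{\mathrm{opp}}$. Thus
\[ w_1^* w_1 = \sum_{ns}\; \sum_{lm,e_1e_2} \bar\zeta^s_{lm,e_1e_2}\, \zeta^n_{lm,e_1e_2}\; W_s W_n^* , \]
and we have to perform the sums over $l, m, e_1, e_2$ (involving, as sums over multi-indices, the summation over sectors $\lambda_i, \mu_i \in \Delta_i$ for fixed $\nu_i \in \Delta_i$, $i = 1, 2$).
It turns out convenient to express $\zeta^n_{lm,e_1e_2}$ as a scalar product $(\phi_m, X)$ within $\mathrm{Hom}(\alpha^1_{\mu_1}, \alpha^2_{\mu_2})$ as before (with indices relabelled), and to perform the sum over $m$ first. This yields
\[ \sum_{lm,e_1e_2} \bar\zeta^s_{lm,e_1e_2}\, \zeta^n_{lm,e_1e_2} = \sum_{l,e_1e_2} \frac{d(\lambda_2)\, d(\mu_2)}{d(\theta)\, d(\nu_2)}\; \frac{d(\lambda_1)\, d(\mu_1)}{d(\nu_1)} \times \Phi^1_{\nu_1}\Big[\phi_s^*\, \iota_2(T_{e_2}^*)\, \Big(\phi_l \times \Phi^1_{\lambda_1}\big[(\phi_l^* \times 1_{\alpha^2_{\mu_2}})\, \iota_2(T_{e_2})\, \phi_n\, \iota_1(T_{e_1}^*)\big]\Big)\, \iota_1(T_{e_1})\Big]. \]
In this expression, we can perform the sums over $(e_1, \mu_1)$ and over $(e_2, \mu_2)$ after a unitary passage from the bases of orthonormal isometries $T_e$ of $\mathrm{Hom}(\nu, \lambda\mu)$ to the bases $\sqrt{d(\lambda)\, d(\nu)/d(\mu)}\; (1_\lambda \times T_{e'}^*)(\bar R_\lambda \times 1_\nu)$, and obtain after use of the conjugate equations for $R_\lambda$, $\bar R_\lambda$,
\[ \sum_{lm,e_1e_2} \bar\zeta^s_{lm,e_1e_2}\, \zeta^n_{lm,e_1e_2} = \sum_{l,\lambda_1\lambda_2} \frac{d(\lambda_2)^2}{d(\theta)}\; \Phi^1_{\nu_1}\big[\Psi^2_{\lambda_2}(\phi_l \phi_l^*) \times (\phi_s^* \phi_n)\big]. \]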
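Here $\Psi^2_{\lambda_2}$ is the standard right-inverse implemented by $\iota_2(\bar R_{\lambda_2})$ which coincides with $\Phi^2_{\lambda_2}$ on $\mathrm{Hom}(\alpha^2_{\lambda_2}, \alpha^2_{\lambda_2})$, and can be evaluated by the trace property:
\[ \Psi^2_{\lambda_2}(\phi_l \phi_l^*) = \Phi^2_{\lambda_2}(\phi_l \phi_l^*) = \frac{d(\lambda_1)}{d(\lambda_2)}\, \Phi^1_{\lambda_1}(\phi_l^* \phi_l) = \frac{d(\lambda_1)}{d(\lambda_2)} . \]
The sum over $l$ yields the multiplicity factor $Z_{\lambda_1,\lambda_2}$. Hence
\[ \sum_{lm,e_1e_2} \bar\zeta^s_{lm,e_1e_2}\, \zeta^n_{lm,e_1e_2} = \sum_{\lambda_1,\lambda_2} \frac{d(\lambda_1)\, d(\lambda_2)\, Z_{\lambda_1,\lambda_2}}{d(\theta)}\; \Phi^1_{\nu_1}(\phi_s^* \phi_n) = \delta_{sn}, \]
and hence $w_1^* w_1 = \sum_n W_n W_n^* = 1$. This completes the proof of the theorem. □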
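The last step of the proof uses that the weights $d(\lambda_1) d(\lambda_2) Z_{\lambda_1,\lambda_2} / d(\theta)$ sum to one, i.e. that $d(\theta) = \sum Z_{\lambda_1,\lambda_2}\, d(\lambda_1)\, d(\lambda_2)$ by additivity and multiplicativity of the dimension. The following sketch evaluates this sum numerically. The choice of dimensions (those of SU(2) at level 4) and all function names are illustrative assumptions, not data from the paper.

```python
import math

def statistical_dimensions(k):
    """Assumed dimensions d_a = sin(pi (a+1)/(k+2)) / sin(pi/(k+2)), a = 0..k."""
    n = k + 2
    return [math.sin(math.pi * (a + 1) / n) / math.sin(math.pi / n)
            for a in range(k + 1)]

def dim_theta(Z, d1, d2):
    """d(theta) = sum over sectors of Z[a][b] * d1[a] * d2[b]."""
    return sum(Z[a][b] * d1[a] * d2[b]
               for a in range(len(d1)) for b in range(len(d2)))

def normalized_weights(Z, d1, d2):
    """Weights d1[a] d2[b] Z[a][b] / d(theta); they sum to 1 by construction."""
    total = dim_theta(Z, d1, d2)
    return [[Z[a][b] * d1[a] * d2[b] / total for b in range(len(d2))]
            for a in range(len(d1))]

# Illustrative input: five sectors, diagonal coupling matrix.
d = statistical_dimensions(4)
Z_diag = [[1 if a == b else 0 for b in range(5)] for a in range(5)]
index_diag = dim_theta(Z_diag, d, d)  # sum of squared dimensions
```

Since $\theta = \bar\iota \circ \iota$, the value `dim_theta(...)` is the square of $d(\iota)$, i.e. the index of the subfactor under the conventions used in the text.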
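Proof of Proposition 1.5. Left multiplication of $w_1$ with the induced braiding operator
\[ \varepsilon(\theta, \theta) = \sum_{m l m' l'} (W_{m'} \times W_{l'}) \circ \big(\varepsilon_1(\lambda_1, \mu_1) \otimes (\varepsilon_2(\lambda_2, \mu_2)^*)^{\mathrm{opp}}\big) \circ (W_l \times W_m)^* \]
amounts to a unitary passage from bases $T_e \in \mathrm{Hom}(\nu, \lambda\mu)$ to bases $\varepsilon(\lambda, \mu) T_e \in \mathrm{Hom}(\nu, \mu\lambda)$. But by (E3), the coefficients $\zeta^n_{lm,e_1e_2}$ are invariant under these changes of bases. Hence $\varepsilon(\theta, \theta)\, w_1 = w_1$. □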
4. Joint Inclusions and Normality
The main purpose of this section is to introduce and discuss the notions of “normality” and “essential normality” which are potential properties of pairs of commuting von Neumann subalgebras. They are particularly relevant for the physical application we have in mind since the embedding of chiral subtheories into two-dimensional conformal quantum field theories always gives rise to essentially normal CTPS’s [17]. We start by introducing these and related notions in the broader context of “joint inclusions” of von Neumann algebras, i.e., triples $(A, B, C)$ such that $A \vee B \subset C$. We first record some more or less elementary properties of joint inclusions, before we give a simple characterization of normality in the case of CTPS’s in terms of the coupling matrix.
Definition 4.1. Let $\Lambda = (A, B, C)$ be a joint inclusion of von Neumann algebras. We denote by $\Lambda^c := (B^c, A^c, C)$ the joint inclusion of the relative commutants in $C$. We write $\Lambda_1 \subset \Lambda_2$ if $C_1 = C_2$ and $A_1 \subset A_2$, $B_1 \subset B_2$, and call $\Lambda_2$ intermediate w.r.t. $\Lambda_1$. We call $\Lambda$ symmetric if $\Lambda \subset \Lambda^c$ (i.e., $A$ and $B$ commute with each other). We call $\Lambda$ normal if $\Lambda = \Lambda^c$ (i.e., $A$ and $B$ are each other’s relative commutants). We call $\Lambda$ essentially normal if $\Lambda^c = \Lambda^{cc}$.
One has the following elementary facts.
Proposition 4.2.
1. $\Lambda^c = \Lambda^{ccc}$.
2. If $\Lambda_1 \subset \Lambda_2$ then $\Lambda_2^c \subset \Lambda_1^c$.
3. If $\Lambda$ is symmetric then $\Lambda \subset \Lambda^{cc} \subset \Lambda^c$.
4. $\Lambda$ is essentially normal if and only if $\Lambda$ and $\Lambda^c$ are both symmetric.
5. Every symmetric $\Lambda$ has a normal intermediate joint inclusion.
6. If $\Lambda$ is normal then one has $Z(A) = (A \vee B)^c = Z(B) \supset Z(C)$, so $A$ and likewise $B$ are factors if and only if $A \vee B \subset C$ is irreducible, and in this case $C$ necessarily is also a factor.
Proof. Assertions 1–4 are obvious. A normal intermediate joint inclusion is given by, e.g., $(B^c, B^{cc}, C)$. Assertion 6 holds since $(A \vee B)^c = A^c \cap B^c$. □
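As a purely illustrative finite-dimensional example (not taken from the paper), the notions of Definition 4.1 can be checked by hand in a full matrix algebra. Assume $C = M_n(\mathbb{C})$ with $n \ge 2$, all relative commutants being taken in $C$, and start from the trivial symmetric joint inclusion $\Lambda$ below; the name $\Lambda'$ for the intermediate inclusion is a notation introduced only for this example.

```latex
% Illustrative example under the stated assumptions: C = M_n(\mathbb{C}), n \ge 2.
\begin{align*}
\Lambda &= (\mathbb{C}\mathbf{1},\ \mathbb{C}\mathbf{1},\ C), &
\Lambda^{c} &= (C,\ C,\ C), &
\Lambda^{cc} &= (\mathbb{C}\mathbf{1},\ \mathbb{C}\mathbf{1},\ C) = \Lambda, \\
\Lambda' &:= (B^{c},\ B^{cc},\ C) = (C,\ \mathbb{C}\mathbf{1},\ C), &
(\Lambda')^{c} &= (C,\ \mathbb{C}\mathbf{1},\ C) = \Lambda' .
\end{align*}
```

Here $\Lambda$ is symmetric and $\Lambda \subset \Lambda^{cc} \subset \Lambda^c$, in accordance with Prop. 4.2(3). Since $\Lambda^c$ is not symmetric for $n \ge 2$, $\Lambda$ is not essentially normal, consistent with Prop. 4.2(4). The joint inclusion $\Lambda'$ is normal and intermediate w.r.t. $\Lambda$, which illustrates Prop. 4.2(5).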
While these statements are in quite some parallelism to the theory of self-adjoint extensions of symmetric unbounded operators, Assertion 5 is a departure from this parallelism, since self-adjoint extensions do not always exist for symmetric operators. The parallelism seems to become closer if one restricts to the subclass of tensor product subfactors (canonical or not) which are obviously symmetric joint inclusions. But neither $((\mathbf{1} \otimes B)^c, (A \otimes \mathbf{1})^c, C)$ nor the joint inclusion $((\mathbf{1} \otimes B)^c, (\mathbf{1} \otimes B)^{cc}, C)$ in 4.2(5) will again be a tensor product subfactor in general.
While we have no general criterion for the existence of normal intermediate tensor product subfactors, the following proposition gives a simple characterization of normality in the case of CTPS’s, which entails certain constraints on the structure of $A_1 \otimes B_1 \subset C$ for which a normal CTPS $A \otimes B \subset C$ can possibly be intermediate. These constraints will apply to the embeddings of left and right chiral subtheories into two-dimensional conformal quantum field theories, which by [17] give rise to CTPS’s whose relative commutants are again tensor product subfactors, hence symmetric. Thus these local subfactors are essentially normal CTPS’s by Prop. 4.2(4), and the normal intermediate subfactor corresponds to the maximally extended chiral algebras (going along with permutation modular invariants). We do not evaluate these constraints here, but it is clear that the total dual canonical endomorphism must be of the form $(\bar\iota_A \otimes \bar\iota_B) \circ \theta \circ (\iota_A \otimes \iota_B) \simeq \bigoplus_\alpha (\bar\iota_A\, \alpha\, \iota_A) \otimes (\bar\iota_B\, \sigma(\alpha)\, \iota_B)$, where $\theta$ corresponding to the normal intermediate inclusion is of the special “permutational” form (N3) as described in the following proposition.
Proposition 4.3. Let $A \otimes B \subset C$ be a CTPS of type III with coupling matrix $Z$, i.e., the dual canonical endomorphism is of the form
\[ \theta \simeq \bigoplus_{\alpha \in \Delta_A,\, \beta \in \Delta_B} Z_{\alpha,\beta}\; \alpha \otimes \beta, \]
where $\Delta_A \ni \mathrm{id}_A$ and $\Delta_B \ni \mathrm{id}_B$ are two sets of mutually inequivalent irreducible endomorphisms in $\mathrm{End}_{\mathrm{fin}}(A)$ and $\mathrm{End}_{\mathrm{fin}}(B)$. Then the following conditions are equivalent.
(N1) The joint inclusion $(A \otimes \mathbf{1}_B, \mathbf{1}_A \otimes B, C)$ is normal, i.e., $A \otimes \mathbf{1}_B$ and $\mathbf{1}_A \otimes B$ are each other’s relative commutants in $C$.
(N2) The coupling matrix couples no non-trivial sector of $A$ to the trivial sector of $B$, and vice versa, i.e., $Z_{\alpha,\mathrm{id}_B} = \delta_{\alpha,\mathrm{id}_A}$ and $Z_{\mathrm{id}_A,\beta} = \delta_{\beta,\mathrm{id}_B}$.
(N3) The sets $\Delta_A$ and $\Delta_B$ are closed $A$- and $B$-systems, respectively, i.e., they are both closed under conjugation and fusion. There is a bijection $\sigma : \Delta_A \to \Delta_B$ which preserves the fusion rules, i.e., $\dim \mathrm{Hom}(\alpha_1, \alpha_2\alpha_3) = \dim \mathrm{Hom}(\sigma(\alpha_1), \sigma(\alpha_2)\sigma(\alpha_3))$. The matrix $Z$ is the permutation matrix for this bijection, i.e., $Z_{\alpha,\beta} = \delta_{\sigma(\alpha),\beta}$.
The proof is published in [17, Lemma 3.4 and Thm. 3.6].
Making contact with the new CTPS’s in Thm. 1.4, we first point out that in general they will not be normal, since among the coupling matrices constructed in [3] there are those which are not of the form (N2, N3).
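Conditions (N2) and the permutation-matrix part of (N3) are simple tests on the entries of $Z$. The sketch below checks them for a coupling matrix given as a list of rows. It does not check that the bijection preserves the fusion rules, which would require the fusion coefficients as additional input. The function names and the convention that index 0 labels the trivial sector are assumptions made for this illustration.

```python
def satisfies_n2(Z, id_a=0, id_b=0):
    """(N2): Z[a][id_b] = delta(a, id_a) and Z[id_a][b] = delta(b, id_b)."""
    column_ok = all(Z[a][id_b] == (1 if a == id_a else 0)
                    for a in range(len(Z)))
    row_ok = all(Z[id_a][b] == (1 if b == id_b else 0)
                 for b in range(len(Z[id_a])))
    return column_ok and row_ok

def permutation_of(Z):
    """Return sigma (as a list with sigma[a] = b) if Z is a permutation
    matrix, i.e. Z[a][b] = delta(sigma(a), b); return None otherwise."""
    n = len(Z)
    if any(len(row) != n for row in Z):
        return None
    sigma = []
    for row in Z:
        # each row must contain exactly one 1 and otherwise 0
        if sorted(row) != [0] * (n - 1) + [1]:
            return None
        sigma.append(row.index(1))
    # each column must be hit exactly once
    if sorted(sigma) != list(range(n)):
        return None
    return sigma

# Illustrative inputs: a transposition of two non-trivial sectors, and a
# matrix coupling the trivial sector to a non-trivial one.
Z_perm = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
Z_coupled = [[1, 0, 1], [0, 2, 0], [1, 0, 1]]
```

For a matrix such as `Z_coupled`, the entry coupling a non-trivial sector to the trivial one violates (N2), so by Prop. 4.3 a CTPS with such a coupling matrix cannot be normal.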
The simplest case $N_1 = N_2 = M$, hence $Z = \mathbf{1}$, has been known for a while [12], and clearly is normal by Prop. 4.3. Specifically, it describes the “diagonal” extension of the chiral observables by local two-dimensional observables carrying opposite chiral charges. We conclude from Prop. 4.3 that the left and right chiral observables are each other’s relative commutants within this two-dimensional theory, and the same holds whenever the coupling matrix in Cor. 1.6 satisfies condition (N2, N3), i.e., describes a permutation modular invariant (cf. [17]).
In the abstract mathematical setting, the subfactors with $Z = \mathbf{1}$ constructed in [12] were recognized [14] (up to some trivial tensoring with a type III factor) as the type II asymptotic subfactor [16] associated with $\sigma(M) \subset M$, where $\sigma \equiv \bigoplus_{\lambda \in \Delta} \lambda$. As the asymptotic subfactor $M \vee M^c \subset M_\infty$ associated with a fixed point inclusion $M^G \subset M$ for an outer action of a group $G$ provides the same category of $M_\infty$-$M_\infty$ bimodules as a fixed point inclusion for an outer action of the quantum double $D(G)$ on $M_\infty$, general asymptotic subfactors in turn are considered [16,5] as generalized quantum doubles. General asymptotic subfactors are CTPS’s, i.e., $M \vee M^c \simeq M \otimes M^c$ are in a tensor product position within $M_\infty$, and every irreducible $M \vee M^c$-$M \vee M^c$ bimodule associated with the asymptotic subfactor respects the tensor product [16]. They are normal, i.e., $M$ and $M^c$ are each other’s relative commutant in $M_\infty$. Moreover, the system of $M_\infty$-$M_\infty$ bimodules associated with an asymptotic subfactor has a non-degenerate braiding [16,8]. We do not know at present whether the new CTPS’s always share this braiding property, which ought to be tested with methods as in [8].
5. Conclusion
We have shown the existence of a class of new subfactors associated with extensions of closed systems of sectors. The proof proceeds by establishing the corresponding Q-systems in terms of certain matrix elements for the transition between two extensions.
The new subfactors are canonical tensor product subfactors and include the asymptotic subfactors. Among the new subfactors, there are the local subfactors of two-dimensional conformal quantum field theory associated with certain modular invariants, thereby establishing the expected existence of these theories. We also gave a characterization of normality of CTPS’s which corresponds to the maximal subtheories of chiral observables in these models. Acknowledgement. I am deeply indebted to Y. Kawahigashi, M. Izumi, T. Matsui, and I. Ojima who made possible my visit to Japan during the summer of 1999 where the present work was completed. I want to thank all of them as well as H. Kosaki, Y. Watatani, and T. Masuda for discussions, and for their hospitality extended to me at the Department of Mathematical Sciences, University of Tokyo, the Graduate School of Mathematics, Kyushu University, and the Research Institute for Mathematical Sciences, Kyoto University. I also thank H. Kurose for giving me the opportunity to present these results at the workshop “Advances in Operator Algebras” [18] held at RIMS, Kyoto. Financial support by a Grant-in-Aid for Scientific Research from the Ministry of Education (Japan) is gratefully acknowledged. Finally, I thank J. Böckenhauer for sending me a preliminary manuscript on related issues from a complementary perspective [2].
References
1. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. I, Commun. Math. Phys. 197, 361–386 (1998), II, ibid. 200, 57–103 (1999), and III, ibid. 205, 183–229 (1999)
2. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors. Preprint math.OA/9911239, Commun. Math. Phys., to appear
3. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
4. Doplicher, S., Roberts, J.E.: A new duality theory for compact groups. Invent. Math. 98, 157–218 (1989)
5. Evans, D.E., Kawahigashi, Y.: Quantum Symmetries on Operator Algebras. Oxford: Oxford University Press, 1998
6. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–36 (1996)
7. Haag, R.: Local Quantum Physics. Berlin: Springer, 1996
8. Izumi, M.: The structure of sectors associated with the Longo–Rehren inclusions. I, Kyoto preprint (1999)
9. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
10. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Preprint math.OA/9903104 (1999)
11. Longo, R.: A duality for Hopf algebras and for subfactors. I. Commun. Math. Phys. 159, 133–150 (1994)
12. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
13. Longo, R., Roberts, J.E.: A theory of dimension. K-Theory 11, 103–159 (1997)
14. Masuda, T.: An analogue of Longo’s canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Int. J. Math. 8, 249–265 (1997)
15. Moore, G., Seiberg, N.: Naturality in conformal field theory. Nucl. Phys. B313, 16–40 (1989)
16. Ocneanu, A.: Quantum symmetry, differential geometry, and classification of subfactors. Univ. Tokyo Seminary Notes 45 (1991) (notes recorded by Y. Kawahigashi)
17. Rehren, K.-H.: Chiral observables and modular invariants. Commun. Math. Phys. 208, 689–712 (2000)
18. Rehren, K.-H.: New subfactors associated with closed systems of sectors. Preprint math.OA/9911148 (1999), in: Progress in Operator Algebras, Kyoto 1999 Proceedings, ed. H. Kurose, RIMS Kokyuroku 1131, 32–39 (2000)
19. Rehren, K.-H., Stanev, Ya.S., Todorov, I.T.: Characterizing invariants for local extensions of current algebras. Commun. Math. Phys. 174, 605–633 (1996)
20. Xu Feng: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 349–403 (1998)
21. Xu Feng: Jones-Wassermann subfactors for disconnected intervals. Preprint q-alg/9704003 (1997)
Communicated by H. Araki
Commun. Math. Phys. 211, 407 – 412 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Sets of Fibre Homotopy Classes and Twisted Order Parameter Spaces Stefan Bechtluft-Sachs, Marco Hien Naturwissenschaftliche Fakultät I, Universität Regensburg, Universitätsstraße 31, 93053 Regensburg, Germany. E-mail:
[email protected];
[email protected] Received: 26 January 1999/ Accepted: 7 December 1999
Abstract: We propose a machinery to calculate the set of global defect indices of a regularly defected order taking values in arbitrary parameter bundles.
1. Introduction
An ordered medium is conveniently described by a section $\Sigma$ of a fibre bundle $E$ over a manifold $M$. In nematics of linear particles for instance, the bundle $E = PTM$ is the projective tangent bundle of the coordinate space $M$ (see [4–6]). In general $\Sigma$ is defined outside some defect subset $\Delta \subset M$ only. In the sequel $\Delta$ will be assumed a submanifold in which case the order is called regularly defected. Let $SN$ denote the sphere bundle of the normal bundle $N$ of $\Delta$. By restriction to the boundary of a tubular neighbourhood identified with $SN$ we get a map $\sigma : SN \to E$ whose fibre homotopy class $[\sigma]_\Delta$ is called the global defect index, see [2]. It describes the behaviour of the order in the vicinity of the defect up to continuous deformations. Recall that two bundle maps $\sigma$, $\sigma'$ are fibre homotopic, if there is a homotopy $H$ between them which is at every stage compatible with the bundle projections. The local defect index $\iota(\sigma)$ is the homotopy class of the restriction of $\sigma$ to the fibre over a point $x_0 \in \Delta$. Our interest lies in the interplay between the global defect index $[\sigma]_\Delta$ and the local defect index $\iota(\sigma)$.
In [2] we showed that the global defect index of an ordered medium is not in general determined by the local defect index. We also developed a method to calculate the set of global defect indices if the normal bundle of the defect as well as the order parameter bundle are trivial. In the present investigation we extend this to problems involving nontrivial bundles, such as nematics in a coordinate space with nontrivial tangent bundle, see Sect. 4.2.
We will slightly reformulate the setting. Let $V$ denote the typical fibre of the order parameter bundle $E$. The set $\mathcal{A} := \bigcup_{x \in \Delta} \mathrm{Map}(SN_x, E_x)$ carries a natural topology which
makes the obvious map $\mathcal{A} \to \Delta$ a fibre bundle with fibre $F = \mathrm{Map}(S^{n-1}, V)$. We denote by $\mathrm{Map}(SN, E)_\Delta \subset \mathrm{Map}(SN, E)$ the subset of fibre preserving maps and by $\Gamma(\mathcal{A})$ the set of cross sections of $\mathcal{A} \to \Delta$. The Exponential Law provides a homeomorphism $\Gamma(\mathcal{A}) \to \mathrm{Map}(SN, E)_\Delta$ hence a bijection between the set $[\Delta, \mathcal{A}]_\Gamma$ of fibre homotopy classes of sections of $\mathcal{A} \to \Delta$ and the set $[SN, E]_\Delta$ of global defect indices (see [3], chapter VII.2). We are also interested in the set $[SN, E]^\varphi_\Delta$ of global defect indices with a fixed local defect index $\varphi$.
More generally we first calculate the set $[\Delta, \mathcal{A}]_\Gamma$ of fibre homotopy classes of sections of $\mathcal{A}$ for an arbitrary fibre bundle $\pi : \mathcal{A} \to \Delta$. We need to assume that $\Delta = C_f$ is the mapping cone of a cofibration $f : A \to X$ with an H-cogroup $A$, see Sect. 2 or [10] for these notions. The most prominent case is that of a cell complex. Then $A = S^k = \partial D^{k+1}$ is a sphere and $\Delta = C_f$ arises from $X$ by attaching a cell of dimension $k + 1$. We will present a method to calculate $[\Delta, \mathcal{A}]_\Gamma$ roughly in terms of $[X, \mathcal{A}|_X]_\Gamma$ and $[A, F]$. In the case of a cell complex this amounts to a cell-by-cell description of $[\Delta, \mathcal{A}]_\Gamma$.
Our approach will require working in the category of pointed spaces first. Choose base points $x_0 \in \Delta$ and $\varphi \in F := \pi^{-1}(x_0)$. The set $[[\Delta, \mathcal{A}]]_\Gamma$ of fibre homotopy classes of sections relative basepoint can be computed from an exact sequence of Baues ([1]) generalizing the ordinary homotopy sequence of a cofibration. The set $[\Delta, \mathcal{A}]_\Gamma$ is then obtained by dividing out the action of $\pi_1(F)$. A subtlety arising here is the fact that the connecting map in the sequence in [1] is not simply an equivariant map with respect to the action of $\pi_1(F)$ but is itself given by a group action. Proposition 1 gives a compatibility formula for these two actions.
2. A Sequence of Sets of Fibre Homotopy Classes
We briefly recall the long exact sequence of sets of fibre homotopy classes of sections from [1]. This sequence generalizes the homotopy sequence of a cofibration, see [3], p.
447. For the moment we are working in the category of topological spaces with basepoint ∗. For topological spaces X and Y in this category we denote by [[X, Y ]] the set of homotopy classes relative basepoints. Let f : A → X be a cofibration, i.e. any map ¯ : X × I → Y . Thus the G : A × I ∪ X × {0} → Y extends to a homotopy G problem of extending maps A → Y to X descends to the corresponding extension problem in the homotopy category. For example, if M is a manifold and 1 ⊂ M a submanifold, the inclusion 1 ,→ M is a cofibration. The mapping cone of f is the topological space Cf := X ∪f CA arising from X by attaching the reduced cone CA := (A×I )/((A×0)∪({∗}×I )) via the mapping f . Now the inclusion i : X → Cf is again a cofibration and one easily sees that its mapping cone is homotopy equivalent to the reduced suspension SA := CA/(A × {1}). The reader unfamiliar with these notions is referred to [3,10]. In the sequel we assume that A is an H-cogroup, which means that for any space Y the set [[A, Y ]] carries a group structure natural in Y . A typical example of an H-cogroup is the reduced suspension SX of any space X; the spheres S n , n ∈ N for instance. Consider a cofibration f : A → X and a locally trivial fibration (fibre bundle) π : A → Cf with fibre F := π −1 (∗). We denote by 6∗ X := X × S 1 /{∗} × S 1 the pointed suspension of X and by B → 6∗ X the pull-back of A via the canonical projection 6∗ X → X. Assuming that there is a fixed cross section σ : Cf → A with
Sets of Fibre Homotopy Classes
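restriction s := σ|_X we have the exact sequence (Theorem 2.4.1 (B) in [1])
\[
[[\Sigma_* X, B]]^s_\Gamma \xrightarrow{\;w_f^\#\;} [[SA, F]] \xrightarrow{\;\sigma^+\;} [[C_f, \mathcal{A}]]_\Gamma \xrightarrow{\;f^\#\;} [[X, \mathcal{A}|_X]]_\Gamma \xrightarrow{\;i^\#\;} [[A, F]] \tag{1}
\]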
of pointed sets. Here w_f^# is a group homomorphism and σ^+ is defined by a group action of [[SA, F]] on [[C_f, 𝒜]]_Γ which we will write as σ^+([α]) = [σ]_Γ + [α]. We recall its definition from [1], p. 59, which we will need in the next section. Let µ : C_f → C_f ∨ SA denote the map obtained by collapsing A × {1/2} ⊂ C_f to a point and let i : F ↪ 𝒜 be the inclusion. For a section σ : C_f → 𝒜 and α : SA → F as above the composition
\[
s_\alpha : C_f \xrightarrow{\;\mu\;} C_f \vee SA \xrightarrow{\;\sigma \vee i\circ\alpha\;} \mathcal{A}
\]
is not a cross section, but homotopic over C_f relative X to a cross section σ_α. Define [σ]_Γ + [α] := [σ_α]_Γ.

3. Free Homotopy Classes

The exact sequence (1) permits us to calculate the set [[C_f, 𝒜]]_Γ of section homotopy classes relative basepoint, whereas our proper interest lies in the set [C_f, 𝒜]_Γ of section homotopy classes without regarding basepoints. The latter is canonically identified with the orbit set of an action of π_1(F) on [[C_f, 𝒜]]_Γ, which we now proceed to describe. By evaluation on the base point x_0 ∈ X we have a fibration π : Γ(𝒜) → F. Given a section s of 𝒜 we can lift any loop γ in F to a path γ̄ in Γ(𝒜) starting with s. The fibre homotopy class of the endpoint γ̄(1) does not depend on the choices involved. Thus [γ] · [s]_Γ := [γ̄(1)]_Γ is well-defined. We have a canonical bijection
\[
[[\Delta, \mathcal{A}]]_\Gamma / \pi_1(F) \to [\Delta, \mathcal{A}]^{[\varphi]}_\Gamma ,
\]
where the right-hand side denotes the set of free homotopy classes of cross sections s of p : 𝒜 → Δ for which s(x_0) lies in the arc-component of ϕ. For the map σ^+ of (1) we have the following formula:

Proposition 1.
\[
[\gamma]\cdot([\sigma]_\Gamma + [\alpha]) = [\gamma]\cdot\sigma^+([\alpha]) = [\gamma]\cdot[\sigma]_\Gamma + [\gamma]\cdot[\alpha]. \tag{2}
\]
Thus σ^+ is equivariant if and only if σ is a fixed point for the action of π_1(F) on [[C_f, 𝒜]]_Γ.

Proof. Let γ̄ denote a lifting of γ over the fibration Map(SA, F) → F with γ̄(0) = α and γ̃ denote a lifting of γ over the fibration 𝒜 → F with γ̃(0) = σ, so that
\[
[\bar\gamma(1)] = [\gamma]\cdot[\alpha] \quad\text{and}\quad [\tilde\gamma]_\Gamma = [\gamma]\cdot[\sigma]_\Gamma .
\]
By the Exponential Law γ̄ and γ̃ define maps
\[
\bar G : SA \times [0,1] \to F \quad\text{and}\quad \tilde G : C_f \times [0,1] \to C_f .
\]
Consider the map
\[
\Sigma : C_f \times [0,1] \xrightarrow{\;\mu\times\mathrm{id}\;} (C_f \vee SA) \times [0,1] \xrightarrow{\;\tilde G \cup \bar G\;} \mathcal{A}.
\]
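Now p ◦ Σ is homotopic relative X × [0, 1] to the projection pr : C_f × [0, 1] → C_f along a homotopy H : C_f × [0, 1] × [0, 1] → C_f. Let K denote the mapping completing the following commutative diagram:
\[
\begin{array}{ccc}
C_f \times [0,1] \times 0 \,\cup\, X \times [0,1] \times [0,1] & \xrightarrow{\;\Sigma \,\cup\, \Sigma|_X \circ \mathrm{pr}\;} & \mathcal{A} \\
\downarrow & \nearrow K & \downarrow p \\
C_f \times [0,1] \times [0,1] & \xrightarrow{\;H\;} & C_f .
\end{array} \tag{3}
\]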
We define K_{t,s} := K|_{C_f × {t} × {s}}, t, s ∈ [0, 1]. Directly from the definition of the two group actions, see [1], p. 59, we conclude
\[
[K_{0,1}]_\Gamma = [\sigma]_\Gamma + [\alpha] \quad\text{and}\quad [K_{1,1}]_\Gamma = [\gamma]\cdot[\sigma]_\Gamma + [\gamma]\cdot[\alpha]. \tag{4}
\]
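The diagram (3) includes the following commutative diagram:
\[
\begin{array}{ccc}
C_f \times 0 \times 1 \,\cup\, \{*\} \times [0,1] \times 1 & \xrightarrow{\;K_{0,1} \cup \gamma\;} & \mathcal{A} \\
\downarrow & \nearrow K_{\cdot,1} & \downarrow p \\
C_f \times [0,1] \times 1 & \xrightarrow{\;\mathrm{pr}\;} & C_f
\end{array}
\]
from which we obtain [K_{1,1}]_Γ = [γ] · [K_{0,1}]_Γ = [γ] · ([σ]_Γ + [α]). Our assertion now follows by combining this equation with (4). □

4. Examples and Applications

If Δ is obtained from X by attaching a single m-cell e^m, i.e. Δ = C_f, where f : S^{m−1} → X is the attaching map, (1) reads
\[
[[\Sigma_* X, B]]^s_\Gamma \xrightarrow{\;w_f^\#\;} \pi_m(F) \xrightarrow{\;\sigma^+\;} [[\Delta, \mathcal{A}]]_\Gamma \xrightarrow{\;f^\#\;} [[X, \mathcal{A}|_X]]_\Gamma \xrightarrow{\;i^\#\;} \pi_{m-1}(F).
\]
Let Δ be a connected finite cell complex of dimension N and c_m be the number of m-cells of Δ. By induction over the skeleta we immediately get an estimate:

Proposition 2.
\[
\#[\Delta, \mathcal{A}]^{[\varphi]}_\Gamma \le \#[[\Delta, \mathcal{A}]]^{[\varphi]}_\Gamma \le \prod_{m=1}^{N} \big(\#\pi_m(F)\big)^{c_m}.
\]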
In order to apply this to the setting of defect topology, we have to compute π_m(F) = π_m(Map(SN_{x_0}, E_{x_0}), α). This group is involved in a long exact sequence
\[
\cdots \to \pi_{m+1}(E_{x_0}) \xrightarrow{\;\rho_\alpha\;} \pi_{m+n-1}(E_{x_0}) \to \pi_m(\mathrm{Map}(S^{n-1}, E_{x_0}), \alpha) \xrightarrow{\;\tau_\#\;} \pi_m(E_{x_0}) \to \cdots , \tag{5}
\]
where ρ_α : π_{m+1}(E_{x_0}) → π_{m+n−1}(E_{x_0}), β ↦ [α, β] is the Whitehead product with α ∈ π_{n−1}(E_{x_0}), see [9]. We also have chosen basepoints s_0 ∈ S^{n−1} = SN_{x_0}, v_0 = α(s_0) ∈ E_{x_0} and defined the map τ to be the evaluation at s_0. If Δ = S^m we get a bijection σ^+ : π_m(F) → [[S^m, 𝒜]]_Γ. The action of π_1(F) on [[S^m, 𝒜]]_Γ pulls back to an action θ of π_1(F) on π_m(F) and [[S^m, 𝒜]]_Γ is in bijection with π_m(F)/θ. We get
Proposition 3. Either [S^m, 𝒜]^{[ϕ]}_Γ is empty or
\[
\#\pi_m(F) / \#\pi_1(F) \le \#[S^m, \mathcal{A}]^{[\varphi]}_\Gamma \le \#\pi_m(F).
\]
Together with the sequence (5) it is now possible to explicitly calculate lower and upper bounds for #[SN, E]^{[ϕ]}_{S^m} in cases where the corresponding homotopy groups of the fibre E_{x_0} are known.

4.1. Example. Let N → S^m be a 2-dimensional complex vector bundle and E = PN its projective bundle. The local defect index of the projection map SN → E is the Hopf map η : S^3 → S^2. The number ρ := #[SN, E]^{[η]}_{S^m} of fibre homotopy classes of mappings SN → E with local defect index [η] ∈ π_3(S^2) is bounded from below and above according to the following list (see [8] for the order of π_k(S^l)):

m          2   3   4   5   6    7    8   9    10    11    12   13    14      15
... ≤ ρ    ∞   ∞   1   1   5    4    1   1    23    42    1    3     210     3
ρ ≤ ...    ∞   ∞   4   4   36   30   4   12   360   672   16   144   10080   120
4.2. Normal Nematics. We need to consider the case where E = PN is the projective bundle of the normal bundle of Δ. This reflects the fact that uniaxial particles are forced to be transverse to the defect. From Proposition 3 estimates for ρ := #[SN, E]^{[π]}_{S^m}, where π : S^{n−1} → RP^{n−1} is the usual projection, are easily deduced. For instance in the case n := codim(Δ, M) = 10, Δ = S^m we get:

m          2   3    4   5   6   7     8   9   10
... ≤ ρ    1   6    1   1   1   60    1   ∞   6
ρ ≤ ...    2   24   1   1   2   240   8   ∞   96
4.3. Remark. In the following example [[S^m, 𝒜]]_Γ does not contain a fixed point for the action of π_1(F), hence θ differs from the usual action for every σ. Let SN := S^3 → S^2 be the Hopf map with fibre S^1 and E := S^2 × V → S^2 be the trivial bundle with fibre V. Assume that π_1(V) = 0, π_2(V) ≠ 0 and π_3(V) = 0 (e.g. V := CP^∞). In this situation there is only one possible local index 0 ∈ π_1(V), so that [SN, E]_{S^2} = [SN, E]^0_{S^2} ≈ π_2(Map_0(S^1, V))/θ. In the exact sequence (5) the Whitehead products vanish and we obtain that π_2(Map_0(S^1, V)) ≅ π_2(V) ≠ 0. The usual action of π_1(Map_0(S^1, V)) on the higher homotopy groups is by automorphisms. Hence the orbit space π_2(Map_0(S^1, V))/π_1(Map_0(S^1, V)) has at least two elements. But as E → S^2 is the trivial bundle we obtain
\[
[SN, E]_{S^2} = [S^3, S^2 \times V]_{S^2} = [S^3, V] \approx \pi_3(V)/\pi_1(V),
\]
and therefore #[SN, E]_{S^2} = 1. This shows that in this case θ differs from the usual group action. In particular, the ambiguity of the global defect index depends non-trivially on the isomorphism type of the bundles involved.

References
1. Baues, H. J.: Obstruction theory. Lecture Notes in Mathematics 628, Berlin–Heidelberg–New York: Springer-Verlag, 1977
2. Bechtluft-Sachs, S., Hien, M.: The Global Defect Index. Commun. Math. Phys. 202, 403–409 (1999)
3. Bredon, G. E.: Topology and geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1993
4. Mermin, N. D.: The topological theory of defects in ordered media. Rev. Mod. Phys. 51, 591–648 (1979)
5. Michel, L.: Symmetry, defects and broken symmetry. Configurations. Hidden symmetry. Rev. Mod. Phys. 52, 617–651 (1980)
6. Jänich, K.: Topological properties of ordinary nematics in 3-space. Acta Appl. Math. 8, 65–74 (1987)
7. Poénaru, V., Toulouse, G.: The crossing of defects in ordered media and the topology of 3-manifolds. J. Phys. 38, 887–895 (1977)
8. Ravenel, D. C.: Complex cobordism and stable homotopy groups of spheres. London–New York: Academic Press, 1986
9. Whitehead, G. W.: On products in homotopy groups. Ann. of Math. 47, 460–475 (1946)
10. Whitehead, G. W.: Elements of homotopy theory. Berlin–Heidelberg–New York: Springer-Verlag, 1978

Communicated by H. Araki
Commun. Math. Phys. 211, 413 – 438 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Central Limit Theorem for Stochastic Hamilton–Jacobi Equations
Fraydoun Rezakhanlou*
Department of Mathematics, University of California, Berkeley, CA 94720-3840, USA
Received: 15 February 1999 / Accepted: 14 December 1999

Abstract: We study the asymptotic behavior of u^ε(x, t) = εu(x/ε, t/ε), where u solves the Hamilton–Jacobi equation u_t + H(x, u_x) = 0 with H a stationary ergodic process in the x-variable. It was shown in Rezakhanlou–Tarver [RT] that u^ε converges to a deterministic function ū provided H(x, p) is convex in p and the convex conjugate of H in the p-variable satisfies certain growth conditions. In this article we establish a central limit theorem for the convergence by showing that for a class of examples, u^ε(x, t) can be (stochastically) represented as ū(x, t) + √ε Z(x, t) + o(√ε), where Z(x, t) is a suitable random field. In particular we establish a central limit theorem when the dimension is one and H(x, p) = ½p² − ω(x), where ω is a random function that enjoys some mild regularity.

1. Introduction

The Hamilton–Jacobi equation is one of the most studied equations in the theory of partial differential equations. Its importance stems from its vast applications in numerous areas of science. Hamilton–Jacobi equations are closely related to Hamiltonian systems. Various results for Hamilton–Jacobi equations have important counterparts or direct applications for Hamiltonian systems. In control theory and differential game theory, Hamilton–Jacobi equations describe the time evolution of the so-called value functions. Several growth models in physics and biology are macroscopically described by Hamilton–Jacobi equations (see for example [KS]). In these models, a random interface separates regions associated with different phases and the interface can be locally approximated by the graph of a solution to a Hamilton–Jacobi equation. Such a solution gives us a macroscopic description of the interface. Microscopically though, the interface is rough and fluctuates about the macroscopic solution. A central limit theorem should provide us with a better description of the interface.

* Supported in part by NSF Grant DMS97-04565.
F. Rezakhanlou
We study models for which the microscopic interface is also described by a Hamilton–Jacobi equation, but now with a Hamiltonian that is random and highly oscillatory. The main goal of the present article is to study such models and establish a central limit theorem for the fluctuations of the interface. In dimension one, the differentiated version of a Hamilton–Jacobi equation becomes a scalar conservation law for the inclination of the one-dimensional interface. This can be used to model one-dimensional fluids. The oscillation of the Hamiltonian is now interpreted as the oscillation of the flux and is often associated with the impurity or the lack of data in the fluid. The viscous version of our equation is known as Richards equation in connection with the infiltration problem [P1-2], and is called the contaminant transport equation in some geoscience applications [BZ]. In the context of fluids, our results can be used to obtain some qualitative information on shocks and their fluctuations. When the initial data is random and the Hamiltonian is quadratic and nonrandom, a variant of our results can be found in [WX1]. For simplicity we always consider deterministic initial data. However, our central limit theorem is still true if we consider random initial data and assume that initially we have a central limit theorem (see Remark 2.13). Where H has the special form H(x, p, ω) = ω(x)p², a variant of our results can be found in [WX2]. In a subsequent paper [R1], we use some of the ideas of this paper to study shock fluctuations for some particle systems that model fluids with one conservation law.

We consider random interfaces that are described by the graph of a random height function u(x, t) = u(x, t, ω). The function u solves a Hamilton–Jacobi equation of the form
\[
u_t + H(x, u_x, \omega) = 0, \tag{1.1}
\]
where u_x = (u_{x_1}, u_{x_2}, . . . , u_{x_n}) denotes the gradient of u with respect to x and ω denotes the randomness.
We always assume that H is stationary and ergodic with respect to a suitable shift operator and that H(x, p, ω) is convex in p. Under certain conditions [RT], the rescaled height function u^ε(x, t) = εu(x/ε, t/ε) converges to a deterministic function ū(x, t) provided initially u^ε(x, 0) = g(x) for a deterministic function g. The function ū solves the homogenized equation
\[
\begin{cases} \bar u_t + \bar H(\bar u_x) = 0, \\ \bar u(x, 0) = g(x), \end{cases} \tag{1.2}
\]
where H̄ is deterministic and inherits the convexity of H. Recall that by the Hopf–Lax–Oleinik formula, ū can be written as
\[
\bar u(x, t) = \inf_y \left\{ g(y) + t \bar L\!\left( \frac{x - y}{t} \right) \right\}, \tag{1.3}
\]
where L̄ is the convex conjugate of H̄. Our strategy for studying the fluctuations of u^ε has two basic steps. First we try to find a class of simple solutions of (1.1) for which a central limit theorem can be carried out using well-known results in probability theory. Secondly, if the class of simple solutions in the first step is rich enough, by comparison and approximation, we establish a central limit theorem for general solutions of (1.1). For the first step, we consider solutions of the form {u_p(x, t) : p ∈ R^d}, where
\[
u_p(x, t) = w(x, p) - t \bar H(p) \tag{1.4}
\]
and w(x, p) = p · x + o(|x|) as |x| → ∞. These solutions are associated with the linear solutions ū_p(x, t) = p · x − tH̄(p) of (1.2), because lim_{ε→0} εw(x/ε, p) = p · x. In several examples, especially when d = 1, the function w(x, p) can be found explicitly. If this function is explicit enough, one can decide whether there is a central limit theorem for εw(x/ε, p) as ε → 0. Assume
\[
w^\varepsilon(x, p) := \varepsilon w\!\left( \frac{x}{\varepsilon}, p \right) = p \cdot x + \sqrt{\varepsilon}\, B(x, p) + o(\sqrt{\varepsilon}) \tag{1.5}
\]
for a random field B(x, p) which is jointly continuous in x and p. Here the equality means equality in distribution; see Definition 2.2 of the next section for the precise meaning of (1.5). We now formulate a conjecture concerning the fluctuations of u^ε, provided that (1.5) holds:
\[
u^\varepsilon(x, t) = \bar u(x, t) + \sqrt{\varepsilon} \inf_{y \in I(x,t)} \{ B(x, p(x, y, t)) - B(y, p(x, y, t)) \} + o(\sqrt{\varepsilon}), \tag{1.6}
\]
where I(x, t) is the set of y at which the infimum in (1.3) is attained and
\[
p(x, y, t) = \nabla \bar L\!\left( \frac{x - y}{t} \right), \tag{1.7}
\]
where ∇L̄ denotes the gradient of L̄. A heuristic derivation of (1.6) will be given in Sect. 3. In this article, we establish (1.6) for certain classes of Hamiltonians. For example, if w(x, p) is continuously differentiable in (x, p), w_p(x, p) is continuously differentiable in x, and w_p(x, p) is injective in x, then (1.6) is true. Equation (1.6) is also valid if instead of injectivity of w_p(·, p) we assume d = 1 and
\[
\lim_{p \to \pm\infty} H_p(x, w_x(x, p)) = \pm\infty,
\]
uniformly in x. It turns out that if
\[
H(x, p) = \tfrac{1}{2} p^2 - \omega(x) \tag{1.8}
\]
with ω a stationary and ergodic process, then w(x, p) fails to be continuously differentiable. In this case we establish the central limit theorem under the assumption that ω(x) gets close to its minimum value often enough so that, with probability one,
\[
\lim_{n \to \infty} \frac{1}{n^2} \int_{-n}^{n} \frac{dy}{\sqrt{\omega(y) - \inf \omega}} = +\infty. \tag{1.9}
\]
In practice, our results are useful mostly in dimension one because in higher dimension we do not even know w(x, p) exists. Even if we show such w exists, a central limit theorem for w as in (1.5) seems to be challenging. In Corollary 2.6 of the next section we discuss a class of Hamiltonians for which w exists and is linear in p. For such a class it is easy to decide whether there is a central limit theorem for w. Corollary 2.6 was also established in the last section of [RT].

Some interesting results in the context of the random Schrödinger equation can be recast as a homogenization problem for the viscous version of our equation
\[
u_t + H(x, u_x, \omega) = \Delta u, \tag{1.10}
\]
where H is given by (1.8). This is because we can apply the Feynman–Kac formula to write
\[
u^\varepsilon(x, t, \omega) = -\varepsilon \log E^{x/\varepsilon} \exp\left\{ -\varepsilon^{-1} g(\varepsilon \beta_{t/\varepsilon}) - \int_0^{t/\varepsilon} \omega(\beta_s)\, ds \right\}, \tag{1.11}
\]
where β_s is a Brownian motion with β_0 = x/ε and E^{x/ε} denotes the expectation with respect to β. As a result, the limiting behaviour of u^ε is closely related to a large deviation principle for a Brownian motion in random media (the rate function is L̄). Such a large deviation principle has been thoroughly addressed by Sznitman in his book [S]. The representation (1.11) suggests a connection between the homogenization questions and directed polymers which is also related to the roughening phenomenon of the growing surfaces. See the last chapter of [S] and Sect. 5 of [KS] for more information. In a subsequent paper [R2], we address the issue of homogenization for (1.10) and (1.1) when H is not convex.

The organization of the paper is as follows. In the next section we state our main results. In Sect. 3 we derive (1.6) heuristically and explain the method of our proofs. In Sect. 4, we reduce the central limit theorem of u^ε(x, t) to a central limit theorem for a related random field S^ε(x, y, t) (see (2.5) for the definition of S^ε). In Sect. 5, (2.6) is established assuming some regularity of w(x, p). Section 6 is devoted to the classical Hamiltonian (2.8).

2. Notation and Main Results

This section is devoted to the statement of our main results. Consider a probability space (Ω, ℱ, P), where ℱ is a σ-field of measurable subsets of Ω and P is a probability measure. A measurable function H : R^d × R^d × Ω → R with the following properties is given:

Assumption 2.1.
(i) H(x, p, ω) is continuous in the (x, p) variables, convex and continuously differentiable in the p-variable,
(ii) if L(x, q, ω) = sup_p (q · p − H(x, p, ω)) denotes the convex conjugate of H(x, p, ω) in the p-variable then L is finite and coercive. We define coercivity to mean that
\[
L(x, q, \omega) \ge \varphi(q) - C_0(\omega), \tag{2.1}
\]
where φ is a convex function,
\[
\lim_{|q| \to \infty} \frac{\varphi(q)}{|q|} = +\infty \tag{2.2}
\]
and C_0(ω) is a non-negative function that is finite with probability one.
(iii) With probability one,
\[
C_1(\omega) = \sup_x L(x, 0, \omega) < \infty. \tag{2.3}
\]
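The convex conjugate in Assumption 2.1(ii) can be checked numerically in a simple case. The following is only an illustrative sketch and not part of the original argument: it freezes ω(x) at a hypothetical value `omega0`, takes the classical Hamiltonian H(p) = ½p² − omega0 in dimension one, and approximates L(q) = sup_p (q·p − H(p)) by a maximum over a finite grid. The names `convex_conjugate`, `p_grid` and `q_values` are assumptions made for the illustration. In this case L(q) = ½q² + omega0, so (2.1) holds with φ(q) = ½q² and C_0 = 0 as long as omega0 ≥ 0.

```python
def convex_conjugate(H, q, p_grid):
    """Approximate L(q) = sup_p (q*p - H(p)) by a maximum over p_grid."""
    return max(q * p - H(p) for p in p_grid)


omega0 = 0.3  # hypothetical frozen value of omega(x); any number >= 0 would do


def H(p):
    """Classical Hamiltonian H(p) = p^2/2 - omega0 at a fixed point x."""
    return 0.5 * p * p - omega0


# Uniform grid on [-10, 10] with step 1e-3 (wide enough for the q values below).
p_grid = [-10.0 + 1e-3 * k for k in range(20001)]
q_values = [-3.0, -1.5, 0.0, 0.5, 2.0]

L_numeric = [convex_conjugate(H, q, p_grid) for q in q_values]
L_exact = [0.5 * q * q + omega0 for q in q_values]

for q, a, b in zip(q_values, L_numeric, L_exact):
    print(f"q = {q:5.2f}   grid sup = {a:.6f}   q^2/2 + omega0 = {b:.6f}")
```

The grid maximum can only underestimate the true supremum, so the agreement with the closed form is a check on the grid width and step rather than on the theory.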
Given a Lipschitz continuous function g : R^d → R, we define
\[
u^\varepsilon(x, t, \omega) = \inf_y \{ g(y) + S^\varepsilon(x, y, t, \omega) \}, \tag{2.4}
\]
where S^ε is defined by the following variational formula:
\[
S^\varepsilon(x, y, t) = S^\varepsilon(x, y, t, \omega) := \inf_{\xi \in A_{x,y}} \int_0^t L\!\left( \frac{\xi(s)}{\varepsilon}, \dot\xi(s), \omega \right) ds, \tag{2.5}
\]
where A_{x,y} = {ξ ∈ W^{1,∞}(0, t; R^d) : ξ(0) = y, ξ(t) = x} and W^{1,∞}(0, t; R^d) denotes the space of Lipschitz functions from [0, t] to R^d. If we make some regularity assumptions on H(·, ·, ω), then u^ε will be the unique viscosity solution of the following Hamilton–Jacobi equation:
\[
\begin{cases} u^\varepsilon_t + H\!\left( \frac{x}{\varepsilon}, u^\varepsilon_x, \omega \right) = 0, \\ u^\varepsilon(x, 0, \omega) = g(x). \end{cases} \tag{2.6}
\]
For the notion of viscosity solution and the derivation of (2.6), see [Li] or [FS]. In the sequel, by u^ε we always mean (2.4) with an L that satisfies Assumption 2.1.

Definition 2.2. Let (Ω, ℱ, P) and (Ω̄, ℱ̄, P̄) be two probability spaces. Let A be an open subset of R^k and suppose X^ε : A × Ω → R is a sequence of measurable functions such that for each ω ∈ Ω the function X^ε(·, ω) is continuous. Let X : A × Ω̄ → R be a measurable function such that for all ω ∈ Ω̄ the function X(·, ω) is continuous. We may regard X^ε (respectively X) as a function from Ω (respectively Ω̄) into C(A; R). Then we say the processes X^ε converge to the process X if for every compact set B ⊂ A and every bounded continuous function F : C(B; R) → R, we have
\[
\lim_{\varepsilon \to 0} \int F(X^\varepsilon(\omega))\, P(d\omega) = \int F(X(\omega))\, \bar P(d\omega). \tag{2.7}
\]
We say the finite dimensional marginals of Xε converge to the finite dimensional marginals of X if we require (2.7) to hold only for sets B ⊂ A that are finite. It is well-known that the convergence of the processes Xε implies that for every positive δ and every compact set B ⊂ A, there exists a compact subset A ⊂ C(B; R) and ε0 > 0 such that (2.8) inf P (Xε ∈ A) ≥ 1 − δ. 0 τ ) < 1. (2.11) defined by Let y¯e (x, t) = y¯e (x, t, ω) denote the minimizer in (2.10) (which is well√ ε (x + εe(x), t) (2.11) for almost all ω). Then the finite dimensional marginals of y± converge to the finite dimensional marginals of y¯e (x, t). P¯ (Ze (x, y1 , t) = Ze (x, y2 , t)) = 0,
It is worth mentioning that if we assume e is identically zero in parts (i–iii) of Theorem 2.5, the differentiability assumption on L̄ can be dropped.

As the first application of Theorem 2.5, let Ω_1 = C^1(R^d; R^d) × C^1(R^d; R) and let ℱ_1 denote the Borel σ-field of Ω_1. The elements of Ω_1 are denoted by ω = (b̂, ĉ), where b̂ : R^d → R^d and ĉ : R^d → R are two continuously differentiable functions. The derivatives of b̂ and ĉ are denoted by b and c:
\[
b = \hat b_x, \qquad c = \hat c_x, \tag{2.12}
\]
with b a matrix-valued function and c a vector-valued function. Let P_1 be a probability measure on (Ω_1, ℱ_1) such that for all x, the matrix b(x) is invertible with probability one with respect to P_1,
\[
P_1\!\left( \sup_{x \in \mathbb{R}^d} \| (b(x))^{-1} \| < \infty \right) = 1, \tag{2.13}
\]
and the processes
\[
\left( \frac{1}{\sqrt{\varepsilon}} \left( \varepsilon \hat b\!\left( \frac{x}{\varepsilon} \right) - x \right),\ \sqrt{\varepsilon}\, \hat c\!\left( \frac{x}{\varepsilon} \right) \right), \qquad x \in \mathbb{R}^d, \tag{2.14}
\]
converge to a continuous process (B̂(x), Ĉ(x)). We define
\[
H(x, p, \omega) = \bar H\big( (b(x))^{-1} (p - c(x)) \big), \tag{2.15}
\]
where H̄ : R^d → R is a strictly convex function that is coercive and continuously differentiable. As a straightforward consequence of Theorem 2.5 we have the following:
Corollary 2.6. Let u^ε be as in (2.4), where H is of the form (2.15) with ω distributed according to P_1. Then the finite dimensional marginals of γ^ε_e converge to the finite dimensional marginals of the process
\[
\gamma_e(x, t) := \inf_{y \in I(x,t)} \left\{ \nabla \bar L\!\left( \frac{x - y}{t} \right) \cdot \big( e(x) + \hat B(x) - \hat B(y) \big) + \hat C(x) - \hat C(y) \right\}, \tag{2.16}
\]
where L̄ is the convex conjugate of H̄.

When d = 1 and c ≡ 0, Corollary 2.6 was shown in [RT]. The proof of Theorem 2.5 and Corollary 2.6 will be given in Sect. 4. For Corollary 2.6, we only need to verify that
\[
S^\varepsilon(x, y, t, \omega) = t \bar L\!\left( \frac{\hat b^\varepsilon(x) - \hat b^\varepsilon(y)}{t} \right) + \hat c^\varepsilon(x) - \hat c^\varepsilon(y), \tag{2.17}
\]
where b̂^ε(x) = εb̂(x/ε) and ĉ^ε(x) = εĉ(x/ε). Note also that Corollary 2.6 is consistent with our conjecture stated in the introduction because we may choose w(x, p) = b̂(x) · p + ĉ(x), and the convergence of (2.14) to (B̂, Ĉ) implies that the processes (1/√ε)(w^ε(x, p) − p · x) converge to the process B̂(x) · p + Ĉ(x).

As the second application of Theorem 2.5, we choose Ω_2 = C(R; R) with its Borel σ-field ℱ_2. We define the translation operator τ_x : Ω_2 → Ω_2 by τ_xω(y) = ω(x + y). Let P_2 be a probability measure on Ω_2 which is stationary with respect to the translations τ_x. The expectation with respect to P_2 is denoted by E_2. We assume
\[
P_2\!\left( \inf_{x \in \mathbb{R}} \omega(x) = 0,\ \sup_{x \in \mathbb{R}} \omega(x) < \infty \right) = 1. \tag{2.18}
\]
Let K : R → [0, ∞) be a continuously differentiable strictly convex function with K(0) = 0. Let L denote the convex conjugate of K. We assume that for some constants c1 and c2 , K(p) = 0, L0 (p)p ≤ c1 L(p) + c2 . lim (2.19) |p|→∞ |p| Clearly K : (−∞, 0] → [0, ∞) and K : [0, ∞) → [0, ∞) are one-to-one and onto. Their inverses will be denoted by A− and A+ respectively. We define
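\[
H(x, p, \omega) = K(p) - \omega(x). \tag{2.20}
\]
We set
\[
\varphi^\pm(\lambda) = E_2 A^\pm(\lambda + \omega(0)), \qquad \lambda \ge 0, \tag{2.21}
\]
where E_2 denotes the expectation with respect to P_2. We assume that the processes
\[
B^\varepsilon_\pm(x, \lambda, \omega) := \frac{1}{\sqrt{\varepsilon}} \left( \varepsilon \int_0^{x/\varepsilon} A^\pm(\lambda + \omega(y))\, dy - x \varphi^\pm(\lambda) \right), \qquad (x, \lambda) \in \mathbb{R} \times [0, \infty), \tag{2.22}
\]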
converge to the continuous process B^±(x, λ). We write ψ^±(λ) for the inverses of ϕ^±(λ). Define
\[
\bar H(p) = \begin{cases} \psi^-(p) & \text{if } p \le p^- := \varphi^-(0), \\ \psi^+(p) & \text{if } p \ge p^+ := \varphi^+(0), \\ 0 & \text{if } \varphi^-(0) \le p \le \varphi^+(0), \end{cases}
\]
and let L̄ denote the convex conjugate of H̄. Note that if z ≠ 0, then L̄ is differentiable at z and L̄′(z) ∉ [p^−, p^+]. We also define
\[
B(x, p) = \begin{cases} B^+(x, \psi^+(p)) & p \ge p^+, \\ B^-(x, \psi^-(p)) & p \le p^- . \end{cases}
\]

Theorem 2.7. Let u^ε be as in (2.4), where H is of the form (2.20) and ω is distributed according to P_2. Moreover, we assume that with probability one,
\[
\lim_{n \to \infty} \frac{1}{n^2} \int_{-n}^{n} \frac{dz}{K'(A^\pm(\omega(z)))} = \pm\infty. \tag{2.23}
\]
Then the finite dimensional marginals of γ^ε_e(x, t) converge to the finite dimensional marginals of
\[
\gamma_e(x, t) = \inf_{y \in I(x,t) - \{x\}} \Big\{ B(x, p(x, y, t)) - B(y, p(x, y, t)) + e(x)\, p(x, y, t) \Big\}, \tag{2.24}
\]
where p(x, y, t) = L̄′((x − y)/t).
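The construction of H̄ from ϕ^± and ψ^± can be made concrete in a toy case. The sketch below is only an illustration under stated assumptions and is not taken from the paper: it takes K(p) = ½p², so that A^+(λ) = √(2λ) and A^− = −A^+, and it assumes that the marginal law of ω(0) is uniform on [0, 1], replaced here by midpoint samples. The names `omega_samples`, `phi_plus`, `psi_plus` and `H_bar` are hypothetical. The expectation in (2.21) becomes a sample mean, ψ^+ is obtained by bisection, and the flat piece of H̄ on [p^−, p^+] appears explicitly.

```python
import math

# Assumption for illustration: omega(0) is uniform on [0, 1]; use midpoints.
N_SAMPLES = 2000
omega_samples = [(k + 0.5) / N_SAMPLES for k in range(N_SAMPLES)]


def A_plus(lam):
    """Inverse of K(p) = p^2/2 restricted to [0, infinity)."""
    return math.sqrt(2.0 * lam)


def phi_plus(lam):
    """phi^+(lam) = E[A^+(lam + omega(0))], with the mean taken over samples."""
    return sum(A_plus(lam + w) for w in omega_samples) / N_SAMPLES


P_PLUS = phi_plus(0.0)  # right end of the flat piece of H_bar


def psi_plus(p):
    """Inverse of the increasing function phi^+ by bisection (needs p >= P_PLUS)."""
    lo, hi = 0.0, 0.5 * p * p  # phi_plus(hi) >= sqrt(2*hi) = p since omega >= 0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if phi_plus(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def H_bar(p):
    """Effective Hamiltonian; K is even here, so psi^-(p) = psi^+(-p)."""
    if abs(p) <= P_PLUS:
        return 0.0
    return psi_plus(abs(p))


for p in (0.0, 0.5, 1.0, 2.0, -2.0):
    print(f"p = {p:5.2f}   H_bar(p) = {H_bar(p):.6f}")
```

For the uniform law the exact value of p^+ is 2√2/3, which the sample mean reproduces to a few digits; other marginal laws for ω(0) can be tried by changing `omega_samples` only.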
Before stating our last theorem, we study an example. We take Ω_3 = C(R; R) with its Borel σ-field ℱ_3. A pair of positive numbers δ and l are given. Let P_3 be a probability measure on (Ω_3, ℱ_3) such that
\[
P_3(\delta \le \omega(x) \le l) = 1. \tag{2.25}
\]
We also assume that P_3 is translation invariant. The expectation with respect to P_3 is denoted by E_3. Let K : R → [0, ∞) be a coercive convex and continuously differentiable function with K(0) = 0. Define two functions A^± such that A^+ : [0, ∞) → [0, ∞), A^− : [0, ∞) → (−∞, 0] and K(A^±(λ)) = λ. We define
\[
H(x, p, \omega) = \omega(x) K(p), \tag{2.26}
\]
\[
\varphi^\pm(\lambda) = E_3 A^\pm\!\left( \frac{\lambda}{\omega(0)} \right) \quad \text{for } \lambda \ge 0, \tag{2.27}
\]
\[
\bar H(p) = \begin{cases} (\varphi^+)^{-1}(p) & \text{if } p \ge 0, \\ (\varphi^-)^{-1}(p) & \text{if } p \le 0, \end{cases} \tag{2.28}
\]
\[
w(x, p) = \begin{cases} \int_0^x A^+\!\left( \frac{\bar H(p)}{\omega(y)} \right) dy & \text{if } p \ge 0, \\ \int_0^x A^-\!\left( \frac{\bar H(p)}{\omega(y)} \right) dy & \text{if } p \le 0. \end{cases} \tag{2.29}
\]
We then have the following properties for w(x, p):
(i) w(x, p) is differentiable in x and w_x(x, p) is continuous in (x, p),
(ii) d = 1 and lim_{p→±∞} H_p(x, w_x(x, p)) = ±∞ uniformly in x,
(iii) for every (x, p), we have H(x, w_x(x, p)) = H̄(p).
We also assume
(iv) the processes (1/√ε)(εw(x/ε, p) − p · x) converge to a continuous process B(x, p).

Our last theorem asserts that whenever (i)–(iv) are satisfied then we have a central limit theorem for u^ε. In particular we have a central limit theorem for the Hamiltonian given by (2.26). Our central limit theorem is also true if the conditions (i)–(ii) are replaced with
(i') w(x, p) is continuously differentiable in (x, p),
(ii') for all p, w_p(x, p) is injective and continuously differentiable in x.

Theorem 2.8. Assume Assumption 2.1 and suppose there exists a function w(x, p) for which either (i)–(iv) or (i')–(ii'), (iii)–(iv) hold with an H̄ that is strictly convex. Then the finite dimensional marginals of the processes γ^ε_e(x, t) converge to the finite dimensional marginals of
\[
\gamma_e(x, t) := \inf_{y \in I(x,t)} \Big\{ B(x, p(x, y, t)) - B(y, p(x, y, t)) + e(x) \cdot p(x, y, t) \Big\},
\]
where p(x, y, t) = ∇L̄((x − y)/t) and L̄ is the convex conjugate of H̄.
The proof of Theorem 2.8 will be presented in Sect. 6.

Remark 2.9. For both Theorems 2.7 and 2.8, the second and third parts of Theorem 2.5 apply as well. This is because we establish these theorems by verifying Assumption 2.3.

Remark 2.10. In our results we can replace √ε with an increasing function k(ε) with lim_{ε→0} k(ε) = 0 and lim_{ε→0} k(ε)/ε = ∞. What is however essential in this work is the continuity of the limiting process B(x, p). If we replace √ε with k(ε), then n² in (2.23) must also be replaced with k^{−1}(1/n).

Remark 2.11. When the function ū is x-differentiable at (x_0, t_0), then I(x_0, t_0) = {y(x_0, t_0)} is a singleton and p(x_0, y_0, t_0) = ū_x(x_0, t_0) for y_0 = y(x_0, t_0). In general, the set {p(x_0, y, t_0) : y ∈ I(x_0, t_0)} coincides with the set of limit points of the set {ū_x(x, t) : ū is x-differentiable at (x, t)} as (x, t) approaches (x_0, t_0). See [CMS] for more details.

Remark 2.12. Our interest in the limiting behaviour of y^ε_±(x, t) in the third part of Theorem 2.5 stems from the fact that y^ε_±(x, t) is the point at which the extreme (generalized) backward characteristic curves, emanating from the point (x, t), intersect the x-axis. See [FS] or [CMS] for more details. The limiting behaviour of y^ε_± was used in [RT] to study the limiting behaviour of u^ε_x when the Hamiltonian H has the form (2.15).

Remark 2.13. In our main results, Theorems 2.5–2.8, we may choose random initial data. If we assume that the process ε^{−1/2}(u^ε(x, 0, ω) − g(x)) converges to a continuous process R(x), then our results are still valid provided that the process Z(x, y, t) in (2.10) is replaced with the process Z(x, y, t) + R(y).
3. Heuristics and Sketch of Proofs

In this section we give a heuristic derivation of (1.6). We then sketch the proofs of Theorems 2.7 and 2.8. To ease the notation in this section, we will not display the dependence on ω in most places. Suppose that we can find a function w(x, p) = w(x, p, ω) such that
\[
H(x, w_x(x, p)) = \bar H(p), \tag{3.1}
\]
\[
\varepsilon w\!\left( \frac{x}{\varepsilon}, p \right) = x \cdot p + \sqrt{\varepsilon}\, B(x, p) + o(\sqrt{\varepsilon}). \tag{3.2}
\]
If we set w^ε(x, p) = εw(x/ε, p) and
\[
v^\varepsilon(x, t, p) := w^\varepsilon(x, p) - t \bar H(p) = x \cdot p - t \bar H(p) + \sqrt{\varepsilon}\, B(x, p) + o(\sqrt{\varepsilon}), \tag{3.3}
\]
then v^ε is a solution to Eq. (2.6), i.e.
\[
v^\varepsilon_t + H\!\left( \frac{x}{\varepsilon}, v^\varepsilon_x \right) = 0.
\]
From (2.4) we also know
\[
v^\varepsilon(x, t, p) = \inf_y \{ w^\varepsilon(y, p) + S^\varepsilon(x, y, t) \}.
\]
If the infimum is attained at some y^ε(x, t, p), we obtain
\[
S^\varepsilon(x, y^\varepsilon(x, t, p), t) = v^\varepsilon(x, t, p) - w^\varepsilon(y^\varepsilon(x, t, p), p). \tag{3.4}
\]
Suppose that the transformation p ↦ y^ε(x, t, p) is invertible for each (x, t). Let p^ε(x, y, t) denote the inverse, i.e.
\[
y^\varepsilon(x, t, p^\varepsilon(x, y, t)) = y. \tag{3.5}
\]
Clearly v^ε(x, t, p) converges to v(x, t, p) = x · p − tH̄(p) and v satisfies
\[
v(x, t, p) = \inf_y \left\{ y \cdot p + t \bar L\!\left( \frac{x - y}{t} \right) \right\}. \tag{3.6}
\]
For such a variational problem we know the minimizer. Let us assume that L̄ and H̄ are continuously differentiable. It is not hard to show that the infimum in (3.6) is attained at y(x, t, p) = x − t∇H̄(p). If we denote the inverse of p → y(x, t, p) by p(x, y, t), we certainly have p(x, y, t) = ∇L̄((x − y)/t).

If w(x, p) is sufficiently well-behaved, we expect to have a central limit theorem for p^ε as well. Assume
\[
p^\varepsilon(x, y, t) = p(x, y, t) + \sqrt{\varepsilon}\, R(x, y, t) + o(\sqrt{\varepsilon}) \tag{3.7}
\]
for an appropriate (random) continuous process R(x, y, t). From (3.4) we know
\[
S^\varepsilon(x, y, t) = v^\varepsilon(x, t, p^\varepsilon(x, y, t)) - w^\varepsilon(y, p^\varepsilon(x, y, t)). \tag{3.8}
\]
We then substitute the right-hand side of (3.3) and (3.7) for v ε and pε to obtain S ε (x, y, t) = {(x − y) · p(x, y, t) − t H¯ (p(x, y, t))} √ + ε{(x − y) − t∇ H¯ (p(x, y, t))} · R(x, y, t) √ √ + ε{B(x, p(x, y, t)) − B(y, p(x, y, t))} + o( ε). It follows from the definition p(x, y, t) that the first term on the right-hand side is x−y ¯ t L t and the second term is zero. Thus, we expect √ √ x−y ε ¯ + ε{B(x, p(x, y, t)) − B(y, p(x, y, t))} + o( ε). (3.9) S (x, y, t) = t L t An important feature of the above calculation is the cancellation of R. As a result, we should be able to prove a central limit theorem for S ε without establishing any central limit thorem for p ε . It turns out that under the assumptions of Theorem 2.8, we can show (3.10) S ε (x, y, t) = sup w ε (x, p) − wε (y, p) − t H¯ (p) . p
Using this formula, we establish a central limit theorem for $S^\varepsilon$ using only $p^\varepsilon = p + o(1)$ instead of (3.7).

For Theorem 2.7, one faces more challenges. In the case of the classical Hamiltonian $H(x,p) = \frac12 p^2 - \omega(x)$, the effective Hamiltonian has a flat piece. More precisely, for some positive $p_0$, we have $\bar H(p) = 0$ if $p \in [-p_0, p_0]$. Moreover, we do not have a simple expression for $w(x,p)$ when $p \in (-p_0, p_0)$. In fact, in some cases one can show that for $p \in (-p_0, p_0)$ there is no piecewise smooth $w(x,p)$ that would satisfy (3.1) in the viscosity sense. What saves us, however, is that if we assume (1.9), then our heuristic can be made rigorous using only $w(x,p)$ with $p \notin (-p_0, p_0)$. For such $p$, $y^\varepsilon(x,t,p)$ in (3.4) is always outside an interval of the form $(x - r^\varepsilon(x,t), x + r^\varepsilon(x,t))$ for some $r^\varepsilon(x,t)$ of order $o(\sqrt{\varepsilon})$.

Finally, let us remark that the condition (1.9) is satisfied if we assume some mild regularity on $\omega$. For example, if the minimum of $\omega$ is attained at some $x_0$ and $\omega$ is twice differentiable at $x_0$, then (1.9) is always satisfied. This is because the differentiability of $\omega$ implies $\omega(x) - \omega(x_0) = O(|x - x_0|^2)$, which in turn implies
$$\int_{x_0-\delta}^{x_0} \frac{dx}{\sqrt{\omega(x) - \omega(x_0)}} = +\infty.$$
Hence it suffices to choose $n$ large enough so that $x_0, x_0 - \delta \in (-n, n)$.
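As a rough numerical illustration of this divergence (not part of the original argument), the sketch below takes the hypothetical choice $\omega(x) = x^2$, $x_0 = 0$, and integrates $1/\sqrt{\omega(x)-\omega(x_0)}$ over $[x_0-\delta, x_0-\eta]$ with a midpoint rule. The function name `truncated_integral` and all parameter values are assumptions made for the example. For this $\omega$ the exact value is $\log(\delta/\eta)$, which grows without bound as $\eta \to 0$.

```python
import math

def truncated_integral(omega, x0, delta, eta, n=200_000):
    """Midpoint rule for the integral of 1/sqrt(omega(x) - omega(x0))
    over [x0 - delta, x0 - eta], with 0 < eta < delta."""
    a, b = x0 - delta, x0 - eta
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += h / math.sqrt(omega(x) - omega(x0))
    return total

def omega_quadratic(x):
    # Hypothetical potential with a twice differentiable minimum at x0 = 0.
    return x * x

if __name__ == "__main__":
    for eta in (1e-1, 1e-2, 1e-3):
        value = truncated_integral(omega_quadratic, 0.0, 1.0, eta)
        print(eta, value, math.log(1.0 / eta))
```

Shrinking `eta` by a factor of ten adds roughly $\log 10$ to the truncated integral, which is the logarithmic divergence behind the displayed identity.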
4. Proof of Theorem 2.5

We first prove a lemma that allows us to restrict the infimum in (2.4) to a bounded set of $y$.

Lemma 4.1. (i) For every $r > 0$, there exists a constant $c_1(r)$ such that for every $\omega$ with $C_0(\omega), C_1(\omega) \le r$ and every $(x,t)$,
$$u^\varepsilon(x,t,\omega) = \inf_{|y-x|\le c_1(r)t} \{g(y) + S^\varepsilon(x,y,t,\omega)\}. \tag{4.1}$$
F. Rezakhanlou
(ii) There exists a constant $c_2$ such that for every $(x,t)$,
$$\bar u(x,t) = \inf_{|y-x|\le c_2 t} \Big\{ g(y) + t\bar L\Big(\frac{x-y}{t}\Big) \Big\}. \tag{4.2}$$
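The localization in part (ii) can be checked numerically in a toy case. The sketch below is only an illustration under assumed data: $g(y) = |\sin y|$ (Lipschitz constant $c_0 = 1$) and $\bar L(q) = q^2/2$, neither of which comes from the paper. For this quadratic $\bar L$ one can take $c_2 = 2c_0$, because $|x-y| > 2c_0 t$ forces $g(y) + t\bar L((x-y)/t) > g(x)$. The helper names `hopf_lax`, `grid` and `compare` are hypothetical.

```python
import math

def hopf_lax(g, L_bar, x, t, ys):
    """Minimum over the candidate points ys of g(y) + t * L_bar((x - y) / t)."""
    return min(g(y) + t * L_bar((x - y) / t) for y in ys)

def grid(lo, hi, n):
    """Uniform grid of n + 1 points on [lo, hi]."""
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

def g(y):
    return abs(math.sin(y))      # illustrative initial datum, Lipschitz constant 1

def L_bar(q):
    return 0.5 * q * q           # illustrative effective Lagrangian

C0 = 1.0
C2 = 2.0 * C0                    # localization constant valid for this L_bar

def compare(x=0.3, t=1.5, half_width=50.0, n=100_000):
    """Return the minimum over a wide window and over the ball |y - x| <= C2 * t."""
    wide = grid(x - half_width, x + half_width, n)
    local = [y for y in wide if abs(y - x) <= C2 * t]
    return hopf_lax(g, L_bar, x, t, wide), hopf_lax(g, L_bar, x, t, local)

if __name__ == "__main__":
    print(compare())
```

Both minima are taken over the same grid points, so they coincide exactly whenever the minimizer lies in the ball, which is what the lemma asserts.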
Proof. Recall that by Assumption 2.1, we have $\sup_x L(x,0,\omega) = C_1(\omega) < \infty$ and $L(x,q,\omega) \ge \phi(q) - C_0(\omega)$. By Jensen's inequality,
$$\int_0^t L\Big(\frac{\xi(s)}{\varepsilon}, \dot\xi(s), \omega\Big)\,ds \ge \int_0^t \phi(\dot\xi(s))\,ds - C_0(\omega)t \ge t\phi\Big(\frac1t\int_0^t \dot\xi(s)\,ds\Big) - C_0(\omega)t = t\phi\Big(\frac{\xi(t)-\xi(0)}{t}\Big) - C_0(\omega)t.$$
Hence, if $C_0(\omega), C_1(\omega) \le r$,
$$S^\varepsilon(x,y,t,\omega) \ge t\phi\Big(\frac{x-y}{t}\Big) - rt. \tag{4.3}$$
On the other hand, by choosing $\xi(s) \equiv x$, we obtain
$$S^\varepsilon(x,x,t,\omega) \le \int_0^t L\Big(\frac{x}{\varepsilon}, 0, \omega\Big)\,ds \le t\sup_x L(x,0,\omega) \le tr. \tag{4.4}$$
Since $g$ is Lipschitz continuous, we can find a constant $c_0 = c_0(g)$ such that $|g(x) - g(y)| \le c_0|x-y|$. Because of this and the fact that $\phi$ grows faster than linearly, we can choose a constant $c_1(r)$ such that if $\frac{|x-y|}{t} > c_1(r)$, then
$$g(y) + t\phi\Big(\frac{x-y}{t}\Big) - rt > g(x) + tr.$$
From this, (4.3) and (4.4) we learn that for such $y$,
$$g(y) + S^\varepsilon(x,y,t,\omega) > g(x) + S^\varepsilon(x,x,t,\omega).$$
This evidently implies (4.1). The proof of (4.2) is similar. □

The proof of our second lemma is contained in the proof of Lemma 5.1 in [RT]. For the sake of completeness, we sketch its proof.

Lemma 4.2. Let $A$ be either a compact subset of $\mathbb R^d \times [T_1, T_2]$ such that for every $(x,t) \in A$ the set $I(x,t) = \{y(x,t)\}$ consists of one point, or let $A = \{(x,t)\}$ be a singleton. Define
$$a_l(\lambda) = \inf_{(x,t)\in A} \min\Big\{ g(y) + t\bar L\Big(\frac{x-y}{t}\Big) - \bar u(x,t) : |y| \le l,\ |y - I(x,t)| \ge \lambda \Big\}. \tag{4.5}$$
Then $a_l(\lambda) > 0$ if $\lambda > 0$, and for sufficiently large $l$, $\lim_{\lambda\to0} a_l(\lambda) = 0$.
Proof. Suppose to the contrary that there exists a positive $\lambda$ such that $a_l(\lambda) = 0$. Then we can find sequences $x_n$, $t_n$ and $y_n$ such that $(x_n,t_n) \in A$, $|y_n| \le l$, $|y_n - I(x_n,t_n)| \ge \lambda$, $\lim_{n\to\infty}(x_n,t_n) = (x,t)$, $\lim_{n\to\infty} y_n = y$ and
$$\lim_{n\to\infty}\Big[ g(y_n) + t_n\bar L\Big(\frac{x_n-y_n}{t_n}\Big) - \bar u(x_n,t_n) \Big] = 0. \tag{4.6}$$
This implies $g(y) + t\bar L\big(\frac{x-y}{t}\big) = \bar u(x,t)$. This in turn implies $y \in I(x,t)$. If $A = \{(x,t)\}$, we have a contradiction because all $(x_n,t_n)$ are equal to $(x,t)$ and as a result $|y_n - I(x,t)| \ge \lambda$. If $I(x,t) = \{y(x,t)\}$ is a singleton for every $(x,t) \in A$, then $|y_n - y(x_n,t_n)| \ge \lambda$. It is not hard to see that $y(x,t)$ is continuous on $A$. Hence $|y - y(x,t)| \ge \lambda$, which is in contradiction with $y \in I(x,t) = \{y(x,t)\}$.

The second assertion $\lim_{\lambda\to0} a_l(\lambda) = 0$ can be verified in a similar fashion. First we choose $l$ sufficiently large so that for $(x,t) \in A$, the infimum in (1.3) can be restricted to $y$ with $|y| \le l$. We then observe that if we have two convergent sequences $(x_n,t_n) \in A$, $|y_n| \le l$ such that $\lim_{n\to\infty}|y_n - I(x_n,t_n)| = 0$, then (4.6) holds. □

To this end let us fix a function $k(l)$ and a sequence of non-decreasing functions $\alpha = (\alpha_l)$ with $\lim_{r\to0}\alpha_l(r) = 0$ for every $l$. Let $K(k,\alpha)$ denote the set of functions $b(x,y,t)$ with the following properties: for every $l$,
$$|b(x_1,y_1,t_1) - b(x_2,y_2,t_2)| \le \alpha_l(|x_1-x_2|) + \alpha_l(|y_1-y_2|) + \alpha_l(|t_2-t_1|), \qquad |b(x_1,y_1,t_1)| \le k(l) \tag{4.7}$$
for every $x_1, x_2, y_1, y_2, t_1, t_2$ with $|x_1|, |x_2|, |y_1|, |y_2| \le l$ and $t_1, t_2 \in [T_1, T_2]$. We then define
$$A^\varepsilon = A^\varepsilon(\alpha,k,r) = \{\omega : C_0(\omega) \le r,\ C_1(\omega) \le r,\ Z^\varepsilon(\cdot,\cdot,\cdot,\omega) \in K(\alpha,k)\}, \tag{4.8}$$
where
$$Z^\varepsilon(x,y,t,\omega) = \frac{1}{\sqrt{\varepsilon}}\Big[ S^\varepsilon(x,y,t,\omega) - t\bar L\Big(\frac{x-y}{t}\Big) \Big]. \tag{4.9}$$
Note that Assumptions 2.1 and 2.3 on $P$ imply that for every positive $\delta$, we can find $\varepsilon_0$, $r$, $k(\cdot)$ and $\alpha = (\alpha_l)$ such that for the corresponding $A^\varepsilon(\alpha,k,r)$ we have
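$$\inf_{0<\varepsilon<\varepsilon_0} P(A^\varepsilon(\alpha,k,r)) \ge 1 - \delta. \tag{4.10}$$
[…] $> 0$ if $\lambda > 0$, and $\lim_{\lambda\to0} a^{-1}(\lambda) = 0$. We now set $h(\varepsilon) = a^{-1}(c_1\sqrt{\varepsilon})$ for a positive constant $c_1$ that will be determined later. We claim that for an appropriate $c_1$, (4.11) holds. To see this, suppose $|y - I(x,t)| \ge \lambda$, $|y| \le l_0$, $(x,t) \in A$, and let $\bar y(y)$ be the closest point in $I(x,t)$ to $y$. Then there are constants $c_0$ and $c_1$ such that for every $\omega \in A^\varepsilon$,
$$\begin{aligned}
g(y) + S^\varepsilon(x^\varepsilon,y,t,\omega) &= g(y) + t\bar L\Big(\frac{x^\varepsilon-y}{t}\Big) + \sqrt{\varepsilon}\,Z^\varepsilon(x^\varepsilon,y,t,\omega) \\
&\ge g(y) + t\bar L\Big(\frac{x-y}{t}\Big) - c_0\sqrt{\varepsilon} \\
&\ge \bar u(x,t) + a(\lambda) - c_0\sqrt{\varepsilon} \\
&= g(\bar y(y)) + t\bar L\Big(\frac{x-\bar y(y)}{t}\Big) + a(\lambda) - c_0\sqrt{\varepsilon} \\
&\ge g(\bar y(y)) + S^\varepsilon(x^\varepsilon,\bar y(y),t,\omega) + a(\lambda) - c_1\sqrt{\varepsilon},
\end{aligned}$$
where for the first inequality we used the definition of $A^\varepsilon$ and the fact that the convexity of $\bar L$ implies the local Lipschitzness of $\bar L$. From this we learn that if $a(\lambda) - c_1\sqrt{\varepsilon} > 0$, then such $y$ will not be in $I^\varepsilon(x^\varepsilon,t,\omega)$. This clearly implies that (4.11) holds if $y \in I^\varepsilon(x^\varepsilon,t,\omega)$. □

In the next two lemmas, we will assume $\bar L$ is continuously differentiable and will establish some of the properties of $\gamma_e^\varepsilon$ as $\varepsilon$ goes to zero. The reader can readily check that if we set $e = 0$, then the differentiability of $\bar L$ will not be needed.

Lemma 4.4. Let $A$ and $A^\varepsilon$ be as in the previous lemma and define
$$Z_e^\varepsilon(x,y,t,\omega) = Z^\varepsilon(x,y,t,\omega) + \nabla\bar L\Big(\frac{x-y}{t}\Big)\cdot e(x).$$
Then
$$\limsup_{\varepsilon\to0}\ \sup_{(x,t)\in A}\ \sup_{\omega\in A^\varepsilon}\ \varepsilon^{-\frac12}\Big| u^\varepsilon(x^\varepsilon,t,\omega) - \bar u(x,t) - \sqrt{\varepsilon}\inf_{y\in I(x,t)} Z_e^\varepsilon(x,y,t,\omega) \Big| = 0, \tag{4.13}$$
$$\limsup_{\varepsilon\to0}\ \sup_{(x,t)\in A}\ \sup_{\omega\in A^\varepsilon}\ \sup_{z\in I^\varepsilon(x^\varepsilon,t,\omega)} \Big| Z_e^\varepsilon(x,z,t,\omega) - \inf_{y\in I(x,t)} Z_e^\varepsilon(x,y,t,\omega) \Big| = 0. \tag{4.14}$$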
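Proof. From Lemma 4.3 we learn that for every $(x,t) \in A$ and every $\omega \in A^\varepsilon$,
$$u^\varepsilon(x^\varepsilon,t,\omega) = \min\{g(y) + S^\varepsilon(x^\varepsilon,y,t,\omega) : |y - I(x,t)| \le h(\varepsilon)\} = \min_{\bar y\in I(x,t)}\ \min_{|y-\bar y|\le h(\varepsilon)} \{g(y) + S^\varepsilon(x^\varepsilon,y,t,\omega)\}. \tag{4.15}$$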
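Let $\bar y(y)$ denote the closest point $\bar y \in I(x,t)$ to $y$. Suppose $\bar y \in I(x,t)$ and $|y - \bar y| \le h(\varepsilon)$. Then
$$\begin{aligned}
S^\varepsilon(x^\varepsilon,y,t,\omega) &= t\bar L\Big(\frac{x^\varepsilon-y}{t}\Big) + \sqrt{\varepsilon}\,Z^\varepsilon(x^\varepsilon,y,t,\omega) \\
&= t\bar L\Big(\frac{x-y}{t}\Big) + \sqrt{\varepsilon}\,\nabla\bar L\Big(\frac{x-y}{t}\Big)\cdot e(x) + \sqrt{\varepsilon}\,Z^\varepsilon(x^\varepsilon,y,t,\omega) + o(\sqrt{\varepsilon}) \\
&= t\bar L\Big(\frac{x-y}{t}\Big) + \sqrt{\varepsilon}\,\Big[\nabla\bar L\Big(\frac{x-y}{t}\Big)\cdot e(x) + Z^\varepsilon(x,y,t,\omega)\Big] + o(\sqrt{\varepsilon}) \\
&= t\bar L\Big(\frac{x-y}{t}\Big) + \sqrt{\varepsilon}\,Z_e^\varepsilon(x,\bar y(y),t,\omega) + o(\sqrt{\varepsilon}),
\end{aligned}$$
where for the last two equalities we used the fact that $\omega \in A^\varepsilon$. Here and below $o(\sqrt{\varepsilon})$ means an error term $R$ with $\lim_{\varepsilon\to0}\varepsilon^{-1/2}R = 0$ uniformly in $(x,t) \in A$, $y \in [\bar y - h(\varepsilon), \bar y + h(\varepsilon)]$ for some $\bar y \in I(x,t)$, and uniformly in $\omega \in A^\varepsilon$. From this and (4.15) we deduce
$$\limsup_{\varepsilon\to0}\ \sup_{(x,t)\in A}\ \sup_{\omega\in A^\varepsilon}\ \varepsilon^{-1/2}\Big| u^\varepsilon(x^\varepsilon,t,\omega) - \min_{\bar y\in I(x,t)}\ \min_{|y-\bar y|\le h(\varepsilon)} \Big[ g(y) + t\bar L\Big(\frac{x-y}{t}\Big) + \sqrt{\varepsilon}\,Z_e^\varepsilon(x,\bar y,t,\omega) \Big] \Big| = 0.$$
This evidently implies (4.13) because $\min_y\big[g(y) + t\bar L\big(\frac{x-y}{t}\big)\big]$ is attained at $y = \bar y$ for every $\bar y \in I(x,t)$.

Next we establish (4.14). Let $z \in I^\varepsilon(x^\varepsilon,t,\omega)$. Then, for $\omega \in A^\varepsilon$,
$$\begin{aligned}
u^\varepsilon(x^\varepsilon,t,\omega) &= g(z) + S^\varepsilon(x^\varepsilon,z,t,\omega) = g(z) + t\bar L\Big(\frac{x^\varepsilon-z}{t}\Big) + \sqrt{\varepsilon}\,Z^\varepsilon(x^\varepsilon,z,t,\omega) \\
&= g(z) + t\bar L\Big(\frac{x-z}{t}\Big) + \sqrt{\varepsilon}\,\nabla\bar L\Big(\frac{x-z}{t}\Big)\cdot e(x) + \sqrt{\varepsilon}\,Z^\varepsilon(x,z,t,\omega) + o(\sqrt{\varepsilon}),
\end{aligned}$$
where for the last equality we used the continuous differentiability of $\bar L$ and (4.7). From this and (4.13) we learn
$$\limsup_{\varepsilon\to0}\ \sup_{(x,t)\in A}\ \sup_{\omega\in A^\varepsilon}\ \sup_{z\in I^\varepsilon(x^\varepsilon,t,\omega)} \big|\varepsilon^{-1/2}X(z,x,t) + Y^\varepsilon(z,x,t)\big| = 0, \tag{4.16}$$
where
$$X(z,x,t) = g(z) + t\bar L\Big(\frac{x-z}{t}\Big) - \bar u(x,t), \qquad Y^\varepsilon(z,x,t) = Z_e^\varepsilon(x,z,t,\omega) - \inf_{y\in I(x,t)} Z_e^\varepsilon(x,y,t,\omega).$$
Observe that $X(z,x,t) \ge 0$ and if $|z - I(x,t)| \le h(\varepsilon)$, then $Y^\varepsilon(z,x,t) = R^\varepsilon(z,x,t) + o(1)$, where
$$R^\varepsilon(z,x,t) = Z_e^\varepsilon(x,\bar y(z),t,\omega) - \inf_{y\in I(x,t)} Z_e^\varepsilon(x,y,t,\omega),$$
with $\bar y(z)$ the closest point in $I(x,t)$ to $z$. Clearly $R^\varepsilon \ge 0$ as well. Hence (4.16) implies that $R^\varepsilon$ goes to zero, and this in turn implies that $Y^\varepsilon$ goes to zero, proving (4.14). □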
Proof of Theorem 2.5. The proofs of parts (i) and (ii) are omitted because they are very similar to the proofs of Theorems 5.1 and 5.2 of [RT]. The proof of part (iii) is also similar to the proof of Theorem 5.3 of [RT] and we only sketch it. We only establish the convergence of the one-dimensional marginals of $y_\pm^\varepsilon(x + \sqrt{\varepsilon}e(x), t, \omega)$ to the one-dimensional marginals of $\bar y_e(x,t)$.

Step 1. Fix a point $(x,t) \in Q$ and suppose $I(x,t) = \{y_1, y_2, \ldots, y_n\}$. For each $i \in \{1, 2, \ldots, n\}$ and a nonnegative $\tau$, define
$$G_i(\tau) = \{\omega : Z_e(x,y_i,t,\omega) + \tau < Z_e(x,y_j,t,\omega) \text{ for all } j \ne i\},$$
$$G_i^\varepsilon(\tau) = \{\omega : Z_e^\varepsilon(x,y_i,t,\omega) + \tau < Z_e^\varepsilon(x,y_j,t,\omega) \text{ for all } j \ne i\}.$$
We then define
$$A_i^\varepsilon(\alpha,k,r,\tau) = A^\varepsilon(\alpha,k,r) \cap G_i^\varepsilon(\tau), \tag{4.17}$$
where $A^\varepsilon(\alpha,k,r)$ was defined by (4.8). We also set
$$G^\varepsilon(\tau) = \bigcup_i G_i^\varepsilon(\tau), \qquad G(\tau) = \bigcup_i G_i(\tau), \qquad A^\varepsilon(\alpha,k,r,\tau) = \bigcup_i A_i^\varepsilon(\alpha,k,r,\tau).$$
From our assumption (2.11),
$$\bar P(G(\tau)) < 1 \text{ for every } \tau > 0, \qquad \lim_{\tau\to0}\bar P(G(\tau)) = 1. \tag{4.18}$$
Also, since the set $\{(x,y) : |x-y| > \tau\}$ is an open set and its boundary has zero Lebesgue measure, we have
$$\lim_{\varepsilon\to0} P(G_i^\varepsilon(\tau)) = \bar P(G_i(\tau)). \tag{4.19}$$
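From this and (4.10), we learn that for every positive $\delta$, there exist $k_\delta(\cdot)$, $\alpha_\delta = (\alpha_{\delta,l}(\cdot))$, $r_\delta$, $\varepsilon_0(\delta)$ and $\tau(\delta)$ such that if
$$B_{i,\delta}^\varepsilon = A_i^\varepsilon(\alpha_\delta,k_\delta,r_\delta,\tau(\delta)), \qquad B_\delta^\varepsilon = \bigcup_{i=1}^n B_{i,\delta}^\varepsilon,$$
then inf […] $> 0$,
$$\int_y^x \Gamma^+(z,\lambda)\,dz = t, \tag{5.8}$$
then
$$S(x,y,t,\omega) = \int_y^x A^+(\lambda + \omega(z))\,dz - \lambda t. \tag{5.9}$$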
The same is true if $\Gamma^+$, $A^+$ are replaced with $\Gamma^-$ and $A^-$.

Proof. Observe that if $H(x,p) = K(p) - \omega(x)$ and $\rho(x) = A^+(\lambda + \omega(x))$, then $H_p(x,\rho(x)) = K'(\rho(x))$. We also define $w(x) = \int_0^x \rho(z)\,dz$. Now suppose $x(\cdot)$ is a solution to
$$\frac{dx}{dt} = K'(\rho(x)), \qquad x(t) = x. \tag{5.10}$$
Then $\frac{d}{ds}\int_0^{x(s)}\Gamma^+(z,\lambda)\,dz = 1$, which implies
$$\int_{x(0)}^x \Gamma^+(z,\lambda)\,dz = t.$$
Comparing this with (5.8) implies that in fact $x(0) = y$. From Lemma 5.1 we obtain (5.9). □

When $H(x,p,\omega) = K(p) - \omega(x)$, we define
$$v^\pm(x,\lambda,\omega) = \int_0^x A^\pm(\lambda + \omega(y))\,dy, \quad \lambda \ge 0, \qquad
w(x,p,\omega) = \begin{cases} v^+(x,\bar H(p),\omega) & \text{if } p \ge p^+, \\ v^-(x,\bar H(p),\omega) & \text{if } p \le p^-, \end{cases} \tag{5.11}$$
where $p^\pm$ and $\bar H(p)$ are defined right after (2.22). We are now ready to derive a formula similar to (3.10).

Lemma 5.3. Suppose $H(x,p,\omega) = K(p) - \omega(x)$ and $w(x,p,\omega)$ is defined by (5.11). Then for $y \notin [z_+(x,t), z_-(x,t)]$,
$$S(x,y,t,\omega) = \hat S(x,y,t,\omega) := \sup_{p\notin[p^-,p^+]} [\,w(x,p) - w(y,p) - t\bar H(p)\,], \tag{5.12}$$
where $z_\pm(x,t) = z_\pm(x,t,\omega)$ are defined implicitly by
$$\int_{z_+(x,t)}^x \Gamma^+(z,0,\omega)\,dz = -\int_x^{z_-(x,t)} \Gamma^-(z,0,\omega)\,dz = t. \tag{5.13}$$
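To make the variational expression concrete, the following sketch evaluates $\sup_{\lambda\ge0}\big[\int_y^x A^+(\lambda+\omega(z))\,dz - t\lambda\big]$ for $x > y$, which is the form that (5.12) takes in the proof below. It rests on assumptions that are not taken from the paper: $K(p) = p^2/2$, so that $A^+(s) = \sqrt{2s}$ is read as the nonnegative branch of $K^{-1}$, and a potential $\omega \ge 0$. The names `A_plus`, `action_integral` and `S_hat` are hypothetical. For $\omega \equiv 0$ the supremum can be computed by hand: the maximizer is $\lambda = (x-y)^2/(2t^2)$ and the value is $(x-y)^2/(2t)$, the free action.

```python
import math

def A_plus(s):
    """Nonnegative inverse branch of K(p) = p**2 / 2 (an assumed kinetic term)."""
    return math.sqrt(2.0 * s)

def action_integral(omega, lam, y, x, n=2000):
    """Midpoint rule for the integral of A_plus(lam + omega(z)) over [y, x]."""
    h = (x - y) / n
    return h * sum(A_plus(lam + omega(y + (i + 0.5) * h)) for i in range(n))

def S_hat(omega, x, y, t, lam_max=50.0, iters=60):
    """Ternary search for sup over 0 <= lam <= lam_max of
    action_integral(omega, lam, y, x) - t * lam.

    The objective is concave in lam because A_plus is concave.
    lam_max is an assumed search bound, not a quantity from the text.
    """
    def objective(lam):
        return action_integral(omega, lam, y, x) - t * lam

    lo, hi = 0.0, lam_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    lam = 0.5 * (lo + hi)
    return objective(lam), lam

if __name__ == "__main__":
    print(S_hat(lambda z: 0.0, 2.0, 0.0, 1.0))
    print(S_hat(lambda z: 1.0 + math.sin(z), 2.0, 0.0, 1.0))
```

Since $A^+$ is increasing, a larger potential can only increase the supremum, which the second call illustrates.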
Proof. We only verify (5.12) when $y < z_+(x,t)$; the case $y > z_-(x,t)$ can be treated in the same way. When $y < z_+(x,t)$, we only need to show
$$S(x,y,t,\omega) = \sup_{\lambda\ge0}\,[\,v^+(x,\lambda,\omega) - v^+(y,\lambda,\omega) - t\lambda\,] = \sup_{\lambda\ge0}\Big[\int_y^x A^+(\lambda+\omega(z))\,dz - t\lambda\Big]. \tag{5.14}$$
This is because for $y < z_+(x,t) < x$, we always have
$$\int_y^x A^-(\lambda+\omega(z))\,dz \le 0 \le \int_y^x A^+(\lambda+\omega(z))\,dz,$$
simply because $A^- \le 0$ and $A^+ \ge 0$. Moreover, since $K$ grows faster than linearly at infinity, $A^+$ grows slower than linearly at infinity. Hence the supremum in (5.14) is attained at some nonnegative $\bar\lambda = \bar\lambda(x,y,t,\omega)$. If $\bar\lambda > 0$, we certainly have
$$\int_y^x \Gamma^+(z,\bar\lambda)\,dz = t$$
from setting the derivative of the expression in the brackets in (5.14) to be zero at $\lambda = \bar\lambda$. By the strict monotonicity of $\Gamma^+$ in $\lambda$ we conclude that such $\bar\lambda$ is unique. Now (5.14) follows from Lemma 5.2. If $\bar\lambda = 0$, then the derivative at $\lambda = 0$ is nonpositive, which means $\int_y^x \Gamma^+(z,0)\,dz \le t$. This is impossible because $y < z_+(x,t)$. □

The variational formula (5.12) allows us to establish a central limit theorem for $\hat S$ in just the same way we established a central limit theorem for $u^\varepsilon$ in Sect. 4. Define
$$\hat S^\varepsilon(x,y,t,\omega) = \varepsilon\hat S\Big(\frac{x}{\varepsilon},\frac{y}{\varepsilon},\frac{t}{\varepsilon},\omega\Big).$$
More precisely,
$$\hat S^\varepsilon(x,y,t,\omega) = \begin{cases} \sup_{\lambda\ge0}\Big[\varepsilon\int_{y/\varepsilon}^{x/\varepsilon} A^+(\lambda+\omega(z))\,dz - t\lambda\Big] & \text{if } x \ge y, \\[4pt] \sup_{\lambda\ge0}\Big[\varepsilon\int_{y/\varepsilon}^{x/\varepsilon} A^-(\lambda+\omega(z))\,dz - t\lambda\Big] & \text{if } x \le y. \end{cases} \tag{5.15}$$
Proof of Theorem 2.7. Step 1. For the sake of simplicity, we only present the proof when $e = 0$. First we claim that for every pair of positive numbers $(l,r)$, there exists a constant $c_0(l,r)$ such that if $\sup\omega \le r$ and $\big|\frac{x-y}{t}\big| \le l$, then
$$\hat S^\varepsilon(x,y,t,\omega) = \sup_{0\le\lambda\le c_0(l,r)}\Big[\varepsilon\int_{y/\varepsilon}^{x/\varepsilon} A^\pm(\lambda+\omega(z))\,dz - t\lambda\Big]. \tag{5.16}$$
Here and below by something like (5.16), we mean that a $+$ sign appears if $x \ge y$ and a $-$ sign appears if $x \le y$. To see (5.16) when $x \ge y$, observe that if $\sup\omega \le r$ and $0 \le \frac{x-y}{t} \le l$, then
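$$\varepsilon\int_{y/\varepsilon}^{x/\varepsilon} A^+(\lambda+\omega)\,dz - t\lambda \le (x-y)A^+(\lambda+r) - t\lambda \le tlA^+(\lambda+r) - t\lambda,$$
and this is negative when $\lambda$ is large because $A^\pm(\lambda)$ has a sublinear growth at infinity.

Next we define $z_\pm^\varepsilon(x,t,\omega) = z_\pm^\varepsilon(x,t) = \varepsilon z_\pm\big(\frac{x}{\varepsilon},\frac{t}{\varepsilon}\big)$ and $\lambda^\varepsilon(x,y,t)$ to be the maximizer in (5.16). More precisely, if $y \le z_+^\varepsilon(x,t)$, then $\lambda^\varepsilon$ is the unique $\lambda$ for which $\varepsilon\int_{y/\varepsilon}^{x/\varepsilon}\Gamma^+(z,\lambda)\,dz = t$, and if $y \ge z_-^\varepsilon(x,t)$, then $\lambda^\varepsilon$ is the unique $\lambda$ for which $\varepsilon\int_{y/\varepsilon}^{x/\varepsilon}\Gamma^-(z,\lambda)\,dz = t$. It is not hard to show that if $y \in [z_+^\varepsilon(x,t), z_-^\varepsilon(x,t)]$, then $\lambda^\varepsilon = 0$. We certainly have
$$\hat S^\varepsilon(x,y,t,\omega) = \varepsilon\int_{y/\varepsilon}^{x/\varepsilon} A^\pm(\lambda^\varepsilon(x,y,t)+\omega(z))\,dz - t\lambda^\varepsilon(x,y,t). \tag{5.17}$$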
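If we define $\bar L$ to be the convex conjugate of $\bar H$, then
$$t\bar L\Big(\frac{x-y}{t}\Big) = \sup_p\,[(x-y)p - t\bar H(p)] = \sup_{p\notin(p^-,p^+)} [(x-y)p - t\bar H(p)],$$
$$t\bar L\Big(\frac{x-y}{t}\Big) = \begin{cases} \sup_{\lambda\ge0}\,[(x-y)\varphi^+(\lambda) - t\lambda] & \text{if } x \ge y, \\ \sup_{\lambda\ge0}\,[(x-y)\varphi^-(\lambda) - t\lambda] & \text{if } x \le y, \end{cases} \tag{5.18}$$
where $\varphi^\pm$ was defined by (2.21). Clearly there exists a unique maximizer $\lambda = \bar\lambda\big(\frac{x-y}{t}\big)$ in (5.18). Moreover, $\bar\lambda\big(\frac{x-y}{t}\big) = \bar H\big(\bar L'\big(\frac{x-y}{t}\big)\big)$ if $x \ne y$ and $\bar\lambda(0) = 0$. Hence
$$t\bar L\Big(\frac{x-y}{t}\Big) = (x-y)\varphi^\pm\Big(\bar\lambda\Big(\frac{x-y}{t}\Big)\Big) - t\bar\lambda\Big(\frac{x-y}{t}\Big). \tag{5.19}$$

Step 2. Given a compact set $B \subset \mathbb R\times\mathbb R\times(0,\infty)$, choose $r_0$ so that for every $(x,y,t) \in B$,
$$\hat S^\varepsilon(x,y,t,\omega) = \sup_{0\le\lambda\le r_0}\Big[\varepsilon\int_{y/\varepsilon}^{x/\varepsilon} A^\pm(\lambda+\omega(z))\,dz - t\lambda\Big], \qquad t\bar L\Big(\frac{x-y}{t}\Big) = \sup_{0\le\lambda\le r_0} [(x-y)\varphi^\pm(\lambda) - t\lambda].$$
Choose $l$ large enough so that if $(x,y,t) \in B$, then $\big|\frac{x-y}{t}\big| \le l$. We define
$$a^+(\delta) = \inf_{0\le z\le l}\Big\{\bar L(z) - \sup_\lambda\big[z\varphi^+(\lambda) - \lambda : |\lambda - \bar\lambda(z)| > \delta,\ 0 \le \lambda \le r_0\big]\Big\},$$
$$a^-(\delta) = \inf_{-l\le z\le 0}\Big\{\bar L(z) - \sup_\lambda\big[z\varphi^-(\lambda) - \lambda : |\lambda - \bar\lambda(z)| > \delta,\ 0 \le \lambda \le r_0\big]\Big\}.$$
As in Lemma 4.2, it is not hard to show $a^\pm(\delta) > 0$ if $\delta > 0$ and $\lim_{\delta\to0} a^\pm(\delta) = 0$. Recall the process $B^\varepsilon(x,\lambda,\omega)$ defined by (2.22) and define
$$A^\varepsilon = A^\varepsilon(\alpha,k,r) = \{\omega : B^\varepsilon(\cdot,\cdot,\omega) \in K(k,\alpha),\ \sup\omega \le r\},$$
where $K(\alpha,k)$ is the space of functions $b(x,\lambda)$ such that
$$|b(x_1,\lambda_1) - b(x_2,\lambda_2)| \le \alpha_l(|x_1-x_2|) + \alpha_l(|\lambda_1-\lambda_2|), \qquad |b(x_1,\lambda_1)| \le k(l),$$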
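for every $x_1, x_2, \lambda_1, \lambda_2$ with $|x_1|, |x_2| \le l$ and $\lambda_1, \lambda_2 \in [0,l]$. A repetition of the proof of Lemma 4.3 implies that there exists a function $h(\varepsilon)$ with $\lim_{\varepsilon\to0} h(\varepsilon) = 0$ such that if $\omega \in A^\varepsilon$ and $\big|\frac{x-y}{t}\big| \le l$, then
$$\Big|\lambda^\varepsilon(x,y,t,\omega) - \bar\lambda\Big(\frac{x-y}{t}\Big)\Big| \le h(\varepsilon). \tag{5.20}$$

Step 3. A repetition of the proof of Lemma 4.4 implies
$$\lim_{\varepsilon\to0}\ \sup_{(x,y,t)\in B}\ \sup_{\omega\in A^\varepsilon}\ \varepsilon^{-\frac12}\Big|\hat S^\varepsilon(x,y,t,\omega) - t\bar L\Big(\frac{x-y}{t}\Big) - \sqrt{\varepsilon}\Big[B_\pm^\varepsilon\Big(x,\bar\lambda\Big(\frac{x-y}{t}\Big),\omega\Big) - B_\pm^\varepsilon\Big(y,\bar\lambda\Big(\frac{x-y}{t}\Big),\omega\Big)\Big]\Big| = 0. \tag{5.21}$$
As in the proof of Theorem 2.5(ii), we use (5.21) to conclude that the processes
$$\hat Z^\varepsilon(x,y,t,\omega) = \frac{1}{\sqrt{\varepsilon}}\Big[\hat S^\varepsilon(x,y,t,\omega) - t\bar L\Big(\frac{x-y}{t}\Big)\Big]$$
converge to the process
$$Z(x,y,t,\omega) = B^\pm\Big(x,\bar\lambda\Big(\frac{x-y}{t}\Big),\omega\Big) - B^\pm\Big(y,\bar\lambda\Big(\frac{x-y}{t}\Big),\omega\Big).$$
We then use Theorem 2.5 to deduce that if
$$\hat u^\varepsilon(x,t,\omega) = \inf_y\,\{g(y) + \hat S^\varepsilon(x,y,t,\omega)\},$$
then the finite-dimensional marginals of the processes
$$\hat\gamma_0^\varepsilon(x,t) = \frac{1}{\sqrt{\varepsilon}}\big(\hat u^\varepsilon(x,t,\omega) - \bar u(x,t)\big)$$
converge to the finite-dimensional marginals of the process $\gamma_0(x,t)$ defined by (2.24). It remains to show
$$\hat u^\varepsilon(x,t,\omega) - u^\varepsilon(x,t,\omega) = o(\sqrt{\varepsilon}). \tag{5.22}$$

Final Step. From (2.23), the translation invariance of $P_2$ and the definition of $z_\pm^\varepsilon$, we can show that with probability one, $z_\pm^\varepsilon(x,t) = x + o(\sqrt{\varepsilon})$. More precisely, if we define
$$B_\pm(\delta,l) = \Big\{\omega : |z_\pm^\varepsilon(x,t,\omega) - x| \le \delta\sqrt{\varepsilon} \text{ for all } \varepsilon \in \Big(0,\frac1l\Big)\Big\}$$
for $\delta, l$ positive, then
$$\lim_{l\to\infty} P_2(B_\pm(\delta,l)) = 1. \tag{5.23}$$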
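This is because
$$\begin{aligned}
\limsup_{l\to\infty} P_2(\Omega - B_+(\delta,l)) &= \limsup_{l\to\infty} P_2\Big(|z_+^\varepsilon(0,t,\omega)| > \delta\sqrt{\varepsilon} \text{ for some } \varepsilon \in \Big(0,\frac1l\Big)\Big) \\
&\le \limsup_{l\to\infty} P_2\Big(\int_{-\delta/\sqrt{\varepsilon}}^0 \Gamma^+(z,0,\omega)\,dz \le \frac{t}{\varepsilon} \text{ for some } \varepsilon \in \Big(0,\frac1l\Big)\Big) \\
&\le P_2\Big(\int_{-\delta/\sqrt{\varepsilon_l}}^0 \Gamma^+(z,0,\omega)\,dz \le \frac{t}{\varepsilon_l} \text{ for a sequence } \varepsilon_l \to 0\Big) \\
&\le P_2\Big(\liminf_{\varepsilon\to0}\ \varepsilon\int_{-\delta/\sqrt{\varepsilon}}^0 \frac{dz}{K'(A^+(z))} < \infty\Big) = 0,
\end{aligned}$$
where for the first equality we used the translation invariance of $P_2$ and for the last equality we used our assumption (2.23). This proves (5.23) with the $+$ sign. The proof of (5.23) with the $-$ sign is similar.

We now recall a lemma from [RT]. According to Lemma 3.1 of [RT], for every compact set $B \subset \mathbb R\times\mathbb R\times(0,\infty)$, the functions $S^\varepsilon(x,y,t,\omega)$, $(x,y,t) \in B$, are Lipschitz, uniformly in $\varepsilon$. From the definition of $\hat S^\varepsilon$ and (5.16), it is not hard to show that the same uniform Lipschitzness is true for $\hat S^\varepsilon$. From this and the Lipschitzness of $g$, we learn that for every compact set $A \subset \mathbb R\times(0,\infty)$ and every $l, \delta > 0$, there exists $C_0$ such that if $(x,t) \in A$, $0 < \varepsilon < \frac1l$ and $\omega \in B(\delta,l)$, then
$$\sup_{(x,t)\in A}\Big| u^\varepsilon(x,t,\omega) - \inf_{y\notin[z_+^\varepsilon(x,t),\,z_-^\varepsilon(x,t)]} \{g(y) + S^\varepsilon(x,y,t,\omega)\}\Big| \le C_0\delta\sqrt{\varepsilon}, \tag{5.24}$$
$$\sup_{(x,t)\in A}\Big| \hat u^\varepsilon(x,t,\omega) - \inf_{y\notin[z_+^\varepsilon(x,t),\,z_-^\varepsilon(x,t)]} \{g(y) + \hat S^\varepsilon(x,y,t,\omega)\}\Big| \le C_0\delta\sqrt{\varepsilon}. \tag{5.25}$$
From (5.24), (5.25) and (5.12) we learn that for some constant $C_1$,
$$\sup_{(x,t)\in A} |u^\varepsilon(x,t,\omega) - \hat u^\varepsilon(x,t,\omega)| \le C_1\delta\sqrt{\varepsilon}$$
for every $\omega \in B(\delta,l)$. From this, the previous step and (5.23), it is not hard to establish the convergence of the finite-dimensional marginals of $\gamma_0^\varepsilon$ to $\gamma_0$ as $\varepsilon \to 0$. □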
6. Proof of Theorem 2.8

We start with the derivation of (3.10).

Lemma 6.1. Under the assumptions of Theorem 2.8, we have $\hat S(x,y,t,\omega) = S(x,y,t,\omega)$, where
$$\hat S(x,y,t,\omega) = \sup_p\,[\,w(x,p,\omega) - w(y,p,\omega) - t\bar H(p)\,]. \tag{6.1}$$
Proof. To ease the notation, we do not display the dependence on $\omega$. First we assume (i)–(iii) hold. Let $x(t,y,p)$ denote the unique solution of
$$\frac{dx}{dt} = H_p(x, w_x(x,p)), \qquad x(0,y,p) = y. \tag{6.2}$$
Clearly $x(t,y,p)$ is continuous in $(t,y,p)$. Recall that by (5.3),
$$w(x,p) - w(y,p) - t\bar H(p) \le S(x,y,t) \quad \text{for all } (x,y,t,p).$$
From this and (5.4) we learn
$$\hat S(x(t,y,p),y,t) = S(x(t,y,p),y,t).$$
To complete the proof of $S = \hat S$, we need to make sure that for every $(x,y,t)$, there exists a $p$ such that $x(t,y,p) = x$. For this, it suffices to show $\lim_{p\to\pm\infty} x(t,y,p) = \pm\infty$. It is not hard to deduce this from (6.2) and (ii).

We now assume (i')–(ii') instead of (i)–(ii). Since $w(\cdot,p)$ grows linearly for large $p$ and $\bar H$ grows faster than linearly, we conclude that for each $(x,y,t)$ with $t > 0$, a maximizer in (6.1) exists. Fix $(x,y,t)$ and denote the maximizer by $\bar p = \bar p(x,y,t)$. We clearly have
$$\hat S(x,y,t) = w(x,\bar p) - w(y,\bar p) - t\bar H(\bar p), \qquad 0 = w_p(x,\bar p) - w_p(y,\bar p) - t\nabla\bar H(\bar p). \tag{6.3}$$
Let $x(\cdot)$ denote the solution to (5.2) with $w(z) = w(z,\bar p)$ and $x(0) = y$. Then by Lemma 5.1,
$$S(x(t),y,t) = w(x(t),\bar p) - w(y,\bar p) - t\bar H(\bar p). \tag{6.4}$$
To show $\hat S(x,y,t) = S(x,y,t)$, it suffices to show $x(t) = x$. Since $w_p(\cdot,\bar p)$ is an injective function and since (6.3) holds, we only need to show
$$w_p(x(t),\bar p) - w_p(y,\bar p) - t\nabla\bar H(\bar p) = 0. \tag{6.5}$$
Since $w_p$ is continuously differentiable in $x$, from differentiating $H(x,w_x(x,p)) = \bar H(p)$ in $p$ we learn
$$w_{xp}(x,p)H_p(x,w_x(x,p)) = \nabla\bar H(p).$$
This in turn implies
$$\frac{d}{dt}w_p(x(t),p) = w_{px}(x(t),p)\frac{dx}{dt} = w_{px}(x(t),p)H_p\big(x(t),w_x(x(t),p)\big) = \nabla\bar H(p).$$
This evidently implies (6.5). □

Proof of Theorem 2.8. The proof is very similar to the proof of Theorem 2.7. After a repetition of Steps 1–3 of the proof of Theorem 2.7, we deduce a central limit theorem for $\hat S^\varepsilon$. We then apply Theorem 2.5 to complete the proof. □
References

[BZ] Bosma, W. and van der Zee, S.: Transport of reacting solute in a one-dimensional chemically heterogeneous porous medium. Water Resour. Res. 29, 117–131 (1993)
[CMS] Cannarsa, P., Mennucci, A. and Sinestrari, C.: Regularity results for solutions of a class of Hamilton–Jacobi equations. Arch. Rational Mech. Anal. 140, 197–223 (1997)
[E] Evans, L.C.: Periodic homogenisation of certain fully nonlinear partial differential equations. Proceedings of the Royal Society of Edinburgh, Section A 127 (3–4), 245–265 (1992)
[FS] Fleming, W.H. and Soner, H.M.: Controlled Markov Processes and Viscosity Solutions. Berlin: Springer, 1993
[KS] Krug, J. and Spohn, H.: Kinetic roughening of growing surfaces. In: Solids Far From Equilibrium, C. Godreche, ed. Cambridge: Cambridge Univ. Press, 1991, pp. 479–588
[La] Lax, P.: Hyperbolic Systems of Conservation Laws and the Mathematical Theory of Shock Waves. Number 11 in Conference Board of the Mathematical Sciences Regional Conference Series in Applied Mathematics, Philadelphia: Society for Industrial and Applied Mathematics, 1973
[Li] Lions, P.L.: Generalized Solutions of Hamilton–Jacobi Equations. Number 69 in Research Notes in Mathematics, Boston: Pitman Advanced Publishing Program, 1982
[CEL] Crandall, M.G., Evans, L.C. and Lions, P.L.: Some properties of viscosity solutions of Hamilton–Jacobi equations. Trans. Am. Math. Soc. 282 (2), 487–502 (1984)
[CL] Crandall, M.G. and Lions, P.L.: Viscosity solutions of Hamilton–Jacobi equations. Trans. Am. Math. Soc. 277 (1), 1–42 (1983)
[LPV] Lions, P.L., Papanicolaou, G. and Varadhan, S.R.S.: Homogenization of Hamilton–Jacobi equations. Unpublished
[P1] Philip, J.R.: Theory of infiltration. Adv. in Hydrosciences 1, 213–305 (1969)
[P2] Phillip, J.R.: Issues in flow and transport in heterogeneous porous media. Transport in Porous Media 1, 319–338 (1986)
[R1] Rezakhanlou, F.: Central limit theorem for exclusion processes. In preparation
[R2] Rezakhanlou, F.: In preparation
[RT] Rezakhanlou, F. and Tarver, J.L.: Homogenization for stochastic Hamilton–Jacobi equation. To appear in Arch. Rational Mech. Anal.
[S] Sznitman, A.-S.: Brownian Motion, Obstacles and Random Media. Berlin: Springer, 1998
[WX1] Wehr, Y. and Xin, J.: White noise perturbation of the viscous shock fronts of the Burgers equation. Commun. Math. Phys. 181, 183–203 (1996)
[WX2] Wehr, Y. and Xin, J.: Front speed in the Burgers equation with a random flux. J. Stat. Phys. 88, 843–871 (1997)
Communicated by J. L. Lebowitz
Commun. Math. Phys. 211, 439 – 464 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Travelling Waves in a Chain of Coupled Nonlinear Oscillators
Gérard Iooss¹, Klaus Kirchgässner²
¹ Institut Universitaire de France, INLN, UMR CNRS-UNSA 6618, 1361 route des Lucioles, 06560 Valbonne, France. E-mail: [email protected]
² Mathematisches Institut A, Universität Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany. E-mail: [email protected]
Received: 10 September 1999 / Accepted: 15 December 1999
Abstract: In a chain of nonlinear oscillators, linearly coupled to their nearest neighbors, all travelling waves of small amplitude are found as solutions of finite dimensional reversible dynamical systems. The coupling constant and the inverse wave speed form the parameter space. The groundstate consists of a one-parameter family of periodic waves. It is realized in a certain parameter region containing all cases of light coupling. Beyond the border of this region the complexity of wave-forms increases via a succession of bifurcations. In this paper we give an appropriate formulation of this problem, prove the basic facts about the reduction to finite dimensions, show the existence of the ground states and discuss the first bifurcation by determining a normal form for the reduced system. Finally we show the existence of nanopterons, which are localized waves with a noncancelling periodic tail at infinity whose amplitude is exponentially small in the bifurcation parameter.

1. Introduction

Consider the dynamics of a one-dimensional network of nonlinear oscillators, as described by the infinite system
$$\ddot X_n + V'(X_n) = \gamma(X_{n+1} - 2X_n + X_{n-1}), \quad n \in \mathbb Z. \tag{1}$$
Here, $X_n(\tilde t)$, $\tilde t \in \mathbb R$, gives the position of the $n$th particle, $V(X_n)$ its potential energy, $V$ being a regular function independent of $n$, and the positive constant $\gamma$ measures the coupling between nearest neighbors, which is assumed to be linear. Furthermore, the function $V$ satisfies $V'(0) = 0$, $V''(0) = 1$. We shall construct solutions of (1) in the form of travelling waves. In fact, we shall develop a general method for classifying travelling waves of small amplitude via an infinite sequence of bifurcations. We shall discuss in detail the groundstate and the first of these bifurcations.
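For readers who want to experiment with system (1), the sketch below integrates a finite, periodic version of the chain with the velocity Verlet scheme and monitors the energy $\sum_n \big[\tfrac12 \dot X_n^2 + V(X_n) + \tfrac{\gamma}{2}(X_{n+1}-X_n)^2\big]$. Everything specific here is an assumption made for illustration: the periodic truncation to $N$ sites, the potential $V(X) = X^2/2 - X^3/3$ (which satisfies $V'(0)=0$, $V''(0)=1$), the parameter values, and all function names. It is not the method of the paper, which works with the infinite chain.

```python
import math

def V(x):
    return 0.5 * x * x - x**3 / 3.0     # assumed on-site potential

def dV(x):
    return x - x * x                    # V'(x), so V'(0) = 0 and V''(0) = 1

def force(X, gamma):
    """Right-hand side of (1) solved for the acceleration, periodic in n."""
    n = len(X)
    return [-dV(X[i]) + gamma * (X[(i + 1) % n] - 2.0 * X[i] + X[i - 1])
            for i in range(n)]

def energy(X, P, gamma):
    n = len(X)
    kinetic = 0.5 * sum(p * p for p in P)
    onsite = sum(V(x) for x in X)
    coupling = 0.5 * gamma * sum((X[(i + 1) % n] - X[i]) ** 2 for i in range(n))
    return kinetic + onsite + coupling

def verlet(X, P, gamma, dt, steps):
    """Velocity Verlet integration of the periodic chain."""
    X, P = list(X), list(P)
    F = force(X, gamma)
    for _ in range(steps):
        P = [p + 0.5 * dt * f for p, f in zip(P, F)]
        X = [x + dt * p for x, p in zip(X, P)]
        F = force(X, gamma)
        P = [p + 0.5 * dt * f for p, f in zip(P, F)]
    return X, P

def demo(n_sites=32, gamma=0.1, dt=0.01, steps=1000, amplitude=0.05):
    """Return the energy before and after integrating a small cosine profile."""
    X0 = [amplitude * math.cos(2.0 * math.pi * i / n_sites) for i in range(n_sites)]
    P0 = [0.0] * n_sites
    X1, P1 = verlet(X0, P0, gamma, dt, steps)
    return energy(X0, P0, gamma), energy(X1, P1, gamma)

if __name__ == "__main__":
    print(demo())
```

A symplectic scheme is used so that the energy stays close to its initial value over long runs, which makes small-amplitude wave trains easier to observe.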
With the ansatz $X_n(\tilde t) = \tilde x(\tilde t - n\tau)$, after scaling the time as $\tilde t = \tau t$, and denoting $x(t) = \tilde x(\tau t)$, system (1) is transformed to
$$\ddot x(t) + \tau^2 V'[x(t)] = \gamma\tau^2[x(t-1) - 2x(t) + x(t+1)], \tag{2}$$
which is a scalar "neutral" or "advance-delay" differential equation. Equations of this type have been the subject of various investigations on the dynamics of lattices. Friesecke and Wattis have shown in [6] the surprising fact that, in a unidimensional Hamiltonian network, solitary waves exist, even if the coupling is nonlinear. They used a variational approach. How delicate this result really is will also appear in the subsequent analysis. Further results along these lines were given by Smets and Willem [19]. Equation (2) has been investigated by MacKay and Aubry in [15] for the existence of time-periodic and localized-in-space standing waves, so-called breathers. Aubry then, while searching for "multibreathers", developed in [1] the technique of "phase torsion" to study the existence of travelling waves. Rusticini also studied equations of the type considered here in [17,18]. His motivation came from problems of optimal control. He proved a Hopf-bifurcation theorem by constructing 2d-center manifolds for periodic solutions via a Lyapunov-Schmidt argument. Some of his analysis is close to ours, like the ad hoc construction of $C^0$-semigroups on the positive and the negative spectral part, both being infinite dimensional. We should also mention the recent work of Mallet-Paret et al. in [3,13,14] on waves in higher dimensional lattices. There, the dynamics is restricted to discrete systems, but the results give a global picture of the solutions. The arguments rely on an advanced form of the Lyapunov-Schmidt method given by X.B. Lin (cf. [14]). With the method being developed here, we exploit two facts: first the ellipticity of (2) in its continuous parts, and second the intrusion of hyperbolicity via the discrete terms. With increasing intensity of coupling, the effect of the latter will be more and more dominating, and the complexity of the solution behavior will explode.
Nevertheless, one can perform the "continuous limit" for (1) and thus obtain travelling wave solutions of the following nonlinear wave equation
$$u_{\tilde t\tilde t} + V'(u) = Ku_{\xi\xi} \tag{3}$$
for the function $u(\tilde t,\xi)$. Its discretized form (1) is obtained with $X_n(\tilde t) = u(\tilde t, nh)$ and $K = \gamma h^2$, where $h$ is the discretization step. Looking for solutions of (3) of the form of travelling waves,
$$u(\tilde t,\xi) = \tilde x(\tilde t - \xi/c), \tag{4}$$
leads to the discretized form (2), where $\tau = h/c$. Now, (4) implies for (3)
$$(1 - K/c^2)\frac{d^2\tilde x}{d\tilde t^2} + \tilde x = g(\tilde x), \tag{5}$$
where $g$ is defined by $V'(x) = x - g(x)$, hence $g(x) = O(x^2)$. It is then clear that travelling waves, as solutions of (5), exist near 0 if and only if $K/c^2 < 1$, i.e. $\gamma\tau^2 < 1$. They form a one-parameter family in the neighborhood of 0. In the present work, we prove the existence of the corresponding travelling waves (if $\gamma\tau^2 < 1$) for the above discretized model (1), but we also prove the existence of infinitely many other types of travelling waves near 0, for values of $(\gamma,\tau)$ in regions
such that $\gamma\tau^2 > 1$. This shows in particular how dangerous the belief might be that all nontrivial solutions of a discretized version of (5) survive the limit $h \to 0$. The method we shall develop is based on previous work in [11,16,20], proving the reducibility of quasilinear elliptic systems in infinite cylindrical domains. Treating the system as evolutionary in the unbounded variable, one is able to show that, under quite general conditions, the original system, if restricted to a suitable neighborhood of 0, is equivalent to a flow on a finite dimensional manifold. Extending this idea to the problem under consideration in Sect. 2, we are able to prove the validity of a reduction of (2) to a system of ordinary differential equations whose dimension equals the dimension of the invariant subspace belonging to the central part of the spectrum of the linearization at 0, and which inherits the "reversibility" from the original equation (2). This is done in Sects. 4 and 5. It should be emphasized that the extension of the previous results to the case considered here is by no means straightforward. In the following sections we analyze the case of small coupling first, when no bifurcation occurs and all "small" travelling waves are periodic. Thereafter we treat the first bifurcation occurring at a critical value of the coupling constant $\gamma$ (near $\frac12$ for $\tau = 1$). The difficulties of applying previous reduction results [20] will be apparent in that case and solved in a general way. Exploiting the reducibility to a finite system of ordinary differential equations, we apply normal form theory. The resulting system is integrable on this level of approximation and quite rich in its structure. In order to keep the scope of this paper limited, however, we suppress the instinct to describe all possible solutions, as well as the proof of persistence of the solutions found for the full system (not just for its reduction to normal form). That would complete the analysis.
We rather construct some of the most interesting forms of waves, such as "nanopterons". These are roughly the superposition of a localized travelling (solitary) wave, whose principal part is given explicitly, and exponentially small (in the bifurcation parameter) periodic waves ("phonons"). The proof of their existence follows from the work of Lombardi in [12]. For other types of solutions, like periodic or quasi-periodic ones, see the methods developed in [8].

2. Extended Formulation

Instead of treating (2) directly, we introduce a new variable $v \in [-1,1]$ and functions $X(t,v) = x(t+v)$. The notation $U(t)(v) = (x(t),\xi(t),X(t,v))^T$ indicates our intention to construct $U$ as a map from $\mathbb R$ into some function space living on the $v$-interval $[-1,1]$. We use the notations $\xi(t) = \dot x(t)$, $\delta^1X(t,v) = X(t,1)$, and $\delta^{-1}X(t,v) = X(t,-1)$. Equation (2) can now be written as follows:
$$\partial_tU = L_{\gamma,\tau}U + M_\tau(U), \tag{6}$$
where $L_{\gamma,\tau}$ is the linear, nonlocal operator
$$L_{\gamma,\tau} = \begin{pmatrix} 0 & 1 & 0 \\ -\tau^2(1+2\gamma) & 0 & \gamma\tau^2(\delta^1+\delta^{-1}) \\ 0 & 0 & \partial_v \end{pmatrix},$$
and $M_\tau(U) = \tau^2(0, g(x), 0)^T$,
where $g(x) = x - V'(x) = ax^2 + bx^3 + \cdots = O(x^2)$ as $x \to 0$. Moreover, we require the boundary condition $X(t,0) = x(t)$. Observe that (6) is somewhat more general than the original Eq. (2), if we allow $g$ to depend not only on $x$, but on $\xi$, $X$ as well. In that case, the coupling could be a smooth nonlinear function as indicated in the introduction. We introduce Banach spaces $\mathbb H$ and $\mathbb D$ for $U(v) = (x,\xi,X(v))^T$,
$$\mathbb H = \mathbb R^2\times C^0[-1,1], \qquad \mathbb D = \{U \in \mathbb R^2\times C^1[-1,1] : X(0) = x\}, \tag{7}$$
with the usual maximum norms. The operator $L_{\gamma,\tau}$ then maps $\mathbb D$ into $\mathbb H$ continuously. The nonlinearity $M_\tau$ is supposed to satisfy
$$M_\tau \in C^k(\mathbb D,\mathbb D),\ k \ge 1, \quad \text{and} \quad \|M_\tau(U)\|_{\mathbb D} \le c(\rho)\|U\|_{\mathbb D}^2 \tag{8}$$
for all $U \in \mathbb D$ with $\|U\|_{\mathbb D} \le \rho$; $\rho$ being an arbitrary positive constant. In our particular case $g \in C^2(\Omega)$ suffices for the validity of the assumption on $M_\tau$; $\Omega$ denotes an open neighborhood of $0 \in \mathbb R$. It is obvious that $L_{\gamma,\tau}$, acting in $\mathbb H$ with domain $\mathbb D$, has a compact resolvent in $\mathbb H$. Moreover, $L_{\gamma,\tau}$ and $M_\tau$ both anticommute with the reflexion $S$ in $\mathbb H$, given by
$$S(x,\xi,X)^T = (x,-\xi,X\circ s)^T, \tag{9}$$
where $X\circ s(v) = X(-v)$. Therefore, (6) is reversible. Although (6) is ill-posed as an initial value problem, it is possible to construct, nevertheless, solutions bounded for all $t \in \mathbb R$. Using a proper extension of certain reduction methods for quasilinear elliptic systems (cf. [11,16,20]) one is able to reduce (6) to a finite dimensional system of ordinary differential equations, which is reversible and has the property to contain all bounded solutions which are close to the trivial solution $U = 0$. The dimension of this reduced system will depend on the coupling parameter $\gamma$ and on the delay-advance parameter $\tau$. This dependence of the dimension as a function of $(\gamma,\tau)$ is detailed in the next section.

3. The Spectrum of $L_{\gamma,\tau}$

To determine the spectrum $\Sigma \equiv \Sigma L_{\gamma,\tau}$ of $L_{\gamma,\tau}$, the resolvent equation
$$(\lambda I - L_{\gamma,\tau})U = F \tag{10}$$
has to be solved for any given $F = (f_0, f_1, F_2)^T \in \mathbb H$, with $\lambda \in \mathbb C$, and $U = (x,\xi,X)^T \in \mathbb D$. This is possible provided that $N(\lambda;\gamma,\tau) \ne 0$, where
$$N(\lambda;\gamma,\tau) = -\lambda^2 - \tau^2(1+2\gamma) + \gamma\tau^2(e^\lambda + e^{-\lambda}). \tag{11}$$
Indeed, we obtain
$$x = -[N(\lambda;\gamma,\tau)]^{-1}(\lambda f_0 + f_1 + \gamma\tau^2\tilde f_\lambda), \tag{12}$$
$$\xi = -[N(\lambda;\gamma,\tau)]^{-1}\{[\lambda^2 + N(\lambda;\gamma,\tau)]f_0 + \lambda f_1 + \gamma\tau^2\lambda\tilde f_\lambda\}, \tag{13}$$
$$X(v) = e^{\lambda v}x - \int_0^v e^{\lambda(v-s)}F_2(s)\,ds, \tag{14}$$
Travelling Waves in Chain of Coupled Nonlinear Oscillators
443
with feλ =
Z
1
[−eλ(1−s) F2 (s) + e−λ(1−s) F2 (−s)]ds.
0
Since $N(\lambda; \gamma, \tau)$ is an entire function of $\lambda$ for every $(\gamma, \tau) \in \mathbb{R}^2_+$, the spectrum $\Sigma L_{\gamma,\tau}$ consists of isolated eigenvalues $\lambda$. They are roots of $N(\lambda; \gamma, \tau)$, and thus have finite multiplicities. Note that $L_{\gamma,\tau}$ is real and that $SL_{\gamma,\tau} + L_{\gamma,\tau}S = 0$ holds. $\Sigma L_{\gamma,\tau}$ is then invariant under $\lambda \mapsto \bar\lambda$ and $\lambda \mapsto -\lambda$, that is, under reflexion on the real and the imaginary axis in $\mathbb{C}$. Thus, we can restrict the following considerations to $\lambda = p + iq$ with nonnegative $p$ and $q$. The central part $\Sigma_0 \equiv \Sigma_0 L_{\gamma,\tau} = \Sigma L_{\gamma,\tau} \cap i\mathbb{R}$ of the spectrum is determined by $N(iq; \gamma, \tau) = 0$, $q \in \mathbb{R}$, i.e.
$$q^2 + 2\gamma\tau^2\cos q - \tau^2(1 + 2\gamma) = 0. \tag{15}$$
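Equation (15) is easy to explore numerically. The following is a minimal sketch, not part of the original argument: the function name `central_roots`, the grid step and the sample parameter values are illustrative assumptions. It locates the positive roots $q$ of (15) by sign changes followed by bisection, so roots of even multiplicity (no sign change) are deliberately not detected.

```python
import math


def central_roots(gamma, tau, step=1e-3):
    """Positive simple roots q of (15), found by sign changes and bisection.

    Hypothetical helper for illustration; double roots are not detected
    because the left-hand side of (15) does not change sign there.
    """
    def lhs(q):
        return q * q + 2 * gamma * tau**2 * math.cos(q) - tau**2 * (1 + 2 * gamma)

    # Since |cos q| <= 1, any real root satisfies q^2 <= tau^2 (1 + 4 gamma).
    q_max = tau * math.sqrt(1 + 4 * gamma) + step
    roots = []
    for i in range(int(q_max / step) + 1):
        a, b = i * step, (i + 1) * step
        if lhs(a) * lhs(b) < 0:
            for _ in range(60):  # plain bisection on the bracketing cell
                m = 0.5 * (a + b)
                if lhs(a) * lhs(m) <= 0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots


# Weak coupling gives one root; a stronger coupling gives several.
print(len(central_roots(0.1, 1.0)), len(central_roots(30.0, 1.0)))
```

The a priori bound on $q$ used for the search window follows directly from (15) and keeps the scan finite.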
For eigenvalues of higher multiplicity we have to solve in addition
$$q - \gamma\tau^2\sin q = 0 \tag{16}$$
if the multiplicity is at least two. For triple eigenvalues
$$1 - \gamma\tau^2\cos q = 0 \tag{17}$$
has to hold also. In the parameter space $(\gamma, \tau) \in \mathbb{R}^2_+$, the set DE, for which there are double eigenvalues on $\Sigma_0$, consists of a sequence of curves which we call "DEC". They are parametrized by $q = x \in \mathbb{R}_+$ as follows:
$$(\gamma, \tau)(x): \quad \tau^2 = x^2 - 2x\tan(x/2), \qquad \gamma = \tau^{-2}\,x/\sin x. \tag{18}$$
For triple eigenvalues it follows in addition
$$x = \tan x. \tag{19}$$
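The parametrization (18) can be checked against (15) and (16) directly. The sketch below is illustrative only; the names `dec_point` and `residuals` and the sample value $x = 7$ are assumptions made for the example. It builds a point of a DEC from $x$ and evaluates the left-hand sides of (15), (16) and (17) at $q = x$.

```python
import math


def dec_point(x):
    """Return (gamma, tau) on a DEC for the parameter x = q, following (18)."""
    tau_sq = x * x - 2 * x * math.tan(x / 2)
    if math.sin(x) <= 0 or tau_sq <= 0:
        raise ValueError("x does not give gamma > 0 and tau^2 > 0")
    gamma = x / (tau_sq * math.sin(x))
    return gamma, math.sqrt(tau_sq)


def residuals(q, gamma, tau):
    """Left-hand sides of (15), (16) and (17) at q."""
    r15 = q * q + 2 * gamma * tau**2 * math.cos(q) - tau**2 * (1 + 2 * gamma)
    r16 = q - gamma * tau**2 * math.sin(q)
    r17 = 1 - gamma * tau**2 * math.cos(q)
    return r15, r16, r17


gamma, tau = dec_point(7.0)
print(residuals(7.0, gamma, tau))
```

For a generic $x$ the first two residuals vanish up to rounding while the third does not, which is the double (but not triple) eigenvalue situation.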
Since $d(\gamma, \tau)(x)/dx$ vanishes on DEC if (19) holds, the triple eigenvalues appear as cusps on DEC. There are no eigenvalues of multiplicity higher than 3 on $\Sigma_0$. In the following lemma we describe the character of $\Sigma_0$ on the bifurcation curves DEC. To conform with Fig. 1, we restrict this description to $\Sigma_0^+$, i.e. those eigenvalues on $\Sigma_0$ having positive imaginary part. Due to reversibility the rest of $\Sigma_0$ is obtained by a simple reflexion on the real axis.

Lemma 1.
(i) For each $(\gamma, \tau) \in \mathbb{R}^2_+$, there exists $p_0 > 0$ such that all $\lambda \in \Sigma L_{\gamma,\tau} \setminus \Sigma_0$ satisfy $|\mathrm{Re}\,\lambda| \ge p_0$.
(ii) Let $\lambda = p + iq \in \Sigma \setminus \Sigma_0$, then
$$|q| \le \tau + 2\sqrt{\gamma\tau^2 + 4e^{-2}}\,\cosh(p/2) \tag{20}$$
holds.
(iii) Given any DEC. It contains exactly one cusp point, $(\gamma^*, \tau^*)$ say, corresponding to a triple eigenvalue $iq^*$ of $\Sigma_0^+$. Moreover, $\Sigma_0 L_{\gamma^*,\tau^*} = \{\pm iq^*\}$. For all other $(\gamma, \tau)$ on that DEC, $\Sigma_0^+ L_{\gamma,\tau}$ contains either two or one double eigenvalue. The first case happens where two different DEC's intersect. If $(\gamma, \tau)$ does not belong to any DEC, $\Sigma_0^+ L_{\gamma,\tau}$ consists of simple eigenvalues.
(iv) For each fixed $\tau \in (0, 2\pi]$, there exists a strictly increasing sequence $(\gamma_j^*(\tau))$, $0 < \gamma_1^* < \ldots$, such that $\Sigma_0^+ L_{\gamma_j^*,\tau}$ possesses a double eigenvalue $+iq_j^*$, which has largest modulus among all eigenvalues. All other eigenvalues in $\Sigma_0^+ L_{\gamma_j^*,\tau}$ are simple. If $\gamma \in (\gamma_j^*(\tau), \gamma_{j+1}^*(\tau))$, $\gamma_0^* \equiv 0$, $\Sigma_0^+ L_{\gamma,\tau}$ consists of $2j + 1$ simple eigenvalues.
For larger values of $\tau$, the situation of $\Sigma_0^+ L_{\gamma,\tau}$ is described in Fig. 1.

Fig. 1. Pure imaginary eigenvalues of $L_{\gamma,\tau}$ (upper half). Dots are simple eigenvalues. A simple cross and a double cross denote a double and a triple eigenvalue, respectively
Proof. Ad (i). Let us denote by $\lambda_n = p_n + iq_n$ the roots $\lambda$ of $N(\lambda; \gamma, \tau) = 0$. Assuming $p_n \neq 0$, and $p_n \to 0$ as $n \to \infty$, we have
$$q_n^2 + 2\gamma\tau^2\cos q_n - \tau^2(1 + 2\gamma) = \varepsilon_n \to 0, \qquad q_n - \gamma\tau^2\sin q_n = \varepsilon_n' \to 0.$$
Hence, $q_n$ is bounded, and we can extract a subsequence $q_{k_n}$ converging towards $q_*$, and $q_*$ satisfies
$$q_*^2 + 2\gamma\tau^2\cos q_* - \tau^2(1 + 2\gamma) = 0, \qquad q_* - \gamma\tau^2\sin q_* = 0.$$
This means that $q_*$ is a (double) root of $N(iq; \gamma, \tau)$, contradicting the isolatedness of the roots. This completes the proof of (i).

Ad (ii). Denoting by $\lambda = p + iq$ the roots $\lambda$ of $N(\lambda; \gamma, \tau) = 0$, we have
$$p^2 - q^2 = 2\gamma\tau^2\cosh p\cos q - \tau^2(1 + 2\gamma), \qquad pq = \gamma\tau^2\sinh p\sin q.$$
It follows that
$$q^2 \le \tau^2 + p^2 + 4\gamma\tau^2\cosh^2(p/2) \le \tau^2 + 4(\gamma\tau^2 + 4e^{-2})\cosh^2(p/2),$$
and assertion (ii) is immediate.

Ad (iii). Since $\gamma \in \mathbb{R}_+$, the components of DE are defined by the inequalities $\sin x > 0$, $x - 2\tan(x/2) > 0$. Hence, to the $n$th component belongs the interval $I_n = (2\pi n, x_n)$, $n \in \mathbb{N}_+$, where $x_n$ is defined by $2n\pi < x_n < (2n + 1)\pi$, $x_n = 2\tan(x_n/2)$. We have $\tau(2n\pi) = 2n\pi$, $\tau(x_n) = 0$ and $\gamma(x) \to +\infty$ as $x \to 2n\pi^-$, or $x \to x_n^+$. There is a unique $x_n^*$ in $I_n$ satisfying (19). DEC is a smooth curve in $I_n \setminus \{x_n^*\}$. It is easy to check that $\gamma(x)$ resp. $\tau(x)$ decreases resp. increases on $(2n\pi, x_n^*)$ and reverses its type of growth on $(x_n^*, x_n)$. The cusp points $[\gamma(x_n^*), \tau(x_n^*)]$ are the points of the parameter space where $\Sigma_0$ contains triple eigenvalues. These are $\pm ix_n^*$. Moreover, the coordinates of the cusp points satisfy
$$\gamma = (1 + \tau)\tau^{-2}, \qquad \cos\sqrt{2\tau + \tau^2} = (1 + \tau)^{-1}, \qquad \sin\sqrt{2\tau + \tau^2} > 0. \tag{21}$$
For a given $(\gamma, \tau)$, the double eigenvalues $iq$ are solutions of
$$[\tau^2(1 + 2\gamma) - q^2]^2 + 4[q^2 - \gamma^2\tau^4] = 0, \tag{22}$$
which is obtained after elimination of $\sin q$ and $\cos q$ in (15), (16). This shows that we cannot have more than 2 double eigenvalues in $\Sigma_0^+ L_{\gamma,\tau}$. The case when $q$ is a double root of (22) corresponds to $(\gamma, \tau)$ satisfying (21), i.e. $+iq$ is a triple eigenvalue of $L_{\gamma,\tau}$. In such a case, $\pm iq$ are the only pure imaginary eigenvalues of $L_{\gamma,\tau}$, since for fixed $\tau = \tau(x_n^*)$, we know by a continuity argument starting with $\gamma = 0$ that for $\gamma < \gamma(x_n^*)$ there is only one pair of simple pure imaginary eigenvalues in $\Sigma_0 L_{\gamma,\tau}$, or equivalently one positive simple solution $q$ of (15). We conclude that for every $(\gamma, \tau) \in \mathbb{R}^2_+$, $\Sigma_0$ consists of simple eigenvalues if $(\gamma, \tau)$ does not belong to DE. Otherwise $\Sigma_0^+$ contains exactly one double eigenvalue if $(\gamma, \tau)$ does not belong to the intersection of two components of DE, or is not a cusp point. Intersection points of two components of DEC give two pairs of double eigenvalues on $\Sigma_0$.
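The cusp relations (21) and the eliminant (22) can be verified numerically on the first component of DE. This is only a sketch under stated assumptions: the helper `bisect` and the choice of the bracket $(2\pi, 5\pi/2)$ for the first solution of (19) beyond $2\pi$ are illustrative, and $x = \tan x$ is rewritten as $x\cos x - \sin x = 0$ to avoid the pole of the tangent.

```python
import math


def bisect(f, a, b, iters=200):
    """Plain bisection; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(iters):
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)


# Solution of (19) in the first component, written without tan.
x_star = bisect(lambda x: x * math.cos(x) - math.sin(x), 2 * math.pi, 2.5 * math.pi)

# Cusp coordinates from the parametrization (18).
tau = math.sqrt(x_star**2 - 2 * x_star * math.tan(x_star / 2))
gamma = x_star / (tau**2 * math.sin(x_star))

# Residuals of the relations (21) and of (22) at q = x_star.
r21_gamma = gamma - (1 + tau) / tau**2
r21_cos = math.cos(math.sqrt(2 * tau + tau**2)) - 1 / (1 + tau)
r22 = (tau**2 * (1 + 2 * gamma) - x_star**2) ** 2 + 4 * (x_star**2 - gamma**2 * tau**4)
print(x_star, gamma, tau, r21_gamma, r21_cos, r22)
```

All three residuals should vanish up to rounding, and $\sqrt{2\tau + \tau^2}$ coincides with $x^*$ at the cusp.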
Ad (iv). Set $h_\tau(q) = (1 - \cos q)(q^2 - \tau^2)^{-1}$, where we assume $q > \tau$. One finds the set $(\gamma, \tau)$ of (18) by looking for $q, \gamma, \tau$ such that $h_\tau(q) = (2\gamma\tau^2)^{-1}$, $dh_\tau(q)/dq = 0$. It is easy to show, for $0 < \tau < 2\pi$, that $h_\tau$ has its minimum value 0 for $q = 2n\pi$, $n = 1, 2, \ldots$, and maxima for one value of $q$ in every interval between these minima, the values of the maxima decaying as $q$ increases. Assertion (iv) of Lemma 1 then follows directly. Notice that for $\tau > 2\pi$ the function $h_\tau$ may have one minimum and one maximum before the first minimum of the form $2n\pi$. The case when $h_\tau$ has a horizontal inflexion point gives the cusp point. This completes the proof. □

Note that the spectrum of $L_{\gamma,\tau}$ is not sectorial (see part (ii) of the lemma). This implies that we cannot use the traditional reduction tools based on estimates of the resolvent operator $(iqI - L_{\gamma,\tau})^{-1}$ of order $1/|q|$ for $|q|$ large. Indeed, such an estimate implies the spectrum to be sectorial. Therefore, we have to solve subsequently the affine linear hyperbolic system (23) ad hoc.
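The argument for (iv) can be illustrated numerically. The sketch below assumes, as one reading of the proof, that the critical values $\gamma_j^*(\tau)$ correspond to the successive maxima of $h_\tau$ through $h_\tau(q) = (2\gamma\tau^2)^{-1}$; the function names, the grid search and the sample value $\tau = 1$ are illustrative choices, not taken from the paper.

```python
import math


def h(q, tau):
    """h_tau(q) = (1 - cos q) / (q^2 - tau^2), for q > tau."""
    return (1 - math.cos(q)) / (q * q - tau * tau)


def critical_gammas(tau, count=3, samples=20000):
    """Approximate gamma_j^* = 1 / (2 tau^2 max h_tau), the maximum being taken
    on (2 pi j, 2 pi (j + 1)) by a simple grid search (requires tau < 2 pi)."""
    gammas = []
    for j in range(1, count + 1):
        lo, hi = 2 * math.pi * j, 2 * math.pi * (j + 1)
        h_max = max(h(lo + (hi - lo) * i / samples, tau) for i in range(1, samples))
        gammas.append(1 / (2 * tau**2 * h_max))
    return gammas


print(critical_gammas(1.0))
```

Because the maxima of $h_\tau$ decay with $q$, the resulting values of $\gamma$ increase with $j$, in line with the strictly increasing sequence of the lemma.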
4. Weak Coupling and Periodic Waves

The bifurcations in system (6) will occur when the cardinality of $\Sigma_0 L_{\gamma,\tau}$ changes. Thus, the set $\{[\tau(x), \gamma(x)]\}$ described in Fig. 1 is the critical set where bifurcations take place. Let $\Delta_0$ denote the set of $(\gamma, \tau)$ where $\Sigma_0 L_{\gamma,\tau}$ contains only one pair of simple eigenvalues $\pm iq_1$. In this section the case $(\gamma, \tau) \in \Delta_0$ is treated. We separate (6) into a central and a hyperbolic part due to the separation $\Sigma = \Sigma_0 + \Sigma_h$ of $\Sigma L_{\gamma,\tau}$. Then, we use the Reduction Theorem 3 in [20] to justify the application of a center manifold argument. It will follow that all small nontrivial solutions of (6) are periodic in this case. Introduce the spaces $E_j^\alpha(Z)$ for $\alpha \in \mathbb{R}$, $j \in \mathbb{N}$, with norms $\|f\|_j$, and similarly the vector-valued version $\mathbf{E}_j^\alpha(Z)$, as follows:
$$E_j^\alpha(Z) = \Big\{ f \in C^j(\mathbb{R}, Z) \;\Big|\; \|f\|_j = \max_{0 \le k \le j}\,\sup_{t \in \mathbb{R}} e^{-\alpha|t|}\,|D^k f(t)| < \infty \Big\}.$$
For $\alpha > 0$, these Banach spaces consist of functions which may grow exponentially at infinity. Sometimes we need exponentially decaying functions, which will be denoted by $E^{-\alpha}(Z)$. If necessary, we use weights $\cosh(\alpha t)$ instead of $\exp(\alpha|t|)$. The eigenprojection $P_1$ on the two-dimensional subspace spanned by eigenvectors belonging to $\pm iq_1$ is computed as the sum of the residues for the 3 components (12), (13), (14) of the solution of the resolvent Eq. (10). This leads to the following result.

Lemma 2. Assume $(\gamma, \tau) \in \Delta_0$, then $\Sigma_0 L_{\gamma,\tau} = \{\pm iq_1\}$, and the eigenprojection $P_1$ is defined in $\mathbb{H}$ by
$$(P_1U)_0 = a_1(U)/N_1, \qquad (P_1U)_1 = q_1 b_1(U)/N_1, \qquad (P_1U)_2 = \frac{1}{N_1}\,[a_1(U)\cos q_1 v + b_1(U)\sin q_1 v],$$
where $U = (x, \xi, X)^T \in \mathbb{H}$,
$$a_1(U) = q_1 x - \gamma\tau^2\sigma(U), \qquad b_1(U) = \xi - \gamma\tau^2\rho(U),$$
$$\sigma(U) = \int_0^1 \sin q_1(1 - s)\,[X(s) + X(-s)]\,ds, \qquad \rho(U) = \int_0^1 \cos q_1(1 - s)\,[X(s) - X(-s)]\,ds,$$
$$N_1 = q_1 - \gamma\tau^2\sin q_1.$$

To prepare application of [20] we have to consider the affine linear system associated with (6) for the hyperbolic part. Set $Q_h = I - P_1$, $U_h = Q_hU$, then the equation
$$\partial_t U_h = L_{\gamma,\tau}U_h + Q_hF \tag{23}$$
has to be solved for $U_h \in \mathbf{E}_0^\alpha(\mathbb{D}_h) \cap \mathbf{E}_1^\alpha(\mathbb{H}_h)$ and for each $\alpha \ge 0$. We have $F = (0, f, 0)^T$, $f \in E_0^\alpha(\mathbb{R})$, $a_1(U_h) = b_1(U_h) = 0$, and thus
$$q_1 x_h = \gamma\tau^2\sigma(U_h), \tag{24}$$
$$\xi_h = \gamma\tau^2\rho(U_h), \tag{25}$$
$$Q_hF \overset{\mathrm{def}}{=} F_h = \frac{1}{N_1}\,(0, -\gamma\tau^2\sin q_1, -\sin q_1 v)^T f. \tag{26}$$
Furthermore, $X_h$ is given as
$$X_h(t, v) = \phi(t + v) - \frac{1}{N_1}\int_0^t f(s)\sin q_1(t + v - s)\,ds,$$
and we have to satisfy
$$x_h(t) = X_h(t, 0) = \frac{\gamma\tau^2}{q_1}\int_0^1 \sin q_1(1 - s)\,[X_h(t, s) + X_h(t, -s)]\,ds,$$
hence $X_h$ may now be written as follows:
$$X_h(t, v) = x_h(t + v) + \frac{1}{N_1}\int_t^{t+v} f(s)\sin q_1(t + v - s)\,ds \tag{27}$$
$$\phantom{X_h(t, v)} = x_h(t + v) + \frac{1}{N_1}\int_0^v f(t + v - s)\sin(q_1 s)\,ds, \tag{28}$$
which leads to
$$\frac{\partial}{\partial v}X_h(t, v) = \dot x_h(t + v) + \frac{q_1}{N_1}\int_0^v f(t + v - s)\cos(q_1 s)\,ds.$$
Hence, there exists a constant $c$ independent of $\alpha \in (-\alpha_0, \alpha_0)$ such that
$$\|X_h\|_{E_0^\alpha(C^1[-1,+1])} \le \|x_h\|_{E_1^\alpha} + c\,\|f\|_{E_0^\alpha}. \tag{29}$$
Now we take the Fourier transform of (23). For being able to do it, we first assume $\alpha < 0$ (i.e. the function $f$ and the unknown $U_h$ decay exponentially at infinity). We then obtain an expression for $\hat U_h$ analytic with respect to $k \in B_\alpha = \{k \in \mathbb{C};\ |\mathrm{Im}\,k| < \alpha\}$, taking values in $\mathbb{D}_h$, and which is solution of
$$(ikI - L_{\gamma,\tau})\hat U_h(k) = Q_h\hat F(k). \tag{30}$$
For $\alpha > 0$, we use the distributions in $S_\alpha'$ (see Appendix 1), and for $\alpha = 0$ the tempered distributions in $S'$. Henceforth, set $S' = S_0'$. In such spaces, we cannot use the formula we established in Sect. 3 for the resolvent, since we have no right to divide by $N(ik; \gamma, \tau)$ (see Proposition 4 of Appendix 1), contrary to the case when $\alpha < 0$, where Fourier transforms are analytic. For any $\alpha$, using properties shown in Appendix 1 for $\alpha > 0$, the Fourier transform of $X_h(\cdot, v)$, given by (28), yields
$$\hat\xi_h(k) = ik\,\hat x_h(k),$$
$$\hat X_h(k, v) = e^{ikv}\hat x_h(k) + \frac{\hat f(k)}{N_1}\int_0^v e^{ik(v-s)}\sin(q_1 s)\,ds,$$
$$N(ik; \gamma, \tau)\,\hat x_h(k) = -\frac{\gamma\tau^2}{N_1}\,\hat f(k)\left[-\sin q_1 + 2q_1(q_1^2 - k^2)^{-1}(\cos k - \cos q_1)\right],$$
and, after noticing that $N(ik; \gamma, \tau) = 2\gamma\tau^2(\cos k - \cos q_1) + k^2 - q_1^2$, and $N_1 = q_1 - \gamma\tau^2\sin q_1$, this leads to
$$N(ik; \gamma, \tau)\,[\hat x_h(k) + \hat H(k; \gamma, \tau)\hat f(k)] = 0, \tag{31}$$
where $\hat H$ is defined by the identity
$$\frac{1}{N(ik; \gamma, \tau)} = \frac{q_1}{N_1(k^2 - q_1^2)} + \hat H(k; \gamma, \tau).$$
Now, via Proposition 4 of Appendix 1, Eq. (31) leads to
$$\hat x_h(k) + \hat H(k; \gamma, \tau)\hat f(k) = \begin{cases} a_+\delta_{q_1} + a_-\delta_{-q_1} & \text{in } S_\alpha',\ \alpha \ge 0, \\ 0 & \text{for } \alpha < 0, \end{cases}$$
with $a_\pm$ to be determined. We notice that $k \mapsto \hat H$ is analytic in the strip $B_{p_0}$, tending to 0 as $1/k^2$ at infinity. So we have the following

Lemma 3. The function $k \mapsto \hat H(k; \gamma, \tau)$ is the Fourier transform of a function $t \mapsto H(t; \gamma, \tau) \in H^1_{-\delta}$, for any $\delta < p_0$, where $H^1_{-\delta}$ is the space of $g$ such that $t \mapsto g(t)e^{\delta|t|} \in H^1(\mathbb{R})$. Moreover, for $f \in E_0^\alpha$, and $\alpha \in (-\alpha_0, \alpha_0)$, $\alpha_0 < \delta$, then $H(\cdot; \gamma, \tau) * f \in E_1^\alpha$ and there is a constant $C$ independent of $\alpha$ such that
$$\|H(\cdot; \gamma, \tau) * f\|_{E_1^\alpha} \le C\,\|f\|_{E_0^\alpha}.$$
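The defining identity of $\hat H$ subtracts from $1/N(ik; \gamma, \tau)$ exactly its simple poles at $k = \pm q_1$, which is what makes $\hat H$ regular on the real axis. A small numerical sketch of this cancellation follows; the parameter values $\gamma = 0.1$, $\tau = 1$ are an illustrative choice assumed to lie in $\Delta_0$, and the names `n_of_k`, `find_q1`, `h_hat` are hypothetical.

```python
import math

GAMMA, TAU = 0.1, 1.0  # illustrative values, assumed to lie in Delta_0


def n_of_k(k):
    """N(ik; gamma, tau) for real k, obtained from (11) with lambda = ik."""
    return k * k - TAU**2 * (1 + 2 * GAMMA) + 2 * GAMMA * TAU**2 * math.cos(k)


def find_q1(a=0.5, b=2.0, iters=200):
    """Bisection for the positive root q1 of (15); n_of_k changes sign on [a, b]."""
    for _ in range(iters):
        m = 0.5 * (a + b)
        if n_of_k(a) * n_of_k(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)


Q1 = find_q1()
N1 = Q1 - GAMMA * TAU**2 * math.sin(Q1)


def h_hat(k):
    """H^(k) = 1/N(ik) - q1 / (N1 (k^2 - q1^2)), for real k different from +-q1."""
    return 1.0 / n_of_k(k) - Q1 / (N1 * (k * k - Q1 * Q1))


eps = 1e-3
print(h_hat(Q1 - eps), h_hat(Q1 + eps), 100.0**2 * h_hat(100.0))
```

The values on both sides of $q_1$ stay bounded and close to each other, and $k^2\hat H(k)$ remains bounded for large $k$, consistent with the $1/k^2$ decay noted in the text.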
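Proof. Let us suppress the parameter $(\gamma, \tau)$ for the moment in $H$ and $\hat H$. For $|\mathrm{Im}\,k| \le \delta < p_0$, we have
$$(1 + |k|^2)\,|\hat H(k)| \le \text{const},$$
hence $k \mapsto (1 + |k|^2)^{1/2}\hat H(k) \in L^2(\mathbb{R})$ holds. Now, for $0 \le \delta < p_0$ we have
$$e^{\delta t}H(t) = \frac{1}{2\pi}\,e^{\delta t}\int_{\mathbb{R}} e^{it(i\delta + s)}\hat H(i\delta + s)\,ds = \frac{1}{2\pi}\int_{\mathbb{R}} e^{its}\hat H(i\delta + s)\,ds,$$
which implies (by Plancherel)
$$\|e^{\delta(\cdot)}H(\cdot)\|_{L^2} = \frac{1}{\sqrt{2\pi}}\,\|\hat H(i\delta + \cdot)\|_{L^2}.$$
Moreover,
$$\frac{d}{dt}[e^{\delta t}H(t)] = \frac{1}{2\pi}\int_{\mathbb{R}} is\,e^{its}\hat H(i\delta + s)\,ds,$$
hence (by Plancherel)
$$\Big\|\frac{d}{dt}[e^{\delta(\cdot)}H(\cdot)]\Big\|_{L^2} = \frac{1}{\sqrt{2\pi}}\,\|i(\cdot)\hat H(i\delta + \cdot)\|_{L^2},$$
i.e., doing the same estimate with $-\delta$, we get $t \mapsto e^{\delta|t|}H(t) \in H^1(\mathbb{R})$. Now consider, for $-\delta < \alpha < \delta$,
$$\|\dot H * f\|_{E_0^\alpha} = \sup_{t \in \mathbb{R}} e^{-\alpha|t|}\Big|\int_{\mathbb{R}} \dot H(t - \tau)f(\tau)\,d\tau\Big|$$
$$\le \|f\|_{E_0^\alpha}\,\sup_{t \in \mathbb{R}}\int_{\mathbb{R}} e^{-\alpha|t| + \alpha|\tau| - \delta|t - \tau|}\,|e^{\delta|t - \tau|}\dot H(t - \tau)|\,d\tau$$
$$\le \|f\|_{E_0^\alpha}\,\|e^{\delta|\cdot|}\dot H(\cdot)\|_{L^2}\,\sup_{t \in \mathbb{R}}\Big(\int_{\mathbb{R}} e^{2[\alpha(|\tau| - |t|) - \delta|t - \tau|]}\,d\tau\Big)^{1/2} \le \frac{c}{\sqrt{\delta - |\alpha|}}\,\|f\|_{E_0^\alpha}.$$
This estimate completes Proposition 5 of Appendix 1, and the lemma is proved. □

Now, let us define $\tilde U_h = (\tilde x_h, \tilde\xi_h, \tilde X_h)$ with
$$\tilde x_h(t) = -[H(\cdot; \gamma, \tau) * f](t), \qquad \tilde\xi_h(t) = \frac{d}{dt}\tilde x_h(t),$$
$$\tilde X_h(t, v) = \tilde x_h(t + v) + \frac{1}{N_1}\int_0^v f(t + v - s)\sin(q_1 s)\,ds.$$
Due to (29), it is clear that $\tilde U_h \in \mathbf{E}_0^\alpha(\mathbb{D})$ for $\alpha \in (-\alpha_0, \alpha_0)$, with an estimate
$$\|\tilde U_h\|_{\mathbf{E}_0^\alpha(\mathbb{D}) \cap \mathbf{E}_1^\alpha(\mathbb{H})} \le C(\alpha)\,\|f\|_{E_0^\alpha} \tag{32}$$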
and $C$ is bounded on $(-\alpha_0, \alpha_0)$. Moreover $\tilde U_h$ is a solution of (23). For showing this we first notice that the first and third equation of (23) are easily verified by construction. Now, the Fourier transform of the second equation is just identity (31) for $\hat{\tilde x}_h$. It results that the second equation of (23) is satisfied.

For $\alpha < 0$, we have by construction $P_1\hat{\tilde U}_h = 0$, hence $P_1\tilde U_h = 0$. Let us show that for $\alpha \ge 0$, $P_1\tilde U_h = 0$ also holds, since this implies formally exactly the same computations (see Lemma 2 for the definition of $P_1$). Indeed, it is sufficient to show that $a_1(\tilde U_h) = 0$, because this implies $b_1(\tilde U_h) = 0$ by differentiating $\sigma(\tilde U_h)$ with respect to $t$ and integrating by parts. Taking the Fourier transform of $a_1(\tilde U_h)$ (analytic in a strip for $\alpha < 0$, in $S'$ for $\alpha = 0$, in $S_\alpha'$ for $\alpha > 0$), we obtain, due to the properties shown in Proposition 2 of Appendix 1,
$$\mathcal{F}\Big(\int_0^1 \sin q_1(1 - v)\int_0^v [f(t + u) + f(t - u)]\sin q_1(v - u)\,du\,dv\Big)(k) = \hat f(k)\int_0^1 \sin q_1(1 - v)\int_0^v 2\cos ku\,\sin q_1(v - u)\,du\,dv,$$
which is the basic identity for showing that $\mathcal{F}[a_1(\tilde U_h)] \in S_\alpha'$ is proportional to
$$\hat{\tilde x}_h(k)\Big[1 - \frac{2\gamma\tau^2}{q_1}\int_0^1 \sin q_1(1 - v)\cos kv\,dv\Big] - \hat f(k)\,2\gamma\tau^2 N_1^{-1}\int_0^1 \sin q_1(1 - v)\,\frac{\cos kv - \cos q_1 v}{q_1^2 - k^2}\,dv,$$
now with $\hat{\tilde x}_h(k) = -\hat H(k; \gamma, \tau)\hat f(k)$. It results that $\hat f(k)$ is a factor of a quantity independent of the choice of space for $f$, i.e. independent of $\alpha$. Since we know that $a_1(\tilde U_h) = 0$ for $\alpha < 0$, the independence with respect to $\alpha$ shows that $a_1(\tilde U_h) = 0$ for $\alpha \ge 0$ also, and thus $P_1\tilde U_h = 0$ holds for all $\alpha \in (-\alpha_0, \alpha_0)$.

For $\alpha \ge 0$, the full solution $U_h$ of (31) is obtained by adding to $\tilde x_h$ a linear combination of the form $b_+\exp(iq_1 t) + b_-\exp(-iq_1 t)$, with $b_\pm = a_\pm/2\pi$ (see Proposition 3 of Appendix 1). But, since $\tilde U_h \in \mathbf{E}_0^\alpha(\mathbb{D}_h)$, we conclude $P_1U_h = 0$ if and only if $b_\pm = 0$. Thus, we have finally

Lemma 4. Assume $f \in E_0^\alpha$, for $\alpha \in (-\alpha_0, \alpha_0)$, $\alpha_0 < \delta < p_0$, then the system (23) has a unique solution $U_h \in \mathbf{E}_0^\alpha(\mathbb{D}_h) \cap \mathbf{E}_1^\alpha(\mathbb{H}_h)$, and the linear map $E_0^\alpha(\mathbb{R}) \ni f \mapsto U_h \in \mathbf{E}_0^\alpha(\mathbb{D}) \cap \mathbf{E}_1^\alpha(\mathbb{H})$ is bounded uniformly in $\alpha \in (-\alpha_0, \alpha_0)$.

Thus, we have verified the assumptions of Theorem 3 in ([20], p. 133) with the special nonlinearity $M_\tau(U) = \tau^2(0, g(U), 0)^T$. Therefore, there exists a neighborhood $\Omega$ of 0 in $\mathbb{D}$ and, for each $(\gamma, \tau) \in \Delta_0$, a mapping $h \in C_b^k(\mathbb{D}_c; \mathbb{D}_h)$, where $\mathbb{D}_c = P_1\mathbb{D}$, $\mathbb{D}_h = Q_h\mathbb{D}$, with $h(0; \gamma, \tau) = 0$, $Dh(0; \gamma, \tau) = 0$, such that
(i) if $\tilde U_c: \mathbb{R} \to \mathbb{D}_c$ is any solution of
$$\partial_t U_c = L_{\gamma,\tau}U_c + P_1M_\tau[U_c + h(U_c; \gamma, \tau)] \tag{33}$$
with $\tilde U_c(t) \in \Omega_c$ for all $t \in \mathbb{R}$, then $\tilde U = \tilde U_c + h(\tilde U_c; \gamma, \tau)$ solves (6);
(ii) if $\tilde U: \mathbb{R} \to \mathbb{D}$ solves (6), and $\tilde U(t) \in \Omega$ for all $t \in \mathbb{R}$, then $\tilde U_h(t) = h(\tilde U_c(t); \gamma, \tau)$, $t \in \mathbb{R}$, holds, and $\tilde U_c(t)$ solves (33).
Theorem 5. For any $(\gamma, \tau) \in \Delta_0$, and for $U$ near the origin in $\mathbb{D}$, the system (6) reduces to a two-dimensional reversible smooth vector field. Moreover, the set of solutions near 0 of (6) constitutes a one-parameter family of periodic orbits, bifurcating from 0.

Corollary 6. For any $(\gamma, \tau) \in \Delta_0$, the set of solutions near 0 of Eq. (2) is a one-parameter family of periodic solutions, bifurcating from 0 (the parameter is the amplitude of the oscillations). It then results that, for $(\gamma, \tau) \in \Delta_0$, the only small-amplitude travelling waves of the original problem (1) belong to a family of time-periodic waves bifurcating from 0.

Proof of the theorem. Once we have reduced our problem to the two-dimensional reversible smooth vector field (33) for $U_c$, with a linear part having the simple pair of eigenvalues $\pm iq_1$, the result is known as the Devaney-Lyapunov theorem (see [4]). In fact, it is just a consequence of the implicit function theorem. □
5. Reduction Near the First Critical Curve

In this section we define $\Gamma_0' \subset \Gamma_0 = $ boundary of $\Delta_0$ as the set of parameters $(\gamma_0, \tau_0)$ such that $\Sigma_0 L_{\gamma_0,\tau_0} = \{\pm iq_1, \pm iq_0\}$, where $q_0$ and $q_1$ are positive and $\pm iq_1$ are simple, $\pm iq_0$ are double eigenvalues of $L_{\gamma_0,\tau_0}$. $\Gamma_0'$ is obviously dense in $\Gamma_0$. Thus, we are faced with the simplest possible bifurcation of our problem. Let us proceed as in the previous section. We have
$$N(iq_0; \gamma_0, \tau_0) = \partial_\lambda N(iq_0; \gamma_0, \tau_0) = N(iq_1; \gamma_0, \tau_0) = 0.$$
The eigenprojection $P_1$ on the two-dimensional subspace spanned by the eigenvectors belonging to $\pm iq_1$ was already given in the previous section. We compute the eigenprojection $P_0$ on the four-dimensional subspace spanned by the eigenvectors and generalized eigenvectors belonging to $\pm iq_0$. This projection is again given by the sum of the two coefficients of $(\lambda \pm iq_0)^{-1}$ in the Laurent expansion (see [10]) of the resolvent operator $(\lambda I - L_{\gamma_0,\tau_0})^{-1}$ near the double poles $\pm iq_0$. We obtain the following

Lemma 7. Assume $(\gamma_0, \tau_0) \in \Gamma_0'$, and $\Sigma_0 L_{\gamma_0,\tau_0} = \{\pm iq_0, \pm iq_1\}$, where $iq_0$ is the double eigenvalue, then the spectral projection $P_c$ on the six-dimensional subspace belonging to $\Sigma_0 L_{\gamma_0,\tau_0}$ is given as $P_c = P_0 + P_1$, where $P_0$ and $P_1$ are projections of rank 4 resp. 2, commuting with $L_{\gamma_0,\tau_0}$, such that $P_0P_1 = P_1P_0 = 0$. They are explicitly defined, for $U = (x, \xi, X)^T \in \mathbb{H}$, as follows:
$$(P_1U)_0 = N_1^{-1}a_1(U), \qquad (P_1U)_1 = q_1N_1^{-1}b_1(U),$$
$$(P_1U)_2(v) = N_1^{-1}[a_1(U)\cos q_1 v + b_1(U)\sin q_1 v],$$
$$(P_0U)_0 = -\frac{2q_0}{3N_0^2}\,a_0(U) - \frac{2}{N_0}\,c_0(U),$$
$$(P_0U)_1 = -\Big(\frac{2q_0^2}{3N_0^2} + \frac{2}{N_0}\Big)b_0(U) - \frac{2q_0\gamma_0\tau_0^2}{N_0}\,\hat\rho_0(U),$$
$$(P_0U)_2(v) = (P_0U)_0\cos q_0 v - \Big(\frac{2q_0}{3N_0^2}\,b_0(U) + \frac{2\gamma_0\tau_0^2}{N_0}\,\hat\rho_0(U)\Big)\sin q_0 v + \frac{2a_0(U)}{N_0}\,v\sin q_0 v - \frac{2b_0(U)}{N_0}\,v\cos q_0 v,$$
where
$$q_j^2 = \tau_0^2(1 + 2\gamma_0 - 2\gamma_0\cos q_j), \quad j = 0, 1, \qquad q_0 = \gamma_0\tau_0^2\sin q_0,$$
$$N_1 = q_1 - \gamma_0\tau_0^2\sin q_1 \neq 0, \qquad N_0 = \gamma_0\tau_0^2\cos q_0 - 1 \neq 0,$$
$$a_j(U) = q_j x - \gamma_0\tau_0^2\sigma_j(U), \qquad b_j(U) = \xi - \gamma_0\tau_0^2\rho_j(U), \quad j = 0, 1, \qquad c_0(U) = x - \gamma_0\tau_0^2\hat\sigma_0(U),$$
$$\sigma_j(U) = \int_0^1 \sin q_j(1 - s)\,[X(s) + X(-s)]\,ds, \quad j = 0, 1,$$
$$\rho_j(U) = \int_0^1 \cos q_j(1 - s)\,[X(s) - X(-s)]\,ds, \quad j = 0, 1,$$
$$\hat\sigma_0(U) = \int_0^1 (1 - s)\cos q_0(1 - s)\,[X(s) + X(-s)]\,ds,$$
$$\hat\rho_0(U) = \int_0^1 (1 - s)\sin q_0(1 - s)\,[X(s) - X(-s)]\,ds.$$
In $(P_0U)_2$ the term $v\sin q_0 v$ is paired with $a_0(U)$ and the term $v\cos q_0 v$ with $b_0(U)$, in agreement with the coefficient $B = -N_0^{-1}[b_0(U) + ia_0(U)]$ of the generalized eigenvector and with the expression of $(F_h)_2$ used below.
The reader can check easily that $P_0P_1 = P_1P_0 = 0$ follows from the 4 identities
$$q_0 - 2\gamma_0\tau_0^2\int_0^1 \sin q_0(1 - s)\cos(q_1 s)\,ds = 0,$$
$$1 - 2\gamma_0\tau_0^2\int_0^1 (1 - s)\cos q_0(1 - s)\cos(q_1 s)\,ds = 0,$$
$$q_1 - 2\gamma_0\tau_0^2\int_0^1 \cos q_0(1 - s)\sin(q_1 s)\,ds = 0,$$
$$\int_0^1 (1 - s)\sin q_0(1 - s)\sin(q_1 s)\,ds = 0.$$
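These four identities lend themselves to a quick numerical check. The sketch below is illustrative: it assumes a sample point with double root $q_0 = 7$ obtained from the parametrization (18), locates the accompanying simple root $q_1$ of (15) by bisection on a bracket chosen by inspection, and evaluates the integrals with a composite Simpson rule. All names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


Q0 = 7.0                                       # double root, sample value
TAU_SQ = Q0**2 - 2 * Q0 * math.tan(Q0 / 2)     # tau_0^2 from (18)
G = Q0 / math.sin(Q0)                          # gamma_0 tau_0^2 from (16)


def lhs15(q):
    """Left-hand side of (15) at (gamma_0, tau_0)."""
    return q * q + 2 * G * math.cos(q) - TAU_SQ - 2 * G


# Simple root q1: lhs15 changes sign on [9, 10] for this sample point.
a, b = 9.0, 10.0
for _ in range(200):
    m = 0.5 * (a + b)
    if lhs15(a) * lhs15(m) <= 0:
        b = m
    else:
        a = m
Q1 = 0.5 * (a + b)

identities = [
    Q0 - 2 * G * simpson(lambda s: math.sin(Q0 * (1 - s)) * math.cos(Q1 * s), 0, 1),
    1 - 2 * G * simpson(lambda s: (1 - s) * math.cos(Q0 * (1 - s)) * math.cos(Q1 * s), 0, 1),
    Q1 - 2 * G * simpson(lambda s: math.cos(Q0 * (1 - s)) * math.sin(Q1 * s), 0, 1),
    simpson(lambda s: (1 - s) * math.sin(Q0 * (1 - s)) * math.sin(Q1 * s), 0, 1),
]
print(Q1, identities)
```

Each entry of `identities` should vanish up to quadrature error; the second and fourth rely on $q_0$ being a double root, the first and third only on $q_0$ and $q_1$ both solving (15).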
A necessary and sufficient condition for $U$ to be in the hyperbolic invariant subspace $\mathbb{H}_h$ is that the following 6 conditions are realised:
$$a_j(U) = b_j(U) = c_0(U) = \hat\rho_0(U) = 0, \quad j = 0, 1.$$
To prepare application of [20], we have to solve the affine linear system associated with (6) for the hyperbolic part. Set $Q_h = I - P_0 - P_1$, $U_h = Q_hU$, then we have to solve
$$\partial_t U_h = L_{\gamma_0,\tau_0}U_h + Q_hF \tag{34}$$
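for $U_h \in \mathbf{E}_0^\alpha(\mathbb{D}_h) \cap \mathbf{E}_1^\alpha(\mathbb{H}_h)$ and for each $\alpha \in (-\alpha_0, \alpha_0)$. We have $F = (0, f, 0)^T$, $f \in E_0^\alpha(\mathbb{R})$, and $F_h = Q_hF$,
$$(F_h)_0 = 0,$$
$$(F_h)_1 = \{-q_1N_1^{-1} + (3N_0^2)^{-1}(3\gamma_0^2\tau_0^4 - 3 - q_0^2)\}\,f,$$
$$(F_h)_2(v) = \{-N_1^{-1}\sin q_1 v + 2q_0(3N_0^2)^{-1}\sin q_0 v + 2N_0^{-1}\,v\cos q_0 v\}\,f.$$
The component $X_h$ is now given by
$$X_h(t, v) = \phi(t + v) + \tilde X_h(t, v),$$
$$\tilde X_h(t, v) = \int_0^t f(s)\Big(\frac{2q_0}{3N_0^2}\sin q_0(t + v - s) + \frac{2(t + v - s)}{N_0}\cos q_0(t + v - s)\Big)ds - N_1^{-1}\int_0^t \sin q_1(t + v - s)f(s)\,ds,$$
and
$$x_h(t) = \phi(t) + \int_0^t f(s)\Big(\frac{2q_0}{3N_0^2}\sin q_0(t - s) + \frac{2(t - s)}{N_0}\cos q_0(t - s)\Big)ds - N_1^{-1}\int_0^t \sin q_1(t - s)f(s)\,ds, \tag{35}$$
$$X_h(t, v) = x_h(t + v) + N_1^{-1}\int_t^{t+v} \sin q_1(t + v - s)f(s)\,ds - \int_t^{t+v} f(s)\Big(\frac{2q_0}{3N_0^2}\sin q_0(t + v - s) + \frac{2(t + v - s)}{N_0}\cos q_0(t + v - s)\Big)ds$$
$$\phantom{X_h(t, v)} = x_h(t + v) + N_1^{-1}\int_0^v \sin(q_1 s)f(t + v - s)\,ds - \int_0^v f(t + v - s)\Big(\frac{2q_0}{3N_0^2}\sin q_0 s + \frac{2s}{N_0}\cos q_0 s\Big)ds, \tag{36}$$
which leads to
$$\frac{\partial}{\partial v}X_h(t, v) = \dot x_h(t + v) + q_1N_1^{-1}\int_0^v \cos(q_1 s)f(t + v - s)\,ds - \int_0^v f(t + v - s)\Big(\Big(\frac{2q_0^2}{3N_0^2} + \frac{2}{N_0}\Big)\cos q_0 s - \frac{2q_0 s}{N_0}\sin q_0 s\Big)ds.$$
Here the factor $q_0$ in the last term comes from differentiating $\cos q_0 s$ in the kernel of (36). Hence, there exists a constant $c$ independent of $\alpha \in (-\alpha_0, \alpha_0)$ such that
$$\|X_h\|_{E_0^\alpha(C^1[-1,+1])} \le \|x_h\|_{E_1^\alpha} + c\,\|f\|_{E_0^\alpha} \tag{37}$$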
holds again. Now we take the Fourier transform of (34). For being able to do it, we proceed as in the previous section. For $\alpha < 0$, we obtain an expression for $\hat U_h$ analytic with respect to $k \in B_\alpha = \{k \in \mathbb{C};\ |\mathrm{Im}\,k| < \alpha\}$, taking values in $\mathbb{D}_h$ and which is the solution of
$$(ikI - L_{\gamma,\tau})\hat U_h(k) = Q_h\hat F(k). \tag{38}$$
For $\alpha \ge 0$ we need to use the distributions in $S_\alpha'$ (see Appendix 1). For any $\alpha$, we have
$$\hat\xi_h(k) = ik\,\hat x_h(k),$$
$$\hat X_h(k, v) = e^{ikv}\hat x_h(k) + \frac{\hat f(k)}{N_1}\int_0^v e^{ik(v-s)}\sin(q_1 s)\,ds - \hat f(k)\int_0^v e^{ik(v-s)}\Big(\frac{2q_0}{3N_0^2}\sin(q_0 s) + \frac{2s}{N_0}\cos q_0 s\Big)ds,$$
$$N(ik; \gamma_0, \tau_0)\,\hat x_h(k) = -\hat f(k)\Big(\frac{-q_1}{N_1} + \frac{3\gamma_0^2\tau_0^4 - 3 - q_0^2}{3N_0^2} + \gamma_0\tau_0^2\int_0^1 2\cos k(1 - s)\Big[\frac{1}{N_1}\sin q_1 s - \frac{2q_0}{3N_0^2}\sin q_0 s - \frac{2s}{N_0}\cos q_0 s\Big]ds\Big).$$
After using the definitions of $N_0$, $N_1$ and the fact that $q_1$ (resp. $q_0$) is a simple (resp. double) root of $N(ik; \gamma_0, \tau_0) = 0$, this leads, after elementary computations, to
$$N(ik; \gamma_0, \tau_0)\,[\hat x_h(k) + \hat H(k; \gamma_0, \tau_0)\hat f(k)] = 0, \tag{39}$$
where $\hat H$ is defined by the identity
$$\frac{1}{N(ik; \gamma_0, \tau_0)} = \frac{q_1}{N_1(k^2 - q_1^2)} - \frac{2(k^2 + q_0^2)}{N_0(k^2 - q_0^2)^2} - \frac{2q_0^2}{3N_0^2(k^2 - q_0^2)} + \hat H(k; \gamma_0, \tau_0). \tag{40}$$
The function $\mathbb{C} \ni k \mapsto N(ik; \gamma_0, \tau_0)$ is entire, and $\pm q_1$ (resp. $\pm q_0$) are the unique simple (resp. double) roots of $N(ik; \gamma_0, \tau_0) = 0$ in a strip $B_{p_0}$, where $p_0 > \alpha_0$ was defined in Lemma 1 (i). $N$ behaves as $k^2$ at infinity in $B_{p_0}$. Notice that $\mathbb{C} \ni k \mapsto \hat H(k; \gamma_0, \tau_0)$ is analytic in the strip $B_{p_0}$ and tends to 0 as $1/k^2$ for $|k| \to \infty$. It results, by the lemma shown in the previous section, that $\hat H$ is the Fourier transform of a function $\mathbb{R} \ni t \mapsto H(t; \gamma_0, \tau_0) \in H^1_{-\delta}$, for any $\delta < p_0$.

It results from Propositions 4, 3 and 5 of Appendix 1 that the solution of (39) reads
$$x_h(t) + [H(\cdot; \gamma_0, \tau_0) * f](t) = \begin{cases} a_1^+e^{iq_1 t} + a_1^-e^{-iq_1 t} + (a_0^+ + itb_0^+)e^{iq_0 t} + (a_0^- - itb_0^-)e^{-iq_0 t}, & \text{for } \alpha \ge 0, \\ 0, & \text{for } \alpha < 0, \end{cases}$$
$a_0^\pm$, $a_1^\pm$, $b_0^\pm$ being arbitrary constants. Now, define $\tilde U_h$ as in the previous section, based on the new $\tilde x_h = -H(\cdot; \gamma_0, \tau_0) * f$, and formula (36) for $\tilde X_h$. The same argument as in the previous section, using Proposition 2 of Appendix 1, shows that $P_1(\tilde U_h) = P_0(\tilde U_h) = 0$ independently of $\alpha$. Then, for $\alpha \in (-\alpha_0, \alpha_0)$, $\alpha_0 < \delta < p_0$, we have $a_0^\pm = a_1^\pm = b_0^\pm = 0$.
Lemma 8. Assume $f \in E_0^\alpha$, for $\alpha \in (-\alpha_0, \alpha_0)$, $\alpha_0 < \delta < p_0$, then the affine system (34) has a unique solution $U_h \in \mathbf{E}_0^\alpha(\mathbb{D}_h) \cap \mathbf{E}_1^\alpha(\mathbb{H}_h)$, and the linear map $E_0^\alpha \ni f \mapsto U_h \in \mathbf{E}_0^\alpha(\mathbb{D}) \cap \mathbf{E}_1^\alpha(\mathbb{H})$ is bounded uniformly in $\alpha \in (-\alpha_0, \alpha_0)$.

So, as in the previous section, we have verified the assumptions of Theorem 3 in ([20], p. 133), and we are now able to use a reduction on a 6-dimensional center manifold.

6. Normal Form Near $\tilde\Gamma_0$

As we have observed, $\Gamma_0'$ is dense in $\Gamma_0$. Exceptional points on $\Gamma_0$ are the cusp points, where there is one pair of triple eigenvalues, and the angular points, where there are two pairs of double eigenvalues and one pair of simple eigenvalues. In what follows, we exclude points of the parameter plane which are close to points of $\Gamma_0$ where the ratio $q_1/q_0$ takes the values 1 (cusps), 1/2, 2, 1/3, 3 corresponding to strong resonances. We also exclude neighborhoods of the angular points of $\Gamma_0$. As a consequence we consider only those points $(\gamma, \tau) \in \Gamma_0$ near points where $(q_1/q_0)(\gamma, \tau)$ is close to a rational number $r/s$ such that $r + s \ge 5$. This set of points is denoted by $\tilde\Gamma_0 \subset \Gamma_0$. We stay in the parameter plane near the part $\tilde\Gamma_0$ of $\Gamma_0$ where neighborhoods of strong resonances are avoided. For the computation of the normal form we need to define, for every point near $\tilde\Gamma_0$, the nearest weak resonance. For any rational number $r/s$ let us define the subset of $\tilde\Gamma_0$,
$$I_{r/s} = \Big\{(\gamma, \tau) \in \tilde\Gamma_0;\ \Big|\frac{q_1(\gamma, \tau)}{q_0(\gamma, \tau)} - \frac{r}{s}\Big| < \varepsilon_{r+s},\ \text{and}\ \frac{q_1(\gamma, \tau)}{q_0(\gamma, \tau)} = \frac{r'}{s'}\ \text{implies}\ r' + s' \ge r + s\Big\}.$$
It is clear that $\tilde\Gamma_0$ is the union of $I_{r/s}$ for $r/s \in \mathbb{Q}_+^* \setminus \{1, 2, 3, 1/2, 1/3\}$. We then compute the normal form for $q_1/q_0 = r/s$ and we shall play on $(\gamma, \tau)$ to cover the full neighborhood of $\tilde\Gamma_0$.
The linear operator on the 6-dimensional central subspace has the form
$$L^{(0)} = \begin{pmatrix} iq_0 & 1 & 0 & 0 & 0 & 0 \\ 0 & iq_0 & 0 & 0 & 0 & 0 \\ 0 & 0 & iq_1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -iq_0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -iq_0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -iq_1 \end{pmatrix}$$
in the basis $\zeta_0, \tilde\zeta_0, \zeta_1, \bar\zeta_0, \bar{\tilde\zeta}_0, \bar\zeta_1$ defined by
$$\zeta_0 = (1, iq_0, e^{iq_0 v})^T, \qquad \tilde\zeta_0 = (0, 1, ve^{iq_0 v})^T, \qquad \zeta_1 = (1, iq_1, e^{iq_1 v})^T,$$
and which satisfies
$$L_{\gamma_0,\tau_0}\zeta_0 = iq_0\zeta_0, \qquad S\zeta_0 = \bar\zeta_0,$$
$$L_{\gamma_0,\tau_0}\tilde\zeta_0 = iq_0\tilde\zeta_0 + \zeta_0, \qquad S\tilde\zeta_0 = -\bar{\tilde\zeta}_0,$$
$$L_{\gamma_0,\tau_0}\zeta_1 = iq_1\zeta_1, \qquad S\zeta_1 = \bar\zeta_1.$$
It is easy to check that the projection $P_0 + P_1$ may now be defined as follows:
$$(P_0 + P_1)U = A\zeta_0 + B\tilde\zeta_0 + C\zeta_1 + \text{c.c.} = U_0,$$
with
$$A = \frac{iq_0}{3N_0^2}\,[b_0(U) + ia_0(U)] + \frac{i}{N_0}\,[\gamma_0\tau_0^2\hat\rho_0(U) + ic_0(U)],$$
$$B = -N_0^{-1}[b_0(U) + ia_0(U)], \qquad C = \tfrac{1}{2}N_1^{-1}[a_1(U) - ib_1(U)].$$
The structure of the reversible normal form corresponding to the linear operator $L^{(0)}$ is computed in Appendix 2. It is shown in particular that the reduced 6-dimensional system, with its normal form written at order $r + s - 2$, takes the following form:
$$\frac{dA}{dt} = iq_0A + B + iAP(u_1, u_2, u_4) + O(|A| + |B| + |C|)^{r+s-1}, \tag{41}$$
$$\frac{dB}{dt} = iq_0B + iBP(u_1, u_2, u_4) + AQ(u_1, u_2, u_4) + O(|A| + |B| + |C|)^{r+s-1}, \tag{42}$$
$$\frac{dC}{dt} = iq_1C + iCR(u_1, u_2, u_4) + O(|A| + |B| + |C|)^{r+s-1}, \tag{43}$$
where $u_1 = A\bar A$, $u_2 = \tfrac{i}{2}(A\bar B - \bar AB)$, $u_4 = C\bar C$, and $P$, $Q$, $R$ are polynomials with smoothly parameter-dependent real coefficients for $(\gamma, \tau)$ in the neighborhood of any $(\gamma_0, \tau_0) \in I_{r/s}$. The 0th order coefficients in $P$, $Q$, $R$ correspond to the critical linear part of system (6). We notice that the normal form, truncated at order $r + s - 2$, contains all solutions of the classical 1:1 resonant normal form (just consider solutions with $C = 0$). Let us specify the main coefficients of system (41), (42), (43). We have at first orders
$$P(u_1, u_2, u_4) = a_1(\gamma, \tau) + a_2u_1 + a_3u_2 + a_4u_4,$$
$$Q(u_1, u_2, u_4) = b_1(\gamma, \tau) + b_2u_1 + b_3u_2 + b_4u_4,$$
$$R(u_1, u_2, u_4) = c_1(\gamma, \tau) + c_2u_1 + c_3u_2 + c_4u_4.$$
Coefficients $a_1$, $b_1$, $c_1$ vanish for $(\gamma, \tau) = (\gamma_0, \tau_0)$, and may be easily computed by using the property that $iq_0 \pm \sqrt{b_1(\gamma, \tau)} + ia_1(\gamma, \tau)$, $iq_1 + ic_1(\gamma, \tau)$ and their complex conjugates are the six eigenvalues of the operator $L_{\gamma,\tau}$ for $(\gamma, \tau)$ close to $(\gamma_0, \tau_0)$. Notice that $b_1(\gamma, \tau) = 0$ on $\Gamma_0$ and we have $b_1(\gamma, \tau) > 0$ on the side $\Delta_0$ of the curve $\Gamma_0$. Now, as for the 1:1 resonance case, the most important coefficient is $b_2$, which we compute below. Let us denote the basic differential Eq. (6) as follows:
$$\frac{dU}{dt} = L_{\gamma_0,\tau_0}U + (\gamma - \gamma_0)L^{(1,0)}U + (\tau - \tau_0)L^{(0,1)}U + M_{2,0}(U, U) + M_{3,0}(U, U, U) + \ldots$$
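with
$$L^{(1,0)}U = \tau_0^2\,(0, -2x + X_1 + X_{-1}, 0)^T, \qquad L^{(0,1)}U = 2\tau_0\,(0, -(1 + 2\gamma_0)x + \gamma_0(X_1 + X_{-1}), 0)^T,$$
$$M_{2,0}(U, U) = \tau_0^2\,(0, ax^2, 0)^T, \qquad M_{3,0}(U, U, U) = \tau_0^2\,(0, bx^3, 0)^T.$$
The Taylor expansion of the 6-dimensional center manifold reads
$$U = A\zeta_0 + B\tilde\zeta_0 + C\zeta_1 + \bar A\bar\zeta_0 + \bar B\bar{\tilde\zeta}_0 + \bar C\bar\zeta_1 + \sum (\gamma - \gamma_0)^m(\tau - \tau_0)^n A^{r_0}B^{\tilde r_0}C^{r_1}\bar A^{s_0}\bar B^{\tilde s_0}\bar C^{s_1}\,\Phi^{(m,n)}_{r_0\tilde r_0r_1s_0\tilde s_0s_1}, \tag{44}$$
where the sum does not contain terms with $m = n = 0$, $r_0 + \tilde r_0 + r_1 + s_0 + \tilde s_0 + s_1 = 1$, and we have in a classical way
$$(2iq_0I - L_{\gamma_0})\Phi^{(0,0)}_{200000} = M_{2,0}(\zeta_0, \zeta_0),$$
$$-L_{\gamma_0}\Phi^{(0,0)}_{100100} = 2M_{2,0}(\zeta_0, \bar\zeta_0),$$
$$ia_2\zeta_0 + b_2\tilde\zeta_0 + (iq_0I - L_{\gamma_0})\Phi^{(0,0)}_{200100} = 2M_{2,0}(\bar\zeta_0, \Phi^{(0,0)}_{200000}) + 2M_{2,0}(\zeta_0, \Phi^{(0,0)}_{100100}) + 3M_{3,0}(\zeta_0, \zeta_0, \bar\zeta_0).$$
This leads to
$$\Phi^{(0,0)}_{200000} = K_1\,(1, 2iq_0, e^{2iq_0 v})^T,$$
$$\Phi^{(0,0)}_{100100} = 2a\,(1, 0, 1)^T,$$
$$\Phi^{(0,0)}_{200100} = ia_2\tilde\zeta_0 + \phi\zeta_0 + \frac{b_2}{2}\,(0, 0, v^2e^{iq_0 v})^T,$$
with $a_2$ and $\phi$ still unknown, and
$$K_1 = a\,[1 - 4q_0^2\tau_0^{-2}(1 - \gamma_0^{-1}\tau_0^{-2})]^{-1},$$
$$-N_0b_2 = \tau_0^2\,\{2a^2[1 - 4q_0^2\tau_0^{-2}(1 - \gamma_0^{-1}\tau_0^{-2})]^{-1} + 4a^2 + 3b\}. \tag{45}$$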
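The closed form of $K_1$ can be cross-checked against the resolvent formula (12). The sketch below rests on one reading of the first equation for $\Phi^{(0,0)}_{200000}$: with $\lambda = 2iq_0$ and right-hand side $(0, a\tau_0^2, 0)^T$, formula (12) gives $K_1 = -a\tau_0^2/N(2iq_0; \gamma_0, \tau_0)$. The sample point $q_0 = 7$ on a DEC, the values of $a$ and $b$, and all function names are illustrative assumptions.

```python
import cmath
import math

Q0 = 7.0                                      # double root, sample value
TAU_SQ = Q0**2 - 2 * Q0 * math.tan(Q0 / 2)    # tau_0^2 from (18)
G = Q0 / math.sin(Q0)                         # gamma_0 tau_0^2 from (16)
GAMMA = G / TAU_SQ
N0 = G * math.cos(Q0) - 1


def n_of_lambda(lam):
    """N(lambda; gamma_0, tau_0) as in (11)."""
    return -lam * lam - TAU_SQ * (1 + 2 * GAMMA) + G * (cmath.exp(lam) + cmath.exp(-lam))


def k1_closed(a):
    """K1 from the closed form preceding (45)."""
    return a / (1 - 4 * Q0**2 / TAU_SQ * (1 - 1 / G))


def k1_resolvent(a):
    """K1 read off from (12) with lambda = 2 i q0, f0 = 0, f1 = a tau_0^2, F2 = 0."""
    return (-a * TAU_SQ / n_of_lambda(2j * Q0)).real


def b2(a, b):
    """Coefficient b2 solved from (45)."""
    return -TAU_SQ * (2 * a * k1_closed(a) + 4 * a * a + 3 * b) / N0


print(k1_closed(1.0), k1_resolvent(1.0), b2(1.0, 0.0))
```

Agreement of the two values of $K_1$ uses only the double-root relations (15) and (16) at $q_0$; the sign of `b2` then depends on $a$, $b$ and on the sign of $N_0$, as discussed in the text.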
Notice that $\gamma_0\tau_0^2 > 1$ due to (16), and that $N_0$ may take any sign since it changes its sign at the cusp points of DEC; hence there are situations in the parameter plane such that the coefficient $b_2$ is negative. For the truncated normal form at cubic order, we have solutions with $C = 0$, corresponding to a flat extra oscillatory part, and reducing to the solutions of the classical 1:1 reversible resonance vector field. We know that for $b_2 < 0$ there is a one-parameter (a "circle") family of orbits homoclinic to 0 (see for instance [9]):
$$A = r_0(t)e^{i(q_0t + \psi(t) + \theta)}, \qquad B = r_1(t)e^{i(q_0t + \psi(t) + \theta)}, \qquad C = 0,$$
$$r_0(t) = \sqrt{\frac{2b_1(\gamma, \tau)}{-b_2}}\,\Big(\cosh\big[t\sqrt{b_1(\gamma, \tau)}\big]\Big)^{-1}, \qquad r_1(t) = \frac{dr_0(t)}{dt},$$
$$\psi(t) = a_1(\gamma, \tau)\,t + 2\,\frac{a_2}{b_2}\sqrt{b_1(\gamma, \tau)}\,\tanh\big(t\sqrt{b_1(\gamma, \tau)}\big),$$
where $\theta \in \mathbb{R}$. Two of them are reversible: $\theta = 0$ or $\pi$. For the full vector field (41), (42), (43), we are now able to use in particular the results of E. Lombardi [12]: under the non-resonance assumptions which are realized here, there exists a family of pairs of reversible solutions of (2) homoclinic to periodic solutions of exponentially small amplitude. This means that this type of solutions, which are mainly given by the above mentioned reversible orbits homoclinic to 0, now contain an oscillating part in $C$ which cannot be annihilated, whose size is $O(e^{-c/b_1^{1/2}})$, hence exponentially small in the bifurcation parameter $b_1(\gamma, \tau)$. So there remains a "phonon" at infinity, the central "localized" part of the solution being of order $\sqrt{b_1(\gamma, \tau)}$. More precisely, the principal parts of these "localized" travelling waves are obtained up to order $O[b_1(\gamma, \tau)]$ (resp. $O[\sqrt{b_1(\gamma, \tau)}]$) for $r + s \ge 6$ (resp. $r + s = 5$), in replacing in the center manifold expansion (44) of $U$ the amplitudes $A$, $B$, $C$ by the above explicit expressions (see [5]).

Theorem 9. For $(\gamma, \tau)$ in a neighborhood of the curve $\Gamma_0$, except near exceptional points (cusps, angular points and strong resonances), and for $U$ near the origin in $\mathbb{D}$, the system (6) reduces to a 6-dimensional reversible vector field, with a fixed point at the origin and a linear part possessing a pair of double eigenvalues $\pm iq_0$ and a pair of simple eigenvalues $\pm iq_1$. The bifurcation parameter is $b_1 = \mathrm{dist}[(\gamma, \tau), \Gamma_0]$ (counted $> 0$ in $\Delta_0$). All generic bifurcating (periodic, quasiperiodic, homoclinic, ...) small bounded solutions of this 6-dimensional reversible vector field correspond to "small" travelling waves, solutions of (2). In particular, for $(\gamma, \tau)$ in the open set where $b_2 < 0$ [see (45)], there are travelling waves which are localized in space, with exponentially small oscillating tails, called "nanopterons" (following J. P. Boyd's denomination [2]).

Appendix 1. Construction of a Suitable Distribution Space

Given $\alpha > 0$ and $B_\alpha := \{z \in \mathbb{C} \,/\, |\mathrm{Im}\,z| < \alpha\}$, define the space $S_\alpha$ as follows:
$$S_\alpha = \{f: B_\alpha \to \mathbb{C} \,/\, f \text{ holomorphic in } B_\alpha,\ q_{m,p}(f) < \infty,\ (m, p) \in \mathbb{N}^2\},$$
where
$$q_{m,p}(f) = \sup_{z \in B_\alpha} |z^m f^{(p)}(z)|\,e^{\alpha|\mathrm{Re}\,z|},$$
and where $\mathbb{N}$ is the set of integers starting at 0.
The pair $(S_\alpha, q_{m,p})$ defines a Fréchet space. Notice that $(\cosh\alpha z)^{-1} \in S_\alpha$ if $\alpha^2 < \pi/2$, so $S_\alpha$ is nontrivial and we have $S_\alpha \subset S$ (space of rapidly decaying functions).

Proposition 1. The Fourier transform $\mathcal{F}$ defines a bijection on $S_\alpha$, being continuous in both directions.

Proof. For any $\phi \in S_\alpha$, we first show that $\mathcal{F}\phi =: \hat\phi$ belongs to $S_\alpha$. Let us notice that for any $p, m \ge 0$, and $k \in B_\alpha$, we have
$$i^{p+m}k^m\hat\phi^{(p)}(k) = \int_{\mathbb{R}} x^pe^{-ikx}\phi^{(m)}(x)\,dx.$$
Now, take $k = k_r + ik_i$ and choose $k_r > 0$ and $z = x + i(\varepsilon - \alpha)$, $0 < \varepsilon < \alpha$ (for $k_r < 0$, take $z = x + i(\alpha - \varepsilon)$ and argue analogously); then, one obtains
$$i^{p+m}k^m\hat\phi^{(p)}(k)\,e^{\alpha k_r} = e^{\varepsilon k_r}e^{i(\varepsilon - \alpha)k_i}\int_{z \in \mathbb{R} + i(\varepsilon - \alpha)} e^{-ik_rx}z^p\phi^{(m)}(z)\,e^{xk_i}\,dz,$$
whence it follows
$$|k^m\hat\phi^{(p)}(k)|\,e^{\alpha k_r} \le \pi e^{\varepsilon k_r}C_{p,m}(\phi),$$
where
$$C_{p,m}(\phi) = \sup_{z \in B_\alpha}(1 + |\mathrm{Re}\,z|^2)\,|z^p\phi^{(m)}(z)|\,e^{\alpha|\mathrm{Re}\,z|} < \infty$$
is independent of $\varepsilon$. The limit $\varepsilon \to 0^+$ then yields
$$q_{m,p}(\hat\phi) \le \pi[q_{p,m}(\phi) + q_{p+2,m}(\phi)].$$
Therefore, to each $\phi \in S_\alpha$, there exists a unique $\hat\phi = \mathcal{F}\phi \in S_\alpha$, and the map $\phi \mapsto \hat\phi$ is continuous. The surjectivity of this map follows by applying the inverse Fourier transform; and the above estimate gives the continuity in both directions. Now, define the dual space $S_\alpha'$ of linear continuous forms on $S_\alpha$ and provide it with the weak topology, i.e. pointwise convergence. Then $\mathcal{F}'$, which we denote by $\mathcal{F}$ again, is again a bijection on $S_\alpha'$, and it is continuous in both directions. Moreover, we have $S' \subset S_\alpha'$, where $S'$ is the set of tempered distributions. □

Proposition 2. Given $\alpha > 0$, $f \in E_0^\alpha(\mathbb{R})$ and $r \in C^0[0, 1]$; then
(i) $f \in S_\alpha'$ via $\langle f, \phi\rangle := \int_{\mathbb{R}} f(t)\phi(t)\,dt$, for any $\phi \in S_\alpha$, and the embedding $E_0^\alpha \hookrightarrow S_\alpha'$ is continuous.
(ii) $[\mathcal{F}f(\cdot + v)](k) = e^{ikv}(\mathcal{F}f)(k)$, $v \in \mathbb{R}$.
(iii) $h(t) := \int_0^1 r(s)f(t + s)\,ds \in S_\alpha'$ and
$$(\mathcal{F}h)(k) = (\mathcal{F}f)(k)\int_0^1 r(s)e^{iks}\,ds.$$
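Proof. Ad (i). For every $\phi \in \mathcal{S}_\alpha$ the following inequality is valid:
\[ |\langle f, \phi \rangle| = \Big| \int_{\mathbb{R}} f(t)\phi(t)\,dt \Big| \le \pi [q_{0,0}(\phi) + q_{2,0}(\phi)]\, \|f\|_{E_0^\alpha}. \]
Ad (ii). This identity is obtained, similar to the case of tempered distributions:
\begin{align*}
\langle \mathcal{F} f(\cdot + v), \phi \rangle := \langle f(\cdot + v), \hat\phi \rangle
&= \int_{\mathbb{R}} f(t+v)\hat\phi(t)\,dt = \int_{\mathbb{R}} f(t)\hat\phi(t-v)\,dt \\
&= \int_{\mathbb{R}} f(t) \int_{\mathbb{R}} \phi(s)\, e^{-i(t-v)s}\,ds\,dt
 = \int_{\mathbb{R}} f(t)\, \mathcal{F}[e^{iv(\cdot)}\phi](t)\,dt \\
&= \langle \mathcal{F} f, e^{iv(\cdot)}\phi \rangle = \langle e^{iv(\cdot)} \mathcal{F} f, \phi \rangle.
\end{align*}
Ad (iii). The inclusion $E_0^\alpha \subset \mathcal{S}'_\alpha$ is obvious. Now, let
\[ h_n(t) = \sum_{j=1}^n r(s_j) f(t + s_j)\,\Delta s_j \in \mathcal{S}'_\alpha \]
be any Riemann sum for $\int_0^1 r(s) f(t+s)\,ds$. Then we have
\[ |\langle h_n - h, \phi \rangle| \le [q_{0,0}(\phi) + q_{2,0}(\phi)] \int_{\mathbb{R}} \frac{e^{-\alpha|t|}}{1 + t^2}\, |h(t) - h_n(t)|\,dt. \]
The integrand tends pointwise to 0 as $n \to \infty$ and is dominated by an integrable function, thus
\[ \lim_{n \to \infty} \langle h_n, \phi \rangle = \langle h, \phi \rangle, \quad \phi \in \mathcal{S}_\alpha \]
holds. Similarly, we conclude
\[ \langle \mathcal{F} h_n, \phi \rangle = \langle h_n, \mathcal{F}\phi \rangle \to \langle \mathcal{F} h, \phi \rangle \quad (n \to \infty) \]
and the left side converges to the expression on the right side of the assertion (iii) as $n \to \infty$. The proposition is proved. □

Proposition 3. For any $f \in \mathcal{S}'_\alpha$, $[\mathcal{F}(Df)](k) = ik(\mathcal{F} f)(k)$ holds. Moreover $\mathcal{F}(e^{iqt}) = 2\pi\delta_q$, and $\mathcal{F}(it e^{iqt}) = -2\pi\delta'_q$.

Proof. Same proof as in $\mathcal{S}'$. □

Proposition 4. Let $K$ be an analytic and polynomially bounded function in the strip $B_\delta$ where $\delta > \alpha$. Assume that $K$ has a finite number of roots $z_j$ with multiplicity $m_j$, $j = 1, 2, \dots, N$, in the strip $B_\alpha$. Then the kernel in $\mathcal{S}'_\alpha$ of the linear operator $f \to Kf$ is formed by all linear combinations of the form
\[ \sum_{j=1}^N \sum_{k=1}^{m_j} a_{jk}\, \delta_{z_j}^{(k)} \]
with arbitrary $a_{jk} \in \mathbb{C}$ (where $\delta_q$ is the Dirac distribution in $q$ which is trivially in $\mathcal{S}'_\alpha$, and $\delta_q^{(m)}$ is the $m$th derivative of $\delta_q$).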
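Proof. Assume first that all roots are simple. For $f \in$ kernel defined above, and for any $\phi$ in $\mathcal{S}_\alpha$, we have $0 = \langle Kf, \phi \rangle = \langle f, K\phi \rangle$ since $K\phi \in \mathcal{S}_\alpha$. This means that $\langle f, \psi \rangle = 0$ for all $\psi$ in $\mathcal{S}_\alpha$ which cancel at simple roots $z_j$, $j = 1, 2, \dots, N$. Now, any $\phi \in \mathcal{S}_\alpha$ may be decomposed as the sum of $N + 1$ functions in $\mathcal{S}_\alpha$,
\[ \phi(z) = \sum_{p=1}^N \frac{\phi(z_p)}{\cosh \alpha(z - z_p)}\, \frac{\prod_{j \ne p} (z - z_j)}{\prod_{j \ne p} (z_p - z_j)} + \psi(z), \]
where $\psi$ has simple roots in $z_j$, $j = 1, 2, \dots, N$, and $\langle f, \phi \rangle = \sum_{p=1}^N a_p \phi(z_p)$. This proves Proposition 4 for simple roots.

Assume now that $z_1, \dots, z_r$ are double roots, and $z_{r+1}, \dots, z_N$ simple roots of $K = 0$ in the strip $B_\alpha$. We conclude from the result above, that
\[ \prod_{p=1}^r (z - z_p)\, f = \sum_{j=1}^N a_j \delta_{z_j}. \]
Hence, for any $\phi$ in $\mathcal{S}_\alpha$, we have $\langle f, [\prod_{p=1}^r (z - z_p)]\phi \rangle = \sum_{j=1}^N a_j \phi(z_j)$. This means that for any $\psi$ in $\mathcal{S}_\alpha$ having simple roots in $z_p$, $p = 1, \dots, r$, we have
\[ \langle f, \psi \rangle = \sum_{j=1}^r c_j \psi'(z_j) + \sum_{j=r+1}^N b_j \psi(z_j), \]
where
\[ b_j = a_j \Big[ \prod_{p=1}^r (z_j - z_p) \Big]^{-1}, \qquad c_j = a_j \Big[ \prod_{p=1,\, p \ne j}^r (z_j - z_p) \Big]^{-1}. \]
Let us take any $\phi \in \mathcal{S}_\alpha$, we have the decomposition $\phi(z) = \sum_{p=1}^r \phi(z_p)\chi_p(z) + \psi(z)$, with $\chi_p(z_q) = 0$ if $q \ne p$, and $= 1$ if $q = p$, where $p \in \{1, \dots, r\}$. Now, $z_1, \dots, z_r$ are simple roots of $\psi$, hence we have
\[ \langle f, \phi \rangle = \sum_{p=1}^N \alpha_p \phi(z_p) + \sum_{p=1}^r c_p \phi'(z_p), \]
with
\begin{align*}
\alpha_p &= \langle f, \chi_p \rangle - \sum_{j=r+1}^N b_j \chi_p(z_j) - \sum_{j=1}^r c_j \chi'_p(z_j), & p &= 1, \dots, r, \\
\alpha_p &= b_p, & p &= r+1, \dots, N.
\end{align*}
Therefore, Proposition 4 is proved for roots at most double. For roots of arbitrary order, the proof is left to the reader. □

Proposition 5. Let $H \in E_0^{-\delta}$ and $g \in E_0^\alpha$, with $\delta > \alpha \ge 0$, then we have
i) $H * g \in E_0^\alpha$, with $\|H * g\|_{0,\alpha} \le 2(\delta - \alpha)^{-1} \|H\|_{0,-\delta} \|g\|_{0,\alpha}$;
ii) $\mathcal{F}(H * g) = \hat H . \hat g$, where $\hat H = \mathcal{F} H$ is the Fourier transform in the usual sense of functions, and $\hat g$ and $\mathcal{F}(H * g)$ are Fourier transforms in $\mathcal{S}'_\alpha$ for $\alpha > 0$, in $\mathcal{S}'$ for $\alpha = 0$.

Proof. i) comes from the inequality
\[ \int_{\mathbb{R}} e^{\alpha(|s| - |t|) - \delta|t - s|}\,ds \le 2(\delta - \alpha)^{-1}. \]
Now, for $\alpha > 0$, $\mathcal{F}(H * g) \in \mathcal{S}'_\alpha$ and satisfies, for all $\varphi \in \mathcal{S}_\alpha$,
\begin{align*}
\langle \mathcal{F}(H * g), \varphi \rangle = \langle H * g, \mathcal{F}\varphi \rangle
&= \iiint H(t - s)\, g(s)\, e^{-ikt} \varphi(k)\,dk\,dt\,ds \\
&= \iint g(s)\, e^{-iks} \varphi(k) \hat H(k)\,dk\,ds = \langle g, \mathcal{F}(\varphi . \hat H) \rangle \\
&= \langle \hat g, \varphi . \hat H \rangle = \langle \hat H . \hat g, \varphi \rangle.
\end{align*}
We noticed, in this calculation, that $\varphi . \hat H \in \mathcal{S}_\alpha$ because $\hat H$ is analytic in the strip $B_\delta \supset B_\alpha$, and bounded in $B_\alpha$. As a corollary, this shows that $\mathcal{F}^{-1}(\hat H . \hat g) = H * g$ in $E_0^\alpha$.
For $\alpha = 0$, $H * g \in E_0^0 = C_b^0(\mathbb{R})$, $\mathcal{F}(H * g) \in \mathcal{S}'$, and all equalities above hold for $\phi \in \mathcal{S}$. □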
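The integral inequality behind part i) of Proposition 5 can be checked numerically. The sketch below is only an illustration under assumed sample values ($\alpha = 0.5$, $\delta = 1.5$, a few points $t$); the function name and the truncation of the integral to a finite window are choices made here, not part of the paper.

```python
import math

def weighted_kernel_integral(t, alpha, delta, half_width=60.0, n=120001):
    """Trapezoidal approximation of the integral over s of
    exp(alpha*(|s| - |t|) - delta*|t - s|), truncated to |s - t| <= half_width."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for j in range(n):
        s = t - half_width + j * h
        value = math.exp(alpha * (abs(s) - abs(t)) - delta * abs(t - s))
        weight = 0.5 if j in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * value
    return total * h

# Hypothetical sample parameters with delta > alpha >= 0.
alpha, delta = 0.5, 1.5
bound = 2.0 / (delta - alpha)
values = {t: weighted_kernel_integral(t, alpha, delta) for t in (-3.0, 0.0, 0.7, 5.0)}
for t, v in values.items():
    print(f"t = {t:5.1f}: integral = {v:.6f}, bound = {bound:.6f}")
```

Since $|s| - |t| \le |t - s|$, the integrand is dominated by $e^{-(\delta - \alpha)|t - s|}$, whose integral is exactly $2(\delta - \alpha)^{-1}$; at $t = 0$ the two coincide, so the bound is attained there up to quadrature error.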
Appendix 2. Reversible Normal Form Associated with $L(0)$

As indicated for instance in ([7], p. 18 and 23–24) we need to solve
\[ D_{U_0} N(U_0) . L(0)^* U_0 = L(0)^* N(U_0), \qquad SN = -N \circ S, \]
where $N = (N_0, \tilde N_0, N_1, \bar N_0, \bar{\tilde N}_0, \bar N_1)^T$, and
\[ S(A, B, C, \bar A, \bar B, \bar C)^T = (\bar A, -\bar B, \bar C, A, -B, C)^T. \]
Moreover $N$ has polynomial components of an arbitrarily fixed degree in variables $(A, B, C, \bar A, \bar B, \bar C)$. Let us define the linear differential operator
\begin{align*}
D^* f = {} & -iq_0 A \frac{\partial f}{\partial A} + (-iq_0 B + A) \frac{\partial f}{\partial B} - iq_1 C \frac{\partial f}{\partial C} \\
& + iq_0 \bar A \frac{\partial f}{\partial \bar A} + (iq_0 \bar B + \bar A) \frac{\partial f}{\partial \bar B} + iq_1 \bar C \frac{\partial f}{\partial \bar C},
\end{align*}
then we must verify
\begin{align*}
D^* N_0 &= -iq_0 N_0, \\
D^* \tilde N_0 &= -iq_0 \tilde N_0 + N_0, \\
D^* N_1 &= -iq_1 N_1.
\end{align*}
Independent first integrals of $D^* f = 0$ are
\[ u_1 = A\bar A, \quad u_2 = \tfrac{i}{2}(A\bar B - \bar A B), \quad u_3 = iq_0 B/A + \ln A, \quad u_4 = C\bar C, \quad u_5 = A^r \bar C^s, \]
where we assumed that $q_1/q_0 = r/s$. We observe that
\begin{align*}
\bar A &= \frac{u_1}{A}, \qquad B = \frac{A}{iq_0}(u_3 - \ln A), \qquad \bar B = \frac{2u_2}{iA} + \frac{u_1}{iq_0 A}(u_3 - \ln A), \\
C &= u_4 u_5^{-1/s} A^{r/s}, \qquad \bar C = u_5^{1/s} A^{-r/s},
\end{align*}
hence, a polynomial in variables $(A, B, C, \bar A, \bar B, \bar C)$ can be expressed as a function of variables $(A, u_1, u_2, u_3, u_4, u_5)$, polynomial in $(u_1, u_2, u_3, u_4)$, with coefficient functions of $(A, u_5)$, the dependence in $u_5$ being with polynomials of $(u_5)^{\pm 1/s}$. Now, considering polynomial solutions of $D^* f = 0$, it results easily, with the variables $(A, u_1, u_2, u_3, u_4, u_5)$, that $f$ is independent of $A$, i.e. $f(A, B, C, \bar A, \bar B, \bar C) = \phi(u_1, u_2, u_3, u_4, u_5)$, where
\[ \phi(u_1, u_2, u_3, u_4, u_5) = \sum \phi_{r_1 r_2 r_3 r_4 r_5}\, u_1^{r_1} u_2^{r_2} u_3^{r_3} u_4^{r_4} u_5^{r_5/s} \quad \text{(finite sum)} \]
with integers $r_j \ge 0$, $j = 1, 2, 3, 4$, and $r_5 \ge 0$ or $< 0$. We can first assert that $\phi$ is independent of $u_3$. This is due to the occurrence of $\ln A$ at some power in $u_3^{r_3}$, and a study at infinity shows that $\phi$ cannot behave polynomially in $A$ if $u_3$ occurs in $\phi$. Now, an examination of the exponents of $\bar C$ and $A$ (making $B = 0$) in $\phi$ leads to the conditions

$r_4 + r_5 \ge 0$, $\quad s r_1 + r r_5$ is a positive multiple of $s$.

Hence, $r_5 = k_5 s$, with $r k_5 \ge -r_1$, $s k_5 \ge -r_4$. It results that in case $k_5 > 0$, one has a monomial $u_1^{r_1} u_2^{r_2} u_4^{r_4} u_5^{k_5}$, while in case $k_5 < 0$, one has a monomial $u_1^{r'_1} u_2^{r_2} u_4^{r'_4} \bar u_5^{-k_5}$, with $r'_1 = r_1 + r k_5$, $r'_4 = r_4 + s k_5$. Finally the polynomial solutions of $D^* f = 0$ can be written as
\[ f = P_0(u_1, u_2, u_4) + u_5 P_1(u_1, u_2, u_4, u_5) + \bar u_5 P_2(u_1, u_2, u_4, \bar u_5), \]
where $P_j$ are polynomials in their arguments. Notice that, if one has in addition $f \circ S = \pm \bar f$, then polynomials $P_j$ have real or pure imaginary coefficients.

Let us now solve $D^* N_0 = -iq_0 N_0$, $N_0 \circ S = -\bar N_0$. We observe that $D^*(\bar A N_0) = 0$, hence
\[ \bar A N_0 = \phi_0(u_1, u_2, u_4) + u_5 \phi_1(u_1, u_2, u_4, u_5) + \bar u_5 \phi_2(u_1, u_2, u_4, \bar u_5), \]
and $u_1$ should be a factor of the polynomials $\phi_0$ and $\phi_1$. Finally one obtains, after using the reversibility condition,
\[ N_0 = iA[P_0(u_1, u_2, u_4) + u_5 P_1(u_1, u_2, u_4, u_5) + \bar u_5 P_2(u_1, u_2, u_4, \bar u_5)] + i\bar A^{r-1} C^s P_3(u_2, u_4, \bar u_5), \]
where $P_0, P_1, P_2, P_3$ have real coefficients. Let us consider the equation $D^* \tilde N_0 = -iq_0 \tilde N_0 + N_0$, and observe that $D^*(\bar A \tilde N_0) = \bar A N_0$, $D^*(\bar A B) = u_1$, $D^*(\bar A^{r-1} \bar B C^s) = \bar u_5$, hence (using reversibility again)
\begin{align*}
\bar A \tilde N_0 = {} & i\bar A B[P_0(u_1, u_2, u_4) + u_5 P_1(u_1, u_2, u_4, u_5) + \bar u_5 P_2(u_1, u_2, u_4, \bar u_5)] \\
& + A\bar A[Q_0(u_1, u_2, u_4) + u_5 Q_1(u_1, u_2, u_4, u_5) + \bar u_5 Q_2(u_1, u_2, u_4, \bar u_5)] \\
& + i\bar A^{r-1} \bar B C^s P_3(u_2, u_4, \bar u_5) + \bar A^r C^s Q_3(u_2, u_4, \bar u_5).
\end{align*}
If $r = 1$, making $\bar A = 0$ leads to $0 = \bar B C^s P_3(u_2, u_4, 0)$, and $\bar u_5$ is factor of $P_3$ if $r = 1$. Finally, we also have $D^*(\bar C N_1) = 0$, then the normal form reads
\begin{align*}
N_0 = {} & iA[P_0(u_1, u_2, u_4) + u_5 P_1(u_1, u_2, u_4, u_5) + \bar u_5 P_2(u_1, u_2, u_4, \bar u_5)] + i\bar A^{r-1} C^s P_3(u_2, u_4, \bar u_5), \\
\tilde N_0 = {} & iB[P_0(u_1, u_2, u_4) + u_5 P_1(u_1, u_2, u_4, u_5) + \bar u_5 P_2(u_1, u_2, u_4, \bar u_5)] \\
& + A[Q_0(u_1, u_2, u_4) + u_5 Q_1(u_1, u_2, u_4, u_5) + \bar u_5 Q_2(u_1, u_2, u_4, \bar u_5)] \\
& + i\bar A^{r-2} \bar B C^s P_3(u_2, u_4, \bar u_5) + \bar A^{r-1} C^s Q_3(u_2, u_4, \bar u_5), \\
N_1 = {} & iC[R_0(u_1, u_2, u_4) + u_5 R_1(u_1, u_2, u_4, u_5) + \bar u_5 R_2(u_1, u_2, u_4, \bar u_5)] + i\bar C^{s-1} A^r R_3(u_1, u_2, u_5),
\end{align*}
where all polynomials have real coefficients and where $\bar u_5$ is in factor in $P_3$ when $r = 1$.
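The statement in Appendix 2 that $u_1, \dots, u_5$ are first integrals of $D^* f = 0$ can be checked by applying $D^*$ to each of them. The following worked computation is a sketch that uses only the definition of $D^*$ and the assumption $q_1/q_0 = r/s$; overbars denote complex conjugates.

```latex
% Action of D^* on the coordinates, read off from its definition:
%   D^*A = -iq_0 A,   D^*B = -iq_0 B + A,   D^*C = -iq_1 C,
%   D^*\bar A = iq_0 \bar A,   D^*\bar B = iq_0 \bar B + \bar A,   D^*\bar C = iq_1 \bar C.
\begin{align*}
D^*(A\bar A) &= (-iq_0 + iq_0)\,A\bar A = 0,\\
D^*(A\bar B) &= -iq_0 A\bar B + A(iq_0\bar B + \bar A) = A\bar A,\\
D^*(\bar A B) &= iq_0\bar A B + \bar A(-iq_0 B + A) = A\bar A
  \;\Longrightarrow\; D^*u_2 = \tfrac{i}{2}(A\bar A - A\bar A) = 0,\\
D^*\!\left(\tfrac{B}{A}\right) &= \frac{-iq_0 B + A}{A} + iq_0\frac{B}{A} = 1,\qquad
D^*(\ln A) = -iq_0
  \;\Longrightarrow\; D^*u_3 = iq_0 - iq_0 = 0,\\
D^*(C\bar C) &= (-iq_1 + iq_1)\,C\bar C = 0,\\
D^*(A^r\bar C^s) &= i(sq_1 - rq_0)\,A^r\bar C^s = 0 \quad\text{since } q_1/q_0 = r/s.
\end{align*}
```

The same bookkeeping gives $D^*(\bar A N_0) = iq_0 \bar A N_0 + \bar A(-iq_0 N_0) = 0$, which is the observation used to solve for $N_0$.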
Acknowledgement. The authors acknowledge R. MacKay for giving them this challenging problem, J. Mallet-Paret for his advice about previous work, and A. Mielke for fruitful discussions.
References
1. Aubry, S.: Breathers in nonlinear lattices: Existence, linear stability and quantization. Physica D 103, 201–250 (1997)
2. Boyd, J.P.: A numerical calculation of a weakly non-local solitary wave: the φ^4 breather. Nonlinearity 3, 177–195 (1990)
3. Chow, Shui-Nee, Mallet-Paret, J., Shen, Wenxian: Traveling waves in lattice dynamical systems. J. Diff. Eq. 149, 248–291 (1998)
4. Devaney, R.L.: Reversible diffeomorphisms and flows. Trans. Am. Math. Soc. 218, 89–113 (1976)
5. Dias, F., Iooss, G.: Capillary-gravity solitary waves with damped oscillations. Physica D 65, 399–423 (1993)
6. Friesecke, G., Wattis, J.A.: Existence theorem for solitary waves on lattices. Commun. Math. Phys. 161, 391–418 (1994)
7. Iooss, G., Adelmeyer, M.: Topics in Bifurcation theory and Applications. Adv. Ser. Nonlinear Dynamics, 3, Singapore: World Sci., 1992
8. Iooss, G., Kirchgässner, K.: Water waves for small surface tension: An approach via normal form. Proc. Roy. Soc. Edinburgh, 122A, 267–299 (1992)
9. Iooss, G., Pérouème, M.C.: Perturbed homoclinic solutions in reversible 1:1 resonance vector fields. J. Diff. Eq. 102, 62–88 (1993)
10. Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer-Verlag, 1966
11. Kirchgässner, K.: Wave solutions of reversible systems and applications. J. Diff. Eqs. 45, 113–127 (1982)
12. Lombardi, E.: Oscillatory integrals and phenomena beyond all algebraic orders; Applications to homoclinic orbits in reversible systems. To appear in Lecture Notes in Maths. Berlin–Heidelberg–New York: Springer-Verlag
13. Mallet-Paret, J.: The global structure of traveling waves in spatially discrete dynamical systems. To appear in J. Dyn. Diff. Eqs., 1999
14. Mallet-Paret, J.: The Fredholm alternative for functional differential equations of mixed type. To appear in J. Dyn. Diff. Eqs., 1999
15. MacKay, R.S., Aubry, S.: Proof of existence of breathers for time-reversible or hamiltonian networks of weakly coupled oscillators. Nonlinearity 7, 1623–1643 (1994)
16. Mielke, A.: Reduction of quasilinear elliptic equations in cylindrical domains with applications. Math. Meth. Appl. Sci. 10, 51–66 (1986)
17. Rustichini, A.: Functional differential equations of mixed type: The linear autonomous case. J. Diff. Eqs. Vol. 1 No. 2, 121–143 (1989)
18. Rustichini, A.: Hopf bifurcation for functional differential equations of mixed type. J. Diff. Eqs. Vol. 1 No. 2, 145–177 (1989)
19. Smets, D., Willem, M.: Solitary waves with prescribed speed on infinite lattices. J. Funct. Anal. 149, 266–275 (1997)
20. Vanderbauwhede, A., Iooss, G.: Center manifold theory in infinite dimensions. Dynamics Reported, 1 new series, 125–163 (1992)

Communicated by A. Kupiainen
Commun. Math. Phys. 211, 465 – 485 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Critical Coupling Constants and Eigenvalue Asymptotics of Perturbed Periodic Sturm–Liouville Operators
Karl Michael Schmidt
Mathematisches Institut der Universität, Theresienstr. 39, 80333 München, Germany
Received: 23 September 1999 / Accepted: 21 December 1999
Abstract: Perturbations of asymptotic decay $c/r^2$ arise in the partial-wave analysis of rotationally symmetric partial differential operators. We show that for each end-point $\lambda_0$ of the spectral bands of a perturbed periodic Sturm–Liouville operator, there is a critical coupling constant $c_{crit}$ such that eigenvalues in the spectral gap accumulate at $\lambda_0$ if and only if $c/c_{crit} > 1$. The oscillation theoretic method used in the proof also yields the asymptotic distribution of the eigenvalues near $\lambda_0$.

Introduction

According to the classical oscillation criterion of Kneser [26], potentials of the type $q(x) \sim c/x^2$ $(x \to \infty)$ present a boundary case in the spectral theory of Sturm–Liouville operators
\[ -\frac{d^2}{dx^2} + q(x), \]
as the corresponding eigenvalue equation is oscillatory or non-oscillatory at $\infty$, corresponding to an infinite or finite number of eigenvalues below the essential spectrum, depending on whether $c < -\frac14$ or $c > -\frac14$. The limiting case was settled to be non-oscillatory by a refinement of Kneser’s criterion obtained independently by Weber [41], Hartman [16], and Hille [20]. This critical behaviour is also reflected in the asymptotic distribution of the eigenvalues in the oscillatory case, where a correction term to the semi-classical formula appears ([21]; an alternative treatment, including the rotationally symmetric case, can be found in [23]).

Potential terms of decay $\sim c/r^2$ naturally occur as an angular momentum term in the partial-wave analysis of rotationally symmetric partial differential operators, and hence are of practical interest as well. For example, it has been observed [19] that in the spectrum of Schrödinger operators with spherically symmetric, radially periodic potentials, the spectral gaps of the associated one-dimensional periodic Sturm–Liouville operator are densely filled with eigenvalues, which arise from the interaction of the periodic potential with the angular momentum term. An analysis of this interaction will therefore help understand this striking occurrence of dense point spectrum.

There are a large number of studies concerning eigenvalues produced by a short-range perturbation in gaps of periodic Sturm–Liouville operators ([2, 29, 45] (cf. [12]), [9, 10]; under more general assumptions, including the higher-dimensional case, [7, 11]). Furthermore, the eigenvalue asymptotics in the large coupling limit have been studied extensively ([1, 18, 3, 4, 5]). The hypotheses under which these results are valid, however, exclude perturbations of decay order $\sim c/r^2$ $(r \to \infty)$. Rofe-Beketov has devoted a series of papers to this case, observing first ([30] Theorems 9, 10; [31] Theorems 4, 5) that $c/r^2$ is a critical power of decay even in the presence of a periodic background potential, every spectral gap containing only finitely many eigenvalues for small $|c|$, but infinitely many eigenvalues for sufficiently large $|c|$ (supercritical case). For a fixed coupling constant, sufficiently remote gaps contain only a finite number of eigenvalues. Later he could specify the value of the critical constant, and extended his results in part to the case of an almost periodic background potential ([32, 33]). Khryashchev [22] has given analogous results (under slightly more general assumptions on the perturbation) for the periodic Schrödinger operator in one, and in three or more dimensions. Recently, Gesztesy and Ünal [14] have developed an interesting extension of Kneser’s oscillation criterion to situations with a background potential.
The discrete eigenvalues below the essential spectrum of a critically perturbed periodic Sturm–Liouville operator were studied in [37]; from the characterisation of the critical constant given there it is clear that the lowest angular momentum term for radially periodic Schrödinger operators in the plane, $-\frac{1}{4r^2}$, is supercritical, producing an infinity of eigenvalues below the essential spectrum, except in the trivial case of a constant potential (cf. the example at the end of Sect. 1 below). The existence of such eigenvalues had been discovered only shortly before [6].

In his work on the eigenvalue asymptotics in the large coupling limit for perturbed periodic Sturm–Liouville operators, Sobolev [38] admits long-range interactions, including the critical power decay; however, as the coupling constant tends to infinity, it passes the critical threshold sooner or later, and the critical character of such perturbations is no longer visible in this kind of asymptotic limit. Zelenko [44] has given a formula for the asymptotic distribution of the eigenvalues for a fixed long-range perturbation; however, the asymptotic distribution with a perturbation $\sim c/r^2$ with fixed, supercritical $c$ does not seem to be known so far.

It is the purpose of the present paper to develop a unified and transparent method which will permit both to establish the appearance of critical constants for the perturbation and to determine their value, and to obtain the asymptotic distribution of the eigenvalues at the edge of a gap in the supercritical case.

We shall assume the following general hypotheses throughout: Let $p > 0$, $\frac1p, q \in L^1_{loc}(\mathbb{R})$ be real-valued and $\alpha$-periodic, $\alpha > 0$, and let $T_0$ be the self-adjoint realisation in $L^2(\mathbb{R})$ of the corresponding Sturm–Liouville operator
\[ -\frac{d}{dr}\, p\, \frac{d}{dr} + q. \]
Furthermore, let $\tilde q \in L^1_{loc}(0, \infty)$, $\tilde q(r) \sim \frac{c}{r^2}$ $(r \to \infty)$ for some $c \in \mathbb{R}$; we assume that the perturbed Sturm–Liouville equation
\[ -(pu')' + (q + \tilde q)\, u = \lambda u \tag{i} \]
is non-oscillatory at 0 for all $\lambda \in \mathbb{R}$. Let $T$ be any self-adjoint realisation of the perturbed Sturm–Liouville operator
\[ -\frac{d}{dr}\, p\, \frac{d}{dr} + q + \tilde q \]
on the interval $(0, \infty)$. We abbreviate $\hat q(r) := r^2 \tilde q(r) - c = o(1)$ $(r \to \infty)$.

Remark. The assumption of non-oscillation at 0 implies that the spectral phenomena studied in this paper arise from the interplay between the perturbation and the periodic potential near infinity. Specifically, in the case $p \equiv 1$, the equation (i) is non-oscillatory at 0 if
\[ q(r) + \tilde q(r) \ge -\frac{1}{4r^2} + \text{const} \]
near 0 (as can be seen by applying the Kneser–Weber–Hartman–Hille oscillation criterion ([17] Chapter XI, Exercise 1.2) to the transformed differential equation for $v(x) := x u(\frac1x)$).

As a direct consequence of the above assumptions on the perturbation, the essential spectra of $T$ and $T_0$ coincide; in the interior of the spectral bands of $T_0$, the spectrum of $T$ is purely absolutely continuous (cf. [15] Theorem 23, p. 29; [39] Theorem 2b). In particular, there are no embedded eigenvalues (even the integrated density of states is unchanged – [39] Corollary 12); the spectrum of $T$ differs from that of $T_0$ only in that discrete eigenvalues may appear in gaps of the essential spectrum. (For simplicity, we treat the unbounded interval $(-\infty, \inf \sigma_e(T_0))$ as a spectral gap.) The exact number and position of these eigenvalues will sensitively depend on the coefficients $p$, $q$, $\tilde q$ and on the boundary condition at 0 (if equation (i) is in the limit circle case at 0); here we address the more general question of whether or not the total number of eigenvalues in a given gap is finite or infinite.

In Sect. 1 we prove the following result. $D$ denotes the discriminant (cf. Appendix 4.) of the unperturbed $\alpha$-periodic equation
\[ -(pu')' + qu = \lambda u. \tag{ii} \]

Theorem 1. Let $\lambda_0$ be an end-point of a gap in the essential spectrum of $T$, and
\[ c_{crit} := \frac{\alpha^2}{4 |D|'(\lambda_0)}. \]
Then $\lambda_0$ is not an accumulation point of eigenvalues if $\frac{c}{c_{crit}} < 1$, and $\lambda_0$ is an accumulation point of eigenvalues if $\frac{c}{c_{crit}} > 1$.

Remark. It is clear from the properties of the discriminant that $c_{crit}$ is positive at the left-hand end-point, and negative at the right-hand end-point of a gap; as a consequence, eigenvalues cannot accumulate at both end-points of the gap. The critical constant is closely related to the effective mass $m$ of the unperturbed periodic problem; indeed, $c_{crit} = (8m(\lambda_0))^{-1}$ ([32] Theorem 2, [33] Theorem 2).
With the method used to prove Theorem 1, we are also able to decide the behaviour in the limiting case $c = c_{crit}$; this requires further assumptions on the asymptotics of the perturbation $\tilde q$. For
\[ \tilde q(r) \sim \frac{c_{crit}}{r^2} + \frac{\hat c}{r^2 (\log r)^2} \quad (r \to \infty) \]
we prove the following result in Sect. 2. In analogy to the Kneser–Weber–Hartman–Hille oscillation criterion, again a critical threshold for the coupling constant appears; it is interesting to note that it is the same as for the leading order.

Theorem 2. Let $\lambda_0$, $c_{crit}$ be as in Theorem 1. Then $\lambda_0$ is not an accumulation point of eigenvalues if $\frac{\hat c}{c_{crit}} < 1$, and $\lambda_0$ is an accumulation point of eigenvalues if $\frac{\hat c}{c_{crit}} > 1$.

In the supercritical case, in which eigenvalues do accumulate at a given edge of the essential spectrum, it is a natural question to ask about their asymptotic distribution as one approaches the essential spectrum. In Sect. 3 we prove

Theorem 3. Let $\lambda_0$ be an end-point of a gap in the essential spectrum of $T$, and $\lambda_1$ an interior point of the same gap. Let $c_{crit}$ be defined as in Theorem 1, and $\frac{c}{c_{crit}} > 1$. Then, for $\lambda$ in the gap, the number of eigenvalues between $\lambda_1$ and $\lambda$ is given asymptotically by
\[ N(\lambda) \sim \frac{\sqrt{\frac{c}{c_{crit}} - 1}}{4\pi}\, \log|\lambda - \lambda_0| \quad (\lambda \to \lambda_0). \]
In the case $\int^\infty |r \tilde q(r) - \frac{c}{r}|\,dr < \infty$ the asymptotic remainder term is bounded.

In the Appendix, we collect several tools used in the proofs of these theorems.

1. Critical Coupling Constants

In order to prove Theorem 1, we proceed as follows. According to the characterisation of the spectrum of $T_0$ in terms of stability intervals of the periodic equation (ii) (cf. Appendix 4.), $\lambda_0$ is an end-point of a non-degenerate instability interval $I_n$, for some $n \in \mathbb{N}_0$ (cf. Proposition 5). By Theorem 4, a subinterval $[\lambda_1, \lambda_2] \subset I_n$ contains infinitely many eigenvalues if and only if the difference of the Prüfer angles of solutions of the perturbed equation (i) with spectral parameter $\lambda_1$ and $\lambda_2$, respectively, is unbounded as $r \to \infty$.

The asymptotic behaviour of the Prüfer angles of solutions of the unperturbed equation (ii) with spectral parameter $\lambda \in I_n$ is well-known and does not depend on $\lambda$ to leading order (Proposition 5). Using the fact that $\tilde q(r) \to 0$ $(r \to \infty)$ and Sturm’s comparison theorem, it is not difficult to see that for $\lambda \in I_n$, the Prüfer angle of solutions of the perturbed equation (i) has the same asymptotic growth,
\[ \vartheta(r) = -\frac{n\pi}{\alpha}\, r + O(1) \quad (r \to \infty). \tag{iii} \]
At the end-points of the gap, however, the relative oscillation analysis is considerably more complicated. A first glance at the Prüfer equations associated with (i) and (ii),
\begin{align*}
\vartheta' &= (q + \tilde q - \lambda) \cos^2\vartheta - \frac1p \sin^2\vartheta, \\
\vartheta_0' &= (q - \lambda) \cos^2\vartheta_0 - \frac1p \sin^2\vartheta_0,
\end{align*}
might raise the suspicion that the difference $\vartheta - \vartheta_0$ must always be bounded, as the perturbation of the right-hand side is integrable; but this reasoning is fallacious. On the contrary, we shall prove that at $\lambda_0$, the Prüfer angle of solutions of (i) has the same asymptotics (iii) if $\frac{c}{c_{crit}} < 1$, but has an additional unbounded asymptotic term if $\frac{c}{c_{crit}} > 1$. We call equation (i) relatively oscillatory with respect to (ii) in the latter case.

As the equation for the Prüfer angle cannot be solved explicitly, we resort to perturbation theory. Using a fundamental system $(u, v)$ of equation (ii) in an ansatz combining variation of constants and a Prüfer-type transformation, we introduce a generalised angle variable $\gamma$ which is unbounded if and only if (i) is relatively oscillatory with respect to (ii). Furthermore, $\gamma$ solves a differential equation of appealing simplicity (iv), which, however, does not yet satisfy our needs: its right-hand side is homogeneous in the perturbation and therefore will not readily show a threshold behaviour concerning the boundedness or otherwise of its solutions. Using Rofe-Beketov’s formula (Lemma 2) for the second solution $v$, we are led to a Kepler transformation (Lemma 1) yielding a new, essentially equivalent, generalised angle $\varphi$. The differential equation (v) for $\varphi$ looks more complicated than equation (iv), but its structure is considerably more revealing. Its right-hand side consists of a leading term $\sim \frac1r$ and a higher-order term which does not significantly affect the asymptotics of solutions. Moreover, the coupling constant does not enter the leading term homogeneously, but only in one of its summands; the interplay between this term and the other terms eventually gives rise to the critical behaviour. If we choose, for the solution $u$, the $\alpha$-periodic or $\alpha$-antiperiodic solution which exists at the end-point $\lambda_0$ of the instability interval $I_n$ (cf. Appendix 4), the coefficients of equation (v) are essentially $\alpha$-periodic.

We introduce the average $\bar\varphi$ of $\varphi$ over one period, which has the same asymptotic growth as $\varphi$ (Proposition 2). It satisfies a differential equation with essentially constant coefficients, from which its asymptotic growth can readily be estimated (Proposition 3). Finally, we derive the formula for the critical constant by calculating the derivative of the discriminant.

We remark that Rofe-Beketov ([30–33]) derives equation (iv), but then interprets this equation as the Prüfer equation of an associated Sturm–Liouville equation with coefficients given rather implicitly by means of a transformation of the independent variable. Then an oscillation criterion of Hille and Wintner ([40] Theorem 2.12) provides a quick proof for the existence of the critical threshold in the case $p \equiv 1$. However, this approach has several disadvantages. Already the generalisation to $p \not\equiv 1$ meets severe obstacles, and it is not possible to decide the limiting case, or to derive the eigenvalue asymptotics, both of which proceed naturally from the method developed here (see Sects. 2, 3).

To prove Theorem 1, let $(u, v)$ be a real-valued fundamental system of (ii) with Wronskian equal to 1, and $y$ any real-valued solution of (i). Then, introducing modified Prüfer variables $a, \gamma \in AC_{loc}(0, \infty)$ by
\[ \begin{pmatrix} y \\ py' \end{pmatrix} = \begin{pmatrix} u & v \\ pu' & pv' \end{pmatrix} a \begin{pmatrix} \sin\gamma \\ -\cos\gamma \end{pmatrix}, \]
a straightforward calculation gives the differential equation for $\gamma$
\[ \gamma' = \tilde q\, (u \sin\gamma - v \cos\gamma)^2. \tag{iv} \]
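The calculation behind (iv) can be sketched as follows. The symbols $M$, $\tilde M$, $e$, $e^\perp$ and $w$ are shorthand introduced only for this sketch; $\Phi$ denotes the matrix of the fundamental system $(u, v)$, and $\det\Phi = 1$ by the Wronskian normalisation.

```latex
% Notation (assumed for this sketch only):
%   Y = (y, py')^T,  \Phi = [[u, v], [pu', pv']],
%   e(\gamma) = (\sin\gamma, -\cos\gamma)^T,  e^\perp(\gamma) = de/d\gamma = (\cos\gamma, \sin\gamma)^T.
\begin{align*}
&\Phi' = M\Phi,\quad Y' = \tilde M Y,\qquad
M = \begin{pmatrix} 0 & 1/p\\ q-\lambda & 0\end{pmatrix},\quad
\tilde M - M = \begin{pmatrix} 0 & 0\\ \tilde q & 0\end{pmatrix},\\
&Y = a\,\Phi e
\;\Longrightarrow\;
a'\,e + a\gamma'\,e^\perp = a\,\Phi^{-1}(\tilde M - M)\Phi e,\\
&(\tilde M - M)\Phi e = \begin{pmatrix} 0\\ \tilde q\,w\end{pmatrix},\quad
w := u\sin\gamma - v\cos\gamma,\qquad
\Phi^{-1} = \begin{pmatrix} pv' & -v\\ -pu' & u\end{pmatrix},\\
&\Phi^{-1}\begin{pmatrix} 0\\ \tilde q\,w\end{pmatrix}
= \tilde q\,w\begin{pmatrix} -v\\ u\end{pmatrix}
\;\Longrightarrow\;
\gamma' = e^\perp\cdot \tilde q\,w\begin{pmatrix} -v\\ u\end{pmatrix}
= \tilde q\,(u\sin\gamma - v\cos\gamma)^2 .
\end{align*}
```

The last step takes the inner product with $e^\perp$, using $e \cdot e^\perp = 0$, $|e^\perp| = 1$ and $a \ne 0$ for a non-trivial solution $y$.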
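Proposition 1. Let $u$ be a non-trivial solution of (ii), $\vartheta_1$ its Prüfer angle, and $\vartheta$ the Prüfer angle of a non-trivial solution of (i). Then $\vartheta(r) = \vartheta_1(r) + \varphi(r) + O(1)$ $(r \to \infty)$, where $\varphi$ is any solution of the differential equation
\begin{align*}
\varphi' = {} & \frac1r \left( (c + \hat q(r))\, u^2 \sin^2\varphi - \sin\varphi \cos\varphi - \frac{(q - \lambda + \frac1p)(u^2 - (pu')^2)}{(u^2 + (pu')^2)^2} \cos^2\varphi \right) \\
& + \frac{c + \hat q(r)}{r^2} \left( \frac{2u\,pu'}{u^2 + (pu')^2} \sin\varphi \cos\varphi + \frac{(pu')^2}{r (u^2 + (pu')^2)^2} \cos^2\varphi \right). \tag{v}
\end{align*}

Proof. Let $a$, $\gamma$ be defined as above, $\vartheta_2$ be a Prüfer angle, and $\varrho$, $\varrho_1$, $\varrho_2$ be Prüfer radii of $y$, $u$ and $v$, respectively. Then
\[ a \begin{pmatrix} \sin\gamma \\ -\cos\gamma \end{pmatrix} = \varrho \begin{pmatrix} -\varrho_2 \sin(\vartheta - \vartheta_2) \\ \varrho_1 \sin(\vartheta - \vartheta_1) \end{pmatrix}; \]
hence,
\[ \tan\gamma = \frac{\varrho_2}{\varrho_1} \sin(\vartheta_2 - \vartheta_1) \left( \tan\!\left(\vartheta - \vartheta_1 + \frac\pi2\right) + \cot(\vartheta_2 - \vartheta_1) \right). \]
As the Wronskian $\varrho_1 \varrho_2 \sin(\vartheta_2 - \vartheta_1) = 1$ and thus $\cot(\vartheta_2 - \vartheta_1)$ is locally bounded, Lemma 1 implies $\vartheta(r) = \vartheta_1(r) + \gamma(r) + O(1)$ $(r \to \infty)$.

Substituting $v$ from Rofe-Beketov’s formula (Lemma 2), equation (iv) turns into
\[ \gamma' = (c + \hat q(r)) \cos^2\gamma \left( \frac{u(r)}{r} \left( \tan\gamma - \int_1^r \frac{(q - \lambda + \frac1p)(u^2 - (pu')^2)}{(u^2 + (pu')^2)^2} \right) + \frac{pu'}{r (u^2 + (pu')^2)} \right)^2, \]
and the Kepler transformation
\[ \varphi(r) := \arctan\left( \frac1r \left( \tan\gamma(r) - \int_1^r \frac{(q - \lambda + \frac1p)(u^2 - (pu')^2)}{(u^2 + (pu')^2)^2} \right) \right) = \gamma(r) + O(1) \quad (r \to \infty) \]
yields the assertion. □

Now consider the case $\lambda = \lambda_0$, and let $u$ be the $\alpha$-(anti-)periodic solution of the periodic equation (ii). Then
\[ A := -\frac{(q - \lambda_0 + \frac1p)(u^2 - (pu')^2)}{(u^2 + (pu')^2)^2}, \qquad B := cu^2 \]
are $\alpha$-periodic.

Proposition 2. Let $R_0 > 0$, $A, B \in L^1_{loc}([R_0, \infty))$ be $\alpha$-periodic, $\hat Q \in L^1_{loc}([R_0, \infty))$, $\hat Q(r) = o(1)$ $(r \to \infty)$, and $\varphi : [R_0, \infty) \to \mathbb{R}$ a locally absolutely continuous function satisfying
\[ \varphi'(r) = \frac1r \left( A(r) \cos^2\varphi(r) - \sin\varphi(r) \cos\varphi(r) + (B(r) + \hat Q(r)) \sin^2\varphi(r) \right) + O\!\left(\frac{1}{r^2}\right) \quad (r \to \infty). \]
Then the function
\[ \bar\varphi(r) := \frac1\alpha \int_r^{r+\alpha} \varphi \quad (r \ge R_0) \]
is locally absolutely continuous, $\bar\varphi(r) - \varphi(r) = o(1)$ $(r \to \infty)$, and
\[ \bar\varphi'(r) = \frac1r \left( \bar A \cos^2\bar\varphi(r) - \sin\bar\varphi(r) \cos\bar\varphi(r) + \bar B \sin^2\bar\varphi(r) \right) + R(r) \sin^2\bar\varphi(r) + O\!\left(\frac{1}{r^2}\right) \quad (r \to \infty) \tag{vi} \]
with $R(r) := \frac{1}{\alpha r} \int_r^{r+\alpha} \hat Q = o\!\left(\frac1r\right)$ $(r \to \infty)$, and constants $\bar A := \frac1\alpha \int_0^\alpha A$, $\bar B := \frac1\alpha \int_0^\alpha B$.

Proof. The mean value theorem of integral calculus provides, for each $r \ge R_0$, some $r_0 \in [r, r + \alpha]$ such that $\bar\varphi(r) = \varphi(r_0)$; hence for each $\varrho \in [r, r + \alpha]$:
\[ |\varphi(\varrho) - \bar\varphi(r)| = |\varphi(\varrho) - \varphi(r_0)| \le \int_r^{r+\alpha} |\varphi'| = O\!\left(\frac1r\right) \quad (r \to \infty), \]
and in particular $|\varphi(r) - \bar\varphi(r)| \to 0$ $(r \to \infty)$. An integration by parts yields
\begin{align*}
\bar\varphi'(r) = {} & \frac1\alpha \int_r^{r+\alpha} \varphi' \\
= {} & -\left[ \frac{1}{\alpha\varrho} \int_\varrho^{r+\alpha} (A \cos^2\varphi - \sin\varphi \cos\varphi + (B + \hat Q) \sin^2\varphi) \right]_{\varrho = r}^{r+\alpha} \\
& - \frac1\alpha \int_r^{r+\alpha} \frac{1}{\varrho^2} \int_\varrho^{r+\alpha} (A \cos^2\varphi - \sin\varphi \cos\varphi + (B + \hat Q) \sin^2\varphi)\, d\varrho + O\!\left(\frac{1}{r^2}\right) \\
= {} & \frac1r \left( \bar A \cos^2\bar\varphi(r) - \sin\bar\varphi(r) \cos\bar\varphi(r) + \bar B \sin^2\bar\varphi(r) \right) + R(r) \sin^2\bar\varphi(r) \\
& + \frac{1}{\alpha r} \int_r^{r+\alpha} \Big( A (\cos^2\varphi - \cos^2\bar\varphi(r)) - (\sin\varphi \cos\varphi - \sin\bar\varphi(r) \cos\bar\varphi(r)) \\
& \qquad\qquad + (B + \hat Q)(\sin^2\varphi - \sin^2\bar\varphi(r)) \Big) + O\!\left(\frac{1}{r^2}\right).
\end{align*}
The assertion follows, since
\[ |\sin^2\varphi(\varrho) - \sin^2\bar\varphi(r)| = \left| \int_{\bar\varphi(r)}^{\varphi(\varrho)} 2 \sin \cos \right| \le |\varphi(\varrho) - \bar\varphi(r)| \]
$(r \ge R_0,\ \varrho \in [r, r + \alpha])$, and a similar estimate holds for $\sin \cos$ and $\cos^2$. □

Remark. If the function $r \mapsto \frac{\hat Q(r)}{r}$ is absolutely integrable on $(R_0, \infty)$, then so is $R$.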
Proposition 3. Let R0 > 0, A, B ∈ R, g ∈ L1loc ([R0 , ∞)), g(r) = o( 1r ) (r → ∞), and ϕ : [R0 , ∞) → R locally absolutely continuous such that ϕ 0 (r) =
1 (A cos2 ϕ(r) − sin ϕ(r) cos ϕ(r) + B sin2 ϕ(r)) + g(r) (r ≥ R0 ). r
Then ϕ is bounded if 4AB < 1, and unbounded if 4AB > 1. In the latter case, ϕ(r) ∼
sgn A √ 4AB − 1 log r (r → ∞); 2
if, in addition, g ∈ L1 ((R0 , ∞)), the remainder term is bounded. Proof. With a suitable constant ϕ0 ∈ R we have A+B + A cos ϕ − sin ϕ cos ϕ + B sin ϕ = 2 2
2
thus ψ := ϕ − ϕ0 satisfies 1 ψ (r) = r 0
A+B + 2
p 1 + (A − B)2 cos 2(ϕ − ϕ0 ); 2
! p 1 + (A − B)2 cos 2ψ(r) + g(r) (r ≥ R0 ). 2
(vii)
Since lim rg(r) = 0, there is some r0 > R0 such that r→∞
p |4rg(r)| ≤ 1 + (A − B)2 − |A + B| (r ≥ r0 ). p 1st case. 4AB < 1, or equivalently, |A + B| < 1 + (A − B)2 . Then, for r ≥ r0 , the right-hand side of (vii) is strictly positive for ψ in a neighbourhood of 0 mod π , and strictly negative for ψ in a neighbourhood of π2 mod π, and it follows by standard arguments that ψ is globally bounded. p 2nd case. 4AB > 1, or equivalently, |A + B| > 1 + (A − B)2 . Then p |A + B| − 1 + (A − B)2 r → ∞ (r → ∞). log |ψ(r) − ψ(r0 )| ≥ 4 r0 In order to derive the asymptotics of ψ in this case, we rewrite (vii) as p A + B + 1 + (A − B)2 0 (cos2 ψ(r) + f 2 sin2 ψ(r)) + g(r) ψ (r) = 2r with constant
v p u u A + B − 1 + (A − B)2 t p . f := A + B + 1 + (A − B)2
The Kepler transformation ψ˜ := arctan(f tan ψ) turns this into √ 1 4AB − 1 ˜ + (f cos2 ψ˜ + sin2 ψ)g(r). ψ˜ 0 (r) = sgn(A + B) 2r f
Perturbed Periodic Sturm–Liouville Operators
473
As the factor of g is non-negative and bounded, this implies Z r sgn A √ r ˜ ˜ 4AB − 1 log + O |g| (r → ∞). ψ(r) = ψ(r0 ) + 2 r0 r0 ˜ is bounded, and that The assertion follows when we take into account that |ϕ − ψ| Rr r |g| =0 lim 0 r→∞ log r by l’Hospital’s rule. u t We conclude the proof of Theorem 1 by expressing the critical constant found above, ccrit
1 =− 4
Z
1 α
α
(q − λ0 + p1 )(u2 − (pu0 )2 ) (u2 + (pu0 )2 )2
0
!−1
1 α
Z
α
u2
−1
,
0
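in terms of the derivative of the discriminant at $\lambda_0$. In order to calculate the latter, we consider the solutions of the initial value problems
\[ X'(\lambda,\cdot) = \begin{pmatrix} 0 & \frac{1}{p} \\ q-\lambda & 0 \end{pmatrix} X(\lambda,\cdot), \qquad X(\lambda,0) = \Phi(0), \]
where $\Phi = \begin{pmatrix} u & v \\ pu' & pv' \end{pmatrix}$ is the fundamental system at $\lambda = \lambda_0$ with $u$ (anti-)periodic and $v$ given by Rofe-Beketov's formula (Lemma 2). Then the derivative of $X$ with respect to the parameter $\lambda$ at $\lambda = \lambda_0$ is the solution of the formally differentiated initial value problem
\[ \partial_\lambda X'(\lambda_0,\cdot) = \begin{pmatrix} 0 & \frac{1}{p} \\ q-\lambda_0 & 0 \end{pmatrix}\partial_\lambda X(\lambda_0,\cdot) + \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}\Phi, \qquad \partial_\lambda X(\lambda_0,0) = 0; \]
by variation of constants we obtain
\[ D'(\lambda_0) = \operatorname{tr}\partial_\lambda X(\lambda_0,\alpha) = \int_0^\alpha \operatorname{tr}\left(\Phi(\alpha)\,\Phi^{-1}\begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}\Phi\right). \]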
By the mean value theorem, $pu'$ has a zero; without loss of generality we may therefore assume that $pu'(0) = 0$. Then $D(\lambda_0) = 2(-1)^n$ and
\[ \Phi(\alpha) = \begin{pmatrix} (-1)^n & v(\alpha) \\ 0 & (-1)^n \end{pmatrix}, \]
which implies
\[ D'(\lambda_0) = -v(\alpha)\int_0^\alpha u^2 = -(-1)^n \int_0^\alpha \frac{(q-\lambda_0+\frac{1}{p})(u^2-(pu')^2)}{(u^2+(pu')^2)^2}\,\int_0^\alpha u^2. \]
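The dichotomy of Proposition 3 can be checked numerically. The following is only a minimal sketch under simplifying assumptions that are not taken from the paper: the perturbation is switched off ($g = 0$), the equation is integrated in the variable $s = \log r$ (so that $d\varphi/ds = A\cos^2\varphi - \sin\varphi\cos\varphi + B\sin^2\varphi$), and a classical Runge–Kutta scheme is used. The function names `rhs`, `integrate_phi` and `growth_rate` are hypothetical helpers introduced for this illustration.

```python
import math


def rhs(phi, A, B):
    """Right-hand side of d(phi)/ds with s = log r, assuming g = 0."""
    c, s = math.cos(phi), math.sin(phi)
    return A * c * c - s * c + B * s * s


def integrate_phi(A, B, s_end=200.0, steps=20000, phi0=0.0):
    """Integrate d(phi)/ds = rhs(phi) on [0, s_end] with classical RK4."""
    h = s_end / steps
    phi = phi0
    for _ in range(steps):
        k1 = rhs(phi, A, B)
        k2 = rhs(phi + 0.5 * h * k1, A, B)
        k3 = rhs(phi + 0.5 * h * k2, A, B)
        k4 = rhs(phi + h * k3, A, B)
        phi += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return phi


def growth_rate(A, B, s_end=200.0):
    """Empirical ratio phi(r) / log r at log r = s_end."""
    return integrate_phi(A, B, s_end) / s_end


if __name__ == "__main__":
    # 4AB = 4 > 1: unbounded, predicted rate sgn(A) * sqrt(4AB - 1) / 2
    print(growth_rate(1.0, 1.0), math.sqrt(3.0) / 2.0)
    # 4AB = 1/4 < 1: the solution settles near a fixed angle
    print(integrate_phi(0.25, 0.25))
```

For $A = B = 1$ the computed ratio should approach $\frac{1}{2}\sqrt{4AB-1} = \frac{\sqrt{3}}{2}$, while for $A = B = \frac14$ the angle stays bounded, in line with the two cases of the proof.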
Remarks. 1. In the relatively oscillatory case, the Prüfer angle $\vartheta$ of a solution of (i) with spectral parameter $\lambda_0$ has the asymptotic growth
\[ \vartheta(r) \sim -\frac{n\pi r}{\alpha} + \frac{\operatorname{sgn} c}{2}\sqrt{\frac{c}{c_{\mathrm{crit}}} - 1}\;\log r \quad (r\to\infty); \]
if $\int^\infty \frac{|\hat q(r)|}{r}\,dr < \infty$, then the asymptotic remainder term is bounded.

2. If $\lambda_0$ is the infimum of the essential spectrum, then the corresponding periodic solution $u$ of equation (ii) has no zeros, and one can use d'Alembert's formula, instead of Rofe-Beketov's formula, to obtain a second solution $v$. Equation (v) then simplifies to
\[ \varphi' = \frac{1}{r}\Bigl((c+\hat q(r))\,u^2\sin^2\varphi - \sin\varphi\cos\varphi - \frac{1}{pu^2}\cos^2\varphi\Bigr), \]
and applying Propositions 2 and 3 with $A := -\frac{1}{\alpha}\int_0^\alpha \frac{1}{pu^2}$, $B := \frac{c}{\alpha}\int_0^\alpha u^2$, we find the representation
\[ c_{\mathrm{crit}} = -\frac{1}{4}\left(\frac{1}{\alpha}\int_0^\alpha \frac{1}{pu^2}\right)^{-1}\left(\frac{1}{\alpha}\int_0^\alpha u^2\right)^{-1} \]
for the critical constant (cf. [37] Theorem 1).

Example. The rotationally symmetric Schrödinger operator in $\mathbb{R}^2$, $H = -\Delta + q(|\cdot|)$ (with a bounded periodic function $q$) is unitarily equivalent to the direct sum of Sturm–Liouville operators on the half-axis $(0,\infty)$,
\[ -\frac{d^2}{dr^2} + q(r) + \frac{\ell^2 - \frac{1}{4}}{r^2}, \quad \ell \in \mathbb{N}_0, \]
which arise by a separation of variables in polar coordinates. For $\ell \in \mathbb{N}$, the angular momentum term is positive, and hence does not produce eigenvalues below the essential spectrum. For $\ell = 0$, however, the lower spectrum is infinite except in the trivial case of constant $q$: indeed, for $p = 1$ the critical coupling constant given in the preceding remark satisfies $c_{\mathrm{crit}} \ge -\frac{1}{4}$ by the Schwarz inequality, with equality if and only if $u$ (and hence $q$) is constant. As a consequence, radially periodic Schrödinger operators in the plane always have infinitely many eigenvalues below their essential spectra (cf. [37] Theorem 2).

2. The Limiting Case

In this section we prove the following analogue of Proposition 3.

Proposition 4. Let $R_0 > 0$, $A, B, C \in \mathbb{R}$ such that $4AB = 1$. Let $\varphi : [R_0,\infty) \to \mathbb{R}$ be a locally absolutely continuous function satisfying
\[ \varphi'(r) = \frac{1}{r}\bigl(A\cos^2\varphi(r) - \sin\varphi(r)\cos\varphi(r) + B\sin^2\varphi(r)\bigr) + \frac{C}{r(\log r)^2}\sin^2\varphi(r) + o\Bigl(\frac{1}{r(\log r)^2}\Bigr) \quad (r\to\infty). \]
Then $\varphi$ is globally bounded if $4AC < 1$, and unbounded if $4AC > 1$.
Setting $C := \frac{\hat c}{\alpha}\int_0^\alpha u^2$ in Proposition 4, and
\[ R(r) := \frac{1}{r\alpha}\int_r^{r+\alpha}\frac{\hat c\,u^2}{\log^2} \sim \frac{C}{r(\log r)^2} \]
in (vi), this shows that the perturbed Sturm–Liouville equation
\[ -(pu')' + \Bigl(q + \frac{c_{\mathrm{crit}}}{r^2} + \frac{\hat c}{r^2(\log r)^2}\Bigr)u = \lambda_0 u \]
is relatively oscillatory (with respect to (ii)) if $\frac{\hat c}{c_{\mathrm{crit}}} > 1$, and relatively non-oscillatory if $\frac{\hat c}{c_{\mathrm{crit}}} < 1$. Theorem 2 follows by means of the Relative Oscillation Theorem.
Proof of Proposition 4. We shall prove the assertion for positive $C$; for $C \le 0$ the boundedness of $\varphi$ then follows from that for $C \in (0, \frac{1}{4A})$ by Sturm comparison. Furthermore, we assume that $A, B > 0$ (the complementary case $A, B < 0$ can be handled in an analogous way).

Define $\varphi_0$ as in the proof of Proposition 3, and $\Phi(s) := \varphi(e^s) - \varphi_0$ $(s \ge \log R_0)$; then with $K := A + B = \sqrt{1+(A-B)^2} > 0$,
\[ \Phi'(s) = K\cos^2\Phi(s) + \frac{C}{Ks^2}\bigl(B\cos^2\Phi(s) - \sin\Phi(s)\cos\Phi(s) + A\sin^2\Phi(s)\bigr) + o\Bigl(\frac{1}{s^2}\Bigr) \]
\[ = \frac{K^2s^2 + BC}{Ks^2}\Bigl(\cos^2\Phi(s) + \frac{AC}{K^2s^2+BC}\sin^2\Phi(s)\Bigr) - \frac{C}{Ks^2}\sin\Phi(s)\cos\Phi(s) + o\Bigl(\frac{1}{s^2}\Bigr) \quad (s\to\infty). \]
The Kepler transformation
\[ \Psi(s) := \arctan\Bigl(\sqrt{\frac{AC}{K^2s^2+BC}}\,\tan\Phi(s)\Bigr) \]
yields
\[ \Psi'(s) = \frac{1}{s}\bigl(\sqrt{AC}\cos^2\Psi(s) - \sin\Psi(s)\cos\Psi(s) + \sqrt{AC}\sin^2\Psi(s)\bigr) + o\Bigl(\frac{1}{s}\Bigr) \quad (s\to\infty). \]
Now apply Proposition 3. $\square$
3. Eigenvalue Asymptotics

In this section we prove Theorem 3. To this end, we extend the analysis of the relative oscillation behaviour of the perturbed equation, as performed in Sect. 1 for an end-point of a spectral gap, to points inside an instability interval. Instead of the $\alpha$-(anti-)periodic solution of the unperturbed equation, we have Floquet solutions which are periodic up to multiplication by an exponential factor (cf. Appendix 4). Including this exponential growth into equation (v) yields equation (viii), to which we apply the averaging procedure of Sect. 1.

Under the assumptions of Theorem 3, let $\lambda$ be a point between $\lambda_0$ and $\lambda_1$. The spectral gap under consideration is a non-degenerate instability interval $I_n$, for some $n \in \mathbb{N}_0$. As the perturbed equation is relatively oscillatory at $\lambda_0$, $I_n$ contains infinitely many eigenvalues between $\lambda$ and $\lambda_0$ (with $\lambda_0$ as an accumulation point). Here we ask how the (finite) number of eigenvalues between $\lambda$ and $\lambda_1$ grows asymptotically as $\lambda$ approaches $\lambda_0$; by Theorem 4, this number is given (with an error bounded independently of $\lambda$) by the asymptotic growth (as $r \to \infty$) of the difference of Prüfer angles of arbitrary solutions of equation (i) at $\lambda$ and at $\lambda_1$. Hence it is sufficient to study the quantity
\[ \vartheta_\lambda(r) - \vartheta_\lambda(r_0) + \frac{n\pi r}{\alpha} \quad (r > r_0) \]
for some $r_0 > 0$ and $\vartheta_\lambda$ a Prüfer angle of any solution of (i) with spectral parameter $\lambda \in I_n$.

Because the perturbation decays at infinity, we need not actually follow this solution through the limit $r \to \infty$, but only go up to a point $r_{\max}(\lambda)$ chosen such that the increase of the relative Prüfer angle in the remaining interval $[r_{\max},\infty)$ is bounded uniformly with respect to $\lambda$. Specifically, setting
\[ r_1(\lambda) := \frac{\sqrt{|c|}}{\sqrt{|\lambda-\lambda_0|}} \quad\text{and}\quad r_{\max}(\lambda) := \frac{\sqrt{|c| + \sup_{r\ge r_1(\lambda)}\hat q(r)}}{\sqrt{|\lambda-\lambda_0|}} \quad (\lambda \in I_n), \]
we have $|\tilde q(r)| \le |\lambda-\lambda_0|$ $(r \ge r_{\max}(\lambda),\ \lambda \in I_n)$, and $r_{\max}(\lambda) \sim r_1(\lambda)$ $(\lambda\to\lambda_0)$. Proposition 5 and Sturm comparison then imply
\[ \vartheta_\lambda(r) - \vartheta_\lambda(r_0) + \frac{n\pi r}{\alpha} \le C \quad (r \ge r_{\max}(\lambda)) \]
with a constant $C > 0$ which does not depend on $\lambda \in I_n$.

According to Proposition 1, we only commit an error bounded uniformly in $\lambda$ if, instead of $\vartheta_\lambda - \vartheta_\lambda(r_0) + \frac{n\pi\,\cdot}{\alpha}$, we consider any solution $\varphi_\lambda$ of the equation (v), where now $u$ is an arbitrary non-trivial solution of (ii) with eigenvalue parameter $\lambda$. Choosing for $u$ the Floquet solution corresponding to the Floquet multiplier $(-1)^n e^{m(\lambda)\alpha}$ (where $m(\lambda)$ denotes the Floquet exponent, cf. Appendix 4), and applying the Kepler transformation
\[ \psi_\lambda(r) := \arctan\bigl(e^{2m(\lambda)r}\tan\varphi_\lambda(r)\bigr), \]
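we obtain
\[ \psi_\lambda'(r) = \frac{1}{r}\bigl(A(r)\cos^2\psi_\lambda(r) - \sin\psi_\lambda(r)\cos\psi_\lambda(r) + B(r)\sin^2\psi_\lambda(r)\bigr) + \frac{\hat q(r)}{r}\,u^2(r)e^{-2m(\lambda)r}\sin^2\psi_\lambda(r) + 2m(\lambda)\sin\psi_\lambda(r)\cos\psi_\lambda(r) + O_{\mathrm{unif}}\Bigl(\frac{1}{r^2}\Bigr) \quad (r\to\infty). \tag{viii} \]
Here
\[ A(r) := \frac{(q-\lambda+\frac{1}{p})(u^2-(pu')^2)e^{-2m(\lambda)r}}{(u^2+(pu')^2)^2 e^{-4m(\lambda)r}} \]
and $B(r) := c\,u^2 e^{-2m(\lambda)r}$ $(r\in\mathbb{R})$ are $\alpha$-periodic; and the subscript in $O_{\mathrm{unif}}$ and $o_{\mathrm{unif}}$ indicates that the remainder term has the respective asymptotic behaviour as $r\to\infty$ uniformly with respect to $\lambda \in I_n$.

In analogy to the proof of Theorem 1, we now introduce the averaged function
\[ \bar\psi_\lambda(r) := \frac{1}{\alpha}\int_r^{r+\alpha}\psi_\lambda \quad (r \ge r_0), \]
for which an integration by parts gives
\[ \bar\psi_\lambda'(r) = \frac{2m(\lambda)}{\alpha}\int_r^{r+\alpha}\sin\psi_\lambda\cos\psi_\lambda + \frac{1}{\alpha r}\int_r^{r+\alpha}\bigl(A\cos^2\psi_\lambda - \sin\psi_\lambda\cos\psi_\lambda + (B + \hat q\,u^2 e^{-2m(\lambda)\,\cdot})\sin^2\psi_\lambda\bigr) + O_{\mathrm{unif}}\Bigl(\frac{1}{r^2}\Bigr) \quad (r\to\infty); \]
note that $\bar A := \frac{1}{\alpha}\int_r^{r+\alpha}A$, $\bar B := \frac{1}{\alpha}\int_r^{r+\alpha}B$ are continuous in $\lambda \in I_n$.

For each $r > r_0$ there is, by the mean value theorem, an $\tilde r \in [r, r+\alpha]$ such that $\bar\psi_\lambda(r) = \psi_\lambda(\tilde r)$, and thus for $\varrho \in [r, r+\alpha]$,
\[ |\psi_\lambda(\varrho) - \bar\psi_\lambda(r)| = |\psi_\lambda(\varrho) - \psi_\lambda(\tilde r)| \le \int_r^{r+\alpha}|\psi_\lambda'| \le \alpha m(\lambda) + O_{\mathrm{unif}}\Bigl(\frac{1}{r}\Bigr) \quad (r\to\infty). \]
We may therefore replace $\psi_\lambda$ by $\bar\psi_\lambda$ in the second integral above to obtain
\[ \bar\psi_\lambda'(r) = \frac{1}{r}\bigl(\bar A\cos^2\bar\psi_\lambda(r) - \sin\bar\psi_\lambda(r)\cos\bar\psi_\lambda(r) + \bar B\sin^2\bar\psi_\lambda(r)\bigr) + g_\lambda(r) \tag{ix} \]
with
\[ g_\lambda(r) := \frac{m(\lambda)}{\alpha}\int_r^{r+\alpha}\sin 2\psi_\lambda + \frac{1}{r\alpha}\int_r^{r+\alpha}\hat q(\varrho)\,u^2(\varrho)\,e^{-2m(\lambda)\varrho}\,d\varrho\;\sin^2\bar\psi_\lambda(r) + m(\lambda)\,O_{\mathrm{unif}}\Bigl(\frac{1}{r}\Bigr) + O_{\mathrm{unif}}\Bigl(\frac{1}{r^2}\Bigr) \quad (r\to\infty). \]
This equation is the analogue of equation (vi); note that here we have the additional error term $m(\lambda)\,O_{\mathrm{unif}}(\frac{1}{r})$, and that the first term of $g_\lambda$ does not decay at infinity. It will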
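dominate the behaviour of $\bar\psi_\lambda$ for large $r$; however, we consider equation (ix) on the interval $[r_0, r_{\max}(\lambda)]$ only.

From the asymptotics of $m(\lambda)$, $u(x,\lambda)$ and $pu'(\lambda)$ as $\lambda\to\lambda_0$ (Lemma 3) we find
\[ \bar A(\lambda) = \bar A(\lambda_0) + O(\sqrt{|\lambda-\lambda_0|}), \qquad \bar B(\lambda) = \bar B(\lambda_0) + O(\sqrt{|\lambda-\lambda_0|}); \]
furthermore, with the function
\[ G(r) := \frac{\sup u_0^2}{r\alpha}\int_r^{r+\alpha}|\hat q| = o\Bigl(\frac{1}{r}\Bigr) \quad (r\to\infty), \]
where $u_0$ denotes the (anti-)periodic solution of equation (ii) at $\lambda_0$, we have
\[ g_\lambda(r) \le G(r) + O(\sqrt{|\lambda-\lambda_0|}) \quad (r \ge r_0). \]
As
\[ |\bar\psi_\lambda(r) - \psi_\lambda(r)| \le \int_r^{r+\alpha}|\psi_\lambda'| \le \alpha m(\lambda) + \frac{1}{r}\int_0^\alpha\bigl(|A| + 1 + |B|\bigr) + o\Bigl(\frac{1}{r}\Bigr) \]
and hence $|\bar\psi_\lambda(r_{\max}(\lambda)) - \psi_\lambda(r_{\max}(\lambda))| = O(\sqrt{|\lambda-\lambda_0|})$ (while $|\bar\psi_\lambda(r_0) - \psi_\lambda(r_0)| = O(1)$), $\bar\psi_\lambda$ serves the purpose of estimating $N(\lambda)$ as well as $\psi_\lambda$.

As in the beginning of the proof of Proposition 3, rewrite
\[ \bar A(\lambda)\cos^2\bar\psi_\lambda - \sin\bar\psi_\lambda\cos\bar\psi_\lambda + \bar B(\lambda)\sin^2\bar\psi_\lambda = \frac{1}{2}\bigl(\bar A(\lambda)+\bar B(\lambda)\bigr) + \frac{1}{2}\sqrt{1+(\bar A(\lambda)-\bar B(\lambda))^2}\,\cos 2\bigl(\bar\psi_\lambda - \Psi(\lambda)\bigr) \]
with a suitable $\Psi(\lambda)$ continuous at $\lambda_0$. For $\lambda$ sufficiently close to $\lambda_0$, the function $\beta_\lambda(r) := \bar\psi_\lambda(r) - \Psi(\lambda)$ then satisfies
\[ \beta_\lambda'(r) = \frac{1}{2r}\Bigl(\bar A(\lambda)+\bar B(\lambda)+\sqrt{1+(\bar A(\lambda)-\bar B(\lambda))^2}\Bigr)\bigl(\cos^2\beta_\lambda(r) + f(\lambda)^2\sin^2\beta_\lambda(r)\bigr) + g_\lambda(r) \]
with
\[ f(\lambda) := \sqrt{\frac{\bar A(\lambda)+\bar B(\lambda)-\sqrt{1+(\bar A(\lambda)-\bar B(\lambda))^2}}{\bar A(\lambda)+\bar B(\lambda)+\sqrt{1+(\bar A(\lambda)-\bar B(\lambda))^2}}}. \]
The Kepler transformation $\tilde\beta_\lambda := \arctan(f(\lambda)\tan\beta_\lambda)$ yields
\[ \tilde\beta_\lambda'(r) = \frac{\operatorname{sgn}(\bar A(\lambda)+\bar B(\lambda))}{2r}\sqrt{4\bar A(\lambda)\bar B(\lambda)-1} + \Bigl(f(\lambda)\cos^2\tilde\beta_\lambda(r) + \frac{1}{f(\lambda)}\sin^2\tilde\beta_\lambda(r)\Bigr)g_\lambda(r) \]
\[ = \frac{\operatorname{sgn}(\bar A(\lambda_0)+\bar B(\lambda_0))}{2r}\Bigl(\sqrt{4\bar A(\lambda_0)\bar B(\lambda_0)-1} + O(\sqrt{|\lambda-\lambda_0|})\Bigr) + g_\lambda(r)\,O(1) \quad (\lambda\to\lambda_0). \]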
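Since
\[ \log\frac{r_{\max}(\lambda)}{r_0} = -\frac{1}{2}\log|\lambda-\lambda_0| + O(1) \quad (\lambda\to\lambda_0), \]
and
\[ \int_{r_0}^{r_{\max}(\lambda)} g_\lambda \le \int_{r_0}^{r_{\max}(\lambda)} G + (r_{\max}(\lambda)-r_0)\,O(\sqrt{|\lambda-\lambda_0|}) = o\bigl(\bigl|\log|\lambda-\lambda_0|\bigr|\bigr) + O(1) \]
by l'Hospital's rule, we find
\[ |\tilde\beta_\lambda(r_{\max}(\lambda)) - \tilde\beta_\lambda(r_0)| \sim \frac{1}{4}\sqrt{4\bar A(\lambda_0)\bar B(\lambda_0)-1}\;\bigl|\log|\lambda-\lambda_0|\bigr| \quad (\lambda\to\lambda_0). \]
If $\int^\infty \frac{|\hat q(\varrho)|}{\varrho}\,d\varrho < \infty$, then $G \in L^1(r_0,\infty)$ (cf. Remark after Proposition 3). This completes the proof of Theorem 3.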
4. Appendix

1. Relative oscillation theory. Consider the Sturm–Liouville equation (ii) under general hypotheses, i.e. with $p > 0$, $\frac{1}{p}, q \in L^1_{\mathrm{loc}}(a,b)$, $-\infty \le a < b \le \infty$. If a non-trivial solution $u$ of (ii) is real-valued, one can introduce its Prüfer variables $\varrho$, $\vartheta$ by
\[ \begin{pmatrix} u \\ pu' \end{pmatrix} = \varrho\begin{pmatrix} \cos\vartheta \\ \sin\vartheta \end{pmatrix}; \]
a straightforward calculation yields the Prüfer equation
\[ \vartheta' = (q-\lambda)\cos^2\vartheta - \frac{1}{p}\sin^2\vartheta. \]
By Sturm comparison (cf. [43] Theorem 16.1), $\vartheta(r)$ is a strictly monotone decreasing function of $\lambda$ for fixed $r > r_0$ and $\vartheta(r_0)$. As the Prüfer radius $\varrho$ can be obtained from the Prüfer angle $\vartheta$ by a simple integration, the equation for $\vartheta$ is equivalent to the original Sturm–Liouville equation; in particular, $\vartheta$ carries all the spectral information in principle.

Theorem 4 (Relative Oscillation Theorem). Let $T$ be a self-adjoint realisation of $-\frac{d}{dx}p\frac{d}{dx} + q$ on $(a,b)$ with separated boundary conditions (if appropriate). Furthermore, let $\lambda_1, \lambda_2 \in \mathbb{R}$, $\lambda_1 < \lambda_2$, and $\vartheta_{\lambda_1}$, $\vartheta_{\lambda_2}$ be solutions of the Prüfer equation for $\lambda = \lambda_1$ and $\lambda = \lambda_2$, respectively. Then, denoting by $E_T$ the spectral family of $T$,
\[ \Bigl|\dim E_T([\lambda_1,\lambda_2)) - \lim_{c\downarrow a,\,d\uparrow b}\frac{1}{\pi}\bigl((\vartheta_{\lambda_1}-\vartheta_{\lambda_2})(d) - (\vartheta_{\lambda_1}-\vartheta_{\lambda_2})(c)\bigr)\Bigr| \le 5. \]
While not directly a corollary of the theorems given there, this theorem can be proved along the lines of [42] Satz 4.1, [43] Theorem 14.7a, 16.4.
Remark. The estimate in Theorem 4 is far from optimal; indeed, taking into account whether the equation is in the limit point case or limit circle case at the end-points, and taking the Prüfer angles for suitably chosen solutions, it can be considerably sharpened, see [43] Theorems 14.7, 14.8. However, here we count eigenvalues only up to a bounded error, and it is more convenient to have an estimate valid regardless of the properties of the end-points a, b. We also remark that, in order to cope with the difficulties of oscillation theory in the oscillatory case, the study of the Wronskian of certain solutions for λ = λ1 and λ = λ2 , resp., has recently been proposed as an alternative (renormalised oscillation theory, [13]), where the difference of the Prüfer angles appears in a different guise.
2. Kepler transformations. Oscillation theory reduces the spectral analysis of the Sturm–Liouville operator, at least in principle, to the study of the non-linear differential Prüfer equation. In concrete cases it usually turns out that, as it stands, this equation does not allow a convenient qualitative characterisation of its solutions. Often, therefore, one uses modified Prüfer transformations, in which the phase vector $\begin{pmatrix} u \\ pu' \end{pmatrix}$ is subjected to a linear transformation (which may depend both on the spectral parameter and the independent variable) before polar coordinates are introduced (see [28] p. 24, [35] p. 1090, [36]; for difference equations, [24, 27]). In general, the appropriate choice of modified Prüfer variables is far from obvious; the following transformation allows a controlled simplification of a given equation of Prüfer type.

Lemma 1 (Kepler transformation). Let $I \subset \mathbb{R}$ be an interval, $\varphi, f, g : I \to \mathbb{R}$ locally absolutely continuous functions, $f > 0$. Then there is a unique choice of the branches of $\arctan$ in $\tilde\varphi = \arctan(f(\tan\varphi + g))$ such that $\tilde\varphi$ is locally absolutely continuous, and
\[ \tilde\varphi(x) \in \Bigl[n\pi - \frac{\pi}{2},\, n\pi + \frac{\pi}{2}\Bigr] \iff \varphi(x) \in \Bigl[n\pi - \frac{\pi}{2},\, n\pi + \frac{\pi}{2}\Bigr] \]
for all $n \in \mathbb{Z}$, $x \in I$. Furthermore,
\[ \tilde\varphi' = (\log f)'\sin\tilde\varphi\cos\tilde\varphi + f g'\cos^2\tilde\varphi + \frac{f\varphi'}{\cos^2\varphi + f^2(\sin\varphi + g\cos\varphi)^2}. \]
The proof is a simple calculation. Interpreting (for fixed $t \in I$) $\varphi(t)$ as a unit arc in $\mathbb{R}^2$, and setting $x := \cos\varphi(t)$, $y := \sin\varphi(t)$, then $\tilde\varphi(t)$ is the angle of the linearly transformed point
\[ \begin{pmatrix} \tilde x \\ \tilde y \end{pmatrix} := \begin{pmatrix} 1 & 0 \\ f(t)g(t) & f(t) \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \]
with respect to the $x$-axis; the unit circle is thereby transformed into an ellipse centred at 0. If $g = 0$, the circle is stretched in the direction of the $y$-axis only. The latter transformation was used by Kepler in his calculation of the area of ellipse segments ([25], p. 54).
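The derivative formula of Lemma 1 can be sanity-checked numerically. The sketch below is an illustration only: the concrete functions `phi`, `f`, `g` are arbitrary choices made here (not taken from the paper), with $|\varphi| < \pi/2$ so that the principal branch of `atan` is the correct continuous branch, and the left-hand side is approximated by a central difference.

```python
import math

# Hypothetical test functions, chosen only for illustration.
def phi(t):  return 0.3 * math.sin(t)
def dphi(t): return 0.3 * math.cos(t)
def f(t):    return 2.0 + math.cos(t)      # f > 0 as required by Lemma 1
def df(t):   return -math.sin(t)
def g(t):    return 0.1 * t
def dg(t):   return 0.1


def phi_tilde(t):
    """Kepler transform arctan(f (tan(phi) + g)) on the principal branch."""
    return math.atan(f(t) * (math.tan(phi(t)) + g(t)))


def lemma_rhs(t):
    """Right-hand side of the derivative formula in Lemma 1."""
    p, pt = phi(t), phi_tilde(t)
    denom = math.cos(p) ** 2 + f(t) ** 2 * (math.sin(p) + g(t) * math.cos(p)) ** 2
    return ((df(t) / f(t)) * math.sin(pt) * math.cos(pt)
            + f(t) * dg(t) * math.cos(pt) ** 2
            + f(t) * dphi(t) / denom)


def max_defect(ts, h=1e-5):
    """Largest gap between a central difference of phi_tilde and lemma_rhs."""
    return max(abs((phi_tilde(t + h) - phi_tilde(t - h)) / (2.0 * h) - lemma_rhs(t))
               for t in ts)


if __name__ == "__main__":
    print(max_defect([0.1 * k for k in range(-30, 31)]))
```

The printed defect should be at the level of the finite-difference error, which supports reading the transformation as $\arctan(f(\tan\varphi + g))$, in agreement with the matrix interpretation given above.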
3. Rofe-Beketov's formula. There is a well-known procedure, often called d'Alembert's formula (cf. [17] XI.2 (ix)), to obtain, from a given positive solution $u$ of a Sturm–Liouville equation (ii) on the interval $I$, a second linearly independent solution $v$. Indeed, using the fact that the Wronskian of any two solutions is constant and choosing $t_0 \in I$, one easily finds
\[ v(t) = u(t)\int_{t_0}^{t}\frac{1}{pu^2} \quad (t \in I). \]
In this formula, the requirement that $u$ be positive (or negative) is essential, because any zero of $u$ leads to a non-integrable singularity of $\frac{1}{pu^2}$ in general. This is a severe drawback of d'Alembert's formula: for $\lambda$ above the infimum of the essential spectrum, with all solutions oscillatory, it only gives fundamental systems defined on short intervals. However, a modification of d'Alembert's formula by Rofe-Beketov provides, for any given solution $u$, a second linearly independent solution $v$ defined on the same interval, irrespective of whether or not $u$ has zeros.

Lemma 2 (Rofe-Beketov's formula). If $u : I \to \mathbb{R}$ is a non-trivial solution of the Sturm–Liouville equation (ii) under general hypotheses and $t_0 \in I$, then
\[ v(t) := u(t)\int_{t_0}^{t}\frac{(q+\frac{1}{p}-\lambda)(u^2-(pu')^2)}{(u^2+(pu')^2)^2} - \frac{(pu')(t)}{u^2(t)+(pu')^2(t)} \quad (t \in I) \]
defines a solution of (ii); $(u, v)$ is a fundamental system of Wronskian 1.

Remark. For $p \equiv 1$, this formula can be found in [32], Lemma 2. It should be noted that the solutions $v$ given by d'Alembert's and by Rofe-Beketov's formula, respectively, differ by a constant multiple of $u$ only, and thus are essentially the same. The solution given by d'Alembert's formula can be extended across zeros of $u$; the advantage of Rofe-Beketov's formula is that it gives a closed expression for the maximally extended solution. For a generalisation to Hamiltonian systems of higher dimension, see [34].

Lemma 2 can be proved by a lengthy computation; it may be instructive, however, to see the ansatz which leads to the formula. Writing
\[ \begin{pmatrix} v \\ pv' \end{pmatrix} = A\begin{pmatrix} u \\ pu' \end{pmatrix} \]
with some matrix function $A$, we obtain the conditions
\[ 1 = A_{21}u^2 + (A_{22}-A_{11})\,u\,pu' - A_{12}(pu')^2 \]
from the Wronskian, and
\[ A'\begin{pmatrix} u \\ pu' \end{pmatrix} = \left[\begin{pmatrix} 0 & \frac{1}{p} \\ q-\lambda & 0 \end{pmatrix}, A\right]\begin{pmatrix} u \\ pu' \end{pmatrix} \tag{x} \]
from the differential equation (where $[\,\cdot\,,\cdot\,]$ denotes the matrix commutator). Choosing $A_{11} = A_{22} =: a$ and $A_{21} = -A_{12} =: b$, the parameters $a$ and $b$ can be determined, and one obtains Rofe-Beketov's formula. The choice $A_{11} = A_{22} =: a$ and $A_{12} = 0$, $A_{21} =: b$ leads to the classical formula of d'Alembert. Note that for both choices, the resulting matrix $A$ does not solve the differential equation suggested by (x),
\[ A' = \left[\begin{pmatrix} 0 & \frac{1}{p} \\ q-\lambda & 0 \end{pmatrix}, A\right]. \]
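As a worked illustration of Lemma 2, one can evaluate Rofe-Beketov's formula numerically in a case where the answer is known in closed form. The sketch below assumes the simplest setting $p = 1$, $q = 0$, $\lambda = 4$, $t_0 = 0$ (a choice made here, not in the paper), where $u(t) = \cos 2t$ solves $-u'' = 4u$ and the solution with $v(0) = 0$ and Wronskian 1 is $v(t) = \frac12\sin 2t$. The helper names and the composite Simpson rule are hypothetical details of this illustration.

```python
import math

LAM = 4.0  # assumed example: p = 1, q = 0, lambda = 4


def u(t):
    return math.cos(2.0 * t)


def pu1(t):
    """p u' for p = 1."""
    return -2.0 * math.sin(2.0 * t)


def integrand(t):
    """(q + 1/p - lambda)(u^2 - (pu')^2) / (u^2 + (pu')^2)^2 with q = 0, p = 1."""
    a, b = u(t) ** 2, pu1(t) ** 2
    return (0.0 + 1.0 - LAM) * (a - b) / (a + b) ** 2


def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0


def v(t, t0=0.0):
    """Second solution from Rofe-Beketov's formula."""
    return u(t) * simpson(integrand, t0, t) - pu1(t) / (u(t) ** 2 + pu1(t) ** 2)


if __name__ == "__main__":
    for t in (0.5, math.pi / 4.0, 1.0, 2.0, 3.0, 5.0):
        print(t, v(t), 0.5 * math.sin(2.0 * t))
```

Note that the sample points pass through several zeros of $u$ (for instance $t = \pi/4$) without any singularity in the integrand, which is exactly the advantage over d'Alembert's formula described above.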
4. Results from Floquet theory. If the coefficients $p$, $q$ of the Sturm–Liouville equation are periodic with a common period $\alpha > 0$, Floquet theory permits a quite precise qualitative characterisation of its solutions and of the spectrum of the associated self-adjoint operator. A comprehensive survey of Floquet theory can be found in [8] and [43], Sect. 12; here we only sketch a few concepts and further developments which are needed in the proofs of the present paper.

All solutions of the periodic equation are described by the monodromy matrix $M$, i.e. the value of the canonical fundamental system after one period, which has the property that
\[ \begin{pmatrix} u \\ pu' \end{pmatrix}(\,\cdot+\alpha) = M\begin{pmatrix} u \\ pu' \end{pmatrix} \]
for any solution $u$ of the periodic equation. $M$ is a real $2\times 2$ matrix of determinant 1, and hence one of three cases holds with respect to its eigenvalues, classified in terms of the discriminant $D := \operatorname{tr} M$:

1st case. If $|D| > 2$, the eigenvalues $\lambda_1$, $\lambda_2$ are real, with $|\lambda_1| > 1$, $|\lambda_2| < 1$, $\lambda_1\lambda_2 = 1$. Taking their respective eigenvectors as initial conditions for the differential equation, one obtains Floquet solutions with the property
\[ u_j(x+\alpha) = \lambda_j u_j(x) \quad (x \in \mathbb{R},\ j \in \{1,2\}). \]
$\lambda_j$ is called a Floquet multiplier, $m := \frac{1}{\alpha}\log|\lambda_1| > 0$ the Floquet exponent. It is easy to see that the functions
\[ x \mapsto u_j(x)e^{-mx}, \qquad x \mapsto (pu_j')(x)e^{-mx} \]
are $\alpha$-periodic or $\alpha$-antiperiodic, depending on whether $D$ (and hence $\lambda_j$) is positive or negative. In particular, all solutions of the periodic equation are unbounded, i.e. the trivial solution is unstable.

2nd case. If $|D| < 2$, the eigenvalues are non-real, conjugate points on the unit circle; the corresponding Floquet solutions are $\alpha$-periodic in the pointwise norm. Thus all solutions are bounded, i.e. the trivial solution is stable.

3rd case. If $|D| = 2$, $M$ has one eigenvalue only, either 1 or $-1$, and the corresponding (real-valued) Floquet solution is $\alpha$-periodic or $\alpha$-antiperiodic, respectively. If the eigenvalue has geometric multiplicity 2, all solutions are $\alpha$-(anti-)periodic (coexistence).

As a real-analytic function of the spectral parameter, $D$ is strictly monotone increasing or decreasing in the intervals where $|D| < 2$. If $\lambda \in \mathbb{R}$, $|D| = 2$, then either $D'(\lambda) \ne 0$ and there is no coexistence of (anti-)periodic solutions, or $D'(\lambda) = 0$, $|D|''(\lambda) \ne 0$ and there is coexistence (cf. [8] Theorem 2.3.1, [43] p. 187 seq.). Hence there are (open) stability intervals on the real axis. The spectrum of the Sturm–Liouville operator $T_0$ is purely absolutely continuous and, as a set, coincides with the closure of the stability intervals. The intervals separating the stability intervals (closed instability intervals) degenerate to points in the case of coexistence, and although their interior is empty then, they are nevertheless counted as (vanishing or degenerate) instability intervals. This convention is justified by the following relationship between the set of instability intervals and the asymptotic growth of the Prüfer angles of solutions, which can be proved along the lines of the proof of [8] Theorem 3.1.2.

Proposition 5. Let $(I_n)_{n\in\mathbb{N}_0}$ be the monotone increasing enumeration of the closed instability intervals of the periodic Sturm–Liouville equation (ii) with $I_0 = (-\infty, \inf\sigma_e(T_0)]$.
Then for each $\lambda \in I_n$, the Prüfer angle $\Theta$ of an arbitrary solution of the Sturm–Liouville equation satisfies
\[ \Theta(x) = -\frac{n\pi x}{\alpha} + O_{\mathrm{unif}}(1) \quad (x\to\infty). \]
For the asymptotic analysis of Sect. 3 we need the following observations on the asymptotic behaviour of the Floquet exponent and the corresponding Floquet solution near the edge of a non-degenerate instability interval.

Lemma 3. Let $\lambda_0$ be an end-point of a non-degenerate instability interval $I_n$, $u_0$ the associated (anti-)periodic solution, and $r_0 \in \mathbb{R}$ a point such that $u_0'(r_0) = 0$. For each $\lambda \in I_n$, let $m(\lambda)$ be the Floquet exponent and $u(\,\cdot\,,\lambda)$ the corresponding Floquet solution, with
\[ u^2(r_0,\lambda) + (pu')^2(r_0,\lambda) = u_0^2(r_0) + (pu_0')^2(r_0). \tag{xi} \]
Then
\[ 0 \le m(\lambda) \le \frac{1}{\alpha}\sqrt{|D'(\lambda_0)|}\,\sqrt{|\lambda-\lambda_0|} + o(\sqrt{|\lambda-\lambda_0|}) \quad (\lambda\to\lambda_0); \]
$u(\,\cdot\,,\lambda)$ and $(pu')(\,\cdot\,,\lambda)$ are uniformly continuous functions of $\lambda \in I_n$ on each compact set, and
\[ u(\,\cdot\,,\lambda) = u_0 + O(\sqrt{|\lambda-\lambda_0|}), \qquad (pu')(\,\cdot\,,\lambda) = pu_0' + O(\sqrt{|\lambda-\lambda_0|}) \quad (\lambda\to\lambda_0) \]
uniformly on each compact set.

Proof. The estimate on $m(\lambda)$ follows by observing that $D$ is differentiable at $\lambda_0$, and that $|D(\lambda)| = 2\cosh m(\lambda)\alpha$ $(\lambda \in I_n)$. Without loss of generality we can assume that $r_0 = 0$, which implies in particular that $u_0$ is a multiple of $u_1$, $(u_1, u_2)$ being the canonical fundamental system. Thus the monodromy matrix at $\lambda_0$ is
\[ M(\lambda_0) = \begin{pmatrix} (-1)^n & u_2(\alpha,\lambda_0) \\ 0 & (-1)^n \end{pmatrix}, \]
where $u_2(\alpha,\lambda_0) \ne 0$, as $I_n$ is non-degenerate by hypothesis. For $\lambda \in I_n$ so close to $\lambda_0$ that $u_2(\alpha,\lambda) \ne 0$, the eigenvalue of larger modulus of the monodromy matrix
\[ M(\lambda) = \begin{pmatrix} u_1(\alpha,\lambda) & u_2(\alpha,\lambda) \\ (pu_1')(\alpha,\lambda) & (pu_2')(\alpha,\lambda) \end{pmatrix} \]
is
\[ (-1)^n e^{m(\lambda)\alpha} = \frac{D(\lambda)}{2} + (-1)^n\sqrt{\Bigl(\frac{D(\lambda)}{2}\Bigr)^2 - 1} \]
with corresponding eigenvector $\begin{pmatrix} 1 \\ \gamma(\lambda) \end{pmatrix}$, where
\[ \gamma(\lambda) := \frac{(pu_2')(\alpha,\lambda) - u_1(\alpha,\lambda)}{2u_2(\alpha,\lambda)} + \frac{(-1)^n}{u_2(\alpha,\lambda)}\sqrt{\Bigl(\frac{D(\lambda)}{2}\Bigr)^2 - 1} \sim (-1)^n\frac{\sqrt{|D'(\lambda_0)|}}{u_2(\alpha,\lambda_0)}\sqrt{|\lambda-\lambda_0|} + o(\sqrt{|\lambda-\lambda_0|}) \quad (\lambda\to\lambda_0). \]
Now the Floquet solution $u$ has a multiple of this eigenvector as initial value, so
\[ u(x,\lambda) = C_\lambda\Bigl(u_1(x,\lambda) + u_2(x,\lambda)\Bigl((-1)^n\frac{\sqrt{|D'(\lambda_0)|}}{u_2(\alpha,\lambda_0)}\sqrt{|\lambda-\lambda_0|} + o(\sqrt{|\lambda-\lambda_0|})\Bigr)\Bigr) = C_\lambda\bigl(u_1(x,\lambda_0) + O(\sqrt{|\lambda-\lambda_0|})\bigr), \]
and similarly, $(pu')(x,\lambda) = C_\lambda\bigl((pu_1')(x,\lambda_0) + O(\sqrt{|\lambda-\lambda_0|})\bigr)$, with a constant $C_\lambda > 0$. The assertion follows, as on the one hand $u_0 = u_0(0)\,u_1(\,\cdot\,,\lambda_0)$, on the other, (xi) implies $u_0^2(0) = C_\lambda^2\bigl(1 + O(\sqrt{|\lambda-\lambda_0|})\bigr)$. $\square$

Acknowledgements. It is a pleasure to thank E. B. Davies, F. Gesztesy, A. M. Hinz, H. Kalf, F. S. Rofe-Beketov, and the referee for valuable hints.
References

1. Alama, S., Deift, P.A., Hempel, R.: Eigenvalue branches of the Schrödinger operator H − λW in a gap of σ(H). Commun. Math. Phys. 121, 291–321 (1989)
2. Birman, M.Sh.: On the spectrum of singular boundary value problems. AMS Translations (2) 53, 23–80 (1966)
3. Birman, M.Sh.: Discrete spectrum in the gap of the continuous one in the large-coupling constant-limit. Operator Theory: Adv. and Appl. 46, 17–25 (1990)
4. Birman, M.Sh.: Discrete spectrum in the gaps of a continuous one for perturbations with large coupling limit. Adv. Sov. Math. 7, 57–73 (1991)
5. Birman, M.Sh.: On a discrete spectrum in gaps of a second order perturbed periodic operator. Funct. Anal. Appl. 25 (2), 158–161 (1991)
6. Brown, B.M., Eastham, M.S.P., Hinz, A.M., Kriecherbauer, T., McCormack, D.K.R., Schmidt, K.M.: Welsh eigenvalues of radially periodic Schrödinger operators. J. Math. Anal. Appl. 225, 347–357 (1991)
7. Deift, P.A., Hempel, R.: On the existence of eigenvalues of the Schrödinger operator H − λW in a gap of σ(H). Commun. Math. Phys. 103, 461–490 (1986)
8. Eastham, M.S.P.: The spectral theory of periodic differential equations. Edinburgh: Scottish Academic Press, 1973
9. Firsova, N.E.: A trace formula for a perturbed one-dimensional Schrödinger operator with periodic potential I. Probl. mat. fiz. 7, 162–177 (1974) (Russian)
10. Firsova, N.E.: A trace formula for a perturbed one-dimensional Schrödinger operator with periodic potential II. Probl. mat. fiz. 8, 158–171 (1976) (Russian)
11. Gesztesy, F., Simon, B.: On a theorem of Deift and Hempel. Commun. Math. Phys. 116, 503–505 (1988)
12. Gesztesy, F., Simon, B.: A short proof of Zheludev's theorem. Trans. Am. Math. Soc. 335, 329–340 (1993)
13. Gesztesy, F., Simon, B., Teschl, G.: Zeros of the Wronskian and renormalized oscillation theory. Am. J. Math. 118, 571–594 (1996)
14. Gesztesy, F., Ünal, M.: Perturbative oscillation criteria and Hardy-type inequalities. Math. Nachr. 189, 121–144 (1998)
15. Glazman, I.M.: Direct methods of qualitative spectral analysis of singular differential operators. Jerusalem: Israel Program of Scientific Translation, 1965
16. Hartman, Ph.: On the linear logarithmic-exponential differential equation of the second order. Am. J. Math. 70, 764–779 (1948)
17. Hartman, Ph.: Ordinary differential equations. New York: J. Wiley & Sons, 1964
18. Hempel, R.: On the asymptotic distribution of the eigenvalue branches of a Schrödinger operator H − λW in a spectral gap of H. J. reine angew. Math. 399, 38–59 (1989)
19. Hempel, R., Herbst, I., Hinz, A.M., Kalf, H.: Intervals of dense point spectrum for spherically symmetric Schrödinger operators of the type −Δ + cos|x|. J. London Math. Soc. (2) 43, 295–304 (1989)
20. Hille, E.: Nonoscillation theorems. Trans. Am. Math. Soc. 64, 234–252 (1948)
21. Jörgens, K.: Eigenwerte singulärer Sturm–Liouville-Probleme. Ann. Acad. Scient. Finn., Ser. A I, 336 4, 3–24 (1963)
22. Khryashchev, S.V.: Discrete spectrum for a periodic Schrödinger operator perturbed by a decreasing potential. Operator Theory: Adv. and Appl. 46, 109–114 (1990)
23. Kirsch, W., Simon, B.: Corrections to the classical behavior of the number of bound states of Schrödinger operators. Ann. Phys. 183, 122–130 (1988)
24. Kiselev, A., Last, Y., Simon, B.: Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators. Commun. Math. Phys. 194, 751–758 (1998)
25. Klein, F.: Einleitung in die analytische Mechanik. Vorlesung, gehalten in Göttingen 1886/87. Stuttgart: Teubner, 1991
26. Kneser, A.: Untersuchungen über die reellen Nullstellen der Integrale linearer Differentialgleichungen. Math. Ann. 42, 409–435 (1893)
27. Last, Y., Simon, B.: Modified Prüfer and EFGP transforms and deterministic models with dense point spectrum. J. Funct. Anal. 154, 513–530 (1998)
28. Pearson, D.B.: Singular continuous measures in scattering theory. Commun. Math. Phys. 60, 13–36 (1978)
29. Rofe-Beketov, F.S.: A test for the finiteness of the number of discrete levels introduced into the gaps of a continuous spectrum by perturbations of a periodic potential. Dokl. Akad. Nauk SSSR 156 (3), 515–518 (1964) (Russian)
30. Rofe-Beketov, F.S.: Spectral analysis of the Hill operator and its perturbations. Funkcional'nyĭ analiz 9, 144–155 (1977) (Russian)
31. Rofe-Beketov, F.S.: A generalisation of the Prüfer transformation and the discrete spectrum in gaps of the continuous one. In: Spectral Theory of Operators, Baku: Elm, 1979, pp. 146–153 (Russian)
32. Rofe-Beketov, F.S.: Spectrum perturbations, the Kneser-type constants and the effective masses of zones-type potentials. In: Constructive Theory of Functions '84, Sofia: 1984, pp. 757–766
33. Rofe-Beketov, F.S.: Kneser constants and effective masses for band potentials. Sov. Phys. Dokl. 29 (5), 391–393 (1984)
34. Rofe-Beketov, F.S.: On the estimate of growth of solutions of the canonical almost periodic systems. Mat. fiz. anal. geom. 1, 139–148 (1994) (Russian)
35. Schmidt, K.M.: Dense point spectrum for the one-dimensional Dirac operator with an electrostatic potential. Proc. Royal Soc. Edinburgh A 126, 1087–1096 (1996)
36. Schmidt, K.M.: Absolutely continuous spectrum of Dirac systems with potentials infinite at infinity. Math. Proc. Camb. Phil. Soc. 122, 377–384 (1997)
37. Schmidt, K.M.: Oscillation of the perturbed Hill equation and the lower spectrum of radially periodic Schrödinger operators in the plane. Proc. Am. Math. Soc. 127, 2367–2374 (1999)
38. Sobolev, A.V.: Weyl asymptotics for the discrete spectrum of the perturbed Hill operator. Adv. Sov. Math. 7, 159–178 (1991)
39. Stolz, G.: On the absolutely continuous spectrum of perturbed periodic Sturm–Liouville operators. J. reine angew. Math. 416, 1–23 (1991)
40. Swanson, C.A.: Comparison and oscillation theory of linear differential equations. New York: Academic Press, 1968
41. Weber, H.: Die partiellen Differential-Gleichungen der mathematischen Physik. Band 2, 5. Auflage. Braunschweig: Vieweg, 1912
42. Weidmann, J.: Oszillationsmethoden für Systeme gewöhnlicher Differentialgleichungen. Math. Z. 119, 349–373 (1971)
43. Weidmann, J.: Spectral theory of ordinary differential operators. Lect. Notes in Math. 1258, Berlin: Springer, 1987
44. Zelenko, L.B.: Asymptotic distribution of eigenvalues in a lacuna of the continuous spectrum of the perturbed Hill operator. Math. notes of the Acad. Sci. USSR 20, 750–755 (1976)
45. Zheludev, V.A.: Perturbation of the spectrum of the one-dimensional self-adjoint Schrödinger operator with a periodic potential. In: Topics in Mathematical Physics, Vol. 4: Spectral theory and wave processes, ed. M.Sh. Birman. New York: Consultants Bureau, 1971, pp. 55–76

Communicated by B. Simon
Commun. Math. Phys. 211, 487 – 496 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
The Gas Phase of Continuous Systems of Hard Spheres Interacting via n-Body Potential
Aldo Procacci¹, Benedetto Scoppola²
¹ Dep. Matemática-ICEx, UFMG, CP 702, Belo Horizonte G 30.161-970, Brazil. E-mail: [email protected]
² Dip. di Matematica, Università “La Sapienza” di Roma, P.zza A. Moro 5, 00185 Roma, Italy. E-mail: [email protected]
Received: 9 September 1999 / Accepted: 4 January 2000
Abstract: We show that a system of classical continuous hard spheres interacting through a general n-body potential satisfying suitable integrability conditions admits a high temperature-low activity gas phase. We find explicitly a condition on the activity λ and the inverse temperature β which ensures that the Mayer series for the pressure is absolutely convergent uniformly in the volume.
1. Introduction The proof of the analyticity of the thermodynamic functions for continuous particle systems with long range many-body interaction is a problem that has been studied for a very long time but is not yet completely solved. It has been shown, [R], that a system of particles interacting through a stable n-body potential satisfying the temperedness condition (which is a rather mild condition on the mutual potential energy between clusters of particles) admits a thermodynamic limit; in particular the pressure of the gas exists in the infinite volume limit. Due to this result, a goal which seems to be reasonable is the proof of the convergence of the Mayer expansion, and therefore of the thermodynamic variables, for systems with Ruelle stable, integrable and tempered n-body potential. Such convergence would lead to the proof of the existence of the pure gas phase for these systems. In the proof of convergence, a further condition, that seems unavoidable when the number of bodies n involved in the interaction is arbitrary, is the exponential decay of the interaction in n (see below, Example 1, Sect. 3). Even with this further restriction, however, the problem is still open. There are very few results on this subject and the conditions imposed on the n-body potential in the existing papers are much stronger and more cumbersome. The problem was analyzed via Kirkwood–Salzburg equations in [G] and in [M]. In those papers it was shown that the Kirkwood–Salzburg equations have a unique solution
in some neighborhood of λ = 0, for stable interactions satisfying cumbersome conditions on the long distance decay of the n-body potential. Both papers give as (simplest) examples of allowed n-body potentials finite range and positive interactions. Recently, in [RS], a more direct approach to this problem, based essentially on the cluster expansion tree graph formula given in [BF], has been proposed. The convergence of the Brydges–Federbush type cluster expansion for dilute continuous systems of classical statistical mechanics with many-body interaction is proved for a class of interactions. The proof requires a stable potential satisfying an integrability condition, which is, substantially (see formula 4.1 in [RS]), the requirement of exponential decay of the many-body potential at large distances. Moreover, an n-body interaction with finite n is required in that paper, and such a condition could be inconvenient in applications. However, in [RS] the short distance stability problems are solved, and the results so obtained, although the proofs are a bit involved, go in the direction of the main goal mentioned above. In the present paper we also apply a combinatorial approach to this problem, based on the cluster expansion for the lattice polymer gas, and in particular on the Cassandro–Olivieri hierarchy presented in [CO] and [PS]. We find a different class of n-body potentials for which the Mayer series of the pressure can be shown (by direct bounds) to be analytic in a well defined high temperature low density region. Namely, we avoid the short distance stability problems by considering systems in which the particles interact through a hard core two-body potential, but we admit a long distance n-body potential decaying in an absolutely integrable way, and we do not require the finiteness of n.
So we are simplifying the short range problems with a quite strong condition, but we are able to control long range behaviour considerably slower than exponential decay. We can also exhibit explicitly the thermodynamical region in the parameters λ and β where the convergence is proved: namely λ has to be such that at least the pure hard core gas is analytic, and moreover a relation between λ and β (given in Sect. 3) has to be satisfied. In the next section we give some definitions and notations and review a key theorem we proved in [PS]. Sect. 3 is devoted to the basic result of this paper.
2. Definitions, Notations and Preliminary Results
2.1. Potential energy and n-body potentials. We consider a system of classical particles enclosed in a box $\Lambda \subset \mathbb{R}^d$. We denote by $x_i$ the coordinates in $\mathbb{R}^d$ of the i-th particle. We suppose that n particles in the box $\Lambda$ at positions $x_1, x_2, \dots, x_n$ interact via a potential energy $U(x_1, x_2, \dots, x_n)$ satisfying suitable conditions specified by the following definitions.
Definition 1. Let $k \ge 2$ and d be integers. We will denote with bold letters x points in $\mathbb{R}^d$, while dx will denote the usual Lebesgue measure in $\mathbb{R}^d$. A k-body potential is a map $V^{(k)} : (\mathbb{R}^d)^k \to \mathbb{R} : (x_1, x_2, \dots, x_k) \mapsto V(x_1, x_2, \dots, x_k)$ with the properties
$$V(x_1, x_2, \dots, x_k) = V(x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(k)}), \tag{2.1}$$
$$V(x_1 + a, x_2 + a, \dots, x_k + a) = V(x_1, x_2, \dots, x_k) \tag{2.2}$$
for any permutation $\sigma : \{1, 2, \dots, k\} \mapsto \{\sigma(1), \sigma(2), \dots, \sigma(k)\}$ (symmetry in the exchange of particles) and for any $a \in \mathbb{R}^d$ (translational invariance).
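The two properties (2.1) and (2.2) can be checked numerically on a concrete example. The sketch below is only an illustration: the function `three_body_potential` (the exponential of minus the perimeter of the triangle spanned by three particles) is a hypothetical choice made here, not a potential taken from the paper, and the sample points are random.

```python
import itertools
import math
import random


def three_body_potential(x1, x2, x3):
    """Hypothetical 3-body potential: exp(-perimeter of the triangle x1 x2 x3)."""
    perimeter = math.dist(x1, x2) + math.dist(x2, x3) + math.dist(x1, x3)
    return math.exp(-perimeter)


def shift(point, a):
    """Translate a point of R^d by the vector a."""
    return tuple(p + s for p, s in zip(point, a))


random.seed(0)
points = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(3)]
a = (0.3, -1.2, 2.5)

reference = three_body_potential(*points)

# Property (2.1): the value does not depend on the order of the particles.
symmetric = all(
    math.isclose(three_body_potential(*perm), reference, rel_tol=1e-12)
    for perm in itertools.permutations(points)
)

# Property (2.2): the value does not change under a common translation.
translation_invariant = math.isclose(
    three_body_potential(*(shift(p, a) for p in points)), reference, rel_tol=1e-9
)
```

Any function of the mutual distances alone would pass both checks, which is why such functions are the usual source of examples.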
Gas Phase of Continuous Systems of Hard Spheres
489
If, for any $k \ge 2$, a k-body potential $V^{(k)}$ is given, then we say that a family of potentials (or briefly a potential) is given, and we denote it by $\{V\}$ (or briefly V). Definition 2 (Potential Energy or Interaction). Let a family of potentials V be given; then for any $n \ge 2$ and n-tuple $(x_1, x_2, \dots, x_n)$ we define the interaction
$$U(x_1, \dots, x_n) = \sum_{k=2}^{n} \ \sum_{1 \le i_1 < \dots < i_k \le n} V^{(k)}(x_{i_1}, \dots, x_{i_k}). \tag{2.3}$$
[...] 0 and $b \ge 1$ such that $d_n \le \bar d$ for all n, and
$$A_n = \lambda_n \begin{pmatrix} 0 & I_{d_n/2} \\ I_{d_n/2} & 0 \end{pmatrix} + B_n, \qquad B_n = O(n^{-\delta}), \tag{2.10}$$
where the $\lambda_n$ are real and independent of ξ while $B_n$ may depend on ξ; furthermore, the behaviour of the $\lambda_n$'s is assumed to be as follows:
$$\lambda_n = n^b + o(n^b), \qquad \frac{\lambda_m - \lambda_n}{m^b - n^b} = 1 + o(n^{-\delta}), \quad n < m. \tag{2.11}$$
(A2) Gap condition. There exists $\delta_1 > 0$ such that $\mathrm{dist}\big(\sigma(J_{d_i} A_i), \sigma(J_{d_j} A_j)\big) > \delta_1 > 0$, $\forall i \ne j$ ($\sigma(\cdot)$ denotes "spectrum of $\cdot$"). Note that for large i, j, the gap condition follows from the asymptotic property.
(A3) Smooth dependence on parameters. All entries of $B_n$ are $\bar d^2$ Whitney-smooth functions of ξ with $C_W^{\bar d^2}$-norm bounded by some positive constant L.
502
L. Chierchia, J. You
(A4) Non-resonance condition.
$$\mathrm{meas}\{\xi \in O : \langle k, \omega(\xi)\rangle \big(\langle k, \omega(\xi)\rangle + \lambda\big)\big(\langle k, \omega(\xi)\rangle + \lambda + \mu\big) = 0\} = 0, \tag{2.12}$$
for each $0 \ne k \in \mathbb{Z}^d$ and for any $\lambda, \mu \in \bigcup_{n \in \mathbb{N}} \sigma(J_{d_n} A_n)$; meas ≡ Lebesgue measure.
(A5) Regularity of the perturbation. The perturbation $P \in F_{D_{a,\rho}(r,s),O}$ is regular in the sense that $\|X_P\|^{\bar a,\rho}_{D_{a,\rho}(r,s),O} < \infty$ with $\bar a > a$. In fact, we assume that one of the following holds: (a) $\rho > 0$, $\bar a > a = 0$; (b) $\rho = 0$, $\bar a > a > 0$ (such conditions correspond, respectively, to analytic or smooth solutions).
Now we can state our KAM result.
Theorem 1. Assume that N in (2.7) satisfies (A1)–(A4) and P is regular in the sense of (A5) and let $\gamma > 0$. There exists a positive constant $\epsilon = \epsilon(d, \bar d, b, \delta, \delta_1, \bar a - a, L, \gamma)$ such that if $\|X_P\|^{\bar a,\rho}_{D_{a,\rho}(r,s),O} < \epsilon$, then the following holds true. There exist a Cantor set $O_\gamma \subset O$ with $\mathrm{meas}(O \setminus O_\gamma) \to 0$ as $\gamma \to 0$, and two maps (real analytic in θ and Whitney smooth in $\xi \in O$)
$$\Psi : \mathbb{T}^d \times O_\gamma \to D_{a,\rho}(r, s) \subset P_{a,\rho}, \qquad \tilde\omega : O_\gamma \to \mathbb{R}^d,$$
such that for any $\xi \in O_\gamma$ and $\theta \in \mathbb{T}^d$ the curve $t \to \Psi(\theta + \tilde\omega(\xi)t, \xi)$ is a quasi-periodic solution of the Hamiltonian equations governed by $H = N + P$. Furthermore, $\Psi(\mathbb{T}^d, \xi)$ is a smoothly embedded d-dimensional H-invariant torus in $P_{a,\rho}$.
Remarks. (i) For simplicity we shall in fact assume that all eigenvalues $\lambda_i$ of $A_n$ are positive for all n's. The case of some non-positive eigenvalues can be easily dealt with at the expense of (even) heavier notation.
(ii) In the above case (i.e. positive eigenvalues), Theorem 1 yields linearly stable KAM tori.
(iii) The parameter γ plays the role of the Diophantine constant for the frequency $\tilde\omega$ in the sense that there is $\tau > 0$ such that
$$\forall k \in \mathbb{Z}^d \setminus \{0\}, \qquad |\langle k, \tilde\omega\rangle| > \frac{\gamma}{2|k|^\tau}.$$
Notice also that $O_\gamma$ is claimed to be nonempty and big only for γ small enough.
(iv) The regularity property $\bar a > a$ is used only in estimating the measure of $O \setminus O_\gamma$. Such a regularity requirement is not necessary for constructing periodic solutions, i.e., d = 1. Thus the above theorem applies to the construction of periodic solutions for 1-D nonlinear Schrödinger equations.
(v) The non-degeneracy condition (2.12) (which is stronger than Bourgain's non-degeneracy condition [4] but weaker than Melnikov's one [13]) covers the multiple normal frequency case: this is the technical reason that allows one to treat PDEs with periodic boundary conditions.
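The Diophantine inequality of Remark (iii) can be made concrete in the simplest case d = 2. The sketch below is an illustration under assumptions made here, not part of the paper: the frequency vector is taken to be ω = (1, φ) with φ the golden mean, |k| is taken to be |k₁| + |k₂|, and τ = 1. For this choice the identity |(p − qφ)(p − qφ')| = |p² − pq − q²| ≥ 1, with φ' the conjugate of φ, gives |⟨k, ω⟩| ≥ 1/|k| for every k ≠ 0, so the inequality holds with any γ < 2.

```python
import math


def diophantine_constant(omega, tau, k_max):
    """Minimum of |<k, omega>| * |k|^tau over 0 < max(|k1|, |k2|) <= k_max.

    Here |k| = |k1| + |k2| (an assumption of this sketch).
    """
    best = math.inf
    for k1 in range(-k_max, k_max + 1):
        for k2 in range(-k_max, k_max + 1):
            if k1 == 0 and k2 == 0:
                continue
            norm = abs(k1) + abs(k2)
            small_divisor = abs(k1 * omega[0] + k2 * omega[1])
            best = min(best, small_divisor * norm ** tau)
    return best


golden = (1.0 + math.sqrt(5.0)) / 2.0
omega = (1.0, golden)

# Smallest observed value of |<k, omega>| |k| on the box |k1|, |k2| <= 200.
gamma_half = diophantine_constant(omega, tau=1.0, k_max=200)
```

The scan only confirms the bound on a finite box; the algebraic identity above is what guarantees it for all k. For a generic frequency vector no such identity is available, which is why the theorem has to discard a small set of parameters.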
KAM Tori for 1D Nonlinear Wave Equations
503
3. Application to 1D Wave Equations In this section we show how Theorem 1 implies the existence of quasi-periodic solutions for 1D wave equations with periodic boundary conditions. Let us rewrite the wave equation (1.1) as follows:
$$u_{tt} + Au = f(u), \quad Au \equiv -u_{xx} + V(x, \xi)u, \quad x, t \in \mathbb{R}, \qquad u(t, x) = u(t, x + 2\pi), \quad u_t(t, x) = u_t(t, x + 2\pi), \tag{3.1}$$
where $V(\cdot, \xi)$ is a real-analytic (or smooth) periodic potential parameterized by some $\xi \in \mathbb{R}^d$ (see below) and f(u) is a real-analytic function near u = 0 with $f(0) = f'(0) = 0$. As is well known, the operator A with periodic boundary conditions admits an orthonormal basis of eigenfunctions $\phi_n \in L^2(\mathbb{T})$, $n \in \mathbb{N}$, with corresponding eigenvalues $\mu_n$ satisfying the following asymptotics for large n:
$$\mu_{2n-1}, \mu_{2n} = n^2 + \frac{1}{2\pi}\int_{\mathbb{T}} V(x)\,dx + O(n^{-2}).$$
For simplicity, we shall consider the case of vanishing mean value of the potential V and assume that all eigenvalues are positive:
$$\int_{\mathbb{T}} V(x)\,dx = 0, \qquad \mu_n \equiv \lambda_n^2 > 0, \quad \forall n. \tag{3.2}$$
Following Kuksin [10] and Bourgain [3], we consider a family of real analytic (or smooth) potentials $V(x, \xi)$, where the d parameters $\xi = (\xi_1, \dots, \xi_d) \in O \subset \mathbb{R}^d$ are simply taken to be a given set of d frequencies $\lambda_{n_i} \equiv \sqrt{\mu_{n_i}}$:
$$\xi_i \equiv \sqrt{\mu_{n_i}} \equiv \lambda_{n_i}, \quad i = 1, \dots, d, \tag{3.3}$$
where the $\mu_{n_i}$ are (positive) eigenvalues of A.$^5$ We may also (and shall) require that there exists a positive $\delta_1 > 0$ such that
$$|\mu_k - \mu_h| > \delta_1, \tag{3.4}$$
for all k > h except when k is even and h = k − 1 (in which case $\mu_k$ and $\mu_h$ might even coincide). Notice that, in particular, having d eigenvalues as independent parameters excludes the constant potential case V ≡ constant (where, of course, all eigenvalues are double: $\mu_{2j-1} = \mu_{2j} = j^2 + V$). In fact, this case seems difficult to handle by the KAM approach even in the finite dimensional case. Such a difficulty does not arise, instead, in the remarkable alternative approach developed by Craig, Wayne [7] and Bourgain [3,4]. Equation (3.1) may be rewritten as
$$\dot u = v, \qquad \dot v + Au = f(u), \tag{3.5}$$
which, as is well known, may be viewed as the (infinite dimensional) Hamiltonian equations $\dot u = H_v$, $\dot v = -H_u$ associated to the Hamiltonian
$$H = \frac12 (v, v) + \frac12 (Au, u) + \int_{\mathbb{T}} g(u)\,dx, \tag{3.6}$$
5 Plenty of such potentials may be constructed with, e.g., the inverse spectral theory.
where g is a primitive of (−f) (with respect to the u variable) and $(\cdot, \cdot)$ denotes the scalar product in $L^2$. As in [15], we introduce coordinates $q = (q_0, q_1, \dots)$, $p = (p_0, p_1, \dots)$ through the relations
$$u(x) = \sum_{n \in \mathbb{N}} \frac{q_n}{\sqrt{\lambda_n}}\,\phi_n(x), \qquad v = \sum_{n \in \mathbb{N}} \sqrt{\lambda_n}\, p_n\, \phi_n(x),$$
where$^6$ $\lambda_n \equiv \sqrt{\mu_n}$. System (3.5) is then formally equivalent to the lattice Hamiltonian equations
$$\dot q_n = \lambda_n p_n, \qquad \dot p_n = -\lambda_n q_n - \frac{\partial G}{\partial q_n}, \qquad G \equiv \int_{\mathbb{T}} g\Big(\sum_{n \in \mathbb{N}} \frac{q_n}{\sqrt{\lambda_n}}\,\phi_n\Big)\,dx, \tag{3.7}$$
corresponding to the Hamiltonian function $H = \sum_{n \in \mathbb{N}} \lambda_n (q_n^2 + p_n^2) + G(q)$. Rather than discussing the above formal equivalence, we shall, following [15], use the following elementary observation (proved in the Appendix):
Proposition 3.1. Let V be analytic (respectively, smooth), let I be an interval and let $t \in I \to (q(t), p(t)) \equiv (\{q_n(t)\}_{n \ge 0}, \{p_n(t)\}_{n \ge 0})$ be an analytic (respectively, smooth$^7$) solution of (3.7) such that
$$\sup_{t \in I} \sum_{n \in \mathbb{N}} \big(|q_n(t)| + |p_n(t)|\big)\, n^a e^{n\rho} < \infty \tag{3.8}$$
for some ρ > 0 and a = 0 (respectively, for ρ = 0 and a big enough). Then
$$u(t, x) \equiv \sum_{n \in \mathbb{N}} \frac{q_n(t)}{\sqrt{\lambda_n}}\,\phi_n(x)$$
is an analytic (respectively, smooth) solution of (3.1).
Before invoking Theorem 1 we still need some manipulations. We first switch to complex variables: $w_n = \frac{1}{\sqrt2}(q_n + i p_n)$, $\bar w_n = \frac{1}{\sqrt2}(q_n - i p_n)$. Equations (3.7) then read
$$\dot w_n = -i\lambda_n w_n - i\,\frac{\partial \tilde G}{\partial \bar w_n}, \qquad \dot{\bar w}_n = i\lambda_n \bar w_n + i\,\frac{\partial \tilde G}{\partial w_n}, \tag{3.9}$$
where the perturbation $\tilde G$ is given by
$$\tilde G(w) = \int_{\mathbb{T}} g\Big(\sum_{n \in \mathbb{N}} \frac{w_n + \bar w_n}{\sqrt{2\lambda_n}}\,\phi_n\Big)\,dx. \tag{3.10}$$
Next we introduce standard action-angle variables $(\theta, I) = ((\theta_1, \dots, \theta_d), (I_1, \dots, I_d))$ in the $(w_{n_1}, \dots, w_{n_d}, \bar w_{n_1}, \dots, \bar w_{n_d})$-space by letting $I_i = w_{n_i} \bar w_{n_i}$, $i = 1, \dots, d$,
6 Recall that, for simplicity, we assume that all eigenvalues $\mu_n$ are positive.
7 Regularity refers to the components $q_n$ and $p_n$.
so that the system (3.9) becomes
$$\frac{d\theta_j}{dt} = \omega_j + P_{I_j}, \quad \frac{dI_j}{dt} = -P_{\theta_j}, \quad j = 1, \dots, d, \qquad \frac{dw_n}{dt} = -i\lambda_n w_n - iP_{\bar w_n}, \quad \frac{d\bar w_n}{dt} = i\lambda_n \bar w_n + iP_{w_n}, \quad n \ne n_1, n_2, \dots, n_d, \tag{3.11}$$
where P is just $\tilde G$ with the $(w_{n_1}, \dots, w_{n_d}, \bar w_{n_1}, \dots, \bar w_{n_d})$-variables expressed in terms of the $(\theta, I)$ variables, and the frequencies $\omega = (\omega_1, \dots, \omega_d)$ coincide with the parameter ξ introduced in (3.3):
$$\omega_i \equiv \xi_i = \lambda_{n_i}. \tag{3.12}$$
The Hamiltonian associated to (3.11) (with respect to the symplectic form $dI \wedge d\theta + i \sum_n dw_n \wedge d\bar w_n$) is given by
$$H = \langle \omega, I \rangle + \sum_{n \ne n_1, \dots, n_d} \lambda_n w_n \bar w_n + P(\theta, I, w, \bar w, \xi). \tag{3.13}$$
Remark. Actually, in place of H in (3.13) one should consider the linearization of H around a given point $I_0$ and let I vary in a small ball B (of radius $0 < s \ll |I_0|$) in the "positive" quadrant $\{I_j > 0\}$. In such a way the dependence of H upon I is obviously analytic. For notational convenience we shall, however, not report explicitly the dependence of H on $I_0$.
Finally, to put the Hamiltonian in the form (2.9) we couple the variables $(w_n, \bar w_n)$ corresponding to "closer" eigenvalues. More precisely, we let $z_n = (w_{2n-1}, w_{2n}, \bar w_{2n-1}, \bar w_{2n})$ for large$^8$ n, say $n > \bar n > n_d$, and denote by
$$z_0 = \big(\{w_n\}_{0 \le n \le \bar n,\ n \ne n_1, \dots, n_d},\ \{\bar w_n\}_{0 \le n \le \bar n,\ n \ne n_1, \dots, n_d}\big)$$
the remaining conjugated variables. The Hamiltonian (3.13) takes the form
$$H = \langle \omega, I \rangle + \frac12 \sum_{n \in \mathbb{N}} \langle A_n z_n, z_n \rangle + P(\theta, I, z, \xi), \tag{3.14}$$
where
$$A_n = \mathrm{Diag}(\lambda_{2n-1}, \lambda_{2n}, \lambda_{2n-1}, \lambda_{2n}) \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix} = \lambda_{2n} \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & \lambda_{2n-1} - \lambda_{2n} & 0 \\ 0 & 0 & 0 & 0 \\ \lambda_{2n-1} - \lambda_{2n} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
for $n > n_d$, while
$$A_0 = \mathrm{Diag}(\{\lambda_n\}, \{\lambda_n\};\ 1 \le n \le n_d,\ n \ne n_1, \dots, n_d) \begin{pmatrix} 0 & I_{d_0} \\ I_{d_0} & 0 \end{pmatrix} \quad \text{with } d_0 = \bar n + 1 - d.$$
The perturbation P in (3.14) has the following (nice) regularity property.
8 Compare (A1).
Lemma 3.1. Suppose that V is real analytic in x (respectively, belongs to the Sobolev space $H^k(\mathbb{T})$ for some $k \in \mathbb{N}$). Then for small enough ρ > 0 (respectively, a > 0), r > 0 and s > 0 one has
$$\|X_P\|^{a+1/2,\rho}_{D_{a,\rho}(r,s),O} = O(|z|^2_{a,\rho}); \tag{3.15}$$
here the parameter a is taken to be 0 (respectively, the parameter ρ is taken to be 0).
A proof of this lemma is given in the Appendix. In fact, $X_P$ is even more "regular" (a fact, however, not needed in what follows): (3.15) holds with 1 in place of 1/2. The Hamiltonian (3.14) is seen to satisfy all the assumptions of Theorem 1 with: $d_n = 4$, $n \ge 1$; $d_0 = \bar n + 1 - d$; $\bar d = \max\{d_0, 4\}$; b = 1; δ = 2; $\delta_1$ chosen as in (3.4); $\bar a - a = \frac12$. Thus Theorem 1 yields the following
Theorem 2. Consider a family of 1D nonlinear wave equations (3.1) parameterized by $\xi \equiv \omega \in O$ as above with $V(\cdot, \xi)$ real-analytic (respectively, smooth). Then for any $0 < \gamma \ll 1$, there is a subset $O_\gamma$ of O with $\mathrm{meas}(O \setminus O_\gamma) \to 0$ as $\gamma \to 0$, such that $(3.1)_{\xi \in O_\gamma}$ has a family of small-amplitude (proportional to some power of γ), analytic (respectively, smooth) quasi-periodic solutions of the form
$$u(t, x) = \sum_n u_n(\omega_1' t, \dots, \omega_d' t)\,\phi_n(x),$$
where $u_n : \mathbb{T}^d \to \mathbb{R}$ and $\omega_1', \dots, \omega_d'$ are close to $\omega_1, \dots, \omega_d$.
Remark. As mentioned above, our KAM theorem (which applies only to the case that not all the eigenvalues are multiple$^9$ and under the hypothesis that all $\mu_n$'s are positive) implies that the quasi-periodic solutions obtained are linearly stable. In the case that all the eigenvalues are double (as in the constant potential case), one should not expect linear stability (see the example given by Craig, Kuksin and Wayne [6]). We also notice that, essentially with only notational changes, the proof of the above theorem goes through in the case that some of the eigenvalues are negative.
4. KAM Step Theorem 1 will be proved by a KAM iteration which involves an infinite sequence of changes of variables. At each step of the KAM scheme, we consider a Hamiltonian vector field with $H_\nu = N_\nu + P_\nu$, where $N_\nu$ is an "integrable normal form" and $P_\nu$ is defined in some set of the form$^{10}$ $D(s_\nu, r_\nu) \times O_\nu$. We then construct a map$^{11}$
$$\Phi_\nu : D(s_{\nu+1}, r_{\nu+1}) \times O_{\nu+1} \subset D(r_\nu, s_\nu) \times O_\nu \to D(r_\nu, s_\nu) \times O_\nu$$
9 Recall that we require that the torus frequencies are independent parameters.
10 Recall the notations from Section 2.
11 Recall that the parameters a, ρ and $\bar a$ are fixed throughout the proof and are therefore omitted in the notations.
so that the vector field $X_{H_\nu \circ \Phi_\nu}$ defined on $D(r_{\nu+1}, s_{\nu+1})$ satisfies
$$\|X_{H_\nu \circ \Phi_\nu} - X_{N_{\nu+1}}\|_{r_{\nu+1}, s_{\nu+1}, O_{\nu+1}} \le \epsilon_\nu^{\kappa}$$
with some new normal form $N_{\nu+1}$ and for some fixed ν-independent constant κ > 1. To simplify notation, in what follows, the quantities without subscripts refer to quantities at the ν-th step, while the quantities with subscript + denote the corresponding quantities at the (ν + 1)-th step. Let us then consider the Hamiltonian
$$H = N + P \equiv e + \langle \omega, I \rangle + \frac12 \sum_{n \in \mathbb{N}} \langle A_n z_n, z_n \rangle + P, \tag{4.1}$$
defined in $D(r, s) \times O$; the $A_n$'s are symmetric matrices. We assume that $\xi \in O$ satisfies$^{12}$ (for a suitable τ > 0 to be specified later)
$$|\langle k, \omega\rangle^{-1}| < \frac{|k|^\tau}{\gamma}, \qquad \|(\langle k, \omega\rangle I_{d_i} + A_i J_{d_i})^{-1}\| < \Big(\frac{|k|^\tau}{\gamma}\Big)^{\bar d}, \qquad \|(\langle k, \omega\rangle I_{d_i d_j} + (A_i J_{d_i}) \otimes I_{d_j} - I_{d_i} \otimes (J_{d_j} A_j))^{-1}\| < \Big(\frac{|k|^\tau}{\gamma}\Big)^{\bar d^2}. \tag{4.2}$$
We also assume that
$$\max_{|p| \le \bar d^2} \Big\|\frac{\partial^p A_n}{\partial \xi^p}\Big\| \le L \tag{4.3}$$
on O, and
$$\|X_P\|_{r,s,O} \le \epsilon. \tag{4.4}$$
We now let $0 < r_+ < r$, and define
$$s_+ = \frac12\, s\, \epsilon^{1/3}, \qquad \epsilon_+ = \gamma^{-c}\, \Gamma(r - r_+)\, \epsilon^{4/3}, \tag{4.5}$$
where
$$\Gamma(t) \equiv \sup_{u \ge 1} u^c e^{-\frac14 ut} \sim t^{-c}$$
for t > 0. Here and later, the letter c denotes suitable (possibly different) constants that do not depend on the iteration step$^{13}$. We now describe how to construct a set $O_+ \subset O$ and a change of variables $\Phi : D_+ \times O_+ = D(r_+, s_+) \times O_+ \to D(r, s) \times O$, such that the transformed Hamiltonian $H_+ = N_+ + P_+ \equiv H \circ \Phi$ satisfies all the above iterative assumptions with new parameters $s_+, \epsilon_+, r_+, \gamma_+, L_+$ and with $\xi \in O_+$.
12 The tensor product (or direct product) of two m × n, k × l matrices $A = (a_{ij})$, B is the (mk) × (nl) matrix defined by
$$A \otimes B = (a_{ij} B) = \begin{pmatrix} a_{11} B & \cdots & a_{1n} B \\ \cdots & \cdots & \cdots \\ a_{m1} B & \cdots & a_{mn} B \end{pmatrix}.$$
$\|\cdot\|$ for a matrix denotes the operator norm, i.e., $\|M\| = \sup_{|y|=1} |My|$. Recall that ω and the $A_i$'s depend on ξ.
13 Actually, here $c = \bar d^4 \tau + \bar d^2 \tau + \bar d^2 + 1$.
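4.1. Solving the linearized equation. Expand P into the Fourier–Taylor series
$$P = \sum_{k,l,\alpha} P_{kl\alpha}\, e^{i\langle k, \theta\rangle} I^l z^\alpha,$$
where $k \in \mathbb{Z}^d$, $l \in \mathbb{N}^d$ and $\alpha \in \otimes_{n \in \mathbb{N}} \mathbb{N}^{d_n}$ with finitely many non-vanishing components. Let R be the truncation of P given by
$$R(\theta, I, z) \equiv P_0 + P_1 + P_2 \equiv \sum_{k, |l| \le 1} P_{kl0}\, e^{i\langle k, \theta\rangle} I^l + \sum_{k, |\alpha| = 1} P_{k0\alpha}\, e^{i\langle k, \theta\rangle} z^\alpha + \sum_{k, |\alpha| = 2} P_{k0\alpha}\, e^{i\langle k, \theta\rangle} z^\alpha, \tag{4.6}$$
with
$$2|l| + |\alpha| = 2 \sum_{j = 1, \dots, d} l_j + \sum_{j \in \mathbb{N}} |\alpha_j| \le 2.$$
It is convenient to rewrite R as follows:
$$R(\theta, I, z) = \sum_{k, |l| \le 1} P_{kl0}\, e^{i\langle k, \theta\rangle} I^l + \sum_{k,i} \langle R^k_i, z_i\rangle\, e^{i\langle k, \theta\rangle} + \sum_{k,i,j} \langle R^k_{ji} z_i, z_j\rangle\, e^{i\langle k, \theta\rangle}, \tag{4.7}$$
where $R^k_i$, $R^k_{ji}$ are respectively the $d_i$ vector and the $(d_j \times d_i)$ matrix defined by
$$R^k_i = \int \frac{\partial P}{\partial z_i}\, e^{-i\langle k, \theta\rangle}\, d\theta \Big|_{z=0, I=0}, \qquad R^k_{ji} = \frac{1 + \delta^j_i}{2} \int \frac{\partial^2 P}{\partial z_j \partial z_i}\, e^{-i\langle k, \theta\rangle}\, d\theta \Big|_{z=0, I=0}. \tag{4.8}$$
Note that $R^k_{ij} = (R^k_{ji})^T$. Rewrite H as $H = N + R + (P - R)$. By the choice of $s_+$ in (4.5) and by the definition of the norms, it follows immediately that
$$\|X_R\|_{r,s,O} \le \|X_P\|_{r,s,O} \le \epsilon. \tag{4.9}$$
Moreover $s_+$, $\epsilon_+$ are such that, in a smaller domain $D(r, s_+)$, we have
$$\|X_{P-R}\|_{r,s_+} < c\, \epsilon_+. \tag{4.10}$$
Then we look for a special F, defined in the domain $D_+ = D(r_+, s_+)$, such that the time one map $\phi^1_F$ of the Hamiltonian vector field $X_F$ defines a map from $D_+ \to D$ and transforms H into $H_+$. More precisely, by the second order Taylor formula, we have
$$\begin{aligned} H \circ \phi^1_F &= (N + R) \circ \phi^1_F + (P - R) \circ \phi^1_F \\ &= N + \{N, F\} + R + \frac12 \int_0^1 ds \int_0^s \{\{N + R, F\}, F\} \circ \phi^t_F\, dt + \{R, F\} + (P - R) \circ \phi^1_F \\ &= N_+ + P_+ + \{N, F\} + R - P_{000} - \langle \omega', I\rangle - \sum_{n \in \mathbb{N}} \langle R^0_{nn} z_n, z_n\rangle, \end{aligned} \tag{4.11}$$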
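where
$$\omega' = \int \frac{\partial P}{\partial I}\, d\theta \Big|_{I=0, z=0}, \qquad R^0_{nn} = \int \frac{\partial^2 P}{\partial z_n^2}\, d\theta \Big|_{I=0, z=0},$$
$$N_+ = N + P_{000} + \langle \omega', I\rangle + \sum_{n \in \mathbb{N}} \langle R^0_{nn} z_n, z_n\rangle,$$
$$P_+ = \frac12 \int_0^1 ds \int_0^s \{\{N + R, F\}, F\} \circ \phi^t_F\, dt + \{R, F\} + (P - R) \circ \phi^1_F.$$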
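We shall find a function F of the form
$$F(\theta, I, z) = F_0 + F_1 + F_2 = \sum_{|l| \le 1, |k| \ne 0} F_{kl0}\, e^{i\langle k, \theta\rangle} I^l + \sum_{k,\, i \in \mathbb{N}} \langle F^k_i, z_i\rangle\, e^{i\langle k, \theta\rangle} + \sum_{|k| + |i - j| \ne 0} \langle F^k_{ji} z_i, z_j\rangle\, e^{i\langle k, \theta\rangle}, \tag{4.12}$$
satisfying the equation
$$\{N, F\} + R - P_{000} - \langle \omega', I\rangle - \sum_{n \in \mathbb{N}} \langle R^0_{nn} z_n, z_n\rangle = 0. \tag{4.13}$$
Lemma 4.1. Equation (4.13) is equivalent to
$$\begin{aligned} & F_{kl0} = (i\langle k, \omega\rangle)^{-1} P_{kl0}, \quad k \ne 0,\ |l| \le 1, \\ & (\langle k, \omega\rangle I_{d_i} + A_i J_{d_i}) F^k_i = i R^k_i, \\ & (\langle k, \omega\rangle I_{d_i} + A_i J_{d_i}) F^k_{ij} - F^k_{ij} (J_{d_j} A_j) = i R^k_{ij}, \quad |k| + |i - j| \ne 0. \end{aligned} \tag{4.14}$$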
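Proof. Inserting F, defined in (4.12), into (4.13) one sees that (4.13) is equivalent to the following equations$^{14}$:
$$\{N, F_0\} + P_0 - \langle \omega', I\rangle = 0, \qquad \{N, F_1\} + P_1 = 0, \qquad \{N, F_2\} + P_2 - \sum_{n \in \mathbb{N}} \langle R^0_{nn} z_n, z_n\rangle = 0. \tag{4.15}$$
The first equation in (4.15) is obviously equivalent, by comparing the coefficients, to the first equation in (4.14). To solve $\{N, F_1\} + P_1 = 0$, we note that$^{15}$
$$\begin{aligned} \{N, F_1\} &= \langle \partial_I N, \partial_\theta F_1\rangle + \langle \nabla_z N, J \nabla_z F_1\rangle \\ &= \langle \partial_I N, \partial_\theta F_1\rangle + \sum_i \langle \nabla_{z_i} N, i J_{d_i} \nabla_{z_i} F_1\rangle \\ &= i \sum_{k,i} \big(\langle \langle k, \omega\rangle F^k_i, z_i\rangle + \langle A_i z_i, J_{d_i} F^k_i\rangle\big)\, e^{i\langle k, \theta\rangle} \\ &= i \sum_{k,i} \langle (\langle k, \omega\rangle I_{d_i} + A_i J_{d_i}) F^k_i, z_i\rangle\, e^{i\langle k, \theta\rangle}. \end{aligned} \tag{4.16}$$
14 Recall the definition of $P_i$ in (4.6).
15 Recall the definition of N in (4.1).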
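It follows that the $F^k_i$ are determined by the linear algebraic system
$$i(\langle k, \omega\rangle I_{d_i} + A_i J_{d_i}) F^k_i + R^k_i = 0, \quad i \in \mathbb{N},\ k \in \mathbb{Z}^d.$$
Similarly, from
$$\begin{aligned} \{N, F_2\} &= \langle \partial_I N, \partial_\theta F_2\rangle + \sum_i \langle \nabla_{z_i} N, i J_{d_i} \nabla_{z_i} F_2\rangle \\ &= i \sum_{|k| + |i - j| \ne 0} \big(\langle \langle k, \omega\rangle F^k_{ji} z_i, z_j\rangle + \langle A_i z_i, J_{d_i} (F^k_{ji})^T z_j\rangle + \langle A_j z_j, J_{d_j} F^k_{ji} z_i\rangle\big)\, e^{i\langle k, \theta\rangle} \\ &= i \sum_{|k| + |i - j| \ne 0} \big(\langle \langle k, \omega\rangle F^k_{ji} z_i, z_j\rangle + \langle (A_j J_{d_j} F^k_{ji} - F^k_{ji} J_{d_i} A_i) z_i, z_j\rangle\big)\, e^{i\langle k, \theta\rangle} \\ &= i \sum_{|k| + |i - j| \ne 0} \langle (\langle k, \omega\rangle F^k_{ji} + A_j J_{d_j} F^k_{ji} - F^k_{ji} J_{d_i} A_i) z_i, z_j\rangle\, e^{i\langle k, \theta\rangle} \end{aligned} \tag{4.17}$$
it follows that $F^k_{ji}$ is determined by the following matrix equation:
$$(\langle k, \omega\rangle I_{d_j} + A_j J_{d_j}) F^k_{ji} - F^k_{ji} (J_{d_i} A_i) = i R^k_{ji}, \quad |k| + |i - j| \ne 0, \tag{4.18}$$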
where $F^k_{ji}$, $R^k_{ji}$ are the $d_j \times d_i$ matrices defined in (4.12) and (4.7). Exchanging i, j we get the third equation in (4.14). □
The first two equations in (4.14) are immediately solved in view of (4.2). In order to solve the third equation in (4.14), we need the following elementary algebraic result from matrix theory.
Lemma 4.2. Let A, B, C be respectively n × n, m × m, n × m matrices, and let X be an n × m unknown matrix. The matrix equation
$$AX - XB = C \tag{4.19}$$
is solvable if and only if $I_m \otimes A - B \otimes I_n$ is nonsingular. Moreover, $\|X\| \le \|(I_m \otimes A - B \otimes I_n)^{-1}\| \cdot \|C\|$.
In fact, the matrix equation (4.19) is equivalent to the (bigger) vector equation given by $(I \otimes A - B \otimes I) X' = C'$, where $X'$, $C'$ are vectors whose elements are just the list (row by row) of the entries of X and C. For a detailed proof we refer the reader to the Appendix in [20] or [12], p. 256.
Remark. Taking the transpose of the third equation in (4.14), one sees that $(F^k_{ij})^T$ satisfies the same equation as $F^k_{ji}$. Then (by the uniqueness of the solution) it follows that $F^k_{ji} = (F^k_{ij})^T$.
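The reduction of (4.19) to a single linear system can be sketched in a few lines. The code below is an illustration, not the argument of [20] or [12]: it stacks X column by column, in which case the system matrix reads $I_m \otimes A - B^T \otimes I_n$ (the transpose on B and the column stacking are conventions chosen here; other stackings permute the factors). Exact rational arithmetic is used so that the check is free of rounding, and all function names are hypothetical.

```python
from fractions import Fraction


def kron(P, Q):
    """Kronecker product (p_ij * Q) of two matrices given as lists of rows."""
    return [[p * q for p in row_p for q in row_q] for row_p in P for row_q in Q]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def solve_linear(K, rhs):
    """Gauss-Jordan elimination in exact arithmetic; raises if K is singular."""
    size = len(K)
    aug = [list(row) + [b] for row, b in zip(K, rhs)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def solve_sylvester(A, B, C):
    """Solve A X - X B = C for the n x m matrix X via the Kronecker form."""
    n, m = len(C), len(C[0])
    # Column stacking: vec(AX) = (I_m (x) A) vec(X), vec(XB) = (B^T (x) I_n) vec(X).
    left = kron(identity(m), A)
    right = kron(transpose(B), identity(n))
    K = [[u - v for u, v in zip(r1, r2)] for r1, r2 in zip(left, right)]
    vec_c = [C[i][j] for j in range(m) for i in range(n)]
    vec_x = solve_linear(K, vec_c)
    return [[vec_x[j * n + i] for j in range(m)] for i in range(n)]


A = [[2, 1, 0], [0, 3, 1], [0, 0, 4]]     # eigenvalues 2, 3, 4
B = [[-1, 0], [1, -2]]                    # eigenvalues -1, -2
C = [[1, 2], [3, 4], [5, 6]]
X = solve_sylvester(A, B, C)
AX, XB = matmul(A, X), matmul(X, B)
residual_is_zero = all(AX[i][j] - XB[i][j] == C[i][j] for i in range(3) for j in range(2))
```

The system matrix is singular exactly when A and B share an eigenvalue, which is why the small divisor conditions (4.2) are phrased in terms of the inverse of the Kronecker expression.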
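4.2. Estimates on the coordinate transformation. We proceed to estimate $X_F$ and $\Phi^1_F$. We start with the following
Lemma 4.3. Let $D_i = D(\frac{i}{4} s, r_+ + \frac{i}{4}(r - r_+))$, $0 < i \le 4$. Then
$$\|X_F\|_{D_3, O} < c\, \gamma^{-c}\, \Gamma(r - r_+)\, \epsilon. \tag{4.20}$$
Proof. By (4.2), Lemma 4.1 and Lemmata 7.4, 7.5 in the Appendix, we have
$$\begin{aligned} |F_{kl0}|_O &\le |\langle k, \omega\rangle|^{-1} |P_{kl0}| < c\, \gamma^{-c} |k|^c\, \epsilon\, e^{-|k|(r - r_+)} s^{2 - 2|l|}, \quad k \ne 0, \\ \|F^k_i\|_O &= \|(\langle k, \omega\rangle I_{d_i} + A_i J_{d_i})^{-1} R^k_i\| \le \|(\langle k, \omega\rangle I_{d_i} + A_i J_{d_i})^{-1}\| \cdot \|R^k_i\| < c\, \gamma^{-c} |k|^c |R^k_i|, \\ \|F^k_{ij}\|_O &\le \|(\langle k, \omega\rangle I_{d_i d_j} + (A_i J_{d_i}) \otimes I_{d_j} - I_{d_i} \otimes (J_{d_j} A_j))^{-1}\| \cdot \|R^k_{ij}\| < c\, \gamma^{-c} |k|^c \|R^k_{ij}\|, \quad |k| + |i - j| \ne 0, \end{aligned} \tag{4.21}$$
where $\|\cdot\|_O$ for a matrix is similar to (2.4). It follows that
$$\begin{aligned} \frac{1}{s^2} \|F_\theta\|_{D_2, O} &\le \frac{1}{s^2} \Big(\sum |F_{kl0}| \cdot |I^l| \cdot |k| \cdot |e^{i\langle k, \theta\rangle}| + \sum |F^k_i| \cdot |z_i| \cdot |k| \cdot |e^{i\langle k, \theta\rangle}| + \sum \|F^k_{ij}\| \cdot |z_i| \cdot |z_j| \cdot |k| \cdot |e^{i\langle k, \theta\rangle}|\Big) \\ &< c\, \gamma^{-c}\, \Gamma(r - r_+) \|X_R\| < c\, \gamma^{-c}\, \Gamma(r - r_+)\, \epsilon, \end{aligned} \tag{4.22}$$
where $\Gamma(r - r_+) = \sup_k |k|^c e^{-|k| \frac14 (r - r_+)}$. Similarly,
$$\|F_I\|_{D_2, O} = \sum_{|l| \le 1} |F_{kl0}| \cdot |e^{i\langle k, \theta\rangle}| < c\, \gamma^{-c}\, \Gamma(r - r_+)\, \epsilon.$$
Now we estimate $\|X_{F^1}\|_{D_2, O}$. Note that
$$\|F^1_{z_i}\|_{D_2, O} = \Big\|\sum_k F^k_i\, e^{i\langle k, \theta\rangle}\Big\|_{D_2, O} < c\, \gamma^{-c}\, \Gamma \sum_{k,i} |R^k_i|\, e^{|k| r} < c\, \gamma^{-c}\, \Gamma\, \Big\|\frac{\partial P_1}{\partial z_i}\Big\|.$$
It follows that
$$\|X_{F^1}\|_{D_2, O} < c \sum_{i \in \mathbb{N}} \|F^1_{z_i}\|_{D_2, O}\, i^a e^{i\rho} < c\, \gamma^{-c}\, \Gamma \sum_{i \in \mathbb{N}} \Big\|\frac{\partial P_1}{\partial z_i}\Big\|\, i^a e^{i\rho} < c\, \gamma^{-c}\, \Gamma\, \epsilon, \tag{4.23}$$
by the definition of the weighted norm.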
Note that$^{16}$
$$\|F^2_{z_i}\|_{D_2, O} = \Big\|\sum_{k,j} (F^k_{ij} + (F^k_{ij})^T) z_j\, e^{i\langle k, \theta\rangle}\Big\|_{D_2, O} < c\, \gamma^{-c}\, \Gamma\, \Big\|\frac{\partial P_2}{\partial z_i}\Big\|. \tag{4.24}$$
Similarly, we have
$$\|X_{F^2}\|_{D_2, O} < c\, \gamma^{-c}\, \Gamma\, \epsilon. \tag{4.25}$$
The conclusion of the lemma follows from the above estimates. □
In the next lemma, we give some estimates for $\phi^t_F$. The following formula (4.26) will be used to prove that our coordinate transformation is well defined. Inequality (4.27) will be used to check the convergence of the iteration.
Lemma 4.4. Let $\eta = \epsilon^{1/3}$, $D_{\frac{i}{2}\eta} = D(r_+ + \frac{i - 1}{2}(r - r_+), \frac{i}{2} \eta s)$, i = 1, 2. We then have
$$\phi^t_F : D_{\frac12 \eta} \to D_\eta, \quad 0 \le t \le 1, \tag{4.26}$$
if $\epsilon \ll (\frac12 \gamma^{-c} \Gamma^{-1})^{3/2}$. Moreover,
$$\|D\phi^1_F - Id\|_{D_{\frac12 \eta}} < c\, \gamma^{-c}\, \Gamma\, \epsilon. \tag{4.27}$$
Proof. Let
$$\|D^m F\|_{D, O} = \max\Big\{\Big|\frac{\partial^{|i| + |l| + |\alpha|} F}{\partial \theta^i\, \partial I^l\, \partial z^\alpha}\Big|_{D, O},\ |i| + |l| + |\alpha| = m \ge 2\Big\}.$$
Note that F is a polynomial in I of order 1 and in z of order 2. From$^{17}$ (4.25) and the Cauchy inequality, it follows that
$$\|D^m F\|_{D_1, O} < c\, \gamma^{-c}\, \Gamma\, \epsilon, \tag{4.28}$$
for any m ≥ 2. To get the estimates for $\phi^t_F$, we start from the integral equation
$$\phi^t_F = id + \int_0^t X_F \circ \phi^s_F\, ds,$$
so that $\phi^t_F : D_{\frac12 \eta} \to D_\eta$, $0 \le t \le 1$, as follows directly from (4.28). Since
$$D\phi^1_F = Id + \int_0^1 (DX_F)\, D\phi^s_F\, ds = Id + \int_0^1 J (D^2 F)\, D\phi^s_F\, ds,$$
it follows that
$$\|D\phi^1_F - Id\| \le 2 \|D^2 F\| < c\, \gamma^{-c}\, \Gamma\, \epsilon. \tag{4.29}$$
The estimates of the second order derivative $D^2 \phi^1_F$ follow from (4.28). □
16 Recall (2.3), the definition of the norm.
17 Recall the definition of the weighted norm in (2.6).
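4.3. Estimates for the new normal form. The map $\phi^1_F$ defined above transforms H into $H_+ = N_+ + P_+$ (see (4.11) and (4.13)) with
$$N_+ = e_+ + \langle \omega_+, I\rangle + \frac12 \sum_{i \in \mathbb{Z}_+} \langle A^+_i z_i, z_i\rangle, \tag{4.30}$$
where
$$e_+ = e + P_{000}, \qquad \omega_+ = \omega + P_{0l0}\ (|l| = 1), \qquad A^+_i = A_i + 2 R^0_{ii}. \tag{4.31}$$
Now we prove that $N_+$ shares the same properties with N. By the regularity of $X_P$ and by Cauchy estimates, we have
$$|\omega_+ - \omega| < \epsilon, \qquad \|R^0_{ii}\| < \epsilon\, i^{-\delta} \tag{4.32}$$
with $\delta = \bar a - a > 0$. It follows that
$$\|(A^+_i)^{-1}\| \le \frac{\|A_i^{-1}\|}{1 - 2\|A_i^{-1} R^0_{ii}\|} \le 2 \|A_i^{-1}\|, \qquad \|(\langle k, \omega + P_{0l0}\rangle I_{d_i} - J_{d_i} A^+_i)^{-1}\| \le \frac{\|(\langle k, \omega\rangle I_{d_i} + A_i J_{d_i})^{-1}\|}{1 - \|(\langle k, \omega\rangle I_{d_i} + A_i J_{d_i})^{-1}\|\, \epsilon} \le \Big(\frac{|k|^\tau}{\gamma_+}\Big)^{\bar d}, \tag{4.33}$$
provided $|k|^{\bar d \tau} \epsilon < c\, (\gamma^{\bar d} - \gamma_+^{\bar d})$. Similarly, we have
$$\|(\langle k, \omega + P_{0l0}\rangle I_{d_i d_j} + (A^+_i J_{d_i}) \otimes I_{d_j} - I_{d_i} \otimes (J_{d_j} A^+_j))^{-1}\| \le \Big(\frac{|k|^\tau}{\gamma_+}\Big)^{\bar d^2}, \tag{4.34}$$
provided $|k|^{\bar d^2 \tau} \epsilon < c\, (\gamma^{\bar d^2} - \gamma_+^{\bar d^2})$. This means that in the next KAM step, small denominator conditions are automatically satisfied for |k| < K where $K^{\bar d^2 \tau} \epsilon < c\, (\gamma^{\bar d^2} - \gamma_+^{\bar d^2})$. The following bounds will be used later for the measure estimates:
$$\Big|\frac{\partial^l (\omega_+ - \omega)}{\partial \xi^l}\Big|_O \le \epsilon, \qquad \Big|\frac{\partial^l (A^+_i - A_i)}{\partial \xi^l}\Big|_O < c\, \epsilon\, i^{-\delta}, \tag{4.35}$$
for $|l| \le \bar d^2$ (by definition of the norms).
4.4. Estimates for the new perturbation. To complete the KAM step we have to estimate the new error term. By the definition of $\phi^1_F$ and Lemma 4.4, $H \circ \phi^1_F = N_+ + P_+$ is well defined in $D_{\frac12 \eta}$. Moreover, we have the following estimates:
$$\begin{aligned} \|X_{P_+}\|_{D_{\frac12 \eta}} &= \Big\|X_{\frac12 \int_0^1 ds \int_0^s \{\{N + R, F\}, F\} \circ \phi^t_F\, dt + \{R, F\} + (P - R) \circ \phi^1_F}\Big\|_{D_{\frac12 \eta}} \\ &\le \Big\|X_{\int_0^1 ds \int_0^s \{\{N + R, F\}, F\} \circ \phi^t_F\, dt}\Big\|_{D_{\frac12 \eta}} + \|X_{(P - R) \circ \phi^1_F}\|_{D_{\frac12 \eta}} \\ &\le \|X_{\{\{N + R, F\}, F\}}\|_{D_\eta} + \|X_{P - R}\|_{D_\eta} < c\, \gamma^{-c}\, \Gamma^2\, \epsilon^{4/3} < c\, \epsilon_+, \end{aligned} \tag{4.36}$$
by (4.9) and Lemma 7.3.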
Thus, there exists a big constant c, independent of the iteration steps, such that
$$\|X_{P_+}\|^{\bar a, \rho}_{r_+, s_+} = \|X_{P_+}\|_{D_{\frac12 \eta}} \le c\, \gamma^{-c}\, \Gamma^2\, \eta\, \epsilon = c\, \epsilon_+. \tag{4.37}$$
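The gain of one KAM step can be seen by iterating the schematic recursion $\epsilon_+ = C \epsilon^{4/3}$ suggested by (4.5) and (4.37). The sketch below rests on a simplifying assumption made here: the factor $c\gamma^{-c}\Gamma^2$ is frozen into a single constant C, whereas in the actual scheme it changes from step to step. Under that assumption the substitution $a_\nu = C^3 \epsilon_\nu$ gives $a_{\nu+1} = a_\nu^{4/3}$, so the errors decay super-exponentially as soon as $C^3 \epsilon_0 < 1$.

```python
def kam_errors(eps0, C, steps):
    """Iterate eps_{nu+1} = C * eps_nu**(4/3) starting from eps0."""
    errors = [eps0]
    for _ in range(steps):
        errors.append(C * errors[-1] ** (4.0 / 3.0))
    return errors


def closed_form(eps0, C, nu):
    """Closed form eps_nu = C**(-3) * (C**3 * eps0)**((4/3)**nu)."""
    return C ** -3 * (C ** 3 * eps0) ** ((4.0 / 3.0) ** nu)


C, eps0 = 10.0, 1e-4          # C**3 * eps0 = 0.1 < 1, so the iteration contracts
errors = kam_errors(eps0, C, steps=12)
```

With these hypothetical numbers the error drops below $10^{-30}$ within twelve steps, even though the constant C is large; the smallness condition on $\epsilon$ in Theorem 1 plays the role of $C^3 \epsilon_0 < 1$ here.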
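The KAM step is now completed.
5. Iteration Lemma and Convergence For any given s, ε, r, γ, we define, for all ν ≥ 1, the following sequences
$$\begin{aligned} r_\nu &= r\Big(1 - \sum_{i=2}^{\nu+1} 2^{-i}\Big), \\ \epsilon_\nu &= c\, \gamma_\nu^{-c}\, [\Gamma(r_{\nu-1} - r_\nu)]^2\, \epsilon_{\nu-1}^{4/3}, \\ \gamma_\nu &= \gamma\Big(1 - \sum_{i=2}^{\nu+1} 2^{-i}\Big), \\ \eta_\nu &= \epsilon_\nu^{1/3}, \qquad L_\nu = L_{\nu-1} + \epsilon_{\nu-1}, \\ s_\nu &= \frac12\, \eta_{\nu-1} s_{\nu-1} = 2^{-\nu} \Big(\prod_{i=0}^{\nu-1} \epsilon_i\Big)^{1/3} s_0, \\ K_\nu &= \Big(\frac{c}{2}\, \epsilon_{\nu-1}^{-1} \big(\gamma_{\nu-1}^{\bar d^2} - \gamma_\nu^{\bar d^2}\big)\Big)^{\frac{1}{\bar d^2 \tau}}, \\ D_\nu &= D_{a,\rho}(r_\nu, s_\nu), \end{aligned} \tag{5.1}$$
where c is the constant in (4.37). The parameters $r_0, \epsilon_0, \gamma_0, L_0, s_0, K_0$ are defined respectively to be r, ε, γ, L, s, 1. Note that
$$\Psi(r) = \prod_{i=1}^{\infty} [\Gamma(r_{i-1} - r_i)]^{2 (\frac34)^i}$$
is a well defined finite function of r.
5.1. Iteration Lemma. The preceding analysis may be summarized as follows.
Lemma 5.1. Suppose that $\epsilon_0 = \epsilon(d, \bar d, \delta, \delta_1, \bar a - a, L, \tau, \gamma)$ is small enough. Then the following holds for all ν ≥ 0. Let
$$N_\nu = e_\nu + \langle \omega_\nu(\xi), I\rangle + \sum_{i \in \mathbb{N}} \langle A^\nu_i(\xi) z_i, z_i\rangle$$
be a normal form with parameters ξ satisfying
$$|\langle k, \omega_\nu\rangle^{-1}| < \frac{|k|^\tau}{\gamma_\nu}, \qquad \|(i\langle k, \omega_\nu\rangle I_{d_i} + A^\nu_i J_{d_i})^{-1}\| < \Big(\frac{|k|^\tau}{\gamma_\nu}\Big)^{\bar d}, \qquad \|(i\langle k, \omega_\nu\rangle I_{d_i d_j} + (A^\nu_i J_{d_i}) \otimes I_{d_j} - I_{d_i} \otimes (J_{d_j} A^\nu_j))^{-1}\| < \Big(\frac{|k|^\tau}{\gamma_\nu}\Big)^{\bar d^2}$$
[...]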
|k| Jdi )−1 k≥( |k| γν , k(hk,ων iI2m +(Ai γν ) , or τ
|k| ν+1 −1 k(hk,ων+1 >Idi dj +(Aν+1 j Jdi )⊗Idj −Idi ⊗(Jdj Aj )) k>( γν
¯2
)d
,
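with $\omega_{\nu+1} = \omega_\nu + P^\nu_{0l0}$, and a symplectic change of variables
$$\Phi_\nu : D_{\nu+1} \times O_{\nu+1} \to D_\nu, \tag{5.3}$$
such that $H_{\nu+1} = H_\nu \circ \Phi_\nu$, defined on $D_{\nu+1} \times O_{\nu+1}$, has the form
$$H_{\nu+1} = e_{\nu+1} + \langle \omega_{\nu+1}, I\rangle + \sum_{i \in \mathbb{N}} \langle A^{\nu+1}_i z_i, z_i\rangle + P_{\nu+1}, \tag{5.4}$$
satisfying
$$\max_{|l| \le \bar d^2} \Big|\frac{\partial^l (\omega_{\nu+1}(\xi) - \omega_\nu(\xi))}{\partial \xi^l}\Big| \le \epsilon_\nu, \qquad \max_{|l| \le \bar d^2} \Big|\frac{\partial^l (A^{\nu+1}_i(\xi) - A^\nu_i)}{\partial \xi^l}\Big| \le \epsilon_\nu\, i^{-\delta}, \tag{5.5}$$
$$\|X_{P_{\nu+1}}\|^{\bar a, \rho}_{D_{\nu+1}, O_{\nu+1}} \le \epsilon_{\nu+1}. \tag{5.6}$$
5.2. Convergence. Suppose that the assumptions of Theorem 1 are satisfied. To apply the iteration lemma with ν = 0, recall that $\epsilon_0 = \epsilon$, $\gamma_0 = \gamma$, $s_0 = s$, $L_0 = L$, $N_0 = N$, $P_0 = P$,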
O0 =
τ
ξ ∈O:|
τ
¯
|hk,ωi−1 |< |k|γ ,k(hk,ωiIdi +Ai Jdi )−1 k( γν γν |k|τ d¯2 ) }, and {ω ∈ Oν : kM−1 2 k>( γν {ξ ∈ Oν : |hk, ων i−1 | >
(6.1)
where
$$M_1 = \langle k, \omega_\nu\rangle I_{d_i} + A^\nu_i J_{d_i}, \qquad M_2 = \langle k, \omega_\nu\rangle I_{d_i d_j} + (A^\nu_j J_{d_j}) \otimes I_{d_i} - I_{d_j} \otimes (J_{d_i} A^\nu_i). \tag{6.2}$$
In the set $\{\xi \in O : \|M(\omega)^{-1}\| > C\}$ are included also the ξ's for which M is not invertible. Recall$^{18}$ that $\omega_\nu(\xi) = \xi + \sum_{j=0}^{\nu-1} P^j_{000}(\xi)$ with $|\sum_j P^j_{000}(\xi)|_{C^{\bar d^2}} \le \epsilon$, and $A^\nu_i = A_i + 2 \sum_\nu R^{0,\nu}_{ii}$ with $\|\sum_\nu R^{0,\nu}_{ii}\| = O(i^{-\delta})$.
Lemma 6.1. There is a constant $K_0$ such that, for any i, j, and $|k| > K_0$,
$$\mathrm{meas}(R^\nu_k \cup R^\nu_{ki} \cup R^\nu_{kij}) < c\, \frac{\gamma}{|k|^{\tau - 1}}.$$
Proof. As is well known,
$$\mathrm{meas}(R^\nu_k) \le \frac{\gamma_\nu}{|k|^\tau}.$$
The set $R^\nu_{ki}$ is empty if i > const |k|, while, if i ≤ const |k|, from Lemmata 7.6, 7.7 it follows that
$$\mathrm{meas}(R^\nu_{ki}) < c\, \frac{\gamma_\nu}{|k|^{\tau - 1}}.$$
We now give a detailed proof for the most complicated estimate, i.e., the estimate on the measure of the set $R^\nu_{kij}$. Note that the main part of $M_2$ is diagonal$^{19}$. In fact $M_2$ can be rewritten as $M_2 \equiv A_{ij} + B^\nu_{ij}$, with
$$A_{ij} = \langle k, \omega_\nu\rangle I_{d_i d_j} + \lambda_j\, \mathrm{Diag}(I_{d_j/2}, -I_{d_j/2}) \otimes I_{d_i} - \lambda_i\, I_{d_j} \otimes \mathrm{Diag}(-I_{d_i/2}, I_{d_i/2}). \tag{6.3}$$
The matrix $A_{ij}$ is diagonal with entries $\lambda_{kij} = \langle k, \omega_\nu\rangle \pm \lambda_i \pm \lambda_j$ on the diagonal, where $\lambda_i$, $\lambda_j$ are given in (2.10) and the ± sign depends on the position. $B^\nu_{ij}$ is a matrix of size $O(i^{-\delta} + j^{-\delta})$ since $A^\nu_i = A_i + B_i + O(i^{-\delta}) = A_i + O(i^{-\delta})$ by (2.11) and (4.32). In the rest of the proof we drop in the notation the indices i, j since they are fixed. Now either all $\lambda_{kij} \le |k|$ or there are some diagonal elements $\lambda_{kij} > |k|$. We first consider the latter case. By permuting rows and columns, we can find two non-singular matrices $Q_1$, $Q_2$ with elements 1 or 0 such that
$$Q_1 (A + B^\nu) Q_2 = \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix} + \begin{pmatrix} \tilde B_{11} & \tilde B_{12} \\ \tilde B_{21} & \tilde B_{22} \end{pmatrix}, \tag{6.4}$$
where $A_{11}$, $A_{22}$ are diagonal matrices and $A_{11}$ contains all diagonal elements $\lambda_{kij}$ which are bigger than |k|. Moreover, defining $Q_3$, $Q_4$, D as
$$Q_3 = \begin{pmatrix} I & 0 \\ -\tilde B_{21} (A_{11} + \tilde B_{11})^{-1} & I \end{pmatrix}, \qquad Q_4 = \begin{pmatrix} I & -(A_{11} + \tilde B_{11})^{-1} \tilde B_{12} \\ 0 & I \end{pmatrix},$$
18 Recall (4.32), (5.5).
19 Recall (2.10), (6.2).
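and
$$D = A_{22} + \tilde B_{22} - \tilde B_{21} (A_{11} + \tilde B_{11})^{-1} \tilde B_{12} = A_{22} + O(i^{-\delta} + j^{-\delta}), \tag{6.5}$$
we have
$$Q_3 Q_1 (A + B^\nu) Q_2 Q_4 = \begin{pmatrix} A_{11} + \tilde B_{11} & 0 \\ 0 & D \end{pmatrix}. \tag{6.6}$$
For $\xi \in O$ such that D is invertible, we have
$$(A + B^\nu)^{-1} = Q_2 Q_4 \begin{pmatrix} (A_{11} + \tilde B_{11})^{-1} & 0 \\ 0 & D^{-1} \end{pmatrix} Q_3 Q_1. \tag{6.7}$$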
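Since the norms of $Q_1$, $Q_2$, $Q_3$, $Q_4$, $(A_{11} + \tilde B_{11})^{-1}$ are uniformly bounded, it follows from (6.7) that
$$\Big\{\xi \in O_\nu : \|(A + B^\nu)^{-1}\| > \Big(\frac{|k|^\tau}{\gamma_\nu}\Big)^{\bar d^2}\Big\} \subset \Big\{\xi \in O_\nu : \|D^{-1}\| > c \Big(\frac{|k|^\tau}{\gamma_\nu}\Big)^{\bar d^2}\Big\}. \tag{6.8}$$
If all $\lambda_{kij} < c\, |k|$ we simply take $D = A + B^\nu$. Since all elements in D are of size O(|k|), by Lemma 7.6 in the Appendix, we have
$$\Big\{\xi \in O_\nu : \|D^{-1}\| > c \Big(\frac{|k|^\tau}{\gamma_\nu}\Big)^{\bar d^2}\Big\} \subset \Big\{\xi \in O_\nu : |\det D| < c \Big(\frac{\gamma_\nu}{|k|^{\tau - 1}}\Big)^{\bar d^2}\Big\}. \tag{6.9}$$
Let N denote the dimension of D (which is not bigger than $\bar d^2$). Since $D = A_{22} + O(i^{-\delta} + j^{-\delta})$, the N-th order derivative of det D with respect to some $\xi_i$ is bounded away from zero by $\frac{1}{2d} |k|^N$ (provided that |k| is big enough). From (6.8), (6.9) and Lemma 7.7, it follows that
$$\mathrm{meas}\, R^\nu_{kij} = \mathrm{meas}\Big\{\xi \in O_\nu : \|(A + B^\nu)^{-1}\| > \Big(\frac{|k|^\tau}{\gamma_\nu}\Big)^{\bar d^2}\Big\} \le \mathrm{meas}\Big\{\xi \in O_\nu : |\det D| < c \Big(\frac{\gamma_\nu}{|k|^{\tau - 1}}\Big)^{\bar d^2}\Big\} < c \Big(\frac{\gamma_\nu}{|k|^{\tau - 1}}\Big)^{\bar d^2 / N} < c\, \frac{\gamma}{|k|^{\tau - 1}}.$$
[...] $c\, |k|$, then $R^\nu_{ki} = \emptyset$; if $\max\{i, j\} > c\, |k|^{\frac{1}{b - 1}}$, $i \ne j$, for b > 1, or |i − j| > const |k| for b = 1, then $R^\nu_{kij} = \emptyset$, where the constant c depends on the diameter of O.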
Proof. As above, we only consider the most complicated case, i.e., the case of $R^\nu_{kij}$. Notice that $\max\{i, j\} > \mathrm{const}\, |k|^{\frac{1}{b - 1}}$ for b > 1, or |i − j| > const |k| for b = 1, implies
$$|\lambda_i \pm \lambda_j| = (j^b - i^b)(1 + O(i^{-\delta} + j^{-\delta})) \ge \frac12 |j - i| (i^{b - 1} + j^{b - 1})(1 + O(i^{-\delta} + j^{-\delta})) \ge \mathrm{const}\, |k|. \tag{6.11}$$
It follows that $A_{ij}$ defined in (6.3) is invertible and $\|(A_{ij})^{-1}\| < |k|^{-1}$. By Neumann series, we have $\|(A_{ij} + B^\nu_{ij})^{-1}\| < 2 |k|^{-1}$ for large k (say $|k| > K_0$), i.e., $R^\nu_{kij} = \emptyset$. □
Lemma 6.3. For b ≥ 1, we have
$$\mathrm{meas}\Big(\bigcup_{\nu \ge 0} R^\nu\Big) = \mathrm{meas}\Big(\bigcup_{\nu, |k| > K_\nu, i, j} (R^\nu_k \cup R^\nu_{ki} \cup R^\nu_{kij})\Big) < c\, \gamma^{\frac{\delta}{1 + \delta}}.$$
Proof. The measure estimate for $R^0$ comes from our assumption (2.12). We then consider the estimate of
$$\mathrm{meas}\Big(\bigcup_\nu \bigcup_{|k| > K_\nu} \bigcup_{i,j} R^\nu_{kij}\Big),$$
which is the most complicate one. Let us consider separately the case b > 1 and the case b = 1. We first consider b > 1. By Lemmata 6.1, 6.2, if |k| > K0 and i 6 = j , we have meas
[
Rkij = meas
i6=j
[
Rkij 1
i6 =j ;i,j ±2λi and A22 =< k, ων > I . Repeating the arguments in Lemma 6.1, we get (6.9) and Rνkii ⊂ ξ : | det D| < c
γν d¯2 |k|τ −1 Y = ξ: |hk, ων i + O i −δ | < c
⊂ ξ : |hk, ων i| < c
γ |k|τ −1
γν d¯2 |k|τ −1 1 + δ ≡ Qkii . i
Since Qkii ⊂ Qki0 i0 for i ≥ i0 , using (6.10), we find that meas
[ i
X i0 γ 1 Rkii ≤ |Rkii | + |Qki0 i0 | < c + |k|τ −1 i0−δ i max{d + 2 +
2 b−1 , (d
+ 1)
[ [
meas
|k|>Kν i,j
+meas ∪|k|>Kν
1+δ δ
The quantity meas X
meas
ν≥1
ν
|k|>Kν
[ [ |k|>Kν i,j
S
i,j
Rνkij γν
Rνkii
i
S S
(6.14)
+ 1}. As in (6.12), (6.14), we find
Rνkij γν [
so that
[ [
= meas
γν+1
|k|>Kν i6 =j
Rνkij γν
δ
< c Kν−1 γ 1+δ .
Rνkij is then bounded by
δ
< c γ 1+δ
X
δ
Kν−1 < c γ 1+δ ,
ν≥0
(6.15)
1+δ
2 , (d + 1) δ + 1}. This concludes the proof for b > 1. provided τ > max{d + 2 + b−1 Consider now b = 1. Without loss of generality, we assume j ≥ i and j = i + m. Note that Lemma 6.2 implies Rkij = ∅ for m > C|k|. Following the scheme of the above proof, we find [ [ [ [ Rkij = Rki,i+m = Rki,i+m k,i,j
k,i,m
[
⊂
[
k,m h ⊂ ω : | det M| < c h
Proof. First, note that if M is a nonsingular N × N matrix with elements bounded by $|m_{ij}| \le m$, its inverse is $M^{-1} = \frac{1}{|M|}\, \mathrm{adj}\, M$, so that
$$\|M^{-1}\| < c\, \frac{m^{N - 1}}{|\det M|}$$
with a constant depending on N. In particular, if m = const |k| and $|\det M| > \frac{|k|^{N - 1}}{h}$, then $\|M^{-1}\| < c\, h$. This proves the lemma. □
In order to estimate the measure of $R^{\nu+1}$, we need the following lemma, which has been proven in [19,21]. A similar estimate is also used by Bourgain [4].
Lemma 7.7. Suppose that g(u) is a $C^m$ function on the closure $\bar I$, where $I \subset \mathbb{R}^1$ is a finite interval. Let $I_h = \{u : |g(u)| < h\}$, h > 0. If for some constant d > 0, $|g^{(m)}(u)| \ge d$ for all $u \in I$, then $\mathrm{meas}(I_h) \le c\, h^{\frac1m}$, where $c = 2(2 + 3 + \dots + m + d^{-1})$.
For the proof of Lemma 3.1, we need the following
Lemma 7.8.
$$\sum_{j\in\mathbb{Z}} e^{-|n-j|r+\rho|j|} \le C e^{\rho|n|}, \qquad \sum_{j,n\in\mathbb{Z}} |q_j|\,e^{-|n-j|r+|n|\rho} \le C|q|_{\rho}$$
if ρ < r, $q \in Z_{\rho}$, where C depends on r − ρ.
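The first inequality of Lemma 7.8 can be checked numerically. The sketch below is only an illustration under stated assumptions: the values of r and ρ, the truncation `cutoff`, and the explicit constant $C = \coth((r-\rho)/2)$ (obtained from $|j| \le |n| + |n-j|$ and a geometric series) are choices made here, not taken from the text.

```python
import math

def weighted_sum(n, r, rho, cutoff=200):
    """Truncated sum over |j| <= cutoff of exp(-|n-j|*r + rho*|j|)."""
    return sum(
        math.exp(-abs(n - j) * r + rho * abs(j))
        for j in range(-cutoff, cutoff + 1)
    )

def geometric_constant(r, rho):
    """Sum over k in Z of exp(-(r-rho)|k|); finite only when rho < r."""
    if rho >= r:
        raise ValueError("the bound requires rho < r")
    g = math.exp(-(r - rho))
    return (1.0 + g) / (1.0 - g)

# Hypothetical sample values with rho < r.
r, rho = 1.0, 0.5
C = geometric_constant(r, rho)

# Ratio of the left-hand side to exp(rho*|n|) for a range of n.
ratios = [weighted_sum(n, r, rho) / math.exp(rho * abs(n)) for n in range(-30, 31)]
worst = max(ratios)
```

In this sketch `worst` stays below `C`, and `C` depends on r and ρ only through r − ρ, in line with the statement of the lemma.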
Lemma 7.9.
$$\sum_{j\in\mathbb{Z}} (1+|n-j|)^{-K}|j|^{a} < c\,|n|^{a}, \qquad \sum_{j,n\in\mathbb{Z}} |q_j|\,(1+|n-j|)^{-K}|n|^{a} \le C|q|_{a}$$
if K > a + 1, $q \in Z_{a,\rho=0}$, where C depends on K − a − 1.
The proofs of the above two lemmata are elementary and we omit them.
A direct proof of Lemma 3.1. It is clearly enough to consider the case of f(u) equal to a monomial $u^{N+1}$ for some N ≥ 1. From (3.10), one can see that the regularity of G implies the regularity of $\tilde G$. In the following, we shall give the proof for G.
Suppose that the potential V(x) is analytic in |Im x| < r (respectively, belongs to the Sobolev space $H^{K}$); then the eigenfunctions are analytic in |Im x| < r (respectively, belong to $H^{K+2}$). If we let $\phi_i(x) = \sum_n a^i_n e^{i\langle n,x\rangle}$, then (see, e.g., [7]) $|a^i_n| < c\,e^{-|i-n|r}$, respectively $|a^i_n| < c\,(1+|n-i|)^{-K-2}$. Recall that
$$G(q) = \sum_{i_0,\dots,i_N} C_{i_0\cdots i_N}\,\frac{q_{i_0}\cdots q_{i_N}}{\sqrt{\lambda_{i_0}\cdots\lambda_{i_N}}},$$
where
$$C_{i_0\cdots i_N} = \int_{T^1}\phi_{i_0}\cdots\phi_{i_N}\,dx = \sum_{n_0+n_1+\cdots+n_N=0}\Big(\prod_{s=0}^{N} a^{i_s}_{n_s}\Big),$$
with $|a^{i_s}_{n_s}| < c\,e^{-|i_s-n_s|r}$ (respectively, $|a^{i_s}_{n_s}| < c\,(1+|n_s-i_s|)^{-K-2}$). In what follows, we assume either a = 0, ρ > 0 or a > 0, ρ = 0. Since
$$G_{q_j} = (N+1)\sum_{i_1,\dots,i_N} C_{j i_1\cdots i_N}\,\frac{q_{i_1}\cdots q_{i_N}}{\sqrt{\lambda_j\lambda_{i_1}\cdots\lambda_{i_N}}},$$
it follows that kGq ka+ 1 ,ρ = kGq0 k + 2
0 such that $\sigma_k(\mu^*) = 0$, and $\sigma_k : [\mu^*_{k+1}, \mu^*_k] \to [0, 1]$ is a $C^\infty$ decreasing diffeomorphism onto [0, 1]. Let $\mu_k$ be the inverse of $\sigma_k$. For each k ≥ 1 large enough, define $T_k : [0,1]\times[f^{-1}(a), f(a)] \to \mathbb{R}$ by $T_k(\sigma, x) = f^{k}_{\mu_k(\sigma)}(x)$. For $f^{-1}(a) < x < f(a)$ and µ small, define $t_a(\mu, x)$ by $X_{t_a(\mu,x)}(\mu, x) = a$. The sequence $(T_k)_k$ converges in the $C^\infty$ topology to the transition map $T_\infty : [0,1]\times[f^{-1}(a), f(a)] \to \mathbb{R}$, defined by $T_\infty(\sigma, x) = X_{t_a(0,x)-\sigma}(0, b)$. Note that $T_\infty$ depends on both a and b and, when necessary, we write $T_{\infty,a,b}$ to indicate this dependence. Observe also that $\partial_x T_\infty(\sigma, x)$ is bounded away from zero by a constant which does not depend on (σ, x). With b fixed, and taking a sufficiently close to 0, we can assume that this constant is arbitrarily large. This fact will be used later in the proof of the next proposition.
All the concepts above can be given for any fixed point $q_0$ in the interior of I besides 0. We can also define $T_{\infty,a,b}$ for every pair $a < q_0 < b$ with $(a, b) \in W^s_{loc}(q_0)\times W^u_{loc}(q_0)$, where $W^s_{loc}(q_0)$ (resp. $W^u_{loc}(q_0)$) denotes the local stable (resp. unstable) manifold of $q_0$. Indeed, let n and m be such that $a_n = f^n(a)$ and $b_m$, with $f^m(b_m) = b$, are close enough to $q_0$, and $a_n < q_0 < b_m$: for $(\sigma, x) \in [0,1]\times[f^{-1}(a), f(a)]$, set $T_{\infty,a,b}(\sigma, x) = f^m\big(T_{\infty,a_n,b_m}(\sigma, f^n(x))\big)$. Note that this definition does not depend on n, m. The construction of $X_0$ starts with the following result:
Proposition 1. Given ε ∈ (0, 1) there are $\bar\delta > 0$, $b_0 \in (0, \bar\delta)$ and a piecewise $C^2$ map $f_0 : [-1, \bar\delta] \to [-1, \bar\delta]$ with a saddle-node fixed point $q_0$ and three discontinuities $-1 < c_1 < c_2 < b_0 < c_3 < \bar\delta$, $c_2 = 0$, satisfying:
1. $f_0$ has two fixed points $q_1$ and $q_2$, $q_1 \in (c_1, f_0(b_0))$ and $q_2 \in (c_2, c_3)$;
2. $f_0(c_1-) = f_0(c_3-) = \bar\delta$, $f_0(c_1+) = f_0(c_3+) = b_0$;
3. $f_0(c_2-) = -1$, $f_0(c_2+) < f_0(-1)$, $f_0(b_0) = f_0(\bar\delta) < c_2$;
4. if $f_0'(x)$ is defined then $|f_0'(x)| \ge 1 - \varepsilon$.
In addition, $f_0$ restricted to $[-1, c_1]$ is increasing with inverse denoted by $f_0^{-1}$;
5. there is $c > \sqrt{2}$ and $-1 < x_c < c_1$ so that
$$|f_0'(x)| > c, \quad \forall x \in (x_c, \bar\delta]; \qquad (2)$$
6. $q_0 \in (-1, x_c)$, there is $\bar a \in (q_0, c_1)$ with $[f_0^{-2}(\bar a), \bar a] \subset (x_c, c_1)$, $(-1, \bar a) \in W^s_{loc}(q_0)\times W^u_{loc}(q_0)$, such that if $T_\infty = T_{\infty,f(-1),f^{-1}(\bar a)}$, then
$$\inf\{\partial_x T_\infty(\sigma, x) : (\sigma, x) \in [0,1]\times[-1, f_0(-1)]\} > c.$$
Proof. Let ε ∈ (0, 1). To construct the desired map $f_0$ we start with the tent map $h_0$ given by:
$$h_0(x) = \begin{cases} 2x - 1 & \text{if } -1 \le x \le -\tfrac12,\\ -2x - 1 & \text{if } -\tfrac12 < x \le 0. \end{cases} \qquad (3)$$
Note that $h_0$ has a fixed point $q_1 \in (c_1, 0)$ with $c_1 = -\tfrac12$, and $h_0'(x) = 2$ wherever $h_0'$ is defined. Denote the inverse of $h_0$ restricted to $[-1, c_1]$ by $h_0^{-1}$. Choose at once $q^* \in (q_1, 0)$, $\bar a \in (-1, c_1)$ close to $c_1$, and $x^* < h_0^{-2}(\bar a)$ close to $h_0^{-2}(\bar a)$. Note that
$$|h_0'(x)| \ge 2, \quad \forall x \in [x^*, 0]. \qquad (4)$$
Pick $0 < b_0 < \bar\delta$ sufficiently close to 0 in a way that $h_0$ extends to the interval $(0, \bar\delta]$ satisfying the following conditions:
(a) $h_0(b_0) = h_0(\bar\delta) = q^*$;
(b) $h_0$ has only one discontinuity at some $c_3 \in (b_0, \bar\delta)$;
(c) $h_0$ is increasing (resp. decreasing) in $(0, c_3)$ (resp. $(c_3, \bar\delta]$);
(d) $h_0(c_3-) = \bar\delta$ and $h_0(c_3+) = b_0$;
(e) $|h_0'(x)| \ge 4$ in $(0, \bar\delta)$.
Next, we deform $h_0$ in order to obtain a new map $h_1$ such that $h_1 = h_0$ in $[x^*, \bar\delta]$, $h_1'(x) \ge 1$ everywhere and such that x = −1 is a saddle-node fixed point of $h_1$. Here we clarify that since x = −1 is an endpoint of the interval under consideration, we only consider derivatives to the right of x = −1 for the properties involved in the definition of a saddle-node fixed point.
Moving to the right the saddle-node fixed point of $h_1$ at x = −1 we get a map $f_0$ with a generic saddle-node fixed point $q_0 \in (-1, x^*)$, such that $f_0 = h_1$ in $[x^*, \bar\delta]$, and $|f_0'(x)| \ge 1 - \varepsilon$ wherever $f_0'$ is defined. Figure 1 displays the main features of $f_0$.
We claim that for $q_0$ close enough to −1, $f_0$ satisfies the properties in the statement of the proposition. Indeed, Property (3) is exactly the last inequality. Choosing $q_0$ sufficiently close to −1 we ensure that the transition map $T_\infty$ restricted to $[0,1]\times[-1, f_0(-1)]$ satisfies $\partial_x T_\infty(\sigma, x) > c$ with $c \in \mathbb{R}$ close to 2 (say). In particular, $c > \sqrt{2}$. By construction, $[f_0^{-2}(\bar a), \bar a] \subset (x^*, c_1)$ is contained in $[-1, q^*] = f_0([c_2, b_0])$ and so (6) follows with $x_c = x^*$. As $f_0 = h_1$ in $[x_c, \bar\delta]$, we obtain Conditions (4) and (5), and since $f_0 = h_0$ in $[x^*, \bar\delta]$, we get (2) by (d) above. Property (1) is clearly fulfilled with $c_1 = -\tfrac12$, $c_2 = 0$ and $c_3$ as in (c). This completes the proof of our claim and also the proof of Proposition 1. □
Fig. 1. The one-dimensional map f0
Note 1. Given α ∈ (0, 1) we can further assume, and we do so, that there are $\delta_0 > 0$ (small), $0 < K_0 < 1$ (close to 1) and D > 0 such that $f_0$ satisfies
$$f_0(x) = \begin{cases} x^{\alpha} - K_0 & \text{if } 0 \le x < \delta_0,\\ |x|^{\alpha} - 1 & \text{if } -\delta_0 < x < 0, \end{cases} \qquad (5)$$
and
$$|f_0'(x)| \le D|x|^{\alpha-1}$$
wherever $f_0'$ is defined. This assumption will be used later in the construction of the vector field $X_0$. We denote by $R_{\alpha,\bar\delta,b_0}$ the set of such functions $f_0 : [-1, \bar\delta] \to [-1, \bar\delta]$.
2.2. The Poincaré map F0. In this section we describe a plane region S and a map $F_0$ defined on a subset $S^*$ of S. The region S will be a cross section to the vector field $X_0$ required in Theorem 1, the Poincaré map associated to $X_0$ will be defined on $S^*$ and it will coincide with $F_0$ restricted to $S^*$. For this we proceed in the following way. Let $0 < b_0 < \bar\delta$ and $0 < y_0 < 1$ be as in the previous section and consider the plane region
$$S_{\bar\delta,b_0,y_0} = [-1, b_0]\times[-1, 1] \,\cup\, [b_0, \bar\delta]\times[y_0, 1].$$
We denote by (x, y) the coordinates of $S_{\bar\delta,b_0,y_0}$. Set
$$J_1 = \{\bar\delta\}\times[y_0, 1], \qquad J_2 = \{b_0\}\times[-1, -y_0].$$
Fig. 2. The region $S_{\bar\delta,b_0,y_0}$
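Given an orientation-reversing diffeomorphism $i : J_1 \to J_2$ with $i(\bar\delta, 1) = (b_0, -1)$ and $i(\bar\delta, y_0) = (b_0, -y_0)$, we obtain the annulus $\tilde S_{\bar\delta,b_0,y_0,i}$ from $S_{\bar\delta,b_0,y_0}$ by identifying $q \in J_1$ with $i(q) \in J_2$. The coordinate system (x, y) in $S_{\bar\delta,b_0,y_0}$ lifts to a coordinate system in $\tilde S_{\bar\delta,b_0,y_0,i}$, which we still denote by (x, y).
Proposition 2. Given α ∈ (0, 1), β − α > 1 and $a_0 \in (0, \tfrac18)$, there are $0 < b_0 < \bar\delta$ and $0 < y_0 < 1$ so that if $i : J_1 \ni (\bar\delta, y) \mapsto (b_0, -y) \in J_2$ and $S = \tilde S_{\bar\delta,b_0,y_0,i}$, then there is $F_0 : S \to S$, $(x, y) \mapsto (f_0(x), g_0(x, y))$, such that $f_0 \in R_{\alpha,\bar\delta,b_0}$, and
$$g_0(x, y) = \begin{cases} \tfrac12\, y\, x^{\beta}, & \text{for } 0 < x \le y_0,\\ -\tfrac12\, y\,(-x)^{\beta} - \tfrac12, & \text{for } -y_0 \le x < 0. \end{cases}$$
Moreover, if $S^* = S \setminus \{x = 0\}$ then $F_0$ restricted to $S^*$ is $C^\infty$ and the following inequalities hold:
$$\sup_{q=(x,y)\in S^*}\left\{ |\partial_y g_0(q)|,\ \left|\frac{\partial_x g_0(q)}{f_0'(x)}\right|,\ \left|\frac{\partial_y g_0(q)}{f_0'(x)}\right|,\ |\det(DF_0(q))| \right\} < a_0, \qquad (6)$$
$$\sup_{q=(x,y)\in S^*}\left\{ |\partial^2_{yy} g_0(q)|,\ \left|\frac{\partial^2_{xx} g_0(q)}{f_0'(x)}\right|,\ \left|\frac{\partial^2_{yy} g_0(q)}{f_0'(x)}\right|,\ \left|\frac{\partial^2_{xy} g_0(q)}{f_0'(x)}\right| \right\} < a_0. \qquad (7)$$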
Proof. Let $a_0$, α and β be as in the statement, and fix 0 < ε < 1. For this ε, let $0 < b_0 < \bar\delta$ and $f_0 : [-1, \bar\delta] \to [-1, \bar\delta]$ satisfying (1)-(6) in Proposition 1. In particular, by Note 1, there exists $0 < \delta_0 < \bar\delta$ such that $f_0(x)$ is as in (5) for some constant $K_0 > 0$ that we assume close to 1, for $0 < |x| < \delta_0$. Moreover, $1 - \varepsilon \le |f_0'(x)| \le D|x|^{\alpha-1}$ wherever $f_0'$ is defined. Let δ > 0 be such that $2\delta < \delta_0$, $\beta\delta^{\beta}$ 1, there is $\delta_1 > 0$ such that
$$|x|^{\beta} \le \frac{2\bar a_0}{D\alpha}\,|x|^{1-\alpha} \quad \text{for } 0 \le |x| \le \delta_1.$$
Shrinking δ, we can assume $2\delta \le \delta_1$. Now, consider $G(x) = |x|^{\beta}$. We have $G'(\delta) = \beta\delta^{\beta-1}$, and $G'(\delta)$ is bigger than the slope $(\beta-1)\delta^{\beta-1}$ of the segment joining $(\delta, G(\delta))$ and $(2\delta, \beta\delta^{\beta})$. As $G'(\delta) \to 0$ when $\delta \to 0^+$ for β − 1 > 0, shrinking δ once more we can construct a map η satisfying (E1) and (E2), proving the claim. Note that η is $C^{\beta}$ in $\mathbb{R}$ and $C^\infty$ in $\mathbb{R}\setminus\{0\}$.
Now we choose a $C^\infty$ map $\psi : \mathbb{R} \to \mathbb{R}$ such that
(P1) ψ(0) = 1 and ψ(x) = 0 for |x| ≥ 1;
(P2) $0 \le |\psi'(x)| \le 1$ and $|\psi''(x)| \le 1$ ∀ x.
Let $c_1$, $c_2 = 0$, and $c_3$ be the discontinuities of $f_0$, and $b_0 < b_1 < \bar\delta$ with $f_0(b_1) = b_0$. Define
$$g_0(x, y) = \begin{cases}
\tfrac12\eta(x)y + \tfrac12, & \text{for } -1 \le x \le c_1,\ -1 \le y \le 1,\\
-\tfrac12\eta(x)y - \tfrac12, & \text{for } c_1 < x \le c_2,\ -1 \le y \le 1,\\
\tfrac12\eta(x)y, & \text{for } c_2 < x \le b_0,\ -1 \le y \le 1,\\
\tfrac12\eta(x)y + 2\delta\,\psi\big(\tfrac{x-b_1}{b_1-b_0}\big), & \text{for } b_0 < x \le b_1,\ \delta \le y \le 1,\\
\tfrac12\eta(x)y + 2\delta, & \text{for } b_1 < x \le c_3,\ \delta \le y \le 1,\\
-\tfrac12\eta(x)y - 2\delta\,\psi\big(\tfrac{x-c_3}{\bar\delta-c_3}\big), & \text{for } c_3 < x \le \bar\delta,\ \delta \le y \le 1.
\end{cases} \qquad (9)$$
One can easily verify that:
(a) $-g_0(c_1^-, y) = g_0(c_1^+, y)$, ∀ y ∈ [−1, 1],
(b) $-g_0(c_3^-, y) = g_0(c_3^+, y)$, ∀ y ∈ [δ, 1],
(c) $g_0(c_1^-, 1) < 1$, $g_0(c_1^-, -1) > g_0(c_3^-, 1)$,
(d) $g_0(c_3^-, \delta) > \delta$, $g_0(c_3^+, \delta) < -\delta$,
(e) $g_0(c_3^+, 1) > g_0(c_1^+, -1)$, $g_0(c_1^+, 1) > -1$.
Then, the map $F_0 : S_{\bar\delta,b_0,\delta} \to S_{\bar\delta,b_0,\delta}$ given by $F_0(x, y) = (f_0(x), g_0(x, y))$ is well defined. Also, by (a) and (b), $F_0$ lifts to a map defined in the annulus $S = \tilde S_{\bar\delta,b_0,y_0,i}$, with $y_0 = \delta$, and $i : (\bar\delta, y) \in J_1 \mapsto (b_0, -y) \in J_2$ being an orientation-reversing diffeomorphism. Moreover, (c), (d) and (e) imply that $F_0$ is a $C^\infty$ diffeomorphism on $S^*$. The features of $F_0$ are displayed in Fig. 3, and the main properties of $F_0 = (f_0, g_0)$ are the following:
Fig. 3. The map F0
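(i) $f_0 \in R_{\alpha,\bar\delta,b_0}$, and
$$g_0(x, y) = \begin{cases} \tfrac12\, y\, x^{\beta}, & \text{for } 0 < x \le \delta,\\ -\tfrac12\, y\,(-x)^{\beta} - \tfrac12, & \text{for } -\delta \le x < 0. \end{cases}$$
(ii) $\partial_y g_0(x, y) = \tfrac12\eta(x)$ and, by (E1), $|\partial_y g_0(x, y)| < a_0$, ∀ (x, y) ∈ $S^*$.
(iii) $\partial^2_{yy} g_0(x, y) = 0$ for every (x, y) ∈ $S^*$.
(iv) By (E2), (P2), and the fact that $|f_0'(x)| \ge 1 - \varepsilon$ wherever $f_0'$ is defined, if $m = \min\{b_1 - b_0, \bar\delta - c_3\}$ and δ is small enough then
$$\left|\frac{\partial_x g_0(x, y)}{f_0'(x)}\right| \le \frac{1}{2(1-\varepsilon)}\,|\eta'(x)| + \frac{2\delta}{m(1-\varepsilon)}\,\sup|\psi'| \le \frac{a_0}{2(1-\varepsilon)} + \frac{2\delta}{m(1-\varepsilon)} < a_0, \quad \forall (x, y) \in S^*.$$
(v) $\det DF_0(x, y) = f_0'(x)\,\partial_y g_0(x, y) = \tfrac12 f_0'(x)\eta(x)$, and (E1) implies
$$|\det DF_0(x, y)| \le \frac{D|x|^{\alpha-1}|\eta(x)|}{2} \le a_0, \quad \text{for } |x| \le 2\delta,$$
and
$$|\det DF_0(x, y)| \le D\,2^{\alpha-2}\beta\,\delta^{\beta+\alpha-1}, \quad \text{for } |x| \ge \delta.$$
So, shrinking δ, we can assume $|\det DF_0(x, y)| < a_0$ for every (x, y) ∈ $S^*$.
(vi) Using that β − α > 1, (E4), and (P2) we get
$$\left|\frac{\partial^2_{xx} g_0(x, y)}{f_0'(x)}\right| \le \sup\frac{\beta(\beta-1)|x|^{\beta-2}}{\alpha|x|^{\alpha-1}} + \frac{2\delta}{m^2(1-\varepsilon)}\,\sup|\psi''|,$$
1 and $a_0 \in (0, \tfrac18)$, there are $0 < b_0 < \bar\delta$ and a $C^3$ vector field $X_0$ in $\mathbb{R}^3$ satisfying the following properties:
1. $X_0$ has a singularity at the origin $\sigma_0 = (0, 0, 0) \in \mathbb{R}^3$ so that the eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ of $DX_0(\sigma_0)$ are real and satisfy
$$-\frac{\lambda_2}{\lambda_1} = \alpha, \qquad -\frac{\lambda_3}{\lambda_1} = \beta; \qquad (10)$$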
2. there is an orientation-reversing diffeomorphism i such that if $S = S_{\bar\delta,b_0,y_0,i}$, then S × {z = 1} is a cross section of $X_0$ (still denoted by S) and the curve S ∩ {x = 0} is contained in the stable manifold of $\sigma_0$;
3. a first return map $F_0$ for $X_0$ is defined in $S^* = S \setminus \{x = 0\}$, and $F_0$ satisfies Proposition 2.
Proof. Let α, β and $a_0$ be as in the statement. Let $0 < b_0 < \bar\delta$, $0 < y_0 < 1$, $S = S_{\bar\delta,b_0,y_0,i}$, where $i(\bar\delta, y) = (b_0, -y)$, and $F_0 : S \to S$ be as in Proposition 2. Now we take a linear vector field $Y_0$ so that $Y_0(\sigma_0) = 0$ and the eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ of $DY_0(\sigma_0)$ satisfy (10). Embedding S into $\mathbb{R}^3$, with the identification $q \in S \mapsto (q, 1) \in \mathbb{R}^3$, and taking into account the linear coordinate system (x, y) in S, we obtain that $Y_0$ induces a Poincaré map $P_0 : \Sigma \to \{x = \pm 1\}$ given by $P_0(x, y) = (y|x|^{\beta}, |x|^{\alpha})$, where Σ is a vertical band containing {x = 0}. Note that $P_0$ sends {x = constant} ⊂ Σ into {z = constant} ⊂ {x = ±1}. We complete the flow of $Y_0$ in a smooth, non-zero and non-linear way so that the induced first return map $\Pi : \{x = \pm 1\} \to S$ is given by $\Pi(\pm 1, y, z) = (\sigma z, My)$, with M > 1/2, 0 < σ < 1 small. Moreover, Π satisfies $\Pi \circ P_0(x, y) = F_0(x, y)$. This defines $X_0(x, y)$ for (x, y) ∈ Σ. To define $X_0(x, y)$ for (x, y) ∈ S \ Σ we suspend the restriction of $F_0$ to S \ Σ in the usual way in order to obtain a flow as in Fig. 3 (say). This construction is done in such a way that $X_0$ induces a first return map in S \ {x = 0} which coincides with $F_0$. The proof of the proposition is finished. □
We observe that the construction of the flow of $X_0$ indicated in Fig. 3 is similar to the one for the geometric Lorenz attractor in [9], except for the fact that we have considered as a cross section an annulus instead of a square.
More precisely, we have constructed a vector field $X_0$, which presents a transversal cross section homeomorphic to an annulus S, so that the forward orbit of points in S by $X_0$ either goes directly to the singularity $\sigma_0$ or else returns to S. In particular, we have a first return map $F_0 : S^* \to S$ whose properties are described in Proposition 2.
Fig. 4. The vector field X0
Remark that, by construction, we can take an open set U close to (and containing) the saturated set $\bigcup_{t\ge 0}(X_0)_t(S)$ of S by the flow $(X_0)_t$ of $X_0$ such that U is positively invariant for $X_0$, and its boundary ∂U is a cross-section homeomorphic to the disjoint union of a bi-torus $bT^2$ and a torus $T^2$, inside the region in $\mathbb{R}^3$ bounded by $bT^2$. Let
$$\Lambda_0 = Cl\Big(\bigcup_{t\ge 0}(X_0)_t\Big(\bigcap_{n\in\mathbb{N}} Cl(F_0^n(S^*))\Big)\Big).$$
We observe that every forward orbit of $X_0$ with initial data in ∂U flows directly to $\Lambda_0$. Thus, $\Lambda_0 = \bigcap_{t\ge 0}(X_0)_t(U)$, and we conclude that U is an isolating block for $\Lambda_0$. Here Cl(B) denotes the closure of B.
Now, for all X near $X_0$, set $\Lambda_X = \bigcap_{t\ge 0} X_t(U)$. Note that $\Lambda_X$ is compact for X near $X_0$, and ∂U is a cross section to $X_0$. By definition, U is an isolating block for $\Lambda_X$. Next, we prove the following result:
Lemma 1. If $\Lambda_X$ is a robust strange attractor of X then $\Lambda_X$ is non-Lorenz-like.
Proof. First observe that by construction, $\Lambda_X$ contains a singularity. So, $\Lambda_X$ is a robust singular strange attractor. It remains to prove that it is not equivalent to any geometric Lorenz attractor. For this, observe that any geometric Lorenz attractor as in [9] displays an isolating block whose boundary is homeomorphic to the torus $T^2$. Since the homeomorphism type of the boundary of any isolating block is invariant by equivalence, we obtain that the boundary of any isolating block of any singular attractor equivalent to the geometric Lorenz attractor is homeomorphic to $T^2$. On the other hand, $\Lambda_X$ has an isolating block U such that ∂U is homeomorphic to the disjoint union of $bT^2$ and $T^2$. Thus $\Lambda_X$ is not equivalent to any geometric Lorenz attractor. The same kind of argument proves that $\Lambda_X$ is equivalent neither to the attractors in [12] nor to the ones in [14]. This proves the lemma. □
2.4. Construction of $\mathcal{N}$, $\mathcal{U}$ and U. First note that the vector field $X_0$ constructed in 2.3 has a unique saddle-node closed orbit $q_0$, which corresponds to the saddle-node fixed point of $f_0$ in Proposition 1. By construction, this saddle-node orbit is normally contracting: if $t_0$ is the period of $q_0$, $D(X_0)_{t_0}(q_0)$ restricted to S has a unique eigenvalue on the unit circle, which is equal to 1, and the other one is strictly less than 1 in norm.
We fix the open set U as the one described in the last subsection. We define $\mathcal{N}$ as the set of vector fields $X \in \mathcal{X}^r$ having a unique saddle-node closed orbit close to $q_0$. Then $\mathcal{N}$ is a co-dimension one sub-manifold containing $X_0$, which splits any small neighborhood of $X_0$ into two regions $\mathcal{N}^-$ and $\mathcal{N}^+$. We fix hereafter this co-dimension one sub-manifold $\mathcal{N}$.
Let us fix $\mathcal{U}$. For every vector field X close to $X_0$ the annulus S is a cross section of X. Moreover X induces a first return map $F_X : S \setminus \{x = 0\} \to S$. Observe that by construction, the lift of the vertical foliation $\mathcal{V}$, given by the lift of {x = constant}, is a $C^\infty$ invariant stable foliation of $F_0 = F_{X_0}$. The next lemma shows that, if we choose $a_0$ sufficiently small in Proposition 2, then $\mathcal{V}$ is robust for small $C^r$, r ≥ 3, perturbations of $X_0$.
Lemma 2. Assume that $a_0 \in (0, \tfrac18)$ satisfies
$$\frac{2a_0}{1 - a_0} < \frac12, \qquad \frac{a_0}{(1 - a_0)^2} < \frac12, \qquad \frac{4a_0^2}{(1 - a_0)^2} < \frac12. \qquad (11)$$
Let 0 < α < 1, β − α > 1, and $X_0$ be the vector field obtained in Proposition 3 for α, β, and $a_0$. Then, there is a neighborhood $\mathcal{U}$ of $X_0$ in $\mathcal{X}^r$ so that for all $X \in \mathcal{U}$, there is an invariant $C^2$ stable foliation of $F_X$ in S, depending $C^2$ on X.
The proof of this lemma is in Sect. 5. We fix hereafter the open set $\mathcal{U}$ obtained from this lemma. This completes the description of the vector field $X_0$, the co-dimension one submanifold $\mathcal{N}$, and the open sets $\mathcal{U}$ and U. The remainder of the paper is devoted to the description of the dynamics of vector fields in $\mathcal{U}$.
3. One-Dimensional Analysis
Let $X_\mu \in C^k(I, \mathcal{X}^r)$, k ≥ 2, be transverse to $\mathcal{N}$ at $X_0$ and $F_{X_\mu}$ the family of first return maps to S along the flow of $X_\mu$. Lemma 2 implies that we can associate to $X_\mu$ a parametrized family of one-dimensional maps $f_\mu : [-1, \bar\delta] \to [-1, \bar\delta]$, $\bar\delta > 0$, such that $f_\mu$ is the first coordinate of $F_{X_\mu}$. Since k ≥ 2, we obtain that $f_\mu$ depends $C^2$ on µ. Moreover, $f_\mu$ unfolds a saddle-node fixed point $q_0$.
Definition 1. We say that the family $f_\mu$ above generically unfolds a saddle-node periodic orbit at µ = 0 if the corresponding family of vector fields $X_\mu$ crosses $\mathcal{N}$ transversally at $X_0 \in \mathcal{N}$.
The goal of this section is to prove the following results.
Theorem 2. If $f_\mu$ generically unfolds a saddle-node periodic orbit at µ = 0, then $f_\mu$ is topologically transitive and has positive Lyapunov exponent at every dense orbit for every µ > 0 sufficiently small.
We point out that this theorem is false without property (6) in Proposition 1. Indeed, this property compensates for the fact that $f_\mu$, µ > 0, is not an expanding map. The transitivity of $f_\mu$, µ > 0, in Theorem 2 follows from:
Proposition 4. If $f_\mu$ generically unfolds a saddle-node periodic orbit at µ = 0 then, if µ > 0 is sufficiently small and $I \subset [-1, \bar\delta]$ is any interval, there exists m ∈ N such that $f_\mu^m(I) = [-1, \bar\delta]$.
Proof. Let $f_\mu$ be as in the statement. Then $f_0$ satisfies conditions (1)–(6) in Proposition 1. Moreover, there exist $c > \sqrt{2}$ and $x_c(\mu) > -1$ such that $|f_\mu'(x)| \ge c$, ∀ x ≥ $x_c(\mu)$, wherever $f_\mu'(x)$ exists. Let $c_1(\mu)$, $c_2(\mu)$, and $c_3(\mu)$ be, respectively, the continuations of the discontinuities of $f_0$ for µ > 0 small and define
$$E_c(\mu) = \{x;\ |f_\mu'(x)| \ge c\} = [x_c(\mu), \bar\delta].$$
Denote by $q_1(\mu)$ and $q_2(\mu)$ the continuations of, respectively, the repelling fixed points $q_1$ and $q_2$ of $f_0$, cf. Sect. 2 and Fig. 1. For µ sufficiently small,
$$-1 < c_1(\mu) < q_1(\mu) < f_\mu(b_0(\mu)) < c_2(\mu) < b_0(\mu) < q_2(\mu) < c_3(\mu) < \bar\delta.$$
Before we continue, let us fix some notation: we write |I| for the length of the interval I, ∂B for the boundary of B, and $C(B) = [-1, \bar\delta] \setminus B$ for the complement of $B \subset [-1, \bar\delta]$. Set $J_0(\mu) = [f_\mu^{-2}(\bar a), \bar a]$ and $I_0(\mu) = [f_\mu^{-1}(\bar a), \bar a]$. Recall that both $J_0(\mu)$ and $I_0(\mu)$ are contained in $E_c(\mu)$.
Claim (1). If µ > 0 is small and $I \subset [-1, \bar\delta]$ is an open interval containing $q_1(\mu)$, respectively $q_2(\mu)$, then there exists n ≥ 0 such that $f_\mu^n(I) = [-1, \bar\delta]$. In particular, the same conclusion holds if either $I_0(\mu) \subset I$ or $J_0(\mu) \subset I$.
Indeed, let I be as in the statement. Note that $f_\mu^2[c_1(\mu), c_2(\mu)] = [-1, \bar\delta]$. So, to obtain the result it is enough to prove that there is n ≥ 0 so that $f_\mu^n(I) \supset [c_1(\mu), c_2(\mu)]$. For this, assume first that $q_1(\mu) \in I$. Since $|f_\mu'(x)| > c > 1$ for $c_1(\mu) < x < c_2(\mu)$, there is m ≥ 0 such that either $f_\mu^m(I) \supset [q_1(\mu), c_2(\mu)]$ or $f_\mu^m(I) \supset [c_1(\mu), q_1(\mu)]$. Suppose $f_\mu^m(I) \supset [q_1(\mu), c_2(\mu)]$. As $f_\mu([q_1(\mu), c_2(\mu)]) = [-1, q_1]$ and $f_\mu([-1, q_1]) \supset [c_1(\mu), c_2(\mu)]$, we conclude the result in this case. If $f_\mu^m(I) \supset [c_1(\mu), q_1(\mu)]$, since $f_\mu([c_1(\mu), q_1(\mu)]) \supset [q_1(\mu), c_2(\mu)]$, we also conclude the proof.
Next assume that $q_2(\mu) \in I$. Since $|f_\mu'(x)| > c > 1$ for $c_2(\mu) < x < c_3(\mu)$ there is ℓ ≥ 0 such that either $f_\mu^{\ell}(I) \supset [q_2(\mu), c_3(\mu)]$ or $f_\mu^{\ell}(I) \supset [c_2(\mu), q_2(\mu)]$. Suppose first $f_\mu^{\ell}(I) \supset [q_2(\mu), c_3(\mu)]$. Since $f_\mu([q_2(\mu), c_3(\mu)]) = [q_2(\mu), \bar\delta] \supset [c_3(\mu), \bar\delta]$; $f_\mu([c_3(\mu), \bar\delta]) \supset [c_2(\mu), b_0(\mu)]$, and $f_\mu([c_2(\mu), b_0(\mu)]) \supset [c_1(\mu), q_1(\mu)]$, we conclude that $f_\mu^{\ell+3}(I) \supset [c_1(\mu), q_1(\mu)]$, and we proceed as in the first case. Now, if $f_\mu^{\ell}(I) \supset [c_2(\mu), q_2(\mu)] \supset [c_2(\mu), b_0(\mu)]$, and $f_\mu([c_2(\mu), b_0(\mu)]) \supset [c_1(\mu), q_1(\mu)]$ for $f_\mu(b_0(\mu)) > q_1(\mu)$, we conclude the proof as in the first case.
Claim (2). For any $\sqrt{2} < c_0 < c$ there is $\mu_0 > 0$ such that for $0 < \mu < \mu_0$, the first integer $l_\mu$ such that $f_\mu^{l_\mu}([-1, f_\mu(-1)]) \subset J_0(\mu)$ is well defined and $|(f_\mu^{l_\mu})'(x)| \ge c_0$ for every $x \in [-1, f_\mu(-1)]$.
To prove this claim, recall the notations fixed in the beginning of Sect. 2.1. Given $\sqrt{2} < c_0 < c$, we write $\gamma = c - c_0$. For k large and every $(\sigma, x) \in [0,1]\times[-1, f_\mu(-1)]$ we have $T_k : [0,1]\times[-1, f_\mu(-1)] \ni (\sigma, x) \mapsto f^{k}_{\mu_k(\sigma)}(x) \in J_0(\mu)$, and $T_k \to T_\infty$ in the $C^2$ topology. So, $|\partial_x T_k(\sigma, x) - \partial_x T_\infty(\sigma, x)| < \gamma$, for k large and every (σ, x). Thus, by hypothesis (6) in Proposition 1, there is $k_\gamma > 0$ such that for $k \ge k_\gamma$, $\inf_{(\sigma,x)} \partial_x T_k(\sigma, x) \ge c - \gamma = c_0$. Now let $\mu_k : [0,1] \ni \sigma \mapsto \mu_k(\sigma) \in [\mu^*_{k+1}, \mu^*_k]$, where $\mu^*_k$ is the partition of the parameters as in Sect. 2.1. Recall that $\mu^*_k \to 0^+$. Set $\mu_0 = \mu_{k_\gamma}$ and take $0 < \mu < \mu_0$. Since $J_0(\mu)$ covers two fundamental domains for $f_\mu$, such $l_\mu$ as in the statement exists and $l_\mu = k$ for $\mu \in [\mu^*_{k+1}, \mu^*_k]$.
To estimate $(f_\mu^{l_\mu})'(x)$, recall that $\sigma_k : [\mu^*_{k+1}, \mu^*_k] \to [0,1]$ is the inverse of $\mu_k$ and so, $f_\mu^{l_\mu}(x) = T_k(\sigma_k(\mu), x)$. In particular,
$$(f_\mu^{l_\mu})'(x) = \partial_x T_k(\sigma_k(\mu), x) \ge \inf_{(\sigma,x)} \partial_x T_k(\sigma, x) \ge c - \gamma = c_0.$$
This finishes the proof of Claim (2).
From now on we assume that all µ > 0 satisfy Claim (2). To prove Proposition 4 we use the following auxiliary lemmas whose proofs are in Sect. 5.
Lemma 3. There are $\mu_0 > 0$ and $c > \lambda > 1$ such that for every $0 < \mu < \mu_0$, if I is an interval such that $I \cap C(E_c(\mu)) \ne \emptyset$, then either I contains a pre-image of $I_0(\mu)$ or there is $m = m_I \in \mathbb{N}$ such that $f_\mu^m(I) \subset J_0(\mu)$, $f_\mu^i(I)$ is connected for all 0 ≤ i ≤ m, and $|f_\mu^m(I)| \ge \lambda|I|$.
Lemma 4. There is λ > 1 such that for any interval I such that $c_2(\mu) \in I$ there is an interval $I^+ \subset I$ and m ∈ N such that $f_\mu^m(I^+)$ either contains $J_0(\mu)$ or is an interval contained in $E_c(\mu)$ satisfying $|f_\mu^m(I^+)| \ge \lambda|I|$.
Lemma 5. There is λ > 1 such that if $I \subset E_c(\mu)$ is an interval containing $c_1(\mu)$ or $c_3(\mu)$, $f_\mu^2(I)$ is connected, and $f_\mu^j(I) \subset E_c(\mu)$ for j ∈ {1, 2}, then either $q_1(\mu) \in I \cup f_\mu^2(I) \cup f_\mu^3(I)$ or $|f_\mu^2(I)| \ge \lambda|I|$.
Lemma 6. Let I be an interval containing $c_1(\mu)$ or $c_3(\mu)$ such that $f_\mu^2(I)$ is connected, and $f_\mu^2(I) \cap C(E_c(\mu)) \ne \emptyset$. Then $f_\mu(I) \supset J_0(\mu)$.
Lemma 7. Let I be an interval containing $c_1(\mu)$ or $c_3(\mu)$ such that $f_\mu^2(I)$ is not connected. Then either $J_0(\mu) \subset f_\mu^2(I)$ or $J_0(\mu) \subset f_\mu^3(I)$.
Proof of Proposition 4 using the lemmas. Let $\mu_0 > 0$ and $\lambda_0$ be the minimum of the λ's in Lemmas 3–5. In particular, $c > \lambda_0 > 1$. For every $\mu < \mu_0$, let $I \subset [-1, \bar\delta]$ be an interval. Without loss of generality we assume that $I \subset E_c(\mu)$. For any n ≥ 0, set $I_n = f_\mu^n(I)$. To prove the proposition we proceed by the following algorithm:
(A) Define k = k(I) as the first integer such that $f_\mu^j(I)$ is connected and contained in $E_c(\mu)$ for 0 ≤ j ≤ k − 1, and either $f_\mu^k(I) \subset E_c(\mu)$ but it is not connected or $f_\mu^k(I)$ is not contained in $E_c(\mu)$.
Case I: $I_k \subset E_c(\mu)$ and $I_k$ is not connected. Then either $c_1(\mu)$ or $c_3(\mu) \in I_{k-1}$. Note that $I_{k-1}$ is connected. There are the following possibilities:
1. If $f_\mu^2(I_{k-1})$ is connected and $f_\mu^2(I_{k-1}) \cap C(E_c(\mu)) \ne \emptyset$, then Lemma 6 implies $I_k \supset J_0(\mu)$ and we are done by Claim 3.
2. If $f_\mu^2(I_{k-1})$ is connected and contained in $E_c(\mu)$, by Lemma 5, either $q_1 \in I_{k-1} \cup f_\mu^2(I_{k-1}) \cup f_\mu^3(I_{k-1})$ and we are done by Claim 1, or $|f_\mu^2(I_{k-1})| > \lambda_0|I_{k-1}| > \lambda_0|I|$. Then go to (A) replacing I by $f_\mu^2(I_{k-1})$.
3. If $f_\mu^2(I_{k-1})$ is not connected then Lemma 7 implies that either $J_0(\mu) \subset I_{k+1}$ or $J_0(\mu) \subset I_{k+2}$ and we are done by Claim 3.
Case II: $I_k \cap C(E_c(\mu)) \ne \emptyset$.
1. If $I_k$ is connected, then Lemma 3 implies that either $I_k$ contains a pre-image of $I_0(\mu)$ and we are done by Claim 3, or $|f_\mu^m(I_k)| > \lambda_0|I_k|$, for some m ∈ N. Go to (A) replacing I by $f_\mu^m(I_k)$.
2. If $I_k = f_\mu(I_{k-1})$ is not connected, then either $c_1(\mu)$ or $c_2(\mu)$ or $c_3(\mu) \in I_{k-1}$, and we have the following possibilities:
a) If $c_2(\mu) \in I_{k-1}$, write $I_{k-1} = I^-_{k-1} \cup I^+_{k-1}$ as before. By Lemma 4, either there is m > 0 such that $f_\mu^m(I^+_{k-1})$ contains $J_0(\mu)$, and we are done, or $|f_\mu^m(I^+_{k-1})| \ge \lambda_0|I_{k-1}|$. Then, go to (A) replacing I by $f_\mu^m(I^+_{k-1})$.
b) If $c_1(\mu)$ or $c_3(\mu) \in I_{k-1}$ and $f_\mu^2(I_{k-1})$ is connected and is not contained in $E_c(\mu)$ then Lemma 6 implies that $I_{k+1}$ contains $J_0(\mu)$ and we are done by Claim 3.
c) If $c_1(\mu)$ or $c_3(\mu) \in I_{k-1}$ and $f_\mu^2(I_{k-1})$ is not connected then, either $J_0(\mu) \subset I_{k+1}$ or $J_0(\mu) \subset I_{k+2}$ by Lemma 7 and we are done by Claim 3.
d) If $c_1(\mu)$ or $c_3(\mu) \in I_{k-1}$ and $f_\mu^2(I_{k-1})$ is connected and contained in $E_c(\mu)$ then Lemma 5 implies $q_1(\mu) \in I_{k-1} \cup f_\mu^2(I_{k-1}) \cup f_\mu^3(I_{k-1})$ and we are done by Claim 3, or $|f_\mu^2(I_{k-1})| \ge \lambda_0|I_{k-1}|$. Then, go to (A) replacing I by $f_\mu^2(I_{k-1})$.
The algorithm is complete.
Since every time we restart the algorithm we do so with an interval in $E_c(\mu)$ whose length is at least $\lambda_0 > 1$ times the length of the previous one, it is clear that at the end we reach m ∈ N such that $f_\mu^m(I) = [-1, \bar\delta]$. This finishes the proof of Proposition 4.
Next, let us estimate the Lyapunov exponent of $f_\mu$, $0 < \mu < \mu_0$. For this, let $\lambda_0$ be the minimum of the λ's in Lemmas 3–5. Recall that $c > \lambda_0 > 1$, where c is the constant in Proposition 1. For $0 < \mu < \mu_0$, set $E_c(\mu) = [x_c(\mu), \bar\delta]$ as before and define
$$A_\mu = \{x;\ \exists\, n \in \mathbb{N} \text{ such that } f_\mu^n(x) = c_2(\mu)\}.$$
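The termination argument in the proof of Proposition 4 can be made quantitative: if every restart multiplies the interval length by at least $\lambda_0 > 1$ and no interval can be longer than $[-1, \bar\delta]$, only finitely many restarts are possible. The sketch below counts them in the extreme case of growth by exactly $\lambda_0$; the function name and the sample numbers are hypothetical and are not taken from the text.

```python
def max_restarts(initial_length, total_length, lam):
    """Largest number of restarts compatible with growth by a factor of
    at least lam per restart inside an interval of length total_length.

    Growth by exactly lam is the slowest admissible growth, so it gives
    the largest possible restart count.
    """
    if lam <= 1.0:
        raise ValueError("the expansion factor must exceed 1")
    if not 0.0 < initial_length <= total_length:
        raise ValueError("need 0 < initial_length <= total_length")
    count = 0
    length = initial_length
    # Keep restarting while one more expansion still fits.
    while length * lam <= total_length:
        length *= lam
        count += 1
    return count

# Hypothetical numbers: |I| = 0.01, ambient length 1 + delta_bar = 2.
slow = max_restarts(0.01, 2.0, 1.1)
fast = max_restarts(0.01, 2.0, 2.0)
```

A larger expansion factor leaves room for fewer restarts, which is why the minimum $\lambda_0$ of the constants in Lemmas 3–5 controls the worst case.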
Proposition 5. Suppose that $f_\mu$ is as in Theorem 2. For $0 < \mu < \mu_0$, given $x \in E_c(\mu) \setminus A_\mu$ there is δ > 0 so that $|(f_\mu^n)'(x)| \ge \delta\lambda_0^{\,n}$ for every n ≥ 0.
Proof. Recall the notations and definitions in the proof of Lemma 3 and that $J_0(\mu) = [f_\mu^{-2}(\bar a), \bar a]$, for every µ. For each $\mu < \mu_0$, let $l_\mu$ be given in Claim 3, µ, $r_\mu$, $R_\mu$ be as in Lemma 3, and ε > 0 be the constant in Proposition 1. Recall that for all µ, $|f_\mu'(x)| > (1 - \varepsilon)$ wherever $f_\mu'(x)$ exists.
Given $x \in E_c(\mu) \setminus A_\mu$ and n ≥ 0 we write $n = J_1 + K_1 + J_2 + K_2 + \cdots + J_{r_1} + K_{r_2} + R$, with $r_1 = r_2$ or $r_1 = r_2 + 1$, where $J_i$ and $K_i$ are defined in the following way: $f_\mu^{j}(x) \in E_c(\mu)$ for $0 \le j \le J_1$; $f_\mu^{J_1+j}(x) \in C(E_c(\mu))$ for $0 \le j \le K_1$. If $J_i$ is defined, $K_i$ is such that
$$f_\mu^{\sum_{1\le s\le i-1}(J_s+K_s)+J_i+j}(x) \in C(E_c(\mu)) \quad \text{for } 0 \le j \le K_i.$$
If $K_i$ is defined, $J_{i+1}$ is such that
$$f_\mu^{\sum_{1\le s\le i}(J_s+K_s)+j}(x) \in E_c(\mu) \quad \text{for } 0 \le j \le J_{i+1}, \quad \text{and} \quad R = n - \sum (J_i + K_i).$$
Note that $r_2/r_1 \le 1$. Also, by construction, if $x \in C(E_c(\mu))$ then there is a positive iterate of x in $J_0(\mu)$ and, for points $y \in J_0(\mu)$, we have $f_\mu(y) \in E_c(\mu)$ and $f_\mu^2(y) \in E_c(\mu)$. Hence, $J_i \ge 2$ for all i and all x. Moreover, we clearly have $K_i \le l_\mu$. Thus, since $r_2/r_1 \le 1$,
$$\frac{\sum K_i}{\sum J_i} \le \frac{r_2}{r_1}\,\frac{l_\mu}{2} < l_\mu.$$
From this it follows that
$$\frac{\sum J_i}{n - R} = \frac{\sum J_i}{\sum J_i + \sum K_i} = \frac{1}{1 + (\sum K_i)/(\sum J_i)} > \frac{1}{1 + l_\mu}.$$
Set $\rho_\mu = \frac{1}{1+l_\mu}$. Note that $\rho_\mu < 1$ for every µ. Thus,
$$\sum J_i \ge \rho_\mu (n - R). \qquad (12)$$
Next we estimate $|(f_\mu^n)'(x)|$. Set $\delta = (1-\varepsilon)^{l_\mu}\lambda_0^{-\rho_\mu l_\mu}$, where $\rho_\mu$ is as above. Given $x \in C(E_c(\mu))$, let n(x) be such that $f_\mu^{n(x)}(x) \in J_0(\mu)$, and $f_\mu^{j}(x) \notin J_0(\mu)$ for $0 \le j < n(x)$. By Lemma 3 we have $|(f_\mu^{n(x)})'(x)| \ge \lambda_0$, with $c > \lambda_0 > 1$. Using this and (12) we get
$$|(f_\mu^n)'(x)| = |(f_\mu^R)'(f_\mu^{n-R}(x))|\,|(f_\mu^{n-R})'(x)| \ge \lambda_0^{\sum J_i}\,\lambda_0^{r_2}\,|(f_\mu^R)'(f_\mu^{n-R}(x))| \ge \lambda_0^{\rho_\mu(n-R)}\,|(f_\mu^R)'(f_\mu^{n-R}(x))|.$$
Now we bound the last term in the inequality above. If $r_1 = r_2$, then $f_\mu^{j}(f_\mu^{n-R}(x)) \in E_c(\mu)$ for every $0 \le j \le R$, and hence $|(f_\mu^R)'(f_\mu^{n-R}(x))| \ge c^R \ge \lambda_0^R$. If $r_1 = r_2 + 1$ then $R \le l_\mu$ and so $|(f_\mu^R)'(f_\mu^{n-R}(x))| \ge (1-\varepsilon)^R \ge (1-\varepsilon)^{l_\mu}$. Thus, if $r_1 = r_2$, using that $1 - \rho_\mu > 0$ and δ < 1, we obtain
$$|(f_\mu^n)'(x)| \ge \lambda_0^{\rho_\mu(n-R)}\,c^R \ge \lambda_0^{\rho_\mu n}\,\lambda_0^{(1-\rho_\mu)R} \ge \lambda_0^{\rho_\mu n} \ge \delta\,\lambda_0^{\rho_\mu n}.$$
On the other hand, if $r_1 = r_2 + 1$, using that $R \le l_\mu$,
$$|(f_\mu^n)'(x)| \ge \lambda_0^{\rho_\mu(n-R)}\,(1-\varepsilon)^{l_\mu} \ge \lambda_0^{\rho_\mu(n-l_\mu)}\,(1-\varepsilon)^{l_\mu} = \delta\,\lambda_0^{\rho_\mu n}.$$
All together this finishes the proof of Proposition 5 with $\delta = (1-\varepsilon)^{l_\mu}\lambda_0^{-\rho_\mu l_\mu}$. The proof of Theorem 2 is complete. □
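The counting step behind (12) only uses $J_i \ge 2$, $K_i \le l_\mu$ and $r_2 \le r_1$. A minimal sketch that samples itineraries satisfying these three constraints and compares the fraction of time spent in $E_c(\mu)$ with $\rho_\mu = 1/(1+l_\mu)$ is given below; the sampling ranges and function names are illustrative assumptions, not part of the proof.

```python
import random

def expanding_fraction(J, K):
    """Fraction (sum J_i) / (sum J_i + sum K_i) of the first n - R iterates
    that fall in the expanding region."""
    total_J = sum(J)
    return total_J / (total_J + sum(K))

def random_itinerary(l_mu, blocks, rng):
    """Sample block lengths with J_i >= 2, 1 <= K_i <= l_mu, r2 in {r1, r1 - 1}.

    The upper limit 10 for J_i is an arbitrary choice for the experiment.
    """
    J = [rng.randint(2, 10) for _ in range(blocks)]
    r2 = blocks - rng.randint(0, 1)
    K = [rng.randint(1, l_mu) for _ in range(r2)]
    return J, K

def rho(l_mu):
    """The constant rho_mu = 1 / (1 + l_mu) from the proof."""
    return 1.0 / (1.0 + l_mu)
```

For example, `expanding_fraction([2, 2], [5, 5])` equals 4/14, which is above `rho(5)` = 1/6, the worst case allowed by the constraints being $J_i = 2$ and $K_i = l_\mu$ throughout.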
4. Proof of Theorem 1
Let $\mathcal{N}$, $\mathcal{U} \subset \mathcal{X}^r$, and U be as fixed in Sect. 2, and let $X_\mu \in C^k(I, \mathcal{X}^r)$, r ≥ 3, k ≥ 2, µ ∈ [−1, 1], be a one parameter family of vector fields crossing $\mathcal{N}$ transversally at $X_0 \in \mathcal{N}$. Shrinking $\mathcal{U}$ if necessary, we can assume that $X_0$ satisfies Proposition 3. We shall also assume that $X_\mu \in \mathcal{N}^-$ for µ < 0, and $X_\mu \in \mathcal{N}^+$ for µ > 0. As explained in Sect. 2.3.1, $X_\mu$ induces a one parameter family of first return maps $F_\mu = F_{X_\mu}$ defined on $S^*$. Lemma 2 implies that, in $C^2$-coordinates, $F_\mu(x, y) = (f_\mu(x), g_\mu(x, y))$. Since $X_0$ satisfies Proposition 3 we have that $f_0$ satisfies Proposition 1. Moreover, $f_\mu$ generically unfolds a saddle-node periodic orbit at µ = 0, according to Definition 1.
Consider first µ > 0. In this case, Theorem 2 implies that $f_\mu$ is transitive, and has positive Lyapunov exponents at dense orbits. Let $\Lambda_\mu = \bigcap_t (X_\mu)_t(U)$. Since $f_\mu$ is transitive and $F_\mu$ has a contracting invariant foliation by Lemma 2, $\Lambda_\mu$ is a transitive set for $X_\mu$. The fact that $f_\mu$ has positive Lyapunov exponents at dense orbits implies that $\Lambda_\mu$ is a strange attractor for $X_\mu$. Now observe that Theorem 2 and Lemma 2 hold for small perturbations of $f_\mu$ and $F_\mu$ respectively. Then, $\Lambda_\mu$ is a robust strange attractor of $X_\mu$. So, it is non-Lorenz-like by Lemma 1.
Next, we analyze $\Omega(X_\mu) \cap U$ for µ < 0. For this, recall that the saddle-node fixed point $q_0$ for $f_0$ splits into two fixed points $q_0(\mu)$ and $q_3(\mu)$; one of them, say $q_0(\mu)$, is repelling and the other, $q_3(\mu)$, is attracting. Lemma 2 implies that $F_\mu$, ∀ µ, preserves and contracts a stable foliation and so $q_0(\mu)$ and $q_3(\mu)$ induce two fixed points $Q_0(\mu)$ and $Q_3(\mu)$ for $F_\mu$, with $Q_0(\mu)$ being of saddle-type and $Q_3(\mu)$ being an attracting fixed point. By construction, the action of $F_\mu$ on S looks like Fig. 5. Let $B_1(\mu)$ and $B_2(\mu)$ be as in Fig. 5 and set $B_0(\mu) = F_\mu^{-1}(B_1(\mu) \cup B_2(\mu))$ and $S_0 = S \setminus W^s(Q_3(\mu))$. Note that $B_0(\mu)$ is a band containing $\{x = 0\} \subset W^s(\sigma_\mu) \cap S$, where $\sigma_\mu$ is the singularity of $X_\mu$. Let $S_1 = S_0 \setminus B_0(\mu)$.
Observe that $S_1$ has two connected components, $S_1^-$ and $S_1^+$, whose shape is as in Fig. 6. Observe that we also have $\Omega(F_\mu) \cap S \subset \{Q_3(\mu)\} \cup \{\bigcap_n F_\mu^n(S_1)\}$. Now consider the region $R_0(\mu)$ bounded by the segments $P_1P_2$, $P_2P_3$,
Fig. 5. The action of $F_\mu$ on S
Fig. 6. The shape of $S_1 = S_0 \setminus B_0(\mu)$
Fig. 7. A 3-symbol subshift
P3P4, and P4P1; see Fig. 7. We take these segments such that R0(µ) ∩ Fµ(S1) = ∅. In particular, R0(µ) ∩ Ω(Fµ) ∩ S = ∅. Then, if S2 = S1 \ R0(µ), we obtain Ω(Fµ) ∩ S ⊂ {Q3(µ)} ∪ {∩_n Fµ^n(S2)}. Note that S2 has two connected components, S2^− and S2^+, as indicated in Fig. 7. Choose R1(µ) ⊂ S2^− such that R1(µ) ∩ Fµ(S2) = ∅, as indicated in Fig. 7. Hence Ω(Fµ) ∩ S ∩ R1(µ) = ∅. If S3 = S2 \ R1(µ), then S3 has three connected components and, by the choice of R1(µ), we have

\[ \Omega(F_\mu) \cap S \subset \{Q_3(\mu)\} \cup \Big\{ \bigcap_n F_\mu^n(S_3) \Big\}. \tag{13} \]

Note that ∩_n Fµ^n(S3) is a 3-symbol subshift. In particular, it is transitive. Thus,

\[ \bigcap_n F_\mu^n(S_3) \subset \Omega(F_\mu) \cap S. \tag{14} \]

Now, (13) and (14) give

\[ \Omega(F_\mu) \cap S = \{Q_3(\mu)\} \cup \Big\{ \bigcap_n F_\mu^n(S_3) \Big\}. \tag{15} \]
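The transitivity of a subshift of finite type can be checked mechanically once its transition matrix is known. The sketch below is illustrative only: the text does not specify the transition matrix of the 3-symbol subshift, so the matrices `A` and `B` are hypothetical, and `reachable` and `is_irreducible` are helper names introduced here. It relies on the standard fact that such a subshift is transitive when its 0-1 transition matrix is irreducible, that is, when every symbol can reach every symbol along allowed transitions.

```python
def reachable(adj, start):
    """Symbols reachable from `start` in one or more allowed transitions."""
    seen = set()
    stack = [j for j, allowed in enumerate(adj[start]) if allowed]
    while stack:
        j = stack.pop()
        if j not in seen:
            seen.add(j)
            stack.extend(k for k, allowed in enumerate(adj[j]) if allowed)
    return seen


def is_irreducible(adj):
    """True if every symbol reaches every symbol (itself included)."""
    n = len(adj)
    return all(reachable(adj, i) == set(range(n)) for i in range(n))


# Hypothetical transition matrix on 3 symbols (an assumption, not from the paper).
A = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]
# A reducible matrix for contrast: symbol 2 can never be left.
B = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 1],
]
print(is_irreducible(A), is_irreducible(B))
```

For the full shift on three symbols (all entries equal to 1) the check succeeds trivially.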
Strange Attractors Across the Boundary of Hyperbolic Systems
Since Ω(Xµ) ∩ U is the saturation of Ω(Fµ) ∩ S by the flow (Xµ)_t, if O(Q3(µ)) denotes the Xµ-orbit of Q3(µ), Λ(Fµ) = ∩_n Fµ^n(S3), and Λ(Xµ) is the saturation of Λ(Fµ) by (Xµ)_t, we get Ω(Xµ) ∩ U = {O(Q3(µ)), σµ} ∪ Λ(Xµ). Note that Λ(Xµ) is the suspension of the 3-symbol subshift Λ(Fµ). Thus, for µ < 0, Ω(Xµ) ∩ U is the union of an attracting closed orbit, a singularity, and a suspension of a 3-symbol subshift, concluding the proof of Theorem 1.

5. Robustness of Stable Foliations and Auxiliary Lemmas

In this section we prove Lemma 2 and the lemmas used to obtain Proposition 4. First, we show the robustness of the invariant foliation V of S by F0 under small C^2 perturbations of X0. See also [3] and references therein. Recall that we have chosen a small neighborhood U of X0 in Sect. 2. We can assume that every X ∈ U has a singularity σ(X) close to the origin whose eigenvalues λ1(X), λ2(X) and λ3(X) satisfy the same properties as those of DX0. Furthermore, both separatrices of W^u(σ(X)) still intersect the cross section S defined in Sect. 2. By a C^2 change of coordinates we can assume, without loss of generality, that σ(X) is the origin, that its eigenvectors have the directions of the coordinate axes, and that the plane {x = 0} contains a local stable manifold of the singularity. So, each X ∈ U induces a Poincaré map F_X defined on S^* = S \ {x = 0}. Observe that F_X can be seen as the composition R_X ◦ L_X, where L_X is the Poincaré map L_X : S → Σ^+ ∪ Σ^− and R_X : Σ^+ ∪ Σ^− → S is a C^2 diffeomorphism. Moreover, R_X can be expressed as the composition R_X = J ◦ R0, where R0 is the corresponding C^2 diffeomorphism for X0 and J is a C^2-perturbation of the identity map. If we set α = −λ3/λ1 and β = −λ2/λ1, then L_X(x, y) = (y|x|^β, |x|^α). Notice that α and β actually depend on X ∈ U, but we shall omit this when no confusion is possible. By hypothesis, β − α > 1, cf. Sect. 2.
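The estimates in this section rest on how the derivative of L_X scales with x. As a sanity check, the sketch below compares the Jacobian of L_X(x, y) = (y|x|^β, |x|^α) for x > 0, obtained by direct differentiation, with a finite-difference approximation. The numerical values of α and β are hypothetical, chosen only so that β − α > 1, and the function names are introduced here for illustration.

```python
def L(x, y, alpha, beta):
    """Local map L_X(x, y) = (y |x|^beta, |x|^alpha)."""
    return (y * abs(x) ** beta, abs(x) ** alpha)


def DL_exact(x, y, alpha, beta):
    """Jacobian of L for x > 0; rows are components, columns are (d/dx, d/dy)."""
    return [
        [beta * y * x ** (beta - 1), x ** beta],
        [alpha * x ** (alpha - 1), 0.0],
    ]


def DL_numeric(x, y, alpha, beta, h=1e-6):
    """Central finite-difference approximation of the same Jacobian."""
    fx1, fx0 = L(x + h, y, alpha, beta), L(x - h, y, alpha, beta)
    fy1, fy0 = L(x, y + h, alpha, beta), L(x, y - h, alpha, beta)
    return [
        [(fx1[0] - fx0[0]) / (2 * h), (fy1[0] - fy0[0]) / (2 * h)],
        [(fx1[1] - fx0[1]) / (2 * h), (fy1[1] - fy0[1]) / (2 * h)],
    ]


# Hypothetical exponents with beta - alpha > 1 (assumed values, not from the paper).
alpha, beta = 0.6, 1.8
x, y = 0.2, 0.5

exact = DL_exact(x, y, alpha, beta)
approx = DL_numeric(x, y, alpha, beta)
det_exact = exact[0][0] * exact[1][1] - exact[0][1] * exact[1][0]
print(exact, approx, det_exact)
```

The determinant equals −α x^(α+β−1), which is the power of |x| that governs the bound on det D_qF in the estimates of this section.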
The stable foliation F^s_X for X ∈ U will be obtained as the integral curves of a C^2 vector field η_X on S such that

\[ \eta_X(x, y) = (\phi_X(x, y), 1); \qquad \phi_X(0, y) = 0, \tag{16} \]

where φ_X : S → [−1, 1] will be a fixed point of an appropriate graph transform. To do this we proceed as follows. Given a vector V = (u, v) with v ≠ 0, the slope of V is defined by slope(V) = u/v. Consider the space A of continuous maps φ : U × S → R such that |φ(X, (x, y))| ≤ 1 and φ(X, (0, y)) = 0. To each φ ∈ A we associate a continuous vector field η^φ_X(x, y) = (φ(X, (x, y)), 1). We shall define T : A → A so that having T(φ) = φ is equivalent to having

\[ \operatorname{slope}\big( D_{F_X(q)} F_X^{-1} ( \eta_X^{\phi}(F_X(q)) ) \big) = \operatorname{slope}\big( \eta_X^{\phi}(q) \big), \qquad q = (x, y). \tag{17} \]

When φ is C^2, if F^φ_X is the family of integral curves of η^φ_X, then (17) holds if and only if F^φ_X is invariant under F_X. In this case, F^φ_X is the desired stable foliation. Hence, we are looking for T so that

\[ T(\phi)(X, q) = \operatorname{slope}\big( D_{F_X(q)} F_X^{-1} ( \eta_X^{\phi}(F_X(q)) ) \big) \quad \text{for } q = (x, y),\ x \neq 0. \]
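This leads us to define, for (X, (x, y)) ∈ U × S^* and φ ∈ A, the operator T : A → A by

\[ T(\phi)(X, (x, y)) = \begin{cases} \dfrac{(\phi\circ\hat F)\,\partial_y g - \partial_y f}{\partial_x f - (\phi\circ\hat F)\,\partial_x g}\,(X, (x, y)) & \text{if } x \neq 0, \\[6pt] 0 & \text{if } x = 0, \end{cases} \tag{18} \]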
where F̂(X, q) = (X, F_X(x, y)). Our goal is to prove first that T is a well-defined operator and, secondly, that T has a C^2 fixed point φ. For this, we endow A with the supremum norm and prove that T is a contraction with respect to this norm. Afterwards we prove that the fixed point of T is in fact C^2, completing the proof of Lemma 2. We proceed in the following way. To simplify notation, given X as above define F(X, (x, y)) = F_X(x, y). We start by proving the following result.

Lemma 8. Let a0 < 1 satisfy (11), and let X be C^2-close to X0. Then one can write F(X, (x, y)) = (f(X, (x, y)), g(X, (x, y))) so that there are positive constants k_i, i = 1, 2, 3, such that for any q = (x, y) ∈ S^* we have

(i) \( \dfrac{|\partial_y g(X, q)|}{|\partial_x f(X, q)|} \le k_1 |x|^{\beta-\alpha+1}; \quad \dfrac{|\partial_y f(X, q)|}{|\partial_x f(X, q)|} \le k_2 |x|^{\beta-\alpha+1}; \quad \dfrac{|\partial_x g(X, q)|}{|\partial_x f(X, q)|} \le a_0; \)

(ii) \( |\det D_q F(X, q)| \le a_0 |x|^{\beta+\alpha-1}; \quad |D_q F(X, q)| \le k_3 |x|^{\alpha-1}; \)

(iii) \( \sup_{S^*} \left\{ \dfrac{|\partial_y g|}{|\partial_x f|}, \dfrac{|\partial_y f|}{|\partial_x f|}, \dfrac{|\partial_x g|}{|\partial_x f|}, |\det D_q F| \right\} < a_0. \)
Proof. Note that (6) and (7), Proposition 2, imply that conditions (i)–(iii) above are satisfied by F0. As F, outside a strip [−δ, δ] × [−1, 1] (for some fixed δ > 0), is a diffeomorphism C^2-near to F0, we obtain that F also satisfies (i)–(iii) outside this strip. In the strip we reason as follows. First recall that F can be written as F = R_X ◦ L_X = J ◦ R0 ◦ L_X, where J is C^2-near the identity, R0 is a fixed C^2 diffeomorphism, and L_X(q) = (y|x|^β, |x|^α). We shall make computations for x > 0. The case x < 0 is similar. Recalling the construction of the vector field X0 in Proposition 3, the derivative D_qF can be written as

\[ D_q F(x, y) = \begin{pmatrix} \epsilon_1 & M + \epsilon_2 \\ \sigma + \epsilon_3 & \epsilon_4 \end{pmatrix} \begin{pmatrix} \beta y x^{\beta-1} & x^{\beta} \\ \alpha x^{\alpha-1} & 0 \end{pmatrix}, \tag{19} \]

with M > 1/2, σ > 0 very small (both fixed) and ε_i = ε_i(X, yx^β, x^α) → 0 as X → X0. For simplicity we set ε = ε_i. Hence,

\[ \begin{aligned} \partial_x f &= [\epsilon \beta y x^{\beta-\alpha} + (M + \epsilon)\alpha]\, x^{\alpha-1}, & \partial_y f &= \epsilon x^{\beta}, \\ \partial_x g &= [(\sigma + \epsilon)\beta y x^{\beta-\alpha} + \epsilon\alpha]\, x^{\alpha-1}, & \partial_y g &= (\sigma + \epsilon) x^{\beta}. \end{aligned} \tag{20} \]

Note that (20) implies that there are positive constants k_i, 1 ≤ i ≤ 4, 0 < k1 < 1, so that

\[ |\partial_y g / \partial_x f| \le k_1 x^{\beta-\alpha+1}, \qquad |\partial_y f / \partial_x f| \le k_2 x^{\beta-\alpha+1}. \tag{21} \]

Moreover,

\[ \left| \frac{\partial_x g}{\partial_x f} \right| = \left| \frac{(\sigma + \epsilon)\beta y x^{\beta-\alpha} + \epsilon\alpha}{\epsilon \beta y x^{\beta-\alpha} + (M + \epsilon)\alpha} \right|. \tag{22} \]
In particular, (22) together with the facts that ε → 0 as X → X0 and β − α > 0 imply

\[ |\partial_x g / \partial_x f| < a_0 \quad \text{for } x \text{ small enough}. \tag{23} \]
Hence (21), (22) and (23) prove (i). On the other hand, (20) together with (6), and (i) above, imply (ii) and (iii) for X sufficiently C^2-near to X0 and x sufficiently small. This finishes the proof of the lemma.

Proposition 6. Let T be as in (18). Then T(A) ⊂ A, T is continuous and a contraction, with constant of contraction independent of φ ∈ A.

Proof. We claim that there is K > 0 so that for any φ ∈ A, X ∈ U and q ∈ S,

\[ |T(\phi)(X, q)| \le K |x|^{\beta-\alpha+1} \quad \text{and} \quad |T(\phi)(X, q)| \le \frac{2a_0}{1 - a_0}. \tag{24} \]

Indeed, using that |φ| ≤ 1 and Lemma 8 we obtain

\[ \begin{aligned} |T(\phi)(X, q)| &= \frac{|(\phi\circ\hat F)\,\partial_y g - \partial_y f|}{|\partial_x f - (\phi\circ\hat F)\,\partial_x g|}(x, y) \le \frac{|(\phi\circ\hat F)\,\partial_y g/\partial_x f - \partial_y f/\partial_x f|}{|1 - (\phi\circ\hat F)\,\partial_x g/\partial_x f|}(x, y) \\ &\le \frac{|\partial_y g/\partial_x f| + |\partial_y f/\partial_x f|}{1 - |\partial_x g/\partial_x f|}(x, y) \le \frac{k_1 |x|^{\beta-\alpha+1} + k_2 |x|^{\beta-\alpha+1}}{1 - a_0} \le K |x|^{\beta-\alpha+1}, \end{aligned} \tag{25} \]

proving the first inequality in (24). Using once more Lemma 8 and the choice of a0 we get

\[ |T(\phi)(X, q)| \le \frac{|\partial_y g/\partial_x f| + |\partial_y f/\partial_x f|}{1 - |\partial_x g/\partial_x f|} \le \frac{2a_0}{1 - a_0} < 1, \tag{26} \]

proving our claim. Now observe that (24) implies first that |T(φ)(X, q)| → 0 as x → 0, and so T(φ) is continuous. Secondly, T(A) ⊂ A if we choose a0 small enough. Next we prove that T is a contraction. Indeed,

\[ |T(\phi_1) - T(\phi_2)| = \frac{|\det D_q F| \, |\phi_1\circ\hat F - \phi_2\circ\hat F|}{|(\partial_x f - (\phi_1\circ\hat F)\,\partial_x g)(\partial_x f - (\phi_2\circ\hat F)\,\partial_x g)|} \le \frac{a_0}{(1 - a_0)^2} |\phi_1 - \phi_2| < \frac{1}{2} |\phi_1 - \phi_2|, \]

where we used Lemma 8 and the hypothesis (11) on a0, cf. Sect. 2. This finishes the proof of Proposition 6.
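To see the contraction at work in the simplest possible setting, the following sketch iterates the graph transform (18) for a linear map with constant Jacobian, where φ reduces to a single number. It is an illustration under stated assumptions: the matrix entries are hypothetical values chosen so that ∂x f dominates (they are not the return map of this paper), and `graph_transform` is a name introduced here.

```python
import math

# Hypothetical constant Jacobian DF = [[fx, fy], [gx, gy]] with fx dominant.
fx, fy, gx, gy = 2.0, 0.1, 0.2, 0.3


def graph_transform(phi):
    """Formula (18) for a constant slope phi: (phi*gy - fy) / (fx - phi*gx)."""
    return (phi * gy - fy) / (fx - phi * gx)


# Iterate from the zero slope; the iterates converge to the invariant slope.
phi = 0.0
for _ in range(50):
    phi = graph_transform(phi)

# The invariant direction (phi, 1) should be an eigenvector of DF
# for the weaker (contracting) eigenvalue.
image = (fx * phi + fy, gx * phi + gy)
trace, det = fx + gy, fx * gy - fy * gx
weak_eigenvalue = (trace - math.sqrt(trace * trace - 4 * det)) / 2

# Derivative of the transform at the fixed point: det / (fx - phi*gx)^2.
# Its absolute value being below 1 is what makes the iteration contract.
contraction_rate = det / (fx - phi * gx) ** 2
print(phi, image, weak_eigenvalue, contraction_rate)
```

In this linear toy case the fixed slope is the contracting eigendirection of the matrix, which is the finite-dimensional analogue of the invariant stable foliation sought above.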
Proposition 6 above shows the existence of a fixed point φ0 ∈ A for T. We shall prove that φ0 actually depends C^2 on (X, q). First, let us see that φ0 is C^1. For this, let A1 be the space of continuous maps A : U × S → L(X^r × R^2, R) satisfying sup_{(X,q)} |A(X, q)| < 1 and A(X, (0, y)) = 0, and let us introduce the operator T̃ : A × A1 → A × A1 such that if φ is C^1 then T̃(φ, D(φ)) = (T(φ), D(Tφ)). This operator is defined as T̃(φ, A) = (Tφ, S(φ, A)), and the explicit form of S(φ, A) is

\[ S(\phi, A)(X, q) = \begin{cases} [V_1(\phi) - T(\phi) V_2(\phi) + N(\phi)(A\circ\hat F) D\hat F](X, q), & \text{for } x \neq 0, \\ 0, & \text{for } x = 0, \end{cases} \]

where

\[ V_1(\phi) = \frac{\phi\circ\hat F}{\partial_x f - (\phi\circ\hat F)\,\partial_x g}\, D\partial_y g - \frac{1}{\partial_x f - (\phi\circ\hat F)\,\partial_x g}\, D\partial_y f, \tag{27} \]

\[ V_2(\phi) = \frac{1}{\partial_x f - (\phi\circ\hat F)\,\partial_x g}\, D\partial_x f - \frac{\phi\circ\hat F}{\partial_x f - (\phi\circ\hat F)\,\partial_x g}\, D\partial_x g, \tag{28} \]

\[ N(\phi) = \frac{\det D_q F}{(\partial_x f - (\phi\circ\hat F)\,\partial_x g)^2}. \tag{29} \]
Recall that F̂(X, q) = (X, F_X(q)). Next we prove that T̃ is a well-defined operator and a contraction with constant of contraction independent of φ. First we prove the following result:

Lemma 9. Let a0 satisfy (11). If X is C^2-close to X0 and F(X, (x, y)) = (f(X, (x, y)), g(X, (x, y))) is the first return map associated to X, then there are positive constants k_i, i = 4, ..., 8, such that for any q = (x, y) ∈ S^*:

(i) \( \dfrac{|D\partial_y g|}{|\partial_x f|} \le k_4 |x|^{\beta-\alpha}; \quad \dfrac{|D\partial_x g|}{|\partial_x f|} \le k_5 |x|^{-1}; \)

(ii) \( \dfrac{|D\partial_y f|}{|\partial_x f|} \le k_6 |x|^{\beta-\alpha}; \quad \dfrac{|D\partial_x f|}{|\partial_x f|} \le k_7 |x|^{-1}. \)

In particular,

\[ |D^2 F| \le K |x|^{\alpha-2}. \tag{30} \]

(iii) \( \dfrac{|\det D_q F|}{|\partial_x f|^2} |D\hat F| \le k_8 |x|^{\beta}, \) and \( |N(\phi)| \cdot |D\hat F| < \dfrac{1}{2} \) for all φ;

(iv) \( \sup_{S^*} \left\{ \left| \dfrac{D\partial_y g}{\partial_x f} \right|, \left| \dfrac{D\partial_y f}{\partial_x f} \right| \right\} < a_0. \)
Proof. From the expressions in Lemma 8 we obtain, first, that

\[ K |x|^{\alpha-1} \le |\partial_x f| \le K |x|^{\alpha-1}. \tag{31} \]
Secondly,

(a) \( \partial_X(\partial_y g) = x^{\beta}\,\partial_X \epsilon + (\sigma + \epsilon)\, x^{\beta} \log(x)\, \partial_X \beta, \quad \partial_x(\partial_y g) = x^{\beta}\,\partial_x \epsilon + (\sigma + \epsilon)\beta x^{\beta-1}, \quad \partial_y(\partial_y g) = x^{\beta}\,\partial_y \epsilon. \) Thus |D∂_y g| ≤ K|x|^{β−1}, and by (31),

\[ \frac{|D\partial_y g|}{|\partial_x f|} \le K |x|^{\beta-\alpha}. \tag{32} \]

(b) Analogously, from the explicit form of ∂_X(∂_x g), ∂_x^2 g, ∂_y∂_x g, we obtain |∂_X(∂_x g)| ≤ K|x|^{α−1}, |∂_x(∂_x g)| ≤ K|x|^{α−2}, and |∂_y(∂_x g)| ≤ K|x|^{β−1}. Thus |D∂_x g| ≤ K|x|^{α−2}, and by (31),

\[ \frac{|D\partial_x g|}{|\partial_x f|} \le K |x|^{-1}. \tag{33} \]

Now, (32) and (33) give (i).

(c) In the same way we obtain |D∂_y f| ≤ K|x|^{β−1}, and by (31) we get

\[ \frac{|D\partial_y f|}{|\partial_x f|} \le K |x|^{\beta-\alpha}. \tag{34} \]

(d) From the explicit form of ∂_x f, we get |∂_X(∂_x f)| ≤ K|x|^{α−1}. On the other hand,

\[ \partial_x(\partial_x f) = \beta y x^{\beta-1}\,\partial_x \epsilon + \epsilon \beta(\beta - 1) y x^{\beta-2} + \alpha x^{\alpha-1}\,\partial_x \epsilon + \alpha (M + \epsilon)(\alpha - 1) x^{\alpha-2}, \]

which implies |∂_x(∂_x f)| ≤ K|x|^{α−2}. We also have

\[ \partial_y(\partial_x f) = \beta y x^{\beta-1}\,\partial_y \epsilon + \epsilon \beta x^{\beta-1} + \alpha x^{\alpha-1}\,\partial_y \epsilon, \]

which implies |∂_y(∂_x f)| ≤ K|x|^{β−1}. Hence |D∂_x f| ≤ K|x|^{α−2}, and by (31) we get

\[ \frac{|D\partial_x f|}{|\partial_x f|} \le K |x|^{-1}. \tag{35} \]

Now, (34) and (35) prove (ii). Also, (32)–(35) imply |D^2F| ≤ K|x|^{α−2}, and (32), (34) together with (7) imply (iv). To verify (iii) we proceed as follows. First recall that N(φ) = det D_qF / (∂_x f − (φ ◦ F̂)∂_x g)^2 and, using (ii) in Lemma 8,

\[ |N(\phi)|\,|D\hat F| = \frac{|\det D_q F|}{(\partial_x f - (\phi\circ\hat F)\,\partial_x g)^2}\, |D\hat F| \le \frac{a_0 |x|^{\beta+\alpha-1} |DF|}{|\partial_x f|^2 (1 - |\partial_x g/\partial_x f|)^2} \le \frac{a_0}{(1 - a_0)^2} |x|^{\beta}. \tag{36} \]

This gives the first part of (iii). For the other one, recall that it holds outside the strip by hypothesis (7) on F0. In a strip [−δ, δ] × [−1, 1] (for some fixed δ > 0) the choice of a0 and (36) imply the desired inequality for x small enough. □
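Proposition 7. The operator S : A × A1 → A1 defined in (5) is well defined, continuous and a contraction, with constant of contraction independent of φ ∈ A.

Proof. We prove first that S(φ, A) is continuous. Using Lemmas 8 and 9, we estimate V1(φ), T(φ)V2(φ), and N(φ)(A ◦ F̂)DF̂, for any φ ∈ A and q = (x, y) ∈ S. We have

\[ \begin{aligned} |V_1(\phi)| &\le \frac{|\phi\circ\hat F|}{|\partial_x f - (\phi\circ\hat F)\,\partial_x g|}\, |D\partial_y g| + \frac{1}{|\partial_x f - (\phi\circ\hat F)\,\partial_x g|}\, |D\partial_y f| \\ &\le \left( \frac{1}{1 - |\partial_x g/\partial_x f|} \right) |\partial_x f|^{-1} |D\partial_y g| + \left( \frac{1}{1 - |\partial_x g/\partial_x f|} \right) |\partial_x f|^{-1} |D\partial_y f| \\ &\le \frac{1}{|1 - a_0|} \left\{ |\partial_x f|^{-1} |D\partial_y g| + |\partial_x f|^{-1} |D\partial_y f| \right\} \le K |x|^{\beta-\alpha}. \end{aligned} \tag{37} \]

Using (25) we obtain

\[ |T(\phi) V_2(\phi)| \le K |x|^{\beta-\alpha+1}\, \frac{1}{1 - a_0} \left( \frac{|D\partial_x g|}{|\partial_x f|} + \frac{|D\partial_x f|}{|\partial_x f|} \right) \le K |x|^{\beta-\alpha+1} |x|^{-1} < K |x|^{\beta-\alpha}. \tag{38} \]

We also have

\[ |N(\phi)|\,|(A\circ\hat F) D\hat F| \le \frac{1}{(1 - a_0)^2}\, |\det D_q F| \cdot |A| \cdot |D\hat F| \cdot |\partial_x f|^{-2} \le K |x|^{\beta}. \tag{39} \]

Note that (24), (37)–(39) imply

\[ |S(\phi, A)| \le |V_1(\phi)| + |T(\phi)| \cdot |V_2(\phi)| + |N(\phi)| \cdot |(A\circ\hat F) D\hat F| \le K |x|^{\beta-\alpha}. \tag{40} \]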
Then, since β − α > 0, S is continuous at x = 0. On the other hand, (24), (37)–(39) together with (iii) in Lemma 9 imply |V1 (φ)|
0 so that for any (Y, q) = (Y, (x, y)) ∈ U × S^* and (φ, A) ∈ A × A1 we have
1. |V1^d(φ, A)(Y, q)| ≤ k|x|^{β−α−1};
2. |T(φ)V2^d(φ, A)(Y, q)| ≤ k|x|^{β−α−1};
3. |S(φ, A)(Y, q)| ≤ k|x|^{β−α};
4. |N^d(φ, A)(Y, q)| ≤ k|x|^{β−α};
5. |N(φ)(Y, q)(A ◦ F̂)(Y, q).D^2F̂(Y, q)| ≤ k|x|^{β−1}.
Proof. We start by estimating V1d (φ, A). From the expression in (43) we obtain |V1d (φ, A)| ≤
|D∂y g| |D 2 ∂y g| K|V2 (φ)| + K|D Fˆ | + K |∂x f | |∂x f | |D 2 ∂y f | |D∂y f | |D∂x g| K|V2 (φ)| + K |D Fˆ | + K + |∂x f | |∂x f | |∂x f |
≤ K|x|β−α−1 , where we used Lemmas 9 and 10, and the fact that (28) with Lemma 8 imply |V2 (φ)| ≤ K|x|−1 .
To estimate T(φ)V2^d(φ, A) we use the expressions in (44), (25), Lemmas 9 and 10 to obtain |T(φ)V2^d(φ, A)| ≤ |T(φ)| |x|^{−2}(K + K|x|^α) + K|x|^{β−α−1} ≤ K|x|^{β−α−1}. To estimate [V2(φ)]M [S(φ, A)]diag we use (40) and the fact that Lemma 9 with (28) imply |V2(φ)| < K|x|^{−1} to obtain |[V2(φ)]M [S(φ, A)]diag| = |V2(φ)||S(φ, A)| < K|x|^{β−α−1}. To estimate [N^d(φ, A)]diag [(A ◦ F̂)DF̂]M, we use Lemmas 8 and 9 to obtain |[N^d(φ, A)]diag [(A ◦ F̂)DF̂]M| = |N^d(φ, A)||(A ◦ F̂)DF̂| ≤ |N^d(φ, A)||DF̂| ≤ K|x|^{β−1}. To estimate N(φ)[DF̂]^t(B ◦ F̂)DF̂ we use Lemma 8 and (39) to obtain |N(φ)[DF̂]^t(B ◦ F̂)DF̂| ≤ |N(φ)||DF|^2 ≤ K|x|^{β+α−1}. To estimate N(φ)D^2F̂(A ◦ F̂) we use (29) and Lemma 9 to get |N(φ)| ≤ k|x|^{β−α+1}. This, together with (2) in Lemma 9 and the fact that |D^2F̂| ≤ |D^2F|, implies |N(φ)D^2F̂(A ◦ F̂)| ≤ k|x|^{β−1}. This finishes the proof of Lemma 11. □

Proposition 8. The operator H : A × A1 × A2 → A × A1 × A2 defined in (42) is continuous and, for φ ∈ A and A ∈ A1 fixed, H(φ, A, .) : A2 → A2 is a contraction, with the constant of contraction independent of φ and A. Moreover, there exist r0 > 0 and R0 > 0 so that if B_{r0} = {B ∈ A2; |B| ≤ r0} then H(φ, A, B) ∈ B_{R0}(B) for any B ∈ B_{r0}.

Proof. Using Lemma 10 we get |H(φ, A, B)| ≤ K|x|^{β−α−1}, proving that H(φ, A, B) is continuous at x = 0. On the other hand, we also have, for φ and A fixed, |H(φ, A, B1) − H(φ, A, B2)| = |N(φ)||[DF̂]^t|(B1 ◦ F̂ − B2 ◦ F̂)DF̂| ≤ K|x|^{β+α−1}|B1 ◦ F̂ − B2 ◦ F̂|, which implies that H(φ, A, .) is a contraction, with constant of contraction independent of φ and A. To conclude, observe that for φ and A fixed, (3) in Lemma 10, Lemma 11 and (7) in 2.1 for F0 imply that there exists a positive constant K0 so that |H(φ, A, B)| ≤ K0 + |N(φ)||DF|^2|B|. But |N(φ)||DF|^2 < 1/2, see Lemma 9, and so, for R0 satisfying R0 > K0 and r0 = 2(R0 − K0), we get |H(φ, A, B)| ≤ R0 for every B ∈ B_{r0}. This concludes the proof. □
Observe that Proposition 8 implies the existence of an attracting fixed point B0 for H(φ, A, .). If φ is C^2 then T̂ defined in (41) is such that T̂^n(φ, Dφ, D^2φ) → (φ0, Dφ0, D^2φ0), where φ0 is the fixed point of T. This implies that B0 = D^2φ0, and we conclude that φ0 is C^2. The proof of Lemma 2 is complete.

Now we prove Lemmas 3–7, used in Sect. 3.

Proof of Lemma 3. We start by recalling some consequences of the fact that fµ generically unfolds the saddle-node q0. First, we have that for every µ > 0 small there are εµ > 0, εµ → 0+ as µ → 0+, and dµ ∈ N such that if

\[ r_\mu := f_\mu^{-d_\mu - 2}(\bar a) < f_\mu(r_\mu) < R_\mu := f_\mu^{2}(r_\mu), \]

then
1. µ1 < µ2 implies r_{µ2} < r_{µ1} and R_{µ1} < R_{µ2}; Rµ, rµ → 0 as µ → 0+;
2. f′µ(x) < 1 for x ∈ [−1, rµ); f′µ(x) ≥ 1 + εµ for x ∈ (Rµ, c1(µ)]; 1 − εµ ≤ f′µ(x) ≤ 1 + εµ for x ∈ (rµ, Rµ).

Moreover, if z0 ∈ (q0, xc) is sufficiently near to q0 then there are λ0 > 1 and µ0 > 0 such that for all 0 < µ < µ0, if c′ is given in Claim 3, then

\[ c' > \inf\{ f'_\mu(x);\ x \in [z_0, c_1(\mu)) \} > (1 - \epsilon_\mu)^2 \inf\{ f'_\mu(x);\ x \in [z_0, c_1(\mu)) \} > \lambda_0 > 1. \tag{56} \]

Furthermore, if I ⊂ [−1, δ̄] is an interval then:
(a) if I contains z0 and xc then I contains a pre-image of I0,
(b) if I ⊂ [Rµ, fµ^2(Rµ)] and m_I is the first integer such that fµ^{m_I}(I) ⊂ J0(µ), then fµ^{m_I−1}(I) ⊂ [z0, c1(µ)).

Fix z0 ∈ (q0, xc), µ0, and λ0 as above. Recall that we also have µ0 satisfying Claim 3. To simplify notation, we shall assume, from now on, that c′ = c, where c is the constant in Proposition 1. Observe that, since c > √2 and εµ → 0 as µ → 0, there is no loss of generality in assuming this. For every µ ∈ (0, µ0), let I ⊂ [−1, δ̄] be any interval such that I does not contain a pre-image of I0 and I ∩ C(Ec(µ)) ≠ ∅. We proceed according to the following two cases:

Case I. I ∩ Ec(µ) ≠ ∅. In this case, xc(µ) ∈ I. There are two possibilities for z0: either z0 ∈ I or not. If z0 ∈ I, then we are done by the choice of z0 and (a) above. Otherwise I ⊂ [z0, c1(µ)] and property (c) satisfied by z0 implies the result with λ = λ0 and m = m_I. Note that |fµ^m(I)| ≥ λ0|I|.

Case II. I ∩ Ec(µ) = ∅. We distinguish three cases: either I ⊂ [−1, rµ), or I ⊂ [rµ, Rµ], or I ⊂ [Rµ, xc). In the first case, since fµ restricted to [−1, rµ) is increasing, we can define n_I as the first integer such that fµ^{−n_I}(I) ⊂ [−1, fµ(−1)]. Then m_I = lµ − n_I, where lµ is given in Claim 3. So, using the fact that the inverse of f′µ is increasing in [−1, rµ), we obtain

\[ |f_\mu^{m_I}(I)| = |f_\mu^{l_\mu - n_I}(I)| \ge |f_\mu^{m_\mu}(I)| \ge c' |I| \ge \lambda_0 |I| \]

and we are done.
In the second case, when I ⊂ [rµ, Rµ], we have J = fµ^2(I) ⊂ [Rµ, fµ^2(Rµ)]. Property (2) satisfied by rµ and Rµ gives

\[ |J| \ge (1 - \epsilon_\mu)^2 |I|. \tag{57} \]

On the other hand, property (c) satisfied by z0 and µ < µ0 give

\[ |f_\mu^{k}(J)| \ge |J| \quad \text{for } 0 \le k \le m_J - 1. \tag{58} \]

By property (d), fµ^{m_J}(J) = fµ(fµ^{m_J−1}(J)) and fµ^{m_J−1}(J) ⊂ [z0, c1(µ)], and property (c) together with (57) and (58) give

\[ \begin{aligned} |f_\mu^{m_J}(J)| &\ge \inf\{ f'_\mu(x);\ x \in [z_0, c_1(\mu)] \}\, |f_\mu^{m_J - 1}(J)| \\ &\ge \inf\{ f'_\mu(x);\ x \in [z_0, c_1(\mu)] \}\, |J| \\ &\ge \inf\{ f'_\mu(x);\ x \in [z_0, c_1(\mu)] \}\, (1 - \epsilon_\mu)^2 |I| \ge \lambda_0 |I|. \end{aligned} \tag{59} \]

Since fµ^{m_J}(J) = fµ^{m_J+2}(I), we obtain |fµ^{m_J+2}(I)| ≥ λ0|I|, and we conclude the proof taking m_I = m_J + 2 and λ = λ0. Finally, in the third case, that is, for I ⊂ [Rµ, xc], property (d) satisfied by z0 and µ0 gives fµ^{m_I−1}(I) ⊂ [z0, c1(µ)) and so, using property (c), we get |fµ^{m_I}(I)| = |fµ(fµ^{m_I−1}(I))| ≥ λ0|fµ^{m_I−1}(I)| ≥ λ0|I|, where we used property (2) satisfied by rµ and Rµ and the fact that I ⊂ [Rµ, xc] to obtain the last inequality in the above expression. Choosing m = m_I and λ = λ0 we have the result. The proof of Lemma 3 is complete. □

Proof of Lemma 4. We can assume that I ⊂ Ec(µ). Otherwise, we have I ∩ C(Ec(µ)) ≠ ∅ and since c2(µ) ∈ I, we conclude that I ⊃ [xc, c2(µ)] ⊃ J0(µ), and we are done. Write I = I^− ∪ I^+, |I^+| ≥ |I|/2, both connected and contained in Ec(µ). There are two possibilities for fµ(I^+): either it is connected or not. If fµ(I^+) is connected, then fµ(I^+) ⊃ [fµ(−1), fµ^2(−1)], provided fµ(I^+) does not contain [−1, fµ(−1)]. In this case, fµ(I^+) contains a pre-image of q1(µ) and we are done by Claim (1), Sect. 3. On the other hand, if fµ(I^+) is connected and J = fµ(I^+) ⊂ [−1, fµ^2(−1)], Lemma 3 implies that either J contains a pre-image of I0 and we are done, or there is m_J such that fµ^{m_J}(J) ⊂ J0(µ) and |fµ^{m_J}(J)| ≥ λ0|J|, with λ0 > 1. Thus, using that I^+ ⊂ Ec(µ), we get

\[ |f_\mu^{m_J}(J)| = |f_\mu^{m_J + 1}(I^+)| \ge \lambda_0 |f_\mu(I^+)| \ge \lambda_0\, c\, |I^+| \ge \lambda_0 \frac{c}{2} |I|. \tag{60} \]

On the other hand, as fµ^{m_J}(J) ⊂ J0(µ), and |f′µ(x)| > c for every x ∈ J0(µ), using (60), (59), and that c^2/2 > 1, c > λ0 > 1, we obtain

\[ |f_\mu^{m_J + 2}(I^+)| = |f_\mu^{m_J + 1}(J)| \ge c\, |f_\mu^{m_J}(J)| \ge \lambda_0 \frac{c^2}{2} |I| \ge \lambda_0 |I|. \]

This finishes the proof in this case, with m = m_J + 2 and λ = λ0. If fµ(I^+) is not connected, then either [c1(µ), c2(µ)] ⊂ I^+, and so q1(µ) ∈ I, finishing the proof by Claim (1), or [c2(µ), c3(µ)] ⊂ I^+ and in this case q2(µ) ∈ I^+,
and we conclude the result applying Claim (1) again. The proof of Lemma 4 is complete. □

Proof of Lemma 5. Assume c1(µ) ∈ I. Write I = I^− ∪ I^+, I^− ⊂ [−1, c1(µ)] and I^+ ⊂ [c1(µ), δ̄] (both connected) with I^− ∩ I^+ = {c1(µ)}. Without loss of generality we can assume |I^+| ≥ |I^−|. Then fµ(I) = fµ(I^−) ∪ fµ(I^+). Note that fµ(I^−) is connected. If fµ(I^+) is not connected then c2(µ) ∈ I^+, implying [c1(µ), c2(µ)] ⊂ I^+. So, in this case we conclude that q1(µ) ∈ I. Assume that fµ(I^±) is connected. There are two possibilities for fµ^2(I^±): either it is connected or not. In the first case,

\[ |f_\mu^{2}(I^\pm)| \ge c^2 |I^\pm|, \tag{61} \]

for fµ(I^±) ⊂ Ec(µ) and fµ^2(I^±) ⊂ Ec(µ). The fact that |I^+| ≥ |I^−| together with (61) give

\[ |f_\mu^{2}(I)| \ge |f_\mu^{2}(I^+)| \ge c^2 |I^+| \ge \frac{c^2}{2} |I|. \]

Choosing λ = c^2/2 we finish the proof. In the second case, that is, if either fµ^2(I^+) or fµ^2(I^−) is not connected, we proceed as follows. Observe that, under this hypothesis, either c2(µ) ∈ fµ(I^+) or c3(µ) ∈ fµ(I^−). Then [c2(µ), b0] ⊂ fµ(I^+) or [c3(µ), δ̄] ⊂ fµ(I^−). Since q1(µ) ∈ fµ([c2(µ), b0]) and [c2(µ), b0] ⊂ fµ([c3(µ), δ̄]), we get q1(µ) ∈ fµ^2(I^+) or [c2(µ), b0] ⊂ fµ^2(I^−). Thus q1(µ) ∈ fµ^2(I^+) ∪ fµ^3(I^−). This completes the proof of Lemma 5. □

Proof of Lemma 6. If ci(µ) ∈ I, i ∈ {1, 3}, then δ̄ ∈ fµ(I) and so fµ(δ̄) ∈ fµ^2(I). As ā < fµ(δ̄), fµ^2(I) is connected and fµ^2(I) ∩ C(Ec(µ)) ≠ ∅, we conclude J0(µ) ⊂ fµ^2(I). This finishes the proof. □

Proof of Lemma 7. Assume ci(µ) ∈ I, i ∈ {1, 3}, and write I = I^− ∪ I^+ as in the previous lemmas. We have fµ(I) = fµ(I^+) ∪ fµ(I^−), δ̄ ∈ ∂fµ(I^−) and b0(µ) ∈ ∂fµ(I^+). As fµ^2(I) is not connected, either [c3(µ), δ̄] ⊂ fµ(I^−) or [c2(µ), b0(µ)] ⊂ fµ(I^+). In the latter case J0(µ) ⊂ fµ^2(I^+) ⊂ fµ^2(I) and in the former J0(µ) ⊂ fµ^3(I^−) ⊂ fµ^3(I), proving the result. □

Acknowledgements. The authors thank IMPA for its hospitality. This work was partially supported by CNPq, FAPERJ, and Pronex-Dynamical Systems, Brasil.
References

[1] Alves, J. F., Bonatti, C., Viana, M.: SBR measures for partially hyperbolic systems whose central direction is mostly expanding. Preprint
[2] Afraimovich, V., Chow, S., Liu, W.: Lorenz type attractors from co-dimension one bifurcations. J. Dyn. and Diff. Eq. 7, 375–407 (1995)
[3] Bamon, R., Labarca, R., Mañé, R., Pacifico, M. J.: The explosion of singular cycles. Publ. Math. IHES 78, 207–232 (1993)
[4] Bonatti, Ch., Pumariño, A., Viana, M.: Lorenz attractors with arbitrary expanding dimension. C. R. Acad. Sci. Paris 324, Série I, 883–888 (1997)
[5] Bonatti, Ch., Díaz, L. J.: Persistent non-hyperbolic transitive diffeomorphisms. Ann. Math. 143, 357–396 (1995)
[6] Carvalho, M.: Sinai–Ruelle–Bowen measures for n-dimensional derived from Anosov diffeomorphisms. Ergod. Th. and Dynam. Sys. 13, 21–44 (1993)
[7] Christy, J.: Branched surfaces and attractors I: Dynamic branched surfaces. Trans. Am. Math. Soc. 336, 759–784 (1993)
[8] Díaz, L. J., Rocha, J., Viana, M.: Strange attractors in saddle-node cycles: Prevalence and globality. Invent. Math. 125, 403–425 (1996)
[9] Guckenheimer, J., Williams, R.: Structural stability of Lorenz attractors. Publ. Math. IHES 50, 59–72 (1979)
[10] Lorenz, E. N.: Deterministic non-periodic flows. J. Atmosp. Sci. 20, 130–141 (1963)
[11] Mañé, R.: Contributions to the stability conjecture. Topology 17, 383–396 (1978)
[12] Morales, C. A., Pacifico, M. J.: Strange attractors arising from hyperbolic systems. Preprint
[13] Morales, C. A.: Lorenz attractors through saddle-node bifurcations. Ann. IHP Non Lin. An. 13, 589–617 (1996)
[14] Morales, C. A., Pujals, E. R.: Singular strange attractors on the boundary of Morse Smale systems. Ann. Scient. Éc. Norm. Sup. 30, 693–717 (1997)
[15] Morales, C. A., Pacifico, M. J., Pujals, E. R.: On C^1 robust transitive sets for three-dimensional flows. C. R. Acad. Sci. Paris 326, Série I, 81–86 (1998)
[16] Newhouse, S., Palis, J., Takens, F.: Bifurcations and stability of diffeomorphisms. Publ. Math. IHES 57, 5–71 (1983)
[17] Palis, J., Takens, F.: Hyperbolicity and sensitive chaotic dynamics at homoclinic bifurcations. Cambridge Studies in Advanced Mathematics 35, (1993)
[18] Pesin, Ya. B.: Dynamical systems with generalized hyperbolic attractors: hyperbolic, ergodic and topological properties. Ergod. Th. Dynam. Sys. 12, 123–151 (1992)
[19] Sinai, Ya. G.: Gibbs measure for partially hyperbolic attractors. Ergod. Th. Dynam. Sys. 9, 417–438 (1982)
[20] Williams, R. F.: Classification of one-dimensional attractors. Proc. AMS Symp. Pure Math. 14, 341–361 (1970)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 211, 559 – 583 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
The Internal Description of a Causal Set: What the Universe Looks Like from the Inside
Fotini Markopoulou
Center for Gravitational Physics and Geometry, Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]
Received: 20 February 1999 / Accepted: 8 November 1999
Abstract: We describe an algebraic way to code the causal information of a discrete spacetime. The causal set C is transformed to a description in terms of the causal pasts of the events in C. This is done by an evolving set, a functor which to each event of C assigns its causal past. Evolving sets obey a Heyting algebra which is characterised by a non-standard notion of complement. Conclusions about the causal structure of the causal set can be drawn by calculating the complement of the evolving set. A causal quantum theory can be based on the quantum version of evolving sets, which we briefly discuss.
1. Introduction

In general, an entire spacetime, or an entire spatial slice in canonical gravity, can only be seen by an observer either in the infinite future or outside the universe. This is unphysical, so in the present paper we look for an alternative description of the causal structure of the spacetime that codes what an observer inside the universe can observe. Another motivation for this work is to develop the proposal advocated in [1], that causality persists at the Planck scale. Before one can argue whether or not this is plausible, there is the question of how to describe the causal structure. A spacetime metric is totally unsuitable since it is classical; when quantized it “fluctuates”, leading to confusion in any argument, either for or against Planckian causality. An alternative to the Lorentzian metric which has been explored, for example, in [2] is a causal set. This is the discrete equivalent of a Lorentzian spacetime, the set of events in a discrete spacetime, partially ordered by the causal relations. Because the causal set can be defined prior to a spacetime manifold, it has been used in quantum gravity approaches that hold that a manifold is a classical concept, only found in a suitable classical limit of the Planckian theory. Incorporating a causal set, as for example in [1, 4, 5], makes a theory “in principle” causal. However, the causal set itself is simply a very
large collection of causal relations. If causality can instead be coded in an algebra, it may be possible to represent it directly in a quantum theory. In order to address both of the above problems, we wish to introduce a transformation from the causal set C to the set of causal pasts of each event in C. One possibility is to simply replace each event with its causal past. This has some advantages but it makes no progress towards a better coding of the causal structure. The maps between the causal pasts will be given exactly as in the causal set case. Therefore, we instead choose to work with evolving sets. An evolving set is a functor from the causal set to the category of sets. It is a generalization of an ordinary set that varies over the events in C. It gives the causal past of each event but also contains the causal structure of C in an intrinsic fashion. Evolving sets can be generalised to causal quantum theories, as we will discuss in this paper. Evolving sets satisfy a particular algebra, called a Heyting algebra, with operations which reflect the underlying causal set. In particular, the Heyting algebra complement is a measure of the non-triviality of the causal structure of C. This algebra corresponds to a type of non-standard logic whose development, historically, is closely related to the passage of time. In short, the evolving set of causal pasts contains the same information about the causal structure as the causal set, but this information can be given in terms of algebraic relations. An algebraic way to express time evolution – as opposed to time given geometrically as an extra dimension – is likely to be advantageous in addressing quantum gravity issues. In broader terms, the use of varying sets, of which the evolving ones which we study here are a particular case (those that vary over a causal set), applies to several occasions in physics when we need to make explicit the conditions under which a physical statement is made. 
In the present work, and generally in a causal quantum theory, we need to make explicit the time (event) dependence. Isham [6] and Isham and Butterfield [7] who introduced varying sets in physics have used them to specify consistent sets of histories and the different levels of coarse-graining. The outline of the paper is the following. In Sect. 2, we give the definition of a causal set and explain when we regard the causal structure of C to be trivial. In Sect. 3, we give a preview of the final construction of the past-sets over C to indicate what technical tools are needed and motivate some of the rather technical sections that follow. Since the evolving set of causal pasts is a generalisation of an ordinary set, in Sect. 4 we review material from set theory. In particular, we examine the definition of a subset of some set in standard set theory. Then, we give the definition of a subset in categorial terms, since this is the form which we will generalise. The mathematical side of this material can be found in the standard mathematical literature on topos theory, for example in Mac Lane and Moerdijk [8], and we mainly concentrate on its physical interpretation. In Sect. 5, we construct the evolving set over a general causal set. In Sect. 6, we apply this definition to the example of discrete Newtonian histories, a particularly simple case of a causal set. In Sect. 7, we define the complement of a causal set and show that it is empty when the causal set is a lattice. The complement provides an algebraic definition of event-dependent causal horizons and we examine the possibility of generalising this to global properties of the causal set, such as black holes and branchings. The complement is one of the operations of the Heyting algebra which the evolving sets obey. We give the definitions of all the four operations in a Heyting algebra in Sect. 8, and we compare it to the standard Boolean algebra of ordinary sets.
Just as the Boolean algebra of set theory implies an underlying Boolean logic, the Heyting algebra of evolving sets means that the underlying logic is intuitionistic. For completeness, we translate the algebraic operations of Sect. 8 to logical ones in the Appendix. Finally, in Sect. 9, we outline possible generalisations of the present results to causal spin networks and the quantum theory.

2. Review of Causal Sets

A causal set C is a discrete partially ordered set with structure that is intended to mirror that of Lorentzian spacetime. Namely, for any pair of points p and q, either one is to the future of the other, say p ≤ q, or they are causally unrelated. The ordering relation is antisymmetric: if p ≤ q and q ≤ p, we must conclude that p = q, since timelike loops are not allowed. That the causal set is discrete means that the cardinality of the set {r ∈ C : p ≤ r ≤ q} for any pair p ≤ q is finite. We will also assume that the causal set has a finite number of elements. Two elements p and q in the causal set have a greatest lower bound (g.l.b.), r, when r is an element in the causal set such that r ≤ p and r ≤ q and, for any other element z, z ≤ p and z ≤ q implies z ≤ r. Similarly, p and q have a least upper bound (l.u.b.) when there is an element t in the causal set such that p ≤ t and q ≤ t, and if there is some element z later than both p and q, then it must also be later than t. The existence of a l.u.b. for two elements means that they will eventually meet at that common future event, while their g.l.b. is their last common past event. A supremum for a partially ordered set, if it exists, is an element s in the partially ordered set later than every other element, that is, p ≤ s for all p ∈ C. Similarly, an infimum is an element i in the partially ordered set before every other element, i ≤ p for all p ∈ C. An infimum in the causal set means a single first event for the whole universe, while a supremum is the single final event that all others lead to.
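The definitions of g.l.b., l.u.b., infimum and supremum can be made concrete on a small finite example. The sketch below is only an illustration: the four-event "diamond" causal set and all function names are hypothetical choices made here, not taken from the text.

```python
from itertools import product

# Hypothetical 4-event causal set: a <= b, a <= c, b <= d, c <= d.
events = ["a", "b", "c", "d"]
covers = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}


def causal_order(events, covers):
    """Reflexive-transitive closure of the given relations, as pairs (p, q) with p <= q."""
    leq = {(p, p) for p in events} | set(covers)
    changed = True
    while changed:
        changed = False
        for (p, q), (r, s) in product(list(leq), repeat=2):
            if q == r and (p, s) not in leq:
                leq.add((p, s))
                changed = True
    return leq


def glb(p, q, events, leq):
    """Greatest lower bound of p and q, or None if there is none."""
    lower = [r for r in events if (r, p) in leq and (r, q) in leq]
    best = [r for r in lower if all((z, r) in leq for z in lower)]
    return best[0] if best else None


def lub(p, q, events, leq):
    """Least upper bound of p and q, or None if there is none."""
    upper = [t for t in events if (p, t) in leq and (q, t) in leq]
    best = [t for t in upper if all((t, z) in leq for z in upper)]
    return best[0] if best else None


def infimum(events, leq):
    """Event before every other event, or None."""
    found = [i for i in events if all((i, p) in leq for p in events)]
    return found[0] if found else None


def supremum(events, leq):
    """Event after every other event, or None."""
    found = [s for s in events if all((p, s) in leq for p in events)]
    return found[0] if found else None


leq = causal_order(events, covers)
print(glb("b", "c", events, leq), lub("b", "c", events, leq))
print(infimum(events, leq), supremum(events, leq))
```

Removing the event d from this example leaves b and c without any common future event, so their l.u.b. no longer exists and the set has no supremum.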
A lattice is a partially ordered set with a g.l.b. and a l.u.b. for every pair of elements (p, q) and a supremum and infimum. We are mainly interested in causal sets with a finite number of events. In this case, if there is a g.l.b. and a l.u.b. for each pair of events in the causal set, there is also a supremum and an infimum. When a causal set is a lattice, we will say that it has trivial causal structure. The reason for this is explained in Sect. 7.

We use the causal set to express time-dependence. The “position” of some event in the causal set is determined by its causal relations to the rest of the events in the causal set. To emphasize the fact that we use an event p to specify a time instant relative to the rest of C, we will often call it the stage p.

3. The Evolving Set of Causal Pasts: The Basic Idea

A classical observer at a given time instant can be placed on the corresponding event p in the causal set without affecting the causal set. He knows about all the events in the past of p. In this minimal description, time passes when the observer moves to a later stage q ≥ p in the causal set. At q, the set of events in his causal past is larger but still includes all the past events of p. If we are “outside” the causal set, what we see is a collection of such causal pasts, one for each event in C. It is the thesis of this paper, however, that being outside the causal set is unphysical. We instead care about the same situation as viewed from “inside” C by one of the above observers. We want to know in what way the inside viewpoint is different from the outside one and if it has some interesting structure.
F. Markopoulou
We will show that the inside viewpoint is given by a functor from the causal set category to the category of sets. Upgrading the causal set to a category involves nothing more than calling the events “objects” and the ordering relation p ≤ q, when it exists, an “arrow p → q”. A collection of objects and arrows forms a category when the composition of the arrows is associative and there is an identity arrow for each object. This is certainly the case for the causal set. From p ≤ q ≤ r ≤ s we can conclude that p ≤ s independently of the order in which we removed the mid-events (associativity), and p ≤ p for any p ∈ C (identity). The category of sets, Set, which we need for the sets of causal pasts, has sets for its objects and functions between sets for arrows. Composition of functions is associative, (f ◦ g) ◦ h = f ◦ (g ◦ h), and for each set there is an identity map from the set to itself.

Our task is the following: we need to go from the causal set C to the events in the past of each p ∈ C. The former belong to the causal set, while the latter are objects in Set. One can go from one category to another, while preserving the structure of the first, by a functor. So, we employ the functor Past from the causal set to the category of sets,

$$\mathrm{Past} : C \to \mathbf{Set}. \tag{1}$$

It has components Past(p) for each p ∈ C, which are the events that lie in the past of that p,

$$\mathrm{Past}(p) = \{ r \in C : r \le p \}. \tag{2}$$

They are sets. That Past is a functor means not only that it spits out Past(p) for each p but also that whenever p ≤ q it gives

$$\mathrm{Past}_{pq} : \mathrm{Past}(p) \to \mathrm{Past}(q), \tag{3}$$

which takes the causal past of p to the causal past of q. Past preserves composition,

$$\mathrm{Past}_{pr} = \mathrm{Past}_{qr} \circ \mathrm{Past}_{pq} \quad \text{whenever} \quad p \le q \le r, \tag{4}$$

and Past_pp is the identity map,

$$\mathrm{Past}_{pp} : \mathrm{Past}(p) \to \mathrm{Past}(p). \tag{5}$$

Clearly, when p ≤ q, the past of q contains the past of p and the map Past_pq of Eq. (3) is really just set inclusion,

$$\mathrm{Past}_{pq} : \mathrm{Past}(p) \subseteq \mathrm{Past}(q). \tag{6}$$

(The reason we bother with properties (4) and (5), which are obvious when Past_pq is set inclusion, is that we are interested in later generalising the functor Past and assigning, for example, open spin network states to each p, or turning Past_pq into a dynamical evolution arrow.)

The main role of Past is to transform events in C into sets (of events). In fact, functors from a causal set to Set, like Past, themselves form a category. These functors are not far from being sets, which is why they can be thought of as evolving, or varying sets, a generalisation of ordinary sets which can be found, for example, in [8]. We will, therefore, understand Past by generalising standard set theory. We start by giving a review of the relevant material for sets.
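The functor Past of Eqs. (1)–(6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-event causal set is hypothetical, maps between sets are stored as dictionaries, and the names `past`, `past_map`, `compose` and `is_functor` are ours.

```python
# Hypothetical causal set, given by its full (reflexive, transitive) order relation.
EVENTS = {"a", "b", "c", "d"}
ORDER = {(e, e) for e in EVENTS} | {("a", "b"), ("a", "c"), ("b", "d"), ("a", "d")}


def past(p):
    """Past(p): the events r that precede or equal p, Eq. (2)."""
    return frozenset(r for r in EVENTS if (r, p) in ORDER)


def past_map(p, q):
    """Past_pq as a dict from Past(p) to Past(q); defined only when p precedes q."""
    if (p, q) not in ORDER:
        raise ValueError(f"no arrow {p} -> {q}")
    return {r: r for r in past(p)}  # set inclusion, Eq. (6)


def compose(g, f):
    """Composite 'g after f' of two maps stored as dicts."""
    return {x: g[f[x]] for x in f}


def is_functor():
    """Check the identity law (5) and the composition law (4) on every chain."""
    for p in EVENTS:
        if past_map(p, p) != {r: r for r in past(p)}:
            return False
    for (p, q) in ORDER:
        for (q2, r) in ORDER:
            if q2 == q and compose(past_map(q, r), past_map(p, q)) != past_map(p, r):
                return False
    return True


print(sorted(past("d")))  # ['a', 'b', 'd']
print(is_functor())       # True
```

Because every Past_pq is an inclusion, the two laws hold trivially here; the check becomes informative only once the maps are replaced by something less trivial, as the text anticipates.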
4. Sets and Subsets

We start with a review of material from set theory. Consider a set X and a subset of it, A. How do we list those elements x ∈ X which are also contained in the subset A? We use the characteristic function for A,

$$\chi_A : X \to \{1, 0\}, \tag{7}$$

defined by

$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{otherwise,} \end{cases} \tag{8}$$

or, diagrammatically,

[Diagram: χ_A : X → {1, 0}, sending the elements of the subset A ⊆ X to 1 and the remaining elements of X to 0.]
The function χ_A splits the elements of X into those that also belong to A (which get the label 1) and those that do not (which are marked 0). The subset A can be recovered using the inverse of the characteristic function, as the inverse image

$$A = \chi_A^{-1}(1). \tag{9}$$
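A minimal Python sketch of Eqs. (7)–(9) follows. The set X, the subset A and the function names are hypothetical and serve only to show that the inverse image of 1 returns the subset.

```python
def characteristic(A):
    """Return chi_A as a function from X to {1, 0}, Eq. (8)."""
    return lambda x: 1 if x in A else 0


def inverse_image(chi, X, value):
    """Elements of X that chi maps to `value`."""
    return {x for x in X if chi(x) == value}


X = {"u", "v", "w", "z"}  # hypothetical set
A = {"u", "w"}            # hypothetical subset of X

chi_A = characteristic(A)
recovered = inverse_image(chi_A, X, 1)  # Eq. (9)

print(recovered == A)  # True
```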
What we have in mind for our time-dependent setup is a generalised notion of a subset, appropriate for evolving sets. In order to generalise, let us first scrutinize the above construction. Let us also treat sets as objects in the category Set, since we will later have to use evolving sets as objects in the appropriate category. As we have already said above, the category Set has sets as its objects and its arrows are set functions. Of course, sets contain elements, and this is essential in the above definition of a subset. On the other hand, when we lay out the category Set there is nothing about elements to start with. How do they come about?

Set is a category with a terminal object. A terminal object in a category has the property that there is a unique arrow from any other object in the category to it. If it exists, the terminal object is unique up to isomorphism.¹ In Set, the terminal object should then be a set that is special in that there is a unique function from every other set into it. We can check that this is the one-element set in the following way. The function g(x) := constant on an element x ∈ X exists for any set X. Since its only output is constant, for each X the function g is unique (up to isomorphism). Thus, the terminal object is the set with the single element constant. We will denote the terminal object {0}. Note that we only care about the fact that the terminal object has only one element and not about which one this element is, since all one-element sets are isomorphic and our constructions only need to be good up to isomorphism. We therefore denote it by {0} but we could just as well have used {1}, {e} or {constant} (and the 0 in the terminal object has absolutely nothing to do with the 0 in the set {1, 0}).

¹ In Set there is also an initial object, a set that is special in that there is a unique arrow from it to every other set in the category. With some thought we can convince ourselves that this is the empty set.

Even if we are not told anything about the objects of Set having elements, we can infer that they do from the existence of a terminal object. Consider a function from {0} to some set X. Since the domain of this function is the single element 0, the only output it can have is a single element of X. Thus, the functions from {0} to X are in one-to-one correspondence with the elements of X, and we can, in fact, define an element of X to be such a function. This may seem a far too complicated way of doing things, and it certainly is if all we cared about were sets. We need the concept of elements since we need to tell which events in C have occurred at any given time. The above definition of an element proves useful in the rest of the present work because it can be generalised to evolving sets.

Going back to the subset A, we have

$$A \xrightarrow{\;\subseteq\;} X \xrightarrow{\;\chi_A\;} \{1, 0\}. \tag{10}$$

The first thing to note is that when we refer to A as the subset of X we mean that there is an inclusion function from A to X. To emphasize this, we will give the name f to this function,

$$A \xrightarrow{\;f\;} X \xrightarrow{\;\chi_A\;} \{1, 0\}. \tag{11}$$
The characteristic function χ_A is an equivalent description of the function f. We choose to interpret 1 as “it is true that x is in A”, i.e. 1 is chosen as the true one of the two truth-values of the set {1, 0}.² “Choosing the element 1” from {1, 0} as described above means that we use a function from the terminal object to the set {1, 0} that outputs the element 1,

$$T : \{0\} \to \{1, 0\} \quad \text{such that} \quad T(0) = 1. \tag{12}$$

Its name T is suggestive of “true”, since this function represents our choice for which element of {1, 0} makes the expression “x ∈ A” true. In its full detail, the set inclusion diagram (11) is therefore

[Diagram: the functions f : A → X, χ_A : X → {1, 0} and T : {0} → {1, 0}, with the image of A shaded inside X.]

The shaded set is the image of A under f in X, or, equivalently, the inverse image χ_A⁻¹(1) in X.

² We are effectively replacing the set {1, 0} by the set {true, false} of boolean truth values.
A subset is a pullback. What the characteristic function χ_A does is pick the elements of X that carry the label 1 from the two possible in {1, 0}. The subset A is then defined to be the set that contains those elements only. The philosophy here can be roughly interpreted as follows. The set of truth-values {1, 0} is the simplest set we have that, together with the simplest possible (but nontrivial) inclusion function T : {0} → {1, 0}, exhibits all the features of a subset of a set. The characteristic function χ_A “lifts off” this model case of a subset, applies it to the particular set X we provide, and returns the subset A. The mathematical explanation of the same philosophy is a pullback diagram. We group the functions f : A → X of Eq. (11), χ_A : X → {1, 0} of Eq. (7) and T : {0} → {1, 0} of Eq. (12) into the diagram

$$\begin{array}{ccc}
A & \longrightarrow & \{0\} \\
{\scriptstyle f}\,\big\downarrow & & \big\downarrow\,{\scriptstyle T} \\
X & \xrightarrow{\;\chi_A\;} & \{1, 0\},
\end{array} \tag{13}$$

where the top arrow is the unique function from A to {0} (the function g we defined above). We know that A contains those x ∈ X with χ_A(x) = 1. This is precisely the statement that the diagram above is a pullback square, i.e. that A is the set that contains x ∈ X that have the same image under χ_A as T(0),

$$\chi_A(x) = 1 = T(0). \tag{14}$$

We say that A is the pullback in the above diagram. In terms of functions, χ_A is the only function from X to {1, 0} along which the true function T pulls back to yield the inclusion function f, i.e. f is the pullback of T along χ_A.³

³ A subset is a special case of a pullback. In general, that A is a pullback means it contains pairs of elements (x, 0) with x ∈ X and 0 ∈ {0} that have the same image in {1, 0}. It is because A is a subset that the second argument in this pair is the single element 0 of the terminal object and, since we can unambiguously drop it, we think of A as containing elements of X only.

At this stage we are done with all the technical material from set theory. We will now go ahead to apply it, generalise it, and draw conclusions from it. The generalisation will attempt to capture the following: In the case of sets and subsets that we have seen, it is characteristic that χ_A splits the original set in exactly two parts, those that belong to the subset and those that do not, with anything in the middle excluded. In such cases we only need two truth-values and {1, 0} suffices. For example, this defines the causal past of a single event in the causal set: we can assign 1 to all events r ≤ p and 0 to all other events. However, we aim for more. We wish to have truth-values that inform us, not only whether an event has occurred or not, but also how long we need to wait till it does, or whether it will not happen no matter how long we wait. This is provided by the enlarged set of truth-values (in fact a functor) and the characteristic function for the evolving set Past.

5. The Evolving Set Past

The “viewpoint” of an event p in the causal set, the history of the world to the knowledge of p, is the set of events in the causal set in the past of p. This is a subset of the whole causal set C. Here we consider the causal pasts evolving over the causal set.
As we said in Sect. 3, pasts of events in C are the outputs of a functor from the causal set to sets,

$$\mathrm{Past} : C \to \mathbf{Set}, \tag{15}$$

that assigns to each p ∈ C its past

$$\mathrm{Past}(p) = \{ r \in C : r \le p \} \tag{16}$$

and to each causal relation p → q, when it appears in the causal set, a function

$$\mathrm{Past}_{pq} : \mathrm{Past}(p) \to \mathrm{Past}(q), \tag{17}$$
which includes the identity map Past_pp : Past(p) → Past(p). As an example, here is a causal set

[Figure (18): a causal set with twelve events, labelled 1 to 12, and their causal relations.]
where we have drawn all the causal relations (and not only the nearest-event ones as is often done). When Past varies over this causal set, it gives the sets of causal pasts

[Figure (19): the sets Past(1), Past(2), . . . , Past(12), one for each event of the causal set (18), connected by set inclusions.]

which themselves form a partially ordered set as the arrows are set inclusions.

Past is a C → Set functor. It is a particular object in the category of functors from a partially ordered set C to the category of sets Set, denoted by Set^C. (Arrows between these functors are natural transformations, an example of which we define below.) Such functors, since they are evolving sets, are generalisations of sets. For this reason, the category Set^C has common features with the category Set of sets. It also has a terminal object, which is a generalisation, in fact an evolving version, of the one-element set that is the terminal object for sets. It is the functor t.o. which assigns to each p ∈ C the one-element set {0},

$$\mathrm{t.o.} : C \to \mathbf{Set} \quad \text{with} \quad \mathrm{t.o.}(p) = \{0\}, \tag{20}$$

with an identity arrow between one-element sets for every existing causal link,

$$\mathrm{t.o.}_{pq} : \{0\} \to \{0\} \quad \text{when} \quad p \le q. \tag{21}$$
We can visualise the terminal object as the same web of relations as in C, but with {0} stuck in place of every event:

[Figure (22): the causal set (18) with every event replaced by the one-element set {0}.]
What the causal set encodes is whether one event is before or after some other event. Let us examine this. Suppose we are at event p and need to say whether some other event r ∈ C has occurred. Clearly, if r ≤ p, then r has happened. We can, however, do better than this and assign truth-values at p, not only to “r precedes p”, but to when r will happen. Suppose there is some q in the future of p, q ≥ p, for which r is past, r ≤ q. At p, the “time” we need to wait before r occurs is the causal relation p → q.⁴ This arrow is therefore a candidate for the truth-value, at p, for r to have occurred.

Under more careful inspection, we can see that a single arrow is insufficient. First, if q has both p and r in its past, so will an event q′ ≥ q. That is, the causal relations p → q′ for all q′ ≥ q should also be included in the truth-value we are calculating. Second, in general, there will not be a single first event q in the common future of p and r, but a set of such events, as is the case in the diagram below. These, and all events in their future, should be the causal relations in the truth-value at p for r having occurred. They are the bold arrows in the following diagram.

[Figure (23): the events p and r together with events q, q1 and q2; the causal relations that make up the truth-value at p are drawn as bold arrows.]

⁴ An example of “length of time” truth-values is Newtonian histories, which we work out in the next section. There, the arrow p → q can be assigned a length, the number of events between p and q. This, of course, is only possible because in the Newtonian case the causal set is a full linear order. In general, it is the arrow itself that we think of as the “time elapsed” between p and q.
Let us formalize this. Let us define R(p) as the set of all causal relationships that start at p (including p ≤ p),

$$R(p) = \{ \text{all } p \to p' : p \le p' \text{ in } C \}. \tag{24}$$

Next, we define subsets S(p) of R(p) such that, if S(p) contains the causal relation p → q, it also contains p → q′ for every q′ ≥ q. S(p) is therefore a left-extendible subset of R(p), since the above definition can also be read as: if (p → q) ∈ S(p) and there is (q → q′) ∈ C, then (p → q′) = (q → q′) ◦ (p → q) ∈ S(p). Namely, S(p) is closed under left multiplication (extending a causal relation to the future). A left-extendible subset S(p) of R(p) is (in our convention⁵) called a sieve on p.

The truth-value at p that we are looking for is precisely a sieve. In fact, since it is not simply a yes/no truth-value, but also tells us at p when r will occur, it is called a time-till-truth value. A sieve is the type of truth-value appropriate to statements which, once they become true, remain true, as is the case for an event having occurred. Therefore, at p, we have a sieve truth-value for every other event r in the causal set. The set of all these sieves is the set of truth-values at p. We call this set Ω(p),

$$\Omega(p) = \{ \text{sieves on } p \}. \tag{25}$$
A special subset of the events in C is, of course, those in Past(p). There is a sieve in Ω(p) that corresponds to these particular events. They are special because at p they have already happened. While we may think of the sieve truth-value in Ω(p) for r ∉ Past(p) as a partially true value, the sieve for the events in Past(p) is the totally true value. We can use this understanding to find the sieve in Ω(p) that corresponds to the events in Past(p) and thus the set Past(p) itself. Since these events have occurred at p and at all events that follow p, the truth-value we are looking for is the entire R(p), the maximal sieve in Ω(p). In more detail, R(p) is an element of Ω(p) which is selected in the manner described in Sect. 4 for choosing elements of a set using the terminal object. We define an inclusion of {0} into Ω(p), which we call the totally true arrow at p, T_p:

$$T_p : \{0\} \to \Omega(p), \tag{26}$$

such that

$$T_p(0) = R(p) = (\text{maximal sieve on } p). \tag{27}$$
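The sieve construction of Eqs. (24)–(27) can be tried out on a small example. The Python sketch below is illustrative only: the four-event causal set is hypothetical, arrows p → q are stored as pairs (p, q), and the names `R`, `is_sieve`, `sieves` and `chi` are ours. The function `chi` implements the truth-value described in the text, namely the arrows from p to every event that has both p and r in its past.

```python
from itertools import chain, combinations

# Hypothetical causal set: p precedes q1 and q2, while r precedes q2 only.
EVENTS = {"p", "r", "q1", "q2"}
ORDER = {(e, e) for e in EVENTS} | {("p", "q1"), ("p", "q2"), ("r", "q2")}


def leq(a, b):
    """True when a precedes or equals b."""
    return (a, b) in ORDER


def R(p):
    """All causal relations starting at p, Eq. (24), stored as pairs (p, q)."""
    return frozenset((p, q) for q in EVENTS if leq(p, q))


def is_sieve(S, p):
    """S is a subset of R(p) that is closed under extension to the future."""
    return S.issubset(R(p)) and all(
        (p, q2) in S for (_, q) in S for q2 in EVENTS if leq(q, q2)
    )


def sieves(p):
    """The set of all sieves on p, Eq. (25), found by brute force."""
    arrows = list(R(p))
    subsets = chain.from_iterable(
        combinations(arrows, k) for k in range(len(arrows) + 1)
    )
    return {frozenset(s) for s in subsets if is_sieve(frozenset(s), p)}


def chi(p, r):
    """Time-till-truth value at p for the event r: arrows p -> q with r in the past of q."""
    return frozenset((p, q) for q in EVENTS if leq(p, q) and leq(r, q))


print(len(sieves("p")))         # 5
print(chi("p", "r"))            # frozenset({('p', 'q2')})
print(chi("p", "p") == R("p"))  # True
```

In this example the only event whose truth-value at p is the maximal sieve R(p) is p itself, which is exactly Past(p) for the chosen ordering.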
Let us now observe that Past(p) ⊂ C and call this inclusion f_p,

$$f_p : \mathrm{Past}(p) \subseteq C. \tag{28}$$

Both Past(p) and C are sets and, as we saw in Sect. 4, that the former is a subset of the latter means that Past(p) is the pullback in the diagram

$$\begin{array}{ccc}
\mathrm{Past}(p) & \longrightarrow & \{0\} \\
{\scriptstyle f_p}\,\big\downarrow & & \big\downarrow\,{\scriptstyle T_p} \\
C & \xrightarrow{\;\chi_p\;} & \Omega(p),
\end{array} \tag{29}$$

namely, Past(p) contains those events r ∈ C that satisfy

$$\chi_p(r) = (\text{maximal sieve on } p) = T_p(0), \tag{30}$$

exactly as we saw in Sect. 4.

⁵ We call a sieve what in [8] is a cosieve.

We have, therefore, found that the set of truth-values Ω(p) at a given event p is the set of sieves on p and, in particular, that the maximal sieve gives the component Past(p) of Past at p. We next have to let these results “vary” over C so that we can define the functor Past by a generalisation of what we did for Past(p). As we move from Past(p) to Past(q), the appropriate truth-values are consistently given by Ω(p) and Ω(q), so that we can find Past(q) given Past(p), p ≤ q and C. That is, the diagram

$$\begin{array}{ccc}
\mathrm{Past}(p) & \xrightarrow{\;f_p\;} & C \\
{\scriptstyle \mathrm{Past}_{pq}}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \mathrm{id}_C} \\
\mathrm{Past}(q) & \xrightarrow{\;f_q\;} & C
\end{array} \tag{31}$$
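commutes (id_C is the identity operation on the causal set). On the left of the diagram we have components of the functor Past. We can organize the right-hand side in the same way by constructing the constant functor

$$\mathrm{World} : C \to \mathbf{Set} \tag{32}$$

with

$$\mathrm{World}(p) = C \quad \text{for any } p \in C, \tag{33}$$

$$\mathrm{World}_{pq} = \mathrm{id}_C. \tag{34}$$

Now the diagram (31) reads

$$\begin{array}{ccc}
\mathrm{Past}(p) & \xrightarrow{\;f_p\;} & \mathrm{World}(p) \\
{\scriptstyle \mathrm{Past}_{pq}}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \mathrm{World}_{pq}} \\
\mathrm{Past}(q) & \xrightarrow{\;f_q\;} & \mathrm{World}(q)
\end{array} \tag{35}$$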
What this diagram says is that Past is a subfunctor of World, i.e. there is an inclusion

$$f : \mathrm{Past} \to \mathrm{World}, \tag{36}$$

which is a natural transformation, i.e. it has components f_p, f_q that make the diagram (35) commute, f_q · Past_pq = World_pq · f_p, for every two events p, q that are causally related. The set of truth-values for Past(p) is Ω(p), while for Past(q) it is Ω(q). These, too, are components of a functor

$$\Omega : C \to \mathbf{Set} \tag{37}$$

that for each event in C outputs its set of truth-values, the sieves on that event, and has functions Ω_pq : Ω(p) → Ω(q) that, given the set of sieves on p, give the set of sieves on q.
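The text does not spell out the maps between sets of sieves. The Python sketch below assumes the usual restriction of a sieve along an arrow: a sieve on p is sent to the part of it that lies to the future of q. Under that assumption it checks, on a hypothetical four-event causal set, that the truth-value of every event r at p is carried to its truth-value at q. Sieves are stored here simply as the sets of target events of their arrows, and all function names are ours.

```python
EVENTS = {1, 2, 3, 4}
# Hypothetical order, written out in full: 1 precedes 2, 2 precedes 4, 3 precedes 4.
ORDER = {(e, e) for e in EVENTS} | {(1, 2), (2, 4), (1, 4), (3, 4)}


def leq(a, b):
    """True when a precedes or equals b."""
    return (a, b) in ORDER


def chi(p, r):
    """Sieve on p for the event r, stored as the set of targets q of the arrows p -> q."""
    return frozenset(q for q in EVENTS if leq(p, q) and leq(r, q))


def omega_map(p, q, sieve):
    """Assumed form of the map between sieves: keep the targets lying to the future of q."""
    if not leq(p, q):
        raise ValueError("the map is defined only when p precedes q")
    return frozenset(t for t in sieve if leq(q, t))


def chi_is_natural():
    """Check that restricting chi(p, r) to q gives chi(q, r) for every arrow p -> q and every r."""
    return all(
        omega_map(p, q, chi(p, r)) == chi(q, r)
        for (p, q) in ORDER
        for r in EVENTS
    )


print(chi(1, 3))         # frozenset({4})
print(chi_is_natural())  # True
```

The check succeeds for any finite partial order, since an event to the future of q with r in its past is automatically to the future of p as well.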
We, therefore, have a characteristic function χ from World to Ω,

$$\chi : \mathrm{World} \to \Omega, \tag{38}$$

which is also a natural transformation: it has components at each event that compose as those of f did above. We obtain the particular subfunctor Past when, at each event in the causal set, χ maps into the maximal sieve at that event. Namely, the function T_p above is the p-component of the arrow T from the terminal object t.o. for Set^C defined in (20) to Ω,

$$T : \mathrm{t.o.} \to \Omega, \tag{39}$$

with

$$T_p : \mathrm{t.o.}(p) = \{0\} \to \Omega(p). \tag{40}$$

We, finally, have everything we need to give the subfunctor Past as a pullback. All four of Past, World, t.o. and Ω are objects in the category Set^C. We may then construct the diagram

$$\begin{array}{ccc}
\mathrm{Past} & \longrightarrow & \mathrm{t.o.} \\
{\scriptstyle f}\,\big\downarrow & & \big\downarrow\,{\scriptstyle T} \\
\mathrm{World} & \xrightarrow{\;\chi\;} & \Omega
\end{array} \tag{41}$$
and ask that the functor Past makes this diagram a pullback. This means that it reduces to the diagram (31) consistently at each event in the causal set. Past is a subfunctor of World.⁶

We now know how to specify an evolving set. We have defined an evolving causal past by giving the functor Past that generates the causal past of each event and the corresponding evolving set of truth-values Ω. We can now analyze their physical significance. Overall, we have achieved two things. One is to express the causal structure in terms of causal pasts at each event, which we regard as physically more satisfactory than an entire spacetime or causal set. This is the description that can be given by an observer inside the causal set. Second, even when the causal set is not a lattice, the evolving sets over it form one. As we will see next, they satisfy a particular algebra, called a Heyting algebra, whose operations reflect the underlying causal set. Thus, we can give the causal structure algebraically. We do this in Sect. 8. Before we continue with the algebra of causal sets, we will give a simple example of causal pasts in a causal set. We will consider histories in a discrete Newtonian universe.

⁶ All of Past, t.o., World and Ω are evolving sets, and χ, T and f are natural transformations. In particular, χ has components

$$\begin{array}{ccc}
C & \xrightarrow{\;\chi_p\;} & \Omega(p) \\
{\scriptstyle \mathrm{id}_C}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \Omega_{pq}} \\
C & \xrightarrow{\;\chi_q\;} & \Omega(q)
\end{array} \tag{42}$$

6. Discrete Newtonian Histories

Discrete Newtonian time evolution is the case where integers 0, 1, 2, 3, . . . ∈ N can label the preferred time parameter. In a Newtonian world, we have a preferred foliation with slices S_0, S_1, S_2, S_3, . . . labelled by the time stage when they occur. It is, therefore, a very special causal set: the fully ordered set of integers. At some given time n, we may ask for the history up to that time. It is the sequence of slices up to n, namely,⁷

$$\mathrm{History}(n) = \{ S_0, S_1, S_2, \ldots, S_n \}. \tag{43}$$
Clearly there is a history at each time stage n ∈ N. We can set up a functor History,

$$\mathrm{History} : \mathbb{N} \to \mathbf{Set}, \tag{44}$$

which, when fed the time instant, spits out the history up to that time, History(n), as given by (43). The functor History also provides maps between histories which are just the inclusions of the shorter histories into the longer ones,

$$\mathrm{History}_{nm} : \mathrm{History}(n) \subseteq \mathrm{History}(m) \quad \text{whenever } n \le m. \tag{45}$$
We may now want to ask whether some slice S_i is already in History(n) and, if not, when it will be included. First, we note that all slices can be found in the components of the constant functor

$$\mathrm{World} : \mathbb{N} \to \mathbf{Set}, \tag{46}$$

since its components are

$$\mathrm{World}(n) = \{ S_0, S_1, S_2, \ldots \} \quad \text{for any } n \in \mathbb{N}. \tag{47}$$

History is a subfunctor of World since, at each time n, History(n) ⊆ World(n),

$$\begin{array}{cccccccc}
\mathrm{History}: & \mathrm{History}(0) & \xrightarrow{\;\subseteq\;} & \mathrm{History}(1) & \xrightarrow{\;\subseteq\;} & \mathrm{History}(2) & \xrightarrow{\;\subseteq\;} & \cdots \\
{\scriptstyle f}\,\big\downarrow & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
\mathrm{World}: & \mathrm{World}(0) & \longrightarrow & \mathrm{World}(1) & \longrightarrow & \mathrm{World}(2) & \longrightarrow & \cdots
\end{array} \tag{48}$$

(By analogy to the section on sets, World is the evolving set version of the set X, while History is an “evolving subset” in place of A.) Diagrammatically, we have

[Figure (49): for each time 0, 1, 2, . . . , n, the set History(0), History(1), History(2), . . . , History(n) drawn inside the corresponding World(0), World(1), World(2), . . . , World(n), with the slices S(0), S(1), S(2), . . . , S(n) marked.]

⁷ In Newtonian evolution the speed of light is infinite and we do not have the local (bubble) evolution present in a causal universe. Effectively, the slice S_n can simply be replaced by n, or, equivalently, we can reduce the slice to contain a single event and the causal structure will not change. Therefore, the discrete Newtonian history {S_0, S_1, S_2, . . . , S_n} is isomorphic to just {0, 1, 2, . . . , n}.
At time n, the slice S_i may or may not be in History(n). If it is not already history at n, then, if we wait for some time t, we may find that S_i ∈ History(n + t). As a result, the characteristic function that tells us at n when S_i is history is

$$\chi_n(S_i) = \begin{cases} \text{least } t \text{ at which } S_i \in \mathrm{History}(n + t), & \text{if this occurs,} \\ \infty & \text{otherwise.} \end{cases} \tag{50}$$

Of the numbers that χ_n(S_i) spits out, 0 is to be understood as “true now” (S_i is already history at time n), 1 is “true tomorrow” and so on. ∞ is the symbol we chose for “never true”, if the particular slice S_i never appears in History. Note that χ_n takes its values in the set N_∞, the integers together with ∞. Therefore, the set of truth-values at n is N_∞. In fact, the set of truth-values is N_∞ at any time. We can chain these sets together to form Ω : N → Set, which represents the time-development of the set of truth-values; only, in our especially simple case, there is no time development and

$$\Omega(n) = \mathbb{N}_\infty \quad \text{for any } n \in \mathbb{N}. \tag{51}$$

Ω then is the “constant evolving set” (with identity maps between components)

$$\Omega = \{ \mathbb{N}_\infty, \mathbb{N}_\infty, \mathbb{N}_\infty, \mathbb{N}_\infty, \ldots \}. \tag{52}$$
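The waiting-time characteristic function of Eq. (50) is easy to compute directly. In the Python sketch below the slice S_i is represented by the integer i, as the text suggests is harmless in the Newtonian case, and the function names are ours. Since every slice with a non-negative label eventually joins History, the search always terminates and the value ∞ is never needed in this example.

```python
from itertools import count


def history(n):
    """History(n) = {S_0, ..., S_n}, with the slice S_i represented by the integer i."""
    return set(range(n + 1))


def chi(n, i):
    """Eq. (50): the least wait t for which S_i is in History(n + t).

    Assumes i is a non-negative integer, so the search always terminates.
    """
    for t in count():
        if i in history(n + t):
            return t


def true_now(n, world):
    """Slices of `world` whose truth-value at time n is 0, i.e. "true now"."""
    return {i for i in world if chi(n, i) == 0}


print(chi(3, 5))                # 2
print(chi(3, 1))                # 0
print(true_now(3, range(10)))   # {0, 1, 2, 3}
```

The last line recovers History(3) from the truth-values alone, which is the content of the pullback construction in this setting.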
Note that the true now (at n) output of χ_n in (50) is the value 0. This is the maximal sieve in Ω(n) above. We can see that this is because the sieves here tell us how long we need to wait before S_i joins History, and 0 is, of course, the shortest possible wait. It is the maximal sieve since it contains all longer waits, 1, 2, . . . , n, etc.⁸

⁸ We can write Ω(n) as all the arrows from n to any m ≥ n, i.e. all the possible “waiting times”, or time-till-truth values:

$$\Omega(n) = \left\{ \begin{array}{l} n \to n \\ n \to n + 1 \\ \quad\vdots \\ n \to n + k \\ n \to n + k + 1 \\ \quad\vdots \\ n \to \infty \end{array} \right\}. \tag{53}$$

The length of these arrows is the number that χ_n of Eq. (50) outputs. These are the sieves on n. Going from Ω(n) to Ω(n + 1), the length of each arrow goes down by one since what at n was “true tomorrow” is “true now” at n + 1, and so on. Thus Ω is closed under making the arrow n → m even longer, which means that what is true now will remain true tomorrow. The arrow n → n is the shortest one; it means no wait, i.e. true now (at n).

Finally, the history at a given time can be obtained by first defining the function T_n at n to have output 0, which is the true now value in (50),

$$T_n : \{0\} \to \Omega(n) \quad \text{with} \quad T_n(0) = 0. \tag{54}$$

Then History(n) is the pullback in

$$\begin{array}{ccc}
\mathrm{History}(n) & \longrightarrow & \{0\} \\
{\scriptstyle \subseteq}\,\big\downarrow & & \big\downarrow\,{\scriptstyle T_n} \\
\mathrm{World}(n) & \xrightarrow{\;\chi_n\;} & \Omega(n),
\end{array} \tag{55}$$

namely, it contains those slices S_i in World(n) that satisfy

$$\chi_n(S_i) = 0 = T_n(0), \tag{56}$$
and thus are already true at n.

7. The Complement of Past Measures the Causal Structure of C

In Sect. 5 we provided a time-dependent way of telling when some event in a causal set has occurred (Eqs. (38) and (25)). What characterises an event that will never happen? In this section we give the relevant definition, which encodes much of the causal structure of C. We then make some first remarks about the possibility of defining algebraically global properties of a causal set, such as black holes and branchings.

A sieve on p ∈ C is the time-till-truth value for some other event r ∈ C to happen. In the same way, if r will never happen, then it should satisfy

$$r \notin \mathrm{Past}(q) \quad \text{for any } q \ge p, \tag{57}$$

since r never happening for p means that it will never be in the past of the future of p. The set of events satisfying (57) is the evolving set complement of Past(p), the set of events in the causal set C that are not in the past of the future of p:

$$\neg\mathrm{Past}(p) = \{ r \in C : r \notin \mathrm{Past}(q) \text{ for any } q \ge p \}. \tag{58}$$
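Definition (58) translates directly into code. The Python sketch below uses a hypothetical four-event causal set with two branches that never rejoin; the example and the function names `past`, `future` and `not_past` are ours and are meant only as an illustration.

```python
# Hypothetical causal set with two branches that never rejoin:
# 0 precedes 1, 1 precedes 2 on one branch, and 0 precedes 3 on the other.
EVENTS = {0, 1, 2, 3}
ORDER = {(e, e) for e in EVENTS} | {(0, 1), (1, 2), (0, 2), (0, 3)}


def past(p):
    """Past(p): the events that precede or equal p."""
    return {r for r in EVENTS if (r, p) in ORDER}


def future(p):
    """The events q to the future of p, including p itself."""
    return {q for q in EVENTS if (p, q) in ORDER}


def not_past(p):
    """Eq. (58): the events that lie in Past(q) for no q to the future of p."""
    seen_eventually = set().union(*(past(q) for q in future(p)))
    return EVENTS - seen_eventually


print(not_past(0))  # set()
print(not_past(1))  # {3}
print(not_past(3))  # {1, 2}
```

The first event sees everything eventually, while each branch permanently misses the events of the other branch.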
As a first example, consider the Newtonian case. There the functor History plays the same role as Past, and thus

$$\neg\mathrm{History}(n) = \{ k \in \mathbb{N} : S_k \notin \mathrm{History}(m) \text{ for any } m \ge n \}. \tag{59}$$

But, since the causal ordering is a full linear ordering, any slice will at some point join History. Thus, the complement of History(n) is empty. It is generally the case with a causal set that is a lattice that the complement of Past is empty. This is because, by definition, when C is a lattice any two events p and r have a least upper bound, which is an event with both p and r in its past. Thus, definition (58) returns the empty set. Note that this implies that ¬Past cannot distinguish between different lattice causal sets.

Second, let us take the very small causal set in (18). It is not a lattice. There is no single “big bang” event and no single “final crunch”. Consider the event labelled 7 and let us find the complement of Past(7). ¬Past(7) is all the events not in the past of the future of 7, i.e. 3, 4, 10, 11 and 12. We mark them white:

[Figure (60): the causal set (18) with the events 3, 4, 10, 11 and 12 of ¬Past(7) marked white.]
Let us imagine placing an observer on event 7. The future of 7 is 5, 6 and 9. In these future times, he will receive information from anything in the past of 5, 6 and 9, i.e. from 1, 2, 5 and 8. He will never, however, be able to see 3, 4, 10, 11 and 12, the events in ¬Past(7). As far as 7 is concerned, ¬Past(7) = {3, 4, 10, 11, 12} is beyond its causal horizon: 7 will never receive information from these events. It is not surprising that there is such a boundary when the causal structure is nontrivial, and parts of the causal set are either disconnected or branch off. This is reminiscent of topology change in a spacetime, although, naturally, we do not have the same notion of topology here. We should emphasize that this is a horizon for 7 and not a global property of C. Such an event-dependent notion of causal horizon is useful in constraining a microscopic theory to reproduce the Einstein equations at large scale (used in [10]). However, since this paper is devoted simply to the exposition of the evolving set Past and not its applications, we will not go further into this, except for comments in the conclusion section.

A global property of the spacetime, such as a branching of the causal set or a black hole, needs, presumably, the agreement of the complements of a large class of events. A permanent branching of C will result in all events in one branch having (almost all) the events in the other branch in their complements. It is a very interesting but subtle problem to express a black hole in terms of the causal structure by considering intersections of the complements of the events in C. This is work currently in progress.

The non-standard complement of Past is one of the four operations of the algebra of (evolving subsets of) evolving sets. It is a Heyting algebra, which we define in the next section by giving its four operations.

8. Evolving Sets Obey a Heyting Algebra

Recall that Past, as a functor from C to sets, can be regarded as an evolving set, a set that varies over all events in C. As with the other definitions in this paper, we start from sets and generalise to evolving sets.

8.1. The algebra of sets. Take a set X, and consider the set of its subsets, its powerset. There are four possible operations on the subsets of X, and they produce four unique new sets:

• Union: (A ∪ B) = {x ∈ X : x ∈ A or x ∈ B}.
• Intersection: (A ∩ B) = {x ∈ X : x ∈ A and x ∈ B}.
• Implication: (A ⇒ B) = {x ∈ X : if x ∈ A then x ∈ B} = {x ∈ X : x ∉ A or x ∈ B}.
• Complement: The complement ¬A of A is characterised by

$$A \cap \neg A = \emptyset \quad \text{and} \quad A \cup \neg A = X, \tag{61}$$

and, therefore, ¬A = {x ∈ X : x ∉ A}. Note that the complement can, in fact, be derived from the implication operation by defining

$$\neg A = (A \Rightarrow \emptyset). \tag{62}$$
This means that ¬A contains those x ∈ X which “if they are in A they also are in ∅”, which is a formal way to exclude the elements of A from ¬A. As on other occasions in this paper, it is this twisted definition that we generalise below. From the two relations (61), one can prove the familiar

$$\neg\neg A = A. \tag{63}$$
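The four operations on ordinary subsets, and the derivation of the complement from implication in Eq. (62), can be checked with Python's built-in sets. The universe X and the subsets A and B below are hypothetical examples, and the function names are ours.

```python
X = frozenset(range(6))   # hypothetical universe
A = frozenset({0, 1, 2})  # hypothetical subsets of X
B = frozenset({2, 3})
EMPTY = frozenset()


def implies(A, B):
    """(A => B): the elements of X that are not in A or are in B."""
    return frozenset(x for x in X if x not in A or x in B)


def complement(A):
    """Complement defined through implication, Eq. (62)."""
    return implies(A, EMPTY)


print(sorted(implies(A, B)))          # [2, 3, 4, 5]
print(sorted(complement(A)))          # [3, 4, 5]
print(complement(complement(A)) == A) # True
```

For ordinary sets the double complement returns the original subset, Eq. (63); the point of the following subsection is that this need not hold for evolving sets.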
Diagrammatically, the four set operations are:

[Figure (64): diagrams of the union, intersection, implication and complement of subsets of X.]
The partially ordered set of subsets of X is a lattice, since every two subsets have their union as a l.u.b. and their intersection as a g.l.b. The infimum of the powerset of X is the empty set ∅ and the supremum is X itself. Importantly, it is a distributive lattice, i.e. the identity

$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C) \tag{65}$$
holds. A distributive lattice in which every element has a complement that satisfies (61) is a Boolean lattice, or, a Boolean algebra. Therefore, the subsets of X form a Boolean algebra. 8.2. The algebra of evolving sets. With evolving sets, we have the same four operations of union, intersection, implication and complement, but different relations between them and thus a different algebra which is known as a Heyting algebra. Consider two evolving sets, or functors from C to Set which are subfunctors of our constant functor World. Let us call them F and G. We have worked with one of them, Past. Another example would be the functor generating events in the future of each event in C, or sets of spacelike separated events in each causal past, etc. The four possible operations on F and G are given componentwise (at each event in C) but so that they preserve the causal structure of C. They are:9 • Union: (F ∪ G)(p) = F (p) ∪ G(p). • Intersection: (F ∩ G)(p) = F (p) ∩ G(p). 9 It is somewhat difficult to draw diagrams for evolving sets. If the reader wishes to have an illustration of a Heyting algebra, a good alternative is to draw open sets. Open subsets of a given (open) set O also satisfy a Heyting algebra. Given an open set U ⊂ O, the set-theoretic complement of U (i.e. the one satisfying the definition (61)), which we will call U c to avoid confusion, is a closed set. If we want an algebra of open sets, we need, instead of U c , to use the interior of the set-theoretic complement of U , I nt (U c ). But then, clearly, U ∪ I nt (U c ) ⊂ O since the closure of U has been left out. Open sets are the standard example of a Heyting algebra in the literature. That they behave in the same way as evolving sets is not surprising, roughly speaking, they both involve an infinite sequence, the former of time stages to the future, the latter of points to the boundary.
• Implication:
(F ⇒ G)(p) = {r ∈ C : for any causal relation f : p → q, if r ∈ F(q), then r ∈ G(q)}. (66)
That is, (F ⇒ G)(p) contains all those events which, if they join F at q by the time step f : p → q, also join G at q. One can check that this is a Set^C functor.
• Complement: Since we have already worked with the complement of Past, we will discuss the complement using this particular example rather than the abstract functor F. We will basically give the justification for the definition (58). The problem with applying the definition (61) of a boolean complement to Past is that it gives its complement as all r ∈ C that are not in Past(p), namely (C − Past(p)). But, since a causal past keeps getting larger with time, i.e. Past(p) ⊆ Past(q) when p ≤ q, this complement would get smaller in time, (C − Past(p)) ⊇ (C − Past(q)). This has the wrong behavior to be a functor like Past and hence we cannot get an algebra of evolving sets using this complement. Let us instead define the complement by generalising (62). First we need an evolving version of ∅. This is the functor Empty that assigns to each p ∈ C the empty set (with the expected in-between arrows). We then get
¬Past = (Past ⇒ Empty). (67)
In components this gives
¬Past(p) := {r ∈ C : for any causal relation f : p → q in C, f(r) ∉ Past(q)} = {r ∈ C : r ∉ Past(q) for any q ≥ p}, (68)
which is the definition (58). This is a Set^C functor and an appropriate complement for Past.

Generally, ¬Past satisfies
Past ∩ ¬Past = Empty, (69)
but
Past ∪ ¬Past ⊂ World. (70)
(An example of this is the case of a lattice causal set where the left-hand side of (70) is just Past.) Because the Heyting algebra complement satisfies the weaker set of conditions (69) and (70), instead of (61), it is often called a pseudo-complement. We simply call it complement, or a Heyting algebra complement. For a Heyting algebra complement, it is not true anymore that ¬¬Past equals Past (again consider the example where C is a lattice). In general, for some evolving set, the double complement is larger than the original.

The subobjects of a Set^C object like World, therefore, form a lattice: there is union and intersection of any two, World is the supremum and Empty the infimum. This is independent of whether the causal set itself is a lattice. This is significant because a lattice can be used as an algebra and, at the same time, evolving sets still contain all the causal information in C.
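The definitions (66)-(70) can be tried out on a toy example. The sketch below is not taken from the paper: the four-event causal set (0 ≤ 1 ≤ 3 and 0 ≤ 2, so that event 2 sits on a separate branch) and all Python names are assumptions chosen for illustration. An evolving set is represented as a dictionary from events to sets of events.

```python
# Hypothetical four-event causal set: 0 <= 1 <= 3 and 0 <= 2 (a branch).
EVENTS = frozenset({0, 1, 2, 3})
# FUTURE[p] lists every q with p <= q (reflexive and transitive).
FUTURE = {0: {0, 1, 2, 3}, 1: {1, 3}, 2: {2}, 3: {3}}

def past():
    # Past(p) = {r : r <= p}
    return {p: frozenset(r for r in EVENTS if p in FUTURE[r]) for p in EVENTS}

def implies(F, G):
    # (F => G)(p) = {r : for every q >= p, r in F(q) implies r in G(q)}, cf. (66)
    return {p: frozenset(r for r in EVENTS
                         if all(r not in F[q] or r in G[q] for q in FUTURE[p]))
            for p in EVENTS}

# Evolving version of the empty set: the empty set at every event.
EMPTY = {p: frozenset() for p in EVENTS}

def neg(F):
    # Heyting complement, cf. (67): not F = (F => Empty)
    return implies(F, EMPTY)

PAST = past()
NOT_PAST = neg(PAST)
NOT_NOT_PAST = neg(NOT_PAST)

for p in sorted(EVENTS):
    assert PAST[p] & NOT_PAST[p] == frozenset()   # (69)
    assert PAST[p] | NOT_PAST[p] <= EVENTS        # (70)
    assert PAST[p] <= NOT_NOT_PAST[p]             # double complement contains Past
    print(p, sorted(PAST[p]), sorted(NOT_PAST[p]), sorted(NOT_NOT_PAST[p]))
```

In this toy causal set, ¬Past(0) is empty, so Past(0) ∪ ¬Past(0) = {0} is strictly smaller than World, and ¬¬Past(1) = {0, 1, 3} strictly contains Past(1) = {0, 1}; both are instances of the weaker relations satisfied by the pseudo-complement.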
The lattice of evolving sets is a distributive one, since for any three subobjects F, G and H of World,
F ∩ (G ∪ H) = (F ∩ G) ∪ (F ∩ H) (71)
holds. A lattice that is distributive and has implication defined for every pair of elements is called a Heyting algebra. It is weaker than the boolean algebra (it is a generalisation). For a Heyting algebra element (evolving set) F,
¬¬F ⊇ F (72)
(but one can check that ¬¬¬F = ¬F ). If it is the case that ¬F ∪ F = World, then ¬¬F = F and the Heyting algebra reduces to a boolean one. Summarising, the algebra of the analogue of the powerset of an evolving set is a Heyting algebra. A key feature of a Heyting algebra is the complement. It is intimately tied to the causal structure and therefore has physical significance. 9. Using Evolving Sets in a Causal Quantum Theory The reason we carried out the analysis of this paper is that it may serve in obtaining a quantum theory of gravity. We are keeping as much of the causal structure of general relativity as it is possible without using metric manifolds. Clearly, we are restricted to very basic features of a spacetime, namely, the causal ordering of events. No further qualifications of these events have been specified. It is true that it is possible to recover the metric of a spacetime up to a conformal factor from the causal relations between all its events. It has been argued that in the discrete case, even this factor may be fixed by assigning spacetime volume to each event [11]. However, in the causal set as it is used here, no straightforward relationship between its events and those of a classical spacetime has been assumed. In this case, the causal structure is not all that is needed. As it has been proposed in [1] and further elaborated in [4,5,12], extra spatial structure can be attached to a causal set in the form of a spin network, a graph with edges labelled by representations of SU (2) [13]. Spin networks have already been part of a quantized gravity theory in loop quantum gravity [14]. On the other hand, in the causal spin network scheme, as in other applications of causal sets, a problem is how to represent the causal structure in a way that is useful in the quantum theory. 
We hope that the present algebraic formulation of causality will resolve this, at least by restricting the quantum theory to the functorial form that implies underlying causality, as we will outline below. The application of evolving sets to spin networks and generally a causal quantum theory is currently being constructed. Here we indicate how this application may be carried out.

9.1. Elementary causal quantum theory. Standard quantum theory is unambiguously applicable under certain conditions: that the time is fixed and that the separation between the system in question and its environment is also fixed.10 Quantum field theory is, in principle, more flexible; however, the single Hilbert space employed restricts its use to a semiclassical regime of quantum gravity. Leaving aside for the time being the system/environment issue, and keeping the causal set as the time evolution record, we may propose removing the fixed-time condition by using a Hilbert space for each event in the causal set. For example, in the Newtonian case, this results in a sequence of Hilbert spaces which we may regard as the evolution of the quantum universe. Now we note the following. The present work suggests that the Hilbert spaces at each event may be regarded as objects in the category of Hilbert spaces, Hilb (with linear transformations as the arrows). Then, the causal structure is preserved in the quantum theory if there is a functor from the causal set to Hilb, Q : C → Hilb. Let us check if this is the case. For each event p ∈ C, we have a Hilbert space H(p) = Q(p). However, given a causal relation p ≤ q in C, it is not, in general, the case that there is a linear map from H(p) to H(q) such that H(p) is a linear subspace of H(q). Therefore, the ordering of C is not preserved by the functor Q into Hilb. The ways we are aware of to circumvent this problem and maintain causality in the quantum theory are to restrict the linear maps, to evolve linear subspaces of the above Hilbert spaces (evolve the projection operators), or to replace the Hilbert spaces with algebras of observables at each event (see 9.3).

10 At the conceptual level, varying sets is the method proposed by Isham [6] to accommodate contextual physical statements, where context means the conditions under which a certain physical description and physical statement is applicable. We have dealt with evolving sets (and in this section we discuss what may be called “evolving algebras” and “evolving state spaces”) where the context is causality, namely we need to specify the event p in order to get its classical past or the quantum theory at p. The same method can be used in a third context, to treat the system-environment split implied by standard quantum theory. Here the context is all possible systems in the universe. This is work in progress with C. Rovelli.
9.2. Causal spin networks. Causal evolution of spin networks is in principle similar to the above scheme, but with the important difference that it leads to entangled states rather than states in a single Hilbert space (one Hilbert space for each event). Analyzing a functor from the causal set to entangled states or density matrices is beyond the scope of this paper, but we will illustrate how they arise. One may regard a causal history of spin networks as a sequence of spin network graphs such that the nodes of all the graphs in a given history are the elements of the causal set and therefore are partially ordered.11 Between the nodes of a single spin network, therefore, there are no causal relations. Assuming a causal history with connected graphs, the set of nodes of a spin network is maximal, i.e. there is not an element in C outside this set that is not causally related to some element in the set. A maximal set of causally unrelated events in C is a maximal antichain. The complaint with which we started this paper, that a spatial slice can only be seen by an observer in the infinite future, is also present in the spin network evolution since an antichain requires the same observer. What this work suggests is that we should instead work with (partial) antichains at event p, for each p ∈ C (p is also a spin network node). These are sets of nodes in a causal spin network history that are causally unrelated inside Past(p). Given the functor Past, we can construct the functor
Antichains : C → Set, Antichains(p) = {sets of causally unrelated events in Past(p)}. (73)
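The functor (73) can be sketched on a small example. What follows is an illustration under stated assumptions, not part of the paper: the "diamond" causal set (0 ≤ 1 ≤ 3 and 0 ≤ 2 ≤ 3) and the function names are hypothetical, and the empty antichain is left out by choice.

```python
from itertools import combinations

# Hypothetical diamond-shaped causal set: 0 <= 1 <= 3 and 0 <= 2 <= 3.
EVENTS = (0, 1, 2, 3)
# FUTURE[p] lists every q with p <= q (reflexive and transitive).
FUTURE = {0: {0, 1, 2, 3}, 1: {1, 3}, 2: {2, 3}, 3: {3}}

def leq(a, b):
    """True when a causally precedes (or equals) b."""
    return b in FUTURE[a]

def past(p):
    # Past(p) = {r : r <= p}
    return [r for r in EVENTS if leq(r, p)]

def antichains(p):
    """All non-empty sets of pairwise causally unrelated events in Past(p)."""
    events = past(p)
    found = set()
    for size in range(1, len(events) + 1):
        for subset in combinations(events, size):
            unrelated = all(not leq(a, b) and not leq(b, a)
                            for a, b in combinations(subset, 2))
            if unrelated:
                found.add(frozenset(subset))
    return found

# Functoriality check: the antichains at p are a subset of those at any later q.
for p in EVENTS:
    for q in FUTURE[p]:
        assert antichains(p) <= antichains(q)

for p in EVENTS:
    print(p, sorted(sorted(a) for a in antichains(p)))
```

In this toy example the only antichain with more than one element is {1, 2}, and it first appears at event 3, the earliest event whose past contains both 1 and 2.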
For every causal relation p ≤ q, the antichains at p are, of course, a subset of those in q. Given an element in Antichains(p), there is a number of graphs that has these events as its nodes, i.e., for each p we have the set Graphs(p) of all graphs with elements of Antichains(p) as their sets of nodes. (We ignore the SU(2) labels which further enlarge this set.) Clearly, Graphs(p) includes open spin networks. An open spin network has free edges, all labelled by SU(2) representations, and thus can be regarded as an entangled state. Hence, to go from open spin networks at p to open spin networks at a later q, we need an evolution operator on density matrices.

11 In fact, it is the local changes of graphs which generate the history that are partially ordered into a causal set. The difference does not affect the present discussion. For more detail on this, see [5].

9.3. Observable causal quantum theory. A further possibility is to replace the Hilbert spaces of the scheme in 9.1 with the algebra of observables at each event.12 This means using the functor
OQ : C → A, (74)
where A(p) will be the algebra of observables at p. This scheme provides what we may regard as a quantum field theory on a causal set. Concluding this section, we note that the use of functors from the causal set to our preferred description of the universe is a safe way to check that this description is indeed causal. If the quantum theory is the output of a functor that has the causal set as its domain, then causality is built into the theory. Even if causality is not the preferred set of conditions on evolution (one may wish to impose a weaker ordering than the causal set, for example a non-transitive order), the same method can be used to check that these conditions are preserved in the quantum theory.

Conclusions

In this paper, we used the functor Past : C → Set to transform from an external (outside the universe) viewpoint of causality to the internal, finite-time, viewpoint provided by the components of Past at each event. The result provides an algebraic description of causality and great possibilities for generalisations to diverse types of causal quantum theories, some of which we outlined above. Conclusions about the form of quantum evolution that causality permits can be reached in this way. However, note that this would be considered a kinematical restriction since the causal set on which the evolving set is defined is fixed. Past is the simplest evolving set, since it only uses the causal set and no additional spatial or field information. It does, however, serve as a good example in introducing evolving sets, the Heyting algebra of evolving powersets, and the non-standard evolving complement.
At the classical level (from the perspective of a causal set approach), the technical advantage of Past is that it transforms the partially ordered set of causal relations into a lattice and suggests significant improvements in the description of the causal structure without the use of a spacetime.13 The actual implementation in the quantum theory will be reported in future work. Further, it is possible to construct a framework for a (classical) causal cosmological theory by requiring that every physical observable corresponds to an observation made by an observer inside the universe, represented by an event or a collection of events in the set of causal relations of that universe. Such “internal” observables, when referring to events in the observer’s causal past, are subfunctors of Past. For example, in the 1 + 1 causal histories of Ambjorn, Loll and others [16], a physical observable may be events that have spacetime valence n (the number of ingoing and outgoing causal relations to such an event is n). Written as a varying set, this observable will be the n-valent events that have occurred at p and is a subfunctor of Past. Such internal causal observables obey a Heyting algebra. We close by noting that sieves can be used to specify time. This is an algebraic alternative to time expressed as t ∈ R. In work currently in progress, we investigate projection operators which are time-dependent in the sense that their eigenvalues belong to a larger set than the standard 1 and 0; in fact they correspond to sieves.

12 I need to thank Eli Hawkins for this observation.
13 After the work in this paper was completed, the paper of Bombelli and Meyer [15] came to our attention. From a given causal set, they construct quantities that contain the same events as Past and ¬Past.

Acknowledgements. I am very grateful to Chris Isham for introductory discussions on varying sets and his very useful suggestions and criticisms on the present work. Detailed comments on the first draft from Eli Hawkins, Carlo Rovelli, Lee Smolin and Adam Ritz have made this paper much clearer than it originally was. I am thankful to Sameer Gupta, David Meyer and Roger Penrose for suggestions on using the complement to define a black hole. Thanks are due to John Baez, John Barrett and Alex Heller and particularly to Bas van Fraasen for the first discussions on intuitionistic logic. This work was supported by NSF grants PHY/9514240 and PHY/9423950 to the Pennsylvania State University and a gift from the Jesse Phillips Foundation.
Appendix: Boolean vs. Intuitionistic Logic

This appendix is an account of the basics of Boolean and intuitionistic logic and their relation to the Boolean and Heyting algebras. The boolean algebra obeyed by sets means that when a physical theory is ultimately built on set-theoretic foundations (which is almost universally the case), the underlying logic is boolean. Loosely speaking, an observer in such a theory will make statements which obey boolean logic. For a theory based on evolving sets, which we propose here, the Heyting algebra of evolving sets indicates that the underlying logic is intuitionistic. For completeness, and because we would like to point the physicist reader to a mathematical literature possibly of use in issues of time evolution in physics, this appendix is a review of both boolean and intuitionistic logic. From the perspective of a physicist, there is little difference between the algebraic operations and the corresponding logical ones (propositional calculus). Practically, it simply involves “reading” the operations as propositions rather than sets. For example, each proposition x in the logical operations that follow may be replaced by “a ∈ A” for some subset A of X which we used in Sect. 8. The algebraic operations of union, intersection, implication, and complement then become the logical connectives OR, AND, IMPLIES and NOT. Most of the interpretational discussion has already been carried out in Sect. 8. We now simply present the two logical systems. In the intuitionistic case, we include in the presentation some of the historical reasons for the introduction of this type of logic into mathematics.

9.4. Boolean logic. To have either an algebra or a propositional calculus we first need a lattice, as explained in Sect. 8. A lattice is first of all a partial order, so we partially order propositions by ≤. The proposition x ≤ y is to be read as “if x is true, then y is true”.
For a pair of propositions x and y, the four boolean logical operations OR, AND, IMPLIES and NOT produce four new propositions. Any further statements made (sentences) are constructed by combinations of these basic operations. They are in obvious correspondence with the four boolean algebra operations.
• OR (union): (x ∨ y) is true when either x or y is true.
• AND (intersection): (x ∧ y) is true when both x and y are true.
• IMPLIES (implication): (x ⇒ y) is true when, if x is true, then y is also true.
• NOT (complement): The proposition ¬x is true whenever x is false.

We can tabulate the boolean operations:

x   y   x∨y   x∧y   x⇒y   ¬x
1   1    1     1     1     0
1   0    1     0     0     0
0   1    1     0     1     1
0   0    0     0     1     1
Note, first, that ⇒ as defined above means that
(x ⇒ y) = ¬x ∨ y, (75)
that is “x implies y” is equivalent to “either x is false or y is true”, which with little thinking we can believe (or use the table above). Second, in a boolean algebra, every proposition x has a negation ¬x which satisfies
x ∧ ¬x = 0 and x ∨ ¬x = 1, (76)
where 0 means false and 1 means true. As a result, not not x is the same as x,
¬¬x = x. (77)
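The truth table and the identities (75)-(77) can be checked mechanically. The short sketch below is an illustration only; the function names are assumptions, and truth values are encoded as the integers 1 (true) and 0 (false) as in the table.

```python
from itertools import product

def implies(x, y):
    # Boolean implication as in the table: false only when x is true and y is false.
    return 0 if (x == 1 and y == 0) else 1

def neg(x):
    # Boolean negation on {0, 1}.
    return 1 - x

for x, y in product((1, 0), repeat=2):
    assert implies(x, y) == (neg(x) | y)   # (75)
    assert (x & neg(x)) == 0               # first relation in (76)
    assert (x | neg(x)) == 1               # second relation in (76)
    assert neg(neg(x)) == x                # (77)
    # Columns: x, y, x OR y, x AND y, x IMPLIES y, NOT x
    print(x, y, x | y, x & y, implies(x, y), neg(x))
```

The loop visits the rows in the same order as the table (1 1, 1 0, 0 1, 0 0), so the printed rows can be compared with it directly.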
9.5. Intuitionistic logic. We think it is useful to present intuitionistic logic with reference to the motivation of the mathematicians who invented it. This is standard material and we freely quote from the introduction of [8] and from [17]. Intuitionistic logic and the mathematics based on it originated with Brouwer’s work on the foundations of mathematics at the beginning of this century [18]. He insisted that all proofs be constructive. This means that he did not allow proof by contradiction and hence he excluded the classical (boolean) “for all x, either x, or not x”. Intuitionism is a form of constructive mathematics. The classical (boolean) mathematician believes that every mathematical statement x is true or false, whether or not he has the proof for it. The constructive mathematician does not consider x to be true or false unless he can either prove it or disprove it. That is, x may be true tomorrow, or false tomorrow. To quote Brouwer, “the belief in the universal validity of the excluded middle in mathematics is considered by the intuitionists as a phenomenon in the history of civilization of the same kind as the former belief in the rationality of π , or the rotation of the firmament about the earth”. Brouwer’s approach was not formal or axiomatic, but subsequently Heyting and others introduced formal systems of intuitionistic logic, weaker than classical logic. Heyting first formalised the basic axioms of intuitionism which, usually detached from Brouwer’s extreme position, has turned out to be useful for mathematics beyond the original context. As in the Boolean case, we start with a lattice of propositions, ordered by ≤. The four operations work as follows: •
OR and AND are the same as in the boolean case.
• IMPLIES: The statement (x ⇒ y) means that y holds under the assumption that x holds, namely, we show (x ⇒ y) by deriving y from the hypothesis x. (Note that there is no causal implication in ⇒, there is no sense of y causally following x.) (x ⇒ y) is characterized by
z ≤ (x ⇒ y) if and only if z ∧ x ≤ y, (78)
that is, (x ⇒ y), namely the condition (78), is the union of all z that satisfy z ∧ x ≤ y.14 Note that the constructive interpretation of (x ⇒ y) is weaker than the boolean one where (x ⇒ y) = ¬x ∨ y. This leads to the modified intuitionistic negation.
• NOT: The statement ¬x means that (x ⇒ z), where z is a contradiction. Usually a contradiction is denoted 0, so
¬x = (x ⇒ 0), (79)
that is, we have “not x” when x leads to a contradiction. From the definition (78) of ⇒, we get that
y ≤ ¬x if and only if y ∧ x = 0, (80)
namely, ¬x is the union of all propositions y which have nothing in common with x. As a result ¬¬x need not equal x. Also, although x ∧ ¬x = 0, it may not be the case that x ∨ ¬x = 1. In short, intuitionistic logic, or, equivalently, a Heyting algebra, is particularly suitable in a theory with time evolution, when we are concerned with physical statements which become true at a certain time stage and stay true afterwards.

14 In the open set example (footnote 8), (U ⇒ V) is the union of all open sets W whose intersection with U is included in V, i.e. (U ⇒ V) = ∪_i W_i where W_i ∩ U ⊂ V.

References
1. Markopoulou, F. and Smolin, L. (1997): Causal evolution of spin networks. Nucl. Phys. B508, 409
2. Bombelli, L., Lee, J., Meyer, D. and Sorkin, R. (1987): Space-time as a causal set. Phys. Rev. Lett. 59, 521; Sorkin, R. (1990): Space-time and causal sets. In: Proc. of SILARG VII Conf., Cocoyoc, Mexico; Meyer, D. A. (1988): The Dimension of Causal Sets. PhD Thesis, Massachusetts Institute of Technology
3. Bombelli, L. and Meyer, D. (1989): The origin of Lorentzian geometry. Phys. Lett. A 141, 226
4. Markopoulou, F. (1997): Dual formulation of spin network evolution. gr-qc/9704013
5. Markopoulou, F. and Smolin, L. (1998): Quantum geometry with intrinsic local causality. Phys. Rev. D 58, 084032
6. Isham, C. J. (1997): Topos theory and consistent histories: the internal logic of the set of all consistent sets. Int. J. Theor. Phys. 36, 785
7. Isham, C.J. and Butterfield, J. (1998): A topos perspective on the Kochen–Specker theorem: I. Quantum States as Generalized Valuations, quant-ph/9803055. II. Conceptual aspects, and classical analogues, Int. J. Theor. Phys. 38, 827–859 (1999)
8. Mac Lane, S. and Moerdijk, I. (1992): Sheaves in Geometry and Logic: A First Introduction to Topos Theory. London: Springer-Verlag
9. Goldblatt, R. (1984): Topoi, the categorial analysis of logic. Amsterdam: North-Holland
10. Jacobson, T. (1995): Thermodynamics of spacetime: The Einstein equation of state. Phys. Rev. Lett. 75, 1260
11. Sorkin, R. (1991): A finitary substitute for continuous topology. Int. J. Theor. Phys. 30, 923
12. Borissov, R. and Gupta, S. (1998): Propagating spin modes in canonical quantum gravity. Phys. Rev. D 60 (1999) 024002
13. Penrose, R. (1971): Theory of quantised directions. In: Quantum theory and beyond, ed. Bastin, T. Cambridge: Cambridge University Press
14. Rovelli, C. and Smolin, L. (1995): Discreteness of area and volume in quantum gravity. Nucl. Phys. B 593, 734. For a review, see Rovelli, C. (1998): Loop quantum gravity, Living Reviews in Relativity 1
15. Bombelli, L. and Meyer, D. (1989): Phys. Lett. A 141, 226
16. Ambjorn, J. and Loll, R. (1998): Non-perturbative Lorentzian Quantum Gravity and Topology Change. Nucl. Phys. B 536, 407–434 (1998); Ambjorn, J., Nielsen, J.L., Rolf, J. and Loll, R. (1998): Euclidean and Lorentzian Quantum Gravity: Lessons from 2 dimensions. Chaos Soliton Fractals 10, 177–195 (1999)
17. Bridges, D.S., Douglas, S. and Richman, F. (1987): Varieties of constructive mathematics. New York: Cambridge University Press
18. Brouwer’s Cambridge Lectures on Intuitionism. (1981); ed. VanDalen, D. Cambridge: Cambridge University Press

Communicated by H. Nicolai
Commun. Math. Phys. 211, 585 – 613 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Essential Self-Adjointness of Translation-Invariant Quantum Field Models for Arbitrary Coupling Constants

Fumio Hiroshima*,**

Technische Universität München, Zentrum Mathematik, Gebelsberger Strasse 49, 80290 München, Germany. E-mail: [email protected]

Received: 26 May 1999 / Accepted: 9 November 1999

Abstract: The Hamiltonian of a system of quantum particles minimally coupled to a quantum field is considered for arbitrary coupling constants. The Hamiltonian has a translation invariant part. By means of functional integral representations the existence of an invariant domain under the action of the heat semigroup generated by a self-adjoint extension of the translation invariant part is shown. With a non-perturbative approach it is proved that the Hamiltonian is essentially self-adjoint on a domain. A typical example is the Pauli–Fierz model with spin 1/2 in nonrelativistic quantum electrodynamics for arbitrary coupling constants.

Contents
1. Introduction
2. Quantum Field Models
2.1 Precise formulations
2.2 Main results
2.3 Examples of nonrelativistic quantum electrodynamics
3. Schrödinger Representations
3.1 Gaussian processes
3.2 Second quantizations
4. Functional Integral Representations
4.1 Stochastic integrals and moment inequalities
4.2 Functional integral representation of heat semigroups
5. Essential Self-Adjointness
5.1 Invariant domains
5.2 Translation-invariance
5.3 Proofs of theorems

* On leave of absence from Department of Mathematics, Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan. E-mail: [email protected]
** Present address: Department of Mathematics and Physics, Setsunan University, Osaka 572-8508, Japan. E-mail: [email protected]
1. Introduction

We consider the system of quantum particles minimally coupled to a quantized field with an ultraviolet cutoff. For a coupling constant $e$, the Hamiltonian of the system is formally defined as a densely defined symmetric operator on the tensor product of a Hilbert space and a Boson Fock space. The Hamiltonian has a translation-invariant part [4]. The purpose of this paper is to show the essential self-adjointness of the Hamiltonian for arbitrary coupling constants on a domain. A typical example is the system of atoms coupled to a quantized radiation field (photon), that is, the Pauli–Fierz model with spin 1/2 in nonrelativistic quantum electrodynamics ([3,5–11,17–20,26,32,34]). The Pauli–Fierz Hamiltonian $H_{\rm PF}$ (see (2.6)) is defined for ultraviolet cutoffs $\hat\rho$ satisfying $\lambda := \hat\rho/\sqrt{\omega} \in L^2(\mathbb{R}^3)$, $\omega := \omega(k) := |k|$, $k \in \mathbb{R}^3$. It is established in [1,2] that $H_{\rm PF}$ with the dipole approximation [23] is self-adjoint on a domain for arbitrary coupling constants under appropriate conditions. Moreover the properties of its spectrum are studied. The decoupled Hamiltonian $H_0$ is defined by $H_{\rm PF}$ with the coupling constant replaced by zero. Thus $H_{\rm PF} = H_0 + H_I$, where $H_I := H_{\rm PF} - H_0$. It is proved in [25] that, in the case where the coupling constant is sufficiently small and ultraviolet cutoffs satisfy $\lambda/\sqrt{\omega} \in L^2(\mathbb{R}^3)$, $H_I$ is $H_0$-bounded [28, p.162] with relative bound strictly smaller than one. In such a case the self-adjointness of $H_{\rm PF}$ follows from the Kato–Rellich theorem [28, Theorem X.12]. Moreover, under some additional conditions, it is shown in [6,17,18] that the ground states of $H_{\rm PF}$ exist and their expectations of the number of photons are finite. It is worthwhile to consider the question whether the ground states of $H_{\rm PF}$ exist for arbitrary coupling constants and/or ultraviolet cutoffs such that $\lambda/\sqrt{\omega} \notin L^2(\mathbb{R}^3)$. This is a crucial problem that is yet to be investigated. To tackle this problem we would first need to establish the essential self-adjointness of $H_{\rm PF}$ under the following conditions:
$\lambda/\sqrt{\omega} \notin L^2(\mathbb{R}^3)$ and/or $e$ : arbitrary. (1.1)
In this case we can not regard $H_{\rm PF}$ as the sum of $H_0$ and an $H_0$-bounded perturbation. Thus general perturbation theory [21] does not appear to be available. The main purpose of this paper is to find a core of $H_{\rm PF}$ in case (1.1) by use of a non-perturbative approach.

In this paper we consider generalized models such as $H_{\rm PF}$ with (1.1). We find cores of their Hamiltonians under appropriate conditions. The strategy is as follows: using a Schrödinger representation ([9,14]), we construct a functional integral representation ([12–14,16,19,30,31,33]) of the heat semigroup generated by a self-adjoint extension of the translation-invariant part of the Hamiltonian. Then under some conditions we show that there exists a domain $S_{\rm ess}$ which is invariant under the action of the heat semigroup and is contained in the domain of its infinitesimal generator. We see that the translation-invariant part is essentially self-adjoint on $S_{\rm ess}$ [28, Theorem X.49]. Finally, by an inequality [15] derived from the functional integral representation, we conclude that the Hamiltonian is essentially self-adjoint on a domain.

This paper is organized as follows: in Sect. 2 we give the definition of the generalized models and examples of the Pauli–Fierz model, and state the main theorems. Sect. 3 is devoted to a quick review of a probabilistic description of the generalized models in such a way that we can apply a functional integral representation. In Sect. 4 we derive a moment inequality of Hilbert-space-valued stochastic integrals and give a functional
integral representation of the heat semigroup stated above. In Sect. 5 we show the existence of an invariant domain under the action of the heat semigroup and give proofs of the main theorems stated in Sect. 2.

2. Quantum Field Models

2.1. Precise formulations. In this subsection we formulate the system of particles minimally coupled to a quantized field and define its Hamiltonian. The particles move in $\mathbb{R}^d$ ($d$ : fixed). We begin with introducing notation often used in this paper. Let $\mathcal{L}$ be a Hilbert space and $A$ an operator in $\mathcal{L}$. We write the domain of $A$ as $D(A)$. We denote by $(\cdot,\cdot)_{\mathcal{L}}$ (resp. $\|\cdot\|_{\mathcal{L}}$) the scalar product (resp. the norm) on $\mathcal{L}$. Note that $(f,g)_{\mathcal{L}}$ is antilinear in $f$ and linear in $g$. Let $\langle A\rangle_R := \oplus_{i=1}^R A$ in $\oplus_{i=1}^R \mathcal{L}$ with $D(\langle A\rangle_R) := \oplus_{i=1}^R D(A)$. We set $W_R := \oplus_{i=1}^R L^2(\mathbb{R}^d) \cong \int^{\oplus}_{\mathbb{R}^d} \mathbb{C}^R\,dk$, where $\int^{\oplus}_{\mathbb{R}^d}\cdots dk$ denotes a constant fiber direct integral [29, XIII.16]. The Fourier transformation on $L^2(\mathbb{R}^\nu)$ for $\nu = d, d+1$ is defined by $\vartheta_\nu f(y) := \int_{\mathbb{R}^\nu} f(x) e^{-iyx}\,dx/\sqrt{(2\pi)^\nu}$ and we set $\langle\vartheta_\nu\rangle_R F := \hat F$ for $F \in \oplus_{i=1}^R L^2(\mathbb{R}^\nu)$. We omit subscripts in $\|\cdot\|_{\mathcal{L}}$, $\langle\cdot\rangle_R$ and $\vartheta_\nu$, unless confusion arises. For a measurable function $m : \mathbb{R}^\nu \to \mathbb{C}$, $\nu \in \mathbb{N}$, we denote by the same symbol $m$ the multiplication operator by $m$ in $L^2(\mathbb{R}^\nu)$ with maximal domain.

We fix $D \in \mathbb{N}$. The Boson Fock space over $W_D$ is defined by $\mathcal{F} := \bigoplus_{n=0}^{\infty} \otimes_s^n W_D$, where $\otimes_s^n W_D$ is the $n$-fold symmetric tensor product of $W_D$ with $\otimes_s^0 W_D := \mathbb{C}$. We denote by $\mathcal{F}_{\rm fin}$ the set of finite particle vectors in $\mathcal{F}$ [28, p.208] and by $\Omega := \{1, 0, 0, \ldots\}$ the vacuum in $\mathcal{F}$. Moreover we denote by $a(f)$ and $a^\dagger(f)$, $f \in W_D$, the annihilation operator and the creation operator with domain $\mathcal{F}_{\rm fin}$, respectively, which leave $\mathcal{F}_{\rm fin}$ invariant, and satisfy for all $\Phi, \Psi \in \mathcal{F}_{\rm fin}$,
$[a(f), a^\dagger(g)]\Phi = (\bar f, g)_{W_D}\Phi$, (2.1)
$[a(f), a(g)]\Phi = [a^\dagger(f), a^\dagger(g)]\Phi = 0$, (2.2)
$(\Phi, a^\dagger(f)\Psi)_{\mathcal{F}} = (a(\bar f)\Phi, \Psi)_{\mathcal{F}}$, (2.3)
where $[A, B] := AB - BA$ and $\bar f$ is the complex conjugate of $f$. Since $a(f)$ and $a^\dagger(f)$ are closable, we denote their closed extensions by the same symbols. Define
$\phi(\eta) := \frac{1}{\sqrt{2}}\,\overline{\{a^\dagger(\hat\eta) + a(C\hat\eta)\}}$, $\eta \in W_D$,
where $\overline{T}$ denotes the closure of closable operator $T$ and $C\lambda := \tilde\lambda_1 \oplus \cdots \oplus \tilde\lambda_D$ with $\tilde\lambda_\mu = \tilde\lambda_\mu(k) := \lambda_\mu(-k)$, $\mu = 1, \ldots, D$. Let $\omega$ be a measurable function such that $\omega := \omega(k) > 0$ for almost every $k \in \mathbb{R}^d$, and $K_\mu$, $\mu = 1, \ldots, d$, a multiplication operator in $L^2(\mathbb{R}^d)$ defined by $(K_\mu f)(k) := k_\mu f(k)$. It is physically reasonable to take $\omega(k) = \sqrt{k^2 + m^2}$, $m \ge 0$; we do not, however, use such a specific form. The free Hamiltonian [28, p.220] in $\mathcal{F}$ is defined by the second quantization [27, p.302] of $\langle\omega\rangle_D$, $H_{\rm rad} := d\Gamma(\langle\omega\rangle_D)$, and the number operator [28, p.220] by $N_{\rm rad} := d\Gamma(\langle I\rangle_D)$, where $I$ is the identity operator on $L^2(\mathbb{R}^d)$.

We fix $S \in \mathbb{N}$. For each $k \in \mathbb{R}^d$ let $g(k)$ be a linear operator of $\mathbb{C}^S$ to $\mathbb{C}^D$. We define a linear operator of $W_S$ to $W_D$ by $G := \int^{\oplus}_{\mathbb{R}^d} g(k)\,dk$. We assume that $G$ is bounded with $\overline{Gf} = G\bar f$, and that the norm of $G$ is less than one, $\|GF\|_{W_D} \le \|F\|_{W_S}$, for simplicity. Define
$A(\rho) := \frac{1}{\sqrt{2}}\,\overline{\{a^\dagger(G\hat\rho) + a(GC\hat\rho)\}}$, $\rho \in W_S$.
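Let $\mathcal{K}$ be a Hilbert space. The Hilbert space $\mathcal{H}$ is given by $\mathcal{H} := \mathcal{K} \otimes L^2(\mathbb{R}^{dJ}) \otimes \mathcal{F} \cong \mathcal{K} \otimes \mathcal{H}_0$, where $\mathcal{H}_0 := L^2(\mathbb{R}^{dJ}) \otimes \mathcal{F} \cong \int^\oplus_{\mathbb{R}^{dJ}} \mathcal{F}\,dx$. Let $m(L)$ be the set of weakly measurable $L$-valued functions on $\mathbb{R}^d$ [27, Appendix to IV.5]. Let $\lambda \in W_R$ and $\rho \in m(W_R)$. We say that $\lambda$ is real if $\bar\lambda = \lambda$ and that $\rho$ is real if $\rho(x)$ is real pointwise. Set 0,

$$\lim_{n\to\infty} \int_0^T dt\, t^{-d/2} \int_{\mathbb{R}^d} \left\|\langle\omega^{l/2}\rangle\left(\hat\rho_\nu(x) - \hat\lambda_\nu^n(x)\right)\right\|^p_{W_S} e^{-|x|^2/2t}\,dx = 0. \qquad (2.4)$$

$[V_{q,p}]$ $\hat\rho_\nu \in C_b^1(\mathbb{R}^d; W_S)$ for $\nu = 1, \dots, d$ and $\mathrm{div}\hat\rho(x) \in D(\langle\omega^{q/2}\rangle)$ pointwise. Moreover $\langle\omega^{q/2}\rangle \mathrm{div}\hat\rho \in C_b^0(\mathbb{R}^d; W_S)$, where $\mathrm{div}\hat\rho := \sum_{\nu=1}^d \partial_\nu\hat\rho_\nu$.

$[U_0]$ $\hat\rho_\nu \in C_b^2(\mathbb{R}^d; W_S) \cap G$ for $\nu = 1, \dots, d$ and $\mathrm{div}\hat\rho(x) \in \mathrm{Ker}\, G$ pointwise.

$[U_1]$ $\hat\rho_\nu \in C_b^2(\mathbb{R}^d; W_S) \cap G$ for $\nu = 1, \dots, d$ and $\mathrm{div}\hat\rho(x) \in D(\langle\omega\rangle)$ pointwise with $\langle\omega\rangle \mathrm{div}\hat\rho \in C_b^0(\mathbb{R}^d; W_S)$.

Define $p_\mu^T := -i\sum_{j=1}^J \nabla_\mu^j$ and $P_{rad,\mu} := d\Gamma(\langle K_\mu\rangle)$, $\mu = 1, \dots, d$. Then $U_\mu(t) := e^{it(p_\mu^T \otimes I)} e^{it(I \otimes P_{rad,\mu})} = e^{it(I \otimes P_{rad,\mu})} e^{it(p_\mu^T \otimes I)}$, $\mu = 1, \dots, d$, is a one-parameter unitary group in $t \in \mathbb{R}$ in $\mathcal{H}_0$. Hence by Stone's theorem [27, Theorem VIII.8] there exists a self-adjoint operator $P_\mu$ in $\mathcal{H}_0$ such that $U_\mu(t) = \exp(itP_\mu)$, $t \in \mathbb{R}$, $\mu = 1, \dots, d$.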
Definition 2.2 (Translation-invariance). We say that $\hat\rho$ is translation-invariant if $\rho$ is real, $\rho \in m(W_S)$ and, for all $\mu = 1, \dots, d$ and $j = 1, \dots, J$,

$$\exp(itA_j(\rho)) \exp(isP_\mu) = \exp(isP_\mu) \exp(itA_j(\rho)), \quad t, s \in \mathbb{R}.$$

For a $d$-tuple $\rho = (\rho_1, \dots, \rho_d)$ we define $\hat\rho := (\hat\rho_1, \dots, \hat\rho_d)$. For $\hat\rho = (\hat\rho_1, \dots, \hat\rho_d)$ we say that $\hat\rho$ is translation-invariant if each $\hat\rho_\nu$, $\nu = 1, \dots, d$, is translation-invariant and that $\hat\rho$ is real if each $\hat\rho_\nu$, $\nu = 1, \dots, d$, is real. Moreover we write $\hat\rho \in (G)$ if each $\hat\rho_\nu \in G$, $\nu = 1, \dots, d$. Let $K := K_0 + I \otimes H_{rad}$.

Theorem 2.3 (Invariant domain I). Let $V = 0$. We assume that $\rho$ is real and that $\hat\rho \in (U_0 \cap U_{2,4}) \cup (U_1 \cap U_{2,4} \cap V_{2,4})$. Then there exists a nonnegative self-adjoint operator $\hat K$ in $\mathcal{H}_0$ such that

$$K\lceil_{D(\Delta \otimes I) \cap D(I \otimes H_{rad}) \cap D(I \otimes N_{rad})} \subset \hat K$$

and, for $t \ge 0$,

$$e^{-t\hat K}\left(C^\infty(I \otimes N_{rad}) \cap D(I \otimes H_{rad})\right) \subset C^\infty(I \otimes N_{rad}) \cap D(I \otimes H_{rad}).$$

We define $S_{ess} \subset \mathcal{H}_0$ by

$$S_{ess} := C^\infty(I \otimes N_{rad}) \cap D(I \otimes H_{rad}^2) \cap \bigcap_{\mu=1}^d \left\{ D\left((p_\mu^T)^2 \otimes I\right) \cap D\left((I \otimes H_{rad})(p_\mu^T \otimes I)\right) \right\},$$

where $C^\infty(A) := \cap_{k=1}^\infty D(A^k)$.

Theorem 2.4 (Invariant domain II). Let $V = 0$. We assume that $\hat\rho \in (U_0 \cap U_{4,8}) \cup (U_1 \cap U_{4,8} \cap V_{4,8})$ and that $\hat\rho$ is translation-invariant. Then $e^{-t\hat K} S_{ess} \subset S_{ess}$, $t \ge 0$.

We define the class $P$ of external potentials $V$ as follows:

$P$: $V$ is $\Delta$-bounded with relative bound strictly smaller than one, where $\Delta$ is the Laplacian in $L^2(\mathbb{R}^{dJ})$.
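Theorem 2.5 (Essential self-adjointness I). Let $J = 1$ and $V \in P$. We assume that $\hat\rho$ satisfies the same assumption as in Theorem 2.4 and that $\langle\omega^{n/2}\rangle\hat\eta_l \in C_b^0(\mathbb{R}^d; W_D)$, $n = -1, 0$, $l = 1, \dots, L$. Then $H$ is essentially self-adjoint on $\mathcal{K} \otimes S_{ess}$ and bounded below. In particular, $\mathcal{K} \otimes \{D(\Delta \otimes I) \cap D(I \otimes N_{rad}) \cap D(I \otimes H_{rad})\}$ is a core of $H$.

Theorem 2.6 (Essential self-adjointness II). Let $J = 1$ and $V \in P$. We assume that $\hat\rho$ satisfies the same assumption as in Theorem 2.4. Then $K_0$ is essentially self-adjoint on $S_{ess}$ and bounded below. In particular, $D(\Delta \otimes I) \cap D(I \otimes N_{rad}) \cap D(I \otimes H_{rad})$ is a core of $K_0$.

Let $\hat\rho = \oplus_{\nu=1}^S \hat\rho_\nu \in m(W_S)$ be translation-invariant. Then

$$\left(\Psi_1, e^{-isP_\mu} A_j(\rho) e^{isP_\mu} \Psi_2\right)_{\mathcal{H}} = \left(\Psi_1, A_j(\rho)\Psi_2\right)_{\mathcal{H}}$$

for $\Psi_1, \Psi_2 \in C_0^\infty(\mathbb{R}^{dJ}) \hat\otimes \mathcal{F}_{fin}$, $s \in \mathbb{R}$, $j = 1, \dots, J$, where $\hat\otimes$ denotes the algebraic tensor product. Hence one sees that $\hat\rho_\nu$ must be of the form

$$\hat\rho_\nu(x) = \hat\rho_\nu(k)(x) = e^{ikx}\hat\lambda_\nu(k) \qquad (2.5)$$

with $\hat\lambda_\nu \in L^2(\mathbb{R}^d)$, $\nu = 1, \dots, S$. Let $\hat\rho = (\hat\rho_1, \dots, \hat\rho_d)$ be translation-invariant. We give sufficient conditions for (1) $\hat\rho \in U_0 \cap U_{4,8}$ and (2) $\hat\rho \in U_1 \cap U_{4,8} \cap V_{4,8}$, respectively. By (2.5) we see that $\hat\rho_\mu(k)(x) = \oplus_{\nu=1}^S e^{ikx}\hat\lambda_{\mu\nu}(k)$ with $\hat\lambda_{\mu\nu} \in L^2(\mathbb{R}^d)$, $\mu = 1, \dots, d$. Then $\oplus_{\nu=1}^S \sum_{\mu=1}^d e^{ikx} k_\mu \hat\lambda_{\mu\nu}(k) \in \mathrm{Ker}\, G$ and $\hat\lambda_{\mu\nu} \in \cap_{k=0}^4 D(\omega^{k/2})$ imply (1), and $\hat\lambda_{\mu\nu} \in \cap_{k=0}^4 D(\omega^{k/2}) \cap D(\omega^3)$ implies (2).

2.3. Examples of nonrelativistic quantum electrodynamics. We shall give typical examples, $H_{PF}$, $K_{PF}$, of models considered in this paper. Set $J = 1$, $\mathcal{K} = \mathbb{C}^2$, $S = d = 3$, $D = d - 1 = 2$, $L = 3$ and $\omega(k) = |k|$. We define a bounded linear operator of $W_3$ to $W_2$ by

$$G(\lambda_1 \oplus \lambda_2 \oplus \lambda_3) := (e_1^1\lambda_1 + e_2^1\lambda_2 + e_3^1\lambda_3) \oplus (e_1^2\lambda_1 + e_2^2\lambda_2 + e_3^2\lambda_3),$$

where $\lambda_\mu \in L^2(\mathbb{R}^3)$, $\mu = 1, 2, 3$, and $e^r(k) := (e_1^r(k), e_2^r(k), e_3^r(k))$, $k \in \mathbb{R}^3$, $r = 1, 2$, are polarization vectors so that $e^r(k) \cdot e^s(k) = \delta_{rs}$, $e^r(k) \cdot k = 0$, $r, s = 1, 2$. The triplet $\hat\varrho = (\hat\rho_1, \hat\rho_2, \hat\rho_3)$ is defined by

$$\hat\rho_\nu(x) = \hat\rho_\nu(k)(x) = \overbrace{0 \oplus \cdots \oplus \underbrace{\lambda(k)e^{-ikx}/\sqrt{(2\pi)^3}}_{\text{the } \nu\text{th}} \oplus \cdots \oplus 0}^{3}, \quad \nu = 1, 2, 3,$$

where $\lambda := \hat\rho/\sqrt{\omega} \in L^2(\mathbb{R}^3)$. Set $B_\mu := (e/2)\sigma_\mu$, where $\sigma_\mu$, $\mu = 1, 2, 3$, denote the Pauli matrices, and

$$\hat\eta_\mu(x) = \hat\eta_\mu(k)(x) = \frac{-i\lambda(k)e^{-ikx}(k \times e^1(k))_\mu}{\sqrt{(2\pi)^3}} \oplus \frac{-i\lambda(k)e^{-ikx}(k \times e^2(k))_\mu}{\sqrt{(2\pi)^3}}, \quad \mu = 1, 2, 3.$$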
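The Pauli–Fierz Hamiltonian $H_{PF}$ is formally defined on $\mathbb{C}^2 \otimes L^2(\mathbb{R}^3) \otimes \mathcal{F}$ by

$$H_{PF} := I \otimes (K_{PF} + I \otimes H_{rad}) + \frac{e}{2}\sum_{\mu=1}^3 \sigma_\mu \otimes B_\mu, \qquad (2.6)$$

$$K_{PF} := \frac{1}{2}\sum_{\mu=1}^3 \left(-i\nabla_\mu \otimes I - eA(\rho_\mu)\right)^2 + V \otimes I,$$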
where ∇µ denotes the generalized partial differential operators in xµ and Z ⊕ n o 1 r r ˜ i·x eµ p a † ⊕2r=1 λe−i·x eµ + a ⊕2r=1 λe dx, A(ρµ ) := R3 2(2π)3 Z ⊕ o 1 n Bµ := √ a † ηˆ µ (x) + a C ηˆ µ (x) dx. 2 R3 Theorem 2.7. Let V ∈ P and ρ be a real-valued function. Let λ ∈ L2 (R3 ) and ω2 λ ∈ L2 (R3 ). Then we have the following: (1) HPF is essentially self-adjoint on C2 ⊗ Sess and bounded below. (2) KPF is essentially self-adjoint on Sess and bounded below. In particular, C2 ⊗ {D(1 ⊗ I ) ∩ D(I ⊗ Nrad ) ∩ D(I ⊗ Hrad )} is a core of HPF and D(1 ⊗ I ) ∩ D(I ⊗ Nrad ) ∩ D(I ⊗ Hrad ) a core of KPF . Theorem 2.8. Let ρ be a real-valued function. Let λ ∈ L2 (R3 ), ω2 λ ∈ √ V ∈ 2P and 2 3 3 L (R ) and λ/ ω ∈ L (R ). Then we have the following: (1) HPF is essentially selfadjoint on C2 ⊗ {D(1 ⊗ I ) ∩ (I ⊗ Hrad )}. (2) KPF is essentially self-adjoint on D(1 ⊗ I ) ∩ (I ⊗ Hrad ). 3. Schrödinger Representations 3.1. Gaussian processes. In this subsection we introduce Boson Fock space F0 and give Schrödinger representations of F and F0 ([9,14,16]). Let WR0 := L2 (R) ⊗ WR ∼ = R 0 2 d+1 ⊕i=1 L (R ), R ∈ N. The Boson Fock space over WD is denoted by F0 and the annihilation operator and the creation operator in F0 by b(F ) and b† (F ), F ∈ WD0 , respectively. b] (F ) satisfies (2.1), (2.2) and (2.3) with WD and F replaced by WD0 and e := I ⊗ G and F0 , respectively. We define S := d0(hI ⊗ ωi) and N := d0(hI i). Let G o 1 n e e ρ) ρ) ˆ + b(GC ˆ , A0 (ρ) := √ b† (G 2 o i n e eρ) ˆ − b† (G ˆ , ρ ∈ WS0 . E0 (ρ) := √ b(GC ρ) 2 Throughout this paper notation G\ (resp.G\ ) denotes G0 orG (resp.G0 or G). Positive \ semidefinite bilinear forms q\ on WS are given by q(f, g) := Gfˆ, Ggˆ , f, g ∈ WS , WD eFˆ , G eG ˆ , F, G ∈ WS0 . We set q\ (f, f ) := q\ (f ). Define K\ := and q0 (F, G) := G 0 W D o n \ \ \ f ∈ WS q\ (f ) = 0 . We denote by [f ]\ the equivalent class of f ∈ WS in WS /K\ . Set
([f ]\ , [g]\ )\ := q\ (f, g)/2. Thus R\ := {[f ]\ |f ∈ WS , f : real} is a real Hilbert space A\ ([f ]\ ), [f ]\ ∈ R\ } be the Gaussian process over a with scalar product (·, ·)\ . Let {b probability measure space (Q\ , µ\ ) indexed by vectors of R\ with covariance (·, ·)\ . We \ A\ ([f ]\ ). We define A\ (f ) := A\ ( 0, there exists b() > 0 such that
L
X
X
L
B ⊗ φ(η )9 = (B ⊗ I )(I ⊗ φ(η ))9 l l l l
(5.32)
l=1
l=1
b + b()k9k ≤ (I ⊗ K)9 b By a limiting argument, (5.32) extends to 9 ∈ D(I ⊗ K). b Thus ˆ K). for 9 ∈ K⊗D( PL b l=1 Bl ⊗ φ(ηl ) is infinitesimally small with respect to I ⊗ K in H. Hence V ⊗ I + PL b l=1 Bl ⊗ φ(ηl ) is I ⊗ K-bounded with relative bound strictly smaller than one since b+ V ∈ P. By the Kato–Rellich theorem [28, Theorem X.12] we conclude that I ⊗ K PL V ⊗ I + l=1 Bl ⊗ φ(ηl ) is essentially self-adjoint on K ⊗ Sess , which implies that H t is essentially self-adjoint on K ⊗ Sess . u bS replaced Proof of Theorem 2.6. One can also prove Lemmas 5.4 and 5.9 with K and H −ieA (Z) 0 b bSA θ −1 , and HSA , respectively. Thus K0 dD(1⊗I )∩D(I ⊗Hrad )∩D(I ⊗Nrad ) ⊂ θ H by e bSA θ −1 −tθ H Sess ⊂ Sess . Moreover, a similar argument as that of Sess ⊂ D(K0 ) and e Lemma 5.10 yields that b (HSA + E)−1 9 (x, A) ≤ ((−1/2)1 ⊗ I + E)−1 |9| (x, A) bSA θ −1 -bounded pointwise for E > 0 and almost every (x, A) ∈ Rd × Q. Thus V is θ H t in H0 with relative bound strictly smaller than one. Hence theorem follows. u Proof of Theorem 2.7. It is easily checked that the triplet %ˆ is translation-invariant and t that %ˆ ∈ U0 ∩ U4,8 . Thus Theorem 2.7 follows from Theorems 2.5 and 2.6. u
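The Kato–Rellich step used in the proof of Theorem 2.5 combines a relatively bounded term with an infinitesimally small one. The following is a sketch of that standard estimate only, not a reproduction of the paper's argument. The symbols $T$, $A$, $B$, $a$, $b$ are placeholders introduced here: we assume $T = I \otimes \hat K$, that $A = V \otimes I$ is $T$-bounded with relative bound $a < 1$ (which the text derives from $V \in P$), and that $B = \sum_l B_l \otimes \phi(\eta_l)$ is infinitesimally small with respect to $T$.

```latex
% Assumptions (placeholders): for all \Psi \in D(T),
%   \|A\Psi\| \le a\,\|T\Psi\| + b\,\|\Psi\|, \quad a < 1,
%   \|B\Psi\| \le \epsilon\,\|T\Psi\| + b(\epsilon)\,\|\Psi\| \quad \text{for every } \epsilon > 0.
\begin{align*}
\|(A+B)\Psi\| &\le \|A\Psi\| + \|B\Psi\| \\
              &\le (a+\epsilon)\,\|T\Psi\| + \bigl(b + b(\epsilon)\bigr)\,\|\Psi\| .
\end{align*}
% Choosing 0 < \epsilon < 1 - a gives relative bound a + \epsilon < 1, so the
% Kato--Rellich theorem applies: T + A + B is self-adjoint on D(T) and
% essentially self-adjoint on every core of T.
```

The final step of the proof then only needs $\mathcal{K} \otimes S_{ess}$ to be a core of $I \otimes \hat K$, which is what the invariant-domain theorems are used for.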
Proof of Theorem 2.8. We see that $D(\Delta \otimes I) \cap D(I \otimes H_{rad}) \subset D(H_{PF})$ and $D(\Delta \otimes I) \cap D(I \otimes H_{rad}) \subset D(K_{PF})$. Thus we can get the desired results by Theorem 2.7. □

Acknowledgements. I thank H. Spohn for the kind hospitality at Zentrum Mathematik, Technische Universität München. I am grateful to S. Albeverio for the kind hospitality at Universität Bonn where part of this work was done. I thank T. Shimizu for a careful reading of the manuscript. I also thank the Japan Society for the Promotion of Science (JSPS) for financial support through a postdoctoral fellowship.
References 1. Arai, A.: Self-adjointness and spectrum of Hamiltonians in nonrelativistic quantum electrodynamics. J. Math. Phys. 22, 534–537 (1981) 2. Arai, A.: On a model of a harmonic oscillator coupled to a quantized, massless, scalar field. I. J. Math. Phys. 22, 2539–2548 (1981) 3. Arai, A., Hirokawa, M., Hiroshima, F.: On the absence of eigenvectors of Hamiltonians in a class of massless quantum field models without infrared cutoff. J. Funct. Anal. 168, 470–497 (1999) 4. Avron, J.E., Herbst, I.W., Simon, B.: Separation of center of mass in homogeneous magnetic fields. Ann. Phys. 114, 431–451 (1978) 5. Bach, V., Fröhlich, J., Sigal, I.M.: Quantum electrodynamics of confined nonrelativistic particles. Adv. in Math. 137, 299–395 (1998) 6. Bach, V., Fröhlich, J., Sigal, I.M.: Spectral Analysis for systems of atoms and molecules coupled to the quantized radiation field. Commun. Math. Phys. 207, 249–290 (1999) 7. Blanchard, P.: Discussion mathématique du modéle de Pauli et Fierz relatif á catastrophe infrarouge. Commun. Math. Phys. 15, 156–172 (1969) 8. Fefferman, C.: On electrons and nuclei in a magnetic field. Adv. in Math. 124, 100–153 (1996) 9. Fefferman, C., Fröhlich, J., Graf, G.M.: Stability of ultraviolet-cutoff quantum electrodynamics with non-relativistic matter. Commun. Math. Phys. 190, 309–330 (1997) 10. Fröhlich, J.: On the infrared problem in a model of scalar electrons and massless, scalar bosons. Ann. Inst. Henri Poincaré 19, 1–103 (1973) 11. Fröhlich, J.: Existence of dressed one electron states in a class of persistent models. Fortschritte der Physik 22, 159–198 (1974) 12. Fröhlich, J. and Park, Y.M.: Correlation inequalities and thermodynamic limit for classical and quantum continuous systems II. Bose-Einstein and Fermi-Dirac statistics. J. Stat. Phys. 23, 701–753 (1980) 13. Glimm, J. and Jaffe, A.: Quantum Physics, New York: Springer-Verlag, 1987 14. Hiroshima, F.: Functional integral representation of a model in quantum electrodynamics. Rev. 
Math. Phys. 9, 489–530 (1997) 15. Hiroshima, F.: Diamagnetic inequalities for systems of nonrelativistic particles with a quantized field. Rev. Math. Phys. 8, 185–203 (1996) 16. Hiroshima, F.: Weak coupling limit and removing an ultraviolet cutoff for a Hamiltonian of particles and interacting with a quantized scalar field. J. Math. Phys. 40, 1215–1236 (1999) 17. Hiroshima, F.: Ground states and spectrum of non-relativistic quantum electrodynamics. Transaction of the AMS (to be published) 18. Hiroshima, F.: Ground states of a model in nonrelativistic quantum electrodynamics I. J. Math. Phys. 40, 6209–6222 (1999) 19. Hiroshima, F.: Ground states of a model in nonrelativistic quantum electrodynamics II. J. Math. Phys. 41, 661–674 (2000) 20. Hübner, M. and Spohn, H.: Radiative decay: Nonperturbative approaches. Rev. Math. Phys. 7, 363–387 (1995) 21. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–NewYork: Springer-Verlag, 1966 22. Kato, T. and Masuda, K.: Trotter’s product formula for nonlinear semigroups generated by the subdifferentiables of convex functionals. J. Math. Soc. Japan 30, 169–178 (1978) 23. van Kampen, N.: Contribution to the quantum theory of light scattering. Det Kongeliege Danske Videns. Selskab, Matt. Fys. Medd. 26 No. 15, 1–77 (1951) 24. Karatzas, I. and Shreve, S.E.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics 113, Berlin–Heidelberg–New York: Springer-Verlag, 1997 25. Okamoto, and Yajima, K.: Complex scaling technique in non-relativistic massive QED. Ann. Int. Henri. Poincaré 42, 311–327 (1985) 26. Pauli, W. and Fierz, M.: Zur Theorie der Emmision langwelliger Lichtquanten. Nuovo Cimento 15, 167–188 (1938)
27. Reed, M. and Simon, B.: Method of Modern Mathematical Physics Vol. I. New York: Academic Press, 1980 28. Reed, M. and Simon, B.: Method of Modern Mathematical Physics Vol. II. New York: Academic Press, 1975 29. Reed, M. and Simon, B.: Method of Modern Mathematical Physics Vol. IV. New York: Academic Press, 1978 30. Simon, B.: The P (φ)2 Euclidean (Quantum) Field Theory. Princeton, NJ: Princeton Univ. Press, 1974 31. Simon, B.: Functional Integral representation and Quantum Physics. New York: Academic Press, 1979 32. Skibsted, E.: Spectral analysis of N -body system coupled to a bosonic field. Rev. Math. Phys. 10, 989– 1026 (1998) 33. Spohn, H.: Effective mass of the polaron: A functional integral approach. Ann. Phys. 175, 278–318 (1987) 34. Spohn, H.: Asymptotic completeness for Rayleigh scattering. J. Math. Phys. 38, 2281–2296 (1997) 35. Yoshida, K.: Functional Analysis. Berlin–Heidelberg–New York: Springer-Verlag, 1980 Communicated by H. Araki
Commun. Math. Phys. 211, 615 – 628 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Quadratic Bosonic and Free White Noises*
Piotr Śniady
Instytut Matematyczny, Uniwersytet Wrocławski, pl. Grunwaldzki 2/4, 50-384 Wrocław, Poland. E-mail: [email protected]
Received: 23 August 1999 / Accepted: 8 December 1999

* Research partially supported by the Scientific Research Committee in Warsaw under grant number P03A05415.

Abstract: We discuss the meaning of renormalization used for deriving quadratic bosonic commutation relations introduced by Accardi [ALV] and find a representation of these relations on an interacting Fock space. Also, we investigate classical stochastic processes which can be constructed from noncommutative quadratic white noise. We postulate quadratic free white noise commutation relations and find their representation on an interacting Fock space.

1. Introduction

Hudson and Parthasarathy [HP] showed that a Brownian motion $B(T)$ can be represented as a sum of two noncommuting operators: annihilation $a_{(0,T)}$ and creation $a^\star_{(0,T)}$,

$$B(T) = a_{(0,T)} + a^\star_{(0,T)} = \int_0^T (a_t + a_t^\star),$$

where $a_t$ and $a_t^\star$ stand for the infinitesimal annihilation and creation operators respectively. Accardi [ALV], in order to study some physical problems, introduced quadratic white noise operators, which informally can be written as $n_t = a_t^\star a_t$, $b_t = (a_t)^2$ and $b_t^\star = (a_t^\star)^2$. The first one, called the number operator, has already been considered in the white noise calculus and does not cause serious difficulties. The other two, called quadratic annihilation and quadratic creation operators, in fact represent infinite quantities and therefore have to be redefined. Indeed, it can be shown that because of $[a_t, a_s^\star] = \delta(t - s)$ we have

$$[a_t^2, a_s^{\star 2}] = 2\delta^2(t - s) + 4\delta(t - s)\,a_s^\star a_t,$$
where $\delta$ denotes the Dirac distribution. Since the square of the delta function is not well defined, this relation is meaningless. Furthermore it is too singular to apply the subtraction renormalization [S]. By the renormalization $\delta^2(x) = \gamma_0\delta(x)$ Accardi postulates that the renormalized quadratic white noise operators should fulfill the following commutation relation:

$$[b_t, b_s^\star] = 2\gamma_0\delta(t - s) + 4\delta(t - s)\,n_s, \qquad (1)$$

which for smeared operators $b_\phi = \int \overline{\phi_t}\, b_t$, $b_\psi^\star = \int \psi_s b_s^\star$ takes the form

$$[b_\phi, b_\psi^\star] = 2\gamma_0\langle\phi, \psi\rangle + 4n_{\bar\phi\psi}. \qquad (2)$$
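The formal commutator of the squared white noise operators quoted in the Introduction can be checked in a few lines. The following is a worked derivation under the sole assumption $[a_t, a_s^\star] = \delta(t-s)$, treating the delta function as a formal symbol; it is a sketch for orientation and not part of the original argument.

```latex
\begin{align*}
[a_t, a_s^{\star 2}] &= [a_t, a_s^\star]\,a_s^\star + a_s^\star\,[a_t, a_s^\star]
                      = 2\delta(t-s)\,a_s^\star, \\
a_t^2\, a_s^{\star 2} &= a_t\bigl(a_s^{\star 2} a_t + 2\delta(t-s)\,a_s^\star\bigr) \\
  &= \bigl(a_s^{\star 2} a_t + 2\delta(t-s)\,a_s^\star\bigr)a_t
     + 2\delta(t-s)\bigl(a_s^\star a_t + \delta(t-s)\bigr) \\
  &= a_s^{\star 2} a_t^2 + 4\delta(t-s)\,a_s^\star a_t + 2\delta^2(t-s).
\end{align*}
% Hence [a_t^2, a_s^{\star 2}] = 2\delta^2(t-s) + 4\delta(t-s)\,a_s^\star a_t;
% replacing \delta^2 by \gamma_0\delta and a_s^\star a_t by n_s gives relation (1).
```

The ill-defined term $\delta^2$ appears only in the scalar part, which is why a single constant $\gamma_0$ is enough to renormalize the relation.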
In Sect. 2 we present another discussion of this relation in the more general context of q-deformed commutation relations. In Sect. 3 we show that, in the bosonic case, this discussion gives the renormalization constant $\gamma_0$ a meaning: it is the inverse of the lengthscale taken for the quadratic variation of a (noncommutative) Brownian motion. We also discuss other commutation relations. Furthermore, from quadratic white noise operators we construct some classical stochastic processes.

Accardi and Skeide [ALV,AS] have constructed a Fock representation of quadratic white noise relations. The construction presented in the paper [ALV] uses the Kolmogorov decomposition for a certain positive kernel. Another approach is presented in the paper [AS], where the construction of quadratic white noise operators is based on the theory of Hilbert modules. In Sects. 4 and 5 we present a direct construction of such a representation on an interacting Fock space. Our method is based on defining explicitly a scalar product on a symmetric Fock space.

In Sect. 6 we discuss the existence of a Fock representation of an algebra containing both quadratic and usual linear white noise operators. It turns out that in general it is not possible to find such a representation. The main reason is that below a certain lengthscale the renormalized quadratic operators lose their intuitive meaning as squares of creation and annihilation operators. In Sect. 7 we introduce free quadratic white noise operators, which should describe the squares of free creation and annihilation operators with a small violation of freeness, and construct their representation.

Both standard and quadratic white noises are weak processes, i.e. mappings from some linear space $S$ to operators on a Hilbert space. Contrary to white noise commutation relations, the quadratic relation (2) involves not only a scalar product in $S$, but a product of two elements of $S$ as well. From the noncommutative geometry viewpoint [C] it would be interesting to consider noncommutative spacetime algebras $S$ as well, and quadratic white noise relations provide appropriate examples. Unfortunately, for the bosonic white noise there seem to be some limitations on the choice of $S$, but for the free case the construction works for all associative algebras.

2. General Renormalized Quadratic White Noise

For a Hilbert space $\mathcal{H}$ and a real number $q$, $-1 < q \le 1$, let us consider q-deformed white noise operators [FB,BKS]: the creation $a_\phi^\star$ and its adjoint annihilation $a_\phi$ indexed by $\phi \in \mathcal{H}$. These operators fulfill the following commutation relation:

$$a_\phi a_\psi^\star - q\,a_\psi^\star a_\phi = \langle\phi, \psi\rangle. \qquad (3)$$
For the case $\mathcal{H} = L^2(M, d\mu)$ we can write informally

$$a_\phi = \int_M \overline{\phi(t)}\,a_t, \qquad a_\phi^\star = \int_M \phi(t)\,a_t^\star,$$

where $a_t$, $a_t^\star$ denote white noise annihilation and creation operators. Our goal is to introduce operators $b_\phi$ and $b_\phi^\star$ which would be informally treated as integrals of squares of white noise operators

$$b_\phi = \int_M \overline{\phi(t)}\,(a_t)^2, \qquad b_\phi^\star = \int_M \phi(t)\,(a_t^\star)^2.$$

In order to give meaning to these expressions let us consider a sequence $(I_i)$ of disjoint measurable subsets of $M$, each of the same measure $l$, and a sequence $(\chi_i)$ of orthogonal functions

$$\chi_i(x) = \begin{cases} 1 & : x \in I_i \\ 0 & : x \notin I_i \end{cases}.$$

Furthermore let us consider piecewise constant functions $\phi$, $\psi$,

$$\phi(x) = \sum_i \phi(x_i)\chi_i(x), \qquad \psi(x) = \sum_i \psi(x_i)\chi_i(x),$$

for a sequence $(x_i)$ such that $x_i \in I_i$. Now let us define

$$b_\phi = \sum_i \overline{\phi(x_i)}\left(a_{\frac{1}{\sqrt{l}}\chi_i}\right)^2, \qquad b_\psi^\star = \sum_i \psi(x_i)\left(a^\star_{\frac{1}{\sqrt{l}}\chi_i}\right)^2.$$

A simple computation shows that for squares of creation and annihilation operators the relation

$$a_\zeta^2 a_\xi^{\star 2} - q^4 a_\xi^{\star 2} a_\zeta^2 = (1 + q)\langle\zeta, \xi\rangle^2 + q(1 + q)^2\langle\zeta, \xi\rangle\,a_\xi^\star a_\zeta$$

holds. For this reason we have

$$b_\phi b_\psi^\star - q^4 b_\psi^\star b_\phi = (1 + q)\sum_i \psi(x_i)\overline{\phi(x_i)} + q(1 + q)^2 \sum_i \psi(x_i)\overline{\phi(x_i)}\; a^\star_{\frac{1}{\sqrt{l}}\chi_i} a_{\frac{1}{\sqrt{l}}\chi_i}.$$

Since the $L^2(M, d\mu)$ norm of the function $\frac{1}{\sqrt{l}}\chi_i$ is equal to 1, the operator $a^\star_{\frac{1}{\sqrt{l}}\chi_i} a_{\frac{1}{\sqrt{l}}\chi_i}$ is a number operator. If we consider only the creation and annihilation operators $a_\theta^\star$, $a_\theta$ for functions $\theta$ that are piecewise constant on the sets $I_i$, then the operators $a^\star_{\frac{1}{\sqrt{l}}\chi_i} a_{\frac{1}{\sqrt{l}}\chi_i}$ and $\int_{I_i} a_t^\star a_t$ have the same commutation relations with the others and therefore they are indistinguishable in the sense of vacuum expectation values. Under these assumptions we can write

$$b_\phi b_\psi^\star - q^4 b_\psi^\star b_\phi = \frac{1 + q}{l}\int_M \psi(x)\overline{\phi(x)}\,d\mu(x) + q(1 + q)^2 \int_M \psi(t)\overline{\phi(t)}\,a_t^\star a_t. \qquad (4)$$

The preceding calculations hold only for a very limited class of functions $\phi$ and $\psi$. However, we shall postulate the following commutation relation between quadratic creation and annihilation operators for all $\phi$ and $\psi$:

$$b_\phi b_\psi^\star - q^4 b_\psi^\star b_\phi = \gamma\langle\phi, \psi\rangle + c\,n_{\bar\phi\psi}, \qquad (5)$$

for some constants $\gamma$, $c$, where $n_f$, called a number operator, should be understood as a generalization of the usual number operator $\int_M f(t)\,a_t^\star a_t$.
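The relation for squares of q-deformed creation and annihilation operators can be sanity-checked numerically in the simplest situation, a single mode with $\zeta = \xi$ of norm one. The sketch below is an illustration only: it assumes the standard one-mode q-Fock representation $a|n\rangle = \sqrt{[n]_q}\,|n-1\rangle$ with $[n]_q = 1 + q + \dots + q^{n-1}$, truncated to finitely many levels, and the function names are hypothetical.

```python
import numpy as np


def q_number(n, q):
    """q-integer [n]_q = 1 + q + ... + q^(n-1)."""
    return sum(q ** j for j in range(n))


def q_annihilation(dim, q):
    """Truncated one-mode q-annihilation matrix: a|n> = sqrt([n]_q) |n-1>."""
    a = np.zeros((dim, dim))
    for n in range(1, dim):
        a[n - 1, n] = np.sqrt(q_number(n, q))
    return a


def quadratic_relation_defect(dim, q):
    """Largest deviation from
        a^2 (a*)^2 - q^4 (a*)^2 a^2 = (1+q) + q (1+q)^2 a* a
    on the levels that are not affected by the truncation."""
    a = q_annihilation(dim, q)
    ad = a.T  # the creation operator is the transpose (entries are real)
    lhs = a @ a @ ad @ ad - q ** 4 * (ad @ ad @ a @ a)
    rhs = (1 + q) * np.eye(dim) + q * (1 + q) ** 2 * (ad @ a)
    keep = dim - 2  # (a*)^2 leaves the truncated space on the top two levels
    return np.max(np.abs((lhs - rhs)[:keep, :keep]))


if __name__ == "__main__":
    for q in (1.0, 0.5, 0.0, -0.5):
        print(q, quadratic_relation_defect(8, q))
```

For $q = 1$ this reduces to the bosonic identity $[a^2, a^{\star 2}] = 2 + 4a^\star a$, the one-mode shadow of the relation renormalized in (1).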
2.1. Fock representations. Hudson–Parthasarathy's operators $a_\phi$, $a_\phi^\star$ ($\phi \in \mathcal{H}$) are usually represented as operators acting on some Hilbert space with a cyclic vector $\Omega$ with the property that $a_\phi\Omega = 0$ for all $\phi \in \mathcal{H}$. Since the operators $b_\phi$, $b_\phi^\star$ are interpreted as smeared renormalised squares of white noise operators $a_t$, $a_t^\star$, it is natural to ask if it is possible to find a representation of operators $b_\phi$, $b_\phi^\star$, $n_\phi$ acting on some Hilbert space $\Gamma^2$ such that $\Gamma^2$ contains a cyclic vector $\Omega$, called a vacuum, such that $b_\phi\Omega = 0$, $n_\phi\Omega = 0$ for all $\phi$. In such a setup we will be able to define a state $\tau$ on the space of operators acting on $\Gamma^2$ by $\tau(X) = \langle\Omega, X\Omega\rangle$, which would play the role of a (noncommutative) expectation value.

3. Bosonic Quadratic White Noise

3.1. Bosonic commutation relations. For the bosonic case $q = 1$ Eq. (4) takes the form

$$[b_\phi, b_\psi^\star] = 2\gamma_0\langle\phi, \psi\rangle + 4n_{\bar\phi\psi}, \qquad (6)$$

where $\gamma_0 = \frac{1}{l}$. Furthermore, we postulate that two creation, two annihilation and two number operators should commute:

$$[b_\phi, b_\psi] = 0, \qquad [b_\phi^\star, b_\psi^\star] = 0, \qquad [n_\phi, n_\psi] = 0. \qquad (7)$$

A simple calculation for piecewise constant functions

$$\left[\sum_i \phi(x_i)\, a^\star_{\frac{1}{\sqrt{l}}\chi_i} a_{\frac{1}{\sqrt{l}}\chi_i},\ \sum_j \psi(x_j)\left(a^\star_{\frac{1}{\sqrt{l}}\chi_j}\right)^2\right] = 2\sum_k \psi(x_k)\phi(x_k)\left(a^\star_{\frac{1}{\sqrt{l}}\chi_k}\right)^2$$

gives us a motivation for the following commutation relations:

$$[n_\phi, b_\psi^\star] = 2b^\star_{\phi\psi}, \qquad [b_\psi, n_\phi] = 2b_{\psi\phi^\star}. \qquad (8)$$

3.2. Classical quadratic processes. By the spectral theorem a commuting family of normal operators has a common spectral measure. After applying a state the spectral measure becomes an ordinary measure which has a natural probabilistic interpretation as a joint distribution of random variables corresponding to operators from our family. Let us define for $s \in \mathbb{R}$,

$$Q_s(\phi) = b_{\phi^\star} + b_\phi^\star + s\,n_\phi. \qquad (9)$$

Similarly to the white noise it is a weak process [S], i.e. an operator valued function on a linear space $L^2(M, d\mu) \cap L^\infty(M, d\mu)$. In the case $M = \mathbb{R}_+$ we can construct from it a stochastic process $Q_s(t) = Q_s(\chi_{(0,t)})$.

Theorem 1. Let us fix $s \in \mathbb{R}$. Then $\{Q_s(\phi)\}$ forms a commuting family of normal operators and therefore it is a classical stochastic process. With respect to the expectation value $\tau$, it is a Markovian process.
Proof. The first part of the proof is a simple application of (6)–(8). The property that $Q_s$ is a process with independent increments means exactly that for all disjoint sets $M_1, M_2 \subset M$ and $f_i \in \mathrm{Alg}\{Q_s(\phi) : \phi \in L^2(M_i)\}$ the equality $\tau(f_1 f_2) = \tau(f_1)\tau(f_2)$ holds. Note that every expression containing operators $n_\phi$, $b_\phi$, $b_\phi^\star$ ($\phi \in \mathcal{H}$) can be written according to the relations (6), (8) in the normal form, a linear combination of products of type

$$b^\star_{\phi_1} \cdots b^\star_{\phi_k}\, n_{\chi_1} \cdots n_{\chi_m}\, b_{\psi_1} \cdots b_{\psi_l}. \qquad (10)$$

Each of the operators $n_{\phi_1}$, $b_{\phi_1}$, $b^\star_{\phi_1}$ commutes with each of the operators $n_{\phi_2}$, $b_{\phi_2}$, $b^\star_{\phi_2}$ for $\phi_i \in L^2(M_i, d\mu)$. Therefore a product of two expressions of the form (10), one being an element of $\mathrm{Alg}\{n_\phi, b_\phi, b_\phi^\star : \phi \in L^2(M_1)\}$ and the other an element of $\mathrm{Alg}\{n_\phi, b_\phi, b_\phi^\star : \phi \in L^2(M_2)\}$, is, up to a permutation of factors, in a normally ordered form. The state $\tau$ has the property that on normally ordered products it takes nonzero values only on multiples of the identity, and $\tau(f_1 f_2) = \tau(f_1)\tau(f_2)$ follows. Now it is enough to notice that the expectation value of $Q_s(\phi)$ is equal to 0 for any $\phi$. □

3.3. Quadratic variation of a Brownian motion. Let $M = \mathbb{R}_+$ and let us consider an arithmetic series $(t_i)$, $t_i = li$. For the sum of squares of increments of a standard Brownian motion the following operator equality holds:

$$\sum_i \phi(t_i)[B(t_{i+1}) - B(t_i)]^2 = \sum_i \phi(t_i)[a_{\chi_i} + a^\star_{\chi_i}]^2 = \sum_i \phi(t_i)\left[a_{\chi_i}^2 + 2a^\star_{\chi_i}a_{\chi_i} + a^{\star 2}_{\chi_i} + (t_{i+1} - t_i)\right],$$

where $\chi_i$ is the characteristic function of the interval $(t_i, t_{i+1})$. In the preceding discussion we have chosen the commutation relations between the operators $l\,b_{\chi_i}$, $l\,b^\star_{\chi_j}$ and $l\,n_{\chi_k}$ to coincide with the commutation relations between $a_{\chi_i}^2$, $a^{\star 2}_{\chi_j}$ and $a^\star_{\chi_k}a_{\chi_k}$ whenever the length of the intervals is equal to $l = \frac{1}{\gamma_0}$. Therefore, for any function $\phi$ which is piecewise constant on the intervals $(t_i, t_{i+1})$ we can write

$$\sum_i \phi(t_i)[B(t_{i+1}) - B(t_i)]^2 = \sum_i \phi(t_i)\left\{\frac{1}{l}\left[b_{\chi_i} + b^\star_{\chi_i} + 2n_{\chi_i}\right] + (t_{i+1} - t_i)\right\} = \frac{1}{l}Q_2(\phi) + \int_{\mathbb{R}_+}\phi(x)\,dx.$$

This equation can be viewed as follows. Just as $a_\phi$, $a_\phi^\star$ are quantum components of the Brownian motion, for functions $\phi$ which are piecewise constant on intervals whose length is a multiple of $\frac{1}{\gamma_0}$ the operators $b_\phi$, $b_\phi^\star$, $2n_\phi$ are quantum components of the quadratic variation of Brownian motion. The constant $\frac{1}{\gamma_0}$ describes the lengthscale below which such an interpretation is no longer valid. The measures corresponding to $\gamma_0 Q_2(t) + t = \gamma_0 Q_2(\chi_{(0,t)}) + t$ for $t$ a multiple of $\frac{1}{\gamma_0}$ are therefore the $\chi^2$ distributions. From this it follows that for arbitrary $t$ these are gamma distributions and $\gamma_0 Q_2(t) + t$ is a gamma process.
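The classical side of this statement is easy to illustrate by simulation. The sketch below only uses the standard fact that increments of a Brownian motion over disjoint intervals of length $l$ are independent $N(0, l)$ variables, so that the sum of $k$ squared increments divided by $l$ has a $\chi^2$ distribution with $k$ degrees of freedom (mean $k$, variance $2k$). It makes no claim about the normalization of the operators $b_\phi$, $n_\phi$; the function names and parameter values are hypothetical.

```python
import random


def quadratic_variation_samples(k, l, n_samples, seed=0):
    """Sample sum_{i<k} (B(t_{i+1}) - B(t_i))^2 / l for t_i = l * i.

    Each increment of a standard Brownian motion over an interval of
    length l is drawn as an independent N(0, l) variable.
    """
    rng = random.Random(seed)
    sigma = l ** 0.5
    samples = []
    for _ in range(n_samples):
        total = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(k))
        samples.append(total / l)
    return samples


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


if __name__ == "__main__":
    k, l = 5, 0.25
    m, v = mean_and_variance(quadratic_variation_samples(k, l, 20000))
    # A chi-square law with k degrees of freedom has mean k and variance 2k.
    print("mean:", m, "variance:", v)
```

Refining the grid at fixed total time $t = kl$ sends $k$ to infinity, and the rescaled sum concentrates at $t$; the nontrivial gamma law appears only because the lengthscale $l = 1/\gamma_0$ is kept fixed.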
4. Quadratic Bosonic White Noise on an Interacting Fock Space

Let $A$ be a commutative $C^\star$-algebra of continuous functions on some set $M$ with a measure $\mu$ and let the state on $A$ induced by $\mu$ be denoted also by $\mu$.

Definition 1. A partition of a finite set $A$ is a collection $\pi = \{\pi_1, \dots, \pi_m\}$ of nonempty sets $\pi_p$, which are pairwise disjoint and whose union is equal to $A$. An ordered partition of a finite set $A$ is a set $\pi = \{\pi_1, \dots, \pi_m\}$ of nonempty sequences $\pi_p = (\pi_{p1}, \dots, \pi_{p,n_p})$, such that the family of sets $\{\pi_{p1}, \dots, \pi_{p,n_p}\}$, $1 \le p \le m$, forms a partition of $A$.

For a fixed positive constant $\gamma_0$ let us consider a vector space $\tilde\Gamma_b^2(A) = \bigoplus_{n \ge 0} A^{\hat\otimes n}$ (where $A^{\hat\otimes n}$ denotes the symmetric tensor power) with a sesquilinear form defined by

$$\langle\chi_1 \otimes \cdots \otimes \chi_k, \psi_1 \otimes \cdots \otimes \psi_l\rangle = \delta_{kl}\,\frac{2^k}{k!} \sum_{\{\pi_1, \dots, \pi_m\}}\ \prod_{1 \le p \le m} \frac{\gamma_0}{n_p}\, \mu\left(\chi^\star_{\pi_{p1}}\psi_{\pi_{p1}} \cdots \chi^\star_{\pi_{pn_p}}\psi_{\pi_{pn_p}}\right), \qquad (11)$$

where the sum is taken over all ordered partitions $\pi$ of the set $\{1, \dots, k\}$.

Please note that this sesquilinear form is well-defined on the full tensor power $A^{\otimes n}$; however, we shall usually use it on the symmetric tensor power $A^{\hat\otimes n}$. In the sum $\tilde\Gamma_b^2 = \bigoplus_{n \ge 0} A^{\hat\otimes n}$ there appears a summand $A^{\hat\otimes 0}$, which should be understood as a one-dimensional Hilbert space $\mathbb{C}\Omega$, where $\Omega$ is a unital vector called the vacuum. The Hilbert space $\Gamma_b^2$, a completion of $\tilde\Gamma_b^2$, will be called the bosonic quadratic Fock space. In the following by $A^{\hat\otimes k}$ we shall mean the completion of the symmetric tensor power $A^{\hat\otimes k}$ with respect to the scalar product (11).

Question 1. For the sesquilinear form (11) all algebraic considerations of this section hold even if the algebra $A$ is not commutative. In this case we only have to assume that the state $\mu$ is tracial and we have to replace the number operator (14) by a pair of left and right number operators. Unfortunately, in this general situation the form (11) is not always positive definite. Is it possible to find some nontrivial examples of noncommutative algebras $A$ with tracial states $\mu$ such that (11) is positive definite?

For $\psi \in A$ we define the action of the quadratic creation, annihilation and number operators on simple tensors by

$$b_\psi^\star(\chi_1 \otimes \cdots \otimes \chi_k) = \sum_{0 \le i \le k} \chi_1 \otimes \cdots \otimes \chi_i \otimes \psi \otimes \chi_{i+1} \otimes \cdots \otimes \chi_k, \qquad (12)$$

$$b_\psi(\chi_1 \otimes \cdots \otimes \chi_k) = 2\gamma_0\,\mu(\psi^\star\chi_1)\,\chi_2 \otimes \cdots \otimes \chi_k + 2\sum_{2 \le i \le k} \chi_2 \otimes \cdots \otimes \chi_{i-1} \otimes (\chi_i\psi^\star\chi_1) \otimes \chi_{i+1} \otimes \cdots \otimes \chi_k, \qquad (13)$$

$$n_\psi(\chi_1 \otimes \cdots \otimes \chi_k) = \sum_{1 \le i \le k} \chi_1 \otimes \cdots \otimes \chi_{i-1} \otimes (\psi\chi_i) \otimes \chi_{i+1} \otimes \cdots \otimes \chi_k, \qquad (14)$$

for $k \ge 1$ and their action on the vacuum by

$$b_\psi^\star(\Omega) = \psi, \qquad b_\psi(\Omega) = 0, \qquad n_\psi(\Omega) = 0. \qquad (15)$$
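Definitions (12)–(14) can be tried out in a finite-dimensional toy model. The sketch below is an assumption-laden illustration, not the construction of the paper: it takes $A$ to be the algebra of complex functions on a finite set of $m$ points with $\mu(f) = \sum_x w_x f(x)$ for hypothetical positive weights `w`, represents an element of the $k$-th tensor power as a NumPy array with $k$ axes, and checks on symmetric tensors the relation $[b_\phi, b_\psi^\star] = 2\gamma_0\langle\phi,\psi\rangle + 4n_{\phi^\star\psi}$ stated in Theorem 4. All function names are made up for the example.

```python
import itertools

import numpy as np


def b_star(psi, T):
    """Quadratic creation (12): insert psi into each of the k+1 tensor slots."""
    T = np.asarray(T, dtype=complex)
    k, m = T.ndim, len(psi)
    out = np.zeros((m,) * (k + 1), dtype=complex)
    for i in range(k + 1):
        shape = [1] * (k + 1)
        shape[i] = m
        out += psi.reshape(shape) * np.expand_dims(T, axis=i)
    return out


def b(psi, T, w, gamma0):
    """Quadratic annihilation (13) on a tensor of rank k >= 1."""
    T = np.asarray(T, dtype=complex)
    k, m = T.ndim, len(psi)
    if k == 0:
        # By (15) the vacuum is annihilated; this sketch has no rank "-1" zero.
        raise ValueError("apply b only to tensors of rank >= 1")
    # First term: 2*gamma0*mu(psi^* chi_1) chi_2 (x) ... (x) chi_k.
    out = 2 * gamma0 * np.tensordot(w * np.conj(psi), T, axes=(0, 0))
    # Second term: multiply chi_1 and psi^* pointwise into the i-th factor.
    for j in range(1, k):
        diag = np.diagonal(T, axis1=0, axis2=j)  # diagonal axis comes last
        diag = np.moveaxis(diag, -1, j - 1)      # put it back into slot j-1
        shape = [1] * (k - 1)
        shape[j - 1] = m
        out = out + 2 * np.conj(psi).reshape(shape) * diag
    return out


def n(psi, T):
    """Number operator (14): multiply each tensor factor by psi in turn."""
    T = np.asarray(T, dtype=complex)
    out = np.zeros_like(T)
    for i in range(T.ndim):
        shape = [1] * T.ndim
        shape[i] = len(psi)
        out += psi.reshape(shape) * T
    return out


def symmetrize(T):
    """Average a tensor of rank >= 1 over all permutations of its axes."""
    perms = list(itertools.permutations(range(T.ndim)))
    return sum(np.transpose(T, p) for p in perms) / len(perms)


def commutator_defect(phi, psi, T, w, gamma0):
    """Max deviation from [b_phi, b*_psi] T = (2 gamma0 <phi,psi> + 4 n_{phi^* psi}) T
    for a symmetric tensor T of rank >= 1."""
    lhs = b(phi, b_star(psi, T), w, gamma0) - b_star(psi, b(phi, T, w, gamma0))
    inner = np.sum(w * np.conj(phi) * psi)  # <phi, psi> = mu(phi^* psi)
    rhs = 2 * gamma0 * inner * T + 4 * n(np.conj(phi) * psi, T)
    return np.max(np.abs(lhs - rhs))
```

The check is meaningful only on symmetric tensors: the operators (12)–(14) are written on simple tensors but, as the text notes, they are used on the symmetric tensor power.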
Please note that simple tensors are in general not elements of the symmetric tensor power of $A$. However, by linearity these definitions extend to a dense subspace of the symmetric tensor power $A^{\hat\otimes n}$. What is important, the range of the operators $b_\phi : A^{\hat\otimes n} \to A^{\hat\otimes(n-1)}$, $b_\phi^\star : A^{\hat\otimes n} \to A^{\hat\otimes(n+1)}$, $n_\phi : A^{\hat\otimes n} \to A^{\hat\otimes n}$ is again a symmetric power of $A$.

A difficulty arises from the fact that the operators defined in this way are not bounded. For example we shall not claim that $b_\phi^\star$ is an adjoint of $b_\phi$, because such a statement is not easy to prove: it demands a careful discussion of domains of operators. It seems that in order to do this we would have to define these operators on some analogue of the exponential domain of Hudson and Parthasarathy [HP] in a less intuitive way. Similarly, commutation relations will hold only in a restricted sense.

Theorem 2. Operators $b_\phi$, $b_\phi^\star$, $n_\phi$ fulfill the following operator norm estimates with respect to the scalar product (11):

$$\left\|b_\phi : A^{\hat\otimes k} \to A^{\hat\otimes(k-1)}\right\| \le \sqrt{2k}\left(\sqrt{\gamma_0}\,\|\phi\|_{L^2} + (k - 1)\|\phi\|_{L^\infty}\right), \qquad (16)$$

$$\left\|b_\phi^\star : A^{\hat\otimes(k-1)} \to A^{\hat\otimes k}\right\| \le \sqrt{2k}\left(\sqrt{\gamma_0}\,\|\phi\|_{L^2} + (k - 1)\|\phi\|_{L^\infty}\right), \qquad (17)$$

$$\left\|n_\phi : A^{\hat\otimes k} \to A^{\hat\otimes k}\right\| \le k\,\|\phi\|_{L^\infty}. \qquad (18)$$

Proof. Let us consider a map $A^{\otimes k} \to A^{\otimes(k-1)}$ defined on simple tensors by $\psi_1 \otimes \cdots \otimes \psi_k \mapsto 2\gamma_0\langle\phi, \psi_1\rangle\,\psi_2 \otimes \cdots \otimes \psi_k$. It is easy to see that the operator norm of this map does not exceed $\sqrt{2k\gamma_0}\,\|\phi\|_{L^2}$. And now, for any $i$ let us consider a map $A^{\otimes k} \to A^{\otimes(k-1)}$ defined on simple tensors by $\psi_1 \otimes \cdots \otimes \psi_k \mapsto 2\psi_2 \otimes \cdots \otimes \psi_{i-1} \otimes (\psi_i\phi^\star\psi_1) \otimes \psi_{i+1} \otimes \cdots \otimes \psi_k$. It is easy to see that the operator norm of this map does not exceed $\sqrt{2k}\,\|\phi\|_{L^\infty}$. The sum of these maps is equal to $b_\phi$, which shows the estimate (16). The estimate (17) follows from (16) because $b_\phi$ is an adjoint of $b_\phi^\star$, which will be proven in Theorem 3, and therefore their norms are equal. The inequality (18) is obvious. □

This theorem allows us to define the action on $A^{\hat\otimes k}$ of the operators $b_\phi$, $b_\phi^\star$ for all $\phi \in L^2(M, d\mu) \cap L^\infty(M, d\mu)$ and of the operator $n_\phi$ for all $\phi \in L^\infty(M, d\mu)$.
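Theorem 3. For any $\zeta \in L^2(M, d\mu) \cap L^\infty(M, d\mu)$ the operators $b_\zeta$ and $b_\zeta^\star$ are adjoint in the sense that

$$\langle b_\zeta\Psi, \Phi\rangle = \langle\Psi, b_\zeta^\star\Phi\rangle$$

for all $\Psi \in A^{\hat\otimes k}$, $\Phi \in A^{\hat\otimes l}$. For any $\phi \in L^\infty(M, d\mu)$ the adjoint of $n_\phi$ is equal to $n_{\phi^\star}$ in the sense that

$$\langle n_\phi\Psi, \Phi\rangle = \langle\Psi, n_{\phi^\star}\Phi\rangle$$

for all $\Psi \in A^{\hat\otimes k}$, $\Phi \in A^{\hat\otimes l}$.

Proof. Let us consider $\Psi = \sum_M \psi_0^M \otimes \cdots \otimes \psi_{k-1}^M \in A^{\hat\otimes k}$ and $\Phi = \sum_N \phi_1^N \otimes \cdots \otimes \phi_l^N \in A^{\hat\otimes l}$. Since $\Psi$ is a symmetric tensor, the value of the scalar product $\left\langle\Psi, \sum_N \phi_1^N \otimes \cdots \otimes \phi_{i-1}^N \otimes \zeta \otimes \phi_i^N \otimes \cdots \otimes \phi_l^N\right\rangle$ does not depend on $i$. This implies that

$$\langle\Psi, b^\star(\zeta)\Phi\rangle = (l + 1)\langle\Psi, \zeta \otimes \Phi\rangle.$$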
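We can split the sum in the definition (11) of $\langle \Psi, \zeta \otimes \Phi\rangle$ into two parts: over ordered partitions π of the set {0, 1, . . . , k − 1} which contain a block consisting of the single element 0, and over all the other ordered partitions. Since the state µ is tracial we have
\[ \langle \Psi, \zeta \otimes \Phi\rangle = \delta_{k,l+1} \frac{2^k}{k!} \sum_M \sum_N \Bigg[ \gamma_0\, \mu(\psi_0^{M\star} \zeta) \sum_{\{\pi_1,\dots,\pi_m\}} \prod_{1\le p\le m} \frac{\gamma_0}{n_p}\, \mu\big( \psi^{M\star}_{\pi_{p1}} \phi^{N}_{\pi_{p1}} \cdots \psi^{M\star}_{\pi_{p,n_p}} \phi^{N}_{\pi_{p,n_p}} \big) \]
\[ \qquad + \sum_{\{\pi_1,\dots,\pi_m\}} \sum_{1\le q\le m} \gamma_0\, \mu\big( \psi_0^{M\star} \zeta\, \psi^{M\star}_{\pi_{q1}} \phi^{N}_{\pi_{q1}} \cdots \psi^{M\star}_{\pi_{q,n_q}} \phi^{N}_{\pi_{q,n_q}} \big) \times \prod_{1\le p\le m,\ p\ne q} \frac{\gamma_0}{n_p}\, \mu\big( \psi^{M\star}_{\pi_{p1}} \phi^{N}_{\pi_{p1}} \cdots \psi^{M\star}_{\pi_{p,n_p}} \phi^{N}_{\pi_{p,n_p}} \big) \Bigg], \]
where the sums over π are taken over all ordered partitions π of the set {1, . . . , k − 1}. Note that for any nonempty subset A of the set {1, . . . , k − 1} we have
\[ \sum_{\pi_q} \mu\big( \psi_0^{M\star} \zeta\, \psi^{M\star}_{\pi_{q1}} \phi^{N}_{\pi_{q1}} \cdots \psi^{M\star}_{\pi_{q,n_q}} \phi^{N}_{\pi_{q,n_q}} \big) = \sum_{\pi_q} \sum_{1\le r\le n_q} \frac{1}{n_q}\, \mu\big( \psi^{M\star}_{\pi_{q1}} \phi^{N}_{\pi_{q1}} \cdots \psi^{M\star}_{\pi_{q,r-1}} \phi^{N}_{\pi_{q,r-1}} \big( \psi^{M}_{\pi_{q,r}} \zeta^\star \psi_0^{M} \big)^{\star} \phi^{N}_{\pi_{q,r}}\, \psi^{M\star}_{\pi_{q,r+1}} \phi^{N}_{\pi_{q,r+1}} \cdots \psi^{M\star}_{\pi_{q,n_q}} \phi^{N}_{\pi_{q,n_q}} \big), \]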
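where the sums are taken over all sequences $\pi_q = (\pi_{q,1}, \dots, \pi_{q,n_q})$ such that each of the elements of A appears in $\pi_q$ exactly once. Now, it is easy to see that
\[ \langle \Psi, b_\zeta^\star \Phi\rangle = 2\delta_{k,l+1} \Bigg[ \gamma_0 \sum_M \langle \psi_0^M, \zeta\rangle \langle \psi_1^M \otimes \cdots \otimes \psi_k^M, \Phi\rangle + \sum_i \sum_M \big\langle \psi_1^M \otimes \cdots \otimes \psi_{i-1}^M \otimes \psi_i^M \zeta^\star \psi_0^M \otimes \psi_{i+1}^M \otimes \cdots \otimes \psi_k^M, \Phi \big\rangle \Bigg], \]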
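which proves the first part of the theorem. The proof of the fact that the adjoint of $n_\phi$ is equal to $n_{\phi^\star}$ is very simple and we shall omit it. □
Theorem 4. For any $\phi, \psi \in L^2(A) \cap L^\infty(A)$, $\zeta, \eta \in L^\infty(A)$ and $\Phi \in A^{\hat\otimes k}$ we have
\[ [b_\phi, b_\psi]\Phi = 0, \qquad [b_\phi^\star, b_\psi^\star]\Phi = 0, \tag{19} \]
\[ [b_\phi, b_\psi^\star]\Phi = \big( 2\gamma_0 \langle \phi, \psi\rangle + 4 n_{\phi^\star \psi} \big)\Phi, \tag{20} \]
\[ [n_\zeta, b_\psi^\star]\Phi = 2 b^\star_{\zeta\psi}\Phi, \qquad [b_\psi, n_\zeta]\Phi = 2 b_{\zeta^\star \psi}\Phi. \tag{21} \]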
Proof. Since the definitions of creation quadratic operators and standard creation operators coincide, two quadratic creation operators commute. Quadratic annihilation operators are their adjoints so they commute with each other as well.
Let us consider two auxiliary annihilation operators
\[ \hat b_\psi(\chi_1 \otimes \cdots \otimes \chi_k) = 2\gamma_0\, \mu(\psi^\star \chi_1)\, \chi_2 \otimes \cdots \otimes \chi_k, \]
\[ \tilde b_\psi(\chi_1 \otimes \cdots \otimes \chi_k) = 2 \sum_{2\le i\le k} \chi_2 \otimes \cdots \otimes \chi_{i-1} \otimes (\chi_i \psi^\star \chi_1) \otimes \chi_{i+1} \otimes \cdots \otimes \chi_k. \]
We have $b_\psi = \hat b_\psi + \tilde b_\psi$. The definition of $\hat b$ coincides up to a factor with the definition of the standard annihilation operator, therefore $[\hat b_\phi, b_\psi^\star] = 2\gamma_0 \langle \phi, \psi\rangle$. It is easy to see that there are exactly two terms in the commutator which do not cancel:
\[ [\tilde b_\phi, b_\psi^\star](\chi_1 \otimes \cdots \otimes \chi_k) = 2\gamma_0 (\psi \phi^\star \chi_1 + \chi_1 \phi^\star \psi) \otimes \chi_2 \otimes \cdots \otimes \chi_k, \]
which is equal to the action of $4\gamma_0 n_{\psi\phi^\star}$. If we do not assume that A is commutative we have to replace n by an appropriate sum of left and right multiplication operators. □
5. Another Representation of the Quadratic Bosonic Fock Space
The construction from the previous subsection can be presented in a more direct way. Let us consider an isomorphism $C(M) \otimes \cdots \otimes C(M) = C_{\mathrm{alg}}(M \times \cdots \times M)$, where $C_{\mathrm{alg}}(M^n)$ denotes the space of continuous functions on $M^n = M \times \cdots \times M$ which are finite sums of simple tensors. The multiplication map $A^{\otimes n} \ni x_1 \otimes \cdots \otimes x_n \mapsto x_1 \cdots x_n \in A$ under this isomorphism is equal to the diagonal map $C_{\mathrm{alg}}(M^n) \ni f \mapsto \Delta f \in C(M)$, where $(\Delta f)(x) = f(x, x, \dots, x)$ for any $x \in M$. For any ordered partition $\pi = \{\pi_1, \dots, \pi_k\}$ of the set {1, . . . , n} let $\Delta_\pi : M^k \to M^n$ be an embedding of $M^k$ onto the diagonal of $M^n$ defined by the partition π: $\Delta_\pi(x_1, \dots, x_k) = (y_1, \dots, y_n)$, where $y_r = x_s$ for $r \in \pi_s$. $\Delta_\pi^\star(\mu^{\otimes m})$ denotes the pull-back of the measure $\mu^{\otimes m}$ on $M^m$ onto a multidiagonal of $M^k$ defined by $\Delta_\pi$, namely
\[ \int_{M^k} f(x_1, \dots, x_k)\, d\Delta_\pi^\star(\mu^{\otimes m}) = \int_{M^m} f[\Delta_\pi(y_1, \dots, y_m)]\, d\mu(y_1) \cdots d\mu(y_m). \]
Note that although the function $\Delta_\pi$ depends on the choice of the order of blocks of the partition π, the pull-back measure $\Delta_\pi^\star(\mu^{\otimes m})$ does not depend on it. Therefore the scalar product (11) can be represented as
\[ \langle \Phi, \Psi\rangle = \delta_{kl} \int_{M^k} \overline{\Phi(x_1, \dots, x_k)}\, \Psi(x_1, \dots, x_k)\, d\mu_k(x_1, \dots, x_k) \]
for $\Phi \in C_{\mathrm{alg}}(M^k)$, $\Psi \in C_{\mathrm{alg}}(M^l)$, where the measure $\mu_k$ on $M^k$ is given by
\[ \mu_k = \frac{2^k}{k!} \sum_{\{\pi_1,\dots,\pi_m\}} \frac{\gamma_0^m}{|\pi_1| \cdots |\pi_m|}\, \Delta_\pi^\star(\mu^{\otimes m}). \]
In other words: the measure $\mu_k$ on $M^k$ is a sum of the product measure on $M^k$ and of product measures with supports on all multidiagonals of $M^k$. The operators defined in the last section are represented in this context as follows:
\[ (b_\phi^\star \Psi)(x_1, \dots, x_{n+1}) = \sum_i \phi(x_i)\, \Psi(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_{n+1}), \tag{22} \]
\[ (b_\phi \Psi)(x_1, \dots, x_n) = 2\gamma_0 \int_M \phi(x_{n+1})\, \Psi(x_1, \dots, x_{n+1})\, d\mu(x_{n+1}) + 2 \sum_i \phi(x_i)\, \Psi(x_1, x_2, \dots, x_{i-1}, x_i, x_i, x_{i+1}, \dots, x_n), \tag{23} \]
\[ (n_\phi \Psi)(x_1, \dots, x_n) = \Psi(x_1, \dots, x_n) \sum_i \phi(x_i). \tag{24} \]
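The representation (22)–(24) can be tried out numerically. The following is a minimal sketch, assuming a finite set M with m points, a weight vector `mu` standing in for the measure µ, and real-valued functions (so that all stars can be ignored); the function names `b_star`, `b` and `number_op` are hypothetical and chosen only for this illustration. It lets one check the commutation relation (20) on a symmetric two-variable function.

```python
import numpy as np

def b_star(phi, Psi):
    """Creation operator (22): sum_i phi(x_i) * Psi(all variables except x_i)."""
    m, n = phi.size, Psi.ndim
    out = np.zeros((m,) * (n + 1))
    for i in range(n + 1):
        shape = [1] * (n + 1)
        shape[i] = m
        # phi lives on axis i, Psi on the remaining axes
        out += phi.reshape(shape) * np.expand_dims(Psi, axis=i)
    return out

def b(phi, Psi, mu, gamma0):
    """Annihilation operator (23); Psi must depend on n + 1 >= 1 variables."""
    m, n = phi.size, Psi.ndim - 1
    # 2 * gamma0 * integral of phi(x_{n+1}) Psi(x_1..x_{n+1}) against mu
    out = 2.0 * gamma0 * np.tensordot(Psi, mu * phi, axes=([n], [0]))
    # 2 * sum_i phi(x_i) Psi(x_1, .., x_i, x_i, .., x_n)
    for i in range(n):
        diag = np.moveaxis(np.diagonal(Psi, axis1=i, axis2=i + 1), -1, i)
        shape = [1] * n
        shape[i] = m
        out = out + 2.0 * phi.reshape(shape) * diag
    return out

def number_op(phi, Psi):
    """Operator (24): multiplication by sum_i phi(x_i)."""
    m, n = phi.size, Psi.ndim
    total = np.zeros((m,) * n)
    for i in range(n):
        shape = [1] * n
        shape[i] = m
        total = total + phi.reshape(shape)
    return Psi * total
```

For a symmetric Ψ one can then compare `b(phi, b_star(psi, Psi), mu, gamma0) - b_star(psi, b(phi, Psi, mu, gamma0))` with `2 * gamma0 * (mu * phi * psi).sum() * Psi + 4 * number_op(phi * psi, Psi)`.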
6. Quadratic and Linear Bosonic White Noise
It is natural to ask if it is possible to incorporate both quadratic white noise operators $b_\phi$, $b_\phi^\star$, $n_\phi$ and linear white noise operators $a_\phi$, $a_\phi^\star$ into the same algebra. We postulate relations (6)–(8) of quadratic white noise, a relation of white noise $[a_\phi, a_\psi^\star] = \langle \phi, \psi\rangle$, and some relations linking quadratic and linear noises, among which we shall mention only
\[ [a_\phi, b_\psi^\star] = 2 a^\star_{\phi^\star \psi}, \qquad [b_\phi, a_\psi^\star] = 2 a_{\phi\psi^\star}. \]
We shall prove now that in general it is impossible to find a Fock representation of these relations. Let X be a measurable subset of M. Let $0 < \mu(X) = l < \infty$ and let $\chi(x) = 1$ for $x \in X$ and $\chi(x) = 0$ otherwise. By rewriting operators in the normal order we have for any $c \in \mathbb{R}$,
\[ \langle (c a_\chi^\star a_\chi^\star + b_\chi^\star)\Omega, (c a_\chi^\star a_\chi^\star + b_\chi^\star)\Omega\rangle = \langle \Omega, (c a_\chi a_\chi + b_\chi)(c a_\chi^\star a_\chi^\star + b_\chi^\star)\Omega\rangle = 2c^2 \langle \chi, \chi\rangle^2 + 2\gamma_0 \langle \chi, \chi\rangle + 2c \langle \chi^2, \chi\rangle + 2c \langle \chi, \chi^2\rangle = 2c^2 l^2 + 4cl + 2\gamma_0 l. \]
It is easy to see that for l
π/ log 4 (instability). From a mathematical point of view this model has been studied by Bach et al [2]. (It was even possible to consider the case when arbitrarily many nuclei and a magnetic field is present [1].) In particular these authors showed that the energy is indeed positive in the range predicted by Chaix et al. The objective of the present paper is to show that this stability bound is optimal: Our main result, Theorem 1, does not only give a proof of the existence of a stability bound as predicted by Chaix et al, it even gives a quantitative improvement of the value, in fact we obtain the sharp value. ? ©2000 The authors. Reproduction of this article for non-commercial purposes by any means is permitted.
⋆⋆ Present address: Mathematics 253-37, Caltech, Pasadena, CA 91125, USA. E-mail: [email protected]
D. Hundertmark, N. Röhrl, H. Siedentop
In Sect. 2 we fix our notation and formulate our instability result. In order to prove the instability we pick a suitable charge density matrix in Sect. 3. In Sect. 4 we estimate the difference of our trial density matrix and a simpler but non-normalizable “density matrix” suggested by Chaix et al. The remaining chapters contain the evaluation of the energy of our density matrix exhibiting a manifestly negative main term dominating the remainders. 2. Statement of the Main Result In order to formulate our result properly we fix some notations. We introduce the following convention: x = (x, s) is an element of G := R3 × {1, 2, 3, 4} and dx is the product measure of G (Lebesgue measure in the first factor and counting measure in the second factor). The same applies to the variables y, p, q, p0 , q 0 , and ξ . We will use this notation without further notice throughout the paper except for Sect. 6.2. Let D = −iα · ∇ + mβ denote the free Dirac operator of a particle of mass m acting in L2 (R3 ) ⊗ C4 . In momentum space it is given by the 4 × 4-matrix multiplication operator
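\[ D(\mathbf{p}) = \begin{pmatrix} m & \boldsymbol\sigma \cdot \mathbf{p} \\ \boldsymbol\sigma \cdot \mathbf{p} & -m \end{pmatrix}, \]
where $\boldsymbol\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are the three Pauli matrices
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]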
Let $H_+$ and $H_- = H_+^\perp$ denote the positive and negative spectral subspaces of $D(\mathbf{p})$ (as multiplication operator in $H := L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$), and $P_+$ and $P_-$ denote the corresponding projections. An operator γ on $L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$ is called a charge density matrix if it is self-adjoint, trace class, and fulfills the inequality
\[ -P_- \le \gamma \le P_+ . \tag{1} \]
The particle number of a charge density matrix γ is the sum of the electron number $N_e(\gamma)$ and the positron number $N_p(\gamma)$, i.e., $N(\gamma) := N_e(\gamma) + N_p(\gamma) = \operatorname{tr} P_+ \gamma P_+ - \operatorname{tr} P_- \gamma P_-$. We denote the set of charge density matrices γ with finite kinetic energy, i.e., $\operatorname{tr}(|D||\gamma|) < \infty$, by X. These density matrices also have a finite total energy which is, in our case, given by the Hartree–Fock functional $E_\alpha : X \to \mathbb{R}$,
\[ E_\alpha(\gamma) = \operatorname{tr}(D\gamma) + \alpha D(\rho_\gamma, \rho_\gamma) - \frac{\alpha}{2} \int_{G \times G} dx\, dy\, \frac{|\gamma(x, y)|^2}{|\mathbf{x} - \mathbf{y}|}, \tag{2} \]
where $D(f, g) := (1/2) \int_{\mathbb{R}^6} d\mathbf{x}\, d\mathbf{y}\, f(\mathbf{x}) g(\mathbf{y}) |\mathbf{x} - \mathbf{y}|^{-1}$ is the Coulomb scalar product. The kernel of the trace class operator γ can be written as $\gamma(x, y) := \sum_{i \in \mathbb{Z}} w_i\, e_i(x) \overline{e_i(y)}$, where the $e_i$ are eigenfunctions and the $w_i$ are eigenvalues of γ. Finally, $\rho_\gamma(x) := \sum_{i \in \mathbb{Z}} w_i |e_i(x)|^2$ is the charge density. The terms in this functional are, from left to right, the kinetic energy $E_{\mathrm{kin}}$, the direct energy $W_D$, and the exchange energy $W_X$.
Instability
631
A system is called stable if its energy is bounded from below by a constant times the number of particles N. It is in fact known that the energy is positive if α ≤ 4/π (Bach et al [2]). The objective of the present paper is to show that this bound is sharp, i.e., we will prove that this bound on the stability is critical: beyond 4/π the system becomes unstable. More precisely, we will show that even the energy per particle can become arbitrarily negative:
Theorem 1. If $\alpha > \alpha_c := 4/\pi$, then
\[ \inf_{\gamma \in X} \frac{E_\alpha(\gamma)}{N(\gamma)} = -\infty. \]
The overall strategy of our proof is straightforward: we construct a sequence of charge density matrices giving arbitrarily negative energies per particle. However, the difficulty in implementing the strategy is twofold: Firstly, the localization to finite volume requires care since naive multiplication with a cutoff function immediately conflicts with the positivity and negativity condition (1). Secondly, it is necessary to improve the construction of Chaix et al [4] to get the sharp expression for the leading term of the energy per volume.
3. An Ansatz for a Minimizing Charge Density Matrix
3.1. Approximation of the Vacuum Density Matrix. The free Dirac operator is diagonalized by the unitary matrix
\[ R(\mathbf{p}) := \begin{pmatrix} c_p & -s_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_p \\ s_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_p & c_p \end{pmatrix}, \]
i.e.,
\[ R^*(\mathbf{p}) D(\mathbf{p}) R(\mathbf{p}) = \begin{pmatrix} E(\mathbf{p}) & 0 \\ 0 & -E(\mathbf{p}) \end{pmatrix}, \]
where $\boldsymbol\omega_p := \mathbf{p}/|\mathbf{p}|$ for any non-vanishing vector $\mathbf{p} \in \mathbb{R}^3$. We used the abbreviations $E(\mathbf{p}) := (\mathbf{p}^2 + m^2)^{1/2}$, $c_p := \big((E(\mathbf{p}) + m)/2E(\mathbf{p})\big)^{1/2} = \cos\theta_p$ and $s_p := \big((E(\mathbf{p}) - m)/2E(\mathbf{p})\big)^{1/2} = \sin\theta_p$.
We decompose H into the first two components, $\tilde H_+ := \{(f_1, f_2, 0, 0) \in H\}$, and the last two, $\tilde H_- := \{(0, 0, f_3, f_4) \in H\}$. Then the operator $R^*$ is a unitary transformation on H such that $R^*(H_+) = \tilde H_+$ and $R^*(H_-) = \tilde H_-$. In particular condition (1) for γ is equivalent to
\[ -\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \le R^* \gamma R \le \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \]
and $P_-$ is given explicitly by
\[ P_- = R \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} R^*. \]
Moreover, any unitary operator T on $\tilde H_+ \oplus \tilde H_-$ defines a “dressed” vacuum by means of the projector
\[ \tilde P_- := R T^* \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} T R^*. \]
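The diagonalization claimed above is easy to confirm numerically at a single momentum. The sketch below is only an illustration: the helper names `sigma_dot`, `dirac_symbol` and `rotation_R` are hypothetical, and the momentum and mass used to try it out are arbitrary.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v."""
    return sum(vi * s for vi, s in zip(v, PAULI))

def dirac_symbol(p, m):
    """4x4 matrix D(p) = [[m, sigma.p], [sigma.p, -m]]."""
    I2 = np.eye(2)
    return np.block([[m * I2, sigma_dot(p)], [sigma_dot(p), -m * I2]])

def rotation_R(p, m):
    """Unitary R(p) built from c_p = cos(theta_p) and s_p = sin(theta_p)."""
    p = np.asarray(p, dtype=float)
    energy = np.sqrt(p @ p + m * m)
    c = np.sqrt((energy + m) / (2 * energy))
    s = np.sqrt((energy - m) / (2 * energy))
    so = sigma_dot(p / np.linalg.norm(p))
    I2 = np.eye(2)
    return np.block([[c * I2, -s * so], [s * so, c * I2]])
```

With these helpers, `rotation_R(p, m).conj().T @ dirac_symbol(p, m) @ rotation_R(p, m)` should agree with `diag(E, E, -E, -E)` up to rounding, and `R @ diag(0, 0, 1, 1) @ R*` should be a projection onto the negative spectral subspace.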
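We will use the rotation given by
\[ T(\mathbf{p}) = \begin{pmatrix} \cos\eta(\mathbf{p}) & \sin\eta(\mathbf{p})\, \boldsymbol\sigma \cdot \boldsymbol\omega_p \\ -\sin\eta(\mathbf{p})\, \boldsymbol\sigma \cdot \boldsymbol\omega_p & \cos\eta(\mathbf{p}) \end{pmatrix}, \]
whose angle $\eta(\mathbf{p})$ is given by a real valued function $\eta : \mathbb{R}^3 \to \mathbb{R}$. We will write $\tilde s_p$ for $\sin\eta(\mathbf{p})$ and $\tilde c_p$ for $\cos\eta(\mathbf{p})$. Chaix et al [4] used the matrix $\gamma_C := \tilde P_- - P_-$ to argue for the upper bound π/ ln 4 for $\alpha_c$. This can also be written as $\gamma_C = R F R^*$ with
\[ F(\mathbf{p}) = T^*(\mathbf{p}) \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} T(\mathbf{p}) - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \tilde s_p^2 & -\tilde s_p \tilde c_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_p \\ -\tilde s_p \tilde c_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_p & -\tilde s_p^2 \end{pmatrix}. \tag{3} \]
It obviously fulfills (1). However, the problem with this choice is that $\gamma_C$ is not a charge density matrix: Being a multiplication operator in momentum space, it is translationally invariant and therefore either vanishes or has no finite number of particles, infinite kinetic, and infinite exchange energy. To circumvent this problem we pick
\[ \gamma := R \chi_{a,r} F \chi_{a,r} R^*, \tag{4} \]
where we introduce a smooth “characteristic function” on the ball of radius r. (The parameter a controls the smoothness.) For large r and small a this approximates $\gamma_C$. Technically not only the careful choice of the cutoff function $\chi_{a,r} : \mathbb{R}^3 \to \mathbb{R}$ will be important but also the placement of the cutoff in between R and F is essential: Because of the order of the operators, the cutoff does not mix electrons and positrons and thus γ also satisfies (1). This property is not fulfilled by many of the obvious regularizations of $\gamma_C$. This will become clear in the next section. Finally, let us note that in momentum space γ has the integral kernel
\[ \gamma(\mathbf{p}, \mathbf{q}) := \int_{\mathbb{R}^3} d\boldsymbol\xi\, \Gamma(\boldsymbol\xi, \mathbf{p}, \mathbf{q}), \tag{5} \]
where
\[ \Gamma(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) := R(\mathbf{p}) (2\pi)^{-3/2} \hat\chi_{a,r}(\mathbf{p} - \boldsymbol\xi)\, F(\boldsymbol\xi)\, (2\pi)^{-3/2} \hat\chi_{a,r}(\boldsymbol\xi - \mathbf{q})\, R^*(\mathbf{q}). \tag{6} \]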
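The closed form (3) and the claim that F fulfills (1) after conjugation with R can be checked pointwise in momentum space. This is a small sketch under the assumption of a fixed momentum and a fixed angle η; `sigma_dot` and `dressed_F` are hypothetical helper names introduced only here.

```python
import numpy as np

PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v."""
    return sum(vi * s for vi, s in zip(v, PAULI))

def dressed_F(p, eta):
    """F(p) = T*(p) diag(0,0,1,1) T(p) - diag(0,0,1,1) for a rotation angle eta."""
    p = np.asarray(p, dtype=float)
    so = sigma_dot(p / np.linalg.norm(p))
    I2 = np.eye(2)
    T = np.block([[np.cos(eta) * I2, np.sin(eta) * so],
                  [-np.sin(eta) * so, np.cos(eta) * I2]])
    P_lower = np.diag([0, 0, 1, 1]).astype(complex)
    return T.conj().T @ P_lower @ T - P_lower
```

Since $F + \mathrm{diag}(0,0,1,1) = T^* \mathrm{diag}(0,0,1,1) T$ is a projection, both $F + \mathrm{diag}(0,0,1,1)$ and $\mathrm{diag}(1,1,0,0) - F$ are nonnegative, which is the pointwise form of condition (1).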
3.2. The Choice of Cutoff and Rotation. Let $B(\mathbf{x}, r)$ denote the ball of radius r centered at $\mathbf{x}$ and $B_r$ the characteristic function of $B(0, r)$. Choose a non-vanishing function $h \in C_0^\infty(\mathbb{R}^3, \mathbb{R})$ which is spherically symmetric with $\operatorname{supp} h = B(0, \tfrac12)$. Let $g := (\hat h / \|\hat h\|_2)^2$ and $g_a(\mathbf{x}) := a^{-3} g(\mathbf{x}/a)$, $a \in \mathbb{R}_+$. With this choice $g_a$ is positive, since by the symmetry of h its Fourier transform $\hat h$ is real. Moreover $\|g_a\|_1 = 1$. Finally, we define our cutoff as
\[ \chi_{a,r} := B_r * g_{ar}. \tag{7} \]
At this point we should remark that simpler cutoffs, e.g., a characteristic function in momentum space, also have the properties we need. But in the end we get an instability bound of $\alpha_c \|\chi_{a,r}\|_2^2 / \|\chi_{a,r}\|_4^4$, whereas our choice yields a factor 1 for a → 0 as we will prove in the following lemma.
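Lemma 1.
1. For all r > 0, 1 ≤ p < ∞ we have $0 \le \chi_{a,r} \le 1$ and $\|\chi_{a,r}\|_p^p = r^3 \|\chi_{a,1}\|_p^p$.
2. The quotient $Q := \|\chi_{a,r}\|_4^4 / \|\chi_{a,r}\|_2^2$ is bounded above by 1, independent of r, and its limit for a → 0 is 1.
3. The cutoff has compact support in momentum space; more precisely we have $\operatorname{supp} \hat\chi_{a,r} \subset B(0, \rho)$ with $\rho := (ar)^{-1}$.
Proof. 1. This is obvious, since $\chi_{a,r}(\mathbf{x}) = \chi_{a,1}(\mathbf{x}/r)$.
2. We have Q ≤ 1, because $\|\chi_{a,r}\|_4^4 \le \|\chi_{a,r}\|_2^2$ by 1. Furthermore, we have
\[ \frac{\|\chi_{a,r}\|_4^4}{\|\chi_{a,r}\|_2^2} = \frac{r^3 \|B_1 * g_a\|_4^4}{r^3 \|B_1 * g_a\|_2^2}, \]
which shows the independence of r and also shows the convergence result, since $g_a$ is an approximate δ function.
3. The Fourier transform of $\chi_{a,r}$ is $\hat\chi_{a,r} = (2\pi)^{3/2} \hat B_r\, \hat g_{ar}$. It is sufficient to look at the support of $\hat g_{ar}$. By definition $\hat g = \|hh\|_1^{-1} (2\pi)^{-3/2}\, \hat{\hat h} * \hat{\hat h}$; thus $\operatorname{supp} \hat g \subset B(0, 1)$. Using the identity
\[ \hat g_{ar}(\mathbf{p}) = \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^3} e^{-i\mathbf{p}\mathbf{x}} \frac{1}{(ar)^3}\, g\Big(\frac{\mathbf{x}}{ar}\Big)\, d\mathbf{x} = \frac{1}{(2\pi)^{3/2}} \int e^{-iar\mathbf{p}\mathbf{x}} g(\mathbf{x})\, d\mathbf{x} = \hat g(ar\mathbf{p}), \]
we get $\operatorname{supp} \hat g_{ar} \subset B(0, (ar)^{-1})$. □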
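Now, we choose the angle. For $\lambda, c, d \in \mathbb{R}_+$ with $2\lambda^{-1} \le c < d$ and λ ≤ 1 set
\[ \eta = \eta_{\lambda,c,d} : \mathbb{R}^3 \to [0, \pi/2], \tag{8} \]
\[ \eta_{\lambda,c,d}(\boldsymbol\xi) = \begin{cases} \arcsin\big((|\boldsymbol\xi|\lambda)^{-2}\big) & c \le |\boldsymbol\xi| \le d, \\ 0 & \text{elsewhere.} \end{cases} \tag{9} \]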
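This choice will eventually lead us to an instability problem that is similar to the one treated by Evans et al [5], Formulae (15) ff, as will become clear later. It implies in particular $\tilde s_\xi \le 1/4$. Thus Γ as defined in (6) has $\operatorname{supp} \Gamma \subset D_\rho$ with
\[ D_\rho := \{ (\boldsymbol\xi, \mathbf{p}, \mathbf{q}) \in \mathbb{R}^9 \mid |\boldsymbol\xi|, |\mathbf{p}|, |\mathbf{q}| \in [c - 2\rho, d + 2\rho],\ |\mathbf{p} - \boldsymbol\xi| \le \rho,\ |\mathbf{q} - \boldsymbol\xi| \le \rho \}. \tag{10} \]
We pick $\rho \le \rho_{\max} := 1/3$. With this choice of η, γ is trace class and thus finally a charge density matrix. More precisely, its number of particles is
\[ N(\gamma) = \operatorname{tr} P_+ \gamma P_+ - \operatorname{tr} P_- \gamma P_- = (2\pi)^{-3} \int d\mathbf{q}\, d\boldsymbol\xi\, 2\tilde s_\xi^2\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 - (2\pi)^{-3} \int d\mathbf{q}\, d\boldsymbol\xi\, (-2)\tilde s_\xi^2\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 \]
\[ = 4 (2\pi)^{-3} r^3 \|\chi_{a,1}\|_2^2 \int d\boldsymbol\xi\, \tilde s_\xi^2 = 2\pi^{-2} r^3 \|\chi_{a,1}\|_2^2\, \lambda^{-4} (c^{-1} - d^{-1}) \tag{11} \]
using
\[ R^* P_+ = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} R^*, \quad\text{and}\quad R^* P_- = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} R^*. \]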
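The last step of (11) is a radial integral of $\tilde s_\xi^2 = (|\boldsymbol\xi|\lambda)^{-4}$ over the shell $c \le |\boldsymbol\xi| \le d$. The sketch below evaluates it with a midpoint rule, with the common factor $r^3 \|\chi_{a,1}\|_2^2$ divided out; the function name `particle_number_coefficient` and the sample parameters are assumptions made for this illustration.

```python
import numpy as np

def particle_number_coefficient(lam, c, d, n=200_000):
    """4 (2 pi)^-3 * integral of sin^2(eta) over c <= |xi| <= d (midpoint rule).

    The factor r^3 * ||chi_{a,1}||_2^2 from (11) is left out.
    """
    h = (d - c) / n
    radius = c + h * (np.arange(n) + 0.5)        # midpoints of the radial grid
    sin2_eta = (radius * lam) ** (-4.0)          # sin^2(eta) = (|xi| lam)^-4
    shell = 4.0 * np.pi * radius**2 * sin2_eta   # integrand in spherical coordinates
    return 4.0 * (2.0 * np.pi) ** (-3) * np.sum(shell) * h

def particle_number_closed_form(lam, c, d):
    """Right-hand side of (11) without r^3 * ||chi_{a,1}||_2^2."""
    return 2.0 / np.pi**2 * lam ** (-4.0) * (1.0 / c - 1.0 / d)
```

For admissible parameters (for instance λ = 0.5, c = 4, d = 9, which satisfy $2\lambda^{-1} \le c < d$) the two functions should agree to high accuracy.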
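4. Estimates of the Difference Between the Regularized and Unregularized Vacuum Density Matrices
Our goal is to compare our ansatz with the one of Chaix et al in a controlled way. Our choice of η and the resulting domain $D_\rho$ enables us to estimate the difference of the ansatz of Chaix et al and our ansatz, i.e., we write $M(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) := R(\mathbf{p}) F(\boldsymbol\xi) R^*(\mathbf{q})$ as $\tilde M(\mathbf{q}) := R(\mathbf{q}) F(\mathbf{q}) R^*(\mathbf{q})$ plus error terms which we will estimate asymptotically as ρ tends to zero. Note that
\[ \gamma(\mathbf{p}, \mathbf{q}) = (2\pi)^{-3} \int d\boldsymbol\xi\, M(\boldsymbol\xi, \mathbf{p}, \mathbf{q})\, \hat\chi_{a,r}(\mathbf{p} - \boldsymbol\xi)\, \hat\chi_{a,r}(\boldsymbol\xi - \mathbf{q}). \]
Let $V := \{ (\boldsymbol\xi, \mathbf{p}, \mathbf{q}) \mid |\boldsymbol\xi|, |\mathbf{q}| \in [c, d] \} \cap D_\rho$, and denote the characteristic function of V by $\chi_V$. We divide the differences $\tilde s_q - \tilde s_\xi$ and $\tilde c_q - \tilde c_\xi$ into two functions each,
\[ f_{s1}(\boldsymbol\xi, \mathbf{q}) := (\tilde s_q - \tilde s_\xi)\chi_V, \qquad f_{s2}(\boldsymbol\xi, \mathbf{q}) := (\tilde s_q - \tilde s_\xi)(1 - \chi_V), \]
\[ f_{c1}(\boldsymbol\xi, \mathbf{q}) := (\tilde c_q - \tilde c_\xi)\chi_V, \qquad f_{c2}(\boldsymbol\xi, \mathbf{q}) := (\tilde c_q - \tilde c_\xi)(1 - \chi_V). \]
On V the differences will be small whereas on its complement this might not be the case. However, the measure of the complement is small. We need the following estimates.
Lemma 2. For $(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) \in D_\rho$ and ρ → 0 we have
1. $\boldsymbol\sigma \cdot \boldsymbol\omega_p - \boldsymbol\sigma \cdot \boldsymbol\omega_q = O(\rho)$ and $\boldsymbol\sigma \cdot \boldsymbol\omega_\xi - \boldsymbol\sigma \cdot \boldsymbol\omega_q = O(\rho)$,
2. $c_p - c_q = O(\rho)$ and $s_p - s_q = O(\rho)$,
3. $f_{s1} = O(\rho)$ and $f_{c1} = O(\rho)$,
where the bounds in 1 are meant elementwise.
Proof. 1. First note that the elementwise bound is equivalent to the bound in (operator) norm. Since the Pauli matrices anticommute and $\sigma_j^2 = 1$ we have $\|\boldsymbol\sigma \cdot (\boldsymbol\omega_p - \boldsymbol\omega_q)\| \le |\boldsymbol\omega_p - \boldsymbol\omega_q|$, i.e., we can as well show $|\boldsymbol\omega_p - \boldsymbol\omega_q| = O(\rho)$. Denoting the angle between $\mathbf{p}$ and $\mathbf{q}$ by β (0 ≤ β ≤ π), we get the following statements by elementary geometry:
– $2\rho \ge |\mathbf{p} - \mathbf{q}| \ge \min\{|\mathbf{p}|, |\mathbf{q}|\} \sin\beta$
– $|\boldsymbol\omega_p - \boldsymbol\omega_q| = 2\sin(\beta/2)$
– $\sin(\beta/2) \le \sin\beta$ if β ≤ 2π/3.
Hence $|\boldsymbol\omega_p - \boldsymbol\omega_q| \le \frac{4\rho}{c - 2\rho} = O(\rho)$ if β ≤ 2π/3. But the biggest β allowed in the domain $D_\rho$ appears if $|\boldsymbol\xi|, |\mathbf{p}|, |\mathbf{q}| = c - 2\rho$, $|\boldsymbol\xi - \mathbf{p}|, |\boldsymbol\xi - \mathbf{q}| = \rho$, and if $\mathbf{p}$, $\mathbf{q}$, and $\boldsymbol\xi$ are in the same plane. We get $2\sin(\beta/4) = \rho/(c - 2\rho)$, and since sin(π/6) = 1/2 we are on the safe side with $\rho < \rho_{\max} = 1/3$. Because $\boldsymbol\xi = \mathbf{p}$ is allowed by $D_\rho$, this also proves the second statement.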
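2. Obviously the statement is true for m = 0. Therefore, we assume m > 0 in the following. We have
\[ \sqrt{2}\,(c_{mp} - c_{mq}) = \sqrt{1 + (1 + \mathbf{p}^2)^{-1/2}} - \sqrt{1 + (1 + \mathbf{q}^2)^{-1/2}} = \frac{1 + (1 + \mathbf{p}^2)^{-1/2} - 1 - (1 + \mathbf{q}^2)^{-1/2}}{\sqrt{1 + (1 + \mathbf{p}^2)^{-1/2}} + \sqrt{1 + (1 + \mathbf{q}^2)^{-1/2}}} \]
\[ = \frac{(1 + \mathbf{q}^2)^{1/2} - (1 + \mathbf{p}^2)^{1/2}}{(1 + \mathbf{q}^2)^{1/2} (1 + \mathbf{p}^2)^{1/2} \Big( \sqrt{1 + (1 + \mathbf{p}^2)^{-1/2}} + \sqrt{1 + (1 + \mathbf{q}^2)^{-1/2}} \Big)} \]
\[ = \frac{(\mathbf{q} - \mathbf{p})(\mathbf{q} + \mathbf{p}) \big( (1 + \mathbf{q}^2)^{1/2} + (1 + \mathbf{p}^2)^{1/2} \big)^{-1}}{(1 + \mathbf{q}^2)^{1/2} (1 + \mathbf{p}^2)^{1/2} \Big( \sqrt{1 + (1 + \mathbf{p}^2)^{-1/2}} + \sqrt{1 + (1 + \mathbf{q}^2)^{-1/2}} \Big)} = O(\rho). \]
Similarly we get
\[ \sqrt{2}\,(s_{mp} - s_{mq}) = \sqrt{1 - (1 + \mathbf{p}^2)^{-1/2}} - \sqrt{1 - (1 + \mathbf{q}^2)^{-1/2}} \]
\[ = \frac{-(\mathbf{q} - \mathbf{p})(\mathbf{q} + \mathbf{p}) \big( (1 + \mathbf{q}^2)^{1/2} + (1 + \mathbf{p}^2)^{1/2} \big)^{-1}}{(1 + \mathbf{q}^2)^{1/2} (1 + \mathbf{p}^2)^{1/2} \Big( \sqrt{1 - (1 + \mathbf{p}^2)^{-1/2}} + \sqrt{1 - (1 + \mathbf{q}^2)^{-1/2}} \Big)} = O(\rho). \]
Note that the denominator is bounded away from zero uniformly in $\mathbf{p}$ and $\mathbf{q}$.
3. For $(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) \in V$ we compute
\[ \lambda^2 (\tilde s_q - \tilde s_\xi) = \mathbf{q}^{-2} \chi_{[c,d]}(|\mathbf{q}|) - \boldsymbol\xi^{-2} \chi_{[c,d]}(|\boldsymbol\xi|) = \frac{\big( \boldsymbol\xi \chi_{[c,d]}(|\mathbf{q}|) - \mathbf{q} \chi_{[c,d]}(|\boldsymbol\xi|) \big) \big( \boldsymbol\xi \chi_{[c,d]}(|\mathbf{q}|) + \mathbf{q} \chi_{[c,d]}(|\boldsymbol\xi|) \big)}{\mathbf{q}^2 \boldsymbol\xi^2}. \]
If $|\boldsymbol\xi|, |\mathbf{q}| \in [c, d]$, as fulfilled in V, we get a factor of $\boldsymbol\xi - \mathbf{q} = O(\rho)$. Because $\tilde s_\xi \le 1/4$, we have
\[ \tilde c_q - \tilde c_\xi = \sqrt{1 - \tilde s_q^2} - \sqrt{1 - \tilde s_\xi^2} = -\frac{(\tilde s_q - \tilde s_\xi)(\tilde s_q + \tilde s_\xi)}{\sqrt{1 - \tilde s_q^2} + \sqrt{1 - \tilde s_\xi^2}} \le -(\tilde s_q - \tilde s_\xi)(\tilde s_q + \tilde s_\xi) = O(\rho). \]
□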
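Now, since $\tilde M(\mathbf{q}) = M(\mathbf{q}, \mathbf{q}, \mathbf{q})$, we can estimate the difference $M(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) - \tilde M(\mathbf{q})$ by applying Lemma 2 to $M(\boldsymbol\xi, \mathbf{p}, \mathbf{q})$ and using the identity $(\boldsymbol\sigma \cdot \mathbf{q})(\boldsymbol\sigma \cdot \mathbf{p}) = \mathbf{q}\mathbf{p} + i\boldsymbol\sigma \cdot (\mathbf{q} \times \mathbf{p})$.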
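The Pauli matrix identity used here, and the fact that $\boldsymbol\sigma \cdot \mathbf{v}$ has operator norm $|\mathbf{v}|$ (which underlies the elementwise bound in Lemma 2), can be confirmed with a few lines of NumPy. The helper names `sigma_dot` and `pauli_product_rhs` are hypothetical and serve only this sketch.

```python
import numpy as np

PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v."""
    return sum(vi * s for vi, s in zip(v, PAULI))

def pauli_product_rhs(q, p):
    """Right-hand side (q . p) * 1 + i sigma . (q x p) of the identity."""
    return np.dot(q, p) * np.eye(2) + 1j * sigma_dot(np.cross(q, p))
```

For any two real 3-vectors, `sigma_dot(q) @ sigma_dot(p)` should coincide with `pauli_product_rhs(q, p)`, and `np.linalg.norm(sigma_dot(v), 2)` should equal the Euclidean length of `v`.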
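We get
\[ M(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) = \begin{pmatrix} c_p & -s_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_p \\ s_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_p & c_p \end{pmatrix} \begin{pmatrix} \tilde s_\xi^2 & -\tilde s_\xi \tilde c_\xi\, \boldsymbol\sigma \cdot \boldsymbol\omega_\xi \\ -\tilde s_\xi \tilde c_\xi\, \boldsymbol\sigma \cdot \boldsymbol\omega_\xi & -\tilde s_\xi^2 \end{pmatrix} \begin{pmatrix} c_q & s_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_q \\ -s_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_q & c_q \end{pmatrix} = \begin{pmatrix} M_{11} & M_{12} \\ M_{12} & -M_{11} \end{pmatrix} \]
with
\[ M_{11} = \tilde s_\xi^2 (c_p c_q - s_p s_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_q) + \tilde s_\xi \tilde c_\xi (c_p s_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_\xi\, \boldsymbol\sigma \cdot \boldsymbol\omega_q + s_p c_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_\xi), \]
\[ M_{12} = \tilde s_\xi^2 (s_p c_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_p + c_p s_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_q) + \tilde s_\xi \tilde c_\xi (s_p s_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_p\, \boldsymbol\sigma \cdot \boldsymbol\omega_\xi\, \boldsymbol\sigma \cdot \boldsymbol\omega_q - c_p c_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_\xi), \]
so that
\[ M(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) = \begin{pmatrix} \tilde s_q^2 (c_q^2 - s_q^2) + 2\tilde s_q \tilde c_q s_q c_q & \big[ 2\tilde s_q^2 s_q c_q - \tilde s_q \tilde c_q (c_q^2 - s_q^2) \big] \boldsymbol\sigma \cdot \boldsymbol\omega_q \\ \big[ 2\tilde s_q^2 s_q c_q - \tilde s_q \tilde c_q (c_q^2 - s_q^2) \big] \boldsymbol\sigma \cdot \boldsymbol\omega_q & -\tilde s_q^2 (c_q^2 - s_q^2) - 2\tilde s_q \tilde c_q s_q c_q \end{pmatrix} + F_1(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) + F_2(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) \]
\[ = \begin{pmatrix} \tilde s_q \sin(\eta(\mathbf{q}) + 2\theta_q) & -\tilde s_q \cos(\eta(\mathbf{q}) + 2\theta_q)\, \boldsymbol\sigma \cdot \boldsymbol\omega_q \\ -\tilde s_q \cos(\eta(\mathbf{q}) + 2\theta_q)\, \boldsymbol\sigma \cdot \boldsymbol\omega_q & -\tilde s_q \sin(\eta(\mathbf{q}) + 2\theta_q) \end{pmatrix} + F_1(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) + F_2(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) = \tilde M(\mathbf{q}) + F_1(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) + F_2(\boldsymbol\xi, \mathbf{p}, \mathbf{q}), \tag{12} \]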
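where we collected all terms of order O(ρ) in $F_1$ and the rest, involving $f_{s2}$ or $f_{c2}$, in $F_2$. Hence $F_1 = O(\rho)$, and $\operatorname{supp} F_2 \subset D_\rho \setminus V$.
5. The Energy of the Regularized Vacuum Density Matrix
We now compute the energy of γ. As indicated by our notations, we will focus on the contribution of $\tilde M(\mathbf{q})$, which is essentially the energy that Chaix et al [4] get. Note that for our special choice of γ the direct energy $D(\rho_\gamma, \rho_\gamma)$ is zero, since we even have $\rho_\gamma(\mathbf{x}) = 0$, which itself is easily seen by noting $\operatorname{tr}_{\mathbb{C}^4} \gamma(\mathbf{p}, \mathbf{q}) = 0$ and the fact that the Fourier transform is a linear operator acting elementwise. Our kinetic energy is
\[ \operatorname{tr}(D\gamma) = (2\pi)^{-3} \int_{\mathbb{R}^3} d\mathbf{q}\, \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) \gamma(\mathbf{q}, \mathbf{q}) = (2\pi)^{-3} \int_{\mathbb{R}^6} d\mathbf{q}\, d\boldsymbol\xi\, \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) M(\boldsymbol\xi, \mathbf{q}, \mathbf{q})\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2. \tag{13} \]
The contribution of the $\tilde M(\mathbf{q})$ term is
\[ (2\pi)^{-3} \int_{\mathbb{R}^6} d\mathbf{q}\, d\boldsymbol\xi\, \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) \tilde M(\mathbf{q})\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 = (2\pi)^{-3} \|\chi_{a,r}\|_2^2 \int_{\mathbb{R}^3} d\mathbf{q}\, \operatorname{tr}_{\mathbb{C}^4} \begin{pmatrix} E(\mathbf{q}) & 0 \\ 0 & -E(\mathbf{q}) \end{pmatrix} \begin{pmatrix} \tilde s_q^2 & -\tilde s_q \tilde c_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_q \\ -\tilde s_q \tilde c_q\, \boldsymbol\sigma \cdot \boldsymbol\omega_q & -\tilde s_q^2 \end{pmatrix} \]
\[ = (2\pi)^{-3} r^3 \|\chi_{a,1}\|_2^2\, 4 \int_{\mathbb{R}^3} d\mathbf{q}\, E(\mathbf{q}) \tilde s_q^2. \]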
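The other contributions are of lower order in ρ = 1/(ar) as we will show now. Set
\[ C_1 := \sup_{\rho \le \rho_{\max}} \rho^{-1} \sup_{(\boldsymbol\xi, \mathbf{q}, \mathbf{q}) \in D_\rho} \{ \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) F_1(\boldsymbol\xi, \mathbf{q}, \mathbf{q}) \}, \qquad C_2 := \sup_{(\boldsymbol\xi, \mathbf{q}, \mathbf{q}) \in D_{\rho_{\max}}} \{ \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) F_2(\boldsymbol\xi, \mathbf{q}, \mathbf{q}) \}. \]
Note that both $C_1$ and $C_2$ are independent of ρ and finite, since $F_1$ and $F_2$ have no singularities and $F_1|_{D_\rho} = O(\rho)$. We get
\[ \Big| (2\pi)^{-3} \int_{\mathbb{R}^6} d\mathbf{q}\, d\boldsymbol\xi\, \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) F_1(\boldsymbol\xi, \mathbf{q}, \mathbf{q})\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 \Big| \le (2\pi)^{-3} \rho C_1 \int_{c - 2\rho \le |\mathbf{q}| \le d + 2\rho,\ \boldsymbol\xi \in \mathbb{R}^3} d\mathbf{q}\, d\boldsymbol\xi\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 \]
\[ = (2\pi)^{-3} \rho C_1 \frac{4}{3}\pi \big( (d + 2\rho)^3 - (c - 2\rho)^3 \big) \|\chi_{a,r}\|_2^2 = (2\pi)^{-3} C_1 \frac{r^2}{a} \frac{4}{3}\pi \big( (d + 2\rho)^3 - (c - 2\rho)^3 \big) \|\chi_{a,1}\|_2^2 = O(r^2), \]
recalling ρ = 1/(ar). Because for all $(\boldsymbol\xi, \mathbf{q}, \mathbf{q}) \in D_\rho \setminus V$ we have $|\boldsymbol\xi| \in [c - 2\rho, c + \rho] \cup [d - \rho, d + 2\rho]$ we can also control the other term
\[ \Big| (2\pi)^{-3} \int_{\mathbb{R}^6} d\mathbf{q}\, d\boldsymbol\xi\, \operatorname{tr}_{\mathbb{C}^4} D(\mathbf{q}) F_2(\boldsymbol\xi, \mathbf{q}, \mathbf{q})\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 \Big| \le (2\pi)^{-3} C_2 \int_{|\boldsymbol\xi| < c + \rho \,\vee\, |\boldsymbol\xi| > d - \rho,\ \mathbf{q} \in \mathbb{R}^3} d\mathbf{q}\, d\boldsymbol\xi\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)^2 \]
\[ = (2\pi)^{-3} C_2 \frac{4}{3}\pi \big( (c + \rho)^3 - (c - 2\rho)^3 + (d + 2\rho)^3 - (d - \rho)^3 \big) r^3 \|\hat\chi_{a,1}\|_2^2 = O(r^2). \]
The exchange energy is
\[ \int_{G \times G} dx\, dy\, \frac{|\gamma(x, y)|^2}{|\mathbf{x} - \mathbf{y}|} = \frac{1}{2\pi^2} \int d\mathbf{q}\, d\mathbf{q}'\, d\mathbf{p}\, |\mathbf{q} - \mathbf{q}'|^{-2}\, \overline{\gamma(\mathbf{q}' + \mathbf{p}, \mathbf{q}')} \odot \gamma(\mathbf{q} + \mathbf{p}, \mathbf{q}) \]
\[ = \frac{1}{(2\pi)^6} \frac{1}{2\pi^2} \int d\mathbf{q}\, d\mathbf{q}'\, d\boldsymbol\xi\, d\boldsymbol\xi'\, d\mathbf{p}\, |\mathbf{q} - \mathbf{q}'|^{-2}\, \overline{M(\boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}')} \odot M(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}) \cdot \hat\chi_{a,r}(\mathbf{p} + \mathbf{q}' - \boldsymbol\xi')\, \hat\chi_{a,r}(\mathbf{q}' - \boldsymbol\xi')\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi), \tag{14} \]
where $A \odot B := \sum_{i,j} a_{ij} b_{ij}$ for two matrices A and B of the same dimension. A bar over a matrix denotes elementwise conjugation. We exhibit the main term
\[ \overline{M(\boldsymbol\xi', \mathbf{p}', \mathbf{q}')} \odot M(\boldsymbol\xi, \mathbf{p}, \mathbf{q}) = \overline{\tilde M(\mathbf{q}')} \odot \tilde M(\mathbf{q}) + \tilde F_1(\boldsymbol\xi, \mathbf{p}, \mathbf{q}, \boldsymbol\xi', \mathbf{p}', \mathbf{q}') + \tilde F_2(\boldsymbol\xi, \mathbf{p}, \mathbf{q}, \boldsymbol\xi', \mathbf{p}', \mathbf{q}') + \tilde F_2'(\boldsymbol\xi, \mathbf{p}, \mathbf{q}, \boldsymbol\xi', \mathbf{p}', \mathbf{q}'), \]
collecting all products involving $F_1$ in $\tilde F_1$, the others involving $F_2(\boldsymbol\xi, \mathbf{p}, \mathbf{q})$ in $\tilde F_2$, and the rest, involving $F_2(\boldsymbol\xi', \mathbf{p}', \mathbf{q}')$, in $\tilde F_2'$. We get the properties $\tilde F_1 = O(\rho)$,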
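$\operatorname{supp} \tilde F_2 = D_\rho \times (D_\rho \setminus V)$, and $\operatorname{supp} \tilde F_2' = (D_\rho \setminus V) \times D_\rho$. Here the main contribution comes from the term $\overline{\tilde M(\mathbf{q}')} \odot \tilde M(\mathbf{q})$. Using
\[ \overline{\boldsymbol\sigma \cdot \boldsymbol\omega_{q'}} \odot \boldsymbol\sigma \cdot \boldsymbol\omega_q = \frac{1}{|\mathbf{q}'||\mathbf{q}|} \begin{pmatrix} q_3' & q_1' + i q_2' \\ q_1' - i q_2' & -q_3' \end{pmatrix} \odot \begin{pmatrix} q_3 & q_1 - i q_2 \\ q_1 + i q_2 & -q_3 \end{pmatrix} = 2 \boldsymbol\omega_{q'} \boldsymbol\omega_q, \]
where $\mathbf{q} := (q_1, q_2, q_3)$, we get
\[ \overline{\tilde M(\mathbf{q}')} \odot \tilde M(\mathbf{q}) = 4 \tilde s_q \tilde s_{q'} \big( \sin(\eta(\mathbf{q}) + 2\theta_q) \sin(\eta(\mathbf{q}') + 2\theta_{q'}) + \cos(\eta(\mathbf{q}) + 2\theta_q) \cos(\eta(\mathbf{q}') + 2\theta_{q'})\, \boldsymbol\omega_q \boldsymbol\omega_{q'} \big). \]
Plugging this into Formula (14) gives three terms. The first one is
\[ \frac{1}{(2\pi)^6} \frac{4}{2\pi^2} \int_{\mathbb{R}^6} d\mathbf{q}\, d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} \tilde s_q \tilde s_{q'} \big[ \sin(\eta(\mathbf{q}) + 2\theta_q) \sin(\eta(\mathbf{q}') + 2\theta_{q'}) + \cos(\eta(\mathbf{q}) + 2\theta_q) \cos(\eta(\mathbf{q}') + 2\theta_{q'})\, \boldsymbol\omega_q \boldsymbol\omega_{q'} \big] \int_{\mathbb{R}^9} d\boldsymbol\xi\, d\boldsymbol\xi'\, d\mathbf{p}\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q}' - \boldsymbol\xi')\, \hat\chi_{a,r}(\mathbf{q}' - \boldsymbol\xi') \]
\[ = \frac{8}{(2\pi)^8} \|\hat\chi_{a,r} * \hat\chi_{a,r}\|_2^2 \int_{\mathbb{R}^6} d\mathbf{q}\, d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} \tilde s_q \tilde s_{q'} \big[ \sin(\eta(\mathbf{q}) + 2\theta_q) \sin(\eta(\mathbf{q}') + 2\theta_{q'}) + \cos(\eta(\mathbf{q}) + 2\theta_q) \cos(\eta(\mathbf{q}') + 2\theta_{q'})\, \boldsymbol\omega_q \boldsymbol\omega_{q'} \big] \]
\[ = \frac{8}{(2\pi)^5} r^3 \|\chi_{a,1}\|_4^4 \int_{\mathbb{R}^6} d\mathbf{q}\, d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} \tilde s_q \tilde s_{q'} \big[ \sin(\eta(\mathbf{q}) + 2\theta_q) \sin(\eta(\mathbf{q}') + 2\theta_{q'}) + \cos(\eta(\mathbf{q}) + 2\theta_q) \cos(\eta(\mathbf{q}') + 2\theta_{q'})\, \boldsymbol\omega_q \boldsymbol\omega_{q'} \big]. \tag{15} \]
(In passing we note that the support of the first integrand in (15) is contained in $D_\rho$, motivating the inclusion of the boundary layer of width 2ρ in its Definition (10).) The other two terms are again of lower order in r. The strategy of showing this is the same as before. We first define
\[ C_3 := \sup_{\rho \le \rho_{\max}} \rho^{-1} \sup_{(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}, \boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}') \in D_\rho^2} \{ \tilde F_1(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}, \boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}') \}, \qquad C_4 := \sup_{(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}, \boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}') \in D_{\rho_{\max}}^2} \{ \tilde F_2(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}, \boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}') \}, \]
and compute
\[ \Big| \frac{2}{(2\pi)^8} \int d\mathbf{q}\, d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} \int d\boldsymbol\xi\, d\boldsymbol\xi'\, d\mathbf{p}\, \tilde F_1(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}, \boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}')\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q}' - \boldsymbol\xi')\, \hat\chi_{a,r}(\mathbf{q}' - \boldsymbol\xi') \Big| \]
\[ \le \frac{2}{(2\pi)^8} \rho C_3 \|\hat\chi_{a,r} * \hat\chi_{a,r}\|_2^2 \int_{|\mathbf{q}| \in [c - 2\rho, d + 2\rho]} d\mathbf{q} \int_{|\mathbf{q}'| \in [c - 2\rho, d + 2\rho]} d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} = O(r^2), \]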
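and
\[ \Big| \frac{2}{(2\pi)^8} \int d\mathbf{q}\, d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} \int d\boldsymbol\xi\, d\boldsymbol\xi'\, d\mathbf{p}\, \tilde F_2(\boldsymbol\xi, \mathbf{p} + \mathbf{q}, \mathbf{q}, \boldsymbol\xi', \mathbf{p} + \mathbf{q}', \mathbf{q}')\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{q} - \boldsymbol\xi)\, \hat\chi_{a,r}(\mathbf{p} + \mathbf{q}' - \boldsymbol\xi')\, \hat\chi_{a,r}(\mathbf{q}' - \boldsymbol\xi') \Big| \]
\[ \le \frac{2}{(2\pi)^8} C_4 \|\hat\chi_{a,r} * \hat\chi_{a,r}\|_2^2 \int_{|\mathbf{q}| < c + \rho \,\vee\, |\mathbf{q}| > d - \rho} d\mathbf{q} \int_{|\mathbf{q}'| \in [c - 2\rho, d + 2\rho]} d\mathbf{q}'\, |\mathbf{q} - \mathbf{q}'|^{-2} \le \frac{2}{(2\pi)^8} C_4 \|\hat\chi_{a,r} * \hat\chi_{a,r}\|_2^2 \int_{|\mathbf{q}| < c + \rho \,\vee\, |\mathbf{q}| > d - \rho} d\mathbf{q} \int_{\mathbf{q}' \in B(0, 2d + 4\rho)} d\mathbf{q}'\, |\mathbf{q}'|^{-2} \]
\[ = \frac{1}{2^7 \pi^8} r^3 C_4 \|\hat\chi_{a,1} * \hat\chi_{a,1}\|_2^2\, \frac{4}{3}\pi \big( (c + \rho)^3 - (c - 2\rho)^3 + (d + 2\rho)^3 - (d - \rho)^3 \big)\, 4\pi (2d + 4\rho) = O(r^2). \]
We omit the computation for $\tilde F_2'$, because it is almost identical to the previous one.
6. Negativity of the Energy – Instability
We are now in a position to prove our instability result. To simplify matters, we first show that it suffices to treat the massless case. Since the energy expression of Chaix et al turned out to be the leading one as explained in Sect. 5, it is enough to show negativity of this very term. The remaining terms can be made arbitrarily small compared to the leading one by suitable choice of the parameter r (large r). This gives a total energy which is negative. Since the massless case is homogeneous under dilation, we get immediately arbitrarily negative energies keeping the particle number fixed.
6.1. Reduction to the Massless Case. Denote the third order coefficients in r of the particle number, of the total, kinetic, and exchange energies as functions of the mass m and the rotation angle $\eta_{\lambda,c,d}$ by $N_3(\eta_{\lambda,c,d})$, $(E_\alpha)_3(m, \eta_{\lambda,c,d})$, $(E_{\mathrm{kin}})_3(m, \eta_{\lambda,c,d})$, and $(W_X)_3(m, \eta_{\lambda,c,d})$. We can reduce the problem to the massless case by investigating the scaling of $(E_\alpha)_3$ in m. We start with the kinetic energy
\[ (E_{\mathrm{kin}})_3(\lambda m, \eta_{1,c,d}) = (2\pi)^{-3} \|\chi_{a,1}\|_2^2\, 4 \int_{\mathbb{R}^3} d\mathbf{p}\, \sqrt{|\mathbf{p}|^2 + \lambda^2 m^2}\, \sin^2(\eta(\mathbf{p})) = (2\pi)^{-3} \|\chi_{a,1}\|_2^2\, \lambda^4\, 4 \int_{\mathbb{R}^3} d\mathbf{p}\, \sqrt{|\mathbf{p}|^2 + m^2}\, \sin^2(\eta(\lambda\mathbf{p})) = \lambda^4 (E_{\mathrm{kin}})_3(m, \eta_{\lambda,c/\lambda,d/\lambda}), \tag{16} \]
and $\eta_{\lambda,c/\lambda,d/\lambda}$ meets our requirements (8). To deal with the exchange energy we first have to get the explicit mass dependence,
\[ \sin(\eta(\mathbf{q}) + 2\theta_q) = \sin(\eta(\mathbf{q})) \cos(2\theta_q) + \cos(\eta(\mathbf{q})) \sin(2\theta_q), \qquad \cos(\eta(\mathbf{q}) + 2\theta_q) = \cos(\eta(\mathbf{q})) \cos(2\theta_q) - \sin(\eta(\mathbf{q})) \sin(2\theta_q), \]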
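and
\[ \sin(2\theta_q) = |\mathbf{q}|/E(\mathbf{q}), \qquad \cos(2\theta_q) = m/E(\mathbf{q}). \]
The mass appears in the following four expressions:
\[ \frac{m^2}{E(\mathbf{q})E(\mathbf{q}')}, \qquad \frac{|\mathbf{q}|m}{E(\mathbf{q})E(\mathbf{q}')}, \qquad \frac{|\mathbf{q}'|m}{E(\mathbf{q})E(\mathbf{q}')}, \qquad \frac{|\mathbf{q}||\mathbf{q}'|}{E(\mathbf{q})E(\mathbf{q}')}. \]
They all scale in m the same way, e.g., the first one scales as
\[ \frac{\lambda^2 m^2}{\sqrt{\mathbf{q}^2 + \lambda^2 m^2}\, \sqrt{(\mathbf{q}')^2 + \lambda^2 m^2}} = \frac{m^2}{\sqrt{(\mathbf{q}/\lambda)^2 + m^2}\, \sqrt{(\mathbf{q}'/\lambda)^2 + m^2}}. \]
Just as in the case of the kinetic energy, we get
\[ (W_X)_3(\lambda m, \eta_{1,c,d}) = \lambda^4 (W_X)_3(m, \eta_{\lambda,c/\lambda,d/\lambda}). \tag{17} \]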
All together, this gives
\[ (E_\alpha)_3(\lambda m, \eta_{1,c,d}) = \lambda^4 (E_\alpha)_3(m, \eta_{\lambda,c/\lambda,d/\lambda}). \tag{18} \]
Moreover, $(E_\alpha)_3$ is continuous at m = 0, because both kinetic and exchange energy can be written as a sum of terms of the form $\int f(m, x) g(x)\, dx$ with $f(m, x)$ continuous and $g \in L^1$ compactly supported. Now, if we find a pair c, d and a negative $a'$ with $(E_\alpha)_3(0, \eta_{1,c,d}) < a'$, then by continuity there is an $m_0 > 0$ such that also $(E_\alpha)_3(m, \eta_{1,c,d}) < a'$ for $m < m_0$. Thus by (18) and (11) we get for the energy per particle
\[ \frac{(E_\alpha)_3(m, \eta_{\lambda,c/\lambda,d/\lambda})}{N_3(\eta_{\lambda,c/\lambda,d/\lambda})} \le \frac{\lambda^{-4} a'}{2\pi^{-2} \|\chi_{a,1}\|_2^2\, \lambda^{-4} \lambda (c^{-1} - d^{-1})} = \frac{a'}{2\pi^{-2} \|\chi_{a,1}\|_2^2\, \lambda (c^{-1} - d^{-1})} \to -\infty \tag{19} \]
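for λ → 0 (provided $\lambda < m_0/m$).
6.2. The Massless Case. We now set m = 0 and λ = 1. Then $\theta_q = \pi/4$, and, writing τ for $d/c$, p for $|\mathbf{p}|$ and q for $|\mathbf{q}|$, the kinetic energy is
\[ (E_{\mathrm{kin}})_3 = \frac{1}{2\pi^2} \|\hat\chi_{a,1}\|_2^2\, 4 \int_{p \in [c,d]} \frac{p\, dp}{p^4} = 2\pi^{-2} \|\chi_{a,1}\|_2^2 \int_c^d \frac{dp}{p} = 2\pi^{-2} \|\chi_{a,1}\|_2^2 \log\tau. \tag{20} \]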
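Both the scaling relation (16) and the massless value (20) reduce to one-dimensional radial integrals, so they can be checked with a simple quadrature. The sketch below makes two assumptions for illustration only: the common factor $\|\chi_{a,1}\|_2^2$ is set to 1, and the kinetic coefficient is taken in the form $(2\pi)^{-3}\, 4 \int d\mathbf{p}\, \sqrt{|\mathbf{p}|^2 + m^2}\, \sin^2 \eta_{\lambda,c,d}(\mathbf{p})$ with the mass entering as $m^2$ under the square root. The function name `kinetic_coefficient` is hypothetical.

```python
import numpy as np

def kinetic_coefficient(m, lam, c, d, n=200_000):
    """(2 pi)^-3 * 4 * integral of sqrt(p^2 + m^2) sin^2(eta_{lam,c,d}(p)) d^3p.

    Midpoint rule in the radial variable; ||chi_{a,1}||_2^2 is set to 1.
    """
    h = (d - c) / n
    p = c + h * (np.arange(n) + 0.5)             # radial midpoints in [c, d]
    sin2_eta = (lam * p) ** (-4.0)               # sin^2(eta) on the shell c <= p <= d
    radial = 4.0 * np.pi * p**2 * np.sqrt(p**2 + m**2) * sin2_eta
    return 4.0 * (2.0 * np.pi) ** (-3) * np.sum(radial) * h
```

With these conventions, `kinetic_coefficient(lam * m, 1.0, c, d)` should match `lam**4 * kinetic_coefficient(m, lam, c / lam, d / lam)`, and `kinetic_coefficient(0.0, 1.0, c, d)` should match `2 / pi**2 * log(d / c)`.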
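The exchange energy in spherical coordinates, writing p instead of $q'$, is
\[ (W_X)_3 = \frac{8}{(2\pi)^5} \|\chi_{a,1}\|_4^4 \Bigg[ 4\pi\, 2\pi \int_c^d\!\!\int_c^d dp\, dq\, q^{-2} \sqrt{1 - q^{-4}}\, p^{-2} \sqrt{1 - p^{-4}}\, q^2 p^2 \int_{-1}^{1} \frac{du}{q^2 + p^2 - 2qpu} + 4\pi\, 2\pi \int_c^d\!\!\int_c^d dp\, dq\, q^{-4} p^{-4} q^2 p^2 \int_{-1}^{1} \frac{u\, du}{q^2 + p^2 - 2qpu} \Bigg]. \]
We introduce Legendre functions of the second kind,
\[ \int_{-1}^{1} \frac{du}{q^2 + p^2 - 2qpu} = \frac{1}{2pq} \int_{-1}^{1} \frac{du}{\frac12\big(\frac{q}{p} + \frac{p}{q}\big) - u} =: \frac{1}{pq}\, Q_0\Big( \frac12\Big(\frac{p}{q} + \frac{q}{p}\Big) \Big) \ge 0, \]
\[ \int_{-1}^{1} \frac{u\, du}{q^2 + p^2 - 2qpu} = \frac{1}{2pq} \int_{-1}^{1} \frac{u\, du}{\frac12\big(\frac{q}{p} + \frac{p}{q}\big) - u} =: \frac{1}{pq}\, Q_1\Big( \frac12\Big(\frac{p}{q} + \frac{q}{p}\Big) \Big) \ge 0. \]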
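The function $Q_0$ introduced above has the standard closed form $Q_0(z) = \tfrac12 \ln\frac{z+1}{z-1}$ for z > 1, which can be checked directly against its defining integral. The sketch below also illustrates, under the substitution $p = e^s$, $q = e^t$ (an assumption of this example, under which $Q_0(\cosh x) = \ln\coth(|x|/2)$), why the double integral of $Q_0\big(\tfrac12(p/q + q/p)\big)(pq)^{-1}$ over $[c, d]^2$ grows like $(\pi^2/2)\log\tau$. The function names are hypothetical.

```python
import numpy as np

def legendre_Q0(z):
    """Closed form of Q_0(z) = (1/2) * integral_{-1}^{1} du / (z - u), valid for z > 1."""
    return 0.5 * np.log((z + 1.0) / (z - 1.0))

def legendre_Q0_quadrature(z, n=200_000):
    """Midpoint-rule evaluation of (1/2) * integral_{-1}^{1} du / (z - u)."""
    h = 2.0 / n
    u = -1.0 + h * (np.arange(n) + 0.5)
    return 0.5 * np.sum(1.0 / (z - u)) * h

def log_double_integral(L, n=1_000_000):
    """Integral of Q_0(cosh(s - t)) over [0, L]^2, reduced to one dimension.

    Uses integral = 2 * int_0^L (L - x) ln coth(x/2) dx (midpoint rule).
    """
    h = L / n
    x = h * (np.arange(n) + 0.5)
    return 2.0 * np.sum((L - x) * np.log(1.0 / np.tanh(0.5 * x))) * h
```

Here `L` plays the role of log τ; the slope of `log_double_integral` in `L` for large `L` should be close to π²/2.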
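Because the exchange energy gives a negative contribution to the total energy, we have to estimate it from below. First, we drop the $Q_1$ term and use $\sqrt{1 - z} \ge 1 - z$ (z ∈ [0, 1]), and in the second step we drop $(pq)^{-5}$ and estimate $p^{-4} + q^{-4}$ from above by $2/c^4$. Thus we get
\[ (W_X)_3 \ge \frac{2}{\pi^3} \|\chi_{a,1}\|_4^4 \int_c^d\!\!\int_c^d Q_0\Big( \frac12\Big(\frac{p}{q} + \frac{q}{p}\Big) \Big) \big( (pq)^{-1} - p^{-1} q^{-5} - q^{-1} p^{-5} + (pq)^{-5} \big)\, dp\, dq \ge \frac{2}{\pi^3} \|\chi_{a,1}\|_4^4 \Big( 1 - \frac{2}{c^4} \Big) \int_c^d\!\!\int_c^d Q_0\Big( \frac12\Big(\frac{p}{q} + \frac{q}{p}\Big) \Big) (pq)^{-1}. \]
In the following we only consider c and τ as variables. Applying Formula (17) of [5],
\[ \int_c^d\!\!\int_c^d Q_0\Big( \frac12\Big(\frac{p}{q} + \frac{q}{p}\Big) \Big) (pq)^{-1} = \frac{\pi^2}{2} \log\tau + O(1), \tag{21} \]
we get
\[ (E_\alpha)_3 = (E_{\mathrm{kin}})_3 - \frac{\alpha}{2} (W_X)_3 \le 2\pi^{-2} \log\tau \Big( \|\chi_{a,1}\|_2^2 - \frac{\alpha}{2} \frac{1}{\pi} \frac{\pi^2}{2} \|\chi_{a,1}\|_4^4 \Big) + C \frac{\log\tau}{c^4} + O(1) = 2\pi^{-2} \log\tau \Big( \|\chi_{a,1}\|_2^2 - \alpha \frac{\pi}{4} \|\chi_{a,1}\|_4^4 \Big) + C \frac{\log\tau}{c^4} + O(1). \]
Now if $\alpha > (4/\pi) \|\chi_{a,r}\|_2^2 / \|\chi_{a,r}\|_4^4$, we find a pair c, τ such that for some negative $a''$, we get $(E_\alpha)_3(0, \eta_{1,c,d}) < a''$. We can use the same scaling argument as above
\[ \frac{(E_\alpha)_3(0, \eta_{\lambda,c/\lambda,d/\lambda})}{N_3(\eta_{\lambda,c/\lambda,d/\lambda})} \le \frac{\lambda^{-4} a''}{2\pi^{-2} \|\chi_{a,1}\|_2^2\, \lambda^{-4} \lambda (c^{-1} - d^{-1})} = \frac{a''}{2\pi^{-2} \|\chi_{a,1}\|_2^2\, \lambda (c^{-1} - d^{-1})} \to -\infty \]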
for λ → 0. Here we could also use that in the massless case the energy scales like inverse length. Once we found a negative energy like above, we also get arbitrary negative energies without changing the number of particles by contracting everything to one point. With Lemma 1 this finally yields instability for α > 4/π by choosing a small enough. Acknowledgement. This work has been supported by the European Union through its Training, Research, and Mobility program, grant FMRX-CT 96-0001. Dirk Hundertmark also thanks the Deutsche Forschungsgemeinschaft for financial support (grant Hu 773/1-1).
References
1. Bach, Volker, Barbaroux, Jean-Marie, Helffer, Bernard, and Siedentop, Heinz: Stability of matter for the Hartree-Fock functional of the relativistic electron-positron field. Doc. Math. 3, 353–364 (electronic) (1998)
2. Bach, Volker, Barbaroux, Jean-Marie, Helffer, Bernard, and Siedentop, Heinz: On the stability of the relativistic electron-positron field. Commun. Math. Phys. 201, 445–460 (1999)
3. Chaix, P., and Iracane, D.: From quantum electrodynamics to mean-field theory: I. The Bogoliubov–Dirac–Fock formalism. J. Phys. B 22(23), 3791–3814 (December 1989)
4. Chaix, P., Iracane, D., and Lions, P. L.: From quantum electrodynamics to mean-field theory: II. Variational stability of the vacuum of quantum electrodynamics in the mean-field approximation. J. Phys. B 22(23), 3815–3828 (December 1989)
5. Evans, William Desmond, Perry, Peter, and Siedentop, Heinz: The spectrum of relativistic one-electron atoms according to Bethe and Salpeter. Commun. Math. Phys. 178(3), 733–746 (July 1996)
Communicated by B. Simon
Commun. Math. Phys. 211, 643 – 658 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Power Law Subordinacy and Singular Spectra. II. Line Operators Svetlana Ya. Jitomirskaya1 , Yoram Last2,3 1 Department of Mathematics, University of California, Irvine, CA 92697, USA 2 Institute of Mathematics, The Hebrew University, 91904 Jerusalem, Israel 3 Department of Mathematics, California Institute of Technology, Pasadena, CA 91125, USA
Received: 21 October 1999 / Accepted: 21 December 1999
Abstract: We study Hausdorff-dimensional spectral properties of certain “whole-line” quasiperiodic discrete Schrödinger operators by using the extension of the Gilbert–Pearson subordinacy theory that we previously developed in [19].

1. Introduction

In this paper we apply the power-law extension of the Gilbert–Pearson theory of subordinacy [12,11], developed in [19], to certain popular models of one-dimensional quasiperiodic Schrödinger operators. Namely, we are interested in Hausdorff-dimensional properties of spectral measures of those operators. While we also formulate and prove here some statements for operators defined on the “half-line”, and the “half-line” proofs are simpler and somewhat more transparent, our main interest here is in the “whole-line” case. Therefore, we treat here some technical issues of adapting theorems and methods of [19], best suited for the half-line, to the whole-line case. In this framework, we prove some statements that are quite general and that can be applied to examples other than those that we consider. However, we do not really attempt to develop a general theory for applying subordinacy methods to whole-line problems, but rather focus on handling some specific examples. Our main objects are discrete operators on $\ell^2(\mathbb{Z})$, defined by
$$(H\psi)(n) = \psi(n+1) + \psi(n-1) + V(n)\psi(n). \tag{1.1}$$
The potential $V = \{V(n)\}_{n=-\infty}^{\infty}$ is a sequence of real numbers. As in [19], some of our technical statements are also valid for continuous analogs of those operators, of the form $-\frac{d^2}{dx^2} + V(x)$ on $L^2(\mathbb{R})$, as long as the potential $V(x)$ is such that we are in the limit point case (so the operator is essentially self-adjoint). The proofs for the discrete and continuous cases are essentially the same. The same is true for more general tridiagonal discrete operators (see the discussion in [19]). However, our actual discussion here will only involve operators of the form given by (1.1).
One family of models that we consider here is that of quasiperiodic potentials of the form
$$V(n) = f(\alpha n + \theta), \tag{1.2}$$
where $f$ is a trigonometric polynomial of arbitrary degree and $\alpha$ is an irrational. Before we formulate our main result, we’ll need to recall the notion of the Lyapunov exponent. We denote the $n$-step transfer-matrix of $Hu = Eu$ by $\Phi_n(E)$,
$$\Phi_n(E) \equiv M_n(E) M_{n-1}(E) \cdots M_1(E), \tag{1.3}$$
where
$$M_n(E) \equiv \begin{pmatrix} E - V(n) & -1 \\ 1 & 0 \end{pmatrix}. \tag{1.4}$$
In the case of an ergodic potential $V(n) = f(T^n\theta)$, where $T$ is an ergodic transformation of a phase space $(X,\mu)$, $\theta \in X$, $f : X \to \mathbb{R}$, the corresponding operator $H$, and therefore the transfer-matrices $\Phi_n$ and $M_n$, will depend on $\theta$, and we have that $M_n(E,\theta) = M(E, T^n\theta)$, where
$$M(E,\theta) \equiv \begin{pmatrix} E - f(\theta) & -1 \\ 1 & 0 \end{pmatrix}. \tag{1.5}$$
In the ergodic case, the Lyapunov exponent, $\gamma(E)$, is defined as
$$\gamma(E) \equiv \lim_{k\to\infty} \frac{\int_X \ln\|\Phi_k(E,\theta)\|\,d\mu(\theta)}{k} = \inf_k \frac{\int_X \ln\|\Phi_k(E,\theta)\|\,d\mu(\theta)}{k}. \tag{1.6}$$
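As a rough numerical illustration of (1.3), (1.4) and (1.6), the following sketch multiplies the one-step matrices and estimates $\frac{1}{n}\ln\|\Phi_n(E)\|$ for a single phase $\theta$. It is only a sanity check under stated assumptions: the function names are hypothetical, the Frobenius norm replaces the operator norm (the two differ by at most a factor $\sqrt{2}$, which does not affect the limit), and a single orbit stands in for the integral over $\theta$ in (1.6).

```python
import math

def transfer_matrix(E, v):
    """One-step matrix M_n(E) of (1.4), for the potential value v = V(n)."""
    return ((E - v, -1.0), (1.0, 0.0))

def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )

def frobenius(A):
    """Frobenius norm, used here in place of the operator norm (assumption)."""
    return math.sqrt(sum(x * x for row in A for x in row))

def lyapunov_estimate(E, potential, n_steps):
    """Approximate (1/n) ln ||Phi_n(E)|| with Phi_n = M_n ... M_1 as in (1.3).

    The running product is renormalized at every step to avoid overflow;
    the logarithms of the normalization factors add up to ln ||Phi_n||.
    """
    phi = ((1.0, 0.0), (0.0, 1.0))
    log_norm = 0.0
    for n in range(1, n_steps + 1):
        phi = matmul2(transfer_matrix(E, potential(n)), phi)
        s = frobenius(phi)
        phi = tuple(tuple(x / s for x in row) for row in phi)
        log_norm += math.log(s)
    return log_norm / n_steps

# Example potential of the form (1.2) with f(x) = lam * cos(2 pi x).
alpha = (math.sqrt(5.0) - 1.0) / 2.0
lam, theta = 3.0, 0.1
estimate = lyapunov_estimate(
    0.0, lambda n: lam * math.cos(2.0 * math.pi * (alpha * n + theta)), 5000
)
```

With these illustrative parameters the estimate should come out clearly positive, while for the free operator ($V \equiv 0$, $E = 0$) it is essentially zero.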
The celebrated Pastur–Ishii theorem (see, e.g., [7]) states that positive Lyapunov exponents imply a.e. absence of absolutely continuous spectrum for ergodic operators (everywhere absence for quasiperiodic operators, by a recent result of Last–Simon [25]). It turns out that, at least in some cases, one can say significantly more. Let $T\theta = T_\alpha\theta = \theta + \alpha \pmod 1$. For any irrational $\alpha$, $T_\alpha$ is an ergodic transformation of $[0,1)$. We will sometimes drop the $\alpha$-dependence from the notations if there is no ambiguity. Let $V_\theta(n) = f(T_\alpha^n\theta)$, where $\alpha$ is any irrational, and $f(\theta) = \sum_{j=0}^{p} a_j \cos^j 2\pi\theta + b_j \sin^j 2\pi\theta$ is a trigonometric polynomial.

Theorem 1. Let $H_\theta$ be an operator of the form (1.1) with a quasiperiodic potential, $V = V_\theta$. Suppose that $\gamma(E) > 0$ for every $E$ in some Borel set $A \subset \mathbb{R}$. Then for any $\theta \in [0,1)$, the restriction $\mu(A \cap \cdot)$ of the spectral measure of $H_\theta$ is zero-dimensional, namely, it is supported on a set of zero Hausdorff dimension.

Recently, Bourgain and Goldstein [4] have proven that under exactly the same assumptions, for any fixed $\theta$, the spectrum is pure point for a.e. $\alpha$. Both Theorem 1 and the result of [4] are probably true for the case of a real analytic function $f$ as well; however, some further arguments are needed for the proofs. Thus, for quasiperiodic operators of this type, positive Lyapunov exponents lead to pure point spectrum for a.e. (but not all! [1,20]) $\alpha, \theta$, and to zero-dimensional spectrum for all $\alpha, \theta$. It had been proven by Herman
[14], with a result later generalized and significantly improved by Sorets–Spencer [26] (also see [4] for a new proof), that if f (θ ) is a trigonometric polynomial, α irrational, and V (n) = λf (θ + αn), then for λ sufficiently large, the Lyapunov exponents γ (E) are positive for any E ∈ R. Therefore, our theorem implies the zero-dimensionality of the spectral measures in all those cases. In particular, Theorem 1 applies to the extensively-studied case of the Almost Mathieu operator, Hα,λ,θ , which is the operator of the form (1.1) with V (n) = λ cos(2π αn + θ ), where λ ∈ R, θ ∈ [0, 2π ], and α is any irrational. In the regime |λ| > 2, this operator has spectral properties known to depend very delicately on the arithmetics of both α and θ (see [16] for a review). While pure point spectrum for a.e. α, θ had been recently proven [17], the singular continuous spectrum is known to occur for certain α’s and/or θ’s, and the full picture is rather complicated and is far from being established. It turns out, however, that if we concentrate on Hausdorff-dimensional properties of the spectral measures, rather than on distinguishing between pure-point and singular-continuous spectra, the situation becomes simpler. As an immediate corollary of Theorem 1 and Herman’s Theorem [14], we obtain: Corollary 2. For |λ| > 2, every irrational α, and every θ, the spectral measure of the Almost Mathieu operator Hα,λ,θ is zero-dimensional. It should be noted that for other values of the parameters the dimensional characteristics of spectral measures are mostly known. For rational α’s, the spectrum is absolutely continuous by periodicity; also, for |λ| < 2 and any α, θ , a large absolutely continuous component exists [22,25] (the spectrum is actually now known to be purely absolutely continuous for a.e. α [17], but this result has not yet been established for all α), so in all these cases the spectral measures are of dimension 1. 
Thus, besides the critical value $|\lambda| = 2$, where the situation is expected to be more delicate (see [13,23]), this theorem provides a full description of the Hausdorff-dimensional characteristics of the spectral measures of $H_{\alpha,\lambda,\theta}$. Another “whole-line” example that we study here is the Fibonacci Hamiltonian $H_\lambda$, which is the operator of the form (1.1) on $\ell^2(\mathbb{Z})$ with potential $V(n) = \lambda([(n+1)\omega] - [n\omega])$, where $\omega = (\sqrt{5} - 1)/2$ is the golden mean, and $[x] \equiv \max\{m \in \mathbb{Z} \mid m \le x\}$. $H_\lambda$ is the most studied of all one-dimensional quasicrystal models. It is known [2,29,30] that for every $\lambda \ne 0$, it has purely singular-continuous spectrum, and moreover, its spectrum (as a set) is a Cantor set of zero Lebesgue measure. We will show

Theorem 3. For every $\lambda$ there exists an $\alpha > 0$ such that the Fibonacci Hamiltonian $H_\lambda$ has purely $\alpha$-continuous spectrum, namely, its spectral measures do not give weight to sets of Hausdorff dimension less than $\alpha$.

Remark. Raymond [27] has recently shown that for large $\lambda$, the spectrum of $H_\lambda$ (as a set) has Hausdorff dimension strictly less than 1. This implies that for such $\lambda$’s, the spectrum must also be $\beta$-singular (see below) for some $\beta < 1$.

A version of the main results of this paper had been previously announced in [18] (Theorems 3 and 4 of [18]). However, Theorem 1 generalizes the result announced in [18], in that it only requires positivity of the Lyapunov exponents. While being a second part, this paper can be read independently of [19], as we reintroduce the notations, and precisely quote the necessary facts from [19] (inequality (2.13)). However, we refer the reader to [19] for certain general discussions. The rest of this paper is organized as follows. In Sect. 2, we review some facts concerning Hausdorff measures and dimensional spectral properties, and introduce some
notations. In Sect. 3, we prove some technical statements needed for the treatment of the whole-line case. In Sect. 4, we prove Theorem 1. We also formulate and prove a corresponding half-line version. Finally, in Sect. 5, we prove Theorem 3.

2. Preliminaries

Here we discuss briefly some definitions and preliminary results, particularly those related to dimensional properties of spectral measures. A somewhat more detailed account is given in [19], and for more information see [24]. Recall that for any subset $S$ of $\mathbb{R}$ and $\alpha \in [0,1]$, the $\alpha$-dimensional Hausdorff measure, $h^\alpha$, is given by
$$h^\alpha(S) \equiv \lim_{\delta\to 0}\ \inf_{\delta\text{-covers}}\ \sum_{\nu=1}^{\infty} |b_\nu|^\alpha, \tag{2.1}$$
where a $\delta$-cover is a cover of $S$ by a countable collection of intervals, $S \subset \bigcup_{\nu=1}^{\infty} b_\nu$, such that for each $\nu$ the length of $b_\nu$ is at most $\delta$. Given any $\emptyset \ne S \subseteq \mathbb{R}$, there exists a unique $\alpha(S) \in [0,1]$ such that $h^\alpha(S) = 0$ for any $\alpha > \alpha(S)$, and $h^\alpha(S) = \infty$ for any $\alpha < \alpha(S)$. This unique $\alpha(S)$ is called the Hausdorff dimension of $S$. Given $\alpha$, a measure $\mu$ is called $\alpha$-continuous if $\mu(S) = 0$ for every set $S$ with $h^\alpha(S) = 0$. It is called $\alpha$-singular if it is supported on some set $S$ with $h^\alpha(S) = 0$. We say that $\mu$ is zero-dimensional if it is $\alpha$-singular for every $\alpha > 0$. Given a (positive, finite) measure $\mu$ and $\alpha \in [0,1]$, we define
$$D^\alpha_\mu(x) \equiv \limsup_{\epsilon\to 0} \frac{\mu((x-\epsilon, x+\epsilon))}{(2\epsilon)^\alpha} \tag{2.2}$$
and $T_\infty \equiv \{x \mid D^\alpha_\mu(x) = \infty\}$. The restriction $\mu(T_\infty \cap \cdot) \equiv \mu_{\alpha s}$ is $\alpha$-singular, and $\mu((\mathbb{R} \setminus T_\infty) \cap \cdot) \equiv \mu_{\alpha c}$ is $\alpha$-continuous. Thus, each measure decomposes uniquely into an $\alpha$-continuous part and an $\alpha$-singular part: $\mu = \mu_{\alpha c} + \mu_{\alpha s}$. Moreover, an $\alpha$-singular measure must have $D^\alpha_\mu(x) = \infty$ a.e. (with respect to it) and an $\alpha$-continuous measure must have $D^\alpha_\mu(x) < \infty$ a.e. It is important to note that $D^\alpha_\mu(x)$ is defined with a lim sup. The corresponding limit need not exist.

The study of the spectral measure $\mu$ of an operator of the form (1.1) is related to the study of the Weyl–Titchmarsh $m$-function. We will briefly list the necessary related facts. More details can be found, for example, in [5]. Let $\mathbb{Z}^+ = \{1, 2, 3, \dots\}$, and $\mathbb{Z}^- = \{\dots, -2, -1, 0\}$. Consider the family of phase boundary conditions:
$$\psi(0)\cos\beta + \psi(1)\sin\beta = 0, \tag{2.3}$$
where $-\pi/2 < \beta < \pi/2$. Let $z = E + i\epsilon$, and consider the equation
$$Hu = zu. \tag{2.4}$$
It is known that for $\epsilon > 0$, (2.4) has unique (up to normalization) solutions $\hat u^\pm_z$ which are $\ell^2$ at, correspondingly, $\pm\infty$. Let $\hat u^\pm_{\beta,z}$ be $\hat u^\pm_z$ normalized by $\hat u^\pm_{\beta,z}(0)\cos\beta + \hat u^\pm_{\beta,z}(1)\sin\beta = 1$. We denote by $u^\pm_{1,\beta,z}$ the solutions of (2.4) on $\mathbb{Z}^\pm$ satisfying the boundary conditions (2.3) and by $u^\pm_{2,\beta,z}$ the solutions of (2.4) on $\mathbb{Z}^\pm$ satisfying the orthogonal boundary conditions, and normalized by $|u^\pm_{i,\beta,z}(0)|^2 + |u^\pm_{i,\beta,z}(1)|^2 = 1$, $i = 1, 2$; that is,
$u^\pm_{2,\beta,z} = u^\pm_{1,\pi/2+\beta,z}$. For $z$ in the upper half-plane, the right and left Weyl–Titchmarsh $m$-functions are uniquely defined by $\hat u^\pm_{\beta,z} = u^\pm_{2,\beta,z} \mp m^\pm_\beta(z)\,u^\pm_{1,\beta,z}$. We denote by $m^+$ the $m$-function $m^+_0$, corresponding to the Dirichlet boundary conditions, and by $m^-$ the $m$-function $m^-_0$. The functions $m^+$ are related to $m^+_\beta$ by a simple change of variables formula:
$$m^+ = \frac{m^+_\beta\cos\beta - \sin\beta}{m^+_\beta\sin\beta + \cos\beta}. \tag{2.5}$$
It is also immediate from the definition that the following relation is true:
$$m^\pm(z) = \mp\frac{\hat u^\pm_z(1)}{\hat u^\pm_z(0)}. \tag{2.6}$$
The whole-line $m$-function is then defined for $\operatorname{Im} z > 0$ by (see, e.g., [5])
$$M(z) \equiv \frac{m^+(z)\,m^-(z) - 1}{m^+(z) + m^-(z)}. \tag{2.7}$$
$M(z)$ coincides with the Borel transform of the spectral measure $\mu$ of the operator (1.1):
$$M(z) = \int \frac{d\mu(x)}{x - z}. \tag{2.8}$$
It is shown in [10] that
$$\limsup_{\epsilon\to 0} \epsilon^{1-\alpha}|M(E + i\epsilon)| = \infty \quad\Leftrightarrow\quad D^\alpha_\mu(E) = \infty. \tag{2.9}$$
Therefore, the study of dimensional spectral properties can be reduced to the study of the behavior of $M(E + i\epsilon)$ as $\epsilon \to 0$, which in turn reduces to the study of the $\epsilon \to 0$ behavior of $m^\pm(E + i\epsilon)$. The main technical achievement of the Gilbert–Pearson theory of subordinacy [11,12] (and the power-law extension of it [19]) is in relating such behavior to properties of solutions of Eq. (1.3).

We now recall the main construction of [19]. Given an operator $H$ of the form defined by (1.1), we fix $E \in \mathbb{R}$, and let $u^\pm_{1,\beta}$ be $u^\pm_{1,\beta,E}$, the normalized solution of (1.3), defined on, correspondingly, $\mathbb{Z}^+$ or $\mathbb{Z}^-$, and obeying the boundary conditions (2.3). Let $u^\pm_{2,\beta}$ be $u^\pm_{2,\beta,E}$, the solution obeying the orthogonal boundary conditions. For any function $u : \mathbb{Z}^+ \to \mathbb{R}$, we denote by $\|u\|_L$ the norm of $u$ over a lattice interval of length $L$, that is
$$\|u\|_L \equiv \left[\sum_{n=1}^{[L]} |u(n)|^2 + (L - [L])\,|u([L]+1)|^2\right]^{1/2}, \tag{2.10}$$
where $[L]$ denotes the integer part of $L$. Similarly, for functions $u : \mathbb{Z}^- \to \mathbb{R}$, we define
$$\|u\|_L \equiv \left[\sum_{n=0}^{[L]-1} |u(-n)|^2 + (L - [L])\,|u(-[L])|^2\right]^{1/2}. \tag{2.11}$$
Now given any $\epsilon > 0$, we define lengths $L^\pm_\beta(\epsilon) \in (0,\infty)$, by requiring the equality
$$\|u^\pm_{1,\beta}\|_{L^\pm_\beta(\epsilon)}\,\|u^\pm_{2,\beta}\|_{L^\pm_\beta(\epsilon)} = \frac{1}{2\epsilon}. \tag{2.12}$$
The functions $L^\pm_\beta(\epsilon)$ are well defined (by (2.12)), and are monotonically decreasing continuous functions which go to infinity as $\epsilon$ goes to 0. Our main technical tool will be the following inequality which relates the Weyl–Titchmarsh $m$-function (for $z$ in the upper half plane) to the solutions $u_1$ and $u_2$ (Theorem 1.1 of [19]):
$$\frac{5 - \sqrt{24}}{|m^+_\beta(E + i\epsilon)|} < \frac{\|u^+_{1,\beta}\|_{L^+_\beta(\epsilon)}}{\|u^+_{2,\beta}\|_{L^+_\beta(\epsilon)}} < \frac{5 + \sqrt{24}}{|m^+_\beta(E + i\epsilon)|}. \tag{2.13}$$
Finally, we will need one more general statement, that is the existence of generalized eigenfunctions [3,25,28], from which it is known that for a.e. $E$ with respect to the spectral measure $\mu$, there should exist $\beta \in (-\pi/2, \pi/2]$ such that the solutions $u^\pm_{1,\beta}$ both obey
$$\limsup_{L\to\infty} \frac{\|u\|_L}{L^{1/2}\ln L} < \infty. \tag{2.14}$$
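The lattice-interval norm of (2.10), (2.11) and the length scale defined implicitly by (2.12) can be sketched numerically as follows. This is a hedged illustration, not part of the original construction: the function names are hypothetical, solutions are passed as Python callables on the integers, and the bisection assumes that the product of norms is unbounded in $L$ so that a solution of (2.12) exists.

```python
import math

def norm_L(u, L, side=+1):
    """||u||_L as in (2.10) (side=+1, u on Z+) or (2.11) (side=-1, u on Z-)."""
    k = math.floor(L)
    frac = L - k
    if side > 0:
        total = sum(abs(u(n)) ** 2 for n in range(1, k + 1))
        total += frac * abs(u(k + 1)) ** 2
    else:
        total = sum(abs(u(-n)) ** 2 for n in range(0, k))
        total += frac * abs(u(-k)) ** 2
    return math.sqrt(total)

def length_scale(u1, u2, eps, side=+1, tol=1e-10):
    """Solve ||u1||_L * ||u2||_L = 1/(2 eps) for L by bisection, as in (2.12).

    Assumes the product of norms is continuous, nondecreasing and unbounded.
    """
    target = 1.0 / (2.0 * eps)

    def product(L):
        return norm_L(u1, L, side) * norm_L(u2, L, side)

    lo, hi = 0.0, 1.0
    while product(hi) < target:   # bracket the solution
        hi *= 2.0
    while hi - lo > tol:          # shrink the bracket
        mid = 0.5 * (lo + hi)
        if product(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Free operator (V = 0) at E = 0: u(n+1) + u(n-1) = 0.
# u1 has u1(0) = 0, u1(1) = 1; u2 has u2(0) = 1, u2(1) = 0.
u1 = lambda n: (0, 1, 0, -1)[n % 4]
u2 = lambda n: (1, 0, -1, 0)[n % 4]
L_quarter = length_scale(u1, u2, eps=0.25)
```

In this toy case the product of norms equals $\sqrt{2(L-2)}$ for $3 \le L \le 5$, so $\epsilon = 1/4$ corresponds to $L = 4$.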
3. Whole-line Operators

In this section we discuss some technical issues that we need for dimensional analysis of the whole-line case. In particular, we will study the “right” and “left” transfer-matrices on similar scales. Given an operator of the form (1.1), define $\Phi^+_n(E) \equiv \Phi_n(E)$, and $\Phi^-_n(E) \equiv M_{-n+1}(E)\,M_{-n+2}(E) \cdots M_{-1}(E)\,M_0(E)$. Note that the $\Phi^\pm_n(E)$’s are related to solutions of Eq. (1.3) by
$$\begin{pmatrix} u(n+1) \\ u(n) \end{pmatrix} = \Phi^+_n(E) \begin{pmatrix} u(1) \\ u(0) \end{pmatrix} \tag{3.1}$$
and
$$\begin{pmatrix} u(-(n+1)) \\ u(-n) \end{pmatrix} = \Phi^-_{n+1}(E) \begin{pmatrix} u(0) \\ u(1) \end{pmatrix}, \quad n \ge 0, \tag{3.2}$$
and so, for any $\beta$,
$$\Phi^+_n(E) = \begin{pmatrix} u^+_{1,\beta}(n+1) & u^+_{2,\beta}(n+1) \\ u^+_{1,\beta}(n) & u^+_{2,\beta}(n) \end{pmatrix} \begin{pmatrix} \sin\beta & -\cos\beta \\ \cos\beta & \sin\beta \end{pmatrix} \tag{3.3}$$
and
$$\Phi^-_{n+1}(E) = \begin{pmatrix} u^-_{1,\beta}(-(n+1)) & u^-_{2,\beta}(-(n+1)) \\ u^-_{1,\beta}(-n) & u^-_{2,\beta}(-n) \end{pmatrix} \begin{pmatrix} \cos\beta & -\sin\beta \\ -\sin\beta & -\cos\beta \end{pmatrix}. \tag{3.4}$$
In order to relate the spectral properties of a whole-line problem with the left and right transfer-matrix behaviors, we will use the following general
Lemma 4. Let $H$ be an operator of the form (1.1). Fix $\delta > 1$ and let $\alpha = \frac{2}{1+\delta}$. Suppose that for every $E$ in some Borel set $A$, there exist $a, c > 0$ and sequences $k_n \to \infty$, $j^\pm_n \le c k_n$, such that
$$\sum_{i=1}^{j^\pm_n} \|\Phi^\pm_i\|^2 > a k_n^\delta.$$
Then, for every $\epsilon > 0$, the restriction $\mu(A \cap \cdot)$ is $(\alpha + \epsilon)$-singular.

Remark. Note that the assumptions of Lemma 4 imply $(\alpha + \epsilon)$-singularity of both the right and the left half-line problems for any boundary condition. They require more than just that, however, as they also require some correlation between the two sides. This type of requirement is truly necessary, since, in general, the whole-line spectral problem can have much greater continuity than the corresponding half-line problems. Indeed, it is possible to construct potentials for which each of the half-line problems would have zero-dimensional spectrum for any boundary phase $\beta$, and yet the whole-line problem would have purely one-dimensional spectrum. Potentials of this type can be constructed as sparse barrier potentials of the type which we studied in [19]. By making the barriers high enough, one can ensure that each of the half-line problems has zero-dimensional spectrum, but then by making them sparse enough and correlating the two sides appropriately (one needs the scales where barriers occur on the right to be very different from the scales where barriers occur on the left), one can cause the whole-line problem to have one-dimensional spectrum. The precise construction will be published elsewhere.

It follows from the discussion in Sect. 2, particularly (2.9), that Lemma 4 will be proven if we establish that $\limsup_{\epsilon\to 0} \epsilon^{1-\alpha}|M(E + i\epsilon)| = \infty$, where $M(z)$ is defined by (2.7). Since our main technical tool that allows us to study the $\epsilon \to 0$ behavior of $m$ through the behavior of solutions (and therefore, transfer-matrices), inequality (2.13), is formulated in [19] and proven only for $m^+$ of operators acting on $\mathbb{Z}^+$, it will be convenient for us to find an expression for $M(z)$ that uses only $m^+$. We observe that the space $\ell^2(\mathbb{Z}^-)$ is naturally identified with $\ell^2(\mathbb{Z}^+)$ by a unitary operator $U : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$, defined by $(U\psi)(n) = \psi(-n+1)$, $n \in \mathbb{Z}$. Clearly, $U^2$ is the identity operator. For any operator $H$ on $\ell^2(\mathbb{Z})$, we define an operator $\tilde H$ on $\ell^2(\mathbb{Z})$ by $\tilde H = U H U^{-1}$. For operators of the form (1.1), $\tilde H$ will be $H$ with a potential reflected with respect to the point 1/2. Let $\tilde m^+$, $\tilde m^+_\beta$, $\tilde u^\pm_\beta$, $\tilde L^\pm_\beta$ denote, correspondingly, $m^+$, $m^+_\beta$, $u^\pm_\beta$ and $L^\pm_\beta$ of the operator $\tilde H$. Then it is immediate from (2.6) that
$$m^- = -\frac{1}{\tilde m^+}. \tag{3.5}$$
Using (2.5), (2.7), and (3.5), we obtain by an elementary computation that for any $\beta \in (-\pi/2, \pi/2]$,
$$\frac{m^+_\beta(z)\,\tilde m^+_{\pi/2-\beta}(z) - 1}{m^+_\beta(z) + \tilde m^+_{\pi/2-\beta}(z)} = M(z). \tag{3.6}$$
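The elementary computation behind (3.6) can be spot-checked numerically. The sketch below is only a sanity check with hypothetical function names: it inverts (2.5) to get $m^+_\beta$ from $m^+$, builds $m^-$ from $\tilde m^+$ via (3.5), and compares the left-hand side of (3.6) with $M$ from (2.7) at arbitrary sample values in the upper half-plane.

```python
import math

def m_beta(m, beta):
    """Invert (2.5): m_beta expressed through the Dirichlet m-function m."""
    c, s = math.cos(beta), math.sin(beta)
    return (m * c + s) / (c - m * s)

def M_from_half_lines(m_plus, m_minus):
    """Whole-line m-function (2.7)."""
    return (m_plus * m_minus - 1) / (m_plus + m_minus)

def M_via_reflection(m_plus, m_tilde_plus, beta):
    """Left-hand side of (3.6)."""
    a = m_beta(m_plus, beta)
    b = m_beta(m_tilde_plus, math.pi / 2 - beta)
    return (a * b - 1) / (a + b)

# Arbitrary sample values (assumptions for illustration only).
m_plus = 0.3 + 0.8j
m_tilde_plus = -1.1 + 0.5j
m_minus = -1 / m_tilde_plus          # relation (3.5)
reference = M_from_half_lines(m_plus, m_minus)
differences = [
    abs(M_via_reflection(m_plus, m_tilde_plus, beta) - reference)
    for beta in (0.0, 0.4, -1.0, math.pi / 2)
]
```

All entries of `differences` should vanish up to rounding, reflecting that the left-hand side of (3.6) does not depend on $\beta$.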
Also, we will use the fact that the operator $U$, applied formally, sends sequences $u : \mathbb{Z}^\pm \to \mathbb{R}$ into sequences $(Uu) : \mathbb{Z}^\mp \to \mathbb{R}$. It follows immediately from the definition that
$$U u^-_{i,\beta} = \tilde u^+_{i,\pi/2-\beta}, \quad i = 1, 2. \tag{3.7}$$
It is also clear from (2.10), (2.11), (2.12) that
$$\tilde L^+_{\pi/2-\beta} = L^-_\beta \tag{3.8}$$
and
$$\|u\|_L = \|Uu\|_L. \tag{3.9}$$

Lemma 5. Suppose that there exists a $\beta \in (-\pi/2, \pi/2]$ and a sequence $\epsilon_j \to 0$, such that for every $E$ in some Borel set $A$, we have that $\epsilon_j^{1-\alpha}\, m^+_\beta(E + i\epsilon_j) \to \infty$ and $\epsilon_j^{1-\alpha}\, \tilde m^+_{\pi/2-\beta}(E + i\epsilon_j) \to \infty$ as $j \to \infty$. Then the restriction $\mu(A \cap \cdot)$ is $\alpha$-singular.

Proof. From (3.6) we immediately obtain that under the conditions of the lemma we have $\epsilon_j^{1-\alpha} M(E + i\epsilon_j) \to \infty$ as $j \to \infty$. Thus, (2.9) yields the result. □

The rest of the proof uses key ingredients from the proofs of Corollaries 4.1 and 4.2 of [19]. From (3.3), (3.4) we see that for any $\beta$, we have
$$\|\Phi^\pm_n(E)\|^2 \le |u^\pm_{1,\beta}(\pm(n+1))|^2 + |u^\pm_{1,\beta}(\pm n)|^2 + |u^\pm_{2,\beta}(\pm(n+1))|^2 + |u^\pm_{2,\beta}(\pm n)|^2,$$
and therefore we have $\|u^\pm_{1,\beta}\|^2_L + \|u^\pm_{2,\beta}\|^2_L \ge \frac{1}{2}\sum_{n=1}^{L} \|\Phi^\pm_n(E)\|^2$. Thus, under the conditions of Lemma 4, for any $L \ge j^\pm_n$ and any $\beta \in (-\pi/2, \pi/2]$, we obtain
$$\|u^\pm_{1,\beta}\|^2_L + \|u^\pm_{2,\beta}\|^2_L \ge \frac{a}{2} k_n^\delta. \tag{3.10}$$
In particular, inequality (3.10) holds for any $L \ge c k_n$. Since for a.e. $E$ w.r.t. $\mu$, there exists a phase $\beta(E) \in (-\pi/2, \pi/2]$ such that (2.14) holds for both $u^-_{1,\beta(E)}$ and $u^+_{1,\beta(E)}$, we deduce that for such $E$’s and large $n$’s,
$$\|u^\pm_{2,\beta}\|_{c k_n} \ge a_1 k_n^{\delta/2}, \tag{3.11}$$
for some $a_1 > 0$. Take $\epsilon_n = k_n^{-1/\alpha}$. Then, by (2.12), (2.14), for a.e. $E$ w.r.t. $\mu$ and large $n$, we get
$$\frac{\|u^\pm_{2,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}}{\|u^\pm_{1,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}} > \frac{c_1}{2\epsilon_n\, L^\pm_{\beta(E)}(\epsilon_n)\, \ln^2 L^\pm_{\beta(E)}(\epsilon_n)} \tag{3.12}$$
for some positive constant $c_1$. If it so happens that $L^\pm_{\beta(E)}(\epsilon_n) < c k_n$, for either the $+$ or $-$ case, then (3.12) implies that for some $c_2 > 0$,
$$\frac{\|u^\pm_{2,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}}{\|u^\pm_{1,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}} > c_2\, \frac{k_n^{\frac{1-\alpha}{\alpha}}}{\ln^2 k_n} \tag{3.13}$$
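for the corresponding case. Otherwise (namely, if $L^\pm_{\beta(E)}(\epsilon_n) \ge c k_n$), (2.12) and (3.11) imply that for $E \in A$,
$$\frac{\|u^\pm_{2,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}}{\|u^\pm_{1,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}} \ge 2 a_1^2\, k_n^{\delta - \frac{1}{\alpha}} = 2 a_1^2\, k_n^{\frac{1-\alpha}{\alpha}} \tag{3.14}$$
for this case. Thus, we see that in either case, for a set of $E$’s of full $\mu(A \cap \cdot)$ measure, we have positive constants $C(E)$, such that for large $n$,
$$\frac{\|u^\pm_{2,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}}{\|u^\pm_{1,\beta(E)}\|_{L^\pm_{\beta(E)}(\epsilon_n)}} > C(E)\, \frac{k_n^{\frac{1-\alpha}{\alpha}}}{\ln^2 k_n}. \tag{3.15}$$
We thus obtain by (2.13), for any $\epsilon > 0$,
$$\epsilon_n^{1-\alpha-\epsilon}\, m^+_{\beta(E)}(E + i\epsilon_n) \sim \epsilon_n^{1-\alpha-\epsilon}\, \frac{\|u^+_{2,\beta(E)}\|_{L^+_{\beta(E)}(\epsilon_n)}}{\|u^+_{1,\beta(E)}\|_{L^+_{\beta(E)}(\epsilon_n)}} \ge C(E)\, \frac{k_n^{\epsilon/\alpha}}{\ln^2 k_n} \to \infty \tag{3.16}$$
as $n \to \infty$. Similarly, using in addition (3.7), (3.8), (3.9), we get
$$\epsilon_n^{1-\alpha-\epsilon}\, \tilde m^+_{\pi/2-\beta(E)}(E + i\epsilon_n) \sim \epsilon_n^{1-\alpha-\epsilon}\, \frac{\|\tilde u^+_{2,\pi/2-\beta(E)}\|_{\tilde L^+_{\pi/2-\beta(E)}(\epsilon_n)}}{\|\tilde u^+_{1,\pi/2-\beta(E)}\|_{\tilde L^+_{\pi/2-\beta(E)}(\epsilon_n)}} = \epsilon_n^{1-\alpha-\epsilon}\, \frac{\|u^-_{2,\beta(E)}\|_{L^-_{\beta(E)}(\epsilon_n)}}{\|u^-_{1,\beta(E)}\|_{L^-_{\beta(E)}(\epsilon_n)}} \to \infty \tag{3.17}$$
as $n \to \infty$. Therefore, Lemma 5 implies Lemma 4. □

For the zero-dimensionality our main tool will be the following

Corollary 6. Let $H$ be an operator of the form (1.1). Suppose that for every $E$ in some Borel set $A$, there exist $c > 0$ and sequences $k_n \to \infty$, $j^\pm_n \le c k_n$, such that $\|\Phi^\pm_{j^\pm_n}\| \ge e^{a k_n^\gamma}$, for some $a, \gamma > 0$. Then the restriction $\mu(A \cap \cdot)$ is zero-dimensional.

Remark. This corollary is still more general than needed for our current application (in particular, we will only use it with $\gamma = 1$). However, this more general formulation may be useful in certain other situations.

Proof. Clearly, $\|\Phi^\pm_{j^\pm_n}\| \ge e^{a k_n^\gamma}$ implies $\sum_{i=1}^{j^\pm_n} \|\Phi^\pm_i\|^2 > a k_n^\beta$ for any $\beta < \infty$. Therefore, by Lemma 4, $\mu(A \cap \cdot)$ is $\alpha$-singular for any $\alpha > 0$. □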
4. Quasiperiodic Operators and Zero-Dimensional Spectrum. Proof of Theorem 1

As a warmup, we first prove a half-line version of Theorem 1. Let us consider $H^+_\beta$, operators on $\ell^2(\mathbb{Z}^+)$, defined by (1.1) along with a phase boundary condition (2.3). In case of ergodic $V(n)$, the Lyapunov exponent $\gamma(E)$ is defined as before and is, clearly, not dependent on $\beta, \theta$.

Theorem 7. Let $V$ be as in Theorem 1. Suppose that $\gamma(E) > 0$ for every $E$ in some Borel set $A$, then for every boundary phase $\beta$ and every $\theta$, the restriction $\mu(A \cap \cdot)$, of the corresponding spectral measure of the operator $H^+_\beta$, is zero-dimensional.

Proof. Recall that the upper Lyapunov exponent $\bar\gamma(E)$ is defined by
$$\bar\gamma(E) \equiv \limsup_{n\to\infty}\, (1/n) \ln\|\Phi_n(E)\|.$$
Note that, unlike $\gamma(E)$, the upper Lyapunov exponent of an ergodic operator $H_\theta$ is dependent on $\theta$. We will use the following

Theorem 8 (Corollary 4.3 of [19]). Suppose that $\bar\gamma(E) > 0$ for every $E$ in some Borel set $A$, then for every boundary phase, the restriction $\mu(A \cap \cdot)$ is zero-dimensional.

Therefore, we need to prove that under the conditions of Theorem 7, for every $\theta$, we have $\bar\gamma(E, \theta) > 0$. Note that boundedness of $V(n)$ immediately implies the following a priori upper bound:
$$\|\Phi_n(\theta, E)\| \le e^{nC} \tag{4.1}$$
for some $C < \infty$. Put $c(E) = \frac{\gamma(E)}{2C - \gamma(E)}$. Let $A_k = \{\theta \in [0,1) : \|\Phi_k(\theta, E)\| > e^{k\gamma(E)/2}\}$. We obtain, using (1.6), that
$$k\gamma(E) \le \int_0^1 \ln\|\Phi_k(\theta, E)\|\,d\theta = \int_{A_k} + \int_{[0,1)\setminus A_k} \le |A_k|\,kC + (1 - |A_k|)\,k\gamma(E)/2,$$
where $|\cdot|$ stands for the Lebesgue measure. We therefore obtain $|A_k| \ge c(E)$. Since $f$ is a trigonometric polynomial of degree $p$, $\|\Phi_k\|^2$ is a trigonometric polynomial of degree $2kp$, and the set $A_k$ consists of no more than $4kp$ intervals. Therefore, there exists a segment, $\Delta_k \subset A_k$, with $|\Delta_k| \ge \frac{c(E)}{4kp}$. We will now show that for every $\theta$, there exists a sequence $j_n \to \infty$ such that $\|\Phi_{j_n}\|$ is exponentially large. This sequence will be related to a continued fractions expansion of $\alpha$. Let $p_n/q_n$ be the sequence of continued fraction approximants of $\alpha$. We will use the following elementary properties of continued fractions, that can be found, for example, in [21]. (We assume here $\alpha > 0$. Otherwise, the signs in (4.3) will change.)
$$|q_n\alpha - p_n| < |k\alpha - p|, \quad k = 1, \dots, q_{n+1} - 1,\ k \ne q_n,\ p \in \mathbb{Z}; \tag{4.2}$$
$$q_n\alpha - p_n > 0 \text{ for } n \text{ odd and } q_n\alpha - p_n < 0 \text{ for } n \text{ even}; \tag{4.3}$$
$|q_n\alpha - p_n|$ […] $\frac{1}{q_n}$. Then, for any $\theta$, there exists a $k$ in $\{0, 1, \dots, q_n + q_{n-1} - 1\}$ such that $T^k_\alpha\theta \in \Delta$.

We will now finish the proof of Theorem 7. Set
$$k_n = \left[\frac{c(E)\,q_n}{4p}\right] + 1. \tag{4.5}$$
Then, by Lemma 9, for every $\theta$ there exists a $j$ in $\{0, 1, \dots, q_n + q_{n-1} - 1\}$ such that $T^j_\alpha\theta \in A_{k_n}$, and so $\|\Phi_{k_n}(T^j_\alpha\theta)\| > e^{k_n\gamma(E)/2}$. Since $\Phi_{j+k_n}(\theta) = \Phi_{k_n}(T^j_\alpha\theta)\,\Phi_j(\theta)$, and each $\Phi_i$ is unimodular, we obtain that either $\|\Phi_j(\theta)\|$ or $\|\Phi_{j+k_n}(\theta)\|$ is greater than $e^{\frac{k_n\gamma(E)}{4}}$. Let
$$j_n = \min\left\{ j \in \{0, \dots, q_n + q_{n-1} - 1 + k_n\} : \|\Phi_j(\theta)\| \ge e^{\frac{k_n\gamma(E)}{4}} \right\}. \tag{4.6}$$
By (4.1) we have
$$j_n \ge \frac{k_n\gamma(E)}{2C}, \tag{4.7}$$
so $j_n \to \infty$ as $n \to \infty$. To finish the proof of Theorem 7 it now remains to notice that (4.5), (4.6), (4.7) yield $\|\Phi_{j_n}(\theta)\| \ge e^{\frac{c(E)\gamma(E)}{4(8p + c(E))} j_n}$, which implies $\bar\gamma(E, \theta) \ge \frac{c(E)\gamma(E)}{4(8p + c(E))}$. □
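The continued-fraction facts used in this argument can be explored numerically. The sketch below rests on assumptions that are not taken from the text: the function names are hypothetical, the convergents are indexed from $p_0/q_0 = [\alpha]/1$ (so the parity in (4.3) may be shifted relative to the convention used here), and the covering statement is checked in the standard three-gap form, namely that the orbit points $T^k_\alpha\theta$, $0 \le k < q_n + q_{n-1}$, leave no gap of length $1/q_n$ or more on the circle.

```python
import math

def convergents(alpha, count):
    """First `count` convergents (p_n, q_n) of alpha, starting at [alpha]/1.

    Uses floating point, so `count` should stay small.
    """
    p_prev, q_prev = 1, 0
    a = math.floor(alpha)
    p, q = a, 1
    out = [(p, q)]
    x = alpha - a
    for _ in range(count - 1):
        if x == 0:
            break
        x = 1.0 / x
        a = math.floor(x)
        x -= a
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def max_gap(alpha, theta, n_points):
    """Largest gap on [0,1) left by theta + k*alpha (mod 1), 0 <= k < n_points."""
    pts = sorted((theta + k * alpha) % 1.0 for k in range(n_points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 1.0 - pts[-1])   # wrap-around gap
    return max(gaps)

def check(alpha, theta=0.37, count=6):
    """Return True if the sampled convergents pass all three checks."""
    cv = convergents(alpha, count)
    ok = True
    for n in range(1, len(cv) - 1):
        (p, q), (p_next, q_next) = cv[n], cv[n + 1]
        q_before = cv[n - 1][1]
        err, err_next = q * alpha - p, q_next * alpha - p_next
        ok &= abs(err) < 1.0 / q_next                        # approximation quality
        ok &= err * err_next < 0                             # alternating signs
        ok &= max_gap(alpha, theta, q + q_before) < 1.0 / q  # orbit covering
    return ok

golden = (math.sqrt(5.0) - 1.0) / 2.0
```

For example, `convergents(golden, 6)` lists denominators 1, 1, 2, 3, 5, 8, and `check` can be run on other irrationals such as $\sqrt{2} - 1$.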
Proof of Lemma 9. Let $k^{+(-)}$ define such $s \in \{0, \dots, q_n + q_{n-1} - 1\}$ that $T^s_\alpha\theta$ is the nearest neighbor to the right (left) of $T^k_\alpha\theta$. To prove the lemma we show that for any $k \in \{0, \dots, q_n + q_{n-1} - 1\}$, we have $\operatorname{dist}(T^k_\alpha\theta, T^{k^{+(-)}}_\alpha\theta)$ […] $0$. Note that the denominators $q_n$ are the same for $\alpha$ and $-\alpha$. Then, from (4.6), (4.7), and (4.9), we obtain for $n$ sufficiently large, that the conditions of Corollary 6 are satisfied with $A = \{E : \gamma(E) > 0\}$, $a = \frac{c(E)\gamma(E)}{4(8p + c(E))}(1 - \epsilon)$, $\gamma = 1$, $c = \frac{8p}{c(E)} + 1$, the sequence $k_n$ defined by (4.5), and the sequences $j^\pm_n$ defined by (4.6) with $\Phi_n = \Phi^{\pm,\alpha}_n$. □
5. The Fibonacci Hamiltonian. Proof of Theorem 3

Our proof of Theorem 3 relies on two main ingredients. First, we will show that for energies $E$ in the spectrum of $H_\lambda$, the half-line $m$-functions scale in a way that corresponds to $\alpha$-continuous spectrum, namely that for every $\beta$, $\limsup_{\epsilon\to 0} \epsilon^{1-\alpha}|m^+_\beta(E + i\epsilon)| < \infty$ (where $\alpha$ is some positive constant depending only on $\lambda$). Second, we will use the symmetry properties of the potential to conclude a corresponding result for the whole-line $m$-function. To prove all that, we use a number of facts that were established by Sütő in [29], as well as a result of Iochum and Testard [15]. We start with

Proposition 10. Let $\lambda \in \mathbb{R}$ and let $H_\lambda$ be the corresponding Fibonacci Hamiltonian. Then there exists an $\tilde\alpha > 0$ such that for any $E$ in the spectrum of $H_\lambda$, any solution $u$ of $H_\lambda u = Eu$ obeys $\liminf_{L\to\infty} L^{-\tilde\alpha}\|u\|_L > 0$.

Proof. Let $\{F_n\}_{n=1}^{\infty}$ be the Fibonacci numbers, defined by $F_0 = 1$, $F_1 = 1$, $F_{n+1} = F_n + F_{n-1}$. As shown by Sütő (Proposition 1 of [29]), the Fibonacci potential obeys $V(F_n + \ell) = V(\ell)$ for $n \ge 3$ and $1 \le \ell \le F_n$. Thus, for $n$ large and $1 \le \ell \le F_{n-2}$, we also get $V(2F_n + \ell) = V(F_{n+1} + F_{n-2} + \ell) = V(F_{n-2} + \ell) = V(\ell)$, namely, we have
$$V(\ell) = V(F_n + \ell) = V(2F_n + \ell) \tag{5.1}$$
for $1 \le \ell \le F_{n-2}$. By Lemma 1 of [29], we have for any $2 \times 2$ matrix $B$ with $\det B = 1$,
$$\max\{|\operatorname{Tr} B| \cdot \|B\Psi\|,\ \|B^2\Psi\|\} \ge \frac{1}{2}\|\Psi\| \tag{5.2}$$
for any 2-vector $\Psi$. Thus, if we have a bound on $|\operatorname{Tr} B|$ of the form $|\operatorname{Tr} B| \le C$ with $C > 1$, we get
$$\|B\Psi\|^2 + \|B^2\Psi\|^2 > \frac{1}{4C^2}\|\Psi\|^2. \tag{5.3}$$
For every $m < k$, let $\Phi_{m,k}$ denote the $2 \times 2$ transfer matrix which takes $(u(m+1), u(m))^T$ to $(u(k+1), u(k))^T$. Let $1 \le \ell \le F_{n-2}$. By (5.1), we have that $\Phi_{\ell, F_n+\ell} = \Phi_{F_n+\ell, 2F_n+\ell}$, and moreover, we have $\operatorname{Tr}\Phi_{\ell, F_n+\ell} = \operatorname{Tr}\Phi_{0,F_n}$. As shown in the proof of Lemma 2 of [29], there exists a constant $C_\lambda$, depending only on $\lambda$, such that $|\operatorname{Tr}\Phi_{0,F_n}| < C_\lambda$ for every $E$ in the spectrum of $H_\lambda$ and every $n$. Thus, (5.3) implies
$$|u(F_n + \ell + 1)|^2 + |u(F_n + \ell)|^2 + |u(2F_n + \ell + 1)|^2 + |u(2F_n + \ell)|^2 > \frac{1}{4C_\lambda^2}\left(|u(\ell+1)|^2 + |u(\ell)|^2\right), \tag{5.4}$$
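from which we easily deduce that
$$\|u\|_{F_{n+2}} > \left(1 + \frac{1}{4C_\lambda^2}\right)^{1/2} \|u\|_{F_{n-2}} \tag{5.5}$$
for large $n$. Since the $F_n$’s grow exponentially in $n$, (5.5) proves the required result. □

We can now get the required estimate for $m^+_\beta$.

Proposition 11. Let $\lambda \in \mathbb{R}$ and let $H_\lambda$ be the corresponding Fibonacci Hamiltonian. Then there exists an $\alpha > 0$, which depends only on $\lambda$, such that for any $E$ in the spectrum of $H_\lambda$ and every $\beta$, $\limsup_{\epsilon\to 0} \epsilon^{1-\alpha}|m^+_\beta(E + i\epsilon)| < \infty$.

Proof. By combining (2.12) and (2.13), we get
$$\epsilon^{1-\alpha}|m^+_\beta(E + i\epsilon)| \sim \left(\|u^+_{1,\beta}\|_{L^+_\beta(\epsilon)}\,\|u^+_{2,\beta}\|_{L^+_\beta(\epsilon)}\right)^{\alpha-1} \frac{\|u^+_{2,\beta}\|_{L^+_\beta(\epsilon)}}{\|u^+_{1,\beta}\|_{L^+_\beta(\epsilon)}} = \left(\|u^+_{2,\beta}\|_{L^+_\beta(\epsilon)}\right)^{\alpha} \left(\|u^+_{1,\beta}\|_{L^+_\beta(\epsilon)}\right)^{\alpha-2}. \tag{5.6}$$
Iochum and Testard [15] proved an estimate that is analogous to our Proposition 10, but bounds the growth of solutions from above. Their result implies that for every $\lambda \in \mathbb{R}$, there exists a finite positive constant $\tilde\alpha_1$, such that for every $E$ in the spectrum of $H_\lambda$, any solution $u$ of $H_\lambda u = Eu$ obeys $\limsup_{L\to\infty} L^{-\tilde\alpha_1}\|u\|_L < \infty$. This means that for appropriate energies $E$, the solution $u^+_{2,\beta}$ of (5.6) obeys $\|u^+_{2,\beta}\|_{L^+_\beta(\epsilon)} < C_2\,(L^+_\beta(\epsilon))^{\tilde\alpha_1}$ for some constant $C_2$. Similarly, Proposition 10 implies that $\|u^+_{1,\beta}\|_{L^+_\beta(\epsilon)} > C_1\,(L^+_\beta(\epsilon))^{\tilde\alpha}$ for some constant $C_1$. Thus, for $\alpha \le 1$, the r.h.s. of (5.6) is bounded from above by $C_2^\alpha C_1^{\alpha-2}\,(L^+_\beta(\epsilon))^{\alpha(\tilde\alpha_1 + \tilde\alpha) - 2\tilde\alpha}$, which is bounded from above (as $\epsilon \to 0$) if we take $\alpha \le 2\tilde\alpha/(\tilde\alpha_1 + \tilde\alpha)$. □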
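The two structural properties of the Fibonacci potential that this section relies on, the local repetition $V(F_n + \ell) = V(\ell)$ behind (5.1) and the reflection symmetry $V(-n) = V(n-1)$ used in the proof of Theorem 3, can be checked directly for small indices. The following is a minimal sketch with hypothetical function names; it computes $[n\omega]$ in exact integer arithmetic (via an integer square root) rather than floating point, which is an implementation choice and not something taken from [29].

```python
from math import isqrt

def floor_n_omega(n):
    """Exact integer part [n*omega] for omega = (sqrt(5) - 1)/2 and integer n."""
    if n >= 0:
        # [n*omega] = floor((floor(n*sqrt(5)) - n) / 2), all in integers
        return (isqrt(5 * n * n) - n) // 2
    # n*omega is irrational for n != 0, hence [-x] = -[x] - 1
    return -floor_n_omega(-n) - 1

def fibonacci_potential(n, lam=1):
    """V(n) = lam * ([(n+1) omega] - [n omega])."""
    return lam * (floor_n_omega(n + 1) - floor_n_omega(n))

# Fibonacci numbers with the convention F_0 = F_1 = 1 used in the text.
F = [1, 1]
while len(F) < 16:
    F.append(F[-1] + F[-2])

repetition_ok = all(
    fibonacci_potential(F[n] + l) == fibonacci_potential(l)
    for n in range(3, 16)
    for l in range(1, F[n] + 1)
)
symmetry_ok = all(
    fibonacci_potential(-n) == fibonacci_potential(n - 1) for n in range(2, 500)
)
first_values = [fibonacci_potential(n) for n in range(-2, 9)]
```

With $\lambda = 1$ the potential takes only the values 0 and 1, and `first_values` lists $V(-2), \dots, V(8)$, including $V(-1) = \lambda$ and $V(0) = 0$.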
Proof of Theorem 3. By Proposition 1 of [29], the Fibonacci potential obeys $V(-n) = V(n-1)$ for any $n \ge 2$. Let $z = E + i\epsilon$, where $E \in \mathbb{R}$ and $\epsilon > 0$, then the uniqueness of the solutions $\hat u^\pm_z$ (introduced in Sect. 2) along with this symmetry, implies that
$$\frac{\hat u^+_z(1)}{\hat u^+_z(0)} = \frac{\hat u^-_z(-2)}{\hat u^-_z(-1)}. \tag{5.7}$$
For any solution $u$ of $H_\lambda u = zu$, we clearly have for every $n$,
$$\frac{u(n+1)}{u(n)} = z - V(n) - \frac{u(n-1)}{u(n)}, \tag{5.8}$$
and thus
$$\frac{\hat u^-_z(-2)}{\hat u^-_z(-1)} = z - V(-1) - \left(z - V(0) - \frac{\hat u^-_z(1)}{\hat u^-_z(0)}\right)^{-1}. \tag{5.9}$$
By using (2.6) and the fact that $V(0) = 0$ and $V(-1) = \lambda$, (5.7) and (5.9) imply
$$-m^+(z) = z - \lambda - (z - m^-(z))^{-1}, \tag{5.10}$$
which can be rearranged to yield
$$m^-(z) = z - (z - \lambda + m^+(z))^{-1}. \tag{5.11}$$
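The rearrangement of (5.10) can be verified with a few lines of complex arithmetic. The sketch below is a sanity check only, with hypothetical function names and arbitrary sample values: it solves (5.10) for $m^-$, which gives $m^- = z - (z - \lambda + m^+)^{-1}$, substitutes the result back into (5.10), and confirms that an $m^+$ with positive imaginary part produces an $m^-$ with positive imaginary part, as the text requires of $m^\pm(z)$.

```python
def m_minus_from_m_plus(z, m_plus, lam):
    """Solve (5.10), -m+ = z - lam - 1/(z - m-), for m-."""
    return z - 1.0 / (z - lam + m_plus)

def residual_5_10(z, m_plus, m_minus, lam):
    """Left-hand side minus right-hand side of (5.10)."""
    return -m_plus - (z - lam - 1.0 / (z - m_minus))

# Arbitrary sample point in the upper half-plane (illustrative assumption).
z = 0.3 + 0.05j
lam = 1.5
m_plus = 0.7 + 0.4j
m_minus = m_minus_from_m_plus(z, m_plus, lam)
residual = abs(residual_5_10(z, m_plus, m_minus, lam))
```

Here `residual` vanishes up to rounding and `m_minus.imag` is positive.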
Since we already know that $H_\lambda$ has purely singular spectrum, its spectral measure is supported on the set of energies $E$ for which $|M(E + i0)| = \infty$. Moreover, it had been shown by Gilbert [11] that the spectral measure is supported on the set of $E$’s for which $H_\lambda u = Eu$ has solutions that are subordinate both to the left and to the right. This implies that for such energies, $m^\pm(E + i0)$ exist and are either real or infinity (where we identify $m^\pm(E + i0) = \infty$ with $|m^\pm(E + i0)| = \infty$, namely with $\lim_{\epsilon\to 0}|m^\pm(E + i\epsilon)| = \infty$). That is, the spectral measure of $H_\lambda$ is supported on a set $S$ that is contained in its spectrum and such that for any $E \in S$, $|M(E + i0)| = \infty$ and $m^\pm(E + i0)$ both exist and are in $\mathbb{R} \cup \{\infty\}$. From (2.7) along with the fact that $m^\pm(z)$ have positive imaginary parts, we see that for each $E \in S$, we must have either $|m^+(E + i0)| = |m^-(E + i0)| = \infty$ or that $m^\pm(E + i0)$ are real and $m^+(E + i0) = -m^-(E + i0)$. But, by (5.11) we see that if $|m^+(E + i0)| = \infty$, then $m^-(E + i0)$ is finite, and so for each $E \in S$ we must have that $m^\pm(E + i0)$ are real and $m^+(E + i0) = -m^-(E + i0)$. Let $E \in S$, then by inverting (2.5) to yield $m^+_\beta$ in terms of $m^+$, we obtain
$$m^+_\beta = \frac{m^+\cot\beta + 1}{\cot\beta - m^+}, \tag{5.12}$$
from which we see that $|m^+_\beta(E + i0)| = \infty$ exactly when $\cot\beta = m^+(E + i0)$, and in that case, $|m^+_\beta(E + i\epsilon)| \sim |(\cot\beta - m^+(E + i\epsilon))^{-1}|$ as $\epsilon \to 0$. Thus, Proposition 11 implies that for small $\epsilon$, $|\cot\beta - m^+(E + i\epsilon)|$ is much larger than $\epsilon$. By differentiating the r.h.s. of (5.11) with respect to $m^+$ (at the point where $z = E$ and $m^+ = \cot\beta$), we see that as $\epsilon \to 0$, $|m^+(E + i\epsilon) + m^-(E + i\epsilon)| \sim |\cot\beta - m^+(E + i\epsilon)|$ and thus, by (2.7), we also have $|M(E + i\epsilon)| \sim |m^+_\beta(E + i\epsilon)|$. By Proposition 11, this implies that
$$\limsup_{\epsilon\to 0} \epsilon^{1-\alpha}|M(E + i\epsilon)| < \infty \tag{5.13}$$
for a.e. $E$ w.r.t. the spectral measure of $H_\lambda$, and thus, $H_\lambda$ has purely $\alpha$-continuous spectrum. □

Remarks.
• The proof of Theorem 3 uses the symmetry of the potential to show that for an appropriate $\beta$, $|m^+_\beta(E + i\epsilon)| \sim |M(E + i\epsilon)|$ as $\epsilon \to 0$, and thus that the continuity result for the half-line problem (Proposition 11) translates to a corresponding continuity result for the whole-line problem. The crucial fact in showing this is (5.11), which has the form
$$m^-(z) = f(z, m^+(z)), \tag{5.14}$$
where $f$ is a rational function of two variables with real coefficients. By using (5.8) and (2.6), one can show that a relation of the form (5.14) would hold whenever $V(k - n) = V(m + n)$ for some $k, m \in \mathbb{Z}$ and all $n \in \mathbb{N}$, and the implication $|m^+_\beta(E + i\epsilon)| \sim |M(E + i\epsilon)|$ can thus be extended to any such case. That is, if $V(k - n) = V(m + n)$ for some $k, m \in \mathbb{Z}$ and all $n \in \mathbb{N}$, and if there exists an $0 < \alpha \le 1$ such that for any $E$ in some Borel set $S$ and every $\beta$, $\limsup_{\epsilon\to 0} \epsilon^{1-\alpha}|m^+_\beta(E + i\epsilon)| < \infty$, then the restriction of the spectral measure of the whole-line problem to $S$ is $\alpha$-continuous.
• Since our above proof had been obtained, there had been some work done by others on extending it to deal with other potentials. In particular, a) Damanik [8] has extended our above argument to treat a more general class of Fibonacci-like potentials. b) Very recently, Damanik–Killip–Lenz [9] have found an argument that does away with the symmetry requirement in our above proof. It allows them to prove spectral α-continuity for arbitrary realizations in the hull of Fibonacci-like potentials. Acknowledgement. We would like to thank J. Avron and B. Simon for useful discussions. SJ is an Alfred P. Sloan Research Fellow, and was supported in part by NSF Grants DMS-9501265 and DMS-9704130. YL was supported in part by NSF Grant DMS-9801474 and by an Allon Fellowship.
References 1. Avron, J. and Simon, B.: Singular continuous spectrum for a class of almost periodic Jacobi matrices. Bull. AMS 6, 81–85 (1982) 2. Bellissard, J., Iochum, B., Scoppola, E., Testard, D.: Spectral properties of one-dimensional quasi-crystals. Commun. Math. Phys. 125, 527–543 (1989) 3. Berezanskii, Y.: Expansions in Eigenfunctions of Selfadjoint Operators. Transl. Math. Monogr. Vol. 17. Providence, RI: Am. Math. Soc., 1968 4. Bourgain, J., Goldstein, M.: On nonperturbative localization with quasiperiodic potential. Ann. of Math., to appear 5. Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston, MA: Birkhauser, 1990 6. Cornfeld, I., Fomin, S., Sinai, Ya.: Ergodic Theory. New York: Springer, 1982 7. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin–Heidelberg–NewYork: Springer, 1987 8. Damanik, D.: α−continuity properties of one-dimensional quasicrystals. Commun. Math. Phys. 192, 1765–1769 (1998) 9. Damanik, D., Killip, R., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity. Commun. Math. Phys., to appear 10. del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank-one perturbations, and localization. J. d’Analyse Math. 69, 153–200 (1996) 11. Gilbert, D.J.: On subordinacy and analysis of the spectrum of Schrödinger operators with two singular endpoints. Proc. Roy. Soc. Edinburgh Sect. A 112, 213–229 (1989) 12. Gilbert, D.J., Pearson, D.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. 128, 30–56 (1987) 13. Gordon, A., Jitomirskaya, S., Last, Y., Simon, B.: Duality and singular continuous spectrum in the almost Mathieu equation. Acta Math. 178, 169–183 (1997) 14. Herman, M.: Une methode pour minorer les exposants de Lyapunov et quelques exemples montrant le caractere local d’un theoreme d’Arnold et de moser sur le tore en dimension 2. 
Commun. Math. Helv. 58, 453–502 (1983) 15. Iochum, B., Testard, D.: Power law growth for the resistance in the Fibonacci model. J. Stat. Phys. 65, 715–723 (1991) 16. Jitomirskaya, S.: Almost everything about the almost Mathieu operator II. In: Proceedings of the XI International Congress of Mathematical Physics, Paris 1994, Cambridge, MA: Int. Press, 373–382 (1995) 17. Jitomirskaya, S.: Metal insulator transition for the almost Mathieu operator.Ann. of Math. 150, 1159–1175 (1999) 18. Jitomirskaya, S., Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996) 19. Jitomirskaya, S., Last, Y.: Power-Law subordinacy and singular spectra, I. Half line operators. Acta Math. 183, 171–189 (1999) 20. Jitomirskaya, S., Simon, B.: Operators with Singular Continuous Spectrum, III. Almost Periodic Schrodinger Operators. Commun. Math. Phys. 165, 201–206 (1994) 21. Khintchin, A.Ya.: Continued Fractions. Groningen: Noordhoff, 1963 22. Last, Y.: A relation between absolutely continuous spectrum of ergodic Jacobi matrices and the spectra of periodic approximants. Commun. Math. Phys. 151, 183–192 (1993) 23. Last, Y.: Zero measure spectrum for the almost Mathieu operator. Commun. Math. Phys. 164, 421–432 (1994)
S. Ya. Jitomirskaya, Y. Last
24. Last Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996) 25. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of onedimensional Schrödinger operators. Invent. Math. 135, 329–367 (1999) 26. Sorets, E., Spencer, T.: Positive Lyapunov exponents for Schrödinger operators with quasi-periodic potentials. Commun. Math. Phys. 142, 543–566 (1991) 27. Raymond, L.: A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain. Preprint 28. Simon, B.: Schrödinger semigroups, Bull. AMS 7, 447–526 (1982) 29. Süt˝o, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987) 30. Süt˝o, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian. J. Stat. Phys. 56, 525–531 (1989) Communicated by B. Simon
Commun. Math. Phys. 211, 659 – 686 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Exponential Mixing and $|\ln\hbar|$ Time Scales in Quantized Hyperbolic Maps on the Torus

Francesco Bonechi (1), Stephan De Bièvre (2)

(1) INFN, Sezione di Firenze, Dipartimento di Fisica, Università di Firenze, Largo E. Fermi 2, 50125 Firenze, Italy. E-mail: [email protected]
(2) UFR de Mathématiques et UMR AGAT, Université des Sciences et Technologies de Lille, 59655 Villeneuve d’Ascq Cedex, France. E-mail: [email protected]

Received: 1 October 1999 / Accepted: 4 January 2000
Abstract: We study the behaviour, in the simultaneous limits $\hbar \to 0$, $t \to \infty$, of the Husimi and Wigner distributions of initial coherent states and position eigenstates, evolved under the quantized hyperbolic toral automorphisms and the quantized baker map. We show how the exponential mixing of the underlying dynamics manifests itself in those quantities on time scales logarithmic in $\hbar$. The phase space distributions of the coherent states, evolved under either of those dynamics, are shown to equidistribute on the torus in the limit $\hbar \to 0$, for times $t$ between $\frac{1}{2}\frac{|\ln\hbar|}{\gamma}$ and $\frac{|\ln\hbar|}{\gamma}$, where $\gamma$ is the Lyapounov exponent of the classical system. For times shorter than $\frac{1}{2}\frac{|\ln\hbar|}{\gamma}$, they remain concentrated on the classical trajectory of the center of the coherent state. The behaviour of the phase space distributions of evolved position eigenstates, on the other hand, is not the same for the quantized automorphisms as for the baker map. In the first case, they equidistribute provided $t \to \infty$ as $\hbar \to 0$, and as long as $t$ is shorter than $\frac{|\ln\hbar|}{\gamma}$. In the second case, they remain localized on the evolved initial position at all such times.

1. Introduction

It has been known for a long time and proven in a large number of situations that the eigenfunctions of a quantum system that has an ergodic classical limit equidistribute on the relevant energy surface as $\hbar \to 0$ [Bo,BDB1,BDB2,CdV,DBDE,GL,HMR,Sc,Z1]. Similarly, if the underlying dynamics is mixing, this has an effect on the off-diagonal matrix elements of observables between eigenstates [Bo,CR1,Z2,Z3]. In other words, signatures of ergodicity or mixing on spectral properties of quantum systems have been extensively studied.
In this paper we exhibit a phenomenon in the time domain that is a signature of the exponential mixing of the underlying dynamics: the equidistribution on the relevant energy surface of the Husimi and Wigner distributions of an evolved coherent state in the limit where simultaneously h¯ → 0 and 21 | lnγ h¯ | 2. We write Av± = λ±1 v± , λ = exp γ > 1, v± = (cos θ± , sin θ± ) for the eigenvectors and the eigenvalues of A. We will always write N = 1/(2π h¯ ), although N will be assumed to be an integer only when we deal with the quantum map on HN (κ) (in which case automatically A ∈ SL2 (Z)), in Proposition 9. Proposition 8. Let z ∈ C, with Imz > 0. (i) Let A ∈ SL2 (R) with |TrA| > 2. Then, for each f ∈ `ˇ1,2 , there exists Cf > 0 such that ∀ x0 ∈ T2 , λ2t W . Of [t, N](x0 ) − (f ◦ At )(x0 ) ≤ Cf β+ (z) N (ii) Let A ∈ SL2 (Z) with |TrA| > 2. If f ∈ `ˇ1,k , (k ∈ N∗ ) then there exist Cf,1 , Cf,2 > 0 such that ∀ x0 ∈ T2 and ∀ (t, N), k λ2t N 2 Cf,1 2 β (z) −π N c+ − dx f (x) ≤ + C e , f,2 k 2t 2 β (z) λ − T
Z W O [t, N](x0 ) − f
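where $c_+ = \max\{|\cos\theta_+|, |\sin\theta_+|\}$.

Proof. (i) Since $\chi_\xi(x) = \exp 2\pi i\langle x,\xi\rangle$, we have $\mathrm{Op}^W(\chi_\xi) = U(\xi_1/N, \xi_2/N)$ and hence we obtain, using (18),
\[ O^W_{\chi_\xi}[0,N](x_0) = e^{2i\pi\langle x_0,\xi\rangle}\, e^{-\frac{\pi}{N}(\xi, B(z)\xi)}. \tag{29} \]
Using (25), it then follows that $O^W_{\chi_\xi}[t,N] = O^W_{\chi_{\xi_t}}[0,N]$, where $\xi_t = A^{-t}\xi$. As a result, if $f = \sum_n f_n\chi_n$, then
\[ O^W_f[t,N](x_0) = \sum_{n\in\mathbb{Z}^2} f_n\, e^{-\frac{\pi}{N}(n_t, B(z)n_t)}\, \chi_{n_t}(x_0). \tag{30} \]
Using (30) we have that
\[ \left| O^W_f[t,N](x_0) - f\circ A^t(x_0) \right| \le \sum_{n\in\mathbb{Z}^2} |f_n| \left(1 - e^{-\frac{\pi}{N}(n_t, B(z)n_t)}\right) \le \frac{\pi}{N}\sum_{n\in\mathbb{Z}^2} |f_n|\,(n_t, B(z)n_t) \le \frac{\pi\beta_+(z)}{N}\sum_{n\in\mathbb{Z}^2} |f_n|\,\|n_t\|^2, \]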
where we used the inequality $1 - e^{-y} \le y$, for each $y \ge 0$. The result then follows from the properties of $\|A^{-t}\|$ and from the regularity of $f$.

(ii) We recall that $s_+ = \cot\theta_+$ is a quadratic irrational. The basic properties of quadratic irrationals that we will use can be found in [Kh]. Using the Schwarz inequality we have that
\[ \|A^{-t}n\| \ge |\langle v_+, A^{-t}n\rangle| = |\langle A^t v_+, n\rangle| = \lambda^t\,|\sin\theta_+|\,|n_1 - n_2 s_+|, \]
and from (30) we obtain the following estimate:
\[ \left| O^W_f[t,N](x_0) - \int_{\mathbb{T}^2} dx\, f(x) \right| \le \sum_{n\ne 0} |f_n|\, e^{-\pi\beta_-(z)\frac{\lambda^{2t}}{N}\sin^2\theta_+\,|n_1 - n_2 s_+|^2}. \]
Without loss of generality suppose that $s_+ > 0$. It is convenient to divide the sum in two parts, $I_1$ and $I_2$. In $I_1$ we sum over $n_1\cdot n_2 \le 0$, from which we easily obtain the exponential term in the estimate. To discuss $I_2$, we recall that since $s_+$ is a quadratic irrational, there exists a constant $C > 0$ such that, for all $n_1, n_2 > 0$,
\[ |n_1 - s_+ n_2| > \frac{1}{C n_2}. \]
Then,
\[ I_2 \le \sum_{n_1,n_2>0} \left( |f_{n_1,n_2}| + |f_{-n_1,-n_2}| \right) e^{-\pi\beta_-(z)\frac{\lambda^{2t}\sin^2\theta_+}{N C^2 n_2^2}}. \]
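The Diophantine bound used for $I_2$ can be checked numerically. The following is only a sketch under stated assumptions: the golden ratio stands in for the quadratic irrational $s_+$ (it is not tied to a particular matrix $A$), and the function name `min_scaled_distance` is hypothetical. It evaluates $\min_{n_2} n_2\,|n_1 - s n_2|$ over a finite range, which stays bounded away from zero as the bound $|n_1 - s_+ n_2| > 1/(C n_2)$ predicts.

```python
import math


def min_scaled_distance(s, n2_max):
    """Return the minimum over 1 <= n2 <= n2_max of n2 * |n1 - s*n2|,
    where n1 is the integer closest to s*n2 (the worst case for the bound)."""
    best = float("inf")
    for n2 in range(1, n2_max + 1):
        n1 = round(s * n2)
        best = min(best, n2 * abs(n1 - s * n2))
    return best


# Assumption: the golden ratio is used as a sample quadratic irrational.
golden = (1 + math.sqrt(5)) / 2
lower = min_scaled_distance(golden, 2000)
print(lower)  # any C with 1/C below this value works on the tested range
```

A finite scan of this kind does not prove the bound; it only illustrates that the product $n_2\,|n_1 - s n_2|$ does not decay for a quadratic irrational.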
We then obtain the final estimate using the inequality exp(−x) < Ck0 /x k , valid for some t Ck0 > 0 and for each x > 0. u From now on, N = 1/(2π h¯ ) is an integer and let Hh¯ (κ) be the Hilbert space defined in Sect. 2. The key observation allowing a simple analysis of the behaviour of OfW,κ [t, N ] in the (t, N)-plane is the following proposition, which shows that for times t 0. If f ∈ `ˇ1,k , then there exist Cf,1 , Cf,2 > 0 such that for each x0 ∈ T2 and all t, N , t k λ πN W,κ W β− (z) . [t, N](x ) − O [t, N ](x ) < C + Cf,1 exp − Of 0 0 f,2 f N 4 Proof. Using the inequality of Proposition 6(iii) and recalling that cn0 = OχWn [0, N ], it is easy to find Cf,1 such that πN W,κ Of [t, N](x0 ) − OfW [t, N ](x0 ) ≤ Cf,1 e− 4 β− (z) + X n
n p( t ) |fn | cnt N (x0 )e−i(p
nt 1 nt 2 N )κ1 −p( N )κ2
− cn0t (x0 )
.
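Because $p(x) \ne 0$ only if $|x| > 1/2$, the sum is limited to $n$ such that $|n_{t1}/N|$ or $|n_{t2}/N| > 1/2$. Then, we have
\[ \left| O^{W,\kappa}_f[t,N](x_0) - O^W_f[t,N](x_0) \right| \le C_{f,1}\, e^{-\frac{\pi N}{4}\beta_-(z)} + 2\sum_{\|A^{-t}n\| > \frac{N}{2}} |f_n| \]
\[ \le C_{f,1}\, e^{-\frac{\pi N}{4}\beta_-(z)} + 2\sum_n \left(\frac{2\|A^{-t}n\|}{N}\right)^k |f_n| \le C_{f,1}\, e^{-\frac{\pi N}{4}\beta_-(z)} + 2\left(\frac{2\|A^{-t}\|}{N}\right)^k \sum_n |f_n|\,\|n\|^k. \quad\square \]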
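It is now clear that the results of Propositions 8 and 9 imply Theorem 1. We now give a more direct proof of the mixing regime of Theorem 1 for the Husimi distribution, based on the intuitive picture presented in the introduction and in particular on (5).

Proposition 10. Let $z \in \mathbb{C}$ ($\mathrm{Im}\,z > 0$), and $A \in SL_2(\mathbb{Z})$, with $\mathrm{Tr}A > 2$.
(i) Let $f \in \check{\ell}_{1,1}$. There exists $C_{f,A} > 0$ such that for all $x_0 \in \mathbb{R}^2$, $\hbar, t > 0$,
\[ \left| \langle x_0,z|M(A)^{-t}\,\mathrm{Op}^{AW}_z(f)\,M(A)^t|x_0,z\rangle - \int_{\mathbb{T}^2} dx\, f(x) \right| \le C_{f,A}\,\frac{|z|+1}{\sqrt{\mathrm{Im}\,z}}\,\frac{1}{\lambda^t\sqrt{\hbar}}. \]
(ii) Let $f \in \check{\ell}_{1,k}$, with $k \ge 1$. There exist $C_{f,A}, C_{f1}, C_{f2} > 0$ such that for all $x_0 \in \mathbb{T}^2$, $N, t > 0$,
\[ \left| \langle x_0,z,\kappa|M_\kappa(A)^{-t}\,\mathrm{Op}^{AW}_{\kappa,z}(f)\,M_\kappa(A)^t|x_0,z,\kappa\rangle - \int_{\mathbb{T}^2} dx\, f(x) \right| \le C_{f,A}\,\frac{\sqrt{N}}{\lambda^t} + C_{f2}\left(\frac{\lambda^t}{N}\right)^k + C_{f1}\, e^{-\pi N\beta_-(z)/4}. \]

Proof. (i) Let $f_1 = \sum_n f_{1,n}\chi_n \in \check{\ell}_{1,1}$ (with $\int_{\mathbb{T}^2} dx\, f_1(x) = 0$) and $f_2 \in C^\infty(\mathbb{R}^2)$ such that $\int_{\mathbb{R}^2} dy\,[|\partial_q f_2| + |\partial_p f_2|] < \infty$. Adapting the proof given in Theorem 4 of [DB] it can be shown that
\[ \left| \int_{\mathbb{R}^2} dy\,(f_1\circ A^t)(y)\,f_2(y) \right| \le C_A\,\lambda^{-t}\Big(\sum_n |f_{1,n}|\,\|n\|\Big)\int_{\mathbb{R}^2} dy\,[|\partial_q f_2| + |\partial_p f_2|], \]
for some $C_A > 0$. Then, if $\int_{\mathbb{T}^2} f = 0$,
\[ |\langle x_0,z|M(A)^{-t}\,\mathrm{Op}^{AW}_z(f)\,M(A)^t|x_0,z\rangle| = \left| \int_{\mathbb{R}^2} dy\,(f\circ A^t)(y)\,\frac{|\langle y, A^{-t}\cdot z|x_0,z\rangle|^2}{2\pi\hbar} \right| \]
\[ = \left| \int_{\mathbb{R}^2} dy\,(f\circ A^t)(y)\,\frac{|\langle 0, A^{-t}\cdot z|x_0 - y,z\rangle|^2}{2\pi\hbar} \right| = \left| \int_{\mathbb{R}^2} d\xi\,(f\circ A^t)(x_0-\xi)\,\frac{|\langle 0, A^{-t}\cdot z|\xi,z\rangle|^2}{2\pi\hbar} \right| \]
\[ \le C_A\Big(\sum_n |f_n|\,\|n\|\Big)\lambda^{-t}\int_{\mathbb{R}^2} dy\,[|\partial_q h| + |\partial_p h|], \]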
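where $h(\xi) = |\langle 0, A^{-t}\cdot z|\xi,z\rangle|^2/(2\pi\hbar)$ is the Husimi distribution of $|\psi\rangle = |0, A^{-t}z\rangle$. The result then follows from Proposition 7(i).

(ii) The only difficulty in repeating on the torus the estimate given under (i) comes from the fact that $|\langle y,z_1,\kappa|x,z_2,\kappa\rangle|^2 \ne |\langle 0,z_1,\kappa|x-y,z_2,\kappa\rangle|^2$. Indeed,
\[ \langle x_0,z,\kappa|M(A)^{-t}\,\mathrm{Op}^{AW}_{\kappa,z}(f)\,M(A)^t|x_0,z,\kappa\rangle = \int_{\mathbb{T}^2} dy\,(f\circ A^t)(x_0-y)\,\frac{|\langle 0, A^{-t}\cdot z;\kappa|y,z;\kappa\rangle|^2}{2\pi\hbar} + \int_{\mathbb{T}^2} dy\, G(x_0,z;y,A^{-t}\cdot z)\,(f\circ A^t)(y), \]
where $G(x,z_1;y,z_2) = N\left(|\langle y,z_2,\kappa|x,z_1,\kappa\rangle|^2 - |\langle 0,z_2,\kappa|x-y,z_1,\kappa\rangle|^2\right)$. The first term can be evaluated as in part (i), using this time Proposition 7(ii). Let $G(x,z_1;y,z_2) = \sum_\beta G_\beta(x,z_1;z_2)\exp i2\pi\langle\beta,y\rangle$. By using the definition of anti-Wick quantization we easily rewrite
\[ G_\beta(x,z_1;z_2) = \langle x,z_1,\kappa|\mathrm{Op}^{AW}_{\kappa,z_2}(\chi_\beta)|x,z_1,\kappa\rangle - \chi_\beta(x)\,\langle 0,z_2,\kappa|\mathrm{Op}^{AW}_{\kappa,z_1}(\chi_\beta^*)|0,z_2,\kappa\rangle. \]
By subtracting the same quantity calculated on the plane, which is zero, and by using Lemma 5 we finally have
\[ G_\beta(x,z_1;z_2) = \langle\tfrac{\beta}{N},z_2|0,z_2\rangle\left[\langle x,z_1,\kappa|\mathrm{Op}^W_\kappa(\chi_\beta)|x,z_1,\kappa\rangle - \langle x,z_1|\mathrm{Op}^W(\chi_\beta)|x,z_1\rangle\right] - \chi_\beta(x)\,\langle 0,z_1|\tfrac{\beta}{N},z_1\rangle\left[\langle 0,z_2,\kappa|\mathrm{Op}^W_\kappa(\chi_\beta^*)|0,z_2,\kappa\rangle - \langle 0,z_2|\mathrm{Op}^W(\chi_\beta^*)|0,z_2\rangle\right]. \]
After some simple calculation we can finally write ($\beta_t = A^{-t}\beta$)
\[ \int_{\mathbb{T}^2} dy\, G(x_0,z;y,A^{-t}\cdot z)\,(f\circ A^t)(y) = \sum_\beta f_\beta\,\langle\tfrac{\beta}{N},z|0,z\rangle\left[\langle x_0,z,\kappa|\mathrm{Op}^W_\kappa(\chi_{\beta_t})|x_0,z,\kappa\rangle - \langle x_0,z|\mathrm{Op}^W(\chi_{\beta_t})|x_0,z\rangle\right] \]
\[ - \sum_\beta f_\beta\,\chi_{\beta_t}(x_0)\,\langle 0,z|\tfrac{\beta_t}{N},z\rangle\left[\langle 0,z,\kappa|\mathrm{Op}^W_\kappa(\chi_\beta^*)|0,z,\kappa\rangle - \langle 0,z|\mathrm{Op}^W(\chi_\beta^*)|0,z\rangle\right]. \]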
The result then follows by making use of Proposition 9. □

3.2. The period of the quantum map. It is natural to wonder if (4) holds beyond $\frac{\ln N}{\gamma}$ and in particular if it holds possibly for times that are polynomial in $N$. We shall show (see (34)–(35)) with an example that the mixing regime may break down on a time scale which is only logarithmic in $N$, thereby showing that on this scale the agreement between the classical and quantum or semi-classical evolutions breaks down. The intuitive “proof” of (4) given in the introduction, which is based on (5), does not predict any breakdown of (4) at long times. It is possible to get an intuitive understanding of the breakdown of the mixing regime using the uncertainty principle as follows. First remark that, given any $\psi \in H_N(\kappa)$, the support of its Wigner distribution has, due to the uncertainty principle, necessarily a linear size of the order of at least $\hbar$ in all
directions: indeed, from $\Delta X\,\Delta P \ge \hbar$ and $\Delta X, \Delta P \le 1$, one concludes $\Delta X, \Delta P \ge \hbar$. Consider now $M(A)^t|x,z,\kappa\rangle$; we expect its Wigner distribution to have a spread of $\sqrt{\hbar}\exp\gamma t$ along the unstable direction and therefore to wrap around the torus as soon as $\sqrt{\hbar}\exp\gamma t \gg 1$. The transversal distance between the successive windings so obtained can be estimated by $\sqrt{\hbar}^{\,-1}\exp(-\gamma t)$. When this distance becomes less than $\hbar$, the support of the Wigner distribution of $M(A)^t|x,z,\kappa\rangle$ can no longer resolve the separate windings, because of the previous remark, and as a result one expects that the classical evolution picture may break down at such times, given by $\sqrt{\hbar}^{\,-1}\exp(-\gamma t) \sim \hbar$ or $t \sim \frac{3}{2}\frac{|\ln\hbar|}{\gamma}$.

We shall now exhibit precisely such a phenomenon for the quantized cat maps. For that purpose we consider $A$ such that either $a_{12} = 1$ or $a_{21} = 1$ and $[a_{11}]_2 = [a_{22}]_2 = 0$ (we use the notation $[x]_n = x \bmod n$, both for numbers and matrices). We know that $\kappa = 0$ gives an admissible quantization; furthermore it was shown in [HB] that the corresponding quantum map $M_0(A)$ is periodic with period
\[ n(N) = \min\{\, t \mid M_0(A)^t = e^{i\phi_N},\ \phi_N \in \mathbb{R} \,\}. \]
These periods have been studied in [Ke], where it is argued that they behave “on average” linearly in $N$, but with great fluctuations about this average. It is of course clear that the “mixing regime” must break down before the period. In the following we will show that there exists a sequence $N_{2k+1}$ of values of $N$ for which the period is extremely short: $n(N_{2k+1}) \sim 2\frac{\ln N_{2k+1}}{\gamma}$, leading to the announced result.

We will need the following simple formulas for $A$. If we call $\lambda$ the biggest eigenvalue of $A$, we know that for each $t \in \mathbb{N}^+$,
\[ A^t = p_t A - p_{t-1}, \qquad p_{t+1} = \mathrm{Tr}(A)\,p_t - p_{t-1}, \qquad \text{where } p_t = \frac{\lambda^t - \lambda^{-t}}{\lambda - \lambda^{-1}}. \tag{31} \]
We also introduce $T_N = \min\{\, t \mid [A^t]_N = 1 \,\}$. We denote by $A_N$ the matrix with integer entries such that $A^{T_N} = 1 + N A_N$. Then, following [HB], $n(N) = T_N$ if $N$ is odd or if $N$ is even and $[(A_N)_{12}]_2 = [(A_N)_{21}]_2 = 0$; otherwise $n(N) = 2T_N$. We finally define for $k \in \mathbb{N}^+$,
\[ N_k \equiv \max\{\, N \mid [A^k]_N = 1 \,\}, \tag{32} \]
and we prove the following result.

Proposition 11. For each $k \in \mathbb{N}^+$ we have that $T_{N_k} = k$ and
\[ N_{2k} = 2p_k, \qquad N_{2k+1} = p_k + p_{k+1}. \tag{33} \]
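Proposition 11 can be checked by direct computation for a specific matrix. The sketch below is an illustration only: the matrix $A = \begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}$ is an assumed example satisfying the hypotheses ($a_{12} = 1$, even diagonal entries, determinant 1, trace 4), and the helper names `mat_mul`, `p_sequence`, `N_k` and `period_mod` are hypothetical. It uses the observation that the largest $N$ with $[A^k]_N = 1$ is the greatest common divisor of the entries of $A^k - 1$.

```python
from math import gcd


def mat_mul(X, Y, mod=None):
    """Multiply two 2x2 integer matrices, optionally reducing entries mod `mod`."""
    Z = [[X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]],
         [X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]]]
    if mod is not None:
        Z = [[entry % mod for entry in row] for row in Z]
    return Z


def p_sequence(trace, t_max):
    """p_0 = 0, p_1 = 1, p_{t+1} = Tr(A) p_t - p_{t-1}, as in formula (31)."""
    p = [0, 1]
    while len(p) <= t_max:
        p.append(trace * p[-1] - p[-2])
    return p


def N_k(A, k):
    """Largest N with A^k = 1 (mod N): the gcd of the entries of A^k - 1."""
    P = [[1, 0], [0, 1]]
    for _ in range(k):
        P = mat_mul(P, A)
    return gcd(gcd(P[0][0] - 1, P[1][1] - 1), gcd(P[0][1], P[1][0]))


def period_mod(A, N):
    """T_N: the smallest t >= 1 with A^t = 1 (mod N)."""
    identity = [[1 % N, 0], [0, 1 % N]]
    P = [[entry % N for entry in row] for row in A]
    t = 1
    while P != identity:
        P = mat_mul(P, A, N)
        t += 1
    return t


# Assumed example matrix: a12 = 1, a11 and a22 even, det = 1, Tr = 4.
A = [[2, 1], [3, 2]]
p = p_sequence(A[0][0] + A[1][1], 12)

for k in range(1, 5):
    print(k, N_k(A, 2 * k), 2 * p[k], N_k(A, 2 * k + 1), p[k] + p[k + 1])
```

For this matrix the printed columns agree pairwise, and `period_mod(A, N_k(A, k))` returns `k` for the small values tested, in line with $T_{N_k} = k$.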
Proof. Using (31), we see that $N_k$ is the greatest integer such that
\[ [p_k a_{11} - p_{k-1} - 1]_{N_k} = 0, \quad [p_k a_{22} - p_{k-1} - 1]_{N_k} = 0, \quad [p_k a_{12}]_{N_k} = 0, \quad [p_k a_{21}]_{N_k} = 0, \]
or, because of the hypothesis about the off-diagonal terms of $A$, $[p_k]_{N_k} = 0$, $[p_{k-1}]_{N_k} = -1$. This means that $N_k$ is the greatest common divisor of $p_k$ and $p_{k-1} + 1$, i.e. $N_k = (p_k, p_{k-1} + 1)$. We are going to show by induction that, for each $s = 0, \ldots, k-1$,
\[ N_k = (p_{k-s} + p_s,\; p_{k-(s+1)} + p_{s+1}). \]
Since $p_0 = 0$ and $p_1 = 1$, this is clearly true for $s = 0$. Supposing it is true for $s$, we have
\[ N_k = (\mathrm{Tr}A\; p_{k-s-1} - p_{k-s-2} + p_s,\; p_{k-s-1} + p_{s+1}) \]
\[ = (\mathrm{Tr}A\,(p_{k-s-1} + p_{s+1}) - \mathrm{Tr}A\; p_{s+1} - p_{k-s-2} + p_s,\; p_{k-s-1} + p_{s+1}) \]
\[ = (p_{s+2} + p_{k-(s+2)},\; p_{k-(s+1)} + p_{s+1}), \]
so that it is true for $s+1$. In the third line we used the identity $(a, ca - b) = (a, b)$, valid for all $a, b, c$, and formula (31). If $k = 2\ell$, then, setting $s = \ell$ in the above formula, we have
\[ N_{2\ell} = (2p_\ell,\; p_{\ell-1} + p_{\ell+1}) = (2p_\ell,\; \mathrm{Tr}A\; p_\ell) = 2p_\ell, \]
because $\mathrm{Tr}A$ is even by hypothesis. If $k = 2\ell + 1$, then $N_{2\ell+1} = p_\ell + p_{\ell+1}$. For each $k$ we have that $[A^k]_{N_k} = [A^{T_{N_k}}]_{N_k} = [A^{T_{N_k}}]_{N_{T_{N_k}}} = 1$. From the definition of the period we have that $T_{N_k} \le k$ and from the definition of $N_k$ it follows that $N_k \le N_{T_{N_k}}$. Since the sequence $\{N_k\}$ is increasing (see (31)–(33)) we conclude that $T_{N_k} \ge k$ and hence that $T_{N_k} = k$. □

Since $[\mathrm{Tr}A]_2 = 0$, we have that $[p_{2k}]_2 = 0$ and $[p_{2k+1}]_2 = 1$, so that $[N_{2k+1}]_2 = 1$. Using the results of Proposition 11, it then follows that $2k+1$ is the quantum period for $N_{2k+1} = p_k + p_{k+1}$, i.e. $n(N_{2k+1}) \approx 2\frac{\ln N_{2k+1}}{\gamma}$. Keeping in mind that $M(A)^{-1} = M(A^{-1})$, so that $M(A)^t = M(A)^{t - n(N_{2k+1})} = M(A^{-1})^{(n(N_{2k+1}) - t)}$, $(0 \le t \le n(N_{2k+1}))$, we can apply (3) and (4) to $A^{-1}$ to conclude that if we perform the limits running only over $N_{2k+1}$, we have
\[ \lim_{\substack{k\to\infty \\ \frac{\ln N_{2k+1}}{2\gamma} \ll t \ll \frac{3}{2}\frac{\ln N_{2k+1}}{\gamma}}} O^{AW,0}_f[t, N_{2k+1}](x_0) = \int_{\mathbb{T}^2} f(x')\,dx', \tag{34} \]
\[ \lim_{\substack{k\to\infty \\ \frac{3}{2}\frac{\ln N_{2k+1}}{\gamma} \ll t \le 2\frac{\ln N_{2k+1}}{\gamma}}} \left| O^{AW,0}_f[t, N_{2k+1}](x_0) - f(A^{t - n(N_{2k+1})}x_0) \right| = 0. \tag{35} \]
Equation (35) clearly shows the breakdown of the mixing regime for times beyond $\frac{3}{2}\frac{\ln N}{\gamma}$, in the cases considered here. It is of course still possible that, “generically”, it remains valid for much longer times, as the classical intuition would predict.

3.3. Position eigenstates. In this subsection we study the Wigner and Husimi distributions of evolved position eigenstates by studying the matrix elements $\langle e^\kappa_j, M(A)^{-t}\,\mathrm{Op}^W_\kappa(f)\,M(A)^t e^\kappa_j\rangle$ and $\langle e^\kappa_j, M(A)^{-t}\,\mathrm{Op}^{AW}_{\kappa,z}(f)\,M(A)^t e^\kappa_j\rangle$ in the limits $t \to \infty$ and $N \to \infty$. If we applied the heuristic argument of the introduction to this case, we would conclude that the mixing regime should set in no later than at times of order $\frac{\ln N}{2\gamma}$. As a matter of fact, it sets in much sooner, as Proposition 13 shows. We need the following preparatory result.

Lemma 12. For each $f \in \check{\ell}_{1,k}$, there exists $C_f > 0$ such that for each $N, t \ge 0$, $j = 0, \ldots, N-1$ and $|J| < N/2$ we have that
\[ \left| \langle e^\kappa_{j+J}, M(A)^{-t}\,\mathrm{Op}^W_\kappa(f)\,M(A)^t e^\kappa_j\rangle - \int_0^1 dp\,(f\circ A^t)\Big(\frac{\kappa_2/\pi + 2j + J}{2N},\, p\Big)\, e^{i2\pi Jp} \right| \le C_f\left(\frac{\|A^{-t}\|}{N}\right)^k. \tag{36} \]
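Proof. Using the Egorov theorem and (14) we find that
\[ \langle e^\kappa_{j+J}, M(A)^{-t}\,\mathrm{Op}^W_\kappa(f)\,M(A)^t e^\kappa_j\rangle = \sum_n f_{A^t n}\, e^{i\pi\frac{n_1 n_2}{N}}\, e^{i(\kappa_2 + 2\pi j)\frac{n_2}{N}}\, \langle e^\kappa_{j+J}, e^\kappa_{j+n_1}\rangle \]
\[ = \sum_n f_{A^t(J + n_1 N,\, n_2)}\, e^{-i\kappa_1 n_1}\, e^{i\frac{\pi}{N} n_2 (J + N n_1)}\, e^{i(\kappa_2 + 2\pi j)\frac{n_2}{N}}. \]
It is clear that the integral in (36) is obtained by setting $n_1 = 0$. The remaining terms are easily estimated:
\[ \sum_{n:\, n_1 \ne 0} |f_{A^t(J + N n_1,\, n_2)}| \le \sum_{\|n\| \ge \frac{N}{2}} |f_{A^t n}| \le \sum_n \left(\frac{2\|n\|}{N}\right)^k |f_{A^t n}| \le \left(\frac{2\|A^{-t}\|}{N}\right)^k \sum_n \|n\|^k\,|f_n|. \quad\square \]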
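Proposition 13. For each $f \in \check{\ell}_{1,k}$, there exist $C_{f,1}, C_{f,2}$ such that, for all $t, N$, $j = 0, \ldots, N-1$, one has
\[ \left| \langle e^\kappa_j, M(A)^{-t}\,\mathrm{Op}^W_\kappa(f)\,M(A)^t e^\kappa_j\rangle - \int_{\mathbb{T}^2} dx\, f \right| \le C_{f,1}\left(\frac{\|A^{-t}\|}{N}\right)^k + \frac{C_{f,2}}{\lambda^{tk}}. \]

Proof. From (36) it is clear that it will be enough to show that there exists $C_{f,2}$ such that, for all $0 \le q \le 1$,
\[ \left| \int_0^1 (f\circ A^t)(q,p)\,dp - \int_{\mathbb{T}^2} f(x)\,dx \right| \le \frac{C_{f,2}}{\lambda^{tk}}. \]
Indeed,
\[ \left| \int_0^1 (f\circ A^t)(q,p)\,dp - \int_{\mathbb{T}^2} f(x)\,dx \right| \le \sum_{n \ne 0} |f_{A^t(0,n)}| \le \sum_{\|m\| \ge \lambda^t|\cos\theta_-|} |f_m| \le \frac{1}{\lambda^{tk}\,|\cos\theta_-|^k}\sum_m |f_m|\,\|m\|^k, \]
where we used the inequality $\|A^t(0,n)\| \ge |\langle v_-, A^t(0,n)\rangle| \ge \lambda^t|\cos\theta_-|$, valid for each $n \ne 0$. □

It is clear that we actually have, for any sequence $\theta_N \to \infty$, lim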
N →∞
θN 0, 0 < < 1 and 0 < α < 4 be given and let MN be as above. Then √ there exists a subset MN of the index set 1 ≤ i1 , i2 ≤ MN with the following properties: MN ≤ N12α MNN ; (a) 0 ≤ 1 − ]M N ∞ (b) For all f ∈ C (T), and for all k ∈ N, there exists a constant Cf,k so that, for all 0 ≤ t < (1 − ) log2 N and for all j ∈ MN , (N )
(N )
(N )
(N )
|hxj , i|VB−t Op0W (f )VBt |xj , ii − hxj , i|Op0W (f ◦ B t )|xj , ii| ≤ 1 + N α− 4 + N −k 2 ; Cf,k √ MN (ii) For each f ∈ C ∞ (T) there exist Cf1 , Cf2 , Cf3 such that ∀x0 ∈ T2 , N, t > 0, Z |
T2
dy gxsc0 ,z (y, t)f (y) −
Z T2
dy f (y)| ≤ Cf1
√ N 2t + C + Cf3 e−π Nβ− (z)/4 . f 2 2t N
The second part of this proposition√says that the semi-classically evolved Husimi distribution equidistributes provided N 0 such that for all 0 ≤ t < (1 − ) log2 N and for each j ∈ MN , 2t j 1 1 + )| ≤ Cf,k ; |hej , VB−t Op0W (f )VBt ej i − f ( N N /4−α N k/2
(ii) There exists C > 0 s.t., for 0 ≤ j ≤ N − 1, f ∈ C ∞ (T), t, N > 0 we have
0 Z
f 1 2t j 2t (j + 1) sc gj (x, t)f (x) dx − [f ( ) + f( )]| ≤ C √ ∞ . | 2 N N N T2 The first part of this result is again readily understood in terms of evolution of the support of the Wigner function of the initial state ej , which is the vertical strip at j/N , of width 1/N. The dynamics contracts this strip vertically, stretches it horizontally to size 2t /N, and centers it at 2t j/N . As a result, one does not expect mixing to set in before times of order log2 N , i.e. when 2t /N ∼ 1; this is indeed confirmed by the above result. Comparing furthermore the first part of the proposition to the second, one concludes again that the semi-classical evolution can not be distinguished from the quantum-mechanical one up to times 2t N. 4.2. Proof of Propositions 14–15 and of Theorem 2. To prove part (i) of Proposition 15 as well as of Proposition 14, we first of all need the following Egorov theorem, proven in [DBDE]. Let’s define, for each t > 0 (compare this to (25)), Et (f ) = VB−t Op0W (f )VBt − Op0W (f ◦ B t ), and Et (m) = Et (χm ). Proposition 16 ([DBDE]). Let ηN > 0 and tN > 0 such that 2tN ηN < N . Then there exists a subspace GηN (tN ) ⊂ HN such that (i) dim GηN (tN ) ≥ N − 2tN ηN ; (ii) for each ψ ∈ GηN (tN ), |n| < ηN and 0 ≤ t ≤ tN Et (0, n) ψ = 0. The proposition asserts roughly that, provided one looks at times shorter than log2 N , and provided one restricts one’s attention to a “good” subspace GηN , there is no error in the Egorov theorem for trigonometric polynomials of degree at most ηN . The good subspace gets smaller as the time gets larger, but is non-trivial on the time-scale considered. Two obvious weaknesses of the above result are that it does not describe the good space explicitly and that it deals only with functions of q. These are at the origin of the limitations of Theorem 2 pointed out in the introduction. The following corollary is an easy consequence of Proposition 16. 
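We write $P_{B_{\eta_N}}$ for the projector onto $B_{\eta_N}$, the orthogonal complement of $G_{\eta_N}$.

Corollary 17. Let $\{\phi_j\}_{j=1}^{M_N}$ be a family of orthonormal vectors in $H_N$. Let $\eta_N, t_N$ be as in Proposition 16, $\delta_N > 0$ and
\[ \mathcal{M}_N = \{\, 1 \le j \le M_N \mid \langle\phi_j|P_{B_{\eta_N}}|\phi_j\rangle < \delta_N^2 \,\}. \]
Then
(i) $\#\mathcal{M}_N \ge M_N - 2^{t_N}\eta_N/\delta_N^2$;
(ii) for each $f \in C^\infty(\mathbb{T})$ and $k \in \mathbb{N}$, there exists $C_{f,k} > 0$ such that, for each $0 \le t \le t_N$ and $j \in \mathcal{M}_N$,
\[ \| E_t(f)\phi_j \| \le C_{f,k}\,(\delta_N + \eta_N^{-k}). \]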
We will always choose things in such a way that $2^{t_N}\eta_N/(\delta_N^2 M_N) \to 0$, $\delta_N \to 0$, $\eta_N \to \infty$, so that the corollary asserts that the error is “small” on “many” $\phi_j$.
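Proof. (i) From Proposition 16 we have that $\dim B_{\eta_N} \le 2^{t_N}\eta_N$ and
\[ \dim B_{\eta_N} = \mathrm{Tr}\,P_{B_{\eta_N}} \ge \sum_{j \notin \mathcal{M}_N} \langle\phi_j|P_{B_{\eta_N}}|\phi_j\rangle \ge (M_N - \#\mathcal{M}_N)\,\delta_N^2, \]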
from which the statement follows. P (ii) Let j ∈ MN and f = n fn χ0n ∈ C ∞ (T). Then X X
Et (f )φj ≤ |fn | Et (0, n)φj + |fn | Et (0, n)φj . |n| 1. However, in general there is no explicit formula for the density function, and much recent interest has focused on approximating the density [HuntB,Froy1,Froy2,HuntF,K-M-Y,M], with particular emphasis on Ulam’s method [Ulam]. In this note we will present an alternative approach. We shall only consider the case where T is a real analytic Markov map. In this case a well-known approach to finding the a.c.i.m. µ is given by the weighted distribution of periodic orbits. More precisely, the sequence of atomic T -invariant probability measures P M 0 x∈Fix(M) δx /|(T ) (x)| , M≥1 (0.1) mM = P M 0 x∈Fix(M) 1/|(T ) (x)|
M. Pollicott, O. Jenkinson
(where the summations are over the set $\mathrm{Fix}(M) = \{x \in I : T^M x = x\}$ of period-$M$ points) converges to $\mu$ in the weak-star topology (see [K-H, p. 635]). That is, for each $k \in \mathbb{Z}$,
\[ \int_0^1 e^{2\pi ikx}\,dm_M(x) \to \hat{\mu}(k) := \int_0^1 e^{2\pi ikx}\,d\mu(x). \]
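The weighted periodic-orbit measures $m_M$ of (0.1) can be evaluated numerically for a concrete map. The following is a rough sketch under stated assumptions, not the method of this note: it uses the circle map $T_\varepsilon(x) = 2x + \varepsilon\sin 2\pi x \pmod 1$ with $|\varepsilon| < 1/(2\pi)$, locates the points of $\mathrm{Fix}(M)$ in $[0,1)$ by bisection on the lift, and integrates a test function against $m_M$. All function names are hypothetical.

```python
import math


def lift(x, eps):
    """Lift F(x) = 2x + eps*sin(2*pi*x) of the map T_eps to the real line."""
    return 2 * x + eps * math.sin(2 * math.pi * x)


def lift_deriv(x, eps):
    """Derivative F'(x); it is 1-periodic, so it also gives T_eps'(x mod 1)."""
    return 2 + 2 * math.pi * eps * math.cos(2 * math.pi * x)


def iterate_lift(x, eps, M):
    for _ in range(M):
        x = lift(x, eps)
    return x


def periodic_points(eps, M, tol=1e-13):
    """Solve F^M(x) - x = j for j = 0, ..., 2^M - 2 by bisection on [0, 1].
    Assumes |eps| < 1/(2*pi), so that F^M(x) - x is strictly increasing."""
    points = []
    for j in range(2 ** M - 1):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if iterate_lift(mid, eps, M) - mid < j:
                lo = mid
            else:
                hi = mid
        points.append(0.5 * (lo + hi))
    return points


def orbit_measure_integral(g, eps, M):
    """Integral of g against m_M: weights 1/|(T^M)'(x)| over Fix(M), normalised."""
    numerator = denominator = 0.0
    for x in periodic_points(eps, M):
        deriv, y = 1.0, x
        for _ in range(M):
            deriv *= lift_deriv(y, eps)
            y = lift(y, eps)
        weight = 1.0 / abs(deriv)
        numerator += weight * g(x)
        denominator += weight
    return numerator / denominator


def entropy_estimate(eps, M):
    """Approximate the integral of log|T'| using m_M in place of mu."""
    return orbit_measure_integral(lambda x: math.log(abs(lift_deriv(x, eps))), eps, M)


print(entropy_estimate(0.05, 6))
```

For $\varepsilon = 0$ the map is the doubling map, every weight is equal, and the estimate of $\int \log|T'|$ reduces to $\log 2$, which gives a quick sanity check of the construction.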
This convergence can be shown to be exponentially fast (see Sect. 4), the rate being determined by the second eigenvalue of the Perron-Frobenius operator $L_T$ (defined in Sect. 1). In this note we will present a rather more efficient method of computing these Fourier coefficients $\hat{\mu}(k)$. We achieve this by a more elaborate regrouping of the periodic points to define new invariant signed probability measures $\mu_M$ by
\[ \mu_M = \frac{1}{N_M}\sum_{\substack{(n_1,\ldots,n_m) \\ n_1+\ldots+n_m \le M}} \frac{(-1)^m}{m!}\sum_{i=1}^m \Bigg(\sum_{x\in\mathrm{Fix}(n_i)} \frac{\delta_x}{|(T^{n_i})'(x) - 1|}\Bigg)\prod_{\substack{j=1 \\ j\ne i}}^m \Bigg(\sum_{z\in\mathrm{Fix}(n_j)} r(z,n_j)\Bigg), \]
where we denote
\[ r(x,n) = \frac{1}{n\,|(T^n)'(x) - 1|}, \]
and the normalisation constant $N_M$ is simply
\[ N_M = \sum_{\substack{(n_1,\ldots,n_m) \\ n_1+\ldots+n_m \le M}} \frac{(-1)^m}{m!}\sum_{i=1}^m \Bigg(\sum_{x\in\mathrm{Fix}(n_i)} \frac{1}{|(T^{n_i})'(x) - 1|}\Bigg)\prod_{\substack{j=1 \\ j\ne i}}^m \Bigg(\sum_{z\in\mathrm{Fix}(n_j)} r(z,n_j)\Bigg). \]
Note in particular that the measures $\mu_M$ are supported on those periodic points of period at most $M$. The advantage of these more involved definitions is that in the case when $T$ is $C^\omega$ we have the following superexponential convergence of the Fourier coefficients.

Theorem 1. Suppose that $T : I \to I$ is a piecewise $C^\omega$ expanding Markov map of the interval, with absolutely continuous invariant probability measure $\mu$. There is a sequence of $T$-invariant signed probability measures $\mu_M$, supported on the points of period at most $M$, and $0 < \theta < 1$, such that for each $k \in \mathbb{Z}$, there exists $C > 0$ with
\[ \left| \int_0^1 e^{2\pi kix}\,d\mu_M(x) - \hat{\mu}(k) \right| \le C\theta^{M^2}. \]
More generally, for any real analytic function $g : I \to \mathbb{R}$, we can similarly see that $\left|\int_0^1 g(x)\,d\mu_M(x) - \int_0^1 g(x)\,d\mu(x)\right| \le C\theta^{M^2}$. Of particular interest is the choice $g(x) = \log|T'(x)|$, since by the Rohlin–Pesin equality [Ro,Pe] we can identify the metric entropy $h_\mu(T)$ with such an integral,
\[ h_\mu(T) = \int \log|T'(x)|\,d\mu(x). \tag{0.2} \]
The ergodic theorem means that $h_\mu(T)$ equals the Lyapunov exponent
\[ \lim_{n\to+\infty} \frac{1}{n}\log|(T^n)'(x)| \]
for Lebesgue almost all $x$. Combining identity (0.2) with the method of proof of Theorem 1, we also prove
Computing Invariant Densities and Metric Entropy
Theorem 2. Suppose that $T : I \to I$ is a piecewise $C^\omega$ expanding Markov map of the interval, with absolutely continuous invariant probability measure $\mu$. There is a sequence of $T$-invariant signed probability measures $\mu_M$, supported on the points of period at most $M$, and constants $0 < \theta < 1$, $C > 0$, such that $\left|h_\mu(T) - \int \log|T'|\,d\mu_M\right| \le C\theta^{M^2}$.

We organise the article as follows. In Sect. 1 we collect together a few easy results on expanding piecewise-analytic Markov interval maps, and consider analytically parametrised families of such maps. In Sect. 2 we show that the measures $\mu_M$ converge to the a.c.i.m. $\mu$. In Sect. 3 we consider the speed of convergence, and prove our theorems. In Sect. 4 we use our methods to estimate the entropy of the a.c.i.m. for two families of examples.

1. Some Simple Properties of the Density and Metric Entropy

Consider a piecewise $C^\omega$ expanding Markov interval map $T : I \to I$. By this we mean
(a) there exists a partition $0 = a_0 < a_1 < \ldots < a_p = 1$ such that $T$ is real-analytic on each $[a_i, a_{i+1}]$ (i.e., extends to a holomorphic map on some complex neighbourhood of $[a_i, a_{i+1}]$),
(b) there exists $\kappa > 1$ such that $|T'(x)| \ge \kappa$ for all $x \in I$.
We shall also assume for convenience that $T[a_i, a_{i+1}] = I$ for all $i$. This condition is not essential, but eases the exposition in the sequel. In particular it implies $T$ is topologically mixing. A more general definition of a Markov map would allow each image $T[a_i, a_{i+1}]$ to be some union $\cup_{j\in J}[a_j, a_{j+1}]$ of pieces in the partition. Under this hypothesis, the invariant density $\rho$ in Lemma 1 would only be $C^\omega$ on each piece of the partition. Moreover, the various function spaces we consider would be defined relative to the partition, and would consist of functions with some prescribed regularity on each piece of the partition.

Let $\mu$ denote the unique ergodic a.c.i.m. [C-E] and let $\rho \in L^1(I)$ denote the associated density (or Radon-Nikodym derivative), i.e., $d\mu(x) = \rho(x)\,d\lambda(x)$, where $\lambda$ is Lebesgue measure. In general it is not possible to find an explicit expression for the density $\rho$. We can define the Perron–Frobenius operator $L_T : C^0(I) \to C^0(I)$ by
\[ L_T k(x) = \sum_{Ty=x} \frac{k(y)}{|T'(y)|}, \qquad k \in C^0(I). \]
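To make the definition of the Perron–Frobenius operator concrete, here is a minimal numerical sketch. It assumes the two-branch map $T_\varepsilon(x) = 2x + \varepsilon\sin 2\pi x \pmod 1$ with $|\varepsilon| < 1/\pi$ (so that the lift is strictly increasing), finds the two preimages of $x$ by bisection, and sums $k(y)/|T'(y)|$ over them. The names `preimages` and `transfer` are hypothetical.

```python
import math


def preimages(x, eps, tol=1e-14):
    """Return the two solutions y in [0, 1] of 2y + eps*sin(2*pi*y) = x + j, j = 0, 1.
    Assumes |eps| < 1/pi, so the left-hand side is strictly increasing in y."""
    solutions = []
    for j in (0, 1):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if 2 * mid + eps * math.sin(2 * math.pi * mid) < x + j:
                lo = mid
            else:
                hi = mid
        solutions.append(0.5 * (lo + hi))
    return solutions


def transfer(k, x, eps):
    """Evaluate (L_T k)(x) = sum over T y = x of k(y) / |T'(y)|."""
    total = 0.0
    for y in preimages(x, eps):
        total += k(y) / abs(2 + 2 * math.pi * eps * math.cos(2 * math.pi * y))
    return total


# For eps = 0 (the doubling map) the constant function is fixed by L_T,
# consistent with Lebesgue measure being invariant in that case.
print(transfer(lambda y: 1.0, 0.3, 0.0))
```

A useful check on any such implementation is that $L_T$ preserves integrals, $\int_0^1 L_T k\,d\lambda = \int_0^1 k\,d\lambda$, which can be tested with a simple quadrature rule.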
Recall that the space of real-analytic functions C ω (I ) is a Fréchet space. If we fix some open complex neighbourhood U of I , then the space of bounded holomorphic functions on U is a Banach space with respect to the uniform norm. Lemma 1. The density ρ is a real-analytic function, and satisfies Lρ = ρ. This is a standard result following from the change of variable formula (cf. [K-H, p. 186]), the latter observation following immediately from the fact that L preserves C ω (I ), i.e., LC ω (I ) ⊂ C ω (I ). Lemma 2. Suppose the metric entropy hµ (T ) is equal to the topological entropy h(T ). Then the map T is C ω conjugate to a piecewise linear map whose slope on each partition piece is ±p, where p is the number of pieces in the Markov partition.
A simple calculation with Radon-Nikodym derivatives [K-H, p. 187] gives
\[ \frac{d\mu T}{d\mu}(x) = \frac{d(\mu T)}{d(\lambda T)}\,\frac{d(\lambda T)}{d\lambda}\,\frac{d\lambda}{d\mu} = \rho(Tx)\,|T'(x)|\,\rho(x)^{-1}. \tag{1.1} \]
By $T$-invariance of $\mu$ the jacobian $d\mu T/d\mu$ must satisfy
\[ \sum_{Ty=x} \left(\frac{d\mu T}{d\mu}(y)\right)^{-1} = 1, \quad \text{a.e.} \tag{1.2} \]
If $h_\mu(T) = h(T)$, then $\mu$ is the unique measure of maximal entropy, so (see [Wal1]) $-\log|T'|$ is an essential coboundary. Thus $-\log\frac{d\mu T}{d\mu}$ is also an essential coboundary (i.e. $-\log\frac{d\mu T}{d\mu} = vT - v + c$ for some real $c$ and bounded measurable function $v$). But then (1.2) implies $\frac{d\mu T}{d\mu}$ must itself be constant, and that this constant is precisely $p$. Therefore, if we assume that 0 is a fixed point, for convenience, then by (1.1) we deduce $T$ is conjugate to the required piecewise linear map by the conjugacy $h(x) = \int_0^x \rho(y)\,dy$.
Lemma 3. Let Tε (−a < ε < a) be a family of C ω expanding Markov maps, which vary analytically with the parameter ε, and with a.c.i.m. µε . Then the associated metric entropies h(µε ) also vary analytically. The invariant density ρε for Tε satisfies LTε ρε = ρε (by Lemma 1). Moreover, the maximal eigenvalue 1 for LTε is simple and isolated. It follows from standard analytic perturbation that the map ε 7→ ρε ∈ C ω (I ) is C ω [Kato]. By (0.2) we can write R theory 0 h(µε ) = log |Tε |ρε (x)dx, from which Lemma 3 immediately follows. Let us now consider the specific family of maps Tε : [0, 1] → [0, 1] defined by Tε (x) = 2x + ε sin 2π x (mod 1), for −
1 1 1 so that (ν1 + ν2 )({x0 }) > 0 and (ν1 + ν2 )({x1 }) > 0. However the period-two point x01 is given zero weight by ν1 , and negative weight by ν2 , so that (ν1 + ν2 )({x01 }) < 0 (the same is also true for x10 ). 3. Speed of Approximation So far in deriving our formulae, we have assumed that F (z, t) is an entire function of both z and t. We now justify this. First recall (see [Gr]) that a bounded linear operator L : B → B of a Banach space B over C isP called nuclear if there exist normalised ui ∈ B, normalised li ∈ B ∗ , and λi ∈ C with i |λi | < ∞ such that Lw =
\[ \sum_{i=1}^{\infty} \lambda_i\, l_i(w)\, u_i. \]
Lemma 7. F (z, t) is an entire function of both z and t. Proof. For each piece [ai , ai+1 ] of the Markov partition we let Ti : I → [ai , ai+1 ] denote the corresponding inverse branch of T (i.e. Ti ◦ T = id on [ai , ai+1 ]), and zi ∈ [ai , ai+1 ] the fixed point of Ti . For i = (i1 , . . . , in ), let Ti denote the composition Ti1 ◦ . . . ◦ Tin , and write |i| = n. For each such i, let zi ∈ I denote the unique fixed point of Ti , and let Ii denote the image Ti I . Let us choose n sufficiently large so that, for each i with |i| = n, there is an open complex disc Di ⊃ Ii of radius di , centered at zi , such that
(a) for each $1 \le j \le p$, $T_j$ extends holomorphically to the union $D^{(n)} := \cup_{|\mathbf{i}|=n} D_{\mathbf{i}}$, with $T_j(D^{(n)})$ a proper subset of $D_j$,
(b) $e^{tg}/|T'|$ extends to a holomorphic function on each $D_{\mathbf{i}}$, and hence to a holomorphic function (which we denote by $\phi$) on the disjoint union $D_n := \coprod_{|\mathbf{i}|=n} D_{\mathbf{i}}$.
Let $B$ denote the space of functions which are holomorphic on the disjoint union $D_n$ and continuous on $\overline{D_n}$. Equipped with the uniform norm, $B$ is a Banach space. The operator $L_{-\log|T'|+tg}$ is well-defined on $B$, and takes the form
\[ L_{-\log|T'|+tg}\,w(z) = \sum_{j=1}^p \phi(T_j(z))\,w(T_j(z)). \]
This operator is then nuclear (see [Ru1, p. 236]), and in particular the trace
\[ \mathrm{trace}(L^m_{-\log|T'|+tg}) = \sum_{i=1}^\infty e_i^m < \infty \]
is well-defined for each $m$, where the $e_i$ are the eigenvalues of $L_{-\log|T'|+tg}$ counted with multiplicity. By Fredholm theory [Gr] we have the trace formulae
\[ \mathrm{trace}(L^m_{-\log|T'|+tg}) = \sum_{x\in\mathrm{Fix}(m)} \frac{e^{t g^m(x)}}{|1 - (T^m)'(x)|}, \]
and we can identify
\[ F(z,t) = \det(I - zL_{-\log|T'|+tg}) = \exp\left(-\sum_{m=1}^\infty \frac{z^m}{m}\,\mathrm{trace}(L^m_{-\log|T'|+tg})\right), \]
and this is an entire function of both $z$ and $t$ [Ru1]. □

The above lemma immediately implies the superexponential decay of the power series coefficients $C_N(t)$ of $F(z,t)$. We now give a more precise bound on the rate of decay.

Lemma 8. There exists $0 < \theta < 1$ such that, for any $t \in \mathbb{C}$, $C_N(t) = O(\theta^{N^2})$.
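The superexponential decay asserted in Lemma 8 can be seen explicitly in the simplest case. The sketch below rests on assumptions that go beyond the text: it takes the doubling map with $t = 0$, viewed as two full linear branches on $[0,1]$, and counts one fixed point per inverse-branch word, so that the trace formula gives $\mathrm{trace}(L^m) = 2^m/(2^m - 1)$. The coefficients $C_N$ of $F(z,0)$ then follow from the recursion $N C_N = -\sum_{m=1}^N \mathrm{trace}(L^m)\,C_{N-m}$, obtained by differentiating the exponential formula for $F$. They are compared with Euler's product expansion of $\prod_{j\ge 0}(1 - z2^{-j})$. The function names are hypothetical.

```python
from fractions import Fraction


def trace_power(m):
    """trace(L^m) for the doubling map at t = 0 (assumed setting):
    2^m fixed points of T^m, each with (T^m)' = 2^m."""
    return Fraction(2 ** m, 2 ** m - 1)


def determinant_coefficients(n_max):
    """Power series coefficients C_0, ..., C_{n_max} of
    F(z) = exp(-sum_m z^m/m * trace(L^m)), via n*C_n = -sum_{m=1}^n trace(L^m)*C_{n-m}."""
    C = [Fraction(1)]
    for n in range(1, n_max + 1):
        C.append(-sum(trace_power(m) * C[n - m] for m in range(1, n + 1)) / n)
    return C


def euler_coefficient(n):
    """Coefficient of z^n in prod_{j>=0} (1 - z*q^j) with q = 1/2 (Euler's identity):
    (-1)^n q^{n(n-1)/2} / ((1-q)(1-q^2)...(1-q^n))."""
    q = Fraction(1, 2)
    denominator = Fraction(1)
    for i in range(1, n + 1):
        denominator *= 1 - q ** i
    return (-1) ** n * q ** (n * (n - 1) // 2) / denominator


C = determinant_coefficients(8)
for n, c in enumerate(C):
    print(n, c, float(abs(c)))
```

In this example $|C_N|$ is comparable to $2^{-N(N-1)/2}$, which is the $O(\theta^{N^2})$ behaviour of Lemma 8 with $\theta$ close to $2^{-1/2}$.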
Proof. To ease the notation, let us write $L = L_{-\log|T'|+tg}$, so that
$$Lw(z) = \sum_{j=1}^{p}\phi(T_j z)\,w(T_j z) = \sum_{j=1}^{p} L_j w(z),$$
where $L_j w(z) := \phi(T_j z)\,w(T_j z)$. Suppose $n$ is chosen sufficiently large as in Lemma 7, and that our Banach space $B$ is as before. Let $\mathbf{i} = i_1\ldots i_n$, and let $\mathbf{i}' = i_1\ldots i_{n-1}$ be its length-$(n-1)$ prefix. We write $\phi_{\mathbf{i}} := \phi|_{D_{\mathbf{i}}}$. Observe that, for each $j$, the function $L_j w$ is defined by
$$(L_j w)_{\mathbf{i}} = w_{\mathbf{j}}\circ T_j \cdot \phi_{\mathbf{j}}\circ T_j, \quad\text{where } \mathbf{j} = j\,\mathbf{i}'. \tag{3.1}$$
Computing Invariant Densities and Metric Entropy
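Define $D_{\mathbf{i}'}$ to be the disjoint union $\coprod_{k=1}^{p} D_{\mathbf{i}'k}$, and note that $T_j(D_{\mathbf{i}'}) \subset D_{\mathbf{j}}$. Let $B_{\mathbf{j}}$ be the smallest disc, centered at $z_{\mathbf{j}}$, which contains $T_j(D_{\mathbf{i}'})$. Let $\beta_{\mathbf{j}} = \max_{x\in T_j(D_{\mathbf{i}'})}|z_{\mathbf{j}} - x|$ denote the radius of $B_{\mathbf{j}}$. Clearly we have $\beta_{\mathbf{j}} < d_{\mathbf{j}}$, so define the contraction ratio $\alpha_{\mathbf{j}} = \beta_{\mathbf{j}}/d_{\mathbf{j}} < 1$. Let $\Gamma_{\mathbf{j}}$ be a circle of radius $\gamma_{\mathbf{j}}$ centred at $z_{\mathbf{j}}$, and such that $\beta_{\mathbf{j}} < \gamma_{\mathbf{j}} < d_{\mathbf{j}}$. Clearly for any given $\varepsilon > 0$ we may choose $\Gamma_{\mathbf{j}}$ such that $\beta_{\mathbf{j}}/\gamma_{\mathbf{j}} < (1+\varepsilon)\alpha_{\mathbf{j}}$. Cauchy's formula means that, for $z\in D_{\mathbf{i}}\subset D_{\mathbf{i}'}$, we can express
$$(L_j w)_{\mathbf{i}}(z) = w_{\mathbf{j}}(T_j z)\,\phi_{\mathbf{j}}(T_j z) = \frac{1}{2\pi i}\int_{\Gamma_{\mathbf{j}}}\frac{w_{\mathbf{j}}(\xi)\,\phi_{\mathbf{j}}(\xi)}{\xi - T_j z}\,d\xi.$$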
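For $z\in D_{\mathbf{i}}$, $\xi\in\Gamma_{\mathbf{j}}$ we write
$$(\xi - T_j z)^{-1} = (\xi - z_{\mathbf{j}})^{-1}\Big(1 - \frac{T_j z - z_{\mathbf{j}}}{\xi - z_{\mathbf{j}}}\Big)^{-1} = (\xi - z_{\mathbf{j}})^{-1}\sum_{r=0}^{\infty}\Big(\frac{T_j z - z_{\mathbf{j}}}{\xi - z_{\mathbf{j}}}\Big)^{r}.$$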
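Note this expansion is valid because $|T_j z - z_{\mathbf{j}}| < |\xi - z_{\mathbf{j}}|$, since the circle $\Gamma_{\mathbf{j}}$ is outside the image $T_j(D_{\mathbf{i}'})$. Therefore, for $z\in D_{\mathbf{i}}\subset D_{\mathbf{i}'}$ we have
$$(L_j w)_{\mathbf{i}}(z) = \sum_{r=0}^{\infty}(T_j z - z_{\mathbf{j}})^{r}\,\frac{1}{2\pi i}\int_{\Gamma_{\mathbf{j}}}\frac{w_{\mathbf{j}}(\xi)\,\phi_{\mathbf{j}}(\xi)}{(\xi - z_{\mathbf{j}})^{r+1}}\,d\xi,$$
so that
$$(Lw)_{\mathbf{i}}(z) = \sum_{r=0}^{\infty}\sum_{j=1}^{p}(T_j z - z_{\mathbf{j}})^{r}\,\frac{1}{2\pi i}\int_{\Gamma_{\mathbf{j}}}\frac{w_{\mathbf{j}}(\xi)\,\phi_{\mathbf{j}}(\xi)}{(\xi - z_{\mathbf{j}})^{r+1}}\,d\xi.$$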
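Now define $v_{\mathbf{j},r}\in B$ and $m_{\mathbf{j},r}\in B^*$ by
$$v_{\mathbf{j},r}(z) := \begin{cases}(T_j z - z_{\mathbf{j}})^{r} & \text{if } z\in D_{\mathbf{i}'},\\ 0 & \text{if } z\notin D_{\mathbf{i}'},\end{cases}$$
and
$$m_{\mathbf{j},r}(w) := \frac{1}{2\pi i}\int_{\Gamma_{\mathbf{j}}}\frac{w_{\mathbf{j}}(\xi)\,\phi_{\mathbf{j}}(\xi)}{(\xi - z_{\mathbf{j}})^{r+1}}\,d\xi.$$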
From (3.2) we then have
$$Lw = \sum_{r=0}^{\infty}\sum_{|\mathbf{j}|=n} m_{\mathbf{j},r}(w)\,v_{\mathbf{j},r}. \tag{3.2}$$
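Now we just normalise by setting
$$u_{\mathbf{j},r} = \frac{v_{\mathbf{j},r}}{\|v_{\mathbf{j},r}\|_\infty} = \frac{v_{\mathbf{j},r}}{\beta_{\mathbf{j}}^{r}} \quad\text{and}\quad l_{\mathbf{j},r} = \frac{m_{\mathbf{j},r}}{\|m_{\mathbf{j},r}\|_\infty},$$
so that
$$Lw = \sum_{r=0}^{\infty}\sum_{|\mathbf{j}|=n}\lambda_{\mathbf{j},r}\,l_{\mathbf{j},r}(w)\,u_{\mathbf{j},r}, \tag{3.3}$$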
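for coefficients $\lambda_{\mathbf{j},r} = \|v_{\mathbf{j},r}\|_\infty\,\|m_{\mathbf{j},r}\|_\infty$. To estimate the decay of these coefficients as $r\to\infty$, we have
$$\lambda_{\mathbf{j},r} = \|v_{\mathbf{j},r}\|_\infty\,\|m_{\mathbf{j},r}\|_\infty \le \beta_{\mathbf{j}}^{r}\,\frac{\|\phi\|_\infty}{2\pi\,\gamma_{\mathbf{j}}^{r+1}}.$$
Taking $\varepsilon > 0$ sufficiently small, we may choose $\max_{|\mathbf{j}|=n}\alpha_{\mathbf{j}} < \alpha < 1$ such that
$$\lambda_{\mathbf{j},r} \le K\alpha^{r} \tag{3.4}$$
for all $|\mathbf{j}| = n$, where
$$K = \|\phi\|_\infty\Big(2\pi\min_{|\mathbf{j}|=n}\gamma_{\mathbf{j}}\Big)^{-1} = Q\,\|e^{tg}\|_\infty, \tag{3.5}$$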
and $Q$ is a constant depending only on $T$. Re-labelling the various functions, functionals, and coefficients in (3.3) we write
$$Lw = \sum_{q=0}^{\infty}\lambda_q\,l_q(w)\,u_q,$$
where the $\lambda_q$ are non-increasing. For any fixed $r$, there were $p^n$ functions of the form $u_{\mathbf{j},r}$, and $p^n$ functionals of the form $l_{\mathbf{j},r}$, so choosing $\theta_0 = \alpha^{1/p^n} < 1$, the estimate (3.4) implies
$$\lambda_q \le K\theta_0^{q}. \tag{3.6}$$
We now estimate the coefficients $C_N(t)$ in the power series expansion of the Fredholm determinant $F(z,t)$. Formally
$$\det(I - zL) = \begin{vmatrix} 1 - z\lambda_1 l_1(u_1) & -z\lambda_1 l_1(u_2) & -z\lambda_1 l_1(u_3) & \cdots\\ -z\lambda_2 l_2(u_1) & 1 - z\lambda_2 l_2(u_2) & -z\lambda_2 l_2(u_3) & \cdots\\ -z\lambda_3 l_3(u_1) & -z\lambda_3 l_3(u_2) & 1 - z\lambda_3 l_3(u_3) & \cdots\\ \vdots & \vdots & \vdots & \ddots \end{vmatrix}$$
so we can express $C_N(t) = (-1)^N$
$\sum_{i_1}$ […] 2. It is well known that for $p > 2$, the elements of $W^{1,p}(\Omega)$ can be represented as continuous functions and the embedding $W^{1,p}(\Omega)\subset C(\bar\Omega)$ is a compact operator. Moreover, continuous $W^{1,p}$-functions are Hölder continuous:
$$(\,|x-y|\le\delta\,)\ \Rightarrow\ |f(x)-f(y)| \lesssim \delta^{1-\frac{2}{p}}\,\|\nabla f\|_{L^p(B(x,\delta))}. \tag{2.1}$$
The embedding result will be used in the following form.

Lemma. There is a constant $C$ (depending on $\Omega$ and on $p>2$) such that for any $\varepsilon>0$ there exists a finite rank operator $K$ in $W^{1,p}(\Omega)$ such that
$$\|K\|_{1,p}\le C,\qquad \|f-Kf\|_\infty\le\varepsilon\,\|f\|_{1,p}. \tag{2.2}$$
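Proof. Extend $f$ to the whole plane with Sobolev norm $\asymp\|f\|_{1,p}$, and consider a grid of equilateral triangles $\Delta$ of size $\delta\ll1$. Define $Kf$ to be a continuous function satisfying
$$Kf = \begin{cases} f & \text{at all vertices},\\ \text{linear} & \text{in each triangle } \Delta.\end{cases}$$
Then for each $\Delta$, we have the following estimates:
$$|\nabla(Kf)| \lesssim \frac{1}{\delta}\,\|f - f(\mathrm{center})\|_{L^\infty(\Delta)} \overset{(2.1)}{\lesssim} \delta^{-\frac{2}{p}}\Big(\int_{\Delta^*}|\nabla f|^p\Big)^{\frac{1}{p}},$$
where $\Delta^*$ is the union of $\Delta$ with the adjacent triangles. It follows that
$$\int_{\Delta}|\nabla Kf|^p \lesssim \int_{\Delta^*}|\nabla f|^p.$$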
Summing up over all $\Delta$'s, we obtain the first inequality. The second inequality follows from (2.1) by the choice of $\delta$. □

2.3. Lemma. If $p>2$ and $t < -2\big(1-\frac{2}{p}\big)$, then $L_t W^{1,p}(\Omega)\subset W^{1,p}(\Omega)$.

Proof. Let $f\in W^{1,p}(\Omega)$. Changing the variable in the integral we obtain
$$\int|\nabla(Lf)|^p \le \int\big|\nabla(f\,|F'|^{-t})\big|^p\,|F'|^{2-p} \lesssim I + II,$$
where
$$I := \int|\nabla f|^p\,|F'|^{-tp+2-p} \le \mathrm{const}\,\|f\|_{1,p},$$
N. Makarov, S. Smirnov
(because $-tp+2-p>0$), and
$$II := \int|f|^p\,\big|\nabla(|F'|^{-t})\big|^p\,|F'|^{2-p} \le \|f\|_\infty^p\int\big|\nabla(|F'|^{-t})\big|^p\,|F'|^{2-p}.$$
To see that the latter integral is finite, we only need to consider neighborhoods of critical points. Suppose $c$ is a zero of $F'$ of order $k\ge1$. Then we have (as $z\to c$): $\big|\nabla(|F'|^{-t})\big| \lesssim |z|^{-1-kt}$, and $\big|\nabla(|F'|^{-t})\big|^p\,|F'|^{2-p} \lesssim |z|^{-p(1+kt)+k(2-p)}$. Since the inequality $t < -\big(1+\frac{1}{k}\big)\big(1-\frac{2}{p}\big)$ implies $-p(1+kt)+k(2-p) > -2$, the integral converges. □
2.4. The function s(t). We define
$$s(t) := \log_d\rho(L_t, C(\bar\Omega)).$$
We will see later (Remark 3.4) that $\rho(L_t, C(\bar\Omega)) = \rho(L_t, J_F)$, and therefore by Przytycki's result (see Subsect. 1.2), we have
$$s(t) = \frac{P(t)}{\log d},$$
in agreement with notation in Introduction. Some preliminary properties of $s(t)$ are stated in the following lemma.

Lemma. (i) For every point $z_0\in\partial\Omega$, we have $L^n_t 1(z_0)\asymp\|L^n_t\|_\infty$, and therefore
$$\sum_{y\in F^{-n}(z_0)}|F_n'(y)|^{-t} = d^{\,s(t)n+o(n)}.$$
(ii) The function $s(t)$, $t\le0$, is strictly decreasing and satisfies the inequality
$$s(kt)\le k\,s(t)\qquad(\forall k\ge1). \tag{2.3}$$
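Part (i) of the lemma suggests a direct numerical experiment: sum $|F_n'(y)|^{-t}$ over the $n$-th preimages of a base point and read off the growth exponent. The sketch below assumes a quadratic polynomial $F(z) = z^2 + c$ (so $d = 2$); the function names and the choice of base point are hypothetical, and the finite-$n$ quotient only approximates $s(t)$ up to the $o(n)$ term.

```python
import cmath
import math


def preimage_sum(c, z0, n, t):
    """Sum of |(F^n)'(y)|^(-t) over y in F^{-n}(z0), for F(z) = z^2 + c."""
    # Each entry is (y, |(F^k)'(y)|) with F^k(y) = z0; start with k = 0.
    level = [(complex(z0), 1.0)]
    for _ in range(n):
        nxt = []
        for z, deriv in level:
            w = cmath.sqrt(z - c)
            for y in (w, -w):
                # Chain rule: (F^{k+1})'(y) = (F^k)'(F(y)) * F'(y), F'(y) = 2y.
                nxt.append((y, deriv * abs(2 * y)))
        level = nxt
    return sum(d ** (-t) for _, d in level)


def s_estimate(c, z0, n, t, degree=2):
    """Finite-n approximation log_d(sum) / n of the exponent s(t)."""
    return math.log(preimage_sum(c, z0, n, t)) / (n * math.log(degree))


# For F(z) = z^2 and z0 = 1 every preimage lies on the unit circle, so the
# sum equals 2^n * (2^n)^(-t) and the quotient is 1 - t for every n.
toy_value = s_estimate(0, 1, 8, -1.0)
```

For this toy map the value $1 - t$ is consistent with the bound $s(t)\ge 1-\frac{t}{2}$ obtained in the proof that follows.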
Thermodynamics of Rational Maps I
Proof. Since $t\le0$, the function $z\mapsto L^n_t 1(z)$ is subharmonic, and therefore we have
$$\|L^n_t\|_\infty = \|L^n_t 1\|_\infty = \sup_{\partial\Omega} L^n_t 1.$$
If $z_1, z_2\in\partial\Omega$, then we can choose a simply connected domain that contains $z_1$ and $z_2$ but does not contain forward iterates of the critical points. All branches of $F^{-n}$ are conformal on such a domain, and by the distortion theorem we have $|F_n'(y_1)|\asymp|F_n'(y_2)|$ as $n\to\infty$, where $y_1, y_2$ denote the images of $z_1, z_2$ under the same branch. It is easy to see that the constants in this relation can be chosen independent of the points $z_1, z_2$. This completes the proof of the first statement. A similar argument and the area estimate show that
$$\sum_{y\in F^{-n}(z_0)}|F_n'(y)|^{-2}\lesssim1.$$
By Hölder's inequality,
$$d^n = \sum_{y\in F^{-n}(z_0)}1 \le \Big(\sum_{y\in F^{-n}(z_0)}|F_n'(y)|^{-t}\Big)^{\frac{2}{2-t}}\Big(\sum_{y\in F^{-n}(z_0)}|F_n'(y)|^{-2}\Big)^{\frac{-t}{2-t}},$$
and we have $s(t)\ge1-\frac{t}{2}$, and $s'(0-)\le-\frac{1}{2}$, so $s(t)$ is strictly decreasing. To prove (2.3), we simply observe that $L^n_{kt}1\le(L^n_t 1)^k$. □

2.5. Two-norm inequality. Lemma. Let $t$ and $p$ be as in Lemma 2.3. Then there exists a positive number $\varepsilon = \varepsilon(p,t)$ such that
$$\|L^n_t f\|_{1,p}\le d^{\,n(s(t)-\varepsilon)+o(n)}\,\|f\|_{1,p} + C_n\|f\|_\infty,\qquad(f\in W^{1,p}(\Omega)). \tag{2.4}$$
Proof. We have
$$\int|\nabla(L^n_t f)|^p \lesssim \int\Big[\sum_{y\in F^{-n}(z)}|(\nabla f)(y)|\,|F_n'(y)|^{-(1+t)}\Big]^p dA(z) + \int\Big[\sum_{y\in F^{-n}(z)}|f(y)|\,|F_n'(y)|^{-1}\,\big|\nabla(|F_n'|^{-t})(y)\big|\Big]^p dA(z) =: I + II.$$
By the argument of the previous lemma, we have
$$II\le C_n^p\,\|f\|_\infty^p.$$
On the other hand, by Hölder's inequality, we have
$$I\le\int\Big[\sum_{y\in F^{-n}(z)}|(\nabla f)(y)|^p\,|F_n'(y)|^{-2}\Big]\Big[\sum_{y\in F^{-n}(z)}|F_n'(y)|^{p'(\frac{2}{p}-1-t)}\Big]^{\frac{p}{p'}}dA(z),$$
where $p'$ is the conjugate exponent (i.e. $p^{-1}+(p')^{-1}=1$). Using the obvious relation $|F_n'(y)|^{-2}\,dA(z) = dA(y)$, we obtain the estimate
$$I\le\|\nabla f\|_p^p\,\big\|L^n_{p'(1+t-\frac{2}{p})}\big\|_\infty^{\frac{p}{p'}}.$$
It remains to note that
$$\big\|L^n_{p'(1+t-\frac{2}{p})}\big\|_\infty^{\frac{1}{p'}} = d^{\,n\frac{1}{p'}s(p'(1+t-\frac{2}{p}))+o(n)},$$
and that
$$\frac{1}{p'}\,s\Big(p'\cdot\Big(1+t-\frac{2}{p}\Big)\Big) < \frac{1}{p'}\,s(p't)\le s(t)$$
by Lemma 2.4. □

2.6. Proof of Theorem. Fix numbers $t<0$ and $p>2$ satisfying
$$t < -2\Big(1-\frac{2}{p}\Big).$$
The transfer operator $L_t$ is bounded in $W^{1,p}(\Omega)$ by Lemma 2.3. By Lemma 2.5, for any given $q\in(0,1)$, we can find an integer $N$ and a constant $Q$ such that
$$\|L^N_t f\|_{1,p}\le q\,d^{Ns}\,\|f\|_{1,p} + Q\|f\|_\infty,\qquad(s := s(t)). \tag{2.5}$$
By induction, we have
$$\|L^{kN}_t f\|_{1,p}\le d^{kNs}\,\|f\|_{1,p} + QM_k\|f\|_\infty,\qquad(k = 1, 2, \ldots),$$
where the sequence $\{M_k\}$ is determined by the equations
$$M_1 = 1,\qquad M_{k+1} = d^{Ns}M_k + \|L^{kN}_t\|_\infty.$$
Since $d^s$ is the spectral radius of $L_t$ in $C(\bar\Omega)$, we have $M_k\le d^{(k+o(k))Ns}$ as $k\to\infty$. It follows that
$$\|L^{kN}_t\|_{1,p}\le d^{(k+o(k))Ns},$$
and
$$\rho(L_t, W^{1,p}(\Omega))\le d^s.$$
The opposite inequality is obvious: $\|L^n_t\|_\infty = \|L^n_t 1\|_\infty\lesssim\|L^n_t 1\|_{1,p}\lesssim\|L^n_t\|_{1,p}$. Let us now prove the strict inequality for the essential spectral radius. The argument is again based on the estimate (2.5), in which we choose $q$ such that $q$ […] 2 belongs to some Sobolev space $W^{1,q}(\mathbb{C})$ sufficiently close to 2, the condition
$$P\big(p'\log[\,g\,|F'|^{\frac{2}{p}-1}]\big) < p'\,P(\log g),\qquad p' := \frac{p}{p-1}, \tag{2.6}$$
implies the quasicompactness of the transfer operator $L_g$ in $W^{1,p}(\hat{\mathbb{C}})$.

Corollary. Let $F$ and $g$ be as above, and let $\lambda$ denote the spectral radius of $L_g$ in $C(J_F)$. Suppose also that $g$ satisfies Ruelle's condition (1.4):
$$\exists n:\ \lambda^n > \sup_{J_F} g_n.$$
Then for all numbers $p>2$ sufficiently close to 2, the operator $L_g$ acts in $W^{1,p}(\hat{\mathbb{C}})$ and is quasicompact.
Proof. Since $g$ vanishes at the critical points of $F$, we can represent it as follows: $g = h\,|F'|^\tau$, where $\tau$ is some positive number and $h\in W^{1,q}$ for some $q>2$. By the argument of Lemma 2.3, the transfer operator acts in $W^{1,p}$ provided that $2<p<q$ and $p$ […] 0, we have
$$\big\|g_n\,|F_n'|^{-\varepsilon}\big\|_\infty = \big\|h_n^{\frac{\varepsilon}{\tau}}\,g_n^{\frac{\tau-\varepsilon}{\tau}}\big\|_\infty \lesssim \lambda_2^n$$
for some $\lambda_2<\lambda$. Given $p$ close to 2, we set $\varepsilon = p-2$. Then we have
$$\Big\|z\mapsto\sum_{y\in F^{-n}(z)}g_n^{p'}(y)\,|F_n'(y)|^{(-1+2/p)p'}\Big\|_\infty = \Big\|z\mapsto\sum_{y\in F^{-n}(z)}g_n\,\big[g_n\,|F_n'(y)|^{-\varepsilon}\big]^{p'-1}\Big\|_\infty \lesssim \lambda_2^{(p'-1)n}\,\lambda^{n+o(n)} \lesssim \lambda_3^{p'n}$$
with some $\lambda_3<\lambda$. This implies (2.6). □

The last statement represents a version of Ruelle's theorem mentioned in Sect. 1.2. As we noted, the condition (1.6) is weaker than Ruelle's condition (1.4). The latter condition can fail even if (1.6) is valid.
3. Analyticity of the Pressure Function In this section we verify the statements of Theorems A and B for non-exceptional polynomials.
3.1. Theorem. If F is not exceptional, then the function s(t) is real analytic for t < 0.
Again, the proof is rather standard. It is contained in the next four lemmas. Fix $t<0$ and $p>2$ satisfying the condition of Lemma 2.3. Denote $\lambda\equiv\lambda(t) := d^{s(t)}$. In other words,
$$\lambda = \rho(L_t, C(\bar\Omega)) = \rho(L_t, W^{1,p}(\Omega)).$$
We show that $\lambda(t)$ is an isolated, simple eigenvalue of $L_t : W^{1,p}(\Omega)\to W^{1,p}(\Omega)$. Then the theorem follows by the usual application of the analytic perturbation theory. The first lemma is taken from [32]. Lemma 3.3 is a version of the construction of conformal measures due to Patterson [26] and Sullivan [35]. Lemma 3.5 is essentially Lemma 6.1 of [21]. For the convenience of the reader, we outline the proofs.

3.2. λ is an eigenvalue. Lemma. We have $\ker(L_t-\lambda)\ne\emptyset$ in $W^{1,p}(\Omega)$. The corresponding eigenspace contains a non-negative eigenfunction.

Proof. Since $\rho_{\mathrm{ess}}(L_t, W^{1,p}(\Omega))<\lambda$, there are only finitely many eigenvalues $\lambda_j$ satisfying $|\lambda_j| = \lambda$, and the corresponding spectral projections $P_j$ have finite ranks. Denote
$$g_j := P_j 1;\qquad g_0 := 1-\sum g_j.$$
Applying $L^n_t$, we have
$$L^n_t g_0 + \sum L^n_t g_j = L^n_t 1,$$
and since $\|L^n_t g_0\|_\infty\lesssim\|L^n_t g_0\|_{1,p} = o(\|L^n_t 1\|_\infty)$ as $n\to\infty$, at least one of the $g_j$'s is not zero. We also have $\|L^n_t g_j\|_{1,p}\asymp n^{k_j}\lambda^n$ as $n\to\infty$, where $k_j\ge0$ is the maximal integer number such that $\varphi_j := (L-\lambda_j)^{k_j}g_j\ne0$ (i.e. $k_j$ is the size of the corresponding Jordan cell). Let $k := \max\{k_j\}$. Then
$$p_n := n^{-k}(L^n_t 1) = \sum_{j:\,k_j=k}\lambda_j^n\varphi_j + o(\lambda^n) \tag{3.1}$$
in $W^{1,p}(\Omega)$ and also in $C(\bar\Omega)$. Since the functions $\varphi_j$ are linearly independent, we have $\|p_n\|_\infty\asymp\|p_n\|_{1,p}\asymp\lambda^n$, and we also have $p_n(z_0)\asymp\lambda^n$ for some fixed $z_0\in\partial\Omega$. Since $p_n\ge0$, it follows that
$$\Big\|\frac{1}{N}\sum_{n=1}^{N}\frac{p_n}{\lambda^n}\Big\|_\infty \ge \frac{1}{N}\sum_{n=1}^{N}\frac{p_n(z_0)}{\lambda^n}\gtrsim1.$$
By (3.1), this is possible only if one of the eigenvalues $\lambda_j$ is positive. □
3.3. Existence of eigenmeasures. Let $L^*_t$ denote the adjoint of the operator $L_t : C(\bar\Omega)\to C(\bar\Omega)$. Then $L^*_t$ acts in the space $M(\bar\Omega)$ of finite complex measures according to the following formula:
$$L^*_t : \nu\mapsto\mu := |F'|^{-t}(\nu\circ F).$$
The latter means that
$$|F'|^{t}\in L^1(\mu),$$
in particular $\mu(\mathrm{Crit}\,F) = 0$, and that
$$\nu(FA) = \int_A|F'|^{t}\,d\mu$$
for every set $A$ such that $F$ is one-to-one on $A$ and satisfies $A\cap(\mathrm{Crit}\,F) = \emptyset$. In the special case $\nu = \delta_z$, we have
$$L^*_t\delta_z = \sum_{y\in F^{-1}(z)}|F'(y)|^{-t}\,\delta_y. \tag{3.2}$$
Lemma. There exists a probability measure $\nu$ on $J_F$ such that $L^*_t\nu = \lambda(t)\,\nu$.

Proof. Fix a point $z\in\partial\Omega$ and consider the sequence of positive measures
$$\mu_n := \lambda^{-n}(L^*_t)^n\delta_z = \lambda^{-n}\sum_{y\in F^{-n}(z)}|F_n'(y)|^{-t}\,\delta_y.$$
Clearly, $L^*_t\mu_n = \lambda\,\mu_{n+1}$, and by the proof of Lemma 3.2, we have $\|\mu_n\| = \lambda^{-n}L^n_t 1(z)\asymp n^k$ for some integer $k\ge0$. Next we define
$$\nu_n := \sum_{j=0}^{n}\mu_j,$$
and take some (weak-∗) limit point $\nu$ of the sequence $\nu_n/\|\nu_n\|$. Then $\nu$ is a probability measure supported on $J_F$, and since
$$\frac{\|L^*_t\nu_n-\lambda\nu_n\|}{\|\nu_n\|} = \frac{\|\lambda(\mu_{n+1}-\mu_0)\|}{\|\nu_n\|}\asymp\frac{n^k}{n^{k+1}} = \frac{1}{n}\to0,$$
we have $L^*_t\nu = \lambda\nu$. □

3.4. Remark. The last lemma implies in particular that
$$\rho(L_t, C(J_F)) = \rho(L_t, C(\bar\Omega)).$$
Indeed, $\lambda$ is an eigenvalue of the adjoint of $L_t : C(J_F)\to C(J_F)$, and therefore $\rho(L_t, C(J_F))\ge\lambda$. The opposite inequality is obvious:
$$\|L^n_t\|_{C(J_F)} = \|L^n_t 1\|_{C(J_F)}\le\|L^n_t 1\|_{C(\bar\Omega)}.$$
3.5. The support of an eigenmeasure. Lemma. Let $\nu$ be a probability measure on $J_F$ satisfying $L^*_t\nu = \lambda(t)\,\nu$. Then either $\mathrm{supp}\,\nu = J_F$, or the set $\Sigma := \mathrm{supp}\,\nu$ is finite and satisfies $F^{-1}\Sigma\setminus\mathrm{Crit}\,F = \Sigma$, in particular $F$ is exceptional. In the latter case, we have (see Introduction for notation) $\log\lambda(t) = -t\chi_* = -t\chi_{\max}$.

Proof. From the equation $\lambda\nu = |F'|^{-t}\,\nu\circ F$ we have $F^{-1}\Sigma\setminus\mathrm{Crit}\,F\subset\Sigma$. It follows that if $\#\Sigma = \infty$, then we can find a point $a\in\Sigma$ such that
$$\bigcup_{n\ge0}F^{-n}a\subset\Sigma,$$
which implies $\Sigma = J_F$. On the other hand, if $\#\Sigma<\infty$, then by (3.3) we have
$$(x\in\Sigma)\Rightarrow(\nu(x)\ne0)\Rightarrow(|F'(x)|\ne0\text{ and }\nu(Fx)\ne0)\Rightarrow(x\in F^{-1}\Sigma\setminus\mathrm{Crit}\,F).$$
To prove the last statement of the lemma, observe that if $b\in\mathrm{Per}\,F$, then clearly $\log\lambda(t)\ge-t\chi_b$. On the other hand, we have $\log\lambda(t) = -t\chi_a$ for every periodic point $a\in\Sigma$. □
(3.3)
3.6. Multiplicity of λ. Lemma. Suppose there exists a probability measure $\nu$ such that $L^*\nu = \lambda(t)\,\nu$ and $\mathrm{supp}\,\nu = J_F$. Then $\lambda\equiv\lambda(t)$ is a simple eigenvalue of the operator $L_t$ in $W^{1,p}(\Omega)$: $\dim\ker(L_t-\lambda)^2 = 1$.

Proof. We will need the following fact: if $f\in W^{1,p}(\Omega)$, then
$$\begin{cases}L_t f = \lambda f\\ f|_{J_F} = 0\end{cases}\quad\text{implies}\quad f = 0. \tag{3.4}$$
Assuming (3.4), we can use the following standard argument to prove the lemma. It is known that the existence of an eigenmeasure with $\mathrm{supp}\,\nu = J_F$ implies $\dim\ker(L_t-\lambda) = 1$ in $C(J_F)$, see for example Sect. 3.6 of [21]. By (3.4), the same is true for the space $W^{1,p}(\Omega)$. Suppose now that $(L_t-\lambda)^2h = 0$ for some $h\in W^{1,p}(\Omega)$. We need to show that $f := (L_t-\lambda)h$ is trivial. By (3.4), it is sufficient to prove $f|_{J_F} = 0$. We have
$$(f,\nu) = (L_t h,\nu)-(\lambda h,\nu) = (h, L^*_t\nu)-\lambda\,(h,\nu) = 0.$$
Since $\dim\ker(L_t-\lambda) = 1$, we can assume (by Lemma 3.2) that $f\ge0$, and therefore, we have $f = 0$ $\nu$-almost everywhere. The equality $f|_{J_F} = 0$ now follows from the assumption $\mathrm{supp}\,\nu = J_F$. It remains to prove (3.4). Fix $z\in\Omega$. We have
$$|f(z)| = |\lambda^{-n}L^n_t f(z)| \le \lambda^{-n}\sum_{y\in F^{-n}(z)}|F_n'(y)|^{-t}\,|f(y)| \lesssim \lambda^{-n}\sum_{y\in F^{-n}(z)}|F_n'(y)|^{-t}\,\mathrm{dist}(y,J_F)^{\alpha}, \tag{3.6}$$
where $\alpha<-t$ is a fixed positive number such that $W^{1,p}(\Omega)\subset H_\alpha$, see (2.1). Observe now that
$$\mathrm{dist}(y,J_F)\lesssim|F_n'(y)|^{-1}. \tag{3.5}$$
Indeed, if $z$ is in the basin of attraction to $\infty$, and $G(\cdot)$ denotes the Green function with pole at infinity, then (3.5) follows from the estimates
$$|F_n'(y)|\,|\nabla G(z)| = d^n\,|\nabla G(y)| \lesssim \frac{d^n\,G(y)}{\mathrm{dist}(y,J_F)} = \frac{G(z)}{\mathrm{dist}(y,J_F)}.$$
On the other hand, if $z$ belongs to some bounded component of $\mathbb{C}\setminus J_F$, then the iterates $\{F^n\}$ are uniformly bounded in the discs $B(y,\mathrm{dist}(y,J_F))$ (the discs lie in the filled-in Julia set), and so (3.5) follows from the Schwarz lemma. We can now finish the proof of (3.4). From (3.6) and (3.5), we have
$$|f(z)|\lesssim\lambda^{-n}\sum_{y\in F^{-n}(z)}|F_n'(y)|^{-t-\alpha} \le d^{-s(t)n}\,d^{\,s(t+\alpha)n}\,d^{\,o(n)}\to0$$
as $n\to\infty$, because $s(\cdot)$ is strictly decreasing. □

We conclude this section with several remarks concerning some other "hyperbolic" features of non-exceptional polynomials.

3.7. Remarks. (i) Perron-Frobenius Theorem. The probability eigenmeasure $\nu\equiv\nu_t$ in Lemma 3.3 is unique, and if $f_t\in W^{1,p}(\Omega)$ denotes the non-negative eigenfunction of $L_t$ satisfying $\nu_t(f_t) = 1$, then the rank one operator
$$P := (\cdot,\nu_t)\,f_t$$
is the spectral projection of $L_t : W^{1,p}(\Omega)\to W^{1,p}(\Omega)$ corresponding to the isolated eigenvalue $\lambda\equiv\lambda(t)$. One can show that
$$\rho((I-P)L_t, W^{1,p}(\Omega))<\lambda, \tag{3.7}$$
which implies that
$$\lambda^{-n}L^n_t\to P$$
with exponential rate of convergence in the uniform operator topology. To prove (3.7), we first observe that the set $\{f_t = 0\}$ is finite. Assume that $L_t\hat f = \hat\lambda\hat f$ for some number $\hat\lambda$ of modulus $\lambda$ and some function $\hat f\in W^{1,p}(\Omega)$ with normalization $\nu_t(|\hat f|) = 1$. Then we have $|\hat f| = f_t$ (use, e.g., the argument of [21], p.142). Define the function $\eta = \eta(z)$ for $z\in J_F\setminus\{f_t = 0\}$ by the equation $\hat f = \eta f_t$. From the identity
$$(L_t f_t)(z) = \frac{\lambda}{\hat\lambda\,\eta(z)}\,(L_t\,\eta f_t)(z),$$
we have
$$\sum_{y\in F^{-1}z}\Big[1-\frac{\lambda\,\eta(y)}{\hat\lambda\,\eta(z)}\Big]f_t(y)\,|F'(y)|^{-t} = 0,$$
and therefore
$$\eta(Fy) = \frac{\lambda}{\hat\lambda}\,\eta(y)$$
except for a finite set of $y$'s. Taking two periodic points with relatively prime periods and with orbits avoiding this finite set, we have $\hat\lambda = \lambda$. □

(ii) Equilibrium states. Let $\mu_t$ denote the probability measure $f_t\nu_t$. Standard argument shows that $\mu_t$ is an ergodic, $F$-invariant measure. We claim that $\mu_t$ is a unique equilibrium state:
$$P(t) = h_t-t\chi_t, \tag{3.8}$$
where we write $h_t$ and $\chi_t$ for the entropy and the exponent of $\mu_t$. The equality (3.8) follows from the Rokhlin-type formula
$$h_t = \int\log J_t\,d\mu_t, \tag{3.9}$$
where
$$J_t := \lambda(t)\,\frac{f_t\circ F}{f_t}\,|F'|^{t}\in L^1(\mu_t)$$
is the Jacobian of $\mu_t$. (We also use the obvious fact that $\log f_t$ is integrable with respect to $\mu_t$.) The formula (3.9) follows from the well-known estimate
$$h_t\ge\int\log J_t\,d\mu_t$$
and from the variational principle. To prove the uniqueness result, it is sufficient to show that if $\mu$ is an equilibrium state, then
$$\mu(\Psi) = \mu_t(\Psi)\qquad\text{for all }\Psi\in C^\infty.$$
The latter is an immediate consequence (cf. [28]) of the differentiability at 0 of the pressure function
$$p(s) := P(-t\log|F'|+s\Psi),\qquad(s\in\mathbb{R}),$$
see the next remark and also Subsect. 2.7.

(iii) Derivatives of the pressure function. For non-exceptional polynomials, one can establish the same formulas for the derivatives of $P(t)$ as in the hyperbolic case (see [29–31]). Namely, for the first derivative we have
$$P'(t) = -\chi_t,\qquad(t<0),$$
and also
$$P'(0-) = -\chi_m,\qquad P'(-\infty) = \sup_M\chi_\mu = \lim_{n\to\infty}\frac{1}{n}\log\|F_n'\|_\infty.$$
(Recall that $m$ denotes the measure of maximal entropy.) The first statement follows, for example, from the variational principle which also implies the inequality $P'(0-)\ge-\chi_m$.
To prove that
$$P'(0-)\le-\chi_m,$$
we denote
$$P_\varepsilon(t) = P(-t\log(|F'|+\varepsilon)),$$
and consider the corresponding equilibrium state $\mu_{\varepsilon,t}$:
$$P_\varepsilon(t) = h_{\varepsilon,t}-t\int\log(|F'|+\varepsilon)\,d\mu_{\varepsilon,t}$$
($h_{\varepsilon,t}$ is the entropy of the equilibrium state). It follows that $h_{\varepsilon,t}\to P_\varepsilon(0) = \log d$ as $t\to0$, and therefore
$$\text{weak*-}\lim_{t\to0}\mu_{\varepsilon,t} = m$$
by the upper semicontinuity of the entropy and the uniqueness of the maximal measure. Since $P_\varepsilon(t)\le P(t)$, we have
$$P'(0-)\ \ge\ \limsup_{t\to0-}\frac{\log d-P_\varepsilon(t)}{-t}\ \ge\ \limsup_{t\to0-}\frac{h_{\varepsilon,t}-P_\varepsilon(t)}{-t}\ =\ -\liminf_{t\to0-}\int\log(|F'|+\varepsilon)\,d\mu_{\varepsilon,t}\ =\ -\int\log(|F'|+\varepsilon)\,dm\ \to\ \chi_m\quad\text{as }\varepsilon\to0.$$
To state the formula for the second derivative of the pressure function, we denote $A := \log|F'|$ and $S_n := \sum_{j=0}^{n-1}A\circ F^j$. For $t<0$, consider the asymptotic variance $\sigma_t^2$ of the process $\{A\circ F^n\}_{n\ge0}$ in $L^2(\mu_t)$:
$$\sigma_t^2 := \lim_{n\to\infty}\frac{1}{n}\int[S_n-\mu_t(S_n)]^2\,d\mu_t = \int A^2\,d\mu_t+2\sum_{n=1}^{\infty}\int A\,(A\circ F^n)\,d\mu_t.$$
The asymptotic variance is finite because of the exponential decay of the correlations $\int A\,(A\circ F^n)\,d\mu_t$ (use the fact that $L_t(Af_t)\in W^{1,p}(\Omega)$ and apply Perron-Frobenius). As in the hyperbolic case, we have $P''(t) = \sigma_t^2$. Indeed, standard computation based on the differentiation of the identity $L_\tau f_\tau = \lambda(\tau)f_\tau$ (with normalization $\nu_t(f_\tau)\equiv1$ for the eigenfunctions $f_\tau$) shows that
$$P''(t) = n^{-1}\big[\mu_t(S_n^2)-\mu_t(S_n)^2\big]-\langle n^{-1}S_n\dot f_t,\nu_t\rangle,$$
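(the dot denotes the derivative with respect to $t$) and so we need to show that the last term tends to zero as $n\to\infty$. Since
$$\langle(A\circ F^j)\dot f_t,\nu_t\rangle = \lambda(t)^{-j}\langle A\,(L_t^j)\dot f_t,\nu_t\rangle,$$
we have
$$\langle n^{-1}S_n\dot f_t,\nu_t\rangle = \langle A\,M_n\dot f_t,\nu_t\rangle,$$
where
$$M_n\dot f_t := \frac{1}{n}\sum_{j=0}^{n-1}\frac{L_t^j}{\lambda(t)^j}\,\dot f_t\ \xrightarrow{\ W^{1,p}\ }\ \langle\dot f_t,\nu_t\rangle f_t = 0.$$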
(iv) $P(t)\equiv\tilde P(t)$ for non-exceptional maps. This follows from the fact that the equilibrium states $\mu_t$ are non-atomic. The latter can be proved as follows. The analyticity of the pressure function implies that
$$P(t)>P'(-\infty)\,t,\qquad(\forall t<0).$$
On the other hand, we have
$$|P'(-\infty)| = \lim_{n\to\infty}\frac{1}{n}\log\|F_n'\|_\infty.$$
Hence, for every $t<0$, we have
$$\|F_n'\|_\infty^{-t} = o(\lambda(t)^n)\quad\text{as }n\to\infty.$$
Suppose now that $\nu_t(x)\ne0$. Since $\nu_t$ is an eigenmeasure, we have $|F_n'|^{-t}(\nu_t\circ F^n) = \lambda(t)^n\,\nu_t$, and
$$\nu_t(F^nx) = \frac{\lambda(t)^n\,\nu_t(x)}{|F_n'(x)|^{-t}}\to\infty.$$
3.8. Rigidity. It follows from Remark (iii) that if $P''(t) = 0$ for some $t<0$, then $\sigma_t = 0$ and therefore the function $\log|F'|$ is homologous to a constant in $L^2(\mu_t)$, i.e. for some $u\in L^2(\mu_t)$ we have
$$\log|F'| = u-u\circ F+\mathrm{const}. \tag{3.10}$$
According to Zdunik [38], $\log|F'|$ can be homologous to a constant in $L^2(m)$, where $m$ is the maximal measure, if and only if $F$ is critically finite and the corresponding orbifold is parabolic. One can modify the argument in [38] to extend her result to our equilibrium states $\mu_t$.

Theorem. Let $F$ be a nonexceptional rational function. Then $P''(t)>0$ for all $t<0$.
Proof. Suppose $P''(t) = 0$ for some $t<0$ and let $\mu = \mu_t$ denote the corresponding equilibrium state. We claim that (3.10) implies
$$F^{-1}(CV)\subset CV\cup C, \tag{3.11}$$
where
$$C := J\cap\mathrm{Crit}\,F\quad\text{and}\quad CV := \{F^nc : n\ge1,\ c\in C\}.$$
It then follows that the set $CV$ is finite, in which case the statement is known. To prove (3.11), we need the following lemma. Let us choose a subset $S\subset J$ with $\mu S>1/2$ such that $u$ is bounded on $S$.

Lemma. Let $p\in J\setminus CV$. Then there is a disc $B$ about $p$ and a subset $E\subset B$ of full $\mu$-measure in $B$ such that the following is true: for every pair of points $x, y\in E$, there is an integer $n>0$ and a component $P$ of $F^{-n}B$ such that (i) the map $F^n : P\to B$ is univalent, and (ii) $x, y\in F^n(S\cap W)$.

This lemma immediately implies (3.11). First we observe that $u$ is bounded on $E\cap\frac{1}{2}B$. Indeed, if $x = F^na$ and $y = F^nb$ for some $a, b\in S\cap W$, then by (3.10) we have
$$u(x)-u(y) = \log\frac{|F_n'(a)|}{|F_n'(b)|}+u(b)-u(a),$$
and the first term to the right is bounded by the distortion theorem. Next we take $x\in CV$, $y\in F^{-1}x$ and suppose that $y\notin C\cup CV$. It follows that $u$ is $\mu$-bounded in some neighborhood of $y$. Applying (3.10), we see that $u$ is $\mu$-bounded in some neighborhood of $x$. On the other hand, there is a critical point $c\in C\setminus CV$ such that $x = F^kc$ for some $k\ge1$. Then $u$ is $\mu$-bounded near $c$, but the equation
$$\log|F_k'| = u\circ F^k-u+\mathrm{const}$$
shows that $u$ cannot be $\mu$-bounded at $x$. This proves (3.11) and hence the theorem.

We now turn to the proof of the lemma. Consider the natural extension $(\tilde J,\tilde F,\tilde\mu)$ of the dynamical system $(J, F,\mu)$. Recall that $\tilde F$ is the left shift in the space of sequences
$$\tilde J := \{\tilde x = (\ldots,x_{-1},x_0,x_1,\ldots)\in J^{\mathbb{Z}} : x_{k+1} = Fx_k\}.$$
Let $\pi_k : \tilde J\to J$ denote the projection onto the $k$-th coordinate. We will write $\pi$ for $\pi_0$. The ergodic measure $\tilde\mu$ is defined as a unique $\tilde F$-invariant measure satisfying $\mu = \pi_*\tilde\mu$. For a given disc $B$ and $n>0$, we denote by $U_{-n}$ the union of the components of $F^{-n}B$ on which $F^n$ is univalent. Consider the set
$$O := \{\tilde x\in\tilde J : x_0\in B,\ x_k\in U_k\text{ for all }k<0\}.$$
We can introduce a direct product structure in $O$ in the following way. Let $\Sigma$ be the set of all infinite sequences of the inverse branches participating in the construction of $O$:
$$\Sigma = O/\sim,$$
where, by definition, $\tilde x\sim\tilde y$ if the points $x_k$ and $y_k$ belong to the same component of $U_k$ for all $k<0$. If $\tau : O\to\Sigma$ denotes the corresponding projection, then the map $\pi\times\tau : O\to B\times\Sigma$ is a bijection. Consider now the restriction of $\tilde\mu$ to the set $O$ as a measure on $B\times\Sigma$. Let $\rho$ denote the projection of this measure to $\Sigma$ and $\{\mu_\sigma : \sigma\in\Sigma\}$ the corresponding family ("canonical system") of conditional measures on $B$. The proof of the lemma is based on the following two facts:

(*) if the radius of $B$ is sufficiently small, then $\tilde\mu(O)>0$;
(**) the restriction of $\mu$ to $B$ is absolutely continuous with respect to $\mu_\sigma$ for $\rho$-a.e. $\sigma$.

Assuming these facts, we can now finish the proof of the lemma. Since $\tilde\mu(\pi^{-1}S)>\frac{1}{2}$, applying the ergodic theorem we can find a subset $E$ of $O$ of full measure, $\tilde\mu(O\setminus E) = 0$, such that
$$(\tilde x,\tilde y\in E)\Rightarrow(\exists k<0,\ x_k\in S,\ y_k\in S).$$
Denote
$$E_\sigma := \pi(E\cap\tau^{-1}\sigma).$$
Then we have
$$0 = \tilde\mu(O\setminus E) = \int\mu_\sigma(B\setminus E_\sigma)\,d\rho(\sigma),$$
and therefore
$$\mu_\sigma(B\setminus E_\sigma) = 0\quad\text{for }\rho\text{-a.e. }\sigma.$$
By (**), we have
$$\mu E_\sigma = \mu B\quad\text{for }\rho\text{-a.e. }\sigma,$$
and so almost every set $E_\sigma$ satisfies the condition of the lemma. It remains to verify (*) and (**).

Proof of (*). Recall that $\mu = f\nu$, where $f = f_t$ and $\nu = \nu_t$ are the corresponding eigenfunction and eigenmeasure respectively. Since $p\notin CV$, we have $f(p)\ne0$. We will also use the estimate
$$\|F_n'\|_\infty^{-t}\lesssim\lambda_1^n,\qquad\lambda_1<\lambda := \lambda(t), \tag{3.12}$$
which is true, as was already mentioned, for all non-exceptional maps. For $n>0$, let $C_{-n}$ be the union of the components $P$ of $F^{-n}B$ such that $P\cap C\ne\emptyset$, but $FP\subset U_{1-n}$. It is clear that the number of such components $P$ of $C_{-n}$ as well as the degrees of the maps $F^n : P\to B$ are bounded by a constant depending only on the degree of $F$. Using the fact that $\nu$ is an eigenmeasure and that $f(p)\ne0$, it follows that if the radius of $B$ is small enough, then
$$\mu C_{-n}\lesssim\mathrm{const}\,\frac{\|f\|_\infty}{f(p)}\,\frac{\|F_n'\|_\infty^{-t}}{\lambda^n}\,\mu B,$$
with a constant depending only on the degree of $F$. For an arbitrary $N$, we can take $B$ so small that $C_{-1},\ldots,C_{-N} = \emptyset$, and by (3.12), we can choose $N$ such that
$$\sum_{n>0}\mu C_{-n} = \sum_{n>N}\mu C_{-n}<\mu B.$$
Since
$$\pi^{-1}B\setminus O\subset\bigcup_{n>0}\{\tilde x\in\pi^{-1}B : x_{-n}\in C_{-n}\},$$
we have
$$\tilde\mu O\ge\mu B-\sum\mu C_{-n}>0.\ \Box$$
Proof of (**). Fix $\eta\in(0,1)$, and let $B'$ denote the disc $\eta B$. We will show that if
$$\tilde\mu(O\cap\pi^{-1}B')>0, \tag{3.13}$$
then
$$(\,\mu e>0,\ e\subset B'\,)\ \Rightarrow\ (\mu_\sigma e>0\text{ for }\rho\text{-a.e. }\sigma).$$
By (*), the inequality (3.13) holds for all $\eta$ close to 1, and therefore (**) follows. We will use the symbols $P_k$, $k>0$, to denote any component of $U_{-k}$. The statement follows from the estimate
$$\tilde\mu\,[O\cap\pi_0^{-1}e\cap\pi_{-k}^{-1}P_k]\ \ge\ \mathrm{const}\ \tilde\mu\,[O\cap\pi_0^{-1}B'\cap\pi_{-k}^{-1}P_k], \tag{3.14}$$
with a constant independent of $k$ and $P_k$. Since
$$\bigcup_{(P_n)}\pi_{-n}^{-1}P_n\searrow O\quad\text{as }n\to\infty,$$
we have
$$\tilde\mu\,[O\cap\pi_0^{-1}e\cap\pi_{-k}^{-1}P_k] = \lim_{n\to\infty}\sum\tilde\mu\,[O\cap\pi_{-n}^{-1}P_n] = \lim_{n\to\infty}\sum\mu(P_n\cap F^{-n}e), \tag{3.15}$$
where the sums are taken over all components $P_n$ such that $F^{n-k}P_n = P_k$. We can represent the right-hand side of (3.14) in a similar way, and so to prove (3.14) we only need to compare the $\mu$-measures of the sets $P_n\cap F^{-n}e$ and $P_n\cap F^{-n}B'$. Assume first that the eigenfunction $f$ does not vanish on $J$. Then it is enough to notice that the $\nu$-measures of the above sets are comparable. The latter is a consequence of the distortion theorem and of the fact that $\nu$ is an eigenmeasure:
$$\frac{\nu(P_n\cap F^{-n}e)}{\nu(P_n\cap F^{-n}B')} = \frac{\lambda^{-n}\int_e|F_n'|^{t}\,d(\nu\circ F^n)}{\lambda^{-n}\int_{B'}|F_n'|^{t}\,d(\nu\circ F^n)}.$$
The eigenfunction $f$ may have zeros in general. Let $Z$ denote the set $\{f = 0\}$. Since $F$ is non-exceptional, there is an integer $m>0$ such that $\delta := \mathrm{dist}(Z, F^{-m}Z)>0$. We can also assume that the disc $B$ is so small that the diameters of all sets $P_n\cap F^{-n}B'$ are $\ll\delta$. Returning to the computation (3.15), we modify some of the terms $\mu(P_n\cap F^{-n}e)$ as follows. If the set $P_n\cap F^{-n}e$ contains a point at which $f$ is very small, then we replace the corresponding term with the sum
$$\sum\mu(P_{n+m}\cap F^{-n-m}e)$$
taken over all components $P_{n+m}$ such that $F^mP_{n+m} = P_n$. In the new expression, the eigenfunction $f$ is bounded away from zero by a constant independent of $n$, and so the previous argument applies. □

4. Hidden Spectrum

In this section we study the phase transition case, and complete the proof of Theorems A and B. Let $F$ be an exceptional polynomial. We assume that $F$ is not conjugate to a Chebychev's polynomial. From the discussion in Sect. 1.3, it follows that there exists a fixed point $a\in J_F$, $F(a) = a$, such that
$$F^{-1}a\setminus\{a\}\subset\mathrm{Crit}\,F.$$
Consider the function $H(z) := |z-a|$. We have
$$\frac{H\circ F}{H}(z) = \prod_{c\in\mathrm{Crit}\,F\cap F^{-1}a}|z-c|^{k(c)+1},$$
where $k(c)$ denotes the multiplicity of a critical point $c$. We also define the number $\tilde\kappa>0$ from the equation
$$\frac{\tilde\kappa}{1-\tilde\kappa} = \min\{k(c) : c\in F^{-1}a\cap\mathrm{Crit}\,F\}.$$

4.1. The functions $s_\kappa(t)$. The idea is to replace the weights $|F'|^{-t}$ in the transfer operators (1.1) with "homologous" weights of the form
$$G_{\kappa,t} := |F'|^{-t}\Big(\frac{H\circ F}{H}\Big)^{\kappa t}.$$
If $0\le\kappa\le\tilde\kappa$, then the weights $G_{\kappa,t}$ are continuous in $\bar\Omega$ and the corresponding transfer operators
$$L_{\kappa,t}f(z) := \sum_{y\in F^{-1}(z)}G_{\kappa,t}(y)\,f(y)$$
are bounded in $C(\bar\Omega)$. The special property of the case $\kappa = \tilde\kappa$ is that every point in $\Omega$ has at least one preimage that is not a zero of $G_{\tilde\kappa,t}$. This means that we are no longer in the "exceptional" situation: we have
$$L^*_{\tilde\kappa,t}\nu\ne0 \tag{4.1}$$
for every probability measure $\nu$ on $J_F$. Unfortunately, the operators $L_{\tilde\kappa,t}$ are not bounded in any space $W^{1,p}(\Omega)$, and to apply the technique of Sects. 2 and 3 we have to use $L_{\kappa,t}$ with $\kappa<\tilde\kappa$. (The operators with $\kappa<\tilde\kappa$ do not satisfy (4.1) but they are bounded in appropriate Sobolev spaces.) Let $\lambda_\kappa(t)$ denote the spectral radius of $L_{\kappa,t}$ in $C(\bar\Omega)$. Define
$$s_\kappa(t) := \log_d\lambda_\kappa(t).$$
We will need the following properties of the functions $s_\kappa(t)$.

(i) If $t<0$ and $0\le\kappa\le\kappa'\le\tilde\kappa$, then $s_{\kappa'}(t)\le s_\kappa(t)$.

Proof. Denote $h(z) = |z-a|^{-t(\kappa'-\kappa)}$ and observe that
$$L^n_{\kappa',t}1 = \frac{1}{h}\,L^n_{\kappa,t}h.$$
Let $z_n$ be the points in $\partial\Omega$ such that $\|L^n_{\kappa',t}\|_\infty = L^n_{\kappa',t}1(z_n)$. The existence of such points follows from the subharmonicity of the function $z\mapsto L^n_{\kappa',t}1(z)$. Then we have
$$\|L^n_{\kappa',t}\|_\infty\asymp L^n_{\kappa,t}h(z_n)\le\|L^n_{\kappa,t}h\|_\infty\lesssim\|L^n_{\kappa,t}\|_\infty,$$
which implies the statement. □

(ii) If there is a probability measure $\nu$ satisfying $L^*_{\kappa,t}\nu = \lambda_\kappa(t)\,\nu$ and if $\nu\ne\delta_a$, then
$$s_{\kappa'}(t) = s_\kappa(t)\quad\text{for all }\kappa'>\kappa.$$

Proof. We have
$$\|L^n_{\kappa',t}\|_\infty\gtrsim\|h\,L^n_{\kappa',t}1\|_\infty = \|L^n_{\kappa,t}h\|_\infty\gtrsim\langle L^n_{\kappa,t}h,\nu\rangle = \lambda^n_\kappa(t)\,\langle h,\nu\rangle\asymp\lambda^n_\kappa(t),$$
which implies $s_{\kappa'}(t)\ge s_\kappa(t)$. □

(iii) For every $\kappa\in[0,\tilde\kappa]$, the function $s_\kappa(\cdot)$ is strictly decreasing.

Proof. It is clear that $\nu_t\ne\delta_a$ if $t$ is sufficiently close to 0. By the previous statement, we have $s_\kappa(t) = s(t)$ for such $t$'s, and therefore the function $s_\kappa(t)$ is strictly decreasing in a neighborhood of 0. It remains to note that $s_\kappa$ is convex (use Hölder's inequality and the definition of $s_\kappa$). □
4.2. Lemma. $\tilde s(t)>-t\,(1-\tilde\kappa)\log_d|F'(a)|$.

Proof. Denote $M := F'(a)$. The statement is obvious if $a$ is a neutral fixed point, so we assume that $a$ is repelling: $|M|>1$. For simplicity, we write $G$ and $L$ instead of $G_{\tilde\kappa,t}$ and $L_{\tilde\kappa,t}$ respectively. Observe that
$$G(a) = |M|^{-t(1-\tilde\kappa)}.$$
By (4.1), we can consider the operator $\nu\mapsto\|L^*\nu\|^{-1}L^*\nu$ on the set of probability measures on $J_F$. By Schauder's theorem, this operator has a fixed point $\nu$, and we have
$$L^*\nu = \lambda\,\nu \tag{4.2}$$
for some $\lambda>0$. It is clear that $\log_d\lambda\le\tilde s(t)$, and it remains to show that
$$G(a)<\lambda. \tag{4.3}$$
Since $a$ is a repelling point of $F$, there is a conformal map $\varphi$ from the unit disc onto some neighborhood of $a$ such that
$$\varphi(Mz) = F(\varphi(z)),\qquad(|z|<|M|^{-1}).$$
If $|z|<|M|^{-(1+n)}$, then we have
$$|F_n'(\varphi(z))| = |M|^n\,\frac{|\varphi'(M^nz)|}{|\varphi'(z)|}\asymp|M|^n,$$
and
$$G_n(\varphi(z)) = |F_n'|^{-t}\Big(\frac{|\varphi(M^nz)-a|}{|\varphi(z)-a|}\Big)^{\tilde\kappa t}\asymp|M|^{-tn}\,(|M|^n)^{\tilde\kappa t} = G(a)^n.$$
To prove (4.3), we consider the sequence of pairwise disjoint domains
$$U_n := \varphi\big(|M|^{-(2+n)}<|z|<|M|^{-(1+n)}\big),\qquad(n\ge0).$$
By construction, $F^n$ is injective on $U_n$, $F^n(U_n) = U_0$, and
$$G_n(z)\asymp G(a)^n\quad\text{for }z\in U_n.$$
Then by (4.2), we have
$$\nu(U_n) = \lambda^{-n}\int_{U_0}G_n(z)\,d\nu(z)\asymp\lambda^{-n}\,G(a)^n\,\nu(U_0).$$
It is easy to see that $\mathrm{supp}\,\nu = J_F$. (This follows from (4.1), see the proof of Lemma 3.5.) Hence $\nu(U_0)>0$, and since the domains $U_n$ are disjoint, we have
$$\sum_{n\ge0}\Big(\frac{G(a)}{\lambda}\Big)^n\lesssim\sum_{n\ge0}\nu(U_n)<1,$$
which implies (4.3). □
4.3. The operators $L_{\kappa,t}$ with $\kappa<\tilde\kappa$. The argument of Lemma 2.3 shows that if $t<0$ and $0\le\kappa<\tilde\kappa$, then $L_{\kappa,t}$ is bounded in $W^{1,p}(\Omega)$ with $p>2$ sufficiently close to 2. We can now apply the methods of Sects. 2 and 3 to establish the following result. The condition (4.4) below simply means that a measure $\nu$ satisfying $L^*_{\kappa,t}\nu = \lambda_\kappa(t)\,\nu$ cannot be equal to $\delta_a$, and therefore $\mathrm{supp}\,\nu = J_F$ by the proof of Lemma 3.5. Indeed, we have $L^*_{\kappa,t}\delta_a = G_{\kappa,t}(a)\,\delta_a$, and if we assume (4.4), then $G_{\kappa,t}(a) = |F'(a)|^{-t(1-\kappa)}<\lambda_\kappa(t)$.

Lemma. Let $0\le\kappa<\tilde\kappa$, and $t<0$. Suppose that
$$s_\kappa(t)>-t\,(1-\kappa)\log_d|F'(a)|. \tag{4.4}$$
Then the function $s_\kappa(\cdot)$ is real analytic at $t$, and there is a non-atomic equilibrium state $\mu_{\kappa,t}$ for the function $\log G_{\kappa,t}$.

Proof. There are only minor changes in the reasoning of the previous sections. We again write $G$ and $L$ for $G_{\kappa,t}$ and $L_{\kappa,t}$.

(i) We first establish a two-norm inequality similar to (2.4). Choose $p>2$ such that $L$ acts in $W^{1,p}(\Omega)$. We claim that for some $\varepsilon>0$,
$$\|L^nf\|_{1,p}\le d^{\,n(s_\kappa(t)-\varepsilon)+o(n)}\,\|f\|_{1,p}+C_n\|f\|_\infty,\qquad(f\in W^{1,p}(\Omega)). \tag{4.5}$$
To prove (4.5), we repeat the computation of Lemma 2.5 to obtain
$$\int|\nabla(L^nf)|^p\lesssim\|\hat L^n\|^{\frac{p}{p'}}\,\|\nabla f\|_{1,p}^p+C_n^p\,\|f\|_\infty^p, \tag{4.6}$$
where $\hat L$ denotes the transfer operator
$$\hat Lf(z) = \sum_{y\in F^{-1}z}\hat G(y)\,f(y)$$
with the weight function
$$\hat G := G^{p'}\,|F'|^{(\frac{2}{p}-1)p'}\equiv G_{\hat\kappa,\hat t},$$
$p'$ is the conjugate exponent, and
$$\hat t := p'\Big(t+1-\frac{2}{p}\Big),\qquad\hat\kappa := \frac{\kappa t}{t+1-\frac{2}{p}}.$$
Since $\hat\kappa > \kappa$ and $\hat t > p't$, the properties (i) and (iii) of Subsect. 4.1 imply that
\[ \frac{1}{p'}\,s_{\hat\kappa}(\hat t) < \frac{1}{p'}\,s_\kappa(p't) \le s_\kappa(t), \]
and therefore
\[ \|\hat L^n\|^{1/p'} = d^{\,n(s_\kappa(t)-\varepsilon)+o(n)}. \]
Together with (4.6), the latter implies (4.5).

(ii) The quasicompactness of $L$,
\[ \rho_{\mathrm{ess}}(L, W_{1,p}(\Omega)) < \rho(L, W_{1,p}(\Omega)) = \rho(L, C(\bar\Omega)) \equiv \lambda_\kappa(t), \]
is a consequence of the two-norm inequality (4.5). It also follows that $\lambda_\kappa(t)$ is an eigenvalue of $L : W_{1,p}(\Omega) \to W_{1,p}(\Omega)$ and that there is a probability measure $\nu_{\kappa,t}$ satisfying $L^*\nu_{\kappa,t} = \lambda_\kappa(t)\,\nu_{\kappa,t}$. The proofs are identical to those in Sects. 2 and 3. As we mentioned, from (4.4) we have
\[ \operatorname{supp}\nu_{\kappa,t} = J_F. \tag{4.7} \]
This in turn implies that $\lambda_\kappa(t)$ is a simple isolated eigenvalue of $L : W_{1,p}(\Omega) \to W_{1,p}(\Omega)$, and so the spectrum $s_\kappa(\cdot)$ is analytic at $t$. The proof is exactly the same as in Lemma 3.6 except that the fact
\[ \bigl(f \in W_{1,p}(\Omega),\ Lf = \lambda_\kappa(t) f,\ f|_{J_F} \equiv 0\bigr) \ \Rightarrow\ (f \equiv 0) \tag{4.8} \]
requires a slightly different argument. Fix $z \in \Omega \setminus J_F$. Then we have
\[ |f(z)| = |\lambda_\kappa(t)^{-n} L^n f(z)| \lesssim \lambda_\kappa(t)^{-n} \sum_{y \in F^{-n}(z)} |G_n(y)|\,\operatorname{dist}(y, J_F)^{\beta}, \]
for some positive number $\beta < -t$. Using the inequality (3.5), we have
\[
\begin{aligned}
|f(z)| &\lesssim \lambda_\kappa(t)^{-n} \sum_{y \in F^{-n}(z)} G_n(y)\,|F_n'(y)|^{-\beta} \\
&= \lambda_\kappa(t)^{-n}\,H(z)^{t\kappa} \sum_{y \in F^{-n}(z)} |F_n'(y)|^{-t-\beta}\,H(y)^{-t\kappa} \\
&= e^{o(n)}\,d^{-s_\kappa(t)n}\,d^{\,s_{\hat\kappa}(\hat t)n},
\end{aligned}
\]
with
\[ \hat t := t + \beta > t, \qquad \hat\kappa = \frac{\kappa t}{t+\beta} > \kappa. \]
By (i) and (iii) of Subsect. 4.1, we have $s_{\hat\kappa}(\hat t) < s_\kappa(t)$, which completes the proof of (4.8).

(iii) The construction of an equilibrium state $\mu$ and the proof that $\mu$ has no atoms are the same as in Subsect. 3.7. □
4.4. Corollary. $\tilde P(t) = \tilde s(t)\log d$.

Proof. Fix $t < 0$. By property (i) of Subsect. 4.1 and by Lemma 4.2, we have
\[ s_\kappa(t) \ge \tilde s(t) > -t(1-\tilde\kappa)\log_d|F'(a)|, \]
and therefore
\[ s_\kappa(t) > -t(1-\kappa)\log_d|F'(a)| \]
for some parameter $\kappa \in (0, \tilde\kappa)$ which we now consider fixed. As we mentioned, the last inequality implies that there exists an eigenmeasure $\nu_{\kappa,t}$ satisfying $\operatorname{supp}\nu_{\kappa,t} = J_F$. By property (ii), it follows that $\tilde s(t) = s_\kappa(t)$. Applying the variational principle (see Subsect. 1.2), we have
\[ s_\kappa(t)\log d = P(\log G_{\kappa,t}). \]
We also have the equality $\tilde P(t) = P(\log G_{\kappa,t})$, which follows from the existence of a non-atomic equilibrium state for the function $\log G_{\kappa,t}$ and from the fact that if $\mu$ is a probability measure on $J_F$ such that $\mu(a) = 0$, then
\[ \mu(\log G_{\kappa,t}) = -t\chi_\mu. \tag{4.9} \]
To prove (4.9), we observe that if
\[ \log\frac{H\circ F}{H} \notin L^1(\mu), \]
then both sides in (4.9) are $-\infty$; otherwise we have
\[ \mu\Bigl(\log\frac{H\circ F}{H}\Bigr) = 0. \]
Indeed, for $\varepsilon \in (0,1)$ denote $H_\varepsilon := H + \varepsilon$. Then
\[ \Bigl|\log\frac{H_\varepsilon\circ F}{H_\varepsilon}\Bigr| \le \Bigl|\log\frac{H\circ F}{H}\Bigr| + \mathrm{const} \]
on $J_F$, and
\[ \log\frac{H_\varepsilon\circ F}{H_\varepsilon} \longrightarrow \log\frac{H\circ F}{H} \quad \mu\text{-a.e. as } \varepsilon \to 0. \]
□
4.5. Proof of Theorems A and B. If $F$ is not exceptional, then $P_F(t)$ is real analytic on the negative axis, and therefore $P_F(t) > -\chi_{\max}\,t$ for all $t < 0$. The equality $P_F = \tilde P_F$ was explained in Subsect. 3.7. Suppose now that $F$ is an exceptional map. Clearly, we always have $P_F(t) \ge \max\{\tilde P_F(t), -\chi_{\max}\,t\}$. If $P_F(t) > -\chi_* t$ for some $t < 0$, then we have $P_F(t) = \tilde P_F(t)$ by the property (ii) and Lemma 3.5. This completes the proof of Theorem A. A phase transition occurs if and only if $\chi_* > \tilde P_F'(-\infty)$. On the other hand, it is clear that
\[ \tilde P_F'(-\infty) = \sup\{\chi_\mu : \mu \in \mathcal M,\ \mu(\Sigma_F) = 0\}, \]
and Theorem B follows.

4.6. Remark. One can extend all results of Sects. 3.7 and 3.8 to exceptional polynomials. In particular, the argument of Sect. 3.8 proves Theorem C: $\tilde P''(t) > 0$ for all $t < 0$ unless $F$ is critically finite with parabolic orbifold. In the next section we will also use the following formula involving $\tilde P'(t)$. For $t < 0$, let $\kappa$ be a number satisfying the conditions of Lemma 4.3, and let $\mu \equiv \mu_{\kappa,t}$ be the corresponding equilibrium state. Then applying (4.9), we have $\tilde P'(t) = -\chi_\mu$. Since $\tilde P(t) = h_\mu - t\chi_\mu$, we get
\[ \dim\mu = \frac{h_\mu}{\chi_\mu} = t - \frac{\tilde P(t)}{\tilde P'(t)}. \tag{4.10} \]
(The first equality in (4.10) follows from Mañé’s formula [23].)
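For the reader's convenience, the second equality in (4.10) is a one-line computation from the two relations quoted in Remark 4.6, namely $\tilde P(t) = h_\mu - t\chi_\mu$ and $\tilde P'(t) = -\chi_\mu$ (the worked form below is added for illustration and uses nothing beyond these two relations and Corollary 4.4):

\[
\frac{h_\mu}{\chi_\mu} = \frac{\tilde P(t) + t\chi_\mu}{\chi_\mu} = t + \frac{\tilde P(t)}{\chi_\mu} = t - \frac{\tilde P(t)}{\tilde P'(t)} .
\]

Since $\tilde P(t) = \tilde s(t)\log d$ by Corollary 4.4, the factor $\log d$ cancels in the quotient, so the same quantity can be written as $t - \tilde s(t)/\tilde s'(t)$, which is the form used in Sect. 5.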
5. Dimension Spectrum In this section we study the dimension properties of the maximal measure m and prove Theorem D.
5.1. Definitions and results. We define the box-counting dimension spectrum $f(\alpha)$ of $m$ as follows:
\[ f(\alpha) := \lim_{\eta\to 0}\,\limsup_{\delta\to 0}\,\frac{\log N(\delta;\alpha,\eta)}{|\log\delta|}, \]
where $N(\delta;\alpha,\eta)$ is the maximal number of disjoint discs $B$ of radius $\delta$ centered at $J_F$ and satisfying $\delta^{\alpha+\eta} \le mB \le \delta^{\alpha-\eta}$. The Hausdorff dimension spectrum $\tilde f(\alpha)$ is defined by the equation
\[ \tilde f(\alpha) := \dim\{z : \alpha(z) \text{ exists and } = \alpha\}, \]
where $\alpha(z)$ is the pointwise dimension of $m$ at $z$, and $\dim$ denotes Hausdorff dimension if the set is uncountable and $-\infty$ otherwise.

Recall the statement of Theorem D. Let $\alpha_0$ denote the Hausdorff dimension of the maximal measure. By (iii) of Subsect. 3.7, we have $\alpha_0 = |s'(0-)|^{-1}$.

Claim. (i) The function $s(t)$ on $\{t \le 0\}$, and the function $f(\alpha)$ on $\{\alpha \le \alpha_0\}$ form a Legendre pair:
\[ s(t) = \sup_{\alpha\le\alpha_0}\frac{f(\alpha) - t}{\alpha}, \quad (t \le 0), \qquad f(\alpha) = \inf_{t\le 0}\,[t + \alpha s(t)], \quad (\alpha \le \alpha_0). \]
(ii) The functions $\tilde s(t)$, $t \le 0$, and $\tilde f(\alpha)$, $\alpha \le \alpha_0$, form a Legendre pair.

Using Theorem D, we can restate our results on the pressure function in terms of the spectra $f(\alpha)$ and $\tilde f(\alpha)$. Let us assume that $F$ is not critically finite with parabolic orbifold. Denote
\[ \alpha_{\min} := \frac{1}{|s'(-\infty)|}. \]
If $s(t)$ has a phase transition point, then we also define the parameters
\[ \tilde\alpha_{\min} := \frac{1}{|\tilde s'(-\infty)|} \quad\text{and}\quad \alpha_c := \frac{1}{|s'(t_c+)|} = \frac{1}{|\tilde s'(t_c)|}. \]
We always have
\[ 0 < \alpha_{\min} < \alpha_0, \]
and in the phase transition case we have
\[ 0 < \alpha_{\min} < \tilde\alpha_{\min} < \alpha_c < \alpha_0. \]
Finally, note that $f(\alpha_0) = \tilde f(\alpha_0) = \alpha_0$ because $\alpha_0 = \dim m$.

Corollary 1. If $F$ is not critically finite with parabolic orbifold, then $\tilde f(\alpha)$ is a real analytic, strictly increasing and strictly convex ($\tilde f'' > 0$) function on the interval $(\tilde\alpha_{\min}, \alpha_0)$, and $\tilde f(\alpha) \equiv -\infty$ for $\alpha < \tilde\alpha_{\min}$.
Proof. Define
\[ \alpha(t) := |\tilde s'(t)|^{-1}. \]
Since $\tilde s'' > 0$, we have
\[ \alpha'(t) = \frac{\tilde s''(t)}{(\tilde s'(t))^2} > 0, \]
and so $\alpha(t)$ is strictly increasing on the interval $(-\infty, 0)$, and the inverse function $t(\alpha)$ is real analytic on $(\tilde\alpha_{\min}, \alpha_0)$. It follows that for $\alpha \in (\tilde\alpha_{\min}, \alpha_0)$, the function
\[ \tilde f(\alpha) = \inf_{t\le 0}\,[t + \alpha\tilde s(t)] = t(\alpha) + \alpha\,\tilde s(t(\alpha)) \]
has the stated properties. It is also clear that $\tilde f(\alpha) \equiv -\infty$ if $\alpha < \tilde\alpha_{\min}$. □

Corollary 2. If $F$ is not exceptional (more generally, if there is no phase transition), then $f \equiv \tilde f$. In the phase transition case, $f(\alpha)$ is $C^1$ but not $C^2$ on $(\alpha_{\min}, \alpha_0)$. More precisely,
\[ f(\alpha) = \begin{cases} \tilde f(\alpha), & \alpha_c \le \alpha \le \alpha_0, \\ \text{linear}, & \alpha_{\min} \le \alpha \le \alpha_c, \\ 0, & \alpha = \alpha_{\min}, \\ -\infty, & \alpha < \alpha_{\min}. \end{cases} \]

Proof. Reasoning as above, we have
\[ f(\alpha) = t + \alpha s(t), \qquad (\alpha_c \le \alpha \le \alpha_0), \]
where $\alpha$ and $t$ are related by the equation $\alpha s'(t) = -1$. We also have
\[ f(\alpha) = t_c\Bigl(1 - \frac{\alpha}{\alpha_{\min}}\Bigr) \quad\text{on } [\alpha_{\min}, \alpha_c]. \]
It follows that $f'$ is continuous at $\alpha_c$. Indeed,
\[ f'(\alpha_c+) = \frac{1}{\alpha_c}\,[f(\alpha_c) - t_c] = \frac{1}{\alpha_c}\Bigl[t_c\Bigl(1 - \frac{\alpha_c}{\alpha_{\min}}\Bigr) - t_c\Bigr] = -\frac{t_c}{\alpha_{\min}} = f'(\alpha_c-). \]
The rest of the proof is obvious. □

We will prove the theorem only for polynomials with connected Julia sets. The proof is considerably shorter in this special case because we can express the spectra $s(t)$ and $\tilde s(t)$ in terms of the Riemann map
\[ \varphi : \Delta := \{|z| > 1\} \to A(\infty), \qquad (\varphi(\infty) = \infty), \]
where $A(\infty)$ is the basin of attraction to infinity, and apply some general facts of the conformal mapping theory. (For arbitrary rational maps, one should replace certain parts of the argument with corresponding dynamical considerations.) Recall that for connected polynomial Julia sets, $m$ is the image of the normalized Lebesgue measure under the boundary correspondence. In what follows, we assume that the polynomial $F$ is exceptional (but not Chebyshev's) with $\Sigma_F = \{a\}$.
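The Legendre-pair relations of the Claim and the parametrization $\alpha(t) = |s'(t)|^{-1}$ used in the proofs of Corollaries 1 and 2 can be tried out numerically. The sketch below is only an illustration: the function `s` is a toy convex, decreasing function chosen for convenience (it is not the spectrum of any particular rational map), and the names `f_numeric`, `f_closed`, and `s_from_f` are introduced here and do not come from the paper.

```python
def s(t):
    # Toy spectrum (an assumption for illustration only):
    # strictly convex and strictly decreasing on t <= 0, with s(0) = 1.
    return 1.0 + t * t - t


def alpha_of_t(t):
    # alpha(t) = |s'(t)|^{-1}; for the toy s we have s'(t) = 2t - 1.
    return 1.0 / abs(2.0 * t - 1.0)


def f_numeric(alpha, t_min=-20.0, steps=40_000):
    # f(alpha) = inf over t <= 0 of [t + alpha * s(t)], by brute force on a grid.
    # Only reliable when the minimizer t(alpha) lies in [t_min, 0].
    return min(
        t_min * k / steps + alpha * s(t_min * k / steps)
        for k in range(steps + 1)
    )


def f_closed(alpha):
    # Closed form for the toy s, obtained by solving alpha * s'(t) = -1,
    # i.e. t(alpha) = (alpha - 1) / (2 * alpha).
    return alpha - (alpha - 1.0) ** 2 / (4.0 * alpha)


def s_from_f(t, steps=4_000):
    # Inverse relation: s(t) = sup over 0 < alpha <= alpha_0 of (f(alpha) - t) / alpha,
    # where alpha_0 = |s'(0)|^{-1} = 1 for the toy s.
    return max(
        (f_closed(k / steps) - t) * steps / k for k in range(1, steps + 1)
    )


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 1.0):
        print(alpha, f_numeric(alpha), f_closed(alpha))
    for t in (-2.0, -0.5, 0.0):
        print(t, s(t), s_from_f(t))
```

For this toy choice one has $\alpha_0 = 1$ and $f(\alpha_0) = \alpha_0$, mirroring the normalization $f(\alpha_0) = \alpha_0$ noted in Subsect. 5.1; the brute-force infimum and the closed form agree up to grid error.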
5.2. Lemma. For each $t < 0$, we have
\[ d^{\,ns(t)} \asymp d^{\,n(1-t)} \int_{|z| = 1 + d^{-n}} |\varphi'|^{t}, \tag{5.1} \]
\[ d^{\,n\tilde s(t)} \asymp d^{\,n(1-t)} \int_{|z| = 1 + d^{-n}} |\varphi - a|^{-\tilde\kappa t}\,|\varphi'|^{t}. \tag{5.2} \]

Proof. Fix some point in $A(\infty)$ and consider the preimages $\{y\}$ under $F^n$. The Riemann map $\varphi$ conjugates $F$ with the dynamics $T : z \mapsto z^d$ on $\Delta$. Differentiating the identity $F^n \circ \varphi = \varphi \circ T^n$, we get
\[ |F_n'(y)| \asymp d^{\,n}\,|\varphi'(\varphi^{-1}y)|^{-1}. \]
The points $\{\varphi^{-1}y\}$ are equidistributed on a circle of radius $r_n$ satisfying $r_n - 1 \asymp d^{-n}$. Applying the distortion theorem, we have
\[ \|L_t^n\|_\infty \asymp d^{\,n(1-t)} \int_{|z| = r_n} |\varphi'|^{t}, \]
and
\[ \|L_{\tilde\kappa,t}^n\|_\infty \asymp \sum_{y} |F_n'(y)|^{-t}\,|y - a|^{-\tilde\kappa t} \asymp d^{\,n(1-t)} \int_{|z| = r_n} |\varphi - a|^{-\tilde\kappa t}\,|\varphi'|^{t}. \]
□

5.3. Proof of (i). The key observation is that $s(t)$ coincides with the packing spectrum of the maximal measure $m$:
\[ \pi(t) = \limsup_{\varepsilon\to 0}\frac{\log L(\varepsilon; t)}{|\log\varepsilon|}, \]
where
\[ L(\varepsilon; t) := \sup_{\mathcal B}\sum_{B\in\mathcal B}\operatorname{diam}(B)^{t}, \]
the supremum being taken over all collections $\mathcal B$ of disjoint discs $B$ satisfying $mB = \varepsilon$. It is a general fact (see [20]) that the harmonic measure packing spectrum of an arbitrary simply connected domain is related to the integral means spectrum
\[ \beta(t) := \limsup_{r\to 1}\frac{\log\int_{|z|=r}|\varphi'(z)|^{t}\,|dz|}{|\log(r-1)|} \]
of the corresponding conformal map by the equation $\pi(t) = \beta(t) + 1 - t$.
Thus for polynomials with connected Julia set, the equality $s(t) = \pi(t)$ follows from (5.1). The packing spectrum and the box-counting dimension spectrum of an arbitrary measure satisfy the Legendre-type relation
\[ s(t) = \sup_{\alpha\le\dim m}\frac{f(\alpha) - t}{\alpha}, \qquad (t \le 0), \]
and so we obtain the first formula in (i). Applying the inverse Legendre transform, we get $\operatorname{co} f(\alpha) = \inf\,[t + \alpha s(t)]$, $t$ [...] $0$, let $U_\varepsilon$ denote the $\varepsilon$-neighborhood of the exceptional point $a$, $m_\varepsilon$ the restriction of the maximal measure $m$ to $J \setminus U_\varepsilon$, and let $\pi_\varepsilon(t)$ and $f_\varepsilon(\alpha)$ be the packing and the box dimension spectra of $m_\varepsilon$. As we mentioned, $\pi_\varepsilon(t)$ is the Legendre transform of $f_\varepsilon(\alpha)$. From (5.2) it is easy to see that $\pi_\varepsilon(t) \le \tilde s(t)$. Applying the inverse transform to this inequality, we have
\[ \operatorname{co} f_\varepsilon \le \inf_{t\le 0}\,[t + \alpha\tilde s(t)]. \]
On the other hand, it is clear that the Hausdorff spectrum $\tilde f(\alpha)$ satisfies the inequality
\[ \tilde f(\alpha) \le \sup_{\varepsilon>0} f_\varepsilon(\alpha), \]
and therefore we have
\[ \tilde f(\alpha) \le \inf_{t\le 0}\,[t + \alpha\tilde s(t)]. \]
To finish the proof, we need to verify the opposite inequality. Fix $\alpha \in (\tilde\alpha_{\min}, \alpha_0)$ and define $t = t(\alpha)$ by the equation $\alpha\tilde s'(t) + 1 = 0$. We will show that
\[ \dim\{z : \alpha(z) \ge \alpha\} \ge \alpha\tilde s(t) + t = t - \frac{\tilde s(t)}{\tilde s'(t)}. \]
Let $\kappa$ be a number satisfying the conditions of Lemma 4.3, and let $\mu \equiv \mu_{\kappa,t}$ be the corresponding equilibrium state. By (4.10), we have
\[ \dim\mu = t - \frac{\tilde s(t)}{\tilde s'(t)}. \]
On the other hand, a standard ergodic argument shows that for $\mu$-a.e. $z$, we have
\[ \alpha(z) \ge \frac{\log d}{\chi_\mu} = -\frac{1}{\tilde s'(t)} = \alpha. \]
This completes the proof of Theorem D.
References

1. Baladi, V.: Periodic orbits and dynamical spectra. Ergodic Theory Dynamical Systems 18, 255–292 (1998)
2. Bohr, T., Cvitanović, P., Jensen, M.H.: Fractal aggregates in the complex plane. Europhys. Lett. 6, 445–450 (1988)
3. Bowen, R.: Equilibrium states and the ergodic theory of Anosov diffeomorphisms. Lecture Notes in Math. 470, Berlin: Springer, 1975
4. Brolin, H.: Invariant sets under iteration of rational functions. Ark. Mat. 6, 103–144 (1965)
5. Bruin, H., Keller, G.: Equilibrium states for S-unimodal maps. Ergodic Theory Dynamical Systems 18, 765–789 (1998)
6. Carleson, L., Gamelin, T.W.: Complex dynamics. New York: Springer, 1993
7. Denker, M., Urbański, M.: Ergodic theory of equilibrium states for rational maps. Nonlinearity 4, 103–134 (1991)
8. Douady, A., Hubbard, J.H.: A proof of Thurston's topological characterization of rational functions. Acta Math. 171, 263–297 (1993)
9. Douady, A., Sentenac, P., Zinsmeister, M.: Implosion parabolique et dimension de Hausdorff. C. R. Acad. Sci. Paris Sér. I Math. 325, 765–772 (1997)
10. Feigenbaum, M.J., Procaccia, I., Tél, T.: Scaling properties of multifractals as an eigenvalue problem. Phys. Rev. A (3) 39, 5359–5372 (1989)
11. Freire, A., Lopes, A.O., Mañé, R.: An invariant measure for rational maps. Bol. Soc. Brasil. Mat. 14, 45–62 (1983)
12. Haydn, N.: Convergence of the transfer operator for rational maps. Preprint (1996)
13. Haydn, N., Ruelle, D.: Equivalence of Gibbs and equilibrium states for homeomorphisms satisfying expansiveness and specification. Commun. Math. Phys. 148, 155–167 (1992)
14. Hofbauer, F., Keller, G.: Equilibrium states for piecewise monotonic transformations. Ergodic Theory Dynamical Systems 2, 23–43 (1982)
15. Hofbauer, F., Keller, G.: Equilibrium states and Hausdorff measures for interval maps. Math. Nachr. 164, 239–257 (1993)
16. Ionescu-Tulcea, C.T., Marinescu, G.: Théorie ergodique pour des classes d'opérations non complètement continues. Ann. of Math. (2) 52, 140–147 (1950)
17. Liverani, C.: Decay of correlations. Ann. of Math. (2) 142, 239–301 (1995)
18. Lopes, A.O.: Dimension spectra and a mathematical model for phase transition. Adv. in Appl. Math. 11, 475–502 (1990)
19. Ljubich, M.Ju.: Entropy properties of rational endomorphisms of the Riemann sphere. Ergodic Theory Dynamical Systems 3, 351–385 (1983)
20. Makarov, N.: Fine structure of harmonic measure. Algebra i Anal. 10, 217–268 (1998)
21. Makarov, N., Smirnov, S.: Phase transition in subhyperbolic Julia sets. Ergodic Theory Dynamical Systems 16, 125–157 (1996)
22. Makarov, N., Smirnov, S.: Thermodynamics of rational maps, II. In preparation
23. Mañé, R.: The Hausdorff dimension of invariant probabilities of rational maps. In: Dynamical systems. Valparaiso, 1986, Lecture Notes in Math. 1331, Berlin–New York: Springer, 1988, pp. 86–117
24. Milnor, J.: Dynamics in One Complex Variable: Introductory lectures. SUNY Stony Brook Institute for Mathematical Sciences Preprints 5, 1990
25. Ott, E., Withers, W.D., Yorke, J.A.: Is the dimension of chaotic attractors invariant under coordinate changes? J. Stat. Phys. 36, 687–697 (1984)
26. Patterson, S.J.: The limit set of a Fuchsian group. Acta Math. 136, 241–273 (1976)
27. Pesin, Y.: Dimension theory in dynamical systems. Chicago: Chicago University Press, 1997
28. Przytycki, F.: On the Perron-Frobenius-Ruelle operator for rational maps on the Riemann sphere and for Hölder continuous functions. Bol. Soc. Brasil. Mat. (N.S.) 20, 95–125 (1990)
29. Ruelle, D.: Thermodynamic formalism. The mathematical structures of classical equilibrium statistical mechanics. Encyclopedia of Mathematics and its Applications 5. Reading, MA: Addison-Wesley, 1978
30. Ruelle, D.: Repellers for real analytic maps. Ergodic Theory Dynamical Systems 2, 99–107 (1982)
31. Ruelle, D.: The thermodynamic formalism for expanding maps. Commun. Math. Phys. 125, 239–262 (1989)
32. Ruelle, D.: Spectral properties of a class of operators associated with conformal maps in two dimensions. Commun. Math. Phys. 144, 537–556 (1992)
33. Ruelle, D.: Dynamical zeta functions for piecewise monotone maps of the interval. CRM Monograph Series 4. Providence, RI: American Mathematical Society, 1994
34. Sinai, Ya.G.: Gibbs measures in ergodic theory. Russ. Math. Surveys 27, 21–69 (1972)
35. Sullivan, D.: Conformal dynamical systems. In: Geometric dynamics. Rio de Janeiro, 1981, Lecture Notes in Math. 1007, Berlin–New York: Springer, 1983, pp. 725–752
36. Walters, P.: An introduction to Ergodic Theory. New York: Springer, 1982
37. Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. of Math. 147, 585–650 (1998)
38. Zdunik, A.: Parabolic orbifolds and the dimension of the maximal measure for rational maps. Invent. Math. 99, 627–649 (1990)
39. Ziemer, W.: Weakly differentiable functions. Sobolev spaces and functions of bounded variation. New York: Springer, 1989
40. Zinsmeister, M.: Formalisme thermodynamique et systèmes dynamiques holomorphes. Paris: Panoramas et Synthèses. Société Mathématique de France, 1996

Communicated by Ya. G. Sinai
Commun. Math. Phys. 211, 745 – 777 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Vertex Operators Arising from the Homogeneous Realization for $\widehat{gl}_N$

Yun Gao*

Department of Mathematics and Statistics, York University, Toronto, Canada M3J 1P3. E-mail: [email protected]

Received: 16 August 1999 / Accepted: 18 January 2000

Dedicated to my teacher Steve Berman

* This work is supported by a fellowship from the Natural Sciences and Engineering Research Council of Canada. The author is grateful to Yale University, particularly to Professors I. Frenkel, H. Garland, G. Seligman and E. Zelmanov, for their hospitality during his stay.

Abstract: We use the underlying Fock space for the homogeneous vertex operator representation of the affine Lie algebra $\widehat{gl}_N$ to construct a family of vertex operators. As an application, an irreducible module for an extended affine Lie algebra of type $A_{N-1}$ coordinatized by a quantum torus $\mathbb{C}_q$ of 2 variables (or 3 variables) is obtained. Moreover, this module turns out to be a highest weight module which is an analog of the basic module for affine Lie algebras.

0. Introduction

In the representation theory of affine Kac–Moody Lie algebras, one distinguished module with no finite dimensional counterpart is the basic module. Remarkably, the basic module can be identified with a Fock space while the affine Lie algebra is realized as certain vertex operators on the Fock space. The first construction for the basic module, called principal, was developed by Lepowsky–Wilson [LW] for $A_1^{(1)}$ and generalized to all affine Lie algebras of type ADE by Kac–Kazhdan–Lepowsky–Wilson [KKLW]. Another construction, called homogeneous, was developed by Frenkel–Kac [FK], and independently by Segal [S], for affine Lie algebras of type ADE. The difference between those two approaches is the choice of the Heisenberg subalgebra (and hence the choice of the Fock space). Vertex operator representations for affine Lie algebras have a number of beautiful applications to the theories of modular forms, solitons, combinatorial identities and others, see [K] and [FLM] and references therein.

Since then, the homogeneous vertex operator representations for quantized affine algebras $U_q(g)$ have been constructed by Frenkel–Jing [FJ] for untwisted cases and by Jing [J] for twisted cases. Unlike the vertex operators for affine Lie algebras, the new feature here is that the commutator relation involves only the δ-function with none of its higher order derivatives. This is because of the assumption that $q$ is generic.

In this paper, we will use the underlying Fock space for the homogeneous vertex operator representation of the affine Lie algebra $\widehat{gl}_N$ to construct a family of vertex operators. More precisely, given a pair $(\Lambda, q)$, where $q$ is a non-zero complex number and $\Lambda$ is a subset of complex numbers containing zero and closed under addition, we define vertex operators $X_{ij}(r, z)$ depending on the parameter $q$, for $1 \le i, j \le N$ and $r \in \Lambda$. These vertex operators together with the Heisenberg algebra form a Lie algebra $V(\Lambda, q)$ which is proven to represent an affinization of the matrix algebra with entries in a skew-polynomial ring. We would like to point out that the valuation $c^{\alpha}$ of the operator $z^{\alpha}$ plays a role in our construction, where $c$ is a non-zero complex number. The case $\Lambda = \{0\}$ is trivial as $V(\Lambda, q)$ represents the affine Lie algebra $\widehat{gl}_N$. However, if $(\Lambda, q)$ is generic, by enlarging the Fock space, the case $\Lambda = \mathbb{Z}$ (or $\mathbb{Z} + \mathbb{Z}\xi$ with $\operatorname{Im}\xi \neq 0$) will provide an irreducible vertex operator representation for an extended affine Lie algebra of type $A_{N-1}$ coordinatized by a quantum torus $\mathbb{C}_q$ of 2 variables (or 3 variables). The quantum torus is an algebraic version of the non-commutative torus defined as in [M]. The representation for such a Lie algebra coordinatized by a quantum torus has been given in [BS] and [G2] via the principal construction.

Extended affine Lie algebras were first introduced in [H-KT] and systematically studied in [AABGP] and [BGK]. They are a natural generalization of finite dimensional simple Lie algebras and affine Kac–Moody algebras (see [ABGP]). Other examples include toroidal Lie algebras (with certain derivations added) which have been studied in [F2,MRY] and [EF], among others.
We will introduce highest weight modules and their irreducible quotients for those extended affine Lie algebras under consideration. The structure of the irreducible highest weight modules is not known in general and will be further investigated elsewhere. Nevertheless, we shall single out one irreducible highest weight module $L(\omega_0)$ which is an analog of the basic module for affine Kac–Moody algebras. This singled out module will be shown to be nothing but the above enlarged Fock space. Consequently, a character formula for $L(\omega_0)$ is obtained.

Since our matrix Lie algebras have non-commutative coordinate algebras, we shall use a lattice to construct the associative $N \times N$ matrix algebra $M_N(\mathbb{C})$ rather than to construct the Lie algebra $sl_N$ as was done in [FK,S] and [FLM]. As is known, an important step in the vertex operator theory, which will occupy most of this paper, is to compute the commutator relations. If $q^{r_1+r_2} \neq 1$, the same phenomenon as appeared in the quantized affine algebra $U_q(g)$ ([J] and [FJ]) is the fact that the commutator $[X_{ij}(r_1, z_1), X_{kl}(r_2, z_2)]$ involves only the δ-function with none of its higher order derivatives. The key point here is to use a $q$-deformed partition identity for the first derivative $D\delta(z)$ (see (3.34) and (3.35) below). In the case $q^{r_1+r_2} = 1$, the commutator relation is similar to (but more subtle than) the one in the ordinary affine Lie algebras.

The paper is organized as follows. In Sect. 1, we mimic the lattice construction for the Lie algebra $sl_N$ to construct the $N \times N$ matrix algebra $M_N$. In Sect. 2, basic definitions and notations in the homogeneous construction are recalled. We then define a family of vertex operators for any pair $(\Lambda, q)$ and compute the commutator relations in Sect. 3. In Sect. 4, we shall realize the Lie algebra obtained in Sect. 3 and lift it to a Lie algebra on the enlarged Fock space. Finally, in Sect. 5, we review some basics on extended affine Lie algebras and introduce highest weight modules. We then go on to derive some applications to representations of extended affine Lie algebras.
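Throughout this paper, we denote the field of complex numbers, real numbers and the ring of integers by $\mathbb{C}$, $\mathbb{R}$ and $\mathbb{Z}$ respectively.

1. A Lattice Construction for $M_N(\mathbb{C})$

One crucial feature in the study of homogeneous vertex operator representations for untwisted affine Lie algebras of the simply-laced type ([FK,S] and [FLM]) is the lattice construction for simple finite dimensional Lie algebras. We shall use a rank $N$ lattice to construct a simple associative algebra of dimension $N^2$. Of course, it must be isomorphic to the $N \times N$ matrix algebra $M_N(\mathbb{C})$. Let $N$ be a positive integer and
\[ P = \mathbb{Z}\epsilon_1 \oplus \cdots \oplus \mathbb{Z}\epsilon_N \tag{1.1} \]
be a rank $N$ free abelian group provided with a $\mathbb{Z}$-bilinear form $(\cdot,\cdot)$ which is given by $(\epsilon_i, \epsilon_j) = \delta_{ij}$, for $1 \le i, j \le N$. Let
\[ Q = \mathbb{Z}(\epsilon_1 - \epsilon_2) \oplus \cdots \oplus \mathbb{Z}(\epsilon_{N-1} - \epsilon_N) \tag{1.2} \]
be the rank $(N-1)$ free subgroup of $P$. We know that $Q$ is an even lattice. That is,
\[ (\alpha, \alpha) \in 2\mathbb{Z}, \quad\text{for all } \alpha \in Q. \tag{1.3} \]
Set $\Delta = \{\alpha \in Q : (\alpha, \alpha) = 2\}$. Then
\[ \Delta = \{\epsilon_i - \epsilon_j : 1 \le i \neq j \le N\}. \tag{1.4} \]
Let $\varepsilon : Q \times Q \to \{\pm 1\}$ be a function satisfying the bimultiplicativity condition
\[ \varepsilon(\alpha + \beta, \gamma) = \varepsilon(\alpha, \gamma)\varepsilon(\beta, \gamma) \quad\text{and}\quad \varepsilon(\alpha, \beta + \gamma) = \varepsilon(\alpha, \beta)\varepsilon(\alpha, \gamma) \tag{1.5} \]
for $\alpha, \beta, \gamma \in Q$, and the condition
\[ \varepsilon(\alpha, \alpha) = (-1)^{\frac{1}{2}(\alpha,\alpha)}, \quad\text{for all } \alpha \in Q. \tag{1.6} \]
$\varepsilon$ is sometimes called an asymmetry function on $Q$ (see [K]). The formula (1.6) immediately implies
\[ \varepsilon(\alpha, \beta)\varepsilon(\beta, \alpha) = (-1)^{(\alpha,\beta)}, \tag{1.7} \]
for $\alpha, \beta \in Q$. Set $H = P \otimes_{\mathbb{Z}} \mathbb{C}$. Then $H$ is an $N$-dimensional complex space. Extend the form on $P$ to a $\mathbb{C}$-bilinear symmetric non-degenerate form $(\cdot,\cdot)$ on $H$. Let $M$ be the finite dimensional complex space
\[ M = H \oplus \sum_{\alpha\in\Delta} \oplus\,\mathbb{C}x_\alpha, \]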
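where the $x_\alpha$'s are symbols indexed by $\alpha \in \Delta$. Then $\dim_{\mathbb{C}} M = \dim_{\mathbb{C}} H + |\Delta| = N^2$. Moreover, $M$ is an algebra over $\mathbb{C}$ under the following multiplication:
\[
\begin{aligned}
\epsilon_i \epsilon_j &= \delta_{ij}\,\epsilon_j, \qquad \epsilon_k\, x_{\epsilon_m - \epsilon_n} = \delta_{km}\, x_{\epsilon_m - \epsilon_n}, \qquad x_{\epsilon_m - \epsilon_n}\,\epsilon_k = \delta_{nk}\, x_{\epsilon_m - \epsilon_n}, \\
x_{\epsilon_m - \epsilon_n}\, x_{\epsilon_k - \epsilon_l} &= \begin{cases} \delta_{nk}\,\varepsilon(\epsilon_m - \epsilon_n, \epsilon_n - \epsilon_l)\, x_{\epsilon_m - \epsilon_l} & \text{if } m \neq l, \\ \delta_{nk}\,\varepsilon(\epsilon_m - \epsilon_n, \epsilon_n - \epsilon_m)\,\epsilon_m & \text{if } m = l, \end{cases}
\end{aligned} \tag{1.8}
\]
where $1 \le i, j \le N$, $1 \le m \neq n \le N$ and $1 \le k \neq l \le N$. Also, define a $\mathbb{C}$-linear function $T : M \to \mathbb{C}$ as follows:
\[ T(x_\alpha) = 0, \qquad T(\epsilon_i) = 1, \tag{1.9} \]
for $\alpha \in \Delta$ and $1 \le i \le N$. For convenience, we shall always write
\[ e_{ii} = \epsilon_i, \ \text{for } 1 \le i \le N, \quad\text{and}\quad e_{ij} = x_{\epsilon_i - \epsilon_j}, \ \text{for } 1 \le i \neq j \le N. \tag{1.10} \]
Since $\varepsilon(0, \alpha) = \varepsilon(\alpha, 0) = 1$ for $\alpha \in Q$, (1.8) can be rewritten uniformly as
\[ e_{ij}\,e_{kl} = \delta_{jk}\,\varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_l)\,e_{il} \tag{1.11} \]
for $1 \le i, j, k, l \le N$. It is easy to see that
\[ T(e_{ij}e_{kl}) = T(e_{kl}e_{ij}) = \delta_{jk}\delta_{il}\,\varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_i) \tag{1.12} \]
for $1 \le i, j, k, l \le N$. Note that
\[ e_{ij}e_{ji} = -e_{ii}, \quad\text{and}\quad T(e_{ij}e_{ji}) = \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_i) = -1, \tag{1.13} \]
for $1 \le i \neq j \le N$.

Proposition 1.14. $M$ is a simple associative $\mathbb{C}$-algebra of dimension $N^2$ with the identity $1 = \epsilon_1 + \cdots + \epsilon_N$. The form $(\cdot,\cdot)$ on $M$ defined by $(x, y) = T(xy)$ for $x, y \in M$ is non-degenerate and invariant.

Proof. We first show that $M$ is associative. Since
\[
\begin{aligned}
&\varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_l)\,\varepsilon(\epsilon_i - \epsilon_l, \epsilon_l - \epsilon_n) \\
&\quad= \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_n + \epsilon_n - \epsilon_l)\,\varepsilon(\epsilon_i - \epsilon_l, \epsilon_l - \epsilon_n) \\
&\quad= \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_n)\,\varepsilon(\epsilon_i - \epsilon_j, \epsilon_n - \epsilon_l)\,\varepsilon(\epsilon_i - \epsilon_l, \epsilon_l - \epsilon_n) \\
&\quad= \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_n)\,\varepsilon(-\epsilon_i + \epsilon_j, \epsilon_l - \epsilon_n)\,\varepsilon(\epsilon_i - \epsilon_l, \epsilon_l - \epsilon_n) \\
&\quad= \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_n)\,\varepsilon(\epsilon_j - \epsilon_l, \epsilon_l - \epsilon_n),
\end{aligned}
\]
we have
\[
\begin{aligned}
(e_{ij}e_{kl})e_{mn} &= \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_l)\,\varepsilon(\epsilon_i - \epsilon_l, \epsilon_l - \epsilon_n)\,\delta_{jk}\delta_{lm}\,e_{in} \\
&= \varepsilon(\epsilon_i - \epsilon_j, \epsilon_j - \epsilon_n)\,\varepsilon(\epsilon_j - \epsilon_l, \epsilon_l - \epsilon_n)\,\delta_{jk}\delta_{lm}\,e_{in} \\
&= e_{ij}(e_{kl}e_{mn})
\end{aligned} \tag{1.15}
\]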
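for $1 \le i, j, k, l, m, n \le N$. Hence $M$ is associative. Next let $I$ be an ideal of $M$, and let $x = \sum a_{ij}e_{ij} \in I$ with $a_{ij} \in \mathbb{C}$. Then
\[ e_{kl}\,x\,e_{mn} = \varepsilon(\epsilon_k - \epsilon_l, \epsilon_l - \epsilon_m)\,\varepsilon(\epsilon_k - \epsilon_m, \epsilon_m - \epsilon_n)\,a_{lm}\,e_{kn} \in I. \]
If $a_{lm} \neq 0$ for some $l$ and $m$, then $e_{kn} \in I$ for all $k, n = 1, \dots, N$. This shows that $I = (0)$ or $M$. Thus $M$ is simple. The rest of the proof is obvious. □

Remark 1.16. Define an asymmetry function $\varepsilon$ on $Q$ as follows:
\[ \varepsilon(\epsilon_i - \epsilon_{i+1}, \epsilon_j - \epsilon_{j+1}) = \begin{cases} -1 & \text{if } j = i,\ i - 1, \\ 1 & \text{otherwise}, \end{cases} \]
for $1 \le i, j \le N - 1$. Then $\phi : M \to M_N(\mathbb{C})$ defined by
\[ \phi(e_{ij}) = \begin{cases} E_{ij} & \text{for } 1 \le i \le j \le N, \\ -E_{ij} & \text{for } 1 \le j < i \le N, \end{cases} \]
is an isomorphism and $(x, y) = \operatorname{tr}(\phi(x)\phi(y))$ for $x, y \in M$, where $E_{ij}$ is the standard $N \times N$ matrix which is 1 in the $(i, j)$-entry and 0 everywhere else.

2. Basics on Vertex Operators

From now on we assume that $N \ge 2$. Let $L = M^{-}$ be the Lie algebra under the commutator product $[x, y] = xy - yx$. Note that $L$ is the general linear Lie algebra $gl_N$ over $\mathbb{C}$. Define $\epsilon_i^* \in H^*$ (the dual space of $H$) by $\epsilon_i^*(\epsilon_j) = \delta_{ij}$ for $1 \le i, j \le N$. One can also define a nondegenerate symmetric bilinear form on $H^*$ by taking $(\epsilon_i^*, \epsilon_j^*) = \delta_{ij}$, $1 \le i, j \le N$. We sometimes identify $H$ with $H^*$ by identifying $e_{ii} = \epsilon_i$ with $\epsilon_i^*$ for $1 \le i \le N$. The subalgebra $H$ of $L$ is a Cartan subalgebra of $L$. We have the root space decomposition with respect to $H$:
\[ L = L_0 \oplus \sum_{1 \le i \neq j \le N} \oplus\, L_{\epsilon_i - \epsilon_j}, \tag{2.1} \]
where $L_{\epsilon_i - \epsilon_j} = \mathbb{C}e_{ij}$, $1 \le i \neq j \le N$, and $L_0 = H = \sum_{i=1}^{N} \oplus\,\mathbb{C}\epsilon_i$.

Now we consider the Lie algebra $gl_N(\mathbb{C}[t_0, t_0^{-1}]) = L \otimes \mathbb{C}[t_0, t_0^{-1}]$, where $\mathbb{C}[t_0, t_0^{-1}]$ is the algebra of Laurent polynomials over $\mathbb{C}$. Define a Lie algebra $\hat L$ associated with $L$ as an infinite dimensional complex space
\[ \hat L = L \otimes \mathbb{C}[t_0, t_0^{-1}] \oplus \mathbb{C}c_0 \tag{2.2} \]
with the Lie bracket
\[ [x_1 \otimes t_0^{n_1}, x_2 \otimes t_0^{n_2}] = [x_1, x_2] \otimes t_0^{n_1+n_2} + n_1\delta_{n_1+n_2,0}\,(x_1, x_2)\,c_0, \tag{2.3} \]
where $x_1, x_2 \in L$, $n_1, n_2 \in \mathbb{Z}$, and $c_0$ is a central element of $\hat L$. It has a subalgebra $\hat H$:
\[ \hat H = H \otimes \mathbb{C}[t_0, t_0^{-1}] \oplus \mathbb{C}c_0. \tag{2.4} \]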
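As a sanity check on the bracket (2.3), the Jacobi identity can be tested numerically. The sketch below is an illustration under stated assumptions: it identifies $L$ with $gl_N$ and takes the invariant form to be the trace form $(x, y) = \operatorname{tr}(xy)$, as Remark 1.16 permits; an element of $\hat L$ is stored as a pair consisting of a dictionary (degree in $t_0$ mapped to a matrix) and the coefficient of $c_0$. All function names are hypothetical and chosen for this example only.

```python
import random

N = 3  # matrix size used for the check; any N >= 2 works the same way


def zero():
    return [[0] * N for _ in range(N)]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def lin(a, b, sign):
    # Entrywise a + sign * b.
    return [[a[i][j] + sign * b[i][j] for j in range(N)] for i in range(N)]


def trace(a):
    return sum(a[i][i] for i in range(N))


def bracket(x, y):
    # x and y are pairs (terms, central): terms maps a degree n to the matrix
    # coefficient of t0^n, central is the coefficient of c0 (which is central,
    # so it never contributes to a bracket).
    terms, central = {}, 0
    for m, a in x[0].items():
        for n, b in y[0].items():
            ab, ba = mul(a, b), mul(b, a)
            terms[m + n] = lin(terms.get(m + n, zero()), lin(ab, ba, -1), 1)
            if m + n == 0:
                central += m * trace(ab)  # the term n1 * delta * (x1, x2) * c0
    return terms, central


def add(*elems):
    terms, central = {}, 0
    for t, c in elems:
        central += c
        for n, a in t.items():
            terms[n] = lin(terms.get(n, zero()), a, 1)
    return terms, central


def random_element(rng):
    degrees = rng.sample(range(-2, 3), 3)
    terms = {n: [[rng.randint(-3, 3) for _ in range(N)] for _ in range(N)]
             for n in degrees}
    return terms, rng.randint(-3, 3)


def jacobi_holds(seed):
    # Checks [x,[y,z]] + [y,[z,x]] + [z,[x,y]] = 0 exactly (integer arithmetic).
    rng = random.Random(seed)
    x, y, z = (random_element(rng) for _ in range(3))
    terms, central = add(bracket(x, bracket(y, z)),
                         bracket(y, bracket(z, x)),
                         bracket(z, bracket(x, y)))
    return central == 0 and all(
        v == 0 for a in terms.values() for row in a for v in row)


if __name__ == "__main__":
    print(all(jacobi_holds(seed) for seed in range(20)))
```

The central part of the identity vanishes because the form is symmetric and invariant; with integer entries the check is exact rather than approximate.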
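We denote
\[ \tilde L = \hat L \oplus \mathbb{C}d_0, \tag{2.5} \]
a semi-direct product of $\hat L$ with the degree operator $d_0 = t_0\frac{d}{dt_0}$, and
\[ \tilde H = \hat H \oplus \mathbb{C}d_0 \tag{2.6} \]
is a subalgebra of $\tilde L$. $\hat L$ (or $\tilde L$) is called the affinization of $L$. Define
\[ \hat H_{\pm} = H \otimes t_0^{\pm 1}\mathbb{C}[t_0^{\pm 1}] = \sum_{n \in \pm\mathbb{Z}_+} \oplus\,(H \otimes t_0^{n}), \tag{2.7} \]
where $\mathbb{Z}_+ = \{n \in \mathbb{Z} : n > 0\}$. Then
\[ s = \hat H_+ \oplus \mathbb{C}c_0 \oplus \hat H_- \tag{2.8} \]
is a Heisenberg algebra. Write $h \otimes t_0^{n} = h(n)$ for $h \in H$ and $n \in \mathbb{Z}$. Let
\[ S(\hat H_-) = \mathbb{C}[\epsilon_i(n) : 1 \le i \le N,\ n \in -\mathbb{Z}_+] \tag{2.9} \]
denote the symmetric algebra of $\hat H_-$, which is the algebra of polynomials in infinitely many variables $\epsilon_i(n)$, $1 \le i \le N$, $n \in -\mathbb{Z}_+$. $S(\hat H_-)$ is an $\tilde H$-module in which $c_0$ acts as 1, $d_0$ acts as the degree operator (i.e. $d_0\,\epsilon_i(n) = n\,\epsilon_i(n)$), $H$ acts trivially, $\epsilon_i(n)$ acts as the multiplication operator for $n \in -\mathbb{Z}_+$, and $\epsilon_i(n)$ acts as the partial differential operator $n\,\frac{\partial}{\partial\epsilon_i(-n)}$ for $n \in \mathbb{Z}_+$. Then
\[ [\alpha(n_1), \beta(n_2)] = (\alpha, \beta)\,n_1\delta_{n_1+n_2,0}, \qquad [d_0, \alpha(n)] = n\,\alpha(n) \tag{2.10} \]
for $\alpha, \beta \in H$ and $n_1, n_2, n \in \mathbb{Z}$. Following [FLM], we define
\[ \alpha(z) = \sum_{n\in\mathbb{Z}} \alpha(n)z^{-n} \in (\operatorname{End} S(\hat H_-))[[z, z^{-1}]] \tag{2.11} \]
and
\[ E^{\pm}(\alpha, z) = \exp\Bigl(\sum_{n \in \pm\mathbb{Z}_+} \frac{\alpha(n)}{n}\,z^{-n}\Bigr) \in (\operatorname{End} S(\hat H_-))[[z^{\mp 1}]] \tag{2.12} \]
for $\alpha \in H$. The following elementary properties of $E^{\pm}(\alpha, z)$ can be easily established (for example, see (4.1.1) in [FLM]).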
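Lemma 2.13. For $\alpha \in H$, the expressions $E^{\pm}(\alpha, z)$ exist and
\[ E^{\pm}(0, cz) = 1, \tag{2.14} \]
\[ E^{\pm}(\alpha, c_1 z)\,E^{\pm}(\beta, c_2 z) = \exp\Bigl(\sum_{n \in \pm\mathbb{Z}_+} \frac{c_1^{-n}\alpha(n) + c_2^{-n}\beta(n)}{n}\,z^{-n}\Bigr), \tag{2.15} \]
\[ [d_0, E^{\pm}(\alpha, cz)] = -DE^{\pm}(\alpha, cz) = \Bigl(\sum_{n \in \pm\mathbb{Z}_+} \alpha(n)c^{-n}z^{-n}\Bigr)E^{\pm}(\alpha, cz), \tag{2.16} \]
\[
\begin{aligned}
{}[\beta(n), E^{+}(\alpha, cz)] &= 0 && \text{if } n \in \mathbb{Z}_+ \text{ or } n = 0, \\
[\beta(n), E^{-}(\alpha, cz)] &= -(\beta, \alpha)\,c^{n}z^{n}\,E^{-}(\alpha, cz) && \text{if } n \in \mathbb{Z}_+, \\
[\beta(n), E^{+}(\alpha, cz)] &= -(\beta, \alpha)\,c^{n}z^{n}\,E^{+}(\alpha, cz) && \text{if } n \in -\mathbb{Z}_+, \\
[\beta(n), E^{-}(\alpha, cz)] &= 0 && \text{if } n \in -\mathbb{Z}_+ \text{ or } n = 0,
\end{aligned} \tag{2.17}
\]
where $\alpha, \beta \in H$, $c, c_1, c_2 \in \mathbb{C}^{\times} = \mathbb{C} \setminus \{0\}$, and $D = z\frac{d}{dz}$.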
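As an illustration of how the identities in (2.17) arise, here is a sketch of the case $n \in \mathbb{Z}_+$ for $E^{-}$, using only (2.10) and (2.12). The symbol $A$ is introduced here as shorthand for the exponent and is not notation from the text.

\[
\begin{aligned}
A &:= \sum_{m \in -\mathbb{Z}_+} \frac{\alpha(m)}{m}\,c^{-m}z^{-m}, \qquad E^{-}(\alpha, cz) = \exp(A), \\
[\beta(n), A] &= \sum_{m \in -\mathbb{Z}_+} \frac{(\beta, \alpha)\,n\,\delta_{n+m,0}}{m}\,c^{-m}z^{-m} = \frac{(\beta, \alpha)\,n}{-n}\,c^{n}z^{n} = -(\beta, \alpha)\,c^{n}z^{n}.
\end{aligned}
\]

Since $[\beta(n), A]$ is a scalar, it commutes with $A$, and therefore $[\beta(n), \exp(A)] = [\beta(n), A]\exp(A) = -(\beta, \alpha)\,c^{n}z^{n}\,E^{-}(\alpha, cz)$. The vanishing cases follow in the same way, because the relevant $\delta_{n+m,0}$ is never satisfied.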
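Next let
\[ \mathbb{C}[Q] = \sum_{\alpha \in Q} \oplus\,\mathbb{C}e^{\alpha} \tag{2.18} \]
be the group algebra of $Q$. For $\beta \in Q$, define $e_\beta \in \operatorname{End}\mathbb{C}[Q]$ by
\[ e_\beta\,e^{\alpha} = \varepsilon(\beta, \alpha)\,e^{\alpha+\beta}, \quad\text{for } \alpha \in Q. \tag{2.19} \]
Then it follows from the bimultiplicativity of $\varepsilon$ that
\[ e_\alpha e_\beta = \varepsilon(\alpha, \beta)\,e_{\alpha+\beta} \tag{2.20} \]
for $\alpha, \beta \in Q$. Also, for $\beta \in H$, define $\beta(0) \in \operatorname{End}\mathbb{C}[Q]$ by
\[ \beta(0)\,e^{\alpha} = (\beta, \alpha)\,e^{\alpha}, \quad\text{for } \alpha \in Q. \tag{2.21} \]
Then for $\beta \in H$ and $\alpha \in Q$,
\[ [\beta(0), e_\alpha] = (\beta, \alpha)\,e_\alpha. \tag{2.22} \]
We shall impose a vector space $\mathbb{Z}$-grading on $\mathbb{C}[Q]$ by setting
\[ \deg e^{\alpha} = -\tfrac{1}{2}(\alpha, \alpha), \quad\text{for } \alpha \in Q. \]
Let $d_0 \in \operatorname{End}\mathbb{C}[Q]$ be the corresponding degree operator. That is, $d_0 e^{\alpha} = -\frac{1}{2}(\alpha, \alpha)e^{\alpha}$, for $\alpha \in Q$. So one has
\[ [d_0, e_\alpha] = e_\alpha\bigl(-\alpha(0) - \tfrac{1}{2}(\alpha, \alpha)\bigr) \tag{2.23} \]
for $\alpha \in Q$. Set
\[ V_Q = S(\hat H_-) \otimes \mathbb{C}[Q], \tag{2.24} \]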
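the Fock space for $\tilde H$. We embed $\operatorname{End} S(\hat H_-)$ and $\operatorname{End}\mathbb{C}[Q]$ (respectively, $(\operatorname{End} S(\hat H_-))\{z\}$ and $(\operatorname{End}\mathbb{C}[Q])\{z\}$) canonically into $\operatorname{End} V_Q$ (respectively, $(\operatorname{End} V_Q)\{z\}$). Recall that $z^{\alpha} \in (\operatorname{End}\mathbb{C}[Q])[z, z^{-1}]$ is defined as (thought of as $z^{\alpha(0)}$)
\[ z^{\alpha}e^{\beta} = z^{(\alpha,\beta)}e^{\beta} \tag{2.25} \]
for $\alpha, \beta \in Q$. Thus, in $(\operatorname{End}\mathbb{C}[Q])\{z\}$, we have
\[ [\alpha(0), z^{\beta}] = 0, \quad\text{and}\quad z^{\alpha}e_\beta = e_\beta\,z^{\alpha + (\alpha,\beta)} \tag{2.26} \]
for $\alpha, \beta \in Q$. It is clear that the formula (2.25) expresses $z^{\alpha}$ as an operator from $\mathbb{C}[Q]$ to $\mathbb{C}[Q][z, z^{-1}]$. Now let $c$ be any non-zero complex number. Consider the valuation $c^{\alpha}$ of the operator $z^{\alpha}$. Namely, $c^{\alpha}$ is the operator $\mathbb{C}[Q] \to \mathbb{C}[Q]$ given by
\[ c^{\alpha}e^{\beta} = c^{(\alpha,\beta)}e^{\beta}, \quad\text{for } \alpha, \beta \in Q. \tag{2.27} \]
This operator $c^{\alpha}$ has been used in [F1] and [J]. It will play a role in our construction of vertex operators as well. As in (2.26), we have
\[ [\alpha(0), c^{\beta}] = 0, \quad\text{and}\quad c^{\alpha}e_\beta = e_\beta\,c^{\alpha + (\alpha,\beta)} \tag{2.28} \]
for $\alpha, \beta \in Q$. Finally, we set
\[ \delta(z) = \sum_{n\in\mathbb{Z}} z^{n} \in \mathbb{C}[[z, z^{-1}]], \tag{2.29} \]
formally the Fourier expansion of the δ-function, and
\[ (D\delta)(z) = D\delta(z) = \sum_{n\in\mathbb{Z}} n z^{n}. \tag{2.30} \]
Most notations related to formal series and the Fock space used in this section can be found in the book [FLM].

3. Construction of a Class of Vertex Operators

Let $(\Lambda, q)$ be a pair, where $q$ is a non-zero complex number and $\Lambda$ is a subset of $\mathbb{C}$ containing 0 and closed under addition. We shall always fix one choice for $\ln q$ such that $q^{r} = e^{r\ln q}$ for all $r \in \Lambda$. Beginning with the pair $(\Lambda, q)$, we define a family of vertex operators which will eventually generate a Lie algebra. For $r \in \Lambda$, $1 \le i, j \le N$, we define the vertex operator $X_{ij}(r, z)$ as follows:
\[
\begin{aligned}
X_{ij}(r, z) &= E^{-}(-\epsilon_i, z)\,E^{-}(\epsilon_j, q^{r}z)\,E^{+}(-\epsilon_i, z)\,E^{+}(\epsilon_j, q^{r}z) \\
&\qquad \cdot\, e_{\epsilon_i - \epsilon_j}\, z^{\epsilon_i - \epsilon_j + \frac{(\epsilon_i - \epsilon_j,\,\epsilon_i - \epsilon_j)}{2}}\, q^{-r\epsilon_j - \frac{(\epsilon_j,\,\epsilon_i - \epsilon_j)}{2}r} \\
&= \exp\Bigl(-\sum_{n \in -\mathbb{Z}_+} \frac{\epsilon_i(n) - q^{-nr}\epsilon_j(n)}{n}\,z^{-n}\Bigr)\exp\Bigl(-\sum_{n \in \mathbb{Z}_+} \frac{\epsilon_i(n) - q^{-nr}\epsilon_j(n)}{n}\,z^{-n}\Bigr) \\
&\qquad \cdot\, e_{\epsilon_i - \epsilon_j}\, z^{\epsilon_i - \epsilon_j + \frac{(\epsilon_i - \epsilon_j,\,\epsilon_i - \epsilon_j)}{2}}\, q^{-r\epsilon_j - \frac{(\epsilon_j,\,\epsilon_i - \epsilon_j)}{2}r}.
\end{aligned} \tag{3.1}
\]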
Remark 3.2. Since
\[ z^{\epsilon_i-\epsilon_j+\frac{(\epsilon_i-\epsilon_j,\epsilon_i-\epsilon_j)}{2}} \in (\mathrm{End}\,\mathbb{C}[Q])[z,z^{-1}], \tag{3.3} \]
we obtain that $X_{ij}(r,z)\in(\mathrm{End}\,V_Q)[[z,z^{-1}]]$ and so we have
\[ X_{ij}(r,z) = \sum_{n\in\mathbb{Z}} x_{ij}(r,n)z^{-n}, \tag{3.4} \]
where $x_{ij}(r,n)\in\mathrm{End}\,V_Q$.

Remark 3.5. In the definition of vertex operators (3.1), $X_{ij}(r+r',z) = X_{ij}(r',z)$ whenever $q^r=1$, where $r,r'\in\Lambda$, $1\le i,j\le N$. In particular, $X_{ii}(r,z) = X_{ii}(0,z) = 1$ when $q^r=1$, for $1\le i\le N$.

We now summarize the actions of $\tilde{H}$, $e^{\alpha}$, $z^{\alpha}$ and $c^{\alpha}$ on $V_Q = S(\hat{H}^-)\otimes_{\mathbb{C}}\mathbb{C}[Q]$:
\[
\begin{aligned}
&c_0\mapsto 1, \quad d_0\mapsto d_0\otimes1+1\otimes d_0,\\
&\beta=\beta\otimes1\mapsto\beta(0)=1\otimes\beta(0), \quad \beta(n)=\beta\otimes t_0^n\mapsto\beta(n)\otimes1 \ \text{ for } n\in\mathbb{Z}\setminus\{0\},\\
&e^{\alpha}=1\otimes e^{\alpha}, \quad z^{\alpha}=1\otimes z^{\alpha}, \quad\text{and}\quad c^{\alpha}=1\otimes c^{\alpha}
\end{aligned}
\tag{3.6}
\]
for $\alpha\in Q$, $\beta\in H$. In our construction, $c$ will always be a power of $q$.

Proposition 3.7. For any $1\le i,j,k\le N$, $n\in\mathbb{Z}$, $r\in\Lambda$, we have
\[ [\epsilon_k(n), X_{ij}(r,z)] = (\delta_{ki}-\delta_{kj}q^{nr})z^nX_{ij}(r,z), \tag{3.8} \]
\[ [d_0, X_{ij}(r,z)] = -DX_{ij}(r,z). \tag{3.9} \]

Proof. Equation (3.8) follows from (2.10) and (2.22) while (3.9) follows from (2.16), (2.23) and the fact that
\[ Dz^{\alpha+\frac{(\alpha,\alpha)}{2}} = \Bigl(\alpha(0)+\frac{(\alpha,\alpha)}{2}\Bigr)z^{\alpha+\frac{(\alpha,\alpha)}{2}} \tag{3.10} \]
for $\alpha\in Q$. □

The normal ordering can be defined as in [FLM]. The only difference is the operator $c^{\alpha}$, which is in the identical position as $z^{\alpha}$. It then follows that $:X_{ij}(r,z): \,= X_{ij}(r,z)$.

Remark 3.11. $d_0$ can be rewritten as
\[ d_0 = -\frac{1}{2}\sum_{i=1}^N\sum_{n\in\mathbb{Z}} :\epsilon_i(-n)\epsilon_i(n): \;= -\frac{1}{2}\sum_{i=1}^N\epsilon_i(0)^2-\sum_{i=1}^N\sum_{n\in\mathbb{Z}_+}\epsilon_i(-n)\epsilon_i(n) \in \mathrm{End}\,V_Q. \]
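We define (comparing with (4.2.15) and (7.1.58) in [FLM])
\[
\begin{aligned}
:X_{ij}(r_1,z_1)X_{kl}(r_2,z_2):\; &= \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{(\epsilon_i(n)-q^{-nr_1}\epsilon_j(n))z_1^{-n}+(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}}{n}\Bigr)\\
&\quad\cdot\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{(\epsilon_i(n)-q^{-nr_1}\epsilon_j(n))z_1^{-n}+(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}}{n}\Bigr)\\
&\quad\cdot\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}\\
&\quad\cdot z_1^{\epsilon_i-\epsilon_j+\frac{(\epsilon_i-\epsilon_j,\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l)}{2}}\,z_2^{\epsilon_k-\epsilon_l+\frac{(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l)}{2}}\\
&\quad\cdot q^{-r_1\epsilon_j-r_2\epsilon_l-\frac{(\epsilon_j,\epsilon_i-\epsilon_j)}{2}r_1-\frac{(\epsilon_l,\epsilon_k-\epsilon_l)}{2}r_2}
\end{aligned}
\tag{3.12}
\]
for $r_1,r_2\in\Lambda$, $1\le i,j,k,l\le N$. Then one has
\[ :X_{ij}(r_1,z_1)X_{kl}(r_2,z_2):\; = (-1)^{(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)}:X_{kl}(r_2,z_2)X_{ij}(r_1,z_1):. \tag{3.13} \]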
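We have the following basic result.

Lemma 3.14. For $1\le i,j,k,l\le N$, $r_1,r_2\in\Lambda$,
\[
\begin{aligned}
&\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr_1}\epsilon_j(n)}{n}z_1^{-n}\Bigr)\exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_k(n)-q^{-nr_2}\epsilon_l(n)}{n}z_2^{-n}\Bigr)\\
&\quad= \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_k(n)-q^{-nr_2}\epsilon_l(n)}{n}z_2^{-n}\Bigr)\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr_1}\epsilon_j(n)}{n}z_1^{-n}\Bigr)\\
&\qquad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-\delta_{jk}}
\end{aligned}
\]
in the formal power series algebra $(\mathrm{End}\,V_Q)[[z_1^{-1},z_2]]\subseteq(\mathrm{End}\,V_Q)\{z_1,z_2\}$.

Proof. We first have
\[
\begin{aligned}
&\Bigl[-\sum_{n_1\in\mathbb{Z}_+}\frac{\epsilon_i(n_1)-q^{-n_1r_1}\epsilon_j(n_1)}{n_1}z_1^{-n_1},\; -\sum_{n_2\in-\mathbb{Z}_+}\frac{\epsilon_k(n_2)-q^{-n_2r_2}\epsilon_l(n_2)}{n_2}z_2^{-n_2}\Bigr]\\
&\quad= \sum_{n_1\in\mathbb{Z}_+}\Bigl[\frac{\epsilon_i(n_1)-q^{-n_1r_1}\epsilon_j(n_1)}{n_1}z_1^{-n_1},\; \frac{\epsilon_k(-n_1)-q^{n_1r_2}\epsilon_l(-n_1)}{-n_1}z_2^{n_1}\Bigr]\\
&\quad= -\sum_{n_1\in\mathbb{Z}_+}\frac{\delta_{ik}+q^{-n_1r_1+n_1r_2}\delta_{jl}-q^{n_1r_2}\delta_{il}-q^{-n_1r_1}\delta_{jk}}{n_1}\Bigl(\frac{z_2}{z_1}\Bigr)^{n_1}\\
&\quad= -\delta_{ik}\sum_{n_1\in\mathbb{Z}_+}\frac{(z_2/z_1)^{n_1}}{n_1}-\delta_{jl}\sum_{n_1\in\mathbb{Z}_+}\frac{(q^{r_2}z_2/q^{r_1}z_1)^{n_1}}{n_1}+\delta_{il}\sum_{n_1\in\mathbb{Z}_+}\frac{(q^{r_2}z_2/z_1)^{n_1}}{n_1}+\delta_{jk}\sum_{n_1\in\mathbb{Z}_+}\frac{(z_2/q^{r_1}z_1)^{n_1}}{n_1}\\
&\quad= \delta_{ik}\ln\Bigl(1-\frac{z_2}{z_1}\Bigr)+\delta_{jl}\ln\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)-\delta_{il}\ln\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)-\delta_{jk}\ln\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr),
\end{aligned}
\]
where we use the formula $\ln(1-z) = -\sum_{n\in\mathbb{Z}_+}\frac{z^n}{n}$. Now the result follows from the formal rule
\[ \exp A\exp B = (\exp B\exp A)\exp[A,B], \tag{3.15} \]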
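where $[A,B]$ commutes with $A$ and $B$. □

Proposition 3.16. For $r_1,r_2\in\Lambda$, $1\le i,j,k,l\le N$,
\[
\begin{aligned}
X_{ij}(r_1,z_1)X_{kl}(r_2,z_2) &= \;:X_{ij}(r_1,z_1)X_{kl}(r_2,z_2):\Bigl(\frac{z_1}{z_2}\Bigr)^{\frac{(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)}{2}}\\
&\quad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-\delta_{jk}}q^{-\delta_{jk}r_1+\delta_{jl}r_1}.
\end{aligned}
\]

Proof. By (2.26), (2.28) and Lemma 3.14, we have
\[
\begin{aligned}
&X_{ij}(r_1,z_1)X_{kl}(r_2,z_2)\\
&\;= \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr_1}\epsilon_j(n)}{n}z_1^{-n}\Bigr)\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr_1}\epsilon_j(n)}{n}z_1^{-n}\Bigr)\\
&\qquad\cdot e^{\epsilon_i-\epsilon_j}z_1^{\epsilon_i-\epsilon_j+\frac{(\epsilon_i-\epsilon_j,\epsilon_i-\epsilon_j)}{2}}q^{-r_1\epsilon_j-\frac{(\epsilon_j,\epsilon_i-\epsilon_j)}{2}r_1}\\
&\qquad\cdot\exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_k(n)-q^{-nr_2}\epsilon_l(n)}{n}z_2^{-n}\Bigr)\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_k(n)-q^{-nr_2}\epsilon_l(n)}{n}z_2^{-n}\Bigr)\\
&\qquad\cdot e^{\epsilon_k-\epsilon_l}z_2^{\epsilon_k-\epsilon_l+\frac{(\epsilon_k-\epsilon_l,\epsilon_k-\epsilon_l)}{2}}q^{-r_2\epsilon_l-\frac{(\epsilon_l,\epsilon_k-\epsilon_l)}{2}r_2}\\
&\;= \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{(\epsilon_i(n)-q^{-nr_1}\epsilon_j(n))z_1^{-n}+(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}}{n}\Bigr)\\
&\qquad\cdot\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{(\epsilon_i(n)-q^{-nr_1}\epsilon_j(n))z_1^{-n}+(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}}{n}\Bigr)\\
&\qquad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-\delta_{jk}}\\
&\qquad\cdot\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}\\
&\qquad\cdot z_1^{\epsilon_i-\epsilon_j+(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)+\frac{(\epsilon_i-\epsilon_j,\epsilon_i-\epsilon_j)}{2}}\,q^{-r_1\epsilon_j-(\epsilon_j,\epsilon_k-\epsilon_l)r_1-\frac{(\epsilon_j,\epsilon_i-\epsilon_j)}{2}r_1}\\
&\qquad\cdot z_2^{\epsilon_k-\epsilon_l+\frac{(\epsilon_k-\epsilon_l,\epsilon_k-\epsilon_l)}{2}}\,q^{-r_2\epsilon_l-\frac{(\epsilon_l,\epsilon_k-\epsilon_l)}{2}r_2}\\
&\;= \;:X_{ij}(r_1,z_1)X_{kl}(r_2,z_2):\Bigl(\frac{z_1}{z_2}\Bigr)^{\frac{(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)}{2}}\\
&\qquad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-\delta_{jk}}q^{-\delta_{jk}r_1+\delta_{jl}r_1}
\end{aligned}
\]
as desired. □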
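The formal rule (3.15), used in the proof of Lemma 3.14, can be sanity-checked in a finite-dimensional model. The following is only an illustrative sketch, not part of the paper's argument: it assumes the standard realization of a Heisenberg pair by strictly upper triangular 3×3 matrices, for which $[A,B]$ is central and all exponentials are finite sums. The helper names (`exp_nilpotent`, `check_rule`) are hypothetical.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matsub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def exp_nilpotent(X):
    """Exact exponential, valid when X^n = 0 (e.g. strictly upper triangular X)."""
    n = len(X)
    result, term = identity(n), identity(n)
    for k in range(1, n):
        # term becomes X^k / k!
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        result = matadd(result, term)
    return result

def check_rule(a, b):
    """Check exp(A) exp(B) == exp(B) exp(A) exp([A, B]) for A = a*E12, B = b*E23."""
    z, a, b = Fraction(0), Fraction(a), Fraction(b)
    A = [[z, a, z], [z, z, z], [z, z, z]]
    B = [[z, z, z], [z, z, b], [z, z, z]]
    C = matsub(matmul(A, B), matmul(B, A))  # [A, B] = a*b*E13, which is central
    lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
    rhs = matmul(matmul(exp_nilpotent(B), exp_nilpotent(A)), exp_nilpotent(C))
    return lhs == rhs
```

Exact rational arithmetic is used so that equality of the two sides is tested without rounding; in the paper the same rule is applied to infinite sums of Heisenberg operators whose commutator is a scalar series.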
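To calculate the commutators of vertex operators, we need some more notations and identities. Set
\[
\begin{aligned}
F^{ij}_{kl}(r_1,r_2,z_1,z_2) &= \;:X_{ij}(r_1,z_1)X_{kl}(r_2,z_2):\Bigl(\frac{z_1}{z_2}\Bigr)^{\frac{(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)}{2}}q^{-\delta_{jk}r_1+\delta_{jl}r_1}\\
&\quad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{1-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{1-\delta_{jk}}\frac{q^{r_1}z_1}{z_2}.
\end{aligned}
\tag{3.17}
\]
Note that
\[ X_{ij}(r_1,z_1)X_{kl}(r_2,z_2) = F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}\frac{z_2}{q^{r_1}z_1}. \tag{3.18} \]

Lemma 3.19. For $r_1,r_2\in\Lambda$, $1\le i,j,k,l\le N$, the limit $\lim_{z_2\to q^{r_1}z_1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)$ exists and
\[
\begin{aligned}
\lim_{z_2\to q^{r_1}z_1}F^{ij}_{kl}(r_1,r_2,z_1,z_2) &= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_l)\,\delta_{jk}\,X_{il}(r_1+r_2,z_1)\\
&\quad\cdot(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{jl}}q^{-\frac{r_2}{2}\delta_{jl}}(1-q^{r_1+r_2})^{1-\delta_{il}}q^{\frac{r_1+r_2}{2}\delta_{il}}.
\end{aligned}
\tag{3.20}
\]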
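Remark 3.21. Before proving Lemma 3.19, we give some explanation about the limit (3.20). By abuse of notation, if $i\ne j$, we set $(1-q^r)^{\delta_{ij}} = 1$ even when $q^r=1$. For example, if $i\ne k$, $\lim_{z_2\to q^{r_1}z_1}(1-\frac{z_2}{z_1})^{\delta_{ik}} = 1$ even when $q^{r_1}=1$, and we write
\[ \lim_{z_2\to q^{r_1}z_1}\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}} = (1-q^{r_1})^{\delta_{ik}}. \]
This convention makes our formulas more concise and will be used often in the rest of the paper.

Proof of Lemma 3.19. We first have
\[ F^{ij}_{kl}(r_1,r_2,z_1,z_2) = F_1(z_1,z_2)\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}\,F_2(z_1,z_2)\,F_3, \tag{3.22} \]
where
\[
\begin{aligned}
F_1(z_1,z_2) &= \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{(\epsilon_i(n)-q^{-nr_1}\epsilon_j(n))z_1^{-n}+(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}}{n}\Bigr)\\
&\quad\cdot\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{(\epsilon_i(n)-q^{-nr_1}\epsilon_j(n))z_1^{-n}+(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}}{n}\Bigr),
\end{aligned}
\tag{3.23}
\]
\[
\begin{aligned}
F_2(z_1,z_2) &= z_1^{\epsilon_i-\epsilon_j+1-\delta_{ij}+\delta_{ik}-\delta_{il}-\delta_{jk}+\delta_{jl}}\,z_2^{\epsilon_k-\epsilon_l+1-\delta_{kl}}\\
&\quad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{1-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{1-\delta_{jk}}\frac{q^{r_1}z_1}{z_2},
\end{aligned}
\tag{3.24}
\]
and
\[ F_3 = q^{-r_1\epsilon_j-r_2\epsilon_l-\frac{(\epsilon_j,\epsilon_i-\epsilon_j)}{2}r_1-\frac{(\epsilon_l,\epsilon_k-\epsilon_l)}{2}r_2-\delta_{jk}r_1+\delta_{jl}r_1}. \tag{3.25} \]
Since $\lim_{z_2\to q^{r_1}z_1}F_1(z_1,z_2)$ exists and $F_2(z_1,z_2)\in(\mathrm{End}\,\mathbb{C}[Q])[z_1,z_2,z_1^{-1},z_2^{-1}]$, one sees that $\lim_{z_2\to q^{r_1}z_1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)$ exists. Note that $F_2(z_1,z_2)$ contains the factor $(1-\frac{z_2}{q^{r_1}z_1})^{1-\delta_{jk}}$. It follows that
\[
\begin{aligned}
\lim_{z_2\to q^{r_1}z_1}F^{ij}_{kl}(r_1,r_2,z_1,z_2) &= \delta_{jk}\lim_{z_2\to q^{r_1}z_1}F^{ij}_{jl}(r_1,r_2,z_1,z_2)\\
&= \delta_{jk}\,F_1(z_1,q^{r_1}z_1)\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_l)\,e^{\epsilon_i-\epsilon_l}\,F_2(z_1,q^{r_1}z_1)\,F_3.
\end{aligned}
\tag{3.26}
\]
Next we let $k=j$ and get
\[ F_1(z_1,q^{r_1}z_1) = \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-n(r_1+r_2)}\epsilon_l(n)}{n}z_1^{-n}\Bigr)\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-n(r_1+r_2)}\epsilon_l(n)}{n}z_1^{-n}\Bigr), \tag{3.27} \]
\[
\begin{aligned}
F_2(z_1,q^{r_1}z_1) &= z_1^{\epsilon_i-\epsilon_j-\delta_{il}+\delta_{jl}}(q^{r_1}z_1)^{\epsilon_j-\epsilon_l+1-\delta_{jl}}(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{jl}}(1-q^{r_1+r_2})^{1-\delta_{il}}\\
&= z_1^{\epsilon_i-\epsilon_l+1-\delta_{il}}\,q^{r_1\epsilon_j-r_1\epsilon_l+(1-\delta_{jl})r_1}(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{jl}}(1-q^{r_1+r_2})^{1-\delta_{il}},
\end{aligned}
\tag{3.28}
\]
and
\[
\begin{aligned}
F_3 &= q^{-r_1\epsilon_j-r_2\epsilon_l-\frac{(\epsilon_j,\epsilon_i-\epsilon_j)}{2}r_1-\frac{(\epsilon_l,\epsilon_j-\epsilon_l)}{2}r_2-r_1+\delta_{jl}r_1}\\
&= q^{-r_1\epsilon_j-r_2\epsilon_l-(1-\delta_{jl})r_1-\frac{\delta_{ij}-1}{2}r_1-\frac{\delta_{jl}-1}{2}r_2}.
\end{aligned}
\tag{3.29}
\]
Now (3.20) follows from (3.26)-(3.29) and the definition of $X_{il}(r_1+r_2,z_1)$. □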
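Next we want to show that $F^{ij}_{kl}(r_1,r_2,z_1,z_2)$ has a symmetry property.

Lemma 3.30. For $r_1,r_2\in\Lambda$, $1\le i,j,k,l\le N$,
\[
\begin{aligned}
&(-1)^{(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)}\Bigl(\frac{z_2}{z_1}\Bigr)^{(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)}\\
&\qquad\cdot\Bigl(1-\frac{z_1}{z_2}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_1}z_1}{q^{r_2}z_2}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{1-\delta_{jk}}\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{1-\delta_{il}}q^{-\delta_{li}r_2+\delta_{lj}r_2}\\
&\quad= \Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{1-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{1-\delta_{jk}}\\
&\qquad\cdot q^{-\delta_{jk}r_1+\delta_{jl}r_1}\,\frac{q^{r_1}z_1}{z_2}\,\frac{z_1}{q^{r_2}z_2}
\end{aligned}
\tag{3.31}
\]
and
\[ F^{ij}_{kl}(r_1,r_2,z_1,z_2) = F^{kl}_{ij}(r_2,r_1,z_2,z_1). \tag{3.32} \]

Proof. Equation (3.32) follows from (3.31) and (3.13). Let us prove (3.31). In fact, the left-hand side of (3.31) equals
\[
\begin{aligned}
&\Bigl(-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}+\delta_{jl}-\delta_{il}-\delta_{jk}}\Bigl(1-\frac{z_1}{z_2}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_1}z_1}{q^{r_2}z_2}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{1-\delta_{jk}}\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{1-\delta_{il}}q^{-\delta_{li}r_2+\delta_{lj}r_2}\\
&\quad= \Bigl(-\frac{z_2}{z_1}+1\Bigr)^{\delta_{ik}}\Bigl(-\frac{z_2}{z_1}+q^{r_1-r_2}\Bigr)^{\delta_{jl}}\Bigl(-\frac{z_2}{z_1}+q^{r_1}\Bigr)^{1-\delta_{jk}}\Bigl(-\frac{z_2}{z_1}+q^{-r_2}\Bigr)^{1-\delta_{il}}q^{-\delta_{li}r_2+\delta_{lj}r_2}\Bigl(-\frac{z_2}{z_1}\Bigr)^{-2}\\
&\quad= \Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}q^{\delta_{jl}r_1-\delta_{jl}r_2}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{1-\delta_{jk}}q^{r_1-\delta_{jk}r_1}\\
&\qquad\cdot\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{1-\delta_{il}}q^{-r_2+\delta_{il}r_2}\,q^{-\delta_{li}r_2+\delta_{lj}r_2}\Bigl(\frac{z_1}{z_2}\Bigr)^{2}\\
&\quad= \Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{1-\delta_{il}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{1-\delta_{jk}}q^{-\delta_{jk}r_1+\delta_{jl}r_1}\,\frac{q^{r_1}z_1}{z_2}\,\frac{z_1}{q^{r_2}z_2}
\end{aligned}
\]
as expected. □

The following basic result is similar to (4.16) in [J]; its proof is straightforward.

Lemma 3.33. For any $a,b\in\mathbb{C}$ with $a\ne b$, we have
\[ (1-az)^{-1}(1-bz)^{-1} = \frac{z^{-1}}{a-b}\bigl((1-az)^{-1}-(1-bz)^{-1}\bigr) \]
in $\mathbb{C}[[z,z^{-1}]]$.

Proposition 3.34. If $q^{r_1+r_2}\ne1$, then
\[
\begin{aligned}
&\frac{z_2}{q^{r_1}z_1}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}-\frac{z_1}{q^{r_2}z_2}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-1}\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}\\
&\quad= (1-q^{r_1+r_2})^{-1}\Bigl(\delta\Bigl(\frac{q^{r_1}z_1}{z_2}\Bigr)-\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigr).
\end{aligned}
\]

Proof. By Lemma 3.33, we see that the left-hand side of the identity is equal to
\[
\begin{aligned}
&\frac{z_2}{q^{r_1}z_1}(q^{-r_1}-q^{r_2})^{-1}\Bigl(\frac{z_2}{z_1}\Bigr)^{-1}\Bigl(\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}-\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-1}\Bigr)\\
&\qquad-\frac{z_1}{q^{r_2}z_2}(q^{-r_2}-q^{r_1})^{-1}\Bigl(\frac{z_1}{z_2}\Bigr)^{-1}\Bigl(\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}-\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-1}\Bigr)\\
&\quad= (1-q^{r_1+r_2})^{-1}\Bigl(\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}-\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-1}-\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}+\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-1}\Bigr)\\
&\quad= (1-q^{r_1+r_2})^{-1}\Bigl(\delta\Bigl(\frac{q^{r_1}z_1}{z_2}\Bigr)-\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigr)
\end{aligned}
\]
as needed. □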
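The partial-fraction identity of Lemma 3.33, which drives the proof of Proposition 3.34, can be checked coefficient by coefficient. The sketch below is an illustration under stated assumptions rather than the paper's proof: both sides are expanded in non-negative powers of $z$, the sample values of $a$ and $b$ are arbitrary rationals, and the function names are hypothetical.

```python
from fractions import Fraction

def lhs_coeff(a, b, n):
    """Coefficient of z^n in (1 - a z)^{-1} (1 - b z)^{-1}, a product of two geometric series."""
    return sum(a**k * b**(n - k) for k in range(n + 1))

def rhs_coeff(a, b, n):
    """Coefficient of z^n in z^{-1}/(a - b) * ((1 - a z)^{-1} - (1 - b z)^{-1})."""
    # (1 - a z)^{-1} - (1 - b z)^{-1} = sum_m (a^m - b^m) z^m; the factor z^{-1} shifts m to n + 1.
    return (a**(n + 1) - b**(n + 1)) / (a - b)

def lemma_3_33_holds(a, b, order=10):
    a, b = Fraction(a), Fraction(b)
    return all(lhs_coeff(a, b, n) == rhs_coeff(a, b, n) for n in range(order))
```

In Proposition 3.34 the lemma is applied with $z = z_2/z_1$, $a = q^{-r_1}$, $b = q^{r_2}$ (and symmetrically with the roles of the two variables exchanged), which is where the hypothesis $q^{r_1+r_2}\ne1$, i.e. $a\ne b$, enters.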
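The identity in the above proposition may be viewed as a deformation of the following well-known identity:
\[ z(1-z)^{-2}-z^{-1}(1-z^{-1})^{-2} = (D\delta)(z). \tag{3.35} \]
Indeed, taking the limit $q^{r_1+r_2}\to1$ in (3.34) will recover the identity (3.35). Now we are in a position to show our first commutator relation:

Proposition 3.36. If $q^{r_1+r_2}\ne1$, then
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)]\\
&\quad= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_l)\,\delta_{jk}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}\\
&\qquad\cdot(1-q^{r_1+r_2})^{-\delta_{il}}q^{\frac{r_1+r_2}{2}\delta_{il}}X_{il}(r_1+r_2,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\qquad-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_l-\epsilon_j)\,\delta_{il}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}\\
&\qquad\cdot(1-q^{r_1+r_2})^{-\delta_{kj}}q^{\frac{r_1+r_2}{2}\delta_{kj}}X_{kj}(r_1+r_2,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr).
\end{aligned}
\]

Proof. By (3.18), we have
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)] = X_{ij}(r_1,z_1)X_{kl}(r_2,z_2)-X_{kl}(r_2,z_2)X_{ij}(r_1,z_1)\\
&\quad= F^{ij}_{kl}(r_1,r_2,z_1,z_2)\frac{z_2}{q^{r_1}z_1}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}\\
&\qquad-F^{kl}_{ij}(r_2,r_1,z_2,z_1)\frac{z_1}{q^{r_2}z_2}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-1}\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}.
\end{aligned}
\tag{3.37}
\]
From (3.32) and Proposition 3.34, the above becomes
\[
\begin{aligned}
&F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigl(1-\frac{q^{r_2}z_2}{z_1}\Bigr)^{-1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}-\frac{z_1}{q^{r_2}z_2}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-1}\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}\Bigr)\\
&\quad= F^{ij}_{kl}(r_1,r_2,z_1,z_2)(1-q^{r_1+r_2})^{-1}\Bigl(\delta\Bigl(\frac{q^{r_1}z_1}{z_2}\Bigr)-\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigr).
\end{aligned}
\]
Applying Lemma 3.19, we get
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)]\\
&\quad= (1-q^{r_1+r_2})^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\,\delta\Bigl(\frac{q^{r_1}z_1}{z_2}\Bigr)-(1-q^{r_1+r_2})^{-1}F^{kl}_{ij}(r_2,r_1,z_2,z_1)\,\delta\Bigl(\frac{q^{r_2}z_2}{z_1}\Bigr)\\
&\quad= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_l)\,\delta_{jk}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{jl}}q^{-\frac{r_2}{2}\delta_{jl}}\\
&\qquad\cdot(1-q^{r_1+r_2})^{-\delta_{il}}q^{\frac{r_1+r_2}{2}\delta_{il}}X_{il}(r_1+r_2,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\qquad-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_l-\epsilon_j)\,\delta_{il}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}(1-q^{r_1})^{\delta_{lj}}q^{-\frac{r_1}{2}\delta_{lj}}\\
&\qquad\cdot(1-q^{r_1+r_2})^{-\delta_{kj}}q^{\frac{r_1+r_2}{2}\delta_{kj}}X_{kj}(r_1+r_2,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr). \qquad\square
\end{aligned}
\]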
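Next we shall focus our attention on the case $q^{r_1+r_2}=1$ (so $q^{r_1}=q^{-r_2}$). In this case, we have
\[
\begin{aligned}
F_2(z_1,z_2) &= z_1^{\epsilon_i-\epsilon_j+1-\delta_{ij}+\delta_{ik}-\delta_{il}-\delta_{jk}+\delta_{jl}}\,z_2^{\epsilon_k-\epsilon_l+1-\delta_{kl}}\\
&\quad\cdot\Bigl(1-\frac{z_2}{z_1}\Bigr)^{\delta_{ik}}\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{\delta_{jl}}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{2-\delta_{il}-\delta_{jk}}\frac{q^{r_1}z_1}{z_2}.
\end{aligned}
\tag{3.38}
\]

Lemma 3.39. Suppose that $q^{r_1+r_2}=1$, $1\le i,j,k,l\le N$. Then
\[ \lim_{z_2\to q^{r_1}z_1}F^{ij}_{kl}(r_1,r_2,z_1,z_2) = \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)\,\delta_{jk}\delta_{il}(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{ij}}. \tag{3.40} \]
Moreover, if $j=k$ and $i\ne l$, then
\[
\begin{aligned}
&\lim_{z_2\to q^{r_1}z_1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\\
&\quad= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_l)(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}X_{il}(0,z_1).
\end{aligned}
\tag{3.41}
\]

Proof. Equation (3.40) follows from (3.20), (3.5) and the fact that $(1-q^{r_1+r_2})^{1-\delta_{il}} = \delta_{il}$. The proof of (3.41) is the same as the proof of (3.20). □

Corollary 3.42. If $q^{r_1+r_2}=1$, $j\ne k$, $i=l$, then
\[
\begin{aligned}
&\lim_{z_1\to q^{r_2}z_2}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\\
&\quad= -\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}X_{kj}(0,z_2).
\end{aligned}
\tag{3.43}
\]

Proof. Since
\[
\begin{aligned}
\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2) &= \Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}F^{kl}_{ij}(r_2,r_1,z_2,z_1)\\
&= \Bigl(-\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigl(1-\frac{z_1}{q^{r_2}z_2}\Bigr)^{-1}F^{kl}_{ij}(r_2,r_1,z_2,z_1),
\end{aligned}
\]
it follows from (3.41) that
\[
\begin{aligned}
&\lim_{z_1\to q^{r_2}z_2}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\\
&\quad= -\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}X_{kj}(0,z_2)
\end{aligned}
\]
as desired. □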
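Also, we have from (3.37)
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)]\\
&\quad= F^{ij}_{kl}(r_1,r_2,z_1,z_2)\frac{z_2}{q^{r_1}z_1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-2}-F^{kl}_{ij}(r_2,r_1,z_2,z_1)\frac{q^{r_1}z_1}{z_2}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-2}\\
&\quad= F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-2}-\frac{q^{r_1}z_1}{z_2}\Bigl(1-\frac{q^{r_1}z_1}{z_2}\Bigr)^{-2}\Bigr)\\
&\quad= F^{ij}_{kl}(r_1,r_2,z_1,z_2)\,(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr),
\end{aligned}
\tag{3.44}
\]
where we use the identity (3.35). By Proposition 2.2.4 in [FLM], and (3.44), we obtain
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)]\\
&\quad= F^{ij}_{kl}(r_1,r_2,z_1,q^{r_1}z_1)\,(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)-(D_{z_2}F^{ij}_{kl})(r_1,r_2,z_1,z_2)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\quad= G_1-G_2,
\end{aligned}
\tag{3.45}
\]
where
\[ G_1 = F^{ij}_{kl}(r_1,r_2,z_1,q^{r_1}z_1)\,(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr), \tag{3.46} \]
and
\[ G_2 = (D_{z_2}F^{ij}_{kl})(r_1,r_2,z_1,z_2)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr). \tag{3.47} \]
From (3.40), we have
\[ G_1 = \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)\,\delta_{jk}\delta_{il}(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr). \tag{3.48} \]
To compute $G_2$, we first have
\[
\begin{aligned}
D_{z_2}F_1(z_1,z_2) &= \sum_{n\in-\mathbb{Z}_+}(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n}\,F_1(z_1,z_2)\\
&\quad+F_1(z_1,z_2)\sum_{n\in\mathbb{Z}_+}(\epsilon_k(n)-q^{-nr_2}\epsilon_l(n))z_2^{-n},
\end{aligned}
\tag{3.49}
\]
and
\[
\begin{aligned}
D_{z_2}F_2(z_1,z_2) &= (\epsilon_k(0)-\epsilon_l(0)+1-\delta_{kl})F_2(z_1,z_2)+\delta_{ik}\Bigl(-\frac{z_2}{z_1}\Bigr)\Bigl(1-\frac{z_2}{z_1}\Bigr)^{-1}F_2(z_1,z_2)\\
&\quad+\delta_{jl}\Bigl(-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{-1}F_2(z_1,z_2)\\
&\quad+(2-\delta_{il}-\delta_{jk})\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F_2(z_1,z_2)-F_2(z_1,z_2)\\
&= (\epsilon_k(0)-\epsilon_l(0)-\delta_{kl})F_2(z_1,z_2)+\delta_{ik}\Bigl(-\frac{z_2}{z_1}\Bigr)\Bigl(1-\frac{z_2}{z_1}\Bigr)^{-1}F_2(z_1,z_2)\\
&\quad+\delta_{jl}\Bigl(-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{-1}F_2(z_1,z_2)\\
&\quad+(2-\delta_{il}-\delta_{jk})\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F_2(z_1,z_2)\\
&= J_1(z_1,z_2)+J_2(z_1,z_2)+J_3(z_1,z_2)+J_4(z_1,z_2),
\end{aligned}
\tag{3.50}
\]
where
\[
\begin{aligned}
J_1(z_1,z_2) &= (\epsilon_k(0)-\epsilon_l(0)-\delta_{kl})F_2(z_1,z_2),\\
J_2(z_1,z_2) &= \delta_{ik}\Bigl(-\frac{z_2}{z_1}\Bigr)\Bigl(1-\frac{z_2}{z_1}\Bigr)^{-1}F_2(z_1,z_2),\\
J_3(z_1,z_2) &= \delta_{jl}\Bigl(-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{-1}F_2(z_1,z_2),\\
J_4(z_1,z_2) &= (2-\delta_{il}-\delta_{jk})\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F_2(z_1,z_2).
\end{aligned}
\]
Note that
\[
\begin{aligned}
D_{z_2}F^{ij}_{kl}(r_1,r_2,z_1,z_2) &= (D_{z_2}F_1(z_1,z_2))\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}F_2(z_1,z_2)F_3\\
&\quad+F_1(z_1,z_2)\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}(D_{z_2}F_2(z_1,z_2))F_3\\
&= I_1(z_1,z_2)+I_2(z_1,z_2),
\end{aligned}
\tag{3.51}
\]
where
\[ I_1(z_1,z_2) = (D_{z_2}F_1(z_1,z_2))\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}F_2(z_1,z_2)F_3 \tag{3.52} \]
and
\[ I_2(z_1,z_2) = F_1(z_1,z_2)\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,e^{\epsilon_i-\epsilon_j+\epsilon_k-\epsilon_l}(D_{z_2}F_2(z_1,z_2))F_3. \tag{3.53} \]
Since $F_2(z_1,z_2)$ contains the factor $(1-\frac{z_2}{q^{r_1}z_1})^{2-\delta_{il}-\delta_{jk}}$, we have $I_1(z_1,q^{r_1}z_1)=0$ except in the case $i=l$ and $j=k$. In this case we have $F_1(z_1,q^{r_1}z_1)=1$. It then follows from (3.49) and (3.40) that
\[
\begin{aligned}
I_1(z_1,q^{r_1}z_1) &= \Bigl(\sum_{n\in-\mathbb{Z}_+}(\epsilon_j(n)-q^{-nr_2}\epsilon_i(n))(q^{r_1}z_1)^{-n}+\sum_{n\in\mathbb{Z}_+}(\epsilon_j(n)-q^{-nr_2}\epsilon_i(n))(q^{r_1}z_1)^{-n}\Bigr)\\
&\quad\cdot F^{ij}_{kl}(r_1,r_2,z_1,q^{r_1}z_1)\\
&= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}\bigl(-\epsilon_i(z_1)+\epsilon_j(q^{r_1}z_1)+\epsilon_i(0)-\epsilon_j(0)\bigr).
\end{aligned}
\tag{3.54}
\]
Next we consider $I_2(z_1,z_2)$. This will be carried out in three cases.

Assume that $\delta_{il}+\delta_{jk}=0$ (i.e., $i\ne l$ and $j\ne k$). From (3.38) and (3.50) we see that
\[ (D_{z_2}F_2)(z_1,q^{r_1}z_1) = 0 \tag{3.55} \]
as $J_1$, $J_2$, $J_3$ and $J_4$ in (3.50) contain the common factor $(1-\frac{z_2}{q^{r_1}z_1})$.

Assume that $\delta_{il}+\delta_{jk}=2$ (i.e., $i=l$ and $j=k$). From (3.50) and (3.53) we have $(D_{z_2}F_2)(z_1,z_2) = J_1(z_1,z_2)+J_2(z_1,z_2)+J_3(z_1,z_2)$. Then, by (3.40), we get
\[
\begin{aligned}
I_2(z_1,z_2) &= F^{ij}_{ji}(r_1,r_2,z_1,z_2)(\epsilon_j(0)-\epsilon_i(0)-\delta_{ij})\\
&\quad+F^{ij}_{ji}(r_1,r_2,z_1,z_2)\Bigl(\delta_{ij}\Bigl(-\frac{z_2}{z_1}\Bigr)\Bigl(1-\frac{z_2}{z_1}\Bigr)^{-1}\Bigr)\\
&\quad+F^{ij}_{ji}(r_1,r_2,z_1,z_2)\Bigl(\delta_{ij}\Bigl(-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{q^{r_2}z_2}{q^{r_1}z_1}\Bigr)^{-1}\Bigr)
\end{aligned}
\tag{3.56}
\]
and so
\[
\begin{aligned}
\lim_{z_2\to q^{r_1}z_1}I_2(z_1,z_2) &= (\epsilon_j(0)-\epsilon_i(0)-\delta_{ij})(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{ij}}\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\\
&\quad+\delta_{ij}(-q^{r_1})(1-q^{r_1})^{\delta_{ij}-1}(1-q^{r_2})^{\delta_{ij}}\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\\
&\quad+\delta_{ij}(-q^{r_2})(1-q^{r_2})^{\delta_{ij}-1}(1-q^{r_1})^{\delta_{ij}}\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\\
&= (\epsilon_j(0)-\epsilon_i(0))\,\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{ij}},
\end{aligned}
\]
where we use the following identity, whose proof is straightforward (recall that $q^{r_1+r_2}=1$):
\[
\delta_{ij}(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{ij}} = \delta_{ij}(-q^{r_1})(1-q^{r_1})^{\delta_{ij}-1}(1-q^{r_2})^{\delta_{ij}}+\delta_{ij}(-q^{r_2})(1-q^{r_2})^{\delta_{ij}-1}(1-q^{r_1})^{\delta_{ij}}.
\tag{3.57}
\]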
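Assume that $\delta_{il}+\delta_{jk}=1$ (i.e., $i=l$, $j\ne k$ or $i\ne l$, $j=k$). Then $J_1(z_1,q^{r_1}z_1) = J_2(z_1,q^{r_1}z_1) = J_3(z_1,q^{r_1}z_1) = 0$, and we obtain
\[
\begin{aligned}
I_2(z_1,z_2)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr) &= (2-\delta_{il}-\delta_{jk})\Bigl(\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigr)\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&= \delta_{jk}\Bigl(\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigr)\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\quad+\delta_{il}\Bigl(\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigr)\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr).
\end{aligned}
\tag{3.58}
\]
By (3.41) and (3.43), the above becomes
\[
\begin{aligned}
&\delta_{jk}\Bigl(\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigr)\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\qquad+\delta_{il}\Bigl(\Bigl(-\frac{z_2}{q^{r_1}z_1}\Bigr)\Bigl(1-\frac{z_2}{q^{r_1}z_1}\Bigr)^{-1}F^{ij}_{kl}(r_1,r_2,z_1,z_2)\Bigr)\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\\
&\quad= -\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}X_{il}(0,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\qquad+\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}X_{kj}(0,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr).
\end{aligned}
\tag{3.59}
\]
Therefore, we have proved our second commutator relations:

Proposition 3.60. If $q^{r_1+r_2}=1$, then

(i) when $\delta_{jk}+\delta_{il}=0$,
\[ [X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)] = 0; \]

(ii) when $\delta_{jk}+\delta_{il}=1$,
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)]\\
&\quad= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}X_{il}(0,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\qquad-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}(1-q^{r_2})^{\delta_{kl}}q^{-\frac{r_2}{2}\delta_{kl}}(1-q^{r_1})^{\delta_{ij}}q^{-\frac{r_1}{2}\delta_{ij}}X_{kj}(0,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr);
\end{aligned}
\]

(iii) when $\delta_{jk}+\delta_{il}=2$,
\[
\begin{aligned}
&[X_{ij}(r_1,z_1),X_{kl}(r_2,z_2)]\\
&\quad= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}\Bigl(\epsilon_i(z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)-\epsilon_j(z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigr)\\
&\qquad+\varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)(1-q^{r_1})^{\delta_{ij}}(1-q^{r_2})^{\delta_{kl}}(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr).
\end{aligned}
\]

To conclude this section, let $V(\Lambda,q)$ be the $\mathbb{C}$-linear span of the operators $\epsilon_i(n)$, $d_0$, $1$ and $x_{ij}(r,n)$, the coefficients of $X_{ij}(r,z)$ (see (3.4)), where $n\in\mathbb{Z}$, $1\le i,j\le N$, $r\in\Lambda$. From Propositions 3.7, 3.36 and 3.60, we see that

Proposition 3.61. $V(\Lambda,q)$ is a Lie subalgebra of $gl(V_Q)$.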
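4. Realization and Lifting for $V(\Lambda,q)$

In this section we will find a realization for the Lie algebra $V(\Lambda,q)$. It turns out that $V(\Lambda,q)$ is a homomorphic image of an affinization of the matrix algebra with entries in a skew-polynomial ring. If $(\Lambda,q)$ is generic, we further lift $V(\Lambda,q)$ to a Lie algebra $W(\Lambda,q)$ on the enlarged Fock space $W_\Lambda = \mathbb{C}[\Lambda]\otimes_{\mathbb{C}}V_Q$.

Given a pair $(\Lambda,q)$, where $q$ is a non-zero complex number and $\Lambda$ is a subset of $\mathbb{C}$ containing $0$ and closed under addition, let $\Lambda_0 = \{r\in\Lambda : q^r=1\}$.

Definition 4.1. The pair $(\Lambda,q)$ is said to be generic if $\Lambda_0=\{0\}$.

Remark 4.2. If $\Lambda=\{0\}$, then the pair $(\Lambda,q)$ is generic for any non-zero number $q$. In general, $(\Lambda,q)$ is generic if and only if $\ln q\ne0$ and $\frac{2\pi\sqrt{-1}}{\ln q}\notin\Lambda\mathbb{Z}^{*-1}$, where $\mathbb{Z}^*=\mathbb{Z}\setminus\{0\}$. In particular, the pair $(\mathbb{Z},q)$ is generic if and only if $q$ is not a root of unity, while the pair $(\mathbb{R},q)$ is generic if and only if $|q|\ne1$. $(\mathbb{C},q)$ is never generic. Also, $(\mathbb{Z}\oplus\mathbb{Z}\xi,q)$ with $\mathrm{Im}\,\xi\ne0$ is generic if $q=e^{2\pi\theta\sqrt{-1}}$ and $\theta$ is an irrational (real) number, and is not generic if $\xi=\sqrt{-1}$ and $q=e^{\theta\pi}$ with a rational (real) number $\theta$.

Define
\[ Y_{ij}(r,z) = (1-q^r)^{-\delta_{ij}}q^{\frac{r}{2}\delta_{ij}}X_{ij}(r,z) \tag{4.3} \]
for $1\le i\ne j\le N$ and $r\in\Lambda$, or $1\le i=j\le N$ but $r\in\Lambda\setminus\Lambda_0$. Rewriting (3.7), (3.36) and (3.60) gives us the following:

Proposition 4.4.
\[ [\epsilon_k(n),Y_{ij}(r,z)] = (\delta_{ki}-\delta_{kj}q^{nr})z^nY_{ij}(r,z), \tag{4.5} \]
\[ [d_0,Y_{ij}(r,z)] = -DY_{ij}(r,z). \tag{4.6} \]
If $r_1+r_2\in\Lambda\setminus\Lambda_0$, then
\[
\begin{aligned}
[Y_{ij}(r_1,z_1),Y_{kl}(r_2,z_2)] &= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_l)\,\delta_{jk}Y_{il}(r_1+r_2,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\quad-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_l-\epsilon_j)\,\delta_{il}Y_{kj}(r_1+r_2,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr).
\end{aligned}
\tag{4.7}
\]
If $r_1+r_2\in\Lambda_0$, then

(i) when $\delta_{jk}+\delta_{il}=0$ (so $\delta_{jk}=\delta_{il}=0$),
\[ [Y_{ij}(r_1,z_1),Y_{kl}(r_2,z_2)] = 0; \tag{4.8} \]

(ii) when $\delta_{jk}+\delta_{il}=1$,
\[
\begin{aligned}
[Y_{ij}(r_1,z_1),Y_{kl}(r_2,z_2)] &= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}Y_{il}(0,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\quad-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}Y_{kj}(0,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr);
\end{aligned}
\tag{4.9}
\]

(iii) when $\delta_{jk}+\delta_{il}=2$ (so $\delta_{jk}=\delta_{il}=1$),
\[
\begin{aligned}
[Y_{ij}(r_1,z_1),Y_{kl}(r_2,z_2)] &= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)\Bigl(\epsilon_i(z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)-\epsilon_j(z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\Bigr)\\
&\quad+\varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr).
\end{aligned}
\tag{4.10}
\]

Write
\[ Y_{ij}(r,z) = \sum_{n\in\mathbb{Z}}y_{ij}(r,n)z^{-n}. \tag{4.11} \]
Then the Lie algebra $V(\Lambda,q)$ is spanned by $1$, $d_0$, $\epsilon_i(n)$, $1\le i\le N$, and $y_{ij}(r,n)$, $1\le i\ne j\le N$, or $1\le i=j\le N$ but $r\in\Lambda\setminus\Lambda_0$, for all $n\in\mathbb{Z}$.

For later use, we shall find explicit actions of some of the operators $y_{ij}(r,n)$ on $1\otimes1\in V_Q = S(\hat{H}^-)\otimes\mathbb{C}[Q]$.

Lemma 4.12.
(i) $y_{ij}(r,n)(1\otimes1) = 0$, for $1\le i\ne j\le N$, $r\in\Lambda$, $n\in\mathbb{Z}_{\ge0}$;
(ii) $y_{ii}(r,n)(1\otimes1) = \delta_{n,0}\dfrac{1}{q^{-r/2}-q^{r/2}}\,1\otimes1$, for $1\le i\le N$, $r\in\Lambda\setminus\Lambda_0$, $n\in\mathbb{Z}_{\ge0}$.

Proof. For $1\le i\ne j\le N$, $r\in\Lambda$, we have
\[
\begin{aligned}
Y_{ij}(r,z) &= \exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr}\epsilon_j(n)}{n}z^{-n}\Bigr)\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr}\epsilon_j(n)}{n}z^{-n}\Bigr)\\
&\quad\cdot e^{\epsilon_i-\epsilon_j}z^{\epsilon_i-\epsilon_j+1}q^{-r\epsilon_j+\frac{r}{2}}.
\end{aligned}
\]
Then
\[ Y_{ij}(r,z)(1\otimes1) = q^{\frac{r}{2}}z\exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr}\epsilon_j(n)}{n}z^{-n}\Bigr)(1\otimes e^{\epsilon_i-\epsilon_j}), \]
which involves only positive powers of $z$ and so gives (i). For $1\le i\le N$, $r\in\Lambda\setminus\Lambda_0$, we have
\[ Y_{ii}(r,z) = (1-q^r)^{-1}q^{\frac{r}{2}}\exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr}\epsilon_i(n)}{n}z^{-n}\Bigr)\exp\Bigl(-\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-q^{-nr}\epsilon_i(n)}{n}z^{-n}\Bigr). \]
Then
\[ Y_{ii}(r,z)(1\otimes1) = (1-q^r)^{-1}q^{\frac{r}{2}}\exp\Bigl(-\sum_{n\in-\mathbb{Z}_+}\frac{1-q^{-nr}}{n}\epsilon_i(n)z^{-n}\Bigr)(1\otimes1), \]
which yields (ii). □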
Let $R = \mathbb{C}[\Lambda] = \sum_{r\in\Lambda}\oplus\,\mathbb{C}e^r$ be the group algebra of $\Lambda$ (though $\Lambda$ may not be a group). Let $\sigma$ be the automorphism of $R$ given by $\sigma(e^r) = q^re^r$. Then we can form the skew polynomial ring
\[ R[t_0,t_0^{-1};\sigma] = \sum_{n\in\mathbb{Z}}\oplus\,t_0^nR \]
with multiplication defined as $at_0^n = t_0^n\sigma^n(a)$, for $a\in R$, $n\in\mathbb{Z}$. That is,
\[ e^rt_0^n = q^{nr}t_0^ne^r, \quad\text{for } r\in\Lambda,\ n\in\mathbb{Z}. \tag{4.13} \]
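The multiplication rule (4.13) is easy to experiment with on monomials. The following sketch is illustrative only and makes assumptions not in the text: it takes $\Lambda=\mathbb{Z}$, a sample rational value for $q$ (the constant `Q`), and stores an element $\sum c\,t_0^ne^r$ as a dictionary keyed by the exponent pair `(n, r)`. The function name `mul` is hypothetical.

```python
from fractions import Fraction

Q = Fraction(3, 2)  # assumed sample value of q; any non-zero rational works here

def mul(x, y):
    """Product in R[t0, t0^{-1}; sigma] for Lambda = Z.

    Elements are dicts {(n, r): coeff} standing for sum coeff * t0^n e^r.
    """
    out = {}
    for (n1, r1), c1 in x.items():
        for (n2, r2), c2 in y.items():
            # Moving e^{r1} past t0^{n2} costs q^{n2*r1}, by (4.13):
            # (t0^n1 e^r1)(t0^n2 e^r2) = q^(n2*r1) t0^(n1+n2) e^(r1+r2)
            key = (n1 + n2, r1 + r2)
            out[key] = out.get(key, 0) + c1 * c2 * Q ** (n2 * r1)
    return {k: v for k, v in out.items() if v != 0}

def monomial(n, r):
    return {(n, r): Fraction(1)}
```

For instance, `mul(monomial(0, 2), monomial(3, 0))` and `mul(monomial(3, 0), monomial(0, 2))` differ exactly by the factor `Q ** 6`, which is (4.13) with $r=2$, $n=3$; associativity on monomials can be checked the same way.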
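$R[t_0,t_0^{-1};\sigma]$ is an associative algebra over $\mathbb{C}$. Consider the (associative) tensor product algebra $M_N\otimes_{\mathbb{C}}R[t_0,t_0^{-1};\sigma] = M_N(R[t_0,t_0^{-1};\sigma])$ and write $e_{ij}(a) = e_{ij}\otimes a$, for $a\in R[t_0,t_0^{-1};\sigma]$, $1\le i,j\le N$. Let $gl_N(R[t_0,t_0^{-1};\sigma])$ be the Lie algebra $M_N(R[t_0,t_0^{-1};\sigma])^-$ as usual. Then
\[
\begin{aligned}
[e_{ij}(u),e_{kl}(v)] &= e_{ij}(u)e_{kl}(v)-e_{kl}(v)e_{ij}(u)\\
&= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}e_{il}(uv)-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}e_{kj}(vu),
\end{aligned}
\tag{4.14}
\]
for $u,v\in R[t_0,t_0^{-1};\sigma]$ and $1\le i,j,k,l\le N$. Define $\kappa,\chi: R[t_0,t_0^{-1};\sigma]\to\mathbb{C}$ to be the $\mathbb{C}$-linear functions given by
\[
\kappa(t_0^ne^r) = \begin{cases}1, & \text{if } n=0 \text{ and } r\in\Lambda_0,\\ 0, & \text{otherwise;}\end{cases}
\qquad
\chi(t_0^ne^r) = \begin{cases}1, & \text{if } n=0 \text{ and } r=0,\\ 0, & \text{otherwise.}\end{cases}
\tag{4.15}
\]
Let $d_0$, $d$ be the degree operators on $R[t_0,t_0^{-1};\sigma]$ defined by $d_0(t_0^ne^r) = nt_0^ne^r$, $d(t_0^ne^r) = rt_0^ne^r$ for $n\in\mathbb{Z}$ and $r\in\Lambda$. We form a 2-dimensional central extension of $gl_N(R[t_0,t_0^{-1};\sigma])$,
\[ G'_\Lambda = gl_N(R[t_0,t_0^{-1};\sigma])\oplus\mathbb{C}c_0\oplus\mathbb{C}c \]
with Lie bracket
\[
\begin{aligned}
&[e_{ij}(t_0^{n_1}e^{r_1}),e_{kl}(t_0^{n_2}e^{r_2})]\\
&\quad= e_{ij}(t_0^{n_1}e^{r_1})e_{kl}(t_0^{n_2}e^{r_2})-e_{kl}(t_0^{n_2}e^{r_2})e_{ij}(t_0^{n_1}e^{r_1})\\
&\qquad+\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\kappa((d_0t_0^{n_1}e^{r_1})t_0^{n_2}e^{r_2})c_0+\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\chi((dt_0^{n_1}e^{r_1})t_0^{n_2}e^{r_2})c\\
&\quad= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}q^{n_2r_1}e_{il}(t_0^{n_1+n_2}e^{r_1+r_2})-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}q^{n_1r_2}e_{kj}(t_0^{n_1+n_2}e^{r_1+r_2})\\
&\qquad+n_1\varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)\,\delta_{jk}\delta_{il}q^{n_2r_1}\kappa(t_0^{n_1+n_2}e^{r_1+r_2})c_0+r_1\varepsilon(\epsilon_i-\epsilon_j,\epsilon_j-\epsilon_i)\,\delta_{jk}\delta_{il}q^{n_2r_1}\chi(t_0^{n_1+n_2}e^{r_1+r_2})c,
\end{aligned}
\tag{4.16}
\]
for $r_1,r_2\in\Lambda$, $n_1,n_2\in\mathbb{Z}$, $1\le i,j,k,l\le N$.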
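As a sanity check that the central terms in (4.16) are compatible with the Jacobi identity, one can test the bracket on monomials in the simplest case. The sketch below is only an illustration under assumptions that are not the paper's general setting: it takes $N=1$ (so the cocycle $\varepsilon$ plays no role), $\Lambda=\mathbb{Z}$, and a sample rational $q$ that is not a root of unity, so that $(\Lambda,q)$ is generic and $\kappa=\chi$. All names (`bracket`, `add`, `jacobi_defect`) are hypothetical.

```python
from fractions import Fraction

Q = Fraction(3, 2)  # assumed sample q, not a root of unity, so Lambda_0 = {0}

def add(*terms):
    out = {}
    for t in terms:
        for k, v in t.items():
            out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def bracket(x, y):
    """Bracket (4.16) for N = 1, Lambda = Z.

    Elements are dicts whose keys are (n, r) for t0^n e^r, or 'c0' / 'c'.
    """
    out = {}
    def put(key, val):
        if val != 0:
            out[key] = out.get(key, 0) + val
    for k1, c1 in x.items():
        for k2, c2 in y.items():
            if isinstance(k1, str) or isinstance(k2, str):
                continue  # c0 and c are central
            (n1, r1), (n2, r2) = k1, k2
            n, r = n1 + n2, r1 + r2
            # commutator part: q^(n2 r1) - q^(n1 r2) times t0^n e^r
            put((n, r), c1 * c2 * (Q ** (n2 * r1) - Q ** (n1 * r2)))
            if n == 0 and r == 0:  # kappa and chi are both 1 exactly here
                put('c0', c1 * c2 * n1 * Q ** (n2 * r1))
                put('c', c1 * c2 * r1 * Q ** (n2 * r1))
    return {k: v for k, v in out.items() if v != 0}

def jacobi_defect(x, y, z):
    return add(bracket(x, bracket(y, z)),
               bracket(y, bracket(z, x)),
               bracket(z, bracket(x, y)))

MONOMIALS = [{(n, r): Fraction(1)} for n in (-1, 0, 1) for r in (-1, 0, 1)]
```

With these definitions `bracket({(1, 0): 1}, {(-1, 0): 1})` returns `{'c0': 1}`, the familiar Heisenberg relation, and `jacobi_defect` vanishes on every triple from `MONOMIALS`. This is a spot check on a small grid of exponents, not a proof.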
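Let
\[ G_\Lambda = G'_\Lambda\oplus\mathbb{C}d_0\oplus\mathbb{C}d \tag{4.17} \]
be the semi-direct product of $G'_\Lambda$ and the degree derivations $d_0$, $d$, where $c_0$, $c$ are central elements of $G_\Lambda$. Note that $\mathbb{C}[t_0,t_0^{-1}]$ is a subalgebra of $R[t_0,t_0^{-1};\sigma]$. Correspondingly, the affinization $\tilde{gl}_N$ of $gl_N$ is a subalgebra of $G_\Lambda$. Set
\[ e_{ij}(r,z) = \sum_{n\in\mathbb{Z}}e_{ij}(t_0^ne^r)z^{-n}, \tag{4.18} \]
for $r\in\Lambda$, $1\le i,j\le N$. Then we have

Proposition 4.19. In $G_\Lambda$,
\[
\begin{aligned}
[e_{kk}(t_0^ne^{r_1}),e_{ij}(r_2,z)] &= \delta_{ki}q^{-nr_1}z^ne_{ij}(r_1+r_2,q^{-r_1}z)-\delta_{kj}q^{nr_2}z^ne_{ij}(r_1+r_2,z)\\
&\quad+\delta_{ki}\delta_{kj}\bigl(nq^{-nr_1}\kappa(e^{r_1+r_2})c_0z^n+r_1q^{-nr_1}\chi(e^{r_1+r_2})cz^n\bigr),
\end{aligned}
\tag{4.20}
\]
\[ [d_0,e_{ij}(r,z)] = -De_{ij}(r,z), \quad [d,e_{ij}(r,z)] = re_{ij}(r,z). \tag{4.21} \]
Moreover,
\[
\begin{aligned}
[e_{ij}(r_1,z_1),e_{kl}(r_2,z_2)] &= \varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}e_{il}(r_1+r_2,z_1)\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)\\
&\quad-\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}e_{kj}(r_1+r_2,z_2)\,\delta\Bigl(\frac{z_1}{q^{r_2}z_2}\Bigr)\\
&\quad+\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\kappa(e^{r_1+r_2})(D\delta)\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)c_0\\
&\quad+r_1\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\chi(e^{r_1+r_2})\,\delta\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)c,
\end{aligned}
\tag{4.22}
\]
where $r_1,r_2,r\in\Lambda$, $1\le i,j,k,l\le N$.

Proof. We only check (4.22):
\[
\begin{aligned}
&[e_{ij}(r_1,z_1),e_{kl}(r_2,z_2)] = \sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}[e_{ij}(t_0^{n_1}e^{r_1}),e_{kl}(t_0^{n_2}e^{r_2})]z_1^{-n_1}z_2^{-n_2}\\
&\quad= \sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}e_{il}(t_0^{n_1+n_2}e^{r_1+r_2})q^{n_2r_1}z_1^{-n_1}z_2^{-n_2}\\
&\qquad-\sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}e_{kj}(t_0^{n_1+n_2}e^{r_1+r_2})q^{n_1r_2}z_1^{-n_1}z_2^{-n_2}\\
&\qquad+\sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}n_1\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\kappa(t_0^{n_1+n_2}e^{r_1+r_2})q^{n_2r_1}z_1^{-n_1}z_2^{-n_2}c_0\\
&\qquad+\sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}r_1\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\chi(t_0^{n_1+n_2}e^{r_1+r_2})q^{n_2r_1}z_1^{-n_1}z_2^{-n_2}c\\
&\quad= \sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}e_{il}(t_0^{n_1+n_2}e^{r_1+r_2})z_1^{-n_1-n_2}\Bigl(\frac{q^{r_1}z_1}{z_2}\Bigr)^{n_2}\\
&\qquad-\sum_{n_1\in\mathbb{Z}}\sum_{n_2\in\mathbb{Z}}\varepsilon(\epsilon_k-\epsilon_l,\epsilon_i-\epsilon_j)\,\delta_{il}e_{kj}(t_0^{n_1+n_2}e^{r_1+r_2})z_2^{-n_1-n_2}\Bigl(\frac{q^{r_2}z_2}{z_1}\Bigr)^{n_1}\\
&\qquad+\sum_{n_1\in\mathbb{Z}}n_1\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\kappa(e^{r_1+r_2})\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)^{n_1}c_0\\
&\qquad+r_1\sum_{n_1\in\mathbb{Z}}\varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\chi(e^{r_1+r_2})\Bigl(\frac{z_2}{q^{r_1}z_1}\Bigr)^{n_1}c,
\end{aligned}
\]
which is what we want. □

Consider the subalgebras
\[ gl_N(R[t_0,t_0^{-1};\sigma])^{\wedge} = gl_N(R[t_0,t_0^{-1};\sigma])\oplus\mathbb{C}c_0\oplus\mathbb{C}c \tag{4.23} \]
of $G'_\Lambda$ and
\[ gl_N(R[t_0,t_0^{-1};\sigma])^{\sim} = gl_N(R[t_0,t_0^{-1};\sigma])^{\wedge}\oplus\mathbb{C}d_0 \tag{4.24} \]
of $G_\Lambda$. Comparing Proposition 4.4 with 4.19, one can easily show that the following result holds true.

Theorem 4.25. The linear map $gl_N(R[t_0,t_0^{-1};\sigma])^{\sim}\to V(\Lambda,q)$ given by
\[
\begin{aligned}
&e_{ij}(t_0^ne^r)\mapsto y_{ij}(r,n), \quad\text{for } 1\le i\ne j\le N,\ r\in\Lambda,\ n\in\mathbb{Z};\\
&e_{ii}(t_0^ne^r)\mapsto\begin{cases}\epsilon_i(n), & \text{for } 1\le i\le N,\ r\in\Lambda_0,\ n\in\mathbb{Z},\\ y_{ii}(r,n), & \text{for } 1\le i\le N,\ r\in\Lambda\setminus\Lambda_0,\ n\in\mathbb{Z};\end{cases}\\
&c_0\mapsto1, \quad c\mapsto0, \quad d_0\mapsto d_0
\end{aligned}
\]
is a Lie algebra homomorphism.

To get a module for $G_\Lambda$, we need to assume that $(\Lambda,q)$ is generic. So from now on we suppose that $(\Lambda,q)$ is generic, that is, $\Lambda_0=\{0\}$. Define
\[ W_\Lambda = \mathbb{C}[\Lambda]\otimes_{\mathbb{C}}V_Q, \tag{4.26} \]
and $a\otimes X\in gl(W_\Lambda)$ as
\[ (a\otimes X)(b\otimes w) = ab\otimes Xw \tag{4.27} \]
for $a,b\in\mathbb{C}[\Lambda]$, $X\in V(\Lambda,q)$, $w\in V_Q$. Let $W(\Lambda,q)$ be the linear span of the operators
\[
\begin{aligned}
&e^r\otimes y_{ij}(r,n), \quad 1\le i\ne j\le N,\ n\in\mathbb{Z},\ r\in\Lambda;\\
&e^r\otimes y_{ii}(r,n), \quad 1\le i\le N,\ n\in\mathbb{Z},\ r\in\Lambda\setminus\{0\};\\
&1\otimes\epsilon_i(n), \quad 1\le i\le N,\ n\in\mathbb{Z};\\
&1\otimes1, \quad 1\otimes d_0, \quad d\otimes1.
\end{aligned}
\tag{4.28}
\]
Then it follows from Proposition 4.4 that those operators satisfy the same derived relations from (4.5) through (4.10). Hence $W(\Lambda,q)$ is a Lie subalgebra of $gl(W_\Lambda)$. Now we can state our main theorem.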
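Theorem 4.29. The linear map $\pi: G_\Lambda\to W(\Lambda,q)$ given by
\[
\begin{aligned}
&\pi(e_{ij}(t_0^ne^r)) = e^r\otimes y_{ij}(r,n), \quad\text{for } 1\le i\ne j\le N,\ n\in\mathbb{Z},\ r\in\Lambda;\\
&\pi(e_{ii}(t_0^ne^r)) = \begin{cases}1\otimes\epsilon_i(n), & \text{for } 1\le i\le N,\ n\in\mathbb{Z},\ r=0,\\ e^r\otimes y_{ii}(r,n), & \text{for } 1\le i\le N,\ n\in\mathbb{Z},\ r\in\Lambda\setminus\{0\};\end{cases}\\
&\pi(c_0) = 1\otimes1, \quad \pi(d_0) = 1\otimes d_0, \quad \pi(c) = 0, \quad \pi(d) = d\otimes1,
\end{aligned}
\]
is a Lie algebra homomorphism. If $\Lambda$ is a group, then $W_\Lambda$ is irreducible as a $G_\Lambda$-module.

Proof. It follows from (4.4) and (4.19) that $\pi$ is a Lie algebra homomorphism. Let us check the irreducibility when $\Lambda$ is a group. Let $U$ be a nonzero submodule of
\[ W_\Lambda = \mathbb{C}[\Lambda]\otimes_{\mathbb{C}}V_Q = \mathbb{C}[\Lambda]\otimes_{\mathbb{C}}S(\hat{H}^-)\otimes_{\mathbb{C}}\mathbb{C}[Q]. \tag{4.30} \]
Since the Heisenberg algebra $s = \hat{H}^++\mathbb{C}c_0+\hat{H}^-$ is a subalgebra of $G_\Lambda$, Lemma 9.13 in [K] (or Theorem 1.7.3 in [FLM]) implies that $U$ is completely reducible as an $s$-module and so
\[ U = \sum_{r\in\Lambda}\oplus\,(e^r\otimes S(\hat{H}^-)\otimes\Omega_r) \]
for some subspaces $\Omega_r$ of $\mathbb{C}[Q]$ (thanks to the degree operator $d$). Suppose that $f = \sum_{i=1}^ms_ie^{\gamma_i}\in\Omega_{r_0}$, where $r_0\in\Lambda$, $s_i\in\mathbb{C}$, $s_i\ne0$, $\gamma_i\in Q$, for $1\le i\le m$, and $\gamma_i\ne\gamma_j$ if $i\ne j$. We claim that $e^{\gamma_k}\in\Omega_{r_0}$ for some $k$. Pick $\alpha\in H$ such that $(\alpha,\gamma_{m-1}-\gamma_m)\ne0$. Then
\[ \alpha(0)f-(\alpha,\gamma_m)f = \sum_{i=1}^{m-1}s_i(\alpha,\gamma_i-\gamma_m)e^{\gamma_i} = \sum_{i=1}^{m-1}s_i'e^{\gamma_i}\in\Omega_{r_0}, \]
where $s_i' = s_i(\alpha,\gamma_i-\gamma_m)$, $1\le i\le m-1$. Since $s'_{m-1}\ne0$, we may continue this process and get some $e^{\gamma_k}\in\Omega_{r_0}$. Also, from (3.1), (2.14), (2.15) and (4.3), we have
\[ e^{\epsilon_i-\epsilon_j} = \exp\Bigl(\sum_{n\in-\mathbb{Z}_+}\frac{\epsilon_i(n)-\epsilon_j(n)}{n}z^{-n}\Bigr)Y_{ij}(0,z)\exp\Bigl(\sum_{n\in\mathbb{Z}_+}\frac{\epsilon_i(n)-\epsilon_j(n)}{n}z^{-n}\Bigr)z^{-\epsilon_i+\epsilon_j-1} \]
for $1\le i\ne j\le N$. It then follows that $e^{\gamma_k+Q}\subseteq\Omega_{r_0}$ and so $\Omega_{r_0} = \mathbb{C}[Q]$. For $r\ne r_0$, we have
\[
\begin{aligned}
(e^{r-r_0}\otimes y_{11}(r-r_0,0))(e^{r_0}\otimes S(\hat{H}^-)\otimes\Omega_{r_0}) &= e^r\otimes(y_{11}(r-r_0,0))S(\hat{H}^-)\otimes\mathbb{C}[Q]\\
&\subseteq e^r\otimes S(\hat{H}^-)\otimes\Omega_r,
\end{aligned}
\]
where $(y_{11}(r-r_0,0))S(\hat{H}^-)\ne(0)$ by Lemma 4.12. We thus obtain that $\Omega_r = \mathbb{C}[Q]$ for all $r\in\Lambda$ and so $U = W_\Lambda$. □

Remark 4.31. If $\Lambda=\{0\}$ and $q=1$, the theorem gives a vertex operator representation for $\tilde{gl}_N$ which was given in [F1].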
5. Application to Extended Affine Lie Algebras

Extended affine Lie algebras were first introduced in [H-KT] (under the name of irreducible quasi-simple Lie algebras) and systematically studied in [AABGP] and [BGK]. They can be roughly characterized as complex Lie algebras which have a non-degenerate invariant form, a finite dimensional Cartan subalgebra, a discrete irreducible root system, and the ad-nilpotency of non-isotropic root spaces. This new class of Lie algebras is a higher dimensional generalization of affine Kac–Moody algebras (see [ABGP]). Toroidal Lie algebras (with certain derivations added) are the simplest examples of other extended affine Lie algebras which have been studied by [F2,W,MRY,Y,Sl] and [EF]. Armed with the root graded Lie algebras studied by [BM,BZ,N] and [Se] among others, and some variations of the cyclic (or dihedral) homology, one can expect a complete classification of extended affine Lie algebras of all types as was done in [BGK,BGKN,G1] and [AG].

In this section, we will study representations of certain extended affine Lie algebras of type $A_{N-1}$. We first introduce highest weight modules by using a pentagonal decomposition and then single out one irreducible highest weight module, which is the counterpart of the basic module for affine Kac–Moody algebras. We go on to show that this singled out module is nothing but the Fock space defined as in (4.26).

Let $(\Lambda,q) = (\mathbb{Z},q)$ (or $(\mathbb{Z}+\mathbb{Z}\xi,q)$ with $\mathrm{Im}\,\xi\ne0$). Note that we still assume $(\Lambda,q)$ is generic. Write $e^1=t$ (or $e^1=t_1$, $e^{\xi}=t_2$). Then
\[ R[t_0,t_0^{-1};\sigma] = \mathbb{C}_q[t_0^{\pm1},t^{\pm1}] \quad(\text{or } \mathbb{C}_q[t_0^{\pm1},t_1^{\pm1},t_2^{\pm1}]), \tag{5.1} \]
where $\mathbb{C}_q[t_0^{\pm1},t^{\pm1}]$ (or $\mathbb{C}_q[t_0^{\pm1},t_1^{\pm1},t_2^{\pm1}]$) is the so-called quantum torus (see [M]) defined as the free associative $\mathbb{C}$-algebra with generators $t_0^{\pm1},t^{\pm1}$ (or $t_0^{\pm1},t_1^{\pm1},t_2^{\pm1}$) modulo the relations $t_0t_0^{-1}=t_0^{-1}t_0=tt^{-1}=t^{-1}t=1$ and $tt_0=qt_0t$ (or $t_0t_0^{-1}=t_0^{-1}t_0=t_1t_1^{-1}=t_1^{-1}t_1=t_2t_2^{-1}=t_2^{-1}t_2=1$, $t_1t_0=qt_0t_1$, $t_2t_0=q^{\xi}t_0t_2$, and $t_1t_2=t_2t_1$). We will simply write $\mathbb{C}_q = \mathbb{C}_q[t_0^{\pm1},t^{\pm1}]$ (or $\mathbb{C}_q[t_0^{\pm1},t_1^{\pm1},t_2^{\pm1}]$). One has (see [BGK])
\[ \mathbb{C}_q = Z(\mathbb{C}_q)\oplus[\mathbb{C}_q,\mathbb{C}_q], \tag{5.2} \]
where $Z(\mathbb{C}_q)$ is the center of $\mathbb{C}_q$.

Let $G'_\Lambda$ ($G_\Lambda$ resp.) be defined as in (4.16) ((4.17) resp.). The nondegenerate invariant form on $G_\Lambda$ can be defined as
\[
\begin{aligned}
&(e_{ij}(u),e_{kl}(v)) = \varepsilon(\epsilon_i-\epsilon_j,\epsilon_k-\epsilon_l)\,\delta_{jk}\delta_{il}\,\kappa(uv),\\
&(e_{ij}(u),c_0) = (e_{ij}(u),c) = (e_{ij}(u),d_0) = (e_{ij}(u),d) = 0,\\
&(c_0,d) = (c,d_0) = 0, \quad (c_0,d_0) = (c,d) = 1,
\end{aligned}
\tag{5.3}
\]
for $u,v\in\mathbb{C}_q$, $1\le i,j,k,l\le N$.

Now let $g'_\Lambda$ ($g_\Lambda$ resp.) be the quotient of the Lie algebra $G'_\Lambda$ ($G_\Lambda$ resp.) modulo $\mathbb{C}I_N$, where $I_N$ is the $N\times N$ identity matrix. Actually, $g'_\Lambda$ ($g_\Lambda$ resp.) is a subalgebra of the Lie algebra $G'_\Lambda$ ($G_\Lambda$ resp.) because $I_N$ is in the split center of $G_\Lambda$. The restriction of the form on $g_\Lambda$ is nondegenerate, and $g_\Lambda$ has the Cartan subalgebra
\[ H = h\oplus\mathbb{C}c_0\oplus\mathbb{C}c\oplus\mathbb{C}d_0\oplus\mathbb{C}d, \tag{5.4} \]
772
Y. Gao
where h = Q ⊗C C = Ch1 ⊕ · · · ⊕ ChN −1 ,
(5.5)
and hi = eii − ei+1,i+1 = i − i+1 , 1 ≤ i ≤ N − 1. Define τ0 , τ ∈ H∗ as follows: τ0 |h⊕Cc0 ⊕Cc = 0, τ |h⊕Cc0 ⊕Cc = 0, τ0 (d) = τ (d0 ) = 0, τ0 (d0 ) = τ (d) = 1.
(5.6)
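Then the root system of $\mathfrak{g}_\Lambda$ with respect to $\mathcal{H}$ is
$$R = (\Delta + \mathbb{Z}\tau_0 + \mathbb{Z}\tau) \cup (\mathbb{Z}\tau_0 \oplus \mathbb{Z}\tau) \quad (\text{or } (\Delta + \mathbb{Z}\tau_0 + \mathbb{Z}\tau + \mathbb{Z}\xi\tau) \cup (\mathbb{Z}\tau_0 \oplus \mathbb{Z}\tau \oplus \mathbb{Z}\xi\tau)), \tag{5.7}$$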
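and the root space decomposition is as follows:
$$\mathfrak{g}_\Lambda = \bigoplus_{\alpha \in R} \mathfrak{g}_\alpha,$$
where $\mathfrak{g}_0 = \mathcal{H}$;
$$\mathfrak{g}_{\epsilon_i - \epsilon_j + n\tau_0 + m\tau} = \mathbb{C}e_{ij}(t_0^n t^m), \quad \text{for } 1 \le i \neq j \le N,\ n, m \in \mathbb{Z}$$
(or $\mathfrak{g}_{\epsilon_i - \epsilon_j + n\tau_0 + m_1\tau + m_2\xi\tau} = \mathbb{C}e_{ij}(t_0^n t_1^{m_1} t_2^{m_2})$, for $1 \le i \neq j \le N$, $n, m_1, m_2 \in \mathbb{Z}$);
$$\mathfrak{g}_{n\tau_0 + m\tau} = \bigoplus_{i=1}^{N} \mathbb{C}e_{ii}(t_0^n t^m), \quad \text{for } n, m \in \mathbb{Z} \text{ but } (n, m) \neq (0, 0)$$
(or $\mathfrak{g}_{n\tau_0 + m_1\tau + m_2\xi\tau} = \bigoplus_{i=1}^{N} \mathbb{C}e_{ii}(t_0^n t_1^{m_1} t_2^{m_2})$, for $n, m_1, m_2 \in \mathbb{Z}$ but $(n, m_1, m_2) \neq (0, 0, 0)$).

This Lie algebra $\mathfrak{g}_\Lambda$ is an extended affine Lie algebra. Moreover, $\mathfrak{g}_\Lambda$ is nondegenerate of nullity 2 if $\Lambda = \mathbb{Z}$, and degenerate of nullity 3 if $\Lambda = \mathbb{Z} + \mathbb{Z}\xi$, because in the latter case the isotropic roots still generate only a 2-dimensional complex space (see [G] and [AABGP] for more). Since $I_N$ acts on $V_Q$ (thus on $W_\Lambda$) as zero, by (4.29), we have

Proposition 5.8. For $(\Lambda, q) = (\mathbb{Z}, q)$, or $(\mathbb{Z} + \mathbb{Z}\xi, q)$, $W_\Lambda$ is an irreducible $\mathfrak{g}_\Lambda$-module.

Remark 5.9. Note that $\mathfrak{g}_\Lambda$ is not necessarily tame. We recall the definition of tameness from [AABGP] and [BGK]. The core of $\mathfrak{g}_\Lambda$ is defined to be the subalgebra (actually an ideal) generated by the non-isotropic root spaces $\mathfrak{g}_\alpha$, $\alpha \in \Delta + \mathbb{Z}\tau_0 + \mathbb{Z}\tau$ (or $\Delta + \mathbb{Z}\tau_0 + \mathbb{Z}\tau + \mathbb{Z}\xi\tau$). $\mathfrak{g}_\Lambda$ is said to be tame if and only if the orthogonal complement of the core equals the center of the core. The main point of tameness is that the non-isotropic root spaces have some control over the isotropic ones. This will allow us to classify extended affine Lie algebras.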
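The nullity statement before Proposition 5.8 can be checked by a short computation. The following is only a sketch: it reads "nullity" as the rank of the group generated by the isotropic roots (an interpretation of how the term is used here; see [AABGP] for the precise definition), and the symbol $R^0$ for the set of isotropic roots is introduced for illustration and does not appear in the original.

```latex
% Sketch (assumption: nullity = rank of the group R^0 of isotropic roots).
\begin{align*}
\Lambda = \mathbb{Z}:\quad
  & R^0 = \mathbb{Z}\tau_0 \oplus \mathbb{Z}\tau,
  && \operatorname{rank}_{\mathbb{Z}} R^0 = 2
     = \dim_{\mathbb{C}} \operatorname{span}_{\mathbb{C}} R^0; \\
\Lambda = \mathbb{Z} + \mathbb{Z}\xi:\quad
  & R^0 = \mathbb{Z}\tau_0 \oplus \mathbb{Z}\tau \oplus \mathbb{Z}\xi\tau,
  && \operatorname{rank}_{\mathbb{Z}} R^0 = 3 .
\end{align*}
% The rank is 3 because Im(xi) != 0 makes 1 and xi linearly independent
% over the reals, hence over the integers. But xi*tau is a complex
% multiple of tau, so the complex span does not grow:
\[
\operatorname{span}_{\mathbb{C}} R^0 = \mathbb{C}\tau_0 \oplus \mathbb{C}\tau,
\qquad
\dim_{\mathbb{C}} \operatorname{span}_{\mathbb{C}} R^0 = 2 < 3 .
\]
```

In the first case the rank agrees with the dimension of the complex span (the nondegenerate situation); in the second the rank exceeds it, which is the sense in which the text calls $\mathfrak{g}_\Lambda$ degenerate.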
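Now we suppose for simplicity that $(\Lambda, q) = (\mathbb{Z}, q)$. Using the lexicographical order on $\Gamma = \mathbb{Z}\tau_0 \oplus \mathbb{Z}\tau$ we obtain a pentagonal decomposition:
$$\mathfrak{g}_{\mathbb{Z}} = \mathfrak{g}_- \oplus \mathcal{D}_- \oplus \mathcal{H} \oplus \mathcal{D}_+ \oplus \mathfrak{g}_+, \tag{5.10}$$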
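where $\mathfrak{g}_\pm$ is the subalgebra spanned by
$$\begin{aligned}
& e_{ij}(t_0^n t^m), && 1 \le i \neq j \le N,\ n \in \pm\mathbb{Z}_+,\ m \in \mathbb{Z},\\
& e_{ij}(t^m), && 1 \le i \neq j \le N,\ m \in \pm\mathbb{Z}_+,\\
& e_{ij}, && 1 \le i \lessgtr j \le N,\\
& e_{ii}(t_0^n t^m), && 1 \le i \le N,\ n \in \pm\mathbb{Z}_+,\ m \in \mathbb{Z},\\
& e_{ii}(t^m) - e_{i+1,i+1}(t^m), && 1 \le i \le N-1,\ m \in \pm\mathbb{Z}_+,
\end{aligned} \tag{5.11}$$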
respectively, and $\mathcal{D}_\pm$ is the subalgebra spanned by $I_N(t^m)$, $m \in \pm\mathbb{Z}_+$, respectively. Write
$$\mathcal{D} = \mathcal{D}_- \oplus \mathbb{C}c \oplus \mathcal{D}_+, \tag{5.12}$$
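which is another Heisenberg algebra contained in $\mathfrak{g}_{\mathbb{Z}}$. Furthermore, one can show that
$$[\mathcal{H}, \mathfrak{g}_\pm] \subseteq \mathfrak{g}_\pm, \quad [\mathcal{H}, \mathcal{D}_\pm] \subseteq \mathcal{D}_\pm, \quad [\mathcal{D}, \mathfrak{g}_\pm] = (0). \tag{5.13}$$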
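Denote by $\mathcal{U}(\mathfrak{g})$ the universal enveloping algebra of a Lie algebra $\mathfrak{g}$. For $\lambda \in \mathcal{H}^*$, a $\mathfrak{g}_{\mathbb{Z}}$-module $M$ is said to be a highest weight module of weight $\lambda$ if there exists $v_\lambda \in M$, $v_\lambda \neq 0$, such that
$$\mathfrak{g}_+ v_\lambda = (0), \quad M = \mathcal{U}(\mathfrak{g}_{\mathbb{Z}})v_\lambda, \quad \text{and} \quad h v_\lambda = \lambda(h)v_\lambda \ \text{ for all } h \in \mathcal{H}. \tag{5.14}$$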
Clearly, if a $\mathfrak{g}_{\mathbb{Z}}$-module $M$ is a highest weight module, then $M$ admits a weight space decomposition relative to $\mathcal{H}$. A highest weight module $M$ of weight $\lambda$ is called a Verma module if every highest weight module of weight $\lambda$ is a quotient of $M$. A Verma module $M(\lambda)$ of weight $\lambda$ can be constructed as the quotient of $\mathcal{U}(\mathfrak{g}_{\mathbb{Z}})$ by the left ideal generated by $\mathfrak{g}_+$ and $h - \lambda(h)1$, $h \in \mathcal{H}$. By the usual argument, $M(\lambda)$ has a unique proper maximal submodule $N(\lambda)$. Let
$$L(\lambda) = M(\lambda)/N(\lambda), \tag{5.15}$$
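which is an irreducible $\mathfrak{g}_{\mathbb{Z}}$-module.

Remark 5.16. $L(\lambda)$ may also be constructed as follows. Let $\mathbb{C}v_\lambda$ be the one dimensional $\mathcal{H}$-module with $h v_\lambda = \lambda(h)v_\lambda$, $h \in \mathcal{H}$. Define the $(\mathcal{D} + \mathcal{H})$-module
$$J = \mathcal{U}(\mathcal{D} + \mathcal{H}) \otimes_{\mathcal{U}(\mathcal{H})} v_\lambda = \mathcal{U}(\mathcal{D})/\mathcal{U}(\mathcal{D})(c - \lambda(c)) \otimes v_\lambda.$$
Now define $\mathfrak{g}_+ v_\lambda = (0)$; then $J$ becomes a $(\mathfrak{g}_+ + \mathcal{D} + \mathcal{H})$-module. Let
$$M(\lambda)' = \mathcal{U}(\mathfrak{g}_{\mathbb{Z}}) \otimes_{\mathcal{U}(\mathfrak{g}_+ + \mathcal{D} + \mathcal{H})} J \cong \mathcal{U}(\mathfrak{g}_-) \otimes \mathcal{U}(\mathcal{D})/\mathcal{U}(\mathcal{D})(c - \lambda(c)) \otimes v_\lambda. \tag{5.17}$$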
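$M(\lambda)'$ has a unique proper maximal submodule $N(\lambda)'$. Of course, $L(\lambda) \cong M(\lambda)'/N(\lambda)'$.

There are two known graded simple quotients for $\mathcal{U}(\mathcal{D})/\mathcal{U}(\mathcal{D})(c - \lambda(c))$ as $(\mathcal{D} + \mathbb{C}d)$-modules. One is the case $\lambda(c) \neq 0$. In this case, one has the canonical realization for the Heisenberg algebra $\mathcal{D}$:
$$\rho : \mathcal{U}(\mathcal{D})/\mathcal{U}(\mathcal{D})(c - \lambda(c)) \to \mathcal{U}(\mathcal{D}_-). \tag{5.18}$$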
The other is the case $\lambda(c) = 0$. In this case, one must have
$$\rho : \mathcal{U}(\mathcal{D})/\mathcal{U}(\mathcal{D})c \to \mathbb{C}[t^k, t^{-k}] \tag{5.19}$$
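for some $k \in \mathbb{Z}_{\ge 0}$; see for example [C]. The structure of $L(\lambda)$ for general $\lambda$ will be discussed elsewhere.

Our next business is to show that $L(\omega_0) \cong W_{\mathbb{Z}}$ as $\mathfrak{g}_{\mathbb{Z}}$-modules for $\omega_0$ defined below. Thus the highest weight module $L(\omega_0)$ may be regarded as the basic module for $\mathfrak{g}_{\mathbb{Z}}$. From (4.29), we see that
$$W_{\mathbb{Z}} = \mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} S(\hat{H}^-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$$
is an irreducible $\mathfrak{g}_{\mathbb{Z}}$-module. Moreover, $W_{\mathbb{Z}}$ has a weight space decomposition. To describe it, we define $\omega_0 \in \mathcal{H}^*$ as follows:
$$\omega_0|_{\mathfrak{h} \oplus \mathbb{C}c \oplus \mathbb{C}d_0 \oplus \mathbb{C}d} = 0, \quad \omega_0(c_0) = 1. \tag{5.20}$$
Then we have
$$W_{\mathbb{Z}} = \bigoplus_{\mu \in P(W_{\mathbb{Z}})} W_\mu, \tag{5.21}$$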
where
$$P(W_{\mathbb{Z}}) = \{\omega_0 + \alpha + n\tau_0 + m\tau : \alpha \in Q,\ n \in \mathbb{Z}_{\le 0},\ m \in \mathbb{Z}\} \tag{5.22}$$
is the set of weights. More precisely, we have
$$W_{\omega_0 + \alpha + n\tau_0 + m\tau} = t^m \otimes g \otimes e^{\alpha}, \tag{5.23}$$
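where $g$ is the homogeneous subspace of $S(\hat{H}^-)$ of degree $n + \frac{1}{2}(\alpha, \alpha)$, for $\alpha \in Q$, $n \in \mathbb{Z}_{\le 0}$, $m \in \mathbb{Z}$. It then follows that all weight spaces are finite dimensional. Let
$$\operatorname{ch} W_{\mathbb{Z}} = \sum_{\mu \in P(W_{\mathbb{Z}})} (\dim W_\mu)\, e^{\mu} \in \mathbb{C}[[\mathcal{H}^*]] \tag{5.24}$$
be the formal character for $W_{\mathbb{Z}}$, where $\mathbb{C}[[\mathcal{H}^*]]$ is the completion of the group algebra $\mathbb{C}[\mathcal{H}^*]$. Note that $\dim W_{\omega_0 + \alpha + n\tau_0 + m\tau} = \dim W_{\omega_0 + \alpha + n\tau_0}$ for any $m \in \mathbb{Z}$. Then one may get
$$\operatorname{ch} W_{\mathbb{Z}} = \Big(\sum_{\alpha \in Q} e^{\omega_0 + \alpha - \frac{1}{2}(\alpha, \alpha)\tau_0}\Big)\, \varphi(e^{-\tau_0})^{-N}\, \delta(e^{\tau}), \tag{5.25}$$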
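where
$$\varphi(p) = \prod_{n \in \mathbb{Z}_+} (1 - p^n) \tag{5.26}$$
is the Euler product, and $\delta$ is the $\delta$-function as in (2.29). The character formula (5.25) can be described as the product of the character formula for the affine algebra $\widehat{gl}_N$ and the $\delta$-function. Summarizing the above, we have proved the following result.

Proposition 5.27. $W_{\mathbb{Z}}$ is an irreducible $\mathfrak{g}_{\mathbb{Z}}$-module with induced actions from $\mathcal{G}_{\mathbb{Z}}$ and has the character formula as in (5.25).

In $W_{\mathbb{Z}} = \mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} S(\hat{H}^-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$, let $v_{\omega_0} = 1 \otimes 1 \otimes 1$. Then $h v_{\omega_0} = \omega_0(h) v_{\omega_0}$, for $h \in \mathcal{H}$. From Lemma 4.12 and Theorem 4.29, we have $\mathfrak{g}_+ v_{\omega_0} = (0)$ and
$$I_N(t^m) v_{\omega_0} = \frac{N}{q^{-m/2} - q^{m/2}}\, t^m \otimes 1 \otimes 1, \quad \text{for } m \in \Lambda \setminus \{0\}.$$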
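Therefore we obtain

Theorem 5.28. $L(\omega_0)$ is isomorphic to $W_{\mathbb{Z}}$ as $\mathfrak{g}_{\mathbb{Z}}$-modules and
$$\operatorname{ch} L(\omega_0) = \Big(\sum_{\alpha \in Q} e^{\omega_0 + \alpha - \frac{1}{2}(\alpha, \alpha)\tau_0}\Big)\, \varphi(e^{-\tau_0})^{-N}\, \delta(e^{\tau}).$$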
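One way to see where the product form of (5.25) comes from is to sum (5.24) over each index separately. The following is a sketch under stated assumptions: the degree $-k$ part of $S(\hat{H}^-)$ is taken to have dimension $p_N(k)$, the number of $N$-coloured partitions of $k$ (as the exponent $-N$ in (5.25) suggests), and $\delta(z) = \sum_{m \in \mathbb{Z}} z^m$ is taken to be the formal $\delta$-function of (2.29). The symbols $k$ and $p_N$ are introduced here for illustration only.

```latex
% Sketch: factorising the character sum (5.24) using (5.23).
% Substitute n = -k - (1/2)(alpha,alpha) with k >= 0, so that the
% S(\hat H^-)-factor of the weight space has degree -k.
\begin{align*}
\operatorname{ch} W_{\mathbb{Z}}
 &= \sum_{\alpha \in Q} \sum_{k \ge 0} \sum_{m \in \mathbb{Z}}
    p_N(k)\, e^{\omega_0 + \alpha - \left(k + \frac{1}{2}(\alpha,\alpha)\right)\tau_0 + m\tau} \\
 &= \Big( \sum_{\alpha \in Q} e^{\omega_0 + \alpha - \frac{1}{2}(\alpha,\alpha)\tau_0} \Big)
    \Big( \sum_{k \ge 0} p_N(k)\, e^{-k\tau_0} \Big)
    \Big( \sum_{m \in \mathbb{Z}} e^{m\tau} \Big) \\
 &= \Big( \sum_{\alpha \in Q} e^{\omega_0 + \alpha - \frac{1}{2}(\alpha,\alpha)\tau_0} \Big)\,
    \varphi(e^{-\tau_0})^{-N}\, \delta(e^{\tau}),
\end{align*}
% using the generating function
%   \sum_{k \ge 0} p_N(k) p^k = \prod_{n \ge 1} (1 - p^n)^{-N} = \varphi(p)^{-N}.
```

The last factor reflects the observation made before (5.25) that the dimension of a weight space does not depend on $m$.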
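Remark 5.29. In this case, we see that $\rho$ in (5.19) is given by $\rho : \mathcal{U}(\mathcal{D})/\mathcal{U}(\mathcal{D})c \to \mathbb{C}[t, t^{-1}]$ with
$$\rho(I_N(t^m)) = \frac{N}{q^{-m/2} - q^{m/2}}\, t^m, \quad \text{for } m \in \Lambda \setminus \{0\}.$$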
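The rest of the paper is devoted to tame extended affine Lie algebras.

Let $(\Lambda, q) = (\mathbb{Z}, q)$ (or $(\mathbb{Z} + \mathbb{Z}\xi, q)$ with $\operatorname{Im}\xi \neq 0$). Let
$$sl_N(\mathbb{C}_q) = \{X \in gl_N(\mathbb{C}_q) : \operatorname{tr}(X) \in [\mathbb{C}_q, \mathbb{C}_q]\}$$
be the subalgebra of $gl_N(\mathbb{C}_q)$ which is generated by $e_{ij}(u)$, $u \in \mathbb{C}_q$, $1 \le i \neq j \le N$. Define
$$\mathcal{L}_c(\Lambda) = sl_N(\mathbb{C}_q) \oplus \mathbb{C}c_0 \oplus \mathbb{C}c \tag{5.30}$$
to be the subalgebra of $\mathfrak{g}^0_\Lambda$, and let
$$\mathcal{L}(\Lambda) = \mathcal{L}_c(\Lambda) \oplus \mathbb{C}d_0 \oplus \mathbb{C}d \tag{5.31}$$
be the subalgebra of $\mathfrak{g}_\Lambda$. The restriction of the invariant form to $\mathcal{L}(\Lambda)$ is again nondegenerate. This Lie algebra $\mathcal{L}(\Lambda)$ is a tame extended affine Lie algebra (see [BGK]). It has the same root system $R$ as $\mathfrak{g}_\Lambda$ and the following root space decomposition:
$$\mathcal{L}(\Lambda) = \bigoplus_{\alpha \in R} \mathcal{L}_\alpha,$$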
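where $\mathcal{L}_0 = \mathcal{H}$ is the Cartan subalgebra of $\mathcal{L}(\Lambda)$, $\mathcal{L}_\alpha = \mathfrak{g}_\alpha$ for $\alpha \in \Delta + \mathbb{Z}\tau_0 + \mathbb{Z}\tau$ (or $\Delta + \mathbb{Z}\tau_0 + \mathbb{Z}\tau + \mathbb{Z}\xi\tau$), and
$$\mathcal{L}_{n\tau_0 + m\tau} = \bigoplus_{i=1}^{N-1} \mathbb{C}(e_{ii} - e_{i+1,i+1})(t_0^n t^m) \oplus I_N\big((\mathbb{C}t_0^n t^m) \cap [\mathbb{C}_q, \mathbb{C}_q]\big)$$
(or
$$\mathcal{L}_{n\tau_0 + m_1\tau + m_2\xi\tau} = \bigoplus_{i=1}^{N-1} \mathbb{C}(e_{ii} - e_{i+1,i+1})(t_0^n t_1^{m_1} t_2^{m_2}) \oplus I_N\big((\mathbb{C}t_0^n t_1^{m_1} t_2^{m_2}) \cap [\mathbb{C}_q, \mathbb{C}_q]\big)),$$
for $n, m \in \mathbb{Z}$ and $(n, m) \neq (0, 0)$ (or $n, m_1, m_2 \in \mathbb{Z}$ and $(n, m_1, m_2) \neq (0, 0, 0)$). By taking restriction, we know that $W_\Lambda$ is an $\mathcal{L}(\Lambda)$-module.

Theorem 5.32. $\mathcal{L}(\Lambda)$ is a tame extended affine Lie algebra containing the Heisenberg subalgebra $\mathfrak{s}$, and $W_\Lambda$ is an irreducible $\mathcal{L}(\Lambda)$-module.

Proof. First we have
$$(1 - q^n)t_0^n = (t_0^n t^{-1})t - t(t_0^n t^{-1}) \quad (\text{or } (t_0^n t_1^{-1})t_1 - t_1(t_0^n t_1^{-1})),$$
$$(1 - q^m)t^m = t_0(t_0^{-1} t^m) - (t_0^{-1} t^m)t_0,$$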
(or $(1 - q^{m_1 + m_2\xi})t_1^{m_1} t_2^{m_2} = t_0(t_0^{-1} t_1^{m_1} t_2^{m_2}) - (t_0^{-1} t_1^{m_1} t_2^{m_2})t_0$). It then follows that $t_0^n, t^m$ (or $t_1^{m_1} t_2^{m_2}$) $\in [\mathbb{C}_q, \mathbb{C}_q]$ for $n \in \mathbb{Z} \setminus \{0\}$, $m$ (or $m_1 + m_2\xi$) $\in \Lambda \setminus \{0\}$, because $(\Lambda, q)$ is generic. So $\mathfrak{s} = \hat{H}^+ \oplus \mathbb{C}c_0 \oplus \hat{H}^-$ is a subalgebra of $\mathcal{L}(\Lambda)$, and $e_{ii}(t^m)$ (or $e_{ii}(t_1^{m_1} t_2^{m_2})$) is in $\mathcal{L}(\Lambda)$ for $1 \le i \le N$ and $m$ (or $m_1 + m_2\xi$) $\in \Lambda \setminus \{0\}$. The irreducibility can be shown by going through the proof of Theorem 4.29 and using the operator $y_{11}(r - r_0, 0)$. □

Remark 5.33. Note that if $(\Lambda, q) = (\mathbb{Z}, q)$ is generic, then $\mathcal{L}(\Lambda) = \mathfrak{g}_\Lambda$. But if $(\Lambda, q) = (\mathbb{Z} + \mathbb{Z}\xi, q)$ is generic, then $\mathcal{L}(\Lambda)$ is a proper subalgebra of $\mathfrak{g}_\Lambda$.

References

[AABGP] Allison, B.N., Azam, S., Berman, S., Gao, Y., Pianzola, A.: Extended affine Lie algebras and their root systems. Memoir. Am. Math. Soc. 126, 605 (1997)
[ABGP] Allison, B.N., Berman, S., Gao, Y., Pianzola, A.: A characterization of affine Kac–Moody Lie algebras. Commun. Math. Phys. 185, 671–688 (1997)
[AG] Allison, B.N. and Gao, Y.: The root system and the core of an extended affine Lie algebra. Submitted
[BZ] Benkart, G. and Zelmanov, E.: Lie algebras graded by finite root systems and intersection matrix algebras. Invent. Math. 126, 1–45 (1996)
[BGK] Berman, S., Gao, Y., Krylyuk, Y.: Quantum tori and the structure of elliptic quasi-simple Lie algebras. J. Funct. Anal. 135, 339–389 (1996)
[BGKN] Berman, S., Gao, Y., Krylyuk, Y. and Neher, E.: The alternative torus and the structure of elliptic quasi-simple Lie algebras of type $A_2$. Trans. Am. Math. Soc. 347, 4315–4363 (1995)
[BM] Berman, S. and Moody, R.V.: Lie algebras graded by finite root systems and the intersection matrix algebras of Slodowy. Invent. Math. 108, 323–347 (1992)
[BS] Berman, S. and Szmigielski, J.: Principal realization for extended affine Lie algebra of type $sl_2$ with coordinates in a simple quantum torus with two variables. Preprint
[C] Chari, V.: Integrable representations of affine Lie algebras. Invent. Math. 85, 317–335 (1986)
[EF] Etingof, P. and Frenkel, I.B.: Central extensions of current groups in two dimensions. Commun. Math. Phys. 165, 429–444 (1994)
[F1] Frenkel, I.B.: Representations of affine Lie algebras, Hecke modular forms and Korteweg–De Vries type equations. Lecture Notes in Math. 933, 71–110 (1982)
[F2] Frenkel, I.B.: Representations of Kac–Moody algebras and dual resonance models. Lectures in Appl. Math. 21, 325–353 (1985)
[FJ] Frenkel, I.B. and Jing, N.H.: Vertex representations of quantum affine algebras. Proc. Natl. Acad. Sci. USA 85, 9373–9377 (1988)
[FK] Frenkel, I.B. and Kac, V.G.: Representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
[FLM] Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex Operator Algebras and the Monster. New York: Academic Press, 1989
[G1] Gao, Y.: The degeneracy of extended affine Lie algebras. Manuscripta Math. 97, 233–249 (1998)
[G2] Gao, Y.: Representations of extended affine Lie algebras coordinatized by certain quantum tori. Compositio Mathematica, to appear
[J] Jing, N.H.: Twisted vertex representations of quantum affine algebras. Invent. Math. 102, 663–690 (1990)
[H-KT] Høegh-Krohn, R. and Torresani, B.: Classification and construction of quasi-simple Lie algebras. J. Funct. Anal. 89, 106–136 (1990)
[K] Kac, V.G.: Infinite dimensional Lie algebras. Third edition, Cambridge: Cambridge Univ. Press, 1990
[KKLW] Kac, V.G., Kazhdan, D.A., Lepowsky, J. and Wilson, R.L.: Realization of the basic representations of the Euclidean Lie algebras. Advances in Math. 42, 83–112 (1981)
[LW] Lepowsky, J. and Wilson, R.L.: Construction of the affine Lie algebra $A_1^{(1)}$. Commun. Math. Phys. 62, 43–53 (1978)
[M] Manin, Y.I.: Topics in noncommutative geometry. Princeton, NJ: Princeton University Press, 1991
[MRY] Moody, R.V., Rao, S.E. and Yokonuma, T.: Toroidal Lie algebras and vertex representations. Geom. Ded. 35, 283–307 (1990)
[N] Neher, E.: Lie algebras graded by 3-graded root systems. Am. J. Math. 118, 439–491 (1996)
[S] Segal, G.G.: Unitary representations of some infinite-dimensional groups. Commun. Math. Phys. 80, 301–342 (1981)
[Se] Seligman, G.B.: Rational Methods in Lie Algebras. Lect. Notes in Pure and Applied Math. 17, New York: Marcel Dekker, 1976
[Sl] Slodowy, P.: Beyond Kac–Moody algebras and inside. Can. Math. Soc. Conf. Proc. 5, 361–371 (1986)
[W] Wakimoto, M.: Extended affine Lie algebras and a certain series of Hermitian representations. Preprint (1985)
[Y] Yamada, H.: Extended affine Lie algebras and their vertex representations. Publ. RIMS, Kyoto U. 25, 587–603 (1989)
Communicated by T. Miwa