Commun. Math. Phys. 250, 1–21 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1137-0
Communications in
Mathematical Physics
On the Structure of KMS States of Disordered Systems

Stephen Dias Barreto$^1$, Francesco Fidaleo$^2$

1 Department of Mathematics, Padre Conceicao College of Engineering, Verna Goa 403 722, India. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma Tor Vergata, Via della Ricerca Scientifica 1, 00133 Roma, Italy. E-mail: [email protected]

Received: 18 March 2003 / Accepted: 12 March 2004
Published online: 27 July 2004 – © Springer-Verlag 2004
Abstract: We study the set of KMS states of spin systems with random interactions. This is done in the framework of operator algebras by investigating the Connes and Borchers $\Gamma$–invariants of $W^*$–dynamical systems. In the case of KMS states exhibiting a property of invariance with respect to the spatial translations, some interesting properties emerge naturally. Such a situation covers KMS states obtained by infinite–volume limits of finite–volume Gibbs states, that is, the case of quenched disorder. This analysis can be considered as a step towards fully understanding the very complicated structure of the set of temperature states of quantum spin glasses, and its connection with the breakdown of the symmetry for replicas.
1. Introduction

Traditionally, spin glasses have been studied as spin systems with random interactions. Extensive investigations on the existence of the thermodynamic limit have been made, and the equilibrium statistical mechanics of such systems has been studied, see [6, 8, 16, 27, 28, 30, 42] and the literature cited therein. It is expected that a spin glass exhibits a phase transition at temperatures below a critical temperature $T_c = T_f$, where $T_f$ is the freezing temperature of the spin glass. There is thus a breakdown of ergodicity, and each equilibrium state can be decomposed into its ergodic components with the same free energy, see e.g. [15, 16, 28], see also [41] for a very recent rigorous result. One of the main problems with the spin glass is to characterize and label the ergodic components. Because of the complicated random structure, this becomes a formidable task.

In an attempt to understand this complex behaviour of a quantum spin glass, we study its temperature (i.e. KMS) states. This is done in the framework of operator algebras by modelling the spin glass as a quantum spin system with random interactions. Our analysis applies to the more general situation of disordered spin models, the spin glasses being the most interesting examples among them.
We start by using a standard procedure (see e.g. [13], Sect. IV.6.γ), and define the appropriate $C^*$–algebra of observables as $\mathfrak{A} := \mathcal{A}\otimes L^\infty(\Omega,\mu)$. Here, $\mathcal{A} := \overline{\bigotimes_{\mathbb{Z}^d} M_n(\mathbb{C})}^{C^*}$ is the algebra of observables of a spin system on the lattice $\mathbb{Z}^d$, whereas, in order to take into account the disorder, the probability space $(\Omega,\mu)$ is the sample space for the coupling constants, the latter being random variables on it. Further, we also assume that the space translations act ergodically on $(\Omega,\mu)$ as measure preserving transformations. In this way, the time evolution and the spatial translations can be appropriately defined on the whole algebra $\mathfrak{A}$ as mutually commuting groups of automorphisms, see below.

It is well–known that the set of temperature states of such a system is a very intricate one. Roughly speaking, the presence of multiple phases is connected to the existence of some temperature state $\varphi$ such that the GNS representation $\pi_\varphi$ generates a von Neumann algebra with non–trivial centre $\mathfrak{Z}_{\pi_\varphi}$.$^1$ However, we always have $\pi_\varphi(I\otimes L^\infty(\Omega,\mu))'' \subset \mathfrak{Z}_{\pi_\varphi}$ for the temperature states under consideration. Thus, in order to understand the structure of KMS states of a disordered system, it seems natural to decompose any such state with respect to the image $\pi_\varphi(I\otimes L^\infty(\Omega,\mu))''$ of the common algebra $L^\infty(\Omega,\mu)$ as a first step. This allows us to investigate the Borchers invariant $\Gamma_B$ of the von Neumann algebra $\pi_\varphi(\mathfrak{A})''$ generated by the GNS representation of the KMS state $\varphi$.$^2$

Our main results can be summarized as follows. We start by proving a result which is of interest in itself (Theorem 4.4). Namely, consider a $C^*$–algebra $\mathfrak{B}$ equipped with a sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ of ∗–automorphisms, and a state $\varphi$ left invariant by the $\alpha_n$. If the GNS representation $\pi_\varphi$ of $\varphi$ acts on a separable Hilbert space, the support–projection of $\varphi$ in the bidual $\mathfrak{B}^{**}$ is central and the sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ is asymptotically Abelian with respect to $\varphi$, then the type $\mathrm{III}_0$ component is absent in $\pi_\varphi(\mathfrak{B})''$.
Coming back to our set–up, we consider locally normal KMS states $\varphi$ on $\mathfrak{A}$, invariant with respect to the spatial translations.$^3$ Such a situation naturally arises for KMS states obtained by infinite–volume limits of finite–volume Gibbs states, that is, it covers quenched disorder. For such states, the Borchers invariant $\Gamma_B(\pi_\varphi(\mathfrak{A})'')$ coincides, up to a normalization depending on the temperature, with the Arveson spectrum $\mathrm{sp}(\tau^\omega)$ of the random evolution $\{\tau_t^\omega\}_{t\in\mathbb{R}}$ at fixed $\omega\in\Omega$, the latter being almost surely independent of $\omega$, see [4]. Further, let the centre $\mathfrak{Z}_{\pi_\varphi}$ of the GNS representation of $\varphi$ coincide with $L^\infty(\Omega,\mu)$. This situation can be considered the natural substitute of “pure thermodynamical phase” in the disordered case. In this case, we conclude that there exists a unique $\lambda\in(0,1]$ such that the GNS representation $\pi_\varphi$ of $\varphi$ is made of type $\mathrm{III}_\lambda$ factors almost surely. Such a result is in accordance with the standard fact that the physically relevant quantities do not depend on the disorder.

Possible applications are presented by an example discussed in some detail in Sect. 5. We end with Sect. 6 containing some general considerations on the KMS boundary condition for quantum disordered spin systems, taking into account known results relative to the classical case. The paper is supplemented by some appendices containing technical results relative to the direct integral decomposition of unbounded selfadjoint operators, the explicit description of their spectra, and finally some useful properties of the Connes and Borchers $\Gamma$–invariants of $W^*$–dynamical systems.

1 For the Gelfand–Naimark–Segal (GNS for short) construction see [39], Sect. I.9.
2 The Borchers invariant $\Gamma_B$ is the natural generalization of the Connes invariant to $W^*$–dynamical systems with non–trivial centres, see [7, 11, 32].
3 In our setting, “locally normal” simply means that $\varphi$ is normal when restricted to $I\otimes L^\infty(\Omega,\mu)$, see Sect. 2.
2. Preliminaries

In the present paper we deal only with von Neumann algebras with separable preduals unless specified otherwise. Besides, all representations of $C^*$–algebras are understood to be on separable Hilbert spaces. We start with the usual spin algebra

$$\mathcal{A} := \overline{\bigotimes_{j\in\mathbb{Z}^d} M_j}^{\,C^*},$$

where $M_i = M_n(\mathbb{C})$ for every site $i\in\mathbb{Z}^d$, and the $C^*$–completion is defined w.r.t. the unique $C^*$–cross norm. We consider also a standard measure space $(\Omega,\mu)$ based on a compact separable space $\Omega$, and a Borel probability measure $\mu$. The group $\mathbb{Z}^d$ of the spatial translations is supposed to act on the probability space $(\Omega,\mu)$ by measure preserving ergodic transformations $\{T_x\}_{x\in\mathbb{Z}^d}$. A one–parameter random group of automorphisms

$$(t,\omega)\in\mathbb{R}\times\Omega \mapsto \tau_t^\omega\in\mathrm{Aut}(\mathcal{A}) \tag{2.1}$$

is acting on $\mathcal{A}$. It is supposed to be strongly continuous in the time variable for each fixed $\omega\in\Omega$, and jointly strongly measurable. Consider, for a localized element $A$, the strongly measurable function $f_{A,t}(\omega) := \tau_t^\omega(A)$. We get

$$\|f_{A,t}\|_{L^\infty(\Omega,\mu;\mathcal{A})} \equiv \operatorname*{esssup}_{\omega\in\Omega}\|\tau_t^\omega(A)\|_{\mathcal{A}} = \|A\|_{\mathcal{A}},$$

where the last equality follows as $\tau_t^\omega$ is isometric. We assume further that $\tau$ acts locally. Namely, if $A$ is a localized element of $\mathcal{A}$, then the function $f_{A,t}\in L^\infty(\Omega,\mu;\mathcal{A})$ belongs to the $C^*$–subalgebra $\mathcal{A}\otimes L^\infty(\Omega,\mu)$.$^4$ The group $\{\alpha_x\}_{x\in\mathbb{Z}^d}$ of spatial translations acts in a natural way on $\mathcal{A}$ by shifting the observables on the lattice. Finally, we assume the commutation rule

$$\tau_t^{T_x\omega}\,\alpha_x = \alpha_x\,\tau_t^\omega \tag{2.2}$$
for each $x\in\mathbb{Z}^d$, $\omega\in\Omega$, and $t\in\mathbb{R}$. Such a picture arises in a natural way in the study of disordered systems (see e.g. [5, 13]), and more precisely when one considers spin systems with random Hamiltonians. The latter is the natural framework of spin glasses, see [6, 15, 16, 27] and the references cited therein.

For example, one could start from a net $\{H_\Lambda(\omega)\}_{\Lambda\subset\mathbb{Z}^d}$, the $\Lambda$ being finite subsets of $\mathbb{Z}^d$, of local random Hamiltonians, which is made up of $(\mathcal{A}_\Lambda)_{\mathrm{s.a.}}$–valued measurable functions arising from finite–range interactions, and satisfying the equivariance condition

$$H_{\Lambda+x}(\omega) = \alpha_x\big(H_\Lambda(T_{-x}\omega)\big). \tag{2.3}$$

If this is the case, we assume that the one–parameter random automorphisms group (2.1) will arise from the infinite–volume limit of the corresponding ones relative to the local random Hamiltonians satisfying (2.3), see [10], Sect. 6.2, and [4], Sect. 3. In order to avoid trivial situations, we suppose that the random automorphisms group (2.1) is non–trivial. Notice that all the conditions listed above are automatically satisfied in many cases of interest, see for example the pivotal model treated in Sect. 5.

4 This fact would follow from ergodicity, for random systems arising from finite–range interactions described in Sects. 3 and 4 of [4], provided that the action of $\mathbb{Z}^d$ on the sample space $(\Omega,\mu)$ is ergodic as described above. It is trivially satisfied for almost all (quantum versions of) models present in the literature (see e.g. [6, 8, 27, 28, 30, 42]) such as that in Sect. 5, taking into account that the addenda in (5.1) commute with each other.

The following result concerns the almost sure independence of the Arveson spectrum for random systems. This is precisely Theorem 5.3 of [4]. For the reader’s convenience, we report here such a result by filling a slight gap in the original proof.$^5$

Theorem 2.1. Under the above assumptions, there exists a measurable set $F\subset\Omega$ of full measure, and a closed set $\Sigma\subset\mathbb{R}$ such that $\omega\in F$ implies $\mathrm{sp}(\tau^\omega) = \Sigma$.

Proof. We get by [32], Proposition 8.1.9,

$$\mathrm{sp}(\tau^\omega) = \bigcap_{f\in L^1(\mathbb{R})}\big\{s\in\mathbb{R} \,:\, |\hat f(s)| \le \|\tau_f^\omega\|\big\},$$

where “ˆ” stands for the (inverse) Fourier transform, and

$$\tau_f^\omega(A) := \int_{-\infty}^{+\infty} f(t)\,\tau_t^\omega(A)\,dt,$$

the integral being understood in the Bochner sense. By a standard density argument, we can reduce the situation to a dense set $\{f_k\}_{k\in\mathbb{N}}\subset L^1(\mathbb{R})$. Define $\Gamma_k(\omega) := \|\tau_{f_k}^\omega\|$. It was shown in [4] that the functions $\Gamma_k$ are measurable and invariant. By ergodicity, they are constant almost everywhere. Let $\{N_k\}_{k\in\mathbb{N}}$ be null subsets of $\Omega$ such that, for each $k\in\mathbb{N}$ and $\omega\in N_k^c$, $\Gamma_k(\omega) = \|\Gamma_k\|_\infty$. Consider $F := \bigcap_{k\in\mathbb{N}} N_k^c$, and take $\Sigma := \mathrm{sp}(\tau^{\omega_0})$, where $\omega_0$ is any element of $F$. As an immediate consequence of this, we have that $F$ is a measurable set of full measure, and $\omega\in F$ implies $\mathrm{sp}(\tau^\omega) = \Sigma$.
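In order to study the temperature states of such a disordered system, we define

$$\mathfrak{A} := \mathcal{A}\otimes L^\infty(\Omega,\mu),$$

where the above $C^*$–tensor product is uniquely determined as any commutative $C^*$–algebra is nuclear. The following local structure ([9]) $\{\mathfrak{A}_\Lambda \,|\, \Lambda \text{ finite subsets of } \mathbb{Z}^d\}$ of $\mathfrak{A}$ is inherited from the corresponding one of $\mathcal{A}$ by defining, for finite subsets $\Lambda$ of $\mathbb{Z}^d$,

$$\mathfrak{A}_\Lambda = \mathcal{A}_\Lambda\otimes L^\infty(\Omega,\mu).$$

Notice that, by identifying $\mathfrak{A}$ with a closed subspace of $L^\infty(\Omega,\mu;\mathcal{A})$, each element $A\in\mathfrak{A}$ is uniquely represented by a measurable essentially bounded function $\omega\mapsto A(\omega)$ with values in $\mathcal{A}$. The group $\mathbb{Z}^d$ of all the space translations is naturally acting on the $C^*$–algebra $\mathfrak{A}$ as

$$\mathfrak{a}_x(A)(\omega) := \alpha_x\big(A(T_{-x}\omega)\big). \tag{2.4}$$

Further, define on $\bigcup_\Lambda \mathfrak{A}_\Lambda$, $\Lambda$ bounded,

$$\mathfrak{t}_t(A)(\omega) := \tau_t^\omega\big(A(\omega)\big). \tag{2.5}$$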
5 Compare with the analogous result ([23], Théorème III.1) concerning the spectrum of a one–dimensional random discretized Schrödinger operator.
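Proposition 2.2. Under the above assumptions, $\{\mathfrak{t}_t\}_{t\in\mathbb{R}}$ extends to a one–parameter group of automorphisms of $\mathfrak{A}$ satisfying

$$\mathfrak{t}_t\,\mathfrak{a}_x = \mathfrak{a}_x\,\mathfrak{t}_t, \qquad t\in\mathbb{R},\ x\in\mathbb{Z}^d. \tag{2.6}$$

Proof. Let $A$ be a localized element of $\mathfrak{A}$. Then,

$$A(\omega) = \sum_k a_k(\omega)A_k,$$

where the above sum is finite, the $a_k$ are complex valued measurable essentially bounded functions, and $\{A_k\}\subset\mathcal{A}_\Lambda$, for some bounded region $\Lambda\subset\mathbb{Z}^d$. By (2.5), we get

$$\mathfrak{t}_t(A)(\omega) = \sum_k a_k(\omega)\,\tau_t^\omega(A_k).$$

Under the assumptions listed above, the functions $\omega\mapsto\tau_t^\omega(A_k)$ are measurable essentially bounded functions in $\mathcal{A}\otimes L^\infty(\Omega,\mu)\equiv\mathfrak{A}$. Hence, $\mathfrak{t}_t : \bigcup_\Lambda\mathfrak{A}_\Lambda\to\mathfrak{A}$. We compute

$$\|\mathfrak{t}_t(A)\|_{\mathfrak{A}} \equiv \operatorname*{esssup}_{\omega\in\Omega}\|\tau_t^\omega(A(\omega))\|_{\mathcal{A}} = \operatorname*{esssup}_{\omega\in\Omega}\|A(\omega)\|_{\mathcal{A}} \equiv \|A\|_{\mathfrak{A}}.$$

Namely, $\mathfrak{t}_t$ is a ∗–preserving isometric homomorphism of the dense set $\bigcup_\Lambda\mathfrak{A}_\Lambda$ of $\mathfrak{A}$ into $\mathfrak{A}$, with inverse $\mathfrak{t}_{-t}$. Further, $\mathfrak{t}_0 = \mathrm{id}$ and $\mathfrak{t}_{t_1+t_2} = \mathfrak{t}_{t_1}\mathfrak{t}_{t_2}$. Therefore, $\{\mathfrak{t}_t\}_{t\in\mathbb{R}}$ extends to a one–parameter group of automorphisms of all of $\mathfrak{A}$. Taking into account (2.2), it readily follows that $\{\mathfrak{t}_t\}_{t\in\mathbb{R}}$ and $\{\mathfrak{a}_x\}_{x\in\mathbb{Z}^d}$ commute with each other.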
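The commutation rule (2.2), which drives the commutation of time evolution and spatial translations, can be checked by hand in a toy model. The following sketch is not taken from the paper: it assumes a periodic chain of N sites in place of the lattice, a non-interacting random-field Hamiltonian H(ω) = −Σ_j ω_j σ_z(j), and the convention (T_x ω)_j = ω_{j−x}, which is compatible with the equivariance condition (2.3). All function names are hypothetical.

```python
import cmath

N = 5  # number of sites on the periodic chain (assumption: Z^d replaced by Z_N)

def shift_config(omega, x):
    """(T_x omega)_j = omega_{j-x}: translate the disorder configuration by x."""
    return tuple(omega[(j - x) % N] for j in range(N))

def alpha(x, obs):
    """Spatial translation alpha_x of a one-site observable (site, 2x2 matrix)."""
    site, mat = obs
    return ((site + x) % N, mat)

def tau(t, omega, obs):
    """Random time evolution for H(omega) = -sum_j omega_j sigma_z(j).

    On site j the Hamiltonian is diag(-omega_j, +omega_j), hence
    tau_t(B)_{ab} = exp(i t (E_a - E_b)) B_{ab}.
    """
    site, mat = obs
    energies = (-omega[site], omega[site])
    evolved = tuple(
        tuple(cmath.exp(1j * t * (energies[a] - energies[b])) * mat[a][b]
              for b in range(2))
        for a in range(2)
    )
    return (site, evolved)

def close(obs1, obs2, tol=1e-12):
    """Compare two one-site observables entrywise."""
    (s1, m1), (s2, m2) = obs1, obs2
    return s1 == s2 and all(
        abs(m1[a][b] - m2[a][b]) < tol for a in range(2) for b in range(2)
    )

omega = (0.3, -1.2, 0.7, 2.0, -0.5)          # one sample of the disorder
B = (2, ((1.0, 2.0 - 1.0j), (0.5j, -3.0)))   # arbitrary matrix sitting at site 2
t, x = 0.8, 3

lhs = tau(t, shift_config(omega, x), alpha(x, B))  # tau_t^{T_x omega} alpha_x
rhs = alpha(x, tau(t, omega, B))                   # alpha_x tau_t^omega
commutes = close(lhs, rhs)
```

In this toy setting the identity holds because translating the observable and translating the disorder configuration move the relevant coupling together; the infinite-volume, interacting case treated in the text requires the assumptions listed above.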
In order to take into account the effects of the disorder, it is natural to start by considering $\mathfrak{A}$ as describing the algebra of observables on which the space translations, described by $\{\mathfrak{a}_x\}_{x\in\mathbb{Z}^d}$, as well as the time evolution given by $\{\mathfrak{t}_t\}_{t\in\mathbb{R}}$ act, see [5, 13]. Notice that $\mathfrak{A}$ contains copies $\mathcal{A}\otimes I$ and $I\otimes L^\infty(\Omega,\mu)$ of $\mathcal{A}$ and $L^\infty(\Omega,\mu)$ respectively, denoted by an abuse of notation, also by $\mathcal{A}$ and $L^\infty(\Omega,\mu)$.

Let $\pi$ be a representation of $\mathfrak{A}$. We easily get $\pi(L^\infty(\Omega,\mu))''\subset\mathfrak{Z}_\pi$. Suppose that $\pi$ is normal when restricted to $L^\infty(\Omega,\mu)$. The last condition simply means that $\pi$ is locally normal as each local algebra $\mathcal{A}_\Lambda$ is finite–dimensional, see [38]. In such a situation, there exists a measure $\nu$ on $\Omega$, absolutely continuous w.r.t. $\mu$, such that

$$\pi(L^\infty(\Omega,\mu))'' \sim L^\infty(\Omega,\nu). \tag{2.7}$$

Equivalently, we can find a measurable set $E_\nu\subset\Omega$ (precisely the “support” of $\nu$) uniquely determined by $\nu$ modulo null sets, such that one can choose for $\nu$

$$\nu(E) = \mu(E\cap E_\nu).$$

As $\pi$ is locally normal, we have also

$$\pi(\mathfrak{A})'' = \pi(\mathcal{A}\otimes C(\Omega))'',$$

that is, $\pi(\mathcal{A}\otimes C(\Omega))$ is a weakly dense separable $C^*$–subalgebra of $\pi(\mathfrak{A})''$.
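We can consider the subcentral decomposition of the restriction of $\pi$ to the separable $C^*$–subalgebra $\mathcal{A}\otimes C(\Omega)$, w.r.t. $\pi(L^\infty(\Omega,\mu))''\equiv\pi(C(\Omega))''$, see [39], Theorem IV 8.25. We obtain

$$\pi = \int_\Omega^\oplus \pi_\omega\,\mu(d\omega) \tag{2.8}$$

on

$$H_\pi = \int_\Omega^\oplus H_\omega\,\mu(d\omega),$$

where, for $\omega\in E_\nu^c$, the complement of the support $E_\nu$ of $\nu$, $\pi_\omega$ is the trivial representation on the trivial Hilbert space $H_\omega\equiv\{0\}$. The measurable field $\{\pi_\omega\}_{\omega\in\Omega}$ of representations of $\mathcal{A}\otimes C(\Omega)$ is uniquely determined by its restriction to $\mathcal{A}$. This follows from the fact that for each $A\in\mathcal{A}$ and $f\in L^\infty(\Omega,\mu)$, we have

$$\pi(A\otimes f) = \int_\Omega^\oplus f(\omega)\,\pi_\omega(A\otimes I)\,\mu(d\omega).$$

Now, if

$$M := \pi(\mathfrak{A})'' = \int_\Omega^\oplus M_\omega\,\nu(d\omega),$$

then for almost all $\omega\in\Omega$,

$$M_\omega = \pi_\omega(\mathcal{A}\otimes L^\infty(\Omega,\mu))'' \equiv \pi_\omega(\mathcal{A}\otimes I)''.$$

If $\varphi$ is a locally normal state on $\mathfrak{A}$, then there exists ([39], Prop. IV.8.34) a ∗–weak measurable field $\{\varphi_\omega\}_{\omega\in\Omega}$ of positive forms on $\mathcal{A}$ such that, for each $A\in\mathfrak{A}$,

$$\varphi(A) = \int_\Omega \varphi_\omega(A(\omega))\,\mu(d\omega), \tag{2.9}$$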
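the function $\omega\mapsto A(\omega)$ being the representative of $A$ in $L^\infty(\Omega,\mu;\mathcal{A})$. Consider the GNS representation $\pi_\varphi$ relative to $\varphi$. It is straightforward to check that, for almost all $\omega\in\Omega$, $\pi_{\varphi_\omega}$ is unitarily equivalent to the restriction of $\pi_\omega$ to $\mathcal{A}\otimes I\sim\mathcal{A}$, where $\pi_{\varphi_\omega}$ is the GNS representation of $\varphi_\omega$, and $\pi_\omega$ is the representation occurring in (2.8) relative to the decomposition of $\pi_\varphi$.

Finally, we show that $\mathfrak{A}$ is asymptotically Abelian w.r.t. spatial translations under any locally normal state.

Proposition 2.3. Let $\varphi$ be a locally normal state on $\mathfrak{A}$. Then, for each $A, B\in\mathfrak{A}$ we have

$$\lim_{|x|\to+\infty}\varphi\big([\mathfrak{a}_x(A),B]^*[\mathfrak{a}_x(A),B]\big) = 0.$$

Proof. As the elementary tensors are total in norm, we prove the assertion for such a total set. Let $f, g\in L^\infty(\Omega,\mu)$ and $A, B\in\mathcal{A}$. Taking into account

$$\int_\Omega\|\varphi_\omega\|\,\mu(d\omega) = \varphi(1) = 1,$$

we obtain by (2.4) and (2.9),

$$\varphi\big([\mathfrak{a}_x(A\otimes f),B\otimes g]^*[\mathfrak{a}_x(A\otimes f),B\otimes g]\big) = \int_\Omega |f(T_{-x}\omega)g(\omega)|^2\,\varphi_\omega\big([\alpha_x(A),B]^*[\alpha_x(A),B]\big)\,\mu(d\omega)$$
$$\le \|f\|_\infty^2\,\|g\|_\infty^2\int_\Omega\|\varphi_\omega\|\,\mu(d\omega)\;\|[\alpha_x(A),B]\|^2 \longrightarrow 0,$$

as $\mathcal{A}$ is asymptotically Abelian in norm w.r.t. the spatial translations.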
For most of the models considered in the literature, the time evolution (2.1) exhibits better regularity properties. For example, it could be jointly continuous. In the latter situation, $\{\mathfrak{t}_t\}_{t\in\mathbb{R}}$ given by (2.5) leaves $\mathcal{A}\otimes C(\Omega)$ globally stable. Further, the space $\Omega$ describing the disorder could have a natural local structure. As the disorder is kept fixed during the analysis (we should treat only states $\varphi$ such that $\mu_{\varphi\lceil_{I\otimes C(\Omega)}}\prec\mu$), we have chosen $\mathcal{A}\otimes L^\infty(\Omega,\mu)$ instead of $\mathcal{A}\otimes C(\Omega)$ as the algebra of observables. We also have disregarded the possible local structure of the disorder. In this way, the information about the disorder is directly encoded into the algebra of observables, yet we can treat more general models.$^6$
3. General Properties of Temperature States

In the present section we study some general properties of KMS states on the algebra of observables. We start by recalling the definition of the KMS boundary condition.

Let $\hat{\mathcal{D}}$ be the space consisting of the (analytic continuation of the) Fourier transforms of functions in the class $\mathcal{D}$, the latter being the space made of all infinitely often differentiable compactly supported functions in $\mathbb{R}$. A state $\varphi$ on the $C^*$–algebra $\mathfrak{B}$ satisfies the KMS boundary condition at inverse temperature $\beta\in\mathbb{R}\backslash\{0\}$, w.r.t. the group of automorphisms $\{\tau_t\}_{t\in\mathbb{R}}$ if
(i) $t\mapsto\varphi(A\tau_t(B))$ is a continuous function for every $A, B\in\mathfrak{B}$,
(ii) $\int\varphi(A\tau_t(B))f(t-i\beta)\,dt = \int\varphi(\tau_t(B)A)f(t)\,dt$ whenever $f\in\hat{\mathcal{D}}$.

For the equivalent characterizations of the KMS boundary condition, the main results about KMS states, and finally the connections with the Tomita theory of von Neumann algebras, see [10, 18, 34, 35, 40] and the references cited therein.

Let $\varphi$ be a KMS state on $\mathfrak{B}$ w.r.t. $\{\tau_t\}_{t\in\mathbb{R}}$ at inverse temperature $\beta$, both of which are kept fixed during the analysis. Consider all positive selfadjoint operators $T\,\eta\,\mathfrak{Z}_{\pi_\varphi}$ such that $\Omega_\varphi\in\mathcal{D}_{T^{1/2}}$ and $\|T^{1/2}\Omega_\varphi\| = 1$.$^7$ Consider the state $\varphi_T$ on $\mathfrak{B}$ given by

$$\varphi_T := \big(\pi_\varphi(\,\cdot\,)T^{1/2}\Omega_\varphi,\,T^{1/2}\Omega_\varphi\big). \tag{3.1}$$
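The KMS boundary condition recalled above can be made concrete in finite dimensions. The sketch below is an illustration under stated assumptions, not part of the paper: it takes a diagonal Hamiltonian H on C^3, the Gibbs state ϕ(A) = Tr(e^{−βH}A)/Tr(e^{−βH}), and τ_z(B) = e^{izH}Be^{−izH} extended to complex z. Under these assumptions, shifting the integration contour in condition (ii) gives the pointwise identity ϕ(Aτ_{t+iβ}(B)) = ϕ(τ_t(B)A), which the code checks numerically. The helper names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def gibbs_state(energies, beta):
    """Return phi(A) = Tr(exp(-beta H) A) / Tr(exp(-beta H)) for H = diag(energies)."""
    weights = [math.exp(-beta * e) for e in energies]
    partition = sum(weights)
    def phi(a):
        return sum(w * a[i][i] for i, w in enumerate(weights)) / partition
    return phi

def evolve(energies, z, b):
    """tau_z(B) = exp(izH) B exp(-izH) for complex z and H = diag(energies)."""
    n = len(energies)
    return [[cmath.exp(1j * z * (energies[i] - energies[j])) * b[i][j]
             for j in range(n)] for i in range(n)]

energies = [0.0, 1.3, 2.1]   # spectrum of the toy Hamiltonian (assumption)
beta = 0.7                   # inverse temperature
phi = gibbs_state(energies, beta)

# Two arbitrary, non-commuting observables.
A = [[1, 2j, 0.5], [0, -1, 1 + 1j], [3, 0.2, 2]]
B = [[0.3, 1, 1j], [2, 1, -1], [0.5j, 4, -2]]

t = 0.4
lhs = phi(matmul(A, evolve(energies, t + 1j * beta, B)))  # phi(A tau_{t+i beta}(B))
rhs = phi(matmul(evolve(energies, t, B), A))              # phi(tau_t(B) A)
kms_defect = abs(lhs - rhs)
```

The quantity `kms_defect` should vanish up to rounding for the Gibbs state; replacing the Gibbs weights by weights at a different temperature would in general break the identity.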
6 Consider the following modification of the model described by (5.1), whose formal Hamiltonian is given by
H =−
1 f (Ji,j )(i − j )σ (i)σ (j ) , 2 d
(2.10)
i,j ∈Z
where f is a real valued bounded Borel non-continuous function on K, and Ω, µ are given in (5.2). Then, the model is essentially the same as that in Sect. 5, but the time evolution associated to (2.10) does not preserve the continuous A–valued functions. One can reduce oneself to the situation of Sect. 5 by taking the pull–back measure by f. In this way, the C*–algebra of observables changes drastically, without affecting the substance of the model described by (2.10). Furthermore, if we consider any very simple disordered spin system on Z² subjected to an external random two–value magnetic field described by a chessboard configuration (i.e. Ω consisting of two points), the disorder variables have no local structure.
7 For a closed operator T with polar decomposition T = V|T|, being affiliated to a von Neumann algebra M means that both V and (1 + T*T)⁻¹ belong to M. In such a situation we write T η M.
Proposition 3.1. There is a one–to–one correspondence $T\mapsto\varphi_T$ between
(i) the set $\{T\}$ of positive operators affiliated to the centre $\mathfrak{Z}_{\pi_\varphi}$ such that $\Omega_\varphi\in\mathcal{D}_{T^{1/2}}$ and $\|T^{1/2}\Omega_\varphi\| = 1$,
(ii) the set $\{\varphi_T\}$ of KMS states normal w.r.t. $\varphi$.$^8$

Proof. By rescaling, we can suppose that $\beta = 1$. Put $M := \pi_\varphi(\mathfrak{B})''$. Let $T$ be as above, then $\varphi_T$ defined by (3.1) is a KMS state on $\mathfrak{B}$ by Prop. 5.3.29 of [10]. Consider the cyclic projection $P\in M'$ onto the subspace $[MT^{1/2}\Omega_\varphi]$. Then $P\in\mathfrak{Z}_{\pi_\varphi}$ as it coincides with the central support of $T$. The induction map $\rho(X) = PX\lceil_{[MT^{1/2}\Omega_\varphi]}$ realizes the required map of $\pi_\varphi(\mathfrak{B})'' = M$ onto $\pi_{\varphi_T}(\mathfrak{B})'' = M_P$. Namely, $\varphi_T$ is normal w.r.t. $\varphi$.

Suppose now that $\psi$ is a KMS state on $\mathfrak{B}$ normal w.r.t. $\varphi$. Let $\rho$ be the normal map of $\pi_\varphi(\mathfrak{B})''$ onto $\pi_\psi(\mathfrak{B})''$ such that, for each $A\in\mathfrak{B}$, $\rho(\pi_\varphi(A)) = \pi_\psi(A)$. Let now $f\in\hat{\mathcal{D}}$, and $X = \pi_\varphi(A)$, $Y = \pi_\varphi(B)$. We get

$$\int\big(\rho(X\sigma_t^\varphi(Y))\Omega_\psi,\Omega_\psi\big)f(t-i)\,dt \equiv \int\psi(A\tau_t(B))f(t-i)\,dt = \int\psi(\tau_t(B)A)f(t)\,dt \equiv \int\big(\rho(\sigma_t^\varphi(Y)X)\Omega_\psi,\Omega_\psi\big)f(t)\,dt,$$

where $\sigma^\varphi$ is the modular group relative to the extension $(\,\cdot\,\Omega_\varphi,\Omega_\varphi)$ of $\varphi$ to all of $M$ (which we are also denoting by $\varphi$). As $\pi_\varphi(\mathfrak{B})$ is dense in $M$ w.r.t. the strong operator topology, we conclude by a standard density argument that $(\rho(\,\cdot\,)\Omega_\psi,\Omega_\psi)$ is also a KMS state on $M$ w.r.t. $\sigma^\varphi$. Then, by [10], Prop. 5.3.29, there exists a positive operator $T$ as above such that $\psi = \varphi_T$. The uniqueness follows again by Prop. 5.3.29 of [10].

8 Recall that a representation $\pi_1$ of the $C^*$–algebra $\mathfrak{B}$ is normal w.r.t. another representation $\pi_2$ if there exists a normal ∗–epimorphism $\rho : \pi_2(\mathfrak{B})''\to\pi_1(\mathfrak{B})''$ such that $\rho\circ\pi_2 = \pi_1$. The state $\varphi\in S(\mathfrak{B})$ is normal w.r.t. $\psi\in S(\mathfrak{B})$ if the GNS representation $\pi_\varphi$ is normal w.r.t. $\pi_\psi$.

Now we specialize the above analysis to the case under consideration.

Proposition 3.2. Let $\varphi$ be a locally normal $\mathfrak{t}$–KMS state on $\mathfrak{A}$ at inverse temperature $\beta$. Consider the field $\{\varphi_\omega\}_{\omega\in\Omega}$ of positive forms on $\mathcal{A}$ in (2.9) arising from the subcentral decomposition w.r.t. $\pi_\varphi(L^\infty(\Omega,\mu))''$. Then, for almost all $\omega\in\Omega$, the form $\varphi_\omega$ is $\tau^\omega$–KMS at the same inverse temperature $\beta$.

Proof. As $\mathfrak{t}_t$ does not leave in general $\mathcal{A}\otimes C(\Omega)$ globally stable, we cannot directly apply Theorem A.13 of [37] to the situation under consideration. So we give a direct proof of the result. Namely, let $\varphi$ be a KMS state and $f\in\hat{\mathcal{D}}$. Taking into account (2.9), we get by the Fubini Theorem,

$$\int_\Omega\mu(d\omega)\,h(\omega)\int dt\,\big[\varphi_\omega(A\tau_t^\omega(B))f(t-i\beta) - \varphi_\omega(\tau_t^\omega(B)A)f(t)\big] = 0$$

for each $h\in L^\infty(\Omega,\mu)$, and $A, B\in\mathcal{A}$.$^9$ Then, for each $A, B\in\mathcal{A}$ there exists a measurable set $\Omega_{A,B}\subset\Omega$ of full measure such that on $\Omega_{A,B}$,

$$\int\varphi_\omega(A\tau_t^\omega(B))f(t-i\beta)\,dt = \int\varphi_\omega(\tau_t^\omega(B)A)f(t)\,dt. \tag{3.2}$$

We choose a countable dense set $\mathcal{A}_0\subset\mathcal{A}$. Then (3.2) is satisfied for each $A, B\in\mathcal{A}_0$, on the set $\Omega_0 := \bigcap_{A,B\in\mathcal{A}_0}\Omega_{A,B}\subset\Omega$ of full measure. Fix $A, B\in\mathcal{A}$ and choose sequences $\{A_n\}$, $\{B_n\}$ in $\mathcal{A}_0$, converging to $A$, $B$ respectively. We get by the Lebesgue Dominated Convergence Theorem,

$$\int\varphi_\omega(A\tau_t^\omega(B))f(t-i\beta)\,dt = \lim_n\int\varphi_\omega(A_n\tau_t^\omega(B_n))f(t-i\beta)\,dt = \lim_n\int\varphi_\omega(\tau_t^\omega(B_n)A_n)f(t)\,dt = \int\varphi_\omega(\tau_t^\omega(B)A)f(t)\,dt,$$

which is the KMS boundary condition for all $\varphi_\omega$ such that $\omega\in\Omega_0$.

9 Notice that $\varphi_\omega$ could be trivial on a set of positive measure. This does not affect the proof as we will recover that $\varphi_\omega$ should be the trivial KMS functional on such a set.
In what follows, we give a description of KMS states which are normal w.r.t. a fixed locally normal KMS state in terms of fields of positive operators. Let $\varphi_i$, $i = 1, 2$, be KMS states on $\mathfrak{A}$, normal w.r.t. a fixed locally normal KMS state $\varphi$. Consider the subcentral decomposition (2.8) of the GNS representation $\pi_\varphi$ w.r.t. $L^\infty(\Omega,\mu)$. By Proposition 3.1, Proposition 3.2 and Proposition A.2, there exist measurable fields $\{T_i(\omega)\}_{\omega\in\Omega}$, $i = 1, 2$, of positive operators affiliated to $\mathfrak{Z}_{\pi_{\varphi_\omega}}$. Suppose that $E_i$, $i = 1, 2$, is the complement of the measurable set consisting of all $\omega\in\Omega$ for which $T_i(\omega) = 0$. It is straightforward to verify that
(i) $\pi_{\varphi_1}\prec\pi_{\varphi_2}\Rightarrow\nu(E_1\backslash E_2) = 0$,
(ii) $\pi_{\varphi_1}\,\overset{|}{\circ}\,\pi_{\varphi_2}\Rightarrow\nu(E_1\cap E_2) = 0$,
where $\prec$ and $\overset{|}{\circ}$ mean the usual containment and disjointness of representations respectively, and $\nu$ is the measure appearing in (2.7). The reverse implications can be obtained by looking at each fibre.

4. Invariant Temperature States and Their Spectral Properties

In the present section we study the structure of states on $\mathfrak{A}$ which are invariant w.r.t. the action $\{\mathfrak{a}_x\}_{x\in\mathbb{Z}^d}$ of spatial translations (invariant for short). Roughly speaking, the last property means that the state under consideration satisfies the invariance property only “in average”.$^{10}$ Then we specialize our analysis to the case of KMS states. The latter enjoy some nice structural properties we are going to analyze.

10 Such an invariant state $\varphi$ on $\mathfrak{A}$ is described by a class of states $\{\varphi_\omega\}_{\omega\in\Omega}$ on the spin algebra $\mathcal{A}$ exhibiting a natural equivariance property, see Theorem 4.1.

We start by summarizing some basic properties of locally normal states (not necessarily KMS) invariant w.r.t. spatial translations.

Theorem 4.1. Let $\varphi$ be a locally normal invariant state on $\mathfrak{A}$. Consider the decomposition appearing in (2.9). Then
(i) $\varphi_\omega\circ\alpha_x = \varphi_{T_{-x}\omega}$ for all $x\in\mathbb{Z}^d$,
(ii) $\varphi_\omega(I) = 1$,
(iii) $\mathfrak{Z}_{\pi_{\varphi_\omega}}\cong\mathfrak{Z}_{\pi_{\varphi_{T_x\omega}}}$ for all $x\in\mathbb{Z}^d$,
where the above equalities, as well as the unitary equivalence, are satisfied almost everywhere.

Proof. For each locally normal invariant state $\varphi$ on $\mathfrak{A}$ and each $A\in\mathfrak{A}$, we obtain by (2.4), (2.9), and after an elementary change of variable,

$$\varphi\circ\mathfrak{a}_x(A) = \int_\Omega\varphi_\omega\big(\alpha_x(A(T_{-x}\omega))\big)\mu(d\omega) = \int_\Omega\varphi_{T_x\omega}\big(\alpha_x(A(\omega))\big)\mu(d\omega) \equiv \int_\Omega\varphi_{T_x\omega}\circ\alpha_x\big(A(\omega)\big)\mu(d\omega) = \int_\Omega\varphi_\omega\big(A(\omega)\big)\mu(d\omega),$$

where the last equality is obtained as $\varphi$ is supposed to be invariant. Namely, by Theorem IV.8.34 of [39], $\varphi_\omega\circ\alpha_x = \varphi_{T_{-x}\omega}$ almost surely for each fixed $x\in\mathbb{Z}^d$. By identifying the GNS triplet $(\pi_{\varphi_\omega\circ\alpha_x}, H_{\varphi_\omega\circ\alpha_x}, \Omega_{\varphi_\omega\circ\alpha_x})$ relative to $\varphi_\omega\circ\alpha_x$ with $(\pi_{\varphi_\omega}\circ\alpha_x, H_{\varphi_\omega}, \Omega_{\varphi_\omega})$, we have also $\pi_{\varphi_\omega}\circ\alpha_x\cong\pi_{\varphi_{T_{-x}\omega}}$ almost surely. As $\mathbb{Z}^d$ is countable, (i) and (iii) are satisfied for all $x\in\mathbb{Z}^d$ simultaneously, on a common measurable set of full measure. By (i), it is easy to show that the measurable function $\omega\mapsto\varphi_\omega(I)$ is an invariant one, then (ii) follows by ergodicity.
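The equivariance property ϕ_ω ∘ α_x = ϕ_{T_{−x}ω} can be seen explicitly in a simple case. The sketch below rests on assumptions that are not in the paper: a periodic chain of N sites in place of the lattice, the non-interacting random-field Hamiltonian H(ω) = −Σ_j ω_j σ_z(j), whose Gibbs state has ϕ_ω(σ_z(j)) = tanh(βω_j), and the convention (T_x ω)_j = ω_{j−x}. The function names are hypothetical.

```python
import math

N = 6       # periodic chain Z_N standing in for Z^d (assumption)
BETA = 0.9  # inverse temperature

def shift_config(omega, x):
    """(T_x omega)_j = omega_{j-x}: translate the disorder configuration by x."""
    return tuple(omega[(j - x) % N] for j in range(N))

def gibbs_magnetization(omega, site):
    """phi_omega(sigma_z(site)) for H(omega) = -sum_j omega_j sigma_z(j)."""
    return math.tanh(BETA * omega[site % N])

def translate_site(x, site):
    """alpha_x moves sigma_z(site) to sigma_z(site + x)."""
    return (site + x) % N

omega = (0.4, -1.1, 0.9, 0.0, 2.3, -0.6)  # one sample of the disorder

def equivariance_defect(x, site):
    """| phi_omega(alpha_x(sigma_z(site))) - phi_{T_{-x} omega}(sigma_z(site)) |."""
    lhs = gibbs_magnetization(omega, translate_site(x, site))
    rhs = gibbs_magnetization(shift_config(omega, -x), site)
    return abs(lhs - rhs)

max_defect = max(equivariance_defect(x, j)
                 for x in range(-N, N) for j in range(N))
```

Each single ϕ_ω is not translation invariant here (the magnetization profile follows the random field); only the field of states as a whole is equivariant, which is the sense in which invariance holds "in average".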
Consider for $\alpha\in\{\infty\}\cup\{1, 2, \dots\}\cup\{\lambda_\infty, \lambda_0, \lambda_1, \dots\}$, the Abelian von Neumann algebras $L^\infty(E_\alpha,\nu_\alpha)$ defined as follows. For $\alpha\in\{\infty\}\cup\{1, 2, \dots\}$, $(E_n,\nu_n)$ is the countable set $E_n = n$ of cardinality $n$, equipped with the counting measure $\nu_n$ (the symbol $\infty$ corresponds to the denumerable cardinality). For $\alpha = \lambda_n$, $(E_{\lambda_n},\nu_{\lambda_n})$ is the disjoint union $[0,1]\cup n$ equipped with the measure $\nu_{\lambda_n}$ made of the Lebesgue measure $\lambda$ on $[0,1]$, and the counting measure on $n$ (the value $n = 0$ corresponds to $L^\infty([0,1],\lambda)$). As a corollary to Theorem 4.1, we have

Corollary 4.2. Let $\varphi$ be as in Theorem 4.1. Then there exists a unique $\alpha\in\{\infty\}\cup\{1, 2, \dots\}\cup\{\lambda_\infty, \lambda_0, \lambda_1, \dots\}$ such that $\mathfrak{Z}_{\pi_{\varphi_\omega}}\sim L^\infty(E_\alpha,\nu_\alpha)$ almost surely.

Proof. Consider the sets $\Omega_\alpha\subset\Omega$ consisting of the $\omega$ such that $\mathfrak{Z}_{\pi_{\varphi_\omega}}\sim L^\infty(E_\alpha,\nu_\alpha)$. Such sets are measurable ([31]) and give a disjoint countable partition of $\Omega$. Moreover, each $\Omega_\alpha$ is invariant w.r.t. the action of spatial translations by (iii) of the cited theorem. The assertion follows by ergodicity.
We cannot conclude that $\mathfrak{Z}_{\pi_{\varphi_\omega}}$ is almost surely of a unique multiplicity class.$^{11}$ However, for locally normal invariant KMS states we have that $\mathfrak{Z}_{\pi_{\varphi_\omega}}$ is almost surely of infinite multiplicity, see Theorem 4.6.

We report a lemma which is crucial for the proof of the results which follow. Consider a sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ of ∗–automorphisms of a $C^*$–algebra $\mathfrak{B}$, together with a state $\varphi$ on $\mathfrak{B}$ which is invariant for $\{\alpha_n\}$. Let $\{U_n\}$ be the covariant implementation of $\{\alpha_n\}$ relative to the GNS triplet $(\pi_\varphi, H_\varphi, \Omega_\varphi)$ corresponding to $\varphi$. Denote by $\tilde\alpha_n := \mathrm{ad}\,U_n$ the corresponding automorphism on $M := \pi_\varphi(\mathfrak{B})''$.

11 Notice that there are uncountably many Abelian von Neumann algebras acting on separable Hilbert spaces, up to unitary equivalence.
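Lemma 4.3. Suppose that $\Omega_\varphi$ is separating for $\pi_\varphi(\mathfrak{B})''$.$^{12}$ Then

$$\lim_{n\to\infty}\varphi\big([\alpha_n(A),B]^*[\alpha_n(A),B]\big) = 0$$

for every $A, B\in\mathfrak{B}$ implies

$$\lim_{n\to\infty}\|[\tilde\alpha_n(X),Y]\xi\| = 0 \tag{4.1}$$

for every $X, Y\in\pi_\varphi(\mathfrak{B})''$ and $\xi\in H_\varphi$.

Proof. As $\Omega\equiv\Omega_\varphi$ is cyclic for $M'$, it is enough to show that $n\to\infty$ implies $\|[\tilde\alpha_n(X),Y]\Omega\|\to 0$. Let $\varepsilon > 0$ and $X, Y\in M_1$ be fixed. Then there exist $X', Y'\in M'\backslash\{0\}$ such that

$$\|(X - X')\Omega\| < \varepsilon, \qquad \|(Y - Y')\Omega\| < \varepsilon.$$

Further, we can find by the Kaplansky Density Theorem, $A, B\in\mathfrak{B}$ with $\|\pi_\varphi(A)\|\le 1$, $\|\pi_\varphi(B)\|\le 1$, such that

$$\|(X - \pi_\varphi(A))\Omega\| < \big(1\wedge 1/\|Y'\|\big)\varepsilon, \qquad \|(Y - \pi_\varphi(B))\Omega\| < \big(1\wedge 1/\|X'\|\big)\varepsilon.$$

We get

$$\big\|\big([X,\tilde\alpha_n(Y)] - \pi_\varphi([A,\alpha_n(B)])\big)\Omega\big\| \le \big\|\big(X\tilde\alpha_n(Y) - \pi_\varphi(A)\tilde\alpha_n(\pi_\varphi(B))\big)\Omega\big\| + \big\|\big(\tilde\alpha_n(Y)X - \tilde\alpha_n(\pi_\varphi(B))\pi_\varphi(A)\big)\Omega\big\|.$$

As both the terms of the r.h.s. of the above inequality are estimated in the same way, we consider only the first one. We get

$$\big\|\big(X\tilde\alpha_n(Y) - \pi_\varphi(A)\tilde\alpha_n(\pi_\varphi(B))\big)\Omega\big\| \le \big\|\tilde\alpha_n^{-1}(X - \pi_\varphi(A))(Y - Y')\Omega\big\| + \big\|Y'\tilde\alpha_n^{-1}(X - \pi_\varphi(A))\Omega\big\| + \big\|\pi_\varphi(A)\big(\tilde\alpha_n(Y - \pi_\varphi(B))\big)\Omega\big\|$$

[…] 0 w.r.t. the time evolution associated to the Hamiltonian (5.1), see e.g. [10].

We analyze in some detail the uniqueness case.$^{16}$ Namely, suppose that for a fixed $\beta > 0$, the Ising type model under consideration admits a unique KMS state, say $\varphi_\omega$, almost surely.

Lemma 5.1. Under the above assumptions, the map $\omega\in\Omega\mapsto\varphi_\omega\in S(\mathcal{A})$ is ∗–weak measurable. Further, it satisfies almost surely the condition of equivariance

$$\varphi_\omega\circ\alpha_x = \varphi_{T_{-x}\omega} \tag{5.3}$$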
w.r.t. the spatial translations, simultaneously.

Proof. Consider on $\mathcal{A}\otimes C(\Omega)$ the states $\varphi_\Lambda$, $\Lambda\subset\mathbb{Z}^d$ finite, which are given by extensions to all of $\mathcal{A}\otimes C(\Omega)$ of the states corresponding to the finite–volume Gibbs ensemble (with any chosen boundary condition) associated to the random Hamiltonian (5.1). By compactness, the net $\{\varphi_\Lambda\}$ has a subsequence converging to a state $\varphi$. The latter is a KMS state on $\mathcal{A}\otimes C(\Omega)$ w.r.t. the random time evolution as that given in (2.1), determined by (5.1), such that the measure corresponding to $\varphi\lceil_{I\otimes C(\Omega)}$ is precisely $\mu$. Further, it is invariant w.r.t. the space translations $\{\mathfrak{a}_x\}_{x\in\mathbb{Z}^d}$ given on $\mathcal{A}\otimes C(\Omega)$ as in (2.4). Looking at each fibre, by Proposition 3.2 and Theorem 4.1 we have a decomposition of $\varphi$ into a direct integral of states which, by uniqueness, coincides with the $\varphi_\omega$ almost everywhere. Then, the map $\omega\mapsto\varphi_\omega$ is measurable and satisfies (5.3) almost surely w.r.t. $\mu$.
15 The analysis presented in this section applies mutatis mutandis to models described by more general local Hamiltonians $\{H_\Lambda\}_{\Lambda\subset\mathbb{Z}^d}$, coming from finite–range interactions as outlined in Sect. 2.
16 The hypotheses of Lemma 5.1 are satisfied if the quantum model under consideration admits some critical temperature. The situation is well clarified for many classical disordered models (see e.g. [28]), contrary to the quantum situation where, to the knowledge of the authors, there are few rigorous results concerning this point. However, it is expected also that quantum disordered systems exhibit critical temperatures at the high temperature regime.
From now on, we restrict ourselves to the situation where there is only one KMS state $\varphi_\omega$ on $\mathcal{A}$ at inverse temperature $\beta > 0$, which is kept fixed during the analysis. In this case, the structure of the locally normal KMS states on $\mathfrak{A}\equiv\mathcal{A}\otimes L^\infty(\Omega,\mu)$ is well understood.$^{17}$ To this end, we have the following

Proposition 5.2. For the model under consideration, under the above assumptions there is a one–to–one correspondence $f\mapsto\varphi_f$ between positive normalized equivalence classes of $L^1$–functions on the sample space $(\Omega,\mu)$ and locally normal KMS states on $\mathfrak{A}$ at inverse temperature $\beta$, where the state $\varphi_f$ is given by

$$\varphi_f(A) = \int_\Omega f(\omega)\,\varphi_\omega(A(\omega))\,\mu(d\omega), \qquad A\in\mathfrak{A}, \tag{5.4}$$

and $\varphi_\omega$ is, almost surely, the unique KMS state on $\mathcal{A}$ at the same $\beta$.

Proof. Taking into account Lemma 5.1, the field $\{\varphi_\omega\}$ is measurable, then the state given in (5.4) is well–defined, and gives by construction a locally normal KMS state at inverse temperature $\beta$. Further, different functions give rise to different states. Now, let us fix a locally normal KMS state $\varphi$ on $\mathfrak{A}$. By Proposition 3.2, there exists a measurable field $\{\psi_\omega\}$ of KMS forms on $\mathcal{A}$ such that

$$\varphi(A) = \int_\Omega\psi_\omega(A(\omega))\,\mu(d\omega).$$

By uniqueness of the KMS state $\varphi_\omega$, we obtain $\psi_\omega = \psi_\omega(I)\varphi_\omega$ almost surely. The function $f$ we are looking for is precisely $f(\omega) := \psi_\omega(I)$, so that $\varphi = \varphi_f$. This concludes the proof.
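The correspondence f ↦ ϕ_f of Proposition 5.2 becomes elementary when the sample space is finite. The following sketch is only an illustration under assumptions not made in the paper: a two-point sample space with uniform measure, a single site with Hamiltonian H(ω) = −J(ω)σ_z, and observables restricted to diagonal matrices, represented as pairs (a_up, a_down). The names `phi_omega`, `phi_f` and `density_of` are hypothetical.

```python
import math

BETA = 1.0
MU = {"weak": 0.5, "strong": 0.5}        # probability measure on the sample space
COUPLING = {"weak": 0.2, "strong": 1.5}  # random coupling J(omega)

def phi_omega(omega, a):
    """Gibbs expectation of a diagonal observable a = (a_up, a_down)
    for the one-site Hamiltonian H(omega) = -J(omega) sigma_z."""
    j = COUPLING[omega]
    w_up, w_down = math.exp(BETA * j), math.exp(-BETA * j)
    return (w_up * a[0] + w_down * a[1]) / (w_up + w_down)

def phi_f(f, observable):
    """phi_f(A) = sum_omega mu(omega) f(omega) phi_omega(A(omega)), cf. (5.4)."""
    return sum(MU[w] * f[w] * phi_omega(w, observable[w]) for w in MU)

def density_of(state):
    """Recover f from the values of the state on I (x) indicator of {omega}."""
    recovered = {}
    for w in MU:
        indicator = {v: ((1.0, 1.0) if v == w else (0.0, 0.0)) for v in MU}
        recovered[w] = state(indicator) / MU[w]
    return recovered

f = {"weak": 0.4, "strong": 1.6}  # positive, with integral 1 against MU
identity = {w: (1.0, 1.0) for w in MU}
total_mass = phi_f(f, identity)                      # normalization of phi_f
recovered_f = density_of(lambda obs: phi_f(f, obs))  # injectivity of f -> phi_f
```

Evaluating the state on the disorder variables alone returns the density f, which is the finite analogue of reading off f(ω) = ψ_ω(I) in the proof.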
In a situation such as the one just described above, there is a unique locally normal KMS state $\varphi$ on $\mathfrak{A}$ which is translation invariant. Further, there exists a unique $\lambda > 0$ such that $\varphi$ is a direct integral of $\mathrm{III}_\lambda$ factors almost surely. This immediately leads us to the following

Proposition 5.3. Under the above assumptions, the following assertions hold true for the model under consideration.
(i) There exists a unique locally normal $\mathfrak{a}$–invariant KMS state $\varphi$ on $\mathfrak{A}$ at inverse temperature $\beta$. Such a state is given by

$$\varphi(A) = \int_\Omega\varphi_\omega(A(\omega))\,\mu(d\omega), \qquad A\in\mathfrak{A},$$

where $\varphi_\omega$ is, almost surely, the unique KMS state on $\mathcal{A}$ at the same $\beta$.
(ii) The von Neumann algebra $\pi_\varphi(\mathfrak{A})''$ generated by the GNS representation associated with the unique locally normal translation invariant KMS state $\varphi$ on $\mathfrak{A}$ is a direct integral of $\mathrm{III}_\lambda$ factors for a fixed $\lambda\in(0,1]$.

Proof. The first part follows by ergodicity (Theorem 4.1), whereas the latter readily follows from Corollary 4.7.

17 Most of the forthcoming analysis also applies to the case when $\mathfrak{Z}_{\pi_\varphi}\sim L^\infty(\Omega,\mu)$, see Corollary 4.7.

Consider the inverse temperature $\beta > 0$ such that the model under consideration admits a unique $\beta$–KMS state on $\mathcal{A}$ w.r.t. the time evolution associated to the Hamiltonian (5.1) at fixed values of couplings $J_{ij}$, almost surely. For such $\beta$ (or more generally in the situation described in Corollary 4.7), the fact that $\pi_{\varphi_\omega}(\mathcal{A})''$ is almost surely a type $\mathrm{III}_\lambda$ factor for a fixed $\lambda\equiv\lambda_\beta\in(0,1]$ cannot be directly proven by using the standard results in [2, 33]. This is because the $\varphi_\omega$ do not satisfy any natural invariance condition.

We end the section by briefly describing what happens in the “multiple phase” regime. After taking the infinite–volume limit along various subsequences $\Lambda_{n_k}\uparrow\mathbb{Z}^d$, we will find, in general different, locally normal translation invariant $\mathfrak{t}$–KMS states on $\mathfrak{A}$ at fixed inverse temperature $\beta$. Fix one such state $\varphi$. Then, one recovers a ∗–weak measurable field $\{\varphi_\omega\}_{\omega\in\Omega}\subset S(\mathcal{A})$ of $\tau^\omega$–KMS states satisfying the equivariance property (5.3). According to Proposition 3.1, the $\mathfrak{t}$–KMS states $\varphi_T\in S(\mathfrak{A})$, locally normal w.r.t. $\varphi$, have the form

$$\varphi_T(A) = \int_\Omega\big(\pi_{\varphi_\omega}(A(\omega))T(\omega)^{1/2}\Omega_{\varphi_\omega},\,T(\omega)^{1/2}\Omega_{\varphi_\omega}\big)_{H_{\varphi_\omega}}\mu(d\omega).$$

Here, $(\pi_{\varphi_\omega}, H_{\varphi_\omega}, \Omega_{\varphi_\omega})$ is the GNS representation of $\varphi_\omega$, $\{T(\omega)\}_{\omega\in\Omega}$ is a measurable field of closed densely defined operators on $H_{\varphi_\omega}$ affiliated to the (isomorphic) centres $\mathfrak{Z}_{\pi_{\varphi_\omega}}$ respectively, satisfying $\Omega_{\varphi_\omega}\in\mathcal{D}_{T(\omega)^{1/2}}$ almost surely, and

$$\int_\Omega\big\|T(\omega)^{1/2}\Omega_{\varphi_\omega}\big\|^2_{H_{\varphi_\omega}}\mu(d\omega) = 1.$$

We refer the reader to Sect. 3 for further details.
6. Comments

The present section is devoted to some comments relative to temperature states of quantum disordered spin systems, taking into account known results for the classical situation. Collecting together the results of Proposition 3.2 and Theorem 4.1, we recover for KMS states of $\mathfrak{A}$ all (the generalizations to the quantum case of) the properties which characterize the metastates. Metastates naturally arise when one considers thermodynamic limits of finite–volume Gibbs states on the skew space of joint configurations of spins and couplings, which exist by compactness, that is for quenched disorder.¹⁸ The measure on such a skew space satisfies the following properties. Its marginal distribution of the couplings is the given probability measure describing the disorder, whereas the conditional distribution of the spins, given the couplings, is some infinite–volume Gibbs state almost surely. Such a field of infinite–volume Gibbs states satisfies an equivariance property as in Theorem 4.1, that is, it gives an Aizenman–Wehr metastate, see [1], Sect. 5. For recent results on metastates, we refer the reader to [28–30] and the references cited therein. It can happen that the measure so obtained is not jointly Gibbsian, see [14, 22]. Therefore, the KMS boundary condition for the algebra $\mathfrak{A}$ generated by spin and disorder variables w.r.t. the natural time evolution $\{\mathfrak{t}_t\}_{t\in\mathbb{R}}$ allows one to recover quantum metastates after direct integral decomposition, the latter being the quantum counterpart of the classical procedure of conditioning w.r.t. the disorder variables. The phenomenon that infinite–volume limits of finite–volume Gibbs states might be non-Gibbsian can be partially explained in a purely mathematical setting as follows.
Without any specific mention of disordered systems, a quite similar approach was adopted in [21] (see also [3]) in order to study the statistical mechanics of semi–quantum lattice systems, that is systems whose quasi–local algebras of observables have non–trivial centres, thus including commutative systems. Many of the disordered systems treated in this paper fall into the framework of [21, 3], provided that the algebra generated by the disorder variables also has a local structure made of (commutative) finite–dimensional algebras.¹⁹ The last condition is satisfied for the model in Sect. 5 under the additional condition $|K| < +\infty$, $K$ given in (5.2). In such a situation, $\mathfrak{A} := \mathcal{A} \otimes C(\Omega)$ and the states $\varphi$ under consideration are those for which the measure on $\Omega$ determined by the restriction $\varphi\lceil_{I \otimes C(\Omega)}$ is absolutely continuous w.r.t. $\mu$.

In order to describe semi–quantum systems, an alternative approach was pursued in [21]. Namely, the algebra $\mathfrak{A}$ is embedded in a canonical way, as a subalgebra, into a quasi–local simple C∗–algebra $\mathfrak{F}$. Further, a Umegaki conditional expectation $\varepsilon : \mathfrak{F} \to \mathfrak{A}$ is constructed. In such a way, (invariant) states $\varphi$ on $\mathfrak{A}$ are lifted by $\varepsilon$ to (invariant) states $\psi_\varphi := \varphi \circ \varepsilon$ on $\mathfrak{F}$. It was shown that the Gibbs condition for the state $\psi_\varphi$ on $\mathfrak{F}$, which turns out to be equivalent to the KMS condition for $\psi_\varphi$, is equivalent to the Gibbs condition for the invariant state $\varphi$ on $\mathfrak{A}$. Obviously, the KMS condition for invariant states $\varphi$ on $\mathfrak{A}$ is weaker than the corresponding condition for lifted states $\psi_\varphi$ on $\mathfrak{F}$, hence than the Gibbs condition for $\varphi$ or equivalently for $\psi_\varphi$.²⁰ For semi–quantum systems (i.e. most of the quantum disordered models under consideration) there is then a non–trivial condition which does not refer to the local structure, the KMS condition, which is weaker than the Gibbs condition. On the other hand, for such models, limit points of finite–volume Gibbs states are KMS states, thanks to Prop. 5.3.25 of [10]. Summarizing, infinite–volume limits of finite–volume Gibbs states on the algebra $\mathfrak{A}$ describing the spin and disorder variables should satisfy the KMS boundary condition, but they need not be infinite–volume Gibbs states. The last fact is well–known for classical disordered systems.

18 We refer the reader to [17] for some considerations relative to quenched and annealed disorder without referring to the local structure of the models under consideration.
Further, in the classical case there seems to be no characterization for such a “weak Gibbsianness” which does not refer to the local structure of the observable algebra, see [14, 22, 26] for the physical interpretation of such a phenomenon, and for further details. Therefore, the KMS boundary condition for the algebra generated by spin and disorder variables seems to be exactly tailored to describe the analog of weak Gibbsianness in the quantum case. Moreover, it does not refer to the local structure of the observable algebra. The investigation of conditions which allow one to seek the jointly Gibbsian states among the weakly Gibbsian (i.e. KMS) ones for quantum disordered systems, as well as the study of such a phenomenon for specific models, is a very interesting research program which is beyond the aims of the present paper.

19 In general, the approach followed in [21, 3] is suitable for (semi–)classical systems only if the net $\{H_\Lambda\}$ of local Hamiltonians is assigned a priori and not recovered by dynamical considerations. This is due to the fact that, contrary to the quantum case, the description of the equilibrium statistical mechanics of some relevant classical systems (i.e. the lattice gases, [24]) has no direct dynamical meaning.
20 The fact that the KMS condition is weaker than the Gibbs condition is a standard mathematical phenomenon which can be easily understood by considering the extreme situation of commutative systems, where the KMS condition is trivially satisfied for all states.

Appendix A. Decomposition of Closed Operators

By a measurable field of closed densely defined linear operators on
\[ H = \int_\Omega^\oplus H_\omega\,d\mu(\omega) \]
we mean a field $\omega \mapsto T(\omega)$ of closed linear operators on $H_\omega$ with dense domain $D_{T(\omega)}$ such that all functions $\omega \mapsto (T(\omega)x(\omega), y(\omega))_{H_\omega}$ are measurable for all measurable fields of vectors $\{x(\omega)\} \subset D_{T(\omega)}$ and $\{y(\omega)\} \subset H_\omega$, $(\,\cdot\,,\,\cdot\,)_{H_\omega}$ being the inner product on $H_\omega$. In such a situation, we define
\[ T := \int_\Omega^\oplus T(\omega)\,d\mu(\omega) \tag{A.1} \]
on the domain
\[ D_T := \Big\{ \{x(\omega)\} \subset D_{T(\omega)} \;\Big|\; \{x(\omega)\} \text{ measurable},\ \int_\Omega \|T(\omega)x(\omega)\|^2_{H_\omega}\,d\mu(\omega) < +\infty \Big\}. \tag{A.2} \]
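Proposition A.1. The linear operator $T$ on $H$ defined by (A.1), with domain given by (A.2), is densely defined and closed.

Proof. For each $x \in H$ with $x(\omega) \in D_{T(\omega)}$, we define $z$ as
\[ z(\omega) := \big(1 \vee \|T(\omega)x(\omega)\|_{H_\omega}\big)^{-1} x(\omega). \]
It is easy to verify that each such $z$ belongs to $D_T$.²¹ But the set of $z(\omega)$ arising from such $z$ is total in $H_\omega$ for each $\omega \in \Omega$, that is $D_T$ is dense in $H$. Let now $x_n \to x$, $Tx_n \to y$. There exist subsequences such that $x_{n_k}(\omega) \to x(\omega)$ and $T(\omega)x_{n_k}(\omega) \to y(\omega)$ almost everywhere. Then, since each $T(\omega)$ is closed, for almost all $\omega \in \Omega$ we have $x(\omega) \in D_{T(\omega)}$ and $T(\omega)x(\omega) = y(\omega)$. Further,
\[ \int_\Omega \|T(\omega)x(\omega)\|^2_{H_\omega}\,d\mu(\omega) = \|y\|^2, \]
that is $x \in D_T$ and $Tx = y$.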
A densely defined closed operator $T$ is said to be decomposable if it commutes with all diagonalizable operators. Let $T = V|T|$ be the polar decomposition of $T$. It is straightforward to verify that $T$ is decomposable iff $V$ and $(1 + T^*T)^{-1}$ both commute with the algebra of diagonalizable operators, see also [12]. We leave to the reader the proof of the following

Proposition A.2. Let $T$ be a densely defined closed decomposable operator. Then there exists a measurable field $\{T(\omega)\}_{\omega\in\Omega}$ of densely defined closed operators such that
\[ T = \int_\Omega^\oplus T(\omega)\,d\mu(\omega). \]

Let $\varphi$ be a normal semifinite faithful (n.s.f. for short) weight on the von Neumann algebra $M$, and consider any subcentral decomposition
\[ M = \int_\Omega^\oplus M_\omega\,d\mu(\omega) \]
of $M$. To such a decomposition, there corresponds a decomposition
\[ \varphi = \int_\Omega^\oplus \varphi_\omega\,d\mu(\omega) \]
of the weight $\varphi$ into n.s.f. weights, as described in Appendix A of [37].

21 We are assuming, without loss of generality, that the Borel measure $\mu$ is finite.
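Proposition A.3. In the above situation, we have for the corresponding Tomita operators
\[ \Delta_\varphi = \int_\Omega^\oplus \Delta_{\varphi_\omega}\,d\mu(\omega), \qquad J_\varphi = \int_\Omega^\oplus J_{\varphi_\omega}\,d\mu(\omega). \]

Proof. As $\Delta_\varphi$ and $J_\varphi$ commute with every element in the centre $Z(M)$ of $M$ ([35], Sect. 10), they are decomposable. The proof relative to the explicit decomposition of $\Delta_\varphi$ and $J_\varphi$ is analogous to that in [20], Lemma 3.3.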
Notice that the above situation covers the analogous one described in Theorem A.13 of [37], relative to lower semicontinuous KMS weights on separable C ∗ –algebras.
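Appendix B. On the Spectrum of Decomposable Operators

The following proposition describes the spectrum of a decomposable selfadjoint operator in terms of the spectra of operators living on the fibres.

Proposition B.1. Let $T = \int_\Omega^\oplus T(\omega)\,d\mu(\omega)$ be selfadjoint. Then
\[ \sigma(T) = \bigcap \Big\{\, \overline{\bigcup_{\omega\in E} \sigma(T(\omega))} \;\Big|\; E \subset \Omega \text{ measurable},\ \mu(E^c) = 0 \Big\}. \]

Proof. Let $\lambda \in P(T)$, the resolvent set of $T$. Then there exists a measurable set $E$ such that $\mu(E^c) = 0$ and (i) $\lambda \in P(T(\omega))$, (ii) $\|R_{T(\omega)}(\lambda)\| \le K$ for each $\omega \in E$. As in the selfadjoint case we have $\|R_{T(\omega)}(\lambda)\| = \operatorname{dist}(\lambda, \sigma(T(\omega)))^{-1}$ for the resolvent $R_{T(\omega)}(\lambda)$, we conclude that
\[ \lambda \in \operatorname{Int}\Big( \bigcap_{\omega\in E} P(T(\omega)) \Big). \]
The reverse inclusion follows analogously.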
Corollary B.2. Let $T$ be as in Proposition B.1. Suppose that for some measurable $F$ of full measure, there exists a closed set $C \subset \mathbb{R}$ such that $\sigma(T(\omega)) = C$ for each $\omega \in F$. Then $\sigma(T) = C$.

Proof. Let $E$ be any measurable set of full measure. We have in such a situation,
\[ \bigcup_{\omega\in F} \sigma(T(\omega)) \subset \overline{\bigcup_{\omega\in E} \sigma(T(\omega))}, \]
and the proof follows.
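For a purely atomic finite measure, the formula of Proposition B.1 reduces to a union over the atoms of positive mass, since fibres sitting on null sets can be discarded and finite unions are already closed. The following sketch illustrates this special case only; the function name `direct_integral_spectrum` and the data are hypothetical, and the fibre operators are assumed to be given directly by their finite spectra.

```python
def direct_integral_spectrum(fibre_spectra, weights):
    """Spectrum of a direct sum of fibre operators over a finite atomic measure.

    fibre_spectra: one collection of eigenvalues per sample point.
    weights: the mass of each sample point; points of mass zero are ignored,
    mirroring the restriction to sets E of full measure.
    """
    spectrum = set()
    for eigenvalues, w in zip(fibre_spectra, weights):
        if w > 0:
            spectrum.update(eigenvalues)
    return sorted(spectrum)

# Two fibres of positive mass share the spectrum {0, 1}; the third has mass zero.
fibres = [[0.0, 1.0], [0.0, 1.0], [5.0]]
masses = [0.5, 0.5, 0.0]
common = direct_integral_spectrum(fibres, masses)
```

Here `common` contains only the values 0 and 1, which is the situation of Corollary B.2: the fibre with spectrum {5} lives on a null set and does not contribute.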
Appendix C. On the Borchers Spectrum

In the present section we deal with arbitrary von Neumann algebras unless it is specified otherwise. The Borchers spectrum $\Gamma_B(\alpha)$ ([7]) of an action $g \in G \mapsto \alpha_g \in \operatorname{Aut}(M)$ of a locally compact group $G$ on a von Neumann algebra $M$ is defined as
\[ \Gamma_B(\alpha) := \bigcap_e \operatorname{sp}(\alpha^e), \]
where $e$ runs over all invariant projections with central support $z(e) = I$. The Connes spectrum $\Gamma(\alpha)$ ([11]) is analogously defined by requiring that $e$ merely runs over all non-zero invariant projections. As $\Gamma_B(\sigma^\varphi)$ does not depend on the modular group $\sigma^\varphi$ corresponding to any n.s.f. weight $\varphi$ on $M$ ([19], Prop. 1), we can define $\Gamma_B(M) := \Gamma_B(\sigma^\varphi)$.

Proposition C.1. For a von Neumann algebra $M$ we have
\[ \operatorname{Exp}(\Gamma_B(M)) = S(M) \setminus \{0\}, \]
where $S(M)$ is Connes' S–invariant ([11]).

Proof. The proof is analogous to that of part (iii) in Prop. 1 of [19], taking into account that $\varphi_v(x) := \varphi(vxv^*)$ is an n.s.f. weight on $M$ for each n.s.f. weight $\varphi$ and each isometry $v$ of $M$ with left support $e = vv^*$ in the centralizer $M^\varphi$, see [34], Sect. 2.21.
It is almost immediate to show that $M$ semifinite implies the triviality of $\Gamma_B(M)$. Conversely, in the separable case, the triviality of the Borchers spectrum implies that only type III$_0$ factors and/or semifinite factors can appear in the factor decomposition of $M$. The last facts are summarized in the following

Proposition C.2. Let $M$ be a von Neumann algebra.
(i) If $M$ is semifinite, then $\Gamma_B(M) = \{0\}$.
(ii) Suppose that $M$ has a separable predual and let
\[ M = \int_{\sigma(Z(M))}^\oplus M_\omega\,d\mu(\omega) \]
be its factor decomposition. If $\Gamma_B(M) = \{0\}$, then $\mu(E^c) = 0$, where $E$ is the measurable set consisting of $\omega \in \sigma(Z(M))$ such that $M_\omega$ is either semifinite or type III$_0$.

Proof. By Theorem 8.9.4 in [32], $M$ semifinite implies $\Gamma_B(M) = \{0\}$. Conversely, suppose that $M$ has separable predual. Let $\Gamma_B(M) = \{0\}$, and consider the set $F$ in $\sigma(Z(M))$ consisting of $\omega \in \sigma(Z(M))$ such that $M_\omega$ is type III$_\lambda$, $\lambda \in (0, 1]$. Such a set is measurable, see [36]. Suppose that $F$ has strictly positive measure. Taking into account the definition of the Arveson spectrum of an action (see [32]) together with Proposition B.1, we obtain
\[ \Gamma_B(M) \supset \bigcap_{\{E \,|\, \mu(E^c)=0\}} \overline{\bigcup_{\omega\in E} \Gamma(M_\omega)} \supset \bigcap_{\{E \,|\, \mu(E^c)=0\}} \overline{\bigcup_{\omega\in E\cap F} \Gamma(M_\omega)} \supsetneq \{0\}, \]
which is a contradiction.
Acknowledgements. We thank R. Longo and L. Zsidó for some helpful suggestions. We are grateful to the anonymous referees whose suggestions contributed to improve the presentation of the paper.
References

1. Aizenman, M., Wehr, J.: Rounding effects of quenched randomness on first-order phase transitions. Commun. Math. Phys. 130, 489–528 (1990)
2. Araki, H.: Remarks on spectra of modular operators of von Neumann algebras. Commun. Math. Phys. 28, 267–278 (1972)
3. Araki, H.: Operator algebras and statistical mechanics. In: Mathematical problems in theoretical physics, Proc. Internat. Conf. Rome 1977, eds. G. Dell’Antonio, S. Doplicher, G. Jona–Lasinio, Lecture Notes in Physics 90, Berlin-Heidelberg-New York: Springer, 1978, pp. 94–105
4. Barreto, S.D.: A quantum spin system with random interactions I. Proc. Indian Acad. Sci. 110, 347–356 (2000)
5. Bellissard, J., van Elst, A., Schulz–Baldes, H.: Noncommutative geometry of the quantum Hall effect. J. Math. Phys. 35, 5373–5451 (1994)
6. Binder, K., Young, A.P.: Spin glass: Experimental facts, theoretical concepts and open questions. Rev. Mod. Phys. 58, 801–976 (1986)
7. Borchers, H.-J.: Characterization of inner ∗-automorphisms of W∗-algebras. Publ. RIMS Kyoto Univ. 10, 11–49 (1974)
8. Bovier, A., Picco, P. (eds.): Mathematical aspects of spin glasses and neural networks. Basel-Boston-Berlin: Birkhäuser, 1998
9. Bratteli, O., Robinson, D.W.: Operator algebras and quantum statistical mechanics I. Berlin-Heidelberg-New York: Springer, 1981
10. Bratteli, O., Robinson, D.W.: Operator algebras and quantum statistical mechanics II. Berlin-Heidelberg-New York: Springer, 1981
11. Connes, A.: Une classification des facteurs de type III. Ann. Scient. Éc. Norm. Sup. 6, 133–252 (1973)
12. Connes, A.: Sur la théorie non-commutative de l’intégration. In: Lecture Notes in Mathematics 725, Berlin-Heidelberg-New York: Springer, 1979
13. Connes, A.: Noncommutative geometry. London: Academic Press, 1994
14. van Enter, A.C.D., Maes, C., Schonmann, R.H., Shlosman, S.: The Griffiths singularity random field. Am. Math. Soc. Trans. 198, 51–58 (2000)
15. van Enter, A.C.D., van Hemmen, J.L.: The thermodynamic limit for long range random systems. J. Stat. Phys. 32, 141–152 (1983)
16. van Enter, A.C.D., van Hemmen, J.L.: Statistical mechanical formalism for spin-glasses. Phys. Rev. A 29, 355–365 (1984)
17. Fidaleo, F., Liverani, C.: Statistical properties of disordered quantum systems. In: Proceedings of the 19th Conference on Operator Theory, Basel-Boston-Berlin: Birkhäuser, to appear
18. Haag, R., Hugenholtz, N.M., Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215–236 (1967)
19. Herman, R.H., Longo, R.: A note on the Γ–spectrum of an automorphism group. Duke Math. J. 47, 27–32 (1980)
20. Isola, T.: Modular structure of the crossed product by a compact group dual. J. Oper. Theory 33, 3–31 (1995)
21. Kishimoto, A.: Equilibrium states of a semi-quantum lattice system. Rep. Math. Phys. 12, 341–374 (1977)
22. Külske, C.: (Non-)Gibbsianness and phase transitions in random lattice spin models. Markov Process. Relat. Fields 5, 357–383 (1999)
23. Kunz, H., Souillard, B.: Sur le spectre des opérateurs aux différence finies aléatoires. Commun. Math. Phys. 78, 201–246 (1986)
24. Lanford III, O.E., Ruelle, D.: Observables at infinity and states with short range correlations in statistical mechanics. Commun. Math. Phys. 13, 194–215 (1969)
25. Longo, R.: Algebraic and modular structure of von Neumann algebras of Physics. Proc. Symp. Pure Math. 38, 551–566 (1982)
26. Maes, C., Redig, F., Van Moffaert, A.: Almost Gibbsian versus weakly Gibbsian measures. Stoch. Process. Appl. 79, 1–15 (1999)
27. Mezard, M., Parisi, G., Virasoro, M.A.: Spin–glass theory and beyond. Singapore: World Scientific, 1986
28. Newman, M.N.: Topics in disordered systems. Basel-Boston-Berlin: Birkhäuser, 1997
29. Newman, M.N., Stein, D.L.: Thermodynamic chaos and the structure of short–range spin glasses. In: Mathematical aspects of spin glasses and neural networks, eds. A. Bovier, P. Picco, Basel-Boston-Berlin: Birkhäuser, 1998, pp. 243–287
30. Newman, M.N., Stein, D.L.: Ordering and broken symmetry in short-ranged spin glasses. Available at http://arXiv:cond–mat/0301403 v2, 2003
31. Nielsen, O.A.: Direct integral theory. New York-Basel: Marcel Dekker, 1980
32. Pedersen, G.K.: C∗–algebras and their automorphism groups. London: Academic Press, 1979
33. Størmer, E.: Spectra of states, and asymptotically Abelian C∗–algebras. Commun. Math. Phys. 28, 279–294 (1972)
34. Strătilă, S.: Modular theory in operator algebras. Tunbridge Wells-Kent: Abacus Press, 1981
35. Strătilă, S., Zsidó, L.: Lectures on von Neumann algebras. Tunbridge Wells-Kent: Abacus Press, 1979
36. Sutherland, C.E.: Crossed products, direct integrals and Connes’ classification of type III-factors. Math. Scand. 40, 209–214 (1977)
37. Sutherland, C.E.: Cartan subalgebras, transverse measures and non–type–I Plancherel formulae. J. Funct. Anal. 60, 281–308 (1985)
38. Takesaki, M.: On the conjugate space of operator algebra. Tôhoku Math. J. 10, 194–203 (1958)
39. Takesaki, M.: Theory of operator algebras I. Berlin-Heidelberg-New York: Springer, 1979
40. Takesaki, M., Winnink, M.: Local normality in quantum statistical mechanics. Commun. Math. Phys. 30, 129–152 (1973)
41. Talagrand, M.: On the meaning of Parisi’s functional order parameter. C. R. Acad. Sci. Paris, Ser. I 337, 625–628 (2003)
42. Young, A.P. (ed.): Spin glasses and random fields. Singapore: World Scientific, 1997

Communicated by J.L. Lebowitz
Commun. Math. Phys. 250, 23–45 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1138-z
Communications in
Mathematical Physics
Intersection Numbers of Twisted Cycles Associated with the Selberg Integral and an Application to the Conformal Field Theory

Katsuhisa Mimachi¹, Masaaki Yoshida²
¹ Department of Mathematics, Tokyo Institute of Technology, Oh-okayama, Meguro-ku, Tokyo 152-8551, Japan. E-mail: [email protected]
² Department of Mathematics, Kyushu University, Ropponmatsu, Fukuoka 810-8560, Japan. E-mail: [email protected]

Received: 2 June 2003 / Accepted: 6 February 2004 Published online: 27 July 2004 – © Springer-Verlag 2004
Dedicated to Professor Kazuhiko Aomoto on the occasion of his 65th birthday

Abstract: Intersection numbers of twisted (or loaded) cycles associated with the Selberg integral are studied. In particular, the self-intersection number of the cycle which is invariant under the action of the symmetric group is expressed as a product of trigonometric functions. This formula reproduces the four-point correlation functions in the conformal field theory calculated by Dotsenko-Fateev in [3]. In our study, a compact non-singular model (Terada model) of the configuration space of n + 3 points on the real projective line and a q-analogue of the Chu-Vandermonde formula for the hypergeometric series play a crucial role. Intersection numbers of the corresponding cocycles are also studied.
Contents
1. Intersection Numbers of Twisted Cycles
2. Low Dimensional Cases
   2.1 1-dimensional case
   2.2 2-dimensional case
   2.3 3-dimensional case
3. n-Dimensional Case
4. Proof of Theorem 1
5. An Application to the Conformal Field Theory
6. Intersection Numbers of Twisted Cocycles
   6.1 Intersection numbers
   6.2 Selberg integral case
   6.3 Reciprocity relation for the Selberg integral
This is a revised version of “Intersection numbers of twisted cycles and the correlation functions of the conformal field theory”, Kyushu Univ. preprint series 2002-23.
Introduction

Let $L$ be a local system (locally constant sheaf) determined by the integrand
\[ u(t) = \prod_{i=1}^{n} t_i^{\alpha}(1 - t_i)^{\beta} \prod_{1\le i<j\le n} (t_i - t_j)^{2\gamma} \]
of the Selberg integral [9]. In the locally finite twisted homology group $H_n^{lf}(T, L)$ with coefficients in $L$, there is a unique cycle $C$ (up to a constant factor) which is invariant under the action of the symmetric group $S_n$ acting as the permutations of the coordinates of $T$, where $T$ is the complement of the singular locus of $u(t)$:
\[ T = \{ t = (t_1, \dots, t_n) \in \mathbb{C}^n \mid t_i \neq t_j \ (i \neq j),\ t_i \neq 0, 1 \}. \]
The main purpose of the present paper is to give an expression of the self-intersection number $C \bullet C$ as the product of trigonometric functions in $\alpha$, $\beta$ and $\gamma$ (Theorem 1). To derive it, we make use of a smooth compactification, constructed in [12], of the configuration space of the colored $n + 3$ distinct points on the real projective line; each of the $(n + 2)!/2$ open cells becomes an n-polyhedron which is called the Terada model of dimension n or simply the Terada-n. (This is named after the pioneering work [10] on the configuration spaces.) When we evaluate the self-intersection number $C \bullet C$, the adjacency relation of the Terada-n’s and the coding of their faces used in [12] play a crucial role. Note that the Terada-n is known as the Stasheff polytope or the associahedron in the study of the homotopy $(n + 2)$-associativity.

As an application of the formula in Theorem 1, we show that the coefficients of the four-point correlation function in the conformal field theory calculated by Dotsenko-Fateev in [3] are identified with the intersection numbers of the twisted cycles associated with the conformal blocks. This is a generalization of our previous work in [8], where only the special cases n = 1, 2 are discussed. The n = 1, 2 cases are studied by Dotsenko-Fateev in their first paper [2].

Our study of the Terada-n in Sect. 3 is also used when we derive an explicit formula of the self-intersection number of the twisted cocycle associated with the integrand of the Selberg integral (Theorem 2). The intersection number is expressed as the product of linear forms in $\alpha$, $\beta$ and $\gamma$.
The quadratic relation of hypergeometric integrals stated in [7] (p. 219), together with Theorems 1 and 2, leads to the reciprocity relation for the Selberg integral.

1. Intersection Numbers of Twisted Cycles

Let $u(t) = \prod_i f_i(t)^{\alpha_i}$ be a multivalued function on $T \subset \mathbb{C}^n$, where $\alpha_i \in \mathbb{R}$ and $T$ is the complement of the singular locus $\bigcup_i \{f_i(t) = 0\}$ in $\mathbb{C}^n$. Let $L$ be the local system (locally constant sheaf) defined by $u$: the sheaf consisting of the local solutions of $dL = L\omega$ for $\omega = du(t)/u(t)$. Let $H_n(T, L)$ be the nth homology group with coefficients in $L$, and $H_n^{lf}(T, L)$ the nth locally finite homology group with coefficients in $L$. Elements of these twisted homology groups are called twisted cycles or loaded cycles. Under some genericity conditions on the functions $f_i$ and the exponents $\alpha_i$, we have the isomorphism, called the regularization,
\[ \operatorname{reg} : H_n^{lf}(T, L) \longrightarrow H_n(T, L), \]
which is the inverse of the natural map $H_n(T, L) \to H_n^{lf}(T, L)$. To define intersection numbers for $C, C' \in H_n^{lf}(T, L)$, we regularize one of them and compute the intersection number of the resulting cycles. Actually, the intersection form
\[ \bullet : H_n^{lf}(T, L) \times H_n^{lf}(T, L) \longrightarrow \mathbb{C} \]
is the Hermitian form defined by
\[ (C, C') \longmapsto C \bullet C' = \sum_{\rho,\sigma} a_\rho\,\overline{a_\sigma} \sum_{t \in \rho\cap\sigma} I_t(\rho, \sigma)\, v_\rho(t)\,\overline{v_\sigma(t)}\,/\,|u|^2 \]
for $C, C' \in H_n^{lf}(T, L)$, if $\operatorname{reg} C$ and $C'$ are represented by
\[ \operatorname{reg} C = \sum_\rho a_\rho\, \rho \otimes v_\rho, \qquad C' = \sum_\sigma a_\sigma\, \sigma \otimes v_\sigma, \]
where $a_\rho, a_\sigma \in \mathbb{C}$, each $\rho$ or $\sigma$ is an n-simplex, $v_\rho$ or $v_\sigma$ a section of $L$ on $\rho$ or $\sigma$, the bar denotes complex conjugation, and $I_t(\rho, \sigma)$ is the topological intersection number of $\rho$ and $\sigma$ at $t$. The value $C \bullet C'$ of the intersection form for $C, C' \in H_n^{lf}(T, L)$ is called the intersection number of $C$ and $C'$. Here we assumed $\alpha_i \in \mathbb{R}$; however, the following argument and Theorem 1 remain valid for $\alpha_i \in \mathbb{C}$ if we introduce the dual sheaf $L^\vee$ of $L$ and define the intersection form as
\[ H_n^{lf}(T, L) \times H_n^{lf}(T, L^\vee) \longrightarrow \mathbb{C}. \]
We refer the reader to [8] for more detail.

Let
\[ u(t) = \prod_{i=1}^{n} t_i^{\alpha}(1 - t_i)^{\beta} \prod_{1\le i<j\le n} (t_j - t_i)^{2\gamma} \]
be a multivalued function defining the local system $L$, where we assume $\alpha, \beta, \gamma \in \mathbb{R}\setminus\mathbb{Z}$ and some genericity condition on $\alpha, \beta, \gamma$. Let us define, for $\sigma \in S_n$, a loaded n-cycle
\[ C_\sigma = D_\sigma \otimes u_{D_\sigma}(t), \]
where $D_\sigma$ is a bounded domain $\{(t_1, \dots, t_n) \in \mathbb{R}^n \mid 0 < t_{\sigma(1)} < \cdots < t_{\sigma(n)} < 1\}$ with the standard orientation, and $u_{D_\sigma}(t)$ is the standard loading defined in Remark 1 below. (The orientation of $D_\sigma$ adopted in [8] is different from the present one.) Then
\[ C := \sum_{\sigma\in S_n} C_\sigma \]
is a generator of the $S_n$-invariant subspace $H_n^{lf}(T, L)^{S_n}$. The following main theorem gives an evaluation of the self-intersection number of $C$:
Theorem 1.
\[ J_n(\alpha, \beta, \gamma) := C^2 = C \bullet C = n! \prod_{j=1}^{n} \frac{(1 - e(\gamma))\,\big(1 - e(\alpha + \beta + (n + j - 2)\gamma)\big)}{\big(1 - e(\alpha + (j-1)\gamma)\big)\big(1 - e(\beta + (j-1)\gamma)\big)\big(1 - e(j\gamma)\big)} \]
\[ = n! \left(\frac{\sqrt{-1}}{2}\right)^{n} \prod_{j=1}^{n} \frac{s(\alpha + \beta + (n + j - 2)\gamma)\, s(\gamma)}{s(\alpha + (j-1)\gamma)\, s(\beta + (j-1)\gamma)\, s(j\gamma)}, \]
where $e(A) = \exp(2\pi\sqrt{-1}A)$ and $s(A) = \sin(\pi A)$.

Remark 1. If each factor $f_i(t)$ of $u(t)$ is defined over $\mathbb{R}$, and $D$ is a domain of the real manifold $T_{\mathbb{R}}$ (the real locus of $T$), then it is convenient to load $D$ with a section
\[ u_D(t) = \prod_i (\varepsilon_i f_i(t))^{\alpha_i} \]
of $L$ on $D$, and to make a loaded cycle $D \otimes u_D(t)$, where $\varepsilon_i = \pm$ is so determined that $\varepsilon_i f_i(t)$ is positive on $D$, and the argument of $\varepsilon_i f_i(t)$ is assigned to be zero. This choice of a section is said to be standard. In this paper, we adopt only the standard loading. Thus, we frequently omit the loading and denote just the topological cycles for simplicity. For example, when $u(t) = t^\alpha(1 - t)^\beta$, we write $\overrightarrow{(0, 1)}$ for $(0, 1) \otimes u(t)$, and $\overrightarrow{(1, \infty)}$ for $(1, \infty) \otimes t^\alpha(t - 1)^\beta$.

In the next section, as a warm-up, we study low dimensional cases.

2. Low Dimensional Cases

2.1. 1-dimensional case. The function defining the local system $L$ on $T = \mathbb{C}\setminus\{0, 1\}$ is $u(t) = t^\alpha(t - 1)^\beta$, where $\alpha, \beta, \alpha + \beta \in \mathbb{R}\setminus\mathbb{Z}$. Then, the self-intersection number $J_1 = \{0 < t < 1\} \bullet \{0 < t < 1\}$ can be expressed as the sum of local contributions at the barycenters of all the possible faces of the closure $\overline{\{0 < t < 1\}}$:
\[ -1 - \frac{1}{d_\alpha} - \frac{1}{d_\beta} = -\frac{e(\alpha)e(\beta) - 1}{(e(\alpha) - 1)(e(\beta) - 1)} = \frac{\sqrt{-1}}{2}\,\frac{s(\alpha + \beta)}{s(\alpha)s(\beta)}, \tag{2.1} \]
where
\[ d_a = e(a) - 1, \qquad e(a) = \exp(2\pi\sqrt{-1}a), \qquad \text{and} \qquad s(a) = \sin(\pi a). \]
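The two equalities in (2.1) can be checked numerically. The short sketch below is only a sanity check, not part of the argument; the helper names `j1_faces`, `j1_exponential` and `j1_sine` are hypothetical, and the sample exponents are arbitrary non-integer values with non-integer sum.

```python
import cmath
import math

def e(a):
    """e(a) = exp(2*pi*sqrt(-1)*a)."""
    return cmath.exp(2j * math.pi * a)

def d(a):
    """d_a = e(a) - 1."""
    return e(a) - 1

def s(a):
    """s(a) = sin(pi*a)."""
    return math.sin(math.pi * a)

def j1_faces(alpha, beta):
    """Sum of local contributions: the open segment and its two end points."""
    return -1 - 1 / d(alpha) - 1 / d(beta)

def j1_exponential(alpha, beta):
    """Middle expression of (2.1)."""
    return -(e(alpha) * e(beta) - 1) / ((e(alpha) - 1) * (e(beta) - 1))

def j1_sine(alpha, beta):
    """Right-hand side of (2.1)."""
    return 0.5j * s(alpha + beta) / (s(alpha) * s(beta))

values = [f(0.31, 0.45) for f in (j1_faces, j1_exponential, j1_sine)]
```

With these exponents the three entries of `values` agree up to rounding error, and the common value is purely imaginary, as the sine form shows.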
Indeed, a regularization $\operatorname{reg} C \in H_1(T, L)$ of $C = \overrightarrow{(0, 1)} \in H_1^{lf}(T, L)$ can be given by
\[ \operatorname{reg} C = \Big( \frac{1}{d_\alpha}\, S(\varepsilon; 0) + \overrightarrow{[\varepsilon, 1 - \varepsilon]} - \frac{1}{d_\beta}\, S(1 - \varepsilon; 1) \Big) \otimes u(t), \]
where the symbol $S(a; z)$ stands for the positively oriented circle centered at the point $z$ with starting and ending point $a$, $\varepsilon$ is a small positive number, and the argument of each factor of $u(t)$ on the oriented circle $S(\varepsilon; 0)$ or $S(1 - \varepsilon; 1)$ is defined so that $\arg t$ takes values from 0 to $2\pi$ on $S(\varepsilon; 0)$, and $\arg(1 - t)$ from 0 to $2\pi$ on $S(1 - \varepsilon; 1)$. On the other hand, the other $C$ can be deformed into the sine-like curve $\tilde{C}$ as is described in Fig. 1. Thus the number of the intersection points of the supports of $\operatorname{reg} C$ and of $\tilde{C}$ is three, and we have the expression (2.1) of the intersection number $C \bullet C$.

[Figure 1: the regularized cycle reg C and the deformed cycle $\tilde{C}$ between the points 0 and 1.]

Remark 2. In general, let a loaded cycle $Y$ supported by $|Y|$ be given. If all the hypersurfaces (with non-trivial exponents) touching the cycle $|Y|$ cross normally, then it is known [5] that the self-intersection number $Y^2$ is equal to the sum of the local contributions at the barycenters of all of the faces of the closure of $|Y|$. In what follows, we use this fact without any mention of the construction of $\operatorname{reg} Y$ and of the deformation of $Y$.

2.2. 2-dimensional case. The function defining the local system $L$ is
\[ u(t) = (t_2 - t_1)^{2\gamma} \prod_{i=1,2} t_i^{\alpha}(1 - t_i)^{\beta}. \]
In the complex $(t_1, t_2)$-space $\bar{T} \cong \mathbb{C}^2$, we consider the lines
\[ \{t_1 = 0\},\ \{t_2 = 0\},\ \{t_1 = 1\},\ \{t_2 = 1\},\ \{t_1 = t_2\}, \]
with exponents $\alpha, \alpha, \beta, \beta, 2\gamma$, respectively. Since the singular locus of $u(t)$ is not normally crossing, we blow up the $(t_1, t_2)$-space at the two triple singular points $(t_1, t_2) = (0, 0)$ and $(1, 1)$, and obtain a surface $\tilde{T}$; let $\pi : \tilde{T} \to \bar{T}\ (\cong \mathbb{C}^2)$ be the projection. As a result, the strict transforms of the two bounded chambers of $\mathbb{R}^2$ defined by $0 < t_1 < t_2 < 1$ and $0 < t_2 < t_1 < 1$ are the pentagons (truncated triangles) bounded by the seven lines (Fig. 2), where the numerals along the segments indicate the corresponding exponents. Then the intersection number $\{0 < t_1 < t_2 < 1\} \bullet \{0 < t_1 < t_2 < 1\}$ is given by the sum
\[ 1 + \frac{1}{d_\alpha} + \frac{1}{d_\beta} + \frac{1}{d_{2\beta+2\gamma}} + \frac{1}{d_{2\gamma}} + \frac{1}{d_{2\alpha+2\gamma}} + \frac{1}{d_\alpha d_\beta} + \frac{1}{d_\beta d_{2\beta+2\gamma}} + \frac{1}{d_{2\beta+2\gamma} d_{2\gamma}} + \frac{1}{d_{2\gamma} d_{2\alpha+2\gamma}} + \frac{1}{d_{2\alpha+2\gamma} d_\alpha} \]
of the local contributions at the barycenters of all the possible faces of the pentagon $\overline{\pi^{-1}\{0 < t_1 < t_2 < 1\}}$, where $\overline{A}$ denotes the closure of $A$.
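[Figure 2: the lines in the real $(t_1, t_2)$-plane and, after the blow-up $\pi$, the pentagon $\overline{\pi^{-1}\{0 < t_1 < t_2 < 1\}}$; the edges are labelled by the exponents $\alpha$, $\beta$, $2\gamma$, $2\gamma + 2\alpha$ and $2\gamma + 2\beta$.]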
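On the other hand, the intersection number $\{0 < t_1 < t_2 < 1\} \bullet \{0 < t_2 < t_1 < 1\}$ is equal to
\[ \frac{e(\gamma)}{d_{2\gamma}}\; \overline{\pi^{-1}\{0 < t_1 = t_2 < 1\}} \bullet \overline{\pi^{-1}\{0 < t_2 = t_1 < 1\}}. \]
Since the boundary of the segment $\overline{\pi^{-1}\{0 < t_1 = t_2 < 1\}}$ consists of two points with exponents $2\alpha + 2\gamma$ and $2\beta + 2\gamma$, we have
\[ \overline{\pi^{-1}\{0 < t_1 = t_2 < 1\}} \bullet \overline{\pi^{-1}\{0 < t_2 = t_1 < 1\}} = -1 - \frac{1}{d_{2\alpha+2\gamma}} - \frac{1}{d_{2\beta+2\gamma}}. \]
Consequently, we obtain
\[ \frac{J_2}{2} = \{0 < t_1 < t_2 < 1\} \bullet \big(\{0 < t_1 < t_2 < 1\} + \{0 < t_2 < t_1 < 1\}\big) = \frac{d_{\alpha+\beta+\gamma}\, d_{\alpha+\beta+2\gamma}\, d_\gamma}{d_\alpha\, d_{\alpha+\gamma}\, d_\beta\, d_{\beta+\gamma}\, d_{2\gamma}} = -\frac{1}{4}\,\frac{s(\alpha + \beta + \gamma)\, s(\alpha + \beta + 2\gamma)\, s(\gamma)}{s(\alpha)\, s(\alpha + \gamma)\, s(\beta)\, s(\beta + \gamma)\, s(2\gamma)}. \]
In this 2-dimensional case, the Terada model is just a pentagon, and the adjacency of the two pentagons $T_\sigma = \overline{\pi^{-1}\{t \mid 0 < t_{\sigma(1)} < t_{\sigma(2)} < 1\}}$ ($\sigma \in S_2$), which are piecewise-linearly isomorphic to the Terada-2, is quite simple. In the higher dimensional cases, the Terada models are not this simple; studies of their combinatorial properties will play a crucial role.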
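The bookkeeping of the 2-dimensional case can be verified numerically: the pentagon sum (one term for the open cell, five for the edges, five for the vertices) plus the contribution of the adjacent pentagon should reproduce the closed form for $J_2/2$. The sketch below assumes the cyclic edge order $\alpha$, $\beta$, $2\beta+2\gamma$, $2\gamma$, $2\alpha+2\gamma$ read off from the sum above; the function names and the sample exponents are hypothetical.

```python
import cmath
import math

def e(a):
    return cmath.exp(2j * math.pi * a)

def d(a):
    return e(a) - 1

def s(a):
    return math.sin(math.pi * a)

def pentagon_sum(alpha, beta, gamma):
    """Self-intersection of one pentagon: cell + 5 edges + 5 vertices."""
    edges = [alpha, beta, 2 * beta + 2 * gamma, 2 * gamma, 2 * alpha + 2 * gamma]
    inv = [1 / d(x) for x in edges]
    vertices = sum(inv[i] * inv[(i + 1) % 5] for i in range(5))
    return 1 + sum(inv) + vertices

def cross_term(alpha, beta, gamma):
    """Contribution of the adjacent pentagon through the edge t1 = t2."""
    segment = -1 - 1 / d(2 * alpha + 2 * gamma) - 1 / d(2 * beta + 2 * gamma)
    return e(gamma) / d(2 * gamma) * segment

def half_j2_closed(alpha, beta, gamma):
    """Sine form of J_2 / 2."""
    num = s(alpha + beta + gamma) * s(alpha + beta + 2 * gamma) * s(gamma)
    den = s(alpha) * s(alpha + gamma) * s(beta) * s(beta + gamma) * s(2 * gamma)
    return -0.25 * num / den

a, b, g = 0.31, 0.45, 0.17   # arbitrary generic exponents
lhs = pentagon_sum(a, b, g) + cross_term(a, b, g)
rhs = half_j2_closed(a, b, g)
```

For generic exponents `lhs` and `rhs` agree up to rounding error, and the result is real, in line with the sign $-\tfrac14$ in the sine form.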
[Figure 3: the Terada-3 $T_{id}$.]
2.3. 3-dimensional case. The function defining the local system $L$ is
\[ u(t) = \prod_{1\le i<j\le 3} (t_j - t_i)^{2\gamma} \prod_{1\le i\le 3} t_i^{\alpha}(1 - t_i)^{\beta}. \]
In the complex $t = (t_1, t_2, t_3)$-space $\bar{T} \cong \mathbb{C}^3$, we consider the planes $\{t_i = 0\}$, $\{t_i - 1 = 0\}$ $(1 \le i \le 3)$ and $\{t_i - t_j = 0\}$ $(1 \le i < j \le 3)$, with exponents $\alpha$, $\beta$ and $2\gamma$, respectively. Let $\pi : \tilde{T} \to \bar{T}$ be the minimal blow-up of $\bar{T}$ along the non-normally crossing loci of the union of these planes. Then the strict transform
\[ T_\sigma = \overline{\pi^{-1}\{0 < t_{\sigma(1)} < t_{\sigma(2)} < t_{\sigma(3)} < 1\}} \]
is the polyhedron bounded by six pentagons and three rectangles. It is piecewise-linearly isomorphic to the Terada model in dimension 3 (Terada-3 for short). In order for the reader to recall the shape of the polyhedron $T_\sigma$, we often call it “the Terada-3 $T_\sigma$.” See Fig. 3, where $T_{id}$ is illustrated. Here the suffix id corresponds to the identity element id of the symmetric group $S_3$. The boundary of the Terada-3 $T_{id} = \overline{\pi^{-1}\{0 < t_1 < t_2 < t_3 < 1\}}$ consists of the six pentagons
\[ \overline{\pi^{-1}\{0 = t_1 < t_2 < t_3 < 1\}}, \quad \overline{\pi^{-1}\{0 < t_1 = t_2 < t_3 < 1\}}, \quad \overline{\pi^{-1}\{0 < t_1 < t_2 = t_3 < 1\}}, \]
\[ \overline{\pi^{-1}\{0 < t_1 < t_2 < t_3 = 1\}}, \quad \overline{\pi^{-1}\{0 = t_1 = t_2 = t_3\}}, \quad \overline{\pi^{-1}\{t_1 = t_2 = t_3 = 1\}}, \]
with exponents $\alpha, 2\gamma, 2\gamma, \beta, 6\gamma + 3\alpha, 6\gamma + 3\beta$, the three rectangles
\[ \overline{\pi^{-1}\{0 = t_1 = t_2 < t_3 < 1\}}, \quad \overline{\pi^{-1}\{0 < t_1 = t_2 = t_3 < 1\}}, \quad \overline{\pi^{-1}\{0 < t_1 < t_2 = t_3 = 1\}}, \]
with exponents $2\gamma + 2\alpha, 6\gamma, 2\gamma + 2\beta$, 21 edges and 14 vertices.
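The face numbers quoted above can be cross-checked against standard facts about the associahedron, with which the Introduction identifies the Terada-n. The sketch below assumes (these are general facts, not statements from this paper) that the n-dimensional associahedron is a simple polytope with a Catalan number $C_{n+1}$ of vertices and $n(n+3)/2$ facets; the function names are hypothetical.

```python
from math import comb

def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)

def terada_counts(n):
    """(vertices, edges, facets) of the n-dimensional associahedron.

    Assumes the polytope is simple, so every vertex lies on exactly n edges.
    """
    vertices = catalan(n + 1)
    facets = n * (n + 3) // 2
    edges = n * vertices // 2
    return vertices, edges, facets

v3, e3, f3 = terada_counts(3)
euler_characteristic = v3 - e3 + f3   # boundary of a 3-polytope is a sphere
```

For n = 3 this gives 14 vertices, 21 edges and 9 facets (six pentagons and three rectangles), and for n = 2 it gives the pentagon of Sect. 2.2.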
K. Mimachi, M. Yoshida
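Let us see the adjacency of the Terada-3's $T_\sigma$ for $\sigma \in S_3$. We can readily see that $\pi^{-1}\{0 < t_2 < t_1 < t_3 < 1\}$ is adjacent to $T_{\mathrm{id}}$ through the pentagon $\pi^{-1}\{0 < t_1 = t_2 < t_3 < 1\}$, and that $\pi^{-1}\{0 < t_3 < t_2 < t_1 < 1\}$ is adjacent to $T_{\mathrm{id}}$ through the rectangle $\pi^{-1}\{0 < t_1 = t_2 = t_3 < 1\}$. Similarly, $\pi^{-1}\{0 < t_2 < t_3 < t_1 < 1\}$ is adjacent to $T_{\mathrm{id}}$ through the edge $\pi^{-1}\{0 < t_1 < t_2 = t_3 < 1\} \cap \pi^{-1}\{0 < t_1 = t_2 = t_3 < 1\}$. Thus the value
\[
\frac{J_3}{3!} = T_{\mathrm{id}} \bullet \sum_{\sigma \in S_3} T_\sigma
\]
is equal to the sum
\[
\begin{aligned}
T_{\mathrm{id}}^2
&- \frac{e(3\gamma)}{d_{6\gamma}} \bigl(\pi^{-1}\{0 < t_1 = t_2 = t_3 < 1\}\bigr)^2
+ \frac{e(\gamma)}{d_{2\gamma}} \bigl(\pi^{-1}\{0 < t_1 = t_2 < t_3 < 1\}\bigr)^2
+ \frac{e(\gamma)}{d_{2\gamma}} \bigl(\pi^{-1}\{0 < t_1 < t_2 = t_3 < 1\}\bigr)^2 \\
&- \frac{e(\gamma)}{d_{2\gamma}} \frac{e(3\gamma)}{d_{6\gamma}} \bigl(\pi^{-1}\{0 < t_1 = t_2 < t_3 < 1\} \cap \pi^{-1}\{0 < t_1 = t_2 = t_3 < 1\}\bigr)^2 \\
&- \frac{e(\gamma)}{d_{2\gamma}} \frac{e(3\gamma)}{d_{6\gamma}} \bigl(\pi^{-1}\{0 < t_1 < t_2 = t_3 < 1\} \cap \pi^{-1}\{0 < t_1 = t_2 = t_3 < 1\}\bigr)^2 .
\end{aligned}
\]
Every summand (the self-intersection number of each face) will be calculated in the next section. If we admit the results there (see Example 3), we have
\[
J_3 = -\frac{3!\, d_{\alpha+\beta+2\gamma}\, d_{\alpha+\beta+3\gamma}\, d_{\alpha+\beta+4\gamma}\, d_{\gamma}^2}{d_{\alpha}\, d_{\alpha+\gamma}\, d_{\alpha+2\gamma}\, d_{\beta}\, d_{\beta+\gamma}\, d_{\beta+2\gamma}\, d_{2\gamma}\, d_{3\gamma}}
= -\frac{3!}{8}\, \frac{s(\alpha+\beta+2\gamma)\, s(\alpha+\beta+3\gamma)\, s(\alpha+\beta+4\gamma)\, s(\gamma)^2}{s(\alpha)\, s(\alpha+\gamma)\, s(\alpha+2\gamma)\, s(\beta)\, s(\beta+\gamma)\, s(\beta+2\gamma)\, s(2\gamma)\, s(3\gamma)}.
\]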
3. n-Dimensional Case

Let
\[
u(t) = \prod_{i=1}^{n} t_i^{\alpha} (1-t_i)^{\beta} \prod_{1\le i<j\le n} (t_j - t_i)^{2\gamma}
\]
be a multivalued function defining the local system $\mathcal{L}$. In the complex $t = (t_1, \dots, t_n)$-space $T \cong \mathbb{C}^n$, we consider the hyperplanes
\[
\{t_i = 0\},\quad \{t_i - 1 = 0\}\quad (1 \le i \le n), \qquad\text{and}\qquad \{t_i - t_j = 0\}\quad (1 \le i < j \le n)
\]
with exponents $\alpha$, $\beta$, and $2\gamma$, respectively. Let $\pi : \widetilde{T} \to T$ be the minimal blow-up of $T$ along the non-normally crossing loci of the union of these hyperplanes (cf. [11]). Each strict transform
\[
T_\sigma = \pi^{-1}\{0 < t_{\sigma(1)} < \cdots < t_{\sigma(n)} < 1\}
\]
of the closure of the chamber defined by
\[
0 < t_{\sigma(1)} < \cdots < t_{\sigma(n)} < 1, \qquad \sigma \in S_n,
\]
in $\mathbb{R}^n$ is piecewise-linearly isomorphic to the Terada model of dimension $n$ (Terada-$n$ for short). In what follows, we fix $t_0$ and $t_{n'}$ as
\[
t_0 = 0, \qquad t_{n'} = 1,
\]
where for notational simplicity, we set $n' = n + 1$, and for $0 \le i < i + k \le n'$ with $(i, i+k) \ne (0, n')$, we denote by $(i \cdots i+k)$ the hyperface
\[
\pi^{-1}\{t_0 < t_1 < \cdots < t_i = \cdots = t_{i+k} < \cdots < t_n < t_{n'}\}
\]
of $T_{\mathrm{id}}$. Then the hyperfaces of the Terada-$n$ $T_{\mathrm{id}} = \pi^{-1}\{t_0 < t_1 < \cdots < t_n < t_{n'}\}$ are known ([12]) to be
\[
(0 \cdots i),\ (1 \cdots i+1),\ \dots,\ (n-i+1 \cdots n'), \qquad i = 1, \dots, n.
\]
For a given family $S_i$ ($1 \le i \le k$) of sets, if any pair $S_i$ and $S_j$ for $1 \le i < j \le k$ satisfies the condition either
1. $S_i \cap S_j = \emptyset$, or
2. $S_i \subset S_j$ or $S_i \supset S_j$,
then we say that the family satisfies the condition (DC), which is named after disjoint/contained. Each $(n-k)$-face of Terada-$n$ $T_{\mathrm{id}}$ can be uniquely expressed as the intersection of $k$ hyperfaces, say,
\[
H_1 = (a_1 \cdots b_1),\quad H_2 = (a_2 \cdots b_2),\quad \dots,\quad H_k = (a_k \cdots b_k),
\]
where the family of the sets $\{a_1 \cdots b_1\}, \{a_2 \cdots b_2\}, \dots, \{a_k \cdots b_k\}$ enjoys the condition (DC).
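The face description above can be checked by brute force for small $n$. The following Python sketch is an illustration, not part of the paper: the function names are hypothetical, a hyperface $(a \cdots b)$ is encoded as the pair `(a, b)` of integers in $\{0, \dots, n+1\}$ with `n + 1` standing for $n'$, and inclusion in (DC) is read as strict inclusion. Under the correspondence stated above, a (DC) family of $k$ hyperfaces is counted as one $(n-k)$-face.

```python
from itertools import combinations

def hyperfaces(n):
    """All (a, b) with 0 <= a < b <= n+1, except the full interval (0, n+1)."""
    top = n + 1  # plays the role of n' = n + 1
    return [(a, b) for a in range(top) for b in range(a + 1, top + 1)
            if (a, b) != (0, top)]

def dc(s, t):
    """Condition (DC) for two intervals: disjoint, or one strictly inside the other."""
    S = set(range(s[0], s[1] + 1))
    T = set(range(t[0], t[1] + 1))
    return not (S & T) or S < T or T < S

def face_counts(n):
    """counts[k] = number of k-element (DC) families, i.e. of (n-k)-faces."""
    faces = hyperfaces(n)
    return [sum(all(dc(s, t) for s, t in combinations(fam, 2))
                for fam in combinations(faces, k))
            for k in range(n + 1)]

print(face_counts(2))  # pentagon: [1, 5, 5]
print(face_counts(3))  # Terada-3: [1, 9, 21, 14]
```

For $n = 3$ the counts reproduce the nine 2-dimensional faces, 21 edges and 14 vertices listed in Sect. 2.3.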
Example 1.
(1) When $n = 1$, the Terada-1 $T_{\mathrm{id}} = \{0 < t_1 < 1\}$ is the segment with the boundary consisting of the two points $(01) = \{0 = t_1\}$ and $(11') = \{t_1 = 1\}$.
(2) When $n = 2$, the Terada-2 $T_{\mathrm{id}} = \pi^{-1}\{0 < t_1 < t_2 < 1\}$ is the pentagon bounded by the five edges
\[
\begin{gathered}
(01) = \pi^{-1}\{0 = t_1\} \cap T_{\mathrm{id}},\quad (12) = \pi^{-1}\{t_1 = t_2\} \cap T_{\mathrm{id}},\quad (22') = \pi^{-1}\{t_2 = 1\} \cap T_{\mathrm{id}},\\
(012) = \pi^{-1}\{0 = t_1 = t_2\} \cap T_{\mathrm{id}},\quad (122') = \pi^{-1}\{t_1 = t_2 = 1\} \cap T_{\mathrm{id}},
\end{gathered}
\]
and with the five vertices $(01) \cap (22')$, $(01) \cap (012)$, $(12) \cap (012)$, $(12) \cap (122')$, $(22') \cap (122')$.
(3) When $n = 3$, the boundary of the Terada-3 $T_{\mathrm{id}} = \pi^{-1}\{0 < t_1 < t_2 < t_3 < 1\}$ consists of the 2-dimensional faces
\[
\begin{gathered}
(01) = \pi^{-1}\{0 = t_1\} \cap T_{\mathrm{id}},\quad (12) = \pi^{-1}\{t_1 = t_2\} \cap T_{\mathrm{id}},\quad (23) = \pi^{-1}\{t_2 = t_3\} \cap T_{\mathrm{id}},\quad (33') = \pi^{-1}\{t_3 = 1\} \cap T_{\mathrm{id}},\\
(012) = \pi^{-1}\{0 = t_1 = t_2\} \cap T_{\mathrm{id}},\quad (123) = \pi^{-1}\{t_1 = t_2 = t_3\} \cap T_{\mathrm{id}},\quad (233') = \pi^{-1}\{t_2 = t_3 = 1\} \cap T_{\mathrm{id}},\\
(0123) = \pi^{-1}\{0 = t_1 = t_2 = t_3\} \cap T_{\mathrm{id}},\quad (1233') = \pi^{-1}\{t_1 = t_2 = t_3 = 1\} \cap T_{\mathrm{id}},
\end{gathered}
\]
the 1-dimensional faces $(01) \cap (23)$, $(01) \cap (0123)$, ..., etc. and 0-dimensional faces $(01) \cap (23) \cap (0123)$, $(01) \cap (012) \cap (0123)$, ..., etc.

Since the hyperplanes
\[
\{0 = t_i\},\quad 1 \le i \le k, \qquad\text{and}\qquad \{t_i = t_j\},\quad 1 \le i < j \le k,
\]
pass through the $(n-k)$-simplex $0 = t_1 = \cdots = t_k < t_{k+1} < \cdots < t_n < 1$, the exponent of the hyperface $(01 \cdots k)$ for $1 \le k \le n$ is $k\alpha + k(k-1)\gamma$. Similarly, the exponent of the hyperface $(p \cdots p+k)$ for $1 \le p < k + p \le n$ is $k(k+1)\gamma$, and that of $(n-k+1 \cdots n')$ for $1 \le k \le n$ is $k\beta + k(k-1)\gamma$. For a hyperface $(a \cdots b)$ with exponent $\lambda$, we set
\[
[a \cdots b] = \frac{1}{e(\lambda) - 1} = \frac{1}{d_\lambda}. \tag{3.1}
\]
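The three exponent rules can be collected into one small function. The Python sketch below is only an illustration (the name `exponent` and the encoding are assumptions, not the paper's notation): a hyperface $(a \cdots b)$ is the pair `(a, b)` with `n + 1` standing for $n'$, and the exponent is returned as its integer coefficients of $\alpha$, $\beta$, $\gamma$.

```python
def exponent(a, b, n):
    """Exponent of the hyperface (a...b) of Terada-n as (c_alpha, c_beta, c_gamma)."""
    top = n + 1  # stands for n'
    if not (0 <= a < b <= top) or (a, b) == (0, top):
        raise ValueError("(a, b) is not a hyperface of Terada-n")
    if a == 0:            # (0 1 ... k): k*alpha + k(k-1)*gamma
        k = b
        return (k, 0, k * (k - 1))
    if b == top:          # (n-k+1 ... n'): k*beta + k(k-1)*gamma
        k = top - a
        return (0, k, k * (k - 1))
    k = b - a             # (p ... p+k): k(k+1)*gamma
    return (0, 0, k * (k + 1))

# The faces of the Terada-3 listed in Sect. 2.3
print(exponent(0, 3, 3))  # (0123): (3, 0, 6), i.e. 3*alpha + 6*gamma
print(exponent(1, 3, 3))  # (123):  (0, 0, 6), i.e. 6*gamma
```

For $n = 3$ this returns the exponents $\alpha$, $2\gamma$, $\beta$, $3\alpha + 6\gamma$, $3\beta + 6\gamma$, $2\alpha + 2\gamma$, $6\gamma$, $2\beta + 2\gamma$ quoted for the pentagons and rectangles of the Terada-3.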
Since the hypersurfaces in question are normally crossing, Theorem of [5] implies

Lemma 1. The self-intersection number $T_{\mathrm{id}}^2$ of the Terada-$n$ $T_{\mathrm{id}}$ equals the sum of
\[
(-)^n \sum_{k=0}^{n} [a_1 \cdots b_1][a_2 \cdots b_2] \cdots [a_k \cdots b_k]
\]
for all the possible sets $\{a_i, \dots, b_i\}$ ($1 \le i \le k$) satisfying the condition (DC), where $0 \le a_i < b_i \le n'$ and $(a_i, b_i) \ne (0, n')$. The summand for $k = 0$ is regarded as 1.

Remark 3. The sequence $a, \cdots, b$ of $[a \cdots b]$ is always the consecutive one $a, a+1, a+2, \cdots, b-1, b$; recall the convention of the symbol $(a \cdots b)$ made for the hyperfaces in the beginning of this section. In what follows, we take this convention without further comment.

Example 2.
(1) When $n = 1$, $T_{\mathrm{id}}^2 = -1 - [01] - [11']$.
(2) When $n = 2$,
\[
\begin{aligned}
T_{\mathrm{id}}^2 = {} & 1 + [01] + [12] + [22'] + [012] + [122'] \\
& + [01][22'] + [01][012] + [12][012] + [12][122'] + [22'][122'].
\end{aligned}
\]
The five products correspond to the five vertices of the pentagon listed in Example 1 (2).
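Lemma 1 can be evaluated mechanically for small $n$. The Python sketch below is a hedged illustration, not the authors' method: it assumes that $e$ is multiplicative, $e(\lambda + \mu) = e(\lambda)e(\mu)$, so that $e(\alpha)$, $e(\beta)$, $e(\gamma)$ can be replaced by arbitrary rational stand-ins (`EA`, `EB`, `EG`, chosen here only to avoid zero denominators); all function names are hypothetical. Under that assumption the case $n = 1$ simplifies to $T_{\mathrm{id}}^2 = -d_{\alpha+\beta}/(d_\alpha d_\beta)$, which the code confirms for the chosen values.

```python
from fractions import Fraction
from itertools import combinations

# Rational stand-ins for e(alpha), e(beta), e(gamma) (assumption: e is multiplicative).
EA, EB, EG = Fraction(2), Fraction(3), Fraction(2)

def d(ca, cb, cg):
    """d_lambda = e(lambda) - 1 for lambda = ca*alpha + cb*beta + cg*gamma."""
    return EA**ca * EB**cb * EG**cg - 1

def bracket(face, n):
    """[a...b] = 1/d_lambda, with lambda the exponent of the hyperface (a...b)."""
    a, b = face
    if a == 0:
        k = b
        lam = (k, 0, k * (k - 1))
    elif b == n + 1:
        k = n + 1 - a
        lam = (0, k, k * (k - 1))
    else:
        k = b - a
        lam = (0, 0, k * (k + 1))
    return 1 / d(*lam)

def dc(s, t):
    """Condition (DC): disjoint, or one interval strictly inside the other."""
    S, T = set(range(s[0], s[1] + 1)), set(range(t[0], t[1] + 1))
    return not (S & T) or S < T or T < S

def self_intersection(n):
    """T_id^2 according to Lemma 1: (-1)^n times the sum over all (DC) families."""
    faces = [(a, b) for a in range(n + 1) for b in range(a + 1, n + 2)
             if (a, b) != (0, n + 1)]
    total = Fraction(0)
    for k in range(n + 1):
        for fam in combinations(faces, k):
            if all(dc(s, t) for s, t in combinations(fam, 2)):
                term = Fraction(1)
                for face in fam:
                    term *= bracket(face, n)
                total += term
    return (-1) ** n * total

print(self_intersection(1), -d(1, 1, 0) / (d(1, 0, 0) * d(0, 1, 0)))  # both -5/2
print(self_intersection(2))                                            # 2231/630
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.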
On the other hand, we have

Lemma 2. If a Terada-$n$ $T_\sigma$ for some $\sigma \in S_n$ is adjacent to $T_{\mathrm{id}}$ through a face $F = \cap_i H_i$, where $H_i = (p_i \cdots p_i + q_i)$ and $1 \le p_i < p_i + q_i \le n$, then
\[
T_{\mathrm{id}} \bullet T_\sigma = \prod_i \Bigl\{ (-1)^{q_i - 1}\, e(\gamma)^{q_i(q_i+1)/2}\, [p_i \cdots p_i + q_i] \Bigr\}\; F \bullet F.
\]
Here the self-intersection number $F \bullet F$ is given by the sum of
\[
(-1)^{\dim F} [a_1 \cdots b_1][a_2 \cdots b_2] \cdots [a_k \cdots b_k]
\]
for all the possible sets $\{a_j, \dots, b_j\}$ ($0 \le a_j \le n$, $1 \le b_j \le n'$, $1 \le j \le k$) such that $\{p_i \cdots p_i + q_i\}$ and $\{a_j \cdots b_j\}$ ($1 \le j \le k$) satisfy the condition (DC) and that $(a_j, b_j) \ne (0, n')$. The product for $k = 0$ is regarded as 1.

Remark 4. Not all of the Terada-$n$'s of the form $\pi^{-1}\{0 < t_{\sigma(1)} < \cdots < t_{\sigma(n)} < 1\}$ for $\sigma \in S_n$ are adjacent to $T_{\mathrm{id}}$. Indeed, when $n = 4$, exactly
\[
\pi^{-1}\{0 < t_2 < t_4 < t_1 < t_3 < 1\} \qquad\text{and}\qquad \pi^{-1}\{0 < t_3 < t_1 < t_4 < t_2 < 1\}
\]
are not adjacent to $T_{\mathrm{id}}$. A combinatorial reasoning of this fact will be discussed elsewhere in the future.

Example 3.
(1) When $n = 2$, we have
\[
\pi^{-1}\{0 < t_1 < t_2 < 1\} \bullet \pi^{-1}\{0 < t_2 < t_1 < 1\} = e(\gamma)[12]\,(12) \bullet (12) = e(\gamma)[12]\{-1 - [012] - [122']\}.
\]
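For $n = 2$, Lemma 1 and Lemma 2 together can be compared with the closed form for $J_2/2$ stated in the 2-dimensional case. The Python sketch below is an illustrative check under stated assumptions, not the paper's computation: $e$ is assumed multiplicative, so $e(\alpha)$, $e(\beta)$, $e(\gamma)$ are replaced by rational stand-ins `EA`, `EB`, `EG`; the helper names are hypothetical; and $F \bullet F$ is computed as the signed sum over (DC) families compatible with the hyperfaces cutting out $F$.

```python
from fractions import Fraction
from itertools import combinations

EA, EB, EG = Fraction(2), Fraction(3), Fraction(2)  # stand-ins for e(alpha), e(beta), e(gamma)

def d(ca, cb, cg):
    """d_lambda = e(lambda) - 1 for lambda = ca*alpha + cb*beta + cg*gamma."""
    return EA**ca * EB**cb * EG**cg - 1

def bracket(face, n):
    """[a...b] = 1/d_lambda for the hyperface (a...b) of Terada-n."""
    a, b = face
    if a == 0:
        k = b
        lam = (k, 0, k * (k - 1))
    elif b == n + 1:
        k = n + 1 - a
        lam = (0, k, k * (k - 1))
    else:
        k = b - a
        lam = (0, 0, k * (k + 1))
    return 1 / d(*lam)

def dc(s, t):
    """Condition (DC): disjoint, or one interval strictly inside the other."""
    S, T = set(range(s[0], s[1] + 1)), set(range(t[0], t[1] + 1))
    return not (S & T) or S < T or T < S

def face_self_intersection(n, fixed=()):
    """F.F for F = intersection of the hyperfaces in `fixed` (Lemma 1 when fixed is empty)."""
    faces = [(a, b) for a in range(n + 1) for b in range(a + 1, n + 2)
             if (a, b) != (0, n + 1) and all(dc((a, b), f) for f in fixed)]
    total = Fraction(0)
    for k in range(n + 1):
        for fam in combinations(faces, k):
            if all(dc(s, t) for s, t in combinations(fam, 2)):
                term = Fraction(1)
                for face in fam:
                    term *= bracket(face, n)
                total += term
    return (-1) ** (n - len(fixed)) * total

def adjacency_factor(n, fixed):
    """Product over the hyperfaces (p...p+q) of (-1)^(q-1) e(gamma)^(q(q+1)/2) [p...p+q]."""
    factor = Fraction(1)
    for (p, b) in fixed:
        q = b - p
        factor *= (-1) ** (q - 1) * EG ** (q * (q + 1) // 2) * bracket((p, b), n)
    return factor

# J_2 / 2 = T_id . T_id + T_id . T_(12), the second term taken from Example 3 (1)
fixed = ((1, 2),)
j2_half = face_self_intersection(2) + adjacency_factor(2, fixed) * face_self_intersection(2, fixed)
closed_form = (d(1, 1, 1) * d(1, 1, 2) * d(0, 0, 1)) / (
    d(1, 0, 0) * d(1, 0, 1) * d(0, 1, 0) * d(0, 1, 1) * d(0, 0, 2))
print(j2_half, closed_form)  # both 253/90
```

With these stand-in values both sides equal $253/90$; the check covers only this sample point and the case $n = 2$.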
(2) When $n = 3$, we have
\[
\pi^{-1}\{0 < t_1 < t_2 < t_3 < 1\} \bullet \pi^{-1}\{0 < t_2 < t_1 < t_3 < 1\} = e(\gamma)[12]\,(12) \bullet (12),
\]
where
\[
\begin{aligned}
(12) \bullet (12) = {} & 1 + [012] + [0123] + [123] + [1233'] + [33'] \\
& + [012][0123] + [012][33'] + [0123][123] + [123][1233'] + [1233'][33'],
\end{aligned}
\]
and
\[
\pi^{-1}\{0 < t_1 < t_2 < t_3 < 1\} \bullet \pi^{-1}\{0 < t_3 < t_1 < t_2 < 1\} = -e(\gamma)e(3\gamma)[12][123]\,\{(12) \cap (123)\} \bullet \{(12) \cap (123)\},
\]
where
\[
\{(12) \cap (123)\} \bullet \{(12) \cap (123)\} = -1 - [0123] - [1233'].
\]
Combination of Lemma 1 and Lemma 2 yields that
\[
\frac{J_n}{n!} = T_{\mathrm{id}} \bullet \sum_{\sigma \in S_n} T_\sigma
\]
can be expressed as the sum of
\[
(-)^n \prod_i \Bigl\{ (-)^{q_i}\, e(\gamma)^{q_i(q_i+1)/2}\, [p_i \cdots p_i + q_i] \Bigr\} \prod_j [a_j \cdots b_j]
\]
for $1 \le p_i < p_i + q_i \le n$, $0 \le a_j < b_j \le n'$ such that $(a_j, b_j) \ne (0, n')$, and that the sets $\{p_i, \dots, p_i + q_i\}$, $\{a_j, \dots, b_j\}$ satisfy the condition (DC). Note that, if there is, in the summands, a term of the form
\[
\prod_{1 \le i \le l} \Bigl\{ (-)^{q_i}\, e(\gamma)^{q_i(q_i+1)/2}\, [p_i \cdots p_i + q_i] \Bigr\} \prod_j [a_j \cdots b_j],
\]
then there is, in the summands, also a term of the form
\[
\prod_{\substack{1 \le i \le l \\ i \ne k_1, \dots, k_t}} \Bigl\{ (-)^{q_i}\, e(\gamma)^{q_i(q_i+1)/2}\, [p_i \cdots p_i + q_i] \Bigr\} \prod_{i = k_1, \dots, k_t} [p_i \cdots p_i + q_i] \prod_j [a_j \cdots b_j].
\]
Thus we have l
t=0 1≤k1 1, there are infinitely many nonconjugate Borel subalgebras (this was already noticed in [CF]). Furthermore, using the combinatorics of the root system of $\mathfrak{g}$, we assign to each Borel subalgebra a proper toroidal subalgebra $\mathfrak{k} \subset \mathfrak{g}$. (As $\mathfrak{k}$ can be finite dimensional, we allow $n$ to equal 0 in the definition of a toroidal Lie algebra.) In this way we obtain a parabolic decomposition $\mathfrak{g} = \mathfrak{n}^- \oplus \mathfrak{k} \oplus \mathfrak{n}^+$ with $\mathfrak{b} \subset \mathfrak{k} \oplus \mathfrak{n}^+$. While $\mathfrak{k} = \mathfrak{h}$ for a generic $\mathfrak{b}$, for special choices of $\mathfrak{b}$ the subalgebra $\mathfrak{k}$ may equal almost any toroidal subalgebra of $\mathfrak{g}$ (for a more detailed discussion see Sect. 2).

The first main result of the present paper is that, if $\mathfrak{k}$ is infinite dimensional and the restriction of $\lambda$ onto the center $\mathfrak{z}$ of $\mathfrak{g}$ intersected with $[\mathfrak{k}, \mathfrak{k}]$ is nonzero, then any subquotient $M$ of the Verma module $M_{\mathfrak{g}}(\lambda)$ is isomorphic to the $\mathfrak{g}$-module induced from the $\mathfrak{k} \oplus \mathfrak{n}^+$-submodule $M^{\mathfrak{n}^+} \subset M$ consisting of all $\mathfrak{n}^+$-invariant vectors in $M$. This reduces the study of $M$ to the study of the $\mathfrak{k}$-module $M^{\mathfrak{n}^+}$, and is a vast generalization of previously obtained results in [BC, CF, FK].

Our theorem implies that, if a Borel subalgebra $\mathfrak{b}$ yields an infinite dimensional toroidal subalgebra $\mathfrak{k}$, then only very special $\mathfrak{b}$-highest weight modules or their submodules are not parabolically induced from $\mathfrak{k}$: for those the highest weight $\lambda$ must vanish on the part of the center of $\mathfrak{g}$ contained in $[\mathfrak{k}, \mathfrak{k}]$. In fact, explicit vertex operator constructions of such modules have been given by S. Berman and Y. Billig, [BB], and it has been shown that they can be integrable with finite dimensional weight spaces. Moreover, we may speculate that the generic Borel subalgebras, i.e. those for which $\mathfrak{k}$ is finite dimensional and hence our result does not apply, yield mostly irreducible Verma modules. This may deserve further study; however, it is beyond the scope of the present paper.

Our reduction result yields a natural equivalence of categories.
We define categories $\mathcal{O}_{\mathfrak{g}}$ and $\mathcal{O}_{\mathfrak{k}}$ of $\mathfrak{g}$- and $\mathfrak{k}$-modules respectively, which admit an ascending filtration whose associated quotients are isomorphic to subquotients of Verma modules. We then consider their respective blocks $\mathcal{O}_{\mathfrak{g}}(\nu)$ and $\mathcal{O}_{\mathfrak{k}}(\nu)$ of modules with a central charge $\nu$. The second main result of the paper is that, for any $\nu$ which does not vanish on $\mathfrak{z} \cap [\mathfrak{k}, \mathfrak{k}]$, the functor of parabolic induction and the functor of $\mathfrak{n}^+$-invariants are mutually inverse equivalences of $\mathcal{O}_{\mathfrak{k}}(\nu)$ and the full subcategory of $\mathcal{O}_{\mathfrak{g}}(\nu)$ whose objects are generated by their $\mathfrak{n}^+$-invariants. Any irreducible object of $\mathcal{O}_{\mathfrak{g}}(\nu)$ is also an object of the latter subcategory, but as we show, there are reducible modules in $\mathcal{O}_{\mathfrak{g}}(\nu)$ which are not generated by their $\mathfrak{n}^+$-invariants.
Reduction Theorem for Highest Weight Modules over Toroidal Lie Algebras
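Notational Conventions. The ground field is $\mathbb{C}$ and all Lie algebras and their representations are automatically assumed to be defined over $\mathbb{C}$. Some real vector spaces are also considered but this is clearly indicated. The superscript ${}^*$ stands for dual space, $U(\cdot)$ stands for enveloping algebra, $\operatorname{span}_A$ stands for span over a submonoid $A \subset \mathbb{C}$. We put $\mathbb{R}_+ := \{r \in \mathbb{R} \mid r \ge 0\}$ and $\mathbb{Z}_+ := \mathbb{Z} \cap \mathbb{R}_+$. Finally, a cone in a real vector space $W$ is by definition an $\mathbb{R}_+$-invariant submonoid $C \subset W$.

1. Triangular Decompositions and Highest Weight Modules

Let $W$ be a finite dimensional real vector space. For any subset $P \subset W \setminus \{0\}$, a decomposition $P = P^+ \sqcup P^-$ is called a triangular decomposition of $P$, if the cone $C^+ \subset W$ generated by $P^+ \cup (-P^-)$ (or, equivalently, its opposite cone $C^-$) contains no nontrivial vector subspace of $W$, cf. [DP2]. Let $F = \{\{0\} = F_n \subset F_{n-1} \subset \dots \subset F_1 \subset F_0 = W\}$ be a flag of maximal length in $W$, where $n = \dim W$. We call $F$ oriented if, for each $i$, one of the connected components of $F_i \setminus F_{i+1}$ is labeled by $+$ and the other one is labeled by $-$, i.e. $F_i \setminus F_{i+1} = (F_i \setminus F_{i+1})^+ \sqcup (F_i \setminus F_{i+1})^-$. An oriented flag $F$ determines the triangular decomposition $P^\pm := P \cap (\cup_i (F_i \setminus F_{i+1})^\pm)$ of any subset $P \subset W \setminus \{0\}$. It is easy to see that if $P$ is finite, every triangular decomposition of $P$ can be determined by an oriented flag $F$ for which $P \cap F_1 = \emptyset$. This is no longer true if $P$ is infinite.

We now introduce an invariant of a triangular decomposition. For a subset $M \subset W$, let $\overline{M}$ denote the closure of $M$ in the natural topology of $W$. Given a triangular decomposition $P = P^+ \sqcup P^-$, set $P_0 := (\overline{C^+} \cap \overline{C^-}) \cap P$. The following statements are an exercise on closed convex sets.

Proposition 1. Let $P = P^+ \sqcup P^-$ be a triangular decomposition of $P \subset W \setminus \{0\}$.
(i) For any oriented flag of maximal length $F = \{\{0\} = F_n \subset F_{n-1} \subset \dots \subset F_1 \subset F_0 = W\}$ which determines the decomposition $P = P^+ \sqcup P^-$, we have $P_0 \subset F_1$.
(ii) There exists an oriented flag of maximal length $F = \{\{0\} = F_n \subset F_{n-1} \subset \dots \subset F_1 \subset F_0 = W\}$ which determines the decomposition $P = P^+ \sqcup P^-$ and such that $P_0 = F_1 \cap P$.

Let $\mathfrak{g}$ be a (complex) Lie algebra with a fixed finite dimensional abelian self-normalizing subalgebra $\mathfrak{h} \subset \mathfrak{g}$ for which $\mathfrak{g}$ decomposes as
\[
\mathfrak{g} = \mathfrak{h} \oplus \Bigl(\bigoplus_{\alpha \in \mathfrak{h}^* \setminus \{0\}} \mathfrak{g}^\alpha\Bigr), \tag{1}
\]
where $\mathfrak{g}^\alpha := \{g \in \mathfrak{g} \mid [h, g] = \alpha(h) g \text{ for every } h \in \mathfrak{h}\}$. The set $\Delta := \{\alpha \in \mathfrak{h}^* \setminus \{0\} \mid \mathfrak{g}^\alpha \ne 0\}$ is the root system of $\mathfrak{g}$. A Borel subalgebra $\mathfrak{b}$ of $\mathfrak{g}$ containing $\mathfrak{h}$ is by definition any subalgebra of the form $\mathfrak{b} = \mathfrak{h} \oplus (\oplus_{\alpha \in \Delta^+} \mathfrak{g}^\alpha)$ for some triangular decomposition $\Delta = \Delta^+ \sqcup \Delta^-$ of $\Delta$ in the real vector space $\operatorname{span}_{\mathbb{R}} \Delta$. We put $\mathfrak{b}^\pm := \oplus_{\alpha \in \Delta^\pm} \mathfrak{g}^\alpha$. The triangular decomposition $\Delta = \Delta^+ \sqcup \Delta^-$ corresponding to $\mathfrak{b}$ defines also the subalgebra $\mathfrak{g}_{\mathfrak{b}} := \mathfrak{h} \oplus (\oplus_{\alpha \in \Delta_0} \mathfrak{g}^\alpha)$ of $\mathfrak{g}$.

A $\mathfrak{g}$-module $V$ is a weight module (with respect to $\mathfrak{h}$) if $V = \oplus_{\mu \in \mathfrak{h}^*} V^\mu$, where $V^\mu := \{v \in V \mid h \cdot v = \mu(h) v \text{ for all } h \in \mathfrak{h}\}$. The support of $V$ is the set $\operatorname{supp} V := \{\mu \in \mathfrak{h}^* \mid V^\mu \ne 0\}$. Any submodule or quotient of a weight module is a weight module. The enveloping algebra $U(\mathfrak{g})$ is a weight module under the adjoint action of $\mathfrak{g}$ with $\operatorname{supp} U(\mathfrak{g}) = Q := \operatorname{span}_{\mathbb{Z}} \Delta$. For $\lambda \in \mathfrak{h}^*$ denote by $v^\lambda$ the one dimensional $\mathfrak{b}$-module on which $\mathfrak{h}$ acts via $\lambda$ and $\mathfrak{b}^+$ acts by zero. The $\mathfrak{g}$-module $M_{\mathfrak{g}}(\lambda) := U(\mathfrak{g}) \otimes_{U(\mathfrak{b})} v^\lambda$ is the Verma module with $\mathfrak{b}$-highest weight $\lambda$. Note that $M_{\mathfrak{g}}(\lambda)$ is a weight module which is isomorphic to $U(\mathfrak{b}^-)$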
I. Dimitrov, V. Futorny, I. Penkov
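as a $\mathfrak{b}^-$-module. Furthermore, $M_{\mathfrak{g}}(\lambda)$ admits a unique maximal proper $\mathfrak{g}$-submodule, and hence, a unique irreducible quotient module $L_{\mathfrak{g}}(\lambda)$. A highest weight module with $\mathfrak{b}$-highest weight $\lambda$ is defined as an arbitrary nonzero quotient of $M_{\mathfrak{g}}(\lambda)$.

2. Toroidal Lie Algebras

Let $\Gamma$ be a finitely generated free abelian group (written additively), and let $A(\Gamma)$ denote the group algebra of $\Gamma$, i.e. $A(\Gamma) = \{\sum_{\gamma \in \Gamma} c_\gamma t^\gamma \mid c_\gamma \ne 0 \text{ for finitely many } \gamma\}$, where $t$ is a formal variable. Set $D := \operatorname{Hom}(\Gamma, \mathbb{C})$. Given a finite dimensional reductive (possibly abelian) Lie algebra $\mathring{\mathfrak{g}}$, the following commutation relations endow $\mathring{\mathfrak{g}} \otimes A(\Gamma) \oplus D$ with a Lie algebra structure:
\[
[g \otimes t^{\gamma}, g' \otimes t^{\gamma'}] = [g, g'] \otimes t^{\gamma + \gamma'}, \qquad [d, g \otimes t^{\gamma}] = d(\gamma)\, g \otimes t^{\gamma}, \qquad [d, d'] = 0,
\]
where $g, g' \in \mathring{\mathfrak{g}}$, $\gamma, \gamma' \in \Gamma$, and $d, d' \in D$. Let $\varphi$ be a nondegenerate symmetric invariant bilinear form on $\mathring{\mathfrak{g}}$, and let $\mathfrak{z} := \mathbb{C} \otimes_{\mathbb{Z}} \Gamma$. The toroidal Lie algebra $\mathfrak{t}(\mathring{\mathfrak{g}}, \Gamma, \varphi) := \mathring{\mathfrak{g}} \otimes A(\Gamma) \oplus D \oplus \mathfrak{z}$ is defined as the central extension of $\mathring{\mathfrak{g}} \otimes A(\Gamma) \oplus D$ via the commutation relation
\[
[g \otimes t^{\gamma} + d,\ g' \otimes t^{\gamma'} + d'] = [g, g'] \otimes t^{\gamma + \gamma'} + d(\gamma')\, g' \otimes t^{\gamma'} - d'(\gamma)\, g \otimes t^{\gamma} + \delta_{\gamma, -\gamma'}\, \varphi(g, g')\, 1 \otimes \gamma .
\]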
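Example 1. Let $\mathring{\mathfrak{g}}$ be simple and $\Gamma = \mathbb{Z}^n$. Then we can identify $A(\Gamma)$ with $\mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]$ so that $t^{(k_1, \dots, k_n)} = t_1^{k_1} \cdots t_n^{k_n}$ for $(k_1, \dots, k_n) \in \mathbb{Z}^n$, and
\[
D \cong \operatorname{span}\Bigl\{ t_1 \tfrac{\partial}{\partial t_1}\Big|_{t_1 = \dots = t_n = 1},\ \dots,\ t_n \tfrac{\partial}{\partial t_n}\Big|_{t_1 = \dots = t_n = 1} \Bigr\}.
\]
Furthermore, let $\kappa$ be the Killing form of $\mathring{\mathfrak{g}}$. The toroidal algebra $\mathring{\mathfrak{g}}[n] := \mathfrak{t}(\mathring{\mathfrak{g}}, \mathbb{Z}^n, \kappa)$ is the "standard" toroidal Lie algebra, cf. [MRY]. Finally, let $\mathring{\mathfrak{h}}$ be a Cartan subalgebra of $\mathring{\mathfrak{g}}$, $\varphi = \kappa|_{\mathring{\mathfrak{h}} \times \mathring{\mathfrak{h}}}$ and $\Gamma = \mathbb{Z}^n$. In this case $\mathfrak{t}(\mathring{\mathfrak{h}}, \mathbb{Z}^n, \varphi)$ is a Heisenberg subalgebra of $\mathring{\mathfrak{g}}[n]$.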
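Proposition 2.
(i) If $\mathring{\mathfrak{g}}$ is simple, then $\mathring{\mathfrak{g}}[n]$ is isomorphic to the universal central extension of $\mathring{\mathfrak{g}} \otimes \mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}] \oplus D$.
(ii) If $\mathring{\mathfrak{g}} = (\oplus_{i=1}^m \mathring{\mathfrak{g}}_i) \oplus \mathring{\mathfrak{z}}$ is reductive with simple components $\mathring{\mathfrak{g}}_i$ and center $\mathring{\mathfrak{z}}$, then
\[
\mathfrak{t}(\mathring{\mathfrak{g}}, \Gamma, \varphi) \cong \Bigl( \bigl( \oplus_{i=1}^m \mathfrak{t}(\mathring{\mathfrak{g}}_i, \Gamma_i, \varphi|_{\mathring{\mathfrak{g}}_i \times \mathring{\mathfrak{g}}_i}) \bigr) \oplus \mathfrak{t}(\mathring{\mathfrak{z}}, \Gamma_0, \varphi|_{\mathring{\mathfrak{z}} \times \mathring{\mathfrak{z}}}) \Bigr) / J,
\]
where $\Gamma_i$ are copies of $\Gamma$, and $J$ is the central ideal generated by all differences of corresponding elements in $\Gamma_i$ and $\Gamma_j$ for $1 \le i \ne j \le m$.

Sketch of the proof. Statement (i) follows from the fact that any universal central extension of the $\mathbb{Z}^n$-graded Lie algebra $\mathring{\mathfrak{g}} \otimes \mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]$ is isomorphic to $\mathring{\mathfrak{g}} \otimes \mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}] \oplus \Omega_{\mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]} / d\,\mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]$ (see [K]). Statement (ii) is a simple exercise.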
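The term toroidal Lie algebra is often used for a consecutive extension of $\mathring{\mathfrak{g}} \otimes \mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]$ by an infinite dimensional center, and then by a (finite or infinite dimensional) derivations algebra. In the present paper we consider the "minimal nontrivial" extensions of $\mathring{\mathfrak{g}} \otimes \mathbb{C}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]$ defined above.

Unless explicitly stated otherwise, we will assume that $\mathring{\mathfrak{g}}$ is reductive. Fix $\mathfrak{g} = \mathfrak{t}(\mathring{\mathfrak{g}}, \Gamma, \varphi)$. If $\mathring{\mathfrak{h}}$ is a Cartan subalgebra of $\mathring{\mathfrak{g}}$, then $\mathfrak{h} := \mathring{\mathfrak{h}} \oplus D \oplus \mathfrak{z}$ is a Cartan subalgebra of $\mathfrak{g}$ (here and below we identify $\mathring{\mathfrak{h}}$ with $\mathring{\mathfrak{h}} \otimes \mathbb{C} \subset \mathfrak{g}$), and $\mathfrak{g}$ admits a root decomposition (1). To describe the root system $\Delta$ of $\mathfrak{g}$, notice that $\Delta \subset \mathring{\mathfrak{h}}^* \oplus D^*$ as $\mathfrak{z}$ is in the center of $\mathfrak{g}$. Since $\Gamma$ is canonically embedded into $D^* = (\operatorname{Hom}(\Gamma, \mathbb{C}))^*$, we will denote the elements of $\Gamma$ and their images in $D^*$ by the same letters. Then
\[
\Delta = (\mathring{\Delta} + \Gamma) \cup (\Gamma \setminus \{0\}), \tag{2}
\]
as $\mathfrak{g}^{\alpha + \gamma} = \mathring{\mathfrak{g}}^{\alpha} \otimes t^{\gamma}$ and $\mathfrak{g}^{\gamma} = \mathring{\mathfrak{h}} \otimes t^{\gamma}$ for $\alpha \in \mathring{\Delta}$, $\gamma \in \Gamma$. In what follows we will also identify $\Gamma$ with the subset $1 \otimes_{\mathbb{Z}} \Gamma$ of $\mathfrak{z}$. This enables us to evaluate on $\Gamma$ linear functions defined on $\mathfrak{z}$. With the conventions above $\gamma \in \Gamma$ may stand for three different things but the meaning of $\gamma$ will be clear from the context. Namely, when we write $\gamma$ alone or $x \otimes t^{\gamma}$ we understand $\gamma$ as an element of $\Gamma$, when we write $\alpha + \gamma$ we understand $\gamma$ as an element of $D^*$, and when we write $\varphi(\gamma)$ we understand $\gamma$ as an element of $\mathfrak{z}$.

3. The Decomposition $\mathfrak{g} = \mathfrak{n}^- \oplus \mathfrak{k} \oplus \mathfrak{n}^+$

Let $\mathfrak{b}$ be a Borel subalgebra of $\mathfrak{g}$ corresponding to a triangular decomposition $\Delta = \Delta^+ \sqcup \Delta^-$. Set $\Gamma_0 := \Gamma \cap (\Delta_0 \cup \{0\})$ and note that $\Gamma_0$ is a subgroup of $\Gamma$. Indeed, according to Proposition 1(ii), $\Delta_0 = \Delta \cap F_1$ for a suitable hyperplane $F_1 \subset \operatorname{span}_{\mathbb{R}} \Delta$. Hence
\[
\Gamma_0 = \Gamma \cap (\Delta_0 \cup \{0\}) = \Gamma \cap ((\Delta \cap F_1) \cup \{0\}) = ((\Gamma \cap F_1) \cap \Delta) \cup \{0\} = \Gamma \cap F_1,
\]
i.e. $\Gamma_0$ equals the subgroup $\Gamma \cap F_1$ of $\Gamma$. For every $\alpha \in \mathring{\Delta}$, $(\alpha + \Gamma) \cap \Delta_0 \subset \alpha + D^*$ is either empty or a translation of $\Gamma_0$. Denote by $\mathring{\Delta}_{\mathfrak{k}}$ the set of all roots $\alpha \in \mathring{\Delta}$ for which $(\alpha + \Gamma) \cap \Delta_0 \subset \alpha + D^*$ is a translation of $\Gamma_0$, and let $\mathring{\mathfrak{k}} := \mathring{\mathfrak{h}} \oplus (\oplus_{\alpha \in \mathring{\Delta}_{\mathfrak{k}}} \mathring{\mathfrak{g}}^{\alpha})$.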
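Proposition 3. The Lie algebra $\mathring{\mathfrak{k}}$ is a reductive subalgebra of $\mathring{\mathfrak{g}}$. Furthermore, $\mathfrak{g}_{\mathfrak{b}} = \mathfrak{h} \oplus (\oplus_{\alpha \in \Delta_0} \mathfrak{g}^{\alpha})$ is a proper toroidal subalgebra of $\mathfrak{g}$ isomorphic to $\mathfrak{t}(\mathring{\mathfrak{k}}, \Gamma_0, \varphi|_{\mathring{\mathfrak{k}} \times \mathring{\mathfrak{k}}})$.

Proof. We verify first that $\mathring{\mathfrak{k}}$ is a subalgebra of $\mathring{\mathfrak{g}}$. Indeed, if $\alpha, \beta \in \mathring{\Delta}_{\mathfrak{k}}$ and $\alpha + \beta \in \mathring{\Delta}$, then $\alpha + \gamma_1 \in F_1$ and $\beta + \gamma_2 \in F_1$, for some $\gamma_1, \gamma_2 \in \Gamma$, and hence $(\alpha + \beta) + (\gamma_1 + \gamma_2) \in F_1$, i.e. $\alpha + \beta \in \mathring{\Delta}_{\mathfrak{k}}$. Furthermore, $\mathring{\Delta}_{\mathfrak{k}} = -\mathring{\Delta}_{\mathfrak{k}}$, and therefore $\mathring{\mathfrak{k}} \subset \mathring{\mathfrak{g}}$ is reductive. The fact that $\varphi|_{\mathring{\mathfrak{k}} \times \mathring{\mathfrak{k}}}$ is nondegenerate follows from the fact that $\varphi|_{\mathring{\mathfrak{h}} \times \mathring{\mathfrak{h}}}$ and $\varphi|_{\mathring{\mathfrak{g}}^{\alpha} \times \mathring{\mathfrak{g}}^{-\alpha}}$, for $\alpha \in \mathring{\Delta}$, are nondegenerate. Note also that $\mathfrak{g}_{\mathfrak{b}}$ is a proper subalgebra of $\mathfrak{g}$ as $\Delta_0$ is a proper subset of $\Delta$.

To establish an isomorphism between $\mathfrak{g}_{\mathfrak{b}}$ and $\mathfrak{t}(\mathring{\mathfrak{k}}, \Gamma_0, \varphi|_{\mathring{\mathfrak{k}} \times \mathring{\mathfrak{k}}})$, fix a system of simple roots $\{\alpha_1, \dots, \alpha_s\}$ of $\mathring{\Delta}_{\mathfrak{k}}$, and for every $1 \le i \le s$ fix a nonzero element $g^{\pm\alpha_i} \in \mathring{\mathfrak{g}}^{\pm\alpha_i}$.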
Furthermore, for every 1 ≤ i ≤ s, fix γi ∈ such that αi +γi ∈ 0 . Denote by k the subalgebra of gb generated by gi±αi ⊗ t γi for 1 ≤ i ≤ s and h. Then there exists a unique Lie ◦
◦
algebra isomorphism θ : k → k with θ (g ±αi ) = g ±αi ⊗ t γi and such that θ ◦
|zk ⊗C⊕D ⊕z
◦
◦
is the identity, where zk denotes the center of k. The reader will check immediately that ◦
the extension : t(k, 0 , ϕ ◦
◦
|k×k
◦
) → gb of θ defined by (g ⊗ t γ ) := θ(g) ⊗ t γ for
g ∈ k and γ ∈ 0 is an isomorphism.
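When the Borel subalgebra $\mathfrak{b}$ of $\mathfrak{g}$ is given, we will denote $\mathfrak{g}_{\mathfrak{b}}$ by $\mathfrak{k}$, and the root system $\Delta_0$ of $\mathfrak{k}$ by $\Delta_{\mathfrak{k}}$. We have $\Delta_{\mathfrak{k}} = (\mathring{\Delta}_{\mathfrak{k}} + \Gamma_0) \cup (\Gamma_0 \setminus \{0\})$. In particular, $\Gamma_0 \ne 0$ precisely when $\mathfrak{k}$ is infinite dimensional. Set also $\mathfrak{n}^{\pm} = \oplus_{\alpha \in \Delta^{\pm} \setminus \Delta_{\mathfrak{k}}} \mathfrak{g}^{\alpha}$. Then $\mathfrak{g} = \mathfrak{n}^- \oplus \mathfrak{k} \oplus \mathfrak{n}^+$.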
It is a natural problem to determine all pairs (0 , k) arising from Borel subalgebras b of g as in Proposition 3 above. We will now reduce this problem to a purely combinatorial question about root systems. Although we do not answer the latter, we show that the set of such pairs is rather large. An example of a pair which does not arise from a Borel subalgebra of g is given below, see Example 2. Assume b is determined by the partition = + − . Fix a hyperplane F1 ⊂ spanR corresponding to = + − as in Proposition 1(ii). Let F1 = ker τ , where τ : spanR → R is a nonzero linear ◦
function. Notice that spanR = (spanR ) ⊕ (spanR ), and hence τ = τ1 + τ2 , where ◦
τ1 : spanR → R and τ2 : spanR → R are the restrictions of τ to spanR and ◦ ◦ ◦ spanR respectively. Then 0 = ker τ1 ∩ and k = τ2−1 (τ1 ()) ∩ . Moreover, τ1 () is a discrete subgroup of R of rank d := rk − rk 0 . There are three possible cases. ◦
◦
(i) 0 = . Then k = τ2−1 ({0}) for a nonzero τ2 , i.e. the root system of k is the ◦
◦
intersection of with a proper subspace of spanR . ◦
(ii) rk = 1 and rk 0 = 0. Then k = ∅.
◦
(iii) None of the above holds for and 0 . In this case finding all possible pairs (0 , k) ◦
◦
is equivalent to finding all root subsystems of of the form τ2−1 ( ) ∩ , where is a subgroup of R of rank d. ◦
◦
◦
Example 2. 1. Let be a basis of , 0 ⊂ and let k be the root subsystem of ◦
◦
◦
◦ with basis 0 . Then k = ker τ2 ∩ for an appropriate τ2 . Hence the pair (0 , k) arises from a Borel subalgebra of g unless 0 = and 0 = , or rk = 1, rk 0 = 0 and 0 = ∅. ◦ ◦ 2. Let rk 0 = rk −1 = 0. Then τ1 () ∼ = Z and the condition on k is k = τ2−1 (Z)∩ ◦
◦
◦
for some τ2 . In particular, if g is simple of type B2 with = {±ε1 ± ε2 , ±ε1 , ±ε2 }, ◦ ◦ ◦ then τ2−1 (τ1 ()) ∩ can be any proper subsystem of . Thus k = sl(2) ⊕ sl(2) (when ◦ ◦ ◦ ◦ ◦ = sl(2) or k = h. However, if g is simple of type C3 , the reader k = {±ε1 ± ε2 }), k ∼ ◦
◦
will check that the subsystem consisting of the long roots cannot be realized as k .
Furthermore, it is not difficult to verify that for any subalgebra k of g the pair (0 , k) arises from a Borel subalgebra of g whenever rk − rk 0 is large enough. Finally, note that the construction of k admits natural iteration. Indeed, the Borel subal− gebra b determines a triangular decomposition 0 = + 0 0 , i.e. a Borel subalgebra of k. By applying the above construction to k and its Borel subalgebra k ∩ b, we obtain a toroidal subalgebra k2 of k. Continuing the process, we obtain a strictly descending sequence of toroidal Lie algebras h = ks ⊂ . . . ⊂ k1 = k ⊂ g
(3)
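together with a fixed Borel subalgebra $\mathfrak{k}_i \cap \mathfrak{b}$ of $\mathfrak{k}_i$ for each $1 \le i \le s$.

Example 3. Here is an example of two Borel subalgebras in $\mathfrak{g} = \mathfrak{sl}(3)[3]$ for which $\mathfrak{k}$ is the same but the corresponding sequences (3) differ. Let $\mathring{\Delta} = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta)\}$ and let $\delta_1, \delta_2, \delta_3$ be fixed generators of $\Gamma$. Define the linear function $\tau : \operatorname{span}_{\mathbb{R}} \Delta \to \mathbb{R}$ by putting $\tau(\alpha) = 0$, $\tau(\beta) = \sqrt{2}$, $\tau(\delta_1) = 0$, $\tau(\delta_2) = 1$, $\tau(\delta_3) = \sqrt{3}$. It determines a partition $\Delta = \Delta^{--} \sqcup \Delta' \sqcup \Delta^{++}$, where $\Delta' := \ker \tau \cap \Delta$ and $\Delta^{++} := \{\xi \in \Delta \mid \tau(\xi) > 0\}$, $\Delta^{--} := -\Delta^{++}$. Clearly, $\Delta' = (\mathbb{Z}\delta_1 \setminus \{0\}) \cup (\pm\alpha + \mathbb{Z}\delta_1)$. Consider the following two triangular decompositions of $\Delta'$: $\Delta' = (\Delta')_1^+ \sqcup (\Delta')_1^-$ and $\Delta' = (\Delta')_2^+ \sqcup (\Delta')_2^-$, where
\[
(\Delta')_1^+ := \{n\delta_1,\ \pm\alpha + n\delta_1 \mid n > 0\} \cup \{\alpha\}, \qquad (\Delta')_2^+ := \{n\delta_1 \mid n > 0\} \cup (\alpha + \mathbb{Z}\delta_1), \qquad (\Delta')_i^- := -(\Delta')_i^+ .
\]
Finally, set $(\Delta)_i^{\pm} := \Delta^{\pm\pm} \sqcup (\Delta')_i^{\pm}$. One checks immediately that $\Delta = (\Delta)_i^+ \sqcup (\Delta)_i^-$ for $i = 1, 2$ are two triangular decompositions of $\Delta$. In both cases $\Delta_0 = \Delta'$ and $\mathfrak{k} \cong \mathfrak{sl}(2)[1]$ is an affine Lie algebra with roots $\Delta'$. The two triangular decompositions of $\Delta'$ produce respectively a standard and an imaginary Borel subalgebra of $\mathfrak{k}$. More precisely, the sequences corresponding to the decompositions $\Delta = (\Delta)_1^+ \sqcup (\Delta)_1^-$ and $\Delta = (\Delta)_2^+ \sqcup (\Delta)_2^-$ are respectively
\[
\mathfrak{h} \subset \mathfrak{k} \cong \mathfrak{sl}(2)[1] \subset \mathfrak{g} \tag{4}
\]
and
\[
\mathfrak{h} \subset \mathfrak{k}_2 \cong \mathfrak{t}(\mathring{\mathfrak{h}}, \mathbb{Z}\delta_1, \kappa|_{\mathring{\mathfrak{h}} \times \mathring{\mathfrak{h}}}) \subset \mathfrak{k} \cong \mathfrak{sl}(2)[1] \subset \mathfrak{g}. \tag{5}
\]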
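The partition of the roots of $\mathfrak{sl}(3)[3]$ by the sign of $\tau$ in this example can be tabulated on a finite window of the root system. The Python sketch below is an illustration under stated assumptions: a root $a\alpha + b\beta + n_1\delta_1 + n_2\delta_2 + n_3\delta_3$ is stored as the tuple `(a, b, n1, n2, n3)`, only roots with $|n_i| \le 2$ are listed, the function names are hypothetical, and membership in $\ker \tau$ is decided exactly by using that $1$, $\sqrt{2}$, $\sqrt{3}$ are linearly independent over $\mathbb{Q}$, while floating point is used only for the sign of nonzero values.

```python
import math
from itertools import product

# Roots of sl(3) in the basis (alpha, beta): +-alpha, +-beta, +-(alpha + beta)
FINITE_ROOTS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1)]

def roots(box):
    """Roots (a, b, n1, n2, n3) of sl(3)[3] with |n_i| <= box, following formula (2)."""
    out = []
    for n in product(range(-box, box + 1), repeat=3):
        if n != (0, 0, 0):
            out.append((0, 0) + n)                   # roots gamma != 0
        out.extend(ab + n for ab in FINITE_ROOTS)    # roots alpha + gamma
    return out

def tau(xi):
    """tau(alpha) = 0, tau(beta) = sqrt 2, tau(delta_1) = 0, tau(delta_2) = 1, tau(delta_3) = sqrt 3."""
    a, b, n1, n2, n3 = xi
    return b * math.sqrt(2) + n2 + n3 * math.sqrt(3)

def in_kernel(xi):
    """Exact kernel test: tau(xi) = 0 iff b = n2 = n3 = 0."""
    a, b, n1, n2, n3 = xi
    return b == 0 and n2 == 0 and n3 == 0

all_roots = roots(2)
kernel = [xi for xi in all_roots if in_kernel(xi)]
positive = [xi for xi in all_roots if not in_kernel(xi) and tau(xi) > 0]
negative = [xi for xi in all_roots if not in_kernel(xi) and tau(xi) < 0]
print(len(all_roots), len(kernel), len(positive), len(negative))  # 874 14 430 430
```

Within the window, the kernel consists of the roots $n\delta_1$ ($n \ne 0$) and $\pm\alpha + n\delta_1$, in agreement with the description of $\ker \tau \cap \Delta$ as $(\mathbb{Z}\delta_1 \setminus \{0\}) \cup (\pm\alpha + \mathbb{Z}\delta_1)$.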
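4. On the Structure of Subquotients of $M_{\mathfrak{g}}(\lambda)$

For the rest of the paper $\mathfrak{b}$ is a fixed Borel subalgebra of $\mathfrak{g} = \mathfrak{t}(\mathring{\mathfrak{g}}, \Gamma, \varphi)$ corresponding to the triangular decomposition $\Delta = \Delta^+ \sqcup \Delta^-$. We have $\mathfrak{g} = \mathfrak{n}^- \oplus \mathfrak{k} \oplus \mathfrak{n}^+$, where $\mathfrak{n}^{\pm} := \oplus_{\alpha \in (\Delta^{\pm} \setminus \Delta_{\mathfrak{k}})} \mathfrak{g}^{\alpha}$. For $\lambda \in \mathfrak{h}^*$, set $D_{\mathfrak{k}}(\lambda) := \lambda + \operatorname{span}_{\mathbb{Z}} \Delta_{\mathfrak{k}}$ and put $M_{\mathfrak{k}} := \oplus_{\mu \in D_{\mathfrak{k}}(\lambda)} M^{\mu}$ for any subquotient $M$ of $M_{\mathfrak{g}}(\lambda)$. In this section we assume that $\Gamma_0 \ne 0$, i.e. that $\mathfrak{k}$ is infinite dimensional, and establish a relation between submodules of $M_{\mathfrak{g}}(\lambda)$ and respective submodules of $M_{\mathfrak{k}}(\lambda)$. We consider first the case when $\mathring{\mathfrak{g}}$ is semisimple as here the result is very natural and easy to state.

Theorem 1. Let $\mathring{\mathfrak{g}}$ be semisimple, $\mathfrak{k}$ be infinite dimensional and $N$ be a nontrivial submodule of $M_{\mathfrak{g}}(\lambda)$. The following assertions hold:
(i) $N_{\mathfrak{k}} \ne 0$.
(ii) If $\lambda(\Gamma_0) \ne 0$, then the canonical $\mathfrak{g}$-module homomorphism $U(\mathfrak{g}) \otimes_{U(\mathfrak{k} \oplus \mathfrak{n}^+)} N_{\mathfrak{k}} \to N$ is an isomorphism.

The following corollary is immediate.

Corollary 1. Let $\mathring{\mathfrak{g}}$ be semisimple and $\mathfrak{k}$ be infinite dimensional.
(i) For any $\lambda$ the Verma module $M_{\mathfrak{g}}(\lambda)$ is irreducible if and only if the Verma module $M_{\mathfrak{k}}(\lambda)$ is irreducible.
(ii) If $\lambda(\Gamma_0) \ne 0$, then $L_{\mathfrak{g}}(\lambda) \cong U(\mathfrak{g}) \otimes_{U(\mathfrak{k} \oplus \mathfrak{n}^+)} L_{\mathfrak{k}}(\lambda)$.

The proof of Theorem 1 uses induction on the Lie algebra $\mathfrak{g}$ which requires a similar statement for reductive $\mathring{\mathfrak{g}}$ as well. Therefore, in the rest of the section we will state and prove an analog of Theorem 1 for reductive $\mathring{\mathfrak{g}}$. Theorem 1 is a particular case of Theorem 2 below.

Let $\mathring{\mathfrak{g}} = \mathring{\mathfrak{g}}_{ss} \oplus \mathring{\mathfrak{z}}$ be a reductive Lie algebra with semisimple part $\mathring{\mathfrak{g}}_{ss}$ and center $\mathring{\mathfrak{z}}$. For $\lambda \in \mathfrak{h}^*$, put $\nu := \lambda|_{\mathfrak{z}}$ and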
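\[
Z^-(\nu) := \operatorname{span}\{ z \otimes t^{\gamma} \in \mathfrak{n}^- \mid z \in \mathring{\mathfrak{z}} \text{ and } \lambda(\gamma) = 0 \}.
\]
Then $\mathfrak{n}^- = \tilde{\mathfrak{n}}^- \oplus Z^-(\nu)$, where the splitting of $\mathfrak{t}(\mathring{\mathfrak{h}}, \Gamma, \varphi|_{\mathring{\mathfrak{h}} \times \mathring{\mathfrak{h}}}) \cap \mathfrak{n}^-$ is compatible with the splitting $\mathring{\mathfrak{h}} = \mathring{\mathfrak{h}}_{ss} \oplus \mathring{\mathfrak{z}}$. Set $\tilde{\mathfrak{g}} := \tilde{\mathfrak{n}}^- \oplus \mathfrak{k} \oplus \mathfrak{n}^+$. One can consider $U(\tilde{\mathfrak{g}})$ as a left $U(\mathfrak{g})$-module with trivial action of $Z^-(\nu)$, and define the $\mathfrak{g}$-module $\tilde{M}_{\mathfrak{g}}(\lambda)$ as $U(\tilde{\mathfrak{g}}) \otimes_{U(\mathfrak{b})} v^{\lambda}$.

Theorem 2. Let $\mathring{\mathfrak{g}}$ be reductive, $\mathfrak{k}$ be infinite dimensional and $N$ be a nontrivial submodule of $\tilde{M}_{\mathfrak{g}}(\lambda)$. The following assertions hold.
(i) $N_{\mathfrak{k}} \ne 0$.
(ii) If $\lambda(\Gamma_0) \ne 0$, then the canonical $\mathfrak{g}$-module homomorphism $U(\tilde{\mathfrak{g}}) \otimes_{U(\mathfrak{k} \oplus \mathfrak{n}^+)} N_{\mathfrak{k}} \to N$ is an isomorphism.

The following corollary is immediate.

Corollary 2. Let $\mathring{\mathfrak{g}}$ be reductive and $\mathfrak{k}$ be infinite dimensional.
(i) For any $\lambda$ the module $\tilde{M}_{\mathfrak{g}}(\lambda)$ is irreducible if and only if the Verma module $M_{\mathfrak{k}}(\lambda)$ is irreducible.
(ii) If $\lambda(\Gamma_0) \ne 0$, then $L_{\mathfrak{g}}(\lambda) \cong U(\tilde{\mathfrak{g}}) \otimes_{U(\mathfrak{k} \oplus \mathfrak{n}^+)} L_{\mathfrak{k}}(\lambda)$.

Proof of Theorem 2. Fix an oriented flag $F$ of maximal length which determines the triangular decomposition $\Delta = \Delta^+ \sqcup \Delta^-$ as in Proposition 1 (ii). Let $F_1 = \ker \tau$, i.e. $\alpha \in \Delta$ is a root of $\mathfrak{k}$ if and only if $\tau(\alpha) = 0$, cf. Sect. 3. Put $\mathfrak{b}^{\pm} = \oplus_{\alpha \in \Delta^{\pm}} \mathfrak{g}^{\alpha}$, and let $1_\lambda \in v^{\lambda}$ be a fixed nonzero vector.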
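Consider a decomposition $\mathring{\mathfrak{g}} = \oplus_p \mathring{\mathfrak{g}}_p$, where each $\mathring{\mathfrak{g}}_p$ is either a simple Lie algebra or is the one dimensional abelian Lie algebra $\mathbb{C}z$ for some $z \in \mathring{\mathfrak{z}}$ with $\varphi(z, z) \ne 0$. As a first step we show that it suffices to establish the theorem for each of the algebras $\mathfrak{g}_p := \mathfrak{t}(\mathring{\mathfrak{g}}_p, \Gamma, \varphi|_{\mathring{\mathfrak{g}}_p \times \mathring{\mathfrak{g}}_p})$ (cf. Proposition 2 (ii)). Set $\bar{\mathring{\mathfrak{g}}}_p := \oplus_{i \ne p} \mathring{\mathfrak{g}}_i$, $\bar{\mathfrak{g}}_p := \mathfrak{t}(\bar{\mathring{\mathfrak{g}}}_p, \Gamma, \varphi|_{\bar{\mathring{\mathfrak{g}}}_p \times \bar{\mathring{\mathfrak{g}}}_p})$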
and assume that Theorem 2 is true for g1 and, by induction, also for g¯ 1 . Let X ∈ N be a nonzero weight vector written as X = ( i ui vi wi ) · 1λ , where ui ∈ U (n− ∩ g1 ), vi ∈ U (k ∩ g1 ), wi ∈ U (b− ∩ g¯ 1 ) and the vectors wi · 1λ are linearly independent. g1 (λ|h∩g ). Set Let Mi denote the g1 –module generated by wi · 1λ . Then Mi ∼ = M 1 Xi := (ui vi wi ) · 1λ . The g1 –module Ni generated by Xi is a submodule of Mi . By Theorem 2 (i) applied to the g1 –modules N1 and M1 , there exists u¯ ∈ U (g1 ) for which u¯ · X1 = (uu ¯ 1 v1 ) · (w1 · 1λ ) is a nonzero weight vector in (N1 )k∩g1 . But then u¯ · X = 0 and, clearly, u¯ · X ∈ Nk∩g1 . Repeating the above argument for g¯ 1 instead of g1 and u¯ · X instead of X, we prove that Nk = 0. If we furthermore assume that λ(0 ) = 0, Theorem 2 (ii) applied to the g1 –modules Ni and Mi implies that vi · (wi · 1λ ) ∈ Nk∩g1 , and hence X belongs to the g1 –module generated by ⊕i (Ni )k∩g1 . A similar argument for g¯ 1 implies that N is generated by Nk . ◦ It now remains to prove the theorem when g is simple or one dimensional. First we ◦
consider the case when g = Cz for some z with ϕ(z, z) = 0. Let X ∈ N be a nonzero vector. Write X = ( i ui vi ) · 1λ , where ui ∈ U (n− ) are distinct monomials in {z ⊗ t γ | λ(γ ) = 0} and vi ∈ U (k). We will show that vi ∈ Nk for every i. If z ⊗ t γ occurs in some of the monomials ui , then for any r ∈ Z+ we have −γ r r ui v i · 1 λ , (z ⊗ t ) · X = r!(−ϕ(z, z)λ(γ )) i
where ui is the monomial ui with r terms equal to z ⊗ t γ erased whenever at least r such terms occur in ui , and ui = 0 otherwise. (In this case U (n− ) is commutative so the procedure of erasing terms from monomials is not ambiguous.) Taking r to be the maximal degree in which z ⊗ t γ occurs in a monomial ui , we obtain the nonzero vector 1 (z ⊗ t −γ )r · X = uj vj · 1λ ∈ N, X := r!(−ϕ(z, z)λ(γ ))r j
where vj equals vi for some i and uj is obtained from ui by erasing all terms equal to
z ⊗ t γ . The same procedure applied to X yields the nonzero vector 1 −γ r ) ·X = uk vk · 1λ ∈ N, X := (z ⊗ t (r )!(−ϕ(z, z)λ(γ ))r k
where vk equals vi for some i and uk is obtained from ui by erasing all terms equal to z ⊗ t γ and z ⊗ t γ . Continuing in this way, we obtain that vi0 · 1λ ∈ N for some i0 . Then we set X¯ := X − (ui0 vi0 ) · 1λ and show by an easy induction argument that vi · 1λ ∈ N ◦ for every i. This completes the proof in the case when g is abelian. Note that in this case the condition on λ is not needed for assertion (ii). ◦ For the rest of the proof g is a simple Lie algebra and we fix an indivisible element γ ∈ 0 , which has the additional property λ(γ ) = 0, whenever λ(0 ) = 0. (An element γ ∈ 0 is indivisible if lγ ∈ for l ∈ Q implies l ∈ Z.) The proof will be carried out in two separate cases: when rk 0 ≥ rk − 1 and when rk 0 < rk − 1.
◦
◦
Case 1. rk 0 ≥ rk − 1. Fix a basis B of g consisting of root vectors of g and of a ◦
◦
◦
basis of h with the property that for each h ∈ h ∩ B there are corresponding elements ◦
e, f ∈ B so that {e, h, f } forms a canonical sl(2)–basis. Then the set B = {x ⊗ t γ ∈ ◦ b− | x ∈ B , γ ∈ } is a basis of b− . For x ∈ B, let rt(x) denote the corresponding root. Assume N µ = 0 and fix a nonzero vector X ∈ N µ . Write X = ( i ui vi ) · 1λ , where − ui ∈ U (n ) are distinct monomials in the basis B and vi ∈ U (k). We are going to prove that (i) Nk = 0; (ii) if λ(0 ) = 0, then vi ∈ Nk for every i. To prove (ii) it is enough to show that vi0 · 1λ ∈ Nk for some i0 for which ui0 has maximal degree (here degree refers to the canonical filtration of U (n− )) among all ui . Indeed, then we will consider X − ui0 vi0 · 1λ = ( i =i0 ui vi ) · 1λ , for which there are fewer monomials ui whose degree equals the degree of ui0 , and will complete the argument by induction. Fix a set of generators γ1 , . . . , γn of so that γ1 , . . . , γn−1 ∈ 0 . Recall that ◦
Q = spanZ and notice that τ (Q) = Zτ (γn ) + τ () is a discrete subset of R. Furthermore, fix a linear order on − with the property that β ≺ α whenever one of the following conditions holds: – for some i α is a root of ki and β is not a root of ki ; – there is no such i, but (α, γ ) < (β, γ ). Finally, fix a linear order on B in a compatible way, i.e. x ≺ y whenever rt(x) ≺ rt(y). The proof of (i) and (ii) will be carried out by induction on τ (λ − µ). Denote by TX− the set of roots α of n− for which there exists a root vector x ∈ gα ∩ B which occurs in a monomial ui of maximal degree, and let TX0 be the set of roots β of k for which there exists a root vector y ∈ gβ ∩ B which occurs in some vi . Set TX := TX− ∪ TX0 . Fix a root α0 ∈ TX− that cannot be written as a nonnegative integer combination of roots from (TX ∪ {γ })\{α0 } (since TX is a finite set, such a root always exists). Now we consider two subcases. ◦
◦
◦
◦
Subcase 1. α0 ∈ . Then α0 = α 0 + γ0 for some α 0 ∈ and γ0 ∈ . Let f ∈ B be ◦ ◦ the root vector corresponding to α 0 and let e ∈ B be the root vector corresponding to ◦ −α 0 . Set h := [e, f ]. We claim that the vector X := (e ⊗ t −γ0 +Kγ ) · X is nonzero for large enough K ∈ Z+ . Indeed, assume that f ⊗ t γ0 occurs in u1 = x1 . . . xp , where xi ∈ B ∩ n− and α0 is among rt(x1 ), . . . , rt(xp ). Then (e ⊗ t −γ0 +Kγ ) · (u1 v1 · 1λ ) −γ 0 ( i=1 x1 . . . xi−1 [e ⊗ t +Kγ , xi ]xi+1 . . . xp v1 ) · 1λ (x1 . . . xj −1 xj +1 . . . xp )((lϕ(e, f )) (h ⊗ t Kγ )v1 ) · 1λ + X p
= = ,
(6)
γ0 where j is the minimal index for which xj = f ⊗t and there are exactly l such indices. Now write X as X = ( j uj vj ) · 1λ , where u1 := x1 . . . xj −1 xj +1 . . . xp , see (6). We claim that v1 = (lϕ(e, f )) (h ⊗ t Kγ )v1 , i.e. that u1 can only be obtained from commuting vectors of the form e ⊗ t −γ0 +Kγ with f ⊗ t γ0 in (6). Indeed, this is a consequence of
the choice of α0 : rt(e ⊗ t −γ0 +Kγ ) = −α0 + Kγ . On one hand, commuting e ⊗ t −γ0 +Kγ consecutively with vectors among x1 , . . . , xp different from f ⊗ t γ0 cannot result in a root vector whose corresponding root belongs to . On the other hand, commuting e ⊗ t −γ0 +Kγ consecutively with at least two vectors among x1 , . . . , xp of which at least one equals f ⊗ t γ0 will result in a root vector of n− whose corresponding root β has a product (β, γ ) larger than all such products corresponding to the vectors xi . This proves that X = 0, and clearly X ∈ N µ with τ (λ − µ ) < τ (λ − µ). Hence, by induction, Nk = 0. If, furthermore, λ(0 ) = 0, the induction assumption implies 1/(lϕ(e, f )) v1 = ((h ⊗ t Kγ )v1 ) · 1λ ∈ N . If h ⊗ t Kγ ∈ k2 , then h ⊗ t −Kγ · (((h ⊗ t Kγ )v1 ) · 1λ ) = −Kϕ(h, h)λ(γ )v1 · 1λ ∈ N , which shows that v1 · 1λ ∈ N . The proof is complete in this subcase. If h ⊗ t Kγ ∈ k2 , we use Theorem 2 for the proper subalgebra k instead of g, its Borel subalgebra b ∩ k, the corresponding Verma module Mk (λ) and its submodule N ∩ Mk (λ). A computation similar to the one above shows that v1 · 1λ ∈ N . Subcase 2. α0 = γ0 ∈ . Fix a vector h⊗t γ0 which occurs in a monomial ui of maximal ◦
degree and consider the vectors e and f of B corresponding to h, see the convention ◦ ◦ ◦ about the choice of B . Denote the root of g corresponding to e by α. Let q be the small◦
− . Then, similarly to Subcase 1, consider est integer for which α + qγ0 is a root of n +Kγ −qγ 0 X := (f ⊗ t ) · X and write X = ( j u1 v1 ) · 1λ . There are two possibilities: α + (q − 1)γ0 is a root either of n+ or of k. An argument similar to the one used in Subcase 1 shows that, for large K,
– if α + (q − 1)γ0 is a root of n+ , then we have (after possible relabeling) u1 = (f ⊗ t −(q−1)γ0 +Kγ )x1 . . . xi−1 xi+1 . . . xp and v1 = −2l v1 , where i is the minimal index for which xi = h ⊗ t γ0 and l is the number of indices r with xr = h ⊗ t γ0 ; – if α + (q − 1)γ0 is a root of k, then we have (after possible relabeling) u1 = x1 . . . xi−1 xi+1 . . . xp and v1 = (−2l) (f ⊗ t −(q−1)γ0 +Kγ )v1 , where i is the minimal index for which xi = h ⊗ t γ0 and l is the number of indices r with xr = h ⊗ t γ0 . The proof is completed by induction as in Subcase 1. Case 2. rk 0 < rk − 1. In this case the set {τ (γ ) | γ ∈ } is a dense subset of R. Fix a nonzero vector X ∈ N and write X = ( i ui vi ) · 1λ , where ui ∈ U (n− ) are linearly ◦
independent and vi ∈ U (k). Furthermore, fix α ∈ and a nonzero e ∈ gα . We element will prove first that N contains nonzero vectors of the form ( i ui vi ) · 1λ , where ui are (distinct) monomials in {e ⊗ t γ ∈ n− | γ ∈ } and all vi are among vi . ◦
As in Case 1 we work with a basis B of g consisting of root vectors of g and a basis of ◦
h. However, the basis for h will be fixed later in the proof. Note that, since h is abelian, ◦
U (t(h, , ϕ
◦
◦
◦
|h×h
) ∩ n− ) is simply the symmetric algebra of t(h, , ϕ
◦
◦
|h×h ◦
) ∩ n− , and
hence all definitions below are independent of the choice of a basis of h. Consider a ◦
linear order on B such that ◦
– if x1 and x2 ∈ B are root vectors with corresponding roots β 1 and β 2 , then (α, β 1 ) < ◦ ◦
(α, β 2 ) implies x1 ≺ x2 ,
– if x1 ∈ B is a root vector with corresponding root β 1 , and x2 ∈ B ∩h, then (α, β 1 ) < 0 ◦ ◦
implies x1 ≺ x2 and (α, β 1 ) ≥ 0 implies x2 ≺ x1 .
We set B := {x ⊗ t γ ∈ b− | x ∈ B } and extend the order on B to a linear order ◦ on B, i.e. if x1 ≺ x2 ∈ B , then x1 ⊗ t γ ≺ x2 ⊗ t γ ∈ B. Any monomial u = z1 . . . zl in B is written so that if zi ≺ zj , then i < j . The inverse lexicographical order on the monomials in B has the following property: if zl−i ≺ zl −i , where i is the minimal nonnegative integer for which zl−i and zl −i correspond to different ele◦
ments of B , then u = z1 . . . zl ≺ z1 . . . zl = u . Also, if u = u u, with u = 1, then u ≺ u. Let X = ( i ui vi ) · 1λ ∈ N µ , where ui are distinct monomials in B. For u = (x1 ⊗ t γ1 ) . . . (xl ⊗ t γl ) we define the height of a monomial u in B as ◦ ◦◦ ◦ ◦ ◦ ◦ ht(u) := j (α − β j , α), where xj ∈ gβ j , assuming that g0 := h. We are going to prove the existence of ( i ui vi ) · 1λ ∈ N as above by induction on the maximal height of the monomials u1 , u2 , . . . . If the maximal height is 0, there is nothing to prove. Otherwise, assume that the monomial u1 has maximal height and is, in addition, maximal (with respect to the inverse lexicographical order) among all monomials of maximal height. ◦
Let u1 = x1 . . . xk . . . xl , where k is the largest index for which the vector z of B corre◦ ◦ ◦ ◦ sponding to xk does not equal e. We now choose a root vector e ∈ gα with (α, α ) > 0 ◦
◦
such that [e , z] = 0, which is possible as g is simple. At this point, if z ∈ h, we fix a ◦
basis h1 , h2 , . . . of h such that [e , hi ] = 0 for i ≥ 2. Since {τ (γ ) | γ ∈ } is dense in R, ◦ there exists γ ∈ such that τ (rt(x)) < −(τ (α ) + τ (γ )) < 0 for any x ∈ B entering a monomial ui . It is not difficult to check that (e ⊗ t γ +Kγ ) · X = j uj vj , where vj are among vi and the height of each of the elements uj is smaller than the height of u1 . This proves the existence of a vector in N as required. We may furthermore assume that ◦ all monomials ui are of the same degree. Indeed, consider h and f ∈ g so that {e, h, f } is a canonical sl(2)–basis. Then, repeating the procedure above, we can find a nonzero vector X = ( i ui vi ) · 1λ for which ui are monomials in {f ⊗ t γ ∈ n− | γ ∈ }. Now the monomials ui are necessarily of the same degree. Assume that X = ( i ui vi ) · 1λ ∈ N , where ui are (distinct) monomials of the same degree d in {e ⊗ t γ ∈ n− | γ ∈ }. As in Case 1 we need to prove that Nk = 0, and, if λ(0 ) = 0, then vi ∈ Nk . The proof will be carried by induction on the maximal degree of a monomial ui . Set X := {γ ∈ | e ⊗ t γ occurs in a monomial ui }. Let γ0 ∈ X be the element with the following properties: τ (γ0 ) = minγ ∈ τ (γ ) and if τ (γ0 ) = τ (γ ) for some γ ∈ X , then γi − γ0 ∈ + . This simply means that, for any integer K and any γ ∈ X , [f ⊗ t −γ0 +Kγ , e ⊗ t γ ] is either in n+ or in k ∩ b− . Assume furthermore, that u1 = x1 . . . xl , where exactly p of the vectors x1 , . . . , xl equal e ⊗ t γ0 , and x1 is one of them. Consider X := (f ⊗ t −γ0 +Kγ ) · X. The reader will verify easily that if X = ( i ui vi ) · 1λ , then, after a possible relabeling, u1 = x2 . . . xl , v1 = p(h ⊗ t Kγ )v1 and all ui are polynomials in {e ⊗ t γ ∈ n− | γ ∈ } of degree d − 1. An obvious induction argument shows that Nk = 0. In order to prove that v1 ∈ Nk we proceed as in the last paragraph of the proof of Case 1, Subcase 1. ◦
Example 4. Let g be simple, 0 = and k = h. In this case Theorem 1 has been proved in [FK]. The other previously treated cases are:
– rk = 1, arbitrary g and arbitrary k, [CFM], ◦ ◦ ◦ – rk = 2, rk 0 = 1, arbitrary g and k = g, [CF], ◦
– rk = 2, rk 0 = 1, g = sl(2) and arbitrary k (i.e., k = g or k = h), [FK]. Example 3 revisited. In the notations of Example 3, 0 = Zδ1 is the lattice corresponding both to k and k2 . Starting with the first Borel subalgebra we can apply Theorem 1 under the condition that λ(δ1 ) = 0. This reduces the study of subquotients of Mg (λ) to the study of subquotients of Verma modules of k ∼ = sl(2)[1] for a standard Borel subalgebra, which are known to have a rich and interesting structure. For the second Borel subalgebra we can apply the reduction theorem twice under the same condition. (Note that in general one needs a stronger condition to apply a second reduction.) This shows that the structure of the subquotients of Mg (λ) is much simpler in this case, as all Verma modules of the Heisenberg algebra k2 with nonzero highest weight are irreducible. 5. Equivalence of Categories In this final section we introduce an analog of the category O for toroidal Lie algebras and extend Theorems 1 and 2 to appropriate equivalences of categories. ◦ Consider first the case when g is semisimple. Let Og denote the full subcategory of the category g–mod whose objects M admit an ascending g–module filtration {Mi } such that the quotients Mi /Mi+1 are isomorphic to subquotients of Verma modules. The category Ok is defined analogously. Since z acts via a character on each subquotient of a highest weight module, z acts locally finitely on any object of Og or Ok . Both categories Og and Ok are closed under countable direct sums. Let Og (ν) and Ok (ν) be the full subcategories of Og and Ok respectively, such that their objects have the property that z acts via the homomorphism ν : z → C on all quotients of their defining filtrations. The category Og is then the full subcategory of countable dimensional g–modules in the direct product of categories ×ν∈z∗ Og (ν) whose objects are countably dimensional vector spaces over C. A similar statement holds for Ok as well. 
Induction provides a functor
I : O_k → O_g,   N ↦ U(g) ⊗_{U(k⊕n+)} N,
where n+ acts trivially on N . The canonical image of N in U (g) ⊗U (k⊕n+ ) N consists of n+ –invariants and, hence, I (N ) is generated by its n+ –invariants. Clearly, the restriction of I on Ok (ν), which we also denote by I , is a well defined functor I : Ok (ν) → Og (ν). For ν(0 ) = 0 we will explicitly construct a functor adjoint to I . We start with the following lemma. +
Lemma 1. Let M be a subquotient of Mg (λ). If λ(0 ) = 0, then M n = Mk , where + M n are the n+ –invariants of M and Mk is as in Sect. 4. Proof. It suffices to prove the lemma for quotients of Mg (λ). Let M = Mg (λ)/K. By + definition, (Mg (λ)/K)k = Mg (λ)k /Kk and, by Theorem 1, Mg (λ)n = Mg (λ)k and + + + + K n = Kk . Hence, it is enough to show that (Mg (λ)/K)n = Mg (λ)n /K n . + + + The inclusion (Mg (λ)/K)n ⊃ Mg (λ)n /K n is obvious. The opposite inclusion + follows from Theorem 1. Indeed, assume to the contrary, that 0 = m ∈(Mg (λ)/K)n ∈
Mg (λ)n /K n . Then m generates a proper submodule M of M with the property that Mk ∩ Mk = 0. The preimage M of M in Mg (λ) has the property that Mk = Kk , which is in contradiction with Theorem 1 (ii) as M = K. Lemma 1 implies now that for ν(0 ) = 0 we have a well defined functor R : Og (ν) → Ok (ν),
M ↦ M^{n+}.
Indeed, in order to see that M n is an object of Ok (ν) we consider a filtration of M whose associated quotients are subquotients of Verma g–modules. The consecutive application of Lemma 1 to the quotients Mi /Mi−1 ensures that the associated quotients of + + the filtration {Min } of M n are subquotients of Verma k–modules. + If ν(0 ) = 0, it is not true in general that, for any object M of Og (ν), R(M) = M n is an object of Ok (ν). Here is an example. Example 5. Let g = sl(2)[1], let b be an imaginary Borel subalgebra, i.e. k is the Heisenberg algebra, and let λ = 0. The module Mg (0) is well understood, see [F2]. Denote by N the maximal proper submodule of Mk (0), let K be the g–submodule of Mg (0) generated by N and set M := Mg (0)/K. A direct computation shows that M n+ contains + + a k–submodule isomorphic to n− . Therefore, if M n admitted a filtration {(M n )i } as − − required, n would have nontrivial k ∩ b–invariants. However, the k–module n has no + k ∩ b–invariants, and hence M n is not an object of the category Ok (0). ◦
Theorem 3. Let g be semisimple and ν(0 ) = 0. (i) The functor I : Ok (ν) → Og (ν) is a left adjoint to the functor R : Og (ν) → Ok (ν), i.e. R ◦ I is naturally isomorphic to the identity functor on Ok (ν). (ii) The functors R and I are mutually inverse equivalences of Ok (ν) and the full subcategory Ogk (ν) of Og (ν) whose objects are generated as g–modules by their n+ – invariants. Proof. (i) Let N be an object of Ok (ν) with a filtration {Ni }. Clearly, there is a canonical injection of k–modules +
: N → (U (g) ⊗U (k⊕n+ ) N )n . To see that is a surjection, assume the contrary. Then for some minimal i there is a + + nonzero vector m ∈ Min := (U (g) ⊗U (k⊕n+ ) Ni )n , m ∈ (Ni ). Hence the projection + of m in Mi /Mi−1 is a nontrivial n –invariant vector such that m ∈ (Mi /Mi−1 )k . This contradicts Lemma 1, therefore is an isomorphism. The above argument shows that R ◦ I is naturally isomorphic to the identity functor on the objects of Ok (ν). A straightforward checking shows that this is true on morphisms too. To show (ii), notice that for any object M of Og (ν) there is a canonical morphism of g–modules + : U (g) ⊗U (k⊕n+ ) M n → M. We claim that is injective. To see this it suffices to prove that is injective on + U (g) ⊗U (k⊕n+ ) Min for every i, {Mi } being the defining filtration on M. The latter is established by a simple inductive argument based on Theorem 1. Finally, in order to assert that is surjective one has to assume that M is an object of Ogk (ν). This implies (ii) modulo some details that the reader will easily fill in.
Here is an explicit example of an object in Og (ν) which is not an object of Ogk (ν). Example 6. Let g = sl(2)[1], cf. Example 1. Set e(i) := e ⊗ t i , f (i) := f ⊗ t i , h(i) = h ⊗ t i , for i ∈ Z. Here {e, f, h} is a standard basis of sl(2) and κ is normalized so that κ(h, h) = 2. Then the explicit commutation relations of sl(2)[1] are as follows: [e(i), f (j )] = h(i + j ) + iδi,−j c, [h(i), e(j )] = 2e(i + j ), [h(i), f (j )] = −2f (i + j ), [h(i), h(j )] = 2iδi,−j c, [e(i), e(j )] = [f (i), f (j )] = 0, [d, e(i)] = ie(i), [d, h(i)] = ih(i), [d, f (i)] = if (i), [c, e(i)] = [c, h(i)] = [c, f (i)] = [c, d] = 0. Fix the Borel subalgebra b = h ⊕i>0 Ch(i) ⊕i Ce(i). For the corresponding triangular decomposition g = n− ⊕ k ⊕ n+ we have n− = ⊕i Cf (i), k = ⊕i Ch(i) ⊕ Cc ⊕ Cd and n+ = ⊕i Ce(i). First we construct an indecomposable k⊕n+ –module M0 of length 2. Let C[s1 , s2 , ...] be the ring of polynomials in infinitely many variables {si }i=1,2,... , and let W = Cv ⊕Cu be a two dimensional vector space. Set ∂ 2i ∂si p(s) for i < 0, h(i) · p(s) := si p(s) for i > 0, ∂ c · p(s) := p(s), and d · p(s) := ( ∞ i=1 i ∂si )p(s) for any p(s) ∈ C[s1 , s2 , . . . ]. Then M := C[s1 , s2 , . . . ] ⊗ W is a ⊕i =0 Ch(i) ⊕ Cc ⊕ Cd–module with the above action on C[s1 , s2 , . . . ] and with trivial action on W . By putting h(0) · (p(s) ⊗ v) := 0 and h(0) · (p(s) ⊗ u) := −2p(s) ⊗ u, we define a k–module structure on M . Clearly, M decomposes into a direct sum of the irreducible k–modules C[s1 , s2 , . . . ] ⊗ v and C[s1 , s2 , . . . ] ⊗ u. Below we will extend the k–module structure on M to a k⊕n+ –module structure such + that (M )n = C[s1 , s2 , . . . ] ⊗ v. Then M := I (M) = U (g) ⊗U (k⊕n+ ) M will be an indecomposable g–module with composition series 0 ⊂ M1 ⊂ M with M/M1 ∼ = M2 , where M1 := I (C[s1 , s2 , . . . ] ⊗ v) and M2 := I (C[s1 , s2 , . . . ] ⊗ u), n+ acting trivially on C[s1 , s2 , . . . ]⊗v and C[s1 , s2 , . . . ]⊗u. Indeed, C[s1 , s2 , . . . ]⊗v and C[s1 , s2 , . . . 
] ⊗ u are irreducible k–modules on which c acts nontrivially, and hence, by Theorem 1, both M1 and M2 are irreducible g–modules. Finally, M^{n+} = C[s1, s2, . . . ] ⊗ v, and hence (I ◦ R)(M) ≅ M1. To construct the k ⊕ n+–structure on M we first set e(n) · (p(s) ⊗ v) := 0 and recall that Schur's polynomials S_n are given by their generating function
\[ \exp\Big(\sum_{i \ge 1} s_i x^i\Big) = \sum_{n \in \mathbb{Z}} S_n(s_1, s_2, \dots)\, x^n. \]
(Note that S_n = 0 for n < 0.) From the definition of S_n we obtain
\[ \frac{\partial}{\partial s_i} S_n(s) = S_{n-i}(s). \]
We show now that there exists a unique k ⊕ n+–module structure on M which extends its k–module structure and such that e(0) · (1 ⊗ u) = 1 ⊗ v. The uniqueness is clear. For the existence, we first use the commutation relations [h(i), e(−n)] = 2e(i − n) with i > 0 and n > 0 to derive the equality e(−n) · (1 ⊗ u) = S_n(s̄) ⊗ v, where s̄ stands for s1, s2/2, s3/3, . . . . Furthermore, we extend the latter formula to
\[ e(-n) \cdot (s_{i_1} s_{i_2} \cdots s_{i_l} \otimes u) = \sum_{L \subset K} (-2)^{\#L} \Big(\prod_{i \in K \setminus L} s_i\Big) S_{n - \sum_{i \in L} i}(\bar{s}) \otimes v, \tag{7} \]
where n ∈ Z, K is the set of all indices i1, i2, . . . , il with possible repetitions, and #L stands for the number of elements of L. We leave it to the reader to check that (7) together
with the equality e(n) · (p(s) ⊗ v) = 0 determines a k ⊕ n+ –module structure on the + k–module M . Finally, since M has length 2 as a k–module, (M )n is a k–submodule + of M , and e(0) · (1 ⊗ u) = 1 ⊗ v, we conclude that (M )n = C[s1 , s2 , . . . ] ⊗ v. Recall that the Borel subalgebra b of g determines a strictly decreasing sequence of toroidal Lie algebras (3). In order to be able to construct a chain of adjoint functors between ki –modules and ki+1 –modules, we need a generalization of Theorem 3 in the ◦ case of a reductive g. Theorem 2 enables us to modify the category Og (ν) and the functor I accordingly. ◦ g (ν) be the full subcategory of Og (ν) on whose Assume now that g is reductive. Let O − objects Z (ν) acts trivially, and let I˜ be the functor g (ν), I˜ : Ok (ν) → O
N˜ → U (˜g) ⊗U (k⊕n+ ) N,
where n+ acts trivially on N . Using Theorem 2 one establishes that for any object M ◦ g (ν), R(M) is an object of Ok (ν). The main result for reductive g is the following of O theorem. ◦
Theorem 4. Let g be reductive and ν(0 ) = 0. g (ν) is a left adjoint to the functor R : O g (ν) → (i) The functor I˜ : Ok (ν) → O Ok (ν), i.e. R ◦ I˜ is naturally isomorphic to the identity functor on Ok (ν). (ii) The functors R and I˜ are mutually inverse equivalences of Ok (ν) and the full k (ν) of O g (ν) whose objects are generated as g–modules by their n+ – subcategory O g invariants. The proof of Theorem 4 is a straightforward modification of the proof of Theorem 3 based on Theorem 2. Acknowledgements. The authors gratefully acknowledge MSRI’s support and excellent working conditions. We also thank the referee for several thoughtful comments. Vyacheslav Futorny is a Regular Associate of the ICTP.
References
[BB] Berman, S., Billig, Y.: Irreducible representations for toroidal Lie algebras. J. Algebra 221, 188–231 (1999)
[BC] Berman, S., Cox, B.: Enveloping algebras and representations of toroidal Lie algebras. Pacific J. Math. 165, 239–267 (1994)
[C] Cox, B.: Verma modules induced from nonstandard Borel subalgebras. Pacific J. Math. 165, 269–294 (1994)
[CF] Cox, B., Futorny, V.: Borel subalgebras and categories of highest weight modules for toroidal Lie algebras. J. Algebra 236, 1–28 (2001)
[CFM] Cox, B., Futorny, V., Melville, D.: Categories of nonstandard highest weight modules for affine Lie algebras. Math. Z. 221, 193–209 (1996)
[D] Donaldson, S.: Anti self–dual Yang–Mills connections over complex algebraic surfaces and stable vector bundles. Proc. Lond. Math. Soc. 50(3), 1–26 (1985)
[DP1] Dimitrov, I., Penkov, I.: Partially and fully integrable modules over Lie superalgebras. In: Studies in Advanced Mathematics, S.-T. Yau (ed.), 4, Providence, RI: AMS and Internatl. Press, 1997, pp. 49–67
[DP2] Dimitrov, I., Penkov, I.: Partially integrable highest weight modules. Transform. Groups 3, 241–253 (1998)
[F1] Futorny, V.: Representations of Affine Lie algebras. Queen's Papers in Pure and Applied Mathematics 106, Queen's University, 1997
[F2] Futorny, V.: Verma type modules of level 0 for affine Lie algebras. Trans. Am. Math. Soc. 349, 2619–2661 (1997)
[FK] Futorny, V., Kashuba, I.: Verma type modules for toroidal Lie algebras. Comm. Algebra 27, 3979–3991 (1999)
[IKU] Inami, T., Kanno, H., Ueno, T.: Higher-dimensional WZW model on Kähler manifold and toroidal Lie algebra. Mod. Phys. Lett. A 12, 2757–2764 (1997)
[IKUX] Inami, T., Kanno, H., Ueno, T., Xiong, C.-S.: Two–toroidal Lie algebra as current algebra of four–dimensional Kähler WZW model. Phys. Lett. B 399, 97–104 (1997)
[JK] Jakobsen, H., Kac, V.: A new class of unitarizable highest weight representations of infinite dimensional Lie algebras. Lecture Notes in Physics 226, Berlin-Heidelberg-New York: Springer-Verlag, 1985, pp. 1–20
[K] Kassel, C.: Kähler differentials and coverings of complex simple Lie algebras extended over a commutative algebra. J. Pure Appl. Algebra 34, 265–275 (1984)
[MRY] Moody, R., Rao, S., Yokonuma, T.: Toroidal Lie algebras and vertex representations. Geom. Dedicata 35, 283–307 (1990)
Communicated by L. Takhtajan
Commun. Math. Phys. 250, 65–80 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1118-3
Communications in
Mathematical Physics
Quasi-Spherical Metrics and Applications
Yuguang Shi1, Luen-Fai Tam2
1 Key Laboratory of Pure and Applied Mathematics, School of Mathematics Science, Peking University, Beijing 100871, P.R. China. E-mail: [email protected]
2 Department of Mathematics, The Chinese University of Hong Kong, Shatin, Hong Kong, P.R. China. E-mail: [email protected]
Received: 7 July 2003 / Accepted: 22 October 2003
Published online: 1 June 2004 – © Springer-Verlag 2004
Abstract: In this paper, using the idea of Bartnik [B2] on quasi-spherical metrics, we continue our study on the boundary behaviors of compact manifolds with nonnegative scalar curvature and nonempty boundary. Unlike the previous work [ST] of the authors and the work of Liu-Yau [LY], we only assume each boundary component has nonnegative curvature which is not identically zero. We also study the case that the boundary is embedded in the quotient of the infinity of the Euclidean space over a finite group. The regularity of the black hole boundary condition of quasi-spherical metrics is also discussed.

0. Introduction
In [B2], Bartnik introduced a family of metrics with prescribed scalar curvature, which are called quasi-spherical metrics. Let Σ_0 be a compact strictly convex surface in R³, so that the exterior of Σ_0 can be foliated by the level surfaces Σ_r of the distance function from Σ_0. One can construct a metric with zero scalar curvature on the exterior of Σ_0 which is of the form u² dr² + g_r, where g_r is the induced metric from R³ on Σ_r and u is a solution of
\[ \begin{cases} H_0 \dfrac{\partial u}{\partial r} = u^2 \Delta_r u + (u - u^3) K & \text{on } \Sigma_0 \times [0, \infty), \\ u(x, 0) = u_0(x), \end{cases} \tag{0.1} \]
where H_0 is the mean curvature and K is the Gauss curvature of Σ_r, and Δ_r is the Laplacian on Σ_r. If Σ_0 is the standard sphere, then (0.1) is a special case considered in [B2]. For simplicity, we still call metrics obtained in this way quasi-spherical metrics.
Research partially supported by NSF of China, Projects 10001001 and 10231010. Research partially supported by Earmarked Grant of Hong Kong #CUHK4032/02P.
The construction of [B2] is very useful. In [ST], by studying the properties of (0.1), the authors proved the following:

Theorem 0.1. Let (Ω³, g) be a compact manifold of dimension three with boundary and with nonnegative scalar curvature. Suppose ∂Ω has finitely many components Σ_i so that each component has positive Gauss curvature and positive mean curvature H with respect to the unit outward normal. Then for each boundary component Σ_i,
\[ \int_{\Sigma_i} H \, d\sigma \le \int_{\Sigma_i} H_0^{(i)} \, d\sigma, \tag{0.2} \]
where H_0^{(i)} is the mean curvature of Σ_i with respect to the outward normal when it is isometrically embedded in R³, and dσ is the volume form on Σ_i induced from g. Moreover, if equality holds in (0.2) for some Σ_i, then ∂Ω has only one component and Ω is a domain in R³.

This result can be interpreted as positivity of the quasi-local mass defined by Brown and York [BY 1, BY 2]. Later, in [LY], Liu and Yau proposed new definitions of quasilocal energy and momentum surface energy of a spacelike 2-surface with positive intrinsic curvature in a spacetime. Using Theorem 0.1, they showed that the quasilocal energy of the boundary of a compact spacelike hypersurface which satisfies the dominant energy condition is strictly positive unless the spacetime is flat along the spacelike hypersurface. However, both papers need to assume that the Gauss curvature of the boundary surface is positive. It is interesting to know whether this assumption can be removed. One purpose of this paper is to consider the case that the boundary of the manifold is only assumed to have nonnegative Gauss curvature which is positive somewhere in each boundary component. More precisely, we will show:

Theorem 0.2. Let (Ω³, g) be a compact manifold of dimension three with smooth boundary and with nonnegative scalar curvature. Suppose ∂Ω has finitely many components Σ_i so that each component has nonnegative Gauss curvature which is positive somewhere and has positive mean curvature H with respect to the unit outward normal. Then for each boundary component Σ_i,
\[ \int_{\Sigma_i} H \, d\sigma \le \int_{\Sigma_i} H_0^{(i)} \, d\sigma, \tag{0.3} \]
where H_0^{(i)} is the mean curvature of Σ_i with respect to the outward normal when it is isometrically embedded in R³, and dσ is the volume form on Σ_i induced from g.

To prove Theorem 0.1, one embeds the surface in R³, which can be done if the Gauss curvature is strictly positive, see [N]. Then one can construct quasi-spherical metrics. If we only assume that the surface has nonnegative Gauss curvature, then there is an example by Iaia [I] of a smooth surface (S², g) with nonnegative Gauss curvature which cannot be C³ embedded in R³. However, such a surface still has a C^{1,1} embedding by [GL, HZ]. In particular, the mean curvature of the embedded surface can be defined almost everywhere so that (0.3) is meaningful. If equality holds in (0.3), then we have the following:

Theorem 0.3. With the same assumptions as in Theorem 0.2, if equality holds in (0.3) for some Σ_i, then ∂Ω has only one component and Ω is scalar flat. If in addition each Σ_i can be C⁴ embedded in R³ with positive mean curvature, then Ω is a domain in R³.
An interesting boundary condition in (0.1) is the black hole boundary condition u⁻¹ = 0. In case Σ_0 ⊂ R³ is the sphere with radius 2M, M > 0, the solution with this boundary data is just the Schwarzschild 3-metric. Hence it is interesting to understand what we can say about the regularity of the metric near r = 0. We are able to prove that, similar to the Schwarzschild metric, the constructed metric has bounded curvature, can be extended to be Lipschitz at r = 0, and the surface r = 0 is totally geodesic. This is a partial generalization of Bartnik [B2, Theorem 4.6] and is related to a question of Bartnik [B3].

In certain cases the results in [ST] are still true in the asymptotically locally Euclidean (ALE) case. More precisely, if Ω is spin and the boundary of Ω in Theorem 0.1 is assumed to be embedded in (Rⁿ \ {0})/Γ, where Γ is a finite subgroup of SO(n) which acts freely on Rⁿ \ {0}, then under certain conditions on Γ, results similar to those in Theorem 0.1 are true. In particular, if Γ is a subgroup of SU(m) so that Γ has odd order, then (0.2) is still true, where H_0 is the mean curvature of the boundary surface in (Rⁿ \ {0})/Γ. This gives some restriction on the existence of metrics with nonnegative scalar curvature on a compact manifold with boundary whose components may not be simply connected.

The paper is organized as follows: Theorems 0.2 and 0.3 will be proved in §1; in §2 we study regularity properties of solutions of (0.1) with black hole boundary condition; and in §3 we discuss nonexistence of certain Riemannian metrics on compact manifolds with boundary.

1. Boundary with Nonnegative Curvature
We first recall some results in [ST]. Let Σ_0 be a strictly convex compact hypersurface in Rⁿ. Then the exterior of Σ_0 in Rⁿ with the standard metric can be represented by (Σ_0 × (0, ∞), dr² + g_r), where g_r is the induced metric on Σ_r, which is the hypersurface consisting of the points outside Σ_0 and with distance r from Σ_0.
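As a quick sanity check of the round-sphere case mentioned above, the following Python sketch verifies numerically that the Schwarzschild profile u = (1 − 2M/ρ)^(−1/2), with ρ = 2M + r the Euclidean radius of the level sphere, satisfies the radial form of (0.1). It assumes that the initial surface is the round sphere of radius 2M in R³, so that H_0 = 2/ρ, K = 1/ρ² and the Laplacian term vanishes because u is constant on each sphere. The function names, the step size and the sample radii are illustrative choices, not taken from the paper.

```python
import math


def schwarzschild_u(rho, mass):
    """Radial profile u(rho) = (1 - 2M/rho)^(-1/2), valid for rho > 2M."""
    return (1.0 - 2.0 * mass / rho) ** -0.5


def inverse_u(rho, mass):
    """u^{-1} = sqrt(1 - 2M/rho); it vanishes at rho = 2M (black hole boundary)."""
    return math.sqrt(max(0.0, 1.0 - 2.0 * mass / rho))


def residual(rho, mass, h=1e-5):
    """Estimate H_0 * du/dr - (u - u^3) * K on the sphere of radius rho.

    Assumptions: round spheres in R^3, so H_0 = 2/rho and K = 1/rho^2;
    d/dr equals d/drho because rho = 2M + r; the derivative is a central
    finite difference, hence only approximate.
    """
    du = (schwarzschild_u(rho + h, mass) - schwarzschild_u(rho - h, mass)) / (2.0 * h)
    u = schwarzschild_u(rho, mass)
    return (2.0 / rho) * du - (u - u ** 3) / rho ** 2


def max_residual(mass=1.0, radii=(2.5, 3.0, 5.0, 10.0, 50.0)):
    """Largest absolute residual over a few sample radii outside rho = 2M."""
    return max(abs(residual(rho, mass)) for rho in radii)
```

The residual is a finite-difference estimate, so it is small rather than exactly zero. Calling `inverse_u(2.0, 1.0)` returns 0, which is the black hole boundary condition u⁻¹ = 0 at r = 0.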
Consider the following initial value problem:
\[ \begin{cases} 2H_0 \dfrac{\partial u}{\partial r} = 2u^2 \Delta_r u + (u - u^3) R_r & \text{on } \Sigma_0 \times [0, \infty), \\ u(x, 0) = u_0(x), \end{cases} \tag{1.1} \]
where u_0(x) > 0 is a smooth function on Σ_0, H_0 and R_r are the mean curvature and scalar curvature of Σ_r respectively, and Δ_r is the Laplacian operator on Σ_r. By [ST, §2], we have:

Lemma 1.1. For any smooth initial data u_0 > 0, (1.1) has a unique solution such that
\[ \Big(1 + C_2 \exp\Big(-\int_0^r \xi(s)\,ds\Big)\Big)^{-1/2} \le u(x, r) \le \Big(1 - C_1 \exp\Big(-\int_0^r \varphi(s)\,ds\Big)\Big)^{-1/2}, \]
where
\[ \varphi(r) = \min_{x \in \Sigma_0} \frac{R_r(x, r)}{H_0(x, r)} > 0, \qquad \psi(r) = \max_{x \in \Sigma_0} \frac{R_r(x, r)}{H_0(x, r)} > 0, \]
\[ C_1 = 1 - \Big(\max_{\Sigma_0} u_0 + 1\Big)^{-2}, \qquad C_2 = \Big(\min_{\Sigma_0} u_0\Big)^{-2} - 1, \]
and ξ(r) = ϕ(r) if min_{Σ_0} u_0 ≤ 1, ξ(r) = ψ(r) if min_{Σ_0} u_0 > 1.
Moreover, the metric ds² = u² dr² + g_r has zero scalar curvature and is asymptotically Euclidean in the sense that if ds² = Σ_{i,j} g_{ij} dx^i dx^j in the standard coordinates (x¹, . . . , xⁿ) on the exterior of Σ_0 in Rⁿ, then there exists a constant C such that
\[ |g_{ij}(x) - \delta_{ij}| + |x|\,|\nabla_0 g_{ij}(x)| + |x|^2 |\nabla_0^2 g_{ij}(x)| \le C |x|^{2-n}, \]
where ∇_0, ∇_0² are the gradient and Hessian operators with respect to the Euclidean metric.

The uniqueness of solutions of (1.1) can be seen as follows. Suppose u, v are two solutions of (1.1) on Σ_0 × [0, T], T > 0, with the same initial data, and suppose u, v are bounded above and below by some positive constants. Let w = u − v. Then w satisfies
\[ 2H_0 \frac{\partial w}{\partial r} = 2u^2 \Delta_r w + \big[ 2(u + v)\Delta_r v + (1 - u^2 - uv - v^2) R_r \big] w, \]
which is of the form
\[ 2H_0 \frac{\partial w}{\partial r} = 2u^2 \Delta_r w + F w, \]
where F is a smooth function. From this it is easy to see that u ≡ v.

Now suppose (Σ, g) is a smooth compact oriented surface with smooth metric g and with nonnegative Gauss curvature which is positive somewhere. By [GL, HZ], (Σ, g) can be C^{1,1} isometrically embedded in R³ as a convex surface. Let X : (Σ, g) → R³ be such an embedding. Define the mean curvature H_0 in R³ with respect to the unit normal n by H_0 = ⟨ΔX, n⟩, where Δ is the intrinsic Laplacian of (Σ, g), X is considered as a vector valued function, and the inner product is taken in R³. In our convention, the mean curvature with respect to the outward normal of the unit sphere in R³ is 2. Note that H_0 is defined almost everywhere, is bounded because X is C^{1,1}, and is equal to the usual mean curvature if X is C². Moreover, H_0 is independent of the embedding in the following sense. Suppose Y is another C^{1,1} embedding of (Σ, g) as a convex surface in R³. Then by the result of Pogorelov [P, p.167], X(Σ) and Y(Σ) are congruent. Hence there is a C^{1,1} isometry F of (Σ, g) and a Euclidean motion A in R³ such that A ◦ X = Y ◦ F. By [Ha], F is in fact C². Hence in our definition the mean curvatures at the corresponding points in X(Σ) and Y(Σ) are equal. With this definition in mind, we have the following:

Theorem 1.1. Let (Ω³, g) be a compact manifold of dimension three with smooth boundary and with nonnegative scalar curvature. Suppose ∂Ω has finitely many components Σ_i so that each component has nonnegative Gauss curvature which is positive somewhere and has positive mean curvature H with respect to the unit outward normal. Then for each boundary component Σ_i,
\[ \int_{\Sigma_i} H \, d\sigma \le \int_{\Sigma_i} H_0^{(i)} \, d\sigma, \tag{1.2} \]
where H_0^{(i)} is the mean curvature of Σ_i with respect to the outward normal when it is isometrically embedded in R³, and dσ is the volume form on Σ_i induced by g.
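To see inequality (1.2) in a concrete case, the sketch below takes the manifold to be a geodesic ball of radius a < π/2 in the unit round S³, which has scalar curvature 6. This example is an assumption made here for illustration and is not discussed in the paper. Its boundary is a round sphere of intrinsic radius sin a, with mean curvature H = 2 cos a / sin a in the ball, while its isometric embedding in R³ has H_0 = 2 / sin a. The 1/(8π) normalization of the difference follows the usual Brown-York convention and is likewise an assumption.

```python
import math


def boundary_area(a):
    """Area of the geodesic sphere of radius a in the unit round S^3."""
    return 4.0 * math.pi * math.sin(a) ** 2


def total_mean_curvature(a):
    """Integral of H over the boundary, with H = 2 cos(a) / sin(a)."""
    return 2.0 * math.cos(a) / math.sin(a) * boundary_area(a)


def total_mean_curvature_embedded(a):
    """Integral of H_0 for the round sphere of radius sin(a) in R^3."""
    return 2.0 / math.sin(a) * boundary_area(a)


def brown_york_mass(a):
    """(1 / 8 pi) * (integral of H_0 minus integral of H); assumed normalization."""
    return (total_mean_curvature_embedded(a) - total_mean_curvature(a)) / (8.0 * math.pi)
```

For every 0 < a < π/2 the first integral is strictly smaller than the second, and the difference simplifies to sin a (1 − cos a), which tends to 0 as the ball shrinks to a point.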
Proof. We only prove the case that $\partial\Omega$ has only one boundary component $\Sigma$. The general case can be proved similarly. As mentioned above, $(\Sigma, g)$ can be $C^{1,1}$ embedded in $\mathbb{R}^3$ as a convex surface by [GL, HZ]. The embedding is obtained in the following way. One can find smooth functions $w_i$ on $\Sigma$ such that $(\Sigma, e^{2w_i} g)$ has positive Gauss curvature and such that $w_i \to 0$ in $C^\infty$ as $i \to \infty$. In fact, since the Gauss curvature $K$ of $(\Sigma, g)$ is nonnegative and is positive somewhere, one can find a smooth function $w$ such that $\Delta_g w = -1$ in a neighborhood of $\{x \in \Sigma \mid K = 0\}$, where $\Delta_g$ is the Laplacian of $(\Sigma, g)$. Then $w_i = \epsilon_i w$ will satisfy the requirements provided $\epsilon_i$ is positive, small and $\epsilon_i \to 0$. Let $X_i : (\Sigma, e^{2w_i} g) \to \mathbb{R}^3$ be the embedding of $(\Sigma, e^{2w_i} g)$. Passing to a subsequence and modifying $X_i$ by rigid motions in $\mathbb{R}^3$ if necessary, $X_i$ converge in $C^1$ to a $C^{1,1}$ isometric embedding $X$ of $(\Sigma, g)$ in $\mathbb{R}^3$. Moreover,
$$\|\nabla^2 X_i\| \le C_1 \tag{1.3}$$
for some constant $C_1$ independent of $i$. Here $\nabla$ is computed with respect to the metric $g$ on $\Sigma$. For each $i$, let $v_i$ be the solution of
$$\begin{cases} \Delta_g v_i = 0 & \text{in } \Omega, \\ v_i = e^{\frac{1}{2} w_i} & \text{on } \partial\Omega = \Sigma. \end{cases}$$
Here we also denote the Laplacian of $(\Omega, g)$ by $\Delta_g$. By the maximum principle, $v_i$ is positive and $v_i$ converge to the constant function 1 in $C^\infty$. Let $g_i = v_i^4 g$. Then $(\Sigma, g_i) = (\Sigma, e^{2w_i} g)$ and the scalar curvature $R_i$ of $(\Omega, g_i)$ is given by $R_i = v_i^{-5}(-8\Delta_g v_i + R v_i) \ge 0$, where $R$ is the scalar curvature of $g$. Let $H_i$ be the mean curvature of $\Sigma$ with respect to the metric $g_i$. Since $g_i$ converge to $g$ in $C^\infty$, $H_i$ converge uniformly on $\Sigma$ to $H$ as $i \to \infty$. Since $H > 0$, $H_i > 0$ if $i$ is large enough and we assume this is true for all $i$. By Theorem 4.2 in [ST], for each $i$ we have
$$\int_\Sigma H_i \, d\sigma_i \le \int_\Sigma H_{i,0} \, d\sigma_i, \tag{1.4}$$
where $H_{i,0}$ is the mean curvature of $(\Sigma, g_i)$ when embedded in $\mathbb{R}^3$ and $d\sigma_i$ is the volume element of $(\Sigma, e^{2w_i} g)$. By (1.4) and the fact that $H_i \to H$ as $i \to \infty$, in order to prove the theorem, it is sufficient to prove that
$$\lim_{i\to\infty} \int_\Sigma H_{i,0} \, d\sigma_i = \int_\Sigma H_0 \, d\sigma, \tag{1.5}$$
where $H_0$ is the mean curvature of $(\Sigma, g)$ when embedded in $\mathbb{R}^3$, which is defined after Lemma 1.1. Let $\Delta$ and $\Delta_i$ be the Laplacians of $(\Sigma, g)$ and $(\Sigma, e^{2w_i} g)$ respectively, let $\nabla$ and $\nabla_i$ be the covariant derivatives with respect to $g$ and $e^{2w_i} g$ respectively, and let $n$ and $n_i$ be the unit normals of $X(\Sigma)$ and $X_i(\Sigma)$ respectively. Then
Y. Shi, L.-F. Tam
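$$\langle \Delta X - \Delta_i X_i, n \rangle = (H_0 - H_{i,0}) + H_{i,0} \langle n - n_i, n \rangle. \tag{1.6}$$
Now
$$\begin{aligned}
\int_\Sigma \langle n, \Delta X - \Delta_i X_i \rangle \, d\sigma
&= \int_\Sigma \langle n, \Delta X \rangle \, d\sigma - \int_\Sigma \langle n, \Delta_i X_i \rangle \, d\sigma_i + \int_\Sigma \langle n, \Delta_i X_i \rangle \, (e^{2w_i} - 1) \, d\sigma \\
&= -\sum_{s=1}^{3} \int_\Sigma \langle \nabla n^s, \nabla x^s \rangle_g \, d\sigma + \sum_{s=1}^{3} \int_\Sigma \langle \nabla_i n^s, \nabla_i x_i^s \rangle_{g_i} \, d\sigma_i \\
&\quad + \int_\Sigma \langle n, \Delta_i X_i \rangle \, (e^{2w_i} - 1) \, d\sigma,
\end{aligned}$$
where $n = (n^1, n^2, n^3)$, $X = (x^1, x^2, x^3)$ and $X_i = (x_i^1, x_i^2, x_i^3)$, and $\langle\,,\rangle$, $\langle\,,\rangle_g$, $\langle\,,\rangle_{g_i}$ are the inner products in $\mathbb{R}^3$, $g$ and $g_i$ respectively. Using the fact that $w_i \to 0$ and $X_i \to X$ in $C^1$ as $i \to \infty$, and using (1.3), we conclude that
$$\lim_{i\to\infty} \int_\Sigma \langle n, \Delta X - \Delta_i X_i \rangle \, d\sigma = 0. \tag{1.7}$$
Since $n_i \to n$ uniformly as $i \to \infty$, by (1.3) we also have
$$\lim_{i\to\infty} \int_\Sigma H_{i,0} \langle n - n_i, n \rangle \, d\sigma = 0. \tag{1.8}$$
By (1.6)–(1.8), we conclude that
$$\lim_{i\to\infty} \int_\Sigma (H_0 - H_{i,0}) \, d\sigma = 0.$$
Since $w_i \to 0$, (1.5) is true. This completes the proof of the theorem.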
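The choice $w_i = \epsilon_i w$ at the start of the proof can be checked with the standard conformal change formula for the Gauss curvature of a surface. The following is a sketch under stated assumptions: $\Delta_g$ is taken with the sign convention $\Delta_g = \operatorname{div}\operatorname{grad}$, and $U$ and $\kappa$ are our notation (not the paper's) for the neighborhood of $\{K = 0\}$ on which $\Delta_g w = -1$ and for the minimum of $K$ off $U$.

```latex
% Gauss curvature of the conformal metric e^{2 w_i} g on \Sigma
K_{e^{2w_i} g} \;=\; e^{-2w_i}\bigl(K - \Delta_g w_i\bigr)
             \;=\; e^{-2\epsilon_i w}\bigl(K - \epsilon_i\,\Delta_g w\bigr).

% On U we have \Delta_g w = -1 and K \ge 0, hence
K - \epsilon_i\,\Delta_g w \;=\; K + \epsilon_i \;>\; 0 .

% \Sigma \setminus U is compact and K > 0 there, so K \ge \kappa > 0 on it, and
K - \epsilon_i\,\Delta_g w \;\ge\; \kappa - \epsilon_i \max_{\Sigma}|\Delta_g w| \;>\; 0
\quad\text{as soon as } \epsilon_i \max_{\Sigma}|\Delta_g w| < \kappa .
```

So positivity of the Gauss curvature of $e^{2w_i} g$ only needs $\epsilon_i$ positive and small, and $w_i = \epsilon_i w \to 0$ in $C^\infty$ because $w$ is a fixed smooth function.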
Suppose equality holds in (1.2) for a boundary component. Then we have the following:

Theorem 1.2. With the same assumptions as in Theorem 1.1, if equality holds in (1.2) for some $\Sigma_i$, then $\partial\Omega$ is connected and $\Omega$ is scalar flat. If in addition $\partial\Omega$ can be $C^4$ embedded in $\mathbb{R}^3$ with positive mean curvature, then $\Omega$ is a domain in $\mathbb{R}^3$.

Before we prove the theorem, we need the following lemma.

Lemma 1.2. Let $(\Omega, g)$ be as in Theorem 1.1 so that $(\partial\Omega, g)$ is connected. Suppose there is a $C^4$ isometric embedding $X$ of $\partial\Omega$ in $\mathbb{R}^3$ with mean curvature $H_0$. Then for any $\epsilon > 0$, there is a smooth metric $g_\epsilon$ on $\Omega$ and a $C^\infty$ embedding $X_\epsilon$ of $(\partial\Omega, g_\epsilon)$ in $\mathbb{R}^3$ such that (i) $(\partial\Omega, g_\epsilon)$ has positive Gauss curvature; (ii) $\|g - g_\epsilon\|_{C^3(\Omega)} \le \epsilon$; (iii) $\|X - X_\epsilon\|_{C^2(\Sigma)} \le \epsilon$; and (iv) $g_\epsilon$ has nonnegative scalar curvature on $\Omega$.
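Proof. $\partial\Omega = \Sigma$ is conformal to $S^2$ because it has nonnegative Gauss curvature which is positive somewhere. Hence we may write $(\Sigma, g) = (S^2, g)$ so that $g_{ij} = e^{2w} \lambda_{ij}$, where $\lambda_{ij}$ is the standard metric of $S^2$. The isometric embedding can be represented by $X(x) = \rho(x) x$ for $x \in S^2$, where $\rho(x)$ is a $C^4$ function on $S^2$. We assume that the origin is the center of the largest ball contained in the convex body bounded by $\Sigma_0$, which is the image of $\partial\Omega$. Then $\rho > 0$. In local coordinates of $S^2$ we have
$$g_{ij} = \rho_i \rho_j + \rho^2 \lambda_{ij}. \tag{1.9}$$
Let $\epsilon > 0$, and let $\rho_\epsilon = \dfrac{\rho}{1 + \epsilon\rho}$. It is easy to see that $\rho_\epsilon$ converges uniformly to $\rho$ in $C^4$ as $\epsilon \to 0$. Consider the surfaces given by
$$X^\epsilon(x) = \rho_\epsilon(x) x.$$
Since $X$ has nonnegative Gauss curvature, $X^\epsilon$ has positive Gauss curvature, see [LW] for example. Moreover, $H_0^\epsilon \to H_0$, where $H_0^\epsilon$ is the mean curvature of $X^\epsilon(S^2)$. By (1.9) and the fact that $g_{ij} = e^{2w} \lambda_{ij}$, the metric for $X^\epsilon(x)$ is
$$g^\epsilon_{ij} = \frac{1}{(1 + \epsilon\rho)^4} \left( e^{2w} + 2\epsilon\rho^3 + \epsilon^2\rho^4 \right) \lambda_{ij}.$$
Hence $g^\epsilon_{ij} = e^{2w_\epsilon} \lambda_{ij}$, where $w_\epsilon$ is $C^4$ and converges to $w$ in $C^4$ as $\epsilon \to 0$. Now for each $\epsilon > 0$, we can find smooth functions $v_{k,\epsilon}$ on $S^2$ such that $v_{k,\epsilon}$ converge to $w_\epsilon$ in $C^4$ as $k \to \infty$. Since $(\Sigma, g^\epsilon)$ has positive Gauss curvature, we can find a smooth embedding $X^\epsilon_k$ of $(\Sigma, e^{2v_{k,\epsilon}} \lambda_{ij})$ such that $X^\epsilon_k$ converge to $X^\epsilon$ in $C^2$ as $k \to \infty$, see [N]. From this we conclude that for any $\epsilon > 0$, there exists a smooth function $v_\epsilon$ on $\Sigma$ such that the metric $(g_\epsilon)_{ij} = e^{2v_\epsilon} \lambda_{ij}$ satisfies $\|g - g_\epsilon\|_{C^4} \le \epsilon$ in $\Sigma$. Moreover, there exists an embedding $X_\epsilon$ of $(\Sigma, g_\epsilon)$ in $\mathbb{R}^3$ such that $\|X - X_\epsilon\|_{C^2(\Sigma)} \le \epsilon$. Now we solve the equation
$$\begin{cases} \Delta_g v = 0 & \text{in } \Omega, \\ v = e^{\frac{1}{2}(v_\epsilon - w)} & \text{on } \partial\Omega = \Sigma. \end{cases}$$
Consider the metric $v^4 g$ on $\Omega$, which is $g_\epsilon$ when restricted to $\Sigma$. We denote this metric also by $g_\epsilon$. Then as before, $g_\epsilon$ is a smooth metric on $\Omega$ with nonnegative scalar curvature. Since $\|g - g_\epsilon\|_{C^4} \le \epsilon$ in $\Sigma$ is small, $\|v - 1\|_{C^3(\Omega)} \le C(\epsilon)$ with $C(\epsilon) \to 0$ as $\epsilon \to 0$. From this, it is easy to see the lemma is true.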
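The displayed formula for the metric of the rescaled surface is a short computation from (1.9). The sketch below writes $g^\epsilon$ for that metric (the superscript is our notation) and, like the proof, substitutes $g_{ij} = e^{2w}\lambda_{ij}$ only in the last step.

```latex
% radial function of the rescaled surface and its derivative
\rho_\epsilon = \frac{\rho}{1+\epsilon\rho}
\quad\Longrightarrow\quad
\partial_i \rho_\epsilon = \frac{\rho_i}{(1+\epsilon\rho)^2}.

% induced metric of x \mapsto \rho_\epsilon(x)\,x, in the same form as (1.9)
g^\epsilon_{ij}
  = \partial_i\rho_\epsilon\,\partial_j\rho_\epsilon + \rho_\epsilon^2\,\lambda_{ij}
  = \frac{\rho_i\rho_j}{(1+\epsilon\rho)^4} + \frac{\rho^2}{(1+\epsilon\rho)^2}\,\lambda_{ij}.

% eliminate \rho_i\rho_j = g_{ij} - \rho^2\lambda_{ij} using (1.9)
g^\epsilon_{ij}
  = \frac{g_{ij} + \bigl[(1+\epsilon\rho)^2 - 1\bigr]\rho^2\,\lambda_{ij}}{(1+\epsilon\rho)^4}
  = \frac{g_{ij} + \bigl(2\epsilon\rho^3 + \epsilon^2\rho^4\bigr)\lambda_{ij}}{(1+\epsilon\rho)^4}.

% with g_{ij} = e^{2w}\lambda_{ij} this is the formula in the proof
g^\epsilon_{ij}
  = \frac{e^{2w} + 2\epsilon\rho^3 + \epsilon^2\rho^4}{(1+\epsilon\rho)^4}\,\lambda_{ij}.
```

In particular the conformal factor differs from $e^{2w}$ by terms of order $\epsilon$, which is what the convergence $w_\epsilon \to w$ relies on.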
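Proof of Theorem 1.2. Let $\Sigma_1, \dots, \Sigma_k$ be the components of $\partial\Omega$. Suppose equality holds in (1.2) for some $i$. Let us assume that $i = 1$ so that
$$\int_{\Sigma_1} \bigl( H_0^{(1)} - H \bigr) \, d\sigma = 0. \tag{1.10}$$
We want to prove that $k = 1$. Suppose $k > 1$. For any $0 < \epsilon < 1$, solve the following:
$$\begin{cases} \Delta_g v_\epsilon = 0 & \text{in } \Omega, \\ v_\epsilon = 1 - \epsilon & \text{on } \Sigma_2, \\ v_\epsilon = 1 & \text{on } \Sigma_i \text{ for } i \ne 2. \end{cases}$$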
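By the maximum principle, $v_\epsilon > 0$ on $\Omega$. Consider the metric $\tilde g = (v_\epsilon)^4 g$ on $\Omega$; it has nonnegative scalar curvature as in the proof of Theorem 1.1. It is equal to $(1 - \epsilon)^4 g$ on $\Sigma_2$ and is equal to $g$ on $\Sigma_i$ for $i \ne 2$. Moreover, the mean curvature on each $\Sigma_i$ with respect to $\tilde g$ is positive if $\epsilon$ is small enough and the Gauss curvature of $(\Sigma_i, \tilde g)$ is nonnegative and is positive somewhere for each $i$. By the strong maximum principle, $v_\epsilon < 1$ inside $\Omega$ because $v_\epsilon < 1$ on $\Sigma_2$ and $v_\epsilon = 1$ on $\Sigma_i$ if $i \ne 2$. By the strong maximum principle again, the mean curvature of $\Sigma_1$ with respect to $\tilde g$ is
$$\tilde H = H + 4 \frac{\partial v_\epsilon}{\partial \nu} > H,$$
where $\tilde H$ is the mean curvature with respect to $\tilde g$. Since $\tilde g = g$ on $\Sigma_1$, we have
$$\int_{\Sigma_1} \tilde H \, d\sigma \le \int_{\Sigma_1} H_0^{(1)} \, d\sigma = \int_{\Sigma_1} H \, d\sigma < \int_{\Sigma_1} \tilde H \, d\sigma$$
by Theorem 1.1. This is a contradiction. Hence $\partial\Omega$ has only one component.

Next we want to prove that $\Omega$ is scalar flat. Let $\Sigma = \partial\Omega$ and let $H_0$ be the mean curvature of $\Sigma$ in $\mathbb{R}^3$. Let $R$ be the scalar curvature of $\Omega$ and let $u$ be the solution of
$$\begin{cases} \Delta_g u - \dfrac{R}{8} u = 0 & \text{in } \Omega, \\ u = 1 & \text{on } \partial\Omega = \Sigma. \end{cases}$$
By the maximum principle, we see $0 < u \le 1$. Let $\bar H$ be the mean curvature of $\Sigma$ with respect to the metric $\bar g = u^4 g$; then $\bar H = H + 4 \frac{\partial u}{\partial \nu} > 0$, because $H > 0$ and $\frac{\partial u}{\partial \nu} \ge 0$. Moreover,
$$\begin{aligned}
\int_\Sigma \bar H \, d\sigma
&= \int_\Sigma H \, d\sigma + 4 \int_\Sigma \frac{\partial u}{\partial \nu} \, d\sigma \\
&= \int_\Sigma H \, d\sigma + 4 \int_\Sigma u \frac{\partial u}{\partial \nu} \, d\sigma \\
&= \int_\Sigma H_0 \, d\sigma + 4 \int_\Omega \left( |\nabla u|^2 + \frac{R}{8} u^2 \right) \\
&\ge \int_\Sigma \bar H \, d\sigma + 4 \int_\Omega \left( |\nabla u|^2 + \frac{R}{8} u^2 \right),
\end{aligned}$$
where we have used (1.10) and Theorem 1.1. Hence $R \equiv 0$.

Suppose (1.10) is true and in addition $\Sigma = \partial\Omega$ can be $C^4$ embedded (with embedding $X$) in $\mathbb{R}^3$ with positive mean curvature $H_0$. Note that in this case, $X(\Sigma)$ is convex by [Sa]. By Lemma 1.2, there exist smooth metrics $g_i$ on $\Omega$ with nonnegative scalar curvature such that $(\Sigma, g_i)$ has positive Gauss curvature and there exists an embedding $X_i$ of $(\Sigma, g_i)$ for each $i$ with the following properties:
$$\lim_{i\to\infty} \|X_i - X\|_{C^2(\Sigma)} = \lim_{i\to\infty} \|g - g_i\|_{C^3(\Omega)} = 0. \tag{1.11}$$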
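The scalar flatness step earlier in this proof ("Hence $R \equiv 0$") can be unpacked as follows. This is a sketch that uses only the conformal change formula $\bar R = u^{-5}(-8\Delta_g u + Ru)$ already used in the proof of Theorem 1.1, Green's identity, and the hypothesis $R \ge 0$.

```latex
% \bar g = u^4 g is scalar flat, because \Delta_g u = \tfrac{R}{8}u:
\bar R = u^{-5}\bigl(-8\Delta_g u + R u\bigr) = 0 .

% Green's identity, using u = 1 on \Sigma:
\int_\Sigma \frac{\partial u}{\partial\nu}\,d\sigma
  = \int_\Sigma u\,\frac{\partial u}{\partial\nu}\,d\sigma
  = \int_\Omega \bigl(|\nabla u|^2 + u\,\Delta_g u\bigr)
  = \int_\Omega \Bigl(|\nabla u|^2 + \tfrac{R}{8}\,u^2\Bigr).

% Theorem 1.1 applied to (\Omega,\bar g), whose boundary metric is still g, together with (1.10):
\int_\Sigma \bar H\,d\sigma \;\le\; \int_\Sigma H_0\,d\sigma \;=\; \int_\Sigma H\,d\sigma .

% Combine with \bar H = H + 4\,\partial u/\partial\nu:
0 \;\ge\; \int_\Sigma (\bar H - H)\,d\sigma
  \;=\; 4\int_\Omega \Bigl(|\nabla u|^2 + \tfrac{R}{8}\,u^2\Bigr) \;\ge\; 0 .

% Both terms are nonnegative, so \nabla u \equiv 0, hence u \equiv 1, and then R u^2 \equiv 0 gives R \equiv 0.
```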
By the proof of Theorems 4.1 and 4.2 in [ST], we can glue $(\Omega, g_i)$ with the exterior of $X_i(\Sigma)$ in $\mathbb{R}^3$ to be a smooth manifold $M_i$, so that we can find a metric on the exterior of $X_i(\Sigma)$, which will also be denoted by $g_i$; then $g_i$ is a smooth metric in $\Omega$ and
on the exterior of $X_i(\Sigma)$, and is Lipschitz near $X_i(\Sigma)$. In fact, the metric $g_i$ is equal to $u_i^2 \, dr^2 + g_{i,r}$, where $dr^2 + g_{i,r}$ is the Euclidean metric on the exterior of $X_i(\Sigma)$ as described at the beginning of this section and $u_i$ is the solution of (1.1) with initial condition $u_i = H_{i,0}/H_i$. Here $H_i$ is the mean curvature of $\Sigma$ with respect to the metric $g_i$ and $H_{i,0}$ is the mean curvature of $X_i(\Sigma)$ in the Euclidean space. We identify $(\Sigma, g_i)$ with $X_i(\Sigma)$. We want to define a bilipschitz map between $M_1$ and $M_i$ for $i > 1$. To begin with, let us identify $\Sigma$ with $S^2$ as a smooth surface. We may write the embeddings as $X_i(x) = \rho_i(x) x$ and $X(x) = \rho(x) x$ with $x \in S^2$ in $\mathbb{R}^3$. We may also assume that $\rho(x) > 0$ everywhere. Define a map $F_i$ from $\mathbb{R}^3 \setminus I(\Sigma_i)$ to $\mathbb{R}^3 \setminus I(\Sigma_1)$ by
$$Y = F_i(X) = \frac{\rho_1(X/|X|)}{\rho_i(X/|X|)} \, X,$$
where $I(\Sigma_i)$ denotes the interior of $X_i(\Sigma)$ in $\mathbb{R}^3$. It is easy to see that $F_i$ is bijective. Moreover, $F_i(\rho_i(x) x) = \rho_1(x) x$. Hence $F_i$ induces an identity map on $\Sigma$. Since $\rho(x)$ and $\rho_i(x)$ are bounded away from zero by a positive constant independent of $i$, and the $C^1$ norm of $\rho_i$ is bounded by a constant independent of $i$, we conclude that
$$C_2^{-1} \|F_i(X_1) - F_i(X_2)\| \le \|X_1 - X_2\| \le C_2 \|F_i(X_1) - F_i(X_2)\| \tag{1.12}$$
for some constant $C_2$ independent of $i$. On the other hand, by the construction of $g_i$ as described in the proof of Theorem 4.1 in [ST] and by Lemma 1.1, we conclude that the metric $g_i$ on $\mathbb{R}^3 \setminus I(\Sigma_i)$ is equivalent to the Euclidean metric, provided $i$ is large enough. Here we have used the fact that $H_0 > 0$ everywhere and (1.11). More precisely, there is a constant $C_3 > 0$ independent of $i$ so that $C_3^{-1} g_e \le g_i \le C_3 g_e$, where $g_e$ is the Euclidean metric. Hence $F_i$ is a bilipschitz map from $(\mathbb{R}^3 \setminus I(\Sigma_i), g_i)$ to $(\mathbb{R}^3 \setminus I(\Sigma_1), g_e)$. Now we extend $F_i$ to be a map from $M_i$ to $M_1$ such that $F_i|_\Omega$ is the identity map. Since $g_i \to g$ in $C^3(\Omega)$, it is easy to see that
$$C_4^{-1} d_{g_1}(F_i(p), F_i(q)) \le d_{g_i}(p, q) \le C_4 \, d_{g_1}(F_i(p), F_i(q))$$
for some constant $C_4$ independent of $i$. Hence there is a constant $C_5$ independent of $i$ such that for any $W_0^{1,2}(M_i)$ function $f$ we have
$$\left( \int_{M_i} f^6 \, dV_i \right)^{\frac{1}{3}} \le C_5 \int_{M_i} |\nabla_i f|^2 \, dV_i, \tag{1.13}$$
where $dV_i$ is the volume element of $(M_i, g_i)$ and $\nabla_i f$ is the gradient of $f$ with respect to $g_i$. By [ST, §3], for each $i$, there is a harmonic spinor $\Psi_i$ such that the norm of $\Psi_i$ satisfies
$$\|\Psi_i(X) - \widetilde\Psi_i(X)\|_i \le C |X|^{-1} \log |X| \tag{1.14}$$
for some constant $C$ in $X \in \mathbb{R}^3 \setminus I(\Sigma_i)$, where $\widetilde\Psi_i$ is a parallel spinor on $\mathbb{R}^3 \setminus I(\Sigma_i)$ and $\|\widetilde\Psi_i\| \equiv 1$ with respect to the Euclidean metric. Here $\|\cdot\|_i$ is the norm with respect to $g_i$ and $|X|$ is the Euclidean norm of $X$. Moreover, $\|\Psi_i\|_i$ is in $W^{1,2}_{loc}$. By (1.13) and (1.14), we conclude that
$$\left( \int_{M_i} (\|\Psi_i\|_i - 1)^6 \, dV_i \right)^{\frac{1}{3}} \le C_5 \int_{M_i} \|\nabla_i \Psi_i\|^2 \, dV_i. \tag{1.15}$$
On the other hand, from [ST], we have
$$\int_{M_i} \|\nabla_i \Psi_i\|^2 \, dV_i \le C_6 \int_\Sigma (H_{i,0} - H_i) \, d\sigma_i \tag{1.16}$$
for some absolute constant $C_6$. Note that $\|\Psi_i\|_i$ is uniformly bounded by 1 by construction. Using the fact that $g_i \to g$ in $C^3$ in $\Omega$, we may assume that $\{\Psi_i\}$ converges in $C^1$ in $\Omega$ to a spinor $\Psi$ in $(\Omega, g)$. By (1.15) and (1.16), using the fact that
$$\lim_{i\to\infty} \int_\Sigma (H_{i,0} - H_i) \, d\sigma_i = \int_\Sigma (H_0 - H) \, d\sigma = 0,$$
we conclude that $\Psi$ is a nontrivial parallel spinor in $(\Omega, g)$. Hence $(\Omega, g)$ is Ricci flat (see [Hi]), and hence it is flat because $\dim(\Omega) = 3$. Since $(\Sigma, g)$ can be $C^4$ embedded in $\mathbb{R}^3$, we can glue $(\Omega, g)$ along $\Sigma$ with $\mathbb{R}^3 \setminus I(\Sigma)$, where $I(\Sigma)$ is the interior of the image of $\Sigma$ under the embedding, so that a tubular neighborhood of $\Sigma$ in $\Omega$ is identified with a tubular neighborhood of $\Sigma$ in $\mathbb{R}^3$ and the outward normals are the same. The resulting manifold is a $C^3$ manifold $M$, and the metric given by $g$ on $\Omega$ and the Euclidean metric $g_e$ on $\mathbb{R}^3 \setminus I(\Sigma)$ is Lipschitz near $\Sigma$. Since the Gauss curvature of $\Sigma$ is nonnegative and the curvature of $\Omega$ is zero, as in [Sp, v. 5, pp. 284–288] one can prove that on the set $\{x \in \Sigma : K(x) > 0\}$, the second fundamental forms with respect to $g$ and $g_e$ are the same. Moreover, suppose $x \in \Sigma$ is such that the Gauss curvature is zero in a neighborhood of $x$; then $x$ is a parabolic point of $\Sigma$ in $\mathbb{R}^3$ because $H_0 > 0$. In this case, we can still prove that the second fundamental forms are the same. By Lemma 4.1 in [ST], $g$ is a $C^2$ metric on the $C^3$ manifold $M$ which is flat outside a compact set. Hence $(M, g)$ is isometric to $\mathbb{R}^3$ by volume comparison [BC]. This completes the proof of the theorem.
2. Regularity: Black Hole Boundary Conditions

In [ST], it was remarked that as in [B2] one can solve (1.1) with black hole boundary data $u_0^{-1} = 0$. In case $\Sigma_0 \subset \mathbb{R}^3$ is the sphere with radius $2M$, $M > 0$, the solution with this boundary data is just the Schwarzschild 3-metric. Hence it is interesting to understand what we can say about the regularity of the metric near $r = 0$. The following theorem answers this question and partially generalizes Theorem 4.6 in Bartnik's work [B2]. In our case, we only assume $\Sigma_0$ to be strictly convex. In [B2], an equation more general than (1.1) was considered with $\Sigma_0$ being the standard two sphere. We also prove that the quasi-spherical metric can be extended in a neighborhood of the black hole. In this section, we always assume that $\Sigma_0$ is a compact strictly convex surface in $\mathbb{R}^3$. For the sake of completeness, we will also give the details of the construction of the solution $u$. For any positive integer $k$, let $u_k$ be the solution of (1.1) with $u_k(x, 0) = k$.
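As a sanity check of the remark about the Schwarzschild metric, the rotationally symmetric case can be worked out by hand. The sketch below assumes that (1.1) has the form $2H_0\,\partial_r u = 2u^2 \Delta_r u + 2(u - u^3)K$ used later in this section, with $K$ the Gauss curvature of $\Sigma_r$; the symbol $s = 2M + r$ for the radius of $\Sigma_r$ and $g_{S^2}$ for the unit round metric are our notation.

```latex
% \Sigma_r is the round sphere of radius s = 2M + r:
H_0 = \frac{2}{s}, \qquad K = \frac{1}{s^2}, \qquad \Delta_r u = 0 \ \text{ for radial } u .

% Equation (1.1) reduces to an ODE in r:
\frac{4}{s}\,\frac{du}{dr} = \frac{2}{s^2}\,(u - u^3)
\quad\Longleftrightarrow\quad
2s\,\frac{du}{dr} = u - u^3 .

% Candidate solution with black hole boundary data u^{-1}|_{r=0} = 0:
u(r) = \Bigl(1 - \frac{2M}{s}\Bigr)^{-1/2} = \sqrt{\frac{2M + r}{r}} .

% Verification: both sides of the ODE equal -\tfrac{2M}{s}u^3, since
\frac{du}{dr} = -\frac{M}{s^2}\,u^3 ,
\qquad
u - u^3 = u^3\,(u^{-2} - 1) = -\frac{2M}{s}\,u^3 .

% The metric u^2 dr^2 + g_r is the Schwarzschild 3-metric in the radial variable s:
u^2\,dr^2 + s^2 g_{S^2} = \Bigl(1 - \frac{2M}{s}\Bigr)^{-1} dr^2 + s^2 g_{S^2},
\qquad
r\,u^2 = 2M + r \in [2M,\, 2M + 1] \ \text{ for } 0 < r \le 1 .
```

The last line shows the $r^{-1/2}$ blow-up rate of $u$ at the horizon in this model case, with explicit constants $2M$ and $2M + 1$.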
Theorem 2.1. With the above notations, $u_k$ subconverge to a solution $u$ of the equation in (1.1) in $\Sigma_0 \times (0, \infty)$ such that the metric $ds^2 = u^2 dr^2 + g_r$ on $\Sigma_0 \times (0, \infty)$ is asymptotically Euclidean with zero scalar curvature. Moreover, the following are true:

(i) There exist positive constants $C$, $C'$ such that $C/r \le u^2(x, r) \le C'/r$ for $(x, r) \in \Sigma_0 \times (0, 1]$.

(ii) The surface $\Sigma_0 = \Sigma_0 \times \{0\}$ is totally geodesic with respect to the metric $ds^2$. More precisely, let $\|A\|$ be the norm of the second fundamental form of $\Sigma_r = \Sigma_0 \times \{r\}$; then
$$\lim_{r\to 0} \sup_{\Sigma_r} \|A\| = 0.$$

(iii) The curvature of the metric $ds^2$ is bounded near $r = 0$.

(iv) The metric $ds^2$ is Lipschitz up to $r = 0$. More precisely, let $F : \Sigma_0 \times (0, \infty) \to \Sigma_0 \times (0, \infty)$ be defined by $F(r, x) = (\rho, y)$ with $y(r, x) = x$ and $\rho(r, x) = \int_0^r u(x, \tau) \, d\tau$. Then the metric $(F^{-1})^*(ds^2)$ can be extended to be Lipschitz continuous on $\Sigma_0 \times [0, 1]$.

Proof. For any $a > 0$, by Lemma 1.1 and using the notations in the lemma, we have
$$1 \le u_k(x, r) \le \left[ 1 - \exp\left( -\int_0^a \varphi(s) \, ds \right) \right]^{-\frac{1}{2}}$$
for all $x \in \Sigma_0$ and $a \le r < \infty$. From this and the results in [ST] it follows that, after passing to a subsequence if necessary, $u_k$ converges to a smooth solution $u$ of the equation in (1.1) so that the metric $ds^2 = u^2 dr^2 + g_r$ is scalar flat and is asymptotically Euclidean near $r = \infty$. To prove (i), note that if $0 \le r \le 1$, then
$$1 - \beta r \ge \exp\left( -\int_0^r \varphi(s) \, ds \right) \ge \exp\left( -\int_0^r \psi(s) \, ds \right) \ge 1 - \alpha r$$
for some $\alpha > 0$, $\beta > 0$. Let $0 < r_0 < 1$ be such that $1 - \alpha r_0 > 0$. Then
$$1 - (1 - k^{-2}) \exp\left( -\int_0^r \varphi(s) \, ds \right) \ge 1 - (1 - k^{-2})(1 - \beta r) = k^{-2} + \beta (1 - k^{-2}) r$$
on $[0, r_0]$. Similarly we have
$$1 - (1 - k^{-2}) \exp\left( -\int_0^r \psi(s) \, ds \right) \le k^{-2} + \alpha (1 - k^{-2}) r.$$
Hence by Lemma 1.1
$$\left[ k^{-2} + \alpha (1 - k^{-2}) r \right]^{-\frac{1}{2}} \le u_k(x, r) \le \left[ k^{-2} + \beta (1 - k^{-2}) r \right]^{-\frac{1}{2}}$$
on $\Sigma_0 \times (0, r_0]$. Let $k \to \infty$; it is easy to see that (i) is true.
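The passage to the limit can be made explicit. The sketch below assumes that the first estimate of the proof reads $1 \le u_k \le [1 - \exp(-\int_0^a \varphi)]^{-1/2}$ for $r \ge a$, and uses the auxiliary constant $A$, which is our notation, to cover the range $r_0 \le r \le 1$ that the two-sided bound above does not reach.

```latex
% Let k \to \infty in the two-sided bound on \Sigma_0 \times (0, r_0]:
(\alpha r)^{-1/2} \;\le\; u(x,r) \;\le\; (\beta r)^{-1/2}
\quad\Longrightarrow\quad
\frac{1}{\alpha\, r} \;\le\; u^2(x,r) \;\le\; \frac{1}{\beta\, r}.

% On \Sigma_0 \times [r_0, 1], use the first estimate of the proof with a = r_0:
1 \;\le\; u^2(x,r) \;\le\; A := \Bigl[\,1 - \exp\Bigl(-\int_0^{r_0}\varphi(s)\,ds\Bigr)\Bigr]^{-1}.

% Since r_0 \le r \le 1 there, r_0/r \le 1 and A \le A/r, so
\frac{r_0}{r} \;\le\; u^2(x,r) \;\le\; \frac{A}{r}.

% Hence (i) holds on \Sigma_0 \times (0,1] with
C = \min\{\alpha^{-1},\, r_0\}, \qquad C' = \max\{\beta^{-1},\, A\}.
```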
To prove (ii), by the computations in [ST, §1], the second fundamental form $h_{ij}$ of $\Sigma_r$ with respect to the metric $ds^2$ is equal to $u^{-1} h^0_{ij}$, where $h^0_{ij}$ is the second fundamental form with respect to the Euclidean metric. Hence (ii) follows from (i).

To prove (iii), if we let $w = \sqrt{r}\, u$, then
$$\begin{aligned}
2H_0 \frac{\partial w}{\partial r}
&= 2H_0 \left( \tfrac{1}{2} r^{-\frac{1}{2}} u + r^{\frac{1}{2}} u_r \right) \\
&= H_0 r^{-1} w + r^{\frac{1}{2}} \left( 2u^2 \Delta_r u + 2(u - u^3) K \right) \\
&= H_0 r^{-1} w + r^{-1} \left( 2w^2 \Delta_r w + 2(rw - w^3) K \right).
\end{aligned}$$
Hence if we let $r = e^t$, we have
$$2H_0 \frac{\partial w}{\partial t} = 2w^2 \Delta_r w + H_0 w + 2(e^t w - w^3) K$$
for $-\infty < t < 0$. Note that $C \le w^2 \le C'$ for all $t$ by (i). As in the proof of [ST, Lemma 2.5], for any integers $k, \ell \ge 0$, there is a constant $C_1$ such that
$$\left| \frac{\partial^k w}{\partial t^k} \right| + \sum_{j=0}^{\ell} \|\nabla_r^j w\| \le C_1$$
for all $x \in \Sigma_0$ and for all $-\infty < t < 0$. Here, $\nabla_r$ is the intrinsic covariant derivative of $\Sigma_r$. Hence on $\Sigma_0 \times (0, 1]$, for any $\ell \ge 0$, there is a constant $C_2$ such that
$$r^{\frac{3}{2}} \left| \frac{\partial u}{\partial r} \right| + r^{\frac{1}{2}} \sum_{j=0}^{\ell} \|\nabla_r^j u\| \le C_2. \tag{2.1}$$
To compute the curvature tensor of the metric $ds^2$, let $e_1, e_2, e_3$ be a local orthonormal frame so that $e_3 = u^{-1} \frac{\partial}{\partial r}$, which is the normal to $\Sigma_r$ with respect to $ds^2$. Then $e_1, e_2, \tilde e_3 = \frac{\partial}{\partial r}$ is a local orthonormal frame with respect to the Euclidean metric. By the computations in [ST, §1], on $\Sigma_r$ we have
$$R_{1212} = (1 - u^{-2}) K, \tag{2.2}$$
where $K$ is the Gauss curvature of $\Sigma_r$, and for $1 \le i \le 2$
$$-R_{3i3i} = (\log u)_{ii} + (\log u)_i^2 - u^{-3} u_r h^0_{ii}, \tag{2.3}$$
where $(\log u)_{ii} = \nabla_r^2 \log u \,(e_i, e_i)$ and $h^0_{ij}$ is the second fundamental form of $\Sigma_r$ with respect to the Euclidean metric. Hence (iii) follows from (i) and (2.1)–(2.3).

To prove (iv), first observe that by (i), (2.1) and the fact that $u \ge 1$, it is easy to see that $F$ is a diffeomorphism from $\Sigma_0 \times (0, \infty)$ onto itself. Let $x = (x^1, x^2)$ and
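$y = (y^1, y^2)$ be the local coordinates in $\Sigma_0$ in the domain and in the target respectively. Then the differential of $F$ is given by
$$\begin{pmatrix}
\dfrac{\partial\rho}{\partial r} & \dfrac{\partial\rho}{\partial x^1} & \dfrac{\partial\rho}{\partial x^2} \\[2mm]
\dfrac{\partial y^1}{\partial r} & \dfrac{\partial y^1}{\partial x^1} & \dfrac{\partial y^1}{\partial x^2} \\[2mm]
\dfrac{\partial y^2}{\partial r} & \dfrac{\partial y^2}{\partial x^1} & \dfrac{\partial y^2}{\partial x^2}
\end{pmatrix}
=
\begin{pmatrix}
u & \displaystyle\int_0^r \frac{\partial u}{\partial x^1}(\tau, x) \, d\tau & \displaystyle\int_0^r \frac{\partial u}{\partial x^2}(\tau, x) \, d\tau \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}.$$
Here we have used (2.1) for the computation of $\dfrac{\partial\rho}{\partial x^i}$. Hence
$$\begin{pmatrix}
\dfrac{\partial r}{\partial\rho} & \dfrac{\partial r}{\partial y^1} & \dfrac{\partial r}{\partial y^2} \\[2mm]
\dfrac{\partial x^1}{\partial\rho} & \dfrac{\partial x^1}{\partial y^1} & \dfrac{\partial x^1}{\partial y^2} \\[2mm]
\dfrac{\partial x^2}{\partial\rho} & \dfrac{\partial x^2}{\partial y^1} & \dfrac{\partial x^2}{\partial y^2}
\end{pmatrix}
=
\begin{pmatrix}
\dfrac{1}{u} & -\dfrac{1}{u} \displaystyle\int_0^r \frac{\partial u}{\partial x^1}(\tau, x) \, d\tau & -\dfrac{1}{u} \displaystyle\int_0^r \frac{\partial u}{\partial x^2}(\tau, x) \, d\tau \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix},$$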
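and the pull back metric under $F^{-1}$ is
$$\begin{aligned}
(F^{-1})^*(ds^2)
&= u^2 \left[ \left( \frac{\partial r}{\partial\rho} \right)^2 d\rho^2 + \frac{\partial r}{\partial y^i} \frac{\partial r}{\partial y^j} \, dy^i dy^j + \frac{\partial r}{\partial\rho} \frac{\partial r}{\partial y^i} \left( d\rho \, dy^i + dy^i d\rho \right) \right] + g_{ij}(r(\rho, y), y) \, dy^i dy^j \\
&= d\rho^2 + \left( \int_0^r \frac{\partial u}{\partial x^i} \, d\tau \right) \left( \int_0^r \frac{\partial u}{\partial x^j} \, d\tau \right) dy^i dy^j - \left( \int_0^r \frac{\partial u}{\partial x^i} \, d\tau \right) \left( d\rho \, dy^i + dy^i d\rho \right) + g_{ij}(r(\rho, y), y) \, dy^i dy^j,
\end{aligned}$$
where $g_{ij}(r, x)$ is smooth in $r$ and $x$. By (i) and (2.1), there is a constant $C_3$ such that near $\rho = 0$, i.e. near $r = 0$,
$$\left| \int_0^r \frac{\partial u}{\partial x^i} \, d\tau \right| \le C_3 r^{\frac{1}{2}},$$
$$\left| \frac{\partial}{\partial y^j} \int_0^r \frac{\partial u}{\partial x^i} \, d\tau \right| = \left| \frac{\partial}{\partial x^j} \int_0^r \frac{\partial u}{\partial x^i} \, d\tau \right| \le C_3 r^{\frac{1}{2}},$$
$$\left| \frac{\partial}{\partial\rho} \int_0^r \frac{\partial u}{\partial x^i} \, d\tau \right| = \left| \frac{\partial u}{\partial x^i} \frac{\partial r}{\partial\rho} \right| \le C_3,$$
$$\left| \frac{\partial g_{ij}}{\partial y^k} \right| = \left| \frac{\partial g_{ij}}{\partial r} \frac{\partial r}{\partial y^k} + \frac{\partial g_{ij}}{\partial x^k} \right| \le C_3$$
and
$$\left| \frac{\partial g_{ij}}{\partial\rho} \right| = \left| \frac{\partial g_{ij}}{\partial r} \frac{\partial r}{\partial\rho} \right| \le C_3 r^{\frac{1}{2}}.$$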
Hence (iv) is true. This completes the proof of the theorem.
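Two steps of the proof above reduce to short computations. The sketch below records them under the assumption that $w = \sqrt{r}\,u$ and $t = \log r$ as in the proof of (iii); the shorthand $a_i$ for the integrals appearing in $dF$ is ours.

```latex
% (a) The weighted bounds (2.1) from the bounds on w, \partial_t w and \nabla_r^j w:
u = r^{-1/2}\,w, \qquad
\frac{\partial u}{\partial r}
   = -\tfrac12\,r^{-3/2}w + r^{-1/2}\cdot\frac{1}{r}\frac{\partial w}{\partial t}
   = r^{-3/2}\Bigl(\frac{\partial w}{\partial t} - \tfrac12\,w\Bigr), \qquad
\nabla_r^j u = r^{-1/2}\,\nabla_r^j w ,
% so r^{3/2}|\partial_r u| and r^{1/2}\|\nabla_r^j u\| are bounded by constants depending on C_1.

% (b) The pull back metric. Put a_i := \int_0^r \frac{\partial u}{\partial x^i}(\tau,x)\,d\tau. Then
d\rho = u\,dr + a_i\,dx^i, \qquad dy^i = dx^i
\quad\Longrightarrow\quad
dr = u^{-1}\bigl(d\rho - a_i\,dy^i\bigr),

u^2\,dr^2 = \bigl(d\rho - a_i\,dy^i\bigr)^2
          = d\rho^2 - a_i\bigl(d\rho\,dy^i + dy^i\,d\rho\bigr) + a_i a_j\,dy^i dy^j .
```

Adding $g_{ij}\,dy^i dy^j$ to the last line gives the second expression for $(F^{-1})^*(ds^2)$, and the factor $u^2$ cancels exactly, which is why only the integrals $a_i$ and $g_{ij}$ need to be estimated near $r = 0$.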
3. Non-Simply Connected Boundary

In this section, we always assume that $(\Omega^n, g)$, $n \ge 4$, is a compact spin manifold with nonnegative scalar curvature and with boundary $\partial\Omega$ such that $(\partial\Omega, g)$ can be isometrically embedded in $\mathbb{R}^n_*/\Gamma = (\mathbb{R}^n \setminus \{0\})/\Gamma$ with image $\Sigma$, where $\Gamma$ is a nontrivial finite subgroup of $SO(n)$ which acts freely on $\mathbb{R}^n_*$, and $\Sigma = \widetilde\Sigma/\Gamma$ for some compact strictly convex hypersurface $\widetilde\Sigma$ in $\mathbb{R}^n$ so that $\tau(\widetilde\Sigma) = \widetilde\Sigma$ for all $\tau \in \Gamma$. We also assume that the origin is in the interior of $\widetilde\Sigma$. If we glue the exterior part of $\Sigma$ with $\Omega$ along the boundary via the isometry of $\partial\Omega$ we obtain a smooth spin manifold. We denote a manifold constructed in this way by $M$. Here the exterior part of $\Sigma$ is defined as $E(\widetilde\Sigma)/\Gamma$, where $E(\widetilde\Sigma)$ is the exterior part of $\widetilde\Sigma$ in $\mathbb{R}^n$. Note that the tangent cone $\mathbb{R}^n_*/\Gamma$ of $M$ has a spin structure induced by the spin structure of $M$.

Theorem 3.1. Let $(\Omega, g)$ and $\Sigma$ be as above. Suppose $\partial\Omega$ has positive mean curvature $H$ with respect to the unit outward normal. If in addition, there is a nontrivial parallel spinor on the tangent cone of $M$ with respect to the spin structure induced by that of $M$, then
$$\int_{\partial\Omega} H \, d\sigma \le \int_{\partial\Omega} H_0 \, d\sigma, \tag{3.1}$$
where $H_0$ is the mean curvature of $\partial\Omega$ when embedded in $\mathbb{R}^n_*/\Gamma$ with respect to the Euclidean metric. Moreover, if equality holds then $\Omega$ is Ricci flat.

Remark 3.1. (1) If $\Gamma$ is trivial then this is the case considered in [ST]. (2) If the dimension is odd then $\Gamma$ must be trivial.

It was proved in [Sh] by the first author that if $n = 2m$ and $\Gamma \subset SU(m)$ is such that the order $|\Gamma|$ of $\Gamma$ is odd, then $\mathbb{C}^m_*/\Gamma = \mathbb{R}^{2m}_*/\Gamma$ has a unique spin structure. Moreover, $\mathbb{C}^m_*/\Gamma$ has two linearly independent parallel spinors. Hence we have the following:

Corollary 3.1. Let $(\Omega^{2m}, g)$ be as before. Suppose $\Gamma \subset SU(m)$ and $|\Gamma|$ is odd. Then (3.1) is true. Moreover, if equality holds, then $\Omega$ is Ricci flat.

In order to prove Theorem 3.1, we need the following positive mass theorem for asymptotically locally Euclidean (ALE) manifolds with a metric which may not be smooth. A positive mass theorem for smooth ALE manifolds under certain conditions has been proved in [D]. Let $N^n$ be a smooth manifold such that there is a bounded domain $\Omega \subset N$ with smooth boundary $\partial\Omega$ and such that there is a continuous Riemannian metric $g$ on $N$ with the following properties:

(i) $g$ is smooth on $N \setminus \Omega$ and $\overline\Omega$, and is Lipschitz near $\partial\Omega$, such that the scalar curvature on $N \setminus \Omega$ is integrable.

(ii) The mean curvatures at $\partial\Omega$ with respect to the outward normal and with respect to the metrics $g|_{N \setminus \Omega}$ and $g|_\Omega$ are the same.

(iii) $N$ has only one end, which is ALE in the following sense:
There is a compact set $K$ containing $\Omega$ such that $N \setminus K = E$, where $E$ is diffeomorphic to $(\mathbb{R}^n \setminus B_R(0))/\Gamma$, and in the standard coordinates in $\mathbb{R}^n$ the pull back metric of $g$ on $\mathbb{R}^n \setminus B_R(0)$ satisfies $g_{ij} = \delta_{ij} + b_{ij}$, with
$$|b_{ij}| + r |\partial b_{ij}| + r^2 |\partial\partial b_{ij}| = O(r^{2-n}),$$
where $r$ and $\partial$ denote the Euclidean distance and the standard gradient operator on $\mathbb{R}^n$, respectively.

We should remark that because of (i), the outward unit normal on $\partial\Omega$ is well-defined. Moreover, (i) and (iii) imply that the ADM mass of the end of $N$ is also well-defined by the proof in [B1]. Explicitly, the ADM mass at the end $E$ is given by
$$C(n)\, m_E = \lim_{r\to\infty} \int_{S(r)} \left( g_{ij,j} - g_{jj,i} \right) dS_i,$$
where we have already taken the universal covering of $(\mathbb{R}^n \setminus B_R(0))/\Gamma$, $C(n)$ is a positive constant, $S(r)$ is the Euclidean sphere in $\mathbb{R}^n$ and $dS_i$ is its normal surface area element.

Proposition 3.1. Let $(N, g)$ be as above with nonnegative scalar curvature. Suppose the tangent cone of $N$ has a nontrivial parallel spinor. Then $m_E \ge 0$. Moreover, if $m_E = 0$ then $N$ is Ricci flat.

Proof. The proof is similar to the proof in [ST]. Let $\Psi$ be a nontrivial parallel spinor on the tangent cone. Then one can find a harmonic spinor $\Psi_1$ which is asymptotically equal to $\Psi$. In particular, $\Psi_1$ is nontrivial. This has been proved in [Sh]. Then we can apply the Lichnerowicz formula, which is still true because of property (ii) (see [ST, Lemma 3.2]), and we can conclude that $m_E \ge 0$. If $m_E = 0$, then we can conclude that $\Psi_1$ is parallel in $\Omega$ and $N \setminus \Omega$. But $\Psi_1$ is continuous near $\partial\Omega$, see [ST, Lemma 3.3]. Hence $\Psi_1$ is a nontrivial parallel spinor on $N$ and $N$ is Ricci flat by [Hi].

Now we are ready to prove Theorem 3.1.

Proof of Theorem 3.1. Let $\pi : \mathbb{R}^n_* \to \mathbb{R}^n_*/\Gamma$ be the projection. Let $\widetilde H_0(\tilde x) = H_0(\pi(\tilde x))$ and let $\widetilde H(\tilde x) = H(\pi(\tilde x))$. By Lemma 1.1, there is a unique solution $\tilde u$ of the following initial value problem:
$$\begin{cases}
2\widetilde H_0 \dfrac{\partial \tilde u}{\partial r} = 2\tilde u^2 \widetilde\Delta_r \tilde u + (\tilde u - \tilde u^3) \widetilde R_r & \text{on } \widetilde\Sigma_0 \times [0, \infty), \\[2mm]
\tilde u(x, 0) = \dfrac{\widetilde H_0}{\widetilde H},
\end{cases} \tag{3.2}$$
where $\widetilde H_0$ also denotes the mean curvature of the level surface $\widetilde\Sigma_r$ of the distance function from $\widetilde\Sigma$, and $\widetilde R_r$ is the scalar curvature of $\widetilde\Sigma_r$. Since $\widetilde\Sigma$ is invariant under $\Gamma$, $\widetilde\Sigma_r$ is also invariant under $\Gamma$. Hence $\tilde u \circ \tau$ is still a solution of (3.2). By the uniqueness, we conclude that $\tilde u \circ \tau = \tilde u$. Hence the metric $ds^2 = u^2 dr^2 + g_r$ is well-defined, where $u$ is the function on $M$ such that $\tilde u = u \circ \pi$, and $g_r$ is the metric on $\Sigma_r = \pi(\widetilde\Sigma_r)$ induced by the Euclidean metric. Moreover, if we extend the metric $ds^2$ to $\Omega$ such that $ds^2$ is equal to the given metric $g$, then $ds^2$ satisfies the conditions in
Proposition 3.1. Since the tangent cone of $M$ has a nontrivial parallel spinor, we can apply Proposition 3.1 to conclude that $m \ge 0$, where $m$ is the mass of the end of $M$. On the other hand, by Lemma 4.2 and Theorem 2.1 in [ST], we have
$$|\Gamma| \int_\Sigma (H_0 - H) \, d\sigma = \int_{\widetilde\Sigma} (\widetilde H_0 - \widetilde H) \, d\tilde\sigma \ge C(n)\, m$$
for some positive constant $C(n)$ depending only on $n$. From this and Proposition 3.1, the theorem follows.
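The factor $|\Gamma|$ in the last display, and the way the theorem follows from it, can be spelled out. This is a sketch assuming, as in the setup of this section, that $\Gamma$ acts freely by Euclidean isometries preserving $\widetilde\Sigma$; the symbol $d\tilde\sigma$ for the volume form of $\widetilde\Sigma$ is our notation.

```latex
% \pi|_{\widetilde\Sigma} : \widetilde\Sigma \to \Sigma is a |\Gamma|-sheeted Riemannian covering,
% so for any integrable f on \Sigma:
\int_{\widetilde\Sigma} (f\circ\pi)\,d\tilde\sigma \;=\; |\Gamma| \int_{\Sigma} f\,d\sigma .

% Apply this with f = H_0 - H, recalling \widetilde H_0 = H_0\circ\pi and \widetilde H = H\circ\pi:
|\Gamma| \int_{\Sigma} (H_0 - H)\,d\sigma
  \;=\; \int_{\widetilde\Sigma} (\widetilde H_0 - \widetilde H)\,d\tilde\sigma
  \;\ge\; C(n)\,m \;\ge\; 0 ,
% which is (3.1).

% If equality holds in (3.1), the left side vanishes, so C(n)\,m \le 0; together with m \ge 0
% this gives m = 0, and Proposition 3.1 then yields Ricci flatness.
```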
Acknowledgement. The authors would like to thank Robert Bartnik, Jiaxing Hong and Andrejs Treibergs for useful discussions.
References

[B1] Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 39, 661–693 (1986)
[B2] Bartnik, R.: Quasi-spherical metrics and prescribed scalar curvature. J. Differ. Geom. 37, 31–71 (1993)
[B3] Bartnik, R.: Private communication
[BC] Bishop, R. L., Crittenden, R. J.: Geometry of manifolds. London-New York-San Diego: Academic Press, 1964
[BY 1] Brown, J. D., York, J. W.: Mathematical aspects of classical field theory (Seattle, WA, 1991). Providence, RI: Am. Math. Soc. 1992, pp. 129–142
[BY 2] Brown, J. D., York, J. W.: Quasilocal energy and conserved charges derived from the gravitational action. Phys. Rev. D (3) 47(4), 1407–1419 (1993)
[D] Dahl, M.: The positive mass theorem for ALE manifolds. Mathematics of Gravitation, Part I (Warsaw, 1996), Banach Center Publ. 41, Part I, Warsaw: Polish Acad. Sci. 1997, pp. 133–142
[GL] Guan, P., Li, Y. Y.: The Weyl problem with nonnegative Gauss curvature. J. Differ. Geom. 39, 331–342 (1994)
[Ha] Hartman, P.: On unsmooth two-dimensional Riemannian metrics. Am. J. Math. 74, 215–226 (1952)
[HH] Hawking, S. W., Horowitz, G. T.: The gravitational Hamiltonian, action, entropy and surface terms. Classical Quantum Gravity 13, 1487–1498 (1996)
[Hi] Hitchin, N.: Harmonic spinors. Adv. in Math. 14, 1–55 (1974)
[HZ] Hong, J., Zuily, C.: Isometric embedding of the 2-sphere with nonnegative curvature in R^3. Math. Z. 219, 323–334 (1995)
[I] Iaia, J. A.: Geometric analysis and nonlinear partial differential equations (Denton, TX, 1990). In: Lecture Notes in Pure and Appl. Math. Vol 144, New York: Dekker, 1993, pp. 213–220
[LW] Li, Y., Weinstein, G.: A priori bounds for co-dimension one isometric embeddings. Am. J. Math. 121, 945–965 (1999)
[LY] Liu, C.-C. M., Yau, S.-T.: Positivity of quasilocal mass. Phys. Rev. Lett. 90, 231102-1–231102-4 (2003)
[N] Nirenberg, L.: The Weyl and Minkowski problems in differential geometry in the large. Comm. Pure Appl. Math. 6, 337–394 (1953)
[P] Pogorelov, A. V.: Extrinsic geometry of convex surfaces. Providence, RI: American Mathematical Society, 1973
[Sa] Sacksteder, R.: On hypersurfaces with no negative sectional curvatures. Am. J. Math. 82, 609–630 (1960)
[Sh] Shi, Y.: Crepant resolution, rigidity theorem, and parallel spinors on asymptotically locally Euclidean manifolds. To appear in Pac. J. Math.
[ST] Shi, Y.-G., Tam, L.-F.: Positive mass theorem and the boundary behaviors of compact manifolds with nonnegative scalar curvature. J. Diff. Geom. 62, 79–125 (2002)
[Sp] Spivak, M.: A comprehensive introduction to differential geometry. V.5, Berkeley, CA: Publish or Perish 1970–75
[Y] Yau, S.-T.: Geometry of three manifolds and existence of black hole due to boundary effect. Adv. Theo. Math. Phys. 5, 755–767 (2001)
Communicated by G.W. Gibbons
Commun. Math. Phys. 250, 81–94 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1085-8
Communications in
Mathematical Physics
Soliton Surfaces in the Mechanical Equilibrium of Closed Membranes

Brian Smyth

Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA.

Received: 17 July 2003 / Accepted: 18 November 2003
Published online: 28 April 2004 – © Springer-Verlag 2004
Abstract: For a closed material membrane in equilibrium in a force field we investigate whether the external observables (membrane geometry and force field) determine the internal membrane response (the stress tensor $T$), when the mean stress $\tfrac{1}{2} \operatorname{Tr} T$ is known. For membranes with boundary the indeterminacy of the response is classical. For closed membranes the geometry decides the question. We show uniqueness for all but a class of soliton surfaces, the globally isothermic surfaces (§1, 2); the physical phenomenon exhibited by any closed globally isothermic membrane in equilibrium is that, with all observables static, there is in total a 1-parameter family of responses with the same mean stress, all canonically determined by membrane geometry (Theorem 1). There exist closed embedded globally isothermic surfaces of every genus. The recognition of the role of these soliton surfaces settles the old classification problem for the space of static shears in any closed membrane, explicitly identifying all static shears, where before only a genus-dependent dimension bound was known (Theorem 2).

1. Introduction

This work treats the equilibrium of a closed thin linearly elastic material membrane under the influence of a force field, from the geometric point of view, using connexions between the geometry, the conformal structure and the stress; for exceptional membrane configurations (the surfaces in the title) there are exceptional families of stress responses. These exceptional configurations turn out to be abundant, being stable under the conformal group of $\mathbb{R}^3 \cup \{\infty\}$ and, as we show, occurring in every topological genus. Given a smooth compact orientable material membrane $M$ with smooth boundary (possibly empty) in space, in equilibrium under a given external force field $F$, it is of interest to consider the extent to which the internal response — that is, the stress tensor
Partially supported by NSF Grant DMS00 71729
T — is determined. The specific question treated here concerns the extent to which the observables (membrane geometry and force field) and the mean stress function $\tau = \tfrac{1}{2} \operatorname{Tr} T$ determine the full stress tensor $T$. I am grateful to Andrew Smyth who interested me in this subject and for a conversation which crystallized for me the significance of this question for closed membranes. All membrane surfaces $M$ are assumed smooth, compact and oriented with or without smooth boundary. $M$ is called closed when the boundary $\partial M$ is empty. When $\partial M \ne \emptyset$ the boundary is given the induced orientation. The orientation and induced metric on $M$ determine the structure of a Riemann surface on $M$ and its complex structure is denoted $J$ (see §2). Force and moment balance considerations between the force field $F$ and the response $T$, applied in the small to $M$, lead to the equilibrium equation derived in §2 (see also [9]). It may be thought of as an inhomogeneous system of linear first order partial differential equations for the unknown internal response $T$ into which the observables enter as known quantities. Our study of the uniqueness question above is the source of the identification of the distinguished class of closed soliton membrane surfaces and the extra flexibility in the stress response they allow. Our analysis will show that uniqueness is the rule but an exceptional geometric class of surfaces emerges naturally: the globally isothermic surfaces (§2). The occurrence of such a surface as a membrane configuration signals a canonical non-uniqueness in the stress tensor response; that is, there are distinct stress tensor solutions with the same mean stress (pointwise) and any two such stress tensors stand in a canonical relationship to one another determined solely by the geometry of the membrane (Theorem 1, below).
The globally isothermic surfaces are defined by a condition linking the geometry with the complex structure; namely, on the complement of the umbilic set, the Hopf quadratic differential of the surface (see §2) is a pure imaginary functional multiple of a global holomorphic quadratic differential . The differential is uniquely determined to within a real constant multiple and is called the underlying holomorphic quadratic differential of the globally isothermic surface. The condition is truly geometric, being equivalent to the condition that the principal foliations of the surface coincide with the foliations determined by a global holomorphic quadratic differential — in this case i — in the manner of [20] (see p.17). In §4 we gain a better understanding of this anomalous family of surfaces. A sphere has no non-trivial holomorphic quadratic differentials and so, from the definition, the only topologically spherical examples are round spheres. The main result on the physical question posed above is: Theorem 1. Let M be a smooth closed membrane in equilibrium under an external force field F with mean stress function τ . Either the mean stress uniquely determines the stress tensor or else M is a globally isothermic surface of genus p ≥ 1. If M is a globally isothermic membrane of genus p ≥ 1, then the space of solutions of the equilibrium equation with mean stress τ is in one-to-one correspondence with R and the correspondence is canonically determined by the geometry of the membrane M. Remark. By the geometry of M is meant its configuration in R3 , that is, its first and second fundamental forms. With this result the question of the existence of globally isothermic compact embedded surfaces in R3 becomes important and is settled for every genus in §4. It also follows that there is no exceptional behaviour for topologically spherical membranes, that is, the mean stress completely determines the stress tensor on any topologically spherical membrane in equilibrium.
Theorem 1 is proved in §3. The argument associates a complex quadratic differential to the responding stress tensor in the way of Hopf [10]. The theory of holomorphic differentials then enters naturally via divergence-trace conditions arising from the equilibrium equation (see §2). An analysis of the index of an isolated umbilic figures in the proof, in §3, but the vital algebraic identity comes from the normal component of the equilibrium equation which links the Hopf differential of the membrane surface with a quadratic differential determined by the stress tensor. This algebraic identity is the source of the appearance of soliton surfaces in the theory. Examples of globally isothermic surfaces include all constant mean curvature surfaces, their Hopf differentials being themselves holomorphic; in particular, from Kapouleas [12], there are compact examples of every genus but none are embedded, by Alexandrov [1], except for the round sphere. The class of globally isothermic surfaces is much larger since the property is stable under the conformal group of R3 ∪ {∞} (see §4). Bonnet surfaces (that is, surfaces of non-constant mean curvature admitting a non-trivial 1-parameter family of isometric deformations preserving the mean curvature) form another interesting class of globally isothermic examples but Bonnet surfaces are never compact [3]. Thus the existence of compact embedded globally isothermic surfaces in every genus is of greatest interest here (see §4). A stress tensor T is called a shear if τ ≡ 0 and a surface tension if T ≡ τ I . The equilibrium equation becomes a homogeneous system when F ≡ 0 and describing its solution space is important for the general case; these solutions are called the static or residual stress tensors. This vector space is of some physical interest. A simple model where this space presents itself is the pericardial sac at end-diastole, where F ≡ 0, and it was proposed [18] to investigate the stresses occurring in this case. 
For closed strictly convex membranes there are no non-trivial static stresses, by a result of Cohn-Vossen and Blaschke (see Vekua [21]). It was suspected [18] that this space might always be finite-dimensional, but it can be shown that there are closed embedded surfaces of every genus for which this space is infinite-dimensional [19]. It is significant that the spaces of static shears for membranes related by a conformal transformation of space can be shown to be canonically isomorphic, but no canonical correspondence appears between their spaces of static stresses [19].

As has been long known (see Theorem 5.2, [16]), the space of static shears is finite-dimensional, of real dimension bounded by $6(p - 1)$ if the genus $p$ of the membrane is $p > 1$, by $2$ if $p = 1$ and by $0$ if $p = 0$. But, this rough bound apart, nothing more was known. Our theorem below gives a simple genus-independent explicit classification of the space of static shears on any closed membrane, with the geometry deciding everything. The effectiveness of our approach originates in the recognition of the soliton surfaces in the background of all uniqueness considerations. Applying Theorem 1 with $\tau \equiv 0$, we obtain the following result.

Theorem 2. If a smooth closed membrane admits a non-trivial static shear, then it must be a globally isothermic surface of genus $p \geq 1$ and this shear is uniquely determined to within a constant multiple by the membrane geometry alone. Conversely, any smooth closed globally isothermic membrane of genus $p \geq 1$ has a non-trivial static shear.

Any torus of revolution is globally isothermic (see §4) and so this theorem would apply. But the result was known only for circular tori of revolution and then only from explicit computation [16]. Our theorem gives the result for all globally isothermic membranes and the underlying holomorphic quadratic differential supplies all of the static shears in a canonical way (§2). If the membrane is not globally isothermic there are no non-trivial static shears.
B. Smyth
The fact that globally isothermic surfaces admit no umbilics of positive index (Proposition 2, §4) gives at once a simple and striking consequence.

Corollary 1. If a smooth closed membrane has an isolated umbilic of positive index then it admits no non-trivial static shear.

A surface in $\mathbb{R}^3$ whose Hopf differential is locally a real multiple of a local holomorphic quadratic differential on the complement of the umbilic set is called isothermic (§2). Such surfaces were first studied by Cayley [7] and then by Bianchi, Darboux, Bäcklund, Christoffel and others (see [4]) in the late nineteenth and early twentieth century. This property is also conformally invariant, and a soliton theory was developed for these surfaces by Cieśliński-Goldstein-Sym [8] in 1995. The geometric interest in these surfaces in recent years, as well as their earlier history, is chronicled in the lecture notes of Burstall [4]. Because globally isothermic surfaces emerge here as such an exceptional class among membrane configurations, it is worth contrasting them with isothermic surfaces and giving examples (see §4).

2. Surface Theory, the Soliton Surfaces and the Equilibrium Equation

We first explain some of the basic differential geometry of surfaces, in particular the Codazzi equation, the Hopf differential and the simple geometric characterization of the particular soliton surfaces which dominate the uniqueness considerations in the physical problem considered later. Let $M$ be a smooth oriented surface without boundary and $x : M \to \mathbb{R}^3$ an immersion of $M$ in $\mathbb{R}^3$. The standard inner product $\langle\,,\rangle$ on $\mathbb{R}^3$ induces a Riemannian metric $g$ on $M$ defined by $g_p(X, Y) = \langle x_{*p}(X), x_{*p}(Y) \rangle$ for any vectors $X, Y \in M_p$; here $M_p$ denotes the tangent space to $M$ at $p$ and $x_{*p}$ denotes the differential of $x$ at $p \in M$. The given orientation on $M$ determines a unique unit normal vector field $\xi$ to $M$ along $x$. The second fundamental form $A$ of the immersion is defined pointwise by $X\xi = -x_{*p}(A_p X)$ for all $X \in M_p$.
The operator $A$ is symmetric with respect to $g$ and depends to within sign on the choice of unit normal field; its characteristic functions on $M$ are the mean curvature function $H = \frac{1}{2}\operatorname{Tr} A$ and the Gauss curvature function $K = \det A$. The oriented area element $da$ of the metric $g$ is defined by $da(X, Y) = \langle x_*(X) \times x_*(Y), \xi \rangle$, where $\times$ is the cross product. If $\nabla$ is the Riemannian connexion of the induced metric $g$ on $M$, then the equality of mixed partials of the vector-valued function $x$ on $M$ results in the Codazzi equation for $A$,

$$(\nabla_X A)Y - (\nabla_Y A)X = 0,$$

for all vectors $X$ and $Y$ tangent to $M$. The complex structure $J$ on $M$ is defined pointwise as counterclockwise rotation of the oriented inner product space $(M_p, g_p)$ through the angle $\frac{\pi}{2}$. Local coordinates $(u_1, u_2)$ on $M$ for which $g = e^{2\rho}(du_1^2 + du_2^2)$, with $\rho$ a smooth real-valued function of these variables, are called local isothermal coordinates for the metric $g$, and if the coordinate frame $\{\frac{\partial}{\partial u_1}, \frac{\partial}{\partial u_2}\}$ gives the surface orientation the coordinates are called positive isothermal coordinates; $w = u_1 + iu_2$ is then called a local complex coordinate for the underlying Riemann surface. The complex quadratic differential defined locally by

$$\Omega = g\left(A\frac{\partial}{\partial w}, \frac{\partial}{\partial w}\right)dw^2$$

is called the Hopf differential of the immersion $x$; here $\frac{\partial}{\partial w} = \frac{1}{2}\{\frac{\partial}{\partial u_1} - i\frac{\partial}{\partial u_2}\}$ and $\frac{\partial}{\partial \bar{w}} = \frac{1}{2}\{\frac{\partial}{\partial u_1} + i\frac{\partial}{\partial u_2}\}$. If
$$A = \begin{pmatrix} H + \alpha & \beta \\ \beta & H - \alpha \end{pmatrix}$$
with respect to the above positive local coordinates $(u_1, u_2)$, then the corresponding local representation of the Hopf differential is $\Omega = \omega\,dw^2 = -\frac{i}{2}e^{2\rho}(\beta + i\alpha)\,dw^2$, where $w = u_1 + iu_2$. Codazzi's equation for $A$ is $\omega_{\bar{w}} = \frac{1}{2}e^{2\rho}H_w$, and the zeros of $\Omega$ are the umbilics of the immersion $x$.

To say that a symmetric tensor field $T$ of type (1,1) is divergence-free is equivalent to saying that $S = j(T) = JTJ$ satisfies Codazzi's equation. The associated global complex quadratic differential to $T$ is

$$\Phi = g\left(S\frac{\partial}{\partial w}, \frac{\partial}{\partial w}\right)dw^2 = \phi(w)\,dw^2 = -\frac{i}{2}e^{2\rho}(b + ia)\,dw^2, \quad \text{where} \quad T = \begin{pmatrix} \tau + a & b \\ b & \tau - a \end{pmatrix}$$

with respect to the coordinate frame. The divergence-free condition on $T$ is equivalent to the local equation $\phi_{\bar{w}} = -\frac{1}{2}e^{2\rho}\tau_w$, and so if $T$ is also trace-free then $\Phi$ is holomorphic.

An immersion $x : M \to \mathbb{R}^3$ is globally isothermic if there exists a global holomorphic quadratic differential $\Phi$ on $M$ such that $\Omega = ik\Phi$ on the complement of the umbilic set $U$ in $M$, $k$ being a real-valued function on $M - U$. The holomorphic differential $\Phi$, being then uniquely determined to within a real constant multiple by $\Omega$ (and therefore by the geometry of $M$), will be called the underlying holomorphic quadratic differential of the globally isothermic surface. The differential $\Phi$ then uniquely determines a smooth symmetric divergence-free trace-free tensor field $T_0$ on $M$. Since on a globally isothermic surface the differential $\Phi$ is uniquely determined to within a real constant factor, the same is true of $T_0$. This field ultimately accounts for the extra freedom in the response on a globally isothermic membrane.

An immersion $x : M \to \mathbb{R}^3$ is isothermic if there exist positive isothermal coordinates $\{u_1, u_2\}$ on a neighbourhood $V$ of each non-umbilic point and a local holomorphic quadratic differential $\Phi_V$ in the coordinate $w = u_1 + iu_2$ such that the Hopf differential satisfies $\Omega = ik_V\Phi_V$ on $V$, $k_V$ being a real-valued function on $V$.
Note that if $A$ diagonalizes with respect to these isothermal coordinates then $\beta \equiv 0$ and $\Omega = ik_V\Phi_V$, where $k_V$ is a real function on $V$ and $\Phi_V$ is the holomorphic quadratic differential $i\,dw^2$ on $V$. Conversely, if on an isothermal patch $V$ centred at a non-umbilic point $p$, $\Omega = ik_V\Phi_V$ for $k_V$ and $\Phi_V$ as above, then $\Phi_V = f(w)\,dw^2$, with $f$ a holomorphic function of $w$, and since $p$ is not an umbilic, $f(o) \neq 0$. After making a holomorphic change of coordinates we may assume $f \equiv i$ and, in these new coordinates, $\beta \equiv 0$ and so $A$ diagonalizes. Thus, being isothermic is equivalent to the existence of isothermal coordinates on a neighbourhood of each non-umbilic point with respect to which the second fundamental form diagonalizes. It was in this form that the notion was first introduced by Cayley [7]. Examples and properties of isothermic and globally isothermic surfaces are discussed in §4. The soliton theory of these surfaces is found in [8].

An infinitesimally thin material membrane in equilibrium under the influence of an external force field $F$ may be considered as a smooth surface $M$ with smooth boundary, all smoothly embedded in $\mathbb{R}^3$. The smooth embedding $x : M \to \mathbb{R}^3$ is taken to be smooth at the boundary, in the sense that it extends to a smooth embedding of a larger surface into $\mathbb{R}^3$. We consider only oriented surfaces because our interest is in closed membranes and these are automatically orientable. The boundary, if non-empty, is given the canonical orientation, i.e., that for which the oriented unit normal to the boundary
$\partial M$ points into the interior of $M$. The force field is a smooth vector-valued function $F : M \to \mathbb{R}^3$. The nature of the responding forces of stress within the membrane (i.e., the stress tensor $T$), holding it in equilibrium in response to the applied field $F$, is explained below. In our discussion we also assume that these forces are $C^\infty$. Post facto, it will be seen that for the equilibrium equation we need no more than $x$ to be $C^3$, $T$ to be $C^1$ and $F$ to be continuous.

The equilibrium of an isotropic membrane (for which $T \equiv \tau I$, such as happens with a liquid film or bubble) acted on by a smooth force field $F$ was considered first by Young [23] and then by Laplace [13] at the beginning of the nineteenth century in studies of capillary surfaces (see [17]). The more general equation was later found by Beltrami [2] and Lecornu [15].

If we imagine a small slit cut in the membrane along a geodesic emanating from $p$ in the direction of a unit vector $e \in M_p$, then equal restorative forces must be applied in opposite directions at points on either side of the cut in order to maintain the membrane in equilibrium. At the left edge of the cut, the limiting value of this force (per unit length), as the length of the cut goes to zero, is denoted $r(p, e)$ and called the response at $p$ corresponding to the direction $e$. Clearly $r$ is an $\mathbb{R}^3$-valued function on the unit tangent bundle. The equilibrium equation of the membrane is derived in Gurtin-Murdoch [8] from force and moment balance considerations in the small, and in their notation $r(p, e) = -x_{*p}(T_p(Je))$, where $T$ defines a smooth tensor field of type (1,1) on $M$ which is symmetric with respect to the induced metric $g$. The tensor field $T$ is called the stress tensor of the membrane in equilibrium under the field $F$.

Let $D$ be any compact simply-connected domain in $M$ with oriented boundary a smooth closed curve $\gamma(s)$, $0 \leq s \leq l$, parametrized by arc length $s$.
Thinking of this domain of the membrane in isolation, the only forces keeping it in equilibrium are $F$, acting over the interior, and the forces of response corresponding to the oriented unit tangent field $e$ acting at the boundary $\partial D$ of $D$. The external force field $F$ gives rise to the $\mathbb{R}^3$-valued 2-form $F\,da$, where $da$ is the natural area element of the induced metric on $M$ for the fixed choice of orientation. Equilibrium calls for the integral of this 2-form over $D$ to balance the integral of the restorative forces of the response on $D$ along its oriented boundary $\partial D$. Thus the equilibrium equation for $D$ is

$$\int_D F\,da + \int_0^l r(\gamma(s), \dot{\gamma}(s))\,ds = 0,$$

or alternatively,

$$\int_D F\,da + \int_{\partial D} \omega = 0,$$

where $\omega(X) = -x_*(T(JX))$. This defines an $\mathbb{R}^3$-valued 1-form $\omega$ on $M$, which we call the response 1-form of the membrane, and which completely encodes the forces of stress within the membrane. By Stokes' theorem, the equilibrium condition on $D$ is

$$\int_D (F\,da + d\omega) = 0,$$
where $d\omega$ denotes the exterior derivative of $\omega$. The vector-valued 2-form $d\omega$ may be computed from $d\omega(X, Y) = X\omega(Y) - Y\omega(X) - \omega([X, Y])$, using the fundamental equations of surface theory, and we obtain $d\omega = [x_*(\operatorname{div} T) + \operatorname{Tr}(AT)\xi]\,da$. Hence

$$\int_D \bigl(F + x_*(\operatorname{div} T) + \operatorname{Tr}(AT)\xi\bigr)\,da = 0$$
and, by continuity of $F$, $T$ and $\operatorname{div} T$,

$$F + x_*(\operatorname{div} T) + \operatorname{Tr}(AT)\xi = 0$$

is the equilibrium equation of the membrane.

Remark. If $f_t : M \to \mathbb{R}^3$ is a 1-parameter family of isometric deformations of an oriented surface $M$ with oriented unit normal field $\xi_t$ and associated second fundamental form $A_t$, then the Gauss equation may be written $JA_tJA_t = -KI$, where $K$ is the Gauss curvature of the induced metric. Writing $B = \left(\frac{dA_t}{dt}\right)_{t=0}$, we have a symmetric tensor field satisfying Codazzi's equation, so that $T_0 = JBJ$ must have zero divergence, i.e., $\operatorname{div} T_0 = 0$. Differentiating Gauss' equation above we obtain $T_0A - JAT_0J = 0$, from which it follows that $\operatorname{Tr} AT_0 = 0$. Thus any isometric deformation gives rise to a static stress $T_0$ (i.e., a solution of the equilibrium equation with $F = 0$). If $H$ is preserved under the deformation, then $\operatorname{Tr} T_0 = 0$ and $T_0$ is a static shear. Similar remarks may be made for infinitesimal isometric deformations. The connexion between static stresses and infinitesimal isometric deformations already figures in the work of Blaschke and Cohn-Vossen on the rigidity (and vanishing of static stresses) of closed strictly convex surfaces (see Vekua [19]).

That there is no uniqueness in the solutions of the equilibrium equation is known classically from the infinite-dimensional space of smooth symmetric tensor fields satisfying the equilibrium equation on any planar membrane configuration in the presence of the external field $F \equiv 0$: the Airy tensors. Such a tensor field is obtained by taking $T = j^{-1}(H_\phi)$, where $\phi$ is any smooth function on the planar configuration $M$, $H_\phi$ is its Hessian operator (defined by $H_\phi(X) = \nabla_X(\nabla\phi)$, where $\nabla$ is the connexion of the induced metric) and $j$ is the operator defined in §2. Using the fact that $M$ is planar (in fact, vanishing Gauss curvature, i.e. flatness, is enough), it is easily checked that $H_\phi$ satisfies Codazzi's equation.
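As a sketch of the planar case just described, take Euclidean coordinates $(x, y)$ on the plane and write $J$ as the matrix of counterclockwise rotation through $\frac{\pi}{2}$. The coordinate names and the matrix form are spelled out here only for illustration; $j(T) = JTJ$ is the operator of §2. Since $J^{-1} = -J$, one has $j^{-1}(S) = JSJ$, and a direct computation gives the following.

```latex
\[
H_\phi = \begin{pmatrix} \phi_{xx} & \phi_{xy} \\ \phi_{xy} & \phi_{yy} \end{pmatrix},
\qquad
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
T = j^{-1}(H_\phi) = J H_\phi J
  = \begin{pmatrix} -\phi_{yy} & \phi_{xy} \\ \phi_{xy} & -\phi_{xx} \end{pmatrix},
\]
\[
\operatorname{div} T
  = \bigl( -\phi_{yyx} + \phi_{xyy},\; \phi_{xyx} - \phi_{xxy} \bigr)
  = (0, 0).
\]
```

Only the equality of mixed partial derivatives of $\phi$ is used, which is why every smooth function on a planar configuration yields a divergence-free tensor in this way.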
Thus, from §2, $\operatorname{div} T \equiv 0$ and, since $A \equiv 0$ for planar surfaces, referring back to the equilibrium equation we see this is all that is needed for solutions of the equilibrium equation for planar membranes with $F \equiv 0$.

Conversely, suppose that $M$ is a simply-connected compact flat membrane with smooth boundary in equilibrium with $F \equiv 0$. If $T$ is a solution of the equilibrium equation then, since $\operatorname{div} T \equiv 0$, $S = j(T)$ satisfies Codazzi's equation. Letting $\Delta$ denote the Laplace operator of the flat induced metric on the membrane $M$, we solve the Poisson equation $\Delta\psi = -2\tau$ for $\psi$, where $2\tau = \operatorname{Tr} T = -\operatorname{Tr} S$. Now $H_\psi$ satisfies Codazzi's equation since $M$ is flat, and so also does $S$. It follows that $S_0 = S - H_\psi$ is a Codazzi operator with trace zero. Thus, the quadratic differential $\Phi$ determined by $S_0$ must be holomorphic. Now $(M, g)$ being flat and simply-connected, it may be identified with a simply-connected region of the plane endowed with its euclidean metric. Then $\Phi = -\frac{i}{2}(b + ia)\,dw^2$, where $w$ is the complex coordinate on the plane and $b + ia$ is a holomorphic function of $w$. $M$ being simply-connected, there exists a holomorphic function $f = k + il$ on $M$ such that $f_{ww} = b + ia$. From the Cauchy-Riemann equations for $f$, this becomes $b + ia = 2il_{ww}$, and this means that $S_0 = H_l$, so that $S = H_{\psi + l}$ is Hessian.

In summary, when $M$ is simply-connected and flat with smooth boundary, the space of solutions of $\operatorname{div} T = 0$ may be identified with the space of Hessians of all smooth functions on $M$. This then is the space of solutions of the equilibrium equation for planar membranes when $F \equiv 0$, and it is infinite-dimensional. When $M$ is flat but not planar,
$A$ is not identically vanishing and the normal component of the equilibrium equation gives a further restriction on the space of solutions.

3. The Proof of the Main Result

In this section we prove Theorem 1 for closed membranes but, to begin with, $M$ will be taken to be compact and oriented with smooth boundary $\partial M$, possibly empty. We consider the tangential and normal equilibrium equations as conditions on the quadratic differentials corresponding to the second fundamental form and the stress tensor. It is the normal equation that leads to the appearance of globally isothermic surfaces.

If two distinct stress tensors $T$ and $\tilde{T}$ satisfying the equilibrium equation $F + x_*(\operatorname{div} T) + \operatorname{Tr}(AT)\xi = 0$ have the same mean stress, then their difference $T_0 = \tilde{T} - T$ is non-trivial and satisfies (i) $\operatorname{div} T_0 = 0$, (ii) $\operatorname{Tr}(AT_0) = 0$ and (iii) $\operatorname{Tr} T_0 = 0$. Let $\Phi$ be the corresponding non-trivial quadratic differential determined by $T_0$. As we saw in §2, (i) and (iii) imply that $\Phi$ is holomorphic. Its zero set $Z(\Phi)$ is therefore discrete in $M$. Let

$$A = \begin{pmatrix} H + \alpha & \beta \\ \beta & H - \alpha \end{pmatrix} \quad \text{and} \quad T_0 = \begin{pmatrix} a & b \\ b & -a \end{pmatrix}$$

be the matrices of $A$ and $T_0$ with respect to a positive isothermal coordinate frame $\{\frac{\partial}{\partial u_1}, \frac{\partial}{\partial u_2}\}$ on a neighbourhood $V$ on which $g = e^{2\rho}(du_1^2 + du_2^2)$. Then the corresponding local representations of the associated quadratic differentials, encountered in §2, are $\Omega = \omega\,dw^2$ and $\Phi = \phi\,dw^2$, where $\omega = -\frac{i}{2}e^{2\rho}(\beta + i\alpha)$ and $\phi = -\frac{i}{2}e^{2\rho}(b + ia)$. Since $\operatorname{Tr} T_0 = 0$, condition (ii) above translates into $\omega\bar{\phi}$ being pure imaginary, and so $\omega = ik\phi$ on $V - Z(\Phi)$, $k$ being a real function on the latter set. Hence $\Omega = ik\Phi$ on $M - Z(\Phi)$, $k$ being a smooth real function on the latter set.

There are just two possibilities. The first is that $k \not\equiv 0$ on $M - Z(\Phi)$, and in this case we show that $M$ must be globally isothermic. If we show that $Z(\Phi)$ consists of umbilics then we will have $\Omega = ik\Phi$ on the complement of the umbilic set, as required. Our analysis gives somewhat more: all points of $Z(\Phi)$ are umbilics of negative index and all umbilic points outside of $Z(\Phi)$ are of index zero, and are even removable singularities of the principal foliations (see Proposition 2, §4).

Assume $p \in Z(\Phi)$ is not an umbilic. We may take $p$ to be the origin $o$ of the local isothermal coordinates. Let $m > 0$ be the order of vanishing of $\Phi$ at $p$. Since $p$ is assumed not an umbilic, $\omega(o) \neq 0$ and we may assume that $\omega$ does not vanish on $V$. From the identity $\omega = ik\phi$ on $V - \{o\}$ it follows that $k$ and $\phi$ do not vanish on $V - \{o\}$. Any smooth non-vanishing complex-valued function $v = r + is$ on $V - \{o\}$ may be thought of as a vector field on this punctured neighbourhood with an isolated singularity at $o$, and the index $j_v(o)$ of $v$ at $o$ is defined to be the integer

$$j_v(o) = \frac{1}{2\pi}\oint_C \frac{r\,ds - s\,dr}{r^2 + s^2},$$
the value of which is independent of the choice of simple closed curve $C$ circulating within $V$ once around the singularity $o$, counterclockwise. From the definition, $j$ gives a homomorphism from the ring of germs of non-vanishing complex functions on punctured neighbourhoods of $o$ onto the additive group of integers $\mathbb{Z}$. Note that $j_v(o) = 0$ if $v$ is purely real, purely imaginary or if $v$ extends to a non-vanishing continuous complex function on a full neighbourhood of $o$. Moreover $j_v(o) = m > 0$ if $v$ is holomorphic with a zero of order $m > 0$ at $o$, as follows from the previous remark on representing $v(w) = w^m u(w)$ with $u$ holomorphic and non-vanishing on $V$. The earlier identity $\omega = ik\phi$ on $V - \{o\}$ therefore gives

$$j_\omega(o) = j_i(o) + j_k(o) + j_\phi(o) = m > 0.$$

In particular, $\omega$ must vanish at $o$, a contradiction. Hence each $p \in Z(\Phi)$ must be an umbilic and so we certainly have $\Omega = ik\Phi$ on the complement $M - U$ of the umbilic set $U$. This completes the proof that $M$ is globally isothermic when $k \not\equiv 0$.

In the event that $k \equiv 0$ on $M - Z(\Phi)$, then $\Omega \equiv 0$ on $M - Z(\Phi)$ and therefore on $M$. Hence $M$ is composed of umbilics and, if closed, must be the round sphere. This completes the proof of Theorem 1.

If $k \equiv 0$ but $M$ is not closed, then it must be a proper subdomain of a round sphere or plane in $\mathbb{R}^3$. We may further note that in the two exceptional cases of surfaces with boundary just mentioned, $M$ is conformally equivalent to a domain $D$ in $\mathbb{C}$ and $\Phi$ may be represented globally as $\Phi = \phi(z)\,dz^2$, where $\phi(z)$ is a holomorphic function of the coordinate $z$ on $\mathbb{C}$. Such differentials form an infinite-dimensional vector space over $\mathbb{C}$ and, conversely, each such $\Phi$ determines a $T_0$ satisfying (i) and (iii) above; (ii) holds trivially when $M$ is planar and, in consequence of (iii), also when $M$ is spherical.
Thus, for proper subdomains of the round sphere or plane, any solution $T$ of the equilibrium equation determines an infinite family of solutions $T + cT_0$, where $c$ is any real constant and $T_0$ is a tensor field corresponding to any holomorphic quadratic differential via the correspondence used in §2; these solutions all have the same mean stress as $T$. We summarize these results in the following theorem, which contains Theorem 1 and of which Theorem 2 is a simple consequence.

Theorem 3. Let $M$ be a smooth compact orientable membrane with smooth boundary $\partial M$ (possibly empty) in equilibrium under an external force field $F$, and let $T$ be the responding stress tensor on $M$ and $\tau = \frac{1}{2}\operatorname{Tr} T$ the mean stress of $T$. Either $T$ is the unique solution of the equilibrium equation with mean stress $\tau$, or $M$ is a globally isothermic surface but not the round sphere $S^2$. Conversely, if $M$ is globally isothermic but not the round sphere, then either
i) $M$ is a proper subdomain of the round sphere $S^2$ or the plane $\mathbb{R}^2$ and the space of solutions of the equilibrium equation with mean stress $\tau$ is infinite-dimensional, or
ii) $M$ is closed and the space of solutions of the equilibrium equation with mean stress $\tau$ is one-dimensional and canonically determined by the geometry.

Note that when $\partial M \neq \emptyset$ the proof does not require any information on the behaviour of the stress tensor $T$ at the boundary. If two solutions of the equilibrium equation with the same mean stress also coincide in directions tangential to the boundary, then their difference $T_0$ is zero in these directions. Since $\operatorname{Tr} T_0 \equiv 0$ on account of the mean stress condition, it follows that $T_0$ is zero in all directions at each point of the boundary. Hence the holomorphic differential $\Phi$, associated to $T_0$, vanishes on $\partial M$ and so is identically
zero. Thus if $\partial M \neq \emptyset$ and two solutions $T$ and $\tilde{T}$ of the equilibrium equation with the same mean stress on $M$ coincide in tangential (or normal) directions along the boundary, then they coincide everywhere on $M$.

Corollary 2. If the boundary of $M$ is nonempty, then $T$ is uniquely determined by $\tau$ and the value of $T$ in directions tangent (or normal) along the boundary.
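Two computational steps in the proof of this section lend themselves to a quick numerical check. The Python sketch below is an illustration only: the function names `hopf_coefficient`, `trace_product` and `index_at_origin`, the sampling radius and the number of sample points are all assumptions introduced here, not part of the paper. It checks that the real part of $\omega\bar{\phi}$ equals $\frac{1}{8}e^{4\rho}\operatorname{Tr}(AT_0)$, so that condition (ii) is the same as $\omega\bar{\phi}$ being pure imaginary, and it approximates the index $j_v(o)$ by accumulating the change in argument of $v$ around a small counterclockwise circle.

```python
import cmath
import math


def hopf_coefficient(alpha, beta, rho):
    """Coefficient -(i/2) e^{2 rho} (beta + i alpha) of a quadratic differential.

    With (alpha, beta) taken from A this is omega; with (a, b) taken from the
    trace-free tensor T0 it is phi (hypothetical helper, named for this sketch).
    """
    return -0.5j * math.exp(2.0 * rho) * complex(beta, alpha)


def trace_product(H, alpha, beta, a, b):
    """Tr(A T0) for A = [[H+alpha, beta], [beta, H-alpha]], T0 = [[a, b], [b, -a]]."""
    return (H + alpha) * a + 2.0 * beta * b - (H - alpha) * a


def index_at_origin(v, radius=0.5, samples=4000):
    """Index of v at 0: total change of arg v around a counterclockwise circle.

    Assumes v does not vanish on the circle and that consecutive sample
    values differ in argument by less than pi.
    """
    total = 0.0
    previous = v(complex(radius, 0.0))
    for n in range(1, samples + 1):
        w = radius * cmath.exp(2j * math.pi * n / samples)
        current = v(w)
        total += cmath.phase(current / previous)  # small signed angle increment
        previous = current
    return round(total / (2.0 * math.pi))


if __name__ == "__main__":
    # Condition (ii): Re(omega * conj(phi)) = (e^{4 rho} / 8) * Tr(A T0).
    rho = 0.2
    omega = hopf_coefficient(0.3, -0.7, rho)
    phi = hopf_coefficient(1.1, 0.4, rho)
    lhs = (omega * phi.conjugate()).real
    rhs = math.exp(4.0 * rho) / 8.0 * trace_product(2.0, 0.3, -0.7, 1.1, 0.4)
    print(abs(lhs - rhs) < 1e-9)

    # A holomorphic function with a zero of order 3 at the origin has index 3.
    print(index_at_origin(lambda w: w ** 3))
    # omega = i k phi with k real and non-vanishing has the same index as phi.
    print(index_at_origin(lambda w: 1j * (1.0 + abs(w) ** 2) * w ** 2))
```

The winding number is computed from increments of the argument rather than from the line integral directly, which avoids numerical differentiation; for a function that vanishes on the sampling circle the helper is not meaningful.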
4. Remarks on Globally Isothermic Surfaces

In this study of the equilibrium of material membranes, globally isothermic surfaces came to the fore with the phenomenon described in Theorem 1, and for that reason existence of such compact embedded surfaces in every genus is important. We also point out some properties of globally isothermic surfaces which serve to better distinguish them within the larger family of isothermic surfaces, with examples of both types.

If $\Phi$ is a complex quadratic differential on the Riemann surface $M$, then at each point of the complement $M - Z(\Phi)$ of its zero set $Z(\Phi)$ there is a unique linear subspace of the tangent space, defined by $\phi(w)\,dw^2 > 0$, where $\Phi = \phi(w)\,dw^2$ is the representation of $\Phi$ in any local complex coordinate about this point; this then defines a foliation of $M - Z(\Phi)$ (see [20], p. 17). The pair of orthogonal foliations on $M - Z(\Phi)$ given locally by $\operatorname{Im}(\phi(w)\,dw^2) = 0$ we call the foliations of $\Phi$; we will be interested in the case where $\Phi$ is holomorphic and non-trivial, so that $Z(\Phi)$ is discrete. The foliations of the differential $i\Phi$ intersect those of the differential $\Phi$ at an angle $\frac{\pi}{4}$.

Now if $\Omega$ is the Hopf quadratic differential of an immersion $x$ of $M$, then the complement $M - U$ of the umbilic set $U$ is foliated by the principal foliations and these are described by $\operatorname{Im}(\omega(w)\,dw^2) = 0$, where $\Omega = \omega(w)\,dw^2$ in any local complex coordinate on $M - U$. From §2, if $x$ is globally isothermic then $\Omega = ik\Phi$ on $M - U$, where $\Phi$ is a holomorphic quadratic differential on $M$ and $k$ is a real function on $M - U$; it follows, as in the proof in §3, that $Z(\Phi)$ is contained in $U$ and, by the definition above, that the principal foliations of $x$ coincide with the foliations of $i\Phi$ on $M - U$; moreover the foliations of $i\Phi$ extend the principal foliations of $x$ onto the set $M - Z(\Phi)$.
Conversely, if the principal foliations of $x$ (defined only on the set $M - U$) coincide with the foliations of a global holomorphic quadratic differential $i\Phi$, then $Z(\Phi)$ is contained in $U$ and, from the definition above, $\Omega = ik\Phi$ on $M - Z(\Phi)$ with $k$ some real function on $M - Z(\Phi)$, and therefore the relation holds on $M - U$ and $x$ is globally isothermic. Hence

Proposition 1. An immersion $x$ is globally isothermic if and only if its principal foliations coincide with the foliations of some global holomorphic quadratic differential.

When $M$ is globally isothermic and $\Phi$ is the underlying holomorphic quadratic differential, then $i\Phi$ determines the principal foliations. A distinctive feature of any globally isothermic surface is the character of its isolated umbilics.

Proposition 2. On a globally isothermic surface every isolated umbilic has index $\leq 0$. On the complement of the set of umbilics of negative index, the foliations of the underlying holomorphic quadratic differential extend the principal foliations.
Proof. On $M - U$ we have $\Omega = ik\Phi$ with $\Phi$ a global holomorphic quadratic differential, and since $\Omega$ does not vanish there neither does $\Phi$. Hence $Z(\Phi)$ is in $U$. Any isolated umbilic outside $Z(\Phi)$ has index zero since the foliations of $i\Phi$ extend the principal foliations on $M - Z(\Phi)$. We now give a formula for the index of any isolated umbilic $p_0$. Obviously $k$ is defined and nowhere zero on a punctured isothermal neighbourhood $V - \{p_0\}$ of $p_0$. Writing $\Omega = \omega(w)\,dw^2$ and $\Phi = \phi(w)\,dw^2$ in the local complex coordinate, $\phi(w)$ is holomorphic with a zero of order $m \geq 0$ at $w = 0$. Interpreting the complex-valued functions $\omega(w)$, $ik(w)$ and $\phi(w)$ as vector fields with isolated singularities at $w = 0$, the identity $\Omega = ik\Phi$ on $V - \{p_0\}$ and the usual index property give the relation $j_\omega = j_{ik} + j_\phi$ between their indices. Now since $ik$ is pure imaginary its index is zero. Since $\phi$ is holomorphic its index is $-m$, the order of vanishing of $\Phi$ at $p_0$. Hence $j_\omega = -m$. The umbilic index of $p_0$ is easily determined from $\Omega$ to be $\frac{1}{2}j_\omega$. Hence the umbilic index of $p_0$ is $-\frac{m}{2} \leq 0$.

Surfaces of constant mean curvature are globally isothermic since for them Codazzi's equation amounts to the statement that the Hopf differential is itself holomorphic. Thus, from Kapouleas [12] we know that there exist compact globally isothermic surfaces in $\mathbb{R}^3$ of every genus; but none of these, except the round sphere, is embedded [1]. The Bonnet surfaces (surfaces not of constant mean curvature but admitting a one-parameter family of isometric deformations preserving the mean curvature function) give another class of globally isothermic surfaces; but none of these can be compact [3].

Example. Let $\gamma(t) = (\gamma_1(t), \gamma_2(t))$ be a smooth curve in the upper-half $xy$-plane (i.e., $\gamma_2(t) > 0$), and $f(t, \theta) = (\gamma_1(t), \gamma_2(t)\cos\theta, \gamma_2(t)\sin\theta)$ the surface $M$ obtained by rotating $\gamma$ about the $x$-axis. Let

$$s(T) = \int^{T} \frac{\sqrt{\gamma_1'^2 + \gamma_2'^2}}{\gamma_2}\,dt$$

be the arc length parameter of $\gamma$ with respect to the Poincaré metric on the upper half plane.
Then $(s, \theta)$ are isothermal coordinates for the reparametrization $F(s, \theta) = f(t(s), \theta)$ of the given surface. It is easily checked that $\Omega = k(ds + i\,d\theta)^2$ with $k$ real. Since $ds + i\,d\theta$ is a globally defined holomorphic 1-form on $M$, the surface $M$ is globally isothermic. If $\gamma$ is a smooth simple closed curve in the upper half plane then we get an embedded globally isothermic torus.

It is worth noting that, unlike surfaces of constant mean curvature, globally isothermic surfaces are not necessarily analytic. To see this one need only take a torus of revolution generated by a smooth non-analytic simple closed curve in the upper half plane. Further embedded examples which are not of revolution follow from the next remark.

Remark. The property of being a globally isothermic surface is preserved under the conformal group. Indeed, let $x : M \to \mathbb{R}^3$ be globally isothermic with Hopf differential $\Omega$ and underlying holomorphic differential $\Phi$. Then for any conformal transformation $\psi$, $\tilde{x} = \psi \circ x$ has the same induced conformal structure, and the second fundamental forms are related by $\tilde{A} = pA + qI$ with $p$ and $q$ functions on $M$ and $p$ vanishing nowhere. Thus the umbilic sets of both immersions coincide and the Hopf differential $\tilde{\Omega}$ is a non-zero functional multiple of $\Omega$. So $\tilde{x}$ is globally isothermic also, with $\Phi$ as underlying holomorphic differential.

Proposition 3. In $\mathbb{R}^3$ there exist closed embedded globally isothermic surfaces of every genus.
Proof. Let $x : M \to S^3$ be a compact oriented surface minimally embedded in the unit sphere $S^3$ with oriented unit normal field $\xi$ in $S^3$ and corresponding second fundamental form $A$ and Hopf differential $\Omega$. Such surfaces exist in every genus, by Lawson [14]. Stereographic projection into $\mathbb{R}^3$ gives an embedding $\tilde{x}$ with the same induced conformal structure and $\tilde{A} = pA + qI$ with $p$ and $q$ functions on $M$ and $p$ vanishing nowhere. Hence $\tilde{\Omega}$ is a non-vanishing multiple of $\Omega$ (which is holomorphic by the minimality of $x$), so that $\tilde{x}$ is globally isothermic.

Finally a few remarks on isothermic surfaces, for which there is now the fundamental result of Cieśliński-Goldstein-Sym [8] on the integrability of the Gauss-Codazzi equation and the resultant soliton theory (see also [4]). The soliton theory for such surfaces has as its setting simply-connected isothermal coordinate neighbourhoods; the implications of the soliton theory for globally isothermic surfaces are not yet understood. It is easy to see that the definition of an isothermic surface given in §2 is equivalent to the requirement that, locally on the non-umbilic set, the principal foliations coincide with the foliations of some local holomorphic quadratic differential. It was seen in §2 that being isothermic is equivalent to the requirement that there exist local isothermal coordinates on a neighbourhood of each non-umbilic point with respect to which the second fundamental form diagonalizes. This notion of considering the local simultaneous diagonalization of the first and second fundamental forms was Cayley's [7] point of departure in his study. The property of being isothermic is also invariant under conformal transformations of $\mathbb{R}^3 \cup \{\infty\}$. Another characterization is due to Calapso [6], who showed that any non-trivial solution of the fourth-order p.d.e.

$$\left(\frac{\sigma_{uv}}{\sigma}\right)_{uu} + \left(\frac{\sigma_{uv}}{\sigma}\right)_{vv} + (\sigma^2)_{uv} = 0$$

determines a four-dimensional family of isothermic surfaces.

Globally isothermic surfaces are isothermic, but Cayley had already noted that any ellipsoid is isothermic, and among topologically spherical surfaces only round spheres are globally isothermic. Every smooth surface of revolution $M$ intersecting its axis of revolution is isothermic, but it can be easily checked that it is not globally isothermic, unless it be a piece of a round sphere or of a plane.

Proposition 4. There exist closed embedded isothermic surfaces of every genus in $\mathbb{R}^3$ which are not globally isothermic.

Proof. In genus $p = 0$ the ellipsoids are isothermic but not globally isothermic, as remarked above. For any genus $p \geq 2$, examples can be constructed as follows: in the example above first choose the generating curve to be convex and containing two congruent straight line segments orthogonal to the axis of revolution and at the same distance from it. The original curve may be chosen so that no umbilics of the surface of revolution occur outside the planar part. The resulting torus of revolution contains two congruent planar annuli. Mark off a small contractible circle in one of these annuli as well as the congruent circle in the congruent annulus. Take $p \geq 2$ copies of this surface (together with its pair of marked circles) stacked along the axis of revolution. We may smoothly join the neighbouring planar annuli of revolution by gluing a cylinder of revolution of negative curvature between neighbouring annuli along the circular markings. Then no new umbilics are introduced, so that all umbilics of the new genus $p$ surface are in the interior or boundary of the planar regions and therefore none are isolated. The resulting
surface of genus p ≥ 2 is isothermic since it is locally of revolution in a neighbourhood of any non-umbilic point. If M were globally isothermic then applying Riemann-Roch to the underlying holomorphic quadratic differential of Propositions 1 and 2, we have 0 > χ (M) = 2(1 − p) = j, where the latter summation of umbilic indices (all negative, by Proposition 2) is taken over Z(); in particular, since p ≥ 2, M must have an isolated umbilic of negative index. As the construction gives no isolated umbilics whatsoever, we have a contradiction to Proposition 2 and M is not globally isothermic. Finally we construct an example in genus p = 1; in the initial torus of revolution in the previous construction remove the planar disk bounded by the small contractible circle and smoothly attach a rotationally symmetric disk having a single umbilic (which must be of index 1); the resulting torus is isothermic since it is locally a surface of revolution in a neighbourhood of each non-umbilic point but it is not globally isothermic, by Proposition 2, since it contains an umbilic of positive index. Remark. A simply connected isothermic surface without umbilics is globally isothermic. To see this fix an orientation of the surface M and cover it with positive isothermal coordinate patches {Vα }α∈I with = ikα α for some real function kα and some holomorphic quadratic differential α on each patch Vα . By holomorphicity kαβ = kkβα is a non-zero real constant on Vα ∩ Vβ and satisfies the cocycle condition kαβ kβγ = kαγ on each non-empty Vα ∩ Vβ ∩ Vγ . Thus there is a 0-cocycle assigning a non-zero constant cα to each Vα such that kαβ = ccβα for each non-empty Vα ∩ Vβ (see, for example, Weil [22], p. 85, Lemma 1). Then k = kcαα defines a global smooth function on M and = cα α defines a global holomorphic quadratic differential on M with = ik on M, as required. Acknowledgement. 
I am grateful to the referee for pointing out that Theorem 1 resonates with a recent characterization of isothermic surfaces [5], namely that they are precisely those surfaces which are not determined up to conformal diffeomorphism by the trace-free part of the second fundamental form (see also [11]). This work was begun at the Max Planck Institute for Mathematics in the Sciences in Leipzig where I was a visitor in 2000 and continued at the Max-Planck-Institut für Mathematik in Bonn in 2001. It is a pleasure to record my thanks to both institutes. I thank Robert Finn for conversations on this work when we were both visitors in Leipzig and Chuu-Lian Terng for pointing me toward the lecture notes of Burstall [4] on isothermic surfaces.
References
1. Alexandrov, A.D.: Uniqueness theorems for surfaces in the large I, II. Am. Math. Soc. Transl. 21(2), 341–388 (1962)
2. Beltrami, E.: Sull’equilibrio delle superficie flessibili ed inestendibili. Memorie dell’Academia delle Scienze dell’Instituto di Bologna 3, 217–265 (1882)
3. Bobenko, A.I., Eitner, U.: Painlevé Equations in the Differential Geometry of Surfaces. Lecture Notes in Mathematics, Vol. 1753, Berlin-Heidelberg-New York: Springer Verlag, 2000
4. Burstall, F.E.: Isothermic surfaces: conformal geometry, Clifford algebras and integrable systems. Lecture Notes, Tsing Hua University, Taipei, 1999
5. Burstall, F.E., Pedit, F., Pinkall, U.: Schwarzian Derivatives and Flows on Surfaces. In: Differential Geometry and Integral Systems, M.A. Guest, R. Miyaoka, Y. Ohnita, eds., Contemp. Math. 308, Providence, RI: Am. Math. Soc., 2002, pp. 39–61
6. Calapso, P.: Sulla superficie a linee di curvatura isoterme. Rendiconti Circ. Mat. Palermo 17, 275–286 (1903)
7. Cayley, A.: On surfaces divisible into squares by their curves of curvature. Proc. London Math. Soc. 4, 8–9, 120–121 (1872)
8. Cieśliński, J., Goldstein, P., Sym, A.: Isothermic surfaces in E3 as soliton surfaces. Phys. Lett. A 205, 37–43 (1995)
9. Gurtin, M., Murdoch, A.I.: A continuum theory of elastic material surfaces. Arch. Rational Mech. Anal. 57, 293–323 (1975)
10. Hopf, H.: Differential geometry in the large. Lecture Notes in Mathematics, Vol. 1000, Berlin-Heidelberg-New York: Springer Verlag, 1990
11. Kamberov, G., Pedit, F., Pinkall, U.: Bonnet pairs and isothermic surfaces. Duke Math. J. 92, 637–644 (1998)
12. Kapouleas, N.: Compact constant mean curvature surfaces in Euclidean three-space. J. Differ. Geom. 33, 683–715 (1990)
13. Laplace, P.S.: Traité de Méchanique Céleste. Supplément au livre X (1806), Vol. IV. Paris: Gauthier-Villars, 1806. Annotated English translation by Nathaniel Bowditch (1839); reprinted by New York: Chelsea Publishing Company, 1996
14. Lawson, B.: Complete minimal surfaces in S3. Ann. of Math. 92, 335–374 (1970)
15. Lecornu, L.: Sur l’équilibre des surfaces flexibles et inextensibles. J. de l’Ecole Polytechnique (Paris) 29, 1–109 (1880)
16. Molzon, R., Man, C-S.: Residual stress in membranes. J. Elasticity 20, 181–202 (1988)
17. Nitsche, J.C.C.: Lectures on minimal surfaces. Vol. 1. Cambridge: Cambridge University Press, 1989
18. Shivakumar, P.N., Man, C-S., Rabkin, S.W.: Modelling of the heart and pericardium at end-diastole. J. Biomech. 22, 201–209 (1989)
19. Smyth, B.: The space of residual stress tensors of a closed membrane. In preparation
20. Strebel, H.: Quadratic Differentials. Ergebnisse der Mathematik und ihrer Grenzgebiete 5, Berlin-Heidelberg-New York: Springer Verlag, 1984
21. Vekua, I.N.: Generalized Analytic Functions. New York: John Wiley, 1965
22. Weil, A.: Variétés kaehleriennes. Paris: Hermann, 1956
23. Young, T.: An essay on the cohesion of fluids. Phil. Trans. Roy. Soc. (London) 1, 65–87 (1805)
Communicated by G.W. Gibbons
Commun. Math. Phys. 250, 95–117 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1120-9
Communications in
Mathematical Physics
Hausdorff Measure of the Singular Set of Landau-Lifshitz Equations with a Nonlocal Term
Shijin Ding1, Boling Guo2
1 Department of Mathematics, South China Normal University, Guangzhou, Guangdong 510631, P.R. China. E-mail: [email protected]
2 Center for Nonlinear Studies, Institute of Applied Physics and Computational Mathematics, P.O. Box 8009, Beijing 100088, P.R. China. E-mail: [email protected]
Received: 22 July 2003 / Accepted: 9 December 2003
Published online: 11 June 2004 – © Springer-Verlag 2004
Abstract: This paper is concerned with the partial regularity of stationary solutions to the Landau-Lifshitz system of the ferromagnetic spin chain with Gilbert damping and a nonlocal term in three dimensions. The Hausdorff measure of the singular set is discussed.
1. Introduction
The steady-state soft ferromagnetic problem is to find minimizers of the energy
$$E = \int_\Omega \Big( \frac{1}{2}|\nabla u|^2 + A\varphi(u) - B\,u\cdot H_0 - u\cdot H \Big),$$
where u = (u1, u2, u3) is the spin vector, |u| = 1, A is the anisotropy constant, n is the symmetric axis of the uniaxial crystal (for example, n = (0, 0, 1)), B is a constant, H0 is the applied field and H is the induced field. The magnetic moment u links the magnetic field and the magnetic induction B by the relation
$$B = \tilde H + 4\pi u, \qquad \operatorname{div} B = 0, \qquad \operatorname{curl} B = j,$$
satisfying the decomposition $\tilde H = H_0 + H$. The applied field H0 arises from the external current j:
$$\operatorname{div} H_0 = 0, \qquad \operatorname{curl} H_0 = j,$$
and the demagnetizing field H satisfies
$$\operatorname{div}(H + 4\pi u) = 0, \qquad \operatorname{curl} H = 0.$$
The author is partially supported by the National Natural Science Foundation of China (Grant No.19971030) and by Guangdong Provincial Natural Science Foundation (Grant No.000671 and No.031495).
The density ϕ(u) of the anisotropy energy $\int_\Omega A\varphi(u)$ favors some special direction of the magnetization such that Aϕ(u) attains its minimum. In this paper, without loss of generality, we ignore the applied energy and the anisotropy energy. That is, we only consider the following energy in Ω = B³(0):
$$E = \frac{1}{2}\int_{B^3}|\nabla u|^2 - \int_{B^3} u\cdot H.$$
The Euler-Lagrange equation for the energy is
$$-\Delta u - u|\nabla u|^2 = H - \langle H, u\rangle u.$$
If the right-hand side is identically 0, then this is the standard harmonic map equation. The dynamical equation of the ferromagnetic spin chain is the following Landau-Lifshitz flow:
$$u_t = -\lambda_1\, u\times\Big(u\times\frac{\delta E(u)}{\delta u}\Big) + \lambda_2\, u\times\frac{\delta E(u)}{\delta u},$$
where λ1 > 0 is the Gilbert damping constant. This equation can be rewritten as (for λ1 = λ2 = 1)
$$\frac{1}{2}u_t - \frac{1}{2}(u\times u_t) = \Delta u + u|\nabla u|^2 + H(u) - \langle H(u), u\rangle u \quad \text{in } B^3\times\mathbb{R}_+, \tag{1.1}$$
where H(u) is the nonlocal term which satisfies the following quasi-steady state Maxwell equations:
$$\operatorname{curl} H(u) = 0 \quad \text{in } \mathcal{D}'(\mathbb{R}^3), \tag{1.2}$$
$$\operatorname{div}(H(u) + u) = 0 \quad \text{in } \mathcal{D}'(\mathbb{R}^3). \tag{1.3}$$
We impose on Eqs. (1.1)–(1.3) the initial condition
$$u(x, 0) = u_0(x) \tag{1.4}$$
and the boundary condition
$$\frac{\partial u}{\partial \nu}\Big|_{\partial B^3} = 0. \tag{1.5}$$
In (1.4), |u0(x)| ≡ 1; in (1.5), ν is the unit outer normal to the boundary of B³; in (1.3), u is understood as its zero extension from B³ to R³; u = (u1, u2, u3) is the spin vector and "×" denotes the vector cross product in R³. We should notice that this extension guarantees u ∈ L∞(R³ × R+) ∩ L∞(0, ∞; W^{−1,∞}(R³)). For the existence of weak solutions to Eq. (1.1) (without nonlocal terms), Zhou, Guo et al. did much work years ago (see [22–25]). We also refer to F. Alouges and A. Soyeur [1]; they proved that if λ2 = 1 and the initial data satisfy u0 ∈ H¹_loc(R³), ∇u0 ∈ L²(R³), |u0| = 1 a.e., then there exists a weak solution. If u0 ∈ H¹(Ω), λ1 > 0, then the Neumann boundary problem admits infinitely many weak solutions. If λ1 → 0, then the equation tends to u_t = u × Δu; but if λ1 → ∞, the equation tends to u_t = Δu + u|∇u|², the harmonic map heat flow.
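The constraint |u| = 1 is compatible with the Landau-Lifshitz flow written before (1.1) because both terms on its right-hand side are cross products with u, hence orthogonal to u. The following minimal Python sketch checks this pointwise. It is only an illustration under stated assumptions: the vector `f` is a hypothetical stand-in for δE/δu at a single point, and the helper names `cross`, `dot` and `ll_rhs` are not from the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def ll_rhs(u, f, lam1=1.0, lam2=1.0):
    """Pointwise right-hand side -lam1 * u x (u x f) + lam2 * u x f.

    Here f is a hypothetical stand-in for the variational derivative
    of the energy at one point; it is an assumption of this sketch.
    """
    uxf = cross(u, f)
    uxuxf = cross(u, uxf)
    return tuple(-lam1 * p + lam2 * q for p, q in zip(uxuxf, uxf))


u = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)   # a unit vector, |u| = 1
f = (0.3, -1.2, 0.7)                    # arbitrary sample field value
u_t = ll_rhs(u, f)

# d/dt |u|^2 = 2 u . u_t, so the flow keeps |u| = 1 when this vanishes.
radial_part = dot(u, u_t)

# With lam1 = 0 only the precession term u x f remains.
undamped = ll_rhs(u, f, lam1=0.0, lam2=1.0)
```

Since d/dt |u|² = 2 u · u_t, a vanishing `radial_part` (up to rounding) is exactly the pointwise statement that the flow preserves |u| = 1.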
For the equation with a nonlocal term ((1.1)–(1.3)), Carbou and Fabrie in [2] obtained the global smooth solution for small initial data and Neumann boundary condition, and the local existence of smooth solutions. Carbou [3] discussed the partial regularity of stationary solution of the steady state equations with a nonlocal term and got some results similar to those of harmonic maps. Another important problem for Eq. (1.1)–(1.3)) is the regularity problem. Up to now, we have understood that the discussions on the regularity problem of Landau-Lifshitz equations is similar to that for the harmonic map heat flow although it is much more difficult. So let us first recall some results on the regularity of the harmonic map heat flow. First of all, Chen and Struwe in [7] established the existence and the partial regularity results for some weak solutions of harmonic map heat flow. Later, Chen and Lin [5] generalized the main results of [7] to the flow with a Dirichlet condition. However, Coron [10] observed that there are infinitely many weak solutions to the flow different from those constructed in [7] or [5]. On the other hand, Riviere’s example [20] showed that a weakly harmonic map from B 3 into S 2 may be everywhere singular on B 3 . Therefore, one suitable admissible space is needed to guarantee the partial regularity. Feldman [14] introduced, motivated by the studies on stationary harmonic maps by Evans [13], a notion of “stationary weak solution” for the flow and proved that a stationary solution must be partially regular. The Hausdorff measure of the singular set was given. He also pointed out that the solution constructed by Chen and Struwe is stationary. Since the conditions of “stationary solution” seem unnatural, Chen, Li and Lin in [6] studied the partial regularity for a class of more general weak solutions than those in [14], i.e., the weak solutions satisfying the monotonicity inequality and energy inequality which include the solutions in [14]. 
They proved the same results for such solutions as in [14] and [7]. The uniqueness was obtained by Freire in [15] for the two-dimensional problem. Chen and Wang [8] extended the conclusions of [6] to the case that the target is a homogeneous Riemannian manifold. However, the study of the partial regularity of the Landau-Lifshitz system is much more difficult than that of the harmonic map heat flow. The essential difficulty lies in the derivation of the monotonicity inequality, because of the bad term u × Δu or u × u_t in Landau-Lifshitz systems. In this respect, the first progress was made by Guo and Hong in 1993 [17], in which they revealed the links between system (1.1) (n = 2, without a nonlocal term and applied field) and the harmonic map heat flow and established the same partial regularity results as those for the harmonic map heat flow [7]. The uniqueness of weak solutions with finite energy for the 2-dimensional problem was obtained by Chen, Ding and Guo in 1998 [4]. Recently, Liu [18] obtained the partial regularity of stationary solutions of the high dimensional equation (1.1) (again without a nonlocal term, and on a domain without boundary) using a method similar to [14]. However, this method is not applicable to the Dirichlet problem. Later, the authors proved in [11] and [12], by somewhat different arguments, similar partial regularity theorems for weak solutions of the initial value problem and the initial-boundary value problem satisfying the energy inequality and the generalized monotonicity inequality, in which the singular set was characterized in a different way from [18]. In [18] and [11], the authors found that they have to express the possible singular set by two sets, both of which have some dimensional Hausdorff measure zero. In 2002, R. Moser [19] studied the partial regularity of stationary solutions to Landau-Lifshitz equations without nonlocal terms in 3 or 4 dimensions. 
He characterized the singular set only by one set due to the smallness of the dimensions. His method
is different from that of [18] and [11] but cannot be applied to higher dimensional problems. As we know, the key point in obtaining the partial regularity for the harmonic map heat flow is the monotonicity inequality. However, it no longer holds for Landau-Lifshitz equations, as pointed out in [18] and [11]. We can only get a generalized monotonicity inequality to give the energy decay. This is why we have the two-set characterization of the singular set in higher dimensions in [18] and [11]. However, as in [19] for low dimensional problems, the singular set can be characterized by one set only, as is usual for the harmonic map heat flow. It is worth noticing that his method is also different from, and much simpler than, the one usually used for the harmonic map heat flow, although it cannot be applied to high dimensional problems. In this paper, we adopt some ideas of Moser's since n = 3. However, to define the stationary solutions, in view of the nonlocal term, we must study the quasi-steady state Maxwell equations in detail, although there were some discussions by Carbou in [3]. Secondly, the nonlocal term also leads to some difficulties in the derivation of the generalized monotonicity inequality and the energy decay when we intend to improve the generalized monotonicity on almost every time layer t to the cylinders and when we prove the compactness claim for the energy decay. Unlike the usual method for the harmonic map heat flow, we do not need the theories of Hodge decompositions, Hardy space, BMO space and Hardy maximal functions. Only the Hélein technique is used. The proof is direct and concise. In physical settings, in view of the Maxwell equations, it is reasonable to restrict ourselves to three dimensional problems. Our main results are Theorem 5.1 and Theorem 5.2 about the partial regularity of stationary solutions to (1.1)–(1.5).
2. Quasi-Static Maxwell Equations
In this section we discuss Maxwell equations (1.2)–(1.3). 
We recall that Carbou [3] proved the following lemma Lemma 2.1. Let u ∈ H 1 (B 3 , S 2 ). Let H = ∇ ∈ L2 (R3 , R3 ) be the solution of curlH = 0, div(H + u) ˜ =0
(2.1)
in D (R3 ) where u˜ is equal to u in B 3 (|u| = 1 a.e in B 3 ) and zero outside B 3 . Then H ∈ ∩1≤p 0 such that
$$\|H\|_{L^p(\mathbb{R}^3)} \le K_p \|u\|_{L^p(B^3)}. \tag{2.3}$$
Proof. This is because |u| ˜ = |u| = 1 a.e in B 3 and u˜ = 0 outside of B 3 , then u˜ ∈ ∞ 3 −1,∞ L (R ) and divu ∈ W . Consider ∈ H 1 (R3 ) such that H = ∇ and = −divu in R3 . We have that ∈ ∩1≤p dσ (y).
On the other hand, since divu ∈ L∞ (0, T ; L2 (B 3 )), from the Lp theory of Riesz transforms (see [21]) we have ∂ 2
= −Ri Rj (−divu), i, j = 1, 2, 3, ∂xi ∂xj where Ri are the Riesz transforms Ri (−divu) =
((3 + 1)/2) (xi − yi )(−divu(y, t)) dy (3+2)/2 π |x − y|(3+1) R2
and ϕ ∈ W 2,2 (R3 ) with
W 2,2 (R3 ) (t) ≤ C divu L2 (R3 ) (t). This result can also be deduced from Coifman-Fefferman [9] for general singular integral operators. We should also note that (2.4)–(2.7) hold for H = ∇ and .
Remark 2.4. If
$$L_\varepsilon(x, t) = \int_{y\in B^3\setminus B(x,\varepsilon)} \frac{x-y}{|x-y|}\, G'(|x-y|)\operatorname{div} u(y, t)\,dy \tag{2.9}$$
and $L(x, t) = \lim_{\varepsilon\to 0} L_\varepsilon(x, t) = -H + v$, then we have from the theory of Riesz transforms (see [21]) that H(x, t) ∈ L∞(0, ∞; L^p(B³)),
$$|v(x, t)| \le K\int_{\partial B^3}\frac{d\sigma(y)}{|x-y|^2} \le K\ln(1-|x|) \in L^\infty(0, \infty; L^p(B^3))$$
for any p > 1 and
$$\|L(x, t)\|_{L^2} \le C\|\nabla u\|_{L^2}.$$
(2.10) L∞ (0, T ; H 1 (B 3 ))
with Remark 2.5. (Definition of weak solution). A function u ∈ ut ∈ L2 (0, T ; L2 (B 3 )) is called a weak solution if (1.1)–(1.3) hold in the sense of distribution. 3. Definition of Stationary Solution Definition 3.1. A weak solution u of (1.1) is called a stationary solution if for any ξ(x, t) ∈ C01 (B 3 × R+ , R3 ), θ (x, t) ∈ C01 (B 3 × R+ , R) with ξ(x, t), θ (x, t), ∇(x,t) ξ, ∇(x,t) θ bounded on B 3 × R+ and ξ, θ ≡ 0 for t = 0 and t ≥ t ∗ > 0 such that x + τ ξ |∂B 3 = I d, t + τ θ |∂B 3 = I d, there holds +∞ +∞ 1 1 ∂uτ )τ =0 + ∂τ+ ( ut − u × ut )( e(uτ ) + |H (uτ )|2 dxdt ≤ 0, 3 3 2 2 ∂τ B B 0 0 (3.1) where uτ (x, t) = u(x + τ ξ(x, t), t + τ θ (x, t)), e(u) = 21 |∇u(x, t)|2 . We want to make the definition applicable in the following. To this aim we first compute the right derivative in the definition. The computation can be done as in [14], for simplicity, we simply compute the derivative at τ = 0 and without loss of generality, we simply assume that u is smooth. It follows from (1.2) and (1.3) that Hτ = H (uτ ) satisfies curlHτ = 0, in D (R3 ),
(3.2)
div(Hτ + uτ ) = 0, in D (R3 ).
(3.3)
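Let the potential α_τ(x, t) be such that H_τ = ∇α_τ; then (3.2) and (3.3) become
$$\Delta\alpha_\tau = -\operatorname{div} u_\tau \quad \text{in } \mathcal{D}'(\mathbb{R}^3). \tag{3.4}$$
Therefore
$$\alpha_\tau = -\int_{B^3} G(|x-y|)\operatorname{div} u_\tau\,dy + \int_{\partial B^3} G(|x-y|)\,\langle u_\tau, n(y)\rangle\,d\sigma(y), \tag{3.5}$$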
where G is the fundamental solution in three dimensions. Denote (xτ , tτ ) =2 (x + +∞ τ (x, t)|2 dxdt, B = − +∞ |∇u τ ξ(x, t), t+τ θ (x, t)), Aτ = 21 0 τ 0 B3 B 3 |Hτ | dxdt. It is simple to compute j
divuτ (x, t) = divu(xτ , tτ ) + τ uixj (xτ , tτ )ξxi + τ uit (xτ , tτ )θxi .
(3.6)
According to [14], we have +∞ 1 +∞ dAτ i i k i i = (uxj uxk ξxj + ut uxj θxj ) − |∇u|2 (divξ + θt ).(3.7) 3 3 dτ τ =0 2 B B 0 0 On the other hand
+∞
Bτ = 0
B3 +∞
=−
0
+∞
|Hτ |2 dxdt
B3
Hτ · uτ dxdt
∇ατ · uτ dxdt =− B3 0 +∞ (divuτ ) · ατ dxdt − = B3
0
+∞ ∂B 3
0
(uτ · n)ατ dσ.
(3.8)
It follows from (3.5), (3.6) and (3.8) that dBτ dB1 dB2 dB3 dB4 = + + + , dτ dτ dτ dτ dτ where
+∞
−B1 =
dt 0
B3 B3
(3.9)
j
[divu(xτ , tτ ) + τ uixj (xτ , tτ )ξxi + τ uit (xτ , tτ )θxi ] · j
·[divu(yτ , tτ ) + τ uiyj (yτ , tτ )ξyi + τ uit (yτ , tτ )θyi ]G(|x − y|)dxdy. (3.10) Hence we have dB1 − dτ τ =0 +∞ j = dt [uixi xk ξ k + uixi t θ + uixj ξxi + uit θxi ][divu(y, t)]G(|x − y|)dxdy B3 B3 0 +∞ j k i dt [uiyi yk ξ+ uyi t θ + uiyj ξyi + uit θyi ][divu(x, t)]G(|x − y|)dxdy + 0
B3 B3
(3.11) It follows from [3] that j [uixi xk ξ k + uixj ξxi ][divu(y, t)]G(|x − y|)dxdy 3 3 B B j + [uiyi yk ξ k + uiyj ξyi ][divu(x, t)]G(|x − y|)dxdy B3 B3 j = −2 [divudivξ − uixj ξxi ][divu(y, t)]G(|x − y|)dxdy 3 3 B B x−y + , ξ(x, t) − ξ(y, t) > divu(x, t)divu(y, t)dxdy. G (|x − y|) < 3 3 |x − y| B B (3.12)
Therefore +∞ dB1 j − = −2 dt [divudivξ − uixj ξxi ][divu(y, t)]G(|x − y|)dxdy 3 3 dτ τ =0 B B 0 +∞ x−y , ξ(x, t) − ξ(y, t) dt G (|x − y|) < + 3 3 |x − y| B B 0 > divu(x, t)divu(y, t)dxdy +∞ dt (uixi t θ + uit θxi )divu(y, t)G(|x − y|)dxdy + B3 B3 0 +∞ dt (uiyi t θ + uit θyi )divu(x, t)G(|x − y|)dxdy. (3.13) + B3 B3
0
The last two terms can be estimated as follows: (uixi t θ + uit θxi )divu(y, t)G(|x − y|)dxdy B3
=−
B3
uit (x, t)θ (x, t)divu(y, t)
xi − yi G (|x − y|)dxdy, |x − y|
(3.14)
B3
(uiyi t θ + uit θyi )divu(x, t)G(|x − y|)dxdy
=−
B3
uit (y, t)θ (y, t)divu(x, t)
xi − yi G (|x − y|)dxdy. |x − y|
(3.15)
So we have from (3.13)–(3.15) that +∞ dB1 j =2 dt [divudivξ − uixj ξxi ][divu(y, t)]G(|x − y|)dxdy 3 3 dτ τ =0 B B 0 +∞ x−y , ξ(x, t) − ξ(y, t) dt G (|x − y|) < − |x − y| B3 B3 0 > divu(x, t)divu(y, t)dxdy +∞ xi − yi +2 dt uit (x, t)θ (x, t)divu(y, t) G (|x − y|)dxdy. |x − y| B3 0 (3.16) Moreover, since B2 , B3 in (3.9) are +∞ B2 = dt x∈B 3
0
y∈∂B 3
j
[divu(xτ , tτ ) + τ uixj ξxi + τ uit θxi ]
(u(y, t), n(y))G(|x − y|)dxdσ (y),
+∞
B3 =
dt 0
x∈∂B 3
y∈B 3
(3.17) j
[divu(yτ , tτ ) + τ uiyj ξyi + τ uit θyi ]
(u(x, t), n(x))G(|x − y|)dydσ (x),
(3.18)
we have +∞ dB2 j = dt [uixi xk ξ k + uixi t θ + uixj ξxi + uit θxi ](u(y, t), n(y)) 3 3 dτ τ =0 x∈B y∈∂B 0 G(|x − y|)dxdσ (y), (3.19)
+∞ dB3 j = dt [uiyi yk ξ k + uiyi t θ + uiyj ξyi + uit θyi ](u(x, t), n(x)) dτ τ =0 x∈∂B 3 y∈B 3 0 G(|x − y|)dydσ (x), (3.20) and similarly to the above we have dB2 dB3 + dτ τ =0 dτ τ =0 +∞ =2 dt x∈B 3
0
j
y∈∂B 3
[−divu(x, t)divξ(x, t) + uixj ξxi + ui xi t +uit θxi ](u(y, t), n(y))G(|x − y|)dxdσ (y). (3.21)
We note that the term B4 is determined only by the boundary data which is indepen 4 dent of τ then dB = 0. We finally get dτ τ =0
+∞ dBτ = −2 dt divu(x, t)divξ(x, t) (x, t)dxdt dτ τ =0 B3 0 +∞ j dt
(x, t)uixj (x, t)ξxi (x, t) +2 3 B 0 +∞ x−y , ξ(x, t) − ξ(y, t) dt < + 3 3 |x − y| B B 0 > G (|x − y|)divu(x, t)divu(y, t)dxdy +∞ x−y , ut (x, t) dt < −2 |x − y| B3 B3 0 > θ (x, t)divu(y, t)G (|x − y|)dxdy,
(3.22)
where
(x, t) = −
B3
G(|x − y|)divu(y, t)dy +
B3
G(|x − y|) < u(y, t), n(y) > dσ (y),
which satisfies (2.7). Combining (3.7) and (3.22), we may now simplify the definition of a stationary weak solution for our problem (1.1)–(1.5) as follows:
Remark 3.1. A stationary solution u of (1.1) satisfies for any ξ(x, t) ∈ C01 (B 3 ×R+ , R3 ) and θ (x, t) ∈ C01 (B 3 × R+ , R) as above, the following inequality: +∞ +∞ 1 1 ( ut − u × ut )(∇u · ξ + ut θ ) + dt (uixj uixk ξxkj + uit uixj θxj )dx 2 B3 2 B3 0 0 1 +∞ dt |∇u|2 (divξ + θt )dx − 2 0 B3 +∞ dt divu(x, t)divξ(x, t) (x, t)dxdt −2 B3 0 +∞ j dt
(x, t)uixj (x, t)ξxi (x, t) +2 3 B 0 +∞ x−y dt < , ξ(x, t) − ξ(y, t) + 3 3 |x − y| B B 0 > G (|x − y|)divu(x, t)divu(y, t)dxdy +∞ x−y dt < , ut (x, t) > θ (x, t)divu(y, t)G (|x − y|)dxdy ≤ 0. −2 |x − y| B3 B3 0 (3.23) In what follows we want to derive some more applicable inequalities and equalities from (3.23) to be used in the future. Now let Mε = B 3 × B 3 \ {|x − y| < ε} for ε > 0 small enough. Since ξ(·, t) is ξ(x, t) − ξ(y, t) smooth, then in (3.22) is uniformly bounded. |x − y| +∞ x−y dt < , ξ(x, t) − ξ(y, t) > G (|x − y|)divu(x, t)divu(y, t)dxdy |x − y| B3 B3 0 +∞ x−y , ξ(x, t) − ξ(y, t) dt < = lim ε→0 0 |x − y| Mε
> G (|x − y|)divu(x, t)divu(y, t)dxdy +∞ x−y dt ξ(x, t) G (|x − y|)divu(x, t)divu(y, t)dxdy = lim 2 ε→0 |x − y| Mε 0 +∞ x−y dt ξ(x, t)divu(x, t) G (|x − y|)divu(y, t)dy dx. = lim 2 ε→0 x∈B 3 y∈B 3 \B(x,ε) |x − y| 0 (3.24)
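Let
$$L_\varepsilon(x, t) = 2\int_{y\in B^3\setminus B(x,\varepsilon)} \frac{x-y}{|x-y|}\, G'(|x-y|)\operatorname{div} u(y, t)\,dy.$$
We have
$$\lim_{\varepsilon\to 0} L_\varepsilon(x, t) = L(x, t) = -H(x, t) + v(x, t)$$
with H(x, t) ∈ L∞(0, ∞; L^p(B³)),
$$|v(x, t)| \le K\int_{\partial B^3}\frac{d\sigma(y)}{|x-y|^2} \le K\ln(1-|x|) \in L^\infty(0, \infty; L^p(B^3))$$
for any p > 1, and there holds
$$\|L(x, t)\|_{L^2} \le C\|\nabla u\|_{L^2}. \tag{3.25}$$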
Then (3.24) can be rewritten as +∞ x−y , ξ(x, t) − ξ(y, t) > G (|x − y|)divu(x, t)divu(y, t)dxdy dt < |x − y| B3 B3 0 ∞ ξ(x, t) · L(x, t)divu(x, t)dxdt. (3.26) = B3
0
Similarly we deduce that −
+∞
dt 0 ∞ =−
< B3
B3
0
B3
x−y , ut (x, t) > θ (x, t)divu(y, t)G (|x − y|)dxdy |x − y|
ut · L(x, t)θ (x, t)dxdt.
(3.27)
Finally, inequality (3.23) can be rewritten as Assumption (S).
+∞
+∞ 1 1 ( ut − u × ut )(∇u · ξ + ut θ ) + dt (uixj uixk ξxkj + uit uixj θxj )dx 2 B3 2 B3 0 0 1 +∞ dt |∇u|2 (divξ + θt )dx − 2 0 B3 +∞ dt divu(x, t)divξ(x, t) (x, t)dxdt −2 B3 0 +∞ j dt
(x, t)uixj (x, t)ξxi (x, t) +2 3 B 0 ∞ L(x, t)[ξ divu(x, t) − ut θ ]dxdt ≤ 0. (3.28) + B3
0
Formula (3.28), Assumption (S), is just the starting point of all the following discussions. From Assumption (S), one easily derive the following lemma as in [14] which will be used to get the generalized monotonicity inequality in the following section. Lemma 3.1. Let u be a stationary weak solution of (1.1)–(1.5) and ξ , θ as before. Then we have (ut − u × ut )∇u · ξ − |∇u|2 divξ + 2uxj uxk ξxkj B 3 ×{t}
j
−2divu(x, t)divξ(x, t) (x, t) + 2 (x, t)uixj ξxi + 2divu(x, t)(ξ · L(x, t))dxdt = 0 (3.29) and B 3 ×{t2 }
−
|∇u| θ dx ≤ 2
B 3 ×{t1 }
t2
t1
B3
|∇u|2 θt − |ut |2 θ − 2(∇u · ut )∇θ
+ 2(ut · L(x, t))θ dxdt.
(3.30)
4. Estimates for Local Energy
In this section, we use (3.29) and (3.30) to derive the generalized monotonicity inequality, which will be used to deduce the energy decay in the next section. We apply a different method from [6]: we use the approach for deriving the monotonicity inequality for harmonic maps rather than the method for the harmonic map heat flow, so that we may avoid a term involving the integral of |u_t|² on the right-hand side of the inequality as stated in [18]. Such an idea was first introduced by Moser [19]. Now we use this idea to derive a generalized monotonicity for the L-L equation with a nonlocal term. We think that such a proof of energy decay, though it is only applicable to low dimensional problems, is much simpler than that usually used for the harmonic map heat flow. Denote B_ρ = B_ρ(0), P_ρ(z) = B_ρ(x) × (t − ρ², t + ρ²) for z = (x, t). In the following, we always denote by u the stationary solution of (1.1)–(1.5).
Lemma 4.1. For any B_s(x0) ⊂ B_{r/2}(0) we have for the stationary solution u that
$$\frac{1}{s}\int_{B_s(x_0)}|\nabla u|^2 \le \frac{1}{r}\int_{B_r}|\nabla u|^2 + Kr\int_{B_r}|u_t|^2. \tag{4.1}$$
Proof. Taking ξ(x) = φ(|x|)x in (3.29), where for any fixed 0 < r, r + h < 1, h > 0,
$$\phi(s) = \begin{cases} 1, & \text{if } s \le r,\\ 1 + \dfrac{r-s}{h}, & \text{if } r \le s \le r+h,\\ 0, & \text{if } r+h \le s < 1, \end{cases}$$
and noting that
$$\xi^k_{x_j}(x) = \delta_{jk}\,\phi(|x|) + \phi'(|x|)\frac{x_j x_k}{|x|}, \qquad \operatorname{div}\xi = 3\phi(|x|) + \phi'(|x|)\,|x|, \tag{4.2}$$
where φ′(s) = 0 if s ≤ r, φ′(s) = −1/h if r < s < r + h and φ′(s) = 0 if r + h < s < 1, we obtain from (3.29) that
(ut − u × ut )(∇u · x)φ(|x|) − |∇u|2 (3φ(|x|) + φ (|x|)|x|)
Br+h
x j xk ) − 2 (x, t)divu(x, t)(3φ(|x|) + φ (|x|)|x|) |x| xj xk j +2 (x, t)uxk (δj k φ(|x|) + φ (|x|) ) + 2divu(x, t)φ(|x|)(x · L(x, t))dx = 0. |x| (4.3)
+2uxj uxk (δj k φ(|x|) + φ (|x|)
Sending h → 0 we may estimate every term in (4.3) as follows: (ut − u × ut )(∇u · x)φ(|x|) = (ut − u × ut )(∇u · x), lim h→0 Br+h
− lim
h→0 Br+h
(4.4)
Br
|∇u| (3φ(|x|) + φ (|x|)|x|) = −3 2
|∇u| + r
|∇u|2 dσ, (4.5)
2
Br
∂Br
lim
h→0 Br+h
2uxj uxk (δj k φ(|x|) + φ (|x|)
x j xk )=2 |x|
|∇u|2 − Br
2 r
|x · ∇u|2 dσ, ∂Br
(4.6) − lim
h→0 Br+h
= −6
2 (x, t)divu(x, t)(3φ(|x|) + φ (|x|)|x|)
(x, t)divu(x, t) + 2r
divu(x, t)dσ,
Br
− lim
h→0 Br+h
= −2 Br
x j xk j 2 (x, t)uxk (δj k φ(|x|) + φ (|x|) ) |x| 2
(x, t)divu(x, t) −
(x · ∇u) · xdσ, r ∂Br
lim
h→0 Br+h
(4.7)
∂Br
(4.8)
2divu(x, t)φ(|x|)(x · L(x, t)) = 2
(x · L(x, t))divu(x, t).
(4.9)
Br
Substituting (4.4)–(4.9) into (4.3), we get
(ut − u × ut )(∇u · x) − |∇u| + r |∇u|2 dσ Br ∂Br 2 2 − |x · ∇u| dσ − 8
(x, t)divu(x, t) + 2r
divu(x, t)dσ r ∂Br Br ∂Br 2 −
(x · ∇u) · xdσ + 2 (x · L(x, t))divu(x, t) = 0. (4.10) r ∂Br Br 2
Br
On the other hand, since 1 |∇u|2 − (ut − u × ut )(∇u · x) + 8 (x, t)divu(x, t) r Br −2(x · L(x, t))divu(x, t)]} 1 =− 2 |∇u|2 − (ut − u × ut )(∇u · x) + 8 (x, t)divu(x, t) r Br −2(x · L(x, t))divu(x, t)] 1 + |∇u|2 − (ut − u × ut )(∇u · x) + 8 (x, t)divu(x, t) r ∂Br −2(x · L(x, t))divu(x, t)] , d dr
(4.11)
we get from (4.10) and (4.11) that d 1 |∇u|2 − (ut − u × ut )(∇u · x) + 8 (x, t)divu(x, t) dr r Br −2(x · L(x, t))divu(x, t)]} 1 = [−(ut − u × ut )(∇u · x) + 8 (x, t)divu(x, t) − 2(x · L(x, t))divu(x, t)] r ∂Br 2 2 2 2 + |x · ∇u| − divu(x, t) + 3 (x · ∇u) · x 3 r r ∂B 3 r 2 |x · ∇u| (x · ∇u)(ut − u × ut ) 3 divu (x · L)divu =2 + − − 3 |x| 2|x| |x| |x| ∂B 3
(x · ∇u) · x + . (4.12) |x|3 Denote ψ(ρ, t) =
1 ρ
Bρ
|∇u| − (ut − u × ut )(∇u · x) + 8 (x, t)divu(x, t) − 2(x · L(x, t))divu(x, t) . 2
(4.13) Then by integrating (4.12) from s to ρ (s < ρ), we obtain
|x · ∇u|2 (x · ∇u)(ut − u × ut ) − 3 |x| 2|x| Bρ \Bs 3 divu (x · L)divu (x · ∇u) · x + − + |x| |x| |x|3
ψ(ρ, t) − ψ(s, t) = 2
(4.14)
or
|x · ∇u|2 (x · ∇u)(ut − u × ut ) − 3 |x| 2|x| Bρ \Bs 3 divu (x · L)divu (x · ∇u) · x , + − + |x| |x| |x|3
ψ(s, t) = ψ(ρ, t) − 2
(4.15)
that is 1 s
|∇u|2 = Bs
1 [(ut − u × ut )(∇u · x) − 8 (x, t)divu(x, t) s Bs +2(x · L(x, t))divu(x, t)] |x · ∇u|2 (x · ∇u)(ut − u × ut ) +ψ(ρ, t) − 2 − 3 |x| 2|x| Bρ \Bs 3 divu (x · L)divu (x · ∇u) · x + − . (4.16) + |x| |x| |x|3
By filling the hole and note that |x| ≤ s if x ∈ Bs , we have |(x · ∇u)(ut − u × ut )| 4| divu| 1 2 |∇u| ≤ ψ(ρ, t) + 2 [ + s Bs 2|x| |x| Bρ +
|(x · L)divu| | ||x · ∇u| ]. + |x| |x|2
(4.17)
Now we can estimate all the terms on the right-hand side of (4.17) and get the generalized monotonicity inequality. It follows from (4.17) and H¨older inequality that 1 1 |∇u|2 ≤ |∇u|2 + Kρ [|ut |2 + | |2 + | |8 + |L|2 ]. (4.18) s Bs ρ Bρ Bρ In this inequality, we should note that (x, t) ∈ L∞ (0, T ; W 1,p ()), L(x, t) ∈ L∞ (0, T ; Lp ()) for any p > 1. Then it follows from Lemma 2.1, (2.7) and (2.10) that 1 1 |∇u|2 ≤ K |∇u|2 + Kρ |ut |2 (4.19) s Bs ρ Bρ Bρ with K independent of t. It is not difficult to get, for any Bs (x0 ) ⊂ Br/2 (0), 1 1 |∇u|2 ≤ |∇u|2 + Kr |ut |2 . s Bs (x0 ) r Br Br
(4.20)
Lemma 4.1 is proved.
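The derivative formulas (4.2) for the radial test field ξ(x) = φ(|x|)x, used at the start of the proof above, can be checked numerically. The sketch below is a minimal finite-difference check in plain Python. It replaces the piecewise linear cutoff of the proof by a smooth profile φ(s) = exp(−s²), which is an assumption made only so that central differences are accurate; the function names and the sample point `x0` are illustrative as well.

```python
import math


def phi(s):
    """Smooth stand-in for the cutoff profile (assumption of this sketch)."""
    return math.exp(-s * s)


def dphi(s):
    """Derivative of the stand-in profile."""
    return -2.0 * s * math.exp(-s * s)


def norm(x):
    return math.sqrt(sum(c * c for c in x))


def xi(x):
    """Radial test field xi(x) = phi(|x|) x."""
    r = norm(x)
    return [phi(r) * c for c in x]


def jacobian_formula(x):
    """Entries d xi^k / d x_j according to (4.2)."""
    r = norm(x)
    return [[(phi(r) if j == k else 0.0) + dphi(r) * x[j] * x[k] / r
             for k in range(3)] for j in range(3)]


def jacobian_fd(x, h=1e-6):
    """Central finite differences for d xi^k / d x_j."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = xi(xp), xi(xm)
        for k in range(3):
            jac[j][k] = (fp[k] - fm[k]) / (2.0 * h)
    return jac


x0 = [0.3, -0.4, 0.5]
jf = jacobian_formula(x0)
jn = jacobian_fd(x0)
max_err = max(abs(jf[j][k] - jn[j][k]) for j in range(3) for k in range(3))

r0 = norm(x0)
div_formula = 3.0 * phi(r0) + dphi(r0) * r0   # div xi from (4.2)
div_fd = sum(jn[j][j] for j in range(3))      # trace of the numerical Jacobian
```

Both `max_err` and the difference between `div_formula` and `div_fd` should be at the level of the finite-difference error, which supports the two identities in (4.2) for this sample profile.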
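In the following we want to prove
Lemma 4.2 (Generalized monotonicity inequality). There exists a constant C > 0 such that for any θ ∈ (0, 1/16] there is ε0 > 0 such that, if u ∈ H¹(P_r(z0), S²) is a stationary solution, we have
$$r^{-3}\int_{P_{r/8}(0)}|\nabla u|^2\,dz \le \delta\varepsilon^2 + C_1 r^{-5}\int_{P_r(0)}|u - (u)_{P_r(0)}|^2\,dz$$
under the condition
$$r^{-3}\int_{P_r(0)}|\nabla u|^2\,dz \le \varepsilon^2 \le \varepsilon_0^2.$$
Proof. It follows from (3.30) that
$$\int_{B^3\times\{t_2\}}|\nabla u|^2\theta\,dx - \int_{B^3\times\{t_1\}}|\nabla u|^2\theta\,dx \le \int_{t_1}^{t_2}\!\!\int_{B^3}\Big(|\nabla u|^2\theta_t - |u_t|^2\theta - 2(\nabla u\cdot u_t)\nabla\theta + 2(u_t\cdot L(x, t))\theta\Big)\,dx\,dt. \tag{4.21}$$
Therefore we have
$$\int_{B^3\times\{t_2\}}|\nabla u|^2\theta\,dx - \int_{B^3\times\{t_1\}}|\nabla u|^2\theta\,dx \le C\int_{t_1}^{t_2}\!\!\int_{B^3}\Big(|\nabla u|^2(\theta_t + |\nabla\theta|^2) + |L(x, t)|^2\theta\Big).$$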
Hence if taking θ (x, t) such that |∇θ |2 , |θt | ≤ rC2 , θ(x, t1 ) = 0 and θ ≡ 1 for x ∈ Br/2 , t ∈ ((− 2r )2 , ( 2r )2 ), we obtain from Lemma 2.1 for almost every t = t2 ∈ (−( 2r )2 , ( 2r )2 ), 1 1 2 2 |∇u| ≤ K1 3 |∇u| dxdt . (4.22) r Br/2 (0)×{t} r Pr (0) Now we prove the following claim. Claim. There exists a constant C > 0 such that for any given 0 < λ < 1, there exists a set ⊂ (−r 2 /2, r 2 /2) with || ≤ λ satisfying 1 Cε2 |ut |2 dx ≤ (4.23) r B r (0)×{t} λ 2
for almost every t ∈ , where 13 Pr |∇u|2 dz ≤ ε2 ≤ ε02 . r Proof. Taking the smooth test function θ (x, t) such that θ ≡ 1 in Pr/2 (0) but θ ≡ 0 outside Pr (0), t1 = −r 2 , t2 = r 2 , and noticing that L(x, t) ∈ L∞ (0, ∞; Lp (B 3 )) for any p ≥ 1, we get from Lemma 2.1 that |ut |2 dz ≤ C0 |∇u|2 (θt + |∇θ |2 ) + θ |L(x, t)|2 dz Pr/2 (0)
Pr (0)
≤ C0 r
1 r3
|∇u|
2
≤ C0 rε 2 .
(4.24)
Pr (0)
If the Claim is false, then for any C > 0, there exists 0 < λ < 1 such that for some set ⊂ (−r 2 /2, r 2 /2) with || > λ there holds: if t ∈ then 1 1 |ut |2 dz ≥ |ut |2 dz ≥ Cε2 . r Pr/2 (0) r Br/2 (0) This contradicts (4.24) by taking C > C0 . The Claim follows. We can even prove by (4.20) (Lemma 4.1) that if t ∈ then 1 Cε 2 sup . ( |∇u|2 dx) ≤ λ Bs (x0 )⊂Br/4 (x0 ) s Bs (x0 )×{t}
(4.25)
In the sequence, we mean t ∈ . In order to prove the lemma, we take ξ ∈ C0∞ (Br/4 (0)) such that 0 ≤ ξ ≤ 1, ξ ≡ 1 in Br/8 (0), |∇ξ | ≤ 16 r to compute ξ(x)|∇u|2 dx = ξ(x)∇u · ∇(u − (u)Pr (0) )dx Br (0)×{t} Br (0)×{t} =− (u − (u)Pr (0) )∇ξ · ∇u Br (0)×{t} − ξ(x)(u − (u)Pr (0) )u. Br (0)×{t}
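Using the equation, we have
\[
\begin{aligned}
\int_{B_r(0)\times\{t\}} \xi(x)|\nabla u|^2\,dx
={}& -\int_{B_r(0)\times\{t\}} (u-(u)_{P_r(0)})\nabla\xi\cdot\nabla u \\
& - \int_{B_r(0)\times\{t\}} \xi(x)(u-(u)_{P_r(0)})\Big(\frac12 u_t - \frac12 u\times u_t\Big) \\
& + \int_{B_r(0)\times\{t\}} \xi(x)(u-(u)_{P_r(0)})\,u|\nabla u|^2 \\
& - \int_{B_r(0)\times\{t\}} \xi(x)(u-(u)_{P_r(0)})\,uH(u).
\end{aligned} \tag{4.26}
\]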
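In the following we estimate every term on the right hand side of (4.26).

Estimate of the first term.
\[
\begin{aligned}
\Big|\int_{B_r(0)\times\{t\}} (u-(u)_{P_r(0)})\nabla\xi\cdot\nabla u\Big|
&\le \frac{16}{r}\,\|\nabla u\|_{L^2(B_r)}\,\|u-(u)_{P_r(0)}\|_{L^2(B_r)} \\
&\le \frac{16}{\sqrt r}\Big(\frac{1}{r}\int_{B_r(0)} |\nabla u|^2\,dx\Big)^{1/2}\|u-(u)_{P_r(0)}\|_{L^2(B_r)} \\
&\le C_0\,\varepsilon\,\frac{1}{\sqrt r}\,\|u-(u)_{P_r(0)}\|_{L^2(B_r)} \quad \text{(by (4.22))} \\
&\le C\varepsilon^2\,\frac{\delta\lambda r}{2} + \frac{C}{\delta\lambda}\,\frac{1}{r^2}\,\|u-(u)_{P_r(0)}\|^2_{L^2(B_r)} .
\end{aligned} \tag{4.27}
\]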
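The last step in (4.27) is an application of Young's inequality. The block below spells it out as a sketch; the parameter name $\eta$ is introduced here for illustration and is simply set to $\delta\lambda$.

```latex
% Young's inequality: ab \le \frac{\eta}{2} a^2 + \frac{1}{2\eta} b^2 for a, b \ge 0, \eta > 0.
% Choose a = \varepsilon r^{1/2}, b = r^{-1}\|u-(u)_{P_r(0)}\|_{L^2(B_r)}, \eta = \delta\lambda :
C_0\,\varepsilon\, r^{-1/2}\,\|u-(u)_{P_r(0)}\|_{L^2(B_r)}
 = C_0\,\big(\varepsilon r^{1/2}\big)\,\big(r^{-1}\|u-(u)_{P_r(0)}\|_{L^2(B_r)}\big)
 \le \frac{C_0\,\delta\lambda}{2}\,\varepsilon^2 r
   + \frac{C_0}{2\,\delta\lambda}\,\frac{1}{r^2}\,\|u-(u)_{P_r(0)}\|^2_{L^2(B_r)} .
```

Up to renaming the constant, this is the right hand side of (4.27).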
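Estimate of the second term. Since $t$ is in the set given by the Claim, we get
\[
\begin{aligned}
\Big|\int_{B_r(0)\times\{t\}} \xi(x)(u-(u)_{P_r(0)})\Big(\frac12 u_t - \frac12 u\times u_t\Big)\Big|
&\le C\,\|u_t\|_{L^2(B_r)}\,\|u-(u)_{P_r(0)}\|_{L^2(B_r)} \\
&\le C\sqrt r\,\Big(\frac{1}{r}\int_{B_r(0)} |u_t|^2\,dx\Big)^{1/2}\|u-(u)_{P_r(0)}\|_{L^2(B_r)} \\
&\le \frac{C\varepsilon\sqrt r}{\sqrt\lambda}\,\|u-(u)_{P_r(0)}\|_{L^2(B_r)} \quad \text{(by (4.23))} \\
&\le C\varepsilon^2\,\frac{\delta r}{2} + \frac{C}{\delta\lambda}\,\|u-(u)_{P_r(0)}\|^2_{L^2(B_r)} .
\end{aligned} \tag{4.28}
\]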
Estimate of the third term. To estimate this term, we use the Hélein method to decompose it. Since $|u| = 1$ a.e., we have
\[
u^i|\nabla u|^2 = \sum_{j=1}^{3} \nabla u^j\,(u^i\nabla u^j - u^j\nabla u^i),
\]
and then
\[
\int_{B_r(0)\times\{t\}} \xi(x)(u-(u)_{P_r(0)})\,u|\nabla u|^2
= \sum_{i=1}^{3}\sum_{j=1}^{3} \int_{B_r(0)\times\{t\}} (u^i-(u^i)_{P_r(0)})\,\nabla u^j\,\big(\xi(x)(u^i\nabla u^j - u^j\nabla u^i)\big).
\]
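The algebraic identity behind this decomposition can be checked directly from the constraint $|u| = 1$. The following short verification is a sketch added for the reader and uses only that constraint.

```latex
% |u|^2 = \sum_j (u^j)^2 = 1 a.e.; differentiating gives
\sum_{j=1}^{3} u^j\,\nabla u^j = \tfrac12\,\nabla |u|^2 = 0 .
% Hence, for each fixed i,
\sum_{j=1}^{3} \nabla u^j\cdot\big(u^i\nabla u^j - u^j\nabla u^i\big)
 = u^i\sum_{j=1}^{3} |\nabla u^j|^2 - \Big(\sum_{j=1}^{3} u^j\nabla u^j\Big)\cdot\nabla u^i
 = u^i\,|\nabla u|^2 .
```

The point of the rewriting is that each vector field $u^i\nabla u^j - u^j\nabla u^i$ has a divergence involving only $u$ and $\Delta u$, which is what allows a divergence estimate to be used on the quadratic term.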
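On the other hand,
\[
\operatorname{div}(u^i\nabla u^j - u^j\nabla u^i) = u^i\Delta u^j - u^j\Delta u^i = u^i w^j - u^j w^i,
\]
where $w = \frac12 u_t + \frac12 u\times u_t$; we obtain
\[
\begin{aligned}
\|\operatorname{div}(\xi(x)(u^i\nabla u^j - u^j\nabla u^i))\|_{L^2(B_r)}
&\le \|\nabla\xi(x)\cdot(u^i\nabla u^j - u^j\nabla u^i)\|_{L^2(B_r)} + \|\xi(x)(u^i w^j - u^j w^i)\|_{L^2(B_r)} \\
&\le \frac{32}{r}\,\|\nabla u\|_{L^2(B_{r/4}(0))} + 2\,\|u_t\|_{L^2(B_{r/4}(0))} \\
&\le \frac{C\varepsilon}{\sqrt{\lambda r}} \quad \text{(by (4.22) and (4.23))}.
\end{aligned} \tag{4.29}
\]
To continue the proof, we recall a lemma by Feldman [14]:

Lemma 4.3 ([14]). Let $f, h \in H^1(\mathbb{R}^n)$ and $g \in L^2(\mathbb{R}^n, \mathbb{R}^n)$ with $\operatorname{div} g \in L^2(\mathbb{R}^n)$ in the distribution sense, and
\[
\sup_{x_0\in\mathbb{R}^n,\ r>0} r^{2-n}\int_{B_r(x_0)} |\nabla h|^2\,dx = A^2 < \infty.
\]
Then
\[
\Big|\int_{\mathbb{R}^n} f\,g\cdot\nabla h\Big| \le CA\big(\|\nabla f\|_{L^2}\|g\|_{L^2} + \|f\|_{L^2}\|\operatorname{div} g\|_{L^2}\big)
\]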
for some universal constant C. Now we apply Lemma 4.3 to f = ui − (ui )Pr (0) , h = uj and g = ξ(x)(ui ∇uj − by extending them properly to the whole space R3 . By (4.22) and (4.29), 2 ξ(x)(u − (u)Pr (0) )u|∇u| = (ui − (ui )Pr (0) )∇uj
uj ∇ui )
Br (0)×{t}
Br (0)×{t} i
(ξ(x)(u ∇uj − uj ∇ui )) Cε ≤ √ ∇f L2 g L2 + f L2 divg L2 λ Cε Cε ≤ √ Cε2 r + √ u − (u)Pr (0) L2 λ λr Cε3 r Cε 2 ≤ (4.30) + 2 u − (u)Pr (0) 2L2 . λ λr
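Finally we estimate the last term on the right hand side of (4.26) as follows. Since $H(u)$ is the gradient of a potential whose Laplacian equals $-\operatorname{div} u$, and $\operatorname{div} u \in L^\infty(0,\infty; H^{-1})$, we know $\|H(u)\|^2_{L^2(B_r)} \le C_0\|\nabla u\|^2_{L^2(B_r)} \le Cr\varepsilon^2$. Hence we derive
\[
\Big|\int_{B_r(0)\times\{t\}} \xi(x)(u-(u)_{P_r(0)})\,uH(u)\Big|
\le C\sqrt r\,\varepsilon\,\|u-(u)_{P_r(0)}\|_{L^2(B_r(x_0))}
\le C\varepsilon^2\,\frac{\delta\lambda r}{2} + \frac{C}{\delta\lambda}\,\|u-(u)_{P_r(0)}\|^2_{L^2(B_r(x_0))} . \tag{4.31}
\]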
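Combining (4.26)–(4.31), we have obtained, for $t$ in the set given by the Claim (of measure less than $\lambda$),
\[
\int_{B_{r/8}(0)\times\{t\}} |\nabla u|^2\,dx \le \Big(\frac{C\varepsilon}{\lambda} + \frac{\delta}{2}\Big)\varepsilon^2 r + \frac{C}{\delta\lambda r^2}\,\|u-(u)_{P_r(0)}\|^2_{L^2(B_r(0))} . \tag{4.32}
\]
Integrating over $(-r^2/2, r^2/2)$ and using (4.22), we get
\[
\int_{-r^2/2}^{r^2/2}\int_{B_{r/8}(0)\times\{t\}} |\nabla u|^2\,dx\,dt \le \Big(\frac{C\varepsilon}{\lambda} + C\lambda + \frac{\delta}{2}\Big)\varepsilon^2 r^3 + \frac{C}{\delta\lambda r^2}\,\|u-(u)_{P_r(0)}\|^2_{L^2(P_r(0))} . \tag{4.33}
\]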
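Hence, choosing $C$, $\lambda$ and $\varepsilon_0$ properly we obtain
\[
r^{-3}\int_{P_{r/8}(0)} |\nabla u|^2\,dz \le \delta\varepsilon^2 + C_1 r^{-5}\int_{P_r(0)} |u-(u)_{P_r(0)}|^2\,dz \tag{4.34}
\]
under the condition
\[
r^{-3}\int_{P_r(0)} |\nabla u|^2\,dz \le \varepsilon^2 \le \varepsilon_0^2 .
\]
Lemma 4.2 is proved.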
5. Energy Decay and Partial Regularity

In this section we derive the energy decay which will be used to prove the regularity.

Lemma 5.1. There exists a constant $C > 0$ such that for any $\theta \in (0, \frac12]$, there is a number $\varepsilon_0 > 0$ such that for any stationary solution $u \in H^1(P_r(z_0); S^2)$ of (1.1)–(1.5) satisfying
\[
r^{-3}\int_{P_r(z_0)} |\nabla u|^2\,dz \le \varepsilon^2 \le \varepsilon_0^2 ,
\]
we have
\[
(\theta r)^{-5}\int_{P_{\theta r}(z_0)} |u-(u)_{P_{\theta r}(z_0)}|^2\,dz \le C\theta^2\varepsilon^2 . \tag{5.1}
\]
Proof. Since the integrals are invariant under the transformation $(x,t) \to (rx + x_0, r^2 t + t_0)$, we may assume $P_r(z_0) = P_1(0)$. We argue by contradiction: if the conclusion is untrue, then for any $C > 0$ we may find $\theta \in (0, 1/2]$ and stationary weak solutions $u_k \in H^1(P_1(0); S^2)$ of the considered problem such that
\[
\int_{P_1(0)} |\nabla u_k|^2\,dz = \varepsilon_k^2 \to 0, \quad k \to \infty, \tag{5.2}
\]
but
\[
\int_{P_\theta(0)} |u_k-(u_k)_{P_\theta(0)}|^2 > C\theta^{3+4}\varepsilon_k^2 . \tag{5.3}
\]
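The scale invariance used at the start of this proof can be checked by a change of variables. The block below is a sketch: the rescaled map $u_r$ is a name introduced here, and it is assumed that $P_r(z_0)$ is a parabolic cylinder of spatial radius $r$ and time extent proportional to $r^2$ in three space dimensions.

```latex
% Rescaled map (notation introduced for this sketch):
u_r(x,t) := u(x_0 + r x,\; t_0 + r^2 t), \qquad
\nabla u_r(x,t) = r\,(\nabla u)(x_0 + r x,\; t_0 + r^2 t).
% Change of variables y = x_0 + r x, s = t_0 + r^2 t, so dy\,ds = r^{3}\cdot r^{2}\,dx\,dt = r^{5}\,dx\,dt :
\int_{P_1(0)} |\nabla u_r|^2\,dx\,dt
 = r^{2}\cdot r^{-5}\int_{P_r(z_0)} |\nabla u|^2\,dy\,ds
 = r^{-3}\int_{P_r(z_0)} |\nabla u|^2\,dz ,
\qquad
\int_{P_1(0)} |u_r-(u_r)_{P_1(0)}|^2\,dx\,dt
 = r^{-5}\int_{P_r(z_0)} |u-(u)_{P_r(z_0)}|^2\,dz .
```

Both normalised quantities appearing in Lemma 5.1 are therefore unchanged by the rescaling, which is why it suffices to work on $P_1(0)$.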
It follows from these assumptions and the lemmas in the above sections that the sequence $\{v_k\} = \{\frac{1}{\varepsilon_k}(u_k-(u_k)_{P_\theta(0)})\}$ is bounded in $H^1(P_{1/2}(0))$, which allows us to assume that there is a map $v \in H^1(P_{1/2}(0); \mathbb{R}^3)$ such that
\[
v_k \to v \ \text{weakly in } H^1(P_{1/2}(0); \mathbb{R}^3); \qquad v_k \to v \ \text{strongly in } L^2(P_{1/2}(0); \mathbb{R}^3)
\]
with
\[
\int_{P_\theta(0)} v\,dz = 0; \qquad \int_{P_{1/2}(0)} |\nabla v|^2\,dz \le 1 .
\]
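It is obvious that we can assume that $u_k \to p$ strongly in $L^2(P_{1/2}(0))$ for some constant map $p \in S^2$. Note that $u_k$ solves the following equation:
\[
\frac12 u_{kt} - \frac12 (u_k\times u_{kt}) = \Delta u_k + u_k|\nabla u_k|^2 + H(u_k) - H(u_k)u_k, \quad \text{in } B^3\times\mathbb{R}_+ , \tag{5.4}
\]
where
\[
\operatorname{curl} H(u_k) = 0, \quad \text{in } \mathcal{D}'(\mathbb{R}^3), \tag{5.5}
\]
\[
\operatorname{div}(H(u_k) + u_k) = 0, \quad \text{in } \mathcal{D}'(\mathbb{R}^3). \tag{5.6}
\]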
Let H (uk ) = ∇ k . Then k = −divuk . Note that uk = εk vk + (uk )Pθ (0) . Then for any φ ∈ C0∞ (P1/2 (0), R3 ) we have by multiplying (5.4) by φ, 1 1 ( ukt − (uk × ukt ))φ 2 2 P1/2 (0) = (uk + uk |∇uk |2 + H (uk ) − H (uk )uk )φ P1/2 (0)
=
(−∇uk · ∇φ + uk |∇uk |2 φ + φ∇ k − uk φ∇ k ).
(5.7)
Note that uk = εk vk + (uk )Pθ (0) . We have from (5.7) that 1 1 ε ( vkt − (uk × vkt ))φ 2 2 P1/2 (0) = (−εk ∇vk · ∇φ + uk |∇uk |2 φ + φ∇ k − uk φ∇ k )
(5.8)
P1/2 (0)
P1/2 (0)
with k = −εk divvk . Divide both sides of (5.8) by εk and send k → ∞ to give 1 1 ( vt − (p × vt ))φ 2 P1/2 (0) 2 = (−∇v · ∇φ + φ∇ ∞ − pφ∇ ∞ ), (5.9) 1 k→∞ εk
since lim
P1/2 (0)
2 P1/2 (0) |∇uk | uk φ
= 0 from (5.2), where ∞ = −divv. Since v ∈
H 1 (P1/2 (0)) we know divv ∈ L2 (P1/2 (0)) then ∞ ∈ W22,2 . Denote H∞ = ∇ ∞ . It follows from (5.9) that v satisfies 1 1 vt − (p × vt ) = v + H∞ − pH∞ . (5.10) 2 2 By a standard estimate and boot-strapping method, we know that v is smooth and there holds |v|2 dz ≤ |v − vPθ (0) |2 dz ≤ C0 θ 3+4 (5.11) Pθ (0)
Pθ (0)
by the Poincaré inequality. Inequality (5.11) contradicts (5.3), by the strong $L^2$-convergence of $v_k$ to $v$, if one chooses $C > C_0$.
Combining these lemmas we get by the iteration method:

Proposition 5.1. There is a constant $C > 0$ such that for any $\theta \in (0, 1/16]$ there exists a number $\varepsilon_0 > 0$ such that if $u \in H^1(P_r(z_0), S^2)$ is a stationary solution of (1.1)–(1.5) satisfying the small energy condition
\[
r^{-3}\int_{P_r(z_0)} |\nabla u|^2\,dz \le \varepsilon^2 \le \varepsilon_0^2 , \tag{5.12}
\]
then
\[
(\theta r)^{-3}\int_{P_{\theta r}(z_0)} |\nabla u|^2\,dz \le C\theta^2\varepsilon^2 . \tag{5.13}
\]
Pθ r (z0 ) 1 1 Proof. Given 0 < θ < 1/16, taking k such that 8k+2 ≤ θ ≤ 8k+1 . Denote (r) = −3 2 −5 2 r Pr (z0 ) |∇u| dz, (r) = r Pr (z0 ) |u − (u)Pr (z0 ) | . It follows from (5.12) and Lemma 4.2 that for any δ > 0, r
( ) ≤ δε2 + C1 (r). (5.14) 8 Similarly we have r r r r
( 2 ) ≤ δ ( ) + C1 ( ) ≤ δ(δε2 + C1 (r)) + C1 ( ). (5.15) 8 8 8 8 Iterating, applying Lemma 5.1 and choosing δ properly, we get
(
r
8
) ≤ δ k+1 ε 2 + C1 k+1
k j =1
δ j (
r r 1 ) + ( k ) ≤ Cδ k+1 ε 2 + C( k )2 ε 2 ≤ Cθ 2 ε 2 , j 8 8 8 (5.16)
where we have used Lemma 5.1 again which shows ( 8rj ) ≤ C( 81j )2 ε 2 . Then we have −3
|∇u| dz ≤ θ
(θr) = (θ r)
2
Pθ r
≤ (8k+1 θ )−3 (
r
8
)−3 k+1
−3
(
r 8k+1
)
−3
(8
k+1 −3
|∇u|2 dz
)
P
|∇u|2 dz ≤ 83 ( P
r 8k+1
r 8k+1
r 8k+1
) ≤ Cθ 2 ε 2 (by (5.16)).
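The iteration behind Proposition 5.1 can be illustrated by the following sketch. It is not the paper's own computation: the names $E(\rho)$ and $D(\rho)$ for the scaled energy and the scaled mean oscillation are introduced here, $\delta = 1/128$ is one convenient choice, $C$ denotes the constant of Lemma 5.1, and the bound on $D$ at the top scale $j = 0$ is an extra assumption, since Lemma 5.1 as stated covers only $\theta \le 1/2$.

```latex
% Notation introduced for this sketch (not the paper's symbols):
E(\rho) := \rho^{-3}\int_{P_\rho(z_0)} |\nabla u|^2\,dz , \qquad
D(\rho) := \rho^{-5}\int_{P_\rho(z_0)} |u-(u)_{P_\rho(z_0)}|^2\,dz .
% One step, in the spirit of (5.14)-(5.15), at scale \rho = r/8^{j}:
E(r/8^{j+1}) \le \delta\,E(r/8^{j}) + C_1\,D(r/8^{j}) , \qquad E(r) \le \varepsilon^2 .
% Unrolling the recursion k+1 times:
E(r/8^{k+1}) \le \delta^{k+1}\varepsilon^2 + C_1\sum_{j=0}^{k} \delta^{\,k-j}\,D(r/8^{j}) .
% Assume D(r/8^{j}) \le C\,8^{-2j}\varepsilon^2 for 0 \le j \le k, and take \delta = 1/128:
\sum_{j=0}^{k} \delta^{\,k-j}\,8^{-2j} = 8^{-2k}\sum_{i=0}^{k} (64\,\delta)^{i} \le 2\cdot 8^{-2k} ,
\qquad
\delta^{k+1} \le 8^{-2(k+1)} .
% Hence
E(r/8^{k+1}) \le \big(1 + 128\,C_1 C\big)\,8^{-2(k+1)}\,\varepsilon^2 .
```

Since $8^{-(k+2)} \le \theta$, one has $8^{-2(k+1)} \le 64\,\theta^2$, so under these assumptions the right hand side is bounded by a constant multiple of $\theta^2\varepsilon^2$, in line with (5.16).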
Remark 5.1. By virtue of (4.25), Proposition 5.1 also holds if $P_{\theta r}(z_0)$ is replaced by $P_s(z_1)$ for any $z_1 \in P_{15r/16}(z_0)$ and $s \in (0, r/16)$.

Theorem 5.1 (Main Theorem 1). There exist constants $\varepsilon_0 > 0$ and $C_{kl} > 0$ such that any stationary solution $u \in H^1(P_r(z_0))$ of (1.1)–(1.4) satisfying the small energy condition (5.10) is smooth in $P_{r/2}(z_0)$ and
\[
\|\partial_t^l\nabla^k u\|_{L^\infty(P_{r/2}(z_0))} \le C_{kl}\, r^{-k-2l}\,\varepsilon, \quad k, l = 0, 1, 2, \dots . \tag{5.17}
\]

Proof. For any given $P_r(z_0) \subset B^3\times(0,T)$, Proposition 5.1 shows that for any $\lambda \in (0,1)$, if $\varepsilon_0$ is small enough, we have
\[
\int_{P_s(z_1)} (|\nabla u|^2 + s^2|u_t|^2)\,dz \le C_1 s^{n+2\lambda} \tag{5.18}
\]
for any $z_1 \in P_{15r/16}(z_0)$ and $s \in (0, r/16)$, with $C_1$ only depending on $\lambda$. In fact, let $\theta r = s$; then $\theta = \frac{s}{r}$. Substituting this $\theta$ into (5.13), we have
\[
s^{-3}\int_{P_s(z_1)} |\nabla u|^2\,dz \le \Big(\frac{s}{r}\Big)^2\varepsilon_0^2 . \tag{5.19}
\]
On the other hand we may control $\int_{P_s(z_1)} s^2|u_t|^2\,dz$ as in Feldman [14]. Equation (5.18) is proved. By a standard method we deduce that $u$ is smooth in $P_{15r/16}(z_0)$ by Morrey's Lemma, as was done by Feldman [14]. The estimates (5.17) can be obtained by a scaling argument.
By a standard method as in Giaquinta's book [16], it is easy to conclude

Theorem 5.2 (Main Theorem 2). Let $u \in H^1(\Omega\times(0,T); S^2)$ be a stationary solution of (1.1)–(1.4). There is an open set $Q \subset \Omega\times(0,T)$ such that $u$ is smooth in $Q$ and
\[
\mathcal{H}^3(\Omega\times(0,T)\setminus Q) = 0, \tag{5.20}
\]
where
\[
\Omega\times(0,T)\setminus Q = \Big\{ z = (x,t) \ \Big|\ \liminf_{r\to 0}\, r^{-3}\int_{P_r(z)} |\nabla u|^2\,dz \ge \varepsilon_0 \Big\}. \tag{5.21}
\]
References
1. Alouges, F., Soyeur, A.: On global weak solutions for Landau-Lifshitz equations: existence and nonuniqueness. Nonlinear Analysis, TMA 18(11), 1084 (1992)
2. Carbou, G., Fabrie, P.: Regular solutions for Landau-Lifshitz equation in a bounded domain. Diff. Int. Eqns. 14, 213–229 (2001)
3. Carbou, G.: Regularity for critical points of a nonlocal energy. Calc. Var. and PDEs 5, 409–433 (1997)
4. Chen, Y., Ding, S., Guo, B.: Partial regularity for two-dimensional Landau-Lifshitz equations. Acta Math. Sinica 14(3), 423–432 (1998)
5. Chen, Y., Lin, F. H.: Evolution of harmonic maps with Dirichlet boundary conditions. Comm. Anal. Geom. 1, 327–346 (1993)
6. Chen, Y., Li, J., Lin, F. H.: Partial Regularity for Weak Heat Flows into Spheres. Comm. Pure Appl. Math. XLVIII, 429–448 (1995)
7. Chen, Y., Struwe, M.: Existence and partial regularity results for the heat flow of harmonic maps. Math. Z. 201, 83–103 (1989)
8. Chen, Y., Wang, C.: Partial regularity for weak flows into Riemannian homogeneous space. Comm. P. D. E. 21(5–6), 735–761 (1996)
9. Coifman, R. R., Fefferman, C. L.: Weighted norm inequality for maximal functions and singular integrals. Studia Math. 51, 241–250 (1974)
10. Coron, J.: Nonuniqueness for the heat flow of harmonic maps. Ann. Inst. H. Poincaré, Non-Linéaire 7, 335–344 (1990)
11. Ding, S., Guo, B.: Partial Regularity for Higher Dimensional Landau-Lifshitz Systems. Preprint 2003
12. Ding, S., Guo, B.: Initial-Boundary Value Problem for Higher Dimensional Landau-Lifshitz Systems. Appl. Anal., to appear
13. Evans, L. C.: Partial regularity for harmonic maps into spheres. Arch. Rat. Mech. Anal. 116, 101–113 (1991)
14. Feldman, M.: Partial Regularity for Harmonic Maps of Evolution into Spheres. Comm. PDE. 19, 761–790 (1994)
15. Freire, A.: Uniqueness for the harmonic heat flow in two dimensions. Calc. Var. PDE. 4, 761–790 (1996)
16. Giaquinta, M.: Introduction to Regularity theory for nonlinear elliptic systems. Basel-Boston-Berlin: Birkhäuser Verlag, 1993
17. Guo, B., Hong, M.: The Landau-Lifshitz equations of the ferromagnetic spin chain and harmonic maps. Calc. Var. PDE. 1, 311–334 (1993)
18. Liu, X.: Partial regularity for the Landau-Lifshitz systems. Calc. Var. 20(2), 153–173 (2003)
19. Moser, R.: Partial regularity for the Landau-Lifshitz equation in small dimensions. MPI Preprint 26, 2002, www.mis.mpg.de/preprints/2002
20. Rivière, T.: Everywhere discontinuous harmonic maps from B^3 into S^2. C. R. Acad. Sci. Paris 314, 719–723 (1992)
21. Stein, E. M.: Singular integrals and differentiability properties of functions. Princeton, NJ: Princeton Univ. Press, 1970
22. Zhou, Y., Guo, B.: Existence of weak solution for boundary problems of systems of ferromagnetic chain. Science in China 27A, 799–811 (1984)
23. Zhou, Y., Guo, B.: Weak solution systems of ferromagnetic chain with several variables. Science in China 30A, 1251–1266 (1987)
24. Zhou, Y., Guo, B., Tan, S.: Existence and uniqueness of smooth solution of system of ferromagnetic chain. Science in China 34A, 257–266 (1991)
25. Zhou, Y., Sun, H., Guo, B.: Multidimensional system of ferromagnetic chain type. Science in China 36A, 1422–1434 (1993)

Communicated by P. Constantin
Commun. Math. Phys. 250, 119–131 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1121-8
Communications in
Mathematical Physics
Singularity Dominated Strong Fluctuations for Some Random Matrix Averages
P. J. Forrester1, J. P. Keating2
1 Department of Mathematics and Statistics, University of Melbourne, Victoria 3010, Australia
2 School of Mathematics, University of Bristol, Bristol, BS8 1TW, UK
Received: 24 July 2003 / Accepted: 26 January 2004 Published online: 11 June 2004 – © Springer-Verlag 2004
Abstract: The circular and Jacobi ensembles of random matrices have their eigenvalue support on the unit circle of the complex plane and the interval (0, 1) of the real line respectively. The averaged value of the modulus of the corresponding characteristic polynomial raised to the power 2µ diverges, for 2µ ≤ −1, at points approaching the eigenvalue support. Using the theory of generalized hypergeometric functions based on Jack polynomials, the functional form of the leading asymptotic behaviour is established rigorously. In the circular ensemble case this confirms a conjecture of Berry and Keating.
1. Introduction

Random matrices from the classical groups ($N\times N$ unitary matrices $U(N)$, $N\times N$ real orthogonal matrices $O(N)$, and $N\times N$ unitary matrices with real quaternion elements embedded as $2N\times 2N$ complex unitary matrices $Sp(N)$) play a special role in the application of random matrix theory to number theory (see e.g. the recent review [21]). Of particular interest in such applications are the random matrix averages of the modulus of the characteristic polynomial raised to some power, $2\mu$ say. Thus in [20] it was shown how knowledge of
\[
\big\langle |\det(zI - U)|^{2\mu} \big\rangle_{U\in U(N)} \tag{1.1}
\]
for $|z| = 1$ (in this case (1.1) is in fact independent of $z$) allows the mean value of the $2\mu$th power of the modulus of the Riemann zeta function on the critical line to be predicted. Knowledge of the analogue of (1.1) for the classical groups $O(N)$ and $Sp(N)$ allows for similar predictions in the case of families of $L$-functions [5, 19, 6].
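As a concrete illustration of (1.1), the block below records the well-known closed form of this average on the unit circle (the formula of Keating and Snaith, quoted here from the general literature rather than from this paper) and specialises it to $N = 1$. It is stated only for $\mathrm{Re}(2\mu) > -1$, where the average is finite.

```latex
% Closed form for |z| = 1, valid for \mathrm{Re}(2\mu) > -1:
\Big\langle |\det(zI - U)|^{2\mu} \Big\rangle_{U\in U(N)}
  = \prod_{j=1}^{N} \frac{\Gamma(j)\,\Gamma(j+2\mu)}{\Gamma(j+\mu)^{2}} .
% N = 1: the average is an integral over a single eigenphase,
\frac{1}{2\pi}\int_{-\pi}^{\pi} |1 - e^{i\theta}|^{2\mu}\,d\theta
  = \frac{\Gamma(1+2\mu)}{\Gamma(1+\mu)^{2}} .
% Sanity check at \mu = 1: |1 - e^{i\theta}|^2 = 2 - 2\cos\theta has mean 2,
% and \Gamma(3)/\Gamma(2)^2 = 2 .
```

The factor $\Gamma(1+2\mu)$ has a pole at $2\mu = -1$, which matches the statement in the abstract that the average diverges for $2\mu \le -1$ as the eigenvalue support is approached.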
Our interest is in the asymptotic behaviour of (1.1) and its analogues as $|z|$ approaches unity. Consider in particular the generalization of (1.1)
\[
\Big\langle \prod_{l=1}^{N} |z - e^{i\theta_l}|^{2\mu} \Big\rangle_{C\beta E_N}, \tag{1.2}
\]
where CβEN (circular β-ensemble) refers to the eigenvalue probability density function proportional to |eiθk − eiθj |β , −π ≤ θl < π. (1.3) 1≤j 0 and general 2µ ≤ −1. Our analysis relies on identifying (1.2) as a special generalized hypergeometric function based on Jack polynomials [25, 17]. We are then able to use known asymptotic properties of the latter to deduce the sought asymptotic behaviour of (1.2). Also studied will be the x → 1+ asymptotic behaviour of N l=1
|x − xl |2µ
JβEN
,
(1.7)
where JβEN (Jacobi β-ensemble) refers to the eigenvalue probability density function proportional to N
xla (1 − xl )b
|xk − xj |β ,
0 < xl < 1.
(1.8)
1≤j 0 depend polynomially on ν; those with m < −1 depend rationally on ν. Therefore there exist limits Lm := lim L(ν) m , ν→0
m ≥ −1.
To arrive at the Virasoro operators given above in the Introduction one use the following realization of the bosonic operators aα,p : p≥0 ∂t∂α,p , . (3.9) aα,p = −1 (−1)p+1 tα,−p−1 , p < 0 Here we use the matrix η for lowering the indices tα,p := ηαβ t β,p . In the next section we will prove that the linear action of the Virasoro operators Lm with m ≥ −1 defines infinitesimal symmetries of the extended Toda hierarchy. 4. Proof of the Main Theorem We first consider the following system of Euler-Lagrange equations:
t˜α,p
δHα,p−1 = 0, δv(x)
t˜α,p
δHα,p−1 =0 δu(x)
p≥0
p≥0
(4.1)
with t˜α,p = t α,p − cα,p () + δ1α δp,0 x
(4.2)
for some formal power series $c^{\alpha,p}(\epsilon)$. Here and henceforth summations with respect to the repeated Greek indices are assumed. We assume that only finitely many of them are nonzero. The series must satisfy the condition of genericity that we shall now formulate. Let us expand the Hamiltonian densities (1.14) in powers of $\epsilon$,
\[
h_{\alpha,p} = \theta_{\alpha,p+1}(v,u) + O(\epsilon). \tag{4.3}
\]
The explicit formulae for the functions $\theta_{\alpha,p}(v,u)$ can be found in [9] (see also the formulae (A.12), (A.13) below). Let us impose the following assumptions for the leading terms of the series $c^{\alpha,p}(\epsilon)$.

1. There exist values $\bar v$, $\bar u$ such that
∂θα,p (v, u) c (0) = 0, ∂v v=v,u= ¯ u¯ p≥0 ∂θα,p (v, u) α,p c (0) = 0, ∂u v=v,u= ¯ u¯
α,p
p≥0
(4.4) and 2. The operator of multiplication by the vector cα,p (0)θα,p−1 (v, ¯ u) ¯ ∇
(4.5)
p≥1
is invertible element of the Frobenius algebra Tv, ¯ u¯ MToda . Under these assumptions the following lemma holds true (cf. Lemma 3.8.2 of [9]). Lemma 4.1. There exists a unique solution to the Euler - Lagrange equations (4.1) in the class of formal series v = v(x, t, ) = a0 () + aα1 ,p1 ;...;αk ,pk ()t α1 ,p1 . . . t αk ,pk |t 1,0 →t 1,0 +x , k>0
u = u(x, t, ) = b0 () +
bα1 ,p1 ;...;αk ,pk ()t α1 ,p1 . . . t αk ,pk |t 1,0 →t 1,0 +x , (4.6)
k>0
where a0 (0) = v, ¯
b0 (0) = u. ¯
(4.7)
Proof. In the leading order in the Euler - Lagrange equations (4.1) become just equations for the critical points of the function p (t α,p − cα,p (0) + xδ1α δ0 )θα,p (v, u). p≥0
The above two assumptions imply that there exists a unique critical point v0 = v0 (x, t, ), u0 = u0 (x, t, )
of the function
p
(t α,p − cα,p () + xδ1α δ0 )θα,p (v, u)
p≥0
in the class of formal power series of the structure similar to (4.6) with v0 (0, 0, 0) = v, ¯ u0 (0, 0, 0) = u. ¯ It is easy to see that these functions can be uniquely extended to a solution to the full Euler - Lagrange equations (4.1). The lemma is proved. Lemma 4.2. The space of solutions of the Euler-Lagrange equation (4.1) is invariant with respect to the flows of the extended Toda hierarchy. Proof. Let us represent the difference operators Aβ,q defined in (1.3) by aβ,q;k k , β = 1, 2, q ≥ 0. Aβ,q =
(4.8)
k≥0
Then by using Lemma 2.3 we obtain δHα,p = aα,p;0 (x), δv(x)
δHα,p = aα,p;1 (x − ) eu(x) . δu(x)
(4.9)
The Lax pair representation (1.2) of the extended Toda hierarchy yields ∂Aα,p ∂Aβ,q − α,p = [Aβ,q , Aα,p ], β,q ∂t ∂t ∂eu = aβ,q;0 (x) − aβ,q;0 (x − ) eu(x) . β,q ∂t These equations together with (4.8) imply δHα,p δHβ,q ∂ ∂ = , α, β, γ = 1, 2; p, q ≥ 0. ∂t β,q δw γ (x) ∂t α,p δw γ (x)
(4.10)
So, under the flows of the extended Toda hierarchy we have ξ δHβ,q ∂ α,p δHα,p−1 δHβ,q−1 ∂ ∂w t˜ t˜α,p α,p−1. = + ∂m ∂t β,q δw γ (x) δw γ (x) ∂w ξ,m δw γ (x) x ∂t p≥0
m≥0
p≥1
(4.11) Here and below we use the following notations for the x-derivatives of the functions u and v ∂ m wξ wξ,m := , ξ = 1, 2, m ≥ 0. ∂x m So w 1,m = v (m) , w2,m = u(m) . By using (4.12) that we will prove in Lemma 4.4 below we know that the r.h.s. of (4.11) can be rewritten as δHβ,q−1 δHβ,q ∂ − ∂v δw γ (x) δw γ (x) which equals zero due to (4.9), (2.6) and the identity
∂L ∂v
= 1. The lemma is proved.
Due to the uniqueness of solutions of the initial value problem for the Euler-Lagrange equation (4.1) and the above theorem, we have

Theorem 4.3. Any solution of Eq. (4.1) gives a solution to the extended Toda hierarchy.

Using quasitriviality it can be shown that the class of solutions of the extended Toda hierarchy that is given by the above theorem forms a dense subset of the class of its analytic solutions $w^\alpha(x, t, \epsilon)$, $\alpha = 1, 2$ (see Theorem 3.6.15 and 3.10.31 in [9]). We call this class of solutions the generic class of solutions of the extended Toda hierarchy, and we will restrict ourselves to it henceforth.

Lemma 4.4. Any solution $(v, u)$ of the Euler-Lagrange equation (4.1) satisfies the equations
\[
\sum_{p\ge 1} \tilde t^{\alpha,p}\,\frac{\partial v}{\partial t^{\alpha,p-1}} + 1 = 0,
\]
p≥1
q
q≥1
q
q≥1
∂u = 0, ∂t α,p−1 ∂v ∂v 2,q−1 ∂v 1,q ˜t 1,q ˜ ˜ + 2t + v = 0, +t ∂t 1,q ∂t 2,q−1 ∂t 2,q−1 ∂u ∂u ∂u t˜1,q 1,q + t˜2,q−1 2,q−1 + 2t˜1,q 2,q−1 + 2 = 0. ∂t ∂t ∂t
t˜α,p
(4.12)
(4.13)
Proof. Equation (4.12) is the result of the application of the operators ∂t∂2,0 and 1 (−1) on the Euler-Lagrange equation (4.1). To prove Equation (4.13), we need to use the following bihamiltonian recursion relation of the extended Toda hierarchy [1]: αγ
U2
δHβ,q−1 1 αγ δHβ,q γ αξ δHγ ,q−1 = (q + µβ + )U1 + R β U1 . δw γ δw γ δw ξ 2
(4.14)
αβ
Here the first Hamiltonian structure U1 is defined in (1.13) and the second one is given by 1 ∂x u(x) 1 U211 = − eu(x) e− ∂x , U212 = v(x) e ∂x − 1 , e e 1 1 ∂x 21 − ∂x 22 U2 = − e− ∂x . v(x), U2 = (4.15) 1−e e The matrices R and µ are defined by (2.4) (see also (3.4)). Then Eq. (4.13) is obtained αγ by applying the operator U2 to both sides of the Euler-Lagrange equation (4.1) and by using the bihamiltonian recursion relation (4.14). The lemma is proved. Let v, u be any solution of the Euler-Lagrange equation (4.1) specified by a choice of the series cα,p () and of the leading term v, ¯ u¯ in (4.4). Due to Theorem 1.1 and Theorem 4.3 this solution can be obtained from a solution v0 , u0 of the dispersionless Toda hierarchy. Denote by τ [0] and τ the corresponding tau functions with the relation (3g−2) 2g−2 Fg (w0 , . . . , w0 ). (4.16) log τ = −2 log τ [0] + g≥1
Note that the genus zero tau function τ [0] is defined up to multiplication by a function [0] α,p [0] of the form e cα,p t with constants cα,p . We now fix this ambiguity by taking log τ [0] =
1 [0] α,p;β,q (v0 , u0 )t˜α,p t˜β,q , 2
(4.17)
where [0] α,p;β,q = α,p;β,q =0 .
(4.18)
The validity of this definition for the tau function of the solution v0 , u0 of the dispersionless extended Toda hierarchy is based on the fact that v0 , u0 satisfy the genus zero Euler-Lagrange equation
t˜α,p
∂h[0] α,p−1 (v0 , u0 ) ∂v0
= 0,
t˜α,p
∂h[0] α,p−1 (v0 , u0 ) ∂u0
=0
(4.19)
with h[0] α,p−1 = hα,p−1 =0 . From this equation it readily follows that [0] α,p;β,q =
∂ 2 log τ [0] , ∂t α,p ∂t β,q
(4.20)
and that the genus zero tau function satisfies the string equation p≥1
t˜α,p
∂ log τ [0] + t˜1,0 t˜2,0 = 0. ∂t α,p−1
(4.21)
The proof of the above statement can be found in [5]. It was proved in [8] that such a tau function also satisfies the genus zero Virasoro constraints given by the Virasoro operators (1.29). The action of these operators on tau functions of the form (4.16) can be expressed as ∂ Lm ( −1 t˜, )τ = 2g−2 Zg τ, ∂t
m ≥ −1.
(4.22)
g≥0
The genus zero Virasoro constraints are given by Z0 = 0. We are to prove below that the tau function of a generic solution to the extended Toda hierarchy satisfies the full genera Virasoro constraints Zg = 0, g ≥ 0. Let us begin with the L−1 and L0 constraints. Lemma 4.5. The tau function (4.16) satisfies the constraints L−1 ( −1 t˜,
∂ ∂ [g] )τ = 0. L0 ( −1 t˜, )τ = c0 () = 2g−2 c0 ∂t ∂t g≥1
with a certain constant c0 ().
(4.23)
2
Proof. Let us apply the operator ∂t σ,k∂ ∂t ρ,l to the l.h.s. of the first equation of (4.23). Using the definition (1.20) for the tau function and Eq. (4.12) we get 2 ∂ ∂ log τ 1 2 σ,k ρ,l t˜α,p α,p−1 + 2 t˜1,0 t˜2,0 ∂t ∂t ∂t p≥1
∂σ,k;ρ,l + δσ,1 δρ,2 + δσ,2 δρ,1 δk,0 δl,0 ∂t α,p−1 p≥1 ∂σ,k;ρ,l m ∂wξ t˜α,p = σ,k−1;ρ,l + σ,k;ρ,l−1 + ∂ ∂w ξ,m x ∂t α,p−1 p≥1 + δσ,1 δρ,2 + δσ,2 δρ,1 δk,0 δl,0 ∂σ,k;ρ,l = σ,k−1;ρ,l + σ,k;ρ,l−1 − + δσ,1 δρ,2 + δσ,2 δρ,1 δk,0 δl,0 ∂v = 0. (4.24) = σ,k−1;ρ,l + σ,k;ρ,l−1 +
t˜α,p
Here the last equality is due to (2.2). On the other hand, the Euler-Lagrange equation (4.1) implies that the l.h.s. of the first formula of (4.23) does not depend on t˜1,0 and t˜2,0 , [g] [g] so there exist constants c−1 , cα,p , α = 1, 2, p ≥ 0, g ≥ 1 such that p≥1
t˜α,p
∂ log τ 1 + 2 t˜1,0 t˜2,0 = α,p−1 ∂t
[g]
2g−2 cα,p−1 t˜α,p +
p≥1,g≥1
[g]
2g−2 c−1 . (4.25)
g≥1
Here the vanishing of the −2 term in the r.h.s. of the above identity is due to (4.21). Thus if we modify the tau function by τ → τ˜ = τ e−
p≥0,g≥1
2g−2 c[g] t˜α,p α,p
,
(4.26)
then we obtain L−1 ( −1 t˜,
∂ [g] 2g−2 c−1 )τ˜ . ) τ˜ = c−1 () τ˜ = ( ∂t
(4.27)
g≥1
[g]
We will prove the vanishing of the constants cα,p , c−1 () in a moment. By using the formula (2.3) and a similar argument as that given in the proof of (4.24), we can prove the validity of the following identity: ∂2 L0 τ˜ = 0. (4.28) ∂t σ,k ∂t ρ,l τ˜ Here
∂ ). ∂t So there exist constants c0 () and bα,p () such that L0 = L0 ( −1 t˜,
L0 τ˜ = bα,p ()t˜α,p + c0 (). τ˜ α,p
(4.29)
By using the commutation relation [L−1 , L0 ] = −L−1 we obtain L−1
bα,p ()t˜α,p + c0 () τ˜ − L0 (c−1 () τ˜ ) = −c−1 ()τ˜ .
(4.30)
The l.h.s. of the above equality reads
bα,p−1 t˜α,p τ˜ +
p≥1
=
bα,p t˜α,p + c0 () L−1 τ˜ − c−1 ()L0 τ˜
bα,p−1 t˜α,p τ˜ .
(4.31)
p≥1
So from (4.30) it follows that bα,p−1 t˜α,p τ˜ = −c−1 () τ˜ ,
(4.32)
p≥1
from which we obtain c−1 () = bα,p () = 0. [g] Now we proceed to proving the vanishing of the constants cα,p . From the above argument we already have the identity L−1 ( −1 t˜,
2g−2 c[g] t˜α,p ∂ α,p = 0. ) τ e− g≥1 ∂t
At the genus one level we have
t˜α,p
p≥1
∂F1 (w, wx ) α,p [1] − t˜ cα,p−1 = 0. ∂t α,p−1
(4.33)
p≥1
Starting from this formula till the end of the proof of the lemma we will redenote for the sake of brevity the arguments w0 = (v0 , u0 ) and w0x = (v0x , u0x ) of the function F1 (w0 , w0x ) by w = (v, u) and wx = (vx , ux ). Since τ [0] satisfies the genus zero Virasoro constraints, we can use the vanishing of the genus zero Virasoro symmetries to obtain, as we did in [8, 9], the following formula:
t˜α,p
p≥1
∂F1 (w, wx ) ∂F1 =− . ∂t α,p−1 ∂v
(4.34)
Thus the identity (4.33) can be rewritten as −
[1] = t˜α,p cα,p−1
p≥1
By applying the operator
p ∂ p≥0 z ∂t α,p
∂F1 . ∂v
to the above identity we get
(4.35)
−
[1] cα,p−1 zp
p≥1
∂ 2 F1 ∂θα,p+1 ∂ 2 F1 2 ∂θα,p+1 + zp ∂ ∂ x γ x ∂v∂w γ ∂wγ ∂wγ ∂v∂w x p≥0
2 ∂ F1 γ σ ρ ∂ 2 F1 ∂ 2 F1 γ σ1 ρ1 σ γσ ρ ρ2 ∂θα (z) = c w + . γ ∂x (cρ wx ) + z γ cρ wx cσ1 ρ2 wx ∂v∂w γ ρ x ∂w σ ∂v∂wx ∂v∂wx 1 (4.36) =
Here the functions θα,p = θα,p (w), cαβγ = cαβγ (w) are given by θα,p zp , θα,p = hα,p−1 |=0 , θα (z) = p≥0
cαβγ =
∂3
1 ( v 2 u + eu ), ∂w α ∂w β ∂w γ 2
(4.37)
and the raising of indices in cαβγ is done by the metric (1.27), i.e. η11 = η22 = 0, η12 = η21 = 1. In the above computation we used the horizontality of the differentials of the functions θα (w; z) w.r.t. the deformed flat connection on MToda , i.e. the equations ∂ 2 θα (z) ξ ∂θα (z) = zcβγ . ∂w β ∂w γ ∂w ξ
(4.38)
So from (4.36) we get ∂ 2 F1 γ σ ρ ∂ 2 F1 γσ ρ c w + γ ∂x (cρ wx ) = 0, ρ x ∂v∂w γ ∂v∂wx 2 ∂ F1 γ σ1 ρ1 σ [1] ρ2 ησ α = −cα,0 , γ cρ wx cσ1 ρ2 wx ∂v∂wx 1 and together with (4.36) these formulae in turn yield [1] p cα,p z = cγ[1],0 ∂ γ θα (z).
(4.39)
(4.40)
p≥0
By differentiating both sides of the above equation w.r.t. x we get γ
0 = cγ[1],0 cξ σ wxσ ∂ ξ θα (z) which implies cγ[1],0 = 0. So from (4.40) we obtain cγ[1],p = 0,
p ≥ 1.
In a completely similar way we can prove that [g]
cγ ,p = 0, The lemma is proved.
p ≥ 0, g ≥ 2.
(4.41)
The following lemma represents the bihamiltonian recursion relation for the extended Toda hierarchy in terms of its tau functions: Lemma 4.6. The following recursion relations hold true for any q ≥ 1 for the tau functions of generic solutions to the extended Toda hierarchy: ∂ log τ ∂ log τ ∂ log τ = R 1,q−1 − 2 ( − 1) 2,q−1 , ∂t 1,q ∂t ∂t ∂ log τ ∂ log τ (q + 1) ( − 1) = R 2,q−1 , 2,q ∂t ∂t q ( − 1)
(4.42) (4.43)
where the operator R is defined by R = v(x)( − 1) + ( + 1)
∂ . ∂t 2,0
(4.44)
Proof. Denote Wβ,q := R
∂ log τ 1 ∂ log τ γ ∂ log τ − (q + µβ + )( − 1) β,q − ( − 1)Rβ γ ,q−1 . (4.45) ∂t β,q−1 2 ∂t ∂t
We are to prove that Wβ,q = 0 for β = 1, 2, q ≥ 1. From Lemma 2.3 and from the bihamiltonian recursion relation (4.14) with α = 2 we obtain by a direct calculation that ( − 1)Wβ,q = 0.
(4.46)
We note that Wβ,q can be expressed as homogeneous differential polynomials in w i,m , e±u , i = 1, 2, m ≥ 0 of degree q + 21 + µβ . Recall that the degree of such differential polynomials is defined in (1.16). So the lemma follows from the above equation (4.46). The lemma is proved. Proof of the Main Theorem. Let us first prove that the following recursion relation holds true: R
Lm τ Lm+1 τ = ( − 1) . τ τ
Here and below Lm = Lm ( −1 t˜,
∂ ). ∂t
From the definition of the operator R we have R
m−1
k!(m − k)!
k=1 m−1
=
k=1
+
1 ∂ 2τ τ ∂t 2,k−1 ∂t 2,m−1−k
k!(m − k)!
∂
∂t
R 2,k−1
∂ log τ ∂ log τ , R ∂t 2,k−1 ∂t 2,m−1−k
∂ log τ ∂ log τ ∂ log τ 2,m−1−k + R 2,m−1−k 2,k−1 ∂t ∂t ∂t m ≥ 1.
(4.47)
So by using the recursion relations (4.42), (4.43) we can deduce the relation (4.47) for m ≥ 1 as follows: Lm τ 1 ∂ 2τ k!(m + 1 − k)! = 2 ( − 1) τ τ ∂t 2,k−1 ∂t 2,m−k k=1
∂ log τ ∂ 2 log τ ∂ log τ 2 − m!( − 1) 2,m−1 2,0 − m! ( − 1) 2,0 ( + 1) 2,m−1 ∂t ∂t ∂t ∂t
(m + k)! ∂ log τ ∂ log τ 1,k 2,k−1 + + t˜ ( − 1) (m + k + 1) t˜ (k − 1)! ∂t 1,m+k+1 ∂t 2,m+k k≥1 ∂ log τ ∂ log τ +2t˜1,k 2,m+k + (m + 1)!( + 1) 2,m ∂t ∂t ∂ log τ (m + k + 1)αm (k)t˜1,k ( − 1) 2,m+k +2 ∂t m
R
k≥0
∂ log τ ∂ 2 log τ + 2 2 αm (0) 2,m−1 2,0 2,m−1 ∂t ∂t ∂t 2 log τ Lm+1 τ ∂ = ( − 1) − 2 m!( − 1) 2,m−1 2,0 τ ∂t ∂t
∂ log τ ∂ log τ −m! ( − 1) 2,0 ( + 1) 2,m−1 ∂t ∂t ∂ log τ ∂ log τ +(m + 1)!( + 1) 2,m − 2(m + 1)! 2,m ∂t ∂t ∂ log τ ∂ 2 log τ +2αm (0)v 2,m−1 + 2αm (0) 2,m−1 2,0 ∂t ∂t ∂t ∂ log τ Lm+1 τ ∂ log τ + m! R 2,m−1 − (m + 1)!( − 1) 2,m = ( − 1) τ ∂t ∂t Lm+1 τ . (4.48) = ( − 1) τ It can be easily checked that the recursion relation (4.47) is also true for m = −1, 0. From (4.23) and the recursion relation (4.47) we know that L1 τ ( − 1) = 0. (4.49) τ +2αm (0)v
Since the function τ [0] satisfies the genus zero Virasoro constraints, it follows from (4.16) that L1 τ (m ) = 2g−2 Wg (w0 , w0 , . . . , w0 g ). (4.50) τ g≥1
Thus from (4.49) we arrive at the formula L1 τ [g] 2g−2 c1 = c1 () = τ g≥1
[g]
for some constants c1 .
(4.51)
On the other hand, by using the the commutation relation (1.28) and Eqs. (4.23) we obtain L−1 (Lm τ ) = −(m + 1)Lm−1 τ,
L0 (Lm τ ) = (c0 () − m)Lm τ.
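The two identities above follow from the commutation relations in one line each. The block below is a sketch of that step; it assumes that (1.28), which is not reproduced in this part of the text, has the usual form $[L_m, L_n] = (m-n)L_{m+n}$ for $m, n \ge -1$ (consistent with $[L_{-1}, L_0] = -L_{-1}$ used earlier), and that (4.23) gives $L_{-1}\tau = 0$ and $L_0\tau = c_0(\epsilon)\tau$.

```latex
% Assumed form of (1.28): [L_m, L_n] = (m-n) L_{m+n}, \quad m, n \ge -1 .
% Using L_{-1}\tau = 0 :
L_{-1}(L_m\tau) = [L_{-1}, L_m]\,\tau + L_m(L_{-1}\tau) = (-1-m)\,L_{m-1}\tau = -(m+1)\,L_{m-1}\tau .
% Using L_0\tau = c_0(\epsilon)\,\tau :
L_0(L_m\tau) = [L_0, L_m]\,\tau + L_m(L_0\tau) = -m\,L_m\tau + c_0(\epsilon)\,L_m\tau = (c_0(\epsilon) - m)\,L_m\tau .
```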
These formulae can be rewritten as Lm τ Lm−1 τ ˆ L−1 = −(m + 1) , τ τ
Lˆ 0
(4.52)
Lm τ Lm τ = −m . (4.53) τ τ 1,0 2 . By putting m = 1 into the t˜
Here Lˆ −1 = L−1 − 12 t˜1,0 t˜2,0 and Lˆ 0 = L0 − 12 above two relations we get L1 τ L1 τ L1 τ ˆ ˆ L−1 = −2 c0 (), L0 =− . τ τ τ
(4.54)
So from (4.51) we have c0 () = c1 () = 0, and we proved the vanishing of L0 τ and L1 τ . By using the recursion relation (4.47) with m = 1 we obtain L2 τ ( − 1) = 0. (4.55) τ Due to the same reason as the one we used to derive (4.51) we have L2 τ [g] 2g−2 c2 = c2 () = τ
(4.56)
g≥1
[g]
for some constants c2 . So by using the second formula in (4.53) we get L2 τ = 0. Now, the Virasoro commutation relation (1.28) implies the validity of all the Virasoro constraints ∂ τ = 0, m ≥ −1. Lm −1 t˜, ∂t It remains to prove that linear action of the Virasoro operators (1.29) defines infinitesimal symmetries of the extended Toda hierarchy. To this end we observe that ∂ ∂ ∂ Lm −1 (t − c()), = Lm −1 t, − a α,p α,p − bα,p t α,p − c, ∂t ∂t ∂t where a α,p , bα,p and c are some series in that may also depend on m. Note that the a series contains only nonnegative powers of . From the already proven Virasoro constraints it follows that, for any generic solution to the extended Toda hierarchy the action of the Virasoro operators on the tau function can be recast as ∂ −1 α,p ∂ α,p Lm t, τ= a + bα,p t + c τ. ∂t ∂t α,p So, for a small parameter δ α,p ∂ ∂ α,p −1 τ + δ Lm t, τ = eδ(bα,p t +c) eδ a ∂t α,p τ + O(δ 2 ). ∂t
(4.57)
The operator $e^{\delta\, a^{\alpha,p}\,\partial/\partial t^{\alpha,p}}$ is nothing but the shift along trajectories of the hierarchy. Note that such a shift leaves invariant the class of generic solutions. Multiplication by the exponential of a linear function in the times for obvious reasons maps a tau function to another one for the same solution to the hierarchy. This proves that (4.57) is again a tau function of some solution of the extended Toda hierarchy. The theorem is proved.

5. The Topological Solution of the Extended Toda Hierarchy

Let us briefly recall the definition of Gromov-Witten invariants and their descendents and the construction of Witten's generating function (physicists call it the free energy of the two-dimensional $CP^1$ topological sigma model). Denote $\phi_1 = 1 \in H^0(CP^1)$, $\phi_2 = \omega \in H^2(CP^1)$ the basis in the cohomology space $H^*(CP^1)$. The 2-form $\omega$ is assumed to be normalized by the condition
\[
\int_{CP^1} \omega = 1.
\]
The free energy of the $CP^1$ topological sigma-model is a function of an infinite number of coupling parameters $t = (t^{1,0}, t^{2,0}, t^{1,1}, t^{2,1}, \dots)$ and of $\epsilon$, defined by the following genus expansion:
\[
\mathcal{F}(t; \epsilon) = \sum_{g\ge 0} \epsilon^{2g-2}\,\mathcal{F}_g(t). \tag{5.1}
\]
The parameter $\epsilon$ is called here the string coupling constant, and the function $\mathcal{F}_g = \mathcal{F}_g(t)$ is called the genus $g$ free energy, which is given by
\[
\mathcal{F}_g = \sum \frac{1}{m!}\, t^{\alpha_1,p_1}\cdots t^{\alpha_m,p_m}\, \langle \tau_{p_1}(\phi_{\alpha_1})\cdots\tau_{p_m}(\phi_{\alpha_m}) \rangle_g , \tag{5.2}
\]
where $\tau_p(\phi_\alpha)$ are the gravitational descendents of the primary fields $\phi_\alpha$, $t^{\alpha,p}$ are the corresponding coupling constants, and the rational numbers $\langle \tau_{p_1}(\phi_{\alpha_1})\cdots\tau_{p_m}(\phi_{\alpha_m}) \rangle_g$ are given by the following intersection numbers on the moduli spaces of $CP^1$-valued stable curves of genus $g$:
\[
\langle \tau_{p_1}(\phi_{\alpha_1})\cdots\tau_{p_m}(\phi_{\alpha_m}) \rangle_g
= \sum_{\beta} q^{\beta} \int_{[\bar M_{g,m}(CP^1,\beta)]^{\mathrm{virt}}} \mathrm{ev}_1^*\phi_{\alpha_1}\wedge\psi_1^{p_1}\wedge\cdots\wedge \mathrm{ev}_m^*\phi_{\alpha_m}\wedge\psi_m^{p_m} . \tag{5.3}
\]
Here $\bar M_{g,m}(CP^1, \beta)$ is the moduli space of stable curves of genus $g$ with $m$ markings of the given degree $\beta \in H_2(CP^1; \mathbb{Z})$, $\mathrm{ev}_i$ is the evaluation map $\mathrm{ev}_i : \bar M_{g,m}(CP^1, \beta) \to CP^1$ corresponding to the $i$-th marking, and $\psi_i$ is the first Chern class of the tautological line bundle over the moduli space corresponding to the $i$-th marking. According to the divisor axiom
[22] the dependence of the Gromov-Witten potential on the indeterminate $q$ appears only through the combination $q\,e^{t^{2,0}}$. We will therefore omit the dependence on $q$ in the formulae.

Let us clarify the relationship between our theory of Virasoro symmetries of the extended Toda hierarchy and the Virasoro conjecture of T. Eguchi, K. Hori, and C.-S. Xiong [12, 13] extended by S. Katz. Denote by
\[ Z_{CP^1}(t;\epsilon) := e^{\mathcal{F}(t;\epsilon)} \]
the partition function of the $CP^1$ topological sigma-model. Here $\mathcal{F}(t;\epsilon)$ is the generating function of the $CP^1$ Gromov-Witten invariants and their descendents defined above. According to the results of A. Givental [19, 20] this partition function satisfies the following infinite sequence of linear Virasoro constraints:
\[ L_{-1} Z_{CP^1} = \frac{\partial}{\partial t^{1,0}} Z_{CP^1}, \qquad L_m Z_{CP^1} = (m+1)! \left( \frac{\partial}{\partial t^{1,m+1}} + 2\kappa_m \frac{\partial}{\partial t^{2,m}} \right) Z_{CP^1}, \quad m \ge 0, \tag{5.4} \]
where
\[ \kappa_m = \sum_{j=1}^{m+1} \frac{1}{j}. \]
Here $L_m$ are just the Virasoro operators defined in (1.29). For the particular case of $CP^1$, (5.4) coincides with the Virasoro constraints conjectured in [12]. However, in their papers Eguchi, Hori and Xiong formulated a somewhat bolder conjecture, which says that all genus $g \ge 1$ Gromov-Witten invariants and their descendents of a smooth projective variety can be uniquely determined by solving recursively the linear system of Virasoro constraints. Although this conjecture seems to be too nice to be true in general (Calabi-Yau manifolds give counterexamples to uniqueness, see [2]), in certain cases it can be justified. Let us give our version of the Eguchi-Hori-Xiong Virasoro constraints programme adapted to computing Gromov-Witten invariants of $CP^1$.

Step 1. Computation of the genus zero Gromov-Witten potential $F_0(t)$. This can be done in terms of the Frobenius manifold $M_{Toda}$ as in [3, 4]. For the reader's convenience we recall the algorithm of computation of $F_0(t)$ in the Appendix below. Introduce the functions
\[ v_0 = v_0(t) := \frac{\partial^2 F_0(t)}{\partial t^{1,0}\,\partial t^{2,0}}, \qquad u_0 = u_0(t) := \frac{\partial^2 F_0(t)}{\partial t^{1,0}\,\partial t^{1,0}}. \]
We will denote by $u_0'$, $v_0'$, $u_0''$, $v_0''$, etc. the derivatives of these functions along $t^{1,0}$.

Step 2. Eguchi-Hori-Xiong $(3g-2)$-ansatz for the higher genus corrections. Look for the $g \ge 1$ terms in the genus expansion (5.1) in the form
\[ F_g(t) = F_g\big(v_0(t), u_0(t), v_0'(t), u_0'(t), \dots, v_0^{(3g-2)}(t), u_0^{(3g-2)}(t)\big), \qquad g \ge 1. \tag{5.5} \]
The ansatz (5.5) was proved by E. Getzler in [18]. In the setup of our theory [9] of integrable systems the (3g − 2)-ansatz is a consequence of a deep result about quasitriviality of tau-symmetric deformations of Poisson pencils (see Theorem 3.9.5 in [9]).
Step 3. Virasoro Conjecture.

Part 1. The series (5.1), where $F_0(t)$ is the genus zero Gromov-Witten potential and the terms of positive genera have the form (5.5), satisfies the Virasoro constraints (5.4). (Clearly the $(3g-2)$-ansatz is of no importance so far.)

Part 2. The degree $2g-2$ homogeneous functions $F_g$ on the $(3g-2)$ jet space of $M_{Toda}$ for all $g \ge 1$ are uniquely determined from the Virasoro constraints (5.4) by solving recursively systems of linear equations.

Part 1 of the Virasoro Conjecture was proved by A. Givental [19, 20]. Part 2 was proved in the much more general framework of an arbitrary semisimple Frobenius manifold in Theorem 3.10.20 of [9]. Combining these results we arrive at

Theorem 5.1.
1. The partition function $Z_{CP^1}(t;\epsilon)$ of the $CP^1$ topological sigma-model is uniquely determined by the Virasoro Conjecture equations.
2. It coincides with the tau function $\tau_{CP^1}$ of a particular solution to the extended Toda hierarchy (1.2),
\[ Z_{CP^1}(t;\epsilon) = \tau_{CP^1}(t;\epsilon), \tag{5.6} \]
specified by the following choice of the shift parameters $c^{\alpha,p}(\epsilon)$ and the initial point $\bar v$, $\bar u$:
\[ c^{\alpha,p}(\epsilon) = \delta^\alpha_1\,\delta^p_1, \qquad \bar v = \bar u = 0. \tag{5.7} \]
The choice (5.7) selects the solution satisfying the string equation
\[ \sum_{p\ge 1} t^{\alpha,p}\,\frac{\partial \mathcal{F}}{\partial t^{\alpha,p-1}} + \frac{1}{\epsilon^2}\,t^{1,0}\,t^{2,0} = \frac{\partial \mathcal{F}}{\partial t^{1,0}}. \tag{5.8} \]
Proof. As was shown in Sect. 3.10.7 of [9], from the validity of the Virasoro constraints for the sum $\Delta\mathcal{F}$ of all $g \ge 1$ corrections to the Gromov-Witten potential represented via the $(3g-2)$-ansatz,
\[ \Delta\mathcal{F} := \sum_{g\ge 1} \epsilon^{2g} F_g\big(v, u; v_x, u_x, \dots, v^{(3g-2)}, u^{(3g-2)}\big), \]
the loop equation below (cf. Example 3.10.27 in [9]) follows: ∂F
−λ ∂F r 1 − 2 (r) ∂x D D ∂u r≥0 r ∂F r−k+1 v − λ ∂F r−k+1 1 r k−1 1 ∂x √ ∂ ∂ + − 2 √ √ x k ∂u(r) x D ∂v (r) D D r≥1 k=1 = D −3 eu 4 eu + (v − λ)2 v ∂r ∂v (r) x
2 ∂ 2 F v−λ v−λ ∂F ∂F ∂xk+1 √ ∂xl+1 √ + + (k) − 4 ∂v (k) ∂v (l) ∂v ∂v (l) D D k,l 2 ∂F ∂F v−λ 1 ∂ F ∂xk+1 √ ∂xl+1 √ + (k) +4 (k) (l) (l) ∂v ∂u ∂v ∂u D D 2 ∂F ∂F ∂ F k+1 1 l+1 1 ∂ + −4 ∂ √ √ x x ∂u(k) ∂u(l) ∂u(k) ∂u(l) D D 2 u
2 ∂F k+1 u 4 e (v − λ)u − [(v − λ) + 4 eu ] v
∂ e − 2 D3 ∂v (k) x k ∂F k+1 u 4 (v − λ) v − [(v − λ)2 + 4 eu ] u
, + (k) ∂x e D3 ∂u
(5.9)
where $D = (v-\lambda)^2 - 4e^u$. Here $\lambda$ is an arbitrary complex parameter. Expanding the loop equation near $\lambda = \infty$ reproduces the Virasoro constraints for $\Delta\mathcal{F}$. The proof of existence and uniqueness of the solution to this equation is based on expanding the loop equation near the zeroes $u_\pm = v \pm 2e^{u/2}$ of $D$ (these are the canonical coordinates on the Frobenius manifold $M_{Toda}$). The uniqueness of the solution to the loop equation proves the first part of the theorem.

To prove the second part we use the following arguments. From [4, 15, 11, 14] we already know that $\tau^{[0]}(t) := F_0(t)$ is the tau function of the dispersionless extended Toda hierarchy. This solution is specified by the shift parameters and the leading term (5.7). The transformation
\[ \log\tau^{[0]} \mapsto \log\tau^{[0]} + \Delta\mathcal{F} =: \epsilon^2 \log\tau \tag{5.10} \]
maps dispersionless tau functions to tau functions of the full hierarchy associated with the semisimple Frobenius manifold $M_{Toda}$. The full hierarchy is uniquely determined for the given semisimple Frobenius manifold by the following properties:
– bihamiltonian structure satisfying certain nondegeneracy conditions;
– tau symmetry that provides existence of a tau function for a generic solution;
– invariance with respect to the linear action of the Virasoro operators $L_m$, $m \ge -1$, on the tau functions.
As we explained in the Introduction, the first two properties are met by the extended Toda hierarchy due to the results of [1]. The last property of Virasoro invariance is established in the present paper. This implies that the full hierarchy associated with the Frobenius manifold $M_{Toda}$ coincides with the extended Toda hierarchy. Therefore the transformation (5.10) maps the tau function $\tau^{[0]}$ of an arbitrary solution of the dispersionless hierarchy to the tau function $\tau$ of a solution of the full extended Toda hierarchy. Taking $\tau^{[0]} = F_0$ one obtains $\tau = Z_{CP^1}$. The theorem is proved.
Clearly the theorem covers Corollaries 1.2 and 1.3 formulated in the Introduction. To illustrate the algorithm of computation of the genus expansion (5.1) for $CP^1$ let us write down the first two terms of the expansion. The formulae become simpler when written in the canonical coordinates
\[ u_\pm = v_0 \pm 2e^{u_0/2}. \]
Genus 1:
\[ F_1 = \frac{1}{24}\log\big(u_+'\,u_-'\big) - \frac{1}{12}\log\frac{u_+ - u_-}{4}. \]
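As a sanity check of the genus 1 formula, the following sketch evaluates $F_1$ in the canonical coordinates and compares it with the same quantity rewritten through $v_0, u_0$. The second expression, $\frac{1}{24}\log\big(v_0'^2 - e^{u_0}u_0'^2\big) - \frac{u_0}{24}$, is not quoted from the paper; it is obtained here by substituting $u_\pm' = v_0' \pm e^{u_0/2}u_0'$ and $u_+ - u_- = 4e^{u_0/2}$. The function names and the sample values are illustrative assumptions.

```python
import math

def canonical(v, u, vx, ux):
    """Canonical coordinates u_pm = v +/- 2 exp(u/2) and their x-derivatives."""
    r = math.exp(u / 2.0)
    u_plus, u_minus = v + 2.0 * r, v - 2.0 * r
    # chain rule: d/dx (2 exp(u/2)) = exp(u/2) * u_x
    du_plus, du_minus = vx + r * ux, vx - r * ux
    return u_plus, u_minus, du_plus, du_minus

def f1_canonical(v, u, vx, ux):
    """F_1 written in the canonical coordinates (genus 1 formula above)."""
    u_plus, u_minus, du_plus, du_minus = canonical(v, u, vx, ux)
    return (math.log(du_plus * du_minus) / 24.0
            - math.log((u_plus - u_minus) / 4.0) / 12.0)

def f1_flat(v, u, vx, ux):
    """The same quantity after substituting the definitions of u_pm (derived here)."""
    return math.log(vx ** 2 - math.exp(u) * ux ** 2) / 24.0 - u / 24.0

# Sample point chosen so that v_x^2 > exp(u) u_x^2 and both logarithms are real.
sample = (0.3, -0.5, 1.2, 0.4)
difference = abs(f1_canonical(*sample) - f1_flat(*sample))
```

At the sample point the two expressions agree up to floating point rounding, which is consistent with the reconstruction of the formula in terms of $u_\pm$.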
Genus 2: 242 F2 =
4 u
+ 3 (u+ − u− ) 5 u + 4
3 u
+ 1 + 4 u + 3 2
3 u
− 1 + 4 u − 3 2 +
1
−
4 u
− 3 (u+ − u− )
u
+ u − − u
− u +
−
5 u − 4
u
+ u
− 4 u + u −
7
u (u+ − u− ) 5 +
7 + u
(u+ − u− ) 5 −
33
2 9
1
u − u u + u u + uI+V (u+ − u− ) 10 + 10 + − 10 + −
4 u + 2
33
2 9
1
1 IV − u + u − u (u − u ) + u u u + − − − 10 − + 10 + − 4 u − 2 10 1 17
1
17
1
1 −
u+ + u− − u + u
4 u+ 5 2 4 u− 5 − 2 + 3 u + u − 3 1 1 11
2
2 − − u+ − u u + u− +
10 (u+ − u− )2 u − u+ (u+ − u− )2 5 + − u
+ − u
− u − u + + + +1 . (5.11) u+ − u− 5 u + 5 u −
Remark 1. In [16], Getzler proved that, under the assumption of the recursion relation (4.43), validity of the Virasoro constraints for τCP 1 is equivalent to (4.42). In his proof a recursion relation of the form (4.47) was used. The recursion (4.43) for τCP 1 was proved in [24] on the subspace {t 1,k = 0, k > 1} of the large phase space of all couplings. Using this result Getzler also proved (4.42) and (4.43) under the assumption of the Virasoro constraints for τCP 1 . He did not consider connections between recursion relations and Virasoro constraints for other solutions to the extended Toda hierarchy. Our Corollary 1.3 shows that the recursion relations (4.42), (4.43) for τCP 1 follow directly from validity of the Virasoro constraints. Remark 2. In [25] A. Okounkov and R. Pandharipande proved that the Gromov - Witten potential of the equivariant GW invariants of CP 1 and their descendents is the logarithm of the tau function of the 2D Toda hierarchy of K. Ueno and K. Takasaki [27]. The tau function of [25] depends on an additional small parameter t. The non-equivariant limit corresponds to t → 0. It would be interesting to derive the Lax representation of the extended Toda lattice by applying a suitable limiting procedure to that of the 2D
Toda hierarchy of [27]. An interesting construction, due to Getzler [17], of a nontrivial reduction of the 2D Toda hierarchy depending on the parameter $t$ (it was called the equivariant Toda lattice) could give a clue to such a limiting procedure. We plan to study the relationships between 2D and extended Toda hierarchies in a subsequent publication.

Appendix. Genus Zero Gromov-Witten Potential of $CP^1$

To compute the genus zero Gromov-Witten potential $F_0(t)$ according to the general scheme of [3, 4] one has to perform the following computations (cf. Sects. 3.6.2, 3.6.10 of [9]).

1. Compute the functions $\theta_{\alpha,p}(v,u)$ as the coefficients of the expansion of the following series:
\[ \theta_1(v,u;z) = \sum_{p\ge 0} \theta_{1,p}(v,u)\,z^p = -2e^{zv}\Big( K_0\big(2ze^{\frac12 u}\big) + (\log z + \gamma)\,I_0\big(2ze^{\frac12 u}\big) \Big) = -2e^{zv} \sum_{m\ge 0} \Big(\gamma - \tfrac12 u + \psi(m+1)\Big)\, e^{mu}\,\frac{z^{2m}}{(m!)^2}, \tag{A.12} \]
\[ \theta_2(v,u;z) = \sum_{p\ge 0} \theta_{2,p}(v,u)\,z^p = z^{-1}e^{zv}\,I_0\big(2ze^{\frac12 u}\big) - z^{-1} = z^{-1}\left( \sum_{m\ge 0} e^{mu+zv}\,\frac{z^{2m}}{(m!)^2} - 1 \right). \tag{A.13} \]
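Here $\gamma$ denotes Euler's constant, $\psi(z)$ stands for the digamma function, and $K_0(x)$ and $I_0(x)$ are modified Bessel functions.

2. Compute the functions $\Omega^{[0]}_{\alpha,p;\beta,q}(v,u)$ as the coefficients of the following generating series:
\[ \sum_{p,q\ge 0} \Omega^{[0]}_{\alpha,p;\beta,q}(v,u)\,z^p w^q = \frac{1}{z+w}\left( \frac{\partial\theta_\alpha(v,u;z)}{\partial v}\,\frac{\partial\theta_\beta(v,u;w)}{\partial u} + \frac{\partial\theta_\alpha(v,u;z)}{\partial u}\,\frac{\partial\theta_\beta(v,u;w)}{\partial v} - \eta_{\alpha\beta} \right). \tag{A.14} \]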
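3. Define the functions $v(t)$, $u(t)$ as the unique solution of the system
\[ v = \sum t^{\beta,q}\,\frac{\partial\theta_{\beta,q}}{\partial u}, \qquad u = \sum t^{\beta,q}\,\frac{\partial\theta_{\beta,q}}{\partial v} \tag{A.15} \]
having the expansion $v(t) = t^{1,0} + o(t)$, $u(t) = t^{2,0} + o(t)$.

4. The genus zero Gromov-Witten potential of $CP^1$ is given by
\[ F_0(t) = \frac12 \sum \tilde t^{\alpha,p}\,\tilde t^{\beta,q}\,\Omega^{[0]}_{\alpha,p;\beta,q}\big(v(t),u(t)\big). \tag{A.16} \]
Here
\[ \tilde t^{\alpha,p} = \begin{cases} t^{1,1} - 1, & \alpha = 1,\ p = 1, \\ t^{\alpha,p}, & \text{otherwise}. \end{cases} \tag{A.17} \]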
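Step 1 of the algorithm above rests on the series expansions of the modified Bessel functions in (A.12) and (A.13). The following sketch checks both identities numerically for real $z > 0$. It assumes the standard integral representations $I_0(x) = \frac{1}{\pi}\int_0^\pi e^{x\cos\theta}\,d\theta$ and $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ together with $\psi(m+1) = -\gamma + \sum_{k=1}^m 1/k$; all function names, the truncation orders and the sample point are illustrative choices, not part of the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant gamma

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def bessel_i0(x):
    """I_0(x) from its integral representation over [0, pi]."""
    return simpson(lambda th: math.exp(x * math.cos(th)), 0.0, math.pi, 2000) / math.pi

def bessel_k0(x):
    """K_0(x) for x > 0; the tail beyond t = 12 is negligible for x of order one."""
    return simpson(lambda t: math.exp(-x * math.cosh(t)), 0.0, 12.0, 6000)

def harmonic(m):
    """H_m = gamma + psi(m + 1)."""
    return sum(1.0 / k for k in range(1, m + 1))

def theta1_closed(v, u, z):
    x = 2.0 * z * math.exp(u / 2.0)
    return -2.0 * math.exp(z * v) * (bessel_k0(x) + (math.log(z) + EULER_GAMMA) * bessel_i0(x))

def theta1_series(v, u, z, terms=40):
    # gamma - u/2 + psi(m+1) equals H_m - u/2
    s = sum((harmonic(m) - u / 2.0) * math.exp(m * u) * z ** (2 * m) / math.factorial(m) ** 2
            for m in range(terms))
    return -2.0 * math.exp(z * v) * s

def theta2_closed(v, u, z):
    x = 2.0 * z * math.exp(u / 2.0)
    return (math.exp(z * v) * bessel_i0(x) - 1.0) / z

def theta2_series(v, u, z, terms=40):
    s = sum(math.exp(m * u + z * v) * z ** (2 * m) / math.factorial(m) ** 2
            for m in range(terms))
    return (s - 1.0) / z

point = (0.4, -0.3, 0.7)  # sample values of (v, u, z)
gap1 = abs(theta1_closed(*point) - theta1_series(*point))
gap2 = abs(theta2_closed(*point) - theta2_series(*point))
```

Both gaps are at the level of the quadrature error. Expanding the series in powers of $z$ then gives the coefficients $\theta_{\alpha,p}(v,u)$; for instance, the $z^0$ terms are $\theta_{1,0} = u$ and $\theta_{2,0} = v$.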
Let us write the first few terms of the expansion of the resulting genus zero Gromov-Witten potential. For simplicity we denote
\[ t_p := t^{1,p}, \qquad s_p := t^{2,p}, \qquad p \ge 0. \]
The potential is expanded in powers of $t_p$, $s_p$ and in $e^{s_0}$. The powers of the exponential separate the intersection numbers on the moduli spaces $\bar M_{0,m}(CP^1,\beta)$ for different $\beta$. Thus, the terms without exponential correspond to $\beta = 0$, etc. In our expansion we collect the intersection numbers up to degree $\beta = 3$ and up to $m \le 4$ punctures. We also restrict the potential onto the subspace of couplings $t^{\alpha,p}$ with $p \le 3$. The maximal number of descendents is restricted to three. F0 =
t02 s0 t 2 t 1 s0 t 3 s1 t 3 t 2 s0 t 4 s2 t 2 t 2 s0 + 0 + 0 + 0 + 0 + 0 1 2! 2! 3! 3! 4! 2! t04 t3 s0 t05 s3 t03 t1 t2 s0 t04 t2 s1 t 4 t1 s2 t03 t1 s1 + + +3 +3 +3 0 +2 4! 5! 3! 4! 4! 3! 2 2 t t0 t3 +es0 1 − 2 t1 + 2 1 − 2 t0 t2 + t1 s0 + t0 s1 − 2 2! 2! t1 2 s0 t 0 2 t1 t3 t 0 2 s2 t 0 2 t 3 s0 + t0 t2 s0 + −2 − t 0 t 1 t 2 s0 + 2! 2! 2! 2! t1 2 s0 2 t0 2 t1 s2 t0 3 s3 t 0 2 t 2 s1 t 0 2 s1 2 +2 + t + + − t s s + 0 1 0 1 (2!)2 2! 2! 2! 3!
2 3 3 1 5 t2 1 3 1 s1 2 +e2s0 − t3 + s2 + + t1 t3 + t3 s0 − t2 s1 + 4 4 4 2! 4 4 4 2 2!
−2
1 1 3 t2 2 s0 3 − t1 s2 + t0 s3 + 2 t0 t2 t3 − − t1 t3 s0 − 2 t0 t3 s1 4 4 2 2! 2 1 t 0 2 t3 2 1 + t2 s0 s1 − t0 t2 s2 + t1 s0 s2 + t0 s1 s2 + 2 t0 t1 t2 t3 + 4 2 (2!)2 2 2 2 2 2 2 t 1 t 2 s0 t0 t2 s1 t 2 s0 t1 t3 s0 − 3 t0 t2 t3 s0 + + + + 2! (2!)2 2! 2! 2 t 0 t 2 s1 t1 s0 s1 2 −2 t0 t1 t3 s1 − t1 t2 s0 s1 + t0 t3 s0 s1 − 2 + 2! 2! t0 s1 3 t0 2 t3 s2 +3 − t 0 t 1 t 2 s2 − 3 + t0 t2 s0 s2 + t0 t1 s1 s2 3! 2! t 0 2 t2 s 3 t0 t 1 s 0 s3 3 t0 2 s 1 s3 t 0 2 s2 2 − + + +2 (2!)2 2! 2 2 2!
+e
3s0
50 t3 2 1 s2 2 1 t 2 2 t3 7 2 14 t3 2 s0 − t3 s2 + − t2 s3 + s1 s3 − 2 − 27 2! 9 3 2! 9 6 2! 9 2!
t 3 s1 2 s 1 2 s2 t 2 2 s2 1 + + t 3 s0 s2 − t 2 s 1 s2 + − t0 t3 s3 2! 2! 3 2! 1 1 t 0 t2 t3 2 t 2 2 t 3 s0 t1 t3 2 s 0 2 t 3 2 s0 2 + t 2 s 0 s3 + t 0 s2 s3 − 4 +5 +4 + 6 2 2! 2! 2! 3 (2!)2 2 2 2 t0 t3 s1 t 3 s0 s 1 t 2 s0 s2 − 3 t2 t3 s0 s1 + +8 + 3 t0 t2 t3 s2 − 2 2! 2! 2! 2 t0 t2 s2 t1 s 0 s 2 2 −2 t1 t3 s0 s2 − 5 t0 t3 s1 s2 + t2 s0 s1 s2 − 2 + 2! 2! t0 s1 s2 2 1 1 t0 t2 2 s3 +3 + − t0 t1 t3 s3 − t1 t2 s0 s3 + t0 t3 s0 s3 2! 2! 2 2 3 1 t 0 s1 2 s 3 t 0 2 s3 2 1 . − t 0 t 2 s1 s3 + t 1 s0 s 1 s3 + 2 + t0 t1 s2 s3 + 2 2 2! 2 (2!)2 +2 t2 t3 s1 − 2
Acknowledgements. The research of B.D. was partially supported by Italian Ministry of Education research grant Cofin2001 “Geometry of Integrable Systems”. The research of Y.Z. was partially supported by the Chinese National Science Fund for Distinguished Young Scholars grant No.10025101 and the Special Funds of Chinese Major Basic Research Project “Nonlinear Sciences”. Y.Z. thanks Abdus Salam International Centre for Theoretical Physics and SISSA where part of the work was done for their hospitality. The authors are grateful to the referee for the suggested improvements of the paper.
References
1. Carlet, G., Dubrovin, B., Zhang, Y.: The extended Toda hierarchy. nlin-SI/0306060, Moscow Math. J., to appear
2. Cox, D., Katz, S.: Mirror symmetry and algebraic geometry. Mathematical Surveys and Monographs, 68. Providence, RI: American Mathematical Society, 1999
3. Dubrovin, B.: Integrable systems in topological field theory. Nucl. Phys. B 379, 627–689 (1992)
4. Dubrovin, B.: Integrable systems and classification of 2-dimensional topological field theories. In: Integrable systems, Proceedings of Luminy 1991 conference dedicated to the memory of J.-L. Verdier, O. Babelon, P. Cartier, Y. Kosmann-Schwarzbach, (eds.), Basel-Boston: Birkhäuser, 1993
5. Dubrovin, B.: Geometry of 2D topological field theories. In: Integrable Systems and Quantum Groups, Montecatini Terme, 1993. M. Francaviglia, S. Greco, (eds.), Springer Lecture Notes in Math. 1620, Berlin-Heidelberg-New York: Springer-Verlag, 1996, pp. 120–348
6. Dubrovin, B., Zhang, Y.: Extended affine Weyl groups and Frobenius manifolds. Compositio Math. 111, 167–219 (1998)
7. Dubrovin, B., Zhang, Y.: Bihamiltonian hierarchies in 2d topological field theory at one loop approximation. Commun. Math. Phys. 198, 311–361 (1998)
8. Dubrovin, B., Zhang, Y.: Frobenius manifolds and Virasoro constraints. Selecta Mathematica, New series 5, 423–466 (1999)
9. Dubrovin, B., Zhang, Y.: Normal forms of integrable PDEs, Frobenius manifolds and Gromov-Witten invariants. math/0108160
10. Eguchi, T., Jinzenji, M., Xiong, C.-S.: Quantum cohomology and free field representation. Nucl. Phys. B 510, 608–622 (1998)
11. Eguchi, T., Hori, K., Yang, S.-K.: Topological σ models and large-N matrix integral. Internat. J. Modern Phys. A 10, 4203–4224 (1995)
12. Eguchi, T., Hori, K., Xiong, C.-S.: Quantum cohomology and Virasoro algebra. Phys. Lett. B 402, 71–80 (1997)
13. Eguchi, T., Xiong, C.-S.: Quantum cohomology at higher genus: topological recursion relations and Virasoro conditions. Adv. Theor. Math. Phys. 2, 219–229 (1998)
14. Eguchi, T., Yamada, Y., Yang, S.-K.: On the Genus Expansion in the Topological String Theory. Rev. Math. Phys. 7, 279 (1995)
15. Eguchi, T., Yang, S.-K.: The Topological CP^1 Model and the Large-N Matrix Integral. Mod. Phys. Lett. A 9, 2893–2902 (1994)
16. Getzler, E.: The Toda conjecture, math.AG/0108108. In: Symplectic geometry and mirror symmetry (Seoul, 2000), River Edge, NJ: World Sci. Publishing, 2001, pp. 51–79
17. Getzler, E.: The equivariant Toda lattice. I, math.AG/0207025; II, math.AG/0209110
18. Getzler, E.: The jet-space of a Frobenius manifold and higher-genus Gromov-Witten invariants. math.AG/0211338
19. Givental, A.: Semisimple Frobenius structures at higher genus. Intern. Math. J. 48, 295–304 (2000). math.AG/0008067
20. Givental, A.: Gromov-Witten invariants and quantization of quadratic Hamiltonians. Moscow Math. J. 1, 1–23 (2001). math.AG/0108100
21. Kontsevich, M.: Intersection theory on the moduli space of curves and the matrix Airy function. Commun. Math. Phys. 147, 1–23 (1992)
22. Kontsevich, M., Manin, Yu.: Gromov-Witten classes, quantum cohomology and enumerative geometry. Commun. Math. Phys. 164, 525–562 (1994)
23. Okounkov, A.: Generating functions for intersection numbers on moduli spaces of curves. math.AG/0101201
24. Okounkov, A., Pandharipande, R.: Gromov-Witten theory, Hurwitz numbers, and matrix models I. math.AG/0101147
25. Okounkov, A., Pandharipande, R.: The equivariant Gromov-Witten theory of P^1. math.AG/0207233
26. Toda, M.: Wave propagation in anharmonic lattices. J. Phys. Soc. Japan 23, 501–506 (1967)
27. Ueno, K., Takasaki, K.: Toda lattice hierarchy. In: Group representations and systems of differential equations (Tokyo, 1982), Adv. Stud. Pure Math. 4, Amsterdam: North Holland, 1984, pp. 1–95
28. Witten, E.: Two-dimensional gravity and intersection theory on moduli space. Surv. in Diff. Geom. 1, 243–310 (1991)
29. Witten, E.: On the Kontsevich model and other models of two-dimensional gravity. Preprint IASSNS-HEP-91/24
30. Zhang, Y.: On the CP^1 topological sigma model and the Toda lattice hierarchy. J. Geom. Phys. 40, 215–232 (2002)

Communicated by L. Takhtajan
Commun. Math. Phys. 250, 195–213 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1149-9
Communications in
Mathematical Physics
Hochschild- and Cyclic-Homology of LCNT-Spaces Christian-Oliver Ewald University of Kaiserslautern, Department of Mathematics, 67653 Kaiserslautern, Germany Received: 19 August 2003 / Accepted: 24 February 2004 Published online: 5 August 2004 – © Springer-Verlag 2004
Abstract: We define a class of topological spaces (LCNT spaces) which come together with a nuclear Fréchet algebra. Like the algebra of smooth functions on a manifold, this algebra carries the differential structure of the object. We compute the Hochschild homology of this algebra and show that it is isomorphic to the space of differential forms. This is a generalization of a result obtained by Alain Connes in the framework of smooth manifolds.

1. Introduction

In this paper we present a result about the Hochschild- and cyclic homology groups of certain algebras corresponding to certain singular spaces, which we will call locally coned topological spaces or in short “lcnt-spaces”. These algebras represent in some sense the “smooth” functions on these spaces. The results are a slightly modified version of similar results about locally coned stratifolds (see [Ew, Kr1 and Kr2]) which were part of my PhD thesis. Some of the results have already been included in a sectional talk which I gave at the DMV-Jahrestagung 2000 in Dresden. The remaining results were obtained within the years 2000 and 2001 when I was a member of the Topology group at the University of Heidelberg.

Theorem 1. Let $(X, C^\infty_X)$ be a finite dimensional lcnt-space. Then the antisymmetrization map
\[ \Omega^*_{C^\infty_X} \to HH_*(C^\infty_X) \]
is a topological isomorphism from the space of differential forms over the algebra $C^\infty_X$ to its Hochschild homology, where differential forms are defined by Definition 3.4.

Theorem 2. If in addition to the conditions above $X$ has finite dimensional homology groups, then for all $n \in \mathbb{N}$ there is a natural topological isomorphism
\[ HC_n(C^\infty_X) \cong \Omega^n_{C^\infty_X}/d\,\Omega^{n-1}_{C^\infty_X} \oplus H^{n-2}(X,\mathbb{C}) \oplus H^{n-4}(X,\mathbb{C}) \oplus H^{n-6}(X,\mathbb{C}) \oplus \cdots \]
These results are generalizations of results obtained by Alain Connes about the Hochschild- and cyclic-homology of the algebra of smooth functions on a compact manifold (see [Co]). Related results in the context of Whitney-stratified spaces have appeared in [Bra/Pf]. The result in this paper is in a quite general form, but can for example be applied to spaces which are locally coned stratifolds (see [Ew]). It is however not clear in what sense lcnt-spaces are stratified spaces and whether Whitney stratified spaces provide examples of lcnt-spaces. The latter should be true under some mild extra conditions for the class of so-called pseudo-manifolds, but we won't give a proof of this. Besides being a generalization of a famous result (Connes') the result was intended to be a first step toward a generalization of the concept of non-commutative geometry to non-smooth (“singular”) spaces, which may or may not have applications in theoretical physics. I thank Prof. Dr. Matthias Kreck and Dr. Anna Grinberg, who have both actively contributed to this work.

2. LCNT-Spaces

Given a topological space $X$, we consider the algebra $C(X)$ of continuous complex valued functions on $X$ together with the topology of uniform convergence on compact subsets of $X$.

Definition 2.1. A sub-algebra $A$ of $C(X)$ is called ample if the following two conditions hold:
1. $A$ is dense in $C(X)$.
2. For any $x \in X$ and any open neighborhood $V$ of $x$, there exists an element $a \in A$ such that $\mathrm{supp}(a) \subset V$ and $a(x) \neq 0$.

In case the space $X$ is paracompact it is well known that the second condition above implies that for any open covering of $X$ there exists a subordinated partition of unity. Any topological space considered in this paper is assumed to be paracompact and to have a countable base of topology. Also, since we work in the categories of topological vector-spaces and algebras, homomorphism always means continuous homomorphism, and an isomorphism is a continuous homomorphism which has a continuous inverse.

Definition 2.2. A pair $(X, C^\infty_X)$ consisting of a topological space and a unital, associative algebra $C^\infty_X$ such that
1. $C^\infty_X$ is a topological algebra and is nuclear Fréchet$^1$,
2. $C^\infty_X$ is an ample sub-algebra of $C(X)$ and the inclusion $i : C^\infty_X \to C(X)$ is continuous,
is called a nuclear topological space or in short “nt-space”.

Now let $(X, C^\infty_X)$ be an nt-space, $U \subset X$ an open subset, $x \in X$ and $f \in C(U)$. Then we say that $f$ is of $C^\infty$-type in $x$ if there exist $\tilde f \in C^\infty_X$ and a neighborhood $U_x$ of $x$ in $U$ such that $f|_{U_x} = \tilde f|_{U_x}$. We say $f$ is of $C^\infty$-type on $U$ if $f$ is of $C^\infty$-type in $x$ for all $x \in U$.

Definition 2.3. Let $(X, C^\infty_X)$ be a nuclear topological space and $U \subset X$ an open subset. Then we define
\[ C^\infty(U) := \{ f \in C(U) \mid f \text{ is of } C^\infty\text{-type on } U \}. \]

1 See [Tr] p. 85 for the definition of “Fréchet space”, p. 510 for the definition of “nuclear”.
The existence of partitions of unity in $C^\infty_X$ implies that $C^\infty_X = C^\infty(X)$, and for any open $U \subset X$ the topology on $C^\infty_X$ induces a locally convex topology on $C^\infty(U)$ in the canonical way.
We will use the algebras $C^\infty(U)$ corresponding to the nt-space $(X, C^\infty_X)$ to prescribe the local structure of $(X, C^\infty_X)$. The local structure within a locally coned nuclear topological space is that of a cone. The construction is as follows. Let
\[ cX = \frac{X \times [0,1)}{X \times \{0\}} \]
denote the open cone over $X$. We consider $C^\infty(X) \hat\otimes C^\infty([0,1))$ as a sub-algebra of $C(X \times [0,1))$. We define the jet-map:
\[ J : C^\infty(X) \hat\otimes C^\infty([0,1)) \to C^\infty(X)[[z]], \tag{1} \]
\[ f(x,t) \mapsto \sum_{i=0}^{\infty} \frac{1}{i!}\,\frac{\partial^i f}{\partial t^i}(x,0)\,z^i. \tag{2} \]
In Eq. (1) $C^\infty(X)[[z]]$ denotes the power series algebra with coefficients in $C^\infty(X)$. We consider this object as a locally convex topological algebra. The topology is given by the product topology. It follows from the Leibniz rule that $J$ is a continuous algebra homomorphism. Clearly $\mathbb{C}[[z]]$ is a closed subspace of $C^\infty(X)[[z]]$. We define
\[ C^\infty_{cX} := J^{-1}(\mathbb{C}[[z]]). \tag{3} \]
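The multiplicativity of the jet-map can be illustrated at a fixed point $x$, where a function of $t$ enters $J$ only through its derivatives at $t = 0$. The sketch below is a toy model and not part of the paper's construction: a function is represented by a finite list of its derivatives at $0$ (an assumption made for illustration), the product is differentiated by the Leibniz rule, and the result is compared with the Cauchy product of the two power series. The names `jet`, `product_derivs` and `cauchy_product` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def jet(derivs):
    """Coefficients f^(i)(0) / i! of the truncated power series J(f)."""
    return [Fraction(d, factorial(i)) for i, d in enumerate(derivs)]

def product_derivs(df, dg):
    """Derivatives of f*g at t = 0 from those of f and g (Leibniz rule)."""
    n = min(len(df), len(dg))
    return [sum(comb(i, k) * df[k] * dg[i - k] for k in range(i + 1))
            for i in range(n)]

def cauchy_product(a, b):
    """Product of two truncated power series in z."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[i - k] for k in range(i + 1)) for i in range(n)]

# Derivatives at t = 0 of two sample functions (arbitrary integers).
df = [1, 2, 0, -3, 5]
dg = [2, -1, 4, 1, 0]

lhs = jet(product_derivs(df, dg))        # J(f g)
rhs = cauchy_product(jet(df), jet(dg))   # J(f) J(g)
```

Here `lhs == rhs` holds exactly, since $\binom{i}{k}/i! = 1/(k!\,(i-k)!)$. In the same toy model, membership in $J^{-1}(\mathbb{C}[[z]])$ means that all coefficients of $J(f)$ are independent of the point $x$.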
The algebra $C^\infty_{cX}$ is a closed sub-algebra of $C^\infty(X) \hat\otimes C^\infty([0,1))$ and hence a nuclear Fréchet algebra. We consider it as a subspace of $C(cX) = \{ f : X \times [0,1) \to \mathbb{C} \mid f(\cdot,0) = \text{const.} \}$. It is not hard to show that $C^\infty_{cX}$ is an ample sub-algebra of $C(cX)$, so that we are able to define the cone over the nt-space $(X, C^\infty_X)$ as
\[ c(X, C^\infty_X) := (cX, C^\infty_{cX}), \tag{4} \]
and a locally coned nuclear topological space (in short lcnt-space) as an nt-space which has locally the structure of a product of an open ball and a cone over another lcnt-space. More precisely:

Definition 2.4. A nuclear topological space $(X, C^\infty_X)$ is called a locally coned nuclear topological space (lcnt-space) if for any $x \in X$ there exist an open neighborhood $U_x \subset X$ of $x$, $k \in \mathbb{N}$ and a homeomorphism $\phi : U_x \to B^k \times cL_x$, where $B^k \subset \mathbb{R}^k$ denotes the open unit ball and $cL_x$ the cone over another lcnt-space $L_x$, such that the induced map
\[ \phi^* : C^\infty(B^k) \hat\otimes C^\infty_{cL_x} \to C^\infty(U_x) \tag{5} \]
is an isomorphism.

Clearly, if $(X, C^\infty_X)$ is an lcnt-space, then $c(X, C^\infty_X)$ is also an lcnt-space. Hence the class of lcnt-spaces is closed under the cone construction. Also any smooth manifold $M$ together with the algebra of smooth functions on $M$ defines an lcnt-space $(M, C^\infty(M))$. So the class of lcnt-spaces contains at least all spaces which are locally iterated cones
over smooth manifolds. The concept of lcnt-spaces is related to that of pseudo-manifolds$^2$, but emphasizes the analytic structure of the underlying space. To use inductive methods, we have to define some kind of (local) dimension. For this let $x \in X$ be an arbitrary point. We consider $\mathbb{C}$ as a module over $C^\infty_X$ via the evaluation map $\mathrm{ev}_x : C^\infty_X \to \mathbb{C}$. We define the tangent space at $x$ as
\[ T_x^{\mathbb{C}} X := \mathrm{Der}(C^\infty_X, \mathbb{C}). \tag{6} \]
Definition 2.5. Let $(X, C^\infty_X)$ be an lcnt-space and $x \in X$. We define the dimension of $X$ at $x$ as
\[ \dim_x(X, C^\infty_X) := \dim_{\mathbb{C}}(T_x^{\mathbb{C}} X) - 1. \tag{7} \]
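From the definition it is not clear that $\dim_x(X)$ is a finite number. Nevertheless we have the following proposition:

Proposition 2.1. Let $(X, C^\infty_X)$ be an lcnt-space, $x \in X$ and $U_x$ a neighborhood of $x$ such that according to Definition 2.4 we have $C^\infty(B^k) \hat\otimes C^\infty(cL_x) \cong C^\infty(U_x)$. Then $\dim_x(X, C^\infty_X) = k$.

Proof. From the second condition in Definition 2.1 it follows immediately that any derivation on the algebra $C^\infty_X$ is a local operator. From this it follows that
\[ T_x^{\mathbb{C}} X = \mathrm{Der}(C^\infty_X, \mathbb{C}) = \mathrm{Der}(C^\infty(U_x), \mathbb{C}) = \mathrm{Der}(C^\infty(B^k) \hat\otimes C^\infty_{cL_x}, \mathbb{C}) = \mathrm{Der}(C^\infty(B^k), \mathbb{C}) \oplus \mathrm{Der}(C^\infty_{cL_x}, \mathbb{C}). \]
The statement follows if we show that $\dim_{\mathbb{C}} \mathrm{Der}(C^\infty_{cL_x}, \mathbb{C}) = 1$. We have
\[ \mathrm{Der}(C^\infty_{cL_x}) \hat\otimes_{C^\infty_{cL_x}} \mathbb{C} = \mathrm{Der}(C^\infty_{cL_x}, \mathbb{C}) = T_x^{\mathbb{C}}(cL_x), \]
where $x$ is now considered as the vertex of the cone. Let $D \in \mathrm{Der}(C^\infty_{cL_x})$ and let $\sum_i \lambda_i f_i(y) \hat\otimes g_i(t)$ be an element in $C^\infty_{cL_x} \subset C^\infty_{L_x} \hat\otimes C^\infty([0,1))$.$^3$ Then we have
\[ D\Big(\sum_i \lambda_i f_i(y) \hat\otimes g_i(t)\Big) = \sum_i \lambda_i (Df_i)(y) \hat\otimes g_i(t) + \sum_i \lambda_i f_i(y) \hat\otimes (Dg_i)(t). \]
Tensoring this with $\hat\otimes_{C^\infty_{cL_x}} \mathbb{C}$ gives
\[ \big(D \hat\otimes_{C^\infty_{cL_x}} 1\big)\Big(\sum_i \lambda_i f_i(y) \hat\otimes g_i(t)\Big) = \underbrace{\sum_i \lambda_i (Df_i)(y) \hat\otimes g_i(0)}_{I} + \underbrace{\sum_i \lambda_i f_i(y) \hat\otimes (Dg_i)(0)}_{II}. \]
By definition of $C^\infty_{cL_x}$ the function $y \mapsto \sum_i \lambda_i f_i(y) \hat\otimes g_i(0)$ is constant, hence
\[ I = D\Big(\sum_i \lambda_i f_i(y) \hat\otimes g_i(0)\Big) = 0. \]

2 See [Bo] for a definition of “pseudo-manifold”.
3 See [Tr] p. 459 about the representation of elements in the projective tensor-product.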
Hence any derivation $D \in T_x^{\mathbb{C}}(cL_x)$ is uniquely determined by its effect on the factor $C^\infty([0,1))$ and we have
\[ T_x^{\mathbb{C}}(cL_x) = \mathrm{Der}(C^\infty([0,1)), \mathbb{C}), \]
where the right hand side is clearly of dimension 1 over $\mathbb{C}$.
The last proposition clearly justifies the somewhat artificial term $-1$ in Definition 2.5. There is still one awkward thing: the dimension of an isolated point according to this definition is $-1$. However, if the point is non-isolated, then, thinking of stratified spaces, the dimension is precisely what it should be. In fact one can decompose $X$ with respect to $\dim_x(X, C^\infty_X)$ and obtain something like a stratified (pre-stratified) space.

Definition 2.6. Let $(X, C^\infty_X)$ be an lcnt-space. We define the (global) dimension of $(X, C^\infty_X)$ as
\[ \dim(X, C^\infty_X) = \sup_{x \in X} \dim_x(X, C^\infty_X). \tag{8} \]
We conclude this section with a lemma which reflects some metric aspects of nt-spaces and will later be used to localize the Hochschild complex of $C^\infty_X$.

Lemma 2.1. Let $(X, C^\infty_X)$ be an nt-space. Then there exists a function $\delta \in C^\infty_X \hat\otimes C^\infty_X$ which has values in $\mathbb{R}_+$ such that $\sqrt{\delta(x,y)}$ is a metric on $X$ which generates the topology of $X$.

Proof. To show the existence of such a $\delta$, we will modify the proof of the Urysohn metrization theorem, which states that every regular $T_1$ space with countable base of topology is metrizable. Using that $C^\infty_X$ is ample one can easily prove that for any two disjoint, closed subsets $A$ and $B$ of $X$ there are a function $f_{A,B} \in C^\infty_X$ and $\epsilon_{A,B} > 0$ such that $A \subset f_{A,B}^{-1}(0)$ and $B \subset f_{A,B}^{-1}(\epsilon_{A,B})$. Let us now consider a complete bounded family $F$ of such functions, i.e. choose for any two disjoint, closed subsets $A$ and $B$ of $X$ a function $f_{A,B} \in F$ as above such that $\{f_{A,B}\}$ is bounded.$^4$ We can assume that $F$ is countable and given by the family $\{f_n \mid n \in \mathbb{N}\}$. We define $\delta$ as follows:
\[ \delta : X \times X \to \mathbb{R}_+, \qquad (x,y) \mapsto \sum_{i=1}^{\infty} \lambda_i \mu_i \big(f_i(x) - f_i(y)\big)^2, \]
where $\lambda_i, \mu_i > 0$ are chosen such that $\lim_i \mu_i f_i = 0$ and $\sum_i \lambda_i < 1$. Then $\delta \in C^\infty_X \hat\otimes C^\infty_X$ (see [Tr], p. 459). The map
\[ \psi : X \to [0,1]^F, \qquad x \mapsto (f(x))_{f \in F} \]
is a topological embedding, where $[0,1]^F$ is considered with the product topology (see [Ke], p. 125). It is well known that this topology can be realized via the metric
\[ d((x_i),(y_i)) = \sum_{i=1}^{\infty} \frac{1}{2^i}\,|x_i - y_i|, \]

4 To achieve this we have to allow the constants $\epsilon_{A,B}$ to be arbitrarily small.
and hence also via the map
\[ \tilde\delta((x_i),(y_i)) = \sum_{i=1}^{\infty} \lambda_i \mu_i (x_i - y_i)^2. \]
We have $\delta = \tilde\delta \circ (\psi \times \psi)$, which shows that $\delta$ satisfies the conditions in the lemma.
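The construction in the proof of Lemma 2.1 can be illustrated in a finite toy situation. The sketch below assumes $X = \mathbb{R}$, a finite family of five separating functions with values in $[0,1]$, and weights $w_i$ standing in for the products $\lambda_i\mu_i$; none of these choices comes from the paper. It builds $\delta$ as a weighted sum of squared differences and checks numerically that $\sqrt{\delta}$ obeys the triangle inequality, as expected for a weighted $\ell^2$ distance between the images under $\psi$.

```python
import math
from itertools import product

def make_delta(funcs, weights):
    """delta(x, y) = sum_i w_i (f_i(x) - f_i(y))^2 for a finite family f_i."""
    def delta(x, y):
        return sum(w * (f(x) - f(y)) ** 2 for f, w in zip(funcs, weights))
    return delta

# Hypothetical separating family on X = R with values in [0, 1].
funcs = [lambda x, k=k: 0.5 * (1.0 + math.tanh(x - k)) for k in range(-2, 3)]
weights = [2.0 ** -(i + 1) for i in range(len(funcs))]  # stand-in for lambda_i * mu_i

delta = make_delta(funcs, weights)

def dist(x, y):
    """The candidate metric sqrt(delta)."""
    return math.sqrt(delta(x, y))

points = [-3.0, -1.5, -0.2, 0.0, 0.7, 1.9, 4.0]
triangle_ok = all(dist(x, z) <= dist(x, y) + dist(y, z) + 1e-12
                  for x, y, z in product(points, repeat=3))
```

Symmetry and vanishing on the diagonal are immediate from the formula. Positivity off the diagonal is where the separating property of the family is used, here through the strict monotonicity of the sample functions.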
In a forthcoming paper we will study the geometric aspects of lcnt-spaces. It is possible to generalize most of the concepts of Riemannian geometry and also stochastic analysis to this class of spaces.

3. Hochschild Homology and Differential Forms

In this section we give a short introduction to Hochschild homology and differential forms. We work in the category of locally convex spaces and use the projective tensor product. A reference for this section is the original work [Co] as well as the articles [Wo] and [Bro/Lyk] about excision in Hochschild homology. From now on we assume that $A$ is a locally convex topological algebra over $\mathbb{C}$. For any natural number $n \in \mathbb{N}$ let
\[ C_n(A) = A^{\hat\otimes(n+1)} \tag{9} \]
be the $(n+1)$-fold completed tensor product over $\mathbb{C}$. We define operators
\[ b'_n : C_n(A) \to C_{n-1}(A), \qquad b'_n(a_0 \hat\otimes \cdots \hat\otimes a_n) = \sum_{i=0}^{n-1} (-1)^i\, a_0 \hat\otimes \cdots \hat\otimes a_i a_{i+1} \hat\otimes \cdots \hat\otimes a_n, \]
\[ b_n : C_n(A) \to C_{n-1}(A), \qquad b_n(a_0 \hat\otimes \cdots \hat\otimes a_n) = b'_n(a_0 \hat\otimes \cdots \hat\otimes a_n) + (-1)^n\, a_n a_0 \hat\otimes a_1 \hat\otimes \cdots \hat\otimes a_{n-1}. \]
An easy computation shows $b_{n-1} \circ b_n = 0 = b'_{n-1} \circ b'_n$, hence we get two chain complexes
\[ C_*(A) = (C_n(A), b_n), \tag{10} \]
\[ C_*^{bar}(A) = (C_n(A), b'_n). \tag{11} \]
The first complex is called the Hochschild complex, the second complex is called the bar-complex. Both complexes give rise to homology groups. The homology groups of the bar complex are only considered for technical reasons. In fact, the bar complex is acyclic if the algebra $A$ in question is unital. In general, the acyclicity of the bar-complex is related to the question of excision.

Definition 3.1. Let $A$ be a unital algebra. For $n \in \mathbb{N}$ we define the $n$th Hochschild homology group of $A$ as
\[ HH_n(A) = \frac{\ker(b_n : C_n(A) \to C_{n-1}(A))}{\mathrm{im}(b_{n+1} : C_{n+1}(A) \to C_n(A))}. \]
We consider these homology groups as topological vector-spaces but keep in mind that unless $\mathrm{im}(b)$ is closed, they are not Hausdorff. In general this can cause some problems. For the problem addressed in this paper though, they will turn out to be Hausdorff and everything is fine. Clearly the construction above is functorial, and any continuous algebra homomorphism induces a continuous homomorphism of the Hochschild homology groups. For computational reasons, it is important that Hochschild homology can be represented as a derived functor, in fact as a special kind of Tor-product.

Definition 3.2. A locally convex topological vector-space $M$ which is also a topological module over $A$ is called topological projective if it is a topological direct summand of a module of the form $N = A \hat\otimes E$, where $E$ is a locally convex vector-space.

Definition 3.3. Let $M$ be a locally convex topological module over $A$. A topological projective resolution of $M$ is an exact sequence of topological projective $A$-modules and $A$-linear maps
$$\dots \to M_2 \xrightarrow{\ b_2\ } M_1 \xrightarrow{\ b_1\ } M_0 \xrightarrow{\ b_0\ } M,$$
which admits a $\mathbb{C}$-linear continuous contraction, i.e. for all $i \in \mathbb{N}$ maps $s_i$ satisfying
$$s_i : M_i \to M_{i+1}, \qquad \mathrm{id} = b_{i+1} s_i + s_{i-1} b_i .$$
Now let $A^{op}$ denote the algebra $A$ with the opposite multiplication and $B := A \hat\otimes A^{op}$. The algebra $A$ itself becomes a topological $B$-module by setting
$$(a \hat\otimes b) \cdot c = acb.$$
The following proposition gives an answer on how to compute Hochschild homology groups using projective resolutions.

Proposition 3.1. Let $(M_n, b_n)$ be a topological projective resolution of $A$ over $B$. Then the Hochschild homology groups of $A$ coincide with the homology groups of the complex
$$\dots \to M_3 \hat\otimes_B A \xrightarrow{\ b_3\ } M_2 \hat\otimes_B A \xrightarrow{\ b_2\ } M_1 \hat\otimes_B A \xrightarrow{\ b_1\ } M_0 \hat\otimes_B A .$$

Proof. (see [Co]).
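We will use this proposition to give two examples.

Example 3.1.
1. Consider $\mathbb{C}$ together with its natural topology. Then $\mathbb{C}^{op} \hat\otimes \mathbb{C} \cong \mathbb{C}$ via the multiplication map and
$$0 \to \mathbb{C} \xrightarrow{\ \mathrm{id}\ } \mathbb{C} \to 0$$
is in fact a topological projective resolution of $\mathbb{C}$ over $\mathbb{C}^{op} \hat\otimes \mathbb{C}$. Hence we have
$$HH_n(\mathbb{C}) = \begin{cases} \mathbb{C} & \text{if } n = 0, \\ 0 & \text{if } n \neq 0. \end{cases} \tag{12}$$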
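2. Consider the algebra $\mathbb{C}[[x]]$ of formal power series with product topology. Then we can identify
$$\mathbb{C}[[x]]^{op} \hat\otimes \mathbb{C}[[x]] \cong \mathbb{C}[[x,y]]$$
and a topological projective resolution of $\mathbb{C}[[x]]$ over $\mathbb{C}[[x]]^{op} \hat\otimes \mathbb{C}[[x]]$ is given by
$$0 \to \mathbb{C}[[x,y]] \xrightarrow{\ \cdot (x-y)\ } \mathbb{C}[[x,y]] \xrightarrow{\ x=y\ } \mathbb{C}[[x]] \to 0 .$$
The $\mathbb{C}$-linear contraction can easily be found by induction on the coefficients of the corresponding power series. Tensoring this resolution with $\hat\otimes_{\mathbb{C}[[x,y]]} \mathbb{C}[[x]]$ leads to the sequence
$$0 \to \mathbb{C}[[x]] \xrightarrow{\ 0\ } \mathbb{C}[[x]] .$$
Passing to homology, we get
$$HH_n(\mathbb{C}[[x]]) = \begin{cases} \mathbb{C}[[x]] & \text{if } n = 0, \\ \mathbb{C}[[x]] & \text{if } n = 1, \\ 0 & \text{if } n \neq 0, 1. \end{cases}$$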
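The vanishing of the differential after tensoring can be made explicit. The short computation below is an added illustration under the assumption that $\mathbb{C}[[x,y]] \hat\otimes_{\mathbb{C}[[x,y]]} \mathbb{C}[[x]]$ is identified with $\mathbb{C}[[x]]$ by restricting to the diagonal $y = x$; the symbols $p$ and $q$ are placeholders for arbitrary power series.

```latex
% Identification used: p(x,y) \otimes q(x) \mapsto p(x,x)\, q(x).
% Under it, multiplication by (x - y) becomes multiplication by (x - x) = 0:
\[
  \bigl( (x-y)\, p(x,y) \bigr) \otimes q(x)
  \;\longmapsto\; (x - x)\, p(x,x)\, q(x) \;=\; 0 .
\]
% Hence both copies of C[[x]] in the tensored sequence survive to homology,
% one in degree 1 and one in degree 0.
```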
The concept of Hochschild homology is strongly related to the concept of differential forms. In fact it is a refinement. We will show later that for lcnt-spaces the two concepts coincide. The next proposition is a first step in this direction.

Proposition 3.2. For any $k \in \mathbb{N}$ we have
$$HH_k(C^\infty(B^n)) \cong \Omega^k(B^n), \tag{13}$$
where $B^n$ denotes the open unit ball and $\Omega^k$ the complex-valued differential $k$-forms.

Proof. We use Proposition 3.1 with $A = C^\infty(B^n)$. We construct an explicit projective resolution of $A$ over $C^\infty(B^n \times B^n)$ by defining, for each $k \in \mathbb{N}$, the $C^\infty(B^n \times B^n)$-modules
$$M_k := C^\infty(B^n \times B^n, \Lambda^k(\mathbb{C}^{n*})).$$
Here $\mathbb{C}^{n*}$ denotes the space of linear forms on $\mathbb{C}^n$. Clearly
$$M_k \cong C^\infty(B^n \times B^n) \hat\otimes \Lambda^k(\mathbb{C}^{n*}),$$
where completion is actually unnecessary, since the vector-space $\Lambda^k(\mathbb{C}^{n*})$ is finite dimensional. Nevertheless, it follows that each $M_k$ is free, hence projective. Furthermore let $\gamma$ denote the difference function
$$\gamma : B^n \times B^n \to \mathbb{R}^n \subset \mathbb{C}^n, \qquad \gamma(a,b) = b - a.$$
This map induces maps
$$i_\gamma : M_{k+1} \to M_k, \qquad i_\gamma\omega(a,b)(v_1,\dots,v_k) = \omega(a,b)(\gamma(a,b), v_1, \dots, v_k) = \omega(a,b)(b-a, v_1, \dots, v_k).$$
Here $\omega \in M_{k+1}$ denotes a form, $a, b$ are points in $B^n \subset \mathbb{R}^n \subset \mathbb{C}^n$ and $v_1, \dots, v_k$ are elements of $\mathbb{C}^n$. In other words, $i_\gamma$ is contraction with the vector-field $\gamma$. Let us now consider the following sequence:
$$\dots \xrightarrow{\ i_\gamma\ } M_2 \xrightarrow{\ i_\gamma\ } M_1 \xrightarrow{\ i_\gamma\ } M_0 = C^\infty(B^n \times B^n) \xrightarrow{\ \Delta^*\ } C^\infty(B^n) \to 0,$$
where $\Delta : B^n \to B^n \times B^n$ denotes the diagonal map. Clearly $i_\gamma \circ i_\gamma = 0$. To show that the chain complex above is a topological projective resolution of $C^\infty(B^n)$ over $C^\infty(B^n \times B^n)$, we have to give a continuous $\mathbb{C}$-linear contraction. For this let $s_k : M_k \to M_{k+1}$ be defined as follows. Let $e_1^*, \dots, e_n^*$ denote the dual basis of the standard canonical basis of $\mathbb{C}^n$, and let $\omega \in M_k$ be given as
$$\omega(x,y) = f(x,y)\, e_{i_1}^* \wedge \dots \wedge e_{i_k}^*,$$
where $f \in C^\infty(B^n \times B^n)$ is a smooth function on $B^n \times B^n$ and $i_1, \dots, i_k \in \{1, \dots, n\}$. In this case we define
$$s_k\omega(a,b) := \sum_{j=1}^{n} \int_0^1 \frac{\partial f}{\partial y_j}(a, a + t(b-a))\, t^k\, e_j^* \wedge e_{i_1}^* \wedge \dots \wedge e_{i_k}^*\, dt .$$
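In the following we suppress the subscript $k$ and simply write $s\omega$. We have
$$(i_\gamma s\omega)(a,b) = \sum_{j=1}^{n} \int_0^1 \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, t^k \cdot \Big( \sum_{l=1}^{k} (-1)^{l+1} (b-a)_{i_l}\, e_j^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^* + (b-a)_j\, (e_{i_1}^* \wedge \dots \wedge e_{i_k}^*) \Big)\, dt .$$
From this we get the expression
$$(i_\gamma s\omega)(a,b) = \sum_{j=1}^{n} \int_0^1 \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, t^k (b-a)_j\, e_{i_1}^* \wedge \dots \wedge e_{i_k}^*\, dt + \sum_{l=1}^{k} \sum_{j=1}^{n} (-1)^{l+1} \int_0^1 \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, t^k\, (b-a)_{i_l}\, e_j^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^*\, dt .$$
The chain-rule of differential calculus applied to the first sum gives
$$i_\gamma s\omega(a,b) = \int_0^1 \frac{d}{dt} f(a, a+t(b-a))\, t^k\, e_{i_1}^* \wedge \dots \wedge e_{i_k}^*\, dt + \sum_{l=1}^{k} \sum_{j=1}^{n} (-1)^{l+1} \int_0^1 \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, t^k\, (b-a)_{i_l}\, e_j^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^*\, dt .$$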
Now we perform partial integration with the first integral on the right side. This yields the following expression:
$$i_\gamma s\omega(a,b) = \underbrace{f(a, a+t(b-a))\, t^k \Big|_0^1\, e_{i_1}^* \wedge \dots \wedge e_{i_k}^*}_{=\,\omega(a,b)} - \int_0^1 f(a, a+t(b-a))\, k t^{k-1}\, e_{i_1}^* \wedge \dots \wedge e_{i_k}^*\, dt + \sum_{l=1}^{k} \sum_{j=1}^{n} (-1)^{l+1} \int_0^1 \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, t^k\, (b-a)_{i_l}\, e_j^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^*\, dt .$$
The first term on the right side clearly coincides with $\omega(a,b)$. In total we get
$$i_\gamma s\omega(a,b) = \omega(a,b) + R(a,b),$$
where $R(a,b)$ denotes the rest, i.e. the integral and the double sum on the right side. Now we compute $s i_\gamma\omega(a,b)$. The definition gives us
$$s i_\gamma\omega(a,b) = \sum_{l=1}^{k} (-1)^{l} \Big\{ \sum_{j=1}^{n} \int_0^1 \Big[ \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, (a + t(b-a) - a)_{i_l} - f(a, a+t(b-a))\, \delta_{j,i_l} \Big]\, t^{k-1}\, dt\ e_j^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^* \Big\} .$$
Reordering the terms and considering the Kronecker symbol $\delta_{j,i_l}$ yields
$$s i_\gamma\omega(a,b) = \sum_{l=1}^{k} (-1)^{l+1} \int_0^1 f(a, a+t(b-a))\, t^{k-1}\, dt\ e_{i_l}^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^* + \sum_{l=1}^{k} \sum_{j=1}^{n} (-1)^{l} \int_0^1 \frac{\partial f}{\partial y_j}(a, a+t(b-a))\, (b-a)_{i_l}\, t^k\, dt\ e_j^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^* .$$
Shuffling $e_{i_l}^*$ from the first to the $l$-th position in $e_{i_l}^* \wedge e_{i_1}^* \wedge \dots \wedge \widehat{e_{i_l}^*} \wedge \dots \wedge e_{i_k}^*$ changes the sign by the factor $(-1)^{l-1}$. This cancels with the factor $(-1)^{l+1}$ in front of the first term on the right side, and we see that this term is actually independent of the summation index $l$. Hence for this term summation over $l$ is just multiplication with $k$. Taking a close look at the summands we recognize that we end up with $-R(a,b)$, where $R(a,b)$ was defined above. So we get
$$s i_\gamma\omega(a,b) + i_\gamma s\omega(a,b) = -R(a,b) + \omega(a,b) + R(a,b) = \omega(a,b).$$
This proves $s i_\gamma + i_\gamma s = \mathrm{id}$.
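As a sanity check of the homotopy identity, the lowest nontrivial case $n = 1$, $k = 1$ can be worked out by hand. The computation below is an added illustration, not part of the original proof; it takes $\omega = f(x,y)\, e_1^*$ and uses only the definitions of $s$ and $i_\gamma$ given in the proof, with $g$ introduced as a shorthand.

```latex
% Case n = 1, k = 1: \omega(x,y) = f(x,y)\, e_1^*.
% Here s\omega = 0 because e_1^* \wedge e_1^* = 0, hence i_\gamma s\omega = 0.
% The remaining term s i_\gamma \omega is computed with s_0 (weight t^0 = 1):
\begin{align*}
(i_\gamma\omega)(a,b) &= f(a,b)\,(b-a) =: g(a,b), \\
\frac{\partial g}{\partial y}(a,y) &= \frac{\partial f}{\partial y}(a,y)\,(y-a) + f(a,y), \\
(s\, i_\gamma\omega)(a,b)
  &= \int_0^1 \frac{\partial g}{\partial y}\bigl(a, a+t(b-a)\bigr)\, dt \; e_1^* \\
  &= \int_0^1 \frac{d}{dt}\Bigl[\, t\, f\bigl(a, a+t(b-a)\bigr) \Bigr]\, dt \; e_1^*
   = f(a,b)\, e_1^* = \omega(a,b).
\end{align*}
% So s i_\gamma \omega + i_\gamma s \omega = \omega in this case, as claimed.
```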
It is not hard to see that $s$ is continuous and $\mathbb{C}$-linear. So far we have constructed a topological projective resolution of $C^\infty(B^n)$ over $C^\infty(B^n \times B^n)$. We now compute the Hochschild homology of $C^\infty(B^n)$ by tensoring the topological projective resolution from above over $C^\infty(B^n \times B^n)$ with $C^\infty(B^n)$. For any $k \in \mathbb{N}$ we have
$$M_k \hat\otimes_{C^\infty(B^n \times B^n)} C^\infty(B^n) = \bigl(C^\infty(B^n \times B^n) \hat\otimes \Lambda^k(\mathbb{C}^{n*})\bigr) \hat\otimes_{C^\infty(B^n \times B^n)} C^\infty(B^n) = \Lambda^k(\mathbb{C}^{n*}) \hat\otimes C^\infty(B^n) = \Omega^k(B^n).$$
Since $\gamma$ restricted to the diagonal is zero, we have
$$i_\gamma \hat\otimes_{C^\infty(B^n \times B^n)} \mathrm{id}_{C^\infty(B^n)} = 0.$$
Therefore the tensored complex has zero differentials and we get $\Omega^k(B^n)$ for the $k$-th homology group of this complex.

To study the relationship between Hochschild homology and differential forms in general, we have to say what we mean by differential forms for a general Fréchet-algebra. There are various concepts of this; we will use the following one.$^5$ The first Hochschild homology group of a locally convex algebra $A$ is given as
$$HH_1(A) = \frac{\ker(b_1 : A^{\hat\otimes 2} \to A)}{\mathrm{im}(b_2 : A^{\hat\otimes 3} \to A^{\hat\otimes 2})} .$$
The boundary maps $b_1$ and $b_2$ are explicitly given as
$$b_1(a \hat\otimes b) = ab - ba, \tag{14}$$
$$b_2(a \hat\otimes b \hat\otimes c) = ab \hat\otimes c - a \hat\otimes bc + ca \hat\otimes b. \tag{15}$$
We assume from now on that $A$ is commutative. Then in Eq. (14) $b_1 = 0$. We denote with $[a \hat\otimes b]$ the class of $a \hat\otimes b$ in $HH_1(A)$. We have
$$[a \hat\otimes b] = [a(1 \hat\otimes b - b \hat\otimes 1) + ab \hat\otimes 1] = a[1 \hat\otimes b - b \hat\otimes 1] + \underbrace{[ab \hat\otimes 1]}_{=\,*} = a[1 \hat\otimes b - b \hat\otimes 1].$$
That $* = 0$ follows by taking $c = 1$ in (15). We define the formal differential of $b$ as
$$db := [1 \hat\otimes b - b \hat\otimes 1].$$
An elementary computation shows that
$$d(ab) = a\,db + b\,da. \tag{16}$$
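The elementary computation behind (16) can be spelled out as follows. This is an added sketch that uses only the formula (15) for $b_2$, the commutativity of $A$, and the identity $[a \hat\otimes b] = a\,db$ obtained just before.

```latex
\begin{align*}
b_2(a \hat\otimes b \hat\otimes 1)
  &= ab \hat\otimes 1 - a \hat\otimes b + a \hat\otimes b = ab \hat\otimes 1
  &&\Rightarrow\ [ab \hat\otimes 1] = 0,\ \text{so } db = [1 \hat\otimes b], \\
b_2(1 \hat\otimes a \hat\otimes b)
  &= a \hat\otimes b - 1 \hat\otimes ab + b \hat\otimes a
  &&\Rightarrow\ [1 \hat\otimes ab] = [a \hat\otimes b] + [b \hat\otimes a], \\
d(ab) &= [1 \hat\otimes ab] = [a \hat\otimes b] + [b \hat\otimes a] = a\,db + b\,da .
\end{align*}
```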
From above, we also see that $HH_1(A) = \mathrm{span}\{a\,db \mid a, b \in A\}$. This leads us to the following definition.

5 This version of differential forms works very well in the case of commutative algebras. In the case of non-commutative algebras this definition of differential forms has to be slightly modified.
Definition 3.4. Let $A$ be a locally convex topological algebra and $k \in \mathbb{N}$. We define the locally convex topological vector-space of differential $k$-forms on $A$ as
$$\Omega^k_A := \Lambda^k_A\, HH_1(A),$$
where $\Lambda^k_A$ denotes the $k$-th exterior product over $A$.$^6$ By definition we have $\Omega^0_A = A$ and $\Omega^1_A := HH_1(A)$. In general we have the antisymmetrization map
$$\Omega^k_A \to HH_k(A), \qquad a_0\, da_1 \wedge \dots \wedge da_k \mapsto \sum_{\sigma \in \Sigma_k} \mathrm{sign}(\sigma)\, (a_0 \otimes a_{\sigma^{-1}(1)} \otimes \dots \otimes a_{\sigma^{-1}(k)}).$$
This map is well defined, continuous and has a continuous left inverse (see [Lo], p. 27ff). Hence it is always injective and $\Omega^k_A$ is even a topological direct summand in $HH_k(A)$. The question is now when it is also surjective, or in other words what Hochschild homology can see that differential forms do not see. In the case of lcnt-spaces we will answer this question in the next section.

To use inductive methods it is very useful to have at hand certain long exact sequences of Hochschild homology groups associated to short exact sequences of certain algebras. Those long exact sequences do not always exist. The question of their existence is referred to as the Excision-problem in Hochschild homology, because it somehow resembles the situation of the Mayer-Vietoris sequence in ordinary homology. In general the obstruction is a property called $H$-unitality ([Lo], p. 32). By definition $A$ is $H$-unital if $C_*^{bar}(A)$ is acyclic. For a non-unital algebra $I$, such as for example an ideal in a bigger unital algebra, the Hochschild homology groups are defined as $\mathrm{coker}(i_* : HH_n(\mathbb{C}) \to HH_n(I^+))$, where $I^+$ denotes the unitalization of $I$ and $i$ the canonical inclusion. By definition $I$ is called $H$-unital if $C_*^{bar}(I)$ is acyclic. The following proposition can be found in [Bro/Lyk].

Proposition 3.3. Let $0 \to I \to A \to A/I \to 0$ be an exact sequence of nuclear Fréchet algebras such that $A$ is unital and $I$ is $H$-unital. Then there is a long exact sequence of continuous Hochschild homology groups
$$\dots \to HH_n(I) \to HH_n(A) \to HH_n(A/I) \xrightarrow{\ \delta\ } HH_{n-1}(I) \to \dots .$$
The task to determine whether an algebra is $H$-unital or not can turn out to be quite difficult. We will later make use of the following proposition.

Proposition 3.4. Let $A$ be a unital nuclear Fréchet algebra and let
$$C^\infty_0([0,1)) = \ker\bigl(\mathrm{res} : C^\infty((-1,1)) \to C^\infty((-1,0))\bigr) \tag{17}$$
be the algebra of smooth functions on $[0,1)$ with derivatives of any order vanishing at zero. Then the non-unital nuclear Fréchet algebra $C^\infty_0([0,1)) \hat\otimes A$ is $H$-unital.

6 In the category of locally convex topological vector-spaces [He, Ew] p. 64
Proof. We have to show that the corresponding bar-complex (11) is acyclic. Let
$$\alpha = \sum_i \lambda_i\, (f_0^i \hat\otimes b_0^i) \hat\otimes \dots \hat\otimes (f_n^i \hat\otimes b_n^i) \in C_n(C^\infty_0([0,1)) \hat\otimes A)$$
be an element in the bar complex. Here $\lambda_i$ is a sequence of complex numbers such that $\sum_{i=0}^{\infty} |\lambda_i| < 1$ and $f_j^i$ respectively $b_j^i$ converge to zero as $i$ goes to infinity. A factorization theorem (see [Vo], Thm. 3.4) applied to $C^\infty_0([0,1))$ and the sequence $(f_0^i)$ says there are functions $g^i \in C^\infty_0([0,1))$ for all $i \in \mathbb{N}$ and $h \in C^\infty_0([0,1))$ with the following properties:
1. $f_0^i = h \cdot g^i$ for all $i \in \mathbb{N}$,
2. $g^i \in \overline{C^\infty_0([0,1)) \cdot (f_0^i \mid i \in \mathbb{N})}$.
The expression in Condition 2 denotes the closure of the ideal in $C^\infty_0([0,1))$ which is generated by the functions $f_0^i$. Let us define $\beta \in C_n(C^\infty_0([0,1)) \hat\otimes A)$ as
$$\beta = \sum_{i=0}^{\infty} \lambda_i\, (g^i \hat\otimes b_0^i) \hat\otimes \dots \hat\otimes (f_n^i \hat\otimes b_n^i).$$
From Condition 2 above it follows that
$$\beta \in \overline{(C^\infty_0([0,1)) \hat\otimes A) \cdot \alpha} \subset C_n(C^\infty_0([0,1)) \hat\otimes A). \tag{18}$$
Here the term in the middle denotes the closure of the ideal generated by $\alpha$. A simple calculation shows that
$$\alpha = b'\bigl((h \hat\otimes 1_A) \hat\otimes \beta\bigr) + (h \hat\otimes 1_A) \hat\otimes b'(\beta). \tag{19}$$
Let us now assume that $\alpha$ is a cycle in the bar complex. Then $b'(\alpha) = 0$. Hence by continuity and $C^\infty_0([0,1)) \hat\otimes A$-linearity of $b'$ it follows from (18) that $b'(\beta) = 0$. Hence by (19) we have that
$$\alpha = b'\bigl((h \hat\otimes 1_A) \hat\otimes \beta\bigr)$$
is a boundary in the continuous bar complex and the bar complex is acyclic.
4. Hochschild Homology of lcnt-Spaces

The following theorem is the main result of this work. It generalizes a result of A. Connes$^7$ about Hochschild homology in the case where the underlying space is a smooth manifold to the case where the underlying space is an lcnt-space. We will not use Connes's result in the proof, but rather Proposition 3.2 and a localization method introduced in Teleman [Te].

Theorem 3. Let $(X, C^\infty_X)$ be a finite dimensional lcnt-space, then the antisymmetrization map
$$\Omega^*_{C^\infty_X} \to HH_*(C^\infty_X)$$
is an isomorphism.

7 See [Co] Lemma 45 and Theorem 46 (Connes works in the cohomological setting).
The proof of the theorem will follow after a couple of lemmas. The first step is:

Lemma 4.1. The antisymmetrization map $\Omega^k_{\mathbb{C}[[x]]} \to HH_k(\mathbb{C}[[x]])$ is an isomorphism for all $k \in \mathbb{N}$.

Proof. By definition for $k = 0, 1$ there is nothing to show. On the other hand it follows from (16) that $d(x^n) = n \cdot x^{n-1}\, dx$. Since obviously $dx \wedge dx = 0$ we have $\Omega^k_{\mathbb{C}[[x]]} = 0$ for all $k \geq 2$, which together with Example 3.1 (2) shows the rest.

Lemma 4.2. Let $(X, C^\infty_X)$ be an lcnt-space such that the antisymmetrization map
$$\Omega^*_{C^\infty_X} \to HH_*(C^\infty_X)$$
is an isomorphism and furthermore $HH_*(C^\infty_X)$ is Hausdorff, then the antisymmetrization map
$$\Omega^*_{C^\infty_{cX}} \to HH_*(C^\infty_{cX})$$
is also an isomorphism and $HH_*(C^\infty_{cX})$ is also Hausdorff.
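Proof. Let us first consider the following short exact sequence of algebras
$$0 \to C^\infty_X \hat\otimes C^\infty_0([0,1)) \xrightarrow{\ \mathrm{incl.}\ } C^\infty_X \hat\otimes C^\infty([0,1)) \xrightarrow{\ J\ } C^\infty_X[[z]] \to 0, \tag{20}$$
where $C^\infty_0([0,1))$ is given by (17). It follows from Proposition 3.4 that $C^\infty_X \hat\otimes C^\infty_0([0,1))$ is $H$-unital. Hence we have the following commutative diagram with an exact upper row
$$\begin{array}{ccccc}
HH_k(C^\infty_X \hat\otimes C^\infty_0([0,1))) & \xrightarrow{\ \mathrm{incl.}_*\ } & HH_k(C^\infty_X \hat\otimes C^\infty([0,1))) & \xrightarrow{\ J_*\ } & HH_k(C^\infty_X[[z]]) \\
\uparrow & & \uparrow \cong & & \uparrow \cong \\
\Omega^k_{C^\infty_X \hat\otimes C^\infty_0([0,1))} & \xrightarrow{\ \mathrm{incl.}_*\ } & \Omega^k_{C^\infty_X \hat\otimes C^\infty([0,1))} & \xrightarrow{\ J_*\ } & \Omega^k_{C^\infty_X[[z]]}
\end{array}$$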
The two antisymmetrization maps on the right side are in fact isomorphisms. This follows from $C^\infty_X[[z]] \cong C^\infty_X \hat\otimes \mathbb{C}[[z]]$, Lemma 4.1 and Proposition 3.2 (slightly modified, to cover half open intervals) once there is a Künneth-Theorem for Hochschild homology at hand. Such a theorem indeed exists, if the corresponding Hochschild homology groups of the involved algebras are Hausdorff (see [Ka]). This is the case.$^8$ It is not hard to see that the second map in the lower sequence is always surjective. Therefore by commutativity and our assumption, the second map in the upper sequence is also surjective. Since the upper sequence is actually a part of a long exact sequence, the first map in the upper sequence is in addition injective and the upper sequence is in fact a short exact sequence. It follows from this information and the fact that the antisymmetrization map is in general split injective, via a small diagram chase, that
$$\Omega^*_{C^\infty_X \hat\otimes C^\infty_0([0,1))} \to HH_*(C^\infty_X \hat\otimes C^\infty_0([0,1)))$$
is also an isomorphism.

8 For more details about the Künneth-Theorem in Hochschild homology see [Ew], p. 86.

Let us now consider the short exact sequence of algebras
$$0 \to C^\infty_X \hat\otimes C^\infty_0([0,1)) \xrightarrow{\ \mathrm{incl.}\ } C^\infty_{cX} \xrightarrow{\ J\ } \mathbb{C}[[z]] \to 0. \tag{21}$$
As before this sequence induces a commutative diagram
$$\begin{array}{ccccc}
HH_k(C^\infty_X \hat\otimes C^\infty_0([0,1))) & \xrightarrow{\ \mathrm{incl.}_*\ } & HH_k(C^\infty_{cX}) & \xrightarrow{\ \mathrm{proj.}_*\ } & HH_k(\mathbb{C}[[z]]) \\
\uparrow \cong & & \uparrow & & \uparrow \cong \\
\Omega^k_{C^\infty_X \hat\otimes C^\infty_0([0,1))} & \xrightarrow{\ \mathrm{incl.}_*\ } & \Omega^k_{C^\infty_{cX}} & \xrightarrow{\ \mathrm{proj.}_*\ } & \Omega^k_{\mathbb{C}[[z]]}
\end{array}$$
where the upper row is exact. Furthermore the right and the left antisymmetrization maps are isomorphisms. This follows from Lemma 4.1 and the first part. Also again, it is not hard to see that the second map in the lower sequence is always surjective. It follows from this and split injectivity, via a small diagram chase, that the map in the middle
$$\Omega^k_{C^\infty_{cX}} \to HH_k(C^\infty_{cX})$$
is also an isomorphism.
Lemma 4.3. Let $(X, C^\infty_X)$ be an nt-space, such that there exists a locally finite open covering $(U_i \mid i \in I)$ of $X$ such that any of the antisymmetrization maps $\Omega^*_{C^\infty(U_i)} \to HH_*(C^\infty(U_i))$ is an isomorphism, then the antisymmetrization map
$$\Omega^*_{C^\infty_X} \to HH_*(C^\infty_X)$$
is also an isomorphism.

Proof. We consider the spaces $C_k(C^\infty_X) = (C^\infty_X)^{\hat\otimes (k+1)}$ in the Hochschild complex of $C^\infty_X$ as subspaces of the space of continuous complex valued functions on the $(k+1)$-fold Cartesian product of $X$ and use a method first presented in [Te] to localize the Hochschild complex around the diagonal. First let $\lambda : [0,\infty) \to [0,1]$ be a smooth function, such that $\mathrm{supp}(\lambda) \subset [0,1]$ and $\lambda|_{[0,1/2]} \equiv 1$. For $t > 0$ we define
$$\lambda_t : [0,\infty) \to [0,1], \qquad \lambda_t(s) := \lambda(s/t).$$
These functions have the following properties:
1. $\mathrm{supp}(\lambda_t) \subset [0,t]$,
2. $\lambda_t|_{[0,t/2]} \equiv 1$.
Now for any $k \in \mathbb{N}$ let us define functions $\rho_k$ as
$$\rho_k : X^{k+1} \to [0,\infty), \qquad \rho_k(x_0, x_1, \dots, x_k) = \delta(x_0,x_1) + \delta(x_1,x_2) + \dots + \delta(x_k,x_0).$$
Here $\delta$ denotes the function on $X \times X$ of Lemma 2.1. $\rho_k$ measures the distance of a point in $X^{k+1}$ from the diagonal with respect to the "distance-function" $\delta$. Clearly $\rho_k \in (C^\infty_X)^{\hat\otimes (k+1)}$. Let
$$U_{t,k} := \{(x_0, x_1, \dots, x_k) \mid \rho_k(x_0, \dots, x_k) < t\}$$
be the $t$-neighborhood of the diagonal $\Delta_{k+1} \subset X^{k+1}$. Let $C^t_*(C^\infty_X)$ be the sub-complex of the Hochschild complex $C_*(C^\infty_X)$, where $C^t_k(C^\infty_X)$ contains the elements of $(C^\infty_X)^{\hat\otimes (k+1)}$ vanishing on $U_{t,k}$. Let
$$C^0_*(C^\infty_X) = \varinjlim C^t_*(C^\infty_X),$$
where the limit is for $t$ to zero. The complex $C^0_*(C^\infty_X)$ consists of the chains vanishing in some neighborhood of the diagonal. We will now show that this complex is acyclic. We define an operator
$$E_t : C_k(C^\infty_X) \to C_{k+1}(C^\infty_X), \qquad E_t(F)(x_0, \dots, x_{k+1}) = \lambda_t(\delta(x_0,x_1)^2) \cdot F(x_1, \dots, x_{k+1}), \quad \forall F \in C_k(X).$$
This operator maps $C^s_k(C^\infty_X)$ into $C^{s/4}_{k+1}(C^\infty_X)$, which can be verified by computation. Another computation shows that
$$b \circ E_t + E_t \circ b = 1 - N_t,$$
where $N_t$ is defined as
$$N_t(F)(x_0, \dots, x_k) = (-1)^k \lambda_t(\delta(x_0,x_1)) \cdot \{F(x_1, x_2, \dots, x_k, x_0) - F(x_1, x_2, \dots, x_k, x_1)\}, \quad \forall F \in C_k(C^\infty_X).$$
Again by computation $b \circ N_t = N_t \circ b$. Let us consider the $k$-th power of $N_t$. For $F \in C_k(C^\infty_X)$ we get
$$(N_t)^k F(x_0, \dots, x_k) = \prod_{i=0}^{k-1} \lambda_t(\delta(x_i, x_{i+1})) \cdot G(x_0, \dots, x_k),$$
where $G(x_0, \dots, x_k)$ is a linear combination of functions built out of $F$ by restricting to certain diagonals and permutation of some arguments. For the product in front of $G$ not to be zero, we must have $\delta(x_i, x_{i+1}) < t$ for each $0 \leq i \leq k-1$. The triangle inequality shows that in this case we also have $\sqrt{\delta(x_0, x_k)} < k t^{1/2}$. Thus we have
$$\rho_k(x_0, \dots, x_k) = \sum_{i=0}^{k-1} \delta(x_i, x_{i+1}) + \delta(x_k, x_0) < kt + k^2 t.$$
Hence for $F \in C_k^{(k+k^2)t}(C^\infty_X)$ we have that $(N_t)^k(F) = 0$. Let us define another operator
$$K_t := E_t \cdot \sum_{r=0}^{k-1} (N_t)^r : C_k^{(k+k^2)t}(C^\infty_X) \to C_{k+1}^{(k+k^2)4^{-(k+1)}t}(C^\infty_X).$$
By construction this operator satisfies $b \circ K_t + K_t \circ b = 1$, which proves the acyclicity of $C^0_*(C^\infty_X)$ by taking the direct limit where $t$ goes to zero. This in fact shows that any Hochschild-homology class can be represented by a Hochschild-cycle $F \in C_k(C^\infty_X)$ which has support in an arbitrarily small neighborhood of the diagonal $X \approx \Delta \subset X^{k+1}$. Now let $(U_i \mid i \in I)$ be an open covering of $X$ which satisfies the requirements of the lemma. Let $F \in C_k(C^\infty_X)$ be an arbitrary Hochschild-cycle, then by the discussion above we can assume that
$$\mathrm{supp}(F) \subset \bigcup_{i \in I} U_i^{k+1}.$$
Now we can use the $C^\infty_X$-module structures on $C_k(C^\infty_X)$ and $HH_k(C^\infty_X)$ as well as a partition of unity in $C^\infty_X$ subordinated to the covering $(U_i \mid i \in I)$ to see that
$$[F] = \sum_{i \in I} [F_i],$$
where $\mathrm{supp}(F_i) \subset U_i^{k+1}$. It does now follow from the surjectivity assumption on the antisymmetrization map corresponding to each $U_i$ that each of the $[F_i]$ is in the image of the antisymmetrization map corresponding to $X$. Hence $[F]$ is also in the image of the antisymmetrization map. This shows that the antisymmetrization map is surjective. Since it always has a continuous left inverse it is in fact an isomorphism.

With these tools we are able to prove Theorem 4.1.

Proof. Let $n \in \mathbb{N}$. We have to show that the antisymmetrization map is surjective, or equivalently that any Hochschild cycle in $C_n(C^\infty_X)$ is homologous to an antisymmetric Hochschild cycle. From Lemma 4.3 this follows if we show that this is locally the case. Hence we can assume that our lcnt-space is given as in (5) by $(B^k \times cL, C^\infty(B^k) \hat\otimes C^\infty_{cL})$, where $B^k$ denotes the open unit ball of dimension $k$ and $cL$ denotes the cone over an lcnt-space of dimension less than the dimension of $X$ (see (3), (4), (7) and Proposition 2.1). Using induction on the dimension, we can assume that all antisymmetrization maps corresponding to $L$ are isomorphisms and all Hochschild homology groups of $L$ are Hausdorff. Therefore by Lemma 4.2 the same holds for $(cL, C^\infty_{cL})$. Furthermore by Proposition 3.2 the antisymmetrization maps corresponding to $B^k$ are isomorphisms. Hence the corresponding Hochschild-homology groups are also Hausdorff and Theorem 4.1 follows as in the proof of Lemma 4.2 from the Künneth Theorem in Hochschild-homology (see [Ka]).

5. Cyclic Homology of lcnt-Spaces

In this section we will say something about the cyclic homology groups corresponding to an lcnt-space $(X, C^\infty_X)$. The result we will present is a straightforward consequence of the general relationship between Hochschild homology and cyclic homology and Theorem 4.1. Using Theorem 4.1 the proof is identical to the proof in [Co]. To shorten this article, we keep this section very informal.
If we divide out a cyclic action from the Hochschild complex (10), i.e. identify chains which arise from one another by cyclic permutation, we get another complex, which is sometimes called Connes' complex. The homology groups of this complex are called cyclic homology groups and will be denoted by $HC_*(A)$. These groups are related to the Hochschild homology groups by the so-called Connes' exact sequence
$$\dots \to HH_n(A) \xrightarrow{\ I\ } HC_n(A) \xrightarrow{\ S\ } HC_{n-2}(A) \xrightarrow{\ B\ } HH_{n-1}(A) \xrightarrow{\ I\ } \dots .$$
The operator $S$ is the so-called Connes periodicity operator and corresponds via some identifications to the Bott periodicity operator in K-theory. In the commutative case it is not hard to show that via the antisymmetrization map, up to a factor, the operator $B \circ I : HH_n(A) \to HH_{n+1}(A)$ exactly corresponds to the operator
$$d : \Omega^n_A \to \Omega^{n+1}_A, \qquad a_0\, da_1 \wedge \dots \wedge da_n \mapsto da_0 \wedge da_1 \wedge \dots \wedge da_n .$$

Theorem 5.1. Let $(X, C^\infty_X)$ be an lcnt-space with finite dimensional homology groups. Then for all $n \in \mathbb{N}$ there is a natural topological isomorphism
$$HC_n(C^\infty_X) \cong \Omega^n_{C^\infty_X} / d\,\Omega^{n-1}_{C^\infty_X} \oplus H^{n-2}(X, \mathbb{C}) \oplus H^{n-4}(X, \mathbb{C}) \oplus H^{n-6}(X, \mathbb{C}) \oplus \dots .$$
Here $H^k(X, \mathbb{C})$ stands for singular cohomology with coefficients in $\mathbb{C}$. There is also a de Rham theorem for lcnt-spaces$^9$ and these singular cohomology groups can be identified with the corresponding de Rham cohomology groups.

9 Compare [Ew] Theorem 4.6.1, p. 49

References

[Bo] Borel, A.: Intersection cohomology. (Notes of a Seminar on Intersection Homology at the University of Bern, Switzerland, Spring 1983). (English) Progress in Mathematics, Vol. 50, Swiss Seminars. Boston-Basel-Stuttgart: Birkhäuser, 1984
[Bra/Pf] Brasselet, J-P., Pflaum, M.: On the homology of algebras of Whitney functions over sub-analytic sets. Institut de Mathématiques de Luminy, Preprint 2002-02, 2002
[Bre] Bredon, G.: Topology and Geometry. Graduate Texts in Math 139, Berlin-Heidelberg-New York: Springer, 1991
[Bro/Lyk] Brodzki, J., Lykova, Z.: Excision in Cyclic Type Homology of Fréchet Algebras. Bull. Lond. Math. Soc. 33(3), 283–291 (2001)
[Co] Connes, A.: Non-commutative Differential Geometry. Publications Mathematiques IHES 62, 1985
[Ew] Ewald, C.: Hochschild homology and de Rham cohomology of stratifolds. Dissertation Universität Heidelberg 2002, Heidelberg Dokumenten Server
[He] Helemskii, A.Ya.: The Homology of Banach and Topological Algebras. Dordrecht: Kluwer Academic Publishers, 1989
[Ka] Karoubi, M.: Formule de Künneth en homologie cyclique I et II. C.R. Acad. Sci. Paris Série A-B 303, 507–510 (1986)
[Ke] Kelley, J.L.: General Topology. New York: Van Nostrand, 1955
[Kr1] Kreck, M.: (Co)Homology via Differential topological Varieties. Preprint Mainz, 1999
[Kr2] Kreck, M.: Lecture Notes in Differential Algebraic Topology. Mainz/Heidelberg, 2000
[Lo] Loday, J.L.: Cyclic homology. Grundlehren der Mathematik, Berlin-Heidelberg-New York: Springer, 1991
[Pf] Pflaum, M.: Ein Beitrag zur Geometrie und Analysis auf stratifizierten Räumen. Habilitationsschrift, Humboldt Universität Berlin, 2000
[Te] Teleman, N.: Microlocalisation de l'homologie de Hochschild. C.R. Acad. Sci. Paris Série 1, 326, 1261–1264 (1998)
[Tr] Treves, J.F.: Topological Vectorspaces, Distributions and Kernels. London-New York: Academic Press, 1967
[Vo] Voigt, J.: Factorization in some Fréchet algebras of Differentiable Functions. Studia Mathematica 77, 333–348 (1983)
[We] Weibel, C.: Homological Algebra. Oxford: Oxford University Press, 1995
[Wo] Wodzicki, M.: Excision in Cyclic Homology. Annals of Math 129, 591–639 (1989)
Communicated by A. Connes
Commun. Math. Phys. 250, 215–239 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1141-4
Communications in
Mathematical Physics
A Generalization of the Bargmann’s Theory of Ray Representations

Jarosław Wawrzycki

Institute of Nuclear Physics PAS, ul. Radzikowskiego 152, 31-342 Kraków, Poland. E-mail: [email protected]

Received: 12 June 2003 / Accepted: 2 March 2004
Published online: 27 July 2004 – © Springer-Verlag 2004
Abstract: The paper contains a complete theory of factors for ray representations acting in a Hilbert bundle, which is a generalization of the known Bargmann’s theory. With its help, we have reformulated the standard quantum theory so that the gauge freedom emerges naturally from the very nature of quantum laws. The theory is of primary importance in the investigations of covariance (in contradistinction to symmetry) of a quantum theory which possesses a nontrivial gauge freedom. In that case the group in question is not any symmetry group but a covariance group only, a case not yet investigated in depth. It is shown that the factor of a covariance group representation depends on space and time when the system in question possesses gauge freedom. In nonrelativistic theories, the factor depends on time only. In relativistic theory, the Hilbert bundle is built over spacetime, while in the nonrelativistic case it is built over time. We explain two applications of this generalization: in the theory of a quantum particle in gravitational field in the nonrelativistic limit, and in quantum electrodynamics.

1. Introduction

In standard Quantum Mechanics (QM) and in Quantum Field Theory (QFT), the spacetime coordinates are essentially classical variables. Therefore, the question about the general covariance of QM and QFT emerges naturally just like in the classical theory, namely: What is the effect of a change in spacetime coordinates in QM and QFT when this change does not form any symmetry transformation? It is a commonly accepted belief that there are no substantial difficulties if we refer the question to the wave equation. We simply treat the wave equation, without saying why, in such a manner as if it was a classical equation. The only problem arising is to find the transformation rule $\psi \to T_r\psi$ for the wave function $\psi$. This procedure, which is itself open to serious objection, does not solve the above-stated problem. The heart of the matter, as well as that of QM and QFT, lies in the Hilbert space of states, and specifically in finding the representation $T_r$ of the covariance group in question. The trouble originates from the fact that the covariance transformation changes the form of the wave equation so that $\psi$ and $T_r\psi$ do not belong to the same Hilbert space, which means that $T_r$ does not act in the ordinary Hilbert space. This is not compatible with the paradigm worked out while dealing with symmetry groups. We show that the covariance group acts in a Hilbert bundle RH over time in nonrelativistic theory, and in a Hilbert bundle MH over spacetime M in the relativistic case. The waves are the respective cross-sections of the bundle in question. The exponent $\xi(r,s,p)$ in the formula
$$T_r T_s = e^{i\xi(r,s,p)}\, T_{rs}$$
depends on the point $p$ of the base of the bundle in question. That is, $\xi$ depends on time $t$ in the nonrelativistic theory, and on the spacetime point $p$ in the relativistic theory if there exists nontrivial gauge freedom. Moreover, we argue that the bundle MH is more appropriate for treating the covariance as well as the symmetry groups than the Hilbert space itself. Namely, we show that from a more general assumption that the representation $T_r$ of the Galilean group acts in RH and has an exponent $\xi(r,s,t)$ depending on time $t$ one can reconstruct nonrelativistic Quantum Mechanics. Even more, in the more complex case of theory with nontrivial time-dependent gauge describing a spinless quantum particle in Newtonian gravity, we are able to infer the wave equation and prove the equality of the inertial and gravitational masses, cf. [17]. In doing so, we apply extensively the classification theory for exponents $\xi(r,s,t)$ of $T_r$ acting in RH and depending on time. The main task of this paper is to construct the general classification theory of spacetime-dependent exponents $\xi(r,s,p)$ of representations acting in MH.
At the same time, the theory presented here can be viewed as a generalization of Bargmann’s [1] classification theory of exponents ξ(r, s) of representations acting in ordinary Hilbert spaces, with the exponents independent of p ∈ M. In the presented theory which is slightly more general than the standard one, gauge freedom emerges from the very nature of the fundamental laws of Quantum Mechanics. We hope this reformulation will be useful in dealing with the problems in QFT caused by gauge freedom. Section 2 presents physical motivation in detail. In Sect. 3 we discuss a generalization of the ordinary state vector ray and operator ray introduced by Weyl. In Sect. 4 we analyze the local exponents of Lie groups and introduce algebras which are important tools for the classification theory of local exponents presented in that section. In Sect. 5 we investigate globally defined exponents, classifying them in some special cases. Sect. 6 includes two examples. The first example is the Galilean group. We analyze the group from the point of view of the generalized theory. By way of the other example, the exponents of the Milne group, the covariance group relevant in the theory of a nonrelativistic particle in the gravitational field, are analyzed. The proof of differentiability of the (generalized) exponent and of the theorem about exponents defined on one-parameter groups proceeds in a quite analogous way to that presented in Bargmann’s work [1]. However, it is not trivial that the theorems are also true in this generalized situation. We present their proof explicitly for the reader’s convenience. The remainder of our reasoning is not a simple analogue of [1] and proceeds differently.
2. Setting the Motivation

In this section we carry out a general analysis of the representation $T_r$ of a covariance group, and compare it with the representation of a symmetry group. We also describe the correspondence between the space of wave functions $\psi(\vec x, t)$ and the Hilbert space. The analysis is performed in the nonrelativistic case, but it can be carried out in the relativistic quantum field theory as well. Before we give the general description, it will be instructive to investigate the problem for a free particle in the flat Galilean spacetime. The set of solutions $\psi$ of the Schrödinger equation which are admissible in Quantum Mechanics is precisely given by
$$\psi(\vec x, t) = (2\pi)^{-3/2} \int \varphi(\vec k)\, e^{-i\frac{\hbar t}{2m}\vec k \cdot \vec k + i \vec k \cdot \vec x}\, d^3k,$$
where $\vec p = \hbar\vec k$ is the linear momentum and $\varphi(\vec k)$ is any square integrable function. The functions $\varphi$ (wave functions in the “Heisenberg picture”) form a Hilbert space $\mathcal{H}$ with the inner product
$$(\varphi_1, \varphi_2) = \int \varphi_1^*(\vec k)\, \varphi_2(\vec k)\, d^3k .$$
The correspondence between $\psi$ and $\varphi$ is one-to-one. In general, however, the construction fails if the Schrödinger equation possesses nontrivial gauge freedom. Let us explain it. For example, the above construction fails for the nonrelativistic quantum particle in the curved Newton-Cartan spacetime. Besides, in this spacetime we do not have any plane wave, see [15]. Thus, there does not exist any natural counterpart for the Fourier transform. However, we do not need to use the Fourier transform. What is the role of the Schrödinger equation in the above construction of $\mathcal{H}$? Please note that in general
$$\|\psi\|^2 \equiv \int \psi^*(\vec x, 0)\, \psi(\vec x, 0)\, d^3x = (\varphi, \varphi) = \int \psi^*(\vec x, t)\, \psi(\vec x, t)\, d^3x .$$
This is in accordance with the Born interpretation of $\psi$. Namely, if $\psi^*\psi(\vec x, t)$ is the probability density, then $\int \psi^*\psi\, d^3x$ has to be preserved over time. In the above construction, the Hilbert space $\mathcal{H}$ is isomorphic to the space of square integrable functions $\varphi(\vec x) \equiv \psi(\vec x, 0)$, namely the set of square integrable initial data for the Schrödinger equation, cf. e.g. [5].
The connection between $\psi$ and $\varphi$ is given by the time evolution operator $U(0, t)$ (equivalently, by the Schrödinger equation):

\[ U(0, t)\varphi = \psi. \]

The correspondence between $\varphi$ and $\psi$ has all the formal properties of the Fourier construction above. Of course, the initial data for the Schrödinger equation do not cover the whole Hilbert space $\mathcal{H}$ of square integrable functions, but the time evolution given
J. Wawrzycki
by the Schrödinger equation can be uniquely extended over the whole Hilbert space $\mathcal{H}$ by the unitary evolution operator $U$.

The construction can be applied to the particle in the Newton-Cartan spacetime. As we implicitly assumed, the wave equation is such that the set of its admissible initial data is dense in the space of square integrable functions (we need this for the uniqueness of the extension). Because of the Born interpretation, the integral $\int \psi^*\psi\, d^3x$ has to be preserved over time. Let us denote the space of square-integrable initial data $\varphi$ on the simultaneity hyperplane $t(\mathbf{x}, t) = t$ by $\mathcal{H}_t$. Then the evolution is an isometry between $\mathcal{H}_0$ and $\mathcal{H}_t$. But such an isometry has to be a unitary operator, and the construction is well defined, i.e. the inner product of two states corresponding to the wave functions $\psi_1$ and $\psi_2$ does not depend on the choice of $\mathcal{H}_t$. Let us mention that the wave equation has to be linear in accordance with the Born interpretation of $\psi$ (since any unitary operator is linear, the time evolution operator is linear as well). The space of wave functions $\psi(\mathbf{x}, t) = U(0, t)\varphi(\mathbf{x})$, isomorphic to the Hilbert space $\mathcal{H}_0$ of the $\varphi$'s, is commonly called the “Schrödinger picture”.

In general, however, the connection between $\varphi(\mathbf{x})$ and $\psi(\mathbf{x}, t)$ is not unique if the wave equation possesses a gauge freedom. Namely, let us consider two states $\varphi_1$ and $\varphi_2$ and ask when these two states are equivalent, i.e. indistinguishable. The answer is that they are equivalent if

\[ |(\varphi_1, \varphi)| \equiv \left| \int \psi_1^*(\mathbf{x}, t)\,\psi(\mathbf{x}, t)\, d^3x \right| = |(\varphi_2, \varphi)| \equiv \left| \int \psi_2^*(\mathbf{x}, t)\,\psi(\mathbf{x}, t)\, d^3x \right| \tag{1} \]

for any state $\varphi$ from $\mathcal{H}$, or for all $\psi = U\varphi$ (the $\psi_i$ are defined by $\psi_i = U(0, t)\varphi_i$). By substituting $\varphi_1$ and then $\varphi_2$ for $\varphi$ and making use of the Schwarz inequality, one gets $\varphi_2 = e^{i\alpha}\varphi_1$, where $\alpha$ is any constant.¹ The situation for $\psi_1$ and $\psi_2$ is, however, different. In general, condition (1) is fulfilled if $\psi_2 = e^{i\xi(t)}\psi_1$, and the phase factor can depend on time.
Of course, this has to be consistent with the wave equation, that is, together with a solution $\psi$ of the wave equation, the wave function $e^{i\xi(t)}\psi$ is also a solution of the appropriately gauged wave equation. A priori one cannot exclude the existence of such a consistent time evolution. This is not a new observation, as it was noticed by John von Neumann,² but it seems that it has never been deeply investigated (probably because the ordinary nonrelativistic Schrödinger equation has gauge symmetry with constant $\xi$). The space of waves $\psi$ describing the system cannot be reduced in the above way to any fixed Hilbert space $\mathcal{H}_t$ with a fixed $t$. So, the existence of the nontrivial gauge freedom leads to the following

¹ This gives the conception of the ray, introduced to Quantum Mechanics by Hermann Weyl [H. Weyl, Gruppentheorie und Quantenmechanik, Verlag von S. Hirzel in Leipzig (1928)]: a physical state does not correspond uniquely to a normed state $\varphi \in \mathcal{H}$, but it is uniquely described by a ray; two states belong to the same ray if they differ by a constant phase factor.

² J. v. Neumann, Mathematical Principles of Quantum Mechanics, University Press, Princeton (1955). He did not mention gauge freedom on that occasion. However, gauge freedom is necessary for the equivalence of $\psi_1$ and $\psi_2 = e^{i\xi(t)}\psi_1$.
Hypothesis. The two waves $\psi$ and $e^{i\xi(t)}\psi$ are quantum-mechanically indistinguishable.

Moreover, we are obliged to use the whole Hilbert bundle $\mathcal{RH}\colon t \mapsto \mathcal{H}_t$ over time instead of a fixed Hilbert space $\mathcal{H}_t$, with the appropriate cross-sections $\psi$ as the waves (see the next section for details).

Let us now consider an action $T_r$ of a group $G$ in the space of waves $\psi$. Before we infer some consequences of the assumption that $G$ is a symmetry group, we need to state a

Classical-like Postulate. The group $G$ is a symmetry group if and only if the wave equation is invariant under the transformation $x' = rx$, $r \in G$, of the independent variables and the transformation $\psi' = T_r\psi$ of the wave function.

The above postulate is indeed commonly accepted in Quantum Mechanics even when the gauge freedom is not excluded. But it is merely the symmetry definition for a classical field equation applied to the wave equation without any change. The wave $\psi$ is not a classical quantity such as, e.g., the electromagnetic intensity. The above Hypothesis is not true for classical fields, and we have to be careful in forming the appropriate postulate for the wave equation compatible with the Hypothesis. Namely, two wave equations differing by a mere gauge are indistinguishable. We call them gauge-equivalent. It is therefore natural to assume the

Quantum Postulate. The group $G$ is a symmetry group if and only if the transformation $x' = rx$, $r \in G$, of the independent variables and the transformation $\psi' = T_r\psi$ of the wave function transform the wave equation into a gauge-equivalent one.

Note that not all possibilities admitted by the Hypothesis are included in the Classical-like Postulate. From the Classical-like Postulate it follows that $\psi$ as well as $T_r\psi$ are solutions of exactly the same wave equation, in view of the invariance of the equation. Therefore, $\psi$ and $T_r\psi$ belong to the same “Schrödinger picture”, so that

\[ T_r T_s \psi = e^{i\xi(r,s)}\, T_{rs}\psi, \]

with $\xi = \xi(r, s)$ independent of time $t$! This is in accordance with the known

Theorem 1. If $G$ is a symmetry group, then the exponent $\xi$ of the phase factor is time-independent.

But if we start from the Quantum Postulate, we obtain instead

\[ T_r T_s \psi = e^{i\xi(r,s,t)}\, T_{rs}\psi \tag{2} \]

and get

Theorem 1'. If $G$ is a symmetry group, then the exponent $\xi = \xi(r, s, t)$ of the phase factor is time-dependent in general.

In this paper we propose to accept the Quantum Postulate, which is compatible with the Hypothesis and is more in the spirit of Quantum Mechanics than the Classical-like Postulate. It should be noted that in the special case when gauge freedom degenerates to the case of a constant phase, the Quantum Postulate is equivalent to the Classical-like Postulate.
Acceptance of the Quantum Postulate throws some light on two very difficult problems: (a) the generally covariant formulation of Quantum Mechanics, and (b) the problems in Quantum Field Theory caused by gauge freedom. Moreover, with the help of the Hypothesis we can see that (a) and (b) are deeply interconnected.

Let us consider the standard treatment, in which the Hypothesis is not taken into account, $\mathcal{H}_0$ is assumed to be the Hilbert space of all states, and the Classical-like Postulate is accepted. Then a problem arises if we intend to formulate a representation theory of a covariance group, in contradistinction to a symmetry group. The troubles have their source in the fact that the covariance group transforms a solution of the wave equation into a solution of the transformed wave equation, but the transformed equation differs in form from the initial one. That is, $T_r\psi$ does not belong to the same “Schrödinger picture” as $\psi$, and $T_r$ does not act in the Hilbert space $\mathcal{H}_0$ of states. In view of the paradigm that any reasonable treatment of the action of any group in Quantum Mechanics reduces to a unitary representation of the group in the Hilbert space of states, there was no natural way of treating the covariance group. The difficulty disappears if we start from the Hypothesis and the Quantum Postulate. Now the states are the respective cross-sections in the bundle $\mathcal{RH}$, and $T_r$ unitarily transforms the fibre $\mathcal{H}_t$ onto the fibre $\mathcal{H}_{r^{-1}t}$, acting in the same space $\mathcal{RH}$ as the symmetry group. Note that the space of states degenerates to a fixed fibre $\mathcal{H}_0$ over $t = 0$ if gauge freedom degenerates to a constant phase.

Before we explain the connection to problem (b), let us make a general comment concerning relation (2). There is a physical motivation to investigate representations $T_r$ fulfilling (2) with $\xi$ depending on the spacetime point $p$:

\[ T_r T_s = e^{i\xi(r,s,p)}\, T_{rs}. \tag{3} \]

Namely, in Quantum Field Theory the spacetime coordinates of $p \in \mathcal{M}$ play the role of parameters in the same way as time does in the nonrelativistic theory (recall that, for example, the wave functions $\psi$ of the Fock space of the quantum electromagnetic field are functions of the Fourier components of the field, with the spacetime coordinates playing the role of parameters like time $t$ in nonrelativistic Quantum Mechanics). By this token, the two wave functions $\psi$ and $\psi' = e^{i\xi(p)}\psi$ are indistinguishable in the sense that they give the same transition probabilities: $|(\psi(p), \phi(p))| = |(\psi'(p), \phi(p))|$ for any $\phi$. In an analogous way, we get the Hilbert bundle $\mathcal{MH}$ over spacetime $\mathcal{M}$ and the respective cross-sections as the wave functions $\psi$ (see the next section for details).

Now, let us return to problem (b). The problems in QFT generated by gauge freedom are of a general character, and are well known. For example, there are no zero mass vector particles with helicity $= 1$, which is a consequence of the theory of unitary representations of the Poincaré group, as shown by Łopuszański [9]. This is apparently in contradiction with experiment, because the photon is a vector particle with helicity $= 1$. What is the solution of this paradox? First, let us describe the solution on the grounds of the existing theory, which constitutes at the same time the orthodox view. We observe that we can build a zero mass vector state with $h = 1$, but we must admit finite-dimensional irreducible, and thus non-unitary, representations of the small group, that is, the two-dimensional non-compact Euclidean group. Next, note that the representation of $G$ induced by the non-unitary representation of the small group remains “unitary” if we admit the inner product in the “Hilbert” space to be not positive definite, cf. [6] or [18]. However, even with the most favorable attitude towards
the orthodox view, this solution is rather obscure. We propose to proceed in another way. First, let us observe that the case of the free quantum vector field³ $A_\mu(x)$ with zero mass is exactly the same. As long as the inner product in the Hilbert space is positive definite, we are not able to introduce any vector potential which transforms as a vector field. However, we can introduce a local real electromagnetic field $F_{\mu\nu}(x) = -F_{\nu\mu}(x)$ which is a linear combination of a self-dual and an anti-self-dual field with helicity $+1$ and $-1$ respectively. If we introduce a vector potential $A_\mu$ in some Lorentz frame such that $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, then in another Lorentz frame we will have ($\Lambda^\nu{}_\mu$ is the Lorentz transformation matrix corresponding to the Poincaré transformation $r$)

\[ A_\mu(x) \to U_r A_\mu(x) U_r^{-1} = (\Lambda^{-1})_\mu{}^\nu A_\nu(r^{-1}x) + \partial_\mu R(r, x), \qquad \partial_\mu R \neq 0, \tag{4} \]

while $F_{\mu\nu}$ transforms as a tensor field. We infer that a gauge transformation of the second kind has to accompany the Poincaré transformation, or that gauge freedom is indispensable in the construction of the vector potential in quantum field theory, cf. [9] or [18, vol. I]. Let $\Omega$ denote the vacuum state. According to QFT, we should define a photon state $\psi_\mu(x)$ in the following way: $\psi_\mu(x) = A_\mu(x)\Omega$. It immediately follows from Eq. (4) that

\[ U_r \psi_\mu(x) = (\Lambda^{-1})_\mu{}^\nu \psi_\nu(r^{-1}x) + \partial_\mu \Theta(x), \tag{5} \]

where $\partial_\mu \Theta(x)$ denotes the vector-valued distribution $\partial_\mu R(x)\Omega$. The above representation spanned by the generalized vectors $\psi_\mu(x)$ induces a representation in the appropriate Hilbert space. Indeed, let us write $\varphi_\mu(x)$ and $\theta(x)$ for test functions which “smear” the distributions $\psi_\mu(x)$ and $\Theta(x)$ respectively. From formula (5) we get the transformation law for $\varphi_\mu$,

\[ T_r \varphi_\mu(x) = (\Lambda^{-1})_\mu{}^\nu \varphi_\nu(r^{-1}x) + \partial_\mu \theta(x). \tag{6} \]

By construction, the space of test functions is dense in the corresponding Hilbert space, and the above representation $T_r$ can be uniquely extended. As we are dealing with a gauge-invariant theory, the two quantum vector potentials $A_\mu(x)$ and $A_\mu(x) + \partial_\mu \chi(x)$ are unitarily equivalent. Accordingly, two photon states differing by a gradient, as well as their two corresponding vectors $\varphi_\mu(x)$ and $\varphi_\mu(x) + \partial_\mu \phi(x)$, should be unitarily equivalent. This means that it is more adequate to consider all $\varphi_\mu(x) + \partial_\mu \phi(x)$ instead of the respective $\varphi_\mu(x)$ alone. We write $\varphi_\mu(x) + \partial_\mu \phi(x)$ as a pair $\{\phi(x), \varphi_\mu(x)\}$. The action of our representation $T_r$ in the space of pairs $\{\phi(x), \varphi_\mu(x)\}$ is as follows:

\[ T_r \{\phi(x), \varphi_\mu(x)\} = \{\phi(r^{-1}x) + \theta(r, x),\; U_r \varphi_\mu(x)\}, \]

where $U_r$ acts as an ordinary vector transformation:

\[ U_r \varphi_\mu(x) = (\Lambda^{-1})_\mu{}^\nu \varphi_\nu(r^{-1}x). \]

³ This time $A_\mu(x)$ is an operator-valued distribution.
Moreover, we have

\[ T_r T_s \{\phi(x), \varphi_\mu(x)\} = T_{rs} \{\phi(x), \varphi_\mu(x)\} + \{\xi(r, s, x), 0\}, \tag{7} \]

where $\xi(r, s, x) = \theta(r, x) + \theta(s, r^{-1}x) - \theta(rs, x)$. But this is nothing else but a (generalized) ray representation of the Poincaré group $G$ fulfilling (3) with the spacetime-dependent exponent $\xi(r, s, x)$! To sum up the discussion concerning problem (b), we have just seen that the generalized representation in the sense of Eq. (3) seems to be indispensable if we wish to work with ordinary Hilbert spaces with positive norms while having a theory which describes photons.

We shall now resolve the following paradox. A natural question arises: why is the phase factor $e^{i\xi}$ in (2) time-independent for the Galilean group (even when the Galilean group is considered as a covariance group)? The explanation is as follows. The Galilean covariance group $G$ induces the representation $T_r$ in the space $\mathcal{RH}$ and fulfills (2). But, as we will show later on, the structure of $G$ is such that there always exists a function $\zeta(r, t)$, continuous in $r$ and differentiable in $t$, with the help of which one can define a new equivalent representation $T'_r = e^{i\zeta(r,t)} T_r$ fulfilling

\[ T'_r T'_s = e^{i\xi(r,s)}\, T'_{rs} \]

with a time-independent $\xi$. The representations $T_r$ and $T'_r$ are equivalent because $T_r\psi$ and $T'_r\psi$ are equivalent for all $r$ and $\psi$. However, this is not the case in general: when the exponent $\xi$ depends on time, this time dependence cannot always be eliminated in the same way as for the Galilean group. We have a similar situation when we try to find the most general wave equation for a nonrelativistic quantum particle in the Newton-Cartan spacetime. The relevant covariance group in this case is the Milne group, which possesses representations with time-dependent $\xi$ not equivalent to any representation with a time-independent $\xi$. Moreover, the only physical representations of the Milne group are those with time-dependent $\xi$.

The difference between the relativistic and nonrelativistic theories, where we consider the bundle over time and over spacetime, is artificial and depends on the structure of the corresponding group. Namely, we could consider the spacetime-dependent $\xi$ with $T_r$ acting in a Hilbert bundle over the spacetime $\mathcal{M}$ for the Galilean group as well. However, the relevant group structure is such that $T_r$ is equivalent to a representation $T'_r$ with time-dependent $\xi$. We have a similar situation with the Milne group.

Let us close this section with more speculative considerations, trying to answer the following question. We have already mentioned that spacetime coordinates are classical variables in QFT. What does this mean? We have interpreted the term ‘classical’ coordinate as indicating that spacetime coordinates are numerical parameters, in other words, that they do not form any operators. But could we interpret this word in a somewhat deeper sense? Already in the classic works of Dirac [3] and Heisenberg [8], spacetime coordinates are interpreted as representing a special kind of operators, called c-numbers. The spacetime coordinate operators $x^\mu \mathbf{1}$ commute with each other and with all other operators of our quantum algebra $R$, which we do not specify here. Here we propose to treat this interpretation seriously. Let us consider an algebra $a$ of functions $f(p)$ on spacetime $\mathcal{M}$ which encodes the geometry of $\mathcal{M}$. At the moment, the exact structure of $a$ is not important for us. However, it will be a commutative algebra with point-wise operations. For simplicity, we confine ourselves to the topological structure of $\mathcal{M}$ (i.e. the
algebra $a$ of continuous functions), assume $\mathcal{M}$ to be compact and $R \subset B(\mathcal{H})$. Let us form an operator $D_f = f(p)\mathbf{1}$ corresponding to the function $p \mapsto f(p)$ in our algebra $a$. It is therefore natural to assume that the set of all $D_f$, $f \in a$, composes in a natural manner an algebra $A$ isomorphic to $a$, i.e. the point-wise function multiplication in $a$ corresponds to the operator composition in $A$. This is precisely what we mean when assuming the spacetime coordinates to be classical. Moreover, it is also natural to assume that $A$ is closed with respect to the norm and dense in the center of $R$ with respect to the strong operator topology. But this is possible if the operators in $R$ are decomposable and act in a direct integral of Hilbert spaces $\mathcal{H}_p$, $p \in \mathcal{M}$, or in a Hilbert bundle $\mathcal{MH}$ over spacetime.

It is a peculiar property of the Galilean group structure that the whole construction degenerates, as if we started from an ordinary ray representation acting in an ordinary Hilbert space. Theorem 1 is then true in this case, but only incidentally. The generalization to the relativistic case is natural. We postulate the spacetime coordinates to be classical commutative variables, which leads to the Hilbert bundle $\mathcal{MH}$ over the spacetime manifold $\mathcal{M}$. The factor of the representation of the Poincaré group acting in the bundle $\mathcal{MH}$ does not have to be constant with respect to the spacetime coordinates even when it is a symmetry group.

In order to implement the above program consistently, we are forced to generalize Bargmann's theory of factors to embrace the spacetime-dependent factors of representations acting in a Hilbert bundle over spacetime (time). Similar ideas were already suggested in the early seventies by Piron, cf. [12, 13].

3. Generalized Wave Rays and Operator Rays

In this section we give strict mathematical definitions of the notions of the preceding section, and formulate the problem stated there in an exact way. From the purely mathematical point of view, the analysis of the spacetime-dependent $\xi(r, s, p)$ is more general, so at the outset we confine ourselves to this case.⁴

Let us recall some definitions, cf. e.g. [10]. Let $\mathcal{M}$ be a set endowed with an analytic Borel structure. By a Hilbert bundle over $\mathcal{M}$, or a Hilbert bundle with base $\mathcal{M}$, we shall mean an assignment $\mathcal{H}\colon p \mapsto \mathcal{H}_p$ of a Hilbert space $\mathcal{H}_p$ to each $p \in \mathcal{M}$. The set of all pairs $(p, \psi)$ with $\psi \in \mathcal{H}_p$ will be denoted by $\mathcal{MH}$ and called the space of the bundle. By a cross-section of our bundle we shall mean an assignment $\psi\colon p \mapsto \psi_p$ of a member of $\mathcal{H}_p$ to each $p \in \mathcal{M}$. If $\psi$ is a cross-section and $(p_0, \phi_0)$ a point of $\mathcal{MH}$, we may form the scalar product $(\phi_0, \psi_{p_0})$. In this way, every cross-section $\psi$ defines a complex-valued function $f_\psi$ on $\mathcal{MH}$. By a Borel Hilbert bundle we shall mean a Hilbert bundle together with an analytic Borel structure in $\mathcal{MH}$ such that the following conditions are fulfilled:

(1) Let $\pi(p, \psi) = p$. Then $E \subseteq \mathcal{M}$ is a Borel set if and only if $\pi^{-1}(E)$ is a Borel set in $\mathcal{MH}$.
(2) There exist countably many cross-sections $\psi^1, \psi^2, \ldots$ such that
  (a) the corresponding complex-valued functions on $\mathcal{MH}$ are Borel functions,
  (b) these Borel functions separate points in the sense that no two distinct points $(p_i, \phi_i)$ of $\mathcal{MH}$ assign the same values to all $\psi^j$ unless $\phi_1 = \phi_2 = 0$, and
  (c) $p \mapsto (\psi^i(p), \psi^j(p))$ is a Borel function for all $i$ and $j$.

⁴ It becomes clear in further analysis that the group $G$ in question has to fulfill the consistency condition requiring that for any $r \in G$, $rt$ is a function of time only in the case of the nonrelativistic theory with (2).
A cross-section is said to be a Borel cross-section if the function on $\mathcal{MH}$ defined by the cross-section is a Borel function. All Borel cross-sections compose a linear space under the obvious operations, cf. [10]. Now let $\mu$ be a measure on $\mathcal{M}$. The cross-section $p \mapsto \varphi_p$ is said to be square summable with respect to $\mu$ if

\[ \int_{\mathcal{M}} (\varphi_p, \varphi_p)\, d\mu(p) < \infty. \]

The space $L^2(\mathcal{M}, \mu, \mathcal{H})$ of all equivalence classes of square-summable cross-sections, where two cross-sections $\varphi$ and $\varphi'$ are in the same equivalence class if $\varphi_p = \varphi'_p$ for almost all $p \in \mathcal{M}$, forms a separable Hilbert space with the inner product given by

\[ (\varphi, \theta) = \int_{\mathcal{M}} (\varphi_p, \theta_p)\, d\mu(p), \]

cf. [10]. It is called the direct integral of the $\mathcal{H}_p$ with respect to $\mu$ and is denoted by $\int_{\mathcal{M}} \mathcal{H}_p\, d\mu(p)$.

The identification with the previous section is partially suggested by the notation itself. We shall make this identification more explicit. The set $\mathcal{M}$ plays the role of spacetime or of the real line $\mathbb{R}$ of time $t$ respectively. The wave functions $\psi$ of the preceding section are Borel cross-sections of $\mathcal{MH}$, but they do not belong to the subset $L^2(\mathcal{M}, \mu, \mathcal{H})$ of square-integrable cross-sections. It is the separate Hilbert spaces $\mathcal{H}_p$ with their inner products, rather than the inner product of their direct integral, that play a role in experiments. We have also used $\psi(p)$ and $\psi_p$, as well as $(\psi_p, \theta_p)$ and $(\psi, \theta)_p$, interchangeably.

The physical interpretation ascribed to the cross-section $\psi$ is as follows. Each experiment is, by its very nature, a spatiotemporal event. To each act of measurement carried out at the spacetime point $p_0$ we ascribe a self-adjoint operator $Q_{p_0}$ acting in the Hilbert space $\mathcal{H}_{p_0}$, assigning the standard interpretation to the spectral theorem for $Q_{p_0}$. Hence, assuming for simplicity that $Q_{p_0}$ possesses a discrete spectrum, if $\phi_0 \in \mathcal{H}_{p_0}$ and $\lambda_0 = \lambda_0(p_0)$ are a characteristic vector and its corresponding characteristic value of $Q_{p_0}$ respectively, then the following statement holds true. If the experiment corresponding to $Q_p$ was performed at the spatiotemporal event $p_0$ on a system in the state described by the cross-section $\psi$, then the probability that the measured value is $\lambda_0(p_0)$, and that the system is found after the experiment in the state described by a $\phi$ such that $\phi(p_0) = \phi_0$, is given by the square of the absolute value of the Borel function induced by the cross-section $\psi$: $|f_\psi(p_0, \phi_0)|^2 = |(\phi_0, \psi_{p_0})|^2$. In the nonrelativistic case, the above statement is a mere rephrasing of well-established knowledge.

By an isomorphism of the Hilbert bundle $\mathcal{MH}$ with the Hilbert bundle $\mathcal{M}'\mathcal{H}'$ we shall mean a Borel isomorphism $T$ of $\mathcal{MH}$ on $\mathcal{M}'\mathcal{H}'$ such that for each $p \in \mathcal{M}$ the restriction of $T$ to $p \times \mathcal{H}_p$ has some $q \times \mathcal{H}'_q$ for its range and is unitary when regarded as a map of $\mathcal{H}_p$ on $\mathcal{H}'_q$. The induced map carrying $p$ into $q$ is clearly a Borel isomorphism of $\mathcal{M}$ with $\mathcal{M}'$, and we denote it by $T^\pi$. The above-defined $T$ is said to be an automorphism if $\mathcal{MH} = \mathcal{M}'\mathcal{H}'$. Note that for any automorphism $T$ we have $(T\psi, T\phi)_{T^\pi p} = (\psi, \phi)_p$, but in general $(T\psi, T\phi)_p \neq (\psi, \phi)_p$. By this token, any automorphism $T$ is what is frequently called a bundle isometry.

The function $r \mapsto T_r$ from a group $G$ into the set of automorphisms (bundle isometries) of $\mathcal{MH}$ is said to be a general factor representation of $G$ associated to the action $G \times \mathcal{M} \ni (r, p) \mapsto r^{-1}p \in \mathcal{M}$ of $G$ on $\mathcal{M}$ if $T_r^\pi(p) \equiv r^{-1}p$ for all $r \in G$, and the $T_r$ satisfy condition (3).
Of course, $T_r$ is to be identified with that of the preceding section. Our further specializing assumptions, partly following from the above interpretation, are as follows. We assume $\mathcal{M}$ to be endowed with a manifold structure inducing a topology associated with the above-assumed Borel structure. We confine ourselves to a finite dimensional Lie group $G$ which acts smoothly and transitively on spacetime $\mathcal{M}$, such that a $G$-invariant measure $\mu$ exists on $\mathcal{M}$. By a factor representation of a Lie group we mean a general factor representation with the exponent $\xi(r, s, p)$ differentiable in $p \in \mathcal{M}$.

Now we define the operator ray $\boldsymbol{T}$ corresponding to a given bundle isometry operator $T$ to be the set of operators

\[ \boldsymbol{T} = \{\tau T : \tau = e^{i\zeta},\ \zeta \in D\}, \]

so that $|\tau| = 1$, where $D$ denotes the set of all differentiable real functions on $\mathcal{M}$. Any $T \in \boldsymbol{T}$ will be called a representative of the ray $\boldsymbol{T}$. The product $\boldsymbol{T}\boldsymbol{V}$ is defined as the set of all products $TV$ such that $T \in \boldsymbol{T}$ and $V \in \boldsymbol{V}$.

Note that not all Borel sections are physically realizable. By interpreting the discussion of the preceding section in the Hilbert bundle language, we see that the role of the Schrödinger equation is essentially to establish all the physical sections. Any two sections $\psi(p)$ and $\psi'(p) = e^{i\zeta(p)}\psi(p)$ are indistinguishable, giving the same probabilities $|f_\psi|^2 = |f_{\psi'}|^2$. Hence any group $G$ acting in $\mathcal{M}$ induces a ray representation of $G$, i.e. a mapping $r \mapsto \boldsymbol{T}_r$ of $G$ into the space of rays of bundle automorphisms (bundle isometries) of $\mathcal{MH}$, fulfilling the condition

\[ \boldsymbol{T}_r \boldsymbol{T}_s = \boldsymbol{T}_{rs}. \]

For any cross-section $\psi$ we define its corresponding ray $\boldsymbol{\psi} = \{e^{i\zeta(p)}\psi(p),\ \zeta \in D\}$. If $\psi$ is a physical cross-section, then we get the physical ray of the preceding section. Selecting a representative $T_r$ for each $\boldsymbol{T}_r$, we get a factor representation fulfilling (3). Note that $T_r$ transforms rays into rays, and we have $T_r(e^{i\xi(p)}\psi) = e^{i\xi_r(p)}\, T_r\psi$. Further on we assume that the operators $T_r$ are such that $\xi_r(p) = \xi(r^{-1}p)$, where $r^{-1}p$ denotes the action of $r^{-1} \in G$ on the spacetime point $p \in \mathcal{M}$. This is a natural assumption which does actually take place in practice.

Now we shall make the last assumption, namely that all transition probabilities vary continuously with the continuous variation of the coordinate transformation $s \in G$: for any element $r$ in $G$, any ray $\boldsymbol{\psi}$ and any positive $\epsilon$, there exists a neighborhood $N$ of $r$ on $G$ such that $d_p(\boldsymbol{T}_s\boldsymbol{\psi}, \boldsymbol{T}_r\boldsymbol{\psi}) < \epsilon$ if $s \in N$ and $p \in \mathcal{M}$, where

\[ d_p(\boldsymbol{\psi}_1, \boldsymbol{\psi}_2) = \inf_{\psi_i \in \boldsymbol{\psi}_i} \|\psi_1 - \psi_2\|_p = \sqrt{2\,\bigl|1 - |(\psi_1, \psi_2)_p|\bigr|}. \]

Based on the continuity assumption, one can prove the following

Theorem 2. Let $\boldsymbol{T}_r$ be a continuous ray representation of a group $G$. For all $r$ in a suitably chosen neighborhood $N_0$ of the unit element $e$ of $G$ one may select a strongly continuous set of representatives $T_r \in \boldsymbol{T}_r$. That is, for any compact set $C \subset \mathcal{M}$, any wave function $\psi$, any $r \in N_0$ and any positive $\epsilon$ there exists a neighborhood $N$ of $r$ such that $\|T_s\psi - T_r\psi\|_p < \epsilon$ if $s \in N$ and $p \in C$.

There are numerous possible selections of such factor representations. But many among them merely differ by a differentiable phase factor and are physically indistinguishable. We call them equivalent. Our task then is to classify all possible factor representations with respect to this equivalence.
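The closed form of the fibre-wise ray distance $d_p$ used in the continuity assumption above can be checked numerically in a single fibre. The sketch below is only an illustration under assumptions of ours: the fibre is replaced by $\mathbb{C}^3$, the vectors are normalized, the infimum over the two rays is approximated by a grid search over the relative phase only, and the helper names are hypothetical.

```python
import cmath
import math


def inner(a, b):
    """Inner product (a, b), conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def normalize(v):
    """Rescale v to unit norm."""
    length = math.sqrt(inner(v, v).real)
    return [x / length for x in v]


def ray_distance_closed_form(a, b):
    """sqrt(2 * |1 - |(a, b)||) for unit vectors a and b."""
    return math.sqrt(2.0 * abs(1.0 - abs(inner(a, b))))


def ray_distance_by_search(a, b, steps=4000):
    """Approximate inf over phases alpha of ||a - exp(i alpha) b|| on a grid."""
    best = float("inf")
    for n in range(steps):
        phase = cmath.exp(2j * math.pi * n / steps)
        diff = [x - phase * y for x, y in zip(a, b)]
        best = min(best, math.sqrt(inner(diff, diff).real))
    return best


# Two arbitrary unit vectors standing in for psi_1(p) and psi_2(p).
psi1 = normalize([1.0 + 0j, 1j, 0.5 + 0j])
psi2 = normalize([0.2 + 0j, 1.0 + 0j, -1j])
```

Only the relative phase matters because $\|e^{i\alpha_1}\psi_1 - e^{i\alpha_2}\psi_2\| = \|\psi_1 - e^{i(\alpha_2 - \alpha_1)}\psi_2\|$; the grid search therefore approaches the closed form from above as `steps` grows.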
4. Local Exponents

The representatives $T_r \in \boldsymbol{T}_r$ selected as in Theorem 2 will be called admissible, with the representation $T_r$ obtained in this way referred to as an admissible representation. There are infinitely many possibilities of such a selection of admissible representations $T_r$. We confine ourselves to the local admissible representations defined on a fixed neighborhood $N_0$ of $e \in G$, as in Theorem 2.

Let $T_r$ be an admissible representation. With the help of the phase $e^{i\zeta(r,p)}$, with a real function $\zeta(r, p)$ differentiable in $p$ and continuous in $r$, we can define

\[ T'_r = e^{i\zeta(r,p)}\, T_r, \tag{8} \]

which is a new admissible representation. This is trivial if one defines the continuity of $\zeta(r, p)$ in $r$ appropriately. Namely, from Theorem 2 it follows that the continuity has to be defined in the following way. The function $\zeta(r, p)$ will be called strongly continuous in $r$ at $r_0$ if and only if for any compact set $C \subset \mathcal{M}$ and any positive $\epsilon$ there exists a neighborhood $N_0$ of $r_0$ such that $|\zeta(r_0, p) - \zeta(r, p)| < \epsilon$ for all $r \in N_0$ and for all $p \in C$. But the converse is also true. Indeed, if $T'_r$ is also an admissible representation, then (8) has to be fulfilled for a real function $\zeta(r, p)$ differentiable in $p$, because $T_r$ and $T'_r$ belong to the same ray. Moreover, because both $T_r\psi$ and $T'_r\psi$ are strongly continuous (in $r$ for any $\psi$), $\zeta(r, p)$ has to be strongly continuous (in $r$).

Let $T_r$ be an admissible representation, and thus continuous in the sense indicated in Theorem 2. One can always choose the above $\zeta$ in such a way that $T_e = 1$, as will be assumed from now on. Because $T_r T_s$ and $T_{rs}$ belong to the same ray, one has

\[ T_r T_s = e^{i\xi(r,s,p)}\, T_{rs} \tag{9} \]

with a real function $\xi(r, s, p)$ differentiable in $p$. From the fact that $T_e = 1$, we have

\[ \xi(e, e, p) = 0. \tag{10} \]

From the associative law $(T_r T_s)T_g = T_r(T_s T_g)$ one gets

\[ \xi(r, s, p) + \xi(rs, g, p) = \xi(s, g, r^{-1}p) + \xi(r, sg, p). \tag{11} \]
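The step from the associative law to (11) can be spelled out as follows. This is a sketch that uses only relation (9), applied twice on each side, together with the rule $T_r(e^{i\xi(p)}\psi) = e^{i\xi(r^{-1}p)}\,T_r\psi$ assumed in Sect. 3.

```latex
\begin{align*}
(T_r T_s)\,T_g
  &= e^{i\xi(r,s,p)}\, T_{rs} T_g
   = e^{i[\xi(r,s,p) + \xi(rs,g,p)]}\, T_{rsg}, \\
T_r\,(T_s T_g)
  &= T_r\, e^{i\xi(s,g,p)}\, T_{sg}
   = e^{i\xi(s,g,r^{-1}p)}\, T_r T_{sg}
   = e^{i[\xi(s,g,r^{-1}p) + \xi(r,sg,p)]}\, T_{rsg}.
\end{align*}
```

Comparing the two phase factors gives (11) up to an integer multiple of $2\pi$; for the local exponents considered here this multiple is taken to vanish.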
Formula (11) is very important and our analysis largely rests on this relation. From the fact that the representation $T_r$ is admissible it follows that the exponent $\xi(r, s, p)$ is continuous in $r$ and $s$. Indeed, let us take a $\psi$ belonging to a unit ray $\boldsymbol{\psi}$. Then, making use of (9), we get

\[ e^{i\xi(r,s,p)}(T_{rs} - T_{r's'})\psi + T_{r'}(T_{s'} - T_s)\psi + (T_{r'} - T_r)T_s\psi = \bigl(e^{i\xi(r',s',p)} - e^{i\xi(r,s,p)}\bigr)\, T_{r's'}\psi. \]
Taking the norms $\|\cdot\|_p$ of both sides, we get

\[ \bigl|e^{i\xi(r',s',p)} - e^{i\xi(r,s,p)}\bigr| \le \|(T_{r's'} - T_{rs})\psi\|_p + \|T_{r'}(T_{s'} - T_s)\psi\|_p + \|(T_{r'} - T_r)T_s\psi\|_p. \]

From this inequality and the continuity of $T_r\psi$, the continuity of $\xi(r, s, p)$ in $r$ and $s$ follows. Moreover, from Theorem 2 and the above inequality follows the strong continuity of $\xi(r, s, p)$ in $r$ and $s$.

Formula (8) suggests the following definition. Two admissible representations $T_r$ and $T'_r$ are called equivalent if and only if $T'_r = e^{i\zeta(r,p)}\, T_r$ for some real function $\zeta(r, p)$ differentiable in $p$ and strongly continuous in $r$. Thus, making use of (9), we get $T'_r T'_s = e^{i\xi'(r,s,p)}\, T'_{rs}$, where

\[ \xi'(r, s, p) = \xi(r, s, p) + \zeta(r, p) + \zeta(s, r^{-1}p) - \zeta(rs, p). \tag{12} \]

Then the two exponents $\xi$ and $\xi'$ are equivalent if and only if (12) is fulfilled with $\zeta(r, p)$ strongly continuous in $r$ and differentiable in $p$. From (10) and (11) it immediately follows that

\[ \xi(r, e, p) = 0 \quad \text{and} \quad \xi(e, g, p) = 0, \tag{13} \]

\[ \xi(r, r^{-1}, p) = \xi(r^{-1}, r, r^{-1}p). \tag{14} \]
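For completeness, relations (13) and (14) can be read off from (11) by specializing the group elements; the particular substitutions below are our choice of a route and only use (10) and (11).

```latex
\begin{align*}
s = g = e:\quad
  & \xi(r,e,p) + \xi(r,e,p) = \xi(e,e,r^{-1}p) + \xi(r,e,p)
  && \Longrightarrow\ \xi(r,e,p) = 0, \\
r = s = e:\quad
  & \xi(e,e,p) + \xi(e,g,p) = \xi(e,g,p) + \xi(e,g,p)
  && \Longrightarrow\ \xi(e,g,p) = 0, \\
s = r^{-1},\ g = r:\quad
  & \xi(r,r^{-1},p) + \xi(e,r,p) = \xi(r^{-1},r,r^{-1}p) + \xi(r,e,p)
  && \Longrightarrow\ \xi(r,r^{-1},p) = \xi(r^{-1},r,r^{-1}p).
\end{align*}
```

In the first two lines $\xi(e,e,\cdot) = 0$ is (10); the third line uses the two results just obtained, i.e. (13).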
Relation (12) between $\xi$ and $\xi'$ will be written in short as

\[ \xi' = \xi + [\zeta]. \tag{15} \]
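That the correction term $[\zeta]$ in (15) is itself an exponent, i.e. satisfies (11), so that $\xi + [\zeta]$ satisfies (11) whenever $\xi$ does, can be checked numerically in a toy model. The sketch below rests on assumptions that are not taken from the text: $G$ is the additive group of real numbers acting on $\mathcal{M} = \mathbb{R}$ by translations ($r^{-1}p = p - r$), and `zeta` is an arbitrary smooth test function with $\zeta(e, p) = 0$. All function names are hypothetical.

```python
import math


def compose(r, s):
    """Group law of the toy group G = (R, +)."""
    return r + s


def act_inverse(r, p):
    """Action of r^{-1} on the point p: a translation by -r."""
    return p - r


def zeta(r, p):
    """Arbitrary smooth test function with zeta(0, p) = 0 (assumed)."""
    return r * math.sin(p) + 0.5 * r * r * p


def bracket(zeta_fn):
    """Return [zeta](r, s, p) = zeta(r, p) + zeta(s, r^{-1} p) - zeta(rs, p)."""
    def xi(r, s, p):
        return (zeta_fn(r, p)
                + zeta_fn(s, act_inverse(r, p))
                - zeta_fn(compose(r, s), p))
    return xi


def cocycle_defect(xi, r, s, g, p):
    """Left-hand side minus right-hand side of relation (11)."""
    lhs = xi(r, s, p) + xi(compose(r, s), g, p)
    rhs = xi(s, g, act_inverse(r, p)) + xi(r, compose(s, g), p)
    return lhs - rhs


xi_trivial = bracket(zeta)
```

Since (11) is linear in $\xi$, a vanishing defect for `xi_trivial` means that adding $[\zeta]$ to any exponent preserves (11); the normalization (10) is preserved because $\zeta(e, p) = 0$ in this example.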
The relation (12) between the exponents $\xi$ and $\xi'$ defines an equivalence relation, which preserves the linear structure.

We now introduce the group $H$, a very important notion for our further investigations. It is evident that all operators $T_r$ contained in all rays $\boldsymbol{T}_r$ form a group under multiplication. Indeed, let us consider an admissible representation $T_r$ with a well-defined $\xi(r, s, p)$ in formula (9). Because any $T_r \in \boldsymbol{T}_r$ has the form $e^{i\theta(p)}\, T_r$ (with a real and differentiable $\theta$), one has

\[ \bigl(e^{i\theta(p)}\, T_r\bigr)\bigl(e^{i\theta'(p)}\, T_s\bigr) = e^{i\{\theta(p) + \theta'(r^{-1}p) + \xi(r,s,p)\}}\, T_{rs}. \tag{16} \]

This important relation suggests the following definition of the local group $H$ connected with the admissible representation or with the exponent $\xi(r, s, p)$. Namely, $H$ consists of the pairs $\{\theta(p), r\}$, where $\theta(p)$ is a differentiable real function and $r \in G$. The multiplication rule, suggested by the above relation, is defined as follows:

\[ \{\theta(p), r\}\,\{\theta'(p), r'\} = \{\theta(p) + \theta'(r^{-1}p) + \xi(r, r', p),\; rr'\}. \tag{17} \]
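The multiplication rule (17) can be exercised in a small toy model. The sketch below is not the construction of the paper but an illustration under assumptions of ours: $G = (\mathbb{R}, +)$ acts on $\mathcal{M} = \mathbb{R}$ by translations, and the exponent is $\xi(r, s, p) = s r^2 - 2prs$, which is $[\zeta]$ for $\zeta(r, p) = r p^2$ and hence satisfies (11) (it is equivalent to zero and serves only to test the formulas). Elements of $H$ are represented as pairs `(theta, r)` with `theta` a Python callable; the inverse uses the formula $\{\theta(p), r\}^{-1} = \{-\theta(rp) - \xi(r, r^{-1}, rp),\ r^{-1}\}$ stated in the text after (17).

```python
import math


def compose(r, s):
    """Group law of the toy group G = (R, +)."""
    return r + s


def act(r, p):
    """Action of r on the point p (translation by r)."""
    return p + r


def act_inverse(r, p):
    """Action of r^{-1} on the point p (translation by -r)."""
    return p - r


def xi(r, s, p):
    """Toy exponent [zeta] with zeta(r, p) = r * p**2 (assumed example)."""
    return s * r * r - 2.0 * p * r * s


def multiply(a, b):
    """Product {theta, r}{theta', r'} according to rule (17)."""
    theta_a, r_a = a
    theta_b, r_b = b

    def theta(p):
        return theta_a(p) + theta_b(act_inverse(r_a, p)) + xi(r_a, r_b, p)

    return (theta, compose(r_a, r_b))


def inverse(a):
    """Inverse {-theta(rp) - xi(r, r^{-1}, rp), r^{-1}} of {theta, r}."""
    theta_a, r = a
    r_inv = -r

    def theta(p):
        rp = act(r, p)
        return -theta_a(rp) - xi(r, r_inv, rp)

    return (theta, r_inv)


UNIT = (lambda p: 0.0, 0.0)  # the unit element {0, e}


def same_element(a, b, points, tol=1e-9):
    """Compare two elements of H on a finite set of sample points."""
    if abs(a[1] - b[1]) > tol:
        return False
    return all(abs(a[0](p) - b[0](p)) <= tol for p in points)
```

With three sample elements, comparing `multiply(multiply(a, b), c)` with `multiply(a, multiply(b, c))` through `same_element` tests associativity, which in this setting is exactly relation (11) for the chosen `xi`.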
The associative law for this multiplication rule is equivalent to (11) (in complete analogy with the classical Bargmann theory). The pair $\check{e} = \{0, e\}$ plays the role of the unit element in $H$. For any element $\{\theta(p), r\} \in H$ there exists the inverse

\[ \{\theta(p), r\}^{-1} = \{-\theta(rp) - \xi(r, r^{-1}, rp),\; r^{-1}\}. \]

Indeed, from (14) it follows that $\{\theta, r\}^{-1}\{\theta, r\} = \{\theta, r\}\{\theta, r\}^{-1} = \check{e}$. The elements $\{\theta(p), e\}$ form an Abelian subgroup $N$ of $H$. Any $\{\theta, r\} \in H$ can be uniquely written as $\{\theta(p), r\} = \{\theta(p), e\}\{0, r\}$. The same element
can also be uniquely expressed in the form $\{\theta(p), r\} = \{0, r\}\cdot\{\theta(rp), e\}$. Thus, we have $H = N\cdot G = G\cdot N$. The Abelian subgroup N is a normal factor subgroup of H. But this time, G does not form any normal factor subgroup of H (contrary to the classical case investigated by Bargmann, when the exponents do not depend on p). So, this time H is not a direct product $N \otimes G$, but a semidirect product $N \rtimes G$, cf. e.g. [11]. In this case, however, the theorem that G is locally isomorphic to the factor group H/N is still valid, cf. [11]. The group H thus constitutes a semicentral extension of G, and not a central extension of G as in Bargmann's theory. The rest of this paper is based on the following reasoning (the author being largely inspired by Bargmann's work [1]). If the two exponents ξ and ξ′ are equivalent, that is ξ′ = ξ + [ζ], then the semicentral extensions H and H′ connected with ξ and ξ′ are homomorphic. The homomorphism $h : \{\theta, r\} \to \{\theta', r'\}$ is given by
$$\theta'(p) = \theta(p) - \zeta(r, p), \qquad r' = r. \tag{18}$$
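The multiplication rule (17), the unit element, the inverse formula and the splitting {θ, r} = {θ, e}·{0, r} can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions: G is taken to be the additive group of real numbers acting on M = ℝ by translation, the exponent is a trivial one built from a hypothetical function `zeta` in the form suggested by (16) and (18), and the helper names `mul`, `inv` and `close` are not from the paper.

```python
import math

# Toy setting (an assumption for illustration only): G = (R, +) acts on M = R
# by translation, r.p = p + r, hence r^{-1}p = p - r and e = 0.

def zeta(r, p):
    # Hypothetical choice; any function with zeta(0, p) = 0 would serve.
    return r * math.sin(p)

def xi(r, s, p):
    # Trivial exponent: zeta(r, p) + zeta(s, r^{-1}p) - zeta(rs, p).
    return zeta(r, p) + zeta(s, p - r) - zeta(r + s, p)

def mul(x, y):
    """Rule (17): {th, r}{th', r'} = {th(p) + th'(r^{-1}p) + xi(r, r', p), rr'}."""
    theta, r = x
    theta2, r2 = y
    return (lambda p: theta(p) + theta2(p - r) + xi(r, r2, p), r + r2)

def inv(x):
    """Inverse: {th, r}^{-1} = {-th(rp) - xi(r, r^{-1}, rp), r^{-1}}."""
    theta, r = x
    return (lambda p: -theta(p + r) - xi(r, -r, p + r), -r)

unit = (lambda p: 0.0, 0.0)  # the pair {0, e}

def close(x, y, points=(-1.3, 0.0, 0.7, 2.1), tol=1e-9):
    # Compare two pairs {theta, r} at a few sample points p.
    same_r = abs(x[1] - y[1]) < tol
    return same_r and all(abs(x[0](p) - y[0](p)) < tol for p in points)

a = (math.cos, 0.4)
b = (lambda p: p * p, -1.1)
c = (math.exp, 2.5)

assoc_ok = close(mul(mul(a, b), c), mul(a, mul(b, c)))
inverse_ok = close(mul(a, inv(a)), unit) and close(mul(inv(a), a), unit)
split_ok = close(a, mul((a[0], 0.0), (unit[0], a[1])))
print(assoc_ok, inverse_ok, split_ok)
```

Because the exponent used here is trivial by construction, the check says nothing about nontrivial exponents; it only confirms that the formulas for the product, the inverse and the splitting fit together.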
Using an Iwasawa-type construction we show that any exponent ξ(r, s, p) is equivalent to a differentiable one (in r and s). We can then confine ourselves to the differentiable ξ and ξ′. We show that ζ(r, p) is also a differentiable function of (r, p). Moreover, we show that any ξ is equivalent to a canonical one, that is, to a ξ which is differentiable and for which ξ(r, s, p) = 0 whenever r and s belong to the same one-parameter subgroup. Then we can restrict our investigation to the canonical ξ, considering the subgroup of all elements {θ(p), r} ∈ H with differentiable θ(p). For simplicity let us denote the subgroup by the same symbol H. We embed the subgroup in an infinite dimensional Lie group D with manifold structure modeled on a Banach space. Then we consider the subgroup $\overline{H}$ which is the closure of H in D. After this, $\overline{H}$ turns into a Lie group and the homomorphism (18) becomes an isomorphism of the two Lie groups. Thus, the group $\overline{H}$ has the Banach Lie algebra $\overline{\mathfrak{H}}$. We apply the general theory of analytic groups developed in [2] and [4]. From this theory it follows that the correspondence between the local $\overline{H}$ and $\overline{\mathfrak{H}}$ is bi-unique and one can construct uniquely the local group $\overline{H}$ from the algebra $\overline{\mathfrak{H}}$ as well. As we will see, the algebra defines a spacetime-dependent antilinear form Ξ on the Lie algebra $\mathfrak{G}$ of G, the so-called infinitesimal exponent Ξ. By this we reduce the classification of local ξ's which define $\overline{H}$'s to the classification of Ξ's which define $\overline{\mathfrak{H}}$'s. So, we will simplify the problem of the classification of local ξ's to a largely linear problem. Here are the details.

Iwasawa construction. Let us denote by $dr$ and $d^*r$ the left and right invariant Haar measure on G. Let ν(r) and ν*(r) be two infinitely differentiable functions on G with compact supports contained in the fixed neighborhood $N_0$ of e. By multiplying them by appropriate constants, we can always obtain $\int_G \nu(r)\,dr = \int_G \nu^*(r)\,d^*r = 1$.
Let ξ(r, s, p) be any admissible local exponent defined on $N_0$. We will construct a differentiable (in r and s) exponent ξ″(r, s, p) equivalent to ξ(r, s, p) and defined on $N_0$, in the following two steps: ξ′ = ξ + [ζ] and ξ″ = ξ′ + [ζ′], where ζ(r, p) is the left invariant integral of $l \to -\xi(r, l, p)\nu(l)$, while ζ′(r, p) is the right invariant integral of $u \to -\xi'(u, r, up)\nu^*(u)$. A rather simple computation in which we use (12) and (11) and the invariance property of the Haar measures shows that ξ″(r, s, p) is a differentiable (up to any order) exponent in all variables. Next we shall show that if two differentiable exponents ξ and ξ′ are equivalent, that is, if ξ′ = ξ + [ζ], then ζ(r, p) is differentiable in r. Clearly, the difference ξ′ − ξ is differentiable. Similarly, both (ξ′ − ξ)ν (with ν defined as above), as well as its left invariant integral η are differentiable. It is easy to show that ζ′ = η − ζ is differentiable. In this way we arrive at
differentiability of ζ = η − ζ′. A slightly more complicated argument shows that every (local) exponent of a one-parameter group is equivalent to zero. However, the argument is quite analogous to that of Bargmann. We can treat such a group as the additive group of real numbers, so that the first two arguments of ξ are real numbers. Let us set ϑ(τ, σ, p) as the derivative of ξ with respect to the second argument. It is not hard to show that ξ + [ζ] = 0, where ζ(τ, p) is the ordinary Riemann integral of $\mu \to \tau\vartheta(\mu\tau, 0, p)$ over the unit interval [0, 1]. But this means that ξ is equivalent to zero. Let us recall that the continuous curve r(τ) in a Lie group G is a one-parameter subgroup if and only if $r(\tau_1)r(\tau_2) = r(\tau_1 + \tau_2)$, i.e. $r(\tau) = (r_0)^{\tau}$ for some element $r_0 \in G$. (Please note that the real power $r^{\tau}$ is well defined on a Lie group, at least on some neighborhood of e.) The coordinates $\rho^k$ in G are canonical if and only if any curve of the form $r(\tau) = \tau\rho^k$ (where the coordinates $\rho^k$ are fixed) is a one-parameter subgroup (the curve $r(\tau) = \tau\rho^k$ will be denoted in short by τa, with the coordinates of a equal to $\rho^k$). The "vector" a is called by physicists the generator of the one-parameter subgroup τa. A local exponent ξ of a Lie group G is called canonical if ξ(r, s, p) is differentiable in all variables, and ξ(r, s, p) = 0 if r and s are elements of the same one-parameter subgroup. Almost the same argument used to show that every ξ on a one-parameter group is equivalent to zero also shows that every local exponent ξ of a Lie group is equivalent to a canonical local exponent. In order to prove this, we apply the argument to the exponent $\xi_0(\tau, \sigma, p) := \xi(\tau a, \sigma a, p)$, cf. [16]. Up to now, the argumentation has been more or less analogous to that of Bargmann. From now on, the argumentation becomes entirely different. Let ξ and ξ′ be two differentiable and equivalent local exponents of a Lie group G, assuming ξ to be canonical.
Then ξ′ is canonical if and only if ξ′ = ξ + [Λ], where Λ(r, p) is a linear form in the canonical coordinates of r fulfilling the condition that Λ(a, (τa)p) is constant as a function of τ, i.e. it follows that⁵
$$a\,\Lambda(a, p) = \left.\frac{d\Lambda(a, (\tau a)p)}{d\tau}\right|_{\tau=0} = \lim_{\epsilon\to 0}\frac{\Lambda(a, (\epsilon a)p) - \Lambda(a, p)}{\epsilon} = 0. \tag{19}$$
While sufficiency of the condition in the above statement is almost evident, proving its necessity is quite nontrivial. Hereafter we outline the argumentation. Because the exponents are equivalent we have ξ′(r, s, p) = ξ(r, s, p) + [ζ]. Since both ξ and ξ′ are differentiable, ζ(r, p) is also a differentiable function, which follows from what has been said above. Let us suppose that r = τa and s = τ′a. Because both ξ and ξ′ are canonical, we have ξ(τa, τ′a, p) = ξ′(τa, τ′a, p) = 0, so that [ζ](τa, τ′a, p) = 0. Applying the last formula recurrently one gets
$$\zeta(\tau a, p) = \sum_{k=0}^{n-1} \zeta\Big(\frac{\tau}{n}a,\ \Big(-\frac{k}{n}\tau a\Big)p\Big).$$
5 The limit in the expression can be understood in the ordinary point-wise sense with respect to p, but also in any linear topology in the function linear space (with obvious addition) of θ(p), provided that p → Λ(a, p) is differentiable in the sense of this linear topology. Further on, the simple notation
$$a f(p) = \left.\frac{d f((\tau a)p)}{d\tau}\right|_{\tau=0} = \lim_{\epsilon\to 0}\frac{f((\epsilon a)p) - f(p)}{\epsilon}$$
will be used.
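The recursion leading to the sum over k can be unpacked as follows. This is only a sketch, assuming that the bracket has the form $[\zeta](r,s,p) = \zeta(r,p) + \zeta(s, r^{-1}p) - \zeta(rs,p)$ suggested by (16) and (18); an overall sign would not affect the argument, since only the vanishing of $[\zeta]$ on a one-parameter subgroup is used.

```latex
\begin{aligned}
[\zeta](\sigma a, \sigma' a, p) = 0
  &\;\Longrightarrow\;
  \zeta((\sigma+\sigma')a, p) = \zeta(\sigma a, p) + \zeta(\sigma' a, (-\sigma a)p), \\
\sigma = \tfrac{k}{n}\tau,\quad \sigma' = \tfrac{\tau}{n}
  &\;\Longrightarrow\;
  \zeta\big(\tfrac{k+1}{n}\tau a, p\big) - \zeta\big(\tfrac{k}{n}\tau a, p\big)
  = \zeta\big(\tfrac{\tau}{n}a, (-\tfrac{k}{n}\tau a)p\big), \\
\text{summing over } k = 0, \dots, n-1
  &\;\Longrightarrow\;
  \zeta(\tau a, p) - \zeta(0, p)
  = \sum_{k=0}^{n-1} \zeta\big(\tfrac{\tau}{n}a, (-\tfrac{k}{n}\tau a)p\big).
\end{aligned}
```

Setting σ = σ′ = 0 in the first line gives ζ(0, p) = 0, so the left-hand side of the last line is ζ(τa, p). Each summand carries the small group parameter τ/n, which is why the sum behaves like a Riemann sum in the limit n → +∞.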
Then we apply the Taylor Theorem to each summand in the above expression, and pass to the limit n → +∞. In this way, we obtain
$$\zeta(\tau a, p) = \int_0^{\tau} \varsigma(a, (-\sigma a)p)\, d\sigma, \tag{20}$$
where ς = ς(r, p) is a differentiable function, cf. [16]. If we now differentiate expression (20) with respect to τ at τ = 0, we immediately see that the function ς(a, p) is linear with respect to a. Let us suppose that the spacetime coordinates are chosen in such a way that the integral curves $p(x) = (xa)p_0$ are coordinate lines, which is possible for appropriately small x. There are of course three remaining families of coordinate lines besides p(x), which can be chosen in an arbitrary way, with their parameters denoted by $y_i$. After this,
$$\zeta(a, x, y_i) = \frac{1}{\tau}\int_0^{\tau} \varsigma(a, x - \sigma, y_i)\, d\sigma = \frac{1}{\tau}\int_{x-\tau}^{x} \varsigma(a, z, y_i)\, dz,$$
for any τ (of course, with appropriately small |τ|, in our case |τ| ≤ 1) and for any (appropriately small) x. But this is possible for the function $\varsigma(a, x, y_k)$ continuous in x (in our case, differentiable in x) if and only if $\varsigma(a, x, y_k)$ does not depend on x. This means that $\zeta(a, x, y_k)$ does not depend on x and the condition of the statement is hereby proved.

Infinitesimal exponents and embedding of H in a Lie group D. According to what has been shown already, we can assume that the exponent is canonical. We also confine ourselves to the subgroup of {θ(p), r} ∈ H with differentiable θ, and denote this subgroup by the same letter H. We embed this subgroup H in an infinite dimensional Lie group with the manifold structure modeled on a Banach space. We will extensively use the theory developed by Birkhoff [2] and Dynkin [4]. For the systematic treatment of manifolds modeled on Banach spaces, see e.g. [7]. By this embedding we ascribe bi-uniquely a Lie algebra to the group H with the convergent Baker-Hausdorff series. Please note first that the formula $H \times L^2(M, \mu, \mathcal{H}) \ni (\{\theta(p), r\}, \phi) \to e^{i\theta(p)} T_r \phi$ (together with (16)) can be viewed as a rule giving the action of H in the direct integral Hilbert space $\int_M \mathcal{H}_p\, d\mu(p)$ defined in Sect. 3. Moreover, this is a unitary action, provided µ is G-invariant.
In accordance with [2], the group D of all unitary operators of a Hilbert space is an infinite dimensional Lie group. Hence, $H = N \rtimes G$ can be viewed as a subgroup of a Lie group. We consider now the closure $\overline{H}$ of H in the sense of the topology in D. It is remarkable that the subgroup $\overline{H}$ has locally the structure of the semi-direct product $\overline{N} \rtimes G$ as well. This is a consequence of the following four facts.
(1) N is a normal subgroup of $H = N \rtimes G$.
(2) G is finite dimensional, so $\overline{G} = G$.
(3) Locally (in a neighborhood O), the multiplication in D is given by the Baker-Hausdorff formula in the Banach algebra of D. Because N is normal in H, the above exponential mapping converts locally the multiplication NS of N by any subset S of H into the sum N + S. Because G is finite-dimensional, and hence locally compact, the neighborhood O can be chosen in such a way that locally (in the closure of O + O) the following holds: $\overline{N} + G = \overline{N + G} = \overline{H}$.
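(4) The local $\overline{N}$ (intersected with O) has a finite co-dimension in the local $\overline{N} + G$ (intersected with O + O) and thus it splits locally $\overline{N} + G$. So, we have locally, i.e. in O + O: $\overline{N} + G = \overline{H} = \overline{N} \oplus G'$, where $G' \cong G$ and ⊕ stands for a direct sum. From this it follows that G′ = G locally. This shows that $\overline{H} = \overline{N} \rtimes G$.
Because $\overline{H} = \overline{N} \rtimes G$, every $h \in \overline{H}$ is uniquely representable in the form ng, where $n \in \overline{N}$ and g ∈ G. Please note now that $(n_1 g_1)(n_2 g_2) = n_1 g_1 n_2 g_1^{-1} g_1 g_2 = [n_1 (g_1 n_2 g_1^{-1})](g_1 g_2)$ and that $g_1 n_2 g_1^{-1} \in \overline{N}$ because $\overline{N}$ is normal in $\overline{H}$. Let us denote the automorphism $n \to g n g^{-1}$ of $\overline{N}$ by $R_g$. The group $\overline{H}$ can be locally viewed as a topological product of Banach spaces $\overline{\mathfrak{N}} \times \mathfrak{G}$, one of which (namely $\mathfrak{G}$) is finite-dimensional and isomorphic to the Lie algebra of G. The multiplication in $\overline{H}$ can be written as $(n_1, g_1)(n_2, g_2) = (n_1 R_{g_1}(n_2),\ g_1 g_2)$. Moreover, $\overline{N}$ can be viewed locally as the Banach space $\overline{\mathfrak{N}}$ with the multiplication law given by the vector addition in $\overline{\mathfrak{N}}$. Our task now is to reconstruct the Lie algebra $\overline{\mathfrak{H}}$ corresponding to the subgroup $\overline{H}$. Let λ → λa be a one-parameter subgroup of G. The mapping $(\lambda, n) \to (R_{\lambda a} n, \lambda a)$ of the Banach space $\mathbb{R} \times \overline{\mathfrak{N}}$ into the Banach space $\overline{\mathfrak{N}} \times \mathfrak{G}$ is continuous. In consequence, $\mathbb{R} \ni \lambda \to R_{\lambda a} n \in \overline{\mathfrak{N}}$ as well as $\overline{\mathfrak{N}} \ni n \to R_{\lambda a} n$ are continuous. Therefore, the function $\lambda \to R_{\lambda a} n$ can be integrated over any compact interval and
$$\tau \to (n_{\tau a}, \tau a) := \Big(\int_0^{\tau} R_{\sigma a} n\, d\sigma,\ \tau a\Big)$$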
is a one-parameter subgroup of $\overline{H}$ with generator⁶ (n, a), cf. [2, 4]. Having obtained this, we are able to reconstruct the algebra. The elements of $H \subset \overline{H}$ are representable in the ordinary form {α, r} with differentiable α = α(p), p ∈ M, and r ∈ G. Let us consider the above-defined operator $R_{\lambda a}$. Its restriction to $H \subset \overline{H}$ is given by (please remember that ξ is canonical) $\alpha(p) \to (R_{\lambda a}\alpha)(p) = \alpha((\lambda a)^{-1}p)$. We can now compute explicitly the Lie bracket and the Jacobi identity for all the elements $\check{a} = \{\alpha(p), a\}$, $\check{b} = \{\beta(p), b\}$ of the subalgebra $\mathfrak{H} \subset \overline{\mathfrak{H}}$ corresponding to the subgroup H. The result is as follows⁷
$$[\check{a}, \check{b}] = \{a\beta - b\alpha + \Xi(a, b, p),\ [a, b]\}, \tag{22}$$
6 The limit process with the help of which the generator is computed refers to the topology in D, of course.
7 Let us stress once more that
$$a\theta(p) = \lim_{\epsilon\to 0}\frac{\theta((\epsilon a)p) - \theta(p)}{\epsilon}, \tag{21}$$
and the limit is in the sense of the topology induced from the Lie group D.
where
$$\Xi(a, b, p) = \lim_{\tau\to 0} \tau^{-2}\big\{\xi((\tau a)(\tau b), (\tau a)^{-1}(\tau b)^{-1}, p) + \xi(\tau a, \tau b, p) + \xi((\tau a)^{-1}, (\tau b)^{-1}, (\tau b)^{-1}(\tau a)^{-1}p)\big\}. \tag{23}$$
From the associative law in $\overline{H}$ one gets
$$\Xi([a, a'], a'', p) + \Xi([a', a''], a, p) + \Xi([a'', a], a', p) = a\,\Xi(a', a'', p) + a'\,\Xi(a'', a, p) + a''\,\Xi(a, a', p), \tag{24}$$
which can be shown to be equivalent to the Jacobi identity
$$[[\check{a}, \check{a}'], \check{a}''] + [[\check{a}', \check{a}''], \check{a}] + [[\check{a}'', \check{a}], \check{a}'] = 0. \tag{25}$$
Thus, in this way we have reconstructed the Lie algebra $\overline{\mathfrak{H}}$, giving explicitly $[\check{a}, \check{b}]$ for all $\check{a}, \check{b} \in \mathfrak{H} \subseteq \overline{\mathfrak{H}}$. Because $\mathfrak{H}$ is dense in $\overline{\mathfrak{H}}$, the local exponent determines the algebra $\overline{\mathfrak{H}}$ uniquely. But from the theory of Lie groups the correspondence between the algebras $\overline{\mathfrak{H}}$ and local Lie groups $\overline{H}$ is bi-unique, at least locally, cf. e.g. [2] and [4]. Therefore, the correspondence $\overline{H} \to \overline{\mathfrak{H}}$ between the local group $\overline{H}$ and the algebra $\overline{\mathfrak{H}}$ is one-to-one. Because the exponent ξ determines the multiplication rule in $\overline{H}$ and vice-versa, it follows that the correspondence ξ → Ξ between the local ξ and the infinitesimal exponent Ξ is one-to-one. Please note that the term 'local ξ = ξ(r, s, p)' means that ξ(r, s, p) is defined for r and s belonging to a fixed neighborhood $N_0 \subset G$ of e ∈ G, but in our case it is defined globally as a function of the spacetime variable p ∈ M.

Infinitesimal exponents and local exponents. Now, let us move on to describing the relation between the infinitesimal exponents Ξ and the local exponents ξ. First, let us compute the infinitesimal exponents Ξ and Ξ′ given by (23), which correspond to the two equivalent canonical local exponents ξ and ξ′ = ξ + [Λ]. Inserting ξ′ = ξ + [Λ] into formula (23), one gets
$$\Xi'(a, b, p) = \Xi(a, b, p) + a\,\Lambda(b, p) - b\,\Lambda(a, p) - \Lambda([a, b], p). \tag{26}$$
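A short check, not spelled out in the text, of why the correction term in (26) is compatible with antisymmetry and with identity (24). The symbol Λ is used here for the linear form and d[Λ] for the correction term in (26). The computation assumes, as a labeled assumption, that the generators act on functions of p as differential operators whose commutator represents the Lie bracket, $(a a' - a' a) f = [a, a'] f$, in the sign convention of the paper; sums are over cyclic permutations of (a, a′, a″).

```latex
\begin{aligned}
d[\Lambda](a,b,p) &:= a\,\Lambda(b,p) - b\,\Lambda(a,p) - \Lambda([a,b],p)
  \;=\; -\,d[\Lambda](b,a,p), \\[4pt]
\sum_{\mathrm{cyc}} d[\Lambda]([a,a'],a'',p)
  &= \sum_{\mathrm{cyc}} \Big( [a,a']\,\Lambda(a'',p) - a''\,\Lambda([a,a'],p)
     - \Lambda([[a,a'],a''],p) \Big) \\
  &= \sum_{\mathrm{cyc}} \Big( [a,a']\,\Lambda(a'',p) - a''\,\Lambda([a,a'],p) \Big), \\[4pt]
\sum_{\mathrm{cyc}} a\; d[\Lambda](a',a'',p)
  &= \sum_{\mathrm{cyc}} \Big( a a'\,\Lambda(a'',p) - a a''\,\Lambda(a',p)
     - a\,\Lambda([a',a''],p) \Big) \\
  &= \sum_{\mathrm{cyc}} \Big( (a a' - a' a)\,\Lambda(a'',p) - a''\,\Lambda([a,a'],p) \Big).
\end{aligned}
```

The triple-bracket terms cancel by the Jacobi identity in the Lie algebra together with linearity of Λ in its first argument. Under the stated assumption the two cyclic sums coincide, so d[Λ] alone satisfies (24); since (24) is linear in Ξ, adding d[Λ] to an antisymmetric solution gives another antisymmetric solution.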
According to what has been said, we can confine ourselves to the canonical exponents. Then, as one of our previous statements said, Λ = Λ(a, (τb)p) is a constant function of τ if a = b, and Λ(a, p) is linear with respect to a (we use the canonical coordinates on G). Hence Ξ′(a, b, p) is antisymmetric in a and b and fulfills (24) only if Ξ(a, b, p) is antisymmetric in a and b and fulfills (24). This suggests the definition: two infinitesimal exponents Ξ and Ξ′ will be called equivalent if and only if relation (26) holds. For brevity, we write relation (26) as follows: Ξ′ = Ξ + d[Λ]. Finally, we maintain that two canonical local exponents ξ and ξ′ are equivalent if and only if the corresponding infinitesimal exponents Ξ and Ξ′ are equivalent. Indeed: (1) Assume ξ and ξ′ to be equivalent. Then, by the definition of equivalence of infinitesimal exponents, Ξ′ = Ξ + d[Λ]. (2) Assume Ξ and Ξ′ to be equivalent: Ξ′ = Ξ + d[Λ] for some linear form Λ(a, p) such that Λ(a, (τa)p) does not depend on τ. Then ξ + [Λ] → Ξ′, and by the uniqueness of the correspondence ξ → Ξ we have ξ′ = ξ + [Λ], i.e. ξ and ξ′ are equivalent. In this way, we arrive at the following:
Theorem 3. (1) On a Lie group G, every local exponent ξ(r, s, p) is equivalent to a canonical local exponent ξ′(r, s, p) which, on some canonical neighborhood $N_0$, is analytic in canonical coordinates of r and s, and vanishes if r and s belong to the same one-parameter subgroup. Two canonical local exponents ξ, ξ′ are equivalent if and only if ξ′ = ξ + [Λ] on some canonical neighborhood, where Λ(r, p) is a linear form in the canonical coordinates of r such that Λ(r, sp) does not depend on s if s belongs to the same one-parameter subgroup as r. (2) To every canonical local exponent of G there corresponds uniquely an infinitesimal exponent Ξ(a, b, p) on the Lie algebra $\mathfrak{G}$ of G, i.e. a bilinear antisymmetric form which satisfies the identity $\Xi([a, a'], a'', p) + \Xi([a', a''], a, p) + \Xi([a'', a], a', p) = a\,\Xi(a', a'', p) + a'\,\Xi(a'', a, p) + a''\,\Xi(a, a', p)$. The correspondence is linear. (3) Two canonical local exponents ξ, ξ′ are equivalent if and only if the corresponding Ξ, Ξ′ are equivalent, i.e. $\Xi'(a, b, p) = \Xi(a, b, p) + a\,\Lambda(b, p) - b\,\Lambda(a, p) - \Lambda([a, b], p)$, where Λ(a, p) is a linear form in a on $\mathfrak{G}$ such that τ → Λ(a, (τb)p) is constant if a = b. (4) There exists a one-to-one correspondence between the equivalence classes of local exponents ξ (global in p ∈ M) of G and the equivalence classes of infinitesimal exponents Ξ of G.

5. Global Extensions of Local Exponents

Theorem 3 provides a full classification of exponents ξ(r, s, p) local in r and s, defined for all p ∈ M. But if G is both connected and simply connected, then we have the following theorems. (1) If an extension ξ′ of a given local (in r and s) exponent ξ does exist, then it is uniquely determined (up to the equivalence transformation (12)) (Theorem 4). (2) There exists such an extension ξ′ (Theorem 5); this is proved only for those G which possess a finite-dimensional extension H. We are not able to prove that the (global) homomorphism (18) is continuous when ξ is not canonical.
Please note that any ξ is equivalent to its canonical counterpart, but only locally! This is why the topology of H induced from D is not applicable in the global analysis. We introduce another topology. Because of the semidirect structure of $H = N \rtimes G$, it is sufficient to introduce it into N and G separately in such a manner that G acts continuously on N, cf. e.g. [10]. From the discussion of Sect. 4 it is sufficient to introduce the Fréchet topology of almost uniform convergence in the function space N. Indeed, from the strong continuity of ξ and ζ in (18) it follows that the multiplication rule as well as the homomorphism (18) are continuous.

Theorem 4. Let ξ and ξ′ be two equivalent local exponents of a connected and simply connected group G, so that ξ′ = ξ + [ζ] on some neighborhood, assuming the exponents $\xi_1$ and $\xi'_1$ of G to be extensions of ξ and ξ′ respectively. Then, for all r, s ∈ G: $\xi'_1(r, s, p) = \xi_1(r, s, p) + [\zeta_1]$, where $\zeta_1(r, p)$ is strongly continuous in r and differentiable in p, and $\zeta_1(r, p) = \zeta(r, p)$ for all p ∈ M and for all r belonging to some neighborhood of e ∈ G.

Here is the proof outline. The two exponents $\xi_1$ and $\xi'_1$, being strongly continuous (by assumption), define two semicentral extensions $H_1 = N_1 \rtimes G$ and $H'_1 = N'_1 \rtimes G$, which are continuous groups. Please note that the linear groups $N_1$, $N'_1$ are connected and simply connected. Because both $H_1$ and $H'_1$ are semi-direct products of two connected and simply connected groups, they are both connected and simply connected. Equation (18) defines a local isomorphism mapping $h : \check{r} \to \check{r}' = h(\check{r})$ of $H_1$ into $H'_1$. Because $H_1$ and $H'_1$ are connected and simply connected, the isomorphism h given by (18) can be uniquely extended to an isomorphism $h_1$ of the entire groups $H_1$ and
$H'_1$ such that $h_1(\check{r}) = h(\check{r})$ on some neighborhood of $H_1$, cf. [14], Theorem 80. The isomorphism $h_1$ defines an isomorphism of the two Abelian subgroups $N_1$ and $h_1(N_1)$. By (18), $h_1(\theta, e) = \{\theta, e\}$ locally in $H_1$, that is for θ lying appropriately close to 0 (in the metric sense defined previously). Both $N_1$ and $h_1(N_1)$ are connected, and $N_1$ is in addition simply connected, so applying once again Theorem 80 of [14], one can see that $h_1(\theta, e) = \{\theta, e\}$ for all θ. A rather simple computation shows that $\zeta_1$ defined by the equality $h_1(0, r) = \{-\zeta_1(p), g(r)\}$ fulfills the conditions of our theorem. The following theorem is proved for the group G with a finite-dimensional extended algebra $\mathfrak{H}$.

Theorem 5. Let G be a connected and simply connected Lie group. Then to every exponent ξ(r, s, p) of G defined locally in (r, s) there exists an exponent $\xi_0$ of G defined on the whole group G which is an extension of ξ. If ξ is differentiable, $\xi_0$ may be chosen differentiable.

Because the proof of Theorem 5 is almost identical to that of Theorem 5.1 in [1], we do not present it explicitly⁸. Please note that the proof rests largely on the global theory of classical (finite-dimensional) Lie groups. Namely, it rests on the theorem that to any finite dimensional Lie group there always exists a universal covering group. We can use those methods because of the existence of a finite-dimensional extension H of G. We have obtained the full classification of time-dependent ξ defined on the whole group G for Lie groups G which are connected and simply connected in nonrelativistic theory. But for any Lie group G there exists a universal covering group G* which is connected and simply connected. Thus, for G* the correspondence ξ → Ξ is one-to-one, that is, to every ξ there exists a unique Ξ and vice versa, to every Ξ corresponds a unique ξ defined on the whole group G*, and the correspondence preserves the equivalence relation.
Because G and G* are locally isomorphic, the infinitesimal exponents Ξ are exactly the same for G and for G*. Since to every Ξ there exists exactly one ξ on G*, if to a given Ξ there exists the corresponding ξ on the whole G, then such a ξ is unique. In this way, we have obtained the full classification of ξ defined on a whole Lie group G for any Lie group G, in the sense that no ξ can be omitted in the classification. The set of equivalence classes of ξ is considerably smaller than that for Ξ; it may happen that to some local ξ there does not exist any global extension.
6. Examples

Example 1: The Galilean Group. According to the conclusions of Sect. 2 one should a priori investigate such representations of the Galilean group G which fulfill Eq. (2), with ξ depending on time. Then, the following paradox arises. Why has the transformation law $T_r$ under the Galilean group a time-independent ξ in (2), regardless of whether it is a covariance group or a symmetry group? We will solve the paradox in this subsection. Namely, we will show that any representation of the Galilean group fulfilling (2) is equivalent to a representation fulfilling (2) with time-independent ξ. This is a rather
8 In this proof we consider the finite-dimensional extension H of G instead of the Lie group H in the proof presented in [1]. The remaining replacements are rather trivial, but we mark them here explicitly to simplify the reading. (1) Instead of the formula $\bar{r} = \bar{t}(\theta)\bar{r} = \bar{r}\,\bar{t}(\theta)$ of (5.3) in [1], we have $\check{r} = \check{t}(\theta(r^{-1}p))\check{r} = \check{r}\,\check{t}(\theta(p))$. Thus, from the formula $(\check{h}_1(r)\check{h}_1(s))\check{h}_1(g) = \check{h}_1(r)(\check{h}_1(s)\check{h}_1(g))$ (see [1]) it follows that $\xi(r, s, p) + \xi(rs, g, p) = \xi(s, g, r^{-1}p) + \xi(r, sg, p)$ instead of (5.8) in [1]. (2) Instead of (4.9), (4.10) and (4.11) we use the Iwasawa-type construction presented in this paper.
peculiar property of the Galilean group, not valid in general. For example, this is not true for the group of Milne transformations. In nonrelativistic theory ξ = ξ(r, s, t) depends on the time. In this case, according to our assumption about G, any r ∈ G transforms simultaneous hyperplanes into simultaneous hyperplanes. Thus, there are two possibilities for any r ∈ G. First, when r does not change time: t(rp) = t(p), and the second in which time is changed, but in such a way that t(rp) − t(p) = f(t). We assume in addition that the base generators $a_k \in \mathfrak{G}$ can be chosen in such a way that only one acts on time as translation and the remaining ones do not act on time. We can assume that the operators a are ordinary differential operators. Hence, the Jacobi identity (24) reads
$$\Xi([a, a'], a'') + \Xi([a', a''], a) + \Xi([a'', a], a') = \partial_t\, \Xi(a', a''), \tag{27}$$
if one and only one among a, a′, a″ is the time-translation generator, namely a, and
$$\Xi([a, a'], a'') + \Xi([a', a''], a) + \Xi([a'', a], a') = 0, \tag{28}$$
in all of the remaining cases. The Jacobi identities (27) and (28) can be treated as a system of ordinary differential linear equations for the finite set of unknown functions $\Xi_{ij}(t) = \Xi(a_i, a_j, t)$, where $a_i$ is the base in the Lie algebra of G. According to Sect. 3, in order to classify all ξ of G we shall determine all equivalence classes of infinitesimal exponents Ξ of the Lie algebra $\mathfrak{G}$ of G. The commutation relations for the Galilean group are as follows
$$[a_{ij}, a_{kl}] = \delta_{jk} a_{il} - \delta_{ik} a_{jl} + \delta_{il} a_{jk} - \delta_{jl} a_{ik}, \tag{29}$$
$$[a_{ij}, b_k] = \delta_{jk} b_i - \delta_{ik} b_j, \qquad [b_i, b_j] = 0, \tag{30}$$
$$[a_{ij}, d_k] = \delta_{jk} d_i - \delta_{ik} d_j, \qquad [d_i, d_j] = 0, \qquad [b_i, d_j] = 0, \tag{31}$$
$$[a_{ij}, \tau] = 0, \qquad [b_k, \tau] = 0, \qquad [d_k, \tau] = b_k, \tag{32}$$
where $b_i$, $d_i$ and τ stand for the generators of space translations, the proper Galilean transformations and time translation respectively, and $a_{ij} = -a_{ji}$ are the rotation generators. Please note that the Jacobi identity (28) is identical to that in the ordinary Bargmann theory of time-independent exponents (see [1], Eqs (4.24) and (4.24a)). Thus, using (29) – (31) we can proceed exactly after Bargmann (see [1], pp. 39, 40) and show that any infinitesimal exponent defined on the subgroup generated by $b_i$, $d_i$, $a_{ij}$ is equivalent to an exponent equal to zero, with the possible exception of $\Xi(b_i, d_k, t) = \gamma\delta_{ik}$, where γ = γ(t). Hence, the only components of Ξ defined on the whole algebra $\mathfrak{G}$ which can be a priori not equal to zero are: $\Xi(b_i, d_k, t) = \gamma\delta_{ik}$, $\Xi(a_{ij}, \tau, t)$, $\Xi(b_i, \tau, t)$ and $\Xi(d_k, \tau, t)$. First, we compute the function γ(t). Substituting a = τ, a′ = $b_i$, a″ = $d_k$ into (27), we get dγ/dt = 0, so that γ is a constant; we denote its constant value by m. By inserting a = τ, a′ = $a_{is}$, a″ = $a_{sj}$ into (27) and summing up with respect to s, we get $\Xi(a_{ij}, \tau, t) = 0$. In the same way, but with the substitution a = τ, a′ = $a_{is}$, a″ = $b_s$, one shows that $\Xi(b_i, \tau, t) = 0$. At last, the substitution a = τ, a′ = $a_{is}$, a″ = $d_s$ into (27) and summation with respect to s gives $\Xi(d_i, \tau, t) = 0$. In this way, we have proved that any time-dependent Ξ on $\mathfrak{G}$ is equivalent to a time-independent one. In other words, we get a one-parameter family of possible Ξ, with the parameter equal to the inertial mass m of the system in question. Any infinitesimal time-dependent exponent of the
Galilean group is equivalent to the above time-independent exponent Ξ with some value of the parameter m; and any two infinitesimal exponents with different values of m are nonequivalent. As was argued in Sect. 3 (Theorems 3 ÷ 5), the classification of Ξ gives a full classification of ξ. Moreover, it can be shown that the classification of ξ is equivalent to the classification of possible θ-s in the transformation law
$$T_r \psi(p) = e^{i\theta(r, p)}\, \psi(r^{-1}p) \tag{33}$$
for the spinless nonrelativistic particle. On the other hand, the exponent ξ(r, s, t) of the representation $T_r$ given by (33) can be easily computed to be equal to $\theta(rs, p) - \theta(r, p) - \theta(s, r^{-1}p)$, and the infinitesimal exponent belonging to θ defined as $\theta(r, p) = -m\,\vec{v}\cdot\vec{x} + \frac{m}{2}\vec{v}^{\,2} t$ covers the whole one-parameter family of the classification (its infinitesimal exponent is equal to that infinitesimal exponent Ξ which has been found above). Thus, the standard $\theta(r, p) = -m\,\vec{v}\cdot\vec{x} + \frac{m}{2}\vec{v}^{\,2} t$ covers the full classification of possible θ-s in (33) for the Galilean group. Inserting the standard form for θ we see that ξ does not depend on time but only on r and s. By this, any time-dependent ξ on G is equivalent to its time-independent counterpart. In this way, we have reconstructed the standard result. Using now the formula⁹
$$A_i \psi(p) = \lim_{\sigma\to 0}\frac{(T_{\sigma a_i} - 1)\psi(p)}{\sigma},$$
for the generator $A_i$ corresponding to $a_i$, we get the standard commutation relations for the ray representation $T_r$ of the Galilean group
$$[A_{ij}, A_{kl}] = \delta_{jk} A_{il} - \delta_{ik} A_{jl} + \delta_{il} A_{jk} - \delta_{jl} A_{ik},$$
$$[A_{ij}, B_k] = \delta_{jk} B_i - \delta_{ik} B_j, \qquad [B_i, B_j] = 0,$$
$$[A_{ij}, D_k] = \delta_{jk} D_i - \delta_{ik} D_j, \qquad [D_i, D_j] = 0, \qquad [B_i, D_j] = m\,\delta_{ij},$$
$$[A_{ij}, T] = 0, \qquad [B_k, T] = 0, \qquad [D_k, T] = B_k.$$
Please note that to any ξ (or Ξ) there exists a corresponding θ (and such a θ is unique up to a trivial equivalence relation). As we will see, this is not the case for the Milne group, where some Ξ's do exist which cannot be realized by any θ.

Example 2: Milne group as a covariance group. In this subsection we apply the theory of Sect. 3 to the Milne transformations group. We proceed as with the Galilean group in the preceding subsection. The Milne group G does not form any Lie group, but in the physical application it is sufficient for us to consider some Lie subgroups G(m) of the Milne group. We will go on according to the following plan. First, we compute the
9 The transformation $T_r$ does not act in the ordinary Hilbert space but in the Hilbert bundle space $\mathbb{R}\mathcal{H}$, hence we cannot immediately appeal to the Stone and Gårding Theorems. Nonetheless, $T_r$ induces a unique unitary representation acting in the Hilbert space $\int_{\mathbb{R}} \mathcal{H}_t\, d\mu(t)$ and it can be shown that it is meaningful to talk about the generators A of $T_r$.
infinitesimal exponents and exponents for each G(m), m = 1, 2, . . . , and then the θ in (33) for G(m). Please compare [16], where the result is extended to the whole group. The Milne transformation is defined as follows:
$$(\vec{x}, t) \to (R\vec{x} + \vec{A}(t),\ t + b), \tag{34}$$
where R is an orthogonal matrix, and b is constant. The extent of arbitrariness of the function $\vec{A}(t)$ in (34) will be left undetermined for now. It is convenient to rewrite the Milne transformations (34) in the following form: $\vec{x}' = R\vec{x} + A(t)\vec{v}$, $t' = t + b$, where $\vec{v}$ is a constant vector which does not depend on time t. We define the subgroup G(m) of G as the group of the following transformations
$$\vec{x}' = R\vec{x} + \vec{v}_{(0)} + t\,\vec{v}_{(1)} + \frac{t^2}{2!}\vec{v}_{(2)} + \ldots + \frac{t^m}{m!}\vec{v}_{(m)}, \qquad t' = t + b,$$
where $R = (R_{ab})$, $v^k_{(n)}$ are the group parameters; in particular, the group G(m) has dimension equal to 3m + 7. Now let us investigate the group G(m), that is, classify its infinitesimal exponents. The commutation relations of G(m) are as follows
$$[a_{ij}, a_{kl}] = \delta_{jk} a_{il} - \delta_{ik} a_{jl} + \delta_{il} a_{jk} - \delta_{jl} a_{ik}, \tag{35}$$
$$[a_{ij}, d^{(n)}_k] = \delta_{jk} d^{(n)}_i - \delta_{ik} d^{(n)}_j, \qquad [d^{(n)}_i, d^{(k)}_j] = 0, \tag{36}$$
$$[a_{ij}, \tau] = 0, \qquad [d^{(0)}_i, \tau] = 0, \qquad [d^{(n)}_i, \tau] = d^{(n-1)}_i, \tag{37}$$
where $d^{(n)}_i$ is the generator of the transformation $r(v^i_{(n)})$:
$$x'^i = x^i + \frac{t^n}{n!}\, v^i_{(n)},$$
which will be called the n-acceleration, and 0-acceleration in the particular case of the ordinary space translation. All relations (35) and (36) are identical to (29) ÷ (31) with the n-acceleration instead of the Galilean transformation. Thus, the same argumentation as that used for the Galilean group gives: $\Xi(a_{ij}, a_{kl}) = 0$, $\Xi(a_{ij}, d^{(n)}_k) = 0$, and $\Xi(d^{(n)}_i, d^{(n)}_j) = 0$. Substituting $a_{ih}$, $a_{hj}$, τ for a, a′, a″ into Eq. (27), making use of the commutation relations and summing up with respect to h, we get $\Xi(a_{ij}, \tau) = 0$. In an analogous way, substituting $a_{ih}$, $d^{(l)}_h$, $d^{(n)}_k$ for a, a′, a″ into Eq. (28), we get $\Xi(d^{(l)}_i, d^{(n)}_k) = \frac{1}{3}\,\Xi(d^{(l)h}, d^{(n)}_h)\,\delta_{ik}$. Substituting $a_{ih}$, $d^{(n)}_h$, τ for a, a′, a″ into Eq. (27), making use of the commutation relations, and summing up with respect to h, we get $\Xi(d^{(n)}_i, \tau) = 0$. Now, we substitute $d^{(n)}_k$, $d^{(0)}_i$, τ for a, a′, a″ in (27), and proceed recurrently with respect to n, obtaining in this way $\Xi(d^{(0)}_i, d^{(n)}_k) = P^{(0,n)}(t)\,\delta_{ik}$, where $P^{(0,n)}(t)$ is a polynomial of degree n − 1, the time derivative of $P^{(0,n)}(t)$ has to be equal to $P^{(0,n-1)}(t)$, and $P^{(0,0)}(t) = 0$. Substituting $d^{(n)}_k$, $d^{(l)}_i$, τ into (27), in the same way we get $\Xi(d^{(l)}_k, d^{(n)}_i) = P^{(l,n)}(t)\,\delta_{ki}$, where $\frac{d}{dt}P^{(l,n)} = P^{(l-1,n)} + P^{(l,n-1)}$. This
allows us to determine all $P^{(l,n)}$ by the recurrent integration process. Please note that $P^{(0,0)} = 0$, and $P^{(l,n)} = -P^{(n,l)}$, so given the $P^{(0,n)}$ we can compute all $P^{(1,n)}$. Indeed, we have $P^{(1,0)} = -P^{(0,1)}$, $P^{(1,1)} = 0$, $dP^{(1,2)}/dt = P^{(0,2)} + P^{(1,1)}$, $dP^{(1,3)}/dt = P^{(0,3)} + P^{(1,2)}$, . . . , and after m − 1 integrations we compute all $P^{(1,n)}$. Each elementary integration introduces a new independent parameter (the arbitrary additive integration constant). Exactly in the same way, given all $P^{(1,n)}$ we can compute all $P^{(2,n)}$ after m − 2 elementary integration processes. In general, the $P^{(l-1,n)}$ allows us to compute all $P^{(l,n)}$ after m − l integrations. Thus, $P^{(l,n)}(t)$ are polynomial functions of t of degree l + n − 1, and all are determined by m(m + 1)/2 integration constants. Because $d[\Lambda](d^{(n)}_i, d^{(l)}_k) = 0$, the exponents Ξ defined by different polynomials $P^{(l,n)}$ are inequivalent. Therefore, the space of nonequivalent classes of Ξ is m(m + 1)/2-dimensional. However, not all Ξ can be realized by the transformation $T_r$ of the form (33). It can be seen that any integration constant $\gamma_{(l,q)}$ of the polynomial $P^{(l,q)}(t)$ has to be equal to zero if l, q ≠ 0, provided the exponent Ξ belongs to the representation $T_r$ of the form (33). By this, all exponents of G(m) which can be realized by the transformations $T_r$ of the form (33) are determined by the polynomial $P^{(0,m)}$, that is, by m constants. We omit the proof of this fact, and refer the reader to [16], Example 2. In the proof we compute the exponent directly for the transformation $T_r$ of the form (33) and compare it with the classification results above. Consider the θ given by the formula
$$\theta(r, p) = \gamma_1 \frac{dA}{dt} + \gamma_2 \frac{d^2 A}{dt^2} + \ldots + \gamma_m \frac{d^m A}{dt^m} + \tilde{\theta}(t), \tag{38}$$
for $r \in G(m)$, where the $\gamma_i$ are the integration constants which define the polynomial $P^{(0,m)} = \gamma_1 \frac{t^{m-1}}{(m-1)!} + \gamma_2 \frac{t^{m-2}}{(m-2)!} + \ldots + \gamma_m$, and $\theta(t)$ is any function of time $t$, and possibly of the group parameters. A rather simple computation shows that this $\theta$ covers all possible exponents which can be realized by (33). That is, the infinitesimal exponents corresponding to the $\theta$ given by (38) yield all possible infinitesimal exponents with all integration constants $\gamma_{(k,n)} = 0$ for $k, n \neq 0$. Thus, the most general $\theta(r, p)$ defined for $r \in G(m)$ is given by (38). At this point we make use of the assumption that the wave equation is local. It can then be shown (we leave this without proof) that $\theta(r, p)$ can depend on derivatives of $A(t)$ only up to some finite order, say the $k$th at most, while the higher derivatives cannot enter into $\theta$. Therefore, the most general $\theta(r, p)$ defined for $r \in G(m)$ has the following form:
$$\theta(r, X) = \gamma_1 \frac{dA}{dt} + \ldots + \gamma_k \frac{d^kA}{dt^k} + \theta(t). \tag{39}$$
Having obtained this we can infer the most general Schrödinger equation for a spinless particle in Newton-Cartan spacetime, cf. [17]. The inertial and the gravitational masses are always equal in this equation. Acknowledgements. The author is indebted to A. Staruszkiewicz and A. Herdegen for helpful discussions. The author also wishes to thank the Referee who suggested the explicit use of the Hilbert bundle formalism. The paper was financially supported by the KBN grant no. 5 P03B 09320.
References
1. Bargmann, V.: Ann. Math. 59, 1 (1954)
2. Birkhoff, G.: Continuous Groups and Linear Spaces. Recueil Mathématique (Moscow) 1(5), 635 (1935); Analytical Groups, Trans. Am. Math. Soc. 43, 61 (1938)
3. Dirac, P.A.M.: Lectures on Quantum Field Theory. New York: Academic Press, 1966
4. Dynkin, E.: Uspekhi Mat. Nauk 5, 135 (1950); Am. Math. Soc. Transl. 9(1), 470 (1950)
5. Giulini, D.: States, Symmetries and Superselection. In: Decoherence: Theoretical, Experimental and Conceptual Problems, Lecture Notes in Physics, Berlin-Heidelberg-New York: Springer Verlag, 2000, p. 87
6. Gupta, S.N.: Proc. Phys. Soc. 63, 681 (1950); Bleuler, K.: Helv. Phys. Acta 23, 567 (1950)
7. Lang, S.: Differential Manifolds. Berlin-Heidelberg-New York: Springer-Verlag, 1985
8. Heisenberg, W.: The Physical Principles of the Quantum Theory. New York: Dover, 1949
9. Łopuszański, J.: Forts. der Phys. 26, 261 (1978); Rachunek spinorów, Warszawa: PWN, 1985 (in Polish)
10. Mackey, G.W.: Unitary Group Representations in Physics, Probability, and Number Theory. New York, Amsterdam, Wokingham-UK: Addison-Wesley Publishing Company, Inc., 1989
11. Nachbin, L.: The Haar Integral. Princeton-New Jersey-Toronto-New York-London: D. Van Nostrand Company Inc., 1965
12. Piron, C.: Field Theory Revisited. In: Trends in Quantum Mechanics, H.D. Doebner (ed.), Singapore: World Scientific, 2000, pp. 270–273
13. Piron, C.: A Unified Concept of Evolution in Quantum Mechanics. In: Interpretation and Foundations of Quantum Theory, Holger Neumann (ed.), Mannheim: Bibliographisches Institut, 1981
14. Pontrjagin, L.: Topological groups. Moscow, 1984 (in Russian)
15. Wawrzycki, J.: Int. J. Theor. Phys. 40, 1595 (2001)
16. Wawrzycki, J.: A Generalization of the Bargmann's Theory of Ray Representations. http://arxiv.org/abs/
17. Wawrzycki, J.: Acta Phys. Polon. B 35, 613 (2004)
18. Weinberg, S.: The Quantum Theory of Fields. Volume II, Cambridge: Univ. Press, 1996

Communicated by G.W. Gibbons
Commun. Math. Phys. 250, 241–257 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1143-2
Communications in
Mathematical Physics
Bounds on the Unstable Eigenvalue for the Asymmetric Renormalization Operator for Period Doubling
B.D. Mestel1, A.H. Osbaldestin2, A.V. Tsygvintsev3
1 Department of Computing Science and Mathematics, University of Stirling, Stirling, FK9 4LA, UK. E-mail: [email protected]
2 Department of Mathematics, University of Portsmouth, Portsmouth, PO1 3HE, UK. E-mail: [email protected]
3 Unité de Mathématiques Pures et Appliquées, École Normale Supérieure de Lyon, 46 Allée d'Italie, 69364 Lyon Cedex 07, France. E-mail: [email protected]
Received: 8 July 2003 / Accepted: 1 February 2004
Published online: 5 August 2004 – © Springer-Verlag 2004
Abstract: We establish rigorous bounds for the unstable eigenvalue of the period-doubling renormalization operator for asymmetric unimodal maps. Herglotz-function techniques and cone invariance ideas are used. Our result generalizes an established result for conventional period doubling.

1. Introduction

The remarkable universality of the scalings witnessed in the period-doubling route to chaos now has a well-established, mathematically rigorous basis. Soon after its discovery by Feigenbaum [8] and Coullet-Tresser [3], the first (computer-assisted) proof was given by Lanford [11], closely followed by the analytic proofs of Epstein [5] and coworkers. More recently the rigorous analysis has reached new levels of sophistication in the works of Sullivan [17] and McMullen [12]. Contemporaneously, Arneodo et al. [2] initiated the investigation of asymmetric unimodal maps. In a recent series of articles [13–15], we have given a rigorous renormalization analysis of period doubling in degree-$d$ asymmetric unimodal maps. These are unimodal maps possessing a degree-$d$ maximum, but with differing left and right $d$th derivatives. The maps we have in mind take the form
$$f(x) = \begin{cases} f_L(x) = 1 - a_1|x|^d & \text{if } x \le 0; \\ f_R(x) = 1 - a_2|x|^d & \text{if } x \ge 0. \end{cases} \tag{1.1}$$
(The case of differing left- and right-hand degrees appears to be somewhat different in nature. See, e.g., [9].) In brief, for each $d > 1$, the standard Feigenbaum period-doubling renormalization operator has been shown to possess a family of period-two orbits, parametrized by an invariant asymmetry modulus, $\mu$, measuring the ratio of the left and
This research is supported by The Leverhulme Trust (grant number F/00144/W).
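right $d$th derivatives at the maximum. The period-two orbit is then given by a quartet of functions $(f_L, f_R, \tilde f_L, \tilde f_R)$ satisfying the functional equations
$$\tilde f_L(x) = -\lambda^{-1} f_R(f_R(-\lambda x)), \tag{1.2a}$$
$$\tilde f_R(x) = -\lambda^{-1} f_R(f_L(-\lambda x)), \tag{1.2b}$$
$$f_L(x) = -\tilde\lambda^{-1} \tilde f_R(\tilde f_R(-\tilde\lambda x)), \tag{1.2c}$$
$$f_R(x) = -\tilde\lambda^{-1} \tilde f_R(\tilde f_L(-\tilde\lambda x)), \tag{1.2d}$$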
with the normalizations $f_L(0) = f_R(0) = \tilde f_L(0) = \tilde f_R(0) = 1$ so that $\lambda = -f_R(1) > 0$ and $\tilde\lambda = -\tilde f_R(1) > 0$. The solutions of (1.2) depend on two parameters, viz., the degree $d$ of the critical point and the modulus $\mu$, which (for the case when $d$ is an even integer) is the ratio
$$\mu = \frac{f_L^{(d)}(0-)}{f_R^{(d)}(0+)}. \tag{1.3}$$
The case $\mu = 1$ is the standard Feigenbaum scenario, in which case the period-two orbit is in fact a fixed point. Let us denote by $R$ the period-doubling renormalization operator acting on a unimodal map $f$ with $f(0) = 1$, so that
$$R(f)(x) = -\lambda^{-1} f(f(-\lambda x)), \qquad \lambda = -f(1). \tag{1.4}$$
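As a purely illustrative aid, the following Python sketch applies the operator (1.4) once to a map of the form (1.1). The parameter values `a1`, `a2`, `d` and all function names are assumptions chosen for the example, not values taken from the paper, and the chosen map is not a period-two point of $R$. The sketch only checks that the normalization $R(f)(0) = 1$ is preserved and that a finite-difference estimate of the left/right asymmetry ratio is inverted by one application of $R$.

```python
def make_asymmetric_map(a1, a2, d):
    """f(x) = 1 - a1*|x|^d for x <= 0 and 1 - a2*|x|^d for x >= 0, as in (1.1)."""
    def f(x):
        a = a1 if x <= 0 else a2
        return 1.0 - a * abs(x) ** d
    return f


def renormalize(f):
    """Return R(f), where R(f)(x) = -f(f(-lam*x))/lam and lam = -f(1), as in (1.4)."""
    lam = -f(1.0)
    if lam <= 0:
        raise ValueError("need f(1) < 0 so that lambda = -f(1) > 0")

    def rf(x):
        return -f(f(-lam * x)) / lam
    return rf


def asymmetry_ratio(f, h=1e-3):
    """Finite-difference estimate of the left/right ratio of leading coefficients at 0."""
    return (1.0 - f(-h)) / (1.0 - f(h))


# Hypothetical sample parameters (not from the paper); a2 > 1 ensures lambda > 0.
a1, a2, d = 1.5, 1.2, 2
f = make_asymmetric_map(a1, a2, d)
rf = renormalize(f)

print("R(f)(0) =", rf(0.0))
print("ratio for f    :", asymmetry_ratio(f), "(compare a1/a2 =", a1 / a2, ")")
print("ratio for R(f) :", asymmetry_ratio(rf), "(compare a2/a1 =", a2 / a1, ")")
```

The ratio estimate is a crude stand-in for the modulus $\mu$ of (1.3); it is adequate here only because the sample map is exactly of the form (1.1) near the critical point.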
Then $R$ acts on both symmetric and asymmetric unimodal maps, preserving the degree $d$ and inverting the asymmetry modulus $\mu$. The scaling of the parameters in (1.1) undergoing a period-doubling cascade is determined by the expanding eigenvalue of the derivative of $R^2$ at the period-2 point $f$. This derivative $dR^2(f)$ is compact on a suitable Banach space of tangent functions $\delta f$, and numerical results suggest that it is hyperbolic with a single expanding eigenvalue $\delta^2$. It is this expanding eigenvalue which we investigate in this paper. More precisely, we study an associated operator $T$ (defined below), which has a positive expanding eigenvalue $\delta$. We give a brief description of the relationship between $T$ and $dR^2(f)$ in Sect. 4. The operator $T$ is defined on a pair of functions $(v, \tilde v)$ and is given by
$$T \begin{pmatrix} v(x) \\ \tilde v(x) \end{pmatrix} = \begin{pmatrix} \tilde t^{-1}\big(\tilde v(\tilde t x) + \tilde v(\tilde L(\tilde t x))\,\tilde L'(\tilde t x)^{-1}\big) \\ t^{-1}\big(v(tx) + v(L(tx))\,L'(tx)^{-1}\big) \end{pmatrix}, \tag{1.5}$$
where $t = \mu\lambda^d$, $\tilde t = \mu^{-1}\tilde\lambda^d$, and $L(x) = F(x)^d$, $\tilde L(x) = \tilde F(x)^d$. In this article we analyze the positive unstable eigenvalue of $T$. Our work closely mirrors the analysis of Eckmann and Epstein [4] on the expanding eigenvalue of the symmetric Feigenbaum fixed point. We shall establish the following result: Theorem 1. There exists a Banach space of function pairs on which the operator $T$ is well defined, compact and has an eigenvalue $\delta > 0$ satisfying 1
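0 and for each $d > 1$, there exists a solution pair $(\psi, \tilde\psi)$ for (2.4) with $\psi \in E(-\tilde\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$ and $\tilde\psi \in E(-\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$. From this it is straightforward to reverse the transformation above to show that (1.2) has a solution. See [14]. One crucial feature of the Herglotz and anti-Herglotz functions is that they satisfy the so-called a priori bounds. (See [5, 6, 14].) For the solution pair $(\psi, \tilde\psi)$ these bounds are, for $x < 0$ and $x > 1$:
$$\frac{1-x}{1-\lambda\tilde\lambda x} \le \psi(x) \le \frac{1-x}{1+\tilde\lambda x}, \qquad \frac{1-x}{1-\lambda\tilde\lambda x} \le \tilde\psi(x) \le \frac{1-x}{1+\lambda x}; \tag{2.7}$$
and for $0 < x < 1$:
$$\frac{1-x}{1+\tilde\lambda x} \le \psi(x) \le \frac{1-x}{1-\lambda\tilde\lambda x}, \qquad \frac{1-x}{1+\lambda x} \le \tilde\psi(x) \le \frac{1-x}{1-\lambda\tilde\lambda x}. \tag{2.8}$$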
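In addition, as in [5], it is straightforward to derive a priori bounds on the ratio of the second to the first derivative:
$$\frac{-2\tilde\lambda}{1+\tilde\lambda x} \le \frac{\psi''(x)}{\psi'(x)} = \frac{U''(x)}{U'(x)} \le \frac{2\lambda\tilde\lambda}{1-\lambda\tilde\lambda x}, \qquad x \in (-\tilde\lambda^{-1}, (\lambda\tilde\lambda)^{-1}), \tag{2.9a}$$
$$\frac{-2\lambda}{1+\lambda x} \le \frac{\tilde\psi''(x)}{\tilde\psi'(x)} = \frac{\tilde U''(x)}{\tilde U'(x)} \le \frac{2\lambda\tilde\lambda}{1-\lambda\tilde\lambda x}, \qquad x \in (-\lambda^{-1}, (\lambda\tilde\lambda)^{-1}). \tag{2.9b}$$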
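Let us define $t = \mu\lambda^d$, $\tilde t = \mu^{-1}\tilde\lambda^d$. Then we have the following properties, which are a consequence of the definitions and the results of [14].
1. $t, \tilde t, z_1, \tilde z_1, \tau, \tilde\tau \in (0, 1)$;
2. $t < z_1^d$, and $\tilde t < \tilde z_1^d$.
Following [5], we define the functions $V(x) = \tau^{-1}\psi(z_1 x^{1/d})$, $\tilde V(x) = \tilde\tau^{-1}\tilde\psi(\tilde z_1 x^{1/d})$. Let us further define $\alpha = \psi(-\lambda)$, $\tilde\alpha = \tilde\psi(-\tilde\lambda)$. Then $V \in AH(0, \alpha(\lambda\tilde\lambda)^{-d})$ and $\tilde V \in AH(0, \tilde\alpha(\lambda\tilde\lambda)^{-d})$ and, in view of Eqs. (2.4), we have
$$\psi(x) = \tilde V(\tilde\psi(-\tilde\lambda x)), \qquad \tilde\psi(x) = V(\psi(-\lambda x)). \tag{2.10}$$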
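Note that $V(1) = \tilde V(1) = 1$ and $V(\alpha) = \tilde V(\tilde\alpha) = 0$. Differentiating (2.10), and evaluating at 0, gives
$$V'(1) = \frac{-\tilde\psi'(0)}{\lambda\psi'(0)}, \qquad \tilde V'(1) = \frac{-\psi'(0)}{\tilde\lambda\tilde\psi'(0)}, \qquad V'(1)\tilde V'(1) = \frac{1}{\lambda\tilde\lambda}. \tag{2.11}$$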
Lemma 1. The functions U (x) and U˜ (x) are injective respectively in domains = −1 , (λλ) ˜ −1 ) and ˜ = (−λ ˜ ˜ −1 ). (−λ˜ −1 , (λλ) Proof. Our proof is based on the ideas of Epstein given in [6] and [7]. It is sufficient to show the injectivity of U , for the injectivity of U˜ then easily follows from Eqs. (2.6). From the same equations we have ˜ −d U ( 3 (x)), U (x) = (λλ)
x ∈ ,
(2.12)
˜ where 3 (x) = u(−µ−1/d u(u(λλx))) with u(x) = U (x)1/d . To elucidate the proof we −1/d ˜ ˜ ˜ u(u(λλx)), 1 (x) = u(λλx), 0 (x) = λλx. further define 2 (x) = −µ The functions 0 , 3 are Herglotz in and 1 , 2 are anti-Herglotz in the same domain. Indeed, it is clear that they are Herglotz and anti-Herglotz respectively on C+ ∪ C− and any interval on which they are well-defined. Indeed they are Herglotz ˜ −1 ) as follows from the inclusions and anti-Herglotz respectively on J0 = (−λ˜ −1 , (λλ) −1 ˜ proved below. Let s ∈ (1, (λλ) ) be arbitrarily chosen and let J = (−λ˜ −1 , s). Our first aim is to show that i ((J )) (J ) for 0 ≤ i ≤ 3, where we adopt the notation that A B means that A is strictly inside of B, i.e., A ⊂ B. Taking into account the Herglotz (resp. anti-Herglotz) character of the functions i it is sufficient to prove the inclusions i (J ) J . We consider each function in turn. (a) 0 (J ) J is seen from the inequality 0 < λλ˜ < 1. ˜ ∈ (z1 ψ((λλ)(λ ˜ λ) ˜ −1 )1/d , z1 ψ((λλ)(− ˜ ˜ −1 )1/d ) = (b) 1 (J ) J . We have u(λλx) λ) 1/d (0, z1 ψ(−λ) ) = (0, 1) for all x ∈ J . The result follows from this and the fact that s > 1. (c) 2 (J ) J . We have −µ−1/d < 2 (x) < 0 for all x ∈ J ⊂ J0 since 0 < ˜ u(u(λλx)) < 1 for arbitrary x ∈ J0 . The inequality µ−1/d λ˜ < 1 implies −µ−1/d > −1 −λ˜ and we obtain 2 (J ) = ( 2 (s), 2 (−λ˜ −1 )) (−λ˜ −1 , 0) as required. (d) 3 (J ) J . We have 3 (J ) = ( 3 (−λ˜ −1 ), 3 (s)), 0 < 3 (−λ˜ −1 ) < 3 (s). But 3 (s) < s. Indeed, by the Schwarz Lemma, x = 1 is the unique attracting fixed point of 3 in the interval J0 . Hence, 3 (1) = 1, 3 (1) < 1 and the graph of ˜ −1 ). y = 3 (x) lies below the line y = x for all x ∈ (1, (λλ) Following [6] we define z−a D(a, b, θ) = z ∈ C : 0 < arg 1 one has (zn , wn ) ∈ δ(C, θ0 , kR). We note that the restriction of ˆ U to the real interval I = δ(C, θ0 , kR) ∩ R is injective. This implies that there exists a small complex neighborhood UI ⊂ C of I , where U will be also injective. 
As $n \to \infty$ we have $\theta_n \to 0$, and both points $z_n$ and $w_n$ will eventually enter $U_I$ for $n$ sufficiently large. This contradiction proves the injectivity of $U$.

3. Lower Bound for d

In this section we shall prove the inequality $d > (1 + \lambda\tilde\lambda)/(1 - \lambda\tilde\lambda)$, which is important in the proof of convexity of the functions $L$, $\tilde L$, $V$ and $\tilde V$. The inequality is analogous to Epstein's result for the symmetric Feigenbaum function, viz., $d > (1 + \lambda^2)/(1 - \lambda^2)$. However, our proof differs somewhat from that in [5].

Lemma 2.
$$d > (1 + \lambda\tilde\lambda)/(1 - \lambda\tilde\lambda). \tag{3.1}$$
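Proof. We start from the estimates contained in the paper [15]:
$$\psi(x) \le \frac{1-x}{(\tilde\tau + \tilde\lambda\tilde\tau - \tilde\lambda)(1-x) + (1-\tilde\tau)(1+\tilde\lambda)}, \qquad x \in (-\tilde\lambda^{-1}, 0), \tag{3.2a}$$
$$\tilde\psi(x) \le \frac{1-x}{(\tau + \lambda\tau - \lambda)(1-x) + (1-\tau)(1+\lambda)}, \qquad x \in (-\lambda^{-1}, 0). \tag{3.2b}$$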
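The first inequality gives
$$\psi(-\lambda) \le \frac{1+\lambda}{(\lambda + \lambda\tilde\lambda)\tilde\tau + 1 - \lambda\tilde\lambda}, \tag{3.3}$$
so that
$$z_1 = \psi(-\lambda)^{-1/d} \ge \left(\frac{(\lambda + \lambda\tilde\lambda)\tilde\tau + 1 - \lambda\tilde\lambda}{1+\lambda}\right)^{1/d} = (\rho\tilde\tau + (1-\rho))^{1/d} \ge \rho\tilde\tau^{1/d} + 1 - \rho = 1 - \rho(1 - \tilde\tau^{1/d}), \tag{3.4}$$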
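where we have set $\rho = (\lambda + \lambda\tilde\lambda)/(1+\lambda) \in (0, 1)$ and have used the property $(\rho x + (1-\rho)y)^{1/d} \ge \rho x^{1/d} + (1-\rho)y^{1/d}$, which holds because $x \mapsto x^{1/d}$ is concave for $d > 1$. Similarly, we have $\tilde z_1 \ge 1 - \tilde\rho(1 - \tau^{1/d})$, where $\tilde\rho = (\tilde\lambda + \lambda\tilde\lambda)/(1+\tilde\lambda) \in (0, 1)$. Note $\rho\tilde\rho = \lambda\tilde\lambda$. Using the a priori bounds (2.8) and writing $c = \lambda\tilde\lambda$, we have
$$c^d = \tau\tilde\tau = \psi(z_1)\tilde\psi(\tilde z_1) \le \frac{1-z_1}{1-cz_1}\cdot\frac{1-\tilde z_1}{1-c\tilde z_1} \tag{3.5}$$
$$\le \frac{c(1-\tilde\tau^{1/d})(1-\tau^{1/d})}{(1 - c + c\rho(1-\tilde\tau^{1/d}))(1 - c + c\tilde\rho(1-\tau^{1/d}))}, \tag{3.6}$$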
using the fact that $x \mapsto (1-x)/(1-cx)$ is monotonic decreasing on $(0, 1)$. Now, noting that $\rho\tilde\rho = c$, we see that the denominator is minimized when $\rho = (c(1-\tau^{1/d})/(1-\tilde\tau^{1/d}))^{1/2}$, so that
$$c^d \le \frac{c(1-\tilde\tau^{1/d})(1-\tau^{1/d})}{\left(1 - c + c^{3/2}\sqrt{(1-\tilde\tau^{1/d})(1-\tau^{1/d})}\right)^2}. \tag{3.7}$$
Since $x \mapsto cx/(1 - c + c^{3/2}\sqrt{x})^2$ is increasing, a further upper bound is obtained by maximizing $(1-\tilde\tau^{1/d})(1-\tau^{1/d})$. Since $\tilde\tau^{1/d}\tau^{1/d} = c$, the product $(1-\tilde\tau^{1/d})(1-\tau^{1/d})$ is maximized when $\tau^{1/d} = \sqrt{c}$, and we obtain
$$c^d \le \frac{c(1-\sqrt{c})^2}{\left(1 - c + c^{3/2}(1-\sqrt{c})\right)^2}. \tag{3.8}$$
Writing $c = g^2$ and taking square roots, we have, dividing $g$ out from both sides,
$$g^{d-1} \le \frac{1-g}{1 - g^2 + g^3 - g^4} = \frac{1}{1 + g + g^3}. \tag{3.9}$$
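The two algebraic steps used between (3.6) and (3.9) can be sanity-checked numerically. The short Python sketch below does this for arbitrarily chosen sample values; the names `A`, `B` (standing for $1-\tilde\tau^{1/d}$ and $1-\tau^{1/d}$) and the sample numbers are assumptions made for illustration and are not taken from the paper. It is a spot check, not a proof.

```python
import math


def denominator(rho, c, A, B):
    """Denominator of (3.6) under the constraint rho * rho_tilde = c."""
    rho_tilde = c / rho
    return (1 - c + c * rho * A) * (1 - c + c * rho_tilde * B)


def minimal_denominator(c, A, B):
    """Closed-form minimum appearing in the denominator of (3.7)."""
    return (1 - c + c ** 1.5 * math.sqrt(A * B)) ** 2


def bound_39(g):
    """Right-hand side of (3.9) before the final simplification."""
    return (1 - g) / (1 - g**2 + g**3 - g**4)


# Hypothetical sample values in (0, 1), chosen only for the check.
c, A, B = 0.3, 0.4, 0.7
rho_star = math.sqrt(c * B / A)          # minimizer quoted in the text
grid = [0.01 * k for k in range(1, 100)]

at_minimizer = denominator(rho_star, c, A, B)
closed_form = minimal_denominator(c, A, B)
never_below = all(denominator(r, c, A, B) >= closed_form - 1e-12 for r in grid)
identity_ok = all(abs(bound_39(g) - 1 / (1 + g + g**3)) < 1e-12 for g in grid)

print(at_minimizer, closed_form, never_below, identity_ok)
```

The first check is the arithmetic-geometric mean inequality applied to $\rho A + \tilde\rho B$ with $\rho\tilde\rho = c$; the second is the factorization $(1-g)(1+g+g^3) = 1 - g^2 + g^3 - g^4$.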
Now suppose $d \le (1 + \lambda\tilde\lambda)/(1 - \lambda\tilde\lambda) = (1 + g^2)/(1 - g^2)$. Then, since $g \in (0, 1)$,
$$g^{\frac{2g^2}{1-g^2}} \le g^{d-1} \le \frac{1}{1 + g + g^3}, \tag{3.10}$$
so that $\log(1 + g + g^3) + \frac{2g^2}{1-g^2}\log(g) \le 0$. However, this contradicts the following result, which we prove in the Appendix:
Lemma 3. For all x ∈ (0, 1), (1 − x 2 ) log(1 + x + x 3 ) + 2x 2 log(x) > 0.
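A quick numerical spot check of the inequality in Lemma 3 may help the reader see how small the margin is. The Python sketch below evaluates the left-hand side on a uniform grid; the grid spacing and the function name are arbitrary choices for this illustration, and a grid check is of course no substitute for the proof given in the Appendix.

```python
import math


def lemma3_lhs(x):
    """Left-hand side of the inequality in Lemma 3."""
    return (1 - x * x) * math.log(1 + x + x**3) + 2 * x * x * math.log(x)


# Uniform sample of the open interval (0, 1); the spacing is an arbitrary choice.
samples = [k / 1000 for k in range(1, 1000)]
values = [lemma3_lhs(x) for x in samples]

print("smallest sampled value:", min(values))
print("all sampled values positive:", min(values) > 0)
```

The sampled values tend to zero as $x \to 1$, which is consistent with the expression vanishing at $x = 1$.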
Thus the lemma is proved.

4. The Operator T
In this section we first discuss informally the relationship between $dR^2(f)$ and the operator $T$ given in Sect. 1. When analyzing asymmetric maps it is often convenient to work (as in [14]) with a map of pairs, rather than the doubling operator $R$. Let $R_P$ denote the map $R_P(f, \tilde f) = (R(\tilde f), R(f))$. Then a fixed point of $R_P$, with $f \neq \tilde f = R(f)$, corresponds to a period-2 point of $R$ and vice versa. The spectra of the derivatives $dR^2(f)$ and $dR_P(f, \tilde f)$ are related: an eigenvalue $\rho^2$ of $dR^2(f)$ corresponds to a pair of eigenvalues $\pm\rho$ of $dR_P(f, \tilde f)$. Indeed, if $\rho^2 \in \mathbb{C}$ is an eigenvalue of $dR^2(f)$ with eigenvector $\delta f$, then the pair $(\delta f, \pm\rho^{-1}\delta\tilde f)$, where $\delta\tilde f = dR_f\,\delta f$, is an eigenvector of $dR_P(f, \tilde f)$ with eigenvalue $\pm\rho$, and vice versa. We may therefore study the spectrum of $dR_P(f, \tilde f)$ in lieu of $dR^2(f)$. As in [4], a further simplification can be made by studying the operator $\bar R_P$ given by $R_P$ with the parameters $\lambda$ and $\tilde\lambda$ held constant at their values at the fixed-point pair $(f, \tilde f)$. This introduces eigenvalues $\pm 1$ into the spectrum of $d\bar R_P(f, \tilde f)$ but otherwise leaves the spectrum undisturbed. Acting on pairs of tangent functions $(\delta f(x), \delta\tilde f(x))$, the operator $d\bar R_P(f, \tilde f)$ is given by:
$$d\bar R_P(f, \tilde f)\begin{pmatrix} \delta f(x) \\ \delta\tilde f(x) \end{pmatrix} = \begin{pmatrix} -\tilde\lambda^{-1}\,\delta\tilde f(\tilde f(-\tilde\lambda x)) - \tilde\lambda^{-1}\,\tilde f'(\tilde f(-\tilde\lambda x))\,\delta\tilde f(-\tilde\lambda x) \\ -\lambda^{-1}\,\delta f(f(-\lambda x)) - \lambda^{-1}\,f'(f(-\lambda x))\,\delta f(-\lambda x) \end{pmatrix}. \tag{4.1}$$
Furthermore, it is convenient to build in the degree of criticality $d$ by writing $f(x) = F(|x|^d)$, $\tilde f(x) = \tilde F(|x|^d)$, leading to an induced map $\bar R_P$ on pairs $(F, \tilde F)$ and derivative $d\bar R_P(F, \tilde F)$. Following [4], as a final simplification, we consider tangent vector pairs $(v, \tilde v) = (\delta F/F', \delta\tilde F/\tilde F')$. Following [4] we define a map from $\mathbb{R}$ to $\mathbb{R}$ given by
$$q(x) = \mathrm{sign}(x)|x|^d. \tag{4.2}$$
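The correspondence between an eigenvalue $\rho^2$ of $dR^2(f)$ and the pair $\pm\rho$ for the map of pairs, described earlier in this section, is a general fact about operators of the form $(u, v) \mapsto (Av, Bu)$. The Python sketch below illustrates it in finite dimensions. The $2\times 2$ matrices `A` and `B` are hypothetical stand-ins (with `B` playing the role of $dR$ at $f$ and `A` the role of $dR$ at $\tilde f = R(f)$); they are not derived from the renormalization operator.

```python
def matvec(M, v):
    """Multiply a small dense matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def pair_map(A, B, u, v):
    """Linearized map of pairs: (u, v) -> (A v, B u)."""
    return matvec(A, v), matvec(B, u)


# Hypothetical stand-ins, chosen so that A*B has the eigenvalue rho^2 = 4.
A = [[1.0, 1.0], [0.0, 2.0]]
B = [[2.0, 0.0], [1.0, 1.0]]
w = [1.0, 1.0]      # eigenvector of the composition A*B
rho = 2.0

composed = matvec(A, matvec(B, w))
print("A*B applied to w:", composed, "expected:", [rho**2 * x for x in w])

for sign in (1.0, -1.0):
    # Second component of the eigenvector: +-rho^{-1} B w, as in the text.
    v = [sign * x / rho for x in matvec(B, w)]
    new_u, new_v = pair_map(A, B, w, v)
    print("eigenvalue", sign * rho, "->", new_u, new_v)
```

For each sign, the output pair equals $\pm\rho$ times the input pair $(w, v)$, which is the finite-dimensional analogue of the statement about $dR_P(f, \tilde f)$.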
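We then define $L$, $\tilde L$ by
$$L(x) = q(F(x)), \qquad \tilde L(x) = q(\tilde F(x)), \qquad x \in [0, 1], \tag{4.3}$$
and use also the notation
$$L(x) = \begin{cases} L_+(x) = F(x)^d, & x \in [0, z_1^d], \\ -L_-(x) = -|F(x)|^d, & x \in [z_1^d, 1], \end{cases} \tag{4.4a}$$
$$\tilde L(x) = \begin{cases} \tilde L_+(x) = \tilde F(x)^d, & x \in [0, \tilde z_1^d], \\ -\tilde L_-(x) = -|\tilde F(x)|^d, & x \in [\tilde z_1^d, 1]. \end{cases} \tag{4.4b}$$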
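These functions satisfy the identities
$$L(x) = -\frac{1}{\tilde\lambda^d}\tilde L(\tilde L(\tilde t x)), \qquad \tilde L(x) = -\frac{1}{\lambda^d} L(L(tx)), \tag{4.5}$$
or, equivalently,
$$L_+(x) = \frac{1}{\tilde\lambda^d}\tilde L_-(\tilde L_+(\tilde t x)), \quad \forall\, x \in [0, z_1^d], \qquad \tilde L_+(x) = \frac{1}{\lambda^d} L_-(L_+(tx)), \quad \forall\, x \in [0, \tilde z_1^d], \tag{4.6}$$
$$L_-(x) = \frac{1}{\tilde\lambda^d}\tilde L_+(\tilde L_+(\tilde t x)), \quad \forall\, x \in [z_1^d, 1], \qquad \tilde L_-(x) = \frac{1}{\lambda^d} L_+(L_+(tx)), \quad \forall\, x \in [\tilde z_1^d, 1]. \tag{4.7}$$
The linear operator induced on $(v, \tilde v)$ by $d\bar R_P(F, \tilde F)$ is the operator $T$ described in the introduction:
$$T\begin{pmatrix} v(x) \\ \tilde v(x) \end{pmatrix} = \begin{pmatrix} v_1(x) \\ \tilde v_1(x) \end{pmatrix} = \begin{pmatrix} \tilde t^{-1}\big(\tilde v(\tilde t x) + \tilde v(\tilde L(\tilde t x))\,\tilde L'(\tilde t x)^{-1}\big) \\ t^{-1}\big(v(tx) + v(L(tx))\,L'(tx)^{-1}\big) \end{pmatrix}. \tag{4.8}$$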
In view of Lemma 1, the functions F (x), v(x) are analytic in the domain = U () ˜ = U˜ (). ˜ and F˜ (x), v(x) ˜ are analytic in We recall that U (x), U˜ (x) satisfy the following functional equations: t˜U (x) = U˜ (u(− ˜ λ˜ x)), t U˜ (x) = U (u(−λ x)),
u(x) ˜ = U˜ (x)1/d , u(x) = U (x)
1/d
,
x ∈ , ˜. x∈
(4.9a) (4.9b)
The following equations are a direct consequence of (4.9): L(t U˜ (x)) = U (−λx),
˜, x∈
˜ t˜U (x)) = U˜ (−λx), ˜ L(
x ∈ ,
(4.10)
which provide (by the injectivity of U and U˜ ) a holomorphic extension of the restriction ˜ ˜ (resp. t˜ ). L|(0, z1d ) (resp. L|(0, z˜ 1d ) ) to the complex domain t We now consider rigorously the properties of the operator T . Our first task is to show that T is well defined on function pairs (v, v) ˜ on suitable domains. We first of all show ˜ map nicely. that the domains and ˜ , ˜ satisfy: Lemma 4. The domains , , ˜ ˜ ⊂ , 1. −λ˜ ⊂ , −λ ˜ ˜ ⊂ , 2. t˜ ⊂ , t ˜ t˜ ) ⊂ . ˜ ⊂ , L( ˜ 3. L(t ) ˜ We shall prove the first Proof. Statement 1 follows directly from the definition of , . inclusion of Statement 2; the proof of the second one is similar. Let x ∈ t˜ , i.e., suppose there exists y ∈ such that x = t˜U (y). Then, according to the first of equations (4.9), ˜ and so x ∈ ˜ by the we have x = t˜ U (y) = U˜ (u(− ˜ λ˜ y)). However, u(− ˜ λ˜ y) ∈ ˜ definition of . We now establish Statement 3. We shall prove the first inclusion; the proof of the sec˜ i.e. suppose there exists y ∈ ˜ such that x = t U˜ (y). ond one is analogous. Let x ∈ t , Then, the first of Eqs. (4.10) gives us L(x) = U (−λ y), and hence, by Statement 1, −λ y ∈ and L(x) ∈ . ˜ ˜ and t , ˜ t˜ are natural domains on which to define F , F˜ and L, L. The domains , However, to ensure that T is well defined and compact, we must obtain smaller domains on which T is bounded and analyticity improving. This we do in the next section.
5. Analyticity-Improving Domains Let a, b ∈ R, a < b and D(a, b) be an open disc in C with diameter (a, b). We intro˜ 1 = U˜ (D(α˜ 1 , β˜1 )), duce the domains 1 = U (D(α1 , β1 )), 0 = U (D(α0 , β0 )), ˜ ˜ ˜
0 = U (D(α˜ 0 , β0 )), where −1 α1 = −λ˜ −1 , α˜ 1 = −λ−1 , β0 = u(−λ˜ −1 ) , β˜0 = u(−λ ˜ ), (5.1) −1 α0 = −3λ˜ −1 /4 − λu(−λ ˜ )/4 ,
˜ −1 /2 + u(−λ˜ −1 )/2 , β1 = (λλ)
˜ α˜ 0 = −3λ−1 /4 − λu(− λ˜ −1 )/4 , −1 ˜ −1 /2 + u(−λ β˜1 = (λλ) ˜ )/2 .
(5.2) (5.3)
The following inequalities will be important in what follows: ˜ −1 , 1 < u(−λ˜ −1 ) < (λλ)
−1 ˜ −1 . 1 < u(−λ ˜ ) < (λλ)
(5.4)
We shall now prove (5.4). To show that u(−λ˜ −1 ) > 1 we use the fact that u(x) is an anti-Herglotz function which is decreasing in (−λ˜ −1 , 1) and satisfies the condition u(−λ) = 1, λ ≤ λ˜ −1 . This gives u(−λ˜ −1 ) = limx→−λ˜ −1 u(x) > 1. Next, u(−λ˜ −1 ) = ˜ −1 if and only if z˜ 1 µ1/d λ < 1 which z˜ 1 /t˜1/d = z˜ 1 µ1/d λ˜ −1 , so that u(−λ˜ −1 ) < (λλ) follows from the inequalities z˜ 1 < 1, µ1/d λ < 1. The other inequalities follow similarly. From the inequalities (5.4), it straightforward to check the following: ˜ −1 , −λ˜ −1 = α1 < α0 < −λ < 0 < 1 < β0 < β1 < (λλ) ˜ −1 , −λ−1 = α˜ 1 < α˜ 0 < −λ˜ < 0 < 1 < β˜0 < β˜1 < (λλ)
(5.5a) (5.5b)
˜0 ˜ 1 ⊂ . ˜ and from these it is easy to check that 0 1 ⊂ and ˜ ˜ 0. We have the following lemma concerning the domains 0 , 1 , 1 , ˜ 0, ˜ 1 satisfy: Lemma 5. The domains 0 , 1 , ˜ t˜ 1 ) ⊂ ˜ 0 , L(t ˜ 1 ) ⊂ 0 , 1. L( ˜ 0, t ˜ 1 ⊂ 0 , 2. t˜ 1 ⊂ ˜0. 3. [0, 1] ⊂ 0 , [0, 1] ⊂ Proof. We shall prove the first inclusion of Statement 1, the proof of the second is similar. ˜ 0. Let x = U (ζ ) ∈ 1 , ζ ∈ D(α1 , β1 ). By (4.10), we need to show that U˜ (−λ˜ ζ ) ⊂ ˜ ˜ This is equivalent to −λD(α1 , β1 ) ⊂ D(α˜ 0 , β0 ), where we have used the property that if an anti-Herglotz function is holomorphic on a real segment (A, B) and maps it into the real segment (A , B ), then it maps D(A, B) into D(A , B ) (see [6]). This in turn ˜ 1 ≤ β˜0 which are easy to check with help of gives the inequalities −λ˜ β1 ≥ α˜ 0 and −λα the inequalities (5.4). We shall now outline the proof of the first inclusion of Statement 2. The proof of the second one is similar. Consider x = U (ζ ) ∈ 1 for some ζ ∈ D(α1 , β1 ). From the first of Eqs. of (4.9) we have t˜ x = t˜ U (ζ ) = U˜ (u(− ˜ λ˜ ζ )). We note that u(x) ˜ is an anti-Herglotz function, analytic in the domain C+ ∪ C− ∪ (−λ−1 , 1). It is sufficient to ˜ ) ∈ D(α˜ 0 , β˜0 ). We have −λζ ˜ ∈ D(a, b), where a = −λβ ˜ 1 , b = −λα ˜ 1 show that u(− ˜ λζ ˜ ) ∈ D(u(b), and u(− ˜ λζ ˜ u(a)). ˜ We thus verify that D(u(b), ˜ u(a)) ˜ ⊂ D(α˜ 0 , β˜0 ). Finally, we prove Statement 3. We have 0 = U (D(α0 , β0 )). Thus, 0 ⊂ D(U (β0 ), U (α0 )). ˜ −1 ) and that U (−λ) = 1, Using the property that U (x) is decreasing in (−λ˜ −1 , (λλ) U (1) = 0 we conclude that the condition [0, 1] ⊂ 0 is equivalent to the inequalities ˜ 0 is similar. α0 < −λ and β0 > 1 which are given by (5.5). The proof of [0, 1] ⊂
˜ t˜x)) ˜ 0 then v( From Lemma 5 we have that if (v, v) ˜ are analytic on 0 × ˜ t˜x), v( ˜ L( ˜ 1 . Furthermore, differentiating are analytic on 1 and v(tx), v(L(tx)) are analytic on ˜ since U is univalent on the first equation of (4.10) gives L (t U˜ (x)) = 0 for all x ∈ , ˜ ⊂ . We deduce that L (tx) = 0 for all x ∈ ˜ = U˜ (). ˜ Similarly L˜ (tx) = 0 −λ for all x ∈ . ˜1 ⊂ ˜ we conclude that if (v, v) ˜0 Since 1 ⊂ and ˜ are analytic on 0 × ˜ 1. T (v, v) ˜ is defined and analytic on 1 × ˜ 1 (resp. We note that the derivative L (tx) (resp. L˜ (t˜x)) vanishes at x = z1d /t ∈ ∂ d ˜ ˜ ˜ x = z˜ 1 /t ∈ ∂ 1 ). But its reciprocal 1/L (tx) (resp. 1/L (t x)) is bounded in any ˜ ˜ 1 (resp. 1 ). Hence T (v, v)
˜ is well defined and bounded on any domain ˜ with 1 , ˜ ˜ 1.
1 × 1 1 1 From this we immediately have the following lemma, which shows that T is analyticity-improving. Lemma 6. If (v(x), v(x)) ˜ is a pair of real function on [0, 1] which extend to a holomor˜ 0 ⊂ C2 then the pair (v1 (x), v˜1 (x)), defined in (4.8), extend phic functions on 0 × ˜ 1. to holomorphic functions on 1 × We now define the Banach space in which we shall work. Definition. Let B denote the Banach space of pairs of functions (v(x), v(x)) ˜ holomor2 ˜ phic and bounded on 0 × 0 ⊂ C which are real on [0, 1]. We equip B with the norm
||(v, v)|| ˜ = max
sup |v(x)|, sup |v(x)| ˜ x∈ 0
.
(5.6)
˜0 x∈
The results of this section enable us to conclude that T is compact. Indeed, we have the following, which is a direct consequence of Lemma 4 and Lemma 6. Corollary 1. T is a compact operator on B and T B ⊂ B. Moreover, for every 1 , ˜ ˜ 1 we have
1 sup |v(y)| sup |v1 (x)| ≤ t˜−1 1 + sup ˜ , (5.7a) ˜ (t˜x) y∈ ˜ x∈ x∈ L 0 1 −1 sup |v(y)| . 1 + sup (5.7b) sup |v˜1 (x)| ≤ t L (tx) y∈ 0 ˜ ˜ x∈ x∈ ˜ 6. Properties of the Functions L(x) and L(x) ˜ in particular we In this section, we prove several properties of the functions L and L; show that they are convex. Lemma 7. The function L+ is convex on [0, z1d ], and the function L− is convex on [z1d , 1]. The function L˜ + is convex on [0, z˜ 1d ], and the function L˜ − is convex on [˜z1d , 1]. Proof. In order to prove the lemma, we prove that S± = L−1 ± are convex. The proof is analogous to the proof of Lemma 3.2 from [4].
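Let $S_\pm(\zeta) = U(\pm\zeta^{1/d})$, $\tilde S_\pm(\zeta) = \tilde U(\pm\zeta^{1/d})$. Then
$$\frac{S_\pm''(\zeta)}{S_\pm'(\zeta)} = -\frac{1}{d\zeta}\left(d - 1 - x\,\frac{U''(x)}{U'(x)}\right), \qquad x = \pm\zeta^{1/d}. \tag{6.1}$$
Using Lemma 2 and the a priori bounds (2.9a) we have, for $x = \zeta^{1/d} > 0$,
$$-\frac{S_+''(\zeta)}{S_+'(\zeta)} > \frac{1}{d\zeta}\left(\frac{1+\lambda\tilde\lambda}{1-\lambda\tilde\lambda} - \frac{1+\lambda\tilde\lambda x}{1-\lambda\tilde\lambda x}\right). \tag{6.2}$$
This is positive for $x < 1$. For $x = -\zeta^{1/d} \le 0$, we get:
$$-\frac{S_-''(\zeta)}{S_-'(\zeta)} > \frac{1}{d\zeta}\left(\frac{1+\lambda\tilde\lambda}{1-\lambda\tilde\lambda} - \frac{1-\tilde\lambda x}{1+\tilde\lambda x}\right). \tag{6.3}$$
This is positive for $-\lambda \le x \le 0$. This gives us the inequalities
$$-\frac{S_+''(\zeta)}{S_+'(\zeta)} > 0, \quad \zeta \in [0, 1], \qquad -\frac{S_-''(\zeta)}{S_-'(\zeta)} > 0, \quad \zeta \in [0, \lambda^d]. \tag{6.4}$$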
The analogous inequalities hold for the functions $\tilde S_\pm(x)$. This completes the proof of Lemma 7. The next two lemmas give important estimates on $L$, $\tilde L$ and their derivatives.

Lemma 8. For all $x \in [0, 1]$, we have
$$L_+(tx) > tx, \qquad \tilde L_+(\tilde t x) > \tilde t x. \tag{6.5}$$
Proof. The proof is similar to that of Corollary 3.3 of [4]. By the monotonicity and convexity of $L$, $\tilde L$, it suffices to prove the lemma for $x = 1$. Using Eqs. (4.5) at $x = 0$ we find $L(1) = -\lambda^d$, $\tilde L(1) = -\tilde\lambda^d$. Reapplying them at $x = 1$ we obtain
$$L(L(t)) = (\lambda\tilde\lambda)^d, \qquad \tilde L(\tilde L(\tilde t)) = (\lambda\tilde\lambda)^d. \tag{6.6}$$
The inequalities $\lambda < z_1\mu^{-1/d}$, $\tilde\lambda < \tilde z_1\mu^{1/d}$, and $z_1, \tilde z_1 < 1$ imply
$$z_1^d > \frac{(\lambda\tilde\lambda)^d}{\tilde t}, \qquad \tilde z_1^d > \frac{(\lambda\tilde\lambda)^d}{t}, \tag{6.7}$$
so that, in particular,
$$z_1^d > (\lambda\tilde\lambda)^d, \qquad \tilde z_1^d > (\lambda\tilde\lambda)^d. \tag{6.8}$$
Let $\zeta = L(t)/t$. Then the inequality $L(t) > t$ is equivalent to $\zeta > 1$. The function $\tilde L(x)$ is decreasing for $x \in [0, 1]$ and takes values in the interval $[-\tilde\lambda^d, 1]$. Suppose that $\zeta \le 1$. Then from (6.6) and (4.5) we have
$$\tilde L(\zeta) = -\frac{1}{\lambda^d} L((\lambda\tilde\lambda)^d) < -\frac{1}{\lambda^d} L(t\tilde z_1^d) = -\frac{z_1^d}{\lambda^d}, \tag{6.9}$$
where we have used (6.7). Thus, we must have $-z_1^d/\lambda^d \in [-\tilde\lambda^d, 1]$, i.e. $z_1^d < (\lambda\tilde\lambda)^d$. This contradicts (6.8), so $\zeta > 1$ and $L(t) > t$. The inequality $\tilde L(\tilde t) > \tilde t$ can be shown in a similar manner.
Lemma 9. For all $x \in [0, 1]$, we have
$$L'(tx) < -1, \qquad \tilde L'(\tilde t x) < -1. \tag{6.10}$$
˜ Proof. Using the convexity of L(x) and L(x) it is sufficient to prove the above inequali˜ d as ties for x = 1 only. We will show that L (t) < −1. Let y = L(t), then L(y) = (λλ) follows from (6.6). According to the previous lemma t < y. By the mean value theorem ˜ d )/(t − y). there exists t0 ∈ [t, y] such that L (t0 ) = (L(t) − L(y))/(t − y) = (y − (λλ) ˜ d which The inequality |L (t0 )| > 1 can be easily established then with help of t > (λλ) −1 in its turn follows from µ λ˜ < 1. Since t0 > t, then, from the convexity of L(x), we see that L (t) < −1. The same proof holds for L˜ (x). 7. Invariant Cone for T Generalizing the result obtained in [4], now we can derive the existence of an invariant cone for the operator T in the space of functions (v(x), v(x)). ˜ We shall then be able to apply the Krein-Rutman theorem. Definition. Define 1 to be the set of pairs (v(x), v(x)) ˜ of real smooth functions on [0, 1] which, for x ∈ [0, 1], satisfy (i) v(x) ≥ 0, v(x) ˜ ≥ 0, and (ii) v (x) ≤ 0, v˜ (x) ≤ 0. The following lemma is a generalization of Lemma 3.4 of [4]. Lemma 10. Let = B ∩ 1 . Then T maps 1 into itself and T 2 maps any non-zero vector in into the interior of . Proof. Let T (v, v) ˜ = (v1 , v˜1 ). From Lemma 8 we have ˜ t˜x)(1 + 1/L˜ (t˜x)), t˜v1 (x) ≥ v(
t v˜1 (x) ≥ v(tx)(1 + 1/L (tx)) .
(7.1)
Both of these expressions are non-negative since $L_+'(tx) < -1$, $\tilde L_+'(\tilde t x) < -1$ according to Lemma 9. Differentiation of (4.8) gives
$$v_1'(x) = \tilde v'(\tilde t x) + \tilde v'(\tilde L(\tilde t x)) - \tilde v(\tilde L(\tilde t x))\,\tilde L''(\tilde t x)\,\tilde L'(\tilde t x)^{-2}, \tag{7.2a}$$
$$\tilde v_1'(x) = v'(tx) + v'(L(tx)) - v(L(tx))\,L''(tx)\,L'(tx)^{-2}. \tag{7.2b}$$
These two expressions are non-positive since each one is a sum of three non-positive terms. Thus we have proved that (v1 , v˜1 ) ∈ 1 . Repeating the arguments from the proof of Lemma 3.4 from the paper [4], we notice that the interior of is composed of (v(x), v(x)) ˜ for which the inequalities defining are all strict. Suppose that (v(x), v(x)) ˜ ≡ (0, 0). If v(x) (resp. v(x)) ˜ vanished for some x ∈ [0, 1), it would have to vanish on [x, 1], hence everywhere by analyticity, i.e. 1 is the only place in [0, 1] where v(x) (resp. v(x)) ˜ can vanish. But according to (7.1) v1 (x) and v˜1 (x) cannot vanish even at 1. Considering now (7.2), we observe that their last terms cannot vanish in (0, 1], and can vanish at 0 only if v(1) = 0 or v(1) ˜ = 0. This proves that T 2 (v, v) ˜ is in the interior of . From the theorem of Krein and Rutman [10] we thus have the following result. Theorem 2. The operator T , acting on B, has an eigenvalue of largest modulus δ > 0. The spectral subspace corresponding to δ is one-dimensional and is generated by an element from the interior of which is the only eigenvector of T in . In the next section we give some bounds on this eigenvalue δ.
8. Bounds on the Expanding Eigenvalue

Let $(v, \tilde v)$ be an eigenvector with eigenvalue $\delta > 0$ in the cone $v \ge 0$, $\tilde v \ge 0$, $v' \le 0$, $\tilde v' \le 0$. We have further that $v(0), \tilde v(0) > 0$, since $(v, \tilde v)$ is in the interior of the cone. The eigenvector equations are
$$\delta v(x) = \tilde t^{-1}\left(\tilde v(\tilde t x) + \frac{\tilde v(\tilde L(\tilde t x))}{\tilde L'(\tilde t x)}\right), \qquad \delta\tilde v(x) = t^{-1}\left(v(tx) + \frac{v(L(tx))}{L'(tx)}\right). \tag{8.1}$$
Evaluating these at 0 we obtain
$$\delta v(0) = \tilde t^{-1}\left(\tilde v(0) + \frac{\tilde v(1)}{\tilde L'(0)}\right), \qquad \delta\tilde v(0) = t^{-1}\left(v(0) + \frac{v(1)}{L'(0)}\right). \tag{8.2}$$
Now we have $L'(0), \tilde L'(0) < -1$ and $v(1), \tilde v(1) > 0$ so that, neglecting the second term on the right hand sides of these equations, and multiplying, we immediately obtain the upper bound $\delta^2 v(0)\tilde v(0) < (t\tilde t)^{-1} v(0)\tilde v(0)$, so that $\delta^2 < (t\tilde t)^{-1} = (\lambda\tilde\lambda)^{-d}$, which is the bound in Theorem 1.

To obtain the lower bound, we use the convexity of $L$ and $\tilde L$. Since $v', \tilde v' \le 0$, we have that $v(1) \le v(0)$ and $\tilde v(1) \le \tilde v(0)$ so that, multiplying the eigenvector equations (8.2), we have
$$\delta^2 v(0)\tilde v(0) \ge (t\tilde t)^{-1} v(0)\tilde v(0)\left(1 + \frac{1}{L'(0)}\right)\left(1 + \frac{1}{\tilde L'(0)}\right). \tag{8.3}$$
From the convexity of $L$ and $\tilde L$ we have $L'(0) < -1/z_1^d < -1$ and $\tilde L'(0) < -1/\tilde z_1^d < -1$, so that $1 - z_1^d < 1 + 1/L'(0)$ and $1 - \tilde z_1^d < 1 + 1/\tilde L'(0)$, and, hence,
$$\delta^2 > \frac{1}{t\tilde t}(1 - z_1^d)(1 - \tilde z_1^d). \tag{8.4}$$
Recall that we have $V'(1)\tilde V'(1) = (\lambda\tilde\lambda)^{-1}$. Then both $V$ and $\tilde V$ are convex since they are scaled versions of $S_+$ and $\tilde S_+$ respectively. We also have $V(1) = 1$, $V(\alpha) = 0$, where $\alpha = z_1^{-d} > 1$. The graph of $V$ is sketched in Fig. 1. From the convexity of $V$ we have $V'(1) \le -1/(\alpha - 1)$ so that $\alpha - 1 \ge -1/V'(1)$. Similarly, we have $\tilde\alpha - 1 \ge -1/\tilde V'(1)$, and thus $(\alpha - 1)(\tilde\alpha - 1) \ge \lambda\tilde\lambda$. Now if $x, y > 1$ and we have $(x - 1)(y - 1) \ge C > 0$, then a straightforward application of Lagrange multipliers shows that $(1 - x^{-1})(1 - y^{-1}) \ge C/(1 + \sqrt{C})^2$. We conclude that
$$\frac{1}{t\tilde t}(1 - z_1^d)(1 - \tilde z_1^d) = \frac{1}{t\tilde t}(1 - \alpha^{-1})(1 - \tilde\alpha^{-1}) \ge \frac{1}{(\lambda\tilde\lambda)^{d-1}\left(1 + \sqrt{\lambda\tilde\lambda}\right)^2}. \tag{8.5}$$
Now, recall that for $g = \sqrt{\lambda\tilde\lambda}$ we have from (3.9) that $g^{d-1} \le (1 + g + g^3)^{-1} < (1 + g)^{-1}$. It follows that $(\lambda\tilde\lambda)^{d-1}(1 + \sqrt{\lambda\tilde\lambda})^2 = g^{2(d-1)}(1 + g)^2 < 1$ and thus $\delta^2 \ge$
1
$$(1 - x^2)\log(1 + x + x^3) + 2x^2\log(x) > 0, \qquad x \in (0, 1). \tag{A.1}$$
Dividing both sides of (A.1) by $2x^2$ we obtain the equivalent inequality
$$F(x) = \frac{1}{2}\left(\frac{1}{x^2} - 1\right)\log(1 + x + x^3) + \log(x) > 0, \qquad x \in (0, 1). \tag{A.2}$$
Differentiation gives
$$F'(x) = -\frac{1}{x^3}\log(1 + x + x^3) + \frac{1 + 2x + 4x^2 - x^4}{2x^2(1 + x + x^3)}. \tag{A.3}$$
Since $F(1) = 0$, in order to prove (A.2) it is sufficient to show that $F'(x) < 0$ for $x \in (0, 1)$. Firstly we shall obtain a lower rational bound on the function $\log(1 + x)$. We have the following integral formula:

$$g(x) = \frac{\log(1 + x)}{x} = \int_0^1 \frac{dt}{1 + xt}, \tag{A.4}$$

which implies (see [18], p. 279) that $g(x)$ can be expressed as a g-fraction

$$g(x) = \cfrac{1}{1 + \cfrac{g_1 x}{1 + \cfrac{(1 - g_1) g_2 x}{1 + \cfrac{(1 - g_2) g_3 x}{1 + \cdots}}}} \tag{A.5}$$
with coefficients gi ∈ [0, 1] defined uniquely by derivatives of g(x) calculated at x = 0. These may be calculated with help of the so-called Stieltjes formulas. (See [18], p. 203.)
We have $g_1 = 1/2$, $g_2 = 1/3$, $g_3 = 1/2$, $g_4 = 2/5$. According to the a priori bounds for $g(x)$ obtained in [16] (Theorem 2.2, case $k = 2$) we have

$$g(x) = \frac{\log(1 + x)}{x} \ge \frac{6 + 5x}{2(x + 3)(1 + x)}, \qquad x \ge 0, \tag{A.6}$$

so that, multiplying both sides by $x$ and replacing $x$ by $x + x^3$, we obtain

$$\log(1 + x + x^3) \ge \frac{x(1 + x^2)(6 + 5x + 5x^3)}{2(x + x^3 + 3)(1 + x + x^3)}, \qquad x \ge 0. \tag{A.7}$$

Substituting the last inequality in (A.3) we obtain

$$F'(x) < R(x) = \frac{-(3 - 2x - 8x^2 + 5x^3 + x^4 + 2x^5 + x^7)}{2x^2(x + x^3 + 3)(1 + x + x^3)}. \tag{A.8}$$
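The algebra leading from (A.3) and (A.7) to (A.8) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it multiplies out the numerator of $R(x)$ over the common denominator $2x^2(x + x^3 + 3)(1 + x + x^3)$ with integer coefficient lists and compares it with $-P(x)$, and it samples the rational lower bound (A.6) on a grid. The helper names (`poly_mul`, `poly_sub`, `lower_bound_gap`) and the grid are illustrative choices.

```python
import math

def poly_mul(a, b):
    """Multiply polynomials given as integer coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_sub(a, b):
    """Return a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

# Numerator of R(x) over the common denominator 2x^2 (x + x^3 + 3)(1 + x + x^3):
#   (1 + 2x + 4x^2 - x^4)(3 + x + x^3) - (1 + x^2)(6 + 5x + 5x^3)
second_term = poly_mul([1, 2, 4, 0, -1], [3, 1, 0, 1])
log_bound = poly_mul([1, 0, 1], [6, 5, 0, 5])
numerator = poly_sub(second_term, log_bound)

# Coefficients of P(x) = 3 - 2x - 8x^2 + 5x^3 + x^4 + 2x^5 + x^7, negated.
minus_P = [-c for c in (3, -2, -8, 5, 1, 2, 0, 1)]

def lower_bound_gap(x):
    """log(1+x)/x minus the rational lower bound (6+5x)/(2(x+3)(1+x)) of (A.6)."""
    return math.log1p(x) / x - (6 + 5 * x) / (2 * (x + 3) * (1 + x))

# Sample (A.6) on an illustrative grid 0.01, 0.02, ..., 3.00.
gaps = [lower_bound_gap(0.01 * k) for k in range(1, 301)]
```

A grid check of (A.6) is of course not a proof; it only guards against transcription errors in the rational bound.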
Obviously, in order to prove $F'(x) < 0$ it is sufficient to prove that $R(x) < 0$ for $x \in (0, 1)$. But this is equivalent to the polynomial inequality

$$P(x) = 3 - 2x - 8x^2 + 5x^3 + x^4 + 2x^5 + x^7 > 0, \qquad x \in (0, 1). \tag{A.9}$$
Since $P(0) > 0$ we can establish (A.9) by showing that $P(x)$ does not have any roots in the interval $(0, 1)$. Below we calculate its Sturm sequence [1]:

$$s_1(x) = -\frac{21}{4} + 3x + 10x^2 - 5x^3 - \frac{3}{4}x^4 - x^5, \tag{A.10}$$

$$s_2(x) = -\frac{409}{337} + \frac{1096}{337}x + \frac{264}{337}x^2 - \frac{1604}{337}x^3 + x^4, \tag{A.11}$$

$$s_3(x) = \frac{12104}{30867} - \frac{19981}{30867}x - \frac{35629}{61734}x^2 + x^3, \tag{A.12}$$

$$s_4(x) = -\frac{4822700}{11118563} - \frac{1726274}{11118563}x + x^2, \tag{A.13}$$

$$s_5(x) = -\frac{418811267}{558846882} + x, \tag{A.14}$$

$$s_6(x) = -1. \tag{A.15}$$
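The Sturm computation can be reproduced in exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than the authors' code: it builds the chain $P$, $P'$, and the successive negated remainders with Python's `fractions.Fraction`, without the positive rescaling used in (A.10)–(A.15) (which does not affect signs), and counts sign changes at $x = 0$ and $x = 1$. All function names are hypothetical.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, lowest degree first)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a.pop()  # the leading coefficient has just been cancelled
    while a and a[-1] == 0:
        a.pop()
    return a

def sturm_chain(p):
    """Return [p, p', -rem(p, p'), ...] until the remainder vanishes."""
    chain = [list(p), [i * c for i, c in enumerate(p)][1:]]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def evaluate(p, x):
    """Horner evaluation of a coefficient list at x."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def sign_changes(chain, x):
    """Number of sign changes in the chain evaluated at x, ignoring zeros."""
    values = [evaluate(p, x) for p in chain]
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# P(x) = 3 - 2x - 8x^2 + 5x^3 + x^4 + 2x^5 + x^7
P = [Fraction(c) for c in (3, -2, -8, 5, 1, 2, 0, 1)]
chain = sturm_chain(P)
roots_in_unit_interval = sign_changes(chain, Fraction(0)) - sign_changes(chain, Fraction(1))

# First remainder, rescaled so that its leading coefficient is -1 as in (A.10).
s1 = [c / abs(chain[2][-1]) for c in chain[2]]
```

Here `roots_in_unit_interval` is the Sturm count for the half-open interval $(0, 1]$; since $P(0)$ and $P(1)$ are both non-zero this is the same as the count for $(0, 1)$.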
Counting the number of sign changes between the Sturm functions $s_i(x)$ evaluated at the two points $x = 0$ and $x = 1$ we conclude that $P(x)$ has no roots in the interval $(0, 1)$.

References

1. Acton, F.S.: Numerical Methods that Work. 2nd printing. Washington, DC: Math. Assoc. Amer., 1990
2. Arneodo, A., Coullet, P., Tresser, C.: A renormalization group with periodic behaviour. Phys. Lett. A 70, 74–76 (1979)
3. Coullet, P., Tresser, C.: Itération d'endomorphismes et groupe de renormalisation. J. de Physique C 5, 25–28 (1978)
4. Eckmann, J.-P., Epstein, H.: Bounds on the unstable eigenvalue for period doubling. Commun. Math. Phys. 128, 427–435 (1990)
5. Epstein, H.: New proofs of the existence of the Feigenbaum functions. Commun. Math. Phys. 106, 395–426 (1986)
6. Epstein, H.: Fixed points of composition operators. In: Nonlinear Evolution and Chaotic Phenomena, Gallavotti, G., Zweifel, P. (eds.), New York: Plenum Press, 1988
7. Epstein, H.: Existence and properties of p-tupling fixed points. Commun. Math. Phys. 215, 443–476 (2000)
8. Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19, 25–52 (1978)
9. Jensen, R.V., Ma, L.K.H.: Nonuniversal behavior of asymmetric unimodal maps. Phys. Rev. A 31, 3993–3995 (1985)
10. Krein, M.G., Rutman, M.A.: Linear operators leaving invariant a cone in a Banach space. Usp. Mat. Nauk 31, 3–95 (1948); English Translation: Functional analysis and measure theory. Providence, RI: Am. Math. Soc., 1962
11. Lanford, O.E.: A computer-assisted proof of the Feigenbaum conjectures. Bull. Am. Math. Soc. 6, 427–434 (1982)
12. McMullen, C.T.: Renormalization and 3-Manifolds which Fiber over the Circle. Annals of Math. Studies, Vol. 142, Princeton, NJ: Princeton University Press, 1996
13. Mestel, B.D., Osbaldestin, A.H.: Feigenbaum theory for unimodal maps with asymmetric critical point. J. Phys. A 31, 3287–3296 (1998)
14. Mestel, B.D., Osbaldestin, A.H.: Feigenbaum theory for unimodal maps with asymmetric critical point: rigorous results. Commun. Math. Phys. 197, 211–228 (1998)
15. Mestel, B.D., Osbaldestin, A.H.: Asymptotics of scaling parameters for period-doubling in unimodal maps with asymmetric critical points. J. Math. Phys. 41, 4732–4746 (2000)
16. Mestel, B.D., Osbaldestin, A.H., Tsygvintsev, A.V.: Continued fractions and solutions of the Feigenbaum-Cvitanovic equation. C.R. Acad. Sci. Paris, 334, Série I, 683–688 (2002)
17. Sullivan, D.: Bounds, quadratic differentials and renormalization conjectures. In: Mathematics into the Twenty-first Century. Browder, F.E. (ed.), Providence, RI: Am. Math. Soc., 1992
18. Wall, H.S.: Analytic Theory of Continued Fractions. New York: Van Nostrand, 1948

Communicated by G. Gallavotti
Commun. Math. Phys. 250, 259–285 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1145-0
Communications in
Mathematical Physics
No Quantum Ergodicity for Star Graphs
G. Berkolaiko1, J.P. Keating2, B. Winn2
1 Department of Mathematics, University of Strathclyde, Glasgow G1 1XH, UK
2 School of Mathematics, University of Bristol, Bristol BS8 1TW, UK. E-mail: [email protected]
Received: 25 July 2003 / Accepted: 18 March 2004
Published online: 12 August 2004 – © Springer-Verlag 2004
Present address: Department of Mathematics, Texas A&M University, College Station, TX 77843-3368, USA. E-mail: [email protected]
Present address: Dipartimento di Matematica, Università di Bologna, 40127 Bologna, Italy. E-mail: [email protected]

Abstract: We investigate statistical properties of the eigenfunctions of the Schrödinger operator on families of star graphs with incommensurate bond lengths. We show that these eigenfunctions are not quantum ergodic in the limit as the number of bonds tends to infinity by finding an observable for which the quantum matrix elements do not converge to the classical average. We further show that for a given fixed graph there are subsequences of eigenfunctions which localise on pairs of bonds. We describe how to construct such subsequences explicitly. These structures are analogous to scars on short unstable periodic orbits.

1. Introduction

Let $\psi_n$ denote the wave-function corresponding to the $n$th energy level of a quantum system that has a Hamiltonian dynamical system as its classical limit. We are interested in these wave-functions in the $n \to \infty$ limit, which corresponds to the semi-classical regime. Numerical and some theoretical evidence supports the hypothesis that their behaviour in this limit is determined by general properties of the underlying Hamiltonian such as, for example, time-reversibility, integrability and statistical properties of the flow (ergodicity, mixing, etc.). A deeper understanding of this is one of the goals of current research in quantum chaology.

When the classical Hamiltonian generates chaotic motion, the semi-classical eigenfunction hypothesis asserts that the wave-functions should equidistribute over the appropriate energy shell [Be1, V]. A physical explanation for this is that in the semi-classical limit the quantum system should mimic the behaviour of the classical system; if the classical motion is chaotic, then a typical trajectory ergodically explores the surface
of constant energy in phase space. Another interpretation is that eigenstates are invariant under time evolution, so it is natural to associate them in the semi-classical limit with classical invariant sets. One such invariant set is the energy shell itself.

The Schnirelman theorem [S, CdV, Z, GL] states that for systems in which the Hamiltonian flow is ergodic the sequence of measures induced by $\psi_n$ converges to Liouville measure in the limit as $n \to \infty$ along a subsequence of density one. This behaviour has been termed “quantum ergodicity”. Quantum ergodicity implies a weak version of the semi-classical eigenfunction hypothesis [BSS].

It is possible that quantum ergodic systems have subsequences of states for which the corresponding measures do not converge to Liouville measure (of course such subsequences have density zero). These subsequences, if they exist, are expected to be associated with other classical invariant sets, such as periodic orbits. The case where the limit of an exceptional subsequence is a singular measure supported on one or more isolated, unstable periodic orbits of the classical system is called “scarring”.

Scarred eigenstates were observed numerically by Heller [H], who proposed the first theoretical explanation for their existence, based on the semi-classical evolution of a wave-packet centred on a periodic orbit under linearised dynamics. Another important development was an understanding of the contribution to wave-functions from all periodic orbits [Bo, Be2] resulting in formulæ related to the semi-classical trace formula for the density of states. Later, the theory was extended to include non-linear effects [KH] and, more recently, situations where the orbit in question undergoes a bifurcation [KP]. A review of related works was given in [K1]. All of the above mentioned theories relate to scar effects in averages over a semi-classically increasing number of states.
This may be thought of as a weakened form of scarring, because it is not clear that any one state in the averaging range causes the scar; the scars may be a collective effect. It is a much harder problem to show that a particular sequence of individual states is scarred. Currently, the only systems known rigorously to support scarring in this strong form are the cat maps [FNdB] which have non-generic spectral statistics caused by number-theoretical symmetries [Ke]. For quantum graphs [KS1] the wave-functions are the eigenfunctions of the (continuous) Laplace operator on the bonds with matching conditions at the vertices chosen to make the problem self-adjoint. The semi-classical limit is equivalent to the limit b → ∞ where b is the number of bonds. The classical dynamics is realised as a Markov random walk on the bonds of the graph defined by a bistochastic matrix of bond-to-bond transition probabilities. Such motion is ergodic if 1 is the only eigenvalue of this matrix lying on the unit circle. Although the classical dynamics on graphs is not Hamiltonian, for a sequence of graphs with an increasing number of bonds, each with ergodic Markov dynamics, we might expect that something equivalent to a Schnirelman theorem holds in the limit. However, because the semi-classical limit for graphs fundamentally affects the classical dynamics, one cannot simply adapt the reasoning used in the proof of the Schnirelman theorem. The same problem presents itself in formulating an equivalent version of the Bohigas-Giannoni-Schmit conjecture for spectral statistics [BGS]. In this case we have a more complete understanding. It has been conjectured by Tanner [T] that the spectral statistics of a sequence of quantum graphs in the limit b → ∞ will be the same as those of generic quantised, classically-chaotic, systems (i.e. 
of random matrix type) provided that the spectral gap between the largest and second largest eigenvalues of the associated Markov transition matrix for the classical dynamics closes more slowly than 1/b. The Tanner conjecture is supported by much numerical evidence [KS2] and in recent
theoretical developments [BSW1, BSW2, B] a similar condition was used, namely that the onset of the ergodic regime increases not faster than the number of bonds.¹ Since there is no universally accepted definition of ergodicity in the limit of infinite-sized graphs the Tanner conjecture represents the natural replacement for ergodicity as a criterion for random matrix statistics. Although there are no conjectures for a Schnirelman theorem for quantum graphs, it would not be surprising if the same spectral gap criterion proved necessary for the realisation of quantum ergodic behaviour.

In parallel to the developments in their spectral statistics cited above, graphs have been a rich source of problems in quantum chaology and related fields. Recent works have considered: scattering problems [KS3, TM, KS4], the spacing distribution of eigenvalues [BG], nodal domain statistics [GSW], the Dirac operator on graphs [BH1, BH2], Brownian motion on graphs [CDM, D] and the important question of how to construct families of graphs with increasing numbers of bonds [PTŻ].

Recently, authors have begun to investigate the wave-functions of quantum graphs. Kaplan [K2] studied eigenfunction statistics for ring-graphs using a combination of numerical techniques and analytical calculations of the short-time semiclassical behaviour of a wave-packet close to a 1-bond periodic orbit. The inverse participation ratio (a measure of localisation in a given state) was found to be well-described by this contribution, and shows deviation from the ergodically expected behaviour. Similar deviations were noticed for lattice-graphs. Remarkably, Schanz and Kottos [SK] observed that it would be impossible for the shortest orbits that are responsible for this enhanced localisation to support strong scarring. They wrote down an explicit criterion which must be satisfied by the energy of any strongly scarred state, and deduced asymptotics of the probability distribution of scarring strengths.
In [KMW] a study was made of the eigenfunctions of a family of graphs known as star graphs (the name being derived from the connectivity of graphs in the family). The value distribution for the amplitude of eigenfunctions on a single bond of the graph, subject to an appropriate normalisation, was rigorously calculated in the limit as the number of bonds tends to infinity. In fact the normalisation implies that star graphs with a fixed, finite number of bonds are not quantum ergodic, although they are classically ergodic [T]. However, this result leaves open the question of whether star graphs are quantum ergodic in the limit as the number of bonds tends to infinity. This is because one bond represents a vanishingly small fraction of a graph when the number of bonds becomes infinite, whereas quantum ergodicity is concerned with structures on macroscopic (classical) scales. The results we present here extend the work in [KMW] on star graphs. We review the definition of a quantum star graph in Sect. 2 below. We show that (see the following subsection for precise statements) quantum star graphs are not quantum ergodic in the limit as the number of bonds tends to infinity. We also show that for any given star graph there exist exceptional subsequences of eigenfunctions that become localised on pairs of bonds as n → ∞. Orbits on a graph are simply itineraries of bonds, so this localisation is analogous to strong scarring on short period-2 orbits. Such orbits are unstable in the sense that there is an exponentially small probability of remaining on a given orbit. Our explicit construction supports the observation of Schanz and Kottos [SK] that star graphs support a large number of states scarred in such a way. The spectral statistics of star graphs are different from those associated with the more general graphs described above [BK]. 
¹ While the present manuscript was in an advanced stage of preparation the preprint [GA] was released in which it is demonstrated that the quantum spectra of graphs with gaps closing more slowly than $b^{-1/2}$ exhibit the same correlations as the spectrum of large random matrices.

In fact the spectral gap of their Markov transition
matrices closes precisely at the critical rate, $1/b$, of the Tanner conjecture [KS2] and so star graphs are excluded from its scope. Our result, therefore, implies that any extension of the Schnirelman theorem to quantum graphs must exclude sequences of star graphs. This does not contradict the possibility of a quantum ergodicity theorem for other families of graphs (such as, for example, those whose spectral gap closes strictly more slowly than $1/b$) which remains an open problem. Note that it is not clear a priori that absence of random matrix spectral statistics should indicate the failure of quantum ergodicity; the Schnirelman theorem applies in the case of quantum cat maps [BdB], although the spectral statistics are not of random matrix type [Ke].

It is known that the spectral statistics of quantum star graphs are the same as those associated with the family of Šeba billiards [Še, BBK], so-called “intermediate statistics”. There is evidence to suggest that the results we present on scarring can also be extended to Šeba billiards [BKW].

1.1. Main results. To investigate quantum ergodicity for large star graphs, we consider an observable that picks out a positive proportion of the graph. We consider a graph with $\alpha v$ bonds, where $\alpha, v \in \mathbb{N}$, and the observable $B = (B_i(x))_{i=1}^{\alpha v}$ defined by

$$B_i := \begin{cases} 1 & \text{for } i = 1, \dots, v \\ 0 & \text{for } i = v + 1, \dots, \alpha v. \end{cases} \tag{1}$$

$B$ may be thought of as the indicator function of the first $v$ bonds. The classical average of $B$ is approximately $1/\alpha$. We shall consider the limit $v \to \infty$. Wave-functions on graphs have a component on each bond, so we shall use the notation $\psi^{(n)} := (\psi_i^{(n)})_{i=1}^{\alpha v}$ for the $n$th eigenstate. The inner product $\langle \cdot | \cdot \rangle$ is defined in (8) below. Each bond of the graph has a length, and the vector of bond lengths will be denoted $L := (L_i)_{i=1}^{\alpha v}$.

Theorem 1. For each $v$ let the components of $L$ be linearly independent over $\mathbb{Q}$. Then there exists a probability density $p_v(\eta)$ such that for any continuous function $h$,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} h\big(\langle \psi^{(n)} | B | \psi^{(n)} \rangle\big) = \int_{-\infty}^{\infty} h(\eta)\, p_v(\eta)\, d\eta. \tag{2}$$

The density $p_v(\eta)$ is supported on the interval $[0, 1]$.

Theorem 2. For each $v$ let the bond lengths $L_j$, $j = 1, \dots, \alpha v$, lie in the range $[\bar L, \bar L + \Delta L]$ and be linearly independent over $\mathbb{Q}$. If $v \Delta L \to 0$ as $v \to \infty$ then there exists a probability distribution function $F(R)$ such that for any $R \in (0, 1)$,

$$\lim_{v \to \infty} \int_{-\infty}^{R} p_v(\eta)\, d\eta = F(R), \tag{3}$$

where

$$F(R) = \frac{1}{2} - \frac{1}{\pi\alpha} \operatorname{Re} \int_{-\infty}^{\infty} P_\eta(\xi) \big( \arg(\tau_\eta(\xi)) - i \log|\tau_\eta(\xi)| \big)\, d\xi \,\bigg|_{\eta = 1/R - 1}, \tag{4}$$

and

$$P_\eta(\xi) = \frac{1}{\sqrt{\pi\eta}} \exp\left( -\frac{i\pi}{4} + \frac{i\xi^2}{4\eta} \right) + \frac{(\alpha - 1)}{\sqrt{\pi}} \exp\left( \frac{i\pi}{4} - \frac{i\xi^2}{4} \right),$$

$$\tau_\eta(\xi) = \frac{2}{\sqrt{\pi}} \sqrt{\eta} \exp\left( \frac{i\pi}{4} + \frac{i\xi^2}{4\eta} \right) + \xi \operatorname{erf}\left( \frac{e^{-i\pi/4} \xi}{2\sqrt{\eta}} \right) + \frac{2(\alpha - 1)}{\sqrt{\pi}} \exp\left( -\frac{i\pi}{4} - \frac{i\xi^2}{4} \right) + \xi(\alpha - 1) \operatorname{erf}\left( \frac{e^{i\pi/4} \xi}{2} \right).$$

The function $F(R)$ is plotted in Fig. 1.

Remark 1. If star graphs satisfied quantum ergodicity, then $F(R)$ would be the step-function

$$F(R) = \begin{cases} 1, & \text{for } R > 1/\alpha \\ 0, & \text{for } R < 1/\alpha \end{cases} \tag{5}$$

for this observable.

In Fig. 1 we compare the numerical data for the value distribution of $\langle \psi^{(n)} | B | \psi^{(n)} \rangle$ for a star graph with 90 bonds with the $v \to \infty$ analytical prediction $F(R)$. The difference between the actual distribution $F(R)$ and that which would be expected if the graph were quantum ergodic (Remark 1) is clear. Figure 2 shows the difference between numerical data and $F(R)$ for increasing values of $v$.

We also show that for graphs with a fixed number of bonds, there are subsequences of eigenfunctions that localise on two bonds.

Theorem 3. Let the elements of $L$ be linearly independent over $\mathbb{Q}$. Given any two distinct bonds, indexed by $i_1$ and $i_2$, of a $v$-bond star graph, there exists a subsequence $(k_{n_r}) \subseteq (k_n)$ such that for any $f = (f_i)_{i=1}^{v}$ smooth in each component,

$$\lim_{r \to \infty} \langle \psi^{(n_r)} | f | \psi^{(n_r)} \rangle = \frac{1}{L_{i_1} + L_{i_2}} \left( \int_0^{L_{i_1}} f_{i_1}(x)\, dx + \int_0^{L_{i_2}} f_{i_2}(x)\, dx \right). \tag{6}$$
Fig. 1. Comparing numerical data with the analytical prediction, $F(R)$. For this plot $\alpha = 3$ and in the numerical study, $v = 30$ (legend: numerical study, analytical prediction)

Fig. 2. Convergence to $F(R)$ for $v = 5, 10, 15, 20, 25, 30$

Fig. 3. A star graph with 5 bonds
2. Quantum Star Graphs

A star graph² is a metric graph with $b$ vertices all connected only to one central vertex. Thus there are $b + 1$ vertices and $b$ bonds (Fig. 3). We shall denote by $L \in \mathbb{R}^b$ the vector of bond lengths. We define the quantum star graph in the following way. Let $H$ denote the real Hilbert space

$$H := L^2([0, L_1]) \times \cdots \times L^2([0, L_b]) \tag{7}$$

with inner product

$$\langle f | g \rangle := \sum_{j=1}^{b} \int_0^{L_j} f_j(x)\, g_j(x)\, dx. \tag{8}$$

² Sometimes referred to as a Hydra graph.

Elements of $H$ are denoted $f = (f_1, \dots, f_b)$. Let $F \subseteq H$ be the subset of functions $f$ which are twice-differentiable in each component and satisfy the conditions

$$f_j(0) = f_i(0) =: f_0, \qquad j, i = 1, \dots, b, \tag{9}$$

$$\sum_{j=1}^{b} f_j'(0) = \frac{1}{\lambda} f_0, \tag{10}$$

$$f_j'(L_j) = 0, \qquad j = 1, \dots, b. \tag{11}$$

The parameter $\lambda$ may be varied to give different boundary conditions at the central vertex of the graph. Henceforth we shall concentrate on the case $1/\lambda = 0$, the so-called Neumann condition. The Laplace operator $\Delta$ on $F$ is defined by

$$\Delta f := \left( \frac{d^2 f_1}{dx^2}, \dots, \frac{d^2 f_b}{dx^2} \right). \tag{12}$$

$\Delta$ defined on $F$ is self-adjoint. Since the space on which the functions in $F$ are defined is compact, the operator has a discrete spectrum of eigenvalues ([DS], Sect. XIII.4), i.e. the equation

$$-\Delta \psi = k^2 \psi \tag{13}$$

has non-trivial solutions for $k = k_1, k_2, \dots$ Such $\psi$ are the wave-functions [KS1, KS2]. We shall use the notation that $\psi^{(n)} := (\psi_i^{(n)}(x))_{i=1}^{b}$ is the wave-function corresponding to $k = k_n$. Solving (13) with boundary conditions (9)–(11), we find that the component of the $n$th normalised eigenfunction of the Laplace operator on the $i$th bond of a star graph is

$$\psi_i^{(n)}(x) = A_i^{(n)} \cos k_n (x - L_i), \tag{14}$$

where the amplitude is given by

$$A_i^{(n)} = \left( \frac{2 \sec^2 k_n L_i}{\sum_{j=1}^{b} L_j \sec^2 k_n L_j} \right)^{1/2} \tag{15}$$

and $k_n$ is the $n$th positive solution to

$$Z(k, L) := \sum_{j=1}^{b} \tan k L_j = 0. \tag{16}$$
In Sects. 3–5 it will be convenient to take b = αv, where α ∈ N is fixed. This is so that we can easily describe a fraction of the total number of bonds as the number of bonds becomes large (v → ∞). In Sect. 6 we shall take α = 1 for notational convenience, since there we will only be concerned with fixed graphs.
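The quantisation condition (16) and the normalisation implied by (14) and (15) are easy to explore numerically. The following minimal sketch is an illustration, not the procedure used by the authors: it assumes that the poles of the individual $\tan k L_j$ are distinct, uses the fact that $Z(k, L)$ increases from $-\infty$ to $+\infty$ between consecutive poles, and locates the single zero in each gap by bisection. The function names, the three bond lengths and the cut-off `k_max` are hypothetical choices.

```python
import math

def star_graph_wavenumbers(lengths, k_max):
    """Positive zeros of Z(k) = sum_j tan(k L_j) below k_max, found by bisection."""
    # Poles of tan(k L) sit at k = (m + 1/2) * pi / L.
    poles = sorted(
        (m + 0.5) * math.pi / L
        for L in lengths
        for m in range(int(k_max * L / math.pi) + 1)
        if (m + 0.5) * math.pi / L < k_max
    )
    zeros = []
    for lo, hi in zip(poles, poles[1:]):
        # Z is increasing on (lo, hi), so the sign of Z(mid) tells us which half to keep.
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if sum(math.tan(mid * L) for L in lengths) < 0.0:
                lo = mid
            else:
                hi = mid
        zeros.append(0.5 * (lo + hi))
    return zeros

def amplitudes_squared(k, lengths):
    """Squared amplitudes of (15) at wavenumber k."""
    sec2 = [1.0 / math.cos(k * L) ** 2 for L in lengths]
    total = sum(L * s for L, s in zip(lengths, sec2))
    return [2.0 * s / total for s in sec2]

def norm_squared(k, lengths):
    """Sum over bonds of the integral of |A_i cos k(x - L_i)|^2 over [0, L_i]."""
    a2 = amplitudes_squared(k, lengths)
    return sum(a * (L / 2 + math.sin(2 * k * L) / (4 * k)) for a, L in zip(a2, lengths))

lengths = [1.0, math.sqrt(2), math.sqrt(3)]  # illustrative incommensurate bond lengths
ks = star_graph_wavenumbers(lengths, 20.0)
norms = [norm_squared(k, lengths) for k in ks]
```

At each computed $k_n$ the value of `norm_squared` should be 1 up to rounding, because the cross terms sum to $Z(k_n, L)/k_n$ divided by the normalising sum, which vanishes on the spectrum.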
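3. Distribution of the Observable B

In this section we prove the existence of a limit distribution for the diagonal matrix elements of $B$ on star graphs with a fixed number, $\alpha v$, of bonds.

Lemma 1. Consider a star graph with $\alpha v$ bonds with fixed lengths given by the vector $L$. Then for $B$ defined by (1),

$$\langle \psi^{(n)} | B | \psi^{(n)} \rangle = \frac{\sum_{i=1}^{v} L_i \sec^2 k_n L_i}{\sum_{j=1}^{\alpha v} L_j \sec^2 k_n L_j} + O\left( \frac{1}{k_n} \right), \tag{17}$$

where the error estimate is uniform in $v$ and $L_i \ge L_{\min} > 0$ for each $i$.

Proof. We recall that

$$\langle \psi^{(n)} | B | \psi^{(n)} \rangle = \sum_{j=1}^{\alpha v} \int_0^{L_j} |\psi_j^{(n)}(x)|^2 B_j(x)\, dx. \tag{18}$$

Integrating (14) gives, for $i = 1, \dots, v$,

$$\int_0^{L_i} |\psi_i^{(n)}(x)|^2 B_i(x)\, dx = \frac{1}{\sum_{j=1}^{\alpha v} L_j \sec^2 k_n L_j} \left( L_i \sec^2 k_n L_i + \frac{1}{k_n} \tan k_n L_i \right), \tag{19}$$

and for $i \ge v + 1$,

$$\int_0^{L_i} |\psi_i^{(n)}(x)|^2 B_i(x)\, dx = 0. \tag{20}$$

Thus

$$\langle \psi^{(n)} | B | \psi^{(n)} \rangle = \frac{\sum_{i=1}^{v} L_i \sec^2 k_n L_i}{\sum_{j=1}^{\alpha v} L_j \sec^2 k_n L_j} + \frac{E}{k_n}, \tag{21}$$

where

$$E = \frac{\sum_{i=1}^{v} \tan k_n L_i}{\sum_{j=1}^{\alpha v} L_j \sec^2 k_n L_j}. \tag{22}$$

Let $L_{\min} := \min_j \{L_j\}$. Then

$$|E| \le \frac{\sum_{i=1}^{v} |\tan k_n L_i|}{(\alpha - 1) L_{\min} v + L_{\min} \sum_{j=1}^{v} \sec^2 k_n L_j} \tag{23}$$

$$\le \left( L_{\min} + \frac{(\alpha - 1) v L_{\min}}{\sum_{i=1}^{v} \sec^2 k_n L_i} \right)^{-1} \tag{24}$$

using the fact that $|\tan\theta| \le \sec^2\theta$ for any $\theta \in \mathbb{R}$. Hence $E = O(1)$ as $n \to \infty$ uniformly in $v$, $L_{\min} > 0$.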
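The decomposition (21)–(22) and the bound $|E| \le 1/L_{\min}$ that follows from (24) can be sanity-checked numerically. The sketch below rests on one observation that is worth stating as an assumption of the check: the integral identity (19) is pure calculus and holds for any $k$ once the amplitude is defined by (15), so the sample values of $k$ used here are arbitrary and are not eigenvalues. The bond lengths, the choice $\alpha = 2$, $v = 2$, and all helper names are illustrative.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def matrix_element_terms(k, lengths, v):
    """Return (quadrature value, leading term, E) for the observable B of (1)."""
    sec2 = [1.0 / math.cos(k * L) ** 2 for L in lengths]
    S = sum(L * s for L, s in zip(lengths, sec2))
    # <psi|B|psi> by direct integration of |psi_i|^2 over the first v bonds.
    quad = sum(
        simpson(lambda x, L=L, s=s: (2 * s / S) * math.cos(k * (x - L)) ** 2, 0.0, L)
        for L, s in zip(lengths[:v], sec2[:v])
    )
    leading = sum(L * s for L, s in zip(lengths[:v], sec2[:v])) / S
    E = sum(math.tan(k * L) for L in lengths[:v]) / S
    return quad, leading, E

lengths = [1.0, math.sqrt(2), math.sqrt(3), math.sqrt(5)]  # alpha = 2, v = 2
sample_ks = (3.7, 11.2, 40.5)  # arbitrary test values, not solutions of (16)
checks = [(k,) + matrix_element_terms(k, lengths, v=2) for k in sample_ks]
```

Each tuple in `checks` holds $k$, the quadrature value, the leading term of (21) and $E$; the first should agree with `leading + E / k` up to quadrature error.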
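Proof of Theorem 1. By Lemma 1,

$$h\big(\langle \psi^{(n)} | B | \psi^{(n)} \rangle\big) = h\left( \frac{\sum_{i=1}^{v} L_i \sec^2 x_i}{\sum_{j=1}^{\alpha v} L_j \sec^2 x_j} \right) + E_n, \tag{25}$$

where $x = k_n L$ and $E_n = o(1)$ as $n \to \infty$ since $h$ is uniformly continuous on $[0, 1]$. Hence

$$\frac{1}{N} \sum_{n=1}^{N} E_n \to 0 \qquad \text{as } N \to \infty. \tag{26}$$

Therefore

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} h\big(\langle \psi^{(n)} | B | \psi^{(n)} \rangle\big) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} h\left( \frac{\sum_{i=1}^{v} L_i \sec^2 k_n L_i}{\sum_{j=1}^{\alpha v} L_j \sec^2 k_n L_j} \right). \tag{27}$$

According to Barra and Gaspard, there is an absolutely continuous measure $\nu(\xi)$ such that for piecewise continuous functions $f : \Sigma \to \mathbb{R}$,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(k_n L) = \int_{\Sigma} f(\xi)\, d\nu(\xi),$$

where $\Sigma$ is the surface embedded in the $\alpha v$ dimensional torus with side $\pi$, defined by $\tan x_1 + \cdots + \tan x_{\alpha v} = 0$, and $\xi$ is a set of $\alpha v - 1$ coordinates which parameterise $\Sigma$. To avoid repetition, we refer the reader to [BG, KMW] for more detail about this result and its application to similar problems. Taking

$$f(x) = h\left( \frac{\sum_{i=1}^{v} L_i \sec^2 x_i}{\sum_{j=1}^{\alpha v} L_j \sec^2 x_j} \right), \tag{28}$$

we can define $p_v(\eta)$ by

$$\int_{\Sigma} f(\xi)\, d\nu(\xi) =: \int_{-\infty}^{\infty} h(\eta)\, p_v(\eta)\, d\eta. \tag{29}$$

Since $0 \le \langle \psi^{(n)} | B | \psi^{(n)} \rangle \le 1$, it follows that $p_v(\eta)$ is supported on $[0, 1]$.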
4. The Large Graph Limit

Let $\eta \in \mathbb{R}$ and define

$$X_\eta(n) := \frac{1}{v^2} \sum_{j=v+1}^{\alpha v} L_j \sec^2 k_n L_j - \frac{\eta}{v^2} \sum_{i=1}^{v} L_i \sec^2 k_n L_i \tag{30}$$

for $n = 1, 2, \dots$. The key result of this section is the following.
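Proposition 1. For each $v$, there exists a probability density function $f_{X_\eta, v}$ such that

$$\lim_{N \to \infty} \frac{1}{N} \# \big\{ n \in \{1, \dots, N\} : X_\eta(n) < S \big\} = \int_{-\infty}^{S} f_{X_\eta, v}(\sigma)\, d\sigma. \tag{31}$$

Furthermore, for each $S \in \mathbb{R}$,

$$\int_{-\infty}^{S} f_{X_\eta, v}(\sigma)\, d\sigma \to \int_{-\infty}^{S} f_{X_\eta}(\sigma)\, d\sigma$$

as $v \to \infty$, provided that $v \Delta L \to 0$ in this limit, where

$$f_{X_\eta}(\sigma) = \frac{-1}{4\alpha\sqrt{\pi}} \operatorname{Re} \int_{-\infty}^{\infty} P_\eta(\xi)\, \frac{e^{3i\pi/4} \tau_\eta(\xi)}{(-\sigma)^{3/2}}\, w\left( \frac{e^{3i\pi/4} \tau_\eta(\xi)}{2\sqrt{-\sigma}} \right) d\xi.$$

The functions $P_\eta$ and $\tau_\eta$ are defined by (43) and (44) below, and $w(z) := e^{-z^2} \operatorname{erfc}(-iz)$.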
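Proof. The existence of the $N \to \infty$ limiting density, $f_{X_\eta, v}$, is a consequence of the result of Barra and Gaspard [BG]. The proof is entirely analogous to the proof of Theorem 1. We turn our attention to the $v \to \infty$ limit of this density. With $n$ a random variable uniformly distributed on the set $\{1, \dots, N\}$ for some $N \in \mathbb{N}$, the characteristic function for this random variable $X_\eta(n)$ is

$$e_{v,N}(\beta) := \mathbb{E}(e^{i\beta X_\eta}) = \frac{1}{N} \sum_{n=1}^{N} f(k_n L) + O(v \Delta L),$$

where $f : [0, \pi]^{\alpha v} \to \mathbb{C}$ is defined to be

$$f(x) := \exp\left( \frac{i\beta}{v^2} \left( \sum_{j=v+1}^{\alpha v} \sec^2 x_j - \eta \sum_{i=1}^{v} \sec^2 x_i \right) \right).$$

Following the argument of [KMW] we can write

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(k_n L) = \frac{1}{2\alpha v^2} \frac{1}{\pi^{\alpha v}} \int_{-\infty}^{\infty} \int_0^{\pi} \cdots \int_0^{\pi} f(x) \sum_{j=1}^{\alpha v} \sec^2 x_j\, \exp\left( \frac{i\zeta}{v} \sum_{j=1}^{\alpha v} \tan x_j \right) d^{\alpha v}x\, d\zeta.$$

Denoting this limit for each fixed $v$, $\beta$ by $e_v(\beta)$, we can write

$$e_v(\beta) = \frac{1}{2\alpha v} \int_{-\infty}^{\infty} \left( I_1 I_2^{v-1} I_3^{\alpha v - v} + (\alpha - 1) I_4 I_2^{v} I_3^{\alpha v - v - 1} \right) d\zeta, \tag{32}$$

where the integrals $I_1, \dots, I_4$ are:

$$I_1 := \frac{1}{\pi} \int_0^{\pi} \sec^2 x\, \exp\left( \frac{i\zeta}{v} \tan x - \frac{i\beta\eta}{v^2} \sec^2 x \right) dx, \tag{33}$$

$$I_2 := \frac{1}{\pi} \int_0^{\pi} \exp\left( \frac{i\zeta}{v} \tan x - \frac{i\beta\eta}{v^2} \sec^2 x \right) dx, \tag{34}$$

$$I_3 := \frac{1}{\pi} \int_0^{\pi} \exp\left( \frac{i\zeta}{v} \tan x + \frac{i\beta}{v^2} \sec^2 x \right) dx, \tag{35}$$

$$I_4 := \frac{1}{\pi} \int_0^{\pi} \sec^2 x\, \exp\left( \frac{i\zeta}{v} \tan x + \frac{i\beta}{v^2} \sec^2 x \right) dx. \tag{36}$$

An integral similar to (32) was tackled in [KMW]. We quote here the relevant results, mutatis mutandis. Asymptotic analysis of the integrals in (33)–(36) gives

$$I_1 = \frac{v}{\sqrt{\pi\beta\eta}} \exp\left( -\frac{i\pi}{4} + \frac{i\zeta^2}{4\beta\eta} \right) + O(1) \qquad \text{as } v \to \infty \tag{37}$$

and

$$I_4 = \frac{v}{\sqrt{\pi\beta}} \exp\left( \frac{i\pi}{4} - \frac{i\zeta^2}{4\beta} \right) + O(1) \qquad \text{as } v \to \infty. \tag{38}$$

For $I_2$ and $I_3$ we consider separately the cases $-\sqrt{v} < \zeta < \sqrt{v}$ and $|\zeta| > \sqrt{v}$. For $-\sqrt{v} < \zeta < \sqrt{v}$,

$$I_2^{v} = \exp\left( -\frac{2}{\sqrt{\pi}} \sqrt{\beta\eta}\, \exp\left( \frac{i\pi}{4} + \frac{i\zeta^2}{4\beta\eta} \right) - \zeta \operatorname{erf}\left( \frac{e^{-i\pi/4} \zeta}{2\sqrt{\beta\eta}} \right) \right) \left( 1 + O\left( \frac{1 + \zeta^2}{v} \right) \right) \tag{39}$$

and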
iπ/4 ζ iζ 2 e 2(α − 1) iπ −ζ (α − 1) erf β exp − − (40) √ √ 4 4β π 2 β 1 + ζ2 , × 1+O v √ both error estimates are as v → ∞. For |ζ | > v, √ −1 βη βη ζ2 + + O(ζ −3 ) (41) |I2 | vπ v2 βη (α−1)v
I3
= exp
−
as ζ → ∞, and |I3 |
√ −1 β β ζ2 + + O(ζ −3 ). vπ v 2 β
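Using these estimates, we can find an expression for the limit of $e_v(\beta)$ as $v \to \infty$,

$$e(\beta) := \lim_{v \to \infty} e_v(\beta) = \frac{1}{2\alpha} \int_{-\infty}^{\infty} \frac{1}{\sqrt{\beta}} P_\eta\left( \frac{\zeta}{\sqrt{\beta}} \right) \exp\left( -\sqrt{\beta}\, \tau_\eta\left( \frac{\zeta}{\sqrt{\beta}} \right) \right) d\zeta = \frac{1}{2\alpha} \int_{-\infty}^{\infty} P_\eta(\xi) \exp\left( -\sqrt{\beta}\, \tau_\eta(\xi) \right) d\xi. \tag{42}$$

For ease of notation, we have introduced

$$P_\eta(\xi) := \frac{1}{\sqrt{\pi\eta}} \exp\left( -\frac{i\pi}{4} + \frac{i\xi^2}{4\eta} \right) + \frac{(\alpha - 1)}{\sqrt{\pi}} \exp\left( \frac{i\pi}{4} - \frac{i\xi^2}{4} \right) \tag{43}$$

and

$$\tau_\eta(\xi) := \frac{2}{\sqrt{\pi}} \sqrt{\eta} \exp\left( \frac{i\pi}{4} + \frac{i\xi^2}{4\eta} \right) + \xi \operatorname{erf}\left( \frac{e^{-i\pi/4} \xi}{2\sqrt{\eta}} \right) + \frac{2(\alpha - 1)}{\sqrt{\pi}} \exp\left( -\frac{i\pi}{4} - \frac{i\xi^2}{4} \right) + \xi(\alpha - 1) \operatorname{erf}\left( \frac{e^{i\pi/4} \xi}{2} \right). \tag{44}$$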
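In the above, wherever $\sqrt{\beta}$ occurs for $\beta < 0$, this should be understood to mean $\pm i\sqrt{-\beta}$, where the sign is taken in such a way that $e(-\beta) = \overline{e(\beta)}$, the usual condition for the characteristic function of a probability density. This can always be done. We also note that $e(0) = 1$ which is consistent with $e(\beta)$ being the characteristic function of a probability distribution. $e(\beta)$ is continuous at $\beta = 0$ since the defining integral is uniformly convergent in $\beta$ (see Lemma 7 below). Thus the limiting density, $f_{X_\eta}$, exists and is given by

$$f_{X_\eta}(\sigma) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{1}{2\alpha} \int_{-\infty}^{\infty} P_\eta(\xi) \exp\left( -\sqrt{\beta}\, \tau_\eta(\xi) - i\sigma\beta \right) d\xi\, d\beta, \tag{45}$$

where we have made the substitution $\xi = \zeta/\sqrt{\beta}$. We here switch the order of integration. This is a non-trivial operation since both integrals are improper. However in this case we can rigorously justify the manoeuvre. Justification is provided in Appendix A, Proposition 5,

$$f_{X_\eta}(\sigma) = \frac{1}{2\pi\alpha} \operatorname{Re} \int_{-\infty}^{\infty} \int_0^{\infty} P_\eta(\xi) \exp\left( -\sqrt{\beta}\, \tau_\eta(\xi) - i\sigma\beta \right) d\beta\, d\xi$$

$$= \frac{1}{2\pi\alpha} \operatorname{Re} \int_{-\infty}^{\infty} P_\eta(\xi) \left( \frac{1}{i\sigma} - \frac{\sqrt{\pi}\, e^{3i\pi/4} \tau_\eta(\xi)}{2(-\sigma)^{3/2}}\, w\left( \frac{e^{3i\pi/4} \tau_\eta(\xi)}{2\sqrt{-\sigma}} \right) \right) d\xi. \tag{46}$$

To evaluate the final integration we have used the following result (a variant of formula 3.462.5 in [GR]),

$$\int_0^{\infty} \exp(-ax - b\sqrt{x})\, dx = \frac{1}{a} - \frac{\sqrt{\pi}\, b}{2 a^{3/2}}\, w\left( \frac{ib}{2a^{1/2}} \right). \tag{47}$$

To conclude we observe that since

$$\int_{-\infty}^{\infty} P_\eta(\xi)\, d\xi = 2\alpha \in \mathbb{R}$$

the first term in (46) vanishes as it has zero real part.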
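Formula (47) can be sanity-checked for real positive parameters, where $w(iy) = e^{y^2} \operatorname{erfc}(y)$ is available from the standard library. The sketch below is only such a check: in (46) the formula is applied with complex $a = i\sigma$ and $b = \tau_\eta(\xi)$, which this real-valued test does not cover. The parameter pairs, the truncation point and the function names are illustrative assumptions.

```python
import math

def lhs_numeric(a, b, upper=12.0, n=6000):
    """Integrate exp(-a x - b sqrt(x)) over x >= 0 after substituting x = s^2."""
    def integrand(s):
        return 2.0 * s * math.exp(-a * s * s - b * s)
    # Composite Simpson rule on [0, upper]; the tail beyond s = upper is negligible here.
    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

def rhs_closed_form(a, b):
    """Right-hand side of (47) for real a, b > 0, using w(iy) = exp(y^2) erfc(y)."""
    y = b / (2.0 * math.sqrt(a))
    w_iy = math.exp(y * y) * math.erfc(y)
    return 1.0 / a - math.sqrt(math.pi) * b / (2.0 * a ** 1.5) * w_iy

pairs = [(1.0, 1.0), (2.0, 0.5), (0.7, 3.0)]
errors = [abs(lhs_numeric(a, b) - rhs_closed_form(a, b)) for a, b in pairs]
```

The substitution $x = s^2$ removes the square-root singularity of the integrand at the origin, which is why a plain Simpson rule is adequate for this check.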
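Some properties of $w(z)$ are discussed in [AS], Chapter 7. We shall use the following.

Lemma 2. The function $w(z)$ has the asymptotic expansion

$$w(z) \sim \frac{i}{\sqrt{\pi}} \sum_{m=0}^{\infty} \frac{(2m)!}{4^m\, m!\, z^{2m+1}} \tag{48}$$

as $z \to \infty$, valid for $-\pi/4 < \arg z < 5\pi/4$.

Proof. This follows from the asymptotic expansion of erfc,

$$\sqrt{\pi}\, z\, e^{z^2} \operatorname{erfc}(z) \sim \sum_{m=0}^{\infty} \frac{(-1)^m (2m)!}{(4z^2)^m\, m!},$$

as $z \to \infty$, $|\arg z| < 3\pi/4$, taken from [AS] (formula 7.1.23; see also [BlHa] Exercise 3.11). The series for $w$ comes from making the substitution $z \to -iz$.

Since erfc and $w$ are analytic functions, they are bounded in the domain of validity of their asymptotic expansions quoted in Lemma 2.

Proposition 2. Let $z \in \mathbb{C}$ and $-\pi/4 < \arg z < 5\pi/4$. Then

$$\int_0^{R} z\, w(zp)\, dp = \frac{\sqrt{\pi}}{2} - \frac{\arg(z)}{\sqrt{\pi}} + \frac{i}{\sqrt{\pi}} \log|z| + iC_R + O\left( \frac{1}{|z|^2 R^2} \right), \tag{49}$$

where $C_R \in \mathbb{R}$ is independent of $z$, but may depend on $R$.

Proof. Write $z = |z| e^{i\varphi}$ where $\varphi = \arg z$. Then

$$\int_0^{R} z\, w(zp)\, dp = \int_0^{|z|R} e^{i\varphi} w(e^{i\varphi} p)\, dp \tag{50}$$

via $p \to p/|z|$. Using Cauchy's theorem,

$$\int_0^{|z|R} e^{i\varphi} w(e^{i\varphi} p)\, dp = \int_{\gamma_1} w(t)\, dt = \int_{\gamma_2} w(t)\, dt + \int_{\gamma_R} w(t)\, dt. \tag{51}$$

The contours $\gamma_1$, $\gamma_2$ and $\gamma_R$ in the complex $t$-plane are illustrated in Fig. 4.

Fig. 4. The contours $\gamma_1$, $\gamma_2$ and $\gamma_R$

On $\gamma_2$,

$$\int_{\gamma_2} w(t)\, dt = \int_0^{|z|R} w(x)\, dx$$

$$= \int_0^{1} w(x)\, dx + \int_1^{|z|R} \left( w(x) - \frac{i}{\sqrt{\pi} x} \right) dx + \frac{i}{\sqrt{\pi}} \int_1^{|z|R} \frac{dx}{x}$$

$$= \int_0^{1} w(x)\, dx + \int_1^{\infty} \left( w(x) - \frac{i}{\sqrt{\pi} x} \right) dx - \int_{|z|R}^{\infty} \left( w(x) - \frac{i}{\sqrt{\pi} x} \right) dx + \frac{i}{\sqrt{\pi}} \int_1^{|z|R} \frac{dx}{x}$$

$$= \int_0^{\infty} e^{-x^2}\, dx + i \times \text{const} - \int_{|z|R}^{\infty} \left( w(x) - \frac{i}{\sqrt{\pi} x} \right) dx + \frac{i}{\sqrt{\pi}} \int_1^{|z|R} \frac{dx}{x},$$

using $w(x) = e^{-x^2}(1 + i \operatorname{erfi} x)$ to separate the real and imaginary contributions. The imaginary part goes into the constant $C_R$, the value of which is not important since we shall always be considering only the real part of resulting expressions. By the use of Lemma 2,

$$\int_{\gamma_2} w(t)\, dt = \frac{\sqrt{\pi}}{2} + i \times \text{const} + \frac{i}{\sqrt{\pi}} (\log R + \log|z|) + O\left( \frac{1}{|z|^2 R^2} \right) \tag{52}$$

as $R \to \infty$ uniformly for $|z| \ge c$ for some $c$. On $\gamma_R$,

$$\int_{\gamma_R} w(t)\, dt = \int_0^{\varphi} i|z|R e^{i\theta}\, w(|z|R e^{i\theta})\, d\theta = \int_0^{\varphi} i|z|R e^{i\theta} \left( \frac{-1}{\sqrt{\pi}\, i|z|R e^{i\theta}} + O\left( \frac{1}{|z|^3 R^3} \right) \right) d\theta = \frac{-\varphi}{\sqrt{\pi}} + O\left( \frac{1}{|z|^2 R^2} \right) \tag{53}$$

by Lemma 2. Combining (52) and (53) gives (49).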
The following lemma from probability theory will also be useful. Lemma 3. Let U, V be random variables, then 0 U P fXη (σ )dσ, 2,
and sec2 knr L1 ∼ sec2 knr L2 , ensuring that (6) holds. To make the above arguments rigorous we need the following propositions. Proposition 3. Let the elements of L be linearly independent over Q. Let 1 < v ∗ < v for v 3. Given ε > 0 there exist infinitely many n ∈ N such that:
Fig. 5. Poles pn,i and nodes n,i on the real line. Different symbols correspond to different values of i (the circle corresponds to pn,1 ). Filled symbols correspond to the poles, empty symbols to the nodes. In this example v = 5 and v ∗ = 2
1. for each i = 2, . . . , v ∗ there exists m ∈ N satisfying |pm,i − pn,1 | ε/2,
(59)
2. for all i = v ∗ + 1, . . . , v and for all m ∈ N, |pm,i − pn,1 |
π − ε/2, 2Li
(60)
Proof. The idea behind the proof is that for linearly independent elements of $\mathbf{L}$ the poles $p_{\cdot,i}$ for different $i$ behave like independent random variables, therefore every permitted pole configuration happens infinitely often. To substantiate this claim we express the nearest-pole distances as the states of an ergodic dynamical system. For $n \in \mathbb{N}$ and $i = 2, \ldots, v^*$, let $\delta_{n,i}$ denote the distance between $p_{n,1}$ and the closest pole of $\tan kL_i$;
\[ \delta_{n,i} := p_{n,1} - p_{m,i}, \quad \text{where } m \text{ is such that } |p_{m,i} - p_{n,1}| = \min_{m'}\{|p_{m',i} - p_{n,1}|\}. \]
Since the poles of $\tan kL_i$ are $\pi/L_i$-periodic, we have
\[ \delta_{n,i} + \frac{\pi}{2L_i} = p_{n,1} - p_{0,i} + \frac{\pi}{2L_i} \bmod \frac{\pi}{L_i} \tag{61} \]
\[ \phantom{\delta_{n,i} + \frac{\pi}{2L_i}} = \frac{\pi}{2L_1} + n\frac{\pi}{L_1} \bmod \frac{\pi}{L_i}. \tag{62} \]
Let ${}_{m,i}$ denote the $m$th zero of $\tan kL_i$. We note that (60) is implied by the condition that $|{}_{m,i} - p_{n,1}| \leq \varepsilon/2$ for some $m \in \mathbb{N}$. For $i = v^*+1, \ldots, v$, define $\eta_{n,i}$ to be the distance between $p_{n,1}$ and the closest zero of $\tan kL_i$. Similarly to (62),
\[ \eta_{n,i} + \frac{\pi}{2L_i} = \frac{\pi}{2}\left(\frac{1}{L_1} + \frac{1}{L_i}\right) + n\frac{\pi}{L_1} \bmod \frac{\pi}{L_i}. \tag{63} \]
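From (62) and (63), $\delta_{n,i}$ and $\eta_{n,i}$ satisfy the recurrence
\[ \begin{aligned} \delta_{n+1,i} &= \delta_{n,i} + \frac{\pi}{L_1} \bmod \frac{\pi}{L_i}, & i &= 2, \ldots, v^*, \\ \eta_{n+1,i} &= \eta_{n,i} + \frac{\pi}{L_1} \bmod \frac{\pi}{L_i}, & i &= v^*+1, \ldots, v. \end{aligned} \tag{64} \]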
Since the bond lengths are not rationally related, the dynamical system (64) is equivalent to an irrational translation on a torus. In this case, Weyl's equidistribution result [W] applies, and any subset of the torus with positive Lebesgue measure is visited infinitely many times. The volume of the area in $\delta$-$\eta$ space defined by
\[ -\varepsilon/2 < \delta_{n,i},\, \eta_{n,i} < \varepsilon/2 \tag{65} \]
is non-zero and so there are infinitely many $n$ for which (65), and therefore (59)-(60), are satisfied.

The interpretation of Proposition 3 is that we can find situations on the real line where $v^*$ poles of the functions $\tan kL_i$ are bunched together and the remaining $v - v^*$ poles are not close to these bunched poles (see Fig. 5).

Proposition 4. Under the conditions of Proposition 3 there is a subsequence $(k_{n_r}) \subseteq (k_n)$ for which
\[ \sec^2 k_{n_r}L_i \to \infty \quad \text{for } i = 1, \ldots, v^*, \qquad \sec^2 k_{n_r}L_i \to 1 \quad \text{for } i = v^*+1, \ldots, v \]
as $r \to \infty$.

Proof. Let $(\varepsilon_r)$ be a sequence satisfying $\varepsilon_r \to 0$ as $r \to \infty$. We choose $k_{n_r}$ as follows. Applying Proposition 3 with $\varepsilon = \varepsilon_r$ yields a set of $v^*$ poles of $Z(k, \mathbf{L})$ inside a region with width $\varepsilon_r$. Since there is a zero of $Z(k, \mathbf{L})$ between any two poles of $Z(k, \mathbf{L})$, we can find $v^* - 1$ zeros in this region. Set $k_{n_r}$ to be one of these zeros. From Proposition 3 we have
\[ \begin{aligned} |k_{n_r} - p_{m,i}| &\leq \varepsilon_r & &\text{for all } i = 1, \ldots, v^* \text{ and some } m = m(r,i), \\ |k_{n_r} - {}_{m,i}| &\leq \varepsilon_r & &\text{for all } i = v^*+1, \ldots, v \text{ and some } m = m(r,i). \end{aligned} \]
Since $\sec^2 L_i p_{m,i} = \infty$, $\sec^2 L_i\, {}_{m,i} = 1$ and $\sec\theta$ is a periodic function, the statement of the proposition follows trivially.

Corollary 1. If $v^* = 2$ in Proposition 4 then we additionally have
\[ \lim_{r\to\infty} \frac{\sec^2 k_{n_r}L_1}{\sec^2 k_{n_r}L_2} = 1. \]
Proof. We recall that since $k_{n_r}$ is an eigenvalue, $Z(k_{n_r}, \mathbf{L}) = 0$, and hence
\[ \tan k_{n_r}L_1 = -\tan k_{n_r}L_2 - \tan k_{n_r}L_3 - \cdots - \tan k_{n_r}L_v. \tag{66} \]
On the other hand, by Proposition 4, $\tan k_{n_r}L_i$ remains bounded for $i > 2$ and tends to infinity for $i = 1, 2$. Dividing (66) through by $\tan k_{n_r}L_2$ we obtain
\[ \lim_{r\to\infty} \frac{\tan k_{n_r}L_1}{\tan k_{n_r}L_2} = -1. \]
Further observations that $\sin^2 k_{n_r}L_i \to 1$ for $i = 1, 2$ and $\sec^2\theta = \tan^2\theta/\sin^2\theta$ conclude the proof.
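The bunching mechanism behind Propositions 3 and 4 can be explored numerically. The following is a minimal sketch, not part of the original argument: the bond lengths (square roots of distinct squarefree integers, hence linearly independent over the rationals), the tolerance and the search range are all illustrative assumptions, and the helper names are hypothetical. It scans the poles of $\tan kL_1$ and records those $n$ for which the first $v^*$ bonds have a pole nearby while every remaining bond has a zero nearby, which is the sufficient condition (65).

```python
import math

def signed_distance(x, period):
    """Signed distance from x to the nearest integer multiple of period."""
    return (x + period / 2.0) % period - period / 2.0

def bunched_indices(lengths, v_star, eps, n_max):
    """Indices n < n_max satisfying the sufficient condition (65):
    bonds 2..v_star have a pole within eps/2 of p_{n,1}, and bonds
    v_star+1..v have a zero within eps/2 of p_{n,1}."""
    L1 = lengths[0]
    hits = []
    for n in range(n_max):
        p = (n + 0.5) * math.pi / L1  # pole p_{n,1} of tan(k L_1)
        ok = True
        for i, Li in enumerate(lengths[1:], start=2):
            period = math.pi / Li
            if i <= v_star:
                # offset of p_{n,1} from the nearest pole of tan(k L_i)
                offset = signed_distance(p - 0.5 * period, period)
            else:
                # offset of p_{n,1} from the nearest zero of tan(k L_i)
                offset = signed_distance(p, period)
            if abs(offset) >= eps / 2.0:
                ok = False
                break
        if ok:
            hits.append(n)
    return hits

# Assumed example: v = 5 bonds, v* = 2, as in Fig. 5; lengths are hypothetical.
LENGTHS = [1.0, math.sqrt(2), math.sqrt(3), math.sqrt(5), math.sqrt(7)]
EPS = 0.5
hits = bunched_indices(LENGTHS, v_star=2, eps=EPS, n_max=20000)
print(len(hits), hits[:5])
```

Under equidistribution one would expect the fraction of hits to approach the product of the window fractions $\varepsilon L_i/\pi$, so enlarging `n_max` or shrinking `EPS` trades search time against how tightly the poles are bunched.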
Lemma 4. Let $f : [0, L] \to \mathbb{R}$ be continuously differentiable. Then
\[ \lim_{k\to\infty} \int_0^L \cos(kx) f(x)\,dx = 0. \]

Proof. Integration by parts yields
\[ \int_0^L \cos(kx) f(x)\,dx = \frac{1}{k}\left( \sin(kL) f(L) - \int_0^L \sin(kx) f'(x)\,dx \right) \]
and the statement follows immediately from the boundedness of $f$ and its derivative.
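As an illustration of Lemma 4, the decay of the oscillatory integral can be checked numerically. This is a minimal sketch under stated assumptions: the test function $f(x) = e^x$, the interval length $L = 1$ and the Simpson quadrature are choices made here for illustration only. The integration-by-parts identity gives the explicit bound $|\int_0^L \cos(kx) f(x)\,dx| \leq (|f(L)| + \int_0^L |f'|)/k$, which the loop prints next to the computed value.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule with n subintervals (n is rounded up to even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3.0

def oscillatory_integral(f, L, k, n=20000):
    """Approximate the integral of cos(kx) f(x) over [0, L]."""
    return simpson(lambda x: math.cos(k * x) * f(x), 0.0, L, n)

L = 1.0
f = math.exp  # assumed test function; f' = f, so the bound is explicit
bound_const = abs(f(L)) + (math.exp(L) - 1.0)  # |f(L)| + integral of |f'|

for k in (10.0, 100.0, 1000.0):
    value = oscillatory_integral(f, L, k)
    print(k, value, bound_const / k)
```

The number of quadrature nodes has to grow with $k$ to resolve the oscillations; 20000 subintervals are ample for the values of $k$ used here.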
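Proof of Theorem 3. Without loss of generality, we can assume that $i_1 = 1$ and $i_2 = 2$. We take the subsequence whose existence is guaranteed by Proposition 4 with $v^* = 2$. By Corollary 1,
\[ \lim_{r\to\infty} \left(A_i^{(n_r)}\right)^2 = 2\left( \sum_{j=1}^{v} L_j \lim_{r\to\infty} \frac{\sec^2 k_{n_r}L_j}{\sec^2 k_{n_r}L_i} \right)^{-1} = \begin{cases} 2(L_1 + L_2)^{-1} & \text{if } i = 1, 2, \\ 0 & \text{otherwise.} \end{cases} \]
We use Lemma 4 to get rid of the second integrals in (58) and conclude
\[ \lim_{r\to\infty} \langle \psi^{(n_r)} | \mathbf{f} | \psi^{(n_r)} \rangle = \frac{1}{L_1 + L_2}\left( \int_0^{L_1} f_1(x)\,dx + \int_0^{L_2} f_2(x)\,dx \right). \]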
Acknowledgements. GB is grateful to the University of Bristol for hospitality during visits while part of this research was carried out. BW wishes to thank the University of Strathclyde for hospitality. BW has been financially supported by an EPSRC studentship (Award Number 0080052X). We gratefully acknowledge the support of the European Commission under the Research Training Network (Mathematical Aspects of Quantum Chaos) HPRN-CT-2000-00103 of the IHP Programme.
A. Appendix: The Order of Integration in (45)

In this appendix we deal with some technical issues regarding the change of order of integration in (45). We first consider some asymptotics of $\tau_\eta(\xi)$.

Lemma 5. For $\xi \in \mathbb{R}$, $\tau_\eta(\xi) = \alpha|\xi| + O_\eta(\xi^{-2})$ as $|\xi| \to \infty$, where the error estimate depends on $\eta$.
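Proof. We first note that $\tau_\eta$ is an even function, so we may assume $\xi > 0$, and the result for $\xi < 0$ will follow by symmetry. We can write $\tau_\eta(\xi)$ as
\[ \tau_\eta(\xi) = \sqrt{\eta}\, t\!\left(\frac{\xi}{\sqrt{\eta}}\right) + (\alpha - 1)\, t(\xi), \]
where
\[ t(\xi) := \frac{2}{\sqrt{\pi}} \exp\!\left(-\frac{i\xi^2}{4} - \frac{i\pi}{4}\right) + \xi\, \mathrm{erf}\!\left(\frac{e^{i\pi/4}\xi}{2}\right). \tag{67} \]
We expand the error function asymptotically,
\[ \mathrm{erf}\!\left(\frac{e^{i\pi/4}\xi}{2}\right) = 1 - \mathrm{erfc}\!\left(\frac{e^{i\pi/4}\xi}{2}\right) = 1 - \frac{2}{\xi\sqrt{\pi}} \exp\!\left(-\frac{i\pi}{4} - \frac{i\xi^2}{4}\right) + O(\xi^{-3}) \tag{68} \]
as $\xi \to \infty$. Substituting (68) into (67) gives
\[ t(\xi) = \xi + O(\xi^{-2}), \quad \text{as } \xi \to \infty, \tag{69} \]
and the lemma follows.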
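The asymptotic statement (69) can be checked numerically with the standard library alone. The sketch below is an illustration, not the authors' computation: it evaluates the error function on the ray $\arg z = \pi/4$ through the standard representation $\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-s^2}ds$, parametrised as $s = e^{i\pi/4}r$, with a Simpson rule whose node count is an ad hoc choice. The function names are hypothetical.

```python
import cmath
import math

def erf_on_ray(xi):
    """erf(e^{i pi/4} xi / 2), computed as
    (2/sqrt(pi)) e^{i pi/4} * integral_0^{xi/2} exp(-i r^2) dr (Simpson rule)."""
    upper = xi / 2.0
    n = 2 * max(200, int(40 * upper * upper) + 200)  # even number of subintervals
    h = upper / n
    total = 1.0 + cmath.exp(-1j * upper * upper)
    for j in range(1, n):
        r = j * h
        total += (4 if j % 2 else 2) * cmath.exp(-1j * r * r)
    integral = total * h / 3.0
    return 2.0 / math.sqrt(math.pi) * cmath.exp(1j * math.pi / 4) * integral

def t(xi):
    """The function t(xi) of (67)."""
    lead = 2.0 / math.sqrt(math.pi) * cmath.exp(-1j * xi * xi / 4 - 1j * math.pi / 4)
    return lead + xi * erf_on_ray(xi)

for xi in (10.0, 20.0, 40.0):
    # xi^2 * |t(xi) - xi| should stay bounded if t(xi) = xi + O(xi^-2)
    print(xi, xi * xi * abs(t(xi) - xi))
```

If the next term of the asymptotic series for erfc is included by hand, the scaled remainder is expected to settle near $4/\sqrt{\pi}$, which is a useful sanity check on the quadrature.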
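Lemma 6. For $\xi > 0$, $\mathrm{Re}\,\dfrac{d\tau_\eta}{d\xi} \geq 0$, and there exists $\tau^* > 0$ such that $\mathrm{Re}\,\tau_\eta(\xi) \geq \tau^*$ for all $\xi \in \mathbb{R}$.

Proof. By differentiation,
\[ \frac{dt}{d\xi} = \mathrm{erf}\!\left(\frac{e^{i\pi/4}\xi}{2}\right) = \frac{2}{\sqrt{\pi}}\, e^{i\pi/4} \int_0^{\xi/2} e^{-ir^2}\,dr. \]
We see that
\[ \mathrm{Re}\,\frac{dt}{d\xi} = \frac{2}{\sqrt{\pi}} \int_0^{\xi/2} \cos\!\left(r^2 - \frac{\pi}{4}\right) dr \geq 0. \]
Thus,
\[ \mathrm{Re}\,\frac{d\tau_\eta}{d\xi} = \mathrm{Re}\, t'\!\left(\frac{\xi}{\sqrt{\eta}}\right) + (\alpha - 1)\,\mathrm{Re}\, t'(\xi) \geq 0. \]
Hence it follows that
\[ \mathrm{Re}\,\tau_\eta(\xi) \geq \mathrm{Re}\,\tau_\eta(0) = \sqrt{2}\,(\sqrt{\eta} + \alpha - 1)/\sqrt{\pi} =: \tau^*. \]

Lemma 7. The integral
\[ \int_0^\infty P_\eta(\xi) \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\xi)\right) d\xi \]
is uniformly convergent for $\beta \in [0, \beta_0]$ for all $\beta_0 > 0$.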
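Proof. By making the substitution $\nu = \xi^2$ we can consider the uniform convergence of
\[ \int^\infty P_\eta(\sqrt{\nu}) \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\sqrt{\nu})\right) \frac{d\nu}{\sqrt{\nu}}. \]
Let
\[ f(\nu, \beta) := \frac{\exp\!\left(-\sqrt{\beta}\,\mathrm{Re}\,\tau_\eta(\sqrt{\nu})\right)}{\nu^{1/4}}, \qquad \phi(\nu, \beta) := \frac{P_\eta(\sqrt{\nu})}{\nu^{1/4}} \exp\!\left(-\sqrt{\beta}\, i\,\mathrm{Im}\,\tau_\eta(\sqrt{\nu})\right). \]
By Lemma 5, $\mathrm{Im}\,\tau_\eta(\sqrt{\nu}) = O(\nu^{-1})$ as $\nu \to \infty$. (We drop the $\eta$-dependence since we are concerned here only with fixed $\eta$.) So
\[ \exp\!\left(-\sqrt{\beta}\, i\,\mathrm{Im}\,\tau_\eta(\sqrt{\nu})\right) = 1 + O(\nu^{-1}) \tag{70} \]
uniformly for $\beta \in [0, \beta_0]$. This means that $\int_1^\infty \phi(\nu, \beta)\,d\nu$ converges uniformly, i.e. given any $\varepsilon > 0$ there exists $\nu_1$ such that for any $\nu_2 > \nu_1$,
\[ \left| \int_{\nu_1}^{\nu_2} \phi(\nu, \beta)\,d\nu \right| < \varepsilon \]
for all $\beta \in [0, \beta_0]$. $f(\nu, \beta)$ is differentiable in $\nu$, and decreasing, so that
\[ \frac{\partial f}{\partial \nu} \leq 0. \]
If we let $\psi(\nu, \beta) := \int_{\nu_1}^{\nu} \phi(\nu', \beta)\,d\nu'$ then integrating by parts gives
\[ \begin{aligned} \left| \int_{\nu_1}^{\nu_2} f(\nu, \beta)\phi(\nu, \beta)\,d\nu \right| &= \left| f(\nu_2, \beta)\psi(\nu_2, \beta) - \int_{\nu_1}^{\nu_2} \frac{\partial f}{\partial \nu}(\nu, \beta)\,\psi(\nu, \beta)\,d\nu \right| \\ &\leq f(\nu_2, \beta)\,\varepsilon - \varepsilon \int_{\nu_1}^{\nu_2} \frac{\partial f}{\partial \nu}\,d\nu \\ &= \varepsilon f(\nu_1, \beta), \end{aligned} \]
where we have used the mean value theorem for integrals. If additionally, $\nu_1 > 1$ then $f(\nu_1, \beta) < 1$ and we are done.

Corollary 2.
\[ \int_0^\infty \int_0^1 P_\eta(\xi)\, e^{-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta}\,d\beta\,d\xi = \int_0^1 \int_0^\infty P_\eta(\xi)\, e^{-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta}\,d\xi\,d\beta. \tag{71} \]

Proof. This follows immediately from Lemma 7, see, for example, §11.55.II of [St].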
Lemma 8. The integral
\[ \int_1^\infty \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta\right) d\beta \]
is uniformly convergent for $\xi \in [0, \xi_0]$ for all $\xi_0 > 0$.

Proof. We, in fact, prove the stronger statement that the integral in question is uniformly convergent for all $\xi > 0$. Taking $M(\beta) := e^{-\tau^*\sqrt{\beta}}$,
\[ \left| \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta\right) \right| \leq M(\beta) \]
and the integral is uniformly convergent by the Weierstrass M-test.
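Lemma 9. The iterated integral
\[ \int_0^\infty \int_1^{R^2} P_\eta(\xi)\, e^{-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta}\,d\beta\,d\xi \tag{72} \]
converges uniformly for $R > 1$.

Proof. We shall first consider the case where $\sigma < 0$. A lengthy calculation gives
\[ \begin{aligned} \int_1^{R^2} e^{-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta}\,d\beta = {} & \frac{1}{i\sigma}\, e^{-i\sigma - \tau_\eta(\xi)} - \frac{1}{i\sigma}\, e^{-iR^2\sigma - R\tau_\eta(\xi)} \\ & - \frac{\sqrt{\pi}\,\tau_\eta(\xi)}{2e^{-3\pi i/4}(-\sigma)^{3/2}} \exp\!\left(\frac{\tau_\eta(\xi)^2}{4i\sigma}\right) \left[ \mathrm{erfc}\!\left(e^{-i\pi/4}\sqrt{-\sigma} + \frac{\tau_\eta(\xi)e^{i\pi/4}}{2\sqrt{-\sigma}}\right) \right. \\ & \left. {} - \mathrm{erfc}\!\left(Re^{-i\pi/4}\sqrt{-\sigma} + \frac{\tau_\eta(\xi)e^{i\pi/4}}{2\sqrt{-\sigma}}\right) \right]. \end{aligned} \]
By Lemma 5, $\tau_\eta(\xi) \sim \alpha\xi$ as $\xi \to \infty$, so
\[ \int_0^\infty P_\eta(\xi)\, e^{-iR^2\sigma - R\tau_\eta(\xi)}\,d\xi \]
is uniformly convergent for $R > 1$ by the Weierstrass M-test with $M(\xi) := Ce^{-\mathrm{Re}\,\tau_\eta(\xi)}$ for some constant $C$ which does not depend on $\xi$. We can write
\[ \exp\!\left(\frac{-i\tau_\eta(\xi)^2}{4\sigma}\right) \mathrm{erfc}\!\left(Re^{-i\pi/4}\sqrt{-\sigma} + \frac{\tau_\eta(\xi)e^{i\pi/4}}{2\sqrt{-\sigma}}\right) = \exp\!\left(-R^2 i\sigma - R\tau_\eta(\xi)\right) w\!\left(Re^{i\pi/4}\sqrt{-\sigma} + \frac{\tau_\eta(\xi)e^{3\pi i/4}}{2\sqrt{-\sigma}}\right) \]
and since $w(z) = O(z^{-1})$ as $z \to \infty$ and $|\tau_\eta(\xi)| \geq \tau^*$,
\[ w\!\left(Re^{i\pi/4}\sqrt{-\sigma} + \frac{\tau_\eta(\xi)e^{3\pi i/4}}{2\sqrt{-\sigma}}\right) = O(1) \]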
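as $\xi \to \infty$, uniformly for $R > 1$. Since $|\exp(-R\tau_\eta(\xi))| \leq \exp(-\mathrm{Re}\,\tau_\eta(\xi))$ we see that the convergence of
\[ \int_0^\infty P_\eta(\xi)\,\tau_\eta(\xi) \exp\!\left(\frac{-i\tau_\eta(\xi)^2}{4\sigma}\right) \mathrm{erfc}\!\left(Re^{-i\pi/4}\sqrt{-\sigma} + \frac{\tau_\eta(\xi)e^{i\pi/4}}{2\sqrt{-\sigma}}\right) d\xi \]
is uniform for $R > 1$, by the Weierstrass M-test. In the case $\sigma = 0$ we have the simpler integral
\[ \int_1^{R^2} \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\xi)\right) d\beta = \frac{2}{\tau_\eta(\xi)^2} \left( \tau_\eta(\xi)\left(e^{-\tau_\eta(\xi)} - Re^{-R\tau_\eta(\xi)}\right) + e^{-\tau_\eta(\xi)} - e^{-R\tau_\eta(\xi)} \right). \]
The integral with respect to $\xi$ then converges uniformly by the Weierstrass M-test, since
\[ |\exp(-R\tau_\eta(\xi))| \leq \exp(-\mathrm{Re}\,\tau_\eta(\xi)) \quad \text{and} \quad |R\exp(-R\tau_\eta(\xi))| \leq \frac{2}{\tau^* e} \exp\!\left(-\tfrac{1}{2}\mathrm{Re}\,\tau_\eta(\xi)\right) \]
for $R > 1$.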
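The closed form for the $\sigma = 0$ integral and the elementary bound used for the M-test can both be verified numerically. The sketch below is illustrative only: `tau` is an arbitrary complex number standing in for $\tau_\eta(\xi)$ at a fixed $\xi$ (its value is an assumption, as are the function names), and the left-hand side is approximated with a Simpson rule in $\beta$.

```python
import cmath
import math

def closed_form(tau, R):
    """Right-hand side of the sigma = 0 formula for the integral over [1, R^2]."""
    return 2.0 / tau**2 * (
        tau * (cmath.exp(-tau) - R * cmath.exp(-R * tau))
        + cmath.exp(-tau) - cmath.exp(-R * tau)
    )

def numeric(tau, R, n=20000):
    """Simpson approximation of the integral of exp(-sqrt(beta) tau) over [1, R^2]."""
    a, b = 1.0, R * R
    h = (b - a) / n
    total = cmath.exp(-math.sqrt(a) * tau) + cmath.exp(-math.sqrt(b) * tau)
    for j in range(1, n):
        beta = a + j * h
        total += (4 if j % 2 else 2) * cmath.exp(-math.sqrt(beta) * tau)
    return total * h / 3.0

def m_test_bound(tau, tau_star):
    """Bound (2 / (tau_star e)) exp(-Re(tau)/2) on |R exp(-R tau)| for R >= 1."""
    return 2.0 / (tau_star * math.e) * math.exp(-0.5 * tau.real)

tau = 1.3 + 0.4j  # assumed sample value with Re(tau) > 0
print(abs(numeric(tau, 3.0) - closed_form(tau, 3.0)))
```

The bound follows from maximising $Re^{-Rx/2}$ over $R$ for $x = \mathrm{Re}\,\tau_\eta(\xi) \geq \tau^*$; in the usage above any lower bound on the real part, including the real part itself, can play the role of `tau_star`.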
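The following theorem from §11.55.III of [St] describes criteria which permit the change of order of two improper integrals.

Theorem 4. Let $f(x, \alpha)$ be continuous in $\alpha_1 \leq \alpha \leq \alpha_2$ and $c \leq x \leq d$, where both $\alpha_2$ and $d$ may be arbitrarily large, and;
i) $\int_c^\infty f(x, \alpha)\,dx$ be uniformly convergent for $\alpha \in [\alpha_1, \alpha_2]$,
ii) $\int_{\alpha_1}^\infty f(x, \alpha)\,d\alpha$ be uniformly convergent for $x \in [c, d]$,
iii) $\int_c^\infty \int_{\alpha_1}^R f(x, \alpha)\,d\alpha\,dx$ be uniformly convergent for $R \in [\alpha_1, \infty]$,
then
\[ \int_{\alpha_1}^\infty \int_c^\infty f(x, \alpha)\,dx\,d\alpha = \int_c^\infty \int_{\alpha_1}^\infty f(x, \alpha)\,d\alpha\,dx. \]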
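Applying Theorem 4 to the integral in (45) allows us to conclude the following.

Proposition 5.
\[ \int_0^\infty \int_0^\infty P_\eta(\xi) \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta\right) d\xi\,d\beta = \int_0^\infty \int_0^\infty P_\eta(\xi) \exp\!\left(-\sqrt{\beta}\,\tau_\eta(\xi) - i\sigma\beta\right) d\beta\,d\xi. \]

Proof. This follows from Theorem 4 with Lemmas 7, 8 and 9, together with Corollary 2.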
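B. Appendix: Simplification of (55)

We here consider some technical points that arise in Sect. 5.

Lemma 10. The integral
\[ \mathrm{Re} \int_0^\infty P_\eta(\xi)\, \frac{e^{3i\pi/4}\tau_\eta(\xi)}{2(-\sigma)^{3/2}}\, w\!\left(\frac{e^{3i\pi/4}\tau_\eta(\xi)}{2\sqrt{-\sigma}}\right) d\xi \tag{73} \]
is uniformly convergent for $\sigma \in [-R^2, 0]$ for any $R > 0$.

Proof. Expanding the $w$ function, using Lemma 2,
\[ \frac{e^{3i\pi/4}\tau_\eta(\xi)}{2(-\sigma)^{3/2}}\, w\!\left(\frac{e^{3\pi i/4}\tau_\eta(\xi)}{2\sqrt{-\sigma}}\right) = \frac{1}{i\sigma} + O\!\left(\frac{1}{\tau_\eta(\xi)^2}\right) \]
as $\xi \to \infty$ where the implied constant is independent of $\sigma \in [-R^2, 0]$. Since $\int_0^\infty P_\eta(\xi)\,d\xi = \alpha$ the leading order term in the expansion of (73) has zero real part, and the integral of the remainder converges since $\tau_\eta(\xi)^{-2} \sim (\alpha\xi)^{-2}$ as $\xi \to \infty$.

Proposition 6. We have
\[ \lim_{R\to\infty} \int_0^\infty \int_{-\infty}^{-R^2} P_\eta(\xi)\, \frac{e^{3i\pi/4}\tau_\eta(\xi)}{2(-\sigma)^{3/2}}\, w\!\left(\frac{e^{3i\pi/4}\tau_\eta(\xi)}{2\sqrt{-\sigma}}\right) d\sigma\,d\xi = 0. \]

Proof. We make the substitution $2p = (-\sigma)^{-1/2}$ to give
\[ \int_{-\infty}^{-R^2} \frac{e^{3i\pi/4}\tau_\eta(\xi)}{2(-\sigma)^{3/2}}\, w\!\left(\frac{e^{3i\pi/4}\tau_\eta(\xi)}{2\sqrt{-\sigma}}\right) d\sigma = \int_0^{1/2R} 2e^{3i\pi/4}\tau_\eta(\xi)\, w\!\left(e^{3i\pi/4}\tau_\eta(\xi)\,p\right) dp = \int_{\gamma_{\xi,R}} 2w(t)\,dt, \]
where $t \in \mathbb{C}$ follows the contour $\gamma_{\xi,R}$ connecting $0$ to $\dfrac{e^{3i\pi/4}\tau_\eta(\xi)}{2R}$. Since $w$ is an analytic function, we can write
\[ \int_{-\infty}^{-R^2} \frac{e^{3i\pi/4}\tau_\eta(\xi)}{2(-\sigma)^{3/2}}\, w\!\left(\frac{e^{3i\pi/4}\tau_\eta(\xi)}{2\sqrt{-\sigma}}\right) d\sigma = 2W\!\left(\frac{e^{3i\pi/4}\tau_\eta(\xi)}{2R}\right), \]
where $W$ is the antiderivative of $w$ satisfying
\[ \frac{dW}{dz} = w(z) \quad \text{and} \quad W(0) = 0. \]
By making the substitution $\xi \to R\xi$, we see that
\[ \int_0^\infty P_\eta(\xi)\, W\!\left(\frac{e^{3i\pi/4}\tau_\eta(\xi)}{2R}\right) d\xi = \frac{R}{2} \int_0^1 P_\eta(R\sqrt{\nu})\, W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) \frac{d\nu}{\sqrt{\nu}} + R \int_1^\infty P_\eta(R\xi)\, W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\xi)}{2R}\right) d\xi, \tag{74} \]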
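where we have, additionally, split the range of integration into two regimes and in the first integral made the substitution $\nu = \xi^2$. For the first integral in (74) we consider
\[ \frac{R}{2} \int_0^1 \exp\!\left(\frac{i\pi}{4} - \frac{iR^2\nu}{4}\right) W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) \frac{d\nu}{\sqrt{\nu}} \]
which comes from the first term of $P_\eta(R\sqrt{\nu})$. The second term of $P_\eta$ can be handled in the same way. Differentiating,
\[ \frac{d}{d\nu} W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) = \frac{e^{3i\pi/4}}{4\sqrt{\nu}}\, w\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) \tau_\eta'(R\sqrt{\nu}). \tag{75} \]
Since $\dfrac{d\tau_\eta}{d\xi}$ is bounded for $\xi \in \mathbb{R}$, we deduce from (75) that there exists a constant $K$ independent of $R$ such that
\[ \left| \frac{d}{d\nu} W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) \right| \leq \frac{K}{\sqrt{\nu}}. \tag{76} \]
Let
\[ \psi(\nu) := -\sqrt{\pi}\, \mathrm{erfc}\!\left(\frac{Re^{i\pi/4}\sqrt{\nu}}{2}\right) \]
which satisfies
\[ \frac{d\psi}{d\nu} = \frac{R}{2\sqrt{\nu}} \exp\!\left(\frac{i\pi}{4} - \frac{iR^2\nu}{4}\right). \]
We can then use integration by parts,
\[ \frac{R}{2} \int_0^1 \exp\!\left(\frac{i\pi}{4} - \frac{iR^2\nu}{4}\right) W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) \frac{d\nu}{\sqrt{\nu}} \tag{77} \]
\[ = \left[ \psi(\nu)\, W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) \right]_0^1 - \int_0^1 \psi(\nu)\, \frac{d}{d\nu} W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\sqrt{\nu})}{2R}\right) d\nu \;\to\; 0 \tag{78} \]
as $R \to \infty$, since
\[ W\!\left(\frac{e^{3i\pi/4}\tau_\eta(0)}{2R}\right) \to 0 \quad \text{and} \quad \mathrm{erfc}\!\left(\frac{Re^{i\pi/4}}{2}\right) \to 0 \]
and the fact that the final integral in (78) converges uniformly by (76). For the second integral in (74) we apply Taylor's theorem and Lemma 5 to get
\[ W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\xi)}{2R}\right) = W\!\left(\frac{e^{3i\pi/4}\alpha\xi}{2}\right) + O\!\left(\frac{1}{R^3\xi^2}\right). \]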
This gives
\[ R \int_1^\infty P_\eta(R\xi)\, W\!\left(\frac{e^{3i\pi/4}\tau_\eta(R\xi)}{2R}\right) d\xi = R \int_1^\infty P_\eta(R\xi)\, W\!\left(\frac{e^{3i\pi/4}\alpha\xi}{2}\right) d\xi + O(R^{-2}) \]
as $R \to \infty$. The integral which remains is of a form for which the asymptotic series may be derived by the method of repeated integration-by-parts [BlHa] to see that this contribution also vanishes in the limit $R \to \infty$.

References

[AS] Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions. New York: Dover Publishing, 1965
[BSS] Bäcker, A., Schubert, R., Stifter, P.: Rate of quantum ergodicity in Euclidean billiards. Phys. Rev. E 57, 5425–5447 (1998); Erratum ibid. 58, 5192 (1998)
[BG] Barra, F., Gaspard, P.: On the level spacing distribution in quantum graphs. J. Stat. Phys. 101, 283–319 (2000)
[B] Berkolaiko, G.: Form factor for large quantum graphs: evaluating orbits with time-reversal. Waves Random Media 14, S7–S27 (2004)
[BBK] Berkolaiko, G., Bogomolny, E.B., Keating, J.P.: Star graphs and Šeba billiards. J. Phys. A 34, 335–350 (2001)
[BK] Berkolaiko, G., Keating, J.P.: Two-point spectral correlations for star graphs. J. Phys. A 32, 7827–7841 (1999)
[BSW1] Berkolaiko, G., Schanz, H., Whitney, R.S.: Leading off-diagonal correction to the form factor of large graphs. Phys. Rev. Lett. 82, 104101 (2002)
[BSW2] Berkolaiko, G., Schanz, H., Whitney, R.S.: Form factor for a family of quantum graphs: An expansion to third order. J. Phys. A 36, 8373–8392 (2003)
[BKW] Berkolaiko, G., Keating, J.P., Winn, B.: Intermediate wave-function statistics. Phys. Rev. Lett. 91, 134103 (2003)
[Be1] Berry, M.V.: Regular and irregular semiclassical wavefunctions. J. Phys. A 10, 2083–2091 (1977)
[Be2] Berry, M.V.: Quantum scars of classical closed orbits in phase space. Proc. Roy. Soc. Lond. A 423, 219–231 (1989)
[BlHa] Bleistein, N., Handelsman, R.A.: Asymptotic expansions of integrals. New York: Dover Publishing, 1986
[Bo] Bogomolny, E.B.: Smoothed wave functions of chaotic quantum systems. Physica D 31, 169–189 (1988)
[BGS] Bohigas, O., Giannoni, M.-J., Schmit, C.: Characterisation of chaotic quantum spectra and universality of level fluctuation laws. Phys. Rev. Lett. 52, 1–4 (1984)
[BH1] Bolte, J., Harrison, J.: Spectral statistics for the Dirac operator on graphs. J. Phys. A 36, 2747–2769 (2003)
[BH2] Bolte, J., Harrison, J.: The spin contribution to the form factor of quantum graphs. J. Phys. A 36, L433–L440 (2003)
[BdB] Bouzouina, A., De Bièvre, S.: Equipartition of the eigenfunctions of quantised ergodic maps on the torus. Commun. Math. Phys. 178, 83–105 (1996)
[CdV] Colin De Verdière, Y.: Ergodicité et fonctions propres du Laplacien. Commun. Math. Phys. 102, 497–502 (1985)
[CDM] Comtet, A., Desbois, J., Majumdar, S.N.: The local time distribution of a particle diffusing on a graph. J. Phys. A 35, 687–694 (2002)
[D] Desbois, J.: Occupation times distribution for Brownian motion on graphs. J. Phys. A 35, 673–678 (2002)
[DS] Dunford, N., Schwartz, J.T.: Linear Operators Part II: Spectral Theory. New York: Interscience Publishers, 1963
[FNdB] Faure, F., Nonnenmacher, S., De Bièvre, S.: Scarred eigenstates for quantum cat maps of minimal periods. Commun. Math. Phys. 239, 449–492 (2003)
[GA] Gnutzmann, S., Altland, A.: Universal spectral statistics in quantum graphs. Preprint. http://arxiv.org/abs/nlin.cd/0402029, 2004
[GL] Gérard, P., Leichtnam, E.: Ergodic properties of the eigenfunctions for the Dirichlet problem. Duke Math. J. 71, 559–607 (1993)
[GR] Gradshteyn, I.S., Ryzhik, I.M.: Tables of integrals, series, and products. London-New York: Academic Press, 1965
[GSW] Gnutzmann, S., Smilansky, U., Weber, J.: Nodal domains on quantum graphs. Waves Random Media 14, S61–S73 (2004)
[H] Heller, E.J.: Bound-state eigenfunctions of classically chaotic Hamiltonian systems: scars of periodic orbits. Phys. Rev. Lett. 53, 1515–1518 (1984)
[K1] Kaplan, L.: Scars in quantum chaotic wavefunctions. Nonlinearity 12, R1–R40 (1999)
[K2] Kaplan, L.: Eigenstate structure in graphs and disordered lattices. Phys. Rev. E 64, 036225 (2001)
[KH] Kaplan, L., Heller, E.J.: Linear and nonlinear theory of eigenfunction scars. Ann. Phys. 264, 171–206 (1998)
[Ke] Keating, J.P.: The cat maps: quantum mechanics and classical motion. Nonlinearity 4, 309–341 (1991)
[KMW] Keating, J.P., Marklof, J., Winn, B.: Value distribution of the eigenfunctions and spectral determinants of quantum star graphs. Commun. Math. Phys. 241, 421–452 (2003)
[KP] Keating, J.P., Prado, S.D.: Orbit bifurcations and the scarring of wavefunctions. Proc. Roy. Soc. Lond. A 457, 1855–1872 (2001)
[KS1] Kottos, T., Smilansky, U.: Quantum chaos on graphs. Phys. Rev. Lett. 79, 4794–4797 (1997)
[KS2] Kottos, T., Smilansky, U.: Periodic orbit theory and spectral statistics for quantum graphs. Ann. Phys. 274, 76–124 (1999)
[KS3] Kottos, T., Smilansky, U.: Chaotic scattering on graphs. Phys. Rev. Lett. 85, 968–971 (2000)
[KS4] Kottos, T., Smilansky, U.: Quantum graphs: a simple model for chaotic scattering. J. Phys. A 36, 3501–3524 (2003)
[PTŻ] Pakoński, P., Tanner, G., Życzkowski, K.: Families of line-graphs and their quantization. J. Stat. Phys. 111, 1331–1351 (2003)
[S] Schnirelmann, A.: Ergodic properties of eigenfunctions. Usp. Math. Nauk. 29, 181–182 (1974)
[St] Stewart, C.A.: Advanced Calculus. Methuen, 1940
[Še] Šeba, P.: Wave chaos in singular quantum billiards. Phys. Rev. Lett. 64, 1855–1858 (1990)
[SK] Schanz, H., Kottos, T.: Scars on quantum networks ignore the Lyapunov exponent. Phys. Rev. Lett. 90, 234101 (2003)
[T] Tanner, G.: Unitary stochastic matrix ensembles and spectral statistics. J. Phys. A 34, 8485–8500 (2001)
[TM] Texier, C., Montambaux, G.: Scattering theory on graphs. J. Phys. A 34, 10307–10326 (2001)
[V] Voros, A.: Semi-classical ergodicity of quantum eigenstates in the Wigner representation. In: Stochastic behaviour in classical and quantum Hamiltonian systems, Berlin-Heidelberg-New York: Springer-Verlag, 1979, pp. 326–333
[W] Weyl, H.: Über die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 77, 313–352 (1916)
[Z] Zelditch, S.: Uniform distribution of the eigenfunctions on compact hyperbolic surfaces. Duke Math. J. 55, 919–941 (1987)
Communicated by P. Sarnak
Commun. Math. Phys. 250, 287–300 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1146-z
A Lower Bound for the Wehrl Entropy of Quantum Spin with Sharp High-Spin Asymptotics

Bernhard G. Bodmann

Department of Physics, Princeton University, Princeton, NJ 08544, USA.
Present address: Department of Mathematics, University of Houston, Houston, TX 77204, USA. E-mail: [email protected]

Received: 28 July 2003 / Accepted: 1 February 2004 Published online: 5 August 2004 – © Springer-Verlag 2004

Abstract: A lower bound for the Wehrl entropy of a single quantum spin is derived. The high-spin asymptotics of this bound coincides with Lieb's conjecture up to, but not including, terms of first and higher order in the inverse spin quantum number. The result presented here may be seen as complementary to the verification of the conjecture in cases of lowest spin by Schupp [Commun. Math. Phys. 207, 481 (1999)]. The present result for the Wehrl-entropy is obtained from interpolating a sharp norm bound that also implies a sharp lower bound for the so-called Rényi-Wehrl entropy with certain indices that are evenly spaced by half of the inverse spin quantum number.

1. Introduction

Among the many facets of entropy, Wehrl [Weh91] investigated the relation between the usual Gibbs-Boltzmann-Shannon entropy of a density in phase space and its quantum analogue, commonly known as the von Neumann entropy of a quantum state. The question how to recover one from the other in the sense of a classical limit led Wehrl to construct a quasi-classical entropy of quantum states [Weh79]. This hybrid construction derived from so-called Glauber coherent states is non-negative, or more accurately, bounded below by the von Neumann entropy. Described in physical terms, the presence of an uncertainty principle in the quasi-classical formulation imposes the lower bound on the usual classical entropy. Wehrl conjectured that the minimum value of his quasi-classical entropy is assumed whenever the quantum system is in a coherent state. This was proved by Lieb [Lie78] who established sharp norm bounds for the Bargmann transform [Bar61], an isometry between Schrödinger's position-space representation and a phase-space representation associated with Glauber coherent states. In addition, Lieb [Lie78] suggested that coherent states should also be the minimizers for the Wehrl entropy of quantum spin systems.
The quasi-classical entropy for a single quantum spin is derived from coherent states that contain a highest-weight vector in a finite-dimensional Hilbert space carrying a unitary irreducible representation of SU(2). In this finite-dimensional setting, one could have hoped for a simpler way to prove the corresponding bound on the Wehrl entropy compared to the effort and ingenuity in the proof given by Lieb for the case of Glauber coherent states. In fact, this would have provided an alternative to Lieb’s original proof via the usual group-contraction procedure, in which Glauber coherent states are obtained as the high-spin limit of SU(2) coherent states [ACGT72]. However, despite a lot of attempts, such a proof did not materialize. Perhaps the simplest proof of Wehrl’s original conjecture was given by Carlen [Car91] and in a similar argument by Luo [Luo00]. It reduces to an equality between the norm of functions in the phase-space representation and the Dirichlet form of the Laplacian, together with Gross’ logarithmic Sobolev inequality [Gro75]. Unfortunately, the combination of such identities on the sphere does not yield the desired bound for the Wehrl entropy of a quantum spin. The reason is that the logarithmic Sobolev inequality on the sphere [MW82] has the constant as its optimizer, a function that is not contained in the phase-space representation for spin. A few years ago, Lieb’s conjecture was verified for the non-trivial case of spin quantum numbers j = 1 and j = 3/2 [Sch99]. This was done by explicit calculation of the entropy in a geometric parametrization of quantum states that also appeared in [Lee88], combined with elementary inequalities involving chordal distances of points on the sphere. Here, we derive a lower bound on the Wehrl entropy that coincides with Lieb’s conjecture up to first order in the inverse of the spin quantum number. 
To this end, we modify Carlen's approach and its implicit idea that the Wehrl entropy bound should be a result of the hypercontractivity properties of the Berezin transform. In physical terms, increasing the value of Planck's constant adds fluctuations and smooths the phase-space density associated with a given quantum state. As usual, we evaluate the smoothness of a density by the $s$-dependence of its $s$-norm, and understand the Wehrl entropy as related to the infinitesimal change at $s = 1$. Carlen's technique amounts to establishing bounds between norms for continuously tuned values of Planck's constant. Its analogue in the setting of spin is the inverse of the spin quantum number that cannot be tuned continuously, which is ultimately the reason why the entropy bound obtained here is not sharp. Fortunately, for large spin quantum numbers this conceptual problem is less and less relevant which leads to the sharp asymptotics. A more detailed explanation of the way that Carlen's approach has been modified here is given in the concluding remarks in Sect. 5.

This paper is organized as follows: In Sect. 2 we fix the notation and give the basic definitions for a quantum spin. Then, we recall conjectures concerning Wehrl's entropy and some background information in Sect. 3. After that, we present a sharp norm bound and derive an estimate on the Wehrl entropy, followed by proofs in Sect. 4. We conclude with a review of the strategies employed here and possible improvements.

2. Basic Definitions

The mathematical description of a spatially fixed, non-relativistic spinning quantum particle involves a $(2j+1)$-dimensional Hilbert space, where $j \in \frac{1}{2}\mathbb{N}$ is called the spin quantum number. One may identify this space with the complex Euclidean space $\mathbb{C}^{2j+1}$ equipped with the canonical inner product. The symmetry group describing the purely internal quantum degree of freedom is SU(2), and its representation on $\mathbb{C}^{2j+1}$ may be
constructed by an inductive procedure [Lee88, Sch99] from that on $\mathbb{C}^2$. An alternative to this construction is based on choosing a function space related to a coherent-state resolution of the identity [Kla59, Ber74, Per86]. This function space incorporates a correspondence principle and makes contact with a classical mechanical system having the sphere as its phase space. In the co-ordinates we use, the functions are defined on the Riemann sphere, i.e. the compactified complex plane. Elements of SU(2) act in close analogy with rotations, realized as Moebius transformations in the complex plane.

Definition 2.1. Each vector $f$ in the complex linear space $\mathcal{F}_j$ is given as the product of a conformal factor and a polynomial of degree at most $2j$,
\[ f : \mathbb{C} \to \mathbb{C}, \quad z \mapsto \frac{1}{(1 + |z|^2)^j} \sum_{k=0}^{2j} c_k z^k \tag{2.1} \]
with complex coefficients $c_k \in \mathbb{C}$, $k \in \{0, 1, \ldots, 2j\}$. We define an inner product $\langle f, g \rangle$ between vectors $f$ and $g$ as
\[ \langle f, g \rangle := \frac{2j+1}{\pi} \int_{\mathbb{C}} \overline{f(z)}\, g(z)\, \frac{d^2z}{(1 + |z|^2)^2}, \tag{2.2} \]
by convention conjugate linear in the first entry.

Remarks 2.2. The elements of the group SU(2) may be thought of as matrices $(\alpha, -\bar{\beta}; \beta, \bar{\alpha})$ that are specified by two complex parameters $\alpha, \beta \in \mathbb{C}$ satisfying $|\alpha|^2 + |\beta|^2 = 1$. The group acts on a function $f$ in the space $\mathcal{F}_j$ by Moebius transformations of the argument combined with a unimodular multiplier,
\[ T_{\alpha,\beta} f(z) = \frac{(\beta z + \bar{\alpha})^{2j}}{|\beta z + \bar{\alpha}|^{2j}}\, f\!\left(\frac{\alpha z - \bar{\beta}}{\beta z + \bar{\alpha}}\right). \tag{2.3} \]
One may verify that indeed the inner product of $\mathcal{F}_j$ is invariant under such transformations $\{T_{\alpha,\beta}\}$, because $\frac{d^2z}{(1+|z|^2)^2}$ is the rotation-invariant measure on the sphere expressed in stereographic coordinates. The space $\mathcal{F}_j$ is equipped with a reproducing kernel $K$,
\[ K(z, w) = \frac{(1 + z\bar{w})^{2j}}{(1 + |z|^2)^j (1 + |w|^2)^j} \tag{2.4} \]
that yields
\[ \langle K(\cdot, w), f \rangle = f(w) \tag{2.5} \]
for all $f \in \mathcal{F}_j$, $w \in \mathbb{C}$. The functions $\{K(\cdot, w)\} \subset \mathcal{F}_j$, indexed by $w \in \mathbb{C} \cup \{\infty\}$ with the convention $K(z, \infty) = \frac{z^{2j}}{(1+|z|^2)^j}$, are also known as coherent vectors.
3. Conjectured Norm and Entropy Bounds

A correspondence principle [Ber74] suggests viewing the compactified complex plane as the phase-space of a classical system. From this point of view, the definition of the Hilbert space $\mathcal{F}_j$ implies an uncertainty principle. Unlike in $L^2$-spaces, one cannot arbitrarily concentrate the contribution to the norm of a function $f \in \mathcal{F}_j$ around a given point.

Definition 3.1. The vectors in $\mathcal{F}_j$ are by their definition all bounded functions and thus contained in the $L^p(S^2)$ spaces on the Riemann sphere $S^2 = \mathbb{C} \cup \{\infty\}$. For each $p \geq 1$, we define the normalized $p$-norm of $f$ as
\[ |||f|||_p := \left( \frac{pj+1}{\pi} \int_{\mathbb{C}} |f(z)|^p\, \frac{d^2z}{(1 + |z|^2)^2} \right)^{1/p}. \tag{3.1} \]

Remarks 3.2. Since the underlying measure is invariant under Moebius transformations, these $p$-norms are all invariant under the action of SU(2),
\[ |||T_{\alpha,\beta} f|||_p = |||f|||_p \tag{3.2} \]
where again $\alpha, \beta \in \mathbb{C}$ satisfy $|\alpha|^2 + |\beta|^2 = 1$. The unusual, explicit $p$-dependence via the prefactor $pj + 1$ in the definition (3.1) is chosen in order to have $|||K(\cdot, w)|||_p = 1$ independent of $w \in \mathbb{C} \cup \{\infty\}$ and $p \geq 1$. This may be verified by explicit calculation after using an appropriate $T_{\alpha,\beta}$ rotating the index $w$ to the origin. By the point-evaluation property (2.5), the Cauchy-Schwarz inequality and the normalization of coherent vectors, we estimate $|f(z)| \leq |||f|||_2$. Therefore, when $f$ is normalized according to $|||f|||_2 = 1$, the probability density $\rho_f : z \mapsto |f(z)|^2 \leq 1$ cannot be arbitrarily concentrated, which provides a first albeit crude uncertainty principle.

Definition 3.3. The Wehrl entropy of $f \in \mathcal{F}_j$, normalized according to $|||f|||_2 = 1$, is defined as
\[ S_j(|f|^2) := -\frac{2j+1}{\pi} \int_{\mathbb{C}} |f(z)|^2 \ln |f(z)|^2\, \frac{d^2z}{(1 + |z|^2)^2}. \tag{3.3} \]

Remarks 3.4. By the pointwise estimate in the preceding remark, $S_j$ is seen to be non-negative. This may also be proved using Jensen's inequality [Lie78]. Coherent vectors are believed to be the most concentrated wavepackets and therefore expected to minimize $S_j$ and various other measures of uncertainty. This is suggestive in the light of the point evaluation property (2.5), which is in sufficiently large function spaces solely achieved by integrating against Dirac's delta function. The somewhat vague notion of coherent vectors being the "most concentrated" motivates the following conjecture in the spirit of Lieb's approach to the Wehrl entropy bound [Lie78].

Conjecture 3.5. The normalized $p$-norm of a given $f \in \mathcal{F}_j$ is decreasing in $p \geq 1$. More precisely,
\[ |||f|||_q \leq |||f|||_p \tag{3.4} \]
for all $q \geq p \geq 1$, and when $q > p$, equality holds if and only if $f$ is collinear with a coherent vector
\[ f(z) = cK(z, w) \tag{3.5} \]
for some fixed pair $c \in \mathbb{C}$ and $w \in \mathbb{C} \cup \{\infty\}$.
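The normalization in (3.1) and the monotonicity asserted in Conjecture 3.5 can be illustrated on the monomial vectors $f_k(z) = \binom{2j}{k}^{1/2} z^k/(1+|z|^2)^j$, for which the integral in (3.1) reduces to a Beta function after the substitution $r = |z|^2$. The following is a sketch under that assumption; the choice of test vectors and the function names are made here for illustration and do not appear in the paper. The case $k = 0$ is the coherent vector $K(\cdot, 0)$, while $0 < k < 2j$ gives vectors that are not of the form (3.5).

```python
import math

def log_beta(a, b):
    """Logarithm of the Euler Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def normalized_p_norm(j, k, p):
    """|||f_k|||_p for f_k(z) = sqrt(binom(2j, k)) z^k / (1 + |z|^2)^j.

    With r = |z|^2 the definition (3.1) becomes
    |||f_k|||_p^p = (pj + 1) |c|^p B(pk/2 + 1, p(2j - k)/2 + 1),
    where |c|^2 = binom(2j, k) ensures |||f_k|||_2 = 1."""
    two_j = round(2 * j)
    log_c2 = math.log(math.comb(two_j, k))
    log_norm_p = (
        math.log(p * j + 1)
        + 0.5 * p * log_c2
        + log_beta(0.5 * p * k + 1, 0.5 * p * (two_j - k) + 1)
    )
    return math.exp(log_norm_p / p)

P_VALUES = (1.0, 1.5, 2.0, 3.0, 4.0)
coherent = [normalized_p_norm(1.5, 0, p) for p in P_VALUES]
middle = [normalized_p_norm(1.0, 1, p) for p in P_VALUES]
print(coherent)
print(middle)
```

For the coherent vector every entry should equal one up to rounding, whereas for $j = 1$, $k = 1$ the values decrease with $p$, in line with (3.4).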
A related conjecture has recently been discussed in the literature in the context of measures of localization in phase-space, see [GŻ01] and references therein. It appears there as a conjectured lower bound for the so-called Rényi-Wehrl entropy. In terms of our notation, it amounts to stating the following special case of the above conjecture.

Conjecture 3.6. Specializing to $p = 2$ in the preceding conjecture and using the monotonicity properties of the $q$th power and of the logarithm, (3.4) implies for $f \in \mathcal{F}_j$ with $|||f|||_2 = 1$ and any $q > 2$ that
\[ \frac{2q}{2-q} \ln(|||f|||_q) \geq 0, \tag{3.6} \]
with equality again if and only if $f$ is collinear with a coherent vector.

We henceforth refer to the left-hand side of the above inequality as an entropy of Rényi-Wehrl type of index $q/2$, but caution the reader that it differs from the definition of the usual Rényi-Wehrl entropy in the literature [GŻ01, Sug02] by the explicit $q$-dependence contained in the norm $|||f|||_q$. A virtue of the normalization used here is that neither (3.4) nor (3.6) are explicitly $j$-dependent. In addition, the conjectured monotonicity in (3.4) and non-negativity in (3.6) are notationally identical with properties of Rényi's entropy in discrete measure spaces, see [Życ03] or equivalent statements in [BS93, Sect. 5.3] in terms of the Rényi information. The reason for this similarity is that the minimizers of Rényi's entropy in the discrete measure spaces are normalized with respect to all $l^q$-norms, just as the coherent vectors are in our case. One may wonder whether the counterparts of other properties described in [Życ03] or [BS93], such as convexity and monotonicity of $\frac{2q}{2-q}\ln(|||f|||_q)$ in $q$, are also satisfied.

The analogue of (3.6) has been shown to hold when $q$ is an even integer in a rather general setting [Sug02], thereby extending a tensorization argument for SU(2), see [Lee88] or [Sch99, Theorem 4.3], to compact semisimple Lie groups, see also the related results [Sai80, Bur83, Bur87, Luo97]. However, so far we still lack a proof that includes those $q$ that are arbitrarily close to $p = 2$. If those were included in a proof of either of the preceding conjectures, then the following bound on the Wehrl entropy would result by endpoint differentiation.

Conjecture 3.7 (Lieb, 1978). For all $f \in \mathcal{F}_j$ with $|||f|||_2 = 1$, the Wehrl entropy is bounded below by
\[ S_j(|f|^2) \geq \frac{2j}{2j+1} \tag{3.7} \]
and equality holds whenever $f$ is up to a unimodular constant given by a coherent vector.

4. Results

The first result presented in this section is that for a given $f \in \mathcal{F}_j$, the norm $|||f|||_q$ is decreasing when restricted to a discrete set of values for $q$. A special case of this result implies via interpolation and endpoint differentiation the bound on the Wehrl entropy with sharp asymptotics.

Theorem 4.1. The conjectured norm estimate (3.4) holds for all $f \in \mathcal{F}_j$ if $q = p + \frac{n}{j}$ with any integer $n \in \mathbb{N}$ and $p > \frac{1}{j}$. The estimate is sharp because equality holds in (3.4) if $f$ is collinear with a coherent vector.
Consequence 4.2. Specializing to $p = 2$ and $|||f|||_2 = 1$, this estimate implies via the monotonicity properties of the logarithm and of the $q$th power that (3.6) is true for $f \in \mathcal{F}_j$ when $q = 2 + \frac{n}{j}$ with $n \in \mathbb{N}$. Unlike in previous results [Sai80, Bur83, Bur87, Luo97, Sch99, Sug02], the index $q/2$ is no longer restricted to integer values. Another, less direct consequence of Theorem 4.1 is the following bound on Wehrl's entropy.

Theorem 4.3. The Wehrl entropy associated with $f \in \mathcal{F}_j$, $|||f|||_2 = 1$, is bounded below by
\[ S_j(|f|^2) \geq 2j \ln\!\left(1 + \frac{1}{2j+1}\right). \tag{4.1} \]

Remarks 4.4. This bound has the conjectured high-spin asymptotics up to, but not including, first and higher order terms in $j^{-1}$, because
\[ 0 \leq \frac{2j}{2j+1} - 2j \ln\!\left(1 + \frac{1}{2j+1}\right) = \frac{2j}{(2j+1)^2} \int_0^1 \frac{x}{1 + \frac{x}{2j+1}}\,dx < \frac{1}{4j}. \tag{4.2} \]
Since highest-weight SU(2) coherent states approach Glauber coherent states when scaled appropriately in the limit $j \to \infty$ [ACGT72], see also [BLW99], one could use these asymptotics to reproduce the entropy bound related to Glauber coherent states obtained by Lieb [Lie78]. However, the route via finite dimensional spaces and SU(2) does not seem to offer the vast simplification that was originally expected.

By the usual concavity argument [Weh79, Lie78, Sch99], the Wehrl entropy bound derived here may be extended to include all quantum states, not only pure ones. To this end, we only need to replace the density $\rho_f = |f|^2$ that appears in the definition of $S_j$ by the more general $\rho : z \mapsto \sum_k |f_k(z)|^2$ with any orthogonal family $\{f_k\}_{k=1}^{2j} \subset \mathcal{F}_j$ that satisfies $\sum_k |||f_k|||_2^2 = 1$. Another possible generalization of the bound (4.1) is to include several degrees of freedom using the monotonicity of Wehrl's entropy [Weh79, Lie78].
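The gap between the conjectured value (3.7) and the proven bound (4.1) is easy to tabulate. The short sketch below, with hypothetical function names, evaluates both sides for half-integer spins and compares the gap with the $1/(4j)$ estimate of (4.2); it is meant only as a numerical illustration of the remark.

```python
import math

def conjectured_bound(j):
    """Right-hand side of Lieb's conjecture (3.7)."""
    return 2 * j / (2 * j + 1)

def proven_bound(j):
    """Right-hand side of the lower bound (4.1)."""
    return 2 * j * math.log1p(1.0 / (2 * j + 1))

def gap(j):
    """Difference estimated in (4.2)."""
    return conjectured_bound(j) - proven_bound(j)

SPINS = [two_j / 2 for two_j in range(1, 41)]  # j = 1/2, 1, ..., 20
for j in SPINS[:6]:
    print(j, conjectured_bound(j), proven_bound(j), gap(j), 1 / (4 * j))
```

The printed gap shrinks like a multiple of $1/j$, which is why the proven bound matches the conjecture only up to terms of first order in the inverse spin quantum number.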
4.1. Proof of Theorem 4.1. Proof. To begin with, we remark that it is enough to show the result for $n = 1$ and $q > p > q/2$, otherwise we simply iterate the inequality. The strategy for the proof of Theorem 4.1 is as follows: First, we restate the norm bound (3.4) as the result of an optimization problem involving the quadratic form of the Laplacian on the sphere. This step relies on identity (4.11) that is derived in the spirit of Carlen [Car91] from the special properties of the function space $F_j$. Then, we enlarge the space when looking for optimizers without losing the sharpness of the bound, because coherent vectors are optimal functions. For the detailed explanation, it is convenient to introduce the norm $\|f\|_p = |||f|||_p/(pj+1)^{1/p}$ of $f \in L^p(S^2)$ with respect to the rotationally-invariant probability measure on $S^2$. Using the Carlen Identity (4.11), we have for $q \ge p > 0$,
\[ \max_{f \in F_j \setminus \{0\}} \frac{\|f\|_q^q}{\|f\|_p^q} = \max_{f \in F_j \setminus \{0\}} \frac{\|f\|_q^q - \frac{4}{qj(qj+2)} \int_{\mathbb{C}} \bigl|\partial |f|^{q/2}\bigr|^2 \frac{d^2z}{\pi}}{\bigl(1 - \frac{1}{qj+2}\bigr)\|f\|_p^q}. \tag{4.3} \]
Without loss of generality, we can assume that each $f$ vanishes at infinity, $\lim_{z \to \infty} f(z) = 0$. Either this assumption is satisfied right away, or the polynomial part of $f$ has highest degree and thus it has a zero in the complex plane that can be rotated to infinity by a suitable Moebius transformation without changing any of the norms in (4.3). Now we change notation to $u = |f|^{q/2}$ and enlarge the set of such $f$,
\[ \max_f \frac{\|f\|_q^q}{\|f\|_p^q} \le \sup_u \frac{\|u\|_2^2 - \frac{4}{qj(qj+2)} \int_{\mathbb{C}} |\partial u|^2 \frac{d^2z}{\pi}}{\bigl(1 - \frac{1}{qj+2}\bigr)\|u\|_{2p/q}^2}, \tag{4.4} \]
where the supremum is taken over the set $\{0 \le u \not\equiv 0\}$ in the space $C_0^1(\mathbb{C})$ of bounded, continuously differentiable functions on the plane that vanish at infinity. In Lemma 4.7 we see that the supremum in (4.4) has a finite value and even deduce the existence of a maximizing function for this supremum. Given any maximizer, performing a spherically symmetric decreasing rearrangement must necessarily preserve its gradient norm. Therefore, we may restrict ourselves to proving in Lemma 4.9 that in the class of rotationally symmetric functions, this maximizer is unique, up to an overall positive constant. It satisfies the Euler-Lagrange variational equation
\[ \Bigl(1 + \frac{4}{qj(qj+2)}\,\Delta\Bigr) u = b\,u^{\frac{2p}{q}-1} \tag{4.5} \]
with $\Delta = \frac14 (1+|z|^2)^2(\partial_1^2 + \partial_2^2)$ being the spherical Laplacian and a constant $b > 0$ that is fixed by choosing the norm $\|u\|_{2p/q}$. By inspection, this unique function is seen to be $u : z \mapsto A(1+|z|^2)^{-qj/2}$ with $A = ((qj+2)b/qj)^{q/2(q-p)}$, and thus the corresponding $f$ is up to a constant factor a coherent vector. Consequently, inequality (3.4) follows from reverting to the original normalization convention. It is sharp since equality is achieved if $f$ is a coherent vector.
4.2. Proof of Theorem 4.3. This is merely a consequence of Theorem 4.1. It follows from interpolating the norm bound (3.4) and endpoint differentiation. Proof. Let us choose $q = 2 + 1/j$ in Theorem 4.1 and for $2 < s < q$ select $\theta > 0$ such that $\frac{1}{s} = \frac{1-\theta}{2} + \frac{\theta}{q}$. Using the same notation $\|f\|_s = |||f|||_s/(sj+1)^{1/s}$ as before, Hölder's inequality combined with the norm bound (3.4) gives
\[ \|f\|_s \le \|f\|_2^{1-\theta}\,\|f\|_q^{\theta} \tag{4.6} \]
\[ \le \Bigl(\frac{(2j+1)^{1/2}}{(qj+1)^{1/q}}\Bigr)^{\theta} \|f\|_2, \tag{4.7} \]
so eliminating $\theta$ results in
\[ \|f\|_s^s \le \frac{(2j+1)^{\frac{q}{2}\frac{2-s}{2-q}}}{(qj+1)^{\frac{2-s}{2-q}}}\,\|f\|_2^s. \tag{4.8} \]
At $s = 2$ both sides are equal, thus one may differentiate
\[ s\,\frac{d}{ds}\Big|_{s=2} \|f\|_s^s = \frac{1}{\pi}\int_{\mathbb{C}} |f(z)|^2 \ln |f(z)|^2\,\frac{d^2z}{(1+|z|^2)^2} \tag{4.9} \]
\[ \le s\,\frac{d}{ds}\Big|_{s=2} \frac{(2j+1)^{\frac{q}{2}\frac{2-s}{2-q}}}{(qj+1)^{\frac{2-s}{2-q}}} = \ln(2j+1) - 2j \ln\frac{2j+2}{2j+1} \tag{4.10} \]
and by a change in normalization and an overall change of sign arrives at the desired entropy bound (4.1). The justification for differentiating under the integral sign in (4.9) is dominated convergence and the estimate $0 \ge 2(|f|^{s-2} - 1)/(s-2) \ge \ln |f|^2$.
4.3. Ingredients of the proof of Theorem 4.1. Lemma 4.5 (Carlen Identity). For $q > 0$ and all $f \in F_j$, $j \in \frac12\mathbb{N}$, the following identity is true:
\[ \int_{\mathbb{C}} \bigl|\partial |f|^{q/2}\bigr|^2 \frac{d^2z}{\pi} = \frac{qj}{4\pi} \int_{\mathbb{C}} |f|^q \frac{d^2z}{(1+|z|^2)^2}. \tag{4.11} \]
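Proof. We will first prove a regularized version of the identity. Let $\epsilon > 0$ and assume $\phi : \mathbb{C} \to \mathbb{C}$ is a complex polynomial of maximal degree $2j$; we will show for $q > 0$ that
\[ \frac{qj}{4}\int_{\mathbb{C}} \frac{(|\phi(z)|^2+\epsilon)^{q/2}}{(1+|z|^2)^{qj+2}}\,\frac{d^2z}{\pi} = \int_{\mathbb{C}} \Bigl|\partial\,\frac{(|\phi(z)|^2+\epsilon)^{q/4}}{(1+|z|^2)^{qj/2}}\Bigr|^2 \frac{d^2z}{\pi} + E(\epsilon) \tag{4.12} \]
with an error term
\[ E(\epsilon) = \frac{q\epsilon}{8}\int_{\mathbb{C}} (|\phi(z)|^2+\epsilon)^{\frac{q}{2}-2}\,\frac{|\partial\phi(z)|^2}{(1+|z|^2)^{qj}}\,\frac{d^2z}{\pi}. \tag{4.13} \]
This regularized identity is obtained by elementary calculus operations involving the complex derivative $\partial$, observing that $\phi$ is holomorphic, and integration by parts:
\[ \int_{\mathbb{C}} \Bigl|\partial\,\frac{(|\phi(z)|^2+\epsilon)^{q/4}}{(1+|z|^2)^{qj/2}}\Bigr|^2 \frac{d^2z}{\pi} = \int_{\mathbb{C}} \Bigl[\frac{q^2}{16}\,\frac{|\phi(z)|^2|\partial\phi(z)|^2(|\phi(z)|^2+\epsilon)^{\frac{q}{2}-2}}{(1+|z|^2)^{qj}} - \frac{q^2 j}{4}\,\frac{\Re[\overline{\phi(z)}\,z\,\partial\phi(z)]\,(|\phi(z)|^2+\epsilon)^{\frac{q}{2}-1}}{(1+|z|^2)^{qj+1}} + \frac{q^2j^2}{4}\,\frac{|z|^2(|\phi(z)|^2+\epsilon)^{q/2}}{(1+|z|^2)^{qj+2}}\Bigr]\frac{d^2z}{\pi} \tag{4.14} \]
\[ = \int_{\mathbb{C}} \Bigl[\frac{1}{(1+|z|^2)^{qj}}\Bigl(\frac14\,\partial\bar\partial(|\phi(z)|^2+\epsilon)^{q/2} - \frac{q\epsilon}{8}(|\phi(z)|^2+\epsilon)^{\frac{q}{2}-2}|\partial\phi(z)|^2\Bigr) - \frac{qj}{2}\,\frac{\Re[z\,\partial(|\phi(z)|^2+\epsilon)^{q/2}]}{(1+|z|^2)^{qj+1}} + \frac{q^2j^2}{4}\,\frac{|z|^2(|\phi(z)|^2+\epsilon)^{q/2}}{(1+|z|^2)^{qj+2}}\Bigr]\frac{d^2z}{\pi} \tag{4.15} \]
\[ = \int_{\mathbb{C}} (|\phi(z)|^2+\epsilon)^{q/2}\Bigl[\frac14\,\partial\bar\partial\frac{1}{(1+|z|^2)^{qj}} - \frac{q\epsilon}{8}\,\frac{|\partial\phi(z)|^2}{(|\phi(z)|^2+\epsilon)^2}\,\frac{1}{(1+|z|^2)^{qj}} + \frac{qj}{2}\,\partial\frac{z}{(1+|z|^2)^{qj+1}} + \frac{q^2j^2}{4}\,\frac{|z|^2}{(1+|z|^2)^{qj+2}}\Bigr]\frac{d^2z}{\pi} \tag{4.16} \]
\[ = \frac{qj}{4}\int_{\mathbb{C}} \frac{(|\phi(z)|^2+\epsilon)^{q/2}}{(1+|z|^2)^{qj+2}}\,\frac{d^2z}{\pi} - \frac{q\epsilon}{8}\int_{\mathbb{C}} \frac{(|\phi(z)|^2+\epsilon)^{\frac{q}{2}-2}|\partial\phi(z)|^2}{(1+|z|^2)^{qj}}\,\frac{d^2z}{\pi}. \tag{4.17} \]
There are no boundary terms in the integration by parts because the growth of the powers of $|\phi|$ and of their derivatives is suppressed by sufficiently strong polynomial growth in the denominator.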
In the limit $\epsilon \to 0$, the error term vanishes by dominated convergence, yielding identity (4.11). Dominated convergence applies because either $q \ge 4$ and $(|\phi(z)|^2+\epsilon)^{\frac{q}{2}-2} \le C(1+|z|^2)^{j(q-4)}$ with a fixed constant $C > 0$ valid for all sufficiently small $\epsilon$, or $0 < q < 4$ and we estimate on the set $\{z : |\phi(z)|^2 < \epsilon\}$ the expression $\epsilon(|\phi(z)|^2+\epsilon)^{\frac{q}{2}-2} \le C_q\,|\phi(z)|^{q-2}$ with an explicit constant $C_q > 0$ that depends only on $q$. To ensure the validity of dominated convergence, one may then recall that $\phi$ has only isolated zeros of finite order.
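The Carlen identity (4.11) can be checked numerically in the simplest case. The sketch below assumes that the coherent vector centered at the origin corresponds to the constant polynomial $\phi \equiv 1$, so that $|f|^{q/2}$ is the radial function $(1+|z|^2)^{-qj/2}$ and both sides of (4.11) reduce to one-dimensional integrals in $t = |z|^2$ (with $d^2z/\pi = dt$). The helper names and the midpoint quadrature are illustrative choices, not taken from the paper.

```python
def integrate_half_line(g, n=100_000):
    """Midpoint rule for the integral of g over [0, inf), using t = x/(1-x)."""
    total = 0.0
    for k in range(n):
        x = (k + 0.5) / n
        t = x / (1.0 - x)
        total += g(t) / (1.0 - x) ** 2  # Jacobian dt/dx = 1/(1-x)^2
    return total / n


def carlen_lhs(q, j):
    """Integral of |d u|^2 d^2z/pi for u = (1+|z|^2)^(-qj/2).

    For a real radial u, |d u|^2 = (u'(r))^2 / 4 = m^2 t (1+t)^(-2m-2), m = qj/2.
    """
    m = q * j / 2.0
    return integrate_half_line(lambda t: m * m * t * (1.0 + t) ** (-2.0 * m - 2.0))


def carlen_rhs(q, j):
    """(qj/4) times the integral of (1+|z|^2)^(-qj-2) d^2z/pi."""
    m = q * j / 2.0
    return (q * j / 4.0) * integrate_half_line(lambda t: (1.0 + t) ** (-2.0 * m - 2.0))


def carlen_closed_form(q, j):
    """Common value of both sides for this test function: qj / (4 (qj + 1))."""
    return q * j / (4.0 * (q * j + 1.0))
```

For instance, `carlen_lhs(2, 1)` and `carlen_rhs(2, 1)` both approximate `carlen_closed_form(2, 1)`, which equals 1/6. This only exercises the identity on one test function and is not a substitute for the proof.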
Remarks 4.6. A version of the identity (4.11) was first shown by Carlen [Car91] for the function space related to Glauber coherent states. One may recover his result by the usual group contraction procedure in the limit $j \to \infty$ while scaling $z \mapsto z/\sqrt{2j\hbar}$ with $\hbar > 0$. The case $q = 2$ can be verified in a simpler way than the explicit calculation given here. All that is needed is to verify that the square of the gradient norm in (4.11) defines a quadratic form on $F_j$. Appealing to Schur's lemma, the corresponding operator is a multiple of the identity, because the quadratic form is invariant under the irreducible unitary representation of SU(2). To obtain the correct numerical factor, one may then simply choose $f$ to be the coherent vector centered at the origin and evaluate the left-hand side of Eq. (4.11).
Lemma 4.7. The supremum of the expression (4.4) is attained for some function $u$. In other words, there is some $u$ with $\|u\|_s = 1$, $1 < s < 2$, satisfying $0 \le u$ and $u(z) \to 0$ as $z \to \infty$, which maximizes
\[ \|u\|_2^2 - \frac{4}{qj(qj+2)} \int_{\mathbb{C}} |\partial u|^2 \frac{d^2z}{\pi}. \tag{4.18} \]
Proof. We abbreviate $s := 2p/q$ and denote the dual index as $s' := s/(s-1)$. The Sobolev inequality on the two-sphere states
\[ \|u\|_2^2 + \frac{s'-2}{2}\,\|\nabla u\|_2^2 \ge \|u\|_{s'}^2. \tag{4.19} \]
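Note that for real-valued $u$, in two dimensions the two expressions $\|\nabla u\|_2^2$ and $\int_{\mathbb{C}} |\partial u|^2\,d^2z/\pi$ are identical, one using the gradient $\nabla$ on the sphere and the other the complex derivative $\partial = (\partial_1 - i\partial_2)/2$. Therefore, employing the Sobolev and Hölder inequalities yields for $\|u\|_s = 1$ a finite bound, in which $r$ denotes the dual index $s/(s-1)$ of $s$:
\[ \|u\|_2^2 - \frac{4}{qj(qj+2)} \int_{\mathbb{C}} |\partial u|^2 \frac{d^2z}{\pi} \le \Bigl(1 + \frac{2}{r-2}\,\frac{4}{qj(qj+2)}\Bigr)\|u\|_s\,\|u\|_r - \frac{2}{r-2}\,\frac{4}{qj(qj+2)}\,\|u\|_r^2 \tag{4.20} \]
\[ \le \max_{x \ge 0}\Bigl[\Bigl(1 + \frac{2}{r-2}\,\frac{4}{qj(qj+2)}\Bigr)x - \frac{2}{r-2}\,\frac{4}{qj(qj+2)}\,x^2\Bigr]. \tag{4.21} \]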
Now assume a maximizing sequence $(u_n)_{n \in \mathbb{N}}$ such that $0 \le u_n$ and $\|u_n\|_s = 1$ for all $n \in \mathbb{N}$. Apart from the finiteness of the supremum over $s$-normalized $u$ in (4.20), inequality (4.21) shows that the sequence $(x_n)_{n \in \mathbb{N}}$ given by the norms $x_n$ of $u_n$ with respect to the dual index of $s$ is bounded, and because of the compactness of the sphere, the same bound applies to the 2-norms of
the sequence $(u_n)_{n \in \mathbb{N}}$. This, in turn, forces any maximizing sequence to be uniformly bounded in gradient 2-norm. We may now without loss of generality replace each $u_n$ by its equimeasurable symmetric decreasing rearrangement, since this only decreases the gradient norm by the contractivity properties of the heat semigroup on the sphere, just as in the Euclidean case [Lie83, Lemma 4.1]. In addition, we can use Helly's selection principle, that is, choose a subsequence that converges on all rational radii. By the monotonicity of each $u_n$, the convergence extends to almost every radius. Appealing to a Rellich-Kondrakov compact embedding theorem [Aub82, Theorem 2.34], the limit $u_n \to u$ is in the $s$-norm. The compact embedding we use is that of the Sobolev space $H_1^1(S^2)$ in $L^s(S^2)$.
Lemma 4.8. Given $s > 1$, every maximizer $u \in L^s(S^2)$ in the preceding lemma is a smooth function.
Proof. From calculus of variations, any maximizer $u \in L^s(S^2)$ is a distributional solution of the corresponding Euler-Lagrange equation that has the form of a Poisson equation,
\[ \Delta u = a u^{s-1} - b u \tag{4.22} \]
with constants $a, b > 0$. By the integrability of the Green function $G$ on the sphere, in complex coordinates given as
\[ G(z, w) = -\ln\frac{4|z-w|^2}{(1+|z|^2)(1+|w|^2)}, \tag{4.23} \]
and the identity
\[ u(w) = \frac{1}{\pi}\int_{\mathbb{C}} G(w, z)\bigl(a u^{s-1}(z) - b u(z)\bigr)\frac{d^2z}{(1+|z|^2)^2} \tag{4.24} \]
we see that $u$ is bounded and continuous. In fact, its first derivative is seen to be Hölder continuous with any index $0 < \alpha < 1$ from replacing $u$ with a directional derivative and calculating the magnitude of the gradient of the Green function
\[ |\nabla G(z, w)| = \Bigl(\frac{(1+|z|^2)(1+|w|^2)}{|z-w|^2} - 1\Bigr)^{1/2}. \tag{4.25} \]
By bootstrapping, $u$ is smooth since its $k$th derivative is Hölder continuous with index $0 < \alpha_k < (s-1)^{k-1}$.
Lemma 4.9. Given $q > p > q/2$, there is a unique rotationally symmetric non-trivial solution $u$ to the Euler-Lagrange equation (4.5) that meets the non-negativity and limit requirements $u \ge 0$ and $\lim_{z \to \infty} u(z) = 0$.
Proof. We know from the preceding lemma that $u$ is differentiable, even smooth. Therefore, a rotationally symmetric $u$ considered as a function of $r = |z|$ solves the ordinary differential equation
\[ \frac{1}{r}(1+r^2)^2\,(r u'(r))' - a\,(u(r))^{\frac{2p}{q}-1} + b\,u(r) = 0 \tag{4.26} \]
with an initial condition $u'(0) = 0$ and an unknown value $u(0) > 0$. We will abbreviate the non-linearity as $\phi(u) := -b u^{(2p-q)/q} + u$. Despite the singularity at $r = 0$, the initial values $u(0)$ and $u'(0) = 0$ uniquely determine a solution on all $\{r \ge 0\}$. This follows with the help of the Schauder fixed point theorem as in [FLS96, Appendix] by replacing the expression for $u'(r)$ in the Euclidean case treated there with (4.27) given hereafter. To show the claimed uniqueness of a solution meeting the non-negativity and limiting requirements, we first show that any such solution is strictly radially decreasing. Then, we exclude the cases of multiple solutions having infinitely many, finitely many, and finally having no intersections. In the literature, such arguments have been called separation lemmas. We adapt elements of [PS83, PS86, FLS96, ST00] to the setting of the sphere. To begin with, we show that $u$ is strictly decreasing as a function of $r$. Integrating the ODE (4.26) gives
\[ u'(r) = -\frac{1}{r}\int_0^r \frac{t}{(1+t^2)^2}\,\phi(u(t))\,dt. \tag{4.27} \]
If there is some critical point $r_0 > 0$ for $u$, then we could conclude that both sides of this equation vanish and thus, inserting (4.27) simplifies (4.26) to
\[ u''(r_0) = -\frac{1}{(1+r_0^2)^2}\,\phi(u(r_0)) = 0. \tag{4.28} \]
From the uniqueness of solutions to initial value problems having Lipschitz-continuous coefficients we conclude that $u$ is necessarily the non-zero constant determined by $\phi(u) = 0$, contradicting the limiting requirement in our assumption. Therefore, $u$ cannot have any critical points for $r > 0$. Now we exclude the three possible cases of multiple solutions.
1. Two solutions with infinitely many intersections. Assuming there are two solutions $u$ and $v$ that intersect infinitely often, then there are radii $0 < r_0 < r_1$ such that (a) $u(r_1) = v(r_1)$, $u'(r_0) = v'(r_0)$, and $0 > v' > u'$ on $(r_0, r_1)$, (b) $\phi(u(r)) < 0$ and $\phi(v(r)) < 0$ for all $r \ge r_0$. Define $\Phi(u) := \int_0^u \phi(x)\,dx$. By comparison with energy dissipation in a mechanical system, we have the "conservation law", here in terms of the solution $u$,
\[ \Bigl[\frac12 (1+r^2)^2 (u'(r))^2 + \Phi(u(r))\Bigr]_{r_0}^{r_1} + \int_{r_0}^{r_1} \Bigl(\frac{(1+r^2)^2}{r} - r(1+r^2)\Bigr)(u'(r))^2\,dr = 0. \tag{4.29} \]
Subtracting this identity for the two solutions $u$ and $v$, we obtain
\[ \frac12 (1+r_1^2)^2\bigl((u'(r_1))^2 - (v'(r_1))^2\bigr) + \Phi(v(r_0)) - \Phi(u(r_0)) + \int_{r_0}^{r_1} \Bigl(\frac{(1+r^2)^2}{r} - r(1+r^2)\Bigr)\bigl((u'(r))^2 - (v'(r))^2\bigr)\,dr = 0 \tag{4.30} \]
but the left-hand side is strictly positive in each difference term, so this yields a contradiction.
2. Two solutions having finitely many intersections. For this part, we choose the parametrization in terms of the azimuthal angle $\theta$, $r = \tan(\theta/2)$. In this variable, the ordinary differential equation (4.26) is expressed as
\[ u''(\theta) + \cot\theta\,u'(\theta) + \phi(u(\theta)) = 0. \tag{4.31} \]
From the two solutions $u$ and $v$ let us pass to the inverse functions denoted as $\theta = u^{-1}$ and $\sigma = v^{-1}$. If there are finitely many intersections, then there is $u_0 > 0$ such that $\theta > \sigma > \pi/2$ and $0 > \theta' > \sigma'$ on $(0, u_0)$. Define the difference of "kinetic energy" as
\[ B(u) = \frac{1}{(\theta'(u))^2} - \frac{1}{(\sigma'(u))^2} > 0, \tag{4.32} \]
then $B(0) = 0$. However, according to the ODE (4.31), the derivative
\[ B'(u) = 2\Bigl(\frac{\cot\sigma}{\sigma'} - \frac{\cot\theta}{\theta'}\Bigr) \tag{4.33} \]
is negative, because $|\cot\theta| > |\cot\sigma|$ and $0 > \theta' > \sigma'$. We have reached a contradiction with the strict positivity of $B$.
3. Two solutions without intersection. As a first step to exclude this case, we will show that the difference $\theta - \sigma$ of the inverse functions of $u$ and $v$ can have at most one critical point, a local maximum, in an interval where $\theta > \sigma$. To this end, we note that in terms of the inverse function $\theta$ of $u$, the ODE (4.31) becomes
\[ -\theta'' + \cot\theta\,(\theta')^2 + \phi(u)\,(\theta')^3 = 0. \tag{4.34} \]
Subtracting this identity for the two solutions at a point where $\theta' = \sigma'$ and $\theta > \sigma$ gives
\[ \theta'' - \sigma'' = (\cot\theta - \cot\sigma)(\theta')^2 < 0 \tag{4.35} \]
by the monotonicity of the cotangent function. Consequently, assuming $u > v$, then the difference $\theta - \sigma$ cannot have any critical point in $(0, v(0))$, because this would contradict $\theta' - \sigma' \to \infty$ as $u \to v(0)$. So $\theta' - \sigma' > 0$ in $(0, v(0))$. But similarly as in the preceding case of finitely many intersections, this is impossible near $u = 0$, because in terms of the azimuthal angle we would then have $B(0) = 0$, $B'(u) < 0$ for small $u$.
5. Conclusion The lack of a sharp entropy bound remains frustrating. Many promising attempts have failed. Among others, this includes the hope that group-theoretic, explicit computations as in [Sch99] may give rise to an inductive argument; the use of the Hardy-Littlewood-Sobolev inequality on the sphere that may be found in so many uncertainty principles [Bec93, Bec95, Bec01]; or via a change of weight analogous to [Luo00] the attempt to combine variants of the Carlen identity with a logarithmic Sobolev-type inequality on the sphere that breaks down because of a domain problem [Gro99, Sect. 5]. At this point the best bet may still be the difficult task of finding sharp smoothing properties of the Berezin transform.
For lack of a manageable alternative, we replaced the Berezin transform with the differential operator $1 + \frac{4}{qj(qj+2)}\Delta$, which should intuitively be a good approximation for high spin quantum numbers. This particular choice of approximation is motivated by considering optimization problems of the form (4.4) and by demanding that coherent vectors solve the corresponding Euler-Lagrange equation. The downside of replacing the Berezin transform is that we cannot characterize the cases of equality in Theorem 4.1 because there is no strong form of the Riesz theorem for the gradient norm under radially symmetric decreasing rearrangements. We cannot resolve
this deficiency by invoking a duality principle as in [Bec01, Theorem 6], because the operator $1 + \frac{4}{qj(qj+2)}\Delta$ is not invertible. One may expect that an adaptation of the techniques employed here yields analogous bounds for the Wehrl entropy in the setting of SU(1,1) as well, see [Luo97]. It could be worthwhile to include other highest-weight representations [GŻ01]. However, the most important task is still to find the road that will finally lead to the sharp bounds. Most likely it will not be a simple group-theoretic argument, but all of the partial results so far point to the very geometric nature and inherent beauty of the problem. Discovering these facets of entropy makes the difficulties in resolving Lieb's conjecture much less frustrating and all the more fascinating.
Acknowledgement. Thanks go to Elliott Lieb for remarks that were always to the point and for suggesting this incredibly beautiful and challenging open problem to me. I am also grateful for helpful and motivating conversations with Shannon Starr, Michael Aizenman, Robert Seiringer, Almut Burchard, Stephen B. Sontz and Bob Sims. The referee is acknowledged for pointing out analogies between the material presented here and classical results for the Rényi entropy on discrete measure spaces. This work was partially supported under the National Science Foundation grant PHY-0139984.
References
[ACGT72] Arecchi, F.T., Courtens, E., Gilmore, R., Thomas, H.: Atomic coherent states in quantum optics. Phys. Rev. A 6, 2211–2237 (1972)
[Aub82] Aubin, Th.: Nonlinear analysis on manifolds. Monge-Ampère equations. Grundlehren der Mathematischen Wissenschaften, Vol. 252, New York: Springer-Verlag, 1982
[Bar61] Bargmann, V.: On a Hilbert space of analytic functions and an associated integral transform, Part I. Comm. Pure Appl. Math. 14, 187–214 (1961)
[Bec93] Beckner, W.: Sharp Sobolev inequalities on the sphere and the Moser-Trudinger inequality. Ann. of Math. (2) 138(1), 213–242 (1993)
[Bec95] Beckner, W.: Pitt's inequality and the uncertainty principle. Proc. Am. Math. Soc. 123(6), 1897–1905 (1995)
[Bec01] Beckner, W.: On the Grushin operator and hyperbolic symmetry. Proc. Am. Math. Soc. 129(4), 1233–1246 (2001)
[Ber74] Berezin, F.A.: Quantization. Math. USSR Izv. 8, 1109–1165 (1974); Russ. orig.: Izv. Akad. Nauk SSSR, Ser. Mat. 38, 1116–1175 (1974)
[BLW99] Bodmann, B., Leschke, H., Warzel, S.: A rigorous path integral for quantum spin using flat-space Wiener regularization. J. Math. Phys. 40, 2549–2559 (1999)
[BS93] Beck, Ch., Schlögl, F.: Thermodynamics of chaotic systems. Cambridge Nonlinear Science Series, Vol. 4, Cambridge: Cambridge University Press, 1993
[Bur83] Burbea, J.: Inequalities for holomorphic functions of several complex variables. Trans. Am. Math. Soc. 276(1), 247–266 (1983)
[Bur87] Burbea, J.: Sharp inequalities for holomorphic functions. Ill. J. Math. 31(2), 248–264 (1987)
[Car91] Carlen, E.A.: Some integral identities and inequalities for entire functions and their application to the coherent state transform. J. Funct. Anal. 97(1), 231–249 (1991)
[FLS96] Franchi, B., Lanconelli, E., Serrin, J.: Existence and uniqueness of nonnegative solutions of quasilinear equations in R^n. Adv. Math. 118(2), 177–243 (1996)
[Gro75] Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97(4), 1061–1083 (1975)
[Gro99] Gross, L.: Hypercontractivity over complex manifolds. Acta Math. 182(2), 159–206 (1999)
[GŻ01] Gnutzmann, S., Życzkowski, K.: Rényi-Wehrl entropies as measures of localization in phase space. J. Phys. A 34(47), 10123–10139 (2001)
[Kla59] Klauder, J.R.: The action option and a Feynman quantization of spinor fields in terms of ordinary c-numbers. Ph.D. thesis, Princeton, 1959
[Lee88] Lee, C.T.: Wehrl's entropy of spin states and Lieb's conjecture. J. Phys. A 21(19), 3749–3761 (1988)
[Lie78] Lieb, E.H.: Proof of an entropy conjecture of Wehrl. Commun. Math. Phys. 62(1), 35–41 (1978)
[Lie83] Lieb, E.H.: Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities. Ann. of Math. (2) 118(2), 349–374 (1983)
[Luo97] Luo, S.: A harmonic oscillator on the Poincaré disc and hypercontractivity. J. Phys. A 30(14), 5133–5139 (1997)
[Luo00] Luo, S.: A simple proof of Wehrl's conjecture on entropy. J. Phys. A 33(16), 3093–3096 (2000)
[MW82] Mueller, C.E., Weissler, F.B.: Hypercontractivity for the heat semigroup for ultraspherical polynomials and on the n-sphere. J. Funct. Anal. 48(2), 252–283 (1982)
[Per86] Perelomov, A.: Generalized coherent states and their applications. Texts and Monographs in Physics. Springer-Verlag, Berlin, 1986
[PS83] Peletier, L.A., Serrin, J.: Uniqueness of positive solutions of semilinear equations in R^n. Arch. Rat. Mech. Anal. 81(2), 181–197 (1983)
[PS86] Peletier, L.A., Serrin, J.: Uniqueness of nonnegative solutions of semilinear equations in R^n. J. Differ. Eq. 61(3), 380–397 (1986)
[Sai80] Saitoh, S.: Some inequalities for entire functions. Proc. Am. Math. Soc. 80(2), 254–258 (1980)
[Sch99] Schupp, P.: On Lieb's conjecture for the Wehrl entropy of Bloch coherent states. Commun. Math. Phys. 207(2), 481–493 (1999)
[ST00] Serrin, J., Tang, M.: Uniqueness of ground states for quasilinear elliptic equations. Indiana Univ. Math. J. 49(3), 897–923 (2000)
[Sug02] Sugita, A.: Proof of the generalized Lieb-Wehrl conjecture for integer indices larger than one. J. Phys. A 35(42), L621–L626 (2002)
[Weh79] Wehrl, A.: On the relation between classical and quantum-mechanical entropy. Rep. Math. Phys. 16(3), 353–358 (1979)
[Weh91] Wehrl, A.: The many facets of entropy. Rep. Math. Phys. 30(1), 119–129 (1991)
[Życ03] Życzkowski, K.: Rényi extrapolation of Shannon entropy. Open Syst. Inf. Dyn. 10(3), 297–310 (2003)
Communicated by M.B. Ruskai
Commun. Math. Phys. 250, 301–334 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1147-y
Communications in
Mathematical Physics
Glauber Dynamics on Trees: Boundary Conditions and Mixing Time
Fabio Martinelli¹, Alistair Sinclair², Dror Weitz²
¹ Department of Mathematics, University of Roma Tre, Largo San Murialdo 1, 00146 Roma, Italy. E-mail: [email protected]
² Computer Science Division, University of California, Berkeley, CA 94720-1776, USA. E-mail: {sinclair, dror}@cs.berkeley.edu
Received: 3 August 2003 / Accepted: 5 March 2004 Published online: 12 August 2004 – © Springer-Verlag 2004
Abstract: We give the first comprehensive analysis of the effect of boundary conditions on the mixing time of the Glauber dynamics in the so-called Bethe approximation. Specifically, we show that the spectral gap and the log-Sobolev constant of the Glauber dynamics for the Ising model on an n-vertex regular tree with (+)-boundary are bounded below by a constant independent of n at all temperatures and all external fields. This implies that the mixing time is O(log n) (in contrast to the free boundary case, where it is not bounded by any fixed polynomial at low temperatures). In addition, our methods yield simpler proofs and stronger results for the spectral gap and log-Sobolev constant in the regime where the mixing time is insensitive to the boundary condition. Our techniques also apply to a much wider class of models, including those with hard-core constraints like the antiferromagnetic Potts model at zero temperature (proper colorings) and the hard–core lattice gas (independent sets).
1. Introduction In this paper we will analyze the influence of boundary conditions on the Glauber dynamics for discrete spin models on a regular rooted tree. Although in what follows we will focus for simplicity on the well-known Ising model, our techniques also apply to other models, not necessarily ferromagnetic and with hard-core constraints.
An extended abstract of this paper appeared under the title "The Ising model on trees: Boundary conditions and mixing time" in Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, October 2003, pp. 628–639.
This work was done while this author was visiting the Departments of EECS and Statistics, University of California, Berkeley, supported in part by a Miller Visiting Professorship.
Supported in part by NSF Grant CCR-0121555 and DARPA cooperative agreement F30602-00-20601.
Supported in part by NSF Grant CCR-0121555.
In the Ising model on a finite graph G = (V, E), a configuration σ = (σ_x) consists of an assignment of ±1-values, or "spins", to each vertex (or "site") of V. The probability of finding the system in configuration σ ∈ {±1}^V ≡ Ω_G is given by the Gibbs distribution
\[ \mu_G(\sigma) \propto \exp\Bigl[\beta \sum_{xy \in E} \sigma_x \sigma_y + \beta h \sum_{x \in V} \sigma_x\Bigr], \tag{1} \]
where β ≥ 0 is the inverse temperature and h the external field. Boundary conditions can also be taken into account by fixing the spin values at some specified "boundary" vertices of G; the term free boundary is used to indicate that no boundary condition is specified. In the classical Ising model, G = G_n is a cube of side n^{1/d} in the d-dimensional Cartesian lattice Z^d, and in this case the phase diagram in the thermodynamic limit G_n ↑ Z^d is quite well understood (see, e.g., [16, 39] for more background). While the classical theory focused on static properties of the Gibbs measure, in the last decade the emphasis has shifted towards dynamical questions with a computational flavor. The key object here is the Glauber dynamics, a (discrete- or continuous-time) Markov chain on the set of spin configurations Ω_G in which each spin σ_x flips its value with a rate that depends on the current configuration of the neighboring spins of x, and which satisfies the detailed balance condition w.r.t. the Gibbs measure μ_G (see Sect. 2 for more details). The Glauber dynamics is much studied for two reasons: firstly, it is the basis of Markov chain Monte Carlo algorithms, widely used in computational physics for sampling from the Gibbs distribution; and secondly, it is a plausible model for the actual evolution of the underlying physical system towards equilibrium. In both contexts, one of the central questions is to determine the mixing time, i.e., the time until the dynamics is close to its stationary distribution. As is well known (see, e.g., [37]), the approach to stationarity of a reversible Markov chain with Markov generator L and reversible measure π can be successfully studied by analyzing two key quantities: the spectral gap and the logarithmic Sobolev constant of the pair (L, π)¹.
The first of these measures the rate of exponential decay as t → ∞ of the variance Var_π(e^{tL} f) computed with respect to the invariant measure π, while the second measures instead the rate of decay of the relative entropy of e^{tL} f w.r.t. π (see, e.g., [1]). Advances in statistical physics over the past decade have led to remarkable connections between these two quantities and the occurrence of a phase transition (see, e.g., [41, 31, 30, 9, 29, 27]). As an example, on finite n-vertex squares with free boundary in the 2-dimensional lattice Z^2, when h = 0 and β is smaller than the critical value β_c, the spectral gap and the logarithmic Sobolev constant are Ω(1) (i.e., bounded away from zero uniformly in n), while for β > β_c they are both exponentially small in √n. One of the most interesting and difficult questions left open by the above and related results is the influence of boundary conditions on the spectral gap and the log-Sobolev constant when h = 0 and β > β_c. It has been conjectured that, in the presence of an all-(+) boundary, the relaxation process is driven by the mean-curvature motion of interfaces separating droplets of the (−)-phase inside the (+)-phase, and therefore the mixing time should be polynomial in n (most likely n^{2/d} log n) [7, 15]. In particular it has been argued that the spectral gap for the pure phases in high enough dimension should be Ω(1). Proving results of this kind has proved very elusive, and the only (presumably sharp) available bounds are upper bounds on the spectral gap and the logarithmic Sobolev constant [7]. In this paper we prove a strong version of the above conjecture in what is known in statistical physics as the Bethe approximation, namely when the lattice Z^d is replaced by a regular tree. Among other results, we show that the spectral gap of the Glauber dynamics for the Ising model on a tree with a (+)-boundary condition on its leaves is Ω(1) at all temperatures and all values of the external field, and further that the same holds for the logarithmic Sobolev constant. Notice that, with a free boundary, β large and h = 0, both quantities tend to zero as 1/n^a and the exponent a grows arbitrarily large as β → ∞ [3]. Ours is apparently the first result that quantifies the effect of boundary conditions on the Glauber dynamics in an interesting scenario. We stress that, while the tree is simpler in many respects than Z^d due to the lack of cycles, in other respects it is more complex due to the large boundary: e.g., it exhibits a "double phase transition," and the critical field at low temperature is non-zero (see below). In the next subsection, we briefly describe the Ising model on trees before stating our results in more detail.
¹ Unfortunately the definition of the logarithmic Sobolev constant is not constant in the literature! The ambiguity arises because there are two definitions, one the inverse of the other. The definition used in this paper is the one that puts the logarithmic Sobolev constant and the spectral gap on the same footing; see Eq. (7) in Sect. 2.2.
1.1. The Ising model on trees. Fix b ≥ 2 and let T^b denote the infinite b-ary tree. The Ising model on T^b is known [16, 25] to have a phase diagram in the (h, β) plane quite different from that on the cubic lattice Z^d (see Fig. 1), and has recently received a lot of attention as the canonical example of a statistical physics model on a "non-amenable" graph (i.e., one whose boundary is of comparable size to its volume); see, e.g., [6, 20, 14, 38, 23, 3, 5]. Let us first discuss the behavior on the line h = 0. There is a first critical value $\beta_0 = \frac12 \log\frac{b+1}{b-1}$, marking the dividing line between uniqueness and non-uniqueness of the Gibbs measure. Then, in sharp contrast to the model on Z^d, there is a second critical point $\beta_1 = \frac12 \log\frac{\sqrt{b}+1}{\sqrt{b}-1}$ which is often referred to as the "spin-glass critical point" [10]. This second critical point is such that, in the "intermediate temperature" region β_0 < β ≤ β_1, the (+)- and (−)-boundary conditions exert arbitrarily long-range influence on the spin at the root of the tree and hence give rise to different Gibbs measures, but "typical" boundary conditions (i.e., chosen from the infinite-volume Gibbs measure with free boundary) do not. Another way to phrase this peculiar behavior is that the Gibbs measure constructed via a free boundary is extremal for all β ≤ β_1, and non-extremal for β > β_1 (see [6, 20, 21, 3], and also [14, 34, 35] for an analysis in the context of "bit reconstruction problems" for noisy data transmission). Let us now examine what happens when an external field h is added to the system. It turns out that for all β > β_0, there is a critical value h = h_c(β) > 0 of the field such that the Gibbs measure is not unique when |h| ≤ h_c, and is unique when |h| > h_c. (When β ≤ β_0 the Gibbs measure is unique for all h, and h_c is defined to be zero.) In the presence of a (+)-boundary, the Ising model on the tree with external field h = −h_c is rather analogous to the classical case of Z^d with zero field.
Both models share the following two properties: firstly, the Gibbs measure is sensitive to the choice of boundary condition, and secondly, adding an arbitrarily small negative field causes the Gibbs measure to become insensitive to the boundary condition (i.e., unique in the thermodynamic limit).
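The two critical values on the line h = 0 are simple closed-form functions of the branching factor b. The following Python sketch evaluates them, reading the formulas as β_0 = (1/2) log((b+1)/(b−1)) and β_1 = (1/2) log((√b+1)/(√b−1)); the function names are illustrative and not from the paper.

```python
import math


def beta_uniqueness(b):
    """First critical value beta_0 = (1/2) log((b+1)/(b-1)) for the b-ary tree."""
    return 0.5 * math.log((b + 1.0) / (b - 1.0))


def beta_spin_glass(b):
    """Second critical value beta_1 = (1/2) log((sqrt(b)+1)/(sqrt(b)-1))."""
    root_b = math.sqrt(b)
    return 0.5 * math.log((root_b + 1.0) / (root_b - 1.0))


if __name__ == "__main__":
    for b in (2, 3, 4, 10):
        print(b, beta_uniqueness(b), beta_spin_glass(b))
```

Equivalently, tanh(β_0) = 1/b and tanh(β_1) = 1/√b, so β_0 < β_1 for every b ≥ 2 and the intermediate-temperature region is never empty.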
[Figure 1: phase diagram with axes h (ticks at −(b−1) and b−1) and T = 1/β (ticks at 1/β_0 and 1/β_1).]
Fig. 1. The critical field h_c(β). The Gibbs measure is unique above the curve
Finally we remark that the concentration properties of the Gibbs measure for β > β_0, h ≥ −h_c and (+)-boundary are very different from those on Z^d. In the latter case, along the line of first order phase transition, the (negative) large deviations for the bulk magnetization are related to the appearance of a Wulff droplet of the opposite phase and are depressed by a negative exponential in the surface area of the droplet (see, e.g., [11]). In the tree, on the other hand, for any value of (β, h) they are always depressed by a negative exponential in the volume of the excess negative spins (the phenomenon of "rigidity of the critical phases" [5]). The Glauber dynamics for the Ising model on trees has also been studied. In a recent paper [3], it is shown that the associated spectral gap with zero external field and free boundary on a complete b-ary tree T with n vertices is Ω(1) at high and intermediate temperatures (i.e., when β < β_1)². Moreover, at the critical point β = β_1 the same spectral gap is bounded above by c/log n, and as soon as β > β_1 it becomes smaller than c/n^{a(β)}, with a(β) ↑ ∞ as β → ∞. Thus the critical point β = β_1 is reflected in the dynamics by an abrupt jump in the behavior of the spectral gap as a function of the size of the tree T. Finally, also in [3], it is proved that the spectral gap for arbitrary fixed β, h and boundary condition can never shrink to zero faster than an inverse polynomial in n. Again such a result should be compared to the lattice case where it is known that the spectral gap for a cube with n sites can be exponentially small in the surface area n^{(d−1)/d}.
1.2. Main results and techniques. Our first main result is a detailed analysis of the spectral gap of the Glauber dynamics in different regions of the phase diagram. The main novelty here is that we are able for the first time to prove a sharp result in the region where the spectral gap is highly sensitive to the boundary condition.
Theorem 1.1. In both of the following situations, the spectral gap of the Glauber dynamics on a complete b-ary tree T with n vertices is Ω(1): (i) the boundary condition is arbitrary, and either β < β_1 (with h arbitrary), or |h| > h_c(β) (with β arbitrary); (ii) the boundary condition is (+) and β, h are arbitrary.
² Actually the arguments in [3] prove that the gap is Ω(1) for any β < β_1, arbitrary boundary condition and any external field. Their argument, together with some monotonicity properties specific to the Ising model [36], implies a mixing time of O(log n). Thus, although for β_0 < β < β_1 there exist several Gibbs measures, the mixing time of the Glauber dynamics is insensitive to the boundary condition.
Remark. On $\mathbb{Z}^d$ not much is known about the spectral gap when $\beta > \beta_c$, $h = 0$ and the boundary condition is (+), the notable exception being that of $\mathbb{Z}^2$ where it has recently been proved [7] that the spectral gap in a square with $n$ sites shrinks to zero at least as fast as $1/\sqrt{n}$. The best known lower bounds are significantly weaker [29]. In high enough dimensions ($d \ge 3$) it has been conjectured (see [15] and [7]) that the spectral gap should stay bounded away from zero uniformly in $n$. The above theorem can be looked upon as evidence in favor of this conjecture.

In our second main result we extend our analysis to the more delicate and difficult logarithmic Sobolev constant.

Theorem 1.2. In the same situations as in Theorem 1.1, the logarithmic Sobolev constant of the Glauber dynamics on a complete $b$-ary tree $T$ with $n$ vertices is $\Omega(1)$.

As a corollary we obtain that, in the situations of Theorems 1.1 and 1.2, the Glauber dynamics mixes (in a very strong sense) in time $O(\log n)$.

Remarks. (i) In $\mathbb{Z}^d$ with (+)-boundary condition, $\beta$ large and zero external field the logarithmic Sobolev constant in a cube with $n$ sites is always smaller than $n^{-2/d}$, apart from logarithmic corrections [7], in agreement with heuristic predictions based on the mean-curvature motion of phase interfaces.
(ii) We also prove an additional result (see Theorem 5.7) which shows that, for an arbitrary nearest-neighbor spin system on a tree, as soon as the spectral gap is $\Omega(1)$ then the logarithmic Sobolev constant cannot shrink faster than $(c \log n)^{-1}$. This means that, even when a constant lower bound is known for the gap but not for log-Sobolev, one can deduce a mixing time of $O((\log n)^2)$. While we do not require this fact to derive the results of this paper, we believe it may be of interest for other models on trees.
In order to better appreciate Theorem 1.2, one should keep in mind that for general finite-range, translation-invariant, compact spin models on $\mathbb{Z}^d$, if there exists an infinite-volume Gibbs measure $\mu$ with a positive logarithmic Sobolev constant, then the system is necessarily in the uniqueness region and $\mu$ has exponentially decaying correlations [42].³ We also recall (see, e.g., [26]) that when the log-Sobolev constant is bounded away from zero one can derive very strong (Gaussian-like) concentration properties of the corresponding Gibbs measure, such as those proved in [5].

³ A close look at the proof in [42] reveals that the same is true for any infinite, locally-finite, bounded-degree graph such that the volume of any ball of radius $\ell$ grows sub-exponentially in $\ell$.

We now proceed to sketch some of our techniques and point out the main technical innovations. Our analysis of both the log-Sobolev constant and the spectral gap rests on certain spatial mixing conditions that can be stated as follows. Let $f$ be a function of the spin configuration that does not depend on the spins in the first $\ell$ levels of the tree starting from the root $r$, and let $\mu(f \mid \sigma_r)$ be the projection of $f$ onto the spin $\sigma_r$ at the root. If the variance (respectively, the entropy) under the Gibbs measure $\mu$ of $\mu(f \mid \sigma_r)$ decays fast enough with the depth $\ell$, then we show by a unified argument how to deduce a bound of $\Omega(1)$ on the spectral gap (respectively, the log-Sobolev constant). Crucially, in contrast to previous approaches we do not require the above decay to hold in arbitrary environments, but only for the Gibbs measure $\mu$ under consideration. This opens up the possibility that the condition holds for some boundary conditions and not for others
(with the same values of temperature and external field). We also prove the converse, thus showing that our mixing conditions are in fact equivalent to the required bounds on the spectral gap and log–Sobolev constants. This analysis has several additional advantages over previous ones [3, 36]: it is more direct, applies also when there is an external field, and applies to general nearest-neighbor spin systems on trees. The second main ingredient of the paper is establishing the above spatial mixing conditions in the scenarios of interest described in the above two theorems. In the case of the variance, this is done via a rather simple and novel coupling technique. Such a technique provides, along the way, a new and really elementary proof of the extremality of the Gibbs measure with free boundary below β1 . Surprisingly, we are also able to exploit the same coupling technique (via strong concentration properties of the Gibbs measure) to establish the entropy mixing condition. Thus in terms of the coupling analysis our conditions for variance and entropy mixing are essentially the same. Finally, we mention that our results actually hold (with suitable modifications) for a much wider class of spin systems on trees than just the Ising model, including the Potts model and models with hard constraints such as the zero-temperature antiferromagnetic Potts model (proper colorings) and the hard-core lattice gas model (independent sets). We briefly outline some of these extensions at the end of the paper; full details can be found in a companion paper [32]. The remainder of the paper is organized as follows. In Sect. 2 we give some basic definitions and notation. Then in Sect. 3 we define the spatial mixing conditions and relate them to the spectral gap and log-Sobolev constant. The mixing conditions in the scenarios of interest for the spectral gap and the log-Sobolev constant are verified in Sects. 4 and 5 respectively. Finally, in Sect. 
6 we mention some extensions of our results to other models of interest. The proofs of some technical lemmas omitted from the main text are collected in a supplement, Sect. 7.

2. Preliminaries

2.1. Gibbs distributions on trees. For $b \ge 2$, let $T^b$ denote the infinite, rooted $b$-ary tree (in which every vertex has $b$ children). We will be concerned with (complete) finite subtrees $T$ of $T^b$; if $T$ has depth $m$ then it has $n = (b^{m+1} - 1)/(b - 1)$ vertices, and its boundary $\partial T$ consists of the children (in $T^b$) of its leaves, i.e., $|\partial T| = b^{m+1}$. We identify subgraphs of $T$ with their vertex sets, and write $E(A)$ for the edges within a subset $A$, and $\partial A$ for the boundary of $A$ (i.e., the neighbors of $A$ in $(T \cup \partial T) \setminus A$).

Fix an Ising spin configuration $\tau$ on the infinite tree $T^b$. We denote by $\Omega_T^\tau$ the set of (finite) spin configurations $\sigma \in \{\pm 1\}^{T \cup \partial T}$ that agree with $\tau$ on $\partial T$; thus $\tau$ specifies a boundary condition on $T$. Usually we abbreviate $\Omega_T^\tau$ to $\Omega$. For any $\eta \in \Omega$ and any subset $A \subseteq T$, we denote by $\mu_A^\eta$ the Gibbs distribution over $\Omega$ conditioned on the configuration outside $A$ being $\eta$: i.e., if $\sigma \in \Omega$ agrees with $\eta$ outside $A$ then

$$\mu_A^\eta(\sigma) \propto \exp\Big[\beta\Big(\sum_{xy \in E(A \cup \partial A)} \sigma_x \sigma_y + h \sum_{x \in A} \sigma_x\Big)\Big],$$

where $\beta$ is the inverse temperature and $h$ the external field. We define $\mu_A^\eta(\sigma) = 0$ otherwise. In particular, when $A = T$, $\mu_T^\tau$ is simply the Gibbs distribution on the whole of $T$ with boundary condition $\tau$; we abbreviate $\mu_T^\tau$ to $\mu$.

For a function $f : \Omega \to \mathbb{R}$ we denote by $\mu_A^\eta(f) = \sum_{\sigma \in \Omega} \mu_A^\eta(\sigma) f(\sigma)$ the expectation of $f$ w.r.t. the distribution $\mu_A^\eta$. It will be convenient to view $\mu_A(f)$ as a function of $\eta$, defined by $\mu_A(f)(\eta) = \mu_A^\eta(f)$, the conditional expectation of $f$. Note that $\mu_A(f)$ is a function from $\Omega$ to $\mathbb{R}$ but depends only on the configuration outside $A$. We write $\mathrm{Var}_A^\eta(f) = \mu_A^\eta(f^2) - \mu_A^\eta(f)^2$ and (for $f \ge 0$) $\mathrm{Ent}_A^\eta(f) = \mu_A^\eta(f \log f) - \mu_A^\eta(f) \log \mu_A^\eta(f)$ for the variance and entropy of $f$ respectively w.r.t. $\mu_A^\eta$. Note that $\mathrm{Var}_A^\eta(f) = 0$ iff, conditioned on the configuration outside $A$ being $\eta$, $f$ does not depend on the configuration inside $A$. The same holds for $\mathrm{Ent}_A^\eta(f)$. In case $A = T$ we use the abbreviations $\mu(f)$, $\mathrm{Var}(f)$ and $\mathrm{Ent}(f)$.

We record here some basic properties of variance and entropy that we use throughout the paper:

(i) For $B \subseteq A \subseteq T$,

$$\mathrm{Var}_A^\eta(f) = \mu_A^\eta[\mathrm{Var}_B(f)] + \mathrm{Var}_A^\eta[\mu_B(f)]. \tag{2}$$

This equation expresses a decomposition of the variance into the local conditional variance in $B$ and the variance of the projection outside $B$.

(ii) If $A = \bigcup_i A_i$ for disjoint $A_i$, and the Gibbs distribution $\mu_A^\eta$ is the product of its marginals over the $A_i$, then for any function $f$,

$$\mathrm{Var}_A^\eta(f) \le \sum_i \mu_A^\eta[\mathrm{Var}_{A_i}(f)]. \tag{3}$$

(iii) For any two subsets $A, B \subseteq T$ such that $(\partial A) \cap B = \emptyset$, and for any function $f$,

$$\mu[\mathrm{Var}_A(\mu_B(f))] \le \mu[\mathrm{Var}_A(\mu_{A \cap B}(f))]. \tag{4}$$
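The decomposition (2) can be checked by brute-force enumeration on a very small tree. The following is a minimal sketch, assuming a binary tree of depth $m = 1$ (root 0 with leaves 1 and 2), a (+)-boundary, and arbitrarily chosen values of $\beta$, $h$ and of the test function $f$; the names `weight`, `cond_exp` and `cond_var` are hypothetical helpers introduced only for this illustration.

```python
import itertools
import math

BETA, H = 0.7, 0.3                  # illustrative parameter values (assumption)
SITES = (0, 1, 2)                   # root 0 with children 1, 2: b = 2, depth m = 1
EDGES = ((0, 1), (0, 2))            # edges inside T
BOUNDARY_SUM = {0: 0, 1: 2, 2: 2}   # each leaf has two boundary children fixed to (+)

CONFIGS = list(itertools.product((-1, 1), repeat=len(SITES)))


def weight(sigma):
    """Unnormalized Gibbs weight exp[beta * (pair terms + h * field terms)]."""
    pair = sum(sigma[x] * sigma[y] for x, y in EDGES)
    bdry = sum(sigma[x] * BOUNDARY_SUM[x] for x in SITES)
    return math.exp(BETA * (pair + bdry + H * sum(sigma)))


def cond_exp(f, block):
    """Return the function eta -> mu_block^eta(f)."""
    def g(eta):
        num = den = 0.0
        for sigma in CONFIGS:
            # keep only configurations that agree with eta outside the block
            if all(sigma[x] == eta[x] for x in SITES if x not in block):
                w = weight(sigma)
                num += w * f(sigma)
                den += w
        return num / den
    return g


def cond_var(f, block):
    """Return the function eta -> Var_block^eta(f)."""
    mean = cond_exp(f, block)
    second = cond_exp(lambda s: f(s) ** 2, block)
    return lambda eta: second(eta) - mean(eta) ** 2


def f(sigma):
    return sigma[0] + 2 * sigma[1] * sigma[2] + 0.5 * sigma[1]


A, B = {0, 1, 2}, {1, 2}            # B is contained in A = T
eta = CONFIGS[0]                    # irrelevant when A = T
lhs = cond_var(f, A)(eta)
rhs = cond_exp(cond_var(f, B), A)(eta) + cond_var(cond_exp(f, B), A)(eta)
print(lhs, rhs)                     # the two sides of (2)
```

Enumeration of this kind is only feasible for a handful of sites; it is meant as a sanity check on the notation rather than as a computational method.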
Properties (ii) and (iii) are consequences of the fact that variance w.r.t. a fixed measure is a convex functional. All three properties (i), (ii) and (iii) also hold with Var replaced by Ent.

2.2. The Glauber dynamics. The Glauber dynamics on $T$ with boundary condition $\tau$ is the continuous time Markov chain on $\Omega = \Omega_T^\tau$ with Markov generator $L \equiv L_T^\tau$ given by

$$(Lf)(\sigma) = \sum_{x \in T} c_x(\sigma)\,[f(\sigma^x) - f(\sigma)], \tag{5}$$

where $\sigma^x$ denotes the configuration obtained from $\sigma$ by flipping the spin at the site $x$, and $c_x(\sigma)$ denotes the flip rate at $x$. Although all our results apply to any choice of finite-range, uniformly positive and bounded flip rates satisfying the detailed balance condition w.r.t. the Gibbs measure, for simplicity in the sequel we will work with a specific choice known as the heat-bath dynamics:

$$c_x(\sigma) = \mu_{\{x\}}^\sigma(\sigma^x) = \frac{1}{1 + w_x(\sigma)}, \quad \text{where } w_x(\sigma) = \exp\Big[2\beta\sigma_x\Big(\sum_{xy \in E} \sigma_y + h\Big)\Big].$$
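The heat-bath rate and the detailed balance condition can be verified directly on a small example. The sketch below assumes the same three-site binary tree with (+)-boundary as a toy instance, with illustrative values of $\beta$ and $h$; `heat_bath_rate`, `flip` and `weight` are hypothetical names, not notation from the paper.

```python
import itertools
import math

BETA, H = 0.7, 0.3                            # illustrative parameter values (assumption)
NEIGHBORS = {0: (1, 2), 1: (0,), 2: (0,)}     # root 0 with two leaf children
BOUNDARY_SUM = {0: 0, 1: 2, 2: 2}             # two (+) boundary spins below each leaf


def weight(sigma):
    """Unnormalized Gibbs weight of a configuration on the three sites."""
    pair = sigma[0] * sigma[1] + sigma[0] * sigma[2]
    bdry = sum(sigma[x] * BOUNDARY_SUM[x] for x in range(3))
    return math.exp(BETA * (pair + bdry + H * sum(sigma)))


def flip(sigma, x):
    """Configuration sigma^x: sigma with the spin at x reversed."""
    return tuple(-s if i == x else s for i, s in enumerate(sigma))


def heat_bath_rate(sigma, x):
    """c_x(sigma) = 1 / (1 + w_x(sigma)) with w_x as in the displayed formula."""
    local = sum(sigma[y] for y in NEIGHBORS[x]) + BOUNDARY_SUM[x] + H
    return 1.0 / (1.0 + math.exp(2 * BETA * sigma[x] * local))


max_rate_error = 0.0       # |c_x(sigma) - mu_{x}^sigma(sigma^x)|
max_balance_error = 0.0    # |mu(sigma) c_x(sigma) - mu(sigma^x) c_x(sigma^x)|, unnormalized
for sigma in itertools.product((-1, 1), repeat=3):
    for x in range(3):
        flipped = flip(sigma, x)
        c = heat_bath_rate(sigma, x)
        conditional = weight(flipped) / (weight(sigma) + weight(flipped))
        max_rate_error = max(max_rate_error, abs(c - conditional))
        balance = weight(sigma) * c - weight(flipped) * heat_bath_rate(flipped, x)
        max_balance_error = max(max_balance_error, abs(balance))

print(max_rate_error, max_balance_error)
```

Both quantities should vanish up to floating-point error: the first confirms that the rate equals the conditional probability of the flipped spin, the second that the rates are reversible w.r.t. the Gibbs weights.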
It is a well-known fact (and easily checked) that the Glauber dynamics is ergodic and reversible w.r.t. the Gibbs distribution $\mu = \mu_T^\tau$, and so converges to the stationary distribution $\mu$. The rate of convergence is often measured using two concepts from functional analysis: the spectral gap and the logarithmic Sobolev constant. For a function $f : \Omega \to \mathbb{R}$, define the Dirichlet form of $f$ associated with the generator $L$ by

$$D(f) = \frac{1}{2} \sum_x \mu\Big(c_x\,[f(\sigma^x) - f(\sigma)]^2\Big) = \sum_x \mu(\mathrm{Var}_{\{x\}}(f)). \tag{6}$$
(The l.h.s. here is the general definition for any choice of the flip rates $c_x$; the last equality holds when specializing to the case of the heat-bath dynamics.) The spectral gap $c_{\mathrm{gap}}(\mu)$ and the logarithmic Sobolev constant $c_{\mathrm{sob}}(\mu)$ of the chain are then defined by

$$c_{\mathrm{gap}}(\mu) = \inf_f \frac{D(f)}{\mathrm{Var}(f)}; \qquad c_{\mathrm{sob}}(\mu) = \inf_{f \ge 0} \frac{D(\sqrt{f})}{\mathrm{Ent}(f)}, \tag{7}$$

where the infimum in each case is over non-constant functions $f$. As is well known, these two quantities measure the rate of exponential decay as $t \to \infty$ of the variance and relative entropy respectively (see, e.g., [37]). The quantity $c_{\mathrm{gap}}$ also has a natural interpretation as the smallest positive eigenvalue of $-L$.

We make the following important note. When discussing the asymptotics of $c_{\mathrm{sob}}$ (or $c_{\mathrm{gap}}$) for a fixed boundary condition $\tau$, we think of the infinite sequence of Gibbs distributions $\{\mu_T^\tau\}$, where $T$ ranges over all finite complete subtrees of $T^b$. In particular, when we say that $c_{\mathrm{sob}}(\mu) \equiv c_{\mathrm{sob}}(\{\mu_T^\tau\}) = \Omega(1)$ we mean that there exists a finite constant $C > 0$ such that for every $T$ (or equivalently, for every $\mu \in \{\mu_T^\tau\}$), $c_{\mathrm{sob}}(\mu) \ge 1/C$.

We close this section by recalling some well-known relationships between the above constants and certain notions of mixing time of the Glauber dynamics. Define $h_t^\sigma(\eta) = P_t(\sigma, \eta)/\mu(\eta)$, where $P_t(\sigma, \eta) = e^{tL}(\sigma, \eta)$ is the transition kernel at time $t$. Then, for $1 \le p \le \infty$, define

$$T_p = \min\Big\{t > 0 : \sup_\sigma \|h_t^\sigma - 1\|_p \le \frac{1}{e}\Big\}, \tag{8}$$

where $\|f\|_p$ denotes the $L^p(\Omega, \mu)$ norm of $f$. The time $T_1$ is usually called simply the mixing time of the chain. Standard results relating $T_p$ to the spectral gap and log-Sobolev constant (see, e.g., [37]), when specialized to the Glauber dynamics, yield the following:

Theorem 2.1. On an $n$-vertex $b$-ary tree $T$ with boundary condition $\tau$,
(i) $c_{\mathrm{gap}}(\mu)^{-1} \le T_1 \le c_{\mathrm{gap}}(\mu)^{-1} \times C_1 n$;
(ii) $c_{\mathrm{gap}}(\mu)^{-1} \le T_2 \le c_{\mathrm{sob}}(\mu)^{-1} \times C_2 \log n$,
where $\mu = \mu_T^\tau$ and $C_1$, $C_2$ are constants depending only on $b$, $\beta$ and $h$.

Finally, we note that our choice of the heat-bath dynamics is not essential. Since changing to any other reversible local update rule (e.g., the Metropolis rule) affects $c_{\mathrm{sob}}$ and $c_{\mathrm{gap}}$ by at most a constant factor, our analysis applies to any choice of Glauber dynamics.

3. Spatial Mixing Conditions for the Spectral Gap and log-Sobolev Constant

In this section we define a certain spatial mixing condition (i.e., a form of weak dependence between the spin at a site and the configuration far from that site) for a Gibbs distribution $\mu$, and prove that this condition implies that $c_{\mathrm{gap}}(\mu) = \Omega(1)$. An analogous condition implies that $c_{\mathrm{sob}}(\mu) = \Omega(1)$.
Our spatial mixing conditions have two main advantages over those used previously: first, the conditions for the spectral gap and the log-Sobolev constant are identical in form, allowing a uniform treatment; second, and more importantly, they are measure-specific, i.e., they may hold for the Gibbs distribution induced by some specific boundary configuration while not holding for other boundary configurations. Hence, the conditions are sensitive enough to show rapid mixing for specific boundaries even though the mixing time with other boundaries is slow for the same choice of temperature and external field. We also note that the results of this section hold not just for the Ising model but for any nearest-neighbor interaction model on a tree.
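The interpretation of the spectral gap as the smallest positive eigenvalue of $-L$, and its possible dependence on the boundary condition, can be explored numerically on a tiny instance. The following is a rough sketch under several assumptions: the tree is the three-site binary tree of depth 1, the boundary enters only through the sum of boundary spins seen by each site (`boundary_sum`), and the eigenvalue is found by shifted power iteration on the symmetrized generator, a numerical technique chosen here for self-containedness and not taken from the paper. The function name `spectral_gap` is hypothetical.

```python
import itertools
import math

NEIGHBORS = {0: (1, 2), 1: (0,), 2: (0,)}   # root 0 with two leaf children


def spectral_gap(beta, h, boundary_sum, iters=2000):
    """Estimate the smallest positive eigenvalue of -L (heat-bath rates)."""
    configs = list(itertools.product((-1, 1), repeat=3))
    index = {s: i for i, s in enumerate(configs)}
    size = len(configs)

    def weight(s):
        pair = s[0] * s[1] + s[0] * s[2]
        bdry = sum(s[x] * boundary_sum[x] for x in range(3))
        return math.exp(beta * (pair + bdry + h * sum(s)))

    z = sum(weight(s) for s in configs)
    mu = [weight(s) / z for s in configs]

    # S = D^{1/2} (-L) D^{-1/2} with D = diag(mu). Reversibility makes S
    # symmetric; it has the same eigenvalues as -L and null vector sqrt(mu).
    S = [[0.0] * size for _ in range(size)]
    for s in configs:
        i = index[s]
        for x in range(3):
            t = tuple(-v if k == x else v for k, v in enumerate(s))
            j = index[t]
            local = sum(s[y] for y in NEIGHBORS[x]) + boundary_sum[x] + h
            c = 1.0 / (1.0 + math.exp(2 * beta * s[x] * local))
            S[i][i] += c
            S[i][j] -= c * math.sqrt(mu[i] / mu[j])

    null = [math.sqrt(p) for p in mu]   # unit vector because sum(mu) = 1
    shift = 6.0                         # upper bound on eigenvalues: 3 rates, each <= 1

    def project(v):
        """Remove the component along the null vector and normalize."""
        dot = sum(a * b for a, b in zip(v, null))
        w = [a - dot * b for a, b in zip(v, null)]
        norm = math.sqrt(sum(a * a for a in w))
        return [a / norm for a in w]

    def apply_s(v):
        return [sum(S[i][j] * v[j] for j in range(size)) for i in range(size)]

    # Power iteration on (shift * I - S) restricted to the complement of null.
    v = project([float(i + 1) for i in range(size)])
    for _ in range(iters):
        sv = apply_s(v)
        v = project([shift * a - b for a, b in zip(v, sv)])
    return sum(a * b for a, b in zip(v, apply_s(v)))   # Rayleigh quotient


gap_plus = spectral_gap(1.0, 0.0, {0: 0, 1: 2, 2: 2})   # (+)-boundary
gap_free = spectral_gap(1.0, 0.0, {0: 0, 1: 0, 2: 0})   # no boundary spins
print(gap_plus, gap_free)
```

At $\beta = 0$ the measure is a product of uniform single-site marginals and the routine returns 1, in line with the normalization of the heat-bath rates. On a tree this small the two boundary conditions give gaps of comparable size; the asymptotic statements in the text concern the behavior as the tree grows.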
3.1. Reduction to block analysis. Before presenting the main result of this section, we need some more definitions and background. For each site $x \in T$, let $B_{x,\ell} \subseteq T$ denote the subtree (or “block”) of height $\ell - 1$ rooted at $x$, i.e., $B_{x,\ell}$ consists of $\ell$ levels. (If $x$ is $k < \ell$ levels from the bottom of $T$ then $B_{x,\ell}$ has only $k$ levels.) In what follows we will think of $\ell$ as a suitably large constant. By analogy with expression (6) for the Dirichlet form, let $D_\ell(f) \equiv \sum_{x \in T} \mu[\mathrm{Var}_{B_{x,\ell}}(f)]$ denote the local variation of $f$ w.r.t. the blocks $B_{x,\ell}$. A straightforward manipulation (see, e.g., [29], keeping in mind that each site belongs to at most $\ell$ blocks) shows that $c_{\mathrm{gap}}$ can be bounded as follows:

$$c_{\mathrm{gap}}(\mu) \ge \frac{1}{\ell} \cdot \inf_f \frac{D_\ell(f)}{\mathrm{Var}(f)} \cdot \min_{\eta, x} c_{\mathrm{gap}}(\mu_{B_{x,\ell}}^\eta). \tag{9}$$

As before, the infimum is taken over non-constant functions (and henceforth we omit explicit mention of this). The importance of (9) is that $\min_{\eta, x} c_{\mathrm{gap}}(\mu_{B_{x,\ell}}^\eta)$ depends only on the size of $B_{x,\ell}$ and $\beta$, but not on the size of $T$; in fact, it is at least $\Omega(e^{-c(b,\beta) \cdot \ell})$ [3]. Therefore, in order to show that $c_{\mathrm{gap}}$ is bounded by a constant independent of the size of $T$, it is enough to show that, for some finite $\ell$, $\mathrm{Var}(f) \le \mathrm{const} \times D_\ell(f)$ for all functions $f$. This is what we will show below, under the relevant spatial mixing condition. As a side remark, notice that $\inf_f D_\ell(f)/\mathrm{Var}(f)$ is exactly the spectral gap of the Glauber dynamics based on flipping blocks $B_{x,\ell}$, rather than single sites $x$.

An identical manipulation yields an analogous bound for the log-Sobolev constant. For a non-negative function $f$, let $E_\ell(f) \equiv \sum_{x \in T} \mu[\mathrm{Ent}_{B_{x,\ell}}(f)]$. Then

$$c_{\mathrm{sob}}(\mu) \ge \frac{1}{\ell} \cdot \inf_{f \ge 0} \frac{E_\ell(f)}{\mathrm{Ent}(f)} \cdot \min_{\eta, x} c_{\mathrm{sob}}(\mu_{B_{x,\ell}}^\eta). \tag{10}$$

Hence to bound $c_{\mathrm{sob}}(\mu)$ it suffices to show that, for some constant $\ell$, $\mathrm{Ent}(f) \le \mathrm{const} \times E_\ell(f)$ for all $f \ge 0$.

3.2. Spatial mixing. We are now ready to state our spatial mixing conditions, first for the variance and then for the entropy. For $x \in T$, write $T_x$ for the subtree rooted at $x$, and $\tilde{T}_x$ for $T_x \setminus \{x\}$, the subtree $T_x$ excluding its root.

Definition 3.1 (Variance Mixing). We say that $\mu = \mu_T^\tau$ satisfies $\mathrm{VM}(\ell, \varepsilon)$ if for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $f$ that does not depend on $B_{x,\ell}$, the following holds:

$$\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \varepsilon \cdot \mathrm{Var}_{T_x}^\eta(f).$$

Let us briefly discuss the above condition. Essentially, $\varepsilon = \varepsilon(\ell)$ gives the rate of decay with distance of point-to-set correlations. To see this, note that the l.h.s. $\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)]$ is the variance of the projection of $f$ onto the root $x$ of $T_x$, which is at distance $\ell$ from the sites on which $f$ depends. It is also worth noting that the required uniformity in $\eta$ in VM is not very restrictive: since the distribution $\mu_{T_x}^\eta$ depends only on the restriction of $\eta$ to the boundary of $T_x$, and since $\eta \in \Omega_T^\tau$ (i.e., $\eta$ agrees with $\tau$ on $\partial T$ and therefore on the bottom boundary of $T_x$), the only freedom left in choosing $\eta$ is in choosing the spin of the parent of $x$. Thus, VM is essentially a property of the distribution induced by the boundary condition $\tau$. It is this lack of uniformity (i.e., the fact that we need not verify VM for other boundary conditions) that makes it flexible enough for our applications.
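To make the ratio appearing in Definition 3.1 concrete, it can be evaluated by enumeration for one particular test function. The sketch below assumes a complete binary tree with no boundary spins (free boundary), zero external field, $x$ equal to the root, and $f$ equal to the spin of a single site at distance $\ell$ below the root; these are simplifying assumptions for illustration, and `vm_ratio` is a hypothetical name. In this special case the ratio equals $\tanh(\beta)^{2\ell}$, since the two-point correlation between sites at distance $\ell$ on a tree with free boundary and $h = 0$ is $\tanh(\beta)^\ell$.

```python
import itertools
import math


def vm_ratio(beta, depth):
    """Var[mu(f | root spin)] / Var(f) for f = spin of one site at distance `depth`."""
    n = 2 ** (depth + 1) - 1                          # sites, numbered in heap order
    edges = [((i - 1) // 2, i) for i in range(1, n)]  # (parent, child) pairs
    leaf = n - 1                                      # a site in the bottom level
    mass = {1: 0.0, -1: 0.0}      # total weight of configurations with root spin s
    moment = {1: 0.0, -1: 0.0}    # weight * f accumulated for root spin s
    for sigma in itertools.product((-1, 1), repeat=n):
        w = math.exp(beta * sum(sigma[p] * sigma[c] for p, c in edges))
        mass[sigma[0]] += w
        moment[sigma[0]] += w * sigma[leaf]
    z = mass[1] + mass[-1]
    mean_f = (moment[1] + moment[-1]) / z
    var_f = 1.0 - mean_f ** 2                         # f takes values +-1, so f^2 = 1
    var_proj = sum(
        mass[s] / z * (moment[s] / mass[s] - mean_f) ** 2 for s in (1, -1)
    )
    return var_proj / var_f


for ell in (1, 2, 3):
    print(ell, vm_ratio(0.5, ell), math.tanh(0.5) ** (2 * ell))
```

This checks the decay for a single function only. The condition $\mathrm{VM}(\ell, \varepsilon)$ requires the bound uniformly over all $f$ that do not depend on $B_{x,\ell}$, and functions of many bottom-level spins can have a much larger ratio.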
As the following theorem states, if $\mathrm{VM}(\ell, \varepsilon)$ holds with $\varepsilon \approx \frac{1}{2\ell}$, then we get a lower bound on $c_{\mathrm{gap}}$:

Theorem 3.2. For any $\ell$ and $\delta > 0$, if $\mu$ satisfies $\mathrm{VM}(\ell, (1 - \delta)/(2(\ell + 1 - \delta)))$ then $\mathrm{Var}(f) \le \frac{3}{\delta} \cdot D_\ell(f)$ for all $f$. In particular, if VM with the above parameters holds for some fixed $\ell$ and $\delta > 0$, for all $\mu = \mu_T^\tau$ with $T$ a full subtree, then $c_{\mathrm{gap}}(\mu) = \Omega(1)$. Conversely, if $c_{\mathrm{gap}}(\mu) = \Omega(1)$ then for all $T$, $\mu_T^\tau$ satisfies $\mathrm{VM}(\ell, c e^{-\vartheta \ell})$ for some constants $c, \vartheta > 0$ and all $\ell$.

Remark. The second part of the theorem was already proved in [3], where it was shown that for general nearest-neighbor spin systems on any bounded degree graph, if $c_{\mathrm{gap}}(\mu)$ is bounded independently of $n$ then $\mu$ exhibits an exponential decay of point-to-set correlations (i.e., $\mathrm{VM}(\ell, c \exp(-\vartheta \ell))$ holds for all $\ell$). The authors of [3] posed the question of whether the converse is also true. Theorem 3.2 (which holds for general nearest-neighbor spin systems on a tree) answers this question affirmatively when the graph is a tree. In fact, as is apparent from the above theorem, the decay of point-to-set correlations on a tree is either slower than linear or exponentially fast.

The analogous mixing condition for entropy and the log-Sobolev constant is the following:

Definition 3.3 (Entropy Mixing). We say that $\mu = \mu_T^\tau$ satisfies $\mathrm{EM}(\ell, \varepsilon)$ if for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any non-negative function $f$ that does not depend on $B_{x,\ell}$, the following holds:

$$\mathrm{Ent}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \varepsilon \cdot \mathrm{Ent}_{T_x}^\eta(f).$$

Before stating the analog of Theorem 3.2 relating $c_{\mathrm{sob}}$ to EM, we need to define one more constant. Let $p_{\min} = \min_{x, s, \eta \in \Omega_T^\tau} \mu_{T_x}^\eta(\sigma_x = s)$, where $s$ ranges over $\{+, -\}$; i.e., $p_{\min}$ is the minimum probability of any spin value at any site with any boundary condition. It is easy to see that $p_{\min} \ge \frac{1}{2} e^{-2\beta(b + |h|)}$, a constant depending only on $b$, $\beta$, $h$.

Theorem 3.4. For any $\ell$ and $\delta > 0$, if $\mu$ satisfies $\mathrm{EM}(\ell, [(1 - \delta) p_{\min}/(\ell + 1 - \delta)]^2)$ then $\mathrm{Ent}(f) \le \frac{2}{\delta} \cdot E_\ell(f)$ for all $f \ge 0$. In particular, if EM with the above parameters holds for some fixed $\ell$ and $\delta > 0$, for all $\mu = \mu_T^\tau$ with $\tau$ fixed and $T$ an arbitrary full subtree, then $c_{\mathrm{sob}}(\mu) = \Omega(1)$.
Conversely, if $c_{\mathrm{sob}}(\mu) = \Omega(1)$ then for all $T$, $\mu_T^\tau$ satisfies $\mathrm{EM}(\ell, c e^{-\vartheta \ell})$ for some constants $c, \vartheta > 0$ and all $\ell$.

In order to prove Theorems 3.2 and 3.4 it is convenient to work with spatial mixing conditions that are somewhat more involved than VM and EM. The main difference is that we want to allow for functions that may depend on $B_{x,\ell}$ (the first $\ell$ levels of $T_x$) and thus need to introduce a term for this dependency. The modified conditions express the property that the variance (entropy) of the projection of any function $f$ onto the root $x$ of $T_x$ can be bounded up to a constant factor by the local variance (entropy) of $f$ in $B_{x,\ell}$, plus a negligible factor times the local variance (entropy) of $f$ in $\tilde{T}_x$. As the following lemma states, the modified conditions (with appropriate parameters) can be deduced from VM and EM.

Lemma 3.5. (i) For any $\varepsilon < \frac{1}{2}$, if $\mu = \mu_T^\tau$ satisfies $\mathrm{VM}(\ell, \varepsilon)$ then for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $f$ we have

$$\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \frac{2 - \varepsilon'}{1 - \varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_{x,\ell}}(f)] + \frac{\varepsilon'}{1 - \varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(f)], \quad \text{with } \varepsilon' = 2\varepsilon.$$

(ii) For any $\varepsilon < p_{\min}^2$, if $\mu = \mu_T^\tau$ satisfies $\mathrm{EM}(\ell, \varepsilon)$ then for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $f \ge 0$ we have

$$\mathrm{Ent}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \frac{1}{1 - \varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Ent}_{B_{x,\ell}}(f)] + \frac{\varepsilon'}{1 - \varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Ent}_{\tilde{T}_x}(f)], \quad \text{with } \varepsilon' = \frac{\sqrt{\varepsilon}}{p_{\min}}.$$
Remark. We note that with extra work, part (ii) of Lemma 3.5 can be improved to hold with $\varepsilon' = c(p_{\min})\varepsilon$. We give the weaker bound because it is simpler to prove while still enough for our applications. Similar statements to those in Lemma 3.5 appeared in [4]. We defer our proof to Sect. 7.

We can now prove Theorems 3.2 and 3.4 by working with the modified spatial mixing conditions of Lemma 3.5.

Proof of Theorems 3.2 and 3.4. Here we only prove the forward direction of both theorems, which are the more important for our development. The reverse direction of Theorem 3.2 was proved in [3], as already mentioned above. The proof of the reverse direction of Theorem 3.4 is deferred to Sect. 7 because it uses machinery developed later in the paper. The main step in the proof of the forward directions is to show the following claim:

Claim 3.6. If for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $f$,

$$\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le c \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_{x,\ell}}(f)] + \frac{1 - \delta}{\ell} \cdot \mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(f)],$$

then $\mathrm{Var}(f) \le \frac{c}{\delta} \cdot D_\ell(f)$ for all $f$. The same implication holds when Var is replaced by Ent, $D_\ell$ is replaced by $E_\ell$ and the function $f$ is restricted to be non-negative.

Observe that the hypothesis of Theorem 3.2 together with part (i) of Lemma 3.5 establishes the hypothesis of Claim 3.6 with $c \le 3$, and similarly, the hypothesis of Theorem 3.4 together with part (ii) of Lemma 3.5 establishes the hypothesis of Claim 3.6 (after the necessary replacement of symbols) with $c \le 2$. It therefore suffices to prove Claim 3.6. We prove only the formulation with Var and $D_\ell$ since the proof for the formulation with Ent and $E_\ell$ is identical once we make the same replacements in the text of the proof. As will be clear below, the proof uses only properties which are common to both Var and Ent.

Consider an arbitrary function $f : \Omega \to \mathbb{R}$. Our first goal is to relate $\mathrm{Var}(f)$ to the projections $\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)]$ for $x \in T$, so that we can apply the spatial mixing condition of the hypothesis.
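As an aside, the parameter bookkeeping behind the observation that Lemma 3.5 yields the hypothesis of Claim 3.6 with $c \le 3$ (respectively $c \le 2$) can be written out explicitly. The computation below is a routine check, assuming $\ell \ge 1$ and $0 < \delta \le 1$ (so that the mixing parameter is non-negative); it only substitutes the parameters of Theorems 3.2 and 3.4 into the coefficients of Lemma 3.5.

```latex
% Variance case: Theorem 3.2 assumes VM(\ell, \varepsilon) with
\begin{align*}
\varepsilon &= \frac{1-\delta}{2(\ell+1-\delta)}, &
\varepsilon' &= 2\varepsilon = \frac{1-\delta}{\ell+1-\delta}, &
1-\varepsilon' &= \frac{\ell}{\ell+1-\delta}, \\
\frac{\varepsilon'}{1-\varepsilon'} &= \frac{1-\delta}{\ell}, &
\frac{2-\varepsilon'}{1-\varepsilon'} &= \frac{2\ell+1-\delta}{\ell}
  = 2 + \frac{1-\delta}{\ell} \le 3 .
\end{align*}
% Entropy case: Theorem 3.4 assumes EM(\ell, \varepsilon) with
\begin{align*}
\varepsilon &= \Bigl[\frac{(1-\delta)\,p_{\min}}{\ell+1-\delta}\Bigr]^{2} < p_{\min}^{2}, &
\varepsilon' &= \frac{\sqrt{\varepsilon}}{p_{\min}} = \frac{1-\delta}{\ell+1-\delta}, \\
\frac{\varepsilon'}{1-\varepsilon'} &= \frac{1-\delta}{\ell}, &
\frac{1}{1-\varepsilon'} &= 1 + \frac{1-\delta}{\ell} \le 2 .
\end{align*}
```

In both cases the coefficient of the second term is exactly $(1-\delta)/\ell$, which is the form required by Claim 3.6, and the conditions $\varepsilon < \frac{1}{2}$ and $\varepsilon < p_{\min}^2$ of Lemma 3.5 hold automatically.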
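Recall that $T$ has $m + 1$ levels, and define the increasing sequence $\emptyset = F_0 \subset F_1 \subset \ldots \subset F_{m+1} = T$, where $F_i$ consists of all sites in the lowest $i$ levels of $T$. Thus $F_i$ is a forest of height $i - 1$. Using (2) recursively, and the facts that $\mu_{F_{i+1}}(\mu_{F_i}(f)) = \mu_{F_{i+1}}(f)$ and $\mu_{F_0}(f) = f$, we obtain

$$\begin{aligned}
\mathrm{Var}(f) &= \mu[\mathrm{Var}_{F_1}(f)] + \mathrm{Var}[\mu_{F_1}(f)] \\
&= \mu[\mathrm{Var}_{F_1}(f)] + \mu[\mathrm{Var}_{F_2}(\mu_{F_1}(f))] + \mathrm{Var}[\mu_{F_2}(\mu_{F_1}(f))] \\
&\;\;\vdots \\
&= \sum_{i=1}^{m+1} \mu[\mathrm{Var}_{F_i}(\mu_{F_{i-1}}(f))].
\end{aligned}$$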
Now a fundamental property of nearest-neighbor interaction models on a tree is that, given the configuration on $T \setminus F_i$, the Gibbs distribution on $F_i$ becomes a product of the marginals on the subtrees rooted at the sites $x \in F_i \setminus F_{i-1}$. Using inequality (3) for the variance of a product measure, we therefore have that

$$\mathrm{Var}(f) \le \sum_{i=1}^{m+1} \sum_{x \in F_i \setminus F_{i-1}} \mu[\mathrm{Var}_{T_x}(\mu_{F_{i-1}}(f))] \le \sum_{x \in T} \mu[\mathrm{Var}_{T_x}(\mu_{\tilde{T}_x}(f))], \tag{11}$$
where in the second inequality we used the convexity of the variance as in (4). Notice that so far we have not used the spatial mixing condition in the hypothesis of Claim 3.6, but only a natural martingale structure induced by the tree.

Let us denote the final sum in (11) by $\mathrm{Pvar}(f)$. In order to bound $c_{\mathrm{gap}}$, we need to compare the projection terms $\mathrm{Var}_{T_x}(\mu_{\tilde{T}_x}(f))$ in $\mathrm{Pvar}(f)$ with the local conditional variance terms in $D_\ell(f)$. For example, notice that if $\mu$ were the product of its single-site marginals then $\mathrm{Var}_{T_x}(\mu_{\tilde{T}_x}(f)) \le \mu_{T_x}[\mathrm{Var}_x(f)]$ and $c_{\mathrm{gap}} = 1$. However, in general the variance of the projection on $x$ may also involve terms which depend on other sites, and may lead to a factor that grows with the size of $T_x$. We will use the spatial mixing condition in order to preclude the latter possibility. Specifically, we show that if for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $g$, $\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(g)] \le c \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_{x,\ell}}(g)] + \varepsilon \cdot \mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(g)]$ then for every $x \in T$ and $\eta \in \Omega$,

$$\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le c \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_x}(f)] + \varepsilon \cdot \sum_{y \in B_x \cup \partial B_x,\; y \ne x} \mu_{T_x}^\eta[\mathrm{Var}_{T_y}(\mu_{\tilde{T}_y}(f))], \tag{12}$$
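where we have abbreviated $B_{x,\ell}$ to $B_x$ and $\partial B_x$ stands for the boundary of $B_x$ excluding the parent of $x$, i.e., the bottom boundary of $B_x$. Notice that the last term in (12) is relevant only when $x$ is at distance at least $\ell$ from the bottom of $T$. When $x$ belongs to one of the $\ell$ lowest levels of $T$ then $T_x = B_x$, and thus trivially $\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \mu_{T_x}^\eta[\mathrm{Var}_{B_x}(f)]$.

Let us assume (12) for now and conclude the proof of the theorem. Applying (12) for every $x$ and $\eta$, and using the hypothesis that $\varepsilon = \frac{1 - \delta}{\ell}$ and the fact that each site appears in at most $\ell$ blocks, we get

$$\begin{aligned}
\mathrm{Pvar}(f) &\le c \cdot D_\ell(f) + \varepsilon \cdot \sum_{x \in T} \; \sum_{y \in B_x \cup \partial B_x,\; y \ne x} \mu[\mathrm{Var}_{T_y}(\mu_{\tilde{T}_y}(f))] \\
&\le c \cdot D_\ell(f) + \ell\varepsilon \cdot \sum_{y \in T} \mu[\mathrm{Var}_{T_y}(\mu_{\tilde{T}_y}(f))] \\
&= c \cdot D_\ell(f) + (1 - \delta)\,\mathrm{Pvar}(f),
\end{aligned}$$

and hence

$$\mathrm{Var}(f) \le \mathrm{Pvar}(f) \le \frac{c}{\delta} \cdot D_\ell(f),$$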
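proving Claim 3.6.

We now return to proving (12). Let $g = \mu_{T_x \setminus (B_x \cup \partial B_x)}(f)$. Once we notice that $\mu_{\tilde{T}_x}(f) = \mu_{\tilde{T}_x}(g)$, we can use the spatial mixing assumption that precedes (12) to deduce

$$\begin{aligned}
\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] &\le c \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_x}(g)] + \varepsilon \cdot \mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(g)] \\
&\le c \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_x}(f)] + \varepsilon \cdot \mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(g)],
\end{aligned}$$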
where we used (4) for the second inequality. We will be done once we show that

$$\mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(g)] \le \sum_{y \in B_x \cup \partial B_x,\; y \ne x} \mu_{T_x}^\eta[\mathrm{Var}_{T_y}(\mu_{\tilde{T}_y}(f))]. \tag{13}$$
But (13) follows from a similar argument to that used earlier to show $\mathrm{Var}(f) \le \mathrm{Pvar}(f)$, starting from the fact that $g = \mu_{F'_k}(f)$, where the forests $F'_i$ are defined analogously to the $F_i$ earlier but restricted to the subtree $T_x$, and $k = \mathrm{height}(x) - \ell$. We omit the details. This concludes the proof of Claim 3.6, and thus of Theorems 3.2 and 3.4.
4. Verifying Spatial Mixing for the Spectral Gap

In this section, we will prove that the spectral gap of the Glauber dynamics is bounded in all of the situations covered by Theorem 1.1 in the Introduction. In light of Theorem 3.2, to bound the spectral gap it suffices to verify the Variance Mixing condition $\mathrm{VM}(\ell, \varepsilon)$ with $\varepsilon = (1 - \delta)/(2(\ell + 1 - \delta))$, for some constants $\ell, \delta > 0$ independent of the size of $T$. In fact, we will show it with the asymptotically tighter value $\varepsilon = c \exp(-\vartheta \ell)$:

Theorem 4.1. In both of the following situations, there exists a positive constant $\vartheta$ (depending only on $b$, $\beta$ and $h$) such that, for all $T$, the Gibbs distribution $\mu = \mu_T^\tau$ satisfies $\mathrm{VM}(\ell, e^{-\vartheta \ell})$ for all $\ell$:
(i) $\tau$ is arbitrary, and either $\beta < \beta_1$ (with $h$ arbitrary), or $|h| > h_c(\beta)$ (with $\beta$ arbitrary);
(ii) $\tau$ is the (+)-boundary condition, and $\beta$, $h$ are arbitrary.
As a corollary, in both situations $c_{\mathrm{gap}}(\mu) = \Omega(1)$.

Remark. The validity of VM, i.e., the decay of point-to-set correlations, is of interest independently of its implication for the spectral gap (an implication which is new to this paper): e.g., it is closely related to the purity of the infinite volume Gibbs measure [6] and to bit reconstruction problems on trees [14]. In the special case of a free boundary and $h = 0$, part (i) of Theorem 4.1 was first proved in [6] via a lengthy calculation, which was considerably simplified in [20]. It was later reproved in [3] (for arbitrary boundary conditions) as a consequence of the fact that the spectral gap is bounded in this situation. An extension to general trees can be found in [14] and [21]. Our motivation for presenting another proof of part (i) (in addition to handling general fields $h$) is the simplicity of our argument compared with previous ones. As far as part (ii) is concerned, we are unaware of any previous results for the case of the (+)-boundary other than the fact that $\mathrm{VM}(\ell, \varepsilon(\ell))$ must hold with $\lim_{\ell \to \infty} \varepsilon(\ell) = 0$ because the (+)-phase is pure (see, e.g., [16]).
The rest of this section is divided into two parts. First, we develop a general framework based on coupling in order to establish the exponential decay of point-to-set correlations. This framework identifies two key quantities, κ and γ , and states that when their product is small enough then VM holds. Then, in the second part, we go back to proving Theorem 4.1 by calculating κ and γ for each of the above two regimes separately.
4.1. A coupling argument for decay of point-to-set correlations. In this section we develop a coupling framework that enables us to verify the exponential decay of point-to-set correlations from a simple calculation involving single-spin distributions. First we need some additional notation. When $x$ is not the root of $T$, let $\mu_{T_x}^+$ (respectively, $\mu_{T_x}^-$) denote the Gibbs distribution in which the parent of $x$ has its spin fixed to (+) (respectively, (−)) and the configuration on the bottom boundary of $T_x$ is specified by $\tau$ (the global boundary condition on $T$).⁴ For two distributions $\mu_1$ and $\mu_2$, we denote by $\|\mu_1 - \mu_2\|_x$ the variation distance between the projections of $\mu_1$ and $\mu_2$ onto the spin at $x$. (Since the Ising model has only two spin values, $\|\mu_1 - \mu_2\|_x = |\mu_1(\sigma_x = +) - \mu_2(\sigma_x = +)|$.) Recall also that $\eta^y$ denotes the configuration $\eta$ with the spin at site $y$ flipped. We now identify two constants that are crucial for our coupling argument:

⁴ Notice that we do not specify the rest of the configuration outside $T_x$ since it has no influence on the distribution inside $T_x$ once the spin at the parent of $x$ is fixed. However, since our distributions are defined over the whole configuration space, in the discussion below when the configuration outside $T_x$ is relevant it will be understood from the context.

Definition 4.2. For a sequence of Gibbs distributions $\{\mu_T^\tau\}$ corresponding to a fixed boundary condition $\tau$, define $\kappa \equiv \kappa(\{\mu_T^\tau\})$ and $\gamma \equiv \gamma(\{\mu_T^\tau\})$ by
(i) $\kappa = \sup_T \max_z \|\mu_{T_z}^+ - \mu_{T_z}^-\|_z$;
(ii) $\gamma = \sup_T \max \|\mu_A^\eta - \mu_A^{\eta^y}\|_z$, where the maximum is taken over all subsets $A \subseteq T$, all boundary configurations $\eta$, all sites $y$ on the boundary of $A$ and all neighbors $z \in A$ of $y$.

Note that $\kappa$ is the same as $\gamma$, except that the maximization is restricted to $A = T_z$ and the boundary vertex $y$ being the parent of $z$; hence always $\kappa \le \gamma$. Since $\kappa$ involves Gibbs distributions only on maximal subtrees $T_z$, it may depend on the boundary condition $\tau$ at the bottom of the tree. By contrast, $\gamma$ bounds the worst-case probability of disagreement for an arbitrary subset $A$ and arbitrary boundary configuration around $A$, and hence depends only on $(\beta, h)$ and not on $\tau$. It is the dependence of $\kappa$ on $\tau$ that opens up the possibility of an analysis that is specific to the boundary condition. For example, at very low temperature and with no external field, $\kappa$ is close to 1 in the free boundary case, while it is close to zero in the (+)-boundary case.

In our arguments $\kappa$ will be used to bound the probability of a disagreement percolating one level down the tree, namely, when we fix a disagreement at $x$ and couple the two resulting marginals on a child $z$ of $x$. On the other hand, $\gamma$ will be used in order to bound the probability of a disagreement percolating one level up the tree, namely, when we fix a single disagreement on the bottom boundary of a block, say at $y$ (with the rest of the boundary configuration being arbitrary), and couple the marginals on the parent of $y$. The novelty of our argument for establishing VM comes from the fact that we identify two separate constants $\kappa$ and $\gamma$, and consider their product, rather than working with $\kappa$ alone:

Theorem 4.3. Any Gibbs distribution $\mu = \mu_T^\tau$ satisfies $\mathrm{VM}(\ell, (\gamma\kappa b)^\ell)$ for all $\ell$, where $\kappa$ and $\gamma$ are the constants associated with the sequence $\{\mu_T^\tau\}$ as specified in Definition 4.2. In particular, if $\gamma\kappa b < 1$ then there exists a constant $\vartheta > 0$ such that, for every $T$, the measure $\mu = \mu_T^\tau$ satisfies $\mathrm{VM}(\ell, e^{-\vartheta \ell})$ for all $\ell$, and hence $c_{\mathrm{gap}}(\mu) = \Omega(1)$.
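The contrast between the free and (+) boundaries in the size of $\kappa$ can be illustrated with a short recursion. The sketch below rests on several assumptions that are not part of the paper's text: the boundary is homogeneous, so the marginal at $z$ depends only on the number of levels below $z$; the free boundary is modeled as leaves with no boundary neighbors; and the supremum over $T$ and $z$ is truncated to a finite range of depths. The names `root_ratio` and `kappa` are hypothetical. The recursion tracks the ratio $Z_z(+)/Z_z(-)$ of the partition functions of $T_z$ with the spin at $z$ fixed, using the Gibbs weights with the field term $\beta h \sigma_x$.

```python
import math


def root_ratio(beta, h, b, depth, boundary):
    """Z_z(+) / Z_z(-) for a complete b-ary subtree with `depth` levels below z."""
    ratio = math.exp(2 * beta * h)            # field term at a bottom-level site
    if boundary == "plus":
        ratio *= math.exp(2 * beta * b)       # b boundary children fixed to (+)
    elif boundary != "free":
        raise ValueError("boundary must be 'plus' or 'free'")
    for _ in range(depth):
        # contribution of one child subtree, summed over the child's spin
        child = (math.exp(beta) * ratio + math.exp(-beta)) / (
            math.exp(-beta) * ratio + math.exp(beta)
        )
        ratio = math.exp(2 * beta * h) * child ** b
    return ratio


def kappa(beta, h, b, boundary, max_depth=30):
    """Largest |P(sigma_z = + | parent +) - P(sigma_z = + | parent -)| over depths."""
    best = 0.0
    for depth in range(max_depth + 1):
        r = root_ratio(beta, h, b, depth, boundary)
        p_up = math.exp(beta) * r / (math.exp(beta) * r + math.exp(-beta))
        p_down = math.exp(-beta) * r / (math.exp(-beta) * r + math.exp(beta))
        best = max(best, abs(p_up - p_down))
    return best


kappa_free = kappa(2.0, 0.0, 2, "free")
kappa_plus = kappa(2.0, 0.0, 2, "plus")
print(kappa_free, kappa_plus)
```

Since a variation distance is at most 1, $\gamma \le 1$ always, so $\kappa b < 1$ is already sufficient for the condition $\gamma\kappa b < 1$ of Theorem 4.3. For the illustrative values above ($\beta = 2$, $h = 0$, $b = 2$) this holds with the (+)-boundary but not with the free boundary, where this crude bound on $\gamma$ gives no information.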
Glauber Dynamics on Trees: Boundary Conditions and Mixing Time
Proof. Fix arbitrary $T$, $x \in T$, $\eta \in \Omega_T^{\tau}$. We need to show that for every function $f$ that does not depend on $B_{x,\ell}$, $\mathrm{Var}^{\eta}_{T_x}[\mu_{\widetilde{T}_x}(f)] \le \varepsilon \cdot \mathrm{Var}^{\eta}_{T_x}(f)$ with $\varepsilon = (\kappa\gamma b)^{\ell}$, i.e., projecting $f$ onto the root (of $T_x$) causes the variance to shrink by a factor $\varepsilon$. As is well known, it is enough to establish a dual contraction, i.e., to consider an arbitrary function that depends only on the spin at the root and show that, when projecting onto levels $\ell$ and below, the variance shrinks by a factor $\varepsilon$. Formally, it is enough to show that for every function $g$ that does not depend on $\widetilde{T}_x$ (see footnote 5) we have
\[
\mathrm{Var}^{\eta}_{T_x}[\mu_{B_{x,\ell}}(g)] \le \varepsilon \cdot \mathrm{Var}^{\eta}_{T_x}(g). \tag{14}
\]
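This is because for a function $f$ that does not depend on $B_{x,\ell}$, the variance of the projection can be written as
\[
\mathrm{Var}^{\eta}_{T_x}[\mu_{\widetilde{T}_x}(f)] = \mathrm{Cov}^{\eta}_{T_x}(f, \mu_{\widetilde{T}_x}(f)) = \mathrm{Cov}^{\eta}_{T_x}\bigl(f, \mu_{B_{x,\ell}}(\mu_{\widetilde{T}_x}(f))\bigr) \le \sqrt{\mathrm{Var}^{\eta}_{T_x}(f) \cdot \mathrm{Var}^{\eta}_{T_x}[\mu_{B_{x,\ell}}(\mu_{\widetilde{T}_x}(f))]},
\]
where $\mathrm{Cov}^{\eta}_{A}(f, f')$ denotes the covariance $\mu^{\eta}_A(ff') - \mu^{\eta}_A(f)\mu^{\eta}_A(f')$ and the last inequality is an application of Cauchy-Schwarz. Squaring and dividing by $\mathrm{Var}^{\eta}_{T_x}[\mu_{\widetilde{T}_x}(f)]$, we then have
\[
\mathrm{Var}^{\eta}_{T_x}[\mu_{\widetilde{T}_x}(f)] \le \mathrm{Var}^{\eta}_{T_x}(f) \cdot \frac{\mathrm{Var}^{\eta}_{T_x}[\mu_{B_{x,\ell}}(\mu_{\widetilde{T}_x}(f))]}{\mathrm{Var}^{\eta}_{T_x}[\mu_{\widetilde{T}_x}(f)]}.
\]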
If we assume (14) then the expression on the r.h.s. is bounded by $\varepsilon \cdot \mathrm{Var}^{\eta}_{T_x}(f)$ since $g = \mu_{\widetilde{T}_x}(f)$ does not depend on $\widetilde{T}_x$. We therefore proceed with the proof of (14), which goes via a coupling argument.

A coupling of two distributions $\mu_1, \mu_2$ on $\Omega$ is any joint distribution $\nu$ on $\Omega^2$ whose marginals are $\mu_1$ and $\mu_2$ respectively. For two configurations $\sigma, \sigma' \in \Omega$, let $|\sigma - \sigma'|_{x,\ell}$ denote the Hamming distance between the restrictions of $\sigma$ and $\sigma'$ to $\partial B_{x,\ell}$, i.e., the number of sites at distance $\ell$ below $x$ at which $\sigma$ and $\sigma'$ differ. Notice that $|\sigma - \sigma'|_{x,\ell}$ can be at most $b^{\ell}$, the number of sites on the $\ell$th level below $x$. Let $\mu^{+}_{\widetilde{T}_x}$ (respectively, $\mu^{-}_{\widetilde{T}_x}$) stand for the Gibbs distribution where the spin at $x$ is set to (+) (respectively, (−)) and, as usual, the configuration on the bottom boundary of $T_x$ is specified by $\tau$. Our goal will be to construct a coupling $\nu$ of $\mu^{+}_{\widetilde{T}_x}$ and $\mu^{-}_{\widetilde{T}_x}$ for which the expectation $E_{\nu}|\sigma - \sigma'|_{x,\ell} \equiv \sum_{\sigma,\sigma'} \nu(\sigma, \sigma')\,|\sigma - \sigma'|_{x,\ell}$ is only $(\kappa b)^{\ell}$.

Claim 4.4. For every $x \in T$ and all $\ell$ the following hold:
(i) There is a coupling $\nu$ of $\mu^{+}_{\widetilde{T}_x}$ and $\mu^{-}_{\widetilde{T}_x}$ for which $E_{\nu}|\sigma - \sigma'|_{x,\ell} \le (\kappa b)^{\ell}$.
(ii) For any $\eta, \eta' \in \Omega$ that have the same spin value at the parent of $x$, $\|\mu^{\eta}_{B_{x,\ell}} - \mu^{\eta'}_{B_{x,\ell}}\|_x \le \gamma^{\ell} \cdot |\eta - \eta'|_{x,\ell}$.

Let us assume Claim 4.4 for the moment and complete the proof of (14). Consider an arbitrary $g$ that does not depend on $\widetilde{T}_x$. Let $p = \mu^{\eta}_{T_x}(\sigma_x = +)$ and $q = 1 - p = \mu^{\eta}_{T_x}(\sigma_x = -)$. We also write $g^{+}$ for $g(\sigma)$, where $\sigma$ is any configuration that agrees with $\eta$ outside $T_x$ and such that $\sigma_x = +$. (This is well defined since $g$ does not depend on $\widetilde{T}_x$.) We define $g^{-}$ similarly. Without loss of generality we may assume that in the coupling $\nu$ from Claim 4.4 both the coupled configurations agree with $\eta$ outside $T_x$ with probability 1. We then have

5 Effectively this means that, conditioned on the configuration outside $T_x$ being $\eta$, $g$ depends only on the spin at the root $x$.
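F. Martinelli, A. Sinclair, D. Weitz

\begin{align*}
\mathrm{Var}^{\eta}_{T_x}[\mu_{B_{x,\ell}}(g)] &= \mathrm{Cov}^{\eta}_{T_x}[g, \mu_{B_{x,\ell}}(g)] \\
&= \mathrm{Cov}^{\eta}_{T_x}[g, \mu_{\widetilde{T}_x}(\mu_{B_{x,\ell}}(g))] \\
&= pq\,(g^{+} - g^{-})\,\bigl[\mu^{+}_{\widetilde{T}_x}(\mu_{B_{x,\ell}}(g)) - \mu^{-}_{\widetilde{T}_x}(\mu_{B_{x,\ell}}(g))\bigr] \\
&= pq\,(g^{+} - g^{-}) \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,\bigl[\mu^{\sigma}_{B_{x,\ell}}(g) - \mu^{\sigma'}_{B_{x,\ell}}(g)\bigr] \\
&\le pq\,|g^{+} - g^{-}| \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,\|\mu^{\sigma}_{B_{x,\ell}} - \mu^{\sigma'}_{B_{x,\ell}}\|_x \cdot |g^{+} - g^{-}| \tag{15} \\
&\le pq\,(g^{+} - g^{-})^2 \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,|\sigma - \sigma'|_{x,\ell} \cdot \gamma^{\ell} \\
&= \gamma^{\ell} \cdot \mathrm{Var}^{\eta}_{T_x}(g) \cdot E_{\nu}|\sigma - \sigma'|_{x,\ell} \\
&\le (\gamma\kappa b)^{\ell} \cdot \mathrm{Var}^{\eta}_{T_x}(g).
\end{align*}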
In the sixth line here we have used part (ii) of Claim 4.4, and in the last line we have used part (i). This completes the proof of (14), and hence of Theorem 4.3.

We now go back and prove Claim 4.4. The proof of Claim 4.4 makes use of a standard recursive coupling along paths in the tree (as in, e.g., [3]). We start with part (i), i.e., constructing a coupling $\nu$ of $\mu^{+}_{\widetilde{T}_x}$ and $\mu^{-}_{\widetilde{T}_x}$ with the required properties. Since the underlying graph is a tree, we can couple $\mu^{+}_{\widetilde{T}_x}$ and $\mu^{-}_{\widetilde{T}_x}$ recursively. This goes as follows. First, given the spin at $x$ the measures on $T_z$ (where $z$ ranges over the children of $x$) are all independent of each other, so we can couple the projections on the $T_z$'s independently. Then, we couple the two projections on $T_z$ by first coupling the spin at $z$ using the optimal coupling (the one that achieves the variation distance) of the marginal measures on the spin at $z$. Thus, the spins at $z$ disagree with probability at most $\kappa$. Once a coupled pair of spins at $z$ is chosen, we continue as follows: if the spins at $z$ agree then we can make the configurations in $T_z$ equal with probability 1 (because the two boundary conditions are the same); if the spins at $z$ differ (i.e., one is (+) and the other (−)) then we recursively couple $\mu^{+}_{\widetilde{T}_z}$ and $\mu^{-}_{\widetilde{T}_z}$. We let $\nu$ be the resulting coupling of $\mu^{+}_{\widetilde{T}_x}$ and $\mu^{-}_{\widetilde{T}_x}$, and notice that $E_{\nu}|\sigma - \sigma'|_{x,\ell} \le (\kappa b)^{\ell}$ since for every site $y$ at distance $\ell$ below $x$ the probability that the two coupled spins at $y$ disagree is at most $\kappa^{\ell}$.

We go on to prove part (ii) of Claim 4.4. First, by writing a telescopic sum and applying the triangle inequality we get that
\[
\|\mu^{\eta}_{B_{x,\ell}} - \mu^{\eta'}_{B_{x,\ell}}\|_x \le \sum_{i=1}^{k} \|\mu^{\eta^{(i-1)}}_{B_{x,\ell}} - \mu^{\eta^{(i)}}_{B_{x,\ell}}\|_x,
\]
where $k = |\eta - \eta'|_{x,\ell}$ and the sequence of configurations $\eta^{(i)}$ is a site-by-site interpolation of the differences between $\eta$ and $\eta'$ in $\partial B_{x,\ell}$. (It suffices to interpolate only over the differences in $\partial B_{x,\ell}$ since the measure $\mu_{B_{x,\ell}}$ depends only on the configuration in $\partial B_{x,\ell}$ and since $\eta$ and $\eta'$ agree on the parent of $x$.) It is now enough to show that $\|\mu^{\eta}_{B_{x,\ell}} - \mu^{\eta^{w}}_{B_{x,\ell}}\|_x \le \gamma^{\ell}$ for all $\eta$ and $w \in \partial B_{x,\ell}$. This, however, follows by a coupling argument as before, where this time we couple recursively along the path from $w$ to $x$ (i.e., up the tree). Specifically, suppose by induction that in our coupling there is already
a path of disagreement going from $w$ to $y$, where $y$ is some site on the path from $w$ to $x$. Let $z$ denote the parent of $y$. At the next step we choose a coupled pair of spins at $z$ from the two distributions $\mu^{\eta}_A$ and $\mu^{\eta^y}_A$ (using an optimal coupling for the projections onto the spin at $z$), where the subset $A$ is $B_{x,\ell}$ excluding the path from $w$ to $y$. The probability of disagreement at $z$ given the disagreement at $y$ is then bounded by $\gamma$, by definition. If the resulting spins at $z$ agree then the spins on the rest of the path are coupled to agree with certainty, while if there is a disagreement at $z$ we continue recursively starting from the disagreement at $z$. Since the path from $w$ to $x$ has $\ell$ steps, we conclude that the probability of disagreement at $x$ in the resulting coupling is at most $\gamma^{\ell}$, as required.

Remark. We emphasize that Theorem 4.3 is not specific to the Ising model and generalizes to arbitrary nearest-neighbor models on a tree. Although we used the fact that the Ising model has only two possible spin values, the proof can easily be generalized to more than two spin values at the cost of a factor $\frac{1}{p_{\min}}$ in front of $(\gamma\kappa b)^{\ell}$ in VM, where $p_{\min}$ is the minimum probability of any spin value as defined just before Theorem 3.4. Thus, since Theorem 3.2 also applies to general nearest-neighbor spin systems on a tree, we conclude that the implication from $\gamma\kappa b < 1$ to a bounded $c_{\mathrm{gap}}(\mu)$ holds for any such system (with the definitions of $\kappa$ and $\gamma$ extended in the obvious way to systems with more than two spin values). The details can be found in the companion paper [32].

4.2. Proof of Theorem 4.1. In this section we go back to proving Theorem 4.1. Using Theorem 4.3, all we need to do for the given choices of the Ising model parameters is to bound $\kappa$ and $\gamma$ as in Definition 4.2 such that $\gamma\kappa b < 1$. In contrast to Secs. 3 and 4.1, which apply to general nearest-neighbor spin systems on trees, here the calculations are specific to the Ising model.
For both $\kappa$ and $\gamma$, we need to bound a quantity of the form $\|\mu^{\eta}_A - \mu^{\eta^y}_A\|_z$, where $y \in \partial A$ and $z \in A$ is a neighbor of $y$. The key observation is that this quantity can be expressed very cleanly in terms of the “magnetization” at $z$, i.e., the ratio of probabilities of a (−)-spin and a (+)-spin at $z$. It will actually be convenient to work with the magnetization without the influence of the neighbor $y$: thus we let $\mu^{\eta,y=*}_A$ denote the Gibbs distribution with boundary condition $\eta$, except that the spin at $y$ is free (or equivalently, the edge connecting $z$ to $y$ is erased). We then have:

Proposition 4.5. For any subset $A \subseteq T$, any boundary configuration $\eta$, any site $y \in \partial A$ and any neighbor $z \in A$ of $y$, we have
\[
\|\mu^{\eta}_A - \mu^{\eta^y}_A\|_z = K_{\beta}(R), \qquad \text{where } R = \frac{\mu^{\eta,y=*}_A(\sigma_z = -)}{\mu^{\eta,y=*}_A(\sigma_z = +)}
\]
and the function $K_{\beta}$ is defined by
\[
K_{\beta}(a) = \frac{1}{e^{-2\beta}a + 1} - \frac{1}{e^{2\beta}a + 1}.
\]
Proof. First, w.l.o.g. we may assume that the edge between $y$ and $z$ is the only one connecting $y$ to $A$; this is because a tree has no cycles, so once the spin at $y$ is fixed $A$ decomposes into disjoint components that are independent. We also assume w.l.o.g. that the spin at $y$ is (+) in $\eta$, and we abbreviate $\mu^{\eta}_A$ and $\mu^{\eta^y}_A$ to $\mu^{+}_A$ and $\mu^{-}_A$ respectively, and also $\mu^{\eta,y=*}_A$ to $\mu^{*}_A$. Thus $\|\mu^{\eta}_A - \mu^{\eta^y}_A\|_z = |\mu^{+}_A(\sigma_z = +) - \mu^{-}_A(\sigma_z = +)|$, and $R = \frac{\mu^{*}_A(\sigma_z = -)}{\mu^{*}_A(\sigma_z = +)}$. We write $R^{+}$ for $\frac{\mu^{+}_A(\sigma_z = -)}{\mu^{+}_A(\sigma_z = +)}$ and $R^{-}$ for $\frac{\mu^{-}_A(\sigma_z = -)}{\mu^{-}_A(\sigma_z = +)}$. Since the only influence of $y$ on $A$ is through $z$, we have $R^{+} = e^{-2\beta}R$ and $R^{-} = e^{2\beta}R$. The proposition now follows once we notice that, by definition of $R^{+}$ and $R^{-}$, $\mu^{+}_A(\sigma_z = +) = \frac{1}{R^{+} + 1}$ and $\mu^{-}_A(\sigma_z = +) = \frac{1}{R^{-} + 1}$.
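As a sanity check on Proposition 4.5, the following minimal Python sketch enumerates a toy instance by brute force and compares the variation distance at $z$ with $K_\beta(R)$. The instance is an assumption made only for illustration: $A = \{z, c_1, c_2\}$ with $c_1, c_2$ the children of $z$, a boundary neighbor $y$ above $z$, and hypothetical fixed leaf spins below each child. The weight $\exp(\beta \sum \sigma_u\sigma_v + \beta h \sum \sigma_v)$ follows the Ising convention used in this paper; the function names are not from the source.

```python
import itertools
import math


def k_beta(beta, a):
    """K_beta(a) = 1/(e^{-2 beta} a + 1) - 1/(e^{2 beta} a + 1)."""
    return 1.0 / (math.exp(-2.0 * beta) * a + 1.0) - 1.0 / (math.exp(2.0 * beta) * a + 1.0)


def prob_z_plus(beta, h, y_spin, leaves):
    """P(sigma_z = +1) on the toy set A = {z, c1, c2}, by enumeration.

    y_spin is the fixed spin of the boundary neighbor y of z;
    y_spin=None erases the edge y-z (the "free y" measure).
    leaves[i] lists the fixed boundary spins attached to child c_i
    (hypothetical toy data, not taken from the paper).
    """
    plus_weight = 0.0
    total_weight = 0.0
    for z, c1, c2 in itertools.product((+1, -1), repeat=3):
        # Sum of sigma_u * sigma_v over all edges touching A.
        pair_sum = z * c1 + z * c2 + c1 * sum(leaves[0]) + c2 * sum(leaves[1])
        if y_spin is not None:
            pair_sum += y_spin * z
        weight = math.exp(beta * pair_sum + beta * h * (z + c1 + c2))
        total_weight += weight
        if z == +1:
            plus_weight += weight
    return plus_weight / total_weight


def proposition_gap(beta, h, leaves):
    """Variation distance at z minus K_beta(R); should vanish up to rounding."""
    distance = abs(prob_z_plus(beta, h, +1, leaves) - prob_z_plus(beta, h, -1, leaves))
    p_free = prob_z_plus(beta, h, None, leaves)
    ratio = (1.0 - p_free) / p_free  # R = P*(sigma_z = -) / P*(sigma_z = +)
    return distance - k_beta(beta, ratio)
```

Calling `proposition_gap(1.2, -0.4, ((1, -1), (-1, -1)))`, for instance, should return a value that is zero up to floating-point rounding.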
Now it is easy to check that $K_{\beta}(a)$ is an increasing function in the interval $[0, 1]$, decreasing in the interval $[1, \infty)$, and is maximized at $a = 1$. Therefore, we can always bound $\kappa$ and $\gamma$ from above by $K_{\beta}(1) = \frac{e^{\beta} - e^{-\beta}}{e^{\beta} + e^{-\beta}}$. Indeed, for $\gamma$ we must make do with this crude bound because it has to hold for any boundary configuration $\eta$ and we cannot hope to gain by controlling the magnetization $R$. However, as we shall see, for $\kappa$ we can do better in some cases by computing the magnetization at the root; when this differs from 1 we get a better bound than $K_{\beta}(1)$.

We are now ready to proceed to the proof of Theorem 4.1:

(i) Arbitrary boundary conditions. Here, the boundary condition $\tau$ is arbitrary and we first consider the (easy) case when $\beta < \beta_0$ or $|h| > h_c(\beta)$ (i.e., $h$ is super-critical). In this case we do not need to resort to the calculation of $\kappa$ and $\gamma$. As discussed in the Introduction, in this regime there is a unique infinite volume Gibbs measure, so certainly the variation distance at the root $\max_{\eta,\eta'} \|\mu^{\eta}_{B_{x,\ell}} - \mu^{\eta'}_{B_{x,\ell}}\|_x$ goes to zero as $\ell$ increases. In fact, it is not too difficult to see that in the above regime this variation distance goes to zero exponentially fast, which directly implies the desired exponential decay of correlations (VM) by plugging the bound on the variation distance into expression (15) in the proof of Theorem 4.3.

We go on to consider the more interesting regime when $\beta_0 \le \beta < \beta_1$ (i.e., intermediate temperatures) and the external field $h$ is arbitrary. Here we use the fact that $\kappa \le \gamma \le K_{\beta}(1)$. We then certainly have $\gamma\kappa b < 1$ whenever $K_{\beta}(1) = \frac{e^{\beta} - e^{-\beta}}{e^{\beta} + e^{-\beta}} < \frac{1}{\sqrt{b}}$, i.e., whenever $e^{-2\beta} > \frac{\sqrt{b} - 1}{\sqrt{b} + 1}$. From the definition of $\beta_1$ (see Sec. 1.1), this corresponds precisely to $\beta < \beta_1$. (Observe how this non-trivial result drops out immediately from our machinery, as expressed in the condition $\gamma^2 < \frac{1}{b}$.) This completes the verification of Theorem 4.1 part (i).

(ii) (+)-boundary condition. We now assume that $\tau$ is the all-(+) configuration and consider arbitrary $\beta$ and $h$. For convenience, we assume $h \ge -h_c(\beta)$ since the case $|h| > h_c(\beta)$ was covered in part (i) for all boundary conditions $\tau$. The important property of the regime $h \ge -h_c(\beta)$ is that, for the (+)-boundary, the spin at the root is at least as likely to be (+) as it is to be (−). We will show that $\gamma\kappa b < 1$ throughout this regime. Recall that we already showed that $\gamma \le K_{\beta}(1) < 1$ for all finite $\beta$. It is therefore enough to show that $\kappa \le \frac{1}{b}$.

To calculate $\kappa$, we need to bound the variation distance $\|\mu^{+}_{T_z} - \mu^{-}_{T_z}\|_z$, which by Proposition 4.5 is equal to $K_{\beta}(R_z)$, where $R_z = \frac{\mu^{*}_{T_z}(\sigma_z = -)}{\mu^{*}_{T_z}(\sigma_z = +)}$ and $\mu^{*}_{T_z}$ is the Gibbs distribution over the subtree $T_z$ when it is disconnected from the rest of $T$ and the spins on its bottom boundary agree with $\tau$. We thus have $\kappa = \sup_T \max_{z \in T} K_{\beta}(R_z)$. The final ingredient we need is a recursive computation of the magnetization $R_z$, the details of which (up to change of variables) can be found in [2] or [5]. Let $y \prec z$ denote that $y$ is a child of $z$. A simple direct calculation gives that $R_z = e^{-2\beta h} \prod_{y \prec z} F(R_y)$, where
\[
F(a) \equiv F_{\beta}(a) = \frac{a + e^{-2\beta}}{e^{-2\beta}a + 1}.
\]
In particular, if $z$ is any site on the bottom-most level
of $T$, then since the spins of the children of $z$ are all set deterministically to (+), we get that $R_z = e^{-2\beta h}[F(0)]^{b}$. We thus define
\[
J(a) \equiv J_{\beta,h}(a) = e^{-2\beta h}[F(a)]^{b} \tag{16}
\]
and observe that, for any $z \in T$, $R_z = J^{(\ell)}(0)$, where $J^{(\ell)}$ stands for the $\ell$-fold composition of $J$, and $\ell$ is the distance of $z$ from the bottom boundary of $T$.

We now describe some properties of $J$ that we use (refer to Fig. 2): $J$ is continuous and increasing on $[0, \infty)$, with $J(0) = e^{-2\beta(h+b)} > 0$ and $\sup_a J(a) = e^{-2\beta(h-b)} < \infty$. This immediately implies that $J$ has at least one fixed point in $[0, \infty)$; we denote by $a_0$ the least fixed point. Since $a_0$ is the least fixed point and $J(0) > 0$ then clearly $J'(a_0) \le 1$, where $J'(a) \equiv \frac{\partial J(a)}{\partial a}$ is the derivative of $J$. We also note that $a_0 \le 1$ when $h \ge -h_c(\beta)$, which corresponds to the fact that for the (+)-boundary and the above regime of $h$, the spin at the root is at least as likely to be (+) as (−).

Now, since $J$ is monotonically increasing and $a_0$ is the least fixed point of $J$, clearly $J^{(\ell)}(0)$ converges to $a_0$ from below, i.e., $R_z \le a_0$ for every $z \in T$. Thus, since $a_0 \le 1$ for $h \ge -h_c(\beta)$, and the function $K_{\beta}(a)$ is monotonically increasing in the interval $[0, 1]$, $K_{\beta}(R_z) \le K_{\beta}(a_0)$ for every $z \in T$. What remains to be shown is that $K_{\beta}(a_0) \le \frac{1}{b}$. This follows from the fact that $J'(a_0) \le 1$, together with the following lemma:

Lemma 4.6. Let $a_0$ be any fixed point of $J$. Then $K_{\beta}(a_0) = \frac{1}{b} \cdot J'(a_0)$.
Proof. From the definitions of $J$ and $F$ we have:
\begin{align*}
J'(a_0) &= e^{-2\beta h} \cdot b \cdot [F(a_0)]^{b-1} F'(a_0) \\
&= b \cdot J(a_0) \cdot \frac{F'(a_0)}{F(a_0)} \\
&= b \cdot a_0 \cdot \frac{F'(a_0)}{F(a_0)} \\
&= b \cdot a_0 \cdot \frac{1 - e^{-4\beta}}{(a_0 + e^{-2\beta})(e^{-2\beta}a_0 + 1)} \\
&= b \cdot K_{\beta}(a_0).
\end{align*}

This completes the verification of Theorem 4.1 part (ii).

[Figure 2, panels (i)-(iii): plots of $J(a)$ against $a$, with the points $a_0$ and 1 marked on the $a$-axis.]

Fig. 2. Curve of the function $J(a)$, used in the proof of Theorem 4.1, for $\beta > \beta_0$ and various values of the external field $h$. (i) $h = -h_c(\beta)$; (ii) $h_c(\beta) > h > -h_c(\beta)$; (iii) $h > h_c(\beta)$. The point $a_0$ is the smallest fixed point of $J$
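The recursion behind part (ii) can be explored numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it iterates $J$ from 0 to approximate the least fixed point $a_0$ (which the text says is approached from below), then checks the identity of Lemma 4.6 and the resulting bound $K_\beta(a_0) \le 1/b$. The function names and the sample parameters are hypothetical; the iteration count is chosen generously and would converge slowly near $h = -h_c(\beta)$, where $J'(a_0)$ approaches 1.

```python
import math


def f_beta(beta, a):
    """F(a) = (a + e^{-2 beta}) / (e^{-2 beta} a + 1)."""
    e = math.exp(-2.0 * beta)
    return (a + e) / (e * a + 1.0)


def j_map(beta, h, b, a):
    """J(a) = e^{-2 beta h} [F(a)]^b, as in (16)."""
    return math.exp(-2.0 * beta * h) * f_beta(beta, a) ** b


def j_prime(beta, h, b, a):
    """Analytic derivative J'(a) = e^{-2 beta h} b F(a)^{b-1} F'(a)."""
    e = math.exp(-2.0 * beta)
    f_prime = (1.0 - e * e) / (e * a + 1.0) ** 2
    return math.exp(-2.0 * beta * h) * b * f_beta(beta, a) ** (b - 1) * f_prime


def k_beta(beta, a):
    """K_beta(a) = 1/(e^{-2 beta} a + 1) - 1/(e^{2 beta} a + 1)."""
    return 1.0 / (math.exp(-2.0 * beta) * a + 1.0) - 1.0 / (math.exp(2.0 * beta) * a + 1.0)


def least_fixed_point(beta, h, b, iterations=2000):
    """Approximate a_0 by iterating J from 0 (monotone convergence from below)."""
    a = 0.0
    for _ in range(iterations):
        a = j_map(beta, h, b, a)
    return a


# Sample parameters (an assumption for illustration): binary tree, zero field.
A0 = least_fixed_point(1.0, 0.0, 2)
```

With these sample values, `k_beta(1.0, A0)` and `j_prime(1.0, 0.0, 2, A0) / 2` agree up to rounding, which is the statement of Lemma 4.6 at this fixed point.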
5. Verifying Spatial Mixing for log-Sobolev

In this section we will prove a uniform lower bound (independent of $n$) on the logarithmic Sobolev constant $c_{\mathrm{sob}}(\mu)$ in all the situations covered by Theorem 1.2 in the Introduction. In light of Theorem 3.4, to show $c_{\mathrm{sob}} = \Omega(1)$ we need only prove the validity of the Entropy Mixing condition $\mathrm{EM}(\ell, [(1 - \delta)p_{\min}/2(\ell + 1 - \delta)]^2)$ for some constants $\ell$ and $\delta$ independent of the size of $T$.

In order to establish EM in the situations covered by Theorem 1.2, we extend the coupling framework developed in Sect. 4.1. As before, we will use a condition on the constants $\kappa$ and $\gamma$, which were defined in Sect. 4.1. In fact, the condition on $\kappa$ and $\gamma$ for establishing EM is practically the same as the one that was used to establish VM, which immediately transfers our $\Omega(1)$ bound on $c_{\mathrm{gap}}$ for the relevant parameters to an $\Omega(1)$ bound on $c_{\mathrm{sob}}$ for the same choice of parameters. The main result of this section is the following relationship between $(\kappa, \gamma)$ and EM.

Theorem 5.1. Any Gibbs distribution $\mu = \mu_T^{\tau}$ satisfies $\mathrm{EM}(\ell, c(\gamma\alpha)^{\ell/5})$ for all $\ell$, where $\alpha = \max\{\kappa b, 1\}$, $\kappa$ and $\gamma$ are the constants associated with the sequence $\mu_T^{\tau}$ as specified in Definition 4.2, and $c$ is a constant that depends only on $(b, \beta, h)$. In particular, if $\max\{\gamma\kappa b, \gamma\} < 1$ then there exists a constant $\vartheta$ such that, for every $T$, the measure $\mu = \mu_T^{\tau}$ satisfies $\mathrm{EM}(\ell, ce^{-\vartheta\ell})$ for all $\ell$, and hence $c_{\mathrm{sob}}(\mu) = \Omega(1)$.

Remark. We note that the above theorem, like its counterpart for the spectral gap, holds for any spin system on a tree (with the definitions of $\kappa$ and $\gamma$ generalized appropriately). See the companion paper [32] for details.

Since in Sec. 4.2 we have already calculated $\kappa$ and $\gamma$ for the regimes of interest and shown that in both cases $\max\{\gamma\kappa b, \gamma\} < 1$, we have:

Corollary 5.2. In both of the following situations, $c_{\mathrm{sob}}(\mu) = \Omega(1)$:
(i) $\tau$ is arbitrary, and either $\beta < \beta_1$ (with $h$ arbitrary), or $|h| > h_c(\beta)$ (with $\beta$ arbitrary);
(ii) $\tau$ is the (+)-boundary condition and $\beta, h$ are arbitrary.

This completes the proof of our second main result, Theorem 1.2 stated in the Introduction.

The first step in proving Theorem 5.1 is a reduction of EM to a certain strong concentration property of $\mu$, the Gibbs measure under consideration. We believe that this concentration property, as well as its connection to EM, may be of independent interest. The statement of this property and the reduction of EM to it is the content of Sec. 5.1. Then, in Sec. 5.2, we complete the proof of Theorem 5.1 by relating the strong concentration property to $\kappa$ and $\gamma$.

It is worth mentioning that we are also able to establish a general (but cruder) bound on $c_{\mathrm{sob}}$ as a function of $c_{\mathrm{gap}}$. Specifically, we can show that $c_{\mathrm{sob}} = \Omega(1/\log n) \times c_{\mathrm{gap}}$. Although we do not need this bound in this paper, we present it in Sec. 5.3 for future reference since its proof is simple and short.
5.1. Establishing EM via a strong concentration property. In this subsection we reduce EM to a certain strong concentration property of $\mu$. In the next subsection, we will then establish this strong concentration property as a function of $\kappa$ and $\gamma$ in order to prove Theorem 5.1. For simplicity and without loss of generality, we will analyze the entropy mixing condition only for $T_x = T$ (the whole tree), with root $r$.

Let $\mu^{+}_{\widetilde{T}}$ and $\mu^{-}_{\widetilde{T}}$ denote the Gibbs distributions on $T$ with the spin at the root $r$ set to (+) and (−) respectively (the boundary condition on the leaves of $T$ being specified by $\tau$). Define
\[
g_{+}(\sigma) = \frac{\mu^{+}_{\widetilde{T}}(\sigma)}{\mu(\sigma)} =
\begin{cases}
1/p & \text{if } \sigma_r = (+), \\
0 & \text{otherwise,}
\end{cases}
\]
where $p = \mu(\sigma_r = +)$. The key quantity we will work with in the sequel is the following:
\[
g_{+}^{(\ell)} = \mu_{B_{r,\ell}}(g_{+}).
\]
Note that $g_{+}^{(\ell)}(\sigma)$ depends only on the spins in $\partial B_{r,\ell}$. Indeed, let $\sigma_{r,\ell}$ stand for the restriction of $\sigma$ to $\partial B_{r,\ell}$, i.e., to the sites at distance $\ell$ below $r$. It is easy to verify that $g_{+}^{(\ell)}(\sigma)$ is equal to $\frac{\mu^{+}_{\widetilde{T}}(\sigma_{r,\ell})}{\mu(\sigma_{r,\ell})}$. Thus, for a given configuration $\sigma$, $g_{+}^{(\ell)}(\sigma)$ is the ratio of the probabilities of seeing the spins of $\sigma$ at level $\ell$ below the root $r$ when the spin at $r$ is (+) and when there is no condition on the spin at $r$, respectively. We define $g_{-}$ and $g_{-}^{(\ell)}$ in an analogous way.

The role played by the functions $g_{+}^{(\ell)}$ and $g_{-}^{(\ell)}$ is embodied in the following theorem, which says that if these functions are sufficiently tightly concentrated around their common mean value of 1 then the entropy mixing condition EM holds.

Theorem 5.3. There exists a constant $c$ (depending only on $b$, $\beta$ and $h$) such that, for any $\delta \ge 0$, if
\[
\mu\bigl[\,|g_s^{(\ell)} - 1| > \delta\,\bigr] \le e^{-2/\delta} \tag{17}
\]
for $s \in \{+, -\}$, then we have $\mathrm{Ent}[\mu_{\widetilde{T}}(f)] \le c\delta\, \mathrm{Ent}(f)$ for any non-negative function $f$ that does not depend on $B_{r,\ell}$; in particular, $\mathrm{EM}(\ell, c\delta)$ holds.

Proof. Fix $\ell < m$ and a non-negative function $f$ that does not depend on the spins inside the block $B_{r,\ell}$. Since $\mathrm{Ent}(f) \le \mathrm{Var}(f)/\mu(f)$ for every non-negative function $f$ (see, e.g., [37]) then
\begin{align*}
\mathrm{Ent}[\mu_{\widetilde{T}}(f)] &\le \frac{\mathrm{Var}[\mu_{\widetilde{T}}(f)]}{\mu(\mu_{\widetilde{T}}(f))} \\
&= \frac{1}{\mu(f)} \cdot \Bigl[ p\bigl(\mu^{+}_{\widetilde{T}}(f) - \mu(f)\bigr)^2 + (1 - p)\bigl(\mu^{-}_{\widetilde{T}}(f) - \mu(f)\bigr)^2 \Bigr] \\
&= \frac{1}{\mu(f)} \cdot \Bigl[ p\,\mathrm{Cov}(g_{+}, f)^2 + (1 - p)\,\mathrm{Cov}(g_{-}, f)^2 \Bigr] \\
&\le \max_{s \in \{+,-\}} \frac{\mathrm{Cov}(g_s, f)^2}{\mu(f)}, \tag{18}
\end{align*}
where Cov denotes covariance w.r.t. $\mu$. Now observe that, since $f$ does not depend on $B_{r,\ell}$, when computing the covariance term in (18) the function $g_s$ can be replaced by $g_s^{(\ell)}$, which depends only on the spins in $\partial B_{r,\ell}$. Thus, if we can show that (17) implies
\[
\mathrm{Cov}(g_s^{(\ell)}, f)^2 \le c\delta\,\mu(f)\,\mathrm{Ent}(f) \tag{19}
\]
for some constant $c$, then by plugging (19) into (18) we will get that $\mathrm{Ent}[\mu_{\widetilde{T}}(f)] \le c\delta\,\mathrm{Ent}(f)$, as required.
To establish (19) we make use of the following technical lemma, whose proof can be found in Sec. 7.

Lemma 5.4. Let $\{\Omega, \mathcal{F}, \nu\}$ be a probability space and let $f_1$ be a mean-zero random variable such that $\|f_1\|_{\infty} \le 1$ and $\nu[\,|f_1| > \delta\,] \le e^{-2/\delta}$ for some $\delta \in (0, 1)$. Let $f_2$ be a probability density w.r.t. $\nu$, i.e. $f_2 \ge 0$ and $\nu(f_2) = 1$. Then there exists a numerical constant $c' > 0$ independent of $\nu$, $f_1$, $f_2$ and $\delta$, such that $\nu(f_1 f_2)^2 \le c'\delta\,\mathrm{Ent}_{\nu}(f_2)$.

We apply this lemma with $\nu = \mu$ and
\[
f_1 = \frac{g_s^{(\ell)} - 1}{\|g_s^{(\ell)}\|_{\infty}}; \qquad f_2 = \frac{f}{\mu(f)},
\]
to deduce $\mathrm{Cov}(g_s^{(\ell)}, f)^2 \le c'\delta\,\|g_s^{(\ell)}\|_{\infty}^2\,\mu(f)\,\mathrm{Ent}(f)$. Noting also that $\|g_s^{(\ell)}\|_{\infty} \le \|g_s\|_{\infty} \le 1/p_{\min}$, where $p_{\min}$ was defined just before Theorem 3.4, this establishes (19) with $c = c'/p_{\min}^2$ and thus completes the proof of the theorem.
5.2. Proof of Theorem 5.1. In light of Theorem 5.3, to prove Theorem 5.1 it is sufficient to verify the strong concentration property (17) of the functions $g_s^{(\ell)}$ with $\delta = (\gamma\alpha)^{\ell/5}$. In order to do this we appeal to a strong concentration of the Hamming distance under the coupling $\nu$ of $\mu^{+}_{\widetilde{T}}$ and $\mu^{-}_{\widetilde{T}}$, as defined in the proof of Claim 4.4. Recall the notation used in that claim, and notice that the Hamming distance is dominated by the size of the population in the $\ell$th generation of a specific branching process. The following tail bound can be obtained using standard techniques from the analysis of branching processes, and we defer the proof to the end of this section.

Lemma 5.5. Let $\alpha = \max\{\kappa b, 1\}$. Then for every $C > 0$,
\[
\Pr_{\nu}\bigl[\,|\sigma - \sigma'|_{r,\ell} > C\alpha^{\ell}\,\bigr] \le e^{\frac{1}{\ell+1}\left(1 - \frac{C}{2e}\right)}.
\]

Corollary 5.6. For every $C > 0$ and $s \in \{+, -\}$,
\[
\Pr_{\nu}\bigl[\,|g_s^{(\ell)}(\sigma) - g_s^{(\ell)}(\sigma')| > C(\gamma\alpha)^{\ell}\,\bigr] \le e^{\frac{1}{\ell+1}\left(1 - \frac{p_{\min}C}{2e}\right)}.
\]

Proof. It is enough to show that
\[
|g_s^{(\ell)}(\sigma) - g_s^{(\ell)}(\sigma')| \le \frac{\gamma^{\ell}}{p_{\min}} \cdot |\sigma - \sigma'|_{r,\ell} \tag{20}
\]
since we can then apply Lemma 5.5 with $C$ replaced by $p_{\min}C$. On the other hand, (20) follows from part (ii) of Claim 4.4 once we recall that $g_s^{(\ell)}(\sigma) = \mu^{\sigma}_{B_{r,\ell}}(g_s)$ and that $g_s$ depends only on the spin at the root, implying that $|g_s^{(\ell)}(\sigma) - g_s^{(\ell)}(\sigma')| \le \|\mu^{\sigma}_{B_{r,\ell}} - \mu^{\sigma'}_{B_{r,\ell}}\|_r \cdot \|g_s\|_{\infty} \le \gamma^{\ell}\,|\sigma - \sigma'|_{r,\ell}/p_{\min}$.
Before we go on with the proof of Theorem 5.1, let us compare the way we used the constants $\kappa$ and $\gamma$ in the proof of Corollary 5.6 to the way we used them in the proof of Theorem 4.3. In both cases we used $\kappa$ and $\gamma$ to get bounds for coupling “down” and “up” the tree respectively. Specifically, we used $\kappa$ to deduce that the Hamming distance between the coupled configurations at the $\ell$th level is about $(\kappa b)^{\ell}$, and we then used $\gamma$ to bound the effect of each discrepancy at the $\ell$th level on the spin at the root (or equivalently, on $g_s^{(\ell)}$) by roughly $\gamma^{\ell}$. While in Theorem 4.3 it was enough that the average Hamming distance when coupling down the tree was bounded by $(\kappa b)^{\ell}$, here we need that this distance is not much larger than $(\kappa b)^{\ell}$ with high probability.

We now return to the proof of Theorem 5.1. W.l.o.g. we may assume that $\gamma\alpha \le 1$ since $\mathrm{EM}(\ell, 1)$ always holds, and also that $\gamma\alpha > 0$ since if $\gamma = 0$ then $\mathrm{EM}(\ell, 0)$ holds because then the spin at the root $r$ is independent of the rest of the configuration. Let $a = (\gamma\alpha)^{-1} \ge 1$. Recall that we wish to establish (17) with $\delta = a^{-\ell/5}$ for all large enough $\ell$. We will show only that
\[
\mu\bigl[\,g_s^{(\ell)} - 1 > \delta\,\bigr] \le \frac{1}{2}e^{-2/\delta} \tag{21}
\]
since the same bound on the negative tail can be achieved by an analogous argument. We start by applying Corollary 5.6 with $C = a^{\ell/4}$ to get that, for every $\varepsilon > 0$,
\[
\mu^{s}_{\widetilde{T}}\bigl[\,g_s^{(\ell)} - 1 > \varepsilon\,\bigr] \le \mu\bigl[\,g_s^{(\ell)} - 1 > \varepsilon - a^{-3\ell/4}\,\bigr] + A, \tag{22}
\]
where $A = e^{\frac{1}{\ell+1}\left(1 - \frac{p_{\min}a^{\ell/4}}{2e}\right)}$ and we have used the fact that $\mu$ is a convex combination of $\mu^{+}_{\widetilde{T}}$ and $\mu^{-}_{\widetilde{T}}$.

Next, we notice that by definition of $g_s^{(\ell)}$,
\[
\mu^{s}_{\widetilde{T}}\bigl[\,g_s^{(\ell)} - 1 > \varepsilon\,\bigr] \ge (1 + \varepsilon)\,\mu\bigl[\,g_s^{(\ell)} - 1 > \varepsilon\,\bigr]. \tag{23}
\]
Combining (22) and (23) we get that, for every $\varepsilon > 0$,
\[
\mu\bigl[\,g_s^{(\ell)} - 1 > \varepsilon\,\bigr] \le \frac{1}{1 + \varepsilon}\Bigl[\mu\bigl[\,g_s^{(\ell)} - 1 > \varepsilon - a^{-3\ell/4}\,\bigr] + A\Bigr]. \tag{24}
\]
This immediately yields that, for every non-negative integer $k$ and $\varepsilon > 0$,
\[
\mu\bigl[\,g_s^{(\ell)} - 1 > \varepsilon + ka^{-3\ell/4}\,\bigr] \le (1 + \varepsilon)^{-(k+1)} + \frac{1 + \varepsilon}{\varepsilon}A, \tag{25}
\]
where we applied (24) $k + 1$ times, each time increasing $\varepsilon$ by $a^{-3\ell/4}$. Inequality (21) then follows (assuming $\ell$ is large enough) by applying (25) with $\varepsilon = a^{-\ell/4}$ and $k = a^{\ell/2}$. This concludes the proof of Theorem 5.1.

Finally, we supply the missing proof of Lemma 5.5.

Proof of Lemma 5.5. First notice that, by an exponential Markov inequality, it is enough to show that $E_{\nu}\,e^{t|\sigma - \sigma'|_{r,\ell}} \le e^{2et\alpha^{\ell}}$ for all $t \le (2e(\ell + 1)\alpha^{\ell})^{-1} \le 1$. We thus fix $t$ as above and let $D_{x,i} = E_{\nu}\,e^{t|\sigma - \sigma'|_{x,i}}$, where $\nu$ is the coupling of $\mu^{+}_{\widetilde{T}_x}$ and $\mu^{-}_{\widetilde{T}_x}$. Note that $D_{x,i}$ can be calculated recursively as follows. The main observation is that, given a disagreement at $x$, the random variable $|\sigma - \sigma'|_{x,i}$ is the sum of the $b$ independent random variables $|\sigma - \sigma'|_{z,i-1}$, where $z$ ranges over the children of $x$. In turn, the random variable $e^{t|\sigma - \sigma'|_{z,i-1}}$ takes the value $D_{z,i-1}$ with probability at most $\kappa$ (the probability of a disagreement at $z$ given a disagreement at $x$) and the value 1 with the remaining
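probability (since $|\sigma - \sigma'|_{z,i-1} = 0$ if there is no disagreement at $z$). Thus, if we let $\delta_i = \max_x D_{x,i} - 1$, then $\delta_{i+1} \le [1 + \kappa\delta_i]^{b} - 1 \le e^{\kappa b\delta_i} - 1 \le e^{\alpha\delta_i} - 1$. We wish to show that, for $t$ in the above range, $\delta_{\ell} \le 2et\alpha^{\ell}$, which implies $E_{\nu}\,e^{t|\sigma - \sigma'|_{r,\ell}} \le \delta_{\ell} + 1 \le e^{2et\alpha^{\ell}}$, as required. In fact, we show by induction that $\delta_i \le 2t\bigl[\frac{\ell+1}{\ell} \cdot \alpha\bigr]^{i}$ for every $0 \le i \le \ell$. For the base case $i = 0$, notice that $|\sigma - \sigma'|_{x,0} = 1$ when starting from a fixed disagreement at $x$, so $\delta_0 = e^{t} - 1 \le 2t$ for $t$ in the given range. For $i + 1 > 0$, we use the fact that $\delta_{i+1} \le e^{\alpha\delta_i} - 1 \le \frac{\alpha\delta_i}{1 - \alpha\delta_i} \le \frac{\ell+1}{\ell} \cdot \alpha\delta_i$, since by the induction hypothesis $\delta_i \le \frac{1}{\alpha(\ell+1)}$ for all $0 \le i \le \ell - 1$ and $t$ in the given range.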
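The induction in the proof of Lemma 5.5 can be checked numerically. The following sketch is a minimal illustration, not the authors' code: it iterates the worst case of the recursion, $\delta_{i+1} = (1 + \kappa\delta_i)^b - 1$ with $\delta_0 = e^t - 1$, at the largest admissible $t = (2e(\ell+1)\alpha^\ell)^{-1}$, and compares the result with the inductive bound $2t[\frac{\ell+1}{\ell}\alpha]^i$ and the target bound $2et\alpha^\ell$. Function names and sample values of $\kappa$, $b$, $\ell$ are assumptions chosen for illustration.

```python
import math


def alpha_of(kappa, b):
    """alpha = max{kappa * b, 1}, as in Lemma 5.5."""
    return max(kappa * b, 1.0)


def max_t(kappa, b, ell):
    """Largest t allowed in the proof: (2 e (ell + 1) alpha^ell)^{-1}."""
    return 1.0 / (2.0 * math.e * (ell + 1) * alpha_of(kappa, b) ** ell)


def delta_sequence(kappa, b, ell, t):
    """Worst case of the recursion: delta_0 = e^t - 1, delta_{i+1} = (1 + kappa delta_i)^b - 1."""
    deltas = [math.exp(t) - 1.0]
    for _ in range(ell):
        deltas.append((1.0 + kappa * deltas[-1]) ** b - 1.0)
    return deltas


def inductive_bound(kappa, b, ell, t, i):
    """Bound from the induction: 2 t [((ell + 1) / ell) * alpha]^i."""
    return 2.0 * t * ((ell + 1) / ell * alpha_of(kappa, b)) ** i


def target_bound(kappa, b, ell, t):
    """Bound needed for the exponential Markov step: 2 e t alpha^ell."""
    return 2.0 * math.e * t * alpha_of(kappa, b) ** ell


def tail_bound(c, ell):
    """Right-hand side of Lemma 5.5: exp((1 - C / (2e)) / (ell + 1))."""
    return math.exp((1.0 - c / (2.0 * math.e)) / (ell + 1))
```

For example, `delta_sequence(0.8, 3, 10, max_t(0.8, 3, 10))[-1]` stays below `target_bound(0.8, 3, 10, max_t(0.8, 3, 10))`, which equals $1/(\ell+1)$ at the largest admissible $t$.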
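5.3. A crude bound on log-Sobolev via the spectral gap. In this section we state and prove a general bound on $c_{\mathrm{sob}}$ using a bound on $c_{\mathrm{gap}}$. Although we do not require this bound for the results in this paper, we believe that it may find applications in the future. We state the bound for the Ising model, but it can be easily verified that it generalizes to any nearest-neighbor spin system on a tree.

Theorem 5.7. For the Ising model on the $b$-ary tree, $c_{\mathrm{sob}}(\mu) = c_{\mathrm{gap}}(\mu) \times \Omega(1/\log n)$. In particular, if $c_{\mathrm{gap}}(\mu) = \Omega(1)$ then $c_{\mathrm{sob}}(\mu) = \Omega(1/\log n)$.

It is useful to compare this bound with the well-known bound $c_{\mathrm{sob}}(\mu) = c_{\mathrm{gap}}(\mu) \times \Omega(1/n)$ (see, e.g., [37]), which though much weaker is also more general (for example, it applies to spin systems on any graph). Theorem 5.7 is a consequence of the following lemma.

Lemma 5.8. For any $\beta$ and $h$, there exists a constant $c = c(b, \beta, h)$ such that, for any $x \in T$ and all $\ell$,
\[
c_{\mathrm{sob}}(\mu^{\tau}_{T_x})^{-1} \le \max_{y \prec x,\ \eta \in \Omega^{\tau}_{T}} \{c_{\mathrm{sob}}(\mu^{\eta}_{T_y})^{-1}\} + c \cdot c_{\mathrm{gap}}(\mu^{\tau}_{T_x})^{-1}. \tag{26}
\]

This lemma immediately implies Theorem 5.7, once we notice that $c_{\mathrm{gap}}(\mu^{\eta}_{T_x}) \ge c' \cdot c_{\mathrm{gap}}(\mu^{\tau}_{T})$ for a constant $c' \equiv c'(b, \beta, h)$ and every $x \in T$ and $\eta \in \Omega^{\tau}_{T}$, as can easily be checked.

Proof of Lemma 5.8. For simplicity and w.l.o.g. we will prove the recursive inequality (26) only for $T_x = T$ (the whole tree), with root $r$. Let $f$ be a non-negative function. We then write (using the entropy version of (2))
\[
\mathrm{Ent}(f) = \mu[\mathrm{Ent}_{\widetilde{T}}(f)] + \mathrm{Ent}[\mu_{\widetilde{T}}(f)]. \tag{27}
\]
Using the definition of $c_{\mathrm{sob}}$ we have
\begin{align*}
\mu[\mathrm{Ent}_{\widetilde{T}}(f)] &\le \max_{y \prec r,\ \eta \in \Omega^{\tau}_{T}} \{c_{\mathrm{sob}}(\mu^{\eta}_{T_y})^{-1}\} \sum_{x \in \widetilde{T}} \mu[\mathrm{Var}_{\{x\}}(\sqrt{f})] \\
&\le \max_{y \prec r,\ \eta \in \Omega^{\tau}_{T}} \{c_{\mathrm{sob}}(\mu^{\eta}_{T_y})^{-1}\}\, \mathcal{D}(\sqrt{f}). \tag{28}
\end{align*}
The second term on the r.h.s. of (27), being the entropy of a Bernoulli random variable, is bounded above by
\begin{align*}
\mathrm{Ent}[\mu_{\widetilde{T}}(f)] &\le \alpha\,\mathrm{Var}\bigl[\sqrt{\mu_{\widetilde{T}}(f)}\bigr] \tag{29} \\
&\le \alpha\,\mathrm{Var}(\sqrt{f}) \le \alpha\, c_{\mathrm{gap}}(\mu)^{-1}\, \mathcal{D}(\sqrt{f}), \tag{30}
\end{align*}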
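where $\alpha \equiv \alpha(p)$ is a constant that depends on $p = \mu(\sigma_r = +)$; specifically $\alpha(p) = \frac{\log(p/(1-p))}{2p - 1}$ for $p \ne 1/2$, and $\alpha(1/2) = 2$, the limiting value of this expression as $p \to 1/2$ (see [37]). Putting together (28) and (30), the expression in (27) is bounded above by
\[
\Bigl[\max_{y \prec r,\ \eta \in \Omega^{\tau}_{T}} \{c_{\mathrm{sob}}(\mu^{\eta}_{T_y})^{-1}\} + \alpha\, c_{\mathrm{gap}}(\mu)^{-1}\Bigr]\, \mathcal{D}(\sqrt{f}),
\]
so that from the definition of $c_{\mathrm{sob}}$ we have
\[
c_{\mathrm{sob}}(\mu)^{-1} \le \max_{y \prec r,\ \eta \in \Omega^{\tau}_{T}} \{c_{\mathrm{sob}}(\mu^{\eta}_{T_y})^{-1}\} + \alpha\, c_{\mathrm{gap}}(\mu)^{-1}.
\]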
6. Extensions to Other Models

As we have already indicated, our techniques extend beyond the Ising model to general nearest-neighbor interaction models on trees, including those with hard constraints. In this final section we mention some of these extensions. For a fuller treatment of this material, the reader is referred to the companion paper [32] and the PhD thesis of the last author [45].

A (nearest neighbor) spin system on a finite graph $G = (V, E)$ is specified by a finite set $S$ of spin values, a symmetric pair potential $U : S \times S \to \mathbb{R} \cup \{\infty\}$, and a singleton potential $W : S \to \mathbb{R}$. A configuration $\sigma \in S^{V}$ of the system assigns to each vertex (site) $v \in V$ a spin value $\sigma_v \in S$. The Gibbs distribution is given by
\[
\mu(\sigma) \propto \exp\Bigl[-\Bigl(\sum_{xy \in E} U(\sigma_x, \sigma_y) + \sum_{x \in V} W(\sigma_x)\Bigr)\Bigr].
\]
Thus the Ising model corresponds to the case $S = \{\pm 1\}$, and $U(s_1, s_2) = -\beta s_1 s_2$, $W(s) = -\beta h s$, where $\beta$ is the inverse temperature and $h$ is the external field. Note that setting $U(s_1, s_2) = \infty$ corresponds to a hard constraint, i.e., spin values $s_1, s_2$ are forbidden to be adjacent. We denote by $\Omega$ the set of all valid spin configurations, i.e., those for which $\mu(\sigma) > 0$.

As for the Ising model, we allow boundary conditions which fix the spin values of certain sites. We carry over our notation from the Ising model: thus, e.g., $\mu^{\tau}_A$ denotes the Gibbs distribution on a subset $A \subseteq V$ with boundary condition $\tau$ on $\partial A$. The (heat-bath) Glauber dynamics extends in the obvious way to general spin systems.

We first note that, as the reader may easily check, neither the spatial mixing conditions in Sec. 3 nor their proofs made any reference to the details of the Ising model. All of this material therefore carries over without modification to general spin systems on trees.

Theorem 6.1. The statements of Theorems 3.2 and 3.4 hold for general nearest-neighbor spin systems on trees.

Likewise, the machinery developed in Secs. 4 and 5 for verifying the conditions VM and EM also extends to general models, though the details of the calculations are model-specific. In particular, Theorems 4.3 and 5.1 relating VM and EM to the coupling quantities $\kappa$ and $\gamma$ of Definition 4.2 still hold (with very minor modifications). Thus all we need to do is to carry out the detailed calculations of $\kappa$ and $\gamma$ for the model under consideration. We now state without proof the results of these calculations for several models of interest. For the proofs, together with further discussion and extensions, the reader is referred to the companion paper [32] and the thesis [45].
6.1. The hard-core model (independent sets). In this model S = {0, 1}, and we refer to a site as occupied if it has spin value 1, and unoccupied otherwise. The potentials are U (1, 1) = ∞;
U (1, 0) = U (0, 0) = 1;
W (1) = L;
W (0) = 0,
where L ∈ R. The hard constraint here means that no two adjacent sites may be occupied, so can be identified with the set of all independent sets in G. Also, the aggregated potential of a valid configuration is proportional to the number of occupied sites. Hence the Gibbs distribution takes the simple form µ(σ ) ∝ λN(σ ) , where N(σ ) is the number of occupied sites and the parameter λ = exp(−L) > 0, which controls the density of occupation, is referred to as the “activity.” The hard-core model on a b-ary tree undergoes a phase transition at a critical activity bb λ = λ0 = (b−1) b+1 (see, e.g., [40, 24]). For λ ≤ λ0 there is a unique Gibbs measure regardless of the boundary condition on the leaves, while for λ > λ0 there are (at least) two distinct phases, corresponding to the “odd” and “even” boundary conditions respectively. The even boundary condition is obtained by making the leaves of the tree all occupied if the depth is even, and all unoccupied otherwise. The odd boundary condition is the complement of this. (These boundary conditions are derived from the two maximum-density configurations on the infinite tree Tb in which alternate levels — either odd or even — are completely occupied.) For λ > λ0 , the probability of occupation of the root in the infinite-volume Gibbs measure differs for odd and even boundary conditions. Relatively little is known about the Glauber dynamics for the hard-core model on trees, beyond the general result of Luby and Vigoda [28, 44] which ensures a mixing time of 2 O(log n) (after translation to our continuous time setting) when λ < b−1 . This result actually holds for any graph G of maximum degree b + 1. Our results for the Glauber dynamics in the hard-core model mirror those given earlier for the Ising model. First, for sufficiently small activity λ we show that both cgap and csob are uniformly bounded away from zero for arbitrary boundary conditions. 
Second, for even (or, symmetrically, odd) boundary conditions, we get the same result for all activities $\lambda$.

Theorem 6.2. For the hard-core model on the n-vertex b-ary tree with boundary condition $\tau$, $c_{\rm gap}(\mu)$ and $c_{\rm sob}(\mu)$ are $\Omega(1)$ in both of the following situations:
(i) $\tau$ is arbitrary, and $\lambda \le \max\bigl\{\frac{1}{\sqrt{b}-1}, \lambda_0\bigr\}$;
(ii) $\tau$ is even (or odd), and $\lambda \ge 0$ is arbitrary.

Part (ii) of this theorem is analogous to our earlier result for the Ising model with (+)-boundary and zero external field at all temperatures. This is in line with the intuition that the even boundary eliminates the only bottleneck in the dynamics. Part (i) identifies a region in which the mixing time is insensitive to the boundary condition. We would expect this to hold throughout the low-activity region $\lambda \le \lambda_0$, and indeed, by analogy with the Ising model, also in some intermediate region beyond this. Our bound in part (i) confirms this behavior: note that the quantity $\frac{1}{\sqrt{b}-1}$ exceeds $\lambda_0$ for all $b \ge 5$, and indeed for large b it scales as $\frac{1}{\sqrt{b}}$ compared to the $\frac{1}{b}$ scaling of $\lambda_0$. Thus for $b \ge 5$ we establish rapid mixing in a region above the critical value $\lambda_0$. To the best of our knowledge this is the first such result. (Note that the result of [28, 44] mentioned earlier establishes rapid
mixing only for $\lambda < \frac{2}{b-1}$, which is less than $\lambda_0$ for all b and so does not even cover the whole uniqueness region.) We should also mention that our coupling analysis of $c_{\rm gap}$ in this region has consequences for the infinite volume Gibbs measure itself, implying that when $\lambda \le \frac{1}{\sqrt{b}-1}$ any $\mu = \lim_{T \to \infty} \mu_T^{\tau}$ that is the limit of finite Gibbs distributions for some boundary configuration $\tau$ is extremal, again a new result. We elaborate on these points in the companion paper [32].
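The comparison between the three activity thresholds of this subsection can be checked numerically. The following is a minimal sketch, assuming the formulas as read from the text: $\lambda_0 = b^b/(b-1)^{b+1}$, the Luby and Vigoda bound $2/(b-1)$, and the bound $1/(\sqrt{b}-1)$ of Theorem 6.2(i). The function names are illustrative and do not come from the paper.

```python
import math


def critical_activity(b: int) -> float:
    """Critical activity lambda_0 = b^b / (b-1)^(b+1) on the b-ary tree (b >= 2)."""
    return b**b / (b - 1) ** (b + 1)


def luby_vigoda_bound(b: int) -> float:
    """Activity bound 2/(b-1) quoted from [28, 44]."""
    return 2.0 / (b - 1)


def arbitrary_boundary_bound(b: int) -> float:
    """Activity bound 1/(sqrt(b)-1) from part (i) of Theorem 6.2."""
    return 1.0 / (math.sqrt(b) - 1.0)


def threshold_table(b_values):
    """Return rows (b, 2/(b-1), lambda_0, 1/(sqrt(b)-1)) for the given branching factors."""
    return [
        (b, luby_vigoda_bound(b), critical_activity(b), arbitrary_boundary_bound(b))
        for b in b_values
    ]


if __name__ == "__main__":
    for b, lv, lam0, ours in threshold_table(range(2, 11)):
        print(f"b={b:2d}  2/(b-1)={lv:.4f}  lambda_0={lam0:.4f}  1/(sqrt(b)-1)={ours:.4f}")
```

Running the script prints one row per branching factor, so the crossover at b = 5 between $1/(\sqrt{b}-1)$ and $\lambda_0$ can be read off directly.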
6.2. The antiferromagnetic Potts model (colorings). In this model $S = \{1, 2, \dots, q\}$, and the potentials are $U(s_1, s_2) = \beta\delta_{s_1,s_2}$, $W(s) = 0$. This is the analog of the Ising model except that the interactions are antiferromagnetic, i.e., neighbors with unequal spins are favored. The most interesting case of this model is when $\beta = \infty$ (i.e., zero temperature), which introduces hard constraints. Thus if we think of the q spin values as colors, $\Omega$ is the set of proper colorings of G, i.e., assignments of colors to vertices so that no two adjacent vertices receive the same color. The Gibbs distribution is uniform over proper colorings. In this model it is q that provides the parameterization. For background on the model, see [8].

For colorings on the b-ary tree it is well known that, when $q \le b+1$, there are multiple Gibbs measures; this follows immediately from the existence of “frozen configurations,” i.e., colorings in which the color of every internal vertex is forced by the colors of the leaves (see, e.g., [8]). Recently Jonasson [22] proved that, as soon as $q \ge b + 2$, the Gibbs measure is unique. Moreover, it is known that there is again an “intermediate” region that includes the value $q = b + 1$, in which the Gibbs measure, while not unique, is insensitive to “typical” boundary conditions (chosen from the free measure); see [8].

The sharpest result known for the Glauber dynamics on colorings is due to Vigoda [43], who shows that for arbitrary boundary conditions the mixing time is $O(\log n)$ provided $q > \frac{11}{6}(b + 1)$. Actually this result holds for any n-vertex graph G of maximum degree $b + 1$. For graphs of large maximum degree and girth at least 6, this range was recently improved [13] to $q > \max\{1.489(b + 1), q_0\}$, where $q_0$ is an absolute constant.$^6$ Before we state our results for the Glauber dynamics on colorings, we wish to discuss a few issues regarding the connectedness of the dynamics in this model.
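To make the dynamics under discussion concrete, here is a minimal sketch of a single-site heat-bath update for proper colorings on a complete b-ary tree. It is a simplification and not the paper's exact setting: time is discrete rather than continuous, the boundary is free (no fixed colors below the leaves), and all names are illustrative.

```python
import random


def build_b_ary_tree(b, depth):
    """Return (neighbors, level_of) for the complete b-ary tree of the given depth; root is vertex 0."""
    neighbors = [[]]
    level_of = [0]
    frontier = [0]
    for level in range(1, depth + 1):
        next_frontier = []
        for parent in frontier:
            for _ in range(b):
                child = len(neighbors)
                neighbors.append([parent])
                neighbors[parent].append(child)
                level_of.append(level)
                next_frontier.append(child)
        frontier = next_frontier
    return neighbors, level_of


def is_proper(coloring, neighbors):
    """True if no two adjacent vertices share a color."""
    return all(coloring[v] != coloring[u] for v in range(len(neighbors)) for u in neighbors[v])


def glauber_step(coloring, neighbors, q, rng):
    """Heat-bath update: recolor a uniformly random vertex with a uniformly random allowed color."""
    v = rng.randrange(len(neighbors))
    forbidden = {coloring[u] for u in neighbors[v]}
    allowed = [c for c in range(q) if c not in forbidden]
    # 'allowed' is never empty: the current color of v is always allowed in a proper coloring.
    coloring[v] = rng.choice(allowed)


if __name__ == "__main__":
    rng = random.Random(0)
    b, depth, q = 2, 4, 5                        # q = b + 3, inside the regime of Theorem 6.3
    neighbors, level_of = build_b_ary_tree(b, depth)
    coloring = [level % 2 for level in level_of]  # alternate two colors by level: a proper start
    for _ in range(10_000):
        glauber_step(coloring, neighbors, q, rng)
    print("still proper:", is_proper(coloring, neighbors))
```

Because a move only ever picks a color unused by the neighbors, the chain never leaves the set of proper colorings; whether it can reach every proper coloring is exactly the connectedness question.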
It is well known and not too difficult to see that the dynamics is connected for $q \ge b + 3$ (on any graph of maximum degree $b + 1$). For $q \le b + 1$, there is at least one boundary condition $\eta$ for which the dynamics for $\mu_T^{\eta}$ is not connected. For the critical value $q = b + 2$, the situation is somewhat delicate: while the dynamics is connected for all boundary conditions when run on $T$, it is not connected for at least one boundary condition if we add a boundary site above $T$ (i.e., if the dynamics is run on $T_x$ for x not the root of $\mathbb{T}^b$). Since our arguments apply equally well to these settings, the smallest q for which we can hope to establish that the log-Sobolev constant is bounded below by a constant independent of n uniformly in the boundary condition is $q = b + 3$. Indeed, we establish this for the entire regime in which the Glauber dynamics is guaranteed to be connected, i.e., for $q \ge b + 3$. We note that this is the first result that establishes this fact for a non-trivial graph. (It has been conjectured that the dynamics mixes in $O(\log n)$ time for $q \ge b + 3$ on any graph of maximum degree $b + 1$.)

$^6$ A recent sequence of papers [12, 33, 18] has reduced the required number of colors further for general graphs, under the assumption that the maximum degree is $\Omega(\log n)$; the current state of the art requires $q \ge (1 + \epsilon)(b + 1)$ for arbitrarily small $\epsilon > 0$ [19]. However, these results do not apply in our setting where the degree $b + 1$ is fixed.

We also notice that, if the Glauber dynamics is
replaced by the heat-bath dynamics based on flipping edges, then the dynamics remains connected for $q = b + 2$ and all subtrees and boundary conditions. Our detailed results in fact imply that the log-Sobolev constant of the edge dynamics is bounded below by a constant independent of n uniformly in the boundary condition even at the critical value $q = b + 2$; notice that for this value of q there are boundary conditions for which the log-Sobolev constant of the (single-site) Glauber dynamics tends to zero as $n \to \infty$ even though this dynamics remains connected. (Again, we refer to [45] for details on the edge dynamics and the bound on the associated log-Sobolev constant.) Thus, we essentially establish rapid mixing uniformly in the boundary condition throughout the uniqueness regime.

Theorem 6.3. For the colorings model on the n-vertex b-ary tree with $q \ge b + 3$ and arbitrary boundary conditions, both $c_{\rm gap}(\mu)$ and $c_{\rm sob}(\mu)$ are $\Omega(1)$. Moreover, the entropy mixing condition $\mathrm{EM}(\ell, c\exp(-\vartheta\ell))$ holds for arbitrary boundary conditions for $q \ge b + 2$.

6.3. The ferromagnetic Potts model. Here we have $S = \{1, 2, \dots, q\}$ and potentials $U(s_1, s_2) = -\beta\delta_{s_1,s_2}$, $W(s) = 0$. This is a straightforward generalization of the (ferromagnetic) Ising model studied earlier in the paper, in which the spin at each site can take one of q possible values, and the aggregated potential of any configuration depends on the number of adjacent pairs of equal spins. There are no hard constraints. Qualitatively the behavior of this model is similar to that of the Ising model, though less is known in precise quantitative terms. Again there is a phase transition at a critical $\beta = \beta_0$, which depends on b and q, so that for $\beta > \beta_0$ (and indeed for $\beta \ge \beta_0$ when $q > 2$) there are multiple phases. This value $\beta_0$ does not in general have a closed form, but it is known [17] that $\beta_0 < \frac{1}{2}\ln\bigl(\frac{b+q-1}{b-1}\bigr)$ for all $q > 2$. (For $q = 2$, this value is exactly $\beta_0$ for the Ising model as quoted earlier.)
Using our techniques, we are able to prove the following:

Theorem 6.4. For the Potts model on the n-vertex b-ary tree, $c_{\rm gap}(\mu)$ and $c_{\rm sob}(\mu)$ are $\Omega(1)$ in all of the following situations:
(i) the boundary condition is arbitrary and $\beta < \max\bigl\{\beta_0, \frac{1}{2}\ln\bigl(\frac{\sqrt{b}+1}{\sqrt{b}-1}\bigr)\bigr\}$;
(ii) the boundary condition is constant (e.g., all sites on the boundary have spin 1) and $\beta$ is arbitrary;
(iii) the boundary is free (i.e., the boundary spins are unconstrained) and $\beta < \beta_1$, where $\beta_1$ is the solution to the equation $\frac{e^{2\beta_1}-1}{e^{2\beta_1}+q-1} \cdot \frac{e^{2\beta_1}-1}{e^{2\beta_1}+1} = \frac{1}{b}$.

Part (i) of this theorem shows that $c_{\rm gap}$ and $c_{\rm sob}$ are $\Omega(1)$ for arbitrary boundaries throughout the uniqueness region; also, since $\frac{1}{2}\ln\bigl(\frac{\sqrt{b}+1}{\sqrt{b}-1}\bigr) \ge \frac{1}{2}\ln\bigl(\frac{b+q-1}{b-1}\bigr) > \beta_0$ when $q \le 2(\sqrt{b}+1)$, this result extends into the multiple phase region for many combinations of b and q. Part (ii) of the theorem is an analog of our earlier results for the Ising model with (+)-boundaries at all temperatures. Part (iii) is of interest for two reasons. First, since $\beta_1 > \beta_0$ always, it exhibits a natural boundary condition under which $c_{\rm gap}$ and $c_{\rm sob}$ are $\Omega(1)$ beyond the uniqueness region (but not for arbitrary $\beta$) for all combinations of b and q. Second, because of an intimate connection between the free boundary case and so-called “reconstruction problems” on trees [34] (in which the edges are noisy channels and the goal is to reconstruct a value transmitted from the root), we obtain an alternative proof of the best known value of the noise parameter under which reconstruction
is impossible [35]. Indeed, a slight strengthening of part (iii) allows us to marginally improve on this threshold. Again, we spell out the details in [32].
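The thresholds in Theorem 6.4 can be evaluated numerically. The sketch below assumes the free-boundary equation reads $\frac{e^{2\beta_1}-1}{e^{2\beta_1}+q-1}\cdot\frac{e^{2\beta_1}-1}{e^{2\beta_1}+1} = \frac{1}{b}$, as reconstructed from the theorem statement, and solves it by bisection; the left-hand side is increasing in $\beta_1$, so the root is unique. Function names are illustrative.

```python
import math


def free_boundary_lhs(beta: float, q: int) -> float:
    """Left-hand side of the equation defining beta_1 in Theorem 6.4(iii)."""
    x = math.exp(2.0 * beta)
    return (x - 1.0) / (x + q - 1.0) * (x - 1.0) / (x + 1.0)


def solve_beta1(b: int, q: int, tol: float = 1e-12) -> float:
    """Bisection for beta_1; the left-hand side increases from 0 (beta = 0) towards 1."""
    target = 1.0 / b
    lo, hi = 0.0, 1.0
    while free_boundary_lhs(hi, q) < target:   # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if free_boundary_lhs(mid, q) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def arbitrary_boundary_threshold(b: int) -> float:
    """(1/2) ln((sqrt(b)+1)/(sqrt(b)-1)) from part (i)."""
    s = math.sqrt(b)
    return 0.5 * math.log((s + 1.0) / (s - 1.0))


def beta0_upper_bound(b: int, q: int) -> float:
    """Upper bound (1/2) ln((b+q-1)/(b-1)) on beta_0 quoted from [17]."""
    return 0.5 * math.log((b + q - 1.0) / (b - 1.0))


if __name__ == "__main__":
    for b, q in [(4, 2), (4, 3), (9, 5), (9, 20)]:
        print(b, q, solve_beta1(b, q), arbitrary_boundary_threshold(b), beta0_upper_bound(b, q))
```

For $q = 2$ the equation reduces to $\tanh^2\beta_1 = 1/b$, so $\beta_1$ coincides with the part (i) threshold; this gives a quick sanity check on the solver.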
7. Proofs Omitted from the Main Text In this final section, we supply the proofs of some technical lemmas that were omitted from the main text.
7.1. Proof of Lemma 3.5. The lemma in fact holds in a more general setting, where in place of $T_x$ and $B_x$, we think of two arbitrary subsets A, B such that $A \cup B = T_x$. Also, in this proof we write $\nu = \mu_{T_x}^{\eta}$ and Var and Ent for variance and entropy with respect to $\nu$. For part (i) we will show that if for any function g that does not depend on B we have $\mathrm{Var}[\nu_A(g)] \le \varepsilon \cdot \mathrm{Var}(g)$, then for any function f,
$$\mathrm{Var}[\nu_A(f)] \le \frac{2(1-\varepsilon)}{1-2\varepsilon} \cdot \nu[\mathrm{Var}_B(f)] + \frac{2\varepsilon}{1-2\varepsilon} \cdot \nu[\mathrm{Var}_A(f)].$$
Notice that by the convexity of variance we have $\mathrm{Var}(g_1 + g_2) \le 2[\mathrm{Var}(g_1) + \mathrm{Var}(g_2)]$ for any two functions $g_1, g_2$. We therefore write
$$\begin{aligned}
\mathrm{Var}[\nu_A(f)] &= \mathrm{Var}[\nu_A(f) - \nu_A(\nu_B(f)) + \nu_A(\nu_B(f))] \\
&\le 2\,\mathrm{Var}[\nu_A(f - \nu_B(f))] + 2\,\mathrm{Var}[\nu_A(\nu_B(f))] \\
&\le 2\,\mathrm{Var}[f - \nu_B(f)] + 2\varepsilon\,\mathrm{Var}[\nu_B(f)] \\
&= 2\nu[\mathrm{Var}_B(f)] + 2\varepsilon\bigl(\mathrm{Var}[\nu_A(f)] + \nu[\mathrm{Var}_A(f)] - \nu[\mathrm{Var}_B(f)]\bigr),
\end{aligned}$$
where we used the facts that $\mathrm{Var}[f - \nu_B(f)] = \nu[\mathrm{Var}_B(f)]$ and that $\mathrm{Var}[\nu_A(f)] + \nu[\mathrm{Var}_A(f)] = \mathrm{Var}[\nu_B(f)] + \nu[\mathrm{Var}_B(f)] = \mathrm{Var}(f)$ as in (2). We therefore conclude that $\mathrm{Var}[\nu_A(f)] \le \frac{2(1-\varepsilon)}{1-2\varepsilon} \cdot \nu[\mathrm{Var}_B(f)] + \frac{2\varepsilon}{1-2\varepsilon} \cdot \nu[\mathrm{Var}_A(f)]$, as required.

We proceed to part (ii). Here we have to show that if for any non-negative function g that does not depend on B we have $\mathrm{Ent}[\nu_A(g)] \le \varepsilon \cdot \mathrm{Ent}(g)$, then for any non-negative function f,
$$\mathrm{Ent}[\nu_A(f)] \le \frac{1}{1-\varepsilon'} \cdot \nu[\mathrm{Ent}_B(f)] + \frac{\varepsilon'}{1-\varepsilon'} \cdot \nu[\mathrm{Ent}_A(f)], \tag{31}$$
where $\varepsilon' = \sqrt{\varepsilon}/p$ and p stands for the minimum non-zero probability of any configuration in $T_x \setminus A$. We will in fact show that
$$\mathrm{Ent}(f) \le \frac{1}{1-\varepsilon'}\bigl(\nu[\mathrm{Ent}_A(f)] + \nu[\mathrm{Ent}_B(f)]\bigr), \tag{32}$$
which implies (31) since $\mathrm{Ent}[\nu_A(f)] = \mathrm{Ent}(f) - \nu[\mathrm{Ent}_A(f)]$.

Before we go on with the proof, let us review some properties of entropy. First, by definition, $\mathrm{Ent}(f) = \nu\bigl(f \log\frac{f}{\nu(f)}\bigr)$ and $\nu[\mathrm{Ent}_A(f)] = \nu\bigl(f \log\frac{f}{\nu_A(f)}\bigr)$. Also, by the variational characterization of entropy we have $\nu_A\bigl(f \log\frac{g}{\nu_A(g)}\bigr) \le \mathrm{Ent}_A(f)$ for all non-negative functions f and g.
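The decomposition identities used in this proof are easy to confirm on a toy example. The following sketch works on a hypothetical two-block space, with a configuration written as a pair (a, b) where a is the part inside A and b the part outside, and checks $\mathrm{Var}(f) = \mathrm{Var}[\nu_A(f)] + \nu[\mathrm{Var}_A(f)]$, $\mathrm{Ent}(f) = \mathrm{Ent}[\nu_A(f)] + \nu[\mathrm{Ent}_A(f)]$ and the variational inequality. The measure and functions are random test data, not objects from the paper.

```python
import math
import random


def make_measure(n_a, n_b, rng):
    """Random strictly positive probability measure on pairs (a, b)."""
    weights = {(a, b): rng.uniform(0.1, 1.0) for a in range(n_a) for b in range(n_b)}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def expect(nu, f):
    """nu(f): expectation of f under nu."""
    return sum(p * f[s] for s, p in nu.items())


def cond_expect_A(nu, f):
    """nu_A(f): average over the A-part given the part outside A (depends on b only)."""
    out = {}
    for b0 in {b for (_, b) in nu}:
        fibre = [s for s in nu if s[1] == b0]
        mass = sum(nu[s] for s in fibre)
        average = sum(nu[s] * f[s] for s in fibre) / mass
        for s in fibre:
            out[s] = average
    return out


def variance(nu, f):
    mean = expect(nu, f)
    return expect(nu, {s: (f[s] - mean) ** 2 for s in nu})


def entropy(nu, f):
    """Ent(f) = nu(f log(f / nu(f))) for strictly positive f."""
    mean = expect(nu, f)
    return expect(nu, {s: f[s] * math.log(f[s] / mean) for s in nu})


def local_variance_avg(nu, f):
    """nu[Var_A(f)] = nu((f - nu_A(f))^2)."""
    fa = cond_expect_A(nu, f)
    return expect(nu, {s: (f[s] - fa[s]) ** 2 for s in nu})


def local_entropy_avg(nu, f):
    """nu[Ent_A(f)] = nu(f log(f / nu_A(f)))."""
    fa = cond_expect_A(nu, f)
    return expect(nu, {s: f[s] * math.log(f[s] / fa[s]) for s in nu})


if __name__ == "__main__":
    rng = random.Random(1)
    nu = make_measure(3, 2, rng)
    f = {s: rng.uniform(0.5, 2.0) for s in nu}
    fa = cond_expect_A(nu, f)
    print("Var gap:", variance(nu, f) - variance(nu, fa) - local_variance_avg(nu, f))
    print("Ent gap:", entropy(nu, f) - entropy(nu, fa) - local_entropy_avg(nu, f))
```

Both printed gaps should vanish up to floating-point rounding, whatever measure and function are drawn.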
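We can now proceed with the proof of (32) by writing
$$\begin{aligned}
\mathrm{Ent}(f) &= \nu\Bigl(f \log\frac{f}{\nu_B(f)}\Bigr) + \nu\Bigl(f \log\frac{\nu_B(f)}{\nu_A(\nu_B(f))}\Bigr) + \nu\Bigl(f \log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Bigr) \\
&\le \nu\Bigl(f \log\frac{f}{\nu_B(f)}\Bigr) + \nu\Bigl(f \log\frac{f}{\nu_A(f)}\Bigr) + \nu\Bigl(f \log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Bigr) \\
&= \nu[\mathrm{Ent}_B(f)] + \nu[\mathrm{Ent}_A(f)] + \nu\Bigl(\nu_A(f) \log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Bigr).
\end{aligned}$$
Therefore, (32) will follow once we show that $\nu\bigl(\nu_A(f) \log\frac{\nu_A(\nu_B(f))}{\nu(f)}\bigr) \le \varepsilon'\,\mathrm{Ent}(f)$. We use the following claim in order to get this bound.

Claim 7.1. Let $\mu$ be a probability measure over a space $\Omega$ where the probability of any $\sigma \in \Omega$ is either zero or at least p. Then for any two non-negative functions f and g over $\Omega$ we have
$$\mu\Bigl(f \log\frac{g}{\mu(g)}\Bigr) \le \frac{1}{p}\sqrt{\frac{\mu(f)}{\mu(g)} \cdot \mathrm{Ent}(f) \cdot \mathrm{Ent}(g)},$$
where Ent is taken w.r.t. $\mu$.

Assuming Claim 7.1, we conclude that
$$\begin{aligned}
\nu\Bigl(\nu_A(f) \log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Bigr) &\le \frac{1}{p}\sqrt{\mathrm{Ent}[\nu_A(f)] \cdot \mathrm{Ent}[\nu_A(\nu_B(f))]} \\
&\le \frac{1}{p}\sqrt{\varepsilon \cdot \mathrm{Ent}[\nu_A(f)] \cdot \mathrm{Ent}[\nu_B(f)]} \le \frac{1}{p}\sqrt{\varepsilon}\;\mathrm{Ent}(f),
\end{aligned}$$
completing the proof of Lemma 3.5. We note that, since neither $\nu_A(f)$ nor $\nu_A(\nu_B(f))$ depends on A, the effective probability space in the above derivation is the marginal over $T_x \setminus A$, so indeed p can be taken as the minimum marginal probability of configurations restricted to $T_x \setminus A$.

It remains to prove Claim 7.1. Consider two arbitrary non-negative functions f and g. Let $\chi$ be the indicator function of the event that $g \ge \mu(g)$. Clearly, $\chi \log\frac{g}{\mu(g)} \ge 0$ while $(1-\chi)\log\frac{g}{\mu(g)} \le 0$. Also, since $\mu\bigl(\log\frac{g}{\mu(g)}\bigr) \le \log\mu\bigl(\frac{g}{\mu(g)}\bigr) = 0$, then $\mu\bigl((1-\chi)\log\frac{g}{\mu(g)}\bigr) \le -\mu\bigl(\chi \log\frac{g}{\mu(g)}\bigr)$. Letting $f_{\max}$ and $f_{\min}$ be the maximum and minimum values of f respectively over configurations with non-zero probability, we get:
$$\begin{aligned}
\mu\Bigl(f \log\frac{g}{\mu(g)}\Bigr) &= \mu\Bigl(\chi f \log\frac{g}{\mu(g)}\Bigr) + \mu\Bigl((1-\chi) f \log\frac{g}{\mu(g)}\Bigr) \\
&\le f_{\max} \cdot \mu\Bigl(\chi \log\frac{g}{\mu(g)}\Bigr) + f_{\min} \cdot \mu\Bigl((1-\chi) \log\frac{g}{\mu(g)}\Bigr) \\
&\le (f_{\max} - f_{\min}) \cdot \mu\Bigl(\chi \log\frac{g}{\mu(g)}\Bigr) \\
&\le \frac{1}{p} \cdot \|f - \mu(f)\|_1 \cdot \mu\Bigl(\chi\Bigl(\frac{g}{\mu(g)} - 1\Bigr)\Bigr) \\
&= \frac{1}{2p \cdot \mu(g)} \cdot \|f - \mu(f)\|_1 \cdot \|g - \mu(g)\|_1 \\
&\le \frac{1}{p}\sqrt{\frac{\mu(f)}{\mu(g)} \cdot \mathrm{Ent}(f) \cdot \mathrm{Ent}(g)},
\end{aligned}$$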
where we wrote $\|\cdot\|_1$ for the $\ell_1$ norm with respect to $\mu$ and used the fact that $\|f - \mu(f)\|_1^2 \le 2\mu(f)\,\mathrm{Ent}(f)$ for any non-negative function f (see, e.g., [37]). The proof of Claim 7.1 is now complete.

7.2. Proof of reverse direction of Theorem 3.4. In the main text we proved the forward direction of Theorem 3.4. Here we prove the reverse direction, i.e., that $\min_{x,\eta} c_{\rm sob}(\mu_{T_x}^{\eta}) = \Omega(1)$ implies $\mathrm{EM}(\ell, ce^{-\vartheta\ell})$ for all $\ell$, where $c = c(b, \beta, h)$ and $\vartheta = \vartheta(b, \beta, h)$ are constants independent of $\ell$. To do this, we follow the same line of reasoning as in the proof of Theorem 5.2: namely, we establish the strong concentration property of the functions $g_s^{(\ell)}$ as in Sec. 5.1 and then appeal to Theorem 5.3. The proof of concentration is accomplished via hypercontractivity bounds, assuming the above condition on $c_{\rm sob}$. For a function f, let $\Lambda_f \subseteq T$ denote the subset of sites on whose spins f depends. A claim similar to the following lemma was proved in [42]:

Lemma 7.2. Let $\nu$ be any Gibbs measure on T, f any function, and B any subset that includes all sites within distance $\ell$ from $\Lambda_f$. Then there exists a constant $\vartheta$, depending only on the degree b, such that
$$\|\nu_B(f) - \nu(f)\|_q \le 3e^{-c_{\rm sob}(\nu)\vartheta\ell}\,|\Lambda_f|\,\|f - \nu(f)\|_\infty,$$
where $q = 1 + e^{c_{\rm sob}(\nu)\vartheta\ell}$ and norms are taken w.r.t. $\nu$.
We use Lemma 7.2 to complete the proof of the reverse direction of Theorem 3.4. For simplicity, we verify EM only for the case $T_x = T$ (the whole tree), with root r. Recall the functions $g_s^{(\ell)}$ from Sec. 5.1, the fact that $g_s^{(\ell)} = \mu_{B_{r,\ell}}(g_s)$ by definition, and that $g_s$ depends only on the spin at r. Applying Lemma 7.2 with $\nu = \mu$, $f = g_s$, and $B = B_{r,\ell}$, together with the fact that $c_{\rm sob}(\mu) = \Omega(1)$ by hypothesis, we conclude that there exists a constant $\vartheta$ such that
$$\|g_s^{(\ell)} - 1\|_q \le 3e^{-\vartheta\ell}\,\|g_s - 1\|_\infty \le 3e^{-\vartheta\ell}/p_{\min},$$
where $q = 1 + e^{\vartheta\ell}$ and norms are taken w.r.t. $\mu$. Therefore, using a Markov inequality, there exist constants $\ell_0$ and $\vartheta$ such that, for all $\ell \ge \ell_0$,
$$\mu_T^{+}\bigl(|g_s^{(\ell)} - 1| > e^{-\vartheta\ell}\bigr) \le e^{-2e^{\vartheta\ell}}.$$
This establishes the strong concentration property of $g_s^{(\ell)}$ as in (17), from which EM follows by Theorem 5.3.
Remark. In [42], a claim equivalent to Lemma 7.2 was proved in the context of $\mathbb{Z}^d$; however, it is easy to see that it in fact applies to general, finite-range models on any graph of bounded degree. As a result, the fact that an $\Omega(1)$ logarithmic Sobolev constant implies $\mathrm{EM}(\ell, ce^{-\vartheta\ell})$ holds in this generality as well.
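7.3. Proof of Lemma 5.4. We split our analysis of $\nu(f_1 f_2)^2$ into three cases: (a) $\mathrm{Ent}_\nu(f_2) \ge \frac{1}{\delta}$; (b) $\delta < \mathrm{Ent}_\nu(f_2) < \frac{1}{\delta}$; (c) $\mathrm{Ent}_\nu(f_2) \le \delta$.

Case (a). We simply bound $\nu(f_1 f_2)^2 \le \|f_1\|_\infty^2\,\nu(f_2)^2 \le 1 \le \delta\,\mathrm{Ent}_\nu(f_2)$.

Case (b). We use the entropy inequality (see, e.g., [1]), which states that for any $t > 0$,
$$\nu(f_1 f_2) \le \frac{1}{t}\log\nu(e^{tf_1}) + \frac{1}{t}\,\mathrm{Ent}_\nu(f_2). \tag{33}$$
We choose the free parameter t in (33) equal to $\sqrt{\mathrm{Ent}_\nu(f_2)/\delta}$. Notice that, by construction, $1 < t < \delta^{-1}$. Using the assumption $\nu(|f_1| > \delta) \le e^{-2/\delta}$ together with $\|f_1\|_\infty \le 1$, we get
$$\begin{aligned}
\nu(f_1 f_2)^2 &\le \Bigl[\frac{1}{t}\log\bigl(e^{t\delta} + e^{t - 2/\delta}\bigr) + \sqrt{\delta\,\mathrm{Ent}_\nu(f_2)}\Bigr]^2 \\
&\le \Bigl[c_1\delta + \sqrt{\delta\,\mathrm{Ent}_\nu(f_2)}\Bigr]^2 \le c_2\,\delta\,\mathrm{Ent}_\nu(f_2)
\end{aligned}$$
for suitable numerical constants $c_1, c_2$.

Case (c). Again we use the entropy inequality with $t = \sqrt{\mathrm{Ent}_\nu(f_2)/\delta} \le 1$, but we now simply bound the Laplace transform $\nu(e^{tf_1})$ by a Taylor expansion (in t) up to second order:
$$\begin{aligned}
\frac{1}{t}\log\nu(e^{tf_1}) &\le \frac{1}{t}\log\Bigl(1 + \frac{t^2}{2}\,e\,\nu(f_1^2)\Bigr) \le \frac{t}{2}\,e\,\bigl(\delta^2 + e^{-2/\delta}\bigr) \\
&= \frac{1}{2}\,e\,\bigl(\delta^2 + e^{-2/\delta}\bigr)\sqrt{\mathrm{Ent}_\nu(f_2)/\delta},
\end{aligned}$$
which by (33) implies
$$\nu(f_1 f_2)^2 \le \Bigl[\frac{e}{2\sqrt{\delta}}\bigl(\delta^2 + e^{-2/\delta}\bigr) + \sqrt{\delta}\Bigr]^2\,\mathrm{Ent}_\nu(f_2) \le c_3\,\delta\,\mathrm{Ent}_\nu(f_2)$$
for another numerical constant $c_3$.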
Acknowledgements. F. Martinelli would like to thank the Miller Institute, the Dept. of Statistics and the Dept. of EECS of the University of California at Berkeley for financial support and warm hospitality. We also wish to thank E. Mossel and Y. Peres for interesting discussions about reconstruction on trees and related topics.
References 1. An´e, C., Blach`ere, S., Chafa¨ı, D., Foug`eres, P., Gentil, I., Malrieu, F., Roberto, C., Scheffer, G.: Sur les in´egalit´es de Sobolev logarithmiques. Paris: Soc. Math. de France, 2000 2. Baxter R.J.: Exactly solved models in statistical mechanics. London: Academic Press, 1982 3. Berger, N., Kenyon, C., Mossel, E., Peres, Y.: Glauber dynamics on trees and hyperbolic graphs. Preprint (2003); Preliminary version: C. Kenyon, E. Mossel, Y. Peres, Glauber dynamics on trees and hyperbolic graphs. In: Proc. 42nd IEEE Symposium on Foundations of Computer Science, 2001, pp. 568–578 4. Bertini, L., Cancrini, N., Cesi, F: The spectral gap for a Glauber-type dynamics in a continuous gas. Ann. Inst. H. Poincar´e Probab. Statist. 38, 91–108 (2002)
5. Bleher, P., Ruiz, J., Schonmann, R.H., Shlosman, S., Zagrebnov, V.: Rigidity of the critical phases on a Cayley tree. Moscow Math. J. 1, 345–363 (2001) 6. Bleher, P., Ruiz, J., Zagrebnov, V.: On the purity of the limiting Gibbs state for the Ising model on the Bethe lattice. J. Stat. Phys. 79, 473–482 (1995) 7. Bodineau, T., Martinelli, F.: Some new results on the kinetic Ising model in a pure phase. J. Stat. Phys. 109, no. 1–2, 207–235 (2002) 8. Brightwell, G., Winkler, P.: Random colorings of a Cayley tree. In: Contemporary combinatorics, Bolyai Society Mathematical Studies 10, Budapest: J´anos Bolyai Math. Soc., 2002, pp. 247–276 9. Cesi, F.: Quasi-factorization of the entropy and logarithmic Sobolev inequalities for Gibbs random fields. Probab. Theory Relat. Fields 120, 569–584 (2001) 10. Chayes, J.T., Chayes, L., Sethna, J.P., Thouless, D.J.: A mean field spin glass with short-range interactions. Commun. Math. Phys. 106, 41–89 (1986) 11. Dobrushin, R., Koteck´y, R., Shlosman, S.: Wulff Construction. A Global Shape From Local Interaction. Trans. Math. Monographs, AMS 104, (1992) 12. Dyer, M., Frieze, A.: Randomly colouring graphs with lower bounds on girth and maximum degree. In: Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, 2001, pp. 579–587 13. Dyer, M., Frieze, A., Hayes, T.P., Vigoda, E.: Randomly colouring constant degree graphs. Preprint, 2004 14. Evans, W., Kenyon, C., Peres, Y., Schulman, L.J.: Broadcasting on trees and the Ising model. Ann. Appl. Probab. 10, 410–433 (2000) 15. Fisher, D., Huse, D.: Dynamics of droplet fluctuations in pure and random Ising systems. Phys. Rev. B 35 no. 13, 6841–6846 (1987) 16. Georgii, H.-O.: Gibbs measures and phase transitions, de Gruyter Studies in Mathematics 9, Berlin: Walter de Gruyter & Co., 1988 17. H¨aggstr¨om, O.: The random-cluster model on a homogeneous tree. Probab. Theory Related Fields 104, 231–253 (1996) 18. Hayes, T.P.: Randomly coloring graphs with girth five. 
In: Proceedings of the 35th Annual ACM Symposium on Theory of Computing, 2003, pp. 269–278 19. Hayes, T.P., Vigoda, E.: A non-Markovian coupling for randomly sampling colorings. In: Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 2003, pp. 618–627 20. Ioffe, D.: A note on the extremality of the disordered state for the Ising model on the Bethe lattice. Lett. Math. Phys. 37, 137–143 (1996) 21. Ioffe, D.: Extremality of the disordered state for the Ising model on general trees. Prog. Probab. 40, 3–14 (1996) 22. Jonasson, J.: Uniqueness of uniform random colorings of regular trees. Stat. Probab. Lett. 57, 243– 248 (2002) 23. Jonasson, J., Steif, J.E.: Amenability and phase transition in the Ising model. J. Theor. Probab. 12, 549–559 (1999) 24. Kelly, F.P.: Stochastic models of computer communication systems. J. Royal Stat. Soc. B 47, 379–395 (1985) 25. Lyons, R.: Phase transitions on non amenable graphs. J. Math. Phys 41, 1099–1127 (2000) 26. Ledoux, M.: The concentration of measure phenomenon. Mathematical Surveys and Monographs 89, Providence, RI: American Mathematical Society, 1981 27. Lu, S.L., Yau, H.T.: Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics. Commun. Math. Phys. 156, 399–433 (1993) 28. Luby, M., Vigoda, E.: Fast convergence of the Glauber dynamics for sampling independent sets. Random Structures & Algorithms 15, 229–241 (1999) 29. Martinelli, F.: Lectures on Glauber dynamics for discrete spin models. In: Lectures on Probability Theory and Statistics (Saint-Flour, 1997), Lecture notes in Mathematics 1717, Berlin: Springer, 1998, pp. 93–191 30. Martinelli, F., Olivieri, E.: Approach to equilibrium of Glauber dynamics in the one phase region I: The attractive case. Commun. Math. Phys. 161, 447–486 (1994) 31. Martinelli, F., Olivieri, E., Schonmann, R.: For 2-D lattice spin systems weak mixing implies strong mixing. Commun. Math. Phys. 165, 33–47 (1994) 32. 
Martinelli, F., Sinclair, A., Weitz, D.: Fast mixing for independent sets, colorings and other models on trees. Submitted, 2004. Extended abstract appeared in: Proc. of the 15th ACM-SIAM Symposium on Discrete Algorithms, 2004, pp. 449–458 33. Molloy, M.: The Glauber dynamics on colorings of a graph with high girth and maximum degree. In: Proceedings of the 34th Annual ACM Symposium on Theory of Computing, 2002, pp. 91–98 34. Mossel, E.: Survey: Information flow on trees. In: Graphs, Morphisms and Statistical Physics, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Volume 63, J. Nesetril, P. Winkler (eds.), Providence, RI: AMS, 2004, pp. 155–170
35. Mossel, E., Peres, Y.: Information flow on trees. Ann. Appl. Probab. 13, 817–844 (2003) 36. Peres, Y., Winkler, P.: Personal communication 37. Saloff-Coste, L.: Lectures on finite Markov chains. In: Lectures on probability theory and statistics (Saint-Flour, 1996), Lecture Notes in Mathematics 1665, Berlin: Springer, 1997, pp. 301–413 38. Schonmann, R.H., Tanaka, N.I.: Lack of monotonicity in ferromagnetic Ising model phase diagrams. Ann. Appl. Probab. 8, 234–245 (1998) 39. Simon, B.: The statistical mechanics of lattice gases. Vol. I, Princeton Series in Physics, Princeton, NJ: Princeton University Press, 1993 40. Spitzer, F.: Markov random fields on an infinite tree. Ann. Probab. 3, 387–398 (1975) 41. Stroock, D.W., Zegarlinski, B.: The logarithmic Sobolev inequality for discrete spin systems on a lattice. Commun. Math. Phys. 149, 175–194 (1992) 42. Stroock, D.W., Zegarlinski, B.: On the ergodic properties of Glauber dynamics. J. Statist. Phys. 81, 1007–1019 (1995) 43. Vigoda, E.: Improved bounds for sampling colorings. J. Math. Phys. 41, 1555–1569 (2000) 44. Vigoda, E.: A note on the Glauber dynamics for sampling independent sets. Elect. J. Comb. 8(1), (2001) 45. Weitz, D.: Mixing in time and space for discrete spin systems. Ph.D. dissertation, Berkeley: University of California at Berkeley, 2004 Communicated by H. Spohn
Commun. Math. Phys. 250, 335–370 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1148-x
Communications in
Mathematical Physics
Spatially Periodic Solutions in Relativistic Isentropic Gas Dynamics
Hermano Frid$^1$, Mikhail Perepelitsa$^2$
$^1$ Instituto de Matemática Pura e Aplicada-IMPA, Estrada Dona Castorina, 110, 22460-320 Rio de Janeiro RJ, Brasil. E-mail: [email protected]
$^2$ Department of Mathematics, Northwestern University, 2033 Sheridan Road, Evanston, Illinois 60208-2730, USA. E-mail: [email protected]
Received: 15 August 2003 / Accepted: 20 February 2004
Published online: 12 August 2004 – © Springer-Verlag 2004
Abstract: We consider the initial value problem, with periodic initial data, for the Euler equations in relativistic isentropic gas dynamics, for ideal polytropic gases which obey a constitutive equation, relating pressure p and density $\rho$, $p = \kappa^2\rho^\gamma$, with $\gamma \ge 1$, $0 < \kappa < c$, where c is the speed of light. Global existence of periodic entropy solutions for initial data sufficiently close to a constant state follows from a celebrated result of Glimm and Lax (1970). We prove that given any periodic initial data of locally bounded total variation, satisfying the physical restrictions $0 < \inf_{x\in\mathbb{R}}\rho_0(x) < \sup_{x\in\mathbb{R}}\rho_0(x) < +\infty$, $\|v_0\|_\infty < c$, where v is the gas velocity, there exists a globally defined spatially periodic entropy solution for the Cauchy problem, if $1 \le \gamma < \gamma_0$, for some $\gamma_0 > 1$, depending on the initial bounds. The solution decays in $L^1_{\rm loc}$ to its mean value as $t \to \infty$.

1. Introduction

We consider the nonlinear hyperbolic system of conservation laws which describes the motion of a one-dimensional isentropic relativistic gas in Euler coordinates,
$$\begin{aligned}
&\frac{\partial}{\partial t}\Bigl(\frac{p+\rho c^2}{c^2}\,\frac{v^2}{c^2-v^2} + \rho\Bigr) + \frac{\partial}{\partial x}\Bigl((p+\rho c^2)\frac{v}{c^2-v^2}\Bigr) = 0, \\
&\frac{\partial}{\partial t}\Bigl((p+\rho c^2)\frac{v}{c^2-v^2}\Bigr) + \frac{\partial}{\partial x}\Bigl((p+\rho c^2)\frac{v^2}{c^2-v^2} + p\Bigr) = 0,
\end{aligned} \tag{1.1}$$
with given periodic initial data, with, say, period 1, and locally bounded total variation,
$$(\rho(x, 0), v(x, 0)) = (\rho_0(x), v_0(x)). \tag{1.2}$$
Here $\rho$ is the density, v is the velocity, $p = \kappa^2\rho^\gamma$, $1 \le \gamma < 2$, is the pressure, $\kappa < c$, and c is the speed of light. The initial data are subject to the physical bounds
$$0 < \inf_{x\in\mathbb{R}}\rho_0(x) \le \sup_{x\in\mathbb{R}}\rho_0(x) < +\infty, \qquad \sup_{x\in\mathbb{R}} p'(\rho_0(x)) < c^2, \qquad \|v_0\|_\infty < c. \tag{1.3}$$
Let $U = (U_1, U_2)$, with
$$U_1 = \frac{p+\rho c^2}{c^2}\,\frac{v^2}{c^2-v^2} + \rho, \qquad U_2 = (p+\rho c^2)\frac{v}{c^2-v^2}, \tag{1.4}$$
and denote by $U_0(x) = (U_{1\,0}(x), U_{2\,0}(x))$ the vector function corresponding to $(\rho_0(x), v_0(x))$. Set
$$\bar{U} = \int_0^1 U_0(x)\,dx. \tag{1.5}$$
The main purpose of this paper is to prove the following result.

Theorem 1.1. Suppose the periodic initial data $\rho_0$, $v_0$ satisfy (1.3), and
$$\mathrm{Var}[\rho_0] + \mathrm{Var}\Bigl[\log\frac{c+v_0}{c-v_0}\Bigr] < +\infty, \tag{1.6}$$
where by Var we mean the total variation over one period. Then, there exists $\gamma_0 > 1$ such that, for all $1 \le \gamma < \gamma_0$, there exists a globally defined spatially periodic entropy solution $U(x, t)$ of problem (1.1),(1.2), assuming values in a compact subset of $\{\rho > 0,\ p' < c^2,\ v^2 < c^2\}$, with locally bounded total variation, defined through the Glimm difference scheme. The solution $U(x, t)$ satisfies
$$\lim_{t\to\infty}\int_0^1 |U(x, t) - \bar{U}|\,dx = 0. \tag{1.7}$$
We recall that global existence of entropy solutions for initial data in $L^\infty$ sufficiently close to a constant state follows from the celebrated result of Glimm and Lax [9], which, in the periodic case, also ensures decay to the mean value in the $L^\infty$ norm at a rate $O(t^{-1})$. So, here we will be concerned with periodic initial data subjected only to the physical restrictions (1.3), but we also impose the regularity condition (1.6). For this, we need to restrict $\gamma$ to be sufficiently close to 1, depending on the bounds for the initial data. As we explain below, the decay given by (1.7) will be a direct consequence of the fact that $U(x, t)$ assumes values in a compact subset of $\{\rho > 0,\ p' < c^2,\ v^2 < c^2\}$, as an application of a general decay result in [2] combined with a well known compactness result of DiPerna [6].

Discontinuous solutions of the relativistic Euler equations were first considered in the pioneering paper of Smoller and Temple [15], which established the global existence of BV entropy solutions of the Cauchy problem for (1.1), with $\gamma = 1$, for initial data in $BV(\mathbb{R})$ satisfying the physical restrictions (1.3). Their result is based on the striking observation that the shock curves in the relativistic case with $\gamma = 1$, in the plane of the natural Riemann invariants $(z, w)$, possess the same geometrical property observed by Nishida [11] in the non-relativistic case with $\gamma = 1$, namely, the shock curves of both families starting from any arbitrary point in the $(z, w)$-plane may be obtained by translation of the corresponding curves starting from a fixed point, say, (0, 0). Also for initial data in $BV(\mathbb{R})$ satisfying the physical restrictions (1.3), the global existence of BV entropy solutions of the Cauchy problem (1.1),(1.2) was shown in [3] for $1 < \gamma < \gamma_0$, for some $\gamma_0$ depending on the bounds for the initial data. We remark that neither of these results can apply to periodic initial data.

We refer to [15] for an account of the physical derivation of (1.1) and for references in the physics literature. We also refer to [12] for a better understanding of the physical grounds concerning (1.1).
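Returning to system (1.1), one easily sees that in the limit, as $c \to \infty$, the Euler equations for isentropic gas dynamics are recovered:
$$\begin{aligned}
&\frac{\partial}{\partial t}\rho + \frac{\partial}{\partial x}(\rho v) = 0, \\
&\frac{\partial}{\partial t}(\rho v) + \frac{\partial}{\partial x}(\rho v^2 + \kappa^2\rho^\gamma) = 0.
\end{aligned} \tag{1.8}$$
Recall that, in Lagrangian coordinates, system (1.8) reads
$$\begin{aligned}
&\frac{\partial}{\partial t}\tau - \frac{\partial}{\partial y}v = 0, \\
&\frac{\partial}{\partial t}v + \frac{\partial}{\partial y}(\kappa^2\tau^{-\gamma}) = 0,
\end{aligned} \tag{1.9}$$
where $\tau = \rho^{-1}$ represents the specific volume of the gas. Next, before closing this section, we give an exposition of the main points involved in the proof of Theorem 1.1. We start by considering the non-relativistic case because it is simpler and so allows a neater outline of all steps.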
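The limit $c \to \infty$ mentioned above can be checked term by term. The following worked computation is a sketch for fixed $\rho$, $v$ and $p = \kappa^2\rho^\gamma$; it only rewrites the conserved quantities and fluxes of (1.1) in terms of $v^2/c^2$ and $p/c^2$.

```latex
\begin{aligned}
\frac{p+\rho c^2}{c^2}\,\frac{v^2}{c^2-v^2}+\rho
  &= \Bigl(\rho+\frac{p}{c^2}\Bigr)\frac{v^2/c^2}{1-v^2/c^2}+\rho
  \;\longrightarrow\; \rho, \\
(p+\rho c^2)\,\frac{v}{c^2-v^2}
  &= \Bigl(\rho+\frac{p}{c^2}\Bigr)\frac{v}{1-v^2/c^2}
  \;\longrightarrow\; \rho v, \\
(p+\rho c^2)\,\frac{v^2}{c^2-v^2}+p
  &= \Bigl(\rho+\frac{p}{c^2}\Bigr)\frac{v^2}{1-v^2/c^2}+p
  \;\longrightarrow\; \rho v^2+\kappa^2\rho^{\gamma}
  \qquad (c\to\infty).
\end{aligned}
```

The first two limits give the conserved quantities $\rho$ and $\rho v$, and the second and third give the fluxes of the classical isentropic Euler system.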
1.1. The Non-Relativistic Case. We now explain our method for constructing global periodic solutions of the isentropic gas dynamics equations, with large total variation and oscillation, as long as the adiabatic exponent $\gamma$ is close to 1, by considering the simpler non-relativistic case. So, we set $m = \rho v$, $U = (\rho, m)$ and consider Eqs. (1.8) with periodic initial data
$$U(x, 0) = U_0(x), \quad x \in \mathbb{R}, \qquad U_0(x + 1) = U_0(x), \qquad U_0 \in BV_{\rm loc}(\mathbb{R})^2, \qquad \rho_0(x) > 0. \tag{1.10}$$
Recalling the well known equivalence between BV entropy solutions in Eulerian and in Lagrangian coordinates (see [16]), setting $V = (\tau, v)$, we may consider, alternatively, system (1.9) with initial data
$$V(y, 0) = V_0(y) := \Bigl(\frac{1}{\rho_0(X_0(y))}, \frac{m_0(X_0(y))}{\rho_0(X_0(y))}\Bigr), \tag{1.11}$$
where $X_0(y)$ is implicitly defined by the equation
$$y = \int_0^{X_0(y)} \rho_0(x)\,dx. \tag{1.12}$$
Observe that $V_0(y + \bar{\rho}) = V_0(y)$, for all $y \in \mathbb{R}$, where
$$\bar{\rho} = \int_0^1 \rho_0(x)\,dx,$$
and also that $V_0 \in BV_{\rm loc}(\mathbb{R})$ with $\mathrm{Var}(V_0(y)) \le C\,\mathrm{Var}(U_0(x))$, where $C > 0$ depends on the lower bound for $\rho_0$ and the upper bound for $m_0$. To simplify the notation we write x and U instead of y and V, and assume, with no loss of generality, that $\bar{\rho} = 1$.
System (1.9) has eigenvalues
$$\lambda_1(U) = -\kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}}, \qquad \lambda_2(U) = \kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}}.$$
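The eigenvalues can be cross-checked against the flux $F(\tau, v) = (-v, \kappa^2\tau^{-\gamma})$ of system (1.9). The sketch below assumes the exponent $-(\gamma+1)/2$, which is what $\lambda^2 = -\partial_\tau(\kappa^2\tau^{-\gamma})$ gives, and the standard Riemann invariants $v \mp \frac{2\kappa\sqrt{\gamma}}{\gamma-1}\tau^{-(\gamma-1)/2}$ of this system for $\gamma > 1$. It verifies numerically, by central finite differences, that their gradients are left eigenvectors of the flux Jacobian. Function names and the sample point are illustrative.

```python
import math


def flux_jacobian(tau, kappa, gamma):
    """Jacobian of F(tau, v) = (-v, kappa^2 * tau^(-gamma)) with respect to (tau, v)."""
    return [[0.0, -1.0],
            [-gamma * kappa**2 * tau ** (-gamma - 1.0), 0.0]]


def characteristic_speed(tau, kappa, gamma):
    """Positive eigenvalue kappa * sqrt(gamma) * tau^(-(gamma+1)/2); the other is its negative."""
    return kappa * math.sqrt(gamma) * tau ** (-(gamma + 1.0) / 2.0)


def riemann_invariants(tau, v, kappa, gamma):
    """Return (z, w) = (v - s, v + s) with s = 2 kappa sqrt(gamma)/(gamma-1) * tau^(-(gamma-1)/2)."""
    s = 2.0 * kappa * math.sqrt(gamma) / (gamma - 1.0) * tau ** (-(gamma - 1.0) / 2.0)
    return v - s, v + s


def gradient(func, tau, v, eps=1e-6):
    """Central finite-difference gradient of func(tau, v)."""
    d_tau = (func(tau + eps, v) - func(tau - eps, v)) / (2.0 * eps)
    d_v = (func(tau, v + eps) - func(tau, v - eps)) / (2.0 * eps)
    return d_tau, d_v


def left_eigen_residuals(tau, v, kappa, gamma):
    """Sup-norm residuals of grad(z) DF = lambda_1 grad(z) and grad(w) DF = lambda_2 grad(w)."""
    jac = flux_jacobian(tau, kappa, gamma)
    lam = characteristic_speed(tau, kappa, gamma)
    residuals = []
    for index, speed in ((0, -lam), (1, lam)):
        grad = gradient(lambda t, u: riemann_invariants(t, u, kappa, gamma)[index], tau, v)
        lhs = (grad[0] * jac[0][0] + grad[1] * jac[1][0],
               grad[0] * jac[0][1] + grad[1] * jac[1][1])
        residuals.append(max(abs(lhs[0] - speed * grad[0]), abs(lhs[1] - speed * grad[1])))
    return residuals


if __name__ == "__main__":
    print(left_eigen_residuals(tau=0.7, v=0.2, kappa=1.3, gamma=1.4))
```

Both residuals should be at the level of finite-difference error, which would not be the case with the exponent $-(\gamma-1)/2$ in the eigenvalues.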
We want to construct a periodic solution for (1.9),(1.11) using a periodic version of the Glimm scheme. We recall that, by this method, approximate solutions $U^h(x, t)$ are constructed as follows. Set $l = \Delta x$, $h = \Delta t$, with
$$\frac{l}{2h} \ge \sup_{U \in \mathcal{V},\, i=1,2} |\lambda_i(U)|,$$
where $\mathcal{V}$ is a set where $U^h$ takes its values and for which the right-hand member of the inequality is finite. Set
$$U^h(x, 0) = U_0(jl + 0), \qquad x \in ((j - 1/2)l, (j + 1/2)l),$$
and, for $0 < t < h$, we define $U^h(x, t)$ by solving the Riemann problems for the discontinuities at $x = (j + 1/2)l$, $j \in \mathbb{Z}$. Then, assuming that $U^h(x, t)$ has been defined for $0 \le t < nh$, we define
$$U^h(x, nh) = U^h((j + \alpha_n)l, nh - 0), \qquad \text{for } (j - 1/2)l < x < (j + 1/2)l,\ j \in \mathbb{Z},$$
where $\alpha_n \in (-1/2, 1/2)$ is a number randomly chosen. Then, we define $U^h(x, t)$, for $nh < t < (n + 1)h$, by solving the Riemann problems for the discontinuities at $((j + 1/2)l, nh)$, $j \in \mathbb{Z}$. This procedure can be reiterated as long as $U^h$ takes values in $\mathcal{V}$ and the Riemann problems can be solved. To guarantee these conditions, in general, the main point is to control the growth of the spatial total variation of $U^h(x, t)$ as t increases.

In the usual non-periodic case we assume that the initial data is in $BV(\mathbb{R})$, and, in particular, the limit $\lim_{x\to+\infty} U_0(x) = U_\infty$ exists, hence, control of the total variation implies control of $\|U^h - U_\infty\|_\infty$. Therefore, controlling the spatial total variation of $U^h$, we may guarantee that it takes its values in a suitable neighborhood of $U_\infty$. In the periodic case the situation is more complicated since we miss a state like $U_\infty$ which is assumed by $U^h(x, t)$, for all $t > 0$, as long as it can be defined. The state that best approximates this role is the mean value
$$\bar{U} = (\bar{\tau}, \bar{v}) := \int_0^1 U_0(x)\,dx.$$
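The sampling structure of the periodic Glimm scheme described above can be sketched in a few lines. To keep the example self-contained, the sketch replaces system (1.9) by the scalar Burgers equation $u_t + (u^2/2)_x = 0$, whose Riemann problem has a simple exact solution; the Riemann solver for (1.9) itself is not reproduced here, and all names are illustrative.

```python
import math
import random


def burgers_riemann(u_left, u_right, xi):
    """Exact self-similar solution u(x/t = xi) of the Riemann problem for Burgers' equation."""
    if u_left > u_right:                      # shock travelling with the Rankine-Hugoniot speed
        speed = 0.5 * (u_left + u_right)
        return u_left if xi < speed else u_right
    if xi <= u_left:                          # rarefaction fan between u_left and u_right
        return u_left
    if xi >= u_right:
        return u_right
    return xi


def glimm_step(cells, l, h, alpha):
    """One step: solve Riemann problems at (j+1/2)l, then sample at (j+alpha)l, periodically."""
    n = len(cells)
    new_cells = []
    for j in range(n):
        if alpha >= 0.0:
            # Sample point lies right of the cell centre: use the Riemann problem at (j+1/2)l.
            xi = (alpha - 0.5) * l / h
            new_cells.append(burgers_riemann(cells[j], cells[(j + 1) % n], xi))
        else:
            # Sample point lies left of the cell centre: use the Riemann problem at (j-1/2)l.
            xi = (alpha + 0.5) * l / h
            new_cells.append(burgers_riemann(cells[j - 1], cells[j], xi))
    return new_cells


def total_variation(cells):
    """Total variation over one period of a piecewise constant periodic function."""
    return sum(abs(cells[j] - cells[j - 1]) for j in range(len(cells)))


def glimm_periodic(u0, n_cells, t_final, rng):
    """Run the scheme on one period [0, 1) with l = 1/n_cells and l/(2h) = sup |u0|."""
    l = 1.0 / n_cells
    cells = [u0(j * l) for j in range(n_cells)]
    max_speed = max(abs(u) for u in cells) or 1.0
    h = l / (2.0 * max_speed)                 # neighbouring Riemann fans do not interact
    t = 0.0
    while t < t_final:
        alpha = rng.uniform(-0.5, 0.5)        # one random sample position per time step
        cells = glimm_step(cells, l, h, alpha)
        t += h
    return cells


if __name__ == "__main__":
    rng = random.Random(0)
    result = glimm_periodic(lambda x: 1.5 + math.sin(2.0 * math.pi * x), 200, 0.5, rng)
    print("min, max, TV:", min(result), max(result), total_variation(result))
```

In this scalar stand-in the sampled values never leave the range of the initial data and the total variation cannot grow, so the time step can stay fixed; for the system (1.9) neither property is automatic, which is why controlling the total variation is the main difficulty.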
Each of its coordinates (not the vector itself) may be viewed as a value assumed by the corresponding coordinate of $U_0(x)$. Although
$$\overline{(U^h(t))} := \int_0^1 U^h(x,t)\,dx$$
is only piecewise constant with jumps at $t = nh$, if $U^h(x,t)$ converges almost everywhere to a weak solution $U(x,t)$ of (1.9), (1.11), then we must have $\overline{(U(t))} = \overline{U}$. So, the idea here is: (i) show first that there is a time $T > 0$, independent of $h$, so that all the $U^h$ can be defined up to $t = T$, by controlling $\mathrm{Var}(U^h(t))$ and $\|U^h(t) - \overline{U}\|_\infty$; (ii) show that if $|\overline{(U^h(T))} - \overline{U}|$ is small enough, which may be achieved for very small $h$, the initial situation is recovered, roughly speaking, so that we can construct our
Periodic Solutions in Relativistic Gas Dynamics
approximate solutions through the time interval $[T, 2T]$, and so on. Finally, one uses a diagonal argument to show the convergence of a subsequence of $U^h$ in the whole $\mathbb{R}^2_+$.

The control of $\mathrm{Var}(U^h(t))$ is possible because the system (1.9) (as well as (1.8)) belongs, in a disguised way, to the so-called Bakhvalov class, characterized by four conditions, $A_1, \dots, A_4$, described in Sect. 3. This remarkable fact was proved by DiPerna [5]. We recall that (1.9) is endowed with a pair of (natural) Riemann invariants, namely,
$$z = v - \frac{2\kappa\sqrt{\gamma}}{\gamma-1}\,\tau^{-\frac{\gamma-1}{2}}, \qquad w = v + \frac{2\kappa\sqrt{\gamma}}{\gamma-1}\,\tau^{-\frac{\gamma-1}{2}}, \tag{1.13}$$
that is, $z, w$ satisfy $\nabla z(U)\nabla F(U) = \lambda_1(U)\nabla z(U)$, $\nabla w(U)\nabla F(U) = \lambda_2(U)\nabla w(U)$, where $F(U) = (-v, \kappa^2\tau^{-\gamma})$. Clearly, any function of a Riemann invariant is also a Riemann invariant. We refer to [4, 13, 14] for the basic facts about the theory of conservation laws. For an exposition of Bakhvalov conditions we refer the reader to Sect. 3.

Let $\sigma = w + z$, $\eta = w - z$. It is sometimes more convenient to work with the coordinates $(\sigma, \eta)$ than with the Riemann invariants $(z, w)$. Denote $W(a,k) = \{(\sigma,\eta) : |\sigma - a| \le k\eta\}$. In [5], DiPerna proved the following theorem.

Theorem 1.2 (DiPerna [5]). There exists a 2-parameter family of transformations $T(a,\theta) : (z,w) \mapsto (z'(z), w'(w))$, $a \in \mathbb{R}$, $\theta > 0$, and positive constants $k, c_1, c_2(k)$ which have the following property. For sufficiently small $k$, $T(a,\theta)$ maps the shock curves of (1.9) in $\tilde W(a,k) = W(a,k) \cap \{c_1/\theta < \eta < c_2(k)/\theta\}$ onto shock curves which satisfy $A_i$, $i = 1, 2, 3, 4$, in the $(z', w')$ variables. Furthermore,
$$\lim_{k\to 0} c_2(k) = \infty. \tag{1.14}$$
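For the reader's convenience, the eigenvector relation stated for the Riemann invariants (1.13) can be verified by hand. The computation below is a routine check written out under the stated form of $F$, with $U = (\tau, v)$ and gradients taken as row vectors (an assumed convention); it is not part of the original argument.

```latex
\nabla F(U) =
\begin{pmatrix} 0 & -1 \\ -\gamma\kappa^{2}\tau^{-\gamma-1} & 0 \end{pmatrix},
\qquad
\nabla z(U) = \left( \kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}},\ 1 \right),
\\[6pt]
\nabla z\,\nabla F
  = \left( -\gamma\kappa^{2}\tau^{-\gamma-1},\ -\kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}} \right)
  = -\kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}}
    \left( \kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}},\ 1 \right)
  = \lambda_1(U)\,\nabla z .
```

The same computation with $\nabla w(U) = ( -\kappa\sqrt{\gamma}\,\tau^{-\frac{\gamma+1}{2}},\ 1 )$ gives $\nabla w\,\nabla F = \lambda_2(U)\,\nabla w$.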
Actually, DiPerna's analysis in [5] lacks a clear determination of the way in which the bounds to be imposed on the initial data can grow when $\gamma$ decreases to 1. To make this study precise, besides (1.14), it is necessary to use the fact, demonstrated in our analysis, that $k$ and $c_2(k)$ can be chosen such that
$$\frac{k}{\gamma-1} \to \infty, \qquad c_2(k)\,k \to 0, \qquad \text{as } \gamma \to 1+.$$
Let
(τ ) = − log τ,
0 (y) = (τ0 (y)), σ = 2v , η = η(τ ), = (τ ). We assume (τ0 (y), v0 (y)) ∈ R[V (γ )], for all y ∈ R, where R[V (γ )] = { − V (γ ) ≤ ≤ + V (γ ) } ∩ { |σ − σ | ≤ V (γ ) },
(1.15)
and $V(\gamma)$ is a positive decreasing function defined for $\gamma > 1$ satisfying
$$\frac{(\gamma-1)V(\gamma)}{k} \to 0, \qquad V(\gamma) \to \infty, \qquad \frac{V(\gamma)}{c_2(k)} \to 0, \qquad \text{as } \gamma \to 1+. \tag{1.16}$$
We also denote ˜ (γ )] = { η − V (γ ) ≤ η ≤ η + V (γ ) } ∩ { |σ − σ | ≤ V (γ ) }. R[V An important observation is that there is an absolute constant c0 , independent of γ , such that c0 ( (τ1 ) − (τ2 )) ≤ η(τ1 ) − η(τ2 ) ≤ 2c0 ( (τ1 ) − (τ2 )), 2 if τ1 ≤ τ2 and − N0 V (γ ) ≤ (τi ) ≤ + N0 V (γ ), i = 1, 2, and, in particular,
(1.17)
˜ 0 N0 V (γ )], R[N0 V (γ )] ⊂ R[2c
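for any given positive integer $N_0$, provided $\gamma$ is sufficiently close to 1, due to (1.16). The Riemann invariants $z', w'$ to which Theorem 1.2 refers are defined by
$$z' = 1 - \exp(2\theta(a/2 - z)), \qquad w' = -1 + \exp(2\theta(w - a/2)). \tag{1.18}$$
We choose
$$a = \overline{\sigma}, \qquad \theta = \frac{c_2(k)}{\overline{\eta} + 6c_0V(\gamma)}, \tag{1.19}$$
and define
$$z'' = \frac{\exp(-\theta\overline{\eta})}{\theta}\,z', \qquad w'' = \frac{\exp(-\theta\overline{\eta})}{\theta}\,w'. \tag{1.20}$$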
Using (1.13), (1.15) and (1.17), it is not difficult to see that there exist absolute constants c1 , c2 such that c1 (|σ (U1 )−σ (U2 )|+| (U1 )− (U2 )|) ≤ |z (U1 )−z (U2 )|+|w (U1 )−w (U2 )| ≤ c2 (|σ (U1 ) − σ (U2 )| + | (U1 ) − (U2 )|), (1.21) if (σ (Ui ), (Ui )) ∈ R[N0 V (γ )], for any given positive integer N0 , provided that γ is sufficiently close to 1. Now, if the approximate solution U h (x, t) assumes values in a region for which Bakhvalov’s conditions A1 − A4 , recalled in Sect. 3, are satisfied relative to the Riemann invariants z , w (and, hence, also relative to z , w ) then a periodic version of the main result in [1] (cf. [7]) implies that there exists an absolute constant c3 such that Var [(z , w )(U h (t))] ≤ c3 Var [(z , w )(U h (0))],
(1.22)
and so there is an absolute constant c4 such that Var [(σ, )(U h (t))] ≤ c4 Var [(σ, )(U0 )].
(1.23)
We then assume that c4 Var [(σ, )(U0 )] < V (γ ).
(1.24)
Concerning (1.22), the key point in Bakhvalov’s proof of an inequality like this one, in [1], is the introduction of a functional which, restricted to solutions of Riemann problems with a left state Ul and a right state Ur , denoted by (Ul Ur ), is defined by F[(Ul Ur )] = ([z (δ1 )])− + ([w (δ2 )])− , where the first term is the absolute value of the increment in z across the first wave, if it is a shock, and 0 otherwise, and the second is the absolute value of increment of w across the second wave, if it is a shock, and 0 otherwise. Thanks to A1 –A4 this functional is then proven to satisfy F[(Ul Ur )] ≤ F[(Ul Um )] + F[(Um Ur )], with equality holding if Um is a value assumed by (Ul Ur ). An essential feature of the above relation is that it does not involve quadratic terms, as opposed to the original interaction estimate in [8], which is a crucial point for a periodic version of Glimm’s method. Extending, in a natural way, the above functional to the periodic approximate solutions and using periodicity, (1.22) follows. Now, we choose N0 = 3 and show that, for γ sufficiently close to 1, Bakhvalov’s conditions are satisfied in R[3V (γ )], relatively to the Riemann invariants z , w ; this implies that the approximate solution can be defined for the first two steps so that it assumes values in a region where Bakhvalov’s conditions are satisfied relative to the Riemann invariants z , w . To this, it suffices to show that ˜ 0 V (γ )] ⊂ W˜ (σ , k), R[6c
(1.25)
if γ is sufficiently close to 1. Now, clearly, due to (1.16), we have c1 (η + 6c0 V (γ )) < η − 6c0 V (γ ), c2 (k) and k(η − 6c0 V (γ )) ≥ 6c0 V (γ ), if γ is sufficiently close to 1, which proves (1.25). As in the classical case, we easily prove that there is a constant C(γ ), depending on γ , such that 1 (σ, )(U h (x, t1 )) − (σ, )(U h (x, t2 )) dx ≤ C(γ )(|t1 − t2 | + h),
(1.26)
0
for t1 , t2 ∈ [0, T ], as long as U h (t) is defined and satisfies (1.23) in the interval [0, T ]. Now, let us investigate the growth of the oscillation of the approximate solutions with time. Denoting by C(γ ) a positive constant depending on γ , independent of h, T , not necessarily the same as in (1.26), we have |(σ, )(U h (x, t)) − (σ , )| ≤ |(σ, )(U h (x, t)) − (σ, )((U h (t)) )| +|(σ, )((U h (t)) ) − (σ, )(U )| ≤ |σ (x, t) − (σ h (t)) | + | (τ h (x, t)) − ((τ h (t)) )| +C(γ )(t + h) ≤ Var [(σ, )(U h (t))] + C(γ )(t + h) ≤ c4 Var [(σ, )(U0 )] + C(γ )(t + h),
so that, if T > h is such that 2C(γ )T < V (γ ) − c4 Var [U0 ], we have that the approximate solution may be defined up to a time T > T , independent of h, satisfying |(σ, )(U h (x, t)) − (σ , )| < V (γ ),
0 ≤ t ≤ T,
in particular, U h (t) ∈ R[V (γ )], for 0 ≤ t ≤ T . We may assume T = m0 h0 , for some m0 ∈ N, and h0 > 0 such that all h to be used are of the form h = 2−p h0 , for some p ∈ N. In particular, T ∈ Nh, for all time-steps h considered. Now, assume an approximate solution has been defined up to t = N T , for some N ∈ N, satisfying |(σ, )(U h (x, t)) − (σ , )| < V (γ ),
0 ≤ t ≤ NT ,
and suppose |(σ, )((U h (N T )) ) − (σ, )(U )| < V (γ ) − c4 Var [(σ, )(U0 )] − 2C(γ )T . (1.27) Hence, as above, for t > N T such that U h (t) is defined and assume values in R[3V (γ )], we have |(σ, )(U h (x, t)) − (σ , )| ≤ |(σ, )(U h (x, t)) − (σ, )((U h (t)) )| +|(σ, )((U h (t)) ) − (σ, )((U h (N T )) )| +|(σ, )((U h (N T )) ) − (σ, )(U )| ≤ c4 Var [(σ, )(U0 )] + C(γ )(t − N T + h) +|(σ, )((U h (N T )) ) − (σ, )(U )|, so that U h (t) may be defined up to a time N T + T with T > T , independent of h, such that |(σ, )(U h (x, t)) − (σ , )| < V (γ ),
0 ≤ t < N T + T .
(1.28)
The above argument provides the reiteration procedure introduced in [7]. That is, assuming the reiteration has been carried out until the $N$th step, we see, from (1.28), that the approximate solutions can then be defined until a time $NT + T'$, with $T' > T$. Then, using Glimm's argument in [8] for the consistency of his scheme, we can obtain a subsequence of $h$'s and a set $\mathcal{N} \subset \prod_{n=1}^{\infty}(-1/2, 1/2)$, of measure 1, such that the $U^h$ are defined, for $h$ less than a certain $h_N$, and converge in $C([0, NT + T'), L^1_{loc}(\mathbb{R}))$ to a weak solution of (1.9). In particular, (1.27), with $N$ replaced by $N + 1$, holds and we can advance one more step, continuing this way indefinitely.

1.2. An overview of the relativistic case. The situation in the relativistic case becomes more complicated because, first, the proof of the analogue of Theorem 1.2 (see Theorem 4.1), including the new estimates (1.15), requires yet more technical calculations; second, we now completely lack a reference value, such as $\overline{U}$ above, since the variables that are conserved, namely, $U = (U_1, U_2)$ given in (1.4), are not nicely related to the (natural) relativistic Riemann invariants
$$w = \frac12\log\frac{c+v}{c-v} + c\int_0^\rho \frac{\sqrt{p'}}{p + sc^2}\,ds, \qquad z = \frac12\log\frac{c+v}{c-v} - c\int_0^\rho \frac{\sqrt{p'}}{p + sc^2}\,ds,$$
or the corresponding σ = w +z, η = w −z, and transforming to Lagrangian coordinates in this case would not change this situation. The latter forces us to change the argument a bit, as follows. Let an initial data (ρ0 (x), v0 (x)) be given periodic with, say, period 1, with bounded total variation over one period and satisfying 0 < ρ < inf x∈R ρ0 (x) < supx∈R ρ0 (x) < ρ, ¯ v0 < c, with p (ρ) ¯ < c2 . We do not impose any size restriction either on Var (ρ0 , v0 ), or on ρ and ρ. ¯ We will need to use the fact that the region = (z, w) : z ≥ inf z(x, 0), w ≤ sup w(x, 0) x
x
is invariant for the solution of Riemann problems. Let (ρ h (x, t), v h (x, t)) denote the approximate solution in (ρ, v)-coordinates, constructed by Glimm’s method, as above. The invariance of implies that, while ηh (x, t) = η(ρ h (x, t)) assumes values in an interval [η(ρ) − V (γ ), η(ρ) ¯ + V (γ )], for V (γ ) as in the non-relativistic case, we must h h have that σ (x, t) = σ (v (x, t)) assumes values in an interval |σ | ≤ σ (γ ), with σ (γ ) determined by V (γ ) using the bounds for the region . We prove that {|σ | ≤ σ (γ )} ∩ {η(ρ) − V (γ ) < η < η(ρ) ¯ + V (γ )} ⊂ W˜ (a, k),
(1.29)
for γ sufficiently close to 1, where W˜ (a, k) is as in the non-relativistic case and, by the analogue of Theorem 1.2 (see Theorem 4.1 below), has the property that (1.1) satisfies Bakhvalov’s conditions in its image by the map T (θ, a) : (z, w) → (z (z), w (w)). Similar to the non-relativistic case, we choose θ, a conveniently so that (1.29) holds true. Using the inequality corresponding to (1.26), for the approximate solutions in the conservative variables (U1h (x, t), U2h (x, t)), we then obtain rough estimates from above and from below for the mean value (ρ h (t)) , in the form ˆ ρ(γ ˇ ) < (ρ h (t)) < ρ,
for 0 ≤ t < T ,
(1.30)
as long as 2ρ(γ ˇ ) < (ρ h (0))
η(ρ) −
V (γ ) , 2
(1.32)
for γ close to 1. The estimate (1.30), then, tells us that ρ h (x, t) assumes a value in the interval (ρ(γ ˇ ), ρ), ˆ for 0 ≤ t < T , where we agree that we can redefine ρ h (x, t) in any discontinuity point (x0 , t) such that ρ h (x0 , t) is any suitable value in the interval between ρ h (x0 − 0, t) and ρ h (x0 + 0, t), observing that this redefinition does not change the Var (ρ h (·, t)). In terms of the variables (σ, η), this tells us that ηh (x, t) assumes a value from the interval (η(ρ(γ ˇ )), η(ρ)), ˆ for each 0 ≤ t < T . The point is that an estimate as (1.31) follows directly from estimates for (U1,0 ) and from the estimate |σ | < σ .
Now, arguing as in the non-relativistic case, we may find constants c0 , . . . , c4 playing analogous roles, and we have, for 1 < γ < γ0 , for a certain γ0 > 1, (cf. (1.24)) c02 c4 Var [(σ, η)(ρ0 , v0 )]
0, $p' < c^2$, $v^2 < c^2\}$ and the fact that (1.1) is strictly hyperbolic and genuinely nonlinear over this compact set, by applying the following results in [2] and [6]. Let us consider the Cauchy problem for a general $n \times n$ system of conservation laws
$$U_t + F(U)_x = 0, \tag{1.34}$$
$$U(x, 0) = U_0(x). \tag{1.35}$$
Theorem 1.3 (Chen & Frid [2]). Assume that (1.34) is endowed with a strictly convex entropy, and let $U \in L^\infty(\mathbb{R} \times \mathbb{R}_+)$ be a periodic entropy solution of (1.34)-(1.35), with, say, period 1. Denote $U^T(x,t) = U(Tx, Tt)$. If the sequence $U^T$, $T \to \infty$, is pre-compact in $L^1_{loc}(\mathbb{R} \times \mathbb{R}_+)$ then one has
$$\operatorname*{ess\,lim}_{t\to\infty} \int_0^1 |U(x,t) - \overline{U}|\,dx = 0, \qquad \text{where } \overline{U} = \int_0^1 U_0(x)\,dx.$$

Theorem 1.4 (DiPerna [6]). Assume (1.34) is a strictly hyperbolic genuinely nonlinear $2 \times 2$ system. Let $U^T$, $T \in I$, for some index set $I$, be a family of entropy solutions of initial value problems for (1.34), which is uniformly bounded in $L^\infty(\mathbb{R} \times \mathbb{R}_+)$. Then $U^T$ is pre-compact in $L^1_{loc}(\mathbb{R} \times \mathbb{R}_+)$.

We remark that the existence of a strictly convex entropy for (1.1), defined on a compact set where the periodic solution assumes its values, is a consequence of a well known result of Lax (see [10]), by using the results in Sect. 2 below. The fact that solutions constructed by Glimm's method are entropy solutions is also proved in [10].

The remainder of this paper is organized as follows. In Sect. 2, we recall several properties of system (1.1). In Sect. 3, we recall Bakhvalov's and DiPerna's conditions. In Sect. 4, we state Theorem 4.1, which is our extension of Theorem 1.2 to the relativistic case, including the new asymptotic information (1.15), mentioned above. In Sect. 5, we prove the existence part of Theorem 1.1. Finally, in Sect. 6, we give the rather technical proof of Theorem 4.1.
2. Properties of the System (1.1)

In this section we collect some properties of the system (1.1). The proofs can be found in [15] and [3].

Lemma 2.1 (cf. [15], p. 79). (i) The mapping $(U_1, U_2) \mapsto (\rho, v)$, as given by (1.4), is one-to-one and the determinant of its Jacobian is nonzero when $\rho > 0$, $|v| < c$.
(ii) The system (1.1) is strictly hyperbolic and genuinely nonlinear when $|v| < c$, $0 < \sqrt{p'(\rho)} < c$, and has two real eigenvalues
$$\lambda_1 = \frac{v - \sqrt{p'}}{1 - \frac{v\sqrt{p'}}{c^2}}, \qquad \lambda_2 = \frac{v + \sqrt{p'}}{1 + \frac{v\sqrt{p'}}{c^2}}. \tag{2.1}$$
(iii) There is the pair of "classical" Riemann invariants
$$w = \frac12\log\frac{c+v}{c-v} + c\int_0^\rho \frac{\sqrt{p'(s)}}{p(s) + sc^2}\,ds, \qquad z = \frac12\log\frac{c+v}{c-v} - c\int_0^\rho \frac{\sqrt{p'(s)}}{p(s) + sc^2}\,ds; \tag{2.2}$$
the mapping $(w, z) \mapsto (\rho, v)$ is one-to-one and the determinant of the corresponding Jacobian is nonzero when $\rho > 0$, $|v| < c$.

Remark 2.1. In view of the above lemma we will often refer to a given state of the system (1.1) in different state spaces by marking coordinates with the same label; for example, $U_R$, $(z_R, w_R)$ and $(\rho_R, v_R)$ are assumed to be connected by (1.4) and (2.2).

From the results in [3] it follows that it is possible to choose
$$z = R_1(w; z_L, w_L), \qquad z = R_2(w; z_L, w_L), \qquad w < w_L \tag{2.3}$$
as the parametrization of shock curves of the first and the second family with the given state on the left $(z_L, w_L)$, and
$$z = L_1(w; z_R, w_R), \qquad z = L_2(w; z_R, w_R), \qquad w > w_R \tag{2.4}$$
as the parametrization of shock curves of the first and the second family, with the given state on the right $(z_R, w_R)$. Moreover, the shock curves have the following properties.

Lemma 2.2 (cf. [3], p. 1623). Let $(\rho_L, v_L)$ and $(\rho_R, v_R)$ be two states (on the left and right) connected by a shock of the first family; then $v_R < v_L$, $\rho_R > \rho_L$, $1 < \frac{\partial R_1}{\partial w}(w_R; z_L, w_L) < +\infty$, and $1 < \frac{\partial L_1}{\partial w}(w_L; z_R, w_R) < +\infty$. If $(\rho_L, v_L)$ and $(\rho_R, v_R)$ are connected by a shock of the second family, then $v_R < v_L$, $\rho_R < \rho_L$, $0 < \frac{\partial R_2}{\partial w}(w_R; z_L, w_L) < 1$, and $0 < \frac{\partial L_2}{\partial w}(w_L; z_R, w_R) < 1$.

The shocks are admissible in the sense of Lax, when $\rho > 0$. This follows from the next lemma, since we assume that $p = \kappa\rho^{\gamma}$ with $\gamma > 1$.
Lemma 2.3 (cf. [3], p. 1613). If $p(\rho)$ satisfies $p'(\rho) > 0$, $p''(\rho) \ge 0$, then Lax shock conditions hold, i.e., $\lambda_1(U_R) < s < \lambda_1(U_L)$, $s < \lambda_2(U_R)$, for 1-shocks, and $s > \lambda_1(U_L)$, $\lambda_2(U_R) < s < \lambda_2(U_L)$, for 2-shocks.

In addition, the shock curves are monotone in the way described by the next lemma.

Lemma 2.4 (cf. [3], p. 1619). Given the left state $(\rho_L, v_L)$, the shock curves are star-like in the $(\rho, v)$ plane, when $1 \le \gamma \le 2$.

As was noted in [15], the system (1.1) is invariant under a Lorentz transformation, meaning that if $(\bar t, \bar x)$ is a reference frame that moves with velocity $\mu$, as measured in the $(t, x)$ frame, then the system (1.1) does not change when rewritten in $(\bar t, \bar x)$ coordinates, provided that velocities of particles are calculated in the barred reference frame. The correspondence between velocities of a particle in the unbarred frame, $v$, and the barred frame, $\bar v$, is given by the rule
$$v = \frac{\mu + \bar v}{1 + \mu\bar v/c^2}.$$
The density $\rho(t, x)$ is invariant under a Lorentz transformation. Moreover, there is an invariant functional of velocities.

Lemma 2.5 (cf. [15], p. 74). Let velocities of two fluid particles be given by $v_R$ and $v_L$, as measured in coordinates $(t, x)$, and $\bar v_R$ and $\bar v_L$, as measured in coordinates $(\bar t, \bar x)$, obtained from $(t, x)$ by a Lorentz transformation; then
$$\log\frac{c + v_R}{c - v_R} - \log\frac{c + v_L}{c - v_L} = \log\frac{c + \bar v_R}{c - \bar v_R} - \log\frac{c + \bar v_L}{c - \bar v_L}.$$
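The invariance in Lemma 2.5 can be checked numerically from the velocity rule alone. The sketch below is an illustration, not part of the paper: the function names and the sample velocities are hypothetical. It relies on the identity $\frac{c+v}{c-v} = \frac{c+\mu}{c-\mu}\cdot\frac{c+\bar v}{c-\bar v}$, so that a change of frame adds the same constant to both logarithms.

```python
import math

def to_unbarred(v_bar, mu, c):
    """Velocity in the (t, x) frame of a particle with velocity v_bar in the
    barred frame, which moves with velocity mu relative to (t, x)."""
    return (mu + v_bar) / (1.0 + mu * v_bar / c**2)

def log_ratio(v, c):
    """log((c + v) / (c - v)), defined for |v| < c."""
    return math.log((c + v) / (c - v))

def invariant(v_right, v_left, c):
    """The functional of Lemma 2.5: the difference of the two log ratios."""
    return log_ratio(v_right, c) - log_ratio(v_left, c)
```

For instance, with `c = 2.0`, `mu = 0.6` and barred velocities `0.3` and `-0.5`, evaluating `invariant` on the barred velocities and on their images under `to_unbarred` should give the same number up to rounding.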
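The above lemma and (2.2) motivate the introduction of functions $\sigma$ and $\eta$ by
$$\sigma = w + z = \log\frac{c+v}{c-v}, \tag{2.5}$$
$$\eta = w - z = 2c\int_0^\rho \frac{\sqrt{p'(s)}}{p(s) + sc^2}\,ds = \frac{2\sqrt{\gamma}}{\gamma-1}\arctan\frac{\kappa\rho^{\frac{\gamma-1}{2}}}{c}. \tag{2.6}$$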
Let us consider the shock curves in the $(\sigma, \eta)$ plane, which is obtained from the $(z, w)$ plane through rotation by the angle $-\frac{\pi}{4}$.

Remark 2.2. Throughout the paper we will use the same notation for the shock curves in the $(z, w)$ and $(\sigma, \eta)$ planes. The use of $(\sigma, \eta)$ coordinates as a state space will prove to be useful due to the following fact.

Lemma 2.6. For any two states $(\sigma_0, \eta_0)$ and $(\sigma_1, \eta_0)$ and for each $i = 1, 2$, the two shock curves $\sigma = R_i(\eta; \sigma_0, \eta_0)$ and $\sigma = R_i(\eta; \sigma_1, \eta_0)$ are identical up to a translation along the $\sigma$-axis.
Proof. Let $v_0$ be such that $\sigma_0 = \sigma(v_0)$, where $\sigma(v)$ is given by (2.5). Let the new reference frame $(\bar t, \bar x)$ be chosen in such a way that $\sigma_0$ computed in the new reference frame is 0. This is the case if $(\bar t, \bar x)$ moves relative to the given reference frame with velocity $v_0$. Then, by Lemma 2.5, $\bar\sigma_1 = \sigma_1 - \sigma_0$, and similarly $R_i(\eta; \sigma_1, \eta_0) - \sigma_1 = R_i(\eta; \sigma_1 - \sigma_0, \eta_0) - (\sigma_1 - \sigma_0)$. As a result we get $R_i(\eta; \sigma_1, \eta_0) = R_i(\eta; \sigma_1 - \sigma_0, \eta_0) + \sigma_0$. Choose $\sigma_0 = \sigma_1$. We conclude that $R_i(\eta; \sigma_1, \eta_0) = \sigma_1 + R_i(\eta; 0, \eta_0)$.
Remark 2.3. The above lemma also holds for $\sigma = L_i(\eta; \sigma_0, \eta_0)$, $i = 1, 2$.

For the purpose of the paper it will be sufficient to use the Rankine-Hugoniot condition in the following form.

Lemma 2.7. Let $(\rho_L, v_L) = (\rho, v)$ and $(\rho_R, v_R) = (\rho_R, 0)$ be two states (on the left and right) connected by a shock; then
$$v = c^2\sqrt{\frac{(p - p_R)(\rho - \rho_R)}{(p + \rho_R c^2)(p_R + \rho c^2)}}, \tag{2.7}$$
where $p_R = p(\rho_R)$.

3. Bakhvalov and DiPerna Conditions

Generalizing Nishida's method of proof of the existence of solutions of isothermal gas dynamics with initial data from the class $L^\infty \cap BV_{loc}(\mathbb{R})$, Bakhvalov, in [1], introduced a class of $2 \times 2$ strictly hyperbolic and genuinely nonlinear systems, characterized by the particular geometry of the shock curves in the plane of Riemann invariants, for which the existence result for the same class of initial data can be proven. We follow [5] in the exposition of Bakhvalov conditions. Consider a strictly hyperbolic, genuinely nonlinear system
$$\partial_t U + \partial_x F(U) = 0, \tag{3.1}$$
where $U = (U_1, U_2)$ and $F(U) = (f_1(U), f_2(U))$. Let $\lambda_1 < \lambda_2$ be the characteristic speeds of (3.1). Let $z, w$ be a pair of Riemann invariants for (3.1) such that in its domain of definition the map $(U_1, U_2) \mapsto (z, w)$ is one-to-one. Let the shock curves of the first and second family be parameterized by
$$\begin{aligned} z &= R_1(w; z_0, w_0),\ w \le w_0; &\qquad z &= L_1(w; z_0, w_0),\ w \ge w_0; \\ z &= R_2(w; z_0, w_0),\ w \le w_0; &\qquad z &= L_2(w; z_0, w_0),\ w \ge w_0. \end{aligned} \tag{3.2}$$
In the above, the state (z, w) is a state which can be connected on the left (Li ) and on the right (Ri ) to (z0 , w0 ) by a shock of the i th family. Finally, let = (z, w) : z ≥ inf z(x, 0), w ≤ sup w(x, 0) . (3.3) x
x
The next hypotheses impose conditions on the shock curves under which the solvability of the Cauchy problem with locally bounded variation is obtained. A1 : sup |λi (z, w)| < ∞. i,
∂R2 ∂L2 ∂R1 ∂L1 , < +∞, 0 < , < 1, w = w0 . ∂w ∂w ∂w ∂w A3 : If zr = Ri (wr ; zl , wl ), i = 1, 2, then shock curves z = Ri (w; zl , wl ), w ≤ wl and z = Li (w; zr , wr ), w ≥ wr intersect only in points (zl , wl ), (zr , wr ). A4 : If four points (zl , wl ), (zr , wr ), (zm , wm ) and (ˆzm , wˆ m ) satisfy zm = R2 (wm ; zl , wl ), zr = R1 (wr ; zm , wm ), zˆ m = R1 (wˆ m ; zl , wl ) and zr = R2 (wr ; zˆ m , wˆ m ), then (zl − zˆ m ) + (wˆ m − wr ) ≤ (wl − wm ) + (zm − zr ).
A2 : ∀(z, w) ∈ , 1
0, with p (ρ) < c(1 − δ0 ), (4.1)
for some δ0 > 0, independent of γ , and σ < +∞ be given. Let W (a, k) = {(σ, η) : |σ − a| < kη} , where a > 0 is a constant such that |a| < σ . Let W˜ (a, k) = W (a, k) ∩ (σ, η) : 0 < η < η(ρ), |σ | < σ . Define a map T (a, θ) : R2 → R2 as T (a, θ ) : (z, w) → (z , w ), z = 1 − exp 2θ (a/2 − z), w = −1 + exp 2θ (w − a/2), θ ∈ R.
(4.2)
This map written in $\sigma = z + w$, $\eta = w - z$ variables has the following form:
$$\sigma' = \exp\theta(\eta + \sigma - a) - \exp\theta(\eta - \sigma + a), \qquad \eta' = \exp\theta(\eta + \sigma - a) + \exp\theta(\eta - \sigma + a) - 2.$$
The next theorem and its proof are analogous to Theorem 3.3 in [5], except for the second part, in which we investigate the dependence of system (1.1) on $\gamma$.

Theorem 4.1. There are positive constants $c_1, c_2$, depending on $(\gamma, k, \sigma, \delta_0)$, such that shock curves in $\tilde W(a,k) \cap \{\frac{c_1}{\theta} < \eta < \frac{c_2}{\theta}\}$ are mapped by $T(a,\theta)$ to shock curves satisfying hypotheses $A_1, A_2, A_3, B_4$. Moreover, we can choose $\sigma = \sigma(\gamma) \to +\infty$ and $k = k(\gamma) \to 0$, when $\gamma \to 1$, such that the following limits hold:
$$\frac{k}{\gamma-1} \to +\infty, \qquad c_2 \to +\infty, \qquad c_2 k \to 0, \qquad \frac{c_1}{c_2} \to 0, \qquad \text{as } \gamma \to 1+. \tag{4.3}$$
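The two descriptions of the map $T(a,\theta)$, in $(z, w)$ variables as in (4.2) and in $(\sigma, \eta)$ variables as displayed before Theorem 4.1, can be compared numerically. The sketch below is only a consistency check with hypothetical function names and sample values; it uses $z = (\sigma - \eta)/2$, $w = (\sigma + \eta)/2$, which follows from $\sigma = z + w$, $\eta = w - z$.

```python
import math

def transform_zw(z, w, a, theta):
    """T(a, theta): (z, w) -> (z', w') as in (4.2)."""
    z_new = 1.0 - math.exp(2.0 * theta * (a / 2.0 - z))
    w_new = -1.0 + math.exp(2.0 * theta * (w - a / 2.0))
    return z_new, w_new

def transform_sigma_eta(sigma, eta, a, theta):
    """The same map written in sigma = z + w, eta = w - z variables."""
    e_plus = math.exp(theta * (eta + sigma - a))
    e_minus = math.exp(theta * (eta - sigma + a))
    return e_plus - e_minus, e_plus + e_minus - 2.0
```

Given a state `(z, w)`, the pair returned by `transform_sigma_eta(z + w, w - z, a, theta)` should equal `(z' + w', w' - z')` computed from `transform_zw`. Note also that the state $z = w = a/2$ is mapped to the origin, since both exponentials reduce to 1.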
We postpone the proof of Theorem 4.1 to Sect. 6.

5. Cauchy Problem

In this section we prove the existence part of Theorem 1.1; the decay (1.7) was already explained in Subsect. 1.3. We carry out the construction of a periodic weak solution by the adaptation of Glimm's scheme introduced by H. Frid in [7].

Proof of Theorem 1.1. In contrast with the non-relativistic case, for the physically meaningful values of $\rho$ the sound speed must be smaller than the speed of light, that is, $p'(\rho) < c^2$. In the equation of state we assume that $\kappa < c$. Then, for all $\gamma_0 > 1$ there is $\hat\rho(\gamma_0) > 0$ such that $p'(\hat\rho) = c^2$. It is easy to see that $\hat\rho$ increases to $+\infty$ as $\gamma \to 1+$. We restrict $\gamma$ to be so small that $\gamma \in (1, \gamma_0)$, where $\gamma_0$ is such that
$$\bar\rho < \hat\rho(\gamma_0). \tag{5.1}$$
We will impose further restrictions on $\gamma_0$ in the course of the proof. Bounds (1.3) imply that there is $\bar\sigma > 0$ such that
$$\sup_{x\in\mathbb{R}} |\sigma(v_0(x))| < \bar\sigma. \tag{5.2}$$
Consider the set = (z, w) : w ≤ sup w0 (x), z ≥ inf z0 (x) . x
x
is an invariant set for a Riemann problem formed with any two states in it. Next, we define the set which serves as an “admissible” region for approximate solutions defined through Glimm scheme. Let σ = σ (γ ) be the function from Theorem 4.1; its explicit form, σ = σ¯ +β log(γ − 1)−1 , β > 0, is given by Lemma 6.9 of Sect. 6. Lemma 5.1. There exist absolute constants δ0 , α > β and γ0 and functions ρ < ρ, ¯ ρ> ρ¯ such that for any (η, σ ) in ∩ η(ρ) < η < η(ρ) and 1 < γ < γ0 we have ρ ≤ ρ ≤ ρ,
|σ | ≤ σ ,
p (ρ) < c(1 − δ0 ).
(5.3)
Moreover, ρ = ρ(γ ¯ − 1)α ,
(5.4)
¯ − log (γ − 1), η(ρ) = η(ρ)
(5.5)
for some > 0. Remark 5.1. Conditions ρ < ρ and |σ | < σ are needed to apply the results of the previous section, whereas condition ρ > ρ is needed to obtain the lower bound for the average of ρ(t, ·). Proof. Let us choose ρ > ρ¯ from the equation ¯ = log (γ − 1)−1 , η(ρ) − η(ρ)
(5.6)
κ (γ −1)/2 ρ < 1 − δ0 , c
(5.7)
with > 0 so small that
for some δ0 > 0. Such ρ exists. Indeed, from (5.6), upon use of (2.6), we have κ (γ −1)/2 κ (γ −1)/2 γ −1 −1 . ρ = tan arctan ρ¯ + √ log (γ − 1) c c 2 γ
(5.8)
By assumptions on the initial data (1.3), κc ρ¯ (γ −1)/2 < 1 − δˆ0 , for some δˆ0 > 0. Also, the function 2γ√−1γ log (γ − 1)−1 is bounded. We now can choose such that the argument of tan in (5.8) is less than π/4, verifying by this (5.7). By the choice of ρ, (5.6), we trivially have ρ > ρ. ¯
Let ρ = ρ(γ ¯ − 1)α , α > 0. Then, for some ψ ∈ (ρ(γ ¯ − 1)α , ρ), ¯ η(ρ) ¯ − η(ρ) = = ≤ = =
√ (γ −1)/2 κρ (γ −1)/2 2 γ κ ρ ¯ arctan − arctan γ −1 c c √ κ2 γ 1 (γ −1)/2 α(γ −1)/2 ρ ¯ 1 − (γ − 1) c(γ − 1) 1 + k 2 c−2 ψ γ −1 √ 2 γ κ (γ −1)/2 1 − (γ − 1)α(γ −1)/2 ρ¯ (5.9) c γ −1 √ γ κ (γ −1)/2 ρ¯ α log(γ − 1)−1 (1 + o(1)) c
p (ρ) ¯ (1 + o(1)) α log(γ − 1)−1 . c
Note that by assumptions on the initial data, κc ρ¯ (γ −1)/2 < 1 − δˆ0 , for some δˆ0 > 0. We thus obtain that η(ρ) ¯ − η(ρ) ≤ β log(γ − 1)−1 for all γ close to 1, if we choose α > β > 0 such that (1 − δ0 )α < β. Since we restrict (σ, η) to the set we have √ 2 γ 1 c + v0 (x) κρ0 (x)(γ −1)/2 σ + η ≤ sup log + arctan 2 c − v0 (x) γ − 1 c x ≤ σ¯ + η(ρ), ¯ from where we deriveσ ≤ σ¯ + η(ρ) ¯ − η(ρ) ≤ σ¯ + β log(γ − 1)−1 = σ . Analogously,σ ≥ −σ¯ − η(ρ) ¯ + η(ρ) = −σ . Define
a = {|σ | < σ } ∩ η(ρ) < η < η(ρ) .
Let U = (U1 (x), U2 (x)) be a vector function in BVloc (R), periodic with period 1, such that (σ ◦ U (x), η ◦ U (x)), obtained through (1.4), (2.5) and (2.6), assumes values in a , for all x ∈ R, and such that 1 (U (x) − U0 (x)) dx < 1 ρ. (5.10) 2 0 We can obtain the bounds on the average of ρ ◦ U (x). From the definition of function U1 in (1.4) we get U1 ≥ ρ.
(5.11) σ
4e Also, since values of (σ, η) belong to a , it follows that v 2 < c2 , c2 −v 2 = c2 (1+e σ )2 ≥
c2 e−σ ,
p ρc2
< 1 and thus U1 = ρ
p v2 + 1 + ρ ≤ 3eσ ρ. ρc2 c2 − v 2
(5.12)
Similarly, for initial data (ρ0 (x), σ0 (x)) satisfying bounds (1.3) and (5.2) we have ρ < U0,1 (·) < 3eσ¯ ρ, ¯
(5.13)
which implies the following estimate for the space average over the period of U0,1 . ¯ (5.14) ρ < U0,1 < 3eσ¯ ρ. Using (5.10), (5.11) and (5.12) we arrive at the following inequalities. e−σ ρ < (ρ ◦ U ) < 4eσ¯ ρ. ¯ 6
(5.15)
Remark 5.2. The in (5.15) imply, in particular, that ρ ◦ U(x) takes a value inequalities in the interval ρ, ˇ ρˆ , where ρˇ = e−σ ρ/6, ρˆ = 4eσ¯ ρ, ¯
(5.16)
and we agree that we can redefine ρ ◦ U (x) in any discontinuity point x0 such that ρ ◦ U (x0 ) is any suitable value in the interval between ρ ◦ U (x0 − 0) and ρ ◦ U (x0 + 0), observing that this redefinition does not change the Var (ρ ◦ U ). Remark 5.3. A computation similar to (5.9) shows that there is a positive constant ε < β such that for any (η, σ ) ∈ ∩ η(ρ) ˇ < η < η(ρ) ˆ and γ close to 1 it holds that |σ | < σˆ = σ¯ − ε log(γ − 1).
(5.17)
c = {|σ | < σˆ } ∩ η(ρ) ˇ < η < η(ρ) ˆ .
(5.18)
Define
Let us apply Theorem 4.1 with ρ and σ chosen as above and a = 0, θ =
c2 . 2η(ρ) ˆ
(5.19)
Map T (0, θ ) takes shock curves in c1 W˜ = {|σ | < kη} ∩ |σ | < σ ∩ max{2 η(ρ), ˆ η(ρ)} < η < min{2η(ρ), ˆ η(ρ)} c2 (5.20) to shock curves satisfying hypotheses A1 − A4 . Let ˜ (γ )] ≡ η(ρ) R[V ˇ − V (γ ) < η < η(ρ) ˆ + V (γ ) ∩ |σ | < σˆ + V (γ ) .
(5.21)
Lemma 5.2. There exist a positive, monotone increasing function V (γ ), 1 < γ < 2, with V (γ ) = o(− log(γ − 1)) as γ → 1+, and γ0 > 1 such that for all 1 < γ < γ0 we have ˜ (γ )] ⊂ a ⊂ W˜ . c ⊂ R[V
(5.22)
Proof. The first inclusion in (5.22) holds trivially for any V (γ ) > 0. To find V (γ ) with properties stated in the lemma and to prove the second inclusion it is enough to show that ˆ ≥ −1 log(γ − 1), η(ρ) − η(ρ) η(ρ) ˇ − η(ρ) ≥ −2 log(γ − 1),
(5.23) (5.24)
for some i > 0, i = 1, 2 and γ close to 1 and then use the fact that σ − σˆ = (ε − β) log(γ − 1), ε < β. Consider ˆ = η(ρ) − η(ρ) ¯ + η(ρ) ¯ − η(ρ) ˆ η(ρ) − η(ρ) −1 ≥ log(γ − 1) , 2 ¯ − η(ρ) ¯ is bounded independently of by (5.6) and the fact that η(ρ) ˆ − η(ρ) ¯ = η(4eσ¯ ρ) γ close to 1. We establish (5.24) with 1 = 2 . Recalling the definitions ρˇ = ρe−σ /6 = ρeσ¯ (γ − 1)β /6, ρ = ρ(γ ¯ − 1)α , α > β > 0, we have the following inequality: η(ρ) ˇ − η(ρ) = η
ρe−σ¯ 6
(γ − 1)
β
− η(ρ(γ ¯ − 1)α )
σ¯ (γ −1)/2 ¯ (α−β) √ 1 − 6ρe 2κ γ ρe−σ¯ ρ (γ − 1) β ≥ (γ − 1) c 6 γ −1
(γ −1)/2
(5.25) = −2 log(γ − 1), for some 2 > 0. This verifies (5.24). To prove that a ⊂ W˜ we show that for γ sufficiently close to 1. The next conditions hold: min{2η(ρ), ˆ η(ρ)} = η(ρ), c1 max{2 η(ρ), ˆ η(ρ)} = η(ρ), c2 η(ρ)k − σ > 0.
(5.26) (5.27) (5.28)
Since (γ − 1)η(ρ) ˆ > 0 uniformly in γ close to 1 and ˆ = η(ρ) − η(ρ) ¯ + η(ρ) ¯ − η(ρ) ˆ η(ρ) − η(ρ) −1 ≤ 2 log(γ − 1) , ¯ − η(ρ) ¯ is bounded independently of γ by (5.6) and the fact that η(ρ) ˆ − η(ρ) ¯ = η(4eσ¯ ρ) close to 1 we conclude (5.26). On the other hand, since c1 /c2 → 0 as γ → 1+ by Theorem 4.1, (5.27) holds. By Lemma 6.9, σ = σ¯ −β log(γ −1) and k = (γ −1)−ι(γ −1) −1 = ι(γ − 1) log(γ − 1)−1 (1 + o(1)) and β/ι < 2 arctan(κ/c). Thus κ(ρ) ¯ (γ −1)/2 √ (1 + o(1)) log(γ − 1)−1 η(ρ)k − σ > ι2 γ arctan c − σ¯ − β log(γ − 1)−1 = −3 log(γ − 1), with 3 = 2ι arctan
κ c
− β. This verifies the last condition (5.28).
Remark 5.4. For a function V (γ ) as in Lemma 5.2 and a positive constant N , (5.22) holds also for N V (γ ), provided that γ is sufficiently close to 1. Lemma 5.3. There exists a constant C0 > 0, independent of γ , such that |η(ρ1 ) − η(ρ2 )| ≤ C0 |ρ1 − ρ2 |,
(5.29)
if ρ ≤ ρi ≤ ρ, ¯ i = 1, 2 and provided that γ is sufficiently close to 1. Proof. This follows immediately from the definition of η(ρ) in (2.6).
Let (z , w ) be the pair of Riemann invariants given by Theorem 4.1, when a = 0, c2 θ = 2η( . It is convenient to use a normalized version of (z , w ). Define (z , w ) by ρ) ˆ z =
exp(−θ η(ρ)) ˇ z, θ
w =
exp(−θη(ρ)) ˇ w . θ
(5.30)
The next lemma establishes that increments in (z , w ) are comparable with corresponding increments in (σ, η). Lemma 5.4. There exists an absolute constant C > 0 such that C −1 (|σ1 − σ2 | + |η1 − η2 |) ≤ |z1 − z2 | + |z1 − w2 | ≤ C(|σ1 − σ2 | + |η1 − η2 |), (5.31) ˜ if (σi , ηi ) ∈ R[KV (γ )], zi = z (σi , ηi ), and wi = w (σi , ηi ), for any K > 0, provided that γ is close to 1. Proof. Let us consider dependence z = z (σ, η), and w = w (σ, η). The determi ,w ) nant of Jacobian matrix, J = ∂(z ˇ and each element ∂(σ,η) , equals −2 exp(θ (η − η(ρ))) Jij , i, j = 1, 2, is smaller than exp(θ|σ |+θ|η −η(ρ)|). ˜ ˇ The set R[KV (γ )] is convex in the (σ, η) plane and, consequently, using notation J =
ˇ eθ|σ |+θ |η−η(ρ)| ,
sup
(5.32)
˜ [KV (γ )] (σ,η)∈R
we can write the following estimate: |z1 − z2 | + |z1 − w2 | ≥ |z1 − z2 | + |w1 − w2 | ≤
inf
˜ [KV (γ )] (σ,η)∈R
sup ˜ [V (γ )] (σ,η)∈R
ˇ −2e(θ(η−η(ρ)) ||J ||(|σ1 − σ2 | + |η1 − η2 |),
ˇ −2e(θ(η−η(ρ)) ||J ||(|σ1 − σ2 | + |η1 − η2 |).
Let us show that for the choice of θ, (5.19), it holds that θ(η(ρ) ˆ − η(ρ) ˇ + 2KV (γ ) + σˆ ) → 0, with γ → 1+. This will suffice to conclude the lemma since, for (σ, η) ∈ ˜ R[KV (γ )], η ranges over the interval [η(ρ) ˇ − KV (γ ), η(ρ) ˆ + KV (γ )] and |σ | < σˆ + KV (γ ). By Lemma 5.2, V (γ ) = o (− log(γ − 1)) → ∞ as γ → 1+. The estimate similar to (5.25) shows that η(ρ) ˆ − η(ρ) ˇ = O (− log(γ − 1)), and, by (5.17), σˆ = σ¯ − ε log(γ − 1). The definition of η = η(ρ) and ρˆ = 4eσ¯ ρ¯ imply that √ √ γ γ κ κ η(ρ) ˆ =2 arctan (4eσ¯ ρ) (arctan + o(1)). ¯ (γ −1)/2 = 2 (γ − 1) c (γ − 1) c
√ By Lemma 6.9, k = ι(γ − 1) log(γ − 1)−1 (1 + o(1)), and so η(ρ)k ˆ = ι2 γ arctan( κc ) log(γ − 1)−1 (1 + o(1)). Since, by Theorem 4.1, c2 k → 0, we have for γ → 1+ that θ (η(ρ) ˆ − η(ρ) ˇ + 2KV (γ ) + σˆ ) c2 k (η(ρ) ˆ − η(ρ) ˇ + V (γ ) + σ¯ − ε log(γ − 1)) = 2η(ρ)k ˆ o(1)O (− log(γ − 1)) = → 0. 2ιγ arctan( κc )(1 + o(1)) log(γ − 1)−1
Now, we construct a sequence of approximate solutions, $U^h(x, t)$, using Glimm's method, as recalled in Subsect. 1.1. We consider in detail the solutions of the Riemann problems that make up $U^h$. Denote $(\sigma^h, \eta^h) = (\sigma \circ U^h, \eta \circ U^h)$, $(z^h, w^h) = (z \circ U^h, w \circ U^h)$. Assume that $U^h(x, t)$ can be constructed in the interval $[0, nh]$, satisfying (5.10). Then $\eta^h(\cdot, (n-1)h)$ takes a value in $[\eta(\check\rho), \eta(\hat\rho)]$. Furthermore, since $(z^h, w^h)(x, t)$ assumes values in for all $(x, t) \in \mathbb{R} \times [0, (n-1)h]$, it follows from Remark 5 that $(\sigma^h(\cdot, (n-1)h), \eta^h(\cdot, (n-1)h))$ takes a value in ${}_c$. Denote this value by $(\sigma_c, \eta_c)$. If
\[ \operatorname{Var}[(\sigma^h(\cdot, t), \eta^h(\cdot, t))] < V(\gamma), \qquad t \in [0, (n-1)h], \tag{5.33} \]
then, upon using the fact that Bakhvalov's condition $A_2$ holds in the $(z, w)$ plane, we obtain
\[ (\sigma^h(\cdot, t), \eta^h(\cdot, t)) \in \{(\sigma, \eta) : |\sigma - \sigma_c| + |\eta - \eta_c| < 2V(\gamma)\}, \qquad t \in [(n-1)h, nh], \tag{5.34} \]
\[ (\sigma^h(\cdot, t), \eta^h(\cdot, t)) \in B_0 = \{(\sigma, \eta) : |\sigma - \sigma_c| + |\eta - \eta_c| < 4V(\gamma)\}, \qquad t \in [nh, (n+1)h]. \tag{5.35} \]
By the same argument, using the notation in Sect. 3, we derive the inclusion
\[ R[R[B_0]] \subset B_1 = \{(\sigma, \eta) : |\sigma - \sigma_c| + |\eta - \eta_c| < 16V(\gamma)\}. \tag{5.36} \]
We have $B_1 \subset \tilde R[16V(\gamma)]$ and so $B_1 \subset \tilde W$, provided that $\gamma$ is close to 1, by Remark 5.4. Theorem 4.1 ensures that the map $T(0, \theta)$ takes shock curves in $B_1$ onto shock curves satisfying Bakhvalov's conditions $A_1$ to $A_4$. It is straightforward to see that $B_i' = T(0, \theta)[B_i]$, $i = 0, 1$, are rectangles in the $(z', w')$ plane and moreover, by (5.36), $R[R[B_0']] \subset B_1'$. Let $F[(U_l U_r)]$ be the functional given by Definition 3.1, using $(z', w')$ instead of $(z, w)$ as the pair of Riemann invariants. The approximate solutions $U^h(\cdot, \nu h + 0)$ are piecewise constant when $\nu = 0, \ldots, n$. Assume, as we may, that $\Delta x = l = 2^{-k_l}$, for some $k_l \in \mathbb{N}$. Define
\[ F[U^h](t) := \sum_{1 \le j \le 2^{k_l}} F[(U^h((j-1)l, \nu h + 0)\, U^h(jl, \nu h + 0))], \qquad t \in [\nu h, (\nu + 1)h). \tag{5.37} \]
We have the following corollary of Lemma 3.2.
Corollary 5.1.
\[ F[U^h](nh) \le F[U^h]((n-1)h) \le \cdots \le F[U^h](0). \tag{5.38} \]
Moreover, for $t \in [0, (n+1)h)$, we have
\[ \tfrac{1}{2}\operatorname{DV}[(z'^h, w'^h)(\cdot, t)] \le F[U^h](t) \le \operatorname{Var}[(z'^h, w'^h)(\cdot, t)], \tag{5.39} \]
where DV stands for the decreasing variation over one period.

Proof. The first inequality in (5.38) follows from Lemma 3.2, recalling the construction of Glimm's approximate solution, using, in the case $-1/2 < \alpha_n \le 0$,
\[ F[(U^h((j-1)l, (n-1)h + 0)\, U^h(jl, (n-1)h + 0))] = F[(U^h((j-1)l, nh - 0)\, U^h((j + \alpha_n)l, nh - 0))] + F[(U^h((j + \alpha_n)l, nh - 0)\, U^h(jl, nh - 0))], \]
\[ F[(U^h((j - 1 + \alpha_n)l, nh - 0)\, U^h((j-1)l, nh - 0))] + F[(U^h((j-1)l, nh - 0)\, U^h((j + \alpha_n)l, nh - 0))] \ge F[(U^h((j - 1 + \alpha_n)l, nh - 0)\, U^h((j + \alpha_n)l, nh - 0))] \]
and periodicity. In the case $0 < \alpha_n < 1/2$, we proceed similarly, only replacing $j - 1$ by $j$ in both relations above. The subsequent inequalities in (5.38) are reiterations of the first one with time steps $(n-k)h$ and $(n-k-1)h$ instead of $nh$ and $(n-1)h$, for $k = 1, \ldots, n-1$. As for the inequalities (5.39), the first one holds because $z$ and $w$ decrease in shocks and increase in rarefaction waves, and also because, by property $A_2$, we have $|[w(\delta_1)]| \le |[z(\delta_1)]|$ and $|[z(\delta_2)]| \le |[w(\delta_2)]|$, where $\delta_1$ ($\delta_2$) is a shock of the first (second) family. The last inequality in (5.39) is then obvious.

We can use the non-increasing property of the functional $F$ to bound the total variation per period of the approximate solutions.

Lemma 5.5. For any $t$ such that $0 < t < (n+1)h$,
\[ \operatorname{Var}[(z'^h, w'^h)](t) \le 4\operatorname{Var}[(z'^h, w'^h)](0). \tag{5.40} \]

Proof. Let $0 < t < (n+1)h$. Since the approximate solutions are periodic with period 1, we have
\[ \operatorname{Var}[(z'^h, w'^h)(\cdot, t)] \le 2\operatorname{DV}[(z'^h, w'^h)(\cdot, t)]. \tag{5.41} \]
Using (5.38), (5.39) and the inequality $\operatorname{Var}[(z'^h, w'^h)](0) \le \operatorname{Var}[(z_0', w_0')]$ we get
\[ \operatorname{Var}[(z'^h, w'^h)](t) \le 4F[U^h](t) \le 4F[U^h](0) \le 4\operatorname{Var}[(z'^h, w'^h)](0) \le 4\operatorname{Var}[(z_0', w_0')]. \]
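The step from periodicity to (5.41) rests on an elementary fact: over one full period the total increase of a periodic function equals its total decrease, so the total variation is twice the decreasing variation. The following minimal Python sketch illustrates this for a piecewise-constant periodic profile. The function name and the sample values are illustrative assumptions, not taken from the paper.

```python
def variations_over_period(values):
    """Return (total variation, decreasing variation) over one period of a
    periodic piecewise-constant profile given by its cell values."""
    n = len(values)
    total = 0.0
    decreasing = 0.0
    for j in range(n):
        # The jump from cell j to cell j+1 wraps around at the period boundary.
        jump = values[(j + 1) % n] - values[j]
        total += abs(jump)
        if jump < 0:
            decreasing += -jump
    return total, decreasing


# Hypothetical cell values on one period (chosen to be exact in binary).
profile = [0.0, 1.5, 0.5, 2.0, -1.0, 0.25]
var, dv = variations_over_period(profile)
print(var, dv)  # total variation and decreasing variation over one period
```

For a scalar profile the relation holds with equality; (5.41) only needs the inequality.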
Since $(z'', w'')$ are constant multiples of $(z', w')$, it also holds that, for $t$ specified in the lemma,
\[ \operatorname{Var}[(z''^h, w''^h)](t) \le 4\operatorname{Var}[(z_0'', w_0'')]. \]
The values of $U^h$ lie in $\tilde R[16V(\gamma)]$. Lemma 5.4 can be used, with $K = 16$, to write that for all $\gamma$ sufficiently close to 1,
\[ \operatorname{Var}[(\sigma_h, \eta_h)](t) \le 4C^2\operatorname{Var}[(\sigma_0, \eta_0)]. \]
Finally, by Lemma 5.3 we obtain that
\[ \operatorname{Var}[(\sigma_h, \eta_h)](t) \le 4C^2 C_0\operatorname{Var}[(\sigma_0, \rho_0)]. \]
We thus impose the following restriction on $\operatorname{Var}[(\sigma_0, \rho_0)]$:
\[ \operatorname{Var}[(\sigma_0, \rho_0)] < (4C^2 C_0)^{-1} V(\gamma). \tag{5.42} \]
We have then obtained that, under the above restriction on the total variation of the initial data,
\[ \operatorname{Var}[(\sigma^h, \eta^h)](t) < V(\gamma), \qquad t \in [0, (n+1)h). \tag{5.43} \]

Remark 5.5. As a consequence of (5.43) we obtain that $\operatorname{Var}[U^h](t) < \tilde C(\gamma, a)$, for $t \in [0, (n+1)h)$ and some $\tilde C > 0$.

The following lemma is proved following the same procedures as in [8] but using also periodicity (cf. [7]).

Lemma 5.6. There exists $\tilde C_1 > 0$ depending on $\gamma$ and $a$ such that for all $0 \le s < t < (n+1)h$ we have
\[ \int_0^1 |U^h(x, t) - U^h(x, s)|\, dx < \tilde C_1(|t - s| + h). \tag{5.44} \]
In particular, there is a $T' > 0$ and $h_0 > 0$, both independent of $h$, $n$, such that $U^h(\cdot, t)$ verifies condition (5.10), i.e.,
\[ \Bigl|\int_0^1 (U^h(x, t) - U_0(x))\, dx\Bigr| < \frac{\rho}{2}, \tag{5.45} \]
for $0 < h < h_0$ and $0 \le t < \min(T', (n+1)h)$. Now, let us take $T > 0$ satisfying $h_0 < T < T'$. Iterating the above argument we obtain a sequence of approximate solutions $U^h(x, t)$ with $h < h_0$ and $t \in [0, T)$ such that $U^h(x, t) \in {}_a$ and (5.43) holds. Applying results of [8] we conclude that there is a sequence $h_k \to 0$ and a vector function $U$ such that $U^{h_k} \to U$ in $C([0, T]; L^1_{loc}(\mathbb{R}))^2$ and a.e. $x$ for each $t$ in $[0, T]$. Moreover there is a set ${}_0 \subset \prod_1^\infty(-1/2, 1/2)$, with measure 1, such that, for any sampling sequence $\{\alpha_n\}$ from ${}_0$, $U$ is a weak entropy solution of (1.1) in $\mathbb{R} \times [0, T)$.

Suppose that for some integer $N$ there is a sequence $\tilde h_k$, a set ${}_N \subset \prod_1^\infty(-1/2, 1/2)$ of measure 1, such that $U^{\tilde h_k}$ is defined on the time interval $[0, NT]$, satisfies (5.43) and (5.45) for $t \in [0, NT]$ and $U^{\tilde h_k} \to U^N$ as $k \to +\infty$ in $C([0, NT]; L^1_{loc}(\mathbb{R}))^2$ and a.e. $x$ for each $t$ in $[0, NT]$, which is a weak entropy solution of (1.1) in $\mathbb{R} \times [0, (N-1)T + T')$, for all sampling sequences $\{\alpha_n\}$ in ${}_N$.
Since $U^N$ is a weak solution we have
\[ \int_0^1 U^N(x, NT)\, dx = \int_0^1 U_0(x)\, dx. \]
Thus, for $NT < t < (N-1)T + T'$,
\[ \Bigl|\int_0^1 (U^{\tilde h_n}(x, t) - U_0(x))\, dx\Bigr| \le \Bigl|\int_0^1 (U^{\tilde h_n}(x, t) - U^{\tilde h_n}(x, NT))\, dx\Bigr| + \Bigl|\int_0^1 (U^{\tilde h_n}(x, NT) - U_0(x))\, dx\Bigr| < \tilde C_1(t - NT + \tilde h_n) + \|U^{\tilde h_n}(\cdot, NT) - U^N(\cdot, NT)\|_{L^1(0,1)}. \]
The right-hand side is smaller than $\frac{\rho}{2}$ when $\tilde h_k$ is sufficiently small because of the way we have chosen $T$ and the fact that $\|U^{\tilde h_k}(\cdot, NT) - U^N(\cdot, NT)\|_{L^1(0,1)} \to 0$ as $k \to \infty$. For these $\tilde h_k$, the sequence $U^{\tilde h_k}(x, t)$ can be defined on the interval $[0, (N+1)T]$ while keeping estimates (5.43) and (5.45). Extracting a convergent subsequence we obtain a vector function $U^{N+1}$ which coincides with $U^N$ when $t \in [0, NT]$ and which is a weak entropy solution of (1.1) for all sampling sequences $\{\alpha_n\}$ in ${}_{(N+1)} \subset {}_N$, of measure 1. This process can be reiterated and, consequently, a weak entropy solution defined by $U(x, t) = U^N(x, t)$, for $t \in [0, NT]$, exists for all times $t > 0$. Finally, $U$ satisfies bounds (5.43) and (5.45) and takes values in the bounded set ${}_a$.

6. Proof of Theorem 4.1

We give the proof in a sequence of lemmas. We start by investigating the conditions under which property $A_2$ is preserved by the map $T(a, \theta)$.

Lemma 6.1 (DiPerna, [5], p. 249). $A_2$ is equivalent to the requirement that −∞
0 such that L2 (η + ; 0, η) − √ + 2 = w. √ 2
(6.7)
Fig. 2.
Then, we have to show that d d L2 (η + ; 0, η) − z = < 0. √ dη dη 2
(6.8)
Using the condition (6.7) we can write (6.8) as ∂η L2 (η + ; 0, η) + ∂η0 L2 (η + ; 0, η0 ) < 0. 1 + ∂η L2 (η + ; 0, η) By Lemma 2.2 ∂η L2 (η + ; 0, η) ≥ 0, and the last inequality is equivalent to ∂η L2 (η + ; 0, η) + ∂η0 L2 (η + ; 0, η0 ) < 0. The parametrization of a shock in the (ρ, ρ) plane is given in Lemma 2.7. Using formulae (2.5) we can write L2 (η + ; 0, η) = σ ◦ v(ρ(η + ), ρ(η)). Consequently, dσ dv dσ ∂η0 L2 (η + ; 0, η) = dv ∂η L2 (η + ; 0, η) =
where
dσ dv
=
2c c2 −v 2
> 0, and
dρ dη
∂v dρ (ρ(η + ), ρ(η)) (η + ), ∂ρ dη ∂v dρ (ρ(η + ), ρ(η)) (η), ∂ρR dη 2
√ . Note that L2 (η1 ; 0, η2 ) = L2 (η2 ; 0, η1 ) = p(ρ)+ρc c
p (ρ)
for any η1 , η2 . We conclude that ∂η0 L2 (η + ; 0, η) = ∂η L2 (η; 0, η + ) and ∂η L2 was computed in the previous lemma. We set ρ = ρ(η + ), ρ0 = ρ(η) and use shorthand
notations introduced in the last lemma, ∂η L2 (η + ; 0, η) + ∂η L2 (η; 0, η + ) δp = P p (p0 + ρc2 ) + (p + ρ0 c2 ) ρη (η + )(p0 + ρ0 c2 ) δρ δp − P p0 (p + ρ0 c2 ) + (p0 + ρc2 ) ρη (η)(p + ρc2 ), δρ 3 c δρ 1 P = 2 > 0. 2 2 3/2 c −v δpδρ (p + ρ0 c ) (p0 + ρc2 )3/2 Substituting the expressions for ρη derived from (2.5) and setting P1 = P (p0 +ρ0 c c)(p+ρc we have 2
∂η L2 (η + ; 0, η) + ∂η L2 (η; 0, η + ) δp 1 2 2 = P1 p (p0 + ρc ) + (p + ρ0 c )
δρ p δp 1 − P1 p0 (p + ρ0 c2 ) + (p0 + ρc2 ) δρ p0 1 δp 2 − p0 = P1 (p + ρ0 c )
p δρ
δp 1 − P1 (p0 + ρc2 ) − p . δρ p0 Then, from the fact that p + ρ0 c2 < p0 + ρc2 , since δp δρ
−
p +p0 2
δp δρ
< c2 and
δp δρ
−
2)
(6.9)
p p0 >
> 0, when 1 ≤ γ ≤ 2 we have
∂η L2 (η + ; 0, η) + ∂η L2 (η; 0, η + )
δp 2 ≤ P1 (p0 + ρc ) < 0. p − p0 p p0 − δρ
Now we are ready to derive the sufficient condition on the shock curves to make their images under $T(a, \theta)$ satisfy $B_4$. Define maps $T_1, T_2 : \mathbb{R}^2 \to \mathbb{R}^2$,
\[ T_1(a, \theta) : (z, w) \mapsto (z', w), \qquad T_2(a, \theta) : (z, w) \mapsto (z, w'), \]
\[ z' = 1 - \exp 2\theta(a/2 - z), \qquad w' = -1 + \exp 2\theta(w - a/2). \]
Obviously, $T(a, \theta) = T_1 \circ T_2$. Moreover we have the simple lemma.

Lemma 6.6 (see [5], Lemma 3.14). $T(a, \theta)$ maps shock curves in some region $U$ onto shock curves satisfying $B_4$ in the image $T[U]$ iff $T_1(a, \theta)$ and $T_2(a, \theta)$ map shock curves in $U$ onto shock curves satisfying $B_4.1$ and $B_4.2$ in $T_1[U]$ and $T_2[U]$, respectively.
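The factorization $T = T_1 \circ T_2$ used in Lemma 6.6 can be made concrete with a short Python sketch. It assumes that $\exp 2\theta(a/2 - z)$ is read as $\exp(2\theta(a/2 - z))$. The function names and the sample arguments are illustrative only.

```python
import math


def T1(a, theta, z, w):
    """Act on the z coordinate only: z' = 1 - exp(2*theta*(a/2 - z))."""
    return 1.0 - math.exp(2.0 * theta * (a / 2.0 - z)), w


def T2(a, theta, z, w):
    """Act on the w coordinate only: w' = -1 + exp(2*theta*(w - a/2))."""
    return z, -1.0 + math.exp(2.0 * theta * (w - a / 2.0))


def T(a, theta, z, w):
    """Composition T = T1 after T2 (the order is immaterial here)."""
    return T1(a, theta, *T2(a, theta, z, w))


# Hypothetical sample point and parameters.
a, theta = 0.0, 0.5
point = (0.3, -0.2)
image = T(a, theta, *point)
swapped = T2(a, theta, *T1(a, theta, *point))
print(image, swapped)
```

Because each factor changes a single coordinate, the two maps commute, and the point $(a/2, a/2)$ is sent to the origin.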
Fig. 3.
Lemma 6.7. Suppose that d ≤ θ, log (η + ; 0, η) − ) (L 2 dη d ≤ θ, log (η + ; 0, η) + ) (R 1 dη
∀ η, η + ∈ (η1 , η2 ), > 0,
(6.10)
∀ η, η + ∈ (η1 , η2 ), > 0,
(6.11)
for some η1 < η2 . Then, (6.10) implies that the images under the map T1 of the shock curves in R × {η1 < η < η2 } satisfy B4 .1, and (6.11) implies that the images under the map T1 of the shock curves in R × {η1 < η < η2 } satisfy B4 .2. Proof. We prove only the implication following from (6.10); the proof of another part of the lemma is similar. Throughout the proof we refer to Fig. 3. As before we can assume that σ0 = z0 + w0 = 0. Fix w > 0 and δ > 0. Let us, as in hypothesis B4 .1, consider points 0 : (σ0 , η0 ), 1 : (L2 (η0 + 1 ; σ0 , η0 ), η0 + 1 ), w w 2 : √ , η0 + √ , 3 : (σˆ 0 , ηˆ 0 ) = (R1 (η0 + δ; σ0 , η0 ), η0 + δ), 2 2 w w 4 : (L2 (ηˆ 0 + 2 ; σˆ 0 , ηˆ 0 ), ηˆ 0 + 2 ), 5 : σˆ 0 + √ , ηˆ 0 + √ , 2 2 w 6 : (L2 (ηˆ 0 + 2 ; 0, ηˆ 0 + 2 ), ηˆ 0 + 2 ), 7 : (0, η0 + δ + √ ), 2 8 : (0, η0 + δ), σ = −σˆ 0 ,
d where i , i = 1, 2 are determined by w. Since T1 does not depend on w and dz z ≥0 we have to verify that z4 − z5 ≥ z1 − z2 , where zi stands for the z coordinate of the image under T1 of point i. This is equivalent to each of the following:
e−θ(σ5 −η5 ) − e−θ(σ4 −η4 ) ≥ e−θ(σ2 −η2 ) − e−θ(σ1 −η1 ) ,
eθσ e−θ(σ7 −η7 ) − e−θ(σ6 −η6 ) ≥ e−θ(σ2 −η2 ) − e−θ(σ1 −η1 ) , eθσ e−θ(−η0 −δ) − e−θ(S2 −η0 −δ) ≥ e−θ(−η0 ) − e−θ(S1 −η0 ) .
(6.12)
Here S1 = L2 (η0 + 1 , η0 ) − 1 > 0, S2 = L2 (η0 + δ + 2 , η0 + δ) − 2 > L2 (η0 + δ + 1 , η0 + δ) − 1 > 0, where we used the fact that 2 > 1 , which follows from Lemma 6.5, and the star-like property of the curve L2 . The inequality (6.12) will hold provided d θη e − eθ(η−L2 (η+;0,η)+) ≥ 0, dη for η1 < η < η + < η2 . The condition (6.10) will suffice.
By the next lemma we investigate the validity of conditions (6.10) and (6.11). Lemma 6.8. There is a constant c1 (σ , k, δ0 ) such that (6.10) and (6.11) hold for shocks in W˜ (a, k) ∩ cθ1 < η . Proof. Inequality (6.11) is verified in a similar way as (6.10), which we prove below. Equality (6.9) implies dL2 (η + ; 0, η) = dη
δp δp 1 1 2 P1 (p + ρ0 c2 )
− p0 − P1 (p0 + ρc ) − p . p δρ p0 δρ and δp > p p . As in the previous lemma we have p+ρ0 c2 < p0 +ρc2 , p0 < δp < p 0 δρ δρ This allows us to write
dL2 (η + , 0, η) ≤ P1 (p0 + ρc2 )δ p (1 + p ) ≤ 2P1 (p0 + ρc2 )δ p p . dη p0 p0 Recalling that P1 =
1 1 δρ(p0 + ρ0 c2 )(p + ρc2 ) , √ 1 − (v/c)2 δpδρ (p + ρ0 c2 )3/2 (p0 + ρ0 c2 )3/2
and using Lemma 2.1 we can estimate
2P1 (p0 + ρc2 )δ p
p ≤ C6 e σ p0
δp (p0 + ρ0 c2 )(p + ρc2 ) δ p δρ (p + ρ0 c2 )3/2 (p0 + ρc2 )1/2
p . p0
It is convenient to write the last estimates in terms of the ratio ρ/ρ0 . We have 0 1 ρ p ρ 1 p0 − 1 −1 ρ0 ρ dL2 (η + ; 0, η) ρ p p 1 0 0 0 σ 2 ≤ C7 e 3/2 p p0 p0 dη ρ p p0 − 1 ρ0 ρc2 + 1 0 1/2 1 ρ p ρ 1 − 1 −1 ρ ρ p p 1 0 0 0 ≤ C8 e σ 2 . 3/2 p p0 ρ p − 1 + 1 p0 ρ0 ρc2 /
We set γ −1 n = √ η, 2 γ
η + = τ η,
2 tan τ n γ −1 ρ = ≥ 1. g= ρ0 tan n
In the above notation we estimate √ 1/2 (γ −1)/2 − 1) γ −1 dL2 (η + ; 0, η) ≤ C8 eσ g(g − 1) (g 3/2 g 2 dη (g γ − 1)1/2 g
p ρ
ρc2
+1
γ γ tan τ n γ −1 tan τ n −1 ≤ C8 eσ g 2 (g (γ −1)/2 − 1) = C8 eσ tan n tan n 2γ 2γ tan τ n ≤ C8 eσ τ γ −1 − 1 ≤ C9 eσ τ γ −1 (τ − 1). tan n
(6.13)
The last inequality is true since 0 < n < τ n < π/4. Consider c+v ≥ L2 (η + ; 0, η) = σ ◦ v = log c−v 2 δpδρ = c2 c (p + ρ0 c2 )(p0 + ρc2 ) =
2 c
/
2 v c
0 21 0 − 1) 1 p0 (ρ/ρ0 − 1)(p/p = 2 p p0 c ρ +1 +1 ρ0 c2
ρc2
(g − 1)(g γ − 1) g(g tan2 τ n + 1)(g −1 tan2 n + 1) / 2 tan n p0 (g − 1)(g γ − 1) = . 2 κ ρ0 g(g tan τ n + 1)(g −1 tan2 n + 1) p0 ρ0
(6.14)
Then, 1 L2 (η + ; 0, η) − (τ − 1) η 2 tan n (g − 1)(g γ − 1) − (τ − 1) ≥ 2 κη g(g tan τ n + 1)(g −1 tan2 n + 1) (γ − 1) (g − 1)(g γ − 1) ≥ √ − (τ − 1) 2 κ γ g(g tan τ n + 1)(g −1 tan2 n + 1) (g − 1)2 − (τ − 1) ≥ (γ − 1) g(g tan2 τ n + 1)(g −1 tan2 n + 1) =
(γ − 1)(g − 1)
− (τ − 1) g(g tan2 τ n + 1)(g −1 tan2 n + 1) τn 2 tan tan n − 1 ≥
− (τ − 1) g(g tan2 τ n + 1)(g −1 tan2 n + 1) 2 ≥ (τ − 1)
−1 . g(g tan2 τ n + 1)(g −1 tan2 n + 1)
Let us write τ = z(γ −1) ,
1 < z < z0 ,
√ p for some z0 to be chosen later. From (4.1) and formula tan n = c we have tan n < 4 8 # " τ n γ −1 1 − δ0 . Also, since g 2 = tan ≤ τ γ −1 = z8 ≤ z08 , it follows tan n (g tan2 τ n + 1)(tan2 n + g) = g 2 tan2 τ n + g(1 + tan2 n) + tan2 n ≤ g 2 + 2g + 1 − δ0 ≤ 3g 2 + 1 − δ0 ≤ 3z08 + 1 − δ0 < 4, when z0 close to 1. In this way we obtain 1 L2 (η + ; 0, η) − (τ − 1) ≥ C13 (z0 )(τ − 1), η Now
(γ −1)
1 < τ < z0
.
(6.15)
1 L2 (η + ; 0, η) − = η L2 (η + ; 0, η) − (τ − 1) ≥ C13 η(τ − 1) = C13 . η (6.16)
Note that shocks are starlike in the (σ, η) plane. It implies that for fixed η L2 (η+; 0, η)/ −1 is nondecreasing in , which in turn implies that (6.16) holds for any > 0 (τ > 1). Combining (6.13) and (6.16) we obtain 2γ γ −1 d στ (6.17) dη log (L2 (η + ; 0, η) − ) ≤ C14 e η .
Take $c_1 = C_{14}\, e^{\sigma} \tau^{\frac{2\gamma}{\gamma - 1}}$, $\tau = \frac{1+k}{1-k}$.
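The proof of Theorem 4.1 will be complete upon using Lemma 6.4 and Lemma 6.8 if we show that there are $k$, $\sigma$ such that properties (4.3) hold. Since we know the explicit dependence of $c_2$ and $c_1$ on $k$, $\gamma$, $\sigma$, this result will follow by a straightforward but technical computation given in the next lemma.

Lemma 6.9. There are $\iota > 0$ and $\beta > 0$ such that if $k = z(\gamma)^{\gamma - 1} - 1$, where $z(\gamma) = (\gamma - 1)^{-\iota}$, and $\sigma = \bar\sigma + \beta\log(\gamma - 1)^{-1}$, then (4.3) holds. Moreover, $\iota$, $\beta$ can be chosen to satisfy $\beta/\iota < 2\arctan\frac{\kappa}{c}$.

Proof. In the proof we denote by $C_i > 0$, $i = 1, \ldots, 11$, some absolute constants. We have $z(\gamma)^{\gamma - 1} \to 1$ and $z(\gamma) \to \infty$. To simplify the notation let us write $z$ for $z(\gamma)$ and $C = C_1 e^{\sigma}$. Then
\[ c_2(k) = \frac{1}{2(z^{\gamma-1} - 1)} \log \frac{C z^{2(\gamma+1)} + \bigl(1 - (z^{\gamma-1} - 1)\bigr)^{2\frac{\gamma+1}{\gamma-1}}}{C z^{2(\gamma+1)} - \bigl(1 - (z^{\gamma-1} - 1)\bigr)^{2\frac{\gamma+1}{\gamma-1}}} = \frac{1}{2(z^{\gamma-1} - 1)} \log\Biggl(1 + \frac{2\bigl(1 - (z^{\gamma-1} - 1)\bigr)^{2\frac{\gamma+1}{\gamma-1}}}{C z^{2(\gamma+1)}} \cdot \frac{1}{1 - \bigl(1 - (z^{\gamma-1} - 1)\bigr)^{2\frac{\gamma+1}{\gamma-1}} C^{-1} z^{-2\gamma-2}}\Biggr). \tag{6.18} \]
Consider the function
\[ \bigl(1 - (z^{\gamma-1} - 1)\bigr)^{2\frac{\gamma+1}{\gamma-1}} = \Bigl[\bigl(1 - (z^{\gamma-1} - 1)\bigr)^{\frac{1}{z^{\gamma-1} - 1}}\Bigr]^{\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)} = e^{\left(-1 + O(z^{\gamma-1} - 1)\right)\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)} = e^{-\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)}\, e^{O(z^{\gamma-1} - 1)\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)}. \tag{6.19} \]
In the above computation we used the fact that $(1 - x)^{\frac{1}{x}} = e^{-1 + O(x)}$. Also, since
\[ O(z^{\gamma-1} - 1)\, \frac{z^{\gamma-1} - 1}{\gamma - 1}\, 2(\gamma+1) \le C_1 \frac{(z^{\gamma-1} - 1)^2}{\gamma - 1}\, 2(\gamma+1) \le C_2 (\gamma - 1)\log^2 z(\gamma) \to 0, \]
then
\[ C_3\, e^{-\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)} \le \bigl(1 - (z(\gamma)^{\gamma-1} - 1)\bigr)^{2\frac{\gamma+1}{\gamma-1}} \le C_4\, e^{-\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)} \to 0. \tag{6.20} \]
The limit holds since $\frac{z^{\gamma-1} - 1}{(\gamma - 1)\log z(\gamma)} \to 1$. Equation (6.20) and the fact that $C \to \infty$ imply that the argument of the log in (6.18) is close to 1. We obtain
\[ C_5\, \frac{1}{z^{\gamma-1} - 1}\, \frac{e^{-\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)}}{C z^{2(\gamma+1)}} \le c_2(k) \le C_6\, \frac{1}{z^{\gamma-1} - 1}\, \frac{e^{-\frac{z^{\gamma-1} - 1}{\gamma - 1} 2(\gamma+1)}}{C z^{2(\gamma+1)}}. \tag{6.21} \]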
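Finally, since $\frac{z^{\gamma-1} - 1}{(\gamma - 1)\log z(\gamma)} \to 1$, we get
\[ c_2 \ge C_7\, \frac{1}{(\gamma - 1)\log z}\, \frac{1}{C z^{5(\gamma+1)}} \ge C_7\, \frac{1}{(\gamma - 1)\log z}\, \frac{1}{C z^{10}} \ge C_7\, \frac{1}{\gamma - 1}\, \frac{1}{C z^{11}} \ge C_8\, \frac{1}{(\gamma - 1)^{1 - \beta - 11\iota}} \to +\infty, \tag{6.22} \]
if
\[ 1 - \beta - 11\iota > 0. \tag{6.23} \]
Equations (6.22) and (6.17) can be used to derive
\[ \frac{c_1}{c_2} \le C_9\, C^2 (\gamma - 1) z^{15} \le C_{10}\, (\gamma - 1)^{1 - 2\beta - 15\iota} \to 0, \]
if
\[ 1 - 2\beta - 15\iota > 0. \tag{6.24} \]
The right side of (6.21) reads
\[ c_2(k)\, k \le C_{11}\, \frac{1}{C z^{2(\gamma+1)}} \to 0. \]
There are $\iota > 0$, $\beta > 0$ which solve (6.23), (6.24) and satisfy $\beta/\iota < 2\arctan\frac{\kappa}{c}$.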
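That the three requirements on $(\iota, \beta)$ are compatible can be checked by an explicit choice. The Python sketch below fixes the ratio $\beta/\iota$ at half the allowed bound and then takes $\iota$ small enough for (6.23) and (6.24). The function name and the sample value of $\kappa/c$ are assumptions made for illustration, and the bound on $\beta/\iota$ is treated as a given positive number.

```python
import math


def choose_iota_beta(ratio_bound):
    """Return (iota, beta) with 1 - beta - 11*iota > 0,
    1 - 2*beta - 15*iota > 0 and beta/iota < ratio_bound."""
    if ratio_bound <= 0:
        raise ValueError("ratio_bound must be positive")
    r = ratio_bound / 2.0  # target value of beta/iota, strictly below the bound
    # With beta = r*iota the two linear constraints read
    # iota < 1/(11 + r) and iota < 1/(15 + 2*r); take half of the smaller one.
    iota = 0.5 * min(1.0 / (11.0 + r), 1.0 / (15.0 + 2.0 * r))
    return iota, r * iota


kappa_over_c = 0.5  # hypothetical value of kappa/c
bound = 2.0 * math.atan(kappa_over_c)
iota, beta = choose_iota_beta(bound)
print(iota, beta)
```

Any smaller positive $\iota$ with the same ratio works equally well, since both linear constraints only become easier to satisfy.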
Acknowledgements. The research of the first author was partially supported by CNPq through the grants 352871/96-2, 46.5714/00-5, 479416/2001-0, and FAPERJ through the grant E-26/151.890/2000.
References
1. Bakhvalov, N.: The existence in the large of a regular solution of a quasilinear hyperbolic system. USSR Comp. Math. and Math. Phys. 10, 205–219 (1970)
2. Chen, G.-Q., Frid, H.: Decay of entropy solutions of nonlinear conservation laws. Arch. Rat. Mech. Anal. 146, 95–127 (1999)
3. Chen, J.: Conservation laws for the relativistic p-systems. Commun. in PDE 20(9–10), 1605–1646 (1995)
4. Dafermos, C.M.: Hyperbolic Conservation Laws in Continuum Physics. Berlin: Springer-Verlag, 1999
5. DiPerna, R.J.: Existence in the large for quasilinear hyperbolic conservation laws. Arch. Rat. Mech. Anal. 52, 224–257 (1973)
6. DiPerna, R.J.: Convergence of approximate solutions to conservation laws. Arch. Rat. Mech. Anal. 82, 27–70 (1983)
7. Frid, H.: Periodic solutions of conservation laws constructed through Glimm scheme. Trans. Am. Math. Soc. 353(11), 4529–4544 (2001)
8. Glimm, J.: Solutions in the large for nonlinear hyperbolic systems of equations. Commun. Pure Appl. Math. 18, 697–715 (1965)
9. Glimm, J., Lax, P.D.: Decay of solutions of systems of nonlinear hyperbolic conservation laws. Mem. of the Am. Math. Soc. 101 (1970)
10. Lax, P.D.: Shock waves and entropy. In: Contributions to Nonlinear Functional Analysis, E. Zarantonello (ed.), New York: Academic Press, 1971, pp. 603–634
11. Nishida, T.: Global solution for an initial boundary value problem of a quasilinear hyperbolic system. Proc. Japan Acad. 44, 642–646 (1968)
12. Landau, L.D., Lifschitz, E.M.: The Classical Theory of Fields. Course of Theoretical Physics, Vol. 2, London-New York: Pergamon Press, 1975
13. Serre, D.: Systems of Conservation Laws. Vols. 1, 2, Cambridge: Cambridge University Press, 1999, 2000
14. Smoller, J.: Shock Waves and Reaction-Diffusion Equations. New York: Springer-Verlag, 1983
15. Smoller, J., Temple, B.: Global solution of the relativistic Euler equations. Commun. Math. Phys. 156, 67–99 (1993)
16. Wagner, D.: Equivalence of the Euler and Lagrange equations of gas dynamics for weak solutions. J. Diff. Eqs. 68, 118–136 (1987)

Communicated by P. Constantin
Commun. Math. Phys. 250, 371–391 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1087-6
Communications in
Mathematical Physics
Randomizing Quantum States: Constructions and Applications
Patrick Hayden1,2, Debbie Leung1,2, Peter W. Shor3, Andreas Winter4,2
1 Institute for Quantum Information, Caltech 107–81, Pasadena, CA 91125, USA. E-mail: [email protected]; [email protected]
2 Mathematical Sciences Research Institute, 1000 Centennial Drive, Berkeley, CA 94720, USA
3 AT&T Labs Research, Florham Park, NJ 07922, USA. E-mail: [email protected]
4 Department of Computer Science, University of Bristol, Merchant Venturers Building, Woodland Road, Bristol BS8 1UB, United Kingdom. E-mail: [email protected]
Received: 29 August 2003 / Accepted: 14 November 2003
Published online: 8 July 2004 – © Springer-Verlag 2004
Abstract: The construction of a perfectly secure private quantum channel in dimension $d$ is known to require $2\log d$ shared random key bits between the sender and receiver. We show that if only near-perfect security is required, the size of the key can be reduced by a factor of two. More specifically, we show that there exists a set of roughly $d\log d$ unitary operators whose average effect on every input pure state is almost perfectly randomizing, as compared to the $d^2$ operators required to randomize perfectly. Aside from the private quantum channel, variations of this construction can be applied to many other tasks in quantum information processing. We show, for instance, that it can be used to construct LOCC data hiding schemes for bits and qubits that are much more efficient than any others known, allowing roughly $\log d$ qubits to be hidden in $2\log d$ qubits. The method can also be used to exhibit the existence of quantum states with locked classical correlations, an arbitrarily large amplification of the correlation being accomplished by sending a negligibly small classical key. Our construction also provides the basic building block for a method of remotely preparing arbitrary $d$-dimensional pure quantum states using approximately $\log d$ bits of communication and $\log d$ ebits of entanglement.

I. Introduction

In this paper we revisit the question of finding the minimal resources required to randomize a quantum state. This problem has previously been investigated in several variations, always with the conclusion that in order to randomize or, more generally, encrypt a quantum state of $l$ qubits, $2l$ classical bits of random key are required [1–3]. This factor of 2 represents a familiar and even welcome phenomenon in quantum information theory; the reason for its appearance is intimately connected to the existence of superdense coding [3, 4]. All this previous work, however, considered only the task of perfectly encrypting quantum states. Here we focus on the task of approximately encrypting quantum states, allowing a negligible but non-zero amount of information to remain available to an eavesdropper. In sharp contrast to the exact case, in this setting we find that the factor of 2 disappears entirely: an $l$-qubit quantum state can be approximately encrypted using $l + o(l)$ bits of random key.
Our encryption or, more specifically, randomization scheme also exposes a previously unobserved difference between classical and quantum correlations: classical correlations are always effectively destroyed by a local randomization procedure whereas quantum correlations need not be. Therefore, any correlation that survives local randomization must be “nonlocal” and, hence, quantum mechanical in nature. This basic insight provides the intuition behind a new scheme for data hiding in a bipartite system [5–7]. The encoding is an approximate randomization procedure applied collectively to the two l-qubit shares of a 2l-qubit system, calibrated to eliminate all correlations that can be detected by local operations and classical communication (LOCC). The failure of the encoding to destroy quantum correlations is striking: there exists a collective decoding operation that recovers all the states on an (l − o(l))-qubit subsystem of the input. In other words, randomization can be used to construct schemes for LOCC hiding of roughly l qubits in a 2l-qubit quantum state. This construction is far more efficient than any previously known for hiding qubits or even classical bits. Another variation on our basic construction can be used to find quantum states whose classical correlations are large but locked. Roughly speaking, this means that local measurements can yield classical data with only a small amount of correlation but that local operations supplemented with a small amount of communication can yield a disproportionately large amount of correlation. The existence of such states was demonstrated in Ref. [8] but some central questions about the range of possible effects were left open. In particular, the authors defined two figures of merit, one measuring the (reciprocal of) amplification of the correlation and the other the ratio of the amount of communication to the amount of unlocked correlation. 
We give the first demonstration that both quantities can go to zero simultaneously. While seemingly esoteric, the existence of this phenomenon has important implications for the definition of security in quantum cryptographic scenarios; in particular, it establishes the potentially enormous volatility of accessible information. There is also an alternative interpretation of this result: we prove that random observables typically obey extremely strong entropic uncertainty relations. One final application of the ability to randomize an l-qubit state using only l + o(l) random key bits has a sufficiently different character that we present it in a separate paper [9]. Insofar as teleportation [10], or more generally remote state preparation [11], can be interpreted as a method for encrypting quantum states [12, 13], our approximate encryption procedure should give rise to an approximate remote state preparation method consuming only half as much communication as teleportation. We report in [9] how our results on randomization provide the basic building block for a protocol capable of sending an arbitrary pure l-qubit quantum state using only l ebits and l + o(l) bits of classical communication. The rest of the paper is structured as follows. Section II describes our results on the private quantum channel, which serve as a prototype for the rest of the paper. Section III then studies approximately randomizing maps by characterizing their effect on classical and quantum correlations. These observations are then put to use in slightly modified form in Sect. IV, which describes our quantum data hiding protocol and proves both its correctness and security. Section V then formalizes the idea of locking classical correlations and describes our contribution. We also include an appendix establishing some results on randomization procedures using subsets of the unitary group, such as the Pauli matrices. We use the following conventions throughout the paper. 
log and exp are always taken base 2. Unless otherwise stated, a “state” can be pure or mixed. The symbol for a state (such as ϕ or ρ) also denotes its density matrix. “Part of an entangled state” refers to a (mixed) state whose purification is accessible to some of the parties. We will make an
explicit indication when referring to a pure state. The density operator $|\varphi\rangle\langle\varphi|$ of the pure state $|\varphi\rangle$ will frequently be written simply as $\varphi$. $\mathcal{B}(\mathbb{C}^d)$ will be used to denote the set of linear operators from $\mathbb{C}^d$ to itself and $U(d) \subset \mathcal{B}(\mathbb{C}^d)$ the unitary group on $\mathbb{C}^d$.

Finally, a word of warning about the cryptographic interpretation of our results. When we say that a scheme for approximate encryption or quantum data hiding is secure for mixed states, it should be assumed that the purifications of those mixed states are inaccessible to all parties considered. Indeed, the possibility that purification-inaccessible security criteria can hold even as purification-accessible criteria fail is essentially a quantum mechanical re-statement of a familiar cryptographic observation: approximate security is sometimes much easier to achieve than perfect security. We exploit the gap throughout this paper.

II. An Approximate Private Quantum Channel

We consider an insecure one-way quantum channel between two parties Alice and Bob that is noiseless in the absence of eavesdropping. This channel can be made secure against eavesdropping if Alice and Bob are allowed the extra resource of shared random secret key bits. For instance, one can encrypt a state $|\varphi\rangle \in \mathbb{C}^d$ using a secret key of length $2\log d$ as follows [2, 3]. Fix a basis $\{|1\rangle, \ldots, |d\rangle\}$ for $\mathbb{C}^d$ and let
\[ X|j\rangle = |(j+1) \bmod d\rangle \quad\text{and}\quad Z|j\rangle = e^{2\pi i j/d}|j\rangle. \tag{1} \]
It's straightforward to verify that
\[ \frac{1}{d^2}\sum_{j=1}^{d}\sum_{k=1}^{d} X^j Z^k \varphi Z^{k\dagger} X^{j\dagger} = \frac{I}{d} \tag{2} \]
for all states $\varphi$. If $j$ and $k$ are selected using a shared secret key, Alice can encrypt the state using the unitary operation $X^j Z^k$ and Bob can decrypt by applying $Z^{-k} X^{-j}$. By Eq. (2), the view of an eavesdropper without access to $j$ and $k$ is $I/d$, which is independent of the input state $\varphi$. This structure, consisting of a set of encoding maps and decoding maps indexed by key values, such that the average encoded state is independent of the input, is known as a private quantum channel for $\mathbb{C}^d$. (See Ref. [3] for a formal definition.) In fact, if perfect security is required, it can be shown that the secret key must have length at least $2\log d$ [1–3, 13]. Here we relax the security criterion.

Definition II.1. A completely positive, trace-preserving (CPTP) map $R : \mathcal{B}(\mathbb{C}^d) \to \mathcal{B}(\mathbb{C}^d)$ is $\epsilon$-randomizing if, for all states $\varphi$,
\[ \Bigl\| R(\varphi) - \frac{I}{d} \Bigr\|_\infty \le \frac{\epsilon}{d}. \tag{3} \]
In the above, $\|\cdot\|_\infty$ is the operator norm, so Eq. (3) is equivalent to all the eigenvalues of $R(\varphi)$ lying in the interval $[(1-\epsilon)/d, (1+\epsilon)/d]$. By convexity of the norm, it suffices to check the condition for all pure states, a fact we will use repeatedly. In quantum information, distinguishability is frequently measured using the trace norm $\|\cdot\|_1 = \operatorname{Tr}|\cdot|$, the analogue of variation distance in probability theory. Note that Eq. (3) automatically implies the weaker estimate $\|R(\varphi) - I/d\|_1 \le \epsilon$. The main result of this section is
Theorem II.2. For all $\epsilon > 0$ and sufficiently large $d$ ($> \frac{10}{\epsilon}$), there exists a choice of unitaries in $U(d)$, $\{U_j : 1 \le j \le n\}$ with $n = 134 d(\log d)/\epsilon^2$ such that the map
\[ R(\varphi) = \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger \tag{4} \]
on $\mathcal{B}(\mathbb{C}^d)$ is $\epsilon$-randomizing.

As illustrated in Fig. 1, by having Alice and Bob select $j$ using a shared secret key, this map $R$ can be used to build a private quantum channel with key length $\log n = \log d + \log\log d + \log(1/\epsilon^2) + 8$, albeit one that is not perfectly secure, only nearly so. The view of an eavesdropper without access to the key, given a particular input state $\varphi$, is precisely $R(\varphi)$. Definition II.1, therefore, doubles as a definition of security. For any distribution of states $\{p_i, \varphi_i\}$ supported on $\mathbb{C}^d$ we can bound the mutual information accessible to an eavesdropper who performs measurements on the encrypted states $R(\varphi_i)$. This accessible information is bounded above by the Holevo quantity [14]
\begin{align}
\chi &= S\Bigl(\sum_i p_i R(\varphi_i)\Bigr) - \sum_i p_i S(R(\varphi_i)) \tag{5} \\
&\le \log d - \sum_i p_i S\bigl(R(\varphi_i)\bigr) \tag{6} \\
&\le \log(1 + \epsilon) \le \epsilon/(\ln 2). \tag{7}
\end{align}
particular, for all α > 0, choosing = (log d)−α implies that χ −−−→ 0. In this case, n = 134 d (log d)(1+2α) , so the key size can be taken to be log d + (1 + 2α) log log d + 8. Alternatively, one can choose an arbitrarily small α > 0, and let = d −α . Then, χ ≤ d −α /(ln 2) which is exponentially decaying in the number of qubits to be encrypted, log d. This comes at the cost of a slight increase in the asymptotic key length: log n = (1 + 2α) log d + log log d + 8. Moreover, as is common with probabilistic existence proofs, it is possible to ensure that the overwhelming majority of random choices succeed at only minor additional cost; in this case, log n would need to be increased by a Alice
Bob
j
ϕ
j
Uj
R(ϕ)
Uj†
ϕ
Fig. 1. A private quantum channel built on the randomization map R. Alice and Bob share knowledge of the secret key j , using it to encrypt and decrypt the state. Because an eavesdropper does not have access to j , her view is R(ϕ) ≈ I/d. If ϕ is a d-dimensional quantum state, then the key length need only be log d + o(log d)
Randomizing Quantum States: Constructions and Applications
375
constant number of bits for any fixed probability of success. Analogous statements hold for our other constructions later in the paper. The proof of Theorem II.2 is based on a large deviation estimate, Lemma II.3, and discretization via a net construction, Lemma II.4, both of which will be re-used in other applications later in the paper. The large deviation estimate, in turn, is based on Cram´er’s theorem (see Ref. [15], for example, for a detailed exposition), which states that for independent, identically distributed (i.i.d.) real-valued random variables, X, X1 , X2 , . . . , Xn , n 1 1 ∗ Pr inf (x) Xj ≥ a ≤ exp −n and (8) n ln 2 x≥a j =1
n 1 1 ∗ Pr Xj ≤ a ≤ exp −n inf (x) , n ln 2 x≤a
(9)
j =1
where ∗ (x) = sup
λ∈R
λx − ln EeλX .
(10)
$\Lambda^*(x)$ is known as the rate function and $\mathbb{E} e^{\lambda X}$ the moment generating function. (In fact, the harder part of Cramér's theorem deals with the optimality of the rate function. We only need bounds (8) and (9) here, which are surprisingly easy to prove: they require only the Bernstein trick and Markov's inequality.)

Lemma II.3. Let $\varphi$ be a pure state, $P$ a rank $p$ projector and let $(U_j)_{j \ge 1}$ be an i.i.d. sequence of $U(d)$-valued random variables, distributed according to the Haar measure. There exists a constant $C$ ($C \ge (6 \ln 2)^{-1}$) such that if $0 < \epsilon < 1$,

$$\Pr\left( \left| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \varphi U_j^\dagger P) - \frac{p}{d} \right| \ge \frac{\epsilon p}{d} \right) \le 2 \exp\left( -C n p\, \epsilon^2 \right). \tag{11}$$
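The concentration asserted by Lemma II.3 can be probed numerically. The sketch below is only an illustration under stated assumptions: it uses the fact that, for Haar-distributed $U$, the vector $U|1\rangle$ is a uniformly random unit vector, which is sampled here by normalizing i.i.d. Gaussians. The helper names `sample_overlap` and `empirical_mean` are hypothetical.

```python
import math
import random

def sample_overlap(d, p, rng):
    """One sample of Tr(U phi U^dagger P) with phi = |1><1| and
    P the projector onto the first p basis vectors, for Haar-random U."""
    # |g_i|^2 for i.i.d. complex Gaussians g_i; the overall scale cancels
    # once the vector is normalized.
    weights = [rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(d)]
    return sum(weights[:p]) / sum(weights)

def empirical_mean(d, p, n, seed=0):
    """(1/n) sum_j Tr(U_j phi U_j^dagger P) for n independent samples."""
    rng = random.Random(seed)
    return sum(sample_overlap(d, p, rng) for _ in range(n)) / n

d, p, n = 8, 2, 2000
mean = empirical_mean(d, p, n)
relative_deviation = abs(mean - p / d) / (p / d)   # compare with eps in Eq. (11)
```

For these values the empirical mean stays close to $p/d = 1/4$, in line with the exponential decay in $n p \epsilon^2$ of the bound (11).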
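Proof. Since the Haar measure is left and right invariant, we may assume that $\varphi = |1\rangle\langle 1|$ and $P = \sum_{i=1}^{p} |i\rangle\langle i|$ for some fixed orthonormal basis $\{|i\rangle\}$. Let $|g_j\rangle = \sum_{i=1}^{d} g_{ij} |i\rangle$, where the $g_{ij} \sim N_{\mathbb{C}}(0, 1)$ are i.i.d. complex random variables. (That is, the real and imaginary parts of $g_{ij}$ are independent Gaussian random variables with mean 0 and variance 1/2.) The distribution of $|g_j\rangle$ is the same as the distribution of $\|g_j\|_2\, U_j |1\rangle$. For a fixed $j$, let $U = U_j$ and $|g\rangle = |g_j\rangle$. The convexity of exp, together with $\mathbb{E}_g \|g\|_2^2 = d$, implies that

\begin{align}
\mathbb{E}_g \exp\Big( \frac{\lambda}{d} \sum_{i=1}^{p} |\langle i|g\rangle|^2 \Big) &= \mathbb{E}_U \mathbb{E}_g \exp\Big( \frac{\lambda \|g\|_2^2}{d} \sum_{i=1}^{p} |\langle i|U|1\rangle|^2 \Big) \tag{12}\\
&\ge \mathbb{E}_U \exp\Big( \mathbb{E}_g \frac{\lambda \|g\|_2^2}{d} \sum_{i=1}^{p} |\langle i|U|1\rangle|^2 \Big) \tag{13}\\
&= \mathbb{E}_U \exp\Big( \lambda \sum_{i=1}^{p} |\langle i|U|1\rangle|^2 \Big) \tag{14}\\
&= \mathbb{E}_U \exp\big( \lambda\, \mathrm{Tr}(U \varphi U^\dagger P) \big). \tag{15}
\end{align}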
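This inequality between moment generating functions establishes, via Cramér's theorem, that the rate function $\Lambda^*_U$ for the random variable $\mathrm{Tr}(U \varphi U^\dagger P)$ and the rate function $\Lambda^*_p$ for $\frac{1}{d} \sum_{i=1}^{p} |\langle i|g\rangle|^2$ are related by the inequality $\Lambda^*_p(x) \le \Lambda^*_U(x)$. It follows from the definitions that $\Lambda^*_p(px/d)$, in turn, is equal to $p \Lambda^*_g(x)$, where $\Lambda^*_g$ is the rate function for $|g_{ij}|^2$. Therefore,

$$\Pr\left( \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \varphi U_j^\dagger P) - \frac{p}{d} \ge \frac{\epsilon p}{d} \right) \le \exp\left( -n p \frac{1}{\ln 2} \inf_{x \ge \epsilon} \Lambda^*_g(1 + x) \right). \tag{16}$$

The rate function $\Lambda^*_g$ can be evaluated directly, with the result that $\Lambda^*_g(1 + \epsilon) \ge C \epsilon^2$, where $C$ can be chosen to be the constant $(6 \ln 2)^{-1}$ [9]. Repeating the argument for deviations below the mean and applying the union bound completes the proof.

As an aside, we note that the probability density function for $\mathrm{Tr}(U_j \varphi U_j^\dagger P)$ was recently calculated exactly by Zyczkowski and Sommers [16]. In principle, this should allow for an exact calculation of the rate function $\Lambda^*_U$.

Lemma II.4. For $0 < \epsilon < 1$ and $\dim H = d$ there exists a set $M$ of pure states in $H$ with $|M| \le (5/\epsilon)^{2d}$, such that for every pure state $|\varphi\rangle \in H$ there exists $|\tilde\varphi\rangle \in M$ with $\big\| |\varphi\rangle\langle\varphi| - |\tilde\varphi\rangle\langle\tilde\varphi| \big\|_1 \le \epsilon$. (We call such a set an $\epsilon$-net.)

Proof. We begin by relating the trace norm to the Hilbert space norm:

\begin{align}
\big\| |\tilde\varphi\rangle - |\varphi\rangle \big\|_2^2 &= 2 - 2\,\mathrm{Re}\,\langle\tilde\varphi|\varphi\rangle \nonumber\\
&\ge 1 - |\langle\tilde\varphi|\varphi\rangle|^2 \nonumber\\
&= \Big( \tfrac{1}{2} \big\| |\tilde\varphi\rangle\langle\tilde\varphi| - |\varphi\rangle\langle\varphi| \big\|_1 \Big)^2, \tag{17}
\end{align}

where the last line can be shown by evaluating the eigenvalues of $|\tilde\varphi\rangle\langle\tilde\varphi| - |\varphi\rangle\langle\varphi|$. Thus it will be sufficient to find an $\epsilon/2$-net for the Hilbert space norm. Let $M = \{|\varphi_i\rangle : 1 \le i \le m\}$ be a maximal set of pure states satisfying $\big\| |\varphi_i\rangle - |\varphi_j\rangle \big\|_2 \ge \epsilon/2$ for all $i \ne j$. (Such a set exists by Zorn's lemma.) By definition, $M$ is an $\epsilon/2$-net for $\|\cdot\|_2$. We can then estimate $m$ by a volume argument. As subsets of $\mathbb{R}^{2d}$, the open balls of radius $\epsilon/4$ about each $|\varphi_i\rangle$ are pairwise disjoint and all contained in the ball of radius $1 + \epsilon/4$ centered at the origin. Therefore,

$$m (\epsilon/4)^{2d} \le (1 + \epsilon/4)^{2d}, \tag{18}$$

so that $m \le (1 + 4/\epsilon)^{2d} \le (5/\epsilon)^{2d}$.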
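The inequality behind Eq. (17) and the size bound of Lemma II.4 can be checked with a few lines of code. This is a sketch under the assumption that random test states are drawn by normalizing complex Gaussian vectors; the helper names are illustrative only.

```python
import math
import random

def random_state(d, rng):
    """A random unit vector in C^d (normalized complex Gaussian vector)."""
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]

def check_eq17(d, trials=1000, seed=1):
    """Check || |a> - |b> ||_2^2 >= 1 - |<a|b>|^2 on random pairs of states."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = random_state(d, rng), random_state(d, rng)
        overlap = sum(x.conjugate() * y for x, y in zip(a, b))    # <a|b>
        hilbert_sq = sum(abs(x - y) ** 2 for x, y in zip(a, b))   # squared 2-norm distance
        trace_term = 1 - abs(overlap) ** 2                        # (trace distance / 2)^2
        if hilbert_sq < trace_term - 1e-12:
            return False
    return True

def log2_net_size_bound(d, eps):
    """log2 of the bound (5/eps)^(2d) on the size of an eps-net."""
    return 2 * d * math.log2(5 / eps)

ok = check_eq17(d=4)
bits = log2_net_size_bound(d=4, eps=0.5)
```

The check always succeeds because the difference of the two sides equals $|1 - \langle\tilde\varphi|\varphi\rangle|^2 \ge 0$.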
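Proof of Theorem II.2. Let $(U_j)_{j \ge 1}$ be i.i.d. $U(d)$-valued random variables, distributed according to the Haar measure. We will show that with high probability the corresponding $R$ in Eq. (4) is $\epsilon$-randomizing. The proof will consist of bounding

\begin{align}
&\Pr_U\left( \sup_\varphi \Big\| \frac{1}{n} \sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_\infty \ge \frac{\epsilon}{d} \right) \nonumber\\
&\qquad = \Pr_U\left( \sup_\varphi \sup_\psi \Big| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \varphi U_j^\dagger \psi) - \frac{1}{d} \Big| \ge \frac{\epsilon}{d} \right). \tag{19}
\end{align}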
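The optimizations over $\varphi$ and $\psi$ can both be taken over pure states only by the convexity of $|\cdot|$. Fix a net of projectors $M = \{X\}$ and let $\tilde\varphi$ be the net point corresponding to $\varphi$ so that

$$\sup_\varphi \|\varphi - \tilde\varphi\|_1 \le \frac{\epsilon}{2d}. \tag{20}$$

Define $\tilde\psi$ similarly. Lemma II.4 provides a net with $|M| \le (10d/\epsilon)^{2d}$. We can then proceed as follows:

\begin{align}
&\Pr_U\left( \sup_\varphi \sup_\psi \Big| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \varphi U_j^\dagger \psi) - \frac{1}{d} \Big| \ge \frac{\epsilon}{d} \right) \nonumber\\
&\quad \le \Pr_U\left( \sup_\varphi \sup_\psi \left[ \Big| \frac{1}{n} \sum_{j=1}^{n} \big( \mathrm{Tr}(U_j \varphi U_j^\dagger \psi) - \mathrm{Tr}(U_j \tilde\varphi U_j^\dagger \tilde\psi) \big) \Big| + \Big| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \tilde\varphi U_j^\dagger \tilde\psi) - \frac{1}{d} \Big| \right] \ge \frac{\epsilon}{d} \right) \tag{21}\\
&\quad \le \Pr_U\left( \sup_\varphi \sup_\psi \Big| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \tilde\varphi U_j^\dagger \tilde\psi) - \frac{1}{d} \Big| \ge \frac{\epsilon}{2d} \right). \tag{22}
\end{align}

In the last inequality, we used the estimate

$$\big| \mathrm{Tr}(U_j \varphi U_j^\dagger \psi) - \mathrm{Tr}(U_j \tilde\varphi U_j^\dagger \tilde\psi) \big| \le \|\varphi - \tilde\varphi\|_\infty + \|\psi - \tilde\psi\|_\infty, \tag{23}$$

which, because $\varphi - \tilde\varphi$ and $\psi - \tilde\psi$ are traceless and either zero or rank 2, is equal to $\frac{1}{2} \|\varphi - \tilde\varphi\|_1 + \frac{1}{2} \|\psi - \tilde\psi\|_1$. This sum is then less than or equal to $\epsilon/(2d)$ by construction of the net. Next, we replace the optimization over the set of all pure states in Eq. (22) by optimization over the net, use the union bound and apply Lemma II.3:

\begin{align}
&\Pr_U\left( \max_{\tilde\varphi, \tilde\psi \in M} \Big| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \tilde\varphi U_j^\dagger \tilde\psi) - \frac{1}{d} \Big| \ge \frac{\epsilon}{2d} \right) \tag{24}\\
&\quad \le |M|^2 \max_{\tilde\varphi, \tilde\psi \in M} \Pr_U\left( \Big| \frac{1}{n} \sum_{j=1}^{n} \mathrm{Tr}(U_j \tilde\varphi U_j^\dagger \tilde\psi) - \frac{1}{d} \Big| \ge \frac{\epsilon}{2d} \right) \tag{25}\\
&\quad \le 2 \left( \frac{10d}{\epsilon} \right)^{4d} \exp\left( -\frac{C n \epsilon^2}{4} \right), \tag{26}
\end{align}

where $C \ge (6 \ln 2)^{-1}$ is the same constant as in the proof of Lemma II.3. The existence of the desired $\epsilon$-randomizing map is guaranteed if the above probability is bounded away from 1, which is the case if $n > \frac{16d}{C \epsilon^2} \log\big( \frac{10d}{\epsilon} \big)$. If $d > \frac{10}{\epsilon}$, this is true when $n \ge 192 (\ln 2)\, \epsilon^{-2} d \log d$.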
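The last step of the proof compares two expressions for $n$, and the comparison can be checked numerically. The sketch below assumes base-2 logarithms (consistent with the factor $\ln 2$ in $C$) and uses illustrative function names.

```python
import math

C = 1 / (6 * math.log(2))   # the constant from Lemma II.3

def union_bound_threshold(d, eps):
    """Right-hand side of the condition n > (16 d / (C eps^2)) log(10 d / eps)."""
    return 16 * d / (C * eps**2) * math.log2(10 * d / eps)

def theorem_n(d, eps):
    """n = 134 d log d / eps^2, as in Theorem II.2."""
    return 134 * d * math.log2(d) / eps**2

d, eps = 1000, 0.1           # satisfies d > 10/eps
sufficient = theorem_n(d, eps) >= union_bound_threshold(d, eps)
```

The constant 134 arises because $192 \ln 2 \approx 133.1$, and $\log(10d/\epsilon) < 2 \log d$ whenever $d > 10/\epsilon$.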
III. Randomization and the Destruction of Correlations

Our discussion in the previous section demonstrates that a quantum operation $R$ constructed by averaging over $134\, d (\log d)/\epsilon^2$ randomly selected unitaries will be $\epsilon$-randomizing with high probability. In this section, our goal will be to investigate the properties of general $\epsilon$-randomizing maps so the method used to construct $R$ will be immaterial as long as

$$\left\| R(\varphi) - \frac{I}{d} \right\|_\infty \le \frac{\epsilon}{d} \tag{27}$$

for all $\varphi$. The definition, it should be noticed, makes no mention of the effect of $R$ on a system that is correlated with another system (we call this the "environment"). Here we will analyze that effect, which will ultimately lead to a partial characterization of all $\epsilon$-randomizing maps. To start, it is easy to verify that an $\epsilon$-randomizing $R$ properly destroys classical correlations between the system being randomized and its environment:

Lemma III.1. Let $\rho^{AB} = \sum_i p_i\, \varphi_i^A \otimes \psi_i^B$ be a separable state and $R$ an $\epsilon$-randomizing map on $A$. Then

$$\left\| (R \otimes I)(\rho^{AB}) - \frac{I}{d} \otimes \rho^B \right\|_1 \le \epsilon. \tag{28}$$

Proof. This is straightforward:

\begin{align}
\left\| (R \otimes I)(\rho^{AB}) - \frac{I}{d} \otimes \rho^B \right\|_1 &= \Big\| \sum_i p_i \Big[ R(\varphi_i^A) \otimes \psi_i^B - \frac{I}{d} \otimes \psi_i^B \Big] \Big\|_1 \tag{29}\\
&\le \sum_i p_i \Big\| R(\varphi_i^A) \otimes \psi_i^B - \frac{I}{d} \otimes \psi_i^B \Big\|_1 \tag{30}\\
&= \sum_i p_i \Big\| R(\varphi_i^A) - \frac{I}{d} \Big\|_1 \tag{31}\\
&\le \epsilon. \tag{32}
\end{align}
Thus, for classically correlated states, approximate randomization implies the destruction of correlations with other systems. Indeed, for classically correlated states, finding an operation that will destroy correlations is effectively the same thing as finding an operation that will erase local information. This equivalence fails dramatically for entangled states. (In contrast, any perfectly randomizing map does destroy all possible correlations including entanglement.) Indeed, if we apply an $R$ constructed using the method of Theorem II.2 to half of a maximally entangled state $|\Phi\rangle = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |i\rangle|i\rangle$, then the resulting state has rank at most $n = o(d^2)$, so

$$\left\| (R \otimes I)(\Phi) - \frac{1}{d^2} I \otimes I \right\|_1 \ge 2 (1 - n/d^2) \xrightarrow{d \to \infty} 2, \tag{33}$$

meaning that $(R \otimes I)(\Phi)$ and $\frac{1}{d^2} I \otimes I$ can be distinguished by a collective measurement with negligible probability of error for large $d$. It goes without saying then that the approximate randomization does not eliminate all the correlations that were originally present in the maximally entangled state. Nonetheless, the correlations that remain have been rendered invisible to local operations and classical communication, recovering at least some of the spirit of Lemma III.1.
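To get a feeling for how quickly the quantity $2(1 - n/d^2)$ in Eq. (33) approaches 2, one can evaluate it for the value $n = 134\, d (\log d)/\epsilon^2$ of Theorem II.2. This is a rough sketch, assuming base-2 logarithms and working with $\log d$ directly to avoid overflow; the function name is hypothetical.

```python
import math

def distinguishability_bound(log2_d, eps):
    """2 (1 - n / d^2) with n = 134 d log d / eps^2 and d = 2**log2_d."""
    log2_n = math.log2(134 * log2_d / eps**2) + log2_d
    return 2 * (1 - 2.0 ** (log2_n - 2 * log2_d))

# The quantity for 30, 40 and 60 qubits with eps = 0.1.
bounds = [distinguishability_bound(k, 0.1) for k in (30, 40, 60)]
```

Already for a few tens of qubits the value is within $10^{-3}$ of the maximum possible trace distance 2.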
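Lemma III.2. Let $R$ be an $\epsilon$-randomizing quantum operation, $M = \{M_i\}$ be a positive operator-valued measure (POVM) that can be implemented using LOCC, $p_i := \mathrm{Tr}(M_i (R \otimes I)(\Phi))$ and $q_i := \mathrm{Tr}(M_i \frac{1}{d^2} I)$. Then $\|p - q\|_1 \le \epsilon$.

Proof. If $M$ can be implemented using LOCC then it will have the separable form $M_i = X_i \otimes Y_i$. Moreover, we can assume without loss of generality that $X_i$ and $Y_i$ are rank 1 since refining the measurement can only increase the $\ell_1$ distance between the outcome probability distributions. Making use of the identities $Y_i\, \mathrm{Tr}\, Y_i = Y_i^2$ and $(I \otimes Y_i) \Phi (I \otimes Y_i) = \frac{1}{d} Y_i^T \otimes Y_i$, the demonstration is then direct:

\begin{align}
\|p - q\|_1 &= \sum_i \Big| \mathrm{Tr}[(X_i \otimes Y_i)(R \otimes I)(\Phi)] - \mathrm{Tr}\big[(X_i \otimes Y_i) \tfrac{I}{d^2}\big] \Big| \tag{34}\\
&= \sum_i \Big| \mathrm{Tr}\Big[(X_i \otimes I)(R \otimes I)\Big( \frac{Y_i^T}{\mathrm{Tr}\, Y_i^T} \otimes \frac{Y_i}{d} \Big)\Big] - \mathrm{Tr}\big[(X_i \otimes Y_i) \tfrac{I}{d^2}\big] \Big| \tag{35}\\
&= \sum_i \frac{\mathrm{Tr}\, Y_i}{d} \Big| \mathrm{Tr}\Big[ X_i \Big( R\Big( \frac{Y_i^T}{\mathrm{Tr}\, Y_i^T} \Big) - \frac{I}{d} \Big) \Big] \Big| \tag{36}\\
&\le \frac{\epsilon}{d^2} \sum_i \mathrm{Tr}(X_i)\, \mathrm{Tr}(Y_i) \le \epsilon, \tag{37}
\end{align}

where in the last step we have used that $\sum_i X_i \otimes Y_i = I$ and the fact that $R$ is $\epsilon$-randomizing.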
A very similar proof demonstrates the more general

Theorem III.3. Let $M = \{M_i\}$ be a POVM that can be implemented using LOCC, $p_i := \mathrm{Tr}(M_i (R \otimes I)(\rho^{AB}))$ and $q_i := \mathrm{Tr}(M_i (\frac{1}{d} I \otimes \rho^B))$. Then $\|p - q\|_1 \le \epsilon$.

Thus, while some kind of correlation can persevere when half of an entangled state is randomized, that correlation will all be inaccessible to LOCC measurements. In fact, the proof shows that it will be inaccessible to all measurements that can be implemented using separable superoperators. It's tempting to speculate that Lemma III.2 provides a characterization of all $\epsilon$-randomizing operations. There is at least a weak sense in which that is true: if the conclusion of Lemma III.2 holds for a map $R$, then for all states $\varphi$,

$$\left\| R(\varphi) - \frac{I}{d} \right\|_1 \le \epsilon. \tag{38}$$

Recall that this condition is weaker than the operator norm definition of $\epsilon$-randomization that we use, however. This might suggest that our definition is too strong and that this trace norm version might be more easily characterized. Unfortunately, our proof of Lemma III.2 makes explicit use of the stronger condition. We don't know if it would hold for $R$ only satisfying Eq. (38). The fact that $\epsilon$-randomizing maps render the correlations of entanglement invisible to LOCC also raises the question of their relationship to the phenomenon of quantum nonlocality without entanglement [17]. The range of connections between $\epsilon$-randomizing maps and the physics of nonlocality will be further developed in an upcoming paper [18].
IV. Quantum Data Hiding

Theorem III.3 also immediately suggests another application of $\epsilon$-randomizing maps: quantum data hiding, the name given to schemes for sharing bits or qubits between multiple parties in such a way that the data cannot be accessed by LOCC operations alone. We will focus on the bipartite case, where our methods provide a protocol for hiding $l$ qubits in a bipartite state of roughly $2l$ qubits. This is far more efficient than previous constructions for hiding either bits or qubits, where the best previous constructions gave ratios that depended on the security requirements. To achieve $\delta = \epsilon = 1/16$ in terms of the parameters introduced below, for example, the separable Werner state construction of Ref. [5] requires roughly 24 qubits per hidden bit. (Using, in the notation of that paper, $K = 4$ and $d = 64$.) To achieve qubit hiding, the construction of Ref. [6] would have multiplied that overhead by a further dimension-dependent factor. (We note that by making additional assumptions about the operations available to the parties, it is possible to improve on the 1 : 2 ratio between hidden and physical qubits. It was recently discovered, for example, that in some systems with superselection rules, a ratio of 1 : 1 is achievable [19].)

Definition IV.1 (Adapted from Ref. [6]). A $(\delta, \epsilon, p, q)$-qubit hiding scheme consists of a CPTP encoding map $E : B(\mathbb{C}^p) \to B(\mathbb{C}^q)$ and a CPTP decoding map $D : B(\mathbb{C}^q) \to B(\mathbb{C}^p)$ such that

1. (Security) For all LOCC measurements $L$, as well as all states $\varphi_0$ and $\varphi_1$ on $\mathbb{C}^p$,

$$\| L(E(\varphi_0)) - L(E(\varphi_1)) \|_1 \le \epsilon. \tag{39}$$

2. (Correctness) For all states $\varphi$ on $\mathbb{C}^p$, $\| (D \circ E)(\varphi) - \varphi \|_1 \le \delta$.

The security criterion obviously implicitly assumes some bipartite structure $\mathbb{C}^q \cong H_A \otimes H_B$. In our constructions, we will set $\dim(H_A) = \dim(H_B) = d$. Our main result is

Theorem IV.2. For all $\delta, \epsilon > 0$ (satisfying $\epsilon^2 \log(40/\delta^2) < 1$) and sufficiently large $d$ ($d > \max\{ \frac{36}{C\delta^2}, \sqrt{15/\epsilon}, 21 \}$), one can construct a $(\delta, \epsilon, p, d^2)$-qubit hiding scheme, with

$$p = \frac{C \delta^2 \epsilon^2}{1188} \frac{d}{\log d}, \tag{40}$$

and $C = (6 \ln 2)^{-1}$. The encoding map is given by

$$R(\rho) = \frac{1}{n} \sum_{j=1}^{n} U_j \rho U_j^\dagger, \tag{41}$$

where $\{U_j\} \subset U(d^2)$, and $n = \frac{99\, d}{C \epsilon^2} \log d$.

The main point, unfortunately obscured by the proliferation of constants and conditions, is simply this: for these hiding schemes, the limiting ratio of hidden qubits to physical qubits is

$$\lim_{d \to \infty} \frac{\log p}{\log d^2} = \frac{1}{2}. \tag{42}$$
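The convergence in Eq. (42) is slow because of the constant and the $\log\log d$ term in Eq. (40). A minimal sketch, assuming base-2 logarithms and using the illustrative helper `hiding_rate`, makes this visible:

```python
import math

C = 1 / (6 * math.log(2))   # the constant of Theorem IV.2

def hiding_rate(log2_d, delta, eps):
    """log p / log d^2 for p = C delta^2 eps^2 d / (1188 log d), d = 2**log2_d."""
    log2_p = (math.log2(C * delta**2 * eps**2 / 1188)
              + log2_d - math.log2(log2_d))
    return log2_p / (2 * log2_d)

# Ratio of hidden to physical qubits for delta = eps = 1/16 and growing log d.
rates = [hiding_rate(k, 1 / 16, 1 / 16) for k in (10**3, 10**4, 10**6)]
```

For $\delta = \epsilon = 1/16$ the constant contributes about $-28.3$ to $\log p$, so the ratio approaches 1/2 only once $\log d$ is large compared with that offset.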
Fig. 2a,b. Quantum data hiding. (a) depicts the encoding procedure. A random $U_j$ is applied to the state $|\varphi\rangle$ drawn from subspace $S$. The output, $R(\varphi)$, is almost indistinguishable from the maximally mixed state using LOCC alone. (b) The different subspaces $\{U_j S\}$ have very small overlaps, however, so a collective operation on $H_A \otimes H_B$ can be used to distinguish them without causing much distortion to the encoded states $U_j |\varphi\rangle$.
The idea behind our approach is simple. We randomly choose the $U_j$ acting on the $AB$ system. With $n$ sufficiently large, we can assure that for all $\rho$, $R(\rho)$ is effectively indistinguishable (by LOCC alone) from the maximally mixed state on $AB$. Then we restrict the input $\rho$ to have support on a sufficiently small subspace $S$ to ensure that the subspaces $\{U_j S\}$ can be reliably distinguished by a collective measurement on $AB$. This strategy is summarized in Fig. 2. We will prove the security and correctness in the next two subsections.

A. Security of the protocol. The proof of security uses techniques similar to those in the proofs of Lemmas II.3 and III.2, so we only give the outline here. As usual, let $\{U_j : 1 \le j \le n\}$ be $U(d^2)$-valued independent random variables distributed according to the Haar measure. For $0 < \epsilon < 1$,

$$\Pr_U\left( \sup_{\varphi \in S} \sup_{X \otimes Y} \Big| \mathrm{Tr}\Big[ \frac{1}{n} \sum_{j=1}^{n} (X \otimes Y) U_j \varphi U_j^\dagger \Big] - \frac{1}{d^2} \Big| \ge \frac{\epsilon}{d^2} \right) \le 2\, |M_d|^2 |M_p| \exp\left( -\frac{C n \epsilon^2}{4} \right), \tag{43}$$

where $X$ and $Y$ are $d$-dimensional rank 1 projectors acting on $A$ and $B$ respectively, and $\varphi \in S$ is a $p$-dimensional pure state (to be hidden). $C$ is the same positive constant as in Lemma II.3, and $M_d$ and $M_p$ are $\frac{\epsilon}{3d^2}$-nets for $d$-dimensional and $p$-dimensional rank 1 projectors respectively. Equation (43) can be proved in a way very similar to Lemma II.3. From Lemma II.4 we can choose $|M_d|^2 |M_p| = (15 d^2/\epsilon)^{2(2d+p)}$. Whenever $d^2 > \max(15/\epsilon, 16)$ and $n \ge 33 (2d + p)(\log d)/(C \epsilon^2)$, the probability in Eq. (43) is strictly less than 1/2, in which case more than half of the choices for $\{U_j\}$ are such that for all $\varphi$, $M$ and $N$,

$$\Big| \mathrm{Tr}\Big[ \frac{1}{n} \sum_{j=1}^{n} (M \otimes N) U_j \varphi U_j^\dagger \Big] - \frac{1}{d^2} \Big| \le \frac{\epsilon}{d^2}. \tag{44}$$

To finish the proof of security, we use arguments similar to those in Lemma III.2. Let $U_j$ be chosen so that $R(\varphi) = \frac{1}{n} \sum_{j=1}^{n} U_j \varphi U_j^\dagger$ satisfies Eq. (44). Let $\{X_i \otimes Y_i\}$ be any POVM implemented by LOCC, where $X_i$ and $Y_i$ are both rank 1. For any state $\varphi \in S$, let $p_i = \mathrm{Tr}((X_i \otimes Y_i) R(\varphi))$ and $q_i = \mathrm{Tr}((X_i \otimes Y_i) \frac{I}{d^2})$. Using Eq. (44) we find

\begin{align}
\|p - q\|_1 &= \sum_i \Big| \mathrm{Tr}((X_i \otimes Y_i) R(\varphi)) - \frac{1}{d^2} \mathrm{Tr}(X_i \otimes Y_i) \Big| \tag{45}\\
&\le \frac{\epsilon}{d^2} \sum_i \mathrm{Tr}(X_i \otimes Y_i) \le \epsilon. \tag{46}
\end{align}
Therefore, when the conditions stated above on $n$, $d$ and $\epsilon$ are satisfied, the security condition is fulfilled with probability at least 1/2 for a random selection of unitaries.

B. Correctness of the protocol. To complete the construction of the data hiding scheme, we must also show that the decoding can be performed reliably. Let $S$ be a fixed subspace of dimension $p$ in $H_A \otimes H_B$ and let $P$ be the projector onto $S$. Our decoding procedure will be given by the transpose channel [20], a generalization of the pretty good measurement [21]. Specifically, let $N = \sum_{i=1}^{n} U_i P U_i^\dagger$ and $D_i = P U_i^\dagger N^{-1/2}$. Our decoding procedure $D$ will be given by performing the quantum operation with Kraus elements $D_i$. Our proof that this decoding procedure works is via a reduction to the task of decoding classical data, for which there are well-known criteria for the success of the pretty good measurement [22]. Fixing $0 < \alpha < 1$, our goal will be to ensure that $|\langle\varphi| D_i U_i |\varphi\rangle|^2 \ge 1 - \alpha$ for all $i = 1, \ldots, n$ and whenever $|\varphi\rangle \in S$. Then, $\langle\varphi| D \circ R(\varphi) |\varphi\rangle \ge 1 - \alpha$. It would then follow by standard inequalities [23] that $\| D \circ R(\varphi) - \varphi \|_1 \le 2\sqrt{\alpha}$ and, choosing $\alpha = \delta^2/4$, that the correctness criterion is satisfied.

To begin, fix any basis $E = \{|j\rangle\}$ for $S$. Our strategy is to first show that, for any given $i$ and $|j\rangle$, the operation $D$ decodes $U_i |j\rangle$ correctly with high probability by relating $D$ to the pretty good measurement for decoding classical messages. We then show that $D$ succeeds on all pure input states by verifying that it succeeds simultaneously on a large enough set of bases to effectively cover the set of all states. So, consider decoding the classical messages $i = 1, \ldots, n$ and $j = 1, \ldots, p$ by applying the pretty good measurement to the set of states $\{U_i |j\rangle\}_{ij}$. The POVM elements are $M_{ij} = N^{-1/2} U_i |j\rangle\langle j| U_i^\dagger N^{-1/2}$. This POVM can be implemented in two stages: first $D$ is applied and the outcome $i$ is recorded, then the projective measurement onto the basis $E$ is performed. To see this, observe that

\begin{align}
\langle j| D_i \rho D_i^\dagger |j\rangle &= \langle j| P U_i^\dagger N^{-1/2} \rho N^{-1/2} U_i P |j\rangle \tag{47}\\
&= \langle j| U_i^\dagger N^{-1/2} \rho N^{-1/2} U_i |j\rangle \tag{48}\\
&= \mathrm{Tr}(\rho M_{ij}). \tag{49}
\end{align}

In particular, this calculation also demonstrates that

$$|\langle j| D_i U_i |j\rangle|^2 = \mathrm{Tr}(U_i |j\rangle\langle j| U_i^\dagger M_{ij}) \tag{50}$$

so the decoding procedure $D$ succeeds on $|j\rangle$ provided that for each $i$, the pretty good measurement identifies $U_i |j\rangle$ with high probability. Applying the criterion of Hausladen et al. for the success of the pretty good measurement [22], we find that

$$1 - |\langle j| D_i U_i |j\rangle|^2 \le \Delta_{ij} := \sum_{i'j' \ne ij} |\langle j'| U_{i'}^\dagger U_i |j\rangle|^2. \tag{51}$$
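Notice that terms for which $j' \ne j$ and $i' = i$ do not contribute to $\Delta_{ij}$ so its expectation value is

\begin{align}
\mathbb{E}_U \Delta_{ij} &= \sum_{i' \ne i} \sum_{j'} \mathrm{Tr}\big[ \mathbb{E}_U\, U_{i'} |j'\rangle\langle j'| U_{i'}^\dagger\, U_i |j\rangle\langle j| U_i^\dagger \big] \tag{52}\\
&= \sum_{i' \ne i} \sum_{j'} \mathrm{Tr}\Big[ \frac{I}{d^2} \frac{I}{d^2} \Big] = \frac{(n-1)p}{d^2}, \tag{53}
\end{align}

which is small provided $np \ll d^2$. We will be interested in

$$\Pr_U\Big( \Delta_{ij} \ge (1 + \eta) \frac{(n-1)p}{d^2} \Big), \tag{54}$$

which by the left invariance of the Haar measure is equal to

$$\Pr_U\Big( \sum_{i' \ne i} \sum_{j'} |\langle j'| U_{i'} |j\rangle|^2 \ge (1 + \eta) \frac{(n-1)p}{d^2} \Big). \tag{55}$$

Invoking Lemma II.3 with $\eta = 1/2$, the state $|j\rangle\langle j|$ and projector $\sum_{j'=1}^{p} |j'\rangle\langle j'|$, the probability that $\Delta_{ij}$ exceeds $\frac{3(n-1)p}{2d^2}$ is less than or equal to $\exp(-C (n-1) p/4)$ for the same positive constant $C$ in Lemma II.3. By the union bound, the probability that this bad event happens for at least one of the choices of $i$ is less than or equal to $n \exp(-C (n-1) p/4)$:

$$\Pr_U\Big( \min_i |\langle j| D_i U_i |j\rangle|^2 \le 1 - \frac{3(n-1)p}{2d^2} \Big) \le n \exp(-C (n-1) p/4). \tag{56}$$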
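Now, as discussed earlier, our goal is to verify that $|\langle\varphi| D_i U_i |\varphi\rangle|^2$ will be large for all $|\varphi\rangle \in S$. Fix an $\frac{\alpha}{2}$-net for pure states on $S$. The size of this net can be taken to be less than or equal to $(\frac{10}{\alpha})^{2p}$. Extend each net point $\tilde\varphi$ to a basis of $S$. We have

$$\Pr_U\Big( \min_{\tilde\varphi} \min_i |\langle\tilde\varphi| D_i U_i |\tilde\varphi\rangle|^2 \le 1 - \frac{3(n-1)p}{2d^2} \Big) \le \Big( \frac{10}{\alpha} \Big)^{2p} n \exp\Big( -\frac{C}{4} (n-1) p \Big) \tag{57}$$

by Eq. (56) and the union bound. The probability of $D$ failing on at least one net point is less than 1/2 if $n > \frac{8}{C} \log(\frac{10}{\alpha}) + \frac{4}{C} \frac{\log 2n}{p} + 1$. Otherwise,

\begin{align}
|\langle\varphi| D_i U_i |\varphi\rangle|^2 &\ge |\langle\tilde\varphi| D_i U_i |\tilde\varphi\rangle|^2 - \Big| |\langle\varphi| D_i U_i |\varphi\rangle|^2 - |\langle\tilde\varphi| D_i U_i |\tilde\varphi\rangle|^2 \Big| \nonumber\\
&\ge 1 - \frac{3(n-1)p}{2d^2} - \frac{\alpha}{2} = 1 - \alpha \tag{58}
\end{align}

for all $|\varphi\rangle \in S$ by choosing $3(n-1)p/d^2 = \alpha$. That is essentially the end of the proof. All that remains is to make appropriate choices for our various parameters. Collecting all our requirements, we find that the correctness condition $\| D \circ R(\varphi) - \varphi \|_1 \le \delta$ for all $\varphi$ is satisfied with probability at least 1/2 if

$$n > \frac{8}{C} \log\Big( \frac{10}{\alpha} \Big) + \frac{4}{C} \frac{\log 2n}{p} + 1, \tag{59}$$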
$\alpha = \delta^2/4$ and $3(n-1)p/d^2 = \alpha$. Recall that the security criterion is satisfied with probability at least 1/2 provided $n \ge 33 (2d + p)(\log d)/(C \epsilon^2)$ and $d > \max(\sqrt{15/\epsilon}, 4)$. Restricting to $p \le d$, we make the choice

$$n = \frac{99\, d \log d}{C \epsilon^2}, \quad \text{and then} \quad p = \frac{C \epsilon^2 \delta^2 d}{1188 \log d}. \tag{60}$$

If, in addition, $d > \max(\frac{36}{C \delta^2}, 21)$ and $\log(40/\delta^2) < 1/\epsilon^2$, a straightforward calculation shows that all our requirements are met. Therefore, by the union bound, the probability that both the correctness and security criteria are satisfied is greater than 0 for a random choice of $R$. As an example, when $\epsilon = \delta = \frac{1}{16}$ and $d$ is sufficiently large, $\log p \ge \log d - \log\log d - 30$. Finally, both $\delta$ and $\epsilon$ can be chosen to be any polynomial in $\frac{1}{\log d}$ without affecting the asymptotic efficiency, and $\epsilon$ can be chosen to be $d^{-\alpha}$ for small $\alpha > 0$ in order to achieve security that is exponential in $2 \log d$, the number of physical qubits, at the expense of a slightly reduced asymptotic efficiency $(1 - 4\alpha)/2$.
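The choices in Eq. (60) are designed so that the distortion requirement $3(n-1)p/d^2 = \alpha = \delta^2/4$ and the security requirement on $n$ hold together. The following sketch checks this arithmetic for one set of values; it assumes base-2 logarithms, treats $d$ as a floating-point number, and uses illustrative names.

```python
import math

C = 1 / (6 * math.log(2))   # the constant of Lemma II.3

def hiding_parameters(d, delta, eps):
    """n and p from Eq. (60)."""
    log_d = math.log2(d)
    n = 99 * d * log_d / (C * eps**2)
    p = C * eps**2 * delta**2 * d / (1188 * log_d)
    return n, p

d, delta, eps = 2.0**60, 1 / 16, 1 / 16
n, p = hiding_parameters(d, delta, eps)
alpha = delta**2 / 4
distortion = 3 * n * p / d**2     # upper bound on 3 (n - 1) p / d^2
# Security requirement: n >= 33 (2d + p) log d / (C eps^2), valid when p <= d.
security_need = 33 * (2 * d + p) * math.log2(d) / (C * eps**2)
```

Because $3 \cdot 99/1188 = 1/4$, the product $3np/d^2$ equals $\delta^2/4$ exactly, independently of $d$ and $\epsilon$.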
V. Locking Classical Correlations

Define the maximum classical mutual information that can be obtained by local measurements $X \otimes Y$ on a bipartite state $\rho_{AB}$ as

$$I_c(\rho) = \max_{X \otimes Y} I(x : y), \tag{61}$$

where $x$ and $y$ are random variables representing the outcomes of measurements $X$ and $Y$ on $\rho_{AB}$ and $I(x : y)$ is equal to $H(x) + H(y) - H(x, y)$ for the Shannon entropy $H$. Now suppose that $\rho'_{AB}$ is obtained from $\rho_{AB}$ by communicating $l$ classical bits present in Alice's system to Bob. There are natural cryptographic reasons to worry about the relationship between $I_c(\rho)$, $I_c(\rho')$ and $l$. Suppose, for example, that an eavesdropper, initially uncorrelated with a quantum state, can extract $I_c(\rho)$ bits of mutual information about some secret classical data by performing a measurement on the state. If instead the eavesdropper started with $l$ classical bits potentially correlated with the secret, one would hope that the most the eavesdropper could learn upon performing her measurement would be less than $I_c(\rho) + lD$ bits for some constant $D$. The existence of locked classical correlations in the form presented here demonstrates that such bounds fail drastically in general. As a consequence then, it is generally much more prudent to use the Holevo $\chi$ quantity instead of the accessible information when bounding an eavesdropper's information. ($\chi$ does obey simple bounds of the desired type.) In their paper introducing the idea of locked classical correlations, DiVincenzo et al. [8] defined two figures of merit,

$$r_1 = \frac{I_c(\rho)}{I_c(\rho')} \quad \text{and} \quad r_2 = \frac{l}{I_c(\rho') - I_c(\rho)}. \tag{62}$$

Ideally, the two should be small simultaneously: the first is the ratio of the initial to the final information while the second measures the ratio of the "key length" to the unlocked information. In their paper, they found examples for which $(r_1, r_2) \sim (\frac{1}{2}, \frac{1}{\log d})$ and $(r_1, r_2) \sim (\frac{1}{2 \log d}, \frac{1}{2})$. Here we show that $r_1$ and $r_2$ can be made arbitrarily small at the same time, meaning that the amount of information unlocked is large relative both to the amount of information originally available and relative to the number of classical bits communicated from Alice to Bob.
Theorem V.1. For all $\epsilon, \delta > 0$ there exist bipartite states $\rho_{AB}$ with $r_1 \le \epsilon$ and $r_2 \le \delta$. Alice's system may be taken to be a classical system of $\log d + 3 \log\log d$ bits and Bob's a quantum system of $\log d$ qubits provided $\epsilon$ is smaller than some fixed constant, $\log d > \frac{16}{C \epsilon} \log \frac{20}{\epsilon}$ (where $C$ is a positive constant) and

$$\delta \ge \frac{3 \log\log d}{(1 - \epsilon/2) \log d}. \tag{63}$$

As in the original work, the states we consider will have the form

$$\rho_{AB} = \frac{1}{dn} \sum_{i=1}^{d} \sum_{j=1}^{n} |ij\rangle\langle ij|_A \otimes (U_j |i\rangle\langle i| U_j^\dagger)_B, \tag{64}$$

where the $\{|ij\rangle_A\}$ and $\{|i\rangle_B\}$ are orthonormal, $d = \dim(B)$ and the $U_j$ are unitary. Thus, $j$ can be thought of as a label describing which orthonormal basis is used on Bob's system to encode $i$. For such states, a convexity argument (see Ref. [8]) quickly implies that

$$I_c(\rho) \le \log d + \max_\varphi \frac{1}{n} \sum_{ij} |\langle\varphi| U_j |i\rangle|^2 \log |\langle\varphi| U_j |i\rangle|^2. \tag{65}$$

The communication of $j$, which requires $l = \log n$ bits, obviously yields a state $\rho'$ for which $I_c(\rho') = \log d + \log n$, so an investigation of the locking properties of $\rho$ will hinge on bounding the second term of Eq. (65). Letting $p_{ji} = |\langle\varphi| U_j |i\rangle|^2$ and $p_j = (p_{j1}, \ldots, p_{jd})$, this second term is equal to $-\frac{1}{n} \sum_j H(p_j)$. (Note that $p_{ji}$ and $p_j$ are functions of $\varphi$, although we have suppressed the dependence in our notation.) As usual, we will proceed by selecting the operators $U_j$ at random using the Haar measure, in which case this average entropy quantity will be provably large. Indeed, a now familiar type of calculation (see Appendix B, with the substitution $\epsilon \to \epsilon/2$) demonstrates that there is a positive constant $C$ such that

$$\Pr\Big( \inf_\varphi \frac{1}{n} \sum_{j=1}^{n} H(p_j) \le (1 - \epsilon/2) \log d - 3 \Big) \le \Big( \frac{20}{\epsilon} \Big)^{2d} \exp\Big( -n \Big( \frac{\epsilon\, C d}{4 (\log d)^2} - 1 \Big) \Big), \tag{66}$$

provided $\epsilon < 2/5$ and $d \ge 7$. Choosing $n = (\log d)^3$ and $\log d$ to be larger than $\frac{16}{C \epsilon} \log \frac{20}{\epsilon}$ then ensures that the probability is bounded away from 1. It's worth pausing to interpret this statement: it means that there is a choice of $n$ bases that is highly incommensurate with all states $\varphi$, in the sense that averaged over bases, the entropy of the probability distribution induced by measuring any fixed $\varphi$ is almost maximal. Returning to locking, we see that there exists a choice of unitaries such that

$$I_c(\rho) \le \log d - [(1 - \epsilon/2) \log d - 3] = \frac{\epsilon}{2} \log d + 3. \tag{67}$$

We can then estimate, using the facts that $3/\log d < \epsilon/2$ and $d \ge 7$,

\begin{align}
r_1 &\le \frac{\frac{\epsilon}{2} \log d + 3}{\log d + 3 \log\log d} \le \epsilon, \tag{68}\\
r_2 &\le \frac{3 \log\log d}{(1 - \epsilon/2) \log d}. \tag{69}
\end{align}
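The two bounds (68) and (69) are simple to evaluate. The sketch below assumes base-2 logarithms and takes $\log d$ as its argument; the function name `locking_bounds` is illustrative.

```python
import math

def locking_bounds(log2_d, eps):
    """Upper bounds on r1 and r2 from Eqs. (68) and (69), with n = (log d)^3."""
    loglog = math.log2(log2_d)
    r1 = (eps / 2 * log2_d + 3) / (log2_d + 3 * loglog)
    r2 = 3 * loglog / ((1 - eps / 2) * log2_d)
    return r1, r2

# Bob holds 1000 qubits; eps = 0.1, so that 3 / log d < eps / 2.
r1, r2 = locking_bounds(log2_d=1000, eps=0.1)
```

Both bounds shrink as $\log d$ grows, which is the content of Theorem V.1: $r_1$ is controlled by $\epsilon$ and $r_2$ by the ratio $\log\log d/\log d$.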
The general mathematical question we addressed in this section amounts to quantifying the constraints imposed by entropic uncertainty relations [24, 25] on typical observables, an interesting problem in its own right, regardless of its connection to locking classical correlations. As such, and acknowledging that the approximations used here were quite crude, it would be worth developing a more detailed understanding of the distribution of the quantity $\min_\varphi \sum_{j=1}^{n} H(p_j)$.

VI. Discussion

We have explored a range of cryptographic applications that are based on concentration phenomena in high-dimensional inner product spaces. Most results in quantum information theory exploit regularities in the structure of the input or operations; Schumacher's quantum noiseless coding theorem [20, 26], for example, is based on the fact that for large $l$, a state $\rho^{\otimes l}$ will be almost entirely supported on an $l(S(\rho) + \delta)$-qubit subspace. The results we presented here are of a related but different character: the regularity we exploit is inherent in the structure of $\mathbb{C}^d$ and, therefore, doesn't require any additional constraints. Our first application was to demonstrate the existence of approximate private quantum channels capable of achieving exponential security (as measured, for example, by an eavesdropper's accessible information) in the number of encrypted qubits while simultaneously using only about half as much key as the well-known perfectly secure constructions. The failure of bounds on the size of the secret key from the exact case to apply in our approximate setting exposed a new distinction between quantum and classical correlations: classical correlations must be destroyed by local randomization operations while quantum correlations can survive such operations. Our second application built on this principle to find protocols for LOCC hiding of bipartite quantum states capable of encoding roughly $l$ qubits in $2l$ qubits, a significant improvement over previous constructions.
We ended by exhibiting states with locked classical correlations. Such states can be used to perform surprising communication tasks but also serve as a warning that accessible information is a potentially volatile measure for use in security definitions. Our results here suggest a number of possible directions for future research. One natural question is the optimality of the cryptographic protocols we’ve described. While a simple rank argument ensures that the 1 : 1 asymptotic ratio of secret key bits to encoded qubits achieved by our approximate private quantum channel is optimal assuming a unitary encryption map, there appear to be technical obstacles to proving optimality in case of CPTP encryption maps with unbounded output dimension. (If the size of the encrypted state is a polynomial function of the size of the message then the proof is straightforward. One need only combine the argument of Ref. [6], Sect. IV with the Fannes inequality [27].) Optimality of the 1 : 2 ratio found for quantum data hiding represents an even bigger challenge; we know of no convincing argument, beset by technical obstacles or not. At a finer level of detail, since our focus has been on asymptotic rates, we haven’t made any serious attempt to optimize the constants in our constructions; it is likely that significant improvements and perhaps simplifications could be found, particularly in the estimates leading to the locking result. Finally, from a practical point of view, the most pressing problem would be to find computationally efficient versions of the constructions we have presented here. Our Appendix A provides one step in this direction in the case of approximate private quantum channels: instead of selecting unitary transformations from the full unitary group, it suffices to select them at random from
the set of products of Pauli operators. This random selection is easily done in polynomial time and the Pauli operators are easily implemented. Unfortunately, since the number of random selections is exponential in the number of encrypted qubits, this simplification does not yet yield a polynomial time construction. Acknowledgement. We thank Daniel Gottesman, Leonid Gurvits, Karol Zyczkowski and, in particular, Charles Bennett for their helpful suggestions. PH and DL acknowledge the support of the Sherman Fairchild Foundation, the Richard C. Tolman Foundation, the Croucher Foundation and the US National Science Foundation under grant no. EIA-0086038. AW is supported by the U.K. Engineering and Physical Sciences Research Council.
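A. Beyond Haar Measure

For some applications, a randomization procedure using a different distribution over unitaries would be preferable; in data hiding, for example, proving that the decoding procedure succeeds with high probability would have been greatly facilitated by using Pauli operators instead of arbitrary unitaries. The convenience appears to come at a price, however. Rather than the strong operator norm estimate of Sect. II, we have only been able to prove the corresponding result in trace norm for the general case. Throughout, we assume only that $\{U_j : 1 \le j \le n\}$ are independent $U(d)$-valued random variables with density $p(U)$ such that $\mathbb{E}\, U_j \varphi U_j^\dagger = I/d$. The density function $p(U)$, for example, could consist of point masses concentrated at tensor products of Pauli operators.

Lemma A.1. For a fixed pure state $\varphi$,
\[ \mathbb{E}\Big[ \Big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_1 \Big] \le \sqrt{d/n}. \]

Proof. First we evaluate
\[ \mathbb{E}\, \Big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_2^2 = \frac{1}{n^2}\, \mathbb{E} \sum_{ij} \mathrm{Tr}\big(U_i \varphi U_i^\dagger U_j \varphi U_j^\dagger\big) - \frac{1}{d}. \tag{A.1} \]
The expectation value is easy to calculate:
\[ \mathbb{E} \sum_{ij} \mathrm{Tr}\big(U_i \varphi U_i^\dagger U_j \varphi U_j^\dagger\big) = \mathbb{E} \sum_{i} \mathrm{Tr}\big(U_i \varphi U_i^\dagger\big) + \mathbb{E} \sum_{i \ne j} \mathrm{Tr}\big(U_i \varphi U_i^\dagger U_j \varphi U_j^\dagger\big) \tag{A.2} \]
\[ = n + \sum_{i \ne j} \mathrm{Tr}\Big(\frac{I}{d}\,\frac{I}{d}\Big) \tag{A.3} \]
\[ = n + \frac{n(n-1)}{d}. \tag{A.4} \]
Therefore, $\mathbb{E}\big[ \big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \big\|_2^2 \big] = \frac{d-1}{nd}$. By the Cauchy-Schwarz inequality and the concavity of the square-root function, we can then estimate
\[ \mathbb{E}\, \Big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_1 \le \sqrt{d}\; \mathbb{E}\, \Big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_2 \tag{A.5} \]
\[ \le \sqrt{d}\, \Big( \mathbb{E}\, \Big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_2^2 \Big)^{1/2}. \tag{A.6} \]
The lemma then follows by substitution and the trivial inequality $d - 1 < d$.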
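As a sanity check of Lemma A.1, the following sketch evaluates both moments exactly in the smallest case. It assumes (this setup is not in the text) $d = 2$, $\varphi = |0\rangle\langle 0|$, and $U_j$ drawn uniformly from the four single-qubit Pauli operators. Conjugation by $I$ or $Z$ then leaves $\varphi$ unchanged and conjugation by $X$ or $Y$ maps it to $|1\rangle\langle 1|$, so the average is diagonal and determined by a Binomial$(n, 1/2)$ count. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt


def qubit_pauli_moments(n):
    """Exact E||avg - I/2||_1 and E||avg - I/2||_2^2 for the assumed qubit example.

    k counts how many of the n conjugated states equal |0><0|, so the
    averaged state is diag(k/n, 1 - k/n) and avg - I/2 = diag(dev, -dev).
    """
    exp_trace_norm = 0.0
    exp_hs_squared = Fraction(0)
    for k in range(n + 1):
        prob = Fraction(comb(n, k), 2 ** n)        # Binomial(n, 1/2) weight
        dev = Fraction(k, n) - Fraction(1, 2)
        exp_trace_norm += float(prob * 2 * abs(dev))  # trace norm is 2|dev|
        exp_hs_squared += prob * 2 * dev * dev        # squared 2-norm is 2 dev^2
    return exp_trace_norm, exp_hs_squared


if __name__ == "__main__":
    d = 2
    for n in (1, 5, 40):
        trace_norm, hs_squared = qubit_pauli_moments(n)
        # hs_squared should equal (d - 1)/(n d); trace_norm should not exceed sqrt(d/n)
        print(n, hs_squared, Fraction(d - 1, n * d), trace_norm, sqrt(d / n))
```

In this special case the second moment reproduces $(d-1)/(nd)$ exactly, and the trace-norm expectation stays below $\sqrt{d/n}$, as the lemma requires.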
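We’ll also make use of Azuma’s inequality:

Lemma A.2. Let $(Y_j)_{j=1}^{n}$ be a sequence of real-valued random variables such that $|Y_j| \le 1$. Let $S_n = \sum_{j=1}^{n} Y_j$ and $S_0 = 0$. If $\mathbb{E}[Y_j | S_{j-1}] = 0$, then
\[ \Pr\Big( \frac{1}{n} S_n \ge t \Big) \le \exp\Big( \frac{-n t^2}{2} \Big). \tag{A.7} \]

Proof. See, for example, Ref. [15].

Theorem A.3. For sufficiently large $d$ and $n = d \log d/\epsilon^2$, there exists a choice of $\{U_j\}_{j=1}^{n}$ in the support of $p$ such that the inequality
\[ \Big\| \frac{1}{n}\sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{I}{d} \Big\|_1 \le \epsilon \tag{A.8} \]
holds for all states $\varphi$.

Proof. For $n, k \ge 1$, let $Z_n = \big\| \sum_{j=1}^{n} U_j \varphi U_j^\dagger - nI/d \big\|_1$, $S_k = \mathbb{E}[Z_n | U_1, \ldots, U_k] - \mathbb{E}[Z_n]$ and $Y_k = S_k - S_{k-1}$. It’s also convenient to introduce the notation $S_0 = Y_0 = 0$. Note that
\[ \mathbb{E}[Y_k | S_{k-1}] = \mathbb{E}\big[ \mathbb{E}[S_k | U_1, \ldots, U_{k-1}] - S_{k-1} \,\big|\, S_{k-1} \big] = 0. \tag{A.9} \]
Also, for fixed $(U_1, \ldots, U_n)$ and unitary $\hat{U}_k$, the triangle inequality gives
\[ \Big|\, \Big\| \sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{nI}{d} \Big\|_1 - \Big\| \sum_{j \ne k} U_j \varphi U_j^\dagger + \hat{U}_k \varphi \hat{U}_k^\dagger - \frac{nI}{d} \Big\|_1 \Big| \tag{A.10} \]
\[ \le \big\| U_k \varphi U_k^\dagger - \hat{U}_k \varphi \hat{U}_k^\dagger \big\|_1 \le 2, \tag{A.11} \]
so $|Y_k| \le 2$. An application of Lemma A.2 to $S_n = \sum_{k=1}^{n} Y_k$ then tells us that
\[ \Pr\big( Z_n - \mathbb{E}[Z_n] \ge 2nt \big) \le \exp\Big( \frac{-n t^2}{2} \Big). \tag{A.12} \]
By the previous lemma, if $n \ge 4d/\delta^2$, then $\mathbb{E}[Z_n] \le n\delta/2$. Therefore, when this condition holds, we find that
\[ \Pr( Z_n \ge n\delta ) \le \exp\Big( \frac{-n \delta^2}{32} \Big). \tag{A.13} \]
Fix an $\epsilon/2$-net $M$ with $|M| \le (10/\epsilon)^{2d}$. Then
\[ \Pr\Big( \sup_{\varphi} \Big\| \sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{nI}{d} \Big\|_1 \ge n\epsilon \Big) \le \Pr\Big( \max_{\varphi \in M} \Big\| \sum_{j=1}^{n} U_j \varphi U_j^\dagger - \frac{nI}{d} \Big\|_1 \ge \frac{n\epsilon}{2} \Big). \tag{A.14} \]
By the union bound and our previous calculations, this probability is less than or equal to
\[ \Big( \frac{10}{\epsilon} \Big)^{2d} \exp\Big( \frac{-n \epsilon^2}{128} \Big). \tag{A.15} \]
If $n = d \log d/\epsilon^2$, then this quantity goes to zero with increasing $d$.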
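To get a feeling for how large "sufficiently large $d$" is in Theorem A.3, one can look at the sign of the logarithm of (A.15). The sketch below assumes that $\log$ denotes the base-2 logarithm (the text distinguishes $\log$ from $\ln$ elsewhere, but this reading is an assumption) and substitutes $n = d \log d/\epsilon^2$, so that the natural logarithm of (A.15) is $d\,(2 \ln(10/\epsilon) - \log_2 d/128)$. The function names are illustrative only.

```python
from math import log


def log_bound_per_dimension(log2_d, eps):
    """Natural log of bound (A.15) divided by d, with n = d*log2(d)/eps**2.

    Working with log2(d) and dividing by d avoids overflow for huge d.
    """
    return 2.0 * log(10.0 / eps) - log2_d / 128.0


def threshold_log2_d(eps):
    """Value of log2(d) above which the bound (A.15) drops below 1."""
    return 256.0 * log(10.0 / eps)


if __name__ == "__main__":
    eps = 0.1
    print("threshold log2(d):", threshold_log2_d(eps))
    for log2_d in (10, 1000, 1200):
        print(log2_d, log_bound_per_dimension(log2_d, eps))
```

Under these assumptions the bound only becomes nontrivial once $\log_2 d$ exceeds $256 \ln(10/\epsilon)$, which illustrates that the constants in this argument are far from optimized.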
B. Proof of Eq. (66)

Our goal is to prove that there is a positive constant $C$ such that
\[ \Pr\Big( \inf_{\varphi} \frac{1}{n}\sum_{j=1}^{n} H(p_j) \le (1-\epsilon)\log d - 3 \Big) \le \Big( \frac{10}{\epsilon} \Big)^{2d} \exp\Big( -n \Big( \frac{\epsilon\, C d}{2(\log d)^2} - 1 \Big) \Big), \]
where $p_{ji} = |\langle i | U_j^\dagger | \varphi \rangle|^2$. We begin by estimating the concentration of measure for the entropy of measurement of a random state. Let $|\psi\rangle$ be a pure state chosen from the unitarily invariant measure on $\mathbb{C}^d$, $q_i = |\langle i | \psi \rangle|^2$ and $f(|\psi\rangle) = H(q)$ the Shannon entropy of the distribution $q$. We use a version of Levy’s Lemma [28]:

Lemma B.1 (Levy). Let $f : S^{k-1} \to \mathbb{R}$ be a function with Lipschitz constant $\sigma$. Then
\[ \Pr\big( |f - \mathbb{E}f| > \eta \big) \le 4 \exp\big( -C' k \eta^2/\sigma^2 \big), \tag{B.1} \]
for Haar measure on $S^{k-1}$ and $C' > 1/(220 \ln 2)$ a constant.

For our application, $k = 2d$ and the Lipschitz constant can be expressed in terms of the $q_i$:
\[ \sigma^2 = \sup_{\psi} \nabla f \cdot \nabla f = \frac{4}{(\ln 2)^2} \sum_{i=1}^{d} q_i (1 + \ln q_i)^2 \tag{B.2} \]
\[ \le \frac{4}{(\ln 2)^2} \sum_{i=1}^{d} q_i \big(1 + (\ln q_i)^2\big) \tag{B.3} \]
\[ \le \frac{4}{(\ln 2)^2} \big(1 + (\ln d)^2\big) \le 8 (\log d)^2, \tag{B.4} \]
where the second inequality, true if $d \ge 3$, can be shown using Lagrange multipliers. The expectation value of $f$ is $\log d - \Delta(d)$, with $\Delta(d) = \log d - (\frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{d})/(\ln 2)$, which converges to $(1 - \gamma)/(\ln 2)$, where $\gamma$ is Euler’s constant (approximately 0.577) [29]. Using the estimate [30]
\[ \frac{1}{2(d+1)} < \sum_{i=1}^{d} \frac{1}{i} - \ln d - \gamma < \frac{1}{2d}, \tag{B.5} \]
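we can guarantee that $1/2 < \Delta(d) < 1$ if $d \ge 7$. Thus, choosing $\eta = 2 - \Delta(d)$ and setting $C = C'/8$, we find
\[ \Pr\big( H(q) < \log d - 2 \big) \le 4 \exp\Big( -\frac{dC}{(\log d)^2} \Big). \tag{B.6} \]
We now move on to bounding $\sum_{j=1}^{n} H(p_j)$ for a given $\varphi$. This is easily done using the Chernoff bound [15]: if $X_1, \ldots, X_n$ are i.i.d. random variables such that $X_j \in [0, 1]$ and $\mathbb{E}X = \mu \ge \alpha \ge 0$, then
\[ \Pr\Big( \frac{1}{n}\sum_{j=1}^{n} X_j \le \alpha \Big) \le \exp\big( -n D(\alpha \| \mu) \big), \tag{B.7} \]
where $D(\cdot \| \cdot)$ is the binary divergence function
\[ D(\alpha \| \mu) = \alpha \log \alpha - \alpha \log \mu + (1 - \alpha) \log(1 - \alpha) - (1 - \alpha) \log(1 - \mu). \tag{B.8} \]
Let $X_j = 0$ whenever $H(p_j) < \log d - 2$ and $X_j = 1$ otherwise. Then $(\log d - 2) X_j \le H(p_j)$ and by Eq. (B.6), $\mathbb{E}_U X_j \ge 1 - 4 \exp(-dC/(\log d)^2)$. Choosing $\alpha = 1 - \epsilon/2$ and $\mu = \mathbb{E}_U X_j$ in the Chernoff bound, we find
\[ \Pr\Big( \frac{1}{n}\sum_{j=1}^{n} H(p_j) \le \big(1 - \tfrac{\epsilon}{2}\big)(\log d - 2) \Big) \le \exp\Big( -n D\Big( 1 - \tfrac{\epsilon}{2} \,\Big\|\, 1 - 4 \exp\Big( \frac{-dC}{(\log d)^2} \Big) \Big) \Big). \tag{B.9} \]
The divergence can be bounded as follows:
\[ D\Big( 1 - \tfrac{\epsilon}{2} \,\Big\|\, 1 - 4 \exp\Big( \frac{-dC}{(\log d)^2} \Big) \Big) \ge -H(\epsilon/2) - \epsilon + \frac{\epsilon\, dC}{2(\log d)^2} \tag{B.10} \]
\[ \ge \frac{\epsilon\, dC}{2(\log d)^2} - 1. \tag{B.11} \]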
The first inequality arises by neglecting the mixed term corresponding to $\alpha \log \mu$ in the divergence, since it is always nonnegative. The second is valid whenever $\epsilon < 1/5$, which we assume from now on. To extend Eq. (B.9) to all possible states, choose an $\epsilon/2$-net $M$ for $d$-dimensional pure states, with $|M| = (10/\epsilon)^{2d}$. Write $\tilde{\varphi}$ for the net point corresponding to $\varphi$. Let $p_j$ be as previously defined in terms of $\varphi$, and $\tilde{p}_j$ be similarly defined in terms of $\tilde{\varphi}$. Using the union bound,
\[ \Pr\Big( \min_{\tilde{\varphi} \in M} \frac{1}{n}\sum_{j=1}^{n} H(\tilde{p}_j) \le \big(1 - \tfrac{\epsilon}{2}\big)(\log d - 2) \Big) \le |M| \exp\Big( -n \Big( \frac{\epsilon\, dC}{2(\log d)^2} - 1 \Big) \Big). \tag{B.12} \]
Furthermore, viewing $p_j$ and $\tilde{p}_j$ as postmeasurement states, the monotonicity of the trace norm implies
\[ \| p_j - \tilde{p}_j \|_1 \le \frac{\epsilon}{2}. \tag{B.13} \]
Then, applying Fannes’ inequality [27] to the distributions $p_j$ and $\tilde{p}_j$,
\[ | H(p_j) - H(\tilde{p}_j) | \le \frac{\epsilon}{2} \log d - \frac{\epsilon}{2} \log \frac{\epsilon}{2} \le \frac{\epsilon}{2} \log d + 1. \tag{B.14} \]
A substitution then completes the proof.
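The claim used before Eq. (B.6), namely that the entropy deficit lies between 1/2 and 1 once $d \ge 7$, is easy to check numerically. The sketch below writes the deficit as `delta(d)` (a name chosen here), takes $\log$ to be base 2 as the factor $1/\ln 2$ in its definition suggests, and hard-codes a floating-point value of Euler's constant.

```python
from math import log, log2

EULER_GAMMA = 0.5772156649015329  # Euler's constant, double precision


def harmonic(d):
    """Harmonic number 1 + 1/2 + ... + 1/d."""
    return sum(1.0 / i for i in range(1, d + 1))


def delta(d):
    """Entropy deficit: log2(d) - (1/2 + 1/3 + ... + 1/d) / ln 2."""
    return log2(d) - (harmonic(d) - 1.0) / log(2)


if __name__ == "__main__":
    for d in (3, 6, 7, 100, 10 ** 5):
        print(d, delta(d))
    print("limit (1 - gamma)/ln 2 =", (1.0 - EULER_GAMMA) / log(2))
```

For small $d$ the deficit is just under 1/2 (for instance at $d = 6$), so the threshold $d \ge 7$ in the text appears to be sharp.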
References
1. Braunstein, S., Lo, H.-K., Spiller, T.: Forgetting qubits is hot to do. Unpublished manuscript, 1999
2. Boykin, P. O., Roychowdhury, V.: Optimal encryption of quantum bits. http://arxiv.org/abs/quant-ph/0003059, 2000
3. Ambainis, A., Mosca, M., Tapp, A., de Wolf, R.: Private quantum channels. In IEEE Symposium on Foundations of Computer Science (FOCS), 2000, pp. 547–553
4. Bennett, C. H., Wiesner, S.: Communication via one- and two-particle operators on Einstein-Podolsky-Rosen states. Phys. Rev. Lett. 69(20), 2881–2884 (1992)
5. Eggeling, T., Werner, R. F.: Hiding classical data in multi-partite quantum states. Phys. Rev. Lett. 89(9), 097905 (2002)
6. DiVincenzo, D. P., Hayden, P., Terhal, B. M.: Hiding quantum data. Found. Phys. 33(11), 1629–1647 (2003)
7. DiVincenzo, D. P., Leung, D. W., Terhal, B. M.: Quantum data hiding. IEEE Trans. Inf. Theory 48(3), 580–598 (2002)
8. DiVincenzo, D. P., Horodecki, M., Leung, D., Smolin, J., Terhal, B. M.: Locking classical correlation in quantum states. Phys. Rev. Lett. 92, 067902 (2004)
9. Bennett, C. H., Hayden, P., Leung, D., Shor, P. W., Winter, A.: Remote preparation of quantum states. http://arxiv.org/abs/quant-ph/0307100, 2003
10. Bennett, C. H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A., Wootters, W. K.: Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev. Lett. 70, 1895–1899 (1993)
11. Lo, H.-K.: Classical-communication cost in distributed quantum-information processing: A generalization of quantum-communication complexity. Phys. Rev. A 62, 012313 (2000)
12. Leung, D.: Quantum vernam cipher. Quantum Info. Comp. 2, 14–34 (2001)
13. Leung, D., Shor, P.: Oblivious remote state preparation. Phys. Rev. Lett. 90, 127905 (2003)
14. Holevo, A. S.: Statistical problems in quantum physics. In: G. Maruyama, J. V. Prokhorov, editors, Proceedings of the second Japan-USSR Symposium on Probability Theory, Volume 330 of Lecture Notes in Mathematics, Berlin: Springer-Verlag, 1973, pp. 104–119
15. Dembo, A., Zeitouni, O.: Large deviations techniques and applications. New York: Springer-Verlag, 1993
16. Zyczkowski, K., Sommers, H.-J.: Truncations of random unitary matrices. J. Phys. A 33, 2045–2057 (2000)
17. Bennett, C. H., DiVincenzo, D. P., Fuchs, C. A., Mor, T., Rains, E., Shor, P. W., Smolin, J. A., Wootters, W. K.: Quantum nonlocality without entanglement. Phys. Rev. A 59(2), 1070–1091 (1999)
18. Hayden, P.: Spin-cycle entanglement. In preparation
19. Verstraete, F., Cirac, J. I.: Quantum nonlocality in the presence of superselection rules and data hiding protocols. Phys. Rev. Lett. 91, 10404 (2003)
20. Ohya, M., Petz, D.: Quantum entropy and its use. Texts and monographs in physics. Berlin: Springer-Verlag, 1993
21. Hausladen, P., Wootters, W. K.: A pretty good measurement for distinguishing quantum states. J. Mod. Opt. 41, 2385–2390 (1994)
22. Hausladen, P., Jozsa, R., Schumacher, B., Westmoreland, M., Wootters, W. K.: Classical information capacity of a quantum channel. Phys. Rev. A 54, 1869–1876 (1996)
23. Fuchs, C. A., van de Graaf, J.: Cryptographic distinguishability measures for quantum mechanical states. IEEE Trans. Inf. Theory 45, 1216–1227 (1999)
24. Deutsch, D.: Uncertainty in quantum measurements. Phys. Rev. Lett. 50, 631–633 (1983)
25. Maasen, H., Uffink, I.: Generalized entropic uncertainty relations. Phys. Rev. Lett. 60, 1103–1106 (1988)
26. Schumacher, B.: Quantum coding. Phys. Rev. A 51, 2738–2747 (1995)
27. Fannes, M.: A continuity property of the entropy density for spin lattice systems. Commun. Math. Phys. 31, 291–294 (1973)
28. Milman, V. D., Schechtman, G.: Asymptotic theory of finite dimensional normed spaces. Number 1200 in Lecture Notes in Mathematics. Springer-Verlag, 1986
29. Jozsa, R., Robb, D., Wootters, W. K.: Lower bound for accessible information in quantum mechanics. Phys. Rev. A 49(2), 668–677 (1994)
30. Young, R. M.: Euler’s constant. Math. Gaz. 75, 187–190 (1991)

Communicated by M.B. Ruskai
Commun. Math. Phys. 250, 393–413 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1124-5
Communications in
Mathematical Physics
Time Development of Exponentially Small Non-Adiabatic Transitions
George A. Hagedorn1, Alain Joye2
1 Department of Mathematics and Center for Statistical Mechanics and Mathematical Physics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0123, U.S.A.
2 Institut Fourier, Unité Mixte de Recherche CNRS-UJF 5582, Université de Grenoble I, BP 74, 38402 Saint Martin d’Hères Cedex, France
Received: 2 September 2003 / Accepted: 8 December 2003 Published online: 14 July 2004 – © Springer-Verlag 2004
Abstract: Optimal truncations of asymptotic expansions are known to yield approximations to adiabatic quantum evolutions that are accurate up to exponentially small errors. In this paper, we rigorously determine the leading order non–adiabatic corrections to these approximations for a particular family of two–level analytic Hamiltonian functions. Our results capture the time development of the exponentially small transition that takes place between optimal states by means of a particular switching function. Our results confirm the physics predictions of Sir Michael Berry in the sense that the switching function for this family of Hamiltonians has the form that he argues is universal.

1. Introduction

The adiabatic approximation in quantum mechanics asymptotically describes solutions to the time dependent Schrödinger equation when the Hamiltonian of the system is a slowly varying function of time. After a rescaling of the time variable, the adiabatic approximation describes the small $\epsilon$ behavior of solutions to the Schrödinger equation
\[ i \epsilon\, \frac{\partial \psi}{\partial t} = H(t)\, \psi. \tag{1.1} \]
In the simplest non-trivial situation, $\{H(t)\}_{t \in \mathbb{R}}$ is a family of $2 \times 2$ Hermitian matrices that depends smoothly on $t$, and whose eigenvalues $E_1(t)$ and $E_2(t)$ are separated by a minimal gap $E_2(t) - E_1(t) > g > 0$ for all $t \in \mathbb{R}$. To discuss scattering transition amplitudes, we also assume that $H(t)$ approaches limits as $t$ tends to plus or minus infinity. We let $\Phi_j(t)$, $j = 1, 2$ be smooth normalized instantaneous eigenstates associated with $E_j(t)$, respectively. Then the transition amplitude $A(\epsilon)$ across the gap between the asymptotic eigenstates is defined as
\[ A(\epsilon) = \lim_{\substack{t_0 \to -\infty \\ t_1 \to +\infty}} \big| \langle \Phi_2(t_1),\, U(t_1, t_0)\, \Phi_1(t_0) \rangle \big|, \tag{1.2} \]
Partially supported by National Science Foundation Grants DMS–0071692 and DMS–0303586.
where $U(t_1, t_0)$ denotes the evolution operator corresponding to (1.1). The adiabatic theorem of quantum mechanics, [6], asserts that $A(\epsilon) = O(\epsilon)$, so the transition probability $A(\epsilon)^2$ is of order $O(\epsilon^2)$.

If the Hamiltonian is an analytic function of time, the transition amplitude is much smaller. Long ago, Zener [34] considered a specific real symmetric two-level system that had an exponentially small transition $A(\epsilon)$ as $\epsilon \to 0$. Generalizations of Zener’s result to analytic real symmetric two-level Hamiltonians with gaps in their spectra were then proposed in the physics literature. Formulas for $A(\epsilon)$ of the form
\[ A(\epsilon) \simeq e^{-\gamma/\epsilon}, \quad \text{as } \epsilon \to 0, \tag{1.3} \]
with $\gamma > 0$ that applied more generally were obtained, e.g., in [24, 9, 13]. The decay rate $\gamma$ was essentially determined by complex crossing points, i.e., the points in the complex $t$–plane where the analytic continuations of the eigenvalues coincided. Later on, papers [4] and [18] recognized independently that non-trivial prefactors $G$, with $|G| \ne 1$, could be present for general Hermitian two-level Hamiltonians to yield the general formula
\[ A(\epsilon) \simeq |G|\, e^{-\gamma/\epsilon}, \quad \text{as } \epsilon \to 0. \tag{1.4} \]
Also, [5] and [15] pointed out independently that certain complex degeneracies could lead to the same formula in the real symmetric case. Formulas (1.3) and (1.4) are variants of the well–known Landau-Zener formula that has been widely used in many areas of atomic and molecular physics.

The goal of this paper is to obtain more precise results for a family of two–level systems. For all times $t$, we construct an approximate solution to (1.1) that is accurate up to errors of order $\exp(-\gamma/\epsilon)\, \epsilon^{\mu}$, for some $\mu > 0$. This captures the transition process which is of order $\exp(-\gamma/\epsilon)$. Explicitly, let $E > 0$ and $\delta > 0$, and consider the Hamiltonian function
\[ H(t) = \frac{E}{2 \sqrt{t^2 + \delta^2}} \begin{pmatrix} \delta & t \\ t & -\delta \end{pmatrix} \tag{1.5} \]
whose eigenvalues are $\pm E/2$ for every $t$. This Hamiltonian can be viewed as the familiar Landau-Zener Hamiltonian modified to keep its eigenvalues constant. In our case, the notion of eigenvalue crossing point is replaced by the singularities of the Hamiltonian itself at $t = \pm i \delta$. These points govern the transitions between the two levels. Let $\Phi_1(t)$ and $\Phi_2(t)$ be smooth, normalized real eigenvectors corresponding to $-E/2$ and $E/2$ respectively.

Theorem 1.1. Let $H(t)$ be given by (1.5) and let $0 < \mu < 1/2$. Then:
1) There exist vectors $\chi_1(\epsilon, t)$ and $\chi_2(\epsilon, t)$ that satisfy the Schrödinger equation (1.1) up to errors of order $e^{-E\delta/\epsilon}$ and correspond to the eigenstates $\Phi_1(t)$ and $\Phi_2(t)$ in the sense that
\[ \lim_{|t| \to \infty} \big| \langle \Phi_j(t),\, \chi_j(\epsilon, t) \rangle \big| = 1 + O\Big( e^{-E\delta/\epsilon} \Big( \frac{\epsilon}{E\delta} \Big)^{\mu} \Big). \tag{1.6} \]
Moreover, the set $\{\chi_j(\epsilon, t)\}_{j=1,2}$ is orthonormal up to errors of order $e^{-E\delta/\epsilon}\, \epsilon^{\mu}$.
2) The Schrödinger equation has solutions $\Psi_j(\epsilon, t)$, $j = 1, 2$ such that uniformly in $t \in \mathbb{R}$ as $\epsilon \to 0$,
\[ \Psi_1(\epsilon, t) = \chi_1(\epsilon, t) - \sqrt{2}\, e^{-E\delta/\epsilon}\, \frac{1}{2} \Big[ \mathrm{erf}\Big( \sqrt{\frac{E}{2\delta\epsilon}}\; t \Big) + 1 \Big] \chi_2(\epsilon, t) + O\Big( e^{-E\delta/\epsilon} \Big( \frac{\epsilon}{E\delta} \Big)^{\mu} \Big), \]
and
\[ \Psi_2(\epsilon, t) = \chi_2(\epsilon, t) + \sqrt{2}\, e^{-E\delta/\epsilon}\, \frac{1}{2} \Big[ \mathrm{erf}\Big( \sqrt{\frac{E}{2\delta\epsilon}}\; t \Big) + 1 \Big] \chi_1(\epsilon, t) + O\Big( e^{-E\delta/\epsilon} \Big( \frac{\epsilon}{E\delta} \Big)^{\mu} \Big). \]
Remarks.
0. Recall that the function erf is defined by
\[ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-y^2}\, dy = \frac{2}{\sqrt{\pi}} \int_{-\infty}^x e^{-y^2}\, dy - 1 \in [-1, 1]. \tag{1.7} \]
1. The vectors $\chi_j(\epsilon, t)$, $j = 1, 2$, are constructed as approximate solutions to (1.1) obtained by means of optimal truncation of asymptotic expansions of actual solutions. As $t \to -\infty$ they are asymptotic to the instantaneous eigenvectors $\Phi_j(t)$ of $H(t)$, up to a phase. We call them optimal adiabatic states. See (4.10), (4.11) and (6.9).
2. The transition mechanism between optimal adiabatic states takes place from the value zero to the value $\sqrt{2}\, e^{-E\delta/\epsilon}$ in a smooth monotonic way described by the switching function $(\mathrm{erf} + 1)/2$, on a time scale of order $\sqrt{\epsilon}$. By contrast, the transition between instantaneous eigenstates of the Hamiltonian displays oscillations of order $\epsilon$ for any finite time to eventually reach its exponentially small value only at $t = \infty$. In that sense also, the vectors $\chi_j(\epsilon, t)$ are optimal.
3. As the optimal adiabatic states and eigenstates essentially coincide at $t = \pm\infty$, the transition amplitude equals $A(\epsilon) \simeq \sqrt{2}\, e^{-E\delta/\epsilon}$, up to errors of order $e^{-E\delta/\epsilon} \big( \frac{\epsilon}{E\delta} \big)^{\mu}$.
4. The parameters $E$ and $\delta$ play somewhat different roles. If we fix $E$ and decrease $\delta$, the singularity of $H(t)$ approaches the real axis. The transition amplitude $\sqrt{2}\, e^{-E\delta/\epsilon}$ increases whereas the time it takes to accomplish the transition decreases as $\sqrt{2 \delta \epsilon/E}$. If instead we fix $\delta$ and let $E$ decrease, the gap between the eigenvalues decreases. The transition amplitude increases as well, whereas the typical time of the transition now increases.
5. Our results allow us to control the evolution operator $U(t, s)$ associated with (1.1) up to errors of order $e^{-E\delta/\epsilon} \big( \frac{\epsilon}{E\delta} \big)^{\mu}$, for any time interval $[s, t]$.
6. Further comments concerning the relevance of the Hamiltonian (1.5) are presented at the end of this section.

We now put our results in perspective by describing previous work on exponential asymptotics for the adiabatic approximation.
Rigorous computations of the behavior of the exponentially small quantity $A(\epsilon)$ for two-level systems, or generalizations of this typical setting, were provided relatively late in [18, 20, 15, 16, 21, 17, 23]. Although we shall not use the technique in this paper, let us briefly describe the mechanism that is typically used to get the asymptotics leading to the exponentially small quantity $A(\epsilon)$ in these papers. It involves deforming integration paths from the real axis into the complex $t$–plane until they reach a (non–real) crossing point. Crossing points provide singularities where significant transitions take place, but their location away from the real axis makes these transitions exponentially small due to the presence of dynamical phases $\exp(-i \int_0^t E_j(s)\, ds/\epsilon)$ whose exponents acquire a non–zero real part along the path. With this approach, the link with the initial problem posed on the real axis is possible only at infinity. One does not learn about the dynamics of the transition. We note that in a more general framework where $A(\epsilon)$ represents the transition between two isolated bands of the spectrum for general (unbounded) analytic Hamiltonians, exponential bounds on $A(\epsilon)$ were obtained by suitable adaptations of this method in [19, 14]. See also [26, 33] for similar results using the pseudo-differential operator machinery.

The other successful method used to construct precise approximations of solutions to (1.1) uses optimal truncation of asymptotic expansions. With sufficiently sharp estimates of the errors, one can prove exponential accuracy. Under appropriate analyticity assumptions, one typically proves that the error committed by retaining $n$ terms in the asymptotic expansion is of order $n!\, \epsilon^n$. This error is minimized by choosing $n \simeq 1/\epsilon$. By virtue of Stirling’s formula, it is of order $\exp(-1/\epsilon)/\sqrt{\epsilon}$ as $\epsilon \to 0$.

The optimal truncation method was first used for the adiabatic approximation by Berry in [3]. He constructed approximate solutions to (1.1) and gave heuristic arguments concerning their exponential accuracy and the determination of the exponentially small transition mechanism between asymptotic eigenstates. In particular, the switching function $(\mathrm{erf} + 1)/2$ in Theorem 1.1 first appeared in [3]. Berry further claimed that this function was universal, i.e., the time development of non–adiabatic transitions in all systems governed by one crossing point (and its conjugate) was described by this switching function. His formal arguments were supported by beautiful numerical investigations of Lim and Berry [25]. Berry’s paper [3] is the main inspiration for the present work.
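The optimal truncation estimate quoted above can be illustrated with a few lines of code. The sketch below takes the model error $n!\,\epsilon^n$ literally (the true errors in the paper carry constants not reproduced here), finds the minimizing $n$ by brute force, and compares the minimum with the Stirling estimate $\sqrt{2\pi/\epsilon}\, e^{-1/\epsilon}$. The function names and the sample value of $\epsilon$ are choices made for this illustration.

```python
from math import exp, pi, sqrt


def truncation_errors(eps, n_max):
    """Return [1! eps, 2! eps^2, ..., n_max! eps^n_max] via a running product."""
    errors, term = [], 1.0
    for n in range(1, n_max + 1):
        term *= n * eps          # n! eps^n = (n eps) * (n-1)! eps^(n-1)
        errors.append(term)
    return errors


def optimal_truncation(eps, n_max=60):
    """Index n minimizing n! eps^n, together with the minimal value."""
    errors = truncation_errors(eps, n_max)
    best = min(range(n_max), key=errors.__getitem__)
    return best + 1, errors[best]


if __name__ == "__main__":
    eps = 0.07
    n_opt, err = optimal_truncation(eps)
    stirling = sqrt(2.0 * pi / eps) * exp(-1.0 / eps)
    print("optimal n:", n_opt, "(1/eps =", 1.0 / eps, ")")
    print("minimal error:", err, "Stirling estimate:", stirling)
```

The terms decrease as long as $(n+1)\epsilon < 1$ and grow afterwards, which is why the minimizer sits at $n \simeq 1/\epsilon$ and the minimal error is exponentially small in $1/\epsilon$.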
Mathematically rigorous exponential bounds for solutions to (1.1) (and of $A(\epsilon)$) using optimal truncation were first obtained in general situations by Nenciu in [31]. They were refined later in [20]. See [11] for an elementary derivation of such results. Although these rigorous results prove the exponential accuracy of the optimal truncation technique, their estimates are not accurate enough to capture the exponentially smaller non–adiabatic transitions. An exponentially small bound on $A(\epsilon)$ is an easy corollary, but the estimates do not provide the asymptotic leading term (1.4) for $A(\epsilon)$. We refer the reader to two fairly recent reviews, [22] and [1], for more details and many other aspects of the adiabatic approximation in quantum mechanics.

We also note that in the broader context of singular perturbations of linear ODE’s, Theorem 1.1 can be interpreted as the smooth crossing of a Stokes line that emanates from some eigenvalue crossing point or singularity. See e.g., [2, 27, 32, 12, 8] and references therein. However, from our perspective, in all rigorous work that has dealt with such issues, the crossing of a Stokes line is performed on a very small circle around the point responsible for the transition. In this paper all estimates are performed on the real axis, and the two conjugate points responsible for the transition are fixed and away from the real axis.

Another angle of attack for such problems uses Borel summation ideas. This is done for certain singularly perturbed ODE’s e.g., in [7]. The method consists of writing the solution as a Laplace transform evaluated at $1/\epsilon$, i.e., as $\int_0^{\infty} F(t, p)\, e^{-p/\epsilon}\, dp$. One then derives a PDE for $F$. The PDE is roughly what one obtains from the original equation by replacing $1/\epsilon$ by the symbol $\partial/\partial p$. To get exponentially precise information about the
solution to the original problem, it is enough to study the location and nature of complex singularities of $F$. We tried to implement the Borel summation technique for our problem, but failed to obtain sufficiently detailed information on the nature of the singularities. To the best of our knowledge, no rigorous results that address the issue we describe in Theorem 1.1 are available in the literature.

Before we turn to the proof of Theorem 1.1, let us briefly discuss our choice of Hamiltonian (1.5). This choice belongs to the family of real–symmetric time–dependent $2 \times 2$ Hamiltonians with non–degenerate eigenvalues $E_2(t) > E_1(t)$. For any member of that family, we can assume without loss of generality that $E_1(t) = -E_2(t)$, because we can subtract a time–dependent multiple of the identity from $H(t)$, which only changes the solutions by a trivial phase. If we change the time variable from $t$ to $t' = 2 \int_0^t E_2(s)\, ds/E$ and drop the prime on $t'$, we obtain a Schrödinger equation (1.1) with a new Hamiltonian $h(t)$ of the form
\[ h(t) = \frac{E}{2} \begin{pmatrix} \cos(\alpha(t)) & \sin(\alpha(t)) \\ \sin(\alpha(t)) & -\cos(\alpha(t)) \end{pmatrix} \tag{1.8} \]
whose eigenvalues are $\pm E/2$ for every $t$. The angle $\alpha$ is given by some function of time. We choose
\[ \Phi_1(t) = \begin{pmatrix} -\sin(\alpha(t)/2) \\ \cos(\alpha(t)/2) \end{pmatrix}, \quad \text{and} \quad \Phi_2(t) = \begin{pmatrix} \cos(\alpha(t)/2) \\ \sin(\alpha(t)/2) \end{pmatrix}, \]
as the normalized real eigenvectors of $h(t)$. Then the coupling $f(t)$ that drives transitions between the instantaneous eigenvectors is given by
\[ f(t) = \langle \Phi_2(t),\, \Phi_1'(t) \rangle = -\langle \Phi_1(t),\, \Phi_2'(t) \rangle = -\frac{\alpha'(t)}{2}. \]
Our choice of Hamiltonian (1.5) corresponds to
\[ f(t) = \frac{1}{2(t^2 + \delta^2)} \quad \Leftrightarrow \quad \alpha(t) = -\frac{1}{\delta} \arctan(t/\delta), \]
where $\delta > 0$ is a parameter monitoring the strength of the coupling. This choice of coupling presents the simplest non-trivial singularities in the complex $t$-plane.

From another point of view, our choice of Hamiltonian is motivated by the Landau–Zener Hamiltonian, $\begin{pmatrix} \delta & t \\ t & -\delta \end{pmatrix}$, which has the local structure of a generic avoided crossing [10]. Physically, one expects the transition to take place in a neighborhood of $t = 0$, and one expects that only the form of the Hamiltonian for small $t$ should determine the transition dynamics. To first order near $t = 0$, our Hamiltonian agrees with the Landau–Zener Hamiltonian. However, quantum transitions have a more global character. The presence of the square root factor in (1.5) gives rise to the nontrivial prefactor $\sqrt{2}$ in the transition amplitude. Indeed, it is shown in [15] that a change of variable allows one to transform Eq. (1.1) with Hamiltonian (1.5) into an equivalent Schrödinger equation driven by a Hamiltonian that behaves locally near $t = 0$ as a Landau–Zener type Hamiltonian, but with a non-generic complex crossing point. This non-generic structure is responsible for the nontrivial prefactor $\sqrt{2}$ in the transition amplitude. It follows from [15], Sect. 4, that the leading order transition amplitude for our Hamiltonian (1.5) is
\[ A(\epsilon) \simeq \sqrt{2}\, e^{-E\delta/\epsilon}. \tag{1.9} \]
The rest of the paper is organized as follows. In the next section we develop perturbation expansions for solutions to the time–dependent Schrödinger equation (1.1) with Hamiltonian (1.5). In Sect. 3 we analyze the behavior of the high order terms of this expansion that are required for precise optimal truncation. In Sect. 4 we study the error term obtained from optimal truncation and define optimal adiabatic states. Section 5 is devoted to the study of two integrals that arise in the error term and give rise to the switching function. Theorem 1.1 is then proven in Sect. 6.

2. The Formal Perturbation Expansion

We begin by converting our time–dependent Schrödinger equation (1.1) with Hamiltonian (1.5) into a parameter free equation. Changing variables from $t$ to $s$ with $s = t/\delta$, we obtain
\[ i \epsilon\, \frac{\partial \psi}{\partial s} = \frac{E\delta}{2 \sqrt{1 + s^2}} \begin{pmatrix} 1 & s \\ s & -1 \end{pmatrix} \psi. \]
Thus, without loss of generality, by changing $\epsilon$ into $\epsilon' = \epsilon/(E\delta)$, we can study the parameter free model
\[ i \epsilon'\, \frac{\partial \psi}{\partial s} = \frac{1}{2 \sqrt{1 + s^2}} \begin{pmatrix} 1 & s \\ s & -1 \end{pmatrix} \psi. \]
In this way, we see that it is sufficient to consider (1.1) with $\delta = E = 1$, which corresponds to the coupling
\[ f(t) = \frac{1}{2 (1 + t^2)}. \tag{2.1} \]
The general results are recovered by making the substitutions $\epsilon \to \frac{\epsilon}{E\delta}$ and $t \to \frac{t}{\delta}$.

We now develop a formal asymptotic expansion to solutions of (1.1). We concentrate on constructing a formal perturbation expansion of the solution to (1.1) that corresponds to the negative eigenvalue $-1/2$ for small $\epsilon$. We make the unusual ansatz that (1.1) has a formal solution of the form
\[ \psi(\epsilon, t) = e^{it/(2\epsilon)}\, e^{\int_0^t f(s)\, g(\epsilon, s)\, ds}\, \big( \Phi_1(t) + g(\epsilon, t)\, \Phi_2(t) \big), \tag{2.2} \]
where $g(\epsilon, t) = \sum_{j=1}^{\infty} g_j(t)\, \epsilon^j$.
Remarks.
1. We arrived at this ansatz by attempting a formal solution of the form
\[ \psi(\epsilon, t) = e^{it/(2\epsilon)}\, e^{z(\epsilon, t)}\, \big( \Phi_1(t) + g(\epsilon, t)\, \Phi_2(t) \big), \]
with $z(\epsilon, t) = \sum_{j=1}^{\infty} z_j(t)\, \epsilon^j$. We then realized that this required
\[ z(\epsilon, t) = \int_0^t f(s)\, g(\epsilon, s)\, ds. \]
2. There are more standard ansätze for the perturbation expansion [3, 4, 11, 16, 18–20, 22, 26, 28–31, 33], but we were unable to get sufficient control of their nth terms to prove the estimates that we required.
3. If we do not expand $g(\epsilon, t)$, then we find that it must satisfy
\[ i \epsilon\, g'(\epsilon, t) = g(\epsilon, t) - i \epsilon\, f(t) \big( 1 + g(\epsilon, t)^2 \big). \]
However, we will not use this equation.
4. For normalization purposes, we later consider (2.2) with $\int_{-\infty}^t f(s)\, g(\epsilon, s)\, ds$ in the exponent, instead of $\int_0^t f(s)\, g(\epsilon, s)\, ds$.
5. When seeking a solution that corresponds to the positive eigenvalue for small $\epsilon$, one makes the similar ansatz
\[ \psi(\epsilon, t) = e^{-it/(2\epsilon)}\, e^{-\int_0^t f(s)\, \tilde{g}(\epsilon, s)\, ds}\, \big( \Phi_2(t) + \tilde{g}(\epsilon, t)\, \Phi_1(t) \big), \tag{2.3} \]
where $\tilde{g}(\epsilon, t) = \sum_{j=1}^{\infty} \tilde{g}_j(t)\, \epsilon^j$ satisfies
\[ i \epsilon\, \tilde{g}'(\epsilon, t) = -\tilde{g}(\epsilon, t) + i \epsilon\, f(t) \big( 1 + \tilde{g}(\epsilon, t)^2 \big). \]
Hence, for any $j \in \mathbb{N}$, we have $\tilde{g}_j(t) = g_j(-t)$.

We substitute (2.2) into (1.1) and formally solve the resulting equation order by order in powers of $\epsilon$.

First Order. The terms of order $\epsilon$ require
\[ g_1(t) = i f(t). \tag{2.4} \]
Second Order. The terms of order $\epsilon^2$ require
\[ g_2(t) = i g_1'(t). \tag{2.5} \]
Third and Higher Order. The terms of order $\epsilon^{n+1}$ for $n \ge 2$ require
\[ g_{n+1}(t) = i \Big( g_n'(t) + f(t) \sum_{j=1}^{n-1} g_j(t)\, g_{n-j}(t) \Big). \tag{2.6} \]
Using (2.1) for the coupling $f$, and an easy induction using partial fractions decompositions, we see that $g_n(t)$ can be written as
\[ g_n(t) = \sum_{j=1}^{2n} c_{n,j}\, e_j(t), \quad \text{where } e_{2j-1}(t) = (1 + it)^{-j}, \quad e_{2j}(t) = (1 - it)^{-j}. \tag{2.7} \]
Thus, for each $n$, we can associate $g_n$ with a unique element of $l^1$, the space of absolutely summable sequences. Following the intuition of Michael Berry [3, 4], we isolate the highest order poles of $g_n(t)$ by decomposing $g_n(t) = G_n(t) + h_n(t)$, where $G_n(t) = c_{n,2n-1}\, e_{2n-1}(t) + c_{n,2n}\, e_{2n}(t)$. Proof of our results now depends on an analysis of the behavior of $G_n(t)$ and $h_n(t)$ for large $n$.
3. Analysis of the Recurrence Relation

The main goals of this section are summarized in the following proposition.

Proposition 3.1. There exists $C$, such that for all real $t$ and each $n \ge 1$,
\[ |G_n(t)| \le (n - 1)!, \tag{3.1} \]
\[ |G_n'(t)| \le n!, \tag{3.2} \]
\[ |h_n(t)| \le C\, (n - 2)!\, \log(n - 2), \tag{3.3} \]
\[ |h_n'(t)| \le C\, (n - 1)!\, \log(n - 2). \tag{3.4} \]
We prove this result using the recurrence formulas (2.4)–(2.6) and the following amazing fact that we use to control the nonlinear terms in the recurrence relation.

Lemma 3.1. For each $k$ and $m$,
\[ e_k(t)\, e_m(t) = \sum_{j=1}^{k+m+1} d_{k,m,j}\, e_j(t). \]
Every $d_{k,m,j}$ is a non-negative real number, and $\sum_{j=1}^{k+m+1} d_{k,m,j} = 1$.

Remark. This lemma implies that we have a Banach algebra structure on $l^1$, where the product of $\{a_n\}$ and $\{b_n\}$ is determined by formally multiplying $\sum_n a_n e_n(t)$ times $\sum_n b_n e_n(t)$ and then taking the coordinates of the result in the $\{e_j(t)\}$ basis.
Proof of Lemma 3.1. We first remark that by keeping track of the orders of the poles, it is easy to see that $j \le k + m + 1$ in the sums in the lemma. Next, we observe that there are many trivial situations. If $k$ and $m$ are both odd, then $e_k(t)\, e_m(t) = e_{k+m+1}(t)$. If $k$ and $m$ are both even, then $e_k(t)\, e_m(t) = e_{k+m}(t)$. Thus, the only non-trivial cases are when one is odd and the other is even. To prove these cases, we do inductions on odd $k$ and even $m$. For $k = 1$ and $m = 2$, the lemma follows immediately from
\[ (1 + it)^{-1} (1 - it)^{-1} = \frac{1}{2} (1 + it)^{-1} + \frac{1}{2} (1 - it)^{-1}. \tag{3.5} \]
Next, we fix $m = 2$ and assume inductively that the lemma has been proven for $k = 2K - 1$. Then using (3.5) again, we have
\[ e_{k+2}(t)\, e_m(t) = (1 + it)^{-K-1} (1 - it)^{-1} = (1 + it)^{-K} \Big( \frac{1}{2} (1 + it)^{-1} + \frac{1}{2} (1 - it)^{-1} \Big) = \frac{1}{2} \big( e_{k+2}(t) + e_k(t)\, e_2(t) \big). \tag{3.6} \]
The result now follows from our induction hypothesis. Thus, the lemma is true for all $k$ and $m = 2$.

Finally, we fix an odd $k$, and assume inductively that the lemma has been proven for this $k$ and an even $m$. Then
\[ e_k(t)\, e_{m+2}(t) = \big( e_k(t)\, e_m(t) \big)\, e_2(t) = \Big( \sum_j d_{k,m,j}\, e_j(t) \Big) e_2(t) = \sum_j d_{k,m,j} \sum_{j'} d_{j,2,j'}\, e_{j'}(t). \]
The lemma now follows since $\sum_{j'} d_{j,2,j'} = 1$, $\sum_j d_{k,m,j} = 1$, and all the $d$’s that occur here are non-negative.
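The coefficients $d_{k,m,j}$ of Lemma 3.1 can be generated mechanically. The sketch below is not the authors' induction verbatim; it uses the equivalent two-term recursion obtained by applying (3.5) once to $(1+it)^{-a}(1-it)^{-b}$, namely that this product is half of $(1+it)^{-a}(1-it)^{-(b-1)}$ plus half of $(1+it)^{-(a-1)}(1-it)^{-b}$. Exact rational arithmetic is used, and the helper names (`mixed`, `product_coeffs`, `e`) are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


def e(j, t):
    """Basis function e_j(t): (1+it)^(-m) if j = 2m-1, (1-it)^(-m) if j = 2m."""
    m = (j + 1) // 2
    return (1 + 1j * t) ** (-m) if j % 2 else (1 - 1j * t) ** (-m)


@lru_cache(maxsize=None)
def mixed(a, b):
    """Expand (1+it)^(-a) (1-it)^(-b), a + b >= 1, as ((index, coeff), ...)."""
    if b == 0:
        return ((2 * a - 1, Fraction(1)),)
    if a == 0:
        return ((2 * b, Fraction(1)),)
    out = {}
    # One application of identity (3.5) lowers either a or b by one.
    for idx, c in mixed(a, b - 1) + mixed(a - 1, b):
        out[idx] = out.get(idx, Fraction(0)) + c / 2
    return tuple(out.items())


def product_coeffs(k, m):
    """Coefficients d_{k,m,j} of e_k(t) e_m(t) in the {e_j} basis."""
    a = sum((j + 1) // 2 for j in (k, m) if j % 2 == 1)  # total power of (1+it)^(-1)
    b = sum(j // 2 for j in (k, m) if j % 2 == 0)        # total power of (1-it)^(-1)
    return dict(mixed(a, b))


if __name__ == "__main__":
    print(product_coeffs(1, 2))
    print(product_coeffs(3, 4))
```

Non-negativity and the unit sum are visible directly from the recursion: each step averages two expansions that already have these properties.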
We also need the following technical result.

Lemma 3.2. For each $n \ge 1$, we have
\[ \sum_{j=0}^{n} \frac{(n - j)!\, j!}{n!} \le \frac{8}{3}, \tag{3.7} \]
\[ \sum_{j=0}^{n-1} \frac{(n - j)!\, j!}{n!} \le \frac{5}{3}, \quad \text{and} \quad \sum_{j=1}^{n-1} \frac{(n - j)!\, j!}{n!} \le \frac{2}{3}. \]

Proof. The first inequality trivially implies the other two. We observe by direct computation that the result is true for the first few values of $n$, and that the sum equals 8/3 when $n = 3$ and $n = 4$. For $n \ge 5$, we separate the first two terms and last two terms to see that the sum equals
\[ 2 + \frac{2}{n} + \sum_{j=2}^{n-2} \frac{(n - j)!\, j!}{n!}. \]
The largest terms in the sum over $j$ in this expression come from $j = 2$ and $j = n - 2$. Those terms equal $\frac{2}{n(n-1)}$, and there are $(n - 3)$ terms. Thus, the left-hand side of (3.7) is bounded by
\[ 2 + \frac{2}{n} + \frac{2(n - 3)}{n(n - 1)} \le 2 + \frac{2}{5} + \frac{1}{5} < \frac{8}{3}. \]
This last step relies on the observation that $\frac{2(n-3)}{n(n-1)}$ takes the value 1/5 when $n = 5$ and $n = 6$, and that it is decreasing for $n \ge 6$.
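The three bounds of Lemma 3.2 are simple to confirm for small $n$ with exact rational arithmetic. The following sketch does this for $n \le 60$; the function name and the range checked are choices made here, and the check is an illustration rather than a substitute for the proof.

```python
from fractions import Fraction
from math import factorial


def ratio_sum(n, lo, hi):
    """Exact value of sum_{j=lo}^{hi} (n-j)! j! / n!  (0 if the range is empty)."""
    return sum(
        (Fraction(factorial(n - j) * factorial(j), factorial(n)) for j in range(lo, hi + 1)),
        Fraction(0),
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, ratio_sum(n, 0, n), ratio_sum(n, 0, n - 1), ratio_sum(n, 1, n - 1))
```

The output shows that equality in all three bounds is attained at $n = 3$ and $n = 4$, in line with the remark in the proof.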
For any $y(t) = \sum_j y_j\, e_j(t)$, with $\{y_j\} \in l^1$, we define $\|y\| = \sum_j |y_j|$. We note that for $t \in \mathbb{R}$, $|y(t)| \le \|y\|$. Since $G_n$ is obtained from $g_n$ by dropping components in the $e_j(t)$ basis, we note that $\|G_n\| \le \|g_n\|$. Thus, the following lemma implies (3.1) and (3.2) since $\frac{d}{dt} (1 \pm it)^{-j} = \mp\, i\, j\, (1 \pm it)^{-j-1}$.

Lemma 3.3. $\|g_n\| \le (n - 1)!$.

Proof. We prove that the sequence $a_n = \|g_n\|/(n - 1)!$ is bounded above by 1. By (2.6), Lemmas 3.1 and 3.2, we see that $n \ge 2$ implies
\[ a_{n+1} \le a_n + \frac{4\, a_{n-1}^2}{3 n (n - 1)}. \tag{3.8} \]
From (2.4) and (2.5) we have $a_1 = a_2 = 1/2$. By explicit computation, we observe that $a_3 = \frac{17}{32} \le 1 - \frac{4}{9}$, and $a_4 = \frac{197}{384} \le 1 - \frac{4}{12}$. The lemma now follows by induction (starting at $n = 4$) and the following statement: If $n \ge 4$, $a_{n-1} \le 1 - \frac{4}{3(n-1)}$ and $a_n \le 1 - \frac{4}{3n}$, then $a_{n+1} \le 1 - \frac{4}{3(n+1)}$. To prove this statement, we use (3.8) to see that for $n \ge 4$,
\[ a_{n+1} \le 1 - \frac{4}{3n} + \frac{4}{3n(n - 1)} \Big( 1 - \frac{4}{3(n - 1)} \Big)^2 = 1 - \frac{4}{3(n + 1)} - \frac{8 (3n^2 + 10n - 29)}{27 (n + 1)\, n\, (n - 1)^3} \le 1 - \frac{4}{3(n + 1)}. \]
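The explicit values $a_3 = 17/32$ and $a_4 = 197/384$ quoted in the proof of Lemma 3.3 can be reproduced by running the recurrence (2.4)–(2.6) in the $\{e_j\}$ basis with exact rationals. The sketch below relies on a reduction that is not spelled out in the text: writing $g_n = i R_n$, the recurrence becomes $R_{n+1} = i R_n' - f \sum_j R_j R_{n-j}$, where $i\, e_{2m-1}' = m\, e_{2m+1}$ and $i\, e_{2m}' = -m\, e_{2m+2}$, so every $R_n$ has real rational coefficients and $\|g_n\| = \|R_n\|$. Products are expanded with the same two-term recursion derived from (3.5); all helper names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def mixed(a, b):
    """Expand (1+it)^(-a) (1-it)^(-b), a + b >= 1, in the e_j basis via (3.5)."""
    if b == 0:
        return ((2 * a - 1, Fraction(1)),)
    if a == 0:
        return ((2 * b, Fraction(1)),)
    out = {}
    for idx, c in mixed(a, b - 1) + mixed(a - 1, b):
        out[idx] = out.get(idx, Fraction(0)) + c / 2
    return tuple(out.items())


def powers(j):
    """(a, b) with e_j = (1+it)^(-a) (1-it)^(-b)."""
    return ((j + 1) // 2, 0) if j % 2 else (0, j // 2)


def multiply(x, y):
    """Product of two elements given as {index: coefficient} dictionaries."""
    out = {}
    for j, cj in x.items():
        for k, ck in y.items():
            a, b = powers(j)[0] + powers(k)[0], powers(j)[1] + powers(k)[1]
            for idx, c in mixed(a, b):
                out[idx] = out.get(idx, Fraction(0)) + cj * ck * c
    return out


def i_times_derivative(x):
    """Coefficients of i * d/dt applied to x: e_j -> +m e_{j+2} (j odd), -m e_{j+2} (j even)."""
    out = {}
    for j, c in x.items():
        m = (j + 1) // 2
        out[j + 2] = out.get(j + 2, Fraction(0)) + (m if j % 2 else -m) * c
    return out


def norms(n_max):
    """Return {n: ||g_n||} for 1 <= n <= n_max, using g_n = i R_n."""
    f = {1: Fraction(1, 4), 2: Fraction(1, 4)}   # f = (e_1 + e_2)/4, from (2.1)
    R = {1: dict(f)}                              # g_1 = i f
    for n in range(1, n_max):
        nxt = i_times_derivative(R[n])
        for j in range(1, n):                     # empty for n = 1, giving (2.5)
            for idx, c in multiply(f, multiply(R[j], R[n - j])).items():
                nxt[idx] = nxt.get(idx, Fraction(0)) - c
        R[n + 1] = nxt
    return {n: sum((abs(c) for c in R[n].values()), Fraction(0)) for n in R}


if __name__ == "__main__":
    for n, value in norms(6).items():
        print(n, value / factorial(n - 1))        # the sequence a_n
```

Running it prints the sequence $a_n$ for $n \le 6$, whose third and fourth entries agree with the values stated in the proof.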
Lemma 3.1 is now proven by the comments before Lemma 3.3 and the following lemma.

Lemma 3.4. There exists $C$, such that $\|h_n\| \le C\,(n-2)!\,\log(n-2)$ and $\|\dot h_n\| \le C\,(n-1)!\,\log(n-2)$.

Proof. The first estimate implies the second since $h_n$ is in the span of $e_j(t)$ with $j \le 2n-2$. Define $b_n = \|h_n\|/((n-2)!)$. Since $h_n$ is obtained from $g_n$ by dropping components in the $e_j(t)$ basis, we have $\|h_n\| \le \|g_n\|$. Thus, by Lemma 3.3, $b_n \le n-1$. We rewrite (2.6), using $g_n = G_n + h_n$:
\[ G_{n+1} + h_{n+1} = i \left( \frac{dG_n}{dt} + \frac{dh_n}{dt} + f \sum_{j=1}^{n-1} G_j\, G_{n-j} + 2 f \sum_{j=1}^{n-1} G_j\, h_{n-j} + f \sum_{j=1}^{n-1} h_j\, h_{n-j} \right). \]
We then drop the $e_{2n+1}(t)$ and $e_{2n+2}(t)$ components of this expression to obtain an expression for $h_{n+1}$. This involves dropping the entire term $i\,\frac{dG_n}{dt}$, as well as parts of
Time Development of Exponentially Small Non-Adiabatic Transitions
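other terms. Since the norm decreases when we drop components, we see that
\[ \|h_{n+1}\| \le \left\| \frac{dh_n}{dt} \right\| + \left\| f \sum_{j=1}^{n-1} G_j\, G_{n-j} \right\| + 2 \left\| f \sum_{j=1}^{n-1} G_j\, h_{n-j} \right\| + \left\| f \sum_{j=1}^{n-1} h_j\, h_{n-j} \right\|. \]
Since $h_n$ is in the span of $\{e_j(t)\}$ for $j \le 2n-2$, $\left\| \frac{dh_n}{dt} \right\| \le (n-1)\, \|h_n\|$. Thus, by $\|f\| \le 1/2$, $\|G_n\| \le (n-1)!$, $h_1 = h_2 = 0$, Lemmas 3.1, and 3.2, we have $b_{n+1} \le b_n +$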
4/3 1/3 5/3 + bn−1 + b2 . n−1 (n − 1)(n − 2) (n − 1)(n − 2)(n − 3) n−3
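Since we already have $b_n \le n-1$ and $b_1 = b_2 = 0$, this implies
\[ b_{n+1} \le b_n + \frac{4/3}{n-1} + \frac{5/3}{n-1} + \frac{1/3}{n-1} = b_n + \frac{10/3}{n-1}, \]
for $n \ge 2$. Thus,
\[ b_n = \sum_{k=2}^{n-1} (b_{k+1} - b_k) \le \sum_{k=2}^{n-1} \frac{10/3}{k-1} \le \frac{10}{3}\, \bigl( 1 + \log(n-2) \bigr). \]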
This implies the lemma and completes the proof of Proposition 3.1.
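The last step of this proof uses the elementary bound $\sum_{k=2}^{n-1} \frac{1}{k-1} \le 1 + \log(n-2)$. A minimal numerical sketch of that inequality follows; the helper names are illustrative and not from the paper.

```python
import math


def harmonic_partial(n):
    """sum_{k=2}^{n-1} 1/(k-1), i.e. the harmonic number H_{n-2}."""
    return sum(1.0 / (k - 1) for k in range(2, n))


def log_bound(n):
    """The comparison quantity 1 + log(n - 2), defined for n >= 3."""
    return 1.0 + math.log(n - 2)


if __name__ == "__main__":
    for n in (3, 4, 10, 100, 1000):
        print(n, harmonic_partial(n), log_bound(n))
```

Equality holds at $n = 3$, and the gap widens slowly as $n$ grows.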
The functions $e_j(t)$ are in $L^1(\mathbb{R})$ for $j \ge 3$. The functions $e_1(t)$ and $e_2(t)$ are not in $L^1(\mathbb{R})$, but $e_1(t) + e_2(t)$ is. The following lemma facilitates getting $L^1$ information about the functions $g_n$.

Lemma 3.5. For every $n$, the $e_1(t)$ and $e_2(t)$ coefficients in $g_n(t)$ are equal. Furthermore, the $L^1(\mathbb{R})$ norm of $g_n$ is bounded by $\pi\, \|g_n\|$. These results are also true for $h_n$.

Proof. We prove the first statement by induction on $n$. It is clearly true for $n = 1$. Suppose it is true for all $n < N$. The function $i\, \dot g_{N-1}$ contains no $e_1(t)$ or $e_2(t)$ component. Since the $g_n$ are bounded, $f \sum_{j=1}^{N-2} g_j\, g_{N-j-1}$ is in $L^1(\mathbb{R})$ since $f$ is. The only way this function can be in $L^1$ is if its $e_1(t)$ and $e_2(t)$ components are equal. This implies the result for $g_N$, and the induction can proceed. The second statement follows from the first because the absolute values of $(e_1(t) + e_2(t))$ and $e_j(t)$ for $j \ge 3$ are dominated by $(1 + t^2)^{-1}$ which has integral $\pi$. The third statement follows since $h_n$ is obtained from $g_n$ by removal of the $n$th order pole terms.

We now examine $G_n$ more closely. We note that highest order pole terms in $g_n$ at $t = \pm i$ satisfy the recurrence relation
\[ G^{\pm}_{n+1}(t) = i\, \frac{dG^{\pm}_n}{dt}(t) \pm \frac{1/4}{t \mp i} \sum_{j=1}^{n-1} G^{\pm}_j(t)\, G^{\pm}_{n-j}(t), \]
with $G^{\pm}_1(t) = \pm \frac{1}{4}\, \frac{1}{t \mp i}$ and $G^{\pm}_2(t) = \frac{\mp i}{4}\, \frac{1}{(t \mp i)^2}$. From this it follows that
\[ G_n(t) = i\, \gamma_n \left( e_{2n-1}(t) + (-1)^{n-1}\, e_{2n}(t) \right), \tag{3.9} \]
where $\gamma_n$ satisfies the real numerical recurrence relation
\[ \gamma_{n+1} = n\, \gamma_n - \frac{1}{4} \sum_{j=1}^{n-1} \gamma_j\, \gamma_{n-j}, \tag{3.10} \]
with $\gamma_1 = \gamma_2 = 1/4$. By Lemma 3.1, the quantity $\beta_n = \gamma_n/(n-1)!$ is bounded. It satisfies
\[ \beta_{n+1} = \beta_n - \frac{1}{4} \sum_{j=1}^{n-1} \frac{((j-1)!)\,((n-j-1)!)}{n!}\, \beta_j\, \beta_{n-j}, \tag{3.11} \]
with $\beta_1 = \beta_2 = 1/4$. From this relation and Lemma 3.2, it follows that $\beta_n$ has a limit $\beta^*$ as $n$ tends to infinity. To see this, suppose that the sequence $\beta_m$ is positive and strictly decreasing for $m \le n$. Then,
0
β2 − 3 n(n − 1) 3 j (j − 1) j =2 2β12 5 1 = β2 − > . 1− n 24 3
βn+1 > βn −
(3.13)
Therefore, the sequence $\{\beta_n\}$ is positive, strictly decreasing and bounded below. Similarly, for some constant $C > 0$ and any $p > 0$, we have
\[ \beta_n - \beta_{n+p} \le C \left( \frac{1}{n-1} - \frac{1}{n+p-1} \right), \tag{3.14} \]
so that $\beta_n = \beta^* (1 + O(n^{-1}))$.

Remark. Later, we will see that
\[ \beta^* = \frac{1}{\pi \sqrt{2}}. \tag{3.15} \]
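The convergence of $\beta_n$ can be illustrated numerically. The sketch below iterates the recurrence (3.11) in floating point, assuming $\beta_1 = \beta_2 = 1/4$ and reading the limit in (3.15) as $1/(\pi\sqrt{2}) \approx 0.2251$. The factorial weights are evaluated through `math.lgamma` to avoid overflow; the function name is hypothetical.

```python
import math


def beta_sequence(n_max):
    """Return [None, beta_1, ..., beta_{n_max}] computed from recurrence (3.11)."""
    beta = [None, 0.25, 0.25]
    for n in range(2, n_max):
        total = 0.0
        for j in range(1, n):
            # weight (j-1)! (n-j-1)! / n!, computed via log-gamma
            log_w = math.lgamma(j) + math.lgamma(n - j) - math.lgamma(n + 1)
            total += math.exp(log_w) * beta[j] * beta[n - j]
        beta.append(beta[n] - 0.25 * total)
    return beta


if __name__ == "__main__":
    b = beta_sequence(400)
    print(b[3], b[10], b[400], 1 / (math.pi * math.sqrt(2)))
```

Because $\beta_n = \beta^*(1 + O(n^{-1}))$, the agreement with the limit improves only linearly in $1/n$, so a few hundred terms give roughly four correct digits.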
4. Optimal Truncation

We begin this section by studying
\[ \zeta_n(\epsilon, t) = i\, \epsilon\, \frac{\partial \psi}{\partial t} - H(t)\, \psi, \tag{4.1} \]
where $\psi$ is given by (2.2) with $g(\epsilon, t) = \sum_{j=1}^{n} g_j(t)\, \epsilon^j$. We ultimately choose $n = [[1/\epsilon]] - 1$, where $[[k]]$ denotes the greatest integer less than or equal to $k$.
t (f (s) nj=1 gj (s) j ) ds it/(2) 0 e 2 (t) times By explicit calculation, ζn (, t) equals e i n+1 G n + i n+1 h n + i n+1 f
n−1
2n+1
gj gn−j +
j =1
i k f
k=n+2
n
gj gk−j −1 .
j =k−n−1
(4.2) Lemma 4.1. The first term in (4.2) satisfies n+1 G n = 2 βn n+1 (n!).
(4.3)
When n is chosen to be n = [[1/]] − 1, the norm of the remaining terms in (4.2) satisfies (4.4) ζn (, t) − i n+1 G n ≤ C n+1 ((n − 1)!) log(n − 2) for some C. Thus, as tends to zero with n = [[1/]] − 1, ζn (, t) is asymptotic to t n j i n+1 G n (t) eit/(2) e 0 (f (s) j =1 gj (s) ) ds 2 (t). Proof. The result (4.3) was proven at the end of the previous section. By Lemma 3.4, the second term in (4.2) satisfies n+1 h n ≤ C n+1 (n − 1)! log(n − 2). By Lemmas 3.1, 3.2, and 3.3, the third term satisfies n−1 4 n+1 n+1 n+1 8 1 gj gn−j (n − 2)!. (n − 2)! = f ≤ 3 2 3
(4.5)
(4.6)
j =1
We now prove that
\[ \left\| \sum_{k=n+2}^{2n+1} \epsilon^k f \sum_{j=k-n-1}^{n} g_j\, g_{k-j-1} \right\| \le C\, \epsilon^{n+1}\, (n-1)! \tag{4.7} \]
for some $C$ when $n = [[1/\epsilon]] - 1$. We begin the proof of (4.7) with a technical lemma:

Lemma 4.2. For positive integers $l \le m/2$,
\[ \sum_{p=l}^{m-l} (p!)\, ((m-p)!) \le (m - 2l + 1)\, (l!)\, ((m-l)!). \]
Proof.
\[ \sum_{p=l}^{m-l} (p!)\, ((m-p)!) = (l!)((m-l)!) + ((l+1)!)((m-l-1)!) + \cdots + ((m-l-1)!)((l+1)!) + ((m-l)!)(l!). \]
The first and last terms are the largest terms in this sum, and they equal $(l!)\, ((m-l)!)$. The lemma follows since there are $(m - 2l + 1)$ terms in the sum.

We apply Lemma 4.2 with $p = j-1$, $m = k-3$, and $l = k-n-2$, along with Lemmas 3.1, 3.2, and 3.3, to see that the norm of the term $i\, \epsilon^k f \sum_{j=k-n-1}^{n} g_j\, g_{k-j-1}$ in (4.2) is bounded by
\[ \epsilon^k\, \frac{1}{2} \sum_{p=l}^{m-l} p!\, ((m-p)!) \le \epsilon^k\, (2n-k+2)\, ((k-n-2)!)\, ((n-1)!)/2. \]
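Lemma 4.2 is a finite combinatorial inequality, so it can be checked by brute force over a range of $(m, l)$. The following sketch uses exact integer arithmetic; the function names are illustrative only.

```python
from math import factorial


def lemma_lhs(m, l):
    """Left-hand side of Lemma 4.2: sum_{p=l}^{m-l} p! (m-p)!."""
    return sum(factorial(p) * factorial(m - p) for p in range(l, m - l + 1))


def lemma_rhs(m, l):
    """Right-hand side of Lemma 4.2: (m - 2l + 1) l! (m-l)!."""
    return (m - 2 * l + 1) * factorial(l) * factorial(m - l)


if __name__ == "__main__":
    for m, l in ((2, 1), (6, 2), (10, 3)):
        print(m, l, lemma_lhs(m, l), lemma_rhs(m, l))
```

The bound is attained when $l = m/2$, where the sum has a single term.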
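To prove (4.7), we now sum this quantity over $k$:
\[
\begin{aligned}
\sum_{k=n+2}^{2n+1} \epsilon^k\, (2n-k+2)\, ((k-n-2)!)\, ((n-1)!)/2
&= \epsilon^{n+1}\, ((n-1)!) \sum_{k=n+2}^{2n+1} \frac{2n-k+2}{2}\, \epsilon^{k-n-1}\, ((k-n-2)!) \\
&= \epsilon^{n+1}\, ((n-1)!) \sum_{m=0}^{n-1} \frac{n-m}{2}\, \epsilon^{m+1}\, (m!).
\end{aligned} \tag{4.8}
\]
We now fix $n = [[1/\epsilon]] - 1$ and note that this implies that
\[ \sum_{m=0}^{n-1} \frac{n-m}{2}\, \epsilon^{m+1}\, (m!) \le \frac{1}{2} \sum_{m=0}^{n-1} \epsilon^m\, (m!) \le \frac{1}{2} \left( 1 + \frac{1}{n} + \frac{2!}{n^2} + \cdots + \frac{(n-1)!}{n^{n-1}} \right). \]
By Stirling's formula, $j!/n^j \le C\, (j/n)^j\, e^{-j} \sqrt{j}$ for some $C > 0$. So, the quantity on the right hand side here is bounded. This and (4.8) imply (4.7), which proves the lemma.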
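The boundedness of $1 + \frac{1}{n} + \frac{2!}{n^2} + \cdots + \frac{(n-1)!}{n^{n-1}}$ is easy to observe numerically. The sketch below accumulates the terms $j!/n^j$ iteratively so that no large factorial is ever formed; the function name is hypothetical.

```python
def truncated_series(n):
    """1 + 1/n + 2!/n^2 + ... + (n-1)!/n^(n-1), accumulated term by term."""
    total = 0.0
    term = 1.0  # term equals j!/n^j, starting at j = 0
    for j in range(n):
        total += term
        term *= (j + 1) / n
    return total


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 100, 1000):
        print(n, truncated_series(n))
```

In this range the values stay below 2 and approach 1 as $n$ grows, consistent with the uniform bound used in the proof.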
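We note that our choice of $n = [[1/\epsilon]] - 1$ implies that there exists a $C > 0$ such that
\[ \int_{-\infty}^{\infty} \Bigl| \sum_{j=1}^{n} g_j(t)\, \epsilon^j \Bigr|\, dt \le C\, \epsilon. \tag{4.9} \]
This follows from Lemmas 3.3 and 3.5, since the left hand side of (4.9) is bounded by
\[ \pi \sum_{j=1}^{n} \|g_j\|\, \epsilon^j \le \pi \sum_{j=1}^{n} (j-1)!\, \epsilon^j \le \pi\, \epsilon \sum_{k=0}^{n-1} \epsilon^k\, k! \le C\, \epsilon. \]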
This allows us to define the optimal adiabatic state ψ1 (, t) associated with the eigenvalue −1/2 by ψ1 (, t) = eit/(2) e where g(, t) =
[ 1/] ]−1
t
−∞
f (s) g(, s) ds
( 1 (t) + g(, t) 2 (t) ) ,
(4.10)
gj (t) j , with the gj (t)’s defined in Sect. 2. By construction,
j =1
ψ1 (, t) is normalized as t → −∞. The optimal adiabatic state ψ2 (, t) associated with the eigenvalue 1/2 is defined similarly, according to (2.3), ψ2 (, t) = e−it/(2) e− where g(, ˜ t) =
[ 1/] ]−1
t
−∞
f (s) g(, ˜ s) ds
˜ t) 1 (t) ) , ( 2 (t) + g(,
(4.11)
g˜ j (t) j . Since the entire analysis of the gj ’s above does not
j =1
depend on the sign of t, it holds for the g˜ j (t) as well. See Remark 5 of Sect. 2. 5. Analysis of Some Integrals The main goal of this section is to analyze the two (quite different) integrals
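\[ \int_{-\infty}^{t} \frac{e^{is/\epsilon}}{(1 \pm is)^m}\, ds, \]
where $\epsilon > 0$ is a small parameter, and $m = [[1/\epsilon]]$. By taking conjugates, we obtain the analogous results for
\[ \int_{-\infty}^{t} \frac{e^{-is/\epsilon}}{(1 \mp is)^m}\, ds. \]
Remark. As we see below, the reason the two integrals have different behavior is that in one case, there is a cancellation of rapidly oscillating phases, while in the other, the phases reinforce one another.

Lemma 5.1. For small $\epsilon > 0$, $m = [[1/\epsilon]]$ and any $\gamma \in (1/2, 1)$, we have
\[ \int_{-\infty}^{t} \frac{e^{is/\epsilon}}{(1 + is)^m}\, ds = \sqrt{\frac{\pi}{2m}} \left( \operatorname{erf}\!\left( \sqrt{\frac{m}{2}}\; t \right) + 1 \right) + O(m^{-\gamma}), \tag{5.1} \]
and
\[ \int_{-\infty}^{t} \frac{e^{is/\epsilon}}{(1 - is)^m}\, ds = O(m^{-\gamma}). \tag{5.2} \]
Proof. We begin with some preliminary estimates that apply to both integrals. We note that for real $s$,
\[ \left| \frac{e^{is/\epsilon}}{(1 \mp is)^m} \right| = (1 + s^2)^{-m/2}. \]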
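The contrast between the two integrals in Lemma 5.1 can be seen with a direct quadrature. The sketch below is only an illustration under stated assumptions: it takes $\epsilon = 1/m$ exactly, truncates the lower limit at a hypothetical cutoff `s_min` (justified by the decay $(1+s^2)^{-m/2}$), and uses a plain trapezoid rule. The leading term is taken to be $\sqrt{\pi/(2m)}\,(\operatorname{erf}(\sqrt{m/2}\,t) + 1)$.

```python
import cmath
import math


def oscillatory_integral(t, m, sign, s_min=-4.0, steps=20000):
    """Trapezoid approximation of the integral of exp(i*m*s) / (1 + sign*i*s)**m over [s_min, t].

    Assumes eps = 1/m, so exp(i*s/eps) = exp(i*m*s). The neglected tail below
    s_min is bounded by |s_min|**(1 - m) / (m - 1).
    """
    h = (t - s_min) / steps

    def integrand(s):
        return cmath.exp(1j * m * s) / (1 + sign * 1j * s) ** m

    total = 0.5 * (integrand(s_min) + integrand(t))
    for k in range(1, steps):
        total += integrand(s_min + k * h)
    return total * h


def leading_term(t, m):
    """Assumed leading term of (5.1)."""
    return math.sqrt(math.pi / (2 * m)) * (math.erf(math.sqrt(m / 2) * t) + 1)


if __name__ == "__main__":
    m, t = 50, 0.2
    print(oscillatory_integral(t, m, +1), leading_term(t, m))
    print(abs(oscillatory_integral(t, m, -1)))
```

With `sign=+1` the phases cancel and a Gaussian profile remains; with `sign=-1` the integrand keeps oscillating and the integral is much smaller.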
When $|s| \ge 1$, this is bounded by $|s|^{-m}$. Since $\int_1^{\infty} ds/s^m = 1/(m-1)$, we make an $O(m^{-1})$ error in each of the integrals, if we drop the contributions from $|s| \ge 1$. Next, let $\delta > 0$ be small and $a = m^{\delta}/\sqrt{m}$. If $a \le |s| \le 1$, then $(1 + s^2)^{-m/2} \le (1 + a^2)^{-m/2}$. Thus
\[ \int_a^1 (1 + s^2)^{-m/2}\, ds \le (1 - a)\, (1 + a^2)^{-m/2} \le (1 + a^2)^{-m/2}. \tag{5.3} \]
Now, if δ is small enough, (5.3) is of the order of e− 2
ln(1+a 2 )
t
−∞
= e−
m2δ 2
m2δ
= O(e− 2 ) z1 −z2 > 0 and z1 > z1 −z2 > 0, respectively. Consequently, (1.12),(1.13), (1.14) and (1.15) converge absolutely for z1 , z2 ∈ C satisfying |z1 | > |z2 | > 0, |z2 | > |z1 | > 0, |z2 | > |z1 − z2 | > 0 |z1 | > |z1 − z2 | > 0, respectively. The convergence is proved. In particular, (1.12) and (1.14) give (possibly multivalued) analytic functions defined on the regions |z1 | > |z2 | > 0 and |z2 | > |z1 − z2 | > 0, respectively. By associativity for Y O , (1.12) and (1.14) are equal for z1 , z2 ∈ R+ satisfying z1 > z2 > z1 − z2 > 0. By the basic properties of analytic functions, (1.12) and (1.14) are equal for z1 , z2 ∈ C satisfying |z1 | > |z2 | > |z1 −z2 | > 0 (the intersection of the regions |z1 | > |z2 | > 0 and |z2 | > |z1 − z2 | > 0 on which the analytic functions (1.12) and (1.14) are defined). The second part of the associativity for Y f can be obtained From the first part by substituting v2 , v1 , z2 and z1 for v1 , v2 , z1 and z2 .
Definition 1.10. A grading-restricted open-string vertex algebra is an open-string vertex algebra satisfying the following conditions:

8. The grading-restriction conditions: For all $n \in \mathbb{R}$, $\dim V_{(n)} < \infty$ (the finite-dimensionality of homogeneous subspaces) and $V_{(n)} = 0$ when $n$ is sufficiently negative (the lower-truncation condition for grading).

A conformal open-string vertex algebra is an open-string vertex algebra equipped with a conformal element $\omega \in V$ satisfying the following conditions:

9. The Virasoro relations: For any $m, n \in \mathbb{Z}$,
\[ [L(m), L(n)] = (m - n)\, L(m + n) + \frac{c}{12}\, (m^3 - m)\, \delta_{m+n,0}, \]
where $L(n)$, $n \in \mathbb{Z}$ are given by
\[ Y^O(\omega, r) = \sum_{n \in \mathbb{Z}} L(n)\, r^{-n-2} \]
and $c \in \mathbb{C}$.

10. The commutator formula for Virasoro operators and formal vertex operators (or component operators): For $v \in V$, $Y^f(\omega, x)v$ involves only finitely many negative powers of $x$ and
\[ [Y^f(\omega, x_1), Y^f(v, x_2)] = \operatorname{Res}_{x_0}\, x_2^{-1}\, \delta\!\left( \frac{x_1 - x_0}{x_2} \right) Y^f(Y^f(\omega, x_0)v, x_2). \]

11. The $L(0)$-grading property and $L(-1)$-derivative property: $L(0) = d$ and $L(-1) = D$.

A grading-restricted conformal open-string vertex algebra or open-string vertex operator algebra is a conformal open-string vertex algebra satisfying the grading-restriction condition.
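As a small worked check of the Virasoro relations in condition 9, the sketch below verifies the Jacobi identity on basis elements $L(m)$, treating $c$ as a central element. This is the standard reading and an assumption here, not something stated in the text above; the data layout (a pair of coefficients) and function names are illustrative.

```python
from fractions import Fraction


def bracket_basis(m, n):
    """[L(m), L(n)] as (coefficient of L(m+n), coefficient of the central element c)."""
    central = Fraction(m ** 3 - m, 12) if m + n == 0 else Fraction(0)
    return (m - n, central)


def jacobi_l_part(m, n, p):
    """Coefficient of L(m+n+p) in the cyclic sum [[L(m),L(n)],L(p)] + cyclic permutations."""
    total = 0
    for a, b, d in ((m, n, p), (n, p, m), (p, m, n)):
        coeff, _ = bracket_basis(a, b)
        total += coeff * (a + b - d)
    return total


def jacobi_central_part(m, n, p):
    """Coefficient of c in the same cyclic sum (c is assumed to commute with every L(k))."""
    total = Fraction(0)
    for a, b, d in ((m, n, p), (n, p, m), (p, m, n)):
        coeff, _ = bracket_basis(a, b)
        _, central = bracket_basis(a + b, d)
        total += coeff * central
    return total


if __name__ == "__main__":
    print(jacobi_l_part(1, 2, -3), jacobi_central_part(1, 2, -3))
```

Both parts vanish for every triple, which is the statement that the cubic term $(m^3 - m)/12$ defines a consistent central extension.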
Open-String Vertex Algebras, Tensor Categories and Operads
We shall denote the conformal open-string vertex algebra defined above by $(V, Y^O, \mathbf{1}, \omega)$ or simply $V$. The complex number $c$ in the definition is called the central charge of the algebra. Note that the grading-restriction conditions imply the local-truncation property for $D'$.

Proposition 1.11. Let $V$ be a grading-restricted open-string vertex algebra. Then for $u, v \in V$, $u^+_n v = 0$ if $n$ is sufficiently negative.

Proof. This follows immediately from the lower-truncation condition for grading and the fact that the weight of $u^+_n$ for $n \in \mathbb{R}$ is $\mathrm{wt}\, u - n - 1$.

2. Intertwining Operators and Open-String Vertex Algebras

In this section, we establish a connection between open-string vertex algebras and intertwining operator algebras. We assume that the reader is familiar with the basic notions and properties in the representation theory of vertex operator algebras and we also assume that the reader is familiar with the notion of intertwining operator algebra. See [FHL, H7 and H8] for details. In the remaining part of the paper, we shall consider only those open-string vertex algebras such that all the products and iterates of vertex operators associated to complex numbers are absolutely convergent in natural regions. See [K] for details. Let $V$ be an open-string vertex algebra and $S$ a subset of $V$. Then the open-string vertex subalgebra of $V$ generated by $S$ is the smallest open-string vertex subalgebra of $V$ containing $S$.

Proposition 2.1. Let $V$ be a conformal open-string vertex algebra and $\langle \omega \rangle$ the open-string vertex subalgebra of $V$ generated by $\omega$. Then $\langle \omega \rangle$ is in fact a vertex operator algebra. In particular, $V$ is a module for the vertex operator algebra $\langle \omega \rangle$.

Proof. All the axioms for a vertex operator algebra are satisfied by $\langle \omega \rangle$ obviously except for the commutativity or equivalently the commutator formula. But the Virasoro relations imply the commutator formula for the vertex operators for $\langle \omega \rangle$.
More generally, we have the following generalization: Let $V$ be an open-string vertex algebra and let
\[ C_0(V) = \Bigl\{ u \in \bigoplus_{n \in \mathbb{Z}} V_{(n)} \;\Big|\; Y^f(u, x) \in (\operatorname{End} V)[[x, x^{-1}]],\ Y^f(v, x)u = e^{xD}\, Y^f(u, -x)v,\ \forall v \in V \Bigr\}. \]
In particular, for elements of $C_0(V)$, skew-symmetry holds. Clearly $C_0(V)$ is not zero since by (1.11), $\mathbf{1} \in C_0(V)$. For an open-string vertex algebra $V$, the formal vertex operator map $Y^f$ for $V$ induces a map from $C_0(V) \otimes C_0(V)$ to $V[[x, x^{-1}]]$. We denote this map by $Y^f|_{C_0(V)}$. We first need:

Proposition 2.2. Let $v_1 \in C_0(V)$, $v_2, v \in V$ and $v' \in V'$. Then there exists a (possibly multivalued) analytic function on $M^2 = \{(z_1, z_2) \in \mathbb{C}^2 \mid z_1, z_2 \ne 0,\ z_1 \ne z_2\}$ such that it is single valued in $z_1$ and is equal to the (possibly multivalued) analytic extensions of (1.12), (1.13), (1.14) and (1.15) in the regions $|z_1| > |z_2| > 0$, $|z_2| > |z_1| > 0$,
Y.-Z. Huang, L. Kong
$|z_2| > |z_1 - z_2| > 0$ and $|z_1| > |z_1 - z_2| > 0$, respectively. Moreover, if $v_2$ is in $C_0(V)$, then this analytic function is single valued in both $z_1$ and $z_2$. If $V$ satisfies the grading-restriction condition, then this analytic function is a rational function with the only possible poles $z_1, z_2 = 0$ and $z_1 = z_2$.

Proof. By Proposition 1.9, (1.12), (1.13) and (1.14) are absolutely convergent in the regions $|z_1| > |z_2| > 0$, $|z_2| > |z_1| > 0$, $|z_2| > |z_1 - z_2| > 0$, respectively, and the associativity for $Y^f$ holds. Since $v_1 \in C_0(V)$, by definition, $Y^f(v_1, x)v_2 \in V[[x, x^{-1}]]$ and we have the skew-symmetry
\[ Y^f(v_1, x)v_2 = e^{xD}\, Y^f(v_2, -x)v_1, \qquad Y^f(v_2, x)v_1 = e^{xD}\, Y^f(v_1, -x)v_2. \]
In [H7] it was proved that commutativity for intertwining operators follows from associativity and skew-symmetry for intertwining operators. For the reader's convenience, here we give a proof of commutativity in the special case in which we are interested. By associativity, (1.12) and (1.14) are equal in the region $|z_1| > |z_2| > |z_1 - z_2| > 0$. By associativity also, (1.13) and (1.15) converge absolutely to analytic functions defined on the regions $|z_2| > |z_1| > 0$ and $|z_1| > |z_1 - z_2| > 0$, respectively, and are equal in the region $|z_2| > |z_1| > |z_1 - z_2| > 0$. By skew-symmetry and the $D$-derivative property, for $z_1, z_2 \in \mathbb{C}$ satisfying $|z_1| > |z_1 - z_2| > 0$ and $|z_2| > |z_1 - z_2| > 0$, we have
\[
\begin{aligned}
\langle v', Y^f(Y^f(v_1, z_1 - z_2)v_2, z_2)v \rangle
&= \langle v', Y^f(e^{(z_1 - z_2)D}\, Y^f(v_2, -(z_1 - z_2))v_1, z_2)v \rangle \\
&= \langle v', Y^f(Y^f(v_2, z_2 - z_1)v_1, z_2 + (z_1 - z_2))v \rangle \\
&= \langle v', Y^f(Y^f(v_2, z_2 - z_1)v_1, z_1)v \rangle,
\end{aligned}
\]
that is, in the region given by $|z_1| > |z_1 - z_2| > 0$ and $|z_2| > |z_1 - z_2| > 0$, (1.14) and (1.15) are equal.
Since (1.12) is equal to (1.14) in the region |z1 | > |z2 | > |z1 − z2 | > 0, (1.14) is equal to (1.15) in the region given by |z1 | > |z1 −z2 | > 0 and |z2 | > |z1 −z2 | > 0, and (1.15) is equal to (1.13) in the region |z2 | > |z1 | > |z1 − z2 | > 0, we see that (1.12) and (1.13) are analytic extensions of each other. So commutativity is proved. Now we prove the existence of the function stated in the proposition. By skew-symmetry, we have Y f (v, z)1 = ezD Y f (1, −z)v = ezD v for any v ∈ C0 (V ). Thus by definition, for v1 ∈ C0 (V ), v2 , v ∈ V and v ∈ (C0 (V )) , v , Y f (v1 , z1 )Y f (v2 , z2 )ez3 D v = v , Y f (v1 , z1 )Y f (v2 , z2 )Y f (v, z3 )1 converges absolutely for z1 , z2 , z3 ∈ R× satisfying |z1 | > |z2 | > |z3 | > 0. Consequently it also converges absolutely for z1 , z2 , z3 ∈ C satisfying |z1 | > |z2 | > |z3 | > 0. Now the same proof as the one for Lemma 4.1 in [H7] shows that there exists a (possibly multivalued) analytic function on M 2 such that it is equal to (possibly multivalued) analytic extensions of (1.12), (1.13), (1.14) and (1.15) in the regions |z1 | > |z2 | > 0, |z2 | > |z1 | > 0, |z2 | > |z1 − z2 | > 0 and |z1 | > |z1 − z2 | > 0, respectively. Since (1.12), (1.13) and (1.14) give analytic functions which are all single valued in z1 , this function as the analytic extension of these functions must also be single valued in z1 .
If v2 is in C0 (V ), then by definition, Y f (v2 , x)v ∈ V [[x, x −1 ]] and thus (1.12), (1.13) and (1.15) give analytic functions which are also single valued in z2 . So their analytic extension is also single valued in both z1 and z2 . If V satisfies the grading-restriction condition, then the singularities z1 , z2 = 0, ∞ and z1 = z2 of this analytic extension are all poles and this analytic extension is therefore a rational function in z1 and z2 with the only possible poles z1 , z2 = 0 and z1 = z2 .
Theorem 2.3. Let $V$ be a grading-restricted open-string vertex algebra. Then the image of $C_0(V) \otimes C_0(V)$ under $Y^f|_{C_0(V)}$ is in $C_0(V)[[x, x^{-1}]]$ and the image of $C_0(V)$ under $D$ is in $C_0(V)$. Moreover, $(C_0(V), Y^f|_{C_0(V)}, \mathbf{1}, D)$ is a grading-restricted vertex algebra, $V$ is a $C_0(V)$-module and $Y^f$ is an intertwining operator of type $\binom{V}{V\,V}$ for the vertex algebra $C_0(V)$.
Proof. Let $v_1, v_2$ be homogeneous elements of $C_0(V)$. We would like to show that $Y^f(v_1, x)v_2 \in C_0(V)[[x, x^{-1}]]$. First of all, since $v_1 \in C_0(V)$, $Y^f(v_1, x)v_2 \in V[[x, x^{-1}]]$. Since $v_1, v_2 \in C_0(V)$, $\mathrm{wt}\, v_1, \mathrm{wt}\, v_2 \in \mathbb{Z}$. Thus by Proposition 1.4, $Y^f(v_1, x)v_2 \in \bigoplus_{n \in \mathbb{Z}} V_{(n)}[[x, x^{-1}]]$. By Proposition 2.2, the analytic extension of (1.14) to $M^2$ is a single-valued analytic function. In particular, (1.14) gives a single-valued analytic function in $z_1$ and $z_2$. Thus $Y^f(Y^f(v_1, x)v_2, x_2)v \in (V[[x_2, x_2^{-1}]])[[x, x^{-1}]]$. For $v \in V$, $v' \in V'$ and $z_1, z_2 \in \mathbb{R}_+$ satisfying $z_1 > z_2 > z_1 - z_2 > 0$,
\[
\begin{aligned}
\langle v', Y^f(Y^f(v_1, z_1 - z_2)v_2, z_2)v \rangle
&= \langle v', Y^f(v_1, z_1)\, Y^f(v_2, z_2)v \rangle \\
&= \langle v', Y^f(v_1, z_1)\, e^{z_2 D}\, Y^f(v, -z_2)v_2 \rangle.
\end{aligned} \tag{2.1}
\]
The right-hand side of (2.1) is well defined when $z_1, z_2 \in \mathbb{C}$ and $|z_1| > |z_2| > 0$ and is equal to
\[
\begin{aligned}
\langle v', e^{z_2 D}\, Y^f(v_1, z_1 - z_2)\, Y^f(v, -z_2)v_2 \rangle
&= \langle v', e^{z_1 D}\, Y^f(Y^f(v, -z_2)v_2, -(z_1 - z_2))v_1 \rangle \\
&= \langle v', e^{z_1 D}\, Y^f(v, -z_1)\, Y^f(v_2, -(z_1 - z_2))v_1 \rangle \\
&= \langle v', e^{z_1 D}\, Y^f(v, -z_1)\, e^{-(z_1 - z_2)D}\, Y^f(v_1, z_1 - z_2)v_2 \rangle
\end{aligned} \tag{2.2}
\]
when $z_1, z_2 \in \mathbb{R}_+$ and $z_1 > z_1 - z_2 > z_2 > 0$. The right-hand side of (2.2) is well defined when $z_1, z_2 \in \mathbb{C}$ and $|z_1| > |z_1 - z_2| > 0$ and is equal to
\[ \langle v', e^{z_2 D}\, Y^f(v, -z_2)\, Y^f(v_1, z_1 - z_2)v_2 \rangle \tag{2.3} \]
when $z_1, z_2 \in \mathbb{C}$ and $|z_1| > |z_2| > |z_1 - z_2| > 0$. From (2.1)–(2.3), we see that the left-hand side of (2.1) and the right-hand side of (2.3) are analytic extensions of each other. Since both the left-hand side of (2.1) and the right-hand side of (2.3) are well defined single-valued analytic functions on the region $|z_2| > |z_1 - z_2| > 0$, they are equal when $|z_2| > |z_1 - z_2| > 0$. Thus we obtain
\[ Y^f(Y^f(v_1, x)v_2, x_2)v = e^{x_2 D}\, Y^f(v, -x_2)\, Y^f(v_1, x)v_2, \]
where $x$ and $x_2$ are two commuting formal variables. So $Y^f(v_1, x)v_2 \in C_0(V)[[x, x^{-1}]]$.
Let $u$ be a homogeneous element of $C_0(V)$. Then $\mathrm{wt}\, u \in \mathbb{Z}$. Since $D$ has weight 1, $Du \in \bigoplus_{n \in \mathbb{Z}} V_{(n)}$. By the $D$-derivative property, we see that
\[ Y^f(Du, x) = \frac{d}{dx}\, Y^f(u, x) \in (\operatorname{End} V)[[x, x^{-1}]]. \]
For any $v \in V$, using the $D$-derivative property and the $D$-bracket formula, we obtain
\[
\begin{aligned}
Y^f(Du, x)v &= \frac{d}{dx}\, Y^f(u, x)v \\
&= \frac{d}{dx} \left( e^{xD}\, Y^f(v, -x)u \right) \\
&= e^{xD} D\, Y^f(v, -x)u - e^{xD}\, Y^f(Dv, -x)u \\
&= e^{xD}\, Y^f(v, -x) Du.
\end{aligned}
\]
So $Du \in C_0(V)$. To show that $C_0(V)$ is a vertex algebra, we need only verify commutativity, associativity and rationality since all the other axioms are clearly satisfied. But associativity, commutativity and rationality have been proved in Proposition 2.2. The proof of the fact that $V$ is a $C_0(V)$-module and $Y^f$ is an intertwining operator of type $\binom{V}{V\,V}$ for $C_0(V)$ is completely the same.
We shall call the grading-restricted vertex algebra (C0 (V ), Y f |C0 (V ) , 1, D) the meromorphic center of V . Remark 2.4. In fact, using the relationship between skew-symmetry and locality (or commutativity), it is easy to see that the meromorphic center of an open-string vertex algebra is the maximal Z-graded vertex algebra contained in the open-string vertex algebra such that the vertex operators for elements in this vertex algebra and the vertex operators for elements in the open-string vertex algebra are mutually local to each other. Proposition 2.5. Let V be a conformal open-string vertex algebra. Then ω ∈ C0 (V ). Proof. By definition, ω ∈ n∈Z V(n) and Y f (ω, x) ∈ (End V )[[x, x −1 ]]. For any v ∈ V , the commutator formula for ω and formal vertex operators implies the commutativity for Y f (ω, z1 ) and Y f (v, z2 ). In particular, for any v ∈ V , v , Y f (ω, z1 )Y f (v, z2 )1
(2.4)
v , Y f (v, z2 )Y f (ω, z1 )1
(2.5)
and
are absolutely convergent in the regions |z1 | > |z2 | > 0 and |z2 | > |z1 | > 0, respectively, and are analytic extensions of each other. Also by associativity we know that v , Y f (Y f (ω, z1 − z2 )v, z2 )1
(2.6)
v , Y f (Y f (v, z2 − z1 )ω, z1 )1
(2.7)
and
are absolutely convergent in the region |z2 | > |z1 − z2 | > 0 and |z1 | > |z1 − z2 | > 0, respectively, and are equal to (2.4) and (2.5), respectively, in the region |z1 | > |z2 | >
|z1 − z2 | > 0 and |z2 | > |z1 | > |z1 − z2 | > 0, respectively. Thus (2.6) and (2.7) are also analytic extensions of each other. Note that by (1.10), v , Y f (e(z1 −z2 )L(−1) Y f (v, z2 − z1 )ω, z2 )1
= e(z1 −z2 )L (−1) v , Y f (Y f (v, z2 − z1 )ω, z2 )1
(2.8)
is absolutely convergent in the region |z2 | > |z1 − z2 | > 0 and is equal to (2.7) in the region |z1 |, |z2 | > |z1 − z2 | > 0. So (2.6) and the left-hand side of (2.8) are analytic extensions of each other. We know that both (2.6) and the left-hand side of (2.8) are convergent absolutely in the region |z2 | > |z1 − z2 | > 0 and, moreover, we know that (2.4), (2.5), (2.6) and (2.7) give single-valued analytic functions in z1 and z2 . Thus in the region |z2 | > |z1 −z2 | > 0, (2.6) and the left-hand side of (2.8) are equal, that is, v , Y f (Y f (ω, z1 − z2 )v, z2 )1 = v , Y f (e(z1 −z2 )L(−1) Y f (v, z2 − z1 )ω, z2 )1 . (2.9) By taking coefficients of z1 − z2 and z2 in both sides of (2.9) and then taking the generating functions of these coefficients, we obtain v , Y f (Y f (ω, x)v, y)1 = v , Y f (exL(−1) Y f (v, −x)ω, y)1 ,
(2.10)
where x and y are commuting formal variables. Since v ∈ V is arbitrary, (2.10) gives Y f (Y f (ω, x)v, y)1 = Y f (exL(−1) Y f (v, −x)ω, y)1.
(2.11)
Taking the formal limit y → 0 (that is, taking the constant term of the series in y) of both sides of (2.11), we obtain Y f (ω, x)v = exL(−1) Y f (v, −x)ω. So we conclude that ω ∈ C0 (V ).
One immediate consequence of this result is the following:

Corollary 2.6. Let $V$ be a grading-restricted conformal open-string vertex algebra. Then the vertex operator algebra $\langle \omega \rangle$ is a subalgebra of $C_0(V)$.

Recall the following main theorem in [H8]:

Theorem 2.7. Let $V$ be a vertex operator algebra satisfying the following conditions:
1. Every generalized $V$-module is a direct sum of irreducible $V$-modules.
2. There are only finitely many inequivalent irreducible $V$-modules and these irreducible $V$-modules are all $\mathbb{R}$-graded.
3. Every irreducible $V$-module satisfies the $C_1$-cofiniteness condition.
Then the direct sum of all (inequivalent) irreducible $V$-modules has a natural structure of an intertwining operator algebra. In particular, the following associativity for intertwining operators holds: For any $V$-modules $W_0$, $W_1$, $W_2$, $W_3$ and $W_4$, any intertwining operators $\mathcal{Y}_1$ and $\mathcal{Y}_2$ of types $\binom{W_0}{W_1\, W_4}$ and $\binom{W_4}{W_2\, W_3}$, respectively,
\[ \langle w'_{(0)}, \mathcal{Y}_1(w_{(1)}, z_1)\, \mathcal{Y}_2(w_{(2)}, z_2)\, w_{(3)} \rangle \tag{2.12} \]
is absolutely convergent when $|z_1| > |z_2| > 0$ for $w'_{(0)} \in W'_0$, $w_{(1)} \in W_1$, $w_{(2)} \in W_2$ and $w_{(3)} \in W_3$, and there exist a $V$-module $W_5$ and intertwining operators $\mathcal{Y}_3$ and $\mathcal{Y}_4$ of types $\binom{W_5}{W_1\, W_2}$ and $\binom{W_0}{W_5\, W_3}$, respectively, such that
\[ \langle w'_{(0)}, \mathcal{Y}_4(\mathcal{Y}_3(w_{(1)}, z_1 - z_2)\, w_{(2)}, z_2)\, w_{(3)} \rangle \]
is absolutely convergent when $|z_2| > |z_1 - z_2| > 0$ for $w'_{(0)} \in W'_0$, $w_{(1)} \in W_1$, $w_{(2)} \in W_2$ and $w_{(3)} \in W_3$ and is equal to (2.12) when $|z_1| > |z_2| > |z_1 - z_2| > 0$.
Theorems 2.3 and 2.7 suggest a method to construct a conformal open-string vertex algebra: We start with a vertex operator algebra $(V, Y, \mathbf{1}, \omega)$ satisfying the conditions in Theorem 2.7 and look for a module $W$, an intertwining operator $Y^f$ of type $\binom{W}{W\,W}$ and elements $\mathbf{1}_W$ and $\omega_W$ such that if we define
\[ Y^O : (W \otimes W) \times \mathbb{R}_+ \to W, \qquad (w_1 \otimes w_2, r) \mapsto Y^O(w_1, r)w_2 \]
by
\[ Y^O(w_1, r)w_2 = Y^f(w_1, r)w_2 \tag{2.13} \]
for $r \in \mathbb{R}_+$, then $(W, Y^O, \mathbf{1}_W, \omega_W)$ is a conformal open-string vertex algebra. When the vertex operator algebra $(V, Y, \mathbf{1}, \omega)$ is simple, $W$ must contain $V$.

We give more details here. Let $(V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra satisfying the conditions in Theorem 2.7. For simplicity, we assume that $V$ is simple. Let $\mathcal{A}$ be the set of equivalence classes of irreducible $V$-modules and, for $a \in \mathcal{A}$, let $W^a$ be a representative in $a$. Then by Theorem 2.7, $\bigoplus_{a \in \mathcal{A}} W^a$ has a natural structure of an intertwining operator algebra. Let $W = \bigoplus_{a \in \mathcal{A}} E^a \otimes W^a$, where $E^a$ for $a \in \mathcal{A}$ are vector spaces to be determined. We give $W$ the obvious $V = \mathbb{C} \otimes V$-module structure. We also let
\[ Y^f \in \operatorname{Hom}(W \otimes W, W\{x\}) = \bigoplus_{a_1, a_2, a_3 \in \mathcal{A}} \operatorname{Hom}(E^{a_1} \otimes E^{a_2}, E^{a_3}) \otimes \operatorname{Hom}(W^{a_1} \otimes W^{a_2}, W^{a_3}\{x\}) \]
be given by
\[ Y^f = \sum_{a_1, a_2, a_3 \in \mathcal{A}}\ \sum_{i=1}^{N_{a_1 a_2}^{a_3}} C_{a_1 a_2}^{a_3; i} \otimes \mathcal{Y}_{a_1 a_2}^{a_3; i}, \]
where for $a_1, a_2, a_3 \in \mathcal{A}$, $N_{a_1 a_2}^{a_3}$ is the fusion rule of type $\binom{W^{a_3}}{W^{a_1}\, W^{a_2}}$, $C_{a_1 a_2}^{a_3; i} \in \operatorname{Hom}(E^{a_1} \otimes E^{a_2}, E^{a_3})$ for $i = 1, \dots, N_{a_1 a_2}^{a_3}$ are to be determined, and $\mathcal{Y}_{a_1 a_2}^{a_3; i}$ for $i = 1, \dots, N_{a_1 a_2}^{a_3}$ is a basis of the space $\mathcal{V}_{a_1 a_2}^{a_3}$ of intertwining operators of type $\binom{W^{a_3}}{W^{a_1}\, W^{a_2}}$.

Let $e$ be the equivalence class of irreducible $V$-modules containing $V$. Note that $N_{ea}^{a}$ for $a \in \mathcal{A}$ are always one-dimensional. We choose the basis $\mathcal{Y}_{ea}^{a;1}$ for $a \in \mathcal{A}$ to be the vertex operator for the $V$-module $W^a$. In particular, $\mathcal{Y}_{ea}^{a;1}(\mathbf{1}, x)w^a = w^a$ for $a \in \mathcal{A}$ and $w^a \in W^a$. We also choose the basis $\mathcal{Y}_{ae}^{a;1}$ for $a \in \mathcal{A}$ to be the ones given by
\[ \mathcal{Y}_{ae}^{a;1}(w^a, x)u = e^{xL(-1)}\, \mathcal{Y}_{ea}^{a;1}(u, -x)w^a \]
a;1 for u ∈ V and w a ∈ W a . Thus we have limx→0 Yae (w a , x)1 = w a for a ∈ A and a a w ∈W . We would like to choose E a for a ∈ A and Caa13a;i2 for a1 , a2 , a3 ∈ A and i = 1, . . . , Naa13a2 such that the map Y O given by (2.13) in terms of Y f satisfies the associativity
Y O (w1 , r1 )Y O (w2 , r2 )w3 = Y O (Y O (w1 , r1 − r2 )w2 , r2 )w3
(2.14)
for r1 , r2 ∈ R+ satisfying r1 > r2 > r1 − r2 > 0 and w1 ∈ W1 , w2 ∈ W2 , w3 ∈ W3 . Note that both sides of (2.14) are well-defined since a∈A W a is an intertwining operator algebra. The left-hand side of (2.14) gives a;j a;j (Caa14a;i ◦ (idE a1 ⊗ Ca2 a3 )) ⊗ Yaa14a;i (w1 , r1 )Ya2 a3 (w2 , r2 )w3 a1 , a2 , a3 a4 , a; i, j
=
(Caa14a;i ◦ (idE a1 ⊗ Ca2 a3 )) a;j
a1 , a2 , a3 a4 , a; i, j
⊗
a5 ;k,l
ij ;kl
Fa;a5 (a1 , a2 , a3 ; a4 )Yaa54a;l3 (Yaa15a;k2 (w1 , r1 − r2 )w2 , r2 )w3 , ij ;kl
where for any a ∈ A, idE a is the identity on E a and Fa;a5 (a1 , a2 , a3 ; a4 ), for a, a1 , . . . , a5 ∈ A, i = 1, . . . , Naa14a , j = 1, . . . , Naa2 a3 , k = 1, . . . , Naa15a2 and l = 1, . . . , Naa54a3 , are the matrix elements of the corresponding fusing isomorphisms. (In the formulas above and below, for simplicity, we omit the ranges over which the sums are taken, since these are clear and some of them have been given above.) The right-hand side of (2.14) gives (Caa54a;l3 ◦ (Caa15a;k2 ⊗ idE a3 )) ⊗ Yaa54a;l3 (Yaa15a;k2 (w1 , r1 − r2 )w2 , r2 )w3 . a1 , a2 , a3 a4 , a5 ; k, l
It is clear that in this case Yaa54a;l3 (Yaa;k 1 a2 (·, r1 − r2 )·, r2 ) · for a1 , a2 , a3 , a4 , a5 ∈ A are linearly independent. Thus (2.14) gives ij ;kl a;j Fa;a5 (a1 , a2 , a3 ; a4 )(Caa14a;i ◦(idE a1 ⊗Ca2 a3 )) = Caa54a;l3 ◦(Caa15a;k2 ⊗idE a3 ) (2.15) a;i,j
for a1 , a2 , a3 , a4 , a5 ∈ A, k = 1, . . . , Naa15a2 and l = 1, . . . , Naa54a3 . We need a vacuum for W . Let 1e ∈ E e . If we want the vacuum to be of the form 1W = 1e ⊗ 1, then we must have the following identity property and creation property: Y O (1W , r)(α a ⊗ w a ) = α a ⊗ w a , lim Y O ((α a ⊗ w a ), r)1W = α a ⊗ w a
r→0
(2.16) (2.17)
for a ∈ A, ∈ and w a ∈ W a . Equations (2.16) and (2.17) together with the properties of intertwining operators for V gives αa
Ea
a;1 e Cea (1 ⊗ α a ) = α a , a;1 a (α Cae
for a ∈ A and
αa
∈
Ea .
⊗1 ) = α e
a
(2.18) (2.19)
Let 1W = 1e ⊗ 1 and ωW = 1e ⊗ ω. Then we have just proved the following: Proposition 2.8. Let V be a simple vertex operator algebra satisfying the conditions in Theorem 2.7 and let A, e and W a for a ∈ A be as above. If we choose the vector spaces E a for a ∈ A, Caa13a;i2 ∈ Hom(E a1 ⊗ E a2 , E a3 ) for a1 , a2 , a3 ∈ A, i = 1, . . . , Naa13a2 , and 1e ∈ E e such that (2.15), (2.18) and (2.19) hold, then the quadruple (W, Y O , 1W , ωW ) is a grading-restricted conformal open-string vertex algebra. 3. Examples In this section, we give some examples of open-string vertex algebras. Examples can also be constructed using the main results in Sects. 4 and 5. First of all, we have the following examples for which the axioms are trivial to verify: 1. Associative algebras. 2. Vertex (super)algebras. 3. Tensor products of algebras above, for example, A ⊗ V , where A is an associative algebra and V a vertex (super)algebra. The examples above are trivial to construct because they satisfy some much stronger axioms than those in the definition of open-string vertex algebra. Nontrivial examples of open-string vertex algebras can be constructed from the direct sum of a vertex algebra and an R-graded module for the vertex algebra in the same ways as in the construction of the example of vertex operator algebras in Example 3.4 in [H3] and as in the conceptual construction of the vertex operator algebra structure on the moonshine module in [H5], except that here the module does not have to be Z-graded. Note that in the construction of the vertex operator algebra structure on the moonshine module in [H5], the hard part is to prove the duality properties, which follow from the duality properties of a larger intertwining operator algebra. If we start with a vertex operator algebra satisfying the conditions in Theorem 2.7, then the construction becomes very easy because the duality properties have been established by Theorem 2.7. 
We now give an example constructed using a different method. It is an example constructed from modules for the minimal Virasoro vertex operator algebra of central charge $c = \frac{1}{2}$. This example is nontrivial because it is not an associative algebra, a vertex (super)algebra or a tensor product of these algebras. Here we describe the data. For the details, we refer the reader to the second author's thesis [K]. For the minimal Virasoro vertex operator algebras, their representations, intertwining operators and chiral correlation functions, see, for example, [DF, BPZ, W, H4, FRW and DMS].

Let $L(\frac{1}{2}, 0)$ be the minimal Virasoro vertex operator algebra of central charge $\frac{1}{2}$. It has three inequivalent irreducible modules $W_0 = L(\frac{1}{2}, 0)$, $W_1 = L(\frac{1}{2}, \frac{1}{2})$ and $W_2 = L(\frac{1}{2}, \frac{1}{16})$. It is well known that the fusion rules $N_{ij}^{k} = N_{W_i W_j}^{W_k}$ for $i, j, k = 0, 1, 2$ are equal to 1 for $(i, j, k) = (0, 0, 0)$, $(0, 1, 1)$, $(1, 0, 1)$, $(1, 1, 0)$, $(0, 2, 2)$, $(2, 0, 2)$, $(2, 2, 0)$, $(1, 2, 2)$, $(2, 1, 2)$, $(2, 2, 1)$ and are equal to 0 otherwise. It was proved in [H4] that the direct sum of $W_0$, $W_1$ and $W_2$ has a structure of an intertwining operator algebra. When $N_{ij}^{k} = 1$, we choose a basis $\mathcal{Y}_{ij}^{k}$ of $\mathcal{V}_{ij}^{k}$. Given $i, j, k, l \in \{0, 1, 2\}$, $m \in \{0, 1, 2\}$ is said to be coupled with
l , V m , V n and V l are all nonzero. We use the n ∈ {0, 1, 2} through (i, j, k; l) if Vim jk ij nk notation m 1li,j,k n to denote the fact that m is coupled with n through (i, j, k; l). For i, j, k, l ∈ {0, 1, 2}, the matrix elements Fm;n (i, j, k; l) for m, n = 0, 1, 2 of the fusing isomorphisms
F(i, j, k; l) :
2 m=0
l Vim ⊗ Vjmk →
2
l Vijn ⊗ Vnk
n=0
are determined by the following associativity relations (see [H7]): l wl , Yim (wi , z1 )Yjmk (wj , z2 )wk l = Fm;n (i, j, k; l)wl , Ynk (Yijn (wi , z1 − z2 )wj , z2 )wk m1li,j,k n
for i, j, m = 0, 1, 2, z1 , z2 ∈ R satisfying z1 > z2 > z1 − z2 > 0 and wi ∈ Wi , wj ∈ Wj , wk ∈ Wk , where the sum is over all k, l, n = 0, 1, 2 such that m 1li,j,k n. For ˜ j, k; l) for i, j, k, l = 0, 1, 2 to denote matrices whose entries simplicity, we use F(i, ˜ Fmn (i, j, k; l) for m, n = 0, 1, 2 are the symbol DC (meaning decoupled) if m is not coupled with n through (i, j, k; l) and is Fm;n (i, j, k; l) if m is coupled with n through (i, j, k; l). We call these matrices the fusing-coupling matrices. For m, n = 0, 1, 2, we use ±Emn to denote the 3 × 3 matrices with the entry in the mth row and the nth column being ±1 and the other entries being DC. Proposition 3.1. For i, j, k = 0, 1, 2 such that Nijk = 1, there exists a basis Yijk of Vijk such that ˜ ˜ F(0, 0, 0, 0) = F(1, 1, 1, 1) = E00 , ˜ ˜ F(1, 1, 0, 0) = F(0, 0, 1, 1) = E01 , ˜ ˜ F(1, 0, 0, 1) = F(0, 1, 1, 0) = E10 , ˜ ˜ F(1, 0, 1, 0) = F(0, 1, 0, 1) = E11 , ˜ ˜ ˜ ˜ F(2, 2, 0, 0) = F(0, 0, 2, 2) = F(1, 1, 2, 2) = F(2, 2, 1, 1) = E02 , ˜ ˜ ˜ ˜ F(0, 2, 2, 0) = F(2, 0, 0, 2) = F(1, 2, 2, 1) = F(2, 1, 1, 2) = E20 , ˜ ˜ ˜ ˜ F(0, 1, 2, 2) = F(1, 0, 2, 2) = F(2, 2, 0, 1) = F(2, 2, 1, 0) = E12 , ˜ ˜ ˜ ˜ F(0, 2, 2, 1) = F(1, 2, 2, 0) = F(2, 0, 1, 2) = F(2, 1, 0, 2) = E21 , ˜ ˜ F(1, 2, 1, 2) = F(2, 1, 2, 1) = −E22 , ˜ ˜ ˜ ˜ F(0, 2, 0, 2) = F(2, 0, 2, 0) = F(0, 2, 1, 2) = F(1, 2, 0, 2) ˜ ˜ = F(2, 0, 2, 1) = F(2, 1, 2, 0) = E22 , 1 √ √1 DC 2 2 ˜ F(2, 2, 2, 2) = √1 − √1 DC ; 2 2 DC DC DC all other fusing-coupling matrices have entries which are either 0 or DC.
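The coupling relation is determined entirely by the fusion rules, so it can be tabulated mechanically. The following is a minimal sketch, not part of the original construction: the names `FUSION`, `nonzero`, `coupled_pairs`, `F_BLOCK`, `matmul2` and `is_identity` are illustrative. It lists the pairs $(m, n)$ with $m$ coupled with $n$ through $(i, j, k; l)$ and checks numerically that the $2 \times 2$ block of numerical entries in $\tilde{\mathcal{F}}(2,2,2,2)$ squares to the identity. The sketch only reports which pairs are coupled; it makes no claim about the row and column convention used for $E_{mn}$.

```python
import math
from itertools import product

# Triples (i, j, k) with fusion rule N_{ij}^k = 1, copied from the list in the text.
FUSION = {
    (0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 2, 2),
    (2, 0, 2), (2, 2, 0), (1, 2, 2), (2, 1, 2), (2, 2, 1),
}


def nonzero(i, j, k):
    """True when the space V_{ij}^k of intertwining operators is nonzero."""
    return (i, j, k) in FUSION


def coupled_pairs(i, j, k, l):
    """All (m, n) such that V_{im}^l, V_{jk}^m, V_{ij}^n and V_{nk}^l are nonzero."""
    return {
        (m, n)
        for m, n in product(range(3), repeat=2)
        if nonzero(i, m, l) and nonzero(j, k, m)
        and nonzero(i, j, n) and nonzero(n, k, l)
    }


# Numerical 2x2 block of the fusing-coupling matrix for (2, 2, 2, 2).
S = 1 / math.sqrt(2)
F_BLOCK = [[S, S], [S, -S]]


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][t] * b[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]


def is_identity(mat, tol=1e-12):
    """True when a 2x2 matrix equals the identity up to rounding error."""
    return all(abs(mat[r][c] - (1.0 if r == c else 0.0)) < tol
               for r in range(2) for c in range(2))


F_SQUARED = matmul2(F_BLOCK, F_BLOCK)

if __name__ == "__main__":
    for args in [(0, 0, 0, 0), (1, 2, 1, 2), (2, 2, 2, 2)]:
        print(args, sorted(coupled_pairs(*args)))
    print("block squares to identity:", is_identity(F_SQUARED))
```

For $(2, 2, 2; 2)$ the coupled pairs are exactly those with $m, n \in \{0, 1\}$, which matches the position of the DC entries in the last matrix of Proposition 3.1.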
The proposition above gives the complete information about the fusing isomorphisms for the minimal model of central charge $\frac{1}{2}$. Now consider the irreducible modules $W_i \otimes W_i$ for $i = 0, 1, 2$ for the tensor product vertex operator algebra $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$. Let $W = \bigoplus_{i=0}^{2} W_i \otimes W_i$ and let $Y^f : (W \otimes W) \to W\{x\}$ be given by

$$Y^f = \sum_{i,j,k=0}^{2} \mathcal{Y}_{ij}^{k} \otimes \mathcal{Y}_{ij}^{k},$$

where we have taken $\mathcal{Y}_{ij}^{k} = 0$ for $i, j, k \in \{0, 1, 2\}$ such that $\mathcal{V}_{ij}^{k} = 0$ and where $\mathcal{Y}_{ij}^{k} \otimes \mathcal{Y}_{ij}^{k}$ for $i, j, k \in \{0, 1, 2\}$ act on $W \otimes W$ in the obvious way. Let

$$Y^O : (W \otimes W) \times \mathbb{R}_+ \to \overline{W}, \qquad (w_1 \otimes w_2, r) \mapsto Y^O(w_1, r)w_2$$

be given by $Y^O(w_1, r)w_2 = Y^f(w_1, r)w_2$ for $r \in \mathbb{R}_+$ and $w_1, w_2 \in W$. Let $\mathbf{1}$ and $\omega$ be the vacuum and conformal element of $L(\frac{1}{2}, 0)$. Then we have:

Proposition 3.2. The quadruple $(W, Y^O, \mathbf{1} \otimes \mathbf{1}, \omega \otimes \mathbf{1} + \mathbf{1} \otimes \omega)$ is a grading-restricted conformal open-string vertex algebra with $C_0(W) = W_0 \otimes W_0$.

The proof is a straightforward verification. See [K] for details.

Remark 3.3. In the construction above, $Y^f$ and $Y^O$ involve fractional powers. So $W$ is not a vertex operator algebra.

4. Braided Tensor Categories and Open-String Vertex Algebras

In this section, we show that an associative algebra in the braided tensor category of modules for a suitable vertex operator algebra $V$ is equivalent to an open-string vertex algebra with $V$ in its meromorphic center. The main result of this section (Theorem 4.3) is a straightforward generalization of the main result in [HKL]. In this section, we assume that the reader is familiar with the tensor product theory developed by Lepowsky and the first author. See [HL3, HL4, HL5, HL6 and H3] for details. First of all, we have the following result established in [H9]:

Theorem 4.1. Let $V$ be a vertex operator algebra satisfying the conditions in Theorem 2.7. Then the category of $V$-modules has a natural structure of vertex tensor category with $V$ as its unit object. In particular, this category has a natural structure of braided tensor category.

Given a braided tensor category $\mathcal{C}$, we use $1_{\mathcal{C}}$ to denote its unit object. We need the following concept:
Definition 4.2. Let $\mathcal{C}$ be a braided tensor category. An associative algebra in $\mathcal{C}$ (or associative $\mathcal{C}$-algebra) is an object $A \in \mathcal{C}$ along with a morphism $\mu : A \otimes A \to A$ and an injective morphism $\iota_A : 1_{\mathcal{C}} \to A$ such that the following conditions hold:

1. Associativity: $\mu \circ (\mathrm{id}_A \otimes \mu) = \mu \circ (\mu \otimes \mathrm{id}_A) \circ \mathcal{A}$, where $\mathcal{A}$ is the associativity isomorphism from $A \otimes (A \otimes A)$ to $(A \otimes A) \otimes A$.
2. Unit properties: $\mu \circ (\iota_A \otimes \mathrm{id}_A) \circ l_A^{-1} = \mu \circ (\mathrm{id}_A \otimes \iota_A) \circ r_A^{-1} = \mathrm{id}_A$, where $l_A : 1_{\mathcal{C}} \otimes A \to A$ and $r_A : A \otimes 1_{\mathcal{C}} \to A$ are the left and right unit isomorphisms, respectively.

We say that the unit of an associative algebra $A$ in $\mathcal{C}$ is unique if $\dim \mathrm{Hom}_{\mathcal{C}}(1_{\mathcal{C}}, A) = 1$. We use $(A, \mu, \iota_A)$ or simply $A$ to denote the associative algebra in $\mathcal{C}$ just defined. An associative algebra whose unit is unique was called a haploid algebra by Fuchs, Runkel and Schweigert (see [FRS1 and FRS2]).

Let $V$ be a vertex operator algebra satisfying the conditions in Theorem 2.7. Then we know that the direct sum of all irreducible $V$-modules is an intertwining operator algebra. We say that this intertwining operator algebra satisfies the positive weight condition if for any irreducible $V$-module $W$, the weights of nonzero elements of $W$ are nonnegative, $W_{(0)} \neq 0$ if and only if $W$ is isomorphic to $V$, and $V_{(0)} = \mathbb{C}\mathbf{1}$. We say that an open-string vertex algebra $V$ satisfies the positive weight condition if the weights of elements of $V$ are nonnegative and $V_{(0)} = \mathbb{C}\mathbf{1}$.

Theorem 4.3. Let $(V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra satisfying the conditions in Theorem 2.7 and let $\mathcal{C}$ be the braided tensor category of $V$-modules. Then the categories of the following objects are isomorphic:

1. A grading-restricted conformal open-string vertex algebra $V_e$ and an injective homomorphism of vertex operator algebras from $V$ to the meromorphic center $C_0(V_e)$ of $V_e$.
2. An associative algebra $V_e$ in $\mathcal{C}$.

If the intertwining operator algebra on the direct sum of all irreducible $V$-modules satisfies the positive weight condition, then an algebra $V_e$ in Category 1 above satisfies the positive weight condition if and only if the unit of the corresponding associative algebra $V_e$ in $\mathcal{C}$ is unique.

Proof. Let $V_e$ be a grading-restricted conformal open-string vertex algebra, $\mathbf{1}_e$ the vacuum of $V_e$ and $\iota_{V_e}$ an injective homomorphism of vertex operator algebras from $V$ to $C_0(V_e)$. Then we have $\iota_{V_e}(\mathbf{1}) = \mathbf{1}_e$. By Theorem 2.3, $V_e$ is an $\iota_{V_e}(V)$-module and thus a $V$-module. So $V_e$ is an object in $\mathcal{C}$. Since $V_e$ is an open-string vertex algebra, we have a vertex operator map $Y_e^O$ for $V_e$. By Theorem 2.3 again, the corresponding formal vertex operator map $Y_e^f$ is in fact an intertwining operator for $V$ of type $\binom{V_e}{V_e\, V_e}$. Let $\mu : V_e \boxtimes V_e \to V_e$ be the module map corresponding to the intertwining operator $Y_e^f$. We claim that $(V_e, \mu, \iota_{V_e})$ is an associative algebra in $\mathcal{C}$. The proof is similar to the proof of the result in [HKL] that suitable commutative associative algebras in $\mathcal{C}$ are equivalent to vertex operator algebras extending $V$. For the reader's convenience, we give a proof here.

For $r \in \mathbb{R}_+$, let $\mu_r$ be the morphism from $V_e \boxtimes_{P(r)} V_e$ to $V_e$ corresponding to the intertwining operator $Y_e^f$ and let $\overline{\mu}_r : \overline{V_e \boxtimes_{P(r)} V_e} \to \overline{V}_e$ be the natural extension of $\mu_r$. Then by definition, $\mu = \mu_1$ and

$$\overline{\mu}_r(u \boxtimes_{P(r)} v) = Y_e^f(u, r)v = Y_e^O(u, r)v$$
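for $u, v \in V_e$. For simplicity, we shall use $\mathrm{id}$ to denote $\mathrm{id}_{V_e}$ in this proof. Thus for $u, v, w \in V_e$ and $r_1, r_2 \in \mathbb{R}_+$ satisfying $r_1 > r_2 > r_1 - r_2 > 0$,

$$\overline{(\mu_{r_1} \circ (\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2}))}(u \boxtimes_{P(r_1)} (v \boxtimes_{P(r_2)} w)) = Y_e^O(u, r_1)Y_e^O(v, r_2)w, \tag{4.1}$$

$$\overline{(\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id}))}((u \boxtimes_{P(r_1 - r_2)} v) \boxtimes_{P(r_2)} w) = Y_e^O(Y_e^O(u, r_1 - r_2)v, r_2)w, \tag{4.2}$$

where (and below) we use the notation that a linear map preserving gradings with a horizontal line over it always means the natural extension of the map to a map between the algebraic completions of the original graded spaces. The associativity for $Y_e^O$ gives

$$Y_e^O(u, r_1)Y_e^O(v, r_2)w = Y_e^O(Y_e^O(u, r_1 - r_2)v, r_2)w. \tag{4.3}$$

The associativity isomorphism

$$\mathcal{A}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)} : V_e \boxtimes_{P(r_1)} (V_e \boxtimes_{P(r_2)} V_e) \to (V_e \boxtimes_{P(r_1 - r_2)} V_e) \boxtimes_{P(r_2)} V_e$$

is characterized by

$$\overline{\mathcal{A}}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}(u \boxtimes_{P(r_1)} (v \boxtimes_{P(r_2)} w)) = (u \boxtimes_{P(r_1 - r_2)} v) \boxtimes_{P(r_2)} w \tag{4.4}$$

for $u, v, w \in V_e$, where $\overline{\mathcal{A}}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}$ is the natural extension of $\mathcal{A}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}$. Combining (4.1)–(4.4), we obtain

$$\mu_{r_1} \circ (\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2}) = (\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})) \circ \mathcal{A}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}. \tag{4.5}$$

From (4.5), we obtain

$$(\mu_{r_1} \circ (\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2})) \circ (\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1} = (\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})) \circ \mathcal{A}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)} \circ (\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1}, \tag{4.6}$$

where $r_1, r_2$ are real numbers satisfying $r_1 > r_2 > r_1 - r_2 > 0$, $\gamma_1$ and $\gamma_2$ are paths in $\mathbb{R}_+$ from 1 to $r_1$ and $r_2$, respectively, and $T_{\gamma_1}$ and $T_{\gamma_2}$ are the parallel transport isomorphisms associated to $\gamma_1$ and $\gamma_2$, respectively. (For the reader's convenience, we recall the definition of parallel transport isomorphism here. Let $\gamma$ be a path from $z_1 \in \mathbb{C}^\times$ to $z_2 \in \mathbb{C}^\times$. The parallel transport isomorphism $T_\gamma : W_1 \boxtimes_{P(z_1)} W_2 \to W_1 \boxtimes_{P(z_2)} W_2$ is given as follows: Let $\mathcal{Y}$ be the intertwining operator corresponding to the intertwining map $\boxtimes_{P(z_2)}$ and $l(z_1)$ the value of the logarithm of $z_1$ determined uniquely by $\log z_2$ (satisfying $0 \le \mathrm{Im}(\log z_2) < 2\pi$) and the path $\gamma$. Then $T_\gamma$ is characterized by

$$\overline{T}_\gamma(w_1 \boxtimes_{P(z_1)} w_2) = \mathcal{Y}(w_1, x)w_2\big|_{x^n = e^{n l(z_1)},\ n \in \mathbb{C}}$$

for $w_1 \in W_1$ and $w_2 \in W_2$, where $\overline{T}_\gamma$ is the natural extension of $T_\gamma$ to the algebraic completion $\overline{W_1 \boxtimes_{P(z_1)} W_2}$ of $W_1 \boxtimes_{P(z_1)} W_2$. The parallel transport isomorphism depends only on the homotopy class of $\gamma$.) By definition, we have

$$(\mu_{r_1} \circ (\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2})) \circ (\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1} = \mu \circ (\mathrm{id} \boxtimes \mu). \tag{4.7}$$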
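Similarly, we have

$$(\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})) \circ (T_{\gamma_3} \circ (T_{\gamma_4} \boxtimes_{P(r_2)} \mathrm{id}))^{-1} = \mu \circ (\mu \boxtimes \mathrm{id}), \tag{4.8}$$

where $\gamma_3$ and $\gamma_4$ are paths in $\mathbb{R}_+$ from $r_2$ and $r_1 - r_2$ to 1, respectively, and $T_{\gamma_3}$ and $T_{\gamma_4}$ are the parallel transport isomorphisms associated to $\gamma_3$ and $\gamma_4$, respectively. Combining (4.6)–(4.8) with the definition

$$\mathcal{A} = T_{\gamma_3} \circ (T_{\gamma_4} \boxtimes_{P(z_2)} \mathrm{id}) \circ \mathcal{A}_{P(z_1),P(z_2)}^{P(z_1 - z_2),P(z_2)} \circ (\mathrm{id} \boxtimes_{P(z_1)} T_{\gamma_2}) \circ T_{\gamma_1} \tag{4.9}$$

of the associativity isomorphism for the tensor product structure (here with $z_1 = r_1$ and $z_2 = r_2$), we obtain the associativity

$$\mu \circ (\mathrm{id} \boxtimes \mu) = (\mu \circ (\mu \boxtimes \mathrm{id})) \circ \mathcal{A}.$$

For the unit property, we note that the inverse $l_{V_e}^{-1} : V_e \to V \boxtimes V_e$ of the left unit isomorphism is defined by $l_{V_e}^{-1}(u) = \mathbf{1} \boxtimes u$ for $u \in V_e$ and thus

$$(\mu \circ (\iota_{V_e} \boxtimes \mathrm{id}_{V_e}) \circ l_{V_e}^{-1})(u) = \overline{\mu}(\overline{(\iota_{V_e} \boxtimes \mathrm{id}_{V_e})}(\mathbf{1} \boxtimes u)) = \overline{\mu}(\mathbf{1}_e \boxtimes u) = Y_e^f(\mathbf{1}_e, 1)u = \mathrm{id}_{V_e}(u)$$

for $u \in V_e$. The other unit property is proved similarly.

Conversely, let $(V_e, \mu, \iota_{V_e})$ be an associative $\mathcal{C}$-algebra. In particular, $V_e$ is a $V$-module. The map $\mu : V_e \boxtimes V_e \to V_e$ corresponds to an intertwining operator $Y_e^f$ of type $\binom{V_e}{V_e\, V_e}$ such that

$$\overline{\mu}(u \boxtimes v) = Y_e^f(u, 1)v \tag{4.10}$$

for $u, v \in V_e$. Let $\mathbf{1}_e = \iota_{V_e}(\mathbf{1})$ and $\omega_e = \iota_{V_e}(\omega)$. We define

$$Y_e^O : (V_e \otimes V_e) \times \mathbb{R}_+ \to \overline{V}_e, \qquad (u \otimes v, r) \mapsto Y_e^O(u, r)v$$

by $Y_e^O(u, r)v = Y_e^f(u, r)v$ for $r \in \mathbb{R}_+$, $u, v \in V_e$. Then we claim that $(V_e, Y_e^O, \mathbf{1}_e, \omega_e)$ is a grading-restricted conformal open-string vertex algebra satisfying the positive weight condition above and with $V$ in its meromorphic center. Again, the proof is similar to the proof of the result in [HKL] mentioned above. For the reader's convenience, we give a proof here.

The identity property for the vacuum follows immediately from the left unit property $\mu \circ (\iota_{V_e} \boxtimes \mathrm{id}_{V_e}) \circ l_{V_e}^{-1} = \mathrm{id}_{V_e}$. The creation property follows from the right unit property $\mu \circ (\mathrm{id}_{V_e} \boxtimes \iota_{V_e}) \circ r_{V_e}^{-1} = \mathrm{id}_{V_e}$. The Virasoro relations and the $L(0)$-grading property follow from the fact that $V_e$ is a $V$-module. The $L(-1)$-derivative property and the commutator formula for the Virasoro operators and $Y_e^f$ follow from the fact that $Y_e^f$ is an intertwining operator.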
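We now prove associativity. As above, for any $r \in \mathbb{R}_+$, let $\mu_r : V_e \boxtimes_{P(r)} V_e \to V_e$ be the module map corresponding to the intertwining operator $Y_e^f$. By definition, we have

$$\overline{\mu}_r(u \boxtimes_{P(r)} v) = Y_e^f(u, r)v = \overline{(\mu \circ T_\gamma)}(u \boxtimes_{P(r)} v) \tag{4.11}$$

for $u, v \in V_e$ and $r \in \mathbb{R}_+$, where $\gamma$ is a path from $r$ to 1 in $\mathbb{R}_+$. By definition, for $r_1, r_2 \in \mathbb{R}_+$ satisfying $r_1 > r_2 > r_1 - r_2 > 0$, paths $\gamma_1$ and $\gamma_2$ in $\mathbb{R}_+$ from 1 to $r_1$, $r_2$, respectively, and paths $\gamma_3$ and $\gamma_4$ in $\mathbb{R}_+$ from $r_2$ and $r_1 - r_2$ to 1, respectively, (4.7)–(4.8) hold. Compose both sides of the associativity $\mu \circ (\mathrm{id} \boxtimes \mu) = (\mu \circ (\mu \boxtimes \mathrm{id})) \circ \mathcal{A}$ for the $\mathcal{C}$-algebra $V_e$ with $((\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1})^{-1}$, where $r_1, r_2 \in \mathbb{R}_+$ satisfy $r_1 > r_2 > r_1 - r_2 > 0$ and $\gamma_1$ and $\gamma_2$, as above, are paths from 1 to $r_1$ and $r_2$, respectively, in $\mathbb{R}_+$. Then we obtain

$$\mu \circ (\mathrm{id} \boxtimes \mu) \circ ((\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1})^{-1} = (\mu \circ (\mu \boxtimes \mathrm{id})) \circ \mathcal{A} \circ ((\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1})^{-1}. \tag{4.12}$$

Using (4.7)–(4.9) and (4.12), we obtain

$$\begin{aligned}
\mu_{r_1} \circ (\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2}) &= \mu \circ (\mathrm{id} \boxtimes \mu) \circ ((\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1})^{-1}\\
&= (\mu \circ (\mu \boxtimes \mathrm{id})) \circ \mathcal{A} \circ ((\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1})^{-1}\\
&= (\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})) \circ (T_{\gamma_3} \circ (T_{\gamma_4} \boxtimes_{P(r_2)} \mathrm{id}))^{-1} \circ \mathcal{A} \circ ((\mathrm{id} \boxtimes_{P(r_1)} T_{\gamma_2}) \circ T_{\gamma_1})^{-1}\\
&= (\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})) \circ \mathcal{A}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}.
\end{aligned} \tag{4.13}$$

For the next step, we use the convergence of products and iterates of intertwining operators for $V$. Because of the convergence, the composition $\overline{\mu}_{r_1} \circ \overline{(\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2})}$ is well defined and it is clear that $\overline{\mu_{r_1} \circ (\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2})}$ is equal to $\overline{\mu}_{r_1} \circ \overline{(\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2})}$. Similarly, $\overline{\mu}_{r_2} \circ \overline{(\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})}$ is well defined and $\overline{\mu_{r_2} \circ (\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})}$ is equal to $\overline{\mu}_{r_2} \circ \overline{(\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})}$. Taking the natural completions of both sides of (4.13), we obtain

$$\overline{\mu}_{r_1} \circ \overline{(\mathrm{id} \boxtimes_{P(r_1)} \mu_{r_2})} = \overline{\mu}_{r_2} \circ \overline{(\mu_{r_1 - r_2} \boxtimes_{P(r_2)} \mathrm{id})} \circ \overline{\mathcal{A}}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}. \tag{4.14}$$

Applying both sides of (4.14) to $u \boxtimes_{P(r_1)} (v \boxtimes_{P(r_2)} w)$ for $u, v, w \in V_e$, pairing the result with $v' \in V_e'$ and using (4.11) and

$$\overline{\mathcal{A}}_{P(r_1),P(r_2)}^{P(r_1 - r_2),P(r_2)}(u \boxtimes_{P(r_1)} (v \boxtimes_{P(r_2)} w)) = (u \boxtimes_{P(r_1 - r_2)} v) \boxtimes_{P(r_2)} w,$$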
we obtain the associativity

$$\langle v', Y_e^O(u, r_1)Y_e^O(v, r_2)w\rangle = \langle v', Y_e^O(Y_e^O(u, r_1 - r_2)v, r_2)w\rangle$$

for $u, v, w \in V_e$, $v' \in V_e'$ and $r_1, r_2 \in \mathbb{R}_+$ satisfying $r_1 > r_2 > r_1 - r_2 > 0$.

We now prove that $\iota_{V_e}(V)$ is in the meromorphic center of $V_e$. Clearly $\iota_{V_e}(V)$ is a vertex operator algebra isomorphic to $V$, $\iota_{V_e}$ is an isomorphism of vertex operator algebras from $V$ to $\iota_{V_e}(V)$ and thus $V_e$ is an $\iota_{V_e}(V)$-module. We know that the restriction $Y_e^f|_{V_e \otimes \iota_{V_e}(V)}$ of $Y_e^f$ to $V_e \otimes \iota_{V_e}(V)$ is in fact the intertwining operator of type $\binom{V_e}{V_e\ \iota_{V_e}(V)}$ for the vertex operator algebra $\iota_{V_e}(V)$ corresponding to the module map $\mu|_{V_e \boxtimes \iota_{V_e}(V)} : V_e \boxtimes \iota_{V_e}(V) \to V_e$ which is the restriction of $\mu$ to $V_e \boxtimes \iota_{V_e}(V)$. By the creation property for $Y_e^O$, we have

$$\lim_{r \to 0} Y_e^f(u, r)\mathbf{1}_e = \lim_{r \to 0} Y_e^O(u, r)\mathbf{1}_e = u$$

for $u \in V_e$. Since the space of intertwining operators of type $\binom{V_e}{V_e\ \iota_{V_e}(V)}$ is isomorphic to the space of intertwining operators of type $\binom{V_e}{\iota_{V_e}(V)\ V_e}$, which in turn is isomorphic to the space of module maps from $V_e$ to itself, any intertwining operator $\mathcal{Y}$ of this type satisfying the creation property

$$\lim_{r \to 0} \mathcal{Y}(u, r)\mathbf{1}_e = u$$

must be equal to $Y_e^f|_{V_e \otimes \iota_{V_e}(V)}$. In fact, the intertwining operator $\mathcal{Y}$ of such type defined by

$$\mathcal{Y}(u, x)v = e^{xL(-1)}Y_{V_e}(v, -x)u$$

for $u \in V_e$, $v \in \iota_{V_e}(V)$, where $Y_{V_e}$ is the vertex operator map for the $\iota_{V_e}(V)$-module $V_e$, is such an intertwining operator. Thus we have

$$Y_e^f|_{V_e \otimes \iota_{V_e}(V)}(u, x)v = e^{xL(-1)}Y_{V_e}(v, -x)u \tag{4.15}$$

for $u \in V_e$, $v \in \iota_{V_e}(V)$. But both $Y_{V_e}$ and $Y_e^f|_{\iota_{V_e}(V) \otimes V_e}$ are intertwining operators of type $\binom{V_e}{\iota_{V_e}(V)\ V_e}$ satisfying the identity property and the space of intertwining operators of such type, as we mentioned above, is isomorphic to the space of module maps from $V_e$ to itself. So $Y_{V_e}$ and $Y_e^f|_{\iota_{V_e}(V) \otimes V_e}$ must be equal. Thus (4.15) says that $\iota_{V_e}(V)$ is in the meromorphic center of $V_e$. So $\iota_{V_e}$ is an injective homomorphism from $V$ to the meromorphic center of $V_e$.

The constructions above give two functors and it is easy to see that they are inverse to each other. Thus the two categories are isomorphic.

Finally we prove the last statement. We assume that the intertwining operator algebra on the direct sum of all irreducible $V$-modules satisfies the positive weight condition. In particular, as an open-string vertex algebra, $V$ itself satisfies the positive weight condition. Let $V_e$ be a grading-restricted conformal open-string vertex algebra and $\iota_{V_e}$ an injective homomorphism of vertex operator algebras from $V$ to $C_0(V_e)$. Since the weights of the nonzero elements of all the irreducible $V$-modules are nonnegative, the weights of the nonzero elements of the $V$-module $V_e$ are also nonnegative. Assume that $V_e$ satisfies the positive weight condition. Let $f \in \mathrm{Hom}_{\mathcal{C}}(V, V_e)$. Since $f$ preserves the grading and
since $V$ and $V_e$ both satisfy the positive weight condition, it is clear that $f$ maps $\mathbf{1}$ to a scalar multiple of $\mathbf{1}_e$. Since $V$ as a module is generated by $\mathbf{1}$, $f$ is determined completely by the scalar above. On the other hand, given any scalar, we can also construct an element of $\mathrm{Hom}_{\mathcal{C}}(V, V_e)$ such that it maps $\mathbf{1}$ to the scalar times $\mathbf{1}_e$. Thus $\dim \mathrm{Hom}_{\mathcal{C}}(V, V_e) = 1$.

Conversely, assume that $\dim \mathrm{Hom}_{\mathcal{C}}(V, V_e) = 1$. We already know that the weights of nonzero elements of the $V$-module $V_e$ are nonnegative. Assume that there is an element of $(V_e)_{(0)}$ which is not proportional to $\mathbf{1}_e$. Then this element generates a $V$-submodule of the $V$-module $V_e$. Since all $V$-modules are completely reducible, we can find an irreducible $V$-submodule of this $V$-submodule such that it is generated by an element of $(V_e)_{(0)}$ which is not proportional to $\mathbf{1}_e$. Since any irreducible $V$-module having a nonzero element of weight 0 must be isomorphic to $V$, this $V$-submodule is isomorphic to $V$. But this $V$-submodule is not equal to $\iota_{V_e}(V) \subset C_0(V_e)$ since its generator of weight 0 is not proportional to $\mathbf{1}_e$. Thus we see that $\dim \mathrm{Hom}_{\mathcal{C}}(V, V_e) > 1$, a contradiction. So $V_e$ satisfies the positive weight condition.
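As a small illustration of the last statement, one can look at lowest weights in the example of Sect. 3. The sketch below is only a bookkeeping check under two stated assumptions that are not proved here: the irreducible modules for $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ are the tensor products $W_i \otimes W_j$, and the lowest weight of $W_i \otimes W_j$ is the sum of the lowest weights $0$, $\frac{1}{2}$, $\frac{1}{16}$ of the factors. The names `LOWEST`, `lowest_weight`, `weight_zero_modules` and `diagonal_lowest` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Lowest conformal weights of W_0 = L(1/2, 0), W_1 = L(1/2, 1/2), W_2 = L(1/2, 1/16).
LOWEST = {0: Fraction(0), 1: Fraction(1, 2), 2: Fraction(1, 16)}


def lowest_weight(i, j):
    """Lowest weight of W_i (x) W_j, assumed to be the sum of the two lowest weights."""
    return LOWEST[i] + LOWEST[j]


# Assumed irreducible modules W_i (x) W_j that contain a nonzero vector of weight 0.
weight_zero_modules = [
    (i, j) for i, j in product(LOWEST, repeat=2) if lowest_weight(i, j) == 0
]

# Lowest weights of the summands W_i (x) W_i of W in Proposition 3.2.
diagonal_lowest = {i: lowest_weight(i, i) for i in LOWEST}

if __name__ == "__main__":
    print("modules with weight-0 vectors:", weight_zero_modules)
    print("lowest weights of W_i (x) W_i:", diagonal_lowest)
```

Under these assumptions only $W_0 \otimes W_0$ has a nonzero element of weight 0, and the summands of $W$ have lowest weights $0$, $1$ and $\frac{1}{8}$. This is consistent with the weight-0 subspace of $W$ being spanned by $\mathbf{1} \otimes \mathbf{1}$, so the unit of the corresponding associative algebra would be unique in the sense of Definition 4.2.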
Remark 4.4. Recall that a commutative associative algebra in a braided tensor category $\mathcal{C}$ or a commutative associative $\mathcal{C}$-algebra is an associative $\mathcal{C}$-algebra satisfying $\mu \circ \mathcal{R} = \mu$ (commutativity), where $\mathcal{R}$ is the commutativity isomorphism from $A \otimes A$ to itself. Let $V$ be a vertex operator algebra as in Theorem 4.3 and $\mathcal{C}$ the category of $V$-modules. Then an associative $\mathcal{C}$-algebra $V_e$ is in general not commutative. In fact, for the category $\mathcal{C}$ of modules for $V$, the commutativity isomorphism $\mathcal{R}$ is characterized by

$$\mathcal{R}(u \boxtimes v) = e^{L(-1)}\,\overline{T}_{\gamma_+}(v \boxtimes_{P(-1)} u), \tag{4.16}$$
where u, v ∈ Ve , γ+ is a path from −1 to 1 in the closed upper half plane without passing through 0, Tγ+ is the corresponding parallel transport isomorphism and T γ+ is the natural extension of Tγ+ to the algebraic completion Ve Ve of Ve Ve . The natural extensions of the left- and right-hand sides of commutativity applied to u v for u, v ∈ Ve gives µ(R(u v)) and µ(u v), respectively. By the characterization (4.16) f of R and the relation between µ and Ye , the left- and right-hand sides of commutativity f f are further equal to eL(−1) Ye (v, −1)u and Ye (u, 1)v, respectively. Note that in genf f f O L(−1) eral Ye (v, −1)u = Ye (v, −1)u. So e Ye (v, −1)u and Ye (u, 1)v are not equal in general. Thus commutativity is not true in general. Remark 4.5. In [O], Ostrik introduced the notions of left center and right center of an associative algebra in a braided tensor category. In the case that the braided tensor category is the category of V -modules for a vertex operator algebra as in the theorem above, both left and right centers of an associative algebra are V -modules and thus are graded. The meromorphic center of the grading-restricted conformal open-string vertex algebra corresponding to the associative algebra is actually the maximal Z-graded V -module contained in the intersection of the left and right centers. 5. A Geometric and Operadic Formulation In this section, we give a geometric and operadic formulation of the notion of a grading-restricted conformal open-string vertex algebra. For the notion of open-string vertex algebra and other variations, we have similar geometric and operadic formulations. In the present section, we discuss only grading-restricted conformal open-string vertex algebras. We assume that the reader is familiar with the geometric and operadic formulation
of the notion of vertex operator algebra given by the first author. See [H1, H2, H6, HL1 and HL2]. for details. ˆ is analytically diffeomorWe first introduce a geometric partial operad. Note that H r r ¯ phic to the closed unit disk. We use a and a to denote the relatively open upper-half ¯ and the closed upper-half disk in H, ¯ respectively, centered at a ∈ R with radius disk in H r r r ¯ a = B¯ ar ∩ H, where Bar and B¯ ar are the open and r ∈ R+ , that is, a = Ba ∩ H and closed disks centered at a ∈ R with radius r ∈ R+ . A disk with strips of type (m, n) (m, n ∈ N) is a disk S (a genus-zero compact connected one-dimensional complex manifold with one connected component of boundary) with m + n distinct, ordered points p1 , . . . , pm+n (called boundary punctures) on the boundary of S with p1 , . . . , pm negatively oriented and the other punctures positively oriented, and with local analytic coordinates (U1 , ϕ1 ), . . . , (Um+n , ϕm+n ) vanishing at the boundary punctures p1 , . . . , pm+n , respectively, where for each i = ¯ mapping 1, . . . , m + n, Ui is a local coordinate neighborhood at pi and ϕi : Ui → H, the boundary part of Ui analytically to R and satisfying ϕi (pi ) = 0, is a local analytic coordinate map vanishing at pi . In the present paper, we consider only disks with strips of types (1, n) for n ∈ N. For such a disk with strips, we use the subscript 0 and the subscripts 1, . . . , n to indicate that the corresponding boundary punctures are negatively oriented and positively oriented, respectively. Let S1 and S2 be disks with strips of type (1, m) and of type (1, n), respectively. Let p0 , . . . , pm be the boundary punctures of S1 , q0 , . . . , qn the boundary punctures of S2 , (Ui , ϕi ) the local coordinate at pi for some fixed i satisfying 0 < i ≤ m, and (V0 , ψ0 ) the local coordinate at q0 . Note that in our convention discussed above, p0 and q0 are the negatively oriented boundary punctures on S1 and S2 , respectively. 
Assume that there ¯ r and ψ0 (V0 ) contains ¯ 1/r . Assume also exists r ∈ R+ such that ϕi (Ui ) contains 0 0 ¯ r ) and ψ −1 ( ¯ 1/r ), respecthat pi and q0 are the only boundary punctures in ϕi−1 ( 0 0 0 tively. In this case we say that the i th boundary puncture of the first disk with strips can be sewn with the 0th boundary puncture of the second disk with strips. From these two disks with strips we obtain a disk with strips of type (1, m + n − 1) by cutting 1/r ϕi−1 (r0 ) and ψ0−1 (0 ) from S1 and S2 , respectively, and then identifying the new parts of the boundaries (the parts not on the boundaries of the original surfaces) of the resulting surfaces using the map ϕi−1 ◦ (−J ) ◦ ψ0 , where J is the map from C× to itself given by J (w) = 1/w. The boundary punctures (with ordering) of this disk with strips are p0 , . . . , pi−1 , q1 , . . . , qn , pi+1 , . . . , pm . The local coordinates vanishing at these punctures are given in the obvious way. This sewing procedure gives a partial operation which we call the sewing operation. Note that we have to use −J instead of J (as in [H6]) in the definition of the sewing operation. We define the notion of conformal equivalence between two disks with strips in the obvious way. The space of equivalence classes of disks with strips is called the moduli space of disks with strips. Similar to the moduli spaces of spheres with tubes in [H6], the moduli space of disks with strips of type (1, n) (n ≥ 1) can be identified with ϒ(n) = n−1 × × nR+ , where is the set of all sequences A = {Aj }j ∈Z+ of real numbers such that d exp Aj x j +1 x dx j >0
is a convergent power series in some neighborhood of 0, R+ = R+ × , and n−1 is the set of elements of Rn−1 with nonzero and distinct components. We think of each ˆ equipped with ordered punctures ∞, r1 , . . . , rn−1 , element of ϒ(n), n ≥ 1, as the disk H 0, with an element of specifying the local coordinate at ∞ and with n elements of R+ specifying the local coordinates at the other punctures. Analogously, the moduli space of disks with strips of type (1, 0) can be identified with ϒ(0) = {A ∈ | A1 = 0}. Then the moduli space of disks with strips can be identified with ∪n≥0 ϒ(n). From now on we will refer to ∪n∈N ϒ(n) as the moduli space of disks with strips. The sewing operation for disks with strips induces a partial operation on ∪n∈N ϒ(n). It is still called the sewing operation. ˆ with the Let Iϒ ∈ ϒ(1) be the equivalence class containing the standard disk H negatively oriented puncture ∞, the only positively oriented puncture 0 , and with stanˆ the standard local dard local coordinates vanishing at ∞ and 0. Here for a ∈ R ⊂ H, ˆ the standard local coordinate vanishing at a is given by w → w − a, and for ∞ ∈ H, 1 coordinate vanishing at ∞ is given by w → − w . Note the minus sign in the definition of the standard local coordinate at ∞. For n ∈ N, the symmetric group Sn acts on ϒ(n) in an obvious way. Then by construction, the following result is clear: Proposition 5.1. The sequences ϒ = {ϒ(n) | n ∈ N} of moduli spaces, together with the sewing operation, the identity Iϒ and the actions of the symmetric groups, has a structure of an associative smooth R+ -rescalable partial operad. We shall call the R+ -rescalable partial operad ϒ the boundary disk partial operad. Note that the boundary disk partial operad is very different from the so-called little disk operad which is constructed using the embeddings of disks in the unit disk. In fact, ϒ can be viewed as a partial suboperad of the sphere partial operad K discussed in [H6]. 
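The minus signs in the sewing map $-J$ and in the standard local coordinate $w \mapsto -\frac{1}{w}$ at $\infty$ can be checked numerically: $w \mapsto -\frac{1}{w}$ sends the upper half-plane to itself, whereas $J(w) = \frac{1}{w}$ sends it to the lower half-plane. The following is a minimal sketch of that check; the helper names `J`, `minus_J` and `upper_arc` are illustrative and do not appear in the original.

```python
import cmath
import math


def J(w):
    """The map J(w) = 1/w on nonzero complex numbers."""
    return 1 / w


def minus_J(w):
    """The map -J(w) = -1/w used in the sewing operation for disks with strips."""
    return -1 / w


def upper_arc(radius, samples=7):
    """Sample points on the open upper semicircle of the given radius centred at 0."""
    return [
        cmath.rect(radius, math.pi * t / (samples + 1))
        for t in range(1, samples + 1)
    ]


if __name__ == "__main__":
    for w in upper_arc(2.0):
        image = minus_J(w)
        # -J maps the arc of radius r in the upper half-plane to the arc of
        # radius 1/r in the upper half-plane; J lands in the lower half-plane.
        print(f"|w|={abs(w):.2f}  Im(-1/w)={image.imag:+.3f}  "
              f"|-1/w|={abs(image):.2f}  Im(1/w)={J(w).imag:+.3f}")
```

The sampled arc of radius $r$ is carried by $-J$ onto the arc of radius $1/r$ in the upper half-plane, which is the geometric reason why the two upper-half disks of radii $r$ and $1/r$ can be removed and the resulting boundary arcs identified.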
Geometrically, any disk with strips of type (1, n) is conformally equivalent to a disk with ˆ and whose negatively oriented puncture strips of type (1, n) whose underlying disk is H is ∞. But any such disk with strips of type (1, n) corresponds to a sphere with tubes ˆ whose punctures are the same as those on of type (1, n) whose underlying sphere is C, the disk with strips, whose local coordinates vanishing at positively oriented punctures are the analytic extensions of those on the disk with strips and whose local coordinate vanishing at the negatively oriented puncture is the analytic extension of the negation of that on the disk with strips. Thus we obtain a map from ϒ(n) to K(n) and this map is clearly injective. In fact the images of ϒ(n) in K(n) for n ≥ 2 are (1)
(n)
{(r1 , . . . , rn−1 ; A(0) , (a0 , A(1) ), · · · , (a0 , A(n) )) ∈ K(n) | (1)
(n)
r1 , . . . , rn−1 ∈ R, a0 , . . . , a0 ∈ R+ , A(0) , . . . , A(n) ∈ }. The images of ϒ(0) in K(0) and of ϒ(1) in K(1) are (0)
{A(0) ∈ K(0) | A(0) ∈ , A1 = 0} and (1)
(1)
{(A(0) , (a0 , A(1) )) ∈ K(1) | a0 ∈ R+ , A(0) , A(1) ∈ }, respectively. In addition, by the definitions of the maps from ϒ(n) to K(n) for n ∈ N and the sewing operations in ϒ and K, it is clear that the maps from ϒ(n) to K(n) for n ∈ N respect the sewing operations, the identities and the actions of Sn and thus give an
injective morphism of partial operads. From now on, we shall identify the partial operad $\Upsilon$ with its image in $K$ under this injective morphism.

For any $c \in \mathbb{C}$, the restriction of the partial operad $\tilde{K}^c$ of the $\frac{c}{2}$-th power of the determinant line bundles over $K$ to $\Upsilon$ gives a partial suboperad $\tilde{\Upsilon}^c$ of $\tilde{K}^c$. This partial operad is called the $\mathbb{C}$-extension of $\Upsilon$ of central charge $c$.

We now consider certain (pseudo-)algebras over the partial operad $\tilde{\Upsilon}^c$ for $c \in \mathbb{C}$. In the terminology of [HL1, HL2 and H6], we consider $\tilde{\Upsilon}^c$-associative (pseudo-)algebras satisfying an additional differentiability condition. Since the rescaling group of $\tilde{\Upsilon}^c$ is $\mathbb{R}_+$, we need to consider modules for $\mathbb{R}_+$. Since an equivalence class of irreducible modules for $\mathbb{R}_+$ is determined by a real number $s$ such that $a \in \mathbb{R}_+$ acts on modules in this class as the scalar multiplication by $a^{-s}$, any completely reducible module for $\mathbb{R}_+$ is of the form $V = \bigoplus_{s \in \mathbb{R}} V_{(s)}$, where $V_{(s)}$ is the sum of the $\mathbb{R}_+$-submodules in the class determined by the real number $s$. We shall consider only those algebras over $\tilde{\Upsilon}^c$ whose underlying vector space is of the form $V = \bigoplus_{s \in \mathbb{R}} V_{(s)}$ such that $\dim V_{(s)} < \infty$. Recall from [HL1, HL2 and H6] that given any $\mathbb{R}_+$-submodule $W$ of $V$, the endomorphism partial pseudo-operad associated to the pair $(V, W)$ is the sequence $H_{V,W}^{\mathbb{R}_+} = \{H_{V,W}^{\mathbb{R}_+}(n)\}_{n \in \mathbb{N}}$, where $H_{V,W}^{\mathbb{R}_+}(n)$ is the set of all multilinear maps from $V^{\otimes n}$ to $V$ such that $W^{\otimes n}$ is mapped to $W$, equipped with natural operadic structures.

Definition 5.2. A differentiable (or $C^1$) $\tilde{\Upsilon}^c$-associative pseudo-algebra is a completely reducible $\mathbb{R}_+$-module $V = \bigoplus_{s \in \mathbb{R}} V_{(s)}$ satisfying the condition $\dim V_{(s)} < \infty$ for $s \in \mathbb{R}$ equipped with an $\mathbb{R}_+$-submodule $W$ and a morphism $\nu$ of $\mathbb{R}_+$-rescalable pseudo-partial operads from $\tilde{\Upsilon}^c$ to the endomorphism partial pseudo-operad $H_{V,W}^{\mathbb{R}_+}$ (that is, an $\tilde{\Upsilon}^c$-associative pseudo-algebra) satisfying the following conditions:

1. For $s$ sufficiently negative, $V_{(s)} = 0$.
2. For any $n \in \mathbb{N}$, $\nu_n : \tilde{\Upsilon}^c(n) \to H_{V,W}^{\mathbb{R}_+}(n)$ is linear on the fibers of $\tilde{\Upsilon}^c(n)$.
3. For any $s_1, \dots, s_n \in \mathbb{R}$, there exists a finite subset $R(s_1, \dots, s_n) \subset \mathbb{R}$ such that the image of $\bigoplus_{s \in s_1 + \mathbb{Z}} V_{(s)} \otimes \cdots \otimes \bigoplus_{s \in s_n + \mathbb{Z}} V_{(s)}$ under $\nu_n(\psi_n(Q))$ for any $Q \in \Upsilon(n)$ is in $\bigoplus_{s \in R(s_1, \dots, s_n) + \mathbb{Z}} V_{(s)}$.
4. For any $v' \in V'$, $v_1, \dots, v_n \in V$, $\langle v', \nu_n(\psi_n(Q))(v_1 \otimes \cdots \otimes v_n)\rangle$ as a function of

$$Q = (r_1, \dots, r_{n-1}; A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)})) \in \Upsilon(n)$$

is of the form

$$\sum_{i=1}^{m} f_i(r_1, \dots, r_{n-1})\, g_i(A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)})),$$

where $f_i(r_1, \dots, r_{n-1})$ for $i = 1, \dots, m$ are continuously differentiable functions of $r_1, \dots, r_{n-1}$ and $g_i(A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)}))$ for $i = 1, \dots, m$ are polynomials in $A^{(0)}$, $(a_0^{(1)})^{\pm 1}$, $A^{(1)}$, $\dots$, $(a_0^{(n)})^{\pm 1}$, $A^{(n)}$.

Morphisms (respectively, isomorphisms) of differentiable $\tilde{\Upsilon}^c$-associative pseudo-algebras are morphisms (respectively, isomorphisms) of the underlying $\tilde{\Upsilon}^c$-associative pseudo-algebras.
We denote the differentiable $\tilde{\Upsilon}^c$-associative pseudo-algebra just defined by $(V, W, \nu)$ or simply $V$. It is easy to see that a differentiable $\tilde{\Upsilon}^c$-associative pseudo-algebra is actually analytic in the sense that for any $v' \in V'$, $v_1, \dots, v_n \in V$, $\langle v', \nu(Q)(v_1, \dots, v_n)\rangle$ is analytic in $Q$ because of the sewing axiom (that is, the sewing operation in $\Upsilon$ corresponds to the contraction in $H_{V,W}^{\mathbb{R}_+}$ under $\nu$). Using this fact and the fact that the expansions of analytic functions are always absolutely convergent in the domain of convergence, it is easy to obtain:

Proposition 5.3. Any differentiable $\tilde{\Upsilon}^c$-associative pseudo-algebra $(V, W, \nu)$ is an $\tilde{\Upsilon}^c$-associative algebra, that is, the image of $\tilde{\Upsilon}^c$ under $\nu$ is a partial operad (the image of $\tilde{\Upsilon}^c$ under $\nu$ satisfies the composition-associativity).

We omit the proof of this result since it is the same as the proof of the corresponding result in [H6]. Because of this result, we shall call a differentiable $\tilde{\Upsilon}^c$-associative pseudo-algebra simply a differentiable $\tilde{\Upsilon}^c$-associative algebra.

Now we have the following main theorem which gives a geometric and operadic formulation of the notion of grading-restricted conformal open-string vertex algebras:

Theorem 5.4. The category of grading-restricted conformal open-string vertex algebras of central charge $c$ is isomorphic to the category of differentiable $\tilde{\Upsilon}^c$-associative algebras.

Proof. The proof of this theorem is basically the same as that of the isomorphism theorem for the geometric and operadic formulation of vertex operator algebras in [H6]. Here we give a sketch. Some more details will be given in [K].

Let $(V, Y^O, \mathbf{1}, \omega)$ be a grading-restricted conformal open-string vertex algebra of central charge $c$. We construct a differentiable $\tilde{\Upsilon}^c$-associative algebra of central charge $c$ as follows: The $\mathbb{R}$-graded vector space $V$ is naturally a completely reducible $\mathbb{R}_+$-module.
The module $W$ for the Virasoro algebra generated by $\mathbf{1}$ is an $\mathbb{R}$-graded subspace of $V$ and therefore is an $\mathbb{R}_+$-submodule of $V$. In [H2] and [H6], a section $\psi$ of the line bundle $\tilde{K}^c$ over $K$ is chosen. The restriction of this section to $\Upsilon$ is a section of $\tilde{\Upsilon}^c$ and, for simplicity, we still denote it by $\psi$. For an element

$$Q = (r_1, \dots, r_{n-1}; A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)})) \tag{5.1}$$
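of $\Upsilon(n)$, any element of the fiber of $\tilde{\Upsilon}^c$ over $Q$ is of the form $\lambda\psi_n(Q)$, where $\lambda \in \mathbb{C}$. When $r_1 > \cdots > r_{n-1} > 0$, we define $\nu_n(\lambda\psi_n(Q))$ by

$$(\nu_n(\lambda\psi_n(Q)))(v_1 \otimes \cdots \otimes v_n) = \lambda\, e^{-\sum_{j \in \mathbb{Z}_+} A_j^{(0)} L(-j)} \cdot Y^O\big(e^{-\sum_{j \in \mathbb{Z}_+} A_j^{(1)} L(j)} (a_0^{(1)})^{-L(0)} v_1, r_1\big) \cdots Y^O\big(e^{-\sum_{j \in \mathbb{Z}_+} A_j^{(n-1)} L(j)} (a_0^{(n-1)})^{-L(0)} v_{n-1}, r_{n-1}\big) \cdot e^{-\sum_{j \in \mathbb{Z}_+} A_j^{(n)} L(j)} (a_0^{(n)})^{-L(0)} v_n$$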
for $v_1, \dots, v_n \in V$. In general, for any $Q \in \Upsilon(n)$, we can always find $\sigma_Q \in S_n$ such that $\sigma_Q(Q)$ is of the form of the right-hand side of (5.1) with $r_1 > \cdots > r_{n-1} > 0$. We define $\nu_n(\lambda\psi_n(Q))$ by

$$(\nu_n(\lambda\psi_n(Q)))(v_1 \otimes \cdots \otimes v_n) = \nu_n(\lambda\psi_n(\sigma_Q(Q)))(v_{\sigma_Q^{-1}(1)} \otimes \cdots \otimes v_{\sigma_Q^{-1}(n)})$$
for v1 , . . . , vn ∈ V . It can be verified in the same way as in [H6] that the triple (V , W, ν) is a differentiable ϒ˜ c -associative algebra of central charge c. This construction gives a functor from the category of grading-restricted conformal open-string vertex algebras of central charge c to the category of differentiable ϒ˜ c -associative algebras. Conversely, given a differentiable ϒ˜ c -associative algebra (V , W, ), we construct a grading-restricted conformal open-string vertex algebra as follows: As in [H6], for ε ∈ R and i ∈ Z+ , let A(ε; i) be the element of whose i th component is equal to ε and all other components are 0 and 0 the element of whose components are all 0, and for r ∈ R+ , let P (r) = (r; 0, (1, 0), (1, 0)) ∈ ϒ(2) ⊂ K(2). We define the vertex operator map Y O : (V ⊗ V ) × R+ → V , (v1 ⊗ v2 , r) → Y O (v1 , r)v2 by Y O (v1 , r)v2 = (2 (ψ2 ((P (r))))(v1 ⊗ v2 ) for v1 , v2 ∈ V and r ∈ R+ . The vacuum 1 ∈ V is given by 1 = 0 (ψ0 (0)). The conformal element ων is given by
$$\omega = -\frac{d}{d\varepsilon}\, \nu_0(\psi_0((A(\varepsilon; 2))))\Big|_{\varepsilon=0}.$$
It can be proved in the same way as in [H6] that (V , Y O , 1, ω) is a grading-restricted conformal open-string vertex algebra. This construction gives a functor from the category of differentiable ϒ˜ c -associative (pseudo-)algebras to the category of conformal open-string vertex algebras of central charge c. It can be shown in the same way as in [H6] that these two functors constructed above are inverse to each other. Thus the conclusion of the theorem is true.
The result above can actually be generalized to show that a grading-restricted conformal open-string vertex algebra of central charge $c$ gives an algebra over a partial operad extending the operad of the $c$th power of the determinant line bundles over the so-called “Swiss-cheese” operad (see [V]). We actually have a stronger isomorphism theorem than Theorem 5.4 involving meromorphic centers of grading-restricted conformal open-string vertex algebras. To formulate this result, we first introduce the underlying partial operads.

A disk with strips and tubes of type $(m, n; k, l)$ ($m, n, k, l \in \mathbb{N}$) is a disk $S$ with $m + n$ distinct, ordered points $p_1^B, \dots, p_{m+n}^B$ (called boundary punctures) on the boundary of $S$ and $k + l$ distinct, ordered points $p_1^I, \dots, p_{k+l}^I$ (called interior punctures) in the interior of $S$, with $p_1^B, \dots, p_m^B$ and $p_1^I, \dots, p_k^I$ negatively oriented and the other (boundary or interior) punctures positively oriented, and with local analytic coordinates

$$(U_1^B, \varphi_1^B), \dots, (U_{m+n}^B, \varphi_{m+n}^B),\ (U_1^I, \varphi_1^I), \dots, (U_{k+l}^I, \varphi_{k+l}^I)$$
B , p I , . . . , p I , respecvanishing at the (boundary or interior) punctures p1B , . . . , pm+n 1 k+l tively, where for each i = 1, . . . , m + n (or j = 1, . . . , k + l), UiB (or UjI ) is a local ¯ (or ϕ I : U I → C), mapcoordinate neighborhood at piB (or pjI ) and ϕiB : UiB → H j j ping the boundary part of UiB (or mapping UjI ) analytically to R (or C) and satisfying ϕiB (piB ) = 0 (or ϕiI (piI ) = 0), is a local analytic coordinate map vanishing at piB (or piI ). Note that when k = l = 0, we have a disk with strips of type (m, n). In the present paper, we consider only disks with strips and tubes of types (1, n; 0, l) for n, l ∈ N. For such a disk with strips, we use the subscript 0 and the subscripts 1, . . . , n to indicate that the corresponding boundary punctures are negatively oriented and positively oriented, respectively. Similar to disks with strips, we have a sewing operation which sews two disks with strips and tubes at boundary punctures of opposite orientations. Here we shall call this sewing operation the boundary sewing operation. On the other hand, we can also sew the negatively oriented puncture of a sphere with tubes to an interior puncture of a disk with strips and tubes just as we sew two spheres with tubes in [H6]. We shall call this sewing operation the interior sewing operation. The conformal equivalences for these disks with strips and tubes are defined in the obvious way. For n ≥ 1 and l ∈ N, the moduli space of disks with strips and tubes l × H l, of type (1, n; 0, l) can be identified with ϒ(n; l) = n−1 × × nR+ × MH c
l is the where , and R+ are defined above, H ands Hc are defined in [H6] and MH l set of elements of H with nonzero and distinct components. Analogously, for l ∈ N, the moduli space of disks with strips and tubes of type (1, 0; 0, l) can be identified with ϒ(0; 1) = {A ∈ | A1 = 0} × Hcl . Note that ϒ(n) = ϒ(n; 0) for n ∈ N. In particular, the identity Iϒ is an element of ϒ(1; 0). Also, for n ∈ N, Sn acts on ϒ(n; l) in the obvious way. Let S(n) = ∪l∈N ϒ(n; l) for n ∈ N. Then Sn acts on S(n) for n ∈ N. The following result is clear:
Proposition 5.5. The sequence $S = \{S(n)\}_{n \in \mathbb{N}}$, together with the boundary sewing operation, the identity $I_\Upsilon \in \Upsilon(1) = \Upsilon(1; 0)$ and the actions of the symmetric groups, has the structure of a smooth $\mathbb{R}_+$-rescalable partial operad. In addition, for each $n \in \mathbb{N}$, there is an action of the sphere partial operad $K$ on $S(n)$ given by the interior sewing operation.

Borrowing the terminology used by Voronov in [V], we shall call the $\mathbb{R}_+$-rescalable partial operad $S$ the Swiss-cheese partial operad. But note that our partial operad is much larger than the Swiss-cheese operad. In fact, the Swiss-cheese operad is an analogue of the little disk operad while our Swiss-cheese partial operad is an analogue of the sphere partial operad in [H6].

For each pair $n, l \in \mathbb{N}$, we have an injective map from $\Upsilon(n; l)$ to $K(n + 2l)$ obtained by doubling disks with strips and tubes as follows: For any disk with strips and tubes of type $(1, n; 0, l)$, by the uniformization theorem, we can find a conformally equivalent disk with strips and tubes of the same type such that its underlying disk is $\hat{H}$. This latter disk with strips and tubes can be doubled to obtain a sphere with tubes of type $(1, n + 2l)$ such that its underlying sphere is the double $\mathbb{C} \cup \{\infty\}$ of $\hat{H}$. By definition, we see that conformally equivalent disks with strips and tubes give conformally equivalent spheres with tubes. Thus we obtain a map from $\Upsilon(n; l)$ to $K(n + 2l)$. Clearly this map is injective. It is clear from the definition that these maps respect the (boundary) sewing operations. In addition, these maps also intertwine the actions of $K$ on $S(n)$ for $n \in \mathbb{N}$ and
the actions of $K$ on the images of $S(n)$ obtained by doubling the actions of $K$ on $K(n)$. We shall identify $\Upsilon(n; l)$ with its image in $K(n + 2l)$.

For any $c \in \mathbb{C}$, the restriction of the partial operad $\tilde{K}^c$ of the $\frac{c}{2}$-th power of the determinant line bundles over $K$ to $S$ has a natural structure of a partial operad. This partial operad is called the $\mathbb{C}$-extension of $S$ of central charge $c$ and is denoted $\tilde{S}^c$. For any $n \in \mathbb{N}$, the action of $K$ on $S(n)$ also induces an action of $\tilde{K}^c$ on $\tilde{S}^c(n)$. For any $n \in \mathbb{N}$, the restrictions of the sections $\psi_{n+2l}$ of $\tilde{K}^c(n + 2l)$ for $l \in \mathbb{N}$ to $\Upsilon(n; l)$ give a section of $\tilde{S}^c(n)$ and we shall use $\psi_n^S$ to denote this section.

We now consider a completely reducible $\mathbb{R}_+$-module or, equivalently, an $\mathbb{R}$-graded vector space $V^O = \bigoplus_{s \in \mathbb{R}} V^O_{(s)}$ and completely reducible $\mathbb{C}^\times$-modules or, equivalently, $\mathbb{Z}$-graded vector spaces $V^{LC} = \bigoplus_{m \in \mathbb{Z}} V^{LC}_{(m)}$ and $V^{RC} = \bigoplus_{m \in \mathbb{Z}} V^{RC}_{(m)}$. (Here O, LC and RC mean open, left closed and right closed, respectively.) Let $W^O$, $W^{LC}$ and $W^{RC}$ be an $\mathbb{R}_+$-submodule of $V^O$, a $\mathbb{C}^\times$-submodule of $V^{LC}$ and a $\mathbb{C}^\times$-submodule of $V^{RC}$, respectively. Associated to $V^{LC}$, $W^{LC}$, $V^{RC}$, $W^{RC}$, we have the endomorphism partial pseudo-operads $H^{\mathbb{C}^\times}_{V^{LC}, W^{LC}}$, $H^{\mathbb{C}^\times}_{V^{RC}, W^{RC}}$ and $H^{\mathbb{C}^\times}_{V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}$ (see [HL1], [HL2] and [H6]), where $(V^{RC})^-$ and $(W^{RC})^-$ are the complex conjugates of $V^{RC}$ and $W^{RC}$.

We also need an endomorphism partial operad constructed from $V^O$, $W^O$, $V^{LC}$, $W^{LC}$, $V^{RC}$ and $W^{RC}$. For $n, l \in \mathbb{N}$, let $H^{\mathbb{R}_+}_{V^O, W^O; V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}(n; l)$ be the space of all linear maps from $(V^O)^{\otimes n} \otimes (V^{LC} \otimes (V^{RC})^-)^{\otimes l}$ to $V^O$ such that $(W^O)^{\otimes n} \otimes (W^{LC} \otimes (W^{RC})^-)^{\otimes l}$ is mapped to $W^O$ and for $n \in \mathbb{N}$, let
HV O+,W O ;V LC ⊗(V RC )− ,W LC ⊗(W RC )− (n) R HV O+,W O ;V LC ⊗(V RC )− ,W LC ⊗(W RC )− (n; l). = l∈N
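Then it is clear that for $n \in \mathbb{N}$, the endomorphism partial pseudo-operad $H^{\mathbb{C}^\times}_{V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}$ acts on $H^{\mathbb{R}_+}_{V^O, W^O; V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}(n)$ and

$$H^{\mathbb{R}_+}_{V^O, W^O; V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-} = \{H^{\mathbb{R}_+}_{V^O, W^O; V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}(n)\}_{n \in \mathbb{N}}$$

is an $\mathbb{R}_+$-rescalable partial pseudo-operad. We call it the endomorphism partial pseudo-operad for $(V^O, W^O; V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-)$.

Notice that $H^{\mathbb{C}^\times}_{V^{LC}, W^{LC}} \otimes (H^{\mathbb{C}^\times}_{V^{RC}, W^{RC}})^-$ (here $(H^{\mathbb{C}^\times}_{V^{RC}, W^{RC}})^-$ is the complex conjugate of $H^{\mathbb{C}^\times}_{V^{RC}, W^{RC}}$) can be embedded naturally into the space $H^{\mathbb{C}^\times}_{V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}$. Below we shall view $H^{\mathbb{C}^\times}_{V^{LC}, W^{LC}} \otimes (H^{\mathbb{C}^\times}_{V^{RC}, W^{RC}})^-$ as a partial pseudo-suboperad of $H^{\mathbb{C}^\times}_{V^{LC} \otimes (V^{RC})^-, W^{LC} \otimes (W^{RC})^-}$.

Let $\bar{c}$ be the complex conjugate of $c \in \mathbb{C}$. The complex conjugate $\overline{\tilde{K}^{\bar{c}}}$ of $\tilde{K}^{\bar{c}}$ is also a $\mathbb{C}^\times$-rescalable partial operad. Consequently the tensor product $\tilde{K}^c \otimes \overline{\tilde{K}^{\bar{c}}}$ (the tensor product of line bundles) is also a $\mathbb{C}^\times$-rescalable partial operad. Interpreting the action of $K$ on $S$ using the method of doubling disks, we see that $\tilde{K}^c \otimes \overline{\tilde{K}^{\bar{c}}}$ acts naturally on $\tilde{S}^c$.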
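As an illustration only, the puncture bookkeeping behind the doubling map from $\Upsilon(n; l)$ to $K(n + 2l)$ can be sketched in Python. The sketch assumes the underlying disk is already the closed upper half-plane with the negatively oriented boundary puncture at $\infty$; it tracks positions of punctures only and ignores the local coordinates and the line-bundle data. The function name `double_punctures` is hypothetical and is not notation from [H6].

```python
from typing import List, Sequence


def double_punctures(boundary: Sequence[float],
                     interior: Sequence[complex]) -> List[complex]:
    """Positively oriented punctures of the doubled sphere (assumed model).

    boundary: the n positively oriented boundary punctures (real numbers).
    interior: the l interior punctures (points with positive imaginary part).
    Returns the n + 2l punctures on C: the boundary punctures, then each
    interior puncture z, then each mirror image conj(z).
    """
    for z in interior:
        if complex(z).imag <= 0:
            raise ValueError("interior punctures must lie in the upper half-plane")
    doubled = [complex(r) for r in boundary]
    doubled += [complex(z) for z in interior]
    doubled += [complex(z).conjugate() for z in interior]
    if len(set(doubled)) != len(doubled):
        raise ValueError("punctures must be distinct")
    return doubled


# n = 3 boundary punctures and l = 2 interior punctures give 3 + 2*2 punctures.
punctures = double_punctures([2.0, 1.0, 0.0], [1 + 1j, -0.5 + 2j])
```

The resulting set of punctures is invariant under complex conjugation, which is the property that lets an action on the sphere be read as an action on the disk.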
˜ c for c ∈ C. We are interested in certain algebras over S ˜ c generated by a differentiable ϒ ˜ c -assoDefinition 5.6. A pseudo-algebra over S ˜ c -associative algebras or ciative pseudo-algebra and meromorphic actions of two K ˜ c consists of the followsimply a differentiable-meromorphic pseudo-algebra over S ing data: O 1. A completely reducible R+ -module V O = s∈R V(s) satisfying the condition O < ∞ for s ∈ R and completely reducible C× -modules V LC = LC dim V(s) m∈Z V(m) RC LC RC and V RC = m∈Z V(m) satisfying the condition dim V(m) < ∞ and dim V(m) < ∞ for m ∈ Z. 2. An R+ -submodule W O of V O and C× -submodules W LC and W RC of V LC and V RC , respectively. ˜ c to the endomor3. A morphism of R+ -rescalable partial pseudo-operads From S R+ phism partial pseudo-operad HV O ,W O ;V LC ⊗(V RC )− ,W LC ⊗(W RC )− and a morphism of C× -rescalable partial pseudo-operads From K˜ c ⊗ K˜ c¯ to the endomorphism partial × pseudo-operad HVCLC ⊗(V RC )− ,W LC ⊗(V RC )− . These data satisfy the following conditions: O = 0 and for m ∈ Z sufficiently negative, 1. For s ∈ R sufficiently negative, V(s) LC = V RC = 0. V(m) (m) ˜ c (n) → H RO+ O LC RC LC RC (n) is linear on the 2. For any n ∈ N, n : S V ,W ;V ⊗V ,W ⊗W ˜ c (n). fibers of S
3. The morphism is equal to L ⊗ R , where L ( R ) is a morphism of C× × × rescalable partial pseudo-operads From K˜ c (K˜ c¯ ) to HVCLC ,W LC (HVCRC ,W RC ) and R is the complex conjugate of R . In addition, the triples (V LC , W LC , L ) and (V RC , W RC , R ) are meromorphic K˜ c -associative algebra and K˜ c¯ -associative algebra, respectively. 4. For any n ∈ N, the map ˜ c (n) → H RO+ O LC n : S (n) V ,W ;V ⊗(V RC )− ,W LC ⊗(W RC )− ˜ c (n) and the intertwines the action of the partial operad K˜ c ⊗ K˜ c¯ on S × C on action of the partial pseudo-operad HV LC ⊗(V RC )− ,W LC ⊗(V RC )− R
HV O+,W O ;V LC ⊗(V RC )− ,W LC ⊗(V RC )− (n). . , sn ∈ R, there existsa finite subset R(s1 , . . . , sn ) ⊂ R such that 5. For any s1 , . . O O S the image of s∈s1 +Z V(s) ⊗ · · · ⊗ s∈sn +Z V(s) under n (ψn (Q)) for any Q ∈ O. ϒ˜ c (n; 0) is in s∈R(s1 ,...,sn )+Z V(s) 6. For any v ∈ (V O ) , u1 , . . . , un ∈ V O , v1L ⊗ v¯1R , . . . , vlL ⊗ v¯lR ∈ V LC ⊗ (V RC )− , v , n (ψnS (Q))(u1 ⊗ · · · ⊗ un ⊗ v1L ⊗ v¯1R ⊗ · · · ⊗ vlL ⊗ v¯lR )
as a function of

$$Q = (r_1, \dots, r_{n-1}; A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)}); z_1, \dots, z_l; (b_0^{(1)}, B^{(1)}), \dots, (b_0^{(l)}, B^{(l)}); \bar{z}_1, \dots, \bar{z}_l; (\bar{b}_0^{(1)}, \bar{B}^{(1)}), \dots, (\bar{b}_0^{(l)}, \bar{B}^{(l)})) \in \tilde{\Upsilon}^c(n; l) \subset K(n + 2l)$$

is of the form

$$\sum_{i=1}^{k} f_i(r_1, \dots, r_{n-1}; z_1, \dots, z_l; \bar{z}_1, \dots, \bar{z}_l) \cdot g_i(A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)}); (b_0^{(1)}, B^{(1)}), \dots, (b_0^{(l)}, B^{(l)}); (\bar{b}_0^{(1)}, \bar{B}^{(1)}), \dots, (\bar{b}_0^{(l)}, \bar{B}^{(l)})),$$

where the functions $f_i(r_1, \dots, r_{n-1}; z_1, \dots, z_l; \xi_1, \dots, \xi_l)$ for $i = 1, \dots, k$ are continuously differentiable in $r_1, \dots, r_{n-1}$ and are meromorphic in $z_1, \dots, z_l, \xi_1, \dots, \xi_l$ with $z_i = 0, \infty$ and $z_i = z_k$, $i < k$, $z_i = r_j$, $\xi_i = 0, \infty$, $\xi_i = \xi_k$, $i < k$, $\xi_i = r_j$ and $z_i = \xi_k$ for $i, k = 1, \dots, l$ and $j = 1, \dots, n - 1$ as the only possible poles, and

$$g_i(A^{(0)}, (a_0^{(1)}, A^{(1)}), \dots, (a_0^{(n)}, A^{(n)}); (b_0^{(1)}, B^{(1)}), \dots, (b_0^{(l)}, B^{(l)}); (d_0^{(1)}, D^{(1)}), \dots, (d_0^{(l)}, D^{(l)}))$$

for $i = 1, \dots, k$ are polynomials in $A^{(0)}, (a_0^{(1)})^{\pm 1}, A^{(1)}, \dots, (a_0^{(n)})^{\pm 1}, A^{(n)}$, $(b_0^{(1)})^{\pm 1}, B^{(1)}, \dots, (b_0^{(l)})^{\pm 1}, B^{(l)}$ and $(d_0^{(1)})^{\pm 1}, D^{(1)}, \dots, (d_0^{(l)})^{\pm 1}, D^{(l)}$.
˜ c are morMorphisms (respectively, isomorphisms) of such pseudo-algebras over S ˜ c prephisms (respectively, isomorphisms) of the underlying pseudo-algebras over S serving all the structures. ˜ c just defined by We denote the differentiable-meromorphic pseudo-algebra over S (V O , W O , V LC , W LC , V RC , W RC , , ) or simply by (V O , V LC , V RC ). For these pseudo-algebras, we also have the following result whose proof is the same as those of the corresponding result in [H6] and for Proposition 5.3: ˜c Proposition 5.7. Any differentiable-meromorphic pseudo-algebra over S (V O , W O , V LC , W LC , V RC , W RC , , ) ˜ c , that is, the image of S ˜ c under is a partial operad (the image is an algebra over S c ˜ of S under satisfies the composition-associativity). Because of this result, we shall omit the word “pseudo” from now on.
Note that given a vertex operator algebra V , its complex conjugate space V − has a natural vertex operator algebra structure (V − , Y − , 1, ω) of central charge c. ¯ We have the following generalization of Theorem 5.4: Theorem 5.8. The category of objects consisting of a grading-restricted conformal open-string vertex algebra of central charge c ∈ C, two vertex operator algebras, one of central charge c and the other of central charge c, ¯ and homomorphisms from the first vertex operator algebra and the complex conjugate of the second vertex operator algebra to the meromorphic center of the grading-restricted conformal open-string vertex algebra is isomorphic to the category of differentiable-meromorphic algebras over ˜ c. S Proof. The proof of this theorem is basically the same as the proof of the isomorphism theorem for the geometric and operadic formulation of vertex operator algebras in [H6] and the proof of Theorem 5.4. The main new ingredient is that we use those spheres with tubes which are obtained by doubling disks with strips and tubes. Here we give only a sketch. More details will be given in [K]. Given a grading-restricted conformal open-string vertex algebra V O of central charge c, vertex operator algebras V LC and V RC of central charge c and c, ¯ respectively, and homomorphisms hL : V LC → C0 (V O ) and hR : (V RC )− → C0 (V O ). Let W O , W LC and W RC be the modules for the Virasoro algebra generated by 1O , 1LC and 1RC (the vacuums for these algebras), respectively. By the isomorphism theorem in [H6], there are a meromorphic K˜ c -associative algebra (V LC , W LC , L ) and a meromorphic K˜ c¯ -associative algebra (V RC , W RC , R ) constructed from the vertex operator algebras V LC and V RC , respectively. Let = L ⊗ R . By Theorem 5.4, there is a differentiable ϒ˜ c -associative algebra (V O , W O , ϒ ) con˜ c . So this differentiastructed from V O . 
Note that ϒ˜ c is actually a partial suboperad of S c ˜ ble ϒ -associative algebra gives us part of a differentiable-meromorphic algebra struc˜ c . In general, the construction of the differentiable-meromorphic algebra over ture over S c ˜ S can be obtained using the meromorphic K˜ c -associative algebra (V LC , W LC , L ) and meromorphic K˜ c -associative algebra (V RC , W RC , R ), the differentiable ϒ˜ c -associative algebra (V O , W O , ϒ ) and the homomorphisms hL and hR . The action of the K˜ c -associative algebra (V LC ⊗ (V RC )− , W LC ⊗ (W RC )− , ) is also obtained using the homomorphisms hL and hR . Thus we have a functor from the category of objects of the form (V O , V LC , V RC , hL , hR ) to the category of differentiable-meromorphic ˜ c. pseudo-algebra over S ˜ c , by the definition Now given a differentiable-meromorphic pseudo-algebra over S and the isomorphism theorem in [H6], we know that there are vertex operator algebra structures of central charge c and c¯ on V LC and V RC , respectively. In particular, we have vacuums 1LC ∈ V LC and 1RC ∈ V RC . Since ϒ˜ c is actually a partial suboperad of ˜ c , by Theorem 5.4, we also obtain a grading-restricted conformal open-string vertex S algebra structure of central charge c on V O . For z ∈ H, we consider an element (z) of ϒ(0; 1) which is the conformal equivalence class containing the following disk with ˆ with the boundary puncture ∞ and the strips and tubes of type (1, 0; 0, 1): The disk H interior puncture z and with the standard local coordinates vanishing at these punctures. Then R
0 (ψ0S ((z))) ∈ HV O+,W O ;V LC ⊗(V RC )− ,W LC ⊗(W RC )− (0; 1) = Hom(V LC ⊗ (V RC )− , V O ).
By Condition 6 in Definition 5.6, we know that 0 (ψ0S ((z))(v L ⊗ 1RC ) is meromorphic in z with the only pole z = ∞. In particular, lim 0 (ψ0S ((z)))(v L ⊗ 1RC )
z→0
exists. We define hL (v L ) = lim 0 (ψ0S ((z))(v L ⊗ 1RC ). z→0
Thus we obtain a linear map hL From V LC to V O . It is easy to see that the image of hL is in fact in C0 (V O ) and hL is a homomorphism from V LC to C0 (V O ). Similarly we can construct hR . Now we have a functor from the category of a differentiable-meromorphic ˜ c to the category of objects of the form (V O , V LC , V RC , hL , hR ). pseudo-algebra over S From the isomorphism theorem in [H6], Theorem 5.4 and the construction of the two functors above, we see that these two functors are inverse to each other.
In particular, we have:

Corollary 5.9. Let $V$ be a grading-restricted conformal open-string vertex algebra of central charge $c$. Then $V$ gives a natural structure of an algebra over $\tilde{S}^c$ in the sense that $(V, C_0(V), C_0(V)^-)$ has a natural structure of a differentiable-meromorphic algebra over $\tilde{S}^c$.

Proof. We have a grading-restricted conformal open-string vertex algebra $V$ and a vertex operator algebra $C_0(V)$. Let $h^L, h^R : C_0(V) \to C_0(V)$ be the identity map. Then $h^L$ and $h^R$ are homomorphisms from $C_0(V)$ to the meromorphic center of $V$. Then Theorem 5.8 gives $(V, C_0(V), (C_0(V))^-)$ a natural structure of a differentiable-meromorphic algebra over $\tilde{S}^c$.
Acknowledgement. We would like to thank Jürgen Fuchs and Christoph Schweigert for helpful discussions and comments. We are also grateful to Jim Lepowsky for comments. The research of Y.-Z. H. is supported in part by NSF grant DMS-0070800.
References

[BPZ] Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetries in two-dimensional quantum field theory. Nucl. Phys. B241, 333–380 (1984)
[B] Borcherds, R.E.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[DMS] Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal field theory. Graduate Texts in Contemporary Physics. New York: Springer-Verlag, 1997
[DF] Dotsenko, V.S., Fateev, V.A.: Conformal algebra and multipoint correlation functions in 2d statistical models. Nucl. Phys. B240, 312–348 (1984)
[C1] Cardy, J.L.: Conformal invariance and surface critical behavior. Nucl. Phys. B240, 514–532 (1984)
[C2] Cardy, J.L.: Effect of boundary conditions on the operator content of two-dimensional conformally invariant theories. Nucl. Phys. B275, 200–218 (1986)
[C3] Cardy, J.L.: Boundary conditions, fusion rules and the Verlinde formula. Nucl. Phys. B324, 581–596 (1989)
[D] Douglas, M.R.: Dirichlet branes, homological mirror symmetry, and stability. In: Proceedings of the International Congress of Mathematicians, Vol. III (Beijing, 2002), Beijing: Higher Ed. Press, 2002, pp. 395–408
[FRW] Feingold, A.J., Ries, J.F.X., Weiner, M.D.: Spinor construction of the c = 1/2 minimal model. In: Moonshine, the Monster, and related topics. Proc. Joint Summer Research Conference, Mount Holyoke College, South Hadley, 1994, (eds.), C. Dong, G. Mason, Contemp. Math. 193, Providence, RI: Am. Math. Soc., 1996, pp. 45–92
[FFFS] Felder, G., Fröhlich, J., Fuchs, J., Schweigert, C.: Correlation functions and boundary conditions in rational conformal field theory and three-dimensional topology. Compositio Math. 131, 189–237 (2002)
[FHL] Frenkel, I.B., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Preprint, 1989; Mem. Amer. Math. Soc. 104, (1993)
[FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex operator algebras and the Monster. Pure and Appl. Math. 134, New York: Academic Press, 1988
[FRS1] Fuchs, J., Runkel, I., Schweigert, C.: Conformal correlation functions, Frobenius algebras and triangulations. Nucl. Phys. B624, 452–468 (2002)
[FRS2] Fuchs, J., Runkel, I., Schweigert, C.: TFT construction of RCFT correlators I: Partition functions. To appear; hep-th/0204148
[HK1] Hu, P., Kriz, I.: Conformal field theory and elliptic cohomology. To appear
[HK2] Hu, P., Kriz, I.: D-brane cohomology and elliptic cohomology. To appear
[H1] Huang, Y.-Z.: On the geometric interpretation of vertex operator algebras. Ph.D thesis, Rutgers University, 1990
[H2] Huang, Y.-Z.: Geometric interpretation of vertex operator algebras. Proc. Natl. Acad. Sci. USA 88, 9964–9968 (1991)
[H3] Huang, Y.-Z.: A theory of tensor products for module categories for a vertex operator algebra, IV. J. Pure Appl. Alg. 100, 173–216 (1995)
[H4] Huang, Y.-Z.: Virasoro vertex operator algebras, (nonmeromorphic) operator product expansion and the tensor product theory. J. Alg. 182, 201–234 (1996)
[H5] Huang, Y.-Z.: A nonmeromorphic extension of the moonshine module vertex operator algebra. In: Moonshine, the Monster and related topics, Proc. Joint Summer Research Conference, Mount Holyoke, 1994, (eds.), C. Dong, G. Mason, Contemp. Math., Vol. 193, Providence, RI: Am. Math. Soc., 1996, pp. 123–148
[H6] Huang, Y.-Z.: Two-dimensional conformal geometry and vertex operator algebras. Progress in Mathematics, Vol. 148, Boston: Birkhäuser, 1997
[H7] Huang, Y.-Z.: Generalized rationality and a “Jacobi identity” for intertwining operator algebras. Selecta Math. (N.S.) 6, 225–267 (2000)
[H8] Huang, Y.-Z.: Differential equations and intertwining operators. Duke Math. J., to appear; math.QA/0206206
[H9] Huang, Y.-Z.: Riemann surfaces with boundaries and the theory of vertex operator algebras. To appear; math.QA/0212308
[HKL] Huang, Y.-Z., Kirillov, A., Jr., Lepowsky, J.: Braided tensor categories and extensions of vertex operator algebras. To appear
[HL1] Huang, Y.-Z., Lepowsky, J.: Operadic formulation of the notion of vertex operator algebra. In: Mathematical Aspects of Conformal and Topological Field Theories and Quantum Groups, Proc. Joint Summer Research Conference, Mount Holyoke College, South Hadley, 1992, (eds.), P. Sally, M. Flato, J. Lepowsky, N. Reshetikhin, G. Zuckerman, Contemp. Math., Vol. 175, Providence, RI: Am. Math. Soc., 1994, pp. 131–148
[HL2] Huang, Y.-Z., Lepowsky, J.: Vertex operator algebras and operads. In: The Gelfand Mathematical Seminars, 1990–1992, (eds.), L. Corwin, I. Gelfand, J. Lepowsky, Boston: Birkhäuser, 1993, pp. 145–161
[HL3] Huang, Y.-Z., Lepowsky, J.: A theory of tensor products for module categories for a vertex operator algebra, I. Selecta Math. (N.S.) 1, 699–756 (1995)
[HL4] Huang, Y.-Z., Lepowsky, J.: A theory of tensor products for module categories for a vertex operator algebra, II. Selecta Math. (N.S.) 1, 757–786 (1995)
[HL5] Huang, Y.-Z., Lepowsky, J.: Tensor products of modules for a vertex operator algebra and vertex tensor categories. In: Lie Theory and Geometry, in honor of Bertram Kostant, (eds.), R. Brylinski, J.-L. Brylinski, V. Guillemin, V. Kac, Boston: Birkhäuser, 1994, pp. 349–383
[HL6] Huang, Y.-Z., Lepowsky, J.: A theory of tensor products for module categories for a vertex operator algebra, III. J. Pure Appl. Alg. 100, 141–171 (1995)
[K] Kong, L.: Ph. D. Thesis, Rutgers University, in preparation
[L] Lazaroiu, C.I.: On the structure of open-closed topological field theory in two dimensions. Nucl. Phys. B603, 497–530 (2001)
[M1] Moore, G.: Some comments on branes, G-flux, and K-theory. Int. J. Mod. Phys. A16, 936–944 (2001)
[M2] Moore, G.: D-branes, RR-Fields and K-Theory, I, II, III, VI. In: Lecture notes for the ITP miniprogram: The duality workshop: a Math/Physics collaboration, June, 2001; http://online.itp.ucsb.edu/online/mp01/moore1
[M3] Moore, G.: K-Theory from a physical perspective. To appear; hep-th/0304018
[O] Ostrik, V.: Module categories, weak Hopf algebras and modular invariants. Transform. Groups 8, 177–206 (2003)
[S1] Segal, G.: The definition of conformal field theory. In: Differential geometrical methods in theoretical physics (Como, 1987), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci. 250, Dordrecht: Kluwer Acad. Publ., 1988, pp. 165–171
[S2] Segal, G.B.: The definition of conformal field theory. Preprint, 1988
[S3] Segal, G.B.: Two-dimensional conformal field theories and modular functors. In: Proceedings of the IXth International Congress on Mathematical Physics, Swansea, 1988, Bristol: Hilger, 1989, pp. 22–37
[S4] Segal, G.: Topological structures in string theory. R. Soc. Lond. Philos. Trans. A359, 1389–1398 (2001)
[V] Voronov, A.A.: The Swiss-cheese operad. In: Homotopy invariant algebraic structures, in honor of J. Michael Boardman, Proc. of the AMS Special Session on Homotopy Theory, Baltimore, 1998, (eds.), J.-P. Meyer, J. Morava, W. S. Wilson, Contemp. Math., Vol. 239, Providence, RI: Am. Math. Soc., 1999, pp. 365–373
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. Int. Math. Res. Notices (in Duke Math. J.) 7, 197–211 (1993)
Communicated by M.R. Douglas
Commun. Math. Phys. 250, 473–505 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1086-7
Communications in
Mathematical Physics
Dispersion of Singularities of Solutions for Schrödinger Equations
Shin-ichi Doi
Department of Mathematics, Graduate School of Science, Osaka University, 1-1 Machikaneyama-cho, Toyonaka, Osaka 560-0043, Japan. E-mail:
[email protected] Received: 4 September 2003 / Accepted: 12 November 2003 Published online: 4 May 2004 – © Springer-Verlag 2004
Dedicated to Professor Mitsuru Ikawa on his sixtieth birthday Abstract: We consider the singularities of solutions for the Schr¨odinger evolution equation associated with H = − 21 + 21 Qx, x + W (x), where Q is a d × d real symmetric matrix with the eigenvalues λ1 , · · · , λd , and W ∈ C ∞ (Rd , R) satisfies W (x) = o(|x|2 ) as |x| → ∞. Under additional conditions, we show the dispersion of microlocal singularities of solutions due to the principal symbol 21 |ξ |2 + 21 Qx, x in all directions at time t ∈ = {0}∪ ∪λj >0 (π/ λj )Z and in the nondegenerate directions at t ∈ . We also show the weaker dispersion of microlocal singularities of solutions due to the subprincipal symbol W in the degenerate directions at t ∈ if W satisfies W (x) = O(|x|1+δ ) as |x| → ∞ for some 0 < δ < 1 and additional conditions. In particular, we prove the dispersion of microlocal singularities of solutions at resonant times when H is a perturbed harmonic oscillator. 1. Introduction This paper deals with the singularities of solutions for Schr¨odinger equations, especially for those associated with perturbed harmonic oscillators. Our aim is to clarify how the “subprincipal symbol” affects the singularities of solutions at resonant times. We first illustrate three aspects of singularities: stability, propagation, and dispersion. Example. Let H = − 21 + 21 ω2 |x|2 + axλ , x ∈ Rd . Here x = (1 + |x|2 )1/2 , ω > 0, a ∈ R, and λ < 2. Let K(t, x, y) be the distribution kernel of the propagator e−itH . (i) When a = 0, the Mehler formula gives √ d i −k ω cos ωt x·y 2 2 K(t, x, y) = √ (|x| + |y| ) − exp iω 2 sin ωt sin ωt 2πi| sin ωt| Partly supported by Grand-in-Aid forYoung Scientists (B) 14740110, Japan Society of the Promotion of Science; and Mathematical Sciences Research Institute in Berkeley
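for $k\pi/\omega < t < (k + 1)\pi/\omega$, $k \in \mathbb{Z}$; and at resonant times,

$$K(k\pi/\omega, x, y) = i^{-dk}\, \delta(x - (-1)^k y), \quad k \in \mathbb{Z}.$$

Hence

$$K(t, x, y) \in C^\infty \ \text{in}\ (t, x, y) \in \Big(\mathbb{R} \setminus \frac{\pi}{\omega}\mathbb{Z}\Big) \times \mathbb{R}^d \times \mathbb{R}^d; \tag{1.1}$$

$$WF\, u(k\pi/\omega) = \{(-1)^k (y, \eta);\ (y, \eta) \in WF\, u_0\}, \quad k \in \mathbb{Z}, \tag{1.2}$$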
for every $u_0 \in \mathcal{S}'(\mathbb{R}^d)$.

(ii) When $a \neq 0$, the property (1.1) still holds. In other words, the “principal symbol” $h_0 = \frac{1}{2}|\xi|^2 + \frac{1}{2}\omega^2|x|^2$ is dominant at nonresonant times $t \notin \frac{\pi}{\omega}\mathbb{Z}$.

(iii) When $a \neq 0$ and $\lambda < 1$, the property (1.2) also holds, and hence the principal symbol is dominant. This case implies the stability of singularities at resonant times.

(iv) When $a \neq 0$ and $\lambda = 1$, we have

$$WF\, u(k\pi/\omega) = \Big\{(-1)^k \Big(y + \frac{2ak}{\omega^2}\frac{\eta}{|\eta|},\ \eta\Big);\ (y, \eta) \in WF\, u_0\Big\}, \quad k \in \mathbb{Z}, \tag{1.3}$$

for every $u_0 \in \mathcal{S}'(\mathbb{R}^d)$. So the influence of the “subprincipal symbol” $a\langle x\rangle^\lambda$ appears. This case implies the propagation of singularities at resonant times.

(v) When $a \neq 0$ and $1 < \lambda < 2$, we have

$$K(k\pi/\omega, x, y) \in C^\infty \ \text{in}\ (x, y) \in \mathbb{R}^d \times \mathbb{R}^d, \quad k \in \mathbb{Z} \setminus \{0\}. \tag{1.4}$$

So the influence of the subprincipal symbol is dominant. This case implies the dispersion of singularities at resonant times.
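Case (i) of the Example can be checked numerically in a small sketch. The code below assumes $d = 1$, $a = 0$ and the branch $k = 0$ (that is, $0 < t < \pi/\omega$), writes the Mehler kernel in its standard one-dimensional form, and evaluates the residual of $i\partial_t K = (-\frac{1}{2}\partial_x^2 + \frac{1}{2}\omega^2 x^2)K$ by central finite differences. The function names, step sizes and sample point are choices made for this illustration, not part of the paper.

```python
import numpy as np


def mehler_kernel(t, x, y, omega):
    """1D Mehler kernel for 0 < t < pi/omega (branch k = 0, a = 0)."""
    s, c = np.sin(omega * t), np.cos(omega * t)
    prefactor = np.sqrt(omega / (2j * np.pi * s))
    phase = omega * ((x * x + y * y) * c / 2 - x * y) / s
    return prefactor * np.exp(1j * phase)


def schrodinger_residual(t, x, y, omega, ht=1e-4, hx=1e-3):
    """Finite-difference residual of i dK/dt - (-1/2 d2K/dx2 + 1/2 omega^2 x^2 K)."""
    K = mehler_kernel
    dK_dt = (K(t + ht, x, y, omega) - K(t - ht, x, y, omega)) / (2 * ht)
    d2K_dx2 = (K(t, x + hx, y, omega) - 2 * K(t, x, y, omega)
               + K(t, x - hx, y, omega)) / hx**2
    rhs = -0.5 * d2K_dx2 + 0.5 * omega**2 * x**2 * K(t, x, y, omega)
    return 1j * dK_dt - rhs


# Sample point with omega * t strictly between 0 and pi.
residual = abs(schrodinger_residual(t=0.7, x=0.4, y=-0.3, omega=1.3))
```

The residual is limited only by the finite-difference truncation error, while the prefactor $(\sin\omega t)^{-1/2}$ blows up as $t \to \pi/\omega$, in line with the kernel collapsing to a delta function at resonant times.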
This paper focuses on the dispersion of singularities of solutions and explain the role of the subprincipal symbol at resonant times. The next paper will discuss the propagation of singularities as well as another phenomenon, the creation of weaker singularities, at resonant times. Next we prepare some notation. m (R n )), m ∈ R, is the set of all a ∈ C ∞ (R n ) such The symbol space S m (Rn ) (resp. S+ n n that for every α ∈ N0 = (N ∪ {0}) , |∂zα a(z)| ≤ Cα zm−|α| , z ∈ Rn , resp. |∂zα a(z)| ≤ Cα zmax{m−|α|, 0} , z ∈ Rn . The Sobolev space B s (Rd ), s ∈ R, is defined as follows (cf. [4]). Let Hosc be the self-adjoint extension of the operator 1 − + |x|2 with domain C0∞ (Rd ). Then for s/2 every s ∈ R, Hosc is continuous on S(Rd ) and extends to a continuous linear operator s/2 1/2 on S (Rd ) (with the weak* topology), denoted also by Hosc . We set = Hosc and define B s (Rd ) = {u ∈ S (Rd ); s u ∈ L2 (Rd )}. The operator s is known to have the Weyl symbol σ ( s ) satisfying σ ( s ) − (1 + |x|2 + |ξ |2 )s/2 ∈ S s−2 (R2d ) (see [4]).
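The symbol estimates just defined can be probed on the model perturbation $\langle x\rangle^\lambda$ of the Example. The following sketch assumes $n = 1$ and uses the closed-form first and second derivatives of $\langle x\rangle^\lambda$; it computes the suprema of $|\partial^k a(x)|/\langle x\rangle^{\lambda - k}$ over a finite grid for $k = 0, 1, 2$, which stay bounded as the $S^\lambda$ estimates require. The function names and the grid are choices made for this illustration.

```python
import numpy as np


def bracket(x):
    """Japanese bracket <x> = (1 + x^2)^(1/2)."""
    return np.sqrt(1.0 + x * x)


def symbol_ratios(lam, x):
    """Sup over the grid of |d^k/dx^k <x>^lam| / <x>^(lam - k) for k = 0, 1, 2."""
    a0 = bracket(x) ** lam
    a1 = lam * x * bracket(x) ** (lam - 2)                          # first derivative
    a2 = lam * bracket(x) ** (lam - 4) * (1 + (lam - 1) * x * x)    # second derivative
    derivatives = (a0, a1, a2)
    return [float(np.max(np.abs(a) / bracket(x) ** (lam - k)))
            for k, a in enumerate(derivatives)]


grid = np.linspace(-1e4, 1e4, 200001)
ratios = symbol_ratios(1.5, grid)   # lam = 1.5 lies in the dispersive range 1 < lam < 2
```

For $\lambda = 1.5$ the three ratios are bounded by $1$, $|\lambda|$ and $|\lambda|\max\{1, |\lambda - 1|\}$ respectively, independently of how far the grid extends.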
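The Hamilton vector field of $h \in C^1(T^*\mathbb{R}^d)$ is

$$H_h = \sum_{j=1}^{d} \Big(\frac{\partial h}{\partial \xi_j}\frac{\partial}{\partial x_j} - \frac{\partial h}{\partial x_j}\frac{\partial}{\partial \xi_j}\Big).$$

The Poisson bracket of $f, g \in C^1(T^*\mathbb{R}^d)$ is $\{f, g\} = H_f g$. The Hamilton flow of $h \in C^1(T^*\mathbb{R}^d)$ is denoted by $e^{tH_h}$; $(x(t, y, \eta), \xi(t, y, \eta)) = e^{tH_h}(y, \eta)$ is the solution of the system of ordinary differential equations

$$\dot{x}_j(t) = \partial_{\xi_j} h(x(t), \xi(t)), \quad x_j(0) = y_j, \qquad \dot{\xi}_j(t) = -\partial_{x_j} h(x(t), \xi(t)), \quad \xi_j(0) = \eta_j \qquad (1 \le j \le d).$$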
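The Hamilton system above can be integrated numerically for the Hamiltonian of the Example, $h = \frac{1}{2}\xi^2 + \frac{1}{2}\omega^2 x^2 + a\langle x\rangle^\lambda$, as a rough check of the classical picture behind (1.2) and (1.3). The sketch below assumes $d = 1$ and uses a hand-written classical Runge-Kutta (RK4) scheme; the function name `hamilton_flow`, the step count and the initial data are choices made for this illustration. For $a = 0$ the flow at $t = \pi/\omega$ is $(y, \eta) \mapsto -(y, \eta)$; for $a \neq 0$, $\lambda = 1$ and large $\eta > 0$ the position is close to $-(y + 2a/\omega^2)$, matching the shift in (1.3) with $k = 1$.

```python
import numpy as np


def hamilton_flow(y, eta, t_final, omega=1.0, a=0.0, lam=1.0, steps=20000):
    """RK4 integration of x' = xi, xi' = -dh/dx for
    h = xi^2/2 + omega^2 x^2/2 + a <x>^lam in dimension d = 1."""
    def rhs(state):
        x, xi = state
        # dh/dx = omega^2 x + a * lam * x * <x>^(lam - 2)
        force = -omega**2 * x - a * lam * x * (1.0 + x * x) ** (lam / 2 - 1)
        return np.array([xi, force])

    state = np.array([y, eta], dtype=float)
    h = t_final / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * h * k1)
        k3 = rhs(state + 0.5 * h * k2)
        k4 = rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state


# Half a period (t = pi / omega with omega = 1), large initial momentum.
x_free, xi_free = hamilton_flow(0.3, 1000.0, np.pi)
x_pert, xi_pert = hamilton_flow(0.3, 1000.0, np.pi, a=1.0, lam=1.0)
```

The agreement in the perturbed case is only asymptotic: the deviation from $-(y + 2a/\omega^2)$ shrinks as $|\eta|$ grows, which is the regime relevant for wave front sets.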
Now we recall some related results. Let $H = -\frac{1}{2}\Delta + V(x)$ be the Schrödinger operator on $\mathbb{R}^d$ with $V \in C^\infty(\mathbb{R}^d, \mathbb{R})$. Under some condition, $H|_{C_0^\infty(\mathbb{R}^d)}$ is essentially self-adjoint. Let $H$ denote its closure by abuse of notation. The propagator $e^{-itH}$, defined first by the spectral theorem, can be extended to various continuous operators; if $V \in S^2_+(\mathbb{R}^d)$, then the mapping $X \ni \phi \mapsto e^{-itH}\phi \in C(\mathbb{R}, X)$ is continuous for $X = B^s(\mathbb{R}^d), \mathcal{S}(\mathbb{R}^d), \mathcal{S}'(\mathbb{R}^d)$ (cf. Sect. 6). Let $K(t, x, y)$ be the distribution kernel of $e^{-itH}$.

(i) If $V \in S^2_+(\mathbb{R}^d)$, then there exists $T > 0$ such that $K(t, x, y)$ is $C^\infty$ in $t, x, y$ when $0 < |t| < T$ (Fujiwara [3]). If in addition $\lim_{|x| \to \infty} |\nabla^2 V(x)| = 0$, then $K(t, x, y)$ is $C^\infty$ in $t, x, y$ when $t \neq 0$ (Yajima [11]; cf. Kapitanski and Rodnianski [6]). Forgetting the condition $V \in S^2_+(\mathbb{R}^d)$ now, assume $d = 1$, $V(x) \ge C(1 + |x|)^{2+\varepsilon}$ near infinity for some $\varepsilon > 0$ as well as other technical conditions. Then $K(t, x, y)$ is nowhere $C^1$ in $t, x, y$ ([11]). See also [2].

(ii) Let $V(x) = \frac{1}{2}\omega^2|x|^2 + W(x)$ with $\omega > 0$ and $W \in S^2_+(\mathbb{R}^d)$ such that $|\nabla^2 W(x)| = o(1)$ as $|x| \to \infty$. Then $K(t, x, y)$ is $C^\infty$ in $t, x, y$ when $t \notin (\pi/\omega)\mathbb{Z}$ (Kapitanski, Rodnianski and Yajima [7]). If in addition $W \in S^\lambda(\mathbb{R}^d)$ for some $\lambda < 1$, then

$$WF\, u(k\pi/\omega) = \{(-1)^k (y, \eta);\ (y, \eta) \in WF\, u_0\}, \quad k \in \mathbb{Z},$$

for every $u_0 \in \mathcal{S}'(\mathbb{R}^d)$. This was proved by Weinstein [9] and Zelditch [13] when $\lambda = 0$, by [7] when $u_0 = \delta(\cdot - y)$ (the Dirac measure at $y \in \mathbb{R}^d$) at the singular support level, and by Ōkaji [8] in the general case. In these cases, no influence of $W$ appears. See also Wunsch [10].

(iii) Let $V(x) = \frac{1}{2}\omega^2|x|^2 + W(x)$ with $\omega > 0$ and $W \in S^\lambda_+(\mathbb{R}^d)$ for some $1 < \lambda < 2$, and assume $C_1\langle x\rangle^{\lambda-2}\,\mathrm{Id} \le \nabla^2 W(x) \le C_2\langle x\rangle^{\lambda-2}\,\mathrm{Id}$ near infinity for some $C_1, C_2 > 0$. Then $K(k\pi/\omega, x, y)$ is $C^\infty$ in $x, y$ for every $k \in \mathbb{Z} \setminus \{0\}$ (Yajima [11]). To my knowledge, this is the only non-trivial result that shows the influence of $W$.

(iv) For other results, see Craig, Kappeler, and Strauss [1] and the references therein.

Last we explain the plan of this paper. Sections 2 and 3 describe the dispersion of singularities due to the principal symbol and the subprincipal symbol respectively. Section 4 recalls the Weyl calculus of pseudodifferential operators. Section 5 discusses the asymptotic behavior of the Hamilton flow as the momentum tends to infinity. Section 6 considers the Cauchy problem for the Schrödinger equations. Section 7 shows a lemma on the semiclassical limit of pseudodifferential operators. Sections 8 and 9 prove the results of Sects. 2 and 3 respectively. Section 10 collects other results.
S. Doi
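Notation. $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. $\mathbb{R}_+ = (0, \infty)$. $C^k(U, V)$ is the set of all $C^k$ maps from $U$ to $V$ ($k \in \mathbb{N}_0 \cup \{\infty\}$), and $C(U, V) = C^0(U, V)$; $V$ is omitted if $V = \mathbb{C}$. For locally convex spaces $E$ and $F$, $L(E, F)$ is the set of all continuous linear operators from $E$ to $F$, and $L(E) = L(E, E)$. $M_n(\mathbb{R})$ is the set of all $n \times n$ real matrices. The symbol $(\cdot, \cdot)$ denotes the inner product of $L^2(\mathbb{R}^d)$ or $L^2(\mathbb{R}^d, \mathbb{C}^n)$ by abuse of notation, and $\|\cdot\|$ the norm. For $v \in \mathbb{R}^n$, $\langle v\rangle = (1 + |v|^2)^{1/2}$.

2. Dispersion of Singularities due to the Principal Symbol

2.1. Preliminaries. Hereafter, we consider the Schrödinger operator on $\mathbb{R}^d$ in the following form:
$$H = -\frac{1}{2}\Delta + V(x) = -\frac{1}{2}\Delta + \frac{1}{2}\langle Qx, x\rangle + W(x),$$
where $Q$ is a $d \times d$ real symmetric matrix with the eigenvalues $\lambda_1, \dots, \lambda_d$, and $W \in S^2_+(\mathbb{R}^d)$ is real and satisfies $|\nabla W(x)| = o(|x|)$ as $|x| \to \infty$. The last condition is equivalent to $|W(x)| = o(|x|^2)$ as $|x| \to \infty$ when $W \in S^2_+(\mathbb{R}^d)$. We regard $W$ as a perturbation of $H_0 = -\frac{1}{2}\Delta + \frac{1}{2}\langle Qx, x\rangle$. Put $h_0(x, \xi) = \frac{1}{2}|\xi|^2 + \frac{1}{2}\langle Qx, x\rangle$ and $h(x, \xi) = h_0(x, \xi) + W(x)$. Write
$$e^{tH_{h_0}}(y, \eta) = (x^0(t, y, \eta), \xi^0(t, y, \eta)) = (A(t)y + B(t)\eta,\ C(t)y + D(t)\eta), \qquad e^{tH_h}(y, \eta) = (x(t, y, \eta), \xi(t, y, \eta)).$$
Here the $d \times d$ matrices $A(t)$, $B(t)$, $C(t)$, $D(t)$ are given by
$$\begin{pmatrix} A(t) & B(t) \\ C(t) & D(t) \end{pmatrix} = \exp t \begin{pmatrix} O & I \\ -Q & O \end{pmatrix}.$$
Hence
$$B(t) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} Q^k t^{2k+1},$$
$A(t) = D(t) = (d/dt)B(t)$, and $C(t) = -QB(t)$. Especially when $\langle Qx, x\rangle = \sum_{j=1}^{d} \lambda_j x_j^2$, we have
$$x_j^0(t, y, \eta) = \begin{cases} y_j + \eta_j t & \text{if } \lambda_j = 0; \\ y_j \cosh(\sqrt{-\lambda_j}\, t) + \eta_j \dfrac{\sinh(\sqrt{-\lambda_j}\, t)}{\sqrt{-\lambda_j}} & \text{if } \lambda_j < 0; \\ y_j \cos(\sqrt{\lambda_j}\, t) + \eta_j \dfrac{\sin(\sqrt{\lambda_j}\, t)}{\sqrt{\lambda_j}} & \text{if } \lambda_j > 0; \end{cases}$$
$$\xi_j^0(t, y, \eta) = \begin{cases} \eta_j & \text{if } \lambda_j = 0; \\ y_j \sqrt{-\lambda_j} \sinh(\sqrt{-\lambda_j}\, t) + \eta_j \cosh(\sqrt{-\lambda_j}\, t) & \text{if } \lambda_j < 0; \\ -y_j \sqrt{\lambda_j} \sin(\sqrt{\lambda_j}\, t) + \eta_j \cos(\sqrt{\lambda_j}\, t) & \text{if } \lambda_j > 0. \end{cases}$$

Remark. The matrix $B(t)$ is singular if and only if $t$ belongs to the set $\bigcup_{\lambda_j > 0} \frac{\pi}{\sqrt{\lambda_j}}\mathbb{Z} \cup \{0\}$. For nonzero $t$, $B(t) = O$ if and only if $\lambda_1, \dots, \lambda_d > 0$ and $t \in \bigcap_{j=1,\dots,d} \frac{\pi}{\sqrt{\lambda_j}}\mathbb{Z}$.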
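The block structure of the free flow can be checked numerically in the scalar case. The sketch below is illustrative only: it sums the Taylor series of $\exp t\begin{pmatrix} 0 & 1 \\ -\lambda & 0 \end{pmatrix}$ for $d = 1$ and compares the entry $B(t)$ with the closed forms listed above. The helper names `flow_matrix` and `B_closed` are assumptions introduced here.

```python
import math

def flow_matrix(lam, t, terms=60):
    """Taylor series of exp(t * [[0, 1], [-lam, 0]]); returns the entries (A, B, C, D)."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0      # current series term, starting at the identity
    A, B, C, D = a, b, c, d
    for k in range(1, terms):
        # next term = previous term * (t * M) / k with M = [[0, 1], [-lam, 0]]
        a, b, c, d = -lam * b * t / k, a * t / k, -lam * d * t / k, c * t / k
        A += a
        B += b
        C += c
        D += d
    return A, B, C, D

def B_closed(lam, t):
    """Closed form of B(t) for a single eigenvalue lam of Q."""
    if lam > 0:
        w = math.sqrt(lam)
        return math.sin(w * t) / w
    if lam < 0:
        m = math.sqrt(-lam)
        return math.sinh(m * t) / m
    return t

# For lam = 4 the first nonzero singular time of B is pi / sqrt(lam) = pi / 2.
B_at_resonance = flow_matrix(4.0, math.pi / 2)[1]
```

In this scalar setting $B(t)$ vanishes exactly at the multiples of $\pi/\sqrt{\lambda}$ when $\lambda > 0$, and only at $t = 0$ when $\lambda \le 0$, in line with the Remark.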
To state our results, we need a definition. Definition 2.1. For a ∈ S 0 (Rd ), denote by Char a the subset of Rd \ {0} whose complement consists of all x0 ∈ Rd \ {0} such that |a(x)| ≥ c on {x ∈ V ; |x| ≥ C} for a conic neighborhood V of x0 and constants c, C > 0. For a closed conic subset F of Rd \ {0}, denote by ML(F ) the set of all a ∈ S 0 (Rd ) such that 0 ≤ a ≤ 1 and a(x) = 1 on {x ∈ V ; |x| ≥ C} for a conic neighborhood V of F and a constant C > 0. Write ML(η0 ) = ML(R+ η0 ) for η0 ∈ Rd \ {0}. For an open conic set U ⊂ Rd \ {0}, denote by Scm (U ) the set of all a ∈ S m (Rd ) such that supp a ⊂ cone(K) := ∪t>0 tK for a compact set K ⊂ U . 2.2. Dispersion of singularities. Our first theorem concerns the dispersion of singularities in the direction η0 ∈ Rd \ KerB(−t0 ) around t = t0 . Theorem 2.2. Let t0 ∈ R and η0 ∈ Rd \ KerB(−t0 ), and let r > 0. Let u0 ∈ B s0 (Rd ) and u(t) = e−itH u0 ∈ C(Rt , B s0 (Rd )). Assume that xr a(x)u0 ∈ B s0 (Rd ) for some a ∈ ML(B(−t0 )η0 ). Then there exist an open conic neighborhood U of η0 and an interval I = [t0 − ε, t0 + ε] for some 0 < ε 1 such that for all b ∈ Sc0 (U ), x−r Dr b(D)u(t) ∈ C(I, B s0 (Rd )). Remark. Theorem 2.2 holds if U and I satisfy that B(−t)η = 0 and a ∈ ML(B(−t)η) for all η ∈ U and t ∈ I ; in particular, U and I are independent of r. In fact, suppose that U and I satisfy the condition above. For every η ∈ U and t ∈ I , take an open conic set U (η , t ) η and an interval I (η , t ) t determined by the theorem, because a ∈ ML(B(−t )η ). By compactness, for every η ∈ U , there exist t1 , . . . , tn ∈ I such that ∪nj=1 I (η , tj ) ⊃ I . Set U (η ) = ∩nj=1 U (η , tj ). Then the conclusion of the theorem holds for U (η ) and I . For every b ∈ Sc0 (U ) there exist η1 , . . . , ηn ∈ U such that b = b1 + · · · + bn with b ∈ Sc0 (U (ηj )). This completes the proof. Similar compactness argument explains on what U and I depend in other theorems.
Corollary 2.3. Assume that Q = ω2 I for some ω > 0. Let k ∈ Z, η0 ∈ Rd \ {0}, and r > 0. Let u0 ∈ B s0 (Rd ) and u(t) = e−itH u0 ∈ C(Rt , B s0 (Rd )). Assume that xr a(x)u0 ∈ B s0 (Rd ) for some a ∈ ML((−1)k+1 η0 ). Then there exists an open conic neighborhood U of η0 such that for all b ∈ Sc0 (U ), x−r Dr b(D)u(t) ∈ C((kπ/ω, (k + 1)π/ω), B s0 (Rd )). Corollary 2.4. Let r > 0, and let u0 ∈ B s0 (Rd ) and u(t) = e−itH u0 ∈ C(Rt , B s0 (Rd )). If xr u0 ∈ B s0 (Rd ), then x−r Dr u(t) ∈ C(R \ , B s0 (Rd )). Theorem 2.2 relates the direction and order of increase in regularity of the solution to those of decay of the initial data. The next theorem shows that this relation is sharp. Theorem 2.5. Let r1 , r > 0, t0 ∈ R, and η0 ∈ Rd \ {0}. Assume that for some f ∈ C0∞ (Rd ), not identically zero, b ∈ ML(η0 ), a ∈ S 0 (Rd ), and C > 0, the following estimate holds: s0 f (x)Dr b(D)u(t0 ) ≤ C s0 xr1 a(x)u0 + C s0 u0
(2.1)
for all u0 ∈ S(Rd ) with u(t) = e−itH u0 ∈ C(Rt , S(Rd )). Then r1 ≥ r. Moreover if r = r1 , then B(−t0 )η0 = 0 and B(−t0 )η0 ∈ / Char a.
To motivate the decay assumption of the initial data in the next section, we next consider in which direction the initial data should decay for the dispersion estimate in the direction η0 ∈ Ker B(−t0 ) \ {0} to hold uniformly in t ∈ (t0 , t0 + ε] or [t0 − ε, t0 ). Theorem 2.6. Assume that B(−t0 ) is singular. Let P1 be the d ×d orthogonal projection matrix onto Ker B(−t0 ) and set P2 = I −P1 . Let η0 ∈ Im P1 \{0}. Let r1 , r > 0. Assume that for some f ∈ C0∞ (Rd ), not identically zero, b ∈ ML(η0 ), a ∈ S 0 (Rd ), an interval I = (t0 , t0 + ε] (resp. [t0 − ε, t0 )) for some ε > 0, and M : I → R+ , the following estimate holds: s0 f (x)Dr b(D)u(t) ≤ M(t) s0 xr1 a(x)u0 + M(t) s0 u0 for all t ∈ I and u0 ∈ S(Rd ) with u(t) = e−itH u0 ∈ C(Rt , S(Rd )). Then r1 ≥ r. Moreover if r = r1 , then −B (−t0 )η0 +P2 η ∈ / Char a (resp. B (−t0 )η0 +P2 η ∈ / Char a) for every η ∈ Rd . Suggested by the necessary condition above, we prove that if the initial data decays in the direction −B (−t0 )η0 with respect to the variables P1 x, some uniform dispersion estimate holds when the influence of W is weak, which contrasts with the case of the next section. Theorem 2.7. Let t0 , P1 , and η0 be the same as in Theorem 2.6. Let r > 0, and let 0 (R d ) and u0 ∈ B s0 (Rd ) and u(t) = e−itH u0 ∈ C(Rt , B s0 (Rd )). Assume P1 ∇W ∈ S+ r s d P1 x a(P1 x)u0 ∈ B 0 (R ) for some a ∈ ML(−B (−t0 )η0 ) (resp. ML(B (−t0 )η0 )). Then there exist an open conic neighborhood U of η0 and an interval I = [t0 , t0 + ε] (resp. [t0 − ε, t0 ]) for some 0 < ε 1 such that for all b ∈ Sc0 (U ), r P1 x−r |t − t0 |P1 D b(P1 D)u(t) ∈ C(I, B s0 (Rd )). Remark. We can choose an open cone U0 , η0 ∈ U0 ⊂ U , such that for all b0 ∈ Sc0 (U0 ), r P1 x−r |t − t0 |D b0 (D)u(t) ∈ C(I, B s0 (Rd )).
3. Dispersion of Singularities due to the Subprincipal Symbol 3.1. General case. Let H be the Schr¨odinger operator in Sect. 2.1. Let t0 ∈ R \ {0} such that B(−t0 ) is singular. Let P1 be the d × d orthogonal projection matrix onto Ker B(−t0 ) and set P2 = I − P1 . Then A(t) = B (t), B(t), P1 , P2 are functions of the real symmetric matrix Q, and A(−t0 )|Im P1 is invertible on Im P1 . This section considers the dispersion of singularities in the direction η0 ∈ Im P1 \ {0} around t = t0 due to the perturbation W . The next section summarizes the results in the special case of the perturbed isotropic harmonic oscillators. We consider two conditions on W . 1+δ (W1) W ∈ S+ (Rd ) for some 0 < δ < 1; |∇ 2 W (x)| = o(1) as |x| → ∞. (W2) There exist F1 , . . . , Fd ∈ C(Rd \ {0}, R), homogeneous of degree δ (where δ is the constant in (W1)), such that with F = (F1 , . . . , Fd ),
$$\lim_{|x| \to \infty} |\nabla W(x) - F(x)| / |x|^\delta = 0.$$
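We define $\phi, \phi_\infty : \mathbb{R} \times (\mathbb{R}^d \setminus \{0\}) \to \mathbb{R}^d$ by
$$\phi(t, \eta) = \int_0^t B(\tau) \nabla W(B(\tau)\eta)\, d\tau, \qquad t \in \mathbb{R},\ \eta \in \mathbb{R}^d \setminus 0;$$
$$\phi_\infty(t, \eta) = \int_0^t B(\tau) F(B(\tau)\eta)\, d\tau, \qquad t \in \mathbb{R},\ \eta \in \mathbb{R}^d \setminus 0,$$
when (W2) is assumed.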
Remark. Assume (W1) and (W2). Then φ(t, η) = φ∞ (t, η) + o(|η|δ ) as |η| → ∞ uniformly on every compact time interval.
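By direct calculation, we have

Lemma 3.1. Assume (W1) and (W2). If $P_1 Q = \omega^2 P_1$ for some $\omega > 0$, then $t_0 \in (\pi/\omega)\mathbb{Z}$ and for every $\eta \in \mathrm{Im}\, P_1 \setminus \{0\}$,
$$P_1 \phi_\infty(k\pi/\omega, \eta) = \begin{cases} C_{\delta,\omega}\, n P_1 \big(F(\eta) - F(-\eta)\big) & \text{if } k = 2n,\ n \in \mathbb{Z}; \\ C_{\delta,\omega}\, P_1 \big(n(F(\eta) - F(-\eta)) + F(\eta)\big) & \text{if } k = 2n + 1,\ n \in \mathbb{Z}. \end{cases}$$
Here $C_{\delta,\omega} = \int_0^\pi |\sin t|^{1+\delta}\, dt / \omega^{2+\delta}$.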
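The formula of Lemma 3.1 can be tested numerically in the simplest setting $d = 1$, $Q = \omega^2$ (so that $P_1 = I$ and $B(\tau) = \sin(\omega\tau)/\omega$). The sketch below is an illustration under these assumptions, not the paper's argument: the constant $C_{\delta,\omega}$ is evaluated through the standard identity $\int_0^\pi \sin^p t\, dt = \sqrt{\pi}\,\Gamma(\frac{p+1}{2})/\Gamma(\frac{p}{2}+1)$, and $\phi_\infty$ is approximated by a midpoint rule. The helper names and the sample function `F` are hypothetical.

```python
import math

def c_delta_omega(delta, omega):
    """C_{delta,omega} = int_0^pi |sin t|^(1+delta) dt / omega^(2+delta), via Gamma functions."""
    integral = math.sqrt(math.pi) * math.gamma((2 + delta) / 2) / math.gamma((3 + delta) / 2)
    return integral / omega ** (2 + delta)

def make_F(c_plus, c_minus, delta):
    """A hypothetical F on R \\ {0}, positively homogeneous of degree delta."""
    def F(x):
        if x > 0:
            return c_plus * x ** delta
        if x < 0:
            return c_minus * (-x) ** delta
        return 0.0
    return F

def phi_inf(k, eta, F, omega, n=50000):
    """Midpoint rule for int_0^{k pi/omega} B(tau) F(B(tau) eta) dtau, B = sin(omega tau)/omega."""
    T = k * math.pi / omega
    dt = T / n                      # negative when k < 0, which orients the integral
    total = 0.0
    for i in range(n):
        b = math.sin(omega * (i + 0.5) * dt) / omega
        total += b * F(b * eta)
    return total * dt

def theta(k, eta, F, delta, omega):
    """Right-hand side of the lemma for d = 1."""
    C = c_delta_omega(delta, omega)
    n, odd = divmod(k, 2)           # k = 2n or k = 2n + 1, also for negative k
    even_part = n * (F(eta) - F(-eta))
    return C * (even_part + F(eta)) if odd else C * even_part
```

For example, with `F = make_F(2.0, -0.7, 0.5)` and `omega = 1.3`, the values `phi_inf(k, eta, F, 1.3)` and `theta(k, eta, F, 0.5, 1.3)` should agree up to quadrature error for small integers `k`.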
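First we state a local dispersion theorem.

Theorem 3.2. Assume (W1). Assume that
$$|P_1 \phi(-t_0, \eta)| \ge c|\eta|^\delta \quad \text{if } \eta \in \mathrm{Im}\, P_1 \text{ and } |\eta| \ge C \tag{3.1}$$
for some $c > 0$ and $C > 0$. Let $u_0 \in B^{s_0}(\mathbb{R}^d)$ and $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, B^{s_0}(\mathbb{R}^d))$, and let $r > 0$. Assume $\langle x\rangle^r u_0 \in B^{s_0}(\mathbb{R}^d)$. Then
$$\langle x\rangle^{-r} \big( \langle P_1 D\rangle^\delta + \langle P_2 D\rangle \big)^r u(t_0) \in B^{s_0}(\mathbb{R}^d).$$

We give two sufficient conditions for (3.1).

Lemma 3.3. Assume (W1). Assume that for some $c_0 > 0$ and $C_0 > 0$,
$$P_1 \nabla^2 W(x) P_1 \ge c_0 |x|^{\delta-1} P_1 \quad \text{if } x \in \mathrm{Im}\, P_1 \text{ and } |x| \ge C_0. \tag{3.2}$$
Then (3.1) holds.

Lemma 3.4. Assume (W1) and (W2). If $P_1 \phi_\infty(-t_0, \eta) \ne 0$ for all $\eta \in \mathrm{Im}\, P_1 \setminus \{0\}$, then (3.1) holds.

Now we state our microlocal dispersion theorems at $t = t_0$.

Theorem 3.5. Let $\eta_0 \in \mathrm{Im}\, P_1 \setminus \{0\}$. Assume (W1), (W2), and $\tilde{\eta}_0 = P_1 \phi_\infty(-t_0, \eta_0) \ne 0$. Let $u_0 \in B^{s_0}(\mathbb{R}^d)$ and $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, B^{s_0}(\mathbb{R}^d))$, and let $r > 0$. Assume $\langle P_1 x\rangle^r a(P_1 x) u_0 \in B^{s_0}(\mathbb{R}^d)$ for some $a \in ML(A(-t_0)\tilde{\eta}_0)$. Then there exists an open conic neighborhood $U$ of $\eta_0$ such that for all $b \in S^0_c(U)$,
$$\langle x\rangle^{-r} \langle D\rangle^{\delta r} b(D) u(t_0) \in B^{s_0}(\mathbb{R}^d).$$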
Remark. When P2 = 0, there are variants of theorems depending on what kind of decay of the initial data we assume. For example, assume a stronger decay of u0 : xr a(x)u0 ∈ B s0 (Rd ) for some a ∈ ML(A(−t0 )F ), where F is the minimal closed, convex cone in Rd \ {0} containing η˜ 0 and Im P2 \ {0}. Then there exists an open conic neighborhood U of η0 such that for all b ∈ Sc0 (U ), r x−r P1 Dδ + P2 D b(D)u(t0 ) ∈ B s0 (Rd ).
Theorem 3.6. Assume (W1) and (W2). Let η0 ∈ Im P1 \ {0} and r, r1 > 0. Assume that for some f ∈ C0∞ (Rd ), not identically zero, b ∈ ML(η0 ), a ∈ S 0 (Rd ), and C > 0, the following estimate holds: s0 f (x)Dδr b(D)u(t0 ) ≤ C s0 xr1 a(x)u0 + C s0 u0 for all u0 ∈ S(Rd ) with u(t) = e−itH u0 ∈ C(Rt , S(Rd )). Then r ≤ r1 . Moreover if r = r1 , then P1 φ∞ (−t0 , η0 ) = 0 and Char a ∩ (A(−t0 )P1 φ∞ (−t0 , η0 ) + Im P2 ) = ∅. Next we state uniform dispersion theorems near t = t0 : Theorem 3.7. Let η0 ∈ Im P1 \{0}. Assume (W 1), (W 2), and η˜ 0 = P1 φ∞ (−t0 , η0 ) = 0. Consider each of the cases (i) and (ii): / R+ η0 , let be the minimal closed, convex conic subset of Rd \ {0} (i) When η˜ 0 ∈ containing η˜ 0 and −η0 , and let Iε = [t0 , t0 + ε] for ε > 0. (ii) When η˜0 ∈ / −R+ η0 , let be the minimal closed, convex conic subset of Rd \ {0} containing η˜ 0 and η0 , and let Iε = [t0 − ε, t0 ] for ε > 0. Let u0 ∈ B s0 (Rd ) and u(t) = e−itH u0 ∈ C(Rt , B s0 (Rd )), and let r > 0. Assume P1 xr a(P1 x)u0 ∈ B s0 (Rd ) for some a ∈ ML(A(−t0 )). Then there exist an open conic neighborhood U of η0 and ε > 0 such that for all b ∈ Sc0 (U ), r x−r Dδ + |t − t0 |D b(D)u(t) ∈ C(Iε , B s0 (Rd )). Remark. Assume a stronger decay of u0 : xr a(x)u0 ∈ B s0 (Rd ) for some a ∈ ML(A(−t0 )F ), where F is the minimal closed, convex cone in Rd \ {0} containing and Im P2 \ {0}. Then there exist an open conic neighborhood U of η0 and ε > 0 such that for all b ∈ Sc0 (U ), r x−r P1 Dδ + |t − t0 |P1 D + P2 D b(D)u(t) ∈ B(Iε , B s0 (Rd )).
Theorem 3.8. Assume (W1) and (W2). Let η0 ∈ Rd \ {0} and r, r1 > 0. Assume that for some f ∈ C0∞ (Rd ), not identically zero, b ∈ ML(η0 ), a ∈ S 0 (Rd ), C > 0, and Iε = [t0 , t0 + ε] with ε > 0, the following estimates hold: r sup s0 f (x) Dδ + |t − t0 |D b(D)u(t) ≤ C s0 xr1 a(x)u0 + C s0 u0 t∈Iε
for all u0 ∈ S(Rd ) with u(t) = e−itH u0 ∈ C(Rt , S(Rd )). Then r ≤ r1 . Moreover if r = r1 , then η˜ 0 = P1 φ∞ (−t0 , η0 ) = 0, η˜ 0 ∈ / R+ η0 , and Char a ∩ (A(−t0 ) + Im P2 ) = ∅, where is the minimal closed, convex conic subset of Rd \ {0} containing η˜ 0 and −η0 , Similarly, assume the same conditions as above except that Iε is now defined as Iε = [t0 − ε, t0 ]. Then r ≤ r1 . Moreover if r = r1 , then η˜ 0 = P1 φ∞ (−t0 , η0 ) = 0, −η˜ 0 ∈ / R+ η0 , and Char a ∩ (A(−t0 ) + Im P2 ) = ∅, where is the minimal closed, convex conic subset of Rd \ {0} containing η˜ 0 and η0 .
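3.2. Special case: isotropic harmonic oscillators. Let $H$ be the Schrödinger operator in Sect. 2.1 with $Q = \omega^2 I$ for some $\omega > 0$. Then
$$\phi(t, \eta) = \int_0^t \frac{\sin(\omega\tau)}{\omega} \nabla W\Big( \frac{\sin(\omega\tau)}{\omega} \eta \Big)\, d\tau, \qquad t \in \mathbb{R},\ \eta \in \mathbb{R}^d \setminus 0.$$
Moreover, the explicit calculation of $\phi_\infty(k\pi/\omega, \eta)$ gives

Lemma 3.9. Assume (W1) and (W2). Define $\theta_k \in C(\mathbb{R}^d \setminus \{0\}, \mathbb{R}^d)$ by
$$\theta_k(\eta) = \begin{cases} C_{\delta,\omega}\, n\big(F(\eta) - F(-\eta)\big) & \text{if } k = 2n,\ n \in \mathbb{Z}; \\ C_{\delta,\omega} \big( n(F(\eta) - F(-\eta)) + F(\eta) \big) & \text{if } k = 2n + 1,\ n \in \mathbb{Z}. \end{cases}$$
Here $C_{\delta,\omega} = \int_0^\pi |\sin t|^{1+\delta}\, dt / \omega^{2+\delta}$. Then for every $k \in \mathbb{Z}$, $\phi_\infty(k\pi/\omega, \eta) = \theta_k(\eta)$; hence $\phi(k\pi/\omega, \eta) = \theta_k(\eta) + o(|\eta|^\delta)$ as $|\eta| \to \infty$.

First we state a local dispersion theorem.

Theorem 3.10. Assume (W1). Assume that $k \in \mathbb{Z}$ satisfies
$$|\phi(-k\pi/\omega, \eta)| \ge c|\eta|^\delta \quad \text{if } |\eta| \ge C \tag{3.3}$$
for some $c > 0$ and $C > 0$. Let $u_0 \in B^{s_0}(\mathbb{R}^d)$ and $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, B^{s_0}(\mathbb{R}^d))$, and let $r > 0$. Assume $\langle x\rangle^r u_0 \in B^{s_0}(\mathbb{R}^d)$. Then $\langle x\rangle^{-r} \langle D\rangle^{\delta r} u(k\pi/\omega) \in B^{s_0}(\mathbb{R}^d)$.

We give sufficient conditions for (3.3).

Lemma 3.11. Assume (W1). Assume that for some $c_0 > 0$ and $C_0 > 0$,
$$\nabla^2 W(x) \ge c_0 |x|^{\delta-1} I \quad \text{if } |x| \ge C_0. \tag{3.4}$$
Then there exist $c > 0$ and $C > 0$ such that $|\phi(k\pi/\omega, \eta)| \ge c|k||\eta|^\delta - C|k|$ for all $k \in \mathbb{Z}$, $\eta \in \mathbb{R}^d$.

Lemma 3.12. Assume (W1) and (W2). Let $k \in \mathbb{Z}$ satisfy $\theta_{-k}(\eta) \ne 0$ for all $\eta \ne 0$. Then (3.3) holds.

Now we state our microlocal dispersion theorems at resonant times.

Theorem 3.13. Assume (W1) and (W2). Let $k \in \mathbb{Z}$ and $\eta_0 \in \mathbb{R}^d \setminus \{0\}$ such that $\tilde{\eta}_0 = \theta_{-k}(\eta_0) \ne 0$. Let $u_0 \in B^{s_0}(\mathbb{R}^d)$ and $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, B^{s_0}(\mathbb{R}^d))$, and let $r > 0$. Assume $\langle x\rangle^r a(x) u_0 \in B^{s_0}(\mathbb{R}^d)$ for some $a \in ML((-1)^k \tilde{\eta}_0)$. Then there exists an open conic neighborhood $U$ of $\eta_0$ such that for all $b \in S^0_c(U)$, $\langle x\rangle^{-r} \langle D\rangle^{\delta r} b(D) u(k\pi/\omega) \in B^{s_0}(\mathbb{R}^d)$.

Theorem 3.14. Assume (W1) and (W2). Let $k \in \mathbb{Z} \setminus \{0\}$, $r, r_1 > 0$, and $\eta_0 \in \mathbb{R}^d \setminus \{0\}$. Assume that for some $f \in C_0^\infty(\mathbb{R}^d)$, not identically zero, $b \in ML(\eta_0)$, $a \in S^0(\mathbb{R}^d)$, and $C > 0$, the following estimate holds:
$$\|\Lambda^{s_0} f(x) \langle D\rangle^{\delta r} b(D) u(k\pi/\omega)\| \le C \|\Lambda^{s_0} \langle x\rangle^{r_1} a(x) u_0\| + C \|\Lambda^{s_0} u_0\|$$
for all $u_0 \in \mathcal{S}(\mathbb{R}^d)$ with $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, \mathcal{S}(\mathbb{R}^d))$. Then $r \le r_1$. Moreover if $r = r_1$, then $\theta_{-k}(\eta_0) \ne 0$ and $(-1)^k \theta_{-k}(\eta_0) \notin \mathrm{Char}\, a$.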
We consider uniform estimates of the solution near resonant times. Theorem 3.15. Assume (W1) and (W2). Let k ∈ Z and η0 ∈ Rd \ {0} such that η˜ 0 = θ−k (η0 ) = 0. Consider each of the cases (i) and (ii): / R+ η0 , let be the minimal closed, convex conic subset of Rd \ {0} (i) When η˜ 0 ∈ containing η˜ 0 and −η0 , and let Iε = [kπ/ω, kπ/ω + ε] for ε > 0. (ii) When η˜0 ∈ / −R+ η0 , let be the minimal closed, convex conic subset of Rd \ {0} containing η˜ 0 and η0 , and let Iε = [kπ/ω − ε, kπ/ω] for ε > 0. Let u0 ∈ B s0 (Rd ) and u(t) = e−itH u0 ∈ C(Rt , B s0 (Rd )), and let r > 0. Assume that xr a(x)u0 ∈ B s0 (Rd ) for some a ∈ ML((−1)k ). Then there exist an open conic neighborhood U of η0 and ε > 0 such that for all b ∈ Sc0 (U ), r x−r Dδ + |t − kπ/ω|D b(D)u(t) ∈ C(Iε , B s0 (Rd )). Theorem 3.16. Assume (W1) and (W2). Let k ∈ Z and η0 ∈ Rd \ {0}. Let r, r1 > 0. Assume that for some f ∈ C0∞ (Rd ), not identically zero, b ∈ ML(η0 ), a ∈ S 0 (Rd ), C > 0, and Iε = [kπ/ω, kπ/ω + ε] with some ε > 0, the following estimates holds: r sup s0 f (x) Dδ +|t − kπ/ω|D b(D)u(t) ≤ C s0 xr1 a(x)u0 +C s0 u0 t∈Iε
for all u0 ∈ S(Rd ) with u(t) = e−itH u0 ∈ C(Rt , S(Rd )). Then r ≤ r1 . Moreover if r = r1 , then η˜ 0 = θ−k (η0 ) = 0, η˜ 0 ∈ / R+ η0 , and Char a ∩ = ∅, where is the minimal closed, convex conic subset of Rd \ {0} containing η˜ 0 and −η0 . Similarly, assume the same conditions as above except that Iε is now defined as Iε = [kπ/ω − ε, kπ/ω]. Then r ≤ r1 . Moreover if r = r1 , then η˜ 0 = θ−k (η0 ) = 0, η˜ 0 ∈ / −R+ η0 , and Char a ∩ = ∅, where is the minimal closed, convex conic subset of Rd \ {0} containing η˜ 0 and η0 . 4. Weyl Calculus This section recalls the Weyl calculus associated with the flat metric (see H¨ormander [5, Chapters 18.4-6] for details). A positive function m : R2d → (0, ∞) is said to be a weight if there are C, N > 0 such that m(Y ) ≤ Cm(X)(1 + |X − Y |)N , X, Y ∈ R2d . For a weight m, the symbol space S(m) = S(m, |dX|2 ) is the set of all a ∈ C ∞ (R2d ) such that for every k ∈ N0 , sup |∂Xα a(X) |/m(X) < ∞. ak,S(m) = |α|≤k
It is a Fréchet space with seminorms $(\|\cdot\|_{k,S(m)})_{k=0,1,\dots}$. A sequence $(a_n)_{n=1,2,\dots}$ in $S(m)$ is said to converge to $a$ weakly in $S(m)$, or simply $a_n \to a$ weakly in $S(m)$, if $(a_n)$ is bounded in $S(m)$ and converges to $a$ in $C^\infty(\mathbb{R}^{2d})$. For $a \in \mathcal{S}'(\mathbb{R}^{2d})$, the Weyl quantization $a^w = a^w(x, D) \in L(\mathcal{S}(\mathbb{R}^d), \mathcal{S}'(\mathbb{R}^d))$ is defined by
$$a^w u(x) = \big(a^w(x, D)u\big)(x) = \frac{1}{(2\pi)^d} \iint a\Big(\frac{x+y}{2}, \xi\Big) e^{i(x-y)\cdot\xi} u(y)\, dy\, d\xi, \qquad u \in \mathcal{S}(\mathbb{R}^d),$$
where the integral is in the sense of temperate distribution. Then the correspondence $\mathrm{Op} : \mathcal{S}'(\mathbb{R}^{2d}) \ni a \mapsto \mathrm{Op}(a) = a^w \in L(\mathcal{S}(\mathbb{R}^d), \mathcal{S}'(\mathbb{R}^d))$ is an isomorphism. For $A \in L(\mathcal{S}(\mathbb{R}^d), \mathcal{S}'(\mathbb{R}^d))$, set $\sigma(A) = (\mathrm{Op})^{-1}(A)$, called the Weyl symbol of $A$. Let $\sigma$ be the canonical symplectic form on $\mathbb{R}^{2d} \cong \mathbb{R}^d \times (\mathbb{R}^d)'$:
$$\sigma(X, Y) = \xi \cdot y - \eta \cdot x, \qquad X = (x, \xi),\ Y = (y, \eta) \in \mathbb{R}^{2d}.$$
If $a_1, a_2 \in \mathcal{S}(\mathbb{R}^{2d})$, then $a_1^w a_2^w = (a_1 \# a_2)^w$ with
$$(a_1 \# a_2)(X) = \exp\Big( \frac{i\sigma(D_X, D_Y)}{2} \Big) a_1(X) a_2(Y) \Big|_{Y=X} = \sum_{j=0}^{N-1} \frac{1}{j!} \Big( \frac{i\sigma(D_X, D_Y)}{2} \Big)^j a_1(X) a_2(Y) \Big|_{Y=X} + r_N(a_1, a_2)(X);$$
$$r_N(a_1, a_2)(X) = \int_0^1 \frac{(1-\theta)^{N-1}}{(N-1)!} \exp\Big( \frac{i\theta\sigma(D_X, D_Y)}{2} \Big) \Big( \frac{i\sigma(D_X, D_Y)}{2} \Big)^N a_1(X) a_2(Y) \Big|_{Y=X}\, d\theta.$$
Here $N \in \mathbb{N}$. Set $r_0(a_1, a_2) = a_1 \# a_2$. Remark that
$$a_1 \# a_2 = a_1 a_2 + \{a_1, a_2\}/(2i) + r_2(a_1, a_2),$$
$$a_1 \# a_2 + a_2 \# a_1 = 2a_1 a_2 + r_2(a_1, a_2) + r_2(a_2, a_1),$$
$$a_1 \# a_2 - a_2 \# a_1 = r_1(a_1, a_2) - r_1(a_2, a_1) = \{a_1, a_2\}/i + r_3(a_1, a_2) - r_3(a_2, a_1).$$

Theorem 4.1. Let $m_1$ and $m_2$ be weights and $\theta \in [0, 1]$. Then the map
$$Q_\theta : \mathcal{S}(\mathbb{R}^{2d}) \times \mathcal{S}(\mathbb{R}^{2d}) \ni (a_1, a_2) \mapsto \exp\Big( \frac{i\theta\sigma(D_X, D_Y)}{2} \Big) a_1(X) a_2(Y) \Big|_{Y=X} \in \mathcal{S}(\mathbb{R}^{2d})$$
can be extended to a weakly continuous bilinear map from $S(m_1) \times S(m_2)$ to $S(m_1 m_2)$. Moreover, for every $j \in \mathbb{N}_0$ there are $C > 0$ and $k \in \mathbb{N}_0$ such that for all $(\theta, a_1, a_2) \in [0, 1] \times S(m_1) \times S(m_2)$,
$$\|Q_\theta(a_1, a_2)\|_{j,S(m_1 m_2)} \le C \|a_1\|_{k,S(m_1)} \|a_2\|_{k,S(m_2)}.$$

Proof. This follows from the proofs of Theorems 18.4.10 and 18.5.4 of [5].
Theorem 4.2 (cf. [5, Theorems 18.6.2 and 18.6.3]). (1) Let m be a weight. Then S(m) a → a w ∈ L(S(Rd )) (resp. L(S (Rd )) ) is continuous. Moreover, if an → a weakly in S(m), then anw u → a w u in S(Rd ) (resp. S (Rd ) ) for all u ∈ S(Rd ) (resp. S (Rd ) ). Here S (Rd ) is endowed with the weak* topology. (2) The map S(1) a → a w ∈ L(L2 (Rd )) is continuous. Moreover, if an → a weakly in S(1), then anw u → a w u in L2 (Rd ) for all u ∈ L2 (Rd ). (3) Let mj be a weight, and aj ∈ S(mj ) (j = 1, 2). Then a1w a2w = (a1 #a2 )w . The following observation is useful for obtaining better estimates of the remainder term of a symbol product.
Lemma 4.3. Let $m_1$ and $m_2$ be weights. If $(a_1, a_2) \in S(m_1) \times S(m_2)$ satisfies that
$$\sigma(D_X, D_Y)^N a_1(X) a_2(Y) = \sum_{k=1}^{n} a_{1,k}(X) a_{2,k}(Y)$$
with some $N \in \mathbb{N}_0$, weights $m_{j,k}$, and symbols $a_{j,k} \in S(m_{j,k})$, $j = 1, 2$, $k = 1, \dots, n$, then $r_N(a_1, a_2) \in S\big(\sum_{k=1}^{n} m_{1,k} m_{2,k}\big)$.

Proof. By Theorem 4.1, this lemma follows from
$$r_N(a_1, a_2)(X) = \frac{i^N}{(N-1)!\, 2^N} \sum_{k=1}^{n} \int_0^1 (1-\theta)^{N-1} Q_\theta(a_{1,k}, a_{2,k})(X)\, d\theta.$$
Lemma 4.4. If r ∈ S(1) satisfies r w < 1, then (1 + r w )−1 ∈ Op S(1). Proof. This follows from the Beals characterization of the operators in Op S(1).
In application, we shall use a parameter-dependent version of the calculus above. Let (mλ )λ∈ be a uniform family of weights in the sense that the constants C and N in the definition of weight can be chosen independent of λ ∈ . We say that aλ ∈ S(mλ ) uniformly in λ ∈ if supλ∈ aλ k,S(mλ ) < ∞ for every k ∈ N0 . Then all the statements in this section have the natural parameter dependent version. We introduce another symbol space S+ (m). For a weight m, S+ (m) is the set of all a ∈ S(m) such that for every α ∈ N02d , ∂ α a ∈ S(1 + m(X)X−|α| ). For a uniform family of weights (mλ )λ∈ , we say that aλ ∈ S+ (mλ ) uniformly in λ ∈ if ∂ α aλ ∈ S(1 + m(X)X−|α| ) uniformly in λ ∈ for every α ∈ N02d . Lemma 4.5. If aλ ∈ S+ (X) uniformly in λ ∈ and inf λ∈ aλ ≥ c0 for some c0 > 0, then (aλ )λ∈ is a uniform family of weights, and aλ ∈ S(aλ ) uniformly in λ ∈ . Proof. Observe only that aλ (Y ) ≤ aλ (X) + ∇aλ L∞ (R2d ) |X − Y | ≤ (1 + ∇aλ L∞ (R2d ) /c0 )aλ (X)(1 + |X − Y |),
X, Y ∈ R2d , λ ∈ .
Last, we define time dependent symbol classes. Definition 4.6. For a compact subset I ⊂ Rn and a uniform family of symbol spaces (St )t∈I , that is, St = S(mt ) (resp. S+ (mt )) for every t ∈ I with a uniform family of weights (mt )λ∈I , the space B(I, St ) consists of all p ∈ C(I, C ∞ (R2d )) such that p(t) ∈ St uniformly in t ∈ I . 5. Properties of Hamilton Flows 5.1. General properties of Hamilton flows. 2 (R 2d )) of real value. Let Lemma 5.1. Let I = [t1 , t2 ] (t1 < t2 ). Let p ∈ B(I, S+ ts be the (2-parameter) Hamilton flow associated with p; that is, let ts (y, η) = (x(t, s, y, η), ξ(t, s, y, η)) = (x(t), ξ(t)) be the solution of the system of ordinary differential equations
x˙j (t) = ∂ξj p(t, x(t), ξ(t)), xj (s) = yj , ξ˙j (t) = −∂xj p(t, x(t), ξ(t)), ξj (s) = ηj 1 (R 2d )). Then ts ∈ B(It × Is , S+
(1 ≤ j ≤ d).
Proof. By assumption, there exists C > 0 such that |∇ξ,−x p(t, X)| ≤ CX for every t ∈ I and X ∈ R2d ; hence the a priori estimate, ts (Y ) ≤ eC|t−s| Y , holds. Therefore ts (Y ) exists globally in t ∈ I and satisfies ts (Y ) ≤ eC|t−s| Y for every t, s ∈ I, Y ∈ R2d . The estimates of |∂Yα ts (Y )| can be shown by induction on |α|.
Lemma 5.2. Let I = [t1 , t2 ] (t1 < t2 ). Let ts be the (2-parameter) Hamilton flow 2 (R 2d )). associated with a real symbol p ∈ B(I, S+ (1) ts (x, ξ ) − ts (0, ξ ) ∈ B(It × Is , S+ (x)). (2) If |∇X p(t, X)| = o(|X|) as |X| → ∞ uniformly in t ∈ I , then |ts (X) − X| = o(|X|) as |X| → ∞ uniformly in t, s ∈ I . λ (R 2d )) for some 1 ≤ λ < 2, then (X)−X ∈ B(I ×I , S λ−1 (R 2d )). (3) If p ∈ B(I, S+ ts t s + 1 Proof. (1) Write ts (x, ξ ) − ts (0, ξ ) = 0 (x · ∇x ts )(θ x, ξ )dθ and use Lemma 5.1. (2) Since C −1 ≤ ts (X)/X ≤ C for some C > 0, we have |(∇ξ,−x p)(t, ts (X))| = o(|X|) as |X| → ∞ uniformly in t, s ∈ I . Therefore t |ts (X) − X| = (∇ξ,−x p)(τ, τ s (X))dτ = o(|X|) as |X| → ∞ s
uniformly in t, s ∈ I . t (3) Write ts (X) − X = s (∇ξ,−x p)(τ, τ s (X))dτ and use Lemma 5.1.
Lemma 5.3. Let I = [t1 , t2 ] 0 (t1 < t2 ). Let p0 , p1 ∈ C(I, C ∞ (R2d )) be real-valued. Let φts and ψts be the (2-parameter) Hamilton flows associated with p = p0 + p1 and p0 respectively, which are assumed to exist globally in t, s ∈ I . Let ts be the (2-parameter) Hamilton flow associated with p˜ 1 (t, ·) = p1 (t, ψt0 (·)). Then ts = ψ0t ◦φts ◦ψs0 . In particular, φt0 = ψt0 ◦ t0 . 2 (R 2d ) are real-valued and Remark. This lemma is used when p = p0 +p1 , p0 , p1 ∈ S+ time-independent. Then the (2-parameter) Hamilton flow ts associated with p1 ◦ etHp0 satisfies ts = e−tHp0 ◦ e(t−s)Hp ◦ esHp0 . In particular, etHp = etHp0 ◦ t0 .
Proof. Set ts = ψ0t ◦ φts ◦ ψs0 . For every f ∈ C ∞ (R2d ), d d f ◦ t0 = f ◦ ψ0t ◦ φt0 dt dt = Hp(t) (f ◦ ψ0t ) ◦ φt0 − Hp0 (t) (f ◦ ψ0t ) ◦ φt0 = Hp1 (t) (f ◦ ψ0t ) ◦ φt0 = {p˜ 1 (t), f } ◦ t0 . Here t is regarded as a parameter. By the uniqueness of the solution for the Hamil−1 = ton system associated with p˜ 1 (t), we obtain t0 = t0 . Hence ts = t0 ◦ s0 −1 t0 ◦ s0 = ts .
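The factorization of Lemma 5.3 lends itself to a quick numerical check. The following sketch is illustrative and assumes $d = 1$, $p_0 = \frac12(\xi^2 + x^2)$, and the sample perturbation $W(x) = \sqrt{1 + x^2}$ (a choice made here, not taken from the paper). It integrates the full flow of $p_0 + W$ directly, and separately integrates the flow of the time-dependent Hamiltonian $W(\cos t \cdot y + \sin t \cdot \eta)$ and then applies the free rotation, so the two results can be compared.

```python
import math

def rk4(rhs, state, t0, t1, steps=4000):
    """Classical RK4 for a planar, possibly time-dependent system state' = rhs(t, x, xi)."""
    x, xi = state
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = rhs(t, x, xi)
        k2 = rhs(t + dt / 2, x + dt / 2 * k1[0], xi + dt / 2 * k1[1])
        k3 = rhs(t + dt / 2, x + dt / 2 * k2[0], xi + dt / 2 * k2[1])
        k4 = rhs(t + dt, x + dt * k3[0], xi + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        xi += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t += dt
    return x, xi

def dW(x):
    """Derivative of the illustrative perturbation W(x) = sqrt(1 + x^2)."""
    return x / math.sqrt(1 + x * x)

def full_rhs(t, x, xi):
    """Hamilton system of p = (xi^2 + x^2)/2 + W(x)."""
    return xi, -x - dW(x)

def reduced_rhs(t, y, eta):
    """Hamilton system of the pulled-back perturbation W(cos(t) y + sin(t) eta)."""
    g = dW(math.cos(t) * y + math.sin(t) * eta)
    return math.sin(t) * g, -math.cos(t) * g

def free_flow(t, y, eta):
    """Flow of p0 = (xi^2 + x^2)/2, i.e. a rotation in phase space."""
    return (math.cos(t) * y + math.sin(t) * eta,
            -math.sin(t) * y + math.cos(t) * eta)

T = 2.5
start = (0.8, -1.1)
direct = rk4(full_rhs, start, 0.0, T)
composed = free_flow(T, *rk4(reduced_rhs, start, 0.0, T))
```

Up to integration error, `direct` and `composed` should coincide, which is the statement $e^{tH_p} = e^{tH_{p_0}}$ composed with the flow of $p_1 \circ e^{tH_{p_0}}$ in this example.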
5.2. Asymptotics of Hamilton flows I. We shall adopt the notation and assumption of Sect. 2.1. Let ts be the (2-parameter) Hamilton flow associated with h1 (t, ·) = W ◦ π ◦ etH0 , where π : R2d X = (x, ξ ) → x ∈ Rd . Put ts (y, η) = (x(t, ˜ s, y, η), ξ˜ (t, s, y, η)). Then h1 (t, y, η) = W (A(t)y + B(t)η) and by Lemma 5.3 x(t, y, η) = A(t)x(t, ˜ 0, y, η) + B(t)ξ˜ (t, 0, y, η)).
(5.1)
Lemma 5.4. Let I be a compact interval. (1) x(t, y, η) = B(t)η + f1 (t, y, η) + f2 (t, η) with f1 ∈ B(I, S+ (y)) and f2 (t, η) = o(|η|) as |η| → ∞ uniformly in t ∈ I . (2) Let P be a d × d matrix such that [P , Q] = O. If P ∇W ∈ S(1), then P x(t, y, η) = A(t)P y + B(t)P η + f3 (t, y, η) with f3 ∈ B(I, S(1)). (3) Let t0 ∈ I . Let P1 be the d × d orthogonal projection matrix onto Ker B(−t0 ) and 1 (R d ), then set P2 = I − P1 . Define B0 = −B (−t0 )P1 + B(−t0 )P2 . If W ∈ S+ x(−t, y, η) = (B0 + R(t))((t − t0 )P1 + P2 )η + A(−t)y + f4 (t, y, η) with f4 ∈ B(I, S(1)) and R(t) ∈ C ∞ (I, Md (R)), R(t0 ) = 0. Proof. (1) Use Lemma 5.2 (1) and (2) to estimate f1 (t, y, η) = x(t, y, η) − x(t, 0, η) and f2 (t, η) = A(t)x(t, ˜ 0, 0, η) + B(t)(ξ˜ (t, 0, 0, η) − η) respectively. (2) Since P ∇W ∈ S(1), Lemma 5.1 gives
$$P\tilde{x}(t, 0, y, \eta) - Py = \int_0^t B(\tau) P(\nabla W)(x(\tau, y, \eta))\, d\tau \in B(I_t, S(1)),$$
$$P\tilde{\xi}(t, 0, y, \eta) - P\eta = -\int_0^t A(\tau) P(\nabla W)(x(\tau, y, \eta))\, d\tau \in B(I_t, S(1)).$$
Then (5.1) implies $f_3(t, y, \eta) \in B(I, S(1))$.

(3) Since $B(-t)P_1 = (B(-t) - B(-t_0))P_1 = -B'(-t_0)(t - t_0)P_1 + O(|t - t_0|^2)P_1$ as $t \to t_0$, we have $B(-t) = (B_0 + R(t))\big((t - t_0)P_1 + P_2\big)$ with $R(t) \in C^\infty(I, M_d(\mathbb{R}))$ satisfying $R(t_0) = 0$. Applying (2) with $P = 1$, we can complete the proof.
5.3. Asymptotics of Hamilton flows II. We adopt the notation of Sect. 5.2 and assume in addition

(W1) $W \in S^{1+\delta}_+(\mathbb{R}^d)$ $(0 < \delta < 1)$, $|\nabla^2 W(x)| = o(1)$ as $|x| \to \infty$.

Then by Lemma 5.2, $\tilde{x}(t, s, 0, \eta),\ \tilde{\xi}(t, s, 0, \eta) - \eta \in B(I_t \times I_s, S_+(\langle\eta\rangle^\delta))$ for every compact interval $I$. We shall discuss the asymptotics of $x(t, y, \eta)$ when $|\eta| \to \infty$.
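Lemma 5.5. Set
$$\tilde{x}(t, s, 0, \eta) = \int_s^t B(\tau) \nabla W(B(\tau)\eta)\, d\tau + r_1(t, s, \eta),$$
$$\tilde{\xi}(t, s, 0, \eta) = \eta - \int_s^t A(\tau) \nabla W(B(\tau)\eta)\, d\tau + r_2(t, s, \eta).$$
Then $|r_j(t, s, \eta)| = o(|\eta|^\delta)$ as $|\eta| \to \infty$ uniformly on $t, s \in I$ for every compact interval $I \subset \mathbb{R}$, $j = 1, 2$.

Proof. Since the proof is similar, we consider only the estimate of $\tilde{x}(t, s, 0, \eta)$. Let $I$ be a compact interval in $\mathbb{R}$. By the definition of $(\tilde{x}(t, s, y, \eta), \tilde{\xi}(t, s, y, \eta))$,
$$\tilde{x}(t, s, 0, \eta) = \int_s^t B(\tau) \nabla W\big( A(\tau)\tilde{x}(\tau, s, 0, \eta) + B(\tau)\tilde{\xi}(\tau, s, 0, \eta) \big)\, d\tau = \int_s^t B(\tau) \nabla W(B(\tau)\eta + a(\tau, s, \eta))\, d\tau,$$
where $a(t, s, \eta) = A(t)\tilde{x}(t, s, 0, \eta) + B(t)\big(\tilde{\xi}(t, s, 0, \eta) - \eta\big)$. By Lemma 5.2 (3), $|a(t, s, \eta)| \le C_0 \langle\eta\rangle^\delta$ for some $C_0 > 0$ independent of $t, s \in I$, $\eta \in \mathbb{R}^d$. Set $K = [s, t]$ if $s \le t$, and $K = [t, s]$ if $s \ge t$. Set $I_\eta = \{\tau \in K;\ |B(\tau)\eta| \le 2C_0 \langle\eta\rangle^\delta\}$ and $II_\eta = K \setminus I_\eta$. Then we have
$$\int_{I_\eta} B(\tau) \nabla W(B(\tau)\eta)\, d\tau = O(\langle\eta\rangle^{\delta^2}), \qquad \int_{I_\eta} B(\tau) \nabla W(B(\tau)\eta + a(\tau, s, \eta))\, d\tau = O(\langle\eta\rangle^{\delta^2}),$$
as $|\eta| \to \infty$ uniformly in $t, s \in I$. On the other hand,
$$\int_{II_\eta} B(\tau) \nabla W(B(\tau)\eta + a(\tau, s, \eta))\, d\tau = \int_{II_\eta} B(\tau) \nabla W(B(\tau)\eta)\, d\tau + \int_{II_\eta} \int_0^1 B(\tau) \nabla^2 W(B(\tau)\eta + \theta a(\tau, s, \eta))\, a(\tau, s, \eta)\, d\theta\, d\tau.$$
Since $|B(\tau)\eta + \theta a(\tau, s, \eta)| \ge C_0 \langle\eta\rangle^\delta$ for $\tau \in II_\eta$ and $\nabla^2 W(x) = o(1)$ as $|x| \to \infty$, we have
$$\int_{II_\eta} \int_0^1 B(\tau) \nabla^2 W(B(\tau)\eta + \theta a(\tau, s, \eta))\, a(\tau, s, \eta)\, d\theta\, d\tau = o(\langle\eta\rangle^\delta)$$
as $|\eta| \to \infty$ uniformly in $t, s \in I$. Combining all estimates, we obtain
$$\tilde{x}(t, s, 0, \eta) = \int_s^t B(\tau) \nabla W(B(\tau)\eta)\, d\tau + o(\langle\eta\rangle^\delta)$$
as $|\eta| \to \infty$ uniformly in $t, s \in I$.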
Lemma 5.6. Set $x(t, 0, \eta) = A(t)\phi(t, \eta) + B(t)(\eta + \psi(t, \eta)) + r_3(t, \eta)$ with
$$\phi(t, \eta) = \int_0^t B(\tau) \nabla W(B(\tau)\eta)\, d\tau, \qquad \psi(t, \eta) = -\int_0^t A(\tau) \nabla W(B(\tau)\eta)\, d\tau.$$
Then $\phi, \psi, r_3 \in B(I, S_+(\langle\eta\rangle^\delta))$ and $|r_3(t, \eta)| = o(|\eta|^\delta)$ as $|\eta| \to \infty$ uniformly in $t \in I$ for every compact time interval $I$.

Proof. This follows from (5.1) and Lemma 5.5.
Lemma 5.7. Let t0 ∈ R such that B(−t0 ) is singular, and let I t0 be a compact interval. Let P1 be the d × d orthogonal projection matrix onto Ker B(−t0 ) and set P2 = I − P1 . Then P1 x(−t, y, η) = A(−t0 )P1 φ(−t0 , η) − (t − t0 )η + P1 r4 (t, η) +(t − t0 )r5 (t, η) + P1 f (t, y, η), P2 x(−t, y, η) = A(−t0 )P2 φ(−t0 , η) + B(−t0 )P2 ψ(−t0 , η) + B(−t0 )P2 η +P2 r4 (t, η) + (t − t0 )r6 (t, η) + P2 f (t, y, η). Here r4 (t, η) ∈ B(It , S+ (ηδ )) and |r4 (t, η)| = o(|η|δ ) as |η| → ∞ uniformly in t ∈ I ; r5 ∈ B(It , S+ (ηδ + |t − t0 |P1 η)); r6 ∈ B(I, S+ (ηδ + P2 η)); f (t, y, η) ∈ B(It , S+ (y)). Proof. Write B(−t)P1 = (B(−t) − B(−t0 ))P1 = −(t − t0 )A(−t0 )P1 + R(t)P1 , where 1 R(t) = (t − t0 )2 0 (1 − s)B (−t0 − s(t − t0 ))ds. By Lemma 5.6, we have f (t, y, η) = x(−t, y, η)−x(−t, 0, η), r4 (t, η) = r3 (−t, η), and r˜j (t, η) = (t −t0 )rj (t, η) (j = 5, 6), where r˜5 (t, η) = A(−t)P1 φ(−t, η) − A(−t0 )P1 φ(−t0 , η) +R(t)P1 η + B(−t)P1 ψ(−t, η), r˜6 (t, η) = A(−t)P2 φ(−t, η) − A(−t0 )P2 φ(−t0 , η) + B(−t)P2 ψ(−t, η) −B(−t0 )P2 ψ(−t0 , η) + B(−t) − B(−t0 ) P2 η. By these expressions, the proof is immediate.
6. Propagators Associated with Schrödinger Equations

We first prove a general well-posedness lemma. Let $I = [t_1, t_2]$ $(t_1 < t_2)$.

Lemma 6.1. Let $p_0 \in B(I, S(\langle X\rangle^2))$ with values in $n \times n$ Hermitian matrices such that $\nabla_X p_0 \in B(I, S(\langle X\rangle))$, and $p_1 \in B(I, S(1))$ with values in $n \times n$ matrices. Set $p = ip_0 + p_1$. Then for every $s \in \mathbb{R}$, $t_0 \in I$, $u_0 \in B^s(\mathbb{R}^d, \mathbb{C}^n)$, and $f \in L^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$, there exists $u \in C(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$ satisfying
$$(\partial_t + p^w(\cdot))u = f \ \text{ in } \mathcal{D}'((t_1, t_2) \times \mathbb{R}^d, \mathbb{C}^n), \qquad u(t_0) = u_0, \tag{6.1}$$
which is unique in $C(I, \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n))$. Moreover, the solution $u$ satisfies the following estimate:
$$e^{-\gamma|t - t_0|} \|\Lambda^s u(t)\| \le \|\Lambda^s u(t_0)\| + \left| \int_{t_0}^{t} e^{-\gamma|\tau - t_0|} \|\Lambda^s f(\tau)\|\, d\tau \right|, \qquad t \in I. \tag{6.2}$$
Here $\gamma \ge 0$ depends only on $s$ and the seminorms of $p_0$ and $p_1$ (not on $u_0$, $f$, $u$, or $t_0$). In particular,
$$\sup_{t \in I} \|\Lambda^s u(t)\| \le e^{\gamma|I|} \left( \inf_{t \in I} \|\Lambda^s u(t)\| + \int_I \|\Lambda^s f(\tau)\|\, d\tau \right). \tag{6.3}$$
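Proof. By Lemma 4.3, $[\Lambda^s, p^w(t)] = \big( r_1(\sigma(\Lambda^s), p(t)) - r_1(p(t), \sigma(\Lambda^s)) \big)^w \in \mathrm{Op}\, B(I, S(\langle X\rangle^s))$. Therefore $\Lambda^s p^w(t) \Lambda^{-s} = p^w(t) + [\Lambda^s, p^w(t)]\Lambda^{-s} = ip_0^w(t) + b_s^w(t)$ with $b_s \in B(I, S(1))$. Set $\gamma = \sup_{t \in I} \|b_s^w(t)\|$.

Claim. For every $u \in C(I, B^{s+2}(\mathbb{R}^d, \mathbb{C}^n)) \cap C^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$ with $f(t) = (\partial_t + p^w(t))u(t)$, the a priori estimate (6.2) holds.

Proof of the Claim. Since $v = \Lambda^s u \in C(I, B^2(\mathbb{R}^d, \mathbb{C}^n)) \cap C^1(I, B^0(\mathbb{R}^d, \mathbb{C}^n))$ satisfies $\Lambda^s f(t) = (\partial_t + ip_0^w(t) + b_s^w(t))v(t)$, we obtain
$$\partial_t \|v(t)\|^2 = 2\,\mathrm{Re}\big( -(ip_0^w(t) + b_s^w(t))v(t) + \Lambda^s f(t),\ v(t) \big) \le 2\|v(t)\| \big( \gamma \|v(t)\| + \|\Lambda^s f(t)\| \big), \qquad t \in I,$$
which implies
$$\partial_t \|v(t)\| \le \gamma \|v(t)\| + \|\Lambda^s f(t)\|, \qquad \text{a.e. } t \in I.$$
By a Gronwall-type inequality, we get (6.2) if $t \ge t_0$. We can treat the case $t \le t_0$ similarly.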
Now we shall return to the main proof.

Uniqueness. Suppose that $u \in C(I, \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n))$ is a solution of (6.1) with $u_0 = 0$ and $f = 0$. Since $\{u(t);\ t \in I\}$ is bounded in some $B^{s+4}(\mathbb{R}^d, \mathbb{C}^n)$, it follows from the equation that $u \in C(I, B^{s+2}(\mathbb{R}^d, \mathbb{C}^n))$ (in fact, Lipschitz continuous) and hence $u \in C^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$. By the claim, we get $u = 0$.

Existence. We shall treat the case $t_1 = t_0$ (we can treat the case $t_2 = t_0$ similarly and hence the remaining case by combining both cases). For simplicity, we assume $t_0 = 0$ and $t_2 = T > 0$. First, assume $u_0 \in B^{s+4}(\mathbb{R}^d, \mathbb{C}^n)$ and $f \in C(I, B^{s+4}(\mathbb{R}^d, \mathbb{C}^n))$. If there is a solution $u \in C(I, \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n))$, then $u \in C^1(I, \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n))$ and it satisfies
$$\int_0^T \big( (-\partial_t + p^w(t)^*)v(t),\ u(t) \big)\, dt = (v(0), u_0) + \int_0^T (v(t), f(t))\, dt \tag{6.4}$$
for every $v \in Y = \{v \in C^1(I, \mathcal{S}(\mathbb{R}^d, \mathbb{C}^n));\ v(T) = 0\}$. Set
$$X = \{\phi(\cdot) = (-\partial_t + p^w(\cdot)^*)v(\cdot) \in L^1(I, B^{-s-4}(\mathbb{R}^d, \mathbb{C}^n));\ v \in Y\}.$$
By the a priori estimate for the operator $-\partial_t + p^w(t)^*$, the functional
$$X \ni \varphi(\cdot) = (-\partial_t + p^w(\cdot)^*)v(\cdot) \mapsto (v(0), u_0) + \int_0^T (v(t), f(t))\,dt \in \mathbb{C}$$
is bounded if $X$ is regarded as a subspace of $L^1(I, B^{-s-4}(\mathbb{R}^d, \mathbb{C}^n))$. By the Hahn-Banach theorem, there exists $u \in L^\infty(I, B^{s+4}(\mathbb{R}^d, \mathbb{C}^n))$ such that (6.4) holds for all $v \in Y$. (In fact, the Hahn-Banach theorem is not necessary, because we can prove that $X$ is dense in $L^1(I, B^{-s-4}(\mathbb{R}^d, \mathbb{C}^n))$.) Taking $v \in C_0^\infty((0, T) \times \mathbb{R}^d, \mathbb{C}^n)$, we obtain $(\partial_t + p^w(\cdot))u = f$ in $\mathcal{D}'((0, T) \times \mathbb{R}^d, \mathbb{C}^n)$, which gives $u \in C^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$. By integrating (6.4) by parts, we have $(v(0), u(0)) = (v(0), u_0)$ for all $v \in Y$; hence $u(0) = u_0$. So $u \in C^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$ is the solution of (6.1).
Next, assume $u_0 \in B^s(\mathbb{R}^d, \mathbb{C}^n)$ and $f \in L^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$. Take $u_{0,j} \in B^{s+4}(\mathbb{R}^d, \mathbb{C}^n)$ and $f_j \in C(I, B^{s+4}(\mathbb{R}^d, \mathbb{C}^n))$ such that $u_{0,j} \to u_0$ in $B^s(\mathbb{R}^d, \mathbb{C}^n)$ and $f_j \to f$ in $L^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$ as $j \to \infty$. Let $u_j \in C^1(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$ be the solution of (6.1) with $u_0$ and $f$ replaced by $u_{0,j}$ and $f_j$. Then $(u_j)$ is a Cauchy sequence in $C(I, B^s(\mathbb{R}^d, \mathbb{C}^n))$ and its limit $u$ satisfies (6.1) and (6.2).
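The boundedness of the functional used in the existence step can be traced as follows. This is a sketch under stated assumptions: $\gamma'$ denotes the constant in the a priori estimate for the backward operator $-\partial_t + p^w(t)^*$ (not named in the text), $C_{\mathrm{d}}$ is an assumed constant in the duality pairing between $B^{-s-4}$ and $B^{s+4}$, and $v \in Y$, so that $v(T) = 0$.

```latex
% Sketch, with \varphi = (-\partial_t + p^w(\cdot)^*) v  and  v(T) = 0.
% \gamma' and C_d are illustrative names, not notation from the paper.
\begin{align*}
\sup_{t\in I}\|v(t)\|_{B^{-s-4}}
  &\le e^{\gamma' T}\int_0^T \|\varphi(\tau)\|_{B^{-s-4}}\,d\tau
  && \text{(a priori estimate, data } v(T)=0\text{)},\\
\Bigl|(v(0),u_0) + \int_0^T (v(t),f(t))\,dt\Bigr|
  &\le C_{\mathrm d}\,\sup_{t\in I}\|v(t)\|_{B^{-s-4}}
     \Bigl(\|u_0\|_{B^{s+4}} + \int_0^T\|f(t)\|_{B^{s+4}}\,dt\Bigr).
\end{align*}
```

Combining the two lines bounds the functional by a constant times the $L^1(I, B^{-s-4})$ norm of $\varphi$, which is the boundedness needed before extending it from $X$.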
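Lemma 6.2. Under the same setting as Lemma 6.1, let $S(t, t_0) \in L(\mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n))$ ($t, t_0 \in I$) be the operator mapping $u_0 \in \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n)$ to $u(t) \in \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n)$, where $u \in C(I, \mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n))$ is the solution of the Cauchy problem
$$(\partial_t + p^w(\cdot))u = 0 \ \text{ in } \mathcal{D}'((t_1, t_2) \times \mathbb{R}^d, \mathbb{C}^n), \qquad u(t_0) = u_0. \tag{6.5}$$
(1) $S(t, t) = 1$ and $S(t, s)S(s, r) = S(t, r)$ on $\mathcal{S}'(\mathbb{R}^d, \mathbb{C}^n)$ ($t, s, r \in I$).
(2) $\{S(t, t_0)|_{B^s(\mathbb{R}^d, \mathbb{C}^n)};\ t, t_0 \in I\}$ is bounded in $L(B^s(\mathbb{R}^d, \mathbb{C}^n))$.
(3) $I \times I \times B^s(\mathbb{R}^d, \mathbb{C}^n) \ni (t, t_0, u_0) \mapsto S(t, t_0)u_0 \in B^s(\mathbb{R}^d, \mathbb{C}^n)$ is continuous.
(4) $S(t, t_0)|_{L^2(\mathbb{R}^d, \mathbb{C}^n)} \in L(L^2(\mathbb{R}^d, \mathbb{C}^n))$ is unitary if $p_1 = 0$.
Proof. This lemma is an immediate consequence of Lemma 6.1.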
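Item (4) is perhaps the least immediate of the four. One way to see it, sketched here for sufficiently regular data (say $u_0$ in the Schwartz class, an assumption made only to justify differentiating the norm), is that the Weyl quantization of a Hermitian-valued symbol is symmetric, so the $L^2$ norm is conserved when $p_1 = 0$.

```latex
% Sketch: L^2 conservation for  \partial_t u + i p_0^w(t) u = 0  with p_0 Hermitian-valued.
\begin{align*}
\partial_t \|u(t)\|^2
  &= 2\,\mathrm{Re}\,(\partial_t u(t), u(t))
   = 2\,\mathrm{Re}\,\bigl(-i\,p_0^w(t)u(t),\, u(t)\bigr) = 0,
   \qquad \text{since } (p_0^w(t)u,u)\in\mathbb{R},\\
\|S(t,t_0)u_0\| &= \|u_0\| .
\end{align*}
```

By density and item (2) the isometry extends to all of $L^2$, and item (1) gives $S(t, t_0)S(t_0, t) = 1$, so $S(t, t_0)$ is onto and hence unitary.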
Now we show our two key lemmas for proving the dispersion of singularities.
Lemma 6.3. Let $p = ip_0 + p_1$ with $p_0 \in B(I, S_+^2(\mathbb{R}^{2d}))$, real-valued, and $p_1 \in B(I, S_+^0(\mathbb{R}^{2d}))$, complex-valued. Let $\Phi_{ts}$ be the (2-parameter) Hamilton flow associated with $p_0$. In general, for a family of operators $A = (A_1, \ldots, A_k)$ and a sequence $J = (j_1, \ldots, j_m) \in \{1, \ldots, k\}^m$, $m \in \mathbb{N}$, write $A^J = A_{j_1} \cdots A_{j_m}$ and $|J| = m$; if $J = \emptyset$, write $A^J = 1$ and $|J| = 0$. Let $s \in \mathbb{R}$, $t_0 \in I$, and $N \in \mathbb{N}_0$. Let $a = (a_1, \ldots, a_k)$, $a_j \in S_+^1(\mathbb{R}^{2d})$ ($1 \le j \le k$). Assume
$$(a^w)^J u_0 \in B^s(\mathbb{R}^d), \qquad ((a \circ \Phi_{t_0 t})^w)^J f(t) \in L^1(I_t, B^s(\mathbb{R}^d))$$
for all $|J| \le N$. Then the solution $u \in C(I, B^s(\mathbb{R}^d))$ of the Cauchy problem
$$(\partial_t + p^w(\cdot))u = f \ \text{ in } \mathcal{D}'((t_1, t_2) \times \mathbb{R}^d), \qquad u(t_0) = u_0,$$
satisfies
$$((a \circ \Phi_{t_0 t})^w)^J u(t) \in C(I_t, B^s(\mathbb{R}^d))$$
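for all $|J| \le N$. More precisely, there exists $C > 0$, depending only on $s$, $N$, and the seminorms of $p_0$, $p_1$, and $a$ (not on $u_0$, $f$, $u$, or $t_0$), such that
$$\sup_{t\in I}\sum_{|J|\le N}\|\Lambda^s((a \circ \Phi_{t_0 t})^w)^J u(t)\| \le C \inf_{t\in I}\sum_{|J|\le N}\|\Lambda^s((a \circ \Phi_{t_0 t})^w)^J u(t)\| + C\sum_{|J|\le N}\int_I \|\Lambda^s((a \circ \Phi_{t_0 \tau})^w)^J f(\tau)\|\,d\tau. \tag{6.6}$$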
Proof. By Lemma 5.1, $a_j \circ \Phi_{ts} \in B(I_t \times I_s, S_+^1(\mathbb{R}^{2d}))$. Set $n(N) = 1 + k + \cdots + k^N$. Put
$$U(t) = \bigl(((a \circ \Phi_{t_0 t})^w)^J u(t)\bigr)_{|J|\le N} \in C(I_t, B^{s-N}(\mathbb{R}^d, \mathbb{C}^{n(N)})),$$
$$U_0 = \bigl((a^w)^J u_0\bigr)_{|J|\le N} \in B^s(\mathbb{R}^d, \mathbb{C}^{n(N)}), \qquad F(t) = \bigl(((a \circ \Phi_{t_0 t})^w)^J f(t)\bigr)_{|J|\le N} \in L^1(I_t, B^s(\mathbb{R}^d, \mathbb{C}^{n(N)})).$$
By Lemma 6.1, it suffices to prove
Claim. $U \in C(I, B^{s-N}(\mathbb{R}^d, \mathbb{C}^{n(N)}))$ is the solution of the Cauchy problem
$$(\partial_t + \tilde p^w(\cdot))U = F \ \text{ in } \mathcal{D}'((t_1, t_2) \times \mathbb{R}^d, \mathbb{C}^{n(N)}), \qquad U(t_0) = U_0, \tag{6.7}$$
where $\tilde p = ip_0 I_{n(N)} + b$ with some $b \in B(I, S(1))$ with values in $n(N) \times n(N)$ matrices.
Proof of the Claim. We shall use induction on $N$. When $N = 0$, there is nothing to prove. Suppose that the assertion is true for $N$. Set $U(t) = \bigl(((a \circ \Phi_{t_0 t})^w)^J u(t)\bigr)_{|J|\le N}$ and $F(t) = \bigl(((a \circ \Phi_{t_0 t})^w)^J f(t)\bigr)_{|J|\le N}$. By the inductive assumption, $(\partial_t + ip_0^w(t) + b^w(t))U(t) = F(t)$ with some $b \in B(I, S(1))$ with values in $n(N) \times n(N)$ matrices. For $j \in \{1, \ldots, k\}$,
$$c_{1,j}(t) = -\sigma\bigl([\partial_t + ip_0^w(t), (a_j \circ \Phi_{t_0 t})^w]\bigr) = -r_3(ip_0(t), a_j \circ \Phi_{t_0 t}) + r_3(a_j \circ \Phi_{t_0 t}, ip_0) \in B(I_t, S(1)),$$
$$c_{2,j}(t) = \sigma\bigl([(a_j \circ \Phi_{t_0 t})^w, b^w(t)]\bigr) = r_1(a_j \circ \Phi_{t_0 t}, b(t)) - r_1(b(t), a_j \circ \Phi_{t_0 t}) \in B(I_t, S(1)).$$
Therefore $(\partial_t + ip_0^w(t) + b^w(t))(a_j \circ \Phi_{t_0 t})^w U(t) + b_j^w U(t) = (a_j \circ \Phi_{t_0 t})^w F(t)$ with $b_j = c_{1,j} + c_{2,j} \in B(I_t, S(1))$. This completes the proof.
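The reason the commutator with $\partial_t + ip_0^w(t)$ leaves only a bounded remainder is that the symbol composed with the flow is transported by it. The sketch below makes this explicit under conventions that are assumptions here, not taken from the text: $\alpha(t)$ stands for the symbol $a_j$ composed with the flow map taking the phase-space point at time $t$ back to time $t_0$, the Poisson bracket is $\{f, g\} = \partial_\xi f \cdot \partial_x g - \partial_x f \cdot \partial_\xi g$, and the Weyl commutator $[f^w, g^w]$ has symbol $\tfrac{1}{i}\{f, g\}$ plus third-order remainders.

```latex
% Sketch (conventions as stated in the lead-in; \alpha(t) is illustrative shorthand).
\begin{align*}
\sigma\bigl([\partial_t + i p_0^w(t),\ \alpha(t)^w]\bigr)
  &= \partial_t \alpha(t) + \{p_0(t), \alpha(t)\} + (\text{third-order remainder terms}),\\
\partial_t \alpha(t) + \{p_0(t), \alpha(t)\} &= 0
  \qquad \text{(}\alpha \text{ is constant along the Hamilton trajectories of } p_0\text{)}.
\end{align*}
```

Only the remainder terms survive. They involve third derivatives of $p_0$ and of $\alpha$, which is consistent with the claim that $c_{1,j}$ lies in $B(I_t, S(1))$.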
Lemma 6.4. Under the same assumption as in Lemma 6.3, the solution $u(t) \in C(I_t, B^s(\mathbb{R}^d))$ satisfies that for every $p(t) \in B(I_t, S(\langle a \circ \Phi_{t_0 t}\rangle^N))$, $p^w(t)u(t) \in C(I_t, B^s(\mathbb{R}^d))$. Here $\langle a \circ \Phi_{t_0 t}\rangle = \bigl(1 + \sum_{j=1}^k |a_j \circ \Phi_{t_0 t}|^2\bigr)^{1/2}$.
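Proof. Note $\langle a\rangle \in S_+^1(\mathbb{R}^{2d})$. By Lemmas 4.5 and 5.1, we obtain $\langle a \circ \Phi_{t_0 t}\rangle \in B(I_t, S(\langle a \circ \Phi_{t_0 t}\rangle))$. Since
$$\langle a \circ \Phi_{t_0 t}\rangle = \sum_{j=1}^k \frac{a_j \circ \Phi_{t_0 t}}{\langle a \circ \Phi_{t_0 t}\rangle}\, a_j \circ \Phi_{t_0 t} + \frac{1}{\langle a \circ \Phi_{t_0 t}\rangle},$$
we have
$$\langle a \circ \Phi_{t_0 t}\rangle^w = \sum_{j=1}^k b_j^w(t)(a_j \circ \Phi_{t_0 t})^w + b_0^w(t)$$
with some $b_j \in B(I, S(1))$. By induction on $n = 0, 1, \ldots, N$, we can show
$$\bigl(\langle a \circ \Phi_{t_0 t}\rangle^w\bigr)^n = \sum_{|J|\le n} c_{n,J}^w(t)\,((a \circ \Phi_{t_0 t})^w)^J$$
with some $c_{n,J} \in B(I, S(1))$. Thus $\bigl(\langle a \circ \Phi_{t_0 t}\rangle^w\bigr)^n u(t) \in C(I_t, B^s(\mathbb{R}^d))$. Set $h_\lambda(t) = \lambda + \langle a \circ \Phi_{t_0 t}\rangle$ for $\lambda \ge 1$. Since
$$r_\lambda(t) = \sigma\bigl(h_\lambda^w(t)(1/h_\lambda(t))^w\bigr) - 1 = r_2(h_\lambda(t), 1/h_\lambda(t)) \in S(1/\lambda)$$
uniformly in $\lambda \ge 1$ and $t \in I$, there exists $\lambda_0 \gg 1$ such that $\|r_\lambda^w(t)\| \le 1/2$ for all $\lambda \ge \lambda_0$ and $t \in I$. By (the time-dependent version of) Lemma 4.4, $(1 + r_\lambda^w(t))^{-1} \in \mathrm{Op}\,B(I, S(1))$; hence $(h_\lambda^w(t))^{-1} = (1/h_\lambda(t))^w(1 + r_\lambda^w(t))^{-1} \in \mathrm{Op}\,B(I_t, S(\langle a \circ \Phi_{t_0 t}\rangle^{-1}))$ for each $\lambda \ge \lambda_0$. For every $p(t) \in B(I_t, S(\langle a \circ \Phi_{t_0 t}\rangle^N))$,
$$p^w(t) = p^w(t)\,h_{\lambda_0}^w(t)^{-N} \cdot h_{\lambda_0}^w(t)^N = c^w(t)\,h_{\lambda_0}^w(t)^N$$
with $c \in B(I, S(1))$. Therefore $p^w(t)u(t) \in C(I_t, B^s(\mathbb{R}^d))$.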
Last we state an exact Egorov lemma.
Lemma 6.5. Let $p(x, \xi)$ be a real polynomial of order 2, and set $P = p^w$. Then for all $a \in \mathcal{S}'(\mathbb{R}^{2d})$,
$$e^{itP} a^w e^{-itP} = (a \circ e^{tH_p})^w.$$
Proof. For completeness, we shall prove this lemma. Set $A(t) = e^{-itP}(a \circ e^{tH_p})^w e^{itP}$. Then
$$A'(t) = e^{-itP}\bigl(-[iP, (a \circ e^{tH_p})^w] + (H_p(a \circ e^{tH_p}))^w\bigr)e^{itP} = 0$$
as a form on $\mathcal{S}(\mathbb{R}^d)$ by the symbolic calculus. Hence $A(t) = A(0) = a^w$, which completes the proof.
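As a concrete illustration of the exact Egorov lemma, consider the one-dimensional harmonic oscillator with unit frequency. This example is an addition, not taken from the paper; it assumes $d = 1$, $D = -i\,d/dx$, and the Hamilton vector field $H_p = \partial_\xi p\,\partial_x - \partial_x p\,\partial_\xi$.

```latex
% Illustrative example (not from the paper): p = (\xi^2 + x^2)/2,  P = (D^2 + x^2)/2,  a(x,\xi) = x.
\begin{align*}
e^{tH_p}(x,\xi) &= (x\cos t + \xi\sin t,\ -x\sin t + \xi\cos t),
  & a\circ e^{tH_p} &= x\cos t + \xi\sin t,\\
e^{itP}\,x\,e^{-itP} &= x\cos t + D\sin t .
\end{align*}
% Direct check with X(t) = e^{itP} x e^{-itP},  \Xi(t) = e^{itP} D e^{-itP}:
\begin{align*}
X'(t) &= e^{itP}\, i[P,x]\, e^{-itP} = \Xi(t), & i[P,x] &= \tfrac{i}{2}[D^2,x] = D,\\
\Xi'(t) &= e^{itP}\, i[P,D]\, e^{-itP} = -X(t), & i[P,D] &= \tfrac{i}{2}[x^2,D] = -x .
\end{align*}
```

The linear system $X' = \Xi$, $\Xi' = -X$ with $X(0) = x$, $\Xi(0) = D$ has the solution $X(t) = x\cos t + D\sin t$, in agreement with the formula of the lemma.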
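7. Semiclassical Limits of Pseudodifferential Operators
The next lemma is used to show necessary conditions, corresponding roughly to microlocal ellipticity conditions, for the dispersion estimates to hold.
Lemma 7.1. Let $\xi = (\xi', \xi'') \in \mathbb{R}^{d'} \times \mathbb{R}^{d''} = \mathbb{R}^d$ ($1 \le d' \le d$, $d' + d'' = d$). For $x_0 \in \mathbb{R}^d$, $\xi_0 = (\xi_0', \xi_0'') \in \mathbb{R}^d$, $\xi_0' \ne 0$, $0 < \varepsilon < \delta \le 1$, and $\varphi \in C_0^\infty(\mathbb{R}^d)$ satisfying $\|\varphi\| = 1$, set
$$v_\lambda(x) = \lambda^{\varepsilon d/2} e^{i(\lambda \xi_0' \cdot x' + \lambda^\delta \xi_0'' \cdot x'')} \varphi(\lambda^\varepsilon(x - x_0)), \qquad \lambda \ge 1.$$
Define $m_\lambda(\xi) = \langle\xi'\rangle^\delta + \lambda^{\delta-1}\langle\xi'\rangle + \langle\xi''\rangle$.
(1) If $q_\lambda \in S(\langle\xi\rangle^s m_\lambda(\xi)^r \langle x\rangle^N)$ uniformly in $\lambda \ge 1$ for some $r, s, N \in \mathbb{R}$, then $|(q_\lambda^w v_\lambda, v_\lambda)| = O(\lambda^{s+\delta r})$ as $\lambda \to \infty$.
(2) In (1), if in addition $\nabla_\xi q_\lambda \in S(\langle\xi\rangle^s m_\lambda(\xi)^{r-\rho} \langle x\rangle^N)$ uniformly in $\lambda \ge 1$ for some $\rho > \varepsilon/\delta$, then $(q_\lambda^w v_\lambda, v_\lambda) = q_\lambda(x_0, (\lambda\xi_0', \lambda^\delta\xi_0'')) + O(\lambda^{s+\delta r-\min\{\rho\delta-\varepsilon,\,\varepsilon\}})$ as $\lambda \to \infty$.
Proof. Remark that $(m_\lambda(\xi))_{\lambda\ge 1}$ is a uniform family of weights. Set $\xi_{0,\lambda} = (\lambda\xi_0', \lambda^\delta\xi_0'')$. We first prove the following claim.
Claim. Set $f(\xi, \lambda, \theta) = \langle\theta\lambda^\varepsilon\xi + \xi_{0,\lambda}\rangle^s\, m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})^r$ for $\xi \in \mathbb{R}^d$, $\lambda \ge 1$, $\theta \in [0, 1]$. Then there exist $n \in \mathbb{N}$ and $K_{n,s,r} > 0$ such that
$$\langle\xi\rangle^{-2n} f(\xi, \lambda, \theta) \le K_{n,s,r}\langle\xi\rangle^{-1-d}\lambda^{s+\delta r}, \qquad \xi \in \mathbb{R}^d,\ \lambda \ge 1,\ \theta \in [0, 1].$$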
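Proof of the Claim. We shall denote by $C_j > 0$ various constants independent of $\xi \in \mathbb{R}^d$, $\lambda \ge 1$, $\theta \in [0, 1]$. Observe that
$$1/C_1 \le m_\lambda(\xi_{0,\lambda})/\lambda^\delta \le C_1, \qquad \lambda \ge 1.$$
When $|\xi| \le \lambda^{\delta'}$ with $\delta' = (\delta - \varepsilon)/2$, $|m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda}) - m_\lambda(\xi_{0,\lambda})| \le C_2\lambda^\varepsilon|\xi| \le C_2\lambda^{(\delta+\varepsilon)/2}$; hence $1/C_3 \le m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})/\lambda^\delta \le C_3$, which gives
$$\langle\xi\rangle^{-n} m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})^r \le C_3^{|r|}\langle\xi\rangle^{-n}\lambda^{\delta r}.$$
When $|\xi| \ge \lambda^{\delta'}$,
$$\langle\xi\rangle^{-n} m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})^r \le C_4(\lambda^{\delta'} + \langle\xi\rangle)^{-n}\lambda^{\varepsilon|r|}\langle\xi\rangle^{|r|}\lambda^{\delta r}.$$
Therefore we have for $n \ge (\varepsilon|r|/\delta') + |r| + d + 1$, $\langle\xi\rangle^{-n} m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})^r \le C_5\langle\xi\rangle^{-1-d}\lambda^{\delta r}$. Similarly for $n \ge (\varepsilon|s|/\delta') + |s|$, $\langle\xi\rangle^{-n}\langle\theta\lambda^\varepsilon\xi + \xi_{0,\lambda}\rangle^s \le C_6\lambda^s$. So it suffices to choose $n \ge (\varepsilon|r|/\delta') + |r| + d + 1 + (\varepsilon|s|/\delta') + |s|$.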
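The threshold on $n$ in the proof of the claim comes from trading powers of $\lambda$ for powers of the frequency variable in the outer region. The bookkeeping can be sketched as follows, writing $\delta'$ for the auxiliary exponent $(\delta - \varepsilon)/2$ that defines the two regions (the prime is an assumption about the notation, since the symbol is not legible in the extracted text).

```latex
% Sketch of the exponent bookkeeping; \delta' := (\delta - \varepsilon)/2 > 0.
\begin{align*}
|\xi| \le \lambda^{\delta'}:&\quad
  \lambda^{\varepsilon}|\xi| \le \lambda^{\varepsilon + (\delta-\varepsilon)/2}
  = \lambda^{(\delta+\varepsilon)/2} = o(\lambda^{\delta}),
  \ \text{ so } m_\lambda \text{ stays comparable to } \lambda^{\delta};\\
|\xi| \ge \lambda^{\delta'}:&\quad
  \lambda \le \langle\xi\rangle^{1/\delta'}
  \ \Longrightarrow\
  \lambda^{\varepsilon|r|}\langle\xi\rangle^{|r|}
  \le \langle\xi\rangle^{\varepsilon|r|/\delta' + |r|},\\
&\quad \langle\xi\rangle^{-n}\,\lambda^{\varepsilon|r|}\langle\xi\rangle^{|r|}
  \le \langle\xi\rangle^{-1-d}
  \quad\text{as soon as}\quad n \ge \varepsilon|r|/\delta' + |r| + d + 1 .
\end{align*}
```

The same argument with $s$ in place of $r$ (and no need for the extra decay $d + 1$) gives the second threshold, and adding the two yields the final choice of $n$.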
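(1) By the definition and the integration by parts,
$$\begin{aligned}
(q_\lambda^w v_\lambda, v_\lambda)
&= (2\pi)^{-d}\iiint e^{i(x-y)\cdot(\xi-\xi_{0,\lambda})}\, q_\lambda\Bigl(\frac{x+y}{2}, \xi\Bigr)\varphi(\lambda^\varepsilon(y - x_0))\overline{\varphi(\lambda^\varepsilon(x - x_0))}\,dy\,d\xi\,dx \cdot \lambda^{\varepsilon d}\\
&= (2\pi)^{-d}\iiint e^{i(x-y)\cdot\xi}\, q_\lambda\Bigl(x_0 + \frac{x+y}{2\lambda^\varepsilon}, \lambda^\varepsilon\xi + \xi_{0,\lambda}\Bigr)\varphi(y)\overline{\varphi(x)}\,dy\,d\xi\,dx\\
&= (2\pi)^{-d}\iiint e^{i(x-y)\cdot\xi}\langle\xi\rangle^{-2n}\langle D_y\rangle^{2n}\Bigl[q_\lambda\Bigl(x_0 + \frac{x+y}{2\lambda^\varepsilon}, \lambda^\varepsilon\xi + \xi_{0,\lambda}\Bigr)\varphi(y)\Bigr]\overline{\varphi(x)}\,dy\,d\xi\,dx,
\end{aligned}$$
and therefore
$$|(q_\lambda^w v_\lambda, v_\lambda)| \le C\iiint \langle\xi\rangle^{-2n}\langle\lambda^\varepsilon\xi + \xi_{0,\lambda}\rangle^s\, m_\lambda(\lambda^\varepsilon\xi + \xi_{0,\lambda})^r\,\tilde\varphi(x)\tilde\varphi(y)\,dy\,d\xi\,dx.$$
Here $C > 0$ is independent of $\lambda$, and $\tilde\varphi \in C_0^\infty(\mathbb{R}^d, \mathbb{R})$ is such that $\tilde\varphi = 1$ on $\operatorname{supp}\varphi$. Applying the claim, we have $(q_\lambda^w v_\lambda, v_\lambda) = O(\lambda^{s+\delta r})$ as $\lambda \to \infty$ if $n \in \mathbb{N}$ is large enough.
(2) Set
$$r_\lambda(x, y, \xi) = \Bigl[q_\lambda\Bigl(x_0 + \frac{x+y}{2\lambda^\varepsilon}, \lambda^\varepsilon\xi + \xi_{0,\lambda}\Bigr) - q_\lambda(x_0, \xi_{0,\lambda})\Bigr]\varphi(y)\overline{\varphi(x)}.$$
By the proof of (1),
$$(q_\lambda^w v_\lambda, v_\lambda) - q_\lambda(x_0, \xi_{0,\lambda}) = (2\pi)^{-d}\iiint e^{i(x-y)\cdot\xi}\, r_\lambda(x, y, \xi)\,dy\,d\xi\,dx = (2\pi)^{-d}\iiint e^{i(x-y)\cdot\xi}\langle\xi\rangle^{-2n}\langle D_y\rangle^{2n} r_\lambda(x, y, \xi)\,dy\,d\xi\,dx.$$
By using the formula
$$q_\lambda(x_0 + h, \xi_{0,\lambda} + k) - q_\lambda(x_0, \xi_{0,\lambda}) = \int_0^1 (h \cdot \nabla_x + k \cdot \nabla_\xi)q_\lambda(x_0 + \theta h, \xi_{0,\lambda} + \theta k)\,d\theta$$
and applying the claim above, we obtain
$$\begin{aligned}
|\langle\xi\rangle^{-2n}\langle D_y\rangle^{2n} r_\lambda(x, y, \xi)|
&\le C\int_0^1 \Bigl(\lambda^{-\varepsilon}\langle\xi\rangle^{-2n}\langle\theta\lambda^\varepsilon\xi + \xi_{0,\lambda}\rangle^s\, m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})^r\\
&\qquad\quad + (\lambda^\varepsilon|\xi|)\langle\xi\rangle^{-2n}\langle\theta\lambda^\varepsilon\xi + \xi_{0,\lambda}\rangle^s\, m_\lambda(\theta\lambda^\varepsilon\xi + \xi_{0,\lambda})^{r-\rho}\Bigr)d\theta\;|\tilde\varphi(x)\tilde\varphi(y)|\\
&\le C'\lambda^{s+\delta r-\min\{\varepsilon,\,\rho\delta-\varepsilon\}}\langle\xi\rangle^{-1-d}|\tilde\varphi(x)\tilde\varphi(y)|
\end{aligned}$$
if $n \gg 1$. Hence $(q_\lambda^w v_\lambda, v_\lambda) - q_\lambda(x_0, \xi_{0,\lambda}) = O(\lambda^{s+\delta r-\min\{\varepsilon,\,\rho\delta-\varepsilon\}})$ as $\lambda \to \infty$.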
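8. Proofs for Section 2
We adopt the notation of Sect. 5.2.
Proof of Theorem 2.2. Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Replacing $a$ by another in $ML(B(-t_0)\eta_0)$ if necessary, we can assume that $(a(x)\langle x\rangle^\rho)^j u_0 \in B^{s_0}(\mathbb{R}^d)$ for $j = 0, 1, \ldots, N$. Set $\alpha(x, \xi) = 1 + a(x)\langle x\rangle^\rho \in S_+^\rho(\mathbb{R}^{2d})$. By Lemma 5.4 (1), there exist an open conic neighborhood $U$ of $\eta_0$ in $\mathbb{R}^d \setminus \{0\}$, $\varepsilon > 0$, and $M > 1$ such that
$$a(x(-t, y, \eta)) = 1, \qquad |x(-t, y, \eta)| \ge |B(-t_0)\eta|/2,$$
if $\eta \in U$, $t \in I_\varepsilon = [t_0 - \varepsilon, t_0 + \varepsilon]$, $M\langle y\rangle \le |\eta|$. Take $\gamma \in C^\infty(\mathbb{R}^d)$ such that $0 \le \gamma \le 1$, $\gamma(x) = 1$ if $|x| \ge M + 1$, $\gamma(x) = 0$ if $|x| \le M$. Let $b \in S_c^0(U)$. Set $p(y, \eta) = \langle\eta\rangle^r b(\eta)\gamma(\eta/\langle y\rangle) \in S(\langle\eta\rangle^r)$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-tH_h}\rangle^N(y, \eta) \ge C^{-1}\langle\eta\rangle^r$ for every $t \in I_\varepsilon$ and $(y, \eta) \in \operatorname{supp} p$. Therefore $p \in B(I_\varepsilon, S(\langle\alpha \circ e^{-tH_h}\rangle^N))$. By Lemma 6.4, we obtain $p^w u(\cdot) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$. Since $q(y, \eta) = \langle\eta\rangle^r b(\eta)\bigl(1 - \gamma(\eta/\langle y\rangle)\bigr) \in S(\langle y\rangle^r)$, we get $\langle x\rangle^{-r} q^w \in \mathrm{Op}\,S(1)$. Thus $\langle x\rangle^{-r}\langle D\rangle^r b(D)u(\cdot) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$.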
Proof of Corollary 2.3. Since $B(-t) = -(\sin\omega t/\omega)I_d$, we have $ML(B(-t)\eta_0) = ML((-1)^{k+1}\eta_0)$ if $t \in (k\pi/\omega, (k + 1)\pi/\omega)$. For $a \in ML((-1)^{k+1}\eta_0)$, there exists an open cone $U \ni \eta_0$ such that $a \in ML((-1)^{k+1}\eta)$ for every $\eta \in U$. By the remark after Theorem 2.2, the proof is complete.
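The sign pattern in the proof of Corollary 2.3 can be checked directly. The sketch below assumes, as the argument suggests but the text here does not state, that the class $ML(\zeta)$ depends only on the direction of $\zeta$, that is, on the ray of positive multiples of $\zeta$.

```latex
% Sketch: sign of B(-t) on each half-period, for \omega > 0 and k an integer.
\begin{align*}
t \in \Bigl(\tfrac{k\pi}{\omega}, \tfrac{(k+1)\pi}{\omega}\Bigr)
  &\ \Longrightarrow\ \sin\omega t = (-1)^{k}\,|\sin\omega t|,\quad |\sin\omega t| > 0,\\
B(-t)\eta_0 &= -\frac{\sin\omega t}{\omega}\,\eta_0
  = \frac{|\sin\omega t|}{\omega}\,(-1)^{k+1}\eta_0 .
\end{align*}
```

So on each open half-period $B(-t)\eta_0$ is a positive multiple of $(-1)^{k+1}\eta_0$, and the direction flips each time $t$ crosses a multiple of $\pi/\omega$, where $B(-t)$ vanishes.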
Proof of Corollary 2.4. This follows from Theorem 2.2.
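Proof of Theorem 2.5. Suppose $r = r_1$.
Reduction. Take $N \in \mathbb{N}$ such that $0 < \rho = r/N \le 1/2$. Set $a_1(x) = (\langle x\rangle^{-1/2} + |a(x)|^2)^{1/2N}$. Then $\partial_x a_1(x) \in S(\langle x\rangle^{-1/2})$. Set $\alpha(x) = 1 + a_1(x)\langle x\rangle^\rho \in S_+^\rho(\mathbb{R}^d)$. Since $\alpha(x)^N \ge |a(x)|\langle x\rangle^r$, we have $a(x)\langle x\rangle^r \in S(\alpha(x)^N)$. This implies
$$\|\Lambda^{s_0} a(x)\langle x\rangle^r u_0\| + \|\Lambda^{s_0} u_0\| \le C_1\|\Lambda^{s_0}\alpha(x)^N u_0\|, \qquad u_0 \in \mathcal{S}(\mathbb{R}^d), \tag{8.1}$$
for some $C_1 > 0$. Let $I = [-T, T]$ with $T \gg 1$. Set $\alpha(t, y, \eta) = \alpha(x(-t, y, \eta)) \in B(I, S_+^\rho(\mathbb{R}^{2d}))$. By Lemma 6.3 and (8.1), there exists $C_2 > 0$ such that for every $u_0 \in \mathcal{S}(\mathbb{R}^d)$ with $u(t) = e^{-itH}u_0$,
$$\|\Lambda^{s_0} a(x)\langle x\rangle^r u_0\| + \|\Lambda^{s_0} u_0\| \le C_2 \inf_{t\in I}\sum_{j=0}^N \|\Lambda^{s_0}(\alpha^w(t))^j u(t)\|. \tag{8.2}$$
Main proof. By the a priori estimate and (8.2), there exists $C_3 > 0$ such that
$$\|\Lambda^{s_0} f(x)\langle D\rangle^r b(D)u(t_0)\|^2 \le C_3\sum_{j=0}^N \|\Lambda^{s_0}(\alpha^w(t_0))^j u(t_0)\|^2$$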
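for every $u(t_0) \in \mathcal{S}(\mathbb{R}^d)$. By Lemma 5.4 (1), $x(-t_0, x_0, \lambda\eta_0) = \lambda B(-t_0)\eta_0 + o(\lambda)$ as $\lambda \to \infty$. Set $p(y, \eta) = |\langle(y, \eta)\rangle^{s_0} f(y)\langle\eta\rangle^r b(\eta)|^2$ and $q(y, \eta) = |\langle(y, \eta)\rangle^{s_0}\alpha(t_0, y, \eta)^N|^2$. Using Lemma 4.3, we have
$$\sigma\bigl(|\Lambda^{s_0} f(x)\langle D\rangle^r b(D)|^2\bigr) - p \in S(\langle\eta\rangle^{2s_0+2r-1}), \qquad \sum_{j=0}^N \sigma\bigl(|\Lambda^{s_0}(\alpha^w(t_0))^j|^2\bigr) - q \in S(\langle(y, \eta)\rangle^{2s_0+2r-2\rho}).$$
Take $x_0 \in \mathbb{R}^d$ such that $f(x_0) \ne 0$. Take $\varphi \in C_0^\infty(\mathbb{R}^d, \mathbb{R})$ such that $\int|\varphi(x)|^2\,dx = 1$. Set $v_\lambda(x) = \lambda^{\varepsilon d/2} e^{i\lambda\eta_0\cdot x}\varphi(\lambda^\varepsilon(x - x_0))$ with $0 < \varepsilon < \rho$. Applying Lemma 7.1 with $d' = d$ and $\delta = 1$, we obtain
$$\|\Lambda^{s_0} f(x)\langle D\rangle^r b(D)v_\lambda\|^2 = p(x_0, \lambda\eta_0) + o(\lambda^{2s_0+2r}) \quad (\lambda \to \infty),$$
$$\sum_{j=0}^N \|\Lambda^{s_0}(\alpha^w(t_0))^j v_\lambda\|^2 = q(x_0, \lambda\eta_0) + o(\lambda^{2s_0+2r}) \quad (\lambda \to \infty).$$
Notice that $\liminf_{\lambda\to\infty} p(x_0, \lambda\eta_0)/\lambda^{2s_0+2r} > 0$ and that
$$q(x_0, \lambda\eta_0) = |\eta_0|^{2s_0}|B(-t_0)\eta_0|^{2r}|a(x(-t_0, x_0, \lambda\eta_0))|^2\lambda^{2s_0+2r} + o(\lambda^{2s_0+2r}) \quad (\lambda \to \infty).$$
Thus $B(-t_0)\eta_0 \ne 0$ and $\liminf_{\lambda\to\infty}|a(\lambda B(-t_0)\eta_0)| > 0$. This implies $B(-t_0)\eta_0 \notin \operatorname{Char} a$.
Suppose $r > r_1$. By the a priori estimate and the continuity of the propagator $e^{-itH}$, we have
$$\|\Lambda^{s_0} f(x)\langle D\rangle^r b(D)u(t_0)\|^2 \le C\|\Lambda^{s_0+r_1}u(t_0)\|^2, \qquad u(t_0) \in \mathcal{S}(\mathbb{R}^d),$$
for some $C > 0$. Choosing $u(t_0) = v_\lambda$, we have $\|\Lambda^{s_0+r_1}v_\lambda\|^2 = O(\lambda^{2s_0+2r_1})$ as $\lambda \to \infty$, which leads to a contradiction. Therefore, $r \le r_1$.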
Proof of Theorem 2.6. Since the proof is parallel, we consider the first case. By Theorem 2.5, $r \le r_1$. Suppose $r = r_1$. By the same theorem, $B(-t)\eta \notin \operatorname{Char} a$ for all $t_0 < t \le t_0 + \varepsilon$ and $\eta \in \eta_0 + V$, where $V$ is a neighborhood of 0 in $\mathbb{R}^d$. Define $B_0 = -B'(-t_0)P_1 + B(-t_0)P_2$. Then $B_0$ is non-singular and $B(-t) = (B_0 + R(t))((t - t_0)P_1 + P_2)$ with $R \in C^\infty(\mathbb{R}, M_d(\mathbb{R}))$ satisfying $R(t_0) = O$. Set $f(t, \eta) = (I + B_0^{-1}R(t))^{-1}(\eta_0 + P_2\eta)$ for $\eta \in \mathbb{R}^d$, $|\eta| \le \varepsilon_1$, $0 < t - t_0 \le \varepsilon_2$, with some $0 < \varepsilon_1, \varepsilon_2 \ll 1$. If $\varepsilon_1$ and $\varepsilon_2$ are sufficiently small, then $f(t, \eta) \in \eta_0 + V$ and
$$B(-t)f(t, \eta) = -(t - t_0)B'(-t_0)\eta_0 + B(-t_0)P_2\eta,$$
which implies $-(t - t_0)B'(-t_0)\eta_0 + B(-t_0)P_2\eta \notin \operatorname{Char} a$. This means $-B'(-t_0)\eta_0 + P_2\eta \notin \operatorname{Char} a$ for every $\eta \in \mathbb{R}^d$.
Proof of Theorem 2.7. Since the proof is parallel, we consider the first case. By assumption $P_1\nabla W \in S(1)$; hence Lemma 5.4 (2) gives
$$P_1 x(-t, y, \eta) = B(-t)P_1\eta + A(-t)P_1 y + f_3(t, y, \eta), \tag{8.3}$$
where $f_3(t, y, \eta) \in B(K, S(1))$ for every compact interval $K \subset \mathbb{R}$. Remark that $B(-t)P_1 = (B(-t) - B(-t_0))P_1 = (-B'(-t_0)(t - t_0) + O(|t - t_0|^2))P_1$ as $t \to t_0$.
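Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Replacing $a$ by another in $ML(-B'(-t_0)\eta_0)$ if necessary, we can assume that $(a(P_1x)\langle P_1x\rangle^\rho)^j u_0 \in B^{s_0}(\mathbb{R}^d)$ for $j = 0, 1, \ldots, N$. Set $\alpha(x, \xi) = 1 + a(P_1x)\langle P_1x\rangle^\rho \in S_+^\rho(\mathbb{R}^{2d})$. By (8.3), there exist an open conic neighborhood $U$ of $\eta_0$ in $\mathbb{R}^d \setminus \{0\}$, $\varepsilon > 0$, and $M > 1$ such that
$$a(P_1x(-t, y, \eta)) = 1, \qquad |P_1x(-t, y, \eta)| \ge |B'(-t_0)(t - t_0)P_1\eta|/2,$$
if $P_1\eta \in U$, $t \in I_\varepsilon = [t_0, t_0 + \varepsilon]$, $|t - t_0||P_1\eta| \ge M\langle P_1y\rangle$. Take $\gamma \in C^\infty(\mathbb{R}^d)$ such that $0 \le \gamma \le 1$, $\gamma(x) = 1$ if $|x| \ge M + 1$, $\gamma(x) = 0$ if $|x| \le M$. Let $b \in S_c^0(U)$. Set $p(t, y, \eta) = |t - t_0|^r\langle P_1\eta\rangle^r b(P_1\eta)\gamma((t - t_0)P_1\eta/\langle P_1y\rangle) \in B(I_\varepsilon, S((1 + |t - t_0|\langle P_1\eta\rangle)^r))$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-tH_h}\rangle^N(y, \eta) \ge C^{-1}(1 + |t - t_0|\langle P_1\eta\rangle)^r$ for every $(t, y, \eta) \in \operatorname{supp} p$ with $t \in I_\varepsilon$. Therefore $p \in B(I_\varepsilon, S(\langle\alpha \circ e^{-tH_h}\rangle^N))$. By Lemma 6.4, we obtain $p^w(\cdot)u(\cdot) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$. Since $q(t, y, \eta) = |t - t_0|^r\langle P_1\eta\rangle^r b(P_1\eta)\bigl(1 - \gamma((t - t_0)P_1\eta/\langle P_1y\rangle)\bigr) \in B(I_\varepsilon, S(\langle P_1y\rangle^r))$, we get $\langle P_1x\rangle^{-r}q^w(t) \in \mathrm{Op}\,B(I_\varepsilon, S(1))$. Thus $|t - t_0|^r\langle P_1x\rangle^{-r}\langle P_1D\rangle^r b(P_1D)u(\cdot) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$.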
9. Proofs for Section 3
Since the results in Sect. 3.2 are contained in Sect. 3.1, we shall prove the assertions of Sect. 3.1. We shall adopt the notation there. Set
$$\varphi(t, \eta) = \int_0^t B(\tau)\nabla W(B(\tau)\eta)\,d\tau, \qquad \psi(t, \eta) = -\int_0^t A(\tau)\nabla W(B(\tau)\eta)\,d\tau;$$
$$\varphi_\infty(t, \eta) = \int_0^t B(\tau)F(B(\tau)\eta)\,d\tau, \qquad \psi_\infty(t, \eta) = -\int_0^t A(\tau)F(B(\tau)\eta)\,d\tau.$$
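Proof of Theorem 3.2. First we prove
Claim. $\varphi(-t_0, \eta' + \eta'') = \varphi(-t_0, \eta') + o(|\eta''|)$ as $\eta'' \in \operatorname{Im} P_2$ and $|\eta''| \to \infty$ uniformly in $\eta' \in \operatorname{Im} P_1$.
Proof of the Claim. Let $\eta' \in \operatorname{Im} P_1$, $\eta'' \in \operatorname{Im} P_2$, and $\eta = \eta' + \eta''$. Set $K = [0, -t_0]$ if $t_0 < 0$, and $K = [-t_0, 0]$ if $t_0 > 0$. Put $C = \sup_{\tau\in K}\|B(\tau)\|$. Set $I_\eta = \{\tau \in K;\ |B(\tau)\eta'| \ge 2C|\eta''|\}$ and $II_\eta = K \setminus I_\eta$. Then
$$\int_{I_\eta} B(\tau)\bigl(\nabla W(B(\tau)\eta) - \nabla W(B(\tau)\eta')\bigr)\,d\tau = \int_{I_\eta}\int_0^1 B(\tau)\nabla^2 W(B(\tau)\eta' + sB(\tau)\eta'')B(\tau)\eta''\,ds\,d\tau = o(|\eta''|) \quad \text{as } |\eta''| \to \infty$$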
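uniformly in $\eta' \in \operatorname{Im} P_1$, because $|B(\tau)\eta' + sB(\tau)\eta''| \ge C|\eta''|$ when $\tau \in I_\eta$, $s \in [0, 1]$. On the other hand,
$$\int_{II_\eta} B(\tau)\bigl(\nabla W(B(\tau)\eta) - \nabla W(B(\tau)\eta')\bigr)\,d\tau = O(|\eta''|^\delta) \quad \text{as } |\eta''| \to \infty$$
uniformly in $\eta' \in \operatorname{Im} P_1$. Since
$$|\varphi(-t_0, \eta' + \eta'') - \varphi(-t_0, \eta')| = \Bigl|\int_K B(\tau)\bigl(\nabla W(B(\tau)\eta) - \nabla W(B(\tau)\eta')\bigr)\,d\tau\Bigr|,$$
the proof is complete.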
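Now we shall start the main proof. Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Set $\alpha(x, \xi) = \langle x\rangle^\rho$. By Lemma 5.7,
$$P_1x(-t_0, y, \eta) = A(-t_0)P_1\varphi(-t_0, \eta) + P_1r_4(t_0, \eta) + P_1f(t_0, y, \eta),$$
$$P_2x(-t_0, y, \eta) = A(-t_0)P_2\varphi(-t_0, \eta) + B(-t_0)P_2\psi(-t_0, \eta) + B(-t_0)P_2\eta + P_2r_4(t_0, \eta) + P_2f(t_0, y, \eta).$$
Here $r_4(t_0, \eta) = o(|\eta|^\delta)$ as $|\eta| \to \infty$, and $f(t_0, \cdot, \cdot) \in S_+(\langle y\rangle)$. By the Claim, $A(-t_0)P_1\varphi(-t_0, \eta) = A(-t_0)P_1\varphi(-t_0, P_1\eta) + o(|P_2\eta|)$ as $|P_2\eta| \to \infty$ uniformly in $P_1\eta$. There exist $c_1, c_2, C_1 > 0$ such that for all $\eta \in \mathbb{R}^d$, $|A(-t_0)P_1\varphi(-t_0, P_1\eta)| \ge c_1|P_1\eta|^\delta - C_1$, $|B(-t_0)P_2\eta| \ge c_2|P_2\eta|$. Therefore there exist $\varepsilon > 0$, $c > 0$, and $M > 1$ such that
$$|x(-t_0, y, \eta)| \ge (1 - \varepsilon)|P_1x(-t_0, y, \eta)| + \varepsilon|P_2x(-t_0, y, \eta)| \ge c(\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle) = cm(\eta)$$
if $m(\eta) \ge M\langle y\rangle$. Take $\gamma \in C^\infty(\mathbb{R})$ such that $0 \le \gamma \le 1$, $\gamma(t) = 1$ if $|t| \ge M + 1$, $\gamma(t) = 0$ if $|t| \le M$. Set $p(y, \eta) = m(\eta)^r\gamma(m(\eta)/\langle y\rangle) \in S(m(\eta)^r)$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-t_0H_h}\rangle^N(y, \eta) \ge C^{-1}m(\eta)^r$ for every $(y, \eta) \in \operatorname{supp} p$. Therefore $p \in S(\langle\alpha \circ e^{-t_0H_h}\rangle^N)$. By Lemma 6.4, we obtain $p^w u(t_0) \in B^{s_0}(\mathbb{R}^d)$. Since $q(y, \eta) = m(\eta)^r\bigl(1 - \gamma(m(\eta)/\langle y\rangle)\bigr) \in S(\langle y\rangle^r)$, we get $\langle x\rangle^{-r}q^w \in \mathrm{Op}\,S(1)$. Thus $\langle x\rangle^{-r}m(D)^r u(t_0) \in B^{s_0}(\mathbb{R}^d)$.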
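Proof of Lemma 3.3. For all $x \in \operatorname{Im} P_1$ satisfying $|x| \ge 2C_0$,
$$\begin{aligned}
x \cdot \nabla W(x) &= x \cdot \nabla W(0) + \int_0^1 \langle\nabla^2 W(\theta x)x, x\rangle\,d\theta\\
&\ge \int_{C_0/|x|}^1 c_0\langle\theta x\rangle^{\delta-1}|x|^2\,d\theta - C_1|x| - C_1\int_0^{C_0/|x|}|x|^2\,d\theta\\
&\ge \int_{1/2}^1 c_0\langle\theta x\rangle^{\delta-1}|x|^2\,d\theta - C_2|x|\\
&\ge c_1|x|^{\delta+1} - C_2|x| \ge c_2\langle x\rangle^{1+\delta} - C_3
\end{aligned}$$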
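with some $c_j, C_j > 0$ independent of $x$. Hence
$$x \cdot \nabla W(x) \ge c\langle x\rangle^{1+\delta} - C, \qquad x \in \operatorname{Im} P_1,$$
for some $c, C > 0$. Therefore
$$|P_1\varphi(-t_0, \eta) \cdot \eta| \ge \Bigl|\int_0^{-t_0} c\langle B(\tau)\eta\rangle^{1+\delta}\,d\tau\Bigr| - C' \ge c'\langle\eta\rangle^{1+\delta} - C'', \qquad \eta \in \operatorname{Im} P_1,$$
for some $c', C', C'' > 0$, which completes the proof.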
Proof of Lemma 3.4. This follows from the remark before Lemma 3.1.
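Proof of Theorem 3.5. Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Replacing $a$ by another in $ML(A(-t_0)\tilde\eta_0)$ if necessary, we can assume that $(a(P_1x)\langle P_1x\rangle^\rho)^j u_0 \in B^{s_0}(\mathbb{R}^d)$ for $j = 0, 1, \ldots, N$. Set $\alpha(x, \xi) = 1 + a(P_1x)\langle P_1x\rangle^\rho \in S_+^\rho(\mathbb{R}^{2d})$. By Lemma 5.7,
$$P_1x(-t_0, y, \eta) = A(-t_0)P_1\varphi(-t_0, \eta) + P_1r_4(t_0, \eta) + P_1f(t_0, y, \eta).$$
Here $|r_4(t_0, \eta)| = o(|\eta|^\delta)$ as $|\eta| \to \infty$, and $f(t_0, y, \eta) \in S_+(\langle y\rangle)$. Then there exist an open conic neighborhood $U$ of $\eta_0$, $c > 0$, and $M > 1$ such that
$$a(P_1x(-t_0, y, \eta)) = 1, \qquad |P_1x(-t_0, y, \eta)| \ge |A(-t_0)P_1\varphi_\infty(-t_0, \eta)|/2 \ge c\langle\eta\rangle^\delta = cm(\eta),$$
if $\eta \in U$ and $m(\eta) = \langle\eta\rangle^\delta \ge M\langle y\rangle$. Let $b \in S_c^0(U)$. Take $\gamma \in C^\infty(\mathbb{R})$ such that $0 \le \gamma \le 1$, $\gamma(t) = 1$ if $|t| \ge M + 1$, $\gamma(t) = 0$ if $|t| \le M$. Set $p(y, \eta) = m(\eta)^r b(\eta)\gamma(m(\eta)/\langle y\rangle) \in S(m(\eta)^r)$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-t_0H_h}\rangle^N(y, \eta) \ge C^{-1}m(\eta)^r$ for every $(y, \eta) \in \operatorname{supp} p$. Therefore $p \in S(\langle\alpha \circ e^{-t_0H_h}\rangle^N)$. By Lemma 6.4, we obtain $p^w u(t_0) \in B^{s_0}(\mathbb{R}^d)$. Since $q(y, \eta) = m(\eta)^r b(\eta)\bigl(1 - \gamma(m(\eta)/\langle y\rangle)\bigr) \in S(\langle y\rangle^r)$, we get $\langle x\rangle^{-r}q^w \in \mathrm{Op}\,S(1)$. Thus $\langle x\rangle^{-r}m(D)^r b(D)u(t_0) \in B^{s_0}(\mathbb{R}^d)$.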
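Proof of Theorem 3.6. Suppose $r = r_1$. Combining the reduction in the proof of Theorem 2.5 and the a priori estimate, we have with $C_1 > 0$ that
$$\|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)u(t_0)\|^2 \le C_1\sum_{j=0}^N \|\Lambda^{s_0}(\alpha^w(t_0))^j u(t_0)\|^2$$
for every $u(t_0) \in \mathcal{S}(\mathbb{R}^d)$. By Lemma 5.7,
$$x(-t_0, y, \eta) = A(-t_0)\varphi(-t_0, \eta) + B(-t_0)P_2\psi(-t_0, \eta) + B(-t_0)P_2\eta + r_4(t_0, \eta) + f(t_0, y, \eta).$$
Here $r_4(t_0, \eta) \in S_+(\langle\eta\rangle^\delta)$; $|r_4(t_0, \eta)| = o(|\eta|^\delta)$ as $|\eta| \to \infty$; $f(t_0, y, \eta) \in S_+(\langle y\rangle)$. Since $x(-t_0, y, \eta) \in S_+(\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle + \langle y\rangle)$, we have $\alpha^w(t_0, y, \eta) \in S_+((\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle + \langle y\rangle)^\rho)$.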
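Setting
$$p(y, \eta) = |\langle(y, \eta)\rangle^{s_0} f(y)\langle\eta\rangle^{\delta r}b(\eta)|^2 \in S(\langle\eta\rangle^{2s_0+2\delta r}), \qquad q(y, \eta) = |\langle(y, \eta)\rangle^{s_0}\alpha(t_0, y, \eta)^N|^2 \in S(\langle(y, \eta)\rangle^{2s_0}(\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle + \langle y\rangle)^{2r}),$$
we have by Lemma 4.3,
$$|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)|^2 - p^w \in \mathrm{Op}\,S(\langle\eta\rangle^{2s_0+2\delta r-1}), \qquad \sum_{j=0}^N |\Lambda^{s_0}\alpha^w(t_0)^j|^2 - q^w \in \mathrm{Op}\,S(\langle(y, \eta)\rangle^{2s_0}(\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle + \langle y\rangle)^{2r-2\rho}).$$
Take $x_0 \in \mathbb{R}^d$ such that $f(x_0) \ne 0$. Choose $\xi_0 \in \operatorname{Im} P_2$ such that $A(-t_0)P_2\varphi_\infty(-t_0, \eta_0) + B(-t_0)P_2\psi_\infty(-t_0, \eta_0) + B(-t_0)P_2\xi_0 = 0$. Let $\xi \in \operatorname{Im} P_2$ and put $\xi_{0,\lambda} = \lambda\eta_0 + \lambda^\delta(\xi_0 + \xi)$ ($\lambda \ge 1$). Then
$$x(-t_0, x_0, \xi_{0,\lambda}) = x_\infty\lambda^\delta + o(\lambda^\delta) \quad (\lambda \to \infty),$$
where $x_\infty = A(-t_0)P_1\varphi_\infty(-t_0, \eta_0) + B(-t_0)\xi$. Take $\varphi \in C_0^\infty(\mathbb{R}^d, \mathbb{R})$ such that $\int|\varphi(x)|^2\,dx = 1$. Set $v_\lambda(x) = \lambda^{\varepsilon d/2}e^{i\xi_{0,\lambda}\cdot x}\varphi(\lambda^\varepsilon(x - x_0))$ with $0 < \varepsilon < \delta\rho$. Applying Lemma 7.1, we obtain
$$\|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)v_\lambda\|^2 = p(x_0, \xi_{0,\lambda}) + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty),$$
$$\sum_{j=0}^N \|\Lambda^{s_0}\alpha^w(t_0)^j v_\lambda\|^2 = q(x_0, \xi_{0,\lambda}) + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty).$$
Notice that
$$p(x_0, \xi_{0,\lambda}) = |\eta_0|^{2s_0+2\delta r}|f(x_0)b(\lambda\eta_0)|^2\lambda^{2s_0+2\delta r} + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty),$$
$$q(x_0, \xi_{0,\lambda}) = |\eta_0|^{2s_0}|x_\infty|^{2r}|a(x(-t_0, x_0, \xi_{0,\lambda}))|^2\lambda^{2s_0+2\delta r} + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty).$$
If $P_1\varphi_\infty(-t_0, \eta_0) = 0$, then $x_\infty = 0$ with $\xi = 0$; by contradiction, $P_1\varphi_\infty(-t_0, \eta_0)\eta_0 \ne 0$. Then $\liminf_{\lambda\to\infty}|a(x(-t_0, x_0, \xi_{0,\lambda}))| = \liminf_{\lambda\to\infty}|a(x_\infty\lambda^\delta)| > 0$ for every $\xi \in \operatorname{Im} P_2$.
Next, suppose $r > r_1$. Take $N \in \mathbb{N}$ such that $0 < \rho_1 = r_1/N \le 1$. Set $\alpha_1(x) = \langle x\rangle^{\rho_1} \in S_+^{\rho_1}(\mathbb{R}^d)$. Take a compact interval $I$ containing 0 and $t_0$. Set $\alpha_1(t, y, \eta) = \alpha_1(x(-t, y, \eta))$ ($t \in I$). Using Lemma 6.3 and the a priori estimate, we obtain similarly
$$\|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)v\|^2 \le C_4\sum_{j=0}^N \|\Lambda^{s_0}(\alpha_1^w(t_0))^j v\|^2, \qquad v \in \mathcal{S}(\mathbb{R}^d).$$
Since $\alpha_1(t_0, y, \eta) \in S_+((\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle + \langle y\rangle)^{\rho_1})$, we have $\Lambda^{s_0}(\alpha_1^w(t_0))^j \in S_+(\langle(y, \eta)\rangle^{s_0}(\langle P_1\eta\rangle^\delta + \langle P_2\eta\rangle + \langle y\rangle)^{r_1})$. Set $w_\lambda(x) = \lambda^{\varepsilon d/2}e^{i\lambda\eta_0\cdot x}\varphi(\lambda^\varepsilon(x - x_0))$ with $0 < \varepsilon \ll 1$. Replacing $v$ by $w_\lambda$ and applying Lemma 7.1, we obtain $\sum_{j=0}^N \|\Lambda^{s_0}(\alpha_1^w(t_0))^j w_\lambda\|^2 = O(\lambda^{2s_0+2\delta r_1})$ as $\lambda \to \infty$. By contradiction, we get $r \le r_1$.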
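Proof of Theorem 3.7. Since the proof is similar, we consider the case (i). Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Replacing $a$ by another in $ML(A(-t_0))$ if necessary, we can assume that $(a(P_1x)\langle P_1x\rangle^\rho)^j u_0 \in B^{s_0}(\mathbb{R}^d)$ for $j = 0, 1, \ldots, N$. Set $\alpha(x, \xi) = 1 + a(P_1x)\langle P_1x\rangle^\rho \in S_+^\rho(\mathbb{R}^{2d})$. By Lemma 5.7,
$$P_1x(-t, y, \eta) = A(-t_0)P_1\varphi(-t_0, \eta) - (t - t_0)\eta + P_1r_4(t, \eta) + (t - t_0)r_5(t, \eta) + P_1f(t, y, \eta).$$
Here $|r_4(t, \eta)| = o(|\eta|^\delta)$ as $|\eta| \to \infty$ uniformly on $t \in I_\varepsilon$, $r_5(t, \eta) \in B(I_\varepsilon, S_+(\langle\eta\rangle^\delta + |t - t_0|\langle P_1\eta\rangle))$, and $f(t, y, \eta) \in B(I_\varepsilon, S_+(\langle y\rangle))$. Since $\tilde\eta_0 \notin \{c\eta_0;\ c \ge 0\}$, there exist an open conic neighborhood $U$ of $\eta_0$, $c > 0$, $\varepsilon > 0$, and $M > 1$ such that
$$a(P_1x(-t, y, \eta)) = 1, \qquad |P_1x(-t, y, \eta)| \ge c\bigl(\langle\eta\rangle^\delta + |t - t_0|\langle\eta\rangle\bigr) = cm(t, \eta),$$
if $\eta \in U$, $t \in I_\varepsilon$, and $m(t, \eta) \ge M\langle y\rangle$. Let $b \in S_c^0(U)$. Take $\gamma \in C^\infty(\mathbb{R})$ such that $0 \le \gamma \le 1$, $\gamma(t) = 1$ if $|t| \ge M + 1$, $\gamma(t) = 0$ if $|t| \le M$. Set $p(t, y, \eta) = m(t, \eta)^r b(\eta)\gamma(m(t, \eta)/\langle y\rangle) \in B(I_\varepsilon, S(m(t, \eta)^r))$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-t_0H_h}\rangle^N(y, \eta) \ge C^{-1}m(t, \eta)^r$ for every $(t, y, \eta) \in \operatorname{supp} p$. Therefore $p \in B(I_\varepsilon, S(\langle\alpha \circ e^{-t_0H_h}\rangle^N))$. By Lemma 6.4, we obtain $p^w(t)u(t) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$. Since $q(t, y, \eta) = m(t, \eta)^r b(\eta)\bigl(1 - \gamma(m(t, \eta)/\langle y\rangle)\bigr) \in B(I_\varepsilon, S(\langle y\rangle^r))$, we get $\langle x\rangle^{-r}q^w \in \mathrm{Op}\,B(I_\varepsilon, S(1))$. Thus $\langle x\rangle^{-r}m(t, D)^r b(D)u(t) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$.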
Proof of Theorem 3.8. Since the proof is parallel, we consider only the forward case. By Theorem 2.6, we have $r \le r_1$. Suppose $r = r_1$. By the same theorem, $-A(-t_0)\eta_0 + P_2\eta \notin \operatorname{Char} a$ for every $\eta \in \mathbb{R}^d$. As in the proof of Theorem 2.5, take $N \in \mathbb{N}$ such that $0 < \rho = r/N \le 1/2$, and set $a_1(x) = (\langle x\rangle^{-1/2} + |a(x)|^2)^{1/2N}$ and $\alpha(x) = 1 + a_1(x)\langle x\rangle^\rho \in S_+^\rho(\mathbb{R}^d)$. Take a compact interval $I$ containing 0 and $I_\varepsilon$. Set $\alpha(t, y, \eta) = \alpha(x(-t, y, \eta))$ ($t \in I$). Replacing $f$ by another in $C_0^\infty(\mathbb{R}^d)$, which we denote also by $f$ for simplicity, and reasoning as before, we have that for every $u_0 \in \mathcal{S}(\mathbb{R}^d)$ with $u(t) = e^{-itH}u_0$,
$$\sup_{t\in I}\|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)u(t)\| \le C_1\inf_{t\in I}\sum_{j=0}^N \|\Lambda^{s_0}(\alpha^w(t))^j u(t)\|. \tag{9.1}$$
By Lemma 5.7,
$$P_1x(-t, y, \eta) = A(-t_0)P_1\varphi(-t_0, \eta) - (t - t_0)\eta + P_1r_4(t, \eta) + (t - t_0)r_5(t, \eta) + P_1f(t, y, \eta),$$
$$P_2x(-t, y, \eta) = A(-t_0)P_2\varphi(-t_0, \eta) + B(-t_0)P_2\psi(-t_0, \eta) + B(-t_0)P_2\eta + P_2r_4(t, \eta) + (t - t_0)r_6(t, \eta) + P_2f(t, y, \eta).$$
Here $f(t, y, \eta) = x(-t, y, \eta) - x(-t, 0, \eta) \in B(I, S_+(\langle y\rangle))$; $r_4(t, \eta) \in B(I, S_+(\langle\eta\rangle^\delta))$, $|r_4(t, \eta)| = o(|\eta|^\delta)$ as $|\eta| \to \infty$ uniformly in $t \in I$; $r_5(t, \eta) \in B(I, S_+(\langle\eta\rangle^\delta + |t - t_0|\langle P_1\eta\rangle))$; $r_6(t, \eta) \in B(I, S_+(\langle\eta\rangle^\delta + \langle P_2\eta\rangle))$.
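For $c_0 \ge 0$, set $t_\lambda = t_0 + c_0\lambda^{\delta-1}$, $\lambda \gg 1$. From now, all assertions are uniform in $\lambda \gg 1$. Set $w(y, \eta, \lambda) = \langle y\rangle + \langle P_1\eta\rangle^\delta + \lambda^{\delta-1}\langle P_1\eta\rangle + \langle P_2\eta\rangle$. Since $x(-t_\lambda, y, \eta) \in S_+(w(y, \eta, \lambda))$, we have $\alpha^w(t_\lambda, y, \eta) \in S_+(w(y, \eta, \lambda)^\rho)$. Setting
$$p(y, \eta) = |\langle(y, \eta)\rangle^{s_0}\langle\eta\rangle^{\delta r}f(y)b(\eta)|^2 \in S(\langle\eta\rangle^{2s_0+2\delta r}), \qquad q_\lambda(y, \eta) = |\langle(y, \eta)\rangle^{s_0}\alpha(t_\lambda, y, \eta)^N|^2 \in S(\langle(y, \eta)\rangle^{2s_0}w(y, \eta, \lambda)^{2r}),$$
we have by Lemma 4.3,
$$|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)|^2 - p^w \in \mathrm{Op}\,S(\langle\eta\rangle^{2s_0+2r-1}), \qquad \sum_{j=0}^N |\Lambda^{s_0}\alpha^w(t_\lambda)^j|^2 - q_\lambda^w \in \mathrm{Op}\,S(\langle(y, \eta)\rangle^{2s_0}w(y, \eta, \lambda)^{2r-2\rho}).$$
Take $x_0 \in \mathbb{R}^d$ such that $f(x_0) \ne 0$. Choose $\xi_0 \in \operatorname{Im} P_2$ such that
$$A(-t_0)P_2\varphi_\infty(-t_0, \eta_0) + B(-t_0)P_2\psi_\infty(-t_0, \eta_0) + B(-t_0)P_2\xi_0 = 0.$$
Let $\xi \in \operatorname{Im} P_2$ and put $\xi_{0,\lambda} = \lambda\eta_0 + \lambda^\delta(\xi_0 + \xi)$ ($\lambda \ge 1$). Then
$$x(-t_\lambda, x_0, \xi_{0,\lambda}) = x_\infty\lambda^\delta + o(\lambda^\delta) \quad (\lambda \to \infty),$$
where $x_\infty = A(-t_0)(P_1\varphi_\infty(-t_0, \eta_0) - c_0\eta_0) + B(-t_0)\xi$. Take $\varphi \in C_0^\infty(\mathbb{R}^d, \mathbb{R})$ such that $\int|\varphi(x)|^2\,dx = 1$. Set $v_\lambda(x) = \lambda^{\varepsilon d/2}e^{i\xi_{0,\lambda}\cdot x}\varphi(\lambda^\varepsilon(x - x_0))$ with $0 < \varepsilon < \rho\delta$. Applying Lemma 7.1, we obtain
$$\|\Lambda^{s_0} f(x)\langle D\rangle^{\delta r}b(D)v_\lambda\|^2 = p(x_0, \xi_{0,\lambda}) + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty),$$
$$\sum_{j=0}^N \|\Lambda^{s_0}\alpha^w(t_\lambda)^j v_\lambda\|^2 = q_\lambda(x_0, \xi_{0,\lambda}) + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty).$$
Notice that
$$p(x_0, \xi_{0,\lambda}) = |\eta_0|^{2s_0+2\delta r}|f(x_0)b(\lambda\eta_0)|^2\lambda^{2s_0+2\delta r} + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty),$$
$$q_\lambda(x_0, \xi_{0,\lambda}) = |\eta_0|^{2s_0}|x_\infty|^{2r}|a(x(-t_\lambda, x_0, \xi_{0,\lambda}))|^2\lambda^{2s_0+2\delta r} + o(\lambda^{2s_0+2\delta r}) \quad (\lambda \to \infty).$$
If $P_1\varphi_\infty(-t_0, \eta_0) \in \{c\eta_0;\ c \ge 0\}$, then $x_\infty = 0$ with some $c_0 \ge 0$ and $\xi = 0$, which contradicts (9.1). Therefore $P_1\varphi_\infty(-t_0, \eta_0) \notin \{c\eta_0;\ c \ge 0\}$. Then by (9.1), $\liminf_{\lambda\to\infty}|a(x(-t_\lambda, x_0, \xi_{0,\lambda}))| = \liminf_{\lambda\to\infty}|a(x_\infty\lambda^\delta)| > 0$ for every $c_0 \ge 0$ and $\xi \in \operatorname{Im} P_2$. This means $A(-t_0)P_1\varphi_\infty(-t_0, \eta_0) - c_0A(-t_0)\eta_0 + P_2\eta \notin \operatorname{Char} a$ for every $c_0 \ge 0$ and $\eta \in \mathbb{R}^d$.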
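10. Other Results
10.1. Uniform estimates when W is linearly bounded. When $B(-t_0)$ is singular, the dispersion of singularities in the direction $\eta_0 \in \operatorname{Ker} B(-t_0) \setminus \{0\}$ of Theorem 2.2 fails as Theorem 2.5 shows. In this section, we consider some uniform estimates when $W$ is linearly bounded. Let $t_0 \in \mathbb{R}$. Let $P_1$ be the $d \times d$ orthogonal projection matrix onto $\operatorname{Ker} B(-t_0)$ and set $P_2 = I - P_1$. Define $B_0 = -B'(-t_0)P_1 + B(-t_0)P_2$. Then $B_0$ is non-singular and $B(-t) = (B_0 + R(t))((t - t_0)P_1 + P_2)$ with $R \in C^\infty(\mathbb{R}, M_d(\mathbb{R}))$ satisfying $R(t_0) = O$.
Theorem 10.1. Assume that $W \in S_+^1(\mathbb{R}^d)$. Let $r > 0$ and $\xi_0 \in \mathbb{R}^d \setminus \{0\}$. Assume that $\langle x\rangle^r a(x)u_0 \in B^{s_0}(\mathbb{R}^d)$ for some $a \in ML(B_0\xi_0)$. Then there exist an open conic neighborhood $U$ of $\xi_0$ and an interval $I = [t_0 - \varepsilon, t_0 + \varepsilon]$ for some $0 < \varepsilon \ll 1$ such that for all $b \in S_c^0(U)$,
$$\langle x\rangle^{-r}\bigl(|t - t_0|\langle P_1D\rangle + \langle P_2D\rangle\bigr)^r b((t - t_0)P_1D + P_2D)u(t) \in C(I, B^{s_0}(\mathbb{R}^d)).$$
Remark. Since there is $C > 0$ such that $C^{-1}\langle B(-t)\eta\rangle \le |t - t_0|\langle P_1\eta\rangle + \langle P_2\eta\rangle \le C\langle B(-t)\eta\rangle$ for every $\eta \in \mathbb{R}^d$ and $|t - t_0| \le \varepsilon$, we can replace $|t - t_0|\langle P_1D\rangle + \langle P_2D\rangle$ by $\langle B(-t)D\rangle$.
Remark. When $P_2 = 1$, Theorem 10.1 coincides with Theorem 2.2. When $P_2 \ne 1$ and $\xi_0 \in \operatorname{Im} P_2$, Theorem 10.1 is a refinement of Theorem 2.2 with $\eta_0 \in \xi_0 + \operatorname{Im} P_1$.
Theorem 10.2. Assume that $W \in S_+^1(\mathbb{R}^d)$. Let $r > 0$ and $\xi_0 \in \operatorname{Im} P_1 \setminus \{0\}$. Assume that $\langle x\rangle^r a(x)u_0 \in B^{s_0}(\mathbb{R}^d)$ for some $a \in ML(B_0F)$, where $F$ is the minimal closed cone in $\mathbb{R}^d \setminus \{0\}$ containing $\xi_0$ (resp. $-\xi_0$) and $\operatorname{Im} P_2 \setminus \{0\}$. Then there exist an open conic neighborhood $U$ of $\xi_0$ and an interval $I_\varepsilon = [t_0, t_0 + \varepsilon]$ (resp. $[t_0 - \varepsilon, t_0]$) for some $0 < \varepsilon \ll 1$ such that for all $b_0 \in S_c^0(U)$,
$$\langle x\rangle^{-r}\bigl(|t - t_0|\langle P_1D\rangle + \langle P_2D\rangle\bigr)^r b_0(D)u(t) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d)).$$
Remark. When $P_1 = 1$, Theorem 10.2 implies that for all $b_0 \in S_c^0(U)$, $\langle x\rangle^{-r}\bigl(|t - t_0|\langle D\rangle\bigr)^r b_0(D)u(t) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$.
Corollary 10.3. Assume $W \in S_+^1(\mathbb{R}^d)$. Let $r > 0$. Let $u_0 \in B^{s_0}(\mathbb{R}^d)$ and $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, B^{s_0}(\mathbb{R}^d))$. If $\langle x\rangle^r u_0 \in B^{s_0}(\mathbb{R}^d)$, then
$$\langle x\rangle^{-r}\langle B(-t)D\rangle^r u(t) \in C(\mathbb{R}, B^{s_0}(\mathbb{R}^d)).$$
The next theorem shows the sharpness of the decay order and direction of the initial data in Theorem 10.1.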
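Theorem 10.4. Let $r, r_1 > 0$ and $\xi_0 \in \mathbb{R}^d \setminus \{0\}$. Assume that $W \in S_+^1(\mathbb{R}^d)$. Assume that for some $f \in C_0^\infty(\mathbb{R}^d)$, not identically zero, $b \in ML(\xi_0)$, $a \in S^0(\mathbb{R}^d)$, $\varepsilon > 0$, and $C > 0$, the following estimate holds:
$$\bigl\|\Lambda^{s_0} f(x)\bigl(|t - t_0|\langle P_1D\rangle + \langle P_2D\rangle\bigr)^r b((t - t_0)P_1D + P_2D)u(t)\bigr\| \le C\|\Lambda^{s_0}\langle x\rangle^{r_1}a(x)u_0\| + C\|\Lambda^{s_0}u_0\| \tag{10.1}$$
for all $t \in I_\varepsilon = [t_0, t_0 + \varepsilon]$ (resp. for all $t \in [t_0 - \varepsilon, t_0]$) and $u_0 \in \mathcal{S}(\mathbb{R}^d)$ with $u(t) = e^{-itH}u_0 \in C(\mathbb{R}_t, \mathcal{S}(\mathbb{R}^d))$. Then $r_1 \ge r$. Moreover if $r = r_1$, then $B_0\xi_0 \notin \operatorname{Char} a$.
Remark. Similarly we can show the sharpness of the decay order and direction of the initial data in Theorem 10.2: if $\xi_0 \in \operatorname{Im} P_1 \setminus \{0\}$ and if the estimate (10.1) with $b((t - t_0)P_1D + P_2D)$ replaced by $b(D)$ holds, we can prove $r_1 \ge r$, and that $B_0F \cap \operatorname{Char} a = \emptyset$ when $r_1 = r$.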
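10.2. Proofs.
Proof of Theorem 10.1. Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Replacing $a$ by another in $ML(B_0\xi_0)$ if necessary, we can assume that $(a(x)\langle x\rangle^\rho)^j u_0 \in B^{s_0}(\mathbb{R}^d)$ for $j = 0, 1, \ldots, N$. Set $\alpha(x, \xi) = 1 + a(x)\langle x\rangle^\rho \in S_+^\rho(\mathbb{R}^{2d})$. By Lemma 5.4 (3), there exist an open conic neighborhood $U$ of $\xi_0$ in $\mathbb{R}^d \setminus \{0\}$, $\varepsilon > 0$, $c > 0$, and $M > 1$ such that $a(x(-t, y, \eta)) = 1$, $|x(-t, y, \eta)| \ge cm(t, \eta)$ if $(t - t_0)P_1\eta + P_2\eta \in U$, $t \in I_\varepsilon = [t_0 - \varepsilon, t_0 + \varepsilon]$, $m(t, \eta) := |t - t_0|\langle P_1\eta\rangle + \langle P_2\eta\rangle \ge M\langle y\rangle$. Take $\gamma \in C^\infty(\mathbb{R}^d)$ such that $0 \le \gamma \le 1$, $\gamma(x) = 1$ if $|x| \ge M + 1$, $\gamma(x) = 0$ if $|x| \le M$. Let $b \in S_c^0(U)$. Set $p(t, y, \eta) = m(t, \eta)^r b((t - t_0)P_1\eta + P_2\eta)\gamma(m(t, \eta)/\langle y\rangle) \in B(I_\varepsilon, S(m(t, \eta)^r))$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-tH_h}\rangle^N(y, \eta) \ge C^{-1}m(t, \eta)^r$ if $t \in I_\varepsilon$ and $(t, y, \eta) \in \operatorname{supp} p$. Therefore $p \in B(I_\varepsilon, S(\langle\alpha \circ e^{-tH_h}\rangle^N))$. By Lemma 6.4, we obtain $p^w(\cdot)u(\cdot) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$. Since $q(t, y, \eta) = m(t, \eta)^r b((t - t_0)P_1\eta + P_2\eta)\bigl(1 - \gamma(m(t, \eta)/\langle y\rangle)\bigr) \in B(I_\varepsilon, S(\langle y\rangle^r))$, we get $\langle x\rangle^{-r}q^w \in \mathrm{Op}\,B(I_\varepsilon, S(1))$. Thus $\langle x\rangle^{-r}m(t, D)^r b((t - t_0)P_1D + P_2D)u(t) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$.
Proof of Theorem 10.2. The proof is similar to that of Theorem 10.1. Take $N \in \mathbb{N}$ such that $\rho = r/N \le 1$. Replacing $a$ by another in $ML(B_0F)$ if necessary, we can assume that $(a(x)\langle x\rangle^\rho)^j u_0 \in B^{s_0}(\mathbb{R}^d)$ for $j = 0, 1, \ldots, N$. Set $\alpha(x, \xi) = 1 + a(x)\langle x\rangle^\rho \in S_+^\rho(\mathbb{R}^{2d})$. By Lemma 5.4 (3), there exist an open conic neighborhood $U$ of $\xi_0$ in $\mathbb{R}^d \setminus \{0\}$, $\varepsilon > 0$, $c > 0$, and $M > 1$ such that
$$a(x(-t, y, \eta)) = 1, \qquad |x(-t, y, \eta)| \ge cm(t, \eta)$$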
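if $\eta \in U$, $t \in I_\varepsilon = [t_0, t_0 + \varepsilon]$, $m(t, \eta) := |t - t_0|\langle P_1\eta\rangle + \langle P_2\eta\rangle \ge M\langle y\rangle$. Take $\gamma \in C^\infty(\mathbb{R}^d)$ such that $0 \le \gamma \le 1$, $\gamma(x) = 1$ if $|x| \ge M + 1$, $\gamma(x) = 0$ if $|x| \le M$. Let $b \in S_c^0(U)$. Set $p(t, y, \eta) = m(t, \eta)^r b(\eta)\gamma(m(t, \eta)/\langle y\rangle) \in B(I_\varepsilon, S(m(t, \eta)^r))$. Then there exists $C > 0$ such that $\langle\alpha \circ e^{-tH_h}\rangle^N(y, \eta) \ge C^{-1}m(t, \eta)^r$ if $t \in I_\varepsilon$ and $(t, y, \eta) \in \operatorname{supp} p$. Therefore $p \in B(I_\varepsilon, S(\langle\alpha \circ e^{-tH_h}\rangle^N))$. By Lemma 6.4, we obtain $p^w(\cdot)u(\cdot) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$. Since $q(t, y, \eta) = m(t, \eta)^r b(\eta)\bigl(1 - \gamma(m(t, \eta)/\langle y\rangle)\bigr) \in B(I_\varepsilon, S(\langle y\rangle^r))$, we get $\langle x\rangle^{-r}q^w \in \mathrm{Op}\,B(I_\varepsilon, S(1))$. Thus $\langle x\rangle^{-r}m(t, D)^r b(D)u(t) \in C(I_\varepsilon, B^{s_0}(\mathbb{R}^d))$. Similarly, we can prove the other claim.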
Proof of Corollary 10.3. This follows directly from Theorem 10.2.
Proof of Theorem 10.4. Suppose $\xi_0 \in \operatorname{Im} P_2 \setminus \{0\}$. Using the a priori estimate at $t = t_0$, we can prove this theorem in the same way as Theorem 2.5 with $\eta_0 = \xi_0$.
Suppose $P_1\xi_0 \ne 0$. There exists $0 < \varepsilon_1 < \varepsilon$ such that $(1 + B_0^{-1}R(t))^{-1}\xi_0 \notin \operatorname{Char} b$ if $|t - t_0| \le \varepsilon_1$. Set $t_1 = t_0 + \varepsilon_1$, $\tilde\xi_0 = (1 + B_0^{-1}R(t_1))^{-1}\xi_0$, $\eta_0 = (P_1/(t_1 - t_0) + P_2)\tilde\xi_0$, and $b_1(\eta) = b((t_1 - t_0)P_1\eta + P_2\eta)$. Then $\eta_0 \notin \operatorname{Char} b_1$. Using the a priori estimate at $t = t_1$ and applying Theorem 2.5, we have $B(-t_1)\eta_0 \notin \operatorname{Char} a$. Since $B(-t_1)\eta_0 = (B_0 + R(t_1))((t_1 - t_0)P_1 + P_2)\eta_0 = B_0\xi_0$, the proof is complete.
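The final identity in the proof of Theorem 10.4 is a short computation with the two complementary projections. The following sketch uses only that $P_1$ and $P_2$ are projections with $P_1P_2 = 0$ and $P_1 + P_2 = 1$, together with the definitions of $\tilde\xi_0$ and $\eta_0$ given in the proof.

```latex
% Sketch: why B(-t_1)\eta_0 = B_0 \xi_0.
\begin{align*}
\bigl((t_1-t_0)P_1 + P_2\bigr)\eta_0
  &= \bigl((t_1-t_0)P_1 + P_2\bigr)\Bigl(\tfrac{1}{t_1-t_0}P_1 + P_2\Bigr)\tilde\xi_0
   = (P_1 + P_2)\tilde\xi_0 = \tilde\xi_0,\\
B(-t_1)\eta_0
  &= (B_0 + R(t_1))\,\tilde\xi_0
   = B_0\bigl(1 + B_0^{-1}R(t_1)\bigr)\bigl(1 + B_0^{-1}R(t_1)\bigr)^{-1}\xi_0
   = B_0\xi_0 .
\end{align*}
```

The first line removes the time-dependent rescaling on the kernel direction, and the second undoes the correction by $R(t_1)$, so the direction tested at time $t_1$ is exactly $B_0\xi_0$.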
Communicated by B. Simon
Commun. Math. Phys. 250, 507–580 (2004) Digital Object Identifier (DOI) 10.1007/s00220-004-1088-5
Communications in
Mathematical Physics
Global Regularity of Wave Maps from $\mathbf{R}^{2+1}$ to $\mathbf{H}^2$. Small Energy
Joachim Krieger
Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ, 08544, USA
Received: 4 September 2003 / Accepted: 14 November 2003
Published online: 28 May 2004 – © Springer-Verlag 2004
Abstract: We demonstrate that Wave Maps with smooth initial data and small energy from $\mathbf{R}^{2+1}$ to the Lobatchevsky plane stay smooth globally in time. Our method is similar to the one employed in [18]. However, the multilinear estimates required are considerably more involved and present novel technical challenges. In particular, we shall have to work with a modification of the functional analytic framework used in [30], [33], [18].

1. Formulation of The Problem and Overview
Let $(M, g)$ be a Riemannian manifold equipped with metric $g = (g_{ij})$. Also, let $\mathbf{R}^{n+1}$, $n \ge 1$, be the standard Minkowski space equipped with metric $(\delta_{ij}) = \operatorname{diag}(-1, 1, \ldots, 1)$. A classical Wave Map $u$ from $\mathbf{R}^{n+1}$ to $(M, g)$ is a smooth map which is critical with respect to the functional

$$u \mapsto \int_{\mathbf{R}^{n+1}} \langle \partial_\alpha u, \partial^\alpha u \rangle_g \, d\sigma.$$
The following notational conventions are used: $d\sigma$ denotes the volume measure associated with $(\delta_{ij})$, $\partial_\alpha u = u_*(\partial_\alpha) \in TM$, $\alpha = 0, 1, \ldots, n$, and Einstein’s summation convention is in force.$^{1}$ Moreover, $\partial_\alpha = \delta_{\alpha\beta}\partial^\beta$. In local coordinates, $u$ is seen to satisfy the following conditions:

$$\Box u^i + \Gamma^i_{jk}\,\partial_\alpha u^j\,\partial^\alpha u^k = 0, \tag{1}$$

where $u = (u^i)$, and $\Gamma^i_{jk}$ are the Riemann-Christoffel symbols associated with the metric $g$ and the local coordinate system.

Present address: Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
1 We shall also use the convention $\partial_0 = \partial_t$.

We are interested in the Cauchy problem associated with (1). Given smooth initial data $u[0] := (u(0), \partial_t u(0)) : 0 \times \mathbf{R}^2 \to M \times TM$ at time $t = 0$, is there a Wave Map $u(t, x)$ extending these globally in time? To start with, we observe that the problem is supercritical with respect to the conserved energy provided $n > 2$.$^{2}$ Thus, one expects development of singularities for ‘large initial data’. Blow-up examples are given for instance in [25], p. 102. Still, in sync with the general philosophy developed by Klainerman, e.g. in [7], one expects existence of classical Wave Maps provided the initial data are smooth and small in the critical Sobolev space $\dot H^{n/2}$. Moreover, for the energy critical case $n = 2$, one hopes for existence of classical Wave Maps for arbitrary smooth data, provided the target $(M, g)$ is ‘sufficiently nice’. Of particular importance is the following conjecture of Klainerman, in light of its close connection to Einstein’s equations under $U(1)$-symmetry$^{3}$:

Conjecture (Klainerman). Let $(\mathbf{H}^2, dg)$ be the standard hyperbolic plane. Then classical Wave Maps originating on $\mathbf{R}^{2+1}$ exist for arbitrary smooth initial data.

Furthermore, numerical evidence elaborated in [2] suggests development of singularities for Wave Maps from $\mathbf{R}^{2+1}$ to $S^2$, provided the (smooth) data are sufficiently large, even under certain symmetry assumptions (equivariance) on the Wave Map. In this paper, we shall establish a partial result towards the conjecture stated above, namely the following small data result:

Theorem 1.1. Let $\mathbf{H}^2$ be the standard hyperbolic plane, consisting of all pairs of real numbers $\{(x, y) \mid y > 0\}$ equipped with the metric $dg = \frac{dx^2 + dy^2}{y^2}$. Then, given smooth initial data $(x, y)[0] : 0 \times \mathbf{R}^2 \to \mathbf{H}^2$ which are sufficiently small in the sense that

$$\int_{0 \times \mathbf{R}^2} \sum_{\alpha=0}^{2} \left[ \left( \frac{\partial_\alpha x}{y} \right)^2 + \left( \frac{\partial_\alpha y}{y} \right)^2 \right] dx < \epsilon$$
for suitably small $\epsilon > 0$, there exists a classical Wave Map from $\mathbf{R}^{2+1}$ to $\mathbf{H}^2$ extending these globally in time.

This result is to be seen as a further step in a long sequence of developments, whose high points are the following achievements:
(1) The subcritical case for $n \ge 2$: strong local well-posedness of (1) in $H^s$, $s > \frac{n}{2}$, by Klainerman-Machedon [9] and Klainerman-Selberg [13].$^{4}$
(2) The critical Besov case for $n \ge 2$: strong global well-posedness of (1) for initial data small in the critical Besov space $\dot B^{n/2,1}$ by Tataru [33].
(3) Global regularity for Wave Maps from $\mathbf{R}^{n+1}$, $n \ge 2$, to $S^k$, $k \ge 1$, provided the initial data are smooth and small in the critical Sobolev space $\dot H^{n/2}$ by Tao [29, 30].

2 To define the energy, for example isometrically embed $(M, g)$ into an ambient Euclidean space using Nash’s embedding theorem, and put $\|u\|^2_{H^1} := \sum_{\alpha=0}^{n} \int_{t=\mathrm{const}} |\partial_\alpha u(t, \cdot)|^2 \, d\sigma$, where $|\cdot|$ denotes Euclidean length. This is easily seen to be a well-defined quantity for classical Wave Maps.
3 Einstein’s equations in vacuo under $U(1)$-symmetry attain the form of a Wave Map originating on a Lorentzian $2 + 1$-manifold $M$ to $\mathbf{H}^2$, coupled with an elliptic system driving the metric on $M$. Our form of the Wave Maps problem is a highly simplified version.
4 The problem for $n = 1$ is globally strongly well-posed in $H^1$, [5]. However, it is not well-posed in the critical $H^{1/2}$ [28].
(4) Extension of the preceding result to the case $n \ge 5$ and more general targets (boundedly parallelizable, compact) by Klainerman-Rodnianski [12].
(5) Massive simplification and extension of the previous case to $n \ge 4$ by Shatah-Struwe [24].
(6) Ill-posedness of the Cauchy problem in $H^s$, $s < \frac{n}{2}$, by d’Ancona-Georgiev [1].

Further developments include an alternative proof of (5) (in more restrictive formulation) by Nahmod-Stefanov-Uhlenbeck [20]$^{5}$, as well as extension of (5) to the case $n \ge 3$ by the author in [17–19]. Also, a recent preprint by D. Tataru [34] (which appeared when the research for this paper had been concluded) promises to solve the small-data case for $n \ge 2$ and targets which can be uniformly isometrically imbedded into some Euclidean space. This condition appears to fail for the hyperbolic plane, as it would require at most polynomial growth of disc areas with respect to their radius.

Adding some comments on the above-listed developments, we observe that in (1) the crucial framework of $X^{s,b}$ spaces in the context of the wave equation was developed, and the null-structure of the schematic type $Q_0(u, v) = \partial_\nu u\,\partial^\nu v$ of the nonlinearity was exploited. This didn’t suffice, however, to settle the critical case, and (2) involved further sophisticated Banach spaces employing decompositions into travelling waves. These were the main harmonic analysis ingredients that went into (3) (provided that $n \le 4$; in the case $n > 4$, Strichartz type spaces sufficed); the important additional features of the work of Tao, however, were the use of an inherent Gauge freedom in the problem, as well as sophisticated trilinear null-form estimates. Construction of a suitable Gauge, in turn, depended upon taking advantage of a hidden skew-symmetry in the equations, which attain the following form when the target is $S^k$:

$$\Box u = -u\,\partial_\alpha u^t\,\partial^\alpha u, \quad u \in S^k \subset \mathbf{R}^{k+1}.$$

The ‘skew-symmetry’ is evidenced by the equality$^{6}$

$$u\,\partial_\alpha u^t\,\partial^\alpha u = (u\,\partial_\alpha u^t - \partial_\alpha u\,u^t)\,\partial^\alpha u.$$
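The equality above rests on the constraint $u^t\,\partial^\alpha u = 0$, which holds because $u$ takes values in the unit sphere. The following is a minimal numerical sketch of this pointwise algebra, not part of the original argument: it draws a random point `u` on a sphere and a random tangent vector `du` standing in for a single derivative $\partial_\alpha u$, and compares both sides. The helper names (`random_sphere_point`, `tangent_vector`, `check_skew_identity`) are hypothetical and introduced only for illustration.

```python
import math
import random


def outer(a, b):
    """Outer product a b^t as a list of rows."""
    return [[ai * bj for bj in b] for ai in a]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def random_sphere_point(dim, rng):
    """Random unit vector in R^dim (a point on the sphere S^{dim-1})."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def tangent_vector(u, rng):
    """Random vector orthogonal to u; stands in for one derivative of u."""
    w = [rng.gauss(0.0, 1.0) for _ in u]
    dot = sum(a * b for a, b in zip(u, w))
    return [wi - dot * ui for wi, ui in zip(w, u)]


def check_skew_identity(dim=4, seed=0):
    """Return (difference of both sides, defect from skew-symmetry)."""
    rng = random.Random(seed)
    u = random_sphere_point(dim, rng)
    du = tangent_vector(u, rng)
    a = outer(u, du)   # u du^t
    b = outer(du, u)   # du u^t
    skew = [[a[i][j] - b[i][j] for j in range(dim)] for i in range(dim)]
    lhs = matvec(a, du)      # u du^t du
    rhs = matvec(skew, du)   # (u du^t - du u^t) du
    max_diff = max(abs(x - y) for x, y in zip(lhs, rhs))
    skew_defect = max(abs(skew[i][j] + skew[j][i])
                      for i in range(dim) for j in range(dim))
    return max_diff, skew_defect


max_diff, skew_defect = check_skew_identity()
```

Both returned quantities should vanish up to rounding: the extra term $\partial_\alpha u\,(u^t\,\partial^\alpha u)$ is zero by tangency, and the matrix in parentheses is antisymmetric by construction.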
5 This team also recently announced the $3 + 1$ case, provided the target is a symmetric space.
6 The matrix $(u\,\partial_\alpha u^t - \partial_\alpha u\,u^t)$ occurring on the right-hand side is skew-symmetric.
7 This was originally introduced in [3].
8 Corresponding to the case when $\phi$ is at much lower frequency than $\nabla\phi$.

The Gauge was used in order to eliminate those frequency interactions for which $u$ has the smallest of all occurring frequencies. In (4), the ‘extrinsic formulation’ via embedding the target isometrically was abandoned, and replaced by an ‘intrinsic approach’$^{7}$, parametrizing the Wave Map by means of variables $\{\phi^i_\alpha\}$, which express the derivatives of the Wave Map $\partial_\alpha u$ in terms of a global orthonormal frame $\{e_i\}$ for $TM$, provided the latter is a trivial bundle: $\partial_\alpha u = \phi^i_\alpha e_i$. One then obtains a first order system of equations of divergence-curl type for the $\phi^i_\alpha$, which in turn lead to Wave equations of the schematic form $\Box\phi = \phi\nabla\phi + \phi^3$. While the nonlinearity in the above is not amenable to estimation, Klainerman-Rodnianski exploited a further skew-symmetry (hinging on the orthogonality of the frame $\{e_i\}$), as well as the introduction of a partial Coulomb Gauge (which in turn takes advantage of the freedom in choosing the frame $\{e_i\}$), in order to modify the nonlinearity: more precisely, singling out the ‘bad frequency interactions’ in the nonlinearity$^{8}$, they choose
a Gauge which allows elimination of just these. The remaining frequency interactions in the nonlinearity can then be estimated by means of Strichartz type norms, provided $n \ge 5$, without invocation of any null-structures (as in the ‘high-dimensional’ work [29] ($n \ge 5$) of Tao). In (5), Shatah-Struwe observed that using a global Coulomb Gauge, one could immediately modify the form of the equations schematically to the following:

$$\Box\phi = \nabla^{-1}(\phi^2)\,\nabla_{x,t}\phi.$$

The nonlinearity here can be estimated for $n \ge 4$ by means of somewhat nonstandard Strichartz type estimates (involving Lorentz spaces), without further use of microlocalization. A particularly transparent proof results. The previous work of the author [17–19] combined this approach (global Coulomb Gauge) with the functional analytic framework of Tataru and multilinear null-form estimates of the type considered by Tao in [30], to settle the case $n = 3$. The null-structure in turn was rendered visible by using the device of dynamic separation$^{9}$, exploiting a Hodge-type decomposition of the variables $\phi^i_\alpha$. While the estimates are messy, they are not as tough as the ones in [30] for the case $n = 2$, as the linear theory is better in this somewhat higher dimensional setting. In the present paper, we extend these investigations to the case $n = 2$ and target $\mathbf{H}^2$. While parts of the method appear to carry over to more general targets (the cancellations we require stem from a quite general relation between Jacobi- and Christoffel symbols$^{10}$), we prefer to stick to the present case on account of its transparency. One expects to obtain the same result for targets of bounded geometry, i.e. all covariant derivatives of the curvature tensor with respect to arbitrary slowly varying unit vector fields need to be globally bounded$^{11}$. This would in particular encompass the hyperbolic plane as well as the class of targets considered in Tataru’s recent preprint, and is possibly in some sense optimal.
9 This terminology was suggested by S. Klainerman; it was already used in a different (bilinear) context in [11].
10 Letting $\{e_i\}$ be an orthonormal frame as in the preceding discussion, and letting $\nabla_{e_j} e_k = \Gamma^i_{jk} e_i$, $[e_j, e_k] = C^i_{jk} e_i$, the identity we need is $C^i_{jk} = \Gamma^i_{jk} - \Gamma^i_{kj}$.
11 More precisely, we need $|\nabla_{e_1}\nabla_{e_2}\cdots\nabla_{e_a} R_{ijkl}| < C$ provided $|\nabla_{e_i} e_j| < C$ and $|e_r| = 1$. Possibly it suffices to require this condition for only finitely many derivatives.

The nontrivial additional difficulty has to do with the fact that in general, the intrinsic approach no longer leads to an autonomous system, as is the case for the hyperbolic plane. Our approach in this paper is to use the differentiated formulation of the Wave Maps problem. As mentioned before, Wave Maps to the hyperbolic plane have the pleasant property that going to the differentiated problem allows one to deduce an autonomous system which no longer involves the actual Wave Map $u$. In particular, one can avoid Moser type estimates. Also, the construction of a global Coulomb Gauge à la Shatah-Struwe is simple and explicit, thanks to the fact that $SO(2)$ is abelian. This focuses the difficulty purely on the null-form estimates, which are qualitatively distinct from the ones in [30], since we lose one degree of smoothness for large frequencies. This is in marked contrast with Tataru’s recent approach, which uses an embedded formulation of the problem, without going to the derivative:

$$\Box u^i = S^i_{jk}(u)\,\partial_\alpha u^j\,\partial^\alpha u^k,$$

where $S^i_{jk}$ is the 2nd fundamental form of the isometric embedding into some Euclidean space. Tataru demonstrates that on account of the fact that essentially the same cancellation occurs here as for the sphere ($S^i_{jk}(u)\,\partial_\alpha u^i = 0$), one can rely on the same Gauge
construction and trilinear estimates as the ones in [30], provided the target is uniformly isometrically embeddable into some Euclidean space. The novelty of Tataru’s approach is thus more analytic, and in particular relies heavily on Moser type estimates. While the overall strategy in the present paper is quite similar to the one pursued in [18, 17], we have to take into account additional structures in the nonlinearity. These have to do with the elementary observation that ‘Coulomb-Gauging’ the $\phi^i_\alpha$ not only improves the resulting Wave equations, but also the underlying divergence-curl system. This allows us to formulate the Wave Map system in the form (8). We shall then use the device of dynamic separation to decompose the nonlinearity into various null-forms and error terms, which are analyzed by methods similar to [30, 18]. As already mentioned, the main difference with respect to [30] is that we work at the level of the derivative of the Wave Map. This loss of smoothness forces us to modify the spaces employed in [30, 18], using finer decompositions (‘discs’ instead of ‘angular sectors’) on the Fourier side, since high-high frequency interactions become harder to analyze$^{12}$. The overall technical scheme underlying this paper is similar to the one introduced by Tao in [29, 30]: we bootstrap the energies of the frequency localized pieces $\|P_k\phi^i_\alpha\|_{L^2_x}$, $k \in \mathbf{Z}$, on every time slice $t = \mathrm{const}$$^{13}$. Gaining global control over the distribution of energy amongst the frequencies then allows us to control some subcritical norm $\|\phi\|_{H^\sigma}$, $\sigma > 0$, which by means of the subcritical result (1) proves Theorem 1.1.

2. Wave Maps to $\mathbf{H}^2$
We identify $\mathbf{H}^2$ with the upper half-plane and standard metric as before. Choose the orthonormal frame $\{e_{1,2}\} = \{-y\partial_x, -y\partial_y\}$. We obtain a representation of the derivatives of the Wave Map as follows:

$$\partial_\alpha u = \sum_{i=1,2} \phi^i_\alpha e_i.$$
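Proceeding as in [12, 18], one deduces the following divergence-curl system, provided $u$ is a Wave Map:

$$\partial_\beta\phi^1_\alpha - \partial_\alpha\phi^1_\beta = \phi^1_\alpha\phi^2_\beta - \phi^1_\beta\phi^2_\alpha, \tag{2}$$

$$\partial_\beta\phi^2_\alpha - \partial_\alpha\phi^2_\beta = 0, \tag{3}$$

$$\partial_\alpha\phi^{1\alpha} = -\phi^1_\alpha\phi^{2\alpha}, \tag{4}$$

$$\partial_\alpha\phi^{2\alpha} = \phi^1_\alpha\phi^{1\alpha}. \tag{5}$$

12 This was not necessary in $3 + 1$ dimensions, since the linear theory provides better estimates.
13 $P_k$ denote standard Littlewood-Paley multipliers.

We pass to complex notation and introduce the Coulomb Gauge:

$$\psi_\alpha = \psi^1_\alpha + i\psi^2_\alpha := (\phi^1_\alpha + i\phi^2_\alpha)\, e^{-i\Delta^{-1}\sum_{j=1}^{2}\partial_j\phi^1_j},$$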
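where $\Delta^{-1}$ is given by convolution with the standard Green’s function on $\mathbf{R}^2$. One then easily verifies the fundamental divergence-curl system:

$$\partial_\alpha\psi_\beta - \partial_\beta\psi_\alpha = i\psi_\beta\,\Delta^{-1}\sum_{j=1,2}\partial_j(\psi^1_\alpha\psi^2_j - \psi^2_\alpha\psi^1_j) - i\psi_\alpha\,\Delta^{-1}\sum_{j=1,2}\partial_j(\psi^1_\beta\psi^2_j - \psi^2_\beta\psi^1_j), \tag{6}$$

$$\partial_\nu\psi^\nu = i\psi^\nu\,\Delta^{-1}\sum_{j=1,2}\partial_j(\psi^1_\nu\psi^2_j - \psi^2_\nu\psi^1_j). \tag{7}$$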
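One deduces the following Wave Equations:

$$\begin{aligned}
\Box\psi_\alpha ={}& i\partial^\beta\Big[\psi_\alpha\,\Delta^{-1}\sum_{j=1}^{2}\partial_j[\psi^1_\beta\psi^2_j - \psi^2_\beta\psi^1_j]\Big] \\
&- i\partial^\beta\Big[\psi_\beta\,\Delta^{-1}\sum_{j=1}^{2}\partial_j[\psi^1_\alpha\psi^2_j - \psi^2_\alpha\psi^1_j]\Big] \\
&+ i\partial_\alpha\Big[\psi_\nu\,\Delta^{-1}\sum_{j=1}^{2}\partial_j[\psi^{1\nu}\psi^2_j - \psi^{2\nu}\psi^1_j]\Big].
\end{aligned} \tag{8}$$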
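As in [18], the nonlinearity here does not display an obvious null-structure. In order to render it visible, we apply dynamic decomposition to the variables $\psi_\alpha$, by writing

$$\psi_\nu = -R_\nu\sum_{k=1}^{2} R_k\psi_k + \chi_\nu,$$

where $R_\alpha := (\sqrt{-\Delta_x})^{-1}\partial_\alpha$, $\alpha = 0, 1, 2$, is a Riesz type operator. Substituting the ‘hyperbolic terms’ $R_\nu\sum_{k=1}^{2} R_k\psi_k$ into the right-hand side of (8) results in a trilinear expression featuring a null-structure. On the other hand, substituting at least one ‘elliptic’ term $\chi_\nu$ yields quintilinear terms, upon noting that

$$\sum_{j=1,2}\partial_j\chi_j = 0, \tag{9}$$

$$\partial_i\chi_\nu - \partial_\nu\chi_i = \partial_i\psi_\nu - \partial_\nu\psi_i, \tag{10}$$

whence

$$\chi_\nu = i\sum_{i,j=1}^{2}\partial_i\Delta^{-1}\big(\psi_\nu\,\Delta^{-1}\partial_j(\psi^1_i\psi^2_j - \psi^1_j\psi^2_i) - \psi_i\,\Delta^{-1}\partial_j(\psi^1_\nu\psi^2_j - \psi^1_j\psi^2_\nu)\big). \tag{11}$$

Indeed, one obtains expressions of the following schematic type:

$$\nabla\big(\nabla^{-1}(\psi\nabla^{-1}(\psi^2))\,\nabla^{-1}(\psi^2)\big), \tag{12}$$

$$\nabla\big(\psi\,\nabla^{-1}(\nabla^{-1}(\psi\nabla^{-1}(\psi^2))\,\psi)\big). \tag{13}$$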
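Identity (9) can be checked on the Fourier side, where (with the usual convention that $\partial_j$ has symbol $i\xi_j$) the spatial Riesz operators $R_j$ act by multiplication with $i\xi_j/|\xi|$. The sketch below, which is only an illustration and not part of the paper, evaluates the spatial components $\chi_j = \psi_j + R_j\sum_k R_k\psi_k$ at a single spatial frequency and confirms that their divergence vanishes. The function names are hypothetical, and the Fourier coefficients `psi_hat` are arbitrary sample values.

```python
import math


def riesz_symbol(xi, j):
    """Symbol i*xi_j/|xi| of R_j = (sqrt(-Delta))^{-1} d_j at frequency xi."""
    return 1j * xi[j] / math.hypot(xi[0], xi[1])


def elliptic_part(xi, psi_hat):
    """Spatial components chi_j = psi_j + R_j sum_k R_k psi_k, Fourier side."""
    hyperbolic_sum = sum(riesz_symbol(xi, k) * psi_hat[k] for k in range(2))
    return [psi_hat[j] + riesz_symbol(xi, j) * hyperbolic_sum
            for j in range(2)]


def divergence(xi, v_hat):
    """Fourier symbol of sum_j d_j v_j, namely sum_j i*xi_j*v_j."""
    return sum(1j * xi[j] * v_hat[j] for j in range(2))


# Arbitrary sample frequency and Fourier coefficients of (psi_1, psi_2).
xi = (3.0, -4.0)
psi_hat = [1.0 + 2.0j, -0.5 + 0.25j]
chi_hat = elliptic_part(xi, psi_hat)
div_chi = divergence(xi, chi_hat)
```

In this picture $\chi$ is the projection of $(\psi_1, \psi_2)$ onto divergence-free fields, so `div_chi` is zero up to rounding while the divergence of `psi_hat` itself generally is not.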
The basic idea is that the quintilinear terms should be easier to estimate than the trilinear ones, and should essentially be amenable to estimation by means of Strichartz type norms. Unfortunately, this is strictly speaking only true for the first quintilinear expression, and we have been unable to find an elegant method for dealing with the second. The reason for this is that the Strichartz type norms available to us (Lemma 3.1, Lemma 6.7) are just not quite good enough for dealing with certain frequency interactions. Our way out of this is to exploit a (somewhat cumbersome) null-structure via further dynamic separations, and treat the expression similarly to the trilinear null-forms. Fortunately, we then don’t have to take advantage of the same subtle cancellations as for the trilinear expressions. The septilinear error terms generated by this procedure are finally easy to estimate, essentially by means of Strichartz type norms.

3. Technical Preparations
We shall employ Banach spaces closely modelled upon the ones in [33, 30, 18]. First, we recall the homogeneous Besov analogues of the classical $X^{s,b}$-spaces of Klainerman-Machedon: We introduce the Littlewood-Paley localizers $P_k$ which restrict frequency to dyadic size $\sim 2^k$, $k \in \mathbf{Z}$. More precisely, choosing a smooth nonnegative bump function $m_0(x) : \mathbf{R} \to \mathbf{R}$ with support on $\frac{1}{4} < x < 4$ and satisfying

$$\sum_{k\in\mathbf{Z}} m_0\Big(\frac{x}{2^k}\Big) = 1, \quad x \in \mathbf{R}_+,$$

we define for $f(x) \in \mathcal{S}(\mathbf{R}^2)$ or $f \in \mathcal{S}(\mathbf{R}^{2+1})$, $\widehat{P_k f}(\xi) := m_0\big(\frac{|\xi|}{2^k}\big)\hat f(\xi)$.$^{14}$ Similarly, let $Q_j$, $j \in \mathbf{Z}$, microlocalize to dyadic distance $\sim 2^j$ from the light cone. Thus we put

$$\widetilde{Q_j\phi}(\tau, \xi) = m_0\Big(\frac{|\,|\tau| - |\xi|\,|}{2^j}\Big)\tilde\phi(\tau, \xi).$$

We usually denote the (space-time) Fourier transform on $\mathbf{R}^{2+1}$ by $\tilde{\ }$, and use $(\tau, \xi)$ as coordinates on the (space-time) Fourier side. For future reference, we also introduce the multipliers $Q^\pm_j$, where

$$\widetilde{Q^\pm_j\phi}(\tau, \xi) = m_0\Big(\frac{|\,|\tau| - |\xi|\,|}{2^j}\Big)\chi_{\gtrless 0}(\tau)\,\tilde\phi(\tau, \xi),$$

where $\chi_{>0}$, $\chi_{<0}$ are the indicator functions of $\tau > 0$, $\tau < 0$. Observe that for Schwartz functions $\psi \in \mathcal{S}(\mathbf{R}^{2+1})$ we have $Q_j\psi = Q^+_j\psi + Q^-_j\psi$. We also have the obviously defined variants $Q_{<j}$, etc. The quantity $2^j$ will be called modulation, a notation inherited from [30]. Now we put$^{15}$

$$\|\phi\|_{\dot X_k^{\lambda,p,q}} = 2^{\lambda k}\Big(\sum_{j\in\mathbf{Z}}\big[2^{pj}\|Q_j\phi\|_{L^2_tL^2_x}\big]^q\Big)^{\frac{1}{q}}.$$

14 More details are to be found in the fundamental work [23].
15 For lots of information concerning $X^{s,\theta}$ spaces in the subcritical context, consult [14].
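A bump function $m_0$ with the dyadic partition-of-unity property used to define $P_k$ can be produced by a standard telescoping construction. The sketch below is one possible choice, not necessarily the one the author has in mind: starting from a smooth nonincreasing cutoff `eta` equal to 1 on $x \le 1$ and 0 on $x \ge 2$, it sets $m_0(x) = \eta(x) - \eta(2x)$, which is supported in $[\frac12, 2]$ (inside the window $\frac14 < x < 4$ required above). All function names are hypothetical.

```python
import math


def smooth_step(t):
    """Smooth nondecreasing function equal to 0 for t <= 0 and 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))


def eta(x):
    """Smooth nonincreasing cutoff: 1 for x <= 1, 0 for x >= 2."""
    return 1.0 - smooth_step(x - 1.0)


def m0(x):
    """Nonnegative bump supported in [1/2, 2]; dyadic dilates sum to 1."""
    return eta(x) - eta(2.0 * x)


def dyadic_sum(x, kmin=-60, kmax=60):
    """Truncated sum of m0(x / 2^k) over kmin <= k <= kmax."""
    return sum(m0(x / 2.0 ** k) for k in range(kmin, kmax + 1))


sample_points = [0.001, 0.3, 1.0, 5.7, 1000.0]
sums = [dyadic_sum(x) for x in sample_points]
```

The sum telescopes: the partial sum over $-N \le k \le N$ equals $\eta(x/2^N) - \eta(2^{N+1}x)$, which tends to $1 - 0$ for every $x > 0$. At each fixed $x$ at most two dilates are nonzero, so the truncation in `dyadic_sum` is harmless for the sample points used here.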
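We shall mostly need the versions corresponding to the triples $(0, \frac{1}{2}, \infty)$ and $(0, \frac{1}{2}, 1)$. Furthermore, we introduce the Banach spaces $S[k, \kappa]$ as in [30], [18]$^{16}$, where $k \in \mathbf{Z}$ indexes the frequency region, and $\kappa \subset S^1$ is a small cap. These spaces consist of three ingredients:

$$\|\phi\|_{S[k,\kappa]} = \|\phi\|_{L^\infty_tL^2_x} + 2^{-\frac{k}{2}}|\kappa|^{-\frac{1}{2}}\|\phi\|_{PW[\kappa]} + \|\phi\|_{NFA^*[\kappa]}.$$

Here we let $PW[\kappa]$ be the atomic Banach space whose atoms are Schwartz functions $\psi \in \mathcal{S}(\mathbf{R}^{2+1})$ with the property

$$\exists\,\omega \in \kappa \ \text{s.t.}\ \|\psi\|_{L^2_{t_\omega}L^\infty_{x_\omega}} \le 1.$$

Also, we define

$$\|\psi\|_{NFA[\kappa]^*} = \sup_{\omega\notin 2\kappa}\operatorname{dist}(\omega, \kappa)\,\|\psi\|_{L^\infty_{t_\omega}L^2_{x_\omega}}.$$

We immediately observe that these definitions imply the following fundamental bilinear inequality: assume $2\kappa \cap 2\kappa' = \emptyset$. Then

$$\|\phi\psi\|_{L^2_tL^2_x} \lesssim \frac{|\kappa'|^{\frac{1}{2}}2^{\frac{k'}{2}}}{\operatorname{dist}(\kappa, \kappa')}\|\phi\|_{S[k,\kappa]}\|\psi\|_{S[k',\kappa']}. \tag{14}$$

Furthermore, as in [30, 18], we note the following fundamental relation between these new spaces and $X^{s,b}$ type spaces just introduced: let $\psi \in \mathcal{S}(\mathbf{R}^{2+1})$. Then we have $\|P_kQ^\pm$ […] $0$, choose $\epsilon$ very small. Put $\frac{T+\epsilon}{T} = \lambda$. Then we have

$$\|P_k\psi\|_{S[k]([-T-\epsilon,\,T+\epsilon]\times\mathbf{R}^2)} = \lambda\,\|P_{k+\log_2\lambda}\psi_\lambda\|_{S[k+\log_2\lambda]([-T,T]\times\mathbf{R}^2)},$$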
where we have put $\psi_\lambda(t, x) = \psi(\lambda t, \lambda x)$. Now we estimate

$$\begin{aligned}
&\big|\,\|P_k\psi\|_{S[k]([-T-\epsilon,\,T+\epsilon]\times\mathbf{R}^2)} - \|P_k\psi\|_{S[k]([-T,T]\times\mathbf{R}^2)}\big| \\
&\quad\le \big|\,\|P_k\psi\|_{S[k]([-T,T]\times\mathbf{R}^2)} - \|P_{k+\log_2\lambda}\psi_\lambda\|_{S[k+\log_2\lambda]([-T,T]\times\mathbf{R}^2)}\big| + |\lambda - 1|\,\|P_{k+\log_2\lambda}\psi_\lambda\|_{S[k+\log_2\lambda]([-T,T]\times\mathbf{R}^2)} \\
&\quad\lesssim \|P_k\psi - P_{k+\log_2\lambda}\psi_\lambda[0]\|_{L^2_x} + \|\Box(P_k\psi - P_{k+\log_2\lambda}\psi_\lambda)\|_{L^1_t\dot H^{-1}} + |\lambda - 1|\,\|P_{k+\log_2\lambda}\psi_\lambda\|_{S[k+\log_2\lambda]([-T,T]\times\mathbf{R}^2)}.
\end{aligned}$$

Letting $\epsilon \to 0$, whence $\lambda \to 1$, yields the claim. In the last inequality, we have used the ‘energy inequality’ (21) to be discussed below. Continuity at $T = 0$ follows directly from the ‘energy inequality’.

As indicated in the first section, we shall not be able to place the (frequency localized) nonlinearity into the classical energy space $L^1_t\dot H^{-1}$ (which has the right scaling). We shall use spaces $N[k]$, $k \in \mathbf{Z}$, as well as time localized versions $N[k]([-T, T]\times\mathbf{R}^2)$ for that purpose, which, up to minor wrinkles, are defined in perfect analogy with [30, 18]: indeed, we let $N[k]$ be the atomic Banach space whose atoms are Schwartz functions $F \in \mathcal{S}(\mathbf{R}^{2+1})$, at frequency $\sim 2^k$, as well as satisfying one of the following properties:
(1) $\|F\|_{L^1_t\dot H^{-1}} \le 1$ and $F$ has modulation $< 2^{k+100}$.
(2) $F$ is at modulation $\sim 2^j$ and satisfies $\|F\|_{L^2_tL^2_x} \le 2^{\frac{j}{2}}2^k$.
(3) $F$ satisfies $\|F\|_{\dot X_k^{-\frac{1}{2},-1,2}} \le 1$, and one can write $F = \partial_t\tilde F$ for some $\tilde F \in \mathcal{S}(\mathbf{R}^{2+1})$ with the property $\|\tilde F\|_{L^M_tL^2_x} \le 2^{(1-\frac{1}{M})k}$, for $M$ as in the definition of $S[k]$.
(4) There exists an integer $l < -10$, and Schwartz functions $F_\kappa$ with Fourier support in the region

$$\Big\{(\tau, \xi)\ \Big|\ \pm\tau > 0,\ |\,|\tau| - |\xi|\,| \le 2^{k-2l-100},\ \frac{\xi}{|\xi|} \in \pm\kappa\Big\},$$

with the properties

$$F = \sum_{\kappa\in K_l}F_\kappa, \qquad \Big(\sum_{\kappa\in K_l}\|F_\kappa\|^2_{NFA[\kappa]}\Big)^{\frac{1}{2}} \le 2^k.$$

In the last inequality, $NFA[\kappa]$ denotes the dual of $NFA[\kappa]^*$ used in the definition of $S[k, \kappa]$. We immediately note the 2nd pivotal bilinear inequality, which is essentially dual to (14): letting the notation and assumptions be like there, we have

$$\|\phi\psi\|_{NFA[\kappa]} \lesssim \frac{|\kappa'|^{\frac{1}{2}}2^{\frac{k'}{2}}}{\operatorname{dist}(\kappa, \kappa')}\|\phi\|_{L^2_tL^2_x}\|\psi\|_{S[k',\kappa']}. \tag{20}$$
Note that $NFA[\kappa]$ is the atomic Banach space whose atoms are Schwartz functions $F$ satisfying

$$\frac{1}{\operatorname{dist}(\omega, \kappa)}\|F\|_{L^1_{t_\omega}L^2_{x_\omega}} \le 1$$

for some $\omega \notin 2\kappa$. We still need to tie the two classes of spaces $N[k]$, $S[k]$ or their time-localized versions together by means of an energy inequality. This shall be proved in an appendix, essentially as in [33, 30], and can be stated as follows$^{17}$: $\|P_k\psi\|_{S[k]([-T,T]\times\mathbf{R}^2)} \lesssim \inf\,[\min\{2^kT_0, 1\}^{-\frac{1}{M}}\|\Box P_k\psi\|_{N[k]([-T,T]\times\mathbf{R}^2)}$ […] $0$ with the property $c_a2^{-\sigma|a-b|} \le c_b \le c_a2^{\sigma|a-b|}$. Of particular relevance for us is the following kind of frequency envelope: take the initial data $\psi(0) = (\psi_\alpha(0))$, $\sigma > 0$, and form

$$c_k = \Big(\sum_{l\in\mathbf{Z}}2^{-\sigma|k-l|}\|P_l\psi(0)\|^2_{L^2_x}\Big)^{\frac{1}{2}}.$$

The following proposition is the heart of the paper:

Proposition 4.1. Let $\psi := \{\psi_\nu\}$ be as in the preceding discussion. Given $K > 0$ sufficiently large, there exist $\sigma > 0$, $\epsilon > 0$ sufficiently small, such that the following holds: assume that for some $T > 0$, we have

$$\|P_k\psi\|_{S[k]([-T,T]\times\mathbf{R}^2)} \le Kc_k, \qquad \|\psi(0)\|_{L^2_x} \le \epsilon.$$

Then, the first inequality holds with $\frac{K}{2}$ instead.
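The frequency envelope $c_k$ formed from the initial data before Proposition 4.1 is easy to explore numerically. The sketch below is an illustration only: it takes a hypothetical finite table of dyadic energies $\|P_l\psi(0)\|^2_{L^2_x}$ (the values are made up), builds $c_k$ by the displayed formula, and checks the slowly varying property $c_a2^{-\sigma|a-b|} \le c_b \le c_a2^{\sigma|a-b|}$ on a range of indices. The function names are assumptions introduced here.

```python
import math


def frequency_envelope(energies, sigma):
    """Return k -> c_k = (sum_l 2^{-sigma|k-l|} E_l)^{1/2}.

    `energies` maps a dyadic index l to E_l = ||P_l psi(0)||_{L^2}^2;
    indices missing from the table are treated as carrying zero energy.
    """
    def c(k):
        total = sum(2.0 ** (-sigma * abs(k - l)) * e
                    for l, e in energies.items())
        return math.sqrt(total)
    return c


def is_slowly_varying(c, sigma, indices, tol=1e-12):
    """Check c_a 2^{-sigma|a-b|} <= c_b <= c_a 2^{sigma|a-b|} on `indices`."""
    for a in indices:
        for b in indices:
            growth = 2.0 ** (sigma * abs(a - b))
            if not (c(a) / growth * (1 - tol) <= c(b) <= c(a) * growth * (1 + tol)):
                return False
    return True


# Made-up dyadic energies, concentrated at a few frequencies.
energies = {-3: 0.5, 0: 2.0, 1: 0.1, 7: 1.5}
c = frequency_envelope(energies, sigma=0.25)
slowly_varying = is_slowly_varying(c, 0.25, range(-12, 13))
```

The property holds for any nonnegative energies, since $|b - l| \ge |a - l| - |a - b|$ gives $c_b^2 \le 2^{\sigma|a-b|}c_a^2$. The envelope also dominates the data, $c_k \ge \|P_k\psi(0)\|_{L^2_x}$, because the term $l = k$ enters with weight 1.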
Proposition 4.1 implies Theorem 1.1. Finite speed of propagation allows us to assume that the initial data are compactly supported. Assume that $(-T, T)$, $T > 0$, is a maximal interval of existence for the Wave Map. Suppose $\sup_{T'}$ […] $0$ sufficiently large, there exists some $T' \in (0, T)$ with the property

$$\sup_{k\in\mathbf{Z}} c_k^{-1}\|P_k\psi\|_{S[k]([-T',T']\times\mathbf{R}^2)} = K.$$

But this contradicts Proposition 4.1, so we conclude that there is some $K_0 > 0$ with the property $\|P_k\psi\|_{S[k]([-T',T']\times\mathbf{R}^2)} \le K_0c_k$ $\forall\, T' \in [0, T)$. From the definition of $\{c_k\}$, we infer that $\|\psi\|_{L^\infty_tH^\delta([-T,T]\times\mathbf{R}^2)} < \infty$ for $0 \le \delta < \sigma$, where $\sigma$ is as in the proposition. Choose $\delta_{1,2}$ with $\sigma > \delta_1 > \delta_2 > 0$. On every fixed time slice $t = \mathrm{const}$, $t \in (-T, T)$, we compute $\|P_l\phi_\nu\|_{H^{\delta_2}} = \|P_l[e^{i\Delta^{-1}\sum_j\partial_j\phi^1_j}$ […] $\le \|P_l[P$ […] $0$, where $T_0$ remains to be chosen. Observe that the divergence-curl system and in particular (6) implies

$$P_0\psi_i(t, \cdot) = P_0\psi_i(0, \cdot) + \int_0^t P_0[\partial_i\psi_t](s, \cdot)\,ds + \int_0^t P_0[\psi\nabla^{-1}(\psi^2)](s, \cdot)\,ds,$$
where the 2nd integrand is of course written schematically, $i = 1, 2$. We observe that Lemma 3.1 and the assumptions in Proposition 4.1 imply

$$\|P_0[\psi\nabla^{-1}(\psi^2)]\|_{L^M_tL^2_x} \le CK^3c_0\sum_{k\in\mathbf{Z}}c_k^2.$$

Using Hölder’s inequality, we deduce that

$$\|P_0\psi_i(t, \cdot)\|_{L^2} \lesssim c_0(1 + KT + K^3T^{1-\frac{1}{M}}).$$

Choosing $K$ large enough and $T_0$ small enough, we infer that

$$\|P_0\psi_i\|_{L^\infty_tL^2_x([-T,T]\times\mathbf{R}^2)} < \frac{K}{100}c_0,$$

provided $T \in [0, T_0]$. Again using Hölder, one gets the same bound for $\|P_0\psi_i\|_{L^2_tL^2_x([-T,T]\times\mathbf{R}^2)}$. Similarly, using the divergence-curl system, the same conclusion follows for $\|P_0\partial_t\psi_i\|_{L^2_tL^2_x}$, $\|P_0\partial_t\psi_i\|_{L^M_tL^2_x}$, provided we also choose $\sum_{k\in\mathbf{Z}}c_k^2$ small enough. Now we build a Schwartz extension of $P_0\psi_i|_{[-T,T]}$ as follows: first choose a Schwartz function $f_i(t, x)$ extending $P_0\partial_t\psi_i|_{[-T,T]}$ as well as satisfying

$$\|P_0f_i\|_{L^M_tL^2_x} + \|P_0f_i\|_{L^2_tL^2_x} < \frac{K}{100}c_0,$$

which is possible according to the preceding considerations. Then we set

$$\tilde\psi_i := \eta_{T_1}(t)\Big[P_0\psi_i(0, x) + \int_0^t f_i(s, x)\,ds\Big],$$

where $\eta_{T_1}$, $T_1 > T_0$, is a smooth cutoff supported in $[-2T_1, 2T_1]$ with $\eta_{T_1}|_{[-T_1,T_1]} \equiv 1$ and $\|\partial_t\eta_{T_1}(t)\|_{L^\infty_t} \le 2T_1^{-1}$. Observe that

$$\Big\|\partial_t\eta_{T_1}(t)\int_0^t f_i(s, x)\,ds\Big\|_{L^\infty_tL^2_x} \le \frac{K}{50}T_1^{-1}c_0.$$

This yields that

$$\Big\|\partial_t\eta_{T_1}(t)\int_0^t f_i(s, x)\,ds\Big\|_{L^M_tL^2_x} + \Big\|\partial_t\eta_{T_1}(t)\int_0^t f_i(s, x)\,ds\Big\|_{L^2_tL^2_x} \le \frac{K}{10}c_0.$$

The desired properties of $\tilde\psi_i$ follow from this, if necessary replacing 10 by a bigger factor to counteract the loss in (18). The claim for $\psi_0$ follows similarly upon invoking (7).

5. Preparing the Bootstrapping; Part II
5.1. Preliminaries.
Now we assume that the Wave Map exists on a time interval $[-T, T]$, where $T \ge T_0$, the latter as in the previous section. We shall now revert to the wave equations satisfied by the $\psi_\alpha$. More precisely, writing schematically $\Box\psi_\alpha = F_\alpha$, where $F_\alpha$ stands for the expression in (8), we need to choose a Schwartz extension $\tilde F_\alpha$ of $F_\alpha|_{[-T,T]}$, and solve the corresponding wave equation with given Cauchy data. An appropriate truncation will then define our new Schwartz extension of $\psi_\alpha|_{[-T,T]}$. A good candidate for $\tilde F_\alpha$ is of course obtained by simply substituting suitable Schwartz extensions of $\psi_\alpha$ implied by the bootstrap assumptions in the proposition. However, this
will not result in good terms. Another option is to apply dynamic separation, yielding trilinear null-forms and quintilinear error terms. Unfortunately, it turns out that the trilinear null-forms cause trouble in certain elliptic regimes, and we have to apply a somewhat messy ‘partial dynamic separation’ in which not all terms are decomposed into hyperbolic and elliptic parts, depending on microlocal properties of other terms. Leaving the details until later, we assume that we have found a suitable extension $\tilde F_\alpha$, $\alpha = 0, 1, 2$, for which the required estimates hold. In order to avoid confusion, we denote the putative extension of $P_0\psi_\alpha$ by $\rho_\alpha$. Then we write $\Box\rho_\alpha = P_0 Q$