Commun. Math. Phys. 206, 1 – 22 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On Fusion Algebras and Modular Matrices*
T. Gannon1, M.A. Walton2,**
1 Department of Mathematics, University of Alberta, Edmonton, Alberta, Canada T6G 2G1. E-mail: [email protected]
2 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street, Cambridge CB3 9EW, UK. E-mail: [email protected]
Received: 7 October 1997 / Accepted: 7 March 1999
Abstract: We consider the fusion algebras arising in e.g. Wess–Zumino–Witten conformal field theories, affine Kac–Moody algebras at positive integer level, and quantum groups at roots of unity. Using properties of the modular matrix S, we find small sets of primary fields (equivalently, sets of highest weights) which can be identified with the variables of a polynomial realization of the Ar fusion algebra at level k. We prove that for many choices of rank r and level k, the number of these variables is the minimum possible, and we conjecture that it is in fact minimal for most r and k. We also find new, systematic sources of zeros in the modular matrix S. In addition, we obtain a formula relating the entries of S at fixed points, to entries of S at smaller ranks and levels. Finally, we identify the number fields generated over the rationals by the entries of S, and by the fusion (Verlinde) eigenvalues.
1. Introduction
Fix an affine non-twisted algebra $g = X_r^{(1)}$, and level $k$. Put $\bar k := k + h^\vee$, where $h^\vee$ is the dual Coxeter number of $g$. Let $w^0, \ldots, w^r$ denote its fundamental weights, and put $\rho := \sum_{i=0}^r w^i$. Let $P_+^k(g)$ be the set of all level $k$ integrable highest weights of $g$. For example,
$$P_+^k(A_r^{(1)}) = \Bigl\{ \sum_{i=0}^r \lambda_i w^i \;\Big|\; \lambda_i \in \mathbb{Z}_{\ge 0},\ \sum_{i=0}^r \lambda_i = k \Bigr\}.$$
Write $\mathrm{ch}_\lambda$ for the corresponding character. Sometimes it is convenient to write $(\lambda_0, \lambda_1, \ldots, \lambda_r)$ for $\sum_i \lambda_i w^i$. When the level of a weight is known, we will often drop the $w^0$ component. For example, the element $k w^0$ of $P_+^k(g)$ will be denoted by $0$.
* This research was supported in part by NSERC.
** On leave from the Physics Dept, Univ. Lethbridge, Alberta, Canada.
The
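corresponding quantities for the underlying finite-dimensional Lie algebra $\bar g$ will always be denoted with a bar. Under the familiar action of $SL_2(\mathbb{Z})$ on the Cartan subalgebras of $g$, we find that the span of the level $k$ characters $\mathrm{ch}_\lambda$ is stable. In particular, define a matrix $S$ by:
$$\mathrm{ch}_\lambda\Bigl(\frac{-1}{\tau}, \frac{z}{\tau}, u - \frac{(z|z)}{\tau}\Bigr) = \sum_{\mu \in P_+^k(g)} S_{\lambda,\mu}\, \mathrm{ch}_\mu(\tau, z, u).$$
$S$ has several interesting properties. Most importantly:
Lemma 1 (Kac–Peterson [16]). Let $\mathrm{ch}_{\bar\nu}$ denote the Weyl character of $\bar g$ with highest weight $\bar\nu$. Then for any $\lambda, \mu \in P_+^k(g)$, we have both $S_{0,\mu} \ne 0$ and
$$\frac{S_{\lambda,\mu}}{S_{0,\mu}} = \mathrm{ch}_{\bar\lambda}\Bigl(-2\pi i\, \frac{\mu+\rho}{\bar k}\Bigr) =: \chi_\lambda(\mu). \qquad (1.1a)$$
By Lemma 1, a useful expression for $\chi_\lambda(\mu)$ is
$$\chi_\lambda(\mu) = \sum_{\beta \in \Omega(\lambda)} m_\lambda(\beta) \sum_{\gamma \in W\beta} \exp\Bigl[-2\pi i\, \frac{\gamma \cdot (\mu+\rho)}{\bar k}\Bigr], \qquad (1.1b)$$
where $W$ is the (finite) Weyl group, where $\Omega(\lambda)$ is the set of dominant weights of the representation of $\bar g$ with highest weight $\lambda$, and where $m_\lambda(\beta)$ is the weight multiplicity. A classical result is:
Lemma 2 (Cartan [3]). For each $\bar\nu$, we can write $\mathrm{ch}_{\bar\nu} = P_{\bar\nu}(\mathrm{ch}_{w^1}, \ldots, \mathrm{ch}_{w^r})$ for some polynomial $P_{\bar\nu}(x_1, \ldots, x_r)$. Therefore,
$$\chi_\lambda(\mu) = P_{\bar\lambda}(\chi_{w^1}(\mu), \ldots, \chi_{w^r}(\mu)), \qquad (1.2)$$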
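for all $\mu \in P_+^k(g)$. Define the fusion matrices $N_\lambda$ by Verlinde's formula [21]:
$$(N_\lambda)_{\nu\mu} := N_{\lambda,\mu}^{\nu} = \sum_{\gamma \in P_+^k(g)} S_{\lambda,\gamma}\, \frac{S_{\mu,\gamma}}{S_{0,\gamma}}\, S_{\nu,\gamma}^{*}. \qquad (1.3)$$
Equation (1.3) tells us that the $N_\lambda$ are simultaneously diagonalized by $S$, and have eigenvalues $\chi_\lambda(\mu)$. The fusion algebra (or Verlinde algebra) of $g$ at level $k$ can be defined to be the $\mathbb{C}$-span of $\{N_\lambda : \lambda \in P_+^k(g)\}$. It is associative and commutative, with unit $N_0 = I$ and integer structure constants $N_{\lambda,\mu}^{\nu}$:
$$N_\lambda N_\mu = \sum_{\nu \in P_+^k(g)} N_{\lambda,\mu}^{\nu}\, N_\nu.$$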
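As a concrete illustration of Verlinde's formula (1.3), the following sketch computes the fusion coefficients for $g = A_1^{(1)}$. It assumes the standard closed form $S_{a,b} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ for the $A_1$ modular matrix, with weights labelled by the Dynkin label $a = \lambda_1 \in \{0, \ldots, k\}$. That formula and the function names `s_matrix_a1` and `fusion_coefficients` are illustrative assumptions, not taken from the text.

```python
import math

def s_matrix_a1(k):
    """Modular matrix S of A_1 at level k (assumed standard closed form)."""
    n = k + 2  # kbar = k + h^vee, with h^vee = 2 for A_1
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def fusion_coefficients(S):
    """N[l][mu][nu] = sum_g S[l][g] * S[mu][g] * conj(S[nu][g]) / S[0][g], as in (1.3)."""
    m = len(S)
    # S is real for A_1, so the complex conjugation in (1.3) is trivial here.
    return [[[round(sum(S[l][g] * S[mu][g] * S[nu][g] / S[0][g] for g in range(m)))
              for nu in range(m)] for mu in range(m)] for l in range(m)]

N = fusion_coefficients(s_matrix_a1(2))
print(N[1][1])  # coefficients N_{1,1}^nu for nu = 0, 1, 2 at level 2
```

Rounding to the nearest integer is only a convenience for floating-point output; the integrality of the structure constants is the statement made in the text.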
In fact it is isomorphic as an algebra to $\mathbb{C}^{\|P_+^k(g)\|}$, defined with componentwise addition and multiplication, and so a critical ingredient here in our definition is the choice of preferred basis $\{N_\lambda : \lambda \in P_+^k(g)\}$. Fusion algebras (or the corresponding fusion ring) appear in many different contexts, e.g. in rational conformal field theory (RCFT) [21]. The RCFTs with fusion algebras of the type discussed here, i.e. those associated with some $g$, are known as Wess–Zumino–Witten models. Fusion algebras also appear in the study of quantum groups [19] and Hecke algebras [14] at roots of unity, Chevalley groups at nonzero characteristic [12], and quantum cohomology [22].
Call a set $\Gamma = \{\gamma^1, \ldots, \gamma^n\} \subset P_+^k(g)$ a fusion-generator if any $N_\lambda$ can be written as a polynomial¹ in $N_{\gamma^1}, \ldots, N_{\gamma^n}$; in other words, if for each $\lambda \in P_+^k(g)$ there is a polynomial $P_\lambda(x_1, \ldots, x_n)$ such that
$$\chi_\lambda(\mu) = P_\lambda(\chi_{\gamma^1}(\mu), \ldots, \chi_{\gamma^n}(\mu)) \quad \forall \mu \in P_+^k(g). \qquad (1.4a)$$
Equivalently, $\Gamma$ is a fusion-generator² iff for any $\lambda, \mu \in P_+^k(g)$, the only way we can have
$$\chi_{\gamma^\ell}(\lambda) = \chi_{\gamma^\ell}(\mu) \quad \text{for all } \ell = 1, \ldots, n \qquad (1.4b)$$
is when $\lambda = \mu$. The equivalence of the statements of (1.4a) and (1.4b) can be seen as follows. First, if (1.4b) holds, then (1.4a) implies $\chi_\phi(\lambda) = \chi_\phi(\mu)$ for all $\phi \in P_+^k(g)$. Multiplying this result by $S_{\nu,\phi}^{*}$ and summing over $\phi \in P_+^k(g)$ gives $\lambda = \mu$, by the unitarity of the matrix $S$. In the other direction, we need to construct a polynomial $P_\lambda$ in $n = \|\Gamma\|$ variables, taking values $\chi_\lambda(\mu)$ at $m = \|P_+^k(g)\|$ distinct points. Let $x := (x_1, \ldots, x_n)$ denote a point in $\mathbb{C}^n$, and let $x^a$, $a = 1, \ldots, m$, be the points at which the required polynomial must take the values $y_a$. Here $x_{a,j} = \chi_{\gamma^j}(\mu^a)$ is the $j$th coordinate of $x^a$ and $y_a = \chi_\lambda(\mu^a)$, where $a$ labels the different weights of $P_+^k(g)$. A polynomial of minimal degree satisfying the requirements can be constructed by the Lagrange interpolation formula:
$$P(x) = \sum_{a=1}^{m} y_a \prod_{b=1,\, b \ne a}^{m} \frac{r \cdot (x - x^b)}{r \cdot (x^a - x^b)}.$$
Here $r$ can be any (constant) vector such that $r \cdot (x^a - x^b)$ vanishes iff $a = b$.
By the fusion-rank $R_k(g)$, we mean the minimum possible cardinality $n = \|\Gamma\|$ of a fusion-generator $\Gamma$. Such a $\Gamma$ is called a fusion-basis.
Question 1. For a given $g$ and $k$, what is the fusion-rank $R_k(g)$, and what is a fusion-basis?
This problem was studied by Di Francesco and Zuber [6]. For the applications it should suffice to get a reasonable upper bound for the fusion-rank, and to find a $\Gamma$ which realizes that bound. Incidentally, it was proven in [1] that there will be a fusion potential [13] corresponding to any fusion-generator $\Gamma$.
Question 1 seems a natural one from the fusion algebra perspective, and is especially interesting considering that the fusion-rank often turns out to be surprisingly low. This analysis should have consequences for the work of Moody, Patera, Pianzola, . . . on elements of finite order in a finite-dimensional Lie group (see e.g. [18,20] and references therein). It has direct relevance for the classification of conformal field theories (more precisely, their 1-loop partition functions; see e.g. [9,11,10]). Our results may lead to a new presentation of the fusion algebras, along the lines of the Schubert calculus of [13,15]. As another example, we mention that our problem may be related to finding bases for the quantum cohomology of Grassmannians [22].
¹ By Lagrange interpolation, “polynomial” here is equivalent to “function”.
² Our definition should not be confused with the “bootstrapped” version of a fusion-generator used in [10].
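The Lagrange interpolation formula used above to pass from (1.4b) to (1.4a) can be sketched in code as follows. The function name `lagrange_interpolate` and the sample points are hypothetical; the vector `r` must be supplied by the caller so that it separates the points, exactly as required in the text.

```python
from fractions import Fraction

def lagrange_interpolate(points, values, r):
    """Return a function P with P(points[a]) == values[a].

    points: list of m distinct points in C^n (tuples of numbers)
    values: list of m target values y_a
    r:      vector such that r.(x^a - x^b) != 0 whenever a != b
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    proj = [dot(r, p) for p in points]  # the scalars r.x^a
    if len(set(proj)) != len(proj):
        raise ValueError("r does not separate the points")

    def P(x):
        s = dot(r, x)
        total = 0
        for a, ya in enumerate(values):
            term = ya
            for b, pb in enumerate(proj):
                if b != a:
                    term *= (s - pb) / (proj[a] - pb)
            total += term
        return total

    return P

# Four points in the plane; r = (1, 2) maps them to the distinct scalars 0, 1, 2, 3.
pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
vals = [5, -1, 2, 7]
P = lagrange_interpolate(pts, vals, (Fraction(1), Fraction(2)))
print([P(p) == v for p, v in zip(pts, vals)])
```

Because the product only involves the scalars $r \cdot x$, the interpolant is a polynomial of degree at most $m - 1$ in a single linear combination of the variables. Exact rational arithmetic is used in the usage example only to make the equality checks exact.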
Incidentally, these fusion algebras all have a rank of one, in a sense: precisely, the Krull dimension of a fusion algebra will be one. It is not difficult to find an element $N$ of the fusion algebra in which every fusion matrix $N_\lambda$ will be a polynomial. These $N$ however will in general be nontrivial linear combinations of our basis vectors (1.3). For the applications we are interested in, this observation is not helpful. There is a natural basis for the fusion algebra, namely $P_+^k(g)$, and an important condition is that fusion-generators are required to be subsets of that basis.
We will address Question 1 for $g = A_r^{(1)}$ in Sect. 3. Our best lower bound for $R_{r,k} := R_k(A_r^{(1)})$ is given in Thm. 1(2); our best upper bound and smallest fusion-generator is given in Thm. 3. Corollary 1 tells us precisely when $\{w^1\}$ is a fusion-generator. Corollary 2 answers Question 1 when $r$ or $k$ is small, and Conjecture 1 gives our guess for a general statement.
Another question related to this one, which we will consider in Sect. 4, is:
Question 2. For $g = A_r^{(1)}$, when is $N_{w^1}$ invertible?
The first fundamental weight $w^1$ is especially interesting, since (1.1b) and its fusion numbers $N_{w^1,\mu}^{\nu}$ are so simple. Incidentally, $N_\lambda$ is invertible iff $N_{\lambda^\sigma}$ is, for any Galois element $\sigma$ (see (2.6) below); this holds in fact for any RCFT [5]. However, the inverse of a fusion matrix will only itself be a fusion matrix in the trivial cases: $(N_\lambda)^{-1} = N_\mu$ iff both $\lambda = J^a 0$ and $\mu = J^{-a} 0$ for some $a \in \mathbb{Z}$, where $J$ is given in (2.1b); again the analogue holds for any RCFT. (The proof of this uses the fact that the inverse of a non-negative integer matrix can itself be integral and non-negative only if it is a permutation matrix.) Our best condition for $N_{w^1}$ being invertible is given in Thm. 6(3), while our best conditions for noninvertibility are Thms. 6(4),(5). Together, these answer Question 2 for most $r, k$. Conjecture 2 gives our guess for the general answer.
A final question, which we solve in Sect. 6, was asked in [4]. It is interesting because of the Galois action (2.6) on the matrix $S$ and on the fusion coefficients.
Question 3. For $A_r^{(1)}$, what are the number fields $K_{r,k}$ and $L_{r,k}$ generated over the rationals by the entries $S_{\lambda,\mu}$, and by the fusion (Verlinde) eigenvalues $\chi_\lambda(\mu)$, respectively?
The primary purpose of this paper is to develop tools for the analysis of affine fusions. We focus mostly on the most important case: $A_{r,k}$. We believe that these three questions are both interesting and representative.
2. The $A_{r,k}$ Modular Matrix $S$
For now, let us restrict attention to $A_{r,k}$ (i.e. $A_r^{(1)}$ at level $k$). Write $\bar r := r + 1$, $P_+^{r,k} := P_+^k(A_r^{(1)})$ and $R_{r,k} := R_k(A_r^{(1)})$. The symmetry group of its Coxeter-Dynkin diagram is the dihedral group on $\bar r$ elements, generated by an order 2 conjugation $C$ and an order $\bar r$ simple current $J$:
$$C\lambda = \lambda_0 w^0 + \sum_{i=1}^{r} \lambda_{r+1-i}\, w^i, \qquad (2.1a)$$
$$J\lambda = \lambda_r w^0 + \sum_{i=1}^{r} \lambda_{i-1}\, w^i. \qquad (2.1b)$$
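These act on the $\chi_\lambda(\mu)$ by
$$\chi_{C\lambda}(\mu) = \chi_\lambda(C\mu) = \chi_\lambda(\mu)^{*}, \qquad (2.2a)$$
$$\chi_{J^a\lambda}(J^b\mu) = \exp[2\pi i\,(b\, t(\lambda) + a\, t(\mu) + kab)/\bar r]\ \chi_\lambda(\mu), \qquad (2.2b)$$
where
$$t(\lambda) := \sum_{j=1}^{r} j\, \lambda_j \qquad (2.2c)$$
is called the $\bar r$-ality. A useful relation is
$$t(J^a\lambda) \equiv ak + t(\lambda) \pmod{\bar r}. \qquad (2.2d)$$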
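The definitions (2.1a), (2.1b) and (2.2c) are easy to experiment with. The sketch below enumerates $P_+^{r,k}$ as tuples $(\lambda_0, \ldots, \lambda_r)$ and checks relation (2.2d) by brute force; the function names `weights`, `J`, `C` and `t` are chosen here for illustration only.

```python
from itertools import combinations

def weights(r, k):
    """All (lambda_0, ..., lambda_r) of non-negative integers summing to k."""
    out = []
    # Stars and bars: choose the positions of the r bars among k + r slots.
    for bars in combinations(range(k + r), r):
        prev, lam = -1, []
        for b in bars:
            lam.append(b - prev - 1)
            prev = b
        lam.append(k + r - 1 - prev)
        out.append(tuple(lam))
    return out

def J(lam):
    """Simple current (2.1b): (J lam)_0 = lam_r and (J lam)_i = lam_{i-1}."""
    return (lam[-1],) + lam[:-1]

def C(lam):
    """Conjugation (2.1a): fixes lam_0 and reverses lam_1, ..., lam_r."""
    return (lam[0],) + lam[:0:-1]

def t(lam):
    """The rbar-ality (2.2c): sum over j of j * lam_j."""
    return sum(j * x for j, x in enumerate(lam))

r, k = 3, 4
P = weights(r, k)
print(len(P))
print(all((t(J(lam)) - t(lam) - k) % (r + 1) == 0 for lam in P))  # relation (2.2d) with a = 1
```

For $r = 3$, $k = 4$ this produces the 35 weights referred to in the examples of this section.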
Another “symmetry” of $\chi_\lambda(\mu)$, when $k \ne 1$, is rank-level duality [2]:
$$\chi_\lambda(\mu) = \exp[2\pi i\, t(\lambda)\, t(\mu)/\bar r k]\ \tilde\chi_{\tau\lambda}(\tau\mu)^{*}, \qquad (2.3a)$$
where $\tau\lambda$ denotes the weight in $P_+^{k-1,r+1}$ corresponding to the transpose (sometimes called “conjugate”) of the Young diagram of $\lambda$, after deleting any columns of length $k$ in the transposed diagram (reminder: the $i$th row of the Young diagram of $\lambda$ has $\sum_{j=i}^{r} \lambda_j$ boxes). This deletion is a consequence of (2.4f) below. We will usually denote the quantities of $A_{k-1,r+1}$ with tildes. For example, $\tau w^\ell = \ell\, \tilde w^1$. $\tau$ defines a bijection between the $J$-orbits in $P_+^{r,k}$ and the $\tilde J$-orbits in $P_+^{k-1,r+1}$. Note that
$$\tilde t(\tau\lambda) \in t(\lambda) - k\, \mathbb{Z}_{\ge 0}, \qquad (2.3b)$$
since $t(\lambda)$ is the number of boxes in the Young diagram of $\lambda$.
The Weyl group of $A_r$ is the symmetric group $S_{\bar r}$. This gives us an essential property of $S$: its relation to the symmetric polynomials. In particular, we can see from (1.1b) that
$$\chi_\lambda(\mu) = \exp[2\pi i\, t(\lambda)\, t(\mu+\rho)/\bar r \bar k]\ S_\lambda(x_1, \ldots, x_{\bar r}), \qquad (2.4a)$$
where $x_\ell := \exp[-2\pi i\, \mu(\ell)/\bar k]$ for $\mu(\ell) := \sum_{j=\ell}^{r} (\mu_j + 1)$. $S_\lambda$ is a polynomial over $\mathbb{Z}$, namely the Schur polynomial of shape $(\sum_{i=1}^{r} \lambda_i, \sum_{i=2}^{r} \lambda_i, \ldots, \lambda_r)$ [8], symmetric in the $x_i$, and homogeneous of degree $t(\lambda)$. It is often convenient to write $S_\lambda$ as a polynomial
$$Q_\lambda(y_1, \ldots, y_{rk}) = \sum_{m = (m_1, \ldots, m_{rk})} c_m \prod_\ell y_\ell^{m_\ell}, \qquad (2.4b)$$
evaluated at the “power sums” of our $x_i$:
$$y_\ell = \sum_{i=1}^{\bar r} x_i^\ell = P_\ell(x_1, \ldots, x_{\bar r}). \qquad (2.4c)$$
The coefficients $c_m$ of $Q_\lambda$ can be expressed in terms of the characters of the symmetric group $S_{\bar r}$ (this is essentially the Frobenius–Schur duality), and each nonzero $c_m$ will have $\sum_j j\, m_j = t(\lambda)$ [8]. We will also write $S_\lambda[\mu]$ and $P_\ell[\mu]$, when convenient. Note that
$$P_\ell[J^m \mu] = \exp[2\pi i\, \ell\, \mu(\bar r - m)/\bar k]\ P_\ell[\mu]. \qquad (2.4d)$$
A valuable special case of (2.4a) is
$$\chi_{w^\ell}(\mu) = \exp[2\pi i\, \ell\, t(\mu+\rho)/\bar r \bar k] \sum_{1 \le i_1 < \cdots < i_\ell \le \bar r} x_{i_1} \cdots x_{i_\ell}. \qquad (2.4e)$$
[...]
$$\chi_{(\lambda_0, \lambda_1, \ldots, \lambda_r, \ldots)}(\mu) = \begin{cases} 0 & \text{if } [\ldots] > \bar r, \\ \chi_{(\lambda_0, \lambda_1, \ldots, \lambda_r)}(\mu) & \text{otherwise,} \end{cases} \qquad (2.4f)$$
valid for any $\mu \in P_+^{r,k}$. This can be directly understood using for example the construction of Schur polynomials from Young tableaux. A special case of (2.4f) is $\chi_{w^{\bar r}} = 1$ and $\chi_{w^\ell} = 0$ for $\ell > \bar r$. We will use (2.4f) in several places; see e.g. the proof of Thm. 3.
Call $\lambda \in P_+^{r,k}$ a $J^d$-fixed point if $d$ is the smallest positive integer satisfying $J^d \lambda = \lambda$, in other words if the $\lambda_i$ have period $d$. We will say $\lambda$ is a fixed point if it is a $J^d$-fixed point for some $d < \bar r$. Note that if $\varphi$ is a fixed point of $J^d$, we can speak of a “truncated weight” $(\varphi_0, \varphi_1, \ldots, \varphi_{d-1}) =: \varphi'$; by (2.5a) below it will lie in $P_+^{d-1,kd/\bar r}$. We have
$$\sum_{i=0}^{d-1} \varphi_i = \sum_{i=0}^{d-1} \varphi'_i = \frac{dk}{\bar r}, \qquad (2.5a)$$
$$t(\varphi) = \frac{\bar r}{d} \sum_{j=1}^{d-1} j\, \varphi_j + k\, \frac{\bar r - d}{2} = \frac{\bar r}{d}\, t'(\varphi') + k\, \frac{\bar r - d}{2}, \qquad (2.5b)$$
where $t'$ denotes $d$-ality. There exist $J^d$-fixed points in $P_+^{r,k}$ iff $d$ divides $\bar r$ and $\bar r/d$ divides $k$. In other words, the smallest fixed-point period is $\bar r/\gcd\{\bar r, k\}$, and all other possible periods are multiples of this number. Also, if $\varphi$ is a $J^{\bar r/d}$-fixed point, its rank-level dual $\tau\varphi$ is a $\tilde J^{k/d}$-fixed point.
By (2.2b), if $\mu$ is a $J^d$-fixed point, then $\chi_\lambda(\mu) = 0$ whenever $t(\lambda) \not\equiv 0 \pmod{\bar r/d}$. The same comment holds for $\mu$ if instead $\lambda$ is a $J^d$-fixed point. As we shall see, this is certainly not the only source of zeros in the matrix $S$, but it is an important one.
In fact, there are many more zeros at fixed points than this simple $\bar r$-ality test suggests. For example, of all weights $\lambda$ with $t(\lambda) = \bar r/d$, the entry $S_{\lambda,\varphi}$ will equal zero for every $J^d$-fixed point $\varphi$, unless $\lambda$ is a hook $(\frac{\bar r}{d} - a) w^1 + w^a$. We will describe below the set $NZ(d)$ of all weights $\lambda$ which can have nonzero entries at $J^d$-fixed points. Moreover, many different weights $\lambda \ne \mu$, even in the set $NZ(d)$, will have the same value $S_{\lambda,\varphi} = S_{\mu,\varphi}$ at all $J^d$-fixed points $\varphi$. For example, for the hooks $\lambda$ with $t(\lambda) = \bar r/d$, we will have $\chi_\lambda(\varphi) = \pm \chi_{w^{\bar r/d}}(\varphi)$ for all $\varphi$, where the sign is independent of $\varphi$. More generally, note that the right side of (2.8c) is independent of $a''$, except for the unimportant sign.
³ Such as (2.4), but not e.g. (2.3a), (2.6) or (2.8). More precisely, specialisation defines a homomorphism between the polynomial rings, taking Schur polynomials to Schur polynomials, power sums to power sums, etc.
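The fixed-point identities (2.5a) and (2.5b), and the existence criterion for $J^d$-fixed points, can be checked by brute force at small rank and level. This is a minimal sketch: the helper names `compositions`, `period`, `t` and `check_fixed_point_identities` are hypothetical, and (2.5b) is tested after multiplying through by $2d$ to stay in integer arithmetic.

```python
def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        return [(total,)]
    return [(first,) + rest
            for first in range(total + 1)
            for rest in compositions(total - first, parts - 1)]

def period(lam):
    """Smallest d > 0 with J^d lam = lam, where J shifts the labels cyclically."""
    n = len(lam)
    return next(d for d in range(1, n + 1) if lam[-d:] + lam[:-d] == lam)

def t(lam):
    """Sum over j of j * lam_j; for a truncated weight this is the d-ality t'."""
    return sum(j * x for j, x in enumerate(lam))

def check_fixed_point_identities(r, k):
    """Return {d: [J^d-fixed points]} after asserting (2.5a) and (2.5b) on each."""
    rbar = r + 1
    found = {}
    for phi in compositions(k, rbar):
        d = period(phi)
        if d == rbar:
            continue  # trivial period: not a fixed point
        trunc = phi[:d]  # the truncated weight phi'
        assert rbar * sum(trunc) == d * k                                    # (2.5a)
        assert 2 * d * t(phi) == 2 * rbar * t(trunc) + k * d * (rbar - d)    # (2.5b) times 2d
        found.setdefault(d, []).append(phi)
    return found

print(check_fixed_point_identities(3, 4))
```

For $r = 3$, $k = 4$ the only periods that occur are $d = 1$ and $d = 2$, in line with the criterion that $d$ divides $\bar r$ and $\bar r/d$ divides $k$.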
Hence fixed point considerations are very important for both Questions 1 and 2, and play a large role in this paper.
An unexpected symmetry of the matrix $S$ is the Galois action discussed in [5]. For any $\sigma \in \mathrm{Gal}(K_{r,k}/\mathbb{Q})$, there exists a permutation $\mu \mapsto \sigma\mu$ of $P_+^{r,k}$ such that
$$\sigma S_{\lambda,\mu} = \epsilon_\sigma(\mu)\, S_{\lambda,\sigma\mu}, \qquad (2.6a)$$
$$\sigma \chi_\lambda(\mu) = \chi_\lambda(\sigma\mu), \qquad (2.6b)$$
where $\epsilon_\sigma(\mu) \in \{\pm 1\}$. Similar equations hold for any other affine algebra $g$, and more generally for any RCFT. The field $K_{r,k}$ here is generated over $\mathbb{Q}$ by all elements $S_{\lambda,\mu}$; if instead we are only interested in the permutation $\mu \mapsto \sigma\mu$, and not the “parities” $\epsilon_\sigma(\mu)$, then we are more concerned with the effective Galois group $\mathrm{Gal}(L_{r,k}/\mathbb{Q})$ coming from the subfield $L_{r,k}$ generated over $\mathbb{Q}$ by the fusion eigenvalues $\chi_\lambda(\mu)$. Incidentally, Galois orbits tend to be nicely behaved; see e.g. Thm. 8 below. They also have been studied in the “elements of finite order” Lie group context; see e.g. [18, 20].
Galois group considerations are central to many arguments in this paper, so next we will quickly review the cyclotomic Galois group. The cyclotomic field $\mathbb{Q}_n := \mathbb{Q}[\exp[2\pi i/n]]$ consists of all polynomials in $\xi_n := \exp[2\pi i/n]$. The Galois group $G_n := \mathrm{Gal}(\mathbb{Q}_n/\mathbb{Q})$ is isomorphic to the multiplicative group $(\mathbb{Z}/n\mathbb{Z})^\times$ of integers coprime to $n$, taken mod $n$. More precisely, any automorphism $\sigma \in G_n$ corresponds to some integer $\ell \in (\mathbb{Z}/n\mathbb{Z})^\times$, in such a way that $\sigma \xi_n = \xi_n^\ell$. We write $\sigma_\ell$ for this $\sigma$. The classic example of a Galois automorphism is complex conjugation, which always corresponds to $\ell = -1$. A subfield $F$ of $\mathbb{Q}_n$ will have Galois group $\mathrm{Gal}(F/\mathbb{Q})$ isomorphic to a factor group (equivalently here, a subgroup) of $(\mathbb{Z}/n\mathbb{Z})^\times$.
The previous properties of $S$ are all well known. The following one, which relates $S$ entries at fixed points to $S$ entries at both smaller rank and level, appears to be new. We will call it fixed-point factorisation. Let $\varphi$ be a fixed point of $J^d$ for $A_{r,k}$. Then we will show that $\chi_\lambda(\varphi) = 0$ unless
($*$) for each $i = 1, \ldots, \bar r/d$, there are precisely $d$ integers $1 \le \ell_1^{(i)} < \cdots < \ell_d^{(i)} \le \bar r$ for which $\lambda(\ell_j^{(i)}) \equiv -i \pmod{\bar r/d}$. (Recall $\lambda(a) := \sum_{b=a}^{r} (\lambda_b + 1)$.)
Assume this for now. ($*$) implies $\bar r/d$ will divide $t(\lambda)$, which we already know, but it is much stronger. Write $NZ(d)$ for the set of all weights $\lambda \in P_+^{r,k}$ which obey ($*$). We will see below that
$$\lambda \in NZ(d) \iff \chi_\lambda\Bigl(\frac{kd}{\bar r} \sum_{i=0}^{\bar r/d - 1} w^{di}\Bigr) \ne 0. \qquad (2.7a)$$
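Condition ($*$) is a purely combinatorial test on the labels $\lambda(\ell)$, so membership in $NZ(d)$ can be decided without computing any entry of $S$. The sketch below implements that test and counts $NZ(d)$ by brute force; the function names are illustrative, and the label $\lambda(\bar r) = 0$ is included, as the range $1 \le \ell \le \bar r$ in ($*$) requires.

```python
def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        return [(total,)]
    return [(first,) + rest
            for first in range(total + 1)
            for rest in compositions(total - first, parts - 1)]

def labels(lam):
    """lam(l) = sum_{b=l}^{r} (lam_b + 1) for l = 1, ..., r+1 (the last one is 0)."""
    r = len(lam) - 1
    return [sum(lam[b] + 1 for b in range(l, r + 1)) for l in range(1, r + 2)]

def in_NZ(lam, d):
    """Condition (*): every residue class mod rbar/d contains exactly d labels."""
    rbar = len(lam)
    n = rbar // d
    counts = [0] * n
    for v in labels(lam):
        counts[v % n] += 1
    return all(c == d for c in counts)

def count_NZ(r, k, d):
    """Number of weights of P_+^{r,k} obeying (*), for d dividing r + 1."""
    return sum(in_NZ(lam, d) for lam in compositions(k, r + 1))

print(count_NZ(3, 4, 1), count_NZ(3, 4, 2))
```

The function assumes that `d` divides $\bar r$; no check is made that $J^d$-fixed points actually exist at the given level.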
The fixed-point argument of this last equation has truncated weight $0'$. Consider any $\lambda \in NZ(d)$. Let $\pi$ be the unique permutation of $\{1, \ldots, \bar r\}$ defined by the following rule: for each $1 \le i \le \frac{\bar r}{d}$ and $1 \le j \le d$, put $\pi(i + (j-1)\frac{\bar r}{d}) = \ell_j^{(i)}$. $\pi$ will exist iff ($*$) holds. For each such $i$, let $\lambda'^{(i)}$ denote the weight in $P_+^{d-1,kd/\bar r}$ with Dynkin labels
$$\lambda'^{(i)}_j = \frac{\lambda(\ell_j^{(i)}) - \lambda(\ell_{j+1}^{(i)})}{\bar r/d} - 1 \qquad (2.7b)$$
for $j = 1, \ldots, d-1$. As above, let $\varphi' \in P_+^{d-1,kd/\bar r}$ be the truncated weight $(\varphi_0, \varphi_1, \ldots, \varphi_{d-1})$. Then we obtain the “factorisations”
$$S_{\lambda,\varphi} = \mathrm{sgn}\,\pi\ \xi\ (-1)^{t(\lambda)(1 - d/\bar r)} \Bigl(\frac{\bar r}{\bar k}\Bigr)^{\frac{\bar r/d - 1}{2}} S'_{\lambda'^{(1)},\varphi'} \cdots S'_{\lambda'^{(\bar r/d)},\varphi'}, \qquad (2.8a)$$
$$\chi_\lambda(\varphi) = \mathrm{sgn}\,\pi\ \xi\ (-1)^{t(\lambda)(1 - d/\bar r)}\ \chi'_{\lambda'^{(1)}}(\varphi') \cdots \chi'_{\lambda'^{(\bar r/d)}}(\varphi'), \qquad (2.8b)$$
where $\xi$ is the $\bar k d/\bar r$th root of unity equal to $\exp[2\pi i\, t'(\varphi' + \rho') \sum_i (\lambda(\ell_d^{(i)}) + i - \bar r/d)/\bar k]$, and where primes denote quantities in $A_{d-1,kd/\bar r}$ (we take $S' = \chi' = 1$ for $d = 1$).
Perhaps some examples at low rank and level will be helpful. For $r = 3$, $k = 4$, the only fixed points are $(\varphi_0, \varphi_1, \varphi_2, \varphi_3) = (2,0,2,0)$, $(0,2,0,2)$, and $(1,1,1,1)$. $NZ(1)$ consists of the $J$-orbits of $(4,0,0,0)$ and $(2,1,0,1)$, for a total of 8 weights out of the full 35. $NZ(2)$ contains $NZ(1)$ plus the $J$-orbits of $(3,0,1,0)$, $(2,2,0,0)$ and $(2,0,2,0)$, increasing the number of weights to 18 out of 35. All three fixed points are in the simple-current orbits of weights of the special type indicated in (2.7a) (for $d = 1$ or 2). Therefore, for these fixed points $\varphi$, we must have $S_{\varphi,\lambda} \ne 0$ for all weights $\lambda$ in the appropriate $NZ(d)$.
For $r = 3$, $k = 8$, $d = 2$, however, there are fixed points such as $\varphi = (3,1,3,1)$ that are not of the type in (2.7a), i.e. $(4,0,4,0)$. In this case, we find that $S_{(3,1,3,1),\lambda} \ne 0$ for only 48 weights $\lambda$, while $NZ(2)$ has cardinality 75 (and $\|P_+^{3,8}\| = 165$). The large discrepancy here between “48” and “75” is not surprising and is explained by (2.8): $\chi'_{\varphi'}$ will vanish at a fifth of the points of $P_+^{1,4}$. Incidentally, the total number of weights satisfying $t(\lambda) \equiv 0 \pmod{\bar r/d}$ is 85. This means there are 10 weights that satisfy the $\bar r$-ality test necessary for $\chi_\lambda(\varphi) \ne 0$, yet still have $\chi_\lambda(\varphi) = 0$ for all $\varphi$.
Condition ($*$) will become more severe as $r$ and $k$ increase. For example, with $r = 3$ and $d = 2$, the numbers of weights in $NZ(2)$ compared with those with even $\bar r$-ality, compared with those in $P_+^{3,k}$ are: 196, 231, and 455 for $k = 12$; and 405, 489, and 969 for $k = 16$.
As an example of how “factor weights” $\{\lambda'^{(i)}\}$ are found, consider the weight $\lambda = (0,0,1,0,0,1,1,1,0,1,1,0)$ at $r = 11$, $k = 6$. Fix $d = 4$. The corresponding partition labels $\{\lambda(\ell)\}$ are $\{17, 16, 14, 13, 12, 10, 8, 6, 5, 3, 1, 0\}$. Those congruent to $-1 \pmod{\bar r/d = 3}$ are $\{17, 14, 8, 5\}$. From these we find $\lambda'^{(1)} = (1,0,1,0)$, where the zeroth Dynkin label is set so that the factor weight is at level $kd/\bar r = 2$. We find $\lambda'^{(2)} = (0,0,0,2)$ and $\lambda'^{(3)} = (1,1,0,0)$ in similar fashion.
For a more general example, consider any hook $\lambda = a w^1 + w^b$. It will lie in $NZ(d)$ iff $\bar r/d$ divides $a + b$, in which case we find
$$\chi_{a w^1 + w^b}(\varphi) = \xi\ (-1)^{a + b + c + (a'' + 1)(c + a' + 1)}\ \chi'_{(a' - 1) w'^1 + w'^{c - a' + 1}}(\varphi'), \qquad (2.8c)$$
where $c = (a + b)d/\bar r$ and $a = \frac{\bar r}{d} a' - a''$, for $a'' \in \{1, \ldots, \frac{\bar r}{d}\}$, and where $\xi = 1$ unless $b > \bar r - \bar r/d$, in which case $\xi = \exp[2\pi i\, t'(\varphi' + \rho')\, \bar r/d\bar k]$. The permutation $\pi$ here is the product of $c - a' + 1$ disjoint $a''$-cycles. In this example, each $\lambda'^{(i)} = 0'$ except for $\lambda'^{(a'')} = (a' - 1) w'^1 + w'^{c - a' + 1}$. Equation (2.8c) says that hooks in $P_+^{r,k}$ act like hooks in $P_+^{d-1,kd/\bar r}$, when their fusion eigenvalues are restricted to fixed points of $J^d$. The most interesting special case of (2.8c) is
$$\chi_{w^\ell}(\varphi) = \begin{cases} \chi'_{w'^{\ell d/\bar r}}(\varphi') & \text{if } \bar r/d \text{ divides } \ell, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.8d)$$
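The worked example of “factor weights” above can be reproduced mechanically from (2.7b). The following sketch uses the hypothetical function names `labels` and `factor_weights`; as in the text, the zeroth Dynkin label of each factor weight is fixed by requiring level $kd/\bar r$.

```python
def labels(lam):
    """lam(l) = sum_{b=l}^{r} (lam_b + 1) for l = 1, ..., r+1, in decreasing order."""
    r = len(lam) - 1
    return [sum(lam[b] + 1 for b in range(l, r + 1)) for l in range(1, r + 2)]

def factor_weights(lam, k, d):
    """Factor weights lam'^(1), ..., lam'^(rbar/d) of lam in NZ(d), following (2.7b)."""
    rbar = len(lam)
    n = rbar // d              # rbar/d, the modulus in condition (*)
    level = k * d // rbar      # level of the factor weights
    lab = labels(lam)
    result = []
    for i in range(1, n + 1):
        # Labels congruent to -i mod rbar/d, still in decreasing order.
        cls = [v for v in lab if v % n == (-i) % n]
        if len(cls) != d:
            raise ValueError("lam does not obey condition (*)")
        dynkin = [(cls[j] - cls[j + 1]) // n - 1 for j in range(d - 1)]
        result.append((level - sum(dynkin),) + tuple(dynkin))
    return result

lam = (0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0)   # the weight at r = 11, k = 6 from the text
print(factor_weights(lam, 6, 4))
```

The sketch only extracts the factor weights; it does not compute the sign of $\pi$ or the phase $\xi$ appearing in (2.8).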
Lemma 3 (Fixed-point factorisation). Choose any $A_{r,k}$, any divisor $d$ of $\gcd\{\bar r, k\}$, and any $\lambda \in P_+^{r,k}$. Then exactly one of the following holds:
(i) $S_{\lambda,\varphi} = \chi_\lambda(\varphi) = 0$ for all fixed points $\varphi$ of $J^d$; or
(ii) $\lambda \in NZ(d)$ and so $\lambda$ obeys (2.8a), (2.8b) for every fixed point $\varphi$ of $J^d$.
The leading signs in (2.8) are independent of $\varphi$ and so for our purposes are of no significance. The phase $\xi$ depends only on $\varphi$ and will often equal 1. Of course the right side of (2.8b) can be “linearised” by expanding it out using fusion coefficients. Conversely, it leads to the curious observation that the fusion coefficients of $A_{r,k}$ can be seen in the fusion eigenvalues of $A_{2r+1,2k}$ evaluated at fixed points.
At present we do not have formulas of equal generality for the other affine algebras with simple currents. Since their Coxeter–Dynkin diagrams are similarly related by the “folding” by a simple current, one would expect that $E_6^{(1)}$ would be related in this way to $G_2^{(1)}$, $E_7^{(1)}$ to $F_4^{(1)}$, $D_r^{(1)}$ for its vector simple current (i.e. the one interchanging $w^0$ and $w^1$, and $w^{r-1}$ and $w^r$) to $C_{r-2}^{(1)}$, etc. Perhaps an algebraic understanding of these equations can be obtained from the ideas in e.g. [7].
To prove Eqs. (2.8), first note that
$$P_\ell[\varphi] = \begin{cases} \frac{\bar r}{d}\, P'_{\ell d/\bar r}[\varphi'] & \text{if } \bar r/d \text{ divides } \ell, \\ 0 & \text{otherwise,} \end{cases} \qquad (2.9a)$$
from which we immediately obtain that $H_\ell[\varphi]$ equals $H'_{\ell d/\bar r}[\varphi']$ or 0 (for $\bar r \mid d\ell$, $\bar r \nmid d\ell$, respectively), for the “complete” symmetric polynomials $H_\ell := S_{\ell w^1}$, since (2.4b) for $\lambda = \ell w^1$ takes a simple form [8]. We have the determinantal formula [8]
$$S_\lambda = \det(H_{\lambda(i) - \bar r + j})_{1 \le i,j \le \bar r} = \sum_\sigma \mathrm{sgn}\,\sigma\ H_{\lambda(\sigma 1) - \bar r + 1} \cdots H_{\lambda(\sigma \bar r) - \bar r + \bar r}. \qquad (2.9b)$$
In this formula, $H_0$ identically equals 1, and for negative $\ell$, $H_\ell$ is identically 0. Evaluated at the fixed point $\varphi$, this will be a sparse matrix: each row will have at most $d$ nonzero elements, spaced $\bar r/d$ entries apart. If $S_\lambda[\varphi] \ne 0$, then some product $H_{\lambda(\sigma 1) - \bar r + 1}[\varphi] \cdots H_{\lambda(\sigma \bar r) - \bar r + \bar r}[\varphi] \ne 0$, and thus $\{\ell_1^{(i)}, \ldots, \ell_d^{(i)}\} = \{\sigma i, \sigma(i + \frac{\bar r}{d}), \ldots, \sigma(i + \bar r - \frac{\bar r}{d})\}$ for each $i$. This shows that ($*$) is satisfied, and that the permutation $\pi$ exists. The sum in (2.9b) can be restricted to those $\sigma$ in the coset $(S_d)^{\times(\bar r/d)}\, \pi \subset S_{\bar r}$, where the $i$th factor $S_d$ permutes the indices congruent to $i \pmod{\bar r/d}$. So (2.9b) can now be written as the product of determinants, the $i$th one of which corresponds to the weight $\lambda'^{(i)} \in P_+^{d-1,kd/\bar r}$ (note that (2.4f) is implicit in (2.7b)), which gives us (2.8b).
Equation (2.7a) follows from (2.8b) and the fact that $(\frac{kd}{\bar r} \sum_i w^{di})' = 0'$. Using the product formula (= Weyl denominator formula) for $S_{0,\mu}$, we can show
$$S_{0,\varphi} = \Bigl(\frac{\bar r}{\bar k}\Bigr)^{\frac{\bar r/d - 1}{2}} (S'_{0',\varphi'})^{\bar r/d}. \qquad (2.9c)$$
Together with (2.8b), this immediately gives us (2.8a).
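The first step of this proof, the power-sum identity (2.9a), is easy to test numerically. The sketch below evaluates $P_\ell$ at a weight using $x_\ell = \exp[-2\pi i\, \mu(\ell)/\bar k]$ from (2.4a) and compares both sides of (2.9a) at a fixed point. The function names are hypothetical, and a floating-point tolerance replaces exact equality.

```python
import cmath

def power_sum(mu, k, ell):
    """P_ell[mu] = sum_i x_i^ell with x_l = exp(-2 pi i mu(l) / kbar), see (2.4a), (2.4c)."""
    rbar = len(mu)
    kbar = k + rbar
    total = 0
    for l in range(1, rbar + 1):
        mu_l = sum(mu[j] + 1 for j in range(l, rbar))   # mu(l) = sum_{j=l}^{r} (mu_j + 1)
        total += cmath.exp(-2j * cmath.pi * ell * mu_l / kbar)
    return total

def check_2_9a(phi, k, d, max_ell=8, tol=1e-9):
    """Compare both sides of (2.9a) for a J^d-fixed point phi at level k."""
    rbar = len(phi)
    n = rbar // d                 # rbar/d
    trunc = phi[:d]               # truncated weight phi'
    k_trunc = k * d // rbar       # its level kd/rbar
    for ell in range(1, max_ell + 1):
        lhs = power_sum(phi, k, ell)
        rhs = n * power_sum(trunc, k_trunc, ell // n) if ell % n == 0 else 0
        if abs(lhs - rhs) > tol:
            return False
    return True

print(check_2_9a((3, 1, 3, 1), 8, 2))
```

For $\varphi = (3,1,3,1)$ at $r = 3$, $k = 8$ the odd power sums vanish and the even ones are twice the power sums of $\varphi' = (3,1)$ in $A_{1,4}$.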
3. Fusion-rank of $A_{r,k}$
The original polynomial realisation [13,15] uses the Cartan fusion-generator $\Gamma = \{w^1, \ldots, w^r\}$, which works by Lemma 2. We can do better. From (2.2a) and Lemma 2, we see that $R_{r,k} \le \frac{\bar r}{2}$, with $\Gamma = \{w^1, \ldots, w^{\lfloor \bar r/2 \rfloor}\}$, where $\lfloor x \rfloor$ is the largest integer not larger than $x$. For example, the fusion-rank of $A_{1,k}$ and $A_{2,k}$ equals 1 for all $k$, with $\{w^1\}$ a fusion-generator. This result for $A_2$ was first obtained in [6], though by a more complicated argument. We also obtain, from Thm. 2(3) below (rank-level duality), the bound $R_{r,k} \le \frac{k}{2} + 1$.
We begin by collecting a few simple consequences of the previous comments. Parts (1) and (3) of Thm. 1 are technical facts we will use repeatedly in the rest of the paper. Theorem 1(2) gives a fairly strong lower bound on $R_{r,k}$. We give some consequences of Thm. 1(4) in the paragraph before Conjecture 1.
Theorem 1 (Simple-current constraints). (1) Let $\Gamma$ be a fusion-generator, and choose any $\mu \in P_+^{r,k}$. Let $\Gamma_\mu$ be the set of all $\gamma \in \Gamma$ for which $\chi_\gamma(\mu) \ne 0$. Let $d = \gcd\{\bar r, k, t(\gamma) \mid \gamma \in \Gamma_\mu\}$ (put $d = \bar r$ if $\Gamma_\mu = \emptyset$). Then $\mu$ is a $J^{\bar r/d}$-fixed point.
(2) (Our best lower bound). Let $\Gamma$ be any fusion-generator. Write out the prime decomposition $D := \gcd\{\bar r, k\} = \prod p_i^{a_i}$, where each prime $p_i$ is distinct. Then
$$R_{r,k} \ge \sum a_i.$$
If $D \ne \bar r$, we get the stronger bound
$$R_{r,k} \ge 1 + \sum a_i.$$
More precisely, for each $p_i$, and each $\ell$, $1 \le \ell \le a_i$, there must be some $\gamma \in \Gamma \cap NZ(\bar r p_i^\ell/D)$ (see Lemma 3) with $\gcd\{D, t(\gamma)\} = D/p_i^\ell$. When $D \ne \bar r$, there must also be some $\gamma \in \Gamma \cap NZ(\bar r/D)$ whose $\bar r$-ality $t(\gamma)$ is a multiple of $D$.
(3) Suppose $J^{\bar r/d} \mu = \mu$ and $J^{\bar r/d} \nu = \nu$ for some divisor $d$ of $\bar r$. Then for any weight $\lambda$, $\chi_\lambda(\mu) = \chi_\lambda(\nu) \ne 0$ implies $t(\lambda)\, t(\mu) \equiv t(\lambda)\, t(\nu) \pmod{d\, \bar r}$.
(4) When $\bar k_1$ is some multiple of $\bar k_2$, then $R_{r,k_1} \ge R_{r,k_2}$.
Proof. (1) Let $\mu$ be a $J^c$-fixed point. Then from the previous remarks, $c$ must divide $\bar r$, and $\bar r/c$ must divide both $k$ and $t(\gamma)$ for each $\gamma \in \Gamma_\mu$. Therefore $c$ must be a multiple of $\bar r/d$. Moreover $\chi_\gamma(J^{\bar r/d} \mu) = \chi_\gamma(\mu)$ for all $\gamma \in \Gamma_\mu$ (hence all $\gamma \in \Gamma$); since $\Gamma$ is a fusion-generator this means $J^{\bar r/d} \mu = \mu$, and hence $c = \bar r/d$.
(2) We know that for every divisor $d$ of $D$, there are $J^{\bar r/d}$-fixed points (more than one, unless $d = D = \bar r$). Choose such a fixed point $\varphi$, say. Let $\Gamma_\varphi$ be as in (1); necessarily $\Gamma_\varphi \subseteq NZ(\bar r/d)$. Then, by (1), $\gcd\{\bar r, k, t(\gamma) \mid \gamma \in \Gamma_\varphi\} = d$. So we see there must be a subset $\Gamma_d \subseteq \Gamma$, namely $\Gamma_d = \Gamma_\varphi$, such that $\gcd\{D, t(\gamma) \mid \gamma \in \Gamma_d\} = d$. Note that each $\Gamma_{D/p_i^\ell}$ must contain some weight $\gamma$ with $\gcd\{D, t(\gamma)\} = D/p_i^\ell$ (otherwise different $J^{\bar r/d}$-fixed points would not be distinguished by $\Gamma$). This gives the first bound. If $\bar r \ne D$, then there will be several $J^{\bar r/D}$-fixed points, and in order for $\Gamma$ to distinguish them, $\Gamma_D$ must be nonempty. This gives the second bound.
(3) Let $P_\ell$ be the $\ell$th power sum polynomial (2.4c). From (2.4d) and (2.5a), $P_\ell[\mu] \ne 0$ requires $d$ to divide $\ell$. Consider the $m = (m_1, m_2, \ldots)$th term in $Q_\lambda$ (see (2.4b)); either it will vanish at $\mu$, or $d$ will divide each $\ell$ with $m_\ell \ne 0$. Since $P_\ell[\mu]$ lies in the cyclotomic field $\mathbb{Q}[\exp[2\pi i\, \ell/\bar k]]$, we find that $S_\lambda[\mu]$ lies in the cyclotomic field $\mathbb{Q}[\exp[2\pi i\, d/\bar k]]$. Therefore (2.4a) applied to $\chi_\lambda(\mu) = \chi_\lambda(\nu) \ne 0$ gives us the desired conclusion.
(4) First note that we have the containment $\frac{\bar k_1}{\bar k_2}(P_+^{r,k_2} + \rho) \subset P_+^{r,k_1} + \rho$. Moreover, for any weight $\gamma$ we have $\chi^{(1)}_\gamma(\frac{\bar k_1}{\bar k_2}(\mu + \rho) - \rho) = \chi^{(2)}_\gamma(\mu)$ for all $\mu \in P_+^{r,k_2}$, where the superscripts indicate that $\bar k_1$ or $\bar k_2$ should be substituted for $\bar k$ in (1.1b). Suppose $\Gamma^{(1)}$ is a fusion-generator for $A_{r,k_1}$. Then for any $\mu, \nu \in P_+^{r,k_2}$, if we have $\chi^{(2)}_\gamma(\mu) = \chi^{(2)}_\gamma(\nu)$ for all $\gamma \in \Gamma^{(1)}$, then we know $\mu = \nu$. Now the $\rho$-shifted action of the affine Weyl group at level $k_2$ will map any weight $\gamma \in \Gamma^{(1)}$ either to some $\gamma' \in P_+^{r,k_2}$ or onto the “boundary” of $P_+^{r,k_2}$. In the former case we get $\chi^{(2)}_\gamma(\mu) = \pm \chi^{(2)}_{\gamma'}(\mu)$, for some sign independent of
$\mu$. In the latter case $\chi^{(2)}_\gamma(\mu) = 0$ for any $\mu$, and can be ignored. Therefore, the set of weights $\gamma'$ in $P_+^{r,k_2}$ obtained in this way from those in $\Gamma^{(1)}$ will be a fusion-generator for $A_{r,k_2}$. □
Equation (2.3a) suggests that the fusion-generators for $A_{r,k}$ should be related to those of $A_{k-1,r+1}$. This is indeed so:
Theorem 2 (Rank-level duality). (1) Suppose $\bar r$ does not divide $k$. Then $R_{r,k} \ge R_{k-1,r+1}$. Moreover, if $\Gamma = \{\gamma^1, \ldots, \gamma^n\}$ is a fusion-generator for $A_{r,k}$, then $\tilde\Gamma = \{\tilde J^{a_1} \tau\gamma^1, \ldots, \tilde J^{a_n} \tau\gamma^n\}$ is one for $A_{k-1,r+1}$, where each $a_i$ is chosen so that $\gcd\{a_i \bar r + t(\gamma^i), k\} = \gcd\{t(\gamma^i), \bar r, k\}$ for each $i$.
(2) If $\bar r$ does not divide $k$, and $k$ does not divide $\bar r$, then $R_{r,k} = R_{k-1,r+1}$; in this case if $\Gamma$ is a fusion-basis for $A_{r,k}$, then $\tilde\Gamma$, defined in (1), will be one for $A_{k-1,r+1}$.
(3) If $\bar r$ does divide $k$, then $R_{r,k} \le R_{k-1,r+1} \le R_{r,k} + 1$. Using the notation of (1), $\{\tilde J \tilde 0, \tau\gamma^1, \ldots, \tau\gamma^n\}$ is a fusion-generator for $A_{k-1,r+1}$.
Proof. (1) Any weight of $P_+^{k-1,r+1}$ can be expressed as $\tilde J^b \tau\mu$ for some integer $b$ and some weight $\mu \in P_+^{r,k}$. So, it suffices to consider any $\mu, \mu' \in P_+^{r,k}$ and $b \in \mathbb{Z}$ for which
$$\tilde\chi_{\tilde J^{a_i} \tau\gamma^i}(\tau\mu) = \tilde\chi_{\tilde J^{a_i} \tau\gamma^i}(\tilde J^b \tau\mu') \quad \forall i, \qquad (3.1a)$$
and show that this implies $\tau\mu = \tilde J^b \tau\mu'$. Equation (3.1a) becomes
$$\chi_{\gamma^i}(\mu) = \exp[2\pi i\, \{\bar r a_i + t(\gamma^i)\}\, \{t(\mu) - t(\mu') - \bar r b\}/\bar r k]\ \chi_{\gamma^i}(\mu'). \qquad (3.1b)$$
Define $\Gamma_\mu$ as in Thm. 1(1). Because $\bar r$ does not divide $k$, we know $\Gamma_\mu \ne \emptyset$. Equation (3.1b) and Thm. 1(1) imply that $\mu$ and $\mu'$ will both be $J^{\bar r/d}$-fixed points, where $d = \gcd_{\gamma^i \in \Gamma_\mu}\{a_i \bar r + t(\gamma^i), k\}$. Then $\tau\mu$ and $\tilde J^b \tau\mu'$ will both be $\tilde J^{k/d}$-fixed points. For each $\gamma^i \in \Gamma_\mu$, Thm. 1(3) and (3.1a) imply
$$\{\bar r a_i + t(\gamma^i)\}\, \{t(\mu) - t(\mu') - \bar r b\} \equiv 0 \pmod{d\, k}. \qquad (3.1c)$$
For each prime $p \mid k$, write $p^a$ and $p^{a'}$ for the exact powers dividing $k$ and $d$, respectively: i.e. $p^a \| k$ and $p^{a'} \| d$. So $a \ge a'$. If $a = a'$, then $p^a$ must divide both $\bar r$ and $t(\mu) - t(\mu')$, by (2.5b). If $a > a'$, then $p^{a'} \| (\bar r a_i + t(\gamma^i))$, for some $\gamma^i \in \Gamma_\mu$. Therefore (3.1c) tells us that $L := \{t(\mu) - t(\mu') - \bar r b\}/k$ is an integer. Equation (3.1b) then implies $\chi_{\gamma^i}(\mu) = \chi_{\gamma^i}(J^L \mu')$ for all $i$. Therefore $\mu = J^L \mu'$, so we may take $\mu = \mu'$ in (3.1a), and absorb the $L$ into $b$. Then $\bar r/d$ must divide $L$, i.e. $k/d$ must divide $b$, i.e. $\tilde J^b \tau\mu = \tau\mu$, and we see that (3.1a) can only be trivially satisfied. Hence $R_{k-1,r+1} \le R_{r,k}$.
(2) is immediate from part (1).
(3) The first inequality comes from (1). That the given set is a fusion-generator follows by the proof of (1). More precisely, replacing $\tilde J^{a_i} \tau\gamma^i$ with $\tilde J \tilde 0$ in (3.1a) implies $L \in \mathbb{Z}$. The rest of the argument is as before. □
The Chinese Remainder Theorem tells us that it is always possible to choose the $a_i$'s in Thm. 2(1). Incidentally, in all cases of which we know, $R_{k-1,r+1} = 1 + R_{r,k}$ when $\bar r < k$ divides $k$.
Earlier we suggested the upper bound $R_{r,k} \le \bar r/2$, and now we also know $R_{r,k} \le \frac{k}{2} + 1$ (or $k/2$ if $k$ fails to divide $\bar r$). In fact we can do much better than this for most pairs $(r, k)$. The argument relies on the cyclotomic Galois group $G_n$ described briefly in the previous section.
Theorem 3 (Galois considerations). $\Gamma_\div := \{w^d : 2d \le \bar r \text{ and } d \text{ divides } \bar k\}$ is a fusion-generator for $A_{r,k}$, called the divisor generator. A related fusion-generator is $\Gamma_\div^\tau$, defined by
$$\Gamma_\div^\tau := \begin{cases} \{w^d : 2d \le k \text{ and } d \text{ divides } \bar k\} & \text{when } k \text{ does not divide } \bar r, \\ \{w^k\} \cup \{w^d : 2d \le k \text{ and } d \text{ divides } \bar k\} & \text{when } k \text{ divides } \bar r. \end{cases}$$
Moreover, each $w^d$ in $\Gamma_\div$ and $\Gamma_\div^\tau$ can be replaced with any hook $\ell w^1 + w^{d-\ell}$ for $1 \le \ell \le d$.
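The bounds of this section reduce to elementary arithmetic on $\bar r = r + 1$ and $\bar k = k + r + 1$, which the following sketch makes explicit. It lists the indices $d$ of the divisor generator of Thm. 3, evaluates the lower bound of Thm. 1(2), and searches for a shift $a_i$ as required in Thm. 2(1). All function names are hypothetical, and the shift is found by plain search rather than by an explicit Chinese Remainder construction.

```python
from math import gcd

def divisor_generator(r, k):
    """Indices d of the weights w^d with 2d <= rbar and d dividing kbar (Thm. 3)."""
    rbar, kbar = r + 1, k + r + 1
    return [d for d in range(1, rbar // 2 + 1) if kbar % d == 0]

def lower_bound(r, k):
    """Thm. 1(2): number of prime factors of D = gcd(rbar, k), counted with
    multiplicity, plus 1 if D differs from rbar."""
    rbar = r + 1
    D = gcd(rbar, k)
    count, n, p = 0, D, 2
    while n > 1:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if D != rbar else 0)

def duality_shift(t_gamma, r, k):
    """Smallest a >= 0 with gcd(a*rbar + t, k) == gcd(t, rbar, k), as in Thm. 2(1)."""
    rbar = r + 1
    target = gcd(gcd(t_gamma, rbar), k)
    return next(a for a in range(k) if gcd(a * rbar + t_gamma, k) == target)

print(divisor_generator(3, 4), lower_bound(3, 4))
```

Whenever `len(divisor_generator(r, k))` equals `lower_bound(r, k)`, the upper bound from Thm. 3 and the lower bound from Thm. 1(2) coincide, so the divisor generator is a fusion-basis; this happens for instance at $r = 3$, $k = 4$.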
Proof. The key observation here is that, because each $x_j$ is a $\bar k$th root of unity, for any $\ell$ there will exist a Galois automorphism $\sigma \in G_{\bar k}$ for which
$$\sigma P_d(x_1, \ldots, x_{\bar r}) = P_\ell(x_1, \ldots, x_{\bar r}), \qquad (3.2)$$
where $d = \gcd\{\ell, \bar k\}$. Suppose, for all $d \le \bar r/2$ dividing $\bar k$, that
$$\chi_{w^d}(\mu) = \chi_{w^d}(\mu'). \qquad (3.3a)$$
We will show this implies $\mu = \mu'$. Equations (3.3a) and (2.4a) give
$$S_{w^d}[\mu] = \xi^d\, S_{w^d}[\mu'] \qquad (3.3b)$$
for all $d \le \bar r/2$ dividing $\bar k$, where $\xi = \exp[-2\pi i\, (t(\mu) - t(\mu'))/\bar r \bar k]$. Equation (2.4b) reads
$$S_{w^\ell}[\mu] = \frac{(-1)^{\ell+1}}{\ell}\, P_\ell[\mu] + \dot Q_\ell(P_1[\mu], \ldots, P_{\ell-1}[\mu]) \qquad (3.4a)$$
for some polynomial $\dot Q_\ell$ homogeneous in the same sense as $Q_\lambda$ (and so has no constant term). Let $d$ be the smallest $\ell$ with $P_\ell[\mu] \ne 0$. Then (3.4a) implies $S_{w^\ell}[\mu] = 0$ for all $\ell < d$ and $S_{w^d}[\mu] = \pm\frac{1}{d} P_d[\mu] \ne 0$, so either $d = \bar r$ (in which case $\mu = (\frac{k}{\bar r}, \frac{k}{\bar r}, \ldots, \frac{k}{\bar r}) = \mu'$), or $d \le \bar r/2$ by (2.2a). But (3.2) requires $d$ to divide $\bar k$, if it is to be minimal. Thus (3.3b) holds. However both $S_{w^d}[\mu]$ and $S_{w^d}[\mu']$ lie in $\mathbb{Q}_{\bar k/d}$, so $\xi$ must be a $\bar k$th root of unity.
We next want to show, by induction on $\ell$, that
$$P_\ell[\mu] = \xi^\ell P_\ell[\mu'] \tag{3.4b}$$
for all $\ell \le r/2$. If we could show this, we would be done, because by (3.4a) it would force $\chi_{w^\ell}(\mu) = \chi_{w^\ell}(\mu')$ for all $\ell \le r/2$, i.e. $\mu = \mu'$. Equation (3.4b) is clearly true for $P_1 = S_1$, using (3.3b) with $d = 1$. By (3.2), it is then true for all $\ell$ with $\gcd\{\ell, k\} = 1$. Take any divisor $d \le r/2$ of $k$, and suppose (3.4b) is true for all $\ell < d$. Using (3.3b), Eq. (3.4a) means that (3.4b) is true for $\ell = d$, and hence all $\ell$ with $\gcd\{\ell, k\} = d$. Therefore (3.4b) is indeed true for all $\ell \le r/2$, and $\mu = \mu'$.
The above remarks continue to hold if we replace each $w^d$ with any hook $\ell w^1 + w^{d-\ell}$ (of all the weights $\lambda$ with $t(\lambda) = d$, only the hooks have the variable $y_d$ appearing nontrivially in the corresponding polynomial $Q_\lambda(y_i)$ – see e.g. p.51 of [8]). Theorem 2 applied to $\Gamma_\div$ gives us the fusion-generator $\Gamma_\div^\tau$ (the hooks $dw^1$ and $J0 = kw^1$ can be replaced here with $w^d$ and $w^k$, respectively). □

In many special cases, most notably Cor. 1 and Cor. 2 below, we can prove that the divisor generator $\Gamma_\div$ is actually a fusion-basis. Another example: suppose $\gcd\{r, k\} = p^\ell$ for some prime $p$, so $k$ will equal $p^m q$ for some $m \ge \ell$ and some number $q$ coprime to $p$. If all prime divisors of $q$ are larger than $r/2$, then $\Gamma_\div$ will be a fusion-basis, and $R_{r,k} = \ell + 1$ (if $r \ne p^\ell$) or $R_{r,k} = \ell$ (if $r = p^\ell$). The reason is that here the lower bound for $R_{r,k}$ from Thm. 1(2) agrees with the upper bound from Thm. 3. A special case of this occurs when both $r$ and $k$ are powers of $p$.
In fact we know of only a few examples (for $r \le k$) where the divisor generator is not a fusion-basis. For r = 4, for example, we find by computer that the fusion-rank is one for k = 5, 9, 17 and 21. On the other hand, the computer program tells us that the fusion-rank is 2 for r = 4 and k = 7, 11, 13 and 15. This implies, by Thm. 1(4), that whenever $\bar{k} = k + 5$ is a multiple of 12, 16, 18 or 20, $R_{4,k} = 2$ and $\Gamma_\div$ will be a fusion-basis.

Conjecture 1. At fixed rank $r$, the divisor generator $\Gamma_\div$ is a fusion-basis for all sufficiently high levels $k$.

For reasons of simplicity, the case of greatest interest is when $\Gamma = \{w^1\}$ is a fusion-generator. The complete solution to this is a consequence of this theorem:

Theorem 4. $\Gamma = \{w^1, w^2, \ldots, w^m\}$ is a fusion-generator of $A_{r,k}$ iff $\Gamma_\div \subseteq \Gamma$ or $\Gamma_\div^\tau \subseteq \Gamma$.

Proof. “⇐” is immediate from Thm. 3. “⇒” Suppose we could find a polynomial $p(x) = x^{m_1} + \cdots + x^{m_\ell} - x^{n_1} - \cdots - x^{n_\ell}$, not identically 0, such that:
(a) $\ell < r$,
(b) $1 \le m_1 < \cdots < m_\ell < k$ and $1 \le n_1 < \cdots < n_\ell < k$,
(c) $x = \exp[2\pi i a/k]$ is a root of $p(x)$, for each $a = 1, 2, \ldots, m$, and
(d) $\sum_{i=1}^{\ell} m_i = \sum_{i=1}^{\ell} n_i$.
Then there would exist weights $\lambda \ne \mu$ in $P_+^{r,k}$ obeying $\chi_{w^a}(\lambda) = \chi_{w^a}(\mu)$ for each $a = 1, \ldots, m$ – in other words, $\Gamma$ could not in this case be a fusion-generator. To see this, choose any $r - \ell$ distinct integers $h_i$ such that $h_1 = 0$, the remaining $h_i$ obey $1 \le h_i < k$, and $\{h_i\} \cap \{m_i\} = \{h_i\} \cap \{n_i\} = \emptyset$. The $h_i$ and $m_j$ together equal the $r$ values of $\lambda(i)$, and the $h_i$ and $n_j$ together equal the $r$ values of $\mu(i)$. Since $p(x) \not\equiv 0$, we know $\mu \ne \lambda$. Condition (c) says that $P_a[\lambda] = P_a[\mu]$ for all $a \le m$, and (d) is just the statement that $t(\lambda + \rho) = t(\mu + \rho)$. Hence (c) together with induction on (3.4a) is equivalent to saying $\chi_{w^a}(\lambda) = \chi_{w^a}(\mu)$ for those $a$, and we are done.
It is easy to find this polynomial in many cases. In particular, let $d$ be the largest divisor of $k$ with $2d \le \min\{r, k\}$, and assume $d > m$. Take $p(x)$ to be $(x^4 - x^3 - x^2 + x)(x^{k-n} + x^{k-2n} + \cdots + x^n + 1)$, where $n = k/d$. Then (c) and (d) are automatically satisfied. $\ell = 2d$ here, so (a) will be satisfied unless $d = r/2$. Also, (b) will be satisfied unless $n \le 4$, which can only happen if $d = r/2 = k/2$. This argument breaks down only when $d = r/2$. However, when $r/2$ divides $k$, there will be $J^2$-fixed points, and by Thm. 1(2) we would require some $\gamma \in \Gamma$ with $t(\gamma)$ a multiple of $r/2$ if $\Gamma$ is to be a fusion-generator.
The only remaining way $\Gamma$ could fail to contain $\Gamma_\div \cap \Gamma_\div^\tau$ is if simultaneously $k | r$, $r \ne k$, and $m < k$. But then Thm. 1(2) applies, and $\Gamma$ would not be able to distinguish the $J^{r/k}$-fixed points. □

Corollary 1 (The first-fundamental generator). $\Gamma = \{w^1\}$ is a fusion-generator iff both:
(i) each prime divisor $p$ of $k$ satisfies $2p > \min\{r, k\}$, and
(ii) either $r$ divides $k$, or $\gcd\{r, k\} = 1$.

Incidentally, the proof of Thm. 4 also implies that at least one weight $\gamma$ in any fusion-generator must have $t(\gamma) \ge d$, where $d$ is the largest divisor of $k$ with $d \le r/2$ and $d \le k/2$. If this $\gamma$ is not a hook, then in fact $t(\gamma)$ would have to be strictly larger than $d$.

Corollary 2. Some fusion-bases for $A_{r,k}$ are:
• $\Gamma_\div = \{w^1\}$ for r = 1 and 2, ∀k ≥ 1;
• $\Gamma_\div = \{w^1\}$ for r = 3 when k is odd; $\Gamma_\div = \{w^1, w^2\}$ for r = 3 when k is even;
• $\Gamma_\div^\tau = \{w^1\}$ for k = 1, ∀r ≥ 1;
• $\Gamma_\div^\tau = \{w^1\}$ for k = 2 and any even r; both $\Gamma = \{J0, w^1\}$ and $\Gamma_\div^\tau = \{w^1, w^2\}$ for k = 2 and any odd r > 1;
• $\Gamma_\div^\tau = \{w^1\}$ for k = 3 and any r coprime to 3; both $\Gamma = \{J0, w^1\}$ and $\Gamma_\div^\tau = \{w^1, w^3\}$ for k = 3 and any multiple r > 3 of 3;
• $\Gamma_\div^\tau = \{w^1\}$ for k = 4 when r is even; $\Gamma_\div^\tau = \{w^1, w^2\}$ for k = 4 when r ≡ 1 (mod 4), r > 4; and both $\Gamma = \{J0, w^1, w^2\}$ and $\Gamma_\div^\tau = \{w^1, w^2, w^4\}$ for k = 4 when r ≡ 3 (mod 4), r > 4.

Corollary 2 follows immediately from Thm. 1(2) and Thm. 3. Some of these fusion-bases are collected in the table. Corollary 2 tells us the fusion-rank when either r ≤ 3 or k ≤ 4. In addition, other fusion-bases are $\Gamma_\div = \{w^1\}$ for r = 4 when k is even, for r = 5 when k is coprime to 6, and $\Gamma_\div^\tau = \{w^1\}$ for k = 6 when r is coprime to 6; $\Gamma_\div = \{w^1, w^2\}$ for r = 5 when k ≡ 2, 4 (mod 6), and $\Gamma_\div^\tau = \{w^1, w^2\}$ for k = 6 when r ≡ 1, 3 (mod 6); and $\Gamma_\div = \{w^1, w^3\}$ for r = 5 when k ≡ 3 (mod 6), and $\Gamma_\div^\tau = \{w^1, w^3\}$ for k = 6 when r ≡ 2 (mod 6). The simplest cases we do not yet know the answer for are: r = 4
Table 1. Listed are $A_{r,k}$ fusion-bases for low ranks and/or levels. The symbols | in rows of the table delimit sequences of fusion-bases that repeat indefinitely as the level k increases. For increasing ranks r, overlines and underlines work similarly in the columns. “l” signifies that $N_{w^1}$ is invertible (see Sect. 4)

r \ k   1             2               3               4                     5
1       |{w^1}| l     {w^1}           {w^1} l         {w^1}                 {w^1} l
2       |{w^1}| l     {w^1} l         {w^1}           {w^1} l               {w^1} l
3       |{w^1} l      {w^1, w^2}|     {w^1} l         {w^1, w^2}            {w^1} l
4       {w^1} l       {w^1} l         {w^1} l         {w^1} l               {w^2}
5       {w^1} l       {w^1, w^2}      {w^1, w^3}      {w^1, w^2}            {w^1} l
6       {w^1} l       {w^1} l         {w^1} l         {w^1} l               {w^1, w^2}
7       {w^1} l       {w^1, w^2}      {w^1} l         {w^1, w^2, w^4}       {w^1} l
8       {w^1} l       {w^1} l         {w^1, w^3}      {w^1} l               {2w^2 + w^5} l
when k is odd (R ≤ 2); r = 5 when 6 divides k (R = 2 or 3); k = 5 when r is even (R ≤ 3); and k = 6 when 6 divides r (R = 3 or 4). Obviously to go further we need a better lower bound. Theorem 1(2) is the best we have, but it only exploits the presence of fixed points.

4. The Fusion Matrix of $w^1$

There are many times when it is useful to know whether particular S matrix elements are nonzero. This is the case for example in almost every modular invariant partition function classification attempt – e.g. see the underlying assumption in [17]. It is especially useful to answer this for the first fundamental weight $w^1$ – in Thm. 5 below we give some consequences. For later convenience, define the sets
$$P_{r,k} := \{p \text{ prime} : p \le \min\{r, k\} \text{ and } p \text{ divides } k\}, \tag{4.1a}$$
$$\mathbb{Z}_{\ge} X := \Bigl\{\sum_{x \in X} a_x x : a_x \in \mathbb{Z}_{\ge 0}\Bigr\}, \tag{4.1b}$$
where $X$ in (4.1b) is any set of natural numbers. $\mathbb{Z}_{\ge} X$ is the set of all possible sums (repetitions allowed) of elements of $X$. For example, $\mathbb{Z}_{\ge}\{n\} = \{0, n, 2n, \ldots\}$ is the set of all nonnegative multiples of $n$.

Theorem 5. (1) Suppose $S_{w^1,\mu} = 0$. Then $S_{\lambda,\mu} = 0$ unless $t(\lambda) \in \mathbb{Z}_{\ge} P_{r,k}$. Both $k$ and $r$ must lie in $\mathbb{Z}_{\ge} P_{r,k}$.
(2) Suppose there is only one prime divisor $p$ of $k$ not larger than $\min\{r, k\}$. Then $S_{w^1,\mu} = 0$ iff $\mu$ is a fixed point.

Proof. When $k \ge r$, part (1) follows by considering the polynomial expression (2.4b) and using the Galois argument of (3.2): $P_\ell[\mu] \ne 0$ requires $\ell \in \mathbb{Z}_{\ge} P_{r,k}$. Taking $\lambda = J0$ gives us $k \in \mathbb{Z}_{\ge} P_{r,k}$, and $\lambda = w^r$ (see (2.4f)) gives us $r \in \mathbb{Z}_{\ge} P_{r,k}$. When $k < r$, to show that we can restrict to primes $p \le k$, we use rank-level duality (2.3a) to get that $\tilde{t}(\tau\lambda) \in \mathbb{Z}_{\ge} P_{r,k}$ and then $t(\lambda) \in \mathbb{Z}_{\ge} P_{r,k}$ follows from (2.3b) and the fact that $k \in \mathbb{Z}_{\ge} P_{r,k}$. For part (2), use part (1) and Thm. 1(1) to get that $\mu$ must be fixed by $J^{r/p}$. □
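The set $\mathbb{Z}_{\ge} X$ of (4.1b) is easy to explore by machine. The sketch below is illustrative only: the function names `in_semigroup` and `gaps` are hypothetical, and only the definition (4.1b) itself is taken from the text. Membership is decided by a simple dynamic programme over $0, 1, \ldots, n$. For two coprime generators $m, n$ it is a classical fact that exactly $(m-1)(n-1)/2$ positive integers are missing from the set, the largest being $mn - m - n$, and the sample values below reproduce this for $\{3, 5\}$.

```python
def in_semigroup(n, generators):
    """Return True if n is a sum (repetitions allowed) of elements of `generators`.

    This is membership in Z_>= X from (4.1b); generators are assumed to be
    positive integers.
    """
    if n < 0:
        return False
    reachable = [False] * (n + 1)
    reachable[0] = True  # the empty sum
    for total in range(1, n + 1):
        reachable[total] = any(
            g <= total and reachable[total - g] for g in generators
        )
    return reachable[n]


def gaps(generators, bound):
    """Positive integers up to `bound` that are NOT in Z_>= X."""
    return [n for n in range(1, bound + 1) if not in_semigroup(n, generators)]


# Sample input (an assumption for illustration): X = {3, 5}.
missing = gaps([3, 5], 30)
```

Conditions of the form "$\ell \in \mathbb{Z}_{\ge}\{n_1, \ldots, n_m\}$" can be tested with `in_semigroup(ell, [n1, ..., nm])` in the same way.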
Note that the hypothesis of (2) holds whenever $k$ is a power of a prime. This special case follows directly from (4.2) below, by using Gauss’ Lemma on factorising integral polynomials, and evaluating certain factored polynomials at 1. Theorem 5(2) however is much more general.
$N_{w^1}$ is invertible iff $S_{w^1,\mu} \ne 0$ for all $\mu \in P_+^{r,k}$. Equivalently, $N_{w^1}$ is invertible iff
$$\sum_{j=1}^{r} \exp[2\pi i\,\mu(j)/k] \ne 0 \qquad \forall \mu \in P_+^{r,k}. \tag{4.2}$$
It is not hard to show that for k ≤ 4 or r ≤ 4, $N_{w^1}$ is invertible iff $\gcd\{r, k\} = 1$; in fact, for those $r, k$, $\chi_{w^1}(\mu) = 0$ only for fixed points $\mu$. The identical conclusion holds for many other $r$ and $k$, as we saw in Thm. 5(2). But Thms. 6(4),(5) below say that these cases are uncharacteristically well-behaved. For example, when r = 5, if 6 divides k ≥ 12, then $N_{w^1}$ will not be invertible, even though there are no fixed points.

Theorem 6 (Invertibility). (1) $N_{w^1}$ is invertible iff $\tilde{N}_{\tilde{w}^1}$ is, where the latter is the fusion matrix for $A_{k-1,r+1}$.
(2) If $\gcd\{r, k\} \ne 1$, then $N_{w^1}$ cannot be invertible.
(3) $N_{w^1}$ is invertible if either $r \notin \mathbb{Z}_{\ge} P_{r,k}$ or $k \notin \mathbb{Z}_{\ge} P_{r,k}$.
(4) Suppose $pq$ divides $k$, where $p$ and $q$ are distinct primes for which $r \in \mathbb{Z}_{\ge}\{p, q\}$ – i.e. there exist nonnegative integers $a, b$ such that $ap + bq = r$. If $k \ge pq(\lceil a/q \rceil + \lceil b/p \rceil)$, then $N_{w^1}$ will not be invertible ($\lceil x \rceil$ here denotes the smallest integer not smaller than $x$ – e.g. $\lceil 2 \rceil = 2$, $\lceil 3.1 \rceil = 4$).
(5) Suppose $p_1, p_2, \ldots, p_n$ are primes dividing $k$ for which $r \in \mathbb{Z}_{\ge}\{p_1, \ldots, p_n\}$ – i.e. there exist nonnegative integers $a_i$ such that $\sum a_i p_i = r$. If $k \ge p_i p_j \sum_{h=i}^{j} a_h$ for any $i < j$, then $N_{w^1}$ will not be invertible.

Proof. (1) follows directly from (2.3a). (2) exploits the fact (see (2.2b)) that $\chi_{w^1}(\varphi) = 0$ for any fixed point $\varphi$. (3) is a corollary of Thm. 5(1).
(4) We want to construct a particular $\mu \in P_+^{r,k}$ such that $\chi_{w^1}(\mu) = 0$. To do this we find an arithmetic sequence $\frac{k}{p}\mathbb{Z} + c_i$ for each $i = 1, \ldots, a$, and an arithmetic sequence $\frac{k}{q}\mathbb{Z} + c'_j$ for each $j = 1, \ldots, b$, such that none of these $a + b$ sequences intersect. This is easy to do, provided $k$ is big enough. Choose as the $c_i$’s $0, \frac{k}{q}, \ldots, \frac{k}{q}(q - 1), 1, 1 + \frac{k}{q}, \ldots, 1 + \frac{k}{q}(q - 1)$, etc., until we have chosen $a$ of them (the last one will be $\lceil a/q \rceil - 1$ plus some multiple of $\frac{k}{q}$). Next choose as the $c'_j$’s $\lceil a/q \rceil, \lceil a/q \rceil + \frac{k}{p}, \ldots$, until we have chosen $b$ of them (the last one will be $\lceil a/q \rceil + \lceil b/p \rceil - 1$ plus some multiple of $\frac{k}{p}$). Our $a + b$ sequences will be disjoint, provided the bound on $k$ is satisfied, and will intersect the interval $0 \le x < k$ in precisely $ap + bq = r$ points. Let $\mu$ be the unique weight in $P_+^{r,k}$ whose $\mu(\ell)$ equal those $r$ points. Then $\chi_{w^1}(\mu) = 0$, because the sum in (4.2) along each of the $a + b$ sequences is 0.
(5) follows immediately from similar considerations: we are looking for $a_i$ series $\frac{k}{p_i}\mathbb{Z} + c_{ij}$, where $c_{ij} \not\equiv c_{i\ell}$ (mod $\frac{k}{p_i}$) for $j \ne \ell$, and $c_{ij} \not\equiv c_{h\ell}$ (mod $\frac{k}{p_i p_h}$) for $i \ne h$. The choice $c_{ij} = j - 1 + \sum_{\ell=1}^{i-1} a_\ell$ works. □
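The construction in the proof of part (4) can be replayed numerically. The sketch below is a literal transcription of the recipe for the offsets $c_i$ and $c'_j$; the function name `zero_weight_points` and the sample parameters (modulus 30, $p = 3$, $q = 5$, $a = 2$, $b = 1$) are choices made here for illustration, and the code is only checked on that one sample, not claimed correct for all inputs. It collects the points of the $a + b$ arithmetic sequences in $0 \le x < k$ and then evaluates the sum appearing in (4.2).

```python
import cmath
from math import ceil


def zero_weight_points(k, p, q, a, b):
    """Points in [0, k) covered by the a + b sequences from the proof of Thm. 6(4).

    Assumes p*q divides k; offsets follow the recipe in the text:
    c_i   = 0, k/q, ..., (k/q)(q-1), 1, 1 + k/q, ...          (first a of them)
    c'_j  = ceil(a/q), ceil(a/q) + k/p, ...                    (first b of them)
    """
    offsets_c = [s + (k // q) * t
                 for s in range(ceil(a / q)) for t in range(q)][:a]
    start = ceil(a / q)
    offsets_cp = [start + s + (k // p) * t
                  for s in range(ceil(b / p)) for t in range(p)][:b]
    points = set()
    for c in offsets_c:    # sequences (k/p)Z + c_i, each with p points in [0, k)
        points |= {(c + (k // p) * t) % k for t in range(p)}
    for c in offsets_cp:   # sequences (k/q)Z + c'_j, each with q points in [0, k)
        points |= {(c + (k // q) * t) % k for t in range(q)}
    return sorted(points, reverse=True)


def exponential_sum(points, k):
    """The sum of exp(2*pi*i*x/k) over the given points, as in (4.2)."""
    return sum(cmath.exp(2j * cmath.pi * x / k) for x in points)


# Sample parameters (chosen here; the bound p*q*(ceil(a/q) + ceil(b/p)) = 30 is met).
pts = zero_weight_points(30, 3, 5, 2, 1)
total = exponential_sum(pts, 30)
```

For this sample the three sequences are 0, 10, 20; 6, 16, 26; and 1, 7, 13, 19, 25, giving $2 \cdot 3 + 1 \cdot 5 = 11$ points, and each sequence separately contributes zero to the exponential sum.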
The proofs of Thms. 6(4),(5) are constructive: their zeros arise when (4.2) finds itself a sum of terms such as $\sum_{a=1}^{p} \xi^a$ for $\xi$ a primitive $p$-th root of unity. A simple example of Thm. 6(4) is at r = 11, k = 30. With p = 3, q = 5, and a = 2, b = 1, the bound is saturated. One finds $c_1 = 0$, $c_2 = 6$ and $c'_1 = 1$. These yield 0, 10, 20; 6, 16, 26; and 1, 7, 13, 19, 25; respectively. So, there is a zero for the weight given by $\{\mu(1), \ldots, \mu(r)\} = \{26, 25, 20, 19, 16, 13, 10, 7, 6, 1, 0\}$.

Conjecture 2. For $A_{r,k}$, $N_{w^1}$ fails to be invertible iff one can find distinct primes $p_i \le \min\{r, k\}$ dividing $k$ and nonnegative integers $a_i, b_i$ such that $r = \sum_i a_i p_i$ and $k = \sum_i b_i p_i$.

In other words, we conjecture that the condition of Thm. 6(3) is an “iff”. Note that one way this condition will be satisfied is if $\gcd\{r, k\} \ne 1$. The conditions in Thms. 6(4),(5) are strongest when we take $r < k$ (which without loss of generality we can). Also, the bound in 6(5) is best when the $p_i$ are labelled so that the largest are given indices near $n/2$. In practice the most useful special case of Thms. 6(4),(5) is: If one can find an odd prime $p \le r$ for which $2p$ divides $k$ and $k \ge 3p - 1$, then $N_{w^1}$ will not be invertible. The analogue of Thm. 1(4) is also valid here, but is not very useful.
The answer to Question 2 for small $r$ and $k$ is indicated in the table. Computer checks were performed for r ≤ 9 and all levels k > r such that dim $P_+^{r,k}$ < 300,000. The results were consistent with Conjecture 2. Conjectures 1 and 2 are the simplest guesses consistent with our results, but it would be nice to test them against additional numerical data. Incidentally, conditions like “$\ell \in \mathbb{Z}_{\ge}\{n_1, \ldots, n_m\}$” are only strong when $\ell$ is small. For example, given any coprime numbers $m$ and $n$, there are only $(m - 1)(n - 1)/2$ positive integers $\ell$ which do not lie in $\mathbb{Z}_{\ge}\{m, n\}$ – the largest such $\ell$ is $mn - m - n$. So for fixed $r$, we know Conjecture 2 will hold for all sufficiently large $k$.

5. Extensions

Because the fundamental weights are much simpler, the most interesting fusion-generators are the ones which consist only of fundamental weights: $\Gamma \subseteq \{w^1, \ldots, w^r\}$. We can speak of fundamental-fusion-generators and fundamental-fusion-rank $FR_{r,k}$. All of the results in Sects. 3 and 4 also apply directly to $FR_{r,k}$. By definition, $R_{r,k} \le FR_{r,k}$, and Conjecture 1 predicts that, for fixed $r$, $R_{r,k} = FR_{r,k}$ for all sufficiently large $k$. Note however from the table that $FR_{8,5} = FR_{4,9} = 2$ while $R_{8,5} = R_{4,9} = 1$. Because of (2.8d), we can strengthen here the bound in Thm. 1(2). For example, if $FR_{r,k}$ equals the bound given in Thm. 1(2), then so must $FR_{r/d-1,k/d}$ for all divisors $d$ of $\gcd\{r, k\}$.
One can also ask Question 2 for other weights, most importantly the other fundamental weights, and again (2.8b) will be very useful. For example, we know $\chi_{w^2}$ will vanish at some $J^5$-fixed point of $A_{9,14}$, because $N_{w^1}$ is not invertible for $A_{4,7}$.
Of course Questions 1 and 2 can and should be asked of the fusion algebras for the other affine algebras, and similar arguments will apply. We have not investigated them, except to find some fusion-bases for $C_{2,k}$ and $G_{2,k}$ on the computer, and to get Thm. 7 below for $G_{2,k}$. Of course $R_k(C_2^{(1)})$ must equal 2 for any even $k$, and we find the rank is also 2 for all odd k < 26 (the limit of our computer check), save k = 1, 3 and 9. For k = 1 and 9, the only fusion-bases are $\{w^1\}$ and $\{2w^1 + 6w^2\}$, respectively. At k = 3 there are four different fusion-bases: $\{2w^1\}$, $\{w^2\}$, $\{2w^1 + w^2\}$, and $\{2w^2\}$. A very
tempting conjecture is that the rank $R(C_{r,k})$ equals 2 for all sufficiently large $k$ (and probably for all k > 9). The situation for $G_{2,k}$ however is more surprising:

Theorem 7. (1) When the level $k$ is odd, $\{w^2\}$ is a fusion-basis for $G_{2,k}$.
(2) $N_{w^2}$ fails to be invertible for $G_{2,k}$ iff either 4 or 30 divides $\bar{k} := k + 4$.

Proof. The key here is to reduce the $G_{2,k}$ quantities to $A_{2,k+1}$ quantities, and use the fact that $\{w^1\}$ is a fusion-basis for $A_{2,k+1}$. Using (1.1b) and the simple Lie subalgebra $A_2 \subset G_2$, we find
$$\chi_{w^2}(\mu) = \underline{\chi}_{w^1}(\underline{\mu}) + \underline{\chi}_{w^2}(\underline{\mu}) + 1, \tag{5.1}$$
where underlines denote $A_{2,k+1}$ quantities, and $\underline{\mu} = \mu_1 w^1 + (\mu_1 + \mu_2 + 1) w^2$. So part (1) reduces to the following statement$^4$ for $A_{2,k+1}$: for any $\lambda, \mu \in P_+^{2,k+1}$ with $\lambda \ne C\lambda$ and $\mu \ne C\mu$ (only these nonselfconjugate weights correspond to $G_{2,k}$ ones), does the equality
$$\cos\Bigl(2\pi\frac{\lambda_1 + 2\lambda_2 + 3}{3\bar{k}}\Bigr) + \cos\Bigl(2\pi\frac{\lambda_2 - \lambda_1}{3\bar{k}}\Bigr) + \cos\Bigl(2\pi\frac{2\lambda_1 + \lambda_2 + 3}{3\bar{k}}\Bigr) = \cos\Bigl(2\pi\frac{\mu_1 + 2\mu_2 + 3}{3\bar{k}}\Bigr) + \cos\Bigl(2\pi\frac{\mu_2 - \mu_1}{3\bar{k}}\Bigr) + \cos\Bigl(2\pi\frac{2\mu_1 + \mu_2 + 3}{3\bar{k}}\Bigr) \tag{5.2a}$$
force either $\lambda = \mu$ or $\lambda = C\mu$? Write $c_1, c_2, c_3$ for the three cosines on the left side of (5.2a), and write $c_1', c_2', c_3'$ for those on the right. Then (5.2a) says $c_1 + c_2 + c_3 = c_1' + c_2' + c_3'$, and since $(2\nu_1 + \nu_2 + 3) + (\nu_2 - \nu_1) = \nu_1 + 2\nu_2 + 3$, we also get $c_1^2 + c_2^2 + c_3^2 = 1 + 2c_1 c_2 c_3$ and $c_1'^2 + c_2'^2 + c_3'^2 = 1 + 2c_1' c_2' c_3'$. Hit both sides of (5.2a) with the Galois automorphism $\sigma_2$ (see Sect. 2). Since $\cos(2x) = 2\cos^2(x) - 1$, we obtain
$$c_1^2 + c_2^2 + c_3^2 = c_1'^2 + c_2'^2 + c_3'^2. \tag{5.2b}$$
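The two elementary facts used in this step are easy to confirm numerically: whenever one of three angles is the sum of the other two, their cosines satisfy $c_1^2 + c_2^2 + c_3^2 = 1 + 2c_1c_2c_3$, and doubling all angles turns the sum of cosines into $2(c_1^2 + c_2^2 + c_3^2) - 3$. The Python lines below are only a sanity check with arbitrarily chosen angles; the function names are illustrative and nothing here depends on the specific weights in (5.2a).

```python
import math


def cosines(a, b):
    """Cosines of the angles a, b and a + b (one angle is the sum of the other two)."""
    return (math.cos(a), math.cos(b), math.cos(a + b))


def identity_gap(a, b):
    """c1^2 + c2^2 + c3^2 - (1 + 2*c1*c2*c3); should vanish up to rounding."""
    c1, c2, c3 = cosines(a, b)
    return (c1 ** 2 + c2 ** 2 + c3 ** 2) - (1 + 2 * c1 * c2 * c3)


def doubled_sum(a, b):
    """Sum of cosines after doubling every angle (the effect used for sigma_2)."""
    return math.cos(2 * a) + math.cos(2 * b) + math.cos(2 * (a + b))


def sum_of_squares(a, b):
    c1, c2, c3 = cosines(a, b)
    return c1 ** 2 + c2 ** 2 + c3 ** 2


# Arbitrary test angles (assumed values, not taken from the paper).
samples = [(0.3, 1.1), (2.0, -0.7), (math.pi / 3, math.pi / 3)]
```

Together with equality of the sums $c_1 + c_2 + c_3$, these relations fix the second and third elementary symmetric functions of the $c_i$, which is why every symmetric polynomial in them is determined.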
Thus any symmetric polynomial in $c_1, c_2, c_3$ will equal the corresponding symmetric polynomial in $c_1', c_2', c_3'$. In particular
$$\Bigl(\sin\bigl(2\pi\tfrac{\lambda_1 + 2\lambda_2 + 3}{3\bar{k}}\bigr) - \sin\bigl(2\pi\tfrac{\lambda_2 - \lambda_1}{3\bar{k}}\bigr) - \sin\bigl(2\pi\tfrac{2\lambda_1 + \lambda_2 + 3}{3\bar{k}}\bigr)\Bigr)^2 = \Bigl(\sin\bigl(2\pi\tfrac{\mu_1 + 2\mu_2 + 3}{3\bar{k}}\bigr) - \sin\bigl(2\pi\tfrac{\mu_2 - \mu_1}{3\bar{k}}\bigr) - \sin\bigl(2\pi\tfrac{2\mu_1 + \mu_2 + 3}{3\bar{k}}\bigr)\Bigr)^2. \tag{5.2c}$$
In other words, we know from (5.2a) that the real parts of $\chi_{w^1}(\lambda)$ and $\chi_{w^1}(\mu)$ are equal, and from (5.2c) that their imaginary parts are also equal, up to a sign. Hence either $\lambda = \mu$ or $\lambda = C\mu$, and we have proven part (1).
For part (2), note that $\chi_{w^2}(\mu) = 0$ is equivalent to (see (5.1))
$$c_1 + c_2 + c_3 = -\frac{1}{2}, \tag{5.3a}$$
in the above notation. Consider first $k$ odd. Then hitting (5.3a) with the Galois automorphism $\sigma_2$ gives us $c_1^2 + c_2^2 + c_3^2 = \frac{5}{4}$, and hence $c_1 c_2 c_3 = \frac{1}{8}$. We can solve these equations, and we find $8c_i^3 + 4c_i^2 - 4c_i - 1 = 0$, i.e. $\{c_1, c_2, c_3\} = \{\cos(2\pi\frac{1}{7}), \cos(2\pi\frac{2}{7}), \cos(2\pi\frac{3}{7})\}$. However, these cosines cannot be realised by a weight in $P_+^k(G_2^{(1)})$.
Next, suppose $k \equiv 2$ (mod 4). We may assume (using $G_{2,k}$ notation) that exactly two of the arguments $\{3\mu_1 + 2\mu_2 + 5, \mu_2 + 1, 3\mu_1 + \mu_2 + 4\}$ are odd, otherwise they would all be even and the argument would reduce to the $k$ odd one. Here we use the automorphism $\sigma_{3\bar{k}/2 - 2}$ and find (relabeling the $c_i$ if necessary) that $c_3^2 - c_1^2 - c_2^2 = -\frac{3}{4}$. We can solve for $c_i$ as before, and we find that either $c_3 = \cos(2\pi\frac{1}{5})$ and $\{c_1, c_2\} = \{\cos(2\pi\frac{7}{30}), \cos(2\pi\frac{13}{30})\}$, or $c_3 = \cos(2\pi\frac{2}{5})$ and $\{c_1, c_2\} = \{\cos(2\pi\frac{1}{30}), \cos(2\pi\frac{11}{30})\}$. Either possibility requires 30 to divide $\bar{k} = k + 4$, in order to be realised by a weight of $G_{2,k}$. When 30 divides $\bar{k}$, we do indeed get zeros: $\mu = (\bar{k}/3 - 1, \bar{k}/30 - 1, 3\bar{k}/5 - 1) \in P_+^k(G_2^{(1)})$ works.
Finally, suppose 4 divides $k$. Then $\mu = (k/4, k/4, k/4) \in P_+^k(G_2^{(1)})$ works. □

4 For the remainder of the proof of part (1), we will switch to $A_{2,k+1}$ notation.

(By $w^2$ here we mean the Weyl-dimension 7 fundamental weight of $G_2$, corresponding to the short simple root.) However, $\{w^2\}$ will not be a fusion-generator when k > 4 is even. Our computer program tells us that for k ≤ 24, the fusion-rank is 1 except for k = 6, 12, 16 and 20 (of course this implies it will also be 2 whenever k + 4 is a multiple of 10, 16, or 24).

6. Number Fields Associated with S

By the field $K_{r,k}$ we mean the smallest field containing the rationals and all of the entries $S_{\lambda,\mu}$ of $S$. Similarly, by the field $L_{r,k}$ we mean the smallest field containing $\mathbb{Q}$ and all of the values $\chi_\lambda(\mu)$. Because of their role in the Galois symmetry (2.6), it is natural to try to identify these fields. This question was posed in [4], and related questions have been considered in e.g. [18,20]. Another reason the question is interesting is that, as we shall see, it has a simple answer! We will give this answer in Cor. 3 below, for the most important case: $A_{r,k}$.
The matrix $S$ for any nontwisted affine algebra $g$ is given in e.g. [16]. The expression for $S_{\lambda,\mu}$ consists of a sum $s(\lambda, \mu)$ over the Weyl group of $g$, multiplied by a constant $c$. For $A_{r,k}$, $s(\lambda, \mu)$ manifestly lies in the field $\mathbb{Q}_{rk}$, and
$$c = \frac{i^{r(r+1)/2}}{k^{r/2}\sqrt{r+1}}.$$
Using Gauss sums, which express square-roots of integers as sums of roots of unity, it can be shown that the constant $c$ lies in either $\mathbb{Q}_r$ if $r$ is even, or $\mathbb{Q}_{rk}$ if either $r \equiv 3$ (mod 4) or $k$ is even, or $\mathbb{Q}_{rk}[\sqrt{\pm 2}]$ if both $k$ is odd and $kr \equiv \pm 2$ (mod 8). Thus we know $L_{r,k}$ is always a subfield of $\mathbb{Q}_{rk}$, and $K_{r,k}$ is always a subfield of $\mathbb{Q}_{4rk}$.
Write $[\lambda]$ for the orbit $\{J^i\lambda\}$ of $\lambda$ by the simple currents. We will find our fields by first computing some Galois orbits. This result should be of independent value.

Theorem 8. Consider any k > 2 and r ≠ 1.
(1) Choose any fundamental weight $w^m$ with $m \le \min\{r - 2, k - 2\}$, and any Galois automorphism $\sigma_\ell$. Then (with one exception) $\sigma_\ell w^m \in [w^m] \cup [Cw^m]$ iff $\ell \equiv \pm 1$ (mod $k$); for all other $\ell$ the quantum-dimension $S_{\sigma_\ell w^m,0}/S_{0,0}$ of $\sigma_\ell w^m$ will be strictly greater than that of $w^m$. (The one exception is $w^2$ for $A_{3,4}$, where each $\sigma_\ell$ fixes $w^2$.)
(2) When $r \not\equiv 1$ (mod 4), $\sigma_\ell w^1 = w^1$ iff $\ell \equiv 1$ (mod $rk$). When $r \equiv 1$ (mod 4) and $k$ is even, then $\sigma_\ell w^1 = w^1$ iff $\ell \equiv 1$ (mod $rk/2$).

Proof. (1) Because of (2.1a), we may assume $m \le r/2$. Assume first that $k \ge r$. From the Weyl denominator formula, we compute
$$\frac{S_{\sigma_\ell w^m,0}}{S_{w^m,0}} = \prod_{n=1}^{m} \frac{|\sin(\pi\ell n/k)|^{r-n}}{\sin(\pi n/k)^{r-n}} \; \prod_{n=m+1}^{r-m} \frac{|\sin(\pi\ell n/k)|^{r-n}}{\sin(\pi n/k)^{r-n}} \; \prod_{n=r+1-m}^{r} \frac{|\sin(\pi\ell n/k)|^{r+1-n}}{\sin(\pi n/k)^{r+1-n}} \tag{6.2a}$$
where we drop the middle product if $m = r/2$. We want to know when (6.2a) equals 1. This is easy, for $k > r \ge 2$, since $\sin(\pi/k) < \sin(2\pi/k) < \cdots < \sin(\pi r/k)$. Consider first $m < r/2$: of all possible choices of integers $1 \le n_1 < n_2 < \cdots < n_{r+1} \le k/2$, the minimum possible product of $r - 1$ $\sin(\pi n_1/k)$’s, $r - 2$ $\sin(\pi n_2/k)$’s, ..., $r - m$ $\sin(\pi n_m/k)$’s, $r - m$ $\sin(\pi n_{m+1}/k)$’s, ..., $m$ $\sin(\pi n_{r-m}/k)$’s, $m$ $\sin(\pi n_{r+1-m}/k)$’s, ..., and 1 $\sin(\pi n_r/k)$, is the choice $n_1 = 1$, $n_2 = 2$, ..., $\{n_m, n_{m+1}\} = \{m, m + 1\}$, ..., $n_{m+2} = m + 2$, ..., $\{n_{r-m}, n_{r+1-m}\} = \{r - m, r + 1 - m\}$, ..., $n_{m+1} = m + 1$. This immediately forces $\ell \equiv \pm 1$ (mod $k$) (for $m > 1$, just look at the first term; when $m = 1$, $\ell \equiv \pm 2$ is eliminated by seeing what happens to the second term).
If instead $m = r/2$, the exponents of $\sin(\pi n/k)$ in (6.2a) are no longer nonincreasing: near $n = m + 1$ we get the subproduct $\cdots \sin(\pi(m - 1)/k)^{r-m} \sin(\pi m/k)^{r-m} \sin(\pi(m + 1)/k)^{r-m} \sin(\pi(m + 2)/k)^{r-m} \cdots$. For $m > 2$, the proof that (6.2a) will always be greater than 1 for $\ell \not\equiv \pm 1$ (mod $k$), follows from the simple observation that $\sin(\pi/k) \sin(\pi(m + 1)/k) < \sin(2\pi/k) \sin(\pi m/k)$: the least-harmful place to move “1” to is “2”, and the best place to move “$m + 1$” to is “$m$”, and yet even that (forgetting the other terms, which will make matters worse) will increase the product. The remaining case $m = 2$ corresponds to $r = k = 4$, i.e. to the given exception. This completes the argument for $k \ge r$.
When $k < r$, apply rank-level duality (2.3a): it is an exact symmetry of quantum-dimensions, and maps $J$-orbits to $\tilde{J}$-orbits. $\tau w^m = m\tilde{w}^1$, so we are interested in the ratio
$$\frac{\tilde{S}_{\sigma_\ell m\tilde{w}^1,0}}{\tilde{S}_{m\tilde{w}^1,0}} = \prod_{n=1}^{k-2} \frac{|\sin(\pi\ell n/k)|^{k-1-n}}{\sin(\pi n/k)^{k-1-n}} \; \prod_{n=1}^{k-1} \frac{|\sin(\pi\ell(n + m)/k)|}{\sin(\pi(n + m)/k)}. \tag{6.2b}$$
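The "simple observation" $\sin(\pi/k)\sin(\pi(m+1)/k) < \sin(2\pi/k)\sin(\pi m/k)$ used in the case $m = r/2$ can be made explicit. By the product-to-sum formula, the right side minus the left side equals $\frac{1}{2}[\cos(\pi(m-2)/k) - \cos(\pi m/k)]$, which is positive whenever $2 < m \le k$ because cosine is strictly decreasing on $[0, \pi]$. The Python lines below are a small numerical sketch of this; the range of $(m, k)$ scanned is an arbitrary choice made here, and the function names are illustrative.

```python
import math


def lhs(m, k):
    """sin(pi/k) * sin(pi*(m+1)/k)"""
    return math.sin(math.pi / k) * math.sin(math.pi * (m + 1) / k)


def rhs(m, k):
    """sin(2*pi/k) * sin(pi*m/k)"""
    return math.sin(2 * math.pi / k) * math.sin(math.pi * m / k)


def gap(m, k):
    """Closed form of rhs - lhs obtained from the product-to-sum formula."""
    return 0.5 * (math.cos(math.pi * (m - 2) / k) - math.cos(math.pi * m / k))


# Scan a modest, arbitrarily chosen range with 2 < m <= k/2.
checked = [(m, k) for k in range(6, 41) for m in range(3, k // 2 + 1)]
all_strict = all(lhs(m, k) < rhs(m, k) for m, k in checked)
```

For $m = 2$ the closed form reduces to $\frac{1}{2}[1 - \cos(2\pi/k)]$, so the comparison itself still holds there; the exceptional behaviour at $m = 2$ comes from the other factors of (6.2a), not from this inequality.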
The rest of the argument is as before: again $m = r/2$ causes minor problems.
Now consider any $\ell = (-1)^a + bk$. Applying (2.6b) to the Cartan generators $\lambda \in \{w^1, \ldots, w^r\}$ and using (2.4e), we find
$$\sigma_\ell \mu = C^a J^{b\,t(\mu+\rho)} \mu \tag{6.2b}$$
whenever $\sigma_\ell \in \mathrm{Gal}(L_{r,k}/\mathbb{Q})$. Applying (6.2b) to $\mu = w^1$ gives us part (2). □
Corollary 3. When both k > 2 and r ≠ 1, then $L_{r,k} = \mathbb{Q}_{rk}$ and
$$K_{r,k} = \begin{cases} \mathbb{Q}_{rk} & \text{if either } r \not\equiv 1 \text{ (mod 4) or } k \text{ is even}, \\ \mathbb{Q}_{rk}[\sqrt{\pm 2}] & \text{if } r \text{ is odd and } rk \equiv \pm 2 \text{ (mod 8)}. \end{cases}$$

The proof of the corollary is immediate from Thm. 8, by regarding Galois orbit sizes: when $r \not\equiv 1$ (mod 4), the Galois orbit of $w^1$ alone suffices, but when $r \equiv 1$ (mod 4) and $k$ is even, we have $\sigma_{1+rk/2} w^1 = w^1$, so also use $\sigma_{1+rk/2} 0 = J^{r/2} 0 \ne 0$, which is obtained from (6.2b). What we find in all cases is that for any $\ell \in (\mathbb{Z}/rk\mathbb{Z})^\times$, $\ell \ne 1$, either $\sigma_\ell w^1 \ne w^1$ or $\sigma_\ell 0 \ne 0$. This tells us $L_{r,k} = \mathbb{Q}_{rk}$, and $K_{r,k}$ is then obtained by adjoining the constant $c$ shown above.
Similar statements to Thm. 8 can be found for other weights. For example, by rank-level duality the identical result to Thm. 8(1) holds for any $mw^1$, $0 \le m \le \min\{r - 2, k - 2\}$, and we can expect similar results for other hooks.
When $r \equiv 1$ (mod 4) and $k$ is odd, $\mathbb{Q}_{4rk}$ is a degree 2 extension of $K_{r,k}$, which is in turn a degree 2 extension of $\mathbb{Q}_{rk}$.
The results corresponding to Corollary 3 for k = 1, 2 or r = 1 can be easily found, but are more complicated and hence less interesting. We include them here for completeness.
• $L_{r,1} = \mathbb{Q}_r$. $K_{r,1}$ will equal either $\mathbb{Q}_r$, $\mathbb{Q}_r[i]$, or $\mathbb{Q}_r[\sqrt{\pm 2}]$, depending on whether $r \equiv 0, 1$ (mod 4), or $r \equiv 3$ (mod 4), or $r \equiv \pm 2$ (mod 8), respectively.
• $L_{1,k} = \mathbb{Q}[\cos(2\pi/k)]$ if $k$ is odd, and $\mathbb{Q}[\cos(\pi/k)]$ if $k$ is even. $K_{1,k}$ will equal either $L_{1,k}$, or $L_{1,k}[\sqrt{2}\sin(2\pi/k)]$, or $L_{1,k}[\sqrt{2}]$, depending on whether $k \equiv 0, 2$, or $k \equiv 3$, or $k \equiv 1$ (mod 4), respectively.
• $L_{r,2} = \mathbb{Q}_r[\cos(2\pi/k)]$ if $r$ is odd, and $\mathbb{Q}_{rk}$ if $r$ is even. $K_{r,2}$ will equal $L_{r,2}$, unless $r \equiv 3$ (mod 4) when $K_{r,2} = \mathbb{Q}_{rk}$.

Acknowledgements. T.G. thanks A. Coste for showing him Questions 1 and 3, and C. Cummins for discussions. M.W. thanks the High Energy Physics group of DAMTP for hospitality, and W. Eholzer for reading the manuscript.
References
1. Aharony, O.: Generalized fusion potentials. Phys. Lett. B306, 276–282 (1993)
2. Altschuler, D., Bauer, M., and Itzykson, C.: The branching rules of conformal embeddings. Commun. Math. Phys. 132, 349–364 (1990)
3. Bourbaki, N.: Groupes et Algèbres de Lie. Chapitres IV-VI, Paris: Hermann, 1968
4. Buffenoir, E., Coste, A., Lascoux, J., Buhot, A., and Degiovanni, P.: Precise study of some number fields and Galois actions occurring in conformal field theory. Annales de l’I.H.P.: Phys. Théor. 63, 41–79 (1995)
5. Coste, A. and Gannon, T.: Remarks on Galois symmetry in rational conformal field theories. Phys. Lett. B323, 316–321 (1994)
6. Di Francesco, P. and Zuber, J.-B.: Fusion Potentials I. J. Phys. A26, 1441–1454 (1993)
7. Fuchs, J., Schellekens, B., and Schweigert, C.: From Dynkin diagram symmetries to fixed point structures. Commun. Math. Phys. 180, 39–97 (1996)
8. Fulton, W. and Harris, J.: Representation Theory: A First Course. New York: Springer-Verlag, 1991
9. Gannon, T.: Symmetries of the Kac–Peterson modular matrices of affine algebras. Invent. Math. 122, 341–357 (1995)
10. Gannon, T.: Kac–Peterson, Perron–Frobenius, and the classification of conformal field theories. e-print q-alg/9510026 (1995)
11. Gannon, T., Ruelle, Ph., and Walton, M.A.: Automorphism modular invariants of current algebras. Commun. Math. Phys. 179, 121–156 (1996)
12. Georgieu, G. and Mathieu, O.: Catégorie de fusion pour les groupes de Chevalley. C. R. Acad. Sci. Paris 315, 659–662 (1992)
13. Gepner, D.: Fusion rings and geometry. Commun. Math. Phys. 141, 381–411 (1991)
14. Goodman, F. and Nakanishi, T.: Fusion algebras in integrable systems in two dimensions. Phys. Lett. B262, 259–264 (1991)
15. Goodman, F. and Wenzl, H.: Littlewood–Richardson coefficients for Hecke algebras at roots of unity. Adv. Math. 82, 244–265 (1990)
16. Kac, V. and Peterson, D.: Infinite-dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–264 (1984)
17. Kreuzer, M. and Schellekens, A.N.: Simple currents versus orbifolds with discrete torsion – a complete classification. Nucl. Phys. B411, 97–121 (1994)
18. Moody, R.V. and Patera, J.: Characters of elements of finite order in Lie groups. SIAM J. Alg. Disc. Meth. 5, 359–383 (1984)
19. Pasquier, V. and Saleur, H.: Common structures between finite systems and conformal field theories through quantum groups. Nucl. Phys. B330, 523–526 (1990)
20. Pianzola, A.: The arithmetic of the representation ring and elements of finite order in Lie groups. J. Algebra 108, 1–33 (1987)
21. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B300, 360–376 (1988)
22. Witten, E.: The Verlinde algebra and the cohomology of the Grassmannian. In: Geometry, Topology and Physics. Conf. Proc. and Lecture Notes in Geom. Topol. Vol. VI, 1995, pp. 357–422

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 23 – 32 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Master Partitions for Large N Matrix Field Theories Matthias Staudacher?,?? Albert-Einstein-Institut, Max-Planck-Institut für Gravitationsphysik, Schlaatzweg 1, D-14473 Potsdam, Germany. E-mail:
[email protected] Received: 30 October 1998 / Accepted: 7 March 1999
Abstract: We introduce a systematic approach for treating the large N limit of matrix field theories.
1. Introduction

It has been known for thirty years that quantum field theory simplifies enormously if the number $N$ of internal field components tends to infinity. In the case where the $N$ components form a vector this leads to exact solutions in any dimension of spacetime. For physical applications, ranging from solid state physics to gauge theories and quantum gravity, a different situation is much more pertinent: The case of $N^2$ internal components that form a matrix. Here exact solutions have only been produced for very low dimensionalities. It is one of the outstanding problems of theoretical physics to extend large $N$ technology to physically interesting dimensions.
In the present article we will be concerned with matrix “spin systems”, that is $D$-dimensional Euclidean lattice field theories whose internal degrees of freedom are hermitian, complex or unitary $N \times N$ matrices. The idea is to treat the problem by a three step procedure:
(1) Eguchi–Kawai reduction: Replace the $N = \infty$ field theory by a one-matrix model coupled to appropriate constant external field matrices.
(2) Character expansion: Express the partition function of the one-matrix model of (1) as a sum over polynomial representations – labelled by Young diagrams – of U(N).
(3) Saddle point analysis: Find an effective Young diagram that dominates the partition sum of (2) in the large $N$ limit.

? Supported in part by EU Contract FMRX-CT96-0012.
?? Current institute address: Am Mühlenberg, Haus 5, D-14476 Golm, Germany.
The insight that step (1) is possible is due to Eguchi and Kawai [1]. Intuitively it says that, if a saddle point configuration exists at $N = \infty$, it should be given by a single translationally invariant matrix (the so-called master field). In practice the reduction is rather subtle, and we will be using the twisted EK reduction [2] which results in a one-matrix model in external constant fields encoding the original (discrete) space-time.
Step (2) is novel in this context and is the main focus of the present work. The one-matrix model of (1) still has $N^2$ degrees of freedom, and it is well known that a saddle point for matrix models can only be found once the degrees of freedom are reduced as $N^2 \to N$. The external fields encoding space-time prevent any naive reduction to the $N$ eigenvalues of the matrix, which is the route of choice for simpler models without external fields. But it is possible to replace the matrix integral by a sum over partitions corresponding to a sum over all polynomial representations of U(N). The crucial point is then that one ends up with a kind of one-dimensional spin model in Young diagram space with only $N$ variables: the possible lengths of the $N$ rows of the diagram.
Step (3) might appear to be an exotic idea: we claim that the $N = \infty$ “master field” can be described by a “master partition”. However, it has already been recently demonstrated in a series of papers [3,4] that certain infinite sums over partitions are dominated by a saddle point configuration. This led to the solution of matrix models in external fields not treatable by any other method. The present models are more complicated, but not fundamentally different. The character expansions we find lead to a very interesting and apparently novel combinatorial problem in Young pattern space (see Sect. 4). More insight into this problem will be needed in order to proceed with the final step (3) of our program, the saddle point analysis.
We introduce what we call “lattice polynomials” $\Xi_h, \Upsilon_h$ which are polynomials in $\frac{1}{N}$. They depend on the Young diagram $h$ and the precise nature of the space-time lattice. It might be objected that the present approach is futile unless one can demonstrate that the lattice polynomials $\Xi_h, \Upsilon_h$ can be explicitly computed or at least bootstrapped at $N = \infty$. But there is one important argument against this pessimistic assessment: The lattice polynomials $\Xi_h, \Upsilon_h$ only depend on the nature of the lattice but not on the local measure of the minimally coupled (matrix) spins of the model$^1$. Therefore, solving interacting field theory in our language is of the same degree of complexity as solving the free field case.
Finally we should mention that our program is very general since it applies in principle to any large $N$ matrix spin system. It would be interesting to extend the method to matrix field theories with a gauge symmetry such as Yang-Mills theory. Indeed the EK reduction was initially designed for lattice gauge theory [1]. Recently it was demonstrated by Monte Carlo methods that even the path integral of continuum gauge theory may be EK reduced to a convergent ordinary multiple matrix integral [5]. A rigorous mathematical proof, as well as an investigation on whether the reduced model reinduces the field theory as $N \to \infty$, are still lacking. At any rate, reducing a $D$-dimensional gauge theory, one so far ends up with a nonlinearly coupled $D$-matrix model, which is not yet tractable by the present machinery unless it is understood how to perform a further reduction $DN^2 \to N^2$.

1 Except for the global symmetry of the matrix spins. In this paper we develop the theory in parallel for the case of U(N) global symmetry (hermitian matrices) and U(N) × U(N) symmetry (complex matrices). The other classical groups could presumably be treated as well, but it is well known that they do not lead to different large $N$ limits.
Master Partitions for Large N Matrix Field Theories
2. Reduced Matrix Spin Systems

Consider a spin model on a periodic lattice. In order to be specific we will sketch the method for a two-dimensional lattice, but higher (or lower) dimensions can be treated as well. We will not dwell on details since they are well explained elsewhere. The variables are N × N hermitian matrices M(x) defined on the lattice sites x:

$$Z_H = \int \prod_x \mathcal{D}M(x)\; e^{-S_H},$$
$$S_H = N\,\mathrm{Tr} \sum_x \Big[ \frac{1}{2} M(x)^2 + V\big(M(x)\big) - \frac{\beta}{2} \sum_{\mu=1,2} \big[ M(x)M(x+\hat\mu) + M(x)M(x-\hat\mu) \big] \Big], \tag{1}$$
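where $\hat\mu$ denotes the unit vector in the µ-direction. It is equally natural to consider general complex matrices $\Phi(x) \in GL(N, \mathbb{C})$, in which case

$$Z_{GL} = \int \prod_x \mathcal{D}\Phi(x)\; e^{-S_{GL}},$$
$$S_{GL} = N\,\mathrm{Tr} \sum_x \Big[ \Phi(x)\Phi^\dagger(x) + V\big(\Phi(x)\Phi^\dagger(x)\big) - \beta \sum_{\mu=1,2} \big[ \Phi(x)\Phi^\dagger(x+\hat\mu) + \Phi(x)\Phi^\dagger(x-\hat\mu) \big] \Big]. \tag{2}$$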
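If V = 0 in Eqs. (1), (2) the model is free. The integration measures in Eqs. (1), (2) are the flat measures for hermitian and complex matrices:

$$\mathcal{D}M = \prod_{i=1}^{N} \frac{dM_{ii}}{\sqrt{2\pi N^{-1}}} \prod_{i<j} \frac{d\mathrm{Re}M_{ij}\, d\mathrm{Im}M_{ij}}{\pi N^{-1}}, \qquad \mathcal{D}\Phi = \prod_{i,j=1}^{N} \frac{d\mathrm{Re}\Phi_{ij}\, d\mathrm{Im}\Phi_{ij}}{\pi N^{-1}}. \tag{3}$$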
A third, very important type of spin model is the so-called chiral field, which looks like the free complex model Eq. (2),

$$Z_U = \int \prod_x \mathcal{D}U(x)\; e^{-S_U}, \qquad S_U = -\beta N\,\mathrm{Tr} \sum_x \sum_{\mu=1,2} \big[ U(x)U^\dagger(x+\hat\mu) + U(x)U^\dagger(x-\hat\mu) \big], \tag{4}$$
but the matrices U (x) ∈ U(N) are unitary. In this case the measure DU (x) is the Haar measure on the group. The model is therefore not free.
M. Staudacher
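The Eguchi–Kawai reduction [1,2] states that the above lattice models can be replaced at N = ∞ by, respectively, the following one-matrix models coupled to constant external field matrices P and Q:

$$Z_H = \int \mathcal{D}M \exp N\,\mathrm{Tr}\Big[ -\frac{1}{2} M^2 - V(M) + \beta\big( MPMP^\dagger + MQMQ^\dagger \big) \Big], \tag{5}$$
$$Z_{GL} = \int \mathcal{D}\Phi \exp N\,\mathrm{Tr}\Big[ -\Phi\Phi^\dagger - V(\Phi\Phi^\dagger) \Big] \times \exp \beta N\,\mathrm{Tr}\Big[ \Phi P\Phi^\dagger P^\dagger + \Phi P^\dagger\Phi^\dagger P + \Phi Q\Phi^\dagger Q^\dagger + \Phi Q^\dagger\Phi^\dagger Q \Big], \tag{6}$$
$$Z_U = \int \mathcal{D}U \exp \beta N\,\mathrm{Tr}\Big[ UPU^\dagger P^\dagger + UP^\dagger U^\dagger P + UQU^\dagger Q^\dagger + UQ^\dagger U^\dagger Q \Big]. \tag{7}$$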
Here $P = P_N$ and $Q = Q_N$ are the famous N × N unitary “shift and clock” matrices

$$P_N = \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ 1 & & & & 0 \end{pmatrix}, \qquad Q_N = \begin{pmatrix} 1 & & & & \\ & \omega_N & & & \\ & & \ddots & & \\ & & & \omega_N^{N-2} & \\ & & & & \omega_N^{N-1} \end{pmatrix}, \tag{8}$$

where $\omega_N = \exp\frac{2\pi i}{N}$ and $P_N Q_N = \omega_N Q_N P_N$. To be more precise, the free energies as well as appropriate correlation functions (see [2]) are identical to leading order in $\frac{1}{N}$ in the lattice field theory and the corresponding one-matrix model. The thermodynamic limit, that is a lattice of infinite extent, is approached when N → ∞. We see that the structure of the lattice has been “hidden” in index space! It is natural to generalize the situation to a toroidal K × L lattice:

$$P = P_K \otimes 1_{\frac{N}{K}}, \qquad Q = Q_L \otimes 1_{\frac{N}{L}}, \tag{9}$$
where N is chosen to be divisible by K and L. This allows one to take the thermodynamic limit and the large N limit independently. If we put L = 1 (we can then equivalently omit Q altogether) the target space becomes a closed one-dimensional chain. We suspect that matrix models on arbitrary discrete target spaces can be EK reduced by appropriate external matrices, but this has not been worked out yet.

3. Character Expansions

Now we turn to step (2) and rewrite the reduced hermitian, complex and unitary matrix integrals Eqs. (5), (6), (7) as sums over representations of U(N). To this end introduce the following source integrals:

$$Z_H[J] = \int \mathcal{D}M \exp N\,\mathrm{Tr}\Big[ -\frac{1}{2} M^2 - V(M) + JM \Big], \tag{10}$$
$$Z_{GL}[J\bar J] = \int \mathcal{D}\Phi \exp N\,\mathrm{Tr}\Big[ -\Phi\Phi^\dagger - V(\Phi\Phi^\dagger) + J\Phi + \Phi^\dagger \bar J \Big], \tag{11}$$
$$Z_U[J\bar J] = \int \mathcal{D}U \exp N\,\mathrm{Tr}\Big[ JU + U^\dagger \bar J \Big]. \tag{12}$$
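The two different ways of introducing a source are due to the U(N) symmetry of hermitian matrices on the one hand and the U(N) × U(N) symmetry of complex (and complex unitary) matrices on the other. The reduced models are easily obtained from the source integrals by applying an operator:

$$Z_H = \exp\Big( \frac{\beta}{N}\,\mathrm{Tr}\big[ \partial P \partial P^\dagger + \partial Q \partial Q^\dagger \big] \Big) \cdot Z_H[J]\,\Big|_{J=0}, \tag{13}$$
$$Z_{GL,U} = \exp\Big( \frac{\beta}{N}\,\mathrm{Tr}\big[ \partial P \bar\partial P^\dagger + \partial P^\dagger \bar\partial P + \partial Q \bar\partial Q^\dagger + \partial Q^\dagger \bar\partial Q \big] \Big) \cdot Z_{GL,U}[J\bar J]\,\Big|_{J=\bar J=0}. \tag{14}$$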
Here $\partial$, $\bar\partial$ denote N × N matrix differential operators whose matrix elements are $\partial_{ji} = \frac{\partial}{\partial J_{ij}}$ and $\bar\partial_{ji} = \frac{\partial}{\partial \bar J_{ij}}$. It is clear that the source integrals are class functions of, respectively, $J$ and $J\bar J$. Therefore they can be expressed as character expansions, with known (see [3,4,6]) expansion coefficients. If V = 0, they read for the hermitian and complex source integrals, respectively,

$$Z_H[J] = \exp\Big( \frac{1}{2} N\,\mathrm{Tr}\, J^2 \Big) = \sum_h \chi_h(A_2)\, \chi_h(J), \tag{15}$$
$$Z_{GL}[J\bar J] = \exp\big( N\,\mathrm{Tr}\, J\bar J \big) = \sum_h \chi_h(A_1)\, \chi_h(J\bar J), \tag{16}$$

while for the unitary source integral one has [6]

$$Z_U[J\bar J] = \sum_h \frac{\chi_h(A_1)\,\chi_h(A_1)}{\chi_h(1)}\, \chi_h(J\bar J). \tag{17}$$
Here the sum runs over all partitions h labeled by the shifted weights $h_i = N - i + m_i$, where $m_i \ge 0$, i = 1, ..., N, is the number of boxes in the i-th row of the Young pattern associated to h. $\chi_h(J)$ is the Schur function, dependent on J, on the diagram h. It is identical to the Weyl character of the matrix J corresponding to the representation labeled by h. $A_1$ and $A_2$ are defined through $\mathrm{Tr}\,A_1^k = N(\delta_{k,0} + \delta_{k,1})$ and $\mathrm{Tr}\,A_2^k = N(\delta_{k,0} + \delta_{k,2})$, and $\chi_h(1)$ is the dimension of the representation. For more details on the notation, and for explicit formulas for the characters $\chi_h(A_1)$, $\chi_h(A_2)$ and $\chi_h(1)$ see [3,4]. For a nonzero potential V, the hermitian and complex character expansions become a bit more complicated, but are still available:

$$Z_H[J] = \sum_h \Theta_h\, \chi_h(J), \tag{18}$$
$$Z_{GL}[J\bar J] = \sum_h \Omega_h\, \chi_h(J\bar J), \tag{19}$$
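where $\Theta_h$ is given by

$$\Theta_h = \frac{\chi_h(A_1)}{\chi_h(1)} \int \mathcal{D}M \exp N\,\mathrm{Tr}\Big[ -\frac{1}{2} M^2 - V(M) \Big]\, \chi_h(M), \tag{20}$$

and $\Omega_h$ by

$$\Omega_h = \Big( \frac{\chi_h(A_1)}{\chi_h(1)} \Big)^2 \int \mathcal{D}\Phi \exp N\,\mathrm{Tr}\Big[ -\Phi\Phi^\dagger - V(\Phi\Phi^\dagger) \Big]\, \chi_h(\Phi\Phi^\dagger). \tag{21}$$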
The integrals appearing in Eqs. (20), (21) are ordinary one-matrix integrals which may be computed rather explicitly as N × N determinants. Their analysis in the N → ∞ limit proceeds by employing standard techniques, supplemented by the methods of [3]. Now we apply the operators in Eqs. (13), (14) in order to generate the space-time lattice; this results in character expansions for the reduced matrix field theories. In the hermitian case one has (here $|h| = \sum_i m_i$ is the number of boxes in the Young diagram)

$$Z_H = \sum_h \chi_h(A_2)\, \Xi_h\, \beta^{\frac{|h|}{2}} \quad \text{for } V = 0, \tag{22}$$
$$Z_H = \sum_h \Theta_h\, \Xi_h\, \beta^{\frac{|h|}{2}} \quad \text{for } V \neq 0, \tag{23}$$
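with

$$\Xi_h = \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ \partial P \partial P^\dagger + \partial Q \partial Q^\dagger \big] \Big) \cdot \chi_h(J)\,\Big|_{J=0}. \tag{24}$$

The free complex, interacting complex, and the unitary case become

$$Z_{GL} = \sum_h \chi_h(A_1)\, \Upsilon_h\, \beta^{|h|} \quad \text{for } V = 0, \tag{25}$$
$$Z_{GL} = \sum_h \Omega_h\, \Upsilon_h\, \beta^{|h|} \quad \text{for } V \neq 0, \tag{26}$$
$$Z_U = \sum_h \frac{\chi_h(A_1)\,\chi_h(A_1)}{\chi_h(1)}\, \Upsilon_h\, \beta^{|h|}, \tag{27}$$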
with

$$\Upsilon_h = \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ \partial P \bar\partial P^\dagger + \partial P^\dagger \bar\partial P + \partial Q \bar\partial Q^\dagger + \partial Q^\dagger \bar\partial Q \big] \Big) \cdot \chi_h(J\bar J)\,\Big|_{J=\bar J=0}. \tag{28}$$

The character expansions Eqs. (22), (23), (25), (26), (27) are at the heart of our proposal. It is seen that they neatly separate the nature of the local spin weight ($\chi_h(A_2)$, $\Theta_h$, $\chi_h(A_1)$, $\Omega_h$, $(\chi_h(A_1))^2 (\chi_h(1))^{-1}$) and the nature of the embedding space ($\Xi_h$, $\Upsilon_h$). As a striking example, note that from the point of view of our character expansion method the difference between the free Gaussian model on a toroidal lattice Eq. (25) and the non-trivial chiral model Eq. (27) is a simple, explicitly known factor

$$\frac{\chi_h(A_1)}{\chi_h(1)} = N^{|h|} \prod_{i=1}^{N} \frac{(N-i)!}{h_i!}.$$
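As an illustration of how explicit this factor is, the following Python sketch evaluates $N^{|h|} \prod_i (N-i)!/h_i!$ exactly with rational arithmetic. It is not part of the original derivation: the function names are ours, and the input is assumed to be a list of weakly decreasing row lengths $m_i$.

```python
from fractions import Fraction
from math import factorial


def shifted_weights(rows, n):
    """Return h_i = n - i + m_i for i = 1..n, padding the row lengths m_i with zeros."""
    if len(rows) > n:
        raise ValueError("a U(n) diagram has at most n rows")
    padded = list(rows) + [0] * (n - len(rows))
    return [n - i + m for i, m in enumerate(padded, start=1)]


def chiral_over_gaussian(rows, n):
    """Evaluate n^{|h|} * prod_i (n - i)! / h_i! exactly as a rational number."""
    ratio = Fraction(n) ** sum(rows)
    for i, h_i in enumerate(shifted_weights(rows, n), start=1):
        ratio *= Fraction(factorial(n - i), factorial(h_i))
    return ratio


if __name__ == "__main__":
    for rows in ([], [1], [2], [1, 1], [2, 1]):
        print(rows, chiral_over_gaussian(rows, 5))
```

For the one-row diagram with two boxes the factor reduces to N/(N+1), and for the one-column diagram with two boxes to N/(N-1), which can be checked by hand from $\mathrm{Tr}\,A_1^k = N(\delta_{k,0} + \delta_{k,1})$ and the dimension formula.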
The character expansions involve sums over N variables only and we can write down a saddle point equation for the effective density of the master partition. In order to complete the program, we need a second bootstrap equation for the novel quantities $\Xi_h$ and $\Upsilon_h$, which contain the connectivity information of the lattice.
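The connectivity information enters only through the shift and clock matrices of Eq. (8). As a small numerical aside, the sketch below builds them in plain Python and checks the relation $P_N Q_N = \omega_N Q_N P_N$ together with the trace identity $\mathrm{Tr}(P^k Q^l) = N\delta_{k,0}\delta_{l,0}$ for $0 \le k, l < N$. The helper names and the tolerance are our own choices, and the check is a floating-point illustration, not a proof.

```python
import cmath


def shift_matrix(n):
    """P_n: ones on the superdiagonal and in the lower-left corner (cyclic shift)."""
    return [[1 if (j - i) % n == 1 % n else 0 for j in range(n)] for i in range(n)]


def clock_matrix(n):
    """Q_n = diag(1, w, ..., w^(n-1)) with w = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** i if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matpow(a, k):
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def check_shift_clock(n, tol=1e-9):
    """Return (commutation_ok, traces_ok) for the n x n shift and clock matrices."""
    p, q = shift_matrix(n), clock_matrix(n)
    omega = cmath.exp(2j * cmath.pi / n)
    pq, qp = matmul(p, q), matmul(q, p)
    commutation_ok = all(
        abs(pq[i][j] - omega * qp[i][j]) < tol for i in range(n) for j in range(n)
    )
    traces_ok = all(
        abs(trace(matmul(matpow(p, k), matpow(q, l))) - (n if k == 0 and l == 0 else 0)) < tol
        for k in range(n)
        for l in range(n)
    )
    return commutation_ok, traces_ok


if __name__ == "__main__":
    print(check_shift_clock(5))
```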
4. Lattice Polynomials

Inspection of the quantities $\Xi_h$ and $\Upsilon_h$ in Eqs. (24), (28) shows that they are polynomials in the variable $\frac{1}{N}$ of degree not higher than, respectively, $\frac{1}{2}|h| - 1$ and $|h| - 2$. They are zero if the number |h| of boxes in the Young pattern is odd. Conjugating the diagram gives the same polynomial except for the replacement $\frac{1}{N} \to -\frac{1}{N}$. The first few can be computed by brute force calculation directly from the definitions Eqs. (24), (28), see Table 1.

Table 1. The first few D = 2 lattice polynomials

h         $\Xi_h$                  $\Upsilon_h$
$2$       $2$                      $2$
$1^2$     $2$                      $2$
$4$       $3 + 12\frac{1}{N}$      $3 + 24\frac{1}{N} + 54\frac{1}{N^2}$
$31$      $5 + 4\frac{1}{N}$       $5 + 8\frac{1}{N} + 18\frac{1}{N^2}$
$2^2$     $6$                      $6$
$21^2$    $5 - 4\frac{1}{N}$       $5 - 8\frac{1}{N} + 18\frac{1}{N^2}$
$1^4$     $3 - 12\frac{1}{N}$      $3 - 24\frac{1}{N} + 54\frac{1}{N^2}$
Here we used $\mathrm{Tr}(P^k Q^l) = N\delta_{k,0}\delta_{l,0}$, which is true as long as |k| < N, |l| < N. We also replaced $\omega_N \to 1$, $\omega_N^* \to 1$ (remember $\omega_N = \exp\frac{2\pi i}{N}$): in other words, we assumed P and Q to commute at large N. Both assumptions are innocent at least in the strong coupling (small β) phase. If the model possesses a weak coupling phase (like e.g. the chiral field Eq. (7)), these assumptions may have to be reconsidered, if we want the character expansion to describe this second phase as well. This is because in the present approach we expect large N phase transitions to correspond to the situation where the number of rows of the master partition is of O(N) (“touching transition”). Note that we cannot drop the other terms of $O(\frac{1}{N})$ in $\Xi_h$, $\Upsilon_h$ since the character expansions are for the partition function and not for the free energy.

The direct calculation of the lattice polynomials quickly gets very tedious. The combinatorics involved seems to be of a novel type. While we have not yet found an efficient calculational scheme or recursive method, let us give some interesting representations for $\Xi_h$ and $\Upsilon_h$ that may prove useful later. Introduce the following Gaussian measure on the space of M × N (M ≤ N) complex matrices $\Lambda$:

$$[\mathcal{D}\Lambda] = \prod_{i=1}^{M} \prod_{j=1}^{N} \frac{d\mathrm{Re}\Lambda_{ij}\, d\mathrm{Im}\Lambda_{ij}}{\pi N^{-1}} \exp N\,\mathrm{Tr}\Big[ -\Lambda\Lambda^\dagger \Big]. \tag{29}$$

This measure is invariant under U(M) × U(N). It is then fairly easy to prove (cf. [4]) the following representation for the character of the source:

$$\chi_h(J) = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda] \exp N\,\mathrm{Tr}\Big[ U \Lambda J \Lambda^\dagger \Big], \tag{30}$$
where U ∈ U(M) is unitary and $\mathcal{D}U$ is the Haar measure on U(M). This formula is valid for diagrams h with at most M rows. Therefore $\Xi_h$ becomes, cf. Eq. (24),

$$\Xi_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda] \exp N\,\mathrm{Tr}\Big[ \Lambda^\dagger U \Lambda P \Lambda^\dagger U \Lambda P^\dagger + \Lambda^\dagger U \Lambda Q \Lambda^\dagger U \Lambda Q^\dagger \Big]. \tag{31}$$

After a Hubbard-Stratonovich transformation decoupling the quartic terms by Gaussian M × M complex matrices S and T (with measure as in Eq. (29) with N → M), and integration over $\Lambda$, we obtain the representation

$$\Xi_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}S][\mathcal{D}T]\, \exp \mathrm{Tr}_{M \otimes N} \sum_{k=1}^{\infty} \frac{1}{k} \Big[ SU \otimes P + S^\dagger U \otimes P^\dagger + TU \otimes Q + T^\dagger U \otimes Q^\dagger \Big]^k. \tag{32}$$

The combinatorial interpretation of the exponential in Eq. (32) is the following: we have a generating function for a non-commutative random walk on a two-dimensional lattice with variable U. The representation is useful for getting some exact results on the $\Xi_h$, but we have not yet been able to compute the integral Eq. (32) exactly except for M = 1 (characters with just one row). E.g. we can find a generating function (with $z_i$ being the eigenvalues of U) for the large N limit of $\Xi_h$,

$$\prod_{i,j}^{M} \frac{1}{(1 - z_i z_j)^2} = \sum_h \Xi_h^{N=\infty}\, \chi_h(z), \tag{33}$$

giving the constant terms of the lattice polynomials. This is however not sufficient for the large N limit of the field theory, as already mentioned. A curious feature of Eq. (32) is that we can take N → ∞ while keeping M in the range $1 \ll M \ll N$. That is, it should be possible to find a saddle point for the situation where the row lengths are large compared to the number of rows, corresponding to the extreme strong coupling limit. Furthermore, it should be investigated whether the M × M matrices can be taken to commute as N → ∞.

Similar, if slightly more complicated, representations are possible for $\Upsilon_h$; here the starting point is the expression

$$\chi_h(J\bar J) = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda_1][\mathcal{D}\Lambda_2] \exp N\,\mathrm{Tr}\Big[ U^{\frac{1}{2}} \Lambda_1 J \Lambda_2^\dagger + \Lambda_2 \bar J \Lambda_1^\dagger U^{\frac{1}{2}} \Big], \tag{34}$$

which means the lattice polynomials become

$$\Upsilon_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}\Lambda_1][\mathcal{D}\Lambda_2] \times \exp N\,\mathrm{Tr}\Big[ \Lambda_2^\dagger U^{\frac{1}{2}} \Lambda_1 P \Lambda_1^\dagger U^{\frac{1}{2}} \Lambda_2 P^\dagger + \Lambda_2^\dagger U^{\frac{1}{2}} \Lambda_1 P^\dagger \Lambda_1^\dagger U^{\frac{1}{2}} \Lambda_2 P \Big] \times \exp N\,\mathrm{Tr}\Big[ \Lambda_2^\dagger U^{\frac{1}{2}} \Lambda_1 Q \Lambda_1^\dagger U^{\frac{1}{2}} \Lambda_2 Q^\dagger + \Lambda_2^\dagger U^{\frac{1}{2}} \Lambda_1 Q^\dagger \Lambda_1^\dagger U^{\frac{1}{2}} \Lambda_2 Q \Big], \tag{35}$$

and the non-commutative random walk representation is

$$\Upsilon_h = \int \mathcal{D}U\, \chi_h(U^\dagger) \int [\mathcal{D}S][\mathcal{D}\bar S][\mathcal{D}T][\mathcal{D}\bar T] \times \exp \mathrm{Tr}_{M \otimes N} \sum_{k=1}^{\infty} \frac{1}{k} \Big[ S U^{\frac{1}{2}} \otimes P + \bar S U^{\frac{1}{2}} \otimes P^\dagger + T U^{\frac{1}{2}} \otimes Q + \bar T U^{\frac{1}{2}} \otimes Q^\dagger \Big]^k \times \exp \mathrm{Tr}_{M \otimes N} \sum_{k=1}^{\infty} \frac{1}{k} \Big[ \bar S^\dagger U^{\frac{1}{2}} \otimes P + S^\dagger U^{\frac{1}{2}} \otimes P^\dagger + \bar T^\dagger U^{\frac{1}{2}} \otimes Q + T^\dagger U^{\frac{1}{2}} \otimes Q^\dagger \Big]^k, \tag{36}$$

from which we find that $\Upsilon_h^{N=\infty} = \Xi_h^{N=\infty}$, cf. Eq. (33), but the $\frac{1}{N}$ corrections are different (see Table 1). Again, for arbitrary one-row representations (M = 1) it is possible to obtain $\Upsilon_h$ rather explicitly.

Another potentially useful representation$^2$ of the lattice polynomials is given by the following dual equations: Eq. (24) becomes

$$\Xi_h = \chi_h(\partial) \cdot \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ JPJP^\dagger + JQJQ^\dagger \big] \Big)\,\Big|_{J=0}, \tag{37}$$

and Eq. (28) is dual to

$$\Upsilon_h = \chi_h(\partial\bar\partial) \cdot \exp\Big( \frac{1}{N}\,\mathrm{Tr}\big[ JP\bar J P^\dagger + JP^\dagger \bar J P + JQ\bar J Q^\dagger + JQ^\dagger \bar J Q \big] \Big)\,\Big|_{J=\bar J=0}. \tag{38}$$

We could go on and discuss correlation functions, which are naturally included in the present formalism. In particular, it is straightforward to give expressions for their character expansions in terms of modified lattice polynomials, and it remains true that the combinatorics is independent of whether the reduced field theory is free or interacting. This is however beyond the scope of the present article. While it is unclear whether the D ≥ 2 lattice polynomials can be computed exactly for a general partition, it should be stressed once more that this is unnecessary; all we need is an indirect method in order to extract the large N behavior.

5. Conclusions

This solution to the problem of the large N limit of (non-gauge) matrix field theories is not yet complete, since the structure of the lattice polynomials we introduced still needs to be further analyzed in order to be able to write the full set of saddle point equations. However we feel that we are definitely closing in on the large N problem, and that we have brought it into the simplest form to date. The proposed approach is concrete, systematic and rather general: we demonstrated that the reduction from $N^2$ to N variables is possible once one changes variables from matrices to partitions. In this language the master field becomes a master partition.
Presumably one should first (re)derive in the current framework the exact solutions for some lower dimensional target spaces before dealing with the two (and higher) dimensional field theories.

Acknowledgements. We thank Daya-Nand Verma and Brian G. Wybourne for interesting and useful discussions concerning the combinatorial aspects of this project. This work was supported in part by the EU under Contract FMRX-CT96-0012.

2 We thank D.-N. Verma for pointing this out to us.
References

1. Eguchi, T. and Kawai, H.: Reduction of dynamical degrees of freedom in the large N gauge theory. Phys. Rev. Lett. 48, 1063 (1982)
2. Eguchi, T. and Nakayama, R.: Simplification of Quenching Procedure for Large N Spin Models. Phys. Lett. B122, 59 (1982)
3. Kazakov, V.A., Staudacher, M. and Wynter, T.: Character Expansion Methods for Matrix Models of Dually Weighted Graphs. hep-th/9502132, Commun. Math. Phys. 177, 451 (1996); Almost Flat Planar Diagrams. hep-th/9506174, Commun. Math. Phys. 179, 235 (1996); Exact Solution of Discrete 2D $R^2$ Gravity. hep-th/9601069, Nucl. Phys. B471, 309 (1996)
4. Kostov, I. and Staudacher, M.: Two-Dimensional Chiral Matrix Models and String Theories. hep-th/9611011, Phys. Lett. B394, 75 (1997); Kostov, I., Staudacher, M. and Wynter, T.: Complex Matrix Models and Statistics of Branched Coverings of 2D Surfaces. hep-th/9703189, Commun. Math. Phys. 191, 283 (1998)
5. Krauth, W. and Staudacher, M.: Finite Yang-Mills Integrals. AEI-065, hep-th/9804199, accepted for publication in Phys. Lett. B.
6. Bars, I.: U(N) Integral for the Generating Functional in Lattice Gauge Theory. J. Math. Phys. 21, 2678 (1980)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 33 –55 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Statistics of Return Times: A General Framework and New Applications

Masaki Hirata 1, Benoît Saussol 2, Sandro Vaienti 2

1 Mathematical Department, Tokyo Metropolitan University, Japan. E-mail: [email protected]
2 Centre de Physique Théorique, Luminy, Marseille and PHYMAT, Mathematical Department, University of Toulon, France. E-mail: [email protected]; [email protected]

Received: 4 August 1998 / Accepted: 9 March 1999
Abstract: In this paper we provide general estimates for the errors between the distribution of the first, and more generally, the K-th return time (suitably rescaled) and the Poisson law for measurable dynamical systems. In the case that the system exhibits strong mixing properties, these bounds are explicitly expressed in terms of the speed of mixing. Using these approximations, the Poisson law is finally proved to hold for a large class of non-hyperbolic systems on the interval.

1. Introduction

The investigation of asymptotically rare events is emerging as a new direction in the understanding of statistical properties of dynamical systems. By “asymptotically rare” events we mean, in a wide sense and following the terminology in the review paper [Coe97], those events which have asymptotically zero probability but which occur with a well determined asymptotic limit law. In the dynamical setting, where we have a probability space (X, B, µ) with a measurable µ-preserving mapping T acting on it, the “events” will usually be the visits into a sequence of sets $\Omega_k \in B$ of positive measure but with their measure going to zero in the limit of large k. We call the event “rare” when the expected entrance time in $\Omega_k$ diverges with k. A well-known result in ergodic theory shows how abundant the “asymptotically rare” events are. Let us consider in fact an ergodic measure µ for an endomorphism T and take a measurable subset $\Omega$: then Kac's theorem [CFS82] says that the expectation of the return time to $\Omega$, starting from $\Omega$, is just $\mu(\Omega)^{-1}$. Kac's theorem suggests the good normalization to keep in order to study the asymptotic distribution of the return time to $\Omega$. The natural object will thus be the distribution:

$$F_\Omega(t) = \mu_\Omega\big( x \in \Omega \mid \tau_\Omega(x)\mu(\Omega) > t \big), \tag{1}$$

where $\tau_\Omega(x)$ is the first return time to $\Omega$ provided that $x \in \Omega$ and $\mu_\Omega$ is the normalized restriction of µ to $\Omega$. The question will be whether the limit of $F_\Omega(t)$ exists when the
measure goes to zero and what kind of distribution is recovered. The condition that the starting point x in (1) belongs to $\Omega$ could be relaxed by asking that x belongs to the whole space. In this case, $F_\Omega(t)$ will give the distribution of the “visiting time” into $\Omega$, but in order to get its asymptotic distribution, a suitable normalization is needed [GS97]. The situations sketched above could be considerably refined, producing richer processes (see the quoted paper [Coe97] for an historical account of these questions and an exhaustive bibliography). We will however explore some of them in this paper under a more general perspective and then give applications to classes of systems never investigated before.

Let us first come back to formula (1) and replace $\Omega$ with a decreasing sequence of neighborhoods of a given point z ∈ X, $\Omega_\varepsilon(z)$, such that their measure goes to zero when $\varepsilon \to 0^+$. Then for some classes of hyperbolic dynamical systems, notably axiom A diffeomorphisms [Hir93], transitive Markov chains [Pit91], expanding maps of the interval with a spectral gap [Col96] and in the more general setting of systems verifying a strong mixing property (“self-mixing” condition and ϕ-mixing [Hir95]), and recently even in the case of rational maps with critical points in the Julia set [Hay98a], it is possible to prove that the distribution $F_{\Omega_\varepsilon(z)}(t)$ goes to the exponential-one law $e^{-t}$, and this for µ-almost every z ∈ X. A strong improvement of this kind of result appears in the paper [GS97], where an upper bound for the difference

$$\Big| \mu\Big( \tau_A(x) > \frac{t}{\mu(A)\lambda(A)} \Big) - e^{-t} \Big|$$

was explicitly computed in the case of ϕ-mixing systems, where A is a cylinder set and λ(A) a suitable normalizing factor. Recently [Hay98b] obtained an exponential error estimate for a quantity like (1) in the case of parabolic rational maps.
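To enrich the process, and the statistics, one successively introduces the K-th return time, $\tau^K_{\Omega_\varepsilon}(x)$, from $\Omega_\varepsilon$ into itself (see the precise definition in the next section), where $\Omega_\varepsilon = \Omega_\varepsilon(z)$ is still a neighborhood of some point z ∈ X. For the dynamical systems quoted above, a Poisson statistics can be proved, by showing that the distribution of successive return times into $\Omega_\varepsilon$ satisfies, for µ-a.e. z,

$$\mu_{\Omega_\varepsilon}\big( x \in \Omega_\varepsilon \mid \tau^K_{\Omega_\varepsilon}(x) \le t < \tau^{K+1}_{\Omega_\varepsilon}(x) \big) \xrightarrow[\varepsilon \to 0^+]{} \frac{t^K}{K!}\, e^{-t}. \tag{2}$$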
The preceding results deserve further investigations at least in two directions:
1. extend them to non-hyperbolic dynamical systems and, more ambitiously, check their robustness when the system loses strong mixing properties;
2. prove an error estimate even for the distribution of successive return times (2) and relate this approximation rate, if possible, to the statistical properties of the system like correlations decay or spectral properties.

We try to give partial answers to these questions in this paper. The general setting we put ourselves in is that of the return time(s) to the set $\Omega$ starting from $\Omega$ itself, as expressed in formulas (1) and (2) (although in Theorem 2.1 we will also consider points starting everywhere). The first attempt was to give, for measure preserving dynamical systems, a general upper bound for the difference between the distribution of the (rescaled) first return time and the exponential-one law $e^{-t}$, and then between the distribution of high-order (rescaled) return times and the Poisson law $\frac{t^K}{K!}e^{-t}$. We do not make any hypothesis on the set $\Omega$, nor on the ergodic properties of µ; nevertheless these bounds are expressed in terms of the
self-interactions of the set $\Omega$ and can be explicitly computed when typical rates of mixing are known (uniform mixing, α-mixing or ϕ-mixing). In this context, our bounds greatly improve and simplify the hypothesis of self-mixing condition of [Hir95], which was a powerful tool to get a sufficient condition for the Poisson statistics. This first part of the paper is essentially due to one of us (B.S.) and is part of his Ph.D. Thesis [Sau98b].

In the second part we apply the preceding bounds to new situations. The systems we treat are some non-uniformly hyperbolic maps of the interval; these maps are characterized by a structure parameter, say α, which measures the order of tangency at a neutral fixed point and governs the algebraic decay of correlations (in our example the order is $n^{1-1/\alpha}$). If µ denotes the absolutely continuous invariant measure, we prove Poisson statistics (in the sense made precise above), by giving an explicit approximation of the asymptotic law in terms of the measure of the set $\Omega_n$, where in this case $\Omega_n$ is a decreasing sequence of cylinder sets chosen around almost all points in the interval. To be precise the error is of the type $\mu(\Omega_n)^\beta$, for any β < 1 − α, and therefore β is explicitly related to α and optimized just by 1 − α. For the distributions of the K-th return times the bounds simply become $\mu(\Omega_n)^{\beta/K}$. By inspecting these results, we could argue that the non-hyperbolic character of the maps is reflected in the error term; to be more precise we think that as soon as the degree of non-uniform hyperbolicity of the map is monitored by a structure parameter α, this parameter will appear explicitly in the approximation to the Poisson law, which suggests, conversely, that we could use Poissonian statistics to test lack of hyperbolicity. Our claim is motivated by two more observations: first, in getting these bounds we proved a sort of α-mixing for the map with a rate which was exactly the same as the algebraic rate for the correlations' decay.
Second, in the forthcoming paper [Sau98a] the return times are analyzed for a class of piecewise expanding multidimensional maps. Although the mixing properties are much more difficult to handle, especially because of the presence of singularity lines and the geometry of their shape, the uniform dilatation will provide bounds of the form $\mu(\Omega_n)^\beta$ and $\mu(\Omega_n)^{\beta/K}$ for all β < 1, which reflects the fact that all the quantities involved, and the correlations' decay too, admit exponential estimates. We will come back to these questions in Sect. 4.

As a final remark, we address two questions:
1. Our analysis is local: the events are chosen around almost all points which we could call, following a widespread tradition, generic (for our statistics). What happens if we consider non-generic points (discarding of course some trivial situations like fixed points)? Could we see their (possibly different) statistics by involving some sort of large deviation argument?
2. What is the place of Poissonian statistics regarding other ergodic characterizations of dynamical systems? For example: what is the largest class of ergodic dynamical systems enjoying a Poissonian statistics? Conversely, does an invariant measure satisfying that behavior imply strong ergodic properties too?
2. General Bounds on the Distribution of Return Times

We will consider in this section a probability space (X, B, µ) together with a measure preserving transformation T acting on X. The basic object will be the return time into a positive measure set U starting from U, defined by

$$\tau_U(x) = \inf\big\{ k \ge 1 \mid T^k x \in U \big\} \cup \{\infty\}.$$
We define as usual the conditional measure $\mu_U$ on U by $\mu_U(A) = \frac{\mu(A \cap U)}{\mu(U)}$. We then recall Kac's theorem, which says that the conditional expectation of $\tau_U$ given U is finite, and equal to 1/µ(U), when µ is ergodic. As indicated in the introduction, Kac's result suggests how to properly rescale the return time when we are interested in its distribution.
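Kac's theorem can be checked by hand on a toy system. The sketch below is our own illustration, not taken from the paper: it assumes the rotation x → x + step (mod n) on the finite set {0, ..., n−1} with the uniform measure, which is ergodic when step and n are coprime, and it computes the conditional expectation of the return time exactly.

```python
from fractions import Fraction
from math import gcd


def first_return_time(x, step, n, target):
    """Smallest k >= 1 such that x + k*step (mod n) lies in target."""
    y, k = (x + step) % n, 1
    while y not in target:
        y, k = (y + step) % n, k + 1
    return k


def mean_return_time(step, n, target):
    """Average of the first return time over U = target, for the uniform measure on U."""
    if gcd(step, n) != 1:
        raise ValueError("step must be coprime to n so that the rotation is a single cycle")
    target = {x % n for x in target}
    if not target:
        raise ValueError("the target set must be non-empty")
    total = sum(first_return_time(x, step, n, target) for x in target)
    return Fraction(total, len(target))


if __name__ == "__main__":
    n, step, target = 12, 5, {0, 1, 7}
    # Kac: the mean return time equals 1 / mu(U) = n / |U|.
    print(mean_return_time(step, n, target), Fraction(n, len(target)))
```

In this finite setting the statement is elementary: the return times of the points of U partition one full cycle of length n, so they sum to n.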
2.1. First return time. We begin by showing that the distribution of the first return time into the set U starting from U is close to an exponential-one law if and only if the two distributions of the first return time starting, respectively, from U and everywhere, are close.

Theorem 2.1. Let us define $c(k, U) = \mu_U(\tau_U > k) - \mu(\tau_U > k)$ and set $c(U) = \sup_k |c(k, U)|$. The distribution of the (rescaled) first return time into the set U differs from the exponential-one law by at most $d(U) := 4\mu(U) + c(U)(1 + \log c(U)^{-1})$, namely:

$$\sup_{t \ge 0} \Big| \mu\Big( \tau_U > \frac{t}{\mu(U)} \Big) - e^{-t} \Big| \le d(U),$$

which is still true starting from U:

$$\sup_{t \ge 0} \Big| \mu_U\Big( \tau_U > \frac{t}{\mu(U)} \Big) - e^{-t} \Big| \le d(U).$$

Conversely, the difference between the two distributions (starting inside U and everywhere) can be bounded in terms of the distance $\tilde c(U) := \sup_{t \ge 0} |\mu_U(\tau_U > t/\mu(U)) - e^{-t}|$, precisely:

$$c(U) \le 2\mu(U) + \tilde c(U)\big(2 + \log \tilde c(U)^{-1}\big).$$

Remark 2.2. Whenever µ(U) > 0 the return time's law is discrete and this allows us to get a lower bound for the rate of convergence. More precisely, we have the following proposition:

Proposition 2.3. For each k ≥ 0,

$$\varepsilon_{k,U} := \big| \mu(\tau_U > k) - e^{-k\mu(U)} \big| + \big| \mu(\tau_U > k + 1/2) - e^{-(k+1/2)\mu(U)} \big| \ge \frac{e^{-k\mu(U)}}{4}\, \mu(U).$$

In particular, $\varepsilon_{0,U} \ge \mu(U)/4$.

Proof of Proposition 2.3. Let k ≥ 0 be an integer. Since $\tau_U$ takes only integer values, the distribution for $t = k\mu(U)$ and $t' = (k + 1/2)\mu(U)$ is the same, then

$$\varepsilon_{k,U} \ge \big| \exp(-k\mu(U)) - \exp(-(k+1/2)\mu(U)) \big| \ge \exp(-k\mu(U))\big(1 - e^{-\mu(U)/2}\big) \ge \frac{e^{-k\mu(U)}}{4}\, \mu(U). \qquad \square$$
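To see the lower bound of Proposition 2.3 at work, the following sketch evaluates $\varepsilon_{k,U}$ in the simplest case we can think of. It assumes an i.i.d. process in which U has measure p, so that $\mu(\tau_U > k) = (1-p)^k$; this model, and the function names, are our own illustration and are not taken from the text.

```python
import math


def epsilon_k(k, p):
    """epsilon_{k,U} when mu(tau_U > k) = (1 - p)**k (assumed i.i.d. model)."""
    # tau_U is integer valued, so the survival function is the same at k and at k + 1/2.
    survival = (1.0 - p) ** k
    return (abs(survival - math.exp(-k * p))
            + abs(survival - math.exp(-(k + 0.5) * p)))


def lower_bound(k, p):
    """Right-hand side of Proposition 2.3: exp(-k p) * p / 4."""
    return math.exp(-k * p) * p / 4.0


if __name__ == "__main__":
    p = 0.05
    for k in (0, 1, 10, 50):
        print(k, epsilon_k(k, p), lower_bound(k, p))
```

The last inequality of the proof uses $1 - e^{-x/2} \ge x/4$, which holds for 0 ≤ x ≤ 1 and therefore for x = µ(U).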
Proof of Theorem 2.1. Let us remark that for any k ≥ 1 we have

$$\mu(\tau_U = k) = \mu(U \cap \{\tau_U > k - 1\}). \tag{3}$$

Since $\{\tau_U > k\} = T^{-1}(U^c \cap \{\tau_U > k - 1\})$, by the invariance of µ we get that $\mu(\tau_U > k) = \mu(\tau_U > k - 1) - \mu(U \cap \{\tau_U > k - 1\})$, whence the result. Next, for all k > 0 we have

$$\mu(\tau_U > k) = \mu(\tau_U > k - 1) - \mu(U)\mu_U(\tau_U > k - 1) = \mu(\tau_U > k - 1) - \mu(U)\big[\mu(\tau_U > k - 1) + c(k, U)\big] = \mu(\tau_U > k - 1)\big[1 - \mu(U)\big] - \mu(U)c(k, U).$$

Then it follows by an immediate induction that

$$\mu(\tau_U > k) = (1 - \mu(U))^k - \mu(U) \sum_{j=1}^{k} c(j, U)(1 - \mu(U))^{k-j}.$$

Hence for all t ≥ 0, putting $k_t = [t/\mu(U)]$, we have

$$\big| \mu(\tau_U > k_t) - (1 - \mu(U))^{k_t} \big| \le \mu(U) \sum_{j=1}^{k_t} |c(j, U)| \le t\,c(U). \tag{4}$$

Setting $z = -\log c(U)$, and $k_z = [z/\mu(U)]$, we get

$$(1 - \mu(U))^{k_z} \le e^{-k_z\mu(U)} \le c(U)e^{\mu(U)} \le c(U) + 2\mu(U),$$

and for any t > z,

$$\mu(\tau_U > k_t) \le \mu(\tau_U > k_z) \le (1 - \mu(U))^{k_z} + z\,c(U) \le 2\mu(U) + c(U)(1 - \log c(U)),$$

which gives

$$\big| \mu(\tau_U > k_t) - (1 - \mu(U))^{k_t} \big| \le 2\mu(U) + c(U)(1 - \log c(U)).$$

Instead for any t ≤ z the same estimate holds by inequality (4). Since, by an easy computation, $|(1 - \mu(U))^{k_t} - e^{-t}| \le 2\mu(U)$, we get for any t ≥ 0,

$$\big| \mu(\tau_U > k_t) - e^{-t} \big| \le 4\mu(U) + c(U)(1 - \log c(U)),$$

which proves the first part of the theorem. Moreover, since $|\mu_U(\tau_U > k_t) - \mu(\tau_U > k_t)| = |c(k_t, U)| \le c(U)$, we finally have for each t ≥ 0,

$$\big| \mu_U(\tau_U > k_t) - e^{-t} \big| \le 4\mu(U) + c(U)(2 - \log c(U)).$$
The converse part is proven in the same way. For k ≥ 1,

$$\mu(\tau_U > k) = 1 - \mu(\tau_U \le k) = 1 - \sum_{j=1}^{k} \mu(\tau_U = j) = 1 - \mu(U) \sum_{j=1}^{k} \mu_U(\tau_U > j - 1),$$

where we used in the last equality the relation (3). Hence

$$\big| \mu(\tau_U > k) - e^{-k\mu(U)} \big| \le \Big| 1 - \mu(U) \sum_{j=1}^{k} e^{-(j-1)\mu(U)} - e^{-k\mu(U)} \Big| + k\mu(U)\tilde c(U)$$
$$\le \Big| 1 - \mu(U)\, \frac{1 - e^{-k\mu(U)}}{1 - e^{-\mu(U)}} - e^{-k\mu(U)} \Big| + k\mu(U)\tilde c(U)$$
$$\le \big(1 + e^{-k\mu(U)}\big) \Big| 1 - \frac{\mu(U)}{1 - e^{-\mu(U)}} \Big| + k\mu(U)\tilde c(U)$$
$$\le 2\mu(U) + k\mu(U)\tilde c(U).$$

This gives, whenever $k \le k_0 := \log \tilde c(U)^{-1}/\mu(U)$:

$$|c(k, U)| \le 2\mu(U) + \tilde c(U) \log \tilde c(U)^{-1}.$$

For $k > k_0$ we simply have

$$|c(k, U)| \le \mu(\tau_U > k_0) + \mu_U(\tau_U > k_0) \le 2\mu(U) + \tilde c(U) \log \tilde c(U)^{-1} + e^{-k_0\mu(U)} + \tilde c(U). \qquad \square$$
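The upper bound of Theorem 2.1 can be illustrated on the same kind of toy model. The sketch below again assumes an i.i.d. process with µ(U) = p, for which $\mu(\tau_U > k) = (1-p)^k$ and c(U) = 0; reading $c \log c^{-1}$ as 0 at c = 0, the bound becomes d(U) = 4p. The model, this convention and the grid parameters are our assumptions, and the supremum is only estimated on a finite grid of values of t.

```python
import math


def sup_distance(p, t_max=20.0, steps=20000):
    """Grid estimate of sup_t |mu(tau_U > t/p) - exp(-t)| when mu(tau_U > k) = (1 - p)**k."""
    worst = 0.0
    for i in range(steps + 1):
        t = t_max * i / steps
        k = math.floor(t / p)  # tau_U is integer valued, so only the integer part of t/p matters
        worst = max(worst, abs((1.0 - p) ** k - math.exp(-t)))
    return worst


if __name__ == "__main__":
    for p in (0.2, 0.05, 0.01):
        print(p, sup_distance(p), 4 * p)
```

In this model the observed distance shrinks proportionally to p, consistent with the lower bound µ(U)/4 of Proposition 2.3 and the upper bound 4µ(U).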
The last theorem gives a necessary and sufficient condition to obtain the exponential law, that is d(U) → 0. However, such a quantity is not very transparent for dynamical systems, which is why we give a criterion to estimate it. This kind of condition is a generalization of the so-called “self-mixing condition” introduced in [Hir95].

Lemma 2.4. Let U ⊂ X be a measurable set. The following estimate holds:

$$c(U) \le \inf\{ a_N(U) + b_N(U) + N\mu(U) \mid N \in \mathbb{N} \},$$

where the quantities are defined by

$$a_N(U) = \mu_U\Big( \bigcup_{j=1}^{N} T^{-j}U \Big) = \mu_U(\tau_U \le N), \qquad b_N(U) = \sup_{V \in \mathcal{U}_\infty} |\mu_U(T^{-N}V) - \mu(V)|,$$

with $\mathcal{U} = \{U, U^c\}$, $\mathcal{U}_n = \bigvee_{k=0}^{n-1} T^{-k}\mathcal{U}$ and $\mathcal{U}_\infty = \cup_n \sigma(\mathcal{U}_n)$.
Proof. Let $N \in \mathbb{N}$. If k < N, we just bound c(k, U) by

$$|\mu_U(\tau_U > k) - \mu(\tau_U > k)| = |\mu_U(\tau_U \le k) - \mu(\tau_U \le k)| \le |\mu_U(\tau_U \le k)| + |\mu(\tau_U \le k)| \le a_N(U) + k\mu(U) \le a_N(U) + N\mu(U).$$

Otherwise, let us remark that $\{\tau_U > k\}$ and $\{\tau_U \circ T^N > k - N\}$ differ only on $\{\tau_U \le N\}$, and by hypothesis

$$|\mu_U(\tau_U > k) - \mu_U(\tau_U \circ T^N > k - N)| \le \mu_U(\tau_U \le N) = a_N(U).$$

Moreover

$$|\mu_U(\tau_U \circ T^N > k - N) - \mu(\tau_U > k - N)| = |\mu_U(T^{-N}(\tau_U > k - N)) - \mu(\tau_U > k - N)| \le b_N(U).$$

But $\{\tau_U > k - N\}$ and $\{\tau_U > k\}$ differ only on $\{\tau_U \circ T^{k-N} \le N\}$, hence

$$|\mu(\tau_U > k - N) - \mu(\tau_U > k)| \le \mu(\tau_U \circ T^{k-N} \le N) = \mu(\tau_U \le N) \le N\mu(U).$$

We finally get for each $k, N \in \mathbb{N}$,

$$|\mu_U(\tau_U > k) - \mu(\tau_U > k)| \le a_N(U) + b_N(U) + N\mu(U),$$

which concludes the proof, since N is arbitrary. $\square$

We remark that $b_N(U)$ is bounded by α(N) if the partition $\mathcal{U} = \{U, U^c\}$ is α-mixing, and by γ(N) if it is uniformly mixing (see Definition 2.1 below). To simplify, we could say that the exponential law holds when there exists some N so small that only few points of U come back in U before N steps, but large enough such that $T^N U$ is uniformly spread out.

Definition 2.1 (Speed of mixing). Let (X, B, T, µ) be a dynamical system and ξ a finite or countable measurable partition of X. We set $\xi_k = \bigvee_{j=0}^{k-1} T^{-j}\xi$ and $\sigma(\xi_k)$ the σ-algebra generated by $\xi_k$.

1. Uniform mixing. The partition ξ is uniformly mixing with speed γ(n) going to zero for n going to infinity if for any n,

$$\gamma(n) = \sup_{k,l}\ \sup_{R \in \sigma(\xi_k),\ S \in T^{-(n+k)}\sigma(\xi_l)} |\mu(R \cap S) - \mu(R)\mu(S)|.$$

2. α-mixing. The partition ξ is α-mixing with speed α(n) going to zero for n going to infinity if for any n,

$$\alpha(n) = \sup_{k,l}\ \sup_{R \in \xi_k,\ S \in T^{-(n+k)}\sigma(\xi_l)} \Big| \frac{\mu(R \cap S)}{\mu(R)} - \mu(S) \Big|.$$
3. ϕ-mixing. The partition $\xi$ is ϕ-mixing with speed $\varphi(n)$ going to zero for $n$ going to infinity if for any $n$,
$$\varphi(n) = \sup_{k,l}\ \sup_{R \in \sigma(\xi_k),\ S \in T^{-(n+k)}\xi_l} \left|\frac{\mu(R \cap S)}{\mu(R)\mu(S)} - 1\right|.$$
4. Weak-Bernoulli. The partition $\xi$ is weak-Bernoulli with speed $\beta(n)$ going to zero when $n$ goes to infinity, if for any $n$,
$$\beta(n) = \sup_{k,l}\ \sum_{R \in \xi_k,\ S \in T^{-(n+k)}\xi_l} |\mu(R \cap S) - \mu(R)\mu(S)|.$$
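As a concrete illustration of Definition 2.1, the following sketch evaluates the four coefficients for a stationary two-state Markov chain, restricted to one-symbol cylinders ($k = l = 1$), so that $R = \{x_0 = i\}$, $S = T^{-(n+1)}\{x_0 = j\}$ and $\mu(R \cap S) = \pi_i P^{n+1}_{ij}$. Each returned value is therefore only a lower bound for the corresponding supremum over all $k, l$. The transition matrix `P`, the helper names and the restriction to $k = l = 1$ are assumptions made for this example; they do not come from the text.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, m):
    """m-th power of a square matrix (m >= 0) by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, p)
    return result


def stationary(p, steps=2000):
    """Stationary row vector pi = pi P, approximated by power iteration."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


def restricted_coefficients(p, n):
    """Mixing coefficients of Definition 2.1 restricted to k = l = 1.

    With R = {x_0 = i} and S = T^{-(n+1)}{x_0 = j} one has
    mu(R) = pi_i, mu(S) = pi_j and mu(R & S) = pi_i * (P^{n+1})_{ij}.
    """
    pi = stationary(p)
    pm = mat_pow(p, n + 1)
    states = range(len(p))
    pairs = [(i, j) for i in states for j in states]
    gamma = max(abs(pi[i] * pm[i][j] - pi[i] * pi[j]) for i, j in pairs)
    alpha = max(abs(pm[i][j] - pi[j]) for i, j in pairs)
    phi = max(abs(pm[i][j] / pi[j] - 1.0) for i, j in pairs)
    beta = sum(abs(pi[i] * pm[i][j] - pi[i] * pi[j]) for i, j in pairs)
    return {"gamma": gamma, "alpha": alpha, "phi": phi, "beta": beta}


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]  # hypothetical example chain
    for n in range(5):
        print(n, restricted_coefficients(P, n))
```

For this chain all four restricted quantities decay geometrically in $n$, and by construction they are ordered as gamma ≤ alpha ≤ phi and gamma ≤ beta ≤ phi.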
Remark 2.5. We state some general implications and results verified by the preceding types of mixing.
1. ϕ-mixing implies α-mixing, which implies uniform mixing. For any $n$, $\gamma(n) \le \alpha(n) \le \varphi(n)$.
2. ϕ-mixing implies weak-Bernoulli, which implies uniform mixing. For any $n$, $\gamma(n) \le \beta(n) \le \varphi(n)$.
3. If $\xi$ is a generating partition of a uniformly mixing dynamical system, then the system is mixing.
4. If $\xi$ is a generating weak-Bernoulli partition, then the system is metrically conjugate to a Bernoulli shift.

2.2. Successive return times. We will now investigate the properties of successive return times to the set $U$. For this purpose, let us define the $k$-th return time in $U$ by
$$\tau_U^{(k)}(x) = \begin{cases} 0 & \text{if } k = 0,\\ \tau_U^{(k-1)}(x) + \tau_U\big(T^{\tau_U^{(k-1)}(x)}(x)\big) & \text{if } k \ge 1. \end{cases}$$
Observe that the difference between two consecutive return times follows the same law as the first, for the simple reason that
$$\tau_U^{(K+1)} - \tau_U^{(K)} = \tau_U \circ T^{\tau_U^{(K)}}$$
and the measure $\mu_U$ is invariant with respect to the induced map on $U$.

Theorem 2.6. Let $U \subset X$ be a measurable set, and $\mathcal{U} = \{U, U^c\}$ the partition associated to it. Given an integer $K$ and a rectangle $Q_K$ in $\mathbb{R}^K$, the differences between successive normalized return times in $U$ are independent and exponentially distributed up to $f(K,U)$ (see (5) below), where $f(K,U)$ is defined depending on the type of mixing by
(α) When $(X, T, \mu)$ is α-mixing for $\mathcal{U}$, with speed $\alpha$ (see footnote 1), then
$$f(K,U) = K\Big[3d(U) + \inf_{M \in \mathbb{N}} \{\alpha(M) + 3M\mu(U)\}\Big].$$
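1 We just need that mixing property for some special sets; more precisely, we are interested in
$$\alpha'(N) = \sup \left|\frac{\mu(R \cap S)}{\mu(R)} - \mu(S)\right|,$$
where the supremum is taken over $j, N \in \mathbb{N}$, $R \in \mathcal{U}_j$ with $T^j R \subset U$, and $S \in T^{-j-N}\mathcal{U}_\infty$.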
Statistics of Return Times
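(γ) When the partition $\mathcal{U}$ is uniformly mixed by $(X, T, \mu)$ with speed $\gamma$, then
$$f(K,U) = K\Big[4d(U) + \inf_{M \in \mathbb{N},\ \gamma(M) < \mu(U)^2} \Big\{\frac{\gamma(M)}{\mu(U)^2}\Big(2 - K\log\frac{\gamma(M)}{\mu(U)^2}\Big) + 3M\mu(U)\Big\}\Big].$$
[...] > v) − µ_U(τ_U > u) − (e^{−u} − e^{−v})| ≤ 2d(U). Let us suppose that the inequality (5) is true for $K$; we want to prove that it is also true for $K+1$. Let $[r,s]$ be the projection of $Q_{K+1}$ onto the last coordinate, and for $k = K, K+1$ denote
$$D_k = U \cap \tau_k^{-1}\Big(\frac{1}{\mu(U)} Q_k\Big).$$
For any $M \in \mathbb{N}$, the set defined by
$$E_{K+1}(M) = D_K \cap \big\{x \in U \mid \tau_U \circ T^M \circ F^K(x) \in [r,s]/\mu(U) - M\big\}$$
verifies the inclusions
$$E_{K+1}(M) \cap \{\tau_U \circ F^K > M\} \subset D_{K+1} \subset E_{K+1}(M) \cup \{\tau_U \circ F^K \le M\}.$$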
Theorem 2.1 shows that the two sets which bound $D_{K+1}$ do not differ too much, namely,
$$\mu_U(\tau_U \circ F^K \le M) = \mu_U(\tau_U \le M) \le 1 - e^{-M\mu(U)} + d(U) \le M\mu(U) + d(U).$$
Therefore we get the first bound
$$|\mu_U(D_{K+1}) - \mu_U(E_{K+1}(M))| \le M\mu(U) + d(U). \qquad (6)$$
So the problem reduces to proving that $\mu_U(E_{K+1}(M))$ follows the expected law. We decompose the sets $E_{K+1}(M)$ over $A_K^j = U \cap \{\tau_U^{(K)} = j\}$. We have
$$E_{K+1}(M) \cap A_K^j = D_K \cap A_K^j \cap T^{-(M+j)}\Big\{\tau_U \in \frac{[r,s]}{\mu(U)} - M\Big\}.$$
We can now use the mixing with $R = D_K \cap A_K^j \in \sigma(\mathcal{U}_j)$ and $S = T^{-(M+j)}\{\tau_U \in [r,s]/\mu(U) - M\}$. According to the type of mixing, we get two approximations:
(α) When the partition $\mathcal{U}$ is α-mixing:
$$\Big|\mu_U(E_{K+1}(M) \cap A_K^j) - \mu_U(D_K \cap A_K^j)\,\mu\Big(\tau_U \in \frac{[r,s]}{\mu(U)} - M\Big)\Big| \le \alpha(M)\,\mu_U(D_K \cap A_K^j).$$
Summing over the possible values of $j$ we get:
$$\Big|\mu_U(E_{K+1}(M)) - \mu_U(D_K)\,\mu\Big(\tau_U \in \frac{[r,s]}{\mu(U)} - M\Big)\Big| \le \alpha(M)\,\mu_U(D_K) \le \alpha(M). \qquad (7)$$
Now Theorem 2.1 gives
$$\Big|\mu\Big(\tau_U \in \frac{[r,s]}{\mu(U)} - M\Big) - (e^{-r} - e^{-s})\Big| \le \Big|\mu\Big(\tau_U \in \frac{[r,s]}{\mu(U)}\Big) - (e^{-r} - e^{-s})\Big| + 2M\mu(U) \le 2(M\mu(U) + d(U)).$$
We briefly recall the approximations done, with their respective errors:
$$\mu_U(D_{K+1}) \to \mu_U(E_{K+1}(M)) \to \mu_U(D_K)\,\mu\Big\{\tau_U \in \frac{[r,s]}{\mu(U)}\Big\} \to \mu_U(D_K)(e^{-r} - e^{-s}),$$
where the three successive steps cost $M\mu(U) + d(U)$, $\alpha(M)$ and $2(M\mu(U) + d(U))$ respectively. This allows us to show that the difference
$$\Big|\mu_U(D_{K+1}) - \int_{Q_{K+1}} \prod_{i=1}^{K+1} e^{-s_i}\,ds^{K+1}\Big| \qquad (8)$$
is bounded by the quantity $f(K,U) + 3M\mu(U) + \alpha(M) + 3d(U) \le f(K+1,U)$, which proves the induction and concludes the proof of this first case.
(γ) We now consider the case when $\mathcal{U}$ is uniformly mixing:
Let $M$ be such that $\gamma(M) < \mu(U)^2$. As a first step, we can restrict ourselves to the case when $Q_K \subset [0,z]^K$, with $z = -\log\frac{\gamma(M)}{\mu(U)^2} > 0$. In fact,
$$Q_K \setminus [0,z]^K \subset \bigcup_{k=1}^{K} \mathbb{R}_+^{k-1} \times\, ]z,\infty] \times \mathbb{R}_+^{K-k},$$
which implies, using Theorem 2.1,
$$\mu_U\big(\mu(U)\tau_K \in Q_K \setminus [0,z]^K\big) \le \sum_{k=1}^{K} \mu_U\big(\tau_U^{(k+1)} - \tau_U^{(k)} > z/\mu(U)\big) = K\mu_U(\tau_U > z/\mu(U)) \le K(e^{-z} + d(U)).$$
Moreover
$$\int_{Q_K \setminus [0,z]^K} \prod_{i=1}^{K} e^{-s_i}\,ds^K \le \sum_{k=1}^{K} \int_{\mathbb{R}_+^{k-1} \times ]z,\infty] \times \mathbb{R}_+^{K-k}} \prod_{i=1}^{K} e^{-s_i}\,ds^K \le Ke^{-z}.$$
Next, by decomposing according to
$$\mu_U(\mu(U)\tau_K \in Q_K) = \mu_U\big(\mu(U)\tau_K \in Q_K \cap [0,z]^K\big) + \mu_U\big(\mu(U)\tau_K \in Q_K \setminus [0,z]^K\big),$$
we get $f(K,U) \le K(2e^{-z} + d(U)) + f'(K,U)$, where $f'(K,U)$ is the maximum of the difference (5) for the boxes $Q_K \subset [0,z]^K$. We then estimate $f'(K,U)$. First, by uniform mixing we get
$$\big|\mu_U(E_{K+1}(M) \cap A_K^j) - \mu_U(D_K \cap A_K^j)\,\mu(\tau_U \in [r,s]/\mu(U) - M)\big| \le \frac{\gamma(M)}{\mu(U)}$$
and then we sum over all possible$^2$ values $j$ of $\tau^{(K)}$,
$$\big|\mu_U(E_{K+1}(M)) - \mu_U(D_K)\,\mu(\tau_U \in [r,s]/\mu(U) - M)\big| \le \frac{Kz\gamma(M)}{\mu(U)^2}.$$
The same computation performed after estimation (7) (where now $\alpha(M)$ is replaced by $Kz\gamma(M)/\mu(U)^2$ in inequality (7)) gives the bound
$$f'(K+1,U) \le f'(K,U) + K\frac{z\gamma(M)}{\mu(U)^2} + 3(d(U) + M\mu(U)).$$
Then for each $M$,
$$f'(K,U) \le K^2\frac{z\gamma(M)}{\mu(U)^2} + 3K(d(U) + M\mu(U)).$$
Since $M$ is arbitrary, our choice of $z$ implies that the inequality (5) is verified with
$$f(K,U) = K\Big[4d(U) + \inf_{M \in \mathbb{N},\ \gamma(M) < \mu(U)^2} \Big\{\frac{\gamma(M)}{\mu(U)^2}\Big(2 - K\log\frac{\gamma(M)}{\mu(U)^2}\Big) + 3M\mu(U)\Big\}\Big]. \qquad □$$
[...] 0 | τ_U ≤ t/µ(U). It turns out that $N(t)$ is a discrete random variable whose law is close to a Poissonian one; more precisely we have

Theorem 2.8. The distribution of the number of visits $N(t)$ differs from the Poissonian law by
$$\Big|\mu_U(N(t) = K) - \frac{t^K}{K!}e^{-t}\Big| \le g(t,K,U) + g(t,K+1,U),$$
where for each $k \ge 0$, $g(t,k,U) = \big(12t^k/k + k^{k-1}\big)\sqrt[k]{f(k,U)}$.

Proof. It is a consequence of the weak dependence of the differences of successive return times established by Theorem 2.6. We first remark that
$$\mu_U(N(t) = K) = \mu_U\Big(\Big\{\tau_U^{(K)} \le \frac{t}{\mu(U)}\Big\} \cap \Big\{\tau_U^{(K+1)} > \frac{t}{\mu(U)}\Big\}\Big) = \mu_U\big(\tau^{(K)} \le t/\mu(U)\big) - \mu_U\big(\tau^{(K+1)} \le t/\mu(U)\big).$$
It is then sufficient to compute the measure of points whose $k$-th rescaled return time is smaller than $t$, for $k = K, K+1$. If we denote by $\tilde{P}_k(t)$ the distribution of the sum of the differences of successive return times, we know that when the latter are i.i.d. random variables with the same exponential law, then setting
$$L_k(t) = \big\{(s_1, \ldots, s_k) \in \mathbb{R}_+^k \mid s_1 + \cdots + s_k \le t\big\}$$
we get
$$\tilde{P}_k(t) = P_k(t) := \int_{L_k(t)} \prod_{i=1}^{k} e^{-s_i}\,ds_i$$
which gives the classical result
$$P_K(t) - P_{K+1}(t) = \frac{t^K}{K!}e^{-t}.$$
The difficulty now comes from the fact that we have to translate Theorem 2.6, given for boxes, to the simplex $L_k(t)$. Let us suppose that $f(k,U) < 1$, otherwise there is nothing to prove. Hence the integer defined by $N = [k/\sqrt[k]{f(k,U)} + 1]$ is bigger than $k$. We consider the uniform partition of $[0,t]^k$ by cubes of size $t/N$. Let $\Delta_k$ be the union of those cubes $Q_k$ included in the interior of $L_k(t)$, i.e. for which $\sum_{i=1}^{k} s_i < t$ for any $(s_1, \ldots, s_k) \in Q_k$, and $\Sigma_k$ the union of those which intersect the boundary, i.e. the union of those cubes such that there exists $(s_1, \ldots, s_k) \in Q_k$ with $\sum_{i=1}^{k} s_i = t$.

Fig. 1. Partition of the cube $[0,t]^k$ for $k = 2$. $\Sigma_k$ is the union of dotted squares and $\Delta_k$ the union of shaded rectangles $R_k(Q_k)$.

By using the notation $\tau_k$ introduced in the proof of Theorem 2.6 we have
$$\delta := \Big|\mu_U\big(\tau_U^{(k)} \le t/\mu(U)\big) - \int_{L_k(t)} \prod_{i=1}^{k} e^{-s_i}\,ds^k\Big| \le \Big|\mu_U\Big(\tau_k \in \frac{\Delta_k}{\mu(U)}\Big) - \int_{\Delta_k} \prod_{i=1}^{k} e^{-s_i}\,ds^k\Big| + \mu_U\Big(\tau_k \in \frac{\Sigma_k}{\mu(U)}\Big) + \int_{\Sigma_k} \prod_{i=1}^{k} e^{-s_i}\,ds^k =: \delta_1 + \delta_2 + \delta_3.$$
To estimate $\delta_1$, we write $\Pi$ for the projection onto the last $k-1$ coordinates; then the sets $R_k(Q_k) = \{Q'_k \subset \Delta_k \mid \Pi(Q'_k) = \Pi(Q_k)\}$ are boxes, and their number is bounded by $N^{k-1}$ (see Fig. 1). For each of these boxes Theorem 2.6 gives an error smaller than $f(k,U)$, and then we get $\delta_1 \le N^{k-1}f(k,U)$.
To compute $\delta_2$ and $\delta_3$, we first remark that a straightforward combinatorial computation gives, for the number $C_N^k$ of cubes inside $\Sigma_k$, $C_N^k \le 6N^{k-1}$ (see [Sau98b]). But for each cube $Q_k \subset \Sigma_k$ Theorem 2.6 gives
$$\mu_U(\tau_k \in Q_k) \le \int_{Q_k} \prod_{i=1}^{k} e^{-s_i}\,ds^k + f(k,U).$$
Summing over all the cubes contained in $\Sigma_k$ one has $\delta_2 \le 6N^{k-1}f(k,U) + \delta_3$. Moreover the integral $\int_{Q_k} \prod_{i=1}^{k} e^{-s_i}\,ds^k$ is bounded by the volume of $Q_k$, equal to $(t/N)^k$, which gives $\delta_3 \le 6N^{k-1}t^k/N^k$. We then deduce that
$$\delta \le \delta_1 + \delta_2 + \delta_3 \le N^{k-1}f(k,U) + 12t^k/N,$$
which implies $\delta \le \big(12t^k/k + k^{k-1}\big)\sqrt[k]{f(k,U)}$ by the previous choice of $N$. □
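The classical identity $P_K(t) - P_{K+1}(t) = \frac{t^K}{K!}e^{-t}$ used in the proof above can be checked numerically in the idealized case where the gaps between successive returns are exactly i.i.d. exponential variables of mean one. The sketch below is only an illustration of that limiting model, not of the dynamical system itself; the function names and the Monte Carlo setup are assumptions made for this example. It uses the closed form $P_k(t) = 1 - e^{-t}\sum_{i<k} t^i/i!$ for the distribution of a sum of $k$ such variables.

```python
import math
import random


def erlang_cdf(k, t):
    """P_k(t): probability that a sum of k i.i.d. Exp(1) variables is <= t.

    For k = 0 the empty sum is 0, so the probability is 1.
    """
    return 1.0 - math.exp(-t) * sum(t ** i / math.factorial(i) for i in range(k))


def poisson_pmf(K, t):
    """Poisson probability t^K e^{-t} / K!."""
    return t ** K * math.exp(-t) / math.factorial(K)


def simulate_visit_counts(t, trials, rng):
    """Empirical law of N(t) = max{k : s_1 + ... + s_k <= t} for Exp(1) gaps."""
    counts = {}
    for _ in range(trials):
        total, visits = 0.0, 0
        while True:
            total += rng.expovariate(1.0)
            if total > t:
                break
            visits += 1
        counts[visits] = counts.get(visits, 0) + 1
    return {k: c / trials for k, c in counts.items()}


if __name__ == "__main__":
    t = 2.0
    freq = simulate_visit_counts(t, 100000, random.Random(0))
    for K in range(6):
        exact = erlang_cdf(K, t) - erlang_cdf(K + 1, t)
        print(K, exact, poisson_pmf(K, t), freq.get(K, 0.0))
```

The first two printed columns agree up to rounding, and the simulated frequencies fluctuate around them at the usual Monte Carlo scale.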
3. Applications

In the preceding section we gave general estimates for the error between the distribution of the number of visits to a set $U$ and the Poissonian law. We could wonder whether this law is attained in the limit $\mu(U) \to 0$. Put in this way the question is not very clear. What we need instead is to fix a sequence of neighborhoods $U_\varepsilon(z)$ shrinking to a point $z$ and ask whether the Poisson law holds in the limit $\varepsilon \to 0$. This approach was successfully carried out by several authors, as recalled in the introduction. Although their results were applied to dynamical systems, the inspiration and some of the techniques of the proofs were of a probabilistic nature (theory of moments, Laplace transform). Here we follow a purely dynamical direction, trying to extract all the statistical information from the ergodic properties of the system. In this way we are able, for example, to exhibit the Poissonian statistics for a large class of non-uniformly hyperbolic maps of the interval, widely studied in recent years, especially to determine the rate of decay of correlations and the central limit theorem. Some statistical properties of these maps have been studied in the paper [LSV97] (which contains a quite complete bibliography on the subject), where an absolutely continuous invariant probability measure (acim) is first constructed, and then it is shown that it enjoys a polynomial decay of correlations. One feature of these maps is that they are characterized by a structure parameter (the order of tangency at an indifferent fixed point), which governs the statistical properties and can be viewed as an indicator of the "weak" hyperbolicity of the map. Actually, it turns out that this parameter appears even in the approximation to the Poissonian law. Let us then consider, for $0 < \alpha < 1$, the following map of the unit interval:
$$T(x) = \begin{cases} x(1 + 2^\alpha x^\alpha) & \forall x \in [0, 1/2),\\ 2x - 1 & \forall x \in [1/2, 1]. \end{cases}$$
We recall some properties and results which we will need in the following, and we refer the reader to the quoted paper for more information and proofs. This map has a finite Markov partition (with two elements), but for our purposes it is more convenient to work with the countable one $\xi$ generated by the left preimages $a_n$ of 1, $\xi = \{A_m \mid m \in \mathbb{N}\}$ with $A_n = ]a_{n+1}, a_n]$. We will often use in the following the easy bound $a_n/a_{n+1} \le 2$. We can associate to each point $z \in X = ]0,1]$ a unique infinite sequence $\omega = \omega_1\omega_2\ldots$ with the property that $T^{m-1}z \in A_{\omega_m}$ for all integers $m \ge 1$. We denote by $\xi_m$ the dynamical partition $\xi \vee T^{-1}\xi \vee \cdots \vee T^{-m+1}\xi$ and call its elements $m$-cylinders. We denote by $\xi_m(z) \in \xi_m$ the $m$-cylinder which contains $z$. The sequence $\omega$ satisfies the admissibility condition: $\omega_m\omega_{m+1}$ appears in $\omega$ if and only if $\omega_m = 0$ or $\omega_{m+1} = \omega_m - 1$. We say that a non-empty cylinder $C = [\omega_1 \ldots \omega_k] \in \xi_k$ is maximal if it maps onto $X$ after exactly $k$ iterations, which is easily seen to be equivalent to $\omega_k = 0$.

3.1. Some mixing properties. We begin with a brief survey of some results proved by two of us (B.S., S.V.) in the joint paper [LSV97] with Carlangelo Liverani. We showed that the density $h$ of the acim belongs to a certain cone of functions $\mathcal{C}_*(a)$, which will be characterized later (see Lemma 3.2), provided $a$ is big enough, and satisfies$^3$:

3 We recall the formal definition of the Perron–Frobenius operator $P$ acting on functions $f : [0,1] \to \mathbb{R}$: $Pf(x) = \sum_{Ty = x} \frac{1}{D_yT} f(y)$. One easily checks that $\mu$ is an acim iff $h = \frac{d\mu}{dx}$ is a fixed point of $P$ on $L^1(dx)$.
Lemma A (Lemma 2.2 in [LSV97]). The cone $\mathcal{C}_*(a)$ is left invariant by the Perron–Frobenius operator $P$, i.e. $P(\mathcal{C}_*(a)) \subset \mathcal{C}_*(a)$.

Lemma B (Lemma 2.3 in [LSV97]). The density $h$ belongs to the cone $\mathcal{C}_*(a)$, and verifies in particular, whenever $x \le y$,
$$\frac{h(x)}{h(y)} \le (y/x)^{\alpha+1}, \qquad (9)$$
$$h(x) \le a x^{-\alpha}. \qquad (10)$$

Proposition C (Distortion inequality, proof of Proposition 3.3 in [LSV97]). There exists some constant $\Delta$ such that for all $k$ and $x, y \in C \in \xi_k$,
$$\frac{D_xT^k}{D_yT^k} \le \Delta < \infty. \qquad (11)$$
We will suppose without loss of generality that $a \ge 4\Delta$.

Theorem D (Theorem 4.1 in [LSV97]). In the proof of this theorem we in particular got that for $f \in \mathcal{C}_*(a)$,
$$\big\|P^n(f - \lambda(f))\big\|_{L^1(\lambda)} \le \Phi(n)\,\|f\|_{L^1(\lambda)} \qquad (12)$$
with $\Phi(n) = Cn^{-\frac{1}{\alpha}+1}(\log n)^{\frac{1}{\alpha}} = O_L(n^{-\frac{1}{\alpha}+1})$, where we define $O_L(\varepsilon) = O(\varepsilon(\log \varepsilon^{-1})^r)$ in the limit $\varepsilon \to 0$, for any constant $r$.

We then need a few more results on the speed of mixing, which turn out to be useful for the statistics of return times and also to establish the weak-Bernoullicity of the map.

Lemma 3.1. For any $z \in X$, and for any $m$ such that $\xi_m(z)$ is maximal, the partition $\mathcal{U} = \{\xi_m(z), \xi_m(z)^c\}$ satisfies a property close to α-mixing, namely
$$\alpha'(N) = \sup_{j \in \mathbb{N}}\ \sup_{R \in \mathcal{U}_j,\ T^jR \subset U}\ \sup_{S \in \mathcal{U}_\infty} \left|\frac{\mu(R \cap T^{-N-j}S)}{\mu(R)} - \mu(S)\right| = O_L\big((N-m)^{1-\frac{1}{\alpha}}\big).$$
Proof. Let $z$ be a point of $X$ and $m$ be an integer such that $\xi_m(z)$ is maximal. Let $\mathcal{U}$ be the partition given by $\xi_m(z)$ and its complement, and $\mathcal{U}_j$ the refinement of $\mathcal{U}$. For $R \in \mathcal{U}_j$ such that $T^jR \subset U$, we have $R \in \sigma(\xi_{m+j})$ and $R$ is a union of maximal cylinders $V^k_{m+j} \in \xi_{m+j}$; choose $V \in \xi_{m+j}$ one of these maximal cylinders. For any $S \in T^{-(N+j)}\mathcal{B}$ there exists a set $W \in \mathcal{B}$ such that $S = T^{-(N+j)}W$. We then have
$$(*) := \mu(V \cap S) - \mu(V)\mu(S) = \int 1_V\, 1_W \circ T^{N+j}\, h\,d\lambda - \int \mu(V)\, h\, 1_W\,d\lambda = \int P^{N+j}[h(1_V - \mu(V))]\, 1_W\,d\lambda \le \big\|P^{N+j}[h(1_V - \mu(V))]\big\|_{L^1(\lambda)}.$$
By exploiting the fact that $V$ is maximal we continue the preceding bound as
$$(*) \le \big\|P^{N-m}[P^{j+m}(h1_V) - \mu(V)]\big\|_{L^1(\lambda)} + \big\|P^{N-m}[\mu(V)h - \mu(V)]\big\|_{L^1(\lambda)} \le 4a\,\Phi(N-m)\,\mu(V),$$
with $\Phi$ given by inequality (12), provided $P^{m+j}(h1_V) \in \mathcal{C}_*(a)$, which is the case by Lemma 3.2 below. We conclude the proof by summing over all the maximal cylinders of $R$. □

Lemma 3.2. For any maximal cylinder $V \in \xi_p$, $P^p(h1_V) \in \mathcal{C}_*(a)$.

Proof. We first set $f := P^p(h1_V)$ and $T_V^p : V \to X$ the restriction of $T^p$ to $V$. Since $T^p$ is injective over $V$ we can rewrite $f$ as
$$f(x) = h \circ T_V^{-p}(x)\, D_xT_V^{-p},$$
which in particular shows that $f$ is continuous. To prove that $f$ belongs to the cone of smooth functions $\mathcal{C}_*(a)$ we must verify the following four properties, which define the cone:
1. $f$ is continuous and positive; this is clear in our case.
2. $f$ is decreasing. Since $h \in \mathcal{C}_*(a)$, $h$ decreases. In addition, $T_V^{-p}$ is increasing and concave, therefore $h \circ T_V^{-p}$ and $DT_V^{-p}$ decrease.
3. $x \mapsto x^{\alpha+1}f(x)$ increases. Since $T_V^{-p} : X \to V$ is increasing, an equivalent statement is that $(T^pu)^{\alpha+1}h(u)\frac{1}{D_uT^p}$ is increasing with $u \in V$. Observing that $\big(\frac{T^pu}{u}\big)^{\alpha+1}\frac{1}{D_uT^p}$ increases over $V \in \xi_p$ (which is true for $p = 1$, and the general case is proved by induction), and that $u \mapsto u^{\alpha+1}h(u)$ increases, we obtain the result.
4. $f(x) \le ax^{-\alpha}\int f$. Since $f$ is continuous, there exists $v \in V$ such that
$$\int f = f(T^pv) = h(v)\frac{1}{D_vT^p}.$$
The distortion estimate (11) for $u \in V \in \xi_p$ gives $\frac{D_vT^p}{D_uT^p} \le \Delta$. Moreover since $h$ decreases, inequality (9) yields
$$\frac{h(u)}{h(v)} \le \frac{h(a_{\omega_1+1})}{h(a_{\omega_1})} \le \Big(\frac{a_{\omega_1}}{a_{\omega_1+1}}\Big)^{\alpha+1} \le 4.$$
As a consequence, we get for $u = T_V^{-p}x$,
$$f(x) = h(u)\frac{1}{D_uT^p} \le 4\Delta\, h(v)\frac{1}{D_vT^p} \le ax^{-\alpha}\int f,$$
because $x \le 1$ and $4\Delta \le a$. □

We finally prove that the countable partition $\xi$, and therefore the two-element one, is weakly Bernoulli.

Theorem 3.3. The partition $\xi$ is weakly Bernoulli for $(X, T, \mu)$ with speed $\beta(n) = O_L(n^{1-1/\alpha})$.

Proof. We begin by recalling the following result by Hofbauer and Keller [HK82], which allows us to bound $\beta(n)$ as
$$\beta(n) \le \sup_{m \in \mathbb{N}} \sum_{R \in \xi_m} \big\|P^{n+m}((1_R - \mu(R))h)\big\|_{L^1_\lambda}. \qquad (13)$$
Then it will be enough to bound $\|P^{m+n}((1_R - \mu(R))h)\|$ with $R \in \xi_m$. Let $p_R \ge m$ be the integer for which $R \in \xi_{p_R}$ is maximal. We decompose the sum over all the cylinders $R \in \xi_m$ into two blocks. Let $M(m,n)$ be the set of maximal cylinders for $p_R < m + n/2$. When $R \in M(m,n)$, the same computation performed in Lemma 3.1 gives
$$\big\|P^{m+n}((1_R - \mu(R))h)\big\|_{L^1_\lambda} \le \mu(R)\,O_L\big((m + n - p_R)^{1-1/\alpha}\big) = \mu(R)\,O_L(n^{1-1/\alpha}).$$
Then the set of cylinders which do not belong to $M(m,n)$ is exactly $T^{-m+1}[0, a_{n/2}]$, whose measure is equal to
$$\mu(T^{-m+1}[0, a_{n/2}]) = \mu([0, a_{n/2}]) = \int_0^{a_{n/2}} h(x)\,dx = O(n^{1-1/\alpha}).$$
This proves the theorem. □
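The objects used throughout this section, namely the map $T$, the left preimages $a_n$ of 1 and the easy bound $a_n/a_{n+1} \le 2$, can be explored numerically. The following is a minimal sketch under stated assumptions: the bisection solver, the function names and the choice $\alpha = 0.5$ are illustrative and are not taken from the text. It uses only that the left branch $x \mapsto x(1 + 2^\alpha x^\alpha) = x(1 + (2x)^\alpha)$ is increasing on $[0, 1/2]$ and maps $1/2$ to $1$.

```python
def left_branch(x, alpha):
    """Left branch x (1 + 2^alpha x^alpha) of the map, written as x (1 + (2x)^alpha)."""
    return x * (1.0 + (2.0 * x) ** alpha)


def T(x, alpha):
    """The interval map: left branch on [0, 1/2), 2x - 1 on [1/2, 1]."""
    if x < 0.5:
        return left_branch(x, alpha)
    return 2.0 * x - 1.0


def left_preimage(y, alpha, iters=200):
    """Solve left_branch(x) = y for x in [0, 1/2] by bisection (branch is increasing)."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if left_branch(mid, alpha) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def partition_points(alpha, n_max):
    """Return [a_0, ..., a_{n_max}] with a_0 = 1, a_1 = 1/2, left_branch(a_{n+1}) = a_n."""
    a = [1.0, 0.5]
    while len(a) <= n_max:
        a.append(left_preimage(a[-1], alpha))
    return a[: n_max + 1]


if __name__ == "__main__":
    alpha = 0.5  # hypothetical parameter value, any 0 < alpha < 1 works
    a = partition_points(alpha, 20)
    for n in range(20):
        # ratio a_n / a_{n+1}; the text states it never exceeds 2
        print(n, a[n], a[n] / a[n + 1])
```

The bound on the ratio follows from $(2x)^\alpha \le 1$ on $[0, 1/2]$, which gives $a_n = a_{n+1}(1 + (2a_{n+1})^\alpha) \le 2a_{n+1}$; the printed ratios start at 2 and decrease towards 1 as the points accumulate at the indifferent fixed point.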
3.2. Statistics of return times. We now come back to the study of return times; the first step will be the estimation of the quantities involved in the error term given by Lemma 2.4.

Lemma 3.4. There exists a constant $B$ such that for any $k$ and $C \in \xi_k$ with $T^{-k}C \cap C \ne \emptyset$,
$$\sup P^k1_C \le Bk^{-1-1/\alpha}. \qquad (14)$$
Proof. Let $k_0$ be such that $D_{a_{k_0}}T \le 2$, and put $r = D_{a_{k_0}}T > 1$. Let $C = [\omega_1 \ldots \omega_k]$ be a $k$-cylinder such that $T^{-k}C \cap C \ne \emptyset$. This implies that $\omega_k\omega_1$ is admissible. We want to estimate $\sup P^k1_C = 1/\inf_C DT^k$. If $\omega_j \le k_0$ for all $j = 1, \ldots, k$, then $DT^k \ge r^k$. Otherwise, take $j$ such that $\omega_j = \max_{1 \le i \le k}\omega_i$. Either $j = 1$, and consequently $\omega_k = 0$, or $\omega_{j-1} = 0$. In the last case we have
$$\inf_C DT^k \ge \inf_{[\omega_1 \ldots \omega_{j-1}]} DT^{j-1}\ \inf_{[\omega_j \ldots \omega_k]} DT^{k+1-j} \ge \Delta^{-1} \inf_{[\omega_j \ldots \omega_k\omega_1 \ldots \omega_{j-1}]} DT^k.$$
By this argument we are led to consider the worst case, which is given by a cylinder of type $C = [(k-1)(k-2)\ldots 0]$. Since $T^kC = [0,1]$, the distortion formula (11) and the estimate $a_k \le ck^{-1/\alpha}$ given by Lemma 3.2 in [LSV97] yield $D_{a_k}T^k = c'k^{1+1/\alpha}$ for some constant $c'$, from which the lemma follows by taking $B \ge 1/c'$ such that $Bk^{-1-1/\alpha} \ge r^{-k}$ for all $k > 0$. □

We now introduce the first return time of a cylinder $U$, which plays a crucial role in [Hir95]. We define it as $\tau(U) = \inf\{\tau_U(x) \mid x \in U\}$.

Lemma 3.5. The quantity $a_N(U)$ defined in Lemma 2.4 for $U = \xi_m(z)$ is bounded by
$$a_N(U) \le \frac{4\Delta}{\inf h}\,\frac{N\mu(U)}{\lambda(T^{\tau(U)}U)}.$$
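Proof. We suppose $N > \tau(U)$, otherwise $a_N(U) = 0$. Set $\tau = \tau(U)$; for each $z$ in $X$ we have
$$a_N(U) \le \sum_{j=1}^{N} \frac{1}{\mu(U)}\mu(T^{-j}U \cap U) = \sum_{j=\tau}^{N} \frac{1}{\mu(U)} \int P^j(1_Uh)\,1_U\,d\lambda \le N \sup_{j = \tau, \ldots, N}\ \sup_U \frac{P^j(1_Uh)}{h}.$$
Now the distortion (11) and the regularity of the density (9) give
$$P^\tau(1_Uh) = h \circ T_U^{-\tau}\, DT_U^{-\tau}\, 1_{T^\tau U} \le 4\Delta\,\frac{1}{\lambda(T^\tau U)} \int_{T^\tau U} h \circ T_U^{-\tau}\, DT_U^{-\tau}\, 1_{T^\tau U}\,d\lambda \le 4\Delta\,\frac{\mu(U)}{\lambda(T^\tau U)}.$$
Finally, $Ph = h$ and since $P$ is a positive operator one has
$$\frac{P^j(1_Uh)}{h} \le \sup P^\tau(1_Uh)\,\frac{P^{j-\tau}1}{h} \le \sup P^\tau(1_Uh)\,\frac{P^{j-\tau}\frac{h}{\inf h}}{h} \le \frac{4\Delta}{\inf h}\,\frac{\mu(U)}{\lambda(T^\tau U)}. \qquad □$$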
The next step will be to show that $\tau(U)$ is almost everywhere big enough to give a good upper bound for $a_N(U)$ in the previous lemma. We first define in full generality the local rate of return for cylinders. As a matter of fact, we would like to point out that the first return time of a set into itself allows one to define and compute an interesting dimension-like characteristic which we called the Afraimovich–Pesin dimension in [PSV98].

Definition 3.1. Let $\zeta$ be a partition of $X$. Denote by $\zeta_n(x)$ the element of $\zeta \vee T^{-1}\zeta \vee \cdots \vee T^{-n+1}\zeta$ which contains $x \in X$. We then define the local (lower and upper) rate of return for cylinders as
$$\underline{R}_\zeta(x) = \liminf_{n \to \infty} \frac{\tau(\zeta_n(x))}{n}, \qquad \overline{R}_\zeta(x) = \limsup_{n \to \infty} \frac{\tau(\zeta_n(x))}{n}.$$

Proposition 3.6. (i) Both $\underline{R}_\zeta$ and $\overline{R}_\zeta$ are sub-invariant, namely $\underline{R}_\zeta \circ T \le \underline{R}_\zeta$ and $\overline{R}_\zeta \circ T \le \overline{R}_\zeta$. (ii) Assume that $\zeta$ is a measurable partition of the measurable space $X$, and $\mu$ is an invariant probability; then $\underline{R}_\zeta$ and $\overline{R}_\zeta$ are $\mu$-a.e. invariant. (iii) Moreover, whenever $\mu$ is ergodic, $\underline{R}_\zeta$ and $\overline{R}_\zeta$ are $\mu$-a.e. constant.

Proof. (i) Let $x \in X$. For each integer $n > 0$, we have
$$\zeta_n(x) \cap T^k\zeta_n(x) \ne \emptyset \implies \zeta_{n-1}(Tx) \cap T^k\zeta_{n-1}(Tx) \ne \emptyset,$$
which implies that $\tau(\zeta_{n-1}(Tx)) \le \tau(\zeta_n(x))$. (ii) is a standard property of sub-invariant functions on finite measure spaces, and then (iii) follows immediately. □

We state the following result, which can be improved for some subshifts$^4$.

Proposition 3.7. For $\mu$-almost every $z \in X$, the lower rate of return for cylinders is equal to 1: $\underline{R}_\xi(z) = 1$.

Proof. Let $1/2 < \delta < 1$. Consider the set (we denote $N_m(z) = \tau(\xi_m(z))$) $L_m := \{z \in A_0 \mid N_m(z) \le \delta m\}$. If
$$\sum_{m=1}^{\infty} \mu(L_m) < \infty, \qquad (15)$$
then the Borel–Cantelli Lemma ensures that for almost every $z \in A_0$, we have $N_m > \delta m$, up to finitely many $m$. By sending $\delta$ to 1 we show that $\underline{R}_\xi(z) \ge 1$ almost everywhere on $A_0$. Then by the preceding proposition (iii) and the ergodicity of the measure $\mu$, we get the same bound almost everywhere. The equality finally follows since each time that $T^{m-1}z \in A_0$, we have $T^m\xi_m(z) = X$, hence $N_m(z) \le m$.

4 We have in fact the following: Theorem. Suppose that $\mu$ is a Gibbs state for the Hölder potential $\varphi$ on some irreducible and aperiodic subshift of finite type with finite alphabet $\zeta$; then $\mu$-almost everywhere, $\underline{R}_\zeta = \overline{R}_\zeta = 1$. Proof. An easier version of Proposition 3.7 gives the lower bound, while the uniform upper bound $\tau(C_n) \le n + n_0$ holds, where $C_n$ is a cylinder of order $n$, and $n_0$ is the lowest power for which the transition matrix becomes strictly positive.

In order to prove (15) it is sufficient to consider the Lebesgue measure instead of $\mu$ (since the density $h$ is bounded from below). We have
$$\lambda(L_m) = \sum_{k=1}^{[m/2]} \lambda(N_m = k) + \sum_{k=[m/2]+1}^{\delta m} \lambda(N_m = k) = (1) + (2).$$
We now perform a detailed analysis of the sets appearing in the preceding formula.
(1): In this case, the cylinder $\xi_m(z)$ with $N_m = k$ must be of the form
$$\xi_m(z) = [\underbrace{(\omega_1 \ldots \omega_k)(\omega_1 \ldots \omega_k) \ldots (\omega_1 \ldots \omega_k)}_{[m/k]} \ldots].$$
Therefore when $k \le [m/2]$, the cylinder is completely determined by its first $k$ symbols. Put $C = [\omega_1 \ldots \omega_k]$; we say that a cylinder of length $k$ is admissible (admis) when it is the beginning of a cylinder of $L_m$ with $N_m = k$. Then we can bound (1) by
$$(1) \le \sum_{k=1}^{[m/2]} \sum_{C\ \mathrm{admis}} \lambda\big(C \cap T^{-k}C \cap \cdots \cap T^{-[m/k-1]k}C\big) \le \sum_{k=1}^{[m/2]} \sum_{C\ \mathrm{admis}} \Big(\sup_C P^k1_C\Big)^{[m/k]-1}\lambda(C) \le \sum_{k=1}^{[m/2]} \sup_{C\ \mathrm{admis}} \Big(\sup_C P^k1_C\Big)^{[m/k]-1}.$$
We first remark that, $T^k$ being injective over $C \in \xi_k$, we have $P^k1_C \le 1/\inf_{A_0} DT^k \le 1/2$. We split the last sum into three pieces by fixing $k_0$ as the smallest integer for which $k_0^{1+\frac{1}{\alpha}} \ge eB$, where $B$ is the constant in Lemma 3.4. We then have, by using Lemma 3.4,
$$(1) \le \sum_{k=1}^{k_0} (1/2)^{[m/k]-1} + \sum_{k=k_0}^{m/3} \big(Bk^{-1-1/\alpha}\big)^{m/k-2} + \sum_{k=m/3}^{[m/2]} Bk^{-1-1/\alpha}.$$
The first and the last sum are easily shown to be summable with respect to $m$. For the second term, we observe that the terms $(Bk^{-1-1/\alpha})^{m/k-2}$ are increasing in $k$ when $k$ is bigger than $k_0$. A direct estimate of the sum is $B3^{1/\alpha}m^{-1/\alpha}$, which is summable with respect to $m$.
(2): In this case, the cylinder $\xi_m(z)$ has the form
$$\xi_m(z) = [\underbrace{\omega_1 \ldots \omega_{m-k}}_{m-k}\ \underbrace{\omega_{m-k+1} \ldots \omega_k}_{2k-m}\ \underbrace{\omega_1 \ldots \omega_{m-k}}_{m-k}].$$
As before, we set $C = [\omega_1 \ldots \omega_{m-k}]$, and we say that $C$ is admissible (admis) when it is the beginning of a cylinder of $L_m$ with $N_m = k$,
$$(2) \le \sum_{k=[m/2]+1}^{\delta m} \sum_{C\ \mathrm{admis}} \lambda(C \cap T^{-k}C) \le \sum_{k=[m/2]+1}^{\delta m} \sup_{C\ \mathrm{admis}}\ \sup_C P^k1_C.$$
Let first $p = p(C) \ge m - k$ be such that $C \in \xi_p$ is maximal (i.e. $p(C)$ is the smallest $p$ for which $C \in \xi_p$). When $p < k$, since $1 \in \mathcal{C}_*(a)$, the inequality (12) and Lemma 3.4 give
$$\sup_C P^k1_C \le \sup_C P^p1_C\ \sup P^{k-p}1 \le a2^\alpha Bp^{-1-1/\alpha} \le a2^\alpha B(m-k)^{-1-1/\alpha}.$$
When $p \ge k$, $C \in \xi_k$ and $T^{-k}C \cap C \ne \emptyset$, we have $P^k1_C \le Bk^{-1-1/\alpha}$. But $k \ge m - k \ge (1-\delta)m$, and then the sum (15) is summable for any $\delta < 1$. □

We are now ready to state and prove the main theorems of this section.

Theorem 3.8. For $\mu$-almost every $z \in X$ and $\beta < \bar\beta(\alpha)$,
$$\sup_{t \ge 0} \Big|\mu_{\xi_m(z)}\Big(\tau_{\xi_m(z)} > \frac{t}{\mu(\xi_m(z))}\Big) - \exp(-t)\Big| = O\big(\mu(\xi_m(z))^\beta\big),$$
where the critical exponent is $\bar\beta(\alpha) = 1 - \alpha$.

Proof. Let $\varepsilon$ be a positive number. Let $z$ be a typical point for Proposition 3.7 and for the Shannon–McMillan–Breiman theorem. We want to apply Lemma 2.4. Let $m(\varepsilon)$ be such that for any $m > m(\varepsilon)$ we have $(1-\varepsilon)m \le \tau(\xi_m(z))$, $\mu(\xi_m(z)) \le \exp(-2mh_\mu/3)$ and also $\mu(\xi_{\varepsilon m}(T^{[(1-\varepsilon)m]}z)) \ge \exp(-(2[\varepsilon m])h_\mu)$. For the sake of simplicity, we put for any $m$, $U_m = \xi_m(z)$. For any $m > m(\varepsilon)$ such that $U_m$ is maximal, we have $(1-\varepsilon)m \le \tau(U_m) \le m$, and all the iterates $T^jU_m$ for $1 \le j < m$ are at a distance bigger than $a_m$ from the neutral fixed point (because $U_m$ is maximal). If $\tau(U_m) < m$ then the density stays bounded on the orbit $T^jU_m$ by $ba_m^{-\alpha}$, so we have
$$\lambda(T^{\tau(U_m)}U_m) \ge \frac{a_m^\alpha}{b}\,\mu(T^{\tau(U_m)}U_m) \ge \frac{a_m^\alpha}{b}\exp(-2\varepsilon mh_\mu).$$
On the other hand, when $\tau(U_m) = m$ we still get
$$\lambda(T^{\tau(U_m)}U_m) = 1 \ge \frac{a_m^\alpha}{b}\exp(-2\varepsilon mh_\mu).$$
Lemma 3.5 gives us the following estimate with $N = \mu(U_m)^{-\alpha+\varepsilon}$: $a_N(U_m) = O(\mu(U_m)^{1-\alpha-3\varepsilon})$.
Lemma 3.1 with $R = U_m$ gives us
$$b_N(U_m) = O_L\big((\mu(U_m)^{-\alpha+\varepsilon} - m)^{1-\frac{1}{\alpha}}\big) = O_L\big(\mu(U_m)^{(-\alpha+\varepsilon)(1-\frac{1}{\alpha})}\big).$$
We can then apply Lemma 2.4, which gives $c(U_m) \le a_N(U_m) + b_N(U_m) = O(\mu(U_m)^\beta)$ for $\beta \le 1 - \alpha - 3\varepsilon$ and $\beta \le 1 - \alpha - 2\varepsilon(1/\alpha - 1)$. We finally end up with
$$d(U_m) = O(\mu(U_m)^\beta) \qquad (16)$$
for any $\beta < 1 - \alpha$, since $\varepsilon$ is arbitrarily small, which concludes the proof by applying Theorem 2.1. □

Remark 3.9. The preceding theorem shows that the critical exponent $\bar\beta(\alpha)$ is smaller than 1. We point out that, by using Proposition 2.3, the power $\bar\beta$ cannot exceed 1.

Theorem 3.10. For $\mu$-almost every $z \in X$, we have for any $t \ge 0$, $K \ge 0$ and $\beta < \bar\beta(\alpha)$,
$$\Big|\mu_{\xi_m(z)}\big(N_{\xi_m(z)}(t) = K\big) - \frac{t^K}{K!}\exp(-t)\Big| = O\big(\mu(\xi_m(z))^{\beta/(K+1)}\big),$$
with the critical exponent $\bar\beta(\alpha) = 1 - \alpha$.

Proof. Let $z$ be a typical point satisfying the preceding theorem and $m$ such that $U_m = \xi_m(z)$ is maximal. By invoking the footnote of Theorem 2.6, it will be sufficient to use the weakened α-mixing condition $\alpha'(M) = O_L((M-m)^{1-\frac{1}{\alpha}})$ given by Lemma 3.1 to apply Theorem 2.6. Take $M = \mu(U_m)^{-\alpha}$; we thus find for $\beta < 1 - \alpha$, by the estimate (16) and Theorem 2.6, an error of the order
$$f(K, U_m) = \mathrm{const}\,[d(U_m) + \alpha'(M) + M\mu(U_m)] = O(\mu(U_m)^\beta).$$
By applying Theorem 2.8, the error for the probability to have $K$ successive visits is of the order $\mu(U_m)^{\beta/(K+1)}$ for all $\beta < 1 - \alpha$. □

4. Concluding Remarks

We conclude with a few observations. First, the proofs of the exponential-one law and the Poisson law given in Sect. 3 for a class of non-uniformly hyperbolic maps can be easily adapted, and are even easier, for all the cases quoted in the introduction, namely: Axiom A diffeomorphisms, transitive Markov chains, expanding maps of the interval with a spectral gap and in general all ϕ-mixing dynamical systems. For such systems, an estimate of the error can also be obtained: following the arguments of Theorems 3.8 and 3.10, one can easily see that the critical exponent $\bar\beta$ is equal to 1. This supports our beliefs that: (i) the error terms of type $\mu(U)^\beta$ could be optimal and (ii) the non-uniform hyperbolicity of the map is reflected in the critical exponent: in that case, in fact, it should be strictly smaller than one.

Acknowledgements. We would like to thank Viviane Baladi for a careful reading of a preliminary version of this work and Bernard Schmitt for useful discussions. B.S. acknowledges the ESF for support during the workshop "Probabilistic methods in non-hyperbolic dynamics" in Warwick.
References
[Coe97] Coelho, Z.: Asymptotic laws for symbolic dynamical systems. Lectures given in Temuco, Chile, 1997
[Col96] Collet, P.: Some ergodic properties of maps of the interval. In: Dynamical systems (Temuco, 1991/1992), Travaux en Cours, vol. 52, Paris: Hermann, 1996, pp. 55–91
[CG93] Collet, P. and Galves, A.: Statistics of close visits to the indifferent fixed point of an interval map. J. Stat. Phys. 72, no. 3-4, 459–478 (1993)
[CFS82] Cornfeld, I.P., Fomin, S.V. and Sinaĭ, Ya.G.: Ergodic theory. vol. 245, New York: Springer-Verlag, 1982
[GS97] Galves, A. and Schmitt, B.: Inequalities for hitting time in mixing dynamical systems. Random Comput. Dynam. 5, no. 4, 337–347 (1997)
[Hay98a] Haydn, N.: The distribution of the first return time for rational maps. 1998, USC
[Hay98b] Haydn, N.: Statistical properties of equilibrium states for rational maps. 1998, USC
[Hir93] Hirata, M.: Poisson law for Axiom A diffeomorphisms. Ergodic Theory Dynamical Systems 13, no. 3, 533–556 (1993)
[Hir95] Hirata, M.: Poisson law for the dynamical systems with the "self-mixing" conditions. In: Dynamical systems and chaos, Vol. 1 (Hachioji, 1994), River Edge, NJ: World Sci. Publishing, 1995, pp. 87–96
[HK82] Hofbauer, F. and Keller, G.: Ergodic properties of invariant measures for piecewise monotonic transformations. Math. Z. 180, no. 1, 119–140 (1982)
[LSV97] Liverani, C., Saussol, B. and Vaienti, S.: A probabilistic approach to intermittency. Ergodic Theory Dynamical Systems (1997). To appear
[PSV98] Penné, V., Saussol, B. and Vaienti, S.: Fractal and statistical characteristics of recurrence times. To appear in Journal de Physique (Paris), 1998, Proceedings of the conference "Disorder and Chaos" (Rome, 22–24th Sept. 1997), in honour of Giovanni Paladin
[Pit91] Pitskel, B.: Poisson limit law for Markov chains. Ergodic Theory Dynamical Systems 11, no. 3, 501–513 (1991)
[Sau98a] Saussol, B.: Absolutely continuous invariant measures for multidimensional expanding maps. 1998, Submitted
[Sau98b] Saussol, B.: Etude statistique de systèmes dynamiques dilatants. Ph.D. thesis, Université de Toulon et du Var, 1998
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 57 – 103 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Lifshitz Tails for Random Schrödinger Operators with Negative Singular Poisson Potential
Frédéric Klopp 1, Leonid Pastur 2,3
1 Département de Mathématique, Institut Galilée, U.M.R 7539 C.N.R.S, Université de Paris-Nord, Avenue J.-B. Clément, F-93430 Villetaneuse, France. E-mail: [email protected]
2 Département de Mathématique, Université Paris VII, 2, Place Jussieu, F-75005 Paris, France. E-mail: [email protected]
3 Mathematical Division, Institute for Low Temperature Physics, 47, Lenin's Ave., 310164, Kharka, Ukraine
Received: 18 November 1998 / Accepted: 9 March 1999
Abstract: We develop a method for the asymptotic study of the integrated density of states (IDS) $N(E)$ of a random Schrödinger operator with a non-positive (attractive) Poisson potential. The method is based on periodic approximations of the potential instead of the Dirichlet–Neumann bracketing used before. This allows us to derive more precise bounds for the rate of approximation of the IDS by the IDS of the respective periodic operators and to obtain rigorously, for the first time, the leading term of $\log N(E)$ as $E \to -\infty$ for the Poisson random potential with a singular single-site (impurity) potential, in particular for screened Coulomb impurities, dislocations, etc.

Contents
0. Introduction: Problems and History
1. The Assumptions and the Results
1.1 The integrated density of states
1.2 The asymptotics of the IDS
2. Periodic Approximations
2.1 A general approximation result
2.2 Application to the estimation of the integrated density of states
3. The Lower Bounds
3.1 The general case
3.2 The case when V is bounded from below
3.3 The case when V has power law singularities
4. The Upper Bounds
4.1 The general case
4.2 Proof of Proposition 4.1
4.3 The case when V has power law singularities
5. Appendix
5.1 The structure of the Poisson potential
5.2 An a-priori estimate on the density of states . . . . . . . . . . . . . . . . 94 5.3 Exponential decay estimates . . . . . . . . . . . . . . . . . . . . . . . . 96 5.4 Some useful facts about the single site potential Hamiltonian Hg . . . . . 97 0. Introduction: Problems and History The Integrated Density of States (IDS) is one of the simplest but quite important characteristics of the random Schrödinger operator. Among numerous problems related to the IDS, the problem of its asymptotic behavior near the edges of the spectrum is well known and studied. The results of these studies can be summarized as follows. One has to distinguish two types of spectral edges: stable and fluctuational (see e.g. [7,18,21]). The latter are special for shortly correlated random potentials. In the simplest case of the lower edge of the spectrum, they are determined by the absolute minimum of the potential since the spectrum in a neighborhood of this edge exists only because of the (arbitrarily large) fluctuations of the potential arbitrarily close to the minimum. By using the quantum mechanical terminology one can call these portions of the realization the potential wells. A heuristic derivation of the fluctuational asymptotics of the IDS was proposed by I.Lifshitz in the early 60’s [16,17]. The asymptotics is given by the probability to have a potential well whose ground state energy is close enough to the spectral edge. Since the probability of these realizations having the form of very broad or deep potential wells (and known also as optimal fluctuations) is usually exponentially small, one has to deal here with a version of the large deviation technique in the spectral context. In particular, to determine the asymptotic formula for the IDS one has to be able to give a rather detailed description of the statistics of these special realizations. This is why precise and explicit asymptotic formulae are known only for comparatively restricted classes of random potentials. 
One of the widely studied random potentials is the Poisson potential, which has the form

$$V_\omega(x) = \sum_j V(x - x_j), \tag{0.1}$$

where $\{x_j\}$ is the Poisson point field of density $\mu$ in $\mathbb{R}^d$ and $V(x)$, the one-site (or single site) potential, is a function decaying sufficiently fast at infinity. The Poisson potential is of considerable interest both in spectral theory and in the theoretical physics of disordered systems. It possesses a number of nontrivial asymptotic regimes, only part of which has been studied so far. One has to mention first the case of the nonnegative one-site potential $V(x)$ of compact support. In this case $E = 0$ is a fluctuational edge and, according to I. Lifshitz,

$$N(E) \simeq \exp(-\mathrm{const} \cdot E^{-d/2}), \quad E \to +0. \tag{0.2}$$

The right-hand side of this formula is just the probability to have a well (a region of $\mathbb{R}^d$ free of $x_j$'s) of width $L \simeq E^{-1/2}$; the latter relation is due to the uncertainty principle. In other words, the asymptotics of the IDS in this case is determined by an optimization procedure, balancing the quantum and the probabilistic components of the problem. This is why formula (0.2) is often called the quantum Lifshitz tail. Rigorous derivations of various versions of (0.2) (e.g. its logarithmic or even its double logarithmic forms) have required a number of rather sophisticated probabilistic and spectral techniques (see
Lifshitz Tails for Random Schrödinger Operators
e.g. [3,21,27,30,31,11] for results and references). Here and below we use the symbol "$\simeq$" to denote the asymptotic equivalence without indicating explicitly the order of the remainder and respective constants. Other asymptotic regimes of the IDS for the potential (0.1) correspond to the case when the one-site potential has a non-positive part, i.e. $\inf V(x) < 0$, so that the lower edge of the spectrum is $E = -\infty$. In this case one has to distinguish two asymptotic regimes, usually called quantum and classical. We will present the respective asymptotic formulae by using a version of Lifshitz's arguments adapted to this case. Recall the definition of the IDS. It is the limit as $L \to +\infty$ of the expectation of the normalized counting function of eigenvalues of the Schrödinger operator $H_{\Lambda_L}$, where $H_{\Lambda_L}$ is the restriction of $H_\omega = -\Delta + V_\omega$ to $L^2(\Lambda_L)$ (see e.g. [21,7]). Here $\Lambda_L$ is the cube in $\mathbb{R}^d$ of center zero and of side length $L$. The definition shows that the IDS can be regarded as the probability to find an eigenvalue of $H_{\Lambda_L}$ lying below a given energy $E$. For $E \to -\infty$ these eigenvalues are produced by very deep potential wells, created by large clusters of $k$ Poisson points $x_j$, confined to sufficiently small regions of the space, say a cube of the side length l 0,
(0.8)
where $\delta(x)$ is the Dirac delta-function. In this case, by using the special Markov processes technique, one can obtain (see [18]) an asymptotic of the form

$$\log N(E) = -2\sqrt{\frac{E}{E_0}}\,\log\sqrt{\frac{E}{E_0}}\,(1 + o(1)), \quad E \to -\infty,$$

where $E_0 = -g_0^2/4$. We see that formula (0.5) is only valid up to a factor 2. This difference is the result of a tunneling phenomenon related to the question of how close to one another $k$ potentials (0.8) should be in order to be regarded as the potential $k^*(k)\delta(x)$, i.e. the potential of the same shape as (0.8) and of amplitude $k^*(k)$. In other words, in this case, contrary to the classical case, the radius of the exponential decay of the single site ground state is much larger than the width of the single site potential. Thus the optimal cluster should be much smaller (i.e. its radius should tend to 0 sufficiently fast) in order to be modeled by a single site potential of some effective amplitude $k^*(k)$. We shall see below that in many interesting cases $l \simeq k^{-\alpha^*}$ for some $\alpha^* > 0$. Because of that, the factor $k^{-\alpha^* d k}$ in (0.3) will also contribute to the asymptotic formula of the IDS. The study of this phenomenon is one of the topics dealt with in the present paper. In this paper we study the case of the singular one-site potential (mainly with power-law singularities), following in essence the scheme outlined above. We find the precise form of $k^*(k)$ (see Theorems 1.4 and 1.6 below). This became possible owing to an improvement of one of the techniques in the field, based on approximations of the Schrödinger operator in the whole space by the operator with the same potential but defined in a finite box whose size is properly chosen as a function of energy. Previous versions of this technique were based on the so-called Neumann–Dirichlet bracketing, where boxes with Neumann and Dirichlet boundary conditions were used to construct the upper and the lower bounds for the IDS. The error in these bounds is of the order $O(L^{-1})$, where $L$ is the size of the box.
This precision is not sufficient to treat the quantum case. Therefore, we approximate the IDS of the random model by the IDS of some well chosen periodic Schrödinger operators and obtain much more precise bounds (see Sect. 2, and more precisely Lemmas 2.1 and 2.3). This method was proposed in [11] and has been used to solve several problems in the field ([10,12]). We obtain the once logarithmic versions of (0.4), i.e. (0.5) with explicit $g(E)$ and constants in front of $g(E) \log g(E)$ (see Theorems 1.5 and 1.6). Let us now give just one example of the results presented in Sect. 1.2; let $V$, the single site potential, be the 3-dimensional attractive screened Coulomb potential $V(x) = -e^{-|x|}/|x|$, widely used in semiconductor physics (see e.g. [4]). In this case we prove

$$\log N(E) = -2\sqrt{|E|}\,\log |E|\,(1 + o(1)), \quad E \to -\infty,$$

(see Theorem 1.6 and the discussion following it). The role of the IDS in the spectral theory and theoretical physics of disordered systems is well known and appreciated (see [2,7,21,4,18]). However, there is one more reason to
study this quantity. Since the pioneering papers of I. Lifshitz, the study of the IDS has been providing a first important step in the study and in the understanding of more complex properties and quantities in a respective version of the strong localization regime. In particular, the IDS is the first moment (see formula (1.4)) of the spectral kernel of the Schrödinger operator. From the mathematical physics point of view, the IDS determines the equilibrium properties of disordered systems, i.e. of the ideal gas of elementary excitation (electrons, phonons, spin waves, etc.) in the random environment. The study of the kinetic properties of this gas and of the interaction effects requires knowledge of higher moments of the spectral kernel, the second moments first of all. The knowledge of these correlators allows one to answer a number of relevant questions concerning the existence and the nature of the localization and behavior of related quantities. In particular, in a subsequent paper ([13]), we use the technique developed in this paper in order to find the large deficit asymptotic behavior of the inter-band light absorption coefficient. The paper is organized as follows. In Sect. 1, we define the framework of our study and give a brief account of our results. We also present several examples at the end of the section. In Sect. 2, we first prove the basic relation (1.4) expressing the IDS in terms of the spectral family of the random Schrödinger operator. Then we construct our main technical tool, the periodic approximations of the IDS. Sections 3 and 4 are devoted to the derivation of the lower and upper bounds for the IDS using the periodic approximations. Section 5 contains auxiliary facts on the statistics of the Poisson field, on random Schrödinger operators and on the structure of the ground state of the Schrödinger operator with a singular single site potential. 1. 
The Assumptions and the Results

Let $V : \mathbb{R}^d \to \mathbb{R}$ be a function such that $V = V_1 + V_2$, where

H1 For some $C > 0$ and any $x \in \mathbb{R}^d$, $|V_1(x)| \le C e^{-|x|/C}$.
H2 The function $V_2$ is compactly supported and satisfies $V_2 \in L^p(\mathbb{R}^d)$, where $p > p(d)$ and $p(d) = 2$ if $d \le 2$ and $p(d) = d/2$ if $d \ge 3$.
H3 For some set $E$ of positive measure, $V|_E < 0$.

Define the random potential

$$V_\omega(x) = \int_{\mathbb{R}^d} V(x - y)\, m(\omega, dy), \tag{1.1}$$

where $m(\omega, dy)$ is a random Poisson measure of concentration $\mu$. $V_\omega$ is an ergodic random field on $\mathbb{R}^d$. Consider the random Schrödinger operator

$$H_\omega = -\Delta + V_\omega. \tag{1.2}$$

One has

Theorem 1.1 ([7]). Under the assumptions made above, $H_\omega$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^d)$ $\omega$-almost surely.

Under our assumptions on $V$, we know that the almost sure spectrum of $H_\omega$ is $\Sigma = \mathbb{R}$ ([21,7]).
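As an illustration of the definition (1.1), the following minimal sketch samples a Poisson point field in dimension one and evaluates the resulting potential. The single site potential `single_site`, the window half-width, and the seed are hypothetical choices made for the example; they are not taken from the paper.

```python
import math
import random

def sample_poisson_points(mu, half_width, rng):
    """Sample a 1D Poisson point field of density mu on [-half_width, half_width].

    Gaps between consecutive points of a 1D Poisson process are
    independent exponential variables with rate mu.
    """
    points = []
    x = -half_width + rng.expovariate(mu)
    while x <= half_width:
        points.append(x)
        x += rng.expovariate(mu)
    return points

def single_site(x):
    """Hypothetical attractive single site potential with |V(x)| <= exp(-|x|) (cf. H1)."""
    return -math.exp(-abs(x))

def poisson_potential(x, points):
    """V_omega(x) = sum_j V(x - x_j), restricted to the sampled points."""
    return sum(single_site(x - xj) for xj in points)

rng = random.Random(0)
points = sample_poisson_points(mu=1.0, half_width=50.0, rng=rng)
values = [poisson_potential(x, points) for x in (-10.0, 0.0, 10.0)]
```

Truncating the point field to a finite window is only reasonable here because the chosen single site potential decays exponentially, so far-away points contribute negligibly.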
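1.1. The integrated density of states. Let $\Lambda$ be a cube centered at 0 in $\mathbb{R}^d$. We define $H^D_{\omega,\Lambda}$ to be the Dirichlet restriction of $H_\omega$ to $\Lambda$. Pick $E \in \mathbb{R}$. Consider the quantity

$$N_{\omega,\Lambda}(E) = \frac{1}{\mathrm{Vol}(\Lambda)}\,\#\{\text{eigenvalues of } H^D_{\omega,\Lambda} \text{ smaller than or equal to } E\}. \tag{1.3}$$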
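To make the normalized counting function (1.3) concrete, here is a small sketch for the free case in dimension one, where the potential is set to zero (an assumption made only for illustration). The Dirichlet eigenvalues of $-d^2/dx^2$ on an interval of length $L$ are $(\pi n/L)^2$, $n \ge 1$, and the counting function approaches $\sqrt{E}/\pi$ as $L$ grows.

```python
import math

def dirichlet_counting_1d(E, L):
    """Normalized counting function (1.3) for -d^2/dx^2 on an interval of length L.

    Dirichlet eigenvalues are (pi*n/L)^2, n = 1, 2, ...; the potential is
    taken to be zero, which is an illustrative assumption.
    """
    if E <= 0:
        return 0.0
    n_max = math.floor(L * math.sqrt(E) / math.pi)
    return n_max / L

def free_ids_1d(E):
    """Limit of the counting function as L -> infinity: sqrt(E)/pi for E > 0."""
    return math.sqrt(E) / math.pi if E > 0 else 0.0

# The finite-volume value differs from the limit by at most 1/L.
gap = abs(dirichlet_counting_1d(4.0, 1000.0) - free_ids_1d(4.0))
```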
Then one has

Theorem 1.2 ([7]). Under the assumptions made above, there exists a non-random, non-decreasing, non-negative, right continuous function $N(E)$ such that, $\omega$-almost surely, for all $E \in \mathbb{R}$, $E$ a continuity point of $N$, $N_{\omega,\Lambda}(E)$ converges to $N(E)$ as $\Lambda$ exhausts $\mathbb{R}^d$.

$N(E)$ is the integrated density of states (IDS) of $H_\omega$. As $N$ is non-decreasing, one can define its distributional derivative $dN$. It is a positive measure and is supported on the almost sure spectrum of $H_\omega$ (see [7,21]). One has the following result:

Theorem 1.3. For $\varphi \in C_0^\infty(\mathbb{R})$, we have

$$(\varphi, dN) = \mathbb{E}\big(\mathrm{tr}(\mathbf{1}_{C(0,1)}\,\varphi(H_\omega)\,\mathbf{1}_{C(0,1)})\big), \tag{1.4}$$

where $C(0, 1)$ is the cube of center 0 and side length 1.

Formula (1.4) is well known under more restrictive assumptions on the potential $V_\omega$, i.e. for less singular single site potentials $V$ (see [21]).

1.2. The asymptotics of the IDS. To describe the asymptotic behavior of $N(E)$ near $-\infty$, we will need to define an auxiliary operator. For $g \in \mathbb{R}$, define

$$H(g) = -\Delta + gV. \tag{1.5}$$
Under our assumptions on $V$, $V$ is relatively form bounded with respect to $-\Delta$ with relative bound 0. Hence, $H(g)$ admits a unique self-adjoint extension. Let $\sigma(H(g))$ denote its spectrum. It is lower semi-bounded. The infimum of $\sigma(H(g))$, i.e. the ground state energy of $H(g)$, will be denoted by $E(g)$. Let $\varphi_g$ be the respective ground state, i.e. the unique positive normalized eigenfunction of $H(g)$ associated to the energy $E(g)$ ([22,26]). In the sequel it will often be more convenient to work with $E_-(g) = -E(g)$ instead of $E(g)$ itself. From assumption H3, one easily infers that $E_-(g) \to +\infty$ when $g \to +\infty$ (see Sect. 5). Moreover, $E_-$ is strictly increasing in a neighborhood of $+\infty$. Let $g$ be an inverse of $E_-$ in a neighborhood of $+\infty$; $g$ is strictly increasing. In the regular (classical) case, it was found that $g$ governs the first term of the asymptotics of $\log N$ (cf. [21,20]). In the singular (quantum) case, the singular set of $V$ will play a special part in the asymptotics. To measure this role, we introduce the notion of asymptotic ground state, i.e.

Definition 1.1. Let $g \in (1, +\infty) \mapsto \psi_g \in H^1(\mathbb{R}^d)$. We will say that $\psi_g$ is an asymptotic ground state if and only if
• the vector $\psi_g$ is normalized,
• $\exists g_0 > 1$, $l_0 > 0$ such that $\forall g \ge g_0$, $\mathrm{supp}\,\psi_g \subset C(0, l_0)$ (where $C(x, l)$ denotes the cube of center $x$ and side length $l$),
• $$\frac{|\langle (H(g) - E(g))\psi_g, \psi_g\rangle|}{|E(g)|} \to 0 \text{ as } g \to +\infty. \tag{1.6}$$
In Lemma 5.6, we prove the existence of an asymptotic ground state. For $a \in \mathbb{R}^d$, we define the translation $\tau_a$ by $\tau_a V(x) = V(x - a)$ and we define

$$A_{\psi_g} = \Big\{\alpha > 0;\ \lim_{g \to +\infty}\ \sup_{|a| \le g^{-\alpha}} \frac{g\,|\langle (\tau_a V - V)\psi_g, \psi_g\rangle|}{E_-(g)} = 0\Big\}.$$

If $A_{\psi_g} \neq \emptyset$, then we define $\alpha^*(\psi_g) := \inf A_{\psi_g}$. Moreover, we define $A$ to be the union of all $A_{\psi_g}$. By Lemma 5.6, we know that $A \neq \emptyset$. We define

$$\alpha^* := \inf A. \tag{1.7}$$
Roughly speaking, the dependence of the radius of the exponential decay of the single site potential ground state on the coupling constant $g$ is of the form $g^{-\alpha^*}$. This determines the characteristic size $l$ of the optimal cluster. Then we prove

Theorem 1.4. Under the assumptions H1, H2 and H3, for sufficiently large $E$, one has

$$-(1 + \alpha^* d)\,g(E) \log g(E)\,(1 + o(1)) \le \log N(-E) \le -g(E) \log g(E)\,(1 + o(1)). \tag{1.8}$$

One may complain that Theorem 1.4 is somewhat imprecise in that it only gives a two sided estimate. But, as we will see below, this is in some way unavoidable, as the true asymptotic depends not only on $g$ but also on the singular set of the negative part of $V$. More precisely, as can be seen from Theorem 1.6 (and from the proof of Theorem 1.4), the asymptotics of the IDS depends on the way the eigenfunction associated to the lowest eigenvalue of the operator $-\Delta + gV$ concentrates near the singular set of the negative part of $V$ as $g$ becomes large. In general the correction also depends on the geometry of the singular set. For example, if the singular set is a segment (e.g. a dislocation), using the techniques developed in Sect. 3, one can see that neither the lower nor the upper bound given by Theorem 1.4 is sharp. The two sided estimate (1.8) can be made more precise if we know more about $V$. The first and simplest example we give is the case when $V$ is bounded from below, reaches its minimum at a single point, say 0, and is continuous near 0. Then one easily proves that $\alpha^* = 0$, and the upper and lower bounds in (1.8) coalesce to give (0.7). We will now give other results that, we think, cover most of the physically relevant examples. Let $v_-$ be the essential infimum of $V$ and assume that $V$ is bounded from below, say

H1' $-\infty < v_- < 0$.

It is easy to show that $g(E) \sim E/|v_-|$ when $E \to +\infty$ (see Lemma 5.5). We obtain

Theorem 1.5. Under the assumptions H1, H2 and H1', one has

$$\log N(-E) \underset{E \to +\infty}{\sim} -g(E) \log g(E) \underset{E \to +\infty}{\sim} \frac{E}{v_-} \log E. \tag{1.9}$$

Here and in the rest of the paper, $a \sim b$ will always mean $a = b(1 + o(1))$.
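The two expressions in (1.9) agree only to leading order, and the convergence is logarithmically slow. The sketch below compares them numerically, using the leading-order relation $g(E) \sim E/|v_-|$ as if it were exact (an assumption made for the example); the value $v_- = -2$ is hypothetical.

```python
import math

def g_bounded_below(E, v_minus):
    """Leading-order inverse coupling g(E) ~ E/|v_-| (cf. Lemma 5.5), with v_minus < 0."""
    return E / abs(v_minus)

def log_ids_middle(E, v_minus):
    """Middle expression of (1.9): -g(E) log g(E)."""
    g = g_bounded_below(E, v_minus)
    return -g * math.log(g)

def log_ids_right(E, v_minus):
    """Right-hand expression of (1.9): (E / v_-) log E."""
    return (E / v_minus) * math.log(E)

# With g(E) = E/|v_-| exactly, the ratio equals 1 - log|v_-| / log E.
ratios = {E: log_ids_middle(E, -2.0) / log_ids_right(E, -2.0)
          for E in (1e2, 1e4, 1e8)}
```

The ratio tends to 1 as $E \to +\infty$, but only at rate $1/\log E$, which is why the statement is phrased with the factor $(1 + o(1))$.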
This result extends (0.7), removing the continuity assumption near the minimum. Consider now a slightly more singular example. In this case, $d = 2$ and $V_2(x) = \log_- |x|$, $x \in \mathbb{R}^2$, where, for $a \ge 0$, $\log_- a = \min\{\log a, 0\}$. Using the inequality $\log_- |x| + \log R \le \log_- (R|x|) \le \log |x|$ for $0 < R < 1$ and the variational principle for the ground state energy, one shows that, in this case,

$$E_-(g) \underset{g \to +\infty}{\sim} \frac{g}{2} \log g, \quad\text{hence}\quad g(E) \underset{E \to +\infty}{\sim} \frac{2E}{\log E}.$$

One also shows that $\alpha^* = 0$ for this single site potential. Therefore, Theorem 1.4 tells us that

$$\log N(-E) \underset{E \to +\infty}{\sim} -g(E) \log g(E) \underset{E \to +\infty}{\sim} -2E.$$
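The passage from $E_-(g) \sim (g/2)\log g$ to $g(E) \sim 2E/\log E$ can be checked numerically. The sketch below treats the leading-order relation as an exact model (an assumption) and inverts it by bisection; the function names are hypothetical.

```python
import math

def e_minus_model(g):
    """Model ground-state depth E_-(g) = (g/2) log g (leading order only)."""
    return 0.5 * g * math.log(g)

def g_of_E(E, tol=1e-12):
    """Invert e_minus_model by bisection on g >= e, where the model is increasing."""
    lo, hi = math.e, max(2.0 * math.e, 4.0 * E)
    while e_minus_model(hi) < E:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if e_minus_model(mid) < E:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * hi:
            break
    return 0.5 * (lo + hi)

E = 1e6
g = g_of_E(E)
exact_product = g * math.log(g)          # equals 2E within the model
ratio = g / (2.0 * E / math.log(E))      # tends to 1 slowly as E grows
```

Within this model $g(E)\log g(E) = 2E$ holds exactly, while the approximation $g(E) \approx 2E/\log E$ carries a relative error of order $\log\log E/\log E$.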
Hence, the asymptotic formula (0.5) is also valid for certain mildly singular potentials. Another case where one can find an asymptotic for $\log N$ is when $V$ has only power law singularities. Let $q$ be a positive integer and pick $q$ positive exponents $(\nu_i)_{i=1,\dots,q}$ and $q$ functions $(h_i(\theta))_{i=1,\dots,q}$ continuous on the sphere $S^{d-1}$. For $1 \le i \le q$, consider the potentials

$$V_i(x) = \frac{h_i(\theta(x))}{|x|^{\nu_i}}, \quad\text{where } \theta(x) = \frac{x}{|x|}.$$
Assume that
(1.10)
( 0 < νi
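1 and $\rho > 0$ (depending only on $d$) such that, for any $\varphi \in C_0^\infty(\mathbb{R})$, for $k \in \mathbb{N}^*$ and $n \in \mathbb{N}^*$, we have

$$|\mathbb{E}((\varphi, dN_{\omega,n})) - (\varphi, dN)| \le C e^{C\mu}\, n^{-(1-\alpha)k}\, e^{Ck \log k} \sup_{\substack{x \in \mathbb{R} \\ 0 \le j \le k+\rho}} (|x| + C)^{\rho+k} \left|\frac{d^j \varphi}{dx^j}(x)\right|. \tag{2.3}$$

Remark 2.1. The proof shows that the constant $C$ obtained in Lemma 2.1 is independent of the concentration $\mu$ of the Poisson process.

Before starting the proof of Lemma 2.1, let us recall some basic facts about the density of states of a periodic Schrödinger operator. Let $\mathbb{T}^*_n = \mathbb{R}^d/((2\pi/n)\mathbb{Z}^d)$. For $\theta \in \mathbb{T}^*_n$, we can consider $H_{\omega,n,\theta}$, the unique self-adjoint operator defined by the quadratic form $\|\nabla\varphi\|^2 + \langle V_{\omega,n}\varphi, \varphi\rangle$ on $L^2_{\theta,\mathrm{loc}}$ (i.e. the set of $L^2_{\mathrm{loc}}$-functions that satisfy the boundary conditions $\varphi(x + n\gamma) = e^{in\theta\gamma}\varphi(x)$ for $\gamma \in \mathbb{Z}^d$ and $x \in \mathbb{R}^d$; this set is endowed with the usual scalar product on $L^2(C(0, n))$). We know that $H_{\omega,n,\theta}$ has a compact resolvent (see [23]); hence its spectrum is discrete. Let us denote its eigenvalues by

$$E_0(\theta, \omega, n) \le E_1(\theta, \omega, n) \le \cdots \le E_n(\theta, \omega, n) \le \dots$$

The functions $(\theta \mapsto E_n(\theta, \omega, n))_{n \in \mathbb{N}}$ are Lipschitz continuous in $\theta$ and one has $E_n(\theta, \omega, n) \to +\infty$ as $n \to +\infty$ (uniformly in $\theta$). One proves that the IDS of $H_{\omega,n}$ satisfies

$$N_{\omega,n}(E) = \sum_{n \in \mathbb{N}} \frac{1}{(2\pi)^d} \int_{\{\theta \in \mathbb{T}^*_n;\ E_n(\theta,\omega,n) \le E\}} d\theta \tag{2.4}$$

and

$$(\varphi, dN_{\omega,n}) = \frac{1}{\mathrm{Vol}(C(0, n))}\,\mathrm{tr}(\mathbf{1}_{C(0,n)}\,\varphi(H_{\omega,n})\,\mathbf{1}_{C(0,n)})$$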
for any $\varphi \in C_0^\infty(\mathbb{R})$ (see [24,23] or [28]). The rest of this subsection will be devoted to the proof of Lemma 2.1. To prove this result we will need the formula given in Theorem 1.3. The proof of this formula will be given at the end of the section. Let us proceed with the proof of Lemma 2.1. Fix $\varphi \in C_0^\infty(\mathbb{R})$. We want to estimate $|\mathbb{E}((\varphi, dN_{\omega,n})) - (\varphi, dN)|$. The computation done in the proof of Theorem 5.1 in [11] gives

$$\mathbb{E}((\varphi, dN_{\omega,n})) = \mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}\,\varphi(H_{\omega,n})\,\mathbf{1}_{C(0,1)})).$$

Here we used the fact that the Poisson process is $\mathbb{Z}^d$-homogeneous. Notice that our regularity assumptions on $V_{\omega,n}$ are weaker than the ones used in [11] and [22]. Indeed, $\omega$-almost surely, $V_{\omega,n}$ is only relatively form bounded with respect to $-\Delta$ with relative bound 0 (see Lemma 5.1). Nevertheless, the proofs of the relevant results in these papers extend easily to the case of relatively form bounded perturbations of $-\Delta$. Now we only have to estimate $|\mathbb{E}(\mathrm{tr}(\mathbf{1}_{C(0,1)}(\varphi(H_{\omega,n}) - \varphi(H_\omega))))|$. This is done with an integral representation of $\varphi(H)$ using an almost analytic extension of $\varphi$. Pick $\varphi \in \mathcal{S}(\mathbb{R})$ (the Schwartz space of rapidly decreasing functions). An almost analytic extension of $\varphi$ is a function $\tilde\varphi : \mathbb{C} \to \mathbb{C}$ satisfying
1. For z ∈ R, ϕ(z) ˜ = ϕ(z). 2. supp(ϕ) ˜ ⊂ {z ∈ C; |Im(z)| < 1}. 3. ϕ˜ ∈ S({z ∈ C; |Im(z)| < 1}). ∂ ϕ˜ (x + iy) · |y|−n (for 0 < |y| < 1) is bounded in 4. The family of functions x 7 → ∂z S(R) for any n ∈ N. Such extensions always exist for ϕ ∈ S (see [19]) and, one has the following estimates: there exists C > 0 such that for n ≥ 0, α ≥ 0, β ≥ 0, one has ∂β ∂ ϕ˜ (x + iy) sup sup x α β |y|−n · ∂x ∂z 0 d/2, q integer. Then, by [5] and [8], we know that, for any n and ω ∈ , the following formula holds: Z i ∂ ϕ˜ (2.6) (z) · (i + Hω,n )−q (z − Hω,n )−1 dz ∧ dz. ϕ(Hω,n ) = 2π C ∂z For q > d/2, 1C(0,1) (i + Hω,n )−q (z − Hω,n )−1 is trace-class and we have tr 1C(0,1) ϕ(Hω,n )1C(0,1) Z ∂ ϕ˜ i (z) · tr 1C(0,1) (i + Hω,n )−q (z − Hω,n )−1 1C(0,1) dz ∧ dz. = 2π C ∂z
(2.7)
By Lemma 5.1 and Sect. B.12 in [26], we know that ω-almost surely Hω is essential self-adjoint on C0∞ (Rd ) and that 1C(0,1) (i + Hω )−q (z − Hω )−1 is trace-class. Hence, (2.6) and (2.7) also hold for Hω . We are now going to use Lemma 5.1. We pick p0 ∈ (p(d), p) and b = 1, and compute |E(tr(1C(0,1) (ϕ(Hω,n ) − ϕ(Hω ))))| Z ∂ ϕ˜ 1 (z) E ≤
1C(0,1) (i + Hω,n )−q (z − Hω,n )−1 2π C ∂z
−(i + Hω )−q (z − Hω )−1 1C(0,1) dxdy (2.8) tr X 1 Z ∂ ϕ˜ (z) E 1{ω; V ∈ (α,1,p0 )\ (α,1,p0 )} K(z, ω) dxdy, ≤ ω k k−1 2π C ∂z k≥1
where
K(z, ω) = 1C(0,1) (i +Hω,n )−q (z − Hω,n )−1 −(i + Hω )−q (z−Hω )−1 1C(0,1) . tr
Here k · ktr denotes the trace-class norm. We need to estimate 1C(0,1) ((i + Hω,n )−q (z − Hω,n )−1 − (i + Hω )−q (z − Hω )−1 )1C(0,1)
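under the assumption $V_\omega \in \Omega_k(\alpha, 1, p')$. Therefore, we imitate the method used in [11]. We write

$$\big\|\mathbf{1}_{C(0,1)}\big((i + H_{\omega,n})^{-q}(z - H_{\omega,n})^{-1} - (i + H_\omega)^{-q}(z - H_\omega)^{-1}\big)\mathbf{1}_{C(0,1)}\big\|_{\mathrm{tr}} \le A + B, \tag{2.9}$$

where

$$A = \big\|\mathbf{1}_{C(0,1)}\big((z - H_{\omega,n})^{-1} - (z - H_\omega)^{-1}\big)(i + H_\omega)^{-q}\,\mathbf{1}_{C(0,1)}\big\|_{\mathrm{tr}} = \big\|\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}(V_{\omega,n} - V_\omega)(z - H_\omega)^{-1}(i + H_\omega)^{-q}\,\mathbf{1}_{C(0,1)}\big\|_{\mathrm{tr}}$$

and

$$B = \big\|\mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}\big((i + H_{\omega,n})^{-q} - (i + H_\omega)^{-q}\big)\mathbf{1}_{C(0,1)}\big\|_{\mathrm{tr}} = \Big\|\sum_{l=1}^{q-1} \mathbf{1}_{C(0,1)}(z - H_{\omega,n})^{-1}(i + H_{\omega,n})^{l-q}(V_{\omega,n} - V_\omega)(i + H_\omega)^{-l}\,\mathbf{1}_{C(0,1)}\Big\|_{\mathrm{tr}}.$$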
The estimates for A and B being obtained essentially in the same way, we will write the details for A only. Pick χ ∈ C0∞ (Rd ) such that 0 ≤ χ ≤ 1, χ ≡ 1 on C(0, 1/2) and X χγ4 ≡ 1. Then, we have χ ≡ 0 outside of C(0, 3/2) such that γ ∈Zd
A≤
X
1C(0,1) (z−Hω,n )−1 χγ 0
L(H −1 ,L2 )
γ 0 ∈Zd ,β∈Zd
· χγ 0 (z − Hω )−1 χβ
χγ 0 (Vω,n −Vω )χγ 0
L(L2 ,H 1 )
L(H 1 ,H −1 )
·
· χβ3 (i + Hω )−q 1C(0,1) . tr
Here χβ (·) = χ(·−β). By Lemma 5.4 applied to Hω and to Hω,n , for Vω ∈ k (α, 1, p0 ), we know that, for some > 0, ρ ≥ 1 and C > 0, for all γ 0 ∈ Zd and β ∈ Zd , we have 0 1−α
e−·η(z,K)|γ |
≤ C ,
1C(0,1) (z − Hω,n )−1 χγ 0 L(H −1 ,L2 ) η(z, K)ρ
C 0 1−α 1−α
e−·η(z,K)||γ | −|β| | ,
χγ 0 (z − Hω )−1 χβ 2 1 ≤ ρ L(L ,H ) η(z, K) 1−α
e−·η(z,K)|β|
3 −q ,
χβ (i + Hω ) 1C(0,1) ≤ C tr η(z, K)ρ
|Imz| and ρ depends only on d |z| + K + C and q. By Lemma 5.3 and the growth estimate known for Vω,n and Vω (when Vω ∈ k (α, 1, p0 )), we know that, for γ 0 ∈ Zd , we have
0 α
χγ 0 (Vω,n − Vω )χγ 0 L(H 1 ,H −1 ) ≤ C(1 + |γ |) . p
where K = Ck p−p0 , η(z, K) = η(z, K, 1) =
On the other hand, due to the exponential decay of V1 and the compact support of V2 , there exists C > 0 such that, for |γ 0 | ≤ n/2, we have
−n/C
χγ 0 (Vω,n − Vω )χγ 0 . L(H 1 ,H −1 ) ≤ Ce Hence, if we multiply these estimates and sum the result in γ and γ 0 , we get that A≤C
n(d+3)α −·η(z,K)n1−α e . η(z, K)ρ
(2.10)
For B, we get an estimate analogous to (2.10); only the constants change. Plugging this into (2.9) and (2.8), summing over k using the estimate (5.1) for the probability of c (α, 1, p 0 ), we get k (d+3)α
|E(tr(1C(0,1) (ϕ(Hω,n ) − ϕ(Hω ))))| ≤ Cn
1 2π
Z ∂ ϕ˜ (z) S(z, n)dxdy, (2.11) C ∂z
where S(z, n) :=
X (Cµ)k k!
k≥1
ρ p |Imz| − n1−α p p−p 0 + C |z| + Ck e |z|+Ck p−p0 +C . |Imz|
As suppϕ˜ ⊂ {z ∈ C; |Im(z)| < 1}, using the notation z = x + iy, for l ∈ N∗ , we estimate S(z, n) for |y| < 1 by S(z, n) ≤
X (Cµ)k k≥1
k!
≤ n−l(1−α) = n−l(1−α)
ρ p |y| − p p−p 0 (|x| + C)k e (|x|+C)k p−p0 |y| |x| + C |y| |x| + C |y|
ρ+l X ρ+l
k≥1
p (Cµ)k p−p ρ k 0 k!
n1−α
|y|n1−α (|x| + C)
l
1−α
e
|y| − n(|x|+C) k
−
p p−p 0
fl (t), (2.12)
where t =
fl (t) :=
|y|n1−α and |x| + C X (Cµ)k k≥1
k!
k
p ρ p−p 0
tle
p 0 −tk p −p
≤
X (Cµ)k k≥1
k!
k
p ρ p−p 0
lk
p p−p 0
l e−l
l l X (Cµ)k (ρ+l) p l l p−p 0 ≤ k ≤ e−l e−l L!(eCµe − eCµ ), k! k≥1
(2.13)
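where $L$ denotes the smallest integer larger than $(\rho + l)\,\dfrac{p}{p - p'}$. Here we used Stirling's formula and the identity

$$\sum_{l \ge 1} \frac{1}{l!} \sum_{k \ge 1} \frac{(C\mu)^k}{k!}\,k^l = \sum_{k \ge 1} \frac{(C\mu)^k}{k!} \sum_{l \ge 1} \frac{k^l}{l!} = \sum_{k \ge 1} \frac{(C\mu)^k}{k!}\,(e^k - 1) = e^{C\mu e} - e^{C\mu}.$$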
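The last equality in the identity above can be verified numerically. In this sketch the variable `x` stands for the product $C\mu$; the truncation at 200 terms is an arbitrary choice that is more than sufficient for moderate `x`.

```python
import math

def lhs(x, terms=200):
    """Partial sum of sum_{k>=1} x^k/k! * (e^k - 1)."""
    total = 0.0
    term = 1.0  # holds x^k / k!, updated incrementally
    for k in range(1, terms + 1):
        term *= x / k
        total += term * (math.exp(k) - 1.0)
    return total

def rhs(x):
    """Closed form e^{x e} - e^{x}."""
    return math.exp(x * math.e) - math.exp(x)

x = 1.5  # plays the role of C * mu
difference = abs(lhs(x) - rhs(x))
```

The closed form follows from the two exponential series $\sum_{k \ge 1} (xe)^k/k! = e^{xe} - 1$ and $\sum_{k \ge 1} x^k/k! = e^x - 1$.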
Hence, for some $C > 0$ (independent of $l$, $n$, $z$ and $\mu$), we have

$$S(z, n) \le C e^{C\mu}\, n^{-(1-\alpha)l}\, e^{Cl \log l} \left(\frac{|x| + C}{|y|}\right)^{\rho+l}. \tag{2.14}$$
Plugging this into (2.11) and using estimate (2.5) for almost analytic extensions, we get (2.3) and end the proof of Lemma 2.1. u t Remark 2.2. In the proof of Lemma 2.1, we only have used the fact that the space of realization could be written as = ∪n n , where the probability P (c n ) was decreasing fast enough, and that in these subsets, we had uniform estimates on the quantity we want to compute. Obviously, to get such a decomposition, one does not need to have a Poisson potential but only a homogeneous random field with suitable bounds at infinity. This idea is applicable to many other random Schrödinger operators. Proof of Theorem 1.3. By [21], we know that, for φ ∈ C0∞ (R) and for almost every ω, we have 1 D tr(φ(Hω,3 )) 3→Rd Vol(3) 1 D ))) E(tr(φ(Hω,3 = lim d Vol(3) 3→R X 1 D E(tr(1C(γ ,1) φ(Hω,3 )1C(γ ,1) )), = lim 3→Rd Vol(3) d
hφ, dNi = lim
(2.15)
γ ∈3∩Z
D is defined in Sect. 1.1. On the other hand, as H is homogeneous, for any where Hω,3 ω d γ ∈ Z , we have
E(tr(1C(γ ,1) φ(Hω )1C(γ ,1) )) = E(tr(1C(0,1) φ(Hω )1C(0,1) )). So that E(tr(1C(0,1) φ(Hω )1C(0,1) )) =
1 Vol(3)
X
E(tr(1C(γ ,1) φ(Hω )1C(γ ,1) )).
γ ∈3∩Zd
Hence, by (2.15), to get Eq. (1.4), we just have to prove that 1 X D E tr(1C(γ ,1) φ(Hω,3 )1C(γ ,1) )−tr(1C(γ ,1) φ(Hω )1C(γ ,1) ) = 0. lim 3→Rd Vol(3) γ ∈3∩Zd (2.16)
To shorten the notations, let 3L = C(0, L) be the cube of center 0 and side length L. Pick > 0. To prove (2.16), one could try to prove that, for any γ ∈ Zd , one has D )1 )−tr(1 φ(H )1 ) lim A(γ , 3L ) := lim E tr(1C(γ ,1) φ(Hω,3 ω C(γ ,1) C(γ ,1) C(γ ,1) L L→∞
3L →Rd
=0 (2.17) in some uniform way. However, this may be difficult as, because of the Dirichlet boundary conditions, A(γ , 3L ) may have some non-uniform behavior for γ close to the boundary of 3L . So we are going to split the difficulty into two parts; on the one hand, we will show that, for any > 0 and for γ ∈ 3L \ 3(1−)L , we will show that A(γ , 3L ) stays bounded (uniformly in L, ω). As there are only very few such terms, this part of the sum tends to 0. On the other hand, for γ ∈ 3(1−)L , we will show that A(γ , 3L ) tends to 0 uniformly in L and ω. As in the proof of Lemma 2.1, uniformity in ω cannot be achieved over the whole set of realization, but only over subsets whose measure we control (see Lemma 5.1). This will suffice. D ). Hence, To estimate A(γ , 3L ), we will use (2.6) to compute φ(Hω ) and φ(Hω,3 L we see that we only need to estimate the following expression (cf. (2.9)): D D )−1 (i + Hω,3 )−q 1C(γ ,1) ) tr(1C(γ ,1) (z − Hω,3 L L
− tr(1C(γ ,1) (z − Hω )−1 (i + Hω )−q 1C(γ ,1) ),
where q > d/2 is an even integer. To do this we need a way to compare the resolvent of the Dirichlet problem on 3L with the resolvent of Hω over the whole space. We use the ◦
following resolvent identity (see e.g. [1]): let χ ∈ C02 (3L ) then, for Imz 6= 0, we have D D )−1 χ + (z − Hω,3 )−1 [−1, χ](z − Hω )−1 . χ (z − Hω )−1 = (z − Hω,3 L L
(2.18)
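Here $\alpha$, $k$ and $p'$ are taken as in Lemma 2.1. We will use the following lemma.

Lemma 2.2. Assume that $q > d$ is an even integer and that $\omega$ is such that $V_\omega \in \Omega_k(\alpha, 1, p')$. Then, there exists $C_q > 0$ such that, for any $\Lambda' \subset \Lambda_L$ ($\Lambda'$ measurable) and $\lambda \ge 1$, we have

$$\|(i + H^D_{\omega,\Lambda_L})^{-1}\|_{T_q} \le C_q\, k^{\frac{pd}{2q(p-p')}}\, L^{\frac{d}{q}(1 + \frac{\alpha}{2})}, \tag{2.19}$$

$$\|(\lambda - \Delta^D_{\Lambda_L})^{-1/2}\,\mathbf{1}_{\Lambda'}\|_{T_q} \le C_q\, |\Lambda'|^{1/q}. \tag{2.20}$$

Here, $p$ and $p(d)$ are defined in H2, $p'$ satisfies $p(d) < p' < p$, $|\Lambda'|$ denotes the measure of $\Lambda'$ and $\|\cdot\|_{T_q} = (\mathrm{tr}\,|\cdot|^q)^{1/q}$.

Let us postpone the proof of this result to finish the proof of Theorem 1.3. Pick $\varepsilon > 0$ to be chosen precisely later. Pick $\omega$ such that $V_\omega \in \Omega_k(\alpha, 1, p')$. Hence, by Lemma 5.1, we can decompose $V_\omega = V_\omega^{(i)} + V_\omega^{(r)}$, where
• $V_\omega^{(i)} \in L^{p'}_{\mathrm{loc,unif}}(\Lambda_L)$ and $\sup_{x \in \Lambda_L} \|V_\omega^{(i)}\|_{L^{p'}(C(x,1))} \le 1$,
• for $x \in \Lambda_L$, $|V_\omega^{(r)}(x)| \le C k^{\frac{p}{p-p'}} L^\alpha$.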
For 0 0, if εL ≥ 1, then p
1−α /(Ck p/(p−p 0 ) )
kR(L)kT1 ≤ Ck p−p0 e−(L)
.
Plugging this into (2.6), we get that, for some C > 0, D )1C(γ ,1) ) − tr(1C(γ ,1) φ(Hω )1C(γ ,1) )| ≤ |tr(1C(γ ,1) φ(Hω,3 L p
1−α /(Ck p/(p−p 0 ) )
≤ Ck p−p0 e−(L)
.
(2.24)
Taking the expectation of (2.24), summing in k and using (5.1) in the same way as in (2.14), we get that D )1C(γ ,1) ) − tr(1C(γ ,1) φ(Hω )1C(γ ,1) )| ≤ C(L)−(1−α) . E |tr(1C(γ ,1) φ(Hω,3 L (2.25) Then, by (2.22) and (2.25), we have that 1 Vol(3L )
X γ ∈3L ∩Zd
D E tr(1C(γ ,1) φ(Hω,3 )1C(γ ,1) ) = L =
1 Vol(3L )
X
E tr(1C(γ ,1) φ(Hω )1C(γ ,1) ) + Q(L),
γ ∈3L ∩Zd
where 1
α(1+ d(q−1) 2q )
|Q(L)| ≤ C(L)−(1−α) + C q L
+ CL−1 .
If we choose β = α(q + d(q − 1)/2), set = L−β and pick 0 < α < 1/3 small enough so that 1 − β > 1/2, we get that Q(L) → 0 as L → +∞. This completes the proof of Theorem 1.3. u t Remark 2.3. Remark 2.2 applies also for the proof of Theorem 1.3. Proof of Lemma 2.2. Under our assumptions on ω, we have Vω ∈ k (α, 1, p0 ). Hence, (i) (r) by Lemma 5.1, we can decompose Vω = Vω + Vω , where p0
(i)
• Vω(i) ∈ Lloc,unif (3L ) and supx∈3L kVω kLp0 (C(x,1)) ≤ 1, (r)
p
• for x ∈ 3L , |Vω (x)| ≤ Ck p−p0 Lα .
For ϕ ∈ C0∞ (3L ), this yields D i r ϕ, ϕi = h−1D hHω,3 3L ϕ, ϕi + hVω ϕ, ϕi + hVω ϕ, ϕi L p 1 p−p 0 Lα + 1 kϕk2 . ≥ h−1D 3L ϕ, ϕi − C k 2
(2.26)
Equation (2.26) and the variational principle for eigenvalues immediately imply that D )≥ λj (Hω,3 L
p 1 p−p 0 Lα + 1 , ) − C k λj (−1D 3L 2
where λj (H ) denotes the j th eigenvalue of H (ordered increasingly counting multiplicity). Hence q
D )−1 kTq = k(i + Hω,3 L
X
≤
j ∈N
X j ∈N
1
D )]2 1 + [λj (Hω,3 L
q/2
1+ p
(2.27)
0
p−p Lα +1) λj (−1D 3 ) 0. There exists β > 0 and Eν > 0 such that, for E > Eν and for n ≥ E β , one has ν
ν
E(Nω,n (−E − 1)) − e−E ≤ N(−E) ≤ E(Nω,n (−E + 1)) + e−E .
(2.29)
Proof of Lemma 2.3. Pick ν > 0 arbitrary. By Eq. (5.13), the a-priori estimate on N given in Lemma 5.2, we know that, for some τ > 0, N(−E τ ) ≤
1 −E ν e . 4
(2.30)
Hence we just have to estimate N(−E) − N(−E τ ). Therefore introduce two functions ϕ± defined by ϕ± = 1[−E τ ∓ 1 ,−E± 1 ] ∗ ϕ0 , 2
2
C0∞ (R)
is a non-negative Gevrey class function of Gevrey exponent ρ > 1 where ϕ0 ∈ such that ϕ0 ≡ 1 on [− 41 , 41 ] and suppϕ0 ⊂ [− 21 , 21 ]. The functions ϕ± are then Gevrey class of Gevrey exponent ρ and suppϕ± ⊂ [−E τ − 1, −E + 1] (see e.g. [6]). Then one has hϕ− , dNi ≤ N(−E) − N(−E τ ) ≤ hϕ+ , dNi.
(2.31)
Using the Gevrey estimates for the derivatives and the estimates on the support of ϕ± , by Lemma 2.1, we have that, for some C > 0, and for all n ∈ N∗ and all k ∈ N∗ , |hϕ± , dNi − E(hϕ± , dNω,n i)| ≤ Cn−(1−α)k eCk log k (E τ + C)ρ+k (ρ + k)η(ρ+k) . (2.32) We optimize the right-hand side of (2.32) in k to get 1 1−α (E τ +C)) η+C
|hϕ± , dNi − E(hϕ± , dNω,n i)| ≤ e−(η+C)(n
.
Hence, for some β > 0, if n ≥ E β , we have |hϕ± , dNi − E(hϕ± , dNω,n i)| ≤
1 −E ν . e 4
(2.33)
Thus E(Nω,n (−E − 1) − Nω,n (−E τ + 1)) ≤ E(hϕ− , dNω,n i) ≤ E(hϕ+ , dNω,n i) ≤ E(Nω,n (−E + 1) − Nω,n (−E τ − 1)). (2.34) Using (5.14) to estimate Nω,n (−E τ ± 1) and summing Eqs. (2.30), (2.31) and (2.34), we end the proof of Lemma 2.3. u t Remark 2.4. Notice that we could have estimated N (−E) with E(Nω,n (−E ± )) (for small). The price to pay to get an error of the same size as in (2.29) would have been to take n of size E β −ζ for some ζ > 0. This was used in [12] to get precise asymptotics for N at high energy for a different model. 3. The Lower Bounds In this section, we will prove the asymptotic lower bound on the approximated density of states dNω,n defined in Sect. 2.1 in the different cases considered in the introduction. 3.1. The general case. We prove the following general bound Proposition 3.1. Under the assumptions H1, H 2 and H 3, there exist β0 > 0 such that, for any β > β0 and n = [E β ], we have log E(Nω,n (−E − 1)) ≥ −(1 + α ∗ d)g(E) log g(E)(1 + o(1)), E → +∞. (3.1) Here [·] denotes the integer part of ·. Proof. The strategy used to prove the lower bound is quite obvious: we construct a normalized vector ϕ such that h(H + Vω )ϕ, ϕi ≤ −E − 1, this with a sufficiently large probability. The right candidate will be an asymptotic ground state for H (g) for g chosen properly (see Sect. 1.2). Pick n = [E β ] and l = [| log E|β ]. Pick ρ > 1 large, 0 < ε < 1 small, 0 < α < 1 and 1 < k large. Let ψg be an asymptotic ground state for H (g) such that α ∗ (ψg ) < α ∗ (1+ε) (see Sect. 1.2). Define k,E = 1k,E ∩ 2E ,
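where

$$\Omega^1_{k,E} = \{\omega : k(1 + 2\varepsilon) \ge m(\omega, C(0, k^{-\alpha^*(1+\varepsilon)})) = m(\omega, C(0, l)) \ge k(1 + \varepsilon)\}, \tag{3.2}$$

$$\Omega^2_E = \{\omega : \forall \gamma \in r_0\mathbb{Z}^d,\ m(\omega, C(\gamma, r_0)) < E^\rho(|\gamma|^\alpha + 1)\}. \tag{3.3}$$

Here $r_0$ is chosen such that $\mathrm{supp}\,V_2 \subset C(0, r_0)$. We bound the probability of $\Omega^1_{k,E}$ from below by the probability that $m(\omega, C(0, k^{-\alpha^*(1+\varepsilon)})) = k(1 + \varepsilon)$ and that $m(\omega, C(0, l) \setminus C(0, k^{-\alpha^*(1+\varepsilon)})) = 0$. Using (5.2) to bound the probability of $\Omega^2_E$ from below, for $E$ sufficiently large, we get that the probability of $\Omega_{k,E}$ is estimated by

$$P(\Omega_{k,E}) \ge e^{-C\mu l^d}\,\frac{1}{C}\,\frac{(\mu k^{-d\alpha^*(1+\varepsilon)})^{k(1+\varepsilon)}}{\Gamma(k(1 + \varepsilon))} - \frac{C}{\Gamma(E^\rho/2)}, \tag{3.4}$$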
where C > 0 is a constant independent of l, k and ε, and 0 is the Euler 0-function. For ω ∈ k,E , we have (i) (e) )ψk , ψk i + hVω,n ψk , ψk i, h(−1 + Vω,n )ψk , ψk i = h(−1 + Vω,n
where Vω(i) =
Z C(0,l)
V (x − y)m(ω, dy) and Vω(e) =
Z Rd \C(0,l)
V (x − y)m(ω, dy), (i,e)
i.e. they are the parts of Vω with centers in C(0, l) or outside of C(0, l), and Vω,n is built from these in the same way as Vω,n is from Vω (see Eq. (2.2)). As V2 is of compact support, as V1 is exponentially decaying and as ω ∈ 2E , using the support properties of asymptotic ground states (see Definition 1.1), we get that β /C
(e) ψk , ψk i| ≤ CE ρ e−| log E| |hVω,n
.
(3.5)
∗
On the other hand, if we set m(ω) = m(ω, ∗B(0, k −α (1+ε) )) and define (xi (ω))i to be the points supporting m(ω, dx) in B(0, k −α (1+ε) ), we have h(−1 + Vω(i) )ψk , ψk i = h(−1 + m(ω)V )ψk , ψk i +
m(ω) X
h(τxi (ω) V − V )ψk , ψk i
i=1
= h(−1 + kV )ψk , ψk i +
m(ω) X m(ω) − k hkV ψk , ψk i + h(τxi (ω) V − V )ψk , ψk i k i=1
≤ E(k) + o(E(k)) + ε(E(k) + o(E(k))) +
m(ω) X
|h(τxi (ω) V − V )ψk , ψk i|.
i=1
(3.6) We used the fact that ψk is an asymptotic ground state and the fact that ω ∈ 1k,E .
As $m(\omega)\le k(1+2\varepsilon)$ and $\alpha^*(\psi_g)<\alpha^*(1+\varepsilon)$, the definition of $\alpha^*(\psi_g)$ tells us that
\[ \frac{1}{E_-(k)}\sum_{i=1}^{m(\omega)} \langle(\tau_{x_i(\omega)}V - V)\psi_k,\psi_k\rangle \to 0 \quad\text{as } k\to+\infty. \]
Plugging all this into (3.6), we get
\[ \langle(-\Delta+V^{(i)}_\omega)\psi_k,\psi_k\rangle \le E(k)(1+o(1)+\varepsilon) \quad\text{as } k\to+\infty. \tag{3.7} \]
If we now choose
\[ k = g(E)(1+\varepsilon), \tag{3.8} \]
then, for sufficiently large $k$, we get
\[ \langle(-\Delta+V^{(i)}_\omega)\psi_k,\psi_k\rangle \le -(E+1). \tag{3.9} \]
But, as an asymptotic ground state, $\psi_k$ vanishes outside some fixed cube independent of $k$; hence for $E$ sufficiently large, it vanishes in a neighborhood of the boundary of the cube $C(0,n)$ and can be continued so as to satisfy any quasi-periodic boundary conditions on $C(0,n)$ (see Sect. 2.1). This implies that there exists $C>0$ independent of $n$ and $E$ such that for $\omega\in\Omega_{k,E}$ and $k$ given by (3.8), we have
\[ N_{\omega,n}(-E-1) \ge n^{-d}/C. \tag{3.10} \]
Taking into account the probability estimate (3.4), if $\rho$ is large enough and as $k\ge E^{\delta}$ for some $\delta>0$ by (5.30), we get
\[ \mathbb{E}(N_{\omega,n}(-E-1)) \ge E^{-\beta d}\,e^{-C\mu l^d}\,\frac{1}{C}\,\frac{(\mu k^{-d\alpha^*(1+\varepsilon)})^{k(1+\varepsilon)}}{\Gamma(k(1+\varepsilon))}. \]
So that, as $E$ tends to $+\infty$, we get
\[ \log[\mathbb{E}(N_{\omega,n}(-E-1))] \ge -(1+\varepsilon+d\alpha^*(1+\varepsilon))\,g(E)\log g(E)\,(1+8\varepsilon). \]
As we can choose $\varepsilon$ as small as we please, this ends the proof of Proposition 3.1. □

By Lemma 5.5, we know that, for some $C>0$ and sufficiently large $E$, $g(E)\le CE$. Thus, if $\beta>1$ is large enough and $\nu$ in Eq. (2.29) satisfies $\nu>1$, Lemma 2.3 and Proposition 3.1 immediately imply

Proposition 3.2. Under the assumptions H1, H2 and H3, we have
\[ \log N(-E) \ge -(1+\alpha^* d)\,g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \]
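The exponent $1+\alpha^* d$ in Proposition 3.1 comes from the Poisson factor in (3.4). The following is only a rough numerical sketch of that asymptotic: the function name and the sample values of $\mu$, $\alpha$ and $d$ are assumptions made for the illustration and do not come from the paper. It evaluates $\log[(\mu k^{-d\alpha})^k/\Gamma(k)]$ divided by $k\log k$, which should approach $-(1+\alpha d)$ slowly, with an error of order $1/\log k$.

```python
import math

def poisson_factor_ratio(k, alpha, d, mu=1.0):
    """Return log[(mu * k**(-d*alpha))**k / Gamma(k)] / (k * log k).

    Illustrative helper (hypothetical name): alpha plays the role of
    alpha^* and the small parameter epsilon is set to zero.
    """
    log_numerator = k * (math.log(mu) - d * alpha * math.log(k))
    log_gamma = math.lgamma(k)  # log Gamma(k), stable for large k
    return (log_numerator - log_gamma) / (k * math.log(k))

if __name__ == "__main__":
    alpha, d = 0.5, 3  # sample values, chosen only for the example
    for k in (1e3, 1e6, 1e12):
        print(k, poisson_factor_ratio(k, alpha, d), -(1 + alpha * d))
```

The slow $1/\log k$ convergence is consistent with the $(1+o(1))$ factor in (3.1).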
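3.2. The case when V is bounded from below. We now assume that $V$ is bounded from below, i.e. that $V_-$ is bounded. We then prove

Proposition 3.3. Under the assumptions H1, H2 and H1', there exists $\beta_0>0$ such that, for any $\beta>\beta_0$ and $n=[E^{\beta}]$, we have
\[ \log \mathbb{E}(N_{\omega,n}(-E-1)) \ge -g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \tag{3.11} \]

Proof. The strategy of the proof is the one used in the proof of Proposition 3.1. Recall that $v_-$ is the essential infimum of $V$. Then, for any $\varepsilon>0$, we can find $\chi\in C_0^\infty(\mathbb{R}^d)$ such that
\[ \int_{\mathbb{R}^d} V(x)\chi^2(x)\,dx \le v_- + \varepsilon/2 \quad\text{and}\quad \int_{\mathbb{R}^d}\chi^2(x)\,dx = 1. \]
Recall that $\tau_a\chi(x) = \chi(x-a)$. As $V\in L^1(\mathbb{R}^d)$, there exists $\delta>0$ such that
\[ \sup_{|a|\le\delta}\int_{\mathbb{R}^d} |V(x)|\cdot|\tau_a\chi^2(x)-\chi^2(x)|\,dx \le \varepsilon/2. \]
Pick now $n$ and $l$ as in the proof of Proposition 3.1; pick $k$ large and define $\Omega_{\delta,E} = \Omega^1_{\delta,E}\cap\Omega^2_E$, where $\Omega^2_E$ is defined in (3.3) and (see (3.2))
\[ \Omega^1_{\delta,E} = \{\omega;\ k(1+2\varepsilon) \ge m(\omega,C(0,\delta)) = m(\omega,C(0,l)) \ge k(1+\varepsilon)\}. \]
Using (5.2), we get that the probability of $\Omega_{\delta,E}$ is estimated by
\[ P(\Omega_{\delta,E}) \ge \frac{1}{C}\,e^{-C\mu l^d}\,\frac{(\mu\delta)^{kd(1+\varepsilon)}}{\Gamma(k(1+\varepsilon))} - \frac{C}{\Gamma(E^{\rho}/2)}. \tag{3.12} \]
Pick $\omega\in\Omega_{\delta,E}$. An argument similar to that used in the proof of Proposition 3.1, in which $k^{-\alpha^*(1+\varepsilon)}$ is replaced by $\delta$, yields (cf. (3.6))
\[ \begin{aligned} \langle(-\Delta+V_{\omega,n})\chi,\chi\rangle &= \|\nabla\chi\|^2 + \sum_{i=1}^{m(\omega)}\int \tau_{x_i(\omega)}(V\chi^2) + \sum_{i=1}^{m(\omega)}\int [\tau_{x_i(\omega)}V](\tau_{x_i(\omega)}\chi^2-\chi^2) + \langle V^{(e)}_{\omega,n}\chi,\chi\rangle \\ &\le k(1+\varepsilon)(v_-+\varepsilon) + C + O(1/E). \end{aligned} \tag{3.13} \]
Hence, if we take $k = \frac{E}{|v_-|}\big(1+\frac{\varepsilon}{|v_-|}\big)$, for $E$ sufficiently large, we get (cf. (3.9))
\[ \langle(-\Delta+V_{\omega,n})\chi,\chi\rangle \le -E-1. \]
This inequality and the probability estimate (3.12) give, for some $C>0$ independent of $E$ and $\varepsilon$,
\[ \log\mathbb{E}(N_{\omega,n}(-E-1)) \ge -\frac{E}{|v_-|}(1+C\varepsilon)\log E. \]
This ends the proof of Proposition 3.3. □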
By the same argument as above, we get

Proposition 3.4. Under the assumptions H1, H2 and H1', if $V$ is bounded from below then we have
\[ \log N(-E) \ge -g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \]
This ends the proof of Theorem 1.5 as the asymptotic upper bound given by Theorem 1.4 coalesces with the lower bound given by Proposition 3.4.

3.3. The case when V has power law singularities. Let us assume that $V_2$ satisfies assumption H1.2”. We will show that

Lemma 3.1. Under the assumptions H1 and H1”, we have $\alpha^{\dagger}\ge\alpha^*$.

Proof. To estimate $\alpha^*$ we will use the asymptotic ground state constructed for $H(g)$ in Lemma 5.7. Pick $i_0$ and $\psi_g^{(i_0)}$ as in Lemma 5.7. Let $\alpha>\alpha^{\dagger}$. If the support of $\chi$ (cf. Lemma 5.7) is small enough, then, for $|a|\le g^{-\alpha}$ and $g$ large enough, we have
\[ \begin{aligned} \Phi(a) &= \frac{g\langle(\tau_aV-V)\psi_g^{(i_0)},\psi_g^{(i_0)}\rangle}{E_-(g)} \\ &= \frac{g\langle[\tau_a(W_{i_0}V_{i_0})-W_{i_0}V_{i_0}]\psi_g^{(i_0)},\psi_g^{(i_0)}\rangle}{E_-(g)} + \frac{g\langle[\tau_aV_1-V_1]\psi_g^{(i_0)},\psi_g^{(i_0)}\rangle}{E_-(g)} \\ &= \frac{g\langle[\tau_a(W_{i_0}V_{i_0})-W_{i_0}V_{i_0}]\psi_g^{(i_0)},\psi_g^{(i_0)}\rangle}{E_-(g)} + o(1), \end{aligned} \]
as $V_1$ is bounded and $g=o(E_-(g))$. Hence to estimate $\Phi(a)$, it is enough to estimate the expression
gh[τa (Wi0 Vi0 ) − Wi0 Vi0 ]ψg 0 , ψg 0 i E− (g) Z g Wi0 (x − a)Vi0 (x − a) − Wi0 (x)Vi0 (x) χ 2 (x)|ϕg(i0 ) (x)|2 dx = E− (g) |x| α ∗ , we just need to show that I (a) → 0 when g → +∞. This is easily seen cutting I (a) †
into two parts. In the first one, we integrate over some small neighborhood of 0 and this is small as $\chi$ and $\varphi^{(i_0)}$ are bounded and the singularities of $V_{i_0}$ are integrable; outside of this neighborhood, the potentials $V_{i_0}$ and $V_{i_0}(\cdot-b)$ are continuous and as $\varphi^{(i_0)}$ is in $L^2$, we can conclude using Lebesgue's Dominated Convergence Theorem. This implies that $\alpha\ge\alpha^*$. As it holds for any $\alpha>\alpha^{\dagger}$, we get the result of Lemma 3.1. □

Combining Lemma 3.1 with Theorem 1.4, we get

Proposition 3.5. Under the assumptions H1 and H1”, one has
\[ -(1+\alpha^{\dagger}d)\,g(E)\log g(E)\,(1+o(1)) \le \log N(-E), \quad E\to+\infty. \]
In Sect. 4.3 we improve on the general upper bound given in Theorem 1.4 so that the new upper bound coalesces with the lower bound obtained here.

4. The Upper Bounds

In this section, we will prove the asymptotic upper bound on the approximated density of states $dN_{\omega,n}$ defined in Sect. 2.1 in the different cases considered in the introduction.

4.1. The general case. We prove the following general bound.

Proposition 4.1. Under the assumptions H1, H2 and H3, there exists $\beta_0>0$ such that, for any $\beta>\beta_0$ and $n=2[E^{\beta}]\cdot[|\log E|^{\beta}]$, we have
\[ \log\mathbb{E}(N_{\omega,n}(-E+1)) \le -g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \tag{4.1} \]
By Lemma 5.5, we know that, for some $C>0$ and $\alpha>1$, $g(E)\ge CE^{1/\alpha}$ for $E$ large enough. So that if we pick $\beta>1$ large enough so that $\nu$ defined in Eq. (2.29) satisfies $\nu>1$, Lemma 2.3 and Proposition 4.1 immediately imply

Proposition 4.2. Under the assumptions H1, H2 and H3, we have
\[ \log N(-E) \le -g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \]
Taking into account Proposition 3.2, this ends the proof of Theorem 1.4. We now turn to the proof of Proposition 4.1.

4.2. Proof of Proposition 4.1. The idea of the proof is to show that, if $H_{\omega,n,\theta}$ has some low energies, then the corresponding potential $V_\omega$ must have a very deep well, i.e.
the corresponding realization of the Poisson measure must put sufficiently many points inside the cube C(0, n) and those points must be sufficiently close to each other. The main technical difficulty comes from the fact that our single site potential is not of finite range; so we need some a priori estimate on Nω,n that tells us that the behavior of Vω outside of C(0, n) does not interact too much with the one inside, more precisely, that this interaction can be large only with a small probability. Actually we need this not only on the scale of the large cube used in the periodic approximation but also on a much smaller scale, namely the scale of the size of the cube where we want the Poisson points to pile up. Pick β > 0 large. Set n = 2[E β ] · [| log E|β ] and l = [| log E|β ]. Pick ρ > 0 large and let 2E be defined as in (3.3). Then we prove
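Lemma 4.1. For some $C>0$ and sufficiently large $E$, we have
\[ \mathbb{E}(N_{\omega,n}(-E+1)) \le \mathbb{E}(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega^2_E}) + \frac{C}{\Gamma(E^{\rho}/2)}. \tag{4.2} \]

Proof. We decompose
\[ \mathbb{E}(N_{\omega,n}(-E+1)) = \mathbb{E}(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega^2_E}) + A, \]
where, $\alpha$ and $p_0$ being as in Lemma 5.1,
\[ \begin{aligned} A &= \sum_{k\ge1}\mathbb{E}\big(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega_k(\alpha,1,p_0)\setminus(\Omega^2_E\cup\Omega_{k-1}(\alpha,1,p_0))}\big) + \mathbb{E}\big(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega_0(\alpha,1,p_0)\setminus\Omega^2_E}\big) \\ &\le \sum_{k\ge E^{\rho}}\mathbb{E}\big(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega_k(\alpha,1,p_0)\setminus\Omega_{k-1}(\alpha,1,p_0)}\big) \end{aligned} \tag{4.3} \]
as, for $k<E^{\rho}$, by the proof of Lemma 5.1, we have $\Omega_k(\alpha,1,p_0)\subset\Omega^2_E$. For $\omega\in\Omega_k(\alpha,1,p_0)$ (see Lemma 5.1), we have
\[ \|V^{(r)}_{\omega,n}\|_\infty \le C(E\log E)^{\alpha\beta}k^{\frac{p}{p-p_0}}. \]
Hence, if $N$ denotes the density of states of $-\Delta$, for $\omega\in\Omega_k(\alpha,1,p_0)$, we get
\[ N_{\omega,n}(-E+1) \le N\big(-E+2+C(E\log E)^{\alpha\beta}k^{\frac{p}{p-p_0}}\big) \le C\big(-E+2+C(E\log E)^{\alpha\beta}k^{\frac{p}{p-p_0}}\big)^{d/2}. \]
Plugging this into (4.3) and computing the sum over $k$ using (5.1), we get Lemma 4.1. □

Define
\[ V^{(i)}_\omega(x) = \int_{C(0,n+2l)} V(x-y)\,m(\omega,dy) \quad\text{and}\quad V^{(e)}_\omega(x) = \int_{\mathbb{R}^d\setminus C(0,n+2l)} V(x-y)\,m(\omega,dy) \tag{4.4} \]
and the corresponding periodized potentials $V^{(i,e)}_{\omega,n}$ (see Eq. (2.1)). Note that the periodized potentials are of period $n$. As $V_2$ is compactly supported, $V^{(e)}_{\omega,n}$ is almost surely bounded for $l$ (i.e. $E$) large. We will estimate its magnitude later. The proof of Proposition 4.1 will be a consequence of the following lemmas:

Lemma 4.2. For any $\varepsilon>0$, there exist $E_0$ and $k_0$ such that, if $k>k_0$, $E>E_0$ and
\[ \forall x\in C(0,n+2l),\quad m(\omega,C(x,2l))\le k, \tag{4.5} \]
then, for any $\theta\in\mathbb{T}^*_n$, we have
\[ H_{\omega,n,\theta} \ge -E_-\!\Big(\frac{k}{1-\varepsilon}\Big) - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty - C((n/2l)^d+1)\cdot ke^{-l/C} \tag{4.6} \]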
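and

Lemma 4.3. Pick $\delta>0$. If $E$ is large enough and if $k\ge E^{\delta}$, then
\[ \log\big[P(\{\omega;\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\})\big] \underset{E\to+\infty}{\sim} -k\log k. \tag{4.7} \]

Before proving Lemmas 4.2 and 4.3, we finish the proof of Proposition 4.1. To this end, fix $\varepsilon>0$ and set $k=g(E-2)(1-\varepsilon)$. For $\omega\in\Omega^2_E$, we have, for some $C>0$,
\[ \|V^{(e)}_{\omega,n}\|_\infty \le E^{\rho}e^{-(\log E)^{\beta}/C}. \tag{4.8} \]
Hence, using (4.6) and the bounds known for $g(E)$ given in Lemma 5.5, we get that for $\omega\in\Omega^2_E$ such that $\forall x\in C(0,n+2l)$, $m(\omega,C(x,2l))\le k$, for any $\theta\in\mathbb{T}^*_n$, we have $H_{\omega,n,\theta}\ge -E+3/2$. In other words, we have
\[ \sharp\{\text{eigenvalues of } H_{\omega,n,\theta}\le -E+1\} = \sharp\{\text{eigenvalues of } H_{\omega,n,\theta}\le -E+1\}\,\mathbf{1}_{\{\omega\in\Omega^2_E:\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\}}. \]
By the definition of $N_{\omega,n}$, using Fubini's Theorem, we compute
\[ \begin{aligned} \mathbb{E}(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega^2_E}) &= \int_{\mathbb{T}^*_n}\mathbb{E}\big(\sharp\{\text{eigenvalues of } H_{\omega,n,\theta}\le -E+1\}\mathbf{1}_{\Omega^2_E}\big)\,d\theta \\ &= \int_{\mathbb{T}^*_n}\mathbb{E}\Big[\sharp\{\text{eigenvalues of } H_{\omega,n,\theta}\le -E+1\}\,\mathbf{1}_{\{\omega\in\Omega^2_E:\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\}}\Big]\,d\theta. \end{aligned} \tag{4.9} \]
On the other hand, for $\omega\in\Omega^2_E$ ($\Omega^2_E$ is defined in (3.3)), we have
\[ \|V^{(i)}_{\omega,n}\|_{L^p(C(0,n))} \le CE^{\beta(d+\alpha')+\rho}, \]
hence, using Corollary 5.1, there exists $\nu>0$ such that
\[ H_{\omega,n,\theta} \ge -\tfrac12\Delta_{n,\theta} - E^{(\beta(d+\alpha')+\rho)\nu}, \]
where $-\Delta_{n,\theta}$ is the Laplace operator on $C(0,n)$ with quasi-periodic boundary conditions. This implies that, for some $C>0$,
\[ \sharp\{\text{eigenvalues of } H_{\omega,n,\theta}\le -E+1\} \le \sharp\{\text{eigenvalues of } -\Delta_{n,\theta}\le CE^{(\beta(d+\alpha')+\rho)\nu}-2E+2\} \le Cn^d\big(E^{(\beta(d+\alpha')+\rho)\nu}-E+1\big)^{d/2}. \]
Plugging this into (4.9), we get
\[ \mathbb{E}(N_{\omega,n}(-E+1)\mathbf{1}_{\Omega^2_E}) \le C\big(E^{(\beta(d+\alpha')+\rho)\nu}-E+1\big)^{d/2}\cdot P(\{\omega;\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\}). \]
Notice that, by Lemma 5.5, our choice of $k$ fulfills the assumptions of Lemma 4.3. From this we deduce that, if $E$ is large enough, then
\[ \log\mathbb{E}(N_{\omega,n}(-E+1)) \le -g(E)\log g(E)\,(1-2\varepsilon). \]
This completes the proof of Proposition 4.1. □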
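4.2.1. The proof of Lemma 4.2. Consider the partition of the cube $C(0,n)$ into cubes of side length $l$, i.e.
\[ C(0,n) = \bigcup_{\gamma\in l\mathbb{Z}^d,\ |\gamma|\le n/l} C(\gamma,l), \]
and the covering
\[ C(0,n) \subset \bigcup_{\gamma\in l\mathbb{Z}^d,\ |\gamma|\le n/l} C(\gamma,2l). \]
Set $L=2[E^{\beta}]$ and consider a $\mathbb{Z}^d$-periodic partition of unity of $\mathbb{R}^d$, i.e.
\[ 1 = \sum_{\beta\in L\mathbb{Z}^d}\ \sum_{\gamma\in\mathbb{Z}^d\cap C(0,L)} \chi^2_{\gamma+\beta}, \]
where $\chi\in C_0^\infty(\mathbb{R}^d)$ is such that $\chi\equiv1$ on $C(0,1/2)$, $0\le\chi\le1$ and $\operatorname{supp}\chi\subset C(0,3/2)$. For $\gamma\in l\mathbb{Z}^d$, define $\chi_{\gamma,l}(x)=\chi((x-\gamma)/l)$. Then, one has
\[ 1 = \sum_{\beta\in n\mathbb{Z}^d}\ \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n)} \chi^2_{\gamma+\beta,l}. \]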
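A one-dimensional sketch may help to see how such a quadratic partition of unity can be built and why the localization error in the proof is of order $1/l^2$. Everything below is an assumption made for the illustration (the helper names, the particular smooth cutoff, the restriction to $d=1$, and reading $C(0,r)$ as the centered cube of side $r$); the paper does not prescribe a specific $\chi$. The sketch normalizes a smooth bump so that the squares of its integer translates sum to one, rescales it by $l$, and estimates the gradient by finite differences.

```python
import math

def _f(s):
    # exp(-1/s) for s > 0, extended by 0: standard smooth building block
    return math.exp(-1.0 / s) if s > 0 else 0.0

def smooth_step(s):
    # smooth function equal to 0 for s <= 0 and to 1 for s >= 1
    return _f(s) / (_f(s) + _f(1.0 - s))

def base(t):
    # equals 1 for |t| <= 1/4 and vanishes for |t| >= 3/4
    return smooth_step((0.75 - abs(t)) / 0.5)

def chi(t):
    # divide by the periodic sum so that sum_j chi(t - j)**2 == 1
    j0 = math.floor(t)
    total = sum(base(t - j) ** 2 for j in range(j0 - 1, j0 + 3))
    return base(t) / math.sqrt(total)

def chi_scaled(x, gamma, l):
    # chi_{gamma,l}(x) = chi((x - gamma) / l)
    return chi((x - gamma) / l)

def sum_of_squares(x, l):
    # sum over the lattice l*Z of chi_{gamma,l}(x)**2 (finitely many terms)
    j0 = math.floor(x / l)
    return sum(chi_scaled(x, l * j, l) ** 2 for j in range(j0 - 2, j0 + 4))

def max_gradient(l, h=1e-5):
    # finite-difference estimate of sup |d/dx chi_{0,l}| on [-l, l]
    grid = [l * (i / 100.0) for i in range(-100, 101)]
    return max(
        abs(chi_scaled(x + l * h, 0.0, l) - chi_scaled(x - l * h, 0.0, l)) / (2 * l * h)
        for x in grid
    )
```

In this sketch `sum_of_squares(x, l)` equals 1 up to rounding, and `l * max_gradient(l)` does not depend on `l`, so $|\nabla\chi_{\gamma,l}|^2$ scales like $1/l^2$.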
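Assume (4.5) holds and consider $\varphi\in C^\infty(\mathbb{R}^d)$ satisfying the quasi-periodic boundary conditions $\varphi(x+n\gamma)=e^{in\theta\gamma}\varphi(x)$ for $\gamma\in\mathbb{Z}^d$ and $x\in\mathbb{R}^d$ (i.e. $\varphi\in C^\infty(\mathbb{R}^d)\cap L^2_{\theta,\mathrm{loc}}(C(0,n))$) such that $\|\varphi\|=1$. Note that $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ denote respectively the usual norm and scalar product in $L^2(C(0,n))$. For small positive $\varepsilon$, we compute
\[ \begin{aligned} \langle(-\Delta+V_{\omega,n})\varphi,\varphi\rangle &= \langle(-\Delta+V^{(i)}_{\omega,n})\varphi,\varphi\rangle + \langle V^{(e)}_{\omega,n}\varphi,\varphi\rangle \\ &\ge \|\nabla\varphi\|^2 + \langle V^{(i)}_{\omega,n}\varphi,\varphi\rangle - \|V^{(e)}_{\omega,n}\|_\infty \\ &= \sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)}\Big(\|\chi_{\gamma,l}\nabla\varphi\|^2 + \langle V^{(i)}_{\omega,n}\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\rangle\Big) - \|V^{(e)}_{\omega,n}\|_\infty \\ &\ge (1-\varepsilon)\sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)}\Big(\|\nabla(\chi_{\gamma,l}\varphi)\|^2 + \frac{1}{1-\varepsilon}\langle V^{(i)}_{\omega,n}\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\rangle\Big) \\ &\qquad + \Big(1-\frac1\varepsilon\Big)\sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)}\|(\nabla\chi_{\gamma,l})\varphi\|^2 - \|V^{(e)}_{\omega,n}\|_\infty \\ &\ge (1-\varepsilon)\sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)}\Big(\|\nabla(\chi_{\gamma,l}\varphi)\|^2 + \frac{1}{1-\varepsilon}\langle V^{(i)}_{\omega,n}\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\rangle\Big) - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty. \end{aligned} \tag{4.10} \]
Define
\[ V^{(i,\gamma)}_\omega = \int_{C(\gamma,2l)} V(x-y)\,m(\omega,dy) \quad\text{and}\quad V^{(e,\gamma)}_\omega = \int_{C(0,n+2l)\setminus C(\gamma,2l)} V(x-y)\,m(\omega,dy) \]
and the corresponding periodized potentials $V^{((i,e),\gamma)}_{\omega,n}$ (see Eq. (2.2)). Then (4.5) tells us that $m(\omega,C(0,n+2l))\le C(n/2l)^d\cdot k$. Using this and the exponential decay of $V_1$, we get that, for some $C>0$,
\[ \sup_{x\in C(\gamma,l)}|V^{(e,\gamma)}_{\omega,n}(x)| \le (n/2l)^d\cdot ke^{-l/C}, \qquad \sup_{x\in C(\gamma,l)}\big|V^{(i,\gamma)}_{\omega,n}-V^{(i,\gamma)}_\omega\big|(x) \le Cke^{-l/C}. \]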
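So that (4.10) gives us
\[ \langle(-\Delta+V_{\omega,n})\varphi,\varphi\rangle \ge (1-\varepsilon)\sum_{\gamma\in l\mathbb{Z}^d\cap C(0,n+l)}\Big\langle\Big(-\Delta+\frac{1}{1-\varepsilon}V^{(i,\gamma)}_\omega\Big)\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\Big\rangle - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty - C((n/2l)^d+1)\cdot ke^{-l/C}. \tag{4.11} \]
Now, if $(x_j(\omega))_{j=1,\dots,m(\omega,C(\gamma,2l))}$ denotes the support of $m(\omega,dx)$ in $C(\gamma,2l)$, we write
\[ \begin{aligned} \Big\langle\Big(-\Delta+\frac{1}{1-\varepsilon}V^{(i,\gamma)}_\omega\Big)\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\Big\rangle &= \Big\langle\Big(-\Delta+\frac{1}{1-\varepsilon}\sum_{j=1}^{m(\omega,C(\gamma,2l))}\tau_{x_j(\omega)}V\Big)\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\Big\rangle \\ &= \frac{1}{m(\omega,C(\gamma,2l))}\sum_{j=1}^{m(\omega,C(\gamma,2l))}\Big\langle\Big(-\Delta+\frac{m(\omega,C(\gamma,2l))}{1-\varepsilon}\tau_{x_j(\omega)}V\Big)\chi_{\gamma,l}\varphi,\chi_{\gamma,l}\varphi\Big\rangle \\ &\ge -\frac{1}{m(\omega,C(\gamma,2l))}\sum_{j=1}^{m(\omega,C(\gamma,2l))}E_-\!\Big(\frac{m(\omega,C(\gamma,2l))}{1-\varepsilon}\Big)\|\chi_{\gamma,l}\varphi\|^2 \\ &\ge -E_-\!\Big(\frac{k}{1-\varepsilon}\Big)\|\chi_{\gamma,l}\varphi\|^2. \end{aligned} \tag{4.12} \]
Here we have used (4.5). Plugging (4.12) into (4.11), we get
\[ \langle(-\Delta+V_{\omega,n})\varphi,\varphi\rangle \ge -E_-\!\Big(\frac{k}{1-\varepsilon}\Big) - \frac{C}{\varepsilon l^2} - \|V^{(e)}_{\omega,n}\|_\infty - C((n/2l)^d+1)\cdot ke^{-l/C}. \tag{4.13} \]
Lemma 4.2 follows then from the fact that $C^\infty(\mathbb{R}^d)\cap L^2_\theta(C(0,n))$ is dense in the domain of $H_{\omega,n,\theta}$. □

4.2.2. The proof of Lemma 4.3. Define $P(n,k,l) = P(\{\omega;\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\})$. We will assume that $k$, $n$ and $l$ are large and that they satisfy
\[ (n+l)^{d+1}e^{l^d} = o(l^{-kd}\,\Gamma(k)). \tag{4.14} \]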
We will prove a lower and an upper bound on $P(n,k,l)$. We start with the lower bound. Consider the partition of the cube $C(0,n+2l)$ into cubes of side length $2l$, i.e.
\[ C(0,n+2l) = \bigcup_{\gamma\in 2l\mathbb{Z}^d,\ |\gamma|\le n/2l+1} C(\gamma,2l). \]
Using the independence for disjoint cubes and the homogeneity of the Poisson field, we obtain that
\[ P(n,k,l) \ge P(\{\omega;\ \exists\gamma\in 2l\mathbb{Z}^d,\ |\gamma|\le n/2l+1;\ m(\omega,C(\gamma,2l))>k\}) = 1-\big(1-P(m(\omega,C(0,2l))>k)\big)^{(n/2l+1)^d}. \tag{4.15} \]
By definition, $P(m(\omega,C(0,2l))>k) \ge e^{-\mu(2l)^d}\dfrac{(\mu(2l)^d)^k}{k!}$. Plugging this into (4.15), we get
\[ \log P(n,k,l) \ge -k\log k\,(1+o(1)) \]
when $k$, $n$ and $l$ tend to $+\infty$ under the assumption (4.14).

To get an upper bound, we consider the partition of the cube $C(0,n+2l)$ into cubes of side length $4l$, i.e.
\[ C(0,n+2l) = \bigcup_{\gamma\in 4l\mathbb{Z}^d,\ |\gamma|\le n/(4l)+1/2} C(\gamma,4l). \]
For any $x\in C(0,n+2l)$, there exists $\gamma\in 4l\mathbb{Z}^d$, $|\gamma|\le n/(4l)+1/2$, such that $C(\gamma,4l)\cap C(x,2l)=C(x,2l)$. Hence,
\[ P(\{\omega;\ \forall x\in C(0,n+2l),\ m(\omega,C(x,2l))\le k\}) \ge P(\{\omega;\ \forall\gamma\in 4l\mathbb{Z}^d,\ |\gamma|\le n/2l+1/2;\ m(\omega,C(\gamma,4l))\le k\}), \]
that is, using the stationarity of the Poisson process,
\[ P(\{\omega;\ \exists x\in C(0,n+2l),\ m(\omega,C(x,2l))>k\}) \le \sum_{\gamma\in 4l\mathbb{Z}^d,\ |\gamma|\le n/2l+1/2} P(\{\omega;\ m(\omega,C(\gamma,4l))>k\}) \le (n+2l)^d\,P(\{\omega;\ m(\omega,C(0,4l))>k\}). \]
On the other hand,
\[ P(\{m(\omega,C(0,4l))>k\}) = \sum_{j>k} e^{-(4l)^d\mu}\frac{[\mu(4l)^d]^j}{j!} \le \frac{(\mu(4l)^d)^k}{k!}. \tag{4.16} \]
This then implies that
\[ \log P(n,k,l) \le -k\log k\,(1+o(1)) \]
when $k$, $n$ and $l$ tend to $+\infty$ under assumption (4.14). □
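The Poisson tail bound (4.16) and the resulting $-k\log k$ behaviour can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, `lam` stands for the intensity $\mu(4l)^d$, and the tail is approximated by a truncated sum evaluated in log space.

```python
import math

def log_poisson_tail(lam, k, terms=5000):
    """Approximate log P(X > k) for X ~ Poisson(lam).

    The series is truncated after `terms` terms (so this is a slight
    underestimate) and summed with the log-sum-exp trick for stability.
    """
    logs = [
        -lam + j * math.log(lam) - math.lgamma(j + 1)
        for j in range(k + 1, k + 1 + terms)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def log_tail_bound(lam, k):
    """log of lam**k / k!, the right-hand side of (4.16)."""
    return k * math.log(lam) - math.lgamma(k + 1)

if __name__ == "__main__":
    for lam, k in [(1.0, 5), (10.0, 30), (50.0, 10)]:
        print(lam, k, log_poisson_tail(lam, k), log_tail_bound(lam, k))
    # For fixed lam, log(lam**k / k!) / (k log k) drifts towards -1 as k grows.
    for k in (10**3, 10**6):
        print(k, log_tail_bound(10.0, k) / (k * math.log(k)))
```

The drift towards $-1$ is slow, of order $1/\log k$, which matches the $(1+o(1))$ factors in the proof.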
4.3. The case when V has power law singularities. We now assume that H1” holds. Obviously, modifying $V_1$, we can assume that the functions $W_i$ do not change sign and that the supports of distinct $W_i$ are pairwise disjoint. We then prove

Proposition 4.3. Under the assumptions H1 and H1”, there exists $\beta_0>0$ such that, for any $\beta>\beta_0$ and $n=2[E^{\beta}]\cdot[|\log E|^{\beta}]$, we have
\[ \log\mathbb{E}(N_{\omega,n}(-E+1)) \le -(1+\alpha^{\dagger}d)\,g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \tag{4.17} \]
Taking into account Lemma 5.5 and Lemma 2.3, as a corollary to Proposition 4.3, we get

Proposition 4.4. Under the assumptions H1 and H1”, we have
\[ \log(\mathbb{E}(N(-E+1))) \le -(1+\alpha^{\dagger}d)\,g(E)\log g(E)\,(1+o(1)), \quad E\to+\infty. \]
This ends the proof of Theorem 1.6 if one takes into account Proposition 3.5.

4.3.1. Proof of Proposition 4.3. The idea guiding this proof is essentially the same as the one guiding the proof of Proposition 4.1. The difference comes from the fact that, as $E_-(g)$ increases faster than linearly in $g$, if we want to gather $k$ single site potentials sufficiently close together so as to get the effect of having $k$ single site potentials exactly at the same point, we need the single site potentials to be roughly at a distance less than $k^{-\alpha^{\dagger}}$ from each other. Hence, the scale on which we want the Poisson points to concentrate is much smaller than the one used to prove the upper bound in the general case. This leads to some supplementary technical difficulties as the single site potentials have a finite non-zero range. Recall that $(x_i)_{1\le i\le q}$ are the singularities of the single site potential $V$ (see Sect. 1.2). We define
\[ K(x,r) = \bigcup_{1\le i\le q} C(x-x_i,r). \tag{4.18} \]
Pick $\varepsilon>0$ small and define the events
\[ \tilde\Omega_1 = \{\omega\in\Omega^2_E:\ \exists x\in C(0,n+2l),\ m(\omega,K(x,k^{-(\alpha^{\dagger}-2\varepsilon)}))>k\}, \]
\[ \tilde\Omega_2 = \{\omega\in\Omega^2_E:\ \exists x\in C(0,n+2l),\ m(\omega,C(x,1))>k^{1+\varepsilon\nu^{\dagger}}\}, \]
and $\tilde\Omega=\tilde\Omega_1\cup\tilde\Omega_2$. Pick $\beta>0$ large. Set $n=2[E^{\beta}]\cdot[|\log E|^{\beta}]$ and $l=[|\log E|^{\beta}]$. Pick $\rho>0$ large. Taking into account Lemma 4.1, Proposition 4.3 is a direct consequence of the following two lemmas (cf. Lemmas 4.2 and 4.3).

Lemma 4.4. For any $\varepsilon>0$, there exist $E_0$ and $k_0$ such that, if $k>k_0$ and $E>E_0$, if $\omega\in\Omega^2_E$ and if
\[ \forall x\in C(0,n+2l),\quad m(\omega,K(x,k^{-(\alpha^{\dagger}-2\varepsilon)}))\le k \ \text{ and }\ m(\omega,C(x,1))\le k^{1+\varepsilon\nu^{\dagger}}, \tag{4.19} \]
then, for any $\theta\in\mathbb{T}^*_n$, we have
\[ H_{\omega,n,\theta} \ge -E_-\!\Big(\frac{k}{1-\varepsilon}\Big)(1+\varepsilon) - \frac{Ck^{2(\alpha^{\dagger}-\nu^{\dagger}\varepsilon/2)}}{\varepsilon} - E^{\rho}e^{-(\log E)^{\beta}/C} - Ck^{1+\varepsilon\nu^{\dagger}}, \tag{4.20} \]
and

Lemma 4.5. Pick $\delta>0$. Then, if $E$ is large enough and if $k\ge E^{\delta}$, the probability of the event $\tilde\Omega$ satisfies
\[ \log P(\tilde\Omega) \underset{E\to+\infty}{\sim} -(1+(\alpha^{\dagger}-2\varepsilon)d)\,k\log k. \tag{4.21} \]

We now pick $k=[(1-\varepsilon)g(E/(1+2\varepsilon))]$, i.e. $k$ is of order of magnitude $E^{1-\nu^{\dagger}/2}$ as $E\to+\infty$ (see Lemma 5.7). In that case, for sufficiently small $\varepsilon$, $k^{1+\varepsilon\nu^{\dagger}}$ and $k^{2(\alpha^{\dagger}-\nu^{\dagger}\varepsilon/2)}$ are $o(E)$. Hence Proposition 4.3 follows from Lemma 4.1, Lemma 4.4 and Lemma 4.5 in the same way as Proposition 4.1 followed from Lemmas 4.1, 4.2 and 4.3.
4.3.2. The proof of Lemma 4.4. We are going to proceed along the same lines as in the (i,e) proof of Lemma 4.2. The potentials Vω,n are defined as in (4.4). Fix ϕ ∈ C ∞ (Rd ) ∩ L2θ (C(0, n)) normalized by kϕk = 1. Then (i) (e) ϕ, ϕi + hVω,n ϕ, ϕi hHω,n,θ ϕ, ϕi = h−1ϕ, ϕi + hVω,n
(4.22)
β /C
(i) ϕ, ϕi − E ρ e−(log E) ≥ h−1ϕ, ϕi + hVω,n
using (4.8). We can split (i) (i,1) (i,2) = Vω,n + Vω,n , Vω,n
where Vω(i,1)
Z =
C(0,n+2l)
(i,2)
V1 (x − y)m(ω, dy) and
Vω(i,2)
Z =
C(0,n+2l)
V2 (x − y)m(ω, dy).
(i)
Vω,n contains all the local singularities of Vω,n and, as V1 is exponentially decaying, there exists C > 0 such that for ω ∈ 2E satisfying (4.19), we have (i,1) k∞ ≤ Ck 1+εν . kVω,n †
Hence (4.22) gives β /C
(i,1) ϕ, ϕi − E ρ e−(log E) hHω,n,θ ϕ, ϕi ≥ h−1ϕ, ϕi + hVω,n
†
− Ck 1+εν .
(4.23)
Consider now a periodic partition of the unity of the cube C(0, n) of the form X χγ2 , 1C(0,n) = γ ∈0
where the χγ are supported on cells of size roughly k −α +ν ε/2 . These cells are centered † † at the points of 0 = δ(k)Zd ∩ C(0, n), where δ(k) = 1/[k α −ν ε/2 ]. Here [·] denote the integer part. These functions can then be chosen so that, for some C > 0, we have †
†
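\[ \sup_{\gamma\in\Gamma}\|\nabla\chi_\gamma\|^2_\infty + \sup_{\gamma\in\Gamma}\|\Delta\chi_\gamma\|_\infty \le C\delta(k)^{-2} \le Ck^{2(\alpha^{\dagger}-\nu^{\dagger}\varepsilon/2)} \tag{4.24} \]
for some $C>1$. We compute
\[ \begin{aligned} \langle-\Delta\varphi,\varphi\rangle &= \sum_{\gamma\in\Gamma}\Big(\|\nabla(\chi_\gamma\varphi)\|^2 + \langle|\nabla\chi_\gamma|^2\varphi,\varphi\rangle - 2\operatorname{Re}\langle\nabla(\chi_\gamma\varphi),\varphi\nabla\chi_\gamma\rangle\Big) \\ &\ge (1-\varepsilon)\sum_{\gamma\in\Gamma}\|\nabla(\chi_\gamma\varphi)\|^2 + \Big(1-\frac1\varepsilon\Big)\sum_{\gamma\in\Gamma}\langle|\nabla\chi_\gamma|^2\varphi,\varphi\rangle \\ &\ge (1-\varepsilon)\sum_{\gamma\in\Gamma}\|\nabla(\chi_\gamma\varphi)\|^2 + C\Big(1-\frac1\varepsilon\Big)k^{2(\alpha^{\dagger}-\nu^{\dagger}\varepsilon/2)}. \end{aligned} \tag{4.25} \]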
X γ ∈0
(i,1) hVω,n χγ ϕ, χγ ϕi.
Set V˜i = Wi Vi . As the Wi ’s are of compact support, so are the V˜i . Hence, for some R0 positive, we have X (i,1) χγ ϕ, χγ ϕi = hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi hVω,n 1≤j ≤q, u≥1 xu (ω)+xj ∈C(γ ,R0 )
X
=
1≤j ≤q, u≥1
hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi †
xu (ω)+xj ∈C(γ ,k 2ε−α )
X
+
1≤j ≤q, u≥1
hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi, †
xu (ω)+xj ∈C(γ ,R0 )\C(γ ,k 2ε−α )
(4.26) where the (xu (ω))u≥1 are the support of the Poisson measure m(ω, dy). † For |γ − xu (ω) − xj | ≥ k ε−α and x in the support of χγ , one has † † † |τxu (ω)+xj (V˜j )(x)| ≤ Ck εν −α ν . †
We have assumed that, for any x ∈ C(0, n + 2l), one has m(ω, C(x, 1)) ≤ k 1+εν ; hence, we obtain X † † † † hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi ≥ −Ck −2εν +α ν +1+εν 1≤j ≤q, u≥1
†
xu (ω)+xj ∈C(γ ,R0 )\C(γ ,k 2ε−α )
= −Ck 2α
† −εν †
. (4.27)
Hence, to estimate hHω,n,θ ϕ, ϕi using Eqs. (4.25), (4.26) and (4.27), we only need to estimate A := k∇(χγ ϕ)k2 +
1 1−ε
X
hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi.
1≤j ≤q, u≥1
†
xu (ω)∈K(γ ,k 2ε−α )
Therefore, we notice that, as the (xj )1≤j ≤q are distinct, for k sufficiently large, for any ω, † any γ and any u, there is at most a single 1 ≤ j ≤ q such that xu (ω) ∈ C(γ −xj , k 2ε−α ). Hence, by (4.18), we get a partition †
Q := {xu ; xu (ω) ∈ K(γ , k 2ε−α )} [ [ † {xu ; xu (ω) ∈ C(γ − xj , k 2ε−α )} =: Qj . = 1≤j ≤q
1≤j ≤q †
Set m(ω) = m(ω, K(γ , k 2ε−α )) and qj = ]Qj . We compute q 1 XX hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi A = k∇(χγ ϕ)k + 1−ε j =1 Qj q X X 1 m(ω) qj k∇(χγ ϕ)k2 + hτxu (ω)+xj (V˜j )χγ ϕ, χγ ϕi = m(ω) 1−ε 2
j =1
=
q X j =1
≥
1 m(ω)
X
Qj
k∇(χγ ϕ)k2 +
Qj
m(ω) hτx (ω)+xj (V˜j )χγ ϕ, χγ ϕi 1−ε u
q X qj X m(ω) E˜ j kχγ ϕk2 , m(ω) (1 − ε) j =1
Qj
where E˜ j (g) is the ground state of −1 + g V˜j . In Sect. 5.4, we prove that the lowest of these ground states is asymptotic to E(g) as g → +∞. Hence, for k sufficiently large, by (4.19), we have A ≥ −(1 + ε)E−
m(ω) k kχγ ϕk2 ≥ −(1 + ε)E− kχγ ϕk2 . 1−ε 1−ε
(4.28)
Combining this with Eqs. (4.25), (4.26) and (4.27), we end the proof of Lemma 4.4. u t ˜ 1 ). 4.3.3. The proof of Lemma 4.5. We first prove an asymptotic lower bound for P ( Recall that K is defined by (4.18). We notice that, for E large enough, P ({ω ∈ 2E : ∃x ∈ C(0, n + 2l), m(ω, K(x, k −(α ≥ P ({ω ∈
2E
† −2ε)
)) > k}) ≥
: ∃x ∈ C(0, n + 2l), m(ω, C(x, k −(α
† −2ε)
)) > k}).
Hence, using the proof of the lower estimate in Lemma 4.3, we get ˜ 1) log P ( ≥ −1. E→+∞ (1 + (α † − 2ε)d)k log k
(4.29)
lim inf
˜ 1 ). We partition C(0, n+3l) into cubes Let us prove the asymptotic upper bound for P ( † † of side length k −(α −2ε) , the cubes being indexed by 4k = k −(α −2ε) Zd ∩ C(0, n + 3l). † For 1 ≤ i ≤ q, let γi be the projection of xi on the lattice k −(α −2ε) Zd . Then, for any x ∈ C(0, n + 2l), there exists γ ∈ 4k such that K(x, k −(α
† −2ε)
[
)⊂
C(γ − γi , 4k −(α
† −2ε)
).
1≤i≤q
Hence P ({ω ∈ 2E : ∃x ∈ C(0, n + 2l), m(ω, K(x, k −(α −2ε) )) > k}) [ X † P ω ∈ 2E ; m ω, C(γi + γ , 4k −(α −2ε) ) > k ≤ γ ∈4k 1≤i≤q [ † † C(γi , 4k −(α −2ε) ) > k . ≤ ((n + 3l)k (α −2ε) )d P ω ∈ 2E : m ω, †
1≤i≤q
As k > E δ and as n and l are at most of polynomial size in E, this gives lim sup E→+∞
˜ 1) log P ( ≤ −1. † (1 + (α − 2ε)d)k log k
Combined with (4.29), we get ˜ 1) log P (
∼
E→+∞
−(1 + (α † − 2ε)d)k log k.
(4.30)
On the other hand, Lemma 4.3 tells us that ˜ 2) log P (
∼
E→+∞
†
−(1 + εν † )k 1+εν log k.
˜ 1 ). In view of (4.30), this completes the ˜ 2 ) is negligible with respect to P ( So that P ( proof of Lemma 4.5. u t 5. Appendix 5.1. The structure of the Poisson potential. Let V ∈ Lp (Rd ) be a potential satisfying assumptions H1, H 2 and H 3. Let m(ω, dx) be a Poisson measure of concentration µ. Consider the potential Vω defined by (1.1). One has
Lemma 5.1. For any $\alpha\in(0,1)$, $p_0<p$ and $0<b<1$, there exists $C>0$ such that, $\omega$-almost surely,
\[ V_\omega \in \bigcup_{k\ge1}\Omega_k(\alpha,b,p_0), \]
where, for $k\ge1$, $\Omega_k(\alpha,b,p_0)$ is the set of measurable functions $V:\mathbb{R}^d\to\mathbb{C}$ that can be written in the form $V=V^{(r)}+V^{(i)}$, where $V^{(r)}$ and $V^{(i)}$ satisfy
1. $|V^{(r)}(x)| \le Ck^{\frac{p}{p-p_0}}\,b^{\frac{-p_0}{p-p_0}}(|x|^{\alpha}+1)$ for $x\in\mathbb{R}^d$,
2. $V^{(i)}\in L^{p_0}_{\mathrm{loc,unif}}$ and $\|V^{(i)}\|_{L^{p_0}_{\mathrm{loc,unif}}}\le b$ (here $\|\cdot\|_{L^{p}_{\mathrm{loc,unif}}} = \sup_{x\in\mathbb{R}^d}\|\cdot\|_{L^p(C(x,1))}$).
Moreover, there exists $C>0$ such that
\[ P(\{V_\omega\notin\Omega_k(\alpha,b,p_0)\}) \le \frac{(C\mu)^k}{k!}. \tag{5.1} \]
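Proof. Fix $\alpha$, $p_0$ and $b$ as in the lemma. Pick $0<\alpha'<\alpha$. As $V_2$ is compactly supported, there exists $r_0>1$ such that $\operatorname{supp}V_2\subset C(0,r_0)$. Let $\Gamma=r_0\mathbb{Z}^d$. Pick $k>0$. Define
\[ P_{k,\alpha'} := P(\{\exists\gamma\in\Gamma;\ m(\omega,C(\gamma,r_0))\ge k(|\gamma|^{\alpha'}+1)\}). \]
Using (4.16), we compute
\[ P_{k,\alpha'} \le \sum_{\gamma\in\Gamma}P(\{\omega;\ m(\omega,C(\gamma,r_0))\ge k(|\gamma|^{\alpha'}+1)\}) \le \sum_{\gamma\in\Gamma}\frac{(\mu r_0^d)^{k(|\gamma|^{\alpha'}+1)}}{(k(|\gamma|^{\alpha'}+1))!} \le C\frac{(\mu r_0^d)^k}{k!} \tag{5.2} \]
for some $C>0$. We write
\[ V_\omega = V_{1,\omega}+V_{2,\omega}, \tag{5.3} \]
where
\[ V_{1,\omega}(x) = \int_{\mathbb{R}^d}V_1(x-y)\,dm(\omega,y) \quad\text{and}\quad V_{2,\omega}(x) = \int_{\mathbb{R}^d}V_2(x-y)\,dm(\omega,y). \]
Now assume that $\omega$ is such that
\[ \forall\gamma\in\Gamma,\quad m(\omega,C(\gamma,r_0)) < k(|\gamma|^{\alpha'}+1), \tag{5.4} \]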
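and let $(x_n(\omega))_{n\in\mathbb{N}}$ be the points in the support of $m(\omega,dx)$. We will estimate $V_{1,\omega}$ and $V_{2,\omega}$ separately. $V_{1,\omega}$ can be estimated as in [9]. One has
\[ \begin{aligned} |V_{1,\omega}(x)| &\le \sum_{\beta\in r_0\mathbb{Z}^d}\ \sum_{x_n(\omega)\in C(\beta,r_0)}|V_1(x-x_n(\omega))| \\ &\le C\sum_{\beta\in r_0\mathbb{Z}^d}e^{-|x-\beta|/C}\,m(\omega,C(\beta,r_0)) \\ &\le C\sum_{\beta\in r_0\mathbb{Z}^d}e^{-|x-\beta|/C}\,k(|\beta|^{\alpha'}+1) \\ &\le Ck(|x|^{\alpha'}+1). \end{aligned} \tag{5.5} \]
As $V_2\in L^p(\mathbb{R}^d)$, for $K\ge1$, we have
\[ \Big(\int_{\mathbb{R}^d}\mathbf{1}_{|V_2(x)|\ge K}|V_2(x)|^{p_0}\,dx\Big)^{1/p_0} = \Big(\int_{\mathbb{R}^d}\mathbf{1}_{|V_2(x)|\ge K}|V_2(x)|^{p_0-p}|V_2(x)|^{p}\,dx\Big)^{1/p_0} \le K^{\frac{p_0-p}{p_0}}\|V_2\|_{L^p}^{p/p_0}. \tag{5.6} \]
Let us introduce a notation: take a function $W$ and decompose it into $W=W^{(r)}_K+W^{(i)}_K$, where
\[ W^{(r)}_K := \mathbf{1}_{|W(x)|\le K}W \quad\text{and}\quad W^{(i)}_K := \mathbf{1}_{|W(x)|>K}W. \tag{5.7} \]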
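Fix a sequence of positive numbers $(K_\gamma)_{\gamma\in\Gamma}$. As $\operatorname{supp}V_2\subset C(0,r_0)$, $V_{2,\omega}$ can be rewritten as
\[ V_{2,\omega} = V^{(i)}_{2,\omega}+V^{(r)}_{2,\omega}, \tag{5.8} \]
where we define
\[ V^{(i,r)}_{2,\omega}(x) := \sum_{\gamma\in\Gamma}\mathbf{1}_{C(\gamma,r_0)}(x)\sum_{x_n(\omega)\in C(\gamma,2r_0)}(V_2)^{(i,r)}_{K_\gamma}(x-x_n(\omega)). \]
Pick $a>0$, $K_\gamma=a(|\gamma|^{\beta}+1)$ and set $\beta=\frac{p_0}{p}\alpha$, $\alpha'=\frac{p-p_0}{p}\alpha$. Then, for some $C>0$, by (5.6), we have
\[ \|\mathbf{1}_{C(\gamma,r_0)}V^{(i)}_{2,\omega}\|_{L^{p_0}} \le Cm(\omega,C(\gamma,2r_0))\big(a(|\gamma|^{\beta}+1)\big)^{\frac{p_0-p}{p_0}} \le Cka^{\frac{p_0-p}{p_0}}\big(|\gamma|^{\alpha'+\beta\frac{p_0-p}{p_0}}+1\big) \le 2Cka^{\frac{p_0-p}{p_0}}, \tag{5.9} \]
and
\[ \|\mathbf{1}_{C(\gamma,r_0)}V^{(r)}_{2,\omega}\|_\infty \le Cka(|\gamma|^{\alpha'+\beta}+1) \le Cka(|\gamma|^{\alpha}+1). \tag{5.10} \]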
Pick $a=(2^{-d}b)^{\frac{p_0}{p_0-p}}(Ck)^{\frac{p_0}{p-p_0}}$ (where $C$ is the constant appearing in (5.9)). As $r_0>1$, we have $\|V^{(i)}_{2,\omega}\|_{L^{p_0}_{\mathrm{loc,unif}}} \le 2^d\sup_{\gamma\in\Gamma}\|\mathbf{1}_{C(\gamma,r_0)}V^{(i)}_{2,\omega}\|_{L^{p_0}}$. Hence, by (5.9), we get
\[ \|V^{(i)}_{2,\omega}\|_{L^{p_0}_{\mathrm{loc,unif}}} \le b. \tag{5.11} \]
Moreover, by (5.10), for some $C>0$, for $x\in\mathbb{R}^d$, we have
\[ |V^{(r)}_{2,\omega}(x)| \le Ck^{\frac{p}{p-p_0}}\,b^{\frac{-p_0}{p-p_0}}(|x|^{\alpha}+1). \tag{5.12} \]
Now putting together Eqs. (5.3), (5.5), (5.8), (5.11) and (5.12), we see that if $\omega$ is such that $\forall\gamma\in\Gamma$, $m(\omega,C(\gamma,r_0))<k(|\gamma|^{\alpha'}+1)$, then $V_\omega\in\Omega_k(\alpha,b,p_0)$. Taking into account Eq. (5.2), we get Lemma 5.1. □

5.2. An a-priori estimate on the density of states. Recall that $N$ denotes the integrated density of states of $H_\omega$. We prove

Lemma 5.2. Under assumptions H1, H2 and H3, there exists $C>0$ such that, for $E>1$, we have
\[ \log N(-E) \le -\frac1C E^{\frac{2p-d}{2p}}\log E, \tag{5.13} \]
and, for any $n\ge1$ and $E>1$, we have
\[ \log\mathbb{E}(N_{\omega,n}(-E)) \le -\frac1C E^{\frac{2p-d}{2p}}\log E. \tag{5.14} \]
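Proof. Equations (5.14) and (5.13) are proved along the same lines. We will only write the details for (5.13). By [21], we know that, for any cube $\Lambda$ and $E>0$, we have
\[ \begin{aligned} N(-E) &\le \frac{1}{\mathrm{Vol}(\Lambda)}\,\mathbb{E}\big(\sharp\{\text{eigenvalues of } H^N_{\omega,\Lambda} \text{ smaller than or equal to } -E\}\big) \\ &= \frac{1}{\mathrm{Vol}(\Lambda)}\sum_{n\in\mathbb{N}}\mathbb{E}\big(\mathbf{1}_{\{\omega;\ m(\omega,\Lambda)=n\}}\,\sharp\{\text{eigenvalues of } H^N_{\omega,\Lambda} \text{ smaller than or equal to } -E\}\big), \end{aligned} \tag{5.15} \]
where $H^N_{\omega,\Lambda}$ denotes the restriction of $H_\omega$ to $\Lambda$ with the Neumann boundary conditions. One has the

Lemma 5.3. Pick $p>p(d)$ ($p(d)$ is defined in assumption H2). If $V\in L^p(\mathbb{R}^d)$ then, for $\alpha>\frac{d}{2p}$ and $E>0$, $(E-\Delta)^{-\alpha/2}V(E-\Delta)^{-\alpha/2}$ is bounded on $L^2(\mathbb{R}^d)$ and, for some $C_{\alpha,p}>0$, one has
\[ \|(E-\Delta)^{-\alpha/2}V(E-\Delta)^{-\alpha/2}\|_{\mathcal{L}(L^2)} \le C_{\alpha,p}\|V\|_{L^p(\mathbb{R}^d)}E^{-\alpha+d/2p}. \tag{5.16} \]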
Proof. Using classical properties of the Fourier transform from Lp to Lq (q −1 +p−1 = 1 and 1 ≤ p ≤ 2) (see e.g. [6]), one shows (E−1)−α/2 is bounded from L2 (Rd ) to Lr (Rd ) α 1 1 for < + with the bound r 2 d α
k(E − 1)−α/2 kL2 →Lr ≤ Cα,r E − 2 +
d(2−r) 4r
.
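One concludes using duality and Hölder's inequality. □

Lemma 5.3 admits an immediate corollary.

Corollary 5.1. Pick $p>p(d)$. Let $V\in L^p_{\mathrm{loc,unif}}(\mathbb{R}^d)$ be such that $\|V\|_{L^p_{\mathrm{loc,unif}}}\le1$. Then, there exists $C>0$ such that, for $\phi\in H^1(\mathbb{R}^d)$ and $\epsilon>0$, we have
\[ \langle|V|\phi,\phi\rangle \le \epsilon\|\nabla\phi\|^2_{L^2(\mathbb{R}^d)} + C\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\mathbb{R}^d)}. \tag{5.17} \]

We now continue the proof of Lemma 5.2. The inequality (5.17) can be carried over to $H^1(\Lambda)$ in the following way. Let $\phi\in H^1(\Lambda)$; we can extend $\phi$ to $\mathbb{R}^d$ and we denote this extension by $\tilde\phi$ (see [29,32]). By definition, $\tilde\phi|_\Lambda=\phi$. Moreover we know that, for some $C_0>0$,
\[ \|\tilde\phi\|_{L^2(\mathbb{R}^d)} \le C_0\|\phi\|_{L^2(\Lambda)} \quad\text{and}\quad \|\nabla\tilde\phi\|_{L^2(\mathbb{R}^d)} \le C_0\|\nabla\phi\|_{L^2(\Lambda)}. \tag{5.18} \]
Pick $x_0\in\mathbb{R}^d$ and define $V_{x_0}(x)=V(x-x_0)$. Then, for $\phi\in H^1(\Lambda)$ and $\epsilon>0$, by (5.17),
\[ \begin{aligned} \langle|V_{x_0}|\phi,\phi\rangle_{L^2(\Lambda)} &= \langle|V_{x_0}|\mathbf{1}_\Lambda\tilde\phi,\tilde\phi\rangle_{L^2(\mathbb{R}^d)} \\ &\le \epsilon\|\nabla\tilde\phi\|^2_{L^2(\mathbb{R}^d)} + C\epsilon^{\frac{d}{d-2p}}\|\tilde\phi\|^2_{L^2(\mathbb{R}^d)} \\ &\le C_0\epsilon\|\nabla\phi\|^2_{L^2(\Lambda)} + C_0C\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\Lambda)}. \end{aligned} \]
Hence, there exists $C>0$ such that, for any $x_0\in\mathbb{R}^d$, $\epsilon>0$ and $\phi\in H^1(\Lambda)$, we have
\[ \langle|V_{x_0}|\phi,\phi\rangle_{L^2(\Lambda)} \le \epsilon\|\nabla\phi\|^2_{L^2(\Lambda)} + C\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\Lambda)}. \tag{5.19} \]
Now pick $\omega$ such that $m(\omega,\Lambda)=n$. By (5.19), for $\phi\in H^1(\Lambda)$, we have
\[ \langle V_\omega\phi,\phi\rangle_{L^2(\Lambda)} \ge -n\epsilon\|\nabla\phi\|^2_{L^2(\Lambda)} - Cn\epsilon^{\frac{d}{d-2p}}\|\phi\|^2_{L^2(\Lambda)}. \]
Pick $\epsilon=\frac{1}{2n}$ to get
\[ \langle(-\Delta+V_\omega)\phi,\phi\rangle_{L^2(\Lambda)} \ge \frac12\|\nabla\phi\|^2_{L^2(\Lambda)} - Cn^{\frac{2p}{2p-d}}\|\phi\|^2_{L^2(\Lambda)}. \tag{5.20} \]
Hence
\[ H^N_{\omega,\Lambda} \ge -Cn^{\frac{2p}{2p-d}}. \tag{5.21} \]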
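As a direct consequence of (5.20), we get that
\[ \sharp\{\text{eigenvalues of } H^N_{\omega,\Lambda} \text{ smaller than or equal to } -E\} \le \sharp\{\text{eigenvalues of } -\Delta^N_\Lambda \text{ smaller than or equal to } Cn^{\frac{2p}{2p-d}}-E\} \le C\,\mathrm{Vol}(\Lambda)\,n^{\frac{2p}{2p-d}}. \tag{5.22} \]
Plugging this into (5.15), for some $C>0$ (depending on $\Lambda$), we obtain
\[ \begin{aligned} N(-E) &\le \frac{C}{\mathrm{Vol}(\Lambda)}\sum_{n\ge E^{\frac{2p-d}{2p}}/C}\mathbb{E}\big(\mathbf{1}_{\{\omega;\ m(\omega,\Lambda)=n\}}\big)\mathrm{Vol}(\Lambda)\,n^{\frac{2p}{2p-d}} \\ &\le C\sum_{n\ge E^{\frac{2p-d}{2p}}/C}n^{\frac{2p}{2p-d}}\frac{(\mu\mathrm{Vol}(\Lambda))^n}{n!} \\ &\le C\,E\,\frac{(\mu\mathrm{Vol}(\Lambda))^{E^{\frac{2p-d}{2p}}/C}}{\big(E^{\frac{2p-d}{2p}}/C\big)!}. \end{aligned} \]
This ends the proof of Lemma 5.2. □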
5.3. Exponential decay estimates. One has Lemma 5.4. Let α ∈ (0, 1), p > p(d) (p(d) is defined in assumption H 2) and q > d/2. Pick χ ∈ C0∞ (Rd ) such that 0 ≤ χ ≤ 1, χ ≡ 1 on C(0, 1/2) and χ ≡ 0 outside of C(0, 3/2). Then, there exists Cα,p,q,χ > 0 and α,p,q,χ > 0 such that, for any V of the form V = V i + V r , where p
• V i ∈ Lloc,unif and kV i kLp
• For some K > 0,
Vr
loc,unif
≤ b (for some b > 0).
satisfies | V r (x) |≤ K(|x|α + 1) for all x ∈ Rd ,
there exists Cb > 0 such that, for any (γ , γ 0 ) ∈ Zd × Zd and z ∈ C \ R, one has kχγ (z − (−1 + V ))−1 χγ 0 kTq ≤ Cα,p,q,χ
kχγ (z − (−1 + V ))−1 χγ 0 kL(L2 ,H 1 )
(1 + |γ 0 |)α η(z, K, b)
0
· e−α,p,q,χ ·η(z,K,b)|δα (γ )−δα (γ )| , (1 + |γ |)α ≤ Cα,p,q,χ η(z, K, b)
(5.23)
· e−α,p,q,χ ·η(z,K,b)|δα (γ )−δα (γ )| ,
(5.24)
0
|Imz| . Here χγ (·) = χ(·−γ ), |z| + K + Cb k · kTq denotes the norm in the q th Schatten class, H 1 (Rd ) is the usual Sobolev space H 1 (Rd ) = (1 − 1)−1/2 L2 (Rd ) and dist(z, z0 ) denotes the distance in C.
where δα (x) = (1+x 2 )(1−α)/2 and η(z, K, b) =
Proof. Up to small modifications, the proof of this result is the same as the proof of Lemma 4.1 in [9]. Let us just say that, in order to prove (5.24), we use the fact that t (1 + x 2 )α/2 ∇(z − (−1 + V ))−1 is bounded on L2 (Rd ) for Imz 6= 0 (see [26]). u
5.4. Some useful facts about the single site potential Hamiltonian $H_g$. Recall that, for $g\in\mathbb{R}$, in Sect. 1.2, we have defined $H(g)=-\Delta+gV$ and $-E_-(g)$ to be the infimum of the spectrum of $H(g)$. It is well known that, for $g$ large enough, $-E_-$ is a simple eigenvalue ([23,26]); hence it is analytic in $g$. Moreover it is convex (by the variational principle). Its first and second derivative are positive. Let $\varphi_g$ be the unique positive normalized ground state associated to $-E_-(g)$. Then the eigenvalue equation gives
\[ \|\nabla\varphi_g\|^2 + g\langle V\varphi_g,\varphi_g\rangle = -E_-(g). \]
Hence $E_-'(g) = -\langle V\varphi_g,\varphi_g\rangle$; so that $E_-(g)$ satisfies
\[ \|\nabla\varphi_g\|^2 + E_-(g) = gE_-'(g). \tag{5.25} \]
Let $g$ be an inverse of $E_-$ in a neighborhood of $+\infty$. Then, one has the following

Lemma 5.5. Let $V\in L^p(\mathbb{R}^d)$ where $p$ is chosen as in assumption H2. Then, there exists $C>0$ such that, for $g$ and $E$ sufficiently large, one has
\[ \frac1C g \le E_-(g) \le Cg^{\frac{2p}{2p-d}}. \tag{5.26} \]
\[ \frac{E_-(g)}{g} \text{ is bounded if and only if } V_- \text{ is bounded. In this case, we have: } E_-(g)\underset{g\to+\infty}{\sim}\|V_-\|_\infty g. \tag{5.27} \]
\[ \exists R>0 \text{ such that } \int_{|x|>R}|\varphi_g|^2dx + \int_{|x|>R}|\nabla\varphi_g|^2dx \to 0 \text{ as } g\to+\infty. \tag{5.28} \]
\[ \|\nabla\varphi_g\|^2 \le Cg^{\frac{2p}{2p-d}}. \tag{5.29} \]
\[ \frac1C E^{\frac{2p-d}{2p}} \le g(E) \le CE. \tag{5.30} \]
\[ \exists C>0,\ E_0>0 \text{ such that } \forall a>0,\ \forall E>E_0, \text{ one has } g(E)\le g(E+a)\le g(E)+Ca. \tag{5.31} \]

Proof. The two sided bound (5.26) is an immediate corollary of (5.17) and of assumption H3. The definition of $g$ and (5.26) give (5.30). The proof of (5.27) is easy and left to the reader. By Eqs. (5.25) and (5.26), for $g$ large enough, we have
\[ E_-'(g) \ge \frac1g E_-(g) \ge \frac1C. \]
Integrating this relation, we get
\[ E_-(g+k)-E_-(g) \ge \frac kC. \]
Hence, as $g$ is increasing, for $E$ large enough and $a>0$, we have
\[ g(E) \le g(E+a) \le g\big(E_-(g(E)+Ca)\big) = g(E)+Ca. \]
This completes the proof of (5.31). Let us now prove (5.29). As $E_-'$ is increasing, for $\varepsilon>0$, we have
\[ E_-(g(1+\varepsilon))-E_-(g) \ge \varepsilon gE_-'(g) \ge \varepsilon\|\nabla\varphi_g\|^2 \]
by Eq. (5.25). Equation (5.29) is then an immediate consequence of (5.26). Let us now prove (5.28). We will distinguish two cases: when $V_-$ is bounded and when it is not. Let us start by assuming $V_-$ bounded and let $v_-$ be its essential infimum. Then, by (5.27), as $g\to\infty$, we have
$$\|\nabla\varphi_g\|^2 + g\int V|\varphi_g|^2dx = gv_- + o(g).$$
Hence,
$$0 \le \int (V-v_-)|\varphi_g|^2dx \le o(1). \tag{5.32}$$
By assumptions H1 and H2, there exists $\delta>0$ such that for $|x|\ge1/\delta$, $V(x)-v_-\ge\delta$. Then, as $V-v_-$ is non negative, Eq. (5.32) tells us that
$$\int_{|x|\ge1/\delta}|\varphi_g|^2dx \le o(1).$$
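If $V$ is not bounded from below, let $\chi$ be a $C_0^\infty$ cut-off for the cube $C(0,R)$. The eigenvalue equation for $\varphi_g$ gives us
$$(-\Delta-E(g))[(1-\chi)\varphi_g] = gV(1-\chi)\varphi_g + 2\nabla\chi\cdot\nabla\varphi_g + \Delta\chi\,\varphi_g.$$
So that $(1-\chi)\varphi_g = a_1+a_2+a_3$, where
$$a_1 = g(-\Delta-E(g))^{-1}[V(1-\chi)\varphi_g],\quad a_2 = 2(\Delta+E(g))^{-1}[\nabla\chi\cdot\nabla\varphi_g],\quad a_3 = (\Delta+E(g))^{-1}[\Delta\chi\,\varphi_g].$$
For $R$ large enough, $V$ is bounded on the support of $1-\chi$; this and (5.27) imply that $\|a_1\|+\|a_3\|\to0$ as $g\to+\infty$. We write
$$a_2 = 2\left((\Delta+E(g))^{-1}[\Delta,\nabla\chi] + \nabla\chi\right)(\Delta+E(g))^{-1}\nabla\varphi_g.$$
As $(\Delta+E(g))^{-1}\nabla\to0$ in $L^2$-operator norm when $g\to+\infty$, we have $\|a_2\|\to0$ as $g\to+\infty$. Hence $\|(1-\chi)\varphi_g\|\to0$ as $g\to+\infty$. We compute
$$\|\nabla[(1-\chi)\varphi_g]\|^2 = 2\langle(1-\chi)\nabla\varphi_g,\varphi_g\nabla\chi\rangle + \langle\Delta\chi\,\varphi_g,\chi\varphi_g\rangle = 2\langle\nabla[(1-\chi)\varphi_g],\varphi_g\nabla\chi\rangle - 2\langle(\nabla\chi)^2\varphi_g,\varphi_g\nabla\chi\rangle + \langle\Delta\chi\,\varphi_g,\chi\varphi_g\rangle.$$
For $R$ sufficiently large, by what we have just proved, the last two terms in the equation above tend to 0 as $g\to+\infty$. Using the Cauchy–Schwarz inequality, this implies
$$\|\nabla[(1-\chi)\varphi_g]\|^2 \le 2\|\nabla[(1-\chi)\varphi_g]\|\,\|\varphi_g\nabla\chi\| + o(1).$$
As $\|\varphi_g\nabla\chi\|\to0$ when $g\to+\infty$, we get that $\|\nabla[(1-\chi)\varphi_g]\|\to0$ when $g\to+\infty$. This completes the proof of Eq. (5.28), hence of Lemma 5.5. □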
As a corollary of this lemma, we get

Lemma 5.6. Let $V\in L^p(\mathbb{R}^d)$ where $p$ is chosen as in assumption H2. Then, there exists $R_0>0$ such that, if $\chi\in C_0^\infty(\mathbb{R}^d)$, $0\le\chi\le1$, $\chi(x)=1$ if $|x|\le R_0$ and $\chi(x)=0$ if $|x|\ge2R_0$, then $\psi_g = \chi\cdot\varphi_g/\|\chi\cdot\varphi_g\|$ is an asymptotic ground state and we have
$$\left\{\alpha>0;\ \lim_{g\to+\infty}\ \sup_{|a|\le g^{-\alpha}}\frac{g\,|\langle(\tau_aV-V)\psi_g,\psi_g\rangle|}{E_-(g)} = 0\right\}\neq\emptyset.$$
Proof. If we pick $\chi$ as above for $R_0$ large enough, by Lemma 5.5, we know that
$$\|(1-\chi)\varphi_g\| + \|(1-\chi)\nabla\varphi_g\|\to0\ \text{as}\ g\to+\infty. \tag{5.33}$$
Then, as $\varphi_g$ is the ground state of $H(g)$, we compute
$$\langle(H(g)-E(g))\psi_g,\psi_g\rangle = \langle(H(g)-E(g))(1-\chi)\varphi_g,(1-\chi)\varphi_g\rangle = -E(g)\|(1-\chi)\varphi_g\|^2 + \langle\nabla\chi\nabla\varphi_g,(1-\chi)\varphi_g\rangle + \langle\Delta\chi\nabla\varphi_g,(1-\chi)\varphi_g\rangle.$$
Equation (5.33) tells us that $\psi_g$ is an asymptotic ground state as $\|\chi\cdot\varphi_g\|\to1$ when $g\to+\infty$. Define $\phi_g = (E_-(g)-\Delta)^{1/2}\varphi_g$. Then, by (5.26) and (5.29), we have
$$\|\phi_g\|^2 \le Cg^{\frac{2p}{2p-d}}.$$
Pick $a\in\mathbb{R}^d$ and write $\langle(\tau_aV-V)\varphi_g,\varphi_g\rangle = \langle\Gamma(g)\phi_g,\phi_g\rangle$, where
$$\Gamma(g) = (E_-(g)-\Delta)^{-1/2}(\tau_aV-V)(E_-(g)-\Delta)^{-1/2}.$$
We now estimate the norm of $\Gamma(g)$. We write
$$\Gamma(g) = (\tau_a-1)(E_-(g)-\Delta)^{-1/2}V(E_-(g)-\Delta)^{-1/2}\tau_{-a} + (E_-(g)-\Delta)^{-1/2}V(E_-(g)-\Delta)^{-1/2}(\tau_{-a}-1).$$
Hence, for $0<\delta<1/2$ such that $-1+2\delta+d/(2p)<0$, by Lemma 5.3, we have
$$\|\Gamma(g)\| \le 2\|(\tau_a-1)(E_-(g)-\Delta)^{-\delta}\|\,\|(E_-(g)-\Delta)^{-1/2+\delta}V(E_-(g)-\Delta)^{-1/2}\| \le C[E_-(g)]^{-1+\delta+d/(2p)}|a|^\delta(E_-(g))^{-\delta},$$
writing $(\tau_a-1)(E_-(g)-\Delta)^{-\delta}$ as a Fourier multiplier. Hence
$$\sup_{|a|\le g^{-\alpha}}\frac{g\,|\langle(\tau_aV-V)\varphi_g,\varphi_g\rangle|}{E_-(g)} \le Cg[E_-(g)]^{-2+d/(2p)}g^{-\alpha\delta+\frac{2p}{2p-d}}.$$
Using (5.26) and picking $\alpha$ large enough, we get
$$\lim_{g\to+\infty}\ \sup_{|a|\le g^{-\alpha}}\frac{g\,|\langle(\tau_aV-V)\varphi_g,\varphi_g\rangle|}{E_-(g)} = 0.$$
Now using Eqs. (5.26), (5.33) and the fact that, for $R_0$ large enough, $\tau_aV-V$ is bounded outside $\{|x|\le2R_0\}$, we get
$$\lim_{g\to+\infty}\ \sup_{|a|\le g^{-\alpha}}\frac{g\,|\langle(\tau_aV-V)\psi_g,\psi_g\rangle|}{E_-(g)} = 0.$$
This ends the proof of Lemma 5.6. □

To end this section we describe the ground state energy of $H(g)$ and an asymptotic ground state in the case when $V$ satisfies assumption H1”. For $1\le i\le q$, define $H_g^i = -\Delta+gV_i$, where $V_i$ is defined in Eq. (1.10). $H_g^i$ is a form bounded perturbation of $-\Delta$ with relative bound 0. By the homogeneity properties of $V_i$, when $E_i<0$ ($E_i$ is the ground state energy of $H_1^i$), it is obvious that the ground state energy of $H_g^i$ is given by
$$E_i(g) = E_i\,g^{\frac{2}{2-\nu_i}}. \tag{5.34}$$
Define $E_\dagger(g) = \inf E_i(g)$. Moreover, if $\varphi_i$ is the ground state of $H_1^i$ then $\varphi_g^{(i)}$, the ground state of $H_g^i$, has the following form:
$$\varphi_g^{(i)}(x) = g^{\frac{d}{2(2-\nu_i)}}\varphi_i\bigl(g^{\frac{1}{2-\nu_i}}x\bigr). \tag{5.35}$$
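The scaling behind (5.34) and (5.35) can be made explicit. The sketch below assumes that "the homogeneity properties of $V_i$" means $V_i(\lambda x) = \lambda^{-\nu_i}V_i(x)$ for $\lambda>0$ with $\nu_i<2$ (Eq. (1.10) is not restated in this section, so this is an assumption), and the dilation operator $U_\lambda$ is notation introduced only for this illustration.

```latex
% Assumption: V_i(\lambda x) = \lambda^{-\nu_i} V_i(x) for \lambda > 0, with \nu_i < 2.
\begin{align*}
(U_\lambda f)(x) &:= \lambda^{d/2} f(\lambda x)
  \quad\text{(unitary on } L^2(\mathbb{R}^d)\text{)},\\
-\Delta\,U_\lambda &= \lambda^{2}\,U_\lambda(-\Delta),\qquad
V_i\,U_\lambda = \lambda^{\nu_i}\,U_\lambda V_i,\\
H_g^i\,U_\lambda &= \lambda^{2}\,U_\lambda\bigl(-\Delta + g\lambda^{\nu_i-2}V_i\bigr)
  = \lambda^{2}\,U_\lambda H_1^i
  \quad\text{for } \lambda = g^{1/(2-\nu_i)},\\
H_g^i\,(U_\lambda\varphi_i) &= g^{2/(2-\nu_i)}E_i\,(U_\lambda\varphi_i),\qquad
(U_\lambda\varphi_i)(x) = g^{\frac{d}{2(2-\nu_i)}}\varphi_i\bigl(g^{\frac{1}{2-\nu_i}}x\bigr).
\end{align*}
```

Since $U_\lambda$ is unitary, $H_g^i$ and $\lambda^2H_1^i$ have the same spectrum, which is consistent with both displayed formulas.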
We have

Lemma 5.7. Assume $V$ satisfies assumptions H1 and H1”. Then, we have
$$E(g)\underset{g\to+\infty}{\sim}E_\dagger(g).$$
Moreover, if $1\le i_0\le q$ is such that $E_{i_0}(g) = E_\dagger(g)$, then, if $\chi_{i_0}$ is a $C_0^\infty$ cut-off function for a sufficiently small neighborhood of $x_{i_0}$, $\psi_g^{(i_0)}(\cdot) = \chi_{i_0}\varphi_g^{(i_0)}(\cdot-x_{i_0})/\|\chi_{i_0}\varphi_g^{(i_0)}(\cdot-x_{i_0})\|$ is an asymptotic ground state for $H(g)$.

Proof. Fix $\varepsilon>0$ small. As in Sect. 4.3, at the cost of adding a term to $V_1$, we may assume that the functions $W_i$ stay non-negative and smaller than $1+\varepsilon$ and that the supports of the $W_i$ are two by two disjoint. For $1\le i\le q$, let $\chi_i$ be a $C_0^\infty$ cut-off function of a neighborhood of $x_i$ such that, on the support of $\chi_i$, the function $W_i(\cdot-x_i)$ stays larger than $1-\varepsilon$. Define $\chi_0^2 = 1-\sum_{1\le i\le q}\chi_i^2$. Pick $\varphi\in H^1(\mathbb{R}^d)$. Then, for any $\varepsilon>0$, we
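have
$$\begin{aligned}
\langle H(g)\varphi,\varphi\rangle &= \sum_{i=0}^q\langle-\Delta\chi_i^2\varphi,\varphi\rangle + g\sum_{i=0}^q\langle V\chi_i^2\varphi,\varphi\rangle\\
&= \sum_{i=0}^q\Bigl(\|\nabla(\chi_i\varphi)\|^2 - 2\,\mathrm{Re}\langle\chi_i\nabla\varphi,\varphi\nabla\chi_i\rangle - \langle|\nabla\chi_i|^2\varphi,\varphi\rangle\Bigr) + \sum_{i=1}^q g\langle\tau_{x_i}(W_iV_i)\chi_i^2\varphi,\varphi\rangle + g\langle V_0\varphi,\varphi\rangle\\
&\ge (1-\varepsilon)\sum_{i=1}^q\Bigl(\|\nabla(\chi_i\varphi)\|^2 + \frac{g}{1-\varepsilon}\langle\tau_{x_i}(W_iV_i)\chi_i\varphi,\chi_i\varphi\rangle\Bigr) + g\langle V_0\varphi,\varphi\rangle + \frac{1}{1-\varepsilon}\sum_{i=0}^q\|\nabla\chi_i\cdot\varphi\|^2,
\end{aligned} \tag{5.36}$$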
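where $V_0$ is some bounded potential (depending on $\varepsilon$). On the other hand, on the support of $\chi_i$, we have $W_iV_i \ge V_i - \varepsilon|V|_i$ where
$$|V|_i(\cdot) := \frac{|h_i(\cdot)|}{|\cdot|^{\nu_i}}.$$
Hence
$$\|\nabla(\chi_i\varphi)\|^2 + \frac{g}{1-\varepsilon}\langle\tau_{x_i}(W_iV_i)\chi_i^2\varphi,\varphi\rangle \ge (1-\varepsilon)\Bigl(\|\nabla(\chi_i\varphi)\|^2 + \frac{g}{1-\varepsilon}\langle\tau_{x_i}(V_i)\chi_i\varphi,\chi_i\varphi\rangle\Bigr) + \varepsilon\Bigl(\|\nabla(\chi_i\varphi)\|^2 - \frac{g}{1-\varepsilon}\langle\tau_{x_i}(|V|_i)\chi_i\varphi,\chi_i\varphi\rangle\Bigr). \tag{5.37}$$
By Eq. (5.37), we get that
$$\|\nabla(\chi_i\varphi)\|^2 + \frac{g}{1-\varepsilon}\langle\tau_{x_i}(W_iV_i)\chi_i^2\varphi,\varphi\rangle \ge (1+\varepsilon C)\,E_i\Bigl(\frac{g}{1-\varepsilon}\Bigr)\|\chi_i\varphi\|^2 \tag{5.38}$$
for some constant $C$ (independent of $\varepsilon$) given by the lowest eigenvalue of $-\Delta-g|V|_i$. Here we used the fact that this eigenvalue has the same growth rate in $g$ as $E_i(g)$; indeed, in the present case, this growth rate only depends on the homogeneity properties of the potential $|V|_i$ (as can be seen by a scaling argument). Putting Eqs. (5.38) and (5.36) together, we obtain that there exists $C>0$ such that, for any $\varepsilon>0$ small and $\varphi\in H^1(\mathbb{R}^d)$, we have
$$\langle H(g)\varphi,\varphi\rangle \ge \Bigl((1-\varepsilon)E_\dagger\Bigl(\frac{g}{1-\varepsilon}\Bigr) + O(g)\Bigr)\|\varphi\|^2.$$
This proves that
$$\limsup_{g\to+\infty}\frac{E(g)}{E_\dagger(g)} \le 1.$$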
Pick $1\le i_0\le q$ such that $E_{i_0}(g) = E_\dagger(g)$ and $\chi$ a $C_0^\infty$ cut-off function for a sufficiently small neighborhood of $x_{i_0}$. By the definition of $\varphi_g^{(i_0)}$, it is immediate that $\|\chi\varphi_g^{(i_0)}\|\to1$ as $g\to+\infty$. An immediate computation gives that
$$\lim_{g\to+\infty}\frac{\langle H(g)\chi\varphi_g^{(i_0)},\chi\varphi_g^{(i_0)}\rangle}{E_{i_0}(g)} = 1.$$
This implies that
$$\liminf_{g\to+\infty}\frac{E(g)}{E_\dagger(g)} \ge 1.$$
Hence, it completes the proof of Lemma 5.7. □

Acknowledgement. F. K. thanks the Erwin Schrödinger Institute (Vienna) where this work was partially done and A. Trouvé for interesting discussions about Poisson processes. The authors are grateful to the referee for his very careful reading of the paper and for his pertinent remarks that allowed them to correct a number of misprints and to make several important improvements.
References
1. Combes, J.M. and Hislop, P.D.: Localization for some continuous random hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994)
2. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin: Springer Verlag, 1987
3. Donsker, M. and Varadhan, S.R.S.: Asymptotics for the Wiener sausage. Commun. Pure and App. Math. 28, 525–565 (1975)
4. Efros, M. and Shlovski, B.: Electronic properties of doped semi-conductors. Heidelberg: Springer Verlag, 1984
5. Helffer, B. and Sjöstrand, J.: On diamagnetism and the De Haas-Van Alphen effect. Ann. de l’Institut Henri Poincaré, série Phys. Théor. 52, 303–375 (1990)
6. Hörmander, L.: The analysis of linear partial differential equations. I. Vol. 256 of Grundlehren der Mathematischen Wissenschaften, Berlin–Heidelberg–New York: Springer Verlag, 1990
7. Kirsch, W.: Random Schrödinger operators. In: A. Jensen, H. Holden, editors, Schrödinger Operators, Number 345 in Lecture Notes in Physics, Berlin: Springer Verlag, 1989
8. Klopp, F.: An asymptotic expansion for the density of states of a random Schrödinger operator with Bernoulli disorder. Random Operators and Stochastic Equations 3 (4), 315–332 (1995)
9. Klopp, F.: A low concentration asymptotic expansion for the density of states of a random Schrödinger operator with Poisson disorder. J. Funct. Anal. 145, 267–295 (1995)
10. Klopp, F.: Band edge behaviour for the integrated density of states of random Jacobi matrices in dimension 1. J. Stat. Phys. 90 (3–4), 927–947 (1998)
11. Klopp, F.: Internal Lifshits tails for random perturbations of periodic Schrödinger operators. Duke Math. J. 1999. To appear
12. Klopp, F.: Precise high energy asymptotics for the integrated density of states of an unbounded random Jacobi matrix. Rev. Math. Phys. 1999. To appear
13. Klopp, F. and Pastur, L.: In progress
14. Kuchment, P.: Floquet theory for partial differential equations. Vol. 60 of Operator Theory: Advances and Applications, Basel: Birkhäuser, 1993
15. Landau, L. and Lifshitz, L.: Mécanique quantique, théorie non-relativiste. Moscou: Editions MIR, 1966
16. Lifshitz, I.M.: Structure of the energy spectrum of impurity bands in disordered solid solutions. Sov. Phys. JETP 17, 1159–1170 (1963)
17. Lifshitz, I.M.: Energy spectrum structure and quantum states of disordered condensed systems. Sov. Phys. Uspekhi 7, 549–573 (1965)
18. Lifshitz, I.M., Gredeskul, S.A. and Pastur, L.A.: Introduction to the theory of disordered systems. New York: Wiley, 1988
19. Mather, J.N.: On Nirenberg’s proof of Malgrange’s preparation theorem. In: Proceedings of Liverpool Singularities-Symposium I, Number 192 in Lecture Notes in Mathematics, Berlin: Springer Verlag, 1971
20. Pastur, L.: Behaviour of some Wiener integrals as t → +∞ and the density of states of the Schrödinger equation with a random potential. Teor.-Mat.-Fiz 32, 88–95 (1977) (in Russian)
21. Pastur, L. and Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer Verlag, 1992
22. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol IV: Analysis of Operators. New York: Academic Press, 1978
23. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol I: Functional Analysis. New York: Academic Press, 1980
24. Shubin, M.A.: Spectral theory and index of elliptic operators with almost periodic coefficients. Russ. Math. Surv. 34, 109–157 (1979)
25. Simon, B.: Trace ideals and their applications. Cambridge: Cambridge University Press, 1979
26. Simon, B.: Schrödinger semigroups. Bull. Am. Math. Soc. 7, 447–526 (1982)
27. Simon, B.: Lifshitz tails for the Anderson model. J. Stat. Phys. 38, 65–76 (1985)
28. Sjöstrand, J.: Microlocal analysis for periodic magnetic Schrödinger equation and related questions. In: Microlocal analysis and applications, Vol. 1495 of Lecture Notes in Mathematics, Berlin: Springer Verlag, 1991
29. Stein, E.: Singular integrals and Differentiability properties of functions. Princeton, N.J.: Princeton University Press, 1970
30. Sznitman, A.: Lifshitz tails and Wiener sausages. I. J. Funct. Anal. 94, 223–246 (1990)
31. Sznitman, A.: Fluctuations of principal eigenvalues and random scales. Commun. Math. Phys. 189, 337–363 (1997)
32. Taylor, M.: Partial differential equations. New York–Berlin: Springer, 1996

Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 105 – 136 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Theta Functions and Hodge Numbers of Moduli Spaces of Sheaves on Rational Surfaces
Lothar Göttsche
International Center for Theoretical Physics, Strada Costiera 11, P.O. Box 586, 34100 Trieste, Italy. E-mail: [email protected]
Received: 26 August 1998 / Accepted: 10 March 1999
Abstract: We compute generating functions for the Hodge numbers of the moduli spaces of $H$-stable rank 2 sheaves on a rational surface $S$ in terms of theta functions for indefinite lattices. If $H$ lies in the closure of the ample cone and has self-intersection 0, it follows that the generating functions are Jacobi forms. In particular the generating functions for the Euler numbers can be expressed in terms of modular forms, and their transformation behaviour is compatible with the predictions of S-duality. We also express the generating functions for the signatures in terms of modular forms. It turns out that these generating functions are also (with respect to another developing parameter) the generating function for the Donaldson invariants of $S$ evaluated on all powers of the point class.

1. Introduction

Let $(S,H)$ be a rational algebraic surface with an ample divisor. We assume that $K_SH\le0$. In the current paper we want to compute the Betti numbers and Hodge numbers of the moduli spaces $M_S^H(C,d)$ of $H$-semistable torsion-free sheaves of rank 2 on $S$. In [V-W] Vafa and Witten made a number of predictions about the Euler numbers of moduli spaces of sheaves on algebraic surfaces: in many cases their generating functions should be given by modular forms. In the case of rational surfaces this cannot be true for all polarizations $H$: the moduli spaces and their Euler numbers depend on $H$, and this dependence is not compatible with the modularity properties. We study the limit of the generating function for the Euler numbers as $H$ approaches a point $F$ on the boundary of the ample cone with $F^2=0$ (see below for the definitions). It turns out that this limit is indeed a (quasi)-modular form (see Sect. 2.3).
More generally we will relate the generating functions for the Hodge numbers and Betti numbers of the $M_S^H(C,d)$ to certain theta functions of indefinite lattices, which were introduced and studied in [G-Z] in order to show structural results about Donaldson invariants. That the Euler numbers and signatures are given by modular and quasimodular
forms follows then from the fact that these theta functions are Jacobi forms. As in [G-Z], where the Donaldson invariants were studied, the theta functions enter the calculations by summing over walls. The ample cone has a chamber structure, and the moduli spaces $M_S^H(C,d)$ only change when $H$ crosses a wall. The structure of the walls for the moduli spaces is precisely the same as for the Donaldson invariants. Therefore we can use again the same theta functions as in [G-Z].

We write our results for the $\chi_y$-genera instead of for the Hodge numbers, which is equivalent as all the cohomology is of type $(p,p)$ [Be]. One could also have instead used the Poincaré polynomial, but I believe that in general the $\chi_y$-genus will be better behaved. By specializing the generating functions for the $\chi_y$-genera of the moduli spaces, we also obtain that the generating functions for the signatures are given by modular forms, a fact that does not seem to have been predicted by the physics literature. It turns out that the generating function for the signatures is better behaved than that for the Euler numbers. If $F$ lies on the boundary of the positive cone, then the corresponding generating function for the signatures is a modular form and not just a quasimodular form.

A surprising and interesting result is that the signatures of the moduli spaces $M_S^H(C,d)$ are closely related to the corresponding Donaldson invariants $\Phi_C^{S,H}$. For any point $H$ in the ample cone, the generating function for the signatures is also the generating function for the Donaldson invariants $\Phi_C^{S,H}(p^r)$ evaluated on all powers of the point class $p\in H_0(S,\mathbb{Z})$. The signatures of the moduli spaces are just the coefficients of the Fourier development of this generating function, whereas the Donaldson invariants are (up to some elementary factors) the coefficients of the development of this function into powers of a modular function $u(\tau)$ for $\Gamma(2)$. In particular knowing all the signatures of the moduli spaces $M_S^H(C,d)$ is equivalent to knowing all the Donaldson invariants $\Phi_C^{S,H}(p^r)$. This relation also persists under our extension of the generating functions and, together with the formulas for the K3 surfaces, suggests a similar result for any algebraic surface. The proof of this result uses the conjecture of Kotschick and Morgan [K-M]. Feehan and Leness [F-L1,F-L2,F-L3,F-L4] are working towards the proof of this conjecture.

This paper grew out of discussions with Jun Li on some aspects of [V-W]. I would like to thank K. Yoshioka for several very useful comments, G. Thompson for useful discussions and the referee for many useful comments and improvements.

While preparing this manuscript I learned about related work. In [M-N-V-W] new predictions are made about the Euler numbers of $M_S^F(C,d)$, where $S$ is a rational elliptic surface, $F$ is the class of a fibre and $CF$ even. Yoshioka [Y4] has shown these predictions. Li and Qin ([L-Q1,L-Q2]) have shown blowup formulas for the Euler numbers and virtual Hodge polynomials of $M_S^H(C,d)$ for arbitrary $S$. After this paper was submitted Baranovsky [Ba] displayed an action of the oscillator algebra on the cohomology of the moduli spaces $M_S^F(r,C,d)$ and gave a simple relation between the Betti numbers of the Gieseker and Uhlenbeck compactifications.
2. Notations, Definitions and Background

In this paper $S$ usually denotes a smooth algebraic surface over $\mathbb{C}$. Often we will assume $S$ to be also rational. For a variety $Y$ over $\mathbb{C}$, we denote by upper case letters the classes in $H^2(Y,\mathbb{C})$, unless they appear as walls (see below), when we denote them by Greek letters. For $A,B\in H^2(Y,\mathbb{C})$ the intersection product on $H^2(Y,\mathbb{C})$ is just denoted by $AB$. Later we will also need the negative of the intersection product, which we denote
by $\langle A,B\rangle$. For a smooth compact variety $Y$ of complex dimension $d$ let
$$h(Y,x,y) := \sum_{p,q}(-1)^{p+q}h^{p,q}(Y)x^py^q$$
be the Hodge polynomial (note the signs), and let
$$H(Y) = H(Y:x,y) := (xy)^{-d/2}h(Y,x,y).$$
The advantage of this (Laurent) polynomial in $x^{1/2}$, $y^{1/2}$ is that it is symmetric around degree 0. In a similar way let $P(Y) = P(Y:y) = \sum_i(-1)^ib_i(Y)y^{i-d} := H(Y:y,y)$ be the (shifted) Poincaré polynomial (again note the signs) and let $X_y(Y) = H(Y:1,y)$ be the (shifted) $\chi_{-y}$-genus. Then the Euler number of $Y$ is $e(Y) = X_1(Y) = P(Y,1)$, and the signature is $\sigma(Y) := (-1)^{d/2}X_{-1}(Y)$.

2.1. Virtual Hodge polynomials and the Weil conjectures. Virtual Hodge polynomials were introduced in [D-K]. For $Y$ a complex variety the cohomology $H_c^k(Y,\mathbb{Q})$ with compact support carries a natural mixed Hodge structure. If $Y$ is smooth and projective, this Hodge structure coincides with the classical one. Following [Ch], we put
$$h_v(Y:x,y) := \sum_{p,q}\sum_k(-1)^kh^{p,q}(H_c^k(Y,\mathbb{Q}))x^py^q.$$
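As a sanity check on the sign and shift conventions for $h$, $H$, $X_y$ and $P$ (for a smooth projective variety, where the classical and virtual polynomials agree), one can work them out for the projective plane. This example is not in the source; it uses only the standard fact that $h^{p,p}(\mathbb{P}^2)=1$ for $p=0,1,2$ and all other Hodge numbers vanish.

```latex
\begin{align*}
h(\mathbb{P}^2,x,y) &= 1 + xy + x^2y^2, &
H(\mathbb{P}^2:x,y) &= (xy)^{-1} + 1 + xy,\\
X_y(\mathbb{P}^2) &= y^{-1} + 1 + y, &
P(\mathbb{P}^2:y) &= y^{-2} + 1 + y^{2},\\
e(\mathbb{P}^2) &= X_1(\mathbb{P}^2) = 3, &
\sigma(\mathbb{P}^2) &= (-1)^{2/2}X_{-1}(\mathbb{P}^2) = -(-1+1-1) = 1.
\end{align*}
```

Both values agree with the usual Euler number and signature of $\mathbb{P}^2$, and each Laurent polynomial is symmetric around degree 0 as claimed.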
These virtual Hodge polynomials have the following properties (see [Ch]). If $Y$ is a smooth projective variety, then $h_v(Y:x,y) = h(Y:x,y)$. For $Z\subset Y$ Zariski-closed we have $h_v(Y:x,y) = h_v(Y\setminus Z:x,y) + h_v(Z:x,y)$. For $f:Z\to Y$ a Zariski-locally trivial fibre bundle with fibre $F$, we have $h_v(Z:x,y) = h_v(Y:x,y)h_v(F:x,y)$. Finally $e(Y) = h_v(Y,1,1)$ for any complex variety $Y$. We denote by
$$p_v(Y:y) := h_v(Y:y,y) = \sum_i(-1)^ib_i^v(Y)y^i$$
the virtual Poincaré polynomial. If $Y$ has pure complex dimension $d$ (or sometimes when $Y$ has expected dimension $d$), we write $H_v(Y) = H_v(Y:x,y) := (xy)^{-d/2}h_v(Y:x,y)$, $X_y^v(Y) := H_v(Y:1,y)$ and $P_v(Y) = P_v(Y:y) := y^{-d}p_v(Y:y)$. If $Y$ is smooth and projective of dimension $d$ we have therefore $H_v(Y) = H(Y)$, $X_y^v(Y) = X_y(Y)$ and $P_v(Y) = P(Y)$.

Let $Y$ be an arbitrary quasiprojective variety (not necessarily irreducible or smooth) over $\mathbb{C}$. We want to show that the Weil conjectures still compute the virtual Poincaré polynomials. This was pointed out to me by Jun Li, and seems to be known to the experts.

Proposition 2.1. There is a finitely generated subring $A = \mathbb{Z}[a_1,\dots,a_l]\subset\mathbb{C}$ and a variety $Y_A$ over $A$, such that $Y = Y_A\times_A\mathbb{C}$, and the following holds: For $m$ a maximal ideal of $A$ we put $Y_m := Y_A\times_AA/m$. There is a nonempty dense open subset $U$ of $\mathrm{spec}(A)$, such that if $m\in U$ is a maximal ideal of $A$ with quotient field $\mathbb{F}_q$, then there exist complex numbers $(a_{i,j})_{i,j}$ with $|a_{i,j}| = q^{i/2}$, such that for all $n\in\mathbb{Z}_{>0}$,
$$\#Y_m(\mathbb{F}_{q^n}) = \sum_i\sum_{j=1}^{b_i^v(Y)}(-1)^ia_{i,j}^n.$$
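A small numerical illustration of Proposition 2.1, not taken from the source: for projective space the cell decomposition $\mathbb{P}^n = \mathbb{A}^0\sqcup\dots\sqcup\mathbb{A}^n$ together with the additivity property gives $p_v(\mathbb{P}^n:y) = 1+y^2+\dots+y^{2n}$ (assuming the standard value $h_v(\mathbb{A}^k:x,y) = (xy)^k$), so one expects a single eigenvalue $a_{2k,1} = q^k$ in each even degree. The sketch below compares this prediction with a brute-force point count over prime fields only; the function names are hypothetical.

```python
from itertools import product


def count_projective_points(n, p):
    """Count points of P^n over the prime field F_p by brute force.

    A point is a nonzero vector in F_p^(n+1) up to scaling; we count the
    representatives whose first nonzero coordinate equals 1.
    """
    count = 0
    for v in product(range(p), repeat=n + 1):
        first_nonzero = next((c for c in v if c != 0), None)
        if first_nonzero == 1:
            count += 1
    return count


def predicted_count(n, q):
    """Point count predicted by p_v(P^n : y) = 1 + y^2 + ... + y^(2n):
    one eigenvalue q^k in each even degree 2k, all with sign (+1)."""
    return sum(q ** k for k in range(n + 1))


if __name__ == "__main__":
    for n, p in [(1, 5), (2, 3), (3, 2)]:
        print(n, p, count_projective_points(n, p), predicted_count(n, p))
```

The check covers only the case of a prime field and a smooth projective $Y$, where the statement is the classical one; it is meant to make the shape of the formula concrete, not to test the general proposition.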
Proof. If $Y$ is smooth and projective, this is part of the Weil conjectures, proven by Deligne [De]. The general case is a simple consequence of this and resolution of singularities in characteristic 0. Let $d$ be the largest dimension of a component of $Y$. The proof is by induction on $d$, the case $d=0$ being trivial. Write $Y = Y_0\sqcup W$, where $Y_0$ is the smooth locus of $Y$, and let $\tilde Y = Y_0\sqcup Z$ be a smooth compactification of $Y$. Then $p_v(Y,z) = p(\tilde Y,z) + p_v(W,z) - p_v(Z,z)$. Let $A = \mathbb{Z}[a_1,\dots,a_l]\subset\mathbb{C}$ be a finitely generated subring, such that $Y$, $\tilde Y$, $Z$, $W$ are already defined over $A$. Let $U$ be an open dense subset of $\mathrm{spec}(A)$ where the proposition applies to $\tilde Y$ (by the usual Weil conjectures), $Z$ and $W$ (by induction). Let $m\in U$ be a maximal ideal with quotient field $\mathbb{F}_q$. Then $\#Y_m(\mathbb{F}_{q^n}) = \#\tilde Y_m(\mathbb{F}_{q^n}) + \#W_m(\mathbb{F}_{q^n}) - \#Z_m(\mathbb{F}_{q^n})$, and the result follows. □

2.2. Moduli spaces. Let again $S$ be an algebraic surface, $H$ a general ample divisor on $S$, and let $C\in H^2(X,\mathbb{Z})$. Let $M_S^H(r,C,d)$ denote the moduli space of $H$-semistable sheaves $E$ on $S$ (in the sense of Gieseker-Maruyama), with $c_1(E) = C$ and discriminant $d = c_2(E) - \frac{r-1}{2r}C^2$. Let $M_S^H(r,C,d)_s$ denote the open subspace of $H$-slope stable sheaves and $N_S^H(r,C,d)$ the subspace of $H$-slope stable locally free sheaves. If $d$ is sufficiently large, then $M_S^H(r,C,d)$ is irreducible and generically smooth of dimension $e = 2rd-(r^2-1)\chi(\mathcal{O}_S)$ (see e.g. [H-L]). We put $M_S^H(C,d) := M_S^H(2,C,d)$, $M_S^H(C,d)_s := M_S^H(2,C,d)_s$ and $N_S^H(C,d) := N_S^H(2,C,d)$. If $S$ is a rational algebraic surface and $H$ is an ample divisor with $HK_S\le0$, then a slope stable sheaf $E$ fulfills $\mathrm{Ext}^2(E,E) = \mathrm{Hom}(E,E\otimes K_S) = 0$, and therefore $M_S^H(r,C,d)_s$ is smooth of dimension $e = 2rd-(r^2-1)$.

2.3. Modular forms. We give a brief review of the results for modular forms that we will need. It might be helpful to also look at [G-Z] Sect. 2.2. Let $\mathfrak{H} := \{\tau\in\mathbb{C}\mid\mathrm{Im}(\tau)>0\}$ be the complex upper half-plane.
For $\tau\in\mathfrak{H}$ let $q := e^{2\pi i\tau}$ and $q^{1/n} := e^{2\pi i\tau/n}$. For $a\in\mathbb{Q}$ we often write $(-1)^a$ instead of $e^{\pi ia}$. We always use the principal branch of the square root (with $\sqrt{\tau}\in\mathfrak{H}$ for $\tau\in\mathfrak{H}$ and $\sqrt{a}\in\mathbb{R}_{>0}$ for $a\in\mathbb{R}_{>0}$). We recall the definition of quasimodular forms from [K-Z]. A modular form of weight $k$ on a subgroup $\Gamma\subset Sl(2,\mathbb{Z})$ of finite index is a holomorphic function $f$ on $\mathfrak{H}$ satisfying
$$f\Bigl(\frac{a\tau+b}{c\tau+d}\Bigr) = (c\tau+d)^kf(\tau),\qquad\tau\in\mathfrak{H},\ \begin{pmatrix}a&b\\c&d\end{pmatrix}\in\Gamma,$$
growing at most polynomially in $1/\Im(\tau)$ as $\Im(\tau)\to0$. An almost holomorphic modular form of weight $k$ is a function $F$ on $\mathfrak{H}$ with the same transformation properties and growth conditions as a modular form which is of the form $F(\tau) = \sum_{m=0}^Mf_m(\tau)(\Im(\tau))^{-m}$ for $M\ge0$ and $f_i$ holomorphic functions. Functions $f$ which occur as (the holomorphic part of $F$) $f_0(\tau)$ in such an expansion are called quasimodular forms of weight $k$. We denote $\sigma_k(n) := \sum_{d|n}d^k$ and by $\sigma_1^{odd}(n)$ the sum of the odd divisors of $n$. For even $k\ge2$ let
$$G_k(\tau) := -\frac{B_k}{2k} + \sum_{n>0}\sigma_{k-1}(n)q^n$$
be the Eisenstein series, where $B_k$ is the $k$-th Bernoulli number. Note that $G_k$ is a modular form of weight $k$ on $SL(2,\mathbb{Z})$ for $k\ge4$, but is only quasimodular for $k=2$, i.e.
$G_2(\tau) + 1/(8\pi\Im(\tau))$ is an almost holomorphic modular form of weight 2. Equivalently
$$G_2\Bigl(\frac{a\tau+b}{c\tau+d}\Bigr) = (c\tau+d)^2G_2(\tau) - \frac{c(c\tau+d)}{4\pi i} \tag{2.1}$$
(see [Z2, p. 242]). Let $\eta(\tau) := q^{1/24}\prod_{n>0}(1-q^n)$ be the Dedekind eta function and $\Delta := \eta^{24}$ the discriminant. We have the transformation laws
$$\eta(\tau+1) = (-1)^{1/12}\eta(\tau),\qquad\eta(-1/\tau) = \sqrt{\frac{\tau}{i}}\,\eta(\tau), \tag{2.2}$$
see [C, VIII.3.].
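The $q$-expansions just defined are easy to compute exactly, which helps when checking normalizations. The following is a minimal sketch, not part of the source: it computes the first coefficients of $G_k$ from the definition (with the convention $B_1=-1/2$ for the Bernoulli recurrence) and of $\prod_{n>0}(1-q^n)^{m}$, so that $m=24$ gives $\Delta/q$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli(m):
    """Bernoulli numbers B_0..B_m (convention B_1 = -1/2) via the standard recurrence."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B


def sigma(k, n):
    """Divisor power sum sigma_k(n) = sum of d^k over the divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


def eisenstein_coeffs(k, N):
    """Coefficients of q^0..q^N of G_k = -B_k/(2k) + sum_{n>0} sigma_{k-1}(n) q^n."""
    const = -bernoulli(k)[k] / (2 * k)
    return [const] + [Fraction(sigma(k - 1, n)) for n in range(1, N + 1)]


def eta_power_coeffs(power, N):
    """Coefficients of prod_{n>0} (1 - q^n)^power up to q^N.

    The prefactor q^(power/24) of eta^power is omitted.
    """
    coeffs = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(power):
            # Multiply in place by (1 - q^n); descending order keeps old values intact.
            for i in range(N, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return coeffs


if __name__ == "__main__":
    print(eisenstein_coeffs(2, 4))
    print(eta_power_coeffs(24, 4))
```

With `power=1` the second function reproduces the pentagonal number pattern of $\prod(1-q^n)$, and with `power=24` it gives the coefficients of $\Delta$ shifted by one degree.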
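We write $y := e^{2\pi iz}$ for $z$ a complex variable. Recall the classical theta functions
$$\theta_{\mu,\nu}(\tau,z) := \sum_{n\in\mathbb{Z}}(-1)^{n\nu}q^{(n+\mu/2)^2/2}y^{n+\mu/2}\qquad(\mu,\nu\in\{0,1\}) \tag{2.3}$$
(see e.g. [C, Ch. V], where however the notations and conventions are slightly different), and the “Nullwerte”
$$\begin{aligned}
\theta(\tau) &:= \theta_{0,0}(\tau,0) = \frac{\eta(\tau)^5}{\eta(\tau/2)^2\eta(2\tau)^2}, &
\theta_{1,0}^0(\tau) &= \theta_{1,0}(\tau,0) = 2\,\frac{\eta(2\tau)^2}{\eta(\tau)},\\
\theta_{0,1}^0(\tau) &= \theta_{0,1}(\tau,0) = \frac{\eta(\tau/2)^2}{\eta(\tau)}, &
\theta_{1,1}(\tau,0) &= 0.
\end{aligned} \tag{2.4}$$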
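The three nonzero Nullwerte identities in (2.4) can be checked as identities of truncated power series. The sketch below is an illustration rather than anything from the source: it works in the variable $t = q^{1/2}$, clears denominators so that no series inversion is needed, and drops the common factor $t^{1/4}$ in the identity for $\theta_{1,0}^0$. The helper names are hypothetical.

```python
def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated at t^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                c[i + j] += ai * bj
    return c


def euler_product(step, power, N):
    """prod_{n>0} (1 - t^(step*n))^power, truncated at t^N."""
    coeffs = [1] + [0] * N
    for n in range(step, N + 1, step):
        for _ in range(power):
            for i in range(N, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return coeffs


def theta_series(exponent, sign, N):
    """sum over n in Z of sign(n) * t^exponent(n), truncated at t^N."""
    coeffs = [0] * (N + 1)
    for n in range(-N - 1, N + 2):
        e = exponent(n)
        if 0 <= e <= N:
            coeffs[e] += sign(n)
    return coeffs


def check_nullwerte(N):
    """Check (2.4) in t = q^(1/2), with eta(tau/2), eta(tau), eta(2 tau)
    corresponding to products over (1 - t^n), (1 - t^(2n)), (1 - t^(4n))."""
    th00 = theta_series(lambda n: n * n, lambda n: 1, N)
    th01 = theta_series(lambda n: n * n, lambda n: -1 if n % 2 else 1, N)
    th10 = theta_series(lambda n: n * n + n, lambda n: 1, N)  # t^(1/4) factored out
    e_half, e_one, e_two = (euler_product(s, 1, N) for s in (1, 2, 4))
    sq = lambda s: mul(s, s, N)
    # theta * eta(tau/2)^2 * eta(2 tau)^2 = eta(tau)^5
    id1 = mul(th00, mul(sq(e_half), sq(e_two), N), N) == euler_product(2, 5, N)
    # theta_{0,1}^0 * eta(tau) = eta(tau/2)^2
    id2 = mul(th01, e_one, N) == sq(e_half)
    # theta_{1,0}^0 * eta(tau) = 2 eta(2 tau)^2
    id3 = mul(th10, e_one, N) == [2 * c for c in sq(e_two)]
    return id1, id2, id3


if __name__ == "__main__":
    print(check_nullwerte(40))
```

The powers of $q$ coming from the prefactors $q^{1/24}$ of the eta functions cancel in the first two identities and combine to $q^{1/8} = t^{1/4}$ in the third, which is why only the products need to be compared.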
We use the same notations also for $\mu,\nu$ arbitrary in $\mathbb{Q}$. The identities (2.4) follow readily from the product formulas
$$\begin{aligned}
\theta_{1,1}(\tau,z) &= q^{1/8}(y^{1/2}-y^{-1/2})\prod_{n>0}(1-q^n)(1-q^ny)(1-q^ny^{-1}),\\
\theta_{0,1}(\tau,z) &= \prod_{n>0}(1-q^n)(1-q^{n-1/2}y)(1-q^{n-1/2}y^{-1}),
\end{aligned} \tag{2.5}$$
and the fact that $\theta_{\mu,\nu}(\tau,z) = \theta_{\mu,0}(\tau,z+\nu)$. $\theta_{1,1}$ has the transformation behaviour
$$\theta_{1,1}(\tau+1,z) = (-1)^{1/4}\theta_{1,1}(\tau,z),\qquad\theta_{1,1}(-1/\tau,z/\tau) = -i\sqrt{\frac{\tau}{i}}\,e^{\pi iz^2/\tau}\theta_{1,1}(\tau,z). \tag{2.6}$$
By the product formulas (2.5) we see that
$$\theta_{0,1}(\tau,z)\theta_{1,1}(\tau,z) = \frac{\eta(\tau)^2}{\eta(\tau/2)}\theta_{1,1}(\tau/2,z). \tag{2.7}$$
We write
$$\tilde\theta_{1,1}(\tau,z) := \frac{\theta_{1,1}(\tau,z)}{y^{1/2}-y^{-1/2}}.$$
From the definitions it is straightforward to see that
$$\theta_{\mu+2,0}(\tau,z) = \theta_{\mu,0}(\tau,z),\qquad\theta_{\mu+2,1}(\tau,z) = -\theta_{\mu,1}(\tau,z),\qquad\mu\in\mathbb{Q}. \tag{2.8}$$
By $\frac{(-n-1/2)^2}{4} = \frac{(n+1/2)^2}{4} = (\pm(n/2+1/4))^2$, one also checks immediately that
$$\theta_{1/2,0}^0(2\tau) = \sum_{n\in\mathbb{Z}}q^{(n+1/4)^2} = \frac{1}{2}\sum_{n\in\mathbb{Z}}q^{(n+1/2)^2/4} = \frac{\theta_{1,0}^0(\tau/2)}{2} = \frac{\eta(\tau)^2}{\eta(\tau/2)}. \tag{2.9}$$
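Following [Gö3,G-Z], we set $f(\tau) := (-1)^{-1/4}\dfrac{\eta(\tau)^3}{\theta(\tau)}$. Let $e_2$ and $e_3$ be the 2-division values of the Weierstraß $\wp$-function at $\tau/2$ and $(1+\tau)/2$ respectively, i.e.
$$e_2(\tau) = \frac{1}{12} + 2\sum_{n>0}\sigma_1^{odd}(n)q^{n/2},\qquad e_3(\tau) = \frac{1}{12} + 2\sum_{n>0}(-1)^n\sigma_1^{odd}(n)q^{n/2},$$
(see e.g. [H-B-J, p.132]). It is easy to see that $e_3(2\tau+1) = e_2(2\tau)$. We also see that $\theta(2\tau+1) = \theta_{0,1}^0(2\tau)$ and $f(2\tau+1) = \eta(2\tau)^4/\eta(\tau)^2$. We write
$$u(\tau) := -\frac{f(\tau)^2}{3e_3(\tau)},\qquad\bar u(\tau) := u(2\tau+1) = -\frac{\eta(2\tau)^8}{3e_2(2\tau)\eta(\tau)^4}. \tag{2.10}$$

Remark 2.2. Let
$$T := \begin{pmatrix}1&1\\0&1\end{pmatrix},\qquad S := \begin{pmatrix}0&-1\\1&0\end{pmatrix},\qquad V := T^2 = \begin{pmatrix}1&2\\0&1\end{pmatrix}.$$
Let $\Gamma_u = \pm\langle V^2,VS,SV\rangle$; this is a subgroup of index 6 of $SL(2,\mathbb{Z})$. $u(\tau)$ is a modular function on $\Gamma_u$. Let $\Gamma(2) := \{A\in Sl(2,\mathbb{Z})\mid A\equiv\mathrm{id}\bmod2\}$. Let $X := \begin{pmatrix}2&1\\0&1\end{pmatrix}$. It is easy to see that $X^{-1}\Gamma_uX = \Gamma(2)$. In other words a function $g(\tau)$ is a modular function on $\Gamma_u$, if and only if $h(\tau) := g(2\tau+1)$ is a modular function on $\Gamma(2)$. In particular $\bar u(\tau)$ is a modular function on $\Gamma(2)$.

2.4. Theta functions for indefinite lattices. We review the definition of theta functions for indefinite lattices from [G-Z]. Let $\Gamma$ be a lattice, i.e. a free $\mathbb{Z}$ module $\Gamma$ together with a $\mathbb{Z}$-valued bilinear form $\langle x,y\rangle$ on $\Gamma$. The extension of the bilinear form to $\Gamma_{\mathbb{C}} := \Gamma\otimes\mathbb{C}$ and $\Gamma_{\mathbb{R}} = \Gamma\otimes\mathbb{R}$ is denoted in the same way. The type of $\Gamma$ is the pair $(r-s,s)$, where $r$ is the rank of $\Gamma$ and $s$ the largest rank of a sublattice of $\Gamma$ on which $\langle\ ,\ \rangle$ is negative definite. Let $M_\Gamma$ be the space of meromorphic functions on $\mathfrak{H}\times\Gamma_{\mathbb{C}}$. For $v\in\Gamma_{\mathbb{Q}}$, $A = \begin{pmatrix}a&b\\c&d\end{pmatrix}\in Sl(2,\mathbb{Z})$, and $k\in\mathbb{Z}$ we put
$$f|v(\tau,x) := q^{\langle v,v\rangle/2}\exp(2\pi i\langle v,x\rangle)f(\tau,x+v\tau), \tag{2.11}$$
$$f|_kA(\tau,x) := (c\tau+d)^{-k}\exp\Bigl(-\pi i\,\frac{c\langle x,x\rangle}{c\tau+d}\Bigr)f\Bigl(\frac{a\tau+b}{c\tau+d},\frac{x}{c\tau+d}\Bigr). \tag{2.12}$$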
Now assume that $\Gamma$ is unimodular of type $(r-1,1)$. We fix a vector $f_0\in\Gamma_{\mathbb{R}}$ with $\langle f_0,f_0\rangle<0$, and let
$$C_\Gamma := \{f\in\Gamma_{\mathbb{R}}\mid\langle f,f\rangle<0,\ \langle f,f_0\rangle<0\},\qquad S_\Gamma := \{f\in\Gamma\mid f\ \text{primitive},\ \langle f,f\rangle=0,\ \langle f,f_0\rangle<0\}.$$
For $f\in S_\Gamma$ put
$$D(f) := \{(\tau,x)\in\mathfrak{H}\times\Gamma_{\mathbb{C}}\mid0<\Im(\langle f,x\rangle)<\Im(\tau)\},$$
and for $f\in C_\Gamma$ put $D(f) := \mathfrak{H}\times\Gamma_{\mathbb{C}}$. For $t\in\mathbb{R}$ we put $\mu(t) := 1$, if $t\ge0$ and $\mu(t) = 0$ otherwise. Let $f,g\in C_\Gamma\cup S_\Gamma$. For $c\in\Gamma$ and $(\tau,x)\in D(f)\cap D(g)$ we put
$$\Theta_{\Gamma,c}^{f,g}(\tau,x) := \sum_{\xi\in\Gamma+c/2}\bigl(\mu(\langle\xi,f\rangle)-\mu(\langle\xi,g\rangle)\bigr)q^{\langle\xi,\xi\rangle/2}e^{2\pi i\langle\xi,x\rangle}, \tag{2.13}$$
and $\Theta_\Gamma^{f,g} := \Theta_{\Gamma,0}^{f,g}$.

Assume now that $f,g\in S_\Gamma$. Then (see [G-Z]) the function $\Theta_{\Gamma,c,b}^{f,g}$ has a meromorphic extension to $\mathfrak{H}\times\Gamma_{\mathbb{C}}$, which is defined as follows. Let
$$F:\mathfrak{H}\times\mathbb{C}^2\to\mathbb{C};\quad(\tau,u,v)\mapsto\frac{\eta(\tau)^3\theta_{1,1}(\tau,(u+v)/(2\pi i))}{\theta_{1,1}(\tau,u/(2\pi i))\theta_{1,1}(\tau,v/(2\pi i))},$$
(see [Z1]; note the different conventions for $\theta_{1,1}$ in [Z1]). We have
$$F(\tau,u,v) = \sum_{n\ge0,m>0}q^{nm}e^{-nu-mv} - \sum_{n>0,m\ge0}q^{nm}e^{nu+mv},$$
q hξ,ξ i/2 e2π ihξ,xi
hξ,gi=0 hf,gi≤hξ,f i 0 small enough. We want to relate the virtual Poincaré polynomials of MSH (r, C, d)s , NSH (r, C, d) and H (r, C + bE, d)s . In fact we will see that the generating function for b S is obtained Mb S from that for S by multiplying by a suitable theta function and dividing by a power of the eta function. The results are easy consequences of corresponding results of Yoshioka about the counting of points of these moduli spaces over finite fields and of Prop. 2.1. We write
$$P_v(M_S^H(r,C,d)_s) = y^{-e}p_v(M_S^H(r,C,d)_s,y),\qquad P_v(N_S^H(r,C,d)) = y^{-e}p_v(N_S^H(r,C,d),y),$$
where $e = 2rd-(r^2-1)\chi(\mathcal{O}_S)$ is the virtual dimension, which agrees with the actual dimension for $d$ sufficiently large.
Proposition 3.1. Let $S$ be an algebraic surface and let $H$ be a general ample divisor on $S$.
1.
$$\sum_{d\ge0}P_v(M_S^H(r,C,d)_s)q^d = \prod_{k\ge1}\prod_{b=1}^r\prod_{i=0}^4(1-y^{i-2b}q^k)^{(-1)^{i+1}b_i(S)}\sum_{d\ge0}P_v(N_S^H(r,C,d))q^d,$$
in particular
$$\sum_{d\ge0}e(M_S^H(r,C,d)_s)q^d = \frac{q^{re(S)/24}}{\eta(\tau)^{re(S)}}\sum_{d\ge0}e(N_S^H(r,C,d))q^d.$$
2. Let $A = (a_{ij})_{ij}$ be the $(r-1)\times(r-1)$-matrix with entries $a_{ij} = 1$ for $i\le j$ and $a_{ij} = 0$ otherwise. We view elements of $\mathbb{R}^{r-1}$ as column vectors. We write $I$ for the column vector of length $r-1$ with all entries equal to one. Then
$$\sum_{d\ge0}P_v(M_{\hat S}^H(r,C+bE,d)_s)q^d = \frac{q^{r/24}}{\eta(\tau)^r}\sum_{v\in\mathbb{Z}^{r-1}+\frac{b}{r}I}(y^2)^{v^tAI}q^{v^tAv}\sum_{d\ge0}P_v(M_S^H(r,C,d)_s)q^d,$$
in particular
$$\sum_{d\ge0}e(M_{\hat S}^H(r,C+bE,d)_s)q^d = \frac{q^{r/24}}{\eta(\tau)^r}\sum_{v\in\mathbb{Z}^{r-1}+\frac{b}{r}I}q^{v^tAv}\sum_{d\ge0}e(M_S^H(r,C,d)_s)q^d.$$
Proof. (1) is a consequence of ([Y1], Thm. 0.4) and Prop. 2.1. Let $X$ be a surface over $\mathbb{F}_q$. For every sheaf $E$ in $M_X^H(r,C,d)_s(\mathbb{F}_q)$ there is an exact sequence $0\to E\to E^{\vee\vee}\to E^{\vee\vee}/E\to0$, where $E^{\vee\vee}\in N_S^H(r,C,d-k)_s(\mathbb{F}_q)$ and $E^{\vee\vee}/E\in\mathrm{Quot}^k_{E^{\vee\vee}}(\mathbb{F}_q)$ for a suitable $k\le d$. In fact it is easy to see that if $E$ is defined over $\overline{\mathbb{F}}_q$, then it is defined over $\mathbb{F}_q$ if and only if both $E^{\vee\vee}$ and $E^{\vee\vee}/E$ are. For a sheaf $F$ over $X$ we denote by $\mathrm{Quot}^k_F$ the (Grothendieck) scheme of quotients of length $k$ of $F$ and by $\mathrm{Quot}^k_{F,p}$ the subscheme (with the reduced structure) of quotients with support in the point $p\in X$. If $F$ is locally free of rank $r$ and $p$ is defined over $\mathbb{F}_q$, we get isomorphisms $\mathrm{Quot}^k_{F,p}\simeq\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r},p}$ over $\mathbb{F}_q$. In particular $\#\mathrm{Quot}^k_{F,p}(\mathbb{F}_q) = \#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r},p}(\mathbb{F}_q)$. Therefore the proof of ([Y1], Thm. 0.4) for the numbers $\#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r}}(\mathbb{F}_q)$ can be repeated for $\#\mathrm{Quot}^k_F(\mathbb{F}_q)$, the only numbers entering the calculation being the $\#\mathrm{Quot}^k_{F,p}(\mathbb{F}_{q^n})$. Therefore $\#\mathrm{Quot}^k_F(\mathbb{F}_q) = \#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r}}(\mathbb{F}_q)$ (see also [Y1, p.194]). This gives
$$\#M_X^H(r,C,d)_s(\mathbb{F}_q) = \sum_{k\le d}\#N_X^H(r,C,d-k)_s(\mathbb{F}_q)\cdot\#\mathrm{Quot}^k_{\mathcal{O}_X^{\oplus r}}(\mathbb{F}_q).$$
Applying Prop. 2.1 to a good reduction $X$ of $S$ modulo $q$, we obtain immediately
$$\sum_{d\ge0}p_v(M_S^H(r,C,d)_s)q^d = \prod_{k\ge1}\prod_{b=1}^r\prod_{i=0}^4(1-y^{2rk+i-2b}q^k)^{(-1)^{i+1}b_i(S)}\sum_{d\ge0}p_v(N_S^H(r,C,d))q^d,$$
(recall the signs in the definition of $p_v$). By the definition of $P_v$ and the formula $e = 2rd-(r^2-1)\chi(\mathcal{O}_S)$, we see that in order to replace $p_v$ by $P_v$ we have to replace the factor $(1-y^{2rk+i-2b}q^k)$ by $(1-y^{i-2b}q^k)$.

(2) We apply Prop. 2.1 to ([Y3], Prop. 3.4). Using again $e = 2rd-(r^2-1)\chi(\mathcal{O}_S)$ we obtain
$$\sum_{d\ge0}P_v(M_{\hat S}^H(r,C+bE,d)_s)q^d = \frac{q^{r/24}}{\eta(\tau)^r}\sum_{(a_1,\dots,a_r)}(y^2)^{w(a_1,\dots,a_r)}q^{-\sum_{i<j}a_ia_j}\sum_{d\ge0}P_v(M_S^H(r,C,d)_s)q^d.$$
P Here the sum runs through the r-tuples (a1 , . . . , ar ) in Z + br with ri=1 ai = 0, and X X aj − ai +r ai aj . w(a1 , . . . , ar ) = 2 i<j ≤r
i<j ≤r
We note that equivalently we can let the sum run through the $(r-1)$-tuples $(a_1,\dots,a_{r-1})$, and put $a_r = -\sum_{i=1}^{r-1}a_i$. Then
\[ -\sum_{i<j\le r} a_ia_j = \sum_{j\le i\le r-1} a_ia_j. \]
Furthermore we have
\[ \sum_{i<j\le r}(a_j-a_i)^2 = 2r\sum_{j\le i\le r-1} a_ia_j \]
and
\[ \sum_{i<j\le r}(a_j-a_i) = -2\sum_{i=1}^{r-1}(r-i)a_i. \]
Putting things together, we obtain
\[ w(a_1,\dots,a_r) = \sum_{i=1}^{r-1}(r-i)a_i = (a_1,\dots,a_{r-1})AI. \]
Finally we note that
\[ \sum_{j\le i\le r-1} a_ia_j = (a_1,\dots,a_{r-1})A(a_1,\dots,a_{r-1})^t. \qquad \Box \]
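As a quick sanity check of the identities used in this proof, the following worked case takes $r = 2$. It is only a sketch: it assumes that $w$ is read as $w(a_1,\dots,a_r)=\sum_{i<j\le r}\binom{a_j-a_i}{2}+r\sum_{i<j\le r}a_ia_j$, and it is not part of the original argument.

```latex
% Case r = 2: a_2 = -a_1 with a_1 in Z + b/2, so a_2 - a_1 = -2 a_1 is an integer.
\begin{align*}
w(a_1,a_2) &= \binom{a_2-a_1}{2} + 2\,a_1a_2
            = \frac{(-2a_1)(-2a_1-1)}{2} - 2a_1^2 \\
           &= (2a_1^2 + a_1) - 2a_1^2 \;=\; a_1 \;=\; \sum_{i=1}^{r-1}(r-i)\,a_i , \\
-\sum_{i<j\le 2} a_ia_j &= -a_1a_2 = a_1^2 = \sum_{j\le i\le 1} a_ia_j .
\end{align*}
% Consequently the theta factor in part (2) specializes to
%   \sum_{a_1 \in \mathbb{Z}+b/2} (y^2)^{a_1}\, q^{a_1^2} .
```

In this case the sum over $a_1$ is a one-variable theta series evaluated at $2\tau$, which is the shape of the rank 2 blowup factor.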
Remark 3.2. 1. Li and Qin ([L-Q1,L-Q2]) have shown a blowup formula for the virtual Hodge polynomials in the case $r = 2$ using completely different methods. In particular they also obtain a blowup formula for the Euler numbers. Their method also gives a blowup formula for the virtual Hodge polynomials of the Uhlenbeck compactification. We write again $H_v(M_S^H(r,C,d)_s) = (xy)^{-e/2}h_v(M_S^H(r,C,d)_s)$ with $e = 2rd - (r^2-1)\chi(\mathcal{O}_S)$. Then, writing $x = e^{2\pi i u}$, their result can be rewritten as
\[ \sum_{d\ge 0} H_v\bigl(M_{\widehat{S}}^H(C,d)_s\bigr)q^d = \frac{q^{1/12}\,\theta_{0,0}(2\tau,u+z)}{\eta(\tau)^2}\sum_{d\ge 0} H_v\bigl(M_S^H(C,d)_s\bigr)q^d, \]
\[ \sum_{d\ge 0} H_v\bigl(M_{\widehat{S}}^H(C+E,d)_s\bigr)q^d = \frac{q^{1/12}\,\theta_{1,0}(2\tau,u+z)}{\eta(\tau)^2}\sum_{d\ge 0} H_v\bigl(M_S^H(C,d)_s\bigr)q^d. \]
This is the case $r = 2$ of the formula
\[ \sum_{d\ge 0} H_v\bigl(M_{\widehat{S}}^H(r,C+bE,d)_s\bigr)q^d = \frac{q^{r/24}}{\eta(\tau)^r}\Bigl(\sum_{v\in\mathbb{Z}^{r-1}+\frac{b}{r}I} (xy)^{v^tAI}\,q^{v^tAv}\Bigr)\sum_{d\ge 0} H_v\bigl(M_S^H(r,C,d)_s\bigr)q^d. \]
I expect that this formula holds for all $r$.

2. Using [Y5], Prop. 3.1(2) can also be rewritten: Let $A_{r-1} = \{(x_1,\dots,x_r) \mid \sum_i x_i = 0\}$ be the $A_{r-1}$-lattice and $e_1,\dots,e_{r-1}$ its standard basis. Let $a := \sum_{i=1}^{r-1} i(r-i)e_i$ and $\lambda = (1-1/r,-1/r,\dots,-1/r)$. Then the theta function on the left hand side in Prop. 3.1 can be written as
\[ \sum_{v\in A_{r-1}+b\lambda} y^{\langle v,a\rangle}\,q^{\langle v,v\rangle/2}, \]
where $\langle\ ,\ \rangle$ is the pairing of $A_{r-1}$. This was pointed out to me by K. Yoshioka.

4. Wallcrossing and Theta Functions

4.1. Wallcrossing. Now let $S$ again be a rational algebraic surface. Let $\Gamma$ be the lattice $H^2(S,\mathbb{Z})$ with the negative of the intersection form as a quadratic form, i.e. for $A, B \in \Gamma$ let $\langle A,B\rangle = -AB$. In this section we want to relate the Hodge numbers of the moduli spaces $M_S^H(C,d)$ to the theta functions $\Theta^{F,H}_{\Gamma,C}$ from [G-Z]. The dependence of the moduli spaces $M_S^H(C,d)$ on the polarization $H$ and the corresponding dependence of the Donaldson invariants has been studied by a number of authors [Q1,Q2,F-Q,Gö2,E-G,Y2,Y3,L]. We follow (with some modifications) the notations in [Gö2,E-G].
An ample divisor $H$ is called good if $K_S\cdot H \le 0$. We denote by $\mathcal{C}_S$ the ample cone of $S$ and by $\mathcal{C}_S^G$ the subcone of all good ample divisors. A class $\xi \in H^2(S,\mathbb{Z}) + C/2$ is called of type $(C,d)$ if $\xi^2 + d \in \mathbb{Z}_{\ge 0}$. In this case we call $W^\xi := \xi^\perp \cap \mathcal{C}_S$ the wall defined by $\xi$. If $\xi^\perp \cap \mathcal{C}_S^G \ne \emptyset$, we call $W^\xi$ a good wall. The chambers of type $(C,d)$ are the connected components of the complement of the walls of type $(C,d)$ in $\mathcal{C}_S$. If $L$ and $H$ lie in the same chamber of type $(C,d)$, then $M_S^L(C,d) = M_S^H(C,d)$. We say that $L$ lies on a wall of type $C$ if $L\xi = 0$ for some class $\xi \in H^2(S,\mathbb{Z}) + C/2$.

Theorem 4.1. Let $C \in H^2(S,\mathbb{Z})$. Let $H, L \in \mathcal{C}_S^G$ not on a wall of type $C$. Then

1.
\[ \sum_{d\ge 0}\bigl(X_y^v(M_S^H(C,d)) - X_y^v(M_S^L(C,d))\bigr)q^{d-e(S)/12} = \frac{\eta(\tau)^{2\sigma(S)-2}\,(y^{1/2}-y^{-1/2})}{\theta_{1,1}(\tau,z)^2}\,\Theta^{L,H}_{\Gamma,C}(2\tau,K_Sz), \]
\[ \sum_{d\ge 0}\bigl(e(M_S^H(C,d)) - e(M_S^L(C,d))\bigr)q^{d-e(S)/12} = \frac{1}{\eta(\tau)^{2e(S)}}\,\mathrm{Coeff}_{2\pi i z}\bigl[\Theta^{L,H}_{\Gamma,C}(2\tau,K_Sz)\bigr]. \]

2. Assume now that $C \notin 2H^2(S,\mathbb{Z})$. Then we can replace $X_y^v$ by $X_y$ in (1). Furthermore
\[ \sum_{d\ge 0}\bigl(e(N_S^H(C,d)) - e(N_S^L(C,d))\bigr)q^{d} = \mathrm{Coeff}_{2\pi i z}\bigl[\Theta^{L,H}_{\Gamma,C}(2\tau,K_Sz)\bigr], \]
\[ \sum_{d\ge 0}(-1)^{e(d)/2}\bigl(\sigma(M_S^H(C,d)) - \sigma(M_S^L(C,d))\bigr)q^{d-e(S)/12} = \frac{\eta(\tau)^{2\sigma(S)}}{2i\,\eta(2\tau)^4}\,\Theta^{L,H}_{\Gamma,C}(2\tau,K_S/2). \]
Here e(d) := 4d − 3 is the dimension of MSH (C, d). Proof. This is essentially a reformulation of Thm. 3.4 from [Gö2]. Assume that H and L do not lie on a wall of type C. The result of [Gö2] gives y 2d−3/2 (Xyv (MSH (C, d)) − Xyv (MSL (C, d))) = ξ KS − y −ξ KS X 2 2 2y y d+ξ Xy ((S t S)[d+ξ ] )y d−ξ , = y(y − 1) ξ
where the sum runs through all classes $\xi$ of type $(C,d)$ with $\xi H < 0 < \xi L$. We sum over all $d \ge 0$. We use (2.17), noting that $\sum_{n\ge 0} X_y((S\sqcup S)^{[n]})q^n = \bigl(\sum_{n\ge 0} X_y(S^{[n]})q^n\bigr)^2$. We obtain
\[ \sum_{d\ge 0}\bigl(X_y^v(M_S^H(C,d)) - X_y^v(M_S^L(C,d))\bigr)q^{d-e(S)/12} = \frac{\eta(\tau)^{2\sigma(S)-2}}{\widetilde{\theta}_{1,1}(\tau,z)^2}\sum_{\xi} q^{-\xi^2}\,\frac{y^{\xi K_S}-y^{-\xi K_S}}{y^{1/2}-y^{-1/2}}. \tag{4.1} \]
The sum on the right-hand side runs through all ξ ∈ H 2 (X, Z) + C/2 satisfying ξ H < g,f 0 < ξ L. Using the definition (2.13) of the theta functions 20,c , we obtain. X θ1,1 (τ, z)2 Xyv (MSH (C, d)) − Xyv (MSL (C, d)) q d−e(S)/12 1 − 21 2σ (S)−2 2 η(τ ) (y − y ) d≥0 X 2 q −ξ y ξ KS − y −ξ KS = ξ ∈H 2 (S,Z)+C/2 ξ H −8, then σ (MSF (C, d)) = 0. Proof. This follows immediately from [G-Z, Cor. 5.5]. u t Remark 6.4. Theorem 6.2 and the results for the K3 surface suggest that there should be a general formula relating the Donaldson invariants and the signatures of the moduli spaces MSH (C, d) for all simply connected algebraic surfaces S even if pg (S) > 0. In general the moduli spaces MSH (C, d) will be very singular, and one first has to find a suitable definition of the signature. The simplest formula that fits the known data seems to be the following: X (−1)e(d)/2 σ (MSH (C, d))q d−e(S)/12 = d
χ (OS ) X η(2τ )σ (S) w(τ ) 8SC (p r )u(τ )r+1 , = ± (2η(τ ))2χ(OS ) r≥0 where w(τ ) = χ (OS ) χ (OS ) 1 2u(τ )+1 + 2u(τ )−1 , 2 1 2
2u(τ )
if 3χ(OS ) − C 2 ≡ 2 mod 4,
2u(τ )
1 + 2u(τ ))χ(OS ) − (−1)χ (OS ) (1 − 2u(τ ))χ (OS ) , if 3χ(OS ) − C 2 ≡ 0 mod 4.
The formula has the following features:
1. It gives the correct result for rational surfaces and for K3-surfaces.
2. It is compatible with the blow-up formulas of [F-S] for the Donaldson invariants and with those of Prop. 3.1 for the signatures.
3. It is compatible with taking the disjoint union of algebraic surfaces.

The formula for the rational surfaces is just Thm. 6.2, and the compatibility with the blowup formulas is obvious. We check the formula for the K3 surfaces. We know by [G-H] that, for generic polarization $H$ and suitable $C \in H^2(X,\mathbb{Z})$, the moduli space $M_X^H(C,d)$ has the same Hodge numbers as $X^{[2d-3]}$. Let $L$ and $M$ be two such classes in $H^2(X,\mathbb{Z})$, satisfying $L^2 \equiv 2$ modulo 4 and $M^2 \equiv 0$ modulo 4. Using (2.17), this gives
\[ \sum_{d\ge 0}(-1)^{e(d)}\bigl(\sigma(M_X^H(L,d)) + \sigma(M_X^H(M,d))\bigr)q^{d-2} = \sum_{n\ge 0}(-1)^n\sigma(X^{[n]})\,q^{(n-1)/2} = \frac{1}{\eta(\tau)^4\,\eta(\tau/2)^{16}}. \]
For the Donaldson invariants we have the following results: $X$ fulfills the simple type condition and $\Phi^{X,H}_C = (-1)^{C^2/2}$ for all $C \in H^2(X,\mathbb{Z})$ (see e.g. [Kr-M]). Therefore we get $\Phi^{X,H}_L(p^{2r}) = -2^{2r}$ and $\Phi^{X,H}_M(p^{2r+1}) = 2^{2r+1}$, i.e.
\[ \sum_{r\ge 0}\bigl(\Phi^{X,H}_L(p^r) + \Phi^{X,H}_M(p^r)\bigr)u(\tau)^{r+1} = -\frac{u(\tau)}{1+2u(\tau)} = 4\,\frac{\eta(2\tau)^8}{\eta(\tau/2)^8}. \]
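The first equality above is a geometric series. The short computation below spells it out as a formal power series in $u = u(\tau)$; it assumes, as the text suggests but does not state explicitly, that $\Phi^{X,H}_L$ vanishes on odd powers of $p$ and $\Phi^{X,H}_M$ on even powers.

```latex
% Assumption: \Phi_L(p^{2r+1}) = 0 and \Phi_M(p^{2r}) = 0 for all r >= 0.
\begin{align*}
\sum_{r\ge 0}\bigl(\Phi^{X,H}_L(p^r)+\Phi^{X,H}_M(p^r)\bigr)u^{r+1}
 &= -\sum_{r\ge 0} 2^{2r}u^{2r+1} \;+\; \sum_{r\ge 0} 2^{2r+1}u^{2r+2} \\
 &= -u\,(1-2u)\sum_{r\ge 0}(2u)^{2r}
  \;=\; -\frac{u\,(1-2u)}{1-4u^2}
  \;=\; -\frac{u}{1+2u} .
\end{align*}
```

The same computation, restricted to the odd powers of $u$ (for $L$) or the even powers (for $M$), gives $-u/(1-4u^2)$ and $2u^2/(1-4u^2)$ respectively.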
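The last identity is an elementary exercise in modular forms (e.g. one multiplies both sides with a suitable modular form on $\Gamma(2)$ such that they both become modular forms on $\Gamma(2)$ and compares the first few coefficients). Putting this together, we obtain
\begin{align*}
\sum_{d\ge 0}\bigl(\sigma(M_X^H(L,d)) - \sigma(M_X^H(M,d))\bigr)q^{d-2}
 &= \frac{\eta(2\tau)^{\sigma(X)}}{(4\eta(\tau))^{2\chi(\mathcal{O}_X)}}\Bigl(\sum_{r\ge 0}\bigl(\Phi^{X,H}_L(p^r)+\Phi^{X,H}_M(p^r)\bigr)u(\tau)^{r+1}\Bigr)^2 \\
 &= \frac{\eta(2\tau)^{\sigma(X)}}{(4\eta(\tau))^{2\chi(\mathcal{O}_X)}}\,(1-2u(\tau))^2\Bigl(\sum_{r\ge 0}\Phi^{X,H}_L(p^r)\,u(\tau)^{r+1}\Bigr)^2 \\
 &= \frac{\eta(2\tau)^{\sigma(X)}}{(4\eta(\tau))^{2\chi(\mathcal{O}_X)}}\,\bigl(1-1/(2u(\tau))\bigr)^2\Bigl(\sum_{r\ge 0}\Phi^{X,H}_M(p^r)\,u(\tau)^{r+1}\Bigr)^2 .
\end{align*}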
The result follows by collecting the odd powers of $u(\tau)$ (for $L$) and the even powers of $u(\tau)$ (for $M$).

Remark 6.5. Note that $u(\tau)$ is the modular function on $\Gamma(2)$ that occurs in a natural way in physics ([W], there it is called $u$). In [M-W] the Donaldson invariants of 4-manifolds with $b_+ = 1$ were (using physics arguments) related to (Borcherds type [Bo]) integrals over the “u-plane” $\mathfrak{H}/\Gamma(2)$. This suggests that also many results of this paper could be reformulated in terms of such integrals.

For the Euler number we can prove a weaker statement along the same lines. We can relate the generating functions for the difference of the Euler numbers for two polarizations $H$, $L$ to the difference of certain Donaldson invariants between $H$ and $L$. Let $k_S$ be the Poincaré dual of $K_S$.

Proposition 6.6. Let $H, L \in \mathcal{C}_S$ not on a wall of type 0. Then
\[ E^{S,H}_0 - E^{S,L}_0 = \frac{i\,\eta(2\tau)^6}{2\,\theta(2\tau)^{\sigma(S)+2}\,\eta(\tau)^{e(S)}}\sum_{r\ge 0}\bigl(\Phi^{S,H}_0(k_Sp^r) - \Phi^{S,L}_0(k_Sp^r)\bigr)u(2\tau)^{r+1}. \]
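Proof. The proof is similar to the case of the signature. By Corollary 4.5 we have
\[ E^{S,H}_0 - E^{S,L}_0 = \frac{1}{\eta(\tau)^{2e(S)}}\,\mathrm{Coeff}_{2\pi i z}\bigl[\Theta^{L,H}_{\Gamma,0}(2\tau,K_Sz)\bigr]. \]
As $H, L \in \mathcal{C}_S$, we see that $\mathrm{Coeff}_{2\pi i z}\bigl[\Theta^{L,H}_{\Gamma,0}(2\tau,K_Sz)\bigr]$ starts in degree $\ge 1/2$ in $q$. Using this, we get from [G-Z], Cor. 4.3,
\begin{align*}
\sum_{r\ge 0}\bigl(\Phi^{S,H}_0(k_Sp^r) - \Phi^{S,L}_0(k_Sp^r)\bigr)u(2\tau)^{r+1}
 &= \mathrm{Coeff}_{2\pi i z}\Bigl[\frac{2\,\theta(2\tau)^{\sigma(S)}}{f(2\tau)^2}\,\Theta^{H,L}_{\Gamma,0}(2\tau,K_Sz)\Bigr] \\
 &= \frac{2i\,\theta(2\tau)^{\sigma(S)+2}\,\eta(\tau)^{2e(S)}}{\eta(2\tau)^6}\,\bigl(E^{S,L}_0 - E^{S,H}_0\bigr). \qquad \Box
\end{align*}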
Remark 6.7. This result can be reformulated as follows. The expression
\[ E^{S,H}_0 + \frac{\eta(2\tau)^6}{2i\,\theta(2\tau)^{\sigma(S)+2}\,\eta(\tau)^{e(S)}}\sum_{r\ge 0}\Phi^{S,H}_0(k_Sp^r)\,u(2\tau)^{r+1} \]
is independent of $H \in \mathcal{C}_S$.

7. Examples

7.1. Rational ruled surfaces. Let $S$ be a rational ruled surface. Let $F$ be the class of a fibre of the ruling, and let $G$ be a section with $G^2 \le 0$. By Lem. 4.8 we know that $X^{S,F}_C = 0$ if $CF = 1$. We will compute $X^{S,F}_F$ and $E^{S,F}_F$. Furthermore we set $M_S^{F_+}(F,d) := M_S^{F+\epsilon G}(F,d)$ for $\epsilon > 0$ sufficiently small, so that there is no wall of type $(F,d)$ between $F$ and $F + \epsilon G$.

Proposition 7.1.
1. \[ X^{S,F}_F = \frac{(y^{1/2}-y^{-1/2})\,\eta(\tau)}{\theta_{1,1}(\tau,z)^2\,\theta_{1,1}(\tau,2z)}, \]
2. \[ \sum_{d\ge 0} X_y\bigl(M_S^{F_+}(F,d)\bigr)q^{d-\frac13} = \frac{y^{1/2}-y^{-1/2}}{\eta(\tau)^2\,\theta_{1,1}(\tau,z)^2}\Bigl(\frac{\eta(\tau)^3}{\theta_{1,1}(\tau,2z)} - \frac{1}{y-y^{-1}}\Bigr), \]
3. \[ E^{S,F}_F = \frac{2G_2(\tau)}{\eta(\tau)^8}, \qquad \sum_{d\ge 0} e\bigl(M_S^{F_+}(F,d)\bigr)q^{d-\frac13} = \frac{2G_2(\tau)+\frac{1}{12}}{\eta(\tau)^8}. \]
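Proof. Let $F_1$, $F_2$ be the fibres of the two projections of $\mathbb{P}^1\times\mathbb{P}^1$ to $\mathbb{P}^1$. By a sequence of blowups and blowdowns $(S,F)$ can be obtained from $(\mathbb{P}^1\times\mathbb{P}^1,F_1)$, where in each blowup $F$ is replaced by its total transform. By the blowup formula Cor. 4.6 we get $X^{S,F}_F = X^{\mathbb{P}^1\times\mathbb{P}^1,F_1}_{F_1}$. We can therefore assume that $S = \mathbb{P}^1\times\mathbb{P}^1$ and $F = F_1$, $G = F_2$. By Cor. 4.8 we get $X^{S,G}_F = 0$ and
\[ X^{S,F}_F = \frac{y^{1/2}-y^{-1/2}}{\eta(\tau)^2\,\theta_{1,1}(\tau,z)^2}\,\Theta^{G,F}_{\Gamma,F}(2\tau,-2Fz-2Gz). \]
By (2.15) we have
\begin{align*}
\Theta^{G,F}_{\Gamma,F}(2\tau,x) &= \eta(2\tau)^3\,\Bigl(\frac{\theta_{1,1}(\cdot,\langle F-G,\cdot\rangle)}{\theta_{1,1}(\cdot,-\langle G,\cdot\rangle)\,\theta_{1,1}(\cdot,\langle F,\cdot\rangle)}\Bigr)\Big|_{F/2}(2\tau,x) \\
 &= \frac{\eta(2\tau)^3\,\theta_{0,1}(2\tau,\langle F-G,x\rangle)}{\theta_{0,1}(2\tau,-\langle G,x\rangle)\,\theta_{1,1}(2\tau,\langle F,x\rangle)}.
\end{align*}
Thus
\[ \Theta^{G,F}_{\Gamma,F}(2\tau,-2Fz-2Gz) = \frac{\eta(2\tau)^3\,\theta^0_{0,1}(2\tau)}{\theta_{0,1}(2\tau,2z)\,\theta_{1,1}(2\tau,2z)}. \]
By (2.7) we get
\[ X^{S,F}_F = \frac{(y^{1/2}-y^{-1/2})\,\eta(\tau)}{\theta_{1,1}(\tau,z)^2\,\theta_{1,1}(\tau,2z)}. \]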
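To get the $\chi_y$-genus of $M_S^{F_+}(F,d)$, we note that
\[ \sum_{d\ge 0} X_y\bigl(M_S^{F_+}(F,d)\bigr)q^{d-\frac13} = \frac{y^{1/2}-y^{-1/2}}{\eta(\tau)^2\,\theta_{1,1}(\tau,z)^2}\,\lim_{\epsilon\to 0}\Theta^{G,F+\epsilon G}_{\Gamma,F}(\tau,-2Fz-2Gz), \]
and by formula (3.9.1) from [G-Z],
\[ \lim_{\epsilon\to 0}\Theta^{G,F+\epsilon G}_{\Gamma,F}(\tau,-2Fz-2Gz) = \Theta^{G,F}_{\Gamma,F}(\tau,-2Fz-2Gz) - \frac{1}{y-y^{-1}}. \]
To finally obtain the formulas for the Euler numbers we use the formula
\[ \frac{\eta(\tau)^3}{\theta_{1,1}(\tau,z)} = \frac{1}{2\pi i z}\exp\Bigl(2\sum_{k\ge 2} G_k(\tau)\frac{(2\pi i z)^k}{k!}\Bigr) \quad \text{(see [Z1])}, \]
and
\[ \mathrm{Coeff}_{2\pi i z}\,\frac{1}{y-y^{-1}} = -\frac{1}{12}. \qquad \Box \]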
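The last step of this proof can be made explicit. The sketch below assumes $y = e^{2\pi i z}$, writes $w := 2\pi i z$, reads $\mathrm{Coeff}_{2\pi i z}$ as the coefficient of $w^1$, and takes the exponential formula with the sum running over $k \ge 2$; these conventions are assumptions made for illustration.

```latex
% Expansion of the two ingredients in w = 2 \pi i z, up to order w.
\begin{align*}
\frac{\eta(\tau)^3}{\theta_{1,1}(\tau,2z)}
  &= \frac{1}{2w}\exp\Bigl(2\,G_2(\tau)\,\frac{(2w)^2}{2!} + O(w^4)\Bigr)
   = \frac{1}{2w} + 2\,G_2(\tau)\,w + O(w^3), \\
\frac{1}{y-y^{-1}}
  &= \frac{1}{2\sinh w}
   = \frac{1}{2w}\Bigl(1-\frac{w^2}{6}+O(w^4)\Bigr)
   = \frac{1}{2w} - \frac{w}{12} + O(w^3).
\end{align*}
% Coefficients of w:
%   Coeff_w[ \eta^3/\theta_{1,1}(\tau,2z) ]                  = 2 G_2(\tau),
%   Coeff_w[ \eta^3/\theta_{1,1}(\tau,2z) - 1/(y - y^{-1}) ] = 2 G_2(\tau) + 1/12 .
```

Dividing by $\eta(\tau)^8$ gives the two expressions in part 3 of Prop. 7.1; the shift by $1/12$ is exactly the contribution of the correction term $1/(y - y^{-1})$.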
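7.2. The rational elliptic surface. Let $m \in \mathbb{Z}_{\ge 0}$. Let $S$ be the blowup of $\mathbb{P}^2$ in $4m+5$ points, and assume that $F := (m+2)H - mE_1 - \sum_{i=2}^{4m+5}E_i$ is nef, e.g. $F$ is the fibre of a fibration of $S$ over $\mathbb{P}^1$, such that the genus of the generic fibre is $m$.

Theorem 7.2. If $m$ is odd, then
1. \[ X^{S,F}_H + X^{S,F}_{E_1} = \frac{\theta_{1,1}(\tau,mz)}{\theta_{1,1}(\tau,z)\,\widetilde{\theta}_{1,1}(\tau/2,z)\,\theta_{0,1}(\tau,(m-1)z)\,\eta(\tau)\,\eta(\tau/2)^{4m+3}}, \]
2. \[ E^{S,F}_H + E^{S,F}_{E_1} = \frac{m}{\eta(\tau/2)^{4m+8}}, \]
3. \[ \Sigma^{S,F}_H + \Sigma^{S,F}_{E_1} = \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^{4m+4}}. \]

If $m$ is even, then
1. \[ X^{S,F}_{H+E_2} + X^{S,F}_{E_1+E_2} = \frac{\theta_{1,1}(\tau,mz)}{\theta_{1,1}(\tau,z)\,\widetilde{\theta}_{1,1}(\tau/2,z)\,\theta_{0,1}(\tau,(m-1)z)\,\eta(\tau)\,\eta(\tau/2)^{4m+3}}, \]
2. \[ E^{S,F}_{H+E_2} + E^{S,F}_{E_1+E_2} = \frac{m}{\eta(\tau/2)^{4m+8}}, \]
3. \[ \Sigma^{S,F}_{H+E_2} = \Sigma^{S,F}_{E_1+E_2} = 0. \]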
Proof. We mostly deal with the case $m = 2l-1$ odd. The proof in the case $m$ even is analogous. Let $\Gamma = H^2(S,\mathbb{Z})$ with the negative of the intersection form. Let $G := H - E_1$. Then by Lem. 4.8 and Cor. 4.5,
\[ X^{S,F}_H + X^{S,F}_{E_1} = \frac{y^{1/2}-y^{-1/2}}{\eta(\tau)^{16l+2}\,\theta_{1,1}(\tau,z)^2}\Bigl(\Theta^{G,F}_{\Gamma,H}(2\tau,K_Sz) + \Theta^{G,F}_{\Gamma,E_1}(2\tau,K_Sz)\Bigr). \]
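Let $[G,F]$ be the lattice generated by $G$ and $F$, and let $[G,F]^\perp$ be its orthogonal complement in $\Gamma$. We write $\Lambda := [G,F]\oplus[G,F]^\perp$. By $\langle F,F\rangle = \langle G,G\rangle = 0$, $\langle F,G\rangle = -2$, $\langle E_1,F\rangle = 1-2l$, $\langle E_1,G\rangle = -1$, $\langle E_2,F\rangle = -1$, $\langle E_2,G\rangle = 0$ we see that $\Lambda$ has index 4 in $\Gamma$, and that $0$, $E_1$, $E_2$, $E_2+E_1$ form a system of representatives of $\Gamma$ modulo $\Lambda$. Therefore we get by (2.15):
\begin{align*}
\Theta^{G,F}_{\Gamma,H} + \Theta^{G,F}_{\Gamma,E_1}
 &= \Theta^{G,F}_{\Lambda}\bigl(|_{0} + |_{E_1} + |_{E_2} + |_{E_1+E_2}\bigr)\bigl(|_{H/2} + |_{E_1/2}\bigr) \\
 &= \Theta^{G,F}_{\Lambda}\bigl(|_{0} + |_{E_1} + |_{E_2} + |_{E_1+E_2}\bigr)\bigl(|_{0} + |_{G/2}\bigr)|_{E_1/2}.
\end{align*}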
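Let $D_{8l} := \bigl\{(a_1,\dots,a_{8l}) \in \mathbb{Z}^{8l} \bigm| \sum_{i=1}^{8l}a_i \text{ even}\bigr\}$. Then the map $\varphi : D_{8l} \to [G,F]^\perp$ defined by $(a_1,\dots,a_{8l}) \mapsto \sum_{i=1}^{8l}a_i(E_{i+1}-G/2)$ is easily seen to be an isomorphism of lattices. It is well-known (and easy to check) that
\[ \Theta_{D_{8l}}(\tau,(x_1,\dots,x_{8l})) = \frac12\Bigl(\prod_{i=1}^{8l}\theta_{0,0}(\tau,x_i) + \prod_{i=1}^{8l}\theta_{0,1}(\tau,x_i)\Bigr). \]
So we get by (2.14),
\[ \Theta^{G,F}_{\Lambda}(\tau,x) = \frac{\eta(2\tau)^3\,\theta_{1,1}(2\tau,\langle F-G,x\rangle)}{\theta_{1,1}(2\tau,\langle F,x\rangle)\,\theta_{1,1}(2\tau,\langle -G,x\rangle)}\cdot\frac12\Bigl(\prod_{i=2}^{8l+1}\theta_{0,0}(\tau,\langle E_i-G/2,x\rangle) + \prod_{i=2}^{8l+1}\theta_{0,1}(\tau,\langle E_i-G/2,x\rangle)\Bigr). \]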
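If $H(\tau,x)$ is a function $\mathfrak{H}\times\Gamma_{\mathbb{C}} \to \mathbb{C}$ satisfying $H(\tau,x) = \theta_{a,b}(n\tau,\langle L,x\rangle)H_1(\tau,x)$ for some $L \in \Gamma_{\mathbb{Q}}$, then, for $W \in \Gamma$,
\[ H|_W(\tau,x) = \theta_{a+2\langle W,L/n\rangle,\,b}(n\tau,\langle L,x\rangle)\,H_1|_W(\tau,x). \]
We have $\langle H,F\rangle = -(2l+1)$, $\langle H,G\rangle = -1$ and $\langle H,E_i\rangle = 0$ for $i \ge 2$; $\langle E_1,F\rangle = -(2l-1)$, $\langle E_1,G\rangle = -1$ and $\langle E_1,E_i\rangle = 0$ for $i \ge 2$; $\langle E_2,F\rangle = -1$, $\langle E_2,G\rangle = 0$ and $\langle E_2,E_2\rangle = 1$, $\langle E_2,E_i\rangle = 0$ for $i \ge 3$. We also use repeatedly (2.8). Using this we obtain the following: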
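Put
\begin{align*}
A(\tau,x) &:= \frac{\eta(2\tau)^3\,\theta_{1,1}(2\tau,\langle F-G,x\rangle)}{\theta_{1,1}(2\tau,\langle F,x\rangle)\,\theta_{1,1}(2\tau,\langle -G,x\rangle)}, &
B(\tau,x) &:= \frac{-\eta(2\tau)^3\,\theta_{1,1}(2\tau,\langle F-G,x\rangle)}{\theta_{0,1}(2\tau,\langle F,x\rangle)\,\theta_{0,1}(2\tau,\langle -G,x\rangle)}, \\
C(\tau,x) &:= \frac{\eta(2\tau)^3\,\theta_{0,1}(2\tau,\langle F-G,x\rangle)}{\theta_{1,1}(2\tau,\langle F,x\rangle)\,\theta_{0,1}(2\tau,\langle -G,x\rangle)}, &
D(\tau,x) &:= \frac{\eta(2\tau)^3\,\theta_{0,1}(2\tau,\langle F-G,x\rangle)}{\theta_{0,1}(2\tau,\langle F,x\rangle)\,\theta_{1,1}(2\tau,\langle -G,x\rangle)},
\end{align*}
\begin{align*}
\alpha(\tau,x) &:= \frac12\Bigl(\prod_{i=2}^{8l+1}\theta_{0,0}(\tau,\langle E_i-G/2,x\rangle) + \prod_{i=2}^{8l+1}\theta_{0,1}(\tau,\langle E_i-G/2,x\rangle)\Bigr), \\
\beta(\tau,x) &:= \frac12\Bigl(\prod_{i=2}^{8l+1}\theta_{1,0}(\tau,\langle E_i-G/2,x\rangle) + \prod_{i=2}^{8l+1}\theta_{1,1}(\tau,\langle E_i-G/2,x\rangle)\Bigr), \\
\gamma(\tau,x) &:= \frac12\Bigl(\prod_{i=2}^{8l+1}\theta_{0,0}(\tau,\langle E_i-G/2,x\rangle) - \prod_{i=2}^{8l+1}\theta_{0,1}(\tau,\langle E_i-G/2,x\rangle)\Bigr), \\
\delta(\tau,x) &:= \frac12\Bigl(\prod_{i=2}^{8l+1}\theta_{1,0}(\tau,\langle E_i-G/2,x\rangle) - \prod_{i=2}^{8l+1}\theta_{1,1}(\tau,\langle E_i-G/2,x\rangle)\Bigr).
\end{align*}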
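Then we get
\begin{align*}
\Theta^{G,F}_{\Lambda}|_{0}(\tau,x) &= A(\tau,x)\alpha(\tau,x), & \Theta^{G,F}_{\Lambda}|_{E_1}(\tau,x) &= B(\tau,x)\beta(\tau,x), \\
\Theta^{G,F}_{\Lambda}|_{E_2}(\tau,x) &= C(\tau,x)\gamma(\tau,x), & \Theta^{G,F}_{\Lambda}|_{E_1+E_2}(\tau,x) &= D(\tau,x)\delta(\tau,x), \\
\Theta^{G,F}_{\Lambda}|_{G/2}(\tau,x) &= D(\tau,x)\alpha(\tau,x), & \Theta^{G,F}_{\Lambda}|_{G/2+E_1}(\tau,x) &= C(\tau,x)\beta(\tau,x), \\
\Theta^{G,F}_{\Lambda}|_{G/2+E_2}(\tau,x) &= B(\tau,x)\gamma(\tau,x), & \Theta^{G,F}_{\Lambda}|_{G/2+E_1+E_2}(\tau,x) &= A(\tau,x)\delta(\tau,x).
\end{align*}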
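By $\langle K_S,E_i-G/2\rangle = 0$ and $\langle E_1,E_i-G/2\rangle = 1/2$ for $i \ge 2$, we see that
\begin{align*}
\alpha|_{E_1/2}(\tau,K_Sz) &= \bigl((\theta^0_{1/2,0}(\tau))^{8l} + (\theta^0_{1/2,1}(\tau))^{8l}\bigr)/2, \\
\beta|_{E_1/2}(\tau,K_Sz) &= \bigl((\theta^0_{3/2,0}(\tau))^{8l} + (\theta^0_{3/2,1}(\tau))^{8l}\bigr)/2, \\
\gamma|_{E_1/2}(\tau,K_Sz) &= \bigl((\theta^0_{1/2,0}(\tau))^{8l} - (\theta^0_{1/2,1}(\tau))^{8l}\bigr)/2, \\
\delta|_{E_1/2}(\tau,K_Sz) &= \bigl((\theta^0_{3/2,0}(\tau))^{8l} - (\theta^0_{3/2,1}(\tau))^{8l}\bigr)/2.
\end{align*}
Now we note that by (2.8) and (2.9),
\[ \theta^0_{1/2,0}(\tau) = \theta^0_{3/2,0}(\tau) = \frac{\eta(\tau/2)^2}{\eta(\tau/4)}, \qquad \theta^0_{1/2,1}(\tau) = -\theta^0_{3/2,1}(\tau). \]
Putting things together, we get that
\[ \Theta^{G,F}_{\Gamma,E_1}(\tau,K_Sz) + \Theta^{G,F}_{\Gamma,H}(\tau,K_Sz) = (A+B+C+D)|_{E_1/2}(\tau,K_Sz)\cdot\frac{\eta(\tau/2)^{16l}}{\eta(\tau/4)^{8l}}. \]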
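The step "putting things together" can be spelled out as follows. This is a sketch which assumes that the operator $|_{E_1/2}$ acts factor by factor on the products $A\alpha$, $B\beta$, and so on, as the transformation rule for $H|_W$ suggests; the primes and the abbreviations $p$, $s$ are notation introduced here for illustration only.

```latex
% Notation (ours): X' := X|_{E_1/2}(\tau, K_S z),
%                  p := \theta^0_{1/2,0}(\tau),  s := \theta^0_{1/2,1}(\tau).
% Since \theta^0_{3/2,0} = p, \theta^0_{3/2,1} = -s and 8l is even:
\begin{align*}
\alpha' = \beta' &= \tfrac12\bigl(p^{8l}+s^{8l}\bigr), &
\gamma' = \delta' &= \tfrac12\bigl(p^{8l}-s^{8l}\bigr).
\end{align*}
% The eight translates of \Theta^{G,F}_\Lambda therefore add up to
\begin{align*}
&\bigl(A'\alpha'+B'\beta'+C'\gamma'+D'\delta'\bigr)
 +\bigl(D'\alpha'+C'\beta'+B'\gamma'+A'\delta'\bigr) \\
&\qquad = (A'+B'+C'+D')\,(\alpha'+\gamma')
        = (A'+B'+C'+D')\,p^{8l},
\qquad p^{8l} = \frac{\eta(\tau/2)^{16l}}{\eta(\tau/4)^{8l}} .
\end{align*}
```

Only the even power $8l$ is used here, which is why the sign in $\theta^0_{1/2,1} = -\theta^0_{3/2,1}$ plays no role.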
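The orthogonal projections of $0$, $E_1$, $E_2$ and $E_1+E_2$ to $[F,G]$ are a system of representatives of $[F/2,G/2]$ modulo $[F,G]$. Therefore
\[ (A+B+C+D)(\tau,x) = \Theta^{G,F}_{[G/2,F/2]}(\tau,x) = \frac{\eta(\tau/2)^3\,\theta_{1,1}(\tau/2,\langle F/2-G/2,x\rangle)}{\theta_{1,1}(\tau/2,\langle F/2,x\rangle)\,\theta_{1,1}(\tau/2,\langle -G/2,x\rangle)}. \]
Finally by $\langle E_1,F\rangle = 1-2l$, $\langle E_1,-G\rangle = 1$, we get
\begin{align*}
\Theta^{G,F}_{[G/2,F/2]}|_{E_1/2}(2\tau,K_Sz)
 &= -\frac{\eta(\tau)^3\,\theta_{1,1}(\tau,\langle F/2-G/2,K_S\rangle z)}{\theta_{0,1}(\tau,\langle F/2,K_S\rangle z)\,\theta_{0,1}(\tau,\langle -G/2,K_S\rangle z)} \\
 &= \frac{\eta(\tau)^3\,\theta_{1,1}(\tau,(2l-1)z)}{\theta_{0,1}(\tau,(2-2l)z)\,\theta_{0,1}(\tau,z)}.
\end{align*}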
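Putting this together, we obtain
\begin{align*}
X^{S,F}_H + X^{S,F}_{E_1}
 &= \frac{(y^{1/2}-y^{-1/2})\,\eta(\tau)^3\,\theta_{1,1}(\tau,(2l-1)z)}{\eta(\tau)^{16l+2}\,\theta_{1,1}(\tau,z)^2\,\theta_{0,1}(\tau,(2-2l)z)\,\theta_{0,1}(\tau,z)}\cdot\frac{\eta(\tau)^{16l}}{\eta(\tau/2)^{8l}} \\
 &= \frac{\theta_{1,1}(\tau,(2l-1)z)}{\theta_{1,1}(\tau,z)\,\widetilde{\theta}_{1,1}(\tau/2,z)\,\theta_{0,1}(\tau,(2-2l)z)\,\eta(\tau)\,\eta(\tau/2)^{8l-1}}.
\end{align*}
In the last line we have used (2.7). To get the formula for $E^{S,F}_H + E^{S,F}_{E_1}$, take the limit $z \to 0$. It is immediate from (2.5) that $\frac{\theta_{1,1}(\tau,(2l-1)z)}{\theta_{1,1}(\tau,z)}\big|_{z=0} = 2l-1$ and $\widetilde{\theta}_{1,1}(\tau/2,0) = \eta(\tau/2)^3$. Therefore
\[ E^{S,F}_H + E^{S,F}_{E_1} = \frac{2l-1}{\eta(\tau)\,\eta(\tau/2)^{8l+2}\,\theta^0_{0,1}(\tau)} = \frac{2l-1}{\eta(\tau/2)^{8l+4}}. \]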
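The limit $z \to 0$ used for the Euler numbers can be checked in two small steps. The sketch assumes that $\theta_{1,1}(\tau,\cdot)$ is odd with a simple zero at $z = 0$ and that the Nullwert satisfies $\theta^0_{0,1}(\tau) = \eta(\tau/2)^2/\eta(\tau)$; both are standard for the classical theta functions, but the normalization fixed in Section 2 is not repeated here, so treat them as assumptions. The coefficient $c(\tau)$ is notation introduced for this check.

```latex
% (a) Linearize the odd theta function: \theta_{1,1}(\tau,z) = c(\tau) z + O(z^3), c(\tau) \neq 0.
\[
\lim_{z\to 0}\frac{\theta_{1,1}(\tau,(2l-1)z)}{\theta_{1,1}(\tau,z)}
 = \lim_{z\to 0}\frac{c(\tau)\,(2l-1)\,z + O(z^3)}{c(\tau)\,z + O(z^3)} = 2l-1 .
\]
% (b) Collect the denominator at z = 0, using \widetilde\theta_{1,1}(\tau/2,0) = \eta(\tau/2)^3:
\[
\eta(\tau/2)^3\,\theta^0_{0,1}(\tau)\,\eta(\tau)\,\eta(\tau/2)^{8l-1}
 = \eta(\tau)\,\eta(\tau/2)^{8l+2}\,\theta^0_{0,1}(\tau)
 = \eta(\tau/2)^{8l+4} .
\]
```

With $m = 2l-1$ the exponent $8l+4$ equals $4m+8$, matching part 2 of Theorem 7.2.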
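To get the formula for $\Sigma^{S,F}_H + \Sigma^{S,F}_{E_1}$ we put $z = \pi i$, to obtain
\[ \Sigma^{S,F}_H + \Sigma^{S,F}_{E_1} = \frac{1}{\theta^0_{1,0}(\tau/2)\,\theta^0_{0,1}(\tau)\,\eta(\tau)\,\eta(\tau/2)^{8l-1}} = \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^{8l}}. \]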
In the case $m = 2l$ even, we again have that $0$, $E_1$, $E_2$, $E_1+E_2$ form a system of representatives of $\Gamma$ modulo $\Lambda$. So we get
\[ \Theta^{G,F}_{\Gamma,H+E_2} + \Theta^{G,F}_{\Gamma,E_1+E_2} = \Theta^{G,F}_{\Lambda}\bigl(|_{0} + |_{E_1} + |_{E_2} + |_{E_1+E_2}\bigr)\bigl(|_{0} + |_{G/2}\bigr)|_{(E_1+E_2)/2}. \]
Essentially the same computations as in the case $m$ odd give the result. $\Box$

Let now $S$ be the blowup of $\mathbb{P}^2$ in 9 points. Let $H$ be the pullback of the hyperplane class, and let $E_1,\dots,E_9$ be the classes of the exceptional divisors. Let $F := 3H - \sum_{i=1}^9 E_i$. Then $K_S = -F$. An interesting case is when $S$ is a rational elliptic surface, and $F$ is the class of a fibre. In [M-N-V-W] the generating functions of the Euler numbers $e(M_S(C,d))$ are predicted in case $CF$ is even. This prediction was proven in [Y4]. As an immediate consequence of Thm. 7.2 we can compute the Hodge numbers of the $M_S^F(C,d)$ in case $FC$ is odd. For the Betti numbers this result was already obtained (more generally for regular elliptic surfaces) in [Y6] using completely different methods. By [Be] the result about the Hodge numbers for $S$ is a direct consequence.
Theorem 7.3. Let $C \in H^2(S,\mathbb{Z})$ with $C^2$ odd. Then $M_S^F(C,d)$ has the same Hodge numbers as $S^{[2d-3/2]}$. In particular the Hodge numbers depend only on $d$, and we have
1. \[ \sum_{d\ge 0}\bigl(X_y(M_S^F(H,d)) + X_y(M_S^F(E_1,d))\bigr)q^{d-1} = \frac{1}{\widetilde{\theta}_{1,1}(\tau/2,z)\,\eta(\tau/2)^9}, \]
2. \[ \sum_{d\ge 0}\bigl(e(M_S^F(H,d)) + e(M_S^F(E_1,d))\bigr)q^{d-1} = \frac{1}{\eta(\tau/2)^{12}}, \]
3. \[ \sum_{d\ge 0}\bigl(\sigma(M_S^F(H,d)) - \sigma(M_S^F(E_1,d))\bigr)q^{d-1} = \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^8}. \]
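The right-hand sides in Theorem 7.3 are what Theorem 7.2 gives for $m = 1$ (that is, $l = 1$, the blowup of $\mathbb{P}^2$ in 9 points). The following specialization is a sketch under the assumption that $\theta_{0,1}(\tau,0) = \theta^0_{0,1}(\tau) = \eta(\tau/2)^2/\eta(\tau)$; it only checks the closed forms and does not replace the proof.

```latex
% Theorem 7.2 at m = 1: the factor \theta_{0,1}(\tau,(m-1)z) becomes the Nullwert \theta^0_{0,1}(\tau).
\begin{align*}
X^{S,F}_H + X^{S,F}_{E_1}
 &= \frac{\theta_{1,1}(\tau,z)}{\theta_{1,1}(\tau,z)\,\widetilde\theta_{1,1}(\tau/2,z)\,
      \theta^0_{0,1}(\tau)\,\eta(\tau)\,\eta(\tau/2)^{7}}
  = \frac{1}{\widetilde\theta_{1,1}(\tau/2,z)\,\eta(\tau/2)^{9}}, \\
E^{S,F}_H + E^{S,F}_{E_1}
 &= \frac{1}{\eta(\tau/2)^{4\cdot 1+8}} = \frac{1}{\eta(\tau/2)^{12}}, \\
\Sigma^{S,F}_H + \Sigma^{S,F}_{E_1}
 &= \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^{4\cdot 1+4}} = \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^{8}} .
\end{align*}
% The shift q^{d-1} corresponds to e(S)/12 = 12/12 = 1 for the blowup of P^2 in 9 points.
```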
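Remark 7.4. 1. We can recover the Hodge numbers of $M_S^F(C,d)$:
\[ \sum_{d\ge 0} X_y(M_S^F(H,d))\,q^{d-1} = \frac12\Bigl(\frac{1}{\widetilde{\theta}_{1,1}(\tau/2,z)\,\eta(\tau/2)^9} + \frac{i}{\widetilde{\theta}_{1,1}((\tau+1)/2,z)\,\eta((\tau+1)/2)^9}\Bigr), \]
\[ \sum_{d\ge 0} X_y(M_S^F(E_1,d))\,q^{d-1} = \frac12\Bigl(\frac{1}{\widetilde{\theta}_{1,1}(\tau/2,z)\,\eta(\tau/2)^9} - \frac{i}{\widetilde{\theta}_{1,1}((\tau+1)/2,z)\,\eta((\tau+1)/2)^9}\Bigr). \]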
2. We can also use Thm. 6.2 to compute the generating functions for the signatures. From [G-Z], Sect. 5.3 we get $\Phi^{S,F}_H(1+p/2) = 1$, and by the simple type condition $\Phi^{S,F}_H(p^r)$ is $2^r$ if $r$ is even and 0 otherwise. After some calculations this gives
\[ \sum_{d\ge 0}\sigma(M_S^F(H,d))\,q^{d-1} = 12\,e_2(2\tau)\,\frac{\eta(2\tau)^8}{\eta(\tau)^{22}}. \]
A similar calculation using [G-Z], Sect. 5.3 and Thm. 6.2 gives
\[ \sum_{d\ge 0}\sigma(M_S^F(E_1,d))\,q^{d-1} = -8\,\frac{\eta(2\tau)^{16}}{\eta(\tau)^{26}}. \]
It is an exercise in modular forms to show that
\[ 12\,e_2(2\tau)\,\frac{\eta(2\tau)^8}{\eta(\tau)^{22}} + 8\,\frac{\eta(2\tau)^{16}}{\eta(\tau)^{26}} = \frac{1}{\eta(\tau)^2\,\eta(\tau/2)^8}, \]
and thus to recover part (3) of Thm. 7.3.

3. If $X$ is a K3 surface, $L$ a primitive line bundle and $H$ a generic ample line bundle on $X$, then it was shown in [G-H] that $M_X^H(L,d)$ has the same Hodge numbers as $X^{[2d-3]}$, and in [H] that $M_X^H(L,d)$ is deformation equivalent to $X^{[2d-3]}$. There should be a proof of Thm. 7.3 similar to that in [G-H]. Furthermore I expect that, in case $C^2$ is odd, $M_S^F(C,d)$ is deformation equivalent to $S^{[2d-3/2]}$. More generally, similar results should also hold for arbitrary rank.
4. In physics the polarized rational elliptic surface $(S,F)$ is often called $\frac12$K3. This is related to the fact that one can degenerate an elliptic K3 surface to the union of 2 rational elliptic surfaces intersecting along a fibre. The generating function of the $\chi_y$-genera of the $M_X^H(L,d)$ ($L$ primitive and allowing $L^2$ both congruent 0 modulo 4 and congruent 2 modulo 4) on the K3 surface is just the square of the generating function on $S$. One could ask whether this result can also be shown by degenerating the moduli spaces $M_X^H(L,d)$.

Proof (of Thm. 7.3). We first show that the $M_S^F(C,d)$ depend only on $d$. Let $G$ be the subgroup of $\mathrm{Aut}(H^2(S,\mathbb{Z}))$ generated by the Cremona transforms and the permutations of $E_1,\dots,E_9$. $F$ is invariant under the operation of $G$, and therefore, by Cor. 4.11, $M_S^F(C,d) \simeq M_S^F(g(C),d)$ for all $g \in G$. We can assume that $C = nH - \sum_i a_iE_i$ with $n, a_1,\dots,a_9 \in \{0,1\}$. Let $m$ be the number of indices $i \ge 1$ with $a_i = 1$. By renumbering $E_1,\dots,E_9$ we can assume that either $C = H$ or $C = E_1$, in which case we are done, or $(n,a_1,a_2,a_3)$ is one of $(0,1,1,1)$ or $(1,0,1,1)$. The Cremona transform replaces $(n,a_1,a_2,a_3)$ by $(1,0,0,0)$, $(0,1,0,0)$ respectively, and the result follows by induction on $m$. As $K_SF \le 0$, the moduli spaces $M_S^F(C,d)$ are smooth, and by [Be] all their cohomology is of Hodge type $(p,p)$. Therefore the theorem follows from the case $l = 1$ of Thm. 7.2. $\Box$
References

[Ba] Baranovsky, V.: Moduli of sheaves on surfaces and action of the oscillator algebra. Preprint math.AG/9811092
[Be] Beauville, A.: Sur la cohomologie de certaines espaces de modules de fibrés vectoriels. In: Geometry and Analysis. (Bombay, 1992) Bombay: Tata Inst. Fund. Res., 1995, pp. 37–40
[Bo] Borcherds, R.: Automorphic forms with singularities on Grassmannians. Invent. Math. 132, 491–562 (1998)
[C] Chandrasekharan, K.: Elliptic functions. Grundlehren 281, Berlin Heidelberg: Springer-Verlag, 1985
[dC-M] de Cataldo, M.A., Migliorini, M.: The Douady space of a complex surface. Preprint math.AG/9811199
[Ch] Cheah, J.: On the cohomology of Hilbert schemes of points. J. Alg. Geom. 5, 479–511 (1996)
[D-K] Danilov, V.I., Khovanskii, A.G.: Newton Polyhedra and an algorithm for computing Hodge–Deligne numbers. Math. USSR Izvestiya 29, 279–298 (1987)
[De] Deligne, P.: La conjecture de Weil I. Publ. Math. IHES 43, 273–307 (1974)
[E-G] Ellingsrud, G., Göttsche, L.: Variation of moduli spaces and Donaldson invariants under change of polarisation. J. reine angew. Math. 467, 1–49 (1995)
[E-S] Ellingsrud, G., Strømme, S.A.: On the homology of the Hilbert scheme of points in the plane. Invent. Math. 87, 343–352 (1987)
[F-S] Fintushel, R., Stern, R.: The blowup formula for Donaldson invariants. Ann. Math. 143, 529–546 (1996)
[F] Fogarty, J.: Algebraic families on an algebraic surface. Am. J. Math. 90, 511–521 (1968)
[F-L1] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, I: Continuity of gluing maps. Preprint math.DG/9812060
[F-L2] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, II: Surjectivity of gluing maps. In preparation
[F-L3] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, III: Bubble-tree compactifications and manifolds-with-corners structures. In preparation
[F-L4] Feehan, P.M.N., Leness, T.G.: Donaldson invariants and wall-crossing maps, IV: Intersection theory. In preparation
[F-Q] Friedman, R., Qin, Z.: Flips of moduli spaces and transition formulas for Donaldson polynomial invariants of rational surfaces. Commun. in Analysis and Geometry 3, 11–83 (1995)
[Gö1] Göttsche, L.: The Betti numbers of the Hilbert schemes of points on a smooth projective surface. Math. Ann. 286, 193–207 (1990)
[Gö2] Göttsche, L.: Change of polarization and Hodge numbers of moduli spaces of torsion free sheaves on surfaces. Math. Zeitschr. 223, 247–260 (1996)
[Gö3] Göttsche, L.: Modular forms and Donaldson invariants for 4-manifolds with b+ = 1. J. Am. Math. Soc. 9, 826–843 (1996)
[G-H] Göttsche, L., Huybrechts, D.: Hodge numbers of moduli spaces of stable sheaves on K3 surfaces. Int. J. Math. 7 No. 3, 359–372 (1996)
[G-S] Göttsche, L., Soergel, W.: Perverse sheaves and the cohomology of Hilbert schemes of smooth algebraic surfaces. Math. Ann. 296, 235–245 (1993)
[G-Z] Göttsche, L., Zagier, D.: Jacobi forms and the structure of Donaldson invariants for 4-manifolds with b+ = 1. Sel. Math. New. Ser. 4, 69–115 (1998)
[H-B-J] Hirzebruch, F., Berger, T., Jung, R.: Manifolds and modular forms. Aspects of Math. E20, Braunschweig–Wiesbaden: Vieweg, 1994
[H] Huybrechts, D.: Birational symplectic manifolds and their deformations. J. Diff. Geom. 45, 488–513 (1997)
[H-L] Huybrechts, D., Lehn, M.: The Geometry of Moduli Spaces of Sheaves. Aspects of Math. Vol. E 31, Braunschweig–Wiesbaden: Vieweg, 1997
[K-Z] Kaneko, M., Zagier, D.: A generalized Jacobi theta function and quasimodular forms. In: R. Dijkgraaf, C. Faber, G. van der Geer (eds.) The moduli space of curves. Boston: Birkhäuser, 1995, pp. 165–172
[K-M] Kotschick, D., Morgan, J.: SO(3)-invariants for 4-manifolds with b+ = 1 II. J. Diff. Geom. 39, 433–546 (1994)
[Kr-M] Kronheimer, P., Mrowka, T.: Embedded surfaces and the structure of Donaldson’s Polynomial invariants. J. Diff. Geom. 33, 573–734 (1995)
[L] Leness, T.G.: Donaldson wall-crossing formulas via topology. Forum Math. (1998) to appear, dgga/960316
[L-Q1] Li, W.-P., Qin, Z.: On blowup formulae for the S-duality conjecture of Vafa and Witten. Preprint mathAG/9805054
[L-Q2] Li, W.-P., Qin, Z.: On blowup formulae for the S-duality conjecture of Vafa and Witten II: The universal functions. Preprint mathAG/9805055, to appear in Math Research letters
[M-N-V-W] Minahan, J.A., Nemeschansky, D., Vafa, C., Warner, N.P.: E-Strings and N = 4 Topological Yang-Mills Theories. hep-th/9802168
[M-W] Moore, G., Witten, E.: Integration over the u-plane in Donaldson theory. Preprint hep-th/9709193
[Q1] Qin, Z.: Moduli of stable sheaves on ruled surfaces and their Picard groups. J. reine angew. Math. 433, 201–219 (1992)
[Q2] Qin, Z.: Equivalence classes of polarizations and moduli spaces of sheaves. J. Diff. Geom. 37, 397–413 (1993)
[V-W] Vafa, C., Witten, E.: A Strong Coupling Test of S-Duality. Nucl. Phys. B 431 (1994)
[W] Witten, E.: Monopoles and four-manifolds. Math. Research Letters 1, 769–796 (1994)
[Y1] Yoshioka, K.: The Betti numbers of the moduli space of stable sheaves of rank 2 on P2. J. reine angew. Math. 453, 193–220 (1994)
[Y2] Yoshioka, K.: The Betti numbers of the moduli space of stable sheaves of rank 2 on a ruled surface. Math. Ann. 302, 519–540 (1995)
[Y3] Yoshioka, K.: Chamber structure of polarizations and the moduli space of stable sheaves on a ruled surface. Int. J. Math. 7, 411–431 (1996)
[Y4] Yoshioka, K.: Euler characteristics of SU(2) instanton moduli spaces of rational elliptic surfaces. Preprint math.AG/9805003
[Y5] Yoshioka, K.: Betti numbers of moduli of stable sheaves on some surfaces. Nucl. Phys. B (Proc. Suppl.) 46, 263–268 (1996)
[Y6] Yoshioka, K.: Numbers of Fq-rational points of moduli of stable sheaves on elliptic surfaces. Moduli of vector bundles, Lect. Notes in Pure and Applied Math. 179, Marcel Dekker, 297–305
[Z1] Zagier, D.: Periods of modular forms and Jacobi theta functions. Invent. math. 104, 449–465 (1991)
[Z2] Zagier, D.: Introduction to Modular forms. In: From Number Theory to Physics, eds. W. Waldschmidt, P. Moussa, J.-M. Luck, C. Itzykson, Berlin–Heidelberg: Springer-Verlag, 1992
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 137 – 155 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Angular Momentum and Positive Mass Theorem
Xiao Zhang
Institute of Mathematics, Chinese Academy of Sciences, Beijing 100080, P. R. China. E-mail: [email protected]
Received: 26 October 1998 / Accepted: 10 March 1999
Abstract: Total angular momentum for asymptotically flat manifolds is defined. A positive mass theorem for initial (spin) data sets $(M, g_{ij}, p_{ij})$ with nonsymmetric $p_{ij}$ is proved. As an application, we establish positive mass theorems involving total linear momentum and total angular momentum. This gives an answer to a problem of S. T. Yau in his Problem Section [Ya2] and a partial answer to his recent conjecture on the relationship among total energy, total linear momentum, total angular momentum and entropy of black hole.

1. Introduction

In general relativity, basic quantities are total energy, total linear momentum, total angular momentum and entropy of black hole (i.e., area of apparent horizon). Total energy and total linear momentum can be defined on asymptotically flat manifolds. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional Riemannian manifold with metric tensor $g_{ij}$ and a symmetric 2-tensor $h_{ij}$. $M$ is called asymptotically flat of order $\tau$ if there is a compact set $K \subset M$ such that $M - K$ is the disjoint union of a finite number of subsets $M_1, \dots, M_k$, called the “ends” of $M$, each diffeomorphic to $\mathbb{R}^3 - B_r$, where $B_r$ is the closed ball of radius $r$ with center at the coordinate origin. Under the diffeomorphism the metric of $M_l \subset M$ is of the form
\[ g_{ij} = \delta_{ij} + a_{ij} \tag{1.1} \]
in the standard coordinates $\{x^i\}$ on $\mathbb{R}^3$, where $a_{ij}$ satisfies
\begin{align}
a_{ij} &= O(r^{-\tau}), \tag{1.2} \\
\partial_k a_{ij} &= O(r^{-\tau-1}), \tag{1.3} \\
\partial_l\partial_k a_{ij} &= O(r^{-\tau-2}). \tag{1.4}
\end{align}
Furthermore, the symmetric 2-tensor $h_{ij}$ satisfies
\begin{align}
h_{ij} &= O(r^{-\tau-1}), \tag{1.5} \\
\partial_k h_{ij} &= O(r^{-\tau-2}). \tag{1.6}
\end{align}
We will often identify the end $M_l \subset M$ with the corresponding set $M_l \subset \mathbb{R}^3$. For an asymptotically flat manifold $M$, the total energy of end $M_l$ is defined as
\[ E_l = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S_{r,l}}(\partial_j g_{ij} - \partial_i g_{jj})\,d\Omega^i, \tag{1.7} \]
and the total linear momentum of end $M_l$ is defined as
\[ P_{lk} = \frac{1}{8\pi}\lim_{r\to\infty}\int_{S_{r,l}}(h_{ki} - g_{ki}h_{jj})\,d\Omega^i, \tag{1.8} \]
where $S_{r,l}$ is the sphere of radius $r$ in end $M_l \subset \mathbb{R}^3$. When the asymptotic order $\tau > \frac12$, total energy is independent of the choice of asymptotic coordinates, and it vanishes when $\tau > 1$ [Ba1]. The Riemannian version of the Positive Mass Conjecture was proved first by R. Schoen and S. T. Yau [SY1,SY2,SY3].

Theorem 1.1 (Schoen, Yau). Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order 1. If $M$ satisfies the following dominant energy condition:
\[ \frac12\Bigl(R + \bigl(\sum_i h_{ii}\bigr)^2 - \sum_{i,j} h_{ij}^2\Bigr) \ge \sqrt{\sum_j\Bigl(\sum_i(\nabla_i h_{ji} - \nabla_j h_{ii})\Bigr)^2}, \tag{1.9} \]
where $R$ is the scalar curvature of $M$, then, for each end $M_l$, we have
\[ E_l \ge 0. \tag{1.10} \]
If $E_{l_0} = 0$ for some $l_0$, then $M$ can be isometrically embedded into 4-dimensional Minkowski space $\mathbb{R}^{3,1}$ as a spacelike hypersurface so that $g_{ij}$ is the induced metric from $\mathbb{R}^{3,1}$ and $h_{ij}$ is the second fundamental form. In particular, $M$ is topologically $\mathbb{R}^3$.

Alternatively, there is a Lorentzian version of the Positive Mass Conjecture, which was then proved by E. Witten [Wi] and soon afterwards completed by Parker and Taubes [PT].

Theorem 1.2 (Witten). Let $N$ be a 4-dimensional Lorentzian manifold with Lorentzian metric $\tilde{g}$ of signature $(-1,1,1,1)$, which satisfies the Einstein equations
\[ \tilde{R}_{\alpha\beta} - \frac{\tilde{R}}{2}\tilde{g}_{\alpha\beta} = T_{\alpha\beta}, \tag{1.11} \]
where $\tilde{R}_{\alpha\beta}$, $\tilde{R}$ are the Ricci curvature and scalar curvature of $\tilde{g}$ respectively. Let $M \subset N$ be a spacelike hypersurface with induced Riemannian metric $g_{ij}$ and second fundamental form $h_{ij}$. $(M, g_{ij}, h_{ij})$ is asymptotically flat of order 1. If $M$ satisfies the following dominant energy condition
\[ T_{00} \ge \sqrt{\sum_i T_{0i}^2}, \tag{1.12} \]
where we choose an orthonormal frame $\{e_\alpha\}$ on $N$ with $e_0$ timelike, then, for each end $M_l$, we have
\[ E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{1.13} \]
If $E_{l_0} = 0$ for some $l_0$, and $T_{00} \ge |T_{\alpha\beta}|$, then $M$ has only one end and $N$ is flat over $M$.

By their argument, the conclusion $E_l \ge 0$ in Schoen and Yau's Positive Mass Theorem can also be improved to $E_l \ge \sqrt{\sum_k P_{lk}^2}$ [Ya1]. By the Gauss and Codazzi equations, (1.12) is equivalent to (1.9). Hence the two versions of the Positive Mass Theorem are equivalent. On the other hand, the Penrose inequality, which was proved by Huisken and Ilmanen [HI1,HI2] (see also [Ba2,Br,Gi,He1] for partial or related results), gives the relation between total energy and area of apparent horizon.

In [Ya2], S. T. Yau asked what a good definition of total angular momentum for asymptotically flat space is and what the relationship would be with total mass (Problem 120). In Spring 1997, S. T. Yau conjectured also that the most general case of “Positive Mass Theorem/Penrose inequality” should be as follows: Under a certain kind of “dominant energy condition”, one should have
\[ E_l \ge |P_l| + |J_l| + \sqrt{\frac{A}{16\pi}}, \]
where $P_l$, $J_l$ are the total linear momentum and total angular momentum of end $M_l$ respectively, and $A$ is the area of apparent horizon. The above conjecture of Yau reduces to the Penrose inequality when $h_{ij} = 0$. (See also a related conjecture of Huisken and Ilmanen that the ADM rest mass is $\ge \sqrt{\frac{A}{16\pi}}$ if (1.9) holds [HI2].) It should also have an analogous version on higher dimensional asymptotically flat manifolds, simply replacing the part $\sqrt{\frac{A}{16\pi}}$ by some universal constant times the $\frac{n-2}{n-1}$ power of the entropy of the black hole, where $n$ is the dimension of the manifold; see [Va] for some evidence.

We refer to [AH,AM,AS,St] for many important works on angular momentum on more restrictive asymptotically flat 3-dimensional manifolds in spacetime which can be conformally compactified in the sense of Penrose, and for which the limit of the magnetic part of the asymptotic Weyl curvature tensor vanishes at spacelike infinity of Minkowski space. On these manifolds, total angular momentum can be defined in terms of the Weyl curvature tensor and conformal factor. Unfortunately, the concept of angular momentum remains somewhat problematical [Pe].

In this paper, we shall give total angular momentum a “good” definition and study its relations with total energy and total linear momentum. From now on, we always denote by $(M, g_{ij}, h_{ij})$ ($h_{ij} = h_{ji}$) a 3-dimensional asymptotically flat manifold of order $\tau$ in the sense that
\begin{align}
g_{ij} - \delta_{ij} &\in C^{2,\alpha}_{-\tau}(M), \tag{1.14} \\
h_{ij} &\in C^{0,\alpha}_{-\tau-1}(M), \tag{1.15} \\
\sum_i h_{ii} &\in C^{1,\alpha}_{-\tau-1}(M). \tag{1.16}
\end{align}
(Here, and henceforth, definitions of weighted Sobolev and Hölder spaces follow Bartnik [Ba1].) Let $\rho_z$ be the distance function of $M$ with respect to some fixed point $z \in M$; $\rho_z$ is Lipschitz. Let $\epsilon_{ijk}$ be the components of the volume element of $M$ relative to an arbitrary frame.

Definition 1.1. For any 3-dimensional manifold $(M, g_{ij}, h_{ij})$ ($h_{ij} = h_{ji}$), the local angular momentum density $\tilde h^z_{ij}$ with respect to a point $z \in M$ is defined as
$$\tilde h^z_{ij} = \frac12\,\epsilon_i{}^{uv}(\nabla_u \rho_z^2)(h_{vj} - g_{vj}\,\mathrm{tr}_g(h)). \tag{1.17}$$

The 2-tensor $\tilde h^z_{ij}$ is bounded on any compact set $\tilde K \subset M$, but it might have good smoothness, depending on the smoothness of $h_{ij}$, near infinity on the ends. Moreover, since $\epsilon^{juv}$ is anti-symmetric with respect to $j$ and $v$, we have
$$\mathrm{tr}_g(\tilde h^z) = \frac12\,\epsilon^{juv}(\nabla_u \rho_z^2)(h_{vj} - g_{vj}\,\mathrm{tr}_g(h)) = 0. \tag{1.18}$$
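The vanishing trace in (1.18) can be sanity-checked numerically. The following is a minimal sketch, assuming flat $R^3$ with $g_{ij} = \delta_{ij}$ and $\rho_z^2 = |x - z|^2$; the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def levi_civita_3d():
    """Return the totally antisymmetric symbol eps[i, u, v] on R^3."""
    eps = np.zeros((3, 3, 3))
    for i, u, v in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, u, v] = 1.0
        eps[i, v, u] = -1.0
    return eps

def local_angular_momentum_density(x, h, z=None):
    """Evaluate (1.17) at the point x, assuming the flat metric on R^3.

    h is a symmetric 3x3 array; z is the reference point (origin by default).
    """
    z = np.zeros(3) if z is None else np.asarray(z, dtype=float)
    grad_rho2 = 2.0 * (np.asarray(x, dtype=float) - z)  # gradient of |x - z|^2
    k = h - np.eye(3) * np.trace(h)                     # h_vj - g_vj tr_g(h)
    return 0.5 * np.einsum("iuv,u,vj->ij", levi_civita_3d(), grad_rho2, k)

x = np.array([1.0, 0.0, 0.0])
h = np.diag([1.0, 2.0, 3.0])
h_tilde = local_angular_momentum_density(x, h)
```

For this sample input the trace of `h_tilde` vanishes, while `h_tilde[1, 2]` and `h_tilde[2, 1]` differ, which illustrates that the density need not be symmetric.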
Note that $\tilde h^z_{ij}$ is not symmetric in general. (One can also define a local angular momentum density with respect to any global function $f$ on $M$, replacing $\rho_z^2$ by $f$ in (1.17).)

Definition 1.2. For a 3-dimensional asymptotically flat manifold $(M, g_{ij}, h_{ij})$ in the above sense, the total angular momentum of end $M_l$ with respect to a point $z \in M$ is defined as
$$J^{in}_{lk}(z) = \frac{1}{8\pi}\lim_{r\to\infty}\int_{S_{r,l}} \tilde h^z_{ki}\, d\Omega^i, \tag{1.19}$$
and the total angular momentum of end $M_l$ with respect to a point $x_0 \in R^3$ with coordinates $\{x_0^u\}$ is defined as
$$J^{ex}_{lk}(x_0) = \frac{1}{8\pi}\lim_{r\to\infty}\int_{S_{r,l}} \epsilon_{ku}{}^{v}(x^u - x_0^u)(h_{vi} - g_{vi}\,\mathrm{tr}_g(h))\, d\Omega^i. \tag{1.20}$$
(Note that $J^{ex}_{lk}(0)$ is defined as the total angular momentum in [CK].)

In classical theory, total angular momentum is defined as
$$J_k = \int_{R^3} \epsilon_{kuv}\, x^u T^{v0} * 1$$
with the origin of coordinates at the system's center of mass, where $T_{v0}$ is the momentum density of the system [MTW]. If there is a symmetric 2-tensor $h_{ij}$ such that
$$T_{v0} = \sum_i \partial_i h_{vi} - \partial_v \mathrm{tr}(h),$$
then
$$J_k = \int_{R^3} \epsilon_{kiv}(h_{vi} - \delta_{vi}\,\mathrm{tr}(h)) * 1 + \lim_{r\to\infty}\int_{S_r} \epsilon_{kuv}\, x^u (h_{vi} - \delta_{vi}\,\mathrm{tr}(h)) * dx^i = \lim_{r\to\infty}\int_{S_r} \epsilon_{kuv}\, x^u (h_{vi} - \delta_{vi}\,\mathrm{tr}(h)) * dx^i.$$
Therefore the definitions (1.19), (1.20) of total angular momentum coincide with the one in the classical case up to a constant. Inspired by this, we can also define total angular momentum with respect to $z \in M$ as
$$J^{in}_{lij}(z) = C_n \lim_{r\to\infty}\int_{S_{r,l}} \frac12\Big[(\nabla_i \rho_z^2)(h_{jk} - g_{jk}\,\mathrm{tr}_g(h)) - (\nabla_j \rho_z^2)(h_{ik} - g_{ik}\,\mathrm{tr}_g(h))\Big]\, d\Omega^k, \tag{1.21}$$
and with respect to $x_0 \in R^n$ with coordinates $\{x_0^u\}$ as
$$J^{ex}_{lij}(x_0) = C_n \lim_{r\to\infty}\int_{S_{r,l}} \Big[(x^i - x_0^i)(h_{jk} - g_{jk}\,\mathrm{tr}_g(h)) - (x^j - x_0^j)(h_{ik} - g_{ik}\,\mathrm{tr}_g(h))\Big]\, d\Omega^k \tag{1.22}$$
for higher dimensional asymptotically flat manifolds, where $C_n$ is some universal constant and $S_{r,l}$ is the sphere of radius $r$ in end $M_l \subset R^n$.

Note that the total angular momentum $J^{ex}_{lk}(x_0)$ of each end depends on the choice of point in $R^3$. Hence the one with respect to the center of mass of each end, if it exists, will play a special role in general relativity. For a class of 3-manifolds $M$ with an asymptotically flat end and satisfying much more special asymptotic conditions than those of (1.2), (1.3), (1.4), Huisken and Yau proved that the center of mass does exist if the mass is positive [HY], i.e., there is a unique foliation of the asymptotically flat end by round spheres of constant mean curvature such that their centers of gravity converge to a vector $a \in R^3$ which depends only on the geometry of $M$. Therefore it defines a geometric center of mass. Although the same statement is not claimed in their paper, it is believed that the center of mass will still exist for those ends which satisfy the asymptotic conditions (1.2), (1.3), (1.4) and have positive total energy [Hu].

We first prove a Positive Mass Theorem for 3-dimensional almost asymptotically flat manifolds $(M, g_{ij}, p_{ij})$ with nonsymmetric $p_{ij}$. We also generalize it to higher dimensional almost asymptotically flat spin manifolds.

Definition 1.3. For any $n$-dimensional manifold $(M, g_{ij}, p_{ij})$ ($n \ge 3$) with metric tensor $g_{ij}$ and an arbitrary 2-tensor $p_{ij}$, the local mass density is defined as
$$\mu = \frac12\Big(R + \big(\sum_i p_{ii}\big)^2 - \sum_{i,j} p_{ij}^2\Big), \tag{1.23}$$
where $R$ is the scalar curvature of $M$, and the local momentum densities are defined as
$$\omega_j = \sum_i (\nabla_i p_{ji} - \nabla_j p_{ii}), \tag{1.24}$$
$$\chi_j = 2\sum_i \nabla_i (p_{ij} - p_{ji}). \tag{1.25}$$
$(M, g_{ij}, p_{ij})$ satisfies the dominant energy condition if
$$\mu \ge \max\Big\{\sqrt{\sum_j \omega_j^2},\ \sqrt{\sum_j (\omega_j + \chi_j)^2}\Big\}. \tag{1.26}$$
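As an illustration of how (1.26) is read pointwise, here is a minimal sketch that tests the inequality for given numerical values of $\mu$, $\omega_j$ and $\chi_j$ at a single point. The function name and the sample values are hypothetical; computing the densities themselves from $g_{ij}$ and $p_{ij}$ is not attempted.

```python
import math

def dominant_energy_condition_holds(mu, omega, chi):
    """Check mu >= max(|omega|, |omega + chi|) as in (1.26) at one point.

    mu is the local mass density; omega and chi are sequences holding the
    components of the local momentum densities (1.24) and (1.25).
    """
    norm_omega = math.sqrt(sum(w * w for w in omega))
    norm_sum = math.sqrt(sum((w + c) ** 2 for w, c in zip(omega, chi)))
    return mu >= max(norm_omega, norm_sum)

# For symmetric p_ij the density chi vanishes, so (1.26) reduces to mu >= |omega|.
symmetric_case = dominant_energy_condition_holds(1.0, [0.6, 0.0, 0.8], [0.0, 0.0, 0.0])
# A nonzero chi can break the condition even when mu >= |omega| holds.
skew_case = dominant_energy_condition_holds(1.0, [0.6, 0.0, 0.8], [0.6, 0.0, 0.8])
```

Here `symmetric_case` is true and `skew_case` is false, since $|\omega| = 1$ while $|\omega + \chi| = 2$.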
$(M, g_{ij}, p_{ij})$ is called almost asymptotically flat of order $\tau$ if on each end $M_l \subset M$ the metric is of the form (1.1), which is uniformly equivalent to the flat metric on $R^n - B_r$, and there exists $q > n$ such that
$$a_{ij} \in W^{2,q}_{-\tau}(M), \tag{1.27}$$
$$R \in L^1(M) \cap L^{\frac{q}{2},-\tau-2}(M). \tag{1.28}$$
Furthermore, the 2-tensor $p_{ij}$ together with its associated 2-form
$$\theta = \sum_{i,j} (p_{ij} - p_{ji})\, e^i \wedge e^j \tag{1.29}$$
satisfy that there exist a compact set $\tilde K \supset K$ and $C > 0$ such that
$$p_{ij} \in C^{0,\alpha}_{-\tau-1}(M - \tilde K), \tag{1.30}$$
$$|p_{ij} - p_{ji}| < C \ \text{on } \tilde K, \tag{1.31}$$
$$\Big|\sum_i p_{ii}\Big| < C \ \text{on } \tilde K, \tag{1.32}$$
$$\sum_i p_{ii} \in W^{1,\frac{q}{2}}_{-\tau-1}(M), \tag{1.33}$$
$$d\theta,\ d*\theta \in L^{\frac{q}{2},-\tau-2}(M). \tag{1.34}$$
For a 3-dimensional almost asymptotically flat manifold $(M, g_{ij}, p_{ij})$, the total energy of end $M_l$ is also defined by (1.7), and the total linear momentum of end $M_l$ is defined as in (1.8) with $h_{ij}$ replaced by $p_{ij}$. We refer to [Ba1,PT,Sc,Zh1] for definitions of total energy and total linear momentum on higher dimensional manifolds.

Theorem 1.3. Let $(M, g_{ij}, p_{ij})$ be a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac12$, where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the dominant energy condition (1.26), then, for each end $M_l$, we have
$$E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{1.35}$$
If equality holds in (1.35) for some end $M_{l_0}$, then $M$ has only one end. Furthermore, if $E_{l_0} = 0$ and $g_{ij}$ is $C^2$, $p_{ij}$ is $C^1$, then the following equations hold on $M$:
$$R_{ijkl} + p_{ik}p_{jl} - p_{il}p_{jk} = 0, \tag{1.36}$$
$$\nabla_i p_{jk} - \nabla_j p_{ik} = 0, \tag{1.37}$$
$$\sum_i \nabla_i (p_{ij} - p_{ji}) = 0. \tag{1.38}$$

Theorem 1.4. Let $(M, g_{ij}, p_{ij})$ be an $n$-dimensional almost asymptotically flat spin manifold of order $n - 2 \ge \tau > \frac{n-2}{2}$ ($n > 3$), where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the following dominant energy condition:
$$\mu \ge \max\Big\{\sqrt{\sum_j \omega_j^2},\ \sqrt{\sum_j (\omega_j + \chi_j)^2}\Big\} + \sqrt{\sum_{1\le j\le n-3} \kappa_j^2}, \tag{1.39}$$
where, denoting $\tilde p_{ab} = p_{ab} - p_{ba}$,
$$\kappa_j^2 = \sum_{i,k,l;\ k>l>i>j} (\tilde p_{ji}\tilde p_{kl} + \tilde p_{jk}\tilde p_{li} + \tilde p_{jl}\tilde p_{ik})^2, \tag{1.40}$$
then, for each end $M_l$, we have
$$E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{1.41}$$
If equality holds in (1.41) for some end Ml0 , then M has only one end. Furthermore, if El0 = 0 and gij is C 2 , pij is C 1 , then X k 21 , where hij = hj i . Suppose there is a regular point z ∈ M and the dominant
energy condition (1.26) holds for $p_{ij} = \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega^z_l$ (if $\Omega^z_l$ is nonempty), we have
$$E_l \ge \sqrt{\sum_k (J^{in}_{lk}(z))^2} = \sqrt{\sum_k (J^{ex}_{lk}(x_0))^2}. \tag{1.43}$$
If equality holds in (1.43), then $M$ has only one end. Furthermore, if $E_l = 0$ and $\tilde h^z_{ij}$ is $C^1$, then (1.36), (1.37) and (1.38) hold true for $p_{ij} = \tilde h^z_{ij}$.

Theorem 1.6. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac12$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant energy condition (1.26) holds for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega^z_l$ (if $\Omega^z_l$ is nonempty), we have
$$E_l \ge \sqrt{\sum_k (P_{lk} \pm J^{in}_{lk}(z))^2} = \sqrt{\sum_k (P_{lk} \pm J^{ex}_{lk}(x_0))^2}. \tag{1.44}$$
If equality holds in (1.44), then $M$ has only one end. Furthermore, if $E_l = 0$ and $h_{ij} \pm \tilde h^z_{ij}$ is $C^1$, then (1.36), (1.37) and (1.38) hold true for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$.
We refer to Sect. 4 for related definitions in the above two theorems. We shall address in another paper the Positive Mass Theorem involving total angular momentum for higher dimensional manifolds.

2. Dirac–Witten Operator

Let $(M, g_{ij}, p_{ij})$ be an $n$-dimensional spin Riemannian manifold with metric tensor $g_{ij}$ and 2-tensor $p_{ij}$. Fix a point $p \in M$ and an orthonormal basis $\{e_i\}$ of $T_pM$ such that $(\nabla_i e_j)_p = 0$, where $\nabla$ is the metric connection of $M$. Let $\{e^i\}$ be the dual frame. Let $S$ be the spinor bundle of $M$ with Hermitian metric $\langle\ ,\ \rangle$. The metric connection $\nabla$ of $M$ induces a metric connection (also denoted by $\nabla$) on $S$. Define the modified connections $\tilde\nabla$ and $\bar\nabla$ on $S$ by
$$\tilde\nabla_i = \nabla_i + \frac{\sqrt{-1}}{2}\sum_j p_{ij}\, e^j, \tag{2.1}$$
$$\bar\nabla_i = \nabla_i + \frac{\sqrt{-1}}{2}\sum_j p_{ij}\, e^j - \frac{\sqrt{-1}}{2}\sum_{j,k;\ i\ne j\ne k\ne i} p_{jk}\, e^i e^j e^k. \tag{2.2}$$
In a local orthonormal coframe $\{e^i\}$ of $M$, the Dirac operator $D$ and the Dirac–Witten operator $\tilde D$ are defined by
$$D = \sum_i e^i \nabla_i, \tag{2.3}$$
$$\tilde D = \sum_i e^i \tilde\nabla_i \tag{2.4}$$
respectively. We have the following Lichnerowicz formula:
$$D^2 = \nabla^*\nabla + \frac14 R, \tag{2.5}$$
where $R$ is the scalar curvature of $M$. In terms of (2.1), we have
$$\tilde D = D + \frac{\sqrt{-1}}{2}\sum_{i,j} p_{ij}\, e^i e^j. \tag{2.6}$$
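Moreover,
$$d(\langle\phi,\psi\rangle * e^i) = (\langle\nabla_i\phi,\psi\rangle + \langle\phi,\nabla_i\psi\rangle) * 1 \tag{2.7}$$
$$= \Big(\langle\tilde\nabla_i\phi,\psi\rangle + \big\langle\phi, \big(\tilde\nabla_i - \sqrt{-1}\sum_j p_{ij}e^j\big)\psi\big\rangle\Big) * 1 \tag{2.8}$$
$$= \Big(\langle\bar\nabla_i\phi,\psi\rangle + \big\langle\phi, \big(\bar\nabla_i - \sqrt{-1}\sum_j p_{ij}e^j\big)\psi\big\rangle\Big) * 1, \tag{2.9}$$
$$d(\langle e^i\phi,\psi\rangle * e^i) = (\langle D\phi,\psi\rangle - \langle\phi, D\psi\rangle) * 1 \tag{2.10}$$
$$= \Big(\langle\tilde D\phi,\psi\rangle - \big\langle\phi, \big(\tilde D + \sqrt{-1}\sum_i p_{ii}\big)\psi\big\rangle\Big) * 1. \tag{2.11}$$
Hence,
$$\tilde\nabla_i^* = -\tilde\nabla_i + \sqrt{-1}\sum_j p_{ij}e^j \tag{2.12}$$
$$= -\nabla_i + \frac{\sqrt{-1}}{2}\sum_j p_{ij}e^j, \tag{2.13}$$
$$\bar\nabla_i^* = -\bar\nabla_i + \sqrt{-1}\sum_j p_{ij}e^j \tag{2.14}$$
$$= -\nabla_i + \frac{\sqrt{-1}}{2}\sum_j p_{ij}e^j + \frac{\sqrt{-1}}{2}\sum_{j,k;\ i\ne j\ne k\ne i} p_{jk}\, e^i e^j e^k, \tag{2.15}$$
$$\tilde D^* = \tilde D + \sqrt{-1}\sum_i p_{ii} \tag{2.16}$$
$$= D + \sqrt{-1}\sum_i p_{ii} + \frac{\sqrt{-1}}{2}\sum_{i,j} p_{ij}\, e^i e^j. \tag{2.17}$$
Now we prove the following three Weitzenböck formulas.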
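Theorem 2.1.
$$\begin{aligned}\tilde D^*\tilde D &= \nabla^*\nabla + \sqrt{-1}\sum_{i\ne j\ne k\ne i} p_{jk}\, e^i e^j e^k \nabla_i + \frac{\sqrt{-1}}{2}\sum_{i\ne j\ne k\ne i} e^i e^j e^k\, \nabla_i p_{jk} \\ &\quad + \frac14\Big(R + \big(\sum_i p_{ii}\big)^2 - \sum_{i\ne j,\ k\ne l} p_{ij}p_{kl}\, e^i e^j e^k e^l\Big) \\ &\quad - \frac{\sqrt{-1}}{2}\sum_{i,j}\nabla_i(p_{ij} - p_{ji})\, e^j - \frac{\sqrt{-1}}{2}\sum_{i,j}\nabla_j p_{ii}\, e^j \end{aligned} \tag{2.18}$$
$$= \bar\nabla^*\bar\nabla + \frac12\Big(\mu + \sqrt{-1}\sum_j \omega_j e^j\Big) + \frac12 F, \tag{2.19}$$
$$\tilde D\tilde D^* = \bar\nabla\bar\nabla^* + \frac12\Big(\mu - \sqrt{-1}\sum_j (\omega_j + \chi_j) e^j\Big) - \frac12 F, \tag{2.20}$$
where
$$F = \begin{cases} 0 & (n = 3), \\ \sum_{i\ne j\ne k\ne l\ne i} p_{ij}p_{kl}\, e^i e^j e^k e^l & (n > 3). \end{cases} \tag{2.21}$$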
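One algebraic step behind (2.18) is that the zeroth-order terms $-\frac12\sum_i p_{ii}\sum_{i,j}p_{ij}e^ie^j - \frac14\sum_{i,j,k,l}p_{ij}p_{kl}e^ie^je^ke^l$ collapse to $\frac14(\sum_i p_{ii})^2 - \frac14\sum_{i\ne j,k\ne l}p_{ij}p_{kl}e^ie^je^ke^l$, using only $e^ie^i = -1$. The following is a minimal numerical sketch of that step for $n = 3$. The three $2\times 2$ matrices are an assumed Clifford representation with $e^ie^j + e^je^i = -2\delta_{ij}$, chosen here for illustration; the helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
# Assumed representation of the coframe by 2x2 matrices squaring to -Id.
E = [
    np.array([[0, 1j], [1j, 0]]),
    np.array([[0, 1], [-1, 0]], dtype=complex),
    np.array([[1j, 0], [0, -1j]]),
]

def clifford_sum(p, skip_diagonal=False):
    """Return sum_{i,j} p_ij e^i e^j, optionally restricted to i != j."""
    total = np.zeros((2, 2), dtype=complex)
    for i in range(3):
        for j in range(3):
            if skip_diagonal and i == j:
                continue
            total += p[i, j] * (E[i] @ E[j])
    return total

def zeroth_order_terms(p):
    """Return both sides of the zeroth-order identity for a 3x3 array p."""
    t = np.trace(p)
    full = clifford_sum(p)
    off_diagonal = clifford_sum(p, skip_diagonal=True)
    lhs = -0.5 * t * full - 0.25 * (full @ full)
    rhs = 0.25 * t**2 * I2 - 0.25 * (off_diagonal @ off_diagonal)
    return lhs, rhs
```

Calling `zeroth_order_terms` on an arbitrary, not necessarily symmetric, $3\times 3$ array should return two matrices that agree up to rounding.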
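Proof. By (2.17), we have
$$\begin{aligned}\tilde D^*\tilde D &= D^2 + \frac{\sqrt{-1}}{2}\sum_{i,j,k}\nabla_k p_{ij}\, e^k e^i e^j - \sqrt{-1}\sum_{i,j} p_{ij}\, e^j\nabla_i + \sqrt{-1}\sum_{i,j} p_{ij}\, e^i\nabla_j + \sqrt{-1}\sum_{i\ne j} p_{ij}\, e^i e^j D \\ &\quad - \frac12\sum_i p_{ii}\sum_{i,j} p_{ij}\, e^i e^j - \frac14\sum_{i,j,k,l} p_{ij}p_{kl}\, e^i e^j e^k e^l \\ &= D^2 + \frac{\sqrt{-1}}{2}\sum_{i,j,k}\nabla_k p_{ij}\, e^k e^i e^j + \sqrt{-1}\sum_{i\ne j\ne k\ne i} e^i e^j e^k\, p_{jk}\nabla_i \\ &\quad + \frac14\big(\sum_i p_{ii}\big)^2 - \frac14\sum_{i\ne j,\ k\ne l} p_{ij}p_{kl}\, e^i e^j e^k e^l. \end{aligned}$$
Hence (2.18) follows from the Lichnerowicz formula (2.5). By (2.15), we have
$$\begin{aligned}\bar\nabla^*\bar\nabla &= \nabla^*\nabla + \frac{\sqrt{-1}}{2}\sum_{i\ne j\ne k\ne i} e^i e^j e^k\, \nabla_i p_{jk} + \sqrt{-1}\sum_{i\ne j\ne k\ne i} e^i e^j e^k\, p_{jk}\nabla_i \\ &\quad - \frac{\sqrt{-1}}{2}\sum_{i,j}\nabla_i p_{ij}\, e^j + \frac14\sum_{i,j} p_{ij}^2 - \frac14\sum_{i\ne j,\ k\ne l} p_{ij}p_{kl}\, e^i e^j e^k e^l - \frac12 F. \end{aligned} \tag{2.22}$$
Hence (2.19) follows. On the other hand, (2.2), (2.15), (2.6) and (2.17) give
$$\bar\nabla\bar\nabla^* - \bar\nabla^*\bar\nabla = \sqrt{-1}\sum_{i,j}\nabla_i p_{ij}\, e^j + F, \tag{2.23}$$
$$\tilde D\tilde D^* - \tilde D^*\tilde D = \sqrt{-1}\sum_{i,j}\nabla_j p_{ii}\, e^j. \tag{2.24}$$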
Hence (2.20) follows. □

Now we can derive the following integral form of the Weitzenböck formula (2.19):

Theorem 2.2.
$$\int_{\partial M}\langle\phi, \bar\nabla_i\phi + e^i\tilde D\phi\rangle * e^i = \int_M |\bar\nabla\phi|^2 + \frac12\Big\langle\phi, \Big(\mu + \sqrt{-1}\sum_j \omega_j e^j\Big)\phi\Big\rangle + \int_M \frac12\langle\phi, F\phi\rangle - |\tilde D\phi|^2. \tag{2.25}$$
Proof. It follows from (2.9), (2.11) and (2.19). □

3. Positive Mass Theorem

If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac12$ with asymptotic coordinates $\{x^i\}$, then, on each end, we have
$$\nabla_j = \partial_j - \frac14\Gamma_{kjl}\, dx^k dx^l + O(r^{-2\tau-1}), \tag{3.1}$$
$$\tilde D = dx^j\partial_j - \frac14\Gamma_{kjl}\, dx^j dx^k dx^l + \frac{\sqrt{-1}}{2}\sum_{i,j} p_{ij}\, dx^i dx^j + O(r^{-2\tau-1}), \tag{3.2}$$
where
$$\Gamma_{kjl} = \frac12(\partial_j g_{kl} + \partial_l g_{kj} - \partial_k g_{jl}). \tag{3.3}$$
Let $\delta = \tau$ for $1 > \tau > \frac12$ and $\delta = 1 - \varepsilon$ for $\tau = 1$, where $\varepsilon > 0$ is chosen such that $1 > \delta > \frac12$. Thus $\tilde D$, $\tilde D^*$ give the following maps between weighted Sobolev spaces defined by the connection $\nabla$ on $S$:
$$W^{2,q}_{-\delta}(S) \xrightarrow{\ \tilde D\ } W^{1,q}_{-\delta-1}(S) \xrightarrow{\ \tilde D^*\ } W^{0,q}_{-\delta-2}(S).$$
For a constant spinor $\phi_0$, $\partial_j\phi_0 = 0$, we have $\tilde D\phi_0 \in W^{1,q}_{-\delta-1}(S)$ and $\tilde D^*\tilde D\phi_0 \in W^{0,q}_{-\delta-2}(S)$. Recall the Pauli representation of the coframe $\{e^i\}$ on the spinor bundle
$$e^1 \mapsto \begin{pmatrix} 0 & \sqrt{-1} \\ \sqrt{-1} & 0 \end{pmatrix}, \quad e^2 \mapsto \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad e^3 \mapsto \begin{pmatrix} \sqrt{-1} & 0 \\ 0 & -\sqrt{-1} \end{pmatrix}. \tag{3.4}$$
Obviously, $e^1e^2e^3 = \mathrm{Id}$. Now we recall a lemma, which can be easily proved in the spirit of [PT]; see also [Zh1].

Lemma 3.1. Suppose $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $\tau > 0$, and $\phi$, $\{\phi_a\}$ are $C^1$ spinors which satisfy either $\bar\nabla\phi = 0$, $\bar\nabla\phi_a = 0$ or $\bar\nabla^*\phi = 0$, $\bar\nabla^*\phi_a = 0$.
(i) If $\lim_{x\to\infty}\phi(x) = 0$, where the limit is taken along $M$ in one asymptotic end, then $\phi = 0$.
(ii) If $\{\phi_a\}$ are linearly independent in some end, then they are linearly independent everywhere on $M$.

Proof. (i) By the assumption, we have
$$\nabla_i\phi = \mp\frac{\sqrt{-1}}{2}\sum_j p_{ij}\, e^j\phi + \frac{\sqrt{-1}}{2}\sum_{j,k;\ i\ne j\ne k\ne i} p_{jk}\, e^i e^j e^k\phi.$$
Then $|d|\phi|^2| = 2|\mathrm{Re}\langle\nabla\phi,\phi\rangle| \le C|p||\phi|^2$. This implies $|d\ln|\phi|| \le Cr^{-\tau-1}$ on the complement of the zero set of $\phi$ on $M - \tilde K$. Integrating it along a path from $x_0 \in M$ gives
$$|\phi(x)| \ge |\phi(x_0)|\, e^{C(|x_0|^{-\tau} - |x|^{-\tau})}.$$
Taking $x$ to be the first zero of $\phi$ along the path of integration, or taking the limit as $|x| \to \infty$ if no such zero exists, shows that $\phi(x_0) = 0$. Hence $\phi = 0$ on $M - \tilde K$. Since $\phi$ satisfies the following Dirac-type equation,
$$D\phi = \mp\frac{\sqrt{-1}}{2}\Big(\sum_i p_{ii}\Big)\phi - (1 \pm 1)\frac{\sqrt{-1}}{4}\sum_{i,j}(p_{ij} - p_{ji})\, e^i e^j\phi,$$
we conclude that $\phi = 0$ by the Unique Continuation Property.
(ii) Suppose there are constants $C_a$ such that $\phi = \sum_a C_a\phi_a$ vanishes at some point $x_0 \in M$. Since $\bar\nabla\phi = 0$ or $\bar\nabla^*\phi = 0$, we can repeat the above argument to conclude that $\phi \to 0$ on each end. Hence $C_a = 0$ and the claim follows. □

Proposition 3.1. If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac12$ and the dominant energy condition (1.26) holds, then the map
$$\tilde D^*\tilde D : W^{2,q}_{-\delta}(S) \longrightarrow W^{0,q}_{-\delta-2}(S) \tag{3.5}$$
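is an isomorphism.

Proof. Note that, by (2.18), $\tilde D^*\tilde D$ is asymptotic to the standard Laplacian operator $\nabla^*\nabla$ in the sense of Bartnik [Ba1]. Hence $\tilde D^*\tilde D$ is Fredholm with index
$$\mathrm{ind}\big(\tilde D^*\tilde D|_{W^{2,q}_{-\delta}(S)}\big) = \mathrm{ind}\big(\nabla^*\nabla|_{W^{2,q}_{-\delta}(S)}\big), \tag{3.6}$$
see [Ba1]. The weighted Hölder inequality and elliptic regularity imply
$$\mathrm{Coker}\big(\nabla^*\nabla|_{W^{2,q}_{-\delta}(S)}\big) = \mathrm{Ker}\big((\nabla^*\nabla)^*|_{W^{0,\bar q}_{-1+\delta}(S)}\big) = \mathrm{Ker}\big(\nabla^*\nabla|_{W^{2,\bar q}_{-1+\delta}(S)}\big), \tag{3.7}$$
where $\bar q = \frac{q}{q-1}$. By the maximum principle, $\nabla^*\nabla$ has trivial kernel both on $W^{2,q}_{-\delta}(S)$ and on $W^{2,\bar q}_{-1+\delta}(S)$. Thus $\mathrm{ind}(\tilde D^*\tilde D) = 0$ and we only need to show that the kernel of $\tilde D^*\tilde D$ on $W^{2,q}_{-\delta}(S)$ is trivial. Let $\phi \in W^{2,q}_{-\delta}(S)$ satisfy $\tilde D^*\tilde D\phi = 0$; then
$$\int_M |\tilde D\phi|^2 = \int_M \langle\phi, \tilde D^*\tilde D\phi\rangle + \int_{\partial M}\langle e^i\phi, \tilde D\phi\rangle * e^i = \int_{\partial M}\langle e^i\phi, \tilde D\phi\rangle * e^i \to 0$$
as $x \to \infty$. Thus $\tilde D\phi = 0$. Therefore $\bar\nabla\phi = 0$ on $M$ by (2.25). Hence $\phi = 0$ by Lemma 3.1 (i). The proof of the proposition is complete. □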
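Proposition 3.2. If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac12$ and the dominant energy condition (1.26) holds, then the map
$$\tilde D^* : W^{1,q}_{-\delta-1}(S) \longrightarrow W^{0,q}_{-\delta-2}(S) \tag{3.8}$$
is injective.

Proof. Let $\phi \in \mathrm{Ker}\big(\tilde D^*|_{W^{1,q}_{-\delta-1}(S)}\big)$. Then (2.20) gives $\bar\nabla\bar\nabla^*\phi + \frac12\big(\mu - \sqrt{-1}\sum_j(\omega_j + \chi_j)e^j\big)\phi = 0$. By (2.9), we have
$$\begin{aligned} d(\langle\bar\nabla_i^*\phi, \phi\rangle * e^i) &= \big(\langle\bar\nabla\bar\nabla^*\phi, \phi\rangle - \langle\bar\nabla^*\phi, \bar\nabla^*\phi\rangle\big) * 1 \\ &= -\Big(|\bar\nabla^*\phi|^2 + \frac12\Big\langle\phi, \Big(\mu - \sqrt{-1}\sum_j(\omega_j + \chi_j)e^j\Big)\phi\Big\rangle\Big) * 1. \end{aligned}$$
Hence
$$\int_M |\bar\nabla^*\phi|^2 + \frac12\Big\langle\phi, \Big(\mu - \sqrt{-1}\sum_j(\omega_j + \chi_j)e^j\Big)\phi\Big\rangle = -\int_{\partial M}\langle\bar\nabla_i^*\phi, \phi\rangle * e^i \to 0$$
as $x \to \infty$. Thus $\bar\nabla^*\phi = 0$, and $\phi = 0$ by Lemma 3.1 (i). The proof of the proposition is complete. □

Proposition 3.3. If $(M, g_{ij}, p_{ij})$ is a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac12$ and the dominant energy condition (1.26) holds, then for any constant spinor $\phi_0$ on the ends, the following boundary value problem has a unique solution $\phi \in W^{2,q}(S)$:
$$\tilde D\phi = 0, \qquad \lim_{r\to\infty}\phi = \phi_0. \tag{3.9}$$

Proof. Since $\tilde D^*\tilde D\phi_0 \in W^{0,q}_{-\delta-2}(S)$, by Proposition 3.1 there is a unique $\phi_1 \in W^{2,q}_{-\delta}(S)$ such that $\tilde D^*\tilde D\phi_1 = -\tilde D^*\tilde D\phi_0$. Then $\phi = \phi_1 + \phi_0$ satisfies $\tilde D^*\tilde D\phi = 0$. Since $\tilde D\phi \in W^{1,q}_{-\delta-1}(S)$, we get $\tilde D\phi = 0$ by Proposition 3.2, and $\phi$ is the unique solution of (3.9). □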
Theorem 3.1. Let $(M, g_{ij}, p_{ij})$ be a 3-dimensional almost asymptotically flat manifold of order $1 \ge \tau > \frac12$, where $p_{ij}$ is an arbitrary 2-tensor. If $M$ satisfies the dominant energy condition (1.26), then, for each end $M_l$, we have
$$E_l \ge \sqrt{\sum_k P_{lk}^2}. \tag{3.10}$$
If equality holds in (3.10) for some end $M_{l_0}$, then $M$ has only one end. Furthermore, if $E_{l_0} = 0$ and $g_{ij}$ is $C^2$, $p_{ij}$ is $C^1$, then the following equations hold on $M$:
$$R_{ijkl} + p_{ik}p_{jl} - p_{il}p_{jk} = 0, \tag{3.11}$$
$$\nabla_i p_{jk} - \nabla_j p_{ik} = 0, \tag{3.12}$$
$$\sum_i \nabla_i(p_{ij} - p_{ji}) = 0. \tag{3.13}$$
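The proof of Theorem 3.1 uses two algebraic facts about the representation (3.4): the matrices satisfy the Clifford relations with $e^1e^2e^3 = \mathrm{Id}$, and $\sqrt{-1}\sum_k P_{lk}\,dx^k$ has eigenvalues $\pm\sqrt{\sum_k P_{lk}^2}$. The following is a minimal numerical sketch of both. The assignment of the three matrices to $e^1, e^2, e^3$ is an assumption made here so that the product is the identity, and the sample momentum vector is hypothetical.

```python
import numpy as np

# Assumed 2x2 representation of the coframe; each matrix squares to -Id.
E1 = np.array([[0, 1j], [1j, 0]])
E2 = np.array([[0, 1], [-1, 0]], dtype=complex)
E3 = np.array([[1j, 0], [0, -1j]])
BASIS = (E1, E2, E3)

def satisfies_clifford_relations(basis):
    """Check e^i e^j + e^j e^i = -2 delta_ij for every pair in the basis."""
    identity = np.eye(2)
    for i, a in enumerate(basis):
        for j, b in enumerate(basis):
            expected = -2.0 * identity if i == j else 0.0 * identity
            if not np.allclose(a @ b + b @ a, expected):
                return False
    return True

def momentum_eigenvalues(momentum):
    """Eigenvalues (ascending) of sqrt(-1) * sum_k P_k e^k, a Hermitian matrix."""
    operator = 1j * sum(p * e for p, e in zip(momentum, BASIS))
    return np.linalg.eigvalsh(operator)

eigenvalues = momentum_eigenvalues([1.0, 2.0, 2.0])
```

For the sample vector $(1, 2, 2)$ of length 3, the two eigenvalues come out as $-3$ and $3$, matching the choice of the eigenspinor with eigenvalue $-\sqrt{\sum_k P_{lk}^2}$ made in the proof.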
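Proof. Let the constant spinor $\phi_0 \ne 0$ on $M_l$, and $\phi_0 = 0$ on the other ends. Denote by $\phi = \phi_1 + \phi_0$, where $\phi_1 \in W^{2,q}_{-\delta}(S)$, the corresponding solution of (3.9) for this $\phi_0$. We have
$$\begin{aligned} &\int_M |\bar\nabla\phi|^2 + \frac12\Big\langle\phi, \Big(\mu + \sqrt{-1}\sum_j\omega_je^j\Big)\phi\Big\rangle \\ &= \int_{\partial M_\infty}\Big\langle\phi_0, \sum_{i\ne j} dx^i dx^j\tilde\nabla_j\phi_0 - \frac{\sqrt{-1}}{2}\sum_{i\ne j\ne k\ne i} p_{jk}\, dx^i dx^j dx^k\phi_0\Big\rangle * dx^i \\ &= \int_{\partial M_\infty}\Big\langle\phi_0, -\frac14\sum_{i\ne j}\Gamma_{kjl}\, dx^i dx^j dx^k dx^l\phi_0\Big\rangle * dx^i \\ &\quad + \frac{\sqrt{-1}}{2}\int_{\partial M_\infty}\Big\langle\phi_0, \Big(\sum_{i\ne j} p_{jk}\, dx^i dx^j dx^k - \sum_{i\ne j\ne k\ne i} p_{jk}\, dx^i dx^j dx^k\Big)\phi_0\Big\rangle * dx^i \\ &= \frac14\int_{\partial M_\infty}\langle\phi_0, (\partial_j g_{ij} - \partial_i g_{jj})\phi_0\rangle * dx^i + \frac{\sqrt{-1}}{2}\int_{\partial M_\infty}\langle\phi_0, (p_{ki}\, dx^k - p_{jj}\, dx^i)\phi_0\rangle * dx^i \\ &= 4\pi\big(\langle\phi_0, E_l\phi_0\rangle + \langle\phi_0, \sqrt{-1}P_{lk}\, dx^k\phi_0\rangle\big). \end{aligned}$$
Note that $\sqrt{-1}P_{lk}\, dx^k$ has eigenvalues $\pm\sqrt{\sum_k P_{lk}^2}$. Taking $\phi_0$ to be the eigenspinor of eigenvalue $-\sqrt{\sum_k P_{lk}^2}$ with $|\phi_0| = 1$, we obtain the first part of the theorem.

Equality implies that there is at least one spinor $\phi$ such that $\bar\nabla\phi = 0$. Hence it follows from Lemma 3.1 (i) that $M$ has only one end. If the total energy vanishes, then, by Lemma 3.1 (ii), there is $\{\phi_a\}$ which forms a basis of the spinor bundle everywhere on $M$ such that $\bar\nabla\phi_a = 0$. So in a local frame $\{e_i\}$ of $M$,
$$\nabla_i\phi_a = -\frac{\sqrt{-1}}{2}p_{ik}\, e^k\phi_a + \frac{\sqrt{-1}}{2}\epsilon_{iab}\, p_{ab}\phi_a.$$
Thus
$$\begin{aligned}\nabla_i\nabla_j\phi_a &= -\frac{\sqrt{-1}}{2}\nabla_i p_{jl}\, e^l\phi_a - \frac{\sqrt{-1}}{2}p_{jl}\, e^l\nabla_i\phi_a + \frac{\sqrt{-1}}{2}\epsilon_{jab}\nabla_i p_{ab}\phi_a + \frac{\sqrt{-1}}{2}\epsilon_{jab}\, p_{ab}\nabla_i\phi_a \\ &= -\frac{\sqrt{-1}}{2}\nabla_i p_{jl}\, e^l\phi_a - \frac14 p_{jl}p_{ik}\, e^l e^k\phi_a + \frac14\epsilon_{iab}\, p_{ab}p_{jl}\, e^l\phi_a \\ &\quad + \frac{\sqrt{-1}}{2}\epsilon_{jab}\nabla_i p_{ab}\phi_a + \frac14\epsilon_{jab}\, p_{ab}p_{il}\, e^l\phi_a - \frac14\epsilon_{iab}\epsilon_{jcd}\, p_{ab}p_{cd}\phi_a. \end{aligned}$$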
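Therefore
$$\begin{aligned} -\frac14 R_{ijkl}\, e^k e^l\phi_a &= (\nabla_i\nabla_j - \nabla_j\nabla_i)\phi_a \\ &= -\frac{\sqrt{-1}}{2}(\nabla_i p_{jk} - \nabla_j p_{ik})\, e^k\phi_a + \frac14(p_{ik}p_{jl} - p_{il}p_{jk})\, e^k e^l\phi_a \\ &\quad - \frac{\sqrt{-1}}{2}(\epsilon_{iab}\nabla_j p_{ab} - \epsilon_{jab}\nabla_i p_{ab})\phi_a. \end{aligned} \tag{3.14}$$
Thus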
(Rij kl + pik pj l − pil pj k )ek el φa =
√ X −1 (∇i pj k − ∇j pik )ek φa
k n−2 2 (n > 3), where pij is an arbitrary 2-tensor. If M satisfies the dominant energy condition (1.39), then, for each end Ml , we have sX Plk2 . (3.23) El ≥ k
If equality holds in (3.23) for some end Ml0 , then M has only one end. Furthermore, if El0 = 0 and gij is C 2 , pij is C 1 , then X √ X (Rij kl + pik pj l − pil pj k )ek el − −1 (∇i pj k − ∇j pik )ek k 3 and a compact set K˜ ⊃ K such that 0,α ˜ h˜ zij ∈ C−τ −1 (M − K),
(4.4)
$$d\tilde\theta^z,\ d*\tilde\theta^z \in L^{\frac{q}{2},-\tau-2}(M). \tag{4.5}$$
For any $z \in M$, its “influence domain” $\Omega^z_l$ with respect to end $M_l$ is the set of points $x_0 \in R^3$ such that
$$\epsilon_i{}^{uv}\partial_u(\rho_z^2 - \sigma^l_{x_0})(h_{vj} - g_{vj}\,\mathrm{tr}_g(h)) = O(r^{-2\tau-1}), \tag{4.6}$$
$$\epsilon_i{}^{uv}\partial_u g_{kl}\,(h_{vj} - g_{vj}\,\mathrm{tr}_g(h))(x^k - x_0^k)(x^l - x_0^l) = O(r^{-2\tau-1}) \tag{4.7}$$
as $x \to \infty$ on end $M_l$, where $\sigma^l_{x_0}$ is the barrier function defined on $R^3 - B_r$ with respect to end $M_l$ and point $x_0 \in R^3$ by
$$\sigma^l_{x_0}(x) = g_{ij}(x^i - x_0^i)(x^j - x_0^j), \tag{4.8}$$
where $g_{ij}$ is understood to be a function on $R^3 - B_r$.

As an example, consider a symmetric 2-tensor $h_{ij}$ defined on $R^3$ with the flat metric,
$$h_{ii} = -f + F_{ii}, \tag{4.9}$$
$$h_{ij} = 2f + F_{ij} \quad (i \ne j), \tag{4.10}$$
where $f$, $F_{ij}$ are certain smooth functions which satisfy the asymptotic orders
$$f = O(r^{-2}), \tag{4.11}$$
$$F_{ij} = O(r^{-3}). \tag{4.12}$$
It is not hard to see that the influence domain $\Omega^0$ of the origin is the line spanned by the vector $a = (1, 1, 1)$.

For each $x_0 \in \Omega^z_l$, we have
$$\tilde h^z_{ij} = \frac12\epsilon_i{}^{uv}(\nabla_u\sigma^l_{x_0})(h_{vj} - g_{vj}\,\mathrm{tr}_g(h)) + O(r^{-2\tau-1}) = \epsilon_{iu}{}^{v}(x^u - x_0^u)(h_{vj} - g_{vj}\,\mathrm{tr}_g(h)) + O(r^{-2\tau-1})$$
as $x \to \infty$ on end $M_l$. Therefore,
$$J^{in}_{lk}(z) = \frac{1}{8\pi}\lim_{r\to\infty}\int_{S_{r,l}}\tilde h^z_{ki}\, d\Omega^i = \frac{1}{8\pi}\lim_{r\to\infty}\int_{S_{r,l}}\epsilon_{ku}{}^{v}(x^u - x_0^u)(h_{vi} - g_{vi}\,\mathrm{tr}_g(h))\, d\Omega^i = J^{ex}_{lk}(x_0).$$
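The claim about the example (4.9)–(4.12) can be illustrated at leading order. With the flat metric, $z = 0$ and $\rho_0^2 = |x|^2$, one has $\partial_u(\rho_0^2 - \sigma_{x_0}) = 2x_0^u$, and the $f$-part of $h_{vj} - g_{vj}\mathrm{tr}_g(h)$ equals $2f$ in every entry, so (4.6) asks the coefficient of $f$ to vanish (assuming $f$ does not decay faster than stated). The following is a minimal sketch of that coefficient; the function name is hypothetical, and (4.7) is trivial here because $\partial_u g_{kl} = 0$.

```python
import numpy as np

def leading_obstruction(x0):
    """Coefficient of f in the left-hand side of (4.6) for the example (4.9)-(4.12).

    Assumes the flat metric on R^3 and z equal to the origin.
    """
    eps = np.zeros((3, 3, 3))
    for i, u, v in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, u, v] = 1.0
        eps[i, v, u] = -1.0
    grad_diff = 2.0 * np.asarray(x0, dtype=float)  # d_u(|x|^2 - |x - x0|^2)
    k_leading = 2.0 * np.ones((3, 3))              # (h - g tr h) / f at leading order
    return np.einsum("iuv,u,vj->ij", eps, grad_diff, k_leading)

on_line = leading_obstruction([2.0, 2.0, 2.0])
off_line = leading_obstruction([1.0, 0.0, 0.0])
```

The result is proportional to the cross product of $x_0$ with $(1, 1, 1)$, so `on_line` vanishes while `off_line` does not, consistent with the influence domain being the line spanned by $a$.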
By taking $p_{ij} = \tilde h^z_{ij}$ and $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$ in Theorem 3.1 respectively, we obtain the following two Positive Mass Theorems involving total angular momentum.
Theorem 4.1. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac12$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant energy condition (1.26) holds for $p_{ij} = \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega^z_l$ (if $\Omega^z_l$ is nonempty), we have
$$E_l \ge \sqrt{\sum_k (J^{in}_{lk}(z))^2} = \sqrt{\sum_k (J^{ex}_{lk}(x_0))^2}. \tag{4.13}$$
If equality holds in (4.13), then $M$ has only one end. Furthermore, if $E_l = 0$ and $\tilde h^z_{ij}$ is $C^1$, then (3.11), (3.12) and (3.13) hold true for $p_{ij} = \tilde h^z_{ij}$.

Theorem 4.2. Let $(M, g_{ij}, h_{ij})$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac12$, where $h_{ij} = h_{ji}$. Suppose there is a regular point $z \in M$ and the dominant energy condition (1.26) holds for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$, then, for each end $M_l$ and each point $x_0 \in \Omega^z_l$ (if $\Omega^z_l$ is nonempty), we have
$$E_l \ge \sqrt{\sum_k (P_{lk} \pm J^{in}_{lk}(z))^2} = \sqrt{\sum_k (P_{lk} \pm J^{ex}_{lk}(x_0))^2}. \tag{4.14}$$
If equality holds in (4.14), then $M$ has only one end. Furthermore, if $E_l = 0$ and $h_{ij} \pm \tilde h^z_{ij}$ is $C^1$, then (3.11), (3.12) and (3.13) hold true for $p_{ij} = h_{ij} \pm \tilde h^z_{ij}$.
Remark 4.1. Theorem 4.1 holds true when M contains minimal 2-spheres and Theorem 4.2 holds true when M contains apparent horizons, due to Remark 3.2 and (1.18). Acknowledgements. The author would like to express his gratitude to Professor S. T. Yau for his suggestion and especially for bringing the center of mass to his attention, and to thank Professors R. Bartnik, F. H. Lin, K. Liu, L. F. Tam, G. Tian, J. P. Wang and W. P. Zhang for their interest in this work and useful conversations. This work was finished while the author visited Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany. He would like to thank Professor J. Jost for his invitation and to thank the institute for its hospitality and financial support.
References

[ADM] Arnowitt, R., Deser, S., Misner, C.: Coordinate invariance and energy expressions in general relativity. Phys. Rev. 122, 997–1006 (1961)
[AH] Ashtekar, A., Hansen, R.: A unified treatment of null and spatial infinity in general relativity. I. Universal structure, asymptotic symmetries, and conserved quantities at spatial infinity. J. Math. Phys. 19, 1542–1566 (1978)
[AM] Ashtekar, A., Magnon-Ashtekar, A.: On conserved quantities in general relativity. J. Math. Phys. 20, 793–800 (1979)
[AS] Ashtekar, A., Streubel, M.: On angular momentum of stationary gravitating systems. J. Math. Phys. 20, 1362–1365 (1979)
[Ba1] Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 39, 661–693 (1986)
[Ba2] Bartnik, R.: Quasi-spherical metrics and prescribed scalar curvature. J. Diff. Geom. 37, 31–71 (1993)
[Br] Bray, H.: The Penrose inequality in general relativity and volume comparison theorems involving scalar curvature. Thesis, Stanford University, 1997
[CK] Christodoulou, D., Klainerman, S.: The global nonlinear stability of Minkowski space. Princeton Math. Series 41, Princeton, NJ: Princeton Univ. Press, 1993
[Gi] Gibbons, G.: Collapsing shells and the isoperimetric inequality for black holes. Class. Quant. Grav. 14, 2905–2915 (1997)
[GHHP] Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
[Ha] Harvey, R.: Spinors and calibrations. London: Academic Press, 1989
[HE] Hawking, S., Ellis, G.: The large scale structure of space-time. Cambridge: Cambridge Univ. Press, 1973
[He1] Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1997)
[He2] Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
[Hu] Huisken, G.: Private communication
[HI1] Huisken, G., Ilmanen, T.: The Riemannian Penrose inequality. (Announcement) Int. Math. Res. Not. 20, 1045–1058 (1997)
[HI2] Huisken, G., Ilmanen, T.: The inverse mean curvature flow and the Riemannian Penrose inequality. Preprint
[HY] Huisken, G., Yau, S.T.: Definition of center of mass for isolated physical systems and unique foliations by stable spheres with constant mean curvature. Invent. Math. 124, 281–311 (1996)
[LP] Lee, J., Parker, T.: The Yamabe problem. Bull. Am. Math. Soc. 17, 37–91 (1987)
[Li] Li, P.: Lecture notes on geometric analysis. Lecture Notes Series No. 6, Research Institute of Mathematics and Global Analysis Research Center, Seoul National University, Seoul, 1993
[MTW] Misner, C., Thorne, K., Wheeler, J.: Gravitation. (19th printing), New York: W.H. Freeman and Company, 1995
[PT] Parker, T., Taubes, C.: On Witten's proof of the positive energy theorem. Commun. Math. Phys. 84, 223–238 (1982)
[Pe] Penrose, R.: Some unsolved problems in classical general relativity. Seminar on differential geometry, ed. S.T. Yau, Annals of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 631–668
[Sc] Schoen, R.: Variational theory for the total scalar curvature functional for Riemannian metrics and related topics. Lecture Notes in Math. 1365, Berlin–Heidelberg–New York: Springer-Verlag, 1987, pp. 120–154
[St] Streubel, M.: Conserved quantities for isolated gravitational systems. Gen. Rel. Grav. 9, 551–561 (1978)
[SY1] Schoen, R., Yau, S.T.: On the proof of the positive mass conjecture in general relativity. Commun. Math. Phys. 65, 45–76 (1979)
[SY2] Schoen, R., Yau, S.T.: The energy and the linear momentum of spacetimes in general relativity. Commun. Math. Phys. 79, 47–51 (1981)
[SY3] Schoen, R., Yau, S.T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231–260 (1981)
[Va] Vafa, C.: Geometric physics. Doc. Math., Extra Vol. ICM 1998 (I), pp. 375–394
[Wi] Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
[Ya1] Yau, S.T.: Private communication
[Ya2] Yau, S.T.: Problem section. Seminar on differential geometry, ed. S.T. Yau, Annals of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 699–706
[Zh1] Zhang, X.: Positive mass conjecture for 5-dimensional Lorentzian manifolds. Submitted
[Zh2] Zhang, X.: Positive mass theorem for modified energy condition. Submitted
[Zh3] Zhang, X.: Positive mass theorem for hypersurface in 5-dimensional Lorentzian manifolds. Comm. Anal. Geom., to appear
[Zh4] Zhang, X.: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 5, 199–210 (1998)

Communicated by H. Nicolai
Commun. Math. Phys. 206, 157 – 183 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Structure of the Small Quantum Cohomology Rings of Projective Hypersurfaces Alberto Collino1 , Masao Jinzenji2,? 1 Dipartimento di Matematica, Universita’ di Torino, Via Carlo Alberto 10, 10123 Torino, Italy.
E-mail:
[email protected] 2 Department of Physics, University of Tokyo, Bunkyo-ku, Tokyo 113, Japan.
E-mail:
[email protected] Received: 29 November 1996 / Accepted: 15 March 1999
Abstract: We give an explicit procedure which computes for degree $d \le 3$ the correlation functions of the topological sigma model (A-model) on a projective Fano hypersurface $X$ as homogeneous polynomials of degree $d$ in the correlation functions of degree 1 (number of lines). We extend this formalism to the case of Calabi–Yau hypersurfaces and explain how the polynomial property is preserved. Our key tool is the construction of universal recursive formulas which express the structure constants of the quantum cohomology ring of $X$ as weighted homogeneous polynomial functions of the constants of the Fano hypersurface with the same degree and dimension one more. We propose some conjectures about the existence and the form of the recursive laws for the structure constants of rational curves of arbitrary degree. Our recursive formulas should yield the coefficients of the hypergeometric series used in the mirror calculation. Assuming the validity of the conjectures we find the recursive laws for rational curves of degree four.

Present address of M. Jinzenji: Graduate School of Mathematical Sciences, University of Tokyo, Meguro-ku, Tokyo, 153-8914, Japan.

1. Introduction

In [16], we studied the Kähler sub-ring $H^*_{q,e}(M^k_N)$ in the quantum cohomology ring of a hypersurface $M^k_N$ of degree $k$ in $CP^{N-1}$, where we used numerical computation based on the torus action method. We worked under the condition that $c_1(M^k_N)$ is not negative, i.e. under the hypothesis $N \ge k$. The following statements summarize the content of that paper:

1. For $N \le 9$ with $N - k \ge 2$, we computed that the main relation satisfied by the generator $O_e$ of $H^*_{q,e}(M^k_N)$ has the simple form
$$(O_e)^{N-1} - k^k (O_e)^{k-1}\cdot q = 0. \tag{1.1}$$
2. Under the same restriction as for 1, the structural constants of $H^*_{q,e}(M^k_N)$ can be expressed as polynomial functions of a finite set of integers. These integers are basically the Schubert numbers of lines, and they depend on the degree of the hypersurface but not on its dimension.
3. An explanation for (1.1) was found by looking at a toric compactification of the moduli space of maps from $CP^1$ to $CP^{N-1}$. It was said that the boundary portion of the moduli space should turn out to be irrelevant for the calculation, under the condition $N - k \ge 2$.
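The shape of the main relation (1.1) can be made concrete with a small sketch that lists its two monomials for given $N$ and $k$ and checks that they have the same degree. The grading $\deg O_e = 1$, $\deg q = N - k$ is an assumption made here (the usual grading by the first Chern class), and the function names are hypothetical.

```python
def main_relation_terms(N, k):
    """Return the monomials of (1.1) as {(power of O_e, power of q): coefficient}.

    Only the range N - k >= 2 discussed in the text is accepted.
    """
    if k < 1 or N - k < 2:
        raise ValueError("relation (1.1) is stated for k >= 1 and N - k >= 2")
    return {(N - 1, 0): 1, (k - 1, 1): -(k ** k)}

def is_homogeneous(terms, q_degree):
    """Check that all monomials share one degree when deg O_e = 1, deg q = q_degree."""
    degrees = {o_power + q_power * q_degree for (o_power, q_power) in terms}
    return len(degrees) == 1

quintic_in_cp8 = main_relation_terms(9, 5)   # (O_e)^8 - 5^5 (O_e)^4 q = 0
hyperplane = main_relation_terms(5, 1)       # (O_e)^4 - q = 0
```

For $k = 1$ the hypersurface is a hyperplane $CP^{N-2}$, and the sketch reproduces the familiar relation $(O_e)^{N-1} = q$ of projective space.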
The justification for 1 and 2 was based on numerical computations and the explanation of 3 was heuristic.

Givental [11] gave a mathematically rigorous proof of (1.1). He constructed the fundamental solution of the Gauss–Manin system (the deformation parameter is restricted to the Kähler deformation) associated to the A-model on $M^k_N$ by inventing a powerful extension of the torus action method, and he showed that it satisfies the linear ODE of hypergeometric type if $N - k \ge 2$. This ODE yields the main relation (1.1) of $H^*_{q,e}(M^k_N)$ by means of a certain limit procedure (in his notation $\hbar \to 0$). Givental also treated the cases when $N - k$ is 1 or 0 and he showed that the solution above satisfies the linear ODE of hypergeometric type if a) some multiplicative factor is added (when $N - k = 1$); b) some multiplicative factor is added and a coordinate transformation (the mirror map) is performed (when $N = k$). In this way he proved the mirror symmetry conjecture, namely that topological sigma models on Calabi–Yau manifolds realized as complete intersections in projective space can be solved by the analysis of hypergeometric series. His proof of the symmetry seems to rely on the flat metric condition, namely on the fact that the three point functions which include the identity operator do not receive quantum corrections. His arguments are quite powerful and deep, but we can hardly see what is happening microscopically.

The original motivation of the present paper was an attempt to prove (1.1) by descending induction on $N$, see [6]. Our program is to construct recursive formulas that express the structural constants of $H^*_{q,e}(M^k_N)$ as weighted homogeneous polynomials in the structural constants of $H^*_{q,e}(M^k_{N+1})$. Our method is based on a geometric process, which we call the specialization procedure, but unfortunately it works only up to the case of cubic curves.
We believe that it should be possible to find and construct universal recursive laws also for curves of higher degree. We state some ansatz on the expected structure of such formulas. The main expected property says that if the index $N - k \ge 2$ then the recursive laws should stay the same, independently of $N$ and $k$. When the hypersurface is of Fano index 1 then the recursion law for the Schubert numbers of lines changes, while the formulas for curves of higher degree do not. Coming to the Calabi–Yau situation, $N = k$, the recursion relation is modified for all degrees. In this case the main relation (1.1) must be changed entirely.

We first computed the recursion relation for lines in the case of $H^*_{q,e}(M^N_N)$ and evaluated the degree 1 part of the relation [16]. The result had a structure strongly reminiscent of the situation from mirror symmetry [18] and we speculated that the above correction and the correction terms argued in [18] are closely related. On the other hand the universal recursion laws valid for the case $N - k \ge 2$ can be formally iterated by descent of dimension (while keeping the degree of the hypersurface fixed) twice more down to the case of a Calabi–Yau. What we conjecture here, and verify in part, is that this formal procedure yields the coefficients of the hypergeometric series which appear in the mirror
calculation, but without use of the mirror conjecture. At this point the construction of the structure constants for the quantum ring of the Calabi–Yau hypersurface can be realized by the standard procedure that arises from the flat metric condition. Coming back to the main relation above, we can prove it modulo $(q^4)$ as follows. We construct explicit recursion relations for $d \le 3$ and start the descending induction where N is large enough with respect to the fixed degree k so that the only non-trivial quantum corrections come from lines. This procedure yields (1.1) and 2. We expect that the universal recursive procedure should provide interesting information also for the case when $M_N^k$ is a hypersurface of general type, i.e. $N < k$.

Our paper is organized as follows. In Sect. 2 we recall first the main properties of the structure of the quantum Kähler algebra $H^*_{q,e}(M_N^k)$ and then we study the quantum product with primitive cohomology classes for the Fano case, $k < N$. In Sect. 3 we introduce the specialization calculation and derive the recursion relations for rational curves of degree at most 3 under the assumption that the hypersurface is a Fano manifold. In Sect. 4 we try to extend the specialization procedure to Calabi–Yau hypersurfaces and determine how the recursion laws should be modified up to degree 2. Using this we evaluate the main relation of the quantum ring and compare it with the result from mirror symmetry. We find here the motivation to organize our findings in a compact form by means of the hypergeometric series used in the mirror calculation. At this point some computations induce us to state a conjecture which says how to modify the recursion relation for cubic curves in the Calabi–Yau case. In Sect. 5 we present a set of conjectures which should provide a guiding rule in the explicit construction of the recursive formulas for rational curves of higher degree.
Assuming the conjectures we can explicitly construct the recursive formulas for degree four.
2. Quantum Cohomology of Fano Hypersurfaces

2.1. The quantum Kähler sub-ring $H^*_{q,e}(M_N^k)$. Let $M_N^k$ be the generic and non-singular hypersurface of degree k in $CP^{N-1}$. By the Lefschetz theorem the cohomology ring $H^*(M_N^k)$ splits into two parts. One of them is the Kähler sub-ring generated by the Kähler form e induced from the hyperplane section H of $CP^{N-1}$, and the other is the primitive part, which is a subspace of the middle dimension cohomology $H^{N-2}(M_N^k)$.

We first consider the quantum Kähler sub-ring $H^*_{q,e}(M_N^k)$. It is generated additively by $\mathcal{O}_{e^\alpha}$ ($\alpha = 0, 1, 2, \cdots, N-2$), where $\mathcal{O}_{e^\alpha}$ represents the BRST-closed operator induced from $e^\alpha \in H^*(M_N^k)$. The multiplication rules of $H^*_{q,e}(M_N^k)$ are determined by means of the flat metric and the three point correlation functions (or Gromov–Witten invariants):

$$\eta^{N,k}_{\alpha\beta} := \langle \mathcal{O}_{e^0} \mathcal{O}_{e^\alpha} \mathcal{O}_{e^\beta} \rangle_{M_N^k} = \int_{M_N^k} e^\alpha \wedge e^\beta = k \cdot \delta_{\alpha+\beta,N-2}, \tag{2.1}$$

$$\eta_{N,k}^{\alpha\beta} = \frac{1}{k} \cdot \delta_{\alpha+\beta,N-2}, \qquad \eta_{N,k}^{\alpha\beta}\, \eta^{N,k}_{\beta\gamma} = \delta^{\alpha}_{\gamma}, \tag{2.2}$$

$$C^{N,k}_{\alpha,\beta,\gamma} = \langle \mathcal{O}_{e^\alpha} \mathcal{O}_{e^\beta} \mathcal{O}_{e^\gamma} \rangle_{M_N^k} = \sum_{d=0}^{\infty} q^d \int_{\bar{M}^{M_N^k}_{0,d,3}} \phi_1^*(e^\alpha) \wedge \phi_2^*(e^\beta) \wedge \phi_3^*(e^\gamma). \tag{2.3}$$
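Here $q := \exp(t)$, $\bar{M}^{M_N^k}_{0,d,3}$ represents the moduli space of stable maps from 3-pointed genus 0 curves to $M_N^k$ and $\phi_i$ is the natural i-th evaluation morphism $\bar{M}^{M_N^k}_{0,d,3} \to M_N^k$. Basically $\bar{M}^{M_N^k}_{0,d,3}$ is the space of rational curves of degree d with three punctures in $M_N^k$. The rules of quantum multiplication are

$$\mathcal{O}_{e^\alpha} \cdot \mathcal{O}_{e^\beta} = C^{N,k}_{\alpha,\beta,\gamma}\, \eta_{N,k}^{\gamma\delta}\, \mathcal{O}_{e^\delta} = \frac{1}{k} C^{N,k}_{\alpha,\beta,\gamma}\, \mathcal{O}_{e^{N-2-\gamma}} := \sum_{d=0}^{\infty} \frac{1}{k}\, q^d\, C^{N,k,d}_{\alpha,\beta,\gamma}\, \mathcal{O}_{e^{N-2-\gamma}}. \tag{2.4}$$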
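One should note that $\mathcal{O}_e$ is a multiplicative generator of $H^*_{q,e}(M_N^k)$, and therefore it is enough to determine the multiplication rule between $\mathcal{O}_e$ and $\mathcal{O}_{e^\alpha}$. The topological selection rule yields that $C^{N,k}_{1,\alpha,\beta}$ is non-zero only if $1 + \alpha + \beta = N - 2 + (N-k)d$, hence it is:

$$\mathcal{O}_e \cdot \mathcal{O}_{e^\alpha} = \sum_{d=0}^{\infty} \frac{1}{k}\, q^d\, C^{N,k,d}_{1,\alpha,N-3-\alpha+(N-k)d}\, \mathcal{O}_{e^{\alpha+1-(N-k)d}}. \tag{2.5}$$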
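For conventional reasons, we rewrite (2.5) as follows:

$$\mathcal{O}_e \cdot \mathcal{O}_{e^{N-2-m}} = \mathcal{O}_{e^{N-1-m}} + \sum_{d=1}^{\infty} q^d L_m^{N,k,d}\, \mathcal{O}_{e^{N-1-m-(N-k)d}}, \qquad L_m^{N,k,d} := \frac{1}{k} C^{N,k,d}_{1,N-2-m,m-1+(N-k)d}. \tag{2.6}$$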
Here we have used the fact that the $q^0$ part of $H^*_{q,e}(M_N^k)$ coincides with the classical cohomology ring. The integers $L_m^{N,k,d}$ are the structure constants of the quantum ring. One should think of $k L_m^{N,k,d}$ as the number of rational curves of degree d on $M_N^k$ which meet a linear section of dimension m and a second linear section of the right ($= m + (N-k)d - 1$) codimension. We shall see that the $L_m^{N,k,1}$ are independent of N if $N \ge k + 2$, and therefore we write them simply as $L_m^k$; we refer to them as the Schubert numbers of lines. Note that $L_m^k = L_{k-m-1}^k$. Gromov–Witten invariants of genus 0 vanish by insertion of $\mathcal{O}_{e^0}$. Since $M_N^k$ is an $(N-2)$-dimensional manifold we obtain:

$$L_m^{N,k,d} \neq 0 \Longrightarrow 1 \le N-2-m \le N-2, \quad 1 \le m-1+(N-k)d \le N-2$$
$$\Longleftrightarrow \max\{0,\, 2-(N-k)d\} \le m \le \min\{N-3,\, N-1-(N-k)d\}. \tag{2.7}$$

These vanishing conditions translate into

$$\begin{aligned} L_m^{N,k,d} \neq 0 &\Longrightarrow 0 \le m \le (N-1)-(N-k)d && (N-k \ge 2),\\ &\Longrightarrow 1 \le m \le (N-3) && (N-k = 1,\ d = 1),\\ &\Longrightarrow 0 \le m \le (N-1)-(N-k)d && (N-k = 1,\ d \ge 2),\\ &\Longrightarrow 2 \le m \le (N-3) && (N-k = 0). \end{aligned} \tag{2.8}$$
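The passage from (2.7) to (2.8) is a small case analysis that can be checked mechanically. The following is a minimal sketch; the function name `nonvanishing_range` is hypothetical and only encodes the bounds of (2.7).

```python
def nonvanishing_range(N, k, d):
    """Range of m allowed by (2.7) for L_m^{N,k,d}.

    Returns the pair (lowest m, highest m), or None when the range is empty.
    """
    lo = max(0, 2 - (N - k) * d)
    hi = min(N - 3, N - 1 - (N - k) * d)
    return (lo, hi) if lo <= hi else None


# The four cases listed in (2.8), for degree k = 5:
print(nonvanishing_range(7, 5, 1))  # N - k >= 2
print(nonvanishing_range(6, 5, 1))  # N - k = 1, d = 1
print(nonvanishing_range(6, 5, 2))  # N - k = 1, d >= 2
print(nonvanishing_range(5, 5, 2))  # N - k = 0
```

In each case the returned pair agrees with the corresponding row of (2.8), and an empty range signals that no curve of that degree can contribute.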
We remark explicitly that if the dimension N is large with respect to k (N ≥ 2k) then the only non trivial quantum correction left is due to curves of degree 1.
As we said above $\mathcal{O}_e$ is a multiplicative generator of the ring, and then there are coefficients $\gamma$ which give the representations:

$$\mathcal{O}_{e^{N-1-m}} = (\mathcal{O}_e)^{N-1-m} - \sum_{d=1}^{\infty} q^d \gamma_m^{N,k,d} (\mathcal{O}_e)^{N-1-m-(N-k)d}. \tag{2.9}$$
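When we set $m = 0$ we obtain the main relation of $H^*_{q,e}(M_N^k)$. One has

$$\begin{aligned} \mathcal{O}_e \cdot \Big( (\mathcal{O}_e)^{N-2-m} &- \sum_{d=1}^{\infty} q^d \gamma_{m+1}^{N,k,d} (\mathcal{O}_e)^{N-2-m-(N-k)d} \Big) \\ = (\mathcal{O}_e)^{N-1-m} &- \sum_{d=1}^{\infty} q^d \gamma_m^{N,k,d} (\mathcal{O}_e)^{N-1-m-(N-k)d} \\ &+ \sum_{d=1}^{\infty} L_m^{N,k,d}\, q^d \Big( (\mathcal{O}_e)^{N-1-m-(N-k)d} - \sum_{d'=1}^{\infty} q^{d'} \gamma_{m+(N-k)d}^{N,k,d'} (\mathcal{O}_e)^{N-1-m-(N-k)(d+d')} \Big), \end{aligned} \tag{2.10}$$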
and therefore it is

$$\gamma_m^{N,k,d} - \gamma_{m+1}^{N,k,d} = L_m^{N,k,d} - \sum_{d'=1}^{d-1} L_m^{N,k,d-d'}\, \gamma_{m+(N-k)(d-d')}^{N,k,d'}. \tag{2.11}$$
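The recursion (2.11) determines the coefficients $\gamma_m^{N,k,d}$ once the $L_m^{N,k,d}$ are known. The sketch below is one way to iterate it; it assumes the boundary condition that $\gamma_m^{N,k,d} = 0$ as soon as the exponent $N-1-m-(N-k)d$ in (2.9) is negative, and the name `make_gamma` and the input table are hypothetical toy values, not geometric data.

```python
from functools import lru_cache


def make_gamma(N, k, L):
    """Return a function gamma(m, d) obtained by iterating (2.11).

    L maps (m, d) to L_m^{N,k,d}; missing keys count as 0.
    Assumption: gamma(m, d) = 0 when N-1-m-(N-k)d < 0, since (2.9)
    then has no term of that degree.
    """
    h = N - k

    @lru_cache(maxsize=None)
    def gamma(m, d):
        if N - 1 - m - h * d < 0:
            return 0
        value = gamma(m + 1, d) + L.get((m, d), 0)
        for dp in range(1, d):
            value -= L.get((m, d - dp), 0) * gamma(m + h * (d - dp), dp)
        return value

    return gamma


# Hypothetical toy table for N = 6, k = 4 (placeholder numbers only).
toy_L = {(0, 1): 1, (1, 1): 2, (2, 1): 3, (3, 1): 4, (0, 2): 5, (1, 2): 6}
gamma = make_gamma(6, 4, toy_L)
print(gamma(0, 1), gamma(0, 2))
```

For $d = 1$ the recursion simply sums the degree 1 constants from m upward; for $d = 2$ it produces the sum of the degree 2 constants minus a double sum of products of degree 1 constants.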
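This yields:

$$\gamma_m^{N,k,d} = \sum_{l=1}^{d} \sum_{\sum_{i=1}^{l} d_i = d} (-1)^{l-1} \sum_{j_l=m}^{N-1-(N-k)d} \cdots \sum_{j_2=m}^{j_3} \sum_{j_1=m}^{j_2} \left( \prod_{i=1}^{l} L^{N,k,d_i}_{j_i + (\sum_{n=1}^{i-1} d_n)(N-k)} \right). \tag{2.12}$$

The fact that the main relation of $H^*_{q,e}(M_N^k)$ is of the form $(\mathcal{O}_e)^{N-1} - k^k (\mathcal{O}_e)^{k-1} \cdot q = 0$ [16,11] is equivalent to

$$\gamma_0^{N,k,1} = \sum_{j=1}^{N-1} L_j^{N,k,1} = k^k,$$

$$\gamma_0^{N,k,d} = \sum_{l=1}^{d} \sum_{\sum_{i=1}^{l} d_i = d} (-1)^{l-1} \sum_{j_l=0}^{N-1-(N-k)d} \cdots \sum_{j_2=0}^{j_3} \sum_{j_1=0}^{j_2} \prod_{i=1}^{l} L^{N,k,d_i}_{j_i + (\sum_{n=1}^{i-1} d_n)(N-k)} = 0 \qquad (d \ge 2). \tag{2.13}$$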
2.2. The role of primitive cohomology. In this subsection we consider the general structure of the quantum cohomology ring of a Fano hypersurface V of degree k in $P^{n+1}$ ($n \ge 3$) including the primitive part. It is $H_2(V, Z) = Zq$, where $kq$ is the class of a plane section, and $H^2(V, Z)$ is spanned by the class $x$ ($:= e$) of the hyperplane section H. The ring $H^*(V, Q)$ is generated by x and by the primitive cohomology $H^n(V, Q)_0$, with the relations $x^{n+1} = 0$, $x \cup a_1 = 0$, $a_1 \cup a_2 = k^{-1} \left( \int_V a_1 \wedge a_2 \right) x^n$ for $a_1, a_2$ primitive classes. We shall denote by $(\ |\ )_V$ the intersection form, hence $(a|b)_V = \int_V a \wedge b$. For $0 \le i \le n$, $x_i$ ($:= e^i$) is the class of the linear section of V of codimension i, so that $x = x_1$. The vectors $x_i$ span the invariant part R of $H^*(V, Q)$, which is the orthogonal complement of $H^n(V, Q)_0$. We recall that the Fano index of V is $h = h(V) = n + 2 - k$. Let $Z\{H_2(V, Z)\}$ be the graded homogeneous ring of formal series $\sum n_d q^d$ with integer coefficients. One introduces a ring structure on $H^*(V, Z\{H_2(V, Z)\})$ by the rule that for homogeneous $\alpha^*, \beta^*$ in $H^*(V, Z)$ the quantum multiplication product is $\alpha^* \cdot \beta^* = \sum_d (\alpha^*, \beta^*)_d\, q^d$, where $(\alpha^*, \beta^*)_0$ is the ordinary cohomology product, and $(\alpha^*, \beta^*)_d$ is a class of degree $\deg(\alpha^*) + \deg(\beta^*) - 2hd$ defined by the condition $((\alpha^*, \beta^*)_d | \gamma) = [\alpha^*, \beta^*, \gamma; d; V]$ $(= \langle \mathcal{O}_{\alpha^*} \mathcal{O}_{\beta^*} \mathcal{O}_{\gamma} \rangle_{V,d,\mathrm{gravity}})$. This last term is the GW invariant, which can be informally defined as the number of rational curves of degree d on V meeting the representative submanifolds A, B, G in general position. We shall use the associativity and the grading properties of $\cdot$, whose rigorous and highly non-trivial construction is due to Ruan and Tian [27].

We recall some facts from [28]. Tian observed that the GW classes $[\alpha_1, \ldots, \alpha_l; d; V]$ are invariant under monodromy action, which is a direct corollary of the main result in [27], and he applied this explicitly to cases like hypersurfaces by using the Picard–Lefschetz theorem.

Proposition 1 (Tian). If $m - l$ is odd and the $a_s$ are primitive classes then $[x_{i_1}, \ldots, x_{i_l}, a_{l+1}, \ldots, a_m; d; V] = 0$.

Proof. The statement holds when n is odd for trivial reasons, indeed by definition $[x_{i_1}, \ldots, x_{i_l}, a_{l+1}, \ldots, a_m; j; V] = 0$ if $2(\sum i_j) + (m-l)n \neq 2n + 2hd + 2(m+l-3)$. Coming to the case when the hypersurface V is even dimensional, we recall that the monodromy group M is generated by reflections defined by the vanishing cycles. The case of even dimensional quadrics is readily checked, since the vanishing cohomology has rank one in this case. On the other hand if $n > 3$ and $k > 3$, by the same argument explained on p. 384 of [26], a lemma of Deligne yields that the Zariski closure $\bar{M}$ is in fact the full group of isometries of $H^*(V, C)_0$. Thus the GW invariant above defines a symmetric multilinear form with an odd number of entries, invariant under the orthogonal group. It is clear that such a form vanishes. □

If $h \ge 2$ Tian's result yields $x \cdot a = 0$, for $a \in H^n(V, Q)_0$. Instead we have

Proposition 2. If $h = 1$ then $x \cdot a = -k!\, a q$.

Proof. The statement is equivalent to $[x, a_1, a_2; 1] = -k! (a_1|a_2)_V$; here the $a_i$ are primitive classes and $[x, a_1, a_2; 1]$ is the GW number of the lines which meet them. Our proof of this equality is based on a remark of Beauville, [1, 4, Application II]. In this direction we also need to prove the formula below, which is a generalization of a result of Tyurin, [29, 3] and [22]. Let W be a general hypersurface whose generic hyperplane section is V. Then the Fano variety $F(W)$ of lines on W is a non-singular irreducible variety of dimension k and there are $k!$ lines on W which meet a general point, [22]. The variety $F(V)$ is a nonsingular subvariety of codimension 2 in $F(W)$. The natural $P^1$ bundle $p : L \to F(W)$
surjects $\lambda : L \to W$ with degree $k!$. We denote by $\gamma : BF \to V$ the restriction of $\lambda$ to V, where $\gamma$ has degree $k!$. Then $\beta : BF \to F(W)$ is the blow up along $F(V)$ and the projection of the exceptional divisor $\pi : E \to F(V)$ is the restriction of p. We denote here by $i : E \to BF$ and $j : V \to W$ the natural inclusions. The cohomology of a blow up decomposes as a direct sum, in our case $H^*(BF, Q) = i_* \pi^* (H^{*-2}(F(V), Q)) \oplus \beta^* (H^*(F(W), Q))$. Now $\gamma_* : H^*(BF, Q) \to H^*(V, Q)$ is a surjection, because $\gamma : BF \to V$ is. It is known that the primitive cohomology is contained in the image $(\gamma i)_* \pi^* (H^{*-2}(F(V), Q))$, [22]. We need the stronger result that given a primitive class a there is a class $\alpha$ with $\gamma^*(a) = i_* \pi^* \alpha$. To prove this statement we first note that it is equivalent to $\beta_* \gamma^*(a) = 0$, and then we consider a cycle A which represents a and which is in general position with respect to the locus covered by the lines on V. We have $\beta_* \gamma^*(A) = \beta_* (\lambda^*(j_*(A)) \cap BF)$, and then $j_*(A) = 0$ in $H^{n+2}(W, Q)$ because primitive classes are annihilated by $j_*$. Fix next the primitive classes $a_1$ and $a_2$ so that $\gamma^* a_1 = i_* \pi^* \alpha_1$, $\gamma^* a_2 = i_* \pi^* \alpha_2$. One has equality of degrees of intersection $(\gamma^* a_1 | \gamma^* a_2)_{BF} = k! (a_1|a_2)_V$, because the degree of $\gamma$ is $k!$. On the other hand the excess intersection formula of [9] yields $(i_* \pi^* \alpha_1 | i_* \pi^* \alpha_2)_{BF} = -(\pi^* \alpha_1 | \pi^* \alpha_2 \cdot \zeta)_E = -(\alpha_1|\alpha_2)_{F(V)}$; here $\zeta$ denotes the tautological class of E as a $P^1$ bundle, and $\zeta$ is known to be the opposite of the class of the normal bundle of E in BF. Thus $k! (a_1|a_2)_V = -(\alpha_1|\alpha_2)_{F(V)}$. Now it is geometrically clear, and is the idea from [1], that $[x, a_1, a_2; 1] = (\pi_* i^* \gamma^* a_1 | \pi_* i^* \gamma^* a_2)_{F(V)} = (\alpha_1|\alpha_2)_{F(V)}$.

Tian's vanishing implies also that the quantum product of the hyperplane class with a linear section is of type

$$x \cdot x_{s-1} = x_s + \sum_{d \ge 1} a_{d,s}\, x_{s-dh}\, q^d,$$

where $a_{d,s} = k^{-1} [x_{s-1}, x_{n+dh-s}, x; d]$ are the structure constants. We set $w := x + k! q$ if $h = 1$, and otherwise $w := x$, and we write $w^s$ for the s-th power of w with respect to the quantum product. Then w satisfies a unique minimal monic equation $F = 0$ of degree $(n+1)$, the equation which is found by setting $s = n + 1$ in the displayed formula. This is of the form $F := w^{n+1} + \sum_{d=1}^{[(n+1)/h]} c_d\, w^{n+1-dh} q^d$. For primitive classes a and b we have $a \cdot b = k^{-1} (a|b)_V \left( w^n + \sum_{d=1}^{[n/h]} b_d\, w^{n-dh} q^d \right)$. Following Tian we note that associativity yields $0 = (w \cdot a) \cdot b = k^{-1} (a|b)_V \left( w^{n+1} + \sum_{d=1}^{[n/h]} b_d\, w^{n+1-dh} q^d \right)$, and thus $c_d = b_d$ for $1 \le d \le n$, $c_{n+1} = 0$. Beauville in [1] studied the structure of the quantum ring of Fano hypersurfaces of degree small with respect to the dimension. Beauville's result deals with the case $n \ge 2k - 3$; in this case only the coefficient $c_1 \neq 0$. Now $-c_1$ is the sum of the Schubert numbers of lines on V, and it turns out that $-c_1 = k^k$, hence:

Theorem 1. The quantum cohomology of V over the rational numbers is generated by w and $H^n(V, Q)_0$ with relations (i) $w^{n+1} = k^k w^{k-1} q$, (ii) $w \cdot a = 0$, (iii) $a \cdot b = k^{-1} (a|b)_V (w^n - k^k w^{k-2} q)$.

This theorem in fact always holds: the hardest part (i) is a deep theorem of Givental [11], while (ii) and (iii) follow from the same arguments used before. □
3. Recursion Relations for the Structure Constants of Fano Hypersurfaces

This section is devoted to the proof of the following recursion laws and of some related results:

Theorem 2. Consider a hypersurface $M_N^k$ in $CP^{N-1}$ of degree k; if the 1st Chern class $N - k \ge 2$ then the basic structure constants satisfy the following recursion relations:

$$L_m^{N,k,1} = L_m^{N+1,k,1} =: L_m^k, \tag{3.1}$$

$$L_m^{N,k,2} = \frac{1}{2} \left( L_{m-1}^{N+1,k,2} + L_m^{N+1,k,2} + 2 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,1} \right), \tag{3.2}$$

$$\begin{aligned} L_m^{N,k,3} = \frac{1}{18} \Big( & 4 L_{m-2}^{N+1,k,3} + 10 L_{m-1}^{N+1,k,3} + 4 L_m^{N+1,k,3} \\ & + 12 L_{m-1}^{N+1,k,2} \cdot L_{m+2(N-k)}^{N+1,k,1} + 9 L_m^{N+1,k,2} \cdot L_{m+2(N-k)}^{N+1,k,1} + 6 L_m^{N+1,k,2} \cdot L_{m+1+2(N-k)}^{N+1,k,1} \\ & + 6 L_{m-1}^{N+1,k,1} \cdot L_{m-1+(N-k)}^{N+1,k,2} + 9 L_m^{N+1,k,1} \cdot L_{m-1+(N-k)}^{N+1,k,2} + 12 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,2} \\ & + 18 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,1} \cdot L_{m+2(N-k)}^{N+1,k,1} \Big). \end{aligned} \tag{3.3}$$
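To make the shape of the recursion concrete, the conic law (3.2) can be applied as a single descent step from $N+1$ to N. This is only a sketch: the function name `conic_step` is hypothetical, the input tables hold placeholder numbers rather than actual structure constants, and entries missing from a table are assumed to vanish.

```python
from fractions import Fraction


def conic_step(N, k, L1, L2, m):
    """One application of (3.2): L_m^{N,k,2} from the data at N + 1.

    L1[j] and L2[j] stand for L_j^{N+1,k,1} and L_j^{N+1,k,2};
    indices missing from the tables are treated as 0.
    """
    h = N - k
    if h < 2:
        raise ValueError("(3.2) is stated for N - k >= 2")
    total = L2.get(m - 1, 0) + L2.get(m, 0) + 2 * L1.get(m, 0) * L1.get(m + h, 0)
    return Fraction(total, 2)


# Hypothetical placeholder tables, used only to exercise the formula.
toy_L1 = {0: 1, 1: 2, 2: 3, 3: 4}
toy_L2 = {0: 5, 1: 6}
print(conic_step(6, 4, toy_L1, toy_L2, 1))
```

`Fraction` keeps the factor 1/2 exact, so the sketch does not presuppose that the bracket in (3.2) is even for arbitrary inputs.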
Our arguments are heuristic. We embed $X := M_N^k$ as the linear section of a general hypersurface $Y := M_{N+1}^k$ in $CP^N$ so that

$$M_N^k = M_{N+1}^k \cap H, \tag{3.4}$$

where the hyperplane H is identified with $CP^{N-1}$. Next we introduce the notation

$$\langle \mathcal{O}_{e^{a_1}} \mathcal{O}_{e^{a_2}} \cdots \mathcal{O}_{e^{a_m}} \rangle_{M_N^k, d, \mathrm{gravity}} = [A^N_{a_1}, A^N_{a_2}, \cdots, A^N_{a_m}; d, N, k]. \tag{3.5}$$

Here the spaces $A^N_{a_i}$ are linear subspaces of codimension $a_i$ in $CP^{N-1}$ and in general position, so that

$$PD_{M_N^k}(e^{a_i}) = A^N_{a_i} \cap M_N^k. \tag{3.6}$$

We introduce below the "special position" correlation functions,

$$G[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_m} \cap H, A^{N+1}_{a_{m+1}}, \cdots, A^{N+1}_{a_{m+n}}; d, N+1, k]. \tag{3.7}$$

Clearly each $A^{N+1}_{a_i} \cap H$ is a linear subspace in $CP^N$ of codimension $a_i + 1$ which lies in $CP^{N-1} = H$. The special position correlation function should count the number of rational curves of degree d on Y with m labeled points on them which belong to the corresponding linear spaces and which have the further property that points with different labels stay distinct. By taking the linear spaces in general position in $CP^{N-1}$ we may assume that $[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_m} \cap H, A^{N+1}_{a_{m+1}}, \cdots, A^{N+1}_{a_{m+n}}; d, N, k]$ has no contribution from reducible curves on X. Now an irreducible curve of degree d which cuts H in $d + 1$ points lies on it and then
$$[A^N_{a_1}, A^N_{a_2}, \cdots, A^N_{a_{d+1}}; d, N, k] + R = G[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_{d+1}} \cap H; d, N+1, k], \tag{3.8}$$

where R measures the contributions due to the connected reducible curves on Y which satisfy the conditions. In the cases that we consider, R does not occur for lines and conics and it is a finite set for the case of cubic curves, as we compute below. For curves of degree 4 or more the family of reducible curves supporting R may be of positive dimension and we are not able to determine the contribution due to them. For this reason we shall restrict to the case of curves of degree d at most equal to 3. The following lemma, the specialization formula, gives a procedure for computing the degree of $G[A^{N+1}_{a_1} \cap H, A^{N+1}_{a_2} \cap H, \cdots, A^{N+1}_{a_{d+1}} \cap H; d, N+1, k]$, because by definition $G[A^{N+1}_{a_1+1}, A^{N+1}_{a_2+1}, \cdots, A^{N+1}_{a_m+1}; d, N+1, k] = [A^{N+1}_{a_1+1}, A^{N+1}_{a_2+1}, \cdots, A^{N+1}_{a_m+1}; d, N+1, k]$. By moving $A^{N+1}_{a_{s+1}+1}$ into $A^{N+1}_{a_{s+1}} \cap H$ one has

Lemma 1.

$$\begin{aligned} & G[A^{N+1}_{a_1} \cap H, \cdots, A^{N+1}_{a_s} \cap H, A^{N+1}_{a_{s+1}+1}, A^{N+1}_{a_{s+2}+1}, \cdots, A^{N+1}_{a_{s+t}+1}; d, N+1, k] \\ & \quad = G[A^{N+1}_{a_1} \cap H, \cdots, A^{N+1}_{a_s} \cap H, A^{N+1}_{a_{s+1}} \cap H, A^{N+1}_{a_{s+2}+1}, \cdots, A^{N+1}_{a_{s+t}+1}; d, N+1, k] \\ & \qquad + \sum_{j=1}^{s} G[A^{N+1}_{a_1} \cap H, \cdots, A^{N+1}_{a_{j-1}} \cap H, A^{N+1}_{a_j + a_{s+1}} \cap H, A^{N+1}_{a_{j+1}} \cap H, \cdots, A^{N+1}_{a_s} \cap H, A^{N+1}_{a_{s+2}+1}, \cdots, A^{N+1}_{a_{s+t}+1}; d, N+1, k]. \end{aligned} \tag{3.9}$$
Here we explain the definition of the special position G-W invariants. Given a projective variety Z, Kontsevich [20] has constructed the coarse moduli space $\bar{M} := \bar{M}(Z, m, \beta)$ of stable maps of homology class $\beta$ to Z. $\bar{M}$ is the set of equivalence classes of data $[C, p_1, \ldots, p_m, \mu]$, where $\mu : C \to Z$ is the "stable" map, C is a varying, projective, connected, nodal curve of arithmetic genus 0, and $p_1, \ldots, p_m$ are distinct, labeled nonsingular points on C. We refer to [10] for a detailed discussion of this construction. The canonical evaluation maps $\rho_i : \bar{M} \to Z$ are defined by $\rho_i([C, p_1, \ldots, p_m, \mu]) = \mu(p_i)$. The interior $M(Z, m, \beta)$ is the locus in $\bar{M}(Z, m, \beta)$ corresponding to nonsingular irreducible domain curves. We write $\bar{M}(Z, A, \beta)$ when the index set of labels is a set A instead of $[n] = \{1, \ldots, n\}$. There are forgetful maps $\phi_B : \bar{M}(Z, A, \beta) \to \bar{M}(Z, A - B, \beta)$, defined when B is a subset of A. Let now Z be a general non-singular Fano hypersurface of dimension $n \ge 3$ and of index $h(Z) = n + 2 - \deg(Z)$. We consider the case when $\beta$ is the class of a curve of degree d and we assume that $\bar{M}(Z, m, d)$ has the expected dimension $\dim Z + d\,h(Z) + m - 3$, and similarly for the boundary components. We recall that such components are associated with the choice of a partition $A \cup B$ of the set $[m] := \{1, \ldots, m\}$ and of the choice of $d_1$ and $d_2$ with $d = d_1 + d_2$. The boundary component $\bar{D}(A, B; d_1, d_2)$ is defined as the locus of moduli points corresponding to reducible domain curves $C = C_1 \cup C_2$, where $\mu_*(C_i)$ has degree $d_i$. Here the curve C is obtained by gluing at • the curve $C_1$, which has on it points marked by the elements in A and a further point, labeled by •, and $C_2$, which has on it points marked by the elements in B and a further point, also labeled by •. There is an identification $\bar{D}(A, B; d_1, d_2) = \bar{M}(Z, A \cup \{•\}, d_1) \times_Z \bar{M}(Z, B \cup \{•\}, d_2)$. In what follows we take $n + 2 = N$ and define $T_i$, $i = 1, \ldots, m$ to be linear spaces of codimension $t_i \ge 1$ in $P^{n+2}$ and in general position there. As before Y
is a Fano hypersurface of degree k. We assume that $\sum t_i$ is the expected dimension $\dim Y + d\,h(Y) + m - 3$ of $\bar{M}(Y, m, d)$. We define $[t_1, \ldots, t_m; d, Y]$ to be the degree of the zero cycle $[T_1, \ldots, T_m; d, Y]$, which we define to be the intersection product of the cycles $\rho_i^{-1}(T_i)$. Here we assume that those cycles intersect transversally in a finite number of points, each one of which is associated with an irreducible source curve and with the property that the corresponding map sends different labeled points to different images. By definition $[t_1, \ldots, t_m; d, Y]$ is one of the GW invariants of Y, and it is called basic if $m = 3$ and if at least one of the $t_i$ is 1. The GW invariants on X are defined in a similar way, by means of linear spaces $S_i \subset P^{n+1}$. We shall use the convention that $S_i$ and $T_i$ are spaces of the same dimension, so that $S_i$ is obtained by moving $T_i$ into $P^{n+1}$.

Given linear spaces as above we write $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y]$ to represent the open cycle in $\bar{M}(Y, s+t, d)$ which can be informally described as the set of rational curves of degree d on Y with $s + t$ marked points such that the images of the labeled points $p_j$ belong to the space with the same label, and such that for $j \le s$ and $i \le s$ if $p_j$ and $p_i$ have the same image point in $S_j \cap S_i$ then this point is a double point for the image curve. We shall use the notation that $s_i$ is the codimension of $S_i$ in $P^{n+1}$ so that $s_i = t_i - 1$, because of our convention. The codimension of the preceding cycle is $\sum_j (s_j + 1) + \sum_j t_j$. Our aim is to compute the degree of $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y]$ when its expected dimension is 0. By abuse of notation we shall often use the same notation to represent both a cycle of dimension 0 and the degree of the said cycle. We define $\bar{M}^0(s)$ to be the complement in $\bar{M} := \bar{M}(Y, s+t, d)$ of the union of the components of type $\bar{D}(A, B; d_1, d_2)$ with $d_2 = 0$ and with at least two elements of B which are $\le s$. Thus we have $\bar{M}^0(s+1) \subset \bar{M}^0(s) \subset \bar{M}$. The evaluation $\rho_i$ restricts to $\rho(s)^0_i$ on $\bar{M}^0(s)$. Our definition is that $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y]$ is the intersection product of the cycles $(\rho(s)^0_j)^{-1}(S_j)$, $j = 1, \ldots, s$, and $(\rho(s)^0_l)^{-1}(T_l)$, $l = s+1, \ldots, s+t$. If the codimensions $s_j$ and $t_l$ are fixed, the set of lists $(S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t})$ is parameterized by a product of Grassmann manifolds, hence it is an irreducible variety and then there is an open dense subset of it where the degree of $G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}]$ is maximum. We shall assume that our lists come from this subset. We start by noting that $[S_1, \ldots, S_{d+1}; d; X]$ and $G[S_1, \ldots, S_{d+1}; d; Y]$ both have the same expected dimension, which we take to be 0.

Proposition 3. If the given cycles are zero dimensional then $G[S_1, \ldots, S_{d+1}; d; Y] = [S_1, \ldots, S_{d+1}; d; X] + R$, where R is supported on the boundary locus of $\bar{M}(Y, m, d)$ and more precisely on the locus corresponding to reducible domain curves with reducible image.

In order to compute the degree of $[S_1, \ldots, S_{d+1}; d; X]$ we need to verify that the dimension of the preceding cycles is in fact 0 and then to compute their degrees. Now we have by assumption that $[S_1, \ldots, S_{d+1}; d; X]$ has the correct dimension 0, so the dimension of $G[S_1, \ldots, S_{d+1}; d; Y]$ can fail to be 0 only if that of R does. Of course the dimension of R can be detected by looking at decomposable curves on Y, that is, at the behavior of rational curves of degree strictly less than d. As we have said above the degree of $G[S_1, \ldots, S_{d+1}; d; Y]$ is determined by a reduction procedure, which is performed by moving linear spaces $T_i$ which are in general position in $P^{n+2}$ to spaces $S_i$ of the same dimension, which are contained in the hyperplane $P^{n+1}$ and in general position there. Our main tool is the next proposition; we have only heuristic arguments to support it.
Proposition 4. Provided that the dimensions of the cycles below are 0 as it is expected then

$$\begin{aligned} \text{degree } G[S_1, \ldots, S_s, T_{s+1}, \ldots, T_{s+t}; d; Y] = {} & \text{degree } G[S_1, \ldots, S_s, S_{s+1}, T_{s+2}, \ldots, T_{s+t}; d; Y] \\ & + \sum_i \text{degree } \psi_i^{-1} G[S_1, \ldots, S_{i-1}, S_{i,s+1}, S_{i+1}, \ldots, S_s, T_{s+2}, \ldots, T_{s+t}; d; Y], \end{aligned}$$

where $S_{i,s+1} := S_i \cap S_{s+1}$ and where $\psi_i$ is the isomorphism $\bar{D}([s+t] - \{i, s+1\}, \{i, s+1\}; d, 0) \to \bar{M}(Y, ([s+t] - \{i, s+1\}) \cup \{•\}, d)$.

The specialization lemma is just a restatement of this proposition. The following procedure gives the recursive formulas:

1. By iterative application of the specialization formula we write $G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1,1}_1 \cap H, \cdots, A^{N+1,d-1}_1 \cap H; d, N+1, k]$ in terms of the standard correlation functions of $Y = M_{N+1}^k$.
2. We decompose the standard correlation functions found in Step 1 as polynomials in the basic G-W invariants of Y, by which we mean the functions $\langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_e \rangle_{d, M_{N+1}^k, \mathrm{gravity}}$. This step is done by means of the first reconstruction theorem of Kontsevich and Manin or, equivalently, by the microscopic version of the WDVV equations.
3. We compute the contribution of reducible curves in $G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1,1}_1 \cap H, \cdots, A^{N+1,d-1}_1 \cap H; d, N+1, k]$.

The 1st step gives the next equalities. In writing them we use the convention that if the number of insertion points gets lower than 3 then we insert a hyperplane condition and divide by the degree of the curve:

$$G[A^{N+1}_a \cap H, A^{N+1}_b \cap H; 1, N+1, k] = [A^{N+1}_{a+1}, A^{N+1}_{b+1}; 1, N+1, k] - [A^{N+1}_{a+b+1}; 1, N+1, k] \qquad (a + b = N - 3 + (N-k)),$$

$$\begin{aligned} G[A^{N+1}_a \cap H, A^{N+1}_b \cap H, A^{N+1}_1 \cap H; 2, N+1, k] = {} & [A^{N+1}_{a+1}, A^{N+1}_{b+1}, A^{N+1}_2; 2, N+1, k] - [A^{N+1}_{a+2}, A^{N+1}_{b+1}; 2, N+1, k] \\ & - [A^{N+1}_{a+1}, A^{N+1}_{b+2}; 2, N+1, k] - [A^{N+1}_{a+b+1}, A^{N+1}_2; 2, N+1, k] \\ & + 2 [A^{N+1}_{a+b+2}; 2, N+1, k] \qquad (a + b = N - 3 + 2(N-k)), \end{aligned}$$
$$\begin{aligned} G[A^{N+1}_a \cap H, {} & A^{N+1}_b \cap H, A^{N+1,1}_1 \cap H, A^{N+1,2}_1 \cap H; 3, N+1, k] \\ = {} & [A^{N+1}_{a+1}, A^{N+1}_{b+1}, A^{N+1,1}_2, A^{N+1,2}_2; 3, N+1, k] \\ & - 2 [A^{N+1}_{a+1}, A^{N+1}_{b+2}, A^{N+1}_2; 3, N+1, k] - 2 [A^{N+1}_{a+2}, A^{N+1}_{b+1}, A^{N+1}_2; 3, N+1, k] \\ & - [A^{N+1}_{a+b+1}, A^{N+1,1}_2, A^{N+1,2}_2; 3, N+1, k] - [A^{N+1}_{a+1}, A^{N+1}_{b+1}, A^{N+1}_3; 3, N+1, k] \\ & + 2 [A^{N+1}_{a+1}, A^{N+1}_{b+3}; 3, N+1, k] + 2 [A^{N+1}_{a+2}, A^{N+1}_{b+2}; 3, N+1, k] + 2 [A^{N+1}_{a+3}, A^{N+1}_{b+1}; 3, N+1, k] \\ & + 4 [A^{N+1}_{a+b+2}, A^{N+1}_2; 3, N+1, k] + [A^{N+1}_{a+b+1}, A^{N+1}_3; 3, N+1, k] \\ & - 6 [A^{N+1}_{a+b+3}; 3, N+1, k] \qquad (a + b = N - 3 + 3(N-k)). \end{aligned} \tag{3.10}$$
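The expansions in (3.10) follow from repeated use of Lemma 1, and the bookkeeping can be sketched symbolically. In the sketch below a space $A_c \cap H$ is recorded by its codimension c in H and a space in general position by its codimension in $CP^N$; the function name `specialize` is hypothetical, and the sample values $a = 10$, $b = 20$ are arbitrary placeholders chosen so that the resulting codimensions do not collide. The convention of inserting a hyperplane when fewer than 3 points remain is not modelled.

```python
from collections import Counter


def specialize(special, general=()):
    """Expand G[special spaces in H; general spaces] via Lemma 1.

    special: codimensions c of the spaces A_c ∩ H
    general: codimensions of the spaces in general position
    Returns a dict mapping sorted codimension tuples (the standard
    brackets) to their integer coefficients.
    """
    special = list(special)
    if not special:
        return {tuple(sorted(general)): 1}
    c = special.pop()  # moving this space out of H raises its codimension to c + 1
    result = Counter(specialize(special, tuple(general) + (c + 1,)))
    for j in range(len(special)):
        # correction term of Lemma 1: the j-th special space absorbs c
        merged = special[:j] + [special[j] + c] + special[j + 1:]
        result.subtract(specialize(merged, general))
    return {key: val for key, val in result.items() if val != 0}


a, b = 10, 20
lines = specialize([a, b])
conics = specialize([a, b, 1])
cubics = specialize([a, b, 1, 1])
print(conics)
```

With these placeholders the key `(2, 11, 21)` stands for the bracket $[A_{a+1}, A_{b+1}, A_2]$ and `(32,)` for $[A_{a+b+2}]$, so the printed coefficients can be compared directly with the degree 2 equality above.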
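We assume now that the Fano index of X is at least 2, namely $N - k \ge 2$; then $a + b + 1$ is greater than $N = \dim(M_{N+1}^k) + 1$, and therefore (3.10) is truncated in an obvious way. At this point we recall the definition $[A^{N+1}_{a_1}, A^{N+1}_{a_2}, \cdots, A^{N+1}_{a_m}; d, N+1, k] = \langle \mathcal{O}_{e^{a_1}} \mathcal{O}_{e^{a_2}} \cdots \mathcal{O}_{e^{a_m}} \rangle_{d, M_{N+1}^k, \mathrm{gr}}$. In order to proceed we need to express $\langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_{e^2} \rangle_{d, M_{N+1}^k, \mathrm{gr}}$ and $\langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_{e^3} \rangle_{d, M_{N+1}^k, \mathrm{gr}}$ in terms of $\langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_e \rangle_{d, M_{N+1}^k, \mathrm{gr}}$. Our tool is the first reconstruction theorem of Kontsevich and Manin [21], and it yields

$$\begin{aligned} \langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_e \mathcal{O}_e \rangle &= \langle \mathcal{O}_{e^a} \mathcal{O}_e \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_{e^b} \mathcal{O}_e \rangle, \\ \langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_{e^2} \mathcal{O}_e \rangle &= \langle \mathcal{O}_{e^a} \mathcal{O}_{e^2} \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_{e^b} \mathcal{O}_e \rangle, \\ \langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_{e^2} \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_e \mathcal{O}_e \rangle &+ \langle \mathcal{O}_{e^a} \mathcal{O}_{e^b} \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_{e^2} \mathcal{O}_e \mathcal{O}_e \rangle \\ = \langle \mathcal{O}_{e^a} \mathcal{O}_e \mathcal{O}_{e^2} \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_{e^b} \mathcal{O}_e \rangle &+ \langle \mathcal{O}_{e^a} \mathcal{O}_e \mathcal{O}_{e^m} \rangle \langle \mathcal{O}_{e^{N-1-m}} \mathcal{O}_{e^b} \mathcal{O}_{e^2} \mathcal{O}_e \rangle. \end{aligned} \tag{3.11}$$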
We find in the end, if the Fano index of X is at least 2:

$$\frac{1}{k} G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+(N-k)} \cap H; 1, N+1, k] = L_m^{N+1,k,1} := L_m^k, \tag{3.12}$$

$$\frac{1}{k} G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+2(N-k)} \cap H, A^{N+1}_1 \cap H; 2, N+1, k] = \frac{1}{2} \left( L_{m-1}^{N+1,k,2} + L_m^{N+1,k,2} + 2 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,1} \right), \tag{3.13}$$

$$\begin{aligned} \frac{1}{k} G[A^{N+1}_{N-2-m} \cap H, {} & A^{N+1}_{m-1+3(N-k)} \cap H, A^{N+1,1}_1 \cap H, A^{N+1,2}_1 \cap H; 3, N+1, k] \\ = \frac{1}{6} \Big( & 4 L_{m-2}^{N+1,k,3} + 10 L_{m-1}^{N+1,k,3} + 4 L_m^{N+1,k,3} \\ & + 12 L_{m-1}^{N+1,k,2} \cdot L_{m+2(N-k)}^{N+1,k,1} + 12 L_m^{N+1,k,2} \cdot L_{m+2(N-k)}^{N+1,k,1} + 6 L_m^{N+1,k,2} \cdot L_{m+1+2(N-k)}^{N+1,k,1} \\ & + 6 L_{m-1+(N-k)}^{N+1,k,2} \cdot L_{m-1}^{N+1,k,1} + 12 L_{m-1+(N-k)}^{N+1,k,2} \cdot L_m^{N+1,k,1} + 12 L_{m+(N-k)}^{N+1,k,2} \cdot L_m^{N+1,k,1} \\ & + 18 L_m^{N+1,k,1} \cdot L_{m+(N-k)}^{N+1,k,1} \cdot L_{m+2(N-k)}^{N+1,k,1} \Big). \end{aligned} \tag{3.14}$$
We make the hypothesis that the Schubert varieties of conics and lines on Y and on X which are associated with the given linear spaces $S_i$, $T_j$ and their intersections have the right dimension. A count of dimensions shows that for the cases of degree 1 and degree 2 there is no contribution from the reducible curves and therefore

$$G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+(N-k)} \cap H; 1, N+1, k]/k = L_m^{N,k,1}$$

and

$$G[A^{N+1}_{N-2-m} \cap H, A^{N+1}_{m-1+2(N-k)} \cap H, A^{N+1}_1 \cap H; 2, N+1, k]/k = L_m^{N,k,2}.$$

For cubic curves there is a contribution due to the reducible connected curves which are made of a line lying on X and of a conic lying on Y. There are two cases which occur: in one instance the line meets $A^{N+1}_{N-2-m} \cap H$, and the conic meets the line and $A^{N+1}_{m-1+3(N-k)} \cap H$. In the other case the incidence conditions with the linear spaces are reversed. In this way

$$\begin{aligned} G[A^{N+1}_{N-2-m} \cap H, {} & A^{N+1}_{m-1+3(N-k)} \cap H, A^{N+1,1}_1 \cap H, A^{N+1,2}_1 \cap H; 3, N+1, k]/k \\ & = 3 L_m^{N,k,3} + R\, \frac{1}{k} \\ & = 3 L_m^{N,k,3} + \frac{1}{2} L_m^{N+1,k,2} \cdot L_{m+2(N-k)}^{N,k,1} + \frac{1}{2} L_{m-1+(N-k)}^{N+1,k,2} \cdot L_m^{N,k,1}. \end{aligned} \tag{3.15}$$

□
We come now to the case of Fano index 1. The same type of computations as above yield:

Theorem 3. If X is a hypersurface of degree k in $CP^k$ the recursion relations for the basic invariants of conics and cubic curves are the same as given in Theorem 2; instead the numbers of lines satisfy the law

$$L_m^{k+1,k,1} = L_m^{k+2,k,1} - L_0^{k+2,k,1} = L_m^{k+2,k,1} - k!. \tag{3.16}$$

Proof. In this case, $a + b + 1 = N - 2 + d$, from (3.10). Then we obtain

$$G[A^{N+1}_a \cap H, A^{N+1}_b \cap H; 1, N+1, k] = [A^{N+1}_{a+1}, A^{N+1}_{b+1}; 1, N+1, k] - [A^{N+1}_{N-1}; 1, N+1, k] \qquad (a + b + 1 = N - 1). \tag{3.17}$$

□
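As a small illustration of (3.16), the correction for Fano index 1 simply subtracts $L_0^{k+2,k,1} = k!$ from each Schubert number in the range $1 \le m \le k-2$ allowed by (2.8). The sketch below uses a hypothetical function name, and the input list for $k = 5$ is an assumed illustrative table that is not taken from this section; its only checks here are the symmetry $L_m^k = L_{k-m-1}^k$, the sum $5^5$ and the value $L_0 = 5!$.

```python
from math import factorial


def lines_index_one(k, schubert):
    """Apply (3.16): L_m^{k+1,k,1} = L_m^{k+2,k,1} - k! for 1 <= m <= k-2.

    schubert[m] stands for L_m^{k+2,k,1} = L_m^k, m = 0, ..., k-1.
    """
    if schubert[0] != factorial(k):
        raise ValueError("(3.16) uses L_0^{k+2,k,1} = k!")
    return {m: schubert[m] - schubert[0] for m in range(1, k - 1)}


# Assumed illustrative Schubert numbers for k = 5 (not from this section).
assumed_quintic = [120, 770, 1345, 770, 120]
index_one = lines_index_one(5, assumed_quintic)
print(index_one)
```

The guard on `schubert[0]` makes the dependence on the second equality in (3.16) explicit, so that an input table violating it is rejected rather than silently processed.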
Now we can prove:

Corollary 1. The main relation of the quantum ring of a Fano hypersurface $M_N^k$ with index $N - k \ge 2$ is of the form

$$(\mathcal{O}_e)^{N-1} - k^k (\mathcal{O}_e)^{k-1} q = 0 \tag{3.18}$$

up to $q^3$.
Proof. The recursion relations of Theorem 2 do not change $\gamma_0^{N,k,d}$, namely $\gamma_0^{N,k,d} = \gamma_0^{N+1,k,d}$. If $N \ge 2k + 1$, then $\gamma_0^{N,k,d} = 0$ ($d \ge 2$), because of the vanishing conditions due to the topological selection rule. On the other hand $\gamma_0^{N,k,1} = k^k$, from Schubert calculus, cf. [1]. □

Corollary 2. The main relation of the quantum cohomology ring of a Fano hypersurface of index 1 and dimension $k - 1$ is of the form

$$(\mathcal{O}_e + k! q)^k - k^k (\mathcal{O}_e + k! q)^{k-1} q = 0 \tag{3.19}$$

up to $q^3$.

Proof. Consider the multiplication rule (2.6):

$$\mathcal{O}_e \cdot \mathcal{O}_{e^{k-1-m}} := \mathcal{O}_{e^{k-m}} + q L_m^{k+1,k,1} \mathcal{O}_{e^{k-1-m}} + \sum_{d=2}^{\infty} q^d L_m^{k+1,k,d} \mathcal{O}_{e^{k-m-d}}. \tag{3.20}$$

This gives:

$$\begin{aligned} (\mathcal{O}_e + k! q) \cdot \mathcal{O}_{e^{k-1-m}} &= \mathcal{O}_{e^{k-m}} + q (L_m^{k+1,k,1} + k!) \mathcal{O}_{e^{k-1-m}} + \sum_{d=2}^{\infty} q^d L_m^{k+1,k,d} \mathcal{O}_{e^{k-m-d}} \\ &= \mathcal{O}_{e^{k-m}} + \sum_{d=1}^{\infty} q^d \tilde{L}_m^{k+1,k,d} \mathcal{O}_{e^{k-m-d}}. \end{aligned} \tag{3.21}$$

Set now $F = \mathcal{O}_e + k! q$, use F as a multiplicative generator and write

$$\mathcal{O}_{e^{k-m}} = (F)^{k-m} - \sum_{d=1}^{\infty} q^d \tilde{\gamma}_m^{k+1,k,d} (F)^{k-m-d} \qquad (m = 0, 1, \cdots, k-1). \tag{3.22}$$

A standard computation yields $\tilde{\gamma}_0^{k+1,k,d} = \gamma_0^{k+2,k,d}$, and we conclude by descending induction as in the proof of the preceding corollary. □

Our last result is also easily proved by descending induction on N.

Corollary 3. If $d \le 3$ and $N - k \ge 1$ the structure constants $L_m^{N,k,d}$ can be written as a homogeneous polynomial of degree d in the structure constants of lines $L_m^k$.

4. Recursive Formulas in the Calabi–Yau Case

Here we try to understand how the recursive formulas change when the hypersurface becomes of Calabi–Yau type, i.e. when we deal with $M_k^k$ of degree k in $CP^{k-1}$. In this situation we can proceed as before for lines and conics. On the other hand we cannot use the same method for curves of degree 3, because it is difficult in this case to control the contribution from reducible curves. We give instead conjectural recursive formulas for cubics. In the last part of the section we explain the trend of thought which led us to the conjecture. We recall first that given a general point on $M_k^k$ there are no rational curves meeting it, and therefore $L_0^{k,k,d} = L_1^{k,k,d} = L_{k-2}^{k,k,d} = L_{k-1}^{k,k,d} = 0$.
Theorem 4. The recursive laws for lines and conics on a Calabi–Yau hypersurface of degree k are, for $2 \le m \le k - 3$,

$$L_m^{k,k,1} = L_m^{k+1,k,1} - L_1^{k+1,k,1}, \tag{4.1}$$

$$L_m^{k,k,2} = \frac{1}{2} \left( L_{m-1}^{k+1,k,2} + L_m^{k+1,k,2} - L_0^{k+1,k,2} - L_1^{k+1,k,2} + 2 (L_m^{k+1,k,1} - L_1^{k+1,k,1})^2 \right). \tag{4.2}$$
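The two laws of Theorem 4 can be sketched as a pair of small functions acting on tables of the constants at $N = k + 1$. The names `cy_lines` and `cy_conics` are hypothetical, and the tables below contain placeholder numbers chosen only to exercise the formulas.

```python
from fractions import Fraction


def cy_lines(k, L1):
    """Apply (4.1) for 2 <= m <= k-3; L1[j] stands for L_j^{k+1,k,1}."""
    return {m: L1[m] - L1[1] for m in range(2, k - 2)}


def cy_conics(k, L1, L2):
    """Apply (4.2) for 2 <= m <= k-3; L2[j] stands for L_j^{k+1,k,2}."""
    out = {}
    for m in range(2, k - 2):
        linear = L2[m - 1] + L2[m] - L2[0] - L2[1]
        # half of (linear + 2 * square) equals linear / 2 + square
        out[m] = Fraction(linear, 2) + (L1[m] - L1[1]) ** 2
    return out


# Hypothetical placeholder tables for k = 5 (so only m = 2 is allowed).
toy_L1 = {1: 3, 2: 5, 3: 3}
toy_L2 = {0: 10, 1: 20, 2: 30, 3: 20}
print(cy_lines(5, toy_L1), cy_conics(5, toy_L1, toy_L2))
```

Writing the conic law as the linear part plus the square of the corrected line number makes visible that the last term of (4.2) is the square of the right hand side of (4.1).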
Proof of Theorem 4. We go back to the specialization formula (3.10), which we use with the condition $a + b = N - 3$ because now we have $N = k$. Repeated use of the first reconstruction theorem yields

$$\frac{1}{k} G[A^{k+1}_{k-2-m} \cap H, A^{k+1}_{m-1} \cap H; 1, k+1, k] = L_m^{k+1,k,1} - L_1^{k+1,k,1}, \tag{4.3}$$

$$\frac{1}{k} G[A^{k+1}_{k-2-m} \cap H, A^{k+1}_{m-1} \cap H, A^{k+1}_1 \cap H; 2, k+1, k] = \frac{1}{2} \left( L_{m-1}^{k+1,k,2} + L_m^{k+1,k,2} - L_0^{k+1,k,2} - L_1^{k+1,k,2} + 2 L_m^{k+1,k,1} \cdot (L_m^{k+1,k,1} - L_1^{k+1,k,1}) \right). \tag{4.4}$$
(4.5)
In case of conics, the reducible curves are given by two intersecting lines, one lying on k k , hence it is ∩ H and the other on Mk+1 Mkk = Mk+1 1 k+1 k+1 G[Ak+1 ∩ H ; 2, k + 1, k] k−2−m ∩ H, Am−1 ∩ H, A1 k + Lk+1,k,1 · Lk,k,1 = Lk,k,2 m m . 1
(4.6)
□

Next we deal with cubic curves. The Calabi–Yau condition again implies $L_m^{k,k,3} = 0$ for $0 \le m \le 1$ and $k - 2 \le m \le k - 1$; our proposal for the remaining $m$ is the following:

Conjecture 1. The recursive law for curves of degree 3 on $M_k^k$, for $2 \le m \le k - 3$, becomes:
$L_m^{k,k,3} =$
1 (4Lk+1,k,3 + 10Lk+1,k,3 + 4Lk+1,k,3 m m−2 m−1 18 − 10Lk+1,k,3 − 4Lk+1,k,3 0 1 + 12Lk+1,k,2 · Lk+1,k,1 + 12Lk+1,k,2 · Lk+1,k,1 m m m m−1 + 6Lk+1,k,2 · Lk+1,k,1 m m+1
+ 6Lk+1,k,2 · Lk+1,k,1 + 12Lk+1,k,2 · Lk+1,k,1 m m−1 m−1 m−1 + 12Lk+1,k,2 · Lk+1,k,1 m m
+ 18(Lk+1,k,1 − Lk+1,k,1 )2 · (Lk+1,k,1 + 2Lk+1,k,1 )) m m 1 1 1 k+1,k,2 k+1,k,1 1 k+1,k,2 k+1,k,1 − Lm−1 · Lm − Lm · Lm 6 6 3 3 − Lk+1,k,2 · Lk+1,k,1 − Lk+1,k,2 · Lk+1,k,1 m m 4 0 4 1 5 5 − Lk+1,k,2 · Lk+1,k,1 − Lk+1,k,2 · Lk+1,k,1 1 1 12 1 12 0 1 − Lk+1,k,2 · Lk+1,k,1 2 3 1 1 − 3Lk+1,k,1 · (Lk+1,k,2 + Lk+1,k,2 − Lk+1,k,2 − Lk+1,k,2 m 1 0 1 2 m−1 k+1,k,1 2 k+1,k,1 + 2(Lm − L1 ) ).
(4.7)
We came to this formula by means of the following considerations. Using Theorem 4 one can compute the main relation of $H^*_{q,e}(M_k^k)$ up to degree 2; it reads:
$$\Bigl(1 - \bigl(k^k - (k-2)L_1^k - 2L_0^k\bigr)q - \Bigl(2k^k L_0^k + (k-3)k^k L_1^k - 3(L_0^k)^2 - (2k-6)L_1^k L_0^k - \frac{(k-3)(k-2)}{2}(L_1^k)^2 - \frac{k}{2}L_0^{k+1,k,2} - \frac{k-2}{2}L_1^{k+1,k,2}\Bigr)q^2 - \cdots\Bigr)(O_e)^{k-1} = 0. \tag{4.8}$$
On the other hand one has from [18] and [16] that the main relation can be written using the $k - 2$ point correlation function of the pure matter theory in the form
$$\frac{k}{\bigl\langle \prod_{j=1}^{k-2} O_e(z_j) \bigr\rangle_{M_k^k,\mathrm{matter}}}\,(O_e)^{k-1} = 0, \tag{4.9}$$
and that it is
$$\Bigl\langle \prod_{j=1}^{k-2} O_e(z_j) \Bigr\rangle_{M_k^k,\mathrm{matter}} = k + k^{k+1}\bigl(1 - 2a_1 - (k-2)b_1\bigr)q + k^{2k+1}\Bigl(1 - 2a_1 - b_1 + 3(a_1)^2 - 2a_2 + 2a_1 b_1 + (k-2)\bigl(-b_1 + 4a_1 b_1 + 2(b_1)^2 - 2b_2\bigr) + \frac{(k-2)(k-3)}{2}(b_1)^2\Bigr)q^2 + \cdots. \tag{4.10}$$
Here
$$a_d = \frac{(kd)!}{(d!)^k\, k^{kd}}, \qquad b_d = a_d \Bigl(\sum_{i=1}^{d} \sum_{m=1}^{k-1} \frac{m}{i(ki - m)}\Bigr)$$
are the coefficients of the hypergeometric series associated to the solutions
$$W_0(x) = \sum_{d=0}^{\infty} a_d e^{dx}, \qquad W_1(x) = \sum_{d=1}^{\infty} b_d e^{dx} + W_0(x)\,x,$$
of the differential equation
$$\Bigl(\Bigl(\frac{d}{dx}\Bigr)^{k-1} - e^x \Bigl(\frac{d}{dx} + \frac{1}{k}\Bigr)\Bigl(\frac{d}{dx} + \frac{2}{k}\Bigr) \cdots \Bigl(\frac{d}{dx} + \frac{k-1}{k}\Bigr)\Bigr) W_i(x) = 0. \tag{4.11}$$
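The coefficients $a_d$ and $b_d$ are easy to tabulate exactly. The following is a minimal sketch, not part of the original paper: the function names `a` and `b` are illustrative, and the summation ranges ($i$ from 1 to $d$, $m$ from 1 to $k-1$) are an assumption about the intended reading of the double sum. For the quintic ($k = 5$) it reproduces the integers 120 and 770 quoted later in the paper, and it checks the first-order recursion on $a_d$ implied by (4.11).

```python
from fractions import Fraction
from math import factorial


def a(k, d):
    """a_d = (kd)! / ((d!)^k * k^(kd)), as an exact rational."""
    return Fraction(factorial(k * d), factorial(d) ** k * k ** (k * d))


def b(k, d):
    """b_d = a_d * sum_{i=1..d} sum_{m=1..k-1} m / (i (k i - m)) (assumed index ranges)."""
    s = sum(Fraction(m, i * (k * i - m))
            for i in range(1, d + 1)
            for m in range(1, k))
    return a(k, d) * s


def recursion_holds(k, d):
    """Check (d+1)^(k-1) a_{d+1} == prod_{j=1..k-1} (d + j/k) * a_d, which follows from (4.11)."""
    rhs = a(k, d)
    for j in range(1, k):
        rhs *= d + Fraction(j, k)
    return (d + 1) ** (k - 1) * a(k, d + 1) == rhs


if __name__ == "__main__":
    k = 5
    print(k ** k * a(k, 1), k ** k * b(k, 1))
```

Exact rationals are used so that the comparison with the integer structure constants involves no rounding.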
By comparison of (4.8) with (4.10), we notice the following equalities:
$$k^k a_1 = L_0^k, \qquad k^k b_1 = L_1^k,$$
$$k^{2k} a_2 = \frac{1}{2}\bigl(L_0^{k+1,k,2} + 2(L_0^k)^2\bigr), \qquad k^{2k} b_2 = \frac{1}{4}\bigl(L_0^{k+1,k,2} + L_1^{k+1,k,2} + 2L_1^k L_1^k\bigr) + L_1^k L_0^k. \tag{4.12}$$
These equalities can be organized more systematically by means of generating functions. To this aim we need:

Definition 1. For arbitrary $N$ and $k$ let $\tilde{L}_m^{N,k,d}$ be the integer obtained by formally applying the recursive laws of Theorem 2.

Remark. One should note: (i) $\tilde{L}_i^{k,k}(e^x) = \tilde{L}_{k-1-i}^{k,k}(e^x)$; (ii) if the index $N - k$ is at least 2 then $\tilde{L}_m^{N,k,d}$ must be the ordinary structure constant $L_m^{N,k,d}$ of the Fano hypersurface.

Now we can rewrite (4.12) as
$$k^k a_1 = \tilde{L}_0^{k,k,1}, \qquad k^k b_1 = \tilde{L}_1^{k,k,1},$$
$$k^{2k} a_2 = \tilde{L}_0^{k,k,2}, \qquad k^{2k} b_2 = \frac{1}{2}\tilde{L}_1^{k,k,2} + \tilde{L}_1^{k,k,1} \cdot \tilde{L}_0^{k,k,1}. \tag{4.13}$$
After having performed some numerical computations, we have noticed that the following relations should also hold true:
$$k^{3k} a_3 = \tilde{L}_0^{k,k,3}, \qquad k^{3k} b_3 = \frac{1}{3}\tilde{L}_1^{k,k,3} + \frac{1}{2}\tilde{L}_1^{k,k,2} \cdot \tilde{L}_0^{k,k,1} + \tilde{L}_1^{k,k,1} \cdot \tilde{L}_0^{k,k,2}. \tag{4.14}$$
Consider next the generating functions:
$$\tilde{L}_i^{k,k}(\tilde{q}) := 1 + \sum_{d=1}^{\infty} \tilde{L}_i^{k,k,d}\, \tilde{q}^d, \qquad \tilde{q} := e^x, \tag{4.15}$$
and define
$$t := x + \Bigl(\sum_{j=1}^{\infty} b_j k^{kj} e^{jx}\Bigr) \Big/ \Bigl(\sum_{j=0}^{\infty} a_j k^{kj} e^{jx}\Bigr). \tag{4.16}$$
The preceding equalities motivate us to expect:
$$\tilde{L}_0^{k,k}(\tilde{q}) = \sum_{j=0}^{\infty} a_j k^{kj} e^{jx}, \qquad \tilde{L}_1^{k,k}(e^x) = \frac{dt}{dx}. \tag{4.17}$$
Thus, we expect that $\tilde{L}_m^{k,k}(\tilde{q})$ automatically gives the information of the B-model on the mirror manifold of $M_k^k$ without using the mirror conjecture. We use the virtual structure constants $\tilde{L}_m^{k,k}(\tilde{q})$ to define a virtual quantum product determined by the action of a ring generator $G$ which operates according to the rules:
$$G \cdot O_{e^{m-1}} = \tilde{L}_{k-1-m}^{k,k}(e^x)\, O_{e^m} \quad (m = 1, 2, \cdots, k-2), \qquad G \cdot O_{e^{k-2}} = 0. \tag{4.18}$$
We note the relation $G = G \cdot 1 = \tilde{L}_{k-2}^{k,k}(e^x)\, O_e$. We expect that the structure constants of the virtual action satisfy the following equality, which in fact may be checked up to $\tilde{q}^3$ using the recursive laws for the Fano case:
$$\prod_{i=0}^{k-1} \tilde{L}_i^{k,k}(e^x) = (1 - k^k e^x)^{-1}, \tag{4.19}$$
and this yields the relation:
$$(1 - k^k e^x) \cdot \bigl(\tilde{L}_0^{k,k}(e^x)\bigr)^2 \cdot (G)^{k-1} = 0. \tag{4.20}$$
Note that the multiplicative factor in (4.20) is nothing but the reciprocal of the generalized Yukawa coupling of the B-model [18]. On the other hand the true quantum cohomology ring satisfies a similar multiplication rule:
$$O_e \cdot 1 = O_e, \qquad O_e \cdot O_{e^{m-1}} = L_{k-1-m}^{k,k}(e^t)\, O_{e^m} \quad (m = 2, 3, \cdots, k-2),$$
$$O_e \cdot O_{e^{k-3}} = O_{e^{k-2}}, \qquad O_e \cdot O_{e^{k-2}} = 0, \tag{4.21}$$
where
$$L_i^{k,k}(e^t) := 1 + \sum_{d=1}^{\infty} L_i^{k,k,d}\, e^{dt}. \tag{4.22}$$
We can compute $L_i^{k,k}(e^t)$ in concrete examples using the method of torus localization. See [17] for details and results. Now we search for a transformation rule to pass from the virtual to the true quantum multiplication. To this aim we find it useful to introduce a formal definition:

Definition 2. The commutative product ($*$) between differential operators of the form $f(x)\frac{d^m}{dx^m}$ is given by:
$$\Bigl(f(x)\frac{d^m}{dx^m}\Bigr) * \Bigl(g(x)\frac{d^n}{dx^n}\Bigr) = \bigl(f(x) \cdot g(x)\bigr)\frac{d^{m+n}}{dx^{m+n}}. \tag{4.23}$$
Given the coordinate change $x = x(t)$ we define a map from $\frac{d}{dx}$ operators to $\frac{d}{dt}$ operators by means of the rule
$$f(x)\frac{d^m}{dx^m} \to f(x(t))\Bigl(\frac{dt}{dx}\Bigr)^m \frac{d^m}{dt^m}. \tag{4.24}$$
At this point we can relate the quantum product laws using the product of differential operators as an intermediate step. To start we propose the correspondence
$$O_e = \frac{d}{dt}, \qquad G = \frac{d}{dx}. \tag{4.25}$$
Then one has
$$\frac{d}{dx} * O_{e^{m-1}} = \tilde{L}_{k-1-m}^{k,k}(e^x)\, O_{e^m} \quad (m = 1, 2, \cdots, k-2), \qquad \frac{d}{dx} * O_{e^{k-2}} = 0, \tag{4.26}$$
and
$$\frac{d}{dt} * 1 = O_e, \qquad \frac{d}{dt} * O_{e^{m-1}} = L_{k-1-m}^{k,k}(e^t)\, O_{e^m} \quad (m = 2, 3, \cdots, k-3),$$
$$\frac{d}{dt} * O_{e^{k-3}} = O_{e^{k-2}}, \qquad \frac{d}{dt} * O_{e^{k-2}} = 0. \tag{4.27}$$
It follows from (4.26):
$$\frac{d}{dx} * 1 = \tilde{L}_{k-2}^{k,k}(e^x) \cdot O_e = \tilde{L}_{k-2}^{k,k}(e^x) \cdot \frac{d}{dt} = \frac{dt}{dx} \cdot \frac{d}{dt}. \tag{4.28}$$
This equality suggests that (4.26) and (4.27) become isomorphic if we use the transformation of differential operators defined above. Comparing the coefficients of $O_{e^\alpha}$ in (4.26) and (4.27), the desired isomorphism yields the equality
$$\prod_{j=1}^{\alpha} \Bigl(\tilde{L}_{k-1-j}^{k,k}(e^{x(t)})\,\frac{dx}{dt}\Bigr) = \prod_{j=1}^{\alpha} L_{k-1-j}^{k,k}(e^t). \tag{4.29}$$
We find in this way the transformation laws that we were looking for; they are:
$$\tilde{L}_i^{k,k}(e^{x(t)})\,\frac{dx}{dt} = \frac{\tilde{L}_i^{k,k}(e^{x(t)})}{\tilde{L}_1^{k,k}(e^{x(t)})} = L_i^{k,k}(e^t). \tag{4.30}$$
This is the rule that provides the recursive formulas for curves of arbitrary degree $d$ on the Calabi–Yau hypersurface $M_k^k$ once we know the recursive formulas for curves up to degree $d$ on Fano hypersurfaces. At this point we obtain the recursive formulas for cubics in Conjecture 1 by means of elementary calculations.
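The transformation law (4.30) is mechanical enough to run on truncated power series. The following sketch is an illustration under stated assumptions, not the authors' code: all function names are hypothetical, and the input series are the quintic virtual structure constants quoted in the Example of this section, (4.32). It builds $e^t = e^x \exp(t - x)$ from $dt/dx = \tilde{L}_1(e^x)$, inverts that series to get $e^x$ in powers of $e^t$, and substitutes into $\tilde{L}_i/\tilde{L}_1$.

```python
from fractions import Fraction


def mul(p, q, n):
    """Product of two power series (coefficient lists), truncated at order n."""
    return [sum(p[i] * q[m - i] for i in range(m + 1)) for m in range(n + 1)]


def div(p, q, n):
    """Quotient p/q of power series with q[0] != 0, truncated at order n."""
    out = []
    for m in range(n + 1):
        out.append((p[m] - sum(out[j] * q[m - j] for j in range(m))) / q[0])
    return out


def compose(p, s, n):
    """p(s(Q)) for a series s with zero constant term, using Horner's rule."""
    out = [Fraction(0)] * (n + 1)
    for c in reversed(p):
        out = mul(out, s, n)
        out[0] += c
    return out


def true_from_virtual(Li, L1, n):
    """Coefficients of L_i(e^t) obtained from virtual series via (4.30), up to order n."""
    Li = [Fraction(c) for c in Li]
    L1 = [Fraction(c) for c in L1]
    # E(q) = exp(t - x) with q = e^x; since d(t - x)/dx = L1 - 1,
    # the coefficients satisfy m * E_m = sum_{j=1..m} L1[j] * E_{m-j}.
    E = [Fraction(1)]
    for m in range(1, n + 1):
        E.append(sum(L1[j] * E[m - j] for j in range(1, m + 1)) / m)
    inv_E = div([Fraction(1)] + [Fraction(0)] * n, E, n)
    # Solve Q = q * E(q) for q(Q) by fixed-point iteration q <- Q / E(q);
    # each pass fixes at least one more coefficient.
    Q = [Fraction(0), Fraction(1)] + [Fraction(0)] * (n - 1)
    q = Q[:]
    for _ in range(n):
        q = mul(Q, compose(inv_E, q, n), n)
    return compose(div(Li, L1, n), q, n)


if __name__ == "__main__":
    L1_virtual = [1, 770, 1435650, 3225308000]
    L2_virtual = [1, 1345, 3296525, 8940963625]
    print([int(c) for c in true_from_virtual(L2_virtual, L1_virtual, 3)])
```

With these inputs the output coefficients are the ones appearing in the true quintic ring (4.31), which is the content of the identities (4.33) to (4.35).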
Example. The true quantum cohomology ring of the quintic Calabi–Yau threefold is:
$$O_e \cdot 1 = O_e, \qquad O_e \cdot O_e = O_{e^2}\,(1 + 575e^t + 975375e^{2t} + 1712915000e^{3t} + \cdots),$$
$$O_e \cdot O_{e^2} = O_{e^3}, \qquad O_e \cdot O_{e^3} = 0, \tag{4.31}$$
while the associated virtual ring is:
$$G \cdot 1 = O_e\,(1 + 770e^x + 1435650e^{2x} + 3225308000e^{3x} + \cdots),$$
$$G \cdot O_e = O_{e^2}\,(1 + 1345e^x + 3296525e^{2x} + 8940963625e^{3x} + \cdots),$$
$$G \cdot O_{e^2} = O_{e^3}\,(1 + 770e^x + 1435650e^{2x} + 3225308000e^{3x} + \cdots), \qquad G \cdot O_{e^3} = 0. \tag{4.32}$$
Using (4.30) we find that:
$$575 = 1345 - 770, \tag{4.33}$$
$$975375 = 3296525 - 1435650 + 770 \cdot 770 - 1345 \cdot 770 - 770 \cdot (1345 - 770), \tag{4.34}$$
$$1712915000 = 8940963625 - 3225308000 + 2 \cdot 770 \cdot 1435650 - 770^3 + 1345 \cdot (770)^2 - 1345 \cdot 1435650 - 3296525 \cdot 770 - 2 \cdot 770 \cdot (3296525 - 1435650 + 770 \cdot 770 - 1345 \cdot 770) + \Bigl(\frac{3}{2} \cdot (770)^2 - \frac{1}{2} \cdot 1435650\Bigr) \cdot (1345 - 770). \tag{4.35}$$

5. On the Construction of the Recursive Formulas for Rational Curves of Larger Degree

Motivated by the preceding results, we begin this section by proposing some conjectures. They are strong enough to allow in principle the construction of the expected recursive formulas for curves of higher degree, and we explicitly produce the law for the $d = 4$ case.

Conjecture 2. There are universal recursive polynomial laws which express the structure constants $L_m^{N,k,d}$ on a Fano variety in terms of $L_m^{N+1,k,n}$ ($1 \le n \le d$). The formulas have the following properties:
1. The form of the recursive polynomials is invariant if the index $N - k \ge 2$, and the equality $\gamma_0^{N,k,d} = \gamma_0^{N+1,k,d}$ for the coefficients in the fundamental relations is a consequence of them.
2. If $N - k = 1$ the recursive formulas change only for the case of lines, i.e. $d = 1$.

We keep the notations of Sect. 4, so that $\tilde{L}_m^{N,k,d}$ represents the result of a formal iteration of the recursive functions for fixed $k$ down to any chosen $N$, and then $\tilde{L}_m^{k,k}(e^x)$ is the associated generating function.
Conjecture 3. Formal iteration of the laws of Conjecture 2 for descending $N$ down to the case $N = k$ yields the coefficients of the hypergeometric series used in the mirror calculation, i.e., it should be
$$k^{dk} a_d = \tilde{L}_0^{k,k,d}, \qquad k^{dk} b_d = \frac{1}{d}\tilde{L}_1^{k,k,d} + \sum_{m=1}^{d-1} \frac{1}{m}\tilde{L}_1^{k,k,m} \cdot \tilde{L}_0^{k,k,d-m}. \tag{5.1}$$
The same procedure gives the structure constants of the quantum cohomology ring of the Calabi–Yau hypersurface of degree $k$, according to the rule
$$L_i^{k,k}(e^t) = \frac{\tilde{L}_i^{k,k}(e^{x(t)})}{\tilde{L}_1^{k,k}(e^{x(t)})}. \tag{5.2}$$
Here we set
$$\frac{dt}{dx} := \tilde{L}_1^{k,k}(e^x).$$
Remark. It is an immediate consequence of the conjectures and of the vanishing conditions of Sect. 2 that the structure constant $L_m^{N,k,d}$ is a polynomial of degree $d$ in the constants $L_m^k$, $N \ge k$.

We proceed now to the construction of the recursive formulas for the case $d = 4$. Our method is based on the expectation that the specialization procedure gives, if not the right coefficients, at least the right monomials which appear in the recursive laws. We formalize this below with a conjecture.

We start by constructing some technical formulas for the factorization of the Gromov–Witten invariants. Let $\{n_*\} := \{n_1, n_2, \cdots, n_l\}$ and $\mathrm{ind}(\{n_*\}) = \sum_{j=1}^{l}(n_j - 1)$. We formally define $\mathrm{ind}(\{\emptyset\})$ to be 0. We have a formula for the correlation functions (Gromov–Witten invariants) of the topological sigma model on $M_{N+1}^k$ coupled to gravity, which reads:
$$k^{-1}\langle O_{e^a} O_{e^{n_1}} O_{e^{n_2}} \cdots O_{e^{n_l}} O_{e^b}\rangle_{d,\,M_{N+1}^k,\,gr} = \sum_{d_1=0}^{d}\ \sum_{d_2=0}^{d_1} \cdots \sum_{d_{\mathrm{ind}(\{n_*\})}=0}^{d_{\mathrm{ind}(\{n_*\})-1}} C^d(\{n_*\}; d_1, d_2, \cdots, d_{\mathrm{ind}(\{n_*\})}) \prod_{i=0}^{\mathrm{ind}(\{n_*\})} L_{n+1-a-i+(N-k+1)(d-d_i)}^{N+1,k,d_i-d_{i+1}}, \tag{5.3}$$
$$d_0 := d, \qquad N - k + 1 \ge \mathrm{ind}(\{n_*\}) + 1. \tag{5.4}$$
The coefficients $C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})})$ which appear here have the following properties:
$$C^d(\{m\}; d_1, \cdots, d_{m-1}) = 1, \tag{5.5}$$
$$C^d(\{n_*\} \cup \{1\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})}) = d\,C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})}). \tag{5.6}$$
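One can determine $C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})})$ by means of the recursive relation
$$C^d(\{n_*\}; d_1, \cdots, d_{\mathrm{ind}(\{n_*\})}) = \sum_{\substack{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\} \\ \{m_*\} \ne \emptyset}} \Bigl(C^{d - d_{\mathrm{ind}(\{l_*\})+n_l-1}}\bigl(\{l_*\} \cup \{n_l - 1\};\ d_1 - d_{\mathrm{ind}(\{l_*\})+n_l-1}, \cdots, d_{\mathrm{ind}(\{l_*\})+n_l-2} - d_{\mathrm{ind}(\{l_*\})+n_l-1}\bigr) \cdot C^{d_{\mathrm{ind}(\{l_*\})+n_l-1}}\bigl(\{m_*\};\ d_{1+\mathrm{ind}(\{l_*\})+n_l-1}, \cdots, d_{\mathrm{ind}(\{n_*\})}\bigr) \cdot d_{\mathrm{ind}(\{l_*\})+n_l-1}\Bigr) + C^{d - d_{\mathrm{ind}(\{n_*\})}}\bigl(\{n_*\} \backslash \{n_e\} \cup \{n_l - 1\};\ d_1 - d_{\mathrm{ind}(\{n_*\})}, \cdots, d_{\mathrm{ind}(\{n_*\})-1} - d_{\mathrm{ind}(\{n_*\})}\bigr). \tag{5.7}$$

Proof. We prove (5.7) by induction on $\mathrm{ind}(\{n_*\})$. We denote $O_{e^{n_1}} O_{e^{n_2}} \cdots O_{e^{n_l}}$ as $O_{e^{\{n_*\}}}$ for brevity. The first reconstruction theorem of KM yields: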
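$$\sum_{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\}}\ \sum_{d_0=0}^{d} k^{-1}\langle O_{e^a} O_{e^{\{l_*\}}} O_{e^b} O_{e^{n_l + \mathrm{ind}(\{m_*\}) - (N-k+1)d_0}}\rangle_{d-d_0,gr} \cdot k^{-1}\langle O_{e^{\mathrm{ind}(\{l_*\}) - (N-k+1)(d-d_0) + a + b}} O_{e^{\{m_*\}}} O_{e^{n_l-1}} O_e\rangle_{d_0,gr}$$
$$= \sum_{\{l_*\} \sqcup \{m_*\} = \{n_*\}/\{n_l\}}\ \sum_{d_0=0}^{d} k^{-1}\langle O_{e^a} O_{e^{\{l_*\}}} O_{e^{n_l-1}} O_{e^{b+1+\mathrm{ind}(\{m_*\}) - (N-k+1)d_0}}\rangle_{d-d_0,gr} \cdot k^{-1}\langle O_{e^{a-1+n_l+\mathrm{ind}(\{l_*\}) - (N-k+1)(d-d_0)}} O_{e^{\{m_*\}}} O_e O_{e^b}\rangle_{d_0,gr}. \tag{5.8}$$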
The l.h.s. of (5.8) receives contributions only from $d_0 = 0$ and $\{m_*\} = \{\emptyset\}$, because $\mathrm{ind}(\{n_*\}) - (N-k+1)d_0 \le -1$ for $d_0 \ge 1$ and because the classical correlation function remains non-zero only if the number of operator insertion points equals 3. Then we have
$$\text{(the l.h.s. of (5.8))} = k^{-1}\langle O_{e^a} O_{e^{\{n_*\}}} O_{e^b}\rangle_{d,gr}. \tag{5.9}$$
On the other hand, we can rewrite the r.h.s. of (5.8) from the assumption of induction, d d−d X X0
X
(
{l∗ }q{m∗ }={n∗ }/{nl } d0 =0 t1 =0 {m∗ }6=∅
ind({l∗Y })+nl −2 i=0
(
d0 X u1 =0
ind({m Y∗ }) j =0
···
X
C d−d0 ({l∗ } ∪ {nl − 1}; t1 , · · · , tind({l∗ })+nl −2 ) ·
tind({l∗ })+nl −2 =0
N+1,k,t −t
i i+1 Ln+1−a−i+(N−k+1)(d−d )· 0 −ti )
uind({l∗ })−1
···
tind({l∗ })+nl −3
X
d0 C d0 {m∗ }; u1 , · · · , uind({m∗ }) ) ·
uind({l∗ }) =0 N+1,k,u −u
j j +1 Ln+1−a+1+ind({l ) ∗ })−j +(N−k+1)(d−d0 )+(N−k+1)(d0 −uj )
(5.10)
d d−d X X0
X
+
···
{l∗ }q{m∗ }={n∗ }/{nl } d0 =0 t1 =0 {m∗ }6=∅
tind({n∗ })−2
X
C d−d0 ({n∗ }\{ne } ∪ {nl − 1}; t1 ,· · ·, tind({n∗ })−1 ) ·
tind({n∗ })−1 =0
·
ind({n ∗ })−1 Y i=0
N+1,k,t −t
N+1,k,t −t
i i+1 i i+1 Ln+1−a−i+(N−k+1)(d−d · Ln+1−a−ind({n . ∗ })+(N−k+1)(d−d0 ) 0 −ti )
Then an appropriate change of the $t_i$'s and $u_i$'s leads to (5.7). □

The specialization process can be done systematically employing the formula:
$$[A_{a_1-1} \cap H,\ A_{a_2-1} \cap H,\ \cdots,\ A_{a_{d+1}-1} \cap H;\ d, N+1, k] = \sum_{m=1}^{d+1}\ \sum_{\substack{\sqcup_{j=1}^{m} U_j = \{a_*\} \\ U_j \ne \emptyset}} (-1)^{d+1-m}\Bigl([A_{\mathrm{ind}(U_1)+1},\ A_{\mathrm{ind}(U_2)+1},\ \cdots,\ A_{\mathrm{ind}(U_m)+1};\ d, N+1, k] \cdot \Bigl(\prod_{j=1}^{m} (\sharp(U_j) - 1)!\Bigr)\Bigr). \tag{5.11}$$
We apply formally the process of specialization to quartic curves. Combining (5.7) with (5.11), we obtain:
$$16L_n^{N,k,4} + R = \frac{3}{2}L^4_{n-3} + \frac{13}{2}L^4_{n-2} + \frac{13}{2}L^4_{n-1} + \frac{3}{2}L^4_n$$
$$+\ 2L^1_{n-2}L^3_{n-2+N-k} + 2L^1_{n-1}L^3_{n-2+N-k} + 6L^1_nL^3_{n-2+N-k} + 8L^1_{n-1}L^3_{n-1+N-k} + 12L^1_nL^3_{n-1+N-k} + 6L^1_nL^3_{n+N-k}$$
$$+\ 3L^2_{n-2}L^2_{n-1+2(N-k)} + 7L^2_{n-1}L^2_{n-1+2(N-k)} + 6L^2_nL^2_{n-1+2(N-k)} + 10L^2_{n-1}L^2_{n+2(N-k)} + 7L^2_nL^2_{n+2(N-k)} + 3L^2_nL^2_{n+1+2(N-k)}$$
$$+\ 6L^3_{n-2}L^1_{n+3(N-k)} + 12L^3_{n-1}L^1_{n+3(N-k)} + 6L^3_nL^1_{n+3(N-k)} + 8L^3_{n-1}L^1_{n+1+3(N-k)} + 2L^3_nL^1_{n+1+3(N-k)} + 2L^3_nL^1_{n+2+3(N-k)}$$
$$+\ 4L^1_{n-1}L^1_{n-1+N-k}L^2_{n-1+2(N-k)} + 9L^1_nL^1_{n-1+N-k}L^2_{n-1+2(N-k)} + 10L^1_nL^1_{n+N-k}L^2_{n-1+2(N-k)} + 12L^1_nL^1_{n+N-k}L^2_{n+2(N-k)}$$
$$+\ 8L^1_{n-1}L^2_{n-1+N-k}L^1_{n+3(N-k)} + 14L^1_nL^2_{n-1+N-k}L^1_{n+3(N-k)} + 14L^1_nL^2_{n+N-k}L^1_{n+3(N-k)} + 8L^1_nL^2_{n+N-k}L^1_{n+1+3(N-k)}$$
$$+\ 12L^2_{n-1}L^1_{n+2(N-k)}L^1_{n+3(N-k)} + 10L^2_nL^1_{n+2(N-k)}L^1_{n+3(N-k)} + 9L^2_nL^1_{n+1+2(N-k)}L^1_{n+3(N-k)} + 4L^2_nL^1_{n+1+2(N-k)}L^1_{n+1+3(N-k)}$$
$$+\ 16L^1_nL^1_{n+N-k}L^1_{n+2(N-k)}L^1_{n+3(N-k)}. \tag{5.12}$$
Here and further down we drop $N+1, k$ from $L_n^{N+1,k,d}$ for simplicity. We determine next the contribution $R$ from the connected reducible curves indirectly, namely by means of a conjecture on the structure of the recursive formulas which is motivated by the results that we have found for curves of degree up to 3.
Conjecture 4. The prototype result obtained from the specialization exhausts all the addends that appear in the “true” recursive formula, and the coefficients described by the following generating polynomial remain unchanged after subtraction of the contribution from the “R” term:
$$\prod_{j=1}^{d-1} \Bigl(\frac{jx + (d-j)y}{d} + z_j\Bigr). \tag{5.13}$$
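Examples.
$$d = 2:\quad \Bigl(\frac{x+y}{2}\Bigr) + z_1,$$
$$d = 3:\quad \Bigl(\frac{2x^2 + 5xy + 2y^2}{9}\Bigr) + \Bigl(\frac{2x+y}{3}\Bigr)z_1 + \Bigl(\frac{x+2y}{3}\Bigr)z_2 + z_1z_2,$$
$$d = 4:\quad \Bigl(\frac{3x^3 + 13x^2y + 13xy^2 + 3y^3}{32}\Bigr) + \Bigl(\frac{x^2 + 4xy + 3y^2}{8}\Bigr)z_3 + \Bigl(\frac{3x^2 + 10xy + 3y^2}{16}\Bigr)z_2 + \Bigl(\frac{3x^2 + 4xy + y^2}{8}\Bigr)z_1 + \Bigl(\frac{3x+y}{4}\Bigr)z_1z_2 + \Bigl(\frac{x+y}{2}\Bigr)z_1z_3 + \Bigl(\frac{x+3y}{4}\Bigr)z_2z_3 + z_1z_2z_3. \tag{5.14}$$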
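The $z$-free parts of the examples (5.14) can be reproduced by expanding (5.13) with every $z_j$ set to zero. The sketch below is illustrative only (the function name is hypothetical); it multiplies out the linear factors $(jx + (d-j)y)/d$ with exact rationals and returns the coefficients ordered by increasing power of $x$.

```python
from fractions import Fraction


def pure_term_coefficients(d):
    """Coefficients of x^i y^(d-1-i), i = 0..d-1, in prod_{j=1..d-1} (j x + (d-j) y) / d."""
    coeffs = [Fraction(1)]
    for j in range(1, d):
        cx, cy = Fraction(j, d), Fraction(d - j, d)
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += cx * c   # multiplying by the x term raises the x-power
            new[i] += cy * c       # multiplying by the y term keeps the x-power
        coeffs = new
    return coeffs


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, [str(c) for c in pure_term_coefficients(d)])
```

For $d = 4$ the result is $(3, 13, 13, 3)/32$, the same weights that multiply $L^4_{n-3}, \ldots, L^4_n$ in the quartic law.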
The remaining unknown coefficients are found by means of the constraints that they must satisfy. One condition is that the recursive laws must be compatible with the main relation of the Kähler sub-ring, i.e. with $(O_e)^{N-1} - k^k (O_e)^{k-1} q = 0$. We also use the symmetry of the recursive formulas arising from the equality $L_n^{N,k,d} = L_{N-1-(N-k)d-n}^{N,k,d}$. As a result the only unknown coefficients left are the ones of $L^1_{n-1}L^3_{n-2+N-k}$ and of $L^1_nL^3_{n-1+N-k}$. By symmetry they coincide with the coefficients of $L^3_nL^1_{n+1+3(N-k)}$ and of $L^3_{n-1}L^1_{n+3(N-k)}$ respectively. Using the condition imposed by the main relation we find that their sum must be 7/9. Everything is reduced to the computation of the coefficient of $L^1_{n-1}L^3_{n-2+N-k}$. Now we already have the recursive laws for the curves of degree at most 3, and then iterated use of the recursive formula (with the unknown coefficients) from the $N \ge 2k$ region (in this region, all we need are the Schubert numbers) down to the $N = k$ region yields linear functions of the remaining unknown coefficients. At this point the information that we have about the coefficients $a_d$ and $b_d$ of the hypergeometric series in Conjecture 3 provides us with an infinite number of linear relations, by varying $k$.
Using these arguments we compute that the coefficient of $L^1_{n-1}L^3_{n-2+N-k}$ is 1/6. We have thus obtained the law for curves of degree 4:
$$L_n^{N,k,4} = 32^{-1}\bigl(3L^4_{n-3} + 13L^4_{n-2} + 13L^4_{n-1} + 3L^4_n\bigr)$$
$$+\ 72^{-1}\bigl(9L^1_{n-2}L^3_{n-2+N-k} + 12L^1_{n-1}L^3_{n-2+N-k} + 16L^1_nL^3_{n-2+N-k} + 36L^1_{n-1}L^3_{n-1+N-k} + 44L^1_nL^3_{n-1+N-k} + 27L^1_nL^3_{n+N-k}\bigr)$$
$$+\ 16^{-1}\bigl(3L^2_{n-2}L^2_{n-1+2(N-k)} + 6L^2_{n-1}L^2_{n-1+2(N-k)} + 4L^2_nL^2_{n-1+2(N-k)} + 10L^2_{n-1}L^2_{n+2(N-k)} + 6L^2_nL^2_{n+2(N-k)} + 3L^2_nL^2_{n+1+2(N-k)}\bigr)$$
$$+\ 72^{-1}\bigl(27L^3_{n-2}L^1_{n+3(N-k)} + 44L^3_{n-1}L^1_{n+3(N-k)} + 16L^3_nL^1_{n+3(N-k)} + 36L^3_{n-1}L^1_{n+1+3(N-k)} + 12L^3_nL^1_{n+1+3(N-k)} + 9L^3_nL^1_{n+2+3(N-k)}\bigr)$$
$$+\ 12^{-1}\bigl(3L^1_{n-1}L^1_{n-1+N-k}L^2_{n-1+2(N-k)} + 4L^1_nL^1_{n-1+N-k}L^2_{n-1+2(N-k)} + 6L^1_nL^1_{n+N-k}L^2_{n-1+2(N-k)} + 9L^1_nL^1_{n+N-k}L^2_{n+2(N-k)}\bigr)$$
$$+\ 6^{-1}\bigl(3L^1_{n-1}L^2_{n-1+N-k}L^1_{n+3(N-k)} + 4L^1_nL^2_{n-1+N-k}L^1_{n+3(N-k)} + 4L^1_nL^2_{n+N-k}L^1_{n+3(N-k)} + 3L^1_nL^2_{n+N-k}L^1_{n+1+3(N-k)}\bigr)$$
$$+\ 12^{-1}\bigl(9L^2_{n-1}L^1_{n+2(N-k)}L^1_{n+3(N-k)} + 6L^2_nL^1_{n+2(N-k)}L^1_{n+3(N-k)} + 4L^2_nL^1_{n+1+2(N-k)}L^1_{n+3(N-k)} + 3L^2_nL^1_{n+1+2(N-k)}L^1_{n+1+3(N-k)}\bigr)$$
$$+\ L^1_nL^1_{n+N-k}L^1_{n+2(N-k)}L^1_{n+3(N-k)}. \tag{5.15}$$
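Two bookkeeping facts about (5.15) can be confirmed mechanically. The sketch below is an added illustration with hypothetical names: it counts the addends of (5.15) group by group and compares the total with the number of degree-$d$ monomials in $d$ variables, $(2d-1)!/(d!(d-1)!)$, for $d = 4$; it also checks that the two coefficients fixed by the main relation, $12/72$ and $44/72$, sum to $7/9$.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import factorial


def monomial_count(d):
    """Number of monomials of degree d in d variables: (2d-1)! / (d! (d-1)!)."""
    return factorial(2 * d - 1) // (factorial(d) * factorial(d - 1))


def monomial_count_by_enumeration(d):
    """Same count, obtained by listing multisets of size d drawn from d variables."""
    return sum(1 for _ in combinations_with_replacement(range(d), d))


# Number of addends in each bracket of (5.15), read off in order of appearance.
GROUP_SIZES_QUARTIC_LAW = [4, 6, 6, 6, 4, 4, 4, 1]

# Coefficients of L^1_{n-1} L^3_{n-2+N-k} and L^1_n L^3_{n-1+N-k} in (5.15).
COEFF_A = Fraction(12, 72)
COEFF_B = Fraction(44, 72)

if __name__ == "__main__":
    print(sum(GROUP_SIZES_QUARTIC_LAW), monomial_count(4))
    print(COEFF_A, COEFF_A + COEFF_B)
```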
We have checked numerically that this formula is compatible with the previous conjecture (3.1) on the relation satisfied by the coefficients $a_d$ and $b_d$. We have also checked that the formula for quartic curves yields the right Gromov–Witten invariant for the quintic threefold when we use the same conjecture.

Remark. Looking at the recursive laws for the Fano case established so far, we expect that the number of addends appearing in the laws for degree $d$ is the number of monomials of degree $d$ in $d$ variables, $(2d-1)!/(d!(d-1)!)$.

Of course the procedure that we have used for the determination of the recursive formula for $d = 4$ is in principle applicable to any $d$. For example we have also determined the recursive formula for $d = 5$. See (hep-th 9611053), the web version of this paper. On the other hand a general description of the coefficients that appear in the conjectural recursive formula of arbitrary degree is still an open problem.

6. Conclusion

In this paper, we proved that up to degree 3 the correlation functions of the hypersurfaces $M_N^k$ in $CP^{N-1}$ ($N \ge k$) can be written as polynomial functions of a finite number of integers $L_m^k$. For instance in the quintic case these numbers are 1345, 770, 120. In proving this, we have found certain recursive formulas that are invariant in the $c_1(M_N^k) \ge 2$ region. They yield the “bare” B-model or “bare” coordinates of the deformation space of the complex structure of the mirror manifold associated to $M_k^k$. This completely agrees with the results of Givental, which say that in the $c_1(M_N^k) \ge 2$ region the sigma models on $M_N^k$ can be solved with the hypergeometric series without
coordinate transformation (i.e., the bare deformation parameter is a good coordinate of the A-model) and that in the Calabi–Yau case, one has to transform the bare coordinate by the mirror map. In sum one can see the B-model as the toric quantum cohomology compatible with the toric compactification of the moduli space of the pure matter theory. In the Calabi–Yau case one has to introduce the mirror map to compensate for the gap between the toric compactification of the moduli space of the pure matter theory and the exact moduli space. These conclusions agree with the arguments of [24]. Possibly the case of complete intersections in projective space can be treated by changing the input integers $L_m^k$. One may try to generalize our arguments to the case of complete intersections in weighted projective spaces.

Acknowledgement. A. C. thanks A. B. Givental, J. Lewis, C. Peters, S. A. Strømme, G. Tian, A. Tyurin. M. J. thanks T. Eguchi, K. Hori, A. Matsuo, M. Kobayashi, members of Algebraic Geometry Group in Dept. of Mathematical Science in Tokyo Univ., M. Nagura and Y. Sun for discussions and kind encouragement. A. C. has been partially supported by Science Project Geom. of Alg. Var. n. 0-198-SC1, and by funding from M.U.R.S.T. and G.N.S.A.G.A. (C.N.R.) Italy. M. J. has been supported by a grant of Japan Society for Promotion of Science.
References

1. Beauville, A.: Quantum cohomology of complete intersections. Mathematical, Physics Analysis and Geometry 168, 384–398 (1995)
2. Bertram, A.: Quantum Schubert Calculus. To appear in Advances in Mathematics
3. Bloch-Murre: On the Chow group of certain types of Fano threefolds. Compositio Math. 39, 47–105 (1979)
4. Candelas, P. and de la Ossa, X.: Nucl. Phys. B355, 415 (1991)
5. Candelas, P., de la Ossa, X., Green, P. and Parkes, L.: Phys. Lett. 258B, 118 (1991); Nucl. Phys. B359, 21 (1991)
6. Collino, A.: Some computations on the quantum cohomology algebra of a Fano hypersurface. Informal draft (1996)
7. Dubrovin, B.: The geometry of 2D topological field theories. In: Integral systems and quantum groups. LNM 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996, pp. 120–348
8. Ellingsrud, Strømme, S.A.: Bott’s formula and enumerative geometry. J. AMS 9, n. 1, 175–193 (1996)
9. Fulton, W.: Intersection Theory. Ergebnisse der Math. und ihrer Grenzgebiete 3. Folge Band 2, Berlin–Heidelberg–New York: Springer-Verlag, 1984
10. Fulton and Pandharipande: Notes on stable maps and quantum cohomology. In: Proceedings of symposia in pure mathematics: Algebraic geometry Santa Cruz 1995, (J. Kollar, R. Lazarsfeld, D. Morrison eds.) Volume 62, Part 2, Providence, RI: American Mathematical Society, pp. 45–96
11. Givental, A.B.: Equivariant Gromov–Witten Invariants. Internat. Math. Res. Notices 13, 613–663 (1996)
12. Greene, B.R., Morrison, D.R. and Plesser, M.R.: Mirror Manifolds in Higher Dimension. Commun. Math. Phys. 173, 559–598 (1995)
13. Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley, 1978
14. Hosono, S., Klemm, A., Theisen, S. and Yau, S.T.: Mirror Symmetry, Mirror Map and Applications to Calabi–Yau Hypersurfaces. Commun. Math. Phys. 167, 301–350 (1995)
15. Intriligator, K.: Fusion residues. Modern Physics Letters A6 Number 38, 3543–3556 (1991)
16. Jinzenji, M.: On Quantum Cohomology Rings for Hypersurfaces in $CP^{N-1}$. J. Math. Phys. 38, 5775–5802 (1997)
17. Jinzenji, M.: Construction of Free Energy of Calabi–Yau Manifold embedded in $CP^{N-1}$ via Torus Actions. Int. J. Mod. Phys. A12, 5775–5802 (1997)
18. Jinzenji, M. and Nagura, M.: Mirror Symmetry and An Exact Calculation of $N - 2$ point Correlation Function on Calabi–Yau Manifold embedded in $CP^{N-1}$. Int. J. Mod. Phys. A11, 171–202 (1996)
19. Keel, S.: Intersection theory of moduli spaces of n-stable pointed curves of genus zero. Trans. Am, 330, 545–574 (1992)
20. Kontsevich, M.: Enumeration of Rational Curves via Torus Actions. In: The moduli space of curves, R. Dijkgraaf, C. Faber, G. van der Geer (Eds.), Progress in Math., 129, Basel–Boston: Birkhäuser, 1995, pp. 335–368
21. Kontsevich, M., Manin, Y.: Gromov–Witten Classes, Quantum Cohomology, and Enumerative Geometry. Commun. Math. Phys. 164, 525–562 (1994)
22. Lewis, J.: The cylinder correspondence for hypersurfaces of degree n in $P^n$. Am. J. of Math. 110, 77–114
23. Li, J., Tian, G.: Quantum Cohomology of Homogeneous Varieties. alg/geom/9504009
24. Morrison, D.R. and Plesser, M.R.: Summing the Instantons: Quantum Cohomology and Mirror Symmetry in Toric Varieties. Nucl. Phys. B440, 279–354 (1995)
25. Nagura, M. and Sugiyama, K.: Mirror Symmetry of K3 Surface. Int. J. Mod. Phys. A10, 233 (1995)
26. Persson, U., Peters, C.: Some aspects of the topology of algebraic surfaces. Israel Mathematical Conference Proceedings Vol. 9, 377–392 (1996)
27. Ruan, Y., Tian, G.: A mathematical theory of quantum cohomology. J. Diff. Geom. 42 no. 2, (1995)
28. Tian, G.: Quantum cohomology and its associativity. In: Proc. of 1st Current Developments in Math., Cambridge: Cambridge International Press, 1995
29. Tjurin, A. N.: Five lectures on three-dimensional varieties. (Russian) Uspehi Mat. Nauk 27 no. 5, (167), 3–50 (1972)
30. Vafa, C.: Topological Mirrors and Quantum Rings. hep-th/9111017
31. Witten, E.: Mirror Manifolds and Topological Field Theory. In: Essays on Mirror Manifolds, ed. S.-T. Yau, Hong Kong: Int. Press. Co., 1992, pp. 120–180

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 185 – 233 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Dynamics of Cubic Siegel Polynomials

Saeed Zakeri*

Department of Mathematics, SUNY at Stony Brook, Stony Brook, NY 11794, USA. E-mail: [email protected]

Received: 29 August 1998 / Accepted: 19 March 1999
Abstract: We study the one-dimensional parameter space of cubic polynomials in the complex plane which have a fixed Siegel disk of rotation number θ, where θ is a given irrational number of Brjuno type. The main result of this work is that when θ is of bounded type, the boundary of the Siegel disk is a quasicircle which contains one or both critical points of the cubic polynomial. We also show that these boundaries vary continuously as one moves in the parameter space. This is most nontrivial near the set of cubics with both critical points on the boundary of their Siegel disk. We prove that this locus is a Jordan curve in the parameter space. Most of the techniques and results can be generalized to polynomials of higher degrees.

Contents
1. Introduction
2. A Cubic Parameter Space
3. Components of the Interior of M(θ)
4. Renormalizable Cubics
5. Quasiconformal Conjugacy Classes
6. Connectivity of M(θ)
7. Critical Parametrization of Blaschke Products
8. A Blaschke Parameter Space
9. The Surgery
10. The Blaschke Connectedness Locus C(θ)
11. Continuity of the Surgery Map
12. Renormalizable Blaschke Products
13. Surjectivity of the Surgery Map
14. Siegel Disks with Two Critical Points on Their Boundary
* Current address: Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104-6395, USA.
1. Introduction

Let $f$ be a polynomial of degree $d \ge 2$ in the complex plane and consider the following statements:
• ($A_d$) “If $f$ has a fixed Siegel disk $\Delta$ of bounded type rotation number, then $\partial\Delta$ is a quasicircle passing through some critical point of $f$.”
• ($B_d$) “If $f$ has a fixed Siegel disk $\Delta$ such that $\partial\Delta$ is a quasicircle passing through some critical point of $f$, then the rotation number of $\Delta$ is of bounded type.”

Statement ($A_2$) is a theorem of Douady, Ghys, Herman and Shishikura, ($B_d$) is open, even for $d = 2$,¹ and one of the main corollaries of this work is ($A_3$):

Theorem. Let $P$ be a cubic polynomial which has a fixed Siegel disk $\Delta$ of rotation number θ. Let θ be of bounded type. Then the boundary of $\Delta$ is a quasicircle which contains one or both critical points of $P$.

¹ Note added in proof (July 1999): Carsten Petersen has announced a proof of ($B_d$) for $d \ge 2$.

In fact, we study the one-dimensional slice in the cubic parameter space which consists of all cubics with a fixed Siegel disk of a given rotation number. Many of the results apply to all rotation numbers of Brjuno type and can be generalized to polynomials of degree $d \ge 4$.

Siegel disks provide examples of quasiperiodic dynamics. Let $p$ be an irrationally indifferent fixed point of a rational map $f : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$. This means that $f(p) = p$ and the multiplier $f'(p)$ is of the form $e^{2\pi i\theta}$, where the rotation number $0 < \theta < 1$ is irrational. When $f$ is linearizable near $p$, the largest domain $\Delta$ on which the linearization is possible is simply-connected and is called the Siegel disk of $f$ centered at $p$. Every punctured Siegel disk $\Delta \smallsetminus \{p\}$ is foliated by dynamically-defined real-analytic invariant curves. However, as we get close to $\partial\Delta$, these invariant curves may become more wiggly, and in the limit we lose control of their distortion. So, a priori, we do not even know if $\partial\Delta$ is a Jordan curve. The topology and geometry of the boundary of Siegel disks is a current field of research in Holomorphic Dynamics.

It was conjectured by Douady and Sullivan in the early 80’s that the boundary of every Siegel disk of a rational map has to be a Jordan curve (see [D1]). To this date, this has remained an open problem, even for polynomials, even when the degree is 2. Even worse, there are very few explicit examples of polynomials for which we can effectively verify this conjecture. For instance, it is easy to see that local-connectivity of the Julia set implies the boundary of a Siegel disk to be a Jordan curve, but except for one case in the quadratic family [Pe], we do not know how to check local-connectivity of the Julia set of a rational map which has a Siegel disk (and even in that single case, the boundary being a Jordan curve is proved as a first step in the proof of local-connectivity). On the other hand, there are examples of non locally-connected quadratic Julia sets whose Siegel disks are bounded by quasicircles [H3] or indifferent linearizable germs with non locally-connected “hedgehogs” whose Siegel disks are bounded by smooth or even quasianalytic Jordan curves [Pr2]. It is known that in any counterexample to the Douady–Sullivan conjecture, the boundary of the Siegel disk must either be very complicated (an indecomposable continuum) or very simple (a circle with infinitely many topologist’s sine curves planted on it) [R].

Let $[a_1, a_2, \ldots, a_n, \ldots]$ be the continued fraction expansion of the rotation number θ and $p_n/q_n = [a_1, a_2, \ldots, a_n]$ be its $n$th rational approximation, where every $a_n$ is a positive integer. According to the theorem of Brjuno–Yoccoz [Y], every holomorphic
germ with an indifferent fixed point of multiplier $e^{2\pi i\theta}$ is linearizable if and only if θ satisfies the condition
$$\sum_{n=1}^{\infty} \frac{\log q_{n+1}}{q_n} < +\infty. \tag{1.1}$$
Such θ is called of Brjuno type. It is not hard to show that this set has full measure in the unit interval. The set of irrational numbers of Brjuno type contains two important arithmetic subsets: (1) numbers of Diophantine type, the set of all $0 < \theta < 1$ for which there exist positive constants $C$ and $\nu$ such that $|\theta - p/q| > C/q^{\nu}$ for every rational number $0 < p/q < 1$; and (2) numbers of bounded type, the set of all $0 < \theta < 1$ for which $\sup_n a_n < +\infty$.

Another issue is the existence of critical points on the boundary of Siegel disks. This problem was first studied by Ghys under the assumption that the boundary is a Jordan curve and the rotation number is Diophantine [G]. Later Herman improved the result by showing that when the rotation number is Diophantine and the action on the boundary is injective, there must be a critical point on the boundary [H1]. A very short proof of this theorem is now possible with knowledge of “Siegel compacts” as recently introduced by Perez-Marco [Pr1] (see [Z2] for such a proof). In the case of quadratic polynomials, no critical point on the boundary of the Siegel disk automatically implies that the map acts injectively on this boundary. Hence one concludes that for θ of Diophantine type, the critical point of $Q_\theta : z \mapsto e^{2\pi i\theta}z + z^2$ is on the boundary of the Siegel disk centered at 0. Later Herman gave the first example of a θ of Brjuno type for which the boundary of the Siegel disk for $Q_\theta$ is disjoint from the entire orbit of the critical point [H3].

The most significant example in which one can explicitly show that the boundary of a Siegel disk is a Jordan curve containing a critical point is the quadratic map $Q_\theta : z \mapsto e^{2\pi i\theta}z + z^2$, with θ of bounded type.
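The arithmetic conditions above are easy to explore numerically. The following is a minimal sketch added for illustration (the function names are hypothetical, and a finite partial sum can only suggest, not prove, convergence): it computes the denominators $q_n$ of the convergents from the partial quotients $a_n$ via the standard recurrence $q_n = a_n q_{n-1} + q_{n-2}$, and evaluates partial sums of the Brjuno series (1.1). For the golden mean, where every $a_n = 1$, the $q_n$ are Fibonacci numbers and the partial sums stabilise quickly.

```python
import math


def convergent_denominators(partial_quotients):
    """Return [q_1, ..., q_N] for theta = [a_1, a_2, ...], using q_n = a_n q_{n-1} + q_{n-2}."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    out = []
    for a_n in partial_quotients:
        q_prev, q = q, a_n * q + q_prev
        out.append(q)
    return out


def brjuno_partial_sum(partial_quotients):
    """Partial sum of log(q_{n+1}) / q_n over the available convergents."""
    qs = convergent_denominators(partial_quotients)
    return sum(math.log(qs[n + 1]) / qs[n] for n in range(len(qs) - 1))


if __name__ == "__main__":
    golden = [1] * 40  # bounded type: sup a_n = 1
    print(convergent_denominators(golden)[:8])
    print(brjuno_partial_sum(golden))
```

For a bounded type θ the $q_n$ grow at least geometrically while $\log q_{n+1}$ grows at most linearly in $n$, which is why the series converges in that case.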
The idea, originally due to Ghys but utilized by Douady, Herman and Shishikura, is to consider the degree 3 Blaschke product
$$f_\theta(z) = e^{2\pi i t(\theta)} z^2 \left(\frac{z-3}{1-3z}\right),$$
which has a double critical point at 1, where $0 < t(\theta) < 1$ is chosen such that the rotation number of the restriction of $f_\theta$ to the unit circle is θ. Using a theorem of Świątek and Herman on quasisymmetric linearization of critical circle maps ([Sw,H2]), one can redefine $f_\theta$ on the unit disk to make it quasiconformally conjugate to the rigid rotation by angle θ. After modifying the conformal structure of the sphere on the unit disk and all its preimages, one applies the Measurable Riemann Mapping Theorem of Morrey-Ahlfors-Bers to prove that the resulting map is quasiconformally conjugate to a quadratic polynomial $Q$. But the image of the unit disk has to be a Siegel disk of rotation number θ for $Q$, and there is only one such quadratic up to affine conjugacy, so $Q$ must be conjugate to $Q_\theta$, which proves $(A_2)$.

The Julia set of $Q_\theta$ for the golden mean $\theta = (\sqrt{5} - 1)/2$ and its self-similar properties were studied empirically by physicists in the early 80's (see [MN,W]). For general θ of bounded type, it has been a subject of recent rigorous studies by mathematicians (see for example [Pe,GJ,Mc3,YZ]). In a very recent work [Z4], using a non-quasiconformal surgery on $f_\theta$, we find explicit arithmetical conditions on unbounded type rotation numbers θ which guarantee that the boundary of the Siegel disk of $Q_\theta$ is a Jordan curve passing through the critical point.

In any attempt to generalize $(A_2)$ to higher degrees, one must address several problems. In fact, the main difficulty is not the surgery which can be performed in all degrees
in a similar way, provided that one has the appropriate Blaschke products in hand. Instead, we have to face a different set of questions such as parametrization of the candidate Blaschke products by their critical points, combinatorics of various “drops” of their Julia sets, continuity of the surgery, and surjectivity of this operation. None of these issues arises in degree 2, where the corresponding parameter spaces are single points. In this paper we address these questions in detail for cubic polynomials although many of the arguments apply to higher degrees as well. We introduce the parameter space P cm (θ ) of critically marked cubic polynomials with a Siegel disk of a given rotation number θ of Brjuno type, which is canonically isomorphic to the punctured plane. The connectedness locus M(θ) ⊂ P cm (θ ) (the analogue of the Mandelbrot set for the quadratic family) is the set of all cubics with both critical orbits bounded (see Fig. 1). In the interior of M(θ), every cubic is either hyperbolic-like, for which the free critical point approaches an attracting cycle, or capture, for which the free critical point eventually maps into the Siegel disk, or of neither type, in which case it is called queer. (There may be no queer components. In any case, no example is known.) The presence of hyperbolic-like cubics in P cm (θ) implies the existence of copies of the Mandelbrot set all over M(θ), while captures appear as components in M(θ ) which look like Siegel disks in the dynamical plane. The most significant property of queer cubics is that their Julia sets support invariant line fields and in particular have positive Lebesgue measure (Theorem 3.4). Motivated by the Douady-Ghys-Herman-Shishikura approach, we introduce an auxiliary family of degree 5 critically marked Blaschke products which serve as models for cubics in P cm (θ) in the same way fθ does for the quadratic Qθ . 
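The two basic properties of the quadratic model $f_\theta$ mentioned earlier (it preserves the unit circle and has a double critical point at 1) can be checked numerically. The sketch below assumes the Blaschke product has the form $f(z) = e^{2\pi i t} z^2 (z-3)/(1-3z)$; the value of $t$ is a placeholder, since computing the actual $t(\theta)$ that realizes a given rotation number is not attempted here. Function names are hypothetical.

```python
import cmath

def blaschke(z, t):
    """Degree 3 Blaschke product f(z) = e^{2 pi i t} z^2 (z - 3) / (1 - 3 z)."""
    return cmath.exp(2j * cmath.pi * t) * z * z * (z - 3) / (1 - 3 * z)

def derivative_numerator(z):
    """Numerator N'D - N D' of the derivative of N/D, where
    N(z) = z^3 - 3 z^2 and D(z) = 1 - 3 z (the rotation factor is dropped)."""
    n, dn = z**3 - 3 * z**2, 3 * z**2 - 6 * z
    d, dd = 1 - 3 * z, -3
    return dn * d - n * dd

if __name__ == "__main__":
    t = 0.3  # placeholder angle, not the true t(theta)
    z_on_circle = cmath.exp(0.7j)
    print(abs(blaschke(z_on_circle, t)))           # modulus on the unit circle
    w = 0.4 + 0.9j
    print(derivative_numerator(w), -6 * w * (w - 1) ** 2)
```

Expanding the numerator by hand gives $-6z(z-1)^2$, so the finite critical points are 0 and a double critical point at 1, which is what the second printed pair compares.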
We show that these Blaschke products can be parametrized by their critical points (Theorem 7.1) and we use this parametrization to define the parameter space $\mathcal{B}^{cm}(\theta)$, which is also homeomorphic to the punctured plane. A connectedness locus $\mathcal{C}(\theta) \subset \mathcal{B}^{cm}(\theta)$ can be defined similarly. When θ is of bounded type, one can perform a quasiconformal surgery on Blaschke products in $\mathcal{B}^{cm}(\theta)$ in order to obtain critically marked cubics in $\mathcal{P}^{cm}(\theta)$. The result of this surgery does not depend on the various choices we make along the way (Proposition 9.2), hence it gives rise to a well-defined surgery map $\mathcal{S} : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$. Continuity of $\mathcal{S}$ is far from being straightforward and depends on the fact that the parameter spaces have one complex dimension (Theorem 11.1). In fact, in higher degrees, the proof of this continuity step is the only part where our techniques for cubic polynomials fail to apply.

Various pieces of evidence suggest that the connectedness loci $\mathcal{C}(\theta)$ and $\mathcal{M}(\theta)$ are in fact homeomorphic. One can go even further and speculate that $\mathcal{S}$ is a homeomorphism. Although we provide some evidence to support this, we only need to show that $\mathcal{S}$ is surjective (Theorem 13.6) in order to get the desired results in the dynamical plane of cubics. Surjectivity follows from an injectivity result (Theorem 13.3) which in particular shows that $\mathcal{S}$ induces a homeomorphism between the complementary components of $\mathcal{C}(\theta)$ and $\mathcal{M}(\theta)$. The proof of the injectivity result relies on various tools developed along the way, especially a renormalization scheme to "extract" $Q_\theta$ from some cubics in $\mathcal{P}^{cm}(\theta)$ and $f_\theta$ from some Blaschke products in $\mathcal{B}^{cm}(\theta)$. Surjectivity of $\mathcal{S}$ proves $(A_3)$. As another consequence, we obtain the following

Theorem. For θ of bounded type, the boundary of the Siegel disk of $P \in \mathcal{P}^{cm}(\theta)$ is a continuous function of $P$ in the Hausdorff topology.
It is interesting to contrast this result with the fact that the Julia set of P undergoes drastic implosions near the boundary of M(θ ), especially near the set of cubics with both critical points on the boundary of their Siegel disk. We study this locus and describe its topology:
Theorem. For θ of bounded type, the set $\Gamma$ of all cubics in $\mathcal{P}^{cm}(\theta)$ with both critical points on the boundary of their Siegel disk is a Jordan curve.

Figure 18 shows the Jordan curve $\Gamma$. We conclude with a topological characterization of this set as the common boundary of the two complementary components of $\mathcal{M}(\theta)$ (Theorem 14.4).

2. A Cubic Parameter Space

We begin by considering the space of all cubic polynomials which have a fixed Siegel disk of multiplier $\lambda = e^{2\pi i\theta}$ centered at the origin. Here $0 < \theta < 1$ is a given irrational number of Brjuno type satisfying the condition (1.1). By the theorem of Brjuno–Yoccoz [Y], every holomorphic germ $z \mapsto e^{2\pi i\theta} z + O(z^2)$ with θ of Brjuno type is holomorphically linearizable near 0. In particular, every cubic polynomial of the form
$$z \mapsto \lambda z + a_2 z^2 + a_3 z^3, \tag{2.1}$$
with $(a_2, a_3) \in \mathbb{C} \times \mathbb{C}^*$ has a Siegel disk centered at the origin. We are not directly interested in the rather big space of all such cubics. Instead, we would like to consider the space of affine conjugacy classes of these cubics together with a marking of their critical points. A few words on the notion of "marking" are in order; however, we will hardly refer to the following formal definition in the rest of this paper. Roughly speaking, a marking of the critical points of a cubic $P$ of the form (2.1) is a choice of labeling of these critical points. It can be thought of as a surjective function $m$ from the set $\{1, 2\}$ to the set of critical points of $P$. Two such critically marked cubics $(P, m)$ and $(Q, m')$ are affinely conjugate if there is a dilation $\varphi : z \mapsto \alpha z$ such that $\varphi \circ P = Q \circ \varphi$ and $m' = \varphi \circ m$. In other words, an affine conjugacy must also respect the markings. We denote the space of affine conjugacy classes of such critically marked cubics by $\mathcal{P}^{cm}(\theta)$.

One way to parametrize $\mathcal{P}^{cm}(\theta)$ is as follows: In each conjugacy class we choose the unique critically marked cubic $(P, m)$ whose second critical point $m(2)$ is located at $z = 1$ in the complex plane. The first critical point $m(1)$ will then be located at some point $c \neq 0$. It is easy to see that such a cubic has the form
$$P_c : z \mapsto \lambda z \left(1 - \frac{1}{2}\Big(1 + \frac{1}{c}\Big) z + \frac{1}{3c} z^2\right). \tag{2.2}$$
Note that using this normal form, every cubic comes automatically with a marking of its critical points. Thus (2.2) provides us with an identification $\mathcal{P}^{cm}(\theta) \simeq \mathbb{C}^*$. Under this identification, the natural $\mathbb{Z}_2$-action on $\mathcal{P}^{cm}(\theta)$ (swapping the markings of the critical points) corresponds to the involution $c \mapsto 1/c$. By an abuse of notation, we often identify the cubic $P_c \in \mathcal{P}^{cm}(\theta)$ with the parameter $c \in \mathbb{C}^*$.
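The normal form can be sanity-checked numerically. The sketch below assumes (2.2) reads $P_c(z) = \lambda z\,(1 - \tfrac12(1 + 1/c)z + z^2/(3c))$ and verifies two facts stated in the text: the critical points are $c$ and 1, and the dilation $z \mapsto z/c$ conjugates $P_c$ to $P_{1/c}$, which is the involution $c \mapsto 1/c$ swapping the markings. The helper names and the sample parameter are hypothetical.

```python
import cmath

def multiplier(theta):
    """lambda = e^{2 pi i theta}."""
    return cmath.exp(2j * cmath.pi * theta)

def cubic(z, c, lam):
    """Normal form (2.2): P_c(z) = lam z (1 - (1/2)(1 + 1/c) z + z^2 / (3c))."""
    return lam * z * (1 - 0.5 * (1 + 1 / c) * z + z * z / (3 * c))

def cubic_derivative(z, c, lam):
    """Term-by-term derivative of (2.2): lam (1 - (1 + 1/c) z + z^2 / c)."""
    return lam * (1 - (1 + 1 / c) * z + z * z / c)

if __name__ == "__main__":
    lam = multiplier((5 ** 0.5 - 1) / 2)
    c = 2 + 1j   # arbitrary sample parameter in C*
    w = 0.3 - 0.2j
    print(abs(cubic_derivative(c, c, lam)), abs(cubic_derivative(1, c, lam)))
    # Conjugating P_c by z -> z / c should give P_{1/c}.
    print(abs(cubic(c * w, c, lam) / c - cubic(w, 1 / c, lam)))
```

All three printed quantities should vanish up to rounding error.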
The parameter space $\mathcal{P}^{cm}(\theta)$ has two very special points: $P_1$, which corresponds to the conjugacy class of cubics of the form (2.1) with one critical point, and $P_{-1}$, which corresponds to the conjugacy class of those cubics whose critical points are centered. The pair $\{P_1, P_{-1}\}$ coincides with the set of fixed points of the natural $\mathbb{Z}_2$-action on $\mathcal{P}^{cm}(\theta)$.

To understand the implication of marking the critical points, let us also consider the space $\mathcal{P}(\theta)$ of affine conjugacy classes of cubics of the form (2.1), this time with no particular marking. Every cubic in (2.1) can be conjugated to a monic cubic of the form
$$z \mapsto \lambda z + a z^2 + z^3,$$
where $a \in \mathbb{C}$, and this polynomial is uniquely determined by $\pm a$. In other words, the space $\mathcal{P}(\theta)$ is parametrized by the invariant $\zeta = a^2 \in \mathbb{C}$, hence it can be identified with the complex plane. Consider the map which sends every critically marked cubic in $\mathcal{P}^{cm}(\theta)$ to its unique monic representative in $\mathcal{P}(\theta)$. This amounts to "forgetting" the markings of the critical points. It is easy to check that in the coordinates we have chosen, this map $\mathcal{P}^{cm}(\theta) \to \mathcal{P}(\theta)$ is given by
$$\zeta = \frac{3}{4}\, \lambda c \left(1 + \frac{1}{c}\right)^2.$$
It follows that $\mathcal{P}^{cm}(\theta)$ is a double cover of $\mathcal{P}(\theta)$, branched over the points $c = \pm 1$. Note that by the above formula $\zeta(c) = \zeta(1/c)$, as expected.

Notation and Terminology. Throughout this work, the Siegel disk of the cubic $P_c$ centered at the origin is denoted by $\Delta_c$. When we do not want to emphasize the dependence on $c$, we denote the Siegel disk of a cubic $P$ by $\Delta_P$. By the grand orbit $GO(\Delta_P)$ we mean the set of all points in the plane which eventually map to the Siegel disk under the iteration of $P$. In other words,
$$GO(\Delta_P) = \bigcup_{k \ge 0} P^{-k}(\Delta_P).$$
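The formula for the forgetful map can be cross-checked against the coefficients of the normal form. Conjugating $z \mapsto \lambda z + a_2 z^2 + a_3 z^3$ by a dilation $z \mapsto \alpha z$ with $\alpha^2 = a_3$ gives a monic cubic with $a = a_2/\alpha$, so $\zeta = a^2 = a_2^2/a_3$. The sketch below compares this with $\zeta = \tfrac34 \lambda c (1 + 1/c)^2$, assuming the coefficients $a_2 = -\tfrac{\lambda}{2}(1 + 1/c)$ and $a_3 = \lambda/(3c)$ read off from (2.2). Function names are hypothetical.

```python
import cmath

def zeta(c, lam):
    """Closed form zeta = (3/4) lam c (1 + 1/c)^2."""
    return 0.75 * lam * c * (1 + 1 / c) ** 2

def zeta_from_coefficients(c, lam):
    """zeta = a_2^2 / a_3 with a_2, a_3 taken from the normal form (2.2)."""
    a2 = -0.5 * lam * (1 + 1 / c)
    a3 = lam / (3 * c)
    return a2 * a2 / a3

if __name__ == "__main__":
    lam = cmath.exp(2j * cmath.pi * (5 ** 0.5 - 1) / 2)
    c = 2 + 1j  # arbitrary sample parameter
    print(abs(zeta(c, lam) - zeta_from_coefficients(c, lam)))
    print(abs(zeta(c, lam) - zeta(1 / c, lam)))   # invariance under c -> 1/c
    print(zeta(1, lam) / lam, zeta(-1, lam))      # values at the branch points
```

Since $d\zeta/dc = \tfrac34\lambda(1 - c^{-2})$, the derivative vanishes exactly at $c = \pm 1$, consistent with the branching described above.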
Remark. From classical Fatou–Julia theory, we know that every point on the boundary of the Siegel disk $\Delta_c$ must be in the closure of the orbit of either $c$ or 1 [M1, Corollary 11.4]. According to Herman [H1], $P_c|_{\partial \Delta_c}$ has a dense orbit. It follows that the orbit of either $c$ or 1 must accumulate on the entire boundary of $\Delta_c$.

The "size" of a Siegel disk can be measured by the following invariant:

Definition (Conformal Capacity). Consider the Siegel disk $\Delta_c$ for $c \in \mathbb{C}^*$ and the unique linearizing map $h_c : \mathbb{D}(0, r_c) \xrightarrow{\simeq} \Delta_c$, with $h_c(0) = 0$ and $h_c'(0) = 1$. The radius $r_c > 0$ of the domain of $h_c$ is called the conformal capacity of $\Delta_c$ and is denoted by $\kappa(\Delta_c)$. Alternatively, $\kappa(\Delta_c)$ can be described as the derivative $\varphi_c'(0)$ of the unique linearizing map $\varphi_c : \mathbb{D} \xrightarrow{\simeq} \Delta_c$ normalized by $\varphi_c(0) = 0$ and $\varphi_c'(0) > 0$.

Naturally, one is interested in the behavior of the function $c \mapsto \kappa(\Delta_c)$. This function is upper semicontinuous [Y], so a priori it can jump to a lower value, meaning that the Siegel disk $\Delta_c$ can shrink by a very small perturbation of the cubic $P_c$. Later we will see that for θ of bounded type, the closed Siegel disk $\overline{\Delta}_c$ is a quasidisk which moves continuously in the Hausdorff topology on compact subsets of the plane (see Theorem 13.9). Therefore, in that case $\kappa(\Delta_c)$ is actually continuous as a function of $c$. On the other hand, for arbitrary θ of Brjuno type, I do not know if $c \mapsto \kappa(\Delta_c)$ is continuous. However, we have the following general theorem of Yoccoz [Y]:

Theorem 2.1. Let $0 < \theta < 1$ be an irrational number of Brjuno type, and set $W(\theta) = \sum_{n=1}^{\infty} (\log q_{n+1})/q_n < \infty$. Let $S(\theta)$ be the space of all univalent functions $f : \mathbb{D} \to \mathbb{C}$ with $f(0) = 0$ and $f'(0) = e^{2\pi i\theta}$, with the maximal Siegel disk $\Delta_f \subset \mathbb{D}$. Finally, define $\kappa(\theta) = \inf_{f \in S(\theta)} \kappa(\Delta_f)$. Then, there is a universal constant $C > 0$ such that $|\log(\kappa(\theta)) + W(\theta)| < C$.
Fig. 1. The connectedness locus $\mathcal{M}(\theta)$ for $\theta = (\sqrt{5} - 1)/2$
We obtain the following statement which will be used in Theorem 5.3.

Corollary 2.2. In the family $\{P_c\}$ of cubic polynomials in (2.2), the conformal capacity function $c \mapsto \kappa(\Delta_c)$ is locally bounded away from 0.

Definition. We define the cubic connectedness locus $\mathcal{M}(\theta)$ as the set of all critically marked cubics $P \in \mathcal{P}^{cm}(\theta)$ whose Julia sets $J(P)$ are connected.

It follows from classical Fatou–Julia theory that $P \in \mathcal{M}(\theta)$ if and only if both critical points of $P$ have bounded orbits [M1, Theorem 17.3]:
$$\mathcal{M}(\theta) = \{c \in \mathbb{C}^* : \text{the Julia set } J(P_c) \text{ is connected}\} = \{c \in \mathbb{C}^* : \text{both sequences } \{P_c^{\circ k}(c)\} \text{ and } \{P_c^{\circ k}(1)\} \text{ are bounded}\}.$$
Since $P_c$ and $P_{1/c}$ are affinely conjugate as maps when we neglect markings of their critical points, $\mathcal{M}(\theta)$ as a subset of the $c$-plane is invariant under the mapping $c \mapsto 1/c$. Figure 1 shows the connectedness locus $\mathcal{M}(\theta)$ for the golden mean $\theta = (\sqrt{5} - 1)/2 = 0.61803399\ldots$ and Fig. 2 shows the details of the same set near the unit circle.

Proposition 2.3.
(a) $\mathcal{M}(\theta)$ is compact and contained in the open annulus $\mathbb{A}(\frac{1}{30}, 30)$.
(b) The complement $\mathbb{C}^* \smallsetminus \mathcal{M}(\theta)$ has exactly two connected components $\Omega_{ext}$ and $\Omega_{int}$ which are mapped to one another by $c \mapsto 1/c$. Moreover,
$$\Omega_{ext} = \{c \in \mathbb{C}^* : P_c^{\circ k}(c) \to \infty \text{ as } k \to \infty\}, \qquad \Omega_{int} = \{c \in \mathbb{C}^* : P_c^{\circ k}(1) \to \infty \text{ as } k \to \infty\}.$$
Later we will prove that $\Omega_{ext}$ (hence $\Omega_{int}$) is homeomorphic to a punctured disk. This will show that $\mathcal{M}(\theta)$ is a connected set (Theorem 6.1).
Fig. 2. Details of the same connectedness locus near the unit circle
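The description of $\mathcal{M}(\theta)$ by bounded critical orbits suggests a simple escape-time test, sketched below. It uses the escape radius $m_c = 4.38 \max\{|c|, 1\}$ from (2.3) in the proof that follows: once a critical orbit leaves $\mathbb{D}(0, m_c)$ it tends to infinity, so the parameter is outside $\mathcal{M}(\theta)$. The function names, the iteration cap and the default rotation number are assumptions made for illustration; a return value of False only means that no escape was detected within the cap, not that $c$ belongs to $\mathcal{M}(\theta)$.

```python
import cmath

GOLDEN_MEAN = (5 ** 0.5 - 1) / 2

def escape_radius(c):
    """m_c = 4.38 * max(|c|, 1), as in (2.3)."""
    return 4.38 * max(abs(c), 1.0)

def cubic(z, c, lam):
    """Normal form (2.2)."""
    return lam * z * (1 - 0.5 * (1 + 1 / c) * z + z * z / (3 * c))

def escapes(c, theta=GOLDEN_MEAN, max_iter=200):
    """Return True if the orbit of one of the critical points c, 1
    leaves the disk D(0, m_c) within max_iter iterations."""
    lam = cmath.exp(2j * cmath.pi * theta)
    radius = escape_radius(c)
    for start in (c, 1.0):
        z = start
        for _ in range(max_iter):
            z = cubic(z, c, lam)
            if abs(z) > radius:
                return True
    return False

if __name__ == "__main__":
    for c in (100, 0.01, 1.0, -1.0):
        print(c, escapes(c))
```

For $|c| \ge 30$ the orbit of $c$ already leaves the disk at the first step, and by the symmetry $c \mapsto 1/c$ the orbit of 1 does the same for $|c| \le 1/30$.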
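Proof. (a) $\mathcal{M}(\theta)$ is clearly closed. Let
$$m_c = (4.38) \max\{|c|, 1\}. \tag{2.3}$$
If $|z| \ge m_c$, then
$$|P_c(z)| \ge \left(\frac{1}{|c|}\Big(\frac{1}{3}|z| - \frac{1}{4.38}|z|\Big)|z| - 1\right)|z| \ge (0.46|z| - 1)|z| \ge 1.0148\,|z|,$$
from which it follows that
$$K(P_c) \subset \mathbb{D}(0, m_c), \tag{2.4}$$
where $K(P_c)$ is the filled Julia set of $P_c$. Now if $|c| \ge 30$, then
$$|P_c(c)| = \Big|\frac{1}{6}c - \frac{1}{2}\Big|\,|c| \ge (4.5)|c| > m_c,$$
which implies $P_c^{\circ k}(c) \to \infty$ as $k \to \infty$. Therefore $\mathcal{M}(\theta) \subset \mathbb{D}(0, 30)$, hence by symmetry $\mathcal{M}(\theta) \subset \mathbb{A}(\frac{1}{30}, 30)$.

(b) Let $\Omega_{ext}$ be the unbounded connected component of $\mathbb{C}^* \smallsetminus \mathcal{M}(\theta)$. Since $\mathcal{M}(\theta)$ is invariant under $c \mapsto 1/c$, there exists a corresponding component $\Omega_{int}$ of the complement of $\mathcal{M}(\theta)$ containing a punctured neighborhood of the origin. By the proof of (a), we have $\Omega_{ext} \subset \{c \in \mathbb{C}^* : P_c^{\circ k}(c) \to \infty \text{ as } k \to \infty\}$ and similarly $\Omega_{int} \subset \{c \in \mathbb{C}^* : P_c^{\circ k}(1) \to \infty \text{ as } k \to \infty\}$.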
Suppose that there exists a bounded connected component $U$ of $\mathbb{C}^* \smallsetminus \mathcal{M}(\theta)$ which is not $\Omega_{int}$. Then
$$0 < \sup_{c \in \partial U} |c| = R < +\infty.$$
If $c \in \partial U$, it follows from (2.4) that for each $k \ge 0$, $|P_c^{\circ k}(c)|$ and $|P_c^{\circ k}(1)|$ are not greater than $m_c$, and
$$\sup_{c \in \partial U} m_c \le (4.38) \max\{R, 1\} < +\infty.$$
Since $U \neq \Omega_{int}$, we have $\partial U \subset \partial \mathcal{M}(\theta)$ and both $P_c^{\circ k}(c)$ and $P_c^{\circ k}(1)$ are holomorphic in $U$ as functions of $c$. It follows from the Maximum Principle that the iterates $P_c^{\circ k}(c)$ and $P_c^{\circ k}(1)$ are uniformly bounded throughout $U$, which is a contradiction. □

3. Components of the Interior of $\mathcal{M}(\theta)$

First we give the following dynamical characterization of the boundary of the connectedness locus $\mathcal{M}(\theta)$, which is reminiscent of the similar property of the Mandelbrot set. For terminology and basic results on holomorphic motions and J-stability, see for example [Mc2].

Theorem 3.1 (Boundary of $\mathcal{M}(\theta)$ is Unstable). The boundary $\partial \mathcal{M}(\theta)$ is the set of parameters for which the corresponding cubics are not J-stable in $\mathcal{P}^{cm}(\theta)$.

Proof. A polynomial $P_{c_0} \in \mathcal{P}^{cm}(\theta)$ is J-stable if and only if both sequences $\{P_c^{\circ k}(c)\}$ and $\{P_c^{\circ k}(1)\}$ are normal for $c$ in a neighborhood of $c_0$ ([Mc2, Theorem 4.2]). If $c_0 \in \Omega_{ext}$, then $c_0$ escapes to infinity under iterations of $P_{c_0}$, while 1 has bounded orbit. For $c$ close to $c_0$, the orbit of $c$ under $P_c$ will still converge to infinity while 1 will have bounded orbit, with a bound given by $m_c$ in (2.3). It follows from Montel's theorem that both sequences are normal throughout a neighborhood of $c_0$. Hence $c_0$ is J-stable. Similarly, every $P_{c_0}$ with $c_0 \in \Omega_{int}$ is J-stable. If $c_0$ belongs to the interior of $\mathcal{M}(\theta)$, then both $c_0$ and 1 will have orbits contained in $\mathbb{D}(0, m_{c_0})$ and the same holds for all $c$ sufficiently close to $c_0$. Again both sequences $\{P_c^{\circ k}(c)\}$ and $\{P_c^{\circ k}(1)\}$ are normal in a neighborhood of $c_0$. Finally, if $c_0$ belongs to the boundary of $\mathcal{M}(\theta)$, then a small perturbation will make either $c$ or 1 escape to infinity. Hence at least one of the sequences $\{P_c^{\circ k}(c)\}$ or $\{P_c^{\circ k}(1)\}$ fails to be normal in any neighborhood of $c_0$. □

Corollary 3.2. Let $P_{c_0} \in \mathcal{P}^{cm}(\theta)$ have an indifferent periodic orbit other than the fixed point at the origin. Then $c_0 \in \partial \mathcal{M}(\theta)$.

Proof. Otherwise $c_0$ will be a J-stable parameter by the above theorem.
But any stable indifferent cycle has to be persistent ([Mc2, Theorem 4.2]). So the indifferent cycle
$$z(c_0) \mapsto P_{c_0}(z(c_0)) \mapsto \cdots \mapsto P_{c_0}^{\circ k-1}(z(c_0)) \mapsto z(c_0)$$
can be continued analytically to the whole plane as a function of $c$ and the multiplier function $c \mapsto (P_c^{\circ k})'(z(c))$ remains constant. This is clearly impossible, since for example when $c = 3 - 6\bar{\lambda}$, $P_c(c) = c$ is a superattracting fixed point, hence there cannot be any indifferent periodic point other than 0. □

Definition (Types of Components). A component $U$ of the interior of $\mathcal{M}(\theta)$ is called hyperbolic-like if for every $c \in U$, the orbit of either $c$ or 1 under $P_c$ converges to an attracting cycle. $U$ is called a capture component if for every $c \in U$, either $c$ or 1
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
eventually maps into the Siegel disk $\Delta_c$. In case $U$ is neither hyperbolic-like nor capture, we call it a queer component. We say that $P_c$ is hyperbolic-like, capture, or queer if the corresponding parameter $c$ belongs to such a component.

For example, there is a hyperbolic-like component in the form of the main cardioid of a large copy of the Mandelbrot set on the lower right corner of Fig. 1. For every $c$ in this component, the orbit of the critical point $c$ of $P_c$ converges to an attracting fixed point. On the other hand, the large component which is attached on the right side of the unit circle to $c = 1$ is a capture, consisting of all $c$ for which $P_c(c)$ belongs to $\Delta_c$.

Figures 3–7 show examples of the filled Julia sets of cubics in $\mathcal{P}^{cm}(\theta)$ for $\theta = (\sqrt{5} - 1)/2$. Figure 3 is the filled Julia set of a hyperbolic-like cubic. The large topological disk in black is the immediate basin of attraction of an attracting fixed point. Figure 4 is the filled Julia set of a capture, with a critical point in the large preimage of the Siegel disk on the right. The cubic in Fig. 5 is located at the "cusp" of the large cardioid in the right lower corner of Fig. 1, hence it has a parabolic fixed point. The cubic in Fig. 6 has two critical points on the boundary of its Siegel disk. Finally, the cubic in Fig. 7 belongs to $\Omega_{ext}$ so it has a disconnected Julia set. Its filled Julia set has countably many components each quasiconformally homeomorphic to the quadratic Siegel filled Julia set with the same rotation number. The uncountably many remaining components are single points.

In the above definition, we tacitly assumed that hyperbolic-like or capture cubics define components of the interior of $\mathcal{M}(\theta)$. The condition of being hyperbolic-like is clearly open. It is also closed in the interior of $\mathcal{M}(\theta)$ since by Theorem 3.1 a cubic $P$ in the interior of $\mathcal{M}(\theta)$ is J-stable, so in a small neighborhood of it the number of attracting
Fig. 7.
cycles remains constant [Mc2, Theorem 4.2]. This number is 1 if $P$ is accumulated by hyperbolic-like cubics.

Now consider the property of being capture for $P \in \mathcal{P}^{cm}(\theta)$. It follows from Theorem 3.1 that when a capture cubic $P$ belongs to the interior of $\mathcal{M}(\theta)$, there is an open neighborhood of $P$ consisting of captures. Let $V$ be the component of the interior of the set of capture cubics containing $P$. Similarly, define $U$ to be the component of the interior of $\mathcal{M}(\theta)$ containing $P$. Clearly $V \subset U$. If they are not equal, choose a cubic $Q \in \partial V \cap U$. Since $Q$ is J-stable, for all $Q'$ in a small neighborhood of $Q$, a critical point of $Q'$ belongs to the Fatou set of $Q'$ if and only if the corresponding critical point of $Q$ belongs to the Fatou set of $Q$. If we choose $Q' \in V$, there is a critical point of $Q'$ which hits the Siegel disk $\Delta_{Q'}$. It follows that the same is true for $Q$, hence $Q$ is capture, which contradicts $Q \in \partial V$. This proves $V = U$. In other words, when a capture cubic $P$ belongs to the interior of $\mathcal{M}(\theta)$, the entire component of the interior of $\mathcal{M}(\theta)$ containing $P$ consists of captures, hence the name "capture component". However, the above argument does not rule out the possibility of a capture being on the boundary of the connectedness locus $\mathcal{M}(\theta)$. In fact, that the capture condition is open follows from a different type of argument which is standard in deformation theory of rational maps (see Theorem 5.3).

Conjecturally, queer components do not exist. But if they do, every cubic in a queer component exhibits an outstanding property: It admits an invariant line field on its Julia set, and in particular, its Julia set has positive Lebesgue measure.
The proof of this fact depends on the harmonic λ-lemma of Bers and Royden [BR] as well as the elementary observation of Sullivan [Su2] that if the boundary of a Siegel disk moves holomorphically in a family of rational maps, then there is a choice of holomorphically varying Riemann maps for the Siegel disks (also see the new expanded version [McS]). There is a technical difficulty showing up in the proof: For a general θ of Brjuno type, it is not known whether the boundary of the Siegel disk of a P ∈ P cm (θ ) is a Jordan curve. For this reason, the
extension of holomorphic motions to the grand orbits of Siegel disks will require some extra work. We will repeatedly use the following lemma of L. Bers [B], [DH2]:

Lemma 3.3 (Bers Sewing Lemma). Let $E \subset \mathbb{C}$ be closed and $U$ and $V$ be two open neighborhoods of $E$. Let $\varphi : U \xrightarrow{\simeq} \varphi(U)$ and $\psi : V \xrightarrow{\simeq} \psi(V)$ be two homeomorphisms such that
• $\varphi$ is $K_1$-quasiconformal,
• $\psi|_{V \smallsetminus E}$ is $K_2$-quasiconformal,
• $\varphi|_{\partial E} = \psi|_{\partial E}$.
Then the map $\varphi \amalg \psi$ defined on $V$ by
$$(\varphi \amalg \psi)(z) = \begin{cases} \varphi(z) & z \in E \\ \psi(z) & z \in V \smallsetminus E \end{cases}$$
is a $K$-quasiconformal homeomorphism with $K = \max\{K_1, K_2\}$. Moreover, $\overline{\partial}(\varphi \amalg \psi) = \overline{\partial}\varphi$ almost everywhere on $E$.

Theorem 3.4 (Invariant Line Fields for Queer Cubics). Let $U$ be a queer component of the interior of $\mathcal{M}(\theta)$. Then for any $c \in U$, the Julia set $J(P_c)$ has positive Lebesgue measure and supports an invariant line field.

Proof. Fix some $c_0 \in U$. We first note that every Fatou component of $P_{c_0}$ eventually maps to the Siegel disk $\Delta_{c_0}$ and the mapping is a conformal isomorphism: There cannot be further attracting cycles (since $P_{c_0}$ is not hyperbolic-like) or indifferent periodic orbits (see Corollary 3.2). In particular, $K(P_{c_0}) = \overline{GO(\Delta_{c_0})}$. Choose some $c \in U$ with $c \neq c_0$, and let
$$\varphi_c : \mathbb{C} \smallsetminus K(P_{c_0}) \xrightarrow{\simeq} \mathbb{C} \smallsetminus K(P_c)$$
be the conformal conjugacy given by composition of the Böttcher maps of $P_{c_0}$ and $P_c$. A brief computation using the normal form (2.2) shows that $\varphi_c(z) = \sqrt{c/c_0}\, z + O(1)$ and we can choose the branch of the square root near $c_0$ for which $\varphi_{c_0}(z) = z$. Since $\varphi_c$ depends holomorphically on $c$, it defines a holomorphic motion of $\mathbb{C} \smallsetminus K(P_{c_0})$. By the harmonic λ-lemma [BR], this motion extends to a unique holomorphic motion $\hat{\varphi}_c$ of the entire plane, which is now defined only for $c$ in a small neighborhood $V$ of $c_0$, with the following properties:
• For every $c \in V$, $\hat{\varphi}_c$ is a quasiconformal homeomorphism of the plane.
• For every $c \in V$, the Beltrami differential $\dfrac{\overline{\partial}\hat{\varphi}_c}{\partial \hat{\varphi}_c} \dfrac{d\bar{z}}{dz}$ is harmonic in $GO(\Delta_{c_0})$.
It is easy to see that uniqueness of this extended motion implies that $\hat{\varphi}_c$ conjugates $P_{c_0}$ to $P_c$ on the entire plane (compare [McS]). In fact, one can replace $\hat{\varphi}_c$ by $P_c^{-1} \circ \hat{\varphi}_c \circ P_{c_0}$ on $GO(\Delta_{c_0})$, which also extends $\varphi_c$, where the branch of $P_c^{-1}$ is determined uniquely by the values of $\hat{\varphi}_c$ on the Julia set $J(P_{c_0})$. Hence $\hat{\varphi}_c = P_c^{-1} \circ \hat{\varphi}_c \circ P_{c_0}$ by uniqueness.

Next, we want to show that the restriction $\hat{\varphi}_c : GO(\Delta_{c_0}) \to GO(\Delta_c)$ is a conformal conjugacy. As Sullivan observes in [Su2], the fact that the boundary of $\Delta_c$ moves holomorphically for $c \in U$ (Theorem 3.1) implies that there is a choice of the Riemann map $\zeta_c : \mathbb{D} \to \Delta_c$ such that $\zeta_c(0) = 0$ and $c \mapsto \zeta_c$ is holomorphic in $c$. Define a conformal
conjugacy $\psi_c : \Delta_{c_0} \to \Delta_c$ by $\psi_c = \zeta_c \circ \zeta_{c_0}^{-1}$, and extend it to a conformal conjugacy $\psi_c : GO(\Delta_{c_0}) \to GO(\Delta_c)$ by taking pull-backs as follows. Take any component $W$ of $P_{c_0}^{-n}(\Delta_{c_0})$ and let $W_c = \hat{\varphi}_c(W)$ be the corresponding component of $P_c^{-n}(\Delta_c)$. Define $\psi_c : W \to W_c$ by $\psi_c = P_c^{-n} \circ \psi_c \circ P_{c_0}^{\circ n}$. Since $c \mapsto \psi_c$ is holomorphic and $\psi_c = \mathrm{id}$ on $GO(\Delta_{c_0})$ when $c = c_0$, it follows that $\psi_c$ defines a holomorphic motion of $GO(\Delta_{c_0})$. By the harmonic λ-lemma, $\psi_c$ extends to a unique holomorphic motion $\hat{\psi}_c$ of the entire plane which is defined for $c$ in a neighborhood $V'$ of $c_0$ and has harmonic Beltrami differential on $\mathbb{C} \smallsetminus K(P_{c_0})$. By an argument similar to the one we used for $\hat{\varphi}_c$, it follows that $\hat{\psi}_c$ respects the dynamics, i.e., it conjugates $P_{c_0}$ to $P_c$ on the entire plane. In particular, it sends the marked critical point $c_0$ of $P_{c_0}$ to the marked critical point $c$ of $P_c$. Let us assume for example that the forward orbit of $c_0$ accumulates on the boundary of $\Delta_{c_0}$. Then the same is true for $c$ and $\Delta_c$. Since $\hat{\varphi}_c$ was also a conjugacy to begin with, for all $c \in V \cap V'$ we have $\hat{\psi}_c(c_0) = c = \hat{\varphi}_c(c_0)$, and by induction $\hat{\psi}_c(P_{c_0}^{\circ k}(c_0)) = P_c^{\circ k}(c) = \hat{\varphi}_c(P_{c_0}^{\circ k}(c_0))$ for all $k$. Since every point on the boundary of $\Delta_{c_0}$ is in the closure of the forward orbit of $c_0$, we conclude that $\hat{\psi}_c$ and $\hat{\varphi}_c$ agree on $\partial \Delta_{c_0}$. Evidently this shows that $\hat{\psi}_c$ and $\hat{\varphi}_c$ agree on the boundary of every bounded Fatou component of $P_{c_0}$, hence on the entire Julia set $J(P_{c_0})$. It follows then from the Bers Sewing Lemma 3.3 that $\varphi_c \amalg \psi_c$ defined by
$$(\varphi_c \amalg \psi_c)(z) = \begin{cases} \hat{\varphi}_c(z) & z \in \mathbb{C} \smallsetminus GO(\Delta_{c_0}) \\ \hat{\psi}_c(z) & z \in GO(\Delta_{c_0}) \end{cases}$$
is a quasiconformal homeomorphism which has harmonic Beltrami differential in $\mathbb{C} \smallsetminus J(P_{c_0})$. Note that $\varphi_c \amalg \psi_c$ is an extension of both $\varphi_c$ and $\psi_c$. By uniqueness, we conclude that $\hat{\varphi}_c \equiv \hat{\psi}_c$. In particular, when $c \in V \cap V'$, $\hat{\varphi}_c$ is conformal away from the Julia set $J(P_{c_0})$.

Now, if the Julia set $J(P_{c_0})$ had measure zero, $\hat{\varphi}_c$ would have been conformal, contradicting $c \neq c_0$. So $J(P_{c_0})$ has positive measure. The desired invariant line field is then given by $\hat{\varphi}_c^*(\sigma_0)$, the pull-back of the standard conformal structure $\sigma_0$ on the plane by $\hat{\varphi}_c$. □

The existence of holomorphic motions in the above proof was the crucial fact which made the conformal extensions possible. In the case we have "static" quasiconformal conjugacies, such conformal extensions are still possible once we assume that the boundaries of Siegel disks are Jordan curves.

Let $\Delta$ be a Jordan domain containing the origin and $R_t : z \mapsto e^{2\pi i t} z$ be the rigid rotation on the unit circle. Let $\zeta : \Delta \to \mathbb{D}$ be any conformal isomorphism with $\zeta(0) = 0$. Then the homeomorphism $h^t_\Delta : \partial\Delta \to \partial\Delta$ defined by $h^t_\Delta = \zeta^{-1} \circ R_t \circ \zeta$ is the intrinsic rotation of $\partial\Delta$ by angle $t$. By the Schwarz Lemma, $h^t_\Delta$ is independent of the choice of $\zeta$. Now suppose $\Delta_1$ and $\Delta_2$ are two Jordan domains containing 0 and $t$ is irrational. Let $\varphi : \partial\Delta_1 \to \partial\Delta_2$ be a homeomorphism satisfying $\varphi \circ h^t_{\Delta_1} = h^t_{\Delta_2} \circ \varphi$. Then two points $a_1 \in \Delta_1$ and $a_2 \in \Delta_2$ have the same conformal position with respect to $\varphi$ if $\zeta_1(a_1) = \zeta_2(a_2)$, where the $\zeta_j : \Delta_j \to \mathbb{D}$ are conformal isomorphisms with $\zeta_j(0) = 0$ and $\zeta_1 = \zeta_2 \circ \varphi$ on $\partial\Delta_1$.

Lemma 3.5 (Extending QC Conjugacies). Let $P$ and $Q$ be two cubics in $\mathcal{P}^{cm}(\theta)$ such that the boundaries of the Siegel disks $\Delta_P$ and $\Delta_Q$ are Jordan curves. Let $\varphi : \mathbb{C} \to \mathbb{C}$ be a quasiconformal homeomorphism whose restriction $\mathbb{C} \smallsetminus GO(\Delta_P) \to \mathbb{C} \smallsetminus GO(\Delta_Q)$ conjugates $P$ to $Q$. Then
(a) If $P$ is not capture, there exists a quasiconformal homeomorphism $\psi : \mathbb{C} \to \mathbb{C}$ which conjugates $P$ and $Q$, which is conformal on $GO(\Delta_P)$ and agrees with $\varphi$ on $\mathbb{C} \smallsetminus GO(\Delta_P)$.
Fig. 8. Extending $\varphi$ in the capture case
(b) If $P$ is capture, we can construct a $\psi$ as in (a) if and only if the captured images of the critical points of $P$ and $Q$ in $\Delta_P$ and $\Delta_Q$ have the same conformal position with respect to $\varphi$.

Proof. (a) Fix some $b_1 \in \partial\Delta_P$ and let $b_2 = \varphi(b_1)$. Consider conformal isomorphisms $\zeta_1 : \Delta_P \xrightarrow{\simeq} \mathbb{D}$ and $\zeta_2 : \Delta_Q \xrightarrow{\simeq} \mathbb{D}$, with $\zeta_1(0) = 0 = \zeta_2(0)$ and $\zeta_1(b_1) = 1 = \zeta_2(b_2)$, which conjugate $P$ on $\Delta_P$ and $Q$ on $\Delta_Q$ to the rigid rotation $R_\theta : z \mapsto e^{2\pi i\theta} z$ on $\mathbb{D}$. Since the boundaries of $\Delta_P$ and $\Delta_Q$ are Jordan curves, $\zeta_1$ and $\zeta_2$ extend homeomorphically to the closures. The composition $\psi = \zeta_2^{-1} \circ \zeta_1 : \Delta_P \to \Delta_Q$ is conformal and conjugates $P$ on $\Delta_P$ to $Q$ on $\Delta_Q$. Also $\psi(b_1) = \varphi(b_1) = b_2$ and by induction $\psi(P^{\circ k}(b_1)) = Q^{\circ k}(b_2) = \varphi(P^{\circ k}(b_1))$. Since the orbit of $b_1$ is dense on the boundary of $\Delta_P$, we have $\psi|_{\partial\Delta_P} = \varphi|_{\partial\Delta_P}$. Therefore, $\psi$ gives the required extension of $\varphi$ to the Siegel disk $\Delta_P$. It is now easy to extend $\psi$ to the grand orbit $GO(\Delta_P)$: $P^{\circ k}$ maps any component of $P^{-k}(\Delta_P)$ isomorphically onto $\Delta_P$. Hence we can define $\psi$ on any such component as the composition $Q^{-k} \circ \psi|_{\Delta_P} \circ P^{\circ k}$, where the branch of $Q^{-k}$ is determined by the values of $\varphi$ on the Julia set $J(P)$. Clearly this composition is conformal inside this component and agrees with $\varphi$ on its boundary. The map $\psi$ defined this way is a quasiconformal homeomorphism by the Bers Sewing Lemma 3.3, with $U = V = \mathbb{C}$ and $E = \mathbb{C} \smallsetminus GO(\Delta_P)$.

(b) Now let $P$ be capture. The construction of $\psi$ goes through as in case (a) except for the last part where we want to extend $\psi$ by taking pull-backs. Suppose that there exists a positive integer $k$ such that the critical point $c_1$ of $P$ belongs to the component $U_1$ of $P^{-k}(\Delta_P)$. Let $V_1 = P(U_1)$ and let $v_1 = P(c_1)$ be the critical value in $V_1$. Since $P : \partial U_1 \to \partial V_1$ is a double covering and $\varphi$ conjugates $P$ to $Q$ on the Julia sets, there must be a critical point $c_2$ of $Q$ in a component $U_2$ of $Q^{-k}(\Delta_Q)$, with $\partial U_2 = \varphi(\partial U_1)$. Similarly define $V_2$ and $v_2$. By the proof of part (a) we can define $\psi$ inductively up to the $(k-1)$th preimages of $\Delta_P$, including $V_1$. This gives us a conformal isomorphism $\psi : V_1 \to V_2$ which necessarily maps $v_1$ to $v_2$, because by our assumption $P^{\circ k}(c_1)$ and $Q^{\circ k}(c_2)$ have the same conformal position in $\Delta_P$ and $\Delta_Q$ and so one gets mapped to the other by $\psi|_{\Delta_P}$. Choose any simple arc $\gamma_1$ in $V_1$ connecting $v_1$ to some boundary point $\beta_1$. The simple arc $\gamma_2 = \psi(\gamma_1)$ in $V_2$ connects $v_2$ to the boundary point $\beta_2 = \psi(\beta_1)$.
Pull $\gamma_1$ back by $P$ to get two branches of a simple arc passing through the critical point $c_1$ with two distinct endpoints $\alpha_1$ and $\alpha_1'$ on the boundary of $U_1$. Similarly we consider the pull-back of $\gamma_2$ by $Q$ and we get two endpoints on the boundary of $U_2$, which we label as $\alpha_2 = \varphi(\alpha_1)$ and $\alpha_2' = \varphi(\alpha_1')$ (see Fig. 8). Now the inverse $Q^{-1}$ can be defined analytically over $V_2 \smallsetminus \gamma_2$ and has two branches which take values in two different connected components of $U_2 \smallsetminus Q^{-1}(\gamma_2)$. Define $\psi$ on $U_1$ as the composition $Q^{-1} \circ \psi \circ P$, where the boundary orientation tells us which of the two branches of $Q^{-1}$ has to be taken. This way we extend $\psi$ to $U_1$ and $\psi$ can then be defined on further preimages of $\Delta_P$ similar to the case (a). □

4. Renormalizable Cubics

This section briefly studies the class of renormalizable cubics in $\mathcal{P}^{cm}(\theta)$. These are the cubics with disjoint critical orbits from which one can extract the quadratic $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ by straightening. From a different point of view, one may consider a renormalizable cubic with connected Julia set as the result of "intertwining" the quadratic $Q_\theta$ with another quadratic with connected Julia set (compare [EY]). For background on polynomial-like maps, straightening and hybrid classes, see for example [DH2].

Definition. A cubic $P \in \mathcal{P}^{cm}(\theta)$ is called renormalizable if there exists a pair of Jordan domains $U$ and $V$, with $0 \in U \Subset V$, such that the restriction $P|_U : U \to V$ is a quadratic-like map hybrid equivalent to $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$.

When θ is irrational of bounded type, it follows from the work of Douady-Ghys-Herman-Shishikura [D2] that the boundary of the Siegel disk of $Q_\theta$ is a quasicircle passing through the critical point. Hence the same is true for the Siegel disk $\Delta_P$ when $P$ is renormalizable.

To prove the next theorem, we need the following useful lemma of Kiwi in [K]. This lemma in particular shows that each indifferent cycle for a cubic $P \in \mathcal{P}^{cm}(\theta)$ must attract its own critical point.

Lemma 4.1 (Separation Lemma).
Let $P$ be a polynomial with connected Julia set. Then there exists a finite collection of closed preperiodic external rays, separating the plane into disjoint open simply-connected sets $\{U_j\}$, such that:
• Each $U_j$ contains at most one non-repelling periodic point or periodic Fatou component of $P$.
• If $z_1 \mapsto \cdots \mapsto z_p \mapsto z_1$ is a non-repelling cycle meeting $U_{i_1} \mapsto \cdots \mapsto U_{i_p} \mapsto U_{i_1}$, then $\bigcup_{j=1}^{p} U_{i_j}$ contains the entire orbit of at least one critical point of $P$.

Theorem 4.2. A cubic $P \in \mathcal{P}^{cm}(\theta)$ is renormalizable if either of the following conditions holds:
(a) $P$ has a non-repelling periodic orbit other than 0 which is not parabolic.
(b) $P$ has disconnected Julia set.

Proof. First assume that we are in case (a) so that $J(P)$ is connected. Let $R$ be the finite collection of the closed preperiodic external rays given by the Separation Lemma 4.1. Let $V$ be the component of $\mathbb{C} \setminus R$ which contains 0, cut off by an equipotential of $K(P)$. Finally, let $U$ be the component of $P^{-1}(V)$ containing 0. Since all the rays in $R$ are preperiodic, $P(R) \subset R$, hence $U \subset V$. $U$ necessarily contains a critical point of $P$ since
Fig. 9. Filled Julia set of the quadratic $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ for $\theta = (\sqrt{5} - 1)/2$
otherwise the Schwarz lemma and $|P'(0)| = 1$ would imply that $U = V$ and $P|_U : U \to V$ is a conformal isomorphism conjugate to a rotation. This would contradict the fact that $U$ intersects the basin of attraction of infinity for $P$. The other critical point of $P$ has to stay away from $V$ because by the second part of the Separation Lemma its entire orbit lives in the cycle of components of $\mathbb{C} \setminus R$ which contains the non-repelling periodic orbit of $P$. Since by our assumption the non-repelling cycle of $P$ is not parabolic, the landing points of the external rays in $R$ must all be repelling. Therefore, by a simple “thickening” procedure (see for example [M3]), we can assume that $\overline{U} \subset V$, so that $P|_U : U \to V$ is a quadratic-like map. Up to affine conjugation, there is only one quadratic polynomial which has a fixed Siegel disk of rotation number $\theta$, so this quadratic-like map has to be hybrid equivalent to $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$.

Now suppose that $J(P)$ is disconnected. For $\varepsilon > 0$, let $U_\varepsilon$ be the connected component of $\{z \in \mathbb{C} : G_P(z) < \varepsilon\}$ containing the Siegel disk $\Delta_P$, where $G_P : \mathbb{C} \to \{x \in \mathbb{R} : x \ge 0\}$ is the Green’s function of $K(P)$. It is not hard to see that for small $\varepsilon$, $P|_{U_\varepsilon} : U_\varepsilon \to U_{3\varepsilon}$ is a quadratic-like map, necessarily hybrid equivalent to $Q_\theta$. □

Figures 3 and 7 demonstrate the above theorem. In either example, one can see the filled Julia set of the quadratic-like restriction $P|_U : U \to V$ given by the above theorem, which is quasiconformally homeomorphic to the filled Julia set of $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$ in Fig. 9.

Remark. When $P \in \mathcal{P}^{cm}(\theta)$ has a parabolic cycle, we can no longer expect to extract $Q_\theta$ from it by straightening. However, there must be a homeomorphic embedding $K(Q_\theta) \to K(P)$, conformal in the interior of $K(Q_\theta)$, which conjugates $Q_\theta$ to $P$. This can be proved directly when $\theta$ is of bounded type (by using Theorem 13.7), and in the general case by using the parabolic surgery recently introduced in [Ha].

Corollary 4.3. Let $\theta$ be an irrational number of bounded type.
Let $P \in \mathcal{P}^{cm}(\theta)$ be hyperbolic-like or have disconnected Julia set $J(P)$. Then $J(P)$ has Lebesgue measure zero.
Proof. Let $P|_U : U \to V$ be the quadratic-like restriction given by Theorem 4.2 and let $K$ be its filled Julia set. Since this restriction is hybrid equivalent to $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$, whose Julia set has measure zero by the theorem of Petersen [Pe], we simply conclude that $\partial K$ has Lebesgue measure zero. It is well-known that the forward orbit of almost every point $z \in J(P)$ accumulates on the $\omega$-limit set of the critical points of $P$ [Ly, Prop. 1.14], which in this case is just $\partial\Delta_P$ together with the attracting periodic orbit (resp. $\partial\Delta_P$) if $P$ is hyperbolic-like (resp. with disconnected Julia set). So the orbit of almost every $z \in J(P)$ accumulates on $\partial\Delta_P$. This implies that for all $n \ge N = N(z)$, $P^{\circ n}(z) \in V$. This can happen only if $P^{\circ N}(z) \in \partial K$ or equivalently $z \in P^{-N}(\partial K)$. We conclude that, up to a set of measure zero, $J(P) = \bigcup_{N \ge 0} P^{-N}(\partial K)$. But the right-hand side has measure zero because $\partial K$ does. This proves that $J(P)$ has Lebesgue measure zero as well. □
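The quadratic $Q_\theta$ of Fig. 9 is easy to explore numerically. The following is a minimal sketch, not taken from the paper: the function names, the iteration count and the sample points are illustrative assumptions. It uses only the elementary fact that for $|z| > 2$ one has $|Q_\theta(z)| \ge |z|(|z| - 1) > |z|$, so orbits leaving the disk of radius 2 escape to infinity.

```python
import cmath
import math

# Golden-mean rotation number, as in Fig. 9.
THETA = (math.sqrt(5) - 1) / 2
# Multiplier of the indifferent fixed point at 0.
LAMBDA = cmath.exp(2j * math.pi * THETA)


def q_theta(z):
    """One step of Q_theta(z) = e^{2 pi i theta} z + z^2."""
    return LAMBDA * z + z * z


def critical_point():
    """The unique finite critical point: Q'(z) = LAMBDA + 2z vanishes at -LAMBDA/2."""
    return -LAMBDA / 2


def stays_bounded(z, max_iter=2000, escape_radius=2.0):
    """Heuristic membership test for the filled Julia set K(Q_theta).

    Returns False as soon as the orbit leaves the disk of radius 2
    (such orbits tend to infinity); returns True if the orbit is still
    inside after max_iter steps. The iteration count is an arbitrary
    illustrative cut-off, so True only means "no escape detected".
    """
    for _ in range(max_iter):
        if abs(z) > escape_radius:
            return False
        z = q_theta(z)
    return abs(z) <= escape_radius
```

Sampling `stays_bounded` on a grid of complex numbers gives a rough picture of the filled Julia set; points very close to the Julia set itself are sensitive to the cut-off and to rounding.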
5. Quasiconformal Conjugacy Classes

In this section we characterize the quasiconformal conjugacy classes in $\mathcal{P}^{cm}(\theta)$. A central role is played by the following:

Theorem 5.1 (Parametrization of QC Conjugacy Classes). Let $P_{c_0}$, $P_{c_1}$ be distinct cubics in $\mathcal{P}^{cm}(\theta)$ and let $\varphi : \mathbb{C} \to \mathbb{C}$ be a $K$-quasiconformal homeomorphism which conjugates $P_{c_0}$ to $P_{c_1}$, i.e., $\varphi \circ P_{c_0} = P_{c_1} \circ \varphi$ and $\varphi(c_0) = c_1$. Then there exists a holomorphic map $t \mapsto c_t$ from an open disk $D(0, r)$ ($r > 1$) into $\mathbb{C}^*$ which maps 0 to $c_0$ and 1 to $c_1$, such that for every $t \in D(0, r)$, $P_{c_0}$ is conjugate to $P_{c_t}$ by a $K_t$-quasiconformal homeomorphism $\varphi_t : \mathbb{C} \to \mathbb{C}$. Moreover, $K_t \to 1$ as $t \to 0$.

Proof. The idea of the proof is standard in Holomorphic Dynamics (see [Su2,DH2]); however, we briefly sketch it here because similar arguments appear again in the rest of this work. Define a conformal structure $\sigma$ on $\mathbb{C}$ by $\sigma = \varphi^* \sigma_0$, where, as usual, $\sigma_0$ is the standard conformal structure on $\mathbb{C}$. (To simplify the notation, in what follows we identify a conformal structure on $\mathbb{C}$ with its associated Beltrami differential.) Since $P_{c_1}$ is holomorphic, $P_{c_0}$ has to preserve $\sigma$. Since $\varphi$ is quasiconformal, $\|\sigma\|_\infty < 1$. Define a one-parameter family $\{\sigma_t\}$ of complex-analytic deformations of $\sigma$ by $\sigma_t = t\sigma$, where $t \in D(0, r)$ and $r > 1$ is chosen such that $r\|\sigma\|_\infty < 1$. By the Measurable Riemann Mapping Theorem [AB], there exists a unique quasiconformal homeomorphism $\varphi_t$ of the plane which solves the Beltrami equation $\varphi_t^* \sigma_0 = \sigma_t$ and fixes 0, 1 and $\infty$. Define $P^t = \varphi_t \circ P_{c_0} \circ \varphi_t^{-1}$. Since $P_{c_0}$ is holomorphic, it acts as a pure rotation on Beltrami differentials. Hence $P_{c_0}^* \sigma = \sigma$ implies $P_{c_0}^* \sigma_t = \sigma_t$ and therefore $P^t$ is a quasiregular self-map of the plane which preserves $\sigma_0$ and is conjugate to a cubic polynomial. It is then easy to see that $P^t$ itself is a cubic polynomial with a fixed Siegel disk of rotation number $\theta$ centered at 0 with a marked critical point at $z = 1$.
Note that $t \mapsto \sigma_t$ is holomorphic, so the same is true for $t \mapsto \varphi_t$ and hence $t \mapsto P^t$ by the analytic dependence of the solutions of the Beltrami equation on parameters [AB]. Therefore the map $t \mapsto c_t$ which defines the second critical point of $P^t$ so that $P^t = P_{c_t}$ is holomorphic. It is easy to see that $c_t$ has all the required properties. □

Corollary 5.2. Quasiconformal conjugacy classes in $\mathcal{P}^{cm}(\theta)$ are either single points or open and connected. In particular, cubics on the boundary $\partial\mathcal{M}(\theta)$ are quasiconformally rigid, i.e., their conjugacy classes are single points.
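The step "a holomorphic map acts as a pure rotation on Beltrami differentials" in the proof of Theorem 5.1 can be written out explicitly. The display below is a sketch using the standard pull-back formula for Beltrami differentials; the symbols $f$ and $\mu$ are generic placeholders (not notation from the paper), with $f = P_{c_0}$ and $\mu = \sigma$ in the application.

```latex
% Pull-back of a Beltrami differential \mu = \mu(z)\, d\bar z / dz by a holomorphic map f:
\[
  (f^{*}\mu)(z) \;=\; \mu\bigl(f(z)\bigr)\,\frac{\overline{f'(z)}}{f'(z)},
  \qquad \left|\frac{\overline{f'(z)}}{f'(z)}\right| = 1 .
\]
% The factor is independent of \mu, so the pull-back is complex-linear in \mu:
\[
  f^{*}(t\mu) \;=\; t\, f^{*}\mu \qquad (t \in \mathbb{C}).
\]
% Hence invariance of \sigma passes to \sigma_t = t\sigma:
\[
  P_{c_0}^{*}\sigma = \sigma
  \;\Longrightarrow\;
  P_{c_0}^{*}\sigma_t = t\,P_{c_0}^{*}\sigma = t\sigma = \sigma_t ,
\]
% and the dilatation of \varphi_t is controlled by
\[
  \|\sigma_t\|_\infty = |t|\,\|\sigma\|_\infty < 1,
  \qquad
  K_t \;=\; \frac{1+|t|\,\|\sigma\|_\infty}{1-|t|\,\|\sigma\|_\infty} \;\longrightarrow\; 1
  \quad (t \to 0).
\]
```

The last line is the usual relation between the norm of a Beltrami coefficient and the maximal dilatation, and is consistent with the conclusion $K_t \to 1$ of Theorem 5.1.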
Theorem 5.3 (Capture is an open condition). Let $P_{c_0}$ be a capture cubic. Then there is an open neighborhood $U \subset \mathcal{P}^{cm}(\theta)$ of $c_0$ such that for every $c \in U$, $P_c$ is also capture. In particular, capture cubics belong to the interior of the connectedness locus $\mathcal{M}(\theta)$.

Proof. To fix the ideas, let us assume that $P_{c_0}^{\circ k}(c_0) \in \Delta_{c_0}$ and $k \ge 1$ is the smallest such integer. First assume that $P_{c_0}^{\circ k}(c_0) \ne 0$. Let $A \subset \Delta_{c_0}$ be the annulus bounded by $\partial\Delta_{c_0}$ and the analytic invariant curve in $\Delta_{c_0}$ passing through $P_{c_0}^{\circ k}(c_0)$. Take a conformal isomorphism $\psi : A \xrightarrow{\simeq} A(1, \varepsilon)$, with $\varepsilon = e^{2\pi\,\mathrm{mod}(A)} > 1$, which conjugates $P_{c_0}$ on $A$ to the rotation on $A(1, \varepsilon)$. Postcompose $\psi$ with a (non-conformal) dilation $A(1, \varepsilon) \to A(1, \varepsilon^2)$ to get a quasiconformal homeomorphism $\varphi : A \to A(1, \varepsilon^2)$ conjugating $P_{c_0}$ to the rotation. Define a $P_{c_0}$-invariant conformal structure $\sigma$ on $\mathbb{C}$ by putting $\sigma = \varphi^* \sigma_0$ on $A$ and pulling it back by the inverse branches of $P_{c_0}$ to the entire grand orbit of $A$. Set $\sigma = \sigma_0$ elsewhere. As in the proof of Theorem 5.1, we define $\sigma_t = t\sigma$ for $t \in D(0, r)$ for some $r > 1$, solve the Beltrami equation $\varphi_t^* \sigma_0 = \sigma_t$ and set $P^t = \varphi_t \circ P_{c_0} \circ \varphi_t^{-1}$. Then $P^t$ is a capture cubic in $\mathcal{P}^{cm}(\theta)$ and $P^0 = P_{c_0}$. The holomorphic mapping $t \mapsto P^t$ is not constant because $\mathrm{mod}(\varphi_1(A))$ is the same as the modulus of $A$ equipped with the conformal structure $\sigma$, which in turn is $(1/2\pi) \log(\varepsilon^2) = 2\,\mathrm{mod}(A)$. Hence $P^1 \ne P^0$ and the mapping $t \mapsto P^t$ is open.

Now consider the case where $P_{c_0}^{\circ k}(c_0) = 0$. In this case, by Corollary 2.2, the conformal capacity of $\Delta_c$ has a positive lower bound for all $c$ sufficiently close to $c_0$. It follows that there exists an $\varepsilon > 0$ such that for all $c$ close to $c_0$, $\Delta_c \supset D(0, \varepsilon)$. Hence a small perturbation of $P_{c_0}$ will still be a capture cubic. □

By a center of a hyperbolic-like component $U \subset \mathcal{M}(\theta)$ we mean a cubic $P_c \in U$ with one of the critical points $c$ or 1 being periodic. Similarly, a center of a capture component will be a cubic with one critical point eventually mapped to the indifferent fixed point at the origin.

Lemma 5.4 (Existence of Centers). Every hyperbolic-like or capture component of the interior of $\mathcal{M}(\theta)$ has a center. By the remark after the proof, centers of hyperbolic-like or capture components are unique when $\theta$ is of bounded type.

Proof. First let $U$ be a hyperbolic-like component. For every $c \in U$, consider the multiplier $m(c)$ of the unique attracting periodic orbit of $P_c$. The mapping $c \mapsto m(c)$ from $U$ into $\mathbb{D}$ is easily seen to be proper and holomorphic.
Hence it vanishes at a finite number of points in $U$.

Now let $U$ be capture. To be more specific, let us assume that for every $c \in U$, $P_c^{\circ k}(c)$ belongs to the Siegel disk $\Delta_c$, and let $k$ be the smallest such integer. Since $P_c$ is $J$-stable by Theorem 3.1, the boundary of $\Delta_c$ moves holomorphically. Then, as in the proof of Theorem 3.4, there is a holomorphically varying choice of the Riemann maps $\zeta_c : \mathbb{D} \to \Delta_c$ with $\zeta_c(0) = 0$. Define a map $m : U \to \mathbb{D}$ by $m(c) = \zeta_c^{-1}(P_c^{\circ k}(c))$. Clearly $m$ is holomorphic. Let $c_n \in U$ be any sequence which converges to $c \in \partial U$ as $n \to \infty$. For simplicity, put $\zeta_{c_n} = \zeta_n$. Let $z_n = P_{c_n}^{\circ k}(c_n) \in \Delta_{c_n}$ and $w_n = m(c_n) = \zeta_n^{-1}(z_n) \in \mathbb{D}$. If $w_n$ does not converge to the unit circle, we can find a subsequence $w_{n(j)}$ such that $w_{n(j)} \to w \in \mathbb{D}$ as $j \to \infty$. Since the family of univalent functions
$\{\zeta_n : \mathbb{D} \to \mathbb{C}\}$ is normal, by passing to a further subsequence if necessary, we may assume that $\zeta_{n(j)} \to \zeta$ locally uniformly on $\mathbb{D}$. Clearly $\zeta(\mathbb{D}) \subset \Delta_c$. Therefore, $\zeta(w) = \lim_j \zeta_{n(j)}(w_{n(j)}) = \lim_j z_{n(j)} = P_c^{\circ k}(c) \in \Delta_c$. But this means that $P_c$ is capture, which contradicts $c \in \partial U$. This proves that $w_n$ converges to the unit circle. Hence $m$ is a proper map. Now, as before, $m^{-1}(0)$ has to be non-vacuous and finite. □

Remark. To show uniqueness of centers, by Theorem 5.1 it would be enough to prove that any two centers for a component are quasiconformally conjugate. When the rotation number $\theta$ is of bounded type, this can be proved by a pull-back argument similar to Lemma 3.5 since in this case the boundary of $\Delta_P$ for $P \in \mathcal{P}^{cm}(\theta)$ is a Jordan curve by Theorem 13.7 (compare [Mc1] or [M2], where uniqueness of centers is shown for every hyperbolic component in the space of polynomial maps).

Theorem 5.5 (QC Conjugacy Classes in $\mathcal{P}^{cm}(\theta)$). Quasiconformal conjugacy classes in $\mathcal{P}^{cm}(\theta)$ are given by the following list:
(a) Hyperbolic-like or capture components of the interior of $\mathcal{M}(\theta)$ with the center(s) removed.
(b) The two components $\Omega_{ext}$ and $\Omega_{int}$.
(c) Queer components of the interior of $\mathcal{M}(\theta)$.
(d) Centers of hyperbolic-like or capture components.
(e) Single points on the boundary of $\mathcal{M}(\theta)$.

Proof. Corollary 5.2 shows that no conjugacy class intersects two distinct members of the above list. It also proves that (d) and (e) are in fact conjugacy classes. Also the proof of Theorem 3.4 shows that every queer component is a conjugacy class. That (a) and (b) are quasiconformal conjugacy classes follows from the fact that over the components of type (a) or (b), the family $\{P_c\}$ has no critical orbit relations ([McS], Theorem 2.7). □

6. Connectivity of $\mathcal{M}(\theta)$

In this section we prove that $\mathcal{M}(\theta)$ is connected. This amounts to showing that each of its complementary components $\Omega_{ext}$ and $\Omega_{int}$ is homeomorphic to the punctured disk.
One way to do this is to mimic the standard Douady–Hubbard proof of connectivity of the Mandelbrot set [DH1]: We can construct a holomorphic branched covering $\Phi : \Omega_{ext} \to \mathbb{C} \setminus \overline{\mathbb{D}}$ by assigning to each $P_c \in \Omega_{ext}$ the position of the critical point $c$ in the Böttcher coordinate of $P_c$. $\Phi$ extends holomorphically to infinity with $\Phi^{-1}(\infty) = \infty$. The degree of this map is 3, so to prove that $\Omega_{ext}$ is a punctured disk we must show that $\Phi$ has no critical point other than $\infty$. (This additional difficulty does not show up in the case of the Mandelbrot set where the similar map has degree 1.) To prove that $\Phi$ is locally injective, one can start with two nearby polynomials in the same fiber of $\Phi$ and define a conformal conjugacy between them near infinity by composing their Böttcher coordinates. This conjugacy can be conformally extended using the dynamics to the entire basin of attraction of infinity. Then a delicate argument is necessary to prove that one can extend the conjugacy further to the complex plane in a holomorphic way, proving that the two polynomials are identical (see [Z3] for details of such a proof).

However, to prove that $\Omega_{ext}$ is a punctured disk, it would be much easier to use methods of Teichmüller theory of rational maps as developed in [McS]. (There one can also find a different proof for connectivity of the Mandelbrot set.) Let $P \in \mathcal{P}^{cm}(\theta)$. By definition, the Teichmüller space $\mathrm{Teich}(P)$ consists of all pairs $(Q, [\varphi])$, where $Q \in \mathcal{P}^{cm}(\theta)$ and
Fig. 10. (labels: $c$, $\gamma$)
$\varphi : \mathbb{C} \to \mathbb{C}$ is a quasiconformal conjugacy between $P$ and $Q$, i.e., $P \circ \varphi = \varphi \circ Q$. Here $[\varphi]$ means that we only remember the isotopy class of $\varphi$. The modular group $\mathrm{Mod}(P)$ is the group of isotopy classes of quasiconformal homeomorphisms commuting with $P$. $\mathrm{Mod}(P)$ acts on $\mathrm{Teich}(P)$ properly discontinuously by $[\psi](Q, [\varphi]) = (Q, [\psi \circ \varphi])$. The quotient $\mathrm{Teich}(P)/\mathrm{Mod}(P)$, also called the moduli space of $P$, is isomorphic to the quasiconformal conjugacy class of $P$ in $\mathcal{P}^{cm}(\theta)$. More generally, one can define the Teichmüller space $\mathrm{Teich}(U, P)$, where $U$ is an open set invariant under $P$. It consists of all triples $(V, Q, [\varphi])$, where $V$ is open and invariant under $Q$, and the quasiconformal homeomorphism $\varphi : V \to U$ conjugates $P$ and $Q$. But now $[\varphi]$ denotes the isotopy class of $\varphi$ rel ideal boundary of $V$.

Theorem 6.1. The connectedness locus $\mathcal{M}(\theta)$ is connected.

Proof. Let $P = P_c \in \Omega_{ext}$. Then $J(P)$ is disconnected and the critical point $c$ belongs to the basin of attraction of infinity. Let $\gamma$ be the equipotential of the Green’s function of $K(P)$ passing through $c$. Topologically $\gamma$ is a figure eight with the double point at $c$ (see Fig. 10). Let $\hat{J}(P)$ be the union of $J(P)$ together with the backward orbit of the fixed point 0 as well as the union of all forward and backward images of $\gamma$. In other words, $\hat{J}(P)$ is the closure of the grand orbits of all periodic points and critical points of $P$. The complement $U = \mathbb{C} \setminus \hat{J}(P)$ consists of countably many annuli $A_i$ of finite modulus (contained in the basin of attraction of $\infty$) and countably many punctured disks (corresponding to the Siegel disk and its preimages). On $U$ the grand orbit equivalence relation is clearly indiscrete. By [McS, Theorem 6.2], $\mathrm{Teich}(P) \simeq \mathrm{Teich}(U, P) \times M_1(J(P), P)$, where $M_1(J(P), P)$ is the unit ball in the space of all $P$-invariant Beltrami differentials supported on $J(P)$. This factor is trivial by the following

Lemma 6.2.
The Julia set of a cubic polynomial outside the connectedness locus $\mathcal{M}(\theta)$ admits no invariant line field. Note that for arbitrary $\theta$ of Brjuno type, it is not known whether this Julia set has measure zero (compare Corollary 4.3).
Proof. By Theorem 4.2(b), such a cubic is renormalizable. By straightening, an invariant line field on its Julia set gives rise to an invariant line field, or equivalently an invariant Beltrami differential $\sigma$, on the Julia set of $Q_\theta : z \mapsto e^{2\pi i\theta} z + z^2$. Now, as in the proof of Theorem 5.1, by deforming $\sigma$ to $\sigma_t = t\sigma$ we can get a holomorphic family $Q_t$ of normalized quadratic polynomials all quasiconformally conjugate to $Q_\theta$. But $Q_\theta$ belongs to the boundary of the Mandelbrot set, hence admits no non-trivial deformations, implying that $Q_t = Q_\theta$ for all $t$. So the normalized quasiconformal homeomorphisms $\varphi_t$ which solve the Beltrami equation $\varphi_t^* \sigma_0 = \sigma_t$ must all commute with $Q_\theta$. Now for any periodic point $z \in J(Q_\theta)$ of period $n$, $t \mapsto \varphi_t(z)$ is a continuous path in the finite set of all period-$n$ points in $J(Q_\theta)$. Since $\varphi_0(z) = z$, we must have $\varphi_t(z) = z$ for all $t$. Such points $z$ are dense in the Julia set, so $\varphi_t|_{J(Q_\theta)}$ must be the identity. Since $\sigma_t = 0$ off the Julia set, it follows from the Bers Sewing Lemma 3.3 that $\bar\partial\varphi_t = 0$ almost everywhere in the plane. This implies that $\sigma_t$, or equivalently $\sigma$, vanishes almost everywhere, which is a contradiction. □

Now by Theorem 5.5, $\Omega_{ext}$ coincides with the quasiconformal conjugacy class of $P$. It follows that $\Omega_{ext} \simeq \mathrm{Teich}(P)/\mathrm{Mod}(P)$. By [McS, Theorem 6.1], $\mathrm{Teich}(P) \simeq \mathrm{Teich}(U, P)$ is isomorphic to the upper half-plane $\mathbb{H}$. Finally, every quasiconformal self-conjugacy $\psi$ of $P$ preserves grand orbits of the distinguished points 0 and $c$, hence it fixes the boundaries of all the annuli $A_i$ pointwise. In particular, $\psi$ is the identity on the Julia set $J(P)$. Hence the action of $[\psi] \in \mathrm{Mod}(P)$ is the identity except in the annuli $A_i$, where it is possibly a power of a Dehn twist. So $\mathrm{Mod}(P)$ is at most $\mathbb{Z}$. Since $\Omega_{ext}$ is not simply-connected, $\mathrm{Mod}(P) = \mathbb{Z}$. It follows that $\Omega_{ext}$ is homeomorphic to a punctured disk. □

7. Critical Parametrization of Blaschke Products

This section is the beginning of a digression in the study of cubic Siegel polynomials. We look at certain Blaschke products which will serve as models for the cubics in $\mathcal{P}^{cm}(\theta)$. We will introduce these model maps in Sect. 8 and return to their relation with the cubics in Sect. 9.

Let us consider the following space of degree 5 normalized Blaschke products:
\[ \hat{\mathcal{B}} = \left\{ B : z \mapsto \tau z^3 \left(\frac{z-p}{1-\bar{p}z}\right)\left(\frac{z-q}{1-\bar{q}z}\right) \ :\ B(1) = 1 \text{ and } |p| > 1,\ |q| > 1 \right\}, \tag{7.1} \]
where the rotation factor $\tau$ on the unit circle $\mathbb{T}$ is chosen so as to achieve the normalization $B(1) = 1$. Each $B \in \hat{\mathcal{B}}$ has superattracting fixed points at 0 and $\infty$ and four other critical points counted with multiplicity. We are interested in the open subset $\mathcal{B} \subset \hat{\mathcal{B}}$ of those normalized Blaschke products of the form (7.1) whose four critical points other than 0 and $\infty$ are of the form
\[ c_1,\quad c_2,\quad \frac{1}{\bar{c}_1},\quad \frac{1}{\bar{c}_2} \qquad \text{with } |c_1| > 1,\ |c_2| > 1. \]
Our goal is to parametrize elements of $\mathcal{B}$ by their critical points $c_1$ and $c_2$. The following theorem provides this “critical parametrization” for $\mathcal{B}$:

Theorem 7.1 (Critical Parametrization). Let $c_1$ and $c_2$ be two points outside the closed unit disk in the complex plane. Then there exists a unique normalized Blaschke product $B \in \mathcal{B}$ whose critical points are located at $0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2$.
The proof of this theorem will be given after the following two supporting lemmas. It would be interesting to find a conceptual proof of this fact which can be generalized to higher degrees (compare a similar situation in [Z1], where such a proof is given).

The space $\hat{\mathcal{B}}$ of all Blaschke products of the form (7.1) can be identified with the set of all unordered pairs $\{p, q\}$ of points outside the closed unit disk. This is homeomorphic to the symmetric product of two copies of the punctured plane. The latter can be identified with the space of all degree 2 monic polynomials
\[ w \mapsto (w - w_1)(w - w_2) = w^2 - (w_1 + w_2)w + w_1 w_2 \]
with $w_1 w_2 \ne 0$. It follows that $\hat{\mathcal{B}}$ is homeomorphic to $\mathbb{C} \times \mathbb{C}^*$. In particular, it is an open topological manifold of real dimension 4. In the same way, we may consider the space $\mathcal{C}$ of all unordered pairs $\{c_1, c_2\}$ of points outside the closed unit disk, which has a completely similar description. We consider the continuous map
\[ \Psi : \mathcal{B} \to \mathcal{C} \]
which sends a normalized Blaschke product $B \simeq \{p, q\}$ with critical points $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$ to the unordered pair $\{c_1, c_2\}$.

Lemma 7.2. $\Psi$ is a proper map.

Proof. Let $B_n \simeq \{p_n, q_n\}$ be a sequence of normalized Blaschke products in $\mathcal{B}$ which leaves every compact subset of $\mathcal{B}$. Then, a priori we have the following three possibilities:
• Some critical point of $B_n$ accumulates on the unit circle, or
• After relabeling, $p_n$ goes to $\infty$, or
• After relabeling, $p_n$ accumulates on the unit circle (later we show that this cannot be the case; see Lemma 7.4).
In the first two cases, it is easy to see that $\Psi(B_n)$ leaves every compact subset of $\mathcal{C}$. In the third case, there is a subsequence of $B_n$ which converges locally uniformly on $\mathbb{C} \setminus \mathbb{T}$ to a Blaschke product of degree $< 5$. It follows that the corresponding subsequence of $\Psi(B_n)$ has to leave every compact subset of $\mathcal{C}$. □

Lemma 7.3. $\Psi$ is injective.

Proof. Let $A$ and $B$ be two normalized Blaschke products in $\mathcal{B}$ with the same critical points $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$. Let
\[ A : z \mapsto \tau_A z^3 \left(\frac{z-p_1}{1-\bar{p}_1 z}\right)\left(\frac{z-q_1}{1-\bar{q}_1 z}\right), \qquad B : z \mapsto \tau_B z^3 \left(\frac{z-p_2}{1-\bar{p}_2 z}\right)\left(\frac{z-q_2}{1-\bar{q}_2 z}\right). \]
If $p_1 = p_2$ or $p_1 = q_2$, or if one of the critical points $c_1, c_2$ coincides with one of the zeros $p_i, q_i$, then a straightforward computation shows that $A = B$. So let us assume that $p_1 \ne p_2$ and $p_1 \ne q_2$ and consider the rational function
\[ R(z) = \frac{A(z)}{B(z)}. \]
Clearly $\deg R = 4$ and hence $R$ has 6 critical points counted with multiplicity. We have
\[ A'(z) = (\text{const.})\, \frac{z^2 \prod (z-c_j)(1-\bar{c}_j z)}{(1-\bar{p}_1 z)^2 (1-\bar{q}_1 z)^2}, \qquad B'(z) = (\text{const.})\, \frac{z^2 \prod (z-c_j)(1-\bar{c}_j z)}{(1-\bar{p}_2 z)^2 (1-\bar{q}_2 z)^2}, \]
from which it follows that
\[ R'(z) = (\text{const.})\, \frac{1}{z} \prod (z-c_j)(1-\bar{c}_j z) \left\{ \frac{\sum (-1)^j (z-p_j)(z-q_j)(1-\bar{p}_j z)(1-\bar{q}_j z)}{(z-p_2)^2 (z-q_2)^2 (1-\bar{p}_1 z)^2 (1-\bar{q}_1 z)^2} \right\}. \]
(Note that all the sums and products are taken over $j = 1, 2$.) From the above expression, $R$ has already 4 critical points at the $c_j$ and $1/\bar{c}_j$. So the rational function in the braces could have at most 2 zeros. Since this fraction is irreducible (by our assumption $p_1 \ne p_2$ and $p_1 \ne q_2$), the numerator should have degree $\le 2$. But that implies
\[ p_1 q_1 = p_2 q_2, \qquad p_1(1 + |q_1|^2) + q_1(1 + |p_1|^2) = p_2(1 + |q_2|^2) + q_2(1 + |p_2|^2), \]
from which it follows that $p_1 = p_2$ or $p_1 = q_2$, hence $q_1 = q_2$ or $q_1 = p_2$, which contradicts our assumption. □

Proof of Theorem 7.1 (Critical Parametrization). By Lemma 7.2 and Lemma 7.3, $\Psi$ is a covering map of degree 1. Hence, it is a homeomorphism $\mathcal{B} \xrightarrow{\simeq} \mathcal{C}$. □

In particular, the theorem shows that $\mathcal{B}$ is also homeomorphic to the product $\mathbb{C} \times \mathbb{C}^*$.

Lemma 7.4. Let $B : z \mapsto \tau z^3 (z-p)(z-q)/((1-\bar{p}z)(1-\bar{q}z))$ be any normalized Blaschke product in $\mathcal{B}$. Then $|p| > 2$ and $|q| > 2$.

Proof. Write $B(z) = \rho z^3 / R(z)$, where $|\rho| = 1$ and
\[ R(z) = \left(\frac{z-\alpha}{1-\bar{\alpha}z}\right)\left(\frac{z-\beta}{1-\bar{\beta}z}\right) \]
is a degree 2 Blaschke product preserving the unit disk having zeros at $\alpha = 1/\bar{p}$ and $\beta = 1/\bar{q}$. We look at the logarithmic derivative $LD(z) = d(\log R(z))/d(\log z) = zR'(z)/R(z)$ on the unit circle $\mathbb{T}$. A brief computation shows that for $z \in \mathbb{T}$,
\[ LD(z) = \frac{1-|\alpha|^2}{|z-\alpha|^2} + \frac{1-|\beta|^2}{|z-\beta|^2}, \]
which is strictly positive. It is easy to see that
\[ \max_{z \in \mathbb{T}} LD(z) \ge \max\left\{ \frac{1+|\alpha|}{1-|\alpha|},\ \frac{1+|\beta|}{1-|\beta|} \right\}. \]
Hence if either $|\alpha| > 1/2$ or $|\beta| > 1/2$, the maximum value of $LD$ on $\mathbb{T}$ will be greater than 3. On the other hand, $R$ induces a 2-to-1 covering map of the unit circle, so the average value of $|R'| = LD$ on $\mathbb{T}$ will be 2. Putting these two facts together, it follows that if $|\alpha| > 1/2$ or $|\beta| > 1/2$, then
\[ \min_{z \in \mathbb{T}} LD(z) \le 2 < 3 < \max_{z \in \mathbb{T}} LD(z). \]
This simply implies that when $|\alpha| > 1/2$ or $|\beta| > 1/2$, there are at least two points on $\mathbb{T}$ where $LD$ takes on the value 3. Now $B(z) = \rho z^3 / R(z)$ gives
\[ B'(z) = \rho\, \frac{3z^2 R(z) - z^3 R'(z)}{R(z)^2} = \rho z^2\, \frac{3 - LD(z)}{R(z)}. \]
Hence by the above argument, $B$ has at least two critical points on the unit circle as soon as $|p| < 2$ or $|q| < 2$. Certainly this cannot happen since by definition $B \in \mathcal{B}$ means the critical points of $B$ are off the unit circle. □

Corollary 7.5. Given any two points $c_1$ and $c_2$ in the plane, with $|c_1| \ge 1$ and $|c_2| \ge 1$, there exists a unique normalized Blaschke product $B$ in the closure $\overline{\mathcal{B}}$ with critical points $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$. In other words, critical parametrization is possible even if one or both critical points $c_1, c_2$ belong to the unit circle.

Proof. Take a sequence $\{c_1^n, c_2^n\}$ of pairs of points outside the closed unit disk such that $c_1^n \to c_1$ and $c_2^n \to c_2$ as $n \to \infty$. The zeros $p_n, q_n$ of the corresponding normalized Blaschke products $\Psi^{-1}(\{c_1^n, c_2^n\})$ stay away from the unit circle by Lemma 7.4. Therefore, $\Psi^{-1}(\{c_1^n, c_2^n\})$ has a subsequence which converges to a normalized Blaschke product which, by continuity of $\Psi$, has critical points at $\{0, \infty, c_1, c_2, 1/\bar{c}_1, 1/\bar{c}_2\}$. To see uniqueness, it is enough to note that the proof of Lemma 7.3 can be repeated word by word even if we assume $|c_1| = 1$ or $|c_2| = 1$. □

We conclude with the following proposition, the proof of which is quite straightforward.

Proposition 7.6. Every $B \in \mathcal{B}$ induces a real-analytic diffeomorphism of the unit circle. Consequently, if $B \in \overline{\mathcal{B}} \setminus \mathcal{B}$, the restriction of $B$ to the unit circle will be a real-analytic homeomorphism with one (or two) critical point(s).
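The logarithmic-derivative computation in the proof of Lemma 7.4 lends itself to a quick numerical sanity check. The sketch below is illustrative only: the function names, the sample zeros and the grid size are assumptions, not part of the paper. It evaluates the closed form of $LD$ on the unit circle, compares it with a finite-difference approximation of $zR'(z)/R(z)$, and lets one confirm that the average of $LD$ is 2 while its maximum exceeds 3 when a zero has modulus greater than $1/2$.

```python
import cmath
import math


def blaschke2(z, alpha, beta):
    """Degree 2 Blaschke product R with zeros alpha and beta inside the unit disk."""
    a, b = complex(alpha), complex(beta)
    return ((z - a) / (1 - a.conjugate() * z)) * ((z - b) / (1 - b.conjugate() * z))


def log_derivative(z, alpha, beta):
    """Closed form of LD(z) = z R'(z) / R(z), valid for |z| = 1."""
    a, b = complex(alpha), complex(beta)
    return (1 - abs(a) ** 2) / abs(z - a) ** 2 + (1 - abs(b) ** 2) / abs(z - b) ** 2


def numeric_log_derivative(z, alpha, beta, h=1e-6):
    """Central-difference approximation of z R'(z) / R(z) (step h is an arbitrary choice)."""
    derivative = (blaschke2(z + h, alpha, beta) - blaschke2(z - h, alpha, beta)) / (2 * h)
    return z * derivative / blaschke2(z, alpha, beta)


def sample_ld(alpha, beta, n=4000):
    """Values of LD at n equally spaced points of the unit circle."""
    return [log_derivative(cmath.exp(2j * math.pi * k / n), alpha, beta) for k in range(n)]
```

For instance, with the hypothetical zeros `alpha = 0.6 + 0.1j` and `beta = 0.2j`, the list `sample_ld(alpha, beta)` has mean close to 2, and its minimum and maximum lie on either side of 3, which is the situation ruled out for elements of $\mathcal{B}$.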
8. A Blaschke Parameter Space

Now we focus on a certain class of degree 5 Blaschke products. These are the maps $B$ with the following two properties:

(i) $B$ has the form
\[ B : z \mapsto e^{2\pi i t} z^3 \left(\frac{z-p}{1-\bar{p}z}\right)\left(\frac{z-q}{1-\bar{q}z}\right), \qquad |p| > 1,\ |q| > 1, \tag{8.1} \]
where $p$ and $q$ are chosen such that $B$ has a double critical point on the unit circle $\mathbb{T}$ and a pair $(c, 1/\bar{c})$ of symmetric critical points which may or may not be on $\mathbb{T}$.

(ii) $t$ is the unique number in $[0, 1]$ for which the rotation number of $B|_{\mathbb{T}}$ is equal to $\theta$, with $0 < \theta < 1$ being a given irrational number.
Fig. 11. Topology of the parameter space $\mathcal{B}^{cm}(\theta)$
The number $t$ in (ii) is unique because the rotation number of $B$ in (8.1) is a continuous and increasing function of $t$ which is strictly increasing at all irrational values (see for example [KH, Prop. 11.1.9]). From the above description, it follows that every $B$ which satisfies (i) and (ii) can be represented as a normalized Blaschke product in $\overline{\mathcal{B}} \setminus \mathcal{B}$ followed by a unique rotation which adjusts the rotation number to $\theta$. As a consequence, Corollary 7.5 shows that every such $B$ is uniquely determined by the position of its critical points.

The rotation group rot $= \{R_\rho : z \mapsto \rho z \text{ with } |\rho| = 1\}$ acts on the set of all such Blaschke products by conjugation. In fact,
\[ R_\rho^{-1} \circ B \circ R_\rho : z \mapsto e^{2\pi i t} \rho^4 z^3 \left(\frac{z-p\bar{\rho}}{1-\bar{p}\rho z}\right)\left(\frac{z-q\bar{\rho}}{1-\bar{q}\rho z}\right). \]
We would like to understand the topology of the space $\mathcal{B}^{cm}(\theta)$ of all “critically marked” Blaschke products satisfying (i) and (ii) modulo the action of rot. Here by a marking of the critical points of such a Blaschke product $B$ we mean a surjective function $m$ from the set $\{1, 2\}$ to the set of finite critical points of $B$ outside the open unit disk. Two critically marked Blaschke products $(B, m)$ and $(A, m')$ are equivalent under the action of rot if there exists an $R_\rho$ such that $R_\rho \circ B = A \circ R_\rho$ and $m' = R_\rho \circ m$.

Here is how we parametrize the space $\mathcal{B}^{cm}(\theta)$: For $j = 1, 2$, consider the closed set $E_j$ consisting of all conjugacy classes in $\mathcal{B}^{cm}(\theta)$ for which the critical point $m(j)$ belongs to the unit circle. In each class in $E_1$, we choose the unique representative $(B, m)$ for which $m(1) = 1$. It follows from Corollary 7.5 that $E_1$ can be parametrized by the location of the second critical point $m(2) \in \mathbb{C} \setminus \mathbb{D}$. Similarly, in each class in $E_2$, pick the unique representative $(B, m)$ for which $m(2) = 1$. This shows that $E_2$ can be parametrized by the location of the first critical point $m(1) \in \mathbb{C} \setminus \mathbb{D}$. Now on the common boundary $E_1 \cap E_2$, consisting of all Blaschke products with two double critical points on $\mathbb{T}$, we have two different coordinates which must correspond to the same conjugacy class. This simply yields the identification $m(1) = 1/m(2)$ between the two copies of $\mathbb{C} \setminus \mathbb{D}$ along their boundary circles. Consequently, $\mathcal{B}^{cm}(\theta)$ can be identified with the punctured plane (see Fig. 11).
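The conjugation formula for $R_\rho^{-1} \circ B \circ R_\rho$ can be checked numerically, along with the symmetry of $B$ under the reflection $z \mapsto 1/\bar{z}$. The following is a minimal sketch under stated assumptions: the function names and the sample values of $p$, $q$, $t$, $\rho$ are hypothetical, and the sample map is only required to have the algebraic form (8.1), not to satisfy the critical point and rotation number conditions (i) and (ii).

```python
import cmath
import math


def blaschke5(z, p, q, t):
    """B(z) = e^{2 pi i t} z^3 (z - p)/(1 - conj(p) z) (z - q)/(1 - conj(q) z)."""
    p, q = complex(p), complex(q)
    tau = cmath.exp(2j * math.pi * t)
    return tau * z ** 3 * (z - p) / (1 - p.conjugate() * z) * (z - q) / (1 - q.conjugate() * z)


def conjugated_direct(z, p, q, t, rho):
    """Evaluate R_rho^{-1} o B o R_rho at z by composing the three maps."""
    return blaschke5(rho * z, p, q, t) / rho


def conjugated_formula(z, p, q, t, rho):
    """Closed form: factor rho^4 and zeros moved to p*conj(rho), q*conj(rho)."""
    rho_bar = complex(rho).conjugate()
    return rho ** 4 * blaschke5(z, p * rho_bar, q * rho_bar, t)


def reflect(z):
    """Reflection in the unit circle, z -> 1 / conj(z)."""
    return 1 / complex(z).conjugate()
```

Evaluating `conjugated_direct` and `conjugated_formula` at the same point should agree up to rounding, and `blaschke5(reflect(z), ...)` should agree with `reflect(blaschke5(z, ...))`, reflecting the fact that such a Blaschke product commutes with the reflection in the unit circle.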
It is easy to see that this gluing corresponds to choosing the uniformizing parameter $\mu = m(1)/m(2) \in \mathbb{C}^*$ for the space $\mathcal{B}^{cm}(\theta)$. Here is the concrete interpretation of this identification $\mathcal{B}^{cm}(\theta) \simeq \mathbb{C}^*$: For $\mu \in \mathbb{C}^*$ with $|\mu| \ge 1$, the corresponding Blaschke product $B_\mu$ has marked critical points at $m(1) = \mu$, $m(2) = 1$. Similarly, if $|\mu| \le 1$, $B_\mu$ is the unique Blaschke product with marked critical points at $m(1) = 1$, $m(2) = 1/\mu$. Note that $B_\mu = B_{1/\mu}$ as maps, if we forget the markings of the critical points. As in the case of the cubic parameter space $\mathcal{P}^{cm}(\theta)$, the Blaschke space $\mathcal{B}^{cm}(\theta)$ also has two very special points: $\mu = 1$, which corresponds to the conjugacy class of Blaschke products with a critical point of local degree 5 on $\mathbb{T}$, and $\mu = -1$, which corresponds to the conjugacy class of Blaschke products with two centered double critical points on $\mathbb{T}$.

The identification with $\mathbb{C}^*$ puts the following topology on $\mathcal{B}^{cm}(\theta)$: If $|\mu| \ne 1$ so that $B_\mu$ has only one double critical point on $\mathbb{T}$, then $B_{\mu_n} \to B_\mu$ simply means uniform convergence on compact subsets of the plane respecting the convergence of the marked critical points. On the other hand, if $|\mu| = 1$ so that $B_\mu$ has two double critical points on the unit circle, then $B_{\mu_n} \to B_\mu$ means that in the topology of local uniform convergence, $\{B_{\mu_n}\}$ can only accumulate on $B_\mu$ or its conjugate $R_\mu^{-1} \circ B_\mu \circ R_\mu$.

For future reference, we need to analyze the structure of the invariant set $\bigcup_{k \ge 0} B^{-k}(\mathbb{T})$ for a Blaschke product $B \in \mathcal{B}^{cm}(\theta)$. For similar descriptions in a family of degree 3 Blaschke products, see [Pe] or [YZ].

Definition (Skeletons). Let $B \in \mathcal{B}^{cm}(\theta)$. Define $T_0 = \mathbb{T}$ and $T_1 = B^{-1}(T_0) \setminus T_0$. In general, for $k \ge 2$ we define $T_k$ inductively as $T_k = B^{-1}(T_{k-1})$. We call the closed set $T_k$ the $k$-skeleton of $B$.

Note that $B$ commutes with the reflection $I : z \mapsto 1/\bar{z}$. Therefore, every $T_k$ is invariant under $I$. Figure 12 shows different possibilities for the 1-skeleton of a $B \in \mathcal{B}^{cm}(\theta)$.
The next proposition gives basic properties of $k$-skeletons. The proofs are straightforward and will be omitted.

Proposition 8.1 (Structure of the $k$-Skeleton).
(a) For $k \ge 1$, the $k$-skeleton $T_k$ is the union of finitely many piecewise analytic Jordan curves $\{T_k^1, \cdots, T_k^m\}$ which intersect one another at finitely many points and do not cross the unit circle $\mathbb{T}$. None of the $T_k^i$ encloses $\mathbb{T}$. For any $T_k^i$ in this family, the reflected copy $I(T_k^i)$ also belongs to this family.
(b) With the notation of (a), let $D_k^i$ denote the bounded component of $\mathbb{C} \setminus T_k^i$ for $k \ge 1$. For $k = 0$, $D_0^i$ could mean either $\mathbb{D}$ or $\hat{\mathbb{C}} \setminus \overline{\mathbb{D}}$. Then for $k \ge 1$, $B$ maps $D_k^i$ onto some $D_{k-1}^j$. The mapping is either a conformal isomorphism or a 2-to-1 branched covering. As a result, $B^{\circ k}$ is a proper holomorphic map from $D_k^i$ onto $\mathbb{D}$ or $\hat{\mathbb{C}} \setminus \overline{\mathbb{D}}$.
(c) If $k \ge 1$ and $i \ne j$, we have $D_k^i \cap D_k^j = \emptyset$.
(d) For $k > \ell \ge 1$, either $D_k^i$ and $D_\ell^j$ are disjoint or $D_k^i \subset D_\ell^j$. Conversely, if $D_k^i \subset D_\ell^j$, we necessarily have $k \ge \ell$.

Every $D_k^i$ is called a $k$-drop or simply a drop of $B$. In other words, $k$-drops are the open topological disks bounded by the Jordan curves in the decomposition of the $k$-skeleton of $B$. For $k = 0$, we have slightly changed the notion of drops. The unit circle $\mathbb{T}$ is the only Jordan curve in the 0-skeleton of $B$, and we agree to call any of the two topological disks $\mathbb{D}$ or $\hat{\mathbb{C}} \setminus \overline{\mathbb{D}}$ a 0-drop. The integer $k$ is called the depth of $D_k^i$.
Fig. 12. Four different configurations (a)–(d) for $B^{-1}(\mathbb{T})$, where $B \in \mathcal{B}^{cm}(\theta)$. The shaded regions are components of $B^{-1}(\mathbb{D})$. The shaded subregion of $\mathbb{D}$ is mapped to $\mathbb{D}$ by a 3-to-1 branched covering with a superattracting fixed point at the origin. There is a critical point at $z = 1$ and the other critical point(s) (marked by an asterisk) are symmetric with respect to the unit circle
Definition (Nucleus of a Drop). Let $D_k^i$ be a drop. We define the nucleus $N_k^i$ of $D_k^i$ as the set of all points in $D_k^i$ which are not accumulated by any other drop of $B$. The nuclei of k-drops are said to have depth $k$. It follows from Proposition 8.1(c) that
$$N_k^i = D_k^i \setminus \overline{\bigcup_{\ell \ne k} \bigcup_{j} D_\ell^j}.$$
Clearly every nucleus is open. It is also non-empty because every drop contains an open set which eventually maps to the immediate basin of attraction of 0 or ∞, and this open set cannot intersect the closure of any other drop of $B$. We have two nuclei of depth zero: $N_0$, which is the nucleus of $\mathbb{D}$ and contains the immediate basin of attraction of 0, and $N_\infty$, which is the nucleus of $\hat{\mathbb{C}} \setminus \overline{\mathbb{D}}$ and contains the immediate basin of attraction of ∞. Obviously $N_\infty = I(N_0)$. It is not hard to see that both $N_0$ and $N_\infty$ are invariant under $B$:
$$B(N_0) \subset N_0, \qquad B(N_\infty) \subset N_\infty. \tag{8.2}$$
This of course implies that N0 and N∞ are subsets of the Fatou set of B. It follows from Proposition 8.1(b) that B maps every nucleus of depth k onto some nucleus of depth k − 1 and the mapping is either a conformal isomorphism or a 2-to-1 branched covering. We include the following lemma for completeness:
S. Zakeri
Lemma 8.2. Let $N_k^i$ be the nucleus of a drop $D_k^i$ which eventually maps to the unit disk $\mathbb{D}$. Then
(a) No point in the orbit
$$N_k^i = N_k^{i_0} \xrightarrow{\;B\;} N_{k-1}^{i_1} \xrightarrow{\;B\;} \cdots \xrightarrow{\;B\;} N_1^{i_{k-1}} \xrightarrow{\;B\;} N_0$$
can intersect any of the reflected nuclei $I(N_{k-j}^{i_j})$, $0 \le j \le k$.
(b) For $z \in N_k^i$, $B^{\circ k}$ is the first iterate of $B$ which sends $z$ to $N_0$.
Proof. (a) $B$ commutes with $I$, so there is a reflected orbit
$$I(N_k^i) = I(N_k^{i_0}) \xrightarrow{\;B\;} I(N_{k-1}^{i_1}) \xrightarrow{\;B\;} \cdots \xrightarrow{\;B\;} I(N_1^{i_{k-1}}) \xrightarrow{\;B\;} N_\infty.$$
Now any point in both orbits would have to map to a point in $N_0$ and $N_\infty$ simultaneously, which is impossible since $N_0 \cap N_\infty = \emptyset$.
(b) This is obvious if $k = 1$. Suppose that $k > 1$ and that for some $0 < \ell < k$, $B^{\circ \ell}(z) \in N_0$. Then by (8.2), $B^{\circ k-1}(z) \in N_0 \subset \mathbb{D}$. But $B^{\circ k-1}(z) \in B^{\circ k-1}(D_k^i)$ and $B^{\circ k-1}(D_k^i)$ is a 1-drop which does not intersect $\mathbb{D}$. □

Remark. If $z \in N_k^i$, it is not true that $B^{\circ k}$ is the first iterate of $B$ which sends $z$ to the unit disk. In fact, the orbit of $z$ can pass through $\mathbb{D}$ several times before it maps to $N_0$.

Proposition 8.3. (a) Distinct nuclei are disjoint. (b) The map $B^{\circ k}$ from $N_k^i$ onto $N_0$ or $N_\infty$ is either a conformal isomorphism or a 2-to-1 branched covering.

Proof. (a) Let $N_k^i$ and $N_\ell^j$ be two distinct nuclei which intersect. By Proposition 8.1(c), we have $k \ne \ell$. Without loss of generality, we assume that $k > \ell$ and the iterate $B^{\circ \ell}$ maps $N_\ell^j$ onto $N_0$. So for every $z$ in the intersection $N_k^i \cap N_\ell^j$, $B^{\circ \ell}(z)$ will belong to $N_0$. This contradicts Lemma 8.2(b).
(b) Since by (a) distinct nuclei are disjoint, an orbit
$$N_k^i = N_k^{i_0} \xrightarrow{\;B\;} N_{k-1}^{i_1} \xrightarrow{\;B\;} \cdots \xrightarrow{\;B\;} N_1^{i_{k-1}} \xrightarrow{\;B\;} N_0 \text{ or } N_\infty$$
can hit every critical point of $B$ at most once. Since the critical point $z = 1$ of $B$ does not belong to any nucleus, the above orbit can only hit the pair of critical points $c$ and $1/c$, with $|c| \ne 1$. By Lemma 8.2(a), these critical points cannot belong to the above orbit simultaneously. This means that $B^{\circ k} : N_k^i \to N_0$ or $N_\infty$ is either a conformal isomorphism or a 2-to-1 branched covering. □

Figures 13–15 show the Julia sets of some Blaschke products in $\mathcal{B}^{cm}(\theta)$ for $\theta = (\sqrt{5} - 1)/2$. In Fig. 13 there are two symmetric attracting cycles in the nuclei $N_0$ and $N_\infty$ whose basins of attraction consist of the topological disks in black. Figure 14 shows the Julia set of a map outside of the connectedness locus $\mathcal{C}(\theta)$ (see Sect. 10). In Fig. 15 there is a critical point in the nucleus of the large 1-drop attached to the unit disk at $z = 1$ which maps into $N_0$. Hence this nucleus contains the zeros $p$ and $q$. Surgery (see Sect. 9 below) will turn the first Blaschke product into a hyperbolic-like cubic, while it sends the second to a cubic in ext and the last one to a capture cubic in $\mathcal{P}^{cm}(\theta)$.
Fig. 13.
Fig. 14.
Fig. 15.
9. The Surgery

For the rest of the paper, unless otherwise stated, we assume that θ is an irrational number of bounded type. We describe a surgery on Blaschke products in $\mathcal{B}^{cm}(\theta)$ to obtain cubic polynomials in $\mathcal{P}^{cm}(\theta)$. A similar surgery has been done in the case of quadratic polynomials [D2] using the following theorem of Świątek and Herman (see [Sw] or [H2]). Recall that a homeomorphism $h : \mathbb{R} \to \mathbb{R}$ is called k-quasisymmetric, or simply quasisymmetric, if
$$0 < k^{-1} \le \frac{|h(x + t) - h(x)|}{|h(x) - h(x - t)|} \le k < +\infty$$
for all $x$ and all $t > 0$. A homeomorphism $h : \mathbb{T} \to \mathbb{T}$ is k-quasisymmetric if its lift to $\mathbb{R}$ has this property.

Theorem 9.1 (Linearization of Critical Circle Maps). Let $f : \mathbb{T} \to \mathbb{T}$ be a real-analytic homeomorphism with finitely many critical points and rotation number θ. Then there exists a quasisymmetric homeomorphism $h : \mathbb{T} \to \mathbb{T}$ which conjugates $f$ to the rigid rotation $R_\theta : z \mapsto e^{2\pi i \theta} z$ if and only if θ is an irrational number of bounded type. Moreover, if $f$ belongs to a compact family of real-analytic homeomorphisms with rotation number θ, then $h$ is k-quasisymmetric, where the constant $k$ only depends on the family and not on the choice of $f$.

Let us briefly sketch what this surgery does on a Blaschke product $B \in \mathcal{B}^{cm}(\theta)$. By Proposition 7.6, the restriction $B|_{\mathbb{T}}$ is a real-analytic homeomorphism with one (or two) critical point(s). When the rotation number of this circle map is of bounded type, by Theorem 9.1 one can find a unique k-quasisymmetric homeomorphism $h : \mathbb{T} \to \mathbb{T}$ with $h(1) = 1$ such that the following diagram commutes:
$$\begin{array}{ccc} \mathbb{T} & \xrightarrow{\;\;B\;\;} & \mathbb{T} \\ \downarrow h & & \downarrow h \\ \mathbb{T} & \xrightarrow{\;\;R_\theta\;\;} & \mathbb{T} \end{array}$$
Moreover, $\{B|_{\mathbb{T}}\}_{B \in \mathcal{B}^{cm}(\theta)}$ is contained in a compact family (see Theorem 12.3), hence $h$ is $k(\theta)$-quasisymmetric, where the constant $k(\theta)$ only depends on the family $\mathcal{B}^{cm}(\theta)$. We can extend $h$ to a $K(\theta)$-quasiconformal homeomorphism $H : \mathbb{D} \to \mathbb{D}$ whose dilatation depends only on $k(\theta)$. Possible extensions are given by the theorem of Beurling and Ahlfors [A] or Douady and Earle [DE] (which has the advantage of being conformally invariant). Define a modified Blaschke product $\tilde{B}$ as follows:
$$\tilde{B}(z) = \begin{cases} B(z) & |z| \ge 1 \\ (H^{-1} \circ R_\theta \circ H)(z) & |z| < 1. \end{cases} \tag{9.1}$$
This amounts to cutting out the unit disk and gluing in a Siegel disk instead. Note that the two definitions match along $\mathbb{T}$ by the above commutative diagram.

Now define a conformal structure σ on the plane as follows: On $\mathbb{D}$, let σ be the pull-back $H^* \sigma_0$ of the standard conformal structure $\sigma_0$. Since $R_\theta$ preserves $\sigma_0$, $\tilde{B}$ will preserve σ on $\mathbb{D}$. For every $k \ge 1$, pull $\sigma|_{\mathbb{D}}$ back by $\tilde{B}^{\circ k} = B^{\circ k}$ on $\tilde{B}^{-k}(\mathbb{D}) \setminus \mathbb{D}$ (which consists of all the maximal k-drops of $B$; see Sect. 10). Since $B^{\circ k}$ is holomorphic, this does not increase the dilatation of σ. Finally, let $\sigma = \sigma_0$ on the rest of the plane. By the construction, σ has bounded dilatation and is invariant under $\tilde{B}$. Therefore, by the Measurable Riemann Mapping Theorem, we can find a quasiconformal homeomorphism $\varphi : \mathbb{C} \to \mathbb{C}$ such that $\varphi^* \sigma_0 = \sigma$. Set
$$P = \varphi \circ \tilde{B} \circ \varphi^{-1}. \tag{9.2}$$
Then $P$ is a quasiregular self-map of the sphere which preserves $\sigma_0$, hence it is holomorphic. Also $P$ is proper of degree 3 since $\tilde{B}$ has the same properties. Therefore $P$ is a cubic polynomial. Evidently, $\varphi(\mathbb{D})$ is a Siegel disk for $P$ whose boundary $\varphi(\mathbb{T})$ is a quasicircle passing through the critical point $\varphi(1)$.

To mark the critical points of $P$, hence getting an element of $\mathcal{P}^{cm}(\theta)$, we must normalize φ carefully. Recall from Sect. 8 that $\mathcal{B}^{cm}(\theta)$ is uniformized by the parameter $\mu \in \mathbb{C}^*$ as follows: If $|\mu| \ge 1$, $B_\mu$ has marked critical points at $m(1) = \mu$, $m(2) = 1$, while for $|\mu| \le 1$, $B_\mu$ has marked critical points at $m(1) = 1$, $m(2) = 1/\mu$. In the first case, we normalize φ such that $\varphi(H^{-1}(0)) = 0$ and $\varphi(1) = 1$. Call $\varphi(\mu) = c$ and mark the critical points of $P$ by declaring $P = P_c$ as in Sect. 2. In the case $|\mu| \le 1$, we normalize φ similarly by putting $\varphi(H^{-1}(0)) = 0$ and $\varphi(1/\mu) = 1$, but this time we call $\varphi(1) = c$ and set $P = P_c$. It is easy to see that when $|\mu| = 1$, both normalizations produce the same critically marked cubic polynomial in $\mathcal{P}^{cm}(\theta)$.

Let us denote the polynomial $P$ constructed this way by $S_H(B)$. We will see that for two quasiconformal extensions $H$ and $H'$, the cubics $S_H(B)$ and $S_{H'}(B)$ are quasiconformally conjugate and the conjugacy is conformal everywhere except on the grand orbit of the Siegel disk centered at the origin. When $S_H(B)$ is capture, we can certainly end up with two different cubics if we choose the extensions arbitrarily. In fact, let $k$ be the first moment the orbit of the critical point $c$ of $B$ hits the unit disk, and let $w = B^{\circ k}(c)$. Then for two quasiconformal extensions $H$ and $H'$, the captured images of the critical points of $S_H(B)$ and $S_{H'}(B)$ have the same conformal position in their corresponding Siegel disks if and only if $H(w) = H'(w)$. It follows that $S_H(B) \ne S_{H'}(B)$ as soon as we choose two different extensions $H, H'$ with $H(w) \ne H'(w)$.

The following proposition has a very non-trivial content in case the result of the surgery is a cubic whose Julia set has positive measure (say, in a queer component). It is the Bers Sewing Lemma which makes the proof work.

Proposition 9.2. Let $P = S_H(B)$ and $H'$ be any other quasiconformal extension of the circle homeomorphism $h$ which linearizes $B|_{\mathbb{T}}$. Then, if $P$ is not capture, $S_H(B) = S_{H'}(B)$. On the other hand, when $P$ is capture, $S_H(B) = S_{H'}(B)$ if and only if $H(w) = H'(w)$, where $w \in \mathbb{D}$ is the captured image of the critical point of $B$.

Proof.
Let $Q = S_{H'}(B)$ and let $\varphi_H$ and $\varphi_{H'}$ denote the quasiconformal homeomorphisms which satisfy $P = \varphi_H \circ \tilde{B}_H \circ \varphi_H^{-1}$ and $Q = \varphi_{H'} \circ \tilde{B}_{H'} \circ \varphi_{H'}^{-1}$ as in (9.2). The homeomorphism φ defined by
$$\varphi(z) = \begin{cases} (\varphi_{H'} \circ \varphi_H^{-1})(z) & z \in \mathbb{C} \setminus GO(\Delta_P) \\ (\varphi_{H'} \circ B^{-k} \circ H'^{-1} \circ H \circ B^{\circ k} \circ \varphi_H^{-1})(z) & z \in P^{-k}(\Delta_P) \end{cases}$$
is quasiconformal and conjugates $P$ to $Q$. By Lemma 3.5, one can find a quasiconformal conjugacy $\psi : \mathbb{C} \to \mathbb{C}$ between $P$ and $Q$ which is conformal on the grand orbit $GO(\Delta_P)$ and agrees with φ everywhere else. By the Bers Sewing Lemma, $\bar\partial \psi = \bar\partial \varphi$ almost everywhere on $\mathbb{C} \setminus GO(\Delta_P)$. But the latter generalized partial derivative vanishes almost everywhere on $\mathbb{C} \setminus GO(\Delta_P)$ because the surgery does not change the conformal structures outside $\bigcup_{k \ge 0} \tilde{B}^{-k}(\mathbb{D})$. Hence $\bar\partial \psi = 0$ almost everywhere on $\mathbb{C}$, which means ψ is conformal. This shows $P = Q$. □

Convention. For the rest of this paper, we always choose the Douady–Earle extension of circle homeomorphisms to perform surgery. By the above proposition, this is really a "choice" only in the capture case. We can therefore neglect the dependence on $H$ and call $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ the surgery map.

As an immediate corollary of the normalization of φ and the construction of $S$, we have the following:
Corollary 9.3. Let $\mu \in \mathbb{C}^*$ and $P_c = S(B_\mu)$ be the cubic obtained by performing the above surgery.
• If $|\mu| > 1$, then $1 \in \partial\Delta_c$ and $c \notin \partial\Delta_c$.
• If $|\mu| < 1$, then $c \in \partial\Delta_c$ and $1 \notin \partial\Delta_c$.
• If $|\mu| = 1$, then both $c$ and $1 \in \partial\Delta_c$.

10. The Blaschke Connectedness Locus $\mathcal{C}(\theta)$

Suggested by the case of cubic polynomials, we define the Blaschke connectedness locus $\mathcal{C}(\theta)$ by
$$\mathcal{C}(\theta) = \{B \in \mathcal{B}^{cm}(\theta) : \text{the Julia set } J(B) \text{ is connected}\}.$$
The following theorem provides a useful characterization of $\mathcal{C}(\theta)$ in terms of the critical orbits.

Theorem 10.1. $B \in \mathcal{C}(\theta)$ if and only if one of the following holds:
• The orbit of $c$, the critical point of $B$ in $\mathbb{C} \setminus \mathbb{D}$ other than 1, eventually hits $\mathbb{D}$.
• The orbit of $c$ never hits $\mathbb{D}$, but remains bounded.

The proof of this theorem depends on an alternative dynamical description for Julia sets of Blaschke products in $\mathcal{B}^{cm}(\theta)$ which is obtained by taking pull-backs along a certain type of drops called maximal drops. This description will be useful later in the proof of Theorem 13.1.

Definition. Let $D_k^i$ be a k-drop of $B \in \mathcal{B}^{cm}(\theta)$. We call $D_k^i$ a maximal drop if $D_k^i = \mathbb{D}$, or if $D_k^i \cap \mathbb{D} = \emptyset$ and $D_k^i$ is not contained in any other ℓ-drop of $B$ for $\ell \ge 1$. It follows in particular that maximal drops of $B$ are disjoint.

Proposition 10.2. Let $B \in \mathcal{B}^{cm}(\theta)$ and let $P = S(B) = \varphi \circ \tilde{B} \circ \varphi^{-1}$ as in (9.2). Then
(a) $D_k^i$ is a maximal drop of $B$ if and only if $\varphi(D_k^i)$ is a Fatou component of $P$ which eventually maps to the Siegel disk $\Delta_P$.
(b) φ maps the nucleus $N_\infty$ of $B$ onto $\hat{\mathbb{C}} \setminus GO(\Delta_P)$.
(c) The boundary of the immediate basin of attraction of infinity for $B$ is precisely the closure of the union of the boundaries of all maximal drops of $B$. Under φ this set maps to the Julia set $J(P)$.

Proof. (a) and (b) are easy consequences of the definitions. For (c), just note that under φ, the boundary of the immediate basin of attraction of infinity for $B$ corresponds to the similar boundary for $P$, and the closure of the union of the boundaries of all maximal drops of $B$ corresponds to the Julia set $J(P)$ by (a). □

Lemma 10.3 (Alternative description for Julia Sets). Let $B \in \mathcal{B}^{cm}(\theta)$ and let $J_0$ be the boundary of the immediate basin of attraction of infinity for $B$. Define a sequence of compact sets $J_n = J_n(B)$ inductively by
$$J_n = \bigcup_{D_k^i \text{ maximal}} B^{-k}(I J_{n-1} \cap \mathbb{D}) \cap D_k^i. \tag{10.1}$$
Then
$$J(B) = \overline{\bigcup_{n \ge 0} J_n}. \tag{10.2}$$
Proof. Each $J_n$ is compact and contained in $J(B)$. By Proposition 10.2(c), $J_0 \subset J_1$ and it follows by induction on $n$ that $J_n \subset J_{n+1}$ for $n \ge 0$. Put
$$J_\infty = \overline{\bigcup_{n \ge 0} J_n}.$$
Clearly $J_\infty$ is compact and contained in the Julia set $J(B)$, and it is not hard to see that it is invariant under the reflection $I$. We will show that $J_\infty$ is totally invariant under $B$, i.e., $B^{-1}(J_\infty) = J_\infty$. This will prove that $J_\infty = J(B)$.

First we prove that $J_\infty$ is forward invariant. For any $n$, it follows from (10.1) that $B(J_n \setminus \mathbb{D}) \subset J_n \subset J_\infty$. On the other hand,
$$B(J_n \cap \mathbb{D}) = B(I J_{n-1} \cap \mathbb{D}) = I B(J_{n-1} \setminus \mathbb{D}) \subset I J_\infty = J_\infty.$$
These two inclusions show that $B(J_n) \subset J_\infty$, hence $B(J_\infty) \subset J_\infty$.

To prove backward invariance, first note that for any $n$, $B^{-1}(J_n) \setminus \mathbb{D} \subset J_n \subset J_\infty$ by (10.1). To obtain the same kind of inclusion for $B^{-1}(J_n) \cap \mathbb{D}$, we distinguish two cases. First,
$$B^{-1}(J_n \cap \mathbb{D}) \cap \mathbb{D} = B^{-1}(I J_{n-1} \cap \mathbb{D}) \cap \mathbb{D} \subset I(B^{-1}(J_{n-1} \setminus \mathbb{D})) \subset I J_{n-1} \cup J_n \subset J_\infty.$$
Second,
$$B^{-1}(J_n \setminus \mathbb{D}) \cap \mathbb{D} = I(B^{-1}(I J_n \cap \mathbb{D}) \setminus \mathbb{D}) \subset I(B^{-1}(J_{n+1}) \setminus \mathbb{D}) \subset I J_{n+1} \subset J_\infty.$$
Altogether, these three inclusions show that $B^{-1}(J_n) \subset J_\infty$ for all $n$. Hence $B^{-1}(J_\infty) \subset J_\infty$ and this proves (10.2). □

Proof of Theorem 10.1. One direction is quite easy to see: If the orbit of $c$ never hits the closed unit disk and escapes to infinity, one can easily show that $J(B)$ is disconnected in a way identical to the polynomial case by considering the Böttcher map of the immediate basin of attraction of ∞ for $B$ (see for example [M1, Theorem 17.3]). Conversely, suppose that the orbit of the critical point $c$ either hits $\mathbb{D}$ or stays bounded in $\mathbb{C} \setminus \mathbb{D}$. Then the Julia set $J(P)$ is connected, where $P = S(B)$. Consider the sequence of compact sets $J_n$ in (10.1). By Proposition 10.2(c), $J_0$ is connected and it follows by induction on $n$ that each $J_n$ defined by (10.1) is connected. Therefore (10.2) shows that $J(B)$ is connected. Hence $B \in \mathcal{C}(\theta)$. □

In what follows, we prove that the connectedness locus $\mathcal{C}(\theta)$ is compact. Other facts, e.g. having only two complementary components, or connectivity, will be proved later using surgery (see Corollary 13.4 and Corollary 13.5).
We would like to remark that unlike the case of cubic polynomials, it is often difficult to prove anything about the topology of the Blaschke connectedness locus, partly because of the complicated way these Blaschke products depend on their critical points, but more importantly because the family $\mu \mapsto B_\mu$ does not depend holomorphically on µ.

Lemma 10.4. Let $\{B_{\mu_n}\}$ be an arbitrary sequence of Blaschke products in $\mathcal{B}^{cm}(\theta)$ and $h_n : \mathbb{T} \to \mathbb{T}$ be the unique normalized quasisymmetric homeomorphism which conjugates $B_{\mu_n}|_{\mathbb{T}}$ to the rigid rotation $R_\theta$. Let $H_n$ denote the Douady–Earle extension of $h_n$. Then the sequence $\{H_n\}$ has a subsequence which converges locally uniformly to a quasiconformal homeomorphism of $\mathbb{D}$. It follows in particular that the sequence $\{H_n^{-1}(0)\}$ stays in a compact subset of the unit disk.

Proof. This follows from the facts that the space of all uniformly quasisymmetric normalized homeomorphisms of the circle is compact [Le, Lemma 5.1] and the Douady–Earle extension depends continuously on the circle homeomorphism [DE]. □
Corollary 10.5. Let $B \in \mathcal{B}^{cm}(\theta)$ and $\varphi_B : \mathbb{C} \to \mathbb{C}$ be the quasiconformal homeomorphism which conjugates the modified Blaschke product $\tilde{B}$ to the cubic $P = S(B)$ as in (9.2): $P = \varphi_B \circ \tilde{B} \circ \varphi_B^{-1}$. Then the family $\mathcal{F} = \{\varphi_B\}_{B \in \mathcal{B}^{cm}(\theta)}$ is normal.

Proof. By the surgery construction as described in Sect. 9, $\mathcal{F}$ is uniformly quasiconformal. Choose a sequence $\{B_{\mu_n}\}$ in $\mathcal{B}^{cm}(\theta)$ and let $\varphi_n = \varphi_{B_{\mu_n}}$ denote the corresponding sequence in $\mathcal{F}$. Choose a subsequence, still denoted by $B_{\mu_n}$, such that $|\mu_n| \ge 1$ for all $n$ (the case $|\mu_n| \le 1$ is similar). By the way we normalized $\varphi_n$,
$$\varphi_n(H_n^{-1}(0)) = 0, \qquad \varphi_n(1) = 1, \qquad \varphi_n(\infty) = \infty.$$
But $\{H_n^{-1}(0)\}$ lives in a compact subset of $\mathbb{D}$ by the previous lemma. Hence the three points $H_n^{-1}(0)$, 1 and ∞ have mutual spherical distance larger than some positive constant independent of $n$. This implies equicontinuity of $\{\varphi_n\}$ by a standard theorem on quasiconformal mappings [Le, Theorem 2.1]. □

Proposition 10.6. The surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is proper.

Proof. Let the sequence $\{B_{\mu_n}\}$ leave every compact set in $\mathcal{B}^{cm}(\theta)$ and consider the corresponding cubics $P_{c_n} = S(B_{\mu_n}) = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1}$. To be more specific, let us assume that the critical point $\mu_n$ tends to infinity. Clearly $c_n = \varphi_n(\mu_n)$. Since $\{\varphi_n\}$ is normal by the above corollary, we simply conclude that $c_n \to \infty$. □

Proposition 10.7. The Blaschke connectedness locus $\mathcal{C}(\theta)$ is compact and invariant under $\mu \mapsto 1/\mu$. As a result, there exists an unbounded component $\Lambda_{ext}$ of $\mathbb{C}^* \setminus \mathcal{C}(\theta)$ which contains a punctured neighborhood of ∞ and a corresponding component $\Lambda_{int}$ which is mapped to it by $\mu \mapsto 1/\mu$.

Proof. The invariance follows from the definition of $\mathcal{B}^{cm}(\theta)$ and its identification with $\mathbb{C}^*$. Note that the unit circle $\mathbb{T} \subset \mathcal{B}^{cm}(\theta)$ is contained in $\mathcal{C}(\theta)$ by Theorem 10.1. So $\Lambda_{ext}$ and $\Lambda_{int}$ are actually distinct components of $\mathbb{C}^* \setminus \mathcal{C}(\theta)$. $\mathcal{C}(\theta)$ is clearly closed by Theorem 10.1. Let us prove it is bounded. Assuming the contrary, there is a sequence $B_{\mu_n} \in \mathcal{C}(\theta)$ with $\mu_n \to \infty$ as in the above proof.
It follows from Proposition 10.2(c) and Theorem 10.1 that the corresponding polynomials $P_{c_n} = S(B_{\mu_n}) = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1}$ have connected Julia sets. By Proposition 2.3, $1/30 \le |c_n| \le 30$. This contradicts properness of $S$. □

11. Continuity of the Surgery Map

This section is devoted to the proof of continuity of the surgery map $S$ which depends strongly on the cubic parameter space being one-dimensional. We point out that the situation is similar to Douady–Hubbard's proof of the continuity of the "straightening map" in their study of the space of quadratic-like maps [DH2]. One additional difficulty here is the lack of complete information on quasiconformal conjugacy classes in the nonholomorphic family $\mathcal{B}^{cm}(\theta)$ (the analogue of Theorem 5.5; see however Theorem 12.4).

The idea of the proof is as follows: Given a sequence $B_{\mu_n} \in \mathcal{B}^{cm}(\theta)$ such that $B_{\mu_n} \to B = B_\mu$, we prove that there exists a subsequence $\{B_{\mu_{n(j)}}\}$ such that $S(B_{\mu_{n(j)}}) \to S(B)$ in $\mathcal{P}^{cm}(\theta)$. The topology of the parameter space $\mathcal{P}^{cm}(\theta)$ is local uniform convergence respecting the markings of the critical points. The same is true for $\mathcal{B}^{cm}(\theta)$ with one exception (see Sect. 9): If µ has absolute value 1, i.e., if $B$ has two double critical points on the unit circle, then $B_{\mu_n} \to B$ means that every subsequence of $\{B_{\mu_n}\}$ has a further subsequence which either converges locally uniformly to $B$ or to its conjugate $R_\mu^{-1} \circ B \circ R_\mu$. From the construction of $S$ it is easy to see that $S(B) = S(R_\mu^{-1} \circ B \circ R_\mu)$. Therefore, in order to prove continuity of $S$, all we have to show is that $B_{\mu_n} \to B$ locally uniformly on $\mathbb{C}$ (respecting the markings of the critical points) implies that for some subsequence $\{B_{\mu_{n(j)}}\}$, $S(B_{\mu_{n(j)}}) \to S(B)$ locally uniformly on $\mathbb{C}$ (again, respecting the markings of the critical points).

So let $h_n$ and $h$ be the unique $k(\theta)$-quasisymmetric homeomorphisms which fix $z = 1$ and conjugate $B_{\mu_n}|_{\mathbb{T}}$ and $B|_{\mathbb{T}}$ to the rigid rotation $R_\theta$. It is easy to see that $h_n \to h$ uniformly on $\mathbb{T}$. Consider the Douady–Earle extensions $H_n$ and $H$, which are $K(\theta)$-quasiconformal homeomorphisms of the unit disk. By the construction of these extensions, $H_n$ and $H$ are real-analytic in $\mathbb{D}$ and $H_n \to H$ locally uniformly in the $C^\infty$ topology [DE]. In particular, the partial derivatives $\partial H_n$ and $\bar\partial H_n$ converge locally uniformly in $\mathbb{D}$ to the corresponding derivatives $\partial H$ and $\bar\partial H$. This shows that $\sigma_n|_{\mathbb{D}} \to \sigma|_{\mathbb{D}}$ locally uniformly, where $\sigma_n = H_n^* \sigma_0$ and $\sigma = H^* \sigma_0$ are the conformal structures we constructed in the course of surgery for $B_{\mu_n}$ and $B$ (see Sect. 9).

At this point, the main problem is to prove that $B_{\mu_n} \to B$ and $\sigma_n|_{\mathbb{D}} \to \sigma|_{\mathbb{D}}$ implies $\sigma_n \to \sigma$ in the $L^1$-norm on $\mathbb{C}$, for this would show that the normalized solutions $\varphi_n = \varphi_{H_n}$ of the Beltrami equations $\varphi_n^* \sigma_0 = \sigma_n$ converge locally uniformly on $\mathbb{C}$ to the normalized solution φ of the equation $\varphi^* \sigma_0 = \sigma$. This would simply mean that $S(B_{\mu_n}) \to S(B)$ as $n \to \infty$. Unfortunately, we cannot prove $\sigma_n \to \sigma$ in $L^1(\mathbb{C})$ in all cases. So, following [DH2], we take a slightly different approach by splitting the argument into two cases depending on whether $S(B)$ is quasiconformally rigid or not. In the former case, we show continuity directly using the rigidity.
In the latter case, however, we prove $\varphi_n \to \varphi$ using the fact that $S(B)$ admits non-trivial deformations.

Theorem 11.1. The surgery map $S : \mathcal{B}^{cm}(\theta) \to \mathcal{P}^{cm}(\theta)$ is continuous.

Proof. Consider $B_{\mu_n}, B \in \mathcal{B}^{cm}(\theta)$ and start with the same construction as above to get a sequence $\{\sigma_n\}$ of conformal structures on the plane with uniformly bounded dilatation and the corresponding sequence $\{\varphi_n\}$ of normalized solutions of $\varphi_n^* \sigma_0 = \sigma_n$. Since $\{\varphi_n\}$ is a normal family by Corollary 10.5, it has a subsequence, still denoted by $\{\varphi_n\}$, which converges locally uniformly to a quasiconformal homeomorphism $\psi : \mathbb{C} \to \mathbb{C}$. Set $P_{c_n} = \varphi_n \circ \tilde{B}_{\mu_n} \circ \varphi_n^{-1} = S(B_{\mu_n})$, $P = \varphi \circ \tilde{B} \circ \varphi^{-1} = S(B)$, and $Q = \psi \circ \tilde{B} \circ \psi^{-1}$. All these maps are cubic polynomials in $\mathcal{P}^{cm}(\theta)$. Also $P$ is quasiconformally conjugate to $Q$, and $P_{c_n} \to Q$ as $n \to \infty$. We will show that $P = Q$ and this will prove continuity at $B$.

For the rest of the argument, we distinguish two cases: If $P = S(B)$ is quasiconformally rigid, then automatically $P = Q$ and we are done. Otherwise, $P$ is not rigid, so the quasiconformal conjugacy class of $P$ is a non-empty open set $U \subset \mathcal{P}^{cm}(\theta)$ by Corollary 5.2. Assume by way of contradiction that $P \ne Q$. Since $P_{c_n} \to Q$ as $n \to \infty$, $P_{c_n} \in U$ for large $n$. Hence $P_{c_n}$ is quasiconformally conjugate to $P$ for large $n$, i.e., there exists a normalized quasiconformal homeomorphism $\eta_n : \mathbb{C} \to \mathbb{C}$ such that $\eta_n \circ P = P_{c_n} \circ \eta_n$. Observe that the dilatation of $\eta_n$ is uniformly bounded, since by Theorem 5.1 the dilatation of $(\psi \circ \varphi^{-1}) \circ \eta_n^{-1}$ goes to 1 as $n$ goes to ∞ (see Fig. 16). By "lifting" $\eta_n$, we can find a quasiconformal conjugacy $\xi_n = \varphi_n^{-1} \circ \eta_n \circ \varphi$ between the modified Blaschke products $\tilde{B}$ and $\tilde{B}_{\mu_n}$, i.e.,
$$\xi_n \circ \tilde{B} = \tilde{B}_{\mu_n} \circ \xi_n. \tag{11.1}$$
Fig. 16. Sketch of the proof of continuity of S
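Again, note that the dilatation of $\xi_n$ is uniformly bounded. We prove that the sequence of conformal structures $\{\sigma_n\}$ converges in $L^1(\mathbb{C})$ to σ. This, by a standard theorem on quasiconformal mappings (see for example [Le, Theorem 4.6]) will show that $\varphi_n \to \varphi$ locally uniformly, hence $P_{c_n} \to P$, hence $P = Q$, which contradicts our assumption. To this end, we introduce the following sequences of conformal structures (where, as usual, we identify a conformal structure with its associated Beltrami differential):
$$\sigma_n^k(z) = \begin{cases} \sigma_n(z) & \text{when } z \in \bigcup_{i=0}^{k} \tilde{B}_{\mu_n}^{-i}(\mathbb{D}) \\ 0 & \text{otherwise} \end{cases}$$
and
$$\sigma^k(z) = \begin{cases} \sigma(z) & \text{when } z \in \bigcup_{i=0}^{k} \tilde{B}^{-i}(\mathbb{D}) \\ 0 & \text{otherwise.} \end{cases}$$
Note that $\sigma^k \to \sigma$ in $L^1(\mathbb{C})$ as $k \to \infty$ and for every fixed $k$, $\sigma_n^k \to \sigma^k$ in $L^1(\mathbb{C})$ as $n \to \infty$.

Lemma 11.2. The $L^1$-norm $\|\sigma_n - \sigma\|_1$ goes to zero as $n \to \infty$ if the area of the open set $\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})$ goes to zero uniformly in $n$ as $k \to \infty$.

Proof. For a given $\varepsilon > 0$, take $k_0$ so large that $k > k_0$ implies $\operatorname{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})\big) < \varepsilon$ for all $n$. Then for a fixed large $k > k_0$ and $n$ large enough,
$$\|\sigma_n - \sigma\|_1 \le \|\sigma_n - \sigma_n^k\|_1 + \|\sigma_n^k - \sigma^k\|_1 + \|\sigma^k - \sigma\|_1 \le \|\sigma_n - \sigma_n^k\|_1 + 2\varepsilon = \int_{\bigcup_{i=k+1}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})} |\sigma_n| \, dx\, dy + 2\varepsilon < 3\varepsilon.$$
This completes the proof of the lemma. □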
So it remains to prove that the area of $\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})$ goes to zero uniformly in $n$ as $k \to \infty$. Clearly $\operatorname{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}^{-i}(\mathbb{D})\big) \to 0$ as $k \to \infty$. Since $\{\xi_n\}$ is uniformly quasiconformal, there is a constant $C \ge 1$ such that
$$C^{-1} \operatorname{area}(E) \le \operatorname{area}(\xi_n(E)) \le C \operatorname{area}(E)$$
for every $n$ and every measurable set $E \subset \bigcup_{i=0}^{\infty} \tilde{B}^{-i}(\mathbb{D})$. By (11.1),
$$\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D}) = \xi_n\Big(\bigcup_{i=k}^{\infty} \tilde{B}^{-i}(\mathbb{D})\Big),$$
so $\operatorname{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}_{\mu_n}^{-i}(\mathbb{D})\big) \le C \operatorname{area}\big(\bigcup_{i=k}^{\infty} \tilde{B}^{-i}(\mathbb{D})\big)$ and this proves that the left side goes to zero uniformly in $n$. □

12. Renormalizable Blaschke Products

Here we consider those Blaschke products in $\mathcal{B}^{cm}(\theta)$ from which one can "extract" the standard degree 3 Blaschke product $f_\theta$ to be defined below. The importance of this particular Blaschke product lies in the fact that it provides a model for the dynamics of the quadratic polynomial $Q_\theta : z \mapsto e^{2\pi i \theta} z + z^2$. It will be convenient to define renormalizable Blaschke products in $\mathcal{B}^{cm}(\theta)$ as ones which after the surgery give rise to renormalizable cubics in $\mathcal{P}^{cm}(\theta)$ (see Sect. 4). In what follows we will have to work with a symmetrized version of the notion of a quadratic-like map in order to show that any renormalizable Blaschke product is quasiconformally conjugate near the Julia set of its renormalization to the standard map $f_\theta$. The proof of this fact resembles the proof of [DH2] that every hybrid class of polynomial-like maps contains a polynomial. First we include the following simple fact for completeness.

Proposition 12.1. Let $0 < \theta < 1$ be a given irrational number and $f : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ be a degree 3 Blaschke product with a superattracting fixed point at the origin and a double critical point at $z = 1$. Let the rotation number of $f|_{\mathbb{T}}$ be θ. Then there exists a unique $0 < t(\theta) < 1$ such that
$$f(z) = f_\theta(z) = e^{2\pi i t(\theta)} z^2 \left(\frac{z - 3}{1 - 3z}\right). \tag{12.1}$$

Proof. Clearly $f(z) = e^{2\pi i t} z^2 \left(\dfrac{z - a}{1 - \bar{a} z}\right)$, with $|a| > 1$ and $0 < t < 1$. The fact that $z = 1$ is a double critical point of $f$ (that is, $f'(1) = f''(1) = 0$) implies $a = 3$. Since the rotation number of $f|_{\mathbb{T}}$ as a function of $t$ is continuous and strictly monotone at all irrational values, there exists a unique $t$ for which this rotation number is θ. □

Remark. Computer experiments give the value $t(\theta) \approx 0.613648\cdots$ for the golden mean $\theta = (\sqrt{5} - 1)/2$. Figure 17 shows the Julia set of $f_\theta$ for this value of θ.
This standard degree 3 Blaschke product was introduced by Douady, Ghys, Herman and Shishikura as a model for the quadratic $Q_\theta : z \mapsto e^{2\pi i \theta} z + z^2$ in the case θ is irrational of bounded type [D2]. It was also used by Petersen [Pe] to prove that the Julia set of $Q_\theta$ is locally connected and has measure zero.
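The kind of computer experiment mentioned in the Remark can be sketched as follows. This is only an illustrative sketch, not the procedure used in the paper: the function names (`f_theta`, `lift`, `rotation_number`, `find_t`), the iteration count and the bisection depth are all assumptions made here. The sketch uses the fact that on $|z| = 1$ one has $z^2 (z-3)/(1-3z) = z \, e^{2i \arg(3 - z)}$, which gives a continuous lift of $f|_{\mathbb{T}}$ to the real line, and then bisects on $t$ using a finite-orbit estimate of the rotation number.

```python
import cmath
import math

def f_theta(z, t):
    # The Blaschke product of (12.1) with a free angle parameter t.
    return cmath.exp(2j * math.pi * t) * z**2 * (z - 3) / (1 - 3 * z)

def lift(x, t):
    # Lift of f restricted to the unit circle, where z = exp(2*pi*i*x).
    # On |z| = 1: z^2 (z-3)/(1-3z) = z * exp(2i*arg(3 - z)), and arg(3 - z)
    # stays in (-pi/2, pi/2), so this expression is continuous in x.
    z = cmath.exp(2j * math.pi * x)
    return x + t + cmath.phase(3 - z) / math.pi

def rotation_number(t, n=2000):
    # Finite-orbit estimate F^n(0)/n; the error is at most 1/n.
    x = 0.0
    for _ in range(n):
        x = lift(x, t)
    return x / n

def find_t(theta, steps=40, n=2000):
    # Bisection on t (hypothetical helper): the estimate above is
    # non-decreasing in t because the lift is monotone in x and in t.
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if rotation_number(mid, n) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    golden = (math.sqrt(5) - 1) / 2
    print(find_t(golden))
```

The printed value can be compared with the figure $t(\theta) \approx 0.613648$ quoted in the Remark; the accuracy of this sketch is limited by the $1/n$ error of the rotation number estimate, so close agreement in all digits should not be expected without increasing `n`.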
Fig. 17. Julia set of $f_\theta$ for $\theta = (\sqrt{5} - 1)/2$
Definition. A Blaschke product $B \in \mathcal{B}^{cm}(\theta)$ is called renormalizable if $S(B) \in \mathcal{P}^{cm}(\theta)$ is a renormalizable cubic, as defined in Sect. 4.

Theorem 12.2. Let $B \in \mathcal{B}^{cm}(\theta)$ be renormalizable. Then there exists a pair of annuli $W' \Subset W$, both containing the unit circle and symmetric with respect to it, and a quasiconformal homeomorphism $\varphi_B : \mathbb{C} \to \mathbb{C}$ such that:
(a) $B : \partial W' \to \partial W$ is a degree 2 covering map,
(b) $\varphi_B \circ I = I \circ \varphi_B$,
(c) $(\varphi_B \circ B)(z) = (f_\theta \circ \varphi_B)(z)$ for all $z \in W'$.
Moreover, one can arrange $\bar\partial \varphi_B = 0$ on $K(B) = \bigcap_{n \ge 0} B^{-n}(W')$.

Proof. Consider the cubic $P = S(B) = \varphi \circ \tilde{B} \circ \varphi^{-1} \in \mathcal{P}^{cm}(\theta)$ which is renormalizable. Consider the quadratic-like restriction $P|_U : U \to V$ and the corresponding regions $U_1 = \varphi^{-1}(U)$ and $V_1 = \varphi^{-1}(V)$. Clearly $U_1 \Subset V_1$ and both contain the closed unit disk. Define the symmetrized regions
$$W' = U_1 \cap I(U_1), \qquad W = V_1 \cap I(V_1)$$
which are topological annuli with $W' \Subset W$. Note that $B$ sends $\partial W'$ to $\partial W$ in a 2-to-1 fashion. Now extend $B|_{W'}$ to the whole complex plane by gluing it to the polynomial $z \mapsto z^2$ near 0 and ∞ as follows: Let $r > 1$ and $\omega : \mathbb{C} \setminus W' \to \mathbb{C} \setminus A(r^{-1}, r)$ be a diffeomorphism such that
$$\omega \circ I = I \circ \omega, \qquad \omega(B(z)) = \omega(z)^2, \quad z \in \partial W'.$$
Define the extension of $B|_{W'}$ by
$$F(z) = \begin{cases} B(z) & z \in W' \\ \omega^{-1}(\omega(z)^2) & z \notin W'. \end{cases}$$
Note that $F$ is a quasiregular degree 3 self-map of the sphere, $F \circ I = I \circ F$, and every point outside $W'$ will converge to 0 or ∞ under the iteration of $F$. Define a conformal structure σ on the plane as follows: Put $\sigma = \omega^* \sigma_0$ on $\mathbb{C} \setminus W'$, and pull it back by $F^{\circ n}$ to all the components of $F^{-n}(\mathbb{C} \setminus W') \cap W'$. Finally, on $K(B)$ set $\sigma = \sigma_0$. It is easy to see that σ has bounded dilatation on the plane, is symmetric with respect to the unit circle, and $F^*(\sigma) = \sigma$. By the Measurable Riemann Mapping Theorem, there exists a unique quasiconformal homeomorphism $\varphi_B$ of the plane which fixes 0, 1, ∞, such that $\varphi_B^*(\sigma_0) = \sigma$. The conjugate map $f = \varphi_B \circ F \circ \varphi_B^{-1}$ is easily seen to be a degree 3 rational map on the sphere. The quasiconformal homeomorphism $I \circ \varphi_B \circ I$ also fixes 0, 1, ∞ and pulls $\sigma_0$ back to σ because σ is symmetric with respect to $\mathbb{T}$. By uniqueness, $\varphi_B = I \circ \varphi_B \circ I$. This implies that $f$ commutes with $I$, hence it is a Blaschke product. By Proposition 12.1, $f = f_\theta$, and we are done. □

While the above theorem establishes a direct connection between some Blaschke products in $\mathcal{B}^{cm}(\theta)$ and $f_\theta$, it is curious to note the following entirely different relation:

Theorem 12.3. Let $B_{\mu_n}$ be any sequence in $\mathcal{B}^{cm}(\theta)$ such that $\mu_n \to \infty$ as $n \to \infty$. Then $B_{\mu_n} \to f_\theta$ locally uniformly on $\mathbb{C}^*$ as $n \to \infty$. In other words, $f_\theta$ can be regarded as the point at infinity of the parameter space $\mathcal{B}^{cm}(\theta)$.

Proof. As in Sect. 8, let
$$B_{\mu_n} : z \mapsto e^{2\pi i t_n} z^3 \left(\frac{z - p_n}{1 - \bar{p}_n z}\right) \left(\frac{z - q_n}{1 - \bar{q}_n z}\right).$$
The first and second logarithmic derivatives
$$\frac{B_{\mu_n}'}{B_{\mu_n}} \qquad \text{and} \qquad \frac{B_{\mu_n} B_{\mu_n}'' - (B_{\mu_n}')^2}{(B_{\mu_n})^2}$$
both vanish at $z = 1$. A brief computation shows that these two conditions translate into
$$\frac{|p_n|^2 - 1}{|p_n - 1|^2} + \frac{|q_n|^2 - 1}{|q_n - 1|^2} = 3, \tag{12.2}$$
and
$$\frac{(p_n - \bar{p}_n)(|p_n|^2 - 1)}{|p_n - 1|^4} + \frac{(q_n - \bar{q}_n)(|q_n|^2 - 1)}{|q_n - 1|^4} = 0. \tag{12.3}$$
Since $\mu_n \to \infty$, both $p_n$ and $q_n$ cannot stay bounded. Hence, after relabeling, $p_n \to \infty$ (compare Theorem 7.1). Then (12.2) shows that $(|q_n|^2 - 1)/|q_n - 1|^2 \to 2$, or equivalently, $|q_n - 2| \to 1$ but $q_n$ stays away from $z = 1$ by Lemma 7.4. On the other hand, (12.3) shows that $(q_n - \bar{q}_n)(|q_n|^2 - 1)/|q_n - 1|^4 \to 0$, hence $(q_n - \bar{q}_n)/|q_n - 1|^2 \to 0$. Since $q_n$ does not accumulate on $z = 1$, this implies that $(q_n - \bar{q}_n) \to 0$. Near the circle $|z - 2| = 1$ this can happen only if $q_n \to 3$. Since the rotation number depends continuously on the circle map, it is easy to see that $B_{\mu_n} \to f_\theta$ locally uniformly on $\mathbb{C}^*$. □
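The first of the two conditions in the proof of Theorem 12.3 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the Blaschke product in the form $e^{2\pi i t} z^3 \frac{z - p}{1 - \bar{p} z} \frac{z - q}{1 - \bar{q} z}$, and the helper names (`blaschke`, `log_derivative_at_one`, `lhs_12_2`) and sample values of $p$, $q$, $t$ are chosen here, not taken from the paper. Differentiating $\log B$ term by term at $z = 1$ gives $3 - \frac{|p|^2 - 1}{|p - 1|^2} - \frac{|q|^2 - 1}{|q - 1|^2}$, so the logarithmic derivative vanishes exactly when the sum of the two fractions equals 3.

```python
import cmath
import math

def blaschke(z, t, p, q):
    # Assumed normal form: exp(2*pi*i*t) * z^3 * two Blaschke factors
    # with zeros p and q outside the unit disk.
    p, q = complex(p), complex(q)
    return (cmath.exp(2j * math.pi * t) * z**3
            * (z - p) / (1 - p.conjugate() * z)
            * (z - q) / (1 - q.conjugate() * z))

def log_derivative_at_one(p, q):
    # B'/B at z = 1, obtained by differentiating log B term by term.
    p, q = complex(p), complex(q)
    return (3
            + 1 / (1 - p) + p.conjugate() / (1 - p.conjugate())
            + 1 / (1 - q) + q.conjugate() / (1 - q.conjugate()))

def lhs_12_2(p, q):
    # Left-hand side of (12.2).
    p, q = complex(p), complex(q)
    return ((abs(p)**2 - 1) / abs(p - 1)**2
            + (abs(q)**2 - 1) / abs(q - 1)**2)

if __name__ == "__main__":
    p, q = complex(2, 1), complex(-1.5, 0.5)   # arbitrary sample zeros
    # The two printed numbers should agree up to rounding.
    print(log_derivative_at_one(p, q), 3 - lhs_12_2(p, q))
    # Limiting configuration of the proof: p large, q = 3.
    print(lhs_12_2(1e8, 3))
```

In the limiting configuration the first fraction tends to 1 as $p \to \infty$ and the second equals $8/4 = 2$ at $q = 3$, which is consistent with the limit $(|q_n|^2 - 1)/|q_n - 1|^2 \to 2$ used in the proof.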
Consider a sequence Bµn going off to infinity as in the previous theorem. Consider eµn ◦ ϕn−1 as in (9.2). By the previous theorem, the cubics Pcn = S(Bµn ) = ϕn ◦ B eµn → feθ locally uniformly on C. Here Bµn → fθ locally uniformly on C∗ , so B feθ denotes the modified Blaschke product for fθ , defined in a way similar to (9.1). Since {ϕn } is normal by Corollary 10.5, by passing to a subsequence if necessary, ϕn converges to a quasiconformal homeomorphism ϕ. Since the surgery map is proper by Proposition 10.6, cn → ∞. By examining the normal form (2.2), we see that Pcn → Q, where Q : z 7 → λz(1 − 1/2z) is affinely conjugate to Qθ : z 7 → e2π iθ z + z2 . Hence, Q = ϕ ◦ feθ ◦ ϕ −1 and we recover the surgery introduced by Douady and others. We conclude that the surgery map S : Bcm (θ) → P cm (θ ) extends continuously to the points at infinity of both parameter spaces, and the extension is also a surgery. The next theorem is the analogue of Theorem 5.1 for Blaschke products. It will be more convenient to formulate it for a general Blaschke product since we would like to use it for fθ as well as the elements of Bcm (θ). Theorem 12.4 (Paths of QC Conjugacies). Let A and B be two Blaschke products of degree d and let 8 be a quasiconformal homeomorphism which fixes 0, 1, ∞ such that 8 ◦ I = I ◦ 8 and 8 ◦ A = B ◦ 8. Then there exists a path {8t }0≤t≤1 of quasiconformal is a homeomorphisms, with 80 = id and 81 = 8, such that At = 8t ◦ A ◦ 8−1 t Blaschke product for every 0 ≤ t ≤ 1. In particular, either A is quasiconformally rigid or its conjugacy class is non-trivial and path-connected. Proof. The proof is almost identical to that of Theorem 5.1. Consider σ = 8∗ σ0 , which is invariant under A, and take the real perturbations σt = tσ , 0 ≤ t ≤ 1. Let 8t be the unique quasiconformal homeomorphism which fixes 0, 1, ∞ and satisfies 8∗t σ0 = σt . The map At = 8t ◦ A ◦ 8−1 t is easily seen to be a degree d rational map. 
By uniqueness, I ◦ Φt ◦ I = Φt since the left-hand side also pulls σ0 back to σt and fixes 0, 1, ∞. Hence At commutes with I. So it is a Blaschke product. □

We will need the next lemma in the proof of Theorem 13.3.

Lemma 12.5 (Rigidity on the Julia Set). Let ψ be a quasiconformal homeomorphism defined on an open annulus containing the Julia set J(fθ) of the Blaschke product fθ defined in (12.1). Suppose that ψ commutes with I and conjugates fθ to itself. Then ψ|J(fθ) is the identity.

Proof. Extend ψ to a quasiconformal homeomorphism C → C which commutes with I and conjugates fθ to itself. By the previous theorem, there exists a path t ↦ ψt of quasiconformal homeomorphisms, with 0 ≤ t ≤ 1 and ψ0 = id, ψ1 = ψ, such that ψt ◦ fθ ◦ ψt⁻¹ is a degree 3 Blaschke product quasiconformally conjugate to fθ. By Proposition 12.1, this Blaschke product has to be fθ itself, so ψt commutes with fθ. Now that ψ|J(fθ) must be the identity map follows from an argument similar to the proof of Lemma 6.2. □

13. Surjectivity of the Surgery Map

In this section we prove that the surgery map S : B^cm(θ) → P^cm(θ) is surjective. We do this by showing that S is injective on the set of Blaschke products which map to C∗ ∖ M(θ) or to hyperbolic-like cubics. The proof of this fact is based on the combinatorics of drops and their nuclei as developed in Sect. 8. Here is the outline of the proof: If S(A) =
S(B) for some A, B ∈ B^cm(θ), there exists a quasiconformal homeomorphism of the plane which conjugates the modified Blaschke products Ã and B̃, which is conformal everywhere except on the union of the maximal drops. A careful analysis will then show that when S(A) is not capture, one can redefine this homeomorphism on all the drops of the two Blaschke products to get a conjugacy between A and B everywhere. A pull-back argument together with the Bers Sewing Lemma at each step shows that this conjugacy is conformal away from the Julia sets (Theorem 13.1). When S(A) is hyperbolic-like or has disconnected Julia set, one can use the renormalization scheme of Sect. 12 and the rigidity on the Julia sets (Lemma 12.5) to conclude that the conjugacy between A and B is in fact conformal (Theorem 13.3). Surjectivity of S, Theorem 13.7 and some corollaries will follow immediately.

Theorem 13.1. Let A, B ∈ B^cm(θ) and S(A) = S(B) = P. Suppose that P is not capture. Then there exists a quasiconformal homeomorphism Φ : Ĉ → Ĉ which fixes 0, 1, ∞, commutes with I and conjugates A to B. Moreover, Φ is conformal on the Fatou set Ĉ ∖ J(A).

Proof. Following the notation of (9.2), we assume that P = ϕ ◦ Ã ◦ ϕ⁻¹ = ϕ′ ◦ B̃ ◦ ϕ′⁻¹ for some quasiconformal homeomorphisms ϕ and ϕ′. Consider the quasiconformal homeomorphism Φ0 = ϕ′⁻¹ ◦ ϕ which conjugates Ã to B̃ on the entire plane and is conformal (i.e., ∂̄Φ0 = 0) everywhere except on ∪k≥0 Ã^{−k}(D). Note that by Proposition 10.2(b) the open set Ĉ ∖ ∪k≥0 Ã^{−k}(D) is precisely the nucleus N∞ as defined in Sect. 8. Also, ∪k≥0 Ã^{−k}(D) is the disjoint union of the maximal drops of A (which by Proposition 10.2(a) correspond to the bounded Fatou components of P which map to the Siegel disk ΔP). Similar correspondence holds for the open set ∪k≥0 B̃^{−k}(D). Therefore, for any maximal k-drop D_k^i(A), there corresponds a unique maximal k-drop D_k^i(B) = Φ0(D_k^i(A)).
Finally, note that for any such maximal drops, A^{◦k} : D_k^i(A) → D and B^{◦k} : D_k^i(B) → D are conformal isomorphisms since by our assumption P is not capture.

In what follows we construct a sequence of quasiconformal homeomorphisms Φn : C → C which preserve the unit circle T and another sequence Υn by symmetrizing each Φn:

\[ \Upsilon_n(z) = \begin{cases} \Phi_n(z) & |z| \ge 1 \\ (I \circ \Phi_n \circ I)(z) & |z| < 1. \end{cases} \]

We have already constructed Φ0, hence Υ0. Consider the sequences of compact sets {Jn(A)} and {Jn(B)} as in Lemma 10.3. Note that Φ0 ◦ A = B ◦ Φ0 on J0(A). The next step is to define Φ1: Let Φ1 = Υ0 everywhere except on the maximal drops of A. On any maximal k-drop D_k^i(A) we define Φ1 : D_k^i(A) → D_k^i(B) by B^{−k} ◦ Υ0 ◦ A^{◦k}. (When k = 0, the only maximal 0-drop is D and by this definition Φ1|D = Υ0|D.) Observe that the two definitions match along the common boundary. Hence Φ1 is in fact a quasiconformal homeomorphism by the Bers Sewing Lemma. Note that Φ1|J0(A) = Φ0|J0(A) and by definition of J1(A) in (10.1), Φ1 ◦ A = B ◦ Φ1 on J1(A). The homeomorphism Υ1 is then obtained by symmetrizing Φ1. Continuing inductively, we define Φn to be equal to Υn−1 everywhere except on the maximal drops of A and then on the maximal drops we define it by taking pull-backs. In other words, Φn : D_k^i(A) → D_k^i(B) will be defined by B^{−k} ◦ Υn−1 ◦ A^{◦k}.
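The symmetrization step used in this construction can be illustrated numerically. The following is only a minimal sketch, assuming that I is the reflection z ↦ 1/z̄ in the unit circle; the names `reflect`, `symmetrize` and the test map `twist` are hypothetical and are not part of the paper's construction. The points 0 and ∞ are not handled.

```python
import cmath
from typing import Callable

ComplexMap = Callable[[complex], complex]


def reflect(z: complex) -> complex:
    """Reflection I(z) = 1 / conj(z) in the unit circle (assumed form of I; z != 0)."""
    return 1 / z.conjugate()


def symmetrize(phi: ComplexMap) -> ComplexMap:
    """Return upsilon equal to phi on |z| >= 1 and to I∘phi∘I on |z| < 1."""
    def upsilon(z: complex) -> complex:
        if abs(z) >= 1:
            return phi(z)
        return reflect(phi(reflect(z)))
    return upsilon


def twist(z: complex) -> complex:
    """Hypothetical test map: rotates each circle |z| = r by the angle r - 1,
    so it fixes the unit circle pointwise but does not commute with I."""
    return z * cmath.exp(1j * (abs(z) - 1))


# The symmetrized map agrees with twist outside the disk and commutes with I.
upsilon = symmetrize(twist)
```

By construction `upsilon` commutes with `reflect` up to rounding, which is the property the proof uses when passing from Φn to Υn.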
Lemma 13.2. The sequence of quasiconformal homeomorphisms {Φn} has the following properties:

\[ \Phi_n|_{J_{n-1}(A)} = \Phi_{n-1}|_{J_{n-1}(A)}, \tag{13.1} \]

and

\[ (\Phi_n \circ A)(z) = (B \circ \Phi_n)(z), \qquad z \in J_n(A). \tag{13.2} \]
Proof. Both properties follow by induction on n. Let us prove (13.1) first. We have already seen (13.1) for n = 1. Assume (13.1) is true for n and let z ∈ Jn(A). We distinguish three cases:
• Case 1. z ∈ Jn(A) ∩ D. Then I(z) ∈ Jn−1(A) and we have Φn+1(z) = Υn(z) = (I ◦ Φn ◦ I)(z) = (I ◦ Φn−1 ◦ I)(z) by the induction hypothesis. The latter is clearly equal to Υn−1(z) = Φn(z).
• Case 2. z ∈ Jn(A) ∖ D and A^{◦k}(z) ∈ D for some k ≥ 1. Then A^{◦k}(z) ∈ I(Jn−1(A)) and hence (I ◦ A^{◦k})(z) ∈ Jn−1(A). So Φn+1(z) = (B^{−k} ◦ Υn ◦ A^{◦k})(z) = (B^{−k} ◦ I ◦ Φn ◦ I ◦ A^{◦k})(z) = (B^{−k} ◦ I ◦ Φn−1 ◦ I ◦ A^{◦k})(z) by the induction hypothesis. Again, the latter is equal to (B^{−k} ◦ Υn−1 ◦ A^{◦k})(z) = Φn(z).
• Case 3. z ∈ Jn(A) ∖ D and z is accumulated by points of the form Case 2. Then, clearly, Φn+1(z) = Φn(z) by continuity.
Altogether the three cases show that Φn+1|Jn(A) = Φn|Jn(A), which completes the induction step and the proof of (13.1).

To prove (13.2) we have to work a little bit more. We have already seen (13.2) for n = 1. Assume (13.2) is true for n and let z ∈ Jn+1(A). We split the induction step into the following cases:
• Case 1. z ∈ Jn+1(A) ∖ D and A(z) ∉ D. Then (Φn+1 ◦ A)(z) = (B ◦ Φn+1)(z) automatically since Φn+1 is defined by pull-backs.
• Case 2. z ∈ Jn+1(A) ∖ D but A(z) ∈ D. Then (Φn+1 ◦ A)(z) = (Υn ◦ A)(z) = (B ◦ B⁻¹ ◦ Υn ◦ A)(z) = (B ◦ Φn+1)(z).
• Case 3. z ∈ Jn+1(A) ∩ D and A(z) ∈ D. Then (Φn+1 ◦ A)(z) = (Υn ◦ A)(z) = (I ◦ Φn ◦ I)(A(z)) = (I ◦ Φn ◦ A)(I(z)). But I(z) ∈ Jn(A) so by the induction hypothesis, (I ◦ Φn ◦ A)(I(z)) = (I ◦ B ◦ Φn)(I(z)) = (B ◦ I ◦ Φn)(I(z)) = (B ◦ Υn)(z) = (B ◦ Φn+1)(z).
• Case 4. z ∈ Jn+1(A) ∩ D but A(z) ∉ D. Then I(z) ∈ Jn(A). Let w = A(z). Since A(I(z)) = I(w) ∈ D, we have I(w) ∈ I(Jn−1(A)), hence w ∈ Jn−1(A). By (13.1), one has Φn+1(w) = Φn(w) = Φn−1(w) = Υn−1(w) = (I ◦ Υn−1 ◦ I)(w) = (I ◦ Φn ◦ I)(w) = (I ◦ Φn ◦ I)(A(z)) = (I ◦ Φn ◦ A)(I(z)) = (I ◦ B ◦ Φn)(I(z)) by the induction hypothesis. The latter is equal to (B ◦ I ◦ Φn)(I(z)) = (B ◦ Υn)(z) = (B ◦ Φn+1)(z). □

Back to the proof of Theorem 13.1. By the Bers Sewing Lemma, the symmetrization Φn → Υn does not increase the dilatation. On the other hand, the modification Υn → Φn+1 achieved by pull-backs along the maximal drops does not increase the dilatation either, simply because A and B are holomorphic. So we may assume that {Φn} is uniformly quasiconformal. Since all the Φn fix 0, 1, ∞, it follows that some subsequence Φn(j) converges locally uniformly to a quasiconformal homeomorphism Φ. Lemma 10.3 and Lemma 13.2 imply that Φ ◦ A = B ◦ Φ on J(A). In particular, this shows that Φ sends all the drops of A bijectively to the drops of B (before we only had a correspondence between the maximal drops of A and B).
It is easy to check that Φ obtained this way is conformal on the union N = ∪i,k N_k^i(A) of all the nuclei of drops of A at all depths as defined in Sect. 8 and in fact conjugates A to B there. Since N is clearly disjoint from the Julia set J(A) by (8.2), it remains to show that every Fatou component of A is contained in N. Consider a component U of the Fatou set of A. Under the iteration of A, U visits both D and C ∖ D either finitely many times or infinitely often. In the first case, U has to map eventually into the nucleus N0(A) or N∞(A), hence it has to be contained in N. We prove that the second case cannot occur. In fact, suppose that the orbit of U visits D and C ∖ D infinitely often. According to Sullivan [Su1], U eventually maps to a periodic Fatou component of A which is either an attracting or parabolic basin or a Siegel disk or a Herman ring. It follows that this cycle of periodic Fatou components intersects both D and C ∖ D, so in either case a critical point of A has to enter D and leave it infinitely often, which is impossible since S(A) is not a capture. This shows that N = Ĉ ∖ J(A) and proves that Φ is a conjugacy between A and B everywhere and is conformal on Ĉ ∖ J(A). It is easy to see that Φ constructed this way commutes with I. □

Theorem 13.3. Let A, B ∈ B^cm(θ) and S(A) = S(B). If S(A) is hyperbolic-like or has disconnected Julia set, then A = B.

Proof. A and B are renormalizable by Theorem 4.2. Consider the quasiconformal homeomorphism Φ given by Theorem 13.1. By Theorem 12.2, there exists a pair of annuli W′A ⋐ WA (resp. W′B ⋐ WB) and a quasiconformal homeomorphism ϕA (resp. ϕB) which conjugates A (resp. B) to fθ on W′A (resp. W′B). Since S(A) = S(B), we can assume that W′B = Φ(W′A) and WB = Φ(WA). The quasiconformal homeomorphism ψ = ϕB ◦ Φ ◦ ϕA⁻¹ : ϕA(W′A) → ϕB(W′B) is a self-conjugacy of fθ near its Julia set which commutes with I. By Lemma 12.5, we must have ψ|J(fθ) = id.
It follows from the Bers Sewing Lemma that the ∂̄-derivative of ψ is zero almost everywhere on J(fθ). Since by Theorem 12.2(b) ϕA (resp. ϕB) has zero ∂̄-derivative on K(A) (resp. K(B)), we conclude that ∂̄Φ = 0 almost everywhere on K(A). But, as in the proof of Corollary 4.3, up to a set of measure zero, J(A) = ∪n≥0 A^{−n}(K(A)). Therefore, ∂̄Φ has to be zero almost everywhere on the Julia set J(A). Hence Φ is conformal, so A = B. □

Remark. We believe that the surgery map is a homeomorphism, at least outside of the capture components where it might have branching. This would imply that the connectedness loci C(θ) and M(θ) are actually homeomorphic, a conjecture that is strongly supported by computer experiments.
Corollary 13.4. The surgery map S restricts to a homeomorphism Λext → Ωext. Similar conclusion holds for Λint and Ωint. In particular, the connectedness locus C(θ) is connected.

Proof. Clearly S maps Λext into Ωext injectively by the previous theorem. Since S is a proper map by Proposition 10.6, it extends to a continuous injection Λext ∪ {∞} ↪ Ωext ∪ {∞}. We claim that this injection is onto. To this end, we show that for any sequence Bµn ∈ Λext which converges to the boundary of the connectedness locus C(θ), the sequence Pcn = S(Bµn) ∈ Ωext converges to the boundary of M(θ). If not, there is a subsequence of Bµn which converges to B ∈ ∂C(θ) but the corresponding subsequence of Pcn converges to some P ∈ Ωext. By continuity, P = S(B). But B has connected Julia set while J(P) is disconnected. This is impossible by Theorem 10.1. □
Corollary 13.5. The connectedness locus C(θ) has only two complementary components Λext and Λint.

Proof. Let U be a bounded component of C∗ ∖ C(θ) which is not Λint. Without loss of generality, we assume that U maps into Ωext by S. Take A ∈ U. By the previous corollary, there exists a B ∈ Λext such that S(A) = S(B). By Theorem 13.3, A = B and this is a contradiction. □

Corollary 13.6. The surgery map S : B^cm(θ) → P^cm(θ) is surjective.

Proof. Compactify B^cm(θ) and P^cm(θ) by adding points at 0 and ∞ to get topological 2-spheres. S extends to a continuous map between these spheres by Proposition 10.6. This map has topological degree ≠ 0 because it is a homeomorphism Λext → Ωext and S⁻¹(Ωext) = Λext. Therefore it has to be surjective. □

Since the boundary of the Siegel disk of a cubic which comes from the surgery is a quasicircle passing through some critical point, we have proved the main result (A3 ) of the introduction:

Theorem 13.7 (Bounded type cubic Siegel disks are quasidisks). Let P be a cubic polynomial which has a fixed Siegel disk Δ of rotation number θ. Let θ be of bounded type. Then the boundary of Δ is a quasicircle which contains one or both critical points of P.

By a recent theorem of Graczyk and Jones [GJ], we have

Corollary 13.8. Under the assumptions of Theorem 13.7, the boundary of the Siegel disk Δ has Hausdorff dimension greater than 1.

A recent result of McMullen [Mc3] implies the following interesting fact: The Hausdorff dimension of ∂Δc is equal to the Hausdorff dimension 1 < δ(θ) < 2 of the boundary of the Siegel disk of Qθ : z ↦ e^{2πiθ}z + z² whenever Pc is renormalizable. It follows from Theorem 4.2 that the function c ↦ HD(∂Δc) takes on the single value δ(θ) on Ωext, Ωint as well as on all the hyperbolic-like components of M(θ). (One can actually find more rigorous estimates for the value of δ(θ) when θ = (√5 − 1)/2; see [BOS].)
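Recall that an irrational θ is of bounded type when the partial quotients of its continued fraction expansion are bounded. As an illustration only, the following sketch computes the first few partial quotients and convergent denominators of the golden mean θ = (√5 − 1)/2 in floating-point arithmetic. The function names are hypothetical, and the floating-point expansion is only trusted for the small number of terms used here.

```python
import math


def partial_quotients(theta: float, count: int) -> list[int]:
    """First `count` continued fraction partial quotients of theta (float arithmetic)."""
    quotients = []
    x = theta
    for _ in range(count):
        a = math.floor(x)
        quotients.append(a)
        frac = x - a
        if frac == 0:
            break  # theta was rational to machine precision
        x = 1 / frac
    return quotients


def convergent_denominators(quotients: list[int]) -> list[int]:
    """Denominators q_0, q_1, ... of the convergents, via q_n = a_n q_{n-1} + q_{n-2}."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    denominators = [q]
    for a in quotients[1:]:
        q_prev, q = q, a * q + q_prev
        denominators.append(q)
    return denominators


golden = (math.sqrt(5) - 1) / 2
quotients = partial_quotients(golden, 10)
denominators = convergent_denominators(quotients)
```

For the golden mean every partial quotient after the leading 0 equals 1, so the bound is 1 and the denominators are Fibonacci numbers.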
Now it is possible to show that despite all the bifurcations taking place near the boundary of the connectedness locus M(θ), which give rise to discontinuity of the Julia sets, the boundaries of the Siegel disks move continuously.

Theorem 13.9 (Boundary of Siegel disks move continuously). The boundary ∂Δc of the Siegel disk of Pc ∈ P^cm(θ) centered at 0 is a continuous function of c ∈ C∗ in the Hausdorff topology.

Proof. Let us fix some P ∈ P^cm(θ). If P ∉ ∂M(θ), Theorem 3.1 shows that J(P), hence ∂ΔP, moves holomorphically in a neighborhood of P and continuity at P is obvious. So let us assume that P ∈ ∂M(θ) and consider a sequence Pcn ∈ P^cm(θ) which converges to P as n → ∞. Since the surgery map is surjective, there exists a sequence Bµn ∈ B^cm(θ) such that S(Bµn) = Pcn. By properness (Proposition 10.6), some subsequence which we still denote by Bµn converges to some B ∈ B^cm(θ), which by continuity maps to P. Now consider the representations Pcn = ϕn ◦ B̃µn ◦ ϕn⁻¹ as in (9.2). Then the boundary ∂Δcn is just the image ϕn(T). Since {ϕn} is normal by Corollary 10.5, some further subsequence, still denoted by {ϕn}, converges to a quasiconformal homeomorphism ψ. The map Q = ψ ◦ B̃ ◦ ψ⁻¹ ∈ P^cm(θ) is quasiconformally conjugate to P. Since P is rigid by Theorem 5.5, P = Q. Now, as n → ∞, ∂Δcn = ϕn(T) converges in the Hausdorff topology to ψ(T) = ∂ΔQ = ∂ΔP. □
Remark. We can actually make this theorem stronger in the following sense: Let Pc0 ∈ P^cm(θ) be a cubic for which one of the critical points c0 or 1 is off the boundary ∂Δc0 (this happens if Pc0 is off the Jordan curve Γ studied in the next section). Then the boundary ∂Δc moves holomorphically as a function of c in a neighborhood of c0. To see this, assume for example that for all c sufficiently close to c0 we have c ∈ ∂Δc but 1 ∉ ∂Δc. Evidently the critical orbit {Pc^{◦k}(c)}k≥0 moves holomorphically as a function of c, and we can extend this motion to the closure of this critical orbit by the λ-lemma. But this closure is precisely the boundary ∂Δc if c is close to c0.

14. Siegel Disks with Two Critical Points on Their Boundary

In this section we characterize those cubics in P^cm(θ) which have both critical points on the boundary of their Siegel disk. In Theorem 14.3 we will prove that the set of all such cubics is a Jordan curve Γ in P^cm(θ). The proof of this theorem will use the fact that the quasiconformal conjugacy classes in B^cm(θ) are path-connected (Theorem 12.4). We then show that when there are no queer components, Γ is in fact the common boundary of Ωext and Ωint (Theorem 14.4).

Consider the set Γ which consists of all cubics P ∈ P^cm(θ) such that both critical points of P belong to the boundary of the Siegel disk ΔP. Fig. 18 shows this set in the parameter space P^cm(θ). Since the surgery map S : B^cm(θ) → P^cm(θ) is surjective by Corollary 13.6, every P ∈ Γ is of the form S(Bµ) with Bµ having two double critical points on the circle. Corollary 9.3 shows that µ must belong to the unit circle T ⊂ C∗ ≃ B^cm(θ). Therefore, we simply have Γ = S(T). In particular, Γ is a closed path in P^cm(θ) ≃ C∗. Suggested by Fig. 18, we want to prove that Γ is a Jordan curve. This would follow immediately if one could prove that S|T is injective. However, I have not been able to show this.
In fact, I do not know how to prove that Blaschke products on the boundary of the connectedness locus C(θ) are quasiconformally rigid. So we take a slightly different approach by showing that the fibers of S|T : T → Γ are connected.

Lemma 14.1. Let A, B ∈ B^cm(θ) and S(A) = S(B) = P. Suppose that P is not capture. Then there exists a path t ↦ At ∈ B^cm(θ) of Blaschke products for 0 ≤ t ≤ 1, with A0 = A, A1 = B, such that S(At) = P for all t.

Proof. Since P is not capture, by Theorem 13.1 there exists a quasiconformal homeomorphism Φ which conjugates A to B, which is conformal away from the Julia set J(A). By Theorem 12.4 there exists a path {Φt}0≤t≤1 connecting the identity map to Φ and a corresponding path {At = Φt ◦ A ◦ Φt⁻¹}0≤t≤1 of elements of B^cm(θ) connecting A to B. Note that by the definition of Φt, these quasiconformal homeomorphisms are all conformal away from J(A). It remains to show that S(At) = P for all 0 ≤ t ≤ 1. Consider the Douady–Earle extension H : D → D used in the definition of S(A) in Sect. 9. Recall that H|T conjugates A|T to the rigid rotation Rθ. Hence, the quasiconformal homeomorphism Ht = H ◦ Φt⁻¹ : D → D will conjugate At|T to the rigid rotation as well. Note that Ht is not in general the Douady–Earle extension of the linearizing homeomorphism
Fig. 18. The Jordan curve Γ, the locus of all critically marked cubics in P^cm(θ) which have both critical points on the boundary of their Siegel disk. Topologically it can be described as the common boundary of the complementary regions Ωext and Ωint. Note that Γ is invariant under c ↦ 1/c
ht : T → T for At. Nevertheless, S_{Ht}(At) = S(At) by Proposition 9.2. Consider the modified Blaschke products

\[ \tilde{A}(z) = \begin{cases} A(z) & |z| \ge 1 \\ (H^{-1} \circ R_\theta \circ H)(z) & |z| < 1 \end{cases} \]

and

\[ \tilde{A}_t(z) = \begin{cases} A_t(z) & |z| \ge 1 \\ (H_t^{-1} \circ R_\theta \circ H_t)(z) & |z| < 1. \end{cases} \]

Note that Φt ◦ Ã = Ãt ◦ Φt. Define the corresponding conformal structures σ = H∗σ0 and σt = Ht∗σ0 as in Sect. 9. It is easy to see that

\[ \sigma = \Phi_t^{*} \sigma_t. \tag{14.1} \]

Here we use the fact that Φt is conformal away from J(A). Consider the normalized solutions ϕ and ϕt of the Beltrami equations ϕ∗σ0 = σ, ϕt∗σ0 = σt. By (14.1) and uniqueness, we have ϕt = ϕ ◦ Φt⁻¹.
Hence, by Proposition 9.2,

\[ S(A_t) = \varphi_t \circ \tilde{A}_t \circ \varphi_t^{-1} = \varphi \circ \Phi_t^{-1} \circ \tilde{A}_t \circ \Phi_t \circ \varphi^{-1} = \varphi \circ \tilde{A} \circ \varphi^{-1} = S(A). \]

This completes the proof of the lemma. □

Corollary 14.2. The fibers of S|T : T → Γ are connected.

Proof. Let A, B ∈ T ⊂ B^cm(θ) and S(A) = S(B). Apply the previous lemma to A and B. Note that At ∈ T for all 0 ≤ t ≤ 1, since At is quasiconformally conjugate to A, hence has two double critical points on the unit circle. □

Theorem 14.3. Γ is a Jordan curve.

Proof. Consider S|T : T → Γ whose fibers are closed and connected by Corollary 14.2. By general topology, Γ is homeomorphic to T/∼, where A ∼ B means S(A) = S(B). Since each equivalence class of ∼ is a closed connected proper subset of T, it follows that T/∼ is homeomorphic to the circle. □

Finally, we find a topological characterization of Γ in P^cm(θ) under the assumption that there are no queer components in the interior of M(θ).

Theorem 14.4 (Topological Characterization of Γ). Γ is a subset of the boundary ∂M(θ) and it contains ∂Ωext ∩ ∂Ωint. If there are no queer components in the interior of M(θ), then Γ = ∂Ωext ∩ ∂Ωint.

Proof. First let us show that ∂Ωext ∩ ∂Ωint ⊂ Γ. Let Pc ∈ ∂Ωext ∩ ∂Ωint and assume that Pc ∉ Γ. Choose Bµ ∈ B^cm(θ) such that S(Bµ) = Pc. We can assume without loss of generality that |µ| > 1. Choose a sequence Pcn ∈ Ωint converging to Pc and a sequence Bµn ∈ Λint such that S(Bµn) = Pcn. By passing to a subsequence, we may assume that Bµn → Bµ0 as n → ∞, where |µ0| ≤ 1. By continuity, S(Bµ0) = Pc and by our assumption Pc ∉ Γ, so we must have |µ0| < 1. Since Pc is not capture by Corollary 5.3, Lemma 14.1 shows that there is a path t ↦ At of quasiconformally conjugate Blaschke products in B^cm(θ) connecting Bµ to Bµ0, all of which are mapped to Pc. Since this path must intersect T somewhere, we conclude that Pc ∈ Γ which is a contradiction.

Now we prove that Γ ⊂ ∂M(θ). Fix some P ∈ Γ.
Since P has both critical points on ∂ΔP, it cannot belong to any hyperbolic-like or capture component. Also, P cannot be in a queer component U of the interior of M(θ), since otherwise every Q ∈ U would have to be quasiconformally conjugate to P by Theorem 5.5, which would imply that Q has two critical points on ∂ΔQ, which would show U ⊂ Γ. But this is evidently impossible because U is open and Γ is a Jordan curve. Therefore, P has to lie in ∂M(θ) = ∂Ωext ∪ ∂Ωint.

Now assume that there are no queer components in the interior of M(θ). To show that Γ = ∂Ωext ∩ ∂Ωint, let Pc0 ∈ Γ and assume by way of contradiction that c0 ∈ ∂Ωext ∖ ∂Ωint. Since c0 has positive distance from Ωint, for all c in a neighborhood D of c0 the sequence {Pc^{◦n}(1)} has to be normal. Assuming that D is a small disk, the Jordan curve Γ cuts D into two topological disks D1 and D2 such that for every c ∈ D1, 1 ∈ ∂Δc and c ∉ ∂Δc, and for every c ∈ D2, c ∈ ∂Δc and 1 ∉ ∂Δc (see Fig. 19).
Fig. 19. (Figure labels: Γ, D, D1, D2, c0. In D1 the point 1 belongs to the boundary of Δc; in D2 the point c belongs to the boundary of Δc.)
Clearly D2 ∩ ∂Ωext = D2 ∩ ∂Ωint = ∅. So D2 has to be a subset of a component U of the interior of M(θ). Since there are no queer components by the assumption, U is either hyperbolic-like or capture. For every c ∈ D1, we have 1 ∈ ∂Δc and the restriction Pc|∂Δc is conjugate to the rigid rotation by angle θ. Therefore, Pc^{◦qn}(1) → 1 for all c ∈ D1, where the qn are the denominators of the rational approximations of θ. Since {Pc^{◦n}(1)} is normal in D, for a subsequence {qn(j)} we must have Pc^{◦qn(j)}(1) → 1 throughout D. In particular, if c ∈ D2, the critical point 1 of Pc must be recurrent. This is impossible if U is hyperbolic-like or capture, since over D2, c ∈ ∂Δc and hence 1 either gets attracted to the attracting cycle or eventually maps to the Siegel disk Δc. □

Acknowledgements. I am grateful to J. Milnor for many inspiring discussions and his support and interest in this work. He also showed me with great patience how to write programs to create the pictures of the Julia sets in this paper. I would like to thank M. Lyubich and D. Schleicher for very useful conversations during the Spring and Fall semesters of 1997 at Stony Brook. Further thanks are due to X. Buff for his suggestion on using deformation in the proof of Theorem 5.3, and A. Epstein and M. Yampolsky for various discussions.
References

[A] Ahlfors, L.: Lectures on Quasiconformal Mappings. Amsterdam: Van Nostrand, 1966
[AB] Ahlfors, L. and Bers, L.: Riemann's mapping theorem for variable metrics. Annals of Math. 72, 385–404 (1960)
[B] Bers, L.: On moduli of Kleinian groups. Russ. Math. Surv. 29, 88–102 (1974)
[BR] Bers, L. and Royden, H.L.: Holomorphic families of injections. Acta Math. 157, 259–286 (1986)
[BOS] Burbanks, A., Osbaldestin, A., Stirnemann, A.: Rigorous bounds on the Hausdorff dimension of Siegel disk boundaries. Commun. Math. Phys. 199, 417–439 (1998)
[CG] Carleson, L. and Gamelin, T.: Complex Dynamics. Berlin–Heidelberg–New York: Springer-Verlag, 1993
[D1] Douady, A.: Systemes dynamiques holomorphes. Seminar Bourbaki, Asterisque 105–106, 39–63 (1983)
[D2] Douady, A.: Disques de Siegel et anneaux de Herman. Seminar Bourbaki, Asterisque 152–153, 151–172 (1987)
[DE] Douady, A. and Earle, C.: Conformally natural extension of homeomorphisms of the circle. Acta Math. 157, 23–48 (1986)
[DH1] Douady, A. and Hubbard, J.: Iteration des polynomes quadratiques complexes. C.R. Acad. Sc. Paris 294, 123–126 (1982)
[DH2] Douady, A. and Hubbard, J.: On the dynamics of polynomial-like mappings. Ann. Sci. Ec. Norm. Sup. 18, 287–343 (1985)
[EY] Epstein, A. and Yampolsky, M.: Geography of the cubic connectedness locus I: Intertwining surgery. Ann. Sci. Ec. Norm. Sup. 32, 151–185 (1999)
[G] Ghys, E.: Transformations holomorphes au voisinage d’une courbe de Jordan. C.R. Acad. Sc. Paris 298, 385–388 (1984)
[GJ] Graczyk, J. and Jones, P.: Geometry of Siegel disks. Manuscript, 1997
[GM] Goldberg, L. and Milnor, J.: Fixed points of polynomial maps II. Ann. Sci. Ec. Norm. Sup. 26, 51–98 (1993)
[Ha] Haissinsky, P.: Chirurgie parabolique. C.R. Acad. Sc. Paris 327, 195–198 (1998)
[H1] Herman, M.: Are there critical points on the boundary of singular domains? Commun. Math. Phys. 99, 593–612 (1985)
[H2] Herman, M.: Conjugaison quasisymetrique des homeomorphismes analytique des cercle a des rotations. Manuscript
[H3] Herman, M.: Conjugaison quasisymetrique des diffeomorphismes des cercle a des rotations et applications aux disques singuliers de Siegel. Manuscript
[KH] Katok, A. and Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press, 1995
[K] Kiwi, J.: Non-accessible critical points of Cremer polynomials. SUNY at Stony Brook IMS preprint, 1996/2, to appear in Ergod. Th. Dynam. Sys.
[Le] Lehto, O.: Univalent Functions and Teichmüller Spaces. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[Ly] Lyubich, M.: Dynamics of rational transforms: The topological picture. Russ. Math. Surv. 41, 35–95 (1986)
[MN] Manton, N. and Nauenberg, M.: Universal scaling behavior for iterated maps in the complex plane. Commun. Math. Phys. 89, 555–570 (1983)
[Mc1] McMullen, C.: Automorphisms of rational maps. In: Holomorphic Functions and Moduli I, ed. Drasin, Earle, Gehring, Kra, Marden, MSRI Pub. 10, Berlin–Heidelberg–New York: Springer, 1988
[Mc2] McMullen, C.: Complex Dynamics and Renormalization. Annals of Math Studies 135, 1994
[Mc3] McMullen, C.: Self-similarity of Siegel disks and the Hausdorff dimension of Julia sets. Acta Math. 180, 247–292 (1998)
[McS] McMullen, C. and Sullivan, D.: Quasiconformal homeomorphisms and dynamics III: The Teichmüller space of a holomorphic dynamical system. Adv. Math. 135, 351–395 (1998)
[M1] Milnor, J.: Dynamics in One Complex Variable: Introductory Lectures. SUNY at Stony Brook IMS preprint, 1990/5
[M2] Milnor, J.: Hyperbolic components in spaces of polynomial maps, with an appendix by A. Poirier. SUNY at Stony Brook IMS preprints, 1992/3
[M3] Milnor, J.: Local connectivity of Julia sets: Expository Lectures. SUNY at Stony Brook IMS preprint, 1992/11
[Pr1] Perez-Marco, R.: Fixed points and circle maps. Acta Math. 179, 243–294 (1997)
[Pr2] Perez-Marco, R.: Siegel disks with quasi-analytic boundary. Manuscript, 1997
[Pe] Petersen, C.: Local connectivity of some Julia sets containing a circle with an irrational rotation. Acta Math. 177, 163–224 (1996)
[R] Rogers, J.: Singularities in the boundaries of local Siegel disks. Ergod. Th. Dynam. Sys. 12, 803–821 (1992)
[Su1] Sullivan, D.: Quasiconformal homeomorphisms and dynamics I: Solution of the Fatou–Julia problem on wandering domains. Annals of Math. 122, 401–418 (1985)
[Su2] Sullivan, D.: Quasiconformal homeomorphisms and dynamics III: Topological conjugacy classes of analytic endomorphisms. Manuscript
[Sw] Świątek, G.: On critical circle homeomorphisms. Bol. Soc. Bras. Mat. 29, 329–351 (1998)
[W] Widom, M.: Renormalization group analysis of quasi-periodicity in analytic maps. Commun. Math. Phys. 92, 121–136 (1983)
[YZ] Yampolsky, M. and Zakeri, S.: Mating Siegel quadratic polynomials. SUNY at Stony Brook IMS preprint, 1998/8
[Y] Yoccoz, J.C.: Petits Diviseurs en Dimension 1. Asterisque 231, (1995)
[Z1] Zakeri, S.: On critical points of proper holomorphic maps on the unit disk. Bull. London Math. Soc. 30, 62–66 (1998)
[Z2] Zakeri, S.: Biaccessibility in quadratic Julia sets. SUNY at Stony Brook IMS preprint, 1998/1, to appear in Ergod. Th. Dynam. Sys.
[Z3] Zakeri, S.: On dynamics of cubic Siegel polynomials. SUNY at Stony Brook IMS preprint, 1998/4
[Z4] Zakeri, S.: Non-quasiconformal surgery and Siegel Julia sets. In preparation
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 235 – 245 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Semilinear PDEs on Self-Similar Fractals
K. J. Falconer
Mathematical Institute, University of St Andrews, North Haugh, St Andrews, Fife, KY16 9SS, Scotland. E-mail: [email protected]
Received: 11 December 1998 / Accepted: 22 March 1999
Abstract: A Laplacian may be defined on self-similar fractal domains in terms of a suitable self-similar Dirichlet form, enabling discussion of elliptic PDEs on such domains. In this context it is shown that semilinear equations such as Δu + u^p = 0, with zero Dirichlet boundary conditions, have non-trivial non-negative solutions if 0 < ν ≤ 2 and p > 1, or if ν > 2 and 1 < p < (ν + 2)/(ν − 2), where ν is the “intrinsic dimension” or “spectral dimension” of the system. Thus the intrinsic dimension takes the rôle of the Euclidean dimension in the classical case in determining critical exponents of semilinear problems.

1. Introduction

Recently a great deal of effort has gone into defining a Laplacian operator for functions on fractal domains. This has led to a study of linear PDEs, such as the heat equation and linear eigenvalue problem, on fractal domains, see for example [4,8,10,12,14]. There are considerable difficulties in defining the Laplacian on a general fractal set and several definitions have been proposed that are applicable to certain classes of fractal. For example, [10,12] define a Laplacian as the limit of discrete differences on graphs approximating the fractal, a method suited to post-critically finite self-similar domains such as the Sierpiński triangle. Recently, Mosco [17,18] introduced a framework for the Laplacian, by taking as a starting point a Dirichlet form that reflects the self-similarities of the underlying fractal. This framework, which depends on the intrinsic structure of the fractal and its “intrinsic dimension” ν, leads to a very general theory of “variational fractals” that includes many of the more specific examples that have been analysed by other means, such as in [3,15,10,12].

So far, attention has concentrated on linear PDEs.
However, many problems on fractal domains lead to nonlinear models, for example reaction-diffusion equations, problems on elastic fractal media or fluid flow through fractal regions, so it is appropriate to investigate nonlinear PDEs. It turns out that the analytic estimates obtained for variational fractals
are just what is needed for critical point analysis in the nonlinear case. We demonstrate that semilinear elliptic PDEs of the form

\[ \Delta_0 u + f(x, u) = 0, \tag{1.1} \]

where Δ0 is the Laplacian corresponding to zero Dirichlet boundary conditions, have non-trivial non-negative solutions if f satisfies certain conditions, in particular that f(x, t) ∼ |t|^p for large t, provided that either 0 < ν ≤ 2 and p > 1, or ν > 2 and 1 < p < (ν + 2)/(ν − 2), where ν is the intrinsic dimension of the system. This is reminiscent of the critical exponent condition for PDEs on classical domains in R^n with smooth boundary, where such equations have non-trivial solutions for all p > 1 if n = 1, 2 and for 1 < p < (n + 2)/(n − 2) if n = 3, 4, . . . , see [13]. Thus the intrinsic dimension plays an analogous rôle to the Euclidean dimension in the classical case. Moreover, the intrinsic dimension has another natural interpretation as the spectral dimension of Δ0u; thus the number of eigenvalues of Δ0u at most λ is asymptotic to λ^{ν/2}. As a particular application, Eq. (1.1) might provide a steady-state solution to a nonlinear reaction-diffusion equation u_t = Δ0u + f(x, u) on a fractal catalyst, where u is temperature and the nonlinearity results from an exothermic chemical reaction.

2. Dirichlet Forms and Variational Fractals

We review the definition of the Laplacian on self-similar sets via a suitable Dirichlet form, following the “variational fractal” approach of Mosco [17,18]. For i = 1, . . . , m, let ψi be contracting similarities on R^n equipped with the Euclidean metric d0, so that d0(ψi(x), ψi(y)) = ri d0(x, y) (x, y ∈ R^n), where ri < 1. Thus {ψ1, . . . , ψm} is an iterated function system which has a self-similar attractor K, that is a unique non-empty compact set K ⊂ R^n satisfying K = ∪_{i=1}^{m} ψi(K), see [7]. We assume that the system satisfies the open set separation condition, that there is a non-empty open set U such that ∪_{i=1}^{m} ψi(U) ⊂ U with this union disjoint. Then the Hausdorff and box-counting dimensions of K are given by the non-negative number df satisfying

\[ \sum_{i=1}^{m} r_i^{d_f} = 1. \]
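The dimension equation above rarely has a closed-form solution when the ratios ri differ, but it is easy to solve numerically. The following is a minimal sketch, not taken from the paper: the function name `similarity_dimension` and the bisection approach are illustrative choices, and the example ratios (three maps of ratio 1/2, as for the Sierpiński triangle) are an assumption used only for checking.

```python
import math


def similarity_dimension(ratios, tol=1e-12):
    """Solve sum(r ** d for r in ratios) == 1 for d >= 0 by bisection.

    Each ratio must satisfy 0 < r < 1, so the left-hand side is strictly
    decreasing in d and the solution is unique.
    """
    if not ratios or any(not 0 < r < 1 for r in ratios):
        raise ValueError("ratios must be a non-empty list of numbers in (0, 1)")

    def excess(d):
        return sum(r ** d for r in ratios) - 1

    lo, hi = 0.0, 1.0
    while excess(hi) > 0:  # enlarge the bracket until the sum drops below 1
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Three similarities of ratio 1/2: the dimension should be log 3 / log 2.
sierpinski_dimension = similarity_dimension([0.5, 0.5, 0.5])
```

When all ratios equal r, the result reduces to log m / log(1/r), which gives a quick sanity check.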
Moreover, $K$ has positive finite $d_f$-dimensional Hausdorff measure. Let $\mu$ denote the restriction to $K$ of normalised $d_f$-dimensional Hausdorff measure, so that $\mu(K) = 1$ and $\mu$ satisfies the scaling property
\[ \mu(A) = \sum_{i=1}^m r_i^{d_f}\, \mu(\psi_i^{-1}(A)) \tag{2.1} \]
for $A \subset \mathbb{R}^n$. We write $A_{i_1,\dots,i_k} = \psi_{i_1} \circ \cdots \circ \psi_{i_k}(A)$ for a set $A \subset \mathbb{R}^n$ in the usual way. Then $\mu(K_{i_1,\dots,i_k} \cap K_{j_1,\dots,j_k}) = 0$ whenever $i_1,\dots,i_k \ne j_1,\dots,j_k$. We write $\Gamma \equiv \bigcup_{i \ne j} \psi_i^{-1}(K_i \cap K_j)$ for the intrinsic boundary of $K$, so $\Gamma_i \cap \Gamma_j = K_i \cap K_j$. We
also assume an iterated analogue of this, namely $K_{i_1,\dots,i_k} \cap K_{j_1,\dots,j_k} = \Gamma_{i_1,\dots,i_k} \cap \Gamma_{j_1,\dots,j_k}$ for all $i_1,\dots,i_k \ne j_1,\dots,j_k$. Let $L^2(K,\mu)$ be the Hilbert space of $\mu$-square integrable functions on $K$, with the usual inner product and norm $\|\cdot\|_2$. Let $W : D_W \times D_W \to \mathbb{R}_{\ge 0}$ be a Dirichlet form in $L^2(K,\mu)$ with domain $D_W = \{u \in L^2(K,\mu) : W[u] < \infty\}$, where we write $W[u] = W(u,u)$. Thus $W$ is a densely defined closed, non-negative, symmetric bilinear form, which satisfies the Markovian property, that is if $u \in D_W$ and $T : \mathbb{R} \to \mathbb{R}$ is such that
\[ T(0) = 0 \text{ and } |T(t_1) - T(t_2)| \le |t_1 - t_2| \text{ for all } t_1, t_2 \in \mathbb{R}, \tag{2.2} \]
then
\[ T \circ u \in D_W \text{ and } W[T \circ u] \le W[u]. \tag{2.3} \]
We assume that $W$ is irreducible, that is $W[u] = 0$ if and only if $u$ is constant, and strongly local, so that $W(u,v) = 0$ if $u$ is constant on $\operatorname{spt} v$. We define the energy norm on $L^2(K,\mu)$ by
\[ \|u\| = (W[u] + \|u\|_2^2)^{1/2}; \tag{2.4} \]
this is analogous to the Sobolev norm $\|\cdot\|_{2,1}$ for functions on classical domains. We assume that $W$ is regular, that is $D_W \cap C(K)$ is dense both in $C(K)$ with the uniform norm, and in $D_W$ with the norm $\|\cdot\|$. For a Dirichlet form $W$ satisfying these conditions, there is a representation
\[ W[u] = \int dL[u], \tag{2.5} \]
where the measure $L(u,v)$ is defined by
\[ \int g\, dL(u,v) = \tfrac{1}{2}\,[W(gu,v) + W(u,gv) - W(g,uv)] \]
for $g \in C_0(K)$. We think of $L[u] \equiv L(u,u)$ as the analogue of “$|\operatorname{grad} u|^2$”, though in general $L[u]$ is a measure rather than a function, and indeed it is not clear if “$\operatorname{grad} u$” itself has any analogue. We next impose a requirement that $W$, and thus $L$, is compatible with the self-similar structure of $K$. We assume that for some constant $\sigma < 1$,
\[ W[u] = \sum_{i=1}^m \mu(K_i)^{\sigma}\, W[u \circ \psi_i] = \sum_{i=1}^m r_i^{\sigma d_f}\, W[u \circ \psi_i] \quad (u \in D_W). \]
This leads to corresponding identities for $L$, that is
\[ \int g\, dL[u] = \sum_{i=1}^m r_i^{\sigma d_f} \int (g \circ \psi_i)\, dL[u \circ \psi_i] \]
for $g \in C_0(K)$. These formulae may be iterated to obtain identities involving the composed mappings $\psi_{i_1} \circ \cdots \circ \psi_{i_k}$. Following Mosco [17,18] we term the triple $(K, \mu, W)$ a variational fractal. A consequence of self-similarity is that there is a number $\nu > 0$, called the intrinsic dimension of $K$, that gives the scaling law exponent of $\mu(B(x,r))$ for $x \in K$ and $r > 0$, where the ball $B(x,r)$ is determined by a natural intrinsic metric on $K$. (This intrinsic metric is closely related to the effective resistance metric used in the constructive approach of Kigami [11].) This intrinsic approach, described in detail in [17,18], sheds a great deal of light on this theory. It turns out that
\[ \nu = \frac{2}{1-\sigma} \quad \text{or} \quad \sigma = \frac{\nu-2}{\nu}. \]
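The relation between $\sigma$ and $\nu$ can be checked on a concrete case. The sketch below is illustrative only: it assumes the standard Sierpiński triangle data from the wider literature (three cells of mass 1/3 and energy renormalisation factor 5/3), which are not stated in this section, and the helper names are hypothetical. With the scaling convention $W[u] = \sum_i \mu(K_i)^{\sigma} W[u \circ \psi_i]$, these values give a negative $\sigma$ and hence $\nu < 2$.

```python
import math


def sigma_from_scaling(cell_mass, energy_factor):
    """Solve cell_mass**sigma == energy_factor for sigma (equal-weight cells)."""
    return math.log(energy_factor) / math.log(cell_mass)


def intrinsic_dimension(sigma):
    """nu = 2 / (1 - sigma), defined for sigma < 1."""
    if sigma >= 1.0:
        raise ValueError("sigma must be less than 1")
    return 2.0 / (1.0 - sigma)


def critical_exponent(nu):
    """Upper limit for p: no restriction if nu <= 2, else (nu + 2)/(nu - 2)."""
    return math.inf if nu <= 2.0 else (nu + 2.0) / (nu - 2.0)


# Assumed gasket data: mu(K_i) = 1/3 and W[u] = (5/3) * sum_i W[u o psi_i].
sigma = sigma_from_scaling(1 / 3, 5 / 3)
nu = intrinsic_dimension(sigma)
```

Under these assumptions `nu` agrees with $2\log 3/\log 5$, the intrinsic dimension quoted for the Sierpiński triangle, and `critical_exponent(nu)` is infinite, so every $p > 1$ is admissible in that case.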
We may use $W$ to define Laplace operators associated with Neumann boundary conditions, and with Dirichlet boundary conditions, respectively, by analogy with the Gauss-Green equation. The (unbounded) Neumann operator $\Delta$ has domain $D_\Delta \subset D_W$ and is such that
\[ W(u,v) = -\int (\Delta u)\, v\, d\mu \tag{2.6} \]
for all $v \in D_W$ and $u \in D_\Delta$. To permit Dirichlet boundary conditions we let $W_0$ be the closure in $\|\cdot\|_2$ of the restriction of $W$ to $D_W \cap C_0(K \setminus \Gamma)$, where $C_0(K \setminus \Gamma)$ is the space of continuous functions with compact support in $K \setminus \Gamma$, and we write $D_{W_0}$ for the domain of $W_0$. Then $W_0$ is closed, regular and strongly local with $D_W \cap C_0(K)$ dense in $C_0(K \setminus \Gamma)$ in the uniform norm, and dense in $D_{W_0}$ in the norm $\|\cdot\|$. The self-adjoint Dirichlet operator $\Delta_0$ with domain $D_{\Delta_0} \subset D_{W_0}$ is defined by
\[ W_0(u,v) = -\int (\Delta_0 u)\, v\, d\mu \quad \text{for all } v \in D_{W_0} \text{ and } u \in D_{\Delta_0}. \tag{2.7} \]
It turns out that the intrinsic dimension $\nu$ equals the spectral dimension of $K$, which gives the asymptotic distribution of the eigenvalues of the Laplacian. Thus
\[ \#\{\text{eigenvalues of } \Delta \le \lambda\} \asymp \#\{\text{eigenvalues of } \Delta_0 \le \lambda\} \asymp \lambda^{\nu/2}, \]
see [14,16,20].

3. Analytic Properties

Some further analytic structure on the variational fractal $(K, \mu, W)$ is needed to enable PDEs to be studied. As in Mosco [17,18] we assume that $W$ satisfies a rather stronger irreducibility condition, namely a global Poincaré inequality of the form
\[ \int_K |u - u_K|^2\, d\mu \le c \int_{K \setminus \Gamma} dL[u] \quad (u \in D_W). \tag{3.1} \]
We also need a connectivity condition to ensure continuation of analytic properties between the different “similarity regions” Ki1 ,... ,ik = ψi1 ◦ · · · ◦ ψik (K) of K. The intersection of adjoining regions may be negligible in the measure theoretic sense, so we require K to be locally connected in a capacity sense. Essentially, we assume that for each small ball B, the regions Ki1 ,... ,ik of comparable size to B and which intersect
$B$ can be ordered so that consecutive regions intersect in a set of positive capacity with respect to $W$, with these capacities uniformly bounded away from 0 after normalising by the size of $B$ (see [17,18] for the full technical condition). Using this condition and self-similarity, (3.1) leads to scaled Poincaré inequalities
\[ \int_{B(x,r)} |u - u_{B(x,r)}|^2\, d\mu \le c \left( \frac{r}{\operatorname{diam} K} \right)^{2} \int_{B(x,qr)} dL[u] \tag{3.2} \]
for all $u \in D_W$, $x \in K$ and $r > 0$, for constants $q \ge 1$ and $c > 0$, see [17]. The scaled Poincaré inequalities yield the norm estimates needed for PDE analysis, see [5,17]. There are two cases depending on whether $\nu < 2$ or $\nu > 2$, where $\nu$ is the intrinsic dimension. In what follows, the constants $c_1, c_2, \dots$ are, in particular, independent of the functions $u \in D_W$.

(a) If $0 < \nu < 2$ there is a Morrey inequality:
\[ |u(x) - u(y)| \le c_1 W[u]^{1/2}\, d_0(x,y)^{d_f(2-\nu)/2\nu} \quad (u \in D_W,\ x, y \in K). \tag{3.3} \]
In particular this gives a uniform bound
\[ |u(x)| \le c_2 (W[u] + \|u\|_2^2)^{1/2} = c_2 \|u\| \quad (u \in D_W,\ x \in K); \tag{3.4} \]
thus the embedding
\[ (D_W, \|\cdot\|) \hookrightarrow (C(K), \|\cdot\|_\infty) \tag{3.5} \]
is continuous, and moreover, by the Arzelà-Ascoli theorem, it is compact.

(b) If $\nu > 2$, (3.2) leads to a Sobolev-type inequality:
\[ \|u\|_{2\nu/(\nu-2)} \le c_3 \|u\| = c_3 (W[u] + \|u\|_2^2)^{1/2} \quad (u \in D_W), \tag{3.6} \]
so the embedding
\[ (D_W, \|\cdot\|) \hookrightarrow (L^{2\nu/(\nu-2)}(K,\mu), \|\cdot\|_{2\nu/(\nu-2)}) \tag{3.7} \]
is continuous. Assuming also the analogue of Rellich’s theorem, that the embedding
\[ (D_W, \|\cdot\|) \hookrightarrow (L^2(K,\mu), \|\cdot\|_2) \tag{3.8} \]
is compact, then together with (3.6) this easily implies that the embedding
\[ (D_W, \|\cdot\|) \hookrightarrow (L^q(K,\mu), \|\cdot\|_q) \tag{3.9} \]
is compact if $2 \le q < 2\nu/(\nu-2)$. Note that in the critical case $\nu = 2$ we get $\|u\|_q \le c_3 \|u\|$ for all $1 \le q < \infty$ in place of (3.6), along with the consequential embeddings.

For all $\nu > 0$ there is a global Poincaré inequality for the Dirichlet operator:
\[ \|u\|_2^2 \le c_4 W_0[u] \quad (u \in D_{W_0}). \tag{3.10} \]
If $0 < \nu < 2$ this follows by taking $y$ on the boundary and integrating (3.3). If $\nu > 2$, (3.10) may be deduced from (3.8), using that $\|u\| \le \limsup_{k \to \infty} \|u_k\|$ if $u_k \to u$ weakly, and that $W$ is irreducible. Taking (3.10) together with (2.4) gives that
\[ W_0[u] \le \|u\|^2 \le c_5 W_0[u] \quad (u \in D_{W_0}), \tag{3.11} \]
so $W_0[\,\cdot\,]^{1/2}$ and $\|\cdot\|$ are equivalent norms on $D_{W_0}$.
4. Semilinear PDEs

Let $f : K \times \mathbb{R} \to \mathbb{R}$ be continuous. With the notation of (2.7), $u \in D_{\Delta_0}$ is a solution of
\[ \Delta_0 u + f(x,u) = 0 \tag{4.1} \]
with Dirichlet boundary conditions if
\[ -W_0(u,v) + \int f(x,u)\, v\, d\mu = \int [(\Delta_0 u)\, v + f(x,u)\, v]\, d\mu = 0 \tag{4.2} \]
for all $v \in D_{W_0}$. Variational methods enable us to study solutions of (4.1) for suitable $f$. Our conclusions depend crucially on the intrinsic dimension $\nu$ of the variational fractal $(K, \mu, W)$. We set
\[ F(x,t) = \int_0^t f(x,s)\, ds. \]
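We will require that the continuous $f : K \times \mathbb{R} \to \mathbb{R}$ satisfies conditions (i)-(iv) below; such conditions are familiar in the theory of PDEs on classical domains, see for example [2]:

(i)
\[ |f(x,t)| \le a + b|t|^p \quad (x \in K,\ t \in \mathbb{R}) \tag{4.3} \]
for some $a$, $b$ and $p$, where
\[ 1 < p < \infty \text{ if } 0 < \nu \le 2 \quad \text{and} \quad 1 < p < (\nu+2)/(\nu-2) \text{ if } \nu > 2, \tag{4.4} \]
(ii)
\[ f(x,t) = o(t) \text{ near } t = 0 \text{ uniformly in } x, \tag{4.5} \]
(iii)
\[ f(x,t)\, t^{-1} \to \infty \text{ as } t \to \infty \text{ uniformly in } x, \tag{4.6} \]
(iv) there are numbers $0 \le \kappa < \tfrac{1}{2}$ and $a_0 > 0$ such that
\[ F(x,t) \le \kappa f(x,t)\, t \quad (x \in K,\ |t| \ge a_0). \tag{4.7} \]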
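For the model nonlinearity $f(t) = t|t|^{p-1}$, condition (iv) can be verified by hand, since $F(t) = |t|^{p+1}/(p+1) = \kappa f(t)\, t$ with $\kappa = 1/(p+1) < 1/2$ whenever $p > 1$. The short script below is a sanity check of that computation, not part of the paper's argument; the function names and the midpoint quadrature are illustrative choices.

```python
def f(t, p):
    """Model nonlinearity f(t) = t * |t|**(p - 1)."""
    return t * abs(t) ** (p - 1)


def F_closed(t, p):
    """Closed form of the primitive F(t) = integral of f from 0 to t."""
    return abs(t) ** (p + 1) / (p + 1)


def F_quadrature(t, p, n=20000):
    """Midpoint-rule approximation of the same integral (works for t < 0 too)."""
    h = t / n
    return sum(f((j + 0.5) * h, p) for j in range(n)) * h


def kappa(p):
    """Constant for which F(t) = kappa * f(t) * t holds exactly."""
    return 1.0 / (p + 1)


p = 3.0
for t in (-2.0, 0.5, 2.0):
    # Both differences should be at rounding or quadrature-error level.
    gap_closed = F_closed(t, p) - kappa(p) * f(t, p) * t
    gap_numeric = F_quadrature(t, p) - F_closed(t, p)
```

Here (iv) holds with equality for every $t$, so any $a_0 > 0$ works for this particular $f$.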
Note that, although we do not require $f(x,t) > 0$, condition (iii) ensures that $f(x,t)$ is everywhere positive for large $t$. In particular, these conditions are all satisfied by $f(x,t) = t|t|^{p-1}$ provided that $p$ satisfies (4.4); this may be regarded as the canonical example. We define $\psi : D_W \to \mathbb{R}$ by
\[ \psi(v) = \int_K F(x, v(x))\, d\mu. \tag{4.8} \]
If $f$ satisfies (i) then $|F(x,t)| \le b_1 |t|^{p+1}$ for large $t$, so $\psi$ is well-defined by (3.4) or (3.6), noting that $p + 1 < 2\nu/(\nu-2)$. Moreover, by an argument parallel to that of [1, Theorem 2.9], $\psi$ is continuous, and has derivative $\psi' : D_W \to L(D_W, \mathbb{R})$ given by
\[ \psi'(v) w = \int_K f(x, v(x))\, w(x)\, d\mu. \tag{4.9} \]
Indeed, (i) implies that the Nemitski map $v \mapsto f(x, v(x))$ is continuous as a mapping $L^{p+1}(K,\mu) \to L^{(p+1)/p}(K,\mu)$, see [1, Theorem 2.2], so as the embedding $D_W \hookrightarrow L^{p+1}(K,\mu)$ is compact by (3.5) or (3.9), the Nemitski map is compact as a mapping $D_W \to L^{(p+1)/p}(K,\mu)$. Using Hölder’s inequality it follows that $\psi' : D_W \to L(D_W, \mathbb{R})$ is compact. We state a form of the variational principle for (4.1) appropriate in this setting.

Proposition 4.1 (Variational principle). Let $f : K \times \mathbb{R} \to \mathbb{R}$ be continuous and satisfy (i). Then $u$ is a stationary point in $D_{W_0}$ of
\[ \tfrac{1}{2} W_0[v] - \int F(x,v)\, d\mu \tag{4.10} \]
if and only if $u \in D_{\Delta_0}$ and
\[ \Delta_0 u + f(x,u) = 0. \tag{4.11} \]

Proof. For $u, v \in D_{W_0}$ we have, using (4.9), that
\[ \tfrac{1}{2} W_0[u+v] - \int F(x,u+v)\, d\mu = \tfrac{1}{2} W_0[u] - \int F(x,u)\, d\mu + W_0(u,v) - \int f(x,u)\, v\, d\mu + o(\|v\|). \tag{4.12} \]
If $u \in D_{\Delta_0}$ satisfies (4.11) and thus (4.2), the second term of (4.12) vanishes, so $u$ is a stationary point in $D_{W_0}$ of (4.10). Conversely, if $u \in D_{W_0}$ is a stationary point of (4.10), then by (4.12)
\[ W_0(u,v) - \int f(x,u)\, v\, d\mu = o(\|v\|) \]
for all $v \in D_{W_0}$. Since the left-hand side of this expression is linear in $v$, (4.2) follows for all $v \in D_{W_0}$, so $u \in D_{\Delta_0}$ and (4.11) is satisfied. □

We recall the mountain pass lemma which will be used to demonstrate the existence of solutions.

Proposition 4.2 (Mountain pass lemma). Let $(X, \|\cdot\|)$ be a Banach space and let $\phi : X \to \mathbb{R}$ be $C^1$. Suppose $\phi(0) = 0$, that there exist numbers $r, h > 0$ such that
\[ \phi(v) > 0 \text{ if } 0 < \|v\| \le r \tag{4.13} \]
and
\[ \phi(v) \ge h \text{ if } \|v\| = r, \tag{4.14} \]
and that there exists $w \in X$ with $\|w\| > r$ and $\phi(w) \le 0$. Suppose further that the Palais–Smale compactness condition holds, that is every sequence $\{v_i\}$ in $X$ with $\phi(v_i)$ positive and bounded above and with $\phi'(v_i) \to 0$ in $L(X, \mathbb{R})$ has a convergent subsequence. Then $\phi$ has a critical value $c$ with $h \le c < \infty$, that is there exists $u \in X$ such that $\phi'(u) = 0$ and $\phi(u) = c$.

Proof. A proof of the mountain pass lemma is given, for example, in [6]. □
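To see how conditions (4.13) and (4.14) are typically met, suppose a functional satisfies a lower bound of the shape $\phi(v) \ge A\|v\|^2 - C\|v\|^{p+1}$ with $A, C > 0$ and $p > 1$. The scalar comparison function $g(s) = A s^2 - C s^{p+1}$ then supplies admissible values of $r$ and $h$. The sketch below is a toy illustration under that assumption; the constants and function names are hypothetical, and it says nothing about the remaining hypothesis $\phi(w) \le 0$, which needs a separate argument.

```python
def comparison(s, A, C, p):
    """Scalar lower bound g(s) = A*s**2 - C*s**(p+1) for s >= 0."""
    return A * s ** 2 - C * s ** (p + 1)


def mountain_pass_radius(A, C, p):
    """Return (r, h) with g > 0 on (0, r] and g(r) = h, taking r at the maximum of g.

    Setting g'(s) = 0 gives s**(p-1) = 2*A / ((p+1)*C).
    """
    if A <= 0 or C <= 0 or p <= 1:
        raise ValueError("need A > 0, C > 0 and p > 1")
    r = (2.0 * A / ((p + 1.0) * C)) ** (1.0 / (p - 1.0))
    h = comparison(r, A, C, p)
    return r, h


# Hypothetical constants chosen only to make the numbers easy to read.
r, h = mountain_pass_radius(A=1.0, C=1.0, p=3.0)
```

With these constants $r = 2^{-1/2}$ and $h = 1/4$; any functional bounded below by $g(\|v\|)$ is positive on the punctured ball of radius $r$ and at least $h$ on its boundary.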
We apply the mountain pass lemma to semilinear PDEs on fractal domains in a similar way to the case of classical domains [2,6]; we include the details for the benefit of readers with backgrounds in fractal rather than nonlinear analysis and since there are some technical differences.

Theorem 4.3. Suppose $0 < \nu \le 2$ and $1 < p < \infty$, or $\nu > 2$ and $1 < p < (\nu+2)/(\nu-2)$. Let $f : K \times \mathbb{R} \to \mathbb{R}$ be continuous and satisfy (i)-(iv). Then the Dirichlet problem
\[ \Delta_0 u + f(x,u) = 0 \tag{4.15} \]
on $K$ has a non-zero solution $u \in D_{\Delta_0} \subset D_{W_0}$, with $u \ge 0$ on $K \setminus \Gamma$. The same conclusion holds if $f$ is merely defined on the restricted domain $f : K \times \mathbb{R}_{\ge 0} \to \mathbb{R}$ with (i)-(iv) satisfied for $t \ge 0$.

Proof. We apply the mountain pass lemma to the space $(D_{W_0}, \|\cdot\|)$, with
\[ \phi(v) = \tfrac{1}{2} W_0[v] - \int F(x,v)\, d\mu \tag{4.16} \]
for $v \in D_{W_0}$. As in (4.12), $\phi : D_{W_0} \to \mathbb{R}$ is well-defined, continuous, and $C^1$, with
\[ \phi'(v) w = W_0(v,w) - \int f(x,v)\, w\, d\mu \tag{4.17} \]
for $v, w \in D_{W_0}$. Take $0 < \epsilon < \tfrac{1}{2} c_5^{-1}$ with $c_5$ as in (3.11). By property (ii) there exists $\delta$ such that $|F(x,t)| \le \epsilon |t|^2$ if $|t| < \delta$, and by (i) $|F(x,t)| \le c_6 |t|^{p+1}$ if $|t| \ge \delta$. Then splitting the domain of integration into $x$ with $|v(x)| < \delta$ and $x$ with $|v(x)| \ge \delta$ gives
\[ \int F(x,v)\, d\mu \le \epsilon \|v\|_2^2 + c_6 \|v\|_{p+1}^{p+1} \le \epsilon \|v\|^2 + c_7 \|v\|^{p+1} \]
by (3.4) or (3.6). Hence from (4.16) and (3.11)
\[ \phi(v) \ge (\tfrac{1}{2} c_5^{-1} - \epsilon) \|v\|^2 - c_7 \|v\|^{p+1}, \tag{4.18} \]
so $\phi(v)$ is positive for sufficiently small $\|v\|$, and we may choose $r, h$ so that (4.13) and (4.14) hold. Moreover, fixing a bounded non-zero $v \in D_{W_0}$ with $v \ge 0$, it follows from (4.16) and (iii) that $\phi(\lambda v) < 0$ if $\lambda$ is large enough. Thus $\phi$ satisfies the basic hypotheses of the mountain pass lemma.

To check the Palais–Smale conditions, let $v_i$ be a sequence in $D_{W_0}$ such that $|\phi(v_i)| \le M$ and $\phi'(v_i) \to 0$ in $L(D_W, \mathbb{R})$. Then for $w \in D_{W_0}$,
\[ \|\phi'(v_i)\| \|w\| \ge |\phi'(v_i) w| = \left| W_0(v_i, w) - \int f(x, v_i)\, w\, d\mu \right|, \tag{4.19} \]
so setting $w = v_i$ we have that for sufficiently large $i$,
\[ \int f(x, v_i)\, v_i\, d\mu \le W_0[v_i] + \|v_i\|. \]
Then if $i$ is sufficiently large, writing $s = \sup\{F(x,t) : x \in K, |t| \le a_0\}$, and using (iv),
\[
\begin{aligned}
M \ge \phi(v_i) &= \tfrac{1}{2} W_0[v_i] - \int F(x, v_i)\, d\mu \\
&\ge \tfrac{1}{2} W_0[v_i] - \kappa \int_{|v_i| \ge a_0} f(x, v_i)\, v_i\, d\mu - \int_{|v_i| < a_0} F(x, v_i)\, d\mu \\
&\ge (\tfrac{1}{2} - \kappa) W_0[v_i] - \kappa \|v_i\| - s \ge \cdots
\end{aligned}
\]
[…]

For example, the most commonly studied construction, the Sierpiński triangle, has Hausdorff dimension $\log 3/\log 2$ and intrinsic dimension $2 \log 3/\log 5$. In such constructions the basic analytic inequalities (3.3) and (3.4) may be obtained directly from the explicit expressions for the graph Laplacians and Dirichlet forms, given, for example, in [10,12]. Indeed, our original approach to semilinear PDEs took these explicit forms as a starting point, before we became aware of Mosco’s more general framework.

It is natural to ask whether the solutions guaranteed by Theorem 4.3 can be taken to be strictly positive throughout $K \setminus \Gamma$ as in the classical case. The usual way of demonstrating this is by a strong maximum principle. Such maximum principles are well-known in the classical setting, but in the fractal setting have so far been established only in the case of regular harmonic structures on p.c.f. fractals, see Strichartz [21].

The solutions of (4.15) given by Theorem 4.3 are in $D_{W_0}$, so in particular if $0 < \nu < 2$ the solutions are continuous and in the Hölder class of exponent $d_f(2-\nu)/2\nu$. If $\nu > 2$ we have “weak” solutions in $L^{2\nu/(\nu-2)}(K,\mu)$; the regularity theory needed to improve this is not generally available, though under circumstances where Green’s functions exist stronger conclusions may be possible.

A feature of these results is that the intrinsic dimension $\nu$ plays an analogous rôle to Euclidean dimension in determining for which nonlinearity exponents $p$ the PDEs have a non-trivial solution. Thus Theorem 4.3 should be compared with classical results on critical exponents: for example, if $K$ is a domain in $\mathbb{R}^n$ with smooth boundary then $\Delta u + u^p = 0$ in $K$ with $u = 0$ on $\partial K$ has a positive solution for all $p > 1$ if $n = 1, 2$, and for $1 < p < (n+2)/(n-2)$ if $n = 3, 4, \dots$, see [13] for a survey of such results. Moreover, if $K$ is starshaped then no such positive solution exists if $n \ge 3$ and $p > (n+2)/(n-2)$, see [19]. It would be interesting to know if there are conditions on a fractal domain $K$ that ensure no non-trivial solutions of $\Delta_0 u + u^p = 0$ if $\nu > 2$ and $p > (\nu+2)/(\nu-2)$.

References
1. Ambrosetti, A., Prodi, G.: A Primer of Nonlinear Analysis. Cambridge: University Press, 1992
2. Ambrosetti, A., Rabinowitz, P.H.: Dual variational methods in critical point theory and applications. J. Funct. Anal. 14, 349–381 (1973)
3. Barlow, M.T., Bass, R.F.: Transition densities for Brownian motion on the Sierpiński carpet. Probab. Theory Related Fields 91, 307–330 (1992)
4. Barlow, M.T., Kigami, J.: Localized eigenfunctions of the Laplacian on p.c.f. self-similar sets. J. London Math. Soc. (2) 56, 320–332 (1997)
5. Biroli, M., Mosco, U.: Sobolev and isoperimetric inequalities for Dirichlet forms on discontinuous media. Rend. Mat. Acc. Lincei s. 9 6, 37–44 (1995)
6. Chow, S-N., Hale, J.K.: Methods of Bifurcation Theory. Berlin: Springer, 1982
7. Falconer, K.J.: Fractal Geometry – Mathematical Foundations and Applications. Chichester: John Wiley, 1992
8. Falconer, K.J.: Techniques in Fractal Geometry. Chichester: John Wiley, 1997
9. Hambly, B.M.: Brownian motion on a random recursive Sierpinski gasket. Ann. Probab. 25, 1059–1102 (1997)
10. Kigami, J.: In quest of fractal analysis. In: Yamaguti, M., Hata, M., Kigami, J. (eds.) Mathematics of Fractals, Providence, RI: American Mathematical Society, 1993, pp. 53–73
11. Kigami, J.: Effective resistances for harmonic structures on p.c.f. self-similar sets. Math. Proc. Cambridge Philos. Soc. 115, 291–303 (1994)
12. Kigami, J.: Harmonic calculus on p.c.f. self-similar sets. Trans. Am. Math. Soc. 335, 721–755 (1993)
13. Lions, P.L.: On the existence of positive solutions of semi-linear elliptic equations. SIAM Review 24, 441–467 (1982)
14. Kigami, J., Lapidus, M.L.: Weyl’s problem for the spectral distribution of Laplacians on p.c.f. self-similar sets. Commun. Math. Phys. 158, 93–125 (1993)
15. Kusuoka, S., Yin, Z.X.: Dirichlet forms on fractals: Poincaré constant and resistance. Probab. Theory Related Fields 93, 169–196 (1992)
16. Lapidus, M.L.: Analysis on fractals, Laplacians on self-similar sets, noncommutative geometry and spectral dimensions. Topol. Methods Nonlinear Anal. 4, 137–195 (1994)
17. Mosco, U.: Dirichlet forms and self-similarity. In: J. Jost et al. (eds.) New directions in Dirichlet forms. Cambridge: International Press, 1998
18. Mosco, U.: Lagrangian metrics on fractals. Proc. Symp. Appl. Math. 54, 301–323 (1998)
19. Pohozaev, S.: Eigenfunctions of the equation $\Delta u + \lambda f(u) = 0$. Soviet. Math. Dokl. 6, 1408–1411 (1965)
20. Posta, G.: Spectral asymptotics for variational fractals. Z. Anal. Anwendungen 17, 417–430 (1998)
21. Strichartz, R.S.: Some properties of Laplacians on fractals. J. Funct. Analysis, to appear
Communicated by J. L. Lebowitz
Commun. Math. Phys. 206, 247 – 264 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Projective Module Description of the q-Monopole
Piotr M. Hajac1, Shahn Majid2
1 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver St., Cambridge CB3 9EW, England and Department of Mathematical Methods in Physics, Warsaw University, ul. Hoża 74, Warsaw 00–682, Poland. E-mail: [email protected]
2 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver St., Cambridge CB3 9EW, England. E-mail: [email protected]
Received: 4 September 1998 / Accepted: 16 October 1998
Abstract: The Dirac q-monopole connection is used to compute projector matrices of quantum Hopf line bundles for arbitrary winding number. The Chern–Connes pairing of cyclic cohomology and K-theory is computed for the winding number −1. The nontriviality of this pairing is used to conclude that the quantum principal Hopf fibration is non-cleft. Among general results, we provide a left-right symmetric characterization of the canonical strong connections on quantum principal homogeneous spaces with an injective antipode. We also provide for arbitrary strong connections on algebraic quantum principal bundles (Hopf–Galois extensions) their associated covariant derivatives on projective modules.
Introduction

The goal of this paper is to provide a better understanding of the relationship between the quantum-group and K-theory approach to the noncommutative-geometry gauge theory. The latter approach is based on the classical Serre-Swan theorem that allows one to think of vector bundles as projective modules. The former comes from the concept of a Hopf–Galois extension which describes a quantum principal bundle the same way Hopf algebras describe quantum groups. Here a Hopf algebra H plays the role of the algebra of functions on the structure group, and the total space of a bundle is replaced by an H-comodule algebra P. We rely on the Hopf–Galois theory to derive our noncommutative-geometric constructions. On the other hand, it is the machinery of noncommutative geometry that allows us to obtain a Galois-theoretic result: We employ the Chern–Connes pairing to prove the non-cleftness of the Hopf–Galois extension of the algebraic quantum principal Hopf fibration. We begin in Sect. 1 with some preliminaries about Hopf–Galois extensions, connections and connection 1-forms on algebraic quantum principal bundles, and connections on projective modules. In Sect. 2 we extend the existing theory with some general results
about strong connections, their covariant derivatives on projective modules, and bicovariant splittings of canonical Hopf algebra surjections. We also discuss how to obtain projector matrices from splittings of the multiplication map. In Sect. 3 we first define (the space of sections of) a quantum Hopf line bundle as a bimodule associated to the quantum principal Hopf fibration via a one-dimensional corepresentation of the Hopf algebra $k[z, z^{-1}]$. Then we use a canonical strong connection on the quantum principal Hopf fibration (Dirac q-monopole) to compute, for any one-dimensional corepresentation, left and right projector matrices of the thus defined quantum Hopf line bundles. This computation is the main part of our paper and provides the projective-module characterization of the q-monopole. Further results relating to the Chern–Connes pairing are in Sect. 4. We end with the Appendix where we show that the only invertible elements of the coordinate ring of $SL_q(2)$ are non-zero numbers, and use it as an alternative way to conclude the non-cleftness of the quantum Hopf fibration.

To focus attention and take advantage of the cyclic cohomology results in [MNW91], we work over a ground field $k$ of characteristic zero, and assume that $q$ is a non-zero element in $k$ that is not a root of 1. We use the Sweedler notation $\Delta h = h_{(1)} \otimes h_{(2)}$ (summation understood) and its derivatives. The antipode of the Hopf algebra is a linear map $S : H \to H$, and the counit is an algebra map $\varepsilon : H \to k$ obeying certain properties. The convolution product of two linear maps from a coalgebra to an algebra is denoted in the following way: $(f * g)(c) := f(c_{(1)})\, g(c_{(2)})$. We use interchangeably the words “colinear” and “covariant” with respect to linear maps that preserve the comodule structure. For an introduction to noncommutative geometry, quantum groups, Hopf–Galois extensions and quantum-group gauge theory we refer to [C-A94,L-G97], [M-S95], [S-HJ94] and [BM93,BM98] respectively.
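The convolution product can be made concrete on the Hopf algebra of Laurent polynomials $k[z, z^{-1}]$. The sketch below assumes its usual structure, in which every monomial is group-like, so that $\Delta(z^n) = z^n \otimes z^n$, $\varepsilon(z^n) = 1$ and $S(z^n) = z^{-n}$. Laurent polynomials are stored as dictionaries from exponents to coefficients, and all function names are illustrative rather than taken from the paper.

```python
def multiply(a, b):
    """Product of two Laurent polynomials given as {exponent: coefficient}."""
    out = {}
    for m, x in a.items():
        for n, y in b.items():
            out[m + n] = out.get(m + n, 0) + x * y
    return {k: v for k, v in out.items() if v != 0}


def linear_extend(on_basis):
    """Extend a map defined on basis monomials z**n linearly to all polynomials."""
    def apply(poly):
        out = {}
        for n, c in poly.items():
            for k, v in on_basis(n).items():
                out[k] = out.get(k, 0) + c * v
        return {k: v for k, v in out.items() if v != 0}
    return apply


def convolve(f_basis, g_basis):
    """(f * g)(z**n) = f(z**n) g(z**n), because Delta(z**n) = z**n (x) z**n."""
    return lambda n: multiply(f_basis(n), g_basis(n))


identity = lambda n: {n: 1}      # id(z**n) = z**n
antipode = lambda n: {-n: 1}     # S(z**n) = z**(-n)
unit_counit = lambda n: {0: 1}   # (eta o epsilon)(z**n) = 1

# The antipode is the convolution inverse of the identity: id * S = eta o epsilon.
id_star_S = linear_extend(convolve(identity, antipode))
sample = {2: 3, -1: 4}           # 3 z^2 + 4 z^(-1)
```

Applying `id_star_S` to `sample` gives the constant polynomial 7, which is $\varepsilon$ of the sample times the unit, as the antipode axiom requires.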
1. Preliminaries

We begin by recalling basic definitions and known results.

Definition 1.1. Let $E$ be a left $B$-module, and $(\Omega(B), d)$ a differential algebra on $B$. A linear map $\nabla : \Omega^*(B) \otimes_B E \to \Omega^{*+1}(B) \otimes_B E$ is called a connection (covariant derivative) on $E$ iff $\forall\, \xi \in E,\ \lambda \in \Omega(B)$: $\nabla(\lambda \otimes_B \xi) = \lambda(\nabla \xi) + d\lambda \otimes_B \xi$.

In the case of the universal differential algebra the existence of a connection is equivalent to the projectivity of $E$ [CQ95, Corollary 8.2]; [L-G97, Proposition 8.2.3]. If $E$ is projective then a connection exists for any differential algebra because it can be obtained from the universal differential algebra and the canonical surjection onto a given differential algebra [C-A94, p. 555].

Definition 1.2. Let $H$ be a Hopf algebra, $P$ be a right $H$-comodule algebra with multiplication $m_P$ and coaction $\Delta_R$, and $B := P^{coH} := \{p \in P \mid \Delta_R\, p = p \otimes 1\}$ the subalgebra of coinvariants. We say that $P$ is a (right) $H$-Galois extension of $B$ iff the canonical left $P$-module right $H$-comodule map $\chi := (m_P \otimes \mathrm{id}) \circ (\mathrm{id} \otimes_B \Delta_R) : P \otimes_B P \to P \otimes H$ is bijective. We say that $P$ is a faithfully flat $H$-Galois extension of $B$ iff $P$ is faithfully flat as a right and left $B$-module. For a comprehensive review of the concept of faithful flatness see [B-N72].
Definition 1.3. An $H$-Galois extension is called cleft iff there exists a unital convolution invertible linear map $\Phi : H \to P$ satisfying $\Delta_R \circ \Phi = (\Phi \otimes \mathrm{id}) \circ \Delta$. We call $\Phi$ a cleaving map of $P$.

Note that, in general, $\Phi$ is not uniquely determined by its defining conditions. Observe also that the unitality assumption for the cleaving map is unnecessary in the sense that any right colinear convolution invertible mapping can be normalised to be unital. Indeed, let $\tilde\Phi$ be such a mapping, and $b := \tilde\Phi(1)$. By the colinearity, we have that $b \in B$, and the convolution invertibility entails that $b$ is invertible. Also, $b^{-1} \otimes 1 = b^{-1} \Delta_R(b b^{-1}) = b^{-1} b\, \Delta_R(b^{-1}) = \Delta_R(b^{-1})$. It is straightforward to check that $\Phi := b^{-1} \tilde\Phi$ is right colinear, convolution invertible and unital. Let us also remark that a cleaving map is necessarily injective:
\[ (m_P \circ (m_P \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Phi^{-1} \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta) \circ \Delta_R \circ \Phi)(h) = \Phi(h_{(1)})\, \Phi^{-1}(h_{(2)})\, h_{(3)} = h, \quad \forall\, h \in H. \]
To fix convention, let us recall that the universal differential calculus (grade one of the universal differential algebra) can be defined as the kernel of the multiplication map, $\Omega^1 B := \mathrm{Ker}(m : B \otimes B \to B)$, with the differential $db := 1 \otimes b - b \otimes 1$ (e.g., see [L-G97, Sect. 7.1]). (We abuse the notation and use the same letter $d$ to signify both the universal and general differential.) The following are the universal-differential-calculus versions of more general definitions in [BM93,H-PM96]:

Definition 1.4 ([BM93]). Let $B \subseteq P$ be an $H$-Galois extension. Denote by $\Omega^1 P$ the universal differential calculus on $P$. A left $P$-module projection $\Pi$ on $\Omega^1 P$ is called a connection on a quantum principal bundle iff
1. $\mathrm{Ker}\, \Pi = P(\Omega^1 B)P$ (horizontal forms),
2. $\Delta_R \circ \Pi = (\Pi \otimes \mathrm{id}) \circ \Delta_R$ (right covariance).
Here $\Delta_R$ is the right coaction on differential forms given by the formula $\Delta_R(a\, da') := a_{(0)}\, da'_{(0)} \otimes a_{(1)} a'_{(1)}$, where $\Delta_R\, a := a_{(0)} \otimes a_{(1)}$ (summation understood). Coaction on higher order forms is defined in the same manner.

Definition 1.5 ([BM93]). Let $P$, $H$, $B$ and $\Omega^1 P$ be as above. A $k$-homomorphism $\omega : H \to \Omega^1 P$ such that $\omega(1) = 0$ is called a connection form iff it satisfies the following properties:
1. $(m_P \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta_R) \circ \omega = 1 \otimes (\mathrm{id} - \varepsilon)$ (fundamental vector field condition),
2. $\Delta_R \circ \omega = (\omega \otimes \mathrm{id}) \circ \mathrm{ad}_R$, $\mathrm{ad}_R(h) := h_{(2)} \otimes S(h_{(1)}) h_{(3)}$ (right adjoint covariance).

For every Hopf–Galois extension there is a one-to-one correspondence between connections and connection forms (see [M-S97, Proposition 2.1]). In particular, the connection $\Pi^\omega$ associated to a connection form $\omega$ is given by the formula:
\[ \Pi^\omega(dp) = p_{(0)}\, \omega(p_{(1)}). \tag{1.1} \]
$\Pi^\omega$ is a left $P$-module homomorphism, so that it suffices to know its values on exact forms.

Definition 1.6 ([H-PM96]). Let $\Pi$ be a connection in the sense of Definition 1.4. It is called strong iff $(\mathrm{id} - \Pi)(dP) \subseteq (\Omega^1 B)P$. We say that a connection form is strong iff its associated connection is strong.
A natural next step is to consider associated quantum vector bundles. More precisely, what we need here is a replacement of the module of sections of an associated vector bundle. In the classical case such sections can be equivalently described as “functions of type $\rho$” from the total space of a principal bundle to a vector space. We follow this construction in the quantum case by considering $B$-bimodules of colinear maps $\mathrm{Hom}_\rho(V, P)$ associated with an $H$-Galois extension $B \subseteq P$ via a corepresentation $\rho : V \to V \otimes H$ (see [D-M96]). For our later purpose, we need the following reformulation of [BM93, Prop. A.7]:

Lemma 1.7. Let $B \subseteq P$ be a cleft $H$-Galois extension and $\rho : V \to V \otimes H$ a right corepresentation of $H$ on $V$. Then the space of colinear maps $\mathrm{Hom}_\rho(V, P)$ is isomorphic as a left $B$-module to the free module $\mathrm{Hom}(V, B)$.

2. Strong Connections on Associated Projective Modules

First we study a general setting for translating strong connections on algebraic quantum principal bundles to connections on projective modules. The associated bimodule of colinear maps is finitely generated projective as a left module over the subalgebra of coinvariants under rather unrestrictive assumptions. However, we do not assume the projectivity of this module in the following two propositions, as it is needed only later to ensure the existence of a connection. Also, although we work only with the universal differential algebra in the sequel, we do not assume here that the differential algebra is universal. It suffices that it is right-covariant, i.e., the right coaction is well-defined on differential forms, and right-covariant and right-flat in the second proposition. On the other hand, we do not aim here at the utmost generality but try to keep our noncommutative-geometric motivation evident.

Proposition 2.1. Let $H$ be a Hopf algebra with a bijective antipode, $P$ a faithfully flat $H$-Galois extension of $B$, and $\rho : V \to V \otimes H$ ($\dim V < \infty$) a coaction. Denote by $\mathrm{Hom}_\rho(V, P)$ the $B$-bimodule of colinear homomorphisms from $V$ to $P$, and choose a right-covariant differential algebra $\Omega(P)$. Then the following map
\[ \check\ell : \Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P) \to \mathrm{Hom}_\rho(V, \Omega(B)P), \quad (\check\ell(\lambda \otimes_B \varphi))(v) = \lambda \varphi(v), \]
is an isomorphism of graded left $\Omega(B)$-modules.

Proof. It suffices to show that $\check\ell$ has an inverse. By choosing a linear basis $\{\lambda_\mu\}$ of $\Omega(B)$, for any $\varphi \in \mathrm{Hom}_\rho(V, \Omega(B)P)$ we can write $\varphi(v) = \sum_\mu \lambda_\mu \varphi^\mu(v)$. The point is now to show that we can always choose each $\varphi^\mu$ to be an element of $\mathrm{Hom}_\rho(V, P)$. It can be done by assuming flatness of $\Omega(B)$ (see Proposition 2.3), or by employing our assumptions on the Hopf–Galois extension.

Lemma 2.2. Under the assumptions of Proposition 2.1, for any $\varphi \in \mathrm{Hom}_\rho(V, \Omega(B)P)$ there exist colinear homomorphisms $\tilde\varphi^\mu \in \mathrm{Hom}_\rho(V, P)$ such that $\varphi(v) = \sum_\mu \lambda_\mu \tilde\varphi^\mu(v)$, $\forall\, v \in V$.

Proof. By assumption, we have
\[ ((\Delta_R \circ \varphi) \otimes \mathrm{id})(v_{(0)} \otimes v_{(1)}) = (((\varphi \otimes \mathrm{id}) \circ \rho) \otimes \mathrm{id})(v_{(0)} \otimes v_{(1)}), \]
i.e.,
\[ \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})_{(0)} \otimes \varphi^\mu(v_{(0)})_{(1)} \otimes v_{(1)} = \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)}) \otimes v_{(1)} \otimes v_{(2)}. \]
Taking advantage of the faithful flatness of $P$, Theorem I in [S-HJ90] and (1.6) in [D-Y85] (Remark 3.3 in [S-HJ90]), we know that there exists a unital colinear map $j : H \to P$. Applying $m_{\Omega(P)} \circ (\mathrm{id} \otimes (j \circ m)) \circ (\mathrm{id} \otimes S \otimes \mathrm{id})$, where $m_{\Omega(P)}$ and $m$ are appropriate multiplication maps, to both sides of the above equality, we get
\[ \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})_{(0)}\, j(S(\varphi^\mu(v_{(0)})_{(1)})\, v_{(1)}) = \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})\, j(S(v_{(1)})\, v_{(2)}). \]
Hence, by the unitality of $j$, we obtain
\[ \varphi(v) = \sum_\mu \lambda_\mu\, \varphi^\mu(v_{(0)})_{(0)}\, j(S(\varphi^\mu(v_{(0)})_{(1)})\, v_{(1)}). \]
On the other hand, using the colinearity of $j$ it is straightforward to verify that each of the maps $\tilde\varphi^\mu : v \mapsto \varphi^\mu(v_{(0)})_{(0)}\, j(S(\varphi^\mu(v_{(0)})_{(1)})\, v_{(1)})$ is colinear. □

The next step is to take advantage of the existence of the translation map $\tau : H \to P \otimes_B P$, $\tau(h) := \chi^{-1}(1 \otimes h)$ (see Definition 1.2), and define an auxiliary isomorphism $f : \Omega(B)P \to \Omega(B) \otimes_B P$, $f := (m \otimes_B \mathrm{id}) \circ (\mathrm{id} \otimes \tau) \circ \Delta_R$. From the definition of the translation map it follows that
\[ f(\lambda p) = \lambda p_{(0)} \tau(p_{(1)}) = \lambda p_{(0)} \chi^{-1}(1 \otimes p_{(1)}) = \lambda \chi^{-1}(p_{(0)} \otimes p_{(1)}) = \lambda \chi^{-1}(\chi(1 \otimes_B p)) = \lambda \otimes_B p. \]
(Note that $f$ is the inverse of the multiplication map.) Moreover, let $I$ be the restriction to $\mathrm{Hom}_\rho(V, \Omega(B)P)$ of the canonical isomorphism from $\mathrm{Hom}(V, \Omega(B)P)$ to $\Omega(B)P \otimes V^*$. Then we have a well-defined map
\[ \hat\ell := (\mathrm{id} \otimes_B I^{-1}) \circ (f \otimes \mathrm{id}) \circ I : \mathrm{Hom}_\rho(V, \Omega(B)P) \to \Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P), \]
\[ \hat\ell(\varphi) = ((\mathrm{id} \otimes_B I^{-1}) \circ (f \otimes \mathrm{id})) \Big( \sum_i \sum_\mu \lambda_\mu \tilde\varphi^\mu(e_i) \otimes e^i \Big) = \sum_\mu \lambda_\mu \otimes_B \tilde\varphi^\mu, \]
where $\{e_i\}$ is a basis of $V$, $\{e^i\}$ its dual, and (by the above lemma) we choose $\tilde\varphi^\mu \in \mathrm{Hom}_\rho(V, P)$ such that $\varphi(v) = \sum_\mu \lambda_\mu \tilde\varphi^\mu(v)$. It is straightforward to check that $\hat\ell = \check\ell^{-1}$, as desired. □

Proposition 2.3. Let $H$ be a Hopf algebra and $P \supseteq B$ an $H$-Galois extension. Let $\check\ell$ be the map defined in Proposition 2.1. Then if $\Omega(B)$ is flat as a right $B$-module, $\check\ell$ is an isomorphism of graded left $\Omega(B)$-modules.
Proof. Let $\check\rho : \mathrm{Hom}(V, \Omega(B)P) \to \mathrm{Hom}(V, \Omega(B)P \otimes H)$ be a left $\Omega(B)$-linear homomorphism defined by the formula $\check\rho(\varphi)(v) = \varphi(v_{(0)}) \otimes v_{(1)} - \varphi(v)_{(0)} \otimes \varphi(v)_{(1)}$, and let $\tilde\rho$ denote its restriction to $\mathrm{Hom}(V, P)$. Evidently, we have $\mathrm{Ker}\, \check\rho = \mathrm{Hom}_\rho(V, \Omega(B)P)$ and $\mathrm{Ker}\, \tilde\rho = \mathrm{Hom}_\rho(V, P)$. Moreover, since $\Omega(B)$ is flat as a right $B$-module, we have the following commutative diagram with exact rows of left $\Omega(B)$-modules:
\[
\begin{array}{ccccccc}
0 & \to & \Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P) & \to & \Omega(B) \otimes_B \mathrm{Hom}(V, P) & \xrightarrow{\ \mathrm{id} \otimes_B \tilde\rho\ } & \Omega(B) \otimes_B \mathrm{Hom}(V, P \otimes H) \\
 & & \downarrow \check\ell & & \downarrow \ell & & \downarrow \tilde\ell \\
0 & \to & \mathrm{Hom}_\rho(V, \Omega(B)P) & \to & \mathrm{Hom}(V, \Omega(B)P) & \xrightarrow{\ \check\rho\ } & \mathrm{Hom}(V, \Omega(B)P \otimes H).
\end{array} \tag{2.1}
\]
Here $\ell$ is defined by the formula $\ell(\sum_\mu \lambda_\mu \otimes_B \varphi^\mu)(v) = \sum_\mu \lambda_\mu \varphi^\mu(v)$, and $\tilde\ell$ is given the same way. With the help of the translation map $\tau : H \to P \otimes_B P$, reasoning as in the proof of the preceding proposition, one can show that $\ell$ and $\tilde\ell$ are isomorphisms. By standard diagram chasing (or completing the left hand side of (2.1) with zeros and invoking the Five Isomorphism Lemma), one can conclude from the diagram (2.1) that $\check\ell$ is also an isomorphism. □

If $\omega$ is a strong connection form, then $(\mathrm{id} - \Pi^\omega) \circ d \circ \mathrm{Hom}_\rho(V, P) \subseteq \mathrm{Hom}_\rho(V, \Omega^1(B)P)$. Assuming also that the conditions allowing us to utilise one of the above propositions are fulfilled, we can define the covariant derivative associated to $\omega$ in the following way:
\[ \nabla^\omega : \mathrm{Hom}_\rho(V, P) \to \Omega^1(B) \otimes_B \mathrm{Hom}_\rho(V, P), \quad \nabla^\omega \xi := \check\ell^{-1}((\mathrm{id} - \Pi^\omega) \circ d \circ \xi). \tag{2.2} \]
One can check that $\nabla^\omega$ satisfies the Leibniz rule $\nabla^\omega(b\xi) = b \nabla^\omega \xi + db \otimes_B \xi$. Hence $\nabla^\omega$ can be extended (by the Leibniz rule) to an endomorphism of $\Omega(B) \otimes_B \mathrm{Hom}_\rho(V, P)$ which is of degree 1 with respect to the grading of $\Omega(B)$.

Our second group of results concerns the canonical connection on a quantum principal homogeneous space (principal homogeneous $H$-Galois extension), which is the general construction behind the Dirac q-monopole. A principal homogeneous $H$-Galois extension $B \subseteq P$ is a Hopf–Galois extension obtained from a surjective Hopf algebra map $\pi : P \to H$ which defines the right comodule structure by the formula $\Delta_R := (\mathrm{id} \otimes \pi) \circ \Delta$. We know from the proof of [BM93, Proposition 5.3] that if $B \subseteq P$ is a principal homogeneous $H$-Galois extension, and $i : H \to P$ is a linear unital map such that $\pi \circ i = \mathrm{id}$ (splitting of $\pi$) and
\[ (\mathrm{id} \otimes \pi) \circ \mathrm{ad}_R \circ i = (i \otimes \mathrm{id}) \circ \mathrm{ad}_R, \tag{2.3} \]
then $\omega := (S * d)\circ i$ is a connection form in the sense of Definition 1.5. (Note that since $i$ is a splitting of a Hopf algebra map, it is counital: $\varepsilon_H = \varepsilon_H\circ\pi\circ i = \varepsilon_P\circ i$.) We call the thus constructed connection the canonical connection (form) associated to splitting $i$. (In what follows, we skip writing "form" for the sake of brevity.) The next step is towards a left-right symmetric characterization of strong canonical connections.

Proposition 2.4. The canonical connection associated to splitting $i : H\to P$ satisfying the above conditions is strong if and only if the splitting $i$ obeys in addition the right covariance condition $(i\otimes\mathrm{id})\circ\Delta = \Delta_R\circ i$.
Proof. First we need to reduce the strongness condition for the canonical connection to a simpler form:

Lemma 2.5. The canonical connection $\omega$ associated to $i : H\to P$ is strong if and only if
\[ i(h_{(2)})_{(2)}\otimes h_{(1)}\,S\pi\big(i(h_{(2)})_{(1)}\big) = i(h)\otimes 1,\qquad \forall h\in H. \tag{2.4} \]

Proof. To simplify the notation, let us put $\pi(p)=\bar p$. Also, let $\Pi^\omega$ denote the connection associated to $\omega$, i.e., $\Pi^\omega(dp) = p_{(1)}\,\omega(\bar p_{(2)})$. (We take advantage of the fact that $\Delta_R = (\mathrm{id}\otimes\pi)\circ\Delta$, see (1.1).) Using the Leibniz rule we obtain:
\begin{align*}
(\mathrm{id}-\Pi^\omega)(dp) &= d\big(p_{(1)}S(i(\bar p_{(2)})_{(1)})\,i(\bar p_{(2)})_{(2)}\big) - p_{(1)}S(i(\bar p_{(2)})_{(1)})\,d\big(i(\bar p_{(2)})_{(2)}\big)\\
&= d\big(p_{(1)}S(i(\bar p_{(2)})_{(1)})\big)\,i(\bar p_{(2)})_{(2)}\\
&= 1\otimes p - p_{(1)}S(i(\bar p_{(2)})_{(1)})\otimes i(\bar p_{(2)})_{(2)}.
\end{align*}
On the other hand, applying $\Delta_R\otimes\mathrm{id}$ to $p_{(1)}S(i(\bar p_{(2)})_{(1)})\otimes i(\bar p_{(2)})_{(2)}$ yields
\[ p_{(1)}S(i(\bar p_{(3)})_{(2)})\otimes\bar p_{(2)}\,S\,\overline{i(\bar p_{(3)})_{(1)}}\otimes i(\bar p_{(3)})_{(3)}. \]
Remembering that $(\Omega^1 B)P\subseteq B\otimes P$, we conclude that the strongness condition (see Definition 1.6, cf. [M-S97, (11)]) of the canonical connection is equivalent to
\[ p_{(1)}S(i(\bar p_{(3)})_{(2)})\otimes\bar p_{(2)}\,S\,\overline{i(\bar p_{(3)})_{(1)}}\otimes i(\bar p_{(3)})_{(3)} = p_{(1)}S(i(\bar p_{(2)})_{(1)})\otimes 1\otimes i(\bar p_{(2)})_{(2)}. \]
The above equation is of the form $(\mathrm{id}*f_1)(p) = (\mathrm{id}*f_2)(p)$. Since the antipode $S$ is the convolution inverse of $\mathrm{id}$, it is equivalent to $f_1(p)=f_2(p)$. Therefore we can cancel the $p_{(1)}$ product from both sides. Also, since $\pi$ is surjective and a coalgebra map, we can replace $\pi(p)$ by a general element $h\in H$. Thus we arrive at
\[ S(i(h_{(2)})_{(2)})\otimes h_{(1)}\,S\,\overline{i(h_{(2)})_{(1)}}\otimes i(h_{(2)})_{(3)} = S(i(h)_{(1)})\otimes 1\otimes i(h)_{(2)}. \]
Moreover, for any Hopf algebra the map $(S\otimes\mathrm{id})\circ\Delta$ is injective (apply $\varepsilon\otimes\mathrm{id}$). Consequently, the strongness is equivalent to the condition $i(h_{(2)})_{(2)}\otimes h_{(1)}\,S\,\overline{i(h_{(2)})_{(1)}} = i(h)\otimes 1$, $\forall h\in H$, as claimed. $\square$

Note now that we can write the adjoint covariance of $i$, in an explicit manner, as
\[ i(h)_{(2)}\otimes S\big(\overline{i(h)_{(1)}}\big)\,\overline{i(h)_{(3)}} = i(h_{(2)})\otimes(Sh_{(1)})h_{(3)},\qquad\forall h\in H. \]
In this case
\[ i(h_{(1)})\otimes h_{(2)} = i(h_{(3)})\otimes h_{(1)}S(h_{(2)})h_{(4)} = (1\otimes h_{(1)})\big((i\otimes\mathrm{id})\circ\mathrm{ad}_R\big)(h_{(2)}) = i(h_{(2)})_{(2)}\otimes h_{(1)}\,S\big(\overline{i(h_{(2)})_{(1)}}\big)\,\overline{i(h_{(2)})_{(3)}}. \tag{2.5} \]
Assume that $\omega$ is strong. Hence, by the above lemma, the strongness condition implies that $i(h_{(1)})\otimes h_{(2)} = i(h)_{(1)}\otimes\overline{i(h)_{(2)}}$ as required. Conversely, using the right covariance of $i$ for the first step and (2.5) for the second, we compute the left hand side of (2.4) as
\begin{align*}
i(h_{(2)})_{(2)}\otimes h_{(1)}\,S\big(\overline{i(h_{(2)})_{(1)}}\big)\,\overline{i(h_{(2)})_{(3)}}\,S\big(\overline{i(h_{(2)})_{(4)}}\big)
&= i(h_{(2)})_{(2)}\otimes h_{(1)}\,S\big(\overline{i(h_{(2)})_{(1)}}\big)\,\overline{i(h_{(2)})_{(3)}}\,Sh_{(3)}\\
&= i(h_{(3)})\otimes h_{(1)}S(h_{(2)})h_{(4)}S(h_{(5)}) = i(h)\otimes 1.
\end{align*}
Hence the canonical connection is strong by Lemma 2.5. $\square$

Corollary 2.6. Assume that the antipode $S$ is injective. Then strong canonical connections are in 1-1 correspondence with linear unital splittings of $\pi$ obeying the two conditions
\[ (i\otimes\mathrm{id})\circ\Delta = \Delta_R\circ i,\qquad (\mathrm{id}\otimes i)\circ\Delta = \Delta_L\circ i, \]
where $\Delta_R = (\mathrm{id}\otimes\pi)\circ\Delta$, $\Delta_L = (\pi\otimes\mathrm{id})\circ\Delta$.

Proof. Assume first that the canonical connection associated to $i$ is strong. Then, by the preceding proposition, $i$ is right covariant and (2.5) holds. Hence
\[ i(h_{(1)})_{(2)}\otimes S\big(\overline{i(h_{(1)})_{(1)}}\big)h_{(2)} = i(h)_{(1)(2)}\otimes S\big(\overline{i(h)_{(1)(1)}}\big)\,\overline{i(h)_{(2)}} = i(h)_{(2)}\otimes S\big(\overline{i(h)_{(1)}}\big)\,\overline{i(h)_{(3)}} = i(h_{(2)})\otimes(Sh_{(1)})h_{(3)}. \]
Reasoning as in the proof of Lemma 2.5, we can cancel $h_{(2)}$ and $h_{(3)}$ from the two sides. Then cancelling $S$ from both sides (we assume $S$ to be injective), we have $i(h)_{(2)}\otimes\overline{i(h)_{(1)}} = i(h_{(2)})\otimes h_{(1)}$, which is the left covariance condition. Conversely, if the left and right covariance conditions hold then
\[ i(h_{(2)})\otimes(Sh_{(1)})h_{(3)} = i(h_{(2)(1)})\otimes(Sh_{(1)})h_{(2)(2)} = i(h_{(2)})_{(1)}\otimes(Sh_{(1)})\,\overline{i(h_{(2)})_{(2)}} = i(h)_{(2)(1)}\otimes S\big(\overline{i(h)_{(1)}}\big)\,\overline{i(h)_{(2)(2)}}, \]
which is the same as (2.5). Invoking again the preceding proposition, we can conclude that the canonical connection associated to $i$ is strong as required. $\square$

Remark 2.7. Let $\pi : P\to H$ be a Hopf algebra surjection. If a linear map $i : H\to P$ is counital and left or right colinear, then $i$ is a splitting of $\pi$, i.e., $\pi\circ i = \mathrm{id}$. Indeed, if $i$ is right colinear ($i(h)_{(1)}\otimes\pi(i(h)_{(2)}) = i(h_{(1)})\otimes h_{(2)}$), we have:
\[ (\pi\circ i)(h) = \varepsilon\big((\pi\circ i)(h)_{(1)}\big)(\pi\circ i)(h)_{(2)} = \varepsilon\big(\pi(i(h)_{(1)})\big)\pi(i(h)_{(2)}) = \varepsilon\big(\pi(i(h_{(1)}))\big)h_{(2)} = h. \]
The left-sided case is analogous.
We end this section by showing how to obtain a projector matrix (explicit embedding of a projective module in a free module) from the canonical strong connection. It is known [DGH] that strong connection forms on $P$ are equivalent to unital left $B$-linear right $H$-colinear splittings of the multiplication map $m : B\otimes P\to P$. Explicitly, if $\omega$ is a strong connection form, then
\[ s : P\longrightarrow B\otimes P,\qquad s(p) = p\otimes 1 + p_{(0)}\,\omega(p_{(1)}) \tag{2.6} \]
gives the desired splitting. (Solving this equation for $\omega$ one gets $\omega(h) = h^{[1]}s(h^{[2]}) - 1\otimes\varepsilon(h)$, where $h^{[1]}\otimes_B h^{[2]} = \chi^{-1}(1\otimes h)$, summation understood, see Definition 1.2.) In particular, for the canonical strong connection associated to a bicovariant splitting $i$ (i.e., $\omega = (S*d)\circ i$), we have:
\[ s(p) = p_{(1)}\,S\big(i(\pi(p_{(2)}))_{(1)}\big)\otimes i(\pi(p_{(2)}))_{(2)}. \tag{2.7} \]
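As a quick consistency check, one can verify directly that formula (2.7) splits the multiplication map. The sketch below assumes only what is stated above: $i$ is counital, $\pi$ is a Hopf algebra map (so $\varepsilon_H\circ\pi=\varepsilon_P$), and $\pi$ is written explicitly wherever the coaction $\Delta_R=(\mathrm{id}\otimes\pi)\circ\Delta$ enters.

```latex
\begin{align*}
(m\circ s)(p)
  &= p_{(1)}\,S\big(i(\pi(p_{(2)}))_{(1)}\big)\,i(\pi(p_{(2)}))_{(2)} \\
  &= p_{(1)}\,\varepsilon_P\big(i(\pi(p_{(2)}))\big)
     && \text{antipode axiom } S(x_{(1)})\,x_{(2)}=\varepsilon_P(x) \\
  &= p_{(1)}\,\varepsilon_H\big(\pi(p_{(2)})\big)
     && i \text{ is counital} \\
  &= p_{(1)}\,\varepsilon_P(p_{(2)}) = p
     && \varepsilon_H\circ\pi=\varepsilon_P .
\end{align*}
```

The left $B$-linearity and right $H$-colinearity of $s$ are not checked here; they are the content of the equivalence quoted from [DGH].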
Note that a splitting of the multiplication map is almost the same as a projector matrix, for it is an embedding of $P$ in the free $B$-module $B\otimes P$. (We will use formula (2.7) in the next section to compute projector matrices of quantum Hopf line bundles from the Dirac q-monopole connection.) To turn (2.7) into a concrete recipe for producing finite size projector matrices of finitely generated projective modules, let us note the following general lemma:

Lemma 2.8. Let $A$ be an algebra and $M$ a projective left $A$-module generated by linearly independent generators $g_1,\dots,g_n$. Also, let $\{\tilde g_\mu\}_{\mu\in I}$ be a completion of $\{g_1,\dots,g_n\}$ to a linear basis of $M$, $f_2$ be a left $A$-linear splitting of the multiplication map $m : A\otimes M\to M$ given by the formula $f_2(g_k) = \sum_{l=1}^n a_{kl}\otimes g_l + \sum_{\mu\in I}a_{k\mu}\otimes\tilde g_\mu$, and $c_{\mu l}\in A$ a choice of coefficients such that $\tilde g_\mu = \sum_{l=1}^n c_{\mu l}g_l$. Then $e_{kl} = a_{kl} + \sum_{\mu\in I}a_{k\mu}c_{\mu l}$ defines a projector matrix of $M$, i.e., $e\in M_n(A)$, $e^2=e$ and $A^ne$ and $M$ are isomorphic as left $A$-modules.

Proof. Note first that we do not lose any generality by assuming $g_1,\dots,g_n$ to be linearly independent (we can always remove generators that are linear combinations of other generators), and that a splitting of the multiplication map always exists by the projectivity assumption (cf. [CQ95, Sect. 8]). Let $N$ be the kernel of the surjection $f_1 : A^n\to M = A^n/N$, $f_1(e_k) = g_k$, $k\in\{1,\dots,n\}$, where $\{e_k\}_{k\in\{1,\dots,n\}}$ is the standard basis of $A^n$, i.e., $e_k$ is the row with zeros everywhere except for the $k$th place where there is 1. We have the following commutative diagram of left $A$-module homomorphisms whose rows are exact:
\[
\begin{array}{ccccccccc}
0 & \longrightarrow & A\otimes N & \longrightarrow & A\otimes A^n & \underset{f_3}{\overset{\mathrm{id}\otimes f_1}{\rightleftarrows}} & A\otimes M & \longrightarrow & 0\\
 & & & & \big\downarrow{\scriptstyle f_4} & & {\scriptstyle f_2}\big\uparrow\ \big\downarrow{\scriptstyle m} & & \\
0 & \longrightarrow & N & \longrightarrow & A^n & \overset{f_1}{\longrightarrow} & M & \longrightarrow & 0.
\end{array}
\tag{2.8}
\]
Here $f_2$ is a splitting of the multiplication map ($m\circ f_2 = \mathrm{id}$), $f_3$ a splitting of $\mathrm{id}\otimes f_1$ (which exists because $A\otimes M$ is free), and $f_4$ is the multiplication map on $A\otimes A^n$. From the commutativity of the diagram we can infer that $f_4\circ f_3\circ f_2$ is a splitting of $f_1$:
\[ f_1\circ f_4\circ f_3\circ f_2 = m\circ(\mathrm{id}\otimes f_1)\circ f_3\circ f_2 = \mathrm{id}. \]
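Hence $f_e := f_4\circ f_3\circ f_2\circ f_1$ is an idempotent ($f_e^2 = f_e$) and $f_e(A^n)$ is isomorphic to $M$, as needed. To compute a matrix of $f_e$, we choose a splitting $f_3$ so that $f_3(1\otimes g_k) = 1\otimes e_k$, $f_3(1\otimes\tilde g_\mu) = 1\otimes\sum_{l=1}^n c_{\mu l}e_l$, $\sum_{l=1}^n c_{\mu l}g_l = \tilde g_\mu$, $k\in\{1,\dots,n\}$, $\mu\in I$. Then
\begin{align*}
f_e(e_k) &= (f_4\circ f_3\circ f_2)(g_k)\\
&= (f_4\circ f_3)\Big(\sum_{l=1}^n a_{kl}\otimes g_l + \sum_{\mu\in I}a_{k\mu}\otimes\tilde g_\mu\Big)\\
&= f_4\Big(\sum_{l=1}^n a_{kl}\otimes e_l + \sum_{\mu\in I}a_{k\mu}\otimes\sum_{l=1}^n c_{\mu l}e_l\Big)\\
&= \sum_{l=1}^n\Big(a_{kl} + \sum_{\mu\in I}a_{k\mu}c_{\mu l}\Big)e_l.
\end{align*}
This means that $\big(a_{kl} + \sum_{\mu\in I}a_{k\mu}c_{\mu l}\big)_{k,l\in\{1,\dots,n\}}$ is a projector matrix of $M$, as claimed. $\square$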
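The idempotency of $f_e$ used in the proof of Lemma 2.8 follows in one line from the splitting property. In the sketch below the shorthand $\sigma := f_4\circ f_3\circ f_2$ is introduced only for this illustration and does not appear in the original text.

```latex
% \sigma := f_4 \circ f_3 \circ f_2 (hypothetical shorthand), so f_1 \circ \sigma = id_M
\begin{align*}
f_e^2 &= (\sigma\circ f_1)\circ(\sigma\circ f_1)
       = \sigma\circ(f_1\circ\sigma)\circ f_1
       = \sigma\circ f_1 = f_e, \\
f_e(A^n) &= \sigma\big(f_1(A^n)\big) = \sigma(M)\;\cong\; M
  && \text{since } f_1 \text{ is onto and } \sigma \text{ is injective.}
\end{align*}
```

Injectivity of $\sigma$ is immediate from $f_1\circ\sigma=\mathrm{id}$, so the image of $f_e$ is a copy of $M$ inside $A^n$.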
Observe that if $a_{k\mu} = 0$ for all $k$ and $\mu$, the matrix elements of $e$ are simply $a_{kl}$, and can be directly read off from the formula for the splitting $f_2$ written in terms of the module generators $g_1,\dots,g_n$. By a completely analogous reasoning, the same kind of lemma is true for right modules.

3. Projective Module Form of the Dirac q-Monopole

Recall that $A(SL_q(2))$ is a Hopf algebra over a field $k$ generated by $1,\alpha,\beta,\gamma,\delta$, satisfying the following relations:
\[
\begin{gathered}
\alpha\beta = q^{-1}\beta\alpha,\quad \alpha\gamma = q^{-1}\gamma\alpha,\quad \beta\delta = q^{-1}\delta\beta,\quad \beta\gamma = \gamma\beta,\quad \gamma\delta = q^{-1}\delta\gamma,\\
\alpha\delta - \delta\alpha = (q^{-1}-q)\beta\gamma,\qquad \alpha\delta - q^{-1}\beta\gamma = \delta\alpha - q\beta\gamma = 1,
\end{gathered}
\tag{3.1}
\]
where $q\in k\setminus\{0\}$. The comultiplication $\Delta$, counit $\varepsilon$, and antipode $S$ of $A(SL_q(2))$ are defined by the following formulas:
\[
\Delta\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}\alpha\otimes 1&\beta\otimes 1\\ \gamma\otimes 1&\delta\otimes 1\end{pmatrix}\begin{pmatrix}1\otimes\alpha&1\otimes\beta\\ 1\otimes\gamma&1\otimes\delta\end{pmatrix},\quad
\varepsilon\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}1&0\\ 0&1\end{pmatrix},\quad
S\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}\delta&-q\beta\\ -q^{-1}\gamma&\alpha\end{pmatrix}.
\]
Now we need to recall the construction of the standard quantum sphere of Podleś and the quantum principal Hopf fibration. The standard quantum sphere is singled out among the principal series of Podleś quantum spheres by the property that it can be constructed as a quantum quotient space [P-P87]. In algebraic terms it means that its coordinate ring can be obtained as the subalgebra of coinvariants of a comodule algebra. To carry out this construction, first we need the right coaction on $A(SL_q(2))$ of the commutative and cocommutative Hopf algebra $k[z,z^{-1}]$ generated by the grouplike element $z$ and its inverse. This Hopf algebra can be obtained as the quotient of $A(SL_q(2))$ by the Hopf ideal generated by the off-diagonal generators $\beta$ and $\gamma$. Identifying the image of $\alpha$ and $\delta$ under the Hopf algebra surjection $\pi : A(SL_q(2))\to k[z,z^{-1}]$ with $z$ and $z^{-1}$ respectively, we can describe the right coaction $\Delta_R := (\mathrm{id}\otimes\pi)\circ\Delta$ by the formula:
\[
\Delta_R\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}\alpha\otimes z&\beta\otimes z^{-1}\\ \gamma\otimes z&\delta\otimes z^{-1}\end{pmatrix}.
\]
We call the subalgebra of coinvariants defined by this coaction the coordinate ring of the (standard) quantum sphere, and denote it by $A(S_q^2)$. Since $k[z,z^{-1}] = A(SL_q(2))/(A(S_q^2)\cap\mathrm{Ker}\,\varepsilon)A(SL_q(2))$ by Remark 3.4, we know from the general argument that $A(S_q^2)\subseteq A(SL_q(2))$ is a principal homogeneous $k[z,z^{-1}]$-Galois extension. (If $P$ is a Hopf algebra, $I$ a Hopf ideal, $B$ the subalgebra of coinvariants under the coaction $\Delta_R = (\mathrm{id}\otimes\pi)\circ\Delta$, $\pi : P\to P/I$, and $I = (B\cap\mathrm{Ker}\,\varepsilon)P$, then we can define the inverse of the canonical map by $\chi^{-1}(p'\otimes\pi(p)) = p'\,Sp_{(1)}\otimes_B p_{(2)}$.) We refer to the quantum principal bundle given by this Hopf–Galois extension as the quantum principal Hopf fibration. (An $SO_q(3)$ version of this quantum fibration was introduced in [BM93].) The main point of this section is to compute projector matrices of quantum Hopf line bundles associated to the just described Hopf q-fibration.

Definition 3.1. Let $\rho_n : k\to k\otimes k[z,z^{-1}]$, $\rho_n(1) = 1\otimes z^{-n}$, $n\in\mathbb{Z}$, be a one-dimensional corepresentation of $k[z,z^{-1}]$. We call the $A(S_q^2)$-bimodule of colinear maps $\mathrm{Hom}_{\rho_n}(k,A(SL_q(2)))$ the (bimodule of) quantum Hopf line bundle of winding number $n$.

Since we deal here with one-dimensional corepresentations, we identify colinear maps with their value at 1. We have
\[ \mathrm{Hom}_{\rho_n}(k,A(SL_q(2)))\cong\{p\in A(SL_q(2))\mid\Delta_R\,p = p\otimes z^{-n}\} =: P_n \]
as $A(S_q^2)$-bimodules. With the help of the PBW basis $\alpha^k\beta^l\gamma^m$, $\beta^p\gamma^r\delta^s$, $k,l,m,p,r,s\in\mathbb{N}_0$, $k>0$, of $A(SL_q(2))$, one can show that
\[
P_n = \begin{cases}
\displaystyle\sum_{k=0}^{-n}A(S_q^2)\,\alpha^{-n-k}\gamma^k = \sum_{k=0}^{-n}\alpha^{-n-k}\gamma^k\,A(S_q^2) & \text{for } n\le 0,\\[2ex]
\displaystyle\sum_{k=0}^{n}A(S_q^2)\,\beta^k\delta^{n-k} = \sum_{k=0}^{n}\beta^k\delta^{n-k}\,A(S_q^2) & \text{for } n\ge 0,
\end{cases}
\]
and $A(SL_q(2)) = \bigoplus_{n\in\mathbb{Z}}P_n$ (cf. [MMNNU91, (1.10)]).
Next, similarly to [BM93,BM98], we consider the canonical connection induced by the bicovariant splitting $i(z^n) = \alpha^n$, $i(z^{-n}) = \delta^n$. By Corollary 2.6 it induces a strong connection. We call this connection the (Dirac) q-monopole. Now, formula (2.7) gives us a splitting $s : A(SL_q(2))\to A(S_q^2)\otimes A(SL_q(2))$, and we can claim:

Proposition 3.2. Put
\[
(e_n)_{kl} = \begin{cases}
\alpha^{-n-k}\gamma^k\,\binom{-n}{l}_{q^2}(-q)^l\,\beta^l\delta^{-n-l} & \text{for } n\le 0,\\[1ex]
\beta^k\delta^{n-k}\,\binom{n}{l}_{q^2}(-q)^{-l}\,\alpha^{n-l}\gamma^l & \text{for } n\ge 0.
\end{cases}
\]
Then, for any $n\in\mathbb{Z}$, $e_n\in M_{|n|+1}(A(S_q^2))$, $e_n^2 = e_n$, and $A(S_q^2)^{|n|+1}e_n$ is isomorphic to $P_n$ as a left $A(S_q^2)$-module.
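To make the formula of Proposition 3.2 concrete, the lowest non-trivial cases $n=\mp 1$ can be written out by hand. The sketch below uses only the relations (3.1) and $\binom{1}{l}_{q^2}=1$; the column and row vectors $u$, $v^T$ are notation introduced here for illustration.

```latex
\begin{align*}
e_{-1} &= \begin{pmatrix}\alpha\delta & -q\,\alpha\beta\\ \gamma\delta & -q\,\gamma\beta\end{pmatrix}
        = \begin{pmatrix}\alpha\delta & -\beta\alpha\\ \gamma\delta & -q\,\beta\gamma\end{pmatrix}
        = \begin{pmatrix}\alpha\\ \gamma\end{pmatrix}\begin{pmatrix}\delta & -q\beta\end{pmatrix},
  &\begin{pmatrix}\delta & -q\beta\end{pmatrix}\begin{pmatrix}\alpha\\ \gamma\end{pmatrix}
        &= \delta\alpha - q\beta\gamma = 1,\\[1ex]
e_{1} &= \begin{pmatrix}\delta\alpha & -q^{-1}\delta\gamma\\ \beta\alpha & -q^{-1}\beta\gamma\end{pmatrix}
        = \begin{pmatrix}\delta\\ \beta\end{pmatrix}\begin{pmatrix}\alpha & -q^{-1}\gamma\end{pmatrix},
  &\begin{pmatrix}\alpha & -q^{-1}\gamma\end{pmatrix}\begin{pmatrix}\delta\\ \beta\end{pmatrix}
        &= \alpha\delta - q^{-1}\beta\gamma = 1.
\end{align*}
```

In both cases $e=uv^T$ with $v^Tu=1$, so $e^2=u(v^Tu)v^T=e$. Each entry is a product of an element of degree $z^{\pm1}$ with one of degree $z^{\mp1}$ under $\Delta_R$, hence lies in $A(S_q^2)$.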
Proof. Recall first that if $qxy = yx$, then $(x+y)^n = \sum_{k=0}^n\binom{n}{k}_q x^ky^{n-k}$, where
\[ \binom{n}{k}_q = \frac{(q-1)\cdots(q^n-1)}{(q-1)\cdots(q^k-1)\,(q-1)\cdots(q^{n-k}-1)} \]
are the q-binomial coefficients. (See, e.g., [M-S95, p.85].) Taking advantage of formula (2.7) in the q-monopole case, we compute:
\begin{align*}
s(\alpha^{m-k}\gamma^k) &= \alpha^{m-k}\gamma^k\,S\big(i(z^m)_{(1)}\big)\otimes i(z^m)_{(2)}\\
&= \sum_{l=0}^m\alpha^{m-k}\gamma^k\binom{m}{l}_{q^2}S(\alpha^{m-l}\beta^l)\otimes\alpha^{m-l}\gamma^l\\
&= \sum_{l=0}^m\alpha^{m-k}\gamma^k\binom{m}{l}_{q^2}(-q)^l\beta^l\delta^{m-l}\otimes\alpha^{m-l}\gamma^l.
\end{align*}
Similarly, $s(\beta^k\delta^{n-k}) = \sum_{l=0}^n\beta^k\delta^{n-k}\binom{n}{l}_{q^2}(-q)^{-l}\alpha^{n-l}\gamma^l\otimes\beta^l\delta^{n-l}$. Thus we have verified that $s$ preserves the direct sum decomposition of $A(SL_q(2))$, i.e., $s(P_n)\subseteq A(S_q^2)\otimes P_n$, $n\in\mathbb{Z}$. Hence, by restriction, we have a splitting of the left multiplication map for each $P_n$. The claim of the proposition follows directly from Lemma 2.8 and the above formulas for $s$. $\square$

Remark 3.3. Observe that for $n\ge 0$ we can write $e_n = uv^T$, where
\[ u^T = (\delta^n,\dots,\beta^k\delta^{n-k},\dots,\beta^n)\quad\text{and}\quad v^T = \Big(S(\delta^n),\dots,\binom{n}{k}_{q^2}S(\gamma^k\delta^{n-k}),\dots,S(\gamma^n)\Big). \]
Since
\[ v^Tu = \sum_{k=0}^n\binom{n}{k}_{q^2}S(\gamma^k\delta^{n-k})\,\beta^k\delta^{n-k} = S\big((\delta^n)_{(1)}\big)(\delta^n)_{(2)} = \varepsilon(\delta^n) = 1, \]
we can directly see that $e_n^2 = e_n$. The case $n\le 0$ is similar.

Remark 3.4. We can define the fibre of a quantum vector bundle over a classical point (understood as a number-valued algebra homomorphism) as the localization of the module of "sections" of this bundle at the kernel of this homomorphism. The standard Podleś quantum sphere that we consider here has one classical point given by the restriction of the counit map $\varepsilon$. Let us consider the quantum Hopf line bundles as left $A(S_q^2)$-modules $P_n$. We can then regard the localization $P_n/A(S_q^2)^+P_n$, $A(S_q^2)^+ := \mathrm{Ker}\,\varepsilon\cap A(S_q^2)$, as the fibre vector space of $P_n$ over the point given by $A(S_q^2)^+$. (Note that $P_n/A(S_q^2)^+P_n$ is automatically a vector space over $A(S_q^2)/A(S_q^2)^+ = k$.) Since $\varepsilon(A(S_q^2)^+P_n) = 0$, $\varepsilon$ induces a linear map $\tilde\varepsilon : P_n/A(S_q^2)^+P_n\to k$ given by the formula $\tilde\varepsilon(p/A(S_q^2)^+P_n) = \varepsilon(p)$. Assume now that $n\ge 0$. Arbitrary $p\in P_n$ can be written as $p = \sum_{l=0}^n b_l\beta^l\delta^{n-l}$, $b_l\in A(S_q^2)$. Hence $\tilde\varepsilon(p/A(S_q^2)^+P_n) = \varepsilon(b_0)$, and we can conclude that $\tilde\varepsilon$ is surjective. Note now that $\beta = (-q^{-1}\beta\gamma)\beta + (q\alpha\beta)\delta$, and consequently, for $l>0$, $\beta^l\delta^{n-l} = (-q^{-1}\beta\gamma)\beta^l\delta^{n-l} + (q\alpha\beta)\delta\beta^{l-1}\delta^{n-l}\in A(S_q^2)^+P_n$. It follows that
\begin{align*}
\Big(\sum_{l=0}^n b_l\beta^l\delta^{n-l}\Big)\Big/A(S_q^2)^+P_n &= b_0\delta^n/A(S_q^2)^+P_n\\
&= \varepsilon(b_0)\delta^n/A(S_q^2)^+P_n + (b_0-\varepsilon(b_0))\delta^n/A(S_q^2)^+P_n\\
&= \varepsilon(b_0)\delta^n/A(S_q^2)^+P_n.
\end{align*}
This entails the injectivity of $\tilde\varepsilon$. Thus $\tilde\varepsilon$ is an isomorphism, and we can infer that the fibre $P_n/A(S_q^2)^+P_n$ is a one-dimensional vector space, exactly as expected for a line bundle. The reasoning for $n\le 0$ is analogous, and relies on the identity $\gamma = (-q\beta\gamma)\gamma + (q^{-1}\delta\gamma)\alpha$. This agrees with the fact that $A(SL_q(2)) = \bigoplus_{n\in\mathbb{Z}}P_n$ and $A(SL_q(2))/A(S_q^2)^+A(SL_q(2)) = k[z,z^{-1}] = \bigoplus_{n\in\mathbb{Z}}kz^n$. The latter equality can be directly seen as follows: Since $\beta$ and $\gamma$ q-commute with all monomials, the two-sided ideal $\langle\beta,\gamma\rangle = \beta A(SL_q(2)) + \gamma A(SL_q(2))$. Thus, as $\beta,\gamma\in A(S_q^2)^+A(SL_q(2))$ by the above formulas, we have $\langle\beta,\gamma\rangle\subseteq A(S_q^2)^+A(SL_q(2))$. On the other hand, since $A(S_q^2)^+$ is the ideal in $A(S_q^2)$ generated by $\alpha\beta$, $\beta\gamma$, $\gamma\delta$, we also have $A(S_q^2)^+A(SL_q(2))\subseteq\langle\beta,\gamma\rangle$. Hence $k[z,z^{-1}] = A(SL_q(2))/\langle\beta,\gamma\rangle = A(SL_q(2))/A(S_q^2)^+A(SL_q(2))$.
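The two identities for $\beta$ and $\gamma$ quoted in Remark 3.4 follow from the relations (3.1) alone. A short worked check, using $\beta\delta=q^{-1}\delta\beta$, $\gamma\alpha=q\,\alpha\gamma$, $\alpha\delta=1+q^{-1}\beta\gamma$ and $\delta\alpha=1+q\beta\gamma$:

```latex
\begin{align*}
(q\alpha\beta)\delta &= q\,\alpha(\beta\delta) = \alpha\delta\,\beta
   = (1+q^{-1}\beta\gamma)\beta
 &&\Longrightarrow\quad (-q^{-1}\beta\gamma)\beta + (q\alpha\beta)\delta = \beta,\\
(q^{-1}\delta\gamma)\alpha &= q^{-1}\delta(\gamma\alpha) = \delta\alpha\,\gamma
   = (1+q\beta\gamma)\gamma
 &&\Longrightarrow\quad (-q\beta\gamma)\gamma + (q^{-1}\delta\gamma)\alpha = \gamma.
\end{align*}
```

The coefficients $\beta\gamma$, $\alpha\beta$ and $\delta\gamma$ are coinvariant under $\Delta_R$ and are annihilated by $\varepsilon$, which is why both right-hand sides land in $A(S_q^2)^+A(SL_q(2))$.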
To compute projector matrices of the quantum Hopf line bundles thought of as right $A(S_q^2)$-modules, we need a right-sided version of formula (2.7). A natural first candidate appears to be:
\[ \tilde s(p) = i(\pi(p_{(1)}))_{(1)}\otimes S\big(i(\pi(p_{(1)}))_{(2)}\big)\,p_{(2)}. \tag{3.2} \]
It is evidently a splitting of the multiplication map $m : A(SL_q(2))\otimes A(SL_q(2))\to A(SL_q(2))$. Only now it is right linear under left coinvariants. By left coinvariants we understand here $A(\tilde S_q^2) := \{p\in A(SL_q(2))\mid\Delta_L\,p = 1\otimes p\}$, where $\Delta_L = (\pi\otimes\mathrm{id})\circ\Delta$. On generators, we have explicitly:
\[
\Delta_L\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}z\otimes\alpha&z\otimes\beta\\ z^{-1}\otimes\gamma&z^{-1}\otimes\delta\end{pmatrix}.
\]
Using the PBW basis $\alpha^k\beta^l\gamma^m$, $\beta^p\gamma^r\delta^s$, $k,l,m,p,r,s\in\mathbb{N}_0$, $k>0$, of $A(SL_q(2))$, one can show that $A(\tilde S_q^2)$ is a unital subalgebra of $A(SL_q(2))$ generated by $\alpha\gamma$, $\beta\delta$, $\beta\gamma$. We want to prove now that the image of $\tilde s$ lies in $A(SL_q(2))\otimes A(\tilde S_q^2)$. To this end we note that the right covariance of $i$ implies the formula
\[ i(h)_{(1)}\otimes\pi\big(i(h)_{(3)}\big)\otimes i(h)_{(2)} = i(h_{(1)})_{(1)}\otimes h_{(2)}\otimes i(h_{(1)})_{(2)}. \]
With the above formula at hand, one can verify that $((\mathrm{id}\otimes\Delta_L)\circ\tilde s)(p) = i(\pi(p_{(1)}))_{(1)}\otimes 1\otimes S\big(i(\pi(p_{(1)}))_{(2)}\big)p_{(2)}$, as needed. Thus we can conclude that $\tilde s$ is a right $A(\tilde S_q^2)$-linear splitting of the multiplication map $A(SL_q(2))\otimes A(\tilde S_q^2)\to A(SL_q(2))$. However, $A(\tilde S_q^2)$ and $A(S_q^2)$ are different subalgebras of $A(SL_q(2))$, and we want to find projector matrices for $P_n$ thought of as right $A(S_q^2)$-modules. To our aid comes the transpose automorphism of $A(SL_q(2))$ defined on generators by
\[
T\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}\alpha&\gamma\\ \beta&\delta\end{pmatrix}.
\]
One can check directly that $T$ is well defined. In particular, when we work over $\mathbb{C}$, $A(SL_q(2))$ has a natural $*$-algebra structure for $q$ real, namely
\[
*\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix} = \begin{pmatrix}\delta&-q^{-1}\gamma\\ -q\beta&\alpha\end{pmatrix},
\]
and we can simply define $T = *\circ S$. This automorphism gives an isomorphism between $A(S_q^2)$ and $A(\tilde S_q^2)$. We have $T(A(S_q^2)) = A(\tilde S_q^2)$ and $T(A(\tilde S_q^2)) = A(S_q^2)$. (Note that $T^2 = \mathrm{id}$.) It is straightforward to verify that $\check s := (T\otimes T)\circ\tilde s\circ T$ is a right $A(S_q^2)$-linear splitting of the right multiplication map $m : A(SL_q(2))\otimes A(S_q^2)\to A(SL_q(2))$. We can now proceed as in the left-sided case to prove:

Proposition 3.5. Put
\[
(f_n)_{lk} = \begin{cases}
\binom{-n}{l}_{q^2}(-q)^{-l}\,\beta^l\delta^{-n-l}\,\alpha^{-n-k}\gamma^k & \text{for } n\le 0,\\[1ex]
\binom{n}{l}_{q^2}(-q)^{l}\,\alpha^{n-l}\gamma^l\,\beta^k\delta^{n-k} & \text{for } n\ge 0.
\end{cases}
\]
Then, for any $n\in\mathbb{Z}$, $f_n\in M_{|n|+1}(A(S_q^2))$, $f_n^2 = f_n$, and $f_nA(S_q^2)^{|n|+1}$ is isomorphic to $P_n$ as a right $A(S_q^2)$-module.
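As with the left-sided case, the formula of Proposition 3.5 is easiest to read in the case $n=-1$. The sketch below takes $l$ as the row index and $k$ as the column index (as the notation $(f_n)_{lk}$ suggests) and uses only the relations (3.1).

```latex
\begin{align*}
f_{-1} &= \begin{pmatrix}\delta\alpha & \delta\gamma\\ -q^{-1}\beta\alpha & -q^{-1}\beta\gamma\end{pmatrix}
        = \begin{pmatrix}\delta\alpha & \delta\gamma\\ -\alpha\beta & -q^{-1}\beta\gamma\end{pmatrix}
        = \begin{pmatrix}\delta\\ -q^{-1}\beta\end{pmatrix}\begin{pmatrix}\alpha & \gamma\end{pmatrix},\\
\begin{pmatrix}\alpha & \gamma\end{pmatrix}\begin{pmatrix}\delta\\ -q^{-1}\beta\end{pmatrix}
       &= \alpha\delta - q^{-1}\gamma\beta = \alpha\delta - q^{-1}\beta\gamma = 1
       \quad\Longrightarrow\quad f_{-1}^2 = f_{-1}.
\end{align*}
```

The second form of the matrix uses $\alpha\beta=q^{-1}\beta\alpha$, and is the one that reappears in the trace computation of Section 4.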
Proof. We have:
\begin{align*}
\check s(\alpha^{m-k}\gamma^k) &= (T\otimes T)\big(\tilde s(\alpha^{m-k}\beta^k)\big)\\
&= (T\otimes T)\big(i(z^m)_{(1)}\otimes S(i(z^m)_{(2)})\,\alpha^{m-k}\beta^k\big)\\
&= (T\otimes T)\Big(\sum_{l=0}^m\binom{m}{l}_{q^2}\alpha^{m-l}\beta^l\otimes S(\alpha^{m-l}\gamma^l)\,\alpha^{m-k}\beta^k\Big)\\
&= (T\otimes T)\Big(\sum_{l=0}^m\alpha^{m-l}\beta^l\otimes\binom{m}{l}_{q^2}(-q)^{-l}\gamma^l\delta^{m-l}\alpha^{m-k}\beta^k\Big)\\
&= \sum_{l=0}^m\alpha^{m-l}\gamma^l\otimes\binom{m}{l}_{q^2}(-q)^{-l}\beta^l\delta^{m-l}\alpha^{m-k}\gamma^k.
\end{align*}
Similarly, $\check s(\beta^k\delta^{n-k}) = \sum_{l=0}^n\beta^l\delta^{n-l}\otimes\binom{n}{l}_{q^2}(-q)^l\alpha^{n-l}\gamma^l\beta^k\delta^{n-k}$. Hence $\check s(P_n)\subseteq P_n\otimes A(S_q^2)$, $n\in\mathbb{Z}$. By restriction of $\check s$, we have a splitting of the right multiplication map for each $P_n$. The claim of the proposition follows from the right-sided version of Lemma 2.8 and the above formulas for $\check s$. $\square$

Finally, let us observe that, identifying $\mathrm{Hom}_{\rho_n}(k,A(SL_q(2)))$ with $P_n$, we can view the covariant derivative $\nabla^\omega_n : \mathrm{Hom}_{\rho_n}(k,A(SL_q(2)))\to\Omega^1A(S_q^2)\otimes_{A(S_q^2)}\mathrm{Hom}_{\rho_n}(k,A(SL_q(2)))$ associated to the q-monopole by (2.2), as the Grassmannian connection associated to the splitting $s_n := s|_{P_n}$. More precisely, let $\psi : \mathrm{Hom}_{\rho_n}(k,A(SL_q(2)))\to P_n$, $\psi(\xi) = \xi(1)$, be the identification isomorphism mentioned above. The Grassmannian connection associated to the splitting $s_n : P_n\to A(S_q^2)\otimes P_n$ is by definition the connection
$\tilde\nabla^s_n : P_n\to\Omega^1A(S_q^2)\otimes_{A(S_q^2)}P_n$ given by the formula $\tilde\nabla^s_n\,p = \sum_i db_i\otimes_{A(S_q^2)}p_i$, where $\sum_i b_i\otimes p_i := s(p)$. (See [CQ95, (54)] or [L-G97, (8.27)] for the right-sided version.) We want to show that $\nabla^\omega_n = (\mathrm{id}\otimes_{A(S_q^2)}\psi^{-1})\circ\tilde\nabla^s_n\circ\psi$, $n\in\mathbb{Z}$, or equivalently that
\[ \forall\,\xi\in\mathrm{Hom}_{\rho_n}(k,A(SL_q(2))),\ n\in\mathbb{Z}:\quad \big(\check\ell(\nabla^\omega_n\xi)\big)(1) = \big((\check\ell\circ(\mathrm{id}\otimes_{A(S_q^2)}\psi^{-1})\circ\tilde\nabla^s_n\circ\psi)(\xi)\big)(1). \]
(See Proposition 2.1 and (2.2).) Notice that we can use here either Proposition 2.1 or Proposition 2.3 to guarantee that $\nabla^\omega_n$, $n\in\mathbb{Z}$, makes sense. Indeed, since $k[z,z^{-1}]$ admits the Haar functional ($h_H : k[z,z^{-1}]\to k$, $h_H(z^n) = \delta_{0n}$), we can construct a unital right colinear mapping $j : k[z,z^{-1}]\to A(SL_q(2))$, $j := \eta\circ h_H$, where $\eta : k\to A(SL_q(2))$ is the unit map, so that $A(SL_q(2))$ is injective as a right $k[z,z^{-1}]$-comodule. Thus, as the antipode of $k[z,z^{-1}]$ is bijective, $A(SL_q(2))$ is left and right faithfully flat over $A(S_q^2)$ by [S-HJ90, Theorem I], and Proposition 2.1 applies. (In fact, we used the existence of a unital right colinear mapping to prove Proposition 2.1.) Also, $\Omega^1A(S_q^2)$ is isomorphic with $A(S_q^2)/k\otimes A(S_q^2)$ as a right $A(S_q^2)$-module via $db\mapsto b/k\otimes 1$, so that it is free, whence flat. Therefore Proposition 2.3 applies as well. Now, we put $s(\xi(1)) = \sum_i b_i\otimes\xi(1)_i$, $\xi_i(1) = \xi(1)_i$, and taking advantage of $m\circ s_n = \mathrm{id}$, (2.6), (1.1) and (2.2) compute:
\begin{align*}
\big((\check\ell\circ(\mathrm{id}\otimes_{A(S_q^2)}\psi^{-1})\circ\tilde\nabla^s_n\circ\psi)(\xi)\big)(1)
&= \Big((\check\ell\circ(\mathrm{id}\otimes_{A(S_q^2)}\psi^{-1}))\Big(\sum_i db_i\otimes_{A(S_q^2)}\xi(1)_i\Big)\Big)(1)\\
&= \sum_i(db_i)\,\xi(1)_i\\
&= 1\otimes(m\circ s_n)(\xi(1)) - s_n(\xi(1))\\
&= 1\otimes\xi(1) - \xi(1)\otimes 1 - \xi(1)_{(0)}\,\omega(\xi(1)_{(1)})\\
&= d\xi(1) - \Pi^\omega(d\xi(1))\\
&= \big(\check\ell(\nabla^\omega_n\xi)\big)(1).
\end{align*}
This is exactly as one should expect, since we have constructed the splitting $s : A(SL_q(2))\to A(S_q^2)\otimes A(SL_q(2))$ from the connection form $\omega$ by formula (2.6).

4. Chern–Connes Pairing for the n = −1 Bimodule

The aim of this section is to compute the left and right Chern numbers of the left and right finitely generated projective bimodule $P_{-1}$ describing the quantum Hopf line bundle of winding number $-1$. This computation is a simple example of the Chern–Connes pairing between K-theory and cyclic cohomology [C-A94,L-JL97]. To obtain the desired Chern numbers we need to evaluate (to pair) the appropriate even cyclic cocycle with the left and right projector matrix respectively. Since the positive even cyclic cohomology $HC^{2n}(A(S_q^2))$, $n>0$, is the image of the periodicity operator applied to $HC^0(A(S_q^2))$, and the pairing is compatible with the action of the periodicity
operator, the even cyclic cocycle computing Chern numbers is necessarily of degree zero, i.e., a trace. This trace is explicitly provided in [MNW91, (4.4)]. Adapting [MNW91, (4.4)] to our special case of the standard Podleś quantum sphere, we obtain:
\[
\tau^1\big((\alpha\beta)^m\zeta^n\big) = \begin{cases}(1-q^{2n})^{-1} & \text{for } n>0,\ m=0,\\ 0 & \text{otherwise,}\end{cases}
\qquad
\tau^1\big((\gamma\delta)^m\zeta^n\big) = \begin{cases}(1-q^{2n})^{-1} & \text{for } n>0,\ m=0,\\ 0 & \text{otherwise,}\end{cases}
\tag{4.1}
\]
where $\zeta := -q^{-1}\beta\gamma$. The fact that the "Chern cyclic cocycle" is in degree zero is a quantum effect caused by the non-classical structure of $HC^*(A(S_q^2))$ (see [MNW91]). In the classical case the corresponding cocycle is in degree two, as it comes from the volume form of the two-sphere. Since $\tau^1$ is a 0-cyclic cocycle, the pairing is given by the formula $\langle[\tau^1],[p]\rangle = (\tau^1\circ\mathrm{Tr})(p)$, where $p\in M_n(A(S_q^2))$, $p^2 = p$, and $\mathrm{Tr} : M_n(A(S_q^2))\to A(S_q^2)$ is the usual matrix trace. The following proposition establishes the pairing between the cyclic cohomology class $[\tau^1]$ and the $K_0$-classes $[e_{-1}]$ and $[f_{-1}]$ of the left and right projector matrix of bimodule $P_{-1}$ respectively:

Proposition 4.1. Let $\tau^1 : A(S_q^2)\to k$ be the trace (4.1), and $e_{-1}$, $f_{-1}$ the projectors given in Propositions 3.2 and 3.5. Then $(\tau^1\circ\mathrm{Tr})(e_{-1}) = -1$ and $(\tau^1\circ\mathrm{Tr})(f_{-1}) = 1$.

Proof. Taking advantage of (3.1) and (4.1), we get:
\[
(\tau^1\circ\mathrm{Tr})\begin{pmatrix}\alpha\delta&-\beta\alpha\\ \gamma\delta&-q\beta\gamma\end{pmatrix} = \tau^1\big(1+(q^{-1}-q)\beta\gamma\big) = \tau^1\big((q^2-1)\zeta\big) = -1.
\]
Similarly,
\[
(\tau^1\circ\mathrm{Tr})\begin{pmatrix}\delta\alpha&\delta\gamma\\ -\alpha\beta&-q^{-1}\beta\gamma\end{pmatrix} = 1,
\]
as claimed. $\square$

This computation is in agreement with the classical situation. Only there the sign change of the Chern number when switching (by transpose) from the left to right projector matrix is due to the anticommutativity of the standard differential forms on manifolds. Here the sign change relies on the noncommutativity of the algebra. Since every free module can be represented in $K_0$ by the identity matrix, we obtain that the pairing of the cyclic cohomology class $[\tau^1]$ with the $K_0$-class of any free $A(S_q^2)$-module always vanishes: $\langle[\tau^1],[I]\rangle = \tau^1(n) = 0$, $n\in\mathbb{N}$. Now, combining Proposition 4.1 with Lemma 1.7 yields:

Corollary 4.2. The Hopf–Galois extension of the quantum principal Hopf fibration is not cleft.
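For readers who wish to retrace the two Chern numbers of Proposition 4.1, the traces can be expanded step by step. The sketch below uses the relations (3.1), the substitution $\beta\gamma=-q\zeta$, and the values $\tau^1(1)=0$ and $\tau^1(\zeta)=(1-q^2)^{-1}$ read off from (4.1); it implicitly assumes $q^2\neq 1$ so that (4.1) makes sense.

```latex
\begin{align*}
\mathrm{Tr}(e_{-1}) &= \alpha\delta - q\beta\gamma
   = 1 + (q^{-1}-q)\beta\gamma
   = 1 + (q^{2}-1)\zeta,
 &\tau^1\big(\mathrm{Tr}(e_{-1})\big) &= 0 + \frac{q^{2}-1}{1-q^{2}} = -1,\\[1ex]
\mathrm{Tr}(f_{-1}) &= \delta\alpha - q^{-1}\beta\gamma
   = 1 + (q-q^{-1})\beta\gamma
   = 1 + (1-q^{2})\zeta,
 &\tau^1\big(\mathrm{Tr}(f_{-1})\big) &= 0 + \frac{1-q^{2}}{1-q^{2}} = 1.
\end{align*}
```

The opposite signs come entirely from the difference $\alpha\delta-\delta\alpha=(q^{-1}-q)\beta\gamma$, which vanishes in the commutative limit.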
Appendix

In this appendix we provide a direct proof of non-cleftness of the quantum principal Hopf fibration which is possible in the purely algebraic setting. This complements our K-theoretic proof. Thus, suppose that there exists a cleaving map $\Phi : k[z,z^{-1}]\to A(SL_q(2))$. The existence of the convolution inverse $\Phi^{-1}$ entails $\Phi(z)\Phi^{-1}(z) = \varepsilon(z)$, whence $\Phi(z)$ must be invertible in $A(SL_q(2))$. The polynomial $\Phi(z)$ cannot be constant because then $\Phi(z)$ and $\Phi(1) = 1$ would be linearly dependent, which contradicts the injectivity of $\Phi$ (see Sect. 1). Therefore to prove the non-cleftness it suffices to show that all invertible elements of $A(SL_q(2))$ are non-zero numbers. One can do it using the direct sum decomposition $A(SL_q(2)) = \bigoplus_{m,n\in\mathbb{Z}}A[m,n]$, where $A[m,n] = \{p\in A(SL_q(2))\mid\pi(p_{(1)})\otimes p_{(2)} = z^m\otimes p,\ p_{(1)}\otimes\pi(p_{(2)}) = p\otimes z^n\}$ (see [MMNNU91, (1.10)]). To be consistent with [MMNNU91], let us put now $k = \mathbb{C}$. (See, however, the bottom of p.360 in [MMNNU91].) We know from [MMNNU91, p.363] that we can write any element of $A(SL_q(2))$ as a sum $\sum_{m,n}p_{m,n}(\zeta)e_{m,n}$ or $\sum_{k,l}e_{k,l}r_{k,l}(\zeta)$, where $\zeta := -q^{-1}\beta\gamma$, $p_{m,n},r_{k,l}\in\mathbb{C}[\zeta]$, $e_{m,n}\in A[m,n]$. Assume now that $\sum_{m,n}p_{m,n}(\zeta)e_{m,n}\sum_{k,l}e_{k,l}r_{k,l}(\zeta) = 1$. Since both sums are finite, there exist indices $m_+ := \max\{m\in\mathbb{Z}\mid p_{m,n}\neq 0\}$, $n_+ := \max\{n\in\mathbb{Z}\mid p_{m_+,n}\neq 0\}$, $m_- := \min\{m\in\mathbb{Z}\mid p_{m,n}\neq 0\}$, $n_- := \min\{n\in\mathbb{Z}\mid p_{m_-,n}\neq 0\}$, and similarly $k_+,k_-,l_+,l_-$. We have
\[ A[0,0]\ni e_{0,0} = 1 = \sum_{m,n}p_{m,n}(\zeta)e_{m,n}\sum_{k,l}e_{k,l}r_{k,l}(\zeta) = \sum_{m,n,k,l}p_{m,n}(\zeta)\,s_{m,n,k,l}(\zeta)\,\tilde r_{k,l}(\zeta)\,e_{m+k,n+l}. \tag{4.2} \]
Here $s_{m,n,k,l}(\zeta)e_{m+k,n+l} := e_{m,n}e_{k,l}$ (see [MMNNU91, p.363]), and $\tilde r_{k,l}(\zeta)$ is obtained from $r_{k,l}(\zeta)$ by commuting it over $e_{m+k,n+l}$, i.e., $e_{m+k,n+l}\,r_{k,l}(\zeta) = \tilde r_{k,l}(\zeta)\,e_{m+k,n+l}$. It follows from the commutation relations (3.1) that the coefficients of $\tilde r_{k,l}$ are $q$ to some powers times the corresponding coefficients of $r_{k,l}$. In particular, $r_{k,l} = 0\Leftrightarrow\tilde r_{k,l} = 0$. Since $p_{m_+,n_+}(\zeta)e_{m_+,n_+}$, $e_{k_+,l_+}r_{k_+,l_+}(\zeta)$ and $p_{m_-,n_-}(\zeta)e_{m_-,n_-}$, $e_{k_-,l_-}r_{k_-,l_-}(\zeta)$ are the only terms that can contribute to the direct summand $A[m_++k_+,n_++l_+]$ and $A[m_-+k_-,n_-+l_-]$ respectively, we can conclude from Eq. (4.2) that either $m_++k_+$, $n_++l_+$, $m_-+k_-$, $n_-+l_-$ are all zero, or else $p_{m_\pm,n_\pm}(\zeta)\,s_{m_\pm,n_\pm,k_\pm,l_\pm}(\zeta)\,\tilde r_{k_\pm,l_\pm}(\zeta)\,e_{m_\pm+k_\pm,n_\pm+l_\pm} = 0$. From [MMNNU91, p.363] we know, however, that $e_{m_\pm+k_\pm,n_\pm+l_\pm}$ is a (left and right) basis of $A[m_\pm+k_\pm,n_\pm+l_\pm]$ over $\mathbb{C}[\zeta]$. Also, using formulas $\alpha^j\delta^j = \prod_{i=1}^j(1-q^{-2(i-1)}\zeta)$, $\delta^j\alpha^j = \prod_{i=1}^j(1-q^{2i}\zeta)$ one can check that $e_{m,n}e_{k,l}\neq 0$, whence $s_{m_\pm,n_\pm,k_\pm,l_\pm}\neq 0$. Thus, as there are no zero divisors in $\mathbb{C}[\zeta]$ and $r_{k,l} = 0\Leftrightarrow\tilde r_{k,l} = 0$, we can conclude that $p_{m_\pm,n_\pm} = 0$ or $r_{k_\pm,l_\pm} = 0$. This, however, contradicts the definition of $m_\pm,n_\pm,k_\pm,l_\pm$. Therefore $m_\pm = -k_\pm$ and $n_\pm = -l_\pm$. Consequently, as $m_-\le m_+$ and $k_-\le k_+$, we have $m_- = m_+ = -k_+ = -k_-$. Hence also $n_- = n_+ = -l_+ = -l_-$. Put $m_0 = m_- = m_+$ and $n_0 = n_- = n_+$. It follows now that $\sum_{m,n}p_{m,n}(\zeta)e_{m,n} = p_{m_0,n_0}(\zeta)e_{m_0,n_0}$ and $\sum_{k,l}e_{k,l}r_{k,l}(\zeta) = e_{-m_0,-n_0}r_{-m_0,-n_0}(\zeta)$. This way (4.2) reduces to $p_{m_0,n_0}(\zeta)\,s_{m_0,n_0,-m_0,-n_0}(\zeta)\,\tilde r_{-m_0,-n_0}(\zeta) = 1$. Hence all three of the above polynomials must be non-zero constants. Using again [MMNNU91, p.363] and remembering that $\alpha^j\delta^j$ and $\delta^j\alpha^j$ are polynomials in $\zeta$ of degree $j$, we can infer that $m_0 = 0 = n_0$. (Otherwise $s_{m_0,n_0,-m_0,-n_0}$ is not of degree 0.) Consequently $\sum_{m,n}p_{m,n}(\zeta)e_{m,n} = p_{0,0}(\zeta)$, $\sum_{k,l}e_{k,l}r_{k,l}(\zeta) = \tilde r_{0,0}(\zeta) = r_{0,0}(\zeta)$, and $p_{0,0}$, $r_{0,0}$ are invertible constant polynomials, as needed.

Acknowledgements. P. M. H. was partially supported by the NATO and CNR postdoctoral fellowships and KBN grant 2 P03A 030 14. It is a pleasure to thank Max Karoubi and Giovanni Landi for very helpful discussions.
References

[B-N72] Bourbaki, N.: Commutative Algebra. Reading, MA: Addison-Wesley, 1972
[BM93] Brzeziński, T., Majid, S.: Quantum Group Gauge Theory on Quantum Spaces. Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995)
[BM98] Brzeziński, T., Majid, S.: Quantum Differentials and the q-Monopole Revisited. Acta Applic. Math. 54, 185–232 (1998)
[C-A94] Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
[CQ95] Cuntz, J., Quillen, D.: Algebra Extensions and Nonsingularity. J. Amer. Math. Soc. 8 (2), 251–289 (1995)
[DGH] Dąbrowski, L., Grosse, H., Hajac, P.M.: Joint project. Trieste, Italy, SISSA 84/99/FM
[D-Y85] Doi, Y.: Algebras with total integrals. Commun. Alg. 13, 2137–2159 (1985)
[D-M96] Durdevic, M.: Quantum Principal Bundles and Tannaka–Krein Duality Theory. Rep. Math. Phys. 38 (3), 313–324 (1996)
[H-PM96] Hajac, P.M.: Strong Connections on Quantum Principal Bundles. Commun. Math. Phys. 182 (3), 579–617 (1996)
[L-G97] Landi, G.: An Introduction to Noncommutative Spaces and their Geometries. Berlin–Heidelberg–New York: Springer-Verlag, 1997
[L-JL97] Loday, J.-L.: Cyclic Homology. Berlin–Heidelberg–New York: Springer, 1997
[M-S95] Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
[M-S97] Majid, S.: Some Remarks on Quantum and Braided Group Gauge Theory. Banach Center Publications 40, 336–349 (1997)
[MMNNU91] Masuda, T., Mimachi, K., Nakagami, Y., Noumi, M., Ueno, K.: Representations of the Quantum Group SUq(2) and the Little q-Jacobi Polynomials. J. Funct. Anal. 99, 357–387 (1991)
[MNW91] Masuda, T., Nakagami, Y., Watanabe, J.: Noncommutative Differential Geometry on the Quantum Two Sphere of Podleś. I: An Algebraic Viewpoint. K-Theory 5, 151–175 (1991)
[P-P87] Podleś, P.: Quantum Spheres. Lett. Math. Phys. 14, 521–531 (1987)
[S-HJ90] Schneider, H.-J.: Principal Homogenous Spaces for Arbitrary Hopf Algebras. Isr. J. Math. 72 (1–2), 167–195 (1990)
[S-HJ94] Schneider, H.-J.: Hopf Galois Extensions, Crossed Products, and Clifford Theory. In: Bergen, J., Montgomery, S. (eds.) Advances in Hopf Algebras, Lecture Notes in Pure and Applied Mathematics 158. New York: Marcel Dekker, Inc., 1994, pp. 267–297

Communicated by A. Connes
Commun. Math. Phys. 206, 265 – 272 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Categorial Mirror Symmetry for K3 Surfaces

C. Bartocci¹, U. Bruzzo¹,², G. Sanguinetti³

¹ Dipartimento di Matematica, Università degli Studi di Genova, Via Dodecaneso 35, 16146 Genova, Italy. E-mail: [email protected]
² Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Beirut 2–4, 34014 Trieste, Italy. E-mail: [email protected]
³ Mathematical Institute, University of Oxford, 24–29 St. Giles’, Oxford OX1 3LB, UK. E-mail: [email protected]

Received: 20 October 1998 / Accepted: 15 March 1999
Abstract: We study the structure of a modified Fukaya category F(X) associated with a K3 surface X, and prove that whenever X is an elliptic K3 surface with a section, the derived category of F(X) is equivalent to a subcategory of the derived category $D(\widehat{X})$ of coherent sheaves on the mirror K3 surface $\widehat{X}$.

1. Introduction

In 1994 M. Kontsevich conjectured that a proper mathematical formulation of the mirror conjecture is provided by an equivalence between Fukaya’s category of a Calabi–Yau manifold X and the derived category of coherent sheaves of the mirror Calabi–Yau manifold $\widehat{X}$ [10]. Thus in some sense mirror symmetry relates the symplectic structure of a Calabi–Yau manifold with the holomorphic structure of its mirror. It is expected that special Lagrangian tori on X are mapped by mirror symmetry to skyscraper sheaves on the mirror $\widehat{X}$. This conjecture found some physical evidence with the discovery of D-branes and the description of their role in mirror symmetry [14,17]. Moreover, in a recent paper [15] Kontsevich’s conjecture has been proved in the case of the simplest Calabi–Yau manifolds, the elliptic curves.

Our approach to mirror symmetry follows the geometric interpretation due to Strominger, Yau and Zaslow [17]. According to their construction, given a Calabi–Yau manifold admitting a foliation in special Lagrangian tori, its mirror manifold should be obtained by relative T-duality. In the case of K3 surfaces this formulation has been given a rigorous treatment in [2,4], proving that Strominger, Yau and Zaslow’s approach is consistent with previous descriptions of mirror symmetry [5] (this is also related to work by Aspinwall and Donagi [1]).

We show here how the constructions described in [2,4] can be given a categorial interpretation which provides a proof of Kontsevich’s conjecture in the case of K3 surfaces. More precisely, we show that, under some assumptions which will be spelled out in the
following sections, the derived category of a Fukaya-type category built out of special Lagrangian submanifolds of an elliptic K3 surface X is equivalent to a subcategory of the derived category of coherent sheaves on the mirror surface $\widehat{X}$. This subcategory is formed by the complexes of sheaves whose zeroth Chern character vanishes.

2. Special Lagrangian Submanifolds and Fukaya’s Category

Definition 2.1. Let X be a Calabi–Yau n-fold, with Kähler form $\omega$ and holomorphic n-form $\Omega$. A (real) n-dimensional submanifold $\iota : M \hookrightarrow X$ of X is said to be special Lagrangian if the following two conditions are met:
– M is Lagrangian in the symplectic structure given by $\omega$, i.e. $\iota^*\omega = 0$;
– there exists a multiple $\Omega'$ of $\Omega$ such that $\iota^*(\operatorname{Im}\Omega') = 0$.

It can be shown that these conditions are equivalent to requiring that the real part of $\Omega'$ restricts on M to the volume form induced by the Riemannian metric of X. This exhibits special Lagrangian submanifolds as a special type of calibrated submanifolds [8].

There are not many explicit examples of special Lagrangian submanifolds. The simplest ones are the 1-dimensional submanifolds of an elliptic curve: the first condition is trivial, and the multiple $\Omega'$ of the global holomorphic one-form is readily obtained by a holomorphic change of coordinates in the universal covering of the elliptic curve. Additional examples are provided by Calabi–Yau manifolds equipped with an antiholomorphic involution. Since the involution changes the sign of both the Kähler form and the imaginary part of the holomorphic n-form, the fixed point sets of the involution are special Lagrangian submanifolds.

A third example, and the most relevant in our case, arises when considering Calabi–Yau manifolds endowed with a hyper-Kähler structure. This is always the case in dimension 2, i.e. for K3 surfaces. In this case special Lagrangian submanifolds are just holomorphic submanifolds with respect to a different complex structure compatible with the same hyper-Kähler metric.
This example will be discussed at length in the next section.

Special Lagrangian submanifolds have received remarkable attention in physics since the appearance of D-branes in string theory, and especially since their role turned out to be of primary importance for the mirror conjecture [3,17]. D-branes are special Lagrangian submanifolds of the Calabi–Yau manifold which serves as compactification space, and are equipped with a flat U(1) line bundle. In the physicists’ language, special Lagrangian submanifolds of the compactification space are associated with physical states which retain part of the supersymmetry of the vacuum. For this (and other related) reasons, special Lagrangian submanifolds are often called supersymmetric cycles, or also BPS states.

Fukaya’s category, whose objects are Lagrangian submanifolds of a symplectic manifold, was introduced in connection with Floer’s homology [6]. Here, basically following the exposition of [15], we offer a description of a modified Fukaya category, built out of the special Lagrangian submanifolds of a Calabi–Yau manifold X. We shall call this the special Lagrangian Fukaya category (SLF category for short) of X, and will denote it by F(X).

The objects in F(X) are pairs (L, E), where L is a special Lagrangian submanifold of X, and E is a flat vector bundle on L. The morphisms in this category are slightly more complicated to define. Since special Lagrangian submanifolds are n-cycles in a compact complex n-dimensional manifold, two special Lagrangian cycles generically intersect at a finite number of points. The basic idea is that a morphism between two objects in the SLF category is a way of passing from the vector bundle defined on one cycle to the bundle on the other.
Definition 2.2. Let $U_1 = (L_1, E_1)$, $U_2 = (L_2, E_2)$ be two objects in the SLF category. Then the space of morphisms $\mathrm{Hom}(U_1, U_2)$ is defined to be
$$\mathrm{Hom}(U_1, U_2) = \bigoplus_{x \in L_1 \cap L_2} \mathrm{Hom}(E_1|_x, E_2|_x).$$

Thus the space of morphisms between two objects in the SLF category turns out to be a direct sum of vector spaces, each one being the space of homomorphisms between the fibers of the two vector bundles at the intersection points of the two special Lagrangian cycles.

Maslov index. The space of morphisms between two objects is naturally graded over Z by the Maslov index of the tangent spaces to the special Lagrangian submanifolds at the intersection points [15]. Let us recall some basic facts about the Maslov index. Let V be a 2n-dimensional real symplectic vector space, and denote by G(V) the Grassmannian of Lagrangian n-planes in V. One has an isomorphism $G(V) \simeq U(n)/O(n)$, so that $\pi_1(G(V)) \simeq \mathbb{Z}$. The Maslov index is the unique integer-valued function on the space of loops in G(V) satisfying some naturality conditions [13] which include its homotopy invariance, and thus provides an explicit isomorphism $\pi_1(G(V)) \to \mathbb{Z}$.

In order to define a Maslov index for the intersection of Lagrangian cycles one has to slightly modify its definition so as to consider open paths. One first notices that the Lagrangian Grassmannian is naturally stratified by the dimension of the intersection of the Lagrangian n-planes with a fixed Lagrangian n-plane. Then one can define a Maslov index for the intersection of two Lagrangian planes as a Z-valued function on the space of paths in G(V) which is homotopy invariant under deformations of the paths that do not move the extrema out of their strata. (Actually one should consider a Grassmannian of special Lagrangian (rather than just Lagrangian) planes, and restrict the Maslov index to it. This will be done in the next section in the case of K3 surfaces.)

A∞ structure.
Strictly speaking Fukaya’s category is not a category at all, since in general the composition of morphisms fails to be associative. Associativity is replaced by a more complicated property, which makes Fukaya’s “category” into an A∞ category.

Definition 2.3. An A∞ category F consists of a class of objects Ob(F); for any two objects X, Y, a Z-graded abelian group of morphisms Hom(X, Y); composition maps
$$m_k : \mathrm{Hom}(X_1, X_2) \otimes \cdots \otimes \mathrm{Hom}(X_k, X_{k+1}) \to \mathrm{Hom}(X_1, X_{k+1}), \qquad k \ge 1,$$
of degree 2 − k, satisfying the condition
$$\sum_{r=1}^{n} \sum_{s=1}^{n-r+1} (-1)^{\epsilon}\, m_{n-r+1}\bigl(a_1 \otimes \cdots \otimes a_{s-1} \otimes m_r(a_s \otimes \cdots \otimes a_{s+r-1}) \otimes a_{s+r} \otimes \cdots \otimes a_n\bigr) = 0 \tag{1}$$
for all n ≥ 1, where
$$\epsilon = (r+1)s + r\Bigl(n + \sum_{j=1}^{s-1} \deg(a_j)\Bigr).$$
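For small n, condition (1) can be unwound by hand. The following is only a sketch: it takes the sign exponent to be $\epsilon = (r+1)s + r(n + \sum_{j<s}\deg a_j)$ as in (1), and it uses the shorthand $m_2(a, b)$ for $m_2(a \otimes b)$, which is a notational assumption made here for readability.

```latex
% n = 1: the only term is (r,s) = (1,1), with \epsilon = 2 + 1 = 3.
\begin{align*}
  -\,m_1\bigl(m_1(a_1)\bigr) &= 0 .
\end{align*}
% n = 2: the terms are (r,s) = (1,1), (1,2), (2,1),
% with \epsilon = 4, \; 6 + \deg a_1, \; 7 respectively.
\begin{align*}
  m_2\bigl(m_1(a_1), a_2\bigr)
  + (-1)^{\deg a_1}\, m_2\bigl(a_1, m_1(a_2)\bigr)
  - m_1\bigl(m_2(a_1, a_2)\bigr) &= 0 .
\end{align*}
% n = 3, assuming m_1 = 0: every term with r = 1 or r = 3 contains m_1
% and drops out; the remaining terms are (r,s) = (2,1), (2,2),
% with \epsilon = 9 and \epsilon = 12 + 2\deg a_1.
\begin{align*}
  -\,m_2\bigl(m_2(a_1, a_2), a_3\bigr)
  + m_2\bigl(a_1, m_2(a_2, a_3)\bigr) &= 0 .
\end{align*}
```

Read in this way, the case n = 1 says that $m_1$ squares to zero, the case n = 2 is a graded Leibniz rule for $m_1$ with respect to $m_2$, and the case n = 3 with $m_1 = 0$ is exactly associativity of $m_2$.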
Condition (1) implies that $m_1$ is a coboundary operator. The vanishing of the morphism $m_1$, together with condition (1) for the morphism $m_3$, implies that the composition law given by $m_2$ is associative.

Let us see how this A∞ structure arises in Fukaya’s category. Let us assume that the first object $X_1$ and the last object $X_{k+1}$ have a nonvoid intersection, otherwise $\mathrm{Hom}(X_1, X_{k+1}) = 0$ and the composition map is trivial. The composition maps are explicitly described as follows. Let $u_j = (a_j, t_j) \in \mathrm{Hom}(U_j, U_{j+1})$, where $a_j \in L_j \cap L_{j+1}$ and $t_j \in \mathrm{Hom}(E_j|_{a_j}, E_{j+1}|_{a_j})$. One defines
$$m_k(u_1 \otimes \cdots \otimes u_k) = \sum_{a_{k+1} \in L_1 \cap L_{k+1}} \bigl(C(u_1, \dots, u_k, a_{k+1}),\, a_{k+1}\bigr).$$
Here one has
$$C(u_1, \dots, u_k, a_{k+1}) = \sum_{\phi} \pm \exp\Bigl[2\pi i \int \phi^*\omega^c\Bigr]\, P\exp\Bigl[\oint \phi^*\beta\Bigr]. \tag{2}$$

This requires some explanation. The sum is performed over holomorphic and antiholomorphic maps $\phi$ from the disc $D^2$ into the manifold X, up to projective equivalence, with the following boundary condition: there are k + 1 points $p_j = e^{2\pi i \alpha_j} \in S^1 = \partial D^2$ such that $\phi(p_j) = a_j$ and $\phi(e^{2\pi i \alpha}) \in L_j$ for $\alpha \in (\alpha_{j-1}, \alpha_j)$. The two-form $\omega^c$ appearing in (2) is the complexified Kähler form, while $\beta$ is the connection of the bundle restricted to the image of the boundary of the disc. P represents a path-ordered integration, defined by
$$P\exp\Bigl(\oint \phi^*\beta\Bigr) = \exp\Bigl(\int_{\alpha_k}^{\alpha_{k+1}} \beta_k\, d\alpha\Bigr)\, t_k\, \exp\Bigl(\int_{\alpha_{k-1}}^{\alpha_k} \beta_{k-1}\, d\alpha\Bigr)\, t_{k-1} \cdots t_1\, \exp\Bigl(\int_{\alpha_{k+1}}^{\alpha_1} \beta_1\, d\alpha\Bigr).$$
3. The Special Lagrangian Fukaya Category for K3 Surfaces

The main purpose of this section is to give a description of the SLF category when the Calabi–Yau manifold is a K3 surface X. In this case, due to the fact that K3 surfaces admit hyper-Kähler metrics, special Lagrangian submanifolds are very easily exhibited. Let us denote by $\omega$ the Kähler form associated with a given hyper-Kähler metric and complex structure. One also has a holomorphic 2-form $\Omega = x + iy$. The three elements $\omega$, x, y can be regarded as vectors in the cohomology space $H^2(X, \mathbb{R})$; if the latter is equipped with the scalar product of signature (3,19) induced by the intersection form on $H^2(X, \mathbb{Z})$, these three elements are spacelike, and generate a 2-sphere which can be identified with the set of complex structures compatible with the fixed hyper-Kähler metric. It is very easy to check that what is special Lagrangian in the original complex structure is holomorphic in the complex structure in which the roles of $\omega$ and x are exchanged (up to a sign) [8] (this corresponds to a rotation of 90 degrees around the y axis). We shall call such a change of complex structure a hyper-Kähler rotation.

We want in particular to consider elliptic K3 surfaces X which admit a section.¹ K3 surfaces arising as compactification spaces of string theories which admit mirror partners are always of this type [17]. So let us consider a K3 surface X that in a complex structure I is elliptic and has a section. Let us denote by $X_I$ this K3 surface. The Picard group of $X_I$ is generated by the section, by the divisor of the generic fiber, and by the irreducible components of the singular fibers that do not intersect the section.² If we perform the hyper-Kähler rotation described above, and call J the new complex structure, the submanifolds which were holomorphic in the complex structure I are now special Lagrangian. Assuming that $X_J$ is elliptic as well, it has been shown [4] that this hyper-Kähler rotation reproduces, at the level of the Picard lattice of an elliptic K3 surface, the effects of mirror symmetry previously described in an algebraic way [5]. So the varieties $X_I$ and $X_J$ can be regarded as a mirror pair of K3 surfaces. In this way one has a very precise picture of the configuration of special Lagrangian submanifolds of $X_J$. Moreover, the flat vector bundles one considers on special Lagrangian submanifolds of $X_J$ are (flat) holomorphic bundles in the complex structure I.

¹ This means that there exists an epimorphism $p : X \to \mathbb{P}^1$ whose generic fiber is a smooth elliptic curve and admitting a section $e : \mathbb{P}^1 \to X$.

On a K3 surface the A∞ structure of the SLF category turns out to be trivial, that is, the SLF category is a true category. In fact due to the hyper-Kähler structure of a K3 surface X, the Grassmannian of special Lagrangian subspaces of the tangent space to X at a point reduces to a copy of $\mathbb{P}^1$, hence is simply connected. Moreover, special Lagrangian 2-cycles always intersect transversally, so there is no stratification, and the Maslov index is trivial (cf. [11]). The Hom groups in the SLF category have trivial grading, so $m_k = 0$ for $k \neq 2$, while condition (1) for $m_3$ yields the associativity of the composition of morphisms.
The triviality of this Fukaya category for K3 surfaces may be related, via Sadov’s claim [16] that the Floer homology of an almost Kähler manifold X with coefficients in the Novikov ring of X is equivalent to the quantum cohomology of X, to the triviality of the quantum cohomology of K3 surfaces.

4. The Special Lagrangian Fukaya Category and the Derived Category of Coherent Sheaves

We want now to describe a construction which exhibits the relationship between the SLF category of a K3 surface and the derived category of coherent sheaves on the mirror K3 surface. We start by briefly recalling the definition of derived category of an abelian category A (cf. [18]). One starts from the category K(A) whose objects are complexes of objects in A, while the morphisms are morphisms of complexes identified up to homotopies. Let Ac(A) be the full subcategory of K(A) formed by acyclic complexes (i.e. complexes such that all cohomology objects vanish). The derived category D(A) is by definition the quotient K(A)/Ac(A). A morphism between two objects [X], [Y] in D(A) is represented by a diagram of morphisms in K(A),
$$X \xleftarrow{\;q\;} Z \xrightarrow{\;m\;} Y,$$
where q is a quasi-isomorphism, i.e., a morphism which induces an isomorphism between the cohomology objects of its source and target (here Z and X). Two objects X, Y in K(A) turn out to be equivalent in D(A) whenever they are quasi-isomorphic, that is, whenever there is a diagram as above where m is also a quasi-isomorphism. If there exists a quasi-isomorphism between two complexes, these represent isomorphic objects in D(A).

² Actually one may have further generators of the Picard group provided by additional sections of the projection $p : X \to \mathbb{P}^1$.
Now we consider a K3 surface X with a fixed hyper-Kähler metric, and a compatible complex structure J. If we start from an object (L, E) in the SLF category $F(X_J)$, where L is a special Lagrangian submanifold of real dimension 2, and E a flat rank n vector bundle on L, then in the complex structure I obtained by performing a hyper-Kähler rotation L is a divisor, and E may be regarded as a coherent sheaf on $X_I$ concentrated on L, whose restriction to L is a rank n locally free sheaf. This operation is clearly functorial: the sheaf of homomorphisms between two such objects is a torsion sheaf concentrated on the points where the two divisors intersect. The stalks at such points are precisely the homomorphisms between the stalks of the two coherent sheaves. Thus the hyper-Kähler rotation induces a functor between the SLF category $F(X_J)$ and the category $C(X_I)$ of coherent sheaves supported on a divisor of $X_I$, whose restriction to the divisor is locally free. This functor is clearly faithful, full and representative and hence gives an equivalence of the two categories.

Remark 4.1. To take account of the singular divisors in X we should consider torsion-free sheaves rather than just locally free ones. However, since any coherent sheaf on a possibly singular curve over C has a projective resolution by locally free sheaves, what we miss by restricting to locally free sheaves will be recovered when we go to the derived categories.

The category $C(X_I)$ that we obtained via a hyper-Kähler rotation is not abelian (kernels and cokernels of morphisms do not necessarily lie in the category). In order to introduce a related derived category, one should find a somehow natural abelian category $\tilde{C}(X_I)$ containing $C(X_I)$. The most obvious choice is the subcategory of the category $\mathrm{Coh}(X_I)$ of coherent sheaves on $X_I$ whose objects are sheaves of rank 0 (in particular we are adding all the skyscraper sheaves). We assume that the K3 surface $X_I$ is elliptic and has a section.
Since $X_I$ is elliptic any point p ∈ X lies on a divisor D. The complex $0 \to k_p \to 0$ concentrated in degree zero, where $k_p$ is the length one skyscraper at p, is quasi-isomorphic to the complex of sheaves in $C(X_I)$,
$$0 \to \mathcal{O}_D(-p) \to \mathcal{O}_D \to 0,$$
where $\mathcal{O}_D$ is the term of degree zero. Since every coherent sheaf on a smooth curve is the direct sum of a locally free sheaf and a skyscraper sheaf, we obtain that all coherent sheaves whose support lies on a divisor are objects of $\tilde{C}(X_I)$.

It is not always true that the derived category of an abelian subcategory C′ of an abelian category C is also a subcategory of the derived category of C. However, this is indeed the case for the category $\tilde{C}(X_I)$, as we shall next show. Let us recall the definition of thick subcategory (cf. e.g. [9]).

Definition 4.2. A subcategory C′ of a category C is said to be thick if for any exact sequence $Y \to Y' \to W \to Z \to Z'$ in C with Y, Y′, Z, Z′ in C′, then W belongs to C′ as well.

Now, $\tilde{C}(X_I)$ is a thick subcategory of $\mathrm{Coh}(X_I)$: in fact, the generic stalk of a sheaf in $\tilde{C}(X_I)$ is 0, and, since a sequence of sheaves is exact when it is so at the stalks, this implies that also the generic stalk of W is 0, i.e. W also is a rank 0 sheaf. Moreover, $\tilde{C}(X_I)$ is a full subcategory, so that we can apply the following theorem [9].
Theorem 4.3. Let C be an abelian category, C′ a thick full abelian subcategory. Assume that for any monomorphism $f : W' \to W$ with $W' \in \mathrm{Ob}(C')$, there exists a morphism $g : W \to Y$, with $Y \in \mathrm{Ob}(C')$, such that $g \circ f$ is a monomorphism. Then the derived category D(C′) is equivalent to the subcategory of D(C) consisting of complexes whose cohomology objects belong to C′.

In our case the condition of this theorem is easily met: just take for g the evaluation morphism. Thus the derived category built up from $\tilde{C}(X_I)$ is a subcategory of the derived category of coherent sheaves.

The image of the category $\tilde{C}(X_I)$ in cohomology is $H^{1,1}(\mathbb{Z}) \oplus H^4(\mathbb{Z})$ and is an ideal in the algebraic cohomology ring. Since the Chern map is a ring morphism between K-theory and algebraic cohomology, by adding the structure sheaf to $\tilde{C}(X_I)$ we recover the whole derived category of coherent sheaves. Adding the structure sheaf of the surface has no motivation from a strictly geometric viewpoint, but has physical grounds in the necessity of having 0-branes in the spectrum of the theory. (The association between coherent sheaves and branes is usually done by taking the Poincaré dual of the support of the coherent sheaf.)

Let us check explicitly that every complex $0 \to F \to 0$, where F is a coherent sheaf on $X_I$, is quasi-isomorphic to a complex $0 \to \oplus\mathcal{O}_{X_I} \to S \to 0$, where S is a coherent sheaf supported on a divisor. Let us fix a very ample divisor H in $X_I$. Every coherent sheaf F admits a finite projective resolution by sheaves of the form $\oplus_{j=1}^{r} \mathcal{O}_{X_I}(-m_j H)$ (cf. [7]). Moreover, due to the exactness of the sequence
$$0 \to \mathcal{O}_{X_I}(-m_i H) \to \mathcal{O}_{X_I} \to \mathcal{O}_{m_i H} \to 0,$$
the sheaf $\oplus_{j=1}^{r} \mathcal{O}_{X_I}(-m_j H)$ is quasi-isomorphic to a complex
$$0 \to \oplus\mathcal{O}_{X_I} \to S \to 0,$$
where S is a coherent sheaf supported on a divisor (here $\oplus\mathcal{O}_{X_I}$ is concentrated in degree 0). This proves that the whole derived category of coherent sheaves is obtained by complexes whose elements are either direct sums of the structure sheaf or lie in the image of the SLF category.

Collecting these results, we have eventually proved the following fact: the derived category of a “natural abelianization” of the SLF category $F(X_J)$ is equivalent to a subcategory of the derived category $D(X_I)$ of coherent sheaves on $X_I$.

5. Conclusions

Mirror symmetry yields definite predictions about the transformations of branes [14], which can be given a precise mathematical interpretation in terms of transformations of the derived category of coherent sheaves. In [2] it was indeed proved that the action of a Fourier–Mukai transform on the derived category of coherent sheaves mimics precisely the action of mirror symmetry on branes. In particular, this shows that on an elliptic K3 surface genus 1 special Lagrangian cycles are mapped to points, which is exactly the behaviour one expects from mirror symmetry [12].

Moreover, one can argue that the very essence of mirror symmetry is an equivalence between a suitable (derived) version of the Fukaya category of a Calabi–Yau manifold
X and the derived category of coherent sheaves of the mirror manifold $\widehat{X}$. This is exactly what we have proved when X is an elliptic K3 surface with a section, admitting also a fibration in special Lagrangian tori. After performing a hyper-Kähler rotation, we map the SLF category into a category whose “natural abelianization” is a thick full subcategory of the category of coherent sheaves. Now, if we consider an extension of this category adding the structure sheaf (which seems in some sense very natural) and derive this, we obtain the whole derived category of coherent sheaves. Applying a Fourier–Mukai transform (which at the level of derived categories is an equivalence) we obtain the desired transformation mapping 2-cycles of genus 1 to points. If, instead, we do not extend the SLF category by adding the structure sheaf, we obtain a subcategory of the derived category of coherent sheaves. This will be mapped by Fourier–Mukai transform to another subcategory, but again this will show the desired feature of mapping 2-cycles of genus 1 to points.

Acknowledgements. We thank B. Dubrovin for valuable discussions and D. Hernández Ruipérez for his enlightening suggestions. This research was partly supported by the research project “Geometria delle varietà differenziabili”. The second author wishes to thank the School of Mathematical and Computing Sciences of the Victoria University of Wellington, New Zealand, for the warm hospitality during the completion of this paper while he was supported by the Marsden Fund research grant VUW-703.
References

1. Aspinwall, P., and Donagi, R.: The heterotic string, the tangent bundle, and derived categories. hep-th/9806094
2. Bartocci, C., Bruzzo, U., Hernández Ruipérez, D., and Muñoz Porras, J.M.: Mirror symmetry on K3 surfaces via Fourier–Mukai transform. Commun. Math. Phys. 195, 79–93 (1998); alg-geom/9704023
3. Becker, K., Becker, M., and Strominger, A.: Fivebranes, membranes and non-perturbative string theory. Nucl. Phys. B456, 130–152 (1995); hep-th/9507158
4. Bruzzo, U., and Sanguinetti, G.: Mirror symmetry on K3 surfaces as a hyper-Kähler rotation. Lett. Math. Phys. 45, 295–301 (1998); physics/9802044
5. Dolgachev, I.V.: Mirror symmetry for lattice polarized K3 surfaces. J. Math. Sci. 81, 2599–2630 (1996); alg-geom/9502005
6. Fukaya, K.: Morse homotopy, A∞-category and Floer homologies. In: Proceedings of the 1993 GARC Workshop on Geometry and Topology, Seoul National University
7. Hartshorne, R.: Algebraic geometry. New York: Springer-Verlag, 1977 (Corollary II.5.18)
8. Harvey, R., and Lawson Jr., H.B.: Calibrated geometries. Acta Math. 148, 47–157 (1982)
9. Kashiwara, M., and Schapira, P.: Sheaves on manifolds. Berlin: Springer-Verlag, 1990
10. Kontsevich, M.: Homological algebra of mirror symmetry. In: Proceedings of the 1994 International Congress of Mathematicians, I, Zürich: Birkhäuser, 1995, p. 120; alg-geom/9411018
11. Kontsevich, M.: Talk delivered at “European Conference on Algebraic Geometry”, University of Warwick, July 1996
12. Manin, Yu.I.: Talk delivered at the Pisa symposium “Hodge Theory, Mirror Symmetry and Quantum Cohomology”, April 1998
13. McDuff, D., and Salamon, D.: Introduction to symplectic topology. Oxford: Clarendon Press, 1995
14. Ooguri, H., Oz, Y., and Yin, Z.: D-branes on Calabi–Yau spaces and their mirrors. Nucl. Phys. B477, 407–430 (1996); hep-th/9606112
15. Polishchuk, A., and Zaslow, E.: Categorical mirror symmetry: The elliptic curve. math.AG/9801119
16. Sadov, V.: On equivalence of Floer’s and quantum cohomology. Commun. Math. Phys. 173, 77–99 (1995); hep-th/9310153
17. Strominger, A., Yau, S.-T., and Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B479, 243–259 (1996); hep-th/9606040
18. Verdier, J.-L.: Des catégories dérivées des catégories abéliennes. Astérisque 239, Société Mathématique de France (1996)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 206, 273 – 288 (1999)
Ergodicity of 2D Navier–Stokes Equations with Random Forcing and Large Viscosity

Jonathan C. Mattingly⋆

Program in Applied and Computational Mathematics, Princeton NJ, USA

Received: 16 February 1998 / Accepted: 19 March 1999
Abstract: The stochastically forced, two-dimensional, incompressible Navier–Stokes equations are shown to possess a unique invariant measure if the viscosity is taken large enough. This result follows from a stronger result showing that at high viscosity there is a unique stationary solution which attracts solutions started from arbitrary initial conditions. That is to say, the system has a trivial random attractor. Along the way, results controlling the expectation and averaging time of the energy and enstrophy are given.
We consider the stochastically forced, 2D, incompressible Navier–Stokes equations (SNS) on a bounded domain $U \subset \mathbb{R}^2$ with a smooth boundary $\partial U$, namely
$$\frac{\partial u(x,t)}{\partial t} - \nu\Delta u(x,t) + (u(x,t)\cdot\nabla)u(x,t) = f(x,t) - \nabla P(x,t), \qquad \nabla\cdot u(x,t) = 0,$$
$$u(x,0) = u_0(x) \quad\text{and}\quad u(x,t) = 0 \ \text{for } x \in \partial U. \tag{1}$$
Here f(x, t) is a divergence-free, mean zero, white in time Gaussian random field satisfying the specified boundary conditions, P(x, t) is the pressure, and ν > 0 is the viscosity. Equation (1) determines a Markov process whose phase space consists of the square integrable, divergence-free vector fields defined on the domain U with the given boundary conditions. This process was studied in a series of papers by Crauel, Da Prato, Ferrario, Flandoli, Foias, Gatarek, Maslowski, Temam, Zabczyk and others (see [Fla94, FG95, FM95, CF94, Fer97, DPZ96]). In particular, for (1) they proved the existence and uniqueness of strong solutions to the integral equation and the existence of at least one invariant measure. The uniqueness of this measure was proven under stringent additional conditions on the forcing.

⋆ Current address: Department of Mathematics, Stanford University, Stanford, CA 94305, USA. E-mail: [email protected]

In this paper, we prove a general theorem about the behavior of solutions of the SNS equations which holds for large viscosity. This theorem, among other things, easily gives the uniqueness of the invariant measure. The approach was motivated by the paper [EKMS98]. Furthermore, it opens the possibility of studying various statistical properties of the solutions with respect to this invariant measure.

Before stating the main result let us describe the setting more precisely. We begin by eliminating the pressure term from the equations by incorporating the divergence-free condition into the state space. Essentially, ∇P can be understood as a Lagrange multiplier which enforces the divergence-free condition. Its effect can be captured by restricting ourselves to solutions living in a divergence-free space. We denote by V the space of all $C^\infty$, divergence-free vector fields on U satisfying the boundary conditions and by $\mathbb{L}^2$ the closure of V in $L^2(U) \times L^2(U)$. $\mathbb{L}^2$ should be thought of as the square integrable functions “in our setting”. By projecting onto $\mathbb{L}^2$, we can rewrite (1) as an abstract Itô stochastic differential equation on $\mathbb{L}^2$. We thus obtain
$$du(t; t_0, u_0) = \{-\nu\Lambda^2 u - B(u,u)\}\,dt + dW(t), \qquad u(t_0; t_0, u_0) = u_0 \in \mathbb{L}^2. \tag{2}$$
Here $u(t; t_0, u_0)$ is the value at time t of a solution which started from an initial condition $u_0$ at time $t_0$. $B(v,w) = P_{\mathbb{L}^2}(v\cdot\nabla)w$ and $\Lambda^2 u = -P_{\mathbb{L}^2}\Delta u$ are respectively the projections of the bilinear term and the linear term onto $\mathbb{L}^2$. We will also need the eigenvectors, $\{e_k\}_{k\in\mathbb{Z}}$, of $\Lambda^2$ in $\mathbb{L}^2$ and their corresponding eigenvalues, $\lambda_k$. To characterize the spatial smoothness, we will use the spaces $\mathbb{H}^s = D(\Lambda^s) \cap \mathbb{L}^2$, where $D(\Lambda^s)$ is the domain of the operator $\Lambda^s$. $\mathbb{H}^s$ is essentially the Sobolev space $H^s(U) \times H^s(U)$ with the addition of the boundary and divergence-free conditions.

dW(t) is the Itô differential of an infinite dimensional Brownian motion in $\mathbb{L}^2$. We assume W(t, ω) is of the form
$$W(t) = \sum_{k\in\mathbb{Z},\,k\neq 0} \sigma_k e_k \beta_k(t), \qquad t \in (-\infty, \infty). \tag{3}$$
Here the $\beta_k(t)$ are independent, two-sided, standard Brownian motions on the probability space $(\Omega, \mathcal{F}_t, \mathbb{P}, \theta_t)$, $\mathcal{F}_t$ is the filtration of σ-algebras to which the $\beta_k$’s are adapted, $\mathbb{P}$ is the probability measure on Ω, and $\theta_t$ is the induced ergodic group of $\mathbb{P}$-preserving shifts on Ω. For a unique strong solution to exist, a sufficient requirement on the coefficients $\sigma_k \in \mathbb{R}$ is that
$$\sum_k \sigma_k^2 \lambda_k^{1/2} < \infty. \tag{4}$$
This condition is natural for it makes W(·) a Brownian motion with values in $\mathbb{H}^{1/2}$ ([Kun90, DPZ92]). By the Sobolev embedding theorem, this is the marginal space to be continuously embedded in $L^4 \times L^4$ as is required of the forcing in the deterministic theory. It is possible to work with less spatially regular forcing at the expense of having to deal with weak solutions and the imposition of additional conditions to assure uniqueness. However, since our goal is to outline a new approach, not give the most general theorems, we steer clear of these extra complications in the name of clarity. Thus henceforth, except when explicitly stated, we require that W(t) satisfy the condition given in (4).
The process W(t) has stationary increments; hence, the expected value of the squared $\mathbb{L}^2$ norm of the forcing grows at a fixed constant rate. We will denote this constant by $E_0$,
$$\mathbb{E}\bigl\{|W(t+\tau)|^2_{\mathbb{L}^2} - |W(t)|^2_{\mathbb{L}^2}\bigr\} = \tau E_0 = \tau\sum_k \sigma_k^2. \tag{5}$$
Physically, $E_0$ is the expected energy flux imparted by the stochastic forcing per unit of time. We also observe that the Poincaré inequality, $|\Lambda u|^2_{\mathbb{L}^2} \ge \lambda_1 |u|^2_{\mathbb{L}^2}$, holds in our setting. Furthermore, we shall need later the classical estimate on the bilinear term B(v, w) (see [CF88]):
$$|\langle B(v,w), u\rangle_{\mathbb{L}^2}|^2 \le \gamma^2\, |v|_{\mathbb{L}^2}\, |\Lambda v|_{\mathbb{L}^2}\, |\Lambda w|^2_{\mathbb{L}^2}\, |u|_{\mathbb{L}^2}\, |\Lambda u|_{\mathbb{L}^2}. \tag{6}$$
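The identity (5) can be checked directly from the series (3). The computation below is a sketch under three assumptions that are not spelled out in the text: the eigenvectors $e_k$ are normalized to be orthonormal in the divergence-free $L^2$ space, the times satisfy $t \ge 0$ and $\tau \ge 0$, and $\lambda_k \ge \lambda_1 > 0$ for every k.

```latex
% Orthonormality of the e_k turns the norm of the series (3) into a sum of squares.
\begin{align*}
  |W(t)|^2 &= \sum_{k} \sigma_k^2\, \beta_k(t)^2 ,
  &
  \mathbb{E}\,|W(t)|^2 &= \sum_{k} \sigma_k^2\, \mathbb{E}\,\beta_k(t)^2
                        = t \sum_{k} \sigma_k^2 \qquad (t \ge 0).
\end{align*}
% Subtracting the values at times t + \tau and t gives (5).
\begin{align*}
  \mathbb{E}\bigl\{ |W(t+\tau)|^2 - |W(t)|^2 \bigr\}
    &= (t+\tau) \sum_{k} \sigma_k^2 - t \sum_{k} \sigma_k^2
     = \tau \sum_{k} \sigma_k^2 = \tau E_0 .
\end{align*}
% Finiteness of E_0 follows from condition (4) when \lambda_k \ge \lambda_1 > 0.
\begin{align*}
  E_0 = \sum_{k} \sigma_k^2
      \le \lambda_1^{-1/2} \sum_{k} \sigma_k^2\, \lambda_k^{1/2} < \infty .
\end{align*}
```

The last line is the reason a forcing satisfying (4) automatically has finite mean energy flux.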
For completeness, we restate the existence and uniqueness theorems for the SNS.

Definition. A stochastic process u(t, ω) is a solution of (2), over the time interval $[t_0, T]$ with initial condition $u_0 \in \mathbb{L}^2$, if
• $u(\cdot, \omega) \in C((t_0, T), \mathbb{L}^2) \cap L^2((t_0, T); \mathbb{H}^1)$ a.s.
• $u(\cdot, \omega)$ is a solution of the integral equation
$$u(t,\omega) = e^{-\nu\Lambda^2(t-t_0)}u_0 - \int_{t_0}^{t} e^{-\nu\Lambda^2(t-s)} B(u(s,\omega), u(s,\omega))\,ds + \int_{t_0}^{t} e^{-\nu\Lambda^2(t-s)}\,dW(s,\omega)$$
with probability one.

Just as in the deterministic Navier–Stokes equations, one can obtain a short time existence proof by means of a fixed point argument. The solution can then be extended for all time by an a priori energy estimate.

Theorem (Da Prato, Zabczyk, Flandoli). If W(·) satisfies (4), then for each initial condition $u_0 \in \mathbb{L}^2$ there exists a unique solution u of the SNS, Eq. (2), such that
• $u(\cdot, \omega) \in C([0, T]; \mathbb{L}^2) \cap L^2(0, T; \mathbb{H}^1)$ a.s.
• u is a Markov process in $\mathbb{L}^2$.

Proof. Given the observation that in the two-dimensional setting, $\mathbb{H}^{1/2} \subset L^4 \times L^4$, the existence and uniqueness was proved in [DPZ96]. Flandoli proved the regularity and Markov properties in [Fla94]. □

As mentioned before, in [FM95] Flandoli and Maslowski proved that the invariant measure for weak solutions of the SNS, Eq. (2), is unique if
$$c\,\lambda_k^{-1/2} \le \sigma_k \le C\,\lambda_k^{-3/8-\epsilon} \qquad \text{for some } C > 0,\ c > 0, \text{ and } \epsilon > 0,$$
asymptotically in k. The upper bound ensures that a weak solution of a needed regularity exists, the lower bound ensures the uniqueness of the invariant measure. These results have been improved in [Fer97] but only in so far as the decay rates have been relaxed. All these results require the noise not only to be infinite dimensional but also not to have a high degree of smoothness in space. Our results, though requiring a viscosity which
is “large enough”, impose no spatial roughness on the forcing. In particular, the forcing can be finite dimensional. We use a simple yet, when applicable, extremely powerful methodology for showing the uniqueness of an invariant measure of our Markov process. It amounts to showing, noise realization by noise realization, that trajectories starting from different initial data converge to each other with probability one as the system evolves.

1. Main Results

As we have alluded to, all of our central results require the square of the viscosity to be large relative to the mean energy flux of the forcing. We now make this statement precise. Recall that $\gamma$ is the domain-dependent constant defined in (6) and that $\lambda_1$ is the first eigenvalue of $\Lambda^2$ on $U$. Define $\delta_0 = \lambda_1 \nu - E_0 \gamma / (2\nu^2)$.

Condition (A).
$$\frac{\nu^3}{E_0} > \frac{\gamma}{2\lambda_1} \iff \delta_0 > 0.$$
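The equivalence in Condition (A) is a one-line rearrangement of the definition of $\delta_0$, and it can be sketched in code as follows. This is only an illustration: the function names and the numerical values are hypothetical, and the critical viscosity $\nu_c = (E_0\gamma/(2\lambda_1))^{1/3}$ is simply the value at which $\delta_0$ vanishes.

```python
def delta0(nu, E0, gamma, lambda1):
    """delta_0 = lambda_1 * nu - E_0 * gamma / (2 nu^2), as defined in the text."""
    return lambda1 * nu - E0 * gamma / (2.0 * nu ** 2)


def condition_A(nu, E0, gamma, lambda1):
    """Condition (A): nu^3 / E_0 > gamma / (2 lambda_1)."""
    return nu ** 3 / E0 > gamma / (2.0 * lambda1)


def critical_viscosity(E0, gamma, lambda1):
    """Viscosity at which delta_0 = 0 (derived here by solving delta_0 = 0 for nu)."""
    return (E0 * gamma / (2.0 * lambda1)) ** (1.0 / 3.0)


# Hypothetical sample values: Condition (A) holds exactly when delta_0 > 0.
for nu in (0.5, 0.79, 0.8, 1.0):
    print(nu, delta0(nu, 1.0, 1.0, 1.0), condition_A(nu, 1.0, 1.0, 1.0))
```

With $E_0 = \gamma = \lambda_1 = 1$ the critical viscosity is about 0.794, so the sample viscosities 0.5 and 0.79 fail the condition while 0.8 and 1.0 satisfy it.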
As $E_0$ is the mean increase in the squared $L^2$ norm of the Brownian forcing $W(t)$ per unit of time, Condition (A) requires that the mean energy input be small relative to the viscosity squared. All of our results stem from the following two theorems.

Theorem 1. Assume Condition (A) holds. Fix a $\delta \in (0, \delta_0)$ and a time $t_0$. Let $u_0 \in L^2$ be an initial condition, measurable with respect to $\mathcal{F}_{t_0}$, such that $\mathbb{E}|u_0|_{L^2}^{2p}$ is finite for some $p > 1$. Set $u(t) = u(t; t_0, u_0)$ and let $\tilde{u}(t) = u(t; t_0, \tilde{u}_0)$ denote the solution starting from some other arbitrary initial condition $\tilde{u}_0 \in L^2$. Then there exists a positive integer-valued random time $\overrightarrow{\tau}(\delta, t_0, u_0)$, independent of $\tilde{u}$, such that
$$|u(t) - \tilde{u}(t)|_{L^2}^2 \le |u_0 - \tilde{u}_0|_{L^2}^2\, e^{-2\delta(t-t_0)}$$
for all $t > t_0 + \overrightarrow{\tau}$. In addition, $\mathbb{E}(\overrightarrow{\tau}^{\,q})$ is finite for any $q \in (0, p - 1)$.
Fig. 1. Summary of Theorem 1
Theorem 2. Assume Condition (A) holds. Fix a $\delta \in (0, \delta_0)$ and a $t \in \mathbb{R}$. Let $\{u_0(n)\}$ be a sequence of random variables with $n \in \alpha\mathbb{Z}$ and $n \ge 0$. Assume that the $u_0(n)$ are measurable with respect to $\mathcal{F}_{t-n}$ and that $\mathbb{E}|u_0(n)|_{L^2}^{2p}$ is uniformly bounded in $n$ for some $p > 2$. Then the following hold:
1. There exists a random $\alpha\mathbb{Z}$-valued time $\overleftarrow{n}(\epsilon, \delta, t, \omega) > 0$ such that for real $s > 0$ and all $n \in \alpha\mathbb{Z}$ with $n > \overleftarrow{n}$ one has, with probability one,
$$\sup_{u_0' \in A_n} |u(t + s; t - n, u_0(n)) - u(t + s; t - n, u_0')|_{L^2} \le \delta^2 |n| e^{-\delta(|n|+s)}.$$
Here $A_n$ is the set $\{u_0' : |u_0'|_{L^2}^2 \le \delta^2 |n|\}$. In addition, $\mathbb{E}(\overleftarrow{n}^{\,q}) < \infty$ for $q \in (0, p-2)$.

2. Let $\{\tilde{u}_0(n)\}$ be a second sequence of random variables with $n \in \alpha\mathbb{Z}$ and $n \ge 0$, measurable with respect to $\mathcal{F}_{t-n}$, and with $\mathbb{E}|\tilde{u}_0(n)|_{L^2}^{2p}$ uniformly bounded in $n$ for some $p > 2$. Then there exists another $\alpha\mathbb{Z}$-valued random time $\overleftarrow{n}'$ such that, with probability one, for real $s > 0$ and all $n \in \alpha\mathbb{Z}$ with $n > \overleftarrow{n}'$ one has
$$|u(t + s; t - n, u_0(n)) - u(t + s; t - n, \tilde{u}_0(n))|_{L^2} \le \delta^2 |n| e^{-\delta(|n|+s)}.$$
Again, $\mathbb{E}(\overleftarrow{n}')^q < \infty$ for $q \in (0, p - 2)$.

Theorem 2 is similar in spirit to Theorem 1. The main difference is that in Theorem 2 time is running backwards. However, Theorem 2 is a bit weaker in that we are restricted to points on the lattice $\alpha\mathbb{Z}$ as starting times. This, however, is an artifact of our approach. By proving a “backwards” version of the critical lemma used in the proof of Theorem 1 (that is, Lemma 3), one can prove a version of Theorem 2 completely analogous to Theorem 1. See [Mat98] for the details. The following corollary will allow us to build a solution starting from “−∞.”

Corollary 1. Under Condition (A), fix a lattice $\alpha\mathbb{Z}$, a $t_1 \in \alpha\mathbb{Z}$, and a $\delta \in (0, \delta_0)$. Given any $\epsilon > 0$, there exists a positive $\alpha\mathbb{Z}$-valued random time $n^*(\epsilon, \delta, t_1)$ such that with probability one, for all $\tau \ge 0$ and all $n_1, n_2 \in \alpha\mathbb{Z}$,
$$n_1, n_2 < t_1 - n^* \implies |u(t_1 + \tau; n_1, 0) - u(t_1 + \tau; n_2, 0)|_{L^2} \le \epsilon e^{-\delta\tau}.$$
Furthermore, $n^*(\omega)$ is a stationary random variable with all moments finite.
Fig. 2. Summary of Corollary 1
Theorem 1 implies that, for almost every realization of the noise, trajectories starting from different initial conditions converge to each other. Corollary 1 states that two solutions with initial conditions identically equal to zero, but starting at different instances of time, converge to each other for almost every instance of noise. Together they show
that there exists a unique asymptotic behavior and thus a unique invariant measure. Essentially, Corollary 1 shows the existence of a single distinguished solution to which all solutions starting from zero converge almost surely. Theorem 1 guarantees that all initial conditions converge to this distinguished solution for almost every instance of the noise. Thus the asymptotic behavior depends only on the realization of the noise and is insensitive to the initial conditions. We now make this discussion more formal, and prove the following statements.

Theorem 3. If Condition (A) holds, then there exists a unique solution $u^* : (-\infty, \infty) \times \Omega \to L^2$ of the SNS, defined for all $t \in (-\infty, \infty)$, such that:
1. $u^*(t, \omega)$ is a stationary stochastic process with values in $H^1$.
2. For any time $t_0 \in \mathbb{R}$, any $\delta$ in $(0, \delta_0)$, and any lattice $\alpha_0\mathbb{Z}$, there exist integer random times $\overrightarrow{n}^*(t_0, \delta)$ and $\overleftarrow{n}^*(t_0, \delta, \alpha_0)$ such that
$$\sup_{\{u_0 \,:\, |u^*(t_0) - u_0|_{L^2}^2 \,\ldots\}} |u(t; t_0, u_0) - u^*(t)|_{L^2} \le r e^{-\delta(t-t_0)}, \qquad \ldots\ t_0 + \ldots$$
$\overleftarrow{n}^*$ and $\overrightarrow{n}^*$ have all moments finite.
Fig. 3. Summary of Theorem 3
In fact, $u^*$ has greater spatial regularity than mentioned above. See [Mat98] for the details.

Proof. We begin by constructing $u^*$. Pick an $\alpha \in \mathbb{R}_+$. Let $n_1$ be an arbitrary element of $\alpha\mathbb{Z}$. Define $u_n(t, \omega) = u(t, \omega; n_1 - n, 0)$ for $n \in \alpha\mathbb{Z}_+$ and $t \ge n_1$. By Corollary 1, we see that the $\{u_n\}$, restricted to the time interval $[n_1, \infty)$, form a Cauchy sequence in the space $C([n_1, \infty), L^2)$ under the norm $|u|_{\infty, L^2} \overset{\text{def}}{=} \sup_{s \ge n_1} |u(s)|_{L^2}$. This space is complete so the limit exists. Define $u^*(t, \omega)$ to be this limit for $t \ge n_1$. Since $n_1$ was arbitrary this defines $u^*(t, \omega)$ for all time.

Flandoli proved in [Fla94] that there is an absorbing ball for the dynamics in the $H^1$ topology. Thus for any fixed $T > n_1$, we see that $\limsup_n |\Lambda u_n(s)|_{L^2} < K(\omega, T)$ almost surely, for some random $K$ and all $s \in [n_1, T]$. This gives that $|\Lambda u^*|_{L^2} < K$ almost surely, which means $u^* \in C([n_1, \infty); H^1)$. This also shows that the $\{u_n\}$ converge to $u^*$ weakly in $H^1$. We already know that the $\{u_n\}$ converge strongly to $u^*(t, \omega)$ in
$L^2$. Hence by standard techniques and some estimates on the bilinear term, we see that $u^*(t, \omega)$ is a weak solution to the SNS equation (cf. Sect. 2.1 of [Tem79]). Since $u^* \in C([n_1, \infty); H^1)$ almost surely, it is in fact a strong solution to the integral equation.

Because each $u_n$ starts from zero, Lemma 2 shows that for each $p$, $\mathbb{E}\{|u_n(t)|_{L^2}^{2p}\}$ is bounded uniformly in both $t$ and $n$. Thus, $\mathbb{E}\{|u^*(t)|_{L^2}^{2p}\}$ is bounded uniformly in $t$, $t \in (-\infty, \infty)$. This uniformity allows us to apply Theorems 1 and 2, which proves the two statements about balls in phase space being exponentially attracted to $u^*$.

Next, we must show that $u^*$ is stationary. Observe that by construction $u^*(t, \omega)$ is stationary under shifts of length $\alpha$:
$$u^*(t + \alpha, \omega) = \lim_{\substack{n \to -\infty \\ n \in \alpha\mathbb{Z}}} u(t + \alpha, \omega; t - n, 0) = \lim_{\substack{n \to -\infty \\ n \in \alpha\mathbb{Z}}} u(t, \theta_\alpha \omega; t - n + \alpha, 0) = u^*(t, \theta_\alpha \omega).$$
Since $\alpha$ was arbitrary, for another $\tilde{\alpha} \in \mathbb{R}_+$ we could construct $\tilde{u}^*$ corresponding to the lattice $\tilde{\alpha}\mathbb{Z}$. Again $\tilde{u}^*$ would be a strong solution, with $E_p(\tilde{u}^*)$ uniformly bounded in time and stationary relative to shifts of length $\tilde{\alpha}$. Since $u^*(t)$ and $\tilde{u}^*(t)$ both have uniformly bounded energy moments, we can apply Theorem 2. Because $u^*(t)$ and $\tilde{u}^*(t)$ exist for all times, we can slide the “initial times” used in Theorem 2 back to “−∞”, thus showing that the two solutions are identical. □

In light of Theorem 3, we have the following corollary.

Corollary 2. If Condition (A) holds, the SNS has a unique invariant measure.

Proof. The invariant measure is simply the law of $u^*(t)$ at any time $t$. Since every trajectory is attracted to $u^*(\cdot)$, the measure is unique. □

We can recast these conclusions in the language of random attractors (see [CF94]) by saying that the SNS possesses a random attractor which for each noise realization is a single solution in $L^2$.

2. Energy Estimates

Before proving our main results, we establish a few facts concerning the evolution of the energy which do not require Condition (A). We will denote the moments of the energy by $E_p(t; u_0) = \mathbb{E}\{|u(t; t_0, u_0)|_{L^2}^{2p}\}$. Also define $\sigma_{\max}^2 = \sup_k \sigma_k^2$. Letting $u_k(t) = \langle u(t), e_k\rangle_{L^2}$ and denoting by $\langle u(t), dW\rangle_{L^2}$ the sum $\sum_k u_k(t) \cdot \sigma_k\, d\beta_k(t)$, we have the following lemmas describing the evolution of the energy moments.

Lemma 1. For $p \ge 1$, the energy moments satisfy the Itô stochastic differential equation
$$d|u(t)|_{L^2}^{2p} = 2p|u(t)|_{L^2}^{2(p-1)} \left[ -\nu|\Lambda u(t)|_{L^2}^2\,dt + \langle u(t), dW\rangle_{L^2} \right] + 2p(p-1)|u(t)|_{L^2}^{2(p-2)} \sum_k |u_k(t)|^2 |\sigma_k|^2\,dt + p|u(t)|_{L^2}^{2(p-1)} E_0\,dt. \qquad (7)$$
Furthermore, the local martingale defined by
$$M_t = \int_{t_0}^{t} 2p|u(s)|_{L^2}^{2(p-1)} \langle u(s), dW(s)\rangle_{L^2}$$
is in fact an $L^2(\Omega)$ martingale.
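Lemma 2. Assume that the initial condition is such that $\mathbb{E}\{|u_0|_{L^2}^{2p}\}$ is finite for some $p \ge 1$ and measurable with respect to $\mathcal{F}_{t_0}$. Then
$$E_1(t, u_0) \le \frac{E_0}{2\nu\lambda_1} + e^{-2\nu\lambda_1 t}\left\{ \mathbb{E}|u_0|_{L^2}^2 - \frac{E_0}{2\nu\lambda_1} \right\} = \frac{E_0}{2\nu\lambda_1} + e^{-2\nu\lambda_1 t}\left[ E_1(t_0, u_0) - \frac{E_0}{2\nu\lambda_1} \right],$$
and for all $j \in \mathbb{Z}$, $1 < j \le p$,
$$E_j(t, u_0) \le E_j(t_0, u_0) e^{-2j\nu\lambda_1 t} + C_j \int_{t_0}^{t} E_{j-1}(s, u_0)\, e^{-2j\nu\lambda_1 (t-s)}\,ds,$$
where $C_j = 2j(j-1)\sigma_{\max}^2 + jE_0$. Furthermore, for $s < t$,
$$E_p(t, u_0) \le E_p^{\max}(s, u_0) \overset{\text{def}}{=} \sum_{j=1}^{p} C_j' E_j(s, u_0) + C_0', \qquad (8)$$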
where the $C_j'$ are constants depending only on $j$ and the $\sigma_k$'s. We also see that asymptotically
$$E_p(t) \le \left( \frac{\max(E_0, \sigma_{\max}^2)}{\nu\lambda_1} \right)^{p} (p-1)!. \qquad (9)$$

Proof of Lemmas 1 and 2. We begin by deriving (7), leaving the problem of showing the local martingale term is a true martingale until after we have derived the estimates for the expectations. In fact, these bounds on the expectations will be used to bound the quadratic variation process of the local martingale.

Applying Itô’s formula to $u \mapsto |u|_{L^2}^{2p}$, one obtains (7). For $p = 1$, this is identical to the deterministic energy evolution equation except for the additional term with $E_0$. This term arises in Itô’s formula when the second functional derivative of $u \mapsto |u|_{L^2}^2$ is applied to the quadratic variation of $W(t)$. These somewhat formal manipulations can be understood as the limit of classical finite-dimensional stochastic calculus applied to the Galerkin approximations in Fourier space. All of the terms are independent of the order of the Galerkin approximation so the limit can be taken.

In the rest of the section, we will seem to cover the same ground three times. On each pass, we will glean a little more information. It is probably worthwhile to mention the difficulties that necessitate such repetition. From the existence and uniqueness theory, we know only that $|u(t)|_{L^2}^2$ is finite with probability one. This puts $u(t)$ in one of the weakest classes for which the Itô stochastic integral is defined. Knowing that
$$\mathbb{P}\left\{ \int_0^t |u(s)|_{L^2}^{2p}\,ds < \infty \right\} = 1$$
allows one to define the stochastic integral
$$M_t \overset{\text{def}}{=} \int_0^t 2p|u(s)|_{L^2}^{2(p-1)} \langle u(s), dW(s)\rangle_{L^2}$$
but only as a local martingale. In particular, as the diligent referee correctly observed, this means that one does not know that $\mathbb{E}M_t = 0$. This requires that $\mathbb{E}\int_0^t |u(s)|_{L^2}^{2p}\,ds$ is finite. This is not given by the existence and uniqueness theorem. Hence, we must establish this before we can make any conclusion which requires $\mathbb{E}M_t = 0$.
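We will now show that
$$\mathbb{E}|u(t)|_{L^2}^{2p} \le \mathbb{E}|u(0)|_{L^2}^{2p} - 2p\nu\,\mathbb{E}\int_0^t |u(s)|_{L^2}^{2(p-1)} |\Lambda u(s)|_{L^2}^2\,ds + \mathbb{E}\int_0^t 2p(p-1)|u(s)|_{L^2}^{2(p-2)} \sum_k |u_k(s)|^2 |\sigma_k|^2\,ds + \mathbb{E}\int_0^t p|u(s)|_{L^2}^{2(p-1)} E_0\,ds. \qquad (10)$$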
Since $M_t$ is a local martingale there exists a sequence of stopping times $\{T_n\}$, with $T_n \to \infty$ as $n \to \infty$, that reduces $M_t$, that is, makes $M_{t \wedge T_n}$ a bounded martingale. For $t < T_n$, $M_{t \wedge T_n}$ follows the evolution of $M_t$. At the time $T_n$, it “stops”. For all future times it takes the value $M_{T_n}$. Since $M_{t \wedge T_n}$ is a bounded martingale, the Optional Stopping Time Theorem implies that $\mathbb{E}M_{t \wedge T_n}$ equals 0 (see [DM82, Dur96]). We denote by $f_n(t)$ the expression
$$|u(0)|_{L^2}^{2p} + \int_0^t 2p(p-1)|u(s)|_{L^2}^{2(p-2)} \sum_k |u_k(s)|^2 |\sigma_k|^2\,ds + \int_0^t p|u(s)|_{L^2}^{2(p-1)} E_0\,ds + M_{t \wedge T_n}. \qquad (11)$$
These are simply the positive drift terms from the right-hand side of (7) written in integral form, with the local martingale $M_t$ replaced by the stopped martingale $M_{t \wedge T_n}$. Because $M_{t \wedge T_n}$ is a bounded martingale and hence has expected value zero, as already observed, we see that
$$\mathbb{E}f_n - \nu\,\mathbb{E}\int_0^t 2p|u(s)|_{L^2}^{2(p-1)} |\Lambda u(s)|_{L^2}^2\,ds$$
is the desired right-hand side from (10). Next, rearranging (7), we observe that
$$0 \le |u(t)|_{L^2}^{2p} + 2p\nu \int_0^t |u(s)|_{L^2}^{2(p-1)} |\Lambda u(s)|_{L^2}^2\,ds = f_n(t)$$
for $t \le T_n$. This shows that $f_n(t)$ is non-negative for $t \le T_n$. We intend to use Fatou’s lemma; hence, we need to show that $f_n(t)$ is non-negative for all $t$. In fact, we will see that for $t > T_n$, $f_n(t) \ge f_n(T_n)$. This can be seen by using (11) to write $f_n(t) - f_n(T_n)$. When $t \ge T_n$, we have
$$f_n(t) - f_n(T_n) = \int_{T_n}^{t} 2p(p-1)|u(s)|_{L^2}^{2(p-2)} \sum_k |u_k(s)|^2 |\sigma_k|^2\,ds + \int_{T_n}^{t} p|u(s)|_{L^2}^{2(p-1)} E_0\,ds. \qquad (12)$$
Since each integral on the right-hand side is the integral of a non-negative quantity, it is clearly non-negative. Putting all of this together shows that $f_n(t)$ is non-negative for all $t$, which allows us to apply Fatou’s lemma. Doing so gives
$$\mathbb{E}|u(t)|_{L^2}^{2p} + \nu\,\mathbb{E}\int_0^t 2p|u(s)|_{L^2}^{2(p-1)} |\Lambda u(s)|_{L^2}^2\,ds = \mathbb{E}\lim_{n \to \infty} f_n \le \lim_{n \to \infty} \mathbb{E}f_n, \qquad (13)$$
which proves (10).
Next, we complete Lemma 2 by constructing the bounds in (8). Pulling $\sigma_{\max}^2$ out and using the Poincaré inequality once gives
$$\frac{d}{dt} E_p \le -2\nu p\lambda_1 E_p + \left[ 2p(p-1)\sigma_{\max}^2 + pE_0 \right] E_{p-1}.$$
Integration of this differential inequality gives the desired bounds on $E_p(t)$. Lastly, we obtain uniform bounds on each moment in terms of the values of moments of lesser or equal order evaluated at an earlier moment of time. For $t > s$,
$$E_1(t) \le \max\left\{ \frac{E_0}{2\nu\lambda_1}, E_1(s) \right\} \le \frac{E_0}{2\nu\lambda_1} + E_1(s) \overset{\text{def}}{=} E_1^{\max}(s),$$
$$E_p(t) \le E_p(s) + C_p E_{p-1}^{\max}(s) \overset{\text{def}}{=} E_p^{\max}(s).$$
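For $p = 1$ the differential inequality above reduces to $\frac{d}{dt}E_1 \le -2\nu\lambda_1 E_1 + E_0$, since the zeroth moment equals one. The following sketch integrates the extremal case (equality) with an explicit Euler scheme and compares it with the closed form of Lemma 2. It is only an illustration: the function names, step size, and parameter values are hypothetical, and the exponential is written with $t - t_0$ so that the formula is exact for a general starting time.

```python
import math


def e1_closed_form(t, t0, e1_t0, E0, nu, lambda1):
    """Solution of dE1/dt = -2 nu lambda1 E1 + E0 with E1(t0) = e1_t0."""
    fixed_point = E0 / (2.0 * nu * lambda1)
    return fixed_point + math.exp(-2.0 * nu * lambda1 * (t - t0)) * (e1_t0 - fixed_point)


def e1_euler(t, t0, e1_t0, E0, nu, lambda1, dt=1e-4):
    """Explicit Euler integration of the same linear ODE (the worst case of the inequality)."""
    steps = int(round((t - t0) / dt))
    e1 = e1_t0
    for _ in range(steps):
        e1 += dt * (-2.0 * nu * lambda1 * e1 + E0)
    return e1


# Hypothetical parameters: E0 = nu = lambda1 = 1, so the fixed point is E0 / (2 nu lambda1) = 0.5.
for start in (3.0, 0.1):
    exact = e1_closed_form(2.0, 0.0, start, 1.0, 1.0, 1.0)
    approx = e1_euler(2.0, 0.0, start, 1.0, 1.0, 1.0)
    print(start, exact, approx, max(0.5, start))
```

In both runs the moment relaxes monotonically toward $E_0/(2\nu\lambda_1)$, which is why it never exceeds $\max\{E_0/(2\nu\lambda_1), E_1(s)\}$ at later times.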
Notice that $E_p^{\max}(s)$ is just a linear combination of the moments of order less than or equal to $p$ evaluated at the time $s$. In other words, there exist constants $C_p'$ depending only on $p$ and $\{\sigma_k\}$ so that $E_p^{\max}(s) = \sum_{j=1}^{p} C_j' E_j(s) + \frac{E_0}{2\nu\lambda_1}$.

We now examine if $M_t$ is a true martingale or simply a local martingale. By Corollary 3 on p. 66 of [Pro90], it is sufficient to show that the quadratic variation $[M, M]_t$ has finite expectation for all finite times:
$$[M, M]_t = \int_0^t 2p|u(s)|_{L^2}^{2(p-1)} \sum_k |u_k(s)|^2 |\sigma_k|^2\,ds \le \sigma_{\max}^2 \int_0^t |u(s)|_{L^2}^{2p}\,ds.$$
Hence,
$$\mathbb{E}[M, M]_t \le \sigma_{\max}^2 \int_0^t E_p(s)\,ds, \qquad (14)$$
which is finite by the bounds proved in Lemma 2. This completes Lemma 1. □

Before moving on, we mention that completely analogous estimates are possible for $|\nabla u|_{L^2}$ and, in a slightly different form, for higher Sobolev norms. See [Mat98].

3. The Contraction in Phase Space

Condition (A) makes the system strongly dissipative. In the deterministic setting, it produces a system with a globally attracting fixed point. Our understanding of the dissipative nature will come from examining the evolution of the difference between two solutions starting from different initial data, $u_0$ and $\tilde{u}_0$, but subjected to the same instance of noise. We define $\rho(t; t_0, u_0, \tilde{u}_0) = u(t; t_0, u_0) - u(t; t_0, \tilde{u}_0)$. At times, we will use the shorthand $\tilde{u}(t)$ for $u(t; t_0, \tilde{u}_0)$ and $u(t)$ for $u(t; t_0, u_0)$. From Eq. (2), we see that $\rho(t)$ satisfies the following partial differential equation:
$$\frac{d\rho}{dt} = -\nu\Lambda^2\rho + B(\tilde{u}, \tilde{u}) - B(u, u) = -\nu\Lambda^2\rho + B(u - \rho, u - \rho) - B(u, u) = -\nu\Lambda^2\rho - [B(u, \rho) + B(\rho, u) + B(\rho, \rho)]. \qquad (15)$$
This PDE is classical in so far as there are no Itô integrals, only random coefficients. In the following manipulations, we will not make specific reference to the regularity of the solutions. Implicitly, we do the intermediate calculations with finite Galerkin approximations which are $C^\infty$. The quantities presented in the final estimates will be well defined in the limit as the order of the Galerkin approximation is taken to $\infty$. Thus, the final conclusions will hold for the actual solution and not just its Galerkin approximations. Taking the inner product of (15) with $\rho$ and remembering that $\langle B(v, w), w\rangle_{L^2} = 0$ for general $v$ and $w$, we arrive at
$$\frac{1}{2}\frac{d}{dt} |\rho(t; t_0)|_{L^2}^2 = -\nu|\Lambda\rho|_{L^2}^2 - \langle B(\rho, u), \rho\rangle_{L^2}. \qquad (16)$$
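Next recall the estimate on $|\langle B(v, w), u\rangle_{L^2}|$ from the introduction. We use this inequality, followed by the application of $ab \le a^2/2 + b^2/2$, and lastly the Poincaré inequality to obtain
$$\frac{1}{2}\frac{d}{dt} |\rho(t; t_0)|_{L^2}^2 \le -\nu|\Lambda\rho|_{L^2}^2 + \gamma |\rho|_{L^2} |\Lambda\rho|_{L^2} |\Lambda u|_{L^2} \le -\frac{\nu}{2} |\Lambda\rho|_{L^2}^2 + \frac{\gamma}{2\nu} |\rho|_{L^2}^2 |\Lambda u|_{L^2}^2 \le -\left( \frac{\nu\lambda_1}{2} - \frac{\gamma}{2\nu} |\Lambda u|_{L^2}^2 \right) |\rho|_{L^2}^2. \qquad (17)$$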
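Thus by Gronwall’s lemma, we arrive at the estimate we need:
$$|\rho(t; t_0, u_0, \tilde{u}_0)|_{L^2}^2 \le e^{-2(t-t_0)\Gamma(t-t_0; t_0, u_0)} |\rho_0|_{L^2}^2, \qquad (18)$$
where
$$\Gamma(\tau; t_0, u_0) = \nu\lambda_1 - \frac{\gamma}{\nu}\left( \frac{1}{\tau} \int_{t_0}^{t_0+\tau} |\Lambda u(s)|_{L^2}^2\,ds \right).$$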
The following lemma, which will be proved in a later section, gives the needed control on the process $\Gamma$.

Lemma 3. Let $u_0$ be an $L^2$-valued random variable, measurable with respect to $\mathcal{F}_{t_0}$, with $\mathbb{E}|u_0|_{L^2}^{2p}$ finite for some fixed $p > 0$. Then for any fixed $\epsilon > 0$, there exists a random time $s_0(\epsilon, t_0, u_0)$ such that for $n \in \mathbb{Z}_+$,
$$n > s_0 \implies |\Gamma(n; t_0, u_0) - \delta_0| < \epsilon. \qquad (19)$$
Also,
1. if $p > 1$ then $s_0$ is finite almost surely and
2. if $q \in (0, p - 1)$ then $\mathbb{E}s_0^q$ is finite.
(The definition of $\delta_0$ was given at the beginning of Sect. 1.)
4. Proofs of Theorems 1, 2, 3 and Corollary 1

Proof of Theorem 1. Most of the work of this theorem is contained in the proof of Lemma 3. We set $\epsilon = \delta_0 - \delta$. By Lemma 3, there exists a random time $s_0(\epsilon)$ so that the condition in (19) holds. This implies that for all times $\tau > s_0(\epsilon)$, we have $\Gamma(\tau; t_0, u_0) > \delta$. Thus if we take $\overrightarrow{\tau}(\delta, t_0, u_0) = s_0(\epsilon)$, the estimate in (18) becomes the estimate given in the theorem. By Lemma 3, $\overrightarrow{\tau}(\delta, t_0, u_0)$ has the desired moments. □

Proof of Theorem 2. Without loss of generality we will take $t = 0$. The letter $n$, with all of its various ornamentations, will always be $\alpha\mathbb{Z}$-valued. Similarly, $m$ will always be an integer. For $m \ge 0$, let $n_0(\omega) = \alpha T_{\text{bound}}(\{u_0(\alpha m)\}, \delta^2|\alpha m|)$ and $\tilde{n}_0(\omega) = \alpha T_{\text{bound}}(\{\tilde{u}_0(\alpha m)\}, \delta^2|\alpha m|)$. (The rescaling by $\alpha$ is necessary because the Bounding Lemma is written for sequences indexed by integers.) The definition of $T_{\text{bound}}(\cdot, \cdot)$ is given at the start of the appendix; in words, it is the first integer moment of time such that the first sequence is smaller than the second sequence for all subsequent integer times. It is the nearest integer moment of time when the second sequence overtakes the first. Set $\delta_1 = \delta_0 - \delta$ and $t_0(n) = s_0(\delta_1, n, u_0(n))$, where $s_0$ was defined in Lemma 3. Hence by Lemma 3, $\mathbb{E}t_0(n)^q < \infty$ for $q \in (0, p-1)$. Now set $n_1^* = T_{\text{bound}}(\{t_0(n)\}, |n|)$. By the first corollary to the Bounding Lemma contained in the appendix (Lemma 5), $\mathbb{E}(n_1^*)^q < \infty$ for $q \in (0, p - 2)$. Define $\overleftarrow{n} = \max(n_0, \tilde{n}_0, n_1^*)$. Because $\max(X, Y)^p \le (X + Y)^p \le C_p(X^p + Y^p)$, $\overleftarrow{n}$ has all the same moments as $n_0$, $\tilde{n}_0$ and $n_1^*$. Thus, $\mathbb{E}\overleftarrow{n}^{\,q} < \infty$ for $q \in (0, p - 2)$. Putting everything together and using the estimate (18), we have
$$|u(0; n, u_0(n)) - u(0; n, \tilde{u}_0(n))|_{L^2}^2 \le |u_0(n) - \tilde{u}_0(n)|_{L^2}^2\, e^{-\delta|n|} \le \delta^2 |n| e^{-\delta|n|}$$
for $n < n^*$. □

Proof of Corollary 1. Without loss of generality, we take $t_1 = 0$. As in the previous proof, the letter $n$ will always be $\alpha\mathbb{Z}$-valued. Similarly, $m$ will always be an integer. Define $u_0(n) = u(-n; -n - \alpha, 0)$ for $n \in \alpha\mathbb{Z}$ with $n > 0$. The sequence $u_0(n)$ forms a stationary sequence of random variables. By Lemma 2, all of the moments of $|u_0(n)|_{L^2}$ are uniformly bounded in $n$ because the initial conditions are deterministic. Now use Theorem 2 to compare the solution starting from $u_0(n)$ at time $-n$ and the solution starting from zero at time $-n$. Theorem 2 says that there exists an $\alpha\mathbb{Z}$-valued random variable $n^*$, with all moments finite, such that for $n'' > n' > n^* > 0$ and $\tau > 0$,
$$|u(\tau; -n', 0) - u(\tau; -n'', 0)|_{L^2} = \Big| \sum_{j = n'/\alpha}^{n''/\alpha - 1} \big[ u(\tau; -\alpha j, 0) - u(\tau; -\alpha(j+1), 0) \big] \Big|_{L^2} \le \sum_{j = n'/\alpha}^{n''/\alpha - 1} |u(\tau; -\alpha j, 0) - u(\tau; -\alpha(j+1), 0)|_{L^2} \le \epsilon e^{-\delta\tau}.$$
□
For $m < 0$, let $n_0^*(\omega) = \alpha T_{\text{bound}}(\{u_0(\alpha m)\}, \delta^2|\alpha m|)$. As in the proof of Theorem 2, the rescaling by $\alpha$ is necessary because the Bounding Lemma is written for sequences indexed by integers. Set $\delta_1 = \delta_0 - \delta$ and $t_0(n) = s_0(\delta_1, n, u_0(n), 0)$, where $s_0$ was defined in Lemma 3. Observe that it is also a stationary sequence of random variables. By Lemma 2, $\mathbb{E}|u_0(n)|_{L^2}^{2p}$ is finite for all $p \ge 1$ and $n \in \alpha\mathbb{Z}$. Hence by Lemma 3, all moments of $t_0(n)$ are finite. By the first corollary to the Bounding Lemma contained in the appendix (Lemma 5), $n_1^* = T_{\text{bound}}(\{t_0(n)\}, |n|)$ has all moments finite. Define $n^* = \min(n_0^*, n_1^*)$. Because $\max(X, Y)^p \le (X + Y)^p \le C_p(X^p + Y^p)$, $n^*$ has all moments finite since $n_0^*$ and $n_1^*$ do. Putting everything together and using the estimate (18), we have
$$|u(0; n, 0) - u(0; n - \alpha, 0)|_{L^2}^2 = |u(0; n, 0) - u(0; n, u_0(n))|_{L^2}^2 \le |u_0(n)|_{L^2}^2\, e^{-\delta|n|} \le \delta^2 |n| e^{-\delta|n|}$$
for n < n∗ . And hence, for n0 ,n00 < −n∗ < 0 < τ , we have the needed estimate X |u(0; n, 0) − u(0; n − 1, 0)|2L2 |u(τ ; n0 , 0) − u(τ ; n00 , 0)|2L2 ≤ e−δτ ≤ e
n∈α Z,n 1, then Tbound ({Mn }, n) is finite almost surely. 2. ETbound ({Mn }, n)q is finite for q ∈ (0, p − 1). Proof. We apply the Bounding lemma (Lemma 5 from the appendix) using the estimate given in (21). The condition to be almost surely finite translates to 2p > 1 + p which implies p > 1. The condition on the moments translates to 2p > 1 + p + q which implies p − q > 1. u t We are now in a position to prove Lemma 3. Proof of Lemma 3. Recalling (20) and the definition of 0, we have 1 2 |u(s)|L2 + M(τ ; t0 , u0 ) . 0(τ ; t0 , u0 ) ≤ νλ1 − γ 2E0 + τ Let Mn (t0 ) = supn−1<s s0 H⇒ |0(n; t0 , u0 ) − δ0 |
$T_{\text{bound}}(\{X_n\}, f, \omega) \implies |X_m(\omega)| < f(m)$. For a single random variable $X$ define $T_{\text{bound}}(X, f, \omega) = \sup\{n : |X| > f(n)\}$.

Lemma 5 (Bounding Lemma). Assume that
$$\mathbb{P}(|X_n| \ge \epsilon n^\delta) \le \frac{\mathbb{E}|X_n|^p}{\epsilon^p n^{p\delta}} \le \frac{C}{\epsilon^p n^{p\delta - r}}$$
for some $\epsilon, \delta, p, C > 0$ and $r \ge 0$.
1. If $p\delta > 1 + r$ then $T_{\text{bound}}(\{X_n\}, \epsilon n^\delta) < \infty$ a.s.
2. $\mathbb{E}[T_{\text{bound}}(\{X_n\}, \epsilon n^\delta)]^q$ is finite for $q \in (0, p\delta - (1 + r))$.

Proof. In light of Chebyshev’s inequality, the sum $\sum_n \mathbb{P}(|X_n| > \epsilon n^\delta)$ is finite. Thus by the first Borel–Cantelli Lemma, there exists a random variable $n^*(\omega)$, which is almost surely finite, such that $m > n^* \Rightarrow |X_m| < \epsilon m^\delta$ a.s. To prove the second statement, we observe that
$$\mathbb{E}(n^*)^q = \sum_{n=1}^{\infty} n^q\, \mathbb{P}(n^* = n) \le \sum_{n=1}^{\infty} n^q\, \mathbb{P}(|X_n| \ge \epsilon n^\delta) \qquad (23)$$
$$\le \sum_{n=1}^{\infty} \frac{C}{\epsilon^p n^{p\delta - (r+q)}}. \qquad (24)$$
The first estimate hinges on the fact that $n^*$ was the smallest integer such that for all greater integers $n$, $|X_n| < \epsilon n^\delta$. To conclude, note that the final sum converges if $p\delta - (r + q) > 1$. □

The following two corollaries are a specialization of the above lemma.
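The two ingredients of the Bounding Lemma, the last-exceedance time and the series in (24), can be sketched numerically. This is a hedged illustration only: `t_bound` is a hypothetical finite-horizon analogue of $T_{\text{bound}}$ (the true quantity concerns an infinite sequence), and `partial_sum` merely evaluates a partial sum of the series in (24) for sample parameters chosen here.

```python
def t_bound(xs, f):
    """Last index n (1-based) at which |xs[n-1]| >= f(n), or 0 if there is none.

    Finite-horizon stand-in for T_bound: for every later index m in the sample,
    |X_m| < f(m) holds.
    """
    last = 0
    for n, x in enumerate(xs, start=1):
        if abs(x) >= f(n):
            last = n
    return last


def partial_sum(C, eps, p, delta, r, q, n_terms):
    """Partial sum of C / (eps^p * n^(p*delta - (r + q))), the series appearing in (24)."""
    exponent = p * delta - (r + q)
    return sum(C / (eps ** p * n ** exponent) for n in range(1, n_terms + 1))


sample = [5.0, 0.1, 3.0, 0.2, 0.1, 0.1]                       # hypothetical realization of X_n
print(t_bound(sample, lambda n: n ** 0.5))                    # compares against f(n) = n^(1/2)
print(partial_sum(1.0, 1.0, 4.0, 1.0, 0.0, 2.0, 10000))       # exponent 2: partial sums stay bounded
print(partial_sum(1.0, 1.0, 4.0, 1.0, 0.0, 3.0, 10000))       # exponent 1: harmonic growth
```

With $p\delta = 4$ and $r = 0$, the lemma allows moments $q < 3$; the second call ($q = 2$) sits inside that range while the third ($q = 3$) is the borderline case where the series diverges logarithmically.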
Corollary 3. Given a family of random variables $\{Y_n\}$ for which $\mathbb{E}|Y_n|^p \le C < \infty$ for all $n$ (in particular, $\{Y_n\}$ could be a stationary sequence with $\mathbb{E}|Y_n|^p$ finite):
1. If $\delta > \frac{1}{p}$, then $T_{\text{bound}}(\{Y_n\}, \epsilon n^\delta)$ is finite almost surely.
2. Let $q > 0$. If $\delta > \frac{q+1}{p}$, then $\mathbb{E}\,T_{\text{bound}}(\{Y_n\}, \epsilon n^\delta)^q$ is finite.

Proof. By Chebyshev’s inequality and the bound on $\mathbb{E}|Y_n|^p$, $\mathbb{P}(|Y_n| > \epsilon n^\delta) \le \mathbb{E}|Y_n|^p / (\epsilon^p n^{p\delta}) \le C / (\epsilon^p n^{p\delta})$. This estimate satisfies the conditions of the above lemma with $r = 0$. The conclusion follows from the lemma. □

Corollary 4. Let $X$ be a random variable such that $\mathbb{E}|X|^p$ is finite.
1. If $\delta > \frac{1}{p}$, then $T_{\text{bound}}(X, \epsilon n^\delta)$ is finite almost surely.
2. Let $q > 0$. If $\delta > \frac{q+1}{p}$, then $\mathbb{E}\,T_{\text{bound}}(X, \epsilon n^\delta)^q$ is finite.
Proof. This is just a specialization of the above corollary. □

References

[CF88] Constantin, Peter and Foiaş, Ciprian: Navier–Stokes Equations. Chicago: University of Chicago Press, 1988
[CF94] Crauel, Hans and Flandoli, Franco: Attractors for random dynamical systems. Probability Theory and Related Fields 100, 365–393 (1994)
[DM82] Dellacherie, Claude and Meyer, Paul-André: Probabilities and Potential. B: Theory of Martingales. Volume 72 of North-Holland Mathematics Studies. Amsterdam–New York: North-Holland Publishing Co., 1982
[DPZ92] Da Prato, Giuseppe and Zabczyk, Jerzy: Stochastic Equations in Infinite Dimensions. Cambridge: Cambridge University Press, 1992
[DPZ96] Da Prato, Giuseppe and Zabczyk, Jerzy: Ergodicity for Infinite Dimensional Systems. Cambridge: Cambridge University Press, 1996
[Dur96] Durrett, Richard: Stochastic Calculus: A Practical Introduction. CRC Press, 1996
[EKMS98] E, W., Khanin, K., Mazel, A. and Sinai, Ya.: Burgers Equation with Random Forcing. Submitted to The Annals of Mathematics, Princeton University Press, 1998
[EFNT94] Eden, A., Foias, C., Nicolaenko, B. and Temam, R.: Exponential Attractors for Dissipative Evolution Equations. Research in Applied Mathematics. John Wiley and Sons and Masson, 1994
[Fer97] Ferrario, Benedetta: Ergodic results for stochastic Navier–Stokes equation. Stochastics and Stochastics Reports 60 (3–4), 271–288 (1997)
[FG95] Flandoli, Franco and Gatarek, Dariusz: Martingale and stationary solutions for stochastic Navier–Stokes equations. Probability Theory and Related Fields 102, 367–391 (1995)
[Fla94] Flandoli, Franco: Dissipativity and invariant measures for stochastic Navier–Stokes equations. NoDEA 1, 403–426 (1994)
[FM95] Flandoli, Franco and Maslowski, B.: Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Commun. Math. Phys. 171, 119–141 (1995)
[Kun90] Kunita, Hiroshi: Stochastic Differential Equations. Cambridge: Cambridge University Press, 1990
[Mat98] Mattingly, Jonathan C.: The Stochastically Forced Navier–Stokes Equations: Energy Estimates and Phase Space Contraction. PhD thesis, Princeton University, 1998
[Pro90] Protter, Philip: Stochastic Integration and Differential Equations: A New Approach. Berlin–Heidelberg–New York: Springer-Verlag, 1990
[Sch96] Schmalfuß, Björn: A random fixed point theorem based on Lyapunov exponents. Random & Computational Dynamics 4, 257–268 (1996)
[Tem79] Temam, Roger: Navier–Stokes Equations: Theory and Numerical Analysis. Volume 2 of Studies in Mathematics and its Applications. Amsterdam–New York: North-Holland Publishing Co., revised edition, 1979
[Tem88] Temam, Roger: Infinite Dimensional Dynamical Systems in Mechanics and Physics. New York: Springer-Verlag, 1988

Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 289 – 335 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Effective Interactions Due to Quantum Fluctuations

Roman Kotecký 1,2,⋆, Daniel Ueltschi 3,⋆⋆

1 Center for Theoretical Study, Charles University, Jilská 1, 110 00 Praha 1, Czech Republic
2 Department of Theoretical Physics, Charles University, V Holešovičkách 2, 180 00 Praha 8, Czech Republic. E-mail: [email protected]
3 Institut de Physique Théorique, EPF Lausanne, CH-1015 Lausanne, Switzerland
Received: 28 April 1998 / Accepted: 19 March 1999
Abstract: A class of quantum lattice models is considered, with Hamiltonians consisting of a classical (diagonal) part and a small off-diagonal part (e.g. hopping terms). In some cases when the classical part has an infinite degeneracy of ground states, the quantum perturbation may stabilize some of them. The mechanism of this stabilization stems from an effective potential created by the quantum perturbation. Conditions are found when this strategy can be rigorously controlled and the low temperature phase diagram of the full quantum model can be proven to be a small deformation of the zero temperature phase diagram of the classical part with the effective potential added. As illustrations we discuss the asymmetric Hubbard model and the Bose–Hubbard model.

Contents
1. Introduction
2. Assumptions and Statements
2.1 Classical Hamiltonian with quantum perturbation
2.2 The effective potential
2.3 Stability of the dominant states
2.4 Characterization of stable phases
2.5 Phase diagram
3. Examples
3.1 The asymmetric Hubbard model
3.2 The Bose–Hubbard model
4. Contour Representation of a Quantum Model
5. Exponential Decay of the Weight of the Contours
6. Expectation Values of Local Observables and Construction of Pure States
A. General Expression for the Effective Potential

⋆ Partly supported by the grants GAČR 202/96/0731 and GAUK 96/272.
⋆⋆ Present address: Department of Mathematics, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854-8019, USA. E-mail: [email protected]
1. Introduction

The physics of a large number of quantum particles at equilibrium is very interesting and difficult at the same time. Interesting, because it treats such macroscopic phenomena as magnetization, crystallisation, superfluidity or superconductivity. And difficult, because their study has to combine Quantum Mechanics and Statistical Physics. A natural approach is to decrease the difficulties arising from this combination by starting from only one aspect. Thus one can use only Quantum Mechanics and treat the particles first as independent, trying next to add small interactions. In the present paper we are concerned with the other approach, namely, to start with a model treated by Classical Statistical Physics, adding next a small quantum perturbation. Another simplification is to consider lattice systems (going back to a physical justification for the modeling process, we can invoke applications to condensed matter physics). Quantum systems studied here have Hamiltonians consisting of two terms. The first term is a classical interaction between particles; formally, this operator is a “function” of the position operators of the particles and it is diagonal with respect to the corresponding basis in occupation numbers. The second term is an off-diagonal operator that we suppose to be small with respect to the interaction. A typical example of this is a hopping matrix. The aim of the paper is to show that a new effective interaction appears that is due to the combination of the potential and the kinetic term. An explicit formula is computed, and sufficient conditions are given in order that the low temperature behaviour is controlled by the sum of the original diagonal interaction and the effective potential. To be more precise, it is rigorously shown that the phase diagram of the original quantum model is only a small perturbation of the phase diagram of a classical lattice model with the effective interaction.
Thus, we will start by recalling some standard ideas of Classical Statistical Mechanics of lattice systems. The Peierls argument for proving the occurrence of a first order phase transition in the Ising model [Pei,Dob,Gri] marks the beginning of the perturbative studies of the low temperature regimes of classical lattice models. Partition functions and expectation values of observables may be expanded with respect to the excitations on top of the ground states, interpreting the excitations in geometric terms as contours. These ideas and methods are referred to as the Pirogov–Sinai theory; they were first introduced in [PS,Sin] and later further extended [Zah,BI,BS]. The intuitive picture is that a low temperature phase is essentially a ground state configuration with small excitations. A phase is stable whenever it is improbable to install a large domain with another phase inside. For such an insertion one has to pay on its boundary, which is excited (two phases are separated by excitations), but, on the other hand, one may gain on its volume if its metastable free energy (its ground energy minus the contribution of small thermal fluctuations) is smaller than that of the external phase. It is important to take into account the fluctuations since they can play a role in determining which phase is dominant. A standard example here is the Blume–Capel model with an external field slightly favouring the “+1” phase; at low temperatures, the “0” phase may still be selected because it has more low energy excitations (the theory of such dominant states chosen by thermal fluctuations may be found in [BS]).
Effective Interactions Due to Quantum Fluctuations
The partition function of a quantum system, $\mathrm{Tr}\, e^{-\beta H}$, may be expressed using the Duhamel expansion (or Trotter formula), yielding a classical contour model in a space with one more (continuous) dimension. If the corresponding classical model (the diagonal part only) has stable low temperature phases, and if the off-diagonal terms of the Hamiltonian are small, the contours have low probability of occurrence and it is possible to extend the Peierls argument to quantum models [Gin]. More generally, one can formulate a “Quantum Pirogov–Sinai theory” [BKU1,DFF1], in order to establish that (i) low temperature phases are very close to ground states of the diagonal interaction (more precisely: the density matrix $\frac{1}{Z} e^{-\beta H}$ is close to the projection operator $|g\rangle\langle g|$, where $|g\rangle$ is the ground state of the diagonal interaction only) and (ii) low temperature phase diagrams are small deformations of zero temperature phase diagrams of the interactions.
So far we have only discussed the case when the effect of the quantum perturbation is small, and the features of the phases are due to the classical interaction between the particles. It may happen, however, that the classical interaction alone is not sufficient to choose the low temperature behaviour. This is the case in the two models we introduce now and use later for illustration of our general approach.
• The asymmetric Hubbard model. It describes hopping spin-$\frac12$ particles on a lattice $\Lambda \subset \mathbb{Z}^\nu$. A basis of its Hilbert space is indexed by classical configurations $n \in \{0, \uparrow, \downarrow, 2\}^\Lambda$, and the Hamiltonian is
\[ H = - \sum_{\|x-y\|_2 = 1} \sum_{\sigma = \uparrow,\downarrow} t_\sigma\, c^\dagger_{x\sigma} c_{y\sigma} + U \sum_x n_{x\uparrow} n_{x\downarrow} - \mu \sum_x (n_{x\uparrow} + n_{x\downarrow}) \tag{1.1} \]
(the hopping parameter $t_\sigma$ depends on the spin of the particle). In the atomic limit $t_\uparrow = t_\downarrow = 0$ the ground states are all the configurations with exactly one particle at each site. The degeneracy equals $2^{|\Lambda|}$, which means that the model has nonvanishing residual entropy at zero temperature. The case $t_\uparrow \neq t_\downarrow = 0$ corresponds to the Falicov–Kimball model (see [GM]); in this case, spin-↓ electrons behave as classical particles. Here, we shall consider the strongly asymmetric Hubbard model, with $U \gg t_\uparrow \gg t_\downarrow$.
• The Bose–Hubbard model. We consider bosons moving on a lattice $\Lambda \subset \mathbb{Z}^2$. They interact through on-site, nearest neighbour and next nearest neighbour repulsive potentials. A basis of its Hilbert space is the set of all configurations $n \in \mathbb{N}^\Lambda$, and its Hamiltonian is
\[ H = -t \sum_{\|x-y\|_2 = 1} a^\dagger_x a_y + U_0 \sum_x (n_x^2 - n_x) + U_1 \sum_{\|x-y\|_2 = 1} n_x n_y + U_2 \sum_{\|x-y\|_2 = \sqrt{2}} n_x n_y - \mu \sum_x n_x. \tag{1.2} \]
For $U_0 > 4U_1 - 4U_2$ and $U_1 > 2U_2$, and if $0 < \mu < 8U_2$, the ground states of the potential part are those generated by $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, i.e. any configuration with alternately a ferromagnetic and an empty line is a ground state (and similarly in the other direction); see Fig. 2 in Sect. 3. The degeneracy is of the order $2^{\frac12 |\Lambda|^{\frac12}}$ (if $\Lambda$ is a square), so there is no residual entropy. Actually, we shall add to (1.2) a generalized hard-core condition that prevents more than $N$ bosons from being present at the same site; this condition has technical motivations, and does not change the physics of the model.
In these two situations, the smallest quantum fluctuations yield an effective interaction, and this interaction stabilizes phases displaying long-range order (there is neither superfluidity nor superconductivity).
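As an illustration only, the ground states of the potential part of (1.2) can be enumerated by brute force on a small torus. The sketch below assumes a 4 × 4 torus, the hard-core restriction to at most one boson per site (so that the $U_0$ term vanishes), the sample couplings $U_1 = 3$, $U_2 = 1$, $\mu = 4$, and the convention that the sums over $\|x-y\|_2 = 1$ and $\|x-y\|_2 = \sqrt{2}$ run over ordered pairs. None of these choices is taken from the paper, and all names in the code are hypothetical.

```python
from itertools import product

L = 4                   # side of the torus (assumption of this sketch)
U1, U2, MU = 3, 1, 4    # sample values with U1 > 2*U2 and 0 < MU < 8*U2

def site(x, y):
    """Index of the site (x, y) on the L x L torus."""
    return (x % L) * L + (y % L)

def mask(x, y, offsets):
    """Bitmask of the sites reached from (x, y) by the given offsets."""
    m = 0
    for dx, dy in offsets:
        m |= 1 << site(x + dx, y + dy)
    return m

NN = [(1, 0), (-1, 0), (0, 1), (0, -1)]      # nearest neighbours
NNN = [(1, 1), (1, -1), (-1, 1), (-1, -1)]   # next nearest neighbours

nn_mask = [0] * (L * L)
nnn_mask = [0] * (L * L)
for x, y in product(range(L), repeat=2):
    nn_mask[site(x, y)] = mask(x, y, NN)
    nnn_mask[site(x, y)] = mask(x, y, NNN)

def popcount(m):
    """Number of occupied sites in the bitmask m."""
    return bin(m).count("1")

def energy(cfg):
    """Potential energy of a hard-core configuration encoded as a bitmask."""
    e = 0
    for i in range(L * L):
        if (cfg >> i) & 1:
            # each occupied site sees its occupied neighbours (ordered pairs)
            e += U1 * popcount(cfg & nn_mask[i])
            e += U2 * popcount(cfg & nnn_mask[i])
            e -= MU
    return e

energies = [energy(cfg) for cfg in range(1 << (L * L))]
e_min = min(energies)
ground_states = [cfg for cfg, e in enumerate(energies) if e == e_min]
```

Under these assumptions every minimiser found has one particle per four sites, and the number of minimisers grows with the side of the torus rather than with its volume, in line with the absence of residual entropy stated above.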
R. Kotecký, D. Ueltschi
Besides the low temperature Gibbs states, the effective potential may have an influence in situations with interfaces; it has been shown in [DMN] that rigid 100 and 111 interfaces occur in the Falicov–Kimball model at low temperature. In the case where classical and quantum particles are mixed in one model, like the Falicov–Kimball model, a method using the Peierls argument was proposed by Kennedy and Lieb [KL]; it was extended in [LM] to situations that are not covered by the present paper, namely to cases of such mixed systems with continuous classical variables. Results very similar to ours have already been obtained by Datta, Fernández, Fröhlich and Rey-Bellet [DFFR]. Their approach is different, however. Starting from a Hamiltonian $H(\lambda) = H^{(0)} + \lambda V$, $H^{(0)}$ being a diagonal operator with infinitely many ground states, and $V$ the quantum perturbation, the idea is to choose an antisymmetric matrix $S = \lambda S^{(1)} + \lambda^2 S^{(2)}$ in such a way that the operator $H^{(2)}(\lambda) = e^{S} H(\lambda) e^{-S}$, expanded with the help of the Lie–Schwinger series, turns out to be diagonal, up to terms of order $\lambda^3$ or higher. If the diagonal part of $H^{(2)}$ has a finite number of ground states and the excitations cost strictly positive energy, it can be shown that the ground states are stable. It is possible to include higher orders in this perturbation scheme (see [DFFR]). In fact, our first intention was to study the stability of the results of [BS] with respect to a quantum perturbation, and we began the present study as a warm-up and the first simple step towards this goal. This simple step turned out, however, to be rather involved. Even though, in the end, the paper contains results similar to those of [DFFR], we think that the subject is important enough to justify an alternative approach, and that there are some advantages in an explicit formula for the effective potential and in sufficient conditions for it to control the low temperature behaviour, which may be useful in explicit applications.
The intuitive background of this paper owes much to the work of Bricmont and Slawny [BS] discussing the situation with infinite degeneracy of ground states, where only a finite number of ground states dominate as a result of thermal fluctuations, and to the paper of Messager and Miracle-Solé [MM], which was useful for understanding the structure of the quantum fluctuations. Having expanded the partition function $\mathrm{Tr}\, e^{-\beta H}$ using the Duhamel formula and having defined quantum contours as excitations with respect to a well chosen classical configuration, we identify the smallest quantum contours (that we call loops). Given a set of big quantum contours, we can replace the sum over sets of loops by an effective interaction acting on the quantum configurations without loops. This effective interaction is long-range, but decays exponentially quickly with respect to the distance. This allows, for a class of models, explicit control of the approximation given by the effective interaction, which makes it possible to prove rigorous statements about the behaviour of the original quantum model. An important model that does not fall into the class of models we can treat is the (symmetric) Hubbard model. Take $U = 1$ and $t_\uparrow = t_\downarrow = t$ in (1.1). Computing the effective potential stemming from one transition of a particle to a neighbouring site and back, we find an antiferromagnetic interaction of strength $t^2$. On the other hand, it is possible to make two transitions as a result of which the spins of nearest neighbours are interchanged,
\[ |n_x, n_y\rangle = |{\downarrow}, {\uparrow}\rangle = - c^\dagger_{x\downarrow} c_{y\downarrow} c^\dagger_{y\uparrow} c_{x\uparrow}\, |{\uparrow}, {\downarrow}\rangle. \]
It turns out that this brings the factor $t^2$, which is of the same order as the strength of the effective interaction. In this case we cannot ensure the stability of the phases selected by the effective potential – we would need a stronger effective interaction. Otherwise the system jumps easily from a configuration with one particle per site to another such configuration, i.e. from a classical ground state to another classical ground state. We call
quantum instability this property of the system. In the Hubbard model it is a manifestation of a continuous symmetry of the system, namely the rotation invariance.
In Sect. 2 the ideas discussed above are introduced with precise definitions. The effective potential is written down in Sect. 2.2 – actually, we restrict ourselves there to the lowest orders; the general formula is not that pleasant, and is therefore hidden in the appendix. The results of the paper are summarized in Theorems 2.2 (a characterization of stable pure phases) and 2.3 (the structure of the phase diagram); experts will recognize standard formulations of Pirogov–Sinai theory. Since our aim is to describe in a rigorous way the behaviour of a quantum system, some care must be given to the introduction of stable phases. We define them with the help of an external field perturbation of the state constructed with periodic boundary conditions. In Sect. 3 we apply the results to our two illustrative examples. The rest of the paper is devoted to the construction of a contour representation (Sect. 4), the proof of the exponential decay of the weights of the contours (Sect. 5), and, finally, the proofs of our claims with the help of contour expansions of the expectation values of local observables and the standard Pirogov–Sinai theory (Sect. 6).
Let us end this introduction by noting that, given a model which enters our setting, it is not a straightforward task to apply our theorems. One still has to separate the correct leading orders that determine the behaviour of the effective interaction. This situation has the utmost advantage that it should bring much more pleasure to users, since the most interesting part of the job remains to be done – to get intuition and to understand how the system behaves.
2. Assumptions and Statements
2.1. Classical Hamiltonian with quantum perturbation. Let $\mathbb{Z}^\nu$, $\nu \geq 2$, be the hypercubic lattice. We use $|x - y| := \|x - y\|_\infty$ to denote the distance between two sites $x, y \in \mathbb{Z}^\nu$. $\Omega$ is the finite state space of the system at site $x = 0$, $|\Omega| = S < \infty$. Our standard setting will be to consider the system on a finite torus $\Lambda = (\mathbb{Z}/L\mathbb{Z})^\nu$ (i.e. a finite hypercube with periodic boundary conditions). With a slight abuse of notation we identify $\Lambda$ with a subset of $\mathbb{Z}^\nu$ and always assume that it is sufficiently large (to surpass the range of considered finite range interactions). A classical configuration $n_\Lambda$ (occasionally we suppress the index and denote it $n$) is an element of $\Omega^\Lambda$. If $A \subset \Lambda$, the restriction of $n_\Lambda$ to $A$ is also denoted by $n_A$. $\mathcal{H}_\Lambda$ is the (finite-dimensional) Hilbert space spanned by the classical configurations, i.e. the set of vectors
\[ |v\rangle = \sum_{n_\Lambda} a_{n_\Lambda} |n_\Lambda\rangle, \qquad a_{n_\Lambda} \in \mathbb{C}, \]
with the scalar product
\[ \langle v | v' \rangle = \sum_{n_\Lambda} a^*_{n_\Lambda} a'_{n_\Lambda}. \]
Given two configurations $n_A \in \Omega^A$ and $n'_{A'} \in \Omega^{A'}$, with $A \cap A' = \emptyset$, it is convenient to define $n_A n'_{A'} \in \Omega^{A \cup A'}$ to be the configuration coinciding with $n_A$ on $A$ and with $n'_{A'}$ on $A'$. The Hamiltonian is a sum of two terms, $H_\Lambda = V_\Lambda + T_\Lambda$. The former is the quantum equivalent of a classical interaction, the latter is the quantum perturbation – the notation was chosen such because we have in mind models where $V$ represents the potential
energy of quantum particles, that is diagonal in the basis of occupation number operators, and $T$ represents the kinetic energy. It helps considerably to assume that $V_\Lambda$ is the quantum equivalent of a classical “block interaction”, that is, an interaction that has support on blocks of a given size in $\mathbb{Z}^\nu$. More precisely, let $R_0 \in \frac12 \mathbb{N}$ be the range of the interaction, and $U_0(x)$ be the $R_0$-neighbourhood of $x \in \mathbb{Z}^\nu$:
\[ U_0(x) = \begin{cases} \{y \in \mathbb{Z}^\nu : |y - x| \leq R_0\} & \text{if } R_0 \in \mathbb{N}, \\ \{y \in \mathbb{Z}^\nu : |y - (x_1 + \tfrac12, \dots, x_\nu + \tfrac12)| \leq R_0\} & \text{otherwise.} \end{cases} \tag{2.1} \]
When $R_0$ is half-integer, $U_0(x)$ is a block of integer size $2R_0 \times \dots \times 2R_0$ whose center is at distance $\frac12$ from $x$. Then we assume the following structure for $V_\Lambda$.
Assumption 1 (Classical Hamiltonian). There exists a classical periodic block interaction $\Phi$ of range $R_0$ (i.e. a collection of functions $\Phi_x : \Omega^{U_0(x)} \to \mathbb{R} \cup \{\infty\}$, $x \in \mathbb{Z}^\nu$) and period $\ell_0$ such that
\[ V_\Lambda |n_\Lambda\rangle = \sum_{x \in \Lambda} \Phi_x(n_{U_0(x)})\, |n_\Lambda\rangle \]
for any torus $\Lambda \subset \mathbb{Z}^\nu$ of side $L$ that is a multiple of $\ell_0$ and any $n_\Lambda \in \Omega^\Lambda$.
Let us suppose that a fixed collection of reference local configurations $G_0(x) \subset \Omega^{U_0(x)}$ is given, for all sites of $\mathbb{Z}^\nu$.¹ Let $G_A = \{g_A \in \Omega^A : g_{U_0(x)} \in G_0(x) \text{ for all } U_0(x) \subset A\}$, $A \subset \mathbb{Z}^\nu$, and $G = G_{\mathbb{Z}^\nu}$. Finally, we set
\[ \bar{A} = \bigcup_{U_0 \cap A \neq \emptyset} U_0 = \{y : \mathrm{dist}(y, A) \leq 2R_0\}. \tag{2.2} \]
We assume that the local energy gap of excitations is uniformly bounded from below, while the spread of local energies of reference states is not too big (Fig. 1).
Fig. 1. Illustration for Assumption 2. The image of $\Phi_x$ decomposes into two sets, the values on $G_0(x)$ and the values on $\Omega^{U_0(x)} \setminus G_0(x)$, separated by a gap $\Delta_0$; the spread of the set of small values is bounded by $\delta_0$
Assumption 2 (Energy gap for classical excitations). There exist constants $\Delta_0 > 0$ and $\delta_0 < \infty$ such that:
• For any $x \in \mathbb{Z}^\nu$ and any $n_{U_0(x)} \notin G_0(x)$, one has the lower bound
\[ \Phi_x(n_{U_0(x)}) - \max_{g_{U_0(x)} \in G_0(x)} \Phi_x(g_{U_0(x)}) > \Delta_0, \tag{2.3} \]
1 In some situations $G_0(x)$ is simply the set of all ground configurations of $\Phi_x$. When discussing the full phase diagram, however, we will typically extend the interaction $\Phi_x$ to a class of interactions by adding certain “external fields”. The set $G_0(x)$ then will actually play the role of ground states of the interaction with a particular value of external fields (the point of maximal coexistence of the ground state phase diagram).
• and,
\[ \max_{g_{U_0(x)},\, g'_{U_0(x)} \in G_0(x)} \bigl[ \Phi_x(g_{U_0(x)}) - \Phi_x(g'_{U_0(x)}) \bigr] \leq \delta_0. \tag{2.4} \]
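For later purpose, we note the following consequence of Assumption 2.
Property. Let $\Phi$ satisfy Assumption 2, $R$ be such that $R^\nu \leq \Delta_0/\delta_0$, and $A \subset \mathbb{Z}^\nu$ with $\mathrm{diam}\, A \leq R$. Then any pair of configurations $g_A \in G_A$ and $n_A \notin G_A$ satisfies the lower bound
\[ \sum_{x,\, U_0(x) \subset A} \bigl[ \Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \bigr] > R^{-\nu} \Delta_0. \tag{2.5} \]
Proof. Since $n_A \notin G_A$, there exists at least one site $x$, $U_0(x) \subset A$, such that $n_{U_0(x)} \notin G_0(x)$. From the assumption, this implies that
\[ \sum_{x,\, U_0(x) \subset A} \bigl[ \Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \bigr] > \Delta_0 - \sum_{y \in A,\, y \neq x} \delta_0. \]
Using $|A| \leq R^\nu$ and $R^\nu \delta_0 \leq \Delta_0$, the right-hand side is at least $\Delta_0 - (R^\nu - 1)\delta_0 \geq R^{-\nu} \Delta_0$, and we obtain the property. □
The quantum perturbation $T_\Lambda$ is supposed to be a periodic quantum interaction. Namely, $T_\Lambda$ is a sum of local operators $T_{\boldsymbol{A}}$, $T_\Lambda = \sum_{\boldsymbol{A}} T_{\boldsymbol{A}}$, where $T_{\boldsymbol{A}}$ has support $\mathrm{supp}\, \boldsymbol{A} = A \subset \Lambda$ and $\boldsymbol{A}$ is, in general, a pair $(A, \alpha)$, where the index $\alpha$ specifies $T_{\boldsymbol{A}}$ from a possible finite set of operators with the same support. We found it useful to label quantum interactions $T_{\boldsymbol{A}}$ not only by the interaction domain $A$, but also, say, by quantum numbers of participating creation and annihilation operators. Thus, for example, the term $\boldsymbol{A}$ might, in the case of the Hubbard model, be a pair $(\langle x, y \rangle, \uparrow)$ corresponding to the operator $T_{\boldsymbol{A}} = c^\dagger_{x,\uparrow} c_{y,\uparrow}$. We refer to $\boldsymbol{A}$ as a quantum transition.
Assumption 3 (Quantum perturbations). The collection of operators $T_{\boldsymbol{A}}$ is supposed to be periodic,² with period $\ell_0$, with respect to the translations of $\mathrm{supp}\, \boldsymbol{A}$. The interactions $T_{\boldsymbol{A}}$ are assumed to satisfy the following condition, for fermions or bosons, respectively:
• (Fermions) $T_{\boldsymbol{A}}$ is a finite sum of even monomials in creation and annihilation operators of fermionic particles at a given site, i.e.
\[ T_{\boldsymbol{A}} = \sum_{\substack{(x_1,\sigma_1),\dots,(x_k,\sigma_k) \\ (y_1,\sigma'_1),\dots,(y_\ell,\sigma'_\ell)}} \tilde{T}(\{x_i, \sigma_i, y_j, \sigma'_j\})\, c^\dagger_{x_1,\sigma_1} \dots c^\dagger_{x_k,\sigma_k} c_{y_1,\sigma'_1} \dots c_{y_\ell,\sigma'_\ell} \]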
with $x_i, y_i \in A$; the $\sigma_i, \sigma'_i$ are the internal degrees of freedom, such as spins, and $\tilde{T}(\cdot)$ is a complex number. $k + \ell$ must be an even number. The creation and annihilation operators satisfy the anticommutation relations
\[ \{c^\dagger_{x,\sigma}, c^\dagger_{y,\sigma'}\} = 0, \qquad \{c_{x,\sigma}, c_{y,\sigma'}\} = 0, \qquad \{c_{x,\sigma}, c^\dagger_{y,\sigma'}\} = \delta_{x,y}\, \delta_{\sigma,\sigma'}. \]
2 By taking the least common multiple, we can always suppose the same periodicity for $\Phi$ and $T$. Moreover, whenever a torus $\Lambda$ is considered, we suppose that its side is a multiple of $\ell_0$.
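• (Spins or bosons) The matrix element $\langle n_\Lambda | T_{\boldsymbol{A}} | n'_\Lambda \rangle$ is zero whenever $n_{\Lambda \setminus A} \neq n'_{\Lambda \setminus A}$ and otherwise it depends on $n_A$ and $n'_A$ only.
In both cases $T$ is supposed to have an exponential decay with respect to its support: defining $\|T\|$ to be
\[ \|T\| = \sup_{\boldsymbol{A},\, A \subset \mathbb{Z}^\nu} \Bigl[ \max_{n_A, n'_A \in \Omega^A} |\langle n'_A | T_{\boldsymbol{A}} | n_A \rangle| \Bigr]^{1/|A|}, \tag{2.6} \]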
we assume that $\|T\| < \infty$. When stating our theorems, we shall actually suppose $\|T\|$ to be sufficiently small. Notice also that we do not assume that $T$ is of finite range; the exponential decay suffices.
2.2. The effective potential. In this section we define the effective potential that results from quantum fluctuations. It is due to a succession of “quantum transitions”, that is, it involves terms of the form $\langle g | T_{\boldsymbol{A}} | n \rangle$. What are the sequences $(\boldsymbol{A}_1, \dots, \boldsymbol{A}_k)$ to take into account? There is no general answer to this question; it depends on the model and on the properties of the phases under observation. In the case where the Hamiltonian is of the form $V + \lambda T$, $\lambda$ being a perturbation parameter, one could restrict to all sequences that contain less than, say, 4 transitions (or 2, or 17...). But we can also consider models with more than one parameter. Let us say that the choice of the suitable sequence requires some physical intuition. The procedure is the following. First we guess a list $\mathcal{S}$ of sequences of quantum transitions, and we apply the formulæ (2.8)–(2.10) below to compute the effective potential. Then we must answer positively two questions:
• Does $\mathcal{S}$ contain all the quantum transitions that actually play a role?
• Are other quantum effects negligible?
The mathematical formulation of these conditions is the subject of Assumptions 5 and 6 below. Notice that there is some freedom in the choice of $\mathcal{S}$; indeed, it is harmless to include more transitions than necessary. Guessing the minimal set $\mathcal{S}$ simply decreases the number of computations. Let us now state the formulæ for the effective potential. Equations are rather simple in the case where $\mathcal{S}$ contains sequences of no more than 4 transitions; we restrict to that situation in this section, and postpone the general expression, which is quite involved, to the appendix.
Let us decompose $\mathcal{S} = \mathcal{S}^{(2)} \cup \mathcal{S}^{(3)} \cup \mathcal{S}^{(4)}$, with $\mathcal{S}^{(k)}$ denoting the list of sequences with exactly $k$ transitions, and write
\[ \Psi = \Psi^{(2)} + \Psi^{(3)} + \Psi^{(4)}. \tag{2.7} \]
Here $\Psi^{(k)}$ is the contribution to the effective potential due to the fluctuations from $\mathcal{S}^{(k)}$. Let
\[ \phi_A(n_A; g_A) = \sum_{x,\, U_0(x) \subset A} \bigl[ \Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \bigr]. \]
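Then, for any connected $A \subset \mathbb{Z}^\nu$ and $g_A \in G_A$, we define
\[ \Psi^{(2)}_A(g_A) = - \sum_{\substack{(\boldsymbol{A}_1, \boldsymbol{A}_2) \in \mathcal{S}^{(2)} \\ \bar{A}_1 \cup \bar{A}_2 = A}} \; \sum_{n_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | g_A \rangle}{\phi_A(n_A; g_A)}, \tag{2.8} \]
\[ \Psi^{(3)}_A(g_A) = - \sum_{\substack{(\boldsymbol{A}_1, \boldsymbol{A}_2, \boldsymbol{A}_3) \in \mathcal{S}^{(3)} \\ \bar{A}_1 \cup \bar{A}_2 \cup \bar{A}_3 = A}} \; \sum_{n_A, n'_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | n'_A \rangle \langle n'_A | T_{\boldsymbol{A}_3} | g_A \rangle}{\phi_A(n_A; g_A)\, \phi_A(n'_A; g_A)}. \tag{2.9} \]
The expression for $\Psi^{(4)}$ becomes more complicated (we shall see in Sect. 4 that clusters of excitations are actually occurring here),
\[ \begin{aligned} \Psi^{(4)}_A(g_A) = {} & - \sum_{\substack{(\boldsymbol{A}_1, \boldsymbol{A}_2, \boldsymbol{A}_3, \boldsymbol{A}_4) \in \mathcal{S}^{(4)} \\ \bar{A}_1 \cup \bar{A}_2 \cup \bar{A}_3 \cup \bar{A}_4 = A}} \; \sum_{n_A, n'_A, n''_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | n'_A \rangle \langle n'_A | T_{\boldsymbol{A}_3} | n''_A \rangle \langle n''_A | T_{\boldsymbol{A}_4} | g_A \rangle}{\phi_A(n_A; g_A)\, \phi_A(n'_A; g_A)\, \phi_A(n''_A; g_A)} \\ & - \frac12 \sum_{n_A, n'_A \notin G_A} \frac{\langle g_A | T_{\boldsymbol{A}_1} | n_A \rangle \langle n_A | T_{\boldsymbol{A}_2} | g_A \rangle \langle g_A | T_{\boldsymbol{A}_3} | n'_A \rangle \langle n'_A | T_{\boldsymbol{A}_4} | g_A \rangle}{\phi_A(n_A; g_A) + \phi_A(n'_A; g_A)} \Bigl\{ \frac{1}{\phi_A(n_A; g_A)} + \frac{1}{\phi_A(n'_A; g_A)} \Bigr\}^2. \end{aligned} \tag{2.10} \]
Property (2.5) implies that all the denominators are strictly positive. These equations simplify further if $T_{\boldsymbol{A}}$ is a monomial in creation and annihilation operators; indeed in the sums over intermediate configurations only one element has to be taken into account. Notice, finally, that the diagonal terms in $T$ are not playing any role in the previous definitions; we consider that they are small, since otherwise we would have included them into the diagonal potential.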
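The second order formula (2.8) has the structure of ordinary second order perturbation theory, which can be checked on a toy example. The sketch below is only an illustration under stated assumptions: a hypothetical two-level system with one reference configuration g, one excited configuration n of excitation energy φ = U, and a real off-diagonal matrix element t. It is not one of the models of the paper, and the names `psi2`, `phi` and `t_elem` are introduced here for illustration. The result is compared with the exact lowest eigenvalue of the 2 × 2 matrix with diagonal (0, U) and off-diagonal entries t.

```python
import math

def psi2(g, excited, phi, t_elem):
    """Second order term in the form of (2.8):
    -sum over excited n of <g|T|n><n|T|g> / phi(n; g)."""
    return -sum(t_elem(g, n) * t_elem(n, g) / phi[n] for n in excited)

# Toy data (assumptions of this sketch): excitation energy U, hopping t.
U, t = 1.0, 0.1
phi = {"n": U}

def t_elem(a, b):
    """Matrix element <a|T|b> of the purely off-diagonal perturbation."""
    return t if a != b else 0.0

approx = psi2("g", ["n"], phi, t_elem)

# Exact lowest eigenvalue of [[0, t], [t, U]] for comparison.
exact = (U - math.sqrt(U * U + 4.0 * t * t)) / 2.0
```

Here `approx` equals $-t^2/U$, and it differs from `exact` only at order $t^4/U^3$, which is the kind of higher order remainder that Assumptions 5 and 6 are meant to control in the general setting.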
2.3. Stability of the dominant states. The aim of rewriting a class of quantum transitions in terms of the effective potential was to get control over stable low temperature phases. To this end, three conditions, expressed first only vaguely and then in precise terms in Assumptions 4, 5, and 6 below, must be met. Namely, we suppose that
• the Hamiltonian corresponding to the sum $\Phi + \Psi$ of the classical (diagonal) and effective interactions has a finite number of ground configurations, and its excitations have strictly positive energy;³
• the list $\mathcal{S}$ contains all the lowest quantum fluctuations;
• there is no “quantum instability”; the transition probability from a “ground state” $g$ to another “ground state” $g'$ is small compared to the energy cost of the excitations.
3 Again, when exploring a region of the phase diagram at once, we have a fixed finite set of reference configurations that, strictly speaking, turn out to be ground configurations of the corresponding Hamiltonian for a particular value of “external fields”. See below for a more detailed formulation.
Each component of the effective interaction $\Psi_A$ is a mapping $G_A \to \mathbb{R}$; let us first extend it to $\Omega^A \to \mathbb{R}$ by putting $\Psi_A(n_A) = 0$ if $n_A \notin G_A$. To give a precise meaning to the first condition, we suppose that a finite number of periodic reference configurations $D \subset G$ is given such that the interaction $\Phi + \Psi$ satisfies the Peierls condition with respect to $D$. We choose a formulation in which it is very easy to verify the condition and which, in addition, takes into account the fact that the configurations from $D$ are not necessarily translation invariant. Namely, we will formulate the condition in terms of a block potential $\Upsilon$ that is equivalent to $\Phi + \Psi$ and is chosen in a suitable way. Of course, in many particular cases this is not necessary and the condition as stated below is valid directly for $\Phi + \Psi$. However, in several important cases treated in Sect. 3, the interaction $\Phi + \Psi$ turns out not to be a so-called m-potential, and the use of the equivalent m-potential $\Upsilon$ not only simplifies the formulation of the Peierls condition, but also makes the task of its verification much easier. We will consider the interactions $\varphi$ and $\phi$ to be equivalent⁴ if, for any finite torus $\Lambda$ and any configuration $n \in \Omega^\Lambda$, one has
\[ \sum_{A \subset \Lambda^{\mathrm{per}}} \varphi_A(n_A) = \sum_{A \subset \Lambda^{\mathrm{per}}} \phi_A(n_A). \]
Assumption 4 (Peierls condition). There exist a finite set of periodic configurations $D \subset G$ with the smallest common period $L_0$, a constant $\Delta$ such that $\Delta > \|T\|^k$ for some finite constant $k$, and a periodic block interaction $\Upsilon = \{\Upsilon_x\}$ (with period $\ell_0$) that is equivalent to $\Phi + \Psi$, such that the following conditions are satisfied. The interaction $\Upsilon$ is of a finite range⁵ $R \in \frac12 \mathbb{N}$ such that $R^\nu \leq \Delta_0/\delta_0$, with the constants $\delta_0$ and $\Delta_0$ determined by the interaction $\Phi$ in Assumption 2. We denote by $U(x)$ the $R$-neighbourhood of $x$. The value $\Upsilon_x(d_{U(x)})$ is supposed to be translation invariant with respect to $x$ for any $d \in D$, and the interaction $\Upsilon$ satisfies the following conditions:
• For any $x \in \Lambda$ and any $n$ with $n_{U_0(x)} \notin G_{U_0(x)}$, one has
\[ \Upsilon_x(n_{U(x)}) - \max_{g \in G} \Upsilon_x(g_{U(x)}) > \tfrac12 \Delta_0. \]
• For any $x \in \Lambda$ and any $n$ with $n_{U(x)} \notin D_{U(x)}$, one has
\[ \Upsilon_x(n_{U(x)}) - \min_{d \in D} \Upsilon_x(d_{U(x)}) > \Delta. \]
4 The usual notion of (physically) equivalent interactions (see [Geo,EFS]) is slightly weaker, but we will not need it here.
5 We will suppose, taking larger $R$ if necessary, that it is larger than or equal to the range $R_0$ of $\Phi$, as well as to half of the range of the effective interaction $\Psi$ and to $L_0$.
The following assumption is a condition demanding that the list $\mathcal{S}$ should contain all transitions that are relevant for the effective potential. For this, we evaluate the diagonal
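terms arising from any sequence of transitions that does not appear in $\mathcal{S}$; it will have to be small compared to the Peierls constant $\Delta$. We define
\[ m(T_{\boldsymbol{A}_1}, \dots, T_{\boldsymbol{A}_k}) = \max_{g_A \in G_A} \; \max_{n^1_A, \dots, n^{k-1}_A \notin G_A} \bigl| \langle g_A | T_{\boldsymbol{A}_1} | n^1_A \rangle \langle n^1_A | T_{\boldsymbol{A}_2} | n^2_A \rangle \dots \langle n^{k-1}_A | T_{\boldsymbol{A}_k} | g_A \rangle \bigr|, \tag{2.11} \]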
where $A = \cup_{j=1}^k \bar{A}_j$.
Assumption 5 (Completeness of the set of quantum transitions). There exists a finite number $\varepsilon_1$ such that for any sequence $(\boldsymbol{A}_1, \dots, \boldsymbol{A}_m) \notin \mathcal{S}$ with connected $\cup_{i=1}^m \bar{A}_i$ one has
\[ m(T_{\boldsymbol{A}_1}, \dots, T_{\boldsymbol{A}_{k_1}})\, m(T_{\boldsymbol{A}_{k_1+1}}, \dots, T_{\boldsymbol{A}_{k_2}}) \dots m(T_{\boldsymbol{A}_{k_{n-1}+1}}, \dots, T_{\boldsymbol{A}_m}) \leq \varepsilon_1 \Delta. \]
In general, it is not true that the main effect of quantum fluctuations results in a diagonal effective interaction. A sufficient condition for this to occur is that all possible transitions between different configurations $g$ and $g'$ have small contribution compared to $\Delta$.
Assumption 6 (Absence of quantum instability). There exists a finite number $\varepsilon_2$ such that for any sequence $(\boldsymbol{A}_1, \dots, \boldsymbol{A}_m)$, and any $g_A, g'_A \in G_A$ ($A = \cup_{j=1}^m \bar{A}_j$), $g_A \neq g'_A$, one has
\[ |\langle g_A | T_{\boldsymbol{A}_1} \dots T_{\boldsymbol{A}_m} | g'_A \rangle| \leq \varepsilon_2 \Delta. \]
When formulating our theorems, we shall suppose that $\varepsilon_1$ and $\varepsilon_2$ are small, more precisely: smaller than a constant that does not depend on $T$.
2.4. Characterization of stable phases. Notice first that the specific energy per lattice site of the configuration $d \in D$, defined by
\[ e(d) = \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \sum_{A \subset \Lambda} [\Phi_A(d_A) + \Psi_A(d_A)], \tag{2.12} \]
is equal, according to Assumption 4, to $\Upsilon_x(d_{U(x)})$ (whose value does not depend on $x$). Our first result concerns the existence of the thermodynamic limit for the state under periodic boundary conditions. Taking $L_0$ to be the smallest common period of periodic configurations from $D$, we always consider in the following the limit over tori $\Lambda \nearrow \mathbb{Z}^\nu$ whose sides are multiples of $L_0$ and $\ell_0$.
Theorem 2.1 (Thermodynamic limit). Suppose that Assumptions 1–6 are satisfied. There exist constants $\varepsilon_0 > 0$ (independent of $T$) and $\beta_0 = \beta_0(\Delta)$ such that the limit
\[ \langle K \rangle^{\mathrm{per}}_\beta = \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{\mathrm{Tr}\, K e^{-\beta H_\Lambda}}{\mathrm{Tr}\, e^{-\beta H_\Lambda}} \tag{2.13} \]
exists whenever $\varepsilon_1, \varepsilon_2, \|T\| \leq \varepsilon_0$ in Assumptions 5 and 6, $\beta > \beta_0$, and $K$ is a local observable.⁶
6 A local observable, here, is a finite sum of even monomials in creation and annihilation operators, in the case of fermion systems.
Notice the logic of constants in the theorem above (as well as in the remaining two theorems stated below). The constant $\varepsilon_0$ is given by the context (lattice, phase space, range and periodicity of the model, and $\Phi$, but does not depend on $T$). Then, for any $T$ such that $\|T\|$ and both $\varepsilon_1$ and $\varepsilon_2$ are smaller than $\varepsilon_0$ one can choose $\beta_0$ (depending on $\Delta$, which is determined in terms of $T$ through the effective potential $\Psi$) such that the claim is valid for the given $T$ and any $\beta > \beta_0(\Delta)$. With $\|T\| \to 0$ we may have to go to lower temperatures (higher $\beta$) to keep the control. Of course, if $\Delta$ does not vanish with vanishing $\|T\|$ (i.e. Assumption 4 is valid for $\Phi$ alone) as was the case in [BKU1, DFF1], one can choose the constant $\beta_0$ uniformly in $\|T\|$.
If there are coexisting phases for a given temperature and Hamiltonian, the state $\langle \cdot \rangle^{\mathrm{per}}_\beta$ will actually turn out to be a linear combination of several pure states. A standard way to select such a pure state is to consider a thermodynamic limit with a suitably chosen fixed boundary condition. In many situations to which the present theory should apply, this approach is not easy to implement. The classical part of the Hamiltonian might actually consist only of on-site terms, and to make the system “feel” the boundary, the truly quantum terms must be used. One possibility is, of course, to couple the system with the boundary with the help of the effective potential. The problem here is, however, that since we are interested in a genuine quantum model, we would have to introduce the effective potential directly in the finite volume quantum state. Expanding this state, in a similar manner as will be done in the next section, we would actually obtain a new, boundary dependent effective potential. One can imagine that it would be possible to cancel the respective terms by assuming that the boundary potential satisfies certain “renormalizing self-consistency conditions”. However, the details of such an approach remain to be clarified.
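Here we have chosen another approach. Namely, we construct the pure states by limits of states $\langle \cdot \rangle^{\Phi^\alpha\, \mathrm{per}}_\beta$, defined by (2.13) with $H_\Lambda = V^{\Phi^\alpha}_\Lambda + T_\Lambda$, where $\Phi^\alpha$ is a perturbation of the interaction $\Phi$ suitably chosen in such a way that one approaches the coexistence point from the one-phase region. Consider thus $\mathcal{F}_{R_0}$, the space of all periodic interactions of range $R_0$. We say that a state $\langle \cdot \rangle^{\phi\, \mathrm{per}}_\beta$, $\phi \in \mathcal{F}_{R_0}$, is thermodynamically stable if it is insensitive to small perturbations:
\[ \langle K \rangle^{\phi\, \mathrm{per}}_\beta = \lim_{\alpha \to 0} \langle K \rangle^{(\phi + \alpha\psi)\, \mathrm{per}}_\beta \tag{2.14} \]
for every $\psi \in \mathcal{F}_{R_0}$ and every local observable $K$. We define now a state $\langle \cdot \rangle^*_\beta$ to be a pure state (with classical potential $\Phi$ and quantum interaction $T$) if there exists a function $(0, \alpha_0) \ni \alpha \to \Phi^\alpha \in \mathcal{F}_{R_0}$ so that $\lim_{\alpha \to 0^+} \Phi^\alpha = \Phi$, the states $\langle \cdot \rangle^{\Phi^\alpha\, \mathrm{per}}_\beta$ are thermodynamically stable, and
\[ \langle K \rangle^*_\beta = \lim_{\alpha \to 0^+} \langle K \rangle^{\Phi^\alpha\, \mathrm{per}}_\beta \tag{2.15} \]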
for every local observable $K$.
Theorem 2.2 (Pure low temperature phases). Under Assumptions 1–6 and for any $\eta > 0$, there exist $\varepsilon_0 > 0$ (independent of $T$) and $\beta_0 = \beta_0(\Delta)$ such that if $\varepsilon_1, \varepsilon_2, \|T\| \leq \varepsilon_0$ and $\beta > \beta_0$, there exists for every $d \in D$ a function $f^\beta(d)$ such that the set $Q = \{d \in D;\ \mathrm{Re}\, f^\beta(d) = \min_{d' \in D} \mathrm{Re}\, f^\beta(d')\}$ characterizes the set of pure phases. Namely, for any $d \in Q$:
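a) The function $f^\beta(d)$ is equal to the free energy of the system, i.e.
\[ f^\beta(d) = -\frac{1}{\beta} \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log \mathrm{Tr}\, e^{-\beta H_\Lambda}. \]
b) There exists a pure state $\langle \cdot \rangle^d_\beta$. Moreover, it is close to the state $|d_\Lambda\rangle$ in the sense that for any bounded local observable $K$ and any sufficiently large $\Lambda$, one has
\[ \bigl| \langle K \rangle^d_\beta - \langle d_\Lambda | K | d_\Lambda \rangle \bigr| \leq \eta\, |\mathrm{supp}\, K|\, \|K\|, \]
where $\mathrm{supp}\, K$ is the support of the operator $K$.
c) There is exponential decay of correlations in the state $\langle \cdot \rangle^d_\beta$, i.e. there exists a constant $\xi^d > 0$ such that
\[ \bigl| \langle K K' \rangle^d_\beta - \langle K \rangle^d_\beta \langle K' \rangle^d_\beta \bigr| \leq |\mathrm{supp}\, K|\, |\mathrm{supp}\, K'|\, \|K\|\, \|K'\|\, e^{-\mathrm{dist}(\mathrm{supp}\, K,\, \mathrm{supp}\, K')/\xi^d} \]
for any bounded local observables $K$ and $K'$.
d) The state $\langle \cdot \rangle^{\mathrm{per}}_\beta$ is a linear combination of the states $\langle \cdot \rangle^d_\beta$, $d \in Q$, with equal weights,
\[ \langle K \rangle^{\mathrm{per}}_\beta = \frac{1}{|Q|} \sum_{d \in Q} \langle K \rangle^d_\beta \]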
for each local observable $K$.
2.5. Phase diagram. We now turn to the phase diagram at low temperatures. Let $r$ be the number of dominant states, i.e. $r = |D|$. To be able to investigate the phase diagram, we suppose that $r - 1$ suitable “external fields” are added to the Hamiltonian $H_\Lambda$. Or, in other words, we suppose that the classical potential $\Phi$ and quantum interaction $T$ depend on a vector parameter $\mu = (\mu_1, \dots, \mu_{r-1}) \in \mathcal{U}$, where $\mathcal{U}$ is an open set of $\mathbb{R}^{r-1}$. The dependence should be such that the parameters $\mu$ remove the degeneracy on the set $D$ of dominant states. One way to formulate this condition is to assume nonsingularity of the matrix of derivatives $\bigl( \frac{\partial e^\mu(d_j)}{\partial \mu_i} \bigr)$.
Assumption 7. The potential $\Phi$ and the quantum perturbation $T$ are differentiable with respect to $\mu$ and there exists a constant $M < \infty$ such that
\[ \max_{n \in \Omega^{\mathbb{Z}^\nu}} \Bigl| \frac{\partial}{\partial \mu_i} \Phi_x(n_{U_0(x)}) \Bigr| \leq M \]
for all $x \in \mathbb{Z}^\nu$, and
\[ \|T\| + \sum_{i=1}^{r-1} \Bigl\| \frac{\partial T}{\partial \mu_i} \Bigr\| \leq M \]
for all $\mu \in \mathcal{U}$. Further, there exists a point $\mu_0 \in \mathcal{U}$ such that
\[ e^{\mu_0}(d) = e^{\mu_0}(d') \quad \text{for all } d, d' \in D, \]
and the inverse of the matrix of derivatives ∂ µ µ e (dj ) − e (dr ) ∂µi 1 6 i,j 6 r−1 has a uniform bound for all µ ∈ U. Notice that if for some d ∈ D one has eµ (d) = eµ := mind 0 ∈D eµ (d 0 ), then, according to the Peierls condition (Assumption 4), the configuration d is actually a ground state of ϒ. Thus, the assumption above implies that the zero temperature phase diagram has a regular structure: there exists a point µ0 ∈ U where all energies eµ0 (d) are equal, eµ0 (d) = eµ0 , r lines ending in µ0 with r − 1 ground states, 21 r(r − 1) twodimensional surfaces whose boundaries are the lines above with r − 2 ground states, . . . , r open (r − 1)-dimensional domains with only one ground state. Denoting the (r − |Q|)-dimensional manifolds corresponding to the coexistence of a given set Q ⊂ D of ground states by n Re eµ (d 0 ) if d ∈ Q, and M∗ (Q) = µ ∈ U; Re eµ (d) = min d 0 ∈D o (2.16) µ 0 Re e (d ) if d ∈ / Q , Re eµ (d) > min 0 d ∈D
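we can summarize the above structure by saying that the collection $\mathcal{P}^* = \{\mathcal{M}^*(Q)\}_{Q \subset D}$ determines a regular phase diagram. Notice, in particular, that $\cup_{Q \subset D} \mathcal{M}^*(Q) = U$, $\mathcal{M}^*(Q) \cap \mathcal{M}^*(Q') = \emptyset$ whenever $Q \ne Q'$, while for the closures, $\overline{\mathcal{M}}^*(Q) \cap \overline{\mathcal{M}}^*(Q') = \overline{\mathcal{M}}^*(Q \cup Q')$. Here we set $\mathcal{M}(\emptyset) = \emptyset$. The statement of the following theorem is that the similar collection $\mathcal{P} = \{\mathcal{M}(Q)\}_{Q \subset D}$ of manifolds corresponding to the existence of the corresponding stable pure phases for the full model is also a regular phase diagram and differs only slightly from $\mathcal{P}^*$. To measure the distance of two manifolds $\mathcal{M}$ and $\mathcal{M}'$, we introduce the Hausdorff distance
\[
\mathrm{dist}_H(\mathcal{M}, \mathcal{M}') = \max\Bigl( \sup_{\mu \in \mathcal{M}} \mathrm{dist}(\mu, \mathcal{M}'),\ \sup_{\mu \in \mathcal{M}'} \mathrm{dist}(\mu, \mathcal{M}) \Bigr).
\]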
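As an illustration only (it is not part of the argument), the Hausdorff distance just defined can be evaluated on finite point samples of two manifolds. The sketch below assumes the Euclidean distance in parameter space; the function names and the sample data are hypothetical.

```python
import math


def dist_point_to_set(p, points):
    """Euclidean distance from the point p to the nearest point of a finite set."""
    return min(math.dist(p, q) for q in points)


def hausdorff(m1, m2):
    """Hausdorff distance between two finite, non-empty point sets."""
    d12 = max(dist_point_to_set(p, m2) for p in m1)  # sup over m1 of dist(., m2)
    d21 = max(dist_point_to_set(q, m1) for q in m2)  # sup over m2 of dist(., m1)
    return max(d12, d21)


# Hypothetical samples in a two-dimensional parameter space:
# a segment on the mu_1 axis and the same segment shifted by 0.1 in mu_2.
line = [(0.1 * i, 0.0) for i in range(11)]
shifted = [(0.1 * i, 0.1) for i in range(11)]
print(hausdorff(line, shifted))
```

Both suprema are needed: for a one-point set and a two-point set containing it, one of the two one-sided distances vanishes while the other does not.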
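Theorem 2.3 (Low temperature phase diagram). Under Assumptions 1–7 there exist $\varepsilon_0 > 0$ and $\beta_0 = \beta_0(\Delta)$ such that if $\|T\| + \sum_{i=1}^{r-1} \bigl\| \frac{\partial}{\partial\mu_i} T \bigr\| \le \varepsilon_0$, $\varepsilon_1, \varepsilon_2 \le \varepsilon_0$, and $\beta > \beta_0$, there exists a collection of manifolds $\mathcal{P}^\beta = \{\mathcal{M}^\beta(Q)\}_{Q \subset D}$ such that

(a) The collection $\mathcal{P}^\beta$ determines a regular phase diagram;

(b) If $\mu \in \mathcal{M}^\beta(Q)$, the corresponding stable pure state $\langle\cdot\rangle^d_\beta$ exists for every $d \in Q$ and satisfies the properties b), c), and d) from Theorem 2.2;

(c) The Hausdorff distance $\mathrm{dist}_H$ between the manifolds of $\mathcal{P}^\beta$ and their correspondent in $\mathcal{P}^*$ is bounded,
\[
\mathrm{dist}_H\bigl(\mathcal{M}^\beta(Q), \mathcal{M}^*(Q)\bigr) \le O\Bigl( e^{-\beta} + \|T\| + \sum_{i=1}^{r-1} \Bigl\| \frac{\partial T}{\partial\mu_i} \Bigr\| \Bigr),
\]
for all $Q \subset D$.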
The proofs of these theorems are given in the rest of the paper. Expansions of the partition function and expectation values of local observables are constructed, and interpreted as contours of a classical model in one additional dimension. Then we show that the assumptions for using the standard Pirogov–Sinai theory are fulfilled, and, with some special care to be taken due to our definition of stability, the validity of the three theorems follows.

3. Examples

3.1. The asymmetric Hubbard model. The usual Hubbard model describes spin-$\tfrac12$ fermions on a lattice, interacting with an on-site repulsion. The kinetic energy of the particles is modelled by a hopping operator. There are many interesting questions about this model, but far fewer rigorous results; see [Lieb] for a review. It is natural to think of the model as describing one kind of particle that can be in two different states because of its spin. But since the Hamiltonian conserves the total magnetization, we can adopt a different point of view, namely to imagine having two different kinds of particles, the ↑ and ↓ ones; each kind of particle obeys the Pauli exclusion principle, which prevents two particles of the same kind from being at the same site. Whenever two particles of different kinds are at the same site, there is an energy cost of U. The natural phase space is the Fock space of antisymmetric wave functions on $\Lambda$. It is isomorphic to $\mathcal{H}_\Lambda$ if we take for the state space $\Omega = \{0, \uparrow, \downarrow, 2\}$. Particles with different spins being different, it becomes natural to consider that they have different masses, hence different hopping coefficients. The Hamiltonian is written in (1.1). If we set $t_\downarrow = 0$, we obtain the Falicov–Kimball model [GM]; in the following, we consider the situation $t_\downarrow \ll t_\uparrow \ll U$ (strongly asymmetric Hubbard model). This model has for classical interaction
\[
\Phi_x(n_x) =
\begin{cases}
0 & \text{if } n_x = 0, \\
-\mu & \text{if } n_x = \uparrow \text{ or } n_x = \downarrow, \\
U - 2\mu & \text{if } n_x = 2
\end{cases}
\tag{3.1}
\]
($R_0 = 0$). We choose the chemical potential such that $0 < \mu < U$.
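The set G is here the set of ground states of $\Phi$, i.e.
\[
G = \{ n \in \Omega^{\mathbb{Z}^\nu} : n_x = \uparrow \text{ or } n_x = \downarrow \text{ for any } x \in \mathbb{Z}^\nu \}.
\]
Assumption 2 holds with $\Delta_0 = \min(\mu, U - \mu)$ and $\delta_0 = 0$. The quantum perturbation is defined to be
\[
T_A =
\begin{cases}
t_\uparrow\, c^\dagger_{x\uparrow} c_{y\uparrow} & \text{if } A = (\langle x, y \rangle, \uparrow), \\
t_\downarrow\, c^\dagger_{x\downarrow} c_{y\downarrow} & \text{if } A = (\langle x, y \rangle, \downarrow),
\end{cases}
\tag{3.2}
\]
and we always have $\bar{A} = \{x, y\}$ for a pair of nearest neighbours $x, y \in \mathbb{Z}^\nu$. $\|T\| = |t_\uparrow|^{1/2}$ (if $|t_\uparrow| \ge |t_\downarrow|$). The sequence $\mathcal{S}$ of transitions that we consider is
\[
\mathcal{S} = \{ (A, A') : A = (\langle x, y \rangle, \uparrow) \text{ and } A' = (\langle y, x \rangle, \uparrow) \text{ for some } x, y \in \mathbb{Z}^\nu,\ \|x - y\|_2 = 1 \}.
\]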
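The effective potential is given by Eq. (2.8). For any nearest neighbours $x, y \in \mathbb{Z}^\nu$, any configuration n such that $|n\rangle = c^\dagger_{x\uparrow} c_{y\uparrow} |g\rangle$, $g \in G$, has an increase of energy of $\phi_{\{x,y\}}(n_{\{x,y\}}; g_{\{x,y\}}) = U$. Furthermore we have
\[
\langle g_{\{x,y\}} |\, c^\dagger_{x\uparrow} c_{y\uparrow} c^\dagger_{y\uparrow} c_{x\uparrow} \,| g_{\{x,y\}} \rangle + \langle g_{\{x,y\}} |\, c^\dagger_{y\uparrow} c_{x\uparrow} c^\dagger_{x\uparrow} c_{y\uparrow} \,| g_{\{x,y\}} \rangle =
\begin{cases}
1 & \text{if } g_{\{x,y\}} \in \{ (\uparrow, \downarrow), (\downarrow, \uparrow) \}, \\
0 & \text{otherwise.}
\end{cases}
\tag{3.3}
\]
Therefore
\[
\Psi_{\{x,y\}}(g_{\{x,y\}}) =
\begin{cases}
-t_\uparrow^2 / U & \text{if } g_{\{x,y\}} \in \{ (\uparrow, \downarrow), (\downarrow, \uparrow) \}, \\
0 & \text{otherwise.}
\end{cases}
\tag{3.4}
\]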
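The matrix elements entering (3.3) can be checked mechanically. The following sketch is an illustration under stated assumptions: the four fermionic modes of the pair of sites are ordered as (x↑, x↓, y↑, y↓), signs follow the Jordan–Wigner convention for that ordering, and the prefactor $-t_\uparrow^2/U$ is simply taken from (3.4) rather than derived. All function names are hypothetical.

```python
from fractions import Fraction

MODES = ("x_up", "x_dn", "y_up", "y_dn")   # assumed ordering for fermionic signs
SPIN = {"up": (1, 0), "dn": (0, 1)}        # occupations (up, down) of one site


def apply(op, mode, state):
    """Apply c ('a') or c^dagger ('c') on `mode` to a signed basis state.

    `state` is (sign, occupations) or None; None means the zero vector.
    """
    if state is None:
        return None
    sign, occ = state
    k = MODES.index(mode)
    needed = 1 if op == "a" else 0          # annihilation needs a particle
    if occ[k] != needed:
        return None
    sign *= (-1) ** sum(occ[:k])            # Jordan-Wigner string
    return sign, occ[:k] + (1 - needed,) + occ[k + 1:]


def expectation(ops, occ):
    """Diagonal matrix element <occ| O_1 ... O_n |occ> of a product of operators."""
    state = (1, occ)
    for op, mode in reversed(ops):          # the rightmost operator acts first
        state = apply(op, mode, state)
    if state is None or state[1] != occ:
        return 0
    return state[0]


def hopping_sum(g_x, g_y):
    """Left-hand side of (3.3) for the pair configuration (g_x, g_y)."""
    occ = SPIN[g_x] + SPIN[g_y]
    first = expectation([("c", "x_up"), ("a", "y_up"), ("c", "y_up"), ("a", "x_up")], occ)
    second = expectation([("c", "y_up"), ("a", "x_up"), ("c", "x_up"), ("a", "y_up")], occ)
    return first + second


def psi(g_x, g_y, t_up, U):
    """Effective potential (3.4), with the prefactor -t_up^2/U taken from the text."""
    return -t_up ** 2 / U * hopping_sum(g_x, g_y)


for pair in [("up", "dn"), ("dn", "up"), ("up", "up"), ("dn", "dn")]:
    print(pair, hopping_sum(*pair), psi(*pair, Fraction(1, 10), Fraction(1)))
```

Only antiparallel pairs contribute, because a hop of the ↑ particle onto a site already occupied by an ↑ particle is forbidden by the exclusion principle.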
This interaction is nearest-neighbour and can be inscribed in blocks $2 \times \dots \times 2$. We take $R = \tfrac12$ and choose for the physically equivalent interaction $\Upsilon$,
\[
\Upsilon_x(n_{U(x)}) = \Phi_x(n_x) + \frac{1}{2^{\nu-1}} \sum_{\{y,z\} \subset U(x)} \Psi_{\{y,z\}}(n_{\{y,z\}}). \tag{3.5}
\]
The set D has two elements, namely the two chessboard configurations $d^{(1)}$ and $d^{(2)}$; if $(-1)^x := \prod_{i=1}^{\nu} (-1)^{x_i}$,
\[
d^{(1)}_x =
\begin{cases}
\uparrow & \text{if } (-1)^x = 1, \\
\downarrow & \text{if } (-1)^x = -1,
\end{cases}
\qquad
d^{(2)}_x =
\begin{cases}
\uparrow & \text{if } (-1)^x = -1, \\
\downarrow & \text{if } (-1)^x = 1.
\end{cases}
\]
To find the Peierls constant $\Delta$ of Assumption 4, let us make the following observation. Consider a cube $2 \times \dots \times 2$ in $\mathbb{Z}^\nu$, that we denote C, and a configuration $n_C$ on it. First, only configurations with one particle per site need to be taken into account, the others having an increase of energy of the order U. If $n_C \in G_C$, then each edge of the cube is either ferromagnetic or antiferromagnetic. If a spin at a site is flipped, then exactly $\nu$ edges change state. Since any configuration can be created by starting from the chessboard one and flipping the spins at some sites, we see that the minimum number of ferromagnetic edges, for configurations that are not chessboard, is $\nu$. This leads to
\[
\Delta = \frac{\nu}{2^{\nu-1}} \frac{t_\uparrow^2}{U}.
\]
The maximum of the expression in Assumption 5 is equal to $\max(t_\downarrow^2, t_\uparrow^4)$. The constant $\varepsilon_1$ can be chosen to be $\frac{2^{\nu-1} U}{\nu} \max(t_\downarrow^2 / t_\uparrow^2, t_\uparrow^2)$. For Assumption 6 the expression has maximum equal to $|t_\downarrow t_\uparrow|$ and we can take $\varepsilon_2 = \frac{2^{\nu-1}}{\nu} U |t_\downarrow / t_\uparrow|$ (we cannot suppose this to be very small in the symmetric Hubbard model; the effective potential is not strong enough to prevent the model from jumping from one g to another $g'$). Our results for the asymmetric Hubbard model can be stated in the following theorem (see also [KL,DFF2]):

Theorem 3.1 (Chessboard phases in asymmetric Hubbard model). Consider the lattice $\mathbb{Z}^\nu$, $\nu > 2$, and suppose $0 < \mu < U$. Then for any $\delta > 0$, there exist $t, \alpha > 0$ and $\beta_0(t_\uparrow) < \infty$ ($\lim_{t_\uparrow \to 0} \beta_0(t_\uparrow) = \infty$) such that if $|t_\uparrow| \le t$, $|t_\downarrow| \le \alpha |t_\uparrow|$, and $\beta > \beta_0$,
• the free energy exists in the thermodynamic limit with periodic boundary conditions, as well as expectation values of observables.
• There are two pure periodic phases, $\langle\cdot\rangle^{(1)}_\beta$ and $\langle\cdot\rangle^{(2)}_\beta$, with exponential decay of correlations.
• One of these pure phases, $\langle\cdot\rangle^{(1)}_\beta$, is a small deformation of the chessboard state $|d^{(1)}\rangle$:
\[
\langle n_{x\uparrow} \rangle^{(1)}_\beta
\begin{cases}
\ge 1 - \delta & \text{if } (-1)^x = 1, \\
\le \delta & \text{if } (-1)^x = -1,
\end{cases}
\qquad
\langle n_{x\downarrow} \rangle^{(1)}_\beta
\begin{cases}
\le \delta & \text{if } (-1)^x = 1, \\
\ge 1 - \delta & \text{if } (-1)^x = -1.
\end{cases}
\]
The other pure phase, $\langle\cdot\rangle^{(2)}_\beta$, is a small deformation of $|d^{(2)}\rangle$.

To construct the two pure phases, one way is to consider the Hamiltonian
\[
H_\Lambda(h) = H_\Lambda - h \sum_{x \in \Lambda} (-1)^x (n_{x\uparrow} - n_{x\downarrow}).
\]
Then
\[
\langle\cdot\rangle^{(1)}_\beta = \lim_{h \to 0+} \langle\cdot\rangle^{\rm per}_\beta(h)
\quad\text{and}\quad
\langle\cdot\rangle^{(2)}_\beta = \lim_{h \to 0-} \langle\cdot\rangle^{\rm per}_\beta(h),
\]
where $\langle\cdot\rangle^{\rm per}_\beta(h)$ is defined by (2.13) with Hamiltonian $H_\Lambda(h)$.

3.2. The Bose–Hubbard model. This model was introduced by Fisher et al. [FWGF] and may describe $^4$He absorbed in porous media, or Cooper pairs in superconductors, etc. It is extremely simple, but has a very interesting phase diagram with insulating and superfluid domains [FWGF]. Rigorous results mainly concern the insulating phases; when the classical model [(1.2) with $t = 0$] has a finite number of ground states, existence of Gibbs states that are close to projection operators onto the classical ground states can be proven for small t and large $\beta$; moreover, the compressibility vanishes in the ground states of the quantum model [BKU2]. If $U_0 = \infty$, $U_1 = U_2 = 0$ and $\mu = 0$, we obtain a model of hard-core bosons; the reflection positivity technique [DLS] shows that the model has off-diagonal long-range order at low enough temperature, hence has superfluid behaviour.

On-site repulsion $U_0$ discourages too high occupancy of sites, so it is physically harmless to introduce a generalized hard-core constraint, namely that there cannot be more than N bosons at the same site. As a consequence the local state space is $\Omega = \{0, 1, 2, \dots, N\}$ and is finite. We restrict our discussion to the two-dimensional case. The range $R_0$ is equal to $\tfrac12$, and the classical interaction is
\[
\Phi_x(n_{U_0(x)}) = \tfrac14 \sum_{y \in U_0(x)} \bigl( U_0 n_y^2 - U_0 n_y - \mu n_y \bigr)
+ \tfrac12 U_1 \sum_{\substack{y,z \in U_0(x) \\ \|y-z\|_2 = 1}} n_y n_z
+ U_2 \sum_{\substack{y,z \in U_0(x) \\ \|y-z\|_2 = \sqrt{2}}} n_y n_z. \tag{3.6}
\]
Remark that we have [BKU2]
\[
\Phi_x(n_{U_0(x)}) = \bigl( \tfrac14 U_0 - U_1 + U_2 \bigr) \sum_{y \in U_0(x)} \bigl( n_y - \tfrac12 \bigr)^2
+ \bigl( \tfrac14 U_1 - \tfrac12 U_2 \bigr) \sum_{\substack{y,z \in U_0(x) \\ \|y-z\|_2 = 1}} \bigl( n_y + n_z - \tfrac12 \bigr)^2
+ U_2 \Bigl( \sum_{y \in U_0(x)} n_y - \tfrac12 - \frac{\mu}{8U_2} \Bigr)^2 + C \tag{3.7}
\]
with a constant C independent of n. When the chemical potential satisfies $0 < \mu < 8U_2$, $\Phi_x(n_{U_0(x)})$ is minimum if
\[
n_{U_0(x)} = \begin{pmatrix} \bullet & \circ \\ \circ & \circ \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
\]
or any configuration obtained from it by rotation. Hence we define
\[
G_{U_0(x)} = \Bigl\{
\begin{pmatrix} \bullet & \circ \\ \circ & \circ \end{pmatrix},
\begin{pmatrix} \circ & \bullet \\ \circ & \circ \end{pmatrix},
\begin{pmatrix} \circ & \circ \\ \circ & \bullet \end{pmatrix},
\begin{pmatrix} \circ & \circ \\ \bullet & \circ \end{pmatrix}
\Bigr\}
\]
for any $x \in \mathbb{Z}^\nu$. Here, G is the set of ground states of the interaction $\Phi$, so that $\delta_0 = 0$. Since $\Phi_x(n_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \ge \tfrac14 \min(\mu, 8U_2 - \mu)$ for any $n_{U_0(x)} \notin G_{U_0(x)}$, $g_{U_0(x)} \in G_{U_0(x)}$, Assumption 2 holds with $\Delta_0 = \tfrac{1}{36} \min(\mu, 8U_2 - \mu)$ (the factor $\tfrac{1}{36}$, rather than $\tfrac14$, has been chosen in view of Assumption 4, see below).

[Figure 2: three panels (a), (b), (c) showing lattice configurations of occupied and empty sites.]

Fig. 2. Configurations that minimize the diagonal interaction; (a) a general configuration; (b) and (c) two natural candidates that may be selected by lowest quantum fluctuations. Actually, candidate (c) dominates, because it allows for more freedom in the moves of bosons.
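The agreement between the forms (3.6) and (3.7) of the classical interaction can be tested numerically. The sketch below is only an illustration. It assumes that the sums over $y, z \in U_0(x)$ run over ordered pairs of sites of the 2 × 2 block (an assumption about the convention, adopted here because it makes the two expressions differ by a constant), and it uses exact rational arithmetic. Function names and parameter values are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

SITES = [(0, 0), (1, 0), (0, 1), (1, 1)]   # the 2x2 block U_0(x)


def ordered_pairs(dist2):
    """Ordered pairs (y, z) of block sites at squared Euclidean distance dist2."""
    return [(y, z) for y in SITES for z in SITES
            if (y[0] - z[0]) ** 2 + (y[1] - z[1]) ** 2 == dist2]


NN, NNN = ordered_pairs(1), ordered_pairs(2)


def phi_36(n, U0, U1, U2, mu):
    """Classical interaction in the form (3.6); n maps site -> occupation."""
    on_site = F(1, 4) * sum(U0 * n[y] ** 2 - U0 * n[y] - mu * n[y] for y in SITES)
    nn = F(1, 2) * U1 * sum(n[y] * n[z] for y, z in NN)
    nnn = U2 * sum(n[y] * n[z] for y, z in NNN)
    return on_site + nn + nnn


def phi_37(n, U0, U1, U2, mu):
    """Sum-of-squares form (3.7), without the additive constant C."""
    a = (F(1, 4) * U0 - U1 + U2) * sum((n[y] - F(1, 2)) ** 2 for y in SITES)
    b = (F(1, 4) * U1 - F(1, 2) * U2) * sum((n[y] + n[z] - F(1, 2)) ** 2 for y, z in NN)
    total = sum(n[y] for y in SITES)
    c = U2 * (total - F(1, 2) - F(mu) / (8 * F(U2))) ** 2
    return a + b + c


def differences(U0, U1, U2, mu, n_max=3):
    """Values taken by (3.6) - (3.7) over all occupations in {0..n_max}^4."""
    out = set()
    for occ in product(range(n_max + 1), repeat=4):
        n = dict(zip(SITES, occ))
        out.add(phi_36(n, U0, U1, U2, mu) - phi_37(n, U0, U1, U2, mu))
    return out


def minimizers(U0, U1, U2, mu, n_max=3):
    """Occupations (in the order of SITES) minimizing (3.6) on the block."""
    vals = {occ: phi_36(dict(zip(SITES, occ)), U0, U1, U2, mu)
            for occ in product(range(n_max + 1), repeat=4)}
    lowest = min(vals.values())
    return sorted(occ for occ, v in vals.items() if v == lowest)


params = (F(10), F(3), F(1), F(4))          # U0, U1, U2, mu with 0 < mu < 8*U2
print(differences(*params))                  # a single value: the constant -C
print(minimizers(*params))                   # the four one-boson configurations
```

With these parameter values the coefficients of the first two squares in (3.7) are positive, and the minimizers are the four configurations with exactly one boson in the block, in agreement with the definition of $G_{U_0(x)}$.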
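We take as a sequence of transitions for the smallest quantum fluctuations
\[
\mathcal{S} = \{ (A, A') : A = \langle x, y \rangle \text{ and } A' = \langle y, x \rangle \text{ for some } x, y \in \mathbb{Z}^2,\ \|x - y\|_2 = 1 \}.
\]
The effective potential follows from (2.8). Let $P_{xy} = \{ z : |z - x| \le 1 \text{ or } |z - y| \le 1 \}$ and more generally we denote by P any 3 × 4 or 4 × 3 rectangle. Up to rotations and reflections, we have to take into account five configurations (• denotes an occupied site, ◦ an empty one), namely
\[
g_P^{(A)} = \begin{matrix} \circ\bullet\circ \\ \circ\circ\circ \\ \circ\bullet\circ \\ \circ\circ\circ \end{matrix}
\qquad
g_P^{(B)} = \begin{matrix} \circ\bullet\circ \\ \circ\circ\circ \\ \bullet\circ\bullet \\ \circ\circ\circ \end{matrix}
\qquad
g_P^{(C)} = \begin{matrix} \bullet\circ\bullet \\ \circ\circ\circ \\ \circ\bullet\circ \\ \circ\circ\circ \end{matrix}
\qquad
g_P^{(D)} = \begin{matrix} \bullet\circ\bullet \\ \circ\circ\circ \\ \bullet\circ\bullet \\ \circ\circ\circ \end{matrix}
\qquad
g_P^{(E)} = \begin{matrix} \bullet\circ\circ \\ \circ\circ\bullet \\ \bullet\circ\circ \\ \circ\circ\bullet \end{matrix}
\]
We find $\Psi_P(g_P^{(A)}) = -t^2/2U_1$, $\Psi_P(g_P^{(B)}) = -t^2/4U_2$, and $\Psi_P(g_P^{(C)}) = \Psi_P(g_P^{(D)}) = \Psi_P(g_P^{(E)}) = 0$.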
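We can choose $R = \tfrac32$; $U(x)$ is a block 4 × 4 centered on $(x_1 + \tfrac12, x_2 + \tfrac12)$. The configurations $g_{U(x)} \in G_{U(x)}$ are (up to rotations and reflections)
\[
g^{(a)}_{U(x)} = \begin{matrix} \bullet\circ\bullet\circ \\ \circ\circ\circ\circ \\ \bullet\circ\bullet\circ \\ \circ\circ\circ\circ \end{matrix}
\qquad
g^{(b)}_{U(x)} = \begin{matrix} \bullet\circ\bullet\circ \\ \circ\circ\circ\circ \\ \circ\bullet\circ\bullet \\ \circ\circ\circ\circ \end{matrix}
\]
We choose for $\Upsilon$
\[
\Upsilon_x(n_{U(x)}) = \frac19 \sum_{y,\, U_0(y) \subset U(x)} \tilde\Phi_y(n_{U_0(y)}) + \frac12 \sum_{P \subset U(x)} \Psi_P(n_P), \tag{3.8}
\]
with $\tilde\Phi_y(n_{U_0(y)}) = \Phi_y(n_{U_0(y)}) - \min_{g \in G} \Phi_y(g_{U_0(y)})$. Which configurations, among the four generated by $g^{(a)}$ and the eight generated by $g^{(b)}$, allow for more quantum fluctuations? The effective potential yields
\[
\Upsilon_x(g^{(a)}_{U(x)}) = -\frac{t^2}{2U_1},
\qquad
\Upsilon_x(g^{(b)}_{U(x)}) = -\frac{t^2}{4U_1} - \frac{t^2}{8U_2}.
\]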
We see that the set of dominant states D is formed by all the configurations generated by $g^{(b)}$ (recall that $U_1 > 2U_2$). Heuristically, there is more freedom for the bosons to move in $g^{(b)}$, since they can go to a nearest-neighbour site and feel a small repulsion of strength $U_2$; as for bosons of the configuration $g^{(a)}$, any nearest-neighbour move brings them at distance 1 of another boson, and they feel a bigger repulsion $U_1$. As a result we can choose $\Delta = t^2 \bigl( \frac{1}{8U_2} - \frac{1}{4U_1} \bigr)$ in Assumption 4. The maximum of the expression in Assumption 5 is $\varepsilon_1 = t^2 \bigl( \frac{1}{8U_2} - \frac{1}{4U_1} \bigr)^{-1}$. In Assumption 6 we have $\varepsilon_2 = 0$, because $g \ne g'$ means that g and $g'$ must differ on a whole row, and the matrix element is zero for any finite m. These eight dominant states bring eight pure periodic phases, $\langle\cdot\rangle^{(1)}, \dots, \langle\cdot\rangle^{(8)}$; each one can be constructed by adding a suitable field in the Hamiltonian (e.g. the projector onto the dominant state).

Theorem 3.2 (Bose–Hubbard model). Consider the Bose–Hubbard model on the lattice $\mathbb{Z}^2$ with a generalized hard-core, and suppose $U_0 > 4(U_1 - U_2)$, $U_1 > 2U_2$ and $0 < \mu < 8U_2$. There exist $t_0 > 0$ and $\beta_0(t) < \infty$ ($\lim_{t \to 0} \beta_0(t) = \infty$) such that if $t \le t_0$ and $\beta > \beta_0$,
• the free energy exists in the thermodynamic limit with periodic boundary conditions, as well as expectation values of observables,
• there are 8 pure periodic phases with exponential decay of correlations.
Each of these eight phases is a perturbation of a dominant state d, and the expectation value of any local operator is close to its value in the state d, see Theorem 2.2 for a more precise statement.

Similar properties hold for other quarter-integer density phases. Equation (3.7) may be generalized so as to exhibit gaps for the spectrum of $\Phi$, cf. [BKU2].
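The two values of $\Upsilon_x$ quoted above for $g^{(a)}$ and $g^{(b)}$ can be recovered from the table of $\Psi_P$ by summing over the four 3 × 4 and 4 × 3 rectangles contained in a 4 × 4 block, with the weight $\tfrac12$ of (3.8). The sketch below is illustrative: it encodes occupied sites as '1' and empty sites as '0', takes the five values of $\Psi_P$ from the text as input, and uses that $\tilde\Phi$ vanishes on ground state configurations. The function names are hypothetical.

```python
from fractions import Fraction as F


def variants(rows):
    """All images of a rectangular pattern under lattice rotations and reflections."""
    out, cur = set(), tuple(rows)
    for _ in range(4):
        cur = tuple("".join(r) for r in zip(*cur[::-1]))   # rotate by 90 degrees
        out.add(cur)
        out.add(tuple(r[::-1] for r in cur))               # mirror image
    return out


def psi_table(t, U1, U2):
    """Psi_P on 3x4 and 4x3 rectangles, keyed by every symmetric image."""
    base = {
        ("010", "000", "010", "000"): -t * t / (2 * U1),   # g^(A)
        ("010", "000", "101", "000"): -t * t / (4 * U2),   # g^(B)
        ("101", "000", "010", "000"): F(0),                # g^(C)
        ("101", "000", "101", "000"): F(0),                # g^(D)
        ("100", "001", "100", "001"): F(0),                # g^(E)
    }
    table = {}
    for pattern, value in base.items():
        for image in variants(pattern):
            table[image] = value
    return table


def rectangles(block):
    """The four 3x4 and 4x3 sub-rectangles of a 4x4 block given as rows."""
    yield tuple(r[0:3] for r in block)
    yield tuple(r[1:4] for r in block)
    yield tuple(block[0:3])
    yield tuple(block[1:4])


def upsilon_quantum(block, t, U1, U2):
    """Quantum part of (3.8) on a ground state block: (1/2) * sum of Psi_P."""
    table = psi_table(t, U1, U2)
    return F(1, 2) * sum(table[p] for p in rectangles(block))


g_a = ("1010", "0000", "1010", "0000")
g_b = ("1010", "0000", "0101", "0000")
t, U1, U2 = F(1, 10), F(3), F(1)
print(upsilon_quantum(g_a, t, U1, U2), upsilon_quantum(g_b, t, U1, U2))
```

The difference of the two values is $t^2 \bigl( \frac{1}{8U_2} - \frac{1}{4U_1} \bigr)$, the quantity chosen as Peierls constant in Assumption 4.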
4. Contour Representation of a Quantum Model

Our Hamiltonian has periodicity $\ell_0 < \infty$. Without loss of generality, however, one can consider only translation invariant Hamiltonians, applying the standard trick. Namely, if $\Omega$ is the single site phase space, we let $\Omega' = \Omega^{\{1, \dots, \ell_0\}^\nu}$; $S' = |\Omega'| = S^{\ell_0^\nu}$. Then we consider the torus $\Lambda' \subset \mathbb{Z}^\nu$, $\ell_0^\nu |\Lambda'| = |\Lambda|$, each point of which represents a block of sites in $\Lambda$ of size $\ell_0^\nu$, and identify
\[
\Omega'^{\Lambda'} \simeq \Omega^{\Lambda}.
\]
Constructing $\mathcal{H}'$ as the Hilbert space spanned by the elements of $\Omega'^{\Lambda'}$, it is clear that $\mathcal{H}'$ is isomorphic to $\mathcal{H}$. The new translation invariant interactions $\Phi'$ and $T'$ are defined by resumming, for each $A \subset \Lambda'$, the corresponding contributions with supports in the union of corresponding blocks. Notice the change in range of interactions. Namely, it decreased to $\lceil R/\ell_0 \rceil$ (the lowest integer bigger or equal to $R/\ell_0$). From now on, keeping the original notation $\mathcal{H}$, S, ..., we suppose that the Hamiltonian is translation invariant.

The partition function of a quantum model is a trace over a Hilbert space. But expanding $e^{-\beta H}$ with the help of the Duhamel formula we can reformulate it in terms of the partition function of a classical model in a space with one additional dimension (the extra dimension being continuous). In this section we present such an expansion, leading to a contour representation, of the partition function $Z^{\rm per}_\Lambda := \mathrm{Tr}\, e^{-\beta H^{\rm per}_\Lambda}$ in a finite torus $\Lambda$. Expansion with the help of the Duhamel formula yields
\[
e^{-\beta H_\Lambda} = \sum_{m \ge 0} \ \sum_{A_1, \dots, A_m} \int \cdots
\]
[…]

… $> 0$, there exists $\varepsilon_0 > 0$ such that whenever $\|T\| \le \varepsilon_0$ and $\Gamma \in \mathcal{D}_\Lambda$, we have the loop cluster expansion,
\[
\int_{\mathcal{D}^{\rm loop}_\Lambda(\Gamma)} d\Xi \prod_{\xi \in \Xi} z(\xi) = \exp\Bigl\{ \int_{\mathcal{C}_\Lambda(\Gamma)} dC\, \Phi^T(C) \Bigr\}. \tag{4.13}
\]
Moreover, the weights of the clusters are exponentially decaying (uniformly in $\Lambda$ and $\beta$):
\[
\int_{\mathcal{C}_\Lambda} dC\, \mathbb{I}\bigl[ C \ni (x, \tau) \bigr]\, |\Phi^T(C)| \prod_{\xi \in C} e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \delta \tag{4.14}
\]
and
\[
\int_{\mathcal{C}^{(x,\tau)}_\Lambda} dC\, |\Phi^T(C)| \prod_{\xi \in C} e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \delta \tag{4.15}
\]
for every $(x, \tau) \in T_\Lambda$.

Proof. One can follow any standard reference concerning cluster expansions for continuum systems, for example [Bry]. We are using here [Pfi] whose formulation is closer to our purpose. Assuming that inequality (4.15) holds true, we have a finite bound
\[
\sum_{n \ge 1} \frac{1}{n!} \int_{\mathcal{L}_\Lambda(\Gamma)^n} d\xi_1 \dots d\xi_n\, |\varphi^T(\xi_1, \dots, \xi_n)| \prod_{i=1}^{n} |z(\xi_i)| \le \delta \beta |\Lambda|. \tag{4.16}
\]
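Lemma 4.1 then follows from Lemma 3.1 of [Pfi]. Let us turn to the proof of the two inequalities. Let $f(\xi) = |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|}$. Skipping the conditions $\xi_j \sim \Gamma$, we define
\[
I_n = n \Bigl[ \int_{\mathcal{L}_\Lambda} d\xi_1\, \mathbb{I}\bigl[ B_1 \ni (x, \tau) \bigr] + \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi_1 \Bigr] \cdot \int_{\mathcal{L}^{n-1}_\Lambda} d\xi_2 \dots d\xi_n\, |\varphi^T(\xi_1, \dots, \xi_n)| \prod_{i=1}^{n} f(\xi_i) \tag{4.17}
\]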
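(it does not depend on $(x, \tau) \in T_\Lambda$). The lemma will be completed once we have established that $I_n \le n! (\tfrac12 \delta)^n$ (assuming that $\delta \le 1$; otherwise, we show that $I_n \le n!/2^n$). From Lemma 3.4 of [Pfi], we get
\[
|\varphi^T(\xi_1, \dots, \xi_n)| \le \sum_{T \text{ tree on } n \text{ vertices}} \ \prod_{e(i,j) \in T} \mathbb{I}\bigl[ B_i \cup B_j \text{ connected} \bigr]. \tag{4.18}
\]
Denoting $d_1, \dots, d_n$ the incidence numbers of vertices $1, \dots, n$, we first proceed with the integration on the loops $j \ne 1$ for which $d_j = 1$; in the tree T, such j shares an edge only with one vertex i. The incompatibility between $\xi_i$ and $\xi_j$, with $\xi_i = (B_i, \omega^{(i)}_{B_i}, g^{(i)}_{A_i})$, $B_i = A_i \times [\tau^{(i)}_1, \tau^{(i)}_2]$, and similarly for $\xi_j$, means that either $B_j \cup [A_i \times \tau^{(i)}_1]$ is connected, or $[A_j \times \tau^{(j)}_1] \cup B_i$ is connected. Hence, the bound for the integral over the $\xi_j$ that are incompatible with $\xi_i$ is
\[
\begin{aligned}
\int_{\mathcal{L}_\Lambda} d\xi_j\, \mathbb{I}\bigl[ B_j \cup B_i \text{ connected} \bigr] f(\xi_j)
&\le 2\nu |A_i| \int_{\mathcal{L}_\Lambda} d\xi_j\, \mathbb{I}\bigl[ B_j \ni (x, \tau) \bigr] f(\xi_j) + 2\nu |B_i| \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi_j\, f(\xi_j) \\
&\le 2\nu \bigl( |A_i| + \alpha |B_i| \bigr) \Bigl[ \int_{\mathcal{L}_\Lambda} d\xi_j\, \mathbb{I}\bigl[ B_j \ni (x, \tau) \bigr] f(\xi_j) + \frac{1}{\alpha} \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi_j\, f(\xi_j) \Bigr].
\end{aligned}
\tag{4.19}
\]
(The constant $\alpha$ has been introduced in order to match with the conditions of the next lemma.) Then
\[
\begin{aligned}
I_n \le n (2\nu)^{n-1} \sum_{T \text{ tree of } n \text{ vertices}}
& \Bigl[ \int_{\mathcal{L}_\Lambda} d\xi_1\, \mathbb{I}\bigl[ B_1 \ni (x, \tau) \bigr] + \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi_1 \Bigr] f(\xi_1) \bigl( |A_1| + \alpha |B_1| \bigr)^{d_1} \\
& \cdot \prod_{j=2}^{n} \Bigl[ \int_{\mathcal{L}_\Lambda} d\xi_j\, \mathbb{I}\bigl[ B_j \ni (x, \tau) \bigr] f(\xi_j) \bigl( |A_j| + \alpha |B_j| \bigr)^{d_j - 1} + \frac{1}{\alpha} \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi_j\, f(\xi_j) \bigl( |A_j| + \alpha |B_j| \bigr)^{d_j - 1} \Bigr].
\end{aligned}
\tag{4.20}
\]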
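Now summing over all trees, knowing that the number of trees with n vertices and incidence numbers $d_1, \dots, d_n$ is equal to
\[
\frac{(n-2)!}{(d_1 - 1)! \dots (d_n - 1)!} \le \frac{(n-1)!}{d_1! (d_2 - 1)! \dots (d_n - 1)!},
\]
we find a bound
\[
I_n \le n! (2\nu)^{n-1} (1 + \alpha) \Bigl[ \int_{\mathcal{L}_\Lambda} d\xi\, \mathbb{I}\bigl[ B \ni (x, \tau) \bigr] f(\xi)\, e^{|A| + \alpha |B|} + \frac{1}{\alpha} \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi\, f(\xi)\, e^{|A| + \alpha |B|} \Bigr]^n. \tag{4.21}
\]
We conclude by using the following lemma which implies that the quantity between the brackets is small. □

Lemma 4.2. Let $\alpha_1 < (4R_0)^{-\nu}$ and $\alpha_2 < R^{-2\nu} \Delta_0$. For any $c \in \mathbb{R}$ and $\delta > 0$, there exists $\varepsilon_0 > 0$ such that whenever $\|T\| \le \varepsilon_0$ the following inequality holds true,
\[
\int_{\mathcal{L}_\Lambda} d\xi\, \mathbb{I}\bigl[ B \ni (x, \tau) \bigr] |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} + \int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi\, |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \le \delta,
\]
where $(x, \tau)$ is any space-time site of $T_\Lambda$.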
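The counting formula for trees with prescribed incidence numbers, used in the proof of Lemma 4.1 above, together with the inequality that accompanies it, can be confirmed by brute force for small n. The sketch below is only an illustration: it enumerates all sets of $n - 1$ edges on n labelled vertices, keeps the acyclic ones (which are exactly the trees), and compares the count for each degree sequence with $(n-2)!/\prod_i (d_i - 1)!$. Function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import factorial


def is_tree(n, edges):
    """True if the given n-1 edges contain no cycle (hence form a spanning tree)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False                     # this edge would close a cycle
        parent[ra] = rb
    return True


def trees_by_degrees(n):
    """Number of labelled trees on n vertices for each degree sequence."""
    all_edges = list(combinations(range(n), 2))
    counts = Counter()
    for edges in combinations(all_edges, n - 1):
        if is_tree(n, edges):
            deg = [0] * n
            for a, b in edges:
                deg[a] += 1
                deg[b] += 1
            counts[tuple(deg)] += 1
    return counts


def formula(degrees):
    """(n-2)! / prod_i (d_i - 1)!"""
    value = factorial(len(degrees) - 2)
    for d in degrees:
        value //= factorial(d - 1)
    return value


def looser_bound(degrees):
    """(n-1)! / (d_1! * prod_{j>=2} (d_j - 1)!)"""
    denom = factorial(degrees[0])
    for d in degrees[1:]:
        denom *= factorial(d - 1)
    return factorial(len(degrees) - 1) // denom


for n in range(2, 6):
    counts = trees_by_degrees(n)
    ok = all(c == formula(d) <= looser_bound(d) for d, c in counts.items())
    print(n, sum(counts.values()), ok)
```

The inequality amounts to $d_1 \le n - 1$, and the extra factor $n - 1$ on its right-hand side combines with the prefactor n of (4.20) into the $n!$ of (4.21).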
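Proof. Let us first consider the integral over $\xi$ such that its box contains a given space-time site. We denote by $\ell_1$ the number of quantum transitions of $\xi$ at times bigger than $\tau$, and $\ell_2$ the number of the other quantum transitions. The integral over $\xi$ can be done by summing over $(\ell_1 + \ell_2)$ quantum transitions $A^1_1, \dots, A^1_{\ell_1}, A^2_1, \dots, A^2_{\ell_2}$, by summing over $(\ell_1 + \ell_2)$ configurations $n^{i,j}_{A^i_j}$, and by integrating over times $\tau^1_1 < \dots < \tau^1_{\ell_1}$, $\tau^2_1 < \dots < \tau^2_{\ell_2}$. Let us do the change of variables $\tilde\tau^1_1 = \tau^1_1 - \tau$, $\tilde\tau^1_2 = \tau^1_2 - \tau^1_1$, ..., $\tilde\tau^1_{\ell_1} = \tau^1_{\ell_1} - \tau^1_{\ell_1 - 1}$, and $\tilde\tau^2_1 = \tau - \tau^2_1$, ..., $\tilde\tau^2_{\ell_2} = \tau^2_{\ell_2 - 1} - \tau^2_{\ell_2}$. Then we can write the following upper bound:
\[
\begin{aligned}
\int_{\mathcal{L}_\Lambda} d\xi\, & \mathbb{I}\bigl[ B \ni (x, \tau) \bigr] |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|} \\
& \le \sum_{\ell_1, \ell_2 \ge 1} \ \sum_{\substack{A^1_1, \dots, A^2_{\ell_2} \\ \cup_{i,j} \bar{A}^i_j = A \ni x \\ A \text{ connected}}} \ \sum_{n^{1,1}_{A^1_1}, \dots, n^{2,\ell_2}_{A^2_{\ell_2}} \notin G_A} \int_0^\infty d\tilde\tau^1_1 \dots d\tilde\tau^2_{\ell_2} \prod_{i=1,2} \prod_{j=1}^{\ell_i} \bigl| \langle n^{i,j}_A |\, T_{A^i_j} \,| n^{i,j+1}_A \rangle \bigr| \\
& \qquad \cdot e^{(c - \alpha_1 \log\|T\|) |\bar{A}^i_j|}\, \exp\Bigl\{ -\tilde\tau^i_j \sum_{y,\, U_0(y) \subset A} \bigl[ \Phi_y(n^{i,j}_{U_0(y)}) - \Phi_y(g_{U_0(y)}) \bigr] \Bigr\}\, e^{\tilde\tau^i_j R^\nu \alpha_2},
\end{aligned}
\tag{4.22}
\]
where $g_A \in G_A$ is the configuration in which the loop $\xi$ is immersed (if the construction does not lead to a possible loop, we find a bound by picking any $g_A \in G_A$). Remark that we neglected a constraint on the sum over configurations, namely $n^{1,1}_A = n^{2,1}_A$. It is useful to note that the sums over $\ell_1, \ell_2$ and over the quantum transitions are finite, otherwise they cannot constitute a loop. Using the definition (2.6) of $\|T\|$, we have $|\langle n'_A |\, T_A \,| n_A \rangle| \le \|T\|^{|A|}$. Furthermore
\[
\sum_{x,\, U_0(x) \subset A} \bigl[ \Phi_x(n^{i,j}_{U_0(x)}) - \Phi_x(g_{U_0(x)}) \bigr] \ge R^{-\nu} \Delta_0,
\]
as claimed in Property (2.5). Hence we have, since the number of configurations on A is bounded with $S^{|A|}$,
\[
\int_{\mathcal{L}_\Lambda} d\xi\, \mathbb{I}\bigl[ B \ni (x, \tau) \bigr] |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|}
\le \sum_{\ell_1, \ell_2 \ge 1} \ \sum_{\substack{A^1_1, \dots, A^2_{\ell_2} \\ \cup_{i,j} \bar{A}^i_j = A \ni x \\ A \text{ connected}}} \ \prod_{i=1,2} \prod_{j=1}^{\ell_i} \frac{\bigl( \|T\|^{1 - \alpha_1 (4R_0)^\nu}\, S\, e^{c (4R_0)^\nu} \bigr)^{|A^i_j|}}{R^{-\nu} \Delta_0 - R^\nu \alpha_2}.
\tag{4.23}
\]
This is a small quantity since the sums are finite, by taking $\|T\|$ small enough.

Now we turn to the second term, namely
\[
\int_{\mathcal{L}^{(x,\tau)}_\Lambda} d\xi\, |z(\xi)|\, e^{(c - \alpha_1 \log\|T\|)|A| + \alpha_2 |B|}.
\]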
The proof is similar; we first sum over the number of transitions $\ell$, then over $\ell$ transitions $A_1, \dots, A_\ell$ with $A = \cup_i \bar{A}_i \ni x$, A connected. Then we choose $\ell - 1$ intermediate configurations. Finally, we integrate over $\ell - 1$ time intervals. The resulting equation looks very close to (4.23) and is small for the same reasons. □

Now, we single out the class of small clusters. Namely, a cluster is small if the sequence of its quantum transitions belongs to the list $\mathcal{S}$. To be more precise, we have to specify the order of transitions: considering a cluster $C \equiv (\xi_1, \dots, \xi_k)$ and using $S(\xi^{(\ell)})$, $\ell = 1, \dots, k$, to denote the sequence of quantum transitions of the loop $\xi^{(\ell)} = (B^{(\ell)}, \omega_{B^{(\ell)}}, g^{\xi^{(\ell)}}_A)$, $S(\xi^{(\ell)}) \equiv S(B^{(\ell)}, \omega_{B^{(\ell)}})$, we take the sequence $S(C)$ obtained by combining the sequences $S(\xi^{(1)}), \dots, S(\xi^{(k)})$ in this order. A cluster C is said to be small if $S(C) \in \mathcal{S}$, it is large otherwise. We use $\mathcal{C}^{\rm small}_\Lambda$ to denote the set of all small clusters on the torus $T_\Lambda$.

The local contribution to the energy at time $\tau$, when the system is in a state $n_{U_0(x)}(\tau)$, is $\Phi_x(n_{U_0(x)}(\tau))$. Similarly, we will introduce the local contribution of loops (and small clusters of loops) in the expansion of the partition function: the effective potential $\Psi^\beta_A(n_A(\tau))$. The latter is a local quantity in the sense that it depends on n only on the set A at time $\tau$. An explicit expression of $\Psi^\beta_A(g_A)$ with $g \in G$ is, in terms of small clusters,
\[
\Psi^\beta_A(g_A) := - \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \frac{\Phi^T(C)}{|I_C|}\, \mathbb{I}\bigl[ C \sim g_A,\ A_C = A,\ I_C \ni 0 \bigr]. \tag{4.24}
\]
Here, again, C also denotes the support of the cluster C, $A_C$ its horizontal projection onto $\mathbb{Z}^\nu$, $A_C = \{ x \in \mathbb{Z}^\nu ; x \times [0, \beta]_{\rm per} \cap C \ne \emptyset \}$, and $I_C$ its vertical projection, $|A_C|$ and $|I_C|$ their corresponding areas, and the condition $C \sim g_A$ means that each loop of C is immersed in the ground state g. Notice that the “horizontal extension” of any small cluster is at most 2R: if C is a small cluster, $\mathrm{diam}(A_C) \le 2R$. The definitions introduced to write the effective potential (see the appendix) are now clear, once we identify the effective potential $\Psi$ defined in (A.1) as the limit $\beta \to \infty$ of (4.24). Namely,
\[
\Psi = \lim_{\beta \to \infty} \Psi^\beta.
\]
Our assumptions in Sect. 2.3 concern the limit $\beta \to \infty$ of the effective potential, but at non zero temperature we have to work with $\Psi^\beta$. To trace down the difference, we introduce $\psi^\beta = \Psi^\beta - \Psi$. Notice that (4.24) implies $\Psi^\beta_A(n_A) = 0$ whenever $n_A \notin G_A$ or $\mathrm{diam}\, A < 4R_0$.

Recalling that if $C \subset T_\Lambda$, $\tilde{C}$ is the smallest box containing C, we introduce, for any cluster $C \in \mathcal{C}^{\rm small}_\Lambda$, the function
\[
\Phi^T(C; \Gamma) = \frac{\Phi^T(C)}{|I_C|} \int_{I_C} d\tau \Bigl( \mathbb{I}\bigl[ C \sim \Gamma \bigr] - \mathbb{I}\bigl[ n^\Gamma_{A_C}(\tau) \in G_{A_C},\ C \sim n^\Gamma_{A_C}(\tau) \bigr] \Bigr). \tag{4.25}
\]
Here, the first indicator function in the parenthesis singles out the clusters each loop of which is compatible with $\Gamma$, while the second indicator concerns the clusters for which $n^\Gamma_{A_C}(\tau) \in G_{A_C}$ and each of their loops is immersed in the configuration $n^\Gamma_A(\tau)$ (extended as a constant to all the time interval $I_C$). Observing that $\Phi^T(C; \Gamma) = 0$ whenever $\tilde{C} \cap \mathrm{core}\, \Gamma = \emptyset$, we split the integral over small clusters into its bulk part expressed in terms of the effective potential and boundary terms “decorating” the quantum contours from $\Gamma$.
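Lemma 4.3. For any fixed $\Gamma \in \mathcal{D}_\Lambda$, one has
\[
\int_{\mathcal{C}^{\rm small}_\Lambda(\Gamma)} dC\, \Phi^T(C) = - \int_{T_\Lambda} d(A, \tau)\, \Psi_A(n^\Gamma_A(\tau)) - \int_{T_\Lambda} d(A, \tau)\, \psi^\beta_A(n^\Gamma_A(\tau)) + \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \Phi^T(C; \Gamma).
\]
The term $\Phi^T(C; \Gamma)$ vanishes whenever $\tilde{C} \cap \mathrm{core}\, \Gamma = \emptyset$.

Similarly as $\int d(x, \tau)$, the shorthand $\int d(A, \tau)$ means $\sum_A \int d\tau$.

Proof. To get the equality of integrals, it is enough to rewrite
\[
\int_{\mathcal{C}^{\rm small}_\Lambda(\Gamma)} dC\, \Phi^T(C) = \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \Phi^T(C)\, \mathbb{I}\bigl[ C \sim \Gamma \bigr] \tag{4.26}
\]
and
\[
\int_{T_\Lambda} d(A, \tau)\, \Psi^\beta_A(n^\Gamma_A(\tau)) = - \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \frac{\Phi^T(C)}{|I_C|} \int_{I_C} d\tau\, \mathbb{I}\bigl[ n^\Gamma_{A_C}(\tau) \in G_{A_C},\ C \sim n^\Gamma_{A_C}(\tau) \bigr]. \tag{4.27}
\]
Moreover, whenever $\tilde{C} \cap \mathrm{core}\, \Gamma = \emptyset$, the configuration $n^\Gamma_{A_C}(\tau)$ belongs to $G_{A_C}$, and it is constant, for all $\tau \in I_C$. Under these circumstances, the condition $C \sim \Gamma$ is equivalent to $C \sim n^\Gamma_{A_C}(\tau)$ and the right hand side of (4.25) vanishes. □

Whenever $\Gamma \in \mathcal{D}_\Lambda$ is fixed, let $W_d(\Gamma) \subset T_\Lambda$ be the set of space-time sites in the state d, i.e.
\[
W_d(\Gamma) = \{ (x, \tau) \in T_\Lambda : n^\Gamma_{U(x)}(\tau) = d_{U(x)} \}.
\]
Notice that
\[
T_\Lambda = \mathrm{supp}\, \Gamma \cup \bigcup_{d \in D} W_d(\Gamma); \qquad W_d(\Gamma) \cap W_{d'}(\Gamma) = \emptyset \ \text{ if } d \ne d',
\]
and the set $\mathrm{supp}\, \Gamma \cap W_d(\Gamma)$ is of measure zero (with respect to the measure $d(x, \tau)$ on $T_\Lambda$). Let us recall that the equivalent potential $\Upsilon$ satisfies the equality $\sum_{x \in \Lambda} \Upsilon_x(n_{U(x)}) = \sum_{A \subset \Lambda} \bigl( \Phi_A(n_A) + \Psi_A(n_A) \bigr) + \mathrm{const}\, |\Lambda|$ for any configuration n on the torus $\Lambda$; actually, we can take const = 0, since $\Upsilon$ and $\Upsilon' = \Upsilon + \mathrm{const}$ are also physically equivalent, and $\Upsilon'$ satisfies the same assumptions as $\Upsilon$.

Lemma 4.4. The partition function (4.9) can be rewritten as
\[
Z^{\rm per}_\Lambda = \int_{\mathcal{D}_\Lambda} d\Gamma \prod_{d \in D} e^{-|W_d(\Gamma)| e(d)} \prod_{\gamma \in \Gamma} z(\gamma)\, e^{R(\Gamma)}.
\]
Here the weight $z(\gamma)$ of a quantum contour $\gamma = (B, \omega_B)$ with the sequence of transitions $(A_1, \dots, A_m)$ at times $(\tau_1, \dots, \tau_m)$ is
\[
z(\gamma) = \prod_{i=1}^{m} \langle n^\gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\gamma_{A_i}(\tau_i + 0) \rangle \exp\Bigl\{ - \int_B d(x, \tau)\, \Upsilon_x(n^\gamma_{U(x)}(\tau)) \Bigr\}. \tag{4.28}
\]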
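The rest $R(\Gamma)$ is given by
\[
R(\Gamma) = \int_{\mathcal{C}_\Lambda(\Gamma) \setminus \mathcal{C}^{\rm small}_\Lambda(\Gamma)} dC\, \Phi^T(C) - \int_{T_\Lambda} d(A, \tau)\, \psi^\beta_A(n^\Gamma_A(\tau)) + \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \Phi^T(C; \Gamma). \tag{4.29}
\]
Proof. Using Lemmas 4.1 and 4.3 to substitute in (4.9) the contribution of loops by the action of the effective potential, we get
\[
Z^{\rm per}_\Lambda = \int_{\mathcal{D}_\Lambda} d\Gamma \Bigl\{ \prod_{i=1}^{m} \langle n^\Gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\Gamma_{A_i}(\tau_i + 0) \rangle \Bigr\} \cdot \exp\Bigl\{ - \int_{T_\Lambda} d(A, \tau) \bigl( \Phi_A(n^\Gamma_A(\tau)) + \Psi_A(n^\Gamma_A(\tau)) \bigr) \Bigr\}\, e^{R(\Gamma)}. \tag{4.30}
\]
Replacing $\Phi + \Psi$ by the physically equivalent potential $\Upsilon$, we get
\[
Z^{\rm per}_\Lambda = \int_{\mathcal{D}_\Lambda} d\Gamma \Bigl\{ \prod_{i=1}^{m} \langle n^\Gamma_{A_i}(\tau_i - 0) |\, T_{A_i} \,| n^\Gamma_{A_i}(\tau_i + 0) \rangle \Bigr\} \exp\Bigl\{ - \int_{\mathrm{supp}\, \Gamma} d(x, \tau)\, \Upsilon_x(n^\Gamma_{U(x)}(\tau)) \Bigr\} \prod_{d \in D} e^{-e(d) |W_d(\Gamma)|}\, e^{R(\Gamma)}. \tag{4.31}
\]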
We get our lemma by observing that the product over quantum transitions and the first exponential factorize with respect to the quantum contours, as was the case for the loops (for fermions the sign arising because of anticommutation relations also factorizes; we again refer to [DFF1] for the proof). □

Our goal is to obtain a classical lattice system in $\nu + 1$ dimensions. Thus we introduce a discretization of the continuous time direction, by choosing suitable parameters $\tilde\beta > 0$ and $N \in \mathbb{N}$ with $\beta = N \tilde\beta / \Delta$.¹¹ Setting $\mathbb{L}_\Lambda$ to be the $(\nu + 1)$-dimensional discrete torus $\mathbb{L}_\Lambda = \Lambda \times \{0, 1, \dots, N - 1\}^{\rm per}$ (let us recall that $\Lambda$ has periodic boundary conditions in all spatial directions) and using $C(x, t) \subset \mathbb{R}^{\nu+1}$ to denote, for any $(x, t) \in \mathbb{L}_\Lambda$, the cell centered in $(x, \frac{\tilde\beta}{\Delta} t)$ with vertical length $\tilde\beta / \Delta$, we have $T_\Lambda = \cup_{(x,t) \in \mathbb{L}_\Lambda} C(x, t)$. For any $M \subset \mathbb{L}_\Lambda$, we set $C(M)$ to be the union of all cells centered at sites of M, $C(M) = \cup_{(x,t) \in M} C(x, t) \subset T_\Lambda$. Conversely, if $B \subset T_\Lambda$, we take $M(B) \subset \mathbb{L}_\Lambda$ to be the smallest set such that $C(M(B)) \supset B$.

Given a connected¹² set $M \subset \mathbb{L}_\Lambda$ and a collection of quantum contours $\Gamma \in \mathcal{D}_\Lambda$, we define
\[
\begin{aligned}
\varphi(M; \Gamma) = {} & \int_{\mathcal{C}_\Lambda(\Gamma) \setminus \mathcal{C}^{\rm small}_\Lambda(\Gamma)} dC\, \mathbb{I}\bigl[ M(C) = M \bigr] \Phi^T(C) \\
& + \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \mathbb{I}\bigl[ M(C) = M,\ C \not\subset C(\mathrm{supp}\, \Gamma) \bigr] \Phi^T(C; \Gamma) - \int_{M(A \times \tau) = M} d(A, \tau)\, \psi^\beta_A(n^\Gamma_A(\tau))
\end{aligned}
\tag{4.32}
\]
and
\[
\tilde{R}(\Gamma) = \int_{\mathcal{C}^{\rm small}_\Lambda} dC\, \mathbb{I}\bigl[ C \subset C(\mathrm{supp}\, \Gamma) \bigr] \Phi^T(C; \Gamma). \tag{4.33}
\]

¹¹ Note the difference from [BKU1]; here the vertical length of a unit cell $\tilde\beta / \Delta$ depends on $\|T\|$, since so does the quantum Peierls constant $\Delta$.
¹² Connectedness in $\mathbb{L}_\Lambda$ is meant in the standard way via nearest neighbours.
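We have separated the contributions of the small clusters inside $C(\mathrm{supp}\, \Gamma) \equiv C(M(\mathrm{supp}\, \Gamma))$, because they are not necessarily a small quantity, and it is impossible to expand them. On the contrary, $\varphi(M; \Gamma)$ is small, and hence it is natural to write
\[
e^{R(\Gamma)} = e^{\tilde{R}(\Gamma)} \sum_{\mathcal{M}} \prod_{M \in \mathcal{M}} \bigl( e^{\varphi(M; \Gamma)} - 1 \bigr), \tag{4.34}
\]
with the sum running over all collections $\mathcal{M}$ of connected subsets of $\mathbb{L}_\Lambda$. Let $\mathrm{supp}\, \mathcal{M} = \cup_{M \in \mathcal{M}} M$. Given a set of quantum contours $\Gamma \in \mathcal{D}_\Lambda$ and a collection $\mathcal{M}$, we introduce contours on $\mathbb{L}_\Lambda$ by decomposing the set $M(\mathrm{supp}\, \Gamma) \cup \mathrm{supp}\, \mathcal{M}$ into connected components [notice that if $(x, t) \notin M(\mathrm{supp}\, \Gamma) \cup \mathrm{supp}\, \mathcal{M}$, then $C(x, t) \subset \cup_{d \in D} W_d(\Gamma)$]. Namely, a contour Y is a pair $(\mathrm{supp}\, Y, \alpha_Y)$, where $\mathrm{supp}\, Y \subset \mathbb{L}_\Lambda$ is a (non-empty) connected subset of $\mathbb{L}_\Lambda$, and $\alpha_Y$ is a labeling of connected components F of $\partial C(\mathrm{supp}\, Y)$, $\alpha_Y(F) = 1, \dots, r$. We write $|Y|$ for the length (area) of the contour Y, i.e. the number of sites in $\mathrm{supp}\, Y$. A set of contours $\mathcal{Y} = \{Y_1, \dots, Y_k\}$ is admissible if the contours are mutually disjoint and if the labeling is constant on the boundary of each connected component of $T_\Lambda \setminus \cup_{Y \in \mathcal{Y}} C(\mathrm{supp}\, Y)$. Finally, given an admissible set of contours $\mathcal{Y}$, we define $W_d(\mathcal{Y})$ to be the union of all connected components M of $\mathbb{L}_\Lambda \setminus \cup_{Y \in \mathcal{Y}} \mathrm{supp}\, Y$ such that $C(M)$ has label d on its boundary.

Consider now any quantum configuration $\omega \in Q_\Lambda$ yielding, together with a collection $\mathcal{M}$, a fixed set of contours $\mathcal{Y}$. Summing over all such configurations $\omega$ and collections $\mathcal{M}$, we get the weight to be attributed to the set $\mathcal{Y}$. Let $\Gamma^\omega$ be the collection of quantum contours corresponding to $\omega$, $\cup_{Y \in \mathcal{Y}} \mathrm{supp}\, Y = M(\mathrm{supp}\, \Gamma^\omega) \cup \mathrm{supp}\, \mathcal{M}$. Given that the configurations $\omega$ are necessarily constant with no transition on $T_\Lambda \setminus C(\cup_{Y \in \mathcal{Y}} \mathrm{supp}\, Y)$, we easily see that the weight factor splits into a product of weight factors of single contours $Y \in \mathcal{Y}$. Namely, for the weight z of a contour Y we get the expression
\[
z(Y) = \int_{\mathcal{D}_\Lambda(Y)} d\Gamma \prod_{\gamma \in \Gamma} z(\gamma) \prod_{d \in D} e^{-e(d) |W_d(\Gamma) \cap C(\mathrm{supp}\, Y)|}\, e^{\tilde{R}(\Gamma)} \sum_{\mathcal{M}} \mathbb{I}\bigl[ M(\mathrm{supp}\, \Gamma) \cup \mathrm{supp}\, \mathcal{M} = \mathrm{supp}\, Y \bigr] \prod_{M \in \mathcal{M}} \bigl( e^{\varphi(M; \Gamma)} - 1 \bigr), \tag{4.35}
\]
where $\mathcal{D}_\Lambda(Y)$ is the set of quantum configurations compatible with Y, $\Gamma \in \mathcal{D}_\Lambda(Y)$ if $\mathrm{supp}\, \Gamma \subset \mathrm{supp}\, Y$ and the labels on the boundary of $\mathrm{supp}\, \Gamma$ match with labels of Y. Thus, we can finally rewrite the partition function in a form that agrees with the standard Pirogov–Sinai setting, namely
\[
Z^{\rm per}_\Lambda = \sum_{\mathcal{Y}} \prod_{d \in D} e^{-\frac{\tilde\beta}{\Delta} e(d) |W_d(\mathcal{Y})|} \prod_{Y \in \mathcal{Y}} z(Y), \tag{4.36}
\]
with the sum being over all admissible sets of contours on $\mathbb{L}_\Lambda$. In the next section we will evaluate the decay rate of contour weights in preparation to apply, in Sect. 6, the Pirogov–Sinai theory to prove Theorems 2.1, 2.2, and 2.3.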
5. Exponential Decay of the Weight of the Contours

In this section we show that the weight z has exponential decay with respect to the length of the contours. We begin by a lemma proving that the contribution of $\mathcal{M}$ is small, that we shall use in Lemma 5.2 below for the bound of z.

Lemma 5.1. Under Assumptions 1–6, for any $c < \infty$ there exist constants $\beta_0, \tilde\beta_0 < \infty$, and $\varepsilon_0 > 0$ such that for any $\beta > \beta_0$, $\tilde\beta_0 \le \tilde\beta < 2\tilde\beta_0$, and $\|T\|, \varepsilon_1, \varepsilon_2 \le \varepsilon_0$, one has
\[
\sum_{M \ni (x,t)} \bigl| e^{\varphi(M; \Gamma)} - 1 \bigr|\, e^{c|M|} \le 1
\]
for any contour Y and any set of quantum contours $\Gamma \in \mathcal{D}_\Lambda(Y)$.

Proof. We show that
\[
\sum_{M \ni (x,t)} \bigl| \varphi(M; \Gamma) \bigr|\, e^{c|M|} \le 1.
\]
This implies that $|\varphi(M; \Gamma)| \le 1$ and consequently Lemma 5.1 holds, with a slightly smaller constant c. Let us consider separately, in (4.32), the three terms on the right hand side: (a) the integral over big clusters, (b) the integral over small clusters, and (c) the expression involving $\psi^\beta$.

(a) Big clusters. Our aim is to estimate
\[
J = \sum_{M \ni (x,t)} e^{c|M|} \Bigl| \int_{\mathcal{C}_\Lambda(\Gamma) \setminus \mathcal{C}^{\rm small}_\Lambda(\Gamma)} dC\, \mathbb{I}\bigl[ M(C) = M \bigr] \Phi^T(C) \Bigr|.
\]
Since $M(C) = M$ and $M \ni (x, t)$, the cell $C(x, t)$ intersects a quantum transition of C, or it is contained in a box B belonging to a loop of C (both possibilities may occur at the same time). In the first case we start the integral over clusters by choosing the time for the first quantum transition, which yields a factor $\tilde\beta / \Delta$. In the second case we simply integrate over all loops containing the given site. At the same time, given a cluster $C = (\xi_1, \dots, \xi_n)$, $\xi_i = (B_i, \omega^{(i)}_{B_i}, g^{\xi_i}_{A_i})$ and $B_i = A_i \times [\tau^{(i)}_1, \tau^{(i)}_2]$, the condition $M(C) = M$ implies that
\[
\sum_{i=1}^{n} \Bigl\{ |A_i| + \frac{\Delta}{\tilde\beta} |B_i| \Bigr\} \ge |M|. \tag{5.1}
\]
Using it to bound $|M|$, we get the estimate
\[
J \le \frac{\tilde\beta}{\Delta} \int_{\mathcal{C}^{(x,\tau)}_\Lambda \setminus \mathcal{C}^{\rm small}_\Lambda} dC\, |\Phi^T(C)| \prod_{\xi \in C} e^{c|A| + c \frac{\Delta}{\tilde\beta} |B|} + \int_{\mathcal{C}_\Lambda \setminus \mathcal{C}^{\rm small}_\Lambda} dC\, \mathbb{I}\bigl[ C \ni (x, \tau) \bigr] |\Phi^T(C)| \prod_{\xi \in C} e^{c|A| + c \frac{\Delta}{\tilde\beta} |B|}. \tag{5.2}
\]
Taking, in Lemma 4.1, the constant c as above as well as $\alpha_1 = \tfrac12 (4R_0)^{-\nu}$, $\alpha_2 = c\Delta / \tilde\beta$, $\delta = 1$, and choosing the corresponding $\varepsilon_0(c, \alpha_1, \alpha_2, \delta)$, we can bound the second term
of (5.2), for any kT k 6 ε0 , with the help of (4.14) once β˜ is chosen large enough to satisfy c 2ν β˜ R . > 1 10
(5.3)
To estimate the first term of (5.2), we first consider the contribution of those clusters for which Y 1 β˜ −ν kT k− 2 (4R0 ) |A| . 6 1 ξ ∈C
Applying it together with (5.3) we can directly use the bound (4.15). Thus it remains to estimate the contribution of those terms for which X ˜ log(β/1) 1 . |A| < 2(4R0 )ν log(1/kT k)
(5.4)
ξ ∈C
Let us first fix β˜ and ε0 6 ε0 (c, α1 , α2 , δ) with the constants c, α1 , α2 , and δ as above, so that c 2ν β˜ > R ε0 10
(5.5)
and, in the same time, 1 0
−ν
k− k (4R0 ) β˜ 6 ε0 2
(5.6)
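for a suitably large $k'$ (we also assume that $\varepsilon_0 \leq 1$). Here $k$ is the constant that appears in Assumption 4, $\Delta(\|T\|) \geq \|T\|^{k}$. Observing further that $\Delta(\|T\|)$ can be taken to increase with $\|T\|$ (one can always consider a weaker lower bound $\Delta$ when taking smaller $\|T\|$), we conclude that (5.3), as well as the condition
\[ 2(4R_0)^{\nu}\, \frac{\log(\tilde\beta/\Delta)}{\log(1/\|T\|)} \leq k', \]
are satisfied for every $\|T\| \leq \varepsilon_0$. Thus, it suffices to find an upper bound to
\[ J' = \frac{\tilde\beta}{\Delta} \int_{\mathcal{C}_\Lambda^{(x,\tau)} \setminus \mathcal{C}_\Lambda^{\mathrm{small}}} dC\, |\Phi^{\mathrm T}(C)|\; \mathbb{I}\Bigl[\sum_{\xi\in C} |A| < k'\Bigr]. \tag{5.7} \]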
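The step from (5.6) to the condition involving $k'$ can be checked directly. The following is only a sketch under stated assumptions: (5.6) is read as $\tilde\beta \leq \varepsilon_0^{k - \frac12 k'(4R_0)^{-\nu}}$, and in addition $\tilde\beta \geq 1$ and $\varepsilon_0 < 1$ are assumed, so that the logarithms have the signs used below.

```latex
% Sketch: why (5.6) yields the k' condition for every \|T\| \le \varepsilon_0.
% Assumptions made here (not stated in this form in the text):
%   \tilde\beta \ge 1 and \varepsilon_0 < 1.
\begin{align*}
\log\frac{\tilde\beta}{\Delta}
  &\le \log\tilde\beta + k\,\log\frac{1}{\|T\|}
  && \text{by } \Delta(\|T\|)\ge\|T\|^{k},\\
\frac{\log(\tilde\beta/\Delta)}{\log(1/\|T\|)}
  &\le \frac{\log\tilde\beta}{\log(1/\varepsilon_0)} + k
  && \text{since } \log\tfrac{1}{\|T\|}\ge\log\tfrac{1}{\varepsilon_0}>0
     \text{ and } \log\tilde\beta\ge 0,\\
\frac{\log\tilde\beta}{\log(1/\varepsilon_0)} + k
  &\le \tfrac12\,k'\,(4R_0)^{-\nu}
  && \text{by (5.6), after taking logarithms.}
\end{align*}
% Multiplying the chain by 2(4R_0)^{\nu} gives the stated bound by k'.
```

Read this way, (5.6) is exactly the requirement that the worst case $\|T\| = \varepsilon_0$ of the ratio is already controlled by $k'$.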
The main problem in estimating this term stems from the factor $1/\Delta$ that may be large if $\|T\|$ is small. Thus, to have a bound valid for all small $\|T\|$, some terms, coming from the integral, that would suppress this factor must be displayed. The condition $\sum_{\xi\in C} |A| < k'$ will be used several times by applying its obvious consequences: (i) the number of loops in $C$ is smaller than $k'$, (ii) the number of transitions for each loop is smaller than $k'$, (iii) each transition $A$ is such that $|A| < k'$, and (iv) the distance between each transition and $x$ is smaller than $k'$.
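Furthermore, we use Assumption 5 to bound the contribution of the transitions of $C$; recalling the definition (4.11) of the weight of $\xi$, we have, for any large $C$,
\[ \prod_{\xi\in C} |z(\xi)| \leq \varepsilon_1 \Delta \prod_{\xi\in C} \exp\Bigl\{ -\int_{B} d(x,\tau)\, \bigl[ \Phi_x(n^{\xi}_{U_0(x)}(\tau)) - \Phi_x(g^{\xi}_{U_0(x)}) \bigr] \Bigr\} \leq \varepsilon_1 \Delta \prod_{\xi\in C} e^{-R^{-2\nu}\Delta_0 |B|}. \tag{5.8} \]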
In the last inequality we used Assumption 2 in the form of the bound (2.5) as well as the lower bound $|\tau_2 - \tau_1| = \frac{|B|}{|A|} \geq \frac{|B|}{R^{\nu}}$ for the support $B = A \times [\tau_1,\tau_2]$ of the loop $\xi$. For any $\xi \in C = (\xi_1,\dots,\xi_n)$, let $\tau$ be the time at which the first transition in $C$ occurs (we assume that it happens for the "first" loop $\xi_1$) and $\tau^{\xi}$ be such that $\tau + \tau^{\xi}$ is the time at which the first transition in $\xi$ occurs ($\tau^{\xi_1} = 0$). Referring to the condition (i) on the number of loops in $C$, we get the inequality
\[ \sum_{\xi} |\tau^{\xi}| \leq k' \sum_{\xi} |B|, \]
and thus also
\[ 1 \leq \prod_{\xi \neq \xi_1} e^{-\frac{\Delta_0}{2k'R^{2\nu}} |\tau^{\xi}|} \prod_{\xi} e^{\frac12 R^{-2\nu}\Delta_0 |B|}. \]
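The second inequality follows from the first in one step. As a quick check (a sketch that uses only the displayed bound on $\sum_\xi |\tau^\xi|$ and the convention $\tau^{\xi_1} = 0$):

```latex
% Sketch: the exponent of the product on the right-hand side is non-negative.
\begin{align*}
\sum_{\xi\neq\xi_1}\frac{\Delta_0}{2k'R^{2\nu}}\,|\tau^{\xi}|
  &= \frac{\Delta_0}{2k'R^{2\nu}}\sum_{\xi}|\tau^{\xi}|
  && \text{since } \tau^{\xi_1}=0,\\
  &\le \frac{\Delta_0}{2k'R^{2\nu}}\;k'\sum_{\xi}|B|
   = \tfrac12\,R^{-2\nu}\Delta_0\sum_{\xi}|B| .
\end{align*}
% Hence  -\sum_{\xi\ne\xi_1} \frac{\Delta_0}{2k'R^{2\nu}}|\tau^\xi|
%        + \tfrac12 R^{-2\nu}\Delta_0 \sum_\xi |B| \ge 0,
% which is the claimed lower bound 1 for the product of exponentials.
% One-sided time integral, assuming \tau^\xi \ge 0 and a = \Delta_0/(2k'R^{2\nu}):
%   \int_0^\infty e^{-a t}\,dt = 1/a = 2k'R^{2\nu}/\Delta_0 .
```

In effect half of the decay rate $R^{-2\nu}\Delta_0$ from (5.8) is traded for decay in the relative times $\tau^\xi$. If $\tau^\xi \geq 0$ is assumed (the first transition of $C$ occurs at $\tau$), each time integral contributes the factor $2k'R^{2\nu}/\Delta_0$ computed in the last comment.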
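Integrating now over the time of the first transition for each $\xi \in C$, $\xi \neq \xi_1$, and taking into account that $|\varphi^{\mathrm T}(\xi_1,\dots,\xi_n)| \leq n^{n-2}$, we get
\[ J' \leq \tilde\beta \varepsilon_1 \sum_{n=1}^{k'} \frac{n^{n-2}}{(n-1)!} \Bigl[ \frac{2k'R^{2\nu}}{\Delta_0} \Bigr]^{n-1} \Bigl\{ \int_{\mathcal{L}_\Lambda^{(x,\tau)}} d\xi\, e^{-\frac12 R^{-2\nu}\Delta_0 |B|}\, \mathbb{I}\bigl[\xi : k'\bigr] \Bigr\}^{n}. \tag{5.9} \]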
Here the constraint $\mathbb{I}[\xi_i : k']$ means that the loop $\xi_i$ satisfies the conditions (ii)–(iv) above. We have then a finite number of finite terms, the contribution of which is bounded by a fixed number $K < \infty$ (depending on $\varepsilon_0$, $\tilde\beta$, and $k'$). Thus $J' \leq \tilde\beta \varepsilon_1 K$, which we can suppose sufficiently small if $\varepsilon_1$ is small.
(b) Small clusters. Let us first notice that $|\Phi^{\mathrm T}(C;\Gamma)| \leq |\Phi^{\mathrm T}(C)|$, and since $M(C) = M$, inequality (5.1) is valid. Moreover $C$ must contain at least one of the two boundary points $\bigl(y, \frac{\tilde\beta}{\Delta}(t \pm \frac12)\bigr)$ of some cell $C(y,t)$ for which $\mathrm{dist}(x,y) \leq R$. Indeed, given that $C$ is small and at the same time $\tilde C \cap \mathrm{core}\,\Gamma \neq \emptyset$ (cf. Lemma 4.3), this is the only way to satisfy also $C \not\subset C(\mathrm{supp}\,\Gamma)$ [cf. (4.32)]. Thus it suffices to use again (4.14) and (5.3) to estimate
\[ (2R)^{\nu} \int_{\mathcal{C}_\Lambda^{\mathrm{small}}} dC\, \mathbb{I}\bigl[C \ni (x,\tau)\bigr]\, |\Phi^{\mathrm T}(C)| \prod_{\xi\in C} e^{c|A| + c\frac{\Delta}{\tilde\beta}|B|}. \]
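(c) Bound for $\psi^{\beta}$. Finally, we estimate the expression involving $\psi^{\beta}$. We first observe that
\[ e^{\alpha\beta}\, |\psi^{\beta}_A(g_A)| \leq 1 \tag{5.10} \]
for any $A \subset \mathbb{Z}^{\nu}$ and with $\alpha = \frac12 R^{-2\nu}\Delta_0$. Indeed,
\[ \begin{aligned} e^{\alpha\beta}\, |\psi^{\beta}_A(g_A)| &= e^{\alpha\beta}\, |\Psi^{\beta}_A(g_A) - \Psi_A(g_A)| \\ &= e^{\alpha\beta}\, \Bigl| -\int_{\mathcal{C}_\Lambda^{\mathrm{small}}} dC\, \mathbb{I}\bigl[C \sim g_A,\ A_C = A,\ I_C \ni 0,\ C \subset \Lambda\times[0,\beta]^{\mathrm{per}},\ |I_C| = \beta\bigr]\, \frac{\Phi^{\mathrm T}(C)}{|I_C|} \\ &\qquad\quad + \int_{\mathcal{C}_\Lambda^{\mathrm{small}}} dC\, \mathbb{I}\bigl[C \sim g_A,\ A_C = A,\ I_C \ni 0,\ C \subset \Lambda\times[-\infty,\infty],\ |I_C| \geq \beta\bigr]\, \frac{\Phi^{\mathrm T}(C)}{|I_C|} \Bigr|. \end{aligned} \tag{5.11} \]
The first integral above corresponds to clusters wrapped around the torus in vertical direction, while the second one assumes integration over all clusters in $\Lambda \times [-\infty,\infty]$. For any $C$ above $|I_C| \geq \beta$ and thus
\[ e^{\alpha\beta} \leq \prod_{\xi\in C} e^{\alpha|B|}. \]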
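The last bound can be unpacked as follows. This is a sketch under assumptions that are not spelled out in this passage: $I_C$ is taken to be the projection of the cluster $C$ onto the time axis, the time intervals $[\tau_1,\tau_2]$ of the loops of $C$ are taken to cover $I_C$, and every transition set satisfies $|A| \geq 1$.

```latex
% Sketch, with I_C read as the time projection of the cluster C
% and |B| = |A|\,|\tau_2-\tau_1| for each loop support B = A x [\tau_1,\tau_2].
\begin{equation*}
\beta \;\le\; |I_C|
      \;\le\; \sum_{\xi\in C}|\tau_2-\tau_1|
      \;=\; \sum_{\xi\in C}\frac{|B|}{|A|}
      \;\le\; \sum_{\xi\in C}|B|
\quad\Longrightarrow\quad
e^{\alpha\beta}\;\le\; e^{\alpha\sum_{\xi\in C}|B|}
      \;=\;\prod_{\xi\in C}e^{\alpha|B|},
\qquad \alpha>0 .
\end{equation*}
```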
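Observing now that every cluster in both integrals necessarily contains in its support at least one of the points $(x,0)$, $x \in A$, and using the fact that $\mathrm{diam}\, A \leq R$, we can bound the first integral by
\[ \frac{R^{\nu}}{\beta} \int_{\mathcal{C}_\Lambda^{\mathrm{small}}} dC\, \mathbb{I}\bigl[C \ni (x,0)\bigr]\, |\Phi^{\mathrm T}(C)| \prod_{\xi\in C} e^{\alpha|B|}, \]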
which can be directly evaluated by (4.14). The same bound can actually be used also for the second integral, once we realize that the estimate (4.14) is uniform in $\beta$. Using now the fact that $\psi^{\beta}_A = 0$ if $\mathrm{diam}\, A > R$, the condition $M(A \times \{\tau\}) = M$ implies that $M$ has less than $R^{\nu}$ sites, hence $e^{c|M|} \leq e^{cR^{\nu}}$. Furthermore, referring to (5.10), we have
\[ \int_{\mathbb{T}_\Lambda} d(A,\tau)\, |\psi^{\beta}_A(\cdot)|\, \mathbb{I}\bigl[M(A \times \{\tau\}) = M\bigr]\, e^{c|M|} \leq \frac{\tilde\beta}{\Delta}\, e^{-\frac12 R^{-2\nu}\Delta_0 \beta + cR^{\nu}}, \tag{5.12} \]
which can be made small for $\beta$ sufficiently large and thus concludes the proof of the lemma. □
Using Lemma 5.1 and introducing $e_0 = \min_{d\in D} e(d)$, we can estimate the weight $z$ of the contours in the discrete space of cells.
Lemma 5.2. Under Assumptions 1–6, for any $c < \infty$, there exist $\beta_0, \tilde\beta_0 < \infty$ and $\varepsilon_0 > 0$ such that for any $\beta \geq \beta_0$, $\tilde\beta_0 \leq \tilde\beta < 2\tilde\beta_0$, and $\|T\|, \varepsilon_1, \varepsilon_2 \leq \varepsilon_0$, one has
\[ |z(Y)| \leq e^{-\frac{\tilde\beta}{\Delta} e_0 |Y|}\, e^{-c|Y|} \]
for any contour $Y$.
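Proof. For a given $\Gamma$ (such that $M(\mathrm{supp}\,\Gamma) \subset \mathrm{supp}\, Y$) with transitions $\{A_1,\dots,A_m\}$ at times $\{\tau_1,\dots,\tau_m\}$, we define $A(\Gamma) = \cup_{i=1}^{m} \cup_{x\in A_i} [U(x) \times \tau_i]$, $A = M(A(\Gamma))$, and $\mathcal{E} \subset \mathrm{supp}\, Y \setminus A$ to be the set of sites $(x,t)$ such that $n^{\Gamma}_{U(x)}(\tau) \notin D_{U(x)}$ for some $(x,\tau) \in C(x,t)$. The latter can be split into two disjoint subsets, $\mathcal{E} = \mathcal{E}^{\mathrm{core}} \cup \mathcal{E}^{\mathrm{soft}}$, with $(x,t) \in \mathcal{E}^{\mathrm{core}}$ whenever $n^{\Gamma}_{U(x)}(\tau) \notin G_{U(x)}$ for some $(x,\tau) \in C(x,t)$. The condition $M(\mathrm{supp}\,\Gamma) \cup \mathrm{supp}\,\mathcal{M} = \mathrm{supp}\, Y$ in (4.35) implies the inequality
\[ e^{c|Y|} \leq e^{c(2R)^{\nu}|A(\Gamma)|}\, e^{c|\mathcal{E}|} \prod_{M\in\mathcal{M}} e^{c|M|}. \]
From definitions (4.35) of $z(Y)$ and (4.28) of $z(\gamma)$, and using Assumption 4, we have
\[ \begin{aligned} e^{c|Y|}\, |z(Y)| \leq{}& \sum_{A \subset \mathrm{supp}\, Y} e^{-\frac{\tilde\beta}{\Delta} e_0 |\mathrm{supp}\, Y \setminus A|} \sum_{\mathcal{E} \subset \mathrm{supp}\, Y \setminus A}\ \sum_{\mathcal{E}^{\mathrm{core}} \subset \mathcal{E}} e^{-(\tilde\beta - c)|\mathcal{E} \setminus \mathcal{E}^{\mathrm{core}}|}\, e^{-\bigl(\frac{\tilde\beta}{\Delta}\frac{\Delta_0}{2}(2R)^{-\nu} - c\bigr)|\mathcal{E}^{\mathrm{core}}|} \\ &\times \int_{\mathcal{D}_\Lambda} d\Gamma\, \mathbb{I}\bigl[M(A(\Gamma)) = A,\ M(\mathrm{core}\,\Gamma) = \mathcal{E}^{\mathrm{core}}\bigr] \prod_{i=1}^{m} \bigl| \langle n^{\Gamma}_{A_i}(\tau_i - 0) |\, T_{A_i}\, | n^{\Gamma}_{A_i}(\tau_i + 0) \rangle \bigr|\, e^{c(2R)^{\nu}|A_i|} \\ &\times \exp\Bigl\{ -\int_{C(A)} d(x,\tau)\, \Upsilon_x(n^{\Gamma}_{U(x)}(\tau)) \Bigr\}\, e^{|\tilde R(\Gamma)|} \sum_{\mathcal{M},\ \mathrm{supp}\,\mathcal{M} \subset \mathrm{supp}\, Y}\ \prod_{M\in\mathcal{M}} \bigl| e^{\varphi(M;\Gamma)} - 1 \bigr|\, e^{c|M|}. \end{aligned} \tag{5.13} \]
All elements in $\mathcal{M}$ are different, because it is so in the expansion (4.34). Therefore we have
\[ \begin{aligned} \sum_{\mathcal{M},\ \mathrm{supp}\,\mathcal{M} \subset \mathrm{supp}\, Y}\ \prod_{M\in\mathcal{M}} \bigl| e^{\varphi(M;\Gamma)} - 1 \bigr|\, e^{c|M|} &\leq \sum_{n\geq 0} \frac{1}{n!} \Bigl[ \sum_{M \subset \mathrm{supp}\, Y} \bigl| e^{\varphi(M;\Gamma)} - 1 \bigr|\, e^{c|M|} \Bigr]^{n} \\ &\leq \sum_{n\geq 0} \frac{1}{n!} \Bigl[ |Y| \sum_{M \ni (x,t)} \bigl| e^{\varphi(M;\Gamma)} - 1 \bigr|\, e^{c|M|} \Bigr]^{n}, \end{aligned} \tag{5.14} \]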
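and using Lemma 5.1 this may be bounded by $e^{|Y|}$. In (4.33) clusters are small, and they must contain a space-time site $(x,\tau)$ such that there exists $x'$ with $(x',\tau) \in \mathrm{core}\,\Gamma$ and $\mathrm{dist}(x,x') < R$. So we have the bound
\[ |\tilde R(\Gamma)| \leq (2R)^{\nu}\, |\mathrm{core}\,\Gamma| \int_{\mathcal{C}_\Lambda^{\mathrm{small}}} dC\, \mathbb{I}\bigl[C \ni (x,\tau)\bigr]\, |\Phi^{\mathrm T}(C)|, \]
since $|\Phi^{\mathrm T}(C;\Gamma)| \leq |\Phi^{\mathrm T}(C)|$. Taking now, in Lemma 4.1, the constants $c = \alpha_1 = \alpha_2 = 0$ and $\delta = \frac{\Delta_0}{4(2R)^{2\nu}}$, and choosing the corresponding $\varepsilon_0$, we apply (4.14) to get, for any $\|T\| \leq \varepsilon_0$, the bound
\[ |\tilde R(\Gamma)| \leq \frac{\Delta_0}{4}(2R)^{-\nu}\, |\mathrm{core}\,\Gamma| \leq \frac{\tilde\beta}{\Delta}\frac{\Delta_0}{4}(2R)^{-\nu}\, |\mathcal{E}^{\mathrm{core}}| + \frac{\Delta_0}{4}(2R)^{-\nu}\, |\mathrm{core}\,\Gamma \cap C(A)|. \]
Assuming $\tilde\beta \geq c$ and $\frac{\tilde\beta}{\Delta}\frac{\Delta_0}{4} \geq (2R)^{\nu} c$ [cf. (5.3)], we bound
\[ e^{-(\tilde\beta - c)|\mathcal{E} \setminus \mathcal{E}^{\mathrm{core}}|}\, e^{-\bigl(\frac{\tilde\beta}{\Delta}\frac{\Delta_0}{4}(2R)^{-\nu} - c\bigr)|\mathcal{E}^{\mathrm{core}}|} \leq 1. \]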
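Inserting these estimates into (5.13), we get
\[ \begin{aligned} e^{c|Y|}\, |z(Y)| \leq{}& e^{-\frac{\tilde\beta}{\Delta} e_0 |Y|}\, e^{|Y|} \sum_{A \subset \mathrm{supp}\, Y} 3^{|\mathrm{supp}\, Y \setminus A|} \int_{\mathcal{D}_\Lambda} d\Gamma\, \mathbb{I}\bigl[M(A(\Gamma)) = A\bigr] \prod_{i=1}^{m} \bigl| \langle n^{\Gamma}_{A_i}(\tau_i - 0) |\, T_{A_i}\, | n^{\Gamma}_{A_i}(\tau_i + 0) \rangle \bigr|\, e^{c(2R)^{\nu}|A_i|} \\ &\times \exp\Bigl\{ -\int_{C(A)} d(x,\tau)\, \Bigl[ \Upsilon_x(n^{\Gamma}_{U(x)}(\tau)) - e_0 - \frac{\Delta_0}{4}(2R)^{-\nu}\, \mathbb{I}\bigl[(x,\tau) \in \mathrm{core}\,\Gamma\bigr] \Bigr] \Bigr\}. \end{aligned} \tag{5.15} \]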
To estimate the above expression, we will split the "transition part" of the considered quantum contours into connected components, to be called fragments, and deal with them separately. Even though the weight of a quantum contour cannot be partitioned into the corresponding fragments, we will get an upper bound combined from fragment bounds. Consider thus the set
\[ \hat A(\Gamma) = \mathrm{core}\,\Gamma \cap C(A(\Gamma)) \]
and the fragments $\zeta_i = (B_i, \omega_{B_i})$ on the connected components $B_i$ of $\hat A(\Gamma)$, $\hat A(\Gamma) = \cup_{i=1}^{n} B_i$, where $\omega_{B_i}$ is the restriction of $\omega_{\Gamma}$ onto $B_i$. From Assumption 4, we have
\[ \int_{C(A)} d(x,\tau)\, \Bigl[ \Upsilon_x(n^{\Gamma}_{U(x)}(\tau)) - e_0 - \frac{\Delta_0}{4}(2R)^{-\nu}\, \mathbb{I}\bigl[(x,\tau) \in \mathrm{core}\,\Gamma\bigr] \Bigr] \geq \tfrac14 (2R)^{-\nu} \Delta_0 \sum_{i=1}^{n} |B_i|. \]
Let us introduce a bound for the contribution of a fragment $\zeta$ with transitions $A_j$, $j = 1,\dots,k$,
\[ \hat z(\zeta) = e^{-\frac14 (2R)^{-\nu}\Delta_0 |B|} \prod_{j=1}^{k} \bigl| \langle n^{\zeta}_{A_j}(\tau_j - 0) |\, T_{A_j}\, | n^{\zeta}_{A_j}(\tau_j + 0) \rangle \bigr|\, e^{c(2R)^{\nu}|A_j|}. \]
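Then, integrating over the set $\mathcal{F}_{C(A)}$ of all fragments in $C(A)$, we get
\[ e^{c|Y|}\, |z(Y)| \leq e^{-\frac{\tilde\beta}{\Delta} e_0 |Y|}\, e^{|Y|} \sum_{A \subset \mathrm{supp}\, Y} 3^{|\mathrm{supp}\, Y \setminus A|} \sum_{n\geq 0} \frac{1}{n!} \Bigl[ \int_{\mathcal{F}_{C(A)}} d\zeta\, \hat z(\zeta) \Bigr]^{n}. \tag{5.16} \]
Anticipating the bound $\int_{\mathcal{F}_{C(A)}} d\zeta\, \hat z(\zeta) \leq |A|$, we immediately get the claim,
\[ e^{c|Y|}\, |z(Y)| \leq e^{-\frac{\tilde\beta}{\Delta} e_0 |Y|}\, e^{3|Y|}, \]
with a slight change of constant $c \to c - 3$.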
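The factor $e^{3|Y|}$ in the claim can be traced explicitly. The computation below is a sketch that assumes $|\mathrm{supp}\, Y| = |Y|$ (the number of cells of the contour) and uses the anticipated fragment bound $\int d\zeta\, \hat z(\zeta) \leq |A|$.

```latex
% Sketch: bookkeeping behind e^{3|Y|}, assuming |supp Y| = |Y|.
\begin{align*}
\sum_{A\subset\operatorname{supp}Y} 3^{|\operatorname{supp}Y\setminus A|}
   \sum_{n\ge 0}\frac{|A|^{n}}{n!}
 &= \sum_{A\subset\operatorname{supp}Y}
      3^{|\operatorname{supp}Y\setminus A|}\,e^{|A|}
   && \text{exponential series}\\
 &= (3+e)^{|Y|}
   && \text{binomial theorem}\\
 &\le e^{2|Y|}
   && \text{since } 3+e<5.72<e^{2}.
\end{align*}
% Together with the prefactor e^{|Y|} this gives e^{|Y|} e^{2|Y|} = e^{3|Y|}.
```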
A bound on the integral of fragments. Let us first consider short fragments $\zeta = (B, \omega_B)$ satisfying the condition
\[ \frac12 \sum_{j=1}^{k} |A_j| \leq \frac{\log(\tilde\beta/\Delta)}{\log(1/\|T\|)} \leq \log\tilde\beta + k \tag{5.17} \]
(if $\|T\| \leq 1$). The integral over the time of occurrence of the first transition yields the factor $\tilde\beta/\Delta$. Notice that $\zeta$ is not a loop. This follows from the construction of quantum contours and the fact that $B$ is a connected component of $\hat A(\Gamma)$, where every transition is taken together with its $R$-neighbourhood. Thus, either its sequence of transitions does not belong to $S$, or the starting configuration does not coincide with the ending configuration. In the first case we use Assumption 5, in the second case Assumption 6, and since (5.17) means that the sum over transitions is bounded, we can write
\[ \int_{\mathcal{F}^{\mathrm{short}}_{C(A)}} d\zeta\, \hat z(\zeta) \leq \tfrac12 |A|, \tag{5.18} \]
if $\varepsilon_1$ and $\varepsilon_2$ are small enough, independently of $\|T\|$. Finally, we estimate the integral over $\zeta$'s that are not short. We have
\[ \int_{\mathcal{F}_{C(A)} \setminus \mathcal{F}^{\mathrm{short}}_{C(A)}} d\zeta\, \hat z(\zeta) \leq |A|\, \frac{\tilde\beta}{\Delta} \int_{\mathcal{F}^{(x,\tau)}_{C(A)} \setminus \mathcal{F}^{\mathrm{short}}_{C(A)}} d\zeta\, \hat z(\zeta). \tag{5.19} \]
Here $\mathcal{F}^{(x,\tau)}_{C(A)}$ is the set of all fragments $\zeta$ whose first quantum transition $(A_1,\tau_1)$ is such that $x \in A_1$ and $\tau = \tau_1$. Whenever $\zeta$ is not short, we have
\[ 1 \leq \frac{\Delta}{\tilde\beta} \prod_{j=1}^{k} \|T\|^{-\frac12 |A_j|}. \]
Thus, defining
\[ \hat z'(\zeta) = e^{-\frac14 (2R)^{-\nu}\Delta_0 |B|} \prod_{j=1}^{k} \Bigl[ \|T\|^{\frac12}\, e^{c(2R)^{\nu} + 1} \Bigr]^{|A_j|}, \tag{5.20} \]
we find the bound
\[ |A| \int_{\mathcal{F}(x,\tau)} d\zeta\, \hat z'(\zeta). \]
Here, slightly overestimating, we take for $\mathcal{F}(x,\tau)$ the set of all fragments containing a quantum transition $(A,\tau)$ with $x \in A$. The support $B$ of a fragment $\zeta = (B, \omega_B) \in \mathcal{F}(x,\tau)$ is a finite union of vertical segments (i.e. sets of the form $\{y\} \times [\tau_1,\tau_2] \subset \mathbb{T}_\Lambda$) and $k$ horizontal quantum transitions $A_1,\dots,A_k$. We will finish the proof by proving by induction the bound
\[ \int_{\mathcal{F}(x,\tau;k)} d\zeta\, \hat z'(\zeta) \leq 1 \tag{5.21} \]
with $\mathcal{F}(x,\tau;k)$ denoting the set of fragments from $\mathcal{F}(x,\tau)$ with at most $k$ quantum transitions.
Consider thus a fragment $\zeta$ with $k$ horizontal quantum transitions connected by vertical segments. Let $(A,\tau)$ be the transition containing the point $(x,\tau)$ and let $(A_1, \tau+\tau_1),\dots,(A_\ell, \tau+\tau_\ell)$ be the transitions that are connected by (one or several) vertical segments of the respective lengths $|\tau_1|,\dots,|\tau_\ell|$ with the transition $(A,\tau)$. If we remove all those segments, the fragment $\zeta$ will split into the "naked" transition $(A,\tau)$ and additional $\bar\ell \leq \ell$ fragments $\zeta_1,\dots,\zeta_{\bar\ell}$, such that each fragment $\zeta_j$, $j = 1,\dots,\bar\ell$, belongs to $\mathcal{F}(y_j, \tau+\tau_j; k-1)$ with $y_j \in A$. Taking into account that the number of configurations (determining the possible vertical segments attached to $A$) above and below $A$ is bounded by $S^{2|A|}$ and that the number of possibilities to choose the points $y_j$ is bounded by $|A|^{\bar\ell}$, we get Z X |A| 1 ν dζ zˆ 0 (ζ ) 6 kT k 2 ec(2R) +1 S 2 F (x,τ ;k)
A,dist (A,x) β0 , β˜0 6 β˜ < 2β˜0 , kT k + r−1 i=1 k ∂µi T k 6 ε0 , and ε1 , ε2 6 ε0 , one has ∂ β˜ µ ˜ | e− 1 e0 |Y | e−c|Y | z(Y ) 6 α β|Y ∂µi for any contour Y . Proof. From the definition (4.35) of z, one has ∂ z(Y ) 6 ∂µi X X ∂ µ ∂ ∂ ˜ R(0) z(0) + e (d) + Wd ∩ C(supp Y ) 6 |z(Y )| ∂µi ∂µi ∂µi γ ∈0 d∈D Z Y Y µ ˜ d0 |z(γ )| e−e (d)|Wd ∩C(supp Y )| e|R(0)| + D3 (Y )
γ ∈0
d∈D
X I M(supp 0) ∪ supp M = supp Y M
X ∂ ϕ(M; 0) eϕ(M;0) ∂µi
M∈M
Y M 0 ∈M,M 0 6 =M
ϕ(M 0 ;0) e − 1 .
(5.23)
The bound for $|\frac{\partial}{\partial\mu_i} z(\Gamma)|$ is standard, see [BKU1], and $|\frac{\partial}{\partial\mu_i} e^{\mu}(d)|$ is assumed to be bounded in Assumption 7. For the other terms we have to control clusters of loops. Since we have exponential decay for $z(\xi)$ with any strength (by taking $\beta$ large and $\|T\|$ small), we have the same for $\frac{\partial}{\partial\mu_i} z(\xi)$ (by taking $\beta$ larger and $\|T\|$ smaller). The integrals over $C$ can be estimated as before, the only effect of the derivative being an extra factor $n$ (when the clusters have $n$ loops). □
6. Expectation Values of Local Observables and Construction of Pure States
So far we have obtained an expression (4.36) for the partition function $Z^{\mathrm{per}}_\Lambda$ of the quantum model on torus $\Lambda$ in terms of that of a classical lattice contour model with the weights of the contours showing an exponential decay with respect to their length. Using the same weights $z(Y)$, we can also introduce the partition functions $Z^{d}_{\Lambda(L)}$ with the torus $\Lambda$ replaced by a hypercube $\Lambda(L)$ and with fixed boundary conditions $d$. Namely, we take simply the sum only over those collections $\mathcal{Y}$ of contours whose external contours are labeled by $d$ and are not close to the boundary.13 Notice, however, that here we are defining $Z^{d}_{\Lambda(L)}$ directly in terms of the classical contour model, without ensuring existence of corresponding partition function for the original model. We will use these partition functions only as a tool for proving our theorems that are stated directly in terms of quantum models.
To be more precise, we can extend the definition even more and consider, instead of the torus $\Lambda$, any finite set $V \subset \mathbb{L} = \mathbb{Z}^{\nu} \times \{0,1,\dots,N-1\}^{\mathrm{per}}$. There is a class of contours that can be viewed as having their support contained in $V \subset \mathbb{L}$. For any such contour $Y$ we introduce its interior $\mathrm{Int}\, Y$ as the union of all finite components of $\mathbb{L} \setminus \mathrm{supp}\, Y$ and $\mathrm{Int}_d\, Y$ as the union of all components of $\mathrm{Int}\, Y$ whose boundary is labelled by $d$. Recalling that we assumed $\nu \geq 2$, we note that the set $\mathbb{L} \setminus (\mathrm{supp}\, Y \cup \mathrm{Int}\, Y)$ is a connected set, implying that the label $\alpha_Y(\cdot)$ is constant on the boundary of the set $V(Y) = \mathrm{supp}\, Y \cup \mathrm{Int}\, Y$. We say that $Y$ is a $d$-contour, if $\alpha_Y = d$ on this boundary. Two contours $Y$ and $Y'$ are called mutually external if $V(Y) \cap V(Y') = \emptyset$. Given an admissible set $\mathcal{Y}$ of contours, we say that $Y \in \mathcal{Y}$ is an external contour in $\mathcal{Y}$, if $\mathrm{supp}\, Y \cap V(Y') = \emptyset$ for all $Y' \in \mathcal{Y}$, $Y' \neq Y$. The sets $\mathcal{Y}$ contributing to $Z^{d}_V$ are such that all their external contours are $d$-contours and $\mathrm{dist}(Y, \partial V) > 1$ for every $Y \in \mathcal{Y}$.
In this way we find ourselves exactly in the setting of standard Pirogov–Sinai theory, or rather, the reformulation for "thin slab" (cylinder $\mathbb{L}$ of fixed temporal size $N$) as presented in Sects. 5–7 and Appendix of [BKU1]. In particular, for sufficiently large $\beta$ and sufficiently small $\|T\| + \sum_{i=1}^{r-1} \|\frac{\partial}{\partial\mu_i} T\|$, there exist functions $f^{\beta,\mu}(d)$, metastable free energies, such that the condition $\mathrm{Re}\, f^{\beta,\mu}(d) = f_0$, with $f_0 \equiv f_0^{\beta,\mu}$ defined by $f_0 = \min_{d'\in D} \mathrm{Re}\, f^{\beta,\mu}(d')$, characterizes the existence of pure stable phase $d$. Namely, as will be shown next, a pure stable phase $\langle\cdot\rangle^{d}_{\beta}$ exists and is close to the pure ground state $|d\rangle$.
There is one subtlety in the definition of $f^{\beta,\mu}(d)$. Namely, after choosing a suitable $\tilde\beta_0$, given $\beta$, there exist several pairs $(\tilde\beta, N)$ such that $\tilde\beta \in (\tilde\beta_0, 2\tilde\beta_0)$ and $N\tilde\beta = \beta$. To be specific, we may agree to choose among them that one with maximal $N$. The function $f^{\beta,\mu}(d)$ is then uniquely defined for each $\beta \geq \beta_0$. Notice, however, that while increasing $\beta$, we pass, at the particular value $\beta_N = N\tilde\beta_0$, from discretization of temporal size $N$ to $N+1$. As a result, the function $f^{\beta,\mu}(d)$ might be discontinuous at $\beta_N$ with $\beta = \infty$ being an accumulation point of such discontinuities. Nevertheless, these discontinuities are harmless. They can appear only when $\mathrm{Re}\, f^{\beta,\mu}(d) > f_0$ and do not change anything in the following argument.
13 In the terminology of Pirogov–Sinai theory we rather mean diluted partition functions – see the more precise definition below.
Before we come to the construction of pure stable phases, notice that the first claim of Theorem 2.2 (equality of $f_0$ with the limiting free energy) is now a direct consequence of the bound
\[ \bigl| Z^{\mathrm{per}}_\Lambda - |Q|\, e^{-\tilde\beta f_0 N L^{\nu}} \bigr| \leq e^{-\tilde\beta f_0 N L^{\nu}}\, O(e^{-\mathrm{const}\, L}) \tag{6.1} \]
[cf. [BKU1], (7.14)]. Here $Q = \{d;\ \mathrm{Re}\, f^{\beta,\mu}(d) = f_0\}$. The expectation value of a local observable $K$ is defined as
\[ \langle K \rangle^{\mathrm{per}}_\Lambda = \frac{\mathrm{Tr}\, K\, e^{-\beta H_\Lambda}}{\mathrm{Tr}\, e^{-\beta H_\Lambda}}. \tag{6.2} \]
In Sect. 4 we have obtained a contour expression for $Z^{\mathrm{per}}_\Lambda = \mathrm{Tr}\, e^{-\beta H_\Lambda}$. We retrace here the same steps for $Z^{\mathrm{per}}_\Lambda(K) := \mathrm{Tr}\, K\, e^{-\beta H_\Lambda}$. The Duhamel expansion (4.1) for $Z^{\mathrm{per}}_\Lambda(K)$ leads to an equation analogous to (4.2),
per
Z3 (K) =
X
X
Z
X
m > 0 n0 ,...nm A1 ,...,Am 3 3 A¯ i ⊂3
the only stable phase is $d$, $\mathrm{Re}\, f^{\beta,\mu,\alpha}(d) = f_0^{\beta,\mu,\alpha}$ and, at the same time, $\mathrm{Re}\, f^{\beta,\mu,\alpha}(d') > f_0^{\beta,\mu,\alpha}$ for $d' \neq d$. Thus, $Q_{\mu,\alpha} = \{d\}$ and $\langle\cdot\rangle^{d}_{\beta,\mu,\alpha} = \langle\cdot\rangle^{\mathrm{per}}_{\beta,\mu,\alpha}$. This state is thermodynamically stable – when adding any small perturbation, metastable free energies will change only a little and that one corresponding to the state $d$ will still be the only one attaining the minimum. The fact that in the limit of vanishing perturbation we recover $\langle\cdot\rangle^{d}_{\beta,\mu,\alpha}$, as well as the fact that
\[ \lim_{\alpha\to 0+} \langle\cdot\rangle^{\mathrm{per}}_{\beta,\mu,\alpha} \equiv \lim_{\alpha\to 0+} \langle\cdot\rangle^{d}_{\beta,\mu,\alpha} = \langle\cdot\rangle^{d}_{\beta,\mu}, \]
follows by inspecting the contour representations of the corresponding expectations and observing that it can be expressed in terms of converging cluster expansions whose terms depend smoothly on $\alpha$ as well as on the additional perturbation.
To prove, finally, the claim b) of Theorem 2.2, it suffices to show that it is valid for $\langle\cdot\rangle^{\mathrm{per}}_{\beta,\mu,\alpha} = \langle\cdot\rangle^{d}_{\beta,\mu,\alpha}$ for every $\alpha > 0$. Abbreviating $\langle\cdot\rangle^{\mathrm{per}}_{\beta,\mu,\alpha} = \langle\cdot\rangle^{\mathrm{per}}$ and $H^{\mu,\alpha}_\Lambda = H_\Lambda$, we first notice that the expectation value of the projector onto the configuration $d$ on $\mathrm{supp}\, K$, $P^{d}_{\mathrm{supp}\, K} := |d_{\mathrm{supp}\, K}\rangle\langle d_{\mathrm{supp}\, K}|$, is close to 1, since its complement $\langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}} = \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{d}$ is related to the presence of a contour intersecting or surrounding $\mathrm{supp}\, K$ (loops intersecting $\mathrm{supp}\, K \times \{0\}$ are considered here as part of quantum contours), whose weight is small.
14 Recall that, up to now, the state $\langle\cdot\rangle^{d}_{\beta}$ is defined only in terms of the contour representation [see (6.9), (6.8), and (4.36)], and the only proven connection with a state of original quantum model is the equality (6.11).
15 Actually, we can restrict $\delta^{d}_A$ only to a particular type of sets $A$ – for example all hypercubes of side $R$.
More precisely, for any $\delta > 0$ we have
\[ \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}} \leq \delta\, |\mathrm{supp}\, K|, \]
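whenever $\|T\|, \varepsilon_1, \varepsilon_2$ are small enough and $\beta$ large enough. Furthermore,
\[ \langle K \rangle^{\mathrm{per}}_\Lambda = \frac{1}{Z^{\mathrm{per}}_\Lambda} \Bigl[ \mathrm{Tr}\, P^{d}_{\mathrm{supp}\, K} K P^{d}_{\mathrm{supp}\, K}\, e^{-\beta H_\Lambda} + \mathrm{Tr}\, (1 - P^{d}_{\mathrm{supp}\, K}) K P^{d}_{\mathrm{supp}\, K}\, e^{-\beta H_\Lambda} + \mathrm{Tr}\, K (1 - P^{d}_{\mathrm{supp}\, K})\, e^{-\beta H_\Lambda} \Bigr] \tag{6.12} \]
and
\[ \mathrm{Tr}\, P^{d}_{\mathrm{supp}\, K} K P^{d}_{\mathrm{supp}\, K}\, e^{-\beta H_\Lambda} = \langle d_\Lambda | K | d_\Lambda \rangle\, \mathrm{Tr}\, P^{d}_{\mathrm{supp}\, K}\, e^{-\beta H_\Lambda} = \langle d_\Lambda | K | d_\Lambda \rangle \Bigl[ \mathrm{Tr}\, e^{-\beta H_\Lambda} - \mathrm{Tr}\, (1 - P^{d}_{\mathrm{supp}\, K})\, e^{-\beta H_\Lambda} \Bigr], \tag{6.13} \]
so that we have
\[ \bigl| \langle K \rangle^{\mathrm{per}}_\Lambda - \langle d_\Lambda | K | d_\Lambda \rangle \bigr| \leq \bigl| \langle d_\Lambda | K | d_\Lambda \rangle \bigr|\, \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda + \bigl| \langle (1 - P^{d}_{\mathrm{supp}\, K}) K P^{d}_{\mathrm{supp}\, K} \rangle^{\mathrm{per}}_\Lambda \bigr| + \bigl| \langle K (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda \bigr|. \tag{6.14} \]
The mapping $(K, K') \mapsto \langle K^{\dagger} K' \rangle^{\mathrm{per}}_\Lambda$, with any two local operators $K$, $K'$, is a scalar product; therefore the Schwarz inequality yields
\[ \begin{aligned} \bigl| \langle K \rangle^{\mathrm{per}}_\Lambda - \langle d_\Lambda | K | d_\Lambda \rangle \bigr| &\leq \bigl| \langle d_\Lambda | K | d_\Lambda \rangle \bigr|\, \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda + \bigl( \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda \bigr)^{\frac12} \bigl( \langle P^{d}_{\mathrm{supp}\, K} K^{\dagger} K P^{d}_{\mathrm{supp}\, K} \rangle^{\mathrm{per}}_\Lambda \bigr)^{\frac12} + \bigl( \langle K K^{\dagger} \rangle^{\mathrm{per}}_\Lambda \bigr)^{\frac12} \bigl( \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda \bigr)^{\frac12} \\ &\leq \|K\| \Bigl[ \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda + 2 \bigl( \langle (1 - P^{d}_{\mathrm{supp}\, K}) \rangle^{\mathrm{per}}_\Lambda \bigr)^{1/2} \Bigr] \\ &\leq \|K\|\, |\mathrm{supp}\, K|\, (\delta + 2\delta^{\frac12}). \end{aligned} \tag{6.15} \]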
The proof of the remaining Theorem 2.3 is a standard application of the implicit function theorem. Thus, for example, the point $\bar\mu_0$ of maximal coexistence, $\mathrm{Re}\, f^{\beta,\bar\mu_0}(d) = \mathrm{Re}\, f^{\beta,\bar\mu_0}(d')$ for every pair $d, d' \in D$, can be viewed as the solution of the vector equation $f(\bar\mu_0) = 0$, with $f(\mu) = \bigl( \mathrm{Re}\, f^{\beta,\mu}(d_i) - \mathrm{Re}\, f^{\beta,\mu}(d_r) \bigr)_{i=1}^{r-1}$. Now, $f = e + s$, $e(\mu) = \bigl( e^{\mu}(d_i) - e^{\mu}(d_r) \bigr)_{i=1}^{r-1}$, $s(\mu) = \bigl( \mathrm{Re}\, s^{\beta,\mu}(d_i) - \mathrm{Re}\, s^{\beta,\mu}(d_r) \bigr)_{i=1}^{r-1}$, with $\|s\|$ as well as $\|\frac{\partial s}{\partial\mu}\|$ bounded by a small constant once $\|T\| + \sum_{i=1}^{r-1} \|\frac{\partial T}{\partial\mu_i}\|$ is sufficiently small and $\beta$ is sufficiently large. The existence of a unique solution $\bar\mu_0 \in U$ then follows once we notice the existence of the solution $\mu_0 \in U$ of the equation $e(\mu_0) = 0$ (equivalent with $e^{\mu_0}(d) = e^{\mu_0}(d')$, $d, d' \in D$) and the fact that the mapping
\[ T : \mu \to A^{-1} \Bigl[ \frac{\partial e}{\partial\mu}\Big|_{\mu=\mu_0} (\mu - \mu_0) - f(\mu) \Bigr], \]
with $A^{-1}$ the matrix inverse to $\frac{\partial e}{\partial\mu}$, is a contraction. To this end it is enough just to recall Assumption 7 and the bounds on $s^{\beta,\mu}(d)$, $d \in D$, and its derivatives.
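The contraction argument can be made explicit in a standard normalization. The sketch below introduces its own map $\hat T$, which is not claimed to coincide with the map $T$ of the text, and assumes that $e$ is continuously differentiable near $\mu_0$ with invertible derivative there (which is what Assumption 7 is taken to provide).

```latex
% Sketch of the standard fixed-point argument; \hat T is notation
% introduced here for illustration only.
\begin{align*}
\hat T(\mu) &:= \mu - A^{-1} f(\mu), \qquad
   A := \frac{\partial e}{\partial\mu}\Big|_{\mu=\mu_0},
   \qquad\text{so that}\quad \hat T(\mu)=\mu \iff f(\mu)=0,\\
\frac{\partial\hat T}{\partial\mu}(\mu)
  &= A^{-1}\Bigl[A - \frac{\partial e}{\partial\mu}(\mu)\Bigr]
     - A^{-1}\,\frac{\partial s}{\partial\mu}(\mu),\\
\hat T(\mu_0)-\mu_0 &= -A^{-1}s(\mu_0)
   \qquad\text{since } e(\mu_0)=0 .
\end{align*}
% On a small ball around \mu_0 the first term of the derivative is small by
% continuity of \partial e/\partial\mu, the second by the bound on
% \partial s/\partial\mu; the displacement of the centre is small by the
% bound on \|s\|.  Hence \hat T maps the ball into itself with Lipschitz
% constant < 1, and the Banach fixed-point theorem gives a unique zero of f.
```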
A. General Expression for the Effective Potential
It is actually a cumbersome task to write down a compact formula for the effective potential in the general case. A lot of notation has to be introduced, and one pays for the generality by the fact that the resulting formulæ look rather obscure; nevertheless, the logic behind the following definitions and equations appeared rather naturally along the steps in Sect. 4. We would like to stress that for typical concrete models, it is entirely sufficient to restrict to the effective potential due to at most 4 transitions, and we can content ourselves with Eqs. (2.8)–(2.10).
We assume that a list $S$ of sequences of quantum transitions $A$ is given to represent the leading quantum fluctuations. The particular choice of $S$ depends on properties of the considered model. Often the obvious choice like "any sequence of transitions not surpassing a given order" is sufficient. In the general case, certain conditions (specified in Assumption 5) involving $S$ are to be met. For any $g_A \in G_A$, the effective potential $\Psi$ is defined to equal $\Psi_A(g_A) = -$
X 1 n!
n>1 n Y
X
X
k1 ,...,kn > 2 (A1 ,...,A1 ,A2 ,...,An )∈S 1 1 k1 kn ∪i,j A¯ i =A j
ki hY
X
i,ki −1 I(Ai1 , . . . , Aiki ; ni,1 g3\A ) A g3\A , . . . , nA
i=1 ni,1 ,...,ni,ki −1 ∈G / A A A
Z
−∞ 1 B˜ ` , decompose into connected components $\hat B^{(1)} = \cup_{\ell\geq 1} \hat B^{(1)}_{\ell}$, and repeat the procedure until no change occurs any more, i.e. until $\hat B^{(m)} = \cup_{\ell\geq 1} \tilde B^{(m)}_{\ell}$. The function $I$ characterizes whether this final set, the result of the above construction, is connected or not,
\[ I(A_1,\dots,A_k;\, n^{1},\dots,n^{k-1}) = \begin{cases} 1 & \text{if } \hat B^{(m)} \text{ is connected,} \\ 0 & \text{otherwise.} \end{cases} \tag{A.3} \]
Equations (2.8)–(2.10) are obtained from the general expression (A.1) by considering the cases with one or two loops (i.e. $n = 1, 2$), each loop having no more than 4 transitions ($k_i \leq 4$).
Acknowledgements. We are thankful to Christian Gruber for discussions. R. K. acknowledges the Institut de Physique Théorique at EPFL, and D. U. the Center for Theoretical Study at Charles University for hospitality.
References
[BI] Borgs, C. and Imbrie, J.: A unified approach to phase diagrams in field theory and statistical mechanics. Commun. Math. Phys. 123, 305–328 (1989)
[BKU1] Borgs, C., Kotecký, R. and Ueltschi, D.: Low temperature phase diagrams for quantum perturbations of classical spin systems. Commun. Math. Phys. 181, 409–446 (1996)
[BKU2] Borgs, C., Kotecký, R. and Ueltschi, D.: Incompressible phase in lattice systems of interacting bosons. Unpublished, available at http://dpwww.epfl.ch/instituts/ipt/publications.html (1997)
[BS] Bricmont, J. and Slawny, J.: Phase transitions in systems with a finite number of dominant ground states. J. Stat. Phys. 54, 89–161 (1989)
[Bry] Brydges, D.C.: A short course on cluster expansions. Proceeding of Les Houches, Session XLIII, 129–183 (1986)
[DFF1] Datta, N., Fernández, R. and Fröhlich, J.: Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. J. Stat. Phys. 84, 455–534 (1996)
[DFF2] Datta, N., Fernández, R. and Fröhlich, J.: Effective Hamiltonians and phase diagrams for tight-binding models. Preprint, math-ph/9809007 (1998)
[DFFR] Datta, N., Fernández, R., Fröhlich, J. and Rey-Bellet, L.: Low-temperature phase diagrams of quantum lattice systems. II. Convergent perturbation expansions and stability in systems with infinite degeneracy. Helv. Phys. Acta 69, 752–820 (1996)
[DMN] Datta, N., Messager, A. and Nachtergaele, B.: Rigidity of interfaces in the Falicov–Kimball model. Preprint, mp-arc 98-267 (1998)
[Dob] Dobrushin, R.L.: Existence of a phase transition in the two-dimensional and three-dimensional Ising models. Sov. Phys. Doklady 10, 111–113 (1965)
[DLS] Dyson, F.J., Lieb, E.H. and Simon, B.: Phase transitions in quantum spin systems with isotropic and nonisotropic interactions. J. Stat. Phys. 18, 335–383 (1978)
[EFS] van Enter, A.C.D., Fernández, R. and Sokal, A.D.: Regularity properties and pathologies of position-space renormalization-group transformations: scope and limitations of Gibbsian theory. J. Stat. Phys. 72, 879–1167 (1993)
[FWGF] Fisher, M.P.A., Weichman, P.B., Grinstein, G. and Fisher, D.S.: Boson localization and the superfluid-insulator transition. Phys. Rev. B 40, 546–570 (1989)
[Geo] Georgii, H.-O.: Gibbs Measures and Phase Transitions. De Gruyter studies in Mathematics, Berlin–New York: De Gruyter, 1988
[Gin] Ginibre, J.: Existence of phase transitions for quantum lattice systems. Commun. Math. Phys. 14, 205–234 (1969)
[Gri] Griffiths, R.B.: Peierls' proof of spontaneous magnetization of a two-dimensional Ising ferromagnet. Phys. Rev. A 136, 437–439 (1964)
[GM] Gruber, Ch. and Macris, N.: The Falicov–Kimball model: a review of exact results and extensions. Helv. Phys. Acta 69, 850–907 (1996)
[KL] Kennedy, T. and Lieb, E.H.: An itinerant electron model with crystalline or magnetic long range order. Physica A 138, 320–358 (1986)
[LM] Lebowitz, J.L. and Macris, N.: Low-temperature phases of itinerant fermions interacting with classical phonons: the static Holstein model. J. Stat. Phys. 76, 91–123 (1994)
[Lieb] Lieb, E.H.: The Hubbard model: some rigorous results and open problems. In: XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge, MA: Internat. Press, 1995, pp. 392–412
[MM] Messager, A. and Miracle-Solé, S.: Low temperature states in the Falicov–Kimball model. Rev. Math. Phys. 8, 271–299 (1996)
[Pei] Peierls, R.: On the Ising model of ferromagnetism. Proceedings of the Cambridge Philosophical Society 32, 477–481 (1936)
[Pfi] Pfister, C.-E.: Large deviations and phase separation in the two-dimensional Ising model. Helv. Phys. Acta 64, 953–1054 (1991)
[PS] Pirogov, S.A. and Sinai, Ya.G.: Phase diagrams of classical lattice systems. Theoretical and Mathematical Physics 25, 1185–1192 (1975); 26, 39–49 (1976)
[Sin] Sinai, Ya.G.: Theory of Phase Transitions: Rigorous Results, Oxford–New York–etc.: Pergamon Press, 1982
[Zah] Zahradník, M.: An alternate version of Pirogov–Sinai theory. Commun. Math. Phys. 93, 559–581 (1984)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 337 – 366 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Global Foliations of Matter Spacetimes with Gowdy Symmetry
Håkan Andréasson
Department of Mathematics, Chalmers University of Technology, S-412 96 Göteborg, Sweden. E-mail: [email protected]
Received: 8 December 1998 / Accepted: 20 March 1999
Abstract: A global existence theorem, with respect to a geometrically defined time, is shown for Gowdy symmetric globally hyperbolic solutions of the Einstein–Vlasov system for arbitrary (in size) initial data. The spacetimes being studied contain both matter and gravitational waves.
1. Introduction
An important problem in classical general relativity is the question of global existence (in an appropriate sense) for globally hyperbolic solutions of the vacuum-Einstein and matter-Einstein equations. The main motivation is its relationship to the cosmic censorship conjectures. Strong cosmic censorship has, e.g. by Eardley and Moncrief [EM], been formulated as a question on global existence and asymptotic behaviour of solutions to the Einstein equations, suggesting a definite method of analytical attack. To begin studying the long-time behaviour of solutions to a complicated partial differential equation system one might focus on families of solutions with some prescribed symmetry. With the exception of the monumental work on global nonlinear stability of the Minkowski space by Christodoulou and Klainerman [CK], the practice in general relativity has for long been to study "global existence" problems under symmetric assumptions. One family of (cosmological) solutions which have been studied extensively is the Gowdy spacetimes [G]. These spacetimes are vacuum but admit gravitational waves (in contrast to e.g. spherically symmetric spacetimes). Global existence has been shown for the Gowdy spacetimes [M], strong cosmic censorship is settled in the case of polarized Gowdy spacetimes [CIM], and much is known about the subset of the Gowdy spacetimes which admit an extension across a Cauchy horizon [CI]. In this paper we show global existence, with respect to a geometrically defined time, for matter spacetimes (Einstein–Vlasov) with Gowdy symmetry and thereby we extend Moncrief's result [M] in the vacuum case. This is the first result which provides a global
foliation of a spacetime containing both matter and gravitational waves. Moreover, for matter spacetimes there are only a few global results available altogether. Let us briefly mention some of these results. First, by matter spacetimes we have in mind spacetimes where the matter consists of massive particles. One can also consider spacetimes which only contain radiation and important results have been obtained in this direction, e.g. Christodoulou has obtained strong results in the spherically symmetric case with a scalar field as matter model (see e.g. [Cu1, Cu2] and the references therein). For spacetimes containing massive particles the main global results can be summarized as follows. Under a smallness condition on the initial data, Rein and Rendall [RR] have shown that solutions of the spherically symmetric Einstein–Vlasov system are geodesically complete. Some information on the large data problem was then obtained in [RRS]. Christodoulou has in a series of papers (see [Cu3] and the references therein) studied the Einstein–Euler equation in the spherically symmetric case for a special equation of state, adapted to understand the dynamics of a supernova explosion. He can globally control the solutions to the Cauchy problem and he finds solutions whose behaviour resembles qualitatively that of a supernova explosion. Finally, the most relevant results in the context of this paper are those on cosmological solutions by Rendall [Rl1-2] and Rein [Rn]. These are discussed in some detail in relation to our result below.
Our method of proof is inspired by a recent global foliation result for vacuum spacetimes admitting a $T^2$ isometry group, acting on $T^3$ spacelike surfaces [BCIM]. These spacetimes are more general than the Gowdy spacetimes: both families admit two commuting Killing vectors but in the Gowdy case there is the additional condition that the twists are zero. The twists are defined by
\[ c_1 = \epsilon_{\mu\nu\rho\delta} X^{\mu} Y^{\nu} \nabla^{\rho} X^{\delta}, \qquad c_2 = \epsilon_{\mu\nu\rho\delta} X^{\mu} Y^{\nu} \nabla^{\rho} Y^{\delta}, \tag{1} \]
where X, Y are Killing vectors associated with the isometry group. It follows from the Einstein equations that in vacuum these quantities are constant throughout spacetime [G]. One difficulty in studying long-time existence problems in general relativity is the lack of having a fixed time measure. A solution which remains regular for an infinite range of one time scale may become singular within a finite range of another. In [BCIM] this problem is treated by choosing a coordinate system in which the time is fixed to the geometry of spacetime. In fact, the time is defined to be the area of the two dimensional spacelike orbits of the T 2 isometry group. These coordinates are called areal coordinates. The main theorem in [BCIM] shows that the entire maximal globally hyperbolic development of the initial hypersurface can be foliated by areal coordinates. These coordinates are however only used in a direct way in the future direction. To show that the past of the initial hypersurface is covered by areal coordinates the authors use conformal coordinates (the time is not fixed to the geometry of spacetime) in which the equations take a more suitable form for an analytical treatment. By a long chain of geometrical arguments it is then shown that the development in conformal coordinates admits a foliation by areal coordinates, and that it covers the past maximal globally hyperbolic development of the initial hypersurface. We prove that T 3 × R-matter spacetimes with Gowdy symmetry admit global foliations by areal coordinates. The matter content is described by the Vlasov equation. This is a kinetic equation and gives a statistical description of a collection of collisionless “particles”. In the cosmological case the particles are galaxies or clusters of galaxies whereas in stellar dynamics they are stars. The Vlasov equation has been shown to be
Global Foliations of Matter Spacetimes with Gowdy Symmetry
suitable in general relativity for the study of the long-time behaviour of matter in gravitational fields. In particular it rules out the formation of shell-crossing singularities. For a discussion on the choice of matter model see [Rl4] and [Rl5]. To prove the existence of a global foliation we also work directly in areal coordinates in the expanding (future) direction; in the contracting (past) direction, we first show a global existence theorem in conformal coordinates and then invoke the geometrical arguments in [BCIM] to complete the proof. We point out that our result depends strongly on the exact structure of the Vlasov equation and does not hold for general matter models which are only restricted by certain inequalities on the components of the energy-momentum tensor. A related and interesting result has recently been shown by Rendall [Rl1] (see also [Rl2]). He considers $T^2$ symmetric spacetimes for the Einstein–Vlasov and the Einstein-wave map equations and he shows that if such a spacetime admits at least one compact constant mean curvature (CMC) hypersurface then the past of that surface can be covered by a foliation of compact CMC hypersurfaces. The CMC foliation and the areal coordinate foliation are both geometrically based time foliations which provide frameworks for studying strong cosmic censorship and other global issues. The main motivation for developing techniques to obtain CMC foliations is that the definition of a CMC hypersurface does not depend on any symmetry assumptions and it is hence possible that CMC foliations will exist for rather general spacetimes. The areal coordinate foliation used here is less general since it is adapted to the symmetry, but leads in the Gowdy case (note that the results in [Rl1] apply to the more general $T^2$ symmetric spacetimes, but see the remark below) to stronger results.
Namely, the arguments in [Rl1] do not show that the entire future of the initial hypersurface can be covered, and the existence of the CMC foliation is only guaranteed under the hypothesis that spacetime admits at least one such hypersurface. We also mention a result in this direction due to Rein [Rn]. He has studied cosmological Einstein–Vlasov spacetimes with stronger symmetry restrictions than in the Gowdy case (the spacetimes admit three Killing vectors). In these spacetimes gravitational waves cannot exist. For plane symmetry (the relevant case for us) he has shown existence back to the initial singularity for small initial data, and under the assumption that one of the field components is bounded, he obtains global existence for large data in the future direction. An interesting result in his work is that the initial singularity is shown to be a curvature singularity as well as a “crushing” singularity (see [ES]). Remark. We have not tried to consider the more general T 2 symmetric spacetimes, i.e. spacetimes with nonvanishing twists. However, we believe that a generalization to this case would be rather straightforward as soon as the Einstein–Vlasov system has been derived. During the work on this paper we noticed one potential problem in generalizing our proof in the future direction. This is discussed and solved in the remark following Eq. (78). The outline of the paper follows largely that of [BCIM]. In Sect. 2 we describe Gowdy symmetry and give the equations for the Einstein–Vlasov system in areal and conformal coordinates. The main theorem is formulated in Sect. 3 where we also describe the geometrical arguments in [BCIM] needed to complete the proof in the contracting direction. Section 4 is devoted to the analysis in the contracting direction. Estimates for the field components and the matter terms are derived in conformal coordinates, by using e.g. light-cone arguments and methods originally developed for the Vlasov–Maxwell equation. 
The analysis in the expanding direction is carried out in areal coordinates in
H. Andréasson
Sect. 5 where a number of estimates are derived. Light-cone arguments and an “energy” monotonicity lemma are important tools for obtaining bounds on the field components and their derivatives. The control of the matter terms and their derivatives relies on three lemmas. The first one is the “energy” monotonicity lemma just mentioned. In the second lemma a careful analysis of the characteristic system associated with the Vlasov equation is carried out, which leads to a bound on the support of the momenta. The third lemma provides bounds on the derivatives of the matter terms and relies indirectly on the geodesic deviation equation. This equation relates the curvature tensor and the acceleration of nearby geodesics and has proved useful in previous studies of the Einstein–Vlasov system (see [RR, Rn] and [Rl3]).
2. The Einstein–Vlasov System with Gowdy Symmetry

Let us begin with a brief review of Gowdy symmetry. Consider a spacetime that can be foliated by a family of compact, connected, and orientable hypersurfaces. If the maximal isometry group of the spacetime is two dimensional, and if it acts invariantly and effectively on the foliation, then the isometry group must be $U(1)\times U(1)$. Moreover, the foliation surfaces must be homeomorphic to $T^3$, $S^1\times S^2$, $S^3$ or $L(p,q)$ (the Lens space), and the action is unique up to equivalence. The Killing vector fields $X$, $Y$ associated with the isometry group have to commute in such a spacetime. We say that spacetimes satisfying the symmetry conditions above and in which both the twists $c_1$, $c_2$ (see (1)) vanish have Gowdy symmetry. We remark that the term “Gowdy spacetime” is reserved for the vacuum case. For more background on Gowdy symmetry we refer to [G, Cl]. As mentioned above there are several choices of spacetime manifolds compatible with Gowdy symmetry. In this paper we restrict our attention to the $T^3$-case. It is an interesting fact that in vacuum this is the only possibility if the condition of vanishing twists is relaxed.

The dynamics of the matter is governed by the Vlasov equation. This is a kinetic equation and models a collisionless system of particles, i.e. the particles follow the geodesics of spacetime. For a nice introduction to the Einstein–Vlasov system see [Rl3]. We also mention the survey of Ehlers [E] for more information on kinetic theory in general relativity, and the book by Binney and Tremaine [BT] for some applications of kinetic theory in stellar dynamics.

We will use two choices of coordinates, areal coordinates and conformal coordinates. It has been shown in [Cl] that, at least locally, any globally hyperbolic (non-flat) Gowdy spacetime on $T^3\times\mathbb{R}$ admits each of these coordinates. Both sets of coordinates are chosen so that
\[
X = a\frac{\partial}{\partial x} + b\frac{\partial}{\partial y} \quad\text{and}\quad Y = c\frac{\partial}{\partial x} + d\frac{\partial}{\partial y}
\]
are Killing vector fields ($a$, $b$, $c$ and $d$ are constants with $ad - bc \neq 0$), and in both cases $\theta\in S^1$ denotes the remaining spatial coordinate. Below the form of the metric and the Einstein–Vlasov system is given in areal and conformal coordinates. The functions $R$, $\alpha$, $U$, $A$, $\eta$ all depend on $t$ and $\theta$ and the function $f$ depends on $t$, $\theta$ and $v\in\mathbb{R}^3$.
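Areal Coordinates. Metric:
\[
g = -e^{2(\eta-U)}\alpha\,dt^2 + e^{2(\eta-U)}\,d\theta^2 + e^{2U}(dx + A\,dy)^2 + e^{-2U}t^2\,dy^2. \tag{2}
\]
The Einstein-matter constraint equations:
\begin{align}
\frac{\eta_t}{t} &= U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4t^2}\,(A_t^2 + \alpha A_\theta^2) + e^{2(\eta-U)}\alpha\rho, \tag{3}\\
\frac{\eta_\theta}{t} &= 2U_tU_\theta + \frac{e^{4U}}{2t^2}\,A_tA_\theta - \frac{\alpha_\theta}{2t\alpha} - e^{2(\eta-U)}\sqrt{\alpha}\,J, \tag{4}\\
\alpha_t &= 2t\alpha^2 e^{2(\eta-U)}(P_1 - \rho). \tag{5}
\end{align}
The Einstein-matter evolution equations:
\begin{align}
\eta_{tt} - \alpha\eta_{\theta\theta} &= \frac{\eta_\theta\alpha_\theta}{2} + \frac{\eta_t\alpha_t}{2\alpha} - \frac{\alpha_\theta^2}{4\alpha} + \frac{\alpha_{\theta\theta}}{2} - U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4t^2}\,(A_t^2 - \alpha A_\theta^2) \notag\\
&\quad - \alpha e^{2(\eta-U)}P_3 - \frac{A^2}{t^2}\,\alpha e^{2(\eta+U)}P_2 - \frac{2A}{t}\,\alpha e^{2\eta}S_{23}, \tag{6}\\
U_{tt} - \alpha U_{\theta\theta} &= -\frac{U_t}{t} + \frac{U_\theta\alpha_\theta}{2} + \frac{U_t\alpha_t}{2\alpha} + \frac{e^{4U}}{2t^2}\,(A_t^2 - \alpha A_\theta^2) \notag\\
&\quad + \frac{1}{2}\,e^{2(\eta-U)}\alpha(\rho - P_1 + P_2 - P_3), \tag{7}\\
A_{tt} - \alpha A_{\theta\theta} &= \frac{A_t}{t} + \frac{\alpha_\theta A_\theta}{2} + \frac{\alpha_t A_t}{2\alpha} - 4A_tU_t + 4\alpha A_\theta U_\theta + 2t\alpha e^{2(\eta-2U)}S_{23}. \tag{8}
\end{align}
The Vlasov equation:
\begin{align}
&\frac{\partial f}{\partial t} + \frac{\sqrt{\alpha}\,v^1}{v^0}\frac{\partial f}{\partial\theta} - \Big[\Big(\eta_\theta - U_\theta + \frac{\alpha_\theta}{2\alpha}\Big)\sqrt{\alpha}\,v^0 + (\eta_t - U_t)v^1 - \frac{\sqrt{\alpha}\,e^{2U}A_\theta}{t}\frac{v^2v^3}{v^0} \notag\\
&\qquad + \frac{\sqrt{\alpha}\,U_\theta}{v^0}\big((v^3)^2 - (v^2)^2\big)\Big]\frac{\partial f}{\partial v^1} - \Big[U_tv^2 + \sqrt{\alpha}\,U_\theta\frac{v^1v^2}{v^0}\Big]\frac{\partial f}{\partial v^2} \notag\\
&\qquad - \Big[\Big(\frac{1}{t} - U_t\Big)v^3 - \sqrt{\alpha}\,U_\theta\frac{v^1v^3}{v^0} + \frac{e^{2U}v^2}{t}\Big(A_t + \sqrt{\alpha}\,A_\theta\frac{v^1}{v^0}\Big)\Big]\frac{\partial f}{\partial v^3} = 0. \tag{9}
\end{align}
The matter quantities:
\begin{align}
\rho(t,\theta) &= \int_{\mathbb{R}^3} v^0 f(t,\theta,v)\,dv, \tag{10}\\
P_k(t,\theta) &= \int_{\mathbb{R}^3} \frac{(v^k)^2}{v^0}\, f(t,\theta,v)\,dv, \quad k = 1,2,3, \tag{11}\\
J(t,\theta) &= \int_{\mathbb{R}^3} v^1 f(t,\theta,v)\,dv, \tag{12}\\
S_{23}(t,\theta) &= \int_{\mathbb{R}^3} \frac{v^2v^3}{v^0}\, f(t,\theta,v)\,dv. \tag{13}
\end{align}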
Here the variables $v$ are related to the canonical momenta $p$ through
\[
v^0 = \sqrt{\alpha}\,e^{\eta-U}p^0, \quad v^1 = e^{\eta-U}p^1, \quad v^2 = e^{U}p^2 + Ae^{U}p^3, \quad v^3 = te^{-U}p^3, \tag{14}
\]
and
\[
p^\mu := \frac{dx^\mu}{d\tau}, \qquad x^\mu = (t,\theta,x,y),
\]
where $\tau$ is proper time. It is assumed that all “particles” have the same mass (normalized to one) and follow the geodesics of spacetime (collisionless particle system). Hence
\[
g_{\mu\nu}p^\mu p^\nu = -1,
\]
so that
\[
v^0 = \sqrt{1 + (v^1)^2 + (v^2)^2 + (v^3)^2}. \tag{15}
\]
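As an illustrative numerical check (a minimal sketch, not part of the original argument), the following Python code evaluates the change of variables (14) for the areal metric (2), confirms that the mass-shell condition is equivalent to (15), and tests the pointwise inequalities on the integrands of (10)–(13) that underlie estimates such as $\rho \ge |J|$, $\rho \ge P_1+P_2+P_3$ and $2|S_{23}| \le P_2+P_3$ for nonnegative $f$. The function names and the sample field values are assumptions chosen only for illustration.

```python
import math

def frame_momenta(p, t, alpha, eta, U, A):
    """Map coordinate momenta p = (p0, p1, p2, p3) to the variables v of (14)."""
    p0, p1, p2, p3 = p
    v0 = math.sqrt(alpha) * math.exp(eta - U) * p0
    v1 = math.exp(eta - U) * p1
    v2 = math.exp(U) * p2 + A * math.exp(U) * p3
    v3 = t * math.exp(-U) * p3
    return v0, v1, v2, v3

def metric_norm(p, t, alpha, eta, U, A):
    """Evaluate g_{mu nu} p^mu p^nu for the areal metric (2)."""
    p0, p1, p2, p3 = p
    conf = math.exp(2.0 * (eta - U))
    return (-conf * alpha * p0 ** 2 + conf * p1 ** 2
            + math.exp(2.0 * U) * (p2 + A * p3) ** 2
            + math.exp(-2.0 * U) * t ** 2 * p3 ** 2)

def on_shell_p0(p_spatial, t, alpha, eta, U, A):
    """Solve (15) for p0, given the spatial components, for unit mass."""
    _, v1, v2, v3 = frame_momenta((0.0, *p_spatial), t, alpha, eta, U, A)
    v0 = math.sqrt(1.0 + v1 ** 2 + v2 ** 2 + v3 ** 2)
    return v0 / (math.sqrt(alpha) * math.exp(eta - U))

def integrand_inequalities(v1, v2, v3):
    """Pointwise inequalities between the integrands of (10)-(13)."""
    v0 = math.sqrt(1.0 + v1 ** 2 + v2 ** 2 + v3 ** 2)
    return (
        v0 >= abs(v1),                                      # gives rho >= |J|
        v0 >= (v1 ** 2 + v2 ** 2 + v3 ** 2) / v0,           # gives rho >= P1 + P2 + P3
        2.0 * abs(v2 * v3) / v0 <= (v2 ** 2 + v3 ** 2) / v0,  # gives 2|S23| <= P2 + P3
    )

# Hypothetical sample values of the fields at one point (t, theta).
fields = dict(t=2.0, alpha=0.7, eta=0.3, U=-0.4, A=1.5)
p_spatial = (0.5, -1.2, 0.8)
p0 = on_shell_p0(p_spatial, **fields)
norm = metric_norm((p0, *p_spatial), **fields)
```

With these sample values `norm` agrees with $-1$ up to rounding, reflecting that in the variables $v$ the metric takes the Minkowski form $-(v^0)^2 + (v^1)^2 + (v^2)^2 + (v^3)^2$.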
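In conformal coordinates the function $\alpha$ is removed, with the consequence that the orbital area function $R$ now depends on both $t$ and $\theta$ (in areal coordinates $R = t$). In these coordinates the metric and the Einstein–Vlasov system take the following form.

Conformal coordinates. Metric:
\[
g = e^{2(\eta-U)}(-dt^2 + d\theta^2) + e^{2U}(dx + A\,dy)^2 + e^{-2U}R^2\,dy^2. \tag{16}
\]
The Einstein-matter constraint equations:
\begin{align}
U_t^2 + U_\theta^2 + \frac{e^{4U}}{4R^2}\,(A_t^2 + A_\theta^2) + \frac{R_{\theta\theta}}{R} - \frac{\eta_tR_t}{R} - \frac{\eta_\theta R_\theta}{R} &= -e^{2(\eta-U)}\rho, \tag{17}\\
2U_tU_\theta + \frac{e^{4U}}{2R^2}\,A_tA_\theta + \frac{R_{t\theta}}{R} - \frac{\eta_tR_\theta}{R} - \frac{\eta_\theta R_t}{R} &= e^{2(\eta-U)}J. \tag{18}
\end{align}
The Einstein-matter evolution equations:
\begin{align}
U_{tt} - U_{\theta\theta} &= \frac{U_\theta R_\theta}{R} - \frac{U_tR_t}{R} + \frac{e^{4U}}{2R^2}\,(A_t^2 - A_\theta^2) + \frac{1}{2}\,e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3), \tag{19}\\
A_{tt} - A_{\theta\theta} &= \frac{R_tA_t}{R} - \frac{R_\theta A_\theta}{R} + 4(A_\theta U_\theta - A_tU_t) + 2Re^{2(\eta-2U)}S_{23}, \tag{20}\\
R_{tt} - R_{\theta\theta} &= Re^{2(\eta-U)}(\rho - P_1), \tag{21}\\
\eta_{tt} - \eta_{\theta\theta} &= U_\theta^2 - U_t^2 + \frac{e^{4U}}{4R^2}\,(A_t^2 - A_\theta^2) - e^{2(\eta-U)}P_3 - \frac{A^2}{R^2}\,e^{2(\eta+U)}P_2 - \frac{2A}{R}\,e^{2\eta}S_{23}. \tag{22}
\end{align}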
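The Vlasov equation:
\begin{align}
&\frac{\partial f}{\partial t} + \frac{v^1}{v^0}\frac{\partial f}{\partial\theta} - \Big[(\eta_\theta - U_\theta)v^0 + (\eta_t - U_t)v^1 - U_\theta\frac{(v^2)^2}{v^0} + \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{(v^3)^2}{v^0} - \frac{A_\theta}{R}\,e^{2U}\frac{v^2v^3}{v^0}\Big]\frac{\partial f}{\partial v^1} \notag\\
&\qquad - \Big[U_tv^2 + U_\theta\frac{v^1v^2}{v^0}\Big]\frac{\partial f}{\partial v^2} - \Big[\Big(\frac{R_t}{R} - U_t\Big)v^3 - \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{v^1v^3}{v^0} + \frac{e^{2U}v^2}{R}\Big(A_t + A_\theta\frac{v^1}{v^0}\Big)\Big]\frac{\partial f}{\partial v^3} = 0. \tag{23}
\end{align}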
The matter quantities $\rho$, $P_k$, $J$ and $S_{23}$ are given by (10)–(13), where in this case
\[
v^0 = e^{\eta-U}p^0, \quad v^1 = e^{\eta-U}p^1, \quad v^2 = e^{U}p^2 + Ae^{U}p^3, \quad v^3 = Re^{-U}p^3, \tag{24}
\]
and (15) holds here as well.

Remark. It might be instructive to relate the metric in (16) to that used by Rein [Rn] mentioned in the introduction. By letting $A = 0$ and $U = (1/2)\ln R$ in (16) we obtain a metric which admits three Killing vectors and which depends on two field components. The distribution function $f$ depends in this case on $p^1$ and $(p^2)^2 + (p^3)^2$.

3. The Main Theorem

Let $(h, k, f_0)$ be a Gowdy symmetric initial data set on $T^3$. By this we mean that $h$ is a Riemannian metric on $T^3$, invariant under an effective $T^2$ action; $k$ is a symmetric 2-tensor on $T^3$, also invariant under the same $T^2$ group action; the twists $c_1$ and $c_2$ are both zero; the initial distribution function $f_0$ is defined on $T^3$ and is invariant under the same $T^2$ group action and possesses the following additional symmetry, which reads, in coordinates that cast the metric in the forms (2) or (16),
\[
f_0(\theta, p^1, p^2, p^3) = f_0(\theta, p^1, -p^2, p^3) = f_0(\theta, p^1, p^2, -p^3)
\]
(this assumption is necessary for the Einstein–Vlasov system to be compatible with the form of the metric); and that $(h, k, f_0)$ satisfy the Einstein–Vlasov constraint equations. We also assume that $(h, k)$ are $C^\infty$ on $T^3$ and that $f_0$ is a nonnegative, not identically zero, $C^\infty$ function of compact support on the tangent bundle $T(T^3)$ of $T^3$.

Remark. The smoothness assumption on the initial data is not a necessary condition. It is included so that we can refer directly to the classical local existence theorems. However, the estimates in this paper provide the information needed for proving a local existence theorem for $C^2\times C^1$ data $(h, k)$ and $C^1$ data $f_0$. Moreover, the assumption $f_0 \neq 0$ is here included for a technical reason and we refer to [M] or Sect. 5 in this paper for the vacuum case. Indeed, it is in this case possible to work directly in areal coordinates and the estimates derived in Sect. 5 are sufficient. See also the remark following Lemma 1 in that section.
The results by Choquet-Bruhat [CB] and Choquet-Bruhat and Geroch [CBG] show that there exists a unique maximal globally hyperbolic development $(\Sigma\times\mathbb{R}, g, f)$ of a given initial data set on a three-dimensional manifold $\Sigma$ for the Einstein–Vlasov equation. Let us briefly comment upon the initial conditions imposed. The relation between a given initial data set $(h, k)$ on a three-dimensional manifold $\Sigma$ and the metric $g$ on the spacetime manifold is that there exists an imbedding $\psi$ of $\Sigma$ into the spacetime such
that the induced metric and second fundamental form of $\psi(\Sigma)$ coincide with the result of transporting $(h, k)$ with $\psi$. For the relation of the distribution functions $f$ and $f_0$ we have to note that $f$ is defined on the mass shell (for $m = 1$ it is the set of all future pointing unit timelike vectors). The initial condition imposed is that the restriction of $f$ to the part of the mass shell over $\psi(\Sigma)$ should be equal to $f_0\circ(\psi^{-1}, d(\psi)^{-1})\circ\phi$, where $\phi$ sends each point of the mass shell over $\psi(\Sigma)$ to its orthogonal projection onto the tangent space to $\psi(\Sigma)$. Our main theorem now reads:

Theorem 1. Let $(h, k, f_0)$ be a smooth Gowdy symmetric initial data set on $T^3$. For some non-negative constant $c$, there exists a globally hyperbolic spacetime $(M, g, f)$ such that
(i) $M = (c, \infty)\times T^3$.
(ii) $g$ and $f$ satisfy the Einstein–Vlasov equation.
(iii) $M$ is covered by areal coordinates $(t, \theta, x, y)$, with $t\in(c, \infty)$, so the metric globally takes the form (2).
(iv) $(M, g, f)$ is isometrically diffeomorphic to the maximal globally hyperbolic development of the initial data $(h, k, f_0)$.

As described in the introduction we prove global existence in conformal and areal coordinates for the past and future directions respectively. Then, in order to prove Theorem 1 in the past direction, we need to invoke substantial geometrical arguments from [BCIM]. For the future direction only a simple geometrical argument is needed for completing the proof. It should be pointed out that even if the geometrical results in [BCIM] concern the vacuum case they are true also for matter spacetimes as long as the Einstein-matter equations form a well-posed hyperbolic system, which of course is the case here.

In Sect. 4 we show that the past maximal development of $(h, k, f_0)$ in terms of conformal coordinates, which we denote by $D^-_{\mathrm{conf}}(h, k, f_0)$, has $t\to-\infty$ as long as $R$ stays bounded away from zero. Starting from this result we briefly describe how the geometrical arguments in [BCIM] lead to a proof of Theorem 1 in the past direction. First, in [BCIM] $R$ is shown to be positive everywhere in the globally hyperbolic region of a $T^2$ symmetric spacetime. Also, along any past inextendible timelike path in $D^-_{\mathrm{conf}}(h, k, f_0)$, $R$ is shown to approach a limit $R_0 \ge 0$ (to be identified with $c$ in Theorem 1), which is independent of the choice of path. Moreover, for any $\tilde R\in(R_0, R_1)$, where $R_1$ is the minimum value of $R$ on the initial hypersurface, the level set $R = \tilde R$ in $D^-_{\mathrm{conf}}(h, k, f_0)$ is shown to be a Cauchy surface. From these facts it follows from arguments in [Cl] that $D^-_{\mathrm{conf}}(h, k, f_0)$ admits areal coordinates to the past of the hypersurface $R = R_1$. Propositions 4 and 5 in [BCIM] then show that $D^-_{\mathrm{conf}}(h, k, f_0)$ is also isometrically diffeomorphic to the maximal past development, $D^-(h, k, f_0)$, of $(h, k, f_0)$ on $T^3$.

In the future direction, global existence in areal coordinates is almost sufficient for proving Theorem 1. The only statement that remains to be proved in Theorem 1 is that the future maximal development is covered by $D^+_{\mathrm{areal}}(h, k, f_0)$. This follows from a very short geometrical argument given in the proof of Proposition 5 in [BCIM].

4. Analysis in the Contracting Direction

The local existence theorem of Choquet-Bruhat [CB] together with the result of Chrusciel (Lemma 4.2 in [Cl]) imply that for any Gowdy symmetric initial data set $(h, k, f_0)$ on
$T^3$, we can find an interval $(\hat t_1, \hat t_2)$ and $C^\infty$ functions $R$, $U$, $\eta$ on $(\hat t_1, \hat t_2)\times T^3$, and a non-negative $C^\infty$ function $f$ on $(\hat t_1, \hat t_2)\times P$ ($P$ denotes the mass shell) such that: these functions satisfy the Einstein–Vlasov equations in conformal coordinate form and for some $t_0\in(\hat t_1, \hat t_2)$, the metric $g$ induces initial data on the $t_0$-hypersurface which is smoothly spatially diffeomorphic to $(h, k)$, and the relation between $f$ and $f_0$ given above holds.

Now, in order to show that $D^-_{\mathrm{conf}}(h, k, f_0)$ has $t\to-\infty$, as long as $R$ stays bounded away from zero, it is sufficient to prove that on any finite time interval $(\tilde t, t_0]$, the functions $R$, $U$, $A$, $\eta$, $f$ and all their derivatives are uniformly bounded and that the supremum of the support of momenta at time $t$,
\[
Q(t) := \sup\{|v| : \exists (s, \theta)\in[t, t_0]\times S^1 \text{ such that } f(s, \theta, v)\neq 0\}, \tag{25}
\]
is uniformly bounded. Note that the last condition implies that the matter quantities and their derivatives are uniformly bounded (if $|\partial f/\partial x^\mu| < C$).

Step 1 (Monotonicity of $R$ and bounds on its first derivatives). This is a key step and relies on Theorem 4.1 in [Cl] together with the arguments in [BCIM]. We have to check that the matter terms have the right signs so that these arguments still hold. The bounds on $R$ and its first derivatives will play a crucial role when we control the matter terms below. First we show that $\nabla R$ is timelike. Let us introduce the null vector fields
\[
\partial_\xi = \frac{1}{\sqrt{2}}(\partial_t + \partial_\theta), \qquad \partial_\lambda = \frac{1}{\sqrt{2}}(\partial_t - \partial_\theta), \tag{26}
\]
and let us set $F_\xi = \partial_\xi F$, $F_\lambda = \partial_\lambda F$ for a function $F$. After some algebra it follows that the constraint equations (17) and (18) can be written
\begin{align}
\partial_\theta R_\xi &= \eta_\xi R_\xi - RU_\xi^2 - \frac{e^{4U}}{4R}A_\xi^2 - Re^{2(\eta-U)}(\rho - J), \tag{27}\\
\partial_\theta R_\lambda &= \eta_\lambda R_\lambda - RU_\lambda^2 - \frac{e^{4U}}{4R}A_\lambda^2 - Re^{2(\eta-U)}(\rho + J). \tag{28}
\end{align}
Let $h_1$ and $h_2$ be defined by
\[
h_1 := RU_\xi^2 + \frac{e^{4U}}{4R}A_\xi^2 + Re^{2(\eta-U)}(\rho - J),
\]
and
\[
h_2 := RU_\lambda^2 + \frac{e^{4U}}{4R}A_\lambda^2 + Re^{2(\eta-U)}(\rho + J).
\]
From (10) and (12) we have $\rho \ge |J|$, and since $R > 0$ it follows that both $h_1$ and $h_2$ are non-negative. Solving Eq. (27) gives for any $\theta_0\in[0, 2\pi]$ (suppressing the $t$-dependence)
\[
R_\xi(\theta) = e^{\int_{\theta_0}^{\theta}\eta_\xi(\sigma)\,d\sigma}R_\xi(\theta_0) - \int_{\theta_0}^{\theta} e^{\int_{\tilde\theta}^{\theta}\eta_\xi(\sigma)\,d\sigma}\,h_1(\tilde\theta)\,d\tilde\theta. \tag{29}
\]
Since $R$ is $C^\infty$ on $S^1$ it can be identified with a periodic function on the real line. If now $R_\xi(\theta_0) = 0$ for some $\theta_0$ then $R_\xi(2\pi + \theta_0) = 0$, but from (29) this is only possible if $h_1$ vanishes identically. However, in the non-vacuum case (recall $f_0 \neq 0$) $h_1(t, \cdot)$ is strictly
positive on some open set of $[0, 2\pi]$. Therefore $R_\xi$ is nonzero and has a definite sign. The same arguments apply to $R_\lambda$, and it follows that $g^{\mu\nu}\partial_\mu R\,\partial_\nu R = e^{-2(\eta-U)}R_\xi R_\lambda$ is strictly positive or strictly negative. The former possibility is ruled out since $\partial_\theta R = 0$ at some point on $S^1$. Thus $\nabla R$ is timelike. This means that $\partial_t R$ is nonzero everywhere. Our choice of time corresponds to contracting $T^2$ orbits so that $\partial_t R > 0$.

Next we show that $\partial_t R$ and $|\partial_\theta R|$ are bounded into the past. The evolution equation (21) can be written
\[
\partial_\lambda R_\xi = Re^{2(\eta-U)}(\rho - P_1), \tag{30}
\]
or equivalently,
\[
\partial_\xi R_\lambda = Re^{2(\eta-U)}(\rho - P_1). \tag{31}
\]
The right-hand side is non-negative since $\rho \ge P_1$, see (10) and (11), and from (30) it follows that if we start at any point $(t_0, \theta_0)$ on the initial surface we obtain
\[
R_\xi(t_0 - s, \theta_0 + s) \le R_\xi(t_0, \theta_0), \tag{32}
\]
and similarly from (31),
\[
R_\lambda(t_0 - s, \theta_0 - s) \le R_\lambda(t_0, \theta_0). \tag{33}
\]
From these relations we get for any $t\in(\tilde t, t_0)$ and any $\theta\in S^1$,
\begin{align}
R_\xi(t, \theta) &\le \sup_{\theta\in S^1} R_\xi(t_0, \theta), \tag{34}\\
R_\lambda(t, \theta) &\le \sup_{\theta\in S^1} R_\lambda(t_0, \theta). \tag{35}
\end{align}
This yields
\[
R_t(t, \theta) \le \sup_{\theta\in S^1}(R_\xi + R_\lambda)(t_0, \theta), \tag{36}
\]
and since $\nabla R$ is timelike everywhere we have $|R_t| > |R_\theta|$ and we find that both $R_t$ and $|R_\theta|$ are bounded into the past, so $R$ is uniformly $C^1$ bounded to the past of the initial surface.

Step 2 (Bounds on $U$, $A$ and $\eta$ and their first derivatives). The bounds on $U_t$, $A_t$, $U_\theta$ and $A_\theta$ to the past of the initial surface are obtained by a light-cone estimate, which in this case, with one spatial dimension, is an application of the Gronwall method on two independent null paths. Then, by combining these results, one obtains the desired estimate. Let us now define the quadratic forms $G$ and $H$ by
\begin{align}
G &= \frac{1}{2}R(U_t^2 + U_\theta^2) + \frac{e^{4U}}{8R}(A_t^2 + A_\theta^2), \tag{37}\\
H &= RU_tU_\theta + \frac{e^{4U}}{4R}A_tA_\theta. \tag{38}
\end{align}
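As a short worked identity (added here for orientation, using only the definitions (26), (37) and (38)), the combinations $G \pm H$ are perfect squares in the null derivatives:

```latex
\begin{align*}
G + H &= \tfrac{1}{2}R\,(U_t + U_\theta)^2 + \frac{e^{4U}}{8R}\,(A_t + A_\theta)^2
       = R\,U_\xi^2 + \frac{e^{4U}}{4R}\,A_\xi^2, \\
G - H &= \tfrac{1}{2}R\,(U_t - U_\theta)^2 + \frac{e^{4U}}{8R}\,(A_t - A_\theta)^2
       = R\,U_\lambda^2 + \frac{e^{4U}}{4R}\,A_\lambda^2,
\end{align*}
% using U_t \pm U_\theta = \sqrt{2}\,U_{\xi/\lambda} and likewise for A.
```

In particular, for $R > 0$ both $G + H$ and $G - H$ are non-negative, so $|H| \le G$, and the field parts of $h_1$ and $h_2$ in Step 1 are exactly $G + H$ and $G - H$.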
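A motivation for the introduction of these quadratic forms is given in [BCIM] where it is shown that $G$ and $H$ are components of an “energy-momentum tensor” of a wave map. To derive bounds on $U$ and $A$ and their first order derivatives we use the evolution equations (19) and (20) and we find
\begin{align*}
\partial_\lambda(G + H) &= \frac{-1}{2\sqrt{2}}\,R_\xi\Big[U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big] + U_\xi\,\frac{R}{2}\,e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3) \\
&\quad + \frac{e^{2U}}{2R}\,A_\xi\,Re^{2(\eta-U)}S_{23},
\end{align*}
and
\begin{align*}
\partial_\xi(G - H) &= \frac{-1}{2\sqrt{2}}\,R_\lambda\Big[U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big] + U_\lambda\,\frac{R}{2}\,e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3) \\
&\quad + \frac{e^{2U}}{2R}\,A_\lambda\,Re^{2(\eta-U)}S_{23}.
\end{align*}
Now, integrating these equations along null paths starting at $(t_1, \theta)$ and ending at the initial $t_0$-surface, and adding the results we obtain
\begin{align}
G(t_1, \theta) &= \frac{1}{2}[G + H](t_0, \theta - (t_0 - t_1)) + \frac{1}{2}[G - H](t_0, \theta + (t_0 - t_1)) \notag\\
&\quad - \frac{1}{2}\int_{t_1}^{t_0}\big[K_1(s, \theta - (s - t_1)) + K_2(s, \theta + (s - t_1))\big]\,ds \notag\\
&\quad - \frac{1}{2}\int_{t_1}^{t_0}\big([U_\xi T](s, \theta - (s - t_1)) + [U_\lambda T](s, \theta + (s - t_1))\big)\,ds \notag\\
&\quad - \frac{1}{2}\int_{t_1}^{t_0}\Big(\Big[\frac{e^{2U}}{2R}A_\xi\tilde T\Big](s, \theta - (s - t_1)) + \Big[\frac{e^{2U}}{2R}A_\lambda\tilde T\Big](s, \theta + (s - t_1))\Big)\,ds, \tag{39}
\end{align}
where we have introduced the notations
\begin{align}
K_1 &= \frac{-1}{2\sqrt{2}}\,R_\lambda\Big[U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big], \tag{40}\\
K_2 &= \frac{-1}{2\sqrt{2}}\,R_\xi\Big[U_t^2 - U_\theta^2 + \frac{e^{4U}}{4R^2}(-A_t^2 + A_\theta^2)\Big], \tag{41}\\
T &= \frac{R}{2}\,e^{2(\eta-U)}(\rho - P_1 + P_2 - P_3), \tag{42}\\
\tilde T &= Re^{2(\eta-U)}S_{23}. \tag{43}
\end{align}
Let us first consider the matter terms. Note that for any $t\in(\tilde t, t_0)$, the evolution equations (30) and (31) give
\[
R_\xi(t_0, \theta + (t_0 - t)) - R_\xi(t, \theta) = \sqrt{2}\int_t^{t_0}\big[Re^{2(\eta-U)}(\rho - P_1)\big](s, \theta + (s - t))\,ds, \tag{44}
\]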
and
\[
R_\lambda(t_0, \theta - (t_0 - t)) - R_\lambda(t, \theta) = \sqrt{2}\int_t^{t_0}\big[Re^{2(\eta-U)}(\rho - P_1)\big](s, \theta - (s - t))\,ds. \tag{45}
\]
Hence, since $R$ is uniformly $C^1$ bounded to the past of the initial surface it follows that the right-hand sides are uniformly bounded. From (10)–(11) we have $\rho \ge P_1 + P_2 + P_3$, and thus $0 \le (\rho - P_1 + P_2 - P_3) \le 2(\rho - P_1)$, and from (13) and the elementary inequality $2ab \le a^2 + b^2$, $a, b\in\mathbb{R}$, we have $2|S_{23}| \le P_2 + P_3 \le \rho - P_1$. In view of (44) and (45) we therefore have that both
\[
\int_t^{t_0} T(s, \theta \pm (s - t))\,ds, \tag{46}
\]
and
\[
\int_t^{t_0} |\tilde T(s, \theta \pm (s - t))|\,ds, \tag{47}
\]
are uniformly bounded on $(\tilde t, t_0]\times S^1$. Now, by using the inequality $2ab \le a^2 + b^2$ again, we get
\[
|U_\xi| \le \Big(\frac{2G}{R}\Big)^{1/2},
\]
and
\[
\frac{e^{2U}}{2R}\,|A_\xi| \le \Big(\frac{2G}{R}\Big)^{1/2}.
\]
The same estimates also hold for $U_\lambda$ and $A_\lambda$. Since $R_\xi$ and $R_\lambda$ are uniformly bounded it clearly follows that
\[
|K_1| \le \frac{CG}{R}, \qquad |K_2| \le \frac{CG}{R},
\]
for some constant $C$. Let $a(t) := \sup_\theta R^{-1}(t, \cdot)$. The identity (39) now implies that
\begin{align}
\sup_\theta G(t_1, \cdot) &\le \sup_\theta G(t_0, \cdot) + \sup_\theta H(t_0, \cdot) + C\int_{t_1}^{t_0} a(s)\sup_\theta G(s, \cdot)\,ds \notag\\
&\quad + C\sup_\theta\int_{t_1}^{t_0}\sqrt{a(s)}\,\Big[\sup_\theta\sqrt{G(s, \cdot)}\Big](T + |\tilde T|)(s, \theta - (s - t_1))\,ds \notag\\
&\quad + C\sup_\theta\int_{t_1}^{t_0}\sqrt{a(s)}\,\Big[\sup_\theta\sqrt{G(s, \cdot)}\Big](T + |\tilde T|)(s, \theta + (s - t_1))\,ds. \tag{48}
\end{align}
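Inequalities of the type (48) are closed by a Gronwall argument. As a minimal numerical sketch of that mechanism (an illustration on a model problem, not the estimate of the paper), the Python code below compares an explicit Euler solution of $u' = k(\tau)u$, $u(0) = c$, with the Gronwall bound $c\exp\int_0^\tau k$. Schematically, after the estimate $\sqrt{G} \le 1 + G$ one may think of $u = 1 + \sup_\theta G$, of $\tau = t_0 - t$, and of $k$ as collecting $a$ and the null-path integrands $T + |\tilde T|$; the function names and the sample kernel are assumptions made only for this example.

```python
import math

def euler_linear_growth(c, k, t_end, n_steps):
    """Explicit Euler for u' = k(t) * u with u(0) = c; returns all grid values."""
    h = t_end / n_steps
    u = [c]
    for i in range(n_steps):
        u.append(u[-1] * (1.0 + h * k(i * h)))
    return u

def gronwall_bound(c, k, t_end, n_steps):
    """Values c * exp(integral of k), using a left Riemann sum on the same grid."""
    h = t_end / n_steps
    integral = 0.0
    bounds = [c]
    for i in range(n_steps):
        integral += h * k(i * h)
        bounds.append(c * math.exp(integral))
    return bounds

# Hypothetical non-negative kernel standing in for the bounded coefficients.
kernel = lambda s: 1.0 + 0.5 * math.sin(s) ** 2

u_values = euler_linear_growth(2.0, kernel, 3.0, 3000)
bound_values = gronwall_bound(2.0, kernel, 3.0, 3000)
# Each Euler factor satisfies 1 + h*k <= exp(h*k), so u never exceeds the bound.
stays_below = all(x <= y + 1e-12 for x, y in zip(u_values, bound_values))
```

The point of the sketch is only that a non-negative, integrable kernel yields a finite bound on any finite interval, which is why uniform bounds on $a(t)$ and on the integrals (46) and (47) suffice.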
Since the suprema with respect to $\theta$ of the last two integrals in (48) are taken over the compact set $S^1$, there exist $\theta_1, \theta_2\in S^1$ such that the suprema of these integrals equal
\begin{align}
&C\int_{t_1}^{t_0}\sqrt{a(s)}\,\Big[\sup_\theta\sqrt{G(s, \cdot)}\Big](T + |\tilde T|)(s, \theta_1 - (s - t_1))\,ds \notag\\
&\quad + C\int_{t_1}^{t_0}\sqrt{a(s)}\,\Big[\sup_\theta\sqrt{G(s, \cdot)}\Big](T + |\tilde T|)(s, \theta_2 + (s - t_1))\,ds. \tag{49}
\end{align}
Combining (48) and (49) we obtain a Gronwall-type inequality. Recall that
\[
\int_{t_1}^{t_0}(T + |\tilde T|)(s, \theta \pm (s - t_1))\,ds
\]
are uniformly bounded on $(\tilde t, t_0]\times S^1$. Using the crude estimate $\sqrt{G} \le (1 + G)$ we obtain a standard Gronwall inequality which is sufficient here, but a sharper estimate is given in [MPF, p. 360]. Thus, as long as $R$ stays uniformly bounded away from zero (or equivalently $a(t)$ is uniformly bounded on $(\tilde t, t_0]$), we conclude that $\sup_\theta G$ is uniformly bounded on $(\tilde t, t_0]$, leading to bounds on $U$ and its first order derivatives, and thus also on $A$ and its first order derivatives.

The bounds on $|\eta|$, $|\eta_t|$ and $|\eta_\theta|$ are obtained in a similar way since the evolution equation (22) can be written
\[
\partial_\lambda\eta_\xi = U_\theta^2 - U_t^2 + \frac{e^{4U}}{4R^2}(A_t^2 - A_\theta^2) - e^{2(\eta-U)}\Big(P_3 + \frac{A^2}{R^2}\,e^{4U}P_2 + \frac{2A}{R}\,e^{2U}S_{23}\Big), \tag{50}
\]
or equivalently,
\[
\partial_\xi\eta_\lambda = U_\theta^2 - U_t^2 + \frac{e^{4U}}{4R^2}(A_t^2 - A_\theta^2) - e^{2(\eta-U)}\Big(P_3 + \frac{A^2}{R^2}\,e^{4U}P_2 + \frac{2A}{R}\,e^{2U}S_{23}\Big). \tag{51}
\]
We found above that the integrals along null paths for the matter quantity $Re^{2(\eta-U)}(\rho - P_1)$ were bounded to the past of the initial surface. Therefore, since $0 \le P_k \le \rho - P_1$, $k = 2, 3$, and $|S_{23}| \le \rho - P_1$ we have, as long as $R$ stays bounded away from zero, that the integrals along the null paths for the matter terms in the right-hand sides above are bounded as well, since $U$ and $A$ are bounded. Now, since the first order derivatives of $U$ and $A$ are uniformly bounded we immediately obtain that $|\eta_\xi|$ and $|\eta_\lambda|$ are bounded by integrating the equations for $\eta$ along null paths. Since $\eta_t = \frac{1}{\sqrt{2}}(\eta_\xi + \eta_\lambda)$ and $\eta_\theta = \frac{1}{\sqrt{2}}(\eta_\xi - \eta_\lambda)$ we find that $\eta$ is uniformly $C^1$ bounded to the past of the initial surface as long as $R$ stays bounded away from zero.

Step 3 (Bound on the support of the momentum). Note that a solution $f$ to the Vlasov equation is given by
\[
f(t, \theta, v) = f_0(\Theta(0, t, \theta, v), V(0, t, \theta, v)), \tag{52}
\]
where $\Theta$ and $V$ are solutions to the characteristic system
\begin{align*}
\frac{d\Theta}{ds} &= \frac{V^1}{V^0}, \\
\frac{dV^1}{ds} &= -(\eta_\theta - U_\theta)V^0 - (\eta_t - U_t)V^1 + U_\theta\frac{(V^2)^2}{V^0} - \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{(V^3)^2}{V^0} + \frac{A_\theta}{R}\,e^{2U}\frac{V^2V^3}{V^0}, \\
\frac{dV^2}{ds} &= -U_tV^2 - U_\theta\frac{V^1V^2}{V^0}, \\
\frac{dV^3}{ds} &= -\Big(\frac{R_t}{R} - U_t\Big)V^3 + \Big(U_\theta - \frac{R_\theta}{R}\Big)\frac{V^1V^3}{V^0} - \frac{e^{2U}}{R}\Big(A_t + A_\theta\frac{V^1}{V^0}\Big)V^2,
\end{align*}
and $\Theta(s, t, \theta, v)$, $V(s, t, \theta, v)$ is the solution that goes through the point $(\theta, v)$ at time $t$. Let us recall the definition of
\[
Q(t) := \sup\{|v| : \exists (s, \theta)\in[t, t_0]\times S^1 \text{ such that } f(s, \theta, v)\neq 0\}.
\]
If $Q(t)$ can be controlled we obtain immediately from (10)–(13) bounds on $\rho$, $J$, $S_{23}$ and $P_k$, $k = 1, 2, 3$, since $\|f\|_\infty \le \|f_0\|_\infty$ from (52). Now, all of the field components and their first derivatives are known to be bounded on $(\tilde t, t_0]$, as long as $R$ stays bounded away from zero. Also, the distribution function has compact support on the initial surface and therefore $|V^k(t_0)| < C$. So by observing that $|V^k| < V^0$, $k = 1, 2, 3$, a simple Gronwall argument applied to the characteristic system gives uniform bounds on $|V^k(t)|$, $t\in(\tilde t, t_0]$, and it follows that $Q(t)$ is uniformly bounded on $(\tilde t, t_0]$.

Remark. By a Killing vector argument, bounds on $|V^2|$ and $|V^3|$ can be derived if merely $|U|$ and $|A|$ are bounded and $R > \epsilon > 0$. Such an argument will be used in the expanding direction.

Step 4 (Bounds on the second order derivatives of the field components and on the first order derivatives of $f$). From the Einstein-matter constraint equations in conformal coordinates we can express $R_{t\theta}$ and $R_{\theta\theta}$ in terms of uniformly bounded quantities, as long as $R$ stays bounded away from zero. Therefore these functions are uniformly bounded and Eq. (21) then implies that $R_{tt}$ is uniformly bounded as well. In the vacuum case one can take the derivative of the evolution equations and repeat the argument in Step 2 to obtain bounds on second order derivatives of $U$ and $A$. Here we need another argument. First we write the evolution equations for $U$ and $A$ in the forms
\begin{align}
U_{tt} - U_{\theta\theta} &= \frac{(R_\theta - R_t)}{2R}(U_\theta + U_t) - \frac{(R_\theta + R_t)}{2R}(U_t - U_\theta) \notag\\
&\quad + \frac{e^{4U}}{2R^2}(A_t - A_\theta)(A_t + A_\theta) + \frac{1}{2}\,e^{2(\eta-U)}\kappa, \tag{53}
\end{align}
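and
\begin{align}
A_{tt} - A_{\theta\theta} &= \frac{(R_t - R_\theta)}{2R}(A_\theta + A_t) + \frac{(R_\theta + R_t)}{2R}(A_t - A_\theta) \notag\\
&\quad - 2(A_t - A_\theta)(U_\theta + U_t) - 2(A_\theta + A_t)(U_t - U_\theta) + 2Re^{2(\eta-2U)}S_{23}, \tag{54}
\end{align}
where $\kappa$ denotes $\rho - P_1 + P_2 - P_3$. Taking the $\theta$-derivative of these equations gives
\begin{align}
\partial_\lambda\partial_\xi U_\theta &= L + \frac{R_\xi}{2R}\,\partial_\xi U_\theta + \frac{R_\lambda}{2R}\,\partial_\lambda U_\theta \notag\\
&\quad + \frac{e^{4U}}{2R^2}(A_\lambda\,\partial_\xi A_\theta + A_\xi\,\partial_\lambda A_\theta) + \frac{1}{4}\,e^{2(\eta-U)}\kappa_\theta, \tag{55}
\end{align}
and
\begin{align}
\partial_\lambda\partial_\xi A_\theta &= L + \frac{R_\xi}{2R}\,\partial_\xi A_\theta - \frac{R_\lambda}{2R}\,\partial_\lambda A_\theta + 2U_\xi\,\partial_\lambda A_\theta + 2A_\lambda\,\partial_\xi U_\theta \notag\\
&\quad + 2U_\lambda\,\partial_\xi A_\theta + 2A_\xi\,\partial_\lambda U_\theta + 2Re^{2(\eta-2U)}(S_{23})_\theta. \tag{56}
\end{align}
Here, $L$ contains only $\kappa$ and $S_{23}$, first order derivatives of $U$, $A$ and $\eta$, and first and second order derivatives of $R$, which are all known to be bounded. These equations can of course also be written in a form where the left-hand sides read $\partial_\xi\partial_\lambda U_\theta$ and $\partial_\xi\partial_\lambda A_\theta$, respectively. By integrating these equations along null paths to the past of the initial surface, we get from a Gronwall argument a bound on
\[
\sup_{\theta\in S^1}\big(|\partial_\xi U_\theta| + |\partial_\lambda U_\theta| + |\partial_\xi A_\theta| + |\partial_\lambda A_\theta|\big),
\]
as long as $R$ is bounded away from zero, under the hypothesis that the integrals of the differentiated matter terms $\kappa_\theta$ and $(S_{23})_\theta$ can be controlled. In order to bound these integrals we make use of a device introduced by Glassey and Strauss [GS] for treating the Vlasov–Maxwell equation. It is sufficient to show how one of the differentiated matter terms can be bounded since the arguments are similar in all cases. Let us consider the integral, arising when (55) is integrated along the null path defined by $\partial_\lambda$, which involves $\rho_\theta$,
\[
\frac{1}{4}\int_t^{t_0}\int_{\mathbb{R}^3}\big[e^{2(\eta-U)}v^0\,\partial_\theta f\big](s, \theta - (s - t), v)\,dv\,ds, \tag{57}
\]
where $t\in(\tilde t, t_0]$. Next, define
\[
W = \sqrt{2}\,\partial_\lambda = \partial_t - \partial_\theta, \qquad S = \partial_t + \frac{v^1}{v^0}\,\partial_\theta.
\]
Hence, $\partial_\theta$ and $\partial_t$ can be expressed in terms of $W$ and $S$ by
\begin{align}
\partial_\theta &= \frac{v^0}{v^0 + v^1}(S - W), \tag{58}\\
\partial_t &= \frac{v^0}{v^0 + v^1}\Big(S + \frac{v^1}{v^0}W\Big). \tag{59}
\end{align}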
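For completeness, here is a short verification of the inversion formulas (58) and (59), using only the definitions of $W$ and $S$ just given:

```latex
\begin{align*}
S - W &= \Big(\partial_t + \frac{v^1}{v^0}\,\partial_\theta\Big) - (\partial_t - \partial_\theta)
       = \frac{v^0 + v^1}{v^0}\,\partial_\theta, \\
S + \frac{v^1}{v^0}\,W &= \Big(1 + \frac{v^1}{v^0}\Big)\partial_t
       + \Big(\frac{v^1}{v^0} - \frac{v^1}{v^0}\Big)\partial_\theta
       = \frac{v^0 + v^1}{v^0}\,\partial_t .
\end{align*}
```

Since $v^0 > |v^1|$ by (15), the factor $v^0 + v^1$ is strictly positive, so the division in (58) and (59) is always permitted.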
Now,
\[
[Wf](s, \theta - (s - t), v) = \partial_s[f(s, \theta - (s - t), v)],
\]
and from the Vlasov equation we get
\[
[Sf](s, \theta - (s - t), v) = [-K\cdot\nabla_v f](s, \theta - (s - t), v),
\]
where it is clear which terms have been denoted by $K = (K_1, K_2, K_3)$. By using (58) we can now evaluate the integral above by integrating by parts (in $s$ for the $W$-term and in $v$ for the $S$-term), so that the remaining terms only involve bounded quantities. Note in particular that the $v$-integrals are easily controlled in view of the uniform bound on $Q(t)$. Thus, the integrals of the differentiated matter terms can be controlled and the Gronwall argument referred to above goes through. So we obtain uniform bounds on $|\partial_\xi U_\theta|$, $|\partial_\lambda U_\theta|$, $|\partial_\xi A_\theta|$, and $|\partial_\lambda A_\theta|$, and therefore also on $|U_{\theta\theta}|$, $|U_{t\theta}|$, $|A_{\theta\theta}|$ and $|A_{t\theta}|$, as long as $R$ is bounded away from zero. The evolution equations (19) and (20) then give uniform bounds on $|U_{tt}|$ and $|A_{tt}|$. By differentiating Eq. (22), it is now straightforward to obtain bounds on the second order derivatives of $\eta$, using arguments similar to those already discussed here; in particular the integrals involving matter quantities can be treated as above. Bounds on the first order derivatives of the distribution function $f$ may now be obtained from the known bounds on the field components from the formula
\[
f(t, \theta, v) = f_0(\Theta(0, t, \theta, v), V(0, t, \theta, v)), \tag{60}
\]
since $f_0$ is smooth and since $\partial\Theta$ and $\partial V$ (here $\partial$ denotes $\partial_t$, $\partial_\theta$ or $\partial_v$) can be controlled by a Gronwall argument in view of the characteristic system.

Step 5 (Bounds on higher order derivatives and completion of the proof). It is clear that the method described above can be continued for obtaining bounds on higher derivatives as well. Hence, we have uniform bounds on the functions $R$, $U$, $A$, $\eta$ and $f$ and all their derivatives on the interval $(\tilde t, t_0]$ if $R > \epsilon > 0$. This implies that the solution extends to $t\to-\infty$ as long as $R$ stays bounded away from zero. In view of the discussion after the statement of Theorem 1, this completes the proof of Theorem 1 in the contracting direction.

5. Analysis in the Expanding Direction

To begin the analysis in the expanding direction (increasing $R$) in areal coordinates we need to start with data on an $R = \text{constant}$ Cauchy surface (recall that in areal coordinates $R = t$). That this can be done follows from the geometrical arguments in [BCIM] (cf. the discussion following the statement of Theorem 1). There it is shown that if Gowdy symmetric (or more generally $T^2$ symmetric) data is given on $T^3$, and if $R_0$ is the past limit of $R$ along past inextendible paths in $D^-_{\mathrm{conf}}$ and if $R_1 := \inf_{T^3} R$, then for every $d\in(R_0, R_1)$, the $R = d$ level set $\Sigma_d$ is a Cauchy surface, and these $\Sigma_d$ foliate the region $D^-_{\mathrm{conf}}\cap I^-(\Sigma_{R_1})$. Here $I^-(S)$ is the chronological past of $S$ (see [HE]). The surfaces $\Sigma_d$ lie to the past of the initial surface. Let us pick one of them, say $\Sigma_{d_2}$. The spacetime $D^-(h, k, f_0)$ induces initial data for the areal component fields $(U, A, \eta, \alpha)$ and the distribution function $f$ on $\Sigma_{t_2 = d_2}$. By combining the local existence proof in harmonic coordinates [CB], and the arguments in [Cl] which show that the spacetime admits areal coordinates, we obtain local existence for the initial value problem in these coordinates.
Now, in order to extend local existence to global existence in these coordinates, it is again sufficient to obtain uniform bounds on the field components and the distribution function and all their derivatives on a finite time interval [t2 , t3 ) on which the local solution exists.
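Step 1 (Bounds on $\alpha$, $U$, $A$ and $\tilde\eta$). In this step we first show an "energy" monotonicity lemma and then we show how this result leads to bounds on $\tilde\eta := \eta + \ln\alpha/2$ and on $U$ and $A$. Let $E(t)$ be defined by

$$
E(t) = \int_{S^1}\Big[\alpha^{-1/2}U_t^2 + \sqrt{\alpha}\,U_\theta^2 + \frac{e^{4U}}{4t^2}\big(\alpha^{-1/2}A_t^2 + \sqrt{\alpha}\,A_\theta^2\big) + \sqrt{\alpha}\,e^{2(\eta-U)}\rho\Big]\,d\theta.
$$

Lemma 1. $E(t)$ is a monotonically decreasing function in $t$, and satisfies

$$
\frac{d}{dt}E(t) = -\frac{2}{t}\int_{S^1}\Big[\alpha^{-1/2}U_t^2 + \frac{e^{4U}}{4t^2}\sqrt{\alpha}\,A_\theta^2 + \frac{\sqrt{\alpha}}{2}\,e^{2(\eta-U)}(\rho + P_3)\Big]\,d\theta \le 0. \tag{61}
$$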
Proof. This is a straightforward but somewhat lengthy computation. Let us merely sketch the steps involved. After taking the time derivative of the integrand we use the evolution equations for $U$ and $A$ to substitute for the second order derivatives, and we express $\rho_t$ by using the Vlasov equation. Integrating by parts and using the constraint equations for $\eta_t$ and $\alpha_t$ leads to (61). □

Remark. It is clear from (61) that a Gronwall argument leads to a bound on $E(t)$ also on $(0, t_2]$. For $T^2$ symmetry and vacuum, which is considered in [BCIM], this bound is not available. A natural question is then why the areal coordinates in our case have to be discarded in the analysis for the past direction. The reason is that the analysis of the characteristic system associated to the Vlasov equation in Lemma 2 depends on the time direction.

Let us now define the quantity $\tilde\eta$ by

$$
\tilde\eta = \eta + \frac{1}{2}\ln\alpha. \tag{62}
$$
From the constraint equation (4) we get

$$
\tilde\eta_\theta = 2tU_tU_\theta + \frac{e^{4U}}{2t}A_tA_\theta - t\sqrt{\alpha}\,e^{2(\eta-U)}J. \tag{63}
$$

Now, from the elementary inequality $|ab| \le \frac{1}{2c}a^2 + \frac{c}{2}b^2$, for any $a, b, c \in \mathbb{R}$, $c > 0$, and from the fact that $|J| \le \rho$, it follows from Lemma 1 that for any $t \in [t_2, t_3)$,

$$
\int_{S^1}|\tilde\eta_\theta|\,d\theta \le tE(t) \le tE(t_2). \tag{64}
$$
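The step from (63) to (64) can be spelled out term by term. The following is only a sketch: it assumes the elementary inequality is applied with $c = \sqrt{\alpha}$ to the pairs $(U_t, U_\theta)$ and $(A_t, A_\theta)$, and the intermediate lines are not taken from the text.

```latex
\begin{align*}
2t\,|U_t U_\theta|
  &\le t\left(\alpha^{-1/2}U_t^2 + \sqrt{\alpha}\,U_\theta^2\right),\\
\frac{e^{4U}}{2t}\,|A_t A_\theta|
  &\le t\,\frac{e^{4U}}{4t^2}\left(\alpha^{-1/2}A_t^2 + \sqrt{\alpha}\,A_\theta^2\right),\\
t\sqrt{\alpha}\,e^{2(\eta-U)}\,|J|
  &\le t\sqrt{\alpha}\,e^{2(\eta-U)}\rho .
\end{align*}
% Adding the three lines bounds |\tilde\eta_\theta| pointwise by t times the
% integrand of E(t); integrating over S^1 and using the monotonicity of E
% (Lemma 1) gives
\[
\int_{S^1}|\tilde\eta_\theta|\,d\theta \le tE(t) \le tE(t_2).
\]
```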
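Hence, for any $\theta_1, \theta_2 \in S^1$ and for any $t \in [t_2, t_3)$ we have

$$
|\tilde\eta(t, \theta_2) - \tilde\eta(t, \theta_1)| = \Big|\int_{\theta_1}^{\theta_2}\tilde\eta_\theta\,d\theta\Big| \le \int_{S^1}|\tilde\eta_\theta|\,d\theta \le tE(t_2). \tag{65}
$$

Next, using the constraint equations (3) and (5), we find that the time derivative of $\tilde\eta$ satisfies

$$
\tilde\eta_t = t\Big[U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4t^2}\big(A_t^2 + \alpha A_\theta^2\big) + \alpha e^{2(\eta-U)}P_1\Big] \ge 0. \tag{66}
$$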
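This relation leads to a control of $\int_{S^1}\tilde\eta\,d\theta$ from above, namely

$$
\begin{aligned}
\int_{S^1}\tilde\eta(t, \theta)\,d\theta - \int_{S^1}\tilde\eta(t_2, \theta)\,d\theta
&= \int_{t_2}^{t}\frac{d}{ds}\int_{S^1}\tilde\eta(s, \theta)\,d\theta\,ds\\
&= \int_{t_2}^{t}\int_{S^1}\sqrt{\alpha}\,s\Big[\frac{U_t^2}{\sqrt{\alpha}} + \sqrt{\alpha}\,U_\theta^2 + \frac{e^{4U}}{4s^2}\Big(\frac{A_t^2}{\sqrt{\alpha}} + \sqrt{\alpha}\,A_\theta^2\Big) + \sqrt{\alpha}\,e^{2(\eta-U)}P_1\Big]\,d\theta\,ds\\
&\le \sup_{S^1}\sqrt{\alpha}(t_2, \cdot)\int_{t_2}^{t}sE(s)\,ds \le C_1\int_{t_2}^{t}sE(t_2)\,ds = C_1E(t_2)(t^2 - t_2^2)/2.
\end{aligned} \tag{67}
$$

In the first inequality above we used that $P_1 \le \rho$ and in the second that $\alpha$ is a monotonically decreasing function in $t$ (see (5)), and $C_1 := \sup_{S^1}\sqrt{\alpha}(t_2, \cdot)$. We are now in a position to obtain an upper bound on $\tilde\eta$ itself. By letting $C_2 := \int_{S^1}\tilde\eta(t_2, \theta)\,d\theta$ we get from (67) the inequality

$$
\frac{1}{2}C_1E(t_2)(t^2 - t_2^2) + C_2 \ge \int_{S^1}\tilde\eta(t, \theta)\,d\theta \tag{68}
$$

$$
= 2\pi\max_{S^1}\tilde\eta + \int_{S^1}\big(\tilde\eta - \max_{S^1}\tilde\eta\big)\,d\theta. \tag{69}
$$

By applying (65) to the last term we find

$$
\frac{1}{2}C_1E(t_2)(t^2 - t_2^2) + C_2 \ge 2\pi\max_{S^1}\tilde\eta - 2\pi tE(t_2). \tag{70}
$$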
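For concreteness, the passage from (69) to (70) and one admissible choice of the resulting bound can be written out as below. This is a sketch only; the explicit function labelled $C(t)$ is an illustration of a bounded function on the finite interval $[t_2, t_3)$, not a formula stated in the text, and $C_1$, $C_2$ are the constants defined after (67).

```latex
% By (65), for every \theta \in S^1:
\[
\tilde\eta(t,\theta) - \max_{S^1}\tilde\eta(t,\cdot) \ge -tE(t_2)
\quad\Longrightarrow\quad
\int_{S^1}\bigl(\tilde\eta - \max_{S^1}\tilde\eta\bigr)\,d\theta \ge -2\pi t E(t_2).
\]
% Inserting this in (68)-(69) and solving for the maximum:
\[
\max_{S^1}\tilde\eta(t,\cdot)
  \le \frac{1}{2\pi}\Bigl[\tfrac12 C_1 E(t_2)\,(t^2 - t_2^2) + C_2\Bigr] + tE(t_2)
  =: C(t),
\]
% and C(t) is bounded for t in [t_2, t_3) because t_3 is finite.
```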
Therefore, for some bounded function $C(t)$, we have the upper bound

$$
\max_{S^1}\tilde\eta \le C(t), \tag{71}
$$

and since $\tilde\eta_t \ge 0$ we conclude that $\tilde\eta$ is uniformly bounded on $S^1 \times [t_2, t_3)$.

Remark. In the analysis below $C(t)$ will always denote a uniformly bounded function on $[t_2, t_3)$. Sometimes we introduce other functions with the same property only for the purpose of making some estimates more transparent.

Next we show that the boundedness of $E(t)$, together with the constraint equation (5), leads to a bound on $|U|$. For any $\theta_1, \theta_2 \in S^1$ and $t \in [t_2, t_3)$ we get by Hölder's inequality

$$
|U(t, \theta_2) - U(t, \theta_1)| = \Big|\int_{\theta_1}^{\theta_2}U_\theta(t, \theta)\,d\theta\Big| \le \Big(\int_{\theta_1}^{\theta_2}\alpha^{-1/2}\,d\theta\Big)^{1/2}\Big(\int_{\theta_1}^{\theta_2}\sqrt{\alpha}\,U_\theta^2\,d\theta\Big)^{1/2}. \tag{72}
$$
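The weight splitting behind (72) is the standard Cauchy-Schwarz case of Hölder's inequality. A short sketch, with the factorization of the integrand made explicit (the factorization itself is an assumption about how the inequality is applied):

```latex
\[
|U_\theta| = \alpha^{-1/4}\cdot\bigl(\alpha^{1/4}|U_\theta|\bigr)
\quad\Longrightarrow\quad
\int_{\theta_1}^{\theta_2}|U_\theta|\,d\theta
\le \Bigl(\int_{\theta_1}^{\theta_2}\alpha^{-1/2}\,d\theta\Bigr)^{1/2}
    \Bigl(\int_{\theta_1}^{\theta_2}\alpha^{1/2}U_\theta^2\,d\theta\Bigr)^{1/2}.
\]
% The second factor is controlled by the energy, since \sqrt{\alpha} U_\theta^2
% is one of the nonnegative terms in the integrand of E(t):
\[
\int_{\theta_1}^{\theta_2}\sqrt{\alpha}\,U_\theta^2\,d\theta \le E(t) \le E(t_2).
\]
```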
The second factor on the right-hand side is clearly bounded by $(E(t_2))^{1/2}$. For the first factor we use the constraint equation (5). This equation can be written as

$$
\partial_t(\alpha^{-1/2}) = t\sqrt{\alpha}\,e^{2(\eta-U)}(\rho - P_1), \tag{73}
$$

so that for $t \in [t_2, t_3)$,

$$
\alpha^{-1/2}(t, \theta) = \int_{t_2}^{t}s\sqrt{\alpha}\,e^{2(\eta-U)}(\rho - P_1)\,ds + \alpha^{-1/2}(t_2, \theta). \tag{74}
$$

Since $\rho \ge P_1$, the integrand is positive and bounded by the last term in the integrand of $E(t)$. Letting $C$ denote the supremum of $\alpha^{-1/2}(t_2, \cdot)$ over $S^1$ we get

$$
\int_{\theta_1}^{\theta_2}\alpha^{-1/2}\,d\theta \le \int_{t_2}^{t}\int_{S^1}s\sqrt{\alpha}\,e^{2(\eta-U)}\rho\,d\theta\,ds + 2\pi C \le E(t_2)(t^2 - t_2^2)/2 + 2\pi C. \tag{75}
$$
Hence, for any $\theta_1, \theta_2 \in S^1$ we have

$$
|U(t, \theta_2) - U(t, \theta_1)| \le C(t). \tag{76}
$$

Next we estimate $\int_{S^1}U(t, \theta)\,d\theta$. With $C := \int_{S^1}U(t_2, \theta)\,d\theta$, we get by Hölder's inequality

$$
\begin{aligned}
\Big|\int_{S^1}U(t, \theta)\,d\theta\Big| &= \Big|\int_{t_2}^{t}\int_{S^1}U_t(s, \theta)\,d\theta\,ds + C\Big|\\
&\le \int_{t_2}^{t}\int_{S^1}|U_t(s, \theta)|\,d\theta\,ds + |C|\\
&\le \int_{t_2}^{t}\Big(\int_{S^1}\sqrt{\alpha}\,d\theta\Big)^{1/2}\Big(\int_{S^1}\alpha^{-1/2}U_t^2\,d\theta\Big)^{1/2}ds + |C|.
\end{aligned} \tag{77}
$$

The right-hand side is easily seen to be bounded since (5) shows that $\sqrt{\alpha}$ is monotonically decreasing and (61) gives a bound for the second factor. Therefore

$$
\Big|\int_{S^1}U(t, \theta)\,d\theta\Big| \le C(t),
$$

for some uniformly bounded function $C(t)$. To obtain a uniform bound on $U$ we combine these results. Let $U_+(t) := \max_{S^1}U(t, \cdot)$, and $U_-(t) := \min_{S^1}U(t, \cdot)$. We have

$$
2\pi U_\pm(t) = \int_{S^1}U(t, \theta)\,d\theta + \int_{S^1}\big(U_\pm(t) - U(t, \theta)\big)\,d\theta, \tag{78}
$$

and the right-hand side is bounded from below and above, so $U$ is uniformly bounded on $[t_2, t_3) \times S^1$. These arguments apply to $A$ as well, since the factor $e^{4U}$ is controlled by the uniform bound on $U$.

Remark. In the case studied in [BCIM], i.e. vacuum and $T^2$ symmetry, a bound on $\ln\alpha$, and thus on $\eta$, is directly available. On the other hand, the method used here to bound $U$ and $A$ does not directly apply, which would lead to a difficulty in generalizing the result in [BCIM] to matter spacetimes. However, one can in that case show that

$$
\int_{S^1}\Big[\alpha^{-1/2}U_t^2 + \sqrt{\alpha}\,U_\theta^2 + \frac{e^{4U}}{4t^2}\big(\alpha^{-1/2}A_t^2 + \sqrt{\alpha}\,A_\theta^2\big) + \sqrt{\alpha}\,e^{2(\eta-U)}\Big(\rho + \frac{K^2}{4t^4}\Big)\Big]\,d\theta \tag{79}
$$

is monotonically decreasing. Here $K$ is the twist constant in [BCIM]. This is sufficient for obtaining bounds on $U$ and $A$ also in the more general case of $T^2$ symmetry by straightforwardly applying the arguments above.
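Step 2 (Bounds on $U_t$, $U_\theta$, $A_t$, $A_\theta$, $\eta_t$, $\alpha_t$ and $Q(t)$). To bound the derivatives of $U$ we use light-cone estimates in a similar way as for the contracting direction. However, the matter terms must be treated differently and we need to carry out a careful analysis of the characteristic system associated with the Vlasov equation. Let us define

$$
G = \frac{1}{2}\big(U_t^2 + \alpha U_\theta^2\big) + \frac{e^{4U}}{8t^2}\big(A_t^2 + \alpha A_\theta^2\big), \tag{80}
$$

$$
H = \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\sqrt{\alpha}\,A_tA_\theta, \tag{81}
$$

and

$$
\chi = \frac{1}{\sqrt{2}}\big(\partial_t + \sqrt{\alpha}\,\partial_\theta\big), \tag{82}
$$

$$
\zeta = \frac{1}{\sqrt{2}}\big(\partial_t - \sqrt{\alpha}\,\partial_\theta\big). \tag{83}
$$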
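It may help to record how $G$ and $H$ combine into squares, which is the structure the light-cone estimates rely on. The identities below are a sketch under the assumption that both cross terms in $H$ carry the factor $\sqrt{\alpha}$ (this is what makes the squares close up); they are written with the vector fields $\chi$ and $\zeta$ of (82) and (83).

```latex
\begin{align*}
G + H &= \tfrac12\bigl(U_t + \sqrt{\alpha}\,U_\theta\bigr)^2
        + \frac{e^{4U}}{8t^2}\bigl(A_t + \sqrt{\alpha}\,A_\theta\bigr)^2
       = (\chi U)^2 + \frac{e^{4U}}{4t^2}(\chi A)^2,\\
G - H &= \tfrac12\bigl(U_t - \sqrt{\alpha}\,U_\theta\bigr)^2
        + \frac{e^{4U}}{8t^2}\bigl(A_t - \sqrt{\alpha}\,A_\theta\bigr)^2
       = (\zeta U)^2 + \frac{e^{4U}}{4t^2}(\zeta A)^2 .
\end{align*}
% Both combinations are nonnegative, hence |H| \le G pointwise.
```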
A motivation for the introduction of these quantities is based on similar arguments as those given in Step 2, Sect. 4. For details we refer to [BCIM].

Remark. We use the same notations, $G$ and $H$, as in the contracting direction, and below we continue to carry over the notations. The analysis in the respective direction is independent so there should be no risk of confusion.

By using the evolution equation (7), a short computation shows that

$$
\begin{aligned}
\zeta(G + H) ={}& \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G + H) - \frac{1}{\sqrt{2}\,t}\Big[U_t^2 + \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 + \sqrt{\alpha}\,A_tA_\theta\big)\Big]\\
&+ \big(U_t + \sqrt{\alpha}\,U_\theta\big)\frac{\alpha}{2\sqrt{2}}\,e^{2(\eta-U)}\kappa + \frac{\alpha e^{2\eta}}{2\sqrt{2}\,t}\big(A_t + \sqrt{\alpha}\,A_\theta\big)S_{23},
\end{aligned} \tag{84}
$$

$$
\begin{aligned}
\chi(G - H) ={}& \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G - H) - \frac{1}{\sqrt{2}\,t}\Big[U_t^2 - \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 - \sqrt{\alpha}\,A_tA_\theta\big)\Big]\\
&+ \big(U_t - \sqrt{\alpha}\,U_\theta\big)\frac{\alpha}{2\sqrt{2}}\,e^{2(\eta-U)}\kappa + \frac{\alpha e^{2\eta}}{2\sqrt{2}\,t}\big(A_t - \sqrt{\alpha}\,A_\theta\big)S_{23}.
\end{aligned} \tag{85}
$$

Here $\kappa = \rho - P_1 + P_2 - P_3$. Now we wish to integrate these equations along the integral curves of the vector fields $\chi$ and $\zeta$ respectively (let us henceforth call these integral curves null curves, since they are null with respect to the two-dimensional "base spacetime"). Below we show that the quantity

$$
\Gamma(t) := \sup_{\theta \in S^1}G(t, \cdot) + Q^2(t), \tag{86}
$$

is uniformly bounded on $[t_2, t_3)$ by deriving the inequality

$$
\Gamma(t) \le C + \int_{t_2}^{t}\Gamma(s)\ln\Gamma(s)\,ds. \tag{87}
$$
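An inequality of the form (87) still yields a bound on any finite interval, although a weaker one than the usual Gronwall estimate. The following sketch shows one way to see this; it assumes $\Gamma \ge 1$ and $C > 1$ (harmless normalizations introduced here for illustration, in the spirit of the large-momentum assumption made below), and $\Phi$ is a helper function not used in the text.

```latex
% Let \Phi(t) := C + \int_{t_2}^{t} \Gamma(s)\ln\Gamma(s)\,ds, so that
% \Gamma \le \Phi and \Phi \ge C > 1. Since x \mapsto x\ln x is increasing
% on [1,\infty),
\[
\Phi'(t) = \Gamma(t)\ln\Gamma(t) \le \Phi(t)\ln\Phi(t)
\quad\Longrightarrow\quad
\frac{d}{dt}\,\ln\ln\Phi(t) \le 1 .
\]
% Integrating from t_2 to t, with \Phi(t_2) = C:
\[
\ln\ln\Phi(t) \le \ln\ln C + (t - t_2)
\quad\Longrightarrow\quad
\Gamma(t) \le \Phi(t) \le C^{\,e^{\,t - t_2}},
\]
% which is finite for every t in the finite interval [t_2, t_3).
```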
We begin with two observations. Let $\gamma$ and $X$ be a geodesic and a Killing vector field respectively in any spacetime. Then $g(\gamma', X)$ is conserved along the geodesic. Here $\gamma'$ is the tangent vector to $\gamma$. In our case we have the two Killing vector fields $\partial_x$ and $\partial_y$. The particles follow the geodesics of spacetime with tangent $p^\mu$, so $g_{\mu\nu}p^\mu(\partial_x)^\nu$ and $g_{\mu\nu}p^\mu(\partial_y)^\nu$ are conserved. Expressing $p^\mu$ in terms of $v^\mu$ (see (14)) we find that

$$
V^2(t)e^{U(t, \Theta(t))} \quad\text{and}\quad V^2(t)Ae^{U(t, \Theta(t))} + V^3(t)\,t\,e^{-U(t, \Theta(t))}
$$

are conserved. Here $V^2(t)$, $V^3(t)$ and $\Theta(t)$ are solutions to the characteristic system associated to the Vlasov equation. From Step 1 we have that $U$ and $A$ are uniformly bounded on $[t_2, t_3)$. Hence $|V^2(t)|$ and $|V^3(t)|$ are both uniformly bounded on $[t_2, t_3)$, and since the initial distribution function $f_0$ has compact support we conclude that

$$
\sup\{|v^2| + |v^3| : \exists(s, \theta) \in [t_2, t] \times S^1 \text{ with } f(s, \theta, v) \ne 0\} \tag{88}
$$

is uniformly bounded on $[t_2, t_3)$. Therefore, in order to control $Q(t)$ it is sufficient to control

$$
Q^1(t) := \sup\{|v^1| : \exists(s, \theta) \in [t_2, t] \times S^1 \text{ such that } f(s, \theta, v) \ne 0\}. \tag{89}
$$

Below we introduce the uniformly bounded function $\gamma(t)$ to denote estimates regarding the variables $v^2$ and $v^3$. Next we observe that there is some cancellation to take advantage of in the matter term $(\rho - P_1)$ which appears in the equations for $G + H$ and $G - H$ above. This term can be estimated as follows:

$$
\begin{aligned}
0 \le (\rho - P_1)(t, \theta) &= \int_{\mathbb{R}^3}\Big(v^0 - \frac{(v^1)^2}{v^0}\Big)f(t, \theta, v)\,dv\\
&= \int_{\mathbb{R}^3}\frac{1 + (v^2)^2 + (v^3)^2}{v^0}\,f(t, \theta, v)\,dv\\
&\le \int_{\mathbb{R}^3}\big[1 + (v^2)^2 + (v^3)^2\big]|f|\,\frac{dv}{\sqrt{1 + (v^1)^2}}\\
&\le \|f_0\|_\infty\,\gamma(t)\int_{|v^1| \le Q^1(t)}\frac{dv^1}{\sqrt{1 + (v^1)^2}}\\
&\le C\gamma(t)\ln Q^1(t).
\end{aligned} \tag{90}
$$

In a similar fashion we can estimate $P_2$, $P_3$ and $S_{23}$. Indeed, for $k = 2, 3$, we have

$$
0 \le P_k(t, \theta) = \int_{\mathbb{R}^3}\frac{(v^k)^2}{v^0}\,f(t, \theta, v)\,dv \le \|f_0\|_\infty\,\gamma(t)\int_{|v^1| \le Q^1(t)}\frac{dv^1}{\sqrt{1 + (v^1)^2}} \le C\gamma(t)\ln Q^1(t). \tag{91}
$$

The argument is almost identical for $S_{23}$.
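The logarithm in (90) and (91) comes from an explicit one-dimensional integral. As a sketch, writing $Q$ for $Q^1(t)$ and assuming $Q \ge 2$ (the normalization made in the remark that follows), with constants that are illustrative rather than sharp:

```latex
\[
\int_{|v^1|\le Q}\frac{dv^1}{\sqrt{1+(v^1)^2}}
  = 2\,\operatorname{arsinh} Q
  = 2\ln\bigl(Q+\sqrt{1+Q^2}\bigr).
\]
% For Q \ge 2:  \sqrt{1+Q^2} \le Q+1,  so  Q+\sqrt{1+Q^2} \le 2Q+1 \le 3Q \le Q^3,
% and therefore
\[
\int_{|v^1|\le Q}\frac{dv^1}{\sqrt{1+(v^1)^2}} \le 6\ln Q .
\]
% The remaining integration over (v^2, v^3) runs over a bounded set by (88)
% and is absorbed into \gamma(t).
```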
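Remark. Since the matter of interest is large momenta we have here assumed that $Q^1(t) \ge 2$ to avoid the introduction of some immaterial constants in the estimates.

Let us now derive (87). As in Step 2 in Sect. 4 we integrate the equations above for $G + H$ and $G - H$ along null paths. For $t \ge t_2$, let

$$
A(t, \theta) = \int_{t_2}^{t}\sqrt{\alpha}(s, \theta)\,ds,
$$

and integrate along the two null paths defined by $\chi$ and $\zeta$, starting at $(t_2, \theta)$, and add the results. We get for $t \in [t_2, t_3)$,

$$
\begin{aligned}
G(t, \theta) ={}& \frac{1}{2}[G + H](t_2, \theta - (A(t) - t_2)) + \frac{1}{2}[G - H](t_2, \theta + (A(t) - t_2))\\
&+ \frac{1}{2}\int_{t_2}^{t}\big[K_1(s, \theta - (A(s) - t_2)) + K_2(s, \theta + (A(s) - t_2))\big]\,ds\\
&+ \frac{1}{2}\int_{t_2}^{t}\big[L_1(s, \theta - (A(s) - t_2)) + L_2(s, \theta + (A(s) - t_2))\big]\,ds\\
&+ \frac{1}{2}\int_{t_2}^{t}\big([\chi U\,M](s, \theta - (A(s) - t_2)) + [\zeta U\,M](s, \theta + (A(s) - t_2))\big)\,ds\\
&+ \frac{1}{2}\int_{t_2}^{t}\Big(\Big[\frac{\chi A}{2t}\tilde M\Big](s, \theta - (A(s) - t_2)) + \Big[\frac{\zeta A}{2t}\tilde M\Big](s, \theta + (A(s) - t_2))\Big)\,ds,
\end{aligned} \tag{92}
$$

where

$$
K_1 = \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G + H), \qquad K_2 = \frac{\alpha_t}{2\sqrt{2}\,\alpha}(G - H), \tag{93}
$$

$$
L_1 = -\frac{1}{\sqrt{2}\,t}\Big[U_t^2 + \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 + \sqrt{\alpha}\,A_tA_\theta\big)\Big], \tag{94}
$$

$$
L_2 = -\frac{1}{\sqrt{2}\,t}\Big[U_t^2 - \sqrt{\alpha}\,U_tU_\theta + \frac{e^{4U}}{4t^2}\big(\alpha A_\theta^2 - \sqrt{\alpha}\,A_tA_\theta\big)\Big], \tag{95}
$$

$$
M = \frac{1}{2}e^{2(\tilde\eta - U)}\kappa, \qquad \tilde M = e^{2\tilde\eta}S_{23}. \tag{96}
$$

Note that in the expressions for $M$ and $\tilde M$ we used $\alpha e^{2\eta} = e^{2\tilde\eta}$. It is easy to see that both $G + H$ and $G - H$ can be written as sums of two squares. From the constraint equation (5) we find that $\alpha_t/\alpha \le 0$ so that $K_1$ and $K_2$ are nonpositive. Using the elementary inequality $2ab \le a^2 + b^2$ and the fact that $|\tilde\eta|$ and $|U|$ are uniformly bounded we obtain from (92) the inequality

$$
\begin{aligned}
\sup_\theta G(t, \cdot) \le{}& \sup_\theta G(t_2, \cdot) + \sup_\theta H(t_2, \cdot) + C\int_{t_2}^{t}\frac{1}{s}\sup_\theta G(s, \cdot)\,ds\\
&+ \int_{t_2}^{t}C(s)\sup_\theta\Big[\sqrt{G(s, \cdot)}\big((\rho - P_1 + P_2 - P_3) + S_{23}\big)\Big]\,ds\\
\le{}& C + C(t)\int_{t_2}^{t}\Big[\sup_\theta G(s, \cdot) + \sup_\theta\sqrt{G(s, \cdot)}\,\ln Q^1(s)\Big]\,ds,
\end{aligned} \tag{97}
$$

where (90) and (91) were used in the last inequality.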
Remark. The sign of $K_1$ and $K_2$ simplified the estimate above. This is not crucial since $|\alpha_t|/\alpha$ is bounded by $\ln Q^1(t)$, which is sufficient for obtaining a bound on $\Gamma(t)$.

Let us now derive an estimate for $Q^1$ in terms of $\sup_\theta G$.

Lemma 2. Let $Q^1(t)$ and $G(t, \theta)$ be as above. Then

$$
|Q^1(t)|^2 \le C + D(t)\int_{t_2}^{t}\Big[(Q^1(s))^2 + \sup_\theta G(s, \cdot)\Big]\,ds, \tag{98}
$$

where $C$ is a constant and $D(t)$ is a uniformly bounded function on $[t_2, t_3)$.

Proof. The characteristic equation for $V^1$ associated to the Vlasov equation reads

$$
\frac{dV^1}{ds}(s) = -\Big(\eta_\theta - U_\theta + \frac{\alpha_\theta}{2\alpha}\Big)\sqrt{\alpha}\,V^0 - (\eta_t - U_t)V^1 - \frac{\sqrt{\alpha}\,U_\theta}{V^0}\big((V^3)^2 - (V^2)^2\big) + \frac{\sqrt{\alpha}\,A_\theta}{sV^0}\,e^{2U}V^2V^3. \tag{99}
$$

We will now split the right-hand side into three terms to be analyzed separately. Expressing $\eta_\theta$ and $\eta_t$ by using the constraint equations (3) and (4) we obtain

$$
\frac{d}{ds}(V^1(s))^2 = 2V^1(s)\frac{d}{ds}V^1(s) = T_1 + T_2 + T_3, \tag{100}
$$

where

$$
\begin{aligned}
T_1 &= -2V^1(s)\big[s\alpha e^{2(\eta-U)}(JV^0 + \rho V^1)\big],\\
T_2 &= -2V^1(s)\Big[s\Big(U_t^2 + \alpha U_\theta^2 + \frac{e^{4U}}{4s^2}(A_t^2 + \alpha A_\theta^2)\Big)V^1 + 2s\sqrt{\alpha}\,U_\theta U_tV^0 - \sqrt{\alpha}\,U_\theta V^0 - U_tV^1 + \frac{e^{4U}}{2s}\sqrt{\alpha}\,A_tA_\theta V^0\Big],\\
T_3 &= -2V^1(s)\Big[\frac{\sqrt{\alpha}\,U_\theta}{V^0}\big((V^3)^2 - (V^2)^2\big) - \frac{\sqrt{\alpha}\,A_\theta}{sV^0}\,e^{2U}V^2V^3\Big].
\end{aligned}
$$

Let us first estimate $T_1$. We split it into two terms

$$
T_1 = T_1^- + T_1^+ = -2sV^1(s)e^{2(\tilde\eta - U)}(I^- + I^+), \tag{101}
$$

where

$$
I^- = \int_{\mathbb{R}^2}\int_{-\infty}^{0}(v^1V^0 + v^0V^1)f(s, \theta, v)\,dv^1dv^2dv^3, \qquad I^+ = \int_{\mathbb{R}^2}\int_{0}^{\infty}(v^1V^0 + v^0V^1)f(s, \theta, v)\,dv^1dv^2dv^3.
$$

Let us now consider the two cases $V^1(s) > 0$ and $V^1(s) < 0$. On a time interval where $V^1(s) > 0$, $I^+$ is nonnegative and $T_1^+$ can therefore be discarded since it is nonpositive. The kernel in $I^-$ can be estimated as follows:

$$
v^1V^0 + v^0V^1 = \frac{(v^1)^2(V^0)^2 - (v^0)^2(V^1)^2}{v^1V^0 - v^0V^1} = \frac{(v^1)^2\big(1 + (V^2)^2 + (V^3)^2\big)}{v^1V^0 - v^0V^1} + \frac{(V^1)^2\big(1 + (v^2)^2 + (v^3)^2\big)}{v^0V^1 - v^1V^0}.
$$
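The cancellation in the kernel identity can be checked directly. The sketch below assumes the mass shell relations $(v^0)^2 = 1 + |v|^2$ and $(V^0)^2 = 1 + |V|^2$, which are the relations implicit in the second line of (90).

```latex
\begin{align*}
(v^1)^2(V^0)^2 - (v^0)^2(V^1)^2
 &= (v^1)^2\bigl(1 + (V^1)^2 + (V^2)^2 + (V^3)^2\bigr)
  - \bigl(1 + (v^1)^2 + (v^2)^2 + (v^3)^2\bigr)(V^1)^2\\
 &= (v^1)^2\bigl(1 + (V^2)^2 + (V^3)^2\bigr)
  - (V^1)^2\bigl(1 + (v^2)^2 + (v^3)^2\bigr).
\end{align*}
% The terms (v^1)^2 (V^1)^2 cancel. Dividing by v^1 V^0 - v^0 V^1, which is
% strictly negative when V^1 > 0 and v^1 < 0, gives the two fractions in the
% kernel identity, the second with its denominator sign flipped.
```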
Of course, the cancellation of the terms $(v^1)^2(V^1)^2$ is essential in this computation. The second term is positive since $V^1(s) > 0$ and $v^1 < 0$, and contributes negatively to $T_1^-$ and can be discarded. The first term is negative and the modulus can be estimated by

$$
\frac{(v^1)^2\big(1 + (V^2)^2 + (V^3)^2\big)}{|v^1|V^0 + v^0V^1} \le \frac{|v^1|\big(1 + (V^2)^2 + (V^3)^2\big)}{V^1}. \tag{102}
$$

In the expression for $T_1$ we first note that $2s\alpha e^{2(\eta-U)} = 2se^{2(\tilde\eta-U)} \le C(s)$. Hence, on the time interval where $V^1(s) > 0$ we can estimate $T_1$ by

$$
T_1 \le T_1^- \le \|f_0\|_\infty C(s)V^1(s)\int_{\mathbb{R}^2}\int_{0}^{Q^1}\frac{v^1\big(1 + (V^2)^2 + (V^3)^2\big)}{V^1(s)}\,dv^1\,du \le \|f_0\|_\infty C(s)\gamma(s)\int_{0}^{Q^1}v^1\,dv^1 \le C(s)(Q^1(s))^2. \tag{103}
$$

On a time interval where $V^1 < 0$ we see that $T_1^-$ is nonpositive and can be discarded. We can then estimate $T_1^+$ by using almost identical arguments as for $T_1^-$, and we get also on such a time interval

$$
T_1 \le T_1^+ \le C(s)(Q^1(s))^2. \tag{104}
$$
Let us now consider $T_2$. We again study the cases $V^1(s) > 0$ and $V^1(s) < 0$. Assume first that $V^1(s) > 0$ on some time interval. The expression for $T_2$ can be written $T_2 = T_2^p + T_2^r$ ($p$ = principal, $r$ = rest) where

$$
T_2^p = -2(V^1(s))^2\Big[s\big(U_t + \sqrt{\alpha}\,U_\theta\big)^2 - \big(U_t + \sqrt{\alpha}\,U_\theta\big) + \frac{e^{4U}}{4s}\big(A_t + \sqrt{\alpha}\,A_\theta\big)^2\Big]
$$

and

$$
T_2^r = 2\big(V^0(s) - V^1(s)\big)V^1(s)\Big[\sqrt{\alpha}\,U_\theta - 2s\sqrt{\alpha}\,U_tU_\theta - \frac{e^{4U}}{2s}\sqrt{\alpha}\,A_tA_\theta\Big].
$$

For $T_2^r$ we have

$$
|T_2^r| = \frac{2\big(1 + (V^2)^2 + (V^3)^2\big)V^1(s)}{V^0 + V^1}\,\sqrt{\alpha}\,\Big|U_\theta - 2sU_tU_\theta - \frac{e^{4U}}{2s}A_tA_\theta\Big| \le (s + 1)\gamma(s)\sup_\theta G(s, \cdot). \tag{105}
$$

Since the matter of interest is large $G$ we have here assumed that $\sqrt{G} \le G$. This assumption will be used below without comment. To estimate $T_2^p$ we observe that for $s \ge t_2$,

$$
sa^2 - a \ge \frac{-1}{4s} \ge \frac{-1}{4t_2}, \quad\text{for any } a \in \mathbb{R}.
$$

The term involving $A$ contributes negatively and can be discarded, thus

$$
T_2^p \le \frac{1}{2t_2}(V^1(s))^2 \le C(Q^1(s))^2. \tag{106}
$$
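The elementary bound used for $T_2^p$ follows from completing the square. A brief sketch, with $a$ standing for $U_t + \sqrt{\alpha}\,U_\theta$ and $s \ge t_2 > 0$:

```latex
\[
s a^2 - a = s\Bigl(a - \frac{1}{2s}\Bigr)^{2} - \frac{1}{4s}
\;\ge\; -\frac{1}{4s} \;\ge\; -\frac{1}{4t_2}.
\]
% Dropping the nonnegative A-term inside the bracket of T_2^p (it carries the
% overall factor -2(V^1)^2, hence contributes negatively) gives
\[
T_2^p \le -2\,(V^1(s))^2\Bigl(-\frac{1}{4t_2}\Bigr) = \frac{1}{2t_2}\,(V^1(s))^2 .
\]
```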
On a time interval where $V^1(s) < 0$, the same estimates hold. Indeed, we only have to write $T_2 = T_2^p + T_2^r$ in the form

$$
T_2^p = -2(V^1(s))^2\Big[s\big(U_t - \sqrt{\alpha}\,U_\theta\big)^2 - \big(U_t - \sqrt{\alpha}\,U_\theta\big) + \frac{e^{4U}}{4s}\big(A_t - \sqrt{\alpha}\,A_\theta\big)^2\Big]
$$

and

$$
T_2^r = 2\big(V^0(s) + V^1(s)\big)V^1(s)\Big[\sqrt{\alpha}\,U_\theta - 2s\sqrt{\alpha}\,U_tU_\theta - \frac{e^{4U}}{2s}\sqrt{\alpha}\,A_tA_\theta\Big],
$$

and the same arguments apply. Therefore we have obtained

$$
T_2 \le T_2^p + |T_2^r| \le C(Q^1(s))^2 + C(s)\sup_\theta G(s, \cdot). \tag{107}
$$

Finally we estimate $T_3$. It follows immediately that

$$
|T_3| \le \gamma(s)\frac{|V^1(s)|}{V^0}\,\sqrt{\alpha}\,\Big|U_\theta + \frac{e^{2U}}{s}A_\theta\Big| \le C(s)\sup_\theta G(s, \cdot). \tag{108}
$$

The lemma now follows by adding the estimates for $T_k$, $k = 1, 2, 3$. □

Combining the estimate for $(Q^1(t))^2$ in the lemma and the estimate (97) for $\sup_\theta G(t, \cdot)$, we find that $\Gamma(t)$ satisfies the estimate (87) and is thus uniformly bounded. The constraint equation (3) now immediately shows that $|\eta_t|$ is bounded by

$$
2tG + te^{2(\tilde\eta - U)}\rho \le C(t)\Big[\sup_\theta G(t, \cdot) + (Q(t))^3\Big],
$$

since

$$
\rho = \int_{\mathbb{R}^3}f\,dv \le \|f_0\|_\infty\int_{|v| \le Q(t)}dv \le C(Q(t))^3.
$$

Analogous arguments show that $|\alpha_t|$ is uniformly bounded. The uniform bound on $G$ provides bounds on $|U_t|$ and $|A_t|$, but to conclude that $|U_\theta|$ and $|A_\theta|$ are bounded we have to show that $\alpha$ stays uniformly bounded away from zero. Equation (5) is easily solved,

$$
\alpha(t, \theta) = \alpha(t_2, \theta)\,e^{\int_{t_2}^{t}F(s, \theta)\,ds}, \tag{109}
$$

where

$$
F(t, \theta) := -2te^{2(\tilde\eta - U)}(\rho - P_1),
$$

which is uniformly bounded from below. Hence $|U_\theta|$ and $|A_\theta|$ are bounded and Step 2 is complete.
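As a consistency check, the solution formula (109) can be related back to the form (73) of the constraint equation (5), and the lower bound on $\alpha$ made explicit. This is a sketch; the constant $\tilde C$ is a hypothetical name for a lower bound $F \ge -\tilde C$ on $[t_2, t_3)$ and does not appear in the text.

```latex
% (109) is equivalent to  \partial_t \alpha = F\alpha.  Hence
\[
\partial_t\bigl(\alpha^{-1/2}\bigr) = -\tfrac12\,\alpha^{-1/2}F
  = t\,\alpha^{-1/2}e^{2(\tilde\eta-U)}(\rho-P_1)
  = t\sqrt{\alpha}\,e^{2(\eta-U)}(\rho-P_1),
\]
% using  e^{2\tilde\eta} = \alpha e^{2\eta};  this is exactly (73).
% If F \ge -\tilde C on [t_2,t_3), then
\[
\alpha(t,\theta) \ge \alpha(t_2,\theta)\,e^{-\tilde C\,(t-t_2)}
  \ge \Bigl(\min_{S^1}\alpha(t_2,\cdot)\Bigr)e^{-\tilde C\,(t_3-t_2)} > 0 .
\]
```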
Step 3 (Bounds on $\partial f$, $\alpha_\theta$ and $\eta_\theta$). The main goal in this step is to show that the first derivatives of the distribution function are bounded. In view of the bound on $Q(t)$ we then also obtain bounds on the first derivatives of the matter terms $\rho$, $J$, $S_{23}$ and $P_k$, $k = 1, 2, 3$. Such bounds almost immediately lead to bounds on $\alpha_\theta$ and $\eta_\theta$. Recall that the solution $f$ can be written in the form

$$
f(t, \theta, v) = f_0(\Theta(0, t, \theta, v), V(0, t, \theta, v)), \tag{110}
$$

where $(\Theta(s, t, \theta, v), V(s, t, \theta, v))$ is the solution to the characteristic system

$$
\frac{d\Theta}{ds} = \sqrt{\alpha}\,\frac{V^1}{V^0}, \tag{111}
$$

$$
\frac{dV^1}{ds} = -\Big(\eta_\theta - U_\theta + \frac{\alpha_\theta}{2\alpha}\Big)\sqrt{\alpha}\,V^0 - (\eta_t - U_t)V^1 - \sqrt{\alpha}\,U_\theta\frac{(V^3)^2 - (V^2)^2}{V^0} + \sqrt{\alpha}\,A_\theta\frac{e^{2U}}{s}\frac{V^2V^3}{V^0}, \tag{112}
$$

$$
\frac{dV^2}{ds} = -U_tV^2 - \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}, \tag{113}
$$

$$
\frac{dV^3}{ds} = -\Big(\frac{1}{s} - U_t\Big)V^3 + \sqrt{\alpha}\,U_\theta\frac{V^1V^3}{V^0} - \frac{e^{2U}}{s}\Big(A_t + \sqrt{\alpha}\,A_\theta\frac{V^1}{V^0}\Big)V^2, \tag{114}
$$

with the property $\Theta(t, t, \theta, v) = \theta$, $V(t, t, \theta, v) = v$. Hence, in order to establish bounds on the first derivatives of $f$ it is sufficient to bound $\partial\Theta$ and $\partial V$ since $f_0$ is smooth. Here $\partial$ denotes the first order derivative with respect to $t$, $\theta$ or $v$. Evolution equations for $\partial\Theta$ and $\partial V$ are provided by the characteristic system above. However, the right-hand sides will contain second order derivatives of the field components, but so far we have only obtained bounds on the first order derivatives (except for $\eta_\theta$, $\alpha_\theta$). Yet, certain combinations of second order derivatives can be controlled. Behind this observation lies a geometrical idea which plays a fundamental role in general relativity. An important property of curvature is its control over the relative behaviour of nearby geodesics. Let $\gamma(u, \lambda)$ be a two-parameter family of geodesics, i.e. for each fixed $\lambda$, the curve $u \mapsto \gamma(u, \lambda)$ is a geodesic. Define the variation vector field $Y := \gamma_\lambda(u, 0)$. This vector field satisfies the geodesic deviation equation (or Jacobi equation) (see e.g. [HE])

$$
\frac{D^2Y}{Du^2} = R_{Y\gamma'}\gamma', \tag{115}
$$

where $D/Du$ is the covariant derivative, $R$ the Riemann curvature tensor, and $\gamma' := \gamma_u(u, 0)$. Now, the Einstein tensor is closely related to the curvature tensor, and since the Einstein tensor is proportional to the energy momentum tensor, which we can control from Step 2, it is meaningful, in view of (115) (with $Y = \partial\Theta$), to look for linear combinations of $\partial\Theta$ and $\partial V$ which satisfy an equation with bounded coefficients. More precisely, we want to use the Einstein equations to substitute for the twice differentiated field components which appear when taking the derivative of the characteristic system. The geodesic deviation equation has previously played an important role in studies of the Einstein–Vlasov system ([RR, Rn] and [Rl3]).
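Lemma 3. Let $\Theta(s) = \Theta(s, t, \theta, v)$ and $V^k(s) = V^k(s, t, \theta, v)$, $k = 1, 2, 3$, be a solution to the characteristic system (111)–(114). Let $\partial$ denote $\partial_t$, $\partial_\theta$ or $\partial_v$, and define

$$
\Psi = \alpha^{-1/2}\,\partial\Theta, \tag{116}
$$

$$
\begin{aligned}
Z^1 = \partial V^1 + \Big(&\frac{\eta_tV^0}{\sqrt{\alpha}} - \frac{U_tV^0}{\sqrt{\alpha}}\,\frac{(V^0)^2 - (V^1)^2 + (V^2)^2 - (V^3)^2}{(V^0)^2 - (V^1)^2} + U_\theta\frac{V^1\big((V^2)^2 - (V^3)^2\big)}{(V^0)^2 - (V^1)^2}\\
&- \frac{A_te^{2U}}{\sqrt{\alpha}\,t}\,\frac{V^0V^2V^3}{(V^0)^2 - (V^1)^2} + A_\theta\frac{V^1V^2V^3}{(V^0)^2 - (V^1)^2}\Big)\partial\Theta,
\end{aligned} \tag{117}
$$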
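$$
Z^2 = \partial V^2 + V^2U_\theta\,\partial\Theta, \tag{118}
$$

$$
Z^3 = \partial V^3 - \Big(V^3U_\theta - \frac{e^{2U}}{s}V^2A_\theta\Big)\partial\Theta. \tag{119}
$$

Then there is a matrix $A = \{a_{lm}\}$, $l, m = 0, 1, 2, 3$, such that $\Omega := (\Psi, Z^1, Z^2, Z^3)^T$ satisfies

$$
\frac{d\Omega}{ds} = A\Omega, \tag{120}
$$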
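and the matrix elements $a_{lm} = a_{lm}(s, \Theta(s), V^k(s))$ are all uniformly bounded on $[t_2, t_3)$.

Sketch of proof. Once the ansatz (116)–(119) has been found this is only a lengthy calculation. To illustrate the type of calculations involved we show the easiest case, i.e. the $Z^2$ term:

$$
\begin{aligned}
\frac{dZ^2}{ds} &= \frac{d}{ds}\big(\partial V^2 + V^2U_\theta\,\partial\Theta\big)\\
&= \partial\Big(\frac{dV^2}{ds}\Big) + \frac{dV^2}{ds}U_\theta\,\partial\Theta + V^2\Big(U_{t\theta} + U_{\theta\theta}\frac{d\Theta}{ds}\Big)\partial\Theta + V^2U_\theta\,\partial\Big(\frac{d\Theta}{ds}\Big).
\end{aligned} \tag{121}
$$

Now we use (111) and (113) to substitute for $d\Theta/ds$ and $dV^2/ds$. We find that the right-hand side equals

$$
\begin{aligned}
&\partial\Big(-U_tV^2 - \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}\Big) + \Big(-U_tV^2 - \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}\Big)U_\theta\,\partial\Theta\\
&\quad + V^2\Big(U_{t\theta} + U_{\theta\theta}\sqrt{\alpha}\,\frac{V^1}{V^0}\Big)\partial\Theta + V^2U_\theta\Big(\frac{\alpha_\theta V^1}{2\sqrt{\alpha}\,V^0}\,\partial\Theta + \sqrt{\alpha}\,\partial\Big(\frac{V^1}{V^0}\Big)\Big).
\end{aligned}
$$

Taking the $\partial$ derivative of the first term we find that all terms of second order derivatives and terms containing $\alpha_\theta$ cancel. Next, since

$$
-\sqrt{\alpha}\,U_\theta\,\partial\Big(\frac{V^1V^2}{V^0}\Big) + \sqrt{\alpha}\,U_\theta V^2\,\partial\Big(\frac{V^1}{V^0}\Big) = -\sqrt{\alpha}\,U_\theta\frac{V^1}{V^0}\,\partial V^2, \tag{122}
$$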
we are left with

$$
\frac{dZ^2}{ds} = -\Big(U_tV^2 + \sqrt{\alpha}\,U_\theta\frac{V^1V^2}{V^0}\Big)U_\theta\,\partial\Theta - \Big(U_t + \sqrt{\alpha}\,U_\theta\frac{V^1}{V^0}\Big)\partial V^2. \tag{123}
$$

Finally we express this in terms of $\Psi$, $Z^1$, $Z^2$ and $Z^3$. Here this is easy and we immediately get

$$
\frac{dZ^2}{ds} = -\Big(U_t + \sqrt{\alpha}\,U_\theta\frac{V^1}{V^0}\Big)Z^2.
$$

Clearly, the map $(\partial\Theta, \partial V^k) \mapsto (\Psi, Z^k)$ is invertible so that this step is easy also in the other cases. It follows that the matrix elements $a_{2m}$, $m = 0, 1, 2, 3$, are uniformly bounded on $[t_2, t_3)$ (only $a_{22}$ is nonzero here). The computations for the other terms are similar. For the $Z^1$ term we point out that the evolution equations (7) and (8) should be invoked and that the matrix element $a_{10}$ contains $\eta_\theta$ and $\alpha_\theta/2\alpha$, but they combine and form $\tilde\eta_\theta$. □

From the lemma it now immediately follows that $|\Omega|$ is uniformly bounded on $[t_2, t_3)$. Moreover, since the system (116)–(119) is invertible with uniformly bounded coefficients we also have uniform bounds on $|\partial\Theta|$ and $|\partial V^k|$, $k = 1, 2, 3$. In view of the discussion at the beginning of this section we see that the distribution function $f$ and the matter quantities $\rho$, $J$, $S_{23}$ and $P_k$ are all uniformly $C^1$ bounded. From the constraint equation (5) we now obtain a uniform bound on $\alpha_\theta$ by a simple Gronwall argument, using as usual the identity $\alpha e^{2(\eta-U)} = e^{2(\tilde\eta-U)}$. Finally this yields a uniform bound on $\eta_\theta$ since

$$
\eta_\theta = \tilde\eta_\theta - \frac{\alpha_\theta}{2\alpha}
$$

and $\alpha$ stays uniformly bounded away from zero.

Step 4 (Bounds on second and higher order derivatives). It is now easy to obtain bounds on second order derivatives of $U$ and $A$ by using light cone arguments. We define $G$ and $H$ by

$$
G = \frac{1}{2}\big(U_{tt}^2 + \alpha U_{t\theta}^2\big) + \frac{e^{4U}}{8t^2}\big(A_{tt}^2 + \alpha A_{t\theta}^2\big), \tag{124}
$$

$$
H = \sqrt{\alpha}\,U_{tt}U_{t\theta} + \frac{e^{4U}}{4t^2}\sqrt{\alpha}\,A_{tt}A_{t\theta}, \tag{125}
$$

and use the differentiated (with respect to $t$) evolution equations for $U$ and $A$ to obtain equations similar to (84) and (85). In this case a straightforward light cone argument applies since we have control of the differentiated matter terms. $U_{\theta\theta}$ and $A_{\theta\theta}$ are then uniformly bounded in view of the evolution equations (7) and (8). Bounds on second order derivatives of $f$ then follow from (120) by studying the equation for $\partial\Omega$. The only thing to notice is that $\tilde\eta_{\theta\theta}$ is controlled by (4). It is clear that this reasoning can be continued to give uniform bounds on $[t_2, t_3)$ for higher order derivatives as well. In view of the discussion after the statement of Theorem 1 in Sect. 3, this completes the proof of Theorem 1 in the expanding direction. □
Acknowledgement. I am most grateful to Alan Rendall for suggesting the problem (for small data) and for commenting on the manuscript. I also wish to thank Demetrios Christodoulou and Shadi Tahvildar-Zadeh at the Department of Mathematics at Princeton University, where this work was carried out, for interesting and stimulating discussions. This work was supported by the Swedish Foundation for International Cooperation in Research and Higher Education (STINT) and is hereby gratefully acknowledged.
References

[BCIM] Berger, B.K., Chruściel, P., Isenberg, J. and Moncrief, V.: Global foliations of vacuum spacetimes with T² isometry. Ann. Phys. 260, 117–148 (1997)
[BT] Binney, J. and Tremaine, S.: Galactic dynamics. Princeton, NJ: Princeton University Press, 1987
[CB] Choquet-Bruhat, Y.: Problème de Cauchy pour le système intégro différentiel d'Einstein–Liouville. Ann. Inst. Fourier 21, 181–201 (1971)
[CBG] Choquet-Bruhat, Y. and Geroch, R.: Global aspects of the Cauchy problem in general relativity. Commun. Math. Phys. 14, 344–357 (1969)
[Cu1] Christodoulou, D.: Examples of naked singularity formation in the gravitational collapse of a scalar field. Ann. Math. 140, 607–653 (1994)
[Cu2] Christodoulou, D.: Bounded variation solutions of the spherically symmetric Einstein-scalar field equations. Comm. Pure Appl. Math. 46, 1131–1220 (1993)
[Cu3] Christodoulou, D.: Self-gravitating relativistic fluids: The formation of a free phase boundary in the phase transition from soft to hard. Arch. Rational Mech. Anal. 134, 97–154 (1996)
[CK] Christodoulou, D. and Klainerman, S.: The global nonlinear stability of the Minkowski space. Princeton, NJ: Princeton University Press, 1993
[Cl] Chruściel, P.T.: On spacetimes with U(1) × U(1) symmetric compact Cauchy surfaces. Ann. Phys. 202, 100–150 (1990)
[CIM] Chruściel, P.T., Isenberg, J. and Moncrief, V.: Strong cosmic censorship in polarised Gowdy spacetimes. Class. Quantum Grav. 7, 1671–1680 (1990)
[EM] Eardley, D. and Moncrief, V.: The global existence problem and cosmic censorship in general relativity. Gen. Rel. Grav. 13, 887–892 (1981)
[ES] Eardley, D. and Smarr, L.: Time functions in numerical relativity: marginally bound dust collapse. Phys. Rev. D19, 2239–2259 (1979)
[E] Ehlers, J.: Survey of general relativity theory. In: W. Israel (ed.) Relativity, Astrophysics and Cosmology. Dordrecht: Reidel, 1973
[GS] Glassey, R. and Strauss, W.: Singularity formation in a collisionless plasma could only occur at high velocities. Arch. Rat. Mech. Anal. 92, 56–90 (1986)
[G] Gowdy, R.: Vacuum spacetimes and compact invariant hypersurfaces: Topologies and boundary conditions. Ann. Phys. 83, 203–24 (1974)
[HE] Hawking, S. and Ellis, G.: The large scale structure of spacetime. Cambridge: Cambridge University Press, 1973
[MPF] Mitrinović, D., Pecarić, J. and Fink, A.: Inequalities involving functions and their integrals and derivatives. Dordrecht: Kluwer Academic Publishers, 1991
[M] Moncrief, V.: Global properties of Gowdy spacetimes with T³ × R topology. Ann. Phys. 132, 87–107 (1981)
[Rn] Rein, G.: Cosmological solutions of the Vlasov–Einstein system with spherical, plane and hyperbolic symmetry. Math. Proc. Camb. Phil. Soc. 119, 739–762 (1996)
[RR] Rein, G. and Rendall, A.D.: Global existence of solutions of the spherically symmetric Vlasov–Einstein system with small initial data. Commun. Math. Phys. 150, 561–583 (1992); Erratum: Commun. Math. Phys. 176, 475–478 (1996)
[Rl1] Rendall, A.D.: Existence of constant mean curvature foliations in spacetimes with two-dimensional local symmetry. Commun. Math. Phys. 189, 145–164 (1997)
[Rl2] Rendall, A.D.: Crushing singularities in spacetimes with spherical, plane and hyperbolic symmetry. Class. Quantum Grav. 12, 1517–1533 (1995)
[Rl3] Rendall, A.D.: An introduction to the Einstein–Vlasov system. Mathematics of gravitation. Part I (Warsaw, 1996) Banach Center Publ. 41, Part I, Warsaw: Polish Acad. Sci., 1997, pp. 35–68
[Rl4] Rendall, A.D.: On the choice of matter model in general relativity. In: R. d'Inverno (ed.) Approaches to Numerical Relativity. Cambridge: Cambridge University Press, 1992
[Rl5] Rendall, A.D.: Cosmic censorship and the Vlasov equation. Class. Quantum Grav. 9, L99–L104 (1992)
[RRS] Rein, G., Rendall, A.D. and Schaeffer, J.: A regularity theorem for solutions of the spherically symmetric Vlasov–Einstein system. Commun. Math. Phys. 168, 467–478 (1995)

Communicated by H. Nicolai
Commun. Math. Phys. 206, 367 – 381 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Lie Groupoid $C^*$-Algebras and Weyl Quantization

N. P. Landsman*

Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands. E-mail: [email protected]

* Supported by a fellowship from the Royal Netherlands Academy of Arts and Sciences (KNAW).

Received: 14 May 1998 / Accepted: 23 March 1999

Abstract: A strict quantization of a Poisson manifold $P$ on a subset $I \subseteq \mathbb{R}$ containing 0 as an accumulation point is defined as a continuous field of $C^*$-algebras $\{A_\hbar\}_{\hbar \in I}$, with $A_0 = C_0(P)$, a dense subalgebra $\tilde A_0$ of $C_0(P)$ on which the Poisson bracket is defined, and a set of continuous cross-sections $\{Q(f)\}_{f \in \tilde A_0}$ for which $Q_0(f) = f$. Here $Q_\hbar(f^*) = Q_\hbar(f)^*$ for all $\hbar \in I$, whereas for $\hbar \to 0$ one requires that $i[Q_\hbar(f), Q_\hbar(g)]/\hbar \to Q_\hbar(\{f, g\})$ in norm. For any Lie groupoid $G$, the vector bundle $\mathfrak{G}^*$ dual to the associated Lie algebroid $\mathfrak{G}$ is canonically a Poisson manifold. Let $A_0 = C_0(\mathfrak{G}^*)$, and for $\hbar \ne 0$ let $A_\hbar = C^*(G)$ be the $C^*$-algebra of $G$. The family of $C^*$-algebras $\{A_\hbar\}_{\hbar \in [0,1]}$ forms a continuous field, and we construct a dense subalgebra $\tilde A_0 \subset C_0(\mathfrak{G}^*)$ and an associated family $\{Q^W_\hbar(f)\}$ of continuous cross-sections of this field, generalizing Weyl quantization, which define a strict quantization of $\mathfrak{G}^*$. Many known strict quantizations are a special case of this procedure. On $P = T^*\mathbb{R}^n$ the maps $Q^W_\hbar(f)$ reduce to standard Weyl quantization; for $P = T^*Q$, where $Q$ is a Riemannian manifold, one recovers Connes' tangent groupoid as well as a recent generalization of Weyl's prescription. When $G$ is the gauge groupoid of a principal bundle one is led to the Weyl quantization of a particle moving in an external Yang–Mills field. In case that $G$ is a Lie group (with Lie algebra $\mathfrak{g}$) one recovers Rieffel's quantization of the Lie–Poisson structure on $\mathfrak{g}^*$. A transformation group $C^*$-algebra defined by a smooth action of a Lie group on a manifold $Q$ turns out to be the quantization of the Poisson manifold $\mathfrak{g}^* \times Q$ defined by this action.

1. Introduction

The notion of quantization to be used in this paper is motivated by the desire to link the geometric theory of classical mechanics and reduction [18,32] with the $C^*$-algebraic
formulation of quantum mechanics and induction [15], and also with non-commutative geometry [2]. Starting with Rieffel’s fundamental paper [27], various C ∗ -algebraic definitions of quantization have been proposed [29,12,30,15,31]. Definition 2 below is closely related to these proposals, and is particularly useful in the context of the class of examples studied in this paper. These examples come from the theory of Lie groupoids and their Lie algebroids (cf. Sect. 2). The idea that the C ∗ -algebra of a Lie groupoid is connected to the Poisson manifold defined by the associated Lie algebroid by (strict) quantization was conjectured in [12], and proved in special cases in [13,15]. The results of [28,29,23] also supported the claim. In this paper we prove the conjecture up to Dirac’s condition (3); this is the content of Theorems 1 and 2. Following up on our work, Dirac’s condition has finally been proved by Ramazan [25]. This leads to the Corollary at the end of Sect. 5, which is the main result of the paper. Further to the examples considered in Sect. 6, it would be interesting to apply the point of view in this paper to the holonomy groupoid of a foliation [2], and to the Lie groupoid defined by a manifold with boundary [23,19]. Moreover, the approach to index theory via the tangent groupoid [2] and its recent generalization to arbitrary Lie groupoids [20] may now be seen from the perspective of “strict” quantization theory. This may be helpful also in understanding the connection between various other approaches to index theory which use (formal deformation) quantization [8,7]. The central notion in C ∗ -algebraic quantization theory is that of a continuous field of C ∗ -algebras [5]. For our purposes the following reformulation is useful [10]. Definition 1. 
A continuous field of $C^*$-algebras $(\mathfrak{C}, \{\mathfrak{A}_x, \varphi_x\}_{x\in X})$ over a locally compact Hausdorff space $X$ consists of a $C^*$-algebra $\mathfrak{C}$, a collection of $C^*$-algebras $\{\mathfrak{A}_x\}_{x\in X}$, and a set $\{\varphi_x : \mathfrak{C} \to \mathfrak{A}_x\}_{x\in X}$ of surjective $*$-homomorphisms, such that for all $A \in \mathfrak{C}$,
1. the function $x \mapsto \|\varphi_x(A)\|$ is in $C_0(X)$;
2. one has $\|A\| = \sup_{x\in X} \|\varphi_x(A)\|$;
3. there is an element $fA \in \mathfrak{C}$ for any $f \in C_0(X)$ for which $\varphi_x(fA) = f(x)\varphi_x(A)$ for all $x \in X$.

The continuous cross-sections of the field in the sense of [5] consist of those elements $\{A_x\}_{x\in X}$ of $\prod_{x\in X} \mathfrak{A}_x$ for which there is a (necessarily unique) $A \in \mathfrak{C}$ such that $A_x = \varphi_x(A)$ for all $x \in X$.

We refer to [18,32] for the theory of Poisson manifolds and Poisson algebras; the latter is the classical analogue of the self-adjoint part of a $C^*$-algebra [15].

Definition 2. Let $I \subseteq \mathbb{R}$ contain 0 as an accumulation point. A strict quantization of a Poisson manifold $P$ on $I$ consists of
1. a continuous field of $C^*$-algebras $(\mathfrak{C}, \{\mathfrak{A}_\hbar, \varphi_\hbar\}_{\hbar\in I})$, with $\mathfrak{A}_0 = C_0(P)$;
2. a dense subspace $\tilde{\mathfrak{A}}_0 \subset C_0(P)$ on which the Poisson bracket is defined, and which is closed under pointwise multiplication and taking Poisson brackets (in other words, $\tilde{\mathfrak{A}}_0$ is a Poisson algebra);
3. a linear map $Q : \tilde{\mathfrak{A}}_0 \to \mathfrak{C}$ which (with $Q_\hbar(f) \equiv \varphi_\hbar(Q(f))$) for all $f \in \tilde{\mathfrak{A}}_0$ and $\hbar \in I$ satisfies

$$Q_0(f) = f, \tag{1}$$
$$Q_\hbar(f^*) = Q_\hbar(f)^*, \tag{2}$$
Lie Groupoid C ∗ -Algebras and Weyl Quantization
and for all $f, g \in \tilde{\mathfrak{A}}_0$ satisfies Dirac’s condition

$$\lim_{\hbar\to 0} \left\| \frac{i}{\hbar}[Q_\hbar(f), Q_\hbar(g)] - Q_\hbar(\{f, g\}) \right\| = 0. \tag{3}$$
Elements of $I$ are interpreted as possible values of Planck’s constant $\hbar$, and $\mathfrak{A}_\hbar$ is the quantum algebra of observables of the theory at the given value of $\hbar \neq 0$. For real-valued $f$, the operator $Q_\hbar(f)$ is the quantum observable associated to the classical observable $f$. This interpretation is possible because of condition (2) in Definition 2. In view of the comment after Definition 1, for fixed $f \in \tilde{\mathfrak{A}}_0$ each family $\{Q_\hbar(f)\}_{\hbar\in I}$ is a continuous cross-section of the continuous field in question. In view of (1) this implies, in particular, that

$$\lim_{\hbar\to 0} \|Q_\hbar(f)Q_\hbar(g) - Q_\hbar(fg)\| = 0. \tag{4}$$
This shows that strict quantization yields asymptotic morphisms in the sense of E-theory [2]; cf. [22]. See [15] for an extensive discussion of quantization theory from the above perspective, including an interpretation of the conditions (3) and (4).

2. Lie Groupoids and Lie Algebroids

Throughout this section, the reader is encouraged to occasionally skip to Sect. 6 to have a look at some examples of the objects defined. We refer to [26,17,3,2,15,1] for the basic definitions on groupoids; here we merely establish our notation.

Briefly, a groupoid is a category whose space of arrows $G$ is a set (hence the space of objects $Q$ is a set as well), and whose arrows are all invertible. The source and target projections are called $\tau_s : G \to Q$ and $\tau_t : G \to Q$, respectively. The subset of $G \times G$ on which the groupoid multiplication (i.e., the composition of arrows) is defined is called $G_2$; hence $(\gamma_1, \gamma_2) \in G_2$ iff $\tau_s(\gamma_1) = \tau_t(\gamma_2)$. The inversion $\gamma \mapsto \gamma^{-1}$ defines the unit space $G_0 = \{\gamma\gamma^{-1} \mid \gamma \in G\}$, which is related to the base space $Q$ by the “object inclusion map” $\iota : Q \hookrightarrow G$; this is a bijection between $Q$ and $\iota(Q) = G_0$. The notation $G \rightrightarrows Q$ for a groupoid to some extent captures the situation.

A Lie groupoid is a groupoid $G \rightrightarrows Q$, where $G$ and $Q$ are manifolds (perhaps with boundary), the maps $\tau_s$ and $\tau_t$ are surjective submersions, and multiplication and inclusion are smooth [17,3,2,15,1]. Following [15], we now sharpen Def. I.2.2 in [26].

Definition 3. A left Haar system on a Lie groupoid $G \rightrightarrows Q$ is a family $\{\mu^t_q\}_{q\in Q}$ of positive measures, where the measure $\mu^t_q$ is defined on $\tau_t^{-1}(q)$, such that
1. the family is invariant under left-translation in $G$;
2. each $\mu^t_q$ is locally Lebesgue (i.e., it is equivalent to the Lebesgue measure in every co-ordinate chart; note that each fiber $\tau_t^{-1}(q)$ is a manifold);
3. for each $\mathsf{f} \in C_c^\infty(G)$ the map $q \mapsto \int_{\tau_t^{-1}(q)} d\mu^t_q(\gamma)\, \mathsf{f}(\gamma)$ from $Q$ to $\mathbb{C}$ is smooth.
Here left-invariance means invariance under all maps $L_\gamma$, defined by

$$L_\gamma(\gamma') := \gamma\gamma' \tag{5}$$

whenever $(\gamma, \gamma') \in G_2$. Note that $L_\gamma$ maps $\tau_t^{-1}(\tau_s(\gamma))$ diffeomorphically to $\tau_t^{-1}(\tau_t(\gamma))$.
N. P. Landsman
A Lie groupoid $G \rightrightarrows Q$ has an associated Lie algebroid [17,3,15,1], which we denote by $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$. This is a vector bundle over $Q$, which apart from the bundle projection $\tau : \mathfrak{G} \to Q$ is equipped with a vector bundle map $\tau_a : \mathfrak{G} \to TQ$ (called the anchor), as well as with a Lie bracket $[\ ,\ ]_{\mathfrak{G}}$ on the space $\Gamma(\mathfrak{G})$ of smooth sections of $\mathfrak{G}$, satisfying certain compatibility conditions.

For our purposes, the essential point in the construction of $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$ from $G \rightrightarrows Q$ lies in the fact that the vector bundle $\mathfrak{G}$ over $Q$ is the normal bundle $N^\iota Q$ defined by the embedding $\iota : Q \hookrightarrow G$; accordingly, the projection $\tau : N^\iota Q \to Q$ is given by $\tau_s$ or $\tau_t$ (these projections coincide on $G_0$). The tangent bundle of $G$ at the unit space has a decomposition

$$T_{\iota(q)}G = T_{\iota(q)}G_0 \oplus T^t_{\iota(q)}G, \tag{6}$$

where $T^tG = \ker(T\tau_t)$ is a sub-bundle of $TG$. Note that $T^t_\gamma G = T_\gamma \tau_t^{-1}(\tau_t(\gamma))$. Hence $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$ is isomorphic as a vector bundle to the restriction $\mathfrak{G}'$ of $T^tG$ to $G_0$. Under this isomorphism the fiber $\mathfrak{G}_q$ above $q$ is mapped to the vector space $T^t_{\iota(q)}G = T_{\iota(q)}\tau_t^{-1}(q)$.

The following pleasant result was pointed out by Ramazan [25].

Proposition 1. Every Lie groupoid possesses a left Haar system.

Proof. A given strictly positive smooth density $\rho$ on the vector bundle $\mathfrak{G}$ can be (uniquely) extended to a left-invariant density $\tilde{\rho}$ on the vector bundle $T^tG$, which in turn yields a left Haar system by $\mu^t_q(f) = \int_{\tau_t^{-1}(q)} \tilde{\rho} f$. $\square$
One may canonically associate a $C^*$-algebra $C^*(G)$ to a Lie groupoid $G \rightrightarrows Q$ [2], and equally canonically associate a Poisson algebra $C^\infty(\mathfrak{G}^*)$ to its Lie algebroid $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$ [4,3] (here $\mathfrak{G}^*$ is the dual vector bundle of $\mathfrak{G}$, with projection denoted by $\tau^*$). From the point of view of quantization theory, these constructions go hand in hand [12,13,15].

Although a left Haar system is not intrinsic, and an intrinsic definition of $C^*(G)$ may be given [2,15,25], it vastly simplifies the presentation of our results if we define this $C^*$-algebra relative to a particular choice of a left Haar system $\{\mu^t_q\}_{q\in Q}$. For $\mathsf{f}, \mathsf{g} \in C_c^\infty(G)$ the product $*$ in $C^*(G)$ is then given by the convolution [26]

$$\mathsf{f} * \mathsf{g}(\gamma) := \int_{\tau_t^{-1}(\tau_s(\gamma))} d\mu^t_{\tau_s(\gamma)}(\gamma_1)\, \mathsf{f}(\gamma\gamma_1)\, \mathsf{g}(\gamma_1^{-1}); \tag{7}$$

the involution is defined by

$$\mathsf{f}^*(\gamma) := \overline{\mathsf{f}(\gamma^{-1})}. \tag{8}$$

The groupoid $C^*$-algebra $C^*(G)$ is the completion of $C_c^\infty(G)$ in a suitable $C^*$-norm [2,26,15].

On the classical side, the Poisson algebra $C^\infty(\mathfrak{G}^*)$ associated to a Lie algebroid $\mathfrak{G}$ [4,3,15] is most simply defined by listing special cases which uniquely determine the Poisson bracket. These are

$$\{f, g\} = 0; \tag{9}$$
$$\{\tilde{s}, f\} = -(\tau_a \circ s) f; \tag{10}$$
$$\{\tilde{s}_1, \tilde{s}_2\} = -\widetilde{[s_1, s_2]_{\mathfrak{G}}}. \tag{11}$$
Here $f, g \in C^\infty(Q)$ (regarded as functions on $\mathfrak{G}^*$ in the obvious way), and $\tilde{s} \in C^\infty(\mathfrak{G}^*)$ is defined by a section $s$ of $\mathfrak{G}$ through $\tilde{s}(\theta) = \theta(s(\tau^*(\theta)))$, etc. See [3] for an intrinsic definition.

3. A Generalized Exponential Map

Throughout the remainder of the paper, $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$ will be the Lie algebroid of a Lie groupoid $G \rightrightarrows Q$. In order to state and prove our main results we need to construct an exponential map $\mathrm{Exp}^W : \mathfrak{G} \to G$, which generalizes the map Exp from a Lie algebra to an associated Lie group. The construction of such a map was outlined by Pradines [24], but in order to eventually satisfy the self-adjointness condition (2) on our quantization map we need a different construction [15]. As in [24], our exponential map depends on the choice of a connection on the vector bundle $\mathfrak{G}$. As before, the reader is referred to Sect. 6 for examples of the constructions below.

Lemma 1. The vector bundles $T^tG$ and $\tau_s^*\mathfrak{G}$ (over $G$) are isomorphic.

Proof. The pull-back bundle $\tau_s^*\mathfrak{G}$ is a vector bundle over $G$ with projection onto the second variable. The isomorphism is proved via the vector bundle isomorphism $\mathfrak{G} \simeq \mathfrak{G}'$; see Sect. 2. Recalling (5), one checks that $TL_{\gamma^{-1}} : T^t_\gamma G \to T^t_{\gamma^{-1}\gamma} G$ is the desired bundle isomorphism between $T^tG$ and $\tau_s^*\mathfrak{G}'$. $\square$

Let us now assume that $\mathfrak{G}$ has a covariant derivative (or, equivalently, a connection), with associated horizontal lift $\ell^{\mathfrak{G}}$. By Lemma 1 one then obtains a connection on $T^tG$ (seen as a vector bundle over $G$, whose projection is borrowed from $TG$) through pull-back. Going through the definitions, one finds that the associated horizontal lift $\ell$ of a tangent vector $X = \dot{\gamma} := d\gamma(t)/dt|_{t=0}$ in $T_\gamma G$ to $Y \in T^t_\gamma G$ is

$$\ell_Y(\dot{\gamma}) = \frac{d}{dt}\left[ L_{\gamma(t)*}\, \ell^{\mathfrak{G}}_{TL_{\gamma^{-1}}Y}(\tau_s(\gamma(t))) \right]_{t=0}, \tag{12}$$

which is an element of $T_Y(T^tG)$ (here $\ell^{\mathfrak{G}}(\ldots)$ lifts a curve). Since the bundle $T^tG \to G$ has a connection, one can define geodesic flow $X \mapsto X(t)$ on $T^tG$ in precisely the same way as on a tangent bundle with affine connection. That is, the flow $X(t)$ is the solution of

$$\dot{X}(t) = \ell_{X(t)}(X(t)), \tag{13}$$

with initial condition $X(0) = X$.
Definition 4. Let the Lie algebroid $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$ of a Lie groupoid $G \rightrightarrows Q$ be equipped with a connection. Relative to the latter, the left exponential map $\mathrm{Exp}^L : \mathfrak{G} \to G$ is defined by

$$\mathrm{Exp}^L(X) := \gamma_{X'}(1) = \tau_{T^tG\to G}(X'(1)), \tag{14}$$

whenever the geodesic flow $X'(t)$ on $T^tG$ (defined by the connection on $T^tG$ pulled back from the one on $\mathfrak{G}$) is defined at $t = 1$. Here $X' \in \mathfrak{G}' = T^tG|_{G_0}$ is the image of $X$ under the isomorphism $\mathfrak{G}' \simeq \mathfrak{G}$.

Our goal, however, is to define a “symmetrized” version of $\mathrm{Exp}^L$.
Lemma 2. For all $X \in \mathfrak{G}$ for which $\mathrm{Exp}^L(X)$ is defined one has

$$\tau_t(\mathrm{Exp}^L(X)) = \tau(X). \tag{15}$$

Here $\tau$ is the bundle projection of the Lie algebroid.

Proof. We write $X$ for $X'$ in (14). One has $\tau_t(\gamma_X(0)) = \tau(X)$ and

$$\frac{d}{dt}\tau_t(\gamma_X(t)) = T(\tau_t \circ \tau_{T^tG\to G})\,\ell_{X(t)}(X(t)) = T\tau_t\, X(t) = 0,$$

since $\ell_X(Y)$ covers $Y$, and $X(t) \in T^tG = \ker(T\tau_t) \cap TG$. $\square$

We combine this with the obvious $\tau(\tfrac12 X) = \tau(-\tfrac12 X)$ to infer that

$$\tau_t(\mathrm{Exp}^L(\tfrac12 X)) = \tau_t(\mathrm{Exp}^L(-\tfrac12 X)) = \tau_s(\mathrm{Exp}^L(-\tfrac12 X)^{-1}).$$

Thus the (groupoid) multiplication in (16) below is well-defined.

Definition 5. The Weyl exponential map $\mathrm{Exp}^W : \mathfrak{G} \to G$ is defined by

$$\mathrm{Exp}^W(X) := \mathrm{Exp}^L(-\tfrac12 X)^{-1}\, \mathrm{Exp}^L(\tfrac12 X). \tag{16}$$
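As a quick sanity check of (16), the following sketch evaluates the definition in the simplest case treated in Example 1 of Sect. 6: the pair groupoid of $Q = \mathbb{R}$ with the flat connection, where the left exponential is $(v, q) \mapsto (q, q + v)$. The helper names (`exp_left`, `exp_weyl`, `multiply`, `inverse`) are illustrative choices rather than notation from the text, and the restriction to one dimension is an assumption made for brevity.

```python
# Pair groupoid R x R => R: arrows are pairs (q1, q2) with
# target(q1, q2) = q1 and source(q1, q2) = q2 (conventions of Example 1).

def source(arrow):
    return arrow[1]

def target(arrow):
    return arrow[0]

def multiply(a, b):
    """Compose arrows; defined only when source(a) == target(b)."""
    if source(a) != target(b):
        raise ValueError("arrows are not composable")
    return (a[0], b[1])

def inverse(a):
    return (a[1], a[0])

def exp_left(v, q):
    """Left exponential for the flat connection on R: (q, q + v)."""
    return (q, q + v)

def exp_weyl(v, q):
    """Weyl exponential Exp^L(-v/2)^{-1} Exp^L(v/2), following (16)."""
    return multiply(inverse(exp_left(-0.5 * v, q)), exp_left(0.5 * v, q))

if __name__ == "__main__":
    v, q = 3.0, 1.5
    print(exp_weyl(v, q))  # (0.0, 3.0), i.e. (q - v/2, q + v/2)
```

In this toy case the composability required in (16) holds because both factors meet at the base point $q$, which is the content of Lemma 2.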
The following result is closely related to the tubular neighbourhood theorem.

Proposition 2. The maps $\mathrm{Exp}^L$ and $\mathrm{Exp}^W$ are diffeomorphisms from a neighbourhood $N^\iota$ of $Q \subset \mathfrak{G}$ (as the zero section) to a neighbourhood $N_\iota$ of $\iota(Q)$ in $G$, such that $\mathrm{Exp}^L(q) = \mathrm{Exp}^W(q) = \iota(q)$ for all $q \in Q$.

Proof. The property $\mathrm{Exp}^L(q) = \iota(q)$ is immediate from Definition 4. The push-forward of $\mathrm{Exp}^L$ at $q$ is $T\mathrm{Exp}^L : T_q\mathfrak{G} \to T_{\iota(q)}G$. Now recall the decomposition (6). For $X$ tangent to $Q \subset \mathfrak{G}$ one immediately sees that $T\mathrm{Exp}^L(X) = T\iota(X)$. For $X$ tangent to the fiber $\tau^{-1}(q)$, which we identify with $T^t_{\iota(q)}G$, one has $T\mathrm{Exp}^L(X) = X'$, as follows by the standard argument used to prove that $\exp_q$ in the theory of affine geodesics is a local diffeomorphism: for a curve $X(s) = sX$ in $T^t_{\iota(q)}G$ one has $\mathrm{Exp}^L(X(s)) = \gamma_{X'(s)}(1) = \gamma_{X'}(s)$, so that $d/ds[\mathrm{Exp}^L(X(s))]_{s=0} = X'$.

Since $T\mathrm{Exp}^L$ is a bijection at $q$, the inverse function theorem implies that $\mathrm{Exp}^L$ is a local diffeomorphism. Since it maps $Q$ pointwise to $\iota(Q)$, the local diffeomorphisms can be patched together to yield a diffeomorphism of the neighbourhoods stated in Proposition 2; we omit the details of this last step, since it is identical to the proof of the tubular neighbourhood theorem.

As for $\mathrm{Exp}^W$, for $X \in T_qQ \subset T_q\mathfrak{G}$ we have $T\mathrm{Exp}^W(X) = T\iota(X)$. Also,

$$\frac{d}{ds}\left[\mathrm{Exp}^L(-\tfrac12 sX)^{-1}\, \mathrm{Exp}^L(\tfrac12 sX)\right]_{s=0} = -\tfrac12\, TI(X') + \tfrac12 X',$$

where $TI$ is the push-forward of the inversion $I$ in $G$. The right-hand side lies in $\ker(T\tau_s + T\tau_t) \subset TG$, and every element in this kernel is of the stated form. Similarly to (6), one may prove the decomposition

$$T_{\iota(q)}G = T_{\iota(q)}G_0 \oplus \ker(T\tau_s + T\tau_t)(\iota(q)). \tag{17}$$

It follows that $T\mathrm{Exp}^W$ is a bijection at $q$, and the second part of the proposition is derived as for $\mathrm{Exp}^L$. $\square$
4. The Normal Groupoid and Continuous Fields of $C^*$-Algebras

We now come to the first part of the proof of the conjecture that $C^*(G)$ is related to the Poisson manifold $\mathfrak{G}^*$ by a strict quantization.

Theorem 1. Let $G$ be a Lie groupoid, with associated Lie algebroid $\mathfrak{G}$. Take $I = [0, 1]$ and put $\mathfrak{A}_0 = C_0(\mathfrak{G}^*)$, where $\mathfrak{G}^*$ is the dual vector bundle of $\mathfrak{G}$, and $\mathfrak{A}_\hbar = C^*(G)$ for $\hbar \in I\setminus\{0\}$. There exists a $C^*$-algebra $\mathfrak{C}$ and a family of surjective $*$-homomorphisms $\{\varphi_\hbar : \mathfrak{C} \to \mathfrak{A}_\hbar\}_{\hbar\in I}$ such that $(\mathfrak{C}, \{\mathfrak{A}_\hbar, \varphi_\hbar\}_{\hbar\in I})$ is a continuous field of $C^*$-algebras.

The proof uses the normal groupoid of Hilsum and Skandalis [9] (also cf. [33,15]), re-interpreted in terms of the Lie algebroid. We recall the definition; our construction of the smooth structure is different from the one in [9]. The essence is to regard the vector bundle $\mathfrak{G}$ as a Lie groupoid under addition in each fiber, and glue it to $G$ so as to obtain a new Lie groupoid containing both $\mathfrak{G}$ and $G$.

Definition 6. Let $G \rightrightarrows Q$ be a Lie groupoid with associated Lie algebroid $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$. The normal groupoid $G_N$ is a Lie groupoid with base $[0, 1] \times Q$, defined by the following structures:
• As a set, $G_N = \mathfrak{G} \cup \{(0, 1] \times G\}$. We write elements of $G_N$ as pairs $(\hbar, u)$, where $u \in \mathfrak{G}$ for $\hbar = 0$ and $u \in G$ for $\hbar \neq 0$. Thus $\mathfrak{G}$ is identified with $\{0\} \times \mathfrak{G}$.
• As a groupoid, $G_N = \{0 \times \mathfrak{G}\} \cup \{(0, 1] \times G\}$. Here $\mathfrak{G}$ is regarded as a Lie groupoid over $Q$, with $\tau_s = \tau_t = \tau$ and addition in the fibers as the groupoid multiplication. The groupoid operations in $(0, 1] \times G$ are those in $G$.
• The smooth structure on $G_N$, making it a manifold with boundary, is as follows. To start, the open subset $O_1 := (0, 1] \times G \subset G_N$ inherits the product manifold structure. Let $Q \subset N^\iota \subset \mathfrak{G}$ and $\iota(Q) \subset N_\iota \subset G$, as in Proposition 2. Let $O$ be the open subset of $[0, 1] \times \mathfrak{G}$ (equipped with the product manifold structure; this is a manifold with boundary, since $[0, 1]$ is), defined as $O := \{(\hbar, X) \mid \hbar X \in N^\iota\}$. Note that $\{0\} \times \mathfrak{G} \subset O$. The map $\rho : O \to G_N$ is defined by

$$\rho(0, X) := (0, X); \qquad \rho(\hbar, X) := (\hbar, \mathrm{Exp}^W(\hbar X)). \tag{18}$$

Since $\mathrm{Exp}^W : N^\iota \to N_\iota$ is a diffeomorphism (cf. Proposition 2) we see that $\rho$ is a bijection from $O$ to $O_2 := \{0 \times \mathfrak{G}\} \cup \{(0, 1] \times N_\iota\}$. This defines the smooth structure on $O_2$ in terms of the smooth structure on $O$. Since $O_1$ and $O_2$ cover $G_N$, this specifies the smooth structure on $G_N$.

The fact that $G_N$ is a Lie groupoid eventually follows from the corresponding property of $G$. The given chart is defined in terms of the Weyl exponential, which depends on the choice of a connection in $\mathfrak{G}$. However, one may verify that any (smooth) connection, or, indeed, any ($Q$-preserving) diffeomorphism between $N^\iota$ and $N_\iota$ leads to an equivalent smooth structure on $G_N$. For example, we could have used $\mathrm{Exp}^L$ instead of $\mathrm{Exp}^W$. Also, the smoothness of $\mathrm{Exp}^W$ makes the above manifold structure on $G_N$ well defined, in that open subsets of $O_1 \cap O_2$ are assigned the same smooth structure.

Since $G_N$ is a Lie groupoid, we can form the $C^*$-algebra $C^*(G_N)$, which plays the role of $\mathfrak{C}$ in Theorem 1. To proceed, we need a result due to Lee [16].
Lemma 3. Let $\mathfrak{C}$ be a $C^*$-algebra, and let $\psi : \mathrm{Prim}(\mathfrak{C}) \to X$ be a continuous and open map from the primitive spectrum $\mathrm{Prim}(\mathfrak{C})$ (equipped with the Jacobson topology [5]) to a locally compact Hausdorff space $X$. Define $\mathfrak{I}_x := \cap\psi^{-1}(x)$; i.e., $A \in \mathfrak{I}_x$ iff $\pi_{\mathfrak{I}}(A) = 0$ for all $\mathfrak{I} \in \psi^{-1}(x)$ (here $\pi_{\mathfrak{I}}(\mathfrak{C})$ is the irreducible representation whose kernel is $\mathfrak{I}$). Note that $\mathfrak{I}_x$ is a (closed two-sided) ideal in $\mathfrak{C}$. Taking $\mathfrak{A}_x = \mathfrak{C}/\mathfrak{I}_x$ and $\varphi_x : \mathfrak{C} \to \mathfrak{A}_x$ to be the canonical projection, $(\mathfrak{C}, \{\mathfrak{A}_x, \varphi_x\}_{x\in X})$ is a continuous field of $C^*$-algebras.

For the proof cf. [6]. We apply this lemma with $\mathfrak{C} = C^*(G_N)$ and $X = I = [0, 1]$. In order to verify the assumption in the lemma, we first note that $\mathfrak{I}_0 \simeq C_0((0, 1]) \otimes C^*(G)$, as follows from a glance at the topology of $G_N$. Hence $\mathrm{Prim}(\mathfrak{I}_0) = (0, 1] \times \mathrm{Prim}(C^*(G))$, with the product topology. Furthermore, one has $C^*(G_N)/\mathfrak{I}_0 \simeq C^*(\mathfrak{G}) \simeq C_0(\mathfrak{G}^*)$; the second isomorphism is established by the fiberwise Fourier transform (20) below (also cf. [9,2]). Hence $\mathrm{Prim}(C^*(G_N)/\mathfrak{I}_0) \simeq \mathfrak{G}^*$. Using this in Prop. 3.2.1 in [5], with $A = C^*_r(G_N)$ and $I$ the ideal $\mathfrak{I}_0$ generated by those $\mathsf{f} \in C_c^\infty(G_N)$ which vanish at $\hbar = 0$, yields the decomposition

$$\mathrm{Prim}(C^*(G_N)) \simeq \mathfrak{G}^* \cup \{(0, 1] \times \mathrm{Prim}(C^*(G))\}, \tag{19}$$

in which $\mathfrak{G}^*$ is closed. This does not provide the full topology on $\mathrm{Prim}(C^*(G_N))$, but it is sufficient to know that $\mathfrak{G}^*$ is not open. If it were, $(0, 1] \times \mathrm{Prim}(C^*(G))$ would be closed in $\mathrm{Prim}(C^*(G_N))$, and this possibility can safely be excluded by looking at the topology of $G_N$ and the definition of the Jacobson topology.

Using (19), we can define a map $\psi : \mathrm{Prim}(C^*(G_N)) \to [0, 1]$ by $\psi(\mathfrak{I}) = 0$ for all $\mathfrak{I} \in \mathfrak{G}^*$ and $\psi(\hbar, \mathfrak{I}) = \hbar$ for $\hbar \neq 0$ and $\mathfrak{I} \in \mathrm{Prim}(C^*(G))$. It is clear from the preceding considerations that $\psi$ is continuous and open. Using this in Lemma 3, one sees that $\mathfrak{I}_\hbar$ is the ideal in $C^*(G_N)$ generated by those $\mathsf{f} \in C_c^\infty(G_N)$ which vanish at $\hbar$. Hence $\mathfrak{A}_0 \simeq C_0(\mathfrak{G}^*)$, as above, and $\mathfrak{A}_\hbar \simeq C^*(G)$ for $\hbar \neq 0$. Theorem 1 then follows from Lemma 3.

As pointed out to the author by G. Skandalis (private communication, June 1997), similar considerations lead to the following generalization of Theorem 1.

Let $\tilde{G}$ be a Lie groupoid with base $\tilde{Q}$, and let $p$ be a continuous and open map from $\tilde{Q}$ to some Hausdorff space $X$, which is $\tilde{G}$-invariant in the sense that $p \circ \tau_s = p \circ \tau_t$. Define $\tilde{G}_x := (p \circ \tau_s)^{-1}(x)$ (this is a sub-groupoid of $\tilde{G}$ because of the $\tilde{G}$-invariance of $p$), and $\mathfrak{A}^x := C^*(\tilde{G}_x)$. Then the collection $(\{\mathfrak{A}^x\}_{x\in X}, C^*(\tilde{G}))$ is a continuous field of $C^*$-algebras at those points $x$ where $C^*(\tilde{G}_x) = C^*_r(\tilde{G}_x)$. Here $\mathsf{f} \in C^*(\tilde{G})$ is understood to define a section of the field $\{\mathfrak{A}^x\}_{x\in X}$ by $\mathsf{f}(x) = \mathsf{f}|_{\tilde{G}_x}$.

We apply this to our situation by taking $\tilde{G} = G_N$ and $X = I$, hence $\tilde{Q} = I \times Q$, and $p$ is just projection onto the first variable. Continuity away from $\hbar = 0$ follows from the triviality of the field for $\hbar \neq 0$ (whether or not $C^*_r(G) = C^*(G)$). Continuity at $\hbar = 0$ follows by noticing that $C^*_r(\mathfrak{G}) = C^*(\mathfrak{G})$, both sides being equal to $C_0(\mathfrak{G}^*)$. In other words, from this point of view it is the amenability of $\mathfrak{G}$, regarded as a Lie groupoid, that lies behind Theorem 1.

5. Weyl Quantization on the Dual of a Lie Algebroid
Let $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$ be a Lie algebroid, with bundle projection $\tau$. We start by defining a fiberwise Fourier transform $\grave{f} \in C^\infty(\mathfrak{G})$ of suitable $f \in C^\infty(\mathfrak{G}^*)$. This transform depends on the choice of a family $\{\mu^L_q\}_{q\in Q}$ of Lebesgue measures, where $\mu^L_q$ is defined on the fiber $\tau^{-1}(q)$. We will discuss the normalization of each $\mu^L_q$ in the proof of Theorem 2; for the moment we merely assume that the $q$-dependence is smooth in the obvious (weak) sense. For a function $\grave{f}$ on $\mathfrak{G}$ which is $L^1$ on each fiber we put

$$f(\theta) := \int_{\tau^{-1}(q)} d\mu^L_q(X)\, e^{-i\theta(X)}\, \grave{f}(X), \tag{20}$$

where $X \in \tau^{-1}(q)$ with $q = \tau^*(\theta)$. Each $\mu^L_q$ determines a Lebesgue measure $\mu^{L*}_q$ on the fiber $\tau^{-1}_{\mathfrak{G}^*\to Q}(q)$ of $\mathfrak{G}^*$ by fixing the normalization in requiring that the inverse to (20) is given by

$$\grave{f}(X) = \int_{\tau^{-1}_{\mathfrak{G}^*\to Q}(q)} d\mu^{L*}_q(\theta)\, e^{i\theta(X)}\, f(\theta). \tag{21}$$
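The normalization convention in (20) and (21) can be checked numerically on a single one-dimensional fiber. The sketch below assumes that $\mu^L_q$ is the Lebesgue measure $dX$, in which case (21) forces $\mu^{L*}_q = d\theta/2\pi$; the Gaussian test function, the grid parameters and the function names are illustrative assumptions, not part of the text.

```python
import cmath
import math

def f_grave(x):
    """Test function on a one-dimensional fiber of the algebroid (a Gaussian)."""
    return math.exp(-0.5 * x * x)

def riemann(func, lo=-12.0, hi=12.0, steps=2400):
    """Midpoint Riemann sum of a complex-valued function on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def transform(theta):
    """Eq. (20) with mu^L = dX: integral of e^{-i theta X} f_grave(X) dX."""
    return riemann(lambda x: cmath.exp(-1j * theta * x) * f_grave(x))

def inverse_transform(f, x):
    """Eq. (21) with the dual measure mu^{L*} = d theta / (2 pi)."""
    return riemann(lambda t: cmath.exp(1j * t * x) * f(t)) / (2 * math.pi)

def exact(theta):
    """Closed form of (20) for the Gaussian: sqrt(2 pi) exp(-theta^2 / 2)."""
    return math.sqrt(2 * math.pi) * math.exp(-0.5 * theta * theta)

if __name__ == "__main__":
    # Both differences should be at the level of rounding error.
    print(abs(transform(1.0) - exact(1.0)))
    print(abs(inverse_transform(exact, 0.7) - f_grave(0.7)))
```

With a different normalization of $\mu^L_q$ the factor $1/2\pi$ in `inverse_transform` would have to be rescaled accordingly, which is the freedom that the proof of Theorem 2 later fixes.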
Having constructed a Fourier transform, we define the class $C^\infty_{PW}(\mathfrak{G}^*)$ as consisting of those smooth functions on $\mathfrak{G}^*$ whose Fourier transform is in $C_c^\infty(\mathfrak{G})$; this generalizes the class of Paley–Wiener functions on $T^*\mathbb{R}^n \simeq \mathbb{C}^n$. We pick a function $\kappa \in C^\infty(\mathfrak{G}, \mathbb{R})$ with support in $N^\iota$ (cf. Proposition 2), equaling unity in some smaller tubular neighbourhood of $Q$, as well as satisfying $\kappa(-X) = \kappa(X)$ for all $X \in \mathfrak{G}$.

Definition 7. Let $G$ be a Lie groupoid with Lie algebroid $\mathfrak{G}$. For $\hbar \neq 0$, the Weyl quantization of $f \in C^\infty_{PW}(\mathfrak{G}^*)$ is the element $Q^W_\hbar(f) \in C_c^\infty(G)$, regarded as a dense subalgebra of $C^*(G)$, defined by $Q^W_\hbar(f)(\gamma) := 0$ when $\gamma \notin N_\iota$, and by

$$Q^W_\hbar(f)(\mathrm{Exp}^W(X)) := \hbar^{-n}\, \kappa(X)\, \grave{f}(X/\hbar). \tag{22}$$

Here the Weyl exponential $\mathrm{Exp}^W : \mathfrak{G} \to G$ is defined in (16), and the cutoff function $\kappa$ is as specified above.

This definition is possible by virtue of Proposition 2. By our choice of $C^\infty_{PW}(\mathfrak{G}^*)$, the operator $Q^W_\hbar(f)$ is independent of $\kappa$ for small enough $\hbar$ (depending on $f$).
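The role played by the symmetric form of (16) and by the evenness of $\kappa$ can be made explicit by a short computation. The following is a sketch rather than part of the original argument; it assumes $\hbar > 0$, that $\kappa$ is real-valued as stated above, and that the involution (8) includes complex conjugation, as is standard for groupoid convolution algebras.

```latex
% Step 1: inversion acts on the Weyl exponential by X -> -X, from (16).
\mathrm{Exp}^W(X)^{-1}
  = \bigl(\mathrm{Exp}^L(-\tfrac12 X)^{-1}\,\mathrm{Exp}^L(\tfrac12 X)\bigr)^{-1}
  = \mathrm{Exp}^L(\tfrac12 X)^{-1}\,\mathrm{Exp}^L(-\tfrac12 X)
  = \mathrm{Exp}^W(-X).

% Step 2: complex conjugation and the inverse Fourier transform (21).
\overline{\grave{f}(-X)}
  = \int d\mu^{L*}_q(\theta)\, e^{i\theta(X)}\, \overline{f(\theta)}
  = \grave{(\overline{f})}(X).

% Step 3: combine with the involution (8), definition (22),
% and \kappa(-X) = \kappa(X).
Q^W_\hbar(f)^*\bigl(\mathrm{Exp}^W(X)\bigr)
  = \overline{Q^W_\hbar(f)\bigl(\mathrm{Exp}^W(-X)\bigr)}
  = \hbar^{-n}\,\kappa(-X)\,\overline{\grave{f}(-X/\hbar)}
  = \hbar^{-n}\,\kappa(X)\,\grave{(\overline{f})}(X/\hbar)
  = Q^W_\hbar(\overline{f})\bigl(\mathrm{Exp}^W(X)\bigr).
```

Had the left exponential been used in (22) in place of the Weyl exponential, Step 1 would fail in general, which is one way to see why the symmetrized map is needed for condition (2) of Definition 2.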
Theorem 2. Let $G$ be a Lie groupoid with Lie algebroid $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$, and take $\tilde{\mathfrak{A}}_0 = C^\infty_{PW}(\mathfrak{G}^*)$. For each $f \in \tilde{\mathfrak{A}}_0$ the operator $Q^W_\hbar(f)$ of Definition 7 satisfies $Q^W_\hbar(f)^* = Q^W_\hbar(f^*)$, and the family $\{Q^W_\hbar(f)\}_{\hbar\in[0,1]}$, with $Q^W_0(f) = f$, is a continuous cross-section of the continuous field of $C^*$-algebras of Theorem 1.

Proof. Writing the Poisson bracket and the pointwise product in terms of the Fourier transform, one quickly establishes that $\tilde{\mathfrak{A}}_0$ is indeed a Poisson algebra.

It is immediate from (8) and (16) that for real-valued $f \in \tilde{\mathfrak{A}}_0$ the operator $Q^W_\hbar(f)$ is self-adjoint in $C^*(G)$; this implies the first claim.

To prove the second claim, we pick a left Haar system $\{\mu^t_q\}_{q\in Q}$ on $G \rightrightarrows Q$; see Proposition 1. The vector bundle $\mathfrak{G}$, regarded as a Lie groupoid under addition in each fiber (cf. Definition 6), has a left Haar system in any case, consisting of the family $\{\mu^L_q\}_{q\in Q}$ of Lebesgue measures on each fiber already used in the construction of the Fourier transform. Since we have a Lie groupoid, the Radon–Nikodym derivative $J_q(X) := d\mu^t_q(\mathrm{Exp}^W(X))/d\mu^L_q(X)$ is well defined and strictly positive on $N^\iota$ (since both measures are locally Lebesgue on spaces with the same dimension). We now fix the normalization of the $\mu^L_q$ by requiring that $\lim_{X\to 0} J_q(X) = 1$ for all $q$. This leads to a left Haar system for $G_N$, given by

$$\mu^t_{(0,q)} := \mu^L_q; \qquad \mu^t_{(\hbar,q)} := \hbar^{-n}\mu^t_q, \tag{23}$$

where $n$ is the dimension of the typical fiber of $\mathfrak{G}$. The factor $\hbar^{-n}$ is necessary in order to satisfy condition 3 in Definition 3 at $\hbar = 0$, as is easily verified using the manifold structure on $G_N$. Thus the $*$-algebraic structure on $C_c^\infty(G_N)$ defined by (7) and (8) with Definition 6 and (23) becomes

$$\mathsf{f} * \mathsf{g}(0, X) = \int_{\tau^{-1}\circ\tau(X)} d\mu^L_{\tau(X)}(Y)\, \mathsf{f}(0, X - Y)\, \mathsf{g}(0, Y); \tag{24}$$
$$\mathsf{f} * \mathsf{g}(\hbar, \gamma) = \hbar^{-n} \int_{\tau_t^{-1}(\tau_s(\gamma))} d\mu^t_{\tau_s(\gamma)}(\gamma_1)\, \mathsf{f}(\hbar, \gamma\gamma_1)\, \mathsf{g}(\hbar, \gamma_1^{-1}); \tag{25}$$
$$\mathsf{f}^*(0, X) = \overline{\mathsf{f}(0, -X)}; \tag{26}$$
$$\mathsf{f}^*(\hbar, \gamma) = \overline{\mathsf{f}(\hbar, \gamma^{-1})}. \tag{27}$$
One sees that, for given $f \in C^\infty_{PW}(\mathfrak{G}^*)$, the function $Q(f)$ on $G_N$ defined by $Q(f)(0, X) = \grave{f}(X)$, $Q(f)(\hbar, \mathrm{Exp}^W(X)) = \kappa(X)\grave{f}(X/\hbar)$, and $Q(f)(\hbar, \gamma) = 0$ for $\gamma \notin N_\iota$, is smooth on $G_N$; cf. Definition 6. In other words, $Q(f)$ is an element of $C^*(G_N)$.

Recall that $\mathfrak{I}_\hbar$ is the ideal in $C^*(G_N)$ generated by those functions in $C_c^\infty(G_N)$ which vanish at $\hbar$. The canonical map $\mathsf{f} \mapsto [\mathsf{f}]_\hbar$ from $C^*(G_N)$ to $C^*_r(G_N)/\mathfrak{I}_\hbar$ is given, for $\hbar \neq 0$, by $[\mathsf{f}]_\hbar(\cdot) = \mathsf{f}(\hbar, \cdot)$. However, in view of the factor $\hbar^{-n}$ in (25), this map is only a $*$-homomorphism from $C^*(G_N)$ to $C^*(G)$ if we add a factor $\hbar^{-n}$ to the definition (7) of convolution on $G$. Since for $\hbar \neq 0$ we would like to identify $C^*(G_N)/\mathfrak{I}_\hbar$ with $C^*(G)$, in which convolution is defined in the usual, $\hbar$-independent way, we should therefore define the maps $\varphi_\hbar$ of Theorem 1 by

$$\varphi_0(\mathsf{f}) : \theta \mapsto \acute{\mathsf{f}}(0, \theta); \qquad \varphi_\hbar(\mathsf{f}) : \gamma \mapsto \hbar^{-n}\mathsf{f}(\hbar, \gamma) \quad (\hbar \neq 0). \tag{28}$$

Here $\varphi_0 : C^*(G_N) \to C_0(\mathfrak{G}^*)$, and $\acute{\mathsf{f}}(0, \theta)$ and $\mathsf{f}(0, X)$ are related as $f(\theta)$ and $\grave{f}(X)$ are in (20). For $\hbar \neq 0$ one of course has $\varphi_\hbar : C^*(G_N) \to C^*(G)$. These expressions are initially defined for $\mathsf{f} \in C_c^\infty(G_N)$; since $\varphi_\hbar$ is contractive, they are subsequently extended to general $\mathsf{f} \in C^*(G_N)$ by continuity. This explains the factor $\hbar^{-n}$ in (22); the theorem then follows from the paragraph after (27). $\square$

The important calculations of Ramazan [25] show that

$$\lim_{\hbar\to 0} \left\| \frac{i}{\hbar}[Q^W_\hbar(f), Q^W_\hbar(g)] - Q^W_\hbar(\{f, g\}) \right\| = 0 \tag{29}$$

for all $f, g \in \tilde{\mathfrak{A}}_0$; this is Dirac’s condition (he in addition proves this to hold in formal deformation quantization).
Corollary 1. Let $G$ be a Lie groupoid, with associated
• Lie algebroid $\mathfrak{G} \overset{\to TQ}{\longrightarrow} Q$;
• Poisson manifold $\mathfrak{G}^*$ (the dual bundle to $\mathfrak{G}$, with Poisson structure (9)–(11));
• normal groupoid $G_N$ (cf. Definition 6).

In the context of Definition 2, the ingredients listed below yield a strict quantization of the Poisson manifold $P = \mathfrak{G}^*$:
1. The continuous field of $C^*$-algebras given by $\mathfrak{C} = C^*(G_N)$, $\mathfrak{A}_0 = C_0(\mathfrak{G}^*)$, $\mathfrak{A}_\hbar = C^*(G)$ for $\hbar \in I\setminus\{0\}$, and $\varphi_\hbar$ as defined in (28); cf. Theorem 1.
2. The dense subspace $\tilde{\mathfrak{A}}_0 = C^\infty_{PW}(\mathfrak{G}^*)$ of fiberwise Paley–Wiener functions on $\mathfrak{G}^*$ (as defined below (21)).
3. The map $Q : C^\infty_{PW}(\mathfrak{G}^*) \to C^*(G_N)$ is defined by putting $Q_\hbar = Q^W_\hbar$ (as specified in Definition 7); this determines $Q$ by Theorem 2 and the remark after Definition 1.
6. Examples

In this section we illustrate the concepts introduced above, and show that a number of known strict quantizations are special cases of Corollary 1. Details of these examples will be omitted; see [17,3,15,1] for matters related to the Lie groupoids and Lie algebroids involved, and cf. [2,26,15,25] for the $C^*$-algebras that appear. The quantization maps are discussed in detail in [15].

It turns out that a number of examples are more naturally described by changing some signs, as follows. We denote $\mathfrak{G}^*$, seen as a Poisson manifold through (9)–(11), by $\mathfrak{G}^*_-$. Alternatively, we may insert plus signs on the right-hand sides of (10) and (11), defining the Poisson manifold $\mathfrak{G}^*_+$. The normal groupoid $G_N$ may be equipped with a different manifold structure by replacing $\mathrm{Exp}^W(\hbar X)$ in (18) by $\mathrm{Exp}^W(-\hbar X)$; the original Definition 6 yields a manifold $G^+_N$, and the modified one defines $G^-_N$. (The original smooth structure is equivalent to the modified one by the diffeomorphism $(0, X) \mapsto (0, -X)$ and $(\hbar, \gamma) \mapsto (\hbar, \gamma)$.) In (22) we may replace $\grave{f}(X/\hbar)$ by $\grave{f}(-X/\hbar)$, defining a quantization map $Q^W_\hbar(\cdot)_-$, differing from the original one $Q^W_\hbar(\cdot)_+ = Q^W_\hbar(\cdot)$. Theorems 1 and 2, Eq. (29), as well as Corollary 1 remain valid if all signs are simultaneously changed in this way.

Example 1 (Weyl quantization on a manifold). The pair groupoid $Q \times Q \rightrightarrows Q$ on a set $Q$ is defined by the operations $\tau_s(q_1, q_2) := q_2$, $\tau_t(q_1, q_2) := q_1$, $\iota(q) := (q, q)$, $(q_1, q_2) \cdot (q_2, q_3) := (q_1, q_3)$, and $(q_1, q_2)^{-1} := (q_2, q_1)$. This is a Lie groupoid when $Q$ is a manifold. Any measure $\nu$ on $Q$ which is locally Lebesgue defines a left Haar system. One has $C^*(Q \times Q) \simeq \mathfrak{B}_0(L^2(Q))$, the $C^*$-algebra of all compact operators on $L^2(Q, \nu)$. The associated Lie algebroid is the tangent bundle $TQ$, with the usual bundle projection and Lie bracket, and the anchor is the identity. The Poisson bracket on $T^*Q$ is the canonical one.
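To see why the pair groupoid produces an algebra of operators, it may help to specialize the convolution (7) and the involution (8) to a finite set $Q$ with counting measure as Haar system. This is a discrete toy model rather than the manifold setting of the example, complex conjugation in (8) is assumed, and the function names are illustrative. In that case (7) reduces to matrix multiplication and (8) to the conjugate transpose.

```python
def convolve(f, g, points):
    """Eq. (7) on the pair groupoid of a finite set with counting measure.

    For gamma = (q1, q2) the sum runs over gamma1 = (q2, q), so that
    gamma * gamma1 = (q1, q) and gamma1^{-1} = (q, q2).
    """
    return {
        (q1, q2): sum(f[(q1, q)] * g[(q, q2)] for q in points)
        for q1 in points
        for q2 in points
    }

def involution(f):
    """Eq. (8): f*(gamma) is the complex conjugate of f(gamma^{-1})."""
    return {(q1, q2): complex(f[(q2, q1)]).conjugate() for (q1, q2) in f}

if __name__ == "__main__":
    points = [0, 1]
    # Functions on Q x Q, i.e. 2 x 2 matrices indexed by (row, column).
    f = {(0, 0): 1, (0, 1): 2j, (1, 0): 0, (1, 1): 3}
    g = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
    print(convolve(f, g, points))
    print(involution(f))
```

The identity $(\mathsf{f} * \mathsf{g})^* = \mathsf{g}^* * \mathsf{f}^*$ then becomes the familiar rule for the adjoint of a matrix product.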
To define $\mathrm{Exp}^W$ one chooses an affine connection $\nabla$ on $TQ$, with associated exponential map $\exp : TQ \to Q$. Then

$$\mathrm{Exp}^L(X) = (\tau(X), \exp_{\tau(X)}(X)); \tag{30}$$
$$\mathrm{Exp}^W(X) = (\exp_{\tau(X)}(-\tfrac12 X), \exp_{\tau(X)}(\tfrac12 X)), \tag{31}$$

where $X \in TQ$ and $\tau := \tau_{TQ\to Q}$.

On $Q = \mathbb{R}^n$ with flat metric and corresponding flat Riemannian connection this simplifies to $\mathrm{Exp}^W(v, q) = (q - \tfrac12 v, q + \tfrac12 v)$, where we have used canonical co-ordinates on $T\mathbb{R}^n$. The operator $Q^W_\hbar(f)_-$ on $L^2(\mathbb{R}^n)$ defined by (22), where one may take $\kappa = 1$, with (21), is then given by

$$Q^W_\hbar(f)_-\Psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^np\, d^ny}{(2\pi\hbar)^n}\, e^{ip(x-y)/\hbar}\, f(p, \tfrac12(x + y))\, \Psi(y). \tag{32}$$

This is Weyl’s original prescription. The associated continuous field of $C^*$-algebras is $\mathfrak{A}_0 = C_0(T^*\mathbb{R}^n)$ and $\mathfrak{A}_\hbar = \mathfrak{B}_0(L^2(\mathbb{R}^n))$ for $\hbar \neq 0$. The fact that this quantization map is strict, and in particular satisfies (3), was proved by Rieffel [29]; also cf. [15]. Replacing $I = [0, 1]$, as we have used so far in connection with Definition 2, by $I = \mathbb{R}$, the $C^*$-algebra $\mathfrak{C}$ in Definition 1 is $C^*(H_n)$, the group algebra of the simply connected Heisenberg group on $\mathbb{R}^n$ [6]. This is indeed the $C^*$-algebra of the tangent groupoid of $\mathbb{R}^n$ (see below).

When $Q$ is an arbitrary manifold, the normal groupoid $(Q \times Q)_N$ is the tangent groupoid of $Q$ [2]. If one takes the affine connection on $TQ$ to be the Levi-Civita connection given by a Riemannian metric on $Q$, one recovers the extension of Weyl’s prescription considered in [12,15]. One now has $\mathfrak{A}_0 = C_0(T^*Q)$ and $\mathfrak{A}_\hbar = \mathfrak{B}_0(L^2(Q))$ for $\hbar \neq 0$, and $Q^W_\hbar$ duly satisfies (3); see [12,15], where references to alternative generalizations of Weyl’s quantization prescriptions may be found.

Example 2 (Rieffel’s quantization of the Lie–Poisson structure on a dual Lie algebra). A Lie group is a Lie groupoid with $Q = e$. A left-invariant Haar measure on $G$ provides a left Haar system; the ensuing convolution algebra $C^*(G)$ is the usual group algebra. The Lie algebroid is the Lie algebra. The Poisson structure on $\mathfrak{g}^*_\pm$ is the well-known Lie–Poisson structure [18,15]. No connection is needed to define the exponential map, and one has

$$\mathrm{Exp}^L(X) = \mathrm{Exp}^W(X) = \mathrm{Exp}(X), \tag{33}$$

where $X \in \mathfrak{g}$ and $\mathrm{Exp} : \mathfrak{g} \to G$ is the usual exponential map. When $G$ is exponential (in that Exp is a diffeomorphism), one may omit $\kappa$ in (22). Taking the $+$ sign, the function $Q^W_\hbar(f)_+ \in C^*(G)$ is then given by

$$Q^W_\hbar(f)_+ : \mathrm{Exp}(X) \mapsto \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\langle\theta, X\rangle/\hbar}\, f(\theta). \tag{34}$$

This is Rieffel’s prescription [28], who proved strictness of the quantization for nilpotent groups. When $G$ is compact one needs the cut-off function $\kappa$, obtaining another quantization already known to be strict before the present paper and [25] appeared; see [14] or [15].
Example 3 (Weyl quantization on a gauge groupoid). The gauge groupoid $P \times_H P \rightrightarrows Q$ of a smooth principal bundle $P$ over a base $Q$ with structure group $H$ is defined by the projections $\tau_s([x, y]_H) = \tau(y)$ and $\tau_t([x, y]_H) = \tau(x)$, and the inclusion $\iota(\tau(x)) = [x, x]_H$. Accordingly, the multiplication $[x, y]_H \cdot [x', y']_H$ is defined when $y$ and $x'$ lie in the same fiber of $P$, in which case $[x', y']_H = [y, z]_H$ for some $z = y'h$, $h \in H$. Then $[x, y]_H \cdot [y, z]_H = [x, z]_H$. Finally, the inverse is $[x, y]_H^{-1} = [y, x]_H$. See [17].

An $H$-invariant measure $\mu$ on $P$ which is locally Lebesgue produces a left Haar system. In general, each measurable section $s : Q \to P$ determines an isomorphism $C^*(P \times_H P) \simeq \mathfrak{B}_0(L^2(Q)) \otimes C^*(H)$; this is a special case of Thm. 3.1 in [21] (also cf. [15], Thm. 3.7.1). When $H$ is compact one has $C^*(P \times_H P) \simeq \mathfrak{B}_0(L^2(P))^H$, where $L^2(P)$ is defined with respect to some $H$-invariant locally Lebesgue measure on $P$.

The associated Lie algebroid $(TP)/H \overset{\to TQ}{\longrightarrow} Q$ is defined by the obvious projections (both inherited from the projection $\tau : P \to Q$), the Lie bracket on $\Gamma((TP)/H)$ obtained by identifying this space with $\Gamma(TP)^H$, and borrowing the commutator from $\Gamma(TP)$; cf. [17]. The Poisson structure on $((TP)/H)^* = (T^*P)/H$ is given by the restriction of the canonical Poisson bracket on $C^\infty(T^*P)$ to $C^\infty(T^*P)^H$, under the isomorphism $C^\infty((T^*P)/H) \simeq C^\infty(T^*P)^H$.

One chooses an $H$-invariant affine connection on $TP$, with exponential map $\exp : TP \to P$. This induces a connection on $(TP)/H$, in terms of which

$$\mathrm{Exp}^L([X]_H) = [\tau(X), \exp_{\tau(X)}(X)]_H; \tag{35}$$
$$\mathrm{Exp}^W([X]_H) = [\exp_{\tau(X)}(-\tfrac12 X), \exp_{\tau(X)}(\tfrac12 X)]_H, \tag{36}$$

where $\tau = \tau_{TP\to P}$, and $[X]_H \in (TP)/H$ is the equivalence class of $X \in TP$ under the $H$-action on $TP$.

In the Riemannian case, for compact $H$ the corresponding map $Q^W_\hbar(\cdot)_-$ is simply the restriction of $Q^W_\hbar(\cdot)_- : C^\infty_{PW}(T^*P) \to \mathfrak{B}_0(L^2(P))$ as defined in Example 1 to $C^\infty_{PW}(T^*P)^H$. Since $Q^W_\hbar$ is invariant under isometries [15], the image of $C^\infty_{PW}(T^*P)^H$ is contained in $\mathfrak{B}_0(L^2(P))^H$. The ensuing quantization of $(T^*P)/H$ was already known to be strict; see [12,15]. Physically, this example describes the quantization of a nonabelian charged particle moving in a gravitational as well as a Yang–Mills field.

Example 4 (Transformation group $C^*$-algebras). Let a Lie group $G$ act smoothly on a set $Q$. The transformation groupoid $G \times Q \rightrightarrows Q$ is defined by the operations $\tau_s(x, q) = x^{-1}q$ and $\tau_t(x, q) = q$, so that the product $(x, q) \cdot (y, q')$ is defined when $q' = x^{-1}q$. Then $(x, q) \cdot (y, x^{-1}q) = (xy, q)$. The inclusion is $\iota(q) = (e, q)$, and for the inverse one has $(x, q)^{-1} = (x^{-1}, x^{-1}q)$.

Each left-invariant Haar measure $dx$ on $G$ leads to a left Haar system. The corresponding groupoid $C^*$-algebra is the usual transformation group $C^*$-algebra $C^*(G, Q)$, cf. [26].

The Lie algebroid $\mathfrak{g} \times Q \overset{\to TQ}{\longrightarrow} Q$ is a trivial bundle over $Q$, with anchor $\tau_a(X, q) = -\xi_X(q)$ (the fundamental vector field on $Q$ defined by $X \in \mathfrak{g}$). Identifying sections of $\mathfrak{g} \times Q$ with $\mathfrak{g}$-valued functions $X(\cdot)$ on $Q$, the Lie bracket on $\Gamma(\mathfrak{g} \times Q)$ is

$$[X, Y]_{\mathfrak{g}\times Q}(q) = [X(q), Y(q)]_{\mathfrak{g}} + \xi_Y X(q) - \xi_X Y(q). \tag{37}$$

The associated Poisson bracket coincides with the semi-direct product bracket defined in [11].
The trivial connection on $\mathfrak{g} \times Q \to Q$ yields

$$\mathrm{Exp}^L(X, q) = (\mathrm{Exp}(X), q); \tag{38}$$
$$\mathrm{Exp}^W(X, q) = (\mathrm{Exp}(X), \mathrm{Exp}(\tfrac12 X)q). \tag{39}$$
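Formula (39) can be checked directly against Definition 5 in the simplest case, assumed here purely for illustration: $G = \mathbb{R}$ acting on $Q = \mathbb{R}$ by translation, so that Exp is the identity map and $xq = q + x$. The function names below are hypothetical, and the exact floating-point comparison in `multiply` is adequate only for exactly representable inputs such as those used here.

```python
# Transformation groupoid G x Q => Q for G = R acting on Q = R by translation.
# An arrow is a pair (x, q) with source x^{-1} q = q - x and target q.

def act(x, q):
    return q + x

def source(arrow):
    x, q = arrow
    return act(-x, q)

def target(arrow):
    return arrow[1]

def multiply(a, b):
    """(x, q) . (y, x^{-1} q) = (x y, q); the group law is addition here."""
    if source(a) != target(b):
        raise ValueError("arrows are not composable")
    return (a[0] + b[0], a[1])

def inverse(arrow):
    """(x, q)^{-1} = (x^{-1}, x^{-1} q)."""
    x, q = arrow
    return (-x, act(-x, q))

def exp_left(X, q):
    """Eq. (38) with Exp the identity map on R."""
    return (X, q)

def exp_weyl(X, q):
    """Definition 5 applied to (38): Exp^L(-X/2)^{-1} Exp^L(X/2)."""
    return multiply(inverse(exp_left(-0.5 * X, q)), exp_left(0.5 * X, q))

if __name__ == "__main__":
    X, q = 2.0, 0.25
    print(exp_weyl(X, q))  # (2.0, 1.25), i.e. (Exp(X), Exp(X/2) q) as in (39)
```

The second component shows the shift of the base point by half the group element, which reappears with the opposite sign in the argument of $f$ in the quantization formula for this example.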
The cutoff κ in (22) is independent of q, and coincides with the function appearing in ∞ (g∗ × Q) is then quantized by Example 2. For small enough h¯ a function f ∈ CPW Z d nθ (f ) : (Exp(X), q) → eihθ,Xi/h¯ f (±θ, Exp(− 21 X)q). (40) QW ± h¯ ¯ )n g∗ (2π h When G = Rn and Q has a G-invariant measure, the map f → QW h¯ (f )± is equivalent to the deformation quantization considered by Rieffel [27], who already proved that it is strict (also cf. [15]). Note added in proof. All results remain true when the groupoid C ∗ -algebras are replaced by reduced ones. This is clear both from the proof of Lemma 3 and from the argument at the end of Sect. 4 (which should be attributed to E. Blanchard). References 1. Cannas da Silva, A., Hartshorn, K., Weinstein, A.: Lectures on Geometric Models for Noncommutative Algebras. Providence: AMS, 1998 2. Connes, A.: Noncommutative Geometry. San Diego: Academic Press, 1994 3. Coste, A., Dazord, P., Weinstein, A.: Groupoides symplectiques. Publ. Dépt. Math. Univ. C. Bernard-Lyon I 2A, 1–62 (1987) 4. Courant, T.J.: Dirac Manifolds. Trans. Am. Math. Soc. 319, 631–661 (1990) 5. Dixmier, J.: C ∗ -Algebras. Amsterdam: North-Holland, 1977 6. Elliott, G.A., Natsume, T., Nest, R.: The Heisenberg group and K-theory. K-Theory 7, 409–428 (1993) 7. Elliott, G.A., Natsume, T., Nest, R.: The Atiyah–Singer index theorem as passage to the classical limit in quantum mechanics. Commun. Math. Phys. 182, 505–533 (1996) 8. Fedosov, B.V.: Deformation Quantization and Index Theory. Berlin: Akademie-Verlag 1996 9. Hilsum, M., Skandalis, G.: Morphismes K-orientés d’espaces de feuilles et fonctorialité en théorie de Kasparov. Ann. Scient. Éc. Norm. Sup. (4e s.) 20, 325–390 (1988) 10. Kirchberg, E., Wassermann, S.: Operations on continuous bundles of C ∗ -algebras. Math. Ann. 303, 677– 697 (1995) 11. Krishnaprasad, P.S., Marsden, J.E.: Hamiltonian structure and stability for rigid bodies with flexible attachments. Arch. Rat. Mech. An. 98, 137–158 (1987) 12. 
Landsman, N.P.: Strict deformation quantization of a particle in external gravitational and Yang–Mills fields. J. Geom. Phys. 12, 93–132 (1993) 13. Landsman, N.P.: Classical and quantum representation theory. In: de Kerf, E. A., Pijls, H.G.J. (eds.) Proc. Seminar Mathematical Structures in Field Theory, CWI-syllabus 39, Amsterdam: Mathematisch Centrum CWI, 1996, pp. 135–163 14. Landsman, N.P.: Twisted Lie group C ∗ -algebras as strict quantizations. Lett. Math. Phys. 46, 181–188 (1998) 15. Landsman, N.P.: Mathematical Topics Between Classical and Quantum Mechanics. New York: Springer, 1998 16. Lee, R.-Y. On the C ∗ -algebras of operator fields. Indiana Univ. Math. J. 25, 303–314 (1976) 17. Mackenzie, K.: Lie Groupoids and Lie Algebroids in Differential Geometry. Cambridge: Cambridge University Press, 1987 18. Marsden, J.E., Ratiu, T.S.: Introduction to Mechanics and Symmetry. New York: Springer, 1994 19. Monthubert, B.: Groupoïdes et calcul pseudo-différentiel sur les variétés à coins. PhD Thesis. Paris: Université Paris VII- Denis Diderot, 1998 20. Monthubert, B., Pierrot, F.: Indice analytique et groupoïdes de Lie. C.R. Acad. Sci. Paris Série I 325, 193–198 (1997) 21. Muhly, P.S., Renault, J.N., Williams, D.P.: Equivalence and isomorphism for groupoid C ∗ -algebras. J. Operator Th. 17, 3–22 (1987)
22. Nagy, G.: E-theory with ∗-homomorphisms. J. Funct. Anal. 140, 275–299 (1996) 23. Nistor, V., Weinstein, A., Xu, P.: Pseudodifferential operators on differential groupoids. Preprint math.OA/9702054 (1998) 24. Pradines, J.: Géométrie différentielle au-dessus d’un groupoïde. C. R. Acad. Sci. Paris A266, 1194–1196 (1968) 25. Ramazan, B.: Quantification par Dèformation des variétés de Lie–Poisson. Ph.D Thesis. Orléans: Université d’Orléans, 1998 26. Renault, J.: A Groupoid Approach to C ∗ -algebras. Lecture Notes in Mathematics 793, Berlin: Springer, 1980 27. Rieffel, M.A.: Deformation quantization of Heisenberg manifolds. Commun. Math. Phys. 122, 531–562 (1989) 28. Rieffel, M.A.: Lie group convolution algebras as deformation quantizations of linear Poisson structures. Am. J. Math. 112, 657–686 (1990) 29. Rieffel, M.A.: Deformation quantization for actions of Rd . Mem. Am. Math. Soc. 106 (506), (1993) 30. Rieffel, M.A.: Quantization and C ∗ -algebras. In: Doran, R.S. (ed.) C ∗ -algebras: 1943–1993. Cont. Math. 167, Providence, RI: American Mathematical Society, 1994, pp. 67–97 31. Rieffel, M.A.: Quantization and operator algebras. In: Bracken, A.J., De Wit, D., Gould, M., Pearce, P. (eds.) Proc. XIIth Int. Congress of Mathematical Physics, Brisbane 1997 32. Vaisman, I.: Lectures on the Geometry of Poisson Manifolds. Basel: Birkhäuser, 1994 33. Weinstein, A.: Blowing up realizations of Heisenberg-Poisson manifolds. Bull. Sc. math. (2) 113, 381–406 (1989) 34. Weinstein, A.: Noncommutative geometry and geometric quantization. In: Donato, P. et al. (eds.) Symplectic Geometry and Mathematical Physics, Basel: Birkhäuser, 1991, pp. 446–461 Communicated by A. Connes
Commun. Math. Phys. 206, 383 – 407 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Exact Solution of Models Based on Non-Standard Representations
J. Gruneberg
Institut für Theoretische Physik, Universität zu Köln, Zülpicher Straße 77, 50937 Köln, Germany
Received: 24 November 1998 / Accepted: 26 March 1999
Abstract: The algebraic Bethe ansatz is a powerful method to diagonalize transfer-matrices of statistical models derived from solutions of (graded) Yang–Baxter equations, connected to fundamental representations of Lie (super-)algebras and their quantum deformations respectively. It is, however, very difficult to apply it to models based on higher dimensional representations of these algebras in auxiliary space, which are not of fusion type. A systematic approach to this problem is presented here. It is illustrated by the diagonalization of a transfer-matrix of a model based on the product of two different four-dimensional representations of $U_q(\widehat{gl}'(2,1;\mathbb{C}))$.

1. Introduction

The starting point for the construction of (Bethe ansatz) integrable models is the famous Yang–Baxter equation (YBE) [1,2],
\[ R_{12}^{VV'}(u,v)\,R_{13}^{VV''}(u,w)\,R_{23}^{V'V''}(v,w) = R_{23}^{V'V''}(v,w)\,R_{13}^{VV''}(u,w)\,R_{12}^{VV'}(u,v). \tag{1a} \]
$V$, $V'$ and $V''$ are in general three different spaces. The operators $R^{VV'}(u)$ act on the direct product $V \times V' \to V \times V'$. Both sides of Eq. (1a) act on the three-fold product $V \times V' \times V''$. The lower indices $i, j \in \{1, 2, 3\}$ on the R-operators denote as usual the two factors in this product on which the corresponding R-operator acts non-trivially. In general the so-called spectral parameters $u$, $v$ and $w$ are complex variables. Up to now, there is no general classification of the solutions to (1a). The situation is much better understood if $V$, $V'$ and $V''$ are carrier spaces for the representation of a simple Lie-algebra or its quantum-deformation. The corresponding theory is mainly due to Drinfel’d [3], who also introduced the concept of the universal R-matrix. The existence of the latter guarantees the existence of R-operators as matrices acting on direct products of usually, but not always, finite dimensional carrier spaces $V$. A good account of these developments has been given by Chari and Pressley [4]. Powerful methods to construct
these matrices explicitly were developed by Jimbo [5] and many others, see e.g. the book by Ma [6]. The dependence on only one complex parameter is due to the use of evaluation representations of affine algebras. In this case (1) takes the more common difference form
\[ R_{12}^{VV'}(u-v)\,R_{13}^{VV''}(u)\,R_{23}^{V'V''}(v) = R_{23}^{V'V''}(v)\,R_{13}^{VV''}(u)\,R_{12}^{VV'}(u-v). \tag{1b} \]
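The difference form (1b) can be checked numerically for a concrete solution. The following is a minimal sketch, assuming the six-vertex matrix that the paper introduces later as (22), with $V = V' = V''$ two-dimensional; the helper names (`unit`, `r_terms`, `embed`, `ybe_defect`) and the test value of $\eta$ are illustrative choices, not taken from the paper.

```python
import numpy as np

ETA = 0.3  # deformation parameter eta; an arbitrary test value (assumption)

def unit(i, j, dim=2):
    """Matrix unit e_ij = |i><j| with 1-based indices."""
    m = np.zeros((dim, dim))
    m[i - 1, j - 1] = 1.0
    return m

def r_terms(u, eta=ETA):
    """Terms (coeff, (i, j), (k, l)) of R(u) = sum coeff * e_ij (x) e_kl, read off from (22)."""
    s = np.sinh(2 * eta + u)
    a = np.sinh(2 * eta) * np.exp(u) / s    # cosh(u) + sinh(u) = exp(u)
    b = np.sinh(2 * eta) * np.exp(-u) / s   # cosh(u) - sinh(u) = exp(-u)
    c = np.sinh(u) / s
    return [
        (1.0, (1, 1), (1, 1)), (1.0, (2, 2), (2, 2)),
        (c, (1, 1), (2, 2)), (c, (2, 2), (1, 1)),
        (a, (2, 1), (1, 2)), (b, (1, 2), (2, 1)),
    ]

def embed(terms, site_a, site_b, n_sites):
    """Let a two-site operator act on factors site_a and site_b of an n_sites-fold product."""
    dim = 2 ** n_sites
    total = np.zeros((dim, dim))
    for coeff, (i, j), (k, l) in terms:
        op = np.eye(1)
        for s in range(n_sites):
            if s == site_a:
                factor = unit(i, j)
            elif s == site_b:
                factor = unit(k, l)
            else:
                factor = np.eye(2)
            op = np.kron(op, factor)
        total += coeff * op
    return total

def ybe_defect(u, v):
    """Norm of the difference between the two sides of (1b)."""
    r12 = embed(r_terms(u - v), 0, 1, 3)
    r13 = embed(r_terms(u), 0, 2, 3)
    r23 = embed(r_terms(v), 1, 2, 3)
    return np.linalg.norm(r12 @ r13 @ r23 - r23 @ r13 @ r12)

print(ybe_defect(0.7, -0.4))  # should vanish up to rounding errors
```

The R-matrix is stored as a list of terms rather than as a fixed $4 \times 4$ array, so that the same data can be placed on any pair of factors.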
The first space $V$ is called auxiliary space; the second, relabeled to $V^{(n)}$ and in general taken out of some countable set $\{V^{(m)}\}_{m=1}^{N}$, is called a (local) quantum space. An L-operator acting on the direct product of these is defined as
\[ \hat L^{V}(n|u) := R^{VV^{(n)}}(u, w^{(n)}). \tag{2a} \]
It is assumed to act trivially on all other quantum spaces $V^{(m)}$ with $m \neq n$. Assuming that $w^{(n)}$ in (2a) just labels $V^{(n)}$ and that $u$ is a spectral parameter of difference type as in (1b), it is possible to introduce additional inhomogeneities $\delta^{(n)}$ into the monodromy-matrix
\[ \hat T^{V}(N|u) := \hat L^{V}(N|u-\delta^{(N)}) \cdots \hat L^{V}(1|u-\delta^{(1)}). \tag{2b} \]
Here $\delta^{(n)}$ and $w^{(n)}$ will be some complex numbers.
\[ \hat\tau^{V}(N|u) = \mathrm{tr}_V\left\{\hat T^{V}(N|u)\right\} \tag{2c} \]
can be viewed as a row-to-row transfer-matrix of a two dimensional (classical) statistical model, with $N$ sites per row, acting on the (global) quantum space $V^{(N)} \times \cdots \times V^{(1)}$. If $\delta^{(n)}$ vanishes and $w^{(n)}$ is independent of $n$, the transfer-matrix (2c) is called homogeneous. In any case integrability of the latter is established via (1), written as
\[ R_{12}^{VV'}(u,v)\,\hat L_1^{V}(n|u)\,\hat L_2^{V'}(n|v) = \hat L_2^{V'}(n|v)\,\hat L_1^{V}(n|u)\,R_{12}^{VV'}(u,v). \tag{3a} \]
From that the fundamental commutation relations (FCR) are obtained immediately:
\[ R_{12}^{VV'}(u,v)\,\hat T_1^{V}(N|u)\,\hat T_2^{V'}(N|v) = \hat T_2^{V'}(N|v)\,\hat T_1^{V}(N|u)\,R_{12}^{VV'}(u,v). \tag{3b} \]
Provided $R_{12}^{VV'}$ is invertible, which is guaranteed for finite dimensional $V$ and $V'$, this yields
\[ \left[\hat\tau^{V}(N|u), \hat\tau^{V'}(N|v)\right] = 0. \tag{3c} \]
Expanding $\hat\tau^{V'}(N|v)$ in $v$ one obtains an infinite family of operators commuting with $\hat\tau^{V}(N|u)$. The question whether this family contains the right number of “independent” integrals of motion for every finite $N$ is difficult to answer, and a positive answer is usually taken for granted. The set of equations (2) and (3) was derived by Baxter and can be found together with the original references in his excellent book [7]. The notation here is due to Faddeev and coworkers, who created a purely algebraic way for diagonalizing $\hat\tau^{V}(N|u)$, the algebraic Bethe ansatz (ABA). Their quantum inverse scattering method (QISM) [8] provided the background for Drinfel’d’s theory [3], but is more general and in the author’s opinion not fully exploited yet. A good account including original references can be found in the book by Korepin et al. [9] and the reprint collection [10].
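The chain (2a), (2b), (2c), (3c) can be followed numerically on a small lattice. The sketch below is illustrative only: it assumes the homogeneous case ($\delta^{(n)} = 0$), takes the six-vertex matrix (22) for every L-operator, and orders the tensor factors as auxiliary space first, then quantum spaces $1, \dots, N$ (an ordering convention of the sketch, not of the paper). All function names are hypothetical.

```python
import numpy as np

ETA = 0.3  # arbitrary test value for eta (assumption)

def unit(i, j):
    """2x2 matrix unit e_ij with 1-based indices."""
    m = np.zeros((2, 2))
    m[i - 1, j - 1] = 1.0
    return m

def r_terms(u, eta=ETA):
    """Terms of the six-vertex R-matrix (22): (coeff, aux indices, quantum indices)."""
    s = np.sinh(2 * eta + u)
    a = np.sinh(2 * eta) * np.exp(u) / s
    b = np.sinh(2 * eta) * np.exp(-u) / s
    c = np.sinh(u) / s
    return [
        (1.0, (1, 1), (1, 1)), (1.0, (2, 2), (2, 2)),
        (c, (1, 1), (2, 2)), (c, (2, 2), (1, 1)),
        (a, (2, 1), (1, 2)), (b, (1, 2), (2, 1)),
    ]

def l_operator(u, n, n_quantum):
    """L(n|u) as in (2a): R acting on the auxiliary space (site 0) and quantum site n."""
    n_sites = n_quantum + 1
    total = np.zeros((2 ** n_sites, 2 ** n_sites))
    for coeff, (i, j), (k, l) in r_terms(u):
        op = np.eye(1)
        for s in range(n_sites):
            factor = unit(i, j) if s == 0 else unit(k, l) if s == n else np.eye(2)
            op = np.kron(op, factor)
        total += coeff * op
    return total

def monodromy(u, n_quantum):
    """T(N|u) = L(N|u) ... L(1|u), the homogeneous version of (2b)."""
    t = np.eye(2 ** (n_quantum + 1))
    for n in range(1, n_quantum + 1):
        t = l_operator(u, n, n_quantum) @ t
    return t

def transfer(u, n_quantum):
    """tau(N|u) as in (2c): partial trace of T(N|u) over the auxiliary space."""
    d = 2 ** n_quantum
    t = monodromy(u, n_quantum).reshape(2, d, 2, d)
    return np.trace(t, axis1=0, axis2=2)

tau_u, tau_v = transfer(0.7, 3), transfer(-0.4, 3)
print(np.linalg.norm(tau_u @ tau_v - tau_v @ tau_u))  # commutator (3c), zero up to rounding
```

For a graded auxiliary space the plain trace would have to be replaced by the supertrace (4h); the sketch covers only the non-graded case.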
ABA is a powerful method to construct eigenvectors and eigenvalues of $\hat\tau^{V}(N|u)$. In some sense it is more systematic than the original coordinate Bethe ansatz [11]. In general this is only true if the auxiliary space $V$ is the carrier space of the fundamental representation of a Lie (super-)algebra or a deformation of the latter. Especially if the auxiliary space $V$ is a higher dimensional carrier space of another representation of the same algebra, simplicity is lost and ABA becomes cumbersome. Drinfel’d’s theory [3] suggests that a simple generalization should exist. A systematic approach to this problem will be developed in the following.

2. Models

In the case of general graded algebras Drinfel’d’s constructions [3] are still not completely understood. However for simple (affine) Lie superalgebras and their quantum deformations a proper algebraic construction has been given by Yamane recently [12]. Also QISM and ABA are not very sensitive to grading and the graded version of the YBE has been established by Kulish and Sklyanin long ago [13]. The R-matrices, which will be used as concrete examples, are related to the “quantum universal enveloping superalgebra” $U_q(\widehat{gl}'(2,1|\mathbb{C}))$. No use will be made of any peculiar features of this symmetry. The interested reader is referred to the book by Cornwell [14] on Lie superalgebras, from which the notation is borrowed, the book by Kac [15] for more details on affinization and to the paper [12] for the proper construction of the q-deformed universal enveloping superalgebra. The carrier space $V_3$ of the fundamental representation of $U_q(gl(2,1|\mathbb{C}))$ is complex and three-dimensional. Basis and cobasis will be denoted by
\[ |i\rangle, \qquad \langle j|i\rangle = \delta_{ij} \quad \text{for } i, j = 1, 2, 3. \tag{4a} \]
A basis of the complex carrier space $V_4$ of the four-dimensional representation will be denoted similarly. These representations are $\mathbb{Z}_2$-graded: To each basis-vector $|i\rangle$ a number $p(i) \in \{0, 1\}$ is assigned, i.e.
\[ p(1) = p(2) = 0, \quad p(3) = 1 \tag{4b} \]
for $V_3$ and analogously
\[ p(1) = p(2) = 0, \quad p(3) = p(4) = 1 \tag{4c} \]
for $V_4$. Local basis-vectors are divided into even (bosonic, $p = 0$) and odd (fermionic, $p = 1$) ones. Local operators acting in $V_3$ or $V_4$, etc. are expressed in the natural basis
\[ e_{ij} = |i\rangle\langle j|. \tag{4d} \]
If the corresponding space is a (local) quantum space, it will be denoted with a hat, e.g. $\hat e_{ij}$, for clarity. These operators act trivially on all other (local) quantum spaces. A grading is assigned to this basis according to
\[ p(e_{ij}) = [p(i) + p(j)] \bmod 2. \tag{4e} \]
It is possible to extend these definitions of grading naturally to those vectors $|\psi\rangle$ and operators $\hat a$ which are homogeneous with respect to the grading.
It is convenient to expand operators as well as vectors in the natural (tensor) product basis, which is ordered according to $V^{(N)} \times \cdots \times V^{(1)}$, see (2b). Grading imposes signs on products of homogeneous operators, i.e.:
\[ (\hat a \otimes \hat b)(\hat c \otimes \hat d) = (-1)^{p(\hat b)p(\hat c)}\,(\hat a \hat c) \otimes (\hat b \hat d) \tag{4f} \]
or on the action of homogeneous operators on homogeneous vectors, i.e.
\[ (\hat a \otimes \hat b)(|\psi\rangle \otimes |\varphi\rangle) = (-1)^{p(\hat b)p(|\psi\rangle)}\,(\hat a|\psi\rangle) \otimes (\hat b|\varphi\rangle). \tag{4g} \]
The only other effect of grading is that $\mathrm{tr}_V$ in (2c) has to be interpreted as supertrace:
\[ \mathrm{tr}_V\left\{\hat T^{V}(N|u)\right\} = \sum_i (-1)^{p(i)} \langle i|\hat T^{V}(N|u)|i\rangle. \tag{4h} \]
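As a brief worked illustration of the sign rule (4f), consider matrix units on $V_3 \otimes V_3$ with the parities (4b) and (4e). The choice of operators is only an example and is not taken from the paper.

```latex
% Both inner factors odd: p(\hat e_{13}) = p(e_{31}) = [p(1)+p(3)] \bmod 2 = 1
(1 \otimes \hat e_{13})\,(e_{31} \otimes 1)
  = (-1)^{p(\hat e_{13})\,p(e_{31})}\; e_{31} \otimes \hat e_{13}
  = -\, e_{31} \otimes \hat e_{13},
\\
% Both inner factors even: p(\hat e_{12}) = p(e_{21}) = 0
(1 \otimes \hat e_{12})\,(e_{21} \otimes 1)
  = (-1)^{0}\; e_{21} \otimes \hat e_{12}
  = +\, e_{21} \otimes \hat e_{12}.
```

A sign therefore appears only when two odd operators are moved past each other.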
Kulish and Sklyanin found [13] that additional signs, which appear in an explicit representation of the YBE (1) due to grading, can be absorbed into a redefinition of matrix elements, so that every solution of the graded YBE is equivalent to a solution of the conventional one. The four dimensional representation can be characterized by a set of complex parameters, symbolically denoted by
\[ V_4 \approx \{C, \kappa, \kappa^*, \mu, \mu^*\}. \tag{5a} \]
This is a peculiarity of Lie superalgebras [14], which is conserved under quantum deformation; $\kappa, \kappa^*$ and $\mu, \mu^*$ are not necessarily complex conjugate to each other, but related to $C$ by
\[ \kappa\kappa^* = [C]_q, \qquad \mu\mu^* = [C+1]_q, \tag{5b} \]
where $q$ is the deformation parameter,
\[ q := e^{2\eta}, \tag{5c} \]
and q-brackets are defined as usual by
\[ [C]_q := \frac{q^{C} - q^{-C}}{q - q^{-1}} = \frac{\sinh(2\eta C)}{\sinh(2\eta)}. \tag{5d} \]
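The two forms of the q-bracket in (5d) are easy to compare numerically. The following is a small sketch under the convention $q = e^{2\eta}$ of (5c); the function names are illustrative.

```python
import math

def q_bracket(c, eta):
    """[C]_q from the first form in (5d), with q = exp(2*eta) as in (5c)."""
    q = math.exp(2 * eta)
    return (q ** c - q ** (-c)) / (q - 1 / q)

def q_bracket_sinh(c, eta):
    """[C]_q from the hyperbolic form in (5d)."""
    return math.sinh(2 * eta * c) / math.sinh(2 * eta)

# The two forms agree; for eta -> 0 the bracket tends to the ordinary number C.
print(q_bracket(1.7, 0.3), q_bracket_sinh(1.7, 0.3))
print(q_bracket_sinh(2.5, 1e-6))
```

In the undeformed limit $\eta \to 0$ the constraints (5b) reduce to $\kappa\kappa^* = C$ and $\mu\mu^* = C + 1$.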
Different choices of $\kappa, \kappa^*, \mu, \mu^*$ can be related to each other by a similarity transformation of the algebra, which conserves grading, but is not unitary in general. That makes it convenient to keep these parameters. Note that the representation $V_4$ can be deformed continuously into $V_4'$, which is characterized by a set of primed parameters also connected by (5b). A well-known solution of (1b) with $V = V' = V'' = V_3$ is
\[
\begin{aligned}
R^{V_3V_3}(u) ={}& e_{11}\otimes\hat e_{11} + e_{22}\otimes\hat e_{22} - d(u)\, e_{33}\otimes\hat e_{33} \\
&+ c(u)\left[e_{11}\otimes(\hat e_{22}+\hat e_{33}) + e_{22}\otimes(\hat e_{11}+\hat e_{33})\right] \\
&+ a(u)\, e_{21}\otimes\hat e_{12} + b(u)\, e_{12}\otimes\hat e_{21} \\
&+ a(u)\left[e_{31}\otimes\hat e_{13} + e_{32}\otimes\hat e_{23}\right] - b(u)\left[e_{13}\otimes\hat e_{31} + e_{23}\otimes\hat e_{32}\right]
\end{aligned}
\tag{6}
\]
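with coefficients
\[
\begin{aligned}
a(u) &:= \frac{\sinh(2\eta)}{\sinh(2\eta+u)}\,[\cosh(u) + \sinh(u)], \\
b(u) &:= \frac{\sinh(2\eta)}{\sinh(2\eta+u)}\,[\cosh(u) - \sinh(u)], \\
c(u) &:= \frac{\sinh(u)}{\sinh(2\eta+u)}, \\
d(u) &:= \frac{\sinh(2\eta-u)}{\sinh(2\eta+u)}.
\end{aligned}
\]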
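A short rewriting of these coefficients may be useful when working with them. It only uses $\cosh(u) \pm \sinh(u) = e^{\pm u}$ and the definitions above.

```latex
a(u) = \frac{\sinh(2\eta)\,e^{u}}{\sinh(2\eta+u)}, \qquad
b(u) = \frac{\sinh(2\eta)\,e^{-u}}{\sinh(2\eta+u)}, \qquad
a(u)\,b(u) = \frac{\sinh^{2}(2\eta)}{\sinh^{2}(2\eta+u)},
\\
a(0) = b(0) = d(0) = 1, \qquad c(0) = 0 .
```

At $u = 0$ all terms of (6) proportional to $c(u)$ drop out, and the remaining coefficients have unit modulus.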
To the author’s knowledge it appeared first in a different notation in the work of Perk and Schultz [16]. It is the standard q-deformation of the $Y(gl(2,1|\mathbb{C}))$-symmetric R-matrix given by Kulish and Sklyanin [13]. Kulish and Sklyanin wrote down the $Y(gl(m,n|\mathbb{C}))$-symmetric R-matrix for arbitrary positive integers $m$ and $n$. Its generalization to the $U_q(\widehat{gl}'(m,n|\mathbb{C}))$-symmetric case can also be taken from the paper by Perk and Schultz [16]. It is a simple generalization of (6). The R-matrix (6) is related to the following $U_q(\widehat{gl}'(2,1|\mathbb{C}))$-symmetric R-matrix:
\[
\begin{aligned}
R^{V_3V_4}(u) ={}& \rho(u)\left[e_{11}\otimes(\hat e_{11}+\hat e_{33}) + e_{22}\otimes(\hat e_{11}+\hat e_{44})\right] \\
&+ \alpha_0(u)\left[e_{11}\otimes(\hat e_{22}+\hat e_{44}) + e_{22}\otimes(\hat e_{22}+\hat e_{33})\right] \\
&+ e_{33}\otimes\left[\beta_0(u)\,\hat e_{11} - \hat e_{22} + \gamma_0(u)(\hat e_{33}+\hat e_{44})\right] \\
&+ \delta_1(u)\, e_{12}\otimes\hat e_{43} + \delta_2(u)\, e_{21}\otimes\hat e_{34} \\
&- \varepsilon_1(u)\left[e_{13}\otimes\hat e_{23} + e_{23}\otimes\hat e_{24}\right] + \varepsilon_2(u)\left[e_{31}\otimes\hat e_{32} + e_{32}\otimes\hat e_{42}\right] \\
&- \zeta_1(u)\left[e_{13}\otimes\hat e_{41} - q^{-1} e_{23}\otimes\hat e_{31}\right] + \zeta_2(u)\left[e_{31}\otimes\hat e_{14} - q\, e_{32}\otimes\hat e_{13}\right]
\end{aligned}
\tag{7}
\]
with coefficients (A1), listed in Appendix A, in the sense that it fulfills the YBE (1b) with $V = V' = V_3$ and $V'' = V_4$:
\[ R_{12}^{V_3V_3}(u-v)\,R_{13}^{V_3V_4}(u)\,R_{23}^{V_3V_4}(v) = R_{23}^{V_3V_4}(v)\,R_{13}^{V_3V_4}(u)\,R_{12}^{V_3V_3}(u-v). \tag{8} \]
The construction of the R-matrix (7) and the proof of (8) are standard (see e.g. [6,17]). From (7) a transfer-matrix $\tau^{V_3}(N|u)$ is defined by (2). It is sufficient to consider the homogeneous case, i.e. $\delta^{(n)} = 0$ and $V^{(n)} = V_4$ for all $n$ in (2a). Integrability follows from (3c). It is easily tractable by ABA, which will be demonstrated in the next section. Another $U_q(\widehat{gl}'(2,1|\mathbb{C}))$-symmetric R-matrix acting on the direct product of two different four dimensional representations, characterized by the corresponding parameter
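sets (5a), is given by
\[
\begin{aligned}
R^{V_4V_4'}(u) ={}& f(u)\, e_{11}\otimes\hat e_{11} + g(u)\, e_{22}\otimes\hat e_{22} - e_{33}\otimes\hat e_{33} - e_{44}\otimes\hat e_{44} \\
&+ r_5\, e_{22}\otimes\hat e_{11} + r_5'\, e_{11}\otimes\hat e_{22} - r_{10}\,(e_{33}\otimes\hat e_{44} - e_{44}\otimes\hat e_{33}) \\
&- r_7\,(e_{33}+e_{44})\otimes\hat e_{11} - r_7'\, e_{11}\otimes(\hat e_{33}+\hat e_{44}) \\
&- r_9\,(e_{33}+e_{44})\otimes\hat e_{22} - r_9'\, e_{22}\otimes(\hat e_{33}+\hat e_{44}) \\
&+ r_1\, e_{21}\otimes\hat e_{12} + r_1'\, e_{12}\otimes\hat e_{21} - r_4\, e_{43}\otimes\hat e_{34} - r_4'\, e_{34}\otimes\hat e_{43} \\
&+ r_2\,(e_{31}\otimes\hat e_{13} + e_{41}\otimes\hat e_{14}) - r_2'\,(e_{13}\otimes\hat e_{31} + e_{14}\otimes\hat e_{41}) \\
&+ r_3\,(e_{32}\otimes\hat e_{23} + e_{42}\otimes\hat e_{24}) - r_3'\,(e_{23}\otimes\hat e_{32} + e_{24}\otimes\hat e_{42}) \\
&- r_6\,(e_{24}\otimes\hat e_{13} - q^{-1} e_{23}\otimes\hat e_{14}) + r_8'\,(e_{42}\otimes\hat e_{31} - q\, e_{32}\otimes\hat e_{41}) \\
&+ r_6'\,(e_{13}\otimes\hat e_{24} - q\, e_{14}\otimes\hat e_{23}) - r_8\,(e_{31}\otimes\hat e_{42} - q^{-1} e_{41}\otimes\hat e_{32}).
\end{aligned}
\tag{9}
\]
The coefficients are again listed in Appendix A. The construction of this R-matrix, and a proof of (1b),
\[ R_{12}^{V_4V_4'}(u-v)\,R_{13}^{V_4V_4''}(u)\,R_{23}^{V_4'V_4''}(v) = R_{23}^{V_4'V_4''}(v)\,R_{13}^{V_4V_4''}(u)\,R_{12}^{V_4V_4'}(u-v) \tag{10a} \]
or
\[ R_{12}^{V_3V_4}(u-v)\,R_{13}^{V_3V_4'}(u)\,R_{23}^{V_4V_4'}(v) = R_{23}^{V_4V_4'}(v)\,R_{13}^{V_3V_4'}(u)\,R_{12}^{V_3V_4}(u-v) \tag{10b} \]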
with $R^{V_3V_4}$ from (7) will be given elsewhere [17]. A special case ($V_4' = V_4$), leading to considerable simplifications, has been constructed explicitly by Gould et al. [18]. One may fix $u$ and $v$ in (10a) and regard $C$, $C'$ and $C''$ instead as spectral parameters in order to satisfy the general form (1a) of the YBE. From (9) the transfer-matrix $\tau^{V_4}(N|u)$ is defined by (2). It is again sufficient to consider the homogeneous case $\delta^{(n)} = 0$ and $V^{(n)} = V_4'$ for all $n$ in (2b). Integrability follows from (3c) with the choice between $\tau^{V_4}(N|v)$ and $\tau^{V_3}(N|v)$ as generating functionals for “integrals of motion”. Here ABA is not straightforward. This model requires a new strategy in order to obtain equations for all eigenvalues of $\tau^{V_4}(N|u)$.

3. Algebraic Bethe Ansatz

The original recipe for ABA is simple [8]:
1. Determine a vacuum state, preferably a highest or lowest weight state of the underlying group structure, if available, which triangularizes $\hat L^{V}(n|u)$ locally, and extend it via the product structure (2b) to a global vacuum, triangularizing $\hat T^{V}(N|u)$.
2. Take the off-diagonal elements of $\hat T^{V}(N|u)$, not annihilating the global vacuum, as creation-operators and use the associative algebra defined by the FCR (3b) to generate eigenvectors to all eigenvalues of $\hat\tau^{V}(N|u)$ (2c). Equations determining the latter are also derived from the algebra.
The first point is more or less a precondition for the applicability of ABA; the second is crucial: Only if $V$ is a carrier space of the fundamental representation of a possibly deformed and graded Lie algebra, the choice of creation-operators is obvious. $\hat\tau^{V_3}(N|u)$ and $\hat\tau^{V_4}(N|u)$, as defined in the previous section, are sufficiently complex to illustrate the general situation. Since the auxiliary space is graded, it is useful to transform the matrix-elements of $\hat L^{V_3}(n|u)$ (2a) in the $V_3$ basis according to
\[ \left[\hat L^{V_3}(n|u)\right]^{(V_3)}_{ij} \to (-1)^{p(j)[p(i)+p(j)]} \left[\hat L^{V_3}(n|u)\right]^{(V_3)}_{ij}. \tag{11} \]
This absorbs just a troublesome minus sign from the commutation of $|3\rangle_{V_3}$ with $[\hat L^{V_3}(n|u)]_{13}$ and $[\hat L^{V_3}(n|u)]_{23}$. All four local basis vectors of $V_4$ (4a) are suitable as (local) vacuum, preferably
\[ \Omega^{(n)} := |2\rangle^{(n)}. \tag{12} \]
$\Omega^{(n)}$ is a lowest weight state of the representation of $U_q(gl(2,1|\mathbb{C}))$ on $V_4$ and its equivalent was used by Kulish and Reshetikhin to treat the non-graded $Y(gl(3|\mathbb{C}))$-symmetric case [19]. Their calculation was generalized to the fundamental representation of $U_q(\widehat{gl}'(m,n))$ by Schultz [20],
\[ \hat L^{V_3}(n|u)\,\Omega^{(n)} = \begin{pmatrix} \omega_1^{(n)}(u) & 0 & 0 \\ 0 & \omega_2^{(n)}(u) & 0 \\ * & * & \omega_3^{(n)}(u) \end{pmatrix} \Omega^{(n)} \tag{13} \]
with $*$ denoting non-zero entries. The vacuum-eigenvalues of the diagonal elements are given by
\[ \omega_1^{(n)}(u) = \frac{\sinh(\eta C + u)}{\sinh(\eta(C+2) - u)}, \qquad \omega_2^{(n)}(u) = \omega_1^{(n)}(u), \qquad \omega_3^{(n)}(u) = -1. \tag{14} \]
The index $(n)$ will be omitted, due to homogeneity. Immediately from (13), (2b) and the definition
\[ |0\rangle_N = \Omega^{(N)} \otimes \Omega^{(N-1)} \otimes \cdots \otimes \Omega^{(1)} \tag{15} \]
of the (global) vacuum $|0\rangle_N$ follows
\[ \hat T^{V_3}(N|u)\,|0\rangle_N = \begin{pmatrix} [\omega_1(u)]^N & 0 & 0 \\ 0 & [\omega_1(u)]^N & 0 \\ \hat C_1(u) & \hat C_2(u) & (-1)^N \end{pmatrix} |0\rangle_N, \tag{16} \]
where
\[ \hat C_i(u) := [\hat T^{V_3}(N|u)]_{3i} \quad \text{for } i = 1, 2 \tag{17} \]
will later serve as creation-operators. ABA step 1 is finished: From (16), (2c) and (4h) follows the vacuum-eigenvalue of $\tau^{V_3}(N|u)$:
\[ \Lambda_N^{V_3}(u) = 2[\omega_1(u)]^N - (-1)^N. \tag{18} \]
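The step from (16) to (18) is a supertrace over the diagonal vacuum entries. A minimal numerical sketch, assuming the parities (4b) for $V_3$ and the vacuum eigenvalues (14); the function names and parameter values are illustrative.

```python
import numpy as np

def supertrace(matrix, parities):
    """Supertrace (4h): sum_i (-1)^{p(i)} <i|M|i>."""
    signs = np.array([(-1) ** p for p in parities])
    return float(np.sum(signs * np.diag(matrix)))

def omega1(u, c, eta):
    """Vacuum eigenvalue omega_1(u) from (14)."""
    return np.sinh(eta * c + u) / np.sinh(eta * (c + 2) - u)

def vacuum_eigenvalue(u, c, eta, n_sites):
    """Supertrace of the diagonal entries of (16), with omega_3(u) = -1."""
    w = omega1(u, c, eta) ** n_sites
    diagonal_part = np.diag([w, w, (-1.0) ** n_sites])
    return supertrace(diagonal_part, [0, 0, 1])  # parities (4b) of V_3

u, c, eta, n_sites = 0.4, 1.3, 0.3, 5
print(vacuum_eigenvalue(u, c, eta, n_sites))
print(2 * omega1(u, c, eta) ** n_sites - (-1) ** n_sites)  # right-hand side of (18)
```

The off-diagonal entries $\hat C_i(u)$ in (16) do not contribute, since the supertrace only sees the diagonal.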
As mentioned before, Kulish and Reshetikhin solved a model built from the fundamental representation of $Y(gl(3|\mathbb{C}))$, whose R-matrix differs from the $\eta \to 0$ limit of (6) only in minor details. The FCR (3b) derived from (8):
\[ R_{12}^{V_3V_3}(u-v)\,\hat T_1^{V_3}(N|u)\,\hat T_2^{V_3}(N|v) = \hat T_2^{V_3}(N|v)\,\hat T_1^{V_3}(N|u)\,R_{12}^{V_3V_3}(u-v) \tag{19} \]
are almost identical to the ones in [19]: Trigonometric functions in (6) do not show up, if appropriate abbreviations are used. Apart from a few signs due to grading, which was also realized in [19], the formal algebra defined by (19) becomes exactly the same. Of course it is possible to write down equations for eigenvectors and eigenvalues immediately, using the result of [19]. Again apart from a few signs, just the vacuum eigenvalues have to be replaced by (14). This is a well-known feature of ABA. However some more details will be needed, in order to tackle the more complicated problem of diagonalizing $\hat\tau^{V_4}(N|u)$ in the following section: The (nested, see below) algebraic Bethe ansatz for (right) eigenvectors of $\hat\tau^{V_3}(N|u)$ is [19]
\[ |\lambda_1, \dots, \lambda_M|F\rangle = F^{a_1,\dots,a_M}\, \hat C_{a_1}(\lambda_1) \cdots \hat C_{a_M}(\lambda_M)\, |0\rangle_N, \tag{20} \]
where $\{\lambda_1, \dots, \lambda_M\}$ is some set of yet unknown parameters and $F^{a_1,\dots,a_M}$ are some coefficients, yet undetermined. Summation over repeated $a_i = 1, 2$ with $i = 1, \dots, M$ is implied. From (19) it follows immediately
\[ \hat T_{33}(u)\,\hat C_i(v) = \frac{1}{c(u-v)}\,\hat C_i(v)\,\hat T_{33}(u) + \frac{a(v-u)}{c(v-u)}\,\hat C_i(u)\,\hat T_{33}(v), \tag{21a} \]
\[ \hat T_{ij}(u)\,\hat C_k(v) = \frac{1}{c(u-v)} \sum_{l,m=1}^{2} r_{lm,jk}(u-v)\,\hat C_m(v)\,\hat T_{il}(u) - \frac{b(u-v)}{c(u-v)}\,\hat C_j(u)\,\hat T_{ik}(v), \tag{21b} \]
\[ \hat C_i(u)\,\hat C_j(v) = \frac{1}{d(u-v)} \sum_{k,l=1}^{2} r_{kl,ij}(u-v)\,\hat C_l(v)\,\hat C_k(u) \]
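with $i, j, k \in \{1, 2\}$. $a(u)$, $b(u)$, $c(u)$ and $d(u)$ originate from (6). For brevity $[\hat T^{V_3}(N|u)]_{ij}$ has been denoted by $\hat T_{ij}(u)$. In the present case $r_{ik,jl}(u)$ are elements of the non-graded $U_q(\widehat{gl}'(2|\mathbb{C}))$-symmetric R-matrix,
\[
\begin{aligned}
R^{V_2V_2}(u) &= \sum_{i,j,k,l=1}^{2} r_{ik,jl}\, e_{ij}\otimes\hat e_{kl} \\
&= e_{11}\otimes\hat e_{11} + e_{22}\otimes\hat e_{22} + c(u)\left[e_{11}\otimes\hat e_{22} + e_{22}\otimes\hat e_{11}\right] + a(u)\, e_{21}\otimes\hat e_{12} + b(u)\, e_{12}\otimes\hat e_{21}
\end{aligned}
\tag{22}
\]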
which acts on the direct product of two two-dimensional, purely even subspaces $V_2$ of $V_3$, spanned by $|1\rangle$ and $|2\rangle$ from (4a). It is crucial to realize the appearance of $R^{V_2V_2}(u)$ as a proper submatrix in $R^{V_3V_3}(u)$ (6), because it defines a simpler BA-solvable model. Nested algebraic Bethe ansatz (NABA) is typical for models based on fundamental representations of dimension larger than 2. It was preceded by the ingenious, but complicated nested coordinate Bethe ansatz, invented by Gaudin [21] and Yang [1] independently. Their method was applied to the fundamental representation of the $Y(gl(m,n|\mathbb{C}))$-symmetric problem by Lai [23] and Sutherland [24]. The formal algebraic formulation of the method is apparently due to Takhtajan [22]. The transfer-matrix $\hat\tau^{V_3}(N|u)$ applied to the Bethe ansatz eigenvector (20) should yield
\[ \hat\tau^{V_3}(N|u)\, |\lambda_1, \dots, \lambda_M|F\rangle = \Lambda^{V_3}(N|u)\, |\lambda_1, \dots, \lambda_M|F\rangle. \tag{23} \]
Leaving some technical details for Appendix B, it turns out that this is true iff the coefficients $F$ in (20) fulfill “6-vertex-type” eigenvalue equations [19]:
\[ \left[\hat\tau^{V_2}(M|\lambda_k)\right]^{a_1,\dots,a_M}_{b_1,\dots,b_M} F^{b_1,\dots,b_M} = \frac{1}{[-\omega_1(\lambda_k)]^N}\, F^{a_1,\dots,a_M} \tag{24} \]
for $k = 1, \dots, M$, of course solvable by ABA [8]. This is the second nested Bethe ansatz. $\hat\tau^{V_2}(M|u)$ is an inhomogeneous transfer-matrix obtained according to (2) with $\delta^{(n)} = \gamma_n$ from (22). The eigenvalue of $\tau^{V_2}(M|u)$ corresponding to the BA-eigenvector $F$ is given by
\[ \Lambda_M^{V_2}(u; \mu_1, \dots, \mu_m) = \left(\prod_{n=1}^{M} c(u-\lambda_n)\right) \left(\prod_{\alpha=1}^{m} \frac{1}{c(u-\mu_\alpha)}\right) + \prod_{\alpha=1}^{m} \frac{1}{c(\mu_\alpha-u)} \tag{25} \]
with rapidities $\mu_\alpha$ ($\alpha = 1, \dots, m$), determined by the BA-equations
\[ \prod_{n=1}^{M} c(\mu_\alpha - \lambda_n) = \prod_{\substack{\beta=1 \\ \beta\neq\alpha}}^{m} \frac{c(\mu_\alpha - \mu_\beta)}{c(\mu_\beta - \mu_\alpha)} \tag{26a} \]
for $\alpha = 1, \dots, m$. These and expressions for the actual BA-vectors $F$, also depending on $\mu_1, \dots, \mu_m$, may be found in the literature [8]. Using (25) the eigenvalue condition (23) reads
\[ [-\omega_1(\lambda_k)]^N = \prod_{\alpha=1}^{m} c(\mu_\alpha - \lambda_k) \tag{26b} \]
for $k = 1, \dots, M$, which is the second set of BA-equations, determining $\lambda_1, \dots, \lambda_M$. Collecting the wanted terms in (B1) the eigenvalue of $\hat\tau^{V_3}(N|u)$ corresponding to the
NABA-eigenvector (20) follows immediately:
\[ \Lambda_N^{V_3}(u; \lambda_1, \dots, \lambda_M|\mu_1, \dots, \mu_m) = \left(\prod_{i=1}^{M} \frac{1}{c(u-\lambda_i)}\right) \times \left\{ [\omega_1(u)]^N \Lambda_M^{V_2}(u; \mu_1, \dots, \mu_m) - (-1)^N \right\}. \tag{27} \]
According to Baxter [7] BA-equations guarantee analyticity of all eigenvalues in $u$. Here a q-deformed, graded version of the R-matrix (6) has been used and the $\hat C_i$-operators act on a different quantum space, i.e. $V_4$ instead of $V_3$. However not knowing about [20], the whole calculation has been borrowed from [19]. A highest weight state, i.e. $|1\rangle$ instead of $|2\rangle$ in (12) and (15), could have been used as vacuum, but this leads to a very similar calculation. The result (27) is new, but it differs just by the vacuum eigenvalues (14) and signs from the well-known one in [19]. It is also complete. This is not true for the set of eigenvectors (20). However the missing ones may be produced using the lowest weight property of the ABA-vectors with respect to the group action on quantum space, which can be proved by standard-methods [8]. These are well-known and beautiful features of Bethe ansatz solvable systems. Also the equations for the inhomogeneous model with $w^{(n)} = C^{(n)}$ in (2a) can be written down immediately using an argument due to Baxter [7]:
\[
\begin{aligned}
\Lambda_N^{V_3}(u; \lambda_1, \dots, \lambda_M|\mu_1, \dots, \mu_m) ={}& \left(\prod_{n=1}^{N} \frac{\sinh(\eta C^{(n)} + u - \delta^{(n)})}{\sinh(\eta(C^{(n)}+2) - u + \delta^{(n)})}\right) \\
&\times \left[ \prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+2\eta)}{\sinh(u-\lambda_i)} \prod_{\alpha=1}^{m} \frac{\sinh(u-\mu_\alpha-2\eta)}{\sinh(u-\mu_\alpha)} + \prod_{\alpha=1}^{m} \frac{\sinh(u-\mu_\alpha+2\eta)}{\sinh(u-\mu_\alpha)} \right] \\
&- (-1)^N \prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+2\eta)}{\sinh(u-\lambda_i)}.
\end{aligned}
\tag{28a}
\]
The BA-equations (analyticity conditions) are
\[ \prod_{i=1}^{M} \frac{\sinh(\mu_\alpha-\lambda_i+2\eta)}{\sinh(\mu_\alpha-\lambda_i)} = \prod_{\substack{\beta=1 \\ \beta\neq\alpha}}^{m} \frac{\sinh(\mu_\alpha-\mu_\beta+2\eta)}{\sinh(\mu_\alpha-\mu_\beta-2\eta)} \tag{28b} \]
for $\alpha = 1, \dots, m$ and
\[ \prod_{\alpha=1}^{m} \frac{\sinh(\mu_\alpha-\lambda_k+2\eta)}{\sinh(\mu_\alpha-\lambda_k)} = \prod_{n=1}^{N} \frac{\sinh(\lambda_k - \delta^{(n)} - \eta(C^{(n)}+2))}{\sinh(\lambda_k - \delta^{(n)} + \eta C^{(n)})} \tag{28c} \]
for $k = 1, \dots, M$. The situation is different in the case of $\tau^{V_4}(N|u)$, because the innocent looking change of auxiliary space requires the use of an at first sight completely different algebra. In the next section a systematic approach to this problem will be developed, which makes extensive use of the presented solution.
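The eigenvalue formulas of this section can be cross-checked against each other numerically. The sketch below is illustrative: it assumes the reading of (25) in which the $\lambda$-dependent product multiplies $\prod_\alpha 1/c(u-\mu_\alpha)$ (the reading compatible with (26) and (28)), restricts (28a) to the homogeneous case $\delta^{(n)} = 0$, $C^{(n)} = C$, and uses arbitrary parameter values that are not solutions of the BA-equations, since the identities tested hold for any values. Function names are hypothetical.

```python
import math

def c(x, eta):
    """Coefficient c(u) = sinh(u) / sinh(2*eta + u) from (6)."""
    return math.sinh(x) / math.sinh(2 * eta + x)

def omega1(u, cap_c, eta):
    """Vacuum eigenvalue omega_1(u) from (14)."""
    return math.sinh(eta * cap_c + u) / math.sinh(eta * (cap_c + 2) - u)

def lambda_v2(u, lams, mus, eta):
    """Eigenvalue (25) of the nested six-vertex problem (assumed reading, see above)."""
    first = math.prod(c(u - l, eta) for l in lams) * math.prod(1 / c(u - m, eta) for m in mus)
    second = math.prod(1 / c(m - u, eta) for m in mus)
    return first + second

def lambda_v3(u, lams, mus, cap_c, eta, n):
    """Eigenvalue (27), built from (25) and the vacuum eigenvalue (14)."""
    prefactor = math.prod(1 / c(u - l, eta) for l in lams)
    return prefactor * (omega1(u, cap_c, eta) ** n * lambda_v2(u, lams, mus, eta) - (-1) ** n)

def lambda_v3_explicit(u, lams, mus, cap_c, eta, n):
    """Homogeneous version of (28a), written out in hyperbolic functions."""
    sh = math.sinh
    lam_prod = math.prod(sh(u - l + 2 * eta) / sh(u - l) for l in lams)
    term1 = lam_prod * math.prod(sh(u - m - 2 * eta) / sh(u - m) for m in mus)
    term2 = math.prod(sh(u - m + 2 * eta) / sh(u - m) for m in mus)
    return omega1(u, cap_c, eta) ** n * (term1 + term2) - (-1) ** n * lam_prod

lams, mus = [0.2, -0.5], [0.35]          # arbitrary test values
u, cap_c, eta, n = 0.9, 1.3, 0.3, 4
print(lambda_v3(u, lams, mus, cap_c, eta, n))
print(lambda_v3_explicit(u, lams, mus, cap_c, eta, n))
```

At $u = \lambda_k$ the first term of (25) vanishes because $c(0) = 0$, so that $\Lambda_M^{V_2}(\lambda_k) = \prod_\alpha 1/c(\mu_\alpha - \lambda_k)$; inserting this into (24) gives (26b).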
4. Diagonalization of $\hat\tau^{V_4'}(N|u)$

In order to understand the difficulties in diagonalizing the homogeneous version of $\tau^{V_4'}(N|u)$ defined in Sect. 2, it is convenient to follow the standard procedure from the previous section as far as possible. So $V_4^{(N)} \times \cdots \times V_4^{(1)}$ will be chosen as quantum space, while $V_4'$, characterized by primed parameters (5a), will serve as auxiliary space. The sign change (11) will be applied and the local vacuum will be chosen as the lowest weight state in $V_4$ (12). Omitting the local index $(n)$, due to homogeneity, this leads to
\[ \hat L^{V_4'}(n|u)\,\Omega^{(n)} = \begin{pmatrix} \omega_1(u) & 0 & 0 & 0 \\ * & \omega_2(u) & * & * \\ * & 0 & \omega_3(u) & 0 \\ * & 0 & 0 & \omega_4(u) \end{pmatrix} \Omega^{(n)} \tag{29} \]
with the new (local) vacuum eigenvalues
\[
\begin{aligned}
\omega_1(u) &= \frac{\sinh(\eta(C-C')+u)\,\sinh(\eta(C-C'+2)+u)}{\sinh(\eta(C+C')-u)\,\sinh(\eta(C+C'+2)+u)}, \\
\omega_2(u) &= \frac{\sinh(\eta(C+C'+2)-u)}{\sinh(\eta(C+C'+2)+u)}, \\
\omega_3(u) &= \frac{\sinh(\eta(C'-C)-u)}{\sinh(\eta(C+C'+2)+u)}, \\
\omega_4(u) &= \omega_3(u).
\end{aligned}
\tag{30}
\]
There are five non-vanishing off-diagonal entries compared to two in (13). This will be the same for the other three possible local vacua. Using (15), (2b) leads to
\[ \hat T^{V_4'}(N|u)\,|0\rangle_N = \begin{pmatrix} [\omega_1(u)]^N & 0 & 0 & 0 \\ * & [\omega_2(u)]^N & * & * \\ * & 0 & [\omega_3(u)]^N & 0 \\ * & 0 & 0 & [\omega_3(u)]^N \end{pmatrix} |0\rangle_N. \tag{31} \]
From the integrability condition (3c),
\[ \left[\hat\tau^{V_4'}(N|u), \hat\tau^{V_3}(N|v)\right] = 0, \]
it is clear that $\hat\tau^{V_3}(N|v)$ and $\hat\tau^{V_4'}(N|u)$ share the same eigenvectors. The eigenvalues (27) are in general degenerate. The lowest weight property of the (global) vacuum (15), which is inherited by the BA-vectors (20) via standard arguments [8], guarantees uniqueness of these special vectors. Note that the same argument would hold also for a highest weight state as (global) vacuum, but not for any other choice. From this and (31), following Baxter [7], it can be concluded immediately that all eigenvalues of $\hat\tau^{V_4'}(N|u)$ can be represented in the form
\[ \Lambda_N^{V_4'}(u; \lambda_1, \dots, \lambda_M|\mu_1, \dots, \mu_m) = [\omega_1(u)]^N F(u) + [\omega_2(u)]^N G(u) - [\omega_3(u)]^N \{H(u) + J(u)\}, \tag{32} \]
where $F(u)$, $G(u)$, $H(u)$ and $J(u)$ are meromorphic functions in $u$, whose residues cancel if the analyticity conditions (26) hold. In order to determine these unknown functions, the FCR (3b) with $V = V_3$ and $V' = V_4'$, namely
\[ R_{12}^{V_3V_4'}(u,v)\,\hat T_1^{V_3}(N|u)\,\hat T_2^{V_4'}(N|v) = \hat T_2^{V_4'}(N|v)\,\hat T_1^{V_3}(N|u)\,R_{12}^{V_3V_4'}(u,v) \tag{33} \]
with $R^{V_3V_4'}(u)$ from (7) will be chosen. The reasons are
1. $R^{V_3V_4'}$ is a $12 \times 12$-matrix while $R^{V_4V_4'}$ is a $16 \times 16$-matrix. The choice $V = V_4$ would greatly increase the number of equations.
2. In contrast to (16), Eq. (31) does not offer a natural choice of creation-operators, so the invaluable a priori knowledge of unique eigenvectors (20) with BA-parameters obeying (26) would be lost within the alternative choice.
The R-matrices $R^{V_3V_4'}(u)$ (7) and $R^{V_4V_4'}(u)$ (9) do not contain $R^{V_2V_2}(u)$ (22) as a proper submatrix. In particular unwanted terms turn out to be much more complicated. However it is possible to omit their calculation. As will be shown, the knowledge of unique eigenvectors (20) with (26) as well as some details of the calculation given in Sect. 3 are sufficient to determine the unknown functions in (32) unambiguously. For brevity (17) will be used as well as
\[ \hat T_{ij}^{V_3}(u) = [\hat T^{V_3}(N|u)]_{ij}^{V_3}, \qquad \hat T_{ij}^{V_4'}(u) = [\hat T^{V_4'}(N|u)]_{ij}^{V_4'}. \]
First it is convenient to list all components from (33) containing an operator $\hat C_i(u)$ (17) multiplied with a diagonal element $\hat T_{jj}^{V_4'}(v)$ from the right. From (7) and (A1) with primed parameters (5a) and (33) follows:
\[ \zeta_2(u-v)\,\hat T_{11}^{V_3}(u)\hat T_{41}^{V_4'}(v) - \zeta_2(u-v)\,q\,\hat T_{21}^{V_3}(u)\hat T_{31}^{V_4'}(v) + \beta_0(u-v)\,\hat C_1(u)\hat T_{11}^{V_4'}(v) = \rho(u-v)\,\hat T_{11}^{V_4'}(v)\hat C_1(u), \tag{34a} \]
\[ -\hat C_1(u)\hat T_{22}^{V_4'}(v) = \alpha_0(u-v)\,\hat T_{22}^{V_4'}(v)\hat C_1(u) - \varepsilon_2(u-v)\,\hat T_{23}^{V_4'}(v)\hat T_{33}^{V_3}(u), \tag{34b} \]
\[ \varepsilon_2(u-v)\,\hat T_{11}^{V_3}(u)\hat T_{23}^{V_4'}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T_{33}^{V_4'}(v) = \rho(u-v)\,\hat T_{33}^{V_4'}(v)\hat C_1(u), \tag{34c} \]
\[ \varepsilon_2(u-v)\,\hat T_{21}^{V_3}(u)\hat T_{24}^{V_4'}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T_{44}^{V_4'}(v) = \alpha_0(u-v)\,\hat T_{44}^{V_4'}(v)\hat C_1(u) + \delta_2(u-v)\,\hat T_{43}^{V_4'}(v)\hat C_2(u) - \zeta_2(u-v)\,\hat T_{41}^{V_4'}(v)\hat T_{33}^{V_3}(u), \tag{34d} \]
\[ \zeta_2(u-v)\,\hat T_{12}^{V_3}(u)\hat T_{41}^{V_4'}(v) - \zeta_2(u-v)\,q\,\hat T_{22}^{V_3}(u)\hat T_{31}^{V_4'}(v) + \beta_0(u-v)\,\hat C_2(u)\hat T_{11}^{V_4'}(v) = \rho(u-v)\,\hat T_{11}^{V_4'}(v)\hat C_2(u), \tag{34e} \]
\[ -\hat C_2(u)\hat T_{22}^{V_4'}(v) = \alpha_0(u-v)\,\hat T_{22}^{V_4'}(v)\hat C_2(u) - \varepsilon_2(u-v)\,\hat T_{24}^{V_4'}(v)\hat T_{33}^{V_3}(u), \tag{34f} \]
\[ \varepsilon_2(u-v)\,\hat T_{12}^{V_3}(u)\hat T_{23}^{V_4'}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T_{33}^{V_4'}(v) = \delta_1(u-v)\,\hat T_{34}^{V_4'}(v)\hat C_1(u) + \alpha_0(u-v)\,\hat T_{33}^{V_4'}(v)\hat C_2(u) + \zeta_2(u-v)\,q\,\hat T_{31}^{V_4'}(v)\hat T_{33}^{V_3}(u), \tag{34g} \]
\[ \varepsilon_2(u-v)\,\hat T_{22}^{V_3}(u)\hat T_{24}^{V_4'}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T_{44}^{V_4'}(v) = \rho(u-v)\,\hat T_{44}^{V_4'}(v)\hat C_2(u). \tag{34h} \]
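The idea is to keep only contributions leading to wanted terms when $\hat\tau^{V_4'}(N|u)$ is applied to the eigenvector (20), and to neglect all others. The set (34) is not complete. For instance in (34a) a term $\propto \hat T_{11}^{V_3}(u)\hat T_{41}^{V_4'}(v)$ and another $\propto \hat T_{21}^{V_3}(u)\hat T_{31}^{V_4'}(v)$ occur. Both will act non-trivially on $|0\rangle_N$ from (31). However in the set (33) the relations
\[ \alpha_0(u-v)\,\hat T_{11}^{V_3}(u)\hat T_{41}^{V_4'}(v) + \delta_1(u-v)\,\hat T_{21}^{V_3}(u)\hat T_{31}^{V_4'}(v) + \zeta_1(u-v)\,\hat C_1(u)\hat T_{11}^{V_4'}(v) = \rho(u-v)\,\hat T_{41}^{V_4'}(v)\hat T_{11}^{V_3}(u) \]
and
\[ \delta_2(u-v)\,\hat T_{11}^{V_3}(u)\hat T_{41}^{V_4'}(v) + \alpha_0(u-v)\,\hat T_{21}^{V_3}(u)\hat T_{31}^{V_4'}(v) - \zeta_1(u-v)\,q^{-1}\,\hat C_1(u)\hat T_{11}^{V_4'}(v) = \rho(u-v)\,\hat T_{31}^{V_4'}(v)\hat T_{21}^{V_3}(u) \]
can be found and used to eliminate these terms, leading to
\[
\begin{aligned}
\left(\beta_0 - \frac{\zeta_1\zeta_2\,[2\alpha_0 + q^{-1}\delta_1 + q\delta_2]}{\alpha_0^2 - \delta_1\delta_2}\right)\!(u-v)\; \hat C_1(u)\hat T_{11}^{V_4'}(v) ={}& \rho(u-v)\,\hat T_{11}^{V_4'}(v)\hat C_1(u) \\
&- \left(\frac{\rho\zeta_2}{\alpha_0}\left[1 + \frac{\delta_2\,[\alpha_0 q + \delta_1]}{\alpha_0^2 - \delta_1\delta_2}\right]\right)\!(u-v)\; \hat T_{41}^{V_4'}(v)\hat T_{11}^{V_3}(u) \\
&+ \left(\frac{\rho\zeta_2\,[\alpha_0 q + \delta_1]}{\alpha_0^2 - \delta_1\delta_2}\right)\!(u-v)\; \hat T_{31}^{V_4'}(v)\hat T_{21}^{V_3}(u),
\end{aligned}
\]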
where the dependence on difference variables has been denoted symbolically for brevity. The last two terms on the right hand side will not lead to a contribution proportional to any BA-eigenvector (20). It has been checked – and this is crucial, that these terms are not related to a proper combination of Cˆ i -operators by unused relations from the set (33). In conclusion they can be identified as leading to unwanted terms. In the same way two other relations from (33) may be used to eliminate from (34e) V0 V0 V3 V3 (u)Tˆ414 (v) and ∝ Tˆ22 (u)Tˆ314 (v), which after omitting contributions leading terms ∝ Tˆ12 to unwanted terms yield the same result with Cˆ 1 (u) replaced by Cˆ 2 (u), i.e.: V0 Tˆ114 (u)Cˆ i (v)
=
! β0 ζ1 ζ2 [2α0 + q −1 δ1 + qδ2 ] (v − u) − ρ ρ[α02 − δ1 δ2 ] V0
× Cˆ i (v)Tˆ114 (u)
± ...
(35a)
for $i = 1, 2$. In (34b) and (34f) terms $\propto \hat T^{V_4'}(v)\hat T^{V_3}_{33}(u)$ and $\propto \hat T^{V_4'}_{24}\hat T^{V_3}_{33}(v)$ can be identified as leading to unwanted terms in the sense explained above and can therefore be neglected:
\[
\hat T^{V_4'}_{22}(u)\,\hat C_i(v) = \frac{-1}{\alpha_0(v-u)}\;\hat C_i(v)\,\hat T^{V_4'}_{22}(u) \pm \dots
\tag{35b}
\]
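for $i = 1, 2$. The other relations from (33) can be treated similarly, leading to
\[
\hat T^{V_4'}_{33}(u)\,\hat C_1(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)(v-u)\;\hat C_1(v)\,\hat T^{V_4'}_{33}(u) \pm \dots,
\tag{35c}
\]
\[
\hat T^{V_4'}_{44}(u)\,\hat C_2(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)(v-u)\;\hat C_2(v)\,\hat T^{V_4'}_{44}(u) \pm \dots,
\tag{35d}
\]
\[
\hat T^{V_4'}_{44}(u)\,\hat C_1(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_1(v)\,\hat T^{V_4'}_{44}(u)
- \left(\frac{\delta_2}{\alpha_0}\,\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_2(v)\,\hat T^{V_4'}_{43}(u) \pm \dots,
\tag{35e}
\]
\[
\hat T^{V_4'}_{33}(u)\,\hat C_2(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_2(v)\,\hat T^{V_4'}_{33}(u)
- \left(\frac{\delta_1}{\alpha_0}\,\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_1(v)\,\hat T^{V_4'}_{43}(u) \pm \dots .
\tag{35f}
\]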
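The denominators $\alpha_0^2-\delta_1\delta_2$ appearing in (35) come from a 2×2 linear elimination. As a sketch for the case leading to (35a), the two relations from (33) quoted before (35a) can be solved for the two operator products that act non-trivially on the vacuum. The abbreviations $A$, $B$, $K$, $X_1$, $X_2$ are introduced here only for illustration: $A := \hat T^{V_3}_{11}(u)\hat T^{V_4'}_{41}(v)$, $B := \hat T^{V_3}_{21}(u)\hat T^{V_4'}_{31}(v)$, $K := \hat C_1(u)\hat T^{V_4'}_{11}(v)$, and $X_1$, $X_2$ denote the right-hand sides of the two relations. All coefficients are scalar functions evaluated at $u-v$.

```latex
\begin{align*}
\alpha_0 A + \delta_1 B &= X_1 - \zeta_1 K, \\
\delta_2 A + \alpha_0 B &= X_2 + q^{-1}\zeta_1 K, \\[4pt]
A &= \frac{\alpha_0\,(X_1 - \zeta_1 K) - \delta_1\,(X_2 + q^{-1}\zeta_1 K)}
          {\alpha_0^2 - \delta_1\delta_2}, \\
B &= \frac{\alpha_0\,(X_2 + q^{-1}\zeta_1 K) - \delta_2\,(X_1 - \zeta_1 K)}
          {\alpha_0^2 - \delta_1\delta_2}.
\end{align*}
```

Substituting $A$ and $B$ back removes both products; the pieces proportional to $K$ shift the coefficient $\beta_0$, while the pieces proportional to $X_1$ and $X_2$ are the terms classified as unwanted.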
Some details of the calculations are given in Appendix C. They are tedious, but straightforward: it is trivial to identify terms proportional to a simple $M = 1$ eigenvector (20), if it is applied. The remaining terms are divided into those which possibly lead to a contribution proportional to an eigenvector via the algebra (34), and others which cannot be transformed this way. The former terms have been eliminated by using convenient relations from (34) and evaluated again, until this procedure terminated, leaving only terms of the latter type, i.e. unwanted terms, which have been neglected systematically in (35). Equations (35e) and (35f) contain non-trivial terms $\propto \hat C_2(v)\hat T^{V_4'}_{43}(u)$ and $\propto \hat C_1(v)\hat T^{V_4'}_{43}(u)$.

Next it is natural to add to (34) the relations involving terms $\propto \hat T^{V_4'}_{34}(u)\hat C_i(v)$ and $\propto \hat T^{V_4'}_{43}(u)\hat C_i(v)$ with $i = 1, 2$, i.e.
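\[
\varepsilon_2(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{24}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{34}(v)
= \alpha_0(u-v)\,\hat T^{V_4'}_{34}(v)\hat C_1(u) + \delta_2(u-v)\,\hat T^{V_4'}_{33}(v)\hat C_2(u) + \zeta_2(u-v)\,\hat T^{V_4'}_{31}(v)\hat T^{V_3}_{33}(u),
\tag{36a}
\]
\[
\varepsilon_2(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_1(u)\hat T^{V_4'}_{43}(v)
= \rho(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_1(u),
\tag{36b}
\]
\[
\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{24}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{34}(v)
= \rho(u-v)\,\hat T^{V_4'}_{34}(v)\hat C_2(u),
\tag{36c}
\]
\[
\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{43}(v)
= \delta_1(u-v)\,\hat T^{V_4'}_{44}(v)\hat C_1(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_2(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{33}(u).
\tag{36d}
\]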
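Proceeding as above leads to
\[
\hat T^{V_4'}_{34}(u)\,\hat C_1(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_1(v)\,\hat T^{V_4'}_{34}(u)
- \left(\frac{\delta_2}{\alpha_0}\,\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_2(v)\,\hat T^{V_4'}_{33}(u) \pm \dots,
\tag{37a}
\]
\[
\hat T^{V_4'}_{43}(u)\,\hat C_2(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_2(v)\,\hat T^{V_4'}_{43}(u)
- \left(\frac{\delta_1}{\alpha_0}\,\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0^2-\delta_1\delta_2}\right)(v-u)\;\hat C_1(v)\,\hat T^{V_4'}_{44}(u) \pm \dots,
\tag{37b}
\]
\[
\hat T^{V_4'}_{43}(u)\,\hat C_1(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)(v-u)\;\hat C_1(v)\,\hat T^{V_4'}_{43}(u) \pm \dots,
\tag{37c}
\]
\[
\hat T^{V_4'}_{34}(u)\,\hat C_2(v) = \left(\frac{\alpha_0\gamma_0-\varepsilon_1\varepsilon_2}{\alpha_0\,\rho}\right)(v-u)\;\hat C_2(v)\,\hat T^{V_4'}_{34}(u) \pm \dots .
\tag{37d}
\]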
This idea is strongly supported by a comparison of (35) and (37) with (21), used in the algebraic diagonalization of $\hat\tau^{V_3}(N|u)$, suggesting that the submatrix $\{\hat T^{V_4'}_{ij}\}$ with $i, j = 3, 4$ will play the same rôle as the submatrix $\{\hat T^{V_3}_{ij}\}$ with $i, j = 1, 2$ in the previous section. Indeed, using the definitions (A1) with primed parameters (5a), (35a) and (35b) can be written
\[
\hat T^{V_4'}_{11}(u)\,\hat C_i(v) = \frac{\sinh(u-v+\eta(C'+2))}{\sinh(u-v-\eta(C'-2))}\;\hat C_i(v)\,\hat T^{V_4'}_{11}(u) \pm \dots,
\tag{38a}
\]
\[
\hat T^{V_4'}_{22}(u)\,\hat C_i(v) = \frac{\sinh(u-v+\eta(C'+2))}{\sinh(u-v-\eta C')}\;\hat C_i(v)\,\hat T^{V_4'}_{22}(u) \pm \dots
\tag{38b}
\]
for $i = 1, 2$, while the remaining equations from (35) and (37) may be noted as
\[
\hat t_{ij}(u)\,\hat C_k(v) = \frac{\sinh(u-v+\eta(C'+2))}{\sinh(u-v-\eta C')} \times \sum_{l,m=1}^{2} r_{lm,jk}(u-v-\eta C')\;\hat C_m(v)\,\hat t_{il}(u) \pm \dots
\tag{38c}
\]
for $i, j, k = 1, 2$, where the elements $r_{ik,jl}(u)$ of the R-matrix (22) and the convenient definition
\[
\begin{pmatrix} \hat t_{11}(u) & \hat t_{12}(u) \\ \hat t_{21}(u) & \hat t_{22}(u) \end{pmatrix}
:=
\begin{pmatrix} \hat T^{V_4'}_{33}(u) & \hat T^{V_4'}_{43}(u) \\ \hat T^{V_4'}_{34}(u) & \hat T^{V_4'}_{44}(u) \end{pmatrix}
\tag{38d}
\]
have been used. The similarity of (38) to (21) is striking and allows one to calculate the eigenvalues of $\hat\tau^{V_4'}(N|u)$ easily. Applying the (right) eigenvector (20) to $\hat T^{V_4'}_{11}(u)$ and $\hat T^{V_4'}_{22}(u)$ using (38) and (31) yields
\[
\hat T^{V_4'}_{11}(u)\,|\lambda_1,\dots,\lambda_M|F\rangle = [\omega_1(u)]^N \prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta(C'-2))} \times |\lambda_1,\dots,\lambda_M|F\rangle \pm \dots
\]
and
\[
\hat T^{V_4'}_{22}(u)\,|\lambda_1,\dots,\lambda_M|F\rangle = [\omega_2(u)]^N \prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta C')} \times |\lambda_1,\dots,\lambda_M|F\rangle \pm \dots,
\]
where unwanted terms have been omitted. Applying it to $[\hat T^{V_4'}_{33}(u) + \hat T^{V_4'}_{44}(u)]$ yields
\[
\begin{aligned}
\left[\hat T^{V_4'}_{33}(u) + \hat T^{V_4'}_{44}(u)\right] |\lambda_1,\dots,\lambda_M|F\rangle
={}& [\omega_3(u)]^N \prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta C')} \\
&\times \left[\hat\tau^{V_2}(M|u-\eta C')\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M} \\
&\times \hat C_{b_1}(\lambda_1)\cdots\hat C_{b_M}(\lambda_M)\,|0\rangle_N \pm \dots,
\end{aligned}
\]
where $\hat\tau^{V_2}(M|u)$ is defined by (22) via (2) with $\delta^{(n)} = \lambda_n$ as in Sect. 3. But $F$ is a (right) eigenvector to $\hat\tau^{V_2}(M|u)$ corresponding to the eigenvalue from (25). The neglected unwanted terms vanish by construction if the supertrace (4h) is performed according to (2c). Therefore the eigenvalue of $\hat\tau^{V_4'}(N|u)$ corresponding to the (right) eigenvector (20) is given by
\[
\begin{aligned}
\Lambda^{V_4'}_N(u;\lambda_1,\dots,\lambda_M|\mu_1,\dots,\mu_m)
={}& [\omega_1(u)]^N \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta(C'-2))}\right) \\
&+ [\omega_2(u)]^N \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta C')}\right) \\
&- [\omega_3(u)]^N \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta C')}\right) \times \Lambda^{V_2}_M(u-\eta C';\mu_1,\dots,\mu_m)
\end{aligned}
\tag{39}
\]
with vacuum eigenvalues $\omega_i(u)$ ($i = 1, 2, 3$) from (30) and $\Lambda^{V_2}_M(u;\dots)$ from (25). The BA-parameters $\lambda_1,\dots,\lambda_M$ and $\mu_1,\dots,\mu_m$ are to be determined by the BA-equations (26). Note that these are necessary and sufficient conditions [7] for analyticity of the eigenvalues (39) in $u$. Since up to now no explicit use has been made of these, this is a valuable consistency check on the validity of (39). Equation (39) is clearly of the expected form (32). It is further obvious that the eigenvalues for every transfer-matrix based on auxiliary space $V_4'$ can be represented by the same formula (30), provided the (global) quantum space is a lowest weight space. Of course the vacuum eigenvalues have to be replaced by new ones, which are obviously restricted by the BA-equations (26), as discussed in Sect. 3. For completeness the trivial generalization [7] of (39) to the inhomogeneous case with $w^{(n)} = C^{(n)}$ in (2a) and $\delta^{(n)} \neq 0$ in (2b) shall be given explicitly:
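\[
\begin{aligned}
\Lambda^{V_4'}_N(u;\lambda_1,\dots,\lambda_M|\mu_1,\dots,\mu_m)
={}& \prod_{n=1}^{N} \frac{\sinh(\eta(C^{(n)}-C')+u-\delta^{(n)})\,\sinh(\eta(C^{(n)}+C')-u+\delta^{(n)})}{\sinh(\eta(C^{(n)}-C'+2)+u-\delta^{(n)})\,\sinh(\eta(C^{(n)}+C'+2)+u-\delta^{(n)})} \\
&\times \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta(C'-2))}\right) \\
&+ \prod_{n=1}^{N} \frac{\sinh(\eta(C^{(n)}+C'+2)-u+\delta^{(n)})}{\sinh(\eta(C^{(n)}+C'+2)+u-\delta^{(n)})}
\times \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta C')}\right) \\
&- \prod_{n=1}^{N} \frac{\sinh(\eta(C'-C^{(n)})-u+\delta^{(n)})}{\sinh(\eta(C^{(n)}+C'+2)+u-\delta^{(n)})} \\
&\quad\times \Bigg[ \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta C')}\right) \times \left(\prod_{\alpha=1}^{m} \frac{\sinh(u-\mu_\alpha-\eta(C'+2))}{\sinh(u-\mu_\alpha-\eta C')}\right) \\
&\qquad + \left(\prod_{i=1}^{M} \frac{\sinh(u-\lambda_i+\eta(C'+2))}{\sinh(u-\lambda_i-\eta(C'-2))}\right) \times \left(\prod_{\alpha=1}^{m} \frac{\sinh(u-\mu_\alpha-\eta(C'-2))}{\sinh(u-\mu_\alpha-\eta C')}\right) \Bigg].
\end{aligned}
\tag{40}
\]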
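Formula (40) is explicit enough to be evaluated numerically. The following is a minimal sketch, not the author's code: the function name, argument names and the restriction to real parameters are assumptions made for illustration, and the last two products over $\lambda_i$ and $\mu_\alpha$ are read as sharing the third vacuum factor, as the structure of (39) suggests. The routine does not check that the BA-parameters satisfy the BA-equations.

```python
import math


def eigenvalue_40(u, lambdas, mus, c_sites, delta_sites, c_aux, eta):
    """Evaluate the right-hand side of (40) for real parameters.

    c_sites[n] and delta_sites[n] play the role of C^(n) and delta^(n);
    c_aux plays the role of C'. All names are illustrative.
    """
    sh = math.sinh
    sites = list(zip(c_sites, delta_sites))

    # Vacuum factors: products over the N sites of the quantum space.
    w1 = math.prod(
        sh(eta * (c - c_aux) + u - d) * sh(eta * (c + c_aux) - u + d)
        / (sh(eta * (c - c_aux + 2) + u - d) * sh(eta * (c + c_aux + 2) + u - d))
        for c, d in sites
    )
    w2 = math.prod(
        sh(eta * (c + c_aux + 2) - u + d) / sh(eta * (c + c_aux + 2) + u - d)
        for c, d in sites
    )
    w3 = math.prod(
        sh(eta * (c_aux - c) - u + d) / sh(eta * (c + c_aux + 2) + u - d)
        for c, d in sites
    )

    # Dressing factors: products over the Bethe-ansatz parameters.
    a = math.prod(
        sh(u - lam + eta * (c_aux + 2)) / sh(u - lam - eta * (c_aux - 2))
        for lam in lambdas
    )
    b = math.prod(
        sh(u - lam + eta * (c_aux + 2)) / sh(u - lam - eta * c_aux)
        for lam in lambdas
    )
    f = math.prod(
        sh(u - mu - eta * (c_aux + 2)) / sh(u - mu - eta * c_aux) for mu in mus
    )
    g = math.prod(
        sh(u - mu - eta * (c_aux - 2)) / sh(u - mu - eta * c_aux) for mu in mus
    )

    return w1 * a + w2 * b - w3 * (b * f + a * g)
```

With empty parameter lists (`lambdas = []`, `mus = []`) all dressing factors equal 1 and the function returns the combination of vacuum factors alone, which is a convenient first check.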
Here the BA-parameters $\lambda_1,\dots,\lambda_M$ and $\mu_1,\dots,\mu_m$ are determined by (28b) and (28c). Equation (40) describes all eigenvalues. As mentioned above, additional eigenvectors to the same eigenvalue (40) are obtained by applying shift operators, corresponding to the representation of the group-symmetry on the (global) quantum space, to the eigenvectors (20). Completeness may be assured by the usual arguments [8].

5. Conclusion

In the previous section $\hat\tau^{V_4'}(N|u)$ has been diagonalized by NABA, combined with analyticity arguments. Obviously the method can be applied to any BA-integrable model, defined by (2), based on an arbitrary, but finite dimensional representation $V'$ of a possibly q-deformed Lie (super-)algebra as auxiliary space. Let the model based on the direct product of a fundamental representation $V$ with itself, here defined by $R^{V_3V_3}(u)$ and (2), be solved by (N)ABA. In order to solve the model under consideration the following scheme may be applied:

1. An auxiliary model based on $V$ as auxiliary and the non-standard representation $V'$ as quantum space may be constructed by standard methods, and its transfer-matrix, i.e. $\hat\tau^{V_3}(N|u)$ from (6) via (2), may be diagonalized, using a (global) lowest or highest weight state, e.g. $|0\rangle_N$ (15), as (pseudo-)vacuum.
2. Vacuum eigenvalues may be calculated trivially, see (30). The transfer-matrix of the relevant model and the one of the auxiliary model commute (3c) and share all BA-eigenvectors, which dictates the form of the eigenvalue equations (32).
3. Mixed FCR (34), between creation-operators from auxiliary model (17), should be used as follows:
   (a) FCRs (34) between diagonal elements of $\hat T^{V'}(N|u)$ and creation-operators multiplied from the right on these (35) should be collected. The remaining terms in these equations are classified as wanted (leading to terms proportional to the known BA-vectors), unwanted (not related to wanted ones by FCRs) and others.
   (b) Terms of the last category have to be eliminated by use of other convenient FCRs. Unwanted terms may be neglected in final equations, i.e. (35).
   (c) Generically the final equations in step (b) involve some off-diagonal elements of $\hat T^{V'}(N|u)$ (35). They have to be complemented by all FCRs containing these off-diagonal elements, multiplied from the right with creation-operators (36), to which the same procedure has to be applied (37).
4. The relations obtained in step 3 allow the calculation of the eigenvalue equations (39), if they are written down conveniently, i.e. like (38).

Steps one and two are trivial here. Step three is crucial. An unusually large number of FCRs (34) has to be used, because the mixed R-matrix (7) does not contain any smaller R-matrix like (22) as a proper submatrix, which was true e.g. for (6). The approach is systematic and avoids a complicated discussion of unwanted terms. The author has checked in a number of cases that these indeed vanish in the present application, but analyticity of the final result (39) is a very strong and usually sufficient test. Step four is simple. Some knowledge of the preceding calculations is a sufficient guideline.
A group theoretical background is not necessary, but helpful. Definitely needed is a commuting (auxiliary) model, algebraically solvable [8], and a unique identification of joint eigenvectors. The theory of quantum groups [3,4,12] provides both. In addition it is implicitly assumed that the algebra defined by the FCRs is complete, i.e. if two operators are identical, this information should be encoded within the FCRs. This is guaranteed if $R$ has the intertwining property [3]. The more complicated problem of handling the full set of commutation relations of comparable complexity directly has been tackled more or less exactly a number of times. The algebraic solution of a statistical covering model for the one-dimensional Hubbard model, where no commuting transfer-matrix is known, was studied by Ramos and Martins [25]. Also a diagonalization of a $Y(sp(2,1))$-symmetrical model by the same authors should be mentioned [26]. To the author's knowledge no systematic scheme is known, and although the eigenvalues are presumably correct, the discussion of unwanted terms is not complete in these works. It is an interesting, but still unsolved question whether solvability of some statistical model by n-fold NABA implies the existence of a commuting transfer-matrix with minimal, that is $(n+1)$-dimensional, auxiliary space.

In the non-graded $U_q(\widehat{gl}'(N|\mathbb{C}))$-symmetric case, the quantum-determinant, introduced by Izergin and Korepin [28] and recognized by Drinfel'd [3] to complete the center of this algebra, provides the possibility to construct functional relations [13] for the eigenvalues, extended to an "analytical Bethe ansatz" by Reshetikhin [29]. This is more elegant than the present approach, but does not generalize to the graded case, because no one-dimensional subspace can be separated from a product of transfer-matrices.
The transfer-matrix $\hat\tau^{V_4'}(N|u)$ has been used mainly for pedagogical reasons. Minus signs due to grading, even in the non-graded version [13], prevent a statistical interpretation. Nevertheless the Hamiltonian limit in the non-difference type spectral parameter (1a), as mentioned above, leads to an additional, unusual Hamiltonian, which will be discussed elsewhere [30]. Note that neither $\hat\tau^{V_3}(N|u)$ nor $\hat\tau^{V_4}(N|u)$ is hermitian, unless further restrictions are imposed on (5a). The diagonalization of $\hat\tau^{V_4}(N|u)$, especially the result (40), may serve as a starting point for calculations on the thermodynamics of these models in the non-linear integral equation approach, pioneered by Klümper [31]. For a recent application of this technique see also [32].

The eigenvalue-equation for the transfer-matrix of some other $U_q(\widehat{gl}'(2,1|\mathbb{C}))$-symmetric models with $V_4'$ as auxiliary and some lowest weight representation as quantum space may be written down by replacing the $\omega_i(u)$ (30) in (39) by new ones. De Vega and Gonzáles Ruiz [33] and Foerster and Karowski [34] generalized the ABA calculations of Schultz [20] partially to non-periodic, integrable boundary conditions. There should be no problem in principle in combining their techniques with the method presented here. Perhaps the most important open question concerns the applicability of the method to models with infinite dimensional auxiliary space, which was excluded here as a precaution.

Acknowledgement. This work has been performed within the research program of the Sonderforschungsbereich 341 (Köln-Aachen-Jülich). The author thanks J. Zittartz and A. Klümper for continuous support, and A. Zvyagin, G. Jüttner, Y. Kato, A. Klümper and especially A. Fujii for stimulating discussions and encouragement. Special thanks go to A. Klümper for carefully reading the manuscript and for numerous useful suggestions, incorporated in the final version. The author would also like to thank a referee for pointing out reference [20] to him.
Appendix A: Coefficients of R-matrices

The elements of the R-matrix (7) are explicitly given by
\[ \rho(u) := \frac{\sinh(\eta(C+2)+u)}{\sinh(\eta(C+2)-u)}, \tag{A1a} \]
\[ \alpha_0(u) := \frac{1}{[C+2]_q}\,\{[C+1]_q\,\rho(u) - 1\}, \tag{A1b} \]
\[ \beta_0(u) := \frac{1}{[C+2]_q}\,\{[2]_q\,\rho(u) - [C]_q\}, \tag{A1c} \]
\[ \gamma_0(u) := \frac{1}{[C+2]_q}\,\{\rho(u) - [C+1]_q\}, \tag{A1d} \]
\[ \delta_1(u) := \frac{1}{[C+2]_q}\,\{\rho(u)\,q^{-C-1} + q\}, \tag{A1e} \]
\[ \delta_2(u) := \frac{1}{[C+2]_q}\,\{\rho(u)\,q^{C+1} + q^{-1}\}, \tag{A1f} \]
\[ \varepsilon_1(u) := \frac{\mu^*}{[C+2]_q}\,\{\rho(u)\,q^{-\frac{C}{2}-1} + q^{\frac{C}{2}+1}\}, \tag{A1g} \]
\[ \varepsilon_2(u) := \frac{\mu}{[C+2]_q}\,\{\rho(u)\,q^{\frac{C}{2}+1} + q^{-\frac{C}{2}-1}\}, \tag{A1h} \]
\[ \zeta_1(u) := \frac{\kappa^*}{[C+2]_q}\,\{\rho(u)\,q^{-\frac{C+1}{2}} + q^{\frac{C+1}{2}}\}, \tag{A1i} \]
\[ \zeta_2(u) := \frac{\kappa}{[C+2]_q}\,\{\rho(u)\,q^{\frac{C+1}{2}} + q^{-\frac{C+1}{2}}\}. \tag{A1j} \]
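As a numerical sanity check on (A1a) and (A1b), the sketch below compares the coefficient $-1/\alpha_0(v-u)$, which is how the factor in (35b) is read here, with the hyperbolic form used in (38b). It assumes the convention $q = e^{2\eta}$ with $[x]_q = (q^x - q^{-x})/(q - q^{-1})$; the actual definition is fixed by (5d), which is not reproduced in this part, so both this convention and the function names are assumptions made for illustration. The parameter `c` stands for the primed parameter $C'$ of (38b).

```python
import math


def q_number(x, eta):
    """[x]_q for the assumed convention q = exp(2*eta)."""
    return math.sinh(2.0 * eta * x) / math.sinh(2.0 * eta)


def rho(u, c, eta):
    """Coefficient (A1a)."""
    return math.sinh(eta * (c + 2.0) + u) / math.sinh(eta * (c + 2.0) - u)


def alpha0(u, c, eta):
    """Coefficient (A1b)."""
    return (q_number(c + 1.0, eta) * rho(u, c, eta) - 1.0) / q_number(c + 2.0, eta)


def coefficient_35b(u, v, c, eta):
    """Coefficient in (35b), read as -1 / alpha0(v - u)."""
    return -1.0 / alpha0(v - u, c, eta)


def coefficient_38b(u, v, c, eta):
    """Coefficient in (38b)."""
    return math.sinh(u - v + eta * (c + 2.0)) / math.sinh(u - v - eta * c)


if __name__ == "__main__":
    u, v, c, eta = 0.3, -0.2, 0.7, 0.4
    print(coefficient_35b(u, v, c, eta), coefficient_38b(u, v, c, eta))
```

Under the stated convention the two printed numbers agree to rounding error, because $\alpha_0(w)$ then reduces to $\sinh(w+\eta C)/\sinh(\eta(C+2)-w)$.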
$f(u)$ and $g(u)$ in (9) are defined by
\[ f(u) = \frac{\sinh(\eta(C+C')+u)}{\sinh(\eta(C+C')-u)}, \tag{A2a} \]
\[ g(u) = \frac{\sinh(\eta(C+C'+2)-u)}{\sinh(\eta(C+C'+2)+u)}. \tag{A2b} \]
Using (5d) and the abbreviations
\[
\begin{aligned}
\alpha &= [C+C']_q, \qquad \beta = [C+C'+1]_q, \qquad \gamma = [C+C'+2]_q, \\
\varepsilon &= [C']_q\, q^{\frac{C+C'}{2}+1} - [C]_q\, q^{-\frac{C+C'}{2}-1}, \\
\eta &= [C]_q\, q^{\frac{C+C'}{2}+1} - [C']_q\, q^{-\frac{C+C'}{2}-1},
\end{aligned}
\]
the remaining coefficients of (9) can be written as: r1 = r10 = r2 = r20 = r3 = r30 = r4 =
r40 =
r5 =
κ ∗ µ∗ κ 0 µ0 0 0 (γ q C+C f (u) + [2]q β + αq −C−C −2 g(u)), αβγ κµκ 0 ∗ µ0 ∗ 0 0 (γ q −C−C f (u) + [2]q β + αq C+C +2 g(u)), αβγ 0 C+C 0 κ ∗κ 0 − C+C 2 2 f (u) + q q , α C+C 0 C+C 0 κκ 0 ∗ q − 2 f (u) + q 2 , α 0 C+C 0 +2 µµ0 ∗ − C+C2 +2 + q 2 g(u) , q γ C+C 0 +2 C+C 0 +2 µ∗ µ0 − 2 g(u) , q 2 +q γ q −1 1+ [C]q [C 0 ]q γf (u) − [C + 1]q β αβγ + [C 0 + 1]q [C + 1]q αg(u) − [C 0 ]q β , q 1+ [C]q [C 0 ]q γf (u) − [C + 1]q β αβγ + [C 0 + 1]q [C + 1]q αg(u) − [C 0 ]q β , 1 [C]q [C + 1]q γf (u) − [2]q [C 0 ]q [C + 1]q β αβγ
+ [C 0 ][C 0 + 1]q αg(u) , 1 [C 0 ]q [C 0 + 1]q γf (u) − [2]q [C]q [C 0 + 1]q β r50 = αβγ + [C][C + 1]q αg(u) , r6 =
C+C 0 +1 1 µ∗ κ 0 [C]q γ q 2 f (u) − βεq 2 αβγ C+C 0 +1 − [C 0 + 1]q αq − 2 g(u) ,
C+C 0 +1 1 κµ0 ∗ [C 0 ]q γ q − 2 f (u) + βεq − 2 αβγ C+C 0 +1 − [C + 1]q αq 2 g(u) , 1 [C 0 ]q − [C]q f (u) , r7 = α 1 0 [C]q − [C 0 ]q f (u) , r7 = α C+C 0 +1 1 κ ∗ µ0 [C 0 ]q γ q 2 f (u) − βηq 2 r8 = αβγ C+C 0 +1 − [C + 1]q αq − 2 g(u) ,
r60 =
C+C 0 +1 1 µκ 0 ∗ [C]q γ q − 2 f (u) + βηq − 2 αβγ C+C 0 +1 − [C 0 + 1]q αq 2 g(u) , 1 [C 0 + 1]q − [C + 1]q g(u) , r9 = γ 1 [C + 1]q − [C 0 + 1]q g(u) , r90 = γ 1 [C]q [C + 1]q β − [C 0 ]q γf (u) r10 = αβγ + [C 0 + 1]q [C 0 ]q β − [C + 1]q αg(u) .
r80 =
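Appendix B: Some Details on ABA

Applying the ansatz (20) to the diagonal elements of $\hat T_{ij}(u)$ using (19) and (16) yields [19],
\[
\begin{aligned}
\left[\hat T_{11}(u) + \hat T_{22}(u)\right] |\lambda_1,\dots,\lambda_M|F\rangle
={}& [\omega_1(u)]^N \prod_{i=1}^{M} \frac{1}{c(u-\lambda_i)}\, \left[\hat\tau^{V_2}(M|u)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M} \\
&\times \hat C_{b_1}(\lambda_1)\cdots\hat C_{b_M}(\lambda_M)\,|0\rangle_N \\
&+ \sum_{k=1}^{M} \left[\check\Lambda^{(1,2)}_k(u;\lambda_1,\dots,\lambda_M)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M}
\times \hat C_{b_k}(u) \prod_{\substack{i=1 \\ i\neq k}}^{M} \hat C_{b_i}(\lambda_i)\,|0\rangle_N,
\end{aligned}
\tag{B1a}
\]
where $\hat\tau^{V_2}(M|u)$ is an inhomogeneous transfer-matrix obtained according to (2) with $\delta^{(n)} = \lambda_n$ from (22), and
\[
\begin{aligned}
\hat T_{33}(u)\,|\lambda_1,\dots,\lambda_M|F\rangle
={}& (-1)^N \prod_{i=1}^{M} \frac{1}{c(u-\lambda_i)}\,|\lambda_1,\dots,\lambda_M|F\rangle \\
&+ \sum_{k=1}^{M} \left[\check\Lambda^{(3)}_k(u;\lambda_1,\dots,\lambda_M)\right]^{b_1,\dots,b_M}_{a_1,\dots,a_M} F^{a_1,\dots,a_M}
\times \hat C_{b_k}(u) \prod_{\substack{i=1 \\ i\neq k}}^{M} \hat C_{b_i}(\lambda_i)\,|0\rangle_N.
\end{aligned}
\tag{B1b}
\]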
The operators Cˆ i (λ1 ) under the products in (B1) are ordered with the index increasing from left to right factors. Note that only the first terms in Eq. (B1) will contribute to the ˇ k are eigenvalue, while the following terms are unwanted. Their coefficients 3 ib1 ,... ,bM h ˇ (1,2) (u; λ1 , . . . , λM ) (B2a) 3 k a1 ,... ,aM
= −[ω1 (λk )]N
b(u − λk ) c(u − λk )
M Y i=1 i6=k
k−1 Y 1 1 c(λk − λi ) d(λj − λk ) j =1
b
[Lˆ cM−1 cM−2 (λk − λM−1 )]aM−1 × [Lˆ cM cM−1 (λk − λM )]baM M−1 M b × · · · × [Lˆ ck+1 ck (λk − λk+1 )]ak+1 k+1 ! k−1 h i Y × δabll δackk δb1k δ1cM + δb2k δ2cM , l=1
where Lˆ ij (u) is an abbreviation for [Lˆ V2 (n|u)]Vij2 , derived from (22) via (2a), and ib1 ,... ,bM h ˇ (3) (u; λ1 , . . . , λM ) (B2b) 3 k a1 ,... ,aM
=
a(λk − u) (−1)N+M c(λk − u)
M Y i=1 i6=k
M Y 1 d(λj − λk ) c(λi − λk )
ib1 ,... ,bk
h
× Sˆk (λ1 , . . . , λk )
a1 ,... ,ak
j =k+1
!
M Y l=k+1
δabll
.
Here a k-particle S-matrix has been defined via [19] h
ib1 ,... ,bk
Sˆk (λ1 , . . . , λk )
a1 ,... ,ak
= δbc1k δackk
k−1 Y i=1
rbi ci ,ai ,ci+1 (λi − λk )
(B2c)
In (B2) summation over repeated indices ci = 1, 2 is implicit. Applying the ansatz (20) to (23) forces the unwanted terms in (B1) to vanish. These equations can be transformed into 6-vertex-type eigenvalue equations (24) in Sect. 3 [19].
Appendix C: Derivation of Commutation Relations

A few more details on the derivation of (35) are given: In (34c) the term $\propto \hat T^{V_3}_{11}(u)\hat T^{V_4'}_{23}(v)$ acts non-trivially according to (16) and (31). It has to be eliminated by use of
\[
\alpha_0(u-v)\,\hat T^{V_3}_{11}(u)\hat T^{V_4'}_{23}(v) + \varepsilon_1(u-v)\,\hat C_1(u)\hat T^{V_4'}_{33}(v) = \rho(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{11}(u)
\]
from (33), which results in (35c). Similarly (34h) can be handled, leading to (35d). In (34d) the term $\propto \hat T^{V_3}_{21}\hat T^{V_4'}_{24}$ has to be eliminated via the relation
\[
\alpha_0(u-v)\,\hat T^{V_3}_{21}(u)\hat T^{V_4'}_{24}(v) + \varepsilon_1(u-v)\,\hat C_1(u)\hat T^{V_4'}_{44}(v)
= \alpha_0(u-v)\,\hat T^{V_4'}_{24}(v)\hat T^{V_3}_{21}(u) + \delta_2(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{22}(u) + \zeta_2(u-v)\,\hat T^{V_4'}_{21}(v)\hat T^{V_3}_{32}(u)
\]
from (33). According to (16) and (31) the term $\propto \hat T^{V_4'}_{43}(v)\hat C_2(u)$ also acts non-trivially on $|0\rangle_N$. It has to be eliminated, using the following relations from set (33),
\[
\varepsilon_2(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v) + \gamma_0(u-v)\,\hat C_2(u)\hat T^{V_4'}_{43}(v)
= \delta_1(u-v)\,\hat T^{V_4'}_{44}(v)\hat C_1(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{43}(v)\hat C_2(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{41}(v)\hat T^{V_3}_{33}(u)
\]
and
\[
\alpha_0(u-v)\,\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v) + \varepsilon_1(u-v)\,\hat C_2(u)\hat T^{V_4'}_{43}(v)
= \delta_1(u-v)\,\hat T^{V_4'}_{24}(v)\hat T^{V_3}_{21}(u) + \alpha_0(u-v)\,\hat T^{V_4'}_{23}(v)\hat T^{V_3}_{22}(u) + \zeta_2(u-v)\,q\,\hat T^{V_4'}_{21}(v)\hat T^{V_3}_{23}(u).
\]
Both relations have to be used in order to prevent the appearance of $\hat T^{V_3}_{22}(u)\hat T^{V_4'}_{23}(v)$, also acting non-trivially on $|0\rangle_N$, in the result (35e). Applying the same procedure to (34g) leads to (35f).
References

1. Yang, C.N.: Phys. Rev. Lett. 19, 1312–1315 (1967); Yang, C.N.: Phys. Rev. 168, 1920–1923 (1968)
2. Baxter, R.J.: Ann. Phys. 70, 323–337 (1972)
3. Drinfel'd, V.G.: Quantum Groups. In: Proceedings of the International Congress of Mathematicians, Berkeley, 1986
4. Chari, V. and Pressley, A.: A Guide to Quantum Groups. New York: Cambridge University Press, 1994
5. Jimbo, M.: Commun. Math. Phys. 102, 537–547 (1986)
6. Ma, Z.-Q.: Yang–Baxter Equation and Quantum Enveloping Algebras. Singapore: World Scientific, 1993
7. Baxter, R.J.: Exactly Solved Models in Statistical Mechanics. London: Academic Press, 1982
8. Sklyanin, E.K., Takhtajan, L.A. and Faddeev, L.D.: Theoret. Math. Phys. 40, 688–706 (1980); Takhtajan, L.A. and Faddeev, L.D.: Russ. Math. Surv. 34, 11–68 (1979); Faddeev, L.D.: Soviet Scientific Reviews C 1, 107–155 (1980); Takhtajan, L.A.: Introduction to Algebraic Bethe Ansatz. In: B.S. Shastry, S.S. Jha and V. Singh (eds.): Exactly Solvable Problems in Condensed Matter and Field Theory. Lecture Notes in Physics 242, Berlin–Heidelberg: Springer, 1985, pp. 175–220
9. Korepin, V.E., Bogoliubov, N.N. and Izergin, A.G.: Quantum Inverse Scattering Method and Correlation Functions. New York: Cambridge University Press, 1993
10. Jimbo, M. (ed.): Yang–Baxter Equation in Integrable Systems. Singapore: World Scientific, 1989
11. Bethe, H.A.: Z. Physik 71, 205–226 (1931)
12. Yamane, H.: Preprint q-alg/9603015 (1996)
13. Kulish, P.P. and Sklyanin, E.K.: J. Soviet. Math. 19, 1596–1620 (1982)
14. Cornwell, J.F.: Group Theory in Physics, Vol. 3: Supersymmetry and Infinite Dimensional Algebras. London: Academic Press, 1989
15. Kac, V.G.: Infinite Dimensional Lie Algebras. 3rd ed., New York: Cambridge University Press, 1990
16. Perk, J.H.H. and Schultz, C.L.: Phys. Lett. 84 A, 407–410 (1981)
17. Gruneberg, J.: To be published
18. Gould, M.D., Hibberd, K.E., Links, J.R. and Zhang, Y.-Z.: Phys. Lett. A 212, 156–160 (1995)
19. Kulish, P.P. and Reshetikhin, N.Y.: JETP 80, 158–183 (1981)
20. Schultz, C.L.: Physica A 122, 71–88 (1983)
21. Gaudin, M.: Phys. Lett. A 24, 55–56 (1967)
22. Takhtajan, L.A.: LOMI-Proceedings 101, 158–183 (1980)
23. Lai, L.A.: J. Math. Phys. 15, 1675–1676 (1974)
24. Sutherland, B.: Phys. Rev. B 12, 3795–3805 (1975)
25. Ramos and Martins: J. Phys. A 30, L195 (1997)
26. Ramos and Martins: Nucl. Phys. B 474, 678–714 (1996)
27. Kulish, P.P. and Sklyanin, E.K.: Quantum Spectral Transform Method – Recent Developments. In: J. Hietarinta and C. Montonen (eds.): Integrable Quantum Field Theories. Lecture Notes in Physics 151, Berlin–Heidelberg: Springer, 1981, pp. 61–119
28. Izergin, A.G. and Korepin, V.E.: Sov. Phys. Dokl. 26, 653–654 (1981)
29. Reshetikhin, N.Y.: Sov. Phys. JETP 57, 691–696 (1983)
30. Gruneberg, J.: To be published
31. Klümper, A.: Ann. Physik 1, 540 (1992); Klümper, A.: Z. Phys. B 91, 507 (1993)
32. Jüttner, G., Klümper, A. and Suzuki, J.: Nucl. Phys. 487, 471–502 (1998)
33. De Vega, H.J. and Gonzáles-Ruiz, A.: Nucl. Phys. B 417, 553–578 (1994); Gonzáles-Ruiz, A.: Nucl. Phys. B 424, 468–486
34. Foerster, A. and Karowski, M.: Nucl. Phys. B 408, 512–534 (1993)

Communicated by T. Miwa
Commun. Math. Phys. 206, 409 – 428 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Small-Scale Mass Concentration of Modes

John A. Toth*

Department of Mathematics and Statistics, McGill University, Montreal, Quebec, Canada H3A 2K6. E-mail: [email protected]

Received: 1 October 1998 / Accepted: 1 April 1999

* Supported in part by NSERC Grant OGP0170280 and FCAR Grant NC-1520

Abstract: Let $P_1,\dots,P_d$ be commuting, jointly-elliptic, $\hbar$-pseudodifferential operators on a compact manifold, $X$, of dimension $n \ge d$. Suppose $\gamma$ is the $\omega$-limit set of the bicharacteristic flow of the classical Hamiltonian, $p_1$, restricted to the variety, $\Sigma_E = \{(x,\xi)\in T^*X;\ p_1(x,\xi)-E_1 = \dots = p_d(x,\xi)-E_d = 0\}$. We discuss the corresponding concentration of mass as $\hbar \to 0$ for a subsequence of joint eigenfunctions of the $P_j$'s with eigenvalues sufficiently close to $(E_1,\dots,E_d)$.

1. Introduction

Let $X$ be a compact, $C^\infty$ Riemannian manifold of dimension $n$, and $P_1(x,\hbar D_x),\dots,P_d(x,\hbar D_x)$ functionally independent, jointly-elliptic, classical, self-adjoint $\hbar$-pseudodifferential operators of order $m$ with $1 \le d \le n$. For simplicity of notation, we will denote the corresponding $\hbar$-principal symbols by $p_1,\dots,p_d$. As a matter of convention, we will refer to $H := p_1$ as the classical Hamiltonian and will also assume that $[P_i,P_j] = 0$ for all $1 \le i,j \le d$. When $d = n$, this system is said to be quantum integrable. There is a rather rich class of examples of this sort, including many classically integrable systems such as the Euler top and geodesic flow on a quadric surface, among others (see [T1]). Consider the variety:
\[ \Sigma_E = \{(x,\xi)\in T^*X;\ p_1(x,\xi)-E_1 = \dots = p_d(x,\xi)-E_d = 0\}. \]
For the purposes of this paper, the energy values $(E_1,\dots,E_n)$ of interest tend to be singular (see Sect. 4) and thus, the variety $\Sigma_E$ is, generally speaking, not a manifold. Let $\psi_j$ be $L^2$-normalized, joint eigenfunctions of $P_1,\dots,P_d$ satisfying:
\[ P_k\psi_j = E_k(\hbar)\,\psi_j, \]
where
\[ E_k(\hbar) = E_k + O(\hbar^{\delta_1}). \]
Here, $0 < \delta_1 < 1$ and $k = 1,\dots,d$. Let $\gamma \subset \Sigma_E$ be a smooth, compact, embedded submanifold of $T^*X$. Given $(x,\xi)\in\Sigma_E$, suppose that the bicharacteristic curves $\phi_t(x,\xi) = \exp t\,\Xi_{p_1}(x,\xi)$ of the Hamilton vector field,
\[ \Xi_{p_1} = \sum_j \frac{\partial p_1}{\partial\xi_j}\frac{\partial}{\partial x_j} - \frac{\partial p_1}{\partial x_j}\frac{\partial}{\partial\xi_j}, \]
converge to $\gamma \subset \Sigma_E$ as $t \to \infty$. Our main result is a quantum analogue of this classical phenomenon; we show that there is a concentration of $L^2$ mass of the $\psi_j$'s in a tubular neighbourhood of $\gamma$ corresponding to the classical convergence of the bicharacteristics on $\Sigma_E$. Although such results are known in some specific instances, for example, one-dimensional Schrödinger operators with non-degenerate potential maxima (see [B, CP, T2] and Sect. 4), very little seems to be known in the general case. We will show (see Theorem 1) that, under a rather general assumption on the rate of classical convergence of the bicharacteristics on $\Sigma_E$ (see (H1) below), there is an analogous concentration estimate for the corresponding eigenfunctions, $\psi_j$.

The plan of the paper is as follows: At the end of Sect. 1, we give a precise statement of our main result. In Sect. 2, we give a proof of Theorem 1. In Sects. 3 and 4 we give some applications of this analysis. For the proof in Sect. 2, we will first obtain an estimate on the microlocal concentration of eigenfunctions $\psi_j$ associated with eigenvalues, $E_k(\hbar)$, satisfying $|E_k(\hbar) - E_k| = O(\hbar^{\delta_1})$ for $0 < \delta_1 \le 1$ and $k = 1,\dots,d$. This result (see Proposition 1) basically says that the mass of such an eigenfunction is concentrated in a tube, $\Omega_E$, of radius $O(\hbar^{\delta_1/2})$ about the characteristic variety, $\Sigma_E$. Next, we apply the semiclassical, time-dependent, Egorov Theorem (see Proposition 2) with time $t \sim \log(1/\hbar)$ to transport eigenfunction mass into a tube of width $O(\hbar^{\epsilon})$ about the limit set. Here, $\epsilon > 0$ is determined by $\delta_1$, a number of derivatives of certain symbols, and the time-dependent divergence rate of the classical flow. This enables us to prove Theorem 1 (see below) which shows that for suitable $\epsilon > 0$ and $\hbar$ sufficiently small, the $L^2$ mass of the $\psi_j$'s inside a tube of radius $\hbar^{\epsilon}$ around the $\omega$-limit set $\gamma$ is at least as large as the mass outside a tube of radius $2\hbar^{\epsilon}$.
It is important to note that, although the bicharacteristic curves on the variety, $\Sigma_E$, all converge to $\gamma$, for points $z \notin \Sigma_E$, the bicharacteristic emanating from $z$ need not converge to $\gamma$ as $t \to \infty$. For instance, in the example of the one-dimensional, periodic, Schrödinger operator with two non-degenerate potential maxima (see Sect. 4), the $\omega$-limit set of bicharacteristics on $\Sigma$ consists of two hyperbolic fixed points. This is a manifestation of the fact that the stable manifold of the first critical point is the unstable manifold of the second (and vice versa). However, nearby bicharacteristics trace out closed ovals in a periodic motion with no nice limiting behaviour. The important point here is that, since for $\hbar$ small, $\hbar^{\delta_1/2}$ 0. This is consistent with Theorem 1. Finally, in Appendix A we establish the existence of eigenvalues satisfying the hypotheses in Theorem 1 (provided $0 \le \delta_1 < 1/2$). In fact, we give a lower bound for the semiclassical spectral counting function for such $\delta_1$ under rather weak assumptions on the singularities of the variety, $\Sigma_E$.

We will now give a precise statement of our main result. To simplify the writing, we will henceforth fix the rate function:
\[ m(t) = e^{-|t|}. \tag{1} \]
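The exponential rate $m(t) = e^{-|t|}$ can be seen explicitly in a simple one-dimensional model. The sketch below is an illustration only, not one of the examples treated in Sect. 4: it assumes the toy Hamiltonian $H(x,\xi) = \xi^2/2 + \cos x$, whose potential has a non-degenerate maximum at $x = 0$, and uses the closed-form separatrix $\tan(x/4) = \tan(x_0/4)\,e^{-t}$, $\xi = -2\sin(x/2)$ on the energy level $H = 1$. The function names and the Euclidean distance on the $(x,\xi)$ plane are choices made here.

```python
import math


def separatrix_point(t, x0=math.pi):
    """Point (x, xi) at time t on the separatrix of H = xi^2/2 + cos(x), H = 1.

    The curve starts at x0 in (0, 2*pi) and flows into the hyperbolic
    fixed point (0, 0) as t -> infinity.
    """
    x = 4.0 * math.atan(math.tan(x0 / 4.0) * math.exp(-t))
    xi = -2.0 * math.sin(x / 2.0)
    return x, xi


def energy(x, xi):
    """The toy Hamiltonian H(x, xi) = xi^2/2 + cos(x)."""
    return 0.5 * xi * xi + math.cos(x)


def distance_to_fixed_point(t, x0=math.pi):
    """Euclidean distance from the separatrix point to the fixed point (0, 0)."""
    x, xi = separatrix_point(t, x0)
    return math.hypot(x, xi)


if __name__ == "__main__":
    for t in (0.0, 2.0, 5.0, 10.0):
        d = distance_to_fixed_point(t)
        # d * exp(t) stays bounded, i.e. d(phi_t(z), gamma) = O(m(t)).
        print(t, d, d * math.exp(t))
```

In this model the rescaled distance $d\,e^{t}$ tends to the constant $4\sqrt{2}\tan(x_0/4)$, so the convergence to the limit set is exactly of order $m(t)$.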
However, our results do generalize to include a wider class of rate functions, $m(t)$ (see the remark after Theorem 1). In applying our result in these more general cases, $\hbar^{\epsilon}$ should be replaced with the more cumbersome notation, $m(\epsilon\log\hbar)$.

Let $0 \le \chi(s) \in C_0^\infty(\mathbb{R})$ be a cutoff function which is identically 1 in the interval $[-1,1]$ and vanishes for $|s| \ge 2$. Given $0 \le \epsilon < 1/2$, we define
\[ \zeta_\epsilon(x,\xi;\hbar) := \chi\big(\hbar^{-2\epsilon}\, d^2((x,\xi),\gamma)\big), \]
where $d(\cdot,\cdot)$ denotes a fixed distance function on $T^*X$. Note that, since we are assuming that $\gamma \subset \Sigma_E$ is an embedded submanifold of $T^*X$, it follows that, for $\hbar$ sufficiently small, $\zeta_\epsilon \in C_0^\infty$. Similarly, we let $0 \le \tilde\chi(s) \in C_0^\infty(\mathbb{R})$ be a cutoff which is identically 1 on the interval $[-2,2]$ and vanishes for $|s| \ge 3$. Given $0 \le \epsilon < 1/2$, we define
\[ \tilde\zeta_\epsilon(x,\xi;\hbar) := \tilde\chi\big(\hbar^{-2\epsilon}\, d^2((x,\xi),\gamma)\big). \]
We will also need to fix the respective tubular neighbourhoods,
\[ \Gamma_\epsilon = \{(x,\xi)\in T^*X;\ d^2((x,\xi),\gamma) < \hbar^{2\epsilon}\} \]
and
\[ \tilde\Gamma_\epsilon = \{(x,\xi)\in T^*X;\ d^2((x,\xi),\gamma) < 2\hbar^{2\epsilon}\}. \]
Moreover, we will define the neighbourhood, $\Omega_E$, of $\Sigma_E$ by
\[ \Omega_E := \{(x,\xi)\in T^*X;\ p(x,\xi) \le \hbar^{\delta_1}\}, \]
where
\[ p(x,\xi) := \sum_{j=1}^{d} (p_j(x,\xi) - E_j)^2. \]
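A cutoff with the properties required of $\chi$ can be written down concretely. The following sketch uses the standard $e^{-1/s}$ construction of a smooth step; this particular choice of $\chi$, the function names, and the treatment of the distance $d$ as a plain number are assumptions made for illustration, since the text only requires some smooth cutoff equal to 1 on $[-1,1]$ and vanishing for $|s| \ge 2$.

```python
import math


def smooth_step(s):
    """C-infinity step: 0 for s <= 0, 1 for s >= 1, strictly between otherwise."""
    if s <= 0.0:
        return 0.0
    if s >= 1.0:
        return 1.0
    f = math.exp(-1.0 / s)
    g = math.exp(-1.0 / (1.0 - s))
    return f / (f + g)


def chi(s):
    """Smooth cutoff: identically 1 on [-1, 1], identically 0 for |s| >= 2."""
    return smooth_step(2.0 - abs(s))


def zeta(dist, hbar, eps):
    """Small-scale cutoff chi(hbar^(-2*eps) * dist^2) for a given distance to gamma."""
    return chi(hbar ** (-2.0 * eps) * dist * dist)


if __name__ == "__main__":
    hbar, eps = 1e-4, 0.25  # tube radius hbar**eps is about 0.1
    for dist in (0.0, 0.05, 0.12, 0.2):
        print(dist, zeta(dist, hbar, eps))
```

The cutoff equals 1 on the tube of radius $\hbar^{\epsilon}$ around $\gamma$ and vanishes once $d^2 \ge 2\hbar^{2\epsilon}$, which is the scale on which the symbol calculus of Sect. 2 has to operate.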
Suppose that the following hypotheses are satisfied (see also Sect. 4): (H1) Assume that for h¯ sufficiently small,
m(t) d(φt (x, ξ ), γ ) = O h¯ as t → ∞, uniformly for (x, ξ ) ∈ 6E − 0˜ .
(H2) Given $z \in \Omega_E - \tilde\Gamma_\epsilon$, there exists $z_0 \in \Sigma_E - \tilde\Gamma_\epsilon$ such that for $\hbar$ sufficiently small,
\[ d(z,z_0) = O\big(\hbar^{\frac{\delta_1}{2} - f(\epsilon)}\big) \]
uniformly for all such $z$. Here, $f \in C^0(\mathbb{R},\mathbb{R}^+)$ and $f(0) = 0$.

The motivation behind (H2) is roughly as follows: If $\Sigma_E$ is smooth outside $\gamma$ and the differentials $dp_j$; $j = 1,\dots,d$ are linearly independent, we can choose the $p_j$'s as coordinates in $\Omega_E$. However, in a shrinking tubular neighbourhood around $\gamma$ of radius $O(\hbar^{\epsilon})$, the gradients, $\nabla p_j$, might degenerate (i.e. become dependent) as we approach $\gamma$. Condition (H2) roughly says that this degeneration occurs at a polynomial rate, $\hbar^{-f(\epsilon)}$ (see also Sect. 4).

In Sect. 2, we show that given hypotheses (H1) and (H2), $0 < \delta_1 \le 1$ and $\psi_j$ as above, there exist $\epsilon > 0$ and $\kappa = \kappa(\epsilon) > 0$ such that
\[ \big([1 - Op^F_\hbar(\tilde\zeta_\epsilon)]\psi_j, \psi_j\big) \le \big(Op^F_\hbar(\zeta_\epsilon)\psi_j, \psi_j\big) + O(\hbar^{\kappa}). \tag{$*$} \]
Here, $Op^F_\hbar(a)$ denotes a semiclassical anti-Wick pseudodifferential operator associated with $a(x,\xi)$ (see Sect. 1). In the course of our proof, we will give estimates for $\epsilon$ and consequently, the error term $\kappa > 0$ in terms of dynamical constants and a finite number of derivatives of symbols. To give a statement of Theorem 1, we let $\pi : T^*X \to X$ denote the canonical cotangent projection map. Then, as a consequence of the estimate ($*$), we obtain:

Theorem 1. Let $\psi_j$ be as above and assume that conditions (H1) and (H2) are satisfied. Then, there exist $\epsilon > 0$ and $\kappa(\epsilon) > 0$ such that for $\hbar > 0$ sufficiently small,
\[ \int_{X - \pi(\tilde\Gamma_\epsilon)} |\psi_j|^2\,dx \le \int_{\pi(\Gamma_\epsilon)} |\psi_j|^2\,dx + O(\hbar^{\kappa}). \]
Thus, the mass of $\psi_j$ is concentrated in a semiclassically shrinking tubular neighbourhood of radius $\hbar^{\epsilon}$ around $\pi(\gamma)$.

Remark. Egorov's Theorem with time $t = -\delta\log\hbar$ also plays an important role in Zelditch's paper [Z] on the rate of quantum ergodicity, as well as Volovoy's paper [Vo] on the error term in Weyl's law.

2. Microlocalization near $\Sigma_E$

Let $P_1(x,\hbar D_x),\dots,P_d(x,\hbar D_x)$ be self-adjoint, elliptic, classical $\hbar$-pseudodifferential operators of order $m$. This means that the respective symbols $p_1,\dots,p_d$ are required to have asymptotic expansions:
\[ p_k(x,\xi) \sim \hbar^{m} \sum_{j=0}^{\infty} p_{k,j}(x,\xi)\,\hbar^{j}, \]
where, for $k = 1,\dots,d$ and $j \ge 0$,
\[ |\partial_x^\alpha\partial_\xi^\beta p_{kj}(x,\xi)| \le C_{\alpha,\beta}\,\langle\xi\rangle^{m-j-|\beta|}. \]
Small-Scale Mass Concentration of Modes
413
1
Here, hξ i := (1 + |ξ |2 ) 2 and pk,0 (x, ξ ) ≥ C1 hξ i for |ξ | ≥ C1 . Henceforth, to simplify notation, we shall denote the h¯ -principal symbols by p1 , . . . , pd . Since we will be working with small-scale cutoff functions like ζ , we now recall the main properties of the corresponding h¯ -pseudodifferential calculus. For further details, we refer the reader to [Sj1]. Let be an open set in Rn and recall that m(t) = e−|t| . For 0 ≤ < 21 , let Sm ( × Rn ) = h¯ −m S0 ( × Rn ), where the latter is defined as follows: a ∈ S0 provided β
S0 := {a ∈ C ∞ ( × Rn ); |∂xα ∂ξ a(x, ξ ; h¯ )| ≤ Cα,β h¯ −(|α|+|β|) , ∀(x, ξ ) ∈ × Rn }. We will denote by Oph¯ (a) the corresponding h¯ -Kohn–Nirenberg quantization, given locally by the integral operator: Z −n ei(x−y)ξ/h¯ a(x, ξ ; h¯ )u(y)dydξ. (Oph¯ (a)u)(x) = (2π h¯ ) Such operators form a calculus [Sj1] with the usual symbolic composition formula: If c(x, h¯ Dx ) = a(x, h¯ Dx ) · b(x, h¯ Dx ) with a, b ∈ S0 , then c(x, ξ ; h¯ ) ∼
∞ X
∂xα a(x, ξ ; h¯ ) · Dξα b(x, ξ ; h¯ )
0+|β|=k
h¯ α . α!
(2)
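Note that the semiclassical Calderon–Vaillancourt Theorem also holds for operators in this calculus: If $a \in S^0_\epsilon$ with $0 \le \epsilon < \frac{1}{2}$, then $Op_\hbar(a) : L^2(X) \to L^2(X)$ is uniformly bounded in $\hbar$ with:
\[ \|Op_\hbar(a)\|_{(0)} \le C(n) \sum_{|\alpha|+|\beta| \le 2n+1} \hbar^{\frac{|\alpha|+|\beta|}{2}} \sup |\partial_x^\alpha \partial_\xi^\beta a(x, \xi; \hbar)|. \tag{3} \]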
Our first order of business will be an estimate on the microlocalization of the $\psi_j$ near the variety, $\Sigma_E$. For this, we shall adapt the simple and elegant argument in [Sj1] involving commutators and resolvent estimates to the case at hand. To begin, let $\chi \in C_0^\infty(\mathbb{R})$ be a cutoff function identically equal to 1 in $[-a, a]$ and vanishing for $|s| \ge 2a$, where $a > 0$. We define
\[ p(x, \xi) := \sum_{j=1}^{d} (p_j(x, \xi) - E_j)^2. \tag{4} \]
Let $0 < \delta_1 \le 1$ and consider the following symbol:
\[ \chi^{\delta_1}(x, \xi; \hbar) := \chi(\hbar^{-\delta_1} p(x, \xi)). \tag{5} \]
It is clear that $Op_\hbar(\chi^{\delta_1}) \in Op(S^0_{\delta_1/2})$, with $m(t) = e^{-|t|}$. Let $\psi_j \in C^\infty(X)$ be an $L^2$-normalized joint eigenfunction of $P_1, \dots, P_d$, satisfying $P_k \psi_j = E_k \psi_j + O(\hbar^{\delta_1})\psi_j$ for $k = 1, \dots, d$. Following [Sj1], we choose the support of $\chi$ large enough so that
\[ p(x, \xi) + \chi(\hbar^{-\delta_1} p(x, \xi)) \ge (1 + 1/C)\hbar^{\delta_1} \]
for some $C > 0$ and $\hbar$ sufficiently small. Thus, the operator $\tilde{P} = P + Op_\hbar(\chi^{\delta_1})$ satisfies $\tilde{P} \ge (1 + 1/2C)\hbar^{\delta_1}$. Finally, let $\tilde{\chi}$ be another cutoff function which is identically 1 on the support of $\chi$ and define $\tilde{\chi}^{\delta_1}(x, \xi; \hbar) := \tilde{\chi}(\hbar^{-\delta_1} p(x, \xi))$.

Proposition 1. Let $\psi_j$ be an $L^2$-normalized joint eigenfunction as above. Then, $\|(1 - Op_\hbar(\tilde{\chi}^{\delta_1}))\psi_j\| = O(\hbar^\infty)$.

Proof. Modulo the fact that we work with a cutoff function that is localized about an arbitrary energy level set rather than ground state, the proof follows as in [Sj1]. For the sake of completeness, we will sketch the argument. Consider the perturbed sum of squares operator:
\[ \tilde{P}(x, \hbar D_x) = \sum_{j=1}^{d} (P_j(x, \hbar D_x) - E_j)^2 + Op_\hbar(\chi^{\delta_1}). \]
The point of working with such an operator is that $p$ vanishes to second order on the variety, $\Sigma_E$. This is the important point that enables one to estimate commutators. For the remainder of the proof, we drop the superscript $\delta_1$ and denote both the symbol and corresponding operator by $\chi$ when the context is clear. Start with a nested sequence of cutoff functions $\chi = \chi_0, \chi_1, \chi_2, \dots, \chi_{N-1}, \chi_N = \tilde{\chi}$, with the property that $\chi_j$ is 1 near the support of $\chi_{j-1}$ for all $j = 1, \dots, N$. By the symbolic composition formula, together with Calderon–Vaillancourt (3), it follows that (i) $\|[\chi_j, \chi_k]\| = O(\hbar^\infty)$ and (ii) $\|\chi_j(1 - \chi_k)\| = O(\hbar^\infty)$ for $k > j$. Using (i), (ii) and the commutator identity
\[ [(1 - \chi_j), (\tilde{P} - \lambda)^{-1}] = (\tilde{P} - \lambda)^{-1} [\chi_j, \tilde{P}] (\tilde{P} - \lambda)^{-1}, \]
we obtain, by an iteration argument, the following estimate:
\[ (1 - \tilde{\chi})\psi_j = (\tilde{P} - \lambda)^{-1} [\tilde{\chi}, P] (\tilde{P} - \lambda)^{-1} \cdots (\tilde{P} - \lambda)^{-1} [\chi_1, P] (\tilde{P} - \lambda)^{-1} \chi\psi_j + O(\hbar^\infty). \]
Finally, to estimate the commutators $[\chi_j, P]$ in $L^2$, use the symbolic expansion of $\sigma([\chi_j, P])$ together with the fact that $|\nabla p(x, \xi)|^2 \le C p(x, \xi)$ near $\Sigma_E$ to conclude that $\|[\chi_j, P]\| = O(\hbar)$. Since $N > 0$ can be chosen arbitrarily large, we are done. □

The next step in the proof of Theorem 1 is the time-dependent, semiclassical Egorov Theorem ([PU, Z, Vo]). Let $\phi_t : T^*X \to T^*X$ denote time $t$ bicharacteristic flow for $H(x, \xi)$. Then, in terms of local canonical coordinates on $T^*X$, we will write $\phi_t(x, \xi) = ((\phi_t)_1, \dots, (\phi_t)_{2n})$. We begin with the following elementary lemma:

Lemma 1. There exists a constant $C_k > 0$ independent of $t$ such that
\[ |\partial_x^\alpha \partial_\xi^\beta (\phi_t)_j(x, \xi)| \le \exp(C_k(|\alpha| + |\beta|)|t|) \]
locally uniformly for $(x, \xi) \in T^*X$, for all $1 \le j \le 2n$ and $0 \le |\alpha| + |\beta| \le k$.
Proof. This inequality follows from the group law $\phi_{t_1+t_2} = \phi_{t_2} \cdot \phi_{t_1}$ together with the chain rule and an iteration argument. □

Recall, $P_1$ is assumed to be a classical, self-adjoint, $\hbar$-pseudodifferential operator of order zero. It is then well-known that $U(t) = e^{itP_1/\hbar}$, the corresponding solution operator of the time-dependent Schrödinger equation,
\[ -i\hbar \frac{\partial}{\partial t} U(t) - P_1 U(t) = 0, \qquad U(0) = Id, \]
is an $\hbar$-Fourier integral operator [PU]. A principal ingredient in our argument is the following semiclassical analogue of the standard energy estimate ([Ta], Sect. 2.2) for strictly hyperbolic equations:

Lemma 2. Let $Q \in Op_\hbar(S^0_\epsilon)$ with $\|Q - Q^*\| = O(\hbar)$ in $L^2$ and suppose that $u(x, t)$ solves the initial value problem:
\[ i\hbar \frac{\partial u}{\partial t} + Qu = r, \qquad u(x, 0) = u_0(x). \]
Then, there exists a constant $C_1 > 0$ such that: $\|u(x, t)\| \le \hbar^{-1} e^{C_1|t|} (\|u_0\| + \|r\|)$.

Proof. Let $u(x, t)$ be the requisite solution. Then, $\partial_t(u, u) = (\partial_t u, u) + (u, \partial_t u) = (i\hbar^{-1} Qu - i\hbar^{-1} r, u) + (u, i\hbar^{-1} Qu - i\hbar^{-1} r) = 2$ [...] $0$ such that for $0 < \hbar \le \hbar_0$ and $t \in \mathbb{R}$,
\[ e^{-itP_1/\hbar} \cdot Q \cdot e^{itP_1/\hbar} = Op_\hbar(\exp t\Xi^*_{p_1} q_0) + K(t; \hbar), \]
where $\|K(t; \hbar)\| \le \hbar^{1-2\epsilon} e^{C_2|t|}$.
Proof. In the following, we work locally and will denote the total symbol of $Q$ by $q_0(x, \xi; \hbar)$. So,
\[ Q(x, y) = (2\pi\hbar)^{-n} \int e^{i(x-y)\xi/\hbar} q_0(x, \xi; \hbar) \, d\xi \]
and we will denote the conjugated operator, $e^{-itP_1/\hbar} Q e^{itP_1/\hbar}$, by $Q_t$. Since we are only interested in Egorov's theorem per se, following [Ta], it will be convenient to work with the induced equation for $Q_t$:
\[ \hbar \frac{\partial}{\partial t} Q_t = i[P_1, Q_t]. \tag{7} \]
As usual, the idea is to construct an approximate solution, $A_t$, to (7) with error $R_t$ and then estimate the difference $\|Q_t - A_t\|$ using Lemma 2. Given
\[ A_t(x, y) = (2\pi\hbar)^{-n} \int e^{i(x-y)\xi/\hbar} a_t(x, \xi; \hbar) \, d\xi, \]
it follows that $a_t$ must solve the initial value problem:
\[ \frac{\partial}{\partial t} a_t = \{p, a_t\}, \qquad a_t|_{t=0} = q_0. \tag{8} \]
The solution to (8) is $a_t(x, \xi) = q_0(\exp t\Xi_{p_1}(x, \xi))$. For our purposes, it suffices to stop the symbolic manipulations at this stage. As a consequence, we put $A_t = Op_\hbar(\exp t\Xi^*_{p_1} q_0)$ and claim that there exists a constant $C > 0$ such that
\[ \|R(t; \hbar)\| = O(\hbar^{2-2\epsilon}) e^{C|t|}. \tag{9} \]
To prove (9), consider the total symbol $\sigma(x, y, \xi; t, \hbar)$ of the commutator $[P_1, A_t]$. By a standard Taylor expansion and integration by parts argument ([Sh]), one obtains the usual formula for the associated semiclassical Kohn–Nirenberg symbol,
\[ \sigma(x, y, \xi; t, \hbar) = \sum_{|\alpha|=1}^{K} \frac{\hbar^{|\alpha|}}{\alpha!} (\partial_\xi^\alpha p \cdot D_x^\alpha a_t - \partial_\xi^\alpha a_t \cdot D_x^\alpha p) + e(x, y, \xi; t, \hbar), \tag{10} \]
where $e(x, y, \xi; t, \hbar) = O(\hbar^K)$ and depends only on derivatives of $a_t$ and $p$ of order $K + 1$. By choosing $K$ sufficiently large and taking into account Lemma 1, we get $\|Op_\hbar(e)\| = O(\hbar^N) e^{C|t|}$ for any $N > 0$. As far as the first term on the RHS of (10) goes, its principal part $\{p, a_t\}$ is cancelled by $\partial_t a_t$. So,
\[ \|R(t, \hbar)\| \sim \Big\| Op_\hbar\Big( \sum_{|\alpha|=2}^{K} \hbar^{|\alpha|} \partial_\xi^\alpha p \, D_x^\alpha a_t - \hbar^{|\alpha|} \partial_\xi^\alpha a_t \, D_x^\alpha p \Big) \Big\| = O(\hbar^{2-2\epsilon} e^{C|t|}) \]
by the Calderon–Vaillancourt theorem (3) and Lemma 1. To conclude the proof, following ([Ta], Sect. 2.2), we must estimate $\|Q_t - A_t\|$. Writing $F(t) = A_t e^{itP_1/\hbar}$ and $G(t) = Q_t e^{itP_1/\hbar}$, it follows from the unitarity of $e^{itP_1/\hbar}$ that: $\|Q_t - A_t\| = \|F(t) - G(t)\|$.
However, $v(t) = F(t) - G(t)$ satisfies $\hbar \partial_t v(t) = iP_1 v(t) + R(t; \hbar) e^{itP_1/\hbar}$, $v(0) = 0$. Therefore, by the energy estimate in Lemma 2, it follows that $\|Q_t - A_t\| = \|v(t)\| = O(\hbar^{-1} e^{C_1|t|} \|R(t; \hbar)\|) = O(\hbar^{1-2\epsilon}) e^{C_2|t|}$. □

We will now apply the Egorov Theorem in Proposition 2 to the small-scale symbols localized near the limit set $\gamma \subset \Sigma_E$: Let $d(\cdot, \cdot)$ be a distance function on $T^*X$ and recall, $\zeta(x; \hbar D_x) := Op_\hbar \chi(\hbar^{-2\epsilon} d^2((x, \xi), \gamma))$. An application of Proposition 2 with $\zeta = Q_0$ gives
\[ \zeta_t = Op_\hbar(\exp t\Xi^*_p \zeta) + O(\hbar^{1-2\epsilon}) e^{C_2|t|}, \tag{11} \]
where $\zeta_t = e^{-it\hbar^{-1}P_1} Op_\hbar(\zeta) e^{it\hbar^{-1}P_1}$. Let $\psi_j$ be a joint eigenfunction of $P_1, \dots, P_d$ as above. Then, by Proposition 2 and the unitarity of $e^{itP_1/\hbar}$,
\[ (\zeta\psi_j, \psi_j) = (\zeta_t\psi_j, \psi_j) = (Op_\hbar(\exp t\Xi^*_p \zeta)\psi_j, \psi_j) + O(\hbar^{1-2\epsilon}) e^{C_2|t|}. \tag{12} \]
We now fix an invariant, semiclassical Friedrichs (anti-Wick) quantization map
\[ Op^F_\hbar : S^m_\epsilon \longrightarrow Op^F_\hbar(S^m_\epsilon) \]
with the property that $Op^F_\hbar(a) \ge 0$ if $a \ge 0$.

Proposition 3. Given $\zeta$ and $\psi_j$ as above, $(Op^F_\hbar(\zeta)\psi_j, \psi_j) = (Op^F_\hbar(\exp t\Xi^*_p \zeta)\psi_j, \psi_j) + O(\hbar^{1-2\epsilon}) e^{C_3|t|}$.

Proof. For simplicity of notation, we will write $\sigma = \zeta$ for the remainder of the proof. In view of Proposition 2, it suffices to show that $\|Op^F_\hbar(\sigma) - Op_\hbar(\sigma)\| = O(\hbar^{1-2\epsilon})$ in $L^2$. We can represent the operator locally in terms of its Weyl quantization
\[ Op^F(\sigma)(x, y; \hbar) = (2\pi\hbar)^{-n} \int e^{i(x-y)\xi/\hbar} \sigma^w\big(\tfrac{1}{2}(x + y), \xi; \hbar\big) \, d\xi, \tag{13} \]
where $\sigma^w$ denotes the (local) Weyl symbol. Let $\sigma^F$ denote the corresponding Kohn–Nirenberg symbol, so that:
\[ Op_\hbar(\sigma^F) = Op^w_\hbar(\sigma^w) = Op^F_\hbar(\sigma). \]
By the usual argument relating Weyl and Kohn–Nirenberg symbols [Sh], it follows that
\[ \sigma^F(x, \xi; \hbar) = \sigma^w(x, \xi; \hbar) + O(\hbar^{1-2\epsilon}) \tag{14} \]
with similar estimates for the derivatives. It therefore suffices to relate the local Weyl symbol $\sigma^w$ to $\sigma$. The relevant formula is [F]:
\[ \sigma^w(y, \hbar\eta; \hbar) = \iint \sigma(q, \hbar p; \hbar) \, \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q)) \, dp \, dq. \tag{15} \]
Here, $\Phi$ is an even, non-negative Schwartz function with $\int \Phi = 1$. Note that, since $\sigma$ is compactly-supported, the scaling by $\langle p \rangle$ in $\Phi$ is not necessary here. To estimate the difference $\sigma^w - \sigma$, we use Taylor expansion to second order:
\[ (\sigma^w - \sigma)(y, \hbar\eta; \hbar) = \iint [\sigma(q, \hbar p; \hbar) - \sigma(y, \hbar\eta; \hbar)] \cdot \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q)) \, dp \, dq \]
\[ = \iint [\hbar(p - \eta) \cdot \nabla_\eta \sigma + (y - q) \cdot \nabla_y \sigma + R(x, \xi, q, p; \hbar)] \cdot \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q)) \, dp \, dq. \tag{16} \]
The linear terms in (16) all integrate to zero, since $\Phi$ is even. The quadratic term $R$ is bounded by:
\[ C\hbar (1 + \hbar^{1/2}|\eta - p| + \hbar^{-1/2}|y - q|)^2 \cdot \Phi(\hbar^{1/2}(\eta - p), \hbar^{-1/2}(y - q)) \cdot \|\sigma^w\|_{C^2} = O(\hbar^{1-2\epsilon}) \tag{17} \]
with similar estimates for the derivatives. □

Since our main interest here is in mass estimates, we will henceforth work with a fixed, positive quantization. For simplicity of notation, we will drop the superscript $F$. To proceed, put
\[ t = \delta \log\Big(\frac{1}{\hbar}\Big) \tag{18} \]
in Proposition 3, where $\delta > 0$ is to be determined. By Lemma 1, it is clear that, for such a value of $t$,
\[ \exp t\Xi^*_p \zeta \in S^0_{\epsilon + C_1\delta} \tag{19} \]
with $m(t) = e^{-|t|}$. In order to choose $\delta > 0$, we need to combine the estimate on the $\hbar$-microsupport of the $\psi_j$ (Proposition 1) and the time-dependent Egorov theorem (Proposition 3) using hypotheses (H1) and (H2). To see how to do this, choose a cutoff function $0 \le \tilde{\chi}(s) \in C_0^\infty(\mathbb{R})$ which is identically equal to 1 on $[-2, 2]$ and vanishes for $|s| \ge 3$. So, in particular, $\tilde{\chi} = 1$ on supp $\chi$. Recall, we have defined the associated symbol,
\[ \tilde{\zeta}(x, \xi) := \tilde{\chi}(\hbar^{-2\epsilon} d^2((x, \xi), \gamma)). \tag{20} \]
Fix a $\delta_1$ with $0 < \delta_1 < 1$ and recall that, by Proposition 1, the microlocal mass of eigenfunctions, $\psi_j$, satisfying $P_k \psi_j = E_k(\hbar)\psi_j$ is concentrated (modulo $O(\hbar^\infty)$) in the domain, $\Omega_E = \{(x, \xi) \in T^*X;\ p(x, \xi) \le C\hbar^{\delta_1}\}$. Consider the tubular neighbourhoods, $\Gamma_\epsilon = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) < \hbar^{2\epsilon}\}$ and $\tilde{\Gamma}_\epsilon = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) < 2\hbar^{2\epsilon}\}$. Given $(x, \xi) \in \Omega_E - \tilde{\Gamma}_\epsilon$, our objective is to choose $\delta$ so that for $t = -\delta \log \hbar$, $\phi_t(x, \xi) = \exp t\Xi_p(x, \xi) \in \Gamma_\epsilon$. To see how to do this, we first of all restrict $\epsilon > 0$ so that: $f(\epsilon)$ [...]. (23)
By hypothesis (H1), it follows that for $t = -\delta \log \hbar$,
\[ d(\phi_t(x_0, \xi_0), \gamma) \le C\hbar^{\delta - \epsilon}. \tag{24} \]
So, by the triangle inequality,
\[ d(\phi_t(x, \xi), \gamma) = d(\phi_t(x, \xi), \phi_t(x_0, \xi_0)) + O(\hbar^{\delta - \epsilon}). \tag{25} \]
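Finally, by a first-order Taylor expansion, it follows that
\[ d(\phi_t(x, \xi), \phi_t(x_0, \xi_0)) \le \sup_{\Omega_E} |\nabla_{x,\xi} \phi_t| \cdot d((x, \xi), (x_0, \xi_0)) \le \exp(C_1|t|) \, \hbar^{\frac{\delta_1}{2} - f(\epsilon)} = \hbar^{-C_1\delta - f(\epsilon) + \frac{\delta_1}{2}}, \]
where, in the last inequality, we have used Lemma 1. The end result is that, given (21) and (23), we have for $t = -\delta \log \hbar$, $(x, \xi) \in \Omega_E - \tilde{\Gamma}_\epsilon$ and $\hbar$ sufficiently small,
\[ d(\phi_t(x, \xi), \gamma) = O(\hbar^{-C_1\delta - f(\epsilon) + \frac{\delta_1}{2}} + \hbar^{\delta - \epsilon}). \tag{26} \]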
We would like to arrange that the bicharacteristic curve $\phi_t(x, \xi)$ be in $\Gamma_\epsilon$ after time $t = -\delta \log \hbar$. This will be the case, provided:
\[ \delta < C_1^{-1}\Big(\frac{\delta_1}{2} - f(\epsilon) - \epsilon\Big), \qquad \epsilon < \frac{\delta}{2}, \qquad f(\epsilon) < \frac{\delta_1}{2}. \tag{27} \]
The only other thing we need to consider is the error term in the Egorov Theorem (Proposition 3). In order to ensure that this term does not blow up, we also require that,
\[ \delta < C_3^{-1}(1 - 2\epsilon). \tag{28} \]
Summing up, we have proved:
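Lemma 3. Let $(\epsilon, \delta)$ satisfy the following inequalities:
\[ \delta < \min\Big(C_1^{-1}\Big(\frac{\delta_1}{2} - f(\epsilon) - \epsilon\Big),\ C_3^{-1}(1 - 2\epsilon)\Big), \qquad \epsilon < \frac{\delta}{2}, \qquad f(\epsilon) < \frac{\delta_1}{2}. \tag{29} \]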
Then, for $t = -\delta \log \hbar$, $(x, \xi) \in \Omega_E - \tilde{\Gamma}_\epsilon$ and $\hbar$ sufficiently small, $\zeta(\exp t\Xi_{p_1}(x, \xi)) = 1$.

Since by assumption (H2), $f \in C^0$ and $f(0) = 0$, the system of inequalities in (29) can be solved for positive $(\epsilon, \delta)$ and clearly, the maximal such $\epsilon$ is the optimal choice, since it will give the best localization near the limit set, $\gamma$. To exploit Lemma 3, we will need to discuss the pointwise behaviour of the symbols $\zeta$ and $\tilde{\zeta}$ in greater detail:

Lemma 4. Let $(\epsilon, \delta)$ satisfy the inequalities in (29). Then, for $t = -\delta \log \hbar$, $(x, \xi) \in \Omega_E$ and $\hbar$ sufficiently small,
\[ [(1 - \tilde{\zeta}) \cdot (\exp t\Xi^*_{p_1} \zeta)](x, \xi) = (1 - \tilde{\zeta})(x, \xi). \tag{30} \]
Proof. When $(1 - \tilde{\zeta})(x, \xi) = 0$, this identity clearly holds since both sides of (30) are zero. On the other hand, if $(x, \xi) \in$ supp $(1 - \tilde{\zeta})$, then $(x, \xi) \in \Omega_E - \tilde{\Gamma}_\epsilon$ and so, by Lemma 3, $\zeta(\exp t\Xi_{p_1}(x, \xi)) = 1$. Thus, (30) is again satisfied. □

Recall, the semiclassical Egorov Theorem (Proposition 3) says that:
\[ (Op\,\zeta\,\psi_j, \psi_j) = (Op(\exp t\Xi^*_{p_1} \zeta)\psi_j, \psi_j) + O(\hbar^{1-2\epsilon-C_3\delta}). \tag{31} \]
Since $1 - \tilde{\zeta} \le 1$ holds pointwise and we are using a non-negative, anti-Wick quantization, it follows as a consequence of Proposition 3 that
\[ (Op\,\zeta\,\psi_j, \psi_j) \ge (Op[(1 - \tilde{\zeta}) \cdot \exp t\Xi^*_{p_1} \zeta]\psi_j, \psi_j) + O(\hbar^{1-2\epsilon-C_3\delta}). \tag{32} \]
Next, we expand the RHS in (32):
\[ (Op[(1 - \tilde{\zeta}) \cdot \exp t\Xi^*_{p_1} \zeta]\psi_j, \psi_j) = (Op[(1 - \tilde{\zeta}) \cdot \exp t\Xi^*_{p_1} \zeta] \cdot Op(\tilde{\chi}^{\delta_1})\psi_j, \psi_j) + (Op[(1 - \tilde{\zeta}) \cdot \exp t\Xi^*_{p_1} \zeta] \cdot [1 - Op(\tilde{\chi}^{\delta_1})]\psi_j, \psi_j). \tag{33} \]
By Proposition 1,
\[ \|[1 - Op(\tilde{\chi}^{\delta_1})]\psi_j\| = O(\hbar^\infty). \tag{34} \]
Using this estimate in (33) gives:
\[ (Op\,\zeta\,\psi_j, \psi_j) \ge (Op[(1 - \tilde{\zeta}) \cdot (\exp t\Xi^*_{p_1} \zeta)] \cdot Op(\tilde{\chi}^{\delta_1})\psi_j, \psi_j) + O(\hbar^{1-2C_0\epsilon-C_3\delta}). \tag{35} \]
Since the symbol $\tilde{\chi}^{\delta_1}$ is supported on the domain $\Omega_E$, it follows by the pointwise identity in (30) that,
\[ (Op\,\zeta\,\psi_j, \psi_j) \ge (Op(1 - \tilde{\zeta}) \cdot Op(\tilde{\chi}^{\delta_1})\psi_j, \psi_j) + O(\hbar^{1-2C_0\epsilon-C_3\delta}). \tag{36} \]
Finally, appealing again to the microlocalization result in (34), we obtain:
\[ (Op\,\zeta\,\psi_j, \psi_j) \ge (Op(1 - \tilde{\zeta})\psi_j, \psi_j) + O(\hbar^{1-2C_0\epsilon-C_3\delta}). \tag{37} \]
Our main result is now an immediate consequence of the estimate (37): Indeed, given the tubular neighbourhoods $\Gamma_\epsilon = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) \le \hbar^{2\epsilon}\}$, $\tilde{\Gamma}_\epsilon = \{(x, \xi) \in T^*X;\ d^2((x, \xi), \gamma) \le 2\hbar^{2\epsilon}\}$ and $(\epsilon, \delta)$ satisfying the estimates in (29), we have proved:

Theorem 1. Let $P_1, \dots, P_d$; $1 \le d \le n$ be elliptic, self-adjoint classical $\hbar$-pseudodifferential operators with $\hbar$-principal symbols $p_1, \dots, p_d$, satisfying: $[P_i, P_j] = 0$ for all $1 \le i, j \le d$. Let $0 < \delta_1 < 1$, $\Sigma_E = \{(x, \xi) \in T^*X;\ p_1(x, \xi) - E_1 = \dots = p_d(x, \xi) - E_d = 0\}$ and $\psi_j$ be an $L^2$-normalized joint eigenfunction satisfying: $P_k \psi_j = E_k(\hbar)\psi_j$, where, $E_k(\hbar) = E_k + O(\hbar^{\delta_1})$ and $k = 1, \dots, d$. Assume that hypotheses (H1) and (H2) are satisfied. Then, given $(\epsilon, \delta)$ satisfying the estimates in (29) and $\hbar$ sufficiently small,
\[ \int_{X - \pi(\tilde{\Gamma}_\epsilon)} |\psi_j|^2 \, dx \le \int_{\pi(\Gamma_\epsilon)} |\psi_j|^2 \, dx + O(\hbar^\kappa), \]
where $\kappa = 1 - 2\epsilon - C_3\delta$.

Remark. Although we have dealt throughout with the explicit rate function, $m(t) = e^{-|t|}$, and the associated symbol classes $S^0_\epsilon$, our main result generalizes to include other rate functions, $m(t)$ (see [Sj2]). Indeed, let $\hbar \in (0, \hbar_0]$ and assume that $\mu(\hbar) \in (0, \mu_0]$ satisfies $0 < \frac{\hbar}{\mu^2} \le \hbar^\epsilon$ for any $\epsilon > 0$. We define the corresponding symbol classes, $S^0_\mu$, as follows ([Sj2] Sect. 8): $a(x, \xi; \hbar, \mu) \in S^0_\mu$, provided
\[ |\partial_x^\alpha \partial_\xi^\beta a(x, \xi; \hbar, \mu)| \le C_{\alpha,\beta} \, \mu^{-(|\alpha|+|\beta|)}. \]
Then, by standard arguments, one has the usual composition formulas for such symbols, together with Calderon–Vaillancourt $L^2$-boundedness results. In particular, $Op_\hbar(S^0_{\mu_1}) \cdot Op_\hbar(S^0_{\mu_2}) \subset Op_\hbar(S^0_\mu)$, where, $\mu = \max\{\mu_1, \mu_2\}$. Thus, if $m(t) \ge e^{-|t|}$, it follows that we can work with symbol classes defined by $\mu(\hbar) = m(\epsilon \log \hbar)$ as long as $0 \le \epsilon < 1/2$.
3. Hyperbolic Geodesics on Quadric Surfaces

In this section, we give a concrete application of Theorem 1: Let $X = \{(x_1, x_2, x_3) \in \mathbb{R}^3;\ \alpha_1 x_1^2 + \alpha_2 x_2^2 + \alpha_3 x_3^2 = 1\}$ be the standard ellipsoid with axes of length $\alpha_1^{-1} > \alpha_2^{-1} > \alpha_3^{-1} > 0$. It was shown by Jacobi ([A]) that geodesic flow on $X$ is completely integrable. In fact, this system is also quantum integrable in arbitrary dimension ([T1, T2]). In this case, we can take $P_1 = -\hbar^2 \Delta$, the standard Laplace-Beltrami operator. One can show [T1] that there exists a functionally-independent, second-order, self-adjoint partial differential operator $P_2$ with the property that $[P_1, P_2] = 0$. Define $\Sigma = \{z \in T^*X;\ p_1(z) - 1 = p_2(z) - \alpha_2^{-1} = 0\}$ and denote the canonical dual coordinates to $(x_1, x_2, x_3) \in \mathbb{R}^3$ by $(\xi_1, \xi_2, \xi_3) \in \mathbb{R}^3$. It is well known [A] that the geodesics $\gamma^\pm = \{z \in \Sigma;\ x_2 = \xi_2 = 0\}$ are hyperbolic. Moreover, there exists a constant $C = C(\alpha_1, \alpha_2, \alpha_3)$ such that
\[ d(\exp t\Xi_{p_1} z, \gamma^\pm) \le \exp(-C|t|) \cdot \hbar^{-\epsilon}, \tag{38} \]
uniformly for all $z \in \Sigma - \tilde{\Gamma}_\epsilon$, where $\tilde{\Gamma}_\epsilon$ is a neighbourhood of $\gamma^\pm$ defined as in Sect. 2. The following is an immediate consequence of Theorem 1:

Corollary 1. Let $\delta_1 > 0$ be as above, $E = (E_1, E_2) = (1, \alpha_2^{-1})$, $\gamma$ and $\psi_j$ a normalized joint eigenfunction associated with $\Sigma_E$ as in Theorem 1. Then, given $(\epsilon, \delta)$ satisfying (29),
\[ \int_{X - \pi(\tilde{\Gamma}_\epsilon)} |\psi_j|^2 \, dx \le \int_{\pi(\Gamma_\epsilon)} |\psi_j|^2 \, dx + O(\hbar^\kappa). \]

Remarks. Note that applying separation of variables in this example leads to a non-Fuchsian ODE of Heun type ([T1, T2]) with an elliptic function potential, $q(x)$. It is not difficult to show that to obtain smooth solutions to the Laplace eigenfunction equation on $X$, one must look for doubly-periodic solutions corresponding to a certain lattice in $\mathbb{C}$ ([T1]). Even the existence of such a solution is by no means obvious since it is not clear that $q$ is a Picard potential [Ge]. Therefore, Corollary 1 gives a mass concentration result for separatrix eigenfunctions in a case where separation of variables is not readily applicable. There are other interesting algebraically integrable examples in arbitrary dimension satisfying (38) and hence, Corollary 1 ([T1, T2]) that can be approached using ODE techniques. However, in these examples, the ODEs arising from separation of variables typically involve multiple spectral parameters and are therefore very difficult to work with directly. Moreover, the set $\gamma$ can also be rather complicated. For example, when the dimension of the hyperellipsoid is at least 3, it is not difficult to show that the projected limit sets $\pi(\gamma)$ can actually be quadric surfaces of dimension $\ge 2$.
4. The One-Dimensional Schrödinger Operator

Let $V \in C^\infty(\mathbb{R})$ satisfy:

(i) $V(x + 1) = V(x)$,
(ii) $V(0) = V'(0) = 0$,
(iii) $V''(0) < 0$,
(iv) $-1 \le V(x) \le 0$.

Consider the one-dimensional (reduced) Schrödinger equation
\[ P(\hbar)\psi = -\hbar^2 \frac{d^2\psi}{dx^2} + V(x)\psi = \lambda\psi \tag{39} \]
on the circle, $S^1 = \mathbb{R}$ (mod 1), with $\lambda = O(\hbar)$. The spectral theory of such a Schrödinger operator near a non-degenerate potential maximum is well-known ([B, BPU, CP, Ma]). However, it is of interest to see how this example falls into our framework. To do this, we will need to recall here some elementary properties of the classical flow. Consider the separatrix
\[ \Sigma_0 = \{(x, \xi) \in T^*S^1;\ \tfrac{1}{2}\xi^2 + V(x) = 0\}. \]
It is clear that $\Sigma_0$ consists of two pieces and we will denote the subsets corresponding to $\xi \ge 0$ and $\xi \le 0$ by $\Sigma_0^+$ and $\Sigma_0^-$ respectively. To fix matters, we focus here on $\Sigma_0^+$, the other case being similar. In this example, we define $\Gamma_\epsilon = \{(x, \xi) \in T^*S^1;\ x^2 + \xi^2 \le \hbar^\epsilon\} \cup \{(x, \xi) \in T^*S^1;\ (x - 1)^2 + \xi^2 \le \hbar^\epsilon\}$ and $\tilde{\Gamma}_\epsilon = \{(x, \xi) \in T^*S^1;\ x^2 + \xi^2 \le 2\hbar^\epsilon\} \cup \{(x, \xi) \in T^*S^1;\ (x - 1)^2 + \xi^2 \le 2\hbar^\epsilon\}$. Let $(x_0, \xi_0) \in \Sigma_0^+ - \tilde{\Gamma}_\epsilon$ and $(x(t), \xi(t))$ be the solution curve of the Hamilton equations:
\[ \frac{dx}{dt} = \xi, \qquad \frac{d\xi}{dt} = -V'(x), \tag{40} \]
with $(x(0), \xi(0)) = (x_0, \xi_0)$. Integration of the equations in (40) yields:
\[ \int_{x(0)}^{x(t)} \frac{dx}{\sqrt{-V(x)}} = \sqrt{2}\,t, \qquad \xi(t) = -\int_0^t V'(x(s)) \, ds + \xi_0. \tag{41} \]
Let $H_{[a,b]}$ denote the indicator function of the interval, $[a, b]$. By the assumptions (i)-(iv) on the potential $V(x)$, there exist $c_1, c_2 > 0$ such that
\[ c_2 H_{[0,1/2]}(x) x^2 + c_2 H_{[1/2,1]}(x)(x - 1)^2 \ge -V(x) \ge c_1 H_{[0,1/2]}(x) x^2 + c_1 H_{[1/2,1]}(x)(x - 1)^2. \]
To minimize the profusion of constants, we will assume here that $c_1 = 1$ and $c_2 = 2$. Thus,
\[ 2H_{[0,1/2]}(x) x^2 + 2H_{[1/2,1]}(x)(x - 1)^2 \ge -V(x) \ge H_{[0,1/2]}(x) x^2 + H_{[1/2,1]}(x)(x - 1)^2. \tag{42} \]
Consider the first equation in (41). Let $x_0 \ge 2\hbar^\epsilon$ and suppose $2\hbar^\epsilon \le x(t) \le 1/2$. Then, by the estimate in (42),
\[ \int_{x_0}^{x(t)} \frac{dx}{x} \ge \sqrt{2}\,t. \tag{43} \]
As a consequence,
\[ x(t) \ge x_0 e^{\sqrt{2}t} \ge 2\hbar^\epsilon e^{\sqrt{2}t}. \tag{44} \]
Suppose now that $\sqrt{2}t \ge \log(4^{-1}\hbar^{-\epsilon})$, so that, in particular, $x(t) \ge 1/2$. Then, also by (42),
\[ \int_{x_0}^{x(t)} \frac{dx}{1 - x} \ge \sqrt{2}\,t. \tag{45} \]
Thus, it follows that
\[ |1 - x(t)| \le |1 - x_0| \, e^{-\sqrt{2}t}, \tag{46} \]
and consequently,
\[ |\xi(t)| \le 2|1 - x_0| \, e^{-\sqrt{2}t} \tag{47} \]
for the same range of $t$. We now show that (H1) and (H2) follow from the above estimates.

Lemma 5. Let $\gamma := (0, 0) \cup (1, 0)$ and suppose that $d((x_0, \xi_0), \gamma) \ge 2\hbar^\epsilon$, where $\xi_0 \ge 0$. Then, for all $t > 0$, and $\hbar$ sufficiently small,
\[ d((x(t), \xi(t)), \gamma) = (|1 - x(t)|^2 + \xi(t)^2)^{\frac{1}{2}} \le \hbar^{-\epsilon} e^{-\sqrt{2}t}. \]
A similar result holds for $\xi_0 \le 0$. Furthermore, hypothesis (H2) is also satisfied for this system with $f(\epsilon) = \epsilon$.

Proof. For $\sqrt{2}t \ge \log(\hbar^{-\epsilon})$ this follows from the estimates (46) and (47). On the other hand, when $\sqrt{2}t \le \log(\hbar^{-\epsilon})$, both $|1 - x(t)| \le \hbar^{-\epsilon} e^{-\sqrt{2}t}$ and $|\xi(t)| \le \hbar^{-\epsilon} e^{-\sqrt{2}t}$, since $0 \le x, \xi \le 1$. The second part of the lemma follows from the fact that $V(x) \sim -x^2$ near $x = 0$ and $V(x) \sim -(x - 1)^2$ near $x = 1$. □

Let $\Delta$ be a sufficiently small neighbourhood of $(0, 0)$ and let $\chi_\Delta(x, \xi) \in C_0^\infty$ be a cutoff function supported in $\Delta$. Then, taking into account the microlocalization result in Proposition 1, by a quantum Birkhoff normal form construction [CP, HS], one can construct a microlocally unitary $\hbar$-Fourier integral operator, $U : C_0^\infty(\Delta) \to C_0^\infty(\Delta)$ such that
\[ \|Op_\hbar(\zeta)[U^* F(P; \hbar) U - \hbar(D_x x + x D_x)]\| = O(\hbar^\infty), \tag{48} \]
where $F(x; \hbar) \sim \sum_{j=0} f_j(x)\hbar^j$ and $0 \le \epsilon < 1/2$. As a consequence of (48), it can be shown [CP] that there exist $\alpha_\pm$ such that for any eigenfunction $\psi_j$,
\[ \|Op_\hbar(\zeta)(\psi_j - \alpha_+ U u_+ - \alpha_- U u_-)\| = O(\hbar^\infty). \tag{49} \]
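Here,
\[ u_\pm(x) = (2\pi)^{1/2} \, \Gamma(1/2 + i\lambda/\hbar)^{-1} e^{-\lambda/2\hbar} |\log \hbar|^{-1/2} H(\pm x) \, x^{\pm i\lambda/\hbar - 1/2} \]
is the distributional basis of solutions to the equation $\hbar(D_x x + x D_x)u = \lambda u$. To simplify the writing, we will put $c(\hbar) = (2\pi) |\Gamma(1/2 + i\lambda/\hbar)|^{-2} e^{-\lambda/\hbar} |\log \hbar|^{-1}$ below. As a starting point, we will compute the microlocal mass of $u_\pm$ over the domain, $\Omega_\epsilon := [-\hbar^\epsilon, \hbar^\epsilon] \times [-\hbar^\epsilon, \hbar^\epsilon] \subset [-1, 1] \times [-1, 1]$ with $0 \le \epsilon < 1/2$. Because of the symmetry of the problem [CP], it suffices to estimate the integral:
\[ \int_0^{\hbar^\epsilon} |\hat{u}_+|^2 \, d\xi = c(\hbar) \int_0^{\hbar^\epsilon} \left| \int_0^{\hbar^\epsilon} e^{-i(x\xi - \lambda \log x)/\hbar} x^{-1/2} \, dx \right|^2 d\xi. \tag{50} \]
Notice, we have chosen $c(\hbar)$ so that $\int_0^1 |\hat{u}_+|^2 \, d\xi = 1$. By making the change of coordinates
\[ y = \frac{x\xi}{\hbar}, \qquad \eta = \frac{\xi}{\hbar} \]
in the integral (50) we get,
\[ \int_0^{\hbar^\epsilon} |\hat{u}_+|^2 \, d\xi = c(\hbar) \int_0^{\hbar^{\epsilon-1}} \left| \int_0^{\hbar^\epsilon \eta} e^{-iy} y^{-1/2 + i\lambda/\hbar} \, dy \right|^2 \frac{d\eta}{\eta}. \tag{51} \]
To estimate this latter integral, we first assume that $\eta \in [0, \hbar^{-\epsilon}]$. Then, by an integration by parts:
\[ \int_0^{\hbar^\epsilon \eta} e^{-iy} y^{-1/2 + i\lambda/\hbar} \, dy = O(\hbar^{\epsilon/2} \eta^{1/2}) + O(1) \int_0^{\hbar^\epsilon \eta} e^{-iy} y^{1/2 + i\lambda/\hbar} \, dy = O(\hbar^{\epsilon/2} \eta^{1/2}) + O(\hbar^{3\epsilon/2} \eta^{3/2}). \tag{52} \]
Thus, since $c(\hbar) \sim (\log \hbar)^{-1}$,
\[ \int_0^{\hbar^\epsilon} |\hat{u}_+|^2 \, d\xi = c(\hbar) \int_{\hbar^{-\epsilon}}^{\hbar^{\epsilon-1}} \left| \int_0^{\hbar^\epsilon \eta} e^{-iy} y^{-1/2 + i\lambda/\hbar} \, dy \right|^2 \frac{d\eta}{\eta} + O((\log \hbar)^{-1}), \tag{53} \]
and so,
\[ \int_0^{\hbar^\epsilon} |\hat{u}_+|^2 \, d\xi = 1 - 2\epsilon + O((\log \hbar)^{-1}). \tag{54} \]
It follows that the mass inside $\Omega_\epsilon$ dominates the mass in $\Omega_\epsilon^c = [0, 1]^2 - \Omega_\epsilon$ provided $1 - 2\epsilon \ge 2\epsilon$ and so, we must choose
\[ \epsilon \le \frac{1}{4}. \tag{55} \]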
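As a sanity check on the leading term in (54), here is a short worked computation. It rests on an assumption that is not spelled out in the text: that on the range $\hbar^{-\epsilon} \le \eta \le \hbar^{\epsilon-1}$ the product of $c(\hbar)$ with the squared modulus of the inner integral equals $|\log \hbar|^{-1}(1 + o(1))$ uniformly in $\eta$. Under that assumption only the logarithmic measure $d\eta/\eta$ of the range matters:

```latex
\[
\frac{1}{|\log\hbar|}\int_{\hbar^{-\epsilon}}^{\hbar^{\epsilon-1}}\frac{d\eta}{\eta}
= \frac{\log\hbar^{\epsilon-1}-\log\hbar^{-\epsilon}}{|\log\hbar|}
= \frac{(2\epsilon-1)\log\hbar}{|\log\hbar|}
= 1-2\epsilon ,
\qquad 0<\hbar<1 .
\]
```

The last equality uses $\log \hbar < 0$. The threshold in (55) is then just the inequality $1 - 2\epsilon \ge 2\epsilon$ solved for $\epsilon$.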
Remark. Although we will not prove this here, by using the above analysis together with Taylor expansion near $(0, 0)$, it is not difficult to show that, for any $q(x, \xi) \in C_0^\infty(T^*S^1)$, $(Op_\hbar(q)\psi_j, \psi_j) \to q(0, 0)$ as $\hbar \to 0$ (see also [CP]). We will discuss limits of quantum expected values in greater generality (e.g. near unstable orbits) elsewhere.
5. Appendix A

Fix a constant $C > 0$ and let $P(x; \hbar D_x)$ be a self-adjoint, $\hbar$-pseudodifferential operator of order 1 with symbol $p(x, \xi; \hbar) \sim \sum_{j=0}^{\infty} p_j(x, \xi)\hbar^j$, where
\[ |\partial_x^\alpha \partial_\xi^\beta p_j| \le C_{\alpha,\beta} \langle\xi\rangle^{1-j-|\alpha|}. \]
We will moreover assume that $P$ is elliptic, with $p_0(x, \xi) \ge C\langle\xi\rangle$ when $|\xi| \ge C_1$. Fix $0 < \delta_1 < 1/2$, $E_1 > 0$ and denote the number of eigenvalues of $P$ (counted with multiplicity) on the interval $[E_1 - C\hbar^{\delta_1}, E_1 + C\hbar^{\delta_1}]$ by $N_{\delta_1,E_1}(\hbar)$. Our objective here is to give an asymptotic lower bound for $N_{\delta_1,E_1}(\hbar)$ in terms of the trace of a pseudodifferential operator (the approximate spectral projector). This method is well-known ([Sh, R]) and has been used in a variety of settings. Since we could not find the results of Propositions 4 and 5 explicitly in the literature, we will sketch the proofs. To define the projector, we let $\chi(t) \in C_0^\infty(\mathbb{R})$ be identically 1 in the interval $[-C - 1, C + 1]$ with supp $\chi \subset [-2C - 2, 2C + 2]$. Define
\[ \chi_{\delta_1,E_1}(t) := \chi\Big(\frac{t - E_1}{\hbar^{\delta_1}}\Big) \]
and let $\Sigma_s := \{(x, \xi) \in \Sigma_{E_1};\ dp(x, \xi) = 0\}$.

Proposition 4. Let $0 \le \delta_1 < 1/2$ and suppose $\Sigma_{E_1} - \Sigma_s$ contains an open manifold. Then, there exists a constant $C > 0$ such that: $N_{\delta_1,E_1}(\hbar) \ge C\hbar^{-n+\delta_1}$.

Proof. Since
\[ N_{\delta_1,E}(\hbar) \ge \mathrm{Trace}\, \chi_{\delta_1}(P(x, \hbar D_x)), \tag{56} \]
it suffices to give a lower bound for Trace $\chi_{\delta_1}(P)$. The first order of business is to show that $\chi_{\delta_1}(P)$ is an $\hbar$-pseudodifferential operator with singular symbol. One way of doing this [Do] is to use the Cauchy identity:
\[ f(P) = -\pi^{-1} \lim_{\epsilon \to 0} \iint_{|\Im z| \ge \epsilon} \frac{\partial \tilde{f}}{\partial \bar{z}} (z - P)^{-1} \, dz \, d\bar{z} \tag{57} \]
which is valid for all $f \in C_0^\infty(\mathbb{R})$. Here, $\tilde{f} \in C_0^\infty(\mathbb{C})$ denotes an almost-analytic extension of $f$. The resulting operator, $f(P(\hbar))$, is then an $\hbar$-pseudodifferential operator with symbol,
\[ p_f(x, \xi; \hbar) \sim \sum_{j=0}^{\infty} p_{f,j}(x, \xi)\hbar^j, \tag{58} \]
where $p_{f,0} = f(p_0)$ and for $j \ge 1$,
\[ p_{f,j}(x, \xi) = \sum_{k=1}^{2j-1} d_{j,k} \, f^{(k)}(p_0), \tag{59} \]
the $d_{j,k}$ being universal polynomials in the derivatives of the $p_l$. One can put $f = \chi_{\delta_1}$ and carry out the symbolic calculations as in the standard case, except that the $p_{f,j}$ will now depend on $\hbar$. However, since $\partial^k \chi_{\delta_1} = O(\hbar^{-\delta_1 k})$, it follows that $p_{f,j}(x, \xi) = O(\hbar^{-\delta_1(2j-1)})$. Since $\delta_1 < 1/2$, (58) still makes sense as an asymptotic expansion. Taking traces, we get the usual formula:
\[ \mathrm{Tr}\, \chi_{\delta_1}(P(\hbar)) = (2\pi\hbar)^{-n} \iint \chi_{\delta_1}(p_0(x, \xi)) \, dx \, d\xi + O(\hbar^{-n+1-\delta_1}). \tag{60} \]
By assumption, we can introduce $p_0$ as a radial variable in (60) on an open domain. The result follows. □

By applying the argument above with a cutoff function $\chi(t_1, \dots, t_d) \in C_0^\infty(\mathbb{R}^d)$ one can prove in exactly the same way:

Proposition 5. Let $P_1, \dots, P_d$ satisfy the hypotheses in Theorem 1 and suppose $\Sigma_E - \Sigma_s$ contains an open manifold. Then, for $0 \le \delta_1 < 1/2$, there exists a constant $C > 0$ such that: $N_{\delta_1,E}(\hbar) \ge C\hbar^{-n+\delta_1 \cdot d}$. Here, $N_{\delta_1,E}(\hbar)$ denotes the number of $d$-tuples of eigenvalues $(\lambda_1, \dots, \lambda_d)$ of $P_1, \dots, P_d$ satisfying $|\lambda_j - E_j| \le C\hbar^{\delta_1}$ and $\Sigma_s = \{(x, \xi) \in \Sigma_E;\ dp_1, \dots, dp_d$ are linearly dependent at $(x, \xi)\}$.

Remark. Under the hypothesis that the joint energy levels $(E_1, \dots, E_d)$ are regular or have sufficiently tame singularities [BU, BPU, DG, GU, PU, R], there are well-known Weyl formulas for the spectral counting function that are much stronger than the lower bound in Proposition 5. The result of Proposition 5 shows that there are many eigenvalues satisfying the hypotheses of Theorem 1 (provided $0 \le \delta_1 < 1/2$) under rather weak assumptions on the singularities of the level variety, $\Sigma_E$.

Acknowledgement. I wish to thank Victor Guillemin, Alex Uribe, Steve Zelditch and Maciej Zworski for many helpful comments and valuable discussions. I am also indebted to the referee for several useful comments and suggestions regarding the paper.
References

[A] Arnold, V.I.: Mathematical Methods of Classical Mechanics. Second Edition, Berlin–Heidelberg–New York: Springer-Verlag, 1987
[B] Bleher, P.: Semiclassical quantization rules near separatrices. Commun. Math. Phys. 165, 621–640 (1994)
[BU] Brummelhuis, J. and Uribe, A.: A trace formula for Schrödinger operators. Commun. Math. Phys. 136, 567–584 (1991)
[BPU] Brummelhuis, J., Paul, T. and Uribe, A.: Spectral estimates around a critical level. Duke Math. J. 78 (3), 477–530 (1995)
[CP] Colin de Verdière, Y. and Parisse, B.: Équilibre instable en régime semi-classique I: concentration microlocale. Commun. P.D.E. 19, 1535–1563 (1994)
[DG] Duistermaat, J. and Guillemin, V.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
[Do] Dozias, S.: Mémoire de Magistère de l'ENS. (1993)
[F] Folland, G.: Harmonic Analysis in Phase Space. Annals of Math. Studies 122, Princeton, NJ: Princeton Univ. Press, 1989
[Ge] Gesztesy, F.: On Picard Potentials. Differential and Integral Equations 8 (6), 1453–1476 (1995)
[GU] Guillemin, V. and Uribe, A.: Circular symmetry and the trace formula. Invent. Math. 96, 385–423 (1989)
[HS] Helffer, B. and Sjöstrand, J.: Semiclassical analysis of Harper's equation III. Bull. Soc. Math. France, Mémoire No. 39, (1990)
[Ma] März, C.: Spectral asymptotics for Hill's equation near the potential maximum. Asymptotic Analysis 5, 221–267 (1992)
[PU] Paul, T. and Uribe, A.: The semi-classical trace formula and propagation of wave packets. J. Funct. Anal. 132, 192–249 (1995)
[R] Robert, D.: Autour de l'approximation semi-classique. Progr. Math. 68, Boston: Birkhäuser, 1987
[Sh] Shubin, M.: Pseudodifferential Operators and Spectral Theory. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[Sj1] Sjöstrand, J.: Semi-excited states in nondegenerate potential wells. Asymp. Anal. 6, 29–43 (1992)
[Sj2] Sjöstrand, J.: Microlocal analysis for the periodic magnetic Schrödinger equation and related questions. CIME-lectures, Montecatini (1989), Springer Lecture Notes in Math. 1495, pp. 237–332
[Ta] Taylor, M.: Pseudodifferential Operators. Princeton, NJ: Princeton Univ. Press, 1981
[T1] Toth, J.A.: Various quantum mechanical aspects of quadratic forms. J. Funct. Anal. 130, 1–42 (1995)
[T2] Toth, J.A.: Eigenfunction localization in the quantized rigid body. J. Diff. Geom. 43 (4), 844–858 (1996)
[Vo] Volovoy, A.V.: Improved two-term asymptotics for the eigenvalue distribution function of an elliptic operator on a compact manifold. Commun. in P.D.E. 15 (11), 1509–1563 (1990)
[Z] Zelditch, S.: On the rate of quantum ergodicity. Commun. Math. Phys. 160, 81–92 (1994)
Communicated by P. Sarnak
Commun. Math. Phys. 206, 429 – 445 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
The Entropy Production of Diffusion Processes on Manifolds and Its Circulation Decompositions*
Qian Min, Wang Zheng-dong
Department of Mathematics, Peking University, Beijing 100871, P. R. China
Received: 4 November 1998 / Accepted: 7 April 1999
Abstract: In non-equilibrium statistical mechanics, the entropy production is used to describe the flow of entropy into or out of a time-dependent system. Even if a system is in a steady state (invariant in time), Prigogine suggested that there should be a positive entropy production if it is open. In 1979, the first author of this paper and Qian Min-Ping discovered that the entropy production describes the irreversibility of stationary Markov chains, and proved the circulation decomposition formula of the entropy production. They also obtained the entropy production formula for drifted Brownian motions on Euclidean space $\mathbb{R}^n$ (see a report without proof in the Proc. 1st World Congr. Bernoulli Soc.). By the topological triviality of $\mathbb{R}^n$, there is no discrete circulation associated to the diffusion processes on $\mathbb{R}^n$. In this paper, the entropy production formula for stationary drifted Brownian motions on a compact Riemannian manifold $M$ is proved. Furthermore, the entropy production is decomposed into two parts: in addition to the first part, analogous to that of a diffusion process on $\mathbb{R}^n$, some discrete circulations intrinsic to the topology of $M$ appear! The first part is called the hidden circulation and is then explained as the circulation of a lifted process on $M \times S^1$ around the circle $S^1$. The main result of this paper is the circulation decomposition formula which states that the entropy production of a stationary drifted Brownian motion on $M$ is a linear sum of its circulations around the generators of the fundamental group of $M$ and the hidden circulation.
* Project supported by the National Natural Science Foundation of China and Mathematical Center of State Education Commission.

1. Introduction

In non-equilibrium statistical mechanics, the entropy production is used to characterize how far a system is from equilibrium (see e.g., [P]). As far as we know, this idea has not been studied in probability theory with appropriate generality (see e.g. p. 207 of [Si]). At ICM 1998, G. Gallavotti brought up the topic of entropy production again in his plenary lecture (see e.g., [G]). He used it to solve some classical problems in non-equilibrium statistical mechanics. In 1979, the first author of this paper and Qian Min-Ping considered the entropy production of stationary Markov chains and found the relationship between the entropy production and the circulation of Markov chains (see [QQ1]). For a sketch, suppose that $\xi$ is a stationary Markov chain with discrete state space $S$, transition probability matrix $P = (p_{ij})_{i,j\in S}$ and initial invariant distribution $\pi = (\pi_i)_{i\in S}$. The Markov chain $\xi$ is called reversible if $\pi_i p_{ij} = \pi_j p_{ji}$ for all states $i, j \in S$. The entropy production $e_p$ of $\xi$ is defined by

$$e_p = \frac{1}{2}\sum_{i,j\in S} (\pi_i p_{ij} - \pi_j p_{ji}) \ln\frac{\pi_i p_{ij}}{\pi_j p_{ji}}.$$
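The definition above can be evaluated directly for a finite chain. The following is a minimal sketch, not taken from the paper: the function names `stationary_distribution` and `entropy_production` are hypothetical, the invariant distribution is obtained by least squares, and pairs of states with a one-way transition (for which the logarithm would be infinite) are simply skipped.

```python
import numpy as np


def stationary_distribution(P):
    """Solve pi P = pi together with sum(pi) = 1 in the least-squares sense."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi


def entropy_production(P):
    """e_p = 1/2 * sum_{i,j} (pi_i p_ij - pi_j p_ji) ln(pi_i p_ij / (pi_j p_ji))."""
    pi = stationary_distribution(P)
    flux = pi[:, None] * P                 # flux[i, j] = pi_i p_ij
    mask = (flux > 0) & (flux.T > 0)       # assumption: skip one-way transitions
    diff = flux - flux.T
    return 0.5 * float(np.sum(diff[mask] * np.log(flux[mask] / flux.T[mask])))


# A symmetric (hence reversible) chain and a chain biased around the cycle 0 -> 1 -> 2 -> 0.
P_reversible = np.array([[0.5, 0.3, 0.2], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
P_cycle = np.array([[0.0, 0.9, 0.1], [0.1, 0.0, 0.9], [0.9, 0.1, 0.0]])
print(entropy_production(P_reversible), entropy_production(P_cycle))
```

For the biased cycle the invariant distribution is uniform, so every ordered pair of distinct states contributes $(0.8/3)\ln 9$ and the sum evaluates to $0.8\ln 9$ by hand, while the symmetric chain gives zero.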
Clearly, $e_p$ is non-negative, and $e_p = 0$ if and only if $\xi$ is reversible. Hence the entropy production defined above can be regarded as a criterion to characterize how far a Markov chain is from being reversible. Furthermore, the frequency with which any cycle $C$ (an ordered subset of $S$) appears along an orbit of the Markov chain $\xi$ has a limit $W_C$. In fact the limit $W_C$ is independent of the orbit and is defined as the circulation of $\xi$ around the cycle $C$. The following circulation decomposition formula of the entropy production is proved in [QQ1]:

$$e_p = \frac{1}{2}\sum_{C\in\mathcal{C}} (W_C - W_{-C}) \ln\frac{W_C}{W_{-C}},$$
where $\mathcal{C}$ denotes the set of all cycles, and $-C$ represents the reverse cycle of $C$. We refer to Kalpazidou's book [K] for the further development of the circulation theory, in which the circulation is related to the Carathéodory dimension, Betti numbers and Kolmogorov complexity. But all of this is limited to the case of a discrete state space $S$. In this paper, we consider the entropy production and circulation of stationary drifted Brownian motions on a compact Riemannian manifold $M$ and study the relationship between them.

Let $\{x_t\}_{t\ge 0}$ be a Brownian motion with drift $X$ on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$, with state space $M$ ($X$ is a vector field on $M$). Two probability measures $p^+_{[s,t]}$ and $p^-_{[s,t]}$ can be introduced on the $\sigma$-algebra $\mathcal{F}_s^t$ generated by $x_u$ ($s \le u \le t$) as the distributions of $\{x_u, s \le u \le t\}$ and $\{x_{t+s-u}, s \le u \le t\}$. $\{x_t\}_{t\ge 0}$ is called reversible if $p^+_{[s,t]} = p^-_{[s,t]}$ for any $t > s > 0$. The entropy production of $\{x_t\}_{t\ge 0}$ is defined as

$$e_p(t) = \lim_{\Delta t\to 0+} \frac{1}{\Delta t}\, E^p\left( \ln\frac{dp^+_{[t,t+\Delta t]}}{dp^-_{[t,t+\Delta t]}} \right),$$

where $E^p$ stands for the expectation of $p^+_{[t,t+\Delta t]}$. It is clear that $e_p(t) = 0$ ($\forall t > 0$) if and only if $\{x_t\}_{t\ge 0}$ is reversible. By a variant of Girsanov's formula on compact manifolds, we will prove that the entropy production $e_p(t)$ of the drifted Brownian motion $\{x_t\}_{t\ge 0}$ is given by:

$$e_p(t) = \frac{1}{2}\int_M \left( \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle - 2\,\frac{\partial\ln\rho}{\partial t} \right)\rho(x,t)\,dx,$$
where $dx$ stands for the Riemannian volume element of $M$, and $\rho(x,t)$ denotes the density of $x_t$ and satisfies $\int_M \rho(x,t)\,dx = 1$. Recall that

$$\frac{\partial\rho}{\partial t} = \frac{1}{2}\Delta\rho - X\rho - \rho\,\mathrm{div}\,X.$$

If $\{x_t\}_{t\ge 0}$ is stationary, i.e., $\rho(x,t) = \rho(x)$ ($\forall t \ge 0$), the entropy production formula given above becomes

$$e_p(t) = \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle\,\rho(x)\,dx. \tag{1.1}$$

This yields the known result: a stationary drifted Brownian motion is reversible if and only if its drift $X$ is a gradient vector field (see e.g., p. 294 of [IW]). We remark that our methods of derivation can be used to prove the entropy production formula for drifted Brownian motions on the Euclidean space $\mathbb{R}^n$, which is given in [QQ2] without proof. The definition of the entropy production given above seems closely related to the definition of Kurchan [Ku], which goes back to Andrej [A], Hoover et al. (see for instance [H]) and Evans et al. (see for instance [Ev]). The comparison of our definition with the ones above will be considered in the future.

Suppose that the flow $\varphi_t$ generated by the vector field $X$ is ergodic; then the rotation number of $\varphi_t$ around a closed curve $\gamma$ in $M$ is given by (see e.g., p. 149 of [AA]):

$$\alpha_\gamma = \int_M (\gamma^*, X)(m)\,d\mu(m), \tag{1.2}$$
where $\gamma^*$ is the De Rham dual of $\gamma$ in the first cohomology group $H^1(M, \mathbb{R})$, $\mu$ is the invariant probability measure of the ergodic flow $\varphi_t$, and $(\gamma^*, X)(m)$ is the value of the one-form $\gamma^*$ on $X$ at the point $m$. Even if the flow $\varphi_t$ is non-ergodic, the rotation number $\alpha_\gamma$ of the drifted Brownian motion $\{x_t\}_{t\ge 0}$ around the closed curve $\gamma$ can be defined and is given by the formula (see [M]):

$$\alpha_\gamma = \int_M (\gamma^*, X)(m)\,d\mu(m), \tag{1.3}$$
where the De Rham dual $\gamma^*$ of $\gamma$ is chosen to be harmonic, and $\mu$ is the invariant probability measure of $\{x_t\}_{t\ge 0}$. It is remarkable that formula (1.3) takes the same form as (1.2), though $\mu$ represents different measures in these two cases. What we have in mind here is an extension of Qian-Qian's result ([QQ1]) on the circulation for Markov chains to the case of diffusion processes on manifolds. The importance of the rotation numbers (or circulation) is revealed in the fact that the irreversibility of diffusion processes can be characterized in terms of them, just as in the discrete case of Markov chains.

To see this, let us consider a simple example. Let $B_t$ be a one-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$ and let $b(x)$ be a bounded continuous function on $\mathbb{R}^1$. The solution process $\{x_t\}_{t\ge 0}$ of the following stochastic differential equation with an initial condition $x_0$ gives a Brownian motion with drift $b(x)$:

$$dx_t = dB_t + b(x_t)\,dt.$$
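By Girsanov's formula, a new probability measure $\tilde p$ can be defined on $(\Omega, \mathcal{F}_s^t)$ such that $\frac{d\tilde p}{dp}\big|_{\mathcal{F}_s^t} = Z_{s,t}$, where

$$Z_{s,t}(x_\cdot(\omega)) = \exp\left[ -\int_s^t b(x_u(\omega))\cdot dB_u - \frac{1}{2}\int_s^t b^2(x_u(\omega))\,du \right]. \tag{1.4}$$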
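To calculate the entropy production $e_p(t)$, for simplicity we assume that $\{x_t\}_{t\ge 0}$ is a stationary process with an invariant probability measure $\rho(x)\,dx$ on $\mathbb{R}^1$. Observe that

$$E^{p^-_{[s,t]}}[f(x_\cdot(\omega))] = E^{p}[f(x_{t+s-\cdot}(\omega))] = E^{\tilde p}\left[f(x_{t+s-\cdot}(\omega))\,Z_{s,t}^{-1}(x_\cdot(\omega))\right]$$

holds for any Borel function $f$ on $C([s,t], \mathbb{R}^1)$. Notice that $\{x_u\}_{s\le u\le t}$ is a stationary Brownian motion without drift on the new probability space $(\Omega, \mathcal{F}_s^t, \tilde p)$; thus we have (see Proposition 3.2 in Sect. 3)

$$\begin{aligned}
E^{\tilde p}\left[f(x_{t+s-\cdot}(\omega))\,Z_{s,t}^{-1}(x_\cdot(\omega))\right]
&= E^{\tilde p}\left[f(x_\cdot(\omega))\,Z_{s,t}^{-1}(x_{t+s-\cdot}(\omega))\,\frac{\rho(x_t(\omega))}{\rho(x_s(\omega))}\right] \\
&= E^{p^+_{[s,t]}}\left[f(x_\cdot(\omega))\,Z_{s,t}(x_\cdot(\omega))\,Z_{s,t}^{-1}(x_{t+s-\cdot}(\omega))\,\frac{\rho(x_t(\omega))}{\rho(x_s(\omega))}\right].
\end{aligned}$$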
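Therefore we have

$$\frac{dp^-_{[s,t]}}{dp^+_{[s,t]}} = Z_{s,t}(x_\cdot(\omega))\,Z_{s,t}^{-1}(x_{t+s-\cdot}(\omega))\,\frac{\rho(x_t(\omega))}{\rho(x_s(\omega))}. \tag{1.5}$$

A simple stochastic calculus (see the proof of Proposition 3.3) yields that

$$Z_{s,t}(x_{t+s-\cdot}(\omega)) = \exp\left[ \int_s^t b(x_u(\omega))\cdot dB_u + \int_s^t \left(\frac{3}{2}b^2 + b'\right)(x_u(\omega))\,du \right]. \tag{1.6}$$

On the other hand, by Itô's formula and using $\frac{\partial\rho}{\partial t} = \frac{1}{2}\rho'' - (b\rho)' = 0$, we can derive

$$\begin{aligned}
\ln\frac{\rho(x_t(\omega))}{\rho(x_s(\omega))} &= \ln\rho(x_t(\omega)) - \ln\rho(x_s(\omega)) \\
&= \int_s^t (\ln\rho)'(x_u(\omega))\cdot dB_u + \int_s^t \left[\frac{1}{2}(\ln\rho)'' + b(\ln\rho)'\right](x_u(\omega))\,du \\
&= \int_s^t (\ln\rho)'(x_u(\omega))\cdot dB_u + \int_s^t \left[b' + 2b(\ln\rho)' - \frac{1}{2}((\ln\rho)')^2\right](x_u(\omega))\,du.
\end{aligned} \tag{1.7}$$

By (1.4)–(1.7), we get

$$\frac{dp^-_{[s,t]}}{dp^+_{[s,t]}} = \exp\left[ -\int_s^t a(x_u(\omega))\cdot dB_u - \frac{1}{2}\int_s^t a^2(x_u(\omega))\,du \right],$$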
where $a(x) = 2b(x) - (\ln\rho)'(x)$. This yields the following entropy production formula for the drifted Brownian motion $\{x_t\}_{t\ge 0}$ on $\mathbb{R}^1$:

$$e_p(t) = \frac{1}{2}\int_{\mathbb{R}^1} a^2(x)\rho(x)\,dx.$$
If $b(x)$ is a continuous function on $\mathbb{R}^1$ with period $2\pi$, it can be regarded as a function $\hat b$ on the circle $S^1$ ($\hat b(e^{i\theta}) = b(\theta)$, $0 \le \theta \le 2\pi$). A process $\{\xi_t\}_{t\ge 0}$ with state space $S^1$ can be defined as

$$\xi_t(\omega) = \exp(i x_t(\omega)), \qquad \omega \in \Omega,\ t \ge 0.$$

Clearly $\{\xi_t\}_{t\ge 0}$ is a Brownian motion with drift $\hat b$ on $S^1$. As in the discussion above, the entropy production of $\{\xi_t\}_{t\ge 0}$ can also be computed easily. In fact it is given by

$$e_p(t) = \int_0^{2\pi} [2b(\theta) - (\ln\rho)'(\theta)]^2 \rho(\theta)\,d\theta, \tag{1.8}$$

where $\rho(\theta)$ is the invariant density of $\{\xi_t\}_{t\ge 0}$ and satisfies the normalization condition $\int_0^{2\pi}\rho(\theta)\,d\theta = 1$. The rotation number of $\{\xi_t\}_{t\ge 0}$ around the circle $S^1$ is defined as the following limit:

$$\alpha = \lim_{t\to\infty} \frac{1}{2\pi t}\,x_t.$$
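Observe that $x_t = x_0 + B_t + \int_0^t b(x_u)\,du$ ($B_0 = 0$ being supposed), $(x_0 + B_t)/t \to 0$, and the ergodicity of $\{\xi_t\}_{t\ge 0}$ yields that

$$\Big[\int_0^t b(x_u)\,du\Big]\Big/ t = \Big[\int_0^t \hat b(\xi_u)\,du\Big]\Big/ t \;\to\; \int_0^{2\pi} \hat b(e^{i\theta})\rho(\theta)\,d\theta = \int_0^{2\pi} b(\theta)\rho(\theta)\,d\theta.$$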
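Hence we have the rotation number formula for $\{\xi_t\}_{t\ge 0}$:

$$\alpha = \frac{1}{2\pi}\int_0^{2\pi} b(\theta)\rho(\theta)\,d\theta. \tag{1.9}$$

Set

$$h(\theta) = \int_0^\theta [b(\varphi) - c]\,d\varphi,$$

where

$$c = \frac{1}{2\pi}\int_0^{2\pi} b(\theta)\,d\theta.$$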
Clearly, $h(\theta)$ is a $C^1$ function on $\mathbb{R}^1$ with period $2\pi$ and satisfies $b(\theta) = c + h'(\theta)$. Since $(\rho' - 2b\rho)' = 0$, i.e., $\rho' - 2b\rho = \text{const.}$, the entropy production $e_p(t)$ of $\{\xi_t\}_{t\ge 0}$ can be rewritten as

$$e_p(t) = \int_0^{2\pi} (2b - (\ln\rho)')(2b\rho - \rho')\,d\theta = 2c\int_0^{2\pi} (2b\rho - \rho')\,d\theta = 4c\int_0^{2\pi} b\rho\,d\theta.$$

Combining this with (1.9), we get the following simple relationship between the entropy production $e_p(t)$ and the circulation $\alpha$ of the drifted Brownian motion $\{\xi_t\}_{t\ge 0}$ on $S^1$:

$$e_p(t) = 8\pi c\alpha.$$
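The chain of equalities above can be checked numerically for a concrete drift. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $b(\theta) = c + \sin\theta$, builds the periodic solution of $\frac{1}{2}\rho'' - (b\rho)' = 0$ from the standard closed form $\rho(\theta) \propto e^{2U(\theta)}\int_\theta^{\theta+2\pi} e^{-2U(\varphi)}\,d\varphi$ with $U' = b$, and compares the integral on the right-hand side of (1.8) with $8\pi c\alpha$, where $\alpha$ is computed from (1.9). The function name `circle_entropy_check` and the grid size are hypothetical choices.

```python
import numpy as np


def circle_entropy_check(c=0.5, n=4000):
    """Return (integral in (1.8), 8*pi*c*alpha) for the drift b(theta) = c + sin(theta)."""
    h = 2 * np.pi / n
    phi = np.arange(2 * n + 1) * h            # grid on [0, 4*pi]
    U = c * phi - np.cos(phi)                 # antiderivative of b (additive constant irrelevant)
    w = np.exp(-2 * U)
    # Cumulative trapezoid rule: F[k] approximates the integral of w over [0, phi_k].
    F = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * h)))
    theta = phi[:n]
    rho = np.exp(2 * U[:n]) * (F[n:2 * n] - F[:n])   # periodic stationary density, unnormalised
    rho /= np.sum(rho) * h                           # normalise so that rho integrates to 1
    b = c + np.sin(theta)
    drho = (np.roll(rho, -1) - np.roll(rho, 1)) / (2 * h)   # periodic central difference
    a = 2 * b - drho / rho                                   # a = 2b - (ln rho)'
    lhs = float(np.sum(a ** 2 * rho) * h)
    alpha = float(np.sum(b * rho) * h / (2 * np.pi))         # rotation number, formula (1.9)
    return lhs, 8 * np.pi * c * alpha


print(circle_entropy_check(c=0.5))
print(circle_entropy_check(c=0.0))   # gradient drift: both sides should be close to zero
```

With $c = 0$ the drift is a gradient on the circle, $\rho \propto e^{2h}$, and both quantities vanish up to discretisation error, which is the reversible case.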
Using some geometrical results, in Sect. 4 we prove that the entropy production formula (1.1) for drifted Brownian motions on $M$ can be rewritten as

$$e_p(t) = 2\int_M (\beta, X)(x)\rho(x)\,dx + 2\int_M (\gamma, X)(x)\rho(x)\,dx, \tag{1.10}$$

where $\beta$ and $\gamma$ represent the co-exact and harmonic one-forms respectively in the Hodge decomposition of the dual one-form $X^*$ of $X$. By the rotation number formula (1.3), we see that the second term $\int_M (\gamma, X)(x)\rho(x)\,dx$ on the right-hand side of (1.10) can be represented as a linear sum of the rotation numbers of $\{x_t\}_{t\ge 0}$ around some closed curves in $M$. Hence in the case of $X^*$ being closed, i.e. $\beta = 0$, by (1.10) we see clearly that the entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ is a linear sum of its circulations. In Sect. 4, we will explain that $\int_M (\beta, X)(x)\rho(x)\,dx$ (the first term on the right-hand side of (1.10)) represents a hidden circulation of $\{x_t\}_{t\ge 0}$. To be more precise, we consider the trivial principal bundle $M \times S^1$ over $M$. The diffusion process $\{x_t\}_{t\ge 0}$ can be lifted to $M \times S^1$ with respect to a connection induced by the differential one-form $X^*$ on $M$ (for details see Sects. 2 and 4). We prove that the circulation $\alpha_0$ of the lifted process around the circle $S^1$ is exactly $\int_M (\beta, X)(x)\rho(x)\,dx$. This circulation cannot be observed by the rotation of $\{x_t\}_{t\ge 0}$ in $M$ and is called the hidden circulation of $\{x_t\}_{t\ge 0}$. By the new entropy production formula (1.10), we see clearly that the entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ can be characterized in terms of its circulation and hidden circulation. In fact, we have

$$e_p(t) = 2\alpha_0 + 2\sum_{i=1}^{b_1} (X^*, \omega_i)\,\alpha_i,$$
where $\alpha_1, \dots, \alpha_{b_1}$ are the rotation numbers of $\{x_t\}_{t\ge 0}$ around some closed curves $\gamma_1, \dots, \gamma_{b_1}$ in $M$ (they generate the homology group $H_1(M, \mathbb{R}^1)$, $b_1$ being the first Betti number of $M$), $\omega_i$ is the harmonic one-form dual to $\gamma_i$, and $(X^*, \omega_i)$ is the Hodge inner product between $X^*$ and $\omega_i$.
2. Lifted Processes and Girsanov's Formula

Suppose that $(M, \langle\cdot,\cdot\rangle)$ is a Riemannian manifold and $X_1, X_2, \dots, X_d, Y$ are smooth vector fields on $M$. Let $B_t = (B_t^1, B_t^2, \dots, B_t^d)$ be a $d$-dimensional Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$. Let us consider the following stochastic differential equation:

$$dx_t = \sum_{j=1}^d X_j(x_t)\circ dB_t^j + Y(x_t)\,dt \tag{2.1}$$
with an initial condition $x_0$, where $\circ$ is taken in the sense of Stratonovich. The infinitesimal generator $A$ of its solution process is a second order differential operator on $C^\infty(M)$ which satisfies (see e.g. [E])

$$Af = \frac{1}{2}\sum_{j=1}^d \langle \nabla_{X_j}(\nabla f), X_j\rangle + \Big(Y + \frac{1}{2}\sum_{j=1}^d \nabla_{X_j}X_j\Big) f$$

for all $f \in C^\infty(M)$. In the following we will always assume that the solution process of SDE (2.1) is a Brownian motion on $M$ with a drift vector field $X$. This means that

$$\Delta = \sum_{j=1}^d \langle \nabla_{X_j}\nabla, X_j\rangle, \qquad X = Y + \frac{1}{2}\sum_{j=1}^d \nabla_{X_j}X_j; \tag{2.2}$$
here $\Delta$ is the Laplace operator on $C^\infty(M)$. We remark that in general the existence of such vector fields $X_1, X_2, \dots, X_d$ on $M$ is not known. However, there is a canonical SDE on the orthonormal frame bundle over $M$, and the solutions to this project down to give Brownian motion on $M$. This construction is due to Eells and Elworthy (see e.g., p. 362 of [E]). To simplify our discussion and make the argument more transparent, we will assume (2.2) throughout this paper.

Suppose that $\{x_t\}_{t\ge 0}$ is a solution of SDE (2.1). Let us consider a lift of $\{x_t\}_{t\ge 0}$ to $M \times S^1$. Let $A$ be an $\mathbb{R}^1$-valued differential one-form on $M$. $iA$ induces a connection of the trivial circle bundle $M \times S^1$ over $M$. Then any $C^0$ vector field $Z$ on $M$ can be horizontally lifted to a vector field $\hat Z$ on $M \times S^1$. Regarding the tangent space $T_{(x,g)}(M \times S^1)$ of $M \times S^1$ at the point $(x, g)$ ($x \in M$, $g = e^{i\theta} \in S^1$) as $T_x(M) \oplus T_g S^1$, $\hat Z$ is then given by

$$\hat Z(x, g) = Z(x) - i(A, Z)(x)\,\frac{d}{d\theta}. \tag{2.3}$$
A lift of $\{x_t\}_{t\ge 0}$ to $M \times S^1$ is then defined as a solution process $\{y_t\}_{t\ge 0}$ of the following SDE:

$$dy_t = \sum_{j=1}^d \hat X_j(y_t)\circ dB_t^j + \hat Y(y_t)\,dt \tag{2.4}$$

with an initial condition $y_0 = (x_0, g_0)$. It is easy to prove that $\{y_t\}_{t\ge 0}$ projects down to give $\{x_t\}_{t\ge 0}$. In fact, by (2.3) and (2.4), we have $y_t = (x_t, g_t)$ with $g_t \in S^1$ satisfying

$$dg_t = -i\sum_{j=1}^d (A, X_j)(x_t)\circ dB_t^j - i(A, Y)(x_t)\,dt$$

with a given initial condition $g_0$ (in the following discussion, $g_0 = 1$ is always assumed). Clearly, $g_t$ is then given by

$$g_t = \exp\Big\{ -i\int_0^t \Big[ \sum_{j=1}^d (A, X_j)(x_s)\circ dB_s^j + (A, Y)(x_s)\,ds \Big] \Big\}. \tag{2.5}$$
$\{(x_t, g_t)\}_{t\ge 0}$ is called the horizontal lifted process of $\{x_t\}_{t\ge 0}$ with respect to the connection $iA$. In Sect. 4, we will use this lifted process to define a hidden circulation of the diffusion $\{x_t\}_{t\ge 0}$. Using the methods in [WGQ], the lifted process $\{(x_t, g_t)\}_{t\ge 0}$ can also be used to derive the following "covariant" Feynman–Kac formula:

$$[\exp(t(\tilde A - V))f](x) = E_{x_0 = x}\Big[ g_t \exp\Big( -\int_0^t V(x_s)\,ds \Big) f(x_t) \Big] \tag{2.6}$$

for all $f$ in $C(M)$, where $\tilde A = \frac{1}{2}\Delta + X - iA^* - \frac{i}{2}\,\mathrm{div}(A^*) - \frac{1}{2}\langle A^*, A^*\rangle - i\langle X, A^*\rangle$, $A^*$ being the vector field on $M$ dual to $A$. In the case of $M$ being Euclidean space, such a formula is known and can be derived by combining the Cameron–Martin–Girsanov
formula and the usual version of the Feynman–Kac formula (see e.g., Sect. 15 of [S]). Other Feynman–Kac type formulas can be found in several papers (see e.g., [AHHK, AW and WGQ]). Notice that the one-form $A$ can be regarded as a connection of the trivial principal bundle $M \times \mathbb{R}^1$ over $M$. As discussed above, we can also consider a horizontal lifted process $\{(x_t, h_t)\}_{t\ge 0}$ on $M \times \mathbb{R}^1$ (with respect to the connection form $A$), where $h_t$ is given by

$$h_t = -\int_0^t \Big[ \sum_{j=1}^d (A, X_j)(x_s)\circ dB_s^j + (A, Y)(x_s)\,ds \Big].$$
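Similar to formula (2.6), we have the following Feynman–Kac type formula:

$$[\exp(t(\hat A - V))f](x) = E_{x_0 = x}\Big[ \exp(h_t) \exp\Big( -\int_0^t V(x_s)\,ds \Big) f(x_t) \Big], \tag{2.7}$$

where $V \in C^0(M)$ is a potential function and

$$\hat A = \frac{1}{2}\Delta + X - A^* - \frac{1}{2}\,\mathrm{div}(A^*) + \frac{1}{2}\langle A^*, A^*\rangle - \langle X, A^*\rangle.$$

Let $A = X^*$ be the one-form dual to $X$ and $V = -\frac{1}{2}(\mathrm{div}\,X + \langle X, X\rangle)$. By (2.7) and (2.2), and using Itô's formula, we get

$$\Big[\exp\Big(\frac{t}{2}\Delta\Big) f\Big](x) = E_{x_0 = x}[Z_t f(x_t)], \tag{2.8}$$

where

$$Z_t(\omega) = \exp\Big[ -\int_0^t \sum_{j=1}^d \langle X, X_j\rangle(x_s(\omega))\cdot dB_s^j - \frac{1}{2}\int_0^t \langle X, X\rangle(x_s(\omega))\,ds \Big]. \tag{2.9}$$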
By our assumption (2.2), we have $\langle X, X\rangle = \sum_{j=1}^d \langle X, X_j\rangle^2$ and thus $Z_t$ is a martingale on the probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$. So a new probability measure $\tilde p$ on $(\Omega, \mathcal{F})$ can be defined by

$$\frac{d\tilde p}{dp}\Big|_{\mathcal{F}_t} = Z_t, \qquad \forall t > 0.$$

By (2.8), we see clearly that the process $\{x_t\}_{t\ge 0}$ is a Brownian motion without drift on the probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \tilde p)$. We remark that (2.9) is a variant of Girsanov's formula. Its original proof can be found in [E].

3. The Entropy Production Formula

Let $\{x_t\}_{t\ge 0}$ be a diffusion process on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$. Define $\mathcal{F}_s^t = \vee_{s\le u\le t}\,\sigma(x_u)$ (the $\sigma$-algebra generated by $x_u$, $s \le u \le t$), $0 \le s < t < \infty$. By the Kolmogorov theorem, $\{x_u\}_{s\le u\le t}$ and $\{x_{s+t-u}\}_{s\le u\le t}$ determine probability measures $p^+_{[s,t]}$ and $p^-_{[s,t]}$ on $\mathcal{F}_s^t$ respectively.
Definition 3.1. If the following limits exist:

$$e_p(t) = \lim_{\Delta t\to 0+} \frac{1}{\Delta t}\, E^p\left( \ln\frac{dp^+_{[t,t+\Delta t]}}{dp^-_{[t,t+\Delta t]}} \right),$$

$$e_p(t, x) = \lim_{\Delta t\to 0+} \frac{1}{\Delta t}\, E^p\left( \ln\frac{dp^+_{[t,t+\Delta t]}}{dp^-_{[t,t+\Delta t]}} \,\Big|\, x_t = x \right),$$

then $e_p(t)$ and $e_p(t, x)$ are called the entropy production and entropy production density of the diffusion process $\{x_t\}_{t\ge 0}$ at time $t$ respectively.

A stationary diffusion process $\{x_t\}_{t\ge 0}$ is called reversible if $p^+_{[s,t]} = p^-_{[s,t]}$ holds for any $0 \le s < t < \infty$. The entropy production describes the irreversibility of a diffusion process. By the methods of Qian (see e.g. [QQ2]), we can prove easily that a stationary process $\{x_t\}_{t\ge 0}$ is reversible if and only if its entropy production $e_p(t)$ equals zero for all $t \ge 0$. The entropy production formula for diffusion processes on Euclidean space $\mathbb{R}^n$ has been given in [QQ2] without proof. In this section, we will prove the entropy production formula for drifted Brownian motions on a compact Riemannian manifold $M$.

Let $\{x_t\}_{t\ge 0}$ be a diffusion process on $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$, with $M$ as its state space. Set $(\eta_{s,t}x)_r(\omega) = x_{t+s-r}(\omega)$ for any $\omega \in \Omega$, $0 \le s \le r \le t < \infty$. $\{(\eta_{s,t}x)_r(\omega)\}_{s\le r\le t}$ is a diffusion process on the probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$. Denote by $\mathcal{R}_s^t$ the set of all functions which are measurable with respect to $\mathcal{F}_s^t$. Any $f \in \mathcal{R}_s^t$ may be represented as $f(\omega) = \tilde f \circ x(\omega)$, where $\tilde f$ is measurable with respect to the $\sigma$-algebra $\beta(\tilde W_s^t)$ of Borel sets associated to $\tilde W_s^t = C([s,t], M)$. Define a transformation $\eta^*_{s,t} : \mathcal{R}_s^t \to \mathcal{R}_s^t$ by:

$$(\eta^*_{s,t} f)(\omega) = (\tilde f \circ \eta_{s,t}x)(\omega)$$
for any $f = \tilde f \circ x$, $f \in \mathcal{R}_s^t$.

Proposition 3.2. Suppose that $\{x_t\}_{t\ge 0}$ is a Brownian motion without drift on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \tilde p)$. Let $\rho(x, u)$ be the probability density of $x_u$, $u \ge 0$. If $\rho(x, 0) > 0$ for any $x \in M$, then

$$E^{\tilde p}\Big[ f(\omega)\,\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)} \Big] = E^{\tilde p}[(\eta^*_{s,t} f)(\omega)] \tag{3.1}$$

holds for all $f \in \mathcal{R}_s^t$.

Proof. For any $s = t_0 < t_1 < \cdots < t_n = t$ and $f_0, f_1, \dots, f_n \in C(M)$, we have

$$E^{\tilde p}\Big[ \prod_{i=0}^n f_i(x_{t_i}(\omega))\,\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)} \Big] = \int_M \cdots \int_M \prod_{i=1}^n p(t_i - t_{i-1}, x_{t_i}, x_{t_{i-1}})\,\rho(x_t, s) \prod_{i=0}^n \big(f_i(x_{t_i})\,dx_{t_i}\big),$$

where $dx$ represents the Riemannian volume element of $M$ and $p(u, x, \cdot)$ is the transition probability density of the Brownian motion $\{x_t\}_{t\ge 0}$ without drift, which satisfies
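$p(u, x, y) = p(u, y, x)$. Hence the right-hand side of the last equality becomes

$$\begin{aligned}
&\int_M \cdots \int_M \rho(x_t, s) \prod_{i=1}^n p(t_i - t_{i-1}, x_{t_i}, x_{t_{i-1}}) \prod_{i=0}^n \big(f_i(x_{t_i})\,dx_{t_i}\big) \\
&\qquad = E^{\tilde p}[f_n(x_s(\omega))\,f_{n-1}(x_{t+s-t_{n-1}}(\omega)) \cdots f_1(x_{t+s-t_1}(\omega))\,f_0(x_t(\omega))] \\
&\qquad = E^{\tilde p}[f_0((\eta_{s,t}x)_s(\omega))\,f_1((\eta_{s,t}x)_{t_1}(\omega)) \cdots f_n((\eta_{s,t}x)_{t_n}(\omega))] \\
&\qquad = E^{\tilde p}[(\eta^*_{s,t} f)(\omega)];
\end{aligned}$$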
here $f(\omega) = \prod_{i=0}^n f_i(x_{t_i}(\omega))$. Hence we see that (3.1) holds for all $f$ in $\mathcal{R}_s^t$. This completes the proof. □

In the following, we suppose that $\{x_t\}_{t\ge 0}$ is the solution process of SDE (2.1), which is a Brownian motion with drift $X$ on the probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, p)$. Set

$$Z_{s,t}(\omega) = \exp\Big[ -\int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_u(\omega))\cdot dB_u^j - \frac{1}{2}\int_s^t \langle X, X\rangle(x_u(\omega))\,du \Big].$$
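Proposition 3.3. $Z_{s,t} \in \mathcal{R}_s^t$, and the following holds:

$$(\eta^*_{s,t} Z_{s,t})(\omega) = \exp\Big[ \int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_u(\omega))\cdot dB_u^j \Big] \cdot \exp\Big[ \frac{1}{2}\int_s^t (3\langle X, X\rangle + 2\,\mathrm{div}\,X)(x_u(\omega))\,du \Big].$$

Proof. By the compactness of $M$, we may assume that $M$ is a submanifold of $\mathbb{R}^N$ for a large $N$, and the Riemannian metric $\langle\cdot,\cdot\rangle$ of $M$ is induced by the Euclidean metric in $\mathbb{R}^N$. Observe that

$$\begin{aligned}
Z_{s,t}(\omega) &= \exp\Big[ -\int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_u(\omega))\circ dB_u^j \Big] \cdot \exp\Big[ \frac{1}{2}\int_s^t (\mathrm{div}\,X + \langle X, X\rangle - 2\langle X, Y\rangle)(x_u(\omega))\,du \Big] \\
&= \exp\Big[ -\int_s^t \langle X, \circ\,dx_u\rangle + \frac{1}{2}\int_s^t (\mathrm{div}\,X + \langle X, X\rangle)(x_u(\omega))\,du \Big].
\end{aligned}$$

By this expression and the stochastic calculus on $\mathbb{R}^N$, it is easy to see that $Z_{s,t} \in \mathcal{R}_s^t$. Observe that $\eta^*_{s,t}$ is a homomorphism of the algebra $\mathcal{R}_s^t$ and the following holds:

$$\eta^*_{s,t}\Big[ \int_s^t (\mathrm{div}\,X + \langle X, X\rangle)\,du \Big] = \int_s^t (\mathrm{div}\,X + \langle X, X\rangle)\,du.$$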
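Hence we see that Proposition 3.3 follows from the following:

$$\Big[\eta^*_{s,t}\Big( -\int_s^t \langle X, \circ\,dx_u\rangle \Big)\Big](\omega) = \int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_u(\omega))\cdot dB_u^j + \int_s^t \Big( \frac{1}{2}\,\mathrm{div}\,X + \langle X, X\rangle \Big)(x_u(\omega))\,du. \tag{3.2}$$

By the stochastic calculus, we have

$$\int_s^t \langle X, \cdot\,dx_u\rangle(\omega) = \lim_{n\to\infty} \sum_{k=0}^n \big\langle X(x_{u_k^{(n)}}(\omega)),\ x_{u_{k+1}^{(n)}}(\omega) - x_{u_k^{(n)}}(\omega) \big\rangle,$$

where $s = u_0^{(n)} < u_1^{(n)} < \cdots < u_n^{(n)} < u_{n+1}^{(n)} = t$ is a sequence of partitions of $[s, t]$ such that

$$\lim_{n\to\infty} \max_{0\le k\le n} |u_{k+1}^{(n)} - u_k^{(n)}| = 0.$$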
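Hence we have

$$\begin{aligned}
\Big[\eta^*_{s,t}\Big( \int_s^t \langle X, \cdot\,dx_u\rangle \Big)\Big](\omega)
&= \lim_{n\to\infty} \sum_{k=0}^n \big\langle X(x_{t+s-u_k^{(n)}}(\omega)),\ x_{t+s-u_{k+1}^{(n)}}(\omega) - x_{t+s-u_k^{(n)}}(\omega) \big\rangle \\
&= \lim_{n\to\infty} \Big[ -\sum_{k=0}^n \big\langle X(x_{u_k^{(n)}}(\omega)),\ x_{u_{k+1}^{(n)}}(\omega) - x_{u_k^{(n)}}(\omega) \big\rangle \\
&\qquad\qquad - \sum_{k=0}^n \big\langle X(x_{u_{k+1}^{(n)}}(\omega)) - X(x_{u_k^{(n)}}(\omega)),\ x_{u_{k+1}^{(n)}}(\omega) - x_{u_k^{(n)}}(\omega) \big\rangle \Big] \\
&= -\int_s^t \langle X, \cdot\,dx_u\rangle(\omega) - \int_s^t \langle dX(x_u), dx_u\rangle(\omega).
\end{aligned}$$

Since $\int_s^t \langle X, \circ\,dx_u\rangle(\omega) = \int_s^t \langle X, \cdot\,dx_u\rangle(\omega) + \frac{1}{2}\int_s^t \langle dX(x_u), dx_u\rangle(\omega)$, we get

$$\begin{aligned}
\eta^*_{s,t}\Big[ \int_s^t \langle X, \circ\,dx_u\rangle \Big](\omega)
&= \eta^*_{s,t}\Big[ \int_s^t \langle X, \cdot\,dx_u\rangle \Big](\omega) + \frac{1}{2}\int_s^t \langle dX(x_u), dx_u\rangle(\omega) \\
&= -\int_s^t \langle X, \cdot\,dx_u\rangle(\omega) - \frac{1}{2}\int_s^t \langle dX(x_u), dx_u\rangle(\omega) \\
&= -\int_s^t \langle X, \circ\,dx_u\rangle(\omega) \\
&= -\int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_u(\omega))\circ dB_u^j - \int_s^t \langle X, Y\rangle(x_u(\omega))\,du \\
&= -\int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_u(\omega))\cdot dB_u^j - \int_s^t \Big( \frac{1}{2}\,\mathrm{div}\,X + \langle X, X\rangle \Big)(x_u(\omega))\,du.
\end{aligned}$$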
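The behaviour of the approximating sums under time reversal, which drives the computation above, is an exact algebraic identity at the level of a fixed partition and can be checked on any discrete path. The sketch below is an illustration only: the helper names `ito_sum`, `cross_variation_sum` and `stratonovich_sum`, the random-walk path in $\mathbb{R}^2$ and the vector field `X_field` are all hypothetical choices, not objects from the paper.

```python
import numpy as np


def ito_sum(X, path):
    """Left-point sum: sum_k <X(x_k), x_{k+1} - x_k>."""
    inc = path[1:] - path[:-1]
    return float(np.sum(X(path[:-1]) * inc))


def cross_variation_sum(X, path):
    """sum_k <X(x_{k+1}) - X(x_k), x_{k+1} - x_k>, the discrete version of <dX, dx>."""
    inc = path[1:] - path[:-1]
    return float(np.sum((X(path[1:]) - X(path[:-1])) * inc))


def stratonovich_sum(X, path):
    """Trapezoidal sum: sum_k <(X(x_k) + X(x_{k+1})) / 2, x_{k+1} - x_k>."""
    inc = path[1:] - path[:-1]
    return float(np.sum(0.5 * (X(path[1:]) + X(path[:-1])) * inc))


def X_field(x):
    # A smooth, non-gradient vector field on R^2, applied row-wise to an array of points.
    return np.stack([np.sin(x[:, 1]), x[:, 0] ** 2], axis=1)


rng = np.random.default_rng(0)
path = np.cumsum(0.1 * rng.standard_normal((200, 2)), axis=0)   # discrete random-walk path
reversed_path = path[::-1]

# Left-point sums pick up an extra cross-variation term under reversal,
# while trapezoidal (Stratonovich-type) sums simply change sign.
print(ito_sum(X_field, reversed_path),
      -ito_sum(X_field, path) - cross_variation_sum(X_field, path))
print(stratonovich_sum(X_field, reversed_path), -stratonovich_sum(X_field, path))
```

The two printed pairs agree up to rounding, mirroring the displayed relations for the Itô and Stratonovich integrals before the limit is taken.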
This yields (3.2) and completes the proof. □

Now we can prove the entropy production and entropy production density formulas for the diffusion process $\{x_t\}_{t\ge 0}$.

Theorem 3.4. Let $\rho(x, r)$ be the density of $x_r$. If $\rho(x, 0) = \rho(x) > 0$ for all $x \in M$, then the entropy production density $e_p(t, x)$ and entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ can be expressed as:

$$e_p(t, x) = \frac{1}{2}\langle 2X - \nabla\ln\rho(x,t),\, 2X - \nabla\ln\rho(x,t)\rangle - \frac{\partial\ln\rho(x,t)}{\partial t},$$

$$e_p(t) = \frac{1}{2}\int_M \Big( \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle - 2\,\frac{\partial\ln\rho}{\partial t} \Big)\rho(x,t)\,dx.$$
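Proof. Define a new probability measure $\tilde p$ on $(\Omega, \mathcal{F}_s^t)$ by $\frac{d\tilde p}{dp}\big|_{\mathcal{F}_s^t} = Z_{s,t}$. Notice that

$$E^{p^-_{[s,t]}}[f(\omega)] = E^{p}[(\eta^*_{s,t} f)(\omega)] = E^{\tilde p}[(\eta^*_{s,t} f)(\omega)\,Z_{s,t}^{-1}(\omega)]$$

holds for any $f \in \mathcal{R}_s^t$. It follows from the discussion in Sect. 2 that $\{x_r\}_{s\le r\le t}$ is a Brownian motion without drift on the new probability space $(\Omega, \mathcal{F}_s^t, \mathcal{F}_r, \tilde p)$. Observe also that we have $(\eta^*_{s,t})^{-1} = \eta^*_{s,t}$ and $\eta^*_{s,t}(Z_{s,t}^{-1}) = (\eta^*_{s,t} Z_{s,t})^{-1}$. Thus by (3.1), we see that

$$\begin{aligned}
E^{\tilde p}[(\eta^*_{s,t} f)(\omega)\,Z_{s,t}^{-1}(\omega)]
&= E^{\tilde p}\Big[ f(\omega)\,(\eta^*_{s,t} Z_{s,t})^{-1}(\omega)\,\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)} \Big] \\
&= E^{p^+_{[s,t]}}\Big[ f(\omega)\,Z_{s,t}(\omega)\,(\eta^*_{s,t} Z_{s,t})^{-1}(\omega)\,\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)} \Big].
\end{aligned}$$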
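Hence we get

$$\frac{dp^-_{[s,t]}}{dp^+_{[s,t]}}(\omega) = Z_{s,t}(\omega)\,(\eta^*_{s,t} Z_{s,t})^{-1}(\omega)\,\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)}.$$

By Proposition 3.3, we now get

$$\frac{dp^-_{[s,t]}}{dp^+_{[s,t]}}(\omega) = \exp\Big[ -2\int_s^t \sum_{j=1}^d \langle X, X_j\rangle(x_r(\omega))\cdot dB_r^j \Big] \cdot \exp\Big[ -\int_s^t (\mathrm{div}\,X + 2\langle X, X\rangle)(x_r(\omega))\,dr + \ln\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)} \Big].$$

Since $\frac{\partial\rho}{\partial t} = \frac{1}{2}\Delta\rho - \langle X, \nabla\rho\rangle - \rho\,\mathrm{div}\,X$, by Itô's formula we have

$$\ln\frac{\rho(x_t(\omega), s)}{\rho(x_s(\omega), s)} = \int_s^t \sum_{j=1}^d \langle \nabla\ln\rho, X_j\rangle(x_r(\omega))\cdot dB_r^j + \int_s^t \Big[\Big(\frac{1}{2}\Delta + X\Big)(\ln\rho)\Big](x_r(\omega))\,dr.$$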
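Hence

$$\begin{aligned}
\frac{dp^-_{[s,t]}}{dp^+_{[s,t]}}
&= \exp\Big[ -\int_s^t \sum_{j=1}^d \langle 2X - \nabla\ln\rho, X_j\rangle(x_r(\omega))\cdot dB_r^j \Big] \cdot \exp\Big[ -\frac{1}{2}\int_s^t \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle(x_r(\omega))\,dr \Big] \\
&\qquad \cdot \exp\Big[ \int_s^t \Big( -X\ln\rho + \frac{1}{2\rho}\Delta\rho - \mathrm{div}\,X \Big)(x_r(\omega))\,dr \Big] \\
&= \exp\Big[ -\int_s^t \sum_{j=1}^d \langle 2X - \nabla\ln\rho, X_j\rangle(x_r(\omega))\cdot dB_r^j \Big] \cdot \exp\Big[ -\frac{1}{2}\int_s^t \Big( \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle - 2\,\frac{\partial\ln\rho}{\partial s} \Big)(x_r(\omega))\,dr \Big].
\end{aligned}$$

Therefore, Theorem 3.4 follows immediately by taking the limit $\Delta t \to 0+$ in the expression of $\ln\big( dp^+_{[t,t+\Delta t]} / dp^-_{[t,t+\Delta t]} \big)$. □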
In the case of $\{x_t\}_{t\ge 0}$ being stationary, by Theorem 3.4 we have the following

Corollary 3.5. If $\rho(x, 0) = \rho(x)$ is an invariant density of $\{x_t\}_{t\ge 0}$, then the entropy production density $e_p(t, x)$ and entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ are given by

$$e_p(t, x) = \frac{1}{2}\langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle(x)$$

and

$$e_p(t) = \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle(x)\,\rho(x)\,dx$$
respectively.

By Corollary 3.5, we see that a stationary drifted Brownian motion on $M$ is reversible (i.e., its entropy production $e_p(t) = 0$) if and only if its drift $X$ is a gradient vector field. This result is of course known, see e.g., p. 294 of [IW].

4. Entropy Production and Rotation Numbers

In this section, we suppose that the solution process $\{x_t\}_{t\ge 0}$ of SDE (2.1) is a Brownian motion with drift $X$ which admits an invariant initial density $\rho(x) > 0$, $\forall x \in M$. Suppose that the first homology group $H_1(M, \mathbb{R}^1)$ of $M$ has a finite integral basis $\gamma_1, \dots, \gamma_{b_1}$ ($b_1$ being the first Betti number of $M$, i.e. $b_1 = \dim H_1(M, \mathbb{R}^1)$). Each $\gamma_k$ is a closed curve which can be assumed to be smooth. For any $T > 0$, let $L_T = \{x_t(\omega) \mid 0 \le t \le T\}$ be an orbit of $\{x_t\}_{t\ge 0}$. We join the endpoints $x_0(\omega)$ and $x_T(\omega)$ of $L_T$ with the shortest geodesic arc $L_{0,T}$. Thus $\gamma(T, \omega) = L_T \cup L_{0,T}$ is a closed curve, and there exist integers $n_1(T, \omega), \dots, n_{b_1}(T, \omega)$ such that

$$\gamma(T, \omega) = \sum_{i=1}^{b_1} n_i(T, \omega)\,\gamma_i$$
holds in the homology sense. The rotation number $\alpha_i$ of $\{x_t\}_{t\ge 0}$ around the closed curve $\gamma_i$ is then defined as the following limit:

$$\alpha_i = \lim_{T\to\infty} \frac{1}{T}\,n_i(T, \omega), \qquad i = 1, \dots, b_1.$$

It is known that these rotation numbers exist and are independent of $\omega$. In fact, they are given by (see e.g., [M])

$$\alpha_i = \int_M (\omega_i, X)(x)\rho(x)\,dx, \qquad i = 1, \dots, b_1, \tag{4.1}$$
where $\omega_i$ denotes the harmonic one-form among the dual one-forms of $\gamma_i$. We remark that the rotation number formula can be rederived by considering a lifted process on the universal covering manifold $\tilde M$ of $M$.

The rotation (also called circulation) of the diffusion process $\{x_t\}_{t\ge 0}$ is closely related to its irreversibility. All the rotation numbers $\alpha_1, \dots, \alpha_{b_1}$ of a reversible diffusion process are equal to zero. Note that the converse holds only when the dual one-form $X^*$ of $X$ is closed (see e.g. [IW]). Now we will rewrite the entropy production formula (Corollary 3.5), so that the relationship between the entropy production and circulation becomes clearer.

Denote by $X^*$ the dual one-form of $X$. Let $X^* = \alpha + \beta + \gamma$ be its Hodge decomposition, with $\alpha, \beta, \gamma$ being the exact, co-exact and harmonic one-forms respectively. Now we give the following theorem, from which we can see how the rotation numbers contribute to the entropy production.

Theorem 4.1. The entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ is given by

$$e_p(t) = 2(\beta, \rho X^*) + 2(\gamma, \rho X^*),$$

where $(\cdot,\cdot)$ stands for the Hodge inner product.

Proof. Set $C = 2\rho X - \nabla\rho$. Denote its dual one-form by $C^*$. By Corollary 3.5, we have

$$\begin{aligned}
e_p(t) &= \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\, 2X - \nabla\ln\rho\rangle(x)\,\rho(x)\,dx \\
&= \frac{1}{2}\int_M \langle 2X - \nabla\ln\rho,\, C\rangle(x)\,dx \\
&= \frac{1}{2}(2X^* - d\ln\rho,\, C^*) \\
&= \frac{1}{2}(2\alpha - d\ln\rho,\, C^*) + \frac{1}{2}(2\beta, C^*) + \frac{1}{2}(2\gamma, C^*).
\end{aligned}$$

Observe that $\rho$ satisfies $\mathrm{div}(2\rho X - \nabla\rho) = 0$. Hence $\delta C^* = -\mathrm{div}\,C = 0$ (see e.g., p. 223 of [W]), i.e., $C^*$ is co-closed. Since $2\alpha - d\ln\rho$ is exact, this yields $(2\alpha - d\ln\rho, C^*) = 0$. Now we get

$$e_p(t) = (\beta, C^*) + (\gamma, C^*). \tag{4.2}$$

Since $C^* = 2\rho X^* - d\rho$, $(\beta, d\rho) = (\delta\beta, \rho) = 0$ and $(\gamma, d\rho) = (\delta\gamma, \rho) = 0$, we see clearly that Theorem 4.1 follows from (4.2). □
By the rotation number formula (4.1), we see clearly that $(\gamma, \rho X^*)$ can be represented as a linear sum of the rotation numbers $\alpha_1, \dots, \alpha_{b_1}$ of $\{x_t\}_{t\ge 0}$ around the closed curves $\gamma_1, \dots, \gamma_{b_1}$. In the following, we shall explain that $(\beta, \rho X^*)$ represents a hidden circulation $\alpha_0$ of $\{x_t\}_{t\ge 0}$. Therefore the irreversibility of $\{x_t\}_{t\ge 0}$ is characterized in terms of its circulations $\alpha_0, \alpha_1, \dots, \alpha_{b_1}$, just as in the case of Markov chains (see [QQ1]). When the dual one-form $X^*$ of $X$ is closed, the hidden circulation is zero and the entropy production $e_p(t)$ is then a linear sum of the rotation numbers $\alpha_1, \dots, \alpha_{b_1}$.

Define a connection of the principal bundle $M \times S^1$ over $M$ by the differential one-form $iA = 2\pi i\beta$ on $M$. With respect to this connection, the diffusion $\{x_t\}_{t\ge 0}$ can be horizontally lifted to $M \times S^1$ (see Sect. 2). We define the rotation number of the horizontal lifting process around the circle $S^1$ as the hidden circulation of $\{x_t\}_{t\ge 0}$. To be more precise, suppose that $\{(x_t, g_t)\}_{t\ge 0}$ is the lifting process of $\{x_t\}_{t\ge 0}$, $g_t = e^{i\theta_t} \in S^1$, $\theta_t$ being continuous with respect to $t$ and the initial condition $\theta_0 = 0$ being given. The hidden circulation $\alpha_0$ of $\{x_t\}_{t\ge 0}$ is then defined by

$$\alpha_0 = \lim_{t\to\infty} \frac{1}{2\pi t}\,\theta_t.$$
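Theorem 4.2. The hidden circulation $\alpha_0$ of $\{x_t\}_{t\ge 0}$ is given by $\alpha_0 = (\beta, \rho X^*)$.

Proof. By (2.5), we have

$$\theta_t = 2\pi\int_0^t \Big[ \sum_{j=1}^d (\beta, X_j)(x_s)\circ dB_s^j + (\beta, Y)(x_s)\,ds \Big].$$

Let $\beta^*$ be the dual vector field of $\beta$. Using (2.2) we can prove easily that

$$\sum_{j=1}^d \langle \nabla_{X_j}\beta^*, X_j\rangle = \mathrm{div}(\beta^*).$$

By Itô's formula, we get

$$\begin{aligned}
\theta_t &= 2\pi\int_0^t \Big[ \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + \Big( \frac{1}{2}\sum_{j=1}^d X_j(\beta, X_j) + (\beta, Y) \Big)(x_s)\,ds \Big] \\
&= 2\pi\int_0^t \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + 2\pi\int_0^t \Big( \frac{1}{2}\sum_{j=1}^d \langle \nabla_{X_j}\beta^*, X_j\rangle + \Big\langle \beta^*,\, Y + \frac{1}{2}\sum_{j=1}^d \nabla_{X_j}X_j \Big\rangle \Big)(x_s)\,ds \\
&= 2\pi\int_0^t \Big[ \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + \Big( \frac{1}{2}\,\mathrm{div}(\beta^*) + \langle \beta^*, X\rangle \Big)(x_s)\,ds \Big].
\end{aligned}$$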
Since $\mathrm{div}(\beta^*) = \delta\beta = 0$, we have

$$\theta_t = 2\pi\Big[ \int_0^t \sum_{j=1}^d (\beta, X_j)(x_s)\cdot dB_s^j + \int_0^t (X, \beta)(x_s)\,ds \Big]. \tag{4.3}$$

By the stochastic analysis and the compactness of $M$, it is easy to prove that

$$\lim_{t\to\infty} E\Big( \Big| \frac{1}{t}\int_0^t (\beta, X_j)(x_s)\cdot dB_s^j \Big|^2 \Big) = 0.$$

Hence, by Chebyshev's inequality we get

$$\lim_{t\to\infty} \frac{1}{t}\int_0^t (\beta, X_j)(x_s)\cdot dB_s^j = 0, \qquad j = 1, \dots, d.$$

On the other hand, by the ergodicity of $\{x_t\}_{t\ge 0}$, we have

$$\lim_{t\to\infty} \frac{1}{t}\int_0^t (X, \beta)(x_s)\,ds = \int_M (X, \beta)(x)\rho(x)\,dx.$$

Thus by (4.3), we see clearly that

$$\alpha_0 = \lim_{t\to\infty} \frac{1}{2\pi t}\,\theta_t = \int_M (X, \beta)(x)\rho(x)\,dx,$$
which completes the proof. □

By Theorem 4.1 and Theorem 4.2, we see that the entropy production of $\{x_t\}_{t\ge 0}$ can be represented in terms of its rotation numbers $\alpha_1, \dots, \alpha_{b_1}$ and hidden circulation $\alpha_0$. This can be stated as follows:

Theorem 4.3. The entropy production $e_p(t)$ of $\{x_t\}_{t\ge 0}$ is represented as

$$e_p(t) = 2\alpha_0 + 2\sum_{i=1}^{b_1} (X^*, \omega_i)\,\alpha_i.$$
Acknowledgement. We would like to express our thanks to Professor Guo Mao-zheng for his helpful discussion.
References
[A] Andrej, L.: Phys. Lett. 111A, 45–46 (1982)
[AA] Arnold, V.I. & Avez, A.: Ergodic problems of classical mechanics. New York: W.A. Benjamin, 1968
[AHHK] Albeverio, S., Høegh-Krohn, R., Holden, H. & Kolsrud, T.: A covariant Feynman–Kac formula for unitary bundles over Euclidean space. In: Stochastic partial differential equations and its applications (G. Da Prato & L. Tubaro eds.), Lecture Notes in Mathematics 1390, Berlin: Springer-Verlag, 1989, pp. 1–12
[AW] Albeverio, S. and Zheng-dong, Wang: Representation of the propagator and Schwinger functions of Dirac fields in terms of Brownian motions. J. Math. Phys. 36, No. 10, 5207–5216 (1995)
[E] Elworthy, K.D.: Geometric aspects of diffusions on manifolds. In: École d'Été de Probabilités de Saint-Flour XV–XVII, Proceedings 1985–87 (P. L. Hennequin ed.), Lecture Notes in Mathematics 1362, Berlin: Springer-Verlag, 1988, pp. 277–425
[Ev] Evans et al: Statistical Mechanics of Non-equilibrium fluids. New York: Academic Press, 1990
[G] Gallavotti, G.: The chaotic hypothesis and universal large deviations properties. In: Abstracts of Plenary and Invited Lectures of ICM 1998, Berlin, 1998, p. 6
[H] Hoover et al: Phys. Rev. Lett. 59, 10–13 (1987)
[IW] Ikeda, N. & Watanabe, S.: Differential equations and diffusion processes (second edition). Amsterdam: North Holland-Kodansha, 1989
[K] Kalpazidou, S.: Cycle representation of Markov processes. New York: Springer-Verlag, 1995
[Ku] Kurchan: Fluctuation theorem for stochastic dynamics. J. Phys. A 31, 3719–3729 (1998)
[M] Manabe, S.: Stochastic intersection number and homological behavior of diffusion processes on manifolds. Osaka J. Math. 19, 429–457 (1982)
[P] Prigogine, I.R.: From being to becoming. San Francisco: W. H. Freeman and Company, 1980
[QQ1] Qian, Min-ping & Qian, Min: Circulation for recurrent Markov chains. Zeit. für Wahr. Ver. Gef. 59, 203–212 (1982)
[QQ2] Qian, Min-ping & Qian, Min: The entropy production and irreversibility of Markov processes. In: Proc. 1st World Congr. Bernoulli Soc., 1988, pp. 307–316
[S] Simon, B.: Functional integration and mathematical physics. New York: Academic Press, 1979
[Si] Sinai, Ya.G.: Topics in Ergodic Theory. Princeton, NJ: Princeton University Press, 1994
[W] Wu, Hong-Xi: Elements of Riemannian geometry. Beijing: Peking University Press, 1988
[WGQ] Wang, Zheng-dong, Guo, Mao-zheng & Qian, Min: Diffusion processes on principal bundles and differential operators on the associated bundles. Science in China (series A) 35, 385–398 (1992)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 447 – 462 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Entropic Repulsion for the Free Field: Pathwise Characterization in d ≥ 3
Jean-Dominique Deuschel¹, Giambattista Giacomin²,*
¹ Fachbereich Mathematik, TU Berlin, D-10623 Berlin, Germany. E-mail: [email protected]
² Département de Mathématiques, EPFL, CH-1015 Lausanne, Switzerland
Received: 26 October 1998 / Accepted: 5 April 1999
Abstract: We study concentration properties of the lattice free field $\{\varphi_x\}_{x\in\mathbb{Z}^d}$ in $d \ge 3$, i.e. the centered Gaussian field with covariance given by the Green function of the (discrete) Laplacian, when constrained to be positive in a region of volume $O(N^d)$ (hard-wall condition). It has been shown in [3] that, as $N \to \infty$, the conditioned field is pushed to infinity: more precisely, the typical value of the $\varphi$-variable is, to leading order, $c\sqrt{\log N}$, and the exact value of $c$ was found. It was moreover conjectured that the conditioned field, once this diverging height is subtracted, converges weakly to the lattice free field. Here we prove this conjecture, along with other explicit bounds, always in the direction of clarifying the intuitive idea that the free field with hard-wall conditioning merely translates away from the hard wall. We also give a proof, alternative to the one presented in [3], of the lower bound on the probability that the free field is everywhere positive in a region of volume $N^d$.

1. Introduction and Main Result

Let $\varphi = \{\varphi_x\}_{x\in\mathbb{Z}^d}$ ($d \ge 3$) be the massless free field, i.e. the Gaussian process with zero mean and covariance operator $-\Delta^{-1}$, with $\Delta$ the discrete Laplacian,

$$\Delta f(x) = \sum_{e\in\mathbb{Z}^d:\,|e|=1} \big(f(x+e) - f(x)\big), \qquad f:\mathbb{Z}^d\to\mathbb{R}. \tag{1.1}$$
We will denote by $G(x,y)$ the matrix element $(-\Delta^{-1})_{x,y}$ and we set $G = G(0,0)$. Observe that $G$ is $1/2d$ times the Green function of the simple random walk on $\mathbb{Z}^d$. We will denote by $\mathbb{P}$ the probability distribution of $\varphi$ and by $\mathbb{E}$ the corresponding expectation.

* Present address: Dipartimento di Matematica, Università di Milano, via Saldini 50, 20133 Milano, Italy. E-mail: [email protected]

$\Omega \equiv \mathbb{R}^{\mathbb{Z}^d}$ is endowed with the product topology. It is easy to check that $\mathbb{P} \in \mathcal{M}_1(\mathbb{R}^{\mathbb{Z}^d})$ is a Gibbs measure with formal Hamiltonian

$$H(\varphi) = \frac{1}{4}\sum_{x,y\in\mathbb{Z}^d:\,|x-y|=1} \big(\varphi_x - \varphi_y\big)^2. \tag{1.2}$$

By this we mean that for every $x \in \mathbb{Z}^d$,

$$\mathbb{P}\big(d\phi \,\big|\, \mathcal{F}_{\{x\}^c}\big)(\varphi) = \frac{\exp\Big(-\frac{1}{2}\sum_{y:\,|y-x|=1}(\phi - \varphi_y)^2\Big)\,d\phi}{\int_{\mathbb{R}} \exp\Big(-\frac{1}{2}\sum_{y:\,|y-x|=1}(\phi' - \varphi_y)^2\Big)\,d\phi'}, \qquad \mathbb{P}(d\varphi)\text{-a.s.}, \tag{1.3}$$
in which $\mathcal{F}_A$, $A \subset \mathbb{Z}^d$, is the σ-algebra generated by $\{\varphi_x\}_{x\in A}$. Note that $H(\varphi)$ is well defined if $\varphi_x = \text{const.}$ for $x$ in the complement of a finite set and that adding a constant to such a $\varphi$ (i.e. $\varphi_x \to \varphi_x + \text{const.}$ for every $x$) does not change the value of $H$. The latter property goes under the name of continuum symmetry and it gives the model several interesting properties, such as the fact that there is a continuum of Gibbs measures associated to $H$. We refer to [8, Ch. 13] for an accurate presentation of the Gibbsian characterization of $\mathbb{P}$ and related results (see also Sect. 2 below).

Our attention will be focused on $\mathbb{P}$ conditioned to the entropic repulsion event

$$\Omega^+_N = \{\varphi \in \Omega : \varphi_x \ge 0 \text{ for all } x \in V_N\}, \tag{1.4}$$

where $V_N = NV \cap \mathbb{Z}^d$, $N \in \mathbb{Z}^+$ and $V \subset \mathbb{R}^d$ is a bounded domain which satisfies a uniform (interior) cone condition (i.e. there exists a right circular cone $K \subset \mathbb{R}^d$, $K$ open set, such that for every $r \in V$ there exists a map $S:\mathbb{R}^d\to\mathbb{R}^d$, composition of a rotation and a translation, such that $SK$ has vertex $r$ and $SK \subset V$). Therefore we set

$$\mathbb{P}^+_N(\cdot) = \mathbb{P}\big(\,\cdot\,\big|\,\Omega^+_N\big). \tag{1.5}$$

This is a very simple model for an interface lying above a hard wall: $\varphi_x$ represents the height of the interface at the site $x$ and the wall is assumed to be at $\varphi \equiv 0$. What is expected is that the hard wall will push the interface away from itself, i.e. that $\mathbb{P}^+_N$ concentrates on trajectories (in this case: interfaces) which lie further and further from $\varphi \equiv 0$ as $N$ grows (see [5,12] and [6] for physical background and some estimates on more general models). The exact distance at which the interface is pushed has been found in [3], in the case of the free field: for our purposes we need a strengthened version of the result. What we prove in Section 3, Proposition 3.3, is that

$$\lim_{N\to\infty}\,\sup_{x\in V_N}\left|\frac{\mathbb{E}^+_N(\varphi_x)}{\sqrt{4G\log N}} - 1\right| = 0. \tag{1.6}$$

In [3, Sect. 4] the statement (1.6) had been established only in the bulk, i.e. if we replace the supremum over $x \in V_N$ with the supremum over $x \in (V_\varepsilon)_N$, $V_\varepsilon = \{r \in V : \mathrm{dist}(r, V^c) > \varepsilon\}$, with $\varepsilon > 0$. We remark that in [3] only the case $V = [-1,1]^d$ was considered. The extension of the results to a domain $V$ as considered here is straightforward (see also Sect. 4 below).

In [3] it has also been conjectured that $\mathbb{P}^+_N$, once the diverging repulsion distance is subtracted, would converge (as $N\to\infty$) to $\mathbb{P}$ itself. This is in fact our main result: for $a \in \mathbb{R}$, let us denote by $\mathbb{P}^+_{a,N}$ the law of the field $\{\varphi_x - a\}_{x\in\mathbb{Z}^d}$, where $\varphi$ is distributed according to $\mathbb{P}^+_N$. In what follows $\Rightarrow$ denotes weak convergence of measures. We have the following

Theorem 1.1. There exists a sequence of real numbers $a(N)$ satisfying

$$\lim_{N\to\infty}\frac{a(N)}{\sqrt{4G\log N}} = 1, \tag{1.7}$$

such that

$$\mathbb{P}^+_{a(N),N} \;\overset{N\to\infty}{\Longrightarrow}\; \mathbb{P}. \tag{1.8}$$
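To get a feel for the scale in Theorem 1.1, the constant G = G(0,0) can be estimated numerically from the Fourier representation of the inverse of minus the discrete Laplacian. The following is only an illustrative sketch, not part of the argument: the function names, the midpoint quadrature and the grid size are assumptions made here, and the quadrature is accurate to a few parts in a thousand at best.

```python
import numpy as np

def green_at_origin(d=3, n=64):
    """Midpoint-rule estimate of G = (2*pi)^(-d) * integral over (-pi, pi]^d of 1/mu(k),
    where mu(k) = 2 * sum_i (1 - cos k_i) is the Fourier symbol of minus the
    discrete Laplacian of (1.1). The grid size n is an arbitrary choice."""
    h = 2.0 * np.pi / n
    k = -np.pi + (np.arange(n) + 0.5) * h      # midpoints, so k = 0 is never sampled
    grids = np.meshgrid(*([k] * d), indexing="ij")
    mu = sum(2.0 * (1.0 - np.cos(g)) for g in grids)
    return float(np.mean(1.0 / mu))            # mean over the grid = normalized integral

def repulsion_height(N, G):
    """Leading-order height sqrt(4 G log N) appearing in (1.6) and (1.7)."""
    return float(np.sqrt(4.0 * G * np.log(N)))

G = green_at_origin()
for N in (10, 100, 1000):
    print(N, round(repulsion_height(N, G), 3))
```

The slow growth of the printed heights with N illustrates why the repulsion is only visible on a logarithmic scale.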
The proof is an immediate consequence of Proposition 2.1 and Proposition 3.1 below. In the proof we take

$$a(N) = \mathbb{E}^+_N(\varphi_0), \tag{1.9}$$
and (1.7) follows from (1.6). Theorem 1.1 is of a local nature, but we will also establish some more global results (see in particular Corollary 3.2 below).

We will prove Theorem 1.1 in two steps. We will first (in Sect. 2) establish the convergence of $\mathbb{P}^+_N$, once recentered (by subtracting its mean). Then (Sect. 3) we will show that $\mathbb{E}^+_N(\varphi_x - \varphi_y)$ tends to zero as $N\to\infty$ and this will allow us to replace the mean with an ($N$-dependent) constant, thus completing the proof of Theorem 1.1.

We observe that in spite of the fact that we start off in a Gaussian setting, due to the constraint, $\mathbb{P}^+_N$ is of course non-Gaussian, and a central role in our analysis is played by the Brascamp–Lieb (B–L) inequality [4], which is a tool developed to deal with non-Gaussian situations. Here we will use the following form of the inequality: for every compactly supported $f:\mathbb{Z}^d\to\mathbb{R}$ and every $N\in\mathbb{Z}^+$,

$$\mathbb{E}^+_N\Big[F\Big((f,\varphi) - \mathbb{E}^+_N\big[(f,\varphi)\big]\Big)\Big] \le \mathbb{E}\big[F\big((f,\varphi)\big)\big], \tag{1.10}$$

where $(f,\varphi) = \sum_x f(x)\varphi_x$ and $F:\mathbb{R}\to\mathbb{R}$ is either $F(r) = |r|^\beta$, any $\beta \ge 1$, or $F(r) = \exp(r)$. The proof is an application of [4, Th. 5.1]: it can be found in [6], but, for completeness, we sketch it here. The key observation is that $\infty\,\mathbf{1}_{\{\varphi_x\le 0\}}$ is a convex function and the entropic repulsion constraint can be enforced on $\mathbb{P}$ by changing the measure with the exponential factor $\exp\big(-\sum_{x\in V_N}\infty\,\mathbf{1}_{\{\varphi_x\le 0\}}\big)$, properly normalized. To apply directly the result in [4] it is sufficient to approximate $\infty\,\mathbf{1}_{\{\varphi_x\le 0\}}$ with a $C^2$ convex function, for example $\alpha(\varphi_x)^4\mathbf{1}_{\{\varphi_x\le 0\}}$, $\alpha\in\mathbb{R}^+$, and to consider the centered Gaussian field $\mathbb{P}^{(M)}$ on $\mathbb{R}^{\mathbb{Z}^d}$ with covariance given by $-\Delta_M^{-1}$, the inverse Laplacian with zero boundary conditions outside $\Lambda_M$, $\Lambda = (-1,1)^d$ and $M\in\mathbb{Z}^+$. We define $\mathbb{P}^{(M,N,\alpha)}$ to be the probability measure satisfying $d\mathbb{P}^{(M,N,\alpha)}/d\mathbb{P}^{(M)} \propto \exp\big(-\alpha\sum_{x\in V_N}\mathbf{1}_{\{\varphi_x\le 0\}}\big)$. By [4, Th. 5.1], in the case $F(r) = |r|^\beta$ the inequality (1.10) is established uniformly in $M$, $N$ and $\alpha$, if we replace $\mathbb{E}^+_N$ with $\mathbb{E}^{(M,N,\alpha)}$ and $\mathbb{E}$ with $\mathbb{E}^{(M)}$. By letting first $M\to\infty$ and then $\alpha\to\infty$ we conclude. The case of $F(r) = \exp(r)$ is reduced to the case $F(r) = r^2$ by the differentiation–integration identity

$$\log \mathbb{E}^{(M,N,\alpha)}\Big[\exp\Big\{(f,\varphi) - \mathbb{E}^{(M,N,\alpha)}\big((f,\varphi)\big)\Big\}\Big] = \int_0^1\!\!\int_0^t \mathrm{var}_{\mathbb{P}^{(M,N,\alpha)}_s}\big((f,\varphi)\big)\,ds\,dt,$$

where $\mathbb{P}^{(M,N,\alpha)}_s$ is the probability measure such that

$$d\mathbb{P}^{(M,N,\alpha)}_s/d\mathbb{P}^{(M,N,\alpha)} \propto \exp\{s(f,\varphi)\}.$$

In fact $\mathrm{var}_{\mathbb{P}^{(M,N,\alpha)}_s}\big((f,\varphi)\big) \le \mathrm{var}_{\mathbb{P}^{(M)}}\big((f,\varphi)\big)$ [4] and the proof of (1.10) is concluded by taking limits.

An inequality similar to (1.10) holds true also in a fully non-Gaussian setting, i.e. in the case in which $H$ is the sum of convex functions with second derivative bounded away from zero. Various entropic repulsion results in this context are established in [6].

Crucial in establishing (1.6), and therefore for our result, is understanding the asymptotics of $\mathbb{P}(\Omega^+_N)$: while for the results in Sect. 2 we will only need (roughly) that the field is pushed toward infinity by the hard wall, to establish Theorem 1.1 we need to know that the field is pushed at distance $\text{const.}\sqrt{\log N}$ and to have a relatively precise control on the value of the constant. We include in this paper (Sect. 4) a proof of the lower bound on $\mathbb{P}(\Omega^+_N)$, alternative to the one presented in [3, Th. 1.1]: this is very close in spirit to the original proof, but it relies on a well-known technique of field theory, thus providing a bridge from [3] to the earlier literature.

2. Convergence of the Centered Field

In this section we focus on the recentered field: for each $f:\mathbb{Z}^d\to\mathbb{R}$ define the shift map $T_f:\Omega\to\Omega$ by $(T_f\varphi)_x = \varphi_x - f(x)$. The recentered field is then

$$\hat{\mathbb{P}}_N = \mathbb{P}^+_N T_f^{-1}, \qquad \text{with } f(x) = \mathbb{E}^+_N(\varphi_x). \tag{2.1}$$
The main result of this section is

Proposition 2.1. With the definitions above

$$\hat{\mathbb{P}}_N \;\overset{N\to\infty}{\Longrightarrow}\; \mathbb{P}. \tag{2.2}$$

We start with two preliminary lemmas.

Lemma 2.2. $\{\hat{\mathbb{P}}_N\}_{N\in\mathbb{Z}^+}$ is tight and any limit point $\hat{\mathbb{P}}$ satisfies

$$\lim_{n\to\infty}\hat{\mathbb{E}}\big[(S_n(x))^2\big] = 0, \qquad \text{for all } x\in\mathbb{Z}^d, \tag{2.3}$$

where $S_n(x) = \sum_y f_n(y)\varphi_{x+y}$ and $\{f_n\}_{n\in\mathbb{Z}^+}$ is any sequence of functions such that $\|f_n\|_1 \le 1$ and $\lim_{n\to\infty}\|f_n\|_\infty = 0$.

Proof. By the B–L inequality (1.10) and the definition of $\hat{\mathbb{P}}_N$ we obtain that for every $x\in\mathbb{Z}^d$,

$$\sup_{N\in\mathbb{Z}^+}\hat{\mathbb{E}}_N\big[(\varphi_x)^2\big] \le \mathbb{E}\big[(\varphi_x)^2\big] = G(0,0) < \infty, \tag{2.4}$$

and therefore $\{\hat{\mathbb{P}}_N\}_{N\in\mathbb{Z}^+}$ is tight. B–L once again gives us also that for any $n\in\mathbb{Z}^+$,

$$\sup_{N\in\mathbb{Z}^+}\hat{\mathbb{E}}_N\big[(S_n(x))^2\big] \le \sum_{x,y} f_n(x)G(x,y)f_n(y). \tag{2.5}$$

Denoting by $G$ the (convolution) operator $Gf(x) = \sum_y G(x,y)f(y)$, by the Hölder and the Young inequality we have that

$$\sum_x f_n(x)(Gf_n)(x) \le \|f_n\|_q\,\|G\|_p\,\|f_n\|_1 \le \|f_n\|_q\,\|G\|_p, \tag{2.6}$$

whenever $1/p + 1/q = 1$. By using the decay of $G$ at infinity [11, §1.5] we have that $\|G\|_p < \infty$ if $p > d/(d-2)$ and by interpolation $\lim_{n\to\infty}\|f_n\|_q = 0$ for all $q > 1$. This establishes (2.3). □

Let us now define

$$m_x(\varphi) = \frac{1}{2d}\sum_{y:\,|y-x|=1}\varphi_y. \tag{2.7}$$
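We have the following expression for the expectation of the Laplacian of $\varphi$:

Lemma 2.3. Let $\Psi(r) = \big(\int_{-\infty}^r \exp\{-s^2/2\}\,ds\big)/\sqrt{2\pi}$. For each $x\in\mathbb{Z}^d$,

$$\mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\big] = \begin{cases} \dfrac{1}{\sqrt{4\pi d}}\,\mathbb{E}^+_N\left[\dfrac{\exp\big(-d(m_x(\varphi))^2\big)}{1 - \Psi\big(-\sqrt{2d}\,m_x(\varphi)\big)}\right] & \text{if } x\in V_N,\\[2ex] 0 & \text{otherwise.}\end{cases} \tag{2.8}$$

Proof. We write

$$\mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\big] = \mathbb{E}^+_N\Big[\mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\,\big|\,\mathcal{F}_{\{x\}^c}\big]\Big], \tag{2.9}$$

and therefore if $x\in V_N^c$ the quantity in (2.9) is equal to zero, since in this case we can take away the repulsion in the conditional expectation in the right-hand side and the result follows by the DLR characterization of the free field. If $x\in V_N$ we extract the conditioning on $\Omega^+_x \equiv \{\varphi : \varphi_x \ge 0\}$,

$$\mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\big] = \mathbb{E}^+_N\left[\frac{\mathbb{E}\big[(\varphi_x - m_x(\varphi))\,\mathbf{1}_{\Omega^+_x}\,\big|\,\mathcal{F}_{\{x\}^c}\big]}{\mathbb{P}\big(\Omega^+_x\,\big|\,\mathcal{F}_{\{x\}^c}\big)}\right]. \tag{2.10}$$

For the numerator we observe that

$$\mathbb{E}\big[(\varphi_x - m_x(\varphi))\,\mathbf{1}_{\Omega^+_x}\,\big|\,\mathcal{F}_{\{x\}^c}\big] = \frac{1}{\sqrt{\pi/d}}\int_0^\infty (\varphi_x - m_x(\varphi))\exp\big\{-d(\varphi_x - m_x(\varphi))^2\big\}\,d\varphi_x = \frac{1}{\sqrt{4\pi d}}\exp\big\{-d(m_x(\varphi))^2\big\}, \tag{2.11}$$

and for the denominator, since $\varphi_x - m_x(\varphi)$ is, conditionally on $\mathcal{F}_{\{x\}^c}$, a centered Gaussian variable with variance $1/2d$,

$$\mathbb{P}\big(\Omega^+_x\,\big|\,\mathcal{F}_{\{x\}^c}\big) = \mathbb{P}\big(\varphi_x - m_x(\varphi) \ge -m_x(\varphi)\,\big|\,\mathcal{F}_{\{x\}^c}\big) = 1 - \Psi\big(-\sqrt{2d}\,m_x(\varphi)\big). \tag{2.12}$$

The proof of (2.8) is therefore complete. □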
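The two single-site computations in the proof of Lemma 2.3 can be checked numerically. The sketch below assumes, as in the proof, that conditionally on its neighbours ϕ_x is Gaussian with mean m = m_x(ϕ) and variance 1/(2d), and that Ψ is the standard normal distribution function; the function names and the quadrature parameters are choices made here for illustration only.

```python
import math

def Psi(r):
    """Standard normal distribution function, as defined in Lemma 2.3."""
    return 0.5 * math.erfc(-r / math.sqrt(2.0))

def numerator_quadrature(m, d, upper=12.0, steps=200_000):
    """Midpoint quadrature of the integral in (2.11), truncated to [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h
        total += (phi - m) * math.exp(-d * (phi - m) ** 2)
    return total * h / math.sqrt(math.pi / d)

def numerator_closed_form(m, d):
    """Right-hand side of (2.11): exp(-d m^2) / sqrt(4 pi d)."""
    return math.exp(-d * m * m) / math.sqrt(4.0 * math.pi * d)

def prob_nonnegative(m, d):
    """P(phi_x >= 0) for phi_x ~ N(m, 1/(2d)), written with Psi as in (2.12)."""
    return 1.0 - Psi(-math.sqrt(2.0 * d) * m)

for m in (-0.5, 0.0, 0.7):
    print(m, numerator_quadrature(m, 3), numerator_closed_form(m, 3), prob_nonnegative(m, 3))
```

For m ≥ 0 the probability is at least 1/2, which is the elementary fact behind the factor-two loss in the bound (2.16).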
We are now ready to prove Proposition 2.1.

Proof of Proposition 2.1. In Lemma 2.2 we have established the tightness of $\{\hat{\mathbb{P}}_N\}_{N\in\mathbb{Z}^+}$. We are therefore left with showing that any limit point $\hat{\mathbb{P}}$ coincides with $\mathbb{P}$. We will start by exhibiting the DLR equations satisfied by $\hat{\mathbb{P}}$. The idea is to observe that the DLR equations for the free field can be cast in the form: for every $x\in\mathbb{Z}^d$,

$$\mathbb{P}\big(d\phi\,\big|\,\mathcal{F}_{\{x\}^c}\big)(\varphi) = \frac{1}{\sqrt{\pi/d}}\exp\big\{-d(\phi - m_x(\varphi))^2\big\}\,d\phi, \tag{2.13}$$

and we repeat the same algebraic steps for $\hat{\mathbb{P}}_N$. We obtain

$$\hat{\mathbb{P}}_N\big(d\phi\,\big|\,\mathcal{F}_{\{x\}^c}\big)(\varphi) = \frac{1}{\hat{Z}_N(x)}\exp\Big\{-d\Big((\phi - m_x(\varphi)) - \mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\big]\Big)^2\Big\}\,\mathbf{1}_{\{\phi\ge -\mathbb{E}^+_N(\varphi_x)\}}\,d\phi, \tag{2.14}$$

in which $\hat{Z}_N(x)$ is the normalization. From (1.6) (in this case the result only for $x$ away from the boundary is largely sufficient, see therefore [3, Sect. 4] for a proof, or refer directly to Lemma 3.3 below) we deduce that $\lim_{N\to\infty}\mathbb{E}^+_N(\varphi_x) = \infty$ and therefore, to verify that $\hat{\mathbb{P}}$ satisfies the same DLR equations as $\mathbb{P}$, we are left with proving that

$$\lim_{N\to\infty}\mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\big] = 0, \qquad \text{for every } x\in\mathbb{Z}^d. \tag{2.15}$$

By using the explicit expression in (2.8) we obtain that if $x\in V_N\setminus\partial^-V_N$,

$$0 \le \mathbb{E}^+_N\big[\varphi_x - m_x(\varphi)\big] \le \frac{1}{\sqrt{\pi d}}\,\mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big], \tag{2.16}$$

and once again the result follows from (1.6); in fact it is sufficient to know that $\mathbb{P}^+_N(\varphi_x < c(N))$ tends to zero for some $c(N)$ tending to infinity, as $N\to\infty$.

Now we know that (each) $\hat{\mathbb{P}}$ satisfies the DLR equations of the free field, i.e. (1.3) or (2.13). We now use the fact that for the free field the set of extremal states is known [8, Ch. 13, ex. 13.29]: every extremal state $Q$ can be written as $\mathbb{P}\circ T_h^{-1}$ ($\equiv Q_h$), where $h:\mathbb{Z}^d\to\mathbb{R}$ is a harmonic function. Therefore there exists a probability measure $\hat\nu$ on the set of extremal Gibbs states, viewed as a measurable space with the evaluation σ-algebra [8, Th. 7.26], such that $\hat{\mathbb{P}} = \int Q_h\,d\hat\nu(Q_h)$. Let us now apply the second part of Lemma 2.2, by choosing $f_n(y) = p_n(0,y)$, the probability that a simple random walk, leaving at 0, exits $V_n$ at $y\in\partial^+V_n$. Note that with this choice, by using the DLR equations and harmonicity of $h$, we have that

$$E_{Q_h}(S_n(x)) = h(x), \tag{2.17}$$

for every $x\in\mathbb{Z}^d$ and any $n\in\mathbb{Z}^+$. Therefore by (2.3) and Fatou's Lemma we have that for every $x$

$$\begin{aligned} 0 = \lim_{n\to\infty}\hat{\mathbb{E}}\big[(S_n(x))^2\big] &= \lim_{n\to\infty}\int\Big[\mathrm{var}_{Q_h}(S_n(x)) + \big(E_{Q_h}(S_n(x))\big)^2\Big]\,d\hat\nu(Q_h)\\ &\ge \int\lim_{n\to\infty}\Big[\mathrm{var}_{Q_h}(S_n(x)) + (h(x))^2\Big]\,d\hat\nu(Q_h) = \int (h(x))^2\,d\hat\nu(Q_h), \end{aligned} \tag{2.18}$$

where $\lim_{n\to\infty}\mathrm{var}_{Q_h}(S_n(x)) = 0$ follows by the very same argument used to obtain (2.3). But (2.18) implies that $\hat\nu$ is concentrated on $\mathbb{P}$. □
3. Repulsion and Flatness of the Field

In this section we will require the full strength of (1.6), while in the previous one it was sufficient to know that $\lim_{N\to\infty}\mathbb{E}^+_N(\varphi_x) = \infty$, without requiring any uniformity in $x$ or any control over the rate of divergence. Notice however that such an estimate is required only if we want to be able to choose $\delta$ arbitrarily close to 0 in Proposition 3.1 below. For the main result (Theorem 1.1) of this paper, having Proposition 3.1 just for a $\delta < 1$ suffices. It will be clear from the proof that to obtain this weaker result it suffices that the field is pushed at least at distance $\sqrt{(2G+\delta')\log N}$, for some $\delta' > 0$. The main result of this section is the following

Proposition 3.1. For every $\delta > 0$ there exists $C > 0$ such that

$$\big|\mathbb{E}^+_N\big[\varphi_x - \varphi_y\big]\big| \le C|x-y|\,N^{-1+\delta}, \tag{3.1}$$

for every $x, y\in\mathbb{Z}^d$ and every $N\in\mathbb{Z}^+$.

A straightforward application to $\hat{\mathbb{P}}_N$ of the B–L inequality (1.10) with $F(r) = \exp(r)$, together with the exponential Chebychev inequality, yields the following corollary of Proposition 3.1, in which we keep the same notation as in Theorem 1.1.

Corollary 3.2. For every $r$ in the interior of $V$, every $\beta < 1/G$ and any $\delta > 0$, if we set $a(N) = \mathbb{E}^+_N(\varphi_{[rN]})$ we have that

$$\sup_{N\in\mathbb{Z}^+}\;\sup_{x:\,|x-rN|\le N^{1-\delta}}\mathbb{E}^+_N\left[\exp\left\{\frac{\beta(\varphi_x - a(N))^2}{2}\right\}\right] < \infty. \tag{3.2}$$

Corollary 3.2 will not be used in the sequel, but it gives a strong concentration property of the $\mathbb{P}^+_N$-field.

In the proof of Proposition 3.1, we will make use of two lemmas, which we state and prove here. The first one is an extension of the results in [3] on the distance of the field from the hard wall, up to the boundary of the wall.

Lemma 3.3. For every $\delta > 0$ there exists $N_0\in\mathbb{Z}^+$ such that for all $N\ge N_0$,

$$\sqrt{(4G-\delta)\log N} \le \mathbb{E}^+_N[\varphi_x] \le \sqrt{(4G+\delta)\log N}, \qquad \text{for all } x\in V_N\cup\partial^+V_N. \tag{3.3}$$
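Proof. Let us first recall the following result from [3, Prop. 1.3 and Lemma 4.7]: for every $\varepsilon > 0$,

$$\lim_{N\to\infty}\,\sup_{x\in(V_\varepsilon)_N}\left|\frac{\mathbb{E}^+_N(\varphi_x)}{\sqrt{4G\log N}} - 1\right| = 0. \tag{3.4}$$

From (3.4), the upper bound in (3.3) is immediate: it suffices in fact to replace $V$ with $(1+\varepsilon)V$, apply (3.4) and then use the FKG inequality. Let us turn to the lower bound. Because of (3.4), the result is already proven for $x\in(V_\varepsilon)_N$. To extend it to the whole box we proceed as follows: for $x\in V_N\setminus(V_\varepsilon)_N$ we have

$$\mathbb{E}^+_N[\varphi_x] = \mathbb{E}^+_N\Big[\mathbb{E}^+_N\big[\varphi_x\,\big|\,\mathcal{F}_{(V_\varepsilon)_N}\big]\Big] \ge \mathbb{E}^+_N\Big[\mathbb{E}\big[\varphi_x\,\big|\,\mathcal{F}_{(V_\varepsilon)_N}\big]\Big], \tag{3.5}$$

in which we have used the FKG inequality. Since

$$\mathbb{E}\big[\varphi_x\,\big|\,\mathcal{F}_{(V_\varepsilon)_N}\big] = \sum_{y\in\partial^-(V_\varepsilon)_N} p^\varepsilon_N(x,y)\,\varphi_y, \tag{3.6}$$

in which $\{p^\varepsilon_N(x,y)\}_{y\in(V_\varepsilon)_N\cup\{\infty\}}$ is the hitting probability for a simple random walk starting at $x$, the result in $(V_\varepsilon)_N$ implies that for any $\delta' > 0$ and $N$ sufficiently large

$$\mathbb{E}^+_N[\varphi_x] \ge \sqrt{(4G-\delta')\log N}\;P_x\big(\tau_{(V_\varepsilon)_N^c} < \infty\big), \tag{3.7}$$

where $P_x$ is the law of $\{X(j)\}_{j\in\mathbb{Z}^+}$, the simple random walk on $\mathbb{Z}^d$, with $X(0) = x$, and $\tau_A$ is the exit time from $A\subset\mathbb{Z}^d$. We are therefore left with showing that

$$\lim_{\varepsilon\to 0}\;\inf_{x\in\partial^+V_N\cup V_N\setminus(V_\varepsilon)_N} P_x\big(\tau_{(V_\varepsilon)_N^c} < \infty\big) = 1. \tag{3.8}$$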
Let us start with some notation: as before, we denote by $K$ a (right circular) cone and we use $h(K)$ for the height of $K$. Moreover, with respect to a fixed cone $K$ with vertex $r_0$, we define for every $R > 0$,

$$B^N_R = \big\{y\in\mathbb{Z}^d : |y - Nr_0| \le RN\big\}, \tag{3.9}$$

while $B_R = \{r\in\mathbb{R}^d : |r - r_0| \le R\}$. We start by claiming that there exists $\delta > 0$ such that for every $\varepsilon'\in(0, h(K)/4)$,

$$\inf_{x\in B^N_{\varepsilon'}} P_x\big(X(\tau_{B^N_{2\varepsilon'}})\in NK\big) \ge \delta, \tag{3.10}$$

uniformly in $N$. This holds because $f_N(x) = P_x\big(X(\tau_{B^N_{2\varepsilon'}})\in NK\big)$ is a positive harmonic function in $B^N_{2\varepsilon'}$. Therefore, by the Harnack inequality [11, Theorem 1.7.2], there exists a constant $c_H < \infty$ such that

$$f_N(x_1) \le c_H f_N(x_2), \tag{3.11}$$

for all $x_1, x_2\in B^N_{\varepsilon'}$. By elementary considerations

$$\lim_{N\to\infty} f_N([Nr_0]) = |\partial B_{2\varepsilon'}\cap K|_{d-1}, \tag{3.12}$$
where $|\cdot|_{d-1}$ denotes the area of a $(d-1)$-dimensional manifold embedded in $\mathbb{R}^d$. Therefore there exists $c > 0$ such that $f_N([Nr_0]) \ge c$ for every $N\in\mathbb{Z}^+$, which, combined with (3.11), yields

$$\inf_{x\in B^N_{\varepsilon'}} f_N(x) \ge \frac{c}{c_H} \equiv \delta, \tag{3.13}$$

for every $N\in\mathbb{Z}^+$. Therefore (3.10) is proven.

Let us consider now a point $x\in\partial^+V_N\cup V_N\setminus(V_\varepsilon)_N$. Therefore $x\in B^N_{2\varepsilon}$, the ball centered at $Nr_0$, with $r_0\in\partial V_\varepsilon$, vertex of a cone $K$ contained in $V_\varepsilon$. Observe first of all that

$$P_x\big(\tau_{(V_\varepsilon)_N^c} < \infty\big) \ge P_x\big(\tau_{K_N^c} < \infty\big) \ge \inf_{y\in B^N_{2\varepsilon}} P_y\big(\tau_{K_N^c} < \infty\big), \tag{3.14}$$

in which the first inequality follows from the fact that $K_N\subset(V_\varepsilon)_N$. By the strong Markov property we obtain that for $y\in B^N_{2\varepsilon}$,

$$\begin{aligned} P_y\big(\tau_{K_N^c} < \infty\big) &= P_y\Big(\big\{\tau_{K_N^c} < \infty\big\}\cap\big\{X(\tau_{B^N_{4\varepsilon}})\in K_N\big\}\Big) + P_y\Big(\big\{\tau_{K_N^c} < \infty\big\}\cap\big\{X(\tau_{B^N_{4\varepsilon}})\in K_N^c\big\}\Big)\\ &\ge P_y\big(X(\tau_{B^N_{4\varepsilon}})\in K_N\big) + \Big[\inf_{z\in B^N_{4\varepsilon}} P_z\big(\tau_{K_N^c} < \infty\big)\Big]\Big[1 - P_y\big(X(\tau_{B^N_{4\varepsilon}})\in K_N\big)\Big]. \end{aligned} \tag{3.15}$$

If $4\varepsilon < h(K)$ we can apply (3.10) to obtain that $\inf_{y\in B^N_{2\varepsilon}} P_y\big(X(\tau_{B^N_{4\varepsilon}})\in K_N\big) \ge \delta$ and therefore

$$\inf_{y\in B^N_{2\varepsilon}} P_y\big(\tau_{K_N^c} < \infty\big) \ge \delta + (1-\delta)\inf_{y\in B^N_{4\varepsilon}} P_y\big(\tau_{K_N^c} < \infty\big). \tag{3.16}$$

From (3.16) it is clear that we can iterate the procedure $n$ times, with $2^{n+1}\varepsilon < h(K)$ and, recalling (3.14), we obtain that

$$P_x\big(\tau_{(V_\varepsilon)_N^c} < \infty\big) \ge \delta\sum_{j=0}^n (1-\delta)^j = 1 - (1-\delta)^{n+1} \ge 1 - \left(\frac{\varepsilon}{h(K)}\right)^q, \tag{3.17}$$

for some $q > 0$, uniformly in $N\in\mathbb{Z}^+$. By the uniform interior cone assumption on $V$, $K$ (the cone used in the above procedure) can be chosen, up to translations and rotations, to be the same for each point $x\in\partial^+V_N\cup V_N\setminus(V_\varepsilon)_N$. Therefore the estimate (3.17) is uniform in $x$ and (3.8) is proven. □

Remark. By following the arguments in the beginning of the proof of Lemma 3.3 and using the weak convergence of $X_N(t) = X([tN^2])/N$, $t\in\mathbb{R}^+$, to the standard Brownian motion one can also obtain that for every $r\in\mathbb{R}^d$,

$$\lim_{N\to\infty}\frac{\mathbb{E}^+_N\big[\varphi_{[rN]}\big]}{\sqrt{4G\log N}} = u(r), \tag{3.18}$$

where $u\in C^0(\mathbb{R}^d)$, $u = 1$ in $V$, $u$ harmonic outside $V$ and $\lim_{r\to\infty}u(r) = 0$.
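The passage from the recursion (3.16) to the power law in (3.17), in the proof of Lemma 3.3, is elementary bookkeeping, and the following sketch makes it explicit. It is an illustration under assumptions made here, not the authors' computation: n is taken as the largest integer with 2^(n+1) ε < h(K), and the exponent q0 = log2(1/(1−δ)) together with the prefactor 1/(1−δ) are one admissible choice; any q < q0 then works in (3.17) for ε small.

```python
import math

def iteration_bound(eps, h, delta):
    """Largest n with 2^(n+1) * eps < h, and the bound delta * sum_{j<=n} (1-delta)^j."""
    n = -1
    while 2.0 ** (n + 2) * eps < h:   # test whether n + 1 is still admissible
        n += 1
    geometric = delta * sum((1.0 - delta) ** j for j in range(n + 1))
    return n, geometric

def power_law_bound(eps, h, delta):
    """Lower bound 1 - (eps/h)^q0 / (1 - delta), with q0 = log2(1/(1 - delta))."""
    q0 = math.log2(1.0 / (1.0 - delta))
    return 1.0 - (eps / h) ** q0 / (1.0 - delta)

for eps in (0.2, 0.05, 0.001):
    n, g = iteration_bound(eps, 1.0, 0.3)
    print(eps, n, g, power_law_bound(eps, 1.0, 0.3))
```

The geometric sum always dominates the power-law expression, because 2^(n+2) ε ≥ h(K) forces n + 1 ≥ log2(h(K)/ε) − 1.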
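Lemma 3.4. There exists $C\in\mathbb{R}^+$ such that for all $N\in\mathbb{Z}^+$ and all $x\in V_N$,

$$\mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big] \le \exp\left\{-\frac{1}{2G}(m_x(\bar\varphi))^2 + C\,m_x(\bar\varphi)\right\}, \tag{3.19}$$

where $\bar\varphi_\cdot = \mathbb{E}^+_N(\varphi_\cdot)$.

Proof. First of all we note that

$$\begin{aligned} \mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big] &= \mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi-\bar\varphi))^2 + 2d\,m_x(\bar\varphi)\,m_x(\varphi-\bar\varphi)\big\}\Big]\exp\big\{-d(m_x(\bar\varphi))^2\big\}\\ &= \hat{\mathbb{E}}_N\Big[\exp\big\{-d(m_x(\varphi))^2 + 2d\,m_x(\bar\varphi)\,m_x(\varphi)\big\}\Big]\exp\big\{-d(m_x(\bar\varphi))^2\big\}. \end{aligned} \tag{3.20}$$

If we set

$$\tilde{\mathbb{P}}_N(d\varphi) = \frac{\exp\big\{-d(m_x(\varphi))^2\big\}}{\tilde{Z}_N}\,\hat{\mathbb{P}}_N(d\varphi), \tag{3.21}$$

where $\tilde{Z}_N$ is the normalization constant, we can develop (3.20) further to obtain

$$\begin{aligned} \mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big] = {}& \tilde{\mathbb{E}}_N\Big[\exp\big\{2d\,m_x(\bar\varphi)\,m_x(\varphi-\tilde\varphi)\big\}\Big]\\ &\cdot\,\hat{\mathbb{E}}_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big]\exp\big\{2d\,m_x(\bar\varphi)\,m_x(\tilde\varphi)\big\}\exp\big\{-d(m_x(\bar\varphi))^2\big\}, \end{aligned} \tag{3.22}$$

where $\tilde\varphi_\cdot = \tilde{\mathbb{E}}_N(\varphi_\cdot)$. In analogy with (3.21), we define also

$$\tilde{\mathbb{P}}(d\varphi) = \frac{\exp\big\{-d(m_x(\varphi))^2\big\}}{\tilde{Z}}\,\mathbb{P}(d\varphi), \tag{3.23}$$

i.e. we perform the same change of measure but with respect to the free field. $\tilde{\mathbb{P}}$ is a centered Gaussian field and

$$\tilde{\mathbb{E}}\big[(m_x(\varphi))^2\big] = \frac{1}{2d}\,\frac{2dG-1}{2dG}, \tag{3.24}$$

where we used the fact that $\mathbb{E}[(m_x(\varphi))^2] = G - (1/2d)$. Therefore by using Jensen's inequality and the B–L inequality we obtain

$$\begin{aligned} m_x(\tilde\varphi) = \frac{\hat{\mathbb{E}}_N\big[m_x(\varphi)\exp\{-d(m_x(\varphi))^2\}\big]}{\hat{\mathbb{E}}_N\big[\exp\{-d(m_x(\varphi))^2\}\big]} &\le \Big(\hat{\mathbb{E}}_N\big[(m_x(\varphi))^2\big]\Big)^{1/2}\exp\Big\{d\,\hat{\mathbb{E}}_N\big[(m_x(\varphi))^2\big]\Big\}\\ &\le \Big(\mathbb{E}\big[(m_x(\varphi))^2\big]\Big)^{1/2}\exp\Big\{d\,\mathbb{E}\big[(m_x(\varphi))^2\big]\Big\}\\ &= \sqrt{G - \frac{1}{2d}}\,\exp\Big\{dG - \frac{1}{2}\Big\} \equiv K. \end{aligned} \tag{3.25}$$

Finally, again by the B–L inequality,

$$\tilde{\mathbb{E}}_N\Big[\exp\big\{2d\,m_x(\bar\varphi)\,m_x(\varphi-\tilde\varphi)\big\}\Big] \le \tilde{\mathbb{E}}\Big[\exp\big\{2d\,m_x(\bar\varphi)\,m_x(\varphi)\big\}\Big] = \exp\left\{d\,\frac{2dG-1}{2dG}\,(m_x(\bar\varphi))^2\right\}. \tag{3.26}$$

Inserting (3.25) and (3.26) into (3.22) we obtain

$$\mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big] \le \exp\left\{-\frac{(m_x(\bar\varphi))^2}{2G} + 2dK\,m_x(\bar\varphi)\right\}, \tag{3.27}$$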
and the proof is complete. □

We are now ready to prove the main result of this section.

Proof of Proposition 3.1. Set $u_N(x) = \mathbb{E}^+_N(\varphi_x)$. Let us denote by $A_N$ the discrete Laplacian of $u_N$. Proposition 3.1 follows if we show that for every $\delta > 0$ we can find $C > 0$ such that, uniformly in $x\in\mathbb{Z}^d$, $i = 1,\ldots,d$ and $N\in\mathbb{Z}^+$,

$$\big|\nabla_i\Delta^{-1}A_N(x)\big| \le C N^{-1+\delta}, \tag{3.28}$$

where $\nabla_i$ is the discrete gradient in the $i$-direction. We denote by $K_i(\cdot)$ the kernel of the operator $\nabla_i\Delta^{-1}$. By [11, Th. 1.5.5] there exists a constant $c_K$ such that for all $x$,

$$|K_i(x)| \le \frac{c_K}{|x|^{d-1}}. \tag{3.29}$$

Recalling Lemma 2.3, as in (2.16), we have that if $x\in V_N\setminus\partial^-V_N$,

$$0 \le A_N(x) \le 2\sqrt{d/\pi}\;\mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big], \tag{3.30}$$

and therefore, by Lemma 3.3 and Lemma 3.4, we obtain that for every $\delta > 0$ there exists $c_a\in\mathbb{R}^+$ such that for every $x\in V_N\setminus\partial^-V_N$,

$$|A_N(x)| \le \frac{c_a}{N^{2-\delta}}. \tag{3.31}$$
Since $A_N = 0$ outside $V_N$, we are left with the case $x\in\partial^-V_N$. If we call $E_x$ the event $\{\varphi : m_x(\varphi) \le 0\}$, from (2.8) we obtain that there is a constant $c$ such that

$$|A_N(x)| \le c\left\{\mathbb{E}^+_N\Big[\exp\big\{-d(m_x(\varphi))^2\big\}\Big] + \mathbb{E}^+_N\left[\mathbf{1}_{E_x}\frac{\exp\big\{-d(m_x(\varphi))^2\big\}}{1 - \Psi\big(-\sqrt{2d}\,m_x(\varphi)\big)}\right]\right\}. \tag{3.32}$$

The first term in the right-hand side of (3.32) is bounded by $\text{const.}/N^{2-\delta}$, by the very same argument used for $x\in V_N\setminus\partial^-V_N$. For the other term we use the fact that $\exp(-r^2/2)/\int_r^\infty\exp(-s^2/2)\,ds \le 2 + r$, for $r\ge 0$, and the Hölder inequality to obtain that if $1/p + 1/q = 1$,

$$\mathbb{E}^+_N\left[\mathbf{1}_{E_x}\frac{\exp\big\{-d(m_x(\varphi))^2\big\}}{1 - \Psi\big(-\sqrt{2d}\,m_x(\varphi)\big)}\right] \le \mathbb{P}^+_N(E_x)^{1/q}\;\mathbb{E}^+_N\Big[\big(2 + \sqrt{2d}\,|m_x(\varphi)|\big)^p\Big]^{1/p}. \tag{3.33}$$

Observe now that, by the exponential Chebychev inequality and the B–L inequality (1.10) with $F(r) = \exp(r)$, we have that

$$\begin{aligned} \mathbb{P}^+_N(E_x) &\le \mathbb{P}^+_N\Big(m_x(\varphi) - \mathbb{E}^+_N(m_x(\varphi)) \le -\mathbb{E}^+_N(m_x(\varphi))\Big)\\ &\le \inf_{t\in\mathbb{R}}\exp\Big\{-t\,\mathbb{E}^+_N(m_x(\varphi)) + \frac{t^2}{2}G\Big\} = \exp\left\{-\frac{\big(\mathbb{E}^+_N(m_x(\varphi))\big)^2}{2G}\right\}, \end{aligned} \tag{3.34}$$

and therefore, by Lemma 3.3, we have that for every $\delta > 0$ there exists $c$ such that for all $N\in\mathbb{Z}^+$,

$$\sup_{x\in\partial^-V_N}\mathbb{P}^+_N(E_x) \le \frac{c}{N^{2-\delta}}. \tag{3.35}$$

The second factor in the right-hand side of (3.33) can be easily bounded by using the B–L inequality with $F(r) = |r|^p$ and by using the upper bound in Lemma 3.3: for every $N\in\mathbb{Z}^+$ and every $x\in\partial^-V_N$,

$$\mathbb{E}^+_N\Big[\big(2 + \sqrt{2d}\,|m_x(\varphi)|\big)^p\Big]^{1/p} \le c(p)\sqrt{\log N}, \tag{3.36}$$

where $c(p)$ is a constant depending only on $p$. Therefore by choosing $q$ sufficiently close to 1, we extend (3.31) to all $x\in V_N$. Note that a much rougher upper bound than the one given in Lemma 3.3 would have been sufficient.

Let us now go back to using (3.29). We obtain that for every $x$,

$$\big|\nabla_i\Delta^{-1}A_N(x)\big| \le \frac{c_K c_a}{N^{2-\delta}}\sum_{y\in V_N}\frac{1}{|x-y|^{d-1}} \le \frac{c}{N^{1-\delta}}, \tag{3.37}$$

for some $c\in\mathbb{R}^+$, and therefore (3.28) is proven. □
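The last step, (3.37), rests on the fact that the kernel sum over a box of side of order N grows only linearly in N, so that one power of N is lost from the bound (3.31). A quick numerical illustration in d = 3 follows; the cubic box centered at x and the function name are simplifying assumptions made here, not the general domain V_N of the text.

```python
import numpy as np

def kernel_sum(N, d=3):
    """Sum over y in {-N, ..., N}^d, y != 0, of 1 / |y|^(d-1)."""
    axis = np.arange(-N, N + 1, dtype=float)
    grids = np.meshgrid(*([axis] * d), indexing="ij")
    r2 = sum(g * g for g in grids)
    r2[r2 == 0.0] = np.inf            # drop the singular point y = x
    return float(np.sum(r2 ** (-(d - 1) / 2.0)))

for N in (8, 16, 32):
    print(N, kernel_sum(N) / N)       # the ratio stays bounded as N grows
```

A bounded ratio is exactly what turns the factor N^(-2+δ) of (3.31) into the factor N^(-1+δ) of (3.28).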
4. 2-Scale Decomposition and the Lower Bound

For $f:\mathbb{R}^d\to\mathbb{R}$, let us set $Df = (\partial_1 f,\ldots,\partial_d f)$.

Proposition 4.1. Let $C$ be the capacity of $V$, i.e.

$$C \equiv \inf\Big\{\|Dh\|^2_{L^2(\mathbb{R}^d)} : h\in H^1(\mathbb{R}^d),\ h = 1 \text{ a.e. on } V\Big\}. \tag{4.1}$$

We have the following lower bound on the probability of $\Omega^+_N$:

$$\liminf_{N\to\infty}\frac{1}{N^{d-2}\log N}\log\mathbb{P}\big(\varphi_x\ge 0 \text{ for all } x\in V_N\big) \ge -2GC. \tag{4.2}$$
This result can be found in [3, Prop. 2.1]. Here we present another proof, based on the following observation: we can realize the field $\varphi$ as the sum of two independent Gaussian fields $\{\varphi^0_x\}_{x\in\mathbb{Z}^d}$ and $\{\varphi^1_x\}_{x\in\mathbb{Z}^d}$,

$$\varphi_x = \varphi^0_x + \varphi^1_x, \tag{4.3}$$

defined, once we fix $\varepsilon > 0$, by

$$\mathbb{E}(\varphi^0_x\varphi^0_y) = \Big[(-\Delta)^{-1} - (\varepsilon^2 - \Delta)^{-1}\Big]_{x,y} = G^{(0)}(x,y), \tag{4.4}$$

$$\mathbb{E}(\varphi^1_x\varphi^1_y) = \Big[(\varepsilon^2 - \Delta)^{-1}\Big]_{x,y}, \tag{4.5}$$

and

$$\mathbb{E}(\varphi^0_x) = 0, \qquad \mathbb{E}(\varphi^1_x) = 0, \tag{4.6}$$

for all $x, y\in\mathbb{Z}^d$. We will still use $\mathbb{P}$ ($\mathbb{E}$) for the joint law of $\varphi^0$ and $\varphi^1$. On the other hand, we denote by $\mathbb{P}_{\alpha,N}$ ($\alpha\in\mathbb{R}^+$) the law of the random field

$$\Big\{\varphi^0_x + \sqrt{\alpha\log N},\ \varphi^1_x\Big\}_{x\in\mathbb{Z}^d}. \tag{4.7}$$

This is still a Gaussian field, with the same covariance as $\{\varphi^0,\varphi^1\}$ under $\mathbb{P}$, but shifted $\varphi^0$-mean.

As remarked in the introduction, the proof is inspired by the multiscale decomposition of Field Theory (see e.g. [13] and [1]). We actually need only two scales: we are in fact splitting the field into a massless ($\varphi^0$) and a massive component ($\varphi^1$). Notice that the covariance of the massive part is equal to $1/2d$ times the Green function of a simple random walk with killing of rate $\varepsilon^2/2d$. We will use the relative entropy technique: we compute the relative entropy of the original field and the field in which the massless part has been translated by a distance $\sqrt{\alpha\log N}$. The best result, optimal to leading order by the upper bound in [3], is obtained by making the massless part infinitesimal (i.e. $\varepsilon\to 0$) and $c$ ($> 2d$) arbitrarily close to $2d$. Once again, this gives another image of the fact that the field under the hard wall condition moves away from the wall, in order to make enough room for the fluctuations to occur.

Proof of Proposition 4.1. First of all we claim that for all $\alpha\in\mathbb{R}^+$,

$$\lim_{N\to\infty}\frac{1}{N^{d-2}\log N}H_N\big(\mathbb{P}_{\alpha,N}\,\big|\,\mathbb{P}\big) = \frac{\alpha}{2}C, \tag{4.8}$$

where

$$H_N\big(\mathbb{P}_{\alpha,N}\,\big|\,\mathbb{P}\big) = \mathbb{E}_{\alpha,N}\left[\log\left(\frac{d\mathbb{P}_{\alpha,N}}{d\mathbb{P}}\bigg|_{\mathcal{F}_{V_N}}\right)\right]. \tag{4.9}$$

We will give the main argument and postpone the proof of (4.8) to the end. As will be clear, we do not need to establish the equality in (4.8): an upper bound, with the same right-hand side, suffices. However equality is just as easy to obtain.
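In Fourier space the two-scale decomposition (4.3)–(4.5) is a statement about symbols: with μ(k) = 2 Σ_i (1 − cos k_i) the symbol of −Δ, the covariances of ϕ^0 and ϕ^1 have symbols 1/μ − 1/(ε² + μ) and 1/(ε² + μ). The sketch below checks on a grid that these add up to 1/μ and that the single-site variance of ϕ^0 shrinks with ε. The grid size, the midpoint quadrature and the function names are assumptions made here for illustration.

```python
import numpy as np

def symbols(eps, d=3, n=48):
    """Symbol mu of -Laplacian and the symbols of the two covariances (4.4) and (4.5)."""
    h = 2.0 * np.pi / n
    k = -np.pi + (np.arange(n) + 0.5) * h          # midpoints avoid k = 0
    grids = np.meshgrid(*([k] * d), indexing="ij")
    mu = sum(2.0 * (1.0 - np.cos(g)) for g in grids)
    g0 = 1.0 / mu - 1.0 / (eps ** 2 + mu)          # covariance symbol of phi^0, cf. (4.4)
    g1 = 1.0 / (eps ** 2 + mu)                     # covariance symbol of phi^1, cf. (4.5)
    return mu, g0, g1

def sigma(eps, d=3, n=48):
    """Quadrature estimate of the single-site variance E[(phi^0_x)^2]."""
    _, g0, _ = symbols(eps, d, n)
    return float(np.mean(g0))

for eps in (2.0, 0.5, 0.1):
    print(eps, sigma(eps))
```

The same symbols give (G^(0))^(-1) − G^(-1) = μ²/ε², the identity used at the end of the proof to control the difference between the two quadratic forms.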
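Let $\sigma_\varepsilon = \mathbb{E}(\varphi^0_x)^2$ (which is independent of $x$). It is immediate to see (for example by using the Fourier transform) that

$$\lim_{\varepsilon\to 0}\sigma_\varepsilon = 0. \tag{4.10}$$

The two results (4.8) and (4.10) imply

$$\liminf_{\varepsilon\to 0}\,\liminf_{N\to\infty}\frac{1}{N^{d-2}\log N}\log\mathbb{P}\Big(\varphi^0_x\ge\sqrt{\alpha\log N}\text{ for all } x\in V_N\Big) \ge -\frac{\alpha}{2}C, \tag{4.11}$$

for all $\alpha\in\mathbb{R}^+$. The proof of (4.11) goes as follows. First we recall the entropy inequality

$$\log\frac{\mathbb{P}\big(\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\big)}{\mathbb{P}_{\alpha',N}\big(\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\big)} \ge -\frac{H_N\big(\mathbb{P}_{\alpha',N}\,\big|\,\mathbb{P}\big) + e^{-1}}{\mathbb{P}_{\alpha',N}\big(\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\big)}, \tag{4.12}$$

for $\alpha'\in\mathbb{R}^+$. Equation (4.12) is a consequence of Jensen's inequality (see e.g. [3, p. 421]). Combining (4.8) and (4.12) we realize that, to have (4.11), it is sufficient to prove that for any $\alpha' > \alpha$ there exists $\varepsilon_0$ such that for all $\varepsilon\le\varepsilon_0$,

$$\lim_{N\to\infty}\mathbb{P}_{\alpha',N}\Big(\varphi^0_x\ge\sqrt{\alpha\log N}\text{ for all } x\in V_N\Big) = 1, \tag{4.13}$$

which is proven by observing that

$$\begin{aligned} \mathbb{P}_{\alpha',N}\Big(\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\Big) &= \mathbb{P}\Big(\varphi^0_x\ge(\sqrt{\alpha}-\sqrt{\alpha'})\sqrt{\log N}\ \forall x\in V_N\Big)\\ &\ge 1 - N^d\,\mathbb{P}\Big(\varphi^0_0 < (\sqrt{\alpha}-\sqrt{\alpha'})\sqrt{\log N}\Big) = 1 - N^d\,\Psi\left(-(\sqrt{\alpha'}-\sqrt{\alpha})\sqrt{\frac{\log N}{\sigma_\varepsilon}}\right), \end{aligned} \tag{4.14}$$

and we recall that $\Psi(r)$ is the probability that a standard normal variable is smaller than $r\in\mathbb{R}$. Equation (4.13) is then an easy consequence of (4.14) and (4.10), since the last term in (4.14) converges to 1 when $N$ goes to infinity if $\varepsilon$ is chosen such that $(\sqrt{\alpha'}-\sqrt{\alpha})^2/2\sigma_\varepsilon > d$.

We are now going to prove (4.2). We have

$$\mathbb{P}\big(\varphi_x > 0\ \forall x\in V_N\big) \ge \mathbb{E}\Big[\mathbb{P}\big(\varphi^0_x + \varphi^1_x > 0\ \forall x\in V_N\,\big|\,\mathcal{F}^0\big)\,\mathbf{1}_{\{\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\}}\Big], \tag{4.15}$$

where $\mathcal{F}^0$ is the σ-algebra generated by $\varphi^0$. By using the independence of $\varphi^0$ and $\varphi^1$, on $\{\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\}$ we have

$$\mathbb{P}\big(\varphi^0_x + \varphi^1_x > 0\ \forall x\in V_N\,\big|\,\mathcal{F}^0\big) \ge \mathbb{P}\Big(\varphi^1_x > -\sqrt{\alpha\log N}\ \forall x\in V_N\Big). \tag{4.16}$$

By the FKG inequality for the field $\varphi^1$ (see e.g. [10]) and its translation invariance

$$\mathbb{P}\Big(\varphi^1_x > -\sqrt{\alpha\log N}\ \forall x\in V_N\Big) \ge \mathbb{P}\Big(\varphi^1_0 > -\sqrt{\alpha\log N}\Big)^{N^d}. \tag{4.17}$$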
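The condition (√α' − √α)²/2σ_ε > d that makes the last term of (4.14) tend to 1 is a balance between the N^d sites of the union bound and the Gaussian tail. The short sketch below illustrates it; the parameter c, standing for (√α' − √α)/√σ_ε, and the function names are conventions introduced here, not notation from the paper.

```python
import math

def Psi(r):
    """Probability that a standard normal variable is smaller than r."""
    return 0.5 * math.erfc(-r / math.sqrt(2.0))

def union_bound_term(N, c, d=3):
    """N^d * Psi(-c * sqrt(log N)), the quantity subtracted from 1 in (4.14)."""
    return N ** d * Psi(-c * math.sqrt(math.log(N)))

for c in (2.0, 3.0):                      # c^2 / 2 = 2 < d = 3, and c^2 / 2 = 4.5 > d = 3
    print(c, [union_bound_term(N, c) for N in (10, 100, 1000)])
```

For c²/2 > d the term decays polynomially in N, while for c²/2 < d it grows and the union bound becomes useless; this is why ε must be taken small, so that σ_ε is small.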
Hence we have that

$$\frac{1}{N^{d-2}\log N}\log\mathbb{P}\big(\varphi_x > 0\ \forall x\in V_N\big) \ge \frac{N^2}{\log N}\log\Psi\left(\sqrt{\frac{\alpha\log N}{G-\sigma_\varepsilon}}\right) + \frac{1}{N^{d-2}\log N}\log\mathbb{P}\Big(\varphi^0_x\ge\sqrt{\alpha\log N}\ \forall x\in V_N\Big), \tag{4.18}$$

and the result then follows by using (4.11) and (4.10), since the first term in the right-hand side of (4.18) vanishes whenever $\alpha/4(G-\sigma_\varepsilon) > 1$.

We are then left with the proof of (4.8). A direct computation of the relative entropy (4.9), see [2] for similar computations, easily reduces the proof of (4.8) to proving that

$$\lim_{N\to\infty}\frac{1}{N^{d-2}}\big\langle\mathbf{1}_{V_N},(G^{(0)}_N)^{-1}\mathbf{1}_{V_N}\big\rangle_{V_N} = C, \tag{4.19}$$

where $\langle\cdot,\cdot\rangle_A$, $A\subset\mathbb{Z}^d$, is the scalar product in $L^2(A)$ and $G^{(0)}_N$ is the matrix $G^{(0)}$ restricted to $V_N\times V_N$ (and analogous meaning below for $G$ and $G_N$). The quantity of which we are taking the limit in (4.19) can be expressed in terms of a variational problem: it is equal to

$$\frac{1}{N^{d-2}}\sup_{f\in L^2(V_N)} 2\left\{\langle\mathbf{1}_{V_N},f\rangle_{V_N} - \frac{1}{2}\big\langle f,G^{(0)}_N f\big\rangle_{V_N}\right\}. \tag{4.20}$$

A lower bound for the expression in (4.19) is then immediate, since $G\ge G^{(0)}$ and

$$\lim_{N\to\infty}\frac{1}{N^{d-2}}\big\langle\mathbf{1}_{V_N},(G_N)^{-1}\mathbf{1}_{V_N}\big\rangle_{V_N} = C, \tag{4.21}$$

which is proven in [2, Sect. 2]. For the upper bound we still use the variational formula (4.20) in the following way:

$$\sup_{f\in L^2(V_N)}\left\{\langle\mathbf{1}_{V_N},f\rangle_{V_N} - \frac{1}{2}\big\langle f,G^{(0)}_N f\big\rangle_{V_N}\right\} \le \sup_{f\in L^2(\mathbb{Z}^d)}\left\{\langle h_N,f\rangle_{\mathbb{Z}^d} - \frac{1}{2}\big\langle f,G^{(0)}f\big\rangle_{\mathbb{Z}^d}\right\} = \frac{1}{2}\big\langle h_N,(G^{(0)})^{-1}h_N\big\rangle_{\mathbb{Z}^d}, \tag{4.22}$$

where $h_N(\cdot) = h(\cdot/N)$, $h\in C_0^\infty(\mathbb{R}^d)$ and $h = 1$ on $V$. By setting $\hat f(k) = \sum_x f(x)\exp(ikx)$ for $k\in(-\pi,+\pi]^d$, we have

$$\begin{aligned} \frac{1}{N^{d-2}}\big\langle h_N,[(G^{(0)})^{-1} - G^{-1}]h_N\big\rangle_{\mathbb{Z}^d} &= \frac{1}{N^{d-2}(2\pi)^d}\int_{\|k\|\le\pi}\frac{\mu(k)^2|\hat h_N(k)|^2}{\varepsilon^2}\,dk\\ &= \frac{1}{\varepsilon^2N^{d-2}}\langle\Delta h_N,\Delta h_N\rangle_{L^2(\mathbb{Z}^d)} \le \frac{c(h)}{N^2\varepsilon^2}, \end{aligned} \tag{4.23}$$

where $\mu(k) = 2\sum_{i=1}^d(1-\cos k_i)$ and $c(h)$ is a constant depending on $h$. Hence the term in (4.23) vanishes as $N\to\infty$. On the other hand

$$\frac{1}{N^{d-2}}\big\langle h_N,G^{-1}h_N\big\rangle_{\mathbb{Z}^d} = -\frac{1}{N^{d-2}}\langle h_N,\Delta h_N\rangle_{\mathbb{Z}^d} \tag{4.24}$$

converges as $N\to\infty$ to its continuum analog $\int_{\mathbb{R}^d}|Dh|^2$ for any $h\in C_0^\infty$; taking the infimum over $h$ we obtain the capacity $C$ and the proof is complete. □

Acknowledgements. We are grateful to Erwin Bolthausen for his help with the proof of Lemma 3.3 and for other useful discussions. G.G. acknowledges the support of the Swiss National Science Foundation (Project 20–410 925.94).
References
1. Benfatto, G., Cassandro, M., Gallavotti, G., Niccolò, F., Olivieri, E., Presutti, E. and Scacciatelli, E.: Ultraviolet Stability in Euclidean Scalar Field Theories. Commun. Math. Phys. 71, 95–130 (1980)
2. Bolthausen, E. and Deuschel, J.D.: Critical large deviations for Gaussian fields in the phase transition regime. Ann. Prob. 21, 1876–1920 (1994)
3. Bolthausen, E., Deuschel, J.D. and Zeitouni, O.: Entropic repulsion for the lattice free field. Commun. Math. Phys. 170, 417–443 (1995)
4. Brascamp, H.J. and Lieb, E.: On extensions of the Brunn–Minkowski and Prékopa–Leindler theorems. J. Funct. Anal. 22, 366–389 (1976)
5. Bricmont, J., el Mellouki, A. and Fröhlich, J.: Random surfaces in statistical mechanics: Roughening, rounding, wetting. J. Stat. Phys. 42, 743–798 (1986)
6. Deuschel, J.D. and Giacomin, G.: Entropic Repulsion for Massless Fields. Preprint (1999)
7. Deuschel, J.D. and Stroock, D.W.: Large Deviations. Academic Press, Series in Pure and Applied Mathematics 137, 1989
8. Georgii, H.-O.: Gibbs Measures and Phase Transitions. Studies in Mathematics 9, W. de Gruyter ed., 1988
9. Glimm, J. and Jaffe, A.: Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, Second edition, 1987
10. Herbst, I. and Pitt, L.: Diffusion equation techniques in stochastic monotonicity and positive correlations. Prob. Th. Rel. Fields 87, 275–312 (1991)
11. Lawler, G.F.: Intersections of Random Walks. In: Probability and its Applications, Basel–Boston: Birkhäuser, 1991
12. Lebowitz, J.L. and Maes, C.: The effect of an external field on an interface, entropy repulsion. J. Stat. Phys. 46, 39–49 (1987)
13. Nelson, E.: A quartic interaction in two dimensions. In: Mathematical theory of elementary particles (Goodman and Segal eds.), Cambridge, MA: MIT Press, 1966
14. Spitzer, F.: Principles of random walks. Berlin–Heidelberg–New York: Springer-Verlag, Second edition, 1976

Communicated by J. L. Lebowitz
Commun. Math. Phys. 206, 463 – 489 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Spectrum of the Generator of an Infinite System of Interacting Diffusions
R. A. Minlos¹, Yu. M. Suhov¹,²
¹ Institute for Problems of Information Transmission, Russian Academy of Sciences, 19 Bol'shoi Karetnyi Per., Moscow, 101447, Russia
² Statistical Laboratory, DPMMS, University of Cambridge, 16 Mill Lane, Cambridge CB2 1SB, UK
Received: 6 October 1998 / Accepted: 9 April 1999
Abstract: We study the spectrum of the operator
$$Lf(Q) = -\sum_{x\in\mathbb{Z}^d} \partial^2 f/\partial q_x^2\,(Q) - \sum_{x\in\mathbb{Z}^d} \beta\,(\partial H/\partial q_x)(Q)\,(\partial f/\partial q_x)(Q), \qquad Q = \{q_x\},$$
generating an infinite-dimensional diffusion process $\Xi(t)$, in the space $L_2(\mathbb{R}^{\mathbb{Z}^d}, d\nu(Q))$. Here $\nu$ is a "natural" $\Xi(t)$-invariant measure on $\mathbb{R}^{\mathbb{Z}^d}$ which is a Gibbs distribution corresponding to a (formal) Hamiltonian $H$ of an anharmonic crystal, with a value of the inverse temperature $\beta > 0$. For $\beta$ small enough, we establish the existence of an $L$-invariant subspace $\mathcal{H}_1 \subset L_2(\mathbb{R}^{\mathbb{Z}^d}, d\nu(Q))$ such that $L\restriction\mathcal{H}_1$ has a distinctive character related to a "quasi-particle" picture. In particular, $L\restriction\mathcal{H}_1$ has a Lebesgue spectrum separated from the rest of the spectrum of $L$ and concentrated near a point $\kappa_1 > 0$ giving the smallest non-zero eigenvalue of a limiting problem associated with $\beta = 0$. An immediate corollary of our result is an exponentially fast $L_2$-convergence to equilibrium for the process $\Xi(t)$ for small values of $\beta$.
1. Introduction

In this paper we consider the problem of describing a "lower" component of the spectrum of the generator of an infinite system of interacting diffusions. The dynamics of the model are given as an infinite-dimensional Markov process $\Xi(t) = \{\xi_x(t), x \in \mathbb{Z}^d\}$, $t \ge 0$, with state space $\Omega = \mathbb{R}^{\mathbb{Z}^d}$, determined by a countable system of stochastic differential equations
$$d\xi_x(t) = -\beta(\partial H/\partial q_x)(\Xi(t))\,dt + dW_x(t), \qquad \xi_x(0) = q_x^0, \quad x \in \mathbb{Z}^d, \tag{1.1}$$
where $\{W_x, x \in \mathbb{Z}^d\}$ is a family of independent Wiener processes on $\mathbb{R}$ labelled by sites $x \in \mathbb{Z}^d$, and $Q^0 = \{q_y^0, y \in \mathbb{Z}^d\} \in \Omega$ is an initial condition. Furthermore, $H(Q)$ is a formal Hamiltonian:
$$H(Q) = \sum_{x\in\mathbb{Z}^d} q_x^{2s} + \frac{\alpha}{2}\sum_{x,x'\in\mathbb{Z}^d:\ |x-x'|=1} (q_x - q_{x'})^2, \qquad Q = \{q_y, y \in \mathbb{Z}^d\}, \tag{1.2}$$
where $s$ is a natural number, the coupling constant $\alpha$ is $> 0$, and $|x - x'|$ denotes the distance (Euclidean or lattice) between $x, x' \in \mathbb{Z}^d$. The value $\beta > 0$ in (1.1) is interpreted as inverse temperature.

It is known (see the original papers [7, 9, 23] and a review [6]) that, for a "tempered" $Q^0 \in \Omega$, there exists a (strong) solution to (1.1) which is in fact unique among tempered weak solutions. (In fact, the existence and uniqueness of such a solution can be proved under much more general assumptions about $H(Q)$.) Furthermore (see [7, 23]), $\forall \beta > 0$ the process $\Xi(t)$ $(= \Xi_\beta(t; Q^0))$ defined by (1.1) has a unique invariant measure $\nu$ $(= \nu_\beta)$ such that $\sup_{x\in\mathbb{Z}^d} \int q_x^2\, d\nu(Q) < \infty$. Moreover, the measure $\nu$ coincides with the Gibbs probability distribution corresponding to the Hamiltonian $H$ (see (1.2)) and the value of the inverse temperature $\beta$. The last result also establishes the uniqueness, $\forall \beta > 0$, of a Gibbs distribution for Hamiltonian $H$ (again within the class of probability measures on $\Omega$ with a uniformly bounded second moment). As before, this result holds true under more general assumptions about $H(Q)$. The measure $\nu$ is also invariant under space shifts in $\Omega$.

Process $\Xi_\beta(t)$ with invariant measure $\nu$ is reversible and ergodic. The semi-group of its transition operators acting in the Hilbert space (H.s.) $\mathcal{H} := L_2(\Omega, d\nu(Q))$ is self-adjoint; its generator $L$ $(= L_\beta)$ is also self-adjoint (and even positive definite). It is defined on a suitable dense set $D(L) \subset \mathcal{H}$ composed of "local", smooth and tempered functions $f(Q)$, $Q \in \Omega$, where it has the form
$$Lf(Q) = -\sum_{x\in\mathbb{Z}^d} \partial^2 f/\partial q_x^2\,(Q) - \sum_{x\in\mathbb{Z}^d} \beta\,(\partial H/\partial q_x)(Q)\,(\partial f/\partial q_x)(Q). \tag{1.3}$$
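As an informal illustration of the dynamics (1.1), and not as part of the original argument, the following sketch applies Euler–Maruyama steps to a finite periodic chain (d = 1). Several choices here are assumptions made only for the example: the truncation to `n_sites` sites with periodic boundary, the time step `dt`, the use of standard Gaussian increments of variance `dt` for $dW_x$, and the form $\partial H/\partial q_x = 2s\,q_x^{2s-1} + \alpha\sum_{x':|x-x'|=1}(q_x - q_{x'})$, which matches the drift coefficients appearing in (2.3).

```python
import math
import random


def grad_H(q, s, alpha):
    """Assumed dH/dq_x on a periodic 1D chain:
    2s*q_x^(2s-1) + alpha * sum over the two neighbours of (q_x - q_x')."""
    n = len(q)
    return [
        2 * s * q[x] ** (2 * s - 1)
        + alpha * ((q[x] - q[(x - 1) % n]) + (q[x] - q[(x + 1) % n]))
        for x in range(n)
    ]


def simulate(n_sites=8, s=2, alpha=1.0, beta=0.1, dt=1e-3, n_steps=2000, seed=0):
    """Euler-Maruyama steps for d(xi_x) = -beta * dH/dq_x dt + dW_x, started at Q0 = 0.
    All default parameter values are hypothetical and chosen only for illustration."""
    rng = random.Random(seed)
    q = [0.0] * n_sites
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        g = grad_H(q, s, alpha)
        q = [q[x] - beta * g[x] * dt + sqrt_dt * rng.gauss(0.0, 1.0)
             for x in range(n_sites)]
    return q


if __name__ == "__main__":
    print(simulate())
```

The explicit scheme is only reasonable for small `dt`, because the drift grows like $q^{2s-1}$; a finite chain of course says nothing about the infinite-volume questions treated in the paper.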
In particular, the function $f \equiv 1$ is in the domain $D(L)$ and is taken to zero, i.e. it is a (unique) normalised eigenvector of $L$ with eigenvalue 0. The rest of the spectrum of $L$ lies on $\mathbb{R}_+ = (0, \infty)$.

The main result of this paper is as follows. For $\beta$ small enough, and under the condition
$$s > 2d + 1, \tag{1.4}$$
there exists a subspace $\mathcal{H}_1 \subset \mathcal{H}$, invariant under $L$ and the space-shift unitary group $\{U_y, y \in \mathbb{Z}^d\}$, such that the spectrum of the restriction $L\restriction\mathcal{H}_1$ is Lebesgue and fills a segment $J \subset \mathbb{R}_+$ of length $\sim \beta$, separated by gaps of size $\sim \beta^{1/s}$ from 0 and from the rest of the spectrum of $L$ (which lies to the right of $J$). For the precise statement, see Theorem 1 below. Furthermore, let $L_2(\mathbb{T}^d, d\lambda)$ denote the space $L_2$ on the $d$-dimensional torus $\mathbb{T}^d$ with the standard Lebesgue measure. Then, under a unitary map $V: \mathcal{H}_1 \to L_2(\mathbb{T}^d, d\lambda)$ (which is cyclic for group $\{U_y\}$), operator $L\restriction\mathcal{H}_1$ is taken to the operator of multiplication by a non-constant analytic function $\hat m(\lambda)$, $\lambda = (\lambda^{(1)}, \ldots, \lambda^{(d)}) \in \mathbb{T}^d$, with values in $J$, whereas the operators $U_y\restriction\mathcal{H}_1$, $y = (y^{(1)}, \ldots, y^{(d)})$, are taken to the operators of multiplication by $\exp(i\langle y, \lambda\rangle)$, where $\langle y, \lambda\rangle = \sum_{1\le j\le d} y^{(j)}\lambda^{(j)}$. By using quantum-mechanics (or rather quantum field theory) analogies, one can interpret vectors of H.s. $\mathcal{H}_1$ as states of a certain (quasi-) "particle", and the value $\hat m(\lambda)$ as the energy of the particle with the quasi-momentum $\lambda$. (Physicists often say that the function $\hat m$ gives the dispersion rule of an individual (quantum) particle.)

An immediate corollary of this result is an exponentially fast $L_2$-convergence (for $\beta$ small enough) of the distribution of $\Xi(t)$ to measure $\nu$ as $t \to \infty$, for any initial distribution $\nu^0$ that is absolutely continuous with respect to $\nu$ and with $d\nu^0/d\nu \in \mathcal{H}$. After this paper was accepted, we learned about preprint [27] where exponential convergence was established in a different (and actually stronger) form (again under a condition that $\beta$ is small enough). The technique used in [27] (and in related preprints [25], [26]) is based on logarithmic Sobolev inequalities and direct bounds upon process $\Xi(t)$, and employs (in a rather indirect way) many properties of a Gibbs state used in this paper. See also [28]. Yet another approach was put forward in [22].

The case of small $\beta$ considered in this paper may be treated as a small (although singular) perturbation of a certain "decoupled" system corresponding to $\beta = 0$; see below. In particular, the generator $K$ of the "natural" decoupled system associated with (1.1)–(1.4) has a non-negative discrete spectrum of a distinctive "additive" structure (cf. (2.4)–(2.7) and (2.12)), where any positive eigenvalues have infinite multiplicity. The quasi-particle spectrum component $L\restriction\mathcal{H}_1$ of the perturbed operator $L$ "arises" from the eigenspace corresponding to the lowest positive eigenvalue $\kappa_1$ of $K$. Such a picture is typical for "cluster" operators associated with infinite-particle systems, cf. [14]. We want to stress that, as we believe, our results hold true for a more general form of Hamiltonian $H$ than (1.2), (1.4).
In particular, one could allow the existence, for $\beta$ large enough, of more than one Gibbs distribution, as we work only in the region of small $\beta$'s where uniqueness is guaranteed by appropriate high-temperature polymer (or cluster) expansions. Furthermore, the proof in such a general situation could be done along the same lines, although it would considerably lengthen the exposition. We believe the same is also true about condition (1.4), which plays in this paper an important, although purely technical, role.

The problems arising from rigorous study of the spectra of generators of various stochastic dynamics (including the dynamics implemented by an infinite-volume transfer-matrix operator) have their own history going back to [18]. There exists an extensive bibliography devoted to various aspects of this problem in the case of Glauber dynamics on $\mathbb{Z}^d$ (otherwise known as the stochastic Ising model (s.I.m.)); see, e.g., reviews [6] and [12], as well as the literature quoted in these sources. In paper [16], under the assumption that the inverse temperature $\beta$ of the model is small enough, a number of "lower" invariant subspaces of the s.I.m. generator were constructed (corresponding to $k$-particle pictures, $k \ge 1$), and the spectrum on the first of these subspaces (with $k = 1$) was described, in terms similar to the ones above. (It should be noted that for $d \ge 2$ the s.I.m. exhibits a complicated phenomenon of non-uniqueness of invariant measures.) In the one-dimensional case ($d = 1$), the s.I.m. is (again for $\beta > 0$ small enough) relatively simple: in this case all invariant subspaces of the generator and the spectrum on all of them were described in [19]. In [10] and [29], a similar problem was considered for the dynamics of plane rotators; in this model one deals with a system of stochastic differential equations which is similar to (1.1), but on a compact manifold (a circle $S^1$) rather than on $\mathbb{R}$.
In this case, one was able to construct one- and two-particle invariant subspaces and describe the spectrum of the generator on these subspaces. The present paper follows the general scheme employed in the above papers and in [14] (which will allow us to avoid some tedious detail), although, in view of the non-compactness of the "spin" space $\mathbb{R}$, the whole construction is still technically rather involved. We also want to refer to [1] and [2, 3] (see also the references therein) where various properties of the operator $L$ and process $\Xi(t)$ are discussed from a different point of view.

The paper is organised according to the following scheme. In Sect. 2 we state the problem and the main theorem on the one-particle invariant subspace $\mathcal{H}_1$. Section 3 contains the proof of the first part of the theorem: here, we perform a construction of space $\mathcal{H}_1$. In Sect. 4, we establish the form of the spectrum of operator $L$ on $\mathcal{H}_1$, and in Sects. 5 and 6 we prove various technical facts used in Sects. 2–4.

2. The Main Theorem

It is convenient to pass to a "modified" form of the operator $L$, by using the "multiplicative" change of variables $q_y \mapsto \beta^{1/(2s)} q_y$, $y \in \mathbb{Z}^d$. This generates a unitary transformation of H.s.'s, $R: \mathcal{H} \to \bar{\mathcal{H}}$, where
$$\bar{\mathcal{H}} := L_2(\Omega, d\bar\nu(Q)) \quad\text{and}\quad Rf(Q) = f(\beta^{-1/(2s)} Q), \ f \in \mathcal{H}, \tag{2.1}$$
and $\bar\nu = \bar\nu_\beta$ is the Gibbs distribution determined by the Hamiltonian
$$\hat H(Q) = \sum_{x\in\mathbb{Z}^d} q_x^{2s} + \beta^{1-1/s}\,\frac{\alpha}{2}\sum_{x,x'\in\mathbb{Z}^d:\ |x-x'|=1} (q_x - q_{x'})^2, \tag{2.2}$$
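with the inverse temperature one. The transformed operator $RLR^{-1}$ has the form $RLR^{-1} = \beta^{1/s}\bar L$, where
$$\bar L\ (= \bar L_\beta) = \sum_{x\in\mathbb{Z}^d}\Big[-\partial^2/\partial q_x^2 + 2s\,q_x^{2s-1}\,\partial/\partial q_x\Big] + \varepsilon\sum_{x,x'\in\mathbb{Z}^d:\ |x-x'|=1} (q_x - q_{x'})\,\partial/\partial q_x, \tag{2.3}$$
with $\varepsilon = \alpha\beta^{1-1/s}$. Operator $\bar L$ is of course self-adjoint on its domain $D(\bar L) = RD(L)$. Thus, the problem of describing the spectrum of operator $L$ is reduced to that for $\bar L$.

For small $\beta$ (and hence small $\varepsilon$) operator $\bar L$ may be considered as a perturbation of the "decoupled" self-adjoint linear operator (l.o.) $K$ in H.s. $\mathcal{F}$, corresponding to "free" dynamics:
$$K = \sum_{x\in\mathbb{Z}^d}\Big[-\partial^2/\partial q_x^2 + 2s\,q_x^{2s-1}\,\partial/\partial q_x\Big], \qquad \mathcal{F} := L_2(\Omega, d\mu(Q)), \quad \mu = \mu_0^{\mathbb{Z}^d}; \tag{2.4}$$
here $\mu_0$ is a probability measure on $\mathbb{R}$:
$$d\mu_0(q) = I^{-1}\exp(-q^{2s})\,dq, \quad\text{where}\quad I = \int d\tilde q\,\exp(-\tilde q^{2s}). \tag{2.5}$$
Denote by $k$ the self-adjoint l.o. acting in the space $L_2(\mathbb{R}, d\mu_0(q))$ by the formula
$$k = -\frac{d^2}{dq^2} + 2s\,q^{2s-1}\frac{d}{dq}. \tag{2.6}$$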
The spectrum of $k$ is a sequence of multiplicity one eigenvalues
$$0 = \kappa_0 < \kappa_1 < \ldots < \kappa_n < \ldots, \qquad \kappa_n \nearrow \infty. \tag{2.7}$$
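For orientation only, the eigenvalues in (2.7) can be approximated numerically. The sketch below is not the authors' method: it uses the standard ground-state transformation $f = e^{U/2} g$ with $U(q) = q^{2s}$, which turns $k$ into the Schrödinger operator $-d^2/dq^2 + V$ with $V = (U')^2/4 - U''/2 = s^2 q^{4s-2} - s(2s-1) q^{2s-2}$, then discretises it by finite differences on a truncated interval and locates eigenvalues of the resulting tridiagonal matrix by Sturm-sequence bisection. The box size `box`, grid size `n_grid` and the function names are assumptions made for the example.

```python
def potential(q, s):
    """V(q) = s^2 q^(4s-2) - s(2s-1) q^(2s-2), i.e. U'^2/4 - U''/2 for U = q^(2s)."""
    return s * s * q ** (4 * s - 2) - s * (2 * s - 1) * q ** (2 * s - 2)


def count_below(diag, off2, x):
    """Number of eigenvalues below x for a symmetric tridiagonal matrix with
    diagonal `diag` and constant squared off-diagonal `off2` (Sturm sequence)."""
    count, d = 0, 1.0
    for i, a in enumerate(diag):
        d = a - x - (off2 / d if i > 0 else 0.0)
        if d == 0.0:
            d = 1e-200  # avoid division by zero in the next step
        if d < 0.0:
            count += 1
    return count


def kappa(s, n_eig=3, box=6.0, n_grid=1200):
    """Approximate kappa_0 < kappa_1 < ... for k = -d^2/dq^2 + 2s q^(2s-1) d/dq,
    using Dirichlet conditions at +-box (a numerical truncation, assumed harmless)."""
    h = 2.0 * box / (n_grid + 1)
    diag = [2.0 / h ** 2 + potential(-box + (i + 1) * h, s) for i in range(n_grid)]
    off2 = 1.0 / h ** 4
    lo0 = min(diag) - 2.0 / h ** 2  # Gershgorin bounds on the spectrum
    hi0 = max(diag) + 2.0 / h ** 2
    eigs = []
    for j in range(n_eig):
        lo, hi = lo0, hi0
        for _ in range(100):  # bisection for the j-th eigenvalue
            mid = 0.5 * (lo + hi)
            if count_below(diag, off2, mid) > j:
                hi = mid
            else:
                lo = mid
        eigs.append(0.5 * (lo + hi))
    return eigs


if __name__ == "__main__":
    print(kappa(1))  # s = 1 is the Hermite case, where kappa_n = 2n exactly
    print(kappa(2))
```

For $s = 1$ the operator $k$ is the Hermite (Ornstein–Uhlenbeck type) operator with $\kappa_n = 2n$, which gives a convenient sanity check of the discretisation; the condition (1.4) of the paper of course requires much larger $s$.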
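The unitary group $\{U_y, y \in \mathbb{Z}^d\}$ and the involution $\mathbf{J}$ are given by
$$U_y f(Q) = f(U_y Q), \qquad \mathbf{J} f(Q) = f(-Q), \tag{2.8}$$
where, on the right-hand side, $U_y$ is the shift $Q = \{q_x\} \mapsto \{q'_x\}$, with $q'_x = q_{x+y}$, $x \in \mathbb{Z}^d$. Note that both $\{U_y, y \in \mathbb{Z}^d\}$ and $\mathbf{J}$ commute with both $\bar L$ and $K$ (when considered in the corresponding H.s.). In particular, let $\bar{\mathcal{H}}^{ev}$ and $\bar{\mathcal{H}}^{od}$ denote, respectively, the even and odd subspaces of $\bar{\mathcal{H}}$ relative to $\mathbf{J}$. Then $\bar L: \bar{\mathcal{H}}^{ev} \cap D(\bar L) \to \bar{\mathcal{H}}^{ev}$ and similarly for $\bar{\mathcal{H}}^{od}$. The same conclusion holds for operator $K$ and the even and odd subspaces $\mathcal{F}^{ev}, \mathcal{F}^{od} \subset \mathcal{F}$.

Theorem 1. Given $s$ and $\alpha$ as above, $s$ satisfying (1.4), there exist constants $\beta^0, C > 0$ such that, for $0 < \beta < \beta^0$,

1. There exist decompositions of H.s.'s $\bar{\mathcal{H}}^{ev}$ and $\bar{\mathcal{H}}^{od}$ into $\bar{\mathcal{H}}$-orthogonal direct sums
$$\bar{\mathcal{H}}^{ev} = \bar{\mathcal{H}}_0 \oplus \bar{\mathcal{H}}^{ev}_{\ge 2}, \qquad \bar{\mathcal{H}}^{od} = \bar{\mathcal{H}}_1 \oplus \bar{\mathcal{H}}^{od}_{>1}, \tag{2.10}$$
where subspaces $\bar{\mathcal{H}}_0$, $\bar{\mathcal{H}}_1$, $\bar{\mathcal{H}}^{ev}_{\ge 2}$ and $\bar{\mathcal{H}}^{od}_{>1}$ are invariant under $\bar L$ and $\{U_y\}$, and $\bar{\mathcal{H}}_0$ is a one-dimensional nil-subspace of $\bar L$ consisting of constant functions. Furthermore,
(i) the spectra of the restrictions $\bar L^{ev}_{\ge 2} := \bar L\restriction\bar{\mathcal{H}}^{ev}_{\ge 2}$ and $\bar L^{od}_{>1} := \bar L\restriction\bar{\mathcal{H}}^{od}_{>1}$ lie in $(\bar\kappa_2 - C\varepsilon, \infty)$ and $(\bar\kappa_3 - C\varepsilon, \infty)$, respectively, where $\bar\kappa_2 = \min[2\kappa_1, \kappa_2]$ and $\bar\kappa_3 = \min[3\kappa_1, \kappa_1 + \kappa_2, \kappa_3]$, and
(ii) the spectrum of the restriction $\bar L_1 := \bar L\restriction\bar{\mathcal{H}}_1$ is confined to the interval $J = (\kappa_1 - C\varepsilon, \kappa_1 + C\varepsilon)$. In particular, the spectrum of $\bar L_1$ is separated from 0 and the spectrum of $\bar L\restriction(\bar{\mathcal{H}}^{ev}_{\ge 2} \vee \bar{\mathcal{H}}^{od}_{>1})$.
(Here, $\bar{\mathcal{H}}^{ev}_{\ge 2} \vee \bar{\mathcal{H}}^{od}_{>1}$ denotes the subspace spanned by $\bar{\mathcal{H}}^{ev}_{\ge 2}$ and $\bar{\mathcal{H}}^{od}_{>1}$.)

2. There exists a unitary map $V: \bar{\mathcal{H}}_1 \to L_2(\mathbb{T}^d, d\lambda)$ such that the l.o.'s $V\bar L_1 V^{-1}$ and $V(U_y\restriction\bar{\mathcal{H}}_1)V^{-1}$ have the form
$$V\bar L_1 V^{-1} f(\lambda) = \hat m(\lambda) f(\lambda), \qquad V(U_y\restriction\bar{\mathcal{H}}_1)V^{-1} f(\lambda) = \exp\big(i\langle y, \lambda\rangle\big) f(\lambda),$$
$$\lambda = (\lambda^{(1)}, \ldots, \lambda^{(d)}) \in \mathbb{T}^d, \quad y = (y^{(1)}, \ldots, y^{(d)}) \in \mathbb{Z}^d, \quad f \in L_2(\mathbb{T}^d, d\lambda). \tag{2.11}$$
Here $\mathbb{T}^d$ is the $d$-dimensional unit torus and $\hat m$ is a non-constant analytic function on $\mathbb{T}^d$ with values in interval $J$ specified in assertion 1(ii). In particular, the spectrum of $\bar L_1$ is Lebesgue.

We conclude this section with an observation about the spectrum of operator $K$ in H.s. $\mathcal{F}$. Denote by $\psi_n$ the (normalised) eigenvector of $k$ corresponding to $\kappa_n$, $n \in \mathbb{Z}_+ := \{0, 1, \ldots\}$. For $n$ even, $\psi_n$ is an even function of $q \in \mathbb{R}$; for $n$ odd, $\psi_n$ is odd. Comparing (2.4) and (2.6), we see that the eigenfunctions and eigenvalues of $K$ are of the form
$$\Psi_{\mathbf{n}}(Q) = \prod_x \psi_{\mathbf{n}(x)}(q_x) \quad\text{and}\quad K_{\mathbf{n}} = \sum_x \kappa_{\mathbf{n}(x)}. \tag{2.12}$$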
Here $\mathbf{n}$ is a non-negative integer-valued function $x \in \mathbb{Z}^d \mapsto \mathbf{n}(x)$ (called a multi-index), with $|\mathbf{n}| = \sum_x \mathbf{n}(x) < \infty$; the set of such functions is denoted by $\mathcal{N}$. Functions $\Psi_{\mathbf{n}}$, $\mathbf{n} \in \mathcal{N}$, form an orthonormal basis in $\mathcal{F}$. Furthermore, each $\Psi_{\mathbf{n}}$ is either an even or an odd vector relative to $\mathbf{J}$, and the parity of $\Psi_{\mathbf{n}}$ coincides with that of $|\mathbf{n}|$. So, $\mathcal{F}^{ev}$ is spanned by the even and $\mathcal{F}^{od}$ by the odd $\Psi_{\mathbf{n}}$'s. In particular, (a) 0 is a simple eigenvalue of $K$, (b) the lowest positive eigenvalue of $K$ is $\kappa_1$; it has an infinite multiplicity, and the corresponding eigenspace $\mathcal{F}_1$ is spanned by the odd vectors $\Psi_{e_y}$, $y \in \mathbb{Z}^d$. Here, $e_y$ denotes the multi-index with $e_y(x) = \mathbf{1}(x = y)$, $x \in \mathbb{Z}^d$. The next eigenvalue is $\bar\kappa_2 = \min[2\kappa_1, \kappa_2]$, etc. Note that each eigenspace of $K$ corresponding to a given eigenvalue $K_{\mathbf{n}}$ is invariant under $\{U_y\}$. We see that in terms of the asymptotic of the spectrum, $\bar L_1$ is related to the restriction $K\restriction\mathcal{F}_1$: as $\beta \to 0$, interval $J$ shrinks to $\kappa_1$.

An agreement used for the rest of the paper is that the notation $c_0$, $c_1$, etc., is used for positive constants varying from one lemma to another (so, e.g., constant $c_0$ in Lemma 3.1 is different from that in Lemma 3.3); unless otherwise specified, these constants do not depend on variables figuring in the corresponding assertion (e.g., $c_0$ in Lemma 3.1 does not depend on $n$). Also, each time a bound includes $\varepsilon$, we assume that $\beta$ (and hence $\varepsilon$) is small enough, in the sense indicated in Theorem 1.

3. Constructing Subspace $\bar{\mathcal{H}}_1$

An important fact is that functions $\Psi_{\mathbf{n}} \in \bar{\mathcal{H}}$. Furthermore,

Lemma 3.1. The $\bar{\mathcal{H}}$-norm of $\Psi_{\mathbf{n}}$ obeys
$$\|\Psi_{\mathbf{n}}\|_{\bar{\mathcal{H}}} \le \prod_x \gamma_{\mathbf{n}(x)}, \quad\text{where}\quad \gamma_0 = 1, \quad \gamma_n = c_0(\kappa_n + 1)^{2d/(4s-2)}, \ n \ge 1. \tag{3.1}$$
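The proof of Lemma 3.1 is carried out in Sect. 5. Consider now the system of functions $\Phi_{\mathbf{n}} \in \bar{\mathcal{H}} \cap \mathcal{F}$, $\mathbf{n} \in \mathcal{N}$:
$$\Phi_{\mathbf{n}}(Q) = \Psi_{\mathbf{n}}(Q)\Big/\prod_x \gamma_{\mathbf{n}(x)} = \prod_x \phi_{\mathbf{n}(x)}(q_x), \qquad \phi_n(q) = \psi_n(q)/\gamma_n, \quad q \in \mathbb{R}, \ n \in \mathbb{Z}_+, \tag{3.2}$$
where $\gamma_n$ are given by (3.1). Denote by $\mathcal{L}$ the space of functions $G(Q)$ represented by the following series: $G = \sum_{\mathbf{n}} g_{\mathbf{n}}\Phi_{\mathbf{n}}$, $g_{\mathbf{n}} \in \mathbb{C}$, with the norm $|||G|||_{\mathcal{L}} = \sum_{\mathbf{n}} |g_{\mathbf{n}}| < \infty$, so that $\mathcal{L}$ is isomorphic to $l_1(\mathcal{N})$. In view of Lemma 3.1, $\mathcal{L} \subset \bar{\mathcal{H}}$, $\|G\|_{\bar{\mathcal{H}}} \le |||G|||_{\mathcal{L}}$, and $\mathcal{L}$ is dense in $\bar{\mathcal{H}}$.

Lemma 3.2. Any $\Phi_{\mathbf{n}}$, $\mathbf{n} \in \mathcal{N}$, is taken by $\bar L$ to a vector from $\mathcal{L}$: $\bar L\Phi_{\mathbf{n}} \in \mathcal{L}$.

Proof of Lemma 3.2. Consider the representation $\bar L\Phi_{\mathbf{n}} = \sum_{\mathbf{m}} L_{\mathbf{n},\mathbf{m}}\Phi_{\mathbf{m}}$. Comparing (2.3) and (2.4) and using (3.2) we find that $L_{\mathbf{n},\mathbf{m}} = M_{\mathbf{n},\mathbf{m}} + W_{\mathbf{n},\mathbf{m}}$, where
$$M_{\mathbf{n},\mathbf{m}} = \mathbf{1}(\mathbf{n} = \mathbf{m})K_{\mathbf{n}}, \qquad W_{\mathbf{n},\mathbf{m}} = \varepsilon\sum_{x\in\sigma(\mathbf{n})}\Big[2d\,r_{\mathbf{n}(x),\mathbf{m}(x)}\mathbf{1}(\mathbf{m}_x = \mathbf{n}_x) - \sum_{y:\ |y-x|=1} p_{\mathbf{n}(x),\mathbf{m}(x)}\,b_{\mathbf{n}(y),\mathbf{m}(y)}\,\mathbf{1}(\mathbf{m}_{x,y} = \mathbf{n}_{x,y})\Big]. \tag{3.3}$$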
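Here, and below, $\sigma(\mathbf{n})$ stands for the support of the multi-index $\mathbf{n} \in \mathcal{N}$: $\sigma(\mathbf{n}) = \{y \in \mathbb{Z}^d: \mathbf{n}(y) \ge 1\}$. Next, $\mathbf{n}_x$ and $\mathbf{m}_x$ denote the multi-indices which differ from $\mathbf{n}$ and $\mathbf{m}$ at site $x$ only:
$$\mathbf{n}_x(z) = \begin{cases} \mathbf{n}(z), & z \ne x,\\ 0, & z = x,\end{cases} \qquad \mathbf{m}_x(z) = \begin{cases} \mathbf{m}(z), & z \ne x,\\ 0, & z = x.\end{cases}$$
Similarly,
$$\mathbf{n}_{x,y}(z) = \begin{cases} \mathbf{n}(z), & z \ne x,\ z \ne y,\\ 0, & z = x \text{ or } z = y,\end{cases} \qquad \mathbf{m}_{x,y}(z) = \begin{cases} \mathbf{m}(z), & z \ne x,\ z \ne y,\\ 0, & z = x \text{ or } z = y.\end{cases}$$
Furthermore, $r_{n,m}$, $p_{n,m}$ and $b_{n,m}$ are defined by
$$q\frac{d\phi_n}{dq}(q) = \sum_m r_{n,m}\phi_m(q), \qquad \frac{d\phi_n}{dq}(q) = \sum_m p_{n,m}\phi_m(q), \qquad q\phi_n(q) = \sum_m b_{n,m}\phi_m(q). \tag{3.4}$$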
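Lemma 3.3. The following bounds hold true:
$$|r_{n,m}| \le \begin{cases} c_0|\kappa_n - \kappa_m|^{-1}(\kappa_n + c_1)^{\frac{3}{4} - \frac{4d-1}{2(4s-2)}}(\kappa_m + c_1)^{\frac{2d+s-1}{4s-2}}, & m \ne n,\ n \ge 1,\\ c_0(\kappa_n + c_1)^{\frac{1}{2} + \frac{1}{4s-2}}, & m = n \ge 1,\\ 0, & n = 0,\end{cases} \tag{3.5}$$
$$|p_{n,m}| \le \begin{cases} c_0|\kappa_n - \kappa_m|^{-1}(\kappa_n + c_1)^{\frac{3}{4} - \frac{4d-1}{2(4s-2)}}(\kappa_m + c_1)^{\frac{2d+s-2}{4s-2}}, & m \ne n,\ n \ge 1,\\ 0, & n = m \text{ or } n = 0,\end{cases} \tag{3.6}$$
$$|b_{n,m}| \le \begin{cases} c_0|\kappa_n - \kappa_m|^{-1}(\kappa_n + c_1)^{\frac{1}{2} - \frac{2d}{4s-2}}(\kappa_m + c_1)^{\frac{2d}{4s-2}}, & m \ne n,\ n \ge 1,\\ c_0\kappa_m^{-1}(\kappa_m + c_1)^{\frac{2d}{4s-2}}, & m \ge 1,\ n = 0,\\ 0, & m = n.\end{cases} \tag{3.7}$$
Moreover,
$$\sum_m |r_{n,m}| \le c_0(\kappa_n + c_1)^{\frac{1}{2} + \frac{1}{4s-2}}\ln(\kappa_n + c_1), \tag{3.8a}$$
$$\sum_m |p_{n,m}| \le c_0(\kappa_n + c_1)^{\frac{1}{2}}\ln(\kappa_n + c_1), \tag{3.8b}$$
and
$$\sum_m |b_{n,m}| \le c_0(\kappa_n + c_1)^{\frac{1}{4s-2}}\ln(\kappa_n + c_1). \tag{3.8c}$$
The proof of Lemma 3.3 is given in Sect. 6.

Remark. It is the bound for $\sum_m |r_{n,m}|$ in (3.8a) where condition (1.4) is essential (by using methods proposed in this paper).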
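The assertion of Lemma 3.2 now follows from (3.3) and bounds (3.5)–(3.8). □

Thus, we can consider the restriction of $\bar L$ to $\mathcal{L}$, with the domain $D_{\mathcal{L}}(\bar L) = \{\Phi \in \mathcal{L}: \bar L\Phi \in \mathcal{L}\}$. Clearly, $D_{\mathcal{L}}(\bar L) \subset D_{\bar{\mathcal{H}}}(\bar L)$. For simplicity, we will keep for the last operator the original notation $\bar L$. Space $\mathcal{L}$ is decomposed into a sum of its closed subspaces $\mathcal{L}^{ev}$ and $\mathcal{L}^{od}$ spanned by the even and odd vectors $\Phi_{\mathbf{n}}$, respectively: $\mathcal{L} = \mathcal{L}^{ev} + \mathcal{L}^{od}$. Both $\mathcal{L}^{ev}$ and $\mathcal{L}^{od}$ are invariant with respect to $\bar L$ and $\{U_y\}$, and we set $\bar L^{ev} = \bar L\restriction\mathcal{L}^{ev}$ and $\bar L^{od} = \bar L\restriction\mathcal{L}^{od}$.

Lemma 3.4. There exist decompositions of $\mathcal{L}^{ev}$ and $\mathcal{L}^{od}$ into sums of their $\mathcal{L}$-closed, $\bar L$- and $\{U_y\}$-invariant subspaces:
$$\mathcal{L}^{ev} = \mathcal{L}_0 + \mathcal{L}^{ev}_{\ge 2}, \qquad \mathcal{L}^{od} = \mathcal{L}_1 + \mathcal{L}^{od}_{>1}, \tag{3.9}$$
with the following properties.
1. $\mathcal{L}_0$ is the one-dimensional nil-subspace of $\bar L$ formed by constant functions.
2. The restrictions $\bar L^{ev}_{\ge 2} = \bar L^{ev}\restriction\mathcal{L}^{ev}_{\ge 2}$ and $\bar L^{od}_{>1} = \bar L^{od}\restriction\mathcal{L}^{od}_{>1}$ are invertible l.o.'s, and the $\mathcal{L}$-norms of their inverses obey $|||(\bar L^{ev}_{\ge 2})^{-1}|||_{\mathcal{L}} \le 1/(\bar\kappa_2 - c_0\varepsilon)$, $|||(\bar L^{od}_{>1})^{-1}|||_{\mathcal{L}} \le 1/(\bar\kappa_3 - c_0\varepsilon)$, where $\bar\kappa_2$ and $\bar\kappa_3$ are as in Theorem 1.
3. The restriction $\bar L_1 = \bar L^{od}\restriction\mathcal{L}_1$ has the $\mathcal{L}$-norm $|||\bar L_1|||_{\mathcal{L}}$ obeying $|||\bar L_1|||_{\mathcal{L}} \le \kappa_1 + c_0\varepsilon$.
4. The $\bar{\mathcal{H}}$-closures
$$\bar{\mathcal{H}}_1 = \mathrm{Cl}_{\bar{\mathcal{H}}}\,\mathcal{L}_1, \qquad \bar{\mathcal{H}}^{ev}_{\ge 2} = \mathrm{Cl}_{\bar{\mathcal{H}}}\,\mathcal{L}^{ev}_{\ge 2}, \qquad \bar{\mathcal{H}}^{od}_{>1} = \mathrm{Cl}_{\bar{\mathcal{H}}}\,\mathcal{L}^{od}_{>1}, \tag{3.10}$$
are invariant under $\bar L$ and $\{U_y\}$, and, together with $\bar{\mathcal{H}}_0 = \mathcal{L}_0$, form decompositions (2.10).

Proof of Lemma 3.4. We begin with a construction of spaces $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$. The starting point is a decomposition
$$\mathcal{L}^{od} = \mathcal{L}^0_1 + \mathcal{L}^{0,od}_{>1}, \tag{3.11}$$
where $\mathcal{L}^0_1 = \big\{\sum_{x\in\mathbb{Z}^d} g_x\Phi_{e_x}\big\}$ and $\mathcal{L}^{0,od}_{>1} = \big\{\sum_{\mathbf{n}\in\mathcal{N}^{od},\ |\mathbf{n}|>1} g_{\mathbf{n}}\Phi_{\mathbf{n}}\big\}$. Formula (3.11) induces the corresponding matrix representation
$$\bar L^{od} \simeq \begin{pmatrix} \bar L^{od}_{0,0} & \bar L^{od}_{0,1}\\ \bar L^{od}_{1,0} & \bar L^{od}_{1,1}\end{pmatrix},$$
where $\bar L^{od}_{0,0}: \mathcal{L}^0_1 \to \mathcal{L}^0_1$, $\bar L^{od}_{0,1}: \mathcal{L}^{0,od}_{>1} \to \mathcal{L}^0_1$, etc. We also introduce a l.o. $M$ acting in space $\mathcal{L}^{0,od}_{>1}$ and defined by $M\Phi_{\mathbf{n}} = K_{\mathbf{n}}\Phi_{\mathbf{n}}$. (The action of $M$ is identical to that of $K$, but in a different space.)

Spaces $\mathcal{L}^0_1$ and $\mathcal{L}^{0,od}_{>1}$ are not $\bar L$-invariant; in order to get the $\bar L$-invariant decomposition (3.9) we will perform some "corrections". Namely, spaces $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$ in (3.9) are sought in the form
$$\mathcal{L}_1 = \big\{v: v = u + M^{-1}Su,\ u \in \mathcal{L}^0_1\big\}, \qquad \mathcal{L}^{od}_{>1} = \big\{u: u = v + Tv,\ v \in \mathcal{L}^{0,od}_{>1}\big\}, \tag{3.12}$$
where $S: \mathcal{L}^0_1 \to \mathcal{L}^{0,od}_{>1}$ and $T: \mathcal{L}^{0,od}_{>1} \to \mathcal{L}^0_1$ are bounded l.o.'s. The $\bar L$-invariance of $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$ is equivalent to the following relations upon $S$ and $T$:
$$\bar L^{od}_{1,0} + \bar L^{od}_{1,1}M^{-1}S = M^{-1}S\big(\bar L^{od}_{0,0} + \bar L^{od}_{0,1}M^{-1}S\big), \qquad T\big(\bar L^{od}_{1,0}T + \bar L^{od}_{1,1}\big) = \bar L^{od}_{0,0}T + \bar L^{od}_{0,1}. \tag{3.13}$$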
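Assuming that $\bar L^{od}_{1,1}$ is invertible in $\mathcal{L}^{0,od}_{>1}$, one can re-write relations (3.13) as
$$S = M(\bar L^{od}_{1,1})^{-1}M^{-1}S\bar L^{od}_{0,0} - M(\bar L^{od}_{1,1})^{-1}\bar L^{od}_{1,0} + M(\bar L^{od}_{1,1})^{-1}M^{-1}S\bar L^{od}_{0,1}M^{-1}S, \tag{3.14a}$$
$$T = \bar L^{od}_{0,1}(\bar L^{od}_{1,1})^{-1} + \bar L^{od}_{0,0}T(\bar L^{od}_{1,1})^{-1} - T\bar L^{od}_{1,0}T(\bar L^{od}_{1,1})^{-1}. \tag{3.14b}$$

Lemma 3.5.
1. $(\bar L^{od}_{1,1})^{-1}$ exists and is bounded in $\mathcal{L}^{0,od}_{>1}$.
2. There exist unique $\mathcal{L}$-bounded l.o.'s $S: \mathcal{L}^0_1 \to \mathcal{L}^{0,od}_{>1}$ and $T: \mathcal{L}^{0,od}_{>1} \to \mathcal{L}^0_1$ satisfying (3.14a,b), and their norms obey $|||S|||_{\mathcal{L}}, |||T|||_{\mathcal{L}} \le c_0\varepsilon^{1/2}$.
3. The matrix elements (m.e.'s) $S_{x,\mathbf{n}}$ of $S$ in the representation $S\Phi_{e_x} = \sum_{\mathbf{n}\in\mathcal{N}^{od}} S_{x,\mathbf{n}}\Phi_{\mathbf{n}}$ have the form $S_{x,\mathbf{n}} = (\varepsilon^{1/2})^{\ell_{\{x\}\cup\sigma(\mathbf{n})}} s_{x,\mathbf{n}}$, where $\sum_{\mathbf{n}\in\mathcal{N}^{od}} |s_{x,\mathbf{n}}| \le c_1\varepsilon^{1/2}$.

Here, and below, given a finite set $B \subset \mathbb{Z}^d$, $\ell_B$ stands for the minimal length of a finite subgraph $\gamma$ of $\mathbb{Z}^d$ (taken with standard links) with $B \subseteq [\gamma]$, where $[\gamma]$ is the set of the vertices of $\gamma$.

Remarks. 1. It is possible to show that norms $|||S|||_{\mathcal{L}}$ and $|||T|||_{\mathcal{L}}$ are actually of order $\varepsilon$. 2. The m.e.'s of $T$ also admit a representation similar to that for $S$. However, we do not need such a result in this paper.

Proof of Lemma 3.5.1. First, we establish the existence of $(\bar L^{od}_{1,1})^{-1}$. Observe that the $\mathcal{L}$-norm $|||B|||_{\mathcal{L}}$ of a l.o. $B$ defined by $B\Phi_{\mathbf{n}} = \sum_{\mathbf{m}} B_{\mathbf{n},\mathbf{m}}\Phi_{\mathbf{m}}$ equals $\sup_{\mathbf{n}}\sum_{\mathbf{m}} |B_{\mathbf{n},\mathbf{m}}|$. Write $\bar L^{od}_{1,1} = M + W$, where $W$ is the l.o. with the m.e.'s $W_{\mathbf{n},\mathbf{m}}$, $\mathbf{n}, \mathbf{m} \in \mathcal{N}^{od}$, $|\mathbf{m}|, |\mathbf{n}| > 1$ (cf. (3.3)). In other words, we seek $(\bar L^{od}_{1,1})^{-1}$ in the form $(\bar L^{od}_{1,1})^{-1} = M^{-1}(E + WM^{-1})^{-1}$, where $E$ is the unit operator in $\mathcal{L}^{0,od}_{>1}$. The m.e.'s of $WM^{-1}$ have the form $W_{\mathbf{n},\mathbf{m}}K_{\mathbf{n}}^{-1}$. By using (3.5)–(3.8), we obtain that
$$|||WM^{-1}|||_{\mathcal{L}} \le c_2\varepsilon\sup_{\mathbf{n}\in\mathcal{N}^{od}} K_{\mathbf{n}}^{-1}\Big[\sum_x (\kappa_{\mathbf{n}(x)} + c_1)^{\frac{1}{2} + \frac{1}{4s-2}}\ln(\kappa_{\mathbf{n}(x)} + c_1) + \sum_{x,y:\ |x-y|=1} (\kappa_{\mathbf{n}(x)} + c_1)^{\frac{1}{2}}(\kappa_{\mathbf{n}(y)} + c_1)^{\frac{1}{4s-2}}\ln(\kappa_{\mathbf{n}(x)} + c_1)\ln(\kappa_{\mathbf{n}(y)} + c_1)\Big].$$
With the help of Young's inequality (with exponents $(4s+2)/(4s)$ and $2s+1$) we find that
$$(\kappa_{\mathbf{n}(x)} + c_1)^{\frac{1}{2}}\ln(\kappa_{\mathbf{n}(x)} + c_1)\,(\kappa_{\mathbf{n}(y)} + c_1)^{\frac{1}{4s-2}}\ln(\kappa_{\mathbf{n}(y)} + c_1)$$
$$\le \frac{4s}{4s+2}(\kappa_{\mathbf{n}(x)} + c_1)^{\frac{1}{2} + \frac{1}{4s}}\big[\ln(\kappa_{\mathbf{n}(x)} + c_1)\big]^{\frac{4s+2}{4s}} + \frac{1}{2s+1}(\kappa_{\mathbf{n}(y)} + c_1)^{\frac{1}{2} + \frac{1}{2s-1}}\big[\ln(\kappa_{\mathbf{n}(y)} + c_1)\big]^{2s+1} \le c_3\big(\kappa_{\mathbf{n}(x)} + \kappa_{\mathbf{n}(y)} + 2c_1\big),$$
provided that $s \ge 2$. Hence, $|||WM^{-1}|||_{\mathcal{L}} < c_4\varepsilon < 1$. This guarantees that $(\bar L^{od}_{1,1})^{-1}$ exists and is bounded, and $|||(\bar L^{od}_{1,1})^{-1}|||_{\mathcal{L}} \le 1/(\bar\kappa_2 - c_5\varepsilon)$. Thus, Eqs. (3.13) are indeed equivalent to (3.14a,b). □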
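The two ingredients of this proof, the row-sum formula for the $\mathcal{L}$-norm and the Neumann-series inversion of $E + WM^{-1}$, can be illustrated on a small finite matrix. The sketch below is only a toy model under stated assumptions: the $4\times 4$ matrix `example_matrix`, the values in `K` and the size `eps` are hypothetical, whereas in the paper the index set is infinite and the bound on $|||WM^{-1}|||$ comes from Lemma 3.3.

```python
import math


def l_norm(B):
    """|||B||| = sup_n sum_m |B[n][m]| for an operator given by
    B Phi_n = sum_m B[n][m] Phi_m (maximal absolute row sum)."""
    return max(sum(abs(v) for v in row) for row in B)


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def neumann_inverse(A, terms=200):
    """Partial sum of sum_l (-A)^l, which converges to (E + A)^(-1) when |||A||| < 1."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    minus_A = [[-v for v in row] for row in A]
    for _ in range(terms):
        power = matmul(power, minus_A)
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
    return total


def example_matrix(eps=0.3):
    """Hypothetical stand-in for W M^(-1): entries W[n][m] / K[n] with |W[n][m]| <= eps."""
    K = [3.0, 4.0, 5.0, 6.0]
    return [[eps * math.cos(n + 2 * m) / K[n] for m in range(4)] for n in range(4)]


if __name__ == "__main__":
    A = example_matrix()
    N = neumann_inverse(A)
    print(l_norm(A), l_norm(N), 1.0 / (1.0 - l_norm(A)))
```

Because the maximal row sum is submultiplicative, the partial sums satisfy $|||(E + A)^{-1}||| \le 1/(1 - |||A|||)$, which is the finite-dimensional analogue of the bound used above.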
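Proof of Lemma 3.5.2. In what follows, the sum $\sum_{\mathbf{n}}$ stands for $\sum_{\mathbf{n}\in\mathcal{N}^{od},\ |\mathbf{n}|>1}$; the same is true for $\sum_{\mathbf{m}}$. Consider the operator space $\mathcal{A} = \mathcal{A}_{\mathcal{L}^0_1,\mathcal{L}^{0,od}_{>1}}$ consisting of the bounded l.o.'s $A: \mathcal{L}^0_1 \to \mathcal{L}^{0,od}_{>1}$ such that the m.e.'s $A_{x,\mathbf{n}}$ satisfy the bound $\sup_x \sum_{\mathbf{n}} |A_{x,\mathbf{n}}|(\varepsilon^{1/2})^{-\ell_{\{x\}\cup\sigma(\mathbf{n})}} < \infty$. In other words, $A_{x,\mathbf{n}}$ are represented in the form $A_{x,\mathbf{n}} = (\varepsilon^{1/2})^{\ell_{\{x\}\cup\sigma(\mathbf{n})}}\tilde A_{x,\mathbf{n}}$, with $\sup_x\sum_{\mathbf{n}} |\tilde A_{x,\mathbf{n}}| < \infty$. The norm $|||A|||_{\mathcal{A}}$ of $A \in \mathcal{A}$ is defined as $\sup_{x\in\mathbb{Z}^d}\sum_{\mathbf{n}} |A_{x,\mathbf{n}}|(\varepsilon^{1/2})^{-\ell_{\{x\}\cup\sigma(\mathbf{n})}} = \sup_{x\in\mathbb{Z}^d}\sum_{\mathbf{n}} |\tilde A_{x,\mathbf{n}}|$.

We treat the right-hand side (r.h.s.) of (3.14a) as a "quadratic" map $\Lambda: A \mapsto \Lambda A$, where $\Lambda A$ equals
$$-M(\bar L^{od}_{1,1})^{-1}\bar L^{od}_{1,0} + M(\bar L^{od}_{1,1})^{-1}M^{-1}A\bar L^{od}_{0,0} + M(\bar L^{od}_{1,1})^{-1}M^{-1}A\bar L^{od}_{0,1}M^{-1}A. \tag{3.15}$$
We are going to check that $\Lambda$ maps $\mathcal{A} \to \mathcal{A}$ and is bounded in norm $|||\ |||_{\mathcal{A}}$ (for simplicity, we will omit the subscript $\mathcal{A}$ in this notation). Furthermore, we will show that $\Lambda$ is a contraction on a suitably chosen subset of $\mathcal{A}$. This will imply the existence (and uniqueness) of a fixed point $S$. To this end, we will assess each of three summands in the r.h.s. of (3.15).

We begin with the analysis of the second summand which is linear in $A$. To start with, observe that the m.e.'s of $M^{-1}A\bar L^{od}_{0,0}$ have the form
$$(M^{-1}A\bar L^{od}_{0,0})_{x,\mathbf{m}} = K_{\mathbf{m}}^{-1}\Big[(\kappa_1 + 2\varepsilon r_{1,1})A_{x,\mathbf{m}} - \varepsilon p_{1,0}b_{0,1}\sum_{y:\ |y-x|=1} A_{y,\mathbf{m}}\Big].$$
This leads to the bound
$$|||M^{-1}A\bar L^{od}_{0,0}||| \le \bar\kappa_2^{-1}(\kappa_1 + c_6\varepsilon^{1/2})|||A|||. \tag{3.16}$$
Furthermore, the m.e.'s of $WM^{-1}A$ are of the form $(WM^{-1}A)_{x,\mathbf{m}} = \sum_{\mathbf{n}} K_{\mathbf{n}}^{-1}A_{x,\mathbf{n}}W_{\mathbf{n},\mathbf{m}}$. For the non-zero summands in the last sum the set-theoretical difference $\sigma(\mathbf{m}) \setminus \sigma(\mathbf{n})$ is either empty or contains a single point $y \in \mathbb{Z}^d$ neighbouring a point of $\sigma(\mathbf{n})$. Therefore, $\ell_{\{x\}\cup\sigma(\mathbf{n})} + 1 \ge \ell_{\{x\}\cup\sigma(\mathbf{m})}$, and hence
$$|(WM^{-1}A)_{x,\mathbf{m}}| \le (\varepsilon^{1/2})^{\ell_{\{x\}\cup\sigma(\mathbf{m})}}\hat A_{x,\mathbf{m}}, \quad\text{where}\quad \hat A_{x,\mathbf{m}} = \varepsilon^{-1/2}\sum_{\mathbf{n}} K_{\mathbf{n}}^{-1}|\tilde A_{x,\mathbf{n}}W_{\mathbf{n},\mathbf{m}}|.$$
By the same argument as in the proof of Lemma 3.5.1, we conclude that $\sum_{\mathbf{m}}\hat A_{x,\mathbf{m}} \le c_7\varepsilon^{1/2}\sum_{\mathbf{n}} |\tilde A_{x,\mathbf{n}}| \le c_7\varepsilon^{1/2}|||A|||$. This yields that $|||WM^{-1}A||| \le c_7\varepsilon^{1/2}|||A|||$.

Now, expanding $M(\bar L^{od}_{1,1})^{-1} = (E + WM^{-1})^{-1}$ into a power series $\sum_l (-WM^{-1})^l$, we find that
$$|||M(\bar L^{od}_{1,1})^{-1}A||| < (1 - c_8\varepsilon^{1/2})^{-1}|||A|||. \tag{3.17}$$
Bounds (3.16) and (3.17) together give the following bound for the norm of the second summand in (3.15):
$$|||M(\bar L^{od}_{1,1})^{-1}M^{-1}A\bar L^{od}_{0,0}||| \le (\kappa_1 + c_6\varepsilon^{1/2})\big((1 - c_8\varepsilon^{1/2})\bar\kappa_2\big)^{-1}|||A|||, \tag{3.18}$$
which is $\le \eta|||A|||$, where $0 < \eta < 1$ for $\varepsilon$ small enough.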
To assess the third, "quadratic", term in the r.h.s. of (3.15), note that the operator $\bar L^{od}_{0,1}M^{-1}$ acts as follows: $\Phi_{ne_x}$, for $n > 1$ odd, is taken to
$$\kappa_n^{-1}\varepsilon\Big[2r_{n,1}\Phi_{e_x} - p_{n,0}b_{0,1}\sum_{y:\ |x-y|=1}\Phi_{e_y}\Big], \tag{3.19a}$$
and $\Phi_{n_1e_{x_1} + n_2e_{x_2}}$, for $n_1 \ge 2$ even and $n_2 \ge 1$ odd, to
$$(\kappa_{n_1} + \kappa_{n_2})^{-1}\varepsilon\Big[2r_{n_1,0}\Phi_{e_{x_2}}\mathbf{1}(n_2 = 1) - \mathbf{1}(|x_1 - x_2| = 1)\big(p_{n_1,1}b_{n_2,0}\Phi_{e_{x_1}} + p_{n_2,0}b_{n_1,1}\Phi_{e_{x_2}}\big)\Big]; \tag{3.19b}$$
the rest of the vectors $\Phi_{\mathbf{n}}$, $\mathbf{n} \in \mathcal{N}^{od}$, are taken to zero.

As follows from (3.19a,b), the m.e.'s $(\bar L^{od}_{0,1}M^{-1}A)_{x_1,x_2} = \sum_{\mathbf{n}}(\bar L^{od}_{0,1}M^{-1})_{x_1,\mathbf{n}}A_{\mathbf{n},x_2}$ admit the bound
$$|(\bar L^{od}_{0,1}M^{-1}A)_{x_1,x_2}| \le c_9(\varepsilon^{1/2})^{|x_1 - x_2|}|||A|||.$$
Applying an argument similar to that used for deriving bound (3.18), with the use of an obvious inequality $\ell_{x_1\cup\sigma(\mathbf{n})} + |x_1 - x_2| \ge \ell_{x_2\cup\sigma(\mathbf{n})}$, we obtain that
$$|||M(\bar L^{od}_{1,1})^{-1}M^{-1}A\bar L^{od}_{0,1}M^{-1}A||| < c_{10}|||A|||^2. \tag{3.20}$$
It remains to assess the first summand in the r.h.s. of (3.15). We have that
$$\bar L^{od}_{1,0}\Phi_{e_x} = \varepsilon\Big[2\sum_{m\in\mathbb{Z}_+,\ m>1,\ m\ \mathrm{odd}} r_{1,m}\Phi_{me_x} - \sum_{y:\ |y-x|=1}\ \sum_{m_1,m_2\in\mathbb{Z}_+,\ m_1+m_2>1} p_{1,m_1}b_{0,m_2}\Phi_{m_1e_x + m_2e_y}\Big],$$
which implies that $\bar L^{od}_{1,0} \in \mathcal{A}$ and $|||\bar L^{od}_{1,0}||| \le c_{11}\varepsilon^{1/2}$. An argument similar to the above again leads to the bound
$$|||M(\bar L^{od}_{1,1})^{-1}\bar L^{od}_{1,0}||| \le c_{11}\varepsilon^{1/2}. \tag{3.21}$$
From bounds (3.18), (3.20) and (3.21) we obtain that for $\varepsilon$ small enough, $\exists\eta \in (0, 1)$ such that for any $A, A_1, A_2 \in \mathcal{A}$
$$|||\Lambda A||| \le \eta|||A||| + c_{10}|||A|||^2 + c_{11}\varepsilon^{1/2}, \tag{3.22a}$$
$$|||\Lambda(A_1 - A_2)||| \le \eta|||A_1 - A_2||| + 2c_{12}\max\big[|||A_1|||, |||A_2|||, |||A_1 - A_2|||\big]. \tag{3.22b}$$
In turn, (3.22a) means that $\Lambda$ is a bounded map $\mathcal{A} \to \mathcal{A}$. It also guarantees that there exists a constant $R^{(1)} > 0$ such that the ball $B_{R^{(1)}} = \{A \in \mathcal{A}: |||A||| < R^{(1)}\varepsilon^{1/2}\}$ is taken by map $\Lambda$ into itself. Similarly, from (3.22b) we see that for $\varepsilon$ small enough this map is a contraction on $B_{R^{(1)}}$. Thus, the required properties of map $\Lambda$ are established. Therefore, for $\varepsilon$ small enough there exists a unique $S$ satisfying (3.14a). The existence and uniqueness of l.o. $T$ obeying (3.14b) is established in a similar way. This completes the proof of Lemma 3.5.2. □

Proof of Lemma 3.5.3. The bounds for m.e.'s $S_{x,\mathbf{n}}$ follow directly from the above analysis of map $\Lambda$. The proof of Lemma 3.5 is now complete. □
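The fixed-point construction of $S$ has a simple finite-dimensional analogue, sketched below for illustration only. For a block matrix with blocks $L_{00}, L_{01}, L_{10}, L_{11} = M + W$ ($M$ diagonal), the graph $\{(u, Xu)\}$ is invariant exactly when $L_{10} + L_{11}X = X(L_{00} + L_{01}X)$, which is the first relation in (3.13) with $X$ in place of $M^{-1}S$. The iteration used here, $X \leftarrow M^{-1}\big(XL_{00} + XL_{01}X - L_{10} - WX\big)$, is a variant of the map $\Lambda$ (it moves $WX$ to the right-hand side instead of inverting $L_{11}$ by a Neumann series). The $2+3$ block sizes, the diagonal values and the perturbation of size `eps` are hypothetical choices made for the example.

```python
import math


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def build_blocks(eps=0.02):
    """Hypothetical test matrix: diag(1, 1, 2, 2.5, 3) plus an eps-small perturbation,
    split into a 2-dimensional 'lower' block and a 3-dimensional 'upper' block."""
    diag = [1.0, 1.0, 2.0, 2.5, 3.0]
    L = [[(diag[i] if i == j else 0.0) + eps * math.cos(i + 2 * j) for j in range(5)]
         for i in range(5)]
    L00 = [row[:2] for row in L[:2]]
    L01 = [row[2:] for row in L[:2]]
    L10 = [row[:2] for row in L[2:]]
    W = [[L[2 + i][2 + j] - (diag[2 + i] if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    M = diag[2:]  # diagonal part of L11
    return L00, L01, L10, W, M


def solve_graph(L00, L01, L10, W, M, n_iter=300):
    """Iterate X <- M^(-1) (X L00 + X L01 X - L10 - W X), starting from X = 0."""
    q, p = len(M), len(L00)
    X = [[0.0] * p for _ in range(q)]
    for _ in range(n_iter):
        XL00 = matmul(X, L00)
        XL01X = matmul(matmul(X, L01), X)
        WX = matmul(W, X)
        X = [[(XL00[i][j] + XL01X[i][j] - L10[i][j] - WX[i][j]) / M[i]
              for j in range(p)] for i in range(q)]
    return X


def residual(L00, L01, L10, W, M, X):
    """Largest entry of L10 + L11 X - X (L00 + L01 X), with L11 = diag(M) + W."""
    q, p = len(M), len(L00)
    L01X = matmul(L01, X)
    R = [[L00[i][j] + L01X[i][j] for j in range(p)] for i in range(p)]
    XR = matmul(X, R)
    WX = matmul(W, X)
    return max(abs(L10[i][j] + M[i] * X[i][j] + WX[i][j] - XR[i][j])
               for i in range(q) for j in range(p))


if __name__ == "__main__":
    blocks = build_blocks()
    X = solve_graph(*blocks)
    print(residual(*blocks, X))
```

The iteration contracts because the diagonal entries of $M$ exceed the norm of $L_{00}$ by a definite margin, mirroring the role of the ratio $\kappa_1/\bar\kappa_2 < 1$ in (3.18).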
We now continue with the proof of Lemma 3.4. We have constructed a pair of $\bar L^{od}$-invariant $\mathcal{L}$-closed subspaces $\mathcal{L}_1, \mathcal{L}^{od}_{>1} \subset \mathcal{L}$. As $\bar L^{od}$ commutes with $\{U_y\}$, subspaces $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$ are $\{U_y\}$-invariant. Furthermore, the intersection of $\mathcal{L}_1$ and $\mathcal{L}^{od}_{>1}$ is zero, and their sum coincides with $\mathcal{L}^{od}$. The proof of the last assertion is identical to that of Lemma 3.4 from [10], and we refer the reader to this paper.

We want to outline the construction of the decomposition of the even space $\mathcal{L}^{ev}$. As before (cf. (3.11), (3.12)), we start with the decomposition $\mathcal{L}^{ev} = \mathcal{L}_0 + \mathcal{L}^{0,ev}_{\ge 2}$, and the corresponding representation of $\bar L^{ev}$ as a matrix
$$\bar L^{ev} \simeq \begin{pmatrix} 0 & \bar L^{ev}_{0,1}\\ 0 & \bar L^{ev}_{1,1}\end{pmatrix},$$
where $\bar L^{ev}_{0,1}: \mathcal{L}^{0,ev}_{\ge 2} \to \mathcal{L}_0$ and $\bar L^{ev}_{1,1}: \mathcal{L}^{0,ev}_{\ge 2} \to \mathcal{L}^{0,ev}_{\ge 2}$. The one-dimensional subspace $\mathcal{L}_0$ is identified with the complex line $\mathbb{C}$; such an identification is repeatedly used below without comment. Observe that operator $\bar L^{ev}_{0,1}$ is given by
$$\bar L^{ev}_{0,1}\Phi_{ne_x} = 2\varepsilon r_{n,0}, \quad n \in \mathbb{Z}^1_+,\ n \text{ even},$$
$$\bar L^{ev}_{0,1}\Phi_{n_1e_{x_1} + n_2e_{x_2}} = -\varepsilon\big(p_{n_1,0}b_{n_2,0} + p_{n_2,0}b_{n_1,0}\big)\mathbf{1}(|x_1 - x_2| = 1), \quad n_1, n_2 \in \mathbb{Z}^1_+,\ n_1 \text{ and } n_2 \text{ odd},$$
$$\bar L^{ev}_{0,1}\Phi_{\mathbf{n}} = 0 \quad\text{for all other } \mathbf{n} \in \mathcal{N}^{ev}. \tag{3.23}$$
As before, we seek the subspace $\mathcal{L}^{ev}_{\ge 2}$ in (3.9) in the form $\mathcal{L}^{ev}_{\ge 2} = \{v: v = u + F(u),\ u \in \mathcal{L}^{0,ev}_{\ge 2}\}$, where $F: \mathcal{L}^{0,ev}_{\ge 2} \to \mathbb{C}$ is a bounded linear functional. The condition of $\bar L^{ev}$-invariance of $\mathcal{L}^{ev}_{\ge 2}$ leads to the equation $F(\bar L^{ev}_{1,1}u) = \bar L^{ev}_{0,1}u$, for any vector $u \in \mathcal{L}^{0,ev}_{\ge 2}$ from the domain of l.o.'s $\bar L^{ev}_{1,1}$ and $\bar L^{ev}_{0,1}$. This yields the formula $F(v) = \bar L^{ev}_{0,1}(\bar L^{ev}_{1,1})^{-1}v$, $v \in \mathcal{L}^{0,ev}_{\ge 2}$. As before, one can check that $\bar L^{ev}_{1,1}$ is invertible in $\mathcal{L}^{0,ev}_{\ge 2}$ and $(\bar L^{ev}_{1,1})^{-1}$ is $\mathcal{L}$-bounded. Thus, the linear functional $F$ is indeed $\mathcal{L}$-bounded, and $|||F|||_{\mathcal{L}} \le c_{13}\varepsilon$. This completes the construction of the $\bar L^{ev}$-invariant space $\mathcal{L}^{ev}_{\ge 2}$. It is also invariant under $\{U_y\}$ and provides the decomposition $\mathcal{L}^{ev} = \mathcal{L}_0 + \mathcal{L}^{ev}_{\ge 2}$.

The inequality $|||(\bar L^{ev}_{\ge 2})^{-1}|||_{\mathcal{L}} \le 1/(\kappa_1 - c_0\varepsilon) \le 1/(\bar\kappa_2 - c_0\varepsilon)$ in assertion 2 of Lemma 3.4 may be deduced from the established facts, similar to the analogous inequality for $|||(\bar L^{od}_{>1})^{-1}|||_{\mathcal{L}}$.

It remains to check assertion 4 of Lemma 3.4: the closures (3.10) are invariant under $\bar L$ and $\{U_y\}$ (considered as l.o.'s in $\bar{\mathcal{H}}$) and form decomposition (2.10). To this end, it is convenient to pass from the unbounded self-adjoint l.o. $\bar L$ to a bounded (in $\bar{\mathcal{H}}$) l.o. $(\bar L + aE)^{-1}$. Here, the constant $a > 0$ is chosen so that the l.o. $(\bar L_1 + aE)^{-1}$ acting in space $\mathcal{L}_1 \subset \mathcal{L}^{od}$ is $\mathcal{L}_1$-bounded. (Observe that the action of $(\bar L_1 + aE_{\mathcal{L}_1})^{-1}$ on $\mathcal{L}_1$ coincides with that of $(\bar L + aE)^{-1}$.) By virtue of the $\bar{\mathcal{H}}$-boundedness of $(\bar L + aE)^{-1}$, $\bar{\mathcal{H}}_1$ is invariant with respect to $(\bar L + aE)^{-1}$. Thus, it is invariant with respect to $\bar L$. In a similar way one can check that $\bar{\mathcal{H}}^{od}_{>1}$ and $\bar{\mathcal{H}}^{ev}_{\ge 2}$ are $\bar L$-invariant. The invariance of $\bar{\mathcal{H}}_1$, $\bar{\mathcal{H}}^{od}_{>1}$ and $\bar{\mathcal{H}}^{ev}_{\ge 2}$ under $U_y$ and decomposition (2.10) follow from the construction. This completes the proof of Lemma 3.4. □

Thus, we have constructed decomposition (2.10). Its spectral properties are checked in Sect. 4.
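4. Spectral Properties of $\bar L_1$

Lemma 4.1. Let $B$ be a self-adjoint l.o. in H.s. $\bar{\mathcal{H}}$, $B\mathcal{L} \subseteq \mathcal{L}$, and assume that $B\restriction\mathcal{L}$ is $\mathcal{L}$-bounded. Then $B$ is $\bar{\mathcal{H}}$-bounded, and $\|B\|_{\bar{\mathcal{H}}} \le |||B|||_{\mathcal{L}}$.

The proof of Lemma 4.1 repeats that of Lemma 3.1 from [16] and is omitted. From Lemma 4.1 and bounds of the preceding section we deduce that
$$\|\bar L_1\|_{\bar{\mathcal{H}}_1} \le \kappa_1 + c_0\varepsilon, \qquad \|(\bar L^{od}_{>1})^{-1}\|_{\bar{\mathcal{H}}^{od}_{>1}} \le (\bar\kappa_3 - c_0\varepsilon)^{-1}, \qquad \|(\bar L^{ev}_{\ge 2})^{-1}\|_{\bar{\mathcal{H}}^{ev}_{\ge 2}} \le (\bar\kappa_2 - c_0\varepsilon)^{-1}. \tag{4.1}$$
The two last bounds in (4.1) imply that the spectra of $\bar L^{od}_{>1}$ and $\bar L^{ev}_{\ge 2}$ lie to the right of points $\bar\kappa_3 - c_0\varepsilon$ and $\bar\kappa_2 - c_0\varepsilon$, respectively. This gives the proof of assertion 1(i) of Theorem 1. □

The first bound in (4.1) gives that the spectrum of $\bar L_1$ lies to the left of point $\kappa_1 + c_0\varepsilon$. To establish the lower bound for the spectrum of $\bar L_1$, consider a family of elements $\Theta_x$ of $\mathcal{L}_1$ of the form
$$\Theta_x = \Phi_{e_x} + M^{-1}S\Phi_{e_x}, \quad x \in \mathbb{Z}^d. \tag{4.2}$$
Obviously, $\forall x, y \in \mathbb{Z}^d$, $U_y\Theta_x = \Theta_{x+y}$, and $|||\Theta_x|||_{\mathcal{L}} = 1 + \zeta$, where $\zeta$ does not depend on $x$ and $|\zeta| \le c_1\varepsilon$. Furthermore, for any $v \in \mathcal{L}_1$ we have: $v = \sum_{x\in\mathbb{Z}^d} g_x(\Phi_{e_x} + M^{-1}S\Phi_{e_x}) = \sum_x g_x\Theta_x$, which yields that $\{\Theta_x\}$ is a basis in $\mathcal{L}_1$, and the coefficients $g_x$ obey $\sum_x |g_x| \le |||v|||_{\mathcal{L}} \le (1 + \zeta)\sum_x |g_x|$.

Let $L_{x,y}$ denote the m.e.'s of $\bar L_1$ in basis $\{\Theta_x\}$: $\bar L_1\Theta_x = \sum_y L_{x,y}\Theta_y$. As $\bar L_1$ commutes with $\{U_y\}$, $L_{x,y}$ depends only on $x - y$:
$$L_{x,y} = (\bar L^{od}_{0,0})_{x,y} + (\bar L^{od}_{0,1}M^{-1}S)_{x,y} := m(x - y), \quad x, y \in \mathbb{Z}^d. \tag{4.3}$$
From (3.3) we find that
$$(\bar L^{od}_{0,0})_{x,y} = \begin{cases} \kappa_1 + 2\varepsilon r_{1,1}, & \text{if } x = y,\\ -2\varepsilon p_{1,0}b_{0,1}, & \text{if } |x - y| = 1,\\ 0, & \text{otherwise},\end{cases} \tag{4.4}$$
and as in the proof of Lemma 3.5.2,
$$|(\bar L^{od}_{0,1}M^{-1}S)_{x,y}| \le c_2(\varepsilon^{1/2})^{|x-y|}. \tag{4.5}$$
Now consider a commutative Banach algebra $\mathcal{B}$ formed by the functions $f: \mathbb{Z}^d \to \mathbb{C}$, with the $l_1(\mathbb{Z}^d)$-norm $\|f\|_1 = \sum_x |f(x)|$, where the multiplication is given by the convolution: $(f_1 * f_2)(x) = \sum_{x'} f_1(x')f_2(x - x')$. Obviously, $m \in \mathcal{B}$ and the unit of $\mathcal{B}$ is the function $e_0(x) = \mathbf{1}(x = 0)$.

Lemma 4.2. Element $m$ is invertible in $\mathcal{B}$. Furthermore, $m$ and its inverse $m^{*-1}$ have the form
$$m = \kappa_1e_0 + n, \qquad m^{*-1} = \kappa_1^{-1}e_0 + p, \quad\text{where}\quad |n(x)|, |p(x)| \le c_3\varepsilon^{1/2}(\varepsilon^{1/2})^{|x|}. \tag{4.6}$$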
Proof of Lemma 4.2. The bound on n is obvious P from (4.4). Consider the Fourier transb (θ) = κ1 b p is analytic in form of m: m p(θ), where b p(θ) = x n(x)eihθ,xi n(x). Then b the complex domain {θ ∈ Cd : |=(θ)| < (1/2)| ln |} (here, | | stands for the norm both p in the domain {θ ∈ C: in C and Cd ). Furthermore, ∀ζ ∈ (0, | ln |/2), the function b m(θ ) = |=(θ )| ≤ (1/2)| ln | − ζ } admits the bound |b p(θ )| ≤ c4 1/2 . Therefore, 1/b p(θ )) = κ1−1 − b p(θ)κ1−1 (κ1 + b p(θ))−1 . Taking the inverse Fourier transform 1/(κ1 + b R yields m−∗1 (x) = κ1−1 e0 (x) + p(x), where p(x) = κ1−1 Td eihθ,xi (κ1 + b p(θ ))dθ . Owing to analyticity of and the above bound for b p, by choosing an appropriate integration contour in the last integral, we obtain the bound (4.6) for p. The proof of Lemma 4.2 is now complete. u t Lemma 4.2 implies that L¯ 1 is invertible in L1 , and L¯ 1−1 acts on {2x } as L¯ 1−1 2x = P P −1 κ1 2x + y p(x − y)2y . It is easy to see that for any vector v = x gx 2x the norm P P |||L¯ 1−1 v|||L is ≤ x κ1−1 |gx | + y |p(x −y)||gy | |||2x |||L ≤ (1+c5 ) κ1−1 +c5 1/2
|||v|||L , whence
|||L¯ 1−1 |||L ≤ κ1−1 + c5 1/2 .
(4.7)
By virtue of Lemma 3.6 we obtain that the spectrum of L¯ 1 in H¯ 1 lies to the right of t the point κ1 − c6 1/2 which yields assertion 1(ii) of Theorem 1.1. u ex } which To prove assertion 2, we pass from {2x } to another orthonormal basis {2 we construct below. In what follows, we use the symbol h , iH¯ (and alternatively h , iν¯ ) ¯ Furthermore, hgi ¯ (and alternatively hgiν¯ ) stands for the for the scalar product in H. H R integral g(Q)d¯ν (Q) = hg, 1iH¯ . Finally, we set CoH¯ (g1 , g2 ) := hg1 , g2 iH¯ − hg1 iH¯ hg2 iH¯ and call CoH¯ (g1 , g2 ) (alternatively denoted as Coν¯ (g1 , g2 )) the correlator of g1 and g2 . Consider the Gram matrix for {2x }, with the m.e.’s Gx,y = Co(2x , 2y ) = h2x , 2y iH¯ (we use here the fact that h2x iH¯ = 0, x ∈ Zd ). Clearly, Gx,y is a function of x − y only. Furthermore, X X Sx,n 8n , 8ey + Sy,n 8m = CoH¯ (8ex , 8ey ) Gx,y = CoH¯ 8ex + +
X n
n
Sx,n CoH¯ (8n , 8ey ) +
X m
m∈N
Sy,m CoH¯ (8ex , 8m ) +
X n,m
(4.8)
Sy,n Sx,m CoH¯ (8n , 8m ).
Lemma 4.3. For any n, m ∈ N the following bound holds: |σ (n)|+|σ (m)|
|CoH¯ (8n , 8m )| ≤ c6
(c6 )ρ(n,m) .
(4.9)
of σ (n), and ρ(n, m) stands for the Here, and below |σ (n)| denotes the cardinality distance min |x − y|: x ∈ σ (n), y ∈ σ (m) . The proof of Lemma 4.3 is carried out in Sect. 5. Formula (4.8) and bound (4.9), together with the bounds of Lemma 3.5.3 and the inequality `{x}∪σ (n) + `{y}∪σ (m) + ρ(n, m) ≥ |x − y|, imply that function f:Zd → R defined by Gx,y =: f(x − y) belongs to algebra B and admits the representation f = re0 + h, where (a) r = h82ex iH¯ > 0 and |x| does not depend on x ∈ Zd , and (b) h ∈ l1 (Zd ) satisfies the bound |h(x)| ≤ c7 c7 1/4 .
Repeating the argument given in the proof of Lemma 4.2, we conclude that there exist in B the square ∗-root f∗1/2 and its inverse f−∗1/2 , and they admit the representations |x| f∗1/2 = r1/2 e0 + h1 , f−∗1/2 = r−1/2 e0 + h2 , where |h1 (x)|, |h2 (x)| ≤ c8 c8 1/4 . (4.10) e x } in H¯ 1 , map V: H¯ 1 → L2 (Td , dλ) and function m b (λ), λ ∈ Td , are The basis {2 now defined by X X ey )(λ) = exp ihy, λi, m ex = b (λ) = f−∗1/2 (x − z)2z , (V2 m(z) exp ihz, λi. 2 z
z
(4.11) ex = 2 ex by Uy 2 ex+y , and function m e is analytic in a Cd Group {Uy } acts on 2 d neighbourhood of torus T . Finally, it is not hard to check that X b (λ) = κ1 + 2r1,1 − 2p1,0 b0,1 cos λ(j ) + O( 2 ), m 1≤j ≤d (4.12) λ = (λ(1) , ..., λ(d) ) ∈ Td . e is not constant. This completes the proof of Theorem 1. u Thus, function m t 5. Cluster Expansions In this section we prove Lemmas 3.1 and 4.3. The proof is based on cluster expansions for measure ν¯ (see (2.2)–(2.5)) which are discussed below. 5.1. Expansion of the partition function. We begin with an expansion of the partition function ZV related to generator L¯ in a finite set V ⊂ Zd : Z X qx − qy )2 dµ(Q). (5.1a) ZV = exp − x,y∈V: |x−y|=1
We use a standard representation of the product $\prod_{\{x,y\}\in V}\big(e^{-(q_x-q_y)^2} - 1 + 1\big)$ as the sum $\sum^{(V)}_{\{\Gamma\}} \prod_{\Gamma} p_\Gamma(Q)$ to write
\[ Z_V = \sum^{(V)}_{\{\Gamma\}} \prod_{\Gamma} P_\Gamma, \tag{5.1b} \]
where
\[ P_\Gamma = \int p_\Gamma(Q)\,d\mu(Q), \qquad p_\Gamma(Q) = \prod_{\{x,y\}\in\Gamma} \big(e^{-(q_x-q_y)^2} - 1\big). \tag{5.1c} \]
Here and below, the sum $\sum^{(V)}_{\{\Gamma\}}$ is taken over the finite unordered collections of pairwise disjoint $\mathbb{Z}^d$-connected sets $\Gamma$ of lattice edges {x, y} lying in “volume” V (we say that
an edge {x, y} lies in a set $O \subset \mathbb{Z}^d$ (and write {x, y} ⊂ O) when x, y ∈ O), and the product $\prod_\Gamma$ over the $\Gamma$'s from the given collection. Furthermore, $[\Gamma]$ denotes below the set of vertices of the edges {x, y} ⊂ $\Gamma$, $|\Gamma|$ the cardinality of set $\Gamma$ and $|[\Gamma]|$ that of $[\Gamma]$. It turns out that the following bound holds true: $|P_\Gamma| \le c_0^{|\Gamma|}$. The derivation of this bound is based on the following general fact:
Lemma 5.1 (A generalized Hölder inequality). Let $(E_t, \mathcal{E}_t, \pi_t)$, $t \in T$, be a finite family of probability spaces and $(E, \mathcal{E}, \pi) = \prod_{t\in T} (E_t, \mathcal{E}_t, \pi_t)$ their Cartesian product. Suppose that $\{f_{Y_i}, 1 \le i \le k\}$ is a collection of functions $E \to \mathbb{C}$, indexed by subsets $Y_i \subset T$ such that each function $f_{Y_i}$ is measurable relative to the sigma-subalgebra $\mathcal{E}_{Y_i} = \prod_{t\in Y_i} \mathcal{E}_t \subseteq \mathcal{E}$. Furthermore, assume that a collection of positive numbers $r_i$, $1 \le i \le k$, is given, such that $\sum_{Y_i:\, Y_i \ni t} r_i^{-1} \le 1$ for all $t \in T$. Then
\[ \Big| \int_E \prod_{i=1}^{k} f_{Y_i}\, d\pi \Big| \le \prod_{i=1}^{k} \Big( \int_E |f_{Y_i}|^{r_i}\, d\pi \Big)^{1/r_i}. \]
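As a sanity check on the statement of Lemma 5.1, the following sketch evaluates both sides of the inequality in the simplest non-trivial configuration. Everything in it is an assumption made for illustration: three factor spaces, each a finite set with the uniform probability measure, three functions indexed by the two-point subsets of T, and all exponents equal to 2, so that every t lies in exactly two subsets and the sum of the reciprocal exponents equals 1.

```python
import numpy as np

rng = np.random.default_rng(0)
size = 4  # each factor space E_t is {0, ..., size - 1} with the uniform measure

# T = {0, 1, 2}; Y_1 = {0, 1}, Y_2 = {1, 2}, Y_3 = {0, 2}; r_i = 2 for every i.
f01 = rng.random((size, size))   # depends on coordinates 0 and 1 only
f12 = rng.random((size, size))   # depends on coordinates 1 and 2 only
f02 = rng.random((size, size))   # depends on coordinates 0 and 2 only

# Left-hand side: integral of the product over the product measure.
lhs = np.einsum("ab,bc,ac->", f01, f12, f02) / size**3

# Right-hand side: product of the L^2 norms of the three factors.
rhs = np.prod([np.sqrt(np.mean(f**2)) for f in (f01, f12, f02)])
```

With any choice of the three arrays, `lhs` should not exceed `rhs`. The same pattern, with two-point subsets given by the edges of a connected set and all exponents equal to 2d, is the one used when the lemma is applied to the cluster weights.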
For the proof of Lemma 5.1 see [20], Lemma 5.2. To apply Lemma 5.1, we set $T = [\Gamma]$ and identify $Y_i$ as the two-point subset consisting of vertices $x_i$ and $y_i$ of an edge $\{x_i, y_i\} \subset \Gamma$. As each point $x \in [\Gamma]$ is incident to not more than 2d edges from $\Gamma$, we can take $r_i = 2d$. Lemma 5.1 then gives that
\[ |P_\Gamma| \le \prod_{\{x,y\}\subset\Gamma} \Big( \int \big|e^{-(q_x-q_y)^2} - 1\big|^{2d}\, d\mu(Q) \Big)^{1/(2d)}. \]
To bound a single term in the last product, we use the straightforward inequalities
\[ \big|e^{-(q_x-q_y)^2} - 1\big| \le (q_x - q_y)^2 \quad\text{and}\quad \int (q_x - q_y)^{4d}\, d\mu \le 2^{4d} \int q^{4d}\, d\mu_0(q). \]
R 1/(2d) |0| with c0 = 4 q 4d dµ0 (q) . This yields the bound |P0 | ≤ c0 We list below, without proof, some facts about the partition function ZV which may be derived from the above bound for P0 . For the proof, see [13], Chapter 3. First, the above expansion of ZV absolutely converges for small enough. Furthermore, given (V) (1) (2) −1 finite V0 ⊂ V, set ϕV0 = ZV ZV\V0 . Then for any finite V0 , V0 , V0 ⊂ Zd , with (1)
(2)
V0 ∩ V0 = ∅,
(1)
(2)
1) the following bounds hold true: for any finite V ⊇ V0 , V0 , V0 , (V)
|ϕV0 | ≤ c1 2|V0 | , |ϕ
(V )
(1) V(1) 0 ∪V0
(1)
−
Y j =1,2
(V ) (j ) | V0
ϕ
(1)
≤ c1 3|V0
(2)
(2)
|+|V0 |
(1)
(c1 )ρ(V0 (1)
(2)
,V0 )
, (5.2)
(2)
where, as before, ρ(V0 , V0 ) denotes the distance between sets V0 and V0 ; (V)
2) there exists the limit ϕV0 = limV%Zd ϕV0 , and the limiting value ϕV0 satisfies bounds (5.2).
5.2. Expansion for expected values. Given a finite set V(0) ⊂ Zd , suppose that gV(0) is a function → C localised in V(0) (i.e. depending on the restriction of a configuration Q ∈ to V(0) : gV(0) (Q) = gV(0) (QV(0) ). Assuming that V(0) ⊆ V, con−1 sider the Gibbs distribution ν¯ V with the density d¯νV (Q)/dµ(Q) = ZV exp − P x,y∈V: |x−y|=1 qx − qy )2 ; see (5.1). The approach adopted in Sect. 5.1 leads to R the following representation for the expected value hgV(0) iν¯V := gV(0) d¯νV : Z
(X V(0) )
hgV(0) iν¯V =
(V)
gV(0) (Q)p0 (Q)dµ(Q)ϕV\(V(0) ∪[0]) .
0: [0]⊆V
(5.3)
P(V(0) ) Here, the sum 0: [0]⊆V is over the sets 0 of pairwise distinct edges of Zd such that (a) each connected component 0 of 0 has [0] ∩ V(0) 6 = ∅, and (b) the set of the vertices [0] of the edges from [0] is a subset V. Equations (5.2) and (5.3) imply that if (X V(0) ) 0: |[0]| 0 such that N(t) ≤ c4 t 2s/(4s−2) + c5 . Observe that bounds 6.5.1 imply that ∃ c6 , c7 ∈ (0, 1), c6 < c7 , such that for any r ∈ (0, 1) there exists n0 = n0 (r) such that ∀n > n0 c6 ≤ κ[rn] /κn ≤ c7 .
(6.19)
P To estimate the sum m |rn,m |, we note that by virtue of (3.5) it does not exceed c8 (κn + V0 )3/4+(4d−1)/(2(4s−2)) X 1 (κm + V0 )(2d+s−1)/(4s−2) + (κn + V0 )1/2+1/(4s−2) . × |κn − κm | m: m6=n
We partition the last series into four sums: X X X = + m: m6=n
0≤m≤[n/2]
X
+
[n/2]+1≤m≤n−1
X
+
n+1≤m6 =[3n/2]
(6.20)
m≥[3n/2]
and assess each ofP them individually. The first sum, 0≤m≤[n/2] , equals Z κ[n/2] + 1 (t + V0 )(2d+s−1)/(4s−2) dN (t) κn − t 0 2d+s−1 2d+s−1 !0 Z κ[n/2] (κ[n/2] + V0 ) 4s−2 (t + V0 ) 4s−2 − N (t) dt. = N κ[n/2] κn − κ[n/2] κn − t 0
(6.21)
In view of 6.5.1, 6.5.2 and (6.19), the first term in the r.h.s. of (6.21) is less than or equal to 2d+3s−1
c9
(κ[n/2] + V0 ) 4s−2 (κ[n/2] + V0 ) 2s−2 ≤ c9 n κ[n/2] + V0 4s−2 n
2d+s+1 4s−2
≤ c9 (κn + V0 )
2d−s+1 4s−2
The integral in the r.h.s. of (6.21), again by (6.19), is Z κ[n/2] 2d+3s−1 1 (t + V0 ) 4s−2 −1 ≤ c10 2 (κn − t) 0 2d + s − 1 3s − 2d − 1 (κn + V0 ) + | |(t + V0 ) dt. × 4s − 2 4s − 2
.
(6.22)
(6.23)
Performing the change of variables t + V0 = (κn + V0 )ξ and using (6.19), integral (6.23) is made Z c11 2d+3s−1 1 ξ (2d+3s−1)/(4s−2)−1 ≤ (κn + V0 ) 4s−2 −1 (1 − ξ )2 0 (6.24) 3s − 2d − 1 2d + s − 1 +| |ξ dξ = c12 (κn + V0 )(2d−s+1)/(4s−2) . × 4s − 2 4s − 2 Therefore, the first sum in the r.h.s. of (6.20) does not exceed
The second sum,
P
c13 (κn + V0 )(2d−s+1)/(4s−2) . [n/2]+1≤m≤n−1 ,
c14
X 1≤k≤[n/2]
in (6.20) does not exceed
(κn−k + V0 )(2d+s−1)/(4s−2) k(κm + V0 )(2s−2)/(4s−2) (2d+s−1)/(4s−2)
≤ c15 (κn + V0 )
(6.25)
ln (κn + V0 );
(6.26)
in the last inequality we used 6.5.1 and the fact that, for $0 \le k \le [n/2]$, $0 < c_{16} \le (\kappa_{n-k} + V_0)(\kappa_n + V_0)^{-1} < 1$. The third sum in the r.h.s. of (6.20) is assessed in a similar fashion and again does not exceed
\[ c_{17}\, (\kappa_n + V_0)^{(2d-s+1)/(4s-2)} \ln(\kappa_n + V_0). \tag{6.27} \]
Finally, the fourth sum is estimated by means of an argument used for assessing the first sum. However, the difference with (6.24) is that now we deal with an integral
\[ \int_{c_{18}}^{\infty} \frac{\xi^{(2d+3s-1)/(4s-2)-1}}{(1-\xi)^2} \Big( \frac{2d+s-1}{4s-2} + \Big|\frac{3s-2d-1}{4s-2}\Big|\, \xi \Big)\, d\xi \tag{6.28} \]
which converges when $2d + 1 < s$ (see (1.4)). The ultimate bound is then identical to (6.25). We finally have that $\sum_m |r_{n,m}| \le c_{19} (\kappa_n + V_0)^{1/2+1/(4s-2)} \ln(\kappa_n + V_0)$. The sums $\sum_m |p_{n,m}|$ and $\sum_m |b_{n,m}|$ are assessed in a similar way. This completes the proof of Lemma 3.3. □
Acknowledgements. RAM acknowledges the financial support of RFFI (grants 96-01-00064 and 97-01-00714). YMS acknowledges the support of EC Grant “Training Mobility and Research” (Contracts CHRX–CT 930411 and ERBMRXT–CT 960075A) and INTAS Grant “Mathematical Methods for Stochastic Discrete Event Systems” (INTAS 93–820). RAM thanks St John’s College, Cambridge, UK, for hospitality during Easter Term 1998. YMS thanks I.H.E.S., Bures-sur-Yvette, France, for hospitality during his visits in Spring and Autumn, 1998, and DIAS and Professor J. Lewis for hospitality during his visit in Autumn, 1998. The authors thank S. Shea-Simonds for checking the style of the paper.
References
[AR] Albeverio, S., Röckner, M.: Stochastic differential equations in infinite dimensions: Solution via Dirichlet forms. Prob. Theor. Rel. Fields 89, 347–385 (1991)
[AKR 1] Albeverio, S., Kondratiev, Yu.G., Röckner, M.: Ergodicity of L2-semigroups and extremality of Gibbs states. J. Funct. Anal. 144, 394–423 (1997)
[AKR 2] Albeverio, S., Kondratiev, Yu.G., Röckner, M.: Ergodicity for stochastic dynamics of quasi-invariant measures with applications to Gibbs states. J. Funct. Anal. 149, 415–469 (1997)
[BH] Bellissard, J., Hoegh-Krohn, R.: Compactness and maximal Gibbs states for Gibbs random fields on a lattice. Commun. Math. Phys. 84, 297–327 (1982)
[COPP] Cassandro, M., Olivieri, E., Pellegrinotti, A., Presutti, E.: Existence and uniqueness of DLR measures for unbounded spin systems. Z. Wahrsch. Verw. Gebiete 41, 313–334 (1978)
[DFS] Dobrushin, R.L., Fritz, J., Suhov, Yu.M.: A.N. Kolmogorov, the founder of the theory of reversible Markov processes [Russian]. Uspekhi Matem. Nauk 43, No. 6, 167–188 (1988)
[DR] Doss, H., Royer, G.: Processus de diffusion associé aux mesures de Gibbs sur $\mathbb{R}^{\mathbb{Z}^d}$. Z. Wahrsch. Verw. Gebiete 46, 107–124 (1978)
[F] Fedoryuk, M.V.: Asymptotic Analysis. Linear Ordinary Differential Equations. Berlin: Springer-Verlag, 1993
[Fr] Fritz, J.: Infinite lattice systems of interacting diffusion processes. Z. Wahrsch. Verw. Gebiete 59, 291–309 (1982)
[KM] Kondratiev, Yu.G., Minlos, R.A.: One-particle subspaces in the stochastic XY model. J. Stat. Phys. 87, no. 3/4, 613–642 (1997)
[LS] Levitan, B.M., Sargsijan, I.S.: Introduction to Spectral Theory: Selfadjoint Ordinary Differential Operators. Providence, R.I.: AMS, 1975
[L] Liggett, T.M.: Stochastic models of interacting systems. Ann. Prob. 25, 1–29 (1997)
[MM 1] Malyshev, V.A., Minlos, R.A.: Gibbs Random Fields. Cluster Expansions. Dordrecht: Kluwer Academic Publishers, 1991
[MM 2] Malyshev, V.A., Minlos, R.A.: Linear Infinite-Particle Operators. Translations of Mathematical Monographs 143. Providence, R.I.: American Mathematical Society, 1995
[M 1] Minlos, R.A.: Spectral expansion of the transfer matrices of Gibbs fields. In: Mathematical Physics Reviews. Vol. 7. Soviet. Sci. Rev. Sect. C: Math. Phys. Rev. Chur: Harwood Academic Publ. 1988, pp. 235–280
[M 2] Minlos, R.A.: Invariant subspaces of the stochastic Ising high temperature dynamics. Markov Proc. Rel. Fields 2, 263–284 (1996)
[M 3] Minlos, R.A.: Spectra of the stochastic operators of some Markov processes, and their asymptotic behavior. St Petersburg Math. J. 8, 291–301 (1996)
[MS] Minlos, R.A., Sinai, Ya.G.: Investigation of the spectra of stochastic operators that arise in lattice gas models [Russian]. Teoret. Mat. Fizika 2, 230–243 (1970)
[MT] Minlos, R.A., Trishch, A.G.: Complete spectral resolution of the generator of Glauber dynamics for the one-dimensional Ising model [Russian]. Uspekhi Matem. Nauk 49, No. 6, 209–210 (1994)
[MVZ] Minlos, R.A., Verbeure, A., Zagrebnov, V.A.: A quantum crystal model in the light mass limit: The Gibbs state. To appear in Rev. Math. Phys. 1999
[MZ] Minlos, R.A., Zhizhina, E.A.: Asymptotics of decay of correlations for lattice spin fields at high temperatures. I. J. Stat. Phys. 84, no. 1/2, 85–118 (1996)
[R] Ramirez, A.F.: Relative entropy and mixing properties of infinite-dimensional diffusions. Probab. Th. Rel. Fields 110, 369–395 (1998)
[Ro] Royer, G.: Processus de diffusion associé à certains modèles d’Ising à spins continus. Z. Wahrsch. Verw. Gebiete 46, 165–176 (1978)
[T] Titchmarsh, E.C.: Eigenfunction Expansions Associated With Second-Order Differential Equations. Oxford: Clarendon Press, 1946
[Y1] Yoshida, N.: The log-Sobolev inequality for weakly coupled lattice fields. Preprint, Division of Mathematics, School of Science, Kyoto University, 1997. To appear in Prob. Theory Rel. Fields
[Y2] Yoshida, N.: The equivalence of the log-Sobolev and a mixing condition for unbounded spin systems on the lattice. Preprint, Division of Mathematics, School of Science, Kyoto University, 1998
[Y3] Yoshida, N.: The log-Sobolev inequality for weakly coupled lattice fields. Preprint, Division of Mathematics, School of Science, Kyoto University, 1998
[Z] Zegarlinski, B.: The strong decay to equilibrium for the stochastic dynamics of unbounded spin systems. Commun. Math. Phys. 175, 401–432 (1996)
[Zh] Zhizhina, E.A.: An asymptotic formula for the decay of correlations in a stochastic model of planar rotators at high temperatures. Theoret. and Math. Phys. 112, 857–865 (1997)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 206, 491 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Erratum
The Number-Theoretical Spin Chain and the Riemann Zeroes
Andreas Knauf
Mathematisches Institut, Universität Erlangen-Nürnberg, Bismarckstr. 1 1/2, D–91054 Erlangen, Germany. E-mail: [email protected]
Received: 18 June 1999 / Accepted: 18 June 1999
Commun. Math. Phys. 196, 703–731 (1998)
In Definition 12 of [2] I introduced three-regular finite graphs $G_d = (V, E)$, d prime, whose vertex sets $V = V_+ \cup V_-$ consist of the orbits of
\[ M_+ := \begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix} \quad\text{resp.}\quad M_- := \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix} \]
acting on SL(2, Z/dZ). A pair $\{v_+, v_-\}$, $v_\pm \in V$, of vertices belongs to the set E of edges iff $v_+ \in V_+$, $v_- \in V_-$ and the orbits $v_+$ and $v_-$ contain a common group element g ∈ SL(2, Z/dZ). I showed in Proposition 15 that for a common ε > 0 the adjacency matrices of these graphs have a spectral radius smaller than 3 − ε, omitting the eigenvalues ±3. On page 725 I conjectured that these graphs are bipartite Ramanujan, meaning that their non-trivial spectral radius is $\le \sqrt{8}$. However, it has been shown recently by Stephan Heiss (following a suggestion of Alain Valette, at the Université de Neuchâtel) that this conjecture is wrong, d = 29 being the first prime leading to a violation of the Ramanujan estimate. Similarly, a Ramanujan estimate does not hold for the operators $\bar{T}_d^d$. Here the first counterexample is d = 433. I would like to thank them for pointing out my erroneous conjecture, and also Peter Sarnak who independently advised me to check it.
References
1. Personal Communication. Homepage of Alain Valette: http://www.unine.ch/math/
2. Knauf, A.: The Number-Theoretical Spin Chain and the Riemann Zeroes. Commun. Math. Phys. 196, 703–731 (1998)
Communicated by P. Sarnak
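The definition of the graphs $G_d$ in this erratum can be turned into a small computation. The sketch below is only an illustration under stated assumptions: it takes the two matrices to be $M_+ = \begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix}$ and $M_- = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$, assumes d is an odd prime, and assumes that the action is by right multiplication (the text does not say which side is meant). All function names are hypothetical. The dense eigenvalue routine is practical for small primes only.

```python
import itertools
import numpy as np

def sl2(d):
    """All matrices (a, b; c, e) over Z/dZ with determinant 1, as 4-tuples."""
    return [g for g in itertools.product(range(d), repeat=4)
            if (g[0] * g[3] - g[1] * g[2]) % d == 1]

def multiply(g, h, d):
    """Matrix product g*h modulo d."""
    a, b, c, e = g
    p, q, r, s = h
    return ((a * p + b * r) % d, (a * q + b * s) % d,
            (c * p + e * r) % d, (c * q + e * s) % d)

def orbit_labels(elements, m, d):
    """Label every group element by its orbit under g -> g*m (assumed action)."""
    label = {}
    count = 0
    for g in elements:
        if g in label:
            continue
        h = g
        while h not in label:      # walk once around the orbit of g
            label[h] = count
            h = multiply(h, m, d)
        count += 1
    return label, count

def adjacency_matrix(d):
    """Adjacency matrix of G_d; the V_+ orbits are listed first, then V_-."""
    m_plus = (-1 % d, -1 % d, 1, 0)
    m_minus = (-1 % d, 1, -1 % d, 0)
    elements = sl2(d)
    plus, n_plus = orbit_labels(elements, m_plus, d)
    minus, n_minus = orbit_labels(elements, m_minus, d)
    adj = np.zeros((n_plus + n_minus, n_plus + n_minus))
    for g in elements:             # g lies in exactly one orbit of each kind
        i, j = plus[g], n_plus + minus[g]
        adj[i, j] = adj[j, i] = 1.0
    return adj

def nontrivial_spectral_radius(adj, tol=1e-8):
    """Largest |eigenvalue| after discarding the eigenvalues +3 and -3."""
    eig = np.linalg.eigvalsh(adj)
    rest = eig[np.abs(np.abs(eig) - 3.0) > tol]
    return float(np.max(np.abs(rest))) if rest.size else 0.0
```

For a given small prime d one can compare `nontrivial_spectral_radius(adjacency_matrix(d))` with `np.sqrt(8)` to test the Ramanujan estimate for that d.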
Commun. Math. Phys. 206, 493 – 531 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Singular Dimensions of the N = 2 Superconformal Algebras. I Matthias Dörrzapf1 , Beatriz Gato-Rivera2,3 1 Lyman Laboratory of Physics, Harvard University, Cambridge, MA 02138, USA.
E-mail:
[email protected] 2 Instituto de Matemáticas y Física Fundamental, CSIC, Serrano 123, Madrid 28006, Spain.
E-mail:
[email protected] 3 NIKHEF-H, Kruislaan 409, 1098 SJ Amsterdam, The Netherlands
Received: 19 August 1998 / Accepted: 15 March 1999
Abstract: Verma modules of superconformal algebras can have singular vector spaces with dimensions greater than 1. Following a method developed for the Virasoro algebra by Kent, we introduce the concept of adapted orderings on superconformal algebras. We prove several general results on the ordering kernels associated to the adapted orderings and show that the size of an ordering kernel implies an upper limit for the dimension of a singular vector space. We apply this method to the topological N = 2 algebra and obtain the maximal dimensions of the singular vector spaces in the topological Verma modules: 0, 1, 2 or 3 depending on the type of Verma module and the type of singular vector. As a consequence we prove the conjecture of Gato-Rivera and Rosado on the possible existing types of topological singular vectors (4 in chiral Verma modules and 29 in complete Verma modules). Interestingly, we have found two-dimensional spaces of singular vectors at level 1. Finally, by using the topological twists and the spectral flows, we also obtain the maximal dimensions of the singular vector spaces for the Neveu–Schwarz N = 2 algebra (0, 1 or 2) and for the Ramond N = 2 algebra (0, 1, 2 or 3).
1. Introduction
More than two decades ago, superconformal algebras were first constructed independently and almost at the same time by Kac [21] and by Ademollo et al. [1]. Whilst Kac [21] derived them for mathematical purposes along with his classification of Lie superalgebras, Ademollo et al. [1] constructed the superconformal algebras for physical purposes in order to define supersymmetric strings. Since then the study of superconformal algebras has made much progress in both mathematics and physics. On the mathematical side Kac and van de Leur [24] and Cheng and Kac [6] have classified all possible superconformal algebras and Kac has recently proved that their classification is complete (see footnote in Ref. [23]).
As far as the physics side is concerned, superconformal models are gaining increasing importance. Many areas of physics make use of superconformal
symmetries but the importance is above all due to the fact that superconformal algebras supply the underlying symmetries of Superstring Theory. The classification of the irreducible highest weight representations of the superconformal algebras is of interest to both mathematicians and physicists. After more than two decades, only the simpler superconformal highest weight representations have been fully understood. Namely, only the representations of N = 1 are completely classified and proven [2,3]. For N = 2 remarkable efforts have been taken by several research groups [5,14,29,8,10,20]. Already the N = 2 superconformal algebras contain several surprising features regarding their representation theory, most of them related to the rank 3 of the algebras, making them more difficult to study than the N=1 superconformal algebras. The rank of the superconformal algebras keeps growing with N and therefore even more difficulties can be expected for higher N. The standard procedure of finding all possible irreducible highest weight representations starts off with defining freely generated modules over a highest weight vector, denoted as Verma modules. A Verma module is in general not irreducible, but the corresponding irreducible representation is obtained as the quotient space of the Verma module divided by all its proper submodules. Therefore, the task of finding irreducible highest weight representations can be reduced to the classification of all submodules of a Verma module. Obviously, every proper submodule needs to have at least one highest weight vector different from the highest weight vector of the Verma module. These vectors are usually called singular vectors of the Verma module. Conversely, a module generated on such a singular vector defines a submodule of the Verma module. Thus, singular vectors play a crucial rôle in finding submodules of Verma modules. However, the set of singular vectors may not generate all the submodules. 
The quotient space of a Verma module divided by the submodules generated by all singular vectors may still be reducible and may hence contain further submodules that again contain singular vectors. But this time they are singular vectors of the quotient space, known as subsingular vectors of the Verma module. Repeating this division procedure successively would ultimately lead to an irreducible quotient space. On the Verma modules one introduces a hermitian contravariant form. The vanishing of the corresponding determinant indicates the existence of a singular vector. Therefore, a crucial step towards analysing irreducible highest weight representations is to compute the inner product determinant. This has been done for N = 1 [22,33,34], N = 2 [5,34,25,19,12], N = 3 [27], and N = 4 [28,32]. Once the determinant vanishes we can conclude the existence of a singular vector $\Psi_l$ at a certain level l, although there may still be other singular vectors at higher levels even outside the submodule generated by $\Psi_l$, the so-called isolated singular vectors. Thus the determinant may not give all singular vectors, nor does it give the dimension of the space of singular vectors at a given level l, since at levels where the determinant predicts one singular vector of a given type, there could in fact be more than one linearly independent singular vector, as happens for the N = 2 superconformal algebras [9,20]. Therefore, the construction of specific singular vectors at levels given by the determinant formula may not be enough. One needs in addition information about the dimension of the space of singular vectors, apart from the (possible) existence of isolated singular vectors. The purpose of this paper is to give a simple procedure that derives necessary conditions on the space of dimensions of singular vectors of the N = 2 superconformal algebras. This will result in an upper limit for the dimension of the spaces of singular vectors at a given level.
For most weight spaces of a Verma module these upper limits on the dimensions will be trivial and we obtain a rigorous proof that there cannot exist any
singular vectors for these weights. For some weights, however, we will find necessary conditions that allow one-dimensional singular vector spaces, as is the case for the Virasoro algebra, or even higher dimensional spaces. The method shown in this paper for the superconformal algebras originates from the method used by Kent [26] for the Virasoro algebra1 . Kent analytically continued the Virasoro Verma modules to generalised Verma modules. In these generalised Verma modules he constructed generalised singular vector expressions in terms of analytically continued Virasoro operators. Then he proved that if a generalised singular vector exists at level 0 in a generalised Verma module, then it is proportional to the highest weight vector. And consequently, if a generalised singular vector exists at a given level in a generalised Verma module, then it is unique up to proportionality. This uniqueness can therefore be used in order to show that the generalised singular vector expressions for the analytically continued modules are actually singular vectors of the Virasoro Verma module, whenever the Virasoro Verma module has a singular vector. As every Virasoro singular vector is at the same time a generalised singular vector, this implies that Virasoro singular vectors also have to be unique up to proportionality. In this paper we focus on the uniqueness proof of Kent and show that similar ideas can be applied directly to the superconformal algebras. Our procedure does not require any analytical continuation of the algebra, however, and therefore gives us a powerful method that can easily be applied to a vast number of algebras without the need of constructing singular vectors. We shall define the underlying idea as the concept of adapted orderings. For pedagogical reasons we will first apply Kent’s ordering directly to the Virasoro Verma modules. 
Then we will present adapted orderings for the topological N = 2 superconformal algebra, which is the most interesting N = 2 algebra for current research in this field. The results obtained will be translated finally to the Neveu–Schwarz and to the Ramond N = 2 algebras. In a future publication we will further apply these ideas to the twisted N = 2 superconformal algebra [13]. The paper is structured as follows. In Sect. 2 we explain the concept of adapted orderings for the case of the Virasoro algebra, which will also serve to illustrate Kent’s proof in our setting. In Sect. 3, we prove some general results on adapted orderings for superconformal algebras, which justify the use of this method. In Sect. 4 we review some basic results concerning the topological N = 2 superconformal algebra. Section 5 introduces adapted orderings on generic Verma modules of the topological N = 2 superconformal algebra (those built on G0 -closed or Q0 -closed highest weight vectors). This procedure is extended to chiral Verma modules in Sect. 6 and to no-label Verma modules in Sect. 7. Section 8 summarises the implications of the adapted orderings on the dimensions of the singular vector spaces for the corresponding topological Verma modules. Section 9 translates these results to the singular vector spaces of the Neveu– Schwarz and the Ramond N = 2 superconformal algebras. Section 10 is devoted to conclusions and prospects. The proof of Theorem 5.3 fills several pages and readers that are not interested in the details of this proof can simply continue with Theorem 5.5. In this case, the preliminary remarks to Theorems 6.1 and 7.2 should also be skipped. Nevertheless, the main idea of the concept can easily be understood from the introductory example of the Virasoro Verma modules in Sect. 2.
1 Besides the later application to the Neveu–Schwarz N = 2 algebra in Ref. [9], only one further application is known to us which has been achieved by Bajnok [4] for the W A2 algebra.
2. Virasoro Algebra It is a well-known fact that at a given level of a Verma module of the Virasoro algebra there can only be one singular vector which is unique up to proportionality. This is an immediate consequence of the proof of the Virasoro embedding diagrams by Feigin and Fuchs [15]. Using an analytically continued algebra of the Virasoro algebra, Kent constructed in Ref. [26] all Virasoro singular vectors in terms of products of analytically continued operators. Although similar methods had already been used earlier on Verma modules over Kac-Moody algebras [31], the construction by Kent not only shows the existence of analytically continued singular vectors for any complex level but also their uniqueness2 . This issue is our main interest in this paper. We shall therefore concentrate on the part of Kent’s proof that shows the uniqueness of Virasoro singular vectors rather than the existence of analytically continued singular vectors. It turns out that the extension of the Virasoro algebra to an analytically continued algebra, although needed for the part of Kent’s proof showing the existence claim, is however not necessary for the uniqueness claim on which we will focus in this paper. We will first motivate and define our concept of adapted orderings for the Virasoro algebra and will then prove some first results for the implications of adapted orderings on singular vectors. Following Kent [26] we will then introduce an ordering on the basis of a Virasoro Verma module and describe it in our framework. If we assume that a singular vector exists at a fixed level, then this total ordering will show that this singular vector has to be unique up to proportionality. The Virasoro algebra V is generated by the operators Lm with m ∈ Z and the central extension C satisfying the commutation relations [Lm , Ln ] = (m − n)Lm+n +
\frac{C}{12}\,(m^3 - m)\,\delta_{m+n,0}, [C, L_m] = 0, m, n ∈ Z. (1)
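That the commutation relations (1) define a Lie algebra can be checked mechanically on a finite range of modes. The following sketch is an illustration only; the function names are hypothetical, and the bracket is encoded as a pair of coefficients (of $L_{m+n}$ and of C) read off from (1). It verifies the Jacobi identity for three generators $L_a$, $L_b$, $L_c$ using exact rational arithmetic.

```python
from fractions import Fraction

def bracket(m, n):
    """[L_m, L_n] as (coefficient of L_{m+n}, coefficient of C), following (1)."""
    central = Fraction(m**3 - m, 12) if m + n == 0 else Fraction(0)
    return (m - n, central)

def jacobi_defect(a, b, c):
    """[L_a,[L_b,L_c]] + cyclic permutations, as (L-coefficient, C-coefficient).

    Every term is a multiple of L_{a+b+c} plus a multiple of C, so the
    coefficients can simply be added.
    """
    total_l, total_c = 0, Fraction(0)
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        inner_l, _ = bracket(y, z)             # the C part of [L_y, L_z] commutes with L_x
        outer_l, outer_c = bracket(x, y + z)   # [L_x, L_{y+z}]
        total_l += inner_l * outer_l
        total_c += inner_l * outer_c
    return total_l, total_c
```

The identity holds when `jacobi_defect(a, b, c)` returns zero in both components; the central component vanishes precisely because of the cubic form $m^3 - m$ of the central term.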
V can be written in its triangular decomposition $V = V_- \oplus V_0 \oplus V_+$, with $V_+ = \mathrm{span}\{L_m : m \in \mathbb{N}\}$, the positive Virasoro operators, and $V_- = \mathrm{span}\{L_{-m} : m \in \mathbb{N}\}$, the negative Virasoro operators. The Cartan subalgebra is given by $V_0 = \mathrm{span}\{L_0, C\}$. For elements Y of V that are eigenvectors of $L_0$ with respect to the adjoint representation we call the $L_0$-eigenvalue the level of Y and denote it by3 |Y|: $[L_0, Y] = |Y|\,Y$. The same shall be used for the universal enveloping algebra U(V). In particular, elements of U(V) of the form $Y = L_{-p_I} \cdots L_{-p_1}$, $p_q \in \mathbb{Z}$ for q = 1, ..., I, $I \in \mathbb{N}$, are at level $|Y| = \sum_{q=1}^{I} p_q$ and we furthermore define them to be of length $\|Y\| = I$. Finally, for the identity operator we set $\|1\| = |1| = 0$. For convenience we define the graded class of subsets of operators in U(V) at positive level:
\[ \mathcal{S}_m = \{ S = L_{-m_I} \cdots L_{-m_1} : |S| = m ;\ m_I \ge \ldots \ge m_1 \ge 2 ;\ m_1, \ldots, m_I, I \in \mathbb{N} \}, \tag{2} \]
for $m \in \mathbb{N}$, $\mathcal{S}_0 = \{1\}$, and also
\[ \mathcal{C}_n = \{ X = S_m L_{-1}^{\,n-m} : S_m \in \mathcal{S}_m ,\ m \in \mathbb{N}_0 ,\ m \le n \}, \tag{3} \]
for $n \in \mathbb{N}_0$, which will serve to construct a basis for Virasoro Verma modules later on. We consider representations of V for which the Cartan subalgebra $V_0$ is diagonal. Furthermore, C commutes with all operators of V and can hence be taken to be constant $c \in \mathbb{C}$ (in an irreducible representation). A representation with $L_0$-eigenvalues bounded from below contains a vector with $L_0$-eigenvalue $\Delta$ which is annihilated by $V_+$, a highest weight vector $|\Delta, c\rangle$:
\[ V_+ |\Delta, c\rangle = 0, \qquad L_0 |\Delta, c\rangle = \Delta\, |\Delta, c\rangle, \qquad C\, |\Delta, c\rangle = c\, |\Delta, c\rangle. \tag{4} \]
2 The exact proof of Kent showed that generalised Virasoro singular vectors at level 0 are scalar multiples of the identity.
3 Note that positive generators $L_m$ have negative level $|L_m| = -m$. Therefore, any positive operators $\Gamma \in V_+$ have a negative level $|\Gamma|$.
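The Verma module $V_{\Delta,c}$ is the left-module $V_{\Delta,c} = U(V) \otimes_{V_0 \oplus V_+} |\Delta, c\rangle$. For $V_{\Delta,c}$ we choose the standard basis $B^{\Delta,c}$ as:
\[ B^{\Delta,c} = \{ S_m L_{-1}^{\,n} |\Delta, c\rangle : S_m \in \mathcal{S}_m ,\ m, n \in \mathbb{N}_0 \}. \tag{5} \]
$V_{\Delta,c}$ and $B^{\Delta,c}$ are $L_0$-graded in a natural way. The corresponding $L_0$-eigenvalue is called the conformal weight and the $L_0$-eigenvalue relative to $\Delta$ is the level. Let us introduce
\[ B_k^{\Delta,c} = \{ X_k |\Delta, c\rangle : X_k \in \mathcal{C}_k \}, \qquad k \in \mathbb{N}_0. \tag{6} \]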
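The sets defined in (2), (3) and (6) are easy to enumerate, which gives a quick consistency check on the definitions. The sketch below is an illustration with hypothetical function names: an element of the set in (2) is represented by its tuple of indices $(m_I, \ldots, m_1)$, non-increasing with all entries at least 2, and an element of the set in (3) by such a tuple together with the power of $L_{-1}$.

```python
def partitions_min_part(m, smallest=2, largest=None):
    """Partitions of m into parts >= smallest, as non-increasing tuples."""
    if largest is None:
        largest = m
    if m == 0:
        yield ()                      # the empty product, i.e. the identity
        return
    for part in range(min(m, largest), smallest - 1, -1):
        for rest in partitions_min_part(m - part, smallest, part):
            yield (part,) + rest

def basis_terms(n):
    """Level-n terms as pairs ((m_I, ..., m_1), power of L_{-1})."""
    return [(parts, n - m)
            for m in range(n + 1)
            for parts in partitions_min_part(m)]
```

For example, `basis_terms(4)` contains `((), 4)`, `((2,), 2)`, `((3,), 1)`, `((4,), 0)` and `((2, 2), 0)`. The number of terms at level n equals the number of partitions of n, as expected for the grade space of a Verma module at level n.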
Thus, $B_k^{\Delta,c}$ has conformal weight k and $\mathrm{span}\{B_k^{\Delta,c}\}$ is the grade space of $V_{\Delta,c}$ at level k. For $x \in \mathrm{span}\{B_k^{\Delta,c}\}$ we again denote the level by |x| = k. Verma modules may not be irreducible. In order to obtain physically relevant irreducible highest weight representations one thus needs to trace back the proper submodules of $V_{\Delta,c}$ and divide them out. This finally leads to the notion of singular vectors as any proper submodule of $V_{\Delta,c}$ needs to contain a vector $\Psi_l$ that is not proportional to the highest weight vector $|\Delta, c\rangle$ but still satisfies the highest weight vector conditions4 with conformal weight5 $\Delta + l$ for some $l \in \mathbb{N}_0$:
\[ V_+ \Psi_l = 0, \qquad L_0 \Psi_l = (\Delta + l)\,\Psi_l, \qquad C\, \Psi_l = c\, \Psi_l, \tag{7} \]
l is the level of $\Psi_l$, denoted by $|\Psi_l|$. An eigenvector $\Psi_l$ of $L_0$ at level l in $V_{\Delta,c}$, in particular a singular vector, can thus be written using the basis (6):
\[ \Psi_l = \sum_{m=0}^{l} \sum_{S_m \in \mathcal{S}_m} c_{S_m}\, S_m L_{-1}^{\,l-m} |\Delta, c\rangle, \tag{8} \]
with coefficients $c_{S_m} \in \mathbb{C}$. The basis decomposition (8) of an $L_0$-eigenvector in $V_{\Delta,c}$ will be denoted the normal form of $\Psi_l$, where $S_m L_{-1}^{\,l-m} \in \mathcal{C}_l$ and $c_{S_m}$ will be referred to as the terms and coefficients of $\Psi_l$, respectively. A non-trivial term $Y \in \mathcal{C}_l$ of $\Psi_l$ refers to a term Y in Eq. (8) with non-trivial coefficient $c_Y$. Let O denote a total ordering on $\mathcal{C}_l$ with global minimum. Thus $\Psi_l$ in Eq. (8) needs to contain an O-smallest $X_0 \in \mathcal{C}_l$ with $c_{X_0} \neq 0$ and $c_Y = 0$ for all $Y \in \mathcal{C}_l$ with Y