Commun. Math. Phys. 298, 1–36 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1064-1
Communications in
Mathematical Physics
Hölder-Continuous Rough Paths by Fourier Normal Ordering
Jérémie Unterberger
Institut Elie Cartan, Université Henri Poincaré, BP 239, 54506 Vandoeuvre Cedex, France. E-mail: [email protected]
Received: 16 March 2009 / Accepted: 9 March 2010
Published online: 19 May 2010 – © Springer-Verlag 2010
Abstract: We construct in this article an explicit geometric rough path over arbitrary d-dimensional paths with finite 1/α-variation for any α ∈ (0, 1). The method may be coined as ‘Fourier normal ordering’, since it consists in a regularization obtained after permuting the order of integration in iterated integrals so that innermost integrals have highest Fourier frequencies. In doing so, there appear non-trivial tree combinatorics, which are best understood by using the structure of the Hopf algebra of decorated rooted trees (in connection with the Chen or multiplicative property) and of the Hopf shuffle algebra (in connection with the shuffle or geometric property). Hölder continuity is proved by using Besov norms. The method is well-suited in particular in view of applications to probability theory (see the companion article [34] for the construction of a rough path over multidimensional fractional Brownian motion with Hurst index α < 1/4, or [35] for a short survey in that case).

Contents
0. Introduction
1. Iterated Integrals: Smooth Case
   1.1 From iterated integrals to trees
   1.2 Permutation graphs and Fourier normal ordering for smooth paths
   1.3 Tree Chen property and coproduct structure
   1.4 Skeleton integrals
2. Regularization: The Fourier Normal Ordering Step by Step
3. Proof of the Geometric and Multiplicative Properties
   3.1 Hopf algebras and the Chen and shuffle properties
   3.2 Proof of the Chen and shuffle properties
4. Hölder Estimates
   4.1 Choice of the regularization scheme
   4.2 A key formula for skeleton integrals
   4.3 Estimate for the increment term
   4.4 Estimate for the boundary term
5. Appendix. Hölder and Besov Spaces
References
0. Introduction

Assume $t \mapsto \Gamma_t = (\Gamma_t(1), \ldots, \Gamma_t(d))$, $t \in \mathbb{R}$, is a smooth d-dimensional path, and let $V_1, \ldots, V_d : \mathbb{R}^d \to \mathbb{R}^d$ be smooth vector fields. Then the classical Cauchy-Lipschitz theorem implies that the differential equation driven by $\Gamma$,

$$dy(t) = \sum_{i=1}^{d} V_i(y(t))\, d\Gamma_t(i) \tag{0.1}$$

admits a unique solution with initial condition $y(0) = y_0$. The usual way to prove this is to show by a functional fixed-point theorem that the iterated integrals

$$y_n \mapsto y_{n+1}(t) := y_0 + \int_0^t \sum_i V_i(y_n(s))\, d\Gamma_s(i) \tag{0.2}$$

converge when $n \to \infty$.

Assume now that $\Gamma$ is only α-Hölder continuous for some α ∈ (0, 1). Then the Cauchy-Lipschitz theorem does not hold any more, because one first needs to give a meaning to the above integrals, and in particular to the iterated integrals $\int_s^t d\Gamma_{t_1}(i_1) \int_s^{t_1} d\Gamma_{t_2}(i_2) \cdots \int_s^{t_{n-1}} d\Gamma_{t_n}(i_n)$, $n \ge 2$, $1 \le i_1, \ldots, i_n \le d$. The theory of rough paths, invented by T. Lyons [22] and further developed by V. Friz, N. Victoir [14] and M. Gubinelli [15], makes it possible to solve Eq. (0.1) by a redefinition of the integration along $\Gamma$, using as an essential ingredient a rough path $\boldsymbol{\Gamma}$ over $\Gamma$. By definition, a functional $\boldsymbol{\Gamma} = (\boldsymbol{\Gamma}^1, \ldots, \boldsymbol{\Gamma}^N)$, $N = \lfloor 1/\alpha \rfloor$ = entire part of 1/α, is called a rough path over $\Gamma$ if $\boldsymbol{\Gamma}^1_{ts} = (\delta\Gamma)_{ts} := \Gamma_t - \Gamma_s$ are the two-point increments of $\Gamma$, and $\boldsymbol{\Gamma}^k = (\boldsymbol{\Gamma}^k(i_1, \ldots, i_k))_{1 \le i_1, \ldots, i_k \le d}$, $k = 1, \ldots, N$ satisfy the following three properties:

(i) Hölder continuity. Each component of $\boldsymbol{\Gamma}^k$, $k = 1, \ldots, N$ is kα-Hölder continuous, that is to say, $\sup_{s \in \mathbb{R}} \sup_{t \in \mathbb{R}} \frac{|\boldsymbol{\Gamma}^k_{ts}(i_1, \ldots, i_k)|}{|t-s|^{k\alpha}} < \infty$;

(ii) Multiplicative/Chen property. Letting $\delta\boldsymbol{\Gamma}^k_{tus} := \boldsymbol{\Gamma}^k_{ts} - \boldsymbol{\Gamma}^k_{tu} - \boldsymbol{\Gamma}^k_{us}$, one requires

$$\delta\boldsymbol{\Gamma}^k_{tus}(i_1, \ldots, i_k) = \sum_{k_1 + k_2 = k} \boldsymbol{\Gamma}^{k_1}_{tu}(i_1, \ldots, i_{k_1})\, \boldsymbol{\Gamma}^{k_2}_{us}(i_{k_1+1}, \ldots, i_k); \tag{0.3}$$

(iii) Geometric/shuffle property.

$$\boldsymbol{\Gamma}^{n_1}_{ts}(i_1, \ldots, i_{n_1})\, \boldsymbol{\Gamma}^{n_2}_{ts}(j_1, \ldots, j_{n_2}) = \sum_{\boldsymbol{k} \in \mathrm{Sh}(\boldsymbol{i}, \boldsymbol{j})} \boldsymbol{\Gamma}^{n_1+n_2}_{ts}(k_1, \ldots, k_{n_1+n_2}), \tag{0.4}$$

where $\mathrm{Sh}(\boldsymbol{i}, \boldsymbol{j})$ is the set of shuffles of $\boldsymbol{i} = (i_1, \ldots, i_{n_1})$ and $\boldsymbol{j} = (j_1, \ldots, j_{n_2})$, that is to say, of permutations of $i_1, \ldots, i_{n_1}, j_1, \ldots, j_{n_2}$ which do not change the orderings of $(i_1, \ldots, i_{n_1})$ and $(j_1, \ldots, j_{n_2})$.
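Properties (ii) and (iii) can be checked by hand at order 2 on a smooth example. The following sketch is only an illustration: the test path $\Gamma_t = (t, t^2)$, the closed formulas for its iterated integrals and the function names are assumptions made for the example, not part of the construction of this article.

```python
from itertools import combinations

def shuffles(i, j):
    """All shuffles of the words i and j (tuples), listed with multiplicity."""
    n1, n2 = len(i), len(j)
    out = []
    for pos in combinations(range(n1 + n2), n1):  # slots taken by the letters of i
        it_i, it_j = iter(i), iter(j)
        out.append(tuple(next(it_i) if p in pos else next(it_j)
                         for p in range(n1 + n2)))
    return out

# Hypothetical smooth test path Gamma_t = (t, t^2), components labelled 1 and 2.
def level1(t, s):
    return {1: t - s, 2: t**2 - s**2}

def level2(t, s):
    # int_s^t dGamma_x(i) (Gamma_x(j) - Gamma_s(j)), integrated by hand
    c = (t**3 - s**3) / 3
    return {(1, 1): (t - s)**2 / 2,
            (1, 2): c - s**2 * (t - s),
            (2, 1): 2 * c - s * (t**2 - s**2),
            (2, 2): (t**2 - s**2)**2 / 2}

def chen_defect(t, u, s):
    """Largest violation of the Chen property (0.3) at order k = 2."""
    g_ts, g_tu, g_us = level2(t, s), level2(t, u), level2(u, s)
    a, b = level1(t, u), level1(u, s)
    return max(abs(g_ts[i, j] - g_tu[i, j] - g_us[i, j] - a[i] * b[j])
               for (i, j) in g_ts)

def shuffle_defect(t, s):
    """Largest violation of the shuffle property (0.4) for n1 = n2 = 1."""
    g1, g2 = level1(t, s), level2(t, s)
    return max(abs(g1[i] * g1[j] - sum(g2[w] for w in shuffles((i,), (j,))))
               for i in (1, 2) for j in (1, 2))

print(chen_defect(1.3, 0.4, -0.2), shuffle_defect(1.3, -0.2))
```

Both defects vanish up to rounding errors, as expected for the canonical lift of a smooth path.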
There is a canonical choice for $\boldsymbol{\Gamma}$, called canonical lift of $\Gamma$, when $\Gamma$ is a smooth path, namely, the iterated integrals of $\Gamma$ of arbitrary order. If one sets

$$\boldsymbol{\Gamma}^{cano,n}_{ts}(i_1, \ldots, i_n) := \int_s^t d\Gamma_{t_1}(i_1) \int_s^{t_1} d\Gamma_{t_2}(i_2) \cdots \int_s^{t_{n-1}} d\Gamma_{t_n}(i_n), \tag{0.5}$$

then $\boldsymbol{\Gamma}^{cano} = (\boldsymbol{\Gamma}^{cano,n})_{n=1,2,\ldots}$ satisfies properties (i), (ii), (iii) with α = 1. Axiom (ii) receives a natural geometric interpretation in this case since $\boldsymbol{\Gamma}^{cano}$ measures the areas, volumes and so forth generated by $\Gamma(1), \ldots, \Gamma(d)$, see [14], while axiom (iii) may be deduced from Fubini’s theorem. A further justification of axioms (i), (ii), (iii) comes from the fact that any rough path is a limit in some sense of the iterated integrals of a sequence of smooth paths, so $\boldsymbol{\Gamma}$ plays the rôle of a substitute of iterated integrals for $\Gamma$.

The problem we address here is the existence and construction of rough paths. It is particularly relevant when $\Gamma$ is a random path; it allows for the pathwise construction of stochastic integrals or of solutions of stochastic differential equations driven by $\Gamma$. Rough paths are then usually constructed by choosing some appropriate smooth approximation $\Gamma^{\eta}$, $\eta \overset{>}{\to} 0$ of $\Gamma$ and proving that the canonical lift of $\Gamma^{\eta}$ converges in $L^2(\Omega)$ for appropriate Hölder norms to a rough path $\boldsymbol{\Gamma}$ lying above $\Gamma$ (see [11,32] in the case of fractional Brownian motion with Hurst index α > 1/4, and [1,18] for a class of random paths on fractals, or references in [23]).

A general construction of a rough path for deterministic paths has been given, in the original formulation due to T. Lyons, in an article by T. Lyons and N. Victoir [23]. The idea [14] is to see a rough path over $\Gamma$ as a Hölder section of the trivial G-principal bundle over $\mathbb{R}$, where G is a free rank-N nilpotent group (or Carnot group), while the underlying path $\Gamma$ is a section of the corresponding quotient G/K-bundle for some normal subgroup K of G; so one is reduced to the problem of finding Hölder-continuous sections $g_t K \to g_t$. Obviously, there is no canonical way to do this in general. This abstract, group-theoretic construction, which uses the axiom of choice, is unfortunately not particularly appropriate for concrete problems, such as the behaviour of solutions of stochastic differential equations for instance.
We propose here a new, explicit method to construct a rough path $\boldsymbol{\Gamma}$ over an arbitrary α-Hölder path $\Gamma$ which rests on an algorithm that we call Fourier normal ordering. Let us explain the main points of this algorithm. The first point is the use of the Fourier transform, $\mathcal{F}$; Hölder estimates are obtained by means of Besov norms involving compactly supported Fourier multipliers, see the Appendix. Assume for simplicity that $\Gamma$ is compactly supported; this assumption is essentially void since one may multiply any α-Hölder path by a smooth, compactly supported function equal to 1 over an arbitrarily large compact interval, and then restrict the construction to this interval. What makes the Fourier transform interesting for our problem is that $(\mathcal{F}\Gamma')(\xi) = i\xi(\mathcal{F}\Gamma)(\xi)$ is a well-defined function; thus, the meaningless iterated integral $\int_s^t d\Gamma_{t_1}(i_1) \int_s^{t_1} d\Gamma_{t_2}(i_2) \cdots \int_s^{t_{n-1}} d\Gamma_{t_n}(i_n)$ is rewritten after Fourier transformation as some integral $\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f(\xi_1, \ldots, \xi_n)\, d\xi_1 \ldots d\xi_n$, where f is regular but not integrable at infinity along certain directions. The second, main point is the splitting of the Fourier domain of integration $\mathbb{R}^n$ into $\cup_{\sigma \in \Sigma_n} \mathbb{R}^n_\sigma$, $\Sigma_n$ = set of permutations of {1, . . . , n}, where $\mathbb{R}^n_\sigma := \{|\xi_{\sigma(1)}| \le \ldots \le |\xi_{\sigma(n)}|\}$, see Sect. 2 for a more accurate definition involving the Besov dyadic decomposition. Away from the singular directions, the resulting integrals are naturally shown to have a polynomially decreasing behaviour at infinity implying the correct Hölder behaviour; simple examples may be read from [35]. However, as computations in Sect. 4 clearly show (see also [35] for an elementary example), these bounds are naturally
obtained only after permuting the order of integration by means of Fubini’s theorem, so that the Fourier coordinates $|\xi_1|, \ldots, |\xi_n|$ appear in increasing order. There appear in the process integrals over domains which differ from the simplex $\{t \ge t_1 \ge \ldots \ge t_n \ge s\}$, which are particular instances of tree integrals, and that we call tree skeleton integrals. The next step is to regularize the tree skeleton integrals so that Fourier integrals converge at infinity, without losing the Chen and shuffle properties (ii) and (iii).

At this point it turns out to be both natural and necessary to re-interpret the above scheme in terms of tree Hopf algebra combinatorics. The interest for the study of Hopf algebras of trees or graphs surged out of a series of papers by A. Connes and D. Kreimer [8–10] concerning the mathematical structures hidden behind the Bogoliubov-Parasiuk-Hepp-Zimmermann (BPHZ) procedure for renormalizing Feynman diagrams in quantum field theory [17], and is still very much alive, see for instance [3,4,6,7,13,20,25,36], with applications ranging from numerical methods to quantum chromodynamics, multi-zeta functions or operads. It appears that the shuffle property may be stated by saying that regularized skeleton integrals define characters of yet another Hopf algebra called shuffle algebra, while the Chen property follows from the very definition of the regularized iterated integrals as a convolution of regularized skeleton integrals.

We show that the tree skeleton integrals may be regularized by integrating over appropriate subdomains of $\mathbb{R}^n_\sigma$ avoiding the singular directions. The proof of properties (ii), (iii) uses Hopf combinatorics and does not depend on the choice of the above subdomains, while the proof of the Hölder estimates (i) uses both tree combinatorics and some elementary analysis relying on the shape of the subdomains. It seems natural to look for a less arbitrary regularization scheme for the skeleton integrals.
The idea of cancelling singularities by iteratively building counterterms, which originates from the BPHZ procedure, should also apply here. We plan to give such a construction (such as dimensional regularization for instance) in the near future.

Let us state our main result. Throughout the paper α ∈ (0, 1) is some fixed constant and $N = \lfloor 1/\alpha \rfloor$.

Main Theorem. Assume $1/\alpha \notin \mathbb{N}$. Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d)) : \mathbb{R} \to \mathbb{R}^d$ be a compactly supported α-Hölder path. Then the functional $(\mathcal{R}\boldsymbol{\Gamma}^1, \ldots, \mathcal{R}\boldsymbol{\Gamma}^N)$ defined in Sect. 2 is an α-Hölder geometric rough path lying over $\Gamma$ in the sense of properties (i), (ii), (iii) of the Introduction.

In a companion paper [34], we construct by the same algorithm an explicit rough path over a d-dimensional fractional Brownian motion $B^\alpha = (B^\alpha(1), \ldots, B^\alpha(d))$ with arbitrary Hurst index α ∈ (0, 1); recall simply that the paths of $B^\alpha$ are a.s. κ-Hölder for every κ < α. The problem was up to now open for α ≤ 1/4 despite many attempts [11,32,33,12]. Fourier normal ordering turns out to be very efficient in combination with Gaussian tools, and provides explicit bounds for the moments of the rough path, seen as a path-valued random variable.

The above theorem extends to paths $\Gamma$ with finite 1/α-variation. Namely (see [23], [21] or also [14]), a simple change of variable $\Gamma \to \Gamma^\phi := \Gamma \circ \phi^{-1}$ turns $\Gamma$ into an α-Hölder path, with φ defined for instance as $\phi(t) := \sup_{n \ge 1} \sup_{0 = t_0 \le \ldots \le t_n = t} \sum_{j=0}^{n-1} \|\Gamma(t_{j+1}) - \Gamma(t_j)\|^{1/\alpha}$. The construction of the above theorem, applied to $\Gamma^\phi$, yields a family of paths with Hölder regularities α, 2α, . . . , Nα which may alternatively be seen as a $G^N$-valued α-Hölder path $\boldsymbol{\Gamma}^\phi$, where $G^N$ is the Carnot free nilpotent group of order N equipped with any subadditive homogeneous norm. Then (as proved in [23], Lemma 8) $\boldsymbol{\Gamma} := \boldsymbol{\Gamma}^\phi \circ \phi$ has finite 1/α-variation, which is equivalent to saying that $\boldsymbol{\Gamma}^n$ has finite 1/(nα)-variation for n = 1, . . . , N, and lies above $\Gamma$.
Corollary. Let α ∈ (0, 1) and α′ < α. Then every α-Hölder path $\Gamma$ may be lifted to a strong α′-Hölder geometric rough path, namely, there exists a sequence of canonical lifts $\boldsymbol{\Gamma}^{(n)}$ of smooth paths $\Gamma^{(n)}$ converging to $\mathcal{R}\boldsymbol{\Gamma}$ for the sequence of α′-Hölder norms.

The set of strong α-Hölder geometric rough paths is strictly included in the set of general α-Hölder geometric rough paths. On the other hand, as we already alluded to above, a weak α-Hölder geometric rough path may be seen as a strong α′-Hölder geometric rough path if α′ < α. This accounts for the loss of regularity in the corollary (see [14] for a precise discussion). The proviso $1/\alpha \notin \mathbb{N}$ in the statement of the main theorem is a priori needed because otherwise $\mathcal{R}\boldsymbol{\Gamma}^N$ may not be treated in the same way as the lower-order iterated integrals (although we do not know if it is actually necessary). However, if $1/\alpha \in \mathbb{N}$, all one has to do is replace α by a slightly smaller parameter α′, so that the corollary holds even in this case.

Note that the present paper gives unfortunately no explicit way of approximating $\mathcal{R}\boldsymbol{\Gamma}$ by canonical lifts of smooth paths, i.e. of seeing it concretely as a strong geometric rough path. The question is currently under investigation in the particular case of fractional Brownian motion by using constructive field theory methods. Interestingly enough, the idea of controlling singularities by separating the Fourier scales according to a dyadic decomposition is at the core of constructive field theory [27].

Here is an outline of the article. A thorough presentation of iterated integrals, together with the skeleton integral variant, the implementation of Fourier normal ordering, and the extension to tree integrals, is given in Sect. 1, where $\Gamma$ is assumed to be smooth. The regularization algorithm is presented in Sect. 2; the regularized rough path $\mathcal{R}\boldsymbol{\Gamma}$ is defined there for an arbitrary α-Hölder path $\Gamma$. The proof of the Chen and shuffle properties is given in Sect. 3, where one may also find two abstract but more compact reformulations of the regularization algorithm, see Lemma 3.5 and Definition 3.7. Hölder estimates are to be found in Sect. 4. Finally, we gathered in an Appendix some technical facts about Besov spaces required for the construction.

Notations. We shall denote by $\mathcal{F}$ the Fourier transform,

$$\mathcal{F} : L^2(\mathbb{R}^l) \to L^2(\mathbb{R}^l), \quad f \mapsto \mathcal{F}(f)(\xi) = \frac{1}{(2\pi)^{l/2}} \int_{\mathbb{R}^l} f(x)\, e^{-i\langle x, \xi \rangle}\, dx. \tag{0.6}$$
Throughout the article, $\Gamma : \mathbb{R} \to \mathbb{R}^d$ is some compactly supported α-Hölder path; sometimes, it is assumed to be smooth. The permutation group of {1, . . . , n} is denoted by $\Sigma_n$. Also, if $a, b : X \to \mathbb{R}_+$ are functions on some set X such that $a(x) \le C b(x)$ for every x ∈ X, we shall write $a \lesssim b$. Admissible cuts of a tree $\mathbb{T}$, see Subsect. 1.3, are usually denoted by $\boldsymbol{v}$ or $\boldsymbol{w}$, and we write $(\mathrm{Roo}_{\boldsymbol{v}}(\mathbb{T}), \mathrm{Lea}_{\boldsymbol{v}}(\mathbb{T}))$ (root part and leaves) instead of the traditional notation $(R^c \mathbb{T}, P^c \mathbb{T})$ due to Connes and Kreimer.

1. Iterated Integrals: Smooth Case

Let $t \mapsto \Gamma_t = (\Gamma_t(1), \ldots, \Gamma_t(d))$ be a d-dimensional, compactly supported, smooth path. The purpose of this section is to give proper notations for iterated integrals of $\Gamma$ and to introduce some tools which will pave the way for the regularization algorithm. Subsection 1.1 on tree iterated integrals is standard, see for instance [8]. We introduce permutation graphs and Fourier normal ordering for smooth paths in Subsect. 1.2. The tree Chen property, a generalization of the usual Chen property to tree iterated integrals, is recalled in Subsect. 1.3, in connection with the underlying Hopf algebraic
structure. Finally, a variant of iterated integrals called skeleton integrals is introduced in Subsect. 1.4, together with a variant of the tree Chen property that we call tree skeleton decomposition.

1.1. From iterated integrals to trees. It was noted already a long time ago [5] that iterated integrals could be encoded by trees, see also [20]. This remark has been exploited in connection with the construction of the rough path solution of partial, stochastic differential equations in [16]. The correspondence between trees and iterated integrals goes simply as follows:

Definition 1.1. A decorated rooted tree (to be drawn growing up) is a finite tree with a distinguished vertex called root and edges oriented downwards, i.e. directed towards the root, such that every vertex wears a positive integer label called decoration. If $\mathbb{T}$ is a decorated rooted tree, we let $V(\mathbb{T})$ be the set of its vertices (including the root), and $\ell : V(\mathbb{T}) \to \mathbb{N}$ be its decoration.

Definition 1.2 (tree partial ordering). Let $\mathbb{T}$ be a decorated rooted tree.
• Letting $v, w \in V(\mathbb{T})$, we say that v connects directly to w, and write $v \to w$ or equivalently $w = v^-$, if (v, w) is an edge oriented downwards from v to w. Note that $v^-$ exists and is unique except if v is the root.
• If $v_m \to v_{m-1} \to \ldots \to v_1$, then we shall write $v_m \twoheadrightarrow v_1$, and say that $v_m$ connects to $v_1$. By definition, all vertices (except the root) connect to the root.
• Let $(v_1, \ldots, v_{|V(\mathbb{T})|})$ be an ordering of $V(\mathbb{T})$. Assume that $(v_i \twoheadrightarrow v_j) \Rightarrow (i > j)$; in particular, $v_1$ is the root. Then we shall say that the ordering is compatible with the tree partial ordering defined by $\twoheadrightarrow$.

Definition 1.3 (tree integrals).
(i) Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ be a d-dimensional, compactly supported, smooth path, and $\mathbb{T}$ a rooted tree decorated by $\ell : V(\mathbb{T}) \to \{1, \ldots, d\}$. Then $I^{\mathbb{T}}(\Gamma) : \mathbb{R}^2 \to \mathbb{R}$ is the iterated integral defined as

$$[I^{\mathbb{T}}(\Gamma)]_{ts} := \int_s^t d\Gamma_{x_{v_1}}(\ell(v_1)) \int_s^{x_{v_2^-}} d\Gamma_{x_{v_2}}(\ell(v_2)) \cdots \int_s^{x_{v_{|V(\mathbb{T})|}^-}} d\Gamma_{x_{v_{|V(\mathbb{T})|}}}(\ell(v_{|V(\mathbb{T})|})), \tag{1.1}$$

where $(v_1, \ldots, v_{|V(\mathbb{T})|})$ is any ordering of $V(\mathbb{T})$ compatible with the tree partial ordering. In particular, if $\mathbb{T}$ is a trunk tree with n vertices (see Fig. 1), so that the tree ordering is total, we shall write $I^{\mathbb{T}}(\Gamma) = I_n^{\ell}(\Gamma)$, where

$$[I_n^{\ell}(\Gamma)]_{ts} := \int_s^t d\Gamma_{x_1}(\ell(1)) \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int_s^{x_{n-1}} d\Gamma_{x_n}(\ell(n)). \tag{1.2–1.3}$$

(ii) Multilinear extension. Assume μ is a compactly supported, signed Borel measure on $\mathbb{R}^{V(\mathbb{T})} := \{(x_v)_{v \in V(\mathbb{T})},\ x_v \in \mathbb{R}\}$. Then

$$[I^{\mathbb{T}}(\mu)]_{ts} := \int_s^t \int_s^{x_{v_2^-}} \cdots \int_s^{x_{v_{|V(\mathbb{T})|}^-}} \mu(dx_{v_1}, \ldots, dx_{v_{|V(\mathbb{T})|}}). \tag{1.4}$$
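Fig. 1. Trunk tree with set of vertices {n → n − 1 → . . . → 1}

Fig. 2. Example 1.6. From left to right: $\mathbb{T}^\sigma_1$; $\mathbb{T}^\sigma_2$; $\mathrm{Roo}_{\{2\}} \mathbb{T}^\sigma_1 \otimes \mathrm{Lea}_{\{2\}} \mathbb{T}^\sigma_1$; $\mathrm{Roo}_{\{2,3\}} \mathbb{T}^\sigma_1 \otimes \mathrm{Lea}_{\{2,3\}} \mathbb{T}^\sigma_1$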
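Clearly, the definition of $[I^{\mathbb{T}}(\Gamma)]_{ts}$ given in Eq. (1.1) does not depend on the choice of the ordering $(v_1, \ldots, v_{|V(\mathbb{T})|})$. For instance, consider $\mathbb{T} = \mathbb{T}^\sigma_1$ to be the first tree in Fig. 2. Then

$$[I^{\mathbb{T}}(\Gamma)]_{ts} = \int_s^t d\Gamma_{x_1}(\ell(1)) \left( \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \right) \left( \int_s^{x_1} d\Gamma_{x_3}(\ell(3)) \right) = \int_s^t d\Gamma_{x_1}(\ell(1)) \left( \int_s^{x_1} d\Gamma_{x_2}(\ell(3)) \right) \left( \int_s^{x_1} d\Gamma_{x_3}(\ell(2)) \right). \tag{1.5}$$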
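Definition 1.3 can be turned into a short recursion from the leaves to the root: each vertex v contributes the function $G_v(x) = \int_s^x d\Gamma_y(\ell(v)) \prod_{c \to v} G_c(y)$, and $[I^{\mathbb{T}}(\Gamma)]_{ts} = G_{root}(t)$. The sketch below is only an illustration under simplifying assumptions: the path is polynomial (so that all integrals are exact), and the helper names `p_mul`, `p_deriv`, `p_eval`, `p_integ_from` and `tree_integral` are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists [c0, c1, ...], handled with exact rationals.
def p_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def p_deriv(a):
    return [k * c for k, c in enumerate(a)][1:] or [Fraction(0)]

def p_eval(a, x):
    return sum(c * x**k for k, c in enumerate(a))

def p_integ_from(a, s):
    """Antiderivative of the polynomial a that vanishes at s."""
    prim = [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(a)]
    prim[0] = -p_eval(prim, s)
    return prim

def tree_integral(path, children, label, root, t, s):
    """[I^T(Gamma)]_{ts} for a polynomial path, computed from the leaves down."""
    def branch(v):
        # G_v(x) = int_s^x dGamma_y(label[v]) * prod over children c of G_c(y)
        integrand = p_deriv(path[label[v]])
        for c in children.get(v, []):
            integrand = p_mul(integrand, branch(c))
        return p_integ_from(integrand, s)
    return p_eval(branch(root), t)

# Assumed test path Gamma_t = (t, t^2, t^3); vertex v carries the decoration v.
path = {1: [0, 1], 2: [0, 0, 1], 3: [0, 0, 0, 1]}
label = {1: 1, 2: 2, 3: 3}
t, s = Fraction(1), Fraction(0)
fork = tree_integral(path, {1: [2, 3]}, label, 1, t, s)        # first tree of Fig. 2
fork_swapped = tree_integral(path, {1: [3, 2]}, label, 1, t, s)
trunk_123 = tree_integral(path, {1: [2], 2: [3]}, label, 1, t, s)
trunk_132 = tree_integral(path, {1: [3], 3: [2]}, label, 1, t, s)
print(fork, fork_swapped, trunk_123 + trunk_132)
```

In this example the two orderings of the branches give the same value, in agreement with Eq. (1.5), and the tree integral coincides with the sum of the two trunk integrals obtained by ordering the branches, as one expects from Fubini's theorem.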
Note that the decoration of $\mathbb{T}$ is required only for (i). In case of ambiguity, we shall also use the decoration-independent notation $I^{\mathbb{T}}\left(\otimes_{v \in V(\mathbb{T})} \Gamma(\ell(v))\right)$ instead of $I^{\mathbb{T}}(\Gamma)$.

The above correspondence extends by multilinearity to the algebra of decorated rooted trees defined by Connes and Kreimer [8], whose definition we now recall.

Definition 1.4 (algebra of decorated rooted trees).
(i) Let $\mathcal{T}$ be the set of decorated rooted trees.
(ii) Let $\mathbf{H}$ be the free commutative algebra over $\mathbb{R}$ generated by $\mathcal{T}$, with unit element denoted by e. If $\mathbb{T}_1, \mathbb{T}_2, \ldots, \mathbb{T}_l$ are decorated rooted trees, then the product $\mathbb{T}_1 \ldots \mathbb{T}_l$ is the forest with connected components $\mathbb{T}_1, \ldots, \mathbb{T}_l$.
(iii) Let $\mathbb{T} = \sum_{l=1}^{L} m_l \mathbb{T}_l \in \mathbf{H}$, where $m_l \in \mathbb{Z}$ and each $\mathbb{T}_l = \mathbb{T}_{l,1} \ldots \mathbb{T}_{l,j_l}$ is a forest whose decorations have values in the set {1, . . . , d}. Then

$$[I^{\mathbb{T}}(\Gamma)]_{ts} := \sum_{l=1}^{L} m_l\, [I^{\mathbb{T}_{l,1}}(\Gamma)]_{ts} \cdots [I^{\mathbb{T}_{l,j_l}}(\Gamma)]_{ts}. \tag{1.6}$$
1.2. Permutation graphs and Fourier normal ordering for smooth paths. As explained briefly in the Introduction, and as we shall see in the next sections, an essential step in our regularization algorithm is to rewrite iterated integrals by permuting the order of integration. We shall prove the following lemma in this subsection:
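Lemma 1.5 (permutation graphs). To every trunk tree $\mathbb{T}_n$ with n vertices and decoration $\ell$, and every permutation $\sigma \in \Sigma_n$, is associated in a canonical way an element $\mathbb{T}^\sigma$ of $\mathbf{H}$ called a permutation graph, such that:
(i)
$$I_n^{\ell}(\Gamma) = I^{\mathbb{T}^\sigma}(\Gamma); \tag{1.7}$$
(ii)
$$\mathbb{T}^\sigma = \sum_{j=1}^{J_\sigma} g(\sigma, j)\, \mathbb{T}^\sigma_j \in \mathbf{H}, \tag{1.8}$$
where $g(\sigma, j) = \pm 1$ and each $\mathbb{T}^\sigma_j$, $j = 1, \ldots, J_\sigma$ is a forest provided by construction with a total ordering compatible with its tree structure, image of the ordering $\{v_1 < \ldots < v_n\}$ of the trunk tree $\mathbb{T}_n$ by the permutation σ. The decoration of $\mathbb{T}^\sigma$ is $\ell \circ \sigma$.

Proof. Let $\sigma \in \Sigma_n$. Applying Fubini’s theorem yields

$$[I_n^{\ell}(\Gamma)]_{ts} = \int_s^t d\Gamma_{x_1}(\ell(1)) \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int_s^{x_{n-1}} d\Gamma_{x_n}(\ell(n)) = \int_{s_1}^{t_1} d\Gamma_{x_{\sigma(1)}}(\ell(\sigma(1))) \int_{s_2}^{t_2} d\Gamma_{x_{\sigma(2)}}(\ell(\sigma(2))) \cdots \int_{s_n}^{t_n} d\Gamma_{x_{\sigma(n)}}(\ell(\sigma(n))), \tag{1.9}$$

with $s_1 = s$, $t_1 = t$, and for some suitable choice of $s_j \in \{s\} \cup \{x_{\sigma(i)}, i < j\}$, $t_j \in \{t\} \cup \{x_{\sigma(i)}, i < j\}$ ($j \ge 2$). Now decompose $\int_{s_j}^{t_j} d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j)))$ into $\left( \int_s^{t_j} - \int_s^{s_j} \right) d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j)))$ if $s_j \ne s$, $t_j \ne t$, and $\int_{s_j}^{t} d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j)))$ into $\left( \int_s^{t} - \int_s^{s_j} \right) d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j)))$ if $s_j \ne s$. Then $I_n^{\ell}(\Gamma)$ has been rewritten as a sum of terms of the form

$$\pm \int_s^{\tau_1} d\Gamma_{x_1}(\ell(\sigma(1))) \int_s^{\tau_2} d\Gamma_{x_2}(\ell(\sigma(2))) \cdots \int_s^{\tau_n} d\Gamma_{x_n}(\ell(\sigma(n))), \tag{1.10}$$

where $\tau_1 = t$ and $\tau_j \in \{t\} \cup \{x_i, i < j\}$, $j = 2, \ldots, n$. Note the renaming of variables and vertices from Eq. (1.9) to Eq. (1.10). Encoding each of these expressions by the forest $\mathbb{T}$ with set of vertices $V(\mathbb{T}) = \{1, \ldots, n\}$, label function $\ell \circ \sigma$, roots $\{j = 1, \ldots, n \mid \tau_j = t\}$, and oriented edges $\{(j, j^-) \mid j = 2, \ldots, n,\ \tau_j = x_{j^-}\}$, yields

$$I_n^{\ell}(\Gamma) = I^{\mathbb{T}^\sigma}(\Gamma) \tag{1.11}$$

for some $\mathbb{T}^\sigma \in \mathbf{H}$ as in Eq. (1.8).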
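The bookkeeping in this proof is purely combinatorial and can be sketched in a few lines. The code below is an illustration only, with a hypothetical function name and data layout: a permutation is given as a tuple with `sigma[j-1]` = σ(j), and each forest is returned as a sign together with a map from the renamed vertices 1, . . . , n to their parent (`None` for a root). It relies on the fact that, on the simplex $t \ge x_1 \ge \ldots \ge x_n \ge s$, the variable $x_m$ is bounded above by the already integrated $x_k$ with the largest k < m (or by t), and below by the already integrated $x_k$ with the smallest k > m (or by s).

```python
from itertools import product

def permutation_graph(sigma):
    """Signed forests T^sigma_j of Lemma 1.5 as a list of (sign, parent) pairs."""
    n = len(sigma)
    choices = []
    for j in range(1, n + 1):
        m = sigma[j - 1]                                   # x_m is integrated at step j
        earlier = {sigma[i - 1]: i for i in range(1, j)}   # original index -> new name
        smaller = [k for k in earlier if k < m]            # variables lying above x_m
        larger = [k for k in earlier if k > m]             # variables lying below x_m
        upper = earlier[max(smaller)] if smaller else None  # None stands for t
        lower = earlier[min(larger)] if larger else None    # None stands for s
        # int_{s_j}^{t_j} = int_s^{t_j} - int_s^{s_j}; the second term only if s_j != s
        options = [(+1, upper)]
        if lower is not None:
            options.append((-1, lower))
        choices.append(options)
    terms = []
    for combo in product(*choices):
        sign, parent = 1, {}
        for j, (sg, par) in enumerate(combo, start=1):
            sign *= sg
            parent[j] = par
        terms.append((sign, parent))
    return terms

for sign, parent in permutation_graph((2, 3, 1)):
    print(sign, parent)
```

For σ = (2, 3, 1) the sketch returns two terms: a forest made of the trunk 2 → 1 and the isolated vertex 3 with sign +1, and a tree with root 1 and two branches 2 and 3 with sign −1.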
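Example 1.6. Let $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. Then

$$\int_s^t d\Gamma_{x_1}(\ell(1)) \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \int_s^{x_2} d\Gamma_{x_3}(\ell(3)) = - \int_s^t d\Gamma_{x_2}(\ell(2)) \int_s^{x_2} d\Gamma_{x_3}(\ell(3)) \int_s^{x_2} d\Gamma_{x_1}(\ell(1)) + \int_s^t d\Gamma_{x_2}(\ell(2)) \int_s^{x_2} d\Gamma_{x_3}(\ell(3)) \cdot \int_s^t d\Gamma_{x_1}(\ell(1)). \tag{1.12}$$

Hence $\mathbb{T}^\sigma = -\mathbb{T}^\sigma_1 + \mathbb{T}^\sigma_2$ is the sum of a tree and of a forest with two components. See Fig. 2, where variables and vertices have been renamed according to the permutation σ.

1.3. Tree Chen property and coproduct structure. The Chen property (ii), see Introduction, may be generalized to tree iterated integrals by using the coproduct structure of $\mathbf{H}$, as explained in [8]. It is an essential feature of our algorithm since it implies the possibility to reconstruct a rough path $\boldsymbol{\Gamma}$ from the quantities $t \mapsto \boldsymbol{\Gamma}^n_{t s_0}$ with fixed $s_0$. This idea will be pursued further in the next subsection, where we shall introduce a variant of these iterated integrals with fixed $s_0$ called skeleton integrals.

Definition 1.7 (admissible cuts). (see [8], Sect. 2).
1. Let $\mathbb{T}$ be a tree, with set of vertices $V(\mathbb{T})$ and root denoted by 0. If $\boldsymbol{v} = (v_1, \ldots, v_J)$, $J \ge 1$ is any totally disconnected subset of $V(\mathbb{T}) \setminus \{0\}$, i.e. $v_i \not\twoheadrightarrow v_j$ for all $i, j = 1, \ldots, J$, then we shall say that $\boldsymbol{v}$ is an admissible cut of $\mathbb{T}$, and write $\boldsymbol{v} \models V(\mathbb{T})$. We let $\mathrm{Lea}_{\boldsymbol{v}} \mathbb{T}$ (read: leaves of $\mathbb{T}$) be the sub-forest (or sub-tree if J = 1) obtained by keeping only the vertices above $\boldsymbol{v}$, i.e. $V(\mathrm{Lea}_{\boldsymbol{v}} \mathbb{T}) = \boldsymbol{v} \cup \{w \in V(\mathbb{T}) : \exists j = 1, \ldots, J,\ w \twoheadrightarrow v_j\}$, and $\mathrm{Roo}_{\boldsymbol{v}} \mathbb{T}$ (read: root part of $\mathbb{T}$) be the sub-tree obtained by keeping all other vertices.
2. Let $\mathbb{T} = \mathbb{T}_1 \ldots \mathbb{T}_l$ be a forest, together with its decomposition into trees. Then an admissible cut of $\mathbb{T}$ is a disjoint union $\boldsymbol{v}_1 \cup \ldots \cup \boldsymbol{v}_l$, $\boldsymbol{v}_i \subset \mathbb{T}_i$, where $\boldsymbol{v}_i$ is either ∅, $\{0_i\}$ (root of $\mathbb{T}_i$) or an admissible cut of $\mathbb{T}_i$; by convention, the two trivial cuts ∅ ∪ . . . ∪ ∅ and $\{0_1\} \cup \ldots \cup \{0_l\}$ are excluded. By definition, we let $\mathrm{Roo}_{\boldsymbol{v}} \mathbb{T} = \mathrm{Roo}_{\boldsymbol{v}_1} \mathbb{T}_1 \ldots \mathrm{Roo}_{\boldsymbol{v}_l} \mathbb{T}_l$, $\mathrm{Lea}_{\boldsymbol{v}} \mathbb{T} = \mathrm{Lea}_{\boldsymbol{v}_1} \mathbb{T}_1 \ldots \mathrm{Lea}_{\boldsymbol{v}_l} \mathbb{T}_l$ (if $\boldsymbol{v}_i = \emptyset$, resp. $\{0_i\}$, then $(\mathrm{Roo}_{\boldsymbol{v}_i} \mathbb{T}_i, \mathrm{Lea}_{\boldsymbol{v}_i} \mathbb{T}_i) := (\mathbb{T}_i, \emptyset)$, resp. $(\emptyset, \mathbb{T}_i)$).

See Figs. 3, 4 and 2. Defining the co-product operation

$$\Delta : \mathbf{H} \to \mathbf{H} \otimes \mathbf{H}, \quad \mathbb{T} \mapsto e \otimes \mathbb{T} + \mathbb{T} \otimes e + \sum_{\boldsymbol{v} \models V(\mathbb{T})} \mathrm{Roo}_{\boldsymbol{v}} \mathbb{T} \otimes \mathrm{Lea}_{\boldsymbol{v}} \mathbb{T}, \tag{1.13}$$

where e stands for the unit element, yields a coalgebra structure on $\mathbf{H}$. One may also define an antipode S, which makes $\mathbf{H}$ a Hopf algebra (see Sect. 3 for more details). We may now state the tree Chen property. Recall from the Introduction that $[\delta f]_{tus} := f_{ts} - f_{tu} - f_{us}$ if f is a function of two variables.

Proposition 1.8 (tree Chen property). (See [20] or [16]). Let $\mathbb{T}$ be a forest, then

$$[\delta I^{\mathbb{T}}(\Gamma)]_{tus} = \sum_{\boldsymbol{v} \models V(\mathbb{T})} [I^{\mathrm{Roo}_{\boldsymbol{v}} \mathbb{T}}(\Gamma)]_{tu}\, [I^{\mathrm{Lea}_{\boldsymbol{v}} \mathbb{T}}(\Gamma)]_{us}. \tag{1.14}$$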
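The admissible cuts of Definition 1.7 (case of a single tree), over which the sums in Eqs. (1.13) and (1.14) run, can be listed by brute force. The sketch below is an illustration only; the tree encoding (a map sending each vertex to its parent, the root being sent to `None`) and the function names are assumptions made for the example.

```python
from itertools import combinations

def below(parent, v):
    """Vertices which v connects to, i.e. those on the way from v to the root."""
    out = []
    while parent[v] is not None:
        v = parent[v]
        out.append(v)
    return out

def admissible_cuts(parent):
    """List of (cut, root part, leaves) for all admissible cuts of a tree."""
    non_root = [v for v in parent if parent[v] is not None]
    cuts = []
    for size in range(1, len(non_root) + 1):
        for cut in combinations(non_root, size):
            # totally disconnected: no vertex of the cut connects to another one
            if any(w in below(parent, u) for u in cut for w in cut):
                continue
            lea = {w for w in parent
                   if w in cut or any(a in cut for a in below(parent, w))}
            roo = set(parent) - lea
            cuts.append((set(cut), roo, lea))
    return cuts

fork = {1: None, 2: 1, 3: 1}    # root 1 with two branches, as the first tree of Fig. 2
trunk = {1: None, 2: 1, 3: 2}   # trunk tree 3 -> 2 -> 1
for cut, roo, lea in admissible_cuts(fork):
    print(cut, roo, lea)
```

For the tree with two branches this gives the three cuts {2}, {3} and {2, 3}; for a trunk tree with n vertices it gives the n − 1 single-vertex cuts, in agreement with the usual Chen property.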
Fig. 3. Admissible cut

Fig. 4. Non-admissible cut
This proposition is illustrated in the discussion following Lemma 1.12 in the upcoming paragraph.

1.4. Skeleton integrals. We now introduce a variant of tree iterated integrals that we call tree skeleton integrals, or simply skeleton integrals. We explain after Eq. (1.23) below the reason why we shall use skeleton integrals instead of the usual iterated integrals as building stones for our construction.

Definition 1.9 (formal integral). Let $f : \mathbb{R} \to \mathbb{R}$ be a smooth, compactly supported function such that $\mathcal{F}f(0) = 0$. Then the formal integral $\int^t f$ of f is defined as

$$\int^t f := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mathcal{F}f)(\xi)\, \frac{e^{it\xi}}{i\xi}\, d\xi. \tag{1.15}$$

The condition $\mathcal{F}f(0) = 0$ prevents possible infra-red divergence when ξ → 0. Note that

$$\int^t f - \int^s f = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mathcal{F}f)(\xi) \left( \int_s^t e^{ix\xi}\, dx \right) d\xi = \int_s^t f(x)\, dx \tag{1.16}$$

by the Fourier inversion formula, so $\int^t f$ is an anti-derivative of f. Formally one may write, as an equality of distributions:

$$\int^t e^{ix\xi}\, dx = \int_{\pm i\infty}^t e^{ix\xi}\, dx = \frac{e^{it\xi}}{i\xi}, \tag{1.17}$$

since $\int_{-\infty}^{+\infty} \frac{e^{ix\xi}}{i\xi}\, \phi(\xi)\, d\xi \to_{x \to \infty} 0$ for any test function φ such that φ(0) = 0. Hence

$$\int^t f = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} d\xi\, (\mathcal{F}f)(\xi) \int^t e^{ix\xi}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mathcal{F}f)(\xi)\, \frac{e^{it\xi}}{i\xi}\, d\xi, \tag{1.18}$$

in coherence with Eq. (1.15).
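Formula (1.15) can be tested numerically on a function whose Fourier transform is explicit. The sketch below is an illustration under stated assumptions: the test function $f(x) = -x e^{-x^2/2} = g'(x)$, with $g(x) = e^{-x^2/2}$, is rapidly decreasing rather than compactly supported, its Fourier transform $(\mathcal{F}f)(\xi) = i\xi e^{-\xi^2/2}$ follows from the convention (0.6), and the function names, the frequency cut-off and the step size are arbitrary choices.

```python
import cmath
import math

def formal_integral(fourier_f, t, cutoff=12.0, step=0.01):
    """Midpoint-rule approximation of the right-hand side of Eq. (1.15)."""
    n = int(round(cutoff / step))
    total = 0j
    for k in range(-n, n):
        xi = (k + 0.5) * step   # midpoints of the grid never hit xi = 0
        total += fourier_f(xi) * cmath.exp(1j * t * xi) / (1j * xi)
    return total * step / math.sqrt(2 * math.pi)

def fourier_f(xi):
    # (F f)(xi) = i xi exp(-xi^2 / 2) for the assumed f = g'; it vanishes at xi = 0
    return 1j * xi * math.exp(-xi * xi / 2)

def g(x):
    return math.exp(-x * x / 2)

t, s = 0.7, -1.1
print(formal_integral(fourier_f, t).real, g(t))
print((formal_integral(fourier_f, t) - formal_integral(fourier_f, s)).real, g(t) - g(s))
```

In this example the formal integral reproduces g itself, that is, the anti-derivative of f which vanishes at infinity, and the difference of two formal integrals gives $\int_s^t f(x)\,dx = g(t) - g(s)$ as in Eq. (1.16).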
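Definition 1.10 (skeleton integrals).
(i) Let $\mathbb{T}$ be a tree with decoration $\ell : \mathbb{T} \to \{1, \ldots, d\}$. Let $(v_1, \ldots, v_{|V(\mathbb{T})|})$ be any ordering of $V(\mathbb{T})$ compatible with the tree partial ordering. Then the skeleton integral of $\Gamma$ along $\mathbb{T}$ is by definition

$$[\mathrm{SkI}^{\mathbb{T}}(\Gamma)]_t := \int^t d\Gamma_{x_{v_1}}(\ell(v_1)) \int^{x_{v_2^-}} d\Gamma_{x_{v_2}}(\ell(v_2)) \cdots \int^{x_{v_{|V(\mathbb{T})|}^-}} d\Gamma_{x_{v_{|V(\mathbb{T})|}}}(\ell(v_{|V(\mathbb{T})|})). \tag{1.19}$$

(ii) Extension to forests. Let $\mathbb{T} = \mathbb{T}_1 \ldots \mathbb{T}_l$ be a forest, with its tree decomposition. Then one defines

$$[\mathrm{SkI}^{\mathbb{T}}(\Gamma)]_t := \prod_{j=1}^{l} [\mathrm{SkI}^{\mathbb{T}_j}(\Gamma)]_t. \tag{1.20}$$

(iii) Multilinear extension, see Definition 1.3. Assume $\mathbb{T}$ is a subtree of $\tilde{\mathbb{T}}$, and μ a compactly supported, signed Borel measure on $\mathbb{R}^{\tilde{\mathbb{T}}} := \{(x_v)_{v \in V(\tilde{\mathbb{T}})},\ x_v \in \mathbb{R}\}$. Then

$$[\mathrm{SkI}^{\mathbb{T}}(\mu)]_t := \int^t \int^{x_{v_2^-}} \cdots \int^{x_{v_{|V(\mathbb{T})|}^-}} \mu(dx_{v_1}, \ldots, dx_{v_{|V(\mathbb{T})|}}) \tag{1.21}$$

is a signed Borel measure on $\{(x_v)_{v \in V(\tilde{\mathbb{T}}) \setminus V(\mathbb{T})},\ x_v \in \mathbb{R}\}$.

Formally again, $[\mathrm{SkI}^{\mathbb{T}}(\Gamma)]_t$ may be seen as $[I^{\mathbb{T}}(\Gamma)]_{t, \pm i\infty}$. Denote by $\hat{\mu}$ the partial Fourier transform of μ with respect to $(x_v)_{v \in V(\mathbb{T})}$, so that

$$\hat{\mu}\left((\xi_v)_{v \in V(\mathbb{T})}, (dx_v)_{v \in V(\tilde{\mathbb{T}}) \setminus V(\mathbb{T})}\right) = (2\pi)^{-|V(\mathbb{T})|/2} \left\langle \mu,\ (x_v)_{v \in V(\mathbb{T})} \mapsto e^{-i \sum_{v \in V(\mathbb{T})} x_v \xi_v} \right\rangle. \tag{1.22}$$

Then

$$[\mathrm{SkI}^{\mathbb{T}}(\mu)]_t = (2\pi)^{-|V(\mathbb{T})|/2} \left\langle \hat{\mu},\ \left[\mathrm{SkI}^{\mathbb{T}}\left((x_v)_{v \in V(\mathbb{T})} \mapsto e^{i \sum_{v \in V(\mathbb{T})} x_v \xi_v}\right)\right]_t \right\rangle. \tag{1.23}$$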
As explained in the previous subsection, tree skeleton integrals are straightforward generalizations of the usual tree iterated integrals. They are very natural when computing in Fourier coordinates, because every successive integration brings about a new ξ-factor in the denominator, allowing easy Hölder estimates using Besov norms (see the Appendix). On the contrary, $\int_0^t e^{ix\xi}\, dx = \frac{e^{it\xi}}{i\xi} - \frac{1}{i\xi}$ contains a constant term $-\frac{1}{i\xi}$ which does not improve when one integrates again.

It is the purpose of Sect. 3 to show that a rough path over an α-Hölder path $\Gamma$ may be obtained from adequately regularized tree skeleton integrals, using the following tree skeleton decomposition, which is a variant of the tree Chen property recalled in Proposition 1.8 above.

Definition 1.11 (multiple cut). Let $\boldsymbol{v} \subset V(\mathbb{T})$, $\boldsymbol{v} \ne \emptyset$. If $w \in \boldsymbol{v}$, one calls $\mathrm{Lev}(w) := 1 + |\{w' \in \boldsymbol{v};\ w \twoheadrightarrow w'\}|$ the level of w. If $\boldsymbol{v} \models V(\mathbb{T})$ is an admissible cut, then Lev(w) = 1 for all $w \in \boldsymbol{v}$. Quite generally, letting $\mathrm{Lev}(\boldsymbol{v}) = \max\{\mathrm{Lev}(w);\ w \in \boldsymbol{v}\}$, one writes $\boldsymbol{v}^j := \{w \in \boldsymbol{v};\ \mathrm{Lev}(w) = j\}$ for $1 \le j \le \mathrm{Lev}(\boldsymbol{v})$, and calls $(\boldsymbol{v}^j)_{j=1,\ldots,\mathrm{Lev}(\boldsymbol{v})}$ the level decomposition of $\boldsymbol{v}$ considered as a multiple cut. One shall also write: $\boldsymbol{v}^1 \models \ldots \models \boldsymbol{v}^{\mathrm{Lev}(\boldsymbol{v})} \models V(\mathbb{T})$, since $\boldsymbol{v}^{\mathrm{Lev}(\boldsymbol{v})} \models V(\mathbb{T})$ and each $\boldsymbol{v}^j$, $j = 1, \ldots, \mathrm{Lev}(\boldsymbol{v}) - 1$ is an admissible cut of $\mathrm{Roo}_{\boldsymbol{v}^{j+1}}(\mathbb{T})$.
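The level decomposition of Definition 1.11 is easy to compute. The sketch below is an illustration only: it assumes that Lev(w) counts, besides w itself, the elements of the cut lying between w and the root (the reading under which each $\boldsymbol{v}^j$ is an admissible cut of the root part left by $\boldsymbol{v}^{j+1}$), and it uses a hypothetical encoding of the tree by a map from each vertex to its parent (`None` for the root).

```python
def level_decomposition(parent, cut):
    """Split a non-empty vertex subset into its levels v^1, ..., v^Lev(v)."""
    def below(w):
        # vertices which w connects to, i.e. those between w and the root
        out = set()
        while parent[w] is not None:
            w = parent[w]
            out.add(w)
        return out
    cut = set(cut)
    lev = {w: 1 + len(below(w) & cut) for w in cut}
    top = max(lev.values())
    return [{w for w in cut if lev[w] == j} for j in range(1, top + 1)]

trunk = {1: None, 2: 1, 3: 2, 4: 3}   # trunk tree 4 -> 3 -> 2 -> 1
fork = {1: None, 2: 1, 3: 1}          # root 1 with two branches
print(level_decomposition(trunk, {2, 4}))   # two levels
print(level_decomposition(fork, {2, 3}))    # an admissible cut has a single level
```

On the trunk tree, the subset {2, 4} splits into $\boldsymbol{v}^1 = \{2\}$ and $\boldsymbol{v}^2 = \{4\}$: cutting first at 4 leaves the root part {1, 2, 3}, of which {2} is an admissible cut.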
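Lemma 1.12 (tree skeleton decomposition). Let $\mathbb{T}$ be a tree. Then:
(i) Recursive version.

$$[I^{\mathbb{T}}(\Gamma)]_{tu} = [\delta \mathrm{SkI}^{\mathbb{T}}(\Gamma)]_{tu} - \sum_{\boldsymbol{v} \models V(\mathbb{T})} [I^{\mathrm{Roo}_{\boldsymbol{v}} \mathbb{T}}(\Gamma)]_{tu} \,.\, [\mathrm{SkI}^{\mathrm{Lea}_{\boldsymbol{v}} \mathbb{T}}(\Gamma)]_u, \tag{1.24}$$

(ii) Non-recursive version.

$$[I^{\mathbb{T}}(\Gamma)]_{tu} = [\delta \mathrm{SkI}^{\mathbb{T}}(\Gamma)]_{tu} + \sum_{l \ge 1} \sum_{\boldsymbol{v}^1 \models \ldots \models \boldsymbol{v}^l \models V(\mathbb{T})} (-1)^{|\boldsymbol{v}^1| + \ldots + |\boldsymbol{v}^l|}\, [\delta \mathrm{SkI}^{\mathrm{Roo}_{\boldsymbol{v}^1}(\mathbb{T})}(\Gamma)]_{tu} \prod_{m=1}^{l-1} \left[\mathrm{SkI}^{\mathrm{Lea}_{\boldsymbol{v}^m} \circ \mathrm{Roo}_{\boldsymbol{v}^{m+1}}(\mathbb{T})}(\Gamma)\right]_u [\mathrm{SkI}^{\mathrm{Lea}_{\boldsymbol{v}^l}(\mathbb{T})}(\Gamma)]_u. \tag{1.25}$$

Proof. Same as for Proposition 1.8. Equation (1.24) may formally be seen as a particular case of the Chen property (1.14) by setting s = ±i∞ (see the previous subsection). The non-recursive version may be deduced from the recursive version in a straightforward way.

Let us illustrate these notions in a more pedestrian way for the reader who is not accustomed to tree integrals. Consider for an example the trunk tree $\mathbb{T}_n$ with vertices n → n − 1 → . . . → 1 and decoration $\ell : \{1, \ldots, n\} \to \{1, \ldots, d\}$, and the associated iterated integral

$$[I_n^{\ell}(\Gamma)]_{ts} = [I^{\mathbb{T}_n}(\Gamma)]_{ts} = \int_s^t d\Gamma_{x_1}(\ell(1)) \cdots \int_s^{x_{n-1}} d\Gamma_{x_n}(\ell(n)). \tag{1.26}$$

Cutting $\mathbb{T}_n$ at some vertex v ∈ {2, . . . , n} produces two trees, $\mathrm{Roo}_v \mathbb{T}_n$ and $\mathrm{Lea}_v \mathbb{T}_n$, with respective vertex subsets {1, . . . , v − 1} and {v, . . . , n}. Then the usual Chen property (ii) in the Introduction reads

$$[\delta I^{\mathbb{T}_n}(\Gamma)]_{tus} = \sum_{v \in V(\mathbb{T}_n) \setminus \{1\}} [I^{\mathrm{Roo}_v \mathbb{T}_n}(\Gamma)]_{tu}\, [I^{\mathrm{Lea}_v \mathbb{T}_n}(\Gamma)]_{us}. \tag{1.27}$$

On the other hand, rewrite $[I^{\mathbb{T}_n}(\Gamma)]_{tu}$ as the sum of the increment term, which is a skeleton integral,

$$[\delta \mathrm{SkI}^{\mathbb{T}_n}(\Gamma)]_{tu} = \int^t d\Gamma_{x_1}(\ell(1)) \int^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int^{x_{n-1}} d\Gamma_{x_n}(\ell(n)) - \int^u d\Gamma_{x_1}(\ell(1)) \int^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int^{x_{n-1}} d\Gamma_{x_n}(\ell(n)), \tag{1.28}$$

and of the boundary term

$$[I^{\mathbb{T}_n}(\Gamma)(\partial)]_{tu} := - \sum_{n_1 + n_2 = n} \int_u^t d\Gamma_{x_1}(\ell(1)) \cdots \int_u^{x_{n_1 - 1}} d\Gamma_{x_{n_1}}(\ell(n_1)) \,.\, \int^u d\Gamma_{x_{n_1+1}}(\ell(n_1+1)) \int^{x_{n_1+1}} d\Gamma_{x_{n_1+2}}(\ell(n_1+2)) \cdots \int^{x_{n-1}} d\Gamma_{x_n}(\ell(n)). \tag{1.29}$$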
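The above decomposition is fairly obvious for n = 2 and obtained by easy induction for general n. One has thus obtained the recursive skeleton decomposition property for trunk trees,

$$[I^{\mathbb{T}_n}(\Gamma)]_{tu} = [\delta \mathrm{SkI}^{\mathbb{T}_n}(\Gamma)]_{tu} - \sum_{v \in V(\mathbb{T}_n) \setminus \{1\}} [I^{\mathrm{Roo}_v \mathbb{T}_n}(\Gamma)]_{tu} \,.\, [\mathrm{SkI}^{\mathrm{Lea}_v \mathbb{T}_n}(\Gamma)]_u. \tag{1.30}$$

The non-recursive version of the skeleton decomposition property is a straightforward consequence, and reads in this case

$$[I^{\mathbb{T}_n}(\Gamma)]_{tu} = [\delta \mathrm{SkI}^{\mathbb{T}_n}(\Gamma)]_{tu} + \sum (-1)^l\, [\delta \mathrm{SkI}^{\mathrm{Roo}_{j_1}(\mathbb{T}_n)}(\Gamma)]_{tu} \cdots$$

[…] compatible with its tree ordering, we let $\mathcal{P}^+ := \mathcal{P}^{U_>}$ with $U_> := \{(k_v)_{v \in V(\mathbb{T})} \in \mathbb{Z}^{\mathbb{T}} \mid (v > w) \Rightarrow |k_v| \ge |k_w|\}$.

(iv) Using the Fourier multipliers $D(\tilde{\phi}_{k_v})$ instead of $D(\phi_{k_v})$, see Definition 5.3, define similarly

$$\tilde{\mathcal{P}}^{\{\boldsymbol{k}\}} := \frac{1}{|\Sigma_{\boldsymbol{k}}|} \otimes_{v \in V(\mathbb{T})} D(\tilde{\phi}_{k_v})(\Gamma(\ell(v))), \tag{2.5}$$

where $\Sigma_{\boldsymbol{k}} \subset \Sigma_n$ is the subset of permutations τ such that $|k_{\tau(j)}| = |k_j|$ for every j = 1, . . . , n, and

$$\tilde{\mathcal{P}}^+ := \sum_{\boldsymbol{k} = (k_v)_{v \in V(\mathbb{T})} \in U_>} \tilde{\mathcal{P}}^{\{\boldsymbol{k}\}}(\Gamma). \tag{2.6}$$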
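The two combinatorial ingredients appearing here, membership in $U_>$ and the normalizing factor $|\Sigma_{\boldsymbol{k}}|$, can be made concrete as follows. This is only a sketch under stated assumptions: the forest carries a total ordering, so that a multi-index is simply a tuple $(k_1, \ldots, k_n)$ listed along this ordering, and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def in_U(k):
    """Test |k_1| <= ... <= |k_n|, i.e. (v > w) => |k_v| >= |k_w| for a total ordering."""
    return all(abs(k[j]) <= abs(k[j + 1]) for j in range(len(k) - 1))

def size_sigma_k(k):
    """Number of permutations tau such that |k_tau(j)| = |k_j| for every j."""
    out = 1
    for multiplicity in Counter(abs(x) for x in k).values():
        out *= factorial(multiplicity)
    return out

def size_sigma_k_bruteforce(k):
    """Same count by direct enumeration, only usable for small n."""
    n = len(k)
    return sum(all(abs(k[tau[j]]) == abs(k[j]) for j in range(n))
               for tau in permutations(range(n)))

k = (1, -1, 2)
print(in_U(k), size_sigma_k(k), size_sigma_k_bruteforce(k))
```

The product of factorials of the multiplicities of the values $|k_j|$ agrees with the brute-force count; in particular $|\Sigma_{\boldsymbol{k}}| = 1$ when all the $|k_j|$ are distinct.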
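Remark. By construction, $\mathcal{P}^+ \tilde{\mathcal{P}}^+ = \tilde{\mathcal{P}}^+$ if $\mathcal{P}^+$, $\tilde{\mathcal{P}}^+$ are associated to a total ordering compatible with the tree ordering of $\mathbb{T}$.

Note that $\mathcal{P}^U$ may be considered as a linear operator $\mathcal{P}^U : (B^\alpha_{\infty,\infty})^{\otimes \mathbb{T}} \to (B^\alpha_{\infty,\infty})^{\otimes \mathbb{T}}$, where $(B^\alpha_{\infty,\infty})^{\otimes \mathbb{T}}$ stands for the vector space generated by the monomials $\otimes_{v \in V(\mathbb{T})} f_v$, $f_v \in B^\alpha_{\infty,\infty}$. It is actually a bounded linear operator, as recalled in the Appendix, see Proposition 5.8 and remarks after Proposition 5.2.

We may now proceed to explain our regularization algorithm.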
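• Step 1 (Choice of regularization scheme). Choose for each tree $\mathbb{T} \in \mathcal{T}$ a subset $\mathbb{Z}^{\mathbb{T}}_{reg} \subset \mathbb{Z}^{\mathbb{T}}_+$ such that the series $\sum_{\boldsymbol{k} \in \mathbb{Z}^{\mathbb{T}}_{reg}} [\mathrm{SkI}^{\mathbb{T}}(\mathcal{P}^{\{\boldsymbol{k}\}}(\Gamma))]_t$ converges absolutely for any α-Hölder path $\Gamma$. By assumption $\mathbb{Z}^{\mathbb{T}}_{reg} = \mathbb{Z}$ if $|V(\mathbb{T})| = 1$.

• Step 2. Let $\mathbb{T}$ be a forest equipped with a partial or total ordering compatible with its tree ordering, and $\tilde{\mathcal{P}}^+$ the corresponding projection operator. For $\boldsymbol{k} \in \mathbb{Z}^{\mathbb{T}}_+$, we let the projected regularized skeleton integral be the quantity

$$[\mathcal{R}^{\{\boldsymbol{k}\}} \mathrm{SkI}^{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_t = \mathbf{1}_{\boldsymbol{k} \in \mathbb{Z}^{\mathbb{T}}_{reg}} \cdot [\mathrm{SkI}^{\mathbb{T}}(\mathcal{P}^{\{\boldsymbol{k}\}} \tilde{\mathcal{P}}^+ \Gamma)]_t. \tag{2.7}$$

• Step 3 (Regularized projected tree integral). For $\boldsymbol{k} \in \mathbb{Z}^{\mathbb{T}}_+$, let $[\mathcal{R}^{\{\boldsymbol{k}\}} I^{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts}$ be constructed out of projected regularized skeleton integrals in the following recursive way, as in Lemma 1.12:

$$[\mathcal{R}^{\{\boldsymbol{k}\}} I^{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} := [\delta \mathcal{R}^{\{\boldsymbol{k}\}} \mathrm{SkI}^{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} - \sum_{\boldsymbol{v} \models V(\mathbb{T})} [\mathcal{R}^{\{\mathrm{Roo}_{\boldsymbol{v}}(\boldsymbol{k})\}} I^{\mathrm{Roo}_{\boldsymbol{v}}(\mathbb{T})}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts}\, [\mathcal{R}^{\{\mathrm{Lea}_{\boldsymbol{v}}(\boldsymbol{k})\}} \mathrm{SkI}^{\mathrm{Lea}_{\boldsymbol{v}} \mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_s, \tag{2.8}$$

where $\mathrm{Roo}_{\boldsymbol{v}}(\boldsymbol{k}) = (k_w)_{w \in \mathrm{Roo}_{\boldsymbol{v}}(\mathbb{T})} \in \mathbb{Z}^{\mathrm{Roo}_{\boldsymbol{v}}(\mathbb{T})}$, and $\mathrm{Lea}_{\boldsymbol{v}}(\boldsymbol{k}) = (k_w)_{w \in \mathrm{Lea}_{\boldsymbol{v}}(\mathbb{T})} \in \mathbb{Z}^{\mathrm{Lea}_{\boldsymbol{v}}(\mathbb{T})}$.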
• Step 4 (Generalization to forests). The generalization is straightforward. Namely, if $T = T_1 \ldots T_l$ is a forest, and $k = (k_1, \ldots, k_l) \in \mathbb{Z}^{T_1}_+ \times \ldots \times \mathbb{Z}^{T_l}_+$, we let
$$R^{\{k\}} \mathrm{SkI}_T(\tilde P^+\Gamma) := \prod_{j=1}^{l} R^{\{k_j\}} \mathrm{SkI}_{T_j}(\tilde P^+\Gamma) \qquad (2.9)$$
and similarly
$$R^{\{k\}} I_T(\tilde P^+\Gamma) := \prod_{j=1}^{l} R^{\{k_j\}} I_{T_j}(\tilde P^+\Gamma). \qquad (2.10)$$
Consider a partial or total ordering > on T and denote by $\tilde P^+$ the corresponding projection operator. By summing over all indices $k \in U_>$, one gets the following quantities:
$$R\mathrm{SkI}_T(\tilde P^+\Gamma) := \sum_{k\in U_>} R^{\{k\}} \mathrm{SkI}_T(\tilde P^+\Gamma) \qquad (2.11)$$
(see Definition 2.1), and similarly
$$RI_T(\tilde P^+\Gamma) := \sum_{k\in U_>} R^{\{k\}} I_T(\tilde P^+\Gamma). \qquad (2.12)$$
Observe in particular, using Eq. (2.8), and summing over indices k, that $RI_T(\tilde P^+\Gamma)$ decomposes naturally into the sum of an increment term, which is a regularized skeleton integral, and of a boundary term denoted by the symbol ∂, namely,
$$[RI_T(\tilde P^+\Gamma)]_{ts} = [\delta R\mathrm{SkI}_T(\tilde P^+\Gamma)]_{ts} + [RI_T(\tilde P^+\Gamma)(\partial)]_{ts} . \qquad (2.13)$$
This decomposition is a generalization of that obtained in Subsect. 1.4, see Eq. (1.28) and (1.29). Observe also that we have not defined $R\mathrm{SkI}_T(\Gamma)$, nor $RI_T(\Gamma)$; the regularized integration operators $RI_T$, $R\mathrm{SkI}_T$ only act on Fourier normal ordered projections of paths $\tilde P^+\Gamma$.
• Final step (Fourier normal ordering). Let $T_n$ be a trunk tree with n vertices decorated by $\ell$, and, for each $\sigma \in \Sigma_n$, let $T^\sigma = \sum_{j=1}^{J_\sigma} g(\sigma, j) T^\sigma_j$ be the corresponding permutation graph, as in Lemma 1.5. Each forest $T^\sigma_j$ comes with a total ordering compatible with its tree ordering, which defines a projection operator $\tilde P^+$; we write for short $\tilde P^\sigma \Gamma$ instead of $\tilde P^+(\otimes_{m=1}^n \Gamma(\ell(\sigma(m))))$. Then we let
$$[R\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} := \sum_{\sigma\in\Sigma_n} \sum_{j=1}^{J_\sigma} g(\sigma, j) [RI_{T^\sigma_j}(\tilde P^\sigma\Gamma)]_{ts}$$
$$= \sum_{\sigma\in\Sigma_n} \left( \sum_{k=(k_1,\ldots,k_n)\in\mathbb{Z}^n ;\ |k_{\sigma(1)}|\le\ldots\le|k_{\sigma(n)}|} \ \sum_{j=1}^{J_\sigma} g(\sigma, j) [R^{\{k\circ\sigma\}} I_{T^\sigma_j}(\tilde P^\sigma\Gamma)]_{ts} \right). \qquad (2.14)$$
We shall prove in the next section that RΓ satisfies the Chen (ii) and shuffle (iii) properties of the Introduction. The Hölder property (i) will be proved in Sect. 4 for an adequate choice of subdomains $\mathbb{Z}^T_{reg}$, $T \in \mathcal{T}$, satisfying in particular the property required in Step 1.
Some essential comments are in order.
1. Assume that Γ is smooth, and do not regularize, i.e., choose $\mathbb{Z}^T_{reg} = \mathbb{Z}^T_+$. Then Eq. (2.8) is a recursive definition of the non-regularized projected integral $[I_T(P^{\{k\}} \tilde P^+\Gamma)]_{ts}$, as follows from the tree skeleton decomposition property, see Lemma 1.12. Hence the right-hand side of formula (2.14) reads simply
$$\sum_{\sigma\in\Sigma_n} \ \sum_{k=(k_1,\ldots,k_n)\in\mathbb{Z}^n ;\ |k_{\sigma(1)}|\le\ldots\le|k_{\sigma(n)}|} \ \sum_{j=1}^{J_\sigma} g(\sigma, j) [I_{T^\sigma_j}(P^{\{k\}} \tilde P^\sigma\Gamma)]_{ts} . \qquad (2.15)$$
But this quantity is the usual iterated integral or canonical lift of Γ, $[\Gamma^{cano,n}(\ell(1), \ldots, \ell(n))]_{ts}$, since
$$\sum_{j=1}^{J_\sigma} g(\sigma, j) [I_{T^\sigma_j}(P^{\{k\}} \tilde P^\sigma\Gamma)]_{ts} = [I_{T^\sigma}(P^{\{k\}} \tilde P^\sigma\Gamma)]_{ts} = [I_n(P^{\{k\}} \tilde P^\sigma\Gamma)]_{ts} \qquad (2.16)$$
by Lemma 1.5, and
$$\sum_{\sigma\in\Sigma_n} \ \sum_{k=(k_1,\ldots,k_n)\in\mathbb{Z}^n ;\ |k_{\sigma(1)}|\le\ldots\le|k_{\sigma(n)}|} P^{\{k\}} \tilde P^\sigma(\Gamma) = \sum_{\sigma\in\Sigma_n} P^+ \tilde P^+(\otimes_{m=1}^n \Gamma(\ell(\sigma(m)))) = \sum_{\sigma\in\Sigma_n} \tilde P^+(\otimes_{m=1}^n \Gamma(\ell(\sigma(m)))) = \Gamma , \qquad (2.17)$$
see the Remark after Definition 2.1.
2. Iterated integrals of order 1, $[R\Gamma^1(i)]_{ts}$, 1 ≤ i ≤ d, are not regularized, namely, $[R\Gamma^1(i)]_{ts} = [\Gamma^1(i)]_{ts} = \Gamma_t(i) - \Gamma_s(i)$, because of the assumption in Step 1 which states that $\mathbb{Z}^T_{reg} = \mathbb{Z}$ if |V(T)| = 1. Hence RΓ is a rough path over Γ.
3. We propose a reformulation of this algorithm in a Hopf algebraic language in Lemma 3.5 below. An equivalent algorithm is given in Definition 3.7. The abstract algebraic language of Sect. 3 turns out to be very appropriate to prove the Chen and shuffle properties.

3. Proof of the Geometric and Multiplicative Properties

Let Γ = (Γ(1), . . . , Γ(d)) be an α-Hölder path. This section is dedicated to the proof of the following theorem.
Theorem 3.1. Choose for each tree T a subset $\mathbb{Z}^T_{reg} \subset \mathbb{Z}^T$ such that the condition of Step 1 of the construction in Sect. 2 is satisfied, i.e. such that the regularized rough path RΓ defined in Sect. 2 is well-defined. Then RΓ satisfies the Chen (ii) and shuffle (iii) properties of the Introduction.
This theorem is in fact a consequence of the following very general construction, whose essence is really algebraic. Two Hopf algebras are involved in it: the Hopf algebra of decorated rooted trees H, and the shuffle algebra Sh. As we shall presently see, the first one is related to the Chen property, while the second one is related to the shuffle property. The first paragraph below is devoted to an elementary presentation of these Hopf algebras in connection with the Chen/shuffle property. Theorem 3.1 is proved in the second paragraph.

3.1. Hopf algebras and the Chen and shuffle properties.

1. Let us first consider the Hopf algebra of decorated rooted trees, H. Recall the definition of the coproduct on H,
$$\Delta(T) = e \otimes T + T \otimes e + \sum_{v \models V(T)} \mathrm{Roo}_v T \otimes \mathrm{Lea}_v T. \qquad (3.1)$$
The usual convention [8,9] is to write c (cut) for v, $R^c(T)$ (root part) for $\mathrm{Roo}_v T$, $P^c(T)$ for $\mathrm{Lea}_v T$ (leaves), and to reverse the order of the factors in the tensor product. The convolution of two linear forms f, g on H is written:
$$(f * g)(T) = f(T)g(e) + f(e)g(T) + \sum_{v \models V(T)} f(\mathrm{Roo}_v T)\, g(\mathrm{Lea}_v T), \quad T \in H. \qquad (3.2)$$
This notion is particularly interesting for characters. A character of H is a linear map χ such that $\chi(T_1 . T_2) = \chi(T_1).\chi(T_2)$. If $\chi_1$, $\chi_2$ are two characters of H, then $\chi_1 * \chi_2$ is also a character of H.
The tree Chen property, see Proposition 1.8, may then be stated as follows. Let Γ = (Γ(1), . . . , Γ(d)) be a smooth path, and
$$H^d := \{T \in H ;\ \ell : V(T) \to \{1, \ldots, d\}\} \qquad (3.3)$$
be the subspace of H generated by forests with decoration valued in {1, . . . , d}. Now, define $I^{ts}_\Gamma : H^d \to \mathbb{R}$ to be the following character of H (see Definition 1.3)
$$I^{ts}_\Gamma(T) = [I_T(\Gamma)]_{ts} . \qquad (3.4)$$
Then (as remarked in [20])
$$I^{ts}_\Gamma = I^{tu}_\Gamma * I^{us}_\Gamma . \qquad (3.5)$$
Generalizing this property to the multilinear setting, one may also write
$$I^{ts}_\mu(T) = (I^{tu} * I^{us})_\mu(T) := I^{tu}_\mu(T) + I^{us}_\mu(T) + \sum_{v \models V(T)} I^{tu}_{\mathrm{Roo}_v(\mu)}(\mathrm{Roo}_v(T))\, I^{us}_{\mathrm{Lea}_v(\mu)}(\mathrm{Lea}_v(T)) \qquad (3.6)$$
for a tensor measure $\mu = \otimes_{v\in V(T)} \mu_v$, where $\mathrm{Roo}_v(\mu) := \otimes_{v\in V(\mathrm{Roo}_v(T))} \mu_v$, $\mathrm{Lea}_v(\mu) := \otimes_{v\in V(\mathrm{Lea}_v(T))} \mu_v$, and
$$I^{ts}_\mu(T) := (I^{tu} * I^{us})_\mu(T) := \sum_k (I^{tu} * I^{us})_{\mu_k}(T) \qquad (3.7)$$
for a more general measure $\mu := \sum_k \mu_k$, where each $\mu_k$ is a tensor measure. Later on we shall use these formulas for $\mu_k = 1_{k\in\mathbb{Z}^T_+}\, dP^{\{k\}}(\Gamma)$ or $1_{k\in\mathbb{Z}^T_{reg}}\, dP^{\{k\}}(\Gamma)$.
As for the antipode S, it is the multiplicative morphism S : H → H defined inductively on tree generators T by (see [8], p. 219)
$$S(e) = e ; \qquad S(T) = -T - \sum_{v \models V(T)} \mathrm{Roo}_v T \, . \, S(\mathrm{Lea}_v T). \qquad (3.8)$$
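Applying iteratively the second relation yields an expression of S(T) in terms of multiple cuts of T obtained by 'chopping' it [8], see Def. 1.11, namely,
$$S(T) = -T - \sum_{l\ge 1} \ \sum_{v^1 \models \ldots \models v^l \models V(T)} (-1)^{|v^1|+\ldots+|v^l|}\, \mathrm{Roo}_{v^1}(T) \left[ \prod_{m=1}^{l-1} \mathrm{Lea}_{v^m} \circ \mathrm{Roo}_{v^{m+1}}(T) \right] \mathrm{Lea}_{v^l}(T). \qquad (3.9)$$
Let $\chi_1$, $\chi_2$ be two characters of H. Recall that $\chi_2 \circ S$ is the convolution inverse of $\chi_2$, namely, $\chi_2 \circ S$ is a character and $\chi_2 * (\chi_2 \circ S) = \bar e$, where $\bar e$ is the counit of H, defined on generators by $\bar e(e) = 1$ and $\bar e(T) = 0$ if T is a forest. Now Eq. (3.2) and (3.9) yield
$$\chi_1 * (\chi_2 \circ S)(T) = \chi_1(T) + \chi_2 \circ S(T) + \sum_{v \models V(T)} \chi_1(\mathrm{Roo}_v(T))\, \chi_2 \circ S(\mathrm{Lea}_v(T))$$
$$= (\chi_1 - \chi_2)(T) + \sum_{v \models V(T)} (\chi_1 - \chi_2)(\mathrm{Roo}_v(T))\, \chi_2 \circ S(\mathrm{Lea}_v(T))$$
$$= (\chi_1 - \chi_2)(T) + \sum_{l\ge 1} \ \sum_{v=(v^1,\ldots,v^l)} (-1)^{|v^1|+\ldots+|v^l|}\, (\chi_1 - \chi_2)(\mathrm{Roo}_{v^1}(T)) \times \left[ \prod_{m=1}^{l-1} \chi_2(\mathrm{Lea}_{v^m} \circ \mathrm{Roo}_{v^{m+1}}(T)) \right] \chi_2(\mathrm{Lea}_{v^l}(T)), \qquad (3.10)$$
where $v = (v^1, \ldots, v^l)$ is a multiple cut of T as in Eq. (3.9).
In particular, let $\mathrm{SkI}^t_\Gamma : H \to \mathbb{R}$ be the character defined by (see Definition 1.10)
$$\mathrm{SkI}^t_\Gamma(T) = [\mathrm{SkI}_T(\Gamma)]_t . \qquad (3.11)$$
Then the tree skeleton decomposition, see Lemma 1.12, reads simply
$$I^{tu}_\Gamma = \mathrm{SkI}^t_\Gamma * (\mathrm{SkI}^u_\Gamma \circ S). \qquad (3.12)$$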
2. The shuffle algebra over the index set $\mathbb{N}$ [24] may be defined as follows. The algebra Sh is generated as a vector space over $\mathbb{R}$ by the identity e and by the trunk trees $(T_n)_{n\ge 1}$ with vertex set $V(T_n) = \{v_1 < \ldots < v_n\}$, provided with an $\mathbb{N}$-valued decoration $\ell$. Let $T_n$, $T'_{n'}$ be trunk trees with n, resp. n' vertices. The shuffle product of $T_n$ and $T'_{n'}$ is the formal sum
$$T_n \sqcup\!\sqcup\, T'_{n'} = \sum_{\varepsilon\in \mathrm{Sh}((V(T_n), V(T'_{n'})))} \varepsilon\big(T_n^{T'_{n'}}\big), \qquad (3.13)$$
where $T_n^{T'_{n'}}$ is the trunk tree with n + n' vertices obtained by putting $T'_{n'}$ on top of $T_n$, and the shuffle ε permutes the decorations of $T_n$, $T'_{n'}$ as in property (iii) discussed in the Introduction. Let $\mathrm{Sh}^d$ be the subspace of Sh generated by trunk trees with decoration valued in {1, . . . , d}. Then the shuffle property for iterated integrals reads
$$I^{ts}_\Gamma(T_n)\, I^{ts}_\Gamma(T'_{n'}) = I^{ts}_\Gamma(T_n \sqcup\!\sqcup\, T'_{n'}), \quad T_n, T'_{n'} \in \mathrm{Sh}^d . \qquad (3.14)$$
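The combinatorics of the shuffle product can be made concrete on words. The following is a minimal sketch, assuming that a decorated trunk tree is encoded by the tuple of its decorations read from the root upwards (an encoding chosen here for illustration); `shuffles` and `shuffle_product` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations
from math import comb


def shuffles(u, v):
    """All interleavings of the words u and v keeping each word's internal order."""
    n, m = len(u), len(v)
    result = []
    for positions in combinations(range(n + m), n):
        chosen = set(positions)  # slots of the merged word filled by letters of u
        iu, iv = iter(u), iter(v)
        word = tuple(next(iu) if i in chosen else next(iv) for i in range(n + m))
        result.append(word)
    return result


def shuffle_product(u, v):
    """Formal sum of all shuffles, stored as {word: multiplicity}."""
    return Counter(shuffles(u, v))


product = shuffle_product((1, 2), (3,))
print(sorted(product.items()))
```

The number of shuffles of words of lengths n and n' is the binomial coefficient C(n + n', n); repeated decorations produce multiplicities, e.g. the word (1) shuffled with itself gives twice the word (1, 1).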
In other words, it may be stated by saying that $I^{ts}_\Gamma : T_n \mapsto [I_{T_n}(\Gamma)]_{ts}$ is a character of Sh. Similarly, skeleton integrals $\mathrm{SkI}^t_\Gamma : T_n \mapsto [\mathrm{SkI}_{T_n}(\Gamma)]_t$ also define characters of Sh.
The shuffle algebra Sh is made into a Hopf algebra by re-using the same coproduct $\Delta : T \mapsto T \otimes e + e \otimes T + \sum_{v \models V(T)} \mathrm{Roo}_v T \otimes \mathrm{Lea}_v T$ as for H, and defining the antipode $\bar S$ as $\bar S(T_n) = (-1)^n \bar T_n$, where $\bar T_n$ is obtained from $T_n$ by reversing the ordering of the vertices, $\ell_{\bar T_n}(v_j) = \ell_{T_n}(v_{n+1-j})$. The convolution of linear forms or characters f, g on Sh is given by the same formula as for H.
Proposition 3.1 [24]. The linear morphism Θ : H → Sh defined by $\Theta(T) = \sum_j T_j$, where $T_j$ ranges over all trunk trees $\{v_1 < \ldots < v_{|V(T)|}\}$ such that the corresponding total ordering of vertices of T is compatible with its tree partial ordering, is a Hopf algebra map.
Θ is actually onto. In other words, it is a structure-preserving projection, with the canonical identification of Sh as a subspace of T. Note that $[I_T(\Gamma)]_{ts} = [\mathrm{SkI}_T(\Gamma)]_{ts} = 0$ if T ∈ Ker(Θ) and Γ is an arbitrary smooth path, which is a straightforward generalization of the shuffle property; one may call this the tree shuffle property.
Corollary 3.2. Let $\bar\chi$ be a character of Sh. Then $\chi := \bar\chi \circ \Theta$ is a character of H. If T ∈ Sh, then $\chi \circ S(T) = \bar\chi \circ \bar S(T)$.

3.2. Proof of the Chen and shuffle properties.

We shall now prove Theorem 3.1. In the next pages, $\mathrm{Meas}(\mathbb{R}^n)$ stands for the space of compactly supported, signed Borel measures on $\mathbb{R}^n$.
Let us explain the strategy of the proof. We give a general method to construct families of characters of the shuffle algebra, $\bar\chi^t_\Gamma$, depending on a path Γ, see Lemma 3.6; these quantities satisfy the shuffle property by Eq. (3.14). Then $\bar\chi^t_\Gamma * (\bar\chi^s_\Gamma \circ \bar S)$ is immediately seen to define a rough path satisfying both the Chen and shuffle properties, see Definition 3.7. For a particular choice of the characters $\bar\chi^t_\Gamma$ related to the regularized skeleton integrals defined in Sect. 2, the rough path of Definition 3.7 is shown to coincide with the regularized rough path RΓ of Sect. 2, see Lemma 3.8. In order to prove this last lemma, one needs a Hopf algebraic reformulation of the Fourier normal ordering algorithm leading to RΓ, see Lemma 3.5.
Lemma 3.3 (measure splitting). Let $\mu \in \mathrm{Meas}(\mathbb{R}^n)$. Then
$$\mu = \sum_{\sigma\in\Sigma_n} \mu^\sigma \circ \sigma, \qquad (3.15)$$
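where $\mu^\sigma \in \tilde P^+ \mathrm{Meas}(\mathbb{R}^n)$ is defined by
$$\mu^\sigma := \sum_{k=(k_1,\ldots,k_n)\in\mathbb{Z}^n ;\ |k_{\sigma(1)}|\le\ldots\le|k_{\sigma(n)}|} (\tilde P^{\{k\}} \mu) \circ \sigma \qquad (3.16)$$
as in Eq. (2.14).
Proof. See Eq. (2.17).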
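Definition 3.4. (i) Let $F^+_{n,n} \subset H$ (n ≥ 1) be the set of all forests T with n vertices and one-to-one decoration $\ell : V(T) \to \{1, \ldots, n\}$ valued in the set {1, . . . , n}, such that $(v \twoheadrightarrow w) \Rightarrow \ell(v) \ge \ell(w)$, and $H^+_{n,n} \subset H$ the vector space generated by $F^+_{n,n}$.
(ii) If $T \in F^+_{n,n}$, let $\tilde P^{+,T} \mathrm{Meas}(\mathbb{R}^n)$ denote the subspace $\{\tilde P^{+,T}\mu ;\ \mu \in \mathrm{Meas}(\mathbb{R}^n)\}$, see Sect. 2 for a definition of the projection operator $\tilde P^{+,T}$.
(iii) Let $\phi^t_T : \tilde P^{+,T} \mathrm{Meas}(\mathbb{R}^n) \to \mathbb{R}$, $\mu \mapsto \phi^t_T(\mu)$, also written $\phi^t_\mu(T)$ ($t \in \mathbb{R}$, $T \in F^+_{n,n}$), be a family of linear forms such that, if $(T_i, \mu_i) \in F^+_{n_i,n_i} \times \tilde P^{+,T_i} \mathrm{Meas}(\mathbb{R}^{n_i})$, i = 1, 2, the following H-multiplicative property holds,
$$\phi^t_{\mu_1}(T_1)\, \phi^t_{\mu_2}(T_2) = \phi^t_{\mu_1\otimes\mu_2}(T_1 \wedge T_2), \qquad (3.17)$$
where $T_1 \wedge T_2 \in F^+_{n_1+n_2,n_1+n_2}$ is the forest $T_1 . T_2$ with decoration $\ell|_{T_1} = \ell_1$, $\ell|_{T_2} = n_1 + \ell_2$ ($\ell_i$ = decoration of $T_i$, i = 1, 2), and $\mu_1 \otimes \mu_2 \in \tilde P^{+,T_1\wedge T_2} \mathrm{Meas}(\mathbb{R}^{n_1+n_2})$ is the tensor measure $\mu_1 \otimes \mu_2(dx_1, \ldots, dx_{n_1+n_2}) = \mu_1(dx_1, \ldots, dx_{n_1})\, \mu_2(dx_{n_1+1}, \ldots, dx_{n_1+n_2})$.
(iv) Let, for Γ = (Γ(1), . . . , Γ(d)), $\bar\chi^t_\Gamma : \mathrm{Sh}^d \to \mathbb{R}$ be the linear form on $\mathrm{Sh}^d$ defined by
$$\bar\chi^t_\Gamma(T_n) := \sum_{\sigma\in\Sigma_n} \phi^t_{\mu^\sigma}(T^\sigma), \qquad (3.18)$$
where – $\ell$ being the decoration of $T_n$ – one has set $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$, and $T^\sigma$ is the permutation graph associated to σ (see Subsect. 1.2).
Remarks.
1. Note that the H-multiplicative property (3.17) holds in particular for $\phi^t_T = [\mathrm{SkI}_T(\,.\,)]_t$ or $[R\mathrm{SkI}_T(\,.\,)]_t$, either trivially or by construction (see Step 4 in the construction of Sect. 2). Note that $[R\mathrm{SkI}_T(\mu)]_t$ has been defined only if $\mu \in \tilde P^+ \mathrm{Meas}(\mathbb{R}^n)$. If $\phi^t_T = [\mathrm{SkI}_T(\,.\,)]_t$, then simply $\bar\chi^t_\Gamma(T_n) = [\mathrm{SkI}_{T_n}(\Gamma)]_t$ by the measure splitting lemma.
2. Assume $\mu_i \in \tilde P^+ \mathrm{Meas}(\mathbb{R}^{n_i}) \subset \tilde P^{+,T_i} \mathrm{Meas}(\mathbb{R}^{n_i})$, where $\tilde P^+$ is the $\tilde P$-projection associated to the subset $\mathbb{Z}^{n_i}_+ := \{k = (k_1, \ldots, k_{n_i}) ;\ |k_1| \le \ldots \le |k_{n_i}|\}$ (i = 1, 2). Then $\mu_1 \otimes \mu_2 \in \tilde P^{+,T_1\wedge T_2} \mathrm{Meas}(\mathbb{R}^{n_1+n_2})$ but $\mu_1 \otimes \mu_2 \notin \tilde P^+ \mathrm{Meas}(\mathbb{R}^{n_1+n_2})$ in general; the product measure $\mu_1 \otimes \mu_2$ decomposes as a sum over shuffles ε of $(1, \ldots, n_1)$, $(n_1 + 1, \ldots, n_1 + n_2)$, namely, $\mu_1 \otimes \mu_2 = \sum_{\varepsilon\ \mathrm{shuffle}} (\mu_1 \otimes \mu_2)^\varepsilon \circ \varepsilon$. Hence the H-multiplicative property (3.17) reads also
$$\phi^t_{\mu_1}(T_1)\, \phi^t_{\mu_2}(T_2) = \sum_{\varepsilon\ \mathrm{shuffle}} \phi^t_{(\mu_1\otimes\mu_2)^\varepsilon}(\varepsilon^{-1}(T_1 \wedge T_2)), \qquad (3.19)$$
where $\varepsilon^{-1}(T_1 \wedge T_2)$ is the forest $T_1 \wedge T_2$ with decoration $\varepsilon^{-1} \circ \ell$, see Definition 3.4 (iii) for the definition of $\ell$.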
3. The regularization algorithm R presented in Sect. 2 may be written in a compact way using the structures we have just introduced. Namely, one has:
Lemma 3.5. Let Γ = (Γ(1), . . . , Γ(d)) and $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$. Then
$$[R\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = \sum_{\sigma\in\Sigma_n} \left( \phi^t * (\phi^s \circ S) \right)_{\mu^\sigma}(T^\sigma), \qquad (3.20)$$
where
$$\phi^t_\nu(T) := [R\mathrm{SkI}_T(\nu)]_t = \left[ \mathrm{SkI}_T\left( \sum_{k\in\mathbb{Z}^T_{reg}} \left( \otimes_{v\in V(T)} D(\phi_{k_v}) \right) \nu \right) \right]_t \qquad (3.21)$$
for $\nu \in \tilde P^{+,T} \mathrm{Meas}(\mathbb{R}^n)$, and $\left( \phi^t * (\phi^s \circ S) \right)_{\mu^\sigma}$ is the obvious multilinear extension of the convolution, see Eq. (3.7).
Proof. Simple formalization of the regularization procedure explained in Sect. 2.
The fundamental result is the following. Lemma 3.6. Let = ((1), . . . , (d)) be compactly supported, and assume that the condition of Step 1 in Sect. 2 is satisfied. Then χ¯ t is a character of Shd . Proof. Let Tn i ∈ Shd with n i vertices (i = 1, 2); define n := n 1 + n 2 . Let μi := i ⊗nj=1 d( i ( j)), i = 1, 2 and μ := μ1 ⊗ μ2 . If n ≥ 1, we let T n be the trunk tree with n vertices {n → . . . → 1} and decoration ( j) = j, j ≤ n , see Fig. 1. All shuffles ε below are intended to be shuffles of (1, . . . , n 1 ), (n 1 + 1, . . . , n 2 ). Then t χ¯ t (Tn 1 Tn 2 ) = χ¯ μ◦ε (T n ) ε shuffle
=
σ ∈ n ε shuffle
=:
σ ∈ n
t σ φ(μ◦ε) σ (T ) =
σ ∈ n ε shuffle
φμt ε◦σ (Tσ )
φμt σ (tσ1 )
with
tσ1 :=
(3.22)
Tε
−1 ◦σ
+ ∈ Hn,n .
(3.23)
ε shuffle
On the other hand, χ¯ t (Tn 1 )χ¯ t (Tn 2 ) = χ¯ μt 1 (T n 1 )χ¯ μt 2 (T n 2 ) φ t σ1 (Tσ1 )φ t σ2 (Tσ2 ) = σ1 ∈ n 1 ,σ2 ∈ n 2
=
μ1
σ1 ∈ n 1 ,σ2 ∈ n 2 ε shuffle
μ2
φt
σ
σ
(μ11 ⊗μ22 )ε
(ε−1 (Tσ1 ∧ Tσ2 ))
(3.24)
by (3.19) =
σ ∈ n
where
φμt σ (tσ2 ),
tσ2 :=
ε−1 (Tσ1 ∧ Tσ2 ).
(3.25)
(3.26)
(σ1 ,σ2 ,ε);(σ1 ⊗σ2 )◦ε=σ
Hence $\bar\chi^t_\Gamma$ is a character of Sh if and only if $t^1_\sigma = t^2_\sigma$ for every $\sigma \in \Sigma_n$; let us prove this. Extend first (3.22) and (3.25) by multilinearity from tensor measures $\mu_1 \otimes \mu_2$ to a general measure $\mu \in \mathrm{Meas}(\mathbb{R}^n)$. By the usual shuffle identity, $\mathrm{SkI}^t_\Gamma(T_{n_1} \sqcup\!\sqcup\, T_{n_2}) = \mathrm{SkI}^t_\Gamma(T_{n_1}).\mathrm{SkI}^t_\Gamma(T_{n_2})$, so (3.22) and (3.25) coincide for $\bar\chi^t = [\mathrm{SkI}(\,.\,)]_t$. Choose $\sigma \in \Sigma_n$. For any $\mu \in \mathrm{Meas}(\mathbb{R}^n)$, one has
$$[\mathrm{SkI}_{\mu^\sigma}(t^1_\sigma - t^2_\sigma)]_t = 0. \qquad (3.27)$$
This fact implies actually that $t^1_\sigma = t^2_\sigma$. Let us first give an informal proof of this statement. To begin with, note that the fact that $[\mathrm{SkI}_\Gamma(t)]_t = 0$ for every smooth path Γ does not imply in itself that t = 0 if t ∈ H is arbitrary. Namely, the character $\mathrm{SkI}^t_\Gamma : H \to \mathbb{R}$ quotients out via the canonical projection Θ : H → Sh, see Proposition 3.1, into a character Sh → $\mathbb{R}$, by the tree shuffle property; one may actually prove that $\mathrm{SkI}^t_\Gamma(t) = 0$ for every smooth path Γ if and only if t ∈ Ker(Θ). In our case, the elements of $F^+_{n,n}$ are linearly independent modulo Ker(Θ) because the ordering of the labels $\ell(j)$, j = 1, . . . , n is compatible with the tree ordering – which prevents any possibility of shuffling – hence $t^1_\sigma - t^2_\sigma = 0$.
Let us now give a more formal argument. Let $t^1_\sigma - t^2_\sigma =: \sum_j a_j t_j$, $a_j \in \mathbb{Z}$, $t_j \in F^+_{n,n}$ two-by-two distinct, and define
$$F_t(\xi_1, \ldots, \xi_n) := \frac{1}{\prod_{v\in V(t)} \left( \xi_v + \sum_{w \twoheadrightarrow v} \xi_w \right)} \qquad (3.28)$$
if $t \in F^+_{n,n}$. Applying Lemma 4.5 to $[\mathrm{SkI}_{\mu_m}(t_j)]_t$, where $(\mu_m \circ \sigma)_{m\ge 1} \in P^+ \mathrm{Meas}(\mathbb{R}^n)$ is a sequence of measures whose Fourier transform converges weakly to the Dirac distribution $\delta_{(\xi_1,\ldots,\xi_n)}$, one gets
$$\sum_j a_j F_{t_j}(\xi_1, \ldots, \xi_n) = 0, \quad |\xi_1| \le \ldots \le |\xi_n|. \qquad (3.29)$$
Since the left-hand side of (3.29) is a rational function, the equation extends to arbitrary ξ = (ξ1 , . . . , ξn ) ∈ Rn . Note that (ξv + ξw ) = (ξ1 + ξw )Ftˇ j (ξ2 , . . . , ξn ), (3.30) v∈V (t j )
wv
w1
where ˇt j := Lea{1} (t j ) is t j severed of the vertex 1, which is one of its roots. Let J , ⊂ {2, . . . , n} be the subset of indices j such that {v ∈ {1, . . . , n}; v 1 in t j } = , i.e. such that the tree component of 1 in t j has vertex set . Take the residue at − w∈ ξw of the left-hand side of (3.29), considered as a function of ξ1 . This gives: a j Ftˇ j (ξ2 , . . . , ξn ) = 0, ⊂ {2, . . . , n}. (3.31) j∈J
Shifting by −1 the indices of vertices of $\check t_j$ and the labels $\ell(v)$, $v \in V(\check t_j)$, one gets a forest in $F^+_{n-1,n-1}$. One may now conclude by an inductive argument.
Let us now give an alternative definition for the regularization R. As we shall see in Lemma 3.8, the two definitions actually coincide.
Definition 3.7 (alternative definition for regularization R'). Choose for every tree T ∈ H a subset $\mathbb{Z}^T_{reg} \subset \mathbb{Z}^T_+$ satisfying the condition stated in Step 1 of Sect. 2. Let Γ = (Γ(1), . . . , Γ(d)) be a compactly supported, α-Hölder path, and $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$ the corresponding measure.
(i) Let, for every $T \in H^d$ with n vertices,
$$\phi^t_\nu(T) = [R\mathrm{SkI}_T(\nu)]_t , \quad \nu \in \tilde P^{+,T} \mathrm{Meas}(\mathbb{R}^n), \qquad (3.32)$$
see Eq. (2.11) or Lemma 3.5, and
$$\bar\chi^t_\Gamma(T_n) := \sum_{\sigma\in\Sigma_n} \phi^t_{\mu^\sigma}(T^\sigma) \qquad (3.33)$$
be the associated character of Sh as in Definition 3.4.
(ii) Let, for $T_n \in \mathrm{Sh}^d$, n ≥ 1, with n vertices and decoration $\ell$,
$$[R'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} := \bar\chi^t_\Gamma * (\bar\chi^s_\Gamma \circ \bar S)(T_n). \qquad (3.34)$$
Since $\bar\chi^s_\Gamma$, $\bar\chi^t_\Gamma$ and hence $\bar\chi^t_\Gamma * (\bar\chi^s_\Gamma \circ \bar S)$ are characters of the shuffle algebra, R'Γ satisfies the shuffle property. Also, R'Γ satisfies the Chen property by construction, since
$$[R'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = \left( \bar\chi^t_\Gamma * (\bar\chi^u_\Gamma \circ \bar S) \right) * \left( \bar\chi^u_\Gamma * (\bar\chi^s_\Gamma \circ \bar S) \right)(T_n)$$
$$= [R'\Gamma^n(\ell(1), \ldots, \ell(n))]_{tu} + [R'\Gamma^n(\ell(1), \ldots, \ell(n))]_{us} + \sum_j [R'\Gamma^j(\ell(1), \ldots, \ell(j))]_{tu}\, [R'\Gamma^{n-j}(\ell(j+1), \ldots, \ell(n))]_{us} \qquad (3.35)$$
by definition of the convolution in Sh. Both properties remain valid if $\bar\chi^t$, $t \in \mathbb{R}$, are arbitrary characters of Sh.
Let us make this definition a little more explicit before proving that R'Γ = RΓ. Replacing $\bar\chi^s_\Gamma \circ \bar S$ with $\chi^s_\Gamma \circ S$, see Corollary 3.2, one gets, see Eq. (3.8),
$$[R'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = \chi^t_\Gamma(T_n) + \chi^s_\Gamma(S(T_n)) + \sum_j \chi^t_\Gamma(\mathrm{Roo}_j T_n)\, (\chi^s_\Gamma \circ S)(\mathrm{Lea}_j T_n)$$
$$= (\bar\chi^t_\Gamma - \bar\chi^s_\Gamma)(T_n) + \sum_j (\bar\chi^t_\Gamma - \bar\chi^s_\Gamma)(\mathrm{Roo}_j T_n) \, . \, \chi^s_\Gamma(S(\mathrm{Lea}_j T_n)). \qquad (3.36)$$
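The algebraic form of the Chen property in (3.35) is the same as for ordinary iterated integrals, and at order n = 2 it can be verified exactly. The sketch below is an illustration for the unregularized canonical lift only, under the assumption of a hypothetical smooth path Γ(1)(x) = x, Γ(2)(x) = x² (chosen here so that all integrals are polynomials); the function names are ours.

```python
from fractions import Fraction

# Hypothetical smooth path used for illustration: Gamma(1)(x) = x, Gamma(2)(x) = x**2.


def level1_first(t, s):
    # Increment of Gamma(1) between s and t.
    return t - s


def level1_second(t, s):
    # Increment of Gamma(2) between s and t.
    return t * t - s * s


def level2(t, s):
    # int_s^t dGamma(1)(x1) int_s^{x1} dGamma(2)(x2) = int_s^t (x1**2 - s**2) dx1
    return (t ** 3 - s ** 3) / 3 - s * s * (t - s)


def chen_rhs(t, u, s):
    # Order-2 Chen relation: I_ts = I_tu + I_us + I^1_tu(1) * I^1_us(2)
    return level2(t, u) + level2(u, s) + level1_first(t, u) * level1_second(u, s)


s, u, t = Fraction(1, 3), Fraction(5, 7), Fraction(2)
chen_holds = level2(t, s) == chen_rhs(t, u, s)
print(chen_holds)
```

Exact rational arithmetic is used so that the identity is tested without rounding error; the intermediate point u need not lie between s and t.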
Expanding the formula for S(Lea j Tn ) in terms of multiple cuts as in the previous subsection, see Eq. (3.9), we get [R n ( (1), . . . , (n))]ts = (χ¯ t − χ¯ s )(Tn ) + (−1)l j1 |ξwmax (v) |. (4.1) 2 wv Proof. The left inequality is trivial. As for the right one, assume first that v is on a terminal branch, i.e. Lea f (v) = {wmax (v)} is a singleton. Then Definition 4.3 (ii) implies the following: for every vertex v on the branch between wmax (v) and v, i.e. v ∈ Br (wmax (v) v) ∪ {v}, – either ξv is of the same sign as ξwmax (v) ; |ξwmax (v) | |kv |−1 , 5 · 2|kv |−1 ) (and similarly for |ξ
– or |ξv | ≤ 2|V wmax (v) |) (T)| , since |ξv | ∈ (2 by the remarks following Proposition 5.2. v}| Hence |ξv + wv ξw | = | v ∈Br (wmax (v)v)∪{v} ξv | > 1 − 21 |{w:w |ξwmax (v) | |V (T)| and ξv + wv ξv has the same sign as ξwmax (v) . Consider now what happens at a node n. Let n + := {v ∈ V (T) | v → n}. Assume by induction on the number of vertices that, for all v ∈ n + ,
1 |{w : w v}| . |ξwmax (v) | ξw | > 1− (1 + |{w : w v}|) |ξwmax (v) | ≥ |ξv + 2 |V (T)| wv (4.2)
and that ξv + wv ξw has the same sign as ξwmax (v) . By Definition 4.3 (iii), either |ξwmax (n) | + ξwmax (v) .ξwmax (n) > 0 or |ξwmax (v) | ≤ 2|V (T)| . Then, letting w0 be the element of n such that wmax (v0 ) = wmax (n), ξw | = ξn + (ξv + ξ w ) (1 + |{w : w n}|) |ξwmax (n) | ≥ |ξn + wn wv v∈n + ≥ ξv 0 + ξw − (ξv + ξw ) − |ξn | + wv0 wv v∈n ;ξwmax (v).ξwmax (n) 1− (4.3) 2 |V (T)| 4.2. A key formula for skeleton integrals. We assume in this paragraph that is smooth and denote by its derivative. The Hölder estimates in Subsects. 4.3 and 4.4 rely on the key formula below. Lemma 4.5. The following formula holds: √ [SkIT ()]s = (i 2π)−|V (T)| . . .
v∈V (T)
dξv .eis
v∈V (T) ξv
v∈V (T) F ( v∈V (T) (ξv +
( (v)))(ξ
v)
wv ξw )
.
(4.4)
Proof. We use induction on |V (T)|. After stripping the root of T, denoted by 0, there remains a forest T = T 1 . . . T J , whose roots 01 , . . . , 0 J are the vertices directly connected to 0. Assume ix0 v∈V (T ) ξv j dξv .e F j (ξ0 j , (ξv )v∈T j \{0 j } ) (4.5) [SkIT j ()]x0 = . . . v∈V (T j )
for some functions F j , j = 1, . . . , J . Note that ⎡ ⎢ F SkIT j () (ξ j ) = ⎣
⎤
⎥ dξv ⎦ F j (ξ j −
v∈V (T j )\{0 j }
v∈V (T j )\{0 j }
ξv , (ξv )v∈V (T j )\{0 j } ).
(4.6) Then [SkIT ()]s =
s
dx0 ( (0))
J
[SkIT j ()]x0
j=1
1 = √ 2π 1 = √ 2π
+∞
−∞
+∞
−∞
⎡ J ⎢ ×⎣
⎞ ⎛ J dξ isξ ⎝
SkIT j ()⎠ (ξ ) e F ( (0)) iξ j=1
dξ F ( ( (0)))(ξ −
j=1 v∈V (T j )\{0 j }
J j=1
⎤
ξj)
eisξ . iξ
dξ1 . . .
J
j=1
v∈V (T j )\{0 j }
⎥ dξv⎦ F j (ξ j −
dξ J
ξv , (ξv )v∈V (T j )\{0 j } ),
(4.7) hence the result.
4.3. Estimate for the increment term.

We now come back to an arbitrary α-Hölder path Γ and prove a Hölder estimate for the increment term, see Eq. (2.13), which is simply a regularized skeleton integral. Let $\sigma \in \Sigma_n$ be a permutation, and T be one of the forests $T^\sigma_j$ appearing in the permutation graph $T^\sigma$, see Lemma 1.5. Hölder norms $\|\,.\,\|_{C^\gamma}$ are defined in the Appendix. Recall that T comes with a total ordering compatible with its tree partial ordering. The $\tilde P$-projection $\tilde P^+$ below is defined with respect to this total ordering.
Lemma 4.6 (Hölder estimate of the increment term). The estimate
$$\left\| R\mathrm{SkI}_T\left( \tilde P^+\left( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) \right) \right) \right\|_{C^{|V(T)|\alpha}} < \infty \qquad (4.8)$$
holds.
Remark. Although formal integrals are a priori infra-red divergent (see Subsect. 1.4), the formula given in Lemma 4.5 for skeleton integrals delivers infra-red convergent quantities when one restricts the integration over ξ = (ξv )v∈V (T) to the subdomain associated to ZrTeg , see Lemma 4.4, because F( ( (v)))(ξv ) |F(( (v)))(ξv )| |ξv | ≤ |F(( (v)))(ξv )| (4.9) ξ + ξ |ξwmax (v) | v wv w is bounded. Proof. We implicitly assume in the proof that T is a tree, leaving the obvious generalization to forests with several components to the reader. We shall start the computations by adapting the proof of a theorem in [30], §2.6.1 bounding the Hölder-Besov norm of the product of two Hölder functions. Write
G(x) = RSkIT P˜ + (⊗v∈V (T) ( (σ (v)))) . (4.10) x
By Lemma 4.5,
√ G(x) = (i 2π )−|V (T)|
v∈V (T) supp(φkv )
k=(kv )v∈V (T) ∈ZrTeg
.e
ix
v∈V (T) ξv
D(φkv ) ( (σ (v))) (ξv ) . v∈V (T) (ξv + wv ξw )
v∈V (T) F
dξv
v∈V (T)
(4.11) Write, for ξ = (ξv )v∈V (T) , (ξ ) =
v∈V (T)
ξv +
ξv
wv ξw
(4.12)
and 1 (k) =
2|kv |
v∈V (T)
2|kwmax (v) |
.
(4.13)
Let finally k (ξ ) :=
! v∈V (T)
φkv (ξv ) .
(ξ ) . 1 (k)
(4.14)
By Lemma 4.4, || k || S 0 (RV (T) ) , see Proposition 5.8, is uniformly bounded in k if k ∈ ZrTeg , which is the key point for the following estimates. ∗ Let k ∈ Z. Apply the operator D(φk ) to Eq. (4.11): then, letting φk (ξ ) := φk ( v∈V (T) ξv ), ⎡ ⎤ ! ⎢ ⎥ D(φk )G(x) = ⎣ 1 (k)D( k )D(φk∗ ). D( φkv )( (σ (v)))⎦ (x), k∈ZrTeg
v∈V (T)
(4.15)
where x = (xv )v∈V (T) = (x, . . . , x) is a vector with |V (T)| identical ! components. Let vmax := sup{v | v ∈ V (T)}. Note that D(φk∗ ) . D(⊗v∈V (T) φkv ) vanishes except if ⎛ ⎞ ⎝ supp(φkv )⎠ ∩ supp(φk∗ ) = ∅, (4.16) v∈V (T)
which implies by Lemma 4.4, |kvmax − k| = O(log2 |V (T)|);
(4.17)
namely, denoting by 0 the root of T, |V (T)| . |ξkvmax | ≥ | v∈V (T) ξkv | = |ξk0 + 1 w0 ξkw | > 2 |ξkvmax | if ξv ∈ supp(φkv ) for every v. Since k , φk∗ ∈ S 0 (RV (T) ), one gets by Proposition 5.8, ! ||D(φk )G||∞ 1 (k) ||D( φkv )( (σ (v)))||∞ . (4.18) v∈V (T)
k∈ZrTeg ,kvmax =k
Since is in C α , one obtains by Propositions 5.7 and 5.8: 1 (k) 2−|kv |α ||D(φk )G||∞ k∈ZrTeg ,kvmax =k
v∈V (T)
2|kv |(1−α)−|kwmax (v) | .
(4.19)
k∈ZrTeg ,kvmax =k v∈V (T)
In other words, loosely speaking, each vertex v ∈ V (T) contributes a factor 2|kv |(1−α)−|kwmax (v) | to ||D(φk )G||∞ . If v is a leaf, then this factor is simply 2−|kv |α . Note that the upper bound 2|kv |(1−α)−|kwmax (v) | ≤ 2−|kv |α holds true for any vertex v. Consider an uppermost node n, i.e. a node to which no other node is connected, together with the set of leaves {w1 < . . . < w J } above n, see Fig. 5. Let p j = |V (Br (w j n))|. On the branch number j, −|k |α |k |(1−α)−|kw j | −|k |αp 2 v 2 wj j , (4.20) 2 wj v∈Br (w j n)\{w j } |kv |≤|kw j |
and (summing over kw1 , . . . , kw J −1 and over kn ) 2−|kw J |αp J 2−|kw J −1 |αp J −1 ⎛
|kw J −1 |≤|kw J |
⎛
⎝. . . ⎝
2−|kw1 |αp1 ⎝
|kw1 |≤|kw2 |
2
−|kw J |αW (n)
⎛
,
⎞⎞
⎞
2|kn |(1−α)−|kw J | ⎠⎠ . . .⎠
|kn |≤|kw1 |
(4.21)
where W (n) = p1 + . . . + p J + 1 = |{v : v n}| + 1 is the weight of n. One may then consider the reduced tree Tn obtained by shrinking all vertices above n (including n) to one vertex with weight W (n) and perform the same operations on Tn . Repeat this inductively until T is shrunk to one point. In the end, one gets ||D(φk )G||∞ 2−|kvmax |α|V (T)| 2−|k|α|V (T)| , hence G ∈ C |V (T)|α .
Remark. Note that the above proof breaks down for the non-regularized quantities, since the function of ξ defined in Eq. (4.14) is unbounded on $\mathbb{Z}^T_+ \setminus \mathbb{Z}^T_{reg}$. For instance, the Lévy area of fractional Brownian motion diverges below the barrier α = 1/4, see [11,32,33]. For deterministic, well-behaved paths with very regular, polynomially decreasing Fourier components, the unregularized integrals are probably well-defined at least for α > 1/2 – in which case the much simpler Young integral converges – otherwise the case is not even clear.

4.4. Estimate for the boundary term.

We shall now prove a Hölder estimate corresponding to the boundary term. As in the previous paragraph, we let $\sigma \in \Sigma_n$ and T be one of the forests $T^\sigma_j$, j = 1, . . . , $J_\sigma$. Once again, recall that T comes with a total ordering compatible with its tree partial ordering. The $\tilde P$-projection $\tilde P^+$ below is defined with respect to this total ordering.
Lemma 4.7 (Hölder regularity of the boundary term). The regularized boundary term $\left[ RI_T\left( \tilde P^+\left( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) \right) \right)(\partial) \right]_{ts}$ is |V(T)|α-Hölder.
Proof. As in the previous proof, we assume implicitly that T is a tree, but the proof generalizes with only very minor changes to the case of forests. Solving in terms of multiple cuts as in Sect. 3 the recursive definition of the boundary term [RIT P˜ + (⊗v∈V (T) ( (σ (v))) (∂)]ts given in Sect. 2, one gets in the end a sum of ’skeleton-type’ terms of the form (see Fig. 6) " l−1 # Ats := [δRSkI Roo(T) ]ts [RSkI Leavm ◦Roovm+1 (T) ]s [RSkI Leavl (T) ]s
m=1
× P (⊗v∈V (T) ( (σ (v))) , ˜+
(4.22)
where vl = (vl,1 < . . . < vl,Jl ) | V (T), vl−1 | V (Roovl T), . . ., v 1 = (v1,1 , . . . , v1,J1 ) | Roov 2 (T)) and one has set for short Roo(T) := Roov 1 (T). Leav
T
l Zr eg l, j such that k = (kvl,1 , . . . , kvl,Jl ) (with |kvl,1 | ≤ First step. Let U [k] ⊂ Jj=1 . . . ≤ |kvl,Jl |) is fixed. Then (see after Eq. (4.19) in the proof of Lemma 4.6) each vertex v contributes a factor 2|kv |(1−α)−|kwmax (v)| ≤ 2−|kv |α , hence
||P U [ k] RSkI Leavl T (⊗v∈V (Leavl T) ( (σ (v))))||∞ ⎡ ⎤ ⎣2−|kv |α 2−|kw |α ⎦ v∈vl
|kw |≥|kv |,w∈Leav T\{v}
2
−|kv |α|V (Leav T)|
.
(4.23)
v∈vl
˜ Second step. More generally, let Bs [k] be the expression obtained by P-projecting " l−1 # [RSkI Leavm ◦Roovm+1 (T) ]s [RSkI Leavl (T) ]s P˜ + (⊗v∈V (Leav1 (T)) ( (σ (v)))) m=1
Fig. 6. Here $V(\mathrm{Roo}(T)) = \{0, 1, 2, 4\}$, $R(0) = R(4) = \emptyset$, $R(1) = \{v_{1,1}\}$, $R(2) = \{v_{1,2}\}$
onto the sum of terms with some fixed value of the indices k = (kv1,1 , . . . , kv1,J1 ). Then ||Bs [k]||∞
2−|kv |α|V (Leav T)|
(4.24)
v∈v 1
(proof by induction on l). Third step. We define As (x) := [RSkI Roo(T) ]x
" l−1
# [RSkI Leavm ◦Roovm+1 (T) ]s [RSkI Leavl (T) ]s
m=1
P˜ + (⊗v∈V (T) ( (σ (v)))
(4.25)
α (see Eq. (4.22)), so that Ats = As (t)−As (s), and show that sups∈R ||x → As (x)|| B∞,∞ < ∞. Note first (see the Remark following Lemma 4.6) there is no infra-red divergence problem. Let V (Roo(T)) = {w1 < . . . < wmax }. Fix s ∈ R and K ∈ Z. By definition, and by Lemma 4.5,
⎛ ⎜ (D(φ K )As ) (x) = D(φ K ) ⎝x →
k=(kv1,1 ,...,kv1,J ) ((kw )w∈V (Roo(T)) )∈Sk 1
v∈V (Roo(T))
dξv . e
ix
v∈V (Roo(T)) ξv
v∈V (Roo(T))
supp(φkv )
⎞ D(φkw ) ( (σ (w))) (ξw ) Bs [ k]⎠ , w∈V (Roo(T)) (ξw + w w,w ∈V (Roo(T)) ξw ) w∈V (Roo(T)) F
(4.26) where indices in Sk satisfy in particular the following conditions: (i) |ξw + w w,w ∈V (Roo(T)) ξw | > 21 max{|ξw | : w w, w ∈ V (Roo(T))} by Lemma 4.4; ∗ ) = ∅, see Eq. (4.16); (ii) supp(φ ) ∩ supp(φ K k w w∈V (Roo(T)) (iii) for every w ∈ V (Roo(T)), |kw | ≤ |kwmax |; and (iv) for every w ∈ V (Roo(T)), |kw | ≤ |kv | for every v ∈ R(w) := {v = v1,1 , . . . , v1,J1 | v → w}. Note that R(w) may be empty. See Fig. 6. Note that |kwmax − K | = O(log2 |V (Roo(T))|) by (ii) (see Eq. (4.17)). Hence conditions (ii) and (iii) above are more or less equivalent to fixing kwmax K and letting (kw )w∈V (Roo(T))\{wmax } range over some subset of [−|K |, |K |] × . . . × [−|K |, |K |].
The large fraction in Eq. (4.26) contributes to ||D(φk )As ||∞ an overall factor bounded by |1 (k)| w∈V (Roo(T)) 2−|kv |α . If w ∈ Roo(T), split R(w) into R(w)> ∪ R(w)< , where R(w)≷ := {v ∈ R(w) | v ≷ wmax }. Summing over indices corresponding to vertices in or above RT> := {v = vl,1 , . . . , vl,Jl | v > wmax } = ∪w∈Roo(T) R(w)> , one gets by Eq. (4.24) a quantity bounded up to a constant by 2−|kv |α|V (Rv T)| 2−|K |α v∈RT> |V (Rv T)| . (4.27) v∈R T> |kv |≥|K |
Let w ∈ Roo(T) \ {wmax } such that R(w)< = ∅ (note that R(wmax )< = ∅). Let R(w)< = {vi1 < . . . < vi j } . Then the sum over (kv ), v ∈ R(w)< contributes a factor bounded by a constant times 2
∞
−|kw |α
∞
...
|kvi |=|kw | |kvi |=|kvi | 1
2
2
∞
1
|kvi |=|kvi j
−|kw |α(1+ v∈R(w)< |V (Leav T)|)
2 j−1
−|kvi |α|V (Leavi T)| 1
1
...2
−|kvi |α|V (Leavi T)| j
j
|
.
(4.28)
In other words, each vertex w ∈ Roo(T) ’behaves’ as if it had a weight 1 + v∈R(w)< |V (Rv T)|. Hence (by the same method as in the proof of Lemma 4.6), letting RT< := ∪w∈Roo(T) R(w)< ,
||D(φ K )As ||∞ 2−|K |α(|V (Roo(T))|+
v∈RT
|V (Leav T)|
(4.29)
5. Appendix. Hölder and Besov Spaces

We gather in this Appendix some definitions and technical facts about Besov spaces and Hölder norms that are required in Sects. 2 and 4.
Definition 5.1 (Hölder norm). If $f : \mathbb{R}^l \to \mathbb{R}$ is α-Hölder continuous for some α ∈ (0, 1), we let
$$\|f\|_{C^\alpha} := \|f\|_\infty + \sup_{x,y\in\mathbb{R}^l} \frac{|f(x) - f(y)|}{\|x - y\|^\alpha} . \qquad (5.1)$$
The space $C^\alpha = C^\alpha(\mathbb{R}^l)$ of real-valued α-Hölder continuous functions, provided with the above norm $\|\ \|_{C^\alpha}$, is a Banach space.
Proposition 5.2 [30]. Let l ≥ 1. There exists a family of $C^\infty$ functions $\phi_0$, $(\phi_{1,j})_{j=1,\ldots,4^l-2^l} : \mathbb{R}^l \to [0, 1]$, satisfying the following conditions:
1. $\mathrm{supp}\, \phi_0 \subset [-2, 2]^l$ and $\phi_0|_{[-1,1]^l} \equiv 1$.
2. Cut $[-2, 2]^l$ into $4^l$ equal hypercubes of volume 1, and remove the $2^l$ hypercubes included in $[-1, 1]^l$. Let $K_1, \ldots, K_{4^l-2^l}$ be an arbitrary enumeration of the remaining hypercubes, and $\tilde K_j \supset K_j$ be the hypercube with the same center as $K_j$, but with edges twice longer. Then $\mathrm{supp}\, \phi_{1,j} \subset \tilde K_j$, $j = 1, \ldots, 4^l - 2^l$.
34
J. Unterberger
3. Let (φk, j )k≥2, j=1,...,4l −2l be the family of dyadic dilatations of (φ1, j ), namely, φk, j (ξ1 , . . . , ξl ) := φ1, j (21−k ξ1 , . . . , 21−k ξl ).
(5.2)
Then (φ0 , (φk, j )k≥1, j=1,...,4l −2l ) is a partition of unity subordinated to the covering l l [−2, 2]l ∪ ∪k≥1 ∪4 −2 2k−1 K˜ j , namely, j=1
−2 4 l
φ0 +
l
φk, j ≡ 1.
(5.3)
k≥1 j=1
Constructed in this almost canonical way, the family of Fourier multipliers $(\phi_0, (\phi_{k,j}))$ is immediately seen to be uniformly bounded for the norm $||\cdot||_{S^0(\mathbb{R}^l)}$ defined in Proposition 5.8 below. If l = 1, letting $K_1 = [1, 2]$ and $K_2 = [-2, -1]$, we shall write $\phi_1$, resp. $\phi_{-1}$, instead of $\phi_{1,1}$, resp. $\phi_{1,2}$, and define $\phi_k(\xi) = \phi_{\mathrm{sgn}(k)}(2^{1-|k|}\xi)$ for |k| ≥ 2, so that $\sum_{k\in\mathbb{Z}} \phi_k \equiv 1$ and
$$\mathrm{supp}\,\phi_0 \subset [-2, 2], \quad \mathrm{supp}\,\phi_k \subset [2^{k-1}, 5 \times 2^{k-1}], \quad \mathrm{supp}\,\phi_{-k} \subset [-5 \times 2^{k-1}, -2^{k-1}] \quad (k \ge 1). \qquad (5.4)$$
In this particular case, such a family is easily constructed from an arbitrary even, smooth function $\phi_0 : \mathbb{R} \to [0, 1]$ with the correct support by setting $\phi_k(\xi) = 1_{\mathbb{R}_+}(\xi)\cdot(\phi_0(2^{-k}\xi) - \phi_0(2^{1-k}\xi))$ and $\phi_{-k}(\xi) = 1_{\mathbb{R}_-}(\xi)\cdot(\phi_0(2^{-k}\xi) - \phi_0(2^{1-k}\xi))$ for every k ≥ 1 (see [31], §1.3.3).
In order to avoid setting apart the one-dimensional case, we let $I_l := \mathbb{Z}$ if l = 1, and $I_l = \{0\} \cup \{(k, j) \mid k \ge 1, 1 \le j \le 4^l - 2^l\}$ if l ≥ 2. Also, if l ≥ 2, we define |κ| = k ≥ 1 if κ = (k, j) with k ≥ 1.
Definition 5.3 Let $(\tilde\phi_\kappa)_{\kappa\in I_l}$ be the partition of unity of $\mathbb{R}^l$, l ≥ 1, defined by (see Proposition 5.2):
(i) $$\tilde\phi_0 := 1_{[-1,1]^l}, \quad \tilde\phi_{1,j} := 1_{K_j}; \qquad (5.5)$$
(ii) if k ≥ 2,
$$\tilde\phi_{k,j}(\xi_1, \ldots, \xi_l) := \tilde\phi_{1,j}(2^{1-k}\xi_1, \ldots, 2^{1-k}\xi_l). \qquad (5.6)$$
We use this auxiliary partition several times in the text.
Definition 5.4 [30]. Let $\ell^\infty(L^\infty)$ be the space of sequences $(f_\kappa)_{\kappa\in I_l}$ of a.s. bounded functions $f_\kappa \in L^\infty(\mathbb{R}^l)$ such that
$$||f_\kappa||_{\ell^\infty(L^\infty)} := \sup_{\kappa\in I_l} ||f_\kappa||_\infty < \infty. \qquad (5.7)$$
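The one-dimensional recipe quoted before Definition 5.3 (take an even bump $\phi_0$ and set $\phi_{\pm k}$ as differences of its dyadic rescalings) can be tried out numerically. The following is only a minimal sketch: the particular bump `phi0`, built from the standard $e^{-1/t}$ smooth step, and all function names are choices made here for illustration and do not come from the text. It checks that the partial sums telescope to 1 on $|\xi| \le 2^{N}$.

```python
import math


def smooth_step(t):
    """C-infinity step: equals 0 for t <= 0 and 1 for t >= 1."""
    def f(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return f(t) / (f(t) + f(1.0 - t))


def phi0(xi):
    """Even smooth bump (illustrative choice): 1 on [-1, 1], 0 outside [-2, 2]."""
    return 1.0 - smooth_step(abs(xi) - 1.0)


def phi(k, xi):
    """phi_k for k in Z, following the one-dimensional construction."""
    if k == 0:
        return phi0(xi)
    n = abs(k)
    on_half_line = xi > 0 if k > 0 else xi < 0
    if not on_half_line:
        return 0.0
    return phi0(2.0 ** (-n) * xi) - phi0(2.0 ** (1 - n) * xi)


def partial_sum(xi, n_max):
    """Sum of phi_k(xi) over |k| <= n_max; telescopes to phi0(xi / 2**n_max)."""
    return sum(phi(k, xi) for k in range(-n_max, n_max + 1))


if __name__ == "__main__":
    for xi in (-37.5, -0.25, 0.0, 1.5, 100.0):
        print(xi, partial_sum(xi, 8))
```

With this choice of bump, $\phi_k$ vanishes outside $[2^{k-1}, 2^{k+1}]$, which is contained in the interval stated in (5.4).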
Let $S'(\mathbb{R}^l, \mathbb{R})$ be the dual of the Schwartz space of rapidly decreasing functions on $\mathbb{R}^l$. As is well-known, it includes the space of infinitely differentiable slowly growing functions.
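The following definition is classical. Recall that the Fourier transform $\mathcal{F}$ has been defined at the end of the Introduction.
Definition 5.5 (Fourier multipliers). Let $m : \mathbb{R}^l \to \mathbb{R}$ be an infinitely differentiable slowly growing function. Then
$$D(m) : S'(\mathbb{R}^l, \mathbb{R}) \to S'(\mathbb{R}^l, \mathbb{R}), \quad \phi \mapsto \mathcal{F}^{-1}(m \cdot \mathcal{F}\phi) \qquad (5.8)$$
defines a continuous operator. In other words, m is a Fourier multiplier of $S'(\mathbb{R}^l, \mathbb{R})$.
Definition 5.6 [30]. Let $B^\alpha_{\infty,\infty}(\mathbb{R}^l) := \{f \in S'(\mathbb{R}^l, \mathbb{R}) \mid ||f||_{B^\alpha_{\infty,\infty}} < \infty\}$, where
$$||f||_{B^\alpha_{\infty,\infty}} := ||2^{\alpha|\kappa|} D(\phi_\kappa) f||_{\ell^\infty(L^\infty)} = \sup_{\kappa\in I_l} 2^{\alpha|\kappa|} ||D(\phi_\kappa) f||_\infty. \qquad (5.9)$$
Proposition 5.7 (see [30], §2.2.9). For every α ∈ (0, 1), $B^\alpha_{\infty,\infty}(\mathbb{R}^l) = C^\alpha(\mathbb{R}^l)$, and the two norms $||\cdot||_{C^\alpha}$ and $||\cdot||_{B^\alpha_{\infty,\infty}}$ are equivalent.
We shall sometimes call $||\cdot||_{B^\alpha_{\infty,\infty}}$ the Hölder-Besov norm. Let us finally give a criterion for a function m to be a Fourier multiplier of the Besov space $B^\alpha_{\infty,\infty}$: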
Proposition 5.8 (Fourier multipliers) (see [30], §2.1.3, p. 30). Let α ∈ (0, 1) and $m : \mathbb{R}^l \to \mathbb{R}$ be an infinitely differentiable function such that
$$||m||_{S^0(\mathbb{R}^l)} := \sup_{|j|\le l+5}\ \sup_{\xi\in\mathbb{R}^l} (1 + ||\xi||)^{|j|}\, |m^{(j)}(\xi)| < \infty, \qquad (5.10)$$
where $j = (j_1, \ldots, j_l)$, $|j| = j_1 + \ldots + j_l$ and $m^{(j)} := \partial_{\xi_1}^{j_1} \cdots \partial_{\xi_l}^{j_l} m$. Then there exists a constant C depending only on α, such that
$$||D(m) f||_{B^\alpha_{\infty,\infty}} \le C\, ||m||_{S^0(\mathbb{R}^l)}\, ||f||_{B^\alpha_{\infty,\infty}}. \qquad (5.11)$$
The space S 0 (Rl ) contains the space of translation-invariant pseudo-differential symbols of order 0 (see for instance [2], Def. 1.1, or [29]). References 1. Bass, R.F., Hambly, B.M., Lyons, T.J.: Extending the Wong-Zakai theorem to reversible Markov processes. J. Eur. Math. Soc. 4, 237–269 (2002) 2. Benassi, A., Jaffard, S., Roux, D.: Elliptic Gaussian random processes. Rev. Mat. Iberoamericana 13(1), 19–90 (1997) 3. Brouder, C., Frabetti, A.: QED Hopf algebras on planar binary trees. J. Alg. 267, 298–322 (2003) 4. Brouder, C., Frabetti, A., Krattenthaler, C.: Non-commutative Hopf algebra of formal diffeomorphisms. Adv. in Math. 200, 479–524 (2006) 5. Butcher, J.C.: An algebraic theory of integration methods. Math. Comp. 26, 79–106 (1972) 6. Calaque, D., Ebrahimi-Fard, K., Manchon, D.: Two Hopf algebras of trees interacting. Preprint http:// arxiv.org/abs/0806.2238v3[math.co], 2009
7. Chapoton, F., Livernet, M.: Relating two Hopf algebras built from an operad, International Mathematics Research Notices, Vol. 2007, Article ID rnm131 8. Connes, A., Kreimer, D.: Hopf algebras, renormalization and non-commutative geometry. Commun. Math. Phys. 199(1), 203–242 (1998) 9. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem (I). Commun. Math. Phys. 210(1), 249–273 (2000) 10. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem (II). Commun. Math. Phys. 216(1), 215–241 (2001) 11. Coutin, L., Qian, Z.: Stochastic analysis, rough path analysis and fractional Brownian motions. Prob. Th. Rel. Fields 122(1), 108–140 (2002) 12. Darses, S., Nourdin, I., Nualart, D.: Limit theorems for nonlinear functionals of Volterra processes via white-noise analysis. http://arxiv.org/abs/0904.1401v1[math.PR], 2009 13. Foissy, L.: Les algèbres de Hopf des arbres enracinés décorés (I). Bull. Sci. Math. 126 (3), 193–239, and (II), Bull. Sci. Math. 126(4), 249–288 (2002) 14. Friz, P., Victoir, N.: Multidimensional dimensional processes seen as rough paths. Cambridge studies in Adv. Math. 120, Cambridge: Cambridge University Press, 2010 15. Gubinelli, M.: Controlling rough paths. J. Funct. Anal. 216, 86–140 (2004) 16. Gubinelli, M.: Ramification of rough paths. Preprint available on http://arxiv.org/abs/math/ 0306433v2[math.PR], 2003 17. Hepp, K.: Proof of the Bogoliubov-Parasiuk theorem on renormalization. Commun. Math. Phys. 2(4), 301–326 (1966) 18. Hambly, B., Lyons, T.J.: Stochastic area for Brownian motion on the Sierpinski basket. Ann. Prob. 26(1), 132–148 (1998) 19. Kahane, J.-P.: Some random series of functions. Cambridge studies in advanced mathematics 5, Cambridge: Cambridge Univ. Press, 1985 20. Kreimer, D.: Chen’s iterated integral represents the operator product expansion. Adv. Theor. Math. Phys. 3(3), 627–670 (1999) 21. Lejay, A.: An introduction to rough paths. 
Séminaire de probabilités XXXVII, Lecture Notes in Mathematics, Berlin-Heidelberg-NewYork: Springer, 2003 22. Lyons, T., Qian, Z.: System control and rough paths. Oxford: Oxford University Press, 2002 23. Lyons, T., Victoir, N.: An extension theorem to rough paths. Ann. Inst. H. Poincaré Anal. Non Linéaire 24(5), 835–847 (2007) 24. Murua, A.: The shuffle Hopf algebra and the commutative Hopf algebra of labelled rooted trees. Available on www.ehu.es/ccwmuura/research/shart1bb.pdf, 2005 25. Murua, A.: The Hopf algebra of rooted trees, free Lie algebras, and Lie series. Found. Comput. Math. 6(4), 387–426 (2006) 26. Nualart, D.: Stochastic calculus with respect to the fractional Brownian motion and applications. Contemporary Mathematics 336, 3–39 (2003) 27. Rivasseau, V.: From Perturbative to Constructive Renormalization. Princeton Series in Physics, Princeton, NJ: Princeton Univ. Press, 1991 28. Tindel, S., Unterberger, J.: The rough path associated to the multidimensional analytic fBm with any Hurst parameter. Preprint available at http://arxiv.org/abs/0810.1408[math.PR], 2008 29. Treves, F.: Introduction to pseudodifferential and Fourier integral operators. Vol. 1. Pseudodifferential operators, The University Series in Mathematics, New York-London: Plenum Press, 1980 30. Triebel, H.: Spaces of Besov-Hardy-Sobolev type. Leipzig: Teubner, 1978 31. Triebel, H.: Theory of function spaces. II. Monographs in Mathematics, 84, Basel: Birkhäuser, 1992 32. Unterberger, J.: Stochastic calculus for fractional Brownian motion with Hurst parameter H > 1/4; a rough path method by analytic extension. Ann. Prob. 37(2), 565–614 (2009) 33. Unterberger, J.: A central limit theorem for the rescaled Lévy area of two-dimensional fractional Brownian motion with Hurst index H < 1/4. Preprint available at http://arxiv.org/abs/0808.3458v2[math.PR], 2008 34. Unterberger, J.: A rough path over multi-dimensional fractional Brownian motion with arbitrary Hurst index by Fourier normal ordering. 
Preprint available at http://arxiv.org/abs/0901.4771v2[math.PR], 2009 35. Unterberger, J.: A Lévy area by Fourier normal ordering for multidimensional fractional Brownian motion with small Hurst index. Preprint available at http://arxiv.org/abs/0906.1416v1[math.PR], 2009 36. Waldschmidt, M.: Valeurs zêta multiples. Une introduction. Journal de Théorie Des Nombres de Bordeaux 12(2), 581–595 (2000) Communicated by A. Connes
Commun. Math. Phys. 298, 37–64 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1066-z
Communications in
Mathematical Physics
Geometrization and Generalization of the Kowalevski Top
Vladimir Dragović 1,2
1 Mathematical Institute SANU, Kneza Mihaila 36, 11000 Belgrade, Serbia. E-mail: [email protected]
2 Mathematical Physics Group, University of Lisbon, Av. Prof. Gama Pinto, 2, PT-1649-003 Lisboa, Portugal
Received: 19 May 2009 / Accepted: 16 February 2010 Published online: 20 May 2010 – © Springer-Verlag 2010
Dedicated to my teacher Boris Anatol'evich Dubrovin on the occasion of his sixtieth birthday
Abstract: A new view on the Kowalevski top and the Kowalevski integration procedure is presented. For more than a century, the Kowalevski 1889 case has attracted the full attention of a wide community as the highlight of the classical theory of integrable systems. Despite hundreds of papers on the subject, the Kowalevski integration is still understood as a magic recipe, an unbelievable sequence of skillful tricks, unexpected identities and smart changes of variables. The novelty of our present approach is based on four observations. The first one is that the so-called fundamental Kowalevski equation is an instance of a pencil equation of the theory of conics, which leads us to a new geometric interpretation of the Kowalevski variables w, x1, x2 as the pencil parameter and the Darboux coordinates, respectively. The second is the observation of the key algebraic property of the pencil equation, which is followed by the introduction and study of a new class of discriminantly separable polynomials. All steps of the Kowalevski integration procedure are now derived as easy and transparent logical consequences of our theory of discriminantly separable polynomials. The third observation connects the Kowalevski integration and the pencil equation with the theory of multi-valued groups. The Kowalevski change of variables is now recognized as an example of a two-valued group operation and its action. The final observation is the surprising equivalence of the associativity of the two-valued group operation and its action to the n = 3 case of the Great Poncelet Theorem for pencils of conics.
Contents
1. Introduction
2. Pencils of Conics and Discriminantly Separable Polynomials
2.1 Pencils of conics and the Darboux coordinates
2.2 Discriminantly separable polynomials
3. Geometric Interpretation of the Kowalevski Fundamental Equation
4. Generalized Integrable System
4.1 Equations of motion and the first integrals
4.2 Generalized Kotter transformation
4.3 Interpretation of the equations of motion
5. Two-Valued Groups, Kowalevski Equation and Poncelet Porism
5.1 Multivalued groups: defining notions
5.2 The simplest case: 2-valued group p2
5.3 2-valued group structure on CP1, the Kowalevski fundamental equation and Poncelet porism
1. Introduction
The goal of this paper is to give a new view on the Kowalevski top and the Kowalevski integration procedure. For more than a century, the Kowalevski 1889 case [25] has attracted the full attention of a wide community as the highlight of the classical theory of integrable systems. Despite hundreds of papers on the subject, the Kowalevski integration is still understood as a magic recipe, an unbelievable sequence of skillful tricks, unexpected identities and smart changes of variables (see for example [1,2,4,11,14,17,18,20,22–24,26–29,32] and references therein).
The novelty of this paper is based on four observations. The first one is that the so-called fundamental Kowalevski equation (see [20,24,25])
$$Q(w, x_1, x_2) = 0$$
is an instance of a pencil equation from the theory of conics. This leads us to a new interpretation of the Kowalevski variables w, x1, x2 as the pencil parameter and the Darboux coordinates respectively. Origins and classical applications of the Darboux coordinates can be found in Darboux's book [9], while some modern applications can be found in [12,13].
The second is the observation of the key algebraic property of the pencil equation: all three of its discriminants are expressed as products of two polynomials in one variable each:
$$D_w(Q)(x_1, x_2) = f_1(x_1) f_2(x_2), \quad D_{x_1}(Q)(w, x_2) = f_3(w) f_2(x_2), \quad D_{x_2}(Q)(w, x_1) = f_1(x_1) f_3(w).$$
This motivates us to introduce a new class of what we call discriminantly separable polynomials. We develop the theory of such polynomials. All steps of the Kowalevski integration now follow as easy and transparent logical consequences of our theory of discriminantly separable polynomials.
The third observation connects the Kowalevski integration and the pencil equation with the theory of multivalued groups. The theory of multivalued groups was started at the beginning of the 1970s by Buchstaber and Novikov (see [5]). It has been further developed by Buchstaber and his collaborators over the last forty years (see [6–8]). The Kowalevski change of variables is now recognized as a case of the two-valued group operation $(\Gamma_2, \mathbb{Z}_2)$ and its action, where $\Gamma_2$ is an elliptic curve and $\mathbb{Z}_2$ its subgroup.
Our final observation is the surprising equivalence of the associativity condition for this two-valued group operation to a case of the Great Poncelet Theorem for triangles. A well-known mechanical interpretation of the Great Poncelet Theorem is connected with
integrable billiards, see for example [15]. The Great Poncelet Theorem is the milestone of the theory of pencils of conics and of the whole of classical projective geometry (see [30], and also [3,15,16] and references therein), as the Kowalevski top is the milestone of classical integrable systems. Now we manage to relate them closely. As a consequence, we get a new connection between the Great Poncelet Theorem and integrable mechanical systems, this time from rigid-body dynamics.
The paper is organized as follows. Section 2 starts with a subsection devoted to pencils of conics and the Darboux coordinates. We derive the key property of the pencil equation: discriminant separability. In the second subsection, we formally introduce the class of discriminantly separable polynomials and systematically study this class. In Sect. 3 we show how the Kowalevski case is embedded into our more general framework. A new geometric interpretation of the Kowalevski variables (w, x1, x2) as the pencil parameter and the Darboux coordinates is obtained. In Sect. 4 general systems are defined, related to the general equation of the pencil. The Kowalevski top can be seen as a special subcase. The first integrals are studied. Their properties are related to the properties of discriminantly separable polynomials obtained in Sect. 2. This is done by use of what we call the Kotter trick (see [20,24]). The nature of this transformation is clarified in Sect. 5 through the theory of multivalued groups. Then, we manage to generalize another Kotter transformation, and this gives us a possibility to integrate the general system defined at the beginning of this section. We reduce the problem to the functions Pi, i = 1, 2, 3. The evolution of those functions in terms of the theta-functions was obtained by Kowalevski herself in [25]. A modern account of the theta-functions and their applications to nonlinear equations can be found for example in [17].
Section 5 is devoted to two-valued groups and their connection with the Kowalevski top and the Great Poncelet Theorem. In order to make the text as self-contained as possible, we start the section with a brief introduction to the theory of multivalued groups, following works of Buchstaber and his co-workers. The main role is played by a two-valued coset group obtained from an elliptic curve $\Gamma_2$ and its subgroup $\mathbb{Z}_2$. It appears that the Kowalevski change of variables has its natural expression through this two-valued group and its action. These results complete the picture obtained before by Weil in [33] and Jurdjevic [23]. Within this framework, we give an explanation of the Kotter trick, as we promised in Sect. 4. Finally, we show that the associativity condition for the two-valued group $(\Gamma_2, \mathbb{Z}_2)$ is equivalent to the famous Great Poncelet Theorem ([30]) in its basic n = 3 case.
2. Pencils of Conics and Discriminantly Separable Polynomials
2.1. Pencils of conics and the Darboux coordinates. Let us start with two conics C1 and C2 given by their tangential equations:
$$C_1 : a_0 w_1^2 + a_2 w_2^2 + a_4 w_3^2 + 2a_3 w_2 w_3 + 2a_5 w_1 w_3 + 2a_1 w_1 w_2 = 0; \qquad C_2 : w_2^2 - 4 w_1 w_3 = 0. \qquad (1)$$
We assume that the conics C1 and C2 are in general position. Consider the pencil C(s) of conics C1 + sC2. The conics from the pencil share four common tangents. The coordinate equation of the conics of the pencil is:
$$F(s, z_1, z_2, z_3) := \det M(s, z_1, z_2, z_3) = 0, \qquad (2)$$
where M is a bordered matrix of the form
$$M(s, z_1, z_2, z_3) = \begin{bmatrix} 0 & z_1 & z_2 & z_3 \\ z_1 & a_0 & a_1 & a_5 - 2s \\ z_2 & a_1 & a_2 + s & a_3 \\ z_3 & a_5 - 2s & a_3 & a_4 \end{bmatrix}. \qquad (3)$$
Then the point equation of the pencil of conics C(s) has the form of a quadratic polynomial in s,
$$F := H + Ks + Ls^2 = 0, \qquad (4)$$
where H, K and L are quadratic expressions in (z1, z2, z3). Following Darboux (see [9]), we introduce a new system of coordinates in the plane. Given a plane with standard coordinates (z1, z2, z3), we start from the given conic C2. The conic is given by Eq. (1) and it is rationally parameterized by $(1, \ell, \ell^2)$. The tangent line to the conic C2 through the point with the parameter $\ell_0$ is given by the equation
$$t_{C_2}(\ell_0) : z_1 \ell_0^2 - 2 z_2 \ell_0 + z_3 = 0.$$
On the other hand, to a given point P in the plane with coordinates $P = (\hat z_1, \hat z_2, \hat z_3)$ there correspond two solutions x1 and x2 of the equation quadratic in $\ell$:
$$\hat z_1 \ell^2 - 2 \hat z_2 \ell + \hat z_3 = 0. \qquad (5)$$
Each solution corresponds to a tangent to the conic C2 from the point P. We will call the pair (x1, x2) the Darboux coordinates of the point P. One finds immediately the converse formulae
$$\hat z_1 = 1, \quad \hat z_2 = \frac{x_1 + x_2}{2}, \quad \hat z_3 = x_1 x_2. \qquad (6)$$
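The passage between projective and Darboux coordinates described by (5) and (6) can be illustrated with a short computation. This is a sketch under stated assumptions only: the function names are invented here, the first coordinate is assumed nonzero, and complex square roots are used so that points from which no real tangent to C2 exists are also handled.

```python
import cmath


def darboux_coordinates(z1, z2, z3):
    """Roots (x1, x2) of z1*l**2 - 2*z2*l + z3 = 0, i.e. Eq. (5); assumes z1 != 0."""
    root = cmath.sqrt(z2 * z2 - z1 * z3)
    return (z2 + root) / z1, (z2 - root) / z1


def point_from_darboux(x1, x2):
    """Converse formulae (6), normalised so that the first coordinate is 1."""
    return 1.0, (x1 + x2) / 2.0, x1 * x2


if __name__ == "__main__":
    # Illustrative point (2 : 3 : -8); any projective rescaling gives the same roots.
    x1, x2 = darboux_coordinates(2.0, 3.0, -8.0)
    print(x1, x2)
    print(point_from_darboux(x1, x2))
```

The round trip returns the original point up to the projective rescaling that sets the first coordinate to 1.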
We change the variables in the polynomial F from projective coordinates (z1 : z2 : z3) to the Darboux coordinates according to formulae (6). In the new coordinates, expanding the determinant (2) gives the formulae:
$$H(x_1, x_2) = (a_1^2 - a_0 a_2) x_1^2 x_2^2 + (a_0 a_3 - a_5 a_1) x_1 x_2 (x_1 + x_2) + \tfrac{1}{4}(a_5^2 - a_0 a_4)(x_1^2 + x_2^2) + \left(2(a_5 a_2 - a_1 a_3) + \tfrac{1}{2}(a_5^2 - a_0 a_4)\right) x_1 x_2 + (a_1 a_4 - a_3 a_5)(x_1 + x_2) + a_3^2 - a_2 a_4,$$
$$K(x_1, x_2) = -a_0 x_1^2 x_2^2 + 2a_1 x_1 x_2 (x_1 + x_2) - a_5 (x_1^2 + x_2^2) - 4a_2 x_1 x_2 + 2a_3 (x_1 + x_2) - a_4,$$
$$L(x_1, x_2) = (x_1 - x_2)^2. \qquad (7)$$
We may notice for further reference that, with $z_1 = 1$, $(x_1 - x_2)^2 = 4(z_2^2 - z_1 z_3)$. Now, the polynomial
$$F(s, x_1, x_2) = L(x_1, x_2) s^2 + K(x_1, x_2) s + H(x_1, x_2) \qquad (8)$$
is of the second degree in each of variables s, x1 and x2 and it is symmetric in (x1 , x2 ). It has one very exceptional property, as described in the next theorem. For a polynomial P(y1 , y2 , . . . , yn ) of variables (y1 , y2 , . . . , yn ) we will denote its discriminant with respect to the variable yi by D yi (P) which is a polynomial of the rest of the variables (y1 , . . . , yi−1 , yi+1 , . . . , yn ). Theorem 1. (i) There exists a polynomial P = P(x) such that the discriminant of the polynomial F in s as a polynomial in variables x1 and x2 separates the variables: Ds (F)(x1 , x2 ) = P(x1 )P(x2 ).
(9)
(ii) There exists a polynomial J = J (s) such that the discriminant of the polynomial F in x2 as a polynomial in variables x1 and s separates the variables: Dx2 (F)(s, x1 ) = J (s)P(x1 ).
(10)
Due to the symmetry between x1 and x2 the last statement remains valid after exchanging the places of x1 and x2 . Proof.
(i) A general point belongs to two conics of a tangential pencil. If a point belongs to only one conic, then it belongs to one of the four common tangents of the pencil. At such a point, this unique conic touches one of the four common tangents. Thus, the equation
$$D_s(F)(x_1, x_2) = 0, \qquad (11)$$
which expresses the vanishing of the discriminant, is the equation of the four common tangents. Thus, Eq. (11) is equivalent to the system
$$x_1 = c_1,\ x_1 = c_2,\ x_1 = c_3,\ x_1 = c_4,\ x_2 = c_1,\ x_2 = c_2,\ x_2 = c_3,\ x_2 = c_4,$$
where the $c_i$ are the parameters which correspond to the points of contact of the four common tangents with the conic C2. As a consequence, we get
$$D_s(F)(x_1, x_2) = P(x_1) P(x_2),$$
where the polynomial P is of the fourth degree and of the form $P(x) = a(x - c_1)(x - c_2)(x - c_3)(x - c_4)$. This proves the first part of the theorem. The second part of the theorem follows from the following lemma:
Lemma 1. Given a polynomial S = S(x, y, z) of the second degree in each of its variables in the form
$$S(x, y, z) = A(y, z) x^2 + 2B(y, z) x + C(y, z),$$
if there are polynomials P1 and P2 of the fourth degree such that
$$B(y, z)^2 - A(y, z) C(y, z) = P_1(y) P_2(z), \qquad (12)$$
then there exists a polynomial f such that
$$D_y S(x, z) = f(x) P_2(z), \qquad D_z S(x, y) = f(x) P_1(y).$$
Proof. To prove the lemma, rewrite Eq. (12) in the equivalent form (B + u A)2 − A(u 2 A + 2u B + C) = P1 (y)P2 (z). For a zero y = y0 of the polynomial P1 , any zero of S(u, y0 , z) as a polynomial in z is a double zero, according to the last equation. Thus, y0 is a zero of Dz S(x, y). Thus, the polynomial P1 is a factor of the polynomial Dz S(x, y). Since the degree of the polynomial P1 is four, then there exists a polynomial f in x such that Dz S(x, y) = f (x)P1 (y). The rest of the lemma follows by double application of the same arguments.
(ii) Now, the proof of the second part of Theorem 1 follows by immediate application of Lemma 1. Proposition 1.
(i) The explicit formulae for the polynomials P and J are
$$P(x) = a_0 x^4 - 4a_1 x^3 + (2a_5 + 4a_2) x^2 - 4a_3 x + a_4,$$
$$J(s) = -4s^3 + 4(a_5 - a_2) s^2 + (a_0 a_4 - a_5^2 + 4(a_5 a_2 - a_1 a_3)) s - a_3^2 a_0 + a_0 a_4 a_2 + 2a_1 a_3 a_5 - a_4 a_1^2 - a_2 a_5^2. \qquad (13)$$
(ii) If all the zeros of the polynomial P are simple, then the elliptic curves
$$\Gamma_1 : y^2 = P(x), \qquad \Gamma_2 : t^2 = J(s)$$
are isomorphic and the latter can be understood as the Jacobian of the former.
Proof. Instead of a straightforward calculation, we consider a double-bordered determinant (see [9,21,31]) obtained from the matrix M of (3):
$$\hat M = \begin{vmatrix} 0 & 0 & z_1 & z_2 & z_3 \\ 0 & 0 & z_1' & z_2' & z_3' \\ z_1 & z_1' & a_0 & a_1 & a_5 - 2s \\ z_2 & z_2' & a_1 & a_2 + s & a_3 \\ z_3 & z_3' & a_5 - 2s & a_3 & a_4 \end{vmatrix}. \qquad (14)$$
We apply the Jacobi identity and get
$$\hat M_{11} \hat M_{22} - (\hat M_{12})^2 = \hat M \hat M_{12,12}.$$
Obviously, $\hat M_{12,12}$ is a polynomial only in s, of the third degree:
$$\hat M_{12,12} = -4s^3 + 4(a_5 - a_2) s^2 + ((a_0 a_4 - a_5^2) + 4(a_5 a_2 - a_1 a_3)) s + a_0 a_4 a_2 - a_3^2 a_0 + 2a_1 a_3 a_5 - a_4 a_1^2 - a_2 a_5^2 = J(s).$$
Moreover, if we substitute
$$z_1 = 1,\ z_2 = \frac{x_1 + x_2}{2},\ z_3 = x_1 x_2, \qquad z_1' = 1,\ z_2' = \frac{x_1 + x_2'}{2},\ z_3' = x_1 x_2',$$
we have
$$\hat M = P(x_1) \frac{(x_2 - x_2')^2}{4}, \qquad \hat M_{11} = F(s, x_1, x_2'), \qquad \hat M_{22} = F(s, x_1, x_2).$$
If we denote
$$F(s, x_1, x_2) = T(s, x_1) x_2^2 + V(s, x_1) x_2 + W(s, x_1),$$
then
$$\hat M_{12} = T x_2 x_2' + V \frac{x_2 + x_2'}{2} + W.$$
From the last equations, after dividing by $(x_2 - x_2')^2$, we get
$$V^2 - 4TW = J(s) P(x_1),$$
and the proof of the first part of the proposition is finished. The second part follows by direct calculation of the correspondence between two elliptic curves, one of which is defined by a polynomial of degree 3 and the other by a polynomial of degree 4.
2.2. Discriminantly separable polynomials. We saw that a polynomial of three variables which defines a pencil of conics has a very peculiar property: all three of its discriminants are representable as products of two polynomials of one variable each. These considerations motivate the following definition.
Definition 1. For a polynomial $F(x_1, \ldots, x_n)$ we say that it is discriminantly separable if there exist polynomials $f_i(x_i)$ such that for every i = 1, . . . , n,
$$D_{x_i} F(x_1, \ldots, \hat x_i, \ldots, x_n) = \prod_{j \ne i} f_j(x_j).$$
It is symmetrically discriminantly separable if $f_2 = f_3 = \cdots = f_n$, while it is strongly discriminantly separable if $f_1 = f_2 = f_3 = \cdots = f_n$.
It is weakly discriminantly separable if there exist polynomials $f_i^j(x_i)$ such that for every i = 1, . . . , n,
$$D_{x_i} F(x_1, \ldots, \hat x_i, \ldots, x_n) = \prod_{j \ne i} f_j^i(x_j).$$
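As a sanity check on Definition 1 for the pencil polynomial (8), the separation (9) of the discriminant in s can be tested in exact rational arithmetic. The sketch below is illustrative only: the function names are invented here, and H is written in terms of $x_1 x_2$ and $x_1 + x_2$ as obtained by expanding the bordered determinant (3), which is an assumption about the intended normalisation of (7). Only part (i) of Theorem 1 is tested.

```python
from fractions import Fraction


def pencil_coefficients(a, x1, x2):
    """Return (H, K, L) of F = L*s**2 + K*s + H in Darboux coordinates.

    a = (a0, a1, a2, a3, a4, a5); H is the expansion of det M with
    z = (1, (x1 + x2)/2, x1*x2), grouped by p = x1*x2 and sigma = x1 + x2.
    """
    a0, a1, a2, a3, a4, a5 = a
    p, sigma = x1 * x2, x1 + x2
    L = (x1 - x2) ** 2
    K = (-a0 * p * p + 2 * a1 * p * sigma - a5 * (x1 * x1 + x2 * x2)
         - 4 * a2 * p + 2 * a3 * sigma - a4)
    H = ((a1 * a1 - a0 * a2) * p * p + (a0 * a3 - a5 * a1) * p * sigma
         + Fraction(1, 4) * (a5 * a5 - a0 * a4) * sigma * sigma
         + 2 * (a5 * a2 - a1 * a3) * p + (a1 * a4 - a3 * a5) * sigma
         + a3 * a3 - a2 * a4)
    return H, K, L


def quartic_P(a, x):
    """The quartic P of Proposition 1."""
    a0, a1, a2, a3, a4, a5 = a
    return a0 * x**4 - 4 * a1 * x**3 + (2 * a5 + 4 * a2) * x**2 - 4 * a3 * x + a4


def discriminant_in_s(a, x1, x2):
    """Discriminant K**2 - 4*L*H of F regarded as a quadratic in s."""
    H, K, L = pencil_coefficients(a, x1, x2)
    return K * K - 4 * L * H


if __name__ == "__main__":
    a = (3, -2, 5, 7, -1, 4)          # arbitrary illustrative coefficients
    x1, x2 = Fraction(1, 2), Fraction(7, 3)
    print(discriminant_in_s(a, x1, x2) == quartic_P(a, x1) * quartic_P(a, x2))
```

Exact fractions are used so that the comparison is an identity check at the sampled points rather than a floating-point approximation.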
Theorem 2. Given a polynomial F(s, x1, x2) of the second degree in each of the variables s, x1, x2 of the form
$$F = s^2 A(x_1, x_2) + 2B(x_1, x_2) s + C(x_1, x_2),$$
denote by $T_{B^2 - AC}$ a 5 × 5 matrix such that
$$(B^2 - AC)(x_1, x_2) = \sum_{j=1}^{5} \sum_{i=1}^{5} T^{ij}_{B^2 - AC}\, x_1^{i-1} x_2^{j-1}.$$
Then, the polynomial F is discriminantly separable if and only if rank $T_{B^2 - AC} = 1$.
Proof. The proof follows from Lemma 1 and the observation that a polynomial in two variables is equal to a product of two polynomials in one variable if and only if its matrix is equal to a tensor product of two vectors. The last condition is equivalent to the condition that the rank of the last matrix be equal to 1.
Proposition 2. Given a polynomial F(s, x1, x2) of the second degree in each of the variables s, x1, x2 of the form
$$F = s^2 A(x_1) + 2B(x_1, x_2) s + C(x_2),$$
where A depends only on x1 and C depends only on x2, denote by $T_{B^2}$ a 5 × 5 matrix such that
$$(B^2)(x_1, x_2) = \sum_{j=1}^{5} \sum_{i=1}^{5} T^{ij}_{B^2}\, x_1^{i-1} x_2^{j-1}.$$
Then, the polynomial F is discriminantly separable if and only if rank $T_{B^2} = 2$.
Proof. The proof follows from the observation in the proof of the last theorem and the fact that a matrix of rank two is equal to a sum of two matrices of rank one.
The last proposition gives a method to construct nonsymmetric discriminantly separable polynomials.
Lemma 2. Given an arbitrary quadratic polynomial $F = s^2 A + 2Bs + C$, the square of its differential is equal to its discriminant under the condition F = 0:
$$\left(\frac{dF}{ds}\right)^2 = 4(B^2 - AC).$$
Corollary 1. For an arbitrary discriminantly separable polynomial F(x3, x1, x2) of the second degree in each of the variables x3, x1, x2, its differential is separable on the surface F(x3, x1, x2) = 0:
$$\frac{dF}{\sqrt{f_3(x_3) f_1(x_1) f_2(x_2)}} = \frac{dx_3}{\sqrt{f_3(x_3)}} + \frac{dx_1}{\sqrt{f_1(x_1)}} + \frac{dx_2}{\sqrt{f_2(x_2)}}.$$
The proof of the corollary is a straightforward application of the previous statements. This property of discriminantly separable polynomials is fundamental to their role in the theory of integrable systems. Observe that the analogous statement is valid for arbitrary discriminantly separable polynomials. From the last corollary, applied to a symmetric discriminantly separable polynomial of the second degree, a variant of the Euler theorem immediately follows.
Corollary 2. The condition x3 = const defines a conic from the pencil as an integral curve of the Euler equation:
$$\frac{dx_1}{\sqrt{f_1(x_1)}} + \frac{dx_2}{\sqrt{f_1(x_2)}} = 0,$$
where f1 is a general polynomial of degree 4.
Proposition 3. All symmetric discriminantly separable polynomials F(s, x1, x2) of degree two in each variable with the leading coefficient $L(x_1, x_2) = (x_1 - x_2)^2$ are of the form
$$F(s, x_1, x_2) = (x_1 - x_2)^2 s^2 + K(x_1, x_2) s + H(x_1, x_2),$$
where K and H are given by formulae (7).
The next lemma gives a possibility to create new discriminantly separable polynomials from a given one.
Lemma 3. Given a discriminantly separable polynomial
$$F(s, x_1, x_2) := A(x_1, x_2) s^2 + 2B(x_1, x_2) s + C(x_1, x_2)$$
of the second degree in each variable:
(a) Let α(x) be a linear transformation. Then the polynomial $F_1(s, x_1, x_2) := F(s, \alpha(x_1), x_2)$ is discriminantly separable.
(b) The polynomial $\hat F(s, x_1, x_2) := C(x_1, x_2) s^2 + 2B(x_1, x_2) s + A(x_1, x_2)$ is discriminantly separable.
The transformation from F to $\hat F$ described in Lemma 3 (b) maps a solution s of the equation F = 0 to 1/s. We will use the term transposition for such a transformation from F to $\hat F$. Thus, summarizing, we get
Corollary 3. Given a discriminantly separable polynomial
$$F(s, x_1, x_2) := A(x_1, x_2) s^2 + 2B(x_1, x_2) s + C(x_1, x_2)$$
of the second degree in each variable and three fractionally-linear transformations α, β, γ, the polynomial $F_1(s, x_1, x_2) := F(\gamma(s), \alpha(x_1), \beta(x_2))$ is discriminantly separable.
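Returning to the rank criterion in the proof of Theorem 2, the statement that a polynomial in two variables factors as a product of one-variable polynomials exactly when its coefficient matrix has rank one can be checked mechanically. The following is a minimal sketch; the helper names and the two sample quartics are hypothetical, and the test only covers the factorisation step, not the full discriminant computation.

```python
def is_rank_one(T):
    """True if the matrix T is nonzero and all of its 2x2 minors vanish."""
    rows, cols = len(T), len(T[0])
    if all(T[i][j] == 0 for i in range(rows) for j in range(cols)):
        return False
    return all(
        T[i][j] * T[k][l] == T[i][l] * T[k][j]
        for i in range(rows) for k in range(i + 1, rows)
        for j in range(cols) for l in range(j + 1, cols)
    )


def outer(u, v):
    """Coefficient matrix of u(x1) * v(x2); entry [i][j] multiplies x1**i * x2**j."""
    return [[ui * vj for vj in v] for ui in u]


# Coefficient vectors (constant term first) of two hypothetical quartics.
f1 = [1, -4, 0, 2, 3]
f2 = [5, 0, -1, 0, 7]

T_separable = outer(f1, f2)               # matrix of f1(x1) * f2(x2)
T_mixed = [row[:] for row in T_separable]
T_mixed[1][1] += 1                        # add the monomial x1 * x2

if __name__ == "__main__":
    print(is_rank_one(T_separable), is_rank_one(T_mixed))
```

Integer coefficients keep the minor test exact; for floating-point data a tolerance or a singular value computation would be the more robust choice.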
From the last lemma we have a procedure to create non-symmetric discriminantly separable polynomials from a given symmetric discriminantly separable polynomial. The converse statement is also true:
Proposition 4. Given a discriminantly separable polynomial
$$F(s, x_1, x_2) := A(x_1, x_2) s^2 + 2B(x_1, x_2) s + C(x_1, x_2)$$
of the second degree in each variable, suppose that the biquadratic F(s0, x1, x2) is nondegenerate for some value s = s0. Then there exists a fractionally-linear transformation α such that the polynomial $F_1(s, x_1, x_2) := F(s, \alpha(x_1), x_2)$ is symmetrically discriminantly separable.
Proof. Let us fix an arbitrary value for s such that B(x1, x2) is a nondegenerate biquadratic. Keeping s fixed, we have a relation
$$\frac{dx_1}{\sqrt{f_1(x_1)}} \pm \frac{dx_2}{\sqrt{f_2(x_2)}} = 0,$$
where f1, f2 are two polynomials, each in one variable. For a given x1 there are two corresponding points x2 and $\hat x_2$. The last two are connected by the relation
$$\frac{d\hat x_2}{\sqrt{f_2(\hat x_2)}} \pm \frac{dx_2}{\sqrt{f_2(x_2)}} = 0,$$
where now the denominators of both fractions involve one and the same polynomial, f2. This means that there exists an elliptic function u of degree two and a shift T on the elliptic curve $y^2 = f_2(x)$, such that x2 and $\hat x_2$ are parameterized by
$$x_2 = u(z), \qquad \hat x_2 = u(z + T).$$
From the relations
$$B(x_1, x_2) = 0, \qquad B(x_1, \hat x_2) = 0,$$
we see that both y and $y^2$ are elliptic functions of degree at most four which can be expressed through x2, $\hat x_2$. Thus, y is an elliptic function of degree two. There is a fractional-linear transformation which reduces y to u(z + T/2). This concludes the proof of the proposition.
3. Geometric Interpretation of the Kowalevski Fundamental Equation
The magic integration of the Kowalevski top is based on the Kowalevski fundamental equation, see [20,24]:
$$Q(w, x_1, x_2) := (x_1 - x_2)^2 w^2 - 2R(x_1, x_2) w - R_1(x_1, x_2) = 0, \qquad (15)$$
where
$$R(x_1, x_2) = -x_1^2 x_2^2 + 6 l_1 x_1 x_2 + 2lc(x_1 + x_2) + c^2 - k^2,$$
$$R_1(x_1, x_2) = -6 l_1 x_1^2 x_2^2 - (c^2 - k^2)(x_1 + x_2)^2 - 4cl x_1 x_2 (x_1 + x_2) + 6 l_1 (c^2 - k^2) - 4c^2 l^2. \qquad (16)$$
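The comparison carried out in the next paragraph, in which specific coefficient values are substituted into the pencil polynomial and matched against (15) and (16), can be verified mechanically. The sketch below is illustrative: the function names are invented here, the coefficient values are those listed in Eq. (17), and the constant term H is taken from the expansion of the bordered determinant (3), which is an assumption about the normalisation of (7).

```python
from fractions import Fraction


def pencil_F(a, s, x1, x2):
    """F = L*s**2 + K*s + H for the pencil, in Darboux coordinates."""
    a0, a1, a2, a3, a4, a5 = a
    p, sigma = x1 * x2, x1 + x2
    L = (x1 - x2) ** 2
    K = (-a0 * p * p + 2 * a1 * p * sigma - a5 * (x1 * x1 + x2 * x2)
         - 4 * a2 * p + 2 * a3 * sigma - a4)
    H = ((a1 * a1 - a0 * a2) * p * p + (a0 * a3 - a5 * a1) * p * sigma
         + Fraction(1, 4) * (a5 * a5 - a0 * a4) * sigma * sigma
         + 2 * (a5 * a2 - a1 * a3) * p + (a1 * a4 - a3 * a5) * sigma
         + a3 * a3 - a2 * a4)
    return L * s * s + K * s + H


def kowalevski_Q(l1, l, c, k, w, x1, x2):
    """Left-hand side of the Kowalevski fundamental equation (15)-(16)."""
    R = -x1**2 * x2**2 + 6 * l1 * x1 * x2 + 2 * l * c * (x1 + x2) + c * c - k * k
    R1 = (-6 * l1 * x1**2 * x2**2 - (c * c - k * k) * (x1 + x2) ** 2
          - 4 * c * l * x1 * x2 * (x1 + x2)
          + 6 * l1 * (c * c - k * k) - 4 * c * c * l * l)
    return (x1 - x2) ** 2 * w * w - 2 * R * w - R1


def kowalevski_coefficients(l1, l, c, k):
    """(a0, a1, a2, a3, a4, a5) according to Eq. (17)."""
    return (-2, 0, 3 * l1, -2 * c * l, 2 * (c * c - k * k), 0)


if __name__ == "__main__":
    l1, l, c, k = Fraction(2, 3), Fraction(-1, 2), Fraction(3), Fraction(5, 4)
    w, x1, x2 = Fraction(7, 5), Fraction(1, 3), Fraction(-2)
    a = kowalevski_coefficients(l1, l, c, k)
    print(pencil_F(a, w, x1, x2) == kowalevski_Q(l1, l, c, k, w, x1, x2))
```

Under these assumptions the two polynomials agree term by term, with K = −2R and H = −R1.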
If we substitute in Eqs. (4) and (7) the following values for the coefficients:
$$a_0 = -2, \quad a_1 = 0, \quad a_5 = 0, \quad a_2 = 3 l_1, \quad a_3 = -2cl, \quad a_4 = 2(c^2 - k^2), \qquad (17)$$
and compare with (15) and (16), we get the following
Theorem 3. The Kowalevski fundamental equation represents a point pencil of conics given by their tangential equations
$$\hat C_1 : -2 w_1^2 + 3 l_1 w_2^2 + 2(c^2 - k^2) w_3^2 - 4cl\, w_2 w_3 = 0; \qquad C_2 : w_2^2 - 4 w_1 w_3 = 0. \qquad (18)$$
The Kowalevski variables w, x1, x2 in these geometric settings are the pencil parameter and the Darboux coordinates with respect to the conic C2, respectively.
The Kowalevski case corresponds to the general case under the restrictions
$$a_1 = 0, \quad a_5 = 0, \quad a_0 = -2.$$
The last of these three relations is just a normalization condition, provided $a_0 \ne 0$. The Kowalevski parameters l1, l, c are calculated by the formulae
$$l_1 = \frac{a_2}{3}, \quad l = \pm\frac{1}{2}\sqrt{-a_4 + \sqrt{a_4^2 + 4a_3^2}}, \quad c = \mp\frac{a_3}{\sqrt{-a_4 + \sqrt{a_4^2 + 4a_3^2}}},$$
provided that l and c are required to be real. Let us mention at the end of this section that in the original paper [25], instead of the relation (15), Kowalevski used the equivalent one:
$$\hat Q(s, x_1, x_2) := (x_1 - x_2)^2 \left(s - \frac{l_1}{2}\right)^2 - R(x_1, x_2)\left(s - \frac{l_1}{2}\right) - \frac{R_1(x_1, x_2)}{4} = 0.$$
The equivalence is obtained by putting $w = 2s - l_1$.
4. Generalized Integrable System
4.1. Equations of motion and the first integrals. We are going to consider the following system of differential equations in the unknown functions e1, e2, x1, x2, r, g:
$$\frac{de_1}{dt} = -\alpha e_1, \qquad \frac{de_2}{dt} = \alpha e_2,$$
$$\frac{dx_1}{dt} = -\beta(r x_1 + cg), \qquad \frac{dx_2}{dt} = \beta(r x_2 + cg),$$
$$\frac{dr}{dt} = -\beta(x_2 - x_1)(x_1 + x_2 + a_1) - \frac{\alpha}{2r}(e_1 - e_2), \qquad (19)$$
$$\frac{dg}{dt} = \frac{\beta}{2c}\left[(x_2 - x_1)(x_1 x_2 - a_5) + e_1 x_2 - e_2 x_1\right] + \frac{(2r\beta - \alpha)}{2c^2 g}\left(e_1 x_2^2 - e_2 x_1^2\right).$$
Here $\beta$ and $\alpha$ are given functions of $e_1, e_2, x_1, x_2, r, g$. The choice of their form defines different systems. The Kowalevski top is equivalent to the above system for $a_1 = 0$, $a_5 = 0$, with the choice
\[ \alpha = ir, \qquad \beta = \frac{i}{2}. \tag{20} \]
We will assume in what follows that $a_1$ and $a_5$ are general. Besides the last choice for $\alpha$ and $\beta$, there are many other choices which also provide polynomial vector fields, such as (A) $\alpha = kr^2$, $\beta = k_2 r$; (B) $\alpha = krg$, $\beta = k_1 g$; (C) $\alpha = kr^2 g$, $\beta = k_1 g$. Interesting cases satisfy the system (38) from Proposition 8.

Proposition 5. The system (19) has the following first integrals:
\[
\begin{aligned}
k^2 &= e_1 \cdot e_2, \\
a_0 a_2 &= e_1 + e_2 - (x_1 + x_2)^2 - 2a_1(x_1 + x_2) - r^2, \\
-\frac{a_0 a_3}{2} &= -x_2 e_1 - x_1 e_2 + x_1 x_2 (x_1 + x_2) + \frac{a_5}{2}(x_1 + x_2) + a_1 x_1 x_2 - rg, \\
\frac{a_0 a_4}{4} &= x_2^2 e_1 + x_1^2 e_2 - x_1^2 x_2^2 - a_5 x_1 x_2 - g^2.
\end{aligned}
\tag{21}
\]
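The conservation of the first two quantities in (21) can be checked exactly at a sample point. The following is a minimal sketch, assuming the form of system (19) with $dr/dt = -\beta(x_2 - x_1)(x_1 + x_2 + a_1) - \frac{\alpha}{2r}(e_1 - e_2)$; the function name and the sample values are illustrative only, and $\alpha$, $\beta$ are treated as arbitrary numbers at the point, since these two integrals are conserved whatever their form.

```python
from fractions import Fraction as Fr

def lie_derivatives(e1, e2, x1, x2, r, g, alpha, beta, a1, c):
    """Time derivatives of k^2 = e1*e2 and of a0*a2 along system (19)."""
    de1, de2 = -alpha * e1, alpha * e2
    dx1 = -beta * (r * x1 + c * g)
    dx2 = beta * (r * x2 + c * g)
    dr = -beta * (x2 - x1) * (x1 + x2 + a1) - alpha * (e1 - e2) / (2 * r)
    # d/dt (e1 * e2)
    d_k2 = de1 * e2 + e1 * de2
    # d/dt [e1 + e2 - (x1 + x2)^2 - 2*a1*(x1 + x2) - r^2]
    d_a0a2 = de1 + de2 - 2 * (x1 + x2 + a1) * (dx1 + dx2) - 2 * r * dr
    return d_k2, d_a0a2

# Arbitrary rational sample point (e1, e2, x1, x2, r, g, alpha, beta, a1, c).
sample = [Fr(n, d) for n, d in [(3, 2), (-5, 7), (2, 3), (7, 4), (5, 6),
                                (-1, 3), (4, 9), (-2, 5), (1, 2), (3, 1)]]
print(lie_derivatives(*sample))
```

Exact rational arithmetic is used so that the check is an identity test rather than a floating-point approximation. The remaining two integrals involve $dg/dt$ and are not covered by this sketch.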
One can rewrite the last relations in the following form:
\[
\begin{aligned}
k^2 &= e_1 \cdot e_2, \\
r^2 &= e_1 + e_2 + \hat{E}(x_1, x_2), \\
rg &= -x_2 e_1 - x_1 e_2 + \hat{F}(x_1, x_2), \\
g^2 &= x_2^2 e_1 + x_1^2 e_2 + \hat{G}(x_1, x_2),
\end{aligned}
\tag{22}
\]
where
\[
\begin{aligned}
\hat{E}(x_1, x_2) &= -a_0 a_2 - K(x_1 + x_2)^2 - 2a_1(x_1 + x_2), \\
\hat{F}(x_1, x_2) &= \frac{a_0 a_3}{2} + K x_1 x_2 (x_1 + x_2) + \frac{a_5}{2}(x_1 + x_2) + a_1 x_1 x_2, \\
\hat{G}(x_1, x_2) &= -\frac{a_0 a_4}{4} - K x_1^2 x_2^2 - a_5 x_1 x_2,
\end{aligned}
\tag{23}
\]
with $K = 1$.

Lemma 4. If the polynomials $\hat{E}, \hat{F}, \hat{G}$ are defined by Eq. (23), then the polynomial
\[ P(x_1) := \hat{E}(x_1, x_2)\,x_1^2 + 2\hat{F}(x_1, x_2)\,x_1 + \hat{G}(x_1, x_2) \]
depends only on $x_1$.

Proposition 6. Three polynomials $\hat{E}(x_1, x_2)$, $\hat{F}(x_1, x_2)$, $\hat{G}(x_1, x_2)$ of the second degree in each variable are given such that
(1) Polynomials $P, Q$ defined by
\[
\begin{aligned}
P(x_1) &:= \hat{E}(x_1, x_2)\,x_1^2 + 2\hat{F}(x_1, x_2)\,x_1 + \hat{G}(x_1, x_2), \\
Q(x_2) &:= \hat{E}(x_1, x_2)\,x_2^2 + 2\hat{F}(x_1, x_2)\,x_2 + \hat{G}(x_1, x_2)
\end{aligned}
\tag{24}
\]
depend only on one variable each.
(2) Polynomials $R(x_1, x_2)$ and $R_1(x_1, x_2)$ defined by
\[
\begin{aligned}
R(x_1, x_2) &:= \hat{E}(x_1, x_2)\,x_1 x_2 + \hat{F}(x_1, x_2)(x_1 + x_2) + \hat{G}(x_1, x_2), \\
R_1(x_1, x_2) &:= \hat{E}(x_1, x_2)\,\hat{G}(x_1, x_2) - \hat{F}^2(x_1, x_2)
\end{aligned}
\tag{25}
\]
are of the second degree in each variable.
Then:
(a) The polynomials $\hat{E}(x_1, x_2)$, $\hat{F}(x_1, x_2)$, $\hat{G}(x_1, x_2)$ are symmetric in $x_1, x_2$.
(b) The polynomial $F(s, x_1, x_2) = (x_1 - x_2)^2 s^2 - 2R(x_1, x_2)\,s - R_1(x_1, x_2)$ is discriminantly separable.
(c) The most general form of the polynomials $\hat{E}, \hat{F}, \hat{G}$ is given in Eq. (23), with $K$ arbitrary.
(d) For $K = 1$ the polynomial $P$ is the one given in Proposition 1.

Proof. The proof follows by straightforward calculation with application of Lemma 1.

If the coefficient $K$ is nonzero, we may normalize it to be equal to one. Under this assumption, Eqs. (23) with $K = 1$ are general. The case $K = 0$ is going to be analyzed separately in one of the following sections. From Eqs. (22) we get the following

Corollary 4. The relation
\[ e_2 P(x_1) + e_1 P(x_2) - H(x_1, x_2) + k^2 (x_1 - x_2)^2 = 0 \tag{26} \]
is satisfied, where $P$ is the polynomial defined in Lemma 4.

Corollary 5. The differentials of $x_1$ and $x_2$ may be written in the form
\[ \frac{dx_1}{dt} = -\beta\sqrt{P(x_1) + e_1 (x_1 - x_2)^2}, \qquad \frac{dx_2}{dt} = \beta\sqrt{P(x_2) + e_2 (x_1 - x_2)^2}. \tag{27} \]
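Corollary 5 rests on the algebraic identity $r^2 x_1^2 + 2rg\,x_1 + g^2 = P(x_1) + e_1(x_1 - x_2)^2$, together with its counterpart for $x_2$. It can be checked exactly at sample points. The sketch below is illustrative only: it assumes $K = 1$ in (23), takes $c = 1$ so that $dx_1/dt = -\beta(rx_1 + g)$, and treats the right-hand sides of (22) as definitions of $r^2$, $rg$, $g^2$. The helper names are not from the paper.

```python
from fractions import Fraction as Fr

def efg(x1, x2, a, K=1):
    """The polynomials E^, F^, G^ of Eq. (23); a = (a0, a1, a2, a3, a4, a5)."""
    a0, a1, a2, a3, a4, a5 = a
    s, p = x1 + x2, x1 * x2
    E = -a0 * a2 - K * s**2 - 2 * a1 * s
    F = a0 * a3 / 2 + K * p * s + a5 * s / 2 + a1 * p
    G = -a0 * a4 / 4 - K * p**2 - a5 * p
    return E, F, G

def P(x, other, a):
    """E^ x^2 + 2 F^ x + G^; by Lemma 4 the value should not depend on `other`."""
    E, F, G = efg(x, other, a)
    return E * x**2 + 2 * F * x + G

def check(e1, e2, x1, x2, a):
    E, F, G = efg(x1, x2, a)
    r2 = e1 + e2 + E                     # r^2 from (22)
    rg = -x2 * e1 - x1 * e2 + F          # r*g from (22)
    g2 = x2**2 * e1 + x1**2 * e2 + G     # g^2 from (22)
    d2 = (x1 - x2) ** 2
    lemma4 = P(x1, x2, a) == P(x1, x2 + 1, a)
    cor5_x1 = r2 * x1**2 + 2 * rg * x1 + g2 == P(x1, x2, a) + e1 * d2
    cor5_x2 = r2 * x2**2 + 2 * rg * x2 + g2 == P(x2, x1, a) + e2 * d2
    return lemma4 and cor5_x1 and cor5_x2

coeffs = [Fr(-2), Fr(1, 3), Fr(5, 2), Fr(-7, 4), Fr(2, 5), Fr(3, 7)]
print(check(Fr(2, 3), Fr(-1, 5), Fr(3, 4), Fr(-5, 2), coeffs))
```

Since the identities are polynomial, agreement at a few generic rational points is a reasonable sanity check, though not a proof.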
The proof follows from Eqs. (22) and Lemma 4. Now, we apply what we are going to call the Kotter trick:
\[ \left(\sqrt{e_1}\,\frac{\sqrt{P(x_2)}}{x_1 - x_2} \pm \sqrt{e_2}\,\frac{\sqrt{P(x_1)}}{x_1 - x_2}\right)^2 = (w_1 \pm k)(w_2 \mp k), \tag{28} \]
where $w_1, w_2$ are solutions of the quadratic equation
\[ F(s, x_1, x_2) = (x_1 - x_2)^2 s^2 - 2R(x_1, x_2)\,s - R_1(x_1, x_2) = 0. \tag{29} \]
The Kotter trick appeared in [24] quite mysteriously. The further explanation given by Golubev sixty years later seems to be even trickier and much less clear, see [20]. In the last section of this paper, see Proposition 11, we provide a new interpretation of this transformation as a commuting diagram of morphisms of double-valued groups. Should we hope that our explanation is more transparent than the previous ones, since another sixty years have passed in the meantime? From the last relations, following Kotter, one gets
\[
\begin{aligned}
\left(\frac{dx_1}{\sqrt{P(x_1)}\,dt}\right)^2 &= \beta^2\left(1 + \frac{(x_1 - x_2)^4\, e_1 P(x_2)}{P(x_1) P(x_2) (x_1 - x_2)^2}\right) \\
&= \beta^2\left(1 + \frac{\left(\sqrt{(w_1 - k)(w_2 + k)} + \sqrt{(w_1 + k)(w_2 - k)}\right)^2}{(w_1 - w_2)^2}\right), \\
\left(\frac{dx_2}{\sqrt{P(x_2)}\,dt}\right)^2 &= \beta^2\left(1 + \frac{(x_1 - x_2)^4\, e_2 P(x_1)}{P(x_1) P(x_2) (x_1 - x_2)^2}\right) \\
&= \beta^2\left(1 + \frac{\left(\sqrt{(w_1 - k)(w_2 + k)} - \sqrt{(w_1 + k)(w_2 - k)}\right)^2}{(w_1 - w_2)^2}\right).
\end{aligned}
\]
Next, we get
\[
\begin{aligned}
\frac{dx_1}{\sqrt{P(x_1)}\,dt} &= -\beta\,\frac{\sqrt{(w_1 - k)(w_1 + k)} + \sqrt{(w_2 + k)(w_2 - k)}}{w_1 - w_2}, \\
\frac{dx_2}{\sqrt{P(x_2)}\,dt} &= -\beta\,\frac{\sqrt{(w_1 - k)(w_1 + k)} - \sqrt{(w_2 + k)(w_2 - k)}}{w_1 - w_2}.
\end{aligned}
\tag{30}
\]
Now we apply the discriminant separability property of the polynomial $F$:
\[
\begin{aligned}
\frac{dx_1}{\sqrt{P(x_1)}} + \frac{dx_2}{\sqrt{P(x_2)}} &= \frac{dw_1}{\sqrt{J(w_1)}}, \\
\frac{dx_1}{\sqrt{P(x_1)}} - \frac{dx_2}{\sqrt{P(x_2)}} &= \frac{dw_2}{\sqrt{J(w_2)}}.
\end{aligned}
\tag{31}
\]
We will refer to the last relations as the Kowalevski change of variables. The nature of these relations has been studied by Jurdjevic (see [23]) following Weil ([33]). We are going to develop these efforts further in Sect. 5, where we show that the Kowalevski change of variables is the infinitesimal version of a double-valued group operation and its action. From the relations (31) and (30) we finally get:
\[
\begin{aligned}
\frac{dw_1}{\sqrt{\Phi(w_1)}} + \frac{dw_2}{\sqrt{\Phi(w_2)}} &= 0, \\
\frac{w_1\,dw_1}{\sqrt{\Phi(w_1)}} + \frac{w_2\,dw_2}{\sqrt{\Phi(w_2)}} &= 2\beta\,dt,
\end{aligned}
\tag{32}
\]
where $\Phi(w) = J(w)(w - k)(w + k)$ is a polynomial of the fifth degree. Thus, Eqs. (32) represent the Abel-Jacobi map of the genus 2 curve $y^2 = \Phi(w)$.
4.2. Generalized Kotter transformation. In order to integrate the dynamics on the Jacobian of the hyper-elliptic curve $y^2 = \Phi(w)$ we are going to generalize the classical Kotter transformation. In this section we will assume the normalization condition $a_0 = -2$.

Proposition 7. For the polynomial $F(s, x_1, x_2)$ there exist polynomials $A_0(s)$, $f(s)$, $A(s, x_1, x_2)$, $B(s, x_1, x_2)$ such that the identity
\[ F(s, x_1, x_2) \cdot A_0(s) = A^2(s, x_1, x_2) + f(s) \cdot B(s, x_1, x_2) \tag{33} \]
is satisfied. The polynomials are defined by the formulae:
\[
\begin{aligned}
A(s, x_1, x_2) &= A_0(s)(x_1 x_2 - s) + B_0(s)(x_1 + x_2) + M_0(s), \\
A_0(s) &= a_1^2 - a_0 a_2 - s a_0, \\
B_0(s) &= \tfrac{1}{2}(a_0 a_3 - a_5 a_1 + 2s a_1), \\
M_0(s) &= a_5 a_2 - a_1 a_3 + s(a_1^2 + a_5), \\
B(s, x_1, x_2) &= (x_1 + x_2)^2 + 2a_1(x_1 + x_2) - 2s - 2a_2, \\
f(s) &= 2s^3 + 2(a_2 - a_5)s^2 + \left(2(a_1 a_3 - a_5 a_2) + a_4 + \frac{a_5^2}{2}\right)s + f_0, \\
f_0 &= a_4 a_2 - a_3^2 - a_1 a_3 a_5 + \frac{a_4 a_1^2 + a_2 a_5^2}{2}.
\end{aligned}
\]
For a5 = a1 = 0 the previous identity has been obtained in [24]. Following Kotter’s idea, consider the identity F(s) = F(u) + (s − u)F (u) + (s − u)2 . From the last two identities we get a quadratic equation in s − u, (s − u)2 (x1 − x2 )2 − 2(s − u)(R(x1 , x2 ) − u(x1 − x2 )) + f (u)B + (x1 − x2 )2 A2 . Corollary 6. (a) The solutions of the last equation satisfy the identity in u: (s1 − u)(s2 − u) =
A2 B + f (u) . (x1 − x2 )2 (x1 − x2 )2
(b) Denote m 1 , m 2 , m 3 the zeros of the polynomial f , and
Pi = (s1 − m i )(s2 − m i ), i = 1, 2, 3. Then 1 B0 (m i ) Pi = +m i (m i − a5 − 2a2 )−2a5 − a1 a3 , A0 (m i )x1 x2 + √ x1 − x2 A0 (m i ) i = 1, 2, 3. (34)
Now we introduce more convenient notation n i = m i + a12 + 2a2 , i = 1, 2, 3, x1 x2 + (2a12 + a5 + 2a2 ) + a21 (x1 − x2 ) , x1 − x2 1 Y = , x1 − x2 (a 3 + 2a2 a1 + 2a5 a1 + 2a3 )(x1 + x2 ) − 2(a12 + 2a2 )(a12 + a5 ) Z = 1 . x1 − x2
X =
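Lemma 5. The quantities $X, Y, Z$ satisfy the system of linear equations
\[
\begin{aligned}
X - n_1 Y + \frac{1}{2n_1} Z &= \frac{P_1}{\sqrt{n_1}}, \\
X - n_2 Y + \frac{1}{2n_2} Z &= \frac{P_2}{\sqrt{n_2}}, \\
X - n_3 Y + \frac{1}{2n_3} Z &= \frac{P_3}{\sqrt{n_3}}.
\end{aligned}
\tag{35}
\]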
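A system of the shape (35) can be solved by a short interpolation argument: multiplying the $i$-th equation by $n_i$ shows that the quadratic $q(t) = -Y t^2 + X t + Z/2$ takes the value $n_i b_i$ at $t = n_i$, where $b_i$ denotes the right-hand side. The following is a minimal sketch of that idea with exact rational arithmetic. The function name and the rational stand-ins for the right-hand sides are assumptions made for illustration, and no claim is made here about the normalization of the closed formulas in the paper.

```python
from fractions import Fraction as Fr

def solve_system_35(n, b):
    """Solve X - n_i*Y + Z/(2*n_i) = b_i (i = 1, 2, 3) for distinct nonzero n_i."""
    n = [Fr(t) for t in n]
    b = [Fr(t) for t in b]
    # q(t) = c2*t^2 + c1*t + c0 interpolates the values n_i * b_i at t = n_i.
    c2 = c1 = c0 = Fr(0)
    for i in range(3):
        j, k = [t for t in range(3) if t != i]
        w = n[i] * b[i] / ((n[i] - n[j]) * (n[i] - n[k]))
        c2 += w                      # Lagrange basis: t^2 coefficient
        c1 -= w * (n[j] + n[k])      # t coefficient
        c0 += w * n[j] * n[k]        # constant term
    return c1, -c2, 2 * c0           # X, Y, Z

def residuals(n, b, X, Y, Z):
    return [X - Fr(ni) * Y + Z / (2 * Fr(ni)) - Fr(bi) for ni, bi in zip(n, b)]

n_vals, b_vals = [1, 2, 5], [3, -1, Fr(7, 2)]
X, Y, Z = solve_system_35(n_vals, b_vals)
print(residuals(n_vals, b_vals, X, Y, Z))
```

The denominators $(n_i - n_j)(n_i - n_k)$ produced by the interpolation are, up to the leading coefficient, the values of the derivative at $n_i$ of a cubic with roots $n_1, n_2, n_3$.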
Denote $\hat{f}(x) = f(x - a_1^2 - 2a_2)$. One can easily solve the previous linear system and get

Lemma 6. The solutions of the system (35) are
\[
\begin{aligned}
Y &= -\left(\frac{P_1\sqrt{n_1}}{\hat{f}'(n_1)} + \frac{P_2\sqrt{n_2}}{\hat{f}'(n_2)} + \frac{P_3\sqrt{n_3}}{\hat{f}'(n_3)}\right), \\
Z &= 2 n_1 n_2 n_3 \left(\frac{P_1}{\sqrt{n_1}\,\hat{f}'(n_1)} + \frac{P_2}{\sqrt{n_2}\,\hat{f}'(n_2)} + \frac{P_3}{\sqrt{n_3}\,\hat{f}'(n_3)}\right).
\end{aligned}
\]
The expression in terms of theta functions for $P_i = \sqrt{(s_1 - m_i)(s_2 - m_i)}$, $i = 1, 2, 3$, can be obtained from [25], paragraph 7.
4.3. Interpretation of the equations of motion. Rigid-body coordinates. We are going to present briefly the interpretation of the equations of motion (19) in the standard rigid-body coordinates $p, q, r, \gamma, \gamma', \gamma''$, where
\[ e_1 = x_1^2 + c(\gamma + i\gamma'), \qquad e_2 = x_2^2 + c(\gamma - i\gamma'), \qquad p = \frac{x_1 + x_2}{2}, \qquad q = \frac{x_1 - x_2}{2i}. \]
From the last four equations of the system (19) we get p˙ = −iβrq, q˙ = iβr p, r˙ = 2βiq(2 p + a1 ) −
iα (2 pq + cγ ), r
(36)
β γ˙ = − (qia5 + 2icγ q − 2icγ p) c 2rβ − α + 2 (icγ ( p 2 − q 2 ) − 2icpqγ ), c γ while the equations for γ˙ , γ˙ can easily be obtained from the first two equations of the system (19): α 2 −x1 x˙1 − x2 x˙2 (x − x12 ) − iαγ + , 2c 2 c α −x1 x˙1 + x2 x˙2 γ˙ = (−x22 − x12 ) − iαγ + . 2c c γ˙ =
Finally, we get 2i(2βr − α) pq − iαγ + 2iβγ q, c 2i(2βr − α) 2 ( p − q 2 ) + iαγ − 2iβγ q. γ˙ = − c γ˙ =
(37)
Proposition 8. The system (36, 37) preserves the standard measure if and only if A0 α + A1 α p + A2 αq + A3 αr + A4 αγ + A5 αγ + A6 αγ + B0 β + B1 β p + B2 βq + B3 βr + B4 βγ + B5 βγ + B6 βγ = 0,
(38)
where A0 = r 2 γ p 2 + c2 γ 2 γ − 2r 2 pqγ + 2cγ 2 pq − r 2 γ q 2 , A1 = 0, A2 = 0, A3 = −2cγ 2 r pq − c2 γ 2 r γ , A4 = −2 pqr 2 γ 2 − γ r 2 cγ 2 , A5 = −2r 2 γ 2 q 2 + gr 2 cγ 2 + 2r 2 γ 2 p 2 , A6 = −r 2 γ γ p 2 + 2r 2 γ pqγ + r 2 γ γ q 2 , B0 = −2r 3 γ p 2 + 2r 3 γ q 2 + 4r 3 pqγ , B1 = −cr 3 qγ 2 , B2 = cr 3 pγ 2 , B3 = 4qr 2 cγ 2 p + 2qr 2 cγ 2 a1 , B4 = 2γ 3 qr 2 c + 4 pqr 3 γ 2 , B5 = −4r 3 γ 2 p 2 − 2γ 3 qr 2 c + 4r 3 γ 2 q 2 , B6 = −r 2 γ 2 qa5 −2r 3 γ γ q 2 −2r 2 γ 2 cγ q +2r 3 γ γ p 2 +2r 2 γ 2 cγ p−4r 3 γ pqγ .
Example 1. From the Kowalevski case, there is a pair $\alpha = ir$, $\beta = i/2$ which satisfies the system (37) written above. We give two more pairs: $\alpha_1 = 2r(p^2 + q^2)$, $\beta_1 = p^2 + q^2$, and $\alpha_2 = r\gamma$, $\beta_2 = 0$. Moreover, any linear combination of the pairs $(\alpha, \beta)$, $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ also gives a solution of the system (37) and provides a system with invariant standard measure.

Elastic deformations. Jurdjevic considered a deformation of the Kowalevski case associated to a Kirchhoff elastic problem, see [23]. The systems are defined by the Hamiltonians
\[ H = M_1^2 + M_2^2 + 2M_3^2 + \gamma_1, \]
where the deformed Poisson structures $\{\cdot, \cdot\}_\tau$ are defined by
\[ \{M_i, M_j\}_\tau = \epsilon_{ijk} M_k, \qquad \{M_i, \gamma_j\}_\tau = \epsilon_{ijk} \gamma_k, \qquad \{\gamma_i, \gamma_j\}_\tau = \tau\,\epsilon_{ijk} M_k, \]
and the deformation parameter takes the values $\tau = 0, 1, -1$. The classical Kowalevski case corresponds to the case $\tau = 0$. Denote
\[ e_1 = x_1^2 - (\gamma_1 + i\gamma_2) + \tau, \qquad e_2 = x_2^2 - (\gamma_1 - i\gamma_2) + \tau, \]
where
\[ x_{1,2} = \frac{M_1 \pm i M_2}{2}. \]
The integrals of motion
\[
\begin{aligned}
I_1 &= e_1 e_2, \\
I_2 &= H, \\
I_3 &= \gamma_1 M_1 + \gamma_2 M_2 + \gamma_3 M_3, \\
I_4 &= \gamma_1^2 + \gamma_2^2 + \gamma_3^2 + \tau(M_1^2 + M_2^2 + M_3^2)
\end{aligned}
\]
may be rewritten in the form (22):
\[
\begin{aligned}
k^2 &= I_1 = e_1 \cdot e_2, \\
M_3^2 &= e_1 + e_2 + \hat{E}(x_1, x_2), \\
M_3 \gamma_3 &= -x_2 e_1 - x_1 e_2 + \hat{F}(x_1, x_2), \\
\gamma_3^2 &= x_2^2 e_1 + x_1^2 e_2 + \hat{G}(x_1, x_2),
\end{aligned}
\]
where
\[
\begin{aligned}
\hat{G}(x_1, x_2) &= -x_1^2 x_2^2 - 2\tau x_1 x_2 - 2\tau(I_1 - \tau) + \tau^2 - I_2, \\
\hat{F}(x_1, x_2) &= (x_1 x_2 + \tau)(x_1 + x_2) + I_3, \\
\hat{E}(x_1, x_2) &= -(x_1 + x_2)^2 + 2(I_1 - \tau).
\end{aligned}
\]
Proposition 9. The corresponding pencil of conics is determined by the equations
\[ a_1 = 0, \qquad a_5 = 2\tau, \qquad a_2 = \frac{2(\tau - I_1)}{a_0}, \qquad a_3 = \frac{2 I_3}{a_0}, \qquad a_4 = \frac{8\tau(I_1 - \tau) + 4(I_2 - \tau^2)}{a_0}, \]
where $a_0$ is arbitrary.

5. Two-Valued Groups, Kowalevski Equation and Poncelet Porism

5.1. Multivalued groups: defining notions. The structure of multivalued groups was introduced by Buchstaber and Novikov in 1971 (see [5]) in their study of characteristic classes of vector bundles, and it has been studied by Buchstaber and his collaborators since then (see [8] and references therein). Following [8], we give the definition of an $n$-valued group on $X$ as a map
\[ m : X \times X \to (X)^n, \qquad m(x, y) = x * y = [z_1, \ldots, z_n], \]
where $(X)^n$ denotes the symmetric $n$-th power of $X$ and the $z_i$ are coordinates therein. Associativity is the condition of equality of the two $n^2$-sets
\[ [x * (y * z)_1, \ldots, x * (y * z)_n], \qquad [(x * y)_1 * z, \ldots, (x * y)_n * z], \]
for all triplets $(x, y, z) \in X^3$. An element $e \in X$ is a unit if $e * x = x * e = [x, \ldots, x]$ for all $x \in X$. A map $\mathrm{inv} : X \to X$ is an inverse if it satisfies $e \in \mathrm{inv}(x) * x$, $e \in x * \mathrm{inv}(x)$ for all $x \in X$. Following Buchstaber, we say that $m$ defines an $n$-valued group structure $(X, m, e, \mathrm{inv})$ if it is associative, with a unit and an inverse.

An $n$-valued group $X$ acts on the set $Y$ if there is a mapping
\[ \phi : X \times Y \to (Y)^n, \qquad \phi(x, y) = x \circ y, \]
such that the two $n^2$-multisubsets of $Y$, $x_1 \circ (x_2 \circ y)$ and $(x_1 * x_2) \circ y$, are equal for all $x_1, x_2 \in X$, $y \in Y$. It is additionally required that $e \circ y = [y, \ldots, y]$ for all $y \in Y$.
Example 2 (A two-valued group structure on Z+ , [7]). Let us consider the set of nonnegative integers Z+ and define a mapping m : Z+ × Z+ → (Z+ )2 , m(x, y) = [x + y, |x − y|]. This mapping provides a structure of a two-valued group on Z+ with the unit e = 0 and the inverse equal to the identity inv(x) = x. In [7] the sequence of two-valued mappings associated with the Poncelet porism was identified as the algebraic representation of this 2-valued group. Moreover, the algebraic action of this group on CP1 was studied and it was shown that in the irreducible case all such actions are generated by Euler-Chasles correspondences. In the sequel, we are going to show that there is another 2-valued group and its action on CP1 which is even more closely related to the Euler-Chasles correspondence and to the Great Poncelet Theorem, and which is at the same time intimately related to the Kowalevski fundamental equation and to the Kowalevski change of variables. However, we will start our approach with a simple example.
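The axioms for the two-valued structure of Example 2 can be tested by brute force on an initial segment of the non-negative integers. The sketch below is only an illustration (the function names and the search bound are arbitrary choices); it compares the two associativity multisets and checks the unit and inverse conditions.

```python
from collections import Counter
from itertools import product

def m(x, y):
    """Two-valued product on the non-negative integers: [x + y, |x - y|]."""
    return [x + y, abs(x - y)]

def left_assoc(x, y, z):
    # (x * y) * z as a multiset of four values
    return Counter(w for v in m(x, y) for w in m(v, z))

def right_assoc(x, y, z):
    # x * (y * z) as a multiset of four values
    return Counter(w for v in m(y, z) for w in m(x, v))

def axioms_hold(limit=12):
    for x, y, z in product(range(limit), repeat=3):
        if left_assoc(x, y, z) != right_assoc(x, y, z):
            return False
    for x in range(limit):
        if not (m(0, x) == [x, x] == m(x, 0)):   # unit e = 0
            return False
        if 0 not in m(x, x):                      # inv(x) = x
            return False
    return True

print(axioms_hold())
```

Both multisets consist of the four values $|x \pm y \pm z|$, which is why the comparison succeeds for every triple.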
5.2. The simplest case: 2-valued group $p_2$. Among the basic examples of multivalued groups, there are $n$-valued additive group structures on $\mathbb{C}$. For $n = 2$, this is a two-valued group $p_2$ defined by the relation
\[ m_2 : \mathbb{C} \times \mathbb{C} \to (\mathbb{C})^2, \qquad x *_2 y = \left[(\sqrt{x} + \sqrt{y})^2,\ (\sqrt{x} - \sqrt{y})^2\right]. \tag{39} \]
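The two values in (39) have sum $2(x + y)$ and product $(x - y)^2$, so they are the roots in $z$ of $z^2 - 2(x + y)z + (x - y)^2 = (x + y + z)^2 - 4(xy + yz + zx)$. A small numerical sketch of this fact, with illustrative function names and the principal branch of the complex square root, is:

```python
import cmath

def star2(x, y):
    """The two values of x *_2 y from (39), using principal square roots."""
    sx, sy = cmath.sqrt(x), cmath.sqrt(y)
    return [(sx + sy) ** 2, (sx - sy) ** 2]

def p2(z, x, y):
    return (x + y + z) ** 2 - 4 * (x * y + y * z + z * x)

def max_residual(x, y):
    """Largest |p2(z, x, y)| over the two values z of x *_2 y."""
    return max(abs(p2(z, x, y)) for z in star2(x, y))

print(star2(4, 9))
print(max_residual(2 + 1j, -3 + 0.5j))
```

The choice of branch for the square roots does not matter for the unordered pair, since changing the sign of one root only swaps the two values.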
The product $x *_2 y$ corresponds to the roots in $z$ of the polynomial equation $p_2(z, x, y) = 0$, where
\[ p_2(z, x, y) = (x + y + z)^2 - 4(xy + yz + zx). \]
Our starting point in this section is the following

Lemma 7. The polynomial $p_2(z, x, y)$ is discriminantly separable. The discriminants satisfy the relations
\[ D_z(p_2)(x, y) = P(x)P(y), \qquad D_x(p_2)(y, z) = P(y)P(z), \qquad D_y(p_2)(x, z) = P(x)P(z), \]
where $P(x) = 2x$.

The polynomial $p_2$, as discriminantly separable, generates a case of the generalized Kowalevski system of differential equations, but this time with $K = 0$. The system is defined by
\[ \hat{E} = 0, \qquad \hat{F} = 1, \qquad \hat{G} = 0, \tag{40} \]
and the equations of motion have the form
\[
\begin{aligned}
\frac{de_1}{dt} &= -\alpha e_1, \\
\frac{de_2}{dt} &= \alpha e_2, \\
\frac{dx_1}{dt} &= -\beta(r x_1 + cg), \\
\frac{dx_2}{dt} &= \beta(r x_2 + cg), \\
\frac{dr}{dt} &= -\frac{\alpha}{2r}(e_1 - e_2), \\
\frac{dg}{dt} &= 2\beta c + \frac{2r\beta - \alpha}{2c^2 g}\left(e_1 x_2^2 - e_2 x_1^2\right).
\end{aligned}
\tag{41}
\]
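The discriminant separability of $p_2$ stated in Lemma 7 can be checked at integer points. The sketch below assumes that the discriminant of $at^2 + bt + c$ is normalized as $(b/2)^2 - ac$; this normalization is an assumption made here so that the factor $P(x) = 2x$ comes out as in the lemma, and the function names are illustrative. Since $p_2$ is symmetric in its three arguments, checking the discriminant in $z$ covers the other two cases.

```python
from fractions import Fraction as Fr

def p2(z, x, y):
    return (x + y + z) ** 2 - 4 * (x * y + y * z + z * x)

def reduced_discriminant(q):
    """(b/2)^2 - a*c for an integer quadratic q(t) = a t^2 + b t + c, sampled at -1, 0, 1."""
    c = Fr(q(0))
    a = Fr(q(1) + q(-1), 2) - c
    b = Fr(q(1) - q(-1), 2)
    return (b / 2) ** 2 - a * c

def P(x):
    return 2 * x

def lemma7_holds(x, y):
    return reduced_discriminant(lambda z: p2(z, x, y)) == P(x) * P(y)

print(all(lemma7_holds(x, y) for x in range(-5, 6) for y in range(-5, 6)))
```

With the more common normalization $b^2 - 4ac$ the same computation gives $16xy$, that is, four times $P(x)P(y)$.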
In the standard rigid-body coordinates with $\alpha = ir$, $\beta = i/2$ the last two equations become
\[ \dot{r} = 2pq + c\gamma, \qquad \dot{\gamma} = ic. \]

Lemma 8. The integrals of the system defined by Eqs. (40) are
\[
\begin{aligned}
k^2 &= e_1 e_2, \\
r^2 &= e_1 + e_2, \\
crg &= 1 - x_1 e_2 - x_2 e_1, \\
c^2 g^2 &= x_2^2 e_1 + x_1^2 e_2.
\end{aligned}
\]
From Lemma 8 we get the relation
\[ 2e_1 x_2 + 2e_2 x_1 - 1 + k^2 (x_1 - x_2)^2 = 0. \]
Now, together with the first integral relation from Lemma 8, similarly as in the Kowalevski case, we get
\[ \left(\sqrt{e_1}\,\frac{\sqrt{2x_2}}{x_1 - x_2} \pm \sqrt{e_2}\,\frac{\sqrt{2x_1}}{x_1 - x_2}\right)^2 = (w_1 \pm k)(w_2 \mp k), \tag{42} \]
where $w_1, w_2$ are solutions of the quadratic equation
\[ F_2(w, x_1, x_2) := (x_1 - x_2)^2 w^2 - 2(x_1 + x_2)\,w + 1 = 0. \tag{43} \]
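The relation derived from Lemma 8 and its link to (42) and (43) can be checked with exact arithmetic. The sketch below is illustrative, with hypothetical function names. It verifies that $(crg)^2 - r^2 \cdot c^2 g^2$ equals minus the left-hand side of the relation, and that, on a point satisfying the relation, the sum and the squared difference of the two sign choices in (42) agree with the Vieta expressions $w_1 + w_2 = 2(x_1 + x_2)/(x_1 - x_2)^2$ and $w_1 w_2 = 1/(x_1 - x_2)^2$ from (43).

```python
from fractions import Fraction as Fr

def relation(e1, e2, x1, x2):
    """Left-hand side of 2*e1*x2 + 2*e2*x1 - 1 + k^2*(x1 - x2)^2 = 0 with k^2 = e1*e2."""
    return 2 * e1 * x2 + 2 * e2 * x1 - 1 + e1 * e2 * (x1 - x2) ** 2

def lemma8_defect(e1, e2, x1, x2):
    """(c r g)^2 - r^2 * (c g)^2 built from the right-hand sides in Lemma 8."""
    r2 = e1 + e2
    crg = 1 - x1 * e2 - x2 * e1
    c2g2 = x2**2 * e1 + x1**2 * e2
    return crg**2 - r2 * c2g2

def check_42(e1, x1, x2):
    d2 = (x1 - x2) ** 2
    e2 = (1 - 2 * e1 * x2) / (2 * x1 + e1 * d2)   # solve the relation for e2
    k2 = e1 * e2
    w_sum, w_prod = 2 * (x1 + x2) / d2, 1 / d2     # Vieta for F2 in (43)
    # Sum over the two signs: 2*(A^2 + B^2) = 2*(w1*w2 - k^2)
    sum_ok = (2 * x2 * e1 + 2 * x1 * e2) / d2 == w_prod - k2
    # Squared difference over the two signs: (4*A*B)^2 = (2*k*(w2 - w1))^2
    diff_ok = 16 * k2 * 4 * x1 * x2 / d2**2 == 4 * k2 * (w_sum**2 - 4 * w_prod)
    return relation(e1, e2, x1, x2) == 0 and sum_ok and diff_ok

pt = (Fr(1), Fr(2), Fr(3), Fr(5))
print(lemma8_defect(*pt), relation(*pt))
print(check_42(Fr(1, 3), Fr(2), Fr(5)))
```

Here $A$ and $B$ stand for the two summands inside the bracket in (42); they appear only through $A^2$, $B^2$ and $A^2B^2$, so no square roots need to be evaluated.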
The polynomial $F_2$ is obtained by transposition from the polynomial $p_2$ and, thus, it is discriminantly separable:
\[ D_x(F_2)(y, z) = P(y)\varphi(z), \]
where $\varphi(z) = z^3$. Following the lines of integration, we finally come to

Proposition 10. The system of differential equations defined by (40) is integrated through the solutions of the system
\[
\begin{aligned}
\frac{ds_1}{s_1\sqrt{\Phi_1(s_1)}} + \frac{ds_2}{s_2\sqrt{\Phi_1(s_2)}} &= 0, \\
\frac{ds_1}{\sqrt{\Phi_1(s_1)}} + \frac{ds_2}{\sqrt{\Phi_1(s_2)}} &= \frac{i}{2}\,dt,
\end{aligned}
\tag{44}
\]
where $\Phi_1(s) = s(s - e_4)(s - e_5)$ is a polynomial of degree 3.

Similar systems appeared in a slightly different context in the works of Appel'rot, Mlodzeevskii and Delone in their study of degenerations of the Kowalevski top (see [1,11,29]). In particular, we may construct Delone-type solutions of the last system:
\[ s_1 = 0, \qquad s_2 = \wp\left(\frac{i(t - t_0)}{4}\right). \]
We can also consider an integrable perturbation of the previous integrable system, defined by
\[
\begin{aligned}
\hat{E} &= k_1 - 2a_1(x_1 + x_2), \\
\hat{F} &= k_2 + \frac{a_5}{2}(x_1 + x_2) + a_1 x_1 x_2, \\
\hat{G} &= k_3 - a_5 x_1 x_2.
\end{aligned}
\tag{45}
\]
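The equations of motion have the form
\[
\begin{aligned}
\frac{de_1}{dt} &= -\alpha e_1, \\
\frac{de_2}{dt} &= \alpha e_2, \\
\frac{dx_1}{dt} &= -\beta(r x_1 + cg), \\
\frac{dx_2}{dt} &= \beta(r x_2 + cg), \\
\frac{dr}{dt} &= -\frac{\alpha}{2r}(e_1 - e_2) - \frac{a_1}{2}\beta(x_2 - x_1), \\
\frac{dg}{dt} &= 2\beta c + \frac{2r\beta - \alpha}{2c^2 g}\left(e_1 x_2^2 - e_2 x_1^2\right) + \frac{a_5}{2}\,c\beta(x_2 - x_1).
\end{aligned}
\tag{46}
\]
In the standard rigid-body coordinates with $\alpha = ir$, $\beta = i/2$, the last two equations become
\[ \dot{r} = 2pq + c\gamma + \frac{a_1}{2}q, \qquad \dot{\gamma} = ic\left(1 + i\frac{a_5}{2}q\right). \]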
The corresponding polynomial
\[ F(s, x_1, x_2) = (x_1 - x_2)^2 s^2 - 2R(x_1, x_2)\,s - R_1(x_1, x_2), \]
where
\[ R(x_1, x_2) = \hat{E} x_1 x_2 + \hat{F}(x_1 + x_2) + \hat{G}, \qquad R_1(x_1, x_2) = \hat{E}\hat{G} - \hat{F}^2, \]
is discriminantly separable and
\[ D_{x_1}(F)(s, x_2) = \varphi(s)P(x_2), \]
where
\[ \varphi(s) = (2s - a_5)(2a_1 + a_5 s - 2s^2), \qquad P(x) = 2x(2a_1 x^2 - a_5 x - 2). \]

5.3. 2-valued group structure on $\mathbb{CP}^1$, the Kowalevski fundamental equation and Poncelet porism. Now we pass to the general case. We are going to show that the general pencil equation represents an action of a two-valued group structure. Recognition of this structure enables us to give 'the mysterious Kowalevski change of variables' a final algebro-geometric expression and explanation, developing further the ideas of Weil and Jurdjevic (see [23,33]). Amazingly, from a geometric point of view the associativity condition for this action is nothing else than the Great Poncelet Theorem for a triangle.

As we have already mentioned, the general pencil equation
\[ F(s, x_1, x_2) = 0 \]
is connected with two isomorphic elliptic curves
\[ \Gamma_1 : y^2 = P(x), \qquad \Gamma_2 : t^2 = J(s), \]
where the polynomials $P, J$ of degree four and three respectively are defined by Eqs. (13). Suppose that the cubic one $\Gamma_2$ is rewritten in the canonical form
\[ \Gamma_2 : t^2 = J(s) = 4s^3 - g_2 s - g_3. \]
Moreover, denote by $\psi : \Gamma_2 \to \Gamma_1$ a birational morphism between the curves induced by a fractional-linear transformation $\hat{\psi}$ which maps the three zeros of $J$ and $\infty$ to the four zeros of the polynomial $P$.

The curve $\Gamma_2$, as a cubic curve, has a group structure. Together with its subgroup $\mathbb{Z}_2$ it defines the standard two-valued group structure of coset type on $\mathbb{CP}^1$ (see [6,8]):
\[ s_1 *_c s_2 = \left[-s_1 - s_2 + \left(\frac{t_1 - t_2}{2(s_1 - s_2)}\right)^2,\ -s_1 - s_2 + \left(\frac{t_1 + t_2}{2(s_1 - s_2)}\right)^2\right], \tag{47} \]
where $t_i = \sqrt{J(s_i)}$, $i = 1, 2$.
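The two values in (47) are the abscissae of the third intersection points of the cubic $t^2 = 4s^3 - g_2 s - g_3$ with the chords through $(s_1, t_1)$ and $(s_2, \pm t_2)$. The following is a minimal sketch of that chord construction in exact arithmetic. To avoid square roots, it starts from two rational points and chooses $g_2, g_3$ so that both lie on the curve; this setup and all names are assumptions made for illustration.

```python
from fractions import Fraction as Fr

def curve_through(p1, p2):
    """Invariants (g2, g3) of t^2 = 4 s^3 - g2 s - g3 passing through p1 and p2."""
    (s1, t1), (s2, t2) = p1, p2
    g2 = (4 * (s1**3 - s2**3) - (t1**2 - t2**2)) / (s1 - s2)
    g3 = 4 * s1**3 - g2 * s1 - t1**2
    return g2, g3

def star_c(p1, p2):
    """Both values of s1 *_c s2 from (47), each with the chord ordinate at that abscissa."""
    (s1, t1), (s2, t2) = p1, p2
    out = []
    for sign in (1, -1):
        lam = (t1 - sign * t2) / (s1 - s2)   # slope of the chord to (s2, sign*t2)
        s3 = -s1 - s2 + (lam / 2) ** 2
        t3 = t1 + lam * (s3 - s1)            # ordinate of the chord at s3
        out.append((s3, t3))
    return out

def on_curve(point, g2, g3):
    s, t = point
    return t**2 == 4 * s**3 - g2 * s - g3

p1, p2 = (Fr(1), Fr(2)), (Fr(3), Fr(5))
g2, g3 = curve_through(p1, p2)
print(star_c(p1, p2))
print([on_curve(q, g2, g3) for q in star_c(p1, p2)])
```

Passing to the coset by $\mathbb{Z}_2$ forgets the sign of the ordinate, which is why only the abscissae enter (47).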
Theorem 4. The general pencil equation after fractional-linear transformations
\[ F(s, \hat{\psi}^{-1}(x_1), \hat{\psi}^{-1}(x_2)) = 0 \]
defines the two-valued coset group structure $(\Gamma_2, \mathbb{Z}_2)$ defined by the relation (47).

Proof. After the fractional-linear transformations, the pencil equation obtains the form
\[ F_1(s, x, y) = T(s, x)\,y^2 + V(s, x)\,y + W(s, x), \]
where
\[
\begin{aligned}
T(s, x) &= -4x^2 + 4sx - s^2, \\
V(s, x) &= 4sx^2 + 2s^2 x - 2x g_2 - g_2 s - 4g_3, \\
W(s, x) &= -s^2 x^2 - g_2 x s - 4x g_3 - 2g_3 s - \frac{g_2^2}{4}.
\end{aligned}
\]
We now apply a linear change of variables $\gamma$ on $s$:
\[ m = \gamma(s) := \frac{s}{2}, \]
and get $F_2(m, x, y) = F_1(2m, x, y)$. Denote by $P = (m, n)$ and $M = (x, u)$ two arbitrary points on the curve $\Gamma_2$, which means
\[ n^2 = 4m^3 - g_2 m - g_3, \qquad u^2 = 4x^3 - g_2 x - g_3. \]
We want to find points $N_1 = (y_1, v_1)$ and $N_2 = (y_2, v_2)$ on $\Gamma_2$ which correspond by $F_2$ to $P$ and $M$. These points are
\[
\begin{aligned}
y_1 &= \frac{-V(s, x) + 4nu}{2T(s, x)}, & v_1 &= -\frac{2x\,T(s, y_1) + V(s, y_1)}{4n}, \\
y_2 &= \frac{-V(s, x) - 4nu}{2T(s, x)}, & v_2 &= -\frac{2x\,T(s, y_2) + V(s, y_2)}{4n}.
\end{aligned}
\]
By trivial algebraic transformations
\[
\begin{aligned}
y_1 &= \frac{-4mx^2 - 4xm^2 + x g_2 + m g_2 + 2g_3 + 2nu}{-4(x - m)^2} \\
&= \frac{-4mx(x + m) + 4x^3 + 4m^3 - 4x^3 + x g_2 + g_3 - 4m^3 + m g_2 + g_3 + 2nu}{-4(x - m)^2} \\
&= -x - m + \left(\frac{u - n}{2(x - m)}\right)^2,
\end{aligned}
\]
we get the first part of the operation of the two-valued group $(\Gamma_2, \mathbb{Z}_2)$ defined by the relation (47). Applying similar transformations to $y_2$ we get the second part of the relation (47) as well. This ends the proof of the theorem.
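The algebraic step in the proof can be replayed with exact arithmetic. The sketch below takes $T(s, x) = -4x^2 + 4sx - s^2 = -(2x - s)^2$, which is the reading for which the computation closes, together with $V$ as in the proof. It picks two rational points $P = (m, n)$ and $M = (x, u)$, chooses $g_2, g_3$ so that both lie on $t^2 = 4s^3 - g_2 s - g_3$, and compares $y_{1,2}$ with the two values of the coset operation. All function names are illustrative.

```python
from fractions import Fraction as Fr

def invariants(P, M):
    """g2, g3 such that P = (m, n) and M = (x, u) lie on t^2 = 4 s^3 - g2 s - g3."""
    (m, n), (x, u) = P, M
    g2 = (4 * (m**3 - x**3) - (n**2 - u**2)) / (m - x)
    g3 = 4 * m**3 - g2 * m - n**2
    return g2, g3

def T(s, x):
    return -4 * x**2 + 4 * s * x - s**2

def V(s, x, g2, g3):
    return 4 * s * x**2 + 2 * s**2 * x - 2 * x * g2 - g2 * s - 4 * g3

def y_values(P, M):
    """y1, y2 from the proof, with s = 2m."""
    (m, n), (x, u) = P, M
    g2, g3 = invariants(P, M)
    s = 2 * m
    return [(-V(s, x, g2, g3) + sign * 4 * n * u) / (2 * T(s, x)) for sign in (1, -1)]

def coset_values(P, M):
    """The two values -x - m + ((u -/+ n) / (2 (x - m)))^2 of the operation (47)."""
    (m, n), (x, u) = P, M
    return [-x - m + ((u - sign * n) / (2 * (x - m))) ** 2 for sign in (1, -1)]

P_pt, M_pt = (Fr(1), Fr(2)), (Fr(3), Fr(5))
print(y_values(P_pt, M_pt), coset_values(P_pt, M_pt))
```

The agreement relies on both points lying on the curve, since the reduction uses $n^2 = 4m^3 - g_2 m - g_3$ and $u^2 = 4x^3 - g_2 x - g_3$.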
The Kowalevski change of variables (see Eqs. (31)) is the infinitesimal version of the correspondence which maps a pair of points $(M_1, M_2)$ of the curve $\Gamma_1$ to a pair of points $(S_1, S_2)$ of the curve $\Gamma_2$. One view of this correspondence has been given in [23] following Weil [33]. In our approach, there is a geometric view of this mapping as the correspondence which maps two tangents to the conic $C$ to the pair of conics from the pencil which contain the intersection point of the two lines. If we apply fractional-linear transformations to transform the curve $\Gamma_1$ into the curve $\Gamma_2$, then the above correspondence is nothing else than the two-valued group operation $*_c$ on $(\Gamma_2, \mathbb{Z}_2)$.

Theorem 5. The Kowalevski change of variables is equivalent to the infinitesimal version of the action of the two-valued coset group $(\Gamma_2, \mathbb{Z}_2)$ on $\Gamma_1$. Up to the fractional-linear transformation, it is equivalent to the operation of the two-valued group $(\Gamma_2, \mathbb{Z}_2)$.

Now, the Kotter trick from Sect. 4 (see Eqs. (28, 29)) can be presented as a commutative diagram.

Proposition 11. The Kotter transformation defined by Eqs. (28, 29) makes the following diagram commutative:

[Commutative diagram on the spaces $\mathbb{C}^4$, $\Gamma_1 \times \Gamma_1 \times \mathbb{C}$, $\Gamma_2 \times \Gamma_2 \times \mathbb{C}$, $\Gamma_1 \times \Gamma_1 \times \mathbb{C} \times \mathbb{C}$, $\mathbb{CP}^1 \times \mathbb{CP}^1 \times \mathbb{C}$, $\mathbb{C} \times \mathbb{C}$, $\mathbb{CP}^2$ and $\mathbb{CP}^2 \times \mathbb{C}/\!\sim$, with arrows $i_1 \times i_1 \times m$, $\psi^{-1} \times \psi^{-1} \times \mathrm{id}$, $i_a \times i_a \times m$, $i_1 \times i_1 \times \mathrm{id} \times \mathrm{id}$, $p_1 \times p_1 \times \mathrm{id}$, $\hat{\psi}^{-1} \times \hat{\psi}^{-1} \times \mathrm{id}$, $\varphi_1 \times \varphi_2$, $m_c \times \tau_c$, $m_2$ and $f$.]

The mappings are defined as follows:
\[
\begin{aligned}
i_1 &: x \mapsto (x, \sqrt{P(x)}), \\
m &: (x, y) \mapsto x \cdot y, \\
i_a &: x \mapsto (x, 1), \\
p_1 &: (x, y) \mapsto x, \\
m_c &: (x, y) \mapsto x *_c y, \\
\tau_c &: x \mapsto (\sqrt{x}, -\sqrt{x}), \\
\varphi_1 &: (x_1, x_2, e_1, e_2) \mapsto \sqrt{e_1}\,\frac{\sqrt{P(x_2)}}{x_1 - x_2}, \\
\varphi_2 &: (x_1, x_2, e_1, e_2) \mapsto \sqrt{e_2}\,\frac{\sqrt{P(x_1)}}{x_1 - x_2}, \\
f &: ((s_1, s_2, 1), (k, -k)) \mapsto [(\gamma^{-1}(s_1) + k)(\gamma^{-1}(s_2) - k),\ (\gamma^{-1}(s_2) + k)(\gamma^{-1}(s_1) - k)].
\end{aligned}
\]
From Proposition 11 we see that the two-valued group plays an important role in the Kowalevski system and its generalizations. Putting together the geometric meaning of the pencil equation and the algebraic structure of the two-valued group, we come to the connection with the Great Poncelet Theorem ([30], see also [3,15] and [16]). For the reader's sake we formulate the Great Poncelet Theorem for triangles in the form we are going to use below.

Theorem 6 (Great Poncelet Theorem for triangles [30]). Given four conics $C_1, C_2, C_3, C$ from a pencil and three lines $a_1, a_2, a_3$, tangents to the conic $C$, such that $a_1, a_2$ intersect on $C_1$, $a_2, a_3$ intersect on $C_2$ and $a_1, a_3$ intersect on $C_3$. Moreover, we suppose that the tangents to the conics $C_1, C_2, C_3$ at the intersection points are not concurrent. Given $b_1, b_2$, tangents to the conic $C$ which intersect at $C_1$. Then there exists $b_3$, tangent to the conic $C$, such that the triplet $(b_1, b_2, b_3)$ satisfies all the conditions satisfied by $(a_1, a_2, a_3)$.

Now, we go back to the associativity condition for the action of the double-valued group $(\Gamma_2, \mathbb{Z}_2)$.

Theorem 7. Associativity conditions for the group structure of the two-valued coset group $(\Gamma_2, \mathbb{Z}_2)$ and for its action on $\Gamma_1$ are equivalent to the Great Poncelet Theorem for a triangle.

Proof. Denote by $P$ and $Q$ two arbitrary elements of the two-valued group $(\Gamma_2, \mathbb{Z}_2)$ and by $M$ an arbitrary point on the curve $\Gamma_1$. Let
\[ Q * P = [P_1, P_2] \]
and
\[ P \circ M = [N_1, N_2]. \]
Associativity means the equality of the two quadruples:
\[ [Q \circ N_1, Q \circ N_2] = [P_1 \circ M, P_2 \circ M]. \]
Let us consider the previous situation from the geometric point of view. Recall the geometric meaning of the equation of a pencil of conics $F(s, x_1, x_2) = 0$. The variables $x_1$ and $x_2$ denote the Darboux coordinates of two tangents to the conic $C_2$ which intersect at the conic $C_s$ with the pencil parameter equal to $s$.

Denote by $C_P$ and $C_Q$ the conics from the pencil which correspond to the elements $P, Q$, and by $l_M, l_{N_1}, l_{N_2}$ the tangents to the conic $C_2$ which correspond to the points $M, N_1, N_2$ of the curve $\Gamma_1$. Then, $l_{N_1}$ and $l_{N_2}$ are the two lines tangent to $C_2$ which intersect $l_M$ at the conic $C_P$. Moreover, if we denote
\[ Q \circ N_1 = [N_3, N_4], \qquad Q \circ N_2 = [N_5, N_6], \]
then the corresponding lines $l_{N_3}, l_{N_4}, l_{N_5}, l_{N_6}$, tangent to the conic $C_2$, satisfy the conditions: the pairs of lines $(l_{N_1}, l_{N_3})$, $(l_{N_1}, l_{N_4})$, $(l_{N_2}, l_{N_5})$, $(l_{N_2}, l_{N_6})$ all intersect at the conic $C_Q$. Now, associativity of the action is equivalent to the existence of a pair of conics $(C_{P_1}, C_{P_2})$ such that $(l_M, l_{N_3})$ and $(l_M, l_{N_6})$ intersect at the conic $C_{P_1}$, while $(l_M, l_{N_5})$ and $(l_M, l_{N_4})$ intersect at the conic $C_{P_2}$, see Fig. 1.

Fig. 1. Associativity condition and Poncelet theorem

Consider the intersection of the lines $(l_M, l_{N_3})$. Choose the conic from the pencil which contains the intersection point, such that the tangent to this conic at the intersection point is not concurrent with the tangents to the conics $C_P$ and $C_Q$ at the intersection points $(l_M, l_{N_1})$ and $(l_{N_1}, l_{N_3})$ respectively. Denote this conic by $C_{P_1}$. Then, by applying the Great Poncelet Theorem for a triangle (see the theorem above, [30], see also [3,15,16]), one of the lines $l_{N_5}$ and $l_{N_6}$, say the last one, intersects $l_M$ at the conic $C_{P_1}$. The tangent to this conic at the intersection point is not concurrent with the tangents to the conics $C_P$ and $C_Q$ at the intersection points $(l_M, l_{N_2})$ and $(l_{N_2}, l_{N_6})$ respectively. In the same way, by considering the intersection of the lines $(l_M, l_{N_4})$ we come to the conic $C_{P_2}$ from the pencil which, by the Great Poncelet Theorem, contains the intersections of $(l_M, l_{N_4})$ and $(l_M, l_{N_5})$.

Since the result of the operation in the double-valued group between the elements $P, Q$ does not depend on the choice of the point $M$ to which the action is applied, the conics $C_{P_2}$ and $C_{P_1}$ in the previous construction should not depend on the choice of the line $l_M$. This independence is equivalent to the poristic nature of the Poncelet Theorem. This demonstrates the equivalence between the associativity condition and the Great Poncelet Theorem for a triangle.

From the last two theorems we finally get

Conclusion.
The geometric setting for the Kowalevski change of variables is the Great Poncelet Theorem for a triangle.

Acknowledgement. The author is grateful to Borislav Gajić and Katarina Kukić for helpful remarks. The research was partially supported by the Serbian Ministry of Science and Technology, Project Geometry and Topology of Manifolds and Integrable Dynamical Systems. A part of the paper was written during a visit to the IHES. The author uses the opportunity to thank the IHES for hospitality and outstanding working conditions.
References
1. Appel'rot, G.G.: Some supplements to the memoir of N. B. Delone. Tr. otd. fiz. nauk 6 (1893)
2. Audin, M.: Spinning Tops. An introduction to integrable systems. Cambridge studies in advanced mathematics 51, Cambridge: Cambridge Univ. Press, 1999
3. Berger, M.: Geometry. Berlin: Springer-Verlag, 1987
4. Bobenko, A.I., Reyman, A.G., Semenov-Tian-Shansky, M.A.: The Kowalevski top 99 years later: a Lax pair, generalizations and explicit solutions. Commun. Math. Phys. 122, 321–354 (1989)
5. Buchstaber, V.M., Novikov, S.P.: Formal groups, power systems and Adams operators. Mat. Sb. (N. S.) 84 (126), 81–118 (1971) (in Russian)
6. Buchstaber, V.M., Rees, E.G.: Multivalued groups, their representations and Hopf algebras. Transform. Groups 2, 325–349 (1997)
7. Buchstaber, V.M., Veselov, A.P.: Integrable correspondences and algebraic representations of multivalued groups. Internat. Math. Res. Notices 1996, 381–400 (1996)
8. Buchstaber, V.: n-valued groups: theory and applications. Moscow Math. J. 6, 57–84 (2006)
9. Darboux, G.: Principes de géométrie analytique. Paris: Gauthier-Villars, 1917, 519 p
10. Darboux, G.: Leçons sur la théorie générale des surfaces et les applications géométriques du calcul infinitésimal. Volumes 2 and 3, Paris: Gauthier-Villars, 1887, 1889
11. Delone, N.B.: Algebraic integrals of motion of a heavy rigid body around a fixed point. Petersburg, 1892
12. Dragović, V.: Multi-valued hyperelliptic continuous fractions of generalized Halphen type. Internat. Math. Res. Notices 2009, 1891–1932 (2009)
13. Dragović, V.: Marden theorem and Poncelet-Darboux curves. http://arxiv.org/abs/0812.4829v1 [math.CA], 2008
14. Dragović, V., Gajić, B.: Systems of Hess-Appel'rot type. Commun. Math. Phys. 265, 397–435 (2006)
15. Dragović, V., Radnović, M.: Geometry of integrable billiards and pencils of quadrics. J. Math. Pures Appl. 85, 758–790 (2006)
16. Dragović, V., Radnović, M.: Hyperelliptic Jacobians as Billiard Algebra of Pencils of Quadrics: Beyond Poncelet Porisms. Adv. Math. 219, 1577–1607 (2008)
17. Dubrovin, B.: Theta-functions and nonlinear equations. Usp. Math. Nauk 36, 11–80 (1981)
18. Dullin, H.R., Richter, P.H., Veselov, A.P.: Action variables of the Kowalevski top. Reg. Chaotic Dynam. 3, 18–26 (1998)
19. Euler, L.: Evolutio generalior formularum comparationi curvarum inservientium. Opera Omnia Ser 1 20, 318–356 (1765)
20. Golubev, V.V.: Lectures on the integration of motion of a heavy rigid body around a fixed point. Moscow: Gostechizdat, 1953 [in Russian]; English translation: Israel Program for Scientific Translations, Washington, DC: US Dept. of Commerce, Off. of Tech. Serv., 1960
21. Hirota, R.: The direct method in soliton theory. Cambridge Tracts in Mathematics 155, Cambridge: Cambridge Univ. Press, 2004
22. Horozov, E., van Moerbeke, P.: The full geometry of Kowalevski's top and (1, 2)-abelian surfaces. Comm. Pure Appl. Math. 42, 357–407 (1989)
23. Jurdjevic, V.: Integrable Hamiltonian systems on Lie Groups: Kowalevski type. Ann. Math. 150, 605–644 (1999)
24. Kotter, F.: Sur le cas traité par M-me Kowalevski de rotation d'un corps solide autour d'un point fixe. Acta Math. 17, 209–263 (1893)
25. Kowalevski, S.: Sur le problème de la rotation d'un corps solide autour d'un point fixe. Acta Math. 12, 177–232 (1889)
26. Kowalevski, S.: Sur une propriété du système d'équations différentielles qui définit la rotation d'un corps solide autour d'un point fixe. Acta Math. 14, 81–93 (1889)
27. Kuznetsov, V.B.: Kowalevski top revisited. CRM Proc. Lecture Notes 32, Providence, RI: Amer. Math. Soc., 2002, pp. 181–196
28. Markushevich, D.: Kowalevski top and genus-2 curves. J. Phys. A 34(11), 2125–2135 (2001)
29. Mlodzeevskii, B.K.: About a case of motion of a heavy rigid body around a fixed point. Mat. Sb. 18 (1895)
30. Poncelet, J.V.: Traité des propriétés projectives des figures. Paris: Mett, 1822
31. Vein, R., Dale, P.: Determinants and their applications in Mathematical Physics. Appl. Math. Sciences 134, Berlin-Heidelberg-New York: Springer, 1999
32. Veselov, A.P., Novikov, S.P.: Poisson brackets and complex tori. Trudy Mat. Inst. Steklov 165, 49–61 (1984)
33. Weil, A.: Euler and the Jacobians of elliptic curves. In: Arithmetics and Geometry, Vol. 1, Progr. Math. 35, Boston, MA: Birkhäuser, 1983, pp. 353–359

Communicated by M. Aizenman
Commun. Math. Phys. 298, 65–99 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1059-y
Communications in
Mathematical Physics
Dimension Theory for Invariant Measures of Endomorphisms Lin Shu∗ LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, P. R. China. E-mail:
[email protected] Received: 29 June 2009 / Accepted: 3 November 2009 Published online: 15 May 2010 – © Springer-Verlag 2010
Abstract: We establish the exact dimensional property of an ergodic hyperbolic measure for a $C^2$ non-invertible but non-degenerate endomorphism on a compact Riemannian manifold without boundary. Based on this, we give a new formula of Lyapunov dimension of ergodic measures and show it coincides with the dimension of hyperbolic ergodic measures in a setting of random endomorphisms. Our results extend several well known theorems of Barreira et al. (Ann Math 149:755–783, 1999) and Ledrappier and Young (Commun Math Phys 117(4):529–548, 1988) for diffeomorphisms to the case of endomorphisms.

Contents
1. Introduction
2. Notions and Statement of the Main Results
3. Dimension of Hyperbolic Measures for Endomorphisms
3.1 Preparatory lemmas
3.2 Proof of Theorem 2.1
4. Volume Lemma and Lyapunov Dimension of Measures
5. Dimension Formula for Random Endomorphisms
5.1 The proofs of Theorem 2.4 and Theorem 2.6
5.2 An application of the results to stochastic flows
References
∗ This work is supported by NSFC (No. 10901007) and National Basic Research Program of China (973 Program) (2007 CB 814800).

1. Introduction

The present paper is intended to study the dimension theory for invariant probability measures of a $C^2$ non-invertible but non-degenerate endomorphism. To motivate the questions, we first give some background on the corresponding theories for diffeomorphisms.

Throughout this paper, we let $M$ be a $C^\infty$ compact connected Riemannian manifold without boundary. Let $f : M \to M$ be a $C^2$ (or $C^{1+\alpha}$) diffeomorphism preserving a Borel probability measure $\mu$. For $x \in M$, the local dimension of $\mu$ at $x$ is defined by

$$d(\mu, x) = \lim_{\rho \to 0} \frac{\log \mu(B(x, \rho))}{\log \rho}, \tag{1.1}$$

provided the limit exists, where $B(x, \rho)$ stands for the ball of radius $\rho$ centered at $x$. Call $\mu$ exact dimensional if $d(\mu, x)$ is constant a.e. (in that case, the constant is denoted by $\dim\mu$). The exact dimensional property of $\mu$ implies that almost all the known characteristics of dimension type of the measure coincide [32]. This partially explains why the study of exact dimensional measures is of great importance in the dimension theory of dynamical systems [3,10,12,23,32]. For further reference on this topic, see e.g. Farmer, Ott and Yorke [5], Eckmann and Ruelle [4], and Young [33].

In 1982 [32], Young studied the local dimension of an ergodic measure $\mu$ of a $C^{1+\alpha}$ surface diffeomorphism $f$. She showed that for $\mu$-almost every $x$,

$$d(\mu, x) = \frac{1}{\lambda_1} h_\mu(f) + \frac{1}{-\lambda_2} h_\mu(f) =: \delta^u + \delta^s, \tag{1.2}$$

where $h_\mu(f)$ is the metric entropy of $f$ (with respect to $\mu$) and $\lambda_1 > 0 > \lambda_2$ are the Lyapunov exponents of $\mu$. The $\delta^u$, $\delta^s$ defined as above can roughly be interpreted as the dimension of $\mu$ in the direction of the subspaces corresponding to $\lambda_1$ and $\lambda_2$, respectively. As a simple consequence of (1.2) one has that if $\mu$ is an SRB measure, then $\dim\mu = 1 - \lambda_1/\lambda_2$, which coincides with the Lyapunov dimension of $\mu$ (cf. [5,6]).

Arising from this simple, yet delicate model are three natural questions for an invariant probability measure $\mu$ of a $C^2$ (or $C^{1+\alpha}$) diffeomorphism $f$ of higher dimensional $M$:
i) What is the relation between entropy, Lyapunov exponents and dimensions [12]?
ii) In case $\mu$ is ergodic, will it be exact dimensional [4]?
iii) When will $\dim\mu$ coincide with its Lyapunov dimension [5,6]?

The answers to the last two questions rely on that of the first. For $\mu$-a.e. $x$, let $\lambda_1(x), \dots, \lambda_{r(x)}(x)$ be the distinct Lyapunov exponents of $f$ at $x$ and let $\bigoplus_{i \le r(x)} E_i(x)$ be the corresponding decomposition of $T_x M$. In 1985, Ledrappier and Young [12] proved the entropy formula

$$h_\mu(f) = \int_M \sum_{\lambda_i(x) > 0} \lambda_i(x)\gamma_i(x)\, d\mu, \tag{1.3}$$

(where $\gamma_i(x)$ denotes, roughly speaking, the dimension of $\mu$ in the direction of the subspace $E_i(x)$), which gives the existence of stable and unstable pointwise dimensions $\delta^s(x) = \sum_{\lambda_i(x) < 0} \gamma_i(x)$ and $\delta^u(x) = \sum_{\lambda_i(x) > 0} \gamma_i(x)$ of $\mu$, resembling that in (1.2). Furthermore, it can be derived from a general inequality in [12] that

$$\overline{d}(\mu, x) \le \delta^s(x) + \delta^c(x) + \delta^u(x), \quad \mu\text{-a.e.}, \tag{1.4}$$

for any invariant probability measure $\mu$, where $\overline{d}(\mu, x)$ is the upper pointwise dimension of $\mu$ at $x$, defined by replacing $\lim$ in (1.1) by $\limsup$, and $\delta^c(x)$ is the multiplicity of the zero exponent.
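As a small numerical illustration of Young's formula (1.2), the following sketch splits the dimension into its unstable and stable parts and checks the SRB consequence $\dim\mu = 1 - \lambda_1/\lambda_2$. The function name and the sample values of the entropy and the exponents are illustrative assumptions, not quantities taken from the paper.

```python
import math


def young_dimension(entropy, lam1, lam2):
    """Evaluate formula (1.2) for a surface diffeomorphism.

    entropy : metric entropy h_mu(f)
    lam1    : positive Lyapunov exponent
    lam2    : negative Lyapunov exponent
    Returns (delta_u, delta_s, delta_u + delta_s).
    """
    if not (lam1 > 0 > lam2):
        raise ValueError("formula (1.2) assumes lam1 > 0 > lam2")
    delta_u = entropy / lam1      # dimension along the unstable direction
    delta_s = entropy / (-lam2)   # dimension along the stable direction
    return delta_u, delta_s, delta_u + delta_s


# Hypothetical exponents; taking entropy = lam1 mimics the SRB case,
# where the unstable dimension equals 1.
lam1, lam2 = math.log(2.0), -math.log(3.0)
delta_u, delta_s, total = young_dimension(lam1, lam1, lam2)
```

With entropy equal to `lam1`, the total agrees with $1 - \lambda_1/\lambda_2$ up to rounding, which is the identity stated after (1.2).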
In [10], Ledrappier established the existence of the pointwise dimension of arbitrary SRB measures. In [23], Pesin and Yue extended his approach and proved the existence (of the pointwise dimension) for hyperbolic measures satisfying the so-called semi-local product structure. (Here by hyperbolic one means that the measure has no zero Lyapunov exponent.) Finally in 1999, Barreira, Pesin and Schmeling [3] exploited all the above works, especially [12], in an essential way. They showed that a hyperbolic measure has a kind of asymptotically “almost” local product structure, from which they deduced that

$$\underline{d}(\mu, x) \ge \delta^s(x) + \delta^u(x), \quad \mu\text{-a.e.},$$

where $\underline{d}(\mu, x)$ is the lower pointwise dimension of $\mu$ at $x$, defined by replacing $\lim$ in (1.1) by $\liminf$. This together with (1.4) established the exact dimensional property of ergodic hyperbolic measures of $C^{1+\alpha}$ diffeomorphisms. (Note that there are examples of non-hyperbolic ergodic measures which are not exact dimensional [11].)

Although the above formulas express $d(\mu, x)$ precisely in terms of the $\gamma_i$'s, the formula for $\dim\mu$ most widely used in practice is the measure's Lyapunov dimension, which can be calculated from the Lyapunov exponents alone. To be precise, let $\mu$ be an $f$-ergodic measure with Lyapunov exponents $\lambda_1 > \cdots > \lambda_r$. Let $K$ be the largest integer so that $\sum_{i=1}^{K} \lambda_i m_i > 0$, where $m_i$ is the multiplicity of $\lambda_i$. Then the Lyapunov dimension of $\mu$ is defined to be

$$\dim_{Ly}(\mu) = \begin{cases} \dim M, & \text{if } \sum_{i=1}^{K} m_i = \dim M; \\ \sum_{i=1}^{K} m_i - \dfrac{1}{\lambda_{K+1}} \sum_{i=1}^{K} \lambda_i m_i, & \text{otherwise.} \end{cases} \tag{1.5}$$

It can be easily verified using $\dim\mu \le \sum_{i=1}^{r} \gamma_i$ (where $\gamma_i$ equals the multiplicity of $\lambda_i$ if $\lambda_i = 0$) and (1.3) that $\dim\mu \le \dim_{Ly}(\mu)$ is always true and that a necessary condition for equality is that $\mu$ is an SRB measure. Although there are counterexamples to the converse, $\dim_{Ly}(\mu)$ and $\dim\mu$ are close in actual computations. It was conjectured by Frederickson, Kaplan, Yorke and Yorke [6] (see also [5]) that if $\mu$ is an SRB measure, then generically,

$$\dim\mu = \dim_{Ly}(\mu). \tag{1.6}$$

When $M$ has dimension 2, the conjecture is true by Young's formula (1.2). For the higher dimensional case, it is still unknown what generic condition should be imposed [13]. In a setting of random diffeomorphisms, however, Ledrappier and Young were able to show that the formula (1.6) is mathematically correct. Let $\nu$ be a probability measure on $\mathrm{Diff}(M)$, the space of diffeomorphisms of $M$. They considered the composition of maps chosen independently with distribution $\nu$. Let $\mu$ be an ergodic stationary measure corresponding to this process. Denote by $\{\mu_w : w \in \mathrm{Diff}(M)^{\mathbb{Z}}\}$ a class of sample measures associated with $\mu$. Consider the backward derivative process naturally induced by the process on the Grassmannian manifold $\mathrm{Gr}(M)$ whose transition probabilities are given by

$$Q(v, \Gamma) = \nu\{f_w \in \mathrm{Diff}(M) : D(f_w)^{-1} v \in \Gamma\}, \quad v \in \mathrm{Gr}(M),\ \text{Borel } \Gamma \subset \mathrm{Gr}(M).$$

They showed that if $\mu \ll \mathrm{Leb}$ and $\lambda_j \ne 0$ for all $j$, under the hypothesis that for all $v \in \mathrm{Gr}(M)$ the transition probability $Q(v, \cdot)$ is absolutely continuous with respect to the Lebesgue measure on $\mathrm{Gr}(M)$, then for $\nu^{\mathbb{Z}}$-a.e. $w$,

$$\dim(\mu_w) = \dim_{Ly}(\mu); \tag{1.7}$$

and Eq. (1.7) continues to hold if the hypothesis is replaced by a weaker assumption about the randomness of the distribution of tangent spaces to the $(K+1)$th and $(K+2)$th stable manifolds or a nonlinear version formulated in terms of two-point processes on $M$. We note that the above hypothesis of randomness appears quite naturally in the setting of stochastic flows from Stochastic Differential Equations (SDE) (see Sect. 5.2).

The present paper is motivated by the attempt to answer the questions posed at the beginning of the paper in the case where $f$ is a non-invertible endomorphism, taking into account all the existing results mentioned above for diffeomorphisms. The main difficulty in setting up the corresponding theories is caused by the non-invertibility of the map. One way to overcome this is to lift the system $(M, f)$ to some higher-dimensional system so that the argument for diffeomorphisms works, see e.g. [29,30]. However, this might encounter the problem that the existence of $d(\mu, x)$ of $\mu$ cannot always be obtained from that of the lift of $\mu$. Another method is to lift the system to its inverse limit space (denoted by $(\bar M, \theta)$, which is a lift of $(M, f)$ to form an invertible system) [25]. However, this cannot solve all the problems either, especially when the system has negative exponents, since the lift does not help to split stable manifolds into distinct manifolds corresponding to different past paths. Besides, the dimension of the lift of $\mu$ and $\dim\mu$ are, in general, not equal either (cf. [18]). So, an alternative approach to the three questions for endomorphisms might be to follow the above lines for diffeomorphisms. A preliminary nontrivial step is to set up the corresponding entropy formulas. For positive exponents, much has been done with the help of the inverse limit space of $(M, f)$ (see [25–28] and also [16]).
In [25], Qian and Xie established (1.3) for endomorphisms (with $\gamma_i$ to be interpreted differently from that of (1.3) for diffeomorphisms). For each “typical” $\bar x = (x_n)_{n=-\infty}^{+\infty}$ with $f(x_n) = x_{n+1}$ for $n \in \mathbb{Z}$, there exists an unstable manifold $W^u(\bar x)$. They were able to show that the dimension of $\mu$ along $W^u(\bar x)$, denoted by $\delta^u(x_0)$, only depends on $x_0$. Based on this, they proved the existence of $d(\mu, x)$ (in a.e. sense) for $C^2$ expanding endomorphisms.

Generalizing (1.3) for negative exponents is another story. Since the existence of the stable manifolds $W^s$ only relies on forward iterations of orbits, lifting the system to its inverse limit space cannot clear up the influence of the overlap caused by non-invertibility. The key to this problem is the observation that overlap (or folding) actually diminishes the dimensions of (conditional) measures in stable manifolds. Let $f$ be a $C^2$ non-degenerate (i.e., $\det T_x f \ne 0$ for all $x \in M$) non-invertible endomorphism on $M$ and let $\mu$ be an invariant probability measure on $M$. In [24], Ruelle conjectured an inequality

$$h_\mu(f) \le F_\mu(f) - \int_M \sum_{\lambda_i(x) < 0} \lambda_i(x) m_i(x)\, d\mu \tag{1.8}$$

[…] $\lambda_2$ are the Lyapunov exponents of $\mu$. In particular, for an SRB measure $\mu$ in this setting (see [28] or Sect. 2 for its definition), we have $\dim\mu = 1 - (\lambda_1 - F_\mu(f))/\lambda_2$.

This motivates the following new notion of Lyapunov dimension of ergodic measures for endomorphisms. Let $\lambda_1 > \cdots > \lambda_r$ be the distinct Lyapunov exponents of $\mu$. We define the Lyapunov dimension of $\mu$, denoted by $\dim_L(\mu)$, as follows:
i) If $\sum_{\lambda_i > 0} \lambda_i m_i \le F_\mu(f)$, let $\dim_L(\mu) = \sum_{\lambda_i \ge 0} m_i$;
ii) Otherwise, let $K$ be the largest integer so that $\sum_{i=1}^{K} \lambda_i m_i > F_\mu(f)$ and define

$$\dim_L(\mu) = \begin{cases} \dim M, & \text{if } \sum_{i=1}^{K} m_i = \dim M; \\ \sum_{i=1}^{K} m_i - \dfrac{1}{\lambda_{K+1}} \Big( \sum_{i=1}^{K} \lambda_i m_i - F_\mu(f) \Big), & \text{otherwise.} \end{cases} \tag{1.10}$$

The formula differs from that for diffeomorphisms (see (1.5)) by the inclusion of the folding entropy. When $f$ is a diffeomorphism, the folding entropy is zero and (1.10) reduces to the classical one.

As we will show later (Proposition 4.2), it is always true for ergodic $\mu$ that $\overline{d}(\mu, x) \le \dim_L(\mu)$, $\mu$-a.e., and for the equality to hold, a necessary condition is that $\mu$ is SRB.
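The two-case definition (1.10) is easy to evaluate mechanically. The sketch below is one possible transcription, assuming that the exponents are listed in strictly decreasing order and that the multiplicities sum to $\dim M$; the function name and all numerical values are hypothetical. Setting the folding entropy to zero gives the classical formula (1.5).

```python
import math


def lyapunov_dimension(exponents, multiplicities, folding_entropy=0.0):
    """Evaluate dim_L(mu) as in (1.10).

    exponents       : distinct Lyapunov exponents, strictly decreasing
    multiplicities  : their multiplicities (assumed to sum to dim M)
    folding_entropy : F_mu(f); zero corresponds to a diffeomorphism
    """
    if any(a <= b for a, b in zip(exponents, exponents[1:])):
        raise ValueError("exponents must be strictly decreasing")
    pairs = list(zip(exponents, multiplicities))
    dim_m = sum(multiplicities)

    # Case i): the positive exponents do not exceed the folding entropy.
    positive_sum = sum(lam * m for lam, m in pairs if lam > 0)
    if positive_sum <= folding_entropy:
        return float(sum(m for lam, m in pairs if lam >= 0))

    # Case ii): K is the largest index whose partial sum exceeds F_mu(f).
    partial, k = 0.0, 0
    for index, (lam, m) in enumerate(pairs, start=1):
        partial += lam * m
        if partial > folding_entropy:
            k = index
    count = sum(multiplicities[:k])
    if count == dim_m:
        return float(dim_m)
    weighted = sum(lam * m for lam, m in pairs[:k])
    # exponents[k] is lambda_{K+1}; it is negative by the choice of K.
    return count - (weighted - folding_entropy) / exponents[k]


# Hypothetical surface example with exponents log 3 > 0 > -log 4.
lams, mults = [math.log(3.0), -math.log(4.0)], [1, 1]
with_folding = lyapunov_dimension(lams, mults, folding_entropy=math.log(2.0))
classical = lyapunov_dimension(lams, mults)
# Hypothetical one-dimensional case where the exponent equals F_mu(f).
case_one = lyapunov_dimension([math.log(2.0)], [1], folding_entropy=math.log(2.0))
```

In the surface example the value with folding entropy equals $1 - (\lambda_1 - F_\mu(f))/\lambda_2$, matching the SRB expression quoted before the definition, and it is strictly smaller than the classical value obtained with zero folding entropy.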
To see that the new Lyapunov dimension is in some sense mathematically correct, we study its relation with the dimension of measures in the setting of random endomorphisms. Let $\nu$ be a probability measure (satisfying some regularity conditions to be specified in Sect. 5) on $C^2(M, M)$, the space of $C^2$ endomorphisms, and consider the composition of maps chosen independently with distribution $\nu$. This process together with an ergodic stationary measure $\mu$ is referred to as $\chi$. Let $F_\mu(\chi)$ denote the folding entropy of $\chi$. We define the Lyapunov dimension of $\mu$ as in the deterministic case, replacing $F_\mu(f)$ by $F_\mu(\chi)$. Denote by $\{\mu_w : w \in C^2(M, M)^{\mathbb{Z}}\}$ the associated class of sample measures. Our third main result, in one sentence, is that in the above setting of random endomorphisms, under the same hypothesis of Ledrappier and Young [13], for $\nu^{\mathbb{Z}}$-a.e. $w$, we have

$$\dim(\mu_w) = \dim_L(\mu), \quad \text{with } \dim_L(\mu) \text{ as above.} \tag{1.11}$$
The proof of (1.11) depends on the existence of dimension of ergodic measures of random endomorphisms and the establishment of the corresponding entropy formulas in such a setting. With these, we can show, as in the case of random diffeomorphisms [13], that for the equality (1.11) to hold, a sufficient and necessary configuration is that $\mu_w$ tends to fill in the directions $\lambda_1, \dots, \lambda_j$ before spilling over into the $\lambda_{j+1}$ direction. This configuration can be verified by showing that the transversal dimension to the $(j+1)$th stable manifolds is as large as possible, exactly as in [13].

We will state the main results in the next section, but the proofs, i.e., the answers to the last two questions for endomorphisms, will be presented separately in Sects. 3 and 5. Section 4 is devoted to the relation between $d(\mu, x)$ and $\dim_L(\mu)$ for arbitrary ergodic measures for non-invertible but non-degenerate $C^2$ endomorphisms.

2. Notions and Statement of the Main Results

We first consider the dimension theory of invariant probability measures for deterministic endomorphisms. Let $f$ be a $C^2$ non-invertible but non-degenerate endomorphism on $M$ preserving a Borel probability measure $\mu$. Consider $M^{\mathbb{Z}}$ endowed with the product topology. Define

$$\bar M := \{\bar x = (x_n)_{n=-\infty}^{+\infty} : x_n \in M,\ f(x_n) = x_{n+1},\ n \in \mathbb{Z}\}.$$

Denote by $\theta$ the left shift transformation on $\bar M$. The pair $(\bar M, \theta)$ is called the inverse limit space of $(M, f)$. Let $p$ be the natural projection map from $\bar M$ to $M$, i.e., $p(\bar x) = x_0$, $\forall\, \bar x \in \bar M$. Then $p \circ \theta = f \circ p$ on $\bar M$. Denote by $\bar\mu$ the unique invariant probability measure on $\bar M$ that satisfies $\bar\mu \circ p^{-1} = \mu$. Then $\bar\mu$ is ergodic whenever $\mu$ is. For $\mu$-a.e. $x$, let $\lambda_1(x) > \lambda_2(x) > \cdots > \lambda_{r(x)}(x)$ be the distinct Lyapunov exponents of $\mu$ at $x$ with multiplicities $m_1(x), \dots, m_{r(x)}(x)$, respectively.

Applying the Oseledec multiplicative ergodic theorem [21] to $(\bar M, \theta, \bar\mu)$, we can obtain a Borel set $\Gamma_0 \subset \bar M$ with $\bar\mu(\Gamma_0) = 1$ such that for each $\bar x = (x_n)_{n=-\infty}^{+\infty} \in \Gamma_0$, there is a measurable splitting

$$T_{x_0} M = E_1(\bar x) \oplus E_2(\bar x) \oplus \cdots \oplus E_{r(x_0)}(\bar x)$$

such that for each $1 \le i \le r(x_0)$,

$$\lim_{n \to \pm\infty} \frac{1}{n} \log |D(\bar x, n) v| = \lambda_i(x_0) \quad \text{for } 0 \ne v \in E_i(\bar x),$$

where $D(\bar x, n) = T_{x_0} f^n$ for $n \ge 0$ and $D(\bar x, n) = (T_{x_{-n}} f)^{-1} \circ \cdots \circ (T_{x_{-1}} f)^{-1}$ for $n < 0$. Put $E^s(\bar x) = \bigoplus_{\lambda_i(x_0) < 0} E_i(\bar x)$ and $E^u(\bar x) = \bigoplus_{\lambda_i(x_0) > 0} E_i(\bar x)$. Let $s(x) = \#\{\lambda_i(x) : \lambda_i(x) < 0\}$. For $\bar x = (x_n)_{n=-\infty}^{+\infty} \in \Gamma_0$, define

$$\bar W^u(\bar x) := \Big\{\bar y \in \bar M : \limsup_{n \to +\infty} \frac{1}{n} \log d(x_{-n}, y_{-n}) < 0\Big\}.$$

It is called the unstable set of $(M, f, \mu)$ in $\bar M$ at $\bar x$. Let $W^u(\bar x) := p(\bar W^u(\bar x))$. It is called the unstable manifold of $(M, f, \mu)$ in $M$ at $\bar x$. It can be proved that the $W^u(\bar x)$'s are all $C^{1,1}$ immersed submanifolds of $M$ tangent at $x_0$ to $E^u(\bar x)$ [14]. Each $W^u(\bar x)$ inherits a Riemannian structure from $M$. Denote by $d^u_{\bar x}(\cdot, \cdot)$ the corresponding Riemannian metric on each leaf of $W^u(\bar x)$ and let $\bar B^u(\bar x, \rho) = \{\bar y \in \bar W^u(\bar x) : d^u_{\bar x}(x_0, y_0) < \rho\}$.

A measurable partition $\xi$ of $\bar M$ is said to be subordinate to $W^u$-manifolds of $(M, f, \mu)$ (cf. [28]) if for $\bar\mu$-a.e. $\bar x$, $\xi(\bar x)$ (the element of $\xi$ that contains $\bar x$) satisfies:
i) $p|_{\xi(\bar x)} : \xi(\bar x) \to p(\xi(\bar x))$ is bijective;
ii) There exists a $\sum_{\lambda_j(x_0) > 0} m_j(x_0)$-dimensional $C^1$ embedded submanifold $V^u(\bar x)$ of $M$ with $V^u(\bar x) \subset W^u(\bar x)$ such that $p(\xi(\bar x)) \subset V^u(\bar x)$ and $p(\xi(\bar x))$ contains an open neighborhood of $x_0$ in $V^u(\bar x)$ (with respect to the submanifold topology of $V^u(\bar x)$).

Let $\xi^u$ be a measurable partition of $\bar M$ subordinate to $W^u$-manifolds of $(M, f, \mu)$. Denote by $\{\bar\mu^{\xi^u}_{\bar x}\}$ the canonical system of conditional measures of $\bar\mu$ associated with $\xi^u$. Then $\mu$ is said to be SRB (cf. [28]) if for each $\xi^u$, we have for $\bar\mu$-a.e. $\bar x$ that the measure $\bar\mu^{\xi^u}_{\bar x} \circ (p|_{\xi^u(\bar x)})^{-1}$ is absolutely continuous with respect to the Lebesgue measure on $V^u(\bar x)$ induced by its inherited Riemannian structure as a submanifold of $M$.

Let $\xi^u$ be as above. The lower and upper pointwise dimension of $\mu$ along $W^u$-manifolds at $\bar x \in \Gamma_0$ with respect to the partition $\xi^u$ are defined by

$$\underline{\delta}^u(\bar x, \xi^u) = \liminf_{\rho \to 0} \frac{\log \bar\mu^{\xi^u}_{\bar x}(\bar B^u(\bar x, \rho))}{\log \rho}; \qquad \overline{\delta}^u(\bar x, \xi^u) = \limsup_{\rho \to 0} \frac{\log \bar\mu^{\xi^u}_{\bar x}(\bar B^u(\bar x, \rho))}{\log \rho}.$$

It was proved in [25] that there exists a ($\mu$-mod 0) $f$-invariant measurable function $\delta^u : M \to \mathbb{R}$, the so-called unstable pointwise dimension of $\mu$, which does not depend on the choice of $\xi^u$, such that $\underline{\delta}^u(\bar x, \xi^u) = \overline{\delta}^u(\bar x, \xi^u) = \delta^u(x_0)$ for $\bar\mu$-a.e. $\bar x$.

Let $\Lambda$ denote the set of regular points in the sense of Oseledec for $(M, f, \mu)$. We may assume $s(x) \ge 1$ for each $x \in \Lambda$. For $x \in \Lambda$, define

$$W^s(x) = \Big\{y \in M : \limsup_{n \to \infty} \frac{1}{n} \log d(f^n(x), f^n(y)) < 0\Big\}.$$
It is called the stable manifold of $f$ at $x$. Let $V^s(x)$ denote the arc connected component of $W^s(x)$ which contains $x$. It is a $C^{1,1}$ immersed submanifold of $M$ with dimension $\sum_{\lambda_j < 0}$ […] $\varepsilon > 0$ there exist a set $\Lambda \subset \bar M$ with $\bar\mu(\Lambda) > 1 - \varepsilon$ and a constant $\kappa \ge 1$ such that for every $\bar x \in \Lambda$ and every sufficiently small $r$ (depending on $\bar x$), we have

$$r^{\varepsilon}\, \mu^{\xi^s}_{x_0}\Big(B^s\Big(x_0, \frac{r}{\kappa}\Big)\Big) \cdot \bar\mu^{\xi^u}_{\bar x}\Big(\bar B^u\Big(\bar x, \frac{r}{\kappa}\Big)\Big) \le \mu(B(x_0, r)) \le r^{-\varepsilon}\, \mu^{\xi^s}_{x_0}(B^s(x_0, \kappa r)) \cdot \bar\mu^{\xi^u}_{\bar x}(\bar B^u(\bar x, \kappa r));$$

ii) $\mu$ is exact dimensional (i.e., $d(\mu, x)$ is constant a.e.) and its pointwise dimension is equal to the sum of the stable and unstable pointwise dimensions, i.e.

$$d(\mu, x) = \delta^s + \delta^u, \quad \text{for } \mu\text{-a.e. } x. \tag{2.1}$$

iii) when $\mu$ is an arbitrary hyperbolic invariant probability measure, (2.1) changes into $d(\mu, x) = \delta^s(x) + \delta^u(x)$, for $\mu$-a.e. $x$.
A simple corollary of the theorem, generalizing (1.2), is:

Theorem 2.2. Let $f : M \to M$ be a $C^2$ non-invertible but non-degenerate endomorphism of a compact surface $M$ and let $\mu$ be an ergodic Borel probability measure with exponents $\lambda_1 > 0 > \lambda_2$. Then

$$d(\mu, x) = \frac{1}{\lambda_1} h_\mu(f) + \frac{1}{-\lambda_2} \big(h_\mu(f) - F_\mu(f)\big), \quad \mu\text{-a.e.}$$

In the case where $\mu$ is an arbitrary ergodic measure, which is not necessarily hyperbolic, let $\delta^c$ denote the multiplicity of its zero Lyapunov exponent. We have

Theorem 2.3. Let $f$ be a $C^2$ non-invertible but non-degenerate endomorphism on $M$ preserving an $f$-ergodic Borel probability measure $\mu$. Then

$$\overline{d}(\mu, x) \le \delta^s + \delta^c + \delta^u \le \dim_L(\mu), \quad \text{for } \mu\text{-a.e. } x.$$

Next we consider the dimension of invariant probability measures for random endomorphisms as in [13]. Let $C^2(M, M)$ be the space of all $C^2$ endomorphisms of $M$ endowed with the $C^2$ topology. We denote by $w$ an element of $C^2(M, M)$ and by $f_w$ the corresponding map in $C^2(M, M)$. Let

$$\Omega = \prod_{-\infty}^{+\infty} C^2(M, M)$$

be the two-sided infinite product of copies of $C^2(M, M)$ endowed with the product topology. For $w = (w_n)_{n=-\infty}^{+\infty} \in \Omega$, let $\{f_{w_n}\}_{n=-\infty}^{+\infty}$ be the corresponding sequence of maps and define, for $n > 0$,

$$f_w^0 = \mathrm{id}, \qquad f_w^n = f_{w_{n-1}} \circ f_{w_{n-2}} \circ \cdots \circ f_{w_0}.$$

Let $\nu$ be a Borel probability measure on $C^2(M, M)$ satisfying

$$\int \log^+ |f_w|_{C^2}\, \nu(dw) < +\infty, \qquad \int \log D(f_w)\, \nu(dw) > -\infty,$$

where $|f_w|_{C^2}$ denotes the $C^2$ norm of $f_w$ and $D(f_w) = \inf_{x \in M} |\det T_x f_w|$. Consider the composition of $\{f_{w_n}\}_{n=-\infty}^{+\infty}$, where the $w_n$'s are chosen independently with distribution $\nu$. Let $\tau$ denote the left shift map on $\Omega$. Then $\nu^{\mathbb{Z}}$ is ergodic with respect to $\tau$ on $\Omega$. The above set-up of the random process will be referred to as $\chi(M, \nu)$ in the sequel. A Borel probability measure $\mu$ on $M$ is called a stationary measure of $\chi(M, \nu)$, or $\chi(M, \nu)$-stationary, if

$$\int_{C^2(M, M)} f_w \mu\, \nu(dw) = \mu.$$

Consider the Markov process generated by $\chi(M, \nu)$ with state space $M$ and transition probabilities

$$P(x, A) = \nu\{w : f_w x \in A\}, \quad A \in \mathcal{B}(M),$$
where $\mathcal{B}(M)$ is the Borel $\sigma$-algebra of $M$. It is easy to see that $\mu$ is $\chi(M, \nu)$-stationary if and only if it is stationary with respect to the transition kernel $P(x, A)$. In the case where the transition probabilities $P(x, \cdot)$, $x \in M$, have a density with respect to the Lebesgue measure $L_M$ on $M$, i.e. there is a measurable function $p : M \times M \to \mathbb{R}^+$ such that for every $x \in M$ one has

$$P(x, A) = \int_A p(x, y)\, dL_M(y), \quad A \in \mathcal{B}(M),$$

every $\chi(M, \nu)$-stationary measure $\mu$ is absolutely continuous with respect to $L_M$ and

$$\frac{d\mu}{dL_M}(y) = \int_M p(x, y)\, \mu(dx).$$

A Borel set $A \in \mathcal{B}(M)$ is said to be $\nu$-invariant if, $\nu$-a.e., $x \in A$ if and only if $f_w x \in A$. Call a $\chi(M, \nu)$-stationary measure $\mu$ ergodic if every $\nu$-invariant Borel set $A$ has $\mu$ measure 0 or 1.

Consider the skew map $T : \Omega \times M \to \Omega \times M$ defined by $T(w, x) = (\tau(w), f_{w_0} x)$, where $w = (w_n)_{n=-\infty}^{+\infty}$. For each $\chi(M, \nu)$-stationary measure $\mu$, there is a unique $T$-invariant Borel probability measure $\mu^*$ on $\Omega \times M$ such that (see [14, Prop. 1.2])

$$\mathrm{Proj}_{C^2(M, M)^{\mathbb{N}} \times M}\, \mu^* = \nu^{\mathbb{N}} \times \mu, \tag{2.2}$$

$$\mathrm{Proj}_{C^2(M, M)^{\mathbb{Z}}}\, \mu^* = \nu^{\mathbb{Z}}, \qquad \mathrm{Proj}_M\, \mu^* = \mu, \tag{2.3}$$

and $T^n(\nu^{\mathbb{Z}} \times \mu)$ converges weakly to $\mu^*$. Moreover, $\mu^*$ is ergodic if $\mu$ is. Disintegrating $\mu^*$ with respect to $\nu^{\mathbb{Z}}$, we obtain a class of measures $\{\mu_w\}_{w \in \Omega}$ (unique $\nu^{\mathbb{Z}}$-a.e.) such that

$$\mu^*(dw, dx) = \nu^{\mathbb{Z}}(dw)\, \mu_w(dx). \tag{2.4}$$

Call $\{\mu_w\}_{w \in \Omega}$ a class of sample measures of $\mu$. It is easy to see from (2.2), (2.3), and (2.4) that the sample measures $\{\mu_w\}_{w \in \Omega}$ of a $\chi(M, \nu)$-stationary measure $\mu$ satisfy
i) $f_{w_0} \mu_w = \mu_{\tau(w)}$, $\nu^{\mathbb{Z}}$-a.e.;
ii) $w \mapsto \mu_w$ depends only on $(w_n)_{n < 0}$ […] $\lambda_1 > \cdots > \lambda_r$ be the Lyapunov exponents of $\chi$ with multiplicities $m_1, \dots, m_r$ respectively. Then for $\mu^*$-a.e. $(w, x)$, there is an associated sequence of subspaces

$$T_x M = V^{(0)}(w, x) \supset V^{(1)}(w, x) \supset \cdots \supset V^{(r)}(w, x) = \{0\}$$

such that

$$\lim_{n \to +\infty} \frac{1}{n} \log |T_x f_w^n v| = \lambda_i$$
for all $v \in V^{(i-1)}(w, x) \setminus V^{(i)}(w, x)$, $1 \le i \le r$. We may assume $\lambda_1 > 0$ for non-triviality. For $j$ with $\lambda_j < 0$, define the stable manifold corresponding to $V^{(j)}$ at $(w, x)$ to be

$$W^{s,j}(w, x) = \Big\{y \in M : \limsup_{n \to +\infty} \frac{1}{n} \log d(f_w^n x, f_w^n y) < \lambda_j\Big\}.$$

Let $V^{s,j}(w, x)$ denote the arc connected component of $W^{s,j}(w, x)$ which contains $x$. It is a $C^{1,1}$ immersed submanifold of $M$ with dimension $\sum_{i \ge j} m_i$. Denote by $d^{s,j}_{(w,x)}$ the metric on $V^{s,j}(w, x)$ inherited from $M$.

The folding entropy of $\mu$ for the system $\chi(M, \nu; \mu)$, denoted by $F_\mu(\chi)$ or $F_\mu$ for simplicity, is defined by

$$F_\mu(\chi) = -\int\!\!\int \log\, (\mu_w)_x^{f_{w_0}^{-1}\epsilon}(\{x\})\, \mu_w(dx)\, \nu^{\mathbb{Z}}(dw),$$

where $\epsilon$ denotes the measurable partition of $M$ into single points and $\{(\mu_w)_x^{f_{w_0}^{-1}\epsilon}\}$ is a disintegration of the measure $\mu_w$ with respect to the partition $f_{w_0}^{-1}\epsilon$. This notion is closely related to that of the Jacobian of measure preserving transformations [22] (see also [16,17]).

The Lyapunov dimension of $\mu$ for $\chi$, denoted by $\dim_L(\mu)$, is as defined in the introduction using $F_\mu(\chi)$ in place of $F_\mu(f)$. As will be explained in Sect. 5.1.2, if the first case in the definition of $\dim_L(\mu)$ happens, the following theorems hold trivially. So we may assume that $K$ in the definition of $\dim_L(\mu)$ always exists.

Hypotheses A, A′, and B below are taken from [13]. We let $L$ be the smallest integer so that $\sum_{j=1}^{L} \lambda_j m_j \le F_\mu$ (i.e. $L = K + 1$ for $K$ in Sect. 2).

Hypothesis A. For $\mu$-a.e. $x$ and $j = L, L + 1$, the distribution of $w \mapsto V^{(j)}(w, x)$ is absolutely continuous with respect to Lebesgue on the space of $(\sum_{i \ge j} m_i)$-planes in $T_x M$.

Theorem 2.4. Let $\chi(M, \nu; \mu)$ be so that $\mu$ is absolutely continuous with respect to Lebesgue on $M$ and $\lambda_j \ne 0$ for all $j$. Assume Hypothesis A is satisfied. Then, for $\nu^{\mathbb{Z}}$-a.e. $w$, $\dim(\mu_w) = \dim_L(\mu)$.

A stronger hypothesis which implies Hypothesis A is as follows. Recall that the Grassmannian bundle of $M$ is

$$\mathrm{Gr}(M) = \bigcup_{k=1}^{\dim M} \mathrm{Gr}(M, k),$$

where $\mathrm{Gr}(M, k)$ is the bundle of $k$-dimensional subspaces of tangent spaces to $M$. For $v \in \mathrm{Gr}(M)$ and $\Gamma \subset \mathrm{Gr}(M)$, the probability transition kernel $Q(v, \Gamma)$ is $Q(v, \Gamma) = \nu\{w : (Df_w)^{-1} v \in \Gamma\}$. By our assumption on $\nu$, the map $(Df_w)^{-1}$ is well-defined for $\nu$-a.e. $w$.
Hypothesis A′. For all $v \in \mathrm{Gr}(M)$, the probability $Q(v, \cdot)$ is absolutely continuous with respect to Lebesgue on $\mathrm{Gr}(M)$.

Theorem 2.5. Theorem 2.4 holds if Hypothesis A is replaced by Hypothesis A′.

Each $x \in M$ generates a partition $P_x$ on $C^2(M, M)$ by $P_x(z) = \{w : f_w x = z\}$. Let $\{\nu^{x,z} : z \in M\}$ be the family of conditional measures associated with $P_x$. Given $y \in M$, let $P_{x,z}(y, \cdot)$ be the image of $\nu^{x,z}$ under the map $w \mapsto f_w y$. Let $\rho^{x,z}_y$ be its density with respect to Lebesgue if it exists.

Hypothesis B.
i) For Lebesgue a.e. $(x, z)$ and all $y \ne x$, $P_{x,z}(y) \ll \mathrm{Leb}$.
ii) For all $\xi > 0$, there exists $G_\xi \subset M \times M$ with $\mathrm{Leb}(M \times M \setminus G_\xi) = 0$ and $E_\xi : G_\xi \to \mathbb{R}^+$ such that for all $(x, z) \in G_\xi$ and all $y$ with $d(x, y) \le E_\xi(x, z)$, $\rho^{x,z}_y \le d(x, y)^{-\dim M - \xi}$.

Theorem 2.6. Let $\chi(M, \nu; \mu)$ be so that $\lambda_j \ne 0$ for all $j$ and Hypothesis B is satisfied. Then, for $\nu^{\mathbb{Z}}$-a.e. $w$, $\dim(\mu_w) = \dim_L(\mu)$.

3. Dimension of Hyperbolic Measures for Endomorphisms

In this section, unless otherwise specified, $\mu$ is assumed to be ergodic and hyperbolic. Let $\rho_0, \rho_1 > 0$ be such that, for any $x \in M$, the map $f|_{B(x, \rho_0)} : B(x, \rho_0) \to M$ is a diffeomorphism onto its image, which contains $B(fx, \rho_1)$. Let $f_x^{-1} : f B(x, \rho_0) \to B(x, \rho_0)$ denote the local inverse.

3.1. Preparatory lemmas.

3.1.1. Lyapunov charts in $(\bar M, \theta)$. Write $\mathbb{R}^{\dim M} = \mathbb{R}^u \times \mathbb{R}^s$, where $u = \dim E^u(\bar x)$, $s = \dim E^s(\bar x)$, $\bar\mu$-a.e. For $v \in \mathbb{R}^{\dim M}$, let $(v^u, v^s)$ be its coordinates with respect to this splitting. Define $|v| = \max\{|v^u|_u, |v^s|_s\}$, where $|\cdot|_u$, $|\cdot|_s$ are the Euclidean norms on $\mathbb{R}^u$, $\mathbb{R}^s$, respectively. The closed disk in $\mathbb{R}^u$ (or $\mathbb{R}^s$) of radius $\rho$ centered at 0 is denoted by $R^u(\rho)$ (or $R^s(\rho)$) and $R(\rho) = R^u(\rho) \times R^s(\rho)$. Put $\lambda^s = \max\{\lambda_i : \lambda_i < 0\}$, $\lambda^u = \min\{\lambda_i : \lambda_i > 0\}$. Let $0 < \varepsilon < \min\{-\lambda^s, \lambda^u\}/200$ be given.
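There exist a Borel set $\Gamma_0' \subset \Gamma_0$ with $\bar\mu(\Gamma_0') = 1$ and $\theta(\Gamma_0') = \Gamma_0'$ and a measurable function $l : \Gamma_0' \to [1, +\infty)$ with $l(\theta^{\pm 1} \bar x) \le e^{\varepsilon} l(\bar x)$ such that for each $\bar x \in \Gamma_0'$ one can define an embedding $\Phi_{\bar x} : R(l(\bar x)^{-1}) \to M$ with the following properties:

(i) $\Phi_{\bar x}(0) = x_0$ and $T_0 \Phi_{\bar x}$ takes $\mathbb{R}^u$, $\mathbb{R}^s$ to $E^u(\bar x)$, $E^s(\bar x)$, respectively.

(ii) Put $f_{\bar x} := \Phi_{\theta \bar x}^{-1} \circ f \circ \Phi_{\bar x}$ and $f_{\bar x}^{-1} := \Phi_{\theta^{-1} \bar x}^{-1} \circ f_{x_{-1}}^{-1} \circ \Phi_{\bar x}$, defined whenever they make sense. Then

$$|T_0 f_{\bar x} v| \ge e^{\lambda^u - \varepsilon} |v| \ \text{ for } v \in \mathbb{R}^u, \qquad |T_0 f_{\bar x} v| \le e^{\lambda^s + \varepsilon} |v| \ \text{ for } v \in \mathbb{R}^s.$$

(iii) Let $L(g)$ denote the Lipschitz constant of the function $g$; then $L(f_{\bar x} - T_0 f_{\bar x}) \le \varepsilon$, $L(f_{\bar x}^{-1} - T_0 f_{\bar x}^{-1}) \le \varepsilon$ and $L(T f_{\bar x}), L(T f_{\bar x}^{-1}) \le l(\bar x)$.

(iv) For any $v, v' \in R(l(\bar x)^{-1})$, we have $\kappa^{-1} d(\Phi_{\bar x} v, \Phi_{\bar x} v') \le |v - v'| \le l(\bar x)\, d(\Phi_{\bar x} v, \Phi_{\bar x} v')$ for some universal constant $\kappa > 0$.

(v) $|f_{\bar x} v| \le e^{\lambda} |v|$ and $|f_{\bar x}^{-1} v| \le e^{\lambda} |v|$ for all $v \in R(e^{-(\lambda + \varepsilon)} l(\bar x)^{-1})$, where $\lambda > 0$ is a number depending only on $\varepsilon$ and the exponents. In particular, $f_{\bar x}^{\pm 1} R(e^{-(\lambda + \varepsilon)} l(\bar x)^{-1}) \subset R(l(\theta^{\pm 1} \bar x)^{-1})$.

Any system of local charts $\{\Phi_{\bar x} : \bar x \in \Gamma_0'\}$ satisfying (i)–(v) above will be referred to as $(\varepsilon, l)$-charts.

3.1.2. A special partition $\mathcal{P}$. Given two measurable partitions $\xi_1$ and $\xi_2$ of a measurable space $(X, \nu)$, we say that $\xi_1$ refines $\xi_2$ ($\xi_1 > \xi_2$) if $\xi_1(x) \subset \xi_2(x)$ at $\nu$-a.e. $x \in X$. Denote by $\vee$ the join of two partitions. Let $\{\Phi_{\bar x}, \bar x \in \Gamma_0'\}$ be a system of $(\varepsilon, l)$-charts and let $0 < \delta \le 1$ be a reduction factor. For $\bar x \in \Gamma_0'$, put $f_{\bar x}^0 = \mathrm{Id}$ and

$$f_{\bar x}^n = f_{\theta^{n-1} \bar x} \circ \cdots \circ f_{\bar x}, \qquad f_{\bar x}^{-n} := f_{\theta^{-(n-1)} \bar x}^{-1} \circ \cdots \circ f_{\bar x}^{-1} \quad \text{for } n \ge 1,$$

and define

$$S_\delta^{cs}(\bar x) = \{z \in R(l(\bar x)^{-1}) : |f_{\bar x}^n(z)| \le \delta\, l(\theta^n \bar x)^{-1},\ \forall\, n \ge 0\},$$

$$S_\delta^{cu}(\bar x) = \{z \in R(l(\bar x)^{-1}) : |f_{\bar x}^{-n}(z)| \le \delta\, l(\theta^{-n} \bar x)^{-1},\ \forall\, n \ge 0\}.$$

A measurable partition $\mathcal{P}$ of $(\bar M, \bar\mu)$ is said to be adapted to $(\{\Phi_{\bar x}\}, \delta)$ if for $\bar\mu$-a.e. $\bar x \in \Gamma_0'$ one has

$$p(\mathcal{P}^-(\bar x)) \subset \Phi_{\bar x} S_\delta^{cs}(\bar x), \qquad p(\mathcal{P}^+(\bar x)) \subset \Phi_{\bar x} S_\delta^{cu}(\bar x),$$

where $\mathcal{P}^- = \bigvee_{n=0}^{\infty} \theta^{-n} \mathcal{P}$ and $\mathcal{P}^+ = \bigvee_{n=0}^{\infty} \theta^{n} \mathcal{P}$.

Lemma 3.1. For any $0 < \delta < e^{-\lambda - \varepsilon}$, there exists a measurable partition $\mathcal{P}$ of $(\bar M, \bar\mu)$, which is adapted to $(\{\Phi_{\bar x}\}, \delta)$ and satisfies $H_{\bar\mu}(\mathcal{P}) < \infty$.

(The proof of Lemma 3.1 only differs slightly from [19, Lemma 4.2.1] in the definition of $\varphi$ below. We give it for completeness.)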
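Proof. Fix some $l_0 > 0$ such that the set $\Lambda = \{\bar x \in \Gamma_0' : l(\bar x) \le l_0\}$ has positive $\bar\mu$ measure. For $\bar x \in \Lambda$ let

$$r^+(\bar x) = \min\{k > 0 : \theta^k(\bar x) \in \Lambda\}, \qquad r^-(\bar x) = \min\{k > 0 : \theta^{-k}(\bar x) \in \Lambda\}.$$

Define $\varphi : \bar M \to (0, +\infty)$ by

$$\varphi(\bar x) = \begin{cases} \min\{\delta, \rho_0\}, & \text{if } \bar x \notin \Lambda; \\ \min\{\delta\, l_0^{-2} e^{-(\lambda + 3\varepsilon) \max\{r^+(\bar x),\, r^-(\bar x)\}}, \rho_0\}, & \text{if } \bar x \in \Lambda. \end{cases}$$

Then $\varphi$ is defined $\bar\mu$-almost everywhere and $\log \varphi$ is $\bar\mu$-integrable since $\int_\Lambda r^+(\bar x)\, d\bar\mu = \int_\Lambda r^-(\bar x)\, d\bar\mu = 1$. Follow [19, Lemma 4.2.1] (to use Mañé's idea [20]) to construct a partition $\mathcal{P}$ with $H_{\bar\mu}(\mathcal{P}) < \infty$ so that $\mathrm{diam}\, p(\mathcal{P}(\bar x)) \le \varphi(\bar x)$ for any $\bar x \in \bar M$. Then, using the recurrence functions $r^+$, $r^-$ and property (v) of the charts $\{\Phi_{\bar x}\}$, we can conclude that $p(\mathcal{P}^-(\bar x)) \subset \Phi_{\bar x} S_\delta^{cs}(\bar x)$ and $p(\mathcal{P}^+(\bar x)) \subset \Phi_{\bar x} S_\delta^{cu}(\bar x)$.

We use the following notations. Let $\eta$ be a measurable partition of $(\bar M, \theta)$. For every integer $k, l \ge 1$, we define $\eta_k^l = \bigvee_{n=-k}^{l} \theta^{-n} \eta$. We observe that $\eta_k^l(\bar x) = \eta_k^0(\bar x) \cap \eta_0^l(\bar x)$. Let $\xi^s$ be a partition of $M$ subordinate to $W^s$-manifolds satisfying $\xi^s > f^{-1} \xi^s$. Denote by $\bar\xi^s = p^{-1} \xi^s$ the lift of $\xi^s$ to $(\bar M, \theta)$. Let $\xi^u$ be a partition of $(\bar M, \theta)$ subordinate to $W^u$-manifolds satisfying $\xi^u > \theta \xi^u$. Then as in [15] we have

Lemma 3.2 (cf. [15, Lemma 4.4]). Let $\lambda_0 \le \min\{|\lambda_j|, 1 \le j \le r\}$ and fix $0 < \delta < e^{-\lambda - \varepsilon}$ arbitrarily. Let $\mathcal{P}$ be as constructed in Lemma 3.1. Then there exist some constant $\kappa \ge 1$ and a measurable function $n_0 : \bar M \to \mathbb{Z}^+$ such that for $\bar\mu$-a.e. $\bar x \in \bar M$ and all $n \ge n_0(\bar x)$,
i) $p(\mathcal{P}_n^n(\bar x)) \subset B(x_0, \kappa\delta e^{-n\lambda_0/2})$;
ii) $p([\bar\xi^s \vee \mathcal{P}_n^0](\bar x)) \subset B^s(x_0, \kappa\delta e^{-n\lambda_0/2})$, $[\xi^u \vee \mathcal{P}_0^n](\bar x) \subset \bar B^u(\bar x, \kappa\delta e^{-n\lambda_0/2})$;
iii) $p(\mathcal{P}^-(\bar x)) \subset B^s(x_0, \kappa\delta)$, $\mathcal{P}^+(\bar x) \subset \bar B^u(\bar x, \kappa\delta)$.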
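The integrability step in the proof of Lemma 3.1 rests on the identity $\int_\Lambda r^\pm\, d\mu = 1$, which is Kac's lemma for an ergodic invertible system. The following toy check, on a cyclic permutation of finitely many points with the uniform invariant measure, is only meant to illustrate that identity; the finite system, the chosen subset and the function names are assumptions made for this sketch and do not appear in the paper.

```python
from fractions import Fraction


def first_return_times(perm, subset):
    """First return time to `subset` for each of its points.

    perm[i] is the image of state i under an invertible map on
    {0, ..., n-1}; `subset` is assumed non-empty and to lie on cycles
    that return to it (true for a single cycle).
    """
    times = {}
    for x in subset:
        steps, y = 1, perm[x]
        while y not in subset:
            y = perm[y]
            steps += 1
        times[x] = steps
    return times


def kac_integral(perm, subset):
    """Integral of the first return time over `subset` with respect
    to the uniform measure, computed exactly with fractions."""
    n = len(perm)
    times = first_return_times(perm, subset)
    return sum(Fraction(t, n) for t in times.values())


# A single 7-cycle is ergodic for the uniform measure.
cycle = [(i + 1) % 7 for i in range(7)]
subset = {0, 2, 3}
times = first_return_times(cycle, subset)
total = kac_integral(cycle, subset)
```

Here the return times add up to the length of the cycle, so the integral equals one exactly, mirroring the identity used to show that $\log\varphi$ is integrable.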
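Moreover, due to the generating properties of $\xi^s$ and $\xi^u$ (cf. [19] and [25]), we have

Lemma 3.3 (cf. [15, Lemma 4.5]). Let $\mathcal{P}$ be the partition (depending particularly on $0 < \delta < e^{-\lambda - \varepsilon}$) given above. Then one has for $\bar\mu$-a.e. $\bar x$,

$$\lim_{\delta \downarrow 0} {\lim_{n \to +\infty}}^{\!*} -\frac{1}{2n} \log \bar\mu(\mathcal{P}_n^n(\bar x)) = h_\mu(f),$$
$$\lim_{\delta \downarrow 0} {\lim_{n \to +\infty}}^{\!*} -\frac{1}{n} \log \bar\mu^{\bar\xi^s}_{\bar x}(\mathcal{P}_n^0(\bar x)) = h_\mu(f),$$
$$\lim_{\delta \downarrow 0} {\lim_{n \to +\infty}}^{\!*} -\frac{1}{n} \log \bar\mu^{\bar\xi^s}_{\bar x}(\mathcal{P}_n^n(\bar x)) = h_\mu(f),$$
$$\lim_{\delta \downarrow 0} {\lim_{n \to +\infty}}^{\!*} -\frac{1}{n} \log \bar\mu^{\xi^u}_{\bar x}(\mathcal{P}_0^n(\bar x)) = h_\mu(f),$$
$$\lim_{\delta \downarrow 0} {\lim_{n \to +\infty}}^{\!*} -\frac{1}{n} \log \bar\mu^{\xi^u}_{\bar x}(\mathcal{P}_n^n(\bar x)) = h_\mu(f),$$
$$\lim_{\delta \downarrow 0} {\lim_{n \to +\infty}}^{\!*} -\frac{1}{n} \log \bar\mu^{\bar\epsilon}_{\bar x}(\mathcal{P}_n^n(\bar x)) = F_\mu(f),$$
$$\lim_{n \to \infty} -\frac{1}{n} \log \bar\mu^{\bar\xi^s}_{\bar x}(\mathcal{P}_0^n(\bar x)) = 0,$$
$$\lim_{n \to \infty} -\frac{1}{n} \log \bar\mu^{\xi^u}_{\bar x}(\mathcal{P}_n^0(\bar x)) = 0,$$

where the limits $\lim^*_{n \to \infty}$ above are understood as both $\liminf_{n \to \infty}$ and $\limsup_{n \to \infty}$.

3.1.3. Points with good local behavior. Let $0 < \varepsilon < 1$ be given sufficiently small. Let $0 < \varepsilon_* \le (1/200) \min\{\lambda_0, 1\}$ and let $\{\Phi_{\bar x}\}$ be a system of $(\varepsilon_*, l_*)$ Lyapunov charts. Let $0 < \delta_* < e^{-\lambda - \varepsilon}$ be small enough. Set $h = h_\mu(f)$, $\bar\epsilon = p^{-1}\epsilon$, with $\epsilon$ being the partition of $M$ into single points, and $F_\mu = F_\mu(f)$. Then, by Lemma 3.2 and Lemma 3.3, we can find a measurable partition $\mathcal{P}$ of $(\bar M, \theta)$ with $H_{\bar\mu}(\mathcal{P}) < \infty$ and a set $\Gamma \subset \bar M$ of measure $\bar\mu(\Gamma) > 1 - \varepsilon^4$ together with an integer $n_0 = n_0(\varepsilon) \ge 1$ and a number $C = C(\varepsilon) > 1$ such that for every $\bar x \in \Gamma$ and $n \ge n_0$, the following statements hold:

a) for all integers $k \ge 1$ we have
$$C^{-1} e^{-k(h + \varepsilon)} \le \bar\mu(\mathcal{P}_0^k(\bar x)) \le C e^{-k(h - \varepsilon)},$$
$$C^{-1} e^{-k(h + \varepsilon)} \le \bar\mu(\mathcal{P}_k^0(\bar x)) \le C e^{-k(h - \varepsilon)},$$
$$C^{-1} e^{-k\varepsilon} \le \bar\mu^{\bar\xi^s}_{\bar x}(\mathcal{P}_0^k(\bar x)) \le 1,$$
$$C^{-1} e^{-k(h + \varepsilon)} \le \bar\mu^{\bar\xi^s}_{\bar x}(\mathcal{P}_k^0(\bar x)) \le C e^{-k(h - \varepsilon)},$$
$$C^{-1} e^{-k\varepsilon} \le \bar\mu^{\xi^u}_{\bar x}(\mathcal{P}_k^0(\bar x)) \le 1,$$
$$C^{-1} e^{-k(h + \varepsilon)} \le \bar\mu^{\xi^u}_{\bar x}(\mathcal{P}_0^k(\bar x)) \le C e^{-k(h - \varepsilon)};$$

b) for all integers $k \ge 1$ we have
$$C^{-1} e^{-2k(h + \varepsilon)} \le \bar\mu(\mathcal{P}_k^k(\bar x)) \le C e^{-2k(h - \varepsilon)},$$
$$C^{-1} e^{-k(h + \varepsilon)} \le \bar\mu^{\bar\xi^s}_{\bar x}(\mathcal{P}_k^k(\bar x)) \le C e^{-k(h - \varepsilon)},$$
$$C^{-1} e^{-k(h + \varepsilon)} \le \bar\mu^{\xi^u}_{\bar x}(\mathcal{P}_k^k(\bar x)) \le C e^{-k(h - \varepsilon)},$$
$$C^{-1} e^{-k(F_\mu + \varepsilon)} \le \bar\mu^{\bar\epsilon}_{\bar x}(\mathcal{P}_k^k(\bar x)) \le C e^{-k(F_\mu - \varepsilon)};$$

c)
$$e^{-n(\delta^s + \varepsilon/2)} \le \mu^{\xi^s}_{x_0}(B^s(x_0, e^{-n})) \le e^{-n(\delta^s - \varepsilon/2)},$$
$$e^{-n(\delta^u + \varepsilon/2)} \le \bar\mu^{\xi^u}_{\bar x}(\bar B^u(\bar x, e^{-n})) \le e^{-n(\delta^u - \varepsilon/2)};$$

d)
$$p(\mathcal{P}_{an}^{an}(\bar x)) \subset B(x_0, e^{-n}),$$
$$[\bar\xi^s \vee \mathcal{P}_{an}^0](\bar x) \subset p^{-1}(B^s(x_0, e^{-n})) \subset \bar\xi^s(\bar x),$$
$$[\xi^u \vee \mathcal{P}_0^{an}](\bar x) \subset \bar B^u(\bar x, e^{-n}) \subset \xi^u(\bar x),$$
where $a$ is the integer part of $2(1 + (\lambda_0)^{-1})$;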
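e) for each $\bar x \in \Gamma$,
$$d(z, z') \le d^s_{x_0}(z, z') \le 2 d(z, z'), \quad \forall\, z, z' \in B^s(x_0, e^{-n_0});$$
$$d(p(\bar z), p(\bar z')) \le d^u_{\bar x}(p(\bar z), p(\bar z')) \le 2 d(p(\bar z), p(\bar z')), \quad \forall\, \bar z, \bar z' \in \bar B^u(\bar x, e^{-n_0});$$

f) for every $\bar x \in \Gamma$ and $n \ge n_0$,
$$B\Big(x_0, \frac{1}{2} e^{-n}\Big) \cap \xi^s(x_0) \subset B^s(x_0, e^{-n}) \subset B(x_0, e^{-n}) \cap \xi^s(x_0),$$
$$p^{-1}\Big(B\Big(x_0, \frac{1}{2} e^{-n}\Big)\Big) \cap \xi^u(\bar x) \subset \bar B^u(\bar x, e^{-n}) \subset p^{-1}(B(x_0, e^{-n})) \cap \xi^u(\bar x).$$

3.1.4. Density points of the set $\Gamma$. Since we only have control on points in $\Gamma$, we will pick out the “density points” of $\Gamma$ for later use. Note that $\bar M$ is not a finite dimensional manifold. We need the following slight variant of the Borel density lemma.

Lemma 3.4 ([25, Lemma 3.1]). Let $A \subset \bar M$ be a measurable set with $\bar\mu(A) > 0$. Then for $\bar\mu$-a.e. $\bar x \in A$,

$$\lim_{\rho \to 0} \frac{\bar\mu^{\xi^u}_{\bar x}(\bar B^u(\bar x, \rho) \cap A)}{\bar\mu^{\xi^u}_{\bar x}(\bar B^u(\bar x, \rho))} = 1.$$

Based on this, we can further use its idea to show

Lemma 3.5. There exists a set $\hat\Gamma \subset \Gamma$ with $\bar\mu(\hat\Gamma) > 1 - 30\varepsilon$ and $\hat n \in \mathbb{N}$ such that for any $\bar x \in \hat\Gamma$, $n \ge \hat n$,

$$\bar\mu\big(p^{-1}(B(x_0, e^{-n})) \cap \mathcal{P}(\bar x) \cap \Gamma\big) \ge \frac{1}{8C}\, \mu\big(B(x_0, e^{-n})\big), \tag{3.1}$$

$$\bar\mu^{\bar\xi^s}_{\bar x}\big(p^{-1}(B^s(x_0, e^{-n})) \cap \mathcal{P}(\bar x) \cap \Gamma\big) \ge \frac{1}{8C}\, \mu^{\xi^s}_{x_0}\big(B^s(x_0, e^{-n})\big), \tag{3.2}$$

$$\bar\mu^{\xi^u}_{\bar x}\big(\bar B^u(\bar x, e^{-n}) \cap \Gamma\big) \ge \frac{1}{2}\, \bar\mu^{\xi^u}_{\bar x}\big(\bar B^u(\bar x, e^{-n})\big), \tag{3.3}$$

$$\bar\mu^{\bar\epsilon}_{\bar x}(\mathcal{P}(\bar x) \cap \Gamma) \ge \frac{1}{2C}. \tag{3.4}$$

Proof. We first pick out points satisfying (3.4). Let

$$A := \Big\{\bar x \in \bar M : \bar\mu^{\bar\epsilon}_{\bar x}(\mathcal{P}(\bar x) \cap \Gamma) \ge \frac{1}{2C}\Big\}.$$

We show $\bar\mu(A) > 1 - 3\varepsilon^2$. Let $\{\mu_D : D \in \mathcal{P}\}$ be a canonical system of conditional measures of $\bar\mu$ with respect to the partition $\mathcal{P}$. Denote by $\mu/\mathcal{P}$ the corresponding induced measure on the factor space of $\bar M$ with respect to $\mathcal{P}$. Put $\mathcal{A} = \{D \in \mathcal{P} : \mu_D(\Gamma) \ge 1 - \varepsilon^2\}$, then

$$\bar\mu(\Gamma) = \int \mu_D(\Gamma)\, d\mu/\mathcal{P} \le 1 - \varepsilon^2 (1 - \mu/\mathcal{P}(\mathcal{A})),$$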
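which gives $\mu/\mathcal{P}(\mathcal{A}) \ge 1 - \varepsilon^2$. For each $D \in \mathcal{A}$ fixed, define
\[ K_D := \Big\{ y \in M : \mu_{D \cap p^{-1}\{y_0\}}(\Gamma) \ge \frac{1}{2} \Big\}. \]
Then $\mu_D(K_D) \ge 1 - 2\varepsilon^2$. For $y \in K_D$, we see that
\[ \mu_y(\mathcal{P}(y) \cap \Gamma) = \frac{\mu_{\mathcal{P}(y) \cap p^{-1}\{y_0\}}(\Gamma)}{\mu_y(\mathcal{P}(y))} \ge \frac{1}{2C}, \]
where $\mu_y(\mathcal{P}(y)) \le C$ by b) of Sect. 3.1.3 since $p^{-1}\{y_0\} \cap \mathcal{P}(y) \cap \Gamma \ne \emptyset$. Hence we have
\[ \mu(A) \ge \int_{\mathcal{A}} \mu_D(A)\, d\mu/\mathcal{P} \ge \int_{\mathcal{A}} \mu_D(K_D)\, d\mu/\mathcal{P} \ge (1 - \varepsilon^2)(1 - 2\varepsilon^2) \ge 1 - 3\varepsilon^2. \]
Let $\Gamma_1 = \Gamma \cap A$. Then $\mu(\Gamma_1) > 1 - 4\varepsilon^2$. Next, let $n_1 \in \mathbb{N}$ and define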
1 s ξs A1 = x : μx p −1 (B s (x0 , e−n )) ∩ Γ1 ≥ μξx0 (B s (x0 , e−n )), ∀ n ≥ n 1 . 4 We show μ(A1 ) > 1 − 12ε for some n 1 large. Let {μ E : E ∈ ξ s } be a canonical system of conditional measures of μ with respect to the partition ξ s . Denote by μ/ξ s the corresponding induced measure on the factor space of M with respect to ξ s . Put A1 = {E ∈ ξ s : μ E (Γ1 ) ≥ 1 − 2ε}, then μ(Γ1 ) =
μ E (Γ1 ) dμ/ξ s ≤ 1 − 2ε(1 − μ/ξ s (A1 )),
which gives μ/ξ s (A1 ) > 1 − 2ε. For each E ∈ A1 fixed, define
1 −1 . K E := y ∈ M : μ E∩ p {y} (Γ1 ) ≥ 2 Put E = p(E). Then μ E (K E ) ≥ 1 − 4ε and
1 μ E p −1 (B s (x0 , e−n )) ∩ Γ1 ≥ μ E B s (x0 , e−n ) ∩ K E . 2
By the Borel density lemma, there exists n = n (E) and K E ⊂ K E of measure μ E (K E ) ≥ 1 − 6ε such that μ E (B s (x0 , e−n ) ∩ K E ) ≥
1 E s μ (B (x0 , e−n )), ∀n ≥ n , y ∈ K E . 2
Thus we can define a measurable function n : A1 → Z+ such that the above equation holds true. Let n 1 be a large number such that the set 1 := {E ∈ A1 : n (E) ≤ n 1 } A 1 ) ≥ 1 − 4ε. Therefore, if E ∈ A 1 and y ∈ K ∩ p(A ∩ E), then has measure μ/ξ s (A E for n ≥ n 1 ,
1 μ E p −1 (B s (y, e−n )) ∩ Γ1 ≥ μ E (B s (y, e−n )), 4 i.e., p −1 (K E ) ∩ Γ1 ∩ E ⊂ A1 . Thus μ(A1 ) ≥ μ E (A1 ) dμ/ξ s 1 A ≥ μ E ( p −1 (K E ) ∩ Γ1 ∩ E) dμ/ξ s 1 A
≥ (1 − 8ε)(1 − 4ε) ≥ 1 − 12ε. Similarly, let n 2 ∈ N and define
1 A2 = x ∈ M : μ p −1 (B(x0 , e−n )) ∩ Γ1 ≥ μ(B(x0 , e−n )), ∀n ≥ n 2 . 4 We have μ(A2 ) > 1 − 12ε for n 2 large. Let n 3 ∈ N and define
1 ξu u ξu B u (x, e−n ) ∩ Γ1 ) ≥ μx ( B (x, e−n )), ∀n ≥ n 3 . A3 = x ∈ M : μx ( 2 Then points in A3 satisfy (3.3). By Lemma 3.4, we have μ(A3 ) > 1 − ε for n 3 large. Now put Γˆ = A∩ A1 ∩ A2 ∩ A3 ∩Γ . Then μ(Γˆ ) > 1−30ε. Let nˆ = max{n 1 , n 2 , n 3 }. For x ∈ Γˆ and n > n, ˆ we have
s ξ μx p −1 (B s (x0 , e−n )) ∩ P(x) ∩ Γ
ξs ≥ μy p −1 (B s (x0 , e−n )) ∩ P(x) ∩ Γ dμx (y) ξs ≥ μy (P(x) ∩ Γ ) dμx (y) −1 s −n p (B (x0 ,e ))∩Γ ξs ≥ μy (P(x) ∩ Γ ) dμx (y) p −1 (B s (x0 ,e−n ))∩Γ1 ξs
≥ μx ≥
1 p −1 (B s (x0 , e−n )) ∩ Γ1 · 2C
1 ξs μ (B(x0 , e−n )), 8C x0
i.e., $x$ satisfies (3.2). Similarly, we can show (3.1). This finishes the proof of the lemma.

Let $\hat{\Gamma} \subset \Gamma$ with $\mu(\hat{\Gamma}) > 1 - 30\varepsilon$ be as obtained in Lemma 3.5. We can further require that for every $n \ge \hat{n}$ and $x \in \hat{\Gamma}$,
\[ \mu_x^{\xi^u}\big(P_{an}^{0}(x) \cap B^u(x, e^{-n}) \cap \Gamma\big) \ge e^{-n(\delta^u+\varepsilon)}. \tag{3.5} \]
This inequality can be obtained by considering μ in place of the random measure in [15].
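3.2. Proof of Theorem 2.1. We first show i) and ii) of Theorem 2.1. The proof will follow the line in [15] (see also [3]). Let $\mu$ be ergodic. Fix $x \in \Gamma$ and $n \ge n_0$. Consider
\[ \mathcal{R}_n := \{P_{an}^{an}(y) \subset \mathcal{P}(x) : P_{an}^{an}(y) \cap \Gamma \ne \emptyset\}, \]
\[ \mathcal{F}_n^s := \{P_{an}^{0}(y) \subset \mathcal{P}(x) : P_{an}^{0}(y) \cap \Gamma \ne \emptyset\}, \]
\[ \mathcal{F}_n^u := \{P_{0}^{an}(y) \subset \mathcal{P}(x) : P_{0}^{an}(y) \cap \Gamma \ne \emptyset\}. \]
For each $A \subset \mathcal{P}(x)$, define a series of subsets of $\mathcal{R}_n$, $\mathcal{F}_n^s$ or $\mathcal{F}_n^u$ by the following:
\[ \mathcal{N}(n, A) := \{R \in \mathcal{R}_n : R \cap A \ne \emptyset\}, \]
\[ \mathcal{N}^s(n, y, A) := \mathcal{N}(n, \xi^s(y) \cap \Gamma \cap A), \]
\[ \mathcal{N}^u(n, y, A) := \mathcal{N}(n, \xi^u(y) \cap \Gamma \cap A), \]
\[ \hat{\mathcal{N}}^s(n, A) := \{R \in \mathcal{F}_n^s : R \cap A \ne \emptyset\}, \]
\[ \hat{\mathcal{N}}^u(n, A) := \{R \in \mathcal{F}_n^u : R \cap A \ne \emptyset\}. \]
It is clear that $\mathcal{R}_n \subset \mathcal{F}_n^s \vee \mathcal{F}_n^u := \{R^s \cap R^u : R^s \in \mathcal{F}_n^s,\ R^u \in \mathcal{F}_n^u,\ R^s \cap R^u \ne \emptyset\}$. From this we have

Lemma 3.6. For $x \in \Gamma$ and each $n \ge n_0$,
\[ \#\mathcal{N}\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big) \le \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big) \cdot \#\hat{\mathcal{N}}^u\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big). \]
(Here $\# D$ denotes the cardinality of a countable set $D$.)

On the other hand, it is easy to see that

Lemma 3.7. For each $x \in \Gamma$ and integer $n > n_0$, we have
\[ \#\mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n}))\big) \le \mu_{x_0}^{\xi^s}\big(B^s(x_0, 4e^{-n})\big) \cdot C e^{an(h+\varepsilon)}, \]
\[ \#\mathcal{N}^u\big(n, x, p^{-1}(B(x_0, e^{-n}))\big) \le \mu_x^{\xi^u}\big(B^u(x, 4e^{-n})\big) \cdot C e^{an(h+\varepsilon)}. \]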
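Proof. For each $R \in \mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n}))\big)$, we have by b) of Sect. 3.1.3 that
\[ \mu_x^{\xi^s}(R) \ge C^{-1} e^{-an(h+\varepsilon)}. \]
Moreover, we see that $p(R) \cap \xi^s(x_0) \subset B^s(x_0, 4e^{-n})$. Hence
\[ \mu_{x_0}^{\xi^s}\big(B^s(x_0, 4e^{-n})\big) \ge \sum_{R \in \mathcal{N}^s(n, x, p^{-1}(B(x_0, e^{-n})))} \mu_x^{\xi^s}(R) \ge \#\mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n}))\big) \cdot C^{-1} e^{-an(h+\varepsilon)}, \]
from which the first inequality of the lemma follows. The proof of the second inequality of the lemma is similar.

Lemma 3.8. For each $x \in \hat{\Gamma}$ and $n > \hat{n}$, we have
\[ \mu\big(B(x_0, e^{-n})\big) \le \#\mathcal{N}\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big) \cdot 8C^2 e^{-2an(h-\varepsilon)}. \]
Proof. By Lemma 3.5 and b) of Sect. 3.1.3, we have
\[ \frac{1}{8C}\, \mu\big(B(x_0, e^{-n})\big) \le \mu\big(p^{-1}(B(x_0, e^{-n})) \cap \mathcal{P}(x) \cap \Gamma\big) \le \sum_{R \in \mathcal{N}(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma)} \mu(R) \le \#\mathcal{N}\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big) \cdot C e^{-2an(h-\varepsilon)}. \]

So, by the above three lemmas, to show i) of Theorem 2.1, it suffices to compare the cardinalities of the sets $\mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n}))\big)$ and $\mathcal{N}^u\big(n, x, p^{-1}(B(x_0, e^{-n}))\big)$ with those of $\hat{\mathcal{N}}^s\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big)$ and $\hat{\mathcal{N}}^u\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big)$, respectively.

Lemma 3.9. For each $x \in \Gamma$, $y \in \mathcal{P}(x)$ and $n \ge n_0$, we have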
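\[ \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big) \ge \#\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big) \cdot C^{-2} e^{-2an\varepsilon}, \]
\[ \#\hat{\mathcal{N}}^u\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big) \ge \#\mathcal{N}^u\big(n, y, p^{-1}(B(y_0, e^{-n}))\big) \cdot C^{-2} e^{-2an\varepsilon}. \]
Proof. Obviously, we have
\[ \#\hat{\mathcal{N}}^s\big(n, \xi^s(y) \cap p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big) \le \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big). \tag{3.6} \]
Fix $R^s \in \hat{\mathcal{N}}^s\big(n, \xi^s(y) \cap p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big)$. For each $z$ that belongs to $\xi^s(y) \cap p^{-1}(B(y_0, e^{-n})) \cap \Gamma$ such that $R^s = P_{an}^{0}(z)$, we see that $R = P_{an}^{an}(z)$ is a rectangle in $\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big)$. The number of different $R$ corresponding to $R^s$, denoted by $A_n$, satisfies
\begin{align*}
A_n &\le \frac{\mu_y^{\xi^s}(R^s)}{\min\{\mu_y^{\xi^s}(R) : R \in \mathcal{N}^s(n, y, p^{-1}(B(y_0, e^{-n}))),\ R \subset R^s\}} \\
&= \frac{\mu_y^{\xi^s}(P_{an}^{0}(z))}{\min\{\mu_y^{\xi^s}(P_{an}^{an}(z')) : z' \in \xi^s(y) \cap p^{-1}(B(y_0, e^{-n})) \cap R^s \cap \Gamma\}} \\
&\le C e^{-an(h-\varepsilon)} / (C^{-1} e^{-an(h+\varepsilon)}) = C^2 e^{2an\varepsilon}.
\end{align*}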
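Therefore
\[ \#\hat{\mathcal{N}}^s\big(n, \xi^s(y) \cap p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big) \ge \#\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big) \cdot C^{-2} e^{-2an\varepsilon}. \]
This together with (3.6) implies the first inequality of the lemma. The other inequality of the lemma can be similarly obtained.

As to the inequalities in the reverse direction, we first have

Lemma 3.10. For each $x \in \Gamma$ and $n \ge n_0$,
\[ \#\hat{\mathcal{N}}^s(n, \mathcal{P}(x)) \le C e^{an(h+\varepsilon)}, \qquad \#\hat{\mathcal{N}}^u(n, \mathcal{P}(x)) \le C e^{an(h+\varepsilon)}. \]
Proof. The set $\mathcal{P}(x)$ is the union of a collection of rectangles $R = P_{an}^{0}(z)$. So
\[ 1 \ge \mu(\mathcal{P}(x)) \ge \sum_{R \in \hat{\mathcal{N}}^s(n, \mathcal{P}(x))} \mu(R) \ge \#\hat{\mathcal{N}}^s(n, \mathcal{P}(x)) \cdot C^{-1} \cdot e^{-an(h+\varepsilon)}, \]
where the last inequality holds since different rectangles $R$ of $\hat{\mathcal{N}}^s(n, \mathcal{P}(x))$ are mutually disjoint. The first inequality of the lemma follows immediately. The second inequality of the lemma can be similarly obtained.

Then using the existence of $\delta^s$ and $\delta^u$ and Lemma 3.5, we have

Lemma 3.11. For $\mu$-a.e. $y \in \mathcal{P}(x) \cap \hat{\Gamma}$, we have
\[ \limsup_{n \to +\infty} \frac{\#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big)}{\#\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big)} \cdot e^{-5an\varepsilon} = 0, \]
\[ \limsup_{n \to +\infty} \frac{\#\hat{\mathcal{N}}^u\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big)}{\#\mathcal{N}^u\big(n, y, p^{-1}(B(y_0, e^{-n}))\big)} \cdot e^{-5an\varepsilon} = 0. \]
Proof. Let $y \in \hat{\Gamma}$ and $n \ge \hat{n}$. We have by c) of Sect. 3.1.3 and Lemma 3.5 that
\begin{align*}
e^{-(\delta^s+\varepsilon)n} &\le \mu_{y_0}^{\xi^s}\big(B^s(y_0, e^{-n})\big) \\
&= \mu_y^{\xi^s}\big(p^{-1}(B^s(y_0, e^{-n}))\big) \\
&\le 8C \cdot \mu_y^{\xi^s}\big(p^{-1}(B^s(y_0, e^{-n})) \cap \mathcal{P}(y) \cap \Gamma\big) \\
&\le 8C \cdot \mu_y^{\xi^s}\big(p^{-1}(B(y_0, e^{-n})) \cap \mathcal{P}(y) \cap \Gamma\big).
\end{align*}
From this, we obtain that
\begin{align*}
\#\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big) &\ge \frac{\mu_y^{\xi^s}\big(p^{-1}(B(y_0, e^{-n})) \cap \mathcal{P}(y) \cap \Gamma\big)}{\max\{\mu_y^{\xi^s}(P_{an}^{an}(z)) : z \in \xi^s(y) \cap \mathcal{P}(y) \cap \Gamma\}} \\
&\ge \frac{1}{8C} \cdot \frac{e^{-n(\delta^s+\varepsilon)}}{e^{-an(h-\varepsilon)}} \\
&\ge \frac{1}{8C} \cdot e^{-n(\delta^s - ah + 2a\varepsilon)}.
\end{align*}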
Next, for each $k \ge 1$, consider the set
\[ F_k := \Big\{ y \in \mathcal{P}(x) \cap \hat{\Gamma} : \limsup_{n \to +\infty} \frac{\#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big)}{\#\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big)} \cdot e^{-5an\varepsilon} \ge \frac{1}{k} \Big\}. \]
For each $y \in F_k$, there exists an increasing sequence $\{m_j(y)\}_{j=1}^{\infty}$ of positive integers such that $n = m_j(y)$ satisfies
\[ \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(y_0, e^{-n})) \cap \Gamma\big) \ge \frac{1}{2k}\, \#\mathcal{N}^s\big(n, y, p^{-1}(B(y_0, e^{-n}))\big) \cdot e^{5an\varepsilon} \ge \frac{1}{16kC}\, e^{-n(\delta^s - ah - 3a\varepsilon)}. \tag{3.7} \]
Suppose $F_k$ has positive $\mu$ measure for some $k$. Then $\mu(p(F_k)) \ge \mu(F_k) > 0$. Let $F_k' \subset F_k$ be the set of points $y \in F_k$ for which there exists the limit
\[ \lim_{\rho \to 0} \frac{\log \mu_{y_0}^{\xi^s}(B^s(y_0, \rho))}{\log \rho} = \delta^s. \]
Clearly $\mu(F_k') = \mu(F_k) > 0$. Then we can find $y \in F_k'$ such that
\[ \mu_y^{\xi^s}(F_k) = \mu_y^{\xi^s}(F_k') = \mu_y^{\xi^s}\big(F_k' \cap \mathcal{P}(y) \cap \xi^s(y)\big) > 0. \]
Hence $\mu_{y_0}^{\xi^s}(p(F_k')) > 0$. So it follows from Frostman's lemma that
\[ \dim_H\big(p(F_k') \cap \xi^s(y_0)\big) \ge \delta^s. \tag{3.8} \]
Consider the collection of balls
\[ \mathcal{D} := \{B(z_0, e^{-m_j(z)}) : z \in F_k' \cap \xi^s(y),\ j = 1, 2, \ldots\}. \]
By the Besicovitch covering lemma, one can find a countable subcover $\mathcal{D}' \subset \mathcal{D}$ of $p(F_k') \cap \xi^s(y_0)$ of arbitrarily small diameter and finite multiplicity $q$. This means that for any $L \ge \hat{n}$ one can choose a sequence of points $\{z^i \in F_k' \cap \xi^s(y)\}_{i=1}^{\infty}$ and a sequence of integers $\{t_i\}_{i=1}^{\infty}$, where $t_i \in \{m_j(z^i)\}_{j=1}^{\infty}$ and $t_i \ge L$ for each $i$, such that the collection of balls
\[ \mathcal{D}' = \{B(p(z^i), 4e^{-t_i}) : i = 1, 2, \ldots\} \]
comprises a cover of $p(F_k') \cap \xi^s(y_0)$ whose multiplicity does not exceed $q$. Write $B_i = B(p(z^i), 4e^{-t_i})$. The Hausdorff sum corresponding to this cover is
\[ \sum_{B \in \mathcal{D}'} (\operatorname{diam} B)^{\delta^s - \varepsilon} = 8^{\delta^s - \varepsilon} \sum_{i=1}^{\infty} e^{-t_i(\delta^s - \varepsilon)}. \]
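Noting that $a > 1$ (see d) of Sect. 3.1.3), we have by (3.7) that
\begin{align*}
\sum_{i=1}^{\infty} e^{-t_i(\delta^s - \varepsilon)} &\le \sum_{i=1}^{\infty} \#\hat{\mathcal{N}}^s\big(t_i, p^{-1}(B_i) \cap \Gamma\big) \cdot 16kC \cdot e^{-a t_i(h+2\varepsilon)} \\
&\le 16kC \sum_{l=\hat{n}}^{\infty} e^{-al(h+2\varepsilon)} \cdot \sum_{i:\, t_i = l} \#\hat{\mathcal{N}}^s\big(t_i, p^{-1}(B_i) \cap \Gamma\big) \\
&\le 16kCq \sum_{l=\hat{n}}^{\infty} e^{-al(h+2\varepsilon)} \cdot \#\hat{\mathcal{N}}^s(l, \mathcal{P}(x)) \\
&\le 16kC^2 q \sum_{l=\hat{n}}^{\infty} e^{-al(h+2\varepsilon)} \cdot e^{al(h+\varepsilon)} \\
&\le 16kC^2 q \sum_{l=\hat{n}}^{\infty} e^{-al\varepsilon} < \infty.
\end{align*}
It follows that $\dim_H\big(p(F_k') \cap \xi^s(y_0)\big) \le \delta^s - \varepsilon < \delta^s$, which contradicts (3.8). This proves the first equation of the lemma. The second equation can be obtained similarly using the existence of $\delta^u$ and Lemma 3.5.

Proof of Theorem 2.1. We first consider the case where $\mu$ is ergodic. By Lemma 3.11, there exists a set $\Gamma^{\varepsilon} \subset \hat{\Gamma}$ with $\mu(\Gamma^{\varepsilon}) > 1 - 30\varepsilon$ and $n_{\varepsilon} \in \mathbb{N}$ such that $\forall\, x \in \Gamma^{\varepsilon}$, $n > n_{\varepsilon}$,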
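\[ \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big) \le \#\mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n}))\big) \cdot e^{5an\varepsilon}, \tag{3.9} \]
\[ \#\hat{\mathcal{N}}^u\big(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\big) \le \#\mathcal{N}^u\big(n, x, p^{-1}(B(x_0, e^{-n}))\big) \cdot e^{5an\varepsilon}. \tag{3.10} \]
Fix $x \in \Gamma^{\varepsilon}$ and an integer $n \ge n_{\varepsilon}$. We first show
\[ \mu_{x_0}^{\xi^s}\big(B^s(x_0, e^{-n})\big) \cdot \mu_x^{\xi^u}\big(B^u(x, e^{-n})\big) \le \mu\big(B(x_0, 3e^{-n})\big) \cdot 8C^6 e^{7an\varepsilon}. \tag{3.11} \]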
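Clearly, for each rectangle $R$ in $\mathcal{R}_n$ which intersects $p^{-1}(B(x_0, 2e^{-n})) \cap \Gamma$, we have $R \subset p^{-1}(B(x_0, 3e^{-n}))$. Therefore,
\[ \mu\big(p^{-1}(B(x_0, 3e^{-n})) \cap \mathcal{P}(x)\big) \ge \sum_{R \in \mathcal{N}(n, p^{-1}(B(x_0, 2e^{-n})) \cap \Gamma)} \mu(R) \ge \#\mathcal{N}\big(n, p^{-1}(B(x_0, 2e^{-n})) \cap \Gamma\big) \cdot C^{-1} e^{-2an(h+\varepsilon)}. \tag{3.12} \]
On the other hand, we see that
\begin{align*}
\#\mathcal{N}\big(n, p^{-1}(B(x_0, 2e^{-n})) \cap \Gamma\big) &\ge \sum_{P_{an}^{0}(z) \in \hat{\mathcal{N}}^s(n, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma})} \#\mathcal{N}^u\big(n, z, P_{an}^{0}(z) \cap p^{-1}(B(z_0, e^{-n}))\big) \\
&\ge \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma}\big) \cdot \inf\big\{\#\mathcal{N}^u\big(n, z, P_{an}^{0}(z) \cap p^{-1}(B(z_0, e^{-n}))\big) : z \in p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma}\big\}. \tag{3.13}
\end{align*}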
Now we estimate $\#\mathcal{N}^u\big(n, z, P_{an}^{0}(z) \cap p^{-1}(B(z_0, e^{-n}))\big)$ for $z \in \hat{\Gamma}$. By f) of Sect. 3.1.3, we have
\[ \xi^u(z) \cap P_{an}^{0}(z) \cap B^u(z, e^{-n}) \subset \xi^u(z) \cap P_{an}^{0}(z) \cap p^{-1}(B(z_0, e^{-n})). \]
Therefore by (3.5), we have
\begin{align*}
\#\mathcal{N}^u\big(n, z, P_{an}^{0}(z) \cap p^{-1}(B(z_0, e^{-n}))\big) &\ge \frac{\mu_z^{\xi^u}\big(P_{an}^{0}(z) \cap B^u(z, e^{-n}) \cap \Gamma\big)}{\max\{\mu_z^{\xi^u}(R) : R \in \mathcal{N}^u(n, z, P_{an}^{0}(z) \cap p^{-1}(B(z_0, e^{-n})))\}} \\
&\ge e^{-n(\delta^u+\varepsilon)} \cdot C^{-1} e^{an(h-\varepsilon)} \\
&\ge C^{-1} e^{an(h-2\varepsilon)} \cdot \mu_x^{\xi^u}\big(B^u(x, e^{-n})\big), \tag{3.14}
\end{align*}
where the last inequality holds by c) of the choice of $x$ in Sect. 3.1.3 using the fact that $a \ge 2$ (cf. d) there).

As to $\#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma}\big)$, we can follow the same line as in the proof of Lemma 3.9 to show
\[ \#\hat{\mathcal{N}}^s\big(n, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma}\big) \ge \#\mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma}\big) \cdot C^{-2} e^{-2an\varepsilon}. \tag{3.15} \]
Furthermore, we have by Lemma 3.5 and b) of Sect. 3.1.3 that
\begin{align*}
\#\mathcal{N}^s\big(n, x, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma}\big) &\ge \frac{\mu_x^{\xi^s}\big(p^{-1}(B(x_0, e^{-n})) \cap \mathcal{P}(x) \cap \hat{\Gamma}\big)}{\max\{\mu_x^{\xi^s}(R) : R \in \mathcal{N}^s(n, x, p^{-1}(B(x_0, e^{-n})) \cap \hat{\Gamma})\}} \\
&\ge \mu_x^{\xi^s}\big(p^{-1}(B(x_0, e^{-n})) \cap \mathcal{P}(x) \cap \hat{\Gamma}\big) \cdot C^{-1} e^{an(h-\varepsilon)} \\
&\ge \frac{1}{8C^2}\, e^{an(h-\varepsilon)} \cdot \mu_{x_0}^{\xi^s}\big(B^s(x_0, e^{-n})\big), \tag{3.16}
\end{align*}
where the last inequality holds if we pick $x$ such that Lemma 3.5 holds for $\hat{\Gamma}$. Putting the inequalities (3.12), (3.13), (3.14), (3.15), and (3.16) together, we obtain (3.11).

Now we have by Lemmas 3.6, 3.7 and 3.8 and inequalities (3.9) and (3.10) that for $x \in \Gamma^{\varepsilon}$ and $n \ge \hat{n}$,
\[ \mu\big(B(x_0, e^{-n})\big) \le \mu_{x_0}^{\xi^s}\big(B^s(x_0, 4e^{-n})\big) \cdot \mu_x^{\xi^u}\big(B^u(x, 4e^{-n})\big) \cdot 8C^5 e^{14an\varepsilon}. \]
This together with (3.11) proves i) of Theorem 2.1. The equality in ii) of Theorem 2.1 follows immediately from the inequalities of i).

As to the case when $\mu$ is not ergodic, one can pick the set of points $\Gamma$ of $(M, \theta)$ of measure $\ge 1 - \varepsilon^4$ and, for every $x \in \Gamma$, a number $h(x_0)$ such that a)–f) of Sect. 3.1.3 hold if $h$ is replaced by $h(x_0)$, $\delta^s$ by $\delta^s(x_0)$ and $\delta^u$ by $\delta^u(x_0)$. Then fix $\iota > 0$ and consider the sets
\[ \Gamma(x) = \{y \in M : |h(x) - h(y)| < \iota,\ |\delta^s(x) - \delta^s(y)| < \iota,\ |\delta^u(x) - \delta^u(y)| < \iota\}. \]
The collection of these sets covers $p(\Gamma)$. Moreover, there exists a countable subcollection $\{\Gamma^i\}_{i \in \mathbb{N}}$ which still covers $p(\Gamma)$. Let $\mu_i$ be the conditional measure generated by $\mu$ on $\Gamma^i$. Following the above argument for $p^{-1}(\Gamma^i) \cap \Gamma$ and $\mu_i$, we can show for almost every $x \in \Gamma^i$,
\[ \underline{d}(\mu_i, x) \ge \delta^s(x) + \delta^u(x) - c\iota, \qquad \overline{d}(\mu_i, x) \le \delta^s(x) + \delta^u(x) + c\iota, \]
where $\underline{d}(\mu_i, x)$ and $\overline{d}(\mu_i, x)$ are the lower and upper pointwise dimensions of the measure $\mu_i$ and $c$ does not depend on $x$ or $\iota$. Letting $\iota$ go to zero yields that for $\mu$-a.e. $x \in M$, $d(\mu, x) = \delta^s(x) + \delta^u(x)$.

Proof of Theorem 2.2. It is an immediate consequence of Theorem 2.1 using (1.3) for endomorphisms (cf. [25]) and (1.9).

4. Volume Lemma and Lyapunov Dimension of Measures

When an $f$-ergodic measure $\mu$ is not hyperbolic, let $\delta^c$ denote the multiplicity of its zero Lyapunov exponent. To see Theorem 2.3, i.e., the relation between the local dimension of $\mu$ and its Lyapunov dimension, we first have

Lemma 4.1. Let $f$ be a $C^2$ non-invertible but non-degenerate endomorphism on $M$ preserving an $f$-ergodic Borel probability measure $\mu$. Then
\[ d(\mu, x) \le \delta^s + \delta^c + \delta^u, \quad \text{for } \mu\text{-a.e. } x. \]
Proof. Let $0 < \varepsilon < 1$ be given sufficiently small. Let $\mathcal{P}$ and $\Gamma$ be as obtained in Sect. 3.1.3 such that for $x \in \Gamma$, all the properties there hold except for $p(P_{an}^{an}(x)) \subset B(x_0, e^{-n})$ for $n \ge n_0$.

Put $\eta^u = \xi^u \vee \mathcal{P}^+$ and let $\{\mu_x^{\eta^u}\}$ be a system of conditional measures associated with $\eta^u$. Then by Lemma 12.4.1 and Lemma 12.1.2 of [12], we have that at $\mu$-a.e. $x$,
\[ \lim_{n \to \infty} -\frac{1}{n} \log \mu_x^{\eta^u}(P_0^n(x)) = h_\mu(\theta, \mathcal{P}) \quad \text{and} \quad \limsup_{\rho \to 0} \frac{\log \mu_x^{\eta^u}(B^u(x, \rho))}{\log \rho} \le \delta^u. \]
Hence we may assume that for a point $x \in \Gamma$ and $n \ge n_0$,
\[ \mu_x^{\eta^u}(P_0^n(x)) \le e^{-n(h-\varepsilon)} \tag{4.1} \]
and
\[ \mu_x^{\eta^u}(B^u(x, e^{-n})) \ge e^{-n(\delta^u+\varepsilon)}. \tag{4.2} \]
Furthermore, by using the properties of the Lyapunov metric in $M$, we can argue as in [12] to choose a sequence of partitions $\{\mathcal{Q}_n\}_{n \ge n_0}$ refining $\mathcal{P}$ such that

g) For $n \ge n_0$, we have $\mathcal{Q}_n > P_{an}^{an}$ and for $x \in \Gamma$,
i) $\operatorname{diam} p(\mathcal{Q}_n(x)) \le 2e^{-n}$,
ii) $\mu(\mathcal{Q}_n(x)) \ge C^{-1} e^{-n\varepsilon} \cdot e^{-n\delta^c} \cdot \mu(P_{an}^{an}(x))$.

Then we proceed to pick out the "density points" as in Lemma 3.5. Let $\hat{\Gamma} \subset \Gamma$ and $\hat{n}$ be there such that the lemma holds for $\Gamma$ above. Note that the projection map $p$ restricted on each element of $\xi^u$ is injective. Hence the same argument as in [25, Lemma 3.1] gives that for $\mu$-a.e. $x \in \Gamma$,
\[ \lim_{\rho \to 0} \frac{\mu_x^{\eta^u}(B^u(x, \rho) \cap \Gamma)}{\mu_x^{\eta^u}(B^u(x, \rho))} = 1. \]
So, we may assume that for $x \in \hat{\Gamma}$ and $n \ge \hat{n}$,
\[ \mu_x^{\eta^u}(B^u(x, e^{-n}) \cap \Gamma) \ge \frac{1}{2}\, \mu_x^{\eta^u}(B^u(x, e^{-n})). \]
The remaining steps are exactly as in [12]. Indeed, pick $x \in \hat{\Gamma}$ and set
\[ \delta = \limsup_{n \to \infty} -\frac{1}{n} \log \mu(B(x_0, 4e^{-n})). \]
There exist infinitely many $n \ge \max\{n_0, \hat{n}\}$ such that
\[ \mu(p^{-1} B(x_0, 4e^{-n})) = \mu(B(x_0, 4e^{-n})) \le e^{-n(\delta-\varepsilon)}. \]
Fix such $n$, assuming $16C^2 \le e^{n\varepsilon}$. Consider the number
\[ N = \#\{\text{atoms of } P_{an}^{an} : P_{an}^{an} \text{ intersecting } \Gamma \cap p^{-1} B(x_0, 2e^{-n})\}. \]
We have by b) of Sect. 3.1.3 and ii) of g) that
\[ N \le C e^{n\varepsilon} \cdot e^{n\delta^c} \cdot e^{2an(h+\varepsilon)} \cdot e^{-n(\delta-\varepsilon)}. \tag{4.3} \]
On the other hand, it is clear that
\[ N \ge \#\{\text{atoms of } P_{an}^{an} : P_{an}^{an} \text{ intersecting } \Gamma \cap p^{-1} B(x_0, e^{-n})\}. \tag{4.4} \]
Note that each atom of $P_{an}^{an}$ is an intersection of a unique pair from $P_{0}^{an}$ and $P_{an}^{0}$. For a lower bound in (4.4), we first estimate using c) and f) of Sect. 3.1.3 that
\[ \#\{\text{atoms of } P_{an}^{0} : P_{an}^{0} \text{ intersecting } \Gamma \cap p^{-1} B(x_0, e^{-n})\} \ge \frac{1}{8C}\, e^{-n(\delta^s+\varepsilon)} \cdot e^{an(h-\varepsilon)}. \]
Fix one of these atoms $P_u$ (of $P_{an}^{0}$) and choose $y \in P_u \cap \Gamma \cap p^{-1} B(x_0, e^{-n})$. Then for any $z \in P_u \cap \Gamma \cap p^{-1} B(x_0, e^{-n})$, we have by (4.1) that
\[ \mu_y^{\eta^u}(P_0^{an}(z)) = \mu_z^{\eta^u}(P_0^{an}(z)) \le e^{-an(h-\varepsilon)}. \]
Denote by $n(X)$ the number of atoms of $P_0^{an}$ intersecting the set $X \cap \Gamma \cap p^{-1} B(x_0, e^{-n})$, then we have by c) and f) of Sect. 3.1.3 and (4.2) that
\[ n(P_0^{an}(y)) \ge \frac{1}{2}\, e^{-n(\delta^u+\varepsilon)} \cdot e^{an(h-\varepsilon)}. \]
Therefore, we have
\[ N \ge \sum_{\{P_u :\, P_u \cap \Gamma \cap p^{-1} B(x_0, e^{-n}) \ne \emptyset\}} n(P_u) \ge \frac{1}{16C}\, e^{-n(\delta^s+\delta^u+2\varepsilon)} \cdot e^{2an(h-\varepsilon)}. \]
Comparing this with (4.3) gives δ ≤ δ u + δ c + δ s + (5 + 4a)ε. The conclusion follows since ε > 0 is arbitrary.
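For readers who wish to trace the bookkeeping behind this last comparison, the step can be spelled out as follows. This is only a sketch, assuming the upper bound (4.3), the lower bound for $N$ obtained just above, and the standing choice $16C^2 \le e^{n\varepsilon}$; it introduces no notation beyond what the proof already uses.

```latex
\begin{align*}
\frac{1}{16C}\, e^{-n(\delta^s+\delta^u+2\varepsilon)}\, e^{2an(h-\varepsilon)}
  &\le N \le C\, e^{n\varepsilon}\, e^{n\delta^c}\, e^{2an(h+\varepsilon)}\, e^{-n(\delta-\varepsilon)}, \\
e^{n\delta}
  &\le 16C^2\, e^{n(\delta^s+\delta^c+\delta^u)}\, e^{4n\varepsilon}\, e^{4an\varepsilon}
   \le e^{n(\delta^s+\delta^c+\delta^u)}\, e^{(5+4a)n\varepsilon}.
\end{align*}
```

Taking logarithms and dividing by $n$ gives the stated bound on $\delta$; the factor $5$ collects $2\varepsilon$ from the lower bound, $2\varepsilon$ from (4.3) and one further $\varepsilon$ from absorbing $16C^2$.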
We remark that for a general invariant probability measure $\mu$, a slight modification of the above proof as in [12] (by dividing $M$ into countably many invariant sets on each of which the relevant functions are more or less constant) will give
\[ d(\mu, x) \le \delta^s(x) + \delta^c(x) + \delta^u(x), \quad \mu\text{-a.e.}, \]
where $\delta^c(x)$ is the multiplicity of the zero Lyapunov exponent at $x$. The proof is omitted since we will not use this general formula.

Let $\mu$ be an $f$-ergodic probability measure on $M$. It is true by the entropy theories of [25,31] that there are partial dimensions $\{\gamma_i\}_{i=1}^{r}$ such that the following properties hold:
i) $0 \le \gamma_i \le m_i$ for $i = 1, 2, \ldots, r$,
ii) $\gamma_i = m_i$ if $\lambda_i = 0$,
iii) $\delta^s = \sum_{\lambda_i < 0} \gamma_i$,
iv) $\sum_{i=1}^{r} \lambda_i \gamma_i = F_\mu(f) := F_\mu$.

Proposition 4.2. Let $\mu$ be as in Lemma 4.1 with partial dimensions $\{\gamma_i\}_{i=1}^{r}$. Then
\[ \sum_{i=1}^{r} \gamma_i \le \dim_L(\mu) \tag{4.5} \]
with equality attained if and only if the partial dimensions $\{\gamma_j\}$ satisfy
\[ \text{Condition I: } \exists\, j_c \text{ such that } \begin{cases} \gamma_j = m_j, & \text{if } j < j_c; \\ 0 < \gamma_j \le m_j, & \text{if } j = j_c; \\ \gamma_j = 0, & \text{if } j > j_c. \end{cases} \]
Proof. We first show the inequality (4.5). If $\sum_{\lambda_i > 0} \lambda_i m_i \le F_\mu$, we have by iv) above that
\[ F_\mu = \sum_{i=1}^{r} \lambda_i \gamma_i \le \sum_{\lambda_i > 0} \lambda_i m_i \le F_\mu, \]
which immediately implies $\gamma_i = m_i$ for $\lambda_i > 0$ and $\gamma_i = 0$ for $\lambda_i < 0$ and hence the inequality (4.5). Next, let $K$ be the largest integer such that $\sum_{i=1}^{K} \lambda_i m_i > F_\mu$. If $\sum_{i=1}^{K} m_i = \dim M$, then (4.5) holds trivially. Otherwise, we have $\lambda_{K+1} < 0$. Clearly, we have by i) that
\[ \sum_{i=1}^{r} (\lambda_i - \lambda_{K+1}) \gamma_i \le \sum_{i=1}^{K} (\lambda_i - \lambda_{K+1}) m_i. \tag{4.6} \]
By iv), we have
\[ -\lambda_{K+1} \sum_{i=1}^{r} \gamma_i \le -\lambda_{K+1} \sum_{i=1}^{K} m_i + \sum_{i=1}^{K} \lambda_i m_i - F_\mu. \]
Dividing each side of it by $-\lambda_{K+1}$ gives
\[ \sum_{i=1}^{r} \gamma_i \le \sum_{i=1}^{K} m_i + \frac{\sum_{i=1}^{K} \lambda_i m_i - F_\mu}{-\lambda_{K+1}} = \dim_L(\mu). \]
Suppose Condition I holds. If $j_c$ is such that $\lambda_{j_c} = 0$, then $\sum_i \gamma_i = \dim_L(\mu)$ holds by the first case of the definition of $\dim_L(\mu)$. Otherwise, we have by iv) that
\[ \gamma_{j_c} = \frac{\sum_{j=1}^{j_c - 1} \lambda_j m_j - F_\mu}{-\lambda_{j_c}}, \]
and hence $\sum_i \gamma_i = \dim_L(\mu)$ with $K = j_c - 1$ in the definition of $\dim_L(\mu)$.

Conversely, if the first case in the definition of $\dim_L(\mu)$ happens, then Condition I holds by the argument in the first paragraph. Otherwise, we have by (4.6) that the equality in (4.5) implies $\gamma_j = 0$ for $j > K + 1$ and $\gamma_j = m_j$ for $j < K + 1$. Then we have by iv) that
\[ -\lambda_{K+1} \gamma_{K+1} = \sum_{j=1}^{K} \lambda_j \gamma_j - F_\mu, \]
which implies $\gamma_{K+1} > 0$ by our choice of $K$ in the definition of $\dim_L(\mu)$.
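As a numerical illustration of Proposition 4.2, the following Python sketch evaluates the second-case formula for $\dim_L(\mu)$ that appears in the proof, together with the partial dimensions singled out by Condition I. The function names, the list-based inputs (distinct exponents in strictly decreasing order with their multiplicities) and the sample values are assumptions made purely for illustration. The case in which no admissible $K$ exists (the first case of the definition, stated earlier in the paper) is deliberately not handled, and returning $\dim M$ when $K = r$ is likewise an assumption suggested by the remark that (4.5) is then trivial.

```python
from itertools import accumulate


def lyapunov_dimension(lams, mults, F):
    """Return (K, dim_L) from the second-case formula in the proof of Prop. 4.2.

    lams  -- distinct Lyapunov exponents, strictly decreasing (assumed input)
    mults -- their multiplicities m_i
    F     -- the value F_mu appearing in property iv)
    """
    # partial[k-1] = sum_{i<=k} lambda_i * m_i
    partial = list(accumulate(l * m for l, m in zip(lams, mults)))
    admissible = [k for k, s in enumerate(partial, start=1) if s > F]
    if not admissible:
        # First case of the definition of dim_L; not covered by this sketch.
        raise ValueError("no K with sum_{i<=K} lambda_i m_i > F_mu")
    K = max(admissible)  # largest K with partial sum above F_mu
    if K == len(lams):
        # Assumption: all exponents are used, so the dimension is dim M.
        return K, float(sum(mults))
    # lams[K] is lambda_{K+1} because the list is 0-based.
    frac = (partial[K - 1] - F) / (-lams[K])
    return K, sum(mults[:K]) + frac


def condition_one_gammas(lams, mults, F):
    """Partial dimensions of Condition I with j_c = K + 1 (illustrative helper)."""
    K, dim_L = lyapunov_dimension(lams, mults, F)
    gammas = [float(m) for m in mults[:K]]       # gamma_j = m_j for j < j_c
    if K < len(lams):
        gammas.append(dim_L - sum(mults[:K]))    # 0 < gamma_{j_c} <= m_{j_c}
        gammas.extend(0.0 for _ in lams[K + 1:])  # gamma_j = 0 for j > j_c
    return gammas


if __name__ == "__main__":
    lams, mults, F = [0.5, -0.25, -1.0], [1, 1, 1], 0.0
    print(lyapunov_dimension(lams, mults, F))
    print(condition_one_gammas(lams, mults, F))
```

With the sample values above, the partial dimensions returned by `condition_one_gammas` sum to the value returned by `lyapunov_dimension` and satisfy property iv), which is the equality case of (4.5).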
As a consequence of Lemma 4.1 and Proposition 4.2, we have Theorem 2.3. Moreover, for $\mu$ as there, we have
\[ d(\mu, x) \le \dim_L(\mu), \quad \mu\text{-a.e.}, \]
where the equality holds only if $\mu$ is SRB. (Here we note that $\gamma_j = m_j$ for $j$ with $\lambda_j > 0$ implies the SRB property (cf. [25,28]).) Furthermore, in case $\mu$ has no zero Lyapunov exponent, we have $\dim \mu = \dim_L(\mu)$ if and only if Condition I holds.

5. Dimension Formula for Random Endomorphisms

In this section, we show that in a random setting the dimension of a hyperbolic ergodic measure coincides with its Lyapunov dimension. The corresponding dimension theories, which can be obtained in complete parallel with the deterministic case, will be mentioned without proof for conciseness.
5.1. The proofs of Theorem 2.4 and Theorem 2.6. We begin with some preparations concerning the structure of local stable manifolds and dimension properties of sample measures. The notations in the second part of Sect. 2 will be retained.

5.1.1. Properties of local stable manifolds. For $j$ with $\lambda_j < 0$, put $E^j(w, x) = V^{(j)}(w, x)$ and let $F^j(w, x) = E^j(w, x)^{\perp}$ be the orthogonal complement of $E^j(w, x)$ in $T_x M$. Given $\varepsilon > 0$, there exist positive constants $C_0, \alpha, D_0, \beta, E_0, \delta_0, \delta_1$ and a measurable set $\Lambda = \Lambda(C_0, \alpha, D_0, \beta, E_0, \delta_0, \delta_1) \subset \Omega \times M$ such that the following five properties hold (cf. [13]):

i) $\Lambda$ depends only on $x$ and $w_n$, $n \ge 0$, and $\mu^*(\Lambda) \ge 1 - \varepsilon$.

ii) Let $j$ be such that $\lambda_j < 0$. For $(w, x) \in \Lambda$ and $n \ge 0$,
\[ v \in E^j(w, x) \Rightarrow |T_x f_w^n v| \le C_0 e^{n(\lambda_j+\varepsilon)} |v|, \]
\[ v \in F^j(w, x) \Rightarrow |T_x f_w^n v| \ge C_0^{-1} e^{n(\lambda_{j-1}-\varepsilon)} |v|. \]

iii) Let $j$ be such that $\lambda_j < 0$. For each $(w, x) \in \Lambda$, there is a $C^1$ embedded connected $\sum_{i \ge j} m_i$ dimensional disk $W_\alpha^{s,j}(w, x)$ such that
a) $W_\alpha^{s,j}(w, x) = \{y \in V^{s,j}(w, x) : d^{s,j}_{(w,x)}(y, x) \le \alpha\}$.
b) $\exp_x^{-1} W_\alpha^{s,j}(w, x)$ is part of the graph of a function $g_{w,x} : E^j(w, x) \to F^j(w, x)$ satisfying 1) $g_{w,x} 0 = 0$, 2) $T_0 g_{w,x} = 0$, 3) $|T g_{w,x}| \le 1/1000$, 4) $\operatorname{Lip}(T g_{w,x}) \le C_0$.
c) If $z_1, z_2 \in W_\alpha^{s,j}(w, x)$, then $d^{s,j}_{\theta^n(w,x)}(f_w^n z_1, f_w^n z_2) \le C_0 e^{n(\lambda_j+\varepsilon)} d^{s,j}_{(w,x)}(z_1, z_2)$ for all $n \ge 0$.

iv) Let $\Lambda_w = \{x \in M : (w, x) \in \Lambda\}$. Then for each $w$ with $\Lambda_w$ non-empty, the map
\[ x \mapsto E^j(w, x) \]
is locally Hölder continuous on the set $W_\alpha^{s,j}(\Lambda_w) = \cup_{x \in \Lambda_w} W_\alpha^{s,j}(w, x)$ with exponent $\beta$, i.e., for all $z_1, z_2 \in W_\alpha^{s,j}(\Lambda_w)$ with $d(z_1, z_2) \le \delta_0$,
\[ d(E^j(w, z_1), E^j(w, z_2)) \le D_0\, d(z_1, z_2)^{\beta}. \]

v) Let $w$ be such that $\Lambda_w$ is non-empty. For $x \in \Lambda_w$, let $T_1$ and $T_2$ be $\exp_x$ images of small disks parallel to $F^j(w, x)$ and at a distance smaller than $\delta_1$ from $F^j(w, x)$. Then the map $\psi$ from $T_1 \cap W_\alpha^{s,j}(\Lambda_w \cap B(x, \delta_0))$ to $T_2$ by sliding along $W_\alpha^{s,j}$-leaves is absolutely continuous with $|\operatorname{Jac}(\psi)| \le E_0$.

For explicit definitions and proofs of Hölder continuity of subbundles and the absolute continuity of the map $\psi$, we refer the readers to [14] and [7].
5.1.2. Dimension properties of sample measures. Let $\chi(M, \nu; \mu)$ be as in Sect. 2. Parallel to the entropy theories of [25,31] and Lemma 4.1, we obtain partial dimensions $\{\gamma_i\}_{i=1}^{r}$ for sample measures $\{\mu_w\}_{w \in \Omega}$ such that
i) $0 \le \gamma_i \le m_i$ for $i = 1, 2, \ldots, r$,
ii) $\gamma_i = m_i$ if $\lambda_i = 0$,
iii) $\sum_{i=1}^{r} \lambda_i \gamma_i = F_\mu(\chi) := F_\mu$,
iv) $\limsup_{\rho \to 0} \log \mu_w(B(x, \rho)) / \log \rho \le \sum_{i=1}^{r} \gamma_i$ for $\mu^*$-a.e. $(w, x)$.
v) If $\lambda_j < 0$, then $\sum_{i \ge j} \gamma_i$ is the dimension of the conditional measure of $\mu_w$ on (a measurable partition subordinate to) $W^{s,j}$ manifolds.

We may assume the integer $K$ in the definition of $\dim_L(\mu)$ exists. Otherwise, as in the proof of Proposition 4.2, we obtain $\gamma_i = m_i$ for $\lambda_i > 0$ and $\gamma_i = 0$ for $\lambda_i < 0$. Hence if $\mu$ is hyperbolic, we have by a parallel result of Theorem 2.1 in our random setting that
\[ \dim \mu_w = \sum_{i=1}^{r} \gamma_i = \dim_L(\mu), \quad \nu^{\mathbb{Z}}\text{-a.e.} \]
Let Condition I be as proposed in Proposition 4.2. It is also a straightforward corollary of Lemma 4.1 and Proposition 4.2 in the random setting that

Lemma 5.1. Let $\chi(M, \nu; \mu)$ be as in Sect. 2. Then there is $\sigma$ such that for $\mu$-a.e. $x$,
\[ \sigma := \lim_{\rho \to 0} \frac{\log \mu_w(B(x, \rho))}{\log \rho} \le \dim_L(\mu), \]
If $\lambda_j \ne 0$ for all $j$, then $\dim(\mu_w) = \dim_L(\mu)$, $\nu^{\mathbb{Z}}$-a.e. if and only if Condition I holds.

5.1.3. Proof of the main results. The idea to show Theorem 2.4 is, as in [13], to introduce a notion of transversal dimension of $\mu_w$ with respect to $W^{s,j}$ for $j = L, L+1$ and show that they have predominant contribution to the dimension of $\mu_w$. This will imply Condition I and hence the theorem by Lemma 5.1.

Proof of Theorem 2.4. Let $j = L$ or $L+1$. For $\mu$-a.e. $x$, let $\rho_x^j$ be the density with respect to Lebesgue of the distribution of $w \mapsto E^j(w, x)$ in the space of $\sum_{i \ge j} m_i$-dimensional planes in $T_x M$. Let $\xi > 0$ be arbitrarily small. Choose $E$ and $r_0 > 0$ with $r_0 \le \alpha/100, \delta_0$ so that
\[ \Sigma := \{(w, x) \in \Lambda : \rho_x^j \le E \text{ and } \mu_w(B(x, \bar{r})) \le E \bar{r}^{\sigma-\xi},\ \forall\, \bar{r} \le r_0\} \]
has positive $\mu^*$ measure. For $w \in \Omega$, let $\Sigma_w = \{x \in M : (w, x) \in \Sigma\}$ be the $w$-section of $\Sigma$. Let $\pi^j_{(w,x)}$ be the projection along $W_\alpha^{s,j}$ into $\exp_x F^j(w, x)$. For $(w, x) \in \Sigma$ and $t \in (1/2, 1)$ to be specified later, define
\[ \Sigma_w^j(x, \bar{r}, t) = \{y \in \Sigma_w : d(x, y) \le \bar{r}^{\,t} \text{ and } d(x, \pi^j_{(w,x)} y) \le \bar{r}\}. \]
For $(w, x) \in \Sigma$, we define the upper transversal dimension of $\mu_w$ with respect to $W^{s,j}$ as
\[ \overline{\dim}\, \mu_w(x, \Sigma_w^j) := \limsup_{\bar{r} \to 0} \frac{\log \mu_w(\Sigma_w^j(x, \bar{r}, t))}{\log \bar{r}}. \]
Then exactly the same argument as in [13] using the existence of $\sigma$ and $\{\gamma_i\}_{i \ge j}$ gives

Sublemma 5.2. For $\mu^*$-a.e. $(w, x) \in \Sigma$, we have
\[ \overline{\dim}\, \mu_w(x, \Sigma_w^j) \ge d_j + t(\sigma - d_j - \xi), \quad \text{if } \sigma - \xi > d_j; \]
\[ \overline{\dim}\, \mu_w(x, \Sigma_w^j) \ge \sigma - \xi, \quad \text{if } \sigma - \xi < d_j, \]
where we denote $d_j := \sum_{i < j} m_i$.

Sublemma 5.3. Let $t > 1 - \beta$. Then for a set of $(w, x)$ with positive measure in $\Sigma$,
\[ \overline{\dim}\, \mu_w(x, \Sigma_w^j) \le \sigma - (1 - t) \sum_{i \ge j} \gamma_i + 3\xi. \]
We show these two sublemmas imply Condition I and hence $\sigma = \dim_L(\mu)$. Firstly, we can exclude the case
\[ \sum_{i=1}^{L-1} \lambda_i m_i > F_\mu \quad \text{and} \quad L = r + 1. \]
Suppose otherwise. Since $\mu \ll L_M$, we have that $\mu_w$ has absolutely continuous measure in the unstable direction $W^u$ (see [25] and [16]), so $m_i = \gamma_i$ for $i$ with $\lambda_i > 0$. Hence
\[ \sum_{i=1}^{r} \lambda_i \gamma_i \ge \sum_{i=1}^{L-1} \lambda_i m_i > F_\mu, \]
which is a contradiction to the fact that $\sum_{i=1}^{r} \lambda_i \gamma_i = F_\mu$. Next, we have by the above two sublemmas and the arbitrary choice of $\xi$ that
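\[ \begin{cases} \sigma \le \sum_{i < j} \gamma_i + t \sum_{i \ge j} \gamma_i, & \text{if } \sigma \le d_j; \\ d_j + t(\sigma - d_j) \le \sum_{i < j} \gamma_i + t \sum_{i \ge j} \gamma_i, & \text{if } \sigma > d_j. \end{cases} \tag{5.1} \]
Recall $L$ is such that $\sum_{i=1}^{L-1} \lambda_i m_i > F_\mu$ and $\sum_{i=1}^{L} \lambda_i m_i \le F_\mu$. From this, we deduce
\[ -\lambda_L m_L \ge \sum_{i=1}^{L-1} \lambda_i m_i - F_\mu, \]
and hence we have by Lemma 5.1 that
\[ \sigma \le \dim_L(\mu) \le \sum_{i=1}^{L} m_i = d_{L+1}. \]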
Apply (5.1) for j = L + 1 to give σ ≤ i 0 (measurable in w) such that for any x ∈ M, the map f w | B(x,ρw0 ) : B(x, ρw0 ) → M is a diffeomorphism to the −1 : f B(x, ρ 0 ) → B(x, ρ 0 ) denote the image which contains B( f w x, ρw1 ). Let f w,x w w w local inverse. Fix j with λ j < 0. Let ξ > 0 be an arbitrarily small constant. We choose δ and C > 0 with δ ≤ δ0 , δ1 , α/100. (See Sect. 5.1.1 for the definitions of δ0 , δ1 and α.) Let Γ = F −1 Λ ∩ {(w, x) ∈ Λ : i) |D f w | ≤ C,
ii) μw (B(x, r¯ )) ≤ C r¯ σ −ξ , r¯ ≤ δ, iii) (x, f w0 x) ∈ G ξ and Eξ (x, f w0 x) ≥ δ/C},
where G ξ and Eξ are given in Sect. 2. For δ sufficiently small and C sufficiently large, we may assume Γ has μ∗ positive measure. We may also assume δ < min(w,x)∈Γ {ρw0 0 , ρw1 0 }. Next, for (w, x) ∈ Γ , let (exp fw Tˆw,x = f w−1 0 ,x
0 (x)
F j (T (w, x))).
s, j Let πˆ w,x denote the projection along Wα leaves onto Tˆw,x . For r¯ > 0, define ˆ
ˆ
μT (w, x, r¯ ) := μw {y ∈ Γw : d(x, y) ≤ δ/C and d T (πˆ w,x , x) ≤ r¯ }, ˆ where d T denotes the distance on Tˆw,x . Let ˆ
τ j (w, x) = lim sup r¯ →0
log μT (w, x, r¯ ) . log r¯
Exactly the same argument as in [13] gives Sublemma 5.4. For μ∗ -a.e. (w, x) ∈ Γ ,
τ j (w, x) ≥ σ − 2ξ, if σ − 2ξ < d j ; τ j (w, x) ≥ d j ,
if σ − 2ξ ≥ d j .
Sublemma 5.5. For a set of $(w, x)$ of $\Gamma$ with positive measure, we have
\[ \sigma \ge \tau_j(w, x) + \sum_{i \ge j} \gamma_i - \xi. \]
We show these will imply Condition I and hence the theorem follows by Lemma 5.1. First, we have by i) of Hypothesis B and the properties of stationary measures stated in Sect. 2 that μ ≤ L M . Hence the same reasoning as in the proof of Theorem 2.4 yields r λ m ≤ F . that Σi=1 i i μ Starting from j = r + 1, let L be the first integer j < r such that σ > d j . Then σ ≤ d L+1 . Apply the above two sublemmas for the case j = L + 1. We have by the arbitrary choice of ξ that
τ j (w, x) ≥ σ ≥ τ j (w, x) +
γi ,
i≥L+1
which clearly implies that γi = 0 for i > L. Hence we may assume λ L < 0. Otherwise, γi = 0 for all i with λi < 0, which is impossible since λi 0
λi m i − Fμ > 0.
λi >0
Now, we apply Sublemmas 5.4 and 5.5 for the case j = L and conclude that σ ≥ τ j (w, x) +
γi ≥ d L + γ L .
i≥L r γ . This forces γ = m for i < L. Thus the γ ’s satisfy the Note that σ = Σi=1 i i i j requirement of Condition I and the theorem holds by Lemma 5.1.
5.2. An application of the results to stochastic flows. The model in this section is taken from Liu [16] (see also [2] and [13]). Consider a random perturbation model introduced in Baladi and Young [2]. Suppose that $f : M \to M$ is a $C^2$ map with no singularities. Consider the case that a particle $x \in M$ jumps to $f(x)$ and then performs a diffusion for the time $\varepsilon > 0$ (see also Kifer [8] for a systematic treatment of this set-up). More precisely, let $X_0, X_1, \ldots, X_d$ be $C^\infty$ vector fields on $M$, and consider the SDE of Stratonovich type
\[ d\xi_t = X_0(\xi_t)\, dt + \sum_{i=1}^{d} X_i(\xi_t) \circ dB_t^i, \tag{5.2} \]
where $\{B_t^1, \ldots, B_t^d\}_{t \ge 0}$ is a standard $d$-dimensional Brownian motion defined on a probability space $(W, \mathcal{F}, P)$. Realize the solution of this equation as a stochastic process
\[ \xi_t : (W, \mathcal{F}, P) \to \operatorname{Diff}^\infty(M) \]
which satisfies
i) $\xi_0 = \mathrm{id}$;
ii) for $t_0 < t_1 < \cdots < t_n$, the increments $\xi_{t_i} \circ \xi_{t_{i-1}}^{-1}$ are independent;
iii) for $s < t$, the distribution of $\xi_t \circ \xi_s^{-1}$ depends only on $t - s$;
iv) with probability 1 the stochastic flow $\xi_t$ has continuous sample paths.

(See Kunita [9] for more information.) Now consider the randomly perturbed process generated by compositions of random maps
\[ \cdots \circ f_{w_1} \circ f_{w_0} \circ f_{w_{-1}} \circ \cdots, \]
where $\ldots, w_1, w_0, w_{-1}, \ldots \in (W, P)$ are independent and $f_{w_i} = \xi_\varepsilon(w_i) \circ f$. The randomly perturbed process introduced above is just $\chi(M, \nu_\varepsilon)$, where $\nu_\varepsilon$ is the distribution on $C^2(M, M)$ induced by the map
\[ \Sigma : (W, \mathcal{F}, P) \to C^2(M, M), \quad w \mapsto \xi_\varepsilon(w) \circ f. \]
It was verified in [16] that the probability $\nu_\varepsilon$ satisfies
\[ \int \log^+ |g|_{C^2}\, \nu_\varepsilon(dg) < +\infty, \qquad \int \log D(g)\, \nu_\varepsilon(dg) > -\infty. \]
For $\varepsilon > 0$, the transition probabilities of $\chi$ are given by
\[ P_\varepsilon(x, A) = \nu_\varepsilon\{w : \xi_\varepsilon(w)(f(x)) \in A\}. \]
In the case when the SDE (5.2) is non-degenerate, i.e., $X_0, \ldots, X_d$ span the tangent space of $M$, the transition probabilities of $\chi$ have a density with respect to Lebesgue measure and hence a $\chi$-stationary measure $\mu$ satisfies $\mu \ll \mathrm{Leb}$. Furthermore, as was shown in [13], if the operator $L = -\tilde{X}_0 + \sum_{k=1}^{d} \tilde{X}_k^2$ on $C^\infty(\operatorname{Gr}(M))$ is hypoelliptic, where $\tilde{X}_k$ is the natural lifting of $X_k$ to $\operatorname{Gr}(M)$, $0 \le k \le d$, in particular if $d \ge \dim M + (\dim M)^2$, then there is an open and dense subset in the space of $(d+1)$-tuples of vector fields on $M$ on which Hypothesis A is satisfied. Hence Theorem 2.5 applies to this model.

Acknowledgements. The author is grateful to Professor Peidong Liu for introducing her to this field, for many discussions, and constant encouragement. This work was partially revised during the author's visit to CUHK. She would like to thank Professors Dejun Feng and Kasing Lau for hospitality and valuable comments.
References 1. Arnold, L.: Random Dynamical Systems. Berlin-Heidelberg New York: Springer-Verlag, 1998 2. Baladi, V., Young, L.-S.: On the spectra of randomly perturbed expanding maps. Commun. Math. Phys. 156, 355–385 (1993) 3. Barreira, L., Pesin, Y., Schmeling, J.: Dimension and product structure of hyperbolic measures. Ann. Math. 149, 755–783 (1999) 4. Eckmann, J.-P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57(3), 617–656 (1985) 5. Farmer, J., Ott, E., Yorke, J.: The dimension of chaotic attractors. Physica 7D, 153–180 (1983)
Commun. Math. Phys. 298, 101–138 (2010)
Digital Object Identifier (DOI) 10.1007/s00220-010-1010-2
Communications in Mathematical Physics

Mean-Field Dynamics: Singular Potentials and Rate of Convergence

Antti Knowles¹, Peter Pickl²
¹ Theoretische Physik, ETH Hönggerberg, CH-8093 Zürich, Switzerland. E-mail: [email protected]
² Mathematisches Institut, Universität München, Theresienstr. 39, 80333 München, Germany

Received: 24 July 2009 / Accepted: 7 December 2009
Published online: 19 February 2010 – © Springer-Verlag 2010
Abstract: We consider the time evolution of a system of $N$ identical bosons whose interaction potential is rescaled by $N^{-1}$. We choose the initial wave function to describe a condensate in which all particles are in the same one-particle state. It is well known that in the mean-field limit $N \to \infty$ the quantum $N$-body dynamics is governed by the nonlinear Hartree equation. Using a nonperturbative method, we extend previous results on the mean-field limit in two directions. First, we allow a large class of singular interaction potentials as well as strong, possibly time-dependent external potentials. Second, we derive bounds on the rate of convergence of the quantum $N$-body dynamics to the Hartree dynamics.

1. Introduction

We consider a system of $N$ identical bosons in $d$ dimensions, described by a wave function $\Psi_N \in \mathcal{H}^{(N)}$. Here $\mathcal{H}^{(N)} := L^2_+(\mathbb{R}^{Nd}, dx_1 \cdots dx_N)$ is the subspace of $L^2(\mathbb{R}^{Nd}, dx_1 \cdots dx_N)$ consisting of wave functions $\Psi_N(x_1, \dots, x_N)$ that are symmetric under permutation of their arguments $x_1, \dots, x_N \in \mathbb{R}^d$. The Hamiltonian is given by
\[
H_N = \sum_{i=1}^N h_i + \frac{1}{N} \sum_{1 \le i < j \le N} w(x_i - x_j), \tag{1.1}
\]
where $h_i$ denotes a one-particle Hamiltonian $h$ (to be specified later) acting on the coordinate $x_i$, and $w$ is an interaction potential. Note the mean-field scaling $1/N$ in front of the interaction potential, which ensures that the free and interacting parts of $H_N$ are of the same order.
The time evolution of $\Psi_N$ is governed by the $N$-body Schrödinger equation
\[
i\partial_t \Psi_N(t) = H_N \Psi_N(t), \qquad \Psi_N(0) = \Psi_{N,0}. \tag{1.2}
\]
For definiteness, let us consider factorized initial data $\Psi_{N,0} = \varphi_0^{\otimes N}$ for some $\varphi_0 \in L^2(\mathbb{R}^d)$ satisfying the normalization condition $\|\varphi_0\|_{L^2(\mathbb{R}^d)} = 1$. Clearly, because of the interaction between the particles, the factorization of the wave function is not preserved by the time evolution. However, it turns out that for large $N$ the interaction potential experienced by any single particle may be approximated by an effective mean-field potential, so that the wave function $\Psi_N(t)$ remains approximately factorized for all times. In other words we have that, in a sense to be made precise, $\Psi_N(t) \approx \varphi(t)^{\otimes N}$ for some appropriate $\varphi(t)$. A simple argument shows that in a product state $\varphi(t)^{\otimes N}$ the interaction potential experienced by a particle is approximately $w * |\varphi(t)|^2$, where $*$ denotes convolution. This implies that $\varphi(t)$ is a solution of the nonlinear Hartree equation
\[
i\partial_t \varphi(t) = h\varphi(t) + \bigl(w * |\varphi(t)|^2\bigr)\varphi(t), \qquad \varphi(0) = \varphi_0. \tag{1.3}
\]

Let us be a little more precise about what one means by $\Psi_N \approx \varphi^{\otimes N}$ (we omit the irrelevant time argument). One does not expect the $L^2$-distance $\|\Psi_N - \varphi^{\otimes N}\|_{L^2(\mathbb{R}^{Nd})}$ to become small as $N \to \infty$. A more useful, weaker, indicator of convergence should depend only on a finite, fixed$^1$ number, $k$, of particles. To this end we define the reduced $k$-particle density matrix
\[
\gamma_N^{(k)} := \operatorname{Tr}_{k+1,\dots,N} |\Psi_N\rangle\langle\Psi_N|,
\]
where $\operatorname{Tr}_{k+1,\dots,N}$ denotes the partial trace over the coordinates $x_{k+1}, \dots, x_N$, and $|\Psi_N\rangle\langle\Psi_N|$ denotes (in accordance with the usual Dirac notation) the orthogonal projector onto $\Psi_N$. In other words, $\gamma_N^{(k)}$ is the positive trace class operator on $L^2_+(\mathbb{R}^{kd}, dx_1 \cdots dx_k)$ with operator kernel
\[
\gamma_N^{(k)}(x_1, \dots, x_k; y_1, \dots, y_k) = \int dx_{k+1} \cdots dx_N \, \Psi_N(x_1, \dots, x_N)\, \overline{\Psi_N(y_1, \dots, y_k, x_{k+1}, \dots, x_N)}.
\]
The reduced $k$-particle density matrix $\gamma_N^{(k)}$ embodies all the information contained in the full $N$-particle wave function that pertains to at most $k$ particles. There are two commonly used indicators of the closeness $\gamma_N^{(k)} \approx (|\varphi\rangle\langle\varphi|)^{\otimes k}$: the projection
\[
E_N^{(k)} := 1 - \bigl\langle \varphi^{\otimes k}, \gamma_N^{(k)} \varphi^{\otimes k} \bigr\rangle
\]
and the trace norm distance
\[
R_N^{(k)} := \operatorname{Tr} \bigl| \gamma_N^{(k)} - (|\varphi\rangle\langle\varphi|)^{\otimes k} \bigr|. \tag{1.4}
\]
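The two indicators can be compared numerically in a toy setting. The following is a minimal sketch, assuming a finite-dimensional one-particle space and a pure one-particle state $\gamma = |\psi\rangle\langle\psi|$; the function name `indicators` and the chosen vectors are illustrative and not taken from the article. For two pure states the trace norm distance equals $2\sqrt{1 - |\langle\varphi,\psi\rangle|^2}$, so in this toy case $R = 2\sqrt{E}$, which already shows that the two indicators can vanish at different rates.

```python
import numpy as np

def indicators(gamma, phi):
    """Return (E, R) for a one-particle density matrix gamma and a unit vector phi.

    E = 1 - <phi, gamma phi>, R = Tr|gamma - |phi><phi||.
    Hypothetical helper for illustration only.
    """
    proj = np.outer(phi, phi.conj())
    E = 1.0 - np.real(np.vdot(phi, gamma @ phi))
    # The difference is Hermitian, so its trace norm is the sum of |eigenvalues|.
    R = np.sum(np.abs(np.linalg.eigvalsh(gamma - proj)))
    return E, R

phi = np.array([1.0, 0.0, 0.0])
results = []
for theta in (0.3, 0.1, 0.03):
    # psi is phi rotated by a small angle theta (an assumed toy family of states)
    psi = np.array([np.cos(theta), np.sin(theta), 0.0])
    gamma = np.outer(psi, psi.conj())
    E, R = indicators(gamma, phi)
    results.append((theta, E, R))
    print(f"theta={theta}: E={E:.3e}, R={R:.3e}, 2*sqrt(E)={2*np.sqrt(E):.3e}")
```

In this family $E = \sin^2\theta$ while $R = 2|\sin\theta|$, so $R$ decays only like the square root of $E$ as $\theta \to 0$.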
It is well known (see e.g. [9]) that all of these indicators are equivalent in the sense that the vanishing of either $R_N^{(k)}$ or $E_N^{(k)}$ for some $k$ in the limit $N \to \infty$ implies that $\lim_N R_N^{(k')} = \lim_N E_N^{(k')} = 0$ for all $k'$. However, the rate of convergence may differ from one indicator to another. Thus, when studying rates of convergence, they are not equivalent (see Sect. 2 below for a full discussion).

$^1$ In fact, as shown in Corollary 3.2, $k$ may be taken to grow like $o(N)$.

The study of the convergence of $\gamma_N^{(k)}(t)$ in the mean-field limit towards $(|\varphi(t)\rangle\langle\varphi(t)|)^{\otimes k}$ for all $t$ has a history going back almost thirty years. The first result is due to Spohn [13], who showed that $\lim_N R_N^{(k)}(t) = 0$ for all $t$ provided that $w$ is bounded. His method is based on the BBGKY hierarchy,
\[
i\partial_t \gamma_N^{(k)}(t) = \sum_{i=1}^k \bigl[h_i, \gamma_N^{(k)}(t)\bigr] + \frac{1}{N} \sum_{1 \le i < j \le k} \bigl[w(x_i - x_j), \gamma_N^{(k)}(t)\bigr] + \frac{N-k}{N} \sum_{i=1}^k \operatorname{Tr}_{k+1} \bigl[w(x_i - x_{k+1}), \gamma_N^{(k+1)}(t)\bigr], \tag{1.5}
\]
an equation of motion for the family $(\gamma_N^{(k)}(t))_{k \in \mathbb{N}}$ of reduced density matrices. It is a simple computation to check that the BBGKY hierarchy is equivalent to the Schrödinger equation (1.2) for $\Psi_N(t)$. Using a perturbative expansion of the BBGKY hierarchy, Spohn showed that in the limit $N \to \infty$ the family $(\gamma_N^{(k)}(t))_{k \in \mathbb{N}}$ converges to a family $(\gamma_\infty^{(k)}(t))_{k \in \mathbb{N}}$ that satisfies the limiting BBGKY hierarchy obtained by formally setting $N = \infty$ in (1.5). This limiting hierarchy is easily seen to be equivalent to the Hartree equation (1.3) via the identification $\gamma_\infty^{(k)}(t) = (|\varphi(t)\rangle\langle\varphi(t)|)^{\otimes k}$. We refer to [3] for a short discussion of some subsequent developments.

In the past few years considerable progress has been made in strengthening such results in mainly two directions. First, the convergence $\lim_N R_N^{(k)}(t) = 0$ for all $t$ has been proven for singular interaction potentials $w$. It is for instance of special physical interest to understand the case of a Coulomb potential, $w(x) = \lambda|x|^{-1}$, where $\lambda \in \mathbb{R}$. The proofs for singular interaction potentials are considerably more involved than for bounded interaction potentials. The first result for the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$ is due to Erdős and Yau [3]. Their proof uses the BBGKY hierarchy and a weak compactness argument. In [1], Schlein and Elgart extended this result to the technically more demanding case of a semirelativistic kinetic energy, $h = \sqrt{1 - \Delta}$ and $w(x) = \lambda|x|^{-1}$. This is a critical case in the sense that the kinetic energy has the same scaling behaviour as the Coulomb potential energy, thus requiring quite refined estimates. A different approach, based on operator methods, was developed by Fröhlich et al. in [4], where the authors treat the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$. Their proof relies on dispersive estimates and counting of Feynman graphs. Yet another approach was adopted by Rodnianski and Schlein in [12]. Using methods inspired by a semiclassical argument of Hepp [6] focusing on the dynamics of coherent states in Fock space, they show convergence to the mean-field limit in the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$.

The second area of recent progress in understanding the mean-field limit is deriving estimates on the rate of convergence to the mean-field limit. Methods based on expansions, as used in [13,4], give very weak bounds on the error $R_N^{(1)}(t)$, while weak compactness arguments, as used in [3,1], yield no information on the rate of convergence. From a physical point of view, where $N$ is large but finite, it is of some interest to have tight error bounds in order to be able to address the question whether the mean-field approximation may be regarded as valid. The first reasonable estimates on the error were derived for the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$ by Rodnianski and Schlein in their work [12] mentioned above. In fact they derive an explicit estimate on
the error of the form
\[
R_N^{(k)}(t) \le \frac{C_1(k)}{\sqrt{N}}\, e^{C_2(k) t}
\]
for some constants $C_1(k), C_2(k) > 0$. Using a novel approach inspired by Lieb-Robinson bounds, Erdős and Schlein [2] further improved this estimate under the more restrictive assumption that $w$ is bounded and its Fourier transform integrable. Their result is
\[
R_N^{(k)}(t) \le \frac{C_1}{N}\, e^{C_2 k}\, e^{C_3 t},
\]
for some constants $C_1, C_2, C_3 > 0$.

In the present article we adopt yet another approach based on a method of Pickl [10]. We strengthen and generalize many of the results listed above, by treating more singular interaction potentials as well as deriving estimates on the rate of convergence. Moreover, our approach allows for a large class of (possibly time-dependent) external potentials, which might for instance describe a trap confining the particles to a small volume. We also show that if the solution $\varphi(\cdot)$ of the Hartree equation satisfies a scattering condition, all of the error estimates are uniform in time.

The outline of the article is as follows. Section 2 is devoted to a short discussion of the indicators of convergence $E_N^{(k)}$ and $R_N^{(k)}$, in which we derive estimates relating them to each other. In Sect. 3 we state and prove our first main result, which concerns the mean-field limit in the case of $L^2$-type singularities in $w$; see Theorem 3.1 and Corollary 3.2. In Sect. 4 we state and prove our second main result, which allows for a larger class of singularities such as the nonrelativistic critical case $h = -\Delta$ and $w(x) = \lambda|x|^{-2}$; see Theorem 4.1. For an outline of the methods underlying our proofs, see the beginnings of Sects. 3 and 4.

Notation. Except in definitions, in statements of results and where confusion is possible, we refrain from indicating the explicit dependence of a quantity $a_N(t)$ on the time $t$ and the particle number $N$. When needed, we use the notations $a(t)$ and $a|_t$ interchangeably to denote the value of the quantity $a$ at time $t$. The symbol $C$ is reserved for a generic positive constant that may depend on some fixed parameters. We abbreviate $a \le Cb$ with $a \lesssim b$. To simplify notation, we assume that $t \ge 0$. We abbreviate $L^p(\mathbb{R}^d, dx) \equiv L^p$ and $\|\cdot\|_{L^p} \equiv \|\cdot\|_p$. We also set $\|\cdot\|_{L^2(\mathbb{R}^{Nd})} = \|\cdot\|$. For $s \in \mathbb{R}$ we use $H^s \equiv H^s(\mathbb{R}^d)$ to denote the Sobolev space with norm $\|f\|_{H^s} = \|(1 + |k|^2)^{s/2} \hat f\|_2$, where $\hat f$ is the Fourier transform of $f$.

Integer indices on operators denote particle number: A $k$-particle operator $A$ (i.e. an operator on $\mathcal{H}^{(k)}$) acting on the coordinates $x_{i_1}, \dots, x_{i_k}$, where $i_1 < \cdots < i_k$, is denoted by $A_{i_1 \dots i_k}$. Also, by a slight abuse of notation, we identify $k$-particle functions $f(x_1, \dots, x_k)$ with their associated multiplication operators on $\mathcal{H}^{(k)}$. The operator norm of the multiplication operator $f$ is equal to, and will always be denoted by, $\|f\|_\infty$. We use the symbol $Q(\cdot)$ to denote the form domain of a semibounded operator. We denote the space of bounded linear maps from $X_1$ to $X_2$ by $L(X_1; X_2)$, and abbreviate $L(X) = L(X; X)$. We abbreviate the operator norm of $L(L^2(\mathbb{R}^{Nd}))$ by $\|\cdot\|$. For two Banach spaces, $X_1$ and $X_2$, contained in some larger space, we set
\[
\|f\|_{X_1 + X_2} = \inf_{f = f_1 + f_2} \bigl( \|f_1\|_{X_1} + \|f_2\|_{X_2} \bigr), \qquad \|f\|_{X_1 \cap X_2} = \|f\|_{X_1} + \|f\|_{X_2},
\]
and denote by $X_1 + X_2$ and $X_1 \cap X_2$ the corresponding Banach spaces.
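2. Indicators of Convergence

This section is devoted to a discussion, which might also be of independent interest, of quantitative relationships between the indicators $E_N^{(k)}$ and $R_N^{(k)}$. Throughout this section we suppress the irrelevant index $N$. Take a $k$-particle density matrix $\gamma^{(k)} \in L(\mathcal{H}^{(k)})$ and a one-particle condensate wave function $\varphi \in L^2$. The following lemma gives the relationship between different elements of the sequence $E^{(1)}, E^{(2)}, \dots$, where, we recall,
\[
E^{(k)} = 1 - \bigl\langle \varphi^{\otimes k}, \gamma^{(k)} \varphi^{\otimes k} \bigr\rangle. \tag{2.1}
\]

Lemma 2.1. Let $\gamma^{(k)} \in L(\mathcal{H}^{(k)})$ satisfy
\[
\gamma^{(k)} \ge 0, \qquad \operatorname{Tr} \gamma^{(k)} = 1.
\]
Let $\varphi \in L^2$ satisfy $\|\varphi\| = 1$. Then
\[
E^{(k)} \le k\, E^{(1)}. \tag{2.2}
\]

Proof. Let $(\Phi_i^{(k)})_{i \ge 1}$ be an orthonormal basis of $\mathcal{H}^{(k)}$ with $\Phi_1^{(k)} = \varphi^{\otimes k}$. Then
\[
\begin{aligned}
\bigl\langle \varphi^{\otimes k}, \gamma^{(k)} \varphi^{\otimes k} \bigr\rangle
&= \sum_{i \ge 1} \bigl\langle \varphi \otimes \Phi_i^{(k-1)}, \gamma^{(k)}\, \varphi \otimes \Phi_i^{(k-1)} \bigr\rangle - \sum_{i \ge 2} \bigl\langle \varphi \otimes \Phi_i^{(k-1)}, \gamma^{(k)}\, \varphi \otimes \Phi_i^{(k-1)} \bigr\rangle \\
&= \bigl\langle \varphi, \gamma^{(1)} \varphi \bigr\rangle - \sum_{i \ge 2} \bigl\langle \varphi \otimes \Phi_i^{(k-1)}, \gamma^{(k)}\, \varphi \otimes \Phi_i^{(k-1)} \bigr\rangle.
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
\bigl\langle \varphi, \gamma^{(1)} \varphi \bigr\rangle - \bigl\langle \varphi^{\otimes k}, \gamma^{(k)} \varphi^{\otimes k} \bigr\rangle
&= \sum_{i \ge 2} \bigl\langle \varphi \otimes \Phi_i^{(k-1)}, \gamma^{(k)}\, \varphi \otimes \Phi_i^{(k-1)} \bigr\rangle \\
&\le \sum_{i \ge 2} \sum_{j \ge 1} \bigl\langle \Phi_j^{(1)} \otimes \Phi_i^{(k-1)}, \gamma^{(k)}\, \Phi_j^{(1)} \otimes \Phi_i^{(k-1)} \bigr\rangle \\
&= \sum_{i \ge 1} \sum_{j \ge 1} \bigl\langle \Phi_j^{(1)} \otimes \Phi_i^{(k-1)}, \gamma^{(k)}\, \Phi_j^{(1)} \otimes \Phi_i^{(k-1)} \bigr\rangle - \sum_{j \ge 1} \bigl\langle \Phi_j^{(1)} \otimes \varphi^{\otimes (k-1)}, \gamma^{(k)}\, \Phi_j^{(1)} \otimes \varphi^{\otimes (k-1)} \bigr\rangle \\
&= 1 - \bigl\langle \varphi^{\otimes (k-1)}, \gamma^{(k-1)} \varphi^{\otimes (k-1)} \bigr\rangle.
\end{aligned}
\]
This yields $E^{(k)} \le E^{(k-1)} + E^{(1)}$, and the claim follows. $\square$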
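The bound (2.2) can be checked on a simple family of states. The sketch below assumes a product state $\gamma^{(k)} = (|\psi\rangle\langle\psi|)^{\otimes k}$ with $|\langle\varphi,\psi\rangle|^2 = 1 - \alpha$, for which $E^{(k)} = 1 - (1-\alpha)^k$; the function name `E_product` and the sampled values of $k$ and $\alpha$ are illustrative choices, not part of the article.

```python
def E_product(alpha, k):
    """E^(k) for the assumed product state (|psi><psi|)^{(x)k} with overlap 1 - alpha."""
    return 1.0 - (1.0 - alpha) ** k

# Ratio E^(k) / (k E^(1)); Lemma 2.1 says it never exceeds 1.
ratios = {}
for k in (2, 5, 20):
    for alpha in (0.5, 0.1, 1e-3, 1e-6):
        ratios[(k, alpha)] = E_product(alpha, k) / (k * E_product(alpha, 1))
        print(f"k={k:2d} alpha={alpha:g} ratio={ratios[(k, alpha)]:.6f}")
```

The ratio stays below one and approaches one as $\alpha \to 0$, which is the mechanism behind the sharpness statement for (2.2).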
Remark 2.2. The bound in (2.2) is sharp. Indeed, let us suppose that $E^{(k)} \le k f(k) E^{(1)}$ for some function $f$. Then
\[
f(k) \ge \sup_{\gamma^{(k)}} \frac{E^{(k)}}{k E^{(1)}} \ge \sup_{\alpha} \frac{1 - (1 - \alpha)^k}{k\alpha} \ge \lim_{\alpha \to 0} \frac{1 - (1 - \alpha)^k}{k\alpha} = 1,
\]

[…] $> 0$. Without loss of generality we assume that $K \ge 1$.

(A4') The solution $\varphi(\cdot)$ of (1.3) satisfies $\varphi(\cdot) \in C(\mathbb{R}; X_1) \cap C^1(\mathbb{R}; X_1^*)$.

Then Theorem 3.1 and Corollary 3.2 hold with
\[
\phi(t) = 32K \int_0^t ds\, \|\varphi(s)\|_{X_1}^2.
\]
The proof remains virtually unchanged. One replaces (3.24) with (3.6), as well as (3.20) with
\[
\bigl\| w * |\varphi|^2 \bigr\|_\infty \le 2K \|\varphi\|_{X_1}^2,
\]
which is an easy consequence of (3.6).

3.2. Examples. We list two examples of systems satisfying the assumptions of Theorem 3.1.

3.2.1. Particles in a trap. Consider nonrelativistic particles in $\mathbb{R}^3$ confined by a strong trapping potential. The particles interact by means of the Coulomb potential: $w(x) = \lambda|x|^{-1}$, where $\lambda \in \mathbb{R}$. The one-particle Hamiltonian is of the form $h = -\Delta + v$, where $v$ is a measurable function on $\mathbb{R}^3$. Decompose $v$ into its positive and negative parts: $v = v_+ - v_-$, where $v_+, v_- \ge 0$. We assume that $v_+ \in L^1_{\mathrm{loc}}$ and that $v_-$ is $-\Delta$-form bounded with relative bound less than one, i.e. there are constants $0 \le a < 1$ and $0 \le b < \infty$ such that
\[
\langle \varphi, v_- \varphi \rangle \le a \langle \varphi, -\Delta \varphi \rangle + b \langle \varphi, \varphi \rangle. \tag{3.7}
\]
Thus $h + b\mathbf{1}$ is positive, and it is not hard to see that $h$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^3)$. This follows by density and a standard argument using Riesz's representation theorem to show that the equation $(h + (b + 1)\mathbf{1})\varphi = f$ has a unique solution $\varphi \in \{\varphi \in L^2 : h\varphi \in L^2\}$ for each $f \in L^2$. It is now easy to see that Assumptions (A1) and (A2) hold with the one-particle Hamiltonian $h + c\mathbf{1}$ for some $c > 0$. Let us assume without loss of generality that $c = 0$. Next, we verify Assumptions (A3') and (A4') (see Remark 3.8). We find
\[
\bigl\| w^2 * |\varphi|^2 \bigr\|_\infty = \sup_x \int dy\, \frac{\lambda^2}{|x - y|^2}\, |\varphi(y)|^2 \lesssim \langle \varphi, -\Delta \varphi \rangle \lesssim \langle \varphi, h\varphi \rangle + \langle \varphi, \varphi \rangle = \|\varphi\|_{X_1}^2,
\]
where the second step follows from Hardy's inequality and translation invariance of $\Delta$, and the third step is a simple consequence of (3.7). This proves (A3').
Next, take $\varphi_0 \in X_1$. By standard methods (see e.g. the presentation of [7]) one finds that (A4') holds. Moreover, the mass $\|\varphi(t)\|^2$ and the energy
\[
E^\varphi(t) = \Bigl[ \langle \varphi, h\varphi \rangle + \frac{1}{2} \int dx\, dy\, w(x - y) |\varphi(x)|^2 |\varphi(y)|^2 \Bigr]\Big|_t
\]
are conserved under time evolution. Using the bound $|x|^{-1} \le \mathbf{1}_{\{|x| \le \varepsilon\}}\, \varepsilon |x|^{-2} + \mathbf{1}_{\{|x| > \varepsilon\}}\, \varepsilon^{-1}$ and Hardy's inequality one sees that $\|\varphi(t)\|_{X_1}^2 \lesssim E^\varphi(t) + \|\varphi(t)\|^2$, and therefore $\|\varphi(t)\|_{X_1} \le C$ for all $t$. We conclude: Theorem 3.1 holds with $\phi(t) = Ct$.

More generally, the preceding discussion holds for interaction potentials $w \in L^3_w + L^\infty$, where $L^p_w$ denotes the weak $L^p$ space (see e.g. [11]). This follows from a short computation using symmetric-decreasing rearrangements; we omit further details. This example generalizes the results of [3,12,4].

3.2.2. A boson star. Consider semirelativistic particles in $\mathbb{R}^3$ whose one-particle Hamiltonian is given by $h = \sqrt{1 - \Delta}$. The particles interact by means of a Coulomb potential: $w(x) = \lambda|x|^{-1}$. We impose the condition $\lambda > -4/\pi$. This condition is necessary for both the stability of the $N$-body problem (i.e. Assumption (A2)) and the global well-posedness of the Hartree equation. See [7,8] for details. It is well known that Assumptions (A1) and (A2) hold in this case. In order to show (A4) we need some regularity of $\varphi(\cdot)$. To this end, let $s > 1$ and take $\varphi_0 \in H^s$. Theorem 3 of [7] implies that (1.3) has a unique global solution in $H^s$. Therefore Sobolev's inequality implies that (A4) holds with
\[
\frac{1}{q_1} = \frac{1}{2} - \frac{s}{3}.
\]
Thus $q_1 > 6$, and (A3) holds with appropriately chosen values of $p_1, p_2$. We conclude: Theorem 3.1 holds for some continuous function $\phi(t)$. (In fact, as shown in [7], one has the bound $\phi(t) \lesssim e^{Ct}$.) This example generalizes the result of [1].

3.3. Proof of Theorem 3.1.

3.3.1. A family of projectors. Define the time-dependent projectors
\[
p(t) := |\varphi(t)\rangle\langle\varphi(t)|, \qquad q(t) := \mathbf{1} - p(t).
\]
Write
\[
\mathbf{1} = (p_1 + q_1) \cdots (p_N + q_N), \tag{3.8}
\]
and define $P_k$, for $k = 0, \dots, N$, as the term obtained by multiplying out (3.8) and selecting all summands containing $k$ factors $q$. In other words,
\[
P_k = \sum_{\substack{a \in \{0,1\}^N : \\ \sum_i a_i = k}} \; \prod_{i=1}^N p_i^{1 - a_i} q_i^{a_i}. \tag{3.9}
\]
If $k \notin \{0, \dots, N\}$ we set $P_k = 0$. It is easy to see that the following properties hold:

(i) $P_k$ is an orthogonal projector,
(ii) $P_k P_l = \delta_{kl} P_k$,
(iii) $\sum_k P_k = \mathbf{1}$.

Next, for any function $f : \{0, \dots, N\} \to \mathbb{C}$ we define the operator
\[
\hat f := \sum_k f(k) P_k. \tag{3.10}
\]
It follows immediately that $\hat f \hat g = \widehat{fg}$, and that $\hat f$ commutes with $p_i$ and $P_k$. We shall often make use of the functions
\[
m(k) := \frac{k}{N}, \qquad n(k) := \sqrt{\frac{k}{N}}.
\]
We have the relation
\[
\frac{1}{N} \sum_i q_i = \frac{1}{N} \sum_i q_i \sum_k P_k = \frac{1}{N} \sum_k k P_k = \hat m. \tag{3.11}
\]
Thus, by symmetry of $\Psi$, we get
\[
\alpha = \langle \Psi, q_1 \Psi \rangle = \langle \Psi, \hat m \Psi \rangle. \tag{3.12}
\]
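Properties (i)–(iii) and the relation (3.11) are purely algebraic, so they can be verified numerically on a small example. The following is a minimal sketch, assuming $N = 3$ particles with a two-dimensional one-particle space and working on the full tensor product rather than the symmetric subspace; the helper names `kron_all` and `P` and the choice of $\varphi$ are illustrative assumptions.

```python
import itertools
import numpy as np

N, d = 3, 2                                   # illustrative sizes (assumptions)
phi = np.array([1.0, 1.0]) / np.sqrt(2.0)     # assumed one-particle state
p = np.outer(phi, phi)                        # p = |phi><phi|
q = np.eye(d) - p                             # q = 1 - p

def kron_all(ops):
    """Tensor product of a list of one-particle operators."""
    out = np.eye(1)
    for op in ops:
        out = np.kron(out, op)
    return out

def P(k):
    """P_k as in (3.9): sum of all products of p's and q's with exactly k factors q."""
    total = np.zeros((d ** N, d ** N))
    for a in itertools.product((0, 1), repeat=N):
        if sum(a) == k:
            total += kron_all([q if ai else p for ai in a])
    return total

Ps = [P(k) for k in range(N + 1)]

# m-hat from (3.10) with m(k) = k/N, and the average of the q_i from (3.11)
m_hat = sum((k / N) * Ps[k] for k in range(N + 1))
q_avg = sum(
    kron_all([q if j == i else np.eye(d) for j in range(N)]) for i in range(N)
) / N

print("max |q_avg - m_hat| =", np.abs(q_avg - m_hat).max())
```

The printed deviation is at the level of floating-point rounding, reflecting that $\hat m$ simply counts the fraction of particles outside the condensate state.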
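The correspondence $q_1 \sim \hat m$ of (3.11) yields the following useful bounds.

Lemma 3.9. For any nonnegative function $f : \{0, \dots, N\} \to [0, \infty)$ we have
\[
\langle \Psi, \hat f q_1 \Psi \rangle = \langle \Psi, \hat f \hat m \Psi \rangle, \tag{3.13}
\]
\[
\langle \Psi, \hat f q_1 q_2 \Psi \rangle \le \frac{N}{N-1} \langle \Psi, \hat f \hat m^2 \Psi \rangle. \tag{3.14}
\]

Proof. The proof of (3.13) is an immediate consequence of (3.11). In order to prove (3.14) we write, using symmetry of $\Psi$ as well as (3.11),
\[
\langle \Psi, \hat f q_1 q_2 \Psi \rangle = \frac{1}{N(N-1)} \sum_{i \ne j} \langle \Psi, \hat f q_i q_j \Psi \rangle \le \frac{1}{N(N-1)} \sum_{i, j} \langle \Psi, \hat f q_i q_j \Psi \rangle = \frac{N}{N-1} \langle \Psi, \hat f \hat m^2 \Psi \rangle,
\]
which is the claim. $\square$

Next, we introduce the shift operation $\tau_n$, $n \in \mathbb{Z}$, defined on functions $f$ through
\[
(\tau_n f)(k) := f(k + n). \tag{3.15}
\]
Its usefulness for our purposes is encapsulated by the following lemma.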
Lemma 3.10. Let $r \ge 1$ and $A$ be an operator on $\mathcal{H}^{(r)}$. Let $Q_i$, $i = 1, 2$, be two projectors of the form
\[
Q_i = \#_1 \cdots \#_r,
\]
where each $\#$ stands for either $p$ or $q$. Then
\[
Q_1 A_{1 \dots r}\, \hat f\, Q_2 = Q_1\, \widehat{\tau_n f}\, A_{1 \dots r}\, Q_2,
\]
where $n = n_2 - n_1$ and $n_i$ is the number of factors $q$ in $Q_i$.

Proof. Define
\[
P_k^r := \sum_{\substack{a \in \{0,1\}^{N-r} : \\ \sum_i a_i = k}} \; \prod_{i=r+1}^N p_i^{1 - a_i} q_i^{a_i}.
\]
Then,
\[
Q_i \hat f = \sum_k f(k)\, Q_i P_k = \sum_k f(k)\, Q_i P^r_{k - n_i} = \sum_k f(k + n_i)\, Q_i P_k^r.
\]
The claim follows from the fact that $P_k^r$ commutes with $A_{1 \dots r}$. $\square$
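The shift identity of Lemma 3.10 can be tested numerically in the simplest case $r = 1$. The sketch below assumes $N = 3$, a two-dimensional one-particle space, $Q_1 = p_1$ and $Q_2 = q_1$ (so $n = 1$), a random one-particle operator $A$, and an arbitrary test function $f$; all of these choices, and the helper names, are illustrative assumptions rather than anything fixed by the article.

```python
import itertools
import numpy as np

N, d = 3, 2                          # illustrative sizes (assumptions)
rng = np.random.default_rng(0)
phi = np.array([1.0, 0.0])           # assumed one-particle state
p = np.outer(phi, phi)
q = np.eye(d) - p

def kron_all(ops):
    """Tensor product of a list of one-particle operators."""
    out = np.eye(1)
    for op in ops:
        out = np.kron(out, op)
    return out

def P(k):
    """P_k from (3.9); returns zero when k is outside {0, ..., N}."""
    total = np.zeros((d ** N, d ** N))
    for a in itertools.product((0, 1), repeat=N):
        if sum(a) == k:
            total += kron_all([q if ai else p for ai in a])
    return total

def hat(f):
    """f-hat = sum_k f(k) P_k, as in (3.10)."""
    return sum(f(k) * P(k) for k in range(N + 1))

def on_particle_1(op):
    """Embed a one-particle operator as an operator acting on coordinate x_1."""
    return kron_all([op] + [np.eye(d)] * (N - 1))

def f(k):
    return float(k) ** 2 + 1.0       # arbitrary test function

def tau(n, g):
    """Shift operation (3.15): (tau_n g)(k) = g(k + n)."""
    return lambda k: g(k + n)

A1 = on_particle_1(rng.standard_normal((d, d)))
Q1, Q2 = on_particle_1(p), on_particle_1(q)   # n_1 = 0, n_2 = 1, hence n = 1

lhs = Q1 @ A1 @ hat(f) @ Q2
rhs = Q1 @ hat(tau(1, f)) @ A1 @ Q2
print("max |lhs - rhs| =", np.abs(lhs - rhs).max())
```

Moving $\hat f$ from the right of $A$ to its left costs exactly a shift of the argument by the change in the number of factors $q$, which is what the commutator estimates in the proof exploit.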
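3.3.2. A bound on $\dot\alpha$. Let us abbreviate
\[
W^\varphi := w * |\varphi|^2.
\]
From (A3) and (A4) we find $W^\varphi \in L^\infty$ (see (3.20) below). Then $i\partial_t \varphi = (h + W^\varphi)\varphi$, where $h + W^\varphi \in L(X_1; X_1^*)$. Thus, for any $\psi \in X_1$ independent of $t$ we have
\[
i\partial_t \langle \psi, p\, \psi \rangle = \langle \psi, [h + W^\varphi, p]\, \psi \rangle.
\]
On the other hand, it is easy to see from (A3) and (A4) that $\hat m \Psi \in Q(H)$. Combining these observations, and noting that $\Psi \in Q(H) \subset X$ by (A2), we see that $\alpha$ is differentiable in $t$ with derivative
\[
\dot\alpha = i \bigl\langle \Psi, [H - H^\varphi, \hat m]\, \Psi \bigr\rangle,
\]
where $H^\varphi := \sum_i (h_i + W_i^\varphi)$. Thus,
\[
\dot\alpha = i \Bigl\langle \Psi, \Bigl[ \frac{1}{N} \sum_{i<j} W_{ij} - \sum_i W_i^\varphi, \hat m \Bigr] \Psi \Bigr\rangle.
\]
By symmetry of $\Psi$ and $\hat m$ we get
\[
\dot\alpha = \frac{i}{2} \bigl\langle \Psi, \bigl[ (N-1) W_{12} - N W_1^\varphi - N W_2^\varphi, \hat m \bigr] \Psi \bigr\rangle. \tag{3.16}
\]
In order to estimate the right-hand side, we introduce
\[
\mathbf{1} = (p_1 + q_1)(p_2 + q_2)
\]
on both sides of the commutator in (3.16). Of the sixteen resulting terms only three different types survive:
\[
\begin{aligned}
\text{(I)} &= \frac{i}{2} \bigl\langle \Psi, p_1 p_2 \bigl[ (N-1) W_{12} - N W_1^\varphi - N W_2^\varphi, \hat m \bigr] q_1 p_2 \Psi \bigr\rangle, \\
\text{(II)} &= \frac{i}{2} \bigl\langle \Psi, q_1 p_2 \bigl[ (N-1) W_{12} - N W_1^\varphi - N W_2^\varphi, \hat m \bigr] q_1 q_2 \Psi \bigr\rangle, \\
\text{(III)} &= \frac{i}{2} \bigl\langle \Psi, p_1 p_2 \bigl[ (N-1) W_{12} - N W_1^\varphi - N W_2^\varphi, \hat m \bigr] q_1 q_2 \Psi \bigr\rangle.
\end{aligned}
\]
Indeed, Lemma 3.10 implies that terms with the same number of factors $q$ on the left and on the right vanish. What remains is
\[
\dot\alpha = 2\text{(I)} + 2\text{(II)} + \text{(III)} + \text{complex conjugate}.
\]
The remainder of the proof consists in estimating each term.

Term (I). First, we remark that
\[
p_2 W_{12}\, p_2 = p_2 W_1^\varphi. \tag{3.17}
\]
This is easiest to see using operator kernels (we drop the trivial indices $x_3, y_3, \dots, x_N, y_N$):
\[
\begin{aligned}
(p_2 W_{12}\, p_2)(x_1, x_2; y_1, y_2) &= \int dz\, \varphi(x_2)\, \overline{\varphi(z)}\, w(x_1 - z)\, \delta(x_1 - y_1)\, \varphi(z)\, \overline{\varphi(y_2)} \\
&= \varphi(x_2)\, \overline{\varphi(y_2)}\, \delta(x_1 - y_1)\, (w * |\varphi|^2)(x_1).
\end{aligned}
\]
Therefore,
\[
\text{(I)} = \frac{i}{2} \bigl\langle \Psi, p_1 p_2 \bigl[ (N-1) W_1^\varphi - N W_1^\varphi, \hat m \bigr] q_1 p_2 \Psi \bigr\rangle = \frac{-i}{2} \bigl\langle \Psi, p_1 p_2 \bigl[ W_1^\varphi, \hat m \bigr] q_1 p_2 \Psi \bigr\rangle.
\]
Using Lemma 3.10 we find
\[
\text{(I)} = \frac{-i}{2} \bigl\langle \Psi, p_1 p_2 W_1^\varphi \bigl( \hat m - \widehat{\tau_{-1} m} \bigr) q_1 p_2 \Psi \bigr\rangle = \frac{-i}{2N} \bigl\langle \Psi, p_1 p_2 W_1^\varphi q_1 p_2 \Psi \bigr\rangle.
\]
This gives
\[
|\text{(I)}| \le \frac{1}{2N} \|W^\varphi\|_\infty = \frac{1}{2N} \bigl\| w * |\varphi|^2 \bigr\|_\infty.
\]
By (A3), we may write
\[
w = w^{(1)} + w^{(2)}, \qquad w^{(i)} \in L^{p_i}. \tag{3.18}
\]
By Young's inequality,
\[
\bigl\| w^{(i)} * |\varphi|^2 \bigr\|_\infty \le \|w^{(i)}\|_{p_i} \|\varphi\|_{r_i}^2,
\]
where $r_1, r_2$ are defined through
\[
1 = \frac{1}{p_i} + \frac{2}{r_i}. \tag{3.19}
\]
Therefore,
\[
\bigl\| w * |\varphi|^2 \bigr\|_\infty \le \|w^{(1)}\|_{p_1} \|\varphi\|_{r_1}^2 + \|w^{(2)}\|_{p_2} \|\varphi\|_{r_2}^2 \le \bigl( \|w^{(1)}\|_{p_1} + \|w^{(2)}\|_{p_2} \bigr) \bigl( \|\varphi\|_{r_1} + \|\varphi\|_{r_2} \bigr)^2.
\]
Taking the infimum over all decompositions (3.18) yields
\[
\|W^\varphi\|_\infty = \bigl\| w * |\varphi|^2 \bigr\|_\infty \le \|w\|_{L^{p_1} + L^{p_2}} \bigl( \|\varphi\|_{r_1} + \|\varphi\|_{r_2} \bigr)^2. \tag{3.20}
\]
Note that (A3) and (A4) imply
\[
2 \le r_i \le q_1, \tag{3.21}
\]
so that the right-hand side of (3.20) is finite. Summarizing,
\[
|\text{(I)}| \le \frac{1}{2N} \|w\|_{L^{p_1} + L^{p_2}} \bigl( \|\varphi\|_{r_1} + \|\varphi\|_{r_2} \bigr)^2. \tag{3.22}
\]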
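Term (II). Applying Lemma 3.10 to (II) yields
\[
\begin{aligned}
\text{(II)} &= \frac{i}{2} \bigl\langle \Psi, q_1 p_2 \bigl[ (N-1) W_{12} - N W_2^\varphi \bigr] \bigl( \hat m - \widehat{\tau_{-1} m} \bigr) q_1 q_2 \Psi \bigr\rangle \\
&= \frac{i}{2} \Bigl\langle \Psi, q_1 p_2 \Bigl( \frac{N-1}{N} W_{12} - W_2^\varphi \Bigr) q_1 q_2 \Psi \Bigr\rangle,
\end{aligned}
\]
so that
\[
|\text{(II)}| \le \frac{1}{2} \bigl| \langle \Psi, q_1 p_2 W_{12}\, q_1 q_2 \Psi \rangle \bigr| + \frac{1}{2} \bigl| \langle \Psi, q_1 p_2 W_2^\varphi q_1 q_2 \Psi \rangle \bigr|. \tag{3.23}
\]
The second term of (3.23) is bounded by
\[
\frac{1}{2} \|W^\varphi\|_\infty \|q_1 \Psi\|^2 \le \frac{1}{2} \|w\|_{L^{p_1} + L^{p_2}} \bigl( \|\varphi\|_{r_1} + \|\varphi\|_{r_2} \bigr)^2 \alpha,
\]
where we used the bound (3.20) as well as (3.12). The first term of (3.23) is bounded using Cauchy-Schwarz by
\[
\frac{1}{2} \sqrt{ \langle \Psi, q_1 p_2 W_{12}^2\, p_2 q_1 \Psi \rangle } \sqrt{ \langle \Psi, q_1 q_2 \Psi \rangle } = \frac{1}{2} \sqrt{ \bigl\langle \Psi, q_1 p_2 \bigl( w^2 * |\varphi|^2 \bigr)_1 p_2 q_1 \Psi \bigr\rangle } \sqrt{ \langle \Psi, q_1 q_2 \Psi \rangle }.
\]
This follows by applying (3.17) to $W^2$. Thus we get the bound
\[
\frac{1}{2} \|q_1 \Psi\|^2 \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty } = \frac{1}{2}\, \alpha \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty }.
\]
We now proceed as above. Using the decomposition (3.18) we get
\[
\bigl\| w^2 * |\varphi|^2 \bigr\|_\infty \le 2 \bigl\| (w^{(1)})^2 * |\varphi|^2 \bigr\|_\infty + 2 \bigl\| (w^{(2)})^2 * |\varphi|^2 \bigr\|_\infty.
\]
Then Young's inequality gives
\[
\bigl\| (w^{(i)})^2 * |\varphi|^2 \bigr\|_\infty \le \|w^{(i)}\|_{p_i}^2 \|\varphi\|_{q_i}^2,
\]
which implies that
\[
\bigl\| w^2 * |\varphi|^2 \bigr\|_\infty \le 2 \|w\|_{L^{p_1} + L^{p_2}}^2 \bigl( \|\varphi\|_{q_1} + \|\varphi\|_{q_2} \bigr)^2. \tag{3.24}
\]
Putting all of this together we get
\[
|\text{(II)}| \le \frac{1}{2} \|w\|_{L^{p_1} + L^{p_2}} \Bigl[ \sqrt{2} \bigl( \|\varphi\|_{q_1} + \|\varphi\|_{q_2} \bigr) + \bigl( \|\varphi\|_{r_1} + \|\varphi\|_{r_2} \bigr)^2 \Bigr] \alpha.
\]

Term (III). The final term (III) is equal to
\[
\frac{i}{2} \bigl\langle \Psi, p_1 p_2 \bigl[ (N-1) W_{12}, \hat m \bigr] q_1 q_2 \Psi \bigr\rangle = \frac{i}{2} \bigl\langle \Psi, p_1 p_2 (N-1) W_{12} \bigl( \hat m - \widehat{\tau_{-2} m} \bigr) q_1 q_2 \Psi \bigr\rangle = i\, \frac{N-1}{N} \bigl\langle \Psi, p_1 p_2 W_{12}\, q_1 q_2 \Psi \bigr\rangle,
\]
where we used Lemma 3.10. Next, we note that, on the range of $q_1$, the operator $\hat n^{-1}$ is well-defined and bounded. Thus (III) is equal to
\[
i\, \frac{N-1}{N} \bigl\langle \Psi, p_1 p_2 W_{12}\, \hat n\, \hat n^{-1} q_1 q_2 \Psi \bigr\rangle = i\, \frac{N-1}{N} \bigl\langle \Psi, p_1 p_2\, \widehat{\tau_2 n}\, W_{12}\, \hat n^{-1} q_1 q_2 \Psi \bigr\rangle,
\]
where we used Lemma 3.10 again. We now use Cauchy-Schwarz to get
\[
\begin{aligned}
|\text{(III)}| &\le \sqrt{ \bigl\langle \Psi, p_1 p_2\, \widehat{\tau_2 n}\, W_{12}^2\, \widehat{\tau_2 n}\, p_1 p_2 \Psi \bigr\rangle } \sqrt{ \bigl\langle \Psi, \hat n^{-2} q_1 q_2 \Psi \bigr\rangle } \\
&= \sqrt{ \bigl\langle \Psi, p_1 p_2\, \widehat{\tau_2 n}\, \bigl( w^2 * |\varphi|^2 \bigr)_1 \widehat{\tau_2 n}\, p_1 p_2 \Psi \bigr\rangle } \sqrt{ \bigl\langle \Psi, \hat m^{-1} q_1 q_2 \Psi \bigr\rangle } \\
&\le \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty } \sqrt{ \bigl\langle \Psi, \widehat{\tau_2 n}^{\,2} \Psi \bigr\rangle } \sqrt{ \frac{N}{N-1} \langle \Psi, \hat m \Psi \rangle } \\
&= \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty } \sqrt{ \bigl\langle \Psi, \widehat{\tau_2 m}\, \Psi \bigr\rangle } \sqrt{ \frac{N}{N-1} } \sqrt{\alpha} \\
&= \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty } \sqrt{ \langle \Psi, \hat m \Psi \rangle + \frac{2}{N} } \sqrt{ \frac{N}{N-1} } \sqrt{\alpha} \\
&\le \sqrt{ \frac{N}{N-1} } \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty } \Bigl( \alpha + \sqrt{ \frac{2\alpha}{N} } \Bigr) \\
&\le 2 \sqrt{ \frac{N}{N-1} } \sqrt{ \bigl\| w^2 * |\varphi|^2 \bigr\|_\infty } \Bigl( \alpha + \frac{1}{N} \Bigr).
\end{aligned}
\]
Using the estimate (3.24) we get finally
\[
|\text{(III)}| \le 2\sqrt{2}\, \|w\|_{L^{p_1} + L^{p_2}} \bigl( \|\varphi\|_{q_1} + \|\varphi\|_{q_2} \bigr) \sqrt{ \frac{N}{N-1} } \Bigl( \alpha + \frac{1}{N} \Bigr).
\]

Conclusion of the proof. We have shown that the estimate (3.2) holds with
\[
B_N(t) = 2 \|w\|_{L^{p_1} + L^{p_2}} \Bigl[ \bigl( \|\varphi(t)\|_{r_1} + \|\varphi(t)\|_{r_2} \bigr)^2 + 6 \bigl( \|\varphi(t)\|_{q_1} + \|\varphi(t)\|_{q_2} \bigr) \Bigr], \qquad A_N(t) = \frac{B_N(t)}{N}.
\]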
Using $L^2$-norm conservation $\|\varphi(t)\| = 1$ and interpolation we find $\|\varphi(t)\|_{r_i}^2 \le \|\varphi(t)\|_{q_i}$. Thus,
\[
B_N(t) \le 16 \|w\|_{L^{p_1} + L^{p_2}} \bigl( \|\varphi(t)\|_{q_1} + \|\varphi(t)\|_{q_2} \bigr).
\]
The claim now follows from the Grönwall estimate (3.3). $\square$

4. Convergence for Stronger Singularities

In this section we extend the results of Sect. 3 to more singular interaction potentials. We consider the case $w \in L^{p_0} + L^\infty$, where
\[
\frac{1}{p_0} = \frac{1}{2} + \frac{1}{d}. \tag{4.1}
\]
For example in three dimensions $p_0 = 6/5$, which corresponds to singularities up to, but not including, the type $|x|^{-5/2}$. Of course, there are other restrictions on the interaction potential which ensure the stability of the $N$-body Hamiltonian and the well-posedness of the Hartree equation. In practice, it is often these latter restrictions that determine the class of allowed singularities.

In the words of [11] (p. 169), it is “venerable physical folklore” that an $N$-body Hamiltonian of the form (3.4), with $h = -\Delta$ and $w(x) = |x|^{-\zeta}$ for $\zeta < 2$, produces reasonable quantum dynamics in three dimensions. Mathematically, this means that such a Hamiltonian is self-adjoint; this is a well-known result (see e.g. [11]). The corresponding Hartree equation is known to be globally well-posed (see [5]). This section answers (affirmatively) the question whether, in the case of such singular interaction potentials, the mean-field limit of the $N$-body dynamics is governed by the Hartree equation.

4.1. Outline and main result. As in Sect. 3, we need to control expressions of the form $\|w^2 * |\varphi|^2\|_\infty$. The situation is considerably more involved when $w^2$ is not locally integrable. An important step in dealing with such potentials in our proof is to express $w$ as the divergence of a vector field $\xi \in L^2$. This approach requires the control of not only $\alpha = \|q_1 \Psi\|^2$ but also $\|\nabla_1 q_1 \Psi\|^2$, which arises from integrating by parts in expressions containing the factor $\nabla \cdot \xi$. As it turns out, $\beta$, defined through
\[
\beta_N(t) := \bigl\langle \Psi_N, \hat n\, \Psi_N \bigr\rangle \big|_t, \tag{4.2}
\]
does the trick. This follows from an estimate exploiting conservation of energy (see Lemma 4.6 below). The inequality $m \le n$ and the representation (3.12) yield
\[
\alpha \le \beta. \tag{4.3}
\]
We consider a Hamiltonian of the form (3.4) and make the following assumptions.

(B1) The one-particle Hamiltonian $h$ is self-adjoint and bounded from below. Without loss of generality we assume that $h \ge 0$. We also assume that there are constants $\kappa_1, \kappa_2 > 0$ such that
\[
-\Delta \le \kappa_1 h + \kappa_2,
\]
as an inequality of forms on $\mathcal{H}^{(1)}$.

(B2) The Hamiltonian (3.4) is self-adjoint and bounded from below. We also assume that $Q(H_N) \subset X_N$, where $X_N$ is defined as in Assumption (A1).

(B3) There is a constant $\kappa_3 \in (0, 1)$ such that
\[
0 \le (1 - \kappa_3)(h_1 + h_2) + W_{12},
\]
as an inequality of forms on $\mathcal{H}^{(2)}$.

(B4) The interaction potential $w$ is a real and even function satisfying $w \in L^p + L^\infty$, where $p_0 < p \le 2$.

(B5) The solution $\varphi(\cdot)$ of (1.3) satisfies
\[
\varphi(\cdot) \in C(\mathbb{R}; X_1^2 \cap L^\infty) \cap C^1(\mathbb{R}; L^2),
\]
where $X_1^2 := Q(h^2) \subset L^2$ is equipped with the norm $\|\varphi\|_{X_1^2} := \|(\mathbf{1} + h^2)^{1/2} \varphi\|$.

Next, we define the microscopic energy per particle
\[
E_N^\Psi(t) := \frac{1}{N} \bigl\langle \Psi_N, H_N \Psi_N \bigr\rangle \big|_t,
\]
as well as the Hartree energy
\[
E^\varphi(t) := \Bigl[ \langle \varphi, h\, \varphi \rangle + \frac{1}{2} \int dx\, dy\, w(x - y) |\varphi(x)|^2 |\varphi(y)|^2 \Bigr]\Big|_t.
\]
By spectral calculus, $E_N^\Psi(t)$ is independent of $t$. Also, invoking Assumption (B5) to differentiate $E^\varphi(t)$ with respect to $t$ shows that $E^\varphi(t)$ is conserved as well. Summarizing,
\[
E_N^\Psi(t) = E_N^\Psi(0), \qquad E^\varphi(t) = E^\varphi(0), \qquad t \in \mathbb{R}.
\]
We may now state the main result of this section.

Theorem 4.1. Let $\Psi_{N,0} \in Q(H_N)$ and assume that Assumptions (B1)–(B5) hold. Then there is a constant $K$, depending only on $d$, $h$, $w$ and $p$, such that
\[
\beta_N(t) \lesssim \Bigl( \beta_N(0) + E_N^\Psi - E^\varphi + \frac{1}{N^\eta} \Bigr) e^{K \phi(t)},
\]
where
\[
\eta := \frac{p/p_0 - 1}{2p/p_0 - p/2 - 1} \tag{4.4}
\]
and
\[
\phi(t) := \int_0^t ds\, \bigl( 1 + \|\varphi(s)\|_{X_1^2 \cap L^\infty}^3 \bigr).
\]
Remark 4.2. We have convergence to the mean-field limit whenever $\lim_N E_N^\Psi = E^\varphi$ and $\lim_N \beta_N(0) = 0$. For instance if we start in a fully factorized state, $\Psi_{N,0} = \varphi_0^{\otimes N}$, then $\beta_N(0) = 0$ and
\[
E_N^\Psi - E^\varphi = -\frac{1}{2N} \bigl\langle \varphi_0 \otimes \varphi_0, W_{12}\, \varphi_0 \otimes \varphi_0 \bigr\rangle,
\]
so that Theorem 4.1 yields
\[
E_N^{(1)}(t) \le \beta_N(t) \lesssim \frac{1}{N^\eta}\, e^{K \phi(t)},
\]
and the analogue of Corollary 3.2 holds.

Remark 4.3. The following graph shows the dependence of $\eta$ on $p$ for $d = 3$, i.e. $p_0 = 6/5$.
[Figure: plot of $\eta$ against $p$ for $1.2 \le p \le 2$; the vertical axis shows $\eta$ from 0 to 0.5.]
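The dependence of $\eta$ on $p$ can be reproduced directly from (4.1) and (4.4). The following is a minimal sketch using exact rational arithmetic; the function names `p0` and `eta` are illustrative and not notation from the article.

```python
from fractions import Fraction

def p0(d):
    """Exponent p_0 from (4.1): 1/p_0 = 1/2 + 1/d."""
    return 1 / (Fraction(1, 2) + Fraction(1, d))

def eta(p, d=3):
    """Rate exponent eta from (4.4) for p_0 < p <= 2 (p given as int or Fraction)."""
    p = Fraction(p)
    ratio = p / p0(d)
    return (ratio - 1) / (2 * ratio - p / 2 - 1)

for p in (Fraction(6, 5), Fraction(7, 5), Fraction(3, 2), Fraction(9, 5), Fraction(2)):
    print(f"p = {p}: eta = {eta(p)} ~ {float(eta(p)):.3f}")
```

For $d = 3$ this gives $\eta = 0$ at $p = p_0 = 6/5$, $\eta = 1/3$ at $p = 3/2$ and $\eta = 1/2$ at $p = 2$, consistent with the range of the plot.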
Remark 4.4. Theorem 4.1 remains valid for a large class of time-dependent one-particle Hamiltonians $h(t)$. See Sect. 4.4 below for a full discussion.

Remark 4.5. In three dimensions Assumption (B1) and Sobolev's inequality imply that $\|\varphi\|_\infty \lesssim \|\varphi\|_{X_1^2}$, so that Assumption (B5) is equivalent to $\varphi \in C(\mathbb{R}; X_1^2) \cap C^1(\mathbb{R}; L^2)$.

4.2. Example: nonrelativistic particles with interaction potential of critical type. Consider nonrelativistic particles in $\mathbb{R}^3$ with one-particle Hamiltonian $h = -\Delta$. The interaction potential is given by $w(x) = \lambda|x|^{-2}$. This corresponds to a critical nonlinearity of the Hartree equation. We require that $\lambda > -1/2$, which ensures that the $N$-body Hamiltonian is stable and the Hartree equation has global solutions. To see this, recall Hardy's inequality in three dimensions,
\[
\bigl\langle \varphi, |x|^{-2} \varphi \bigr\rangle \le 4 \langle \varphi, -\Delta \varphi \rangle. \tag{4.5}
\]
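Hardy's inequality (4.5) can be illustrated on a concrete radial function. The sketch below assumes a Gaussian test function $\varphi(x) = e^{-|x|^2/2}$ in $\mathbb{R}^3$, uses $\langle\varphi, -\Delta\varphi\rangle = \int |\nabla\varphi|^2$, and approximates the radial integrals by a simple Riemann sum; the grid parameters and the helper name `hardy_ratio` are illustrative assumptions.

```python
import numpy as np

# Radial grid on (0, 12]; the Gaussian is negligible beyond the cutoff.
r = np.linspace(1e-6, 12.0, 200001)
dr = r[1] - r[0]

def hardy_ratio(phi, dphi):
    """Ratio <phi, |x|^-2 phi> / <phi, -Laplacian phi> for a radial function on R^3.

    In spherical coordinates both integrals carry the same factor 4*pi,
    which cancels in the ratio.
    """
    lhs = np.sum(phi(r) ** 2) * dr               # int |phi|^2 r^-2 * r^2 dr
    rhs = np.sum(dphi(r) ** 2 * r ** 2) * dr     # int |phi'|^2 r^2 dr
    return lhs / rhs

gauss_ratio = hardy_ratio(
    lambda s: np.exp(-s ** 2 / 2),
    lambda s: -s * np.exp(-s ** 2 / 2),
)
print("Hardy ratio for the Gaussian:", gauss_ratio)
```

For this Gaussian the exact ratio is $4/3$, comfortably below the constant 4 in (4.5); the constant is only approached by functions concentrating like $|x|^{-1/2}$ near the origin.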
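One easily infers that Assumptions (B1)–(B3) hold. Moreover, Assumption (B4) holds for any $p < 3/2$. In order to verify Assumption (B5) we refer to [5], where local well-posedness is proven. Global existence follows by standard methods using conservation of the mass $\|\varphi\|^2$, conservation of the energy $E^\varphi$, and Hardy's inequality (4.5). Together they yield an a-priori bound on $\|\varphi\|_{X_1}$, from which an a-priori bound for $\|\varphi\|_{X_1^2}$ may be inferred; see [5] for details. We conclude: For any $\eta < 1/3$ there is a continuous function $\phi(t)$ such that Theorem 4.1 holds.

4.3. Proof of Theorem 4.1.

4.3.1. An energy estimate. In the first step of our proof we exploit conservation of energy to derive an estimate on $\|\nabla_1 q_1 \Psi\|$.

Lemma 4.6. Assume that Assumptions (B1)–(B5) hold. Then
\[
\|\nabla_1 q_1 \Psi\|^2 \lesssim E^\Psi - E^\varphi + \bigl( 1 + \|\varphi\|_{X_1^2 \cap L^\infty}^2 \bigr) \Bigl( \beta + \frac{1}{\sqrt{N}} \Bigr).
\]

Proof. Write
\[
E^\varphi = \langle \varphi, h\varphi \rangle + \frac{1}{2} \langle \varphi, W^\varphi \varphi \rangle, \tag{4.6}
\]
as well as
\[
E^\Psi = \langle \Psi, h_1 \Psi \rangle + \frac{N-1}{2N} \langle \Psi, W_{12} \Psi \rangle. \tag{4.7}
\]
Inserting $\mathbf{1} = p_1 p_2 + (\mathbf{1} - p_1 p_2)$ in front of every $\Psi$ in (4.7) and multiplying everything out yields
\[
\begin{aligned}
\langle \Psi, (1 - p_1 p_2) h_1 (1 - p_1 p_2) \Psi \rangle = {} & E^\Psi - \langle \Psi, p_1 p_2 h_1 p_1 p_2 \Psi \rangle - \frac{N-1}{2N} \langle \Psi, p_1 p_2 W_{12}\, p_1 p_2 \Psi \rangle \\
& - \langle \Psi, (1 - p_1 p_2) h_1 p_1 p_2 \Psi \rangle - \langle \Psi, p_1 p_2 h_1 (1 - p_1 p_2) \Psi \rangle \\
& - \frac{N-1}{2N} \langle \Psi, (1 - p_1 p_2) W_{12}\, p_1 p_2 \Psi \rangle - \frac{N-1}{2N} \langle \Psi, p_1 p_2 W_{12} (1 - p_1 p_2) \Psi \rangle \\
& - \frac{N-1}{2N} \langle \Psi, (1 - p_1 p_2) W_{12} (1 - p_1 p_2) \Psi \rangle.
\end{aligned}
\]
We want to find an upper bound for the left-hand side. In order to control the last term on the right-hand side for negative interaction potentials, we need to use some of the kinetic energy on the left-hand side. To this end, we split the left-hand side by multiplying it with $1 = \kappa_3 + (1 - \kappa_3)$. Thus, using (4.6), we get
\[
\begin{aligned}
\kappa_3 \langle \Psi, (1 - p_1 p_2) h_1 (1 - p_1 p_2) \Psi \rangle = {} & E^\Psi - E^\varphi \\
& - \langle \Psi, p_1 p_2 h_1 p_1 p_2 \Psi \rangle + \langle \varphi, h\varphi \rangle \\
& - \frac{N-1}{2N} \langle \Psi, p_1 p_2 W_{12}\, p_1 p_2 \Psi \rangle + \frac{1}{2} \langle \varphi, W^\varphi \varphi \rangle \\
& - \langle \Psi, (1 - p_1 p_2) h_1 p_1 p_2 \Psi \rangle - \langle \Psi, p_1 p_2 h_1 (1 - p_1 p_2) \Psi \rangle \\
& - \frac{N-1}{2N} \langle \Psi, (1 - p_1 p_2) W_{12}\, p_1 p_2 \Psi \rangle - \frac{N-1}{2N} \langle \Psi, p_1 p_2 W_{12} (1 - p_1 p_2) \Psi \rangle \\
& - \frac{N-1}{2N} \langle \Psi, (1 - p_1 p_2) W_{12} (1 - p_1 p_2) \Psi \rangle \\
& - (1 - \kappa_3) \langle \Psi, (1 - p_1 p_2) h_1 (1 - p_1 p_2) \Psi \rangle.
\end{aligned} \tag{4.8}
\]
The rest of the proof consists in estimating each line on the right-hand side of (4.8) separately. There is nothing to be done with the first line.

Lines 6–7. The last two lines of (4.8) are equal to
\[
\begin{aligned}
& - \frac{N-1}{2N} \langle \Psi, (1 - p_1 p_2) W_{12} (1 - p_1 p_2) \Psi \rangle - (1 - \kappa_3) \frac{1}{2} \langle \Psi, (1 - p_1 p_2)(h_1 + h_2)(1 - p_1 p_2) \Psi \rangle \\
& \qquad \le - \frac{N-1}{2N} \bigl\langle \Psi, (1 - p_1 p_2) \bigl[ (1 - \kappa_3)(h_1 + h_2) + W_{12} \bigr] (1 - p_1 p_2) \Psi \bigr\rangle \le 0,
\end{aligned}
\]
where in the last step we used Assumption (B3).

Line 2. The second line on the right-hand side of (4.8) is bounded in absolute value by
\[
\begin{aligned}
\bigl| \langle \varphi, h\varphi \rangle - \langle \Psi, p_1 p_2 h_1 p_1 p_2 \Psi \rangle \bigr| &= \langle \varphi, h\varphi \rangle\, \langle \Psi, (1 - p_1 p_2) \Psi \rangle \\
&= \langle \varphi, h\varphi \rangle\, \langle \Psi, (q_1 p_2 + p_1 q_2 + q_1 q_2) \Psi \rangle \\
&\le 3 \alpha \langle \varphi, h\varphi \rangle \le 3 \beta \langle \varphi, h\varphi \rangle,
\end{aligned}
\]
where in the last step we used (4.3).

Line 3. The third line on the right-hand side of (4.8) is bounded in absolute value by
\[
\begin{aligned}
\Bigl| \frac{1}{2} \langle \varphi, W^\varphi \varphi \rangle - \frac{N-1}{2N} \langle \Psi, p_1 p_2 W_{12}\, p_1 p_2 \Psi \rangle \Bigr| &= \frac{1}{2} \bigl| \langle \varphi, W^\varphi \varphi \rangle \bigr| \, \Bigl| 1 - \frac{N-1}{N} \langle \Psi, p_1 p_2 \Psi \rangle \Bigr| \\
&\le \frac{1}{2} \|W^\varphi\|_\infty \Bigl( \langle \Psi, (q_1 p_2 + p_1 q_2 + q_1 q_2) \Psi \rangle + \frac{1}{N} \langle \Psi, p_1 p_2 \Psi \rangle \Bigr) \\
&\le \frac{3}{2} \|W^\varphi\|_\infty \Bigl( \alpha + \frac{1}{N} \Bigr) \\
&\le \frac{3}{2} \|W^\varphi\|_\infty \Bigl( \beta + \frac{1}{N} \Bigr).
\end{aligned}
\]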
As in (3.20), one finds that W ϕ ∞ w L 1 +L ∞ ϕ2L 2 ∩L ∞ . Line 4. The fourth line on the right-hand side of (4.8) is bounded in absolute value by , (1 − p1 p2 )h 1 p1 p2 = , (q1 p2 + p1 q2 + q1 q2 )h 1 p1 p2 = , q 1 h 1 p1 p2 = , q1 n −1/2 n 1/2 h 1 p1 p2 = , q1 n −1/2 h 1 τ1 n 1/2 p1 p2 , where in the last step we used Lemma 3.10. Using Cauchy-Schwarz, we thus get , (1 − p1 p2 )h 1 p1 p2 , q1 , p1 p2 n −1 τ1 n 1/2 h 21 τ1 n 1/2 p1 p2 τ1 n p1 p2 , n ϕ , h 2 ϕ , = , where in the second step we used Lemma 3.9. Using 1 k+1 n(k) + √ (τ1 n)(k) = N N we find ! 1 , (1 − p1 p2 )h 1 p1 p2 β ϕ , h 2 ϕ , n + √ N 1 = ϕ , h 2 ϕ β β + 1/4 N 1 . 2 ϕ , h 2 ϕ β + √ N Line 5. Finally, we turn our attention to the fifth line on the right-hand side of (4.8), which is bounded in absolute value by , p1 p2 W12 (1 − p1 p2 ) = , p1 p2 W12 ( p1 q2 + q1 p2 + q1 q2 2(a) + (b), where
(a) := , p1 p2 W12 q1 p2 ,
(b) := , p1 p2 W12 q1 q2 .
One finds, using (3.17), Lemma 3.10 and Lemma 3.9, ϕ (a) = , p1 p2 W1 q1 ϕ 1/2 −1/2 = , p1 p2 W 1 n n q1 ϕ −1/2 = , p1 p2 τ1 n 1/2 W1 n q1 W ϕ ∞ , τ1 n , n −1 q1 ! 1 ϕ , n n + √ W ∞ , N 1 . 2W ϕ ∞ β + √ N
The estimation of (b) requires a little more effort. We start by splitting
\[
w = w^{(p)} + w^{(\infty)}, \qquad w^{(p)} \in L^p, \quad w^{(\infty)} \in L^\infty.
\]
This yields (b) (b)( p) + (b)(∞) in self-explanatory notation. Let us first concentrate on (b)(∞) : (∞) (b)(∞) = , p1 p2 W12 q1 q2 (∞) = , p1 p2 W12 n n −1 q1 q2 (∞) −1 = , p1 p2 τ n q1 q2 2 n W12 2 W (∞) ∞ , τ , n −2 q1 q2 n 2 2 √ (∞) α w ∞ α + N 2 . 2w (∞) ∞ β + N Let us now consider (b)( p) . In order to deal with the singularities in w ( p) , we write it as the divergence of a vector field ξ , w ( p) = ∇ · ξ.
(4.9)
This is nothing but a problem of electrostatics, which is solved by
\[
\xi = C \frac{x}{|x|^d} * w^{(p)},
\]
with some constant $C$ depending on $d$. By the Hardy-Littlewood-Sobolev inequality, we find
\[
\|\xi\|_q \lesssim \|w^{(p)}\|_p, \qquad \frac{1}{q} = \frac{1}{p} - \frac{1}{d}. \tag{4.10}
\]
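The exponent relation in (4.10) is easy to tabulate. The following sketch is only an illustration and not part of the argument: it computes $q$ from $p$ and $d$ in exact arithmetic and confirms that $q = 2$ exactly when $p = 2d/(d+2)$. Identifying that threshold with the paper's $p_0$ is an assumption made here, since $p_0$ is defined outside this section; the function names are hypothetical.

```python
from fractions import Fraction


def hls_conjugate(p, d):
    """Return q with 1/q = 1/p - 1/d, the exponent relation in (4.10).

    Requires 1 < p < d so that q is finite and positive.
    """
    p = Fraction(p)
    if not (1 < p < d):
        raise ValueError("need 1 < p < d")
    return 1 / (1 / p - Fraction(1, d))


def threshold_p(d):
    """Exponent p for which q = 2 (assumed here to be the paper's p_0)."""
    return Fraction(2 * d, d + 2)


# Tabulate the threshold and the resulting q for a few dimensions.
for d in (3, 4, 5):
    p0 = threshold_p(d)
    print(d, p0, hls_conjugate(p0, d))
```

Since $1/q$ decreases as $p$ increases, any $p$ above the threshold gives $q \ge 2$, which is the form in which the relation is used next.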
Thus if p p0 then q 2. Denote by X 12 multiplication by ξ(x1 − x2 ). For the following it is convenient to write ∇ · ξ = ∇ ρ ξ ρ , where a summation over ρ = 1, . . . , d is implied. Recalling Lemma 3.10, we therefore get ( p) (b)( p) = , p1 p2 W12 n n −1 q1 q2 ( p) −1 = , p1 p2 τ n q1 q2 2 n W12 ρ ρ = , p1 p2 τ n −1 q1 q2 . 2 n (∇ X )12 1
Integrating by parts yields ρ ρ (b)( p) ∇1 τ n −1 q1 q2 2 n p1 p2 , X 12 ρ ρ −1 + τ n q 1 q 2 . 2 n p1 p2 , X ∇ 12 1
(4.11)
Let us begin by estimating the first term. Recalling that p = |ϕ ϕ|, we find that the first term on the right-hand side of (4.11) is equal to ρ X p2 (∇ ρ p)1 τ n −1 q1 q2 2 n , 12 −1 ρ σ σ n q1 q2 (∇ ρ p)1 τ 2 n , p2 X 12 X 12 p2 (∇ p)1 τ 2n −1 n q1 q2 |ϕ|2 ∗ ξ 2 ∞ ∇ϕ τ 2n 2 √ ξ q ϕ L 2 ∩L ∞ ϕ X 1 α + α, N where we used Young’s inequality, Assumption (B1), and Lemma 3.9. Recalling that β α, we conclude that the first term on the right-hand side of (4.11) is bounded by 1 C ϕ2X 1 ∩L ∞ β + . N Next, we estimate the second term on the right-hand side of (4.11). It is equal to ρ ρ −1 2 n −1 q1 q2 X p1 p2 τ n q1 q2 τ 2 n , ∇1 2 n , p1 p2 X 12 p1 p2 τ 2 n ∇1 12 |ϕ|2 ∗ ξ 2 ∞ τ2 n ∇1 n −1 q1 q2 2 ξ q ϕ L 2 ∩L ∞ α + ∇1 n −1 q1 q2 . N We estimate ∇1 n −1 q1 q2 by introducing 1 = p1 + q1 on the left. The term arising from p1 is bounded by p 1 ∇1 n −1 q1 q2 = p1 q2 τ1 n −1 ∇1 q1 ∇1 q 1 , q 2 τ1 n −2 ∇1 q1 " # N # 1 = $ ∇1 q 1 , qi τ1 n −2 ∇1 q1 N −1 i=2 " # N # 1 qi τ1 n −2 ∇1 q1 $ ∇1 q 1 , N i=1 = ∇1 q1 , n2 τ1 n −2 ∇1 q1 ∇1 q1 . The term arising from q1 in the above splitting is dealt with in exactly the same way. Thus we have proven that the second term on the right-hand side of (4.11) is bounded by 1 Cϕ L 2 ∩L ∞ β + ∇1 q1 . N
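Summarizing, we have
\[
(b)^{(p)} \lesssim \|\varphi\|^2_{X_1 \cap L^\infty}\Bigl(\beta + \frac{1}{N}\Bigr) + \|\varphi\|_{L^2 \cap L^\infty}\sqrt{\beta + \frac{1}{N}}\,\|\nabla_1 q_1 \Psi\|.
\]

Conclusion of the proof. Putting all the estimates of the right-hand side of (4.8) together, we find
\[
\langle \Psi, (1 - p_1 p_2) h_1 (1 - p_1 p_2) \Psi\rangle \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|^2_{X_1^2 \cap L^\infty}\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr) + \|\varphi\|_{L^2 \cap L^\infty}\sqrt{\beta + \frac{1}{N}}\,\|\nabla_1 q_1 \Psi\|. \tag{4.12}
\]
Next, from $1 - p_1 p_2 = p_1 q_2 + q_1$ we deduce
\[
\|h_1^{1/2} q_1 \Psi\| = \|h_1^{1/2}(1 - p_1 p_2)\Psi - h_1^{1/2} p_1 q_2 \Psi\| \le \|h_1^{1/2}(1 - p_1 p_2)\Psi\| + \|h_1^{1/2} p_1 q_2 \Psi\|.
\]
Now, recalling that $p = |\varphi\rangle\langle\varphi|$, we find
\[
\|h_1^{1/2} p_1 q_2 \Psi\| \le \|h_1^{1/2} p_1\|\,\|q_2 \Psi\| \le \|\varphi\|_{X_1}\sqrt{\beta}.
\]
Therefore,
\[
\|h_1^{1/2} q_1 \Psi\|^2 \lesssim \|h_1^{1/2}(1 - p_1 p_2)\Psi\|^2 + \|\varphi\|^2_{X_1}\beta.
\]
Plugging in (4.12) yields
\[
\|h_1^{1/2} q_1 \Psi\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|^2_{X_1^2 \cap L^\infty}\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr) + \|\varphi\|_{L^2 \cap L^\infty}\sqrt{\beta + \frac{1}{N}}\,\|\nabla_1 q_1 \Psi\|.
\]
Next, we observe that Assumption (B1) implies
\[
\|\nabla_1 q_1 \Psi\| \lesssim \|h_1^{1/2} q_1 \Psi\| + \sqrt{\beta},
\]
so that we get
\[
\|h_1^{1/2} q_1 \Psi\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|^2_{X_1^2 \cap L^\infty}\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr) + \|\varphi\|_{L^2 \cap L^\infty}\sqrt{\beta + \frac{1}{N}}\,\|h_1^{1/2} q_1 \Psi\|.
\]
Now we claim that
\[
\|h_1^{1/2} q_1 \Psi\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|^2_{X_1^2 \cap L^\infty}\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr). \tag{4.13}
\]
This follows from the general estimate
\[
x^2 \le C(R + ax) \quad\Longrightarrow\quad x^2 \le 2CR + C^2 a^2,
\]
which itself follows from the elementary inequality
\[
C(R + ax) \le CR + \frac{1}{2}C^2 a^2 + \frac{1}{2}x^2.
\]
The claim of the lemma now follows from (4.13) by using Assumption (B1).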
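The absorption step used at the end of this proof, $x^2 \le C(R + ax) \Rightarrow x^2 \le 2CR + C^2a^2$, can be sanity-checked numerically. The sketch below is an illustration with hypothetical function names, not part of the paper: it computes the largest $x \ge 0$ admitted by the hypothesis (the positive root of $x^2 - Cax - CR = 0$) and checks that even this extremal value satisfies the conclusion.

```python
import random


def largest_admissible_x(C, R, a):
    """Largest x >= 0 with x^2 <= C*(R + a*x).

    This is the positive root of x^2 - C*a*x - C*R = 0.
    """
    return (C * a + (C * C * a * a + 4.0 * C * R) ** 0.5) / 2.0


def implication_holds(C, R, a, x):
    """Check that x^2 <= C(R + a x) implies x^2 <= 2 C R + C^2 a^2."""
    if x * x <= C * (R + a * x):
        rhs = 2.0 * C * R + C * C * a * a
        # Small relative slack guards against floating-point rounding.
        return x * x <= rhs * (1.0 + 1e-12)
    return True  # hypothesis fails, so there is nothing to check


# Spot-check the extremal x for a few random parameter choices.
random.seed(1)
for _ in range(5):
    C, R, a = (random.uniform(0.1, 10.0) for _ in range(3))
    x = largest_admissible_x(C, R, a)
    print(round(x * x, 4), "<=", round(2 * C * R + (C * a) ** 2, 4))
```

Checking the extremal $x$ suffices because the conclusion's right-hand side does not depend on $x$.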
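4.3.2. A bound on $\dot\beta$. We start exactly as in Sect. 3. Assumptions (B1) – (B5) imply that $\beta$ is differentiable in $t$ with derivative
\[
\dot\beta = \frac{i}{2}\bigl\langle \Psi, \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi, \widehat{n}\bigr] \Psi\bigr\rangle = 2(\mathrm{I}) + 2(\mathrm{II}) + (\mathrm{III}) + \text{complex conjugate}, \tag{4.14}
\]
where
\begin{align*}
(\mathrm{I}) &:= \frac{i}{2}\bigl\langle \Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi, \widehat{n}\bigr] q_1 p_2 \Psi\bigr\rangle, \\
(\mathrm{II}) &:= \frac{i}{2}\bigl\langle \Psi, q_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi, \widehat{n}\bigr] q_1 q_2 \Psi\bigr\rangle, \\
(\mathrm{III}) &:= \frac{i}{2}\bigl\langle \Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi, \widehat{n}\bigr] q_1 q_2 \Psi\bigr\rangle.
\end{align*}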
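Term (I). Using (3.17) we find
\begin{align*}
2|(\mathrm{I})| &= \bigl|\bigl\langle \Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi, \widehat{n}\bigr] q_1 p_2 \Psi\bigr\rangle\bigr| \\
&= \bigl|\bigl\langle \Psi, p_1 p_2 \bigl[W_1^\varphi, \widehat{n}\bigr] q_1 p_2 \Psi\bigr\rangle\bigr| \\
&= \bigl|\bigl\langle \Psi, p_1 p_2 W_1^\varphi \bigl(\widehat{n} - \widehat{\tau_{-1} n}\bigr) q_1 p_2 \Psi\bigr\rangle\bigr|,
\end{align*}
where we used Lemma 3.10. Define
\[
\mu(k) := N\bigl(n(k) - (\tau_{-1} n)(k)\bigr) = \frac{\sqrt{N}}{\sqrt{k} + \sqrt{k-1}} \le n^{-1}(k), \qquad k = 1, \dots, N. \tag{4.15}
\]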
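The closed form and the bound in (4.15) can be checked directly. The sketch below is an illustration only; it assumes $n(k) = \sqrt{k/N}$, which is consistent with the identity $(\tau_1 n)(k) = \sqrt{(k+1)/N}$ used in the proof of Lemma 4.6 but is defined outside this section. The function names are hypothetical.

```python
import math


def n(k, N):
    """Weight n(k) = sqrt(k / N) (assumed definition, see lead-in)."""
    return math.sqrt(k / N)


def mu(k, N):
    """mu(k) = N * (n(k) - n(k - 1)), the left-hand side of (4.15)."""
    return N * (n(k, N) - n(k - 1, N))


def mu_closed_form(k, N):
    """sqrt(N) / (sqrt(k) + sqrt(k - 1)), the closed form in (4.15)."""
    return math.sqrt(N) / (math.sqrt(k) + math.sqrt(k - 1))


# Compare the definition, the closed form, and the bound 1 / n(k).
N = 10
for k in range(1, N + 1):
    print(k, mu(k, N), mu_closed_form(k, N), 1.0 / n(k, N))
```

The bound $\mu(k) \le n^{-1}(k)$ follows from dropping $\sqrt{k-1}$ in the denominator, with equality at $k = 1$.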
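Thus,
\begin{align*}
2|(\mathrm{I})| &= \frac{1}{N}\bigl|\bigl\langle \Psi, p_1 p_2 W_1^\varphi\, \widehat{\mu}\, q_1 p_2 \Psi\bigr\rangle\bigr| \\
&\le \frac{1}{N}\|W^\varphi\|_\infty \sqrt{\langle \Psi, \widehat{\mu}^2 q_1 \Psi\rangle} \\
&\le \frac{1}{N}\|W^\varphi\|_\infty \sqrt{\langle \Psi, \widehat{n}^{-2} q_1 \Psi\rangle} \\
&\lesssim \frac{1}{N}\|\varphi\|^2_{L^2 \cap L^\infty},
\end{align*}
by (3.13).

Term (II). Using Lemma 3.10 we find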
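\begin{align}
2|(\mathrm{II})| &= \bigl|\bigl\langle \Psi, q_1 p_2 \bigl[(N-1)W_{12} - N W_2^\varphi, \widehat{n}\bigr] q_1 q_2 \Psi\bigr\rangle\bigr| \tag{4.16} \\
&= \Bigl|\Bigl\langle \Psi, q_1 p_2 \Bigl(\frac{N-1}{N}W_{12} - W_2^\varphi\Bigr) \widehat{\mu}\, q_1 q_2 \Psi\Bigr\rangle\Bigr| \tag{4.17} \\
&\le \underbrace{\bigl|\langle \Psi, q_1 p_2 W_{12}\, \widehat{\mu}\, q_1 q_2 \Psi\rangle\bigr|}_{=:\,(a)} + \underbrace{\bigl|\langle \Psi, q_1 p_2 W_2^\varphi\, \widehat{\mu}\, q_1 q_2 \Psi\rangle\bigr|}_{=:\,(b)}. \tag{4.18}
\end{align}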
One immediately finds (b) W ϕ ∞ q1 , μ2 q1 q2 ϕ2L 2 ∩L ∞ β. In (a) we split w = w ( p) + w (∞) ,
w ( p) ∈ L p , w (∞) ∈ L ∞ ,
with a resulting splitting (a) (a)( p) + (a)(∞) . The easy part is (a)(∞) w (∞) ∞ q1 2 β. In order to deal with (a)( p) we write w ( p) = ∇ · ξ as the divergence of a vector field ξ , exactly as in the proof of Lemma 4.6; see (4.9) and the remarks after it. We integrate by parts to find ρ (a)( p) = , q1 p2 (∇1 X ρ )12 μ q1 q2 ρ ρ ρ ρ ∇1 q1 p2 , X 12 μ q1 q2 + q1 p2 , X 12 ∇1 μ q 1 q 2 .
(4.19)
The first term of (4.19) is equal to ρ ρ ρ σ p ∇σ q X p2 ∇ ρ q 1 , μ q1 q2 ∇1 q1 , p2 X 12 X 12 , μ2 q1 q2 2 1 1 12 1 n −2 q1 q2 ξ 2 ∗ |ϕ|2 ∞ ∇1 q1 , N 2 2 , n2 ξ ∗ |ϕ| ∞ ∇1 q1 N −1 ξ q ϕ L 2 ∩L ∞ ∇1 q1 β ∇1 q1 2 ϕ L 2 ∩L ∞ + β ϕ L 2 ∩L ∞ , where in the second step we used (4.15), in the third Lemma 3.9, and in the last (4.3), Young’s inequality, and (4.10). The second term of (4.19) is equal to q1 p2 , X ρ ( p1 + q1 )∇ ρ 12 1 μ q1 q2 ρ ρ ρ ρ q1 p2 , X 12 p1 τ μ ∇1 q1 q2 , (4.20) 1 μ ∇1 q1 q2 + q1 p2 , X 12 q1
where we used Lemma 3.10. We estimate the first term of (4.20). The second term is dealt with in exactly the same way. We find ρ p1 X ρ q1 p2 , τ 1 μ ∇1 q 1 q 2 12 2 p q 2 , q1 p2 X 12 ∇1 q1 , q2 τ 2 1 1 μ q 2 ∇1 q 1 ξ 2 ∗ |ϕ|2 ∞ q1 ∇1 q1 , n −2 q2 ∇1 q1 " # N √ # 1 ∇1 q1 , n −2 qi ∇1 q1 ξ q ϕ L 2 ∩L ∞ α $ N −1 i=2 " # N # 1 ∇1 q1 , n −2 qi ∇1 q1 ϕ L 2 ∩L ∞ β $ N −1 i=1 N ∇1 q1 , n −2 n 2 ∇1 q 1 = ϕ L 2 ∩L ∞ β N −1 ϕ L 2 ∩L ∞ β ∇1 q1 β ϕ L 2 ∩L ∞ + ∇1 q1 2 ϕ L 2 ∩L ∞ . In summary, we have proven that (II) β ϕ L 2 ∩L ∞ + ∇1 q1 2 ϕ L 2 ∩L ∞ . Term (III). Using Lemma 3.10 we find
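\[
2|(\mathrm{III})| = (N-1)\bigl|\bigl\langle \Psi, p_1 p_2 \bigl[W_{12}, \widehat{n}\bigr] q_1 q_2 \Psi\bigr\rangle\bigr| = (N-1)\bigl|\bigl\langle \Psi, p_1 p_2 W_{12}\bigl(\widehat{n} - \widehat{\tau_{-2} n}\bigr) q_1 q_2 \Psi\bigr\rangle\bigr|.
\]
Defining
\[
\nu(k) := N\bigl(n(k) - (\tau_{-2} n)(k)\bigr) = \frac{2\sqrt{N}}{\sqrt{k} + \sqrt{k-2}} \lesssim n^{-1}(k), \qquad k = 2, \dots, N, \tag{4.21}
\]
we have
\[
2|(\mathrm{III})| \le \bigl|\bigl\langle \Psi, p_1 p_2 W_{12}\, \widehat{\nu}\, q_1 q_2 \Psi\bigr\rangle\bigr|.
\]
As usual we start by splitting
\[
w = w^{(p)} + w^{(\infty)}, \qquad w^{(p)} \in L^p, \quad w^{(\infty)} \in L^\infty,
\]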
with the induced splitting (III) = (III)( p) + (III)(∞) . Thus, using Lemma 3.10, we find (∞) 1/2 −1/2 n n ν q1 q2 2 (III)(∞) = , p1 p2 W12 (∞) −1/2 1/2 = , p1 p2 τ W12 n ν q1 q2 2n w (∞) ∞ , τ , n −1 n ν 2 q1 q2 2 ! 2 , n −3 q1 q2 β+ N ! 2 N β+ β N N −1 1 β+√ , N where in the fifth step we used Lemma 3.9. In order to estimate (III)( p) we introduce a splitting of w ( p) into “singular” and “regular” parts, w ( p) = w ( p,1) + w ( p,2) := w ( p) 1{|w( p) |>a} + w ( p) 1{|w( p) |a} ,
(4.22)
where $a$ is a positive ($N$-dependent) constant we choose later. For future reference we record the estimates
\begin{align}
\|w^{(p,1)}\|_{p_0} &\le a^{1 - p/p_0}\,\|w^{(p)}\|_p^{p/p_0}, \tag{4.23a} \\
\|w^{(p,2)}\|_2 &\le a^{1 - p/2}\,\|w^{(p)}\|_p^{p/2}. \tag{4.23b}
\end{align}
The proof of (4.23) is elementary; for instance (4.23a) follows from p p −p p w ( p,1) p00 = dx w ( p) w ( p) 0 1{|w( p) |>a} ( p) p p p0 − p p0 − p a 1{|w( p) |>a} a dx w dx w ( p) . Let us start with (III)( p,1) . As in (4.9), we use the representation w ( p,1) = ∇ · ξ. Then (4.10) and (4.23a) imply that ξ 2 w ( p,1) p0 a 1− p/ p0 . Integrating by parts, we find ( p,1) ν q1 q2 2 (III)( p,1) = , p1 p2 W12 ρ ρ = , p1 p2 (∇1 X 12 ) ν q1 q2 ρ ρ ρ ρ ∇1 p1 p2 , X 12 ν q1 q2 + p1 p2 , X 12 ∇1 ν q 1 q 2 .
(4.24)
(4.25)
Using ∇ p = ∇ϕ and Lemma 3.9 we find that the first term of (4.25) is bounded by
ρ
ρ
σ p ∇σ p ∇1 p1 , p2 X 12 X 12 2 1 1
√ , ν 2 q1 q2 ∇ p ϕ∞ ξ 2 α ∇ϕ ϕ∞ a 1− p/ p0 β ∇ϕ ϕ∞ β + a 2−2 p/ p0 ,
where in the second step we used the estimate (4.24). Next, using Lemma 3.10, we find that the second term of (4.25) is equal to p1 p2 , X ρ ( p1 + q1 )∇ ρ 12 1 ν q1 q2 ρ ρ ρ ρ p1 p2 , X 12 p1 τ ν ∇1 q1 q2 . 1 ν ∇1 q1 q2 + p1 p2 , X 12 q1 We estimate the first term (the second is dealt with in exactly the same way): ρ 2 p1 p2 , X ρ p1 τ , p1 p2 X 2 p1 p2 ∇1 q1 , τ ν ∇ q q 1 1 2 1 ν q 2 ∇1 q 1 12 12 1 " # N # 1 2 $ p2 X 12 p2 ∇1 q1 , n −2 qi ∇1 q1 N −1 i=2 " # N # 1 ∇1 q1 , n −2 qi ∇1 q1 ξ 2 ϕ∞ $ N −1 i=1 N ∇1 q1 , ∇1 q1 a 1− p/ p0 ϕ∞ N −1 2−2 p/ p 0 ϕ∞ a + ∇1 q1 2 . Summarizing, (III)( p,1) ϕ∞ βϕ X + ∇1 q1 2 + a 2−2 p/ p0 ϕ X . 1 1 Finally, we estimate ( p,2) ( p,2) (III)( p,2) = , p1 p2 W12 ν q1 q2 = , p1 p2 W12 ν ( χ (1) + χ (2) )q1 q2 , (4.26) where 1 = χ (1) + χ (2) ,
χ (1) , χ (2) ∈ {0, 1}{0,...,N } ,
is some partition of the unity to be chosen later. The need for this partitioning will soon become clear. In order to bound the term with χ (1) , we note that the operator norm of ( p,2) p1 p2 W12 q1 q2 on the full space L 2 (Rd N ) is much larger than on its symmetric sub( p,2) space. Thus, as a first step, we symmetrize the operator p1 p2 W12 q1 q2 in coordinate
2. We get the bound , p1 p2 W ( p,2) ν χ (1) q1 q2 12 N 1 ( p,2) (1) , = p1 pi W1i qi q1 χ ν q1 N −1 i=2 " # N # 1 ( p,2) ( p−2) ν q 1 $ , p1 pi W1i q1 qi χ (1) q1 q j W1 j p j p1 . N −1 i, j=2
Using n −1 q1 1 ν q1 we find , p1 p2 W ( p,2) ν χ (1) q1 q2 12
1 √ A + B, N −1
where
A :=
( p,2)
, p1 pi W1i
χ (1) q j W1 j q1 qi
( p,2)
p j p1 ,
2i= j N
B :=
N
( p,2)
, p1 pi W1i
( p,2) χ (1) W1i pi p1 . q1 qi
i=2
The easy part is B
N
( p,2) 2 , p1 pi W1i pi p1
i=2
N ( p,2) 2 w ∗ |ϕ|2 ∞ , p1 pi i=2
(N − 1)ϕ2∞ w ( p,2) 22 N a 2− p ϕ2∞ . Let us therefore concentrate on A=
2i= j N
=
2i= j N
= A1 + A2 ,
( p,2)
, p1 pi W1i
( p,2) χ (1) χ (1) q j W1 j p j p1 q1 qi
(1) W ( p,2) q W ( p,2) τ (1) q p p , p1 pi q j τ 2χ 1 1j 2χ i j 1 1i
(4.27)
with A = A1 + A2 arising from the splitting q1 = 1 − p1 . We start with |A1 |
1i
2i = j N
=
1j
(1) W ( p,2) W ( p,2) W ( p,2) W ( p,2) τ (1) q p p , p1 pi q j τ χ 2 2χ i j 1 1i
2i = j N
(1) W ( p,2) W ( p,2) τ (1) q p p , p1 pi q j τ 2χ 2χ i j 1
1j
1i
1j
(1) q p p W ( p,2) W ( p,2) p p q τ (1) , , τ 2χ j 1 i 1 i j 2χ 1i 1j
2i = j N
√ by Cauchy-Schwarz and symmetry of . Here · is any complex square root. In order to estimate this we claim that, for i = j, ( p,2) ( p,2) 2 p1 pi W1i W1 j p1 pi w ( p,2) ∗ |ϕ|2 ∞ .
(4.28)
Indeed, by (3.17), we have ( p,2) ( p,2) ( p,2) ( p,2) p1 pi W1i W1 j p1 pi = p1 pi W1i pi W1 j p1 ( p,2) = p1 pi w ( p,2) ∗ |ϕ|2 1 W1 j p1 . ( p,2) The operator p1 w ( p,2) ∗ |ϕ|2 1 W1 j p1 is equal to f j p1 , where f (x j ) = dx1 ϕ(x1 ) w ( p,2) ∗ |ϕ|2 (x1 ) w ( p,2) (x1 − x j ) ϕ(x1 ). Thus, 2 f ∞ w ( p,2) ∗ |ϕ|2 ∞ , from which (4.28) follows immediately. Using (4.28), we get 2 w ( p,2) ∗ |ϕ|2 2 τ |A1 | χ (1) q1 ∞ 2 2i= j N
(1) q N 2 w ( p) 2p ϕ4L 2 ∩L ∞ , τ 2χ 1 (1) n2 . N 2 ϕ4L 2 ∩L ∞ , τ 2χ Now let us choose χ (1) (k) := 1{k N 1−δ } for some δ ∈ (0, 1). Then (τ2 χ (1) ) n 2 N −δ implies |A1 | ϕ4L 2 ∩L ∞ N 2−δ .
(4.29)
Similarly, we find
|A2 |
(1) p p W ( p,2) p W ( p,2) p p τ (1) q , q j τ 2χ i 1 1i 1 1j 1 j 2χ i
2i= j N
2i= j N
( p,2) 2 (1) q w ∗ |ϕ|2 ∞ , τ 2χ 1
N 2 ϕ4L 2 ∩L ∞ N −δ = ϕ4L 2 ∩L ∞ N 2−δ . Thus we have proven |A| ϕ4L 2 ∩L ∞ N 2−δ . Going back to (4.27), we see that , p1 p2 W ( p,2) ν χ (1) q1 q2 ϕ2L 2 ∩L ∞ N −δ/2 + ϕ∞ N −1/2 a 1− p/2 . 12 What remains is to estimate is the term of (III)( p,2) containing χ (2) , , p1 p2 W ( p,2) ν χ (2) q1 q2 12 N 1 ( p,2) 1/2 1/2 (2) , = χ p p W q q ν ν q 1 i 1i i 1 1 N −1 i=2 " # N # 1 ( p,2) ( p−2) 1/2 ν q1 $ ν q 1 q j W1 j p j p1 . , p1 pi W1i q1 qi χ (2) N −1 i, j=2
Using
1/2 ν q1 , n −1 n 2 = β
we find , p1 p2 W ( p,2) ν χ (2) q1 q2 12
where A :=
( p,2)
, p1 pi W1i
√
β √ A + B, N −1
χ (2) q1 qi ν q j W1 j
( p,2)
p j p1 ,
2i= j N
B :=
N
( p,2)
, p1 pi W1i
( p,2) χ (2) q1 qi ν W1i pi p1 .
i=2
Since χ (2) (k) = 1{k>N 1−δ } we find χ (2) ν χ (2) n −1 N δ/2 .
(4.30)
χ (2) Thus, q1 qi ν N δ/2 and we get B N δ/2
N
( p,2) 2 2 , p1 pi W1i pi p1 N 1+δ/2 w ( p,2) ∗ |ϕ|2 ∞
i=2 1+δ/2 N w ( p,2) 22 ϕ2∞
N 1+δ/2 a 2− p ϕ2∞ ,
by (4.23b). Next, using Lemma 3.10, we find ( p,2) (2) 1/2 ( p,2) A= , p1 pi q j W1i χ χ (2) ν q1 ν 1/2 W1 j qi p j p1 2i= j N
=
(2) τν 1/2 W ( p,2) q W ( p,2) τ (2) τν 1/2 q p p , p1 pi q j τ 2χ 2 1 1j 2χ 2 i j 1 1i
2i= j N
= A1 + A2 , where, as above, the splitting A = A1 + A2 arises from writing q1 = 1 − p1 . Thus, (2) τν 1/2 W ( p,2) W ( p,2) τ (2) τν 1/2 q p p , p1 pi q j τ |A1 | 2χ 2 2χ 2 i j 1 1i 1j 2i= j N
=
( p,2) ( p,2) ( p,2) 1/2 (2) p1 pi q j τ τ W1i W1 j W1i 2χ 2ν
2i= j N
( p,2) (2) τν 1/2 q p p × W1 j τ 2χ 2 i j 1 (2) τν 1/2 p p W ( p,2) W ( p,2) p p τ (2) τν 1/2 q , , q j τ 2χ 2 1 i i 1 2χ 2 j 1i 1j 2i= j N
by Cauchy-Schwarz and symmetry of . Using (4.28) we get 2 |A1 | N 2 w ( p,2) ∗ |ϕ|2 ∞ , τ 2 ν q1 N 2 w ( p,2) 2p ϕ4L 2 ∩L ∞ , n
N 2 ϕ4L 2 ∩L ∞ β. Similarly, |A2 |
2i= j N
2i= j N
(2) τν 1/2 p W ( p,2) p W ( p,2) p τ (2) τν 1/2 q p , pi q j τ 2χ 2 1 1i 1 1j 1 2χ 2 i j ( p,2) 2 w ∗ |ϕ|2 ∞ , τ 2 ν q1
n N 2 w ( p) 2p ϕ4L 2 ∩L ∞ , N 2 ϕ4L 2 ∩L ∞ β. Plugging all this back into (4.30), we find that , p1 p2 W ( p,2) ν χ (2) q1 q2 β ϕ2L 2 ∩L ∞ + ϕ∞ + ϕ∞ a 2− p N δ/2−1 . 12
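Summarizing:
\[
(\mathrm{III})^{(p,2)} \lesssim \bigl(1 + \|\varphi\|^2_{L^2 \cap L^\infty}\bigr)\bigl(\beta + a^{2-p} N^{\delta/2 - 1} + N^{-\delta/2} + N^{-1/2} a^{1 - p/2}\bigr),
\]
from which we deduce
\[
(\mathrm{III})^{(p)} \lesssim \|\varphi\|_\infty \|\nabla_1 q_1 \Psi\|^2 + \bigl(1 + \|\varphi\|_{X_1 \cap L^\infty}\bigr)\bigl(\beta + a^{2-p} N^{\delta/2 - 1} + N^{-\delta/2} + N^{-1/2} a^{1 - p/2} + a^{2 - 2p/p_0}\bigr).
\]
Let us set $a \equiv a_N = N^\zeta$ and optimize in $\delta$ and $\zeta$. This yields the relations
\[
\zeta(2 - p) + \delta = 1, \qquad -\frac{\delta}{2} = 2\zeta\Bigl(1 - \frac{p}{p_0}\Bigr),
\]
which imply
\[
\frac{\delta}{2} = \frac{p/p_0 - 1}{2p/p_0 - p/2 - 1},
\]
with $\delta \le 1$. Thus,
\[
(\mathrm{III})^{(p)} \lesssim \|\varphi\|_\infty \|\nabla_1 q_1 \Psi\|^2 + \bigl(1 + \|\varphi\|_{X_1 \cap L^\infty}\bigr)\bigl(\beta + N^{-\eta}\bigr),
\]
where $\eta = \delta/2$ satisfies (4.4).

Conclusion of the proof. We have shown that
\[
\dot\beta \lesssim \|\varphi\|_{L^2 \cap L^\infty} \|\nabla_1 q_1 \Psi\|^2 + \bigl(1 + \|\varphi\|_{X_1 \cap L^\infty}\bigr)\bigl(\beta + N^{-\eta}\bigr).
\]
Using Lemma 4.6 we find
\[
\dot\beta \lesssim \bigl(1 + \|\varphi\|^3_{X_1^2 \cap L^\infty}\bigr)\Bigl(\beta + E^\Psi - E^\varphi + \frac{1}{N^\eta}\Bigr). \tag{4.31}
\]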
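The optimization in $\delta$ and $\zeta$ carried out above amounts to balancing the exponents of $N$ in the four error terms once $a = N^\zeta$ is inserted. The following sketch is an illustration in exact rational arithmetic, not the authors' computation: it solves the two balance relations and checks that all four exponents then equal $-\delta/2$. The sample values of $p$ and $p_0$ are arbitrary choices with $p_0 < p \le 2$, and the function names are hypothetical.

```python
from fractions import Fraction


def optimal_exponents(p, p0):
    """Solve zeta*(2 - p) + delta = 1 and -delta/2 = 2*zeta*(1 - p/p0).

    Requires p > p0, otherwise the system is degenerate.
    Returns (delta, zeta) as exact fractions.
    """
    p, p0 = Fraction(p), Fraction(p0)
    r = p / p0
    delta = 4 * (r - 1) / (4 * r - p - 2)
    zeta = delta / (4 * (r - 1))
    return delta, zeta


def error_exponents(p, p0, delta, zeta):
    """Exponents of N in the four error terms after setting a = N**zeta."""
    p, p0 = Fraction(p), Fraction(p0)
    return [
        zeta * (2 - p) + delta / 2 - 1,        # a^(2-p) N^(delta/2 - 1)
        -delta / 2,                            # N^(-delta/2)
        Fraction(-1, 2) + zeta * (1 - p / 2),  # N^(-1/2) a^(1 - p/2)
        2 * zeta * (1 - p / p0),               # a^(2 - 2p/p0)
    ]


delta, zeta = optimal_exponents(Fraction(3, 2), Fraction(6, 5))
print(delta, zeta, error_exponents(Fraction(3, 2), Fraction(6, 5), delta, zeta))
```

With this choice every error term decays at the same rate $N^{-\delta/2}$, which is why $\eta = \delta/2$.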
The claim then follows from the Grönwall estimate (3.3).

4.4. A remark on time-dependent external potentials. Theorem 4.1 can be extended to time-dependent external potentials $h(t)$ with little extra effort. The only complication is that energy is no longer conserved. We overcome this problem by observing that, while the energies $E^\Psi(t)$ and $E^\varphi(t)$ exhibit large variations in $t$, their difference remains small. In the following we estimate the quantity $E^\Psi(t) - E^\varphi(t)$ by controlling its time derivative. We need the following assumptions, which replace Assumptions (B1) – (B3).

(B1') The Hamiltonian $h(t)$ is self-adjoint and bounded from below. We assume that there is an operator $h_0 \ge 0$ such that $0 \le h(t) \le h_0$ for all $t$. We define the Hilbert space $X_N = Q\bigl(\sum_i (h_0)_i\bigr)$ as in (A1), and the space $X_1^2 = Q(h_0^2)$ as in (B5) using $h_0$. We also assume that there are time-independent constants $\kappa_1, \kappa_2 > 0$ such that $-\Delta \le \kappa_1 h(t) + \kappa_2$ for all $t$. We make the following assumptions on the differentiability of $h(t)$. The map $t \mapsto \langle \psi, h(t)\psi\rangle$ is continuously differentiable for all $\psi \in X_1$, with derivative $\langle \psi, \dot h(t)\psi\rangle$ for some self-adjoint operator $\dot h(t)$. Moreover, we assume that the quantities
\[
\langle \varphi(t), \dot h(t)^2 \varphi(t)\rangle, \qquad \bigl\|(1 + h(t))^{-1/2}\, \dot h(t)\, (1 + h(t))^{-1/2}\bigr\|
\]
are continuous and finite for all $t$.

(B2') The Hamiltonian $H_N(t)$ is self-adjoint and bounded from below. We assume that $Q(H_N(t)) \subset X_N$ for all $t$. We also assume that the $N$-body propagator $U_N(t, s)$, defined by
\[
i\partial_t U_N(t, s) = H_N(t) U_N(t, s), \qquad U_N(s, s) = 1,
\]
exists and satisfies $U_N(t, 0)\Psi_{N,0} \in Q(H_N(t))$ for all $t$.

(B3') There is a time-independent constant $\kappa_3 \in (0, 1)$ such that
\[
0 \le (1 - \kappa_3)(h_1(t) + h_2(t)) + W_{12}
\]
for all $t$.

Theorem 4.7. Assume that Assumptions (B1') – (B3'), (B4), and (B5) hold. Then there is a continuous nonnegative function $\phi$, independent of $N$ and $\Psi_{N,0}$, such that
\[
\beta_N(t) \le \phi(t)\Bigl(\beta_N(0) + E_N^\Psi(0) - E^\varphi(0) + \frac{1}{N^\eta}\Bigr),
\]
with $\eta$ defined in (4.4).

Proof. We start by deriving an upper bound on the energy difference $\mathcal{E}(t) := E^\Psi(t) - E^\varphi(t)$. Assumptions (B1') and (B2') and the fundamental theorem of calculus imply
\[
\mathcal{E}(t) = \mathcal{E}(0) + \int_0^t ds\, \underbrace{\bigl(\langle \Psi(s), \dot h_1(s)\Psi(s)\rangle - \langle \varphi(s), \dot h(s)\varphi(s)\rangle\bigr)}_{=:\,G(s)}.
\]
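By inserting $1 = p_1(s) + q_1(s)$ on both sides of $\dot h_1(s)$ we get (omitting the time argument $s$)
\[
G = \langle \Psi, p_1 \dot h_1 p_1 \Psi\rangle - \langle \varphi, \dot h \varphi\rangle + 2\,\mathrm{Re}\,\langle \Psi, p_1 \dot h_1 q_1 \Psi\rangle + \langle \Psi, q_1 \dot h_1 q_1 \Psi\rangle. \tag{4.32}
\]
The first two terms of (4.32) are equal to
\[
\langle \Psi, (p_1 - 1)\Psi\rangle \langle \varphi, \dot h \varphi\rangle = -\alpha \langle \varphi, \dot h \varphi\rangle \le \beta\, |\langle \varphi, \dot h \varphi\rangle|.
\]
The third term of (4.32) is bounded, using Lemmas 3.9 and 3.10, by
\begin{align*}
2\bigl|\langle \Psi, p_1 \dot h_1 \widehat{n}^{1/2}\, \widehat{n}^{-1/2} q_1 \Psi\rangle\bigr|
&= 2\bigl|\bigl\langle \dot h_1 p_1 \widehat{\tau_1 n}^{1/2} \Psi, \widehat{n}^{-1/2} q_1 \Psi\bigr\rangle\bigr| \\
&\lesssim \bigl\|\widehat{n}^{-1/2} q_1 \Psi\bigr\| \sqrt{\bigl\langle \Psi, \widehat{\tau_1 n}^{1/2} p_1 \dot h_1^2 p_1 \widehat{\tau_1 n}^{1/2} \Psi\bigr\rangle} \\
&\le \sqrt{|\langle \varphi, \dot h^2 \varphi\rangle|}\, \sqrt{\langle \Psi, \widehat{\tau_1 n}\, \Psi\rangle}\, \sqrt{\langle \Psi, \widehat{n}^{-1} q_1 \Psi\rangle} \\
&\le \sqrt{|\langle \varphi, \dot h^2 \varphi\rangle|}\, \sqrt{\beta + \frac{1}{\sqrt{N}}}\, \sqrt{\beta} \\
&\le \sqrt{|\langle \varphi, \dot h^2 \varphi\rangle|}\, \Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr).
\end{align*}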
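The last term of (4.32) is equal to
\[
\bigl\langle \Psi, q_1 (1 + h_1)^{1/2} (1 + h_1)^{-1/2}\, \dot h_1\, (1 + h_1)^{-1/2} (1 + h_1)^{1/2} q_1 \Psi\bigr\rangle \le \bigl\|(1 + h)^{-1/2}\, \dot h\, (1 + h)^{-1/2}\bigr\|\, \bigl\|(1 + h_1)^{1/2} q_1 \Psi\bigr\|^2.
\]
Thus, using Assumption (B1') we conclude that
\[
G(t) \le C(t)\Bigl(\beta(t) + \frac{1}{\sqrt{N}} + \bigl\|h_1(t)^{1/2} q_1(t)\Psi(t)\bigr\|^2\Bigr) \tag{4.33}
\]
for all $t$. Here, and in the following, $C(t)$ denotes some continuous nonnegative function that does not depend on $N$. Next, we observe that, under Assumptions (B1') – (B3'), the proof of Lemma 4.6 remains valid for time-dependent one-particle Hamiltonians. Thus, (4.13) implies
\[
\bigl\|h_1(t)^{1/2} q_1(t)\Psi(t)\bigr\|^2 \lesssim \mathcal{E}(t) + \bigl(1 + \|\varphi(t)\|^2_{X_1^2 \cap L^\infty}\bigr)\Bigl(\beta(t) + \frac{1}{\sqrt{N}}\Bigr).
\]
Plugging this into (4.33) yields
\[
G(t) \le C(t)\Bigl(\beta(t) + \frac{1}{\sqrt{N}} + \mathcal{E}(t)\Bigr).
\]
Therefore,
\[
\mathcal{E}(t) \le \mathcal{E}(0) + \int_0^t ds\, C(s)\Bigl(\beta(s) + \mathcal{E}(s) + \frac{1}{\sqrt{N}}\Bigr). \tag{4.34}
\]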
Next, we observe that, under Assumptions (B1') – (B3'), the derivation of the estimate (4.31) in the proof of Theorem 4.1 remains valid for time-dependent one-particle Hamiltonians. Therefore,
\[
\beta(t) \le \beta(0) + \int_0^t ds\, C(s)\Bigl(\beta(s) + \mathcal{E}(s) + \frac{1}{N^\eta}\Bigr). \tag{4.35}
\]
Applying Grönwall's lemma to the sum of (4.34) and (4.35) yields
\[
\beta(t) + \mathcal{E}(t) \le \bigl(\beta(0) + \mathcal{E}(0)\bigr)\, e^{\int_0^t C} + \frac{1}{N^\eta} \int_0^t ds\, C(s)\, e^{\int_0^t C}.
\]
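As a sanity check on the Grönwall step, the sketch below compares a numerical solution of the model equation $u' = c\,(u + \delta)$ with the bound $u_0 e^{ct} + \delta \int_0^t c\, e^{c(t-s)}\,ds$. This is an illustration under simplifying assumptions, not the paper's argument: $u$ stands in for $\beta + \mathcal{E}$, $\delta$ for $N^{-\eta}$, and $C(s)$ is taken to be a constant $c$. The bound is written in the standard sharp form with $e^{\int_s^t C}$, which is itself dominated by the version with $e^{\int_0^t C}$ because $C \ge 0$. Function names are hypothetical.

```python
import math


def euler_solution(u0, delta, c, t, steps=20_000):
    """Forward-Euler solution of u' = c * (u + delta) with u(0) = u0."""
    h = t / steps
    u = u0
    for _ in range(steps):
        u += h * c * (u + delta)
    return u


def gronwall_bound(u0, delta, c, t):
    """u0 * e^{c t} + delta * int_0^t c * e^{c (t - s)} ds for constant c."""
    return u0 * math.exp(c * t) + delta * (math.exp(c * t) - 1.0)


u = euler_solution(1.0, 0.01, 2.0, 1.0)
print(u, gronwall_bound(1.0, 0.01, 2.0, 1.0))
```

For this model equation the bound is attained by the exact solution, so the Euler iterate approaches it from below as the step size shrinks.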
Plugging this back into (4.35) yields
\[
\beta(t) \le C(t)\Bigl(\beta(0) + \mathcal{E}(0) + \frac{1}{N^\eta}\Bigr),
\]
which is the claim.

Acknowledgements. We would like to thank J. Fröhlich and E. Lenzmann for helpful and stimulating discussions. We also gratefully acknowledge discussions with A. Michelangeli which led to Lemma 2.1.
References

1. Elgart, A., Schlein, B.: Mean field dynamics of boson stars. Comm. Pure Appl. Math. 60(4), 500–545 (2007)
2. Erdős, L., Schlein, B.: Quantum dynamics with mean field interactions: a new approach. http://arXiv.org/abs/0804.3774v1 [math.ph], 2008
3. Erdős, L., Yau, H.-T.: Derivation of the nonlinear Schrödinger equation with Coulomb potential. Adv. Theor. Math. Phys. 5, 1169–1205 (2001)
4. Fröhlich, J., Knowles, A., Schwarz, S.: On the mean-field limit of bosons with Coulomb two-body interaction. Commun. Math. Phys. 288, 1023–1059 (2009)
5. Ginibre, J., Velo, G.: On a class of non linear Schrödinger equations with non local interaction. Math. Z. 170, 109–136 (1980)
6. Hepp, K.: The classical limit for quantum mechanical correlation functions. Commun. Math. Phys. 35, 265–277 (1974)
7. Lenzmann, E.: Well-posedness for semi-relativistic Hartree equations of critical type. Math. Phys. Anal. Geom. 10(1), 43–64 (2007)
8. Lieb, E., Yau, H.-T.: The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics. Commun. Math. Phys. 112(1), 147–174 (1987)
9. Lieb, E.H., Seiringer, R.: Proof of Bose-Einstein condensation for dilute trapped gases. Phys. Rev. Lett. 88(17), 170409 (2002)
10. Pickl, P.: A simple derivation of mean field limits for quantum systems. To appear
11. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
12. Rodnianski, I., Schlein, B.: Quantum fluctuations and rate of convergence towards mean field dynamics. http://arXiv.org/abs/0711.3087v1 [math.ph], 2007
13. Spohn, H.: Kinetic equations from Hamiltonian dynamics: Markovian limits. Rev. Mod. Phys. 53(3), 569–615 (1980)

Communicated by H.-T. Yau
Commun. Math. Phys. 298, 139–230 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1061-4
Communications in
Mathematical Physics
Energy Dispersed Large Data Wave Maps in 2 + 1 Dimensions

Jacob Sterbenz¹, Daniel Tataru²

¹ Department of Mathematics, University of California, San Diego, CA 92093-0112, USA. E-mail: [email protected]
² Department of Mathematics, University of California, Berkeley, CA 94720-3840, USA. E-mail: [email protected]

Received: 24 July 2009 / Accepted: 27 December 2009 Published online: 23 May 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: In this article we consider large data Wave-Maps from $\mathbb{R}^{2+1}$ into a compact Riemannian manifold $(M, g)$, and we prove that regularity and dispersive bounds persist as long as a certain type of bulk (non-dispersive) concentration is absent. This is a companion to our concurrent article [21], which together with the present work establishes a full regularity theory for large data Wave-Maps.

Contents
1. Introduction
1.1 A guide to reading the paper
2. Standard Constructions, Function Spaces, and Estimates
2.1 Constants
2.2 Basic harmonic analysis
2.3 Function spaces and standard estimates
3. New Estimates and Intermediate Constructions
3.1 Core technical estimates and constructions
3.2 Derived estimates and intermediate constructions
4. Proof of the Main Result
5. The Iteration Spaces: Basic Tools and Estimates
5.1 Space-time and angular frequency cutoffs
5.2 The S and N function spaces
5.3 Extension and restriction for S and N functions
5.4 Strichartz and Wolff type bounds
6. Bilinear Null Form Estimates
7. Proof of the Trilinear Estimates
8. The Gauge Transformation
8.1 Bounds for B
8.2 The gauge construction
9. The Linear Paradifferential Flow
10. Structure of Finite S Norm Wave-Maps and Energy Dispersion
10.1 Renormalization
10.2 Partial fungibility of the S norm
10.3 The role of the energy dispersion
11. Initial Data Truncation
References

The first author was supported in part by the NSF grant DMS-0701087.
The second author was supported in part by the NSF grant DMS-0801261.
1. Introduction

In this article we consider finite energy large data Wave-Maps from the Minkowski space $\mathbb{R}^{2+1}$ into a compact Riemannian manifold $(M, g)$. Our main result asserts that regularity and dispersive bounds persist as long as a certain type of bulk concentration is absent. The results proved here are used in the companion article [21] to establish a full regularity theory for large data Wave-Maps.

The set-up we consider is the same as the one in [33], using the so-called extrinsic formulation of the Wave-Maps equation. Precisely, we consider the target manifold $(M, g)$ as an isometrically embedded submanifold of $\mathbb{R}^N$. Then we can view the $M$ valued functions as $\mathbb{R}^N$ valued functions whose range is contained in $M$. Such an embedding always exists by Nash's theorem [18] (see also Gromov [3] and Günther [4]). In this context the Wave-Maps equation can be expressed in a form which involves the second fundamental form $S$ of $M$, viewed as a symmetric bilinear form:
\[
S : TM \times TM \to NM, \qquad \langle S(X, Y), N\rangle = \langle \partial_X N, Y\rangle.
\]
For the standard d'Alembertian in $\mathbb{R}^{2+1}$ we use the notation
\[
\Box = \partial_t^2 - \Delta_x = -\partial^\alpha \partial_\alpha.
\]
The Cauchy problem for the wave maps equation has the form:
\begin{align}
\Box \phi^a &= -S^a_{bc}(\phi)\, \partial^\alpha \phi^b\, \partial_\alpha \phi^c, \qquad \phi \in \mathbb{R}^N, \tag{1a} \\
\phi(0, x) &= \phi_0(x), \qquad \partial_t \phi(0, x) = \dot\phi_0(x), \tag{1b}
\end{align}
where the initial data $(\phi_0, \dot\phi_0)$ is chosen to obey the constraint:
\[
\phi_0(x) \in M, \qquad \dot\phi_0(x) \in T_{\phi_0(x)} M, \qquad x \in \mathbb{R}^2.
\]
In the sequel, it will be convenient for us to use the notation $\phi[t] = (\phi(t), \partial_t \phi(t))$. The system of Eqs. (1) admits a conserved quantity, namely the Dirichlet energy:
\[
E[\phi(t)] := \int_{\mathbb{R}^2} \bigl(|\partial_t \phi(t)|^2 + |\nabla_x \phi(t)|^2\bigr)\, dx := \|\phi[t]\|^2_{\dot H^1 \times L^2} = E. \tag{2}
\]
Finite energy solutions for (1) correspond to initial data in the energy space, namely $\phi[t] \in \dot H^1 \times L^2$. We call a Wave-Map "classical" on a bounded time interval $(t_0, t_1) \times \mathbb{R}^2$ if $\nabla_{x,t} \phi(t)$ belongs to the Schwartz class for all $t \in (t_0, t_1)$.

The Wave-Maps equation is also invariant with respect to the change of scale $\phi(t, x) \to \phi(\lambda t, \lambda x)$ for any positive $\lambda \in \mathbb{R}$. In (2 + 1) dimensions, it is easy to see that the energy $E[\phi]$ is dimensionless with respect to this scale transformation. For this reason, the problem we consider is called energy critical.

For the evolution (1), a local well-posedness theory in Sobolev spaces $H^s \times H^{s-1}$ for $s$ above scaling, $s > 1$, was established some time ago. See [7] and [9], and references therein.

The small data Cauchy problem in the scale invariant Sobolev space is, by now, also well understood. Following work of the second author [32] for initial data in a scale invariant Besov space, Tao was the first to consider the wave map equation with small energy data. In the case when the target manifold is a sphere, Tao [29] proved global regularity and scattering for small energy solutions. This result was extended to the case of arbitrary compact target manifolds by the second author in [33]. Finite energy solutions were also introduced in [33] as unique strong limits of classical solutions, and the continuous dependence of the solutions with respect to the initial data was established. The case when the target is the hyperbolic plane was handled by Krieger [15]. There is also an extensive literature devoted to the more tractable higher dimensional case; we refer the reader to [8,14,17,28,31], and [20] for more information.

To measure the dispersive properties of solutions $\phi$ to the Wave-Maps equation, we shall use a variant of the standard dispersive norm $S$ from [33]. This was originally defined in [29] by modifying a construction in [32]. $S$ is used together with its companion space $N$ which has the linear property (precise definitions will be given shortly):
\[
\|\phi\|_{S[I]} \lesssim \|\phi\|_{L^\infty_t(L^\infty_x)[I]} + \|\phi[0]\|_{\dot H^1 \times L^2} + \|\Box \phi\|_{N[I]}.
\]
The main result in [33] asserts that global regularity and scattering hold for the small energy critical problem:

Theorem 1.1. The wave maps Eq. (1) is globally well-posed for small initial data $\phi[0] \in \dot H^1 \times L^2$ in the following sense:
(i) Classical Solutions. If the initial data $\phi[0]$ is constant outside of a compact set and $C^\infty$, then there is a global classical solution $\phi$ with this data.
(ii) Finite Energy Solutions. For each small initial data set in $\phi[0] \in \dot H^1 \times L^2$ there is a global solution $\phi \in S$, obtained as the unique $S$ limit of classical solutions, so that:
\[
\|\phi\|_S \lesssim \|\phi[0]\|_{\dot H^1 \times L^2}. \tag{3}
\]
(iii) Continuous dependence. The solution map $\phi[0] \to \phi$ from a small ball in $\dot H^1 \times L^2$ to $S$ is continuous.

We remark that due to the finite speed of propagation one can also state a local version of the above result, where the small energy initial data is taken in a ball, and the solution is defined in the corresponding uniqueness cone. This allows one to define large data finite energy solutions:

Definition 1.2. Let $I$ be a time interval. We say that $\phi$ is a finite energy wave map in $I$ if $\phi[\cdot] \in C(I; \dot H^1 \times L^2)$ and, for each $(t_0, x_0) \in I$ and $r > 0$ so that $E[\phi(t_0)](B(x_0, r))$ is small enough, the solution $\phi$ coincides with the one given by Theorem 1.1 in the uniqueness cone $I \cap \{|x - x_0| + |t - t_0| \le r\}$.

In this work we consider a far more subtle case, which is a conditional version of the large data problem. It is first important to observe that for general targets the
above theorem cannot be extended to arbitrarily large $C^\infty$ initial data, and that this failure can be attributed to several different mechanisms. For instance any harmonic map $\phi_0 : \mathbb{R}^2 \to M$ yields a time independent wave-map which does not decay in time, therefore it does not belong to $S$. More interesting is that for certain non-convex targets, for example when we take $M = \mathbb{S}^2$, finite time blow-up of smooth solutions is possible (see [13,19]). In this latter case, the blow-up occurs along a family of rescaled harmonic maps. To avoid such Harmonic-Map based solutions, as well as other possible concentration scenarios, in this article we prove a conditional regularity theorem:

Theorem 1.3 (Energy Dispersed Regularity Theorem). There exist two functions $1 \ll F(E)$ and $0 < \epsilon(E) \ll 1$ of the energy (2) such that the following statement is true. If $\phi$ is a finite energy solution to (1) on the open interval $(t_1, t_2)$ with energy $E$ and:
\[
\sup_k \|P_k \phi\|_{L^\infty_{t,x}[(t_1, t_2) \times \mathbb{R}^2]} \le \epsilon(E) \tag{4}
\]
then one also has:
\[
\|\phi\|_{S(t_1, t_2)} \le F(E). \tag{5}
\]
Finally, such a solution $\phi(t)$ extends in a regular way to a neighborhood of the interval $I = [t_1, t_2]$.

Remark 1.4. In Sect. 4, Theorem 4.1, we shall state a slightly stronger version of this result which uses the language of frequency envelopes from [29]. In particular, we will show the energy dispersion bound (4) implies that a certain range of subcritical Sobolev norms may only grow by a universal energy dependent factor. Put another way, one may interpret this restatement of Theorem 1.3 as saying that in the energy dispersed scenario, the Wave-Maps equation becomes subcritical in the sense that there is a quasi-conserved norm of higher regularity than the physical energy. This information, coupled with the standard regularity theory for Wave-Maps (e.g. see [33]) provides us with the continuation property.

Remark 1.5. The result in this article is stated and proved in space dimension $d = 2$. However, given its perturbative nature, one would expect to have a similar result in higher dimension $d \ge 3$ as well. That is indeed the case. There are two reasons why we have decided to stay with $d = 2$ here. One is to fix the notations. The second, and the more important reason, is to avoid lengthening the paper with an additional argument in Sect. 4, which is the only place in the article where the conservation of energy is used. In higher dimensions, this aspect would have to be replaced by an almost conservation of energy, with errors controlled by the energy dispersion parameter $\epsilon$.

Remark 1.6. The proof of Theorem 1.3 allows us to obtain explicit formulas for $F(E)$ and $\epsilon(E)$. Precisely, in the conclusion of the proof of Corollary 4.4 below, we show that these parameters may be chosen of the form:
\[
F(E) = e^{C e^{E^M}}, \qquad \epsilon(E) = e^{-C e^{E^M}},
\]
with $C$ and $M$ sufficiently large.

As a consequence of the frequency envelope version of this result in Theorem 4.1 we can also state a weaker non-conditional version of the above result:
Corollary 1.7. There exist two functions $1 \ll F(E)$ and $0 < \epsilon(E) \ll 1$ of the energy (2) such that for each initial data $\phi[0]$ satisfying:
\[ \sup_k \|P_k\phi[0]\|_{\dot H^1\times L^2} \le \epsilon(E), \qquad (6) \]
there exists a unique global finite energy solution $\phi \in S$, satisfying:
\[ \|\phi\|_{S} \le F(E), \qquad (7) \]
which depends continuously on the initial data. If in addition the initial data is smooth, then the solution is also smooth. Our main interest in Theorem 1.3 is to combine it with the results of our concurrent work [21], which together implies a full regularity theory for Wave-Maps. In this context, one may view Theorem 1.3 as providing a “compactness continuation” principle, which roughly states that there is the following dichotomy for classical Wave-Maps defined on the open time interval (t0 , t1 ) × R2 : (1) The solution φ continues to a neighborhood of the closed time interval [t0 , t1 ] as a classical Wave-Map. (2) The solution φ exhibits a compactness property on a sequence of rescaled times. In particular, the second case may be used with the energy estimates from [21] to conclude that a portion of any singular Wave-Map must become stationary, and via compactness must therefore rescale to a Harmonic-Map of non-trivial energy. This was known as the bubbling conjecture (see the introduction of [21] for more background). Finally, we would like to remark that results similar in spirit to the ones of this paper and [21] have been recently announced. In the case where M = Hn , the hyperbolic spaces, global regularity and scattering follows from the program of Tao [22–24,26,30] and [25]. In the case where the target M is a negatively curved Riemann surface, Krieger and Schlag [16] provide global regularity and scattering via a modification of the KenigMerle method [6], which uses as a key component suitably defined Bahouri-Gerard [1] type decompositions. 1.1. A guide to reading the paper. The paper has a “two tier” structure, whose aim is to enable the reader to get quickly to the proof of the main result in Sect. 4. The first tier consists of Sects. 2, 3 and 4, which play the following roles: Section 2 is where the notations are set-up. 
In addition, in Proposition 2.3 we review the linear, bilinear, trilinear and Moser estimates concerning the S and N spaces, as proved in [29,33]. The N space we use is the same as in [29,32]. For the S space we begin with the definition in [29] and add to it the Strichartz norm S defined later in (148). This modification costs almost nothing, but saves a considerable amount of work in several places. Section 3 contains new contributions, reaching in several directions: • Renormalization. A main difficulty in the study of wave maps is that the nonlinearity is non-perturbative at the critical energy level. A key breakthrough in the work of Tao [29] was a renormalization procedure whose aim is to remove the nonperturbative part of the nonlinearity. However, despite subsequent improvements in [33], this procedure only applies to the small data problem. We remedy this in Proposition 3.1, introducing a large data version of the renormalization procedure. This
applies without any reference to the energy dispersion bounds. We note that other large data renormalization procedures are available in certain cases, for instance by using the Coulomb or the caloric gauge. • S bounds for the paradifferential evolution with a large connection. After peeling off the perturbative part of the nonlinearity in the wave map equation, one is left with a family of frequency localized linear paradifferential evolutions as in (38). In the case of the small data problem, by renormalization this turns into a small perturbation of the linear wave equation. Here this is no longer possible, as the connection coefficients Aα are large, and this cannot be improved using the energy dispersion. However, what the energy dispersion allows us to do is to produce a large frequency gap m in (38). As it turns out, this is all that is needed in order to have good estimates for Eq. (38). • New bilinear and trilinear estimates which take advantage of the energy dispersion. The main bilinear bound is the L 2 estimate in Proposition 3.4. Ideally one would like to have such estimates for functions in S, but that is too much to ask. Instead we introduce a narrower class W of “renormalizable” functions φ of the form φ = U † w, where U ∈ S is a gauge transformation, while for w we control both w S and w N . As a consequence of Proposition 3.4 and the more standard bounds in Proposition 2.3, we later derive the trilinear estimates in Proposition 3.6, which are easy to apply subsequently in the proof of our main theorem. Section 4 contains the proof of Theorem 4.1, which is a stronger frequency envelope version of Theorem 1.3. This is done via an induction on energy argument. The noninductive part of the proof is separated into Propositions 4.2 and 4.3, whose aim is to bound in two steps the difference between a wave-map φ and a lower energy wave map whose initial data is essentially obtained by truncating in frequency the initial data for φ φ. 
The arguments in this section use exclusively the results in Sects. 2, 3. The second tier of the article contains the proofs of all the results stated in Sects. 2, 3, with the exception of those already proved in [29] and [33]. These are organized as follows: Section 5’s content is as follows: • A full description of the S and N spaces. Some further properties of these spaces are detailed in Proposition 5.4; most of these are from [29] and [33], with the notable exception of the fungibility estimate (159). The bound (159) is proved using only the definition of N . • Extension properties for the S space. In most of our analysis we do not work with the spaces S and N globally, instead we use their restrictions to time intervals, S[I ] and N [I ]. This is not important for N , since the multiplication by a characteristic function of an interval is bounded on N . However, that is not the case for S. One can define the S[I ] norm using minimal extensions. But in our case, we also need good control of the energy dispersion and of the high modulation bounds for the extensions. To address this, in Proposition 5.5 we introduce a canonical way to define the extensions which obey the appropriate bounds, and which also produce an equivalent S[I ] norm. • Strichartz and L 2 bilinear estimates. Using the U p and V p spaces1 associated to the half-wave evolutions, we first show that solutions to the wave equation φ = F with a right hand side F ∈ N satisfy the full Strichartz estimates. The fungibility estimate (159) plays a significant role here, as it allows us to place the solution φ in a V 2 type space, see (195). A second goal is to prove L 2 bilinear bounds for products of 1 For further information on the U p and V p spaces we refer the reader to [5,11,12].
two such inhomogeneous waves with frequency localization and angular frequency separation, see Lemma 5.10. This is accomplished using the Wolff [34]-Tao [27] type L p bilinear estimates with p < 2. Section 6 is devoted to the proof of the bilinear null form estimates in Proposition 3.4. A preliminary step, achieved in Lemma 6.1, is to establish the counterpart of the bounds (44) and (46) in the absence of the renormalization factor. The proofs here use only Lemma 5.10 and the estimates in Propositions 2.3, 5.4. Section 7 contains the proof of the trilinear estimates in Proposition 3.6. There are a number of dyadic decompositions and multiple cases to consider, but this is largely routine, using either Proposition 3.4 or the estimates in Propositions 2.3 and 5.4. Section 8 is concerned with the construction of the gauge transformation in Proposition 3.1. The discrete inductive construction in [29,33] is replaced with a continuous version which serves to insure that the renormalization matrices U, , (13) k k
\[ \sum_{k' > k} 2^{-ak'} c_{k'} \lesssim (a-\sigma)^{-1}\, 2^{-ak} c_k, \qquad a > \sigma, \qquad (14) \]
with similar bounds for integrals. These two inequalities capture the essence of every use we have for the $\{c_k\}$ notation, which is simply to bookkeep (resp.) Low × Low ⇒ High and High × High ⇒ Low frequency cascades.
2.3. Function spaces and standard estimates. We use the function spaces S and N from [32,33] and [29] with only a few minor modifications. The spaces of restrictions of S and N functions to a time interval I are denoted by S[I], respectively N[I], with the induced norms. The first part of our proof does not use the precise structure of these spaces, only the following statement:
Proposition 2.3 (Standard Estimates and Relations: Part I). Let F, $\phi$, and $\phi^{(i)}$ be a collection of test functions, $I \subseteq \mathbb{R}$ any subinterval (including $\mathbb{R}$ itself). Then there exist function spaces S[I] and N[I] with the following properties:
• Triangle Inequality for S. Let $I = \cup_i^K I_i$ be a decomposition of I into consecutive intervals, then the following bounds hold (uniform in K):
\[ \|\phi\|_{S[I]} \lesssim \sum_i \|\phi\|_{S[I_i]}. \qquad (15) \]
• Frequency Orthogonality. The spaces S[I] and N[I] are made up of dyadic pieces in the sense that:
\[ \|\phi\|^2_{S[I]} = \|\phi\|^2_{L^\infty_t(L^\infty_x)[I]} + \sum_k \|P_k\phi\|^2_{S[I]}, \qquad (16) \]
\[ \|\phi\|^2_{N[I]} = \sum_k \|P_k\phi\|^2_{N[I]}. \qquad (17) \]
• Energy Estimates. We have that $L^1_t(L^2_x)[I] \subseteq N[I]$, and also the estimate:
\[ \|\phi_k\|_{S[I]} \lesssim \|\Box\phi_k\|_{N[I]} + \|\phi_k[0]\|_{\dot H^1\times L^2}. \qquad (18) \]
• Core Product Estimates. We have that: (1)
(2)
(1)
(2)
φ 0, that is, to a generic positive function in L (M, B−m,0 , μ). + − If g is such that both the positive part g and the negative part g are nonzero, we apply (5.61) twice to g + and g − . An easy estimate proves that the formula holds in this case as well. & := {F ∈ L ∞ (M, As , μ) | ∃ μ(F)} and L &m := Therefore (M5) holds w.r.t. G 1 L (M, B−m,0 , μ), for all m ∈ N. Now, if F ∈ Gm and g ∈ Lm , the invariance of μ and Lemma 2.4 give that μ((F ◦ T n )g) − μ(F)μ(g) = μ((F ◦ T n−m )(g ◦ T −m )) − μ(F ◦ T m )μ(g ◦ T −m ).
(5.62)
&2m and F ◦ T m ∈ G & (because B0,2m ⊂ As ), we apply the previous Since g ◦ T −m ∈ L (2m) result and see that (a) holds with a convergence rate ϑn−2m .
5.7. Stage 4: Proof of the remaining assertions. Statement (a) immediately implies (M4) relative to $G_m$ and $L_m$ (Proposition 3.1). One readily extends it to G and L, thus proving (b), by means of the following obvious lemma:
Lemma 5.8. If $G'$ is a dense subset of G in the $L^\infty$-norm and $L'$ is a dense subset of L in the $L^1$-norm, then (M4) for $G'$ and $L'$ implies (M4) for G and L.
As concerns (c), it is easy to verify that Proposition 3.2 applies to the classes of global observables $G_m$ and local observables $L_m$ (using the family of local observables $g_\alpha := G\,1_{S_\alpha}$). Therefore (a) implies (M2) relative to $G_m$. We extend it to G by means of another obvious result.
Lemma 5.9. If $G'$ is a dense subset of G in the $L^\infty$-norm, then (M2) for $G'$ implies (M2) for G.
Finally, let us consider (d). By the second part of Proposition 3.1, it suffices to show that, if $F, G \in G_m$ and F is $\mathbb{Z}^d$-periodic, then $\mu((F\circ T^n)G)$ exists for n large enough. By the same arguments as in the proof of Lemma 2.4, when $V \nearrow M$,
\[ \mu_V((F\circ T^n)G) = \mu_V((F\circ T^{n-m})(G\circ T^{-m})) + o(1). \qquad (5.63) \]
So we can reduce to proving the existence of the infinite-volume limit of the above r.h.s., for all n ≥ 2m. Since $G\circ T^{-m}$ is measurable w.r.t. $B_{-2m,0} \subset A_u$, with a slight abuse of notation we can define, for $\alpha \in \mathbb{Z}^d$,
\[ b_\alpha := \int_{S_\alpha} G\circ T^{-m}\,d\mu = \int_0^1 G\circ T^{-m}(\alpha; y_2)\,dy_2. \qquad (5.64) \]
An analogous definition can be made for $F\circ T^{n-m}$, which is measurable w.r.t. $B_{0,2m} \subset A_s$. In this case, notice that $F\circ T^{n-m}$ is also $\mathbb{Z}^d$-periodic, so we can write
\[ a := \int_{S_\alpha} F\circ T^{n-m}\,d\mu = \int_0^1 F\circ T^{n-m}(y_1)\,dy_1. \qquad (5.65) \]
Clearly, then,
\[ \int_{S_\alpha} (F\circ T^{n-m})(G\circ T^{-m})\,d\mu = a\,b_\alpha \qquad (5.66) \]
and, for $V = \bigcup_{\alpha\in B_{\gamma,r}} S_\alpha$,
\[ \mu_V((F\circ T^{n-m})(G\circ T^{-m})) = \frac{1}{(2r+1)^d}\sum_{\alpha\in B_{\gamma,r}} a\,b_\alpha. \qquad (5.67) \]
Since $\mu(G)$ exists, by Lemma 2.4,
\[ \mu(G\circ T^{-m}) = \lim_{r\to\infty} \frac{1}{(2r+1)^d}\sum_{\alpha\in B_{\gamma,r}} b_\alpha \qquad (5.68) \]
exists and the limit is uniform in γ. Also, it is obvious that $\mu(F\circ T^{n-m}) = a$. Hence, as $V \nearrow M$, the r.h.s. of (5.67) tends to $\mu(F\circ T^{n-m})\,\mu(G\circ T^{-m})$, which is what we wanted to prove. This concludes the proof of Theorem 4.6.
A. Appendix
We collect here a few technical results which would have been distracting in the body of the paper. The most important of them is an estimate on a certain Fourier norm that is pivotal in the proof of Theorem 4.6. This is presented in Sect. A.3.
A.1. Proof of Lemma 4.2. Since T is an automorphism, it is no loss of generality to prove the assertion for t = −1. Also, since (A1) is invariant for the action of $\mathbb{Z}^d$ on $\mathcal{V}$, we may assume that all $V \in \mathcal{V}$ are of the form $V_r := B_{0,r} \times [0,1)^2$. Thus, the infinite-volume limit becomes the limit r → ∞. Let $r' = r'(r) := [r^{1/2}]$ ([·] is the integer part of a positive number) and
\[ \varphi(r') := \mu(T S_0 \setminus V_{r'}) = \sum_{\beta \notin B_{0,r'}} p_\beta. \qquad (A.1) \]
Clearly, $\varphi(r') \searrow 0$, as r and r' tend to infinity. This and the translation invariance of T imply that
\[ \mu(T V_r \setminus V_{r+r'}) \le \mu(V_r)\,\varphi(r'), \qquad (A.2) \]
whence
\[ \mu(T V_r \cup V_r) \le \mu(V_{r+r'}) + \mu(T V_r \setminus V_{r+r'}) = (2r+1)^d + o((2r+1)^d). \qquad (A.3) \]
With a dual argument, considering that $\mu(V_{r-r'} \setminus T V_r) = \mu(T^{-1} V_{r-r'} \setminus V_r)$ and that $T^{-1}$ acts essentially as T (after a swapping of the coordinates $y_1$ and $y_2$, the map $T^{-1}$ becomes of the same type as T), we obtain
\[ \mu(T V_r \cap V_r) \ge \mu(V_{r-r'}) - \mu(V_{r-r'} \setminus T V_r) = (2r+1)^d + o((2r+1)^d). \qquad (A.4) \]
Taking the difference of (A.3) and (A.4) yields (A1) with t = −1.
A.2. Proof of Lemma 4.5. We must show that, if $j', j'' \in \mathbb{Z}_N$ with $j' \ne j''$, then
\[ \mathrm{span}_{\mathbb{Z}}\{\beta^{(j)} - \beta^{(j')}\}_{j\ne j'} = \mathrm{span}_{\mathbb{Z}}\{\beta^{(j)} - \beta^{(j'')}\}_{j\ne j''}. \qquad (A.5) \]
The generic element of the l.h.s. of (A.5) is
\[ \gamma = \sum_{j\ne j'} n_j\,(\beta^{(j)} - \beta^{(j')}) = \sum_{j\ne j'} n_j\,\beta^{(j)} - \Big(\sum_{j\ne j'} n_j\Big)\beta^{(j')}, \qquad (A.6) \]
where $\{n_j\}_{j\ne j'}$ are free variables, i.e., are arbitrarily chosen integers. Upon defining $n_{j'} := -\sum_{j\ne j'} n_j$, which implies $n_{j''} = -\sum_{j\ne j''} n_j$, (A.6) becomes
\[ \gamma = \sum_{j\in\mathbb{Z}_N} n_j\,\beta^{(j)} = \sum_{j\ne j''} n_j\,\beta^{(j)} - \Big(\sum_{j\ne j''} n_j\Big)\beta^{(j'')} = \sum_{j\ne j''} n_j\,(\beta^{(j)} - \beta^{(j'')}), \qquad (A.7) \]
which is the generic element of the r.h.s. of (A.5), if we consider $\{n_j\}_{j\ne j''}$ to be the free variables and $n_{j''}$ to depend on them.
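A.3. Absolutely convergent Fourier series. In this section we present a convenient estimate for the space A of functions $a : \mathbb{T}^d \to \mathbb{C}$ with an absolutely convergent Fourier series $\{a_\beta\}_{\beta\in\mathbb{Z}^d}$ [K, §6]. This functional space is defined as the maximal domain of the norm
\[ \|a\|_A := \|\{a_\beta\}\|_1 := \sum_{\beta\in\mathbb{Z}^d} |a_\beta|. \qquad (A.8) \]
This norm has a couple of straightforward invariances. For $\gamma \in \mathbb{Z}^d$, $\zeta \in \mathbb{T}^d$, and $a \in A$, let
\[ \omega_\gamma(\theta) := e^{i\gamma\cdot\theta}; \qquad (A.9) \]
\[ (\tau_\zeta a)(\theta) := a(\theta + \zeta). \qquad (A.10) \]
Lemma A.1. Given $a \in A$, for all $\gamma \in \mathbb{Z}^d$ and $\zeta \in \mathbb{T}^d$, $\|a\,\omega_\gamma\|_A = \|\tau_\zeta a\|_A = \|a\|_A$.
Proof. Trivial verification upon computation of the Fourier series of $a\,\omega_\gamma$ and $\tau_\zeta a$.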
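The verification behind Lemma A.1 can be illustrated numerically. The following is only a sketch under stated assumptions: a function in A is represented by a finite dictionary of Fourier coefficients, with the convention that $a_\beta$ is the coefficient of $e^{i\beta\cdot\theta}$, and the helper names `a_norm`, `multiply_by_character` and `translate` are hypothetical, not taken from the paper. Multiplying by $\omega_\gamma$ shifts the index set, and translating by ζ multiplies each coefficient by a unimodular phase, so in both cases the sum of moduli is unchanged.

```python
import cmath


def a_norm(coeffs):
    """A-norm: sum of the moduli of the Fourier coefficients."""
    return sum(abs(c) for c in coeffs.values())


def multiply_by_character(coeffs, gamma):
    """Coefficients of a * omega_gamma: the index beta moves to beta + gamma."""
    return {
        tuple(b + g for b, g in zip(beta, gamma)): c
        for beta, c in coeffs.items()
    }


def translate(coeffs, zeta):
    """Coefficients of tau_zeta a: a_beta picks up the phase exp(i beta.zeta)."""
    return {
        beta: c * cmath.exp(1j * sum(b * z for b, z in zip(beta, zeta)))
        for beta, c in coeffs.items()
    }


# Hypothetical sample with d = 2 and three nonzero coefficients.
a = {(0, 0): 1.5, (1, -2): 0.5 - 0.25j, (-3, 1): -0.75j}
base = a_norm(a)
shifted = a_norm(multiply_by_character(a, (2, 5)))
moved = a_norm(translate(a, (0.3, 1.7)))
print(base, shifted, moved)
```

The three printed values agree up to rounding, which is the content of the lemma for trigonometric polynomials.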
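The following estimate is a modification (mostly, a simplification) of a 1984 result by Nowak [No]. The proof, which we give for completeness, is practically copied from that article.
Lemma A.2. Let $\bar\nu = [d/2] + 1$ be the smallest integer strictly bigger than d/2. There exists a constant $C_d > 0$ such that $\|a\|_A \le C_d \|a\|_{H^{\bar\nu}}$, where
\[ \|a\|_{H^{\bar\nu}} := |a_0| + \sum_{i=1}^d \Big( \sum_{\beta\in\mathbb{Z}^d} \big|\beta_i^{\bar\nu} a_\beta\big|^2 \Big)^{1/2} = \Big| \int_{\mathbb{T}^d} a(\theta)\,d\theta \Big| + \sum_{i=1}^d \Big( \int_{\mathbb{T}^d} \Big| \frac{\partial^{\bar\nu} a}{\partial\theta_i^{\bar\nu}}(\theta) \Big|^2 d\theta \Big)^{1/2}. \]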
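Proof. Let $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_d)$ be a permutation of (1, 2, ..., d). Let us define
\[ Z_\sigma := \big\{ (\beta_1, \beta_2, \ldots, \beta_d) \in \mathbb{Z}^d \;\big|\; 0 < |\beta_{\sigma_1}| \le |\beta_{\sigma_2}| \le \cdots \le |\beta_{\sigma_d}| \big\}. \qquad (A.11) \]
Clearly,
\[ \mathbb{Z}^d = \{0\} \cup \bigcup_\sigma Z_\sigma, \qquad (A.12) \]
although the union is not disjoint. Using the Cauchy-Schwarz inequality,
\[ \Big( \sum_{\beta\in Z_\sigma} |a_\beta| \Big)^2 \le C_{\bar\nu} \sum_{\beta\in Z_\sigma} \beta_{\sigma_d}^{2\bar\nu}\, |a_\beta|^2 \le C_{\bar\nu} \sum_{\beta\in\mathbb{Z}^d} \big| \beta_{\sigma_d}^{\bar\nu} a_\beta \big|^2, \qquad (A.13) \]
where we have denoted
\[ C_{\bar\nu} := \sum_{\beta\in Z_\sigma} \beta_{\sigma_d}^{-2\bar\nu} = \sum_{|\alpha_1|>0}\ \sum_{|\alpha_2|\ge|\alpha_1|} \cdots \sum_{|\alpha_{d-1}|\ge|\alpha_{d-2}|}\ \sum_{|\alpha_d|\ge|\alpha_{d-1}|} \alpha_d^{-2\bar\nu} \]
\[ \le C \sum_{|\alpha_1|>0}\ \sum_{|\alpha_2|\ge|\alpha_1|} \cdots \sum_{|\alpha_{d-1}|\ge|\alpha_{d-2}|} \alpha_{d-1}^{-2\bar\nu+1} \le \cdots \le C \sum_{|\alpha_1|>0} \alpha_1^{-2\bar\nu+d-1} < \infty \qquad (A.14) \]
(as in Sect. 5, C represents a generic constant). In view of (A.12), summing the square root of (A.13) over all the permutations σ, we obtain
\[ \sum_{\beta\ne 0} |a_\beta| \le (d-1)!\,\sqrt{C_{\bar\nu}}\ \sum_{i=1}^d \Big( \sum_{\beta\in\mathbb{Z}^d} \big| \beta_i^{\bar\nu} a_\beta \big|^2 \Big)^{1/2}, \qquad (A.15) \]
whence the assertion of the lemma.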
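The Cauchy-Schwarz step in (A.13) is easiest to see in dimension d = 1, where $\bar\nu = 1$, there is a single permutation, and $C_{\bar\nu} = \sum_{\beta\ne 0}\beta^{-2} = \pi^2/3$. The sketch below checks the resulting bound $\sum_{\beta\ne 0}|a_\beta| \le \sqrt{C_{\bar\nu}}\,(\sum_\beta |\beta a_\beta|^2)^{1/2}$ on randomly generated coefficients. The function name `check_bound` and the sample data are hypothetical choices made for illustration only.

```python
import math
import random


def check_bound(coeffs):
    """Return both sides of the d = 1 case of (A.15).

    coeffs maps nonzero integers beta to complex Fourier coefficients a_beta.
    """
    lhs = sum(abs(c) for c in coeffs.values())
    weighted = math.sqrt(sum((beta * abs(c)) ** 2 for beta, c in coeffs.items()))
    c_nu = math.pi ** 2 / 3  # sum over all nonzero integers beta of beta^(-2)
    rhs = math.sqrt(c_nu) * weighted
    return lhs, rhs


random.seed(0)
# Hypothetical coefficients with quadratic decay, supported on 0 < |beta| <= 50.
coeffs = {
    beta: complex(random.gauss(0, 1), random.gauss(0, 1)) / beta ** 2
    for beta in range(-50, 51)
    if beta != 0
}
lhs, rhs = check_bound(coeffs)
print(lhs <= rhs)
```

The inequality holds for any finitely supported coefficient family, since it only uses $|a_\beta| = |\beta|^{-1}\cdot|\beta a_\beta|$ and Cauchy-Schwarz.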
Acknowledgements. I would like to thank an anonymous referee for pointing out a relevant mistake in the first draft of the manuscript.
References
[A] Aaronson, J.: An introduction to infinite ergodic theory. Mathematical Surveys and Monographs, 50. Providence, RI: Amer. Math. Soc., 1997
[BS] Bunimovich, L.A., Sinai, Ya.G.: Statistical properties of Lorentz gas with periodic configuration of scatterers. Commun. Math. Phys. 78(4), 479–497 (1980/81)
[CM] Chernov, N., Markarian, R.: Chaotic billiards. Mathematical Surveys and Monographs, 127. Providence, RI: Amer. Math. Soc., 2006
[Fr] Friedman, N.A.: Mixing transformations in an infinite measure space. In: Studies in probability and ergodic theory, Adv. in Math. Suppl. Stud., 2. New York-London: Academic Press, 1978, pp. 167–184
[HK] Hajian, A.B., Kakutani, S.: Weakly wandering sets and invariant measures. Trans. Amer. Math. Soc. 110, 136–151 (1964)
[H] Hopf, E.: Ergodentheorie. Berlin: Springer-Verlag, 1937
[I1] Isola, S.: Renewal sequences and intermittency. J. Stat. Phys. 97(1–2), 263–280 (1999)
[I2] Isola, S.: On systems with finite ergodic degree. Far East J. Dyn. Syst. 5(1), 1–62 (2003)
[KP] Kakutani, S., Parry, W.: Infinite measure preserving transformations with “mixing”. Bull. Amer. Math. Soc. 69, 752–756 (1963)
[K] Katznelson, Y.: An introduction to harmonic analysis. 3rd ed. Cambridge Mathematical Library. Cambridge: Cambridge University Press, 2004
[KO] Kingman, J.F.C., Orey, S.: Ratio limit theorems for Markov chains. Proc. Amer. Math. Soc. 15, 907–910 (1964)
[KS] Krengel, U., Sucheston, L.: Mixing in infinite measure spaces. Z. Wahr. Verw. Geb. 13, 150–164 (1969)
[Kr] Krickeberg, K.: Strong mixing properties of Markov chains with infinite invariant measure. In: 1967 Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, CA, 1965/66), Vol. II, Part 2, Berkeley, CA: Univ. California Press, 1967, pp. 431–446
[L1] Lenci, M.: Aperiodic Lorentz gas: recurrence and ergodicity. Erg. Th. Dynam. Systs. 23(3), 869–883 (2003)
[L2] Lenci, M.: Typicality of recurrence for Lorentz gases. Erg. Th. Dynam. Systs. 26(3), 799–820 (2006)
[L3] Lenci, M.: Mixing properties of infinite Lorentz gases. In preparation
[No] Nowak, Z.: Criteria for absolute convergence of multiple Fourier series. Ark. Mat. 22(1), 25–32 (1984)
[Or] Orey, S.: Strong ratio limit property. Bull. Amer. Math. Soc. 67, 571–574 (1961)
[Pa] Papangelou, F.: Strong ratio limits, r-recurrence and mixing properties of discrete parameter Markov processes. Z. Wahr. Verw. Geb. 8, 259–297 (1967)
[Pr] Pruitt, W.E.: Strong ratio limit property for r-recurrent Markov chains. Proc. Amer. Math. Soc. 16, 196–200 (1965)
[R] Rudin, W.: Fourier analysis on groups. Interscience Tracts in Pure and Applied Mathematics, no. 12, New York-London: Interscience Publishers (a division of John Wiley and Sons), 1962
[Sa] Sachdeva, U.: On category of mixing in infinite measure spaces. Math. Systems Theory 5, 319–330 (1971)
[Si] Sinai, Ya.G.: Dynamical systems with elastic reflections. Russ. Math. Surv. 25(2), 137–189 (1970)
[Sp] Spitzer, F.: Principles of random walk, 2nd ed. Graduate Texts in Mathematics, 34. New York-Heidelberg: Springer-Verlag, 1976
[Su] Sucheston, L.: On mixing and the zero-one law. J. Math. Anal. Appl. 6, 447–456 (1963)
[Th] Thaler, M.: The asymptotics of the Perron-Frobenius operator of a class of interval maps preserving infinite measures. Studia Math. 143(2), 103–119 (2000)
[To1] Tomatsu, S.: Uniformity of mixing transformations with infinite measure. Bull. Fac. Gen. Ed. Gifu Univ. No. 17, 43–49 (1981)
[To2] Tomatsu, S.: Local uniformity of mixing transformations with infinite measure. Bull. Fac. Gen. Ed. Gifu Univ. No. 20, 1–5 (1984)
[W] Walters, P.: An introduction to ergodic theory. Graduate Texts in Mathematics, 79. New York-Berlin: Springer-Verlag, 1982
Communicated by G. Gallavotti
Commun. Math. Phys. 298, 515–522 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1073-0
Communications in
Mathematical Physics
An Energy Gap for Yang-Mills Connections Claus Gerhardt Institut für Angewandte Mathematik, Ruprecht-Karls-Universität, Im Neuenheimer Feld 294, 69120 Heidelberg, Germany. E-mail:
[email protected] Received: 8 August 2009 / Accepted: 16 February 2010 Published online: 11 June 2010 – © Springer-Verlag 2010
Abstract: Consider a Yang-Mills connection over a Riemann manifold $M = M^n$, n ≥ 3, where M may be compact or complete. Then its energy must be bounded from below by some positive constant, if M satisfies certain conditions, unless the connection is flat.
Contents
1. Introduction
2. The Compact Case
3. The Non-compact Case
References
1. Introduction
We consider the problem: When is a Yang-Mills connection non-flat? Of course, the trivial answer $F_{\mu\lambda} \not\equiv 0$ is unsatisfactory. Bourguignon and Lawson proved in [3, Theorem C], among other results, that any Yang-Mills connection over $S^n$, n ≥ 3, the field strength of which satisfies the pointwise estimate
\[ F^2 = -\operatorname{tr}(F_{\mu\lambda}F^{\mu\lambda}) < \binom{n}{2} \qquad (1.1) \]
is flat. We want to prove that under certain assumptions on the base space M, which is supposed to be a Riemannian manifold of dimension n ≥ 3, the energy of a Yang-Mills connection has to satisfy
\[ \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}} \ge \kappa_0 > 0, \qquad (1.2) \]
This work has been supported by the DFG.
where $\kappa_0$ depends only on the Sobolev constants of M, n and the dimension of the Lie group G, unless the connection is flat. Here,
\[ |F| = \sqrt{F^2}, \qquad (1.3) \]
and we also call the left-hand side of (1.2) energy though this label is only correct when n = 4. However, this norm is also the crucial norm, which has to be (locally) small, used to prove regularity of a connection, cf. [4, Theorem 1.3]. The exponent $\frac{n}{2}$ naturally pops up when Sobolev inequalities are applied to solutions of differential equations satisfied by the field strength or the energy density of a connection in the adjoint bundle. We distinguish two cases: M compact and M complete and non-compact. When M is compact, we require
\[ \bar R_{\alpha\beta}\Lambda^{\alpha}{}_{\lambda}\Lambda^{\beta\lambda} - \tfrac12 \bar R_{\alpha\beta\mu\lambda}\Lambda^{\alpha\beta}\Lambda^{\mu\lambda} \ge c_0\,\Lambda_{\alpha\beta}\Lambda^{\alpha\beta} \qquad (1.4) \]
for all skew-symmetric $\Lambda_{\alpha\beta} \in T^{0,2}(M)$, where $0 < c_0$, while for non-compact M the weaker assumption
\[ \bar R_{\alpha\beta}\Lambda^{\alpha}{}_{\lambda}\Lambda^{\beta\lambda} - \tfrac12 \bar R_{\alpha\beta\mu\lambda}\Lambda^{\alpha\beta}\Lambda^{\mu\lambda} \ge 0, \qquad (1.5) \]
and in addition
\[ \Big( \int_M |u|^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}} \le c_1 \int_M |Du|^2 \qquad \forall\, u \in H^{1,2}(M) \qquad (1.6) \]
should be satisfied.
Remark 1.1. (i) If M is a space of constant curvature
\[ \bar R_{\alpha\beta\mu\lambda} = K_M\,(\bar g_{\alpha\mu}\bar g_{\beta\lambda} - \bar g_{\alpha\lambda}\bar g_{\beta\mu}), \qquad (1.7) \]
then
\[ \bar R_{\alpha\beta}\Lambda^{\alpha}{}_{\lambda}\Lambda^{\beta\lambda} - \tfrac12 \bar R_{\alpha\beta\mu\lambda}\Lambda^{\alpha\beta}\Lambda^{\mu\lambda} = (n-2)\,K_M\,\Lambda_{\alpha\beta}\Lambda^{\alpha\beta}. \qquad (1.8) \]
In case n = 2 the curvature term therefore vanishes, and this result is also valid for an arbitrary two-dimensional Riemannian manifold, since the curvature tensor then has the same structure as in (1.7) though $K_M$ is not necessarily constant.
(ii) If $M = \mathbb{R}^n$, n ≥ 3, the conditions (1.5) and (1.6) are always valid.
Theorem 1.2. Let $M = M^n$, n ≥ 3, be a compact Riemannian manifold for which the condition (1.4) with $c_0 > 0$ holds. Then any Yang-Mills connection over M with compact, semi-simple Lie group is either flat or satisfies (1.2) for some constant $\kappa_0 > 0$ depending on the Sobolev constants of M, n, $c_0$, and the dimension of the Lie group.
Theorem 1.3. Let $M = M^n$, n ≥ 3, be complete, non-compact and assume that the conditions (1.5) and (1.6) hold. Then any Yang-Mills connection over M with compact, semi-simple Lie group is either flat or the estimate (1.2) is valid. The constant $\kappa_0 > 0$ in (1.2) depends on the constant $c_1$ in (1.6), n, and the dimension of the Lie group.
2. The Compact Case
Let (P, M, G, G) be a principal fiber bundle, where $M = M^n$, n ≥ 3, is a compact Riemannian manifold with metric $\bar g_{\alpha\beta}$ and G a compact, semi-simple Lie group with Lie algebra g. Let $f_c = (f^a_{cb})$ be a basis of ad g and
\[ A_\mu = f_c A^c_\mu \qquad (2.1) \]
a Yang-Mills connection in the adjoint bundle (E, M, g, Ad(G)). The curvature tensor of the connection is given by
\[ R^{a}{}_{b\mu\lambda} = f^a_{cb} F^c_{\mu\lambda}, \qquad (2.2) \]
where
\[ F_{\mu\lambda} = f_c F^c_{\mu\lambda} \qquad (2.3) \]
is the field strength of the connection, and
\[ F^2 \equiv \gamma_{ab} F^a_{\mu\lambda} F^{b\mu\lambda} = R_{ab\mu\lambda} R^{ab\mu\lambda} \qquad (2.4) \]
the energy density of the connection, at least up to a factor $\frac14$. Here, $\gamma_{ab}$ is the Cartan-Killing metric acting on elements of the fiber g, and Latin indices are raised or lowered with respect to the inverse $\gamma^{ab}$ or $\gamma_{ab}$, and Greek indices with respect to the metric of M.
Definition 2.1. The adjoint bundle E is a vector bundle; let $E^*$ be the dual bundle, then we denote by
\[ T^{r,s}(E) = \Gamma(\underbrace{E \otimes \cdots \otimes E}_{r} \otimes \underbrace{E^* \otimes \cdots \otimes E^*}_{s}) \qquad (2.5) \]
the sections of the corresponding tensor bundle. Thus, we have
\[ F^a_{\mu\lambda} \in T^{1,0}(E) \otimes T^{0,2}(M). \qquad (2.6) \]
Since $A_\mu$ is a Yang-Mills connection it solves the Yang-Mills equation
\[ F^{a\alpha}{}_{\lambda;\alpha} = 0, \qquad (2.7) \]
where we use Einstein's summation convention, a semi-colon indicates covariant differentiation, and where we stipulate that a covariant derivative is always a full tensor, i.e.,
\[ F^a_{\mu\lambda;\alpha} = F^a_{\mu\lambda,\alpha} + f^a_{bc} A^b_\alpha F^c_{\mu\lambda} - \bar\Gamma^\gamma_{\alpha\mu} F^a_{\gamma\lambda} - \bar\Gamma^\gamma_{\alpha\lambda} F^a_{\mu\gamma}, \qquad (2.8) \]
where $\bar\Gamma^\gamma_{\alpha\beta}$ are the Christoffel symbols of the Riemannian connection; a comma indicates partial differentiation. Before we formulate the crucial lemma let us note that $\bar R_{\alpha\beta\gamma\delta}$ resp. $\bar R_{\alpha\beta}$ symbolize the Riemann curvature tensor resp. the Ricci tensor of $\bar g_{\alpha\beta}$.
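Lemma 2.2. Let $A_\mu$ be a Yang-Mills connection, then its energy density $F^2$ solves the equation
\[ -\tfrac14 \Delta F^2 + \tfrac12 F_{a\mu\lambda;\alpha} F^{a\mu\lambda}{}_{;}{}^{\alpha} + \bar R_{\beta\mu} F^{a\beta}{}_{\lambda} F_a{}^{\mu\lambda} - \tfrac12 \bar R_{\alpha\beta\mu\lambda} F_a{}^{\alpha\beta} F^{a\mu\lambda} = -f^a_{cb} F^c_{\alpha\mu} F^{b\alpha}{}_{\lambda} F_a{}^{\mu\lambda}. \qquad (2.9) \]
Proof. Differentiating (2.7) covariantly with respect to $x^\mu$ and using the Ricci identities we obtain
\[ 0 = -F^{a\alpha}{}_{\lambda;\alpha\mu} = -F^{a\alpha}{}_{\lambda;\mu\alpha} + R^{a}{}_{b\alpha\mu} F^{b\alpha}{}_{\lambda} + \bar R^{\alpha}{}_{\beta\alpha\mu} F^{a\beta}{}_{\lambda} + \bar R^{\beta}{}_{\lambda\mu\alpha} F^{a\alpha}{}_{\beta}. \qquad (2.10) \]
On the other hand, differentiating the second Bianchi identities
\[ 0 = F^a_{\alpha\lambda;\mu} + F^a_{\mu\alpha;\lambda} + F^a_{\lambda\mu;\alpha} \qquad (2.11) \]
we infer
\[ 0 = F^{a\alpha}{}_{\lambda;\mu\alpha} + F^{a}{}_{\mu}{}^{\alpha}{}_{;\lambda\alpha} + \Delta F^a_{\lambda\mu}, \qquad (2.12) \]
and we deduce further
\[ -\Delta F^a_{\mu\lambda} F_a{}^{\mu\lambda} = -2 F^{a\alpha}{}_{\lambda;\mu\alpha} F_a{}^{\mu\lambda}. \qquad (2.13) \]
In view of (2.10) we then conclude
\[ 0 = -\tfrac12 \Delta F^a_{\mu\lambda} F_a{}^{\mu\lambda} + R^{a}{}_{b\alpha\mu} F^{b\alpha}{}_{\lambda} F_a{}^{\mu\lambda} + \bar R_{\beta\mu} F^{a\beta}{}_{\lambda} F_a{}^{\mu\lambda} + \bar R^{\beta}{}_{\lambda\mu\alpha} F^{a\alpha}{}_{\beta} F_a{}^{\mu\lambda}, \qquad (2.14) \]
which is equivalent to
\[ 0 = -\tfrac12 \Delta F^a_{\mu\lambda} F_a{}^{\mu\lambda} + f^a_{cb} F^c_{\alpha\mu} F^{b\alpha}{}_{\lambda} F_a{}^{\mu\lambda} + \bar R_{\beta\mu} F^{a\beta}{}_{\lambda} F_a{}^{\mu\lambda} - \bar R_{\alpha\mu\beta\lambda} F^{a\alpha\beta} F_a{}^{\mu\lambda}, \qquad (2.15) \]
in view of (2.2). Finally, using the first Bianchi identities,
\[ \bar R_{\alpha\beta\mu\lambda} + \bar R_{\alpha\mu\lambda\beta} + \bar R_{\alpha\lambda\beta\mu} = 0, \qquad (2.16) \]
we deduce
\[ \bar R_{\alpha\beta\mu\lambda} F^{a\alpha\beta} F_a{}^{\mu\lambda} + \bar R_{\alpha\mu\lambda\beta} F^{a\alpha\beta} F_a{}^{\mu\lambda} + \bar R_{\alpha\lambda\beta\mu} F^{a\alpha\beta} F_a{}^{\mu\lambda} = 0, \qquad (2.17) \]
and hence
\[ \bar R_{\alpha\beta\mu\lambda} F^{a\alpha\beta} F_a{}^{\mu\lambda} = 2 \bar R_{\alpha\mu\beta\lambda} F^{a\alpha\beta} F_a{}^{\mu\lambda}, \qquad (2.18) \]
from which Eq. (2.9) immediately follows.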
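Proof of Theorem 1.2. Define
\[ u = F^2, \qquad (2.19) \]
then
\[ \bar R_{\beta\mu} F^{a\beta}{}_{\lambda} F_a{}^{\mu\lambda} - \tfrac12 \bar R_{\alpha\beta\mu\lambda} F_a{}^{\alpha\beta} F^{a\mu\lambda} \ge c_0 u, \qquad (2.20) \]
where $c_0 > 0$, in view of the assumption (1.4). Multiplying (2.9) with u and integrating by parts we obtain
\[ \tfrac38 \int_M |Du|^2 + c_0 \int_M u^2 \le c \int_M u^2 \sqrt{u}, \qquad (2.21) \]
where we used the simple estimate
\[ |Du|^2 \le 4 F_{a\mu\lambda;\alpha} F^{a\mu\lambda}{}_{;}{}^{\alpha}\, u, \qquad (2.22) \]
and where c depends on n and the dimension of g; note that
\[ f_c \in SO(g, \gamma_{ab}). \qquad (2.23) \]
The integral on the right-hand side of (2.21) is estimated by
\[ \int_M u^2 \sqrt{u} \le \Big( \int_M u^{\frac{n}{4}} \Big)^{\frac{2}{n}} \Big( \int_M u^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}}, \qquad (2.24) \]
where
\[ \Big( \int_M u^{\frac{n}{4}} \Big)^{\frac{2}{n}} = \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}}. \qquad (2.25) \]
Applying then the Sobolev inequality
\[ \Big( \int_M u^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}} \le c_1 \int_M |Du|^2 + c_2 \int_M u^2, \qquad (2.26) \]
cf. [1], we obtain
\[ \Big( \int_M u^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}} \le c_3 \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}} \Big( \int_M u^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}}, \qquad (2.27) \]
where $c_3$ depends on $c_1$, $c_2$, $c_0$ and c. Hence, we deduce $u \equiv 0$ or
\[ c_3^{-1} \le \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}}. \qquad (2.28) \]
Setting
\[ \kappa_0 = c_3^{-1} \qquad (2.29) \]
finishes the proof.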
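The Hölder step (2.24) splits $u^2\sqrt{u} = u^{1/2}\cdot u^2$ and uses the conjugate exponents $n/2$ and $n/(n-2)$. The following sketch checks the same inequality in a discrete setting, under the assumption that the integral over M is replaced by a finite sum over sample values of u; the function name `holder_sides` and the random data are hypothetical and serve only to illustrate the exponent bookkeeping.

```python
import random


def holder_sides(u, n):
    """Discrete analogue of (2.24) for nonnegative samples u and dimension n >= 3.

    Returns (lhs, rhs) with
      lhs = sum u^(5/2)
      rhs = (sum u^(n/4))^(2/n) * (sum u^(2n/(n-2)))^((n-2)/n).
    """
    lhs = sum(x ** 2.5 for x in u)
    first = sum(x ** (n / 4) for x in u) ** (2 / n)
    second = sum(x ** (2 * n / (n - 2)) for x in u) ** ((n - 2) / n)
    return lhs, first * second


random.seed(1)
u = [3.0 * random.random() for _ in range(200)]
for n in (3, 4, 7):
    lhs, rhs = holder_sides(u, n)
    print(n, lhs <= rhs * (1 + 1e-12))
```

Equality occurs only for constant samples, so for generic data the right-hand side is strictly larger.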
3. The Non-compact Case
We now suppose that $M = M^n$ is a complete, non-compact Riemannian manifold. Then there holds
\[ H^{1,2}(M) = H^{1,2}_0(M), \qquad (3.1) \]
i.e., the test functions $C_c^\infty(M)$ are dense in the Sobolev space $H^{1,2}(M)$, see [1, Lemma 4] or [2, Theorem 2.6]. Since we do not a priori know
\[ F^2 \in H^{1,2}(M), \qquad (3.2) \]
but only
\[ F^2 \in H^{1,2}_{loc}(M), \qquad (3.3) \]
the preceding proof has to be modified. Let η = η(t) be defined through
\[ \eta(t) = \begin{cases} 1, & t \le 1, \\ (2-t)^q, & 1 \le t \le 2, \\ 0, & t \ge 2, \end{cases} \qquad (3.4) \]
where
\[ q = \max\big(1, \tfrac{8}{n}\big). \qquad (3.5) \]
Fix a point $x_0 \in M$ and let r be the Riemannian distance function with center in $x_0$,
\[ r(x) = d(x_0, x). \qquad (3.6) \]
Then r is Lipschitz such that
\[ |Dr| = 1 \qquad (3.7) \]
almost everywhere. For k ≥ 1 define
\[ \eta_k(x) = \eta(k^{-1} r). \qquad (3.8) \]
The functions
\[ u^{p-1}\eta_k^p, \qquad (3.9) \]
where
\[ p = \tfrac{n}{4}, \qquad (3.10) \]
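then have compact support, and multiplying (2.9) with $u^{p-1}\eta_k^p$ yields
\[ \Big( \tfrac{p}{4} + \tfrac18 - \epsilon \Big) \int_M |Du|^2 u^{p-2} \eta_k^p \le c \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}} \Big( \int_M (u\eta_k)^{\frac{pn}{n-2}} \Big)^{\frac{n-2}{n}} + c_\epsilon \int_M |D\eta_k|^2 \eta_k^{p-2} u^p, \qquad (3.11) \]
where $0 < \epsilon$ is supposed to be small. Furthermore, there holds
\[ \int_M |D(u\eta_k)^{\frac{p}{2}}|^2 = \frac{p^2}{4} \int_M |Du\,\eta_k + u\,D\eta_k|^2 (u\eta_k)^{p-2} \le (1+\epsilon) \frac{p^2}{4} \int_M |Du|^2 u^{p-2} \eta_k^p + c_\epsilon \frac{p^2}{4} \int_M |D\eta_k|^2 \eta_k^{p-2} u^p. \qquad (3.12) \]
Now, choosing $\epsilon$ so small such that
\[ (1+\epsilon) \frac{p^2}{4} \le p \Big( \tfrac{p}{4} + \tfrac18 - \epsilon \Big) \qquad (3.13) \]
and setting
\[ \varphi = (u\eta_k)^{\frac{p}{2}}, \qquad (3.14) \]
we obtain
\[ \int_M |D\varphi|^2 \le p\,c \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}} \Big( \int_M \varphi^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}} + c_\epsilon \int_M |D\eta_k|^2 \eta_k^{p-2} u^p, \qquad (3.15) \]
where $c_\epsilon$ is a new constant. We furthermore observe that
\[ |D\eta_k|^2 \eta_k^{p-2} \le q^2 k^{-2} (2 - k^{-1} r)^{qp-2}, \qquad (3.16) \]
subject to
\[ 1 \le k^{-1} r \le 2. \qquad (3.17) \]
In view of (3.5) and (3.10)
\[ qp - 2 \ge 0, \qquad (3.18) \]
and hence
\[ |D\eta_k|^2 \eta_k^{p-2} \le q^2 k^{-2}. \qquad (3.19) \]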
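The bound (3.19) rests on the exponent identity $2(q-1) + q(p-2) = qp - 2$ together with $qp = \max(n/4, 2) \ge 2$. A minimal numerical sketch of this check follows, assuming the radial picture in which $\eta_k$ depends only on $t = r/k$ and $|Dr| = 1$; the function names `eta`, `eta_prime` and `weighted_gradient` are hypothetical.

```python
def eta(t, q):
    """Cut-off function from (3.4)."""
    if t <= 1.0:
        return 1.0
    if t >= 2.0:
        return 0.0
    return (2.0 - t) ** q


def eta_prime(t, q):
    """Derivative of eta on the open interval (1, 2); zero elsewhere."""
    if t <= 1.0 or t >= 2.0:
        return 0.0
    return -q * (2.0 - t) ** (q - 1.0)


def weighted_gradient(r, k, n):
    """|D eta_k|^2 * eta_k^(p-2) at distance r, using |Dr| = 1."""
    q = max(1.0, 8.0 / n)
    p = n / 4.0
    t = r / k
    if not (1.0 < t < 2.0):
        return 0.0  # D eta_k vanishes outside the transition region
    grad_sq = (eta_prime(t, q) / k) ** 2
    return grad_sq * eta(t, q) ** (p - 2.0)


for n in (3, 4, 8, 12):
    q = max(1.0, 8.0 / n)
    for k in (1, 5):
        bound = q ** 2 / k ** 2
        worst = max(weighted_gradient(k * (1.0 + j / 1000.0), k, n) for j in range(1, 1000))
        print(n, k, worst <= bound * (1 + 1e-9))
```

For n < 8 the exponent qp − 2 vanishes, so the weighted gradient equals $q^2k^{-2}$ throughout the transition region, even though $\eta_k^{p-2}$ alone is unbounded there; this is why q is chosen as in (3.5).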
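Applying now the Sobolev inequality (1.6) to $\varphi$ and choosing
\[ \kappa_0 = (c_1 c\, p)^{-1}, \qquad (3.20) \]
we conclude $|F| \equiv 0$, if
\[ \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}} < \kappa_0. \qquad (3.21) \]
Indeed, if the preceding inequality is valid, then we deduce from (3.15),
\[ \Big( 1 - \kappa_0^{-1} \Big( \int_M |F|^{\frac{n}{2}} \Big)^{\frac{2}{n}} \Big) \Big( \int_M |\varphi|^{\frac{2n}{n-2}} \Big)^{\frac{n-2}{n}} \le c_\epsilon\, q^2 k^{-2} \int_M |F|^{\frac{n}{2}}. \qquad (3.22) \]
In the limit k → ∞ we obtain
\[ \Big( \int_M |u|^{\frac{pn}{n-2}} \Big)^{\frac{n-2}{n}} \le 0. \qquad (3.23) \]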
References
1. Aubin, T.: Problèmes isopérimétriques et espaces de Sobolev. J. Diff. Geom. 11(4), 573–598 (1976)
2. Aubin, T.: Nonlinear analysis on manifolds. Monge-Ampère equations. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Vol. 252, New York: Springer-Verlag, 1982
3. Bourguignon, J.-P., Lawson, H.B. Jr.: Stability and isolation phenomena for Yang-Mills fields. Commun. Math. Phys. 79(2), 189–230 (1981)
4. Uhlenbeck, K.K.: Connections with $L^p$ bounds on curvature. Commun. Math. Phys. 83(1), 31–42 (1982)
Communicated by A. Connes
Commun. Math. Phys. 298, 523–547 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1011-1
Communications in
Mathematical Physics
Generators of KMS Symmetric Markov Semigroups on B(h)
Symmetry and Quantum Detailed Balance
Franco Fagnola¹, Veronica Umanità²
¹ Department of Mathematics, Politecnico di Milano, P. L. da Vinci 32, I-20133 Milano, Italy. E-mail: [email protected]
² Department of Mathematics, University of Genoa, V. Dodecaneso 35, I-16146 Genova, Italy. E-mail: [email protected]
Received: 11 August 2009 / Accepted: 4 December 2009 Published online: 24 February 2010 – © Springer-Verlag 2010
Abstract: We find the structure of generators of norm-continuous quantum Markov semigroups on B(h) that are symmetric with respect to the scalar product tr (ρ 1/2 x ∗ ρ 1/2 y) induced by a faithful normal invariant state ρ and satisfy two quantum generalisations of the classical detailed balance condition related with this non-commutative notion of symmetry: the so-called standard detailed balance condition and the standard detailed balance condition with an antiunitary time reversal. 1. Introduction Symmetric Markov semigroups have been extensively studied in classical stochastic analysis (Fukushima et al. [13] and the references therein) because their generators and associated Dirichlet forms are very well tractable by Hilbert space and probabilistic methods. Their non-commutative counterpart has also been deeply investigated (Albeverio and Goswami [1], Cipriani [6], Davies and Lindsay [8], Goldstein and Lindsay [15], Guido, Isola and Scarlatti [17], Park [23], Sauvageot [26] and the references therein). The classical notion of symmetry with respect to a measure, however, admits several non-commutative generalisations. Here we shall consider the so-called KMS-symmetry that seems more natural from a mathematical point of view (see e.g. Accardi and Mohari [3], Cipriani [6,7], Goldstein and Lindsay [14], Petz [25]) and find the structure of generators of norm-continuous quantum Markov semigroups (QMS) on the von Neumann algebra B(h) of all bounded operators on a complex separable Hilbert space h that are symmetric or satisfy quantum detailed balance conditions associated with KMS-symmetry or generalising it. We consider QMS on B(h), i.e. weak∗ -continuous semigroups of normal, completely positive, identity preserving maps T = (Tt )t≥0 on B(h), with a faithful normal invariant state ρ. This defines pre-scalar products on B(h) by (x, y)s = tr (ρ 1−s x ∗ ρ s y) for s ∈ [0, 1] and allows one to define the s-dual semigroup T on B(h) satisfying
F. Fagnola, V. Umanità
$$\mathrm{tr}(\rho^{1-s} x^* \rho^{s}\, \mathcal{T}_t(y)) = \mathrm{tr}(\rho^{1-s}\, \mathcal{T}'_t(x)^* \rho^{s} y)$$
for all $x, y \in \mathcal{B}(\mathsf{h})$. The above scalar products coincide on an Abelian von Neumann algebra; the notion of symmetry $\mathcal{T} = \mathcal{T}'$, however, clearly depends on the choice of the parameter $s$. The most studied cases are $s = 0$ and $s = 1/2$. Denoting by $\mathcal{T}_*$ the predual semigroup, a simple computation yields
$$\mathcal{T}'_t(x) = \rho^{-(1-s)}\, \mathcal{T}_{*t}(\rho^{1-s} x \rho^{s})\, \rho^{-s},$$
and shows that for $s = 1/2$ the maps $\mathcal{T}'_t$ are positive but, for $s \neq 1/2$, this may not be the case. Indeed, it is well known that, for $s \neq 1/2$, the maps $\mathcal{T}'_t$ are positive if and only if the maps $\mathcal{T}_t$ commute with the modular group $(\sigma_t)_{t\in\mathbb{R}}$, $\sigma_t(x) = \rho^{it} x \rho^{-it}$ (see e.g. [18] Prop. 2.1, p. 98, [22] Th. 6, p. 7985, for $s = 0$, [11] Th. 3.1, p. 341, Prop. 8.1, p. 362 for $s = 1/2$). This quite restrictive condition implies that the generator has a very special form that simplifies the mathematical study of symmetry but imposes strong structural constraints (see e.g. [18] and [12]).

Here we shall consider the most natural choice $s = 1/2$, whose consequences are not so stringent, and say that $\mathcal{T}$ is KMS-symmetric if it coincides with its dual $\mathcal{T}'$. KMS-symmetric QMS were introduced by Cipriani [6] and Goldstein and Lindsay [14]; we refer to [7] for a discussion of the connection with the KMS condition justifying this terminology.

All quantum versions of the classical principle of detailed balance (Agarwal [4], Alicki [5], Frigerio, Gorini, Kossakowski and Verri [18], Majewski [20,21]), which is at the basis of equilibrium physics, are formulated by prescribing a certain relationship between $\mathcal{T}$ and $\mathcal{T}'$ or between their generators; therefore they depend on the underlying notion of symmetry.
This work clarifies the structure of generators of QMS that are KMS-symmetric or satisfy a quantum detailed balance condition involving the above scalar product with $s = 1/2$, and is a key step towards understanding which of these conditions is the most natural and flexible for the study of their generalisations to quantum systems out of equilibrium such as, for instance, the dynamical detailed balance condition introduced by Accardi and Imafuku [2].

The generator $\mathcal{L}$ of a norm-continuous QMS can be written in the standard Gorini-Kossakowski-Sudarshan [16] and Lindblad [19] (GKSL) form
$$\mathcal{L}(x) = i[H, x] - \frac{1}{2} \sum_{\ell\ge 1} \left( L_\ell^* L_\ell x - 2 L_\ell^* x L_\ell + x L_\ell^* L_\ell \right), \qquad (1)$$
where $H, L_\ell \in \mathcal{B}(\mathsf{h})$ with $H = H^*$ and the series $\sum_{\ell\ge 1} L_\ell^* L_\ell$ is strongly convergent. The operators $L_\ell$, $H$ in (1) are not uniquely determined by $\mathcal{L}$; however, under a natural minimality condition (Theorem 2 below) and a zero-mean condition $\mathrm{tr}(\rho L_\ell) = 0$ for all $\ell \ge 1$, $H$ is determined up to a scalar multiple of the identity operator and the $(L_\ell)_{\ell\ge 1}$ up to a unitary transformation of the multiplicity space of the completely positive part of $\mathcal{L}$. We shall call special a GKSL representation of $\mathcal{L}$ by operators $H, L_\ell$ satisfying these conditions. As a result, by the remark following Theorem 2, in a special GKSL representation of $\mathcal{L}$, the operator $G = -2^{-1} \sum_{\ell\ge 1} L_\ell^* L_\ell - iH$ is uniquely determined by $\mathcal{L}$ up to a purely imaginary multiple of the identity operator and allows us to write $\mathcal{L}$ in the form
$$\mathcal{L}(x) = G^* x + \sum_{\ell\ge 1} L_\ell^* x L_\ell + x G. \qquad (2)$$
Our characterisations of QMS that are KMS-symmetric or satisfy a quantum detailed balance condition related to KMS-symmetry are given in terms of the operators $G, L_\ell$ (or, equivalently, $H, L_\ell$) of a special GKSL representation.
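The passage from the GKSL form (1) to the form (2) is a short computation. The following sketch spells it out, assuming only $H = H^*$ and the strong convergence of $\sum_\ell L_\ell^* L_\ell$ stated above:

```latex
% With G = -\tfrac12 \sum_{\ell\ge1} L_\ell^* L_\ell - iH and H = H^*,
% the adjoint is G^* = -\tfrac12 \sum_{\ell\ge1} L_\ell^* L_\ell + iH, so
\begin{align*}
G^* x + x G
  &= \Bigl(-\tfrac12 \sum_{\ell\ge1} L_\ell^* L_\ell + iH\Bigr) x
   + x \Bigl(-\tfrac12 \sum_{\ell\ge1} L_\ell^* L_\ell - iH\Bigr) \\
  &= i[H,x] - \tfrac12 \sum_{\ell\ge1}
     \bigl( L_\ell^* L_\ell\, x + x\, L_\ell^* L_\ell \bigr).
\end{align*}
% Adding the completely positive part \sum_{\ell\ge1} L_\ell^* x L_\ell
% to both sides gives exactly the right-hand side of (1).
```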
Generators of Quantum Markov Semigroups and Detailed Balance
Theorem 7 shows that a QMS is KMS-symmetric if and only if the operators $G, L_\ell$ of a special GKSL representation of its generator satisfy $\rho^{1/2} G^* = G \rho^{1/2} + ic\rho^{1/2}$ for some $c \in \mathbb{R}$ and $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$ for all $k$ and some unitary $(u_{k\ell})_{k\ell}$ on the multiplicity space of the completely positive part of $\mathcal{L}$ coinciding with its transpose, i.e. such that $u_{k\ell} = u_{\ell k}$ for all $k, \ell$.

In order to describe our results on the structure of generators of QMS satisfying a quantum detailed balance condition we first recall some basic definitions. The best known is due to Alicki [5] and Frigerio-Gorini-Kossakowski-Verri [18]: a norm-continuous QMS $\mathcal{T} = (\mathcal{T}_t)_{t\ge 0}$ on $\mathcal{B}(\mathsf{h})$ satisfies the Quantum Detailed Balance (QDB) condition if there exists an operator $\widetilde{\mathcal{L}}$ on $\mathcal{B}(\mathsf{h})$ and a self-adjoint operator $K$ on $\mathsf{h}$ such that $\mathrm{tr}(\rho \widetilde{\mathcal{L}}(x) y) = \mathrm{tr}(\rho x \mathcal{L}(y))$ and $\mathcal{L}(x) - \widetilde{\mathcal{L}}(x) = 2i[K, x]$ for all $x, y \in \mathcal{B}(\mathsf{h})$. Roughly speaking, $\mathcal{L}$ satisfies the QDB condition if the difference of $\mathcal{L}$ and its adjoint $\widetilde{\mathcal{L}}$ with respect to the pre-scalar product on $\mathcal{B}(\mathsf{h})$ given by $\mathrm{tr}(\rho a^* b)$ is a derivation.

This QDB implies that the operator $\widetilde{\mathcal{L}} = \mathcal{L} - 2i[K, \cdot\,]$ can be written in the form (2) replacing $G$ by $G + 2iK$ and then generates a QMS $\widetilde{\mathcal{T}}$. Therefore $\mathcal{L}$ and the maps $\mathcal{T}_t$ commute with the modular group. This restriction does not follow if the dual QMS is defined with respect to the symmetric pre-scalar product with $s = 1/2$.

The QDB can be readily reformulated by replacing $\widetilde{\mathcal{L}}$ with the adjoint $\mathcal{L}'$ defined via the symmetric scalar product; the resulting condition will be called the Standard Quantum Detailed Balance condition (SQDB) (see e.g. [9]).

Theorem 5 characterises generators $\mathcal{L}$ satisfying the SQDB and extends previous partial results by Park [23] and the authors [11]: the SQDB holds if and only if there exists a unitary matrix $(u_{k\ell})_{k\ell}$, coinciding with its transpose, i.e. $u_{k\ell} = u_{\ell k}$ for all $k, \ell$, such that $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$. This shows, in particular, that the SQDB depends only on the $L_\ell$'s and does not directly involve $H$ and $G$.
Moreover, we find explicitly the unitary $(u_{k\ell})_{k\ell}$, providing also a geometrical characterisation of the SQDB (Theorem 6) in terms of the operators $L_\ell \rho^{1/2}$ and their adjoints as Hilbert-Schmidt operators on $\mathsf{h}$.

We also consider (Definition 3) another notion of quantum detailed balance, inspired by Agarwal's original notion (see [4], Majewski [20,21], Talkner [27]), involving an antiunitary time reversal operator $\theta$ which does not play any role in the Alicki et al. definition. Time reversal takes into account the parity of quantum observables; position and energy, for instance, are even, i.e. invariant under time reversal, while momentum is odd, i.e. changes sign under time reversal. Agarwal's original definition, however, depends on the $s = 0$ pre-scalar product and therefore implies that a QMS satisfying this quantum detailed balance condition must commute with the modular automorphism. Here we study the modified version (Definition 3) involving the symmetric $s = 1/2$ pre-scalar product, which we call the SQDB-$\theta$ condition. Theorem 8 shows that $\mathcal{L}$ satisfies the SQDB-$\theta$ condition if and only if there exists a special GKSL representation of $\mathcal{L}$ by means of operators $H, L_\ell$ such that $G\rho^{1/2} = \rho^{1/2} \theta G^* \theta$ and a unitary self-adjoint $(u_{k\ell})_{k\ell}$ such that $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell}\, \theta L_\ell \theta \rho^{1/2}$ for all $k$. Here again $(u_{k\ell})_{k\ell}$ is explicitly determined by the operators $L_\ell \rho^{1/2}$ (Theorem 9).

We think that these results show that the SQDB condition is somewhat weaker than the SQDB-$\theta$ condition because the first does not directly involve the operators $H$, $G$. Moreover, the unitary operator in the linear relationship between the $L_\ell \rho^{1/2}$ and their adjoints is transpose symmetric and any point of the unit circle could be in its spectrum while, for generators satisfying the SQDB-$\theta$, it is self-adjoint and its spectrum is contained in $\{-1, 1\}$. Therefore, by the spectral theorem, it is possible in principle to find a standard form for the generators of QMSs satisfying the SQDB-$\theta$ generalising the
standard form of generators satisfying the usual QDB condition (that commute with the modular group), as illustrated in the case of QMSs on $M_2(\mathbb{C})$ studied in the last section. This classification must be much more complex for generators of QMSs satisfying the SQDB.

The above arguments and the fact that the SQDB-$\theta$ condition can be formulated in a simple way either on the QMS or on its generator (this is not the case for the QDB when $\mathcal{L}$ and its Hamiltonian part $i[H, \cdot\,]$ do not commute) lead us to the conclusion that the SQDB-$\theta$ is the more natural non-commutative version of the classical detailed balance condition.

The paper is organised as follows. In Sect. 2 we construct the dual QMS $\mathcal{T}'$ and recall the quantum detailed balance conditions we investigate, then we study the relationship between the generators of a QMS and its adjoint in Sect. 3. Our main results on the structure of generators are proved in Sects. 4 (QDB without time reversal) and 5 (with time reversal).

2. The Dual QMS, KMS-Symmetry and Quantum Detailed Balance

We start this section by constructing the dual semigroup of a norm-continuous QMS with respect to the $(\cdot, \cdot)_{1/2}$ pre-scalar product on $\mathcal{B}(\mathsf{h})$ defined by an invariant state $\rho$ and prove some properties that will be useful in the sequel. Although this result may be known, the presentation given here leads in a simple and direct way to the dual QMS, avoiding non-commutative $L^p$-space techniques.

Proposition 1. Let $\Phi$ be a positive unital normal map on $\mathcal{B}(\mathsf{h})$ with a faithful normal invariant state $\rho$. There exists a unique positive unital normal map $\Phi'$ on $\mathcal{B}(\mathsf{h})$ such that
$$\mathrm{tr}\bigl(\rho^{1/2} \Phi'(x) \rho^{1/2} y\bigr) = \mathrm{tr}\bigl(\rho^{1/2} x \rho^{1/2} \Phi(y)\bigr)$$
for all $x, y \in \mathcal{B}(\mathsf{h})$. If $\Phi$ is completely positive, then $\Phi'$ is also completely positive.

Proof. Let $\Phi_*$ be the predual map on the Banach space of trace class operators on $\mathsf{h}$ and let $\mathrm{Rk}(\rho^{1/2})$ denote the range of the operator $\rho^{1/2}$. This is clearly dense in $\mathsf{h}$ because $\rho$ is faithful and coincides with the domain of the unbounded self-adjoint operator $\rho^{-1/2}$.
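For all self-adjoint $x \in \mathcal{B}(\mathsf{h})$ consider the sesquilinear form on the domain $\mathrm{Rk}(\rho^{1/2}) \times \mathrm{Rk}(\rho^{1/2})$,
$$F(v, u) = \langle \rho^{-1/2} v, \Phi_*(\rho^{1/2} x \rho^{1/2}) \rho^{-1/2} u \rangle.$$
By the invariance of $\rho$ and positivity of $\Phi_*$ we have
$$-\|x\| \rho = -\|x\| \Phi_*(\rho) \le \Phi_*(\rho^{1/2} x \rho^{1/2}) \le \|x\| \Phi_*(\rho) = \|x\| \rho.$$
Therefore $|F(v, u)| \le \|x\| \cdot \|v\| \cdot \|u\|$. Thus the sesquilinear form is bounded and there exists a unique bounded operator $y$ such that, for all $u, v \in \mathrm{Rk}(\rho^{1/2})$,
$$\langle v, y u \rangle = \langle \rho^{-1/2} v, \Phi_*(\rho^{1/2} x \rho^{1/2}) \rho^{-1/2} u \rangle.$$
Note that, $\Phi$ being a $*$-map and $x$ self-adjoint,
$$\langle v, y^* u \rangle = \overline{\langle y^* u, v \rangle} = \overline{\langle \rho^{-1/2} u, \Phi_*(\rho^{1/2} x \rho^{1/2}) \rho^{-1/2} v \rangle} = \overline{\langle \Phi_*(\rho^{1/2} x \rho^{1/2}) \rho^{-1/2} u, \rho^{-1/2} v \rangle} = \langle \rho^{-1/2} v, \Phi_*(\rho^{1/2} x \rho^{1/2}) \rho^{-1/2} u \rangle.$$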
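This shows that $y$ is self-adjoint. Defining $\Phi'(x) := y$, we find a real-linear map on the self-adjoint operators in $\mathcal{B}(\mathsf{h})$ that can be extended to a linear map on $\mathcal{B}(\mathsf{h})$ by decomposing each operator as the sum of its self-adjoint and anti self-adjoint parts. Clearly $\Phi'$ is positive because $\rho^{1/2} \Phi'(x^* x) \rho^{1/2} = \Phi_*(\rho^{1/2} x^* x \rho^{1/2})$ and $\Phi_*$ is positive. Moreover, by the above construction $\Phi'(\mathbb{1}) = \mathbb{1}$, i.e. $\Phi'$ is unital. Therefore $\Phi'$ is a norm-one contraction.

If $\Phi$ is completely positive, then $\Phi_*$ is also completely positive and the formula $\rho^{1/2} \Phi'(x) \rho^{1/2} = \Phi_*(\rho^{1/2} x \rho^{1/2})$ shows that $\Phi'$ is completely positive.

Finally we show that $\Phi'$ is normal. Let $(x_\alpha)_\alpha$ be a net of positive operators on $\mathcal{B}(\mathsf{h})$ with least upper bound $x \in \mathcal{B}(\mathsf{h})$. For all $u \in \mathsf{h}$ we have then
$$\sup_\alpha \langle \rho^{1/2} u, \Phi'(x_\alpha) \rho^{1/2} u \rangle = \sup_\alpha \langle u, \Phi_*(\rho^{1/2} x_\alpha \rho^{1/2}) u \rangle = \langle u, \Phi_*(\rho^{1/2} x \rho^{1/2}) u \rangle = \langle \rho^{1/2} u, \Phi'(x) \rho^{1/2} u \rangle.$$
Now if $u \in \mathsf{h}$, for every $\varepsilon > 0$ we can find a $u_\varepsilon \in \mathrm{Rk}(\rho^{1/2})$ such that $\|u - u_\varepsilon\| < \varepsilon$ by the density of the range of $\rho^{1/2}$. We have then
$$\bigl| \langle u, (\Phi'(x_\alpha) - \Phi'(x)) u \rangle \bigr| \le \varepsilon \, \|\Phi'(x_\alpha) - \Phi'(x)\| \, (\|u\| + \|u_\varepsilon\|) + \bigl| \langle u_\varepsilon, (\Phi'(x_\alpha) - \Phi'(x)) u_\varepsilon \rangle \bigr|$$
for all $\alpha$. The conclusion follows from the arbitrariness of $\varepsilon$ and the uniform boundedness of $\|\Phi'(x_\alpha) - \Phi'(x)\|$ and $\|u_\varepsilon\|$.

Theorem 1. Let $\mathcal{T}$ be a QMS on $\mathcal{B}(\mathsf{h})$ with a faithful normal invariant state $\rho$. There exists a QMS $\mathcal{T}'$ on $\mathcal{B}(\mathsf{h})$ such that
$$\rho^{1/2} \mathcal{T}'_t(x) \rho^{1/2} = \mathcal{T}_{*t}(\rho^{1/2} x \rho^{1/2}) \qquad (3)$$
for all $x \in \mathcal{B}(\mathsf{h})$ and all $t \ge 0$.

Proof. By Proposition 1, for each $t \ge 0$, there exists a unique completely positive normal and unital contraction $\mathcal{T}'_t$ on $\mathcal{B}(\mathsf{h})$ satisfying (3). The semigroup property follows from the algebraic computation
$$\rho^{1/2} \mathcal{T}'_{t+s}(x) \rho^{1/2} = \mathcal{T}_{*t} \mathcal{T}_{*s}(\rho^{1/2} x \rho^{1/2}) = \mathcal{T}_{*t}\bigl(\rho^{1/2} \mathcal{T}'_s(x) \rho^{1/2}\bigr) = \rho^{1/2} \mathcal{T}'_t\bigl(\mathcal{T}'_s(x)\bigr) \rho^{1/2}.$$
Since the map $t \mapsto \langle \rho^{1/2} v, \mathcal{T}'_t(x) \rho^{1/2} u \rangle$ is continuous by the identity (3) for all $u, v \in \mathsf{h}$, and $\|\mathcal{T}'_t(x)\| \le \|x\|$ for all $t \ge 0$, a $2\varepsilon$ approximation argument shows that $t \mapsto \mathcal{T}'_t(x)$ is continuous for the weak$^*$-operator topology on $\mathcal{B}(\mathsf{h})$. It follows that $\mathcal{T}' = (\mathcal{T}'_t)_{t\ge 0}$ is a QMS on $\mathcal{B}(\mathsf{h})$.

Definition 1. The quantum Markov semigroup $\mathcal{T}'$ is called the dual semigroup of $\mathcal{T}$ with respect to the invariant state $\rho$.

It is easy to see, using (3), that $\rho$ is an invariant state also for $\mathcal{T}'$.

Remark 1. When $\mathcal{T}$ is norm-continuous it is not clear whether $\mathcal{T}'$ is also norm-continuous. Here, however, we are interested in generators of symmetric or detailed balance QMS. We shall see that these additional properties of $\mathcal{T}$ imply that $\mathcal{T}'$ is also norm-continuous. Therefore we proceed by studying norm-continuous QMSs whose dual is also norm-continuous.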
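The invariance of $\rho$ under the dual semigroup, stated without proof after Definition 1, can be checked directly from (3). A short sketch, using only cyclicity of the trace and the fact that the predual maps $\mathcal{T}_{*t}$ preserve the trace because each $\mathcal{T}_t$ is unital:

```latex
% For every x in B(h) and t >= 0:
\begin{align*}
\mathrm{tr}\bigl(\rho\, \mathcal{T}'_t(x)\bigr)
  &= \mathrm{tr}\bigl(\rho^{1/2} \mathcal{T}'_t(x) \rho^{1/2}\bigr)
     && \text{(cyclicity of the trace)} \\
  &= \mathrm{tr}\bigl(\mathcal{T}_{*t}(\rho^{1/2} x \rho^{1/2})\bigr)
     && \text{(identity (3))} \\
  &= \mathrm{tr}\bigl(\rho^{1/2} x \rho^{1/2}\, \mathcal{T}_t(\mathbb{1})\bigr)
     && \text{(definition of the predual)} \\
  &= \mathrm{tr}\bigl(\rho^{1/2} x \rho^{1/2}\bigr)
   = \mathrm{tr}(\rho x)
     && \text{(} \mathcal{T}_t(\mathbb{1}) = \mathbb{1} \text{)}.
\end{align*}
```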
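The quantum detailed balance condition of Alicki, Frigerio, Gorini, Kossakowski and Verri, modified by considering the pre-scalar product $(\cdot, \cdot)_{1/2}$ on $\mathcal{B}(\mathsf{h})$ and usually called standard (see e.g. [9]) because of multiplications by $\rho^{1/2}$ as in the standard representation of $\mathcal{B}(\mathsf{h})$, is defined as follows.

Definition 2. The QMS $\mathcal{T}$ generated by $\mathcal{L}$ satisfies the standard quantum detailed balance condition (SQDB) if there exists an operator $\mathcal{L}'$ on $\mathcal{B}(\mathsf{h})$ and a self-adjoint operator $K$ on $\mathsf{h}$ such that
$$\mathrm{tr}(\rho^{1/2} x \rho^{1/2} \mathcal{L}(y)) = \mathrm{tr}(\rho^{1/2} \mathcal{L}'(x) \rho^{1/2} y), \qquad \mathcal{L}(x) - \mathcal{L}'(x) = 2i[K, x] \qquad (4)$$
for all $x, y \in \mathcal{B}(\mathsf{h})$.

The operator $\mathcal{L}'$ in the above definition must be norm-bounded because it is everywhere defined and norm closed. To see this, consider a sequence $(x_n)_{n\ge 1}$ in $\mathcal{B}(\mathsf{h})$ converging in norm to an $x \in \mathcal{B}(\mathsf{h})$ such that $(\mathcal{L}'(x_n))_{n\ge 1}$ converges in norm to $b \in \mathcal{B}(\mathsf{h})$ and note that
$$\mathrm{tr}\bigl(\rho^{1/2} \mathcal{L}'(x) \rho^{1/2} y\bigr) = \lim_{n\to\infty} \mathrm{tr}\bigl(\rho^{1/2} x_n \rho^{1/2} \mathcal{L}(y)\bigr) = \lim_{n\to\infty} \mathrm{tr}\bigl(\rho^{1/2} \mathcal{L}'(x_n) \rho^{1/2} y\bigr) = \mathrm{tr}\bigl(\rho^{1/2} b \rho^{1/2} y\bigr)$$
for all $y \in \mathcal{B}(\mathsf{h})$. The elements $\rho^{1/2} y \rho^{1/2}$, with $y \in \mathcal{B}(\mathsf{h})$, are dense in the Banach space of trace class operators on $\mathsf{h}$ because $\rho$ is faithful. Therefore $\mathcal{L}'(x) = b$ and $\mathcal{L}'$ is closed. Since both $\mathcal{L}$ and $\mathcal{L}'$ are bounded, $K$ is also bounded.

We now introduce another definition of quantum detailed balance, due to Agarwal [4] with the $s = 0$ pre-scalar product, that involves a time reversal $\theta$. This is an antiunitary operator on $\mathsf{h}$, i.e. $\langle \theta u, \theta v \rangle = \langle v, u \rangle$ for all $u, v \in \mathsf{h}$, such that $\theta^2 = \mathbb{1}$ and $\theta^{-1} = \theta^* = \theta$. Recall that $\theta$ is antilinear, i.e. $\theta(z u) = \bar{z}\, \theta u$ for all $u \in \mathsf{h}$, $z \in \mathbb{C}$, and its adjoint $\theta^*$ satisfies $\langle u, \theta v \rangle = \langle v, \theta^* u \rangle$ for all $u, v \in \mathsf{h}$. Moreover $\theta x \theta$ belongs to $\mathcal{B}(\mathsf{h})$ (linearity is re-established) and $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$ for every trace-class operator $x$ ([10] Prop. 4). Indeed, taking an orthonormal basis $(e_j)_j$ of $\mathsf{h}$, we have
$$\mathrm{tr}(\theta x \theta) = \sum_j \langle e_j, \theta x \theta e_j \rangle = \sum_j \langle x \theta e_j, \theta^* e_j \rangle = \sum_j \langle \theta e_j, x^* \theta^* e_j \rangle = \mathrm{tr}(x^*).$$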
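As a concrete illustration of the identity $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$, consider the simplest time reversal, complex conjugation in a fixed orthonormal basis. The $2 \times 2$ matrix below is a hypothetical example chosen for illustration, not one taken from the paper:

```latex
% Take \theta = complex conjugation with respect to an orthonormal basis (e_j):
%   \theta \sum_j z_j e_j = \sum_j \bar z_j e_j .
% Then \theta x \theta has matrix entries \overline{x_{ij}}, so
\[
  \mathrm{tr}(\theta x \theta) = \sum_j \overline{x_{jj}}
  = \overline{\mathrm{tr}(x)} = \mathrm{tr}(x^*).
\]
% Example on C^2 (illustrative choice of x):
\[
  x = \begin{pmatrix} i & 0 \\ 0 & 0 \end{pmatrix},
  \qquad
  \theta x \theta = \begin{pmatrix} -i & 0 \\ 0 & 0 \end{pmatrix},
  \qquad
  \mathrm{tr}(\theta x \theta) = -i = \mathrm{tr}(x^*) \neq \mathrm{tr}(x) = i .
\]
```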
It is worth noticing that the cyclic property of the trace does not hold for $\theta$, since $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$ may not be equal to $\mathrm{tr}(x)$ for non-self-adjoint $x$.

Definition 3. The QMS $\mathcal{T}$ generated by $\mathcal{L}$ satisfies the standard quantum detailed balance condition with respect to the time reversal $\theta$ (SQDB-$\theta$) if
$$\mathrm{tr}(\rho^{1/2} x \rho^{1/2} \mathcal{L}(y)) = \mathrm{tr}(\rho^{1/2} \theta y^* \theta \rho^{1/2} \mathcal{L}(\theta x^* \theta)) \qquad (5)$$
for all $x, y \in \mathcal{B}(\mathsf{h})$.
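The operator $\theta$ is used to take into account the parity of the observables under time reversal. Indeed, a self-adjoint operator $x \in \mathcal{B}(\mathsf{h})$ is called even (resp. odd) if $\theta x \theta = x$ (resp. $\theta x \theta = -x$). The typical example of antilinear time reversal is a conjugation (with respect to some orthonormal basis of $\mathsf{h}$).

This condition is usually stated ([20,21,27]) for the QMS $\mathcal{T}$ as
$$\mathrm{tr}(\rho^{1/2} x \rho^{1/2} \mathcal{T}_t(y)) = \mathrm{tr}(\rho^{1/2} \theta y^* \theta \rho^{1/2} \mathcal{T}_t(\theta x^* \theta)) \qquad (6)$$
for all $t \ge 0$, $x, y \in \mathcal{B}(\mathsf{h})$. In particular, for $t = 0$ we find that this identity holds if and only if $\rho$ and $\theta$ commute, i.e. $\rho$ is an even observable. This is the case, for instance, when $\rho$ is a function of the energy.

Lemma 1. The following conditions are equivalent:
(i) $\theta$ and $\rho$ commute,
(ii) $\mathrm{tr}(\rho^{1/2} x \rho^{1/2} y) = \mathrm{tr}(\rho^{1/2} \theta y^* \theta \rho^{1/2} \theta x^* \theta)$ for all $x, y \in \mathcal{B}(\mathsf{h})$.

Proof. If $\rho$ and $\theta$ commute, from $\mathrm{tr}(\theta a \theta) = \mathrm{tr}(a^*)$ we have
$$\mathrm{tr}(\rho^{1/2} \theta y^* \theta \rho^{1/2} \theta x^* \theta) = \mathrm{tr}(\theta (\rho^{1/2} y^* \rho^{1/2} x^*) \theta) = \mathrm{tr}(x \rho^{1/2} y \rho^{1/2})$$
and (ii) follows by cycling $\rho^{1/2}$. Conversely, if (ii) holds, taking $x = \mathbb{1}$, we have
$$\mathrm{tr}(\rho y) = \mathrm{tr}(\rho \theta y^* \theta) = \mathrm{tr}\bigl(\theta (\theta y^* \theta)^* \rho \theta\bigr) = \mathrm{tr}(y \theta \rho \theta) = \mathrm{tr}(\theta \rho \theta y)$$
for all $y \in \mathcal{B}(\mathsf{h})$, and $\rho = \theta \rho \theta$.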
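Proposition 2. If $\rho$ and $\theta$ commute then (5) and (6) are equivalent.

Proof. Clearly (5) follows from (6) by differentiating at $t = 0$. Conversely, putting $\alpha(x) = \theta x \theta$ and denoting by $\mathcal{L}_*$ the predual of $\mathcal{L}$, we can write (5) as
$$\mathrm{tr}(\mathcal{L}_*(\rho^{1/2} x \rho^{1/2}) y) = \mathrm{tr}\bigl(\rho^{1/2} \alpha(y^*) \rho^{1/2} \mathcal{L}(\alpha(x^*))\bigr) = \mathrm{tr}\bigl(\rho^{1/2} \alpha(\mathcal{L}(\alpha(x))) \rho^{1/2} y\bigr)$$
for all $y \in \mathcal{B}(\mathsf{h})$, because $\mathrm{tr}(\alpha(a)) = \mathrm{tr}(a^*)$. Therefore we have $\mathcal{L}_*(\rho^{1/2} x \rho^{1/2}) = \rho^{1/2} \alpha(\mathcal{L}(\alpha(x))) \rho^{1/2}$ and, iterating, $\mathcal{L}_*^n(\rho^{1/2} x \rho^{1/2}) = \rho^{1/2} \alpha(\mathcal{L}^n(\alpha(x))) \rho^{1/2}$ for all $n \ge 1$. It follows that (5) holds for all powers $\mathcal{L}^n$ with $n \ge 1$. Since $\rho$ and $\theta$ commute, it is true also for $n = 0$ and we find (6) by the exponentiation formula $\mathcal{T}_t = \sum_{n\ge 0} t^n \mathcal{L}^n / n!$.

We do not know whether the SQDB condition (4) of Definition 2 has a simple explicit formulation in terms of the maps $\mathcal{T}_t$ if $\mathcal{L}$ and $\mathcal{L}'$ do not commute.

Remark 2. The SQDB-$\theta$ condition (5), by $\mathrm{tr}(\theta a \theta) = \mathrm{tr}(a^*)$, reads
$$\mathrm{tr}(\rho^{1/2} x \rho^{1/2} \mathcal{L}(y)) = \mathrm{tr}\bigl(\rho^{1/2} (\theta \mathcal{L}(\theta x \theta) \theta) \rho^{1/2} y\bigr)$$
for all $x, y \in \mathcal{B}(\mathsf{h})$, i.e. $\mathcal{L}'(x) = \theta \mathcal{L}(\theta x \theta) \theta$. Write $\mathcal{L}$ in a special GKSL form as in (1) and decompose the generator $\mathcal{L} = \mathcal{L}_0 + i[H, \cdot\,]$ into the sum of its dissipative part $\mathcal{L}_0$ and derivation part $i[H, \cdot\,]$. If $H$ commutes with $\theta$, by the antilinearity of $\theta$, we find $\mathcal{L}'(x) = \theta \mathcal{L}_0(\theta x \theta) \theta - i[H, x]$. Therefore, if the dissipative part is time reversal invariant, i.e. $\mathcal{L}_0(x) = \theta \mathcal{L}_0(\theta x \theta) \theta$, we end up with $\mathcal{L}' = \mathcal{L} - 2i[H, \cdot\,]$. The relationship with Definition 2 of SQDB, in this case, is then clear. The conditions of Definitions 2 and 3, however, are in general not comparable.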
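The sign change of the Hamiltonian part in Remark 2 comes entirely from the antilinearity of $\theta$. A brief sketch of that step, under the remark's assumptions $\theta H \theta = H$ and $\theta^2 = \mathbb{1}$:

```latex
% Antilinearity gives \theta (i a) \theta = -i\, \theta a \theta for a in B(h). Hence
\begin{align*}
\theta \bigl( i[H, \theta x \theta] \bigr) \theta
  &= -i \bigl( \theta H \theta\, x\, \theta\theta
             - \theta\theta\, x\, \theta H \theta \bigr) \\
  &= -i \bigl( H x - x H \bigr) = -i[H, x],
\end{align*}
% so that, with \mathcal{L} = \mathcal{L}_0 + i[H,\cdot\,],
\[
  \mathcal{L}'(x) = \theta \mathcal{L}(\theta x \theta)\theta
  = \theta \mathcal{L}_0(\theta x \theta)\theta - i[H, x].
\]
% If in addition \mathcal{L}_0(x) = \theta \mathcal{L}_0(\theta x \theta)\theta, then
% \mathcal{L}(x) - \mathcal{L}'(x) = 2i[H, x], which is (4) with K = H.
```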
3. The Generator of a QMS and its Dual

We shall always consider special GKSL representations of the generator of a norm-continuous QMS by means of operators $L_\ell$, $H$. These are described by the following theorem (we refer to [24] Theorem 30.16 for the proof).

Theorem 2. Let $\mathcal{L}$ be the generator of a norm-continuous QMS on $\mathcal{B}(\mathsf{h})$ and let $\rho$ be a normal state on $\mathcal{B}(\mathsf{h})$. There exists a bounded self-adjoint operator $H$ and a finite or infinite sequence $(L_\ell)_{\ell\ge 1}$ of elements of $\mathcal{B}(\mathsf{h})$ such that:
(i) $\mathrm{tr}(\rho L_\ell) = 0$ for each $\ell \ge 1$,
(ii) $\sum_{\ell\ge 1} L_\ell^* L_\ell$ is a strongly convergent sum,
(iii) if $\sum_{\ell\ge 0} |c_\ell|^2 < \infty$ and $c_0 \mathbb{1} + \sum_{\ell\ge 1} c_\ell L_\ell = 0$ for complex scalars $(c_k)_{k\ge 0}$ then $c_k = 0$ for every $k \ge 0$,
(iv) the GKSL representation (1) holds.
If $H'$, $(L'_\ell)_{\ell\ge 1}$ is another family of bounded operators in $\mathcal{B}(\mathsf{h})$ with $H'$ self-adjoint and the sequence $(L'_\ell)_{\ell\ge 1}$ is finite or infinite, then the conditions (i)–(iv) are fulfilled with $H$, $(L_\ell)_{\ell\ge 1}$ replaced by $H'$, $(L'_\ell)_{\ell\ge 1}$ respectively if and only if the lengths of the sequences $(L_\ell)_{\ell\ge 1}$, $(L'_\ell)_{\ell\ge 1}$ are equal and, for some scalar $c \in \mathbb{R}$ and a unitary matrix $(u_{\ell j})_{\ell, j}$, we have
$$H' = H + c, \qquad L'_\ell = \sum_j u_{\ell j} L_j.$$

As an immediate consequence of the uniqueness (up to a scalar) of the Hamiltonian $H$, the decomposition of $\mathcal{L}$ as the sum of the derivation $i[H, \cdot\,]$ and a dissipative part $\mathcal{L}_0 = \mathcal{L} - i[H, \cdot\,]$ determined by special GKSL representations of $\mathcal{L}$ is unique. Moreover, since $(u_{\ell j})$ is unitary, we have
$$\sum_{\ell\ge 1} (L'_\ell)^* L'_\ell = \sum_{\ell, k, j\ge 1} \bar{u}_{\ell k} u_{\ell j} L_k^* L_j = \sum_{k, j\ge 1} \Bigl( \sum_{\ell\ge 1} \bar{u}_{\ell k} u_{\ell j} \Bigr) L_k^* L_j = \sum_{k\ge 1} L_k^* L_k.$$
Therefore, putting $G = -2^{-1} \sum_{\ell\ge 1} L_\ell^* L_\ell - iH$, we can write $\mathcal{L}$ in the form (2), where $G$ is uniquely determined by $\mathcal{L}$ up to a purely imaginary multiple of the identity operator.

Theorem 2 can be restated in the index-free form ([24] Thm. 30.12).
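Theorem 3. Let $\mathcal{L}$ be the generator of a norm-continuous QMS on $\mathcal{B}(\mathsf{h})$. Then there exist a Hilbert space $\mathsf{k}$, a bounded linear operator $L : \mathsf{h} \to \mathsf{h} \otimes \mathsf{k}$ and a bounded self-adjoint operator $H$ on $\mathsf{h}$ satisfying the following:
1. $\mathcal{L}(x) = i[H, x] - \frac{1}{2} \bigl( L^* L x - 2 L^* (x \otimes \mathbb{1}_\mathsf{k}) L + x L^* L \bigr)$ for all $x \in \mathcal{B}(\mathsf{h})$;
2. the set $\{ (x \otimes \mathbb{1}_\mathsf{k}) L u : x \in \mathcal{B}(\mathsf{h}), u \in \mathsf{h} \}$ is total in $\mathsf{h} \otimes \mathsf{k}$.

Proof. Let $\mathsf{k}$ be a Hilbert space with Hilbertian dimension equal to the length of the sequence $(L_k)_k$ and let $(f_k)$ be an orthonormal basis of $\mathsf{k}$. Defining $L u = \sum_k L_k u \otimes f_k$, where the $L_k$ are as in Theorem 2, a simple calculation shows that 1 is fulfilled. Suppose that there exists a non-zero vector $\xi$ orthogonal to the set of $(x \otimes \mathbb{1}_\mathsf{k}) L u$ with $x \in \mathcal{B}(\mathsf{h})$, $u \in \mathsf{h}$; then $\xi = \sum_k v_k \otimes f_k$ with $v_k \in \mathsf{h}$ and
$$0 = \langle \xi, (x \otimes \mathbb{1}_\mathsf{k}) L u \rangle = \sum_k \langle v_k, x L_k u \rangle = \sum_k \langle L_k^* x^* v_k, u \rangle$$
for all $x \in \mathcal{B}(\mathsf{h})$, $u \in \mathsf{h}$. Hence $\sum_k L_k^* x^* v_k = 0$. Since $\xi \neq 0$, we can suppose $\|v_1\| = 1$; then, putting $p = |v_1\rangle\langle v_1|$ and $x = p y^*$, $y \in \mathcal{B}(\mathsf{h})$, we get
$$0 = L_1^* y v_1 + \sum_{k\ge 2} \langle v_1, v_k \rangle L_k^* y v_1 = \Bigl( L_1^* + \sum_{k\ge 2} \langle v_1, v_k \rangle L_k^* \Bigr) y v_1. \qquad (7)$$
Since $y \in \mathcal{B}(\mathsf{h})$ is arbitrary, Eq. (7) contradicts the linear independence (see Theorem 2 (iii)) of the $L_k$'s. Therefore the set in 2 must be total.

The Hilbert space $\mathsf{k}$ is called the multiplicity space of the completely positive part of $\mathcal{L}$. A unitary matrix $(u_{\ell j})_{\ell, j\ge 1}$, in the above basis $(f_k)_{k\ge 1}$, clearly defines a unitary operator on $\mathsf{k}$. From now on we shall identify such matrices with operators on $\mathsf{k}$.

We end this section by establishing the relationship between the operators $G, L_\ell$ and $G', L'_\ell$ in two special GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$ when these generators are both bounded. The dual QMS $\mathcal{T}'$ clearly satisfies $\rho^{1/2} \mathcal{T}'_t(x) \rho^{1/2} = \mathcal{T}_{*t}(\rho^{1/2} x \rho^{1/2})$, where $\mathcal{T}_*$ denotes the predual semigroup of $\mathcal{T}$. Since $\mathcal{L}'$ is bounded, differentiating at $t = 0$, we find the relationship between the generator $\mathcal{L}'$ of $\mathcal{T}'$ and $\mathcal{L}_*$ of the predual semigroup $\mathcal{T}_*$ of $\mathcal{T}$,
$$\rho^{1/2} \mathcal{L}'(x) \rho^{1/2} = \mathcal{L}_*(\rho^{1/2} x \rho^{1/2}). \qquad (8)$$

Proposition 3. Let $\mathcal{L}(a) = G^* a + \sum_\ell L_\ell^* a L_\ell + a G$ be a special GKSL representation of $\mathcal{L}$ with respect to a $\mathcal{T}$-invariant state $\rho = \sum_k \rho_k |e_k\rangle\langle e_k|$. Then
$$G^* u = \sum_{k\ge 1} \rho_k \mathcal{L}(|u\rangle\langle e_k|) e_k - \mathrm{tr}(\rho G) u, \qquad (9)$$
$$G v = \sum_{k\ge 1} \rho_k \mathcal{L}_*(|v\rangle\langle e_k|) e_k - \mathrm{tr}(\rho G^*) v \qquad (10)$$
for every $u, v \in \mathsf{h}$.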
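Proof. Since $\mathcal{L}(|u\rangle\langle v|) = |G^* u\rangle\langle v| + |u\rangle\langle G v| + \sum_\ell |L_\ell^* u\rangle\langle L_\ell^* v|$, putting $v = e_k$ we have $G^* u = |G^* u\rangle\langle e_k| e_k$ and
$$G^* u = \mathcal{L}(|u\rangle\langle e_k|) e_k - \sum_\ell \langle e_k, L_\ell e_k \rangle L_\ell^* u - \langle e_k, G e_k \rangle u.$$
Multiplying both sides by $\rho_k$ and summing on $k$, we find then
$$G^* u = \sum_{k\ge 1} \rho_k \mathcal{L}(|u\rangle\langle e_k|) e_k - \sum_{\ell, k} \rho_k \langle e_k, L_\ell e_k \rangle L_\ell^* u - \sum_{k\ge 1} \rho_k \langle e_k, G e_k \rangle u = \sum_{k\ge 1} \rho_k \mathcal{L}(|u\rangle\langle e_k|) e_k - \sum_{\ell\ge 1} \mathrm{tr}(\rho L_\ell) L_\ell^* u - \mathrm{tr}(\rho G) u$$
and (9) follows since $\mathrm{tr}(\rho L_j) = 0$. The identity (10) is now immediate by computing the adjoint of $G$.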
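Proposition 4. Let $\mathcal{T}'$ be the dual of a QMS $\mathcal{T}$ generated by $\mathcal{L}$ with normal invariant state $\rho$. If $G$ and $G'$ are the operators (10) in two GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$ then
$$G' \rho^{1/2} = \rho^{1/2} G^* + \bigl( \mathrm{tr}(\rho G) - \mathrm{tr}(\rho G') \bigr) \rho^{1/2}. \qquad (11)$$
Moreover, we have $\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G') = ic$ for some $c \in \mathbb{R}$.

Proof. The identities (10) and (8) yield
$$G' \rho^{1/2} v = \sum_{k\ge 1} \mathcal{L}'_*\bigl( |\rho^{1/2} v\rangle\langle \rho_k^{1/2} e_k| \bigr) \rho_k^{1/2} e_k - \mathrm{tr}(\rho G'^*) \rho^{1/2} v$$
$$= \sum_{k\ge 1} \mathcal{L}'_*\bigl( \rho^{1/2} (|v\rangle\langle e_k|) \rho^{1/2} \bigr) \rho^{1/2} e_k - \mathrm{tr}(\rho G'^*) \rho^{1/2} v$$
$$= \sum_{k\ge 1} \rho^{1/2} \mathcal{L}(|v\rangle\langle e_k|) \rho^{1/2} \rho^{1/2} e_k - \mathrm{tr}(\rho G'^*) \rho^{1/2} v$$
$$= \rho^{1/2} G^* v + \bigl( \mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^*) \bigr) \rho^{1/2} v.$$
Therefore, we obtain (11). Right multiplying this equation by $\rho^{1/2}$ we have $G' \rho = \rho^{1/2} G^* \rho^{1/2} + \bigl( \mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^*) \bigr) \rho$ and, taking the trace,
$$\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^*) = \mathrm{tr}(G' \rho) - \mathrm{tr}(\rho^{1/2} G^* \rho^{1/2}) = \mathrm{tr}(G' \rho) - \mathrm{tr}(G^* \rho) = -\overline{\bigl( \mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^*) \bigr)};$$
this proves the last claim.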
We can now prove, as in [11] Th. 7.2, p. 358, the following

Theorem 4. For all special GKSL representations of $\mathcal{L}$ by means of operators $G, L_\ell$ as in (2) there exists a special GKSL representation of $\mathcal{L}'$ by means of operators $G', L'_\ell$ such that:
1. $G' \rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$ for some $c \in \mathbb{R}$,
2. $L'_\ell \rho^{1/2} = \rho^{1/2} L_\ell^*$ for all $\ell \ge 1$.

Proof. Since $\mathcal{L}'$ is bounded, it admits a special GKSL representation $\mathcal{L}'(a) = G'^* a + \sum_k L'^*_k a L'_k + a G'$. Moreover, by Proposition 4, we have $G' \rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$, $c \in \mathbb{R}$, and so (8) implies
$$\sum_k \rho^{1/2} L'^*_k x L'_k \rho^{1/2} = \sum_k L_k \rho^{1/2} x \rho^{1/2} L_k^*. \qquad (12)$$
Let $\mathsf{k}$ (resp. $\mathsf{k}'$) be the multiplicity space of the completely positive part of $\mathcal{L}$ (resp. $\mathcal{L}'$), $(f_k)_k$ (resp. $(f'_k)_k$) an orthonormal basis of $\mathsf{k}$ (resp. $\mathsf{k}'$) and define a linear operator $X : \mathsf{h} \otimes \mathsf{k}' \to \mathsf{h} \otimes \mathsf{k}$,
$$X (x \otimes \mathbb{1}_{\mathsf{k}'}) L' \rho^{1/2} u = (x \otimes \mathbb{1}_\mathsf{k}) \sum_k \rho^{1/2} L_k^* u \otimes f_k$$
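for all $x \in \mathcal{B}(\mathsf{h})$ and $u \in \mathsf{h}$, where $L : \mathsf{h} \to \mathsf{h} \otimes \mathsf{k}$, $L u = \sum_k L_k u \otimes f_k$, and $L' : \mathsf{h} \to \mathsf{h} \otimes \mathsf{k}'$, $L' u = \sum_k L'_k u \otimes f'_k$. Note that the right-hand side series is convergent for all $u \in \mathsf{h}$ because of (12), since
$$\Bigl\| \sum_{k=m}^{n} \rho^{1/2} L_k^* u \otimes f_k \Bigr\|^2 = \sum_{k=m}^{n} \bigl\| \rho^{1/2} L_k^* u \bigr\|^2 = \sum_{k=m}^{n} \langle u, L_k \rho L_k^* u \rangle,$$
and the right-hand side goes to 0 for $n, m$ tending to infinity because $\rho$ is an invariant state and the series $\sum_k L_k \rho L_k^* = -(G\rho + \rho G^*)$ is trace-norm convergent. The identity (12) yields
$$\langle X (x \otimes \mathbb{1}_{\mathsf{k}'}) L' \rho^{1/2} u, X (y \otimes \mathbb{1}_{\mathsf{k}'}) L' \rho^{1/2} v \rangle = \sum_k \langle u, \rho^{1/2} L'^*_k x^* y L'_k \rho^{1/2} v \rangle = \langle (x \otimes \mathbb{1}_{\mathsf{k}'}) L' \rho^{1/2} u, (y \otimes \mathbb{1}_{\mathsf{k}'}) L' \rho^{1/2} v \rangle$$
for all $x, y \in \mathcal{B}(\mathsf{h})$ and $u, v \in \mathsf{h}$, i.e. $X$ preserves the scalar product. Therefore, since the set $\{ (x \otimes \mathbb{1}_{\mathsf{k}'}) L' \rho^{1/2} u \mid x \in \mathcal{B}(\mathsf{h}), u \in \mathsf{h} \}$ is total in $\mathsf{h} \otimes \mathsf{k}'$ (for $\rho^{1/2}(\mathsf{h})$ is dense in $\mathsf{h}$ and Theorem 3 holds), $X$ is well defined and extends to an isometry from $\mathsf{h} \otimes \mathsf{k}'$ to $\mathsf{h} \otimes \mathsf{k}$.

The operator $X$ is unitary because its range is dense in $\mathsf{h} \otimes \mathsf{k}$. Indeed, suppose that there exists a vector $\xi = \sum_k v_k \otimes f_k$, with $v_k \in \mathsf{h}$ and $\sum_k \|v_k\|^2 < \infty$, orthogonal to all $(x \otimes \mathbb{1}_\mathsf{k}) \sum_k \rho^{1/2} L_k^* u \otimes f_k$; then
$$0 = \Bigl\langle \xi, (x \otimes \mathbb{1}_\mathsf{k}) \sum_k \rho^{1/2} L_k^* u \otimes f_k \Bigr\rangle = \sum_k \langle v_k, x \rho^{1/2} L_k^* u \rangle = \sum_k \langle L_k \rho^{1/2} x^* v_k, u \rangle$$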
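for all $x \in \mathcal{B}(\mathsf{h})$, $u \in \mathsf{h}$. Taking $x = |w_1\rangle\langle w_2|$, by the arbitrariness of $u$, we have then $\sum_k \langle w_1, v_k \rangle L_k \rho^{1/2} w_2 = 0$. Since $w_2$ is arbitrary, the range of $\rho^{1/2}$ is dense in $\mathsf{h}$ and the sequence $(\langle w_1, v_k \rangle)_{k\ge 1}$ is square-summable, we find $\sum_k \langle w_1, v_k \rangle L_k = 0$. The linear independence of the $L_k$, in the sense of Theorem 2 (iii), implies then $\langle w_1, v_k \rangle = 0$ for all $k$ and all $w_1 \in \mathsf{h}$, i.e. $\xi = 0$.

As a consequence we have $X^* X = \mathbb{1}_{\mathsf{h} \otimes \mathsf{k}'}$ and $X X^* = \mathbb{1}_{\mathsf{h} \otimes \mathsf{k}}$. Moreover, since $X (y \otimes \mathbb{1}_{\mathsf{k}'}) = (y \otimes \mathbb{1}_\mathsf{k}) X$ for all $y \in \mathcal{B}(\mathsf{h})$, we can conclude that $X = \mathbb{1}_\mathsf{h} \otimes Y$ for some unitary map $Y : \mathsf{k}' \to \mathsf{k}$. The definition of $X$ implies then
$$(\rho^{1/2} \otimes \mathbb{1}_\mathsf{k}) L^* = X L' \rho^{1/2} = (\mathbb{1}_\mathsf{h} \otimes Y) L' \rho^{1/2}.$$
This means that, replacing $L'$ by $(\mathbb{1}_\mathsf{h} \otimes Y) L'$, or more precisely $L'_k$ by $\sum_\ell u_{k\ell} L'_\ell$ for all $k$, we have $\rho^{1/2} L_k^* = L'_k \rho^{1/2}$. Since $\mathrm{tr}(\rho L'_k) = \mathrm{tr}(\rho L_k^*) = 0$ and, from $\mathcal{L}'(\mathbb{1}) = 0$, $G'^* + G' = -\sum_k L'^*_k L'_k$, the properties of a special GKSL representation follow.

Remark 3. Condition 2 implies that the completely positive parts $\Phi(x) = \sum_\ell L_\ell^* x L_\ell$ and $\Phi'$ of the generators $\mathcal{L}$ and $\mathcal{L}'$, respectively, are mutually adjoint, i.e.
$$\mathrm{tr}(\rho^{1/2} \Phi'(x) \rho^{1/2} y) = \mathrm{tr}(\rho^{1/2} x \rho^{1/2} \Phi(y)) \qquad (13)$$
for all $x, y \in \mathcal{B}(\mathsf{h})$. As a consequence, the maps $x \mapsto G^* x + x G$ and $x \mapsto (G')^* x + x G'$ are also mutually adjoint.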
4. Generators of Standard Detailed Balance QMSs

In this section we characterise the generators of norm-continuous QMSs satisfying the SQDB of Definition 2. We start by noting that, since $\rho$ is invariant for $\mathcal{T}$ and $\mathcal{T}'$, i.e. $\mathcal{L}_*(\rho) = \mathcal{L}'_*(\rho) = 0$, the operator $K$ commutes with $\rho$. Moreover, by comparing two special GKSL representations of $\mathcal{L}$ and $\mathcal{L}' + 2i[K, \cdot\,]$, we have immediately the following

Lemma 2. A QMS $\mathcal{T}$ satisfies the SQDB $\mathcal{L} - \mathcal{L}' = 2i[K, \cdot\,]$ if and only if for all special GKSL representations of the generators $\mathcal{L}$ and $\mathcal{L}'$ by means of operators $G, L_k$ and $G', L'_k$ respectively, we have
$$G = G' + 2iK + ic, \qquad L'_k = \sum_j u_{kj} L_j$$
for some $c \in \mathbb{R}$ and some unitary $(u_{kj})_{kj}$ on $\mathsf{k}$.

Since we know the relationship between the operators $G', L'_k$ and $G, L_k$ thanks to Theorem 4, we can now characterise generators of QMSs satisfying the SQDB. We emphasize the following definition of $T$-symmetric matrix (operator) on $\mathsf{k}$ in order to avoid confusion with the usual notion of symmetric operator $X$, meaning that $X^*$ is an extension of $X$.

Definition 4. Let $Y = (y_{k\ell})_{k,\ell\ge 1}$ be a matrix with entries indexed by $k, \ell$ running on the set (finite or infinite) of indices of the sequence $(L_\ell)_{\ell\ge 1}$. We denote by $Y^T$ the transpose matrix $Y^T = (y_{\ell k})_{k,\ell\ge 1}$. The matrix $Y$ is called $T$-symmetric if $Y = Y^T$.

Theorem 5. $\mathcal{T}$ satisfies the SQDB if and only if for all special GKSL representations of the generator $\mathcal{L}$ by means of operators $G, L_k$ there exists a $T$-symmetric unitary $(u_{m\ell})_{m\ell}$ on $\mathsf{k}$ such that, for all $k \ge 1$,
$$\rho^{1/2} L_k^* = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}. \qquad (14)$$
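To see that $T$-symmetry of a unitary in the sense of Definition 4 is genuinely different from self-adjointness, a small example may help. The matrix below is a hypothetical illustration on a two-dimensional multiplicity space, not one appearing in the paper:

```latex
% A unitary that equals its transpose but is not self-adjoint:
\[
  Y = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},
  \qquad
  Y^T = Y,
  \qquad
  Y^* = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} = -Y,
\]
\[
  Y^* Y = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}
          \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}
        = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% Y is T-symmetric and unitary, with eigenvalues \pm i on the unit circle,
% whereas a self-adjoint unitary can only have eigenvalues in \{-1, 1\}.
```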
Proof. Given a special GKSL representation of $\mathcal{L}$, adding a purely imaginary multiple of the identity operator to the anti-selfadjoint part of $G$ if necessary, Theorem 4 allows us to write the dual $\mathcal{L}'$ in a special GKSL representation by means of operators $G', L'_k$ with
$$G' \rho^{1/2} = \rho^{1/2} G^*, \qquad L'_k \rho^{1/2} = \rho^{1/2} L_k^*. \qquad (15)$$
Suppose first that $\mathcal{T}$ satisfies the SQDB. Since $L'_k = \sum_j u_{kj} L_j$ for some unitary $(u_{kj})_{kj}$ by Lemma 2, we can find (14) by substituting $L'_k$ with $\sum_j u_{kj} L_j$ in the second formula (15). Finally we show that the unitary matrix $u = (u_{m\ell})_{m\ell}$ is $T$-symmetric. Indeed, taking the adjoint of (14) we find $L_\ell \rho^{1/2} = \sum_m \bar{u}_{\ell m} \rho^{1/2} L_m^*$. Writing $\rho^{1/2} L_m^*$ as in (14) we have then
$$L_\ell \rho^{1/2} = \sum_{m,k} \bar{u}_{\ell m} u_{mk} L_k \rho^{1/2} = \sum_k \bigl( (u^*)^T u \bigr)_{\ell k} L_k \rho^{1/2}.$$
The operators $L_\ell \rho^{1/2}$ are linearly independent by property (iii) in Theorem 2 of a special GKSL representation, therefore $(u^*)^T u$ is the identity operator on $\mathsf{k}$. Since $u$ is also unitary, we have also $u^* u = (u^*)^T u$, namely $u^* = (u^*)^T$ and $u = u^T$.

Conversely, if (14) holds, by (15), we have $L'_k \rho^{1/2} = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$, so that $L'_k = \sum_j u_{kj} L_j$ for all $k$ and for some unitary $(u_{kj})_{kj}$. Therefore, thanks to Lemma 2, to conclude it is enough to prove that $G = G' + i(2K + c)$, namely that $G - G'$ is anti self-adjoint. To this end note that, since $\rho$ is an invariant state, we have
$$0 = \rho G^* + \sum_k L_k \rho L_k^* + G\rho, \qquad (16)$$
with
$$\sum_k L_k \rho L_k^* = \sum_k (L_k \rho^{1/2})(\rho^{1/2} L_k^*) = \sum_{k, \ell, j} \bar{u}_{k\ell} u_{kj} \rho^{1/2} L_\ell^* L_j \rho^{1/2} = \sum_\ell \rho^{1/2} L_\ell^* L_\ell \rho^{1/2} = -\rho^{1/2} (G + G^*) \rho^{1/2}$$
(for condition (14) holds) and so, by substituting in Eq. (16), we get
$$0 = \rho G^* - \rho^{1/2} G \rho^{1/2} - \rho^{1/2} G^* \rho^{1/2} + G\rho = \rho^{1/2} \bigl( \rho^{1/2} G^* - G \rho^{1/2} \bigr) - \bigl( \rho^{1/2} G^* - G \rho^{1/2} \bigr) \rho^{1/2} = [G \rho^{1/2} - \rho^{1/2} G^*, \rho^{1/2}],$$
i.e. $G \rho^{1/2} - \rho^{1/2} G^*$ commutes with $\rho^{1/2}$.

We can now prove that $G - G'$ is anti self-adjoint. Clearly, it suffices to show that $\rho^{1/2} G \rho^{1/2} - \rho^{1/2} G' \rho^{1/2}$ is anti self-adjoint. Indeed, by (15), we have
$$\bigl( \rho^{1/2} G \rho^{1/2} - \rho^{1/2} G' \rho^{1/2} \bigr)^* = \bigl( \rho^{1/2} G \rho^{1/2} - \rho G^* \bigr)^* = \bigl( \rho^{1/2} ( G \rho^{1/2} - \rho^{1/2} G^* ) \bigr)^* = \bigl( ( G \rho^{1/2} - \rho^{1/2} G^* ) \rho^{1/2} \bigr)^* = \rho G^* - \rho^{1/2} G \rho^{1/2} = \rho^{1/2} G' \rho^{1/2} - \rho^{1/2} G \rho^{1/2},$$
because $G \rho^{1/2} - \rho^{1/2} G^*$ commutes with $\rho^{1/2}$. This completes the proof.
It is worth noticing that, as in Remark 3, $\mathcal{T}$ satisfies the SQDB if and only if the completely positive part $\Phi$ of the generator $\mathcal{L}$ is symmetric. This improves our previous result, Thm. 7.3 [11], where we gave $G \rho^{1/2} = \rho^{1/2} G^* - (2iK + ic) \rho^{1/2}$ for some $c \in \mathbb{R}$ as an additional condition. Here we showed that it follows from (14) and the invariance of $\rho$.

Remark 4. Note that (14) holds for the operators $L_\ell$ of a special GKSL representation of $\mathcal{L}$ if and only if it is true for all special GKSL representations, because of the second part of Theorem 2. Therefore the conclusion of Theorem 5 holds true also if and only if we can find a single special GKSL representation of $\mathcal{L}$ satisfying (14).
The T -symmetric unitary (u m )m is determined by the L ’s because they are linearly independent. We shall now exploit this fact to give a more geometrical characterisation of SQDB. When the SQDB holds, the matrices (bk j )k, j≥1 and (ck j )k, j≥1 with (17) bk j = tr ρ 1/2 L ∗k ρ 1/2 L ∗j , and ck j = tr ρ L ∗k L j define two trace class operators B and C on k by Lemma 3 (see the Appendix); B is T -symmetric and C is self-adjoint. Moreover, it admits a self-adjoint inverse C −1 because ρ is faithful. When k is infinite dimensional, C −1 is unbounded and its domain coincides with the range of C. We can now give the following characterisation of QMS satisfying the SQDB condition which is more direct because the unitary (u k )k in Theorem 5 is explicitly given by C −1 B. Theorem 6. T satisfies the SQDB if and only if the operators G, L k of a special GKSL representation of the generator L satisfy the following conditions: (i) the closed linear span of ρ 1/2 L ∗ | ≥ 1 and L ρ 1/2 | ≥ 1 in the Hilbert space of Hilbert-Schmidt operators on h coincide, (ii) the trace-class operators B, C defined by (17) satisfy C B = BC T and C −1 B is unitary T -symmetric. Proof. If T satisfies the SQDB then, by Theorem 5, the identity (14) holds. The series in the right-hand side of (14) is convergent with respect to the Hilbert-Schmidt norm because
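$$\Big\| \sum_{m+1\le \ell \le n} u_{k\ell} L_\ell \rho^{1/2} \Big\|_{HS}^2 = \sum_{m+1\le \ell,\ell' \le n} \bar u_{k\ell'} u_{k\ell}\, \mathrm{tr}\left(\rho L_{\ell'}^* L_\ell\right) \le \frac12 \sum_{m+1\le \ell,\ell' \le n} |u_{k\ell'}|^2 |u_{k\ell}|^2 + \frac12 \sum_{m+1\le \ell,\ell' \le n} |c_{\ell'\ell}|^2 \le \frac12 \Big( \sum_{m+1\le \ell \le n} |u_{k\ell}|^2 \Big)^2 + \frac12 \sum_{m+1\le \ell,\ell' \le n} |c_{\ell'\ell}|^2,$$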
and the right-hand side vanishes as $n, m$ go to infinity because the operator $C$ is trace-class by Lemma 3 and the columns of $U = (u_{k\ell})_{k\ell}$ are unit vectors in $\mathsf{k}$ by unitarity. Left multiplying both sides of (14) by $\rho^{1/2} L_j^*$ and taking the trace we find $B = CU^T = CU$. It follows that the ranges of the operators $B$, $CU$ and $C$ coincide and $C^{-1}B = U$ is everywhere defined, unitary and T-symmetric because $U$ is T-symmetric. Moreover, since $B$ is T-symmetric by the cyclic property of the trace, we have also $BC^T = CU^T C^T = C(CU)^T = CB^T = CB$.

Conversely, we show that (i) and (ii) imply the SQDB. To this end notice that, by the spectral theorem, we can find a unitary linear transformation $V = (v_{mn})_{m,n\ge 1}$ on $\mathsf{k}$ such that $V^* C V$ is diagonal. Therefore, choosing a new GKSL representation of the generator $\mathcal{L}$ by means of the operators $L'_k = \sum_{n\ge 1} v_{nk} L_n$, if necessary, we can suppose
that both $(L_\ell \rho^{1/2})_{\ell\ge 1}$ and $(\rho^{1/2} L_k^*)_{k\ge 1}$ are orthogonal bases of the same closed linear space. Note that
$$\mathrm{tr}\left(\rho^{1/2} (L')_k^* \rho^{1/2} (L')_j^*\right) = \sum_{m,n\ge 1} \bar v_{nk} \bar v_{mj}\, \mathrm{tr}\left(\rho^{1/2} L_n^* \rho^{1/2} L_m^*\right)$$
and the operator $B$, after this change of GKSL representation, becomes $V^* B (V^*)^T$, which is also T-symmetric. Writing the expansion of $\rho^{1/2} L_k^*$ with respect to the orthogonal basis $(L_\ell \rho^{1/2})_{\ell\ge 1}$, for all $k \ge 1$ we have
$$\rho^{1/2} L_k^* = \sum_{\ell\ge 1} \frac{\mathrm{tr}\left(\rho^{1/2} L_\ell^* \rho^{1/2} L_k^*\right)}{\| L_\ell \rho^{1/2} \|_{HS}^2}\, L_\ell \rho^{1/2}. \tag{18}$$
In this way we find a matrix $Y$ of complex numbers $y_{k\ell}$ such that $\rho^{1/2} L_k^* = \sum_\ell y_{k\ell} L_\ell \rho^{1/2}$ and the series is Hilbert-Schmidt norm convergent. Clearly, since $C$ is diagonal and $B$ is T-symmetric,
$$y_{k\ell} = (BC^{-1})_{k\ell} = (B(C^{-1})^T)_{k\ell} = ((C^{-1}B)^T)_{k\ell}.$$
It follows from (ii) that $Y$ coincides with the unitary operator $(C^{-1}B)^T$ and (14) holds. Moreover, $Y$ is symmetric because
$$y_{k\ell} = (BC^{-1})_{k\ell} = (B(C^{-1})^T)_{k\ell} = (C^{-1}B)_{k\ell} = y_{\ell k}.$$
This completes the proof.
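Condition (ii) of Theorem 6 and identity (14) can be checked numerically on a small example. The sketch below is only an illustration under assumptions that are not part of the proof: it takes a two-level system with $\rho = \mathrm{diag}(\nu, 1-\nu)$ and the hypothetical choice $L_1 = \lambda\sigma^+$, $L_2 = \mu\sigma^-$ with $\lambda^2(1-\nu) = \nu\mu^2$, builds $B$ and $C$ from (17), and tests whether $C^{-1}B$ is unitary and T-symmetric.

```python
import numpy as np

# Hypothetical two-level example (an assumption made for illustration):
# rho = diag(nu, 1 - nu), L_1 = lam * sigma^+, L_2 = mu * sigma^-,
# with mu chosen so that lam^2 (1 - nu) = nu * mu^2.
nu, lam = 0.3, 0.8
mu = lam * np.sqrt((1.0 - nu) / nu)

sigma_plus = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
sigma_minus = sigma_plus.conj().T
rho = np.diag([nu, 1.0 - nu]).astype(complex)
rho_half = np.diag(np.sqrt([nu, 1.0 - nu])).astype(complex)
Ls = [lam * sigma_plus, mu * sigma_minus]


def dag(a):
    """Adjoint (conjugate transpose) of a matrix."""
    return a.conj().T


# Matrices of (17): b_kj = tr(rho^{1/2} L_k^* rho^{1/2} L_j^*), c_kj = tr(rho L_k^* L_j).
B = np.array([[np.trace(rho_half @ dag(Lk) @ rho_half @ dag(Lj)) for Lj in Ls] for Lk in Ls])
C = np.array([[np.trace(rho @ dag(Lk) @ Lj) for Lj in Ls] for Lk in Ls])

U = np.linalg.inv(C) @ B
is_unitary = np.allclose(dag(U) @ U, np.eye(2))
is_t_symmetric = np.allclose(U, U.T)
commutation_ok = np.allclose(C @ B, B @ C.T)

# Identity (14): rho^{1/2} L_k^* = sum_l u_{kl} L_l rho^{1/2}.
identity_14_ok = all(
    np.allclose(rho_half @ dag(Ls[k]), sum(U[k, l] * Ls[l] @ rho_half for l in range(2)))
    for k in range(2)
)
```

In this example $C$ is a multiple of the identity and $C^{-1}B$ is the flip matrix exchanging the two indices, so the four flags are all expected to be true.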
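Formula (18) has the following consequence.

Corollary 1. Suppose that a QMS $\mathcal{T}$ satisfies the SQDB condition. For every special GKSL representation of $\mathcal{L}$ with operators $L_\ell \rho^{1/2}$ that are orthogonal in the Hilbert space of Hilbert-Schmidt operators on $\mathsf{h}$, if $\mathrm{tr}(\rho^{1/2} L_\ell^* \rho^{1/2} L_k^*) \ne 0$ for a pair of indices $k, \ell \ge 1$, then $\mathrm{tr}(\rho L_\ell^* L_\ell) = \mathrm{tr}(\rho L_k^* L_k)$.

Proof. It suffices to note that the matrix $(u_{k\ell})$ with entries
$$u_{k\ell} = \frac{\mathrm{tr}\left(\rho^{1/2} L_\ell^* \rho^{1/2} L_k^*\right)}{\| L_\ell \rho^{1/2} \|_{HS}^2} = \frac{\mathrm{tr}\left(\rho^{1/2} L_\ell^* \rho^{1/2} L_k^*\right)}{\mathrm{tr}\left(\rho L_\ell^* L_\ell\right)}$$
must be T-symmetric.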
Remark 5. The matrix $C$ can be viewed as the covariance matrix of the zero-mean (recall that $\mathrm{tr}(\rho L_\ell) = 0$) "random variables" $\{ L_\ell \mid \ell \ge 1 \}$ and, in a similar way, $B$ can be viewed as a sort of mixed covariance matrix between the previous random variables and the adjoints $\{ L_\ell^* \mid \ell \ge 1 \}$. Thus the SQDB condition holds when the random variables $L_\ell$ right multiplied by $\rho^{1/2}$ and the adjoint variables $L_\ell^*$ left multiplied by $\rho^{1/2}$ generate the same subspace of Hilbert-Schmidt operators and the mixed covariance matrix $B$ is a left unitary transformation of the covariance matrix $C$. If we consider a special GKSL representation of $\mathcal{L}$ with operators $L_\ell \rho^{1/2}$ that are orthogonal, then, by Corollary 1 and the identity $\| L_\ell \rho^{1/2} \|_{HS} = \| L_k \rho^{1/2} \|_{HS}$, the unitary matrix $U$ can be written as $C^{-1/2} B C^{-1/2}$. This, although not positive definite, can be interpreted as a correlation coefficient matrix of $\{ L_\ell \mid \ell \ge 1 \}$ and $\{ L_\ell^* \mid \ell \ge 1 \}$.

The characterisation of generators of symmetric QMSs with respect to the $s = 1/2$ scalar product follows along the same lines.
Theorem 7. A norm-continuous QMS $\mathcal{T}$ is symmetric if and only if there exists a special GKSL representation of the generator $\mathcal{L}$ by means of operators $G, L_\ell$ such that
(1) $G\rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$ for some $c \in \mathbb{R}$,
(2) $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$, for all $k$, for some unitary $(u_{k\ell})_{k\ell}$ on $\mathsf{k}$ which is also T-symmetric.
Proof. Choose a special GKSL representation of $\mathcal{L}$ by means of operators $G, L_k$. Theorem 4 allows us to write the symmetric dual $\mathcal{L}'$ in a special GKSL representation by means of operators $G', L'_k$ as in (15). Suppose first that $\mathcal{T}$ is KMS-symmetric. Comparing the special GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$, by Theorem 2 we find
$$G' = G + ic, \qquad L'_k = \sum_j u_{kj} L_j,$$
for some unitary matrix $(u_{kj})$ and some $c \in \mathbb{R}$. This, together with (15), implies that conditions (1) and (2) hold.

Assume now that conditions (1) and (2) hold. Taking the adjoint of (2) we find immediately $L_k \rho^{1/2} = \sum_\ell \bar u_{k\ell}\, \rho^{1/2} L_\ell^*$. Then a straightforward computation, by the unitarity of the matrix $(u_{k\ell})$, yields
$$\mathcal{L}_*(\rho^{1/2} x \rho^{1/2}) = G\rho^{1/2} x \rho^{1/2} + \sum_k L_k \rho^{1/2} x \rho^{1/2} L_k^* + \rho^{1/2} x \rho^{1/2} G^*$$
$$= \rho^{1/2} G^* x \rho^{1/2} + \sum_{k,\ell,j} \bar u_{k\ell} u_{kj}\, \rho^{1/2} L_\ell^* x L_j \rho^{1/2} + \rho^{1/2} x G \rho^{1/2} = \rho^{1/2} \mathcal{L}(x) \rho^{1/2}$$
for all $x \in \mathcal{B}(\mathsf{h})$. Iterating we find $\mathcal{L}_*^n(\rho^{1/2} x \rho^{1/2}) = \rho^{1/2} \mathcal{L}^n(x) \rho^{1/2}$ for all $n \ge 0$; therefore, exponentiating, we find $\mathcal{T}_{*t}(\rho^{1/2} x \rho^{1/2}) = \rho^{1/2} \mathcal{T}_t(x) \rho^{1/2}$ for all $t \ge 0$. This, together with (3), implies that $\mathcal{T}$ is KMS-symmetric.

Remark 6. Note that condition (2) in Theorem 7 implies that the completely positive part of $\mathcal{L}$ is KMS-symmetric. This parallels Theorem 4, where condition (2) implies that the completely positive parts of the generators $\mathcal{L}$ and $\mathcal{L}'$ are mutually adjoint. The above theorem simplifies a previous result by Park ([23], Thm 2.2), where conditions (1) and (2) appear in a much more complicated way.
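The intertwining relation between the predual generator and the generator, which the proof of Theorem 7 derives from conditions (1) and (2), can be tested numerically. The following sketch is an illustration on an assumed two-level example (it is not taken from the paper): $H = 0$, $L_1 = \lambda\sigma^+$, $L_2 = \mu\sigma^-$ with $\lambda^2(1-\nu) = \nu\mu^2$, and a generator written in the form $\mathcal{L}(x) = G^*x + \sum_k L_k^* x L_k + xG$.

```python
import numpy as np

# Assumed example: rho = diag(nu, 1 - nu), H = 0,
# L_1 = lam * sigma^+, L_2 = mu * sigma^-, with lam^2 (1 - nu) = nu * mu^2.
nu, lam = 0.3, 0.8
mu = lam * np.sqrt((1.0 - nu) / nu)

sigma_plus = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
sigma_minus = sigma_plus.conj().T
rho_half = np.diag(np.sqrt([nu, 1.0 - nu])).astype(complex)
Ls = [lam * sigma_plus, mu * sigma_minus]


def dag(a):
    """Adjoint (conjugate transpose) of a matrix."""
    return a.conj().T


# G = -iH - (1/2) sum_k L_k^* L_k with H = 0.
G = -0.5 * sum(dag(L) @ L for L in Ls)


def gen(x):
    """Generator acting on observables: G^* x + sum_k L_k^* x L_k + x G."""
    return dag(G) @ x + sum(dag(L) @ x @ L for L in Ls) + x @ G


def pre_gen(w):
    """Predual generator acting on trace class operators: G w + sum_k L_k w L_k^* + w G^*."""
    return G @ w + sum(L @ w @ dag(L) for L in Ls) + w @ dag(G)


rng = np.random.default_rng(1)
x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

lhs = pre_gen(rho_half @ x @ rho_half)
rhs = rho_half @ gen(x) @ rho_half
kms_gap = np.max(np.abs(lhs - rhs))
```

For this choice of operators `kms_gap` should vanish up to rounding, since $\rho^{1/2}L_1^* = L_2\rho^{1/2}$ and $\rho^{1/2}L_2^* = L_1\rho^{1/2}$, so condition (2) holds with the flip matrix as the unitary.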
5. Generators of Standard Detailed Balance (with Time Reversal) QMSs

We shall now study generators of semigroups satisfying the SQDB-$\theta$ introduced in Definition 3, involving the time reversal operation. In this section, we always assume that the invariant state $\rho$ and the anti-unitary time reversal $\theta$ commute. The relationship between the QMS satisfying the SQDB-$\theta$, its dual and their generators is clarified by the following
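Proposition 5. A QMS $\mathcal{T}$ satisfies the SQDB-$\theta$ if and only if the dual semigroup $\mathcal{T}'$ is given by
$$\mathcal{T}'_t(x) = \theta\, \mathcal{T}_t(\theta x \theta)\, \theta \quad \text{for all } x \in \mathcal{B}(\mathsf{h}). \tag{19}$$
In particular, if $\mathcal{T}$ is norm-continuous, then $\mathcal{T}'$ is also norm-continuous. Moreover, in this case $\mathcal{T}'$ is generated by
$$\mathcal{L}'(x) = \theta\, \mathcal{L}(\theta x \theta)\, \theta, \qquad x \in \mathcal{B}(\mathsf{h}). \tag{20}$$

Proof. Suppose that $\mathcal{T}$ satisfies the SQDB-$\theta$ and put $\sigma(x) = \theta x \theta$. Taking $t = 0$, Eq. (6) reduces to $\mathrm{tr}(\rho^{1/2} x \rho^{1/2} y) = \mathrm{tr}(\rho^{1/2} \sigma(y^*) \rho^{1/2} \sigma(x^*))$ for all $x, y \in \mathcal{B}(\mathsf{h})$, so that
$$\mathrm{tr}\left(\rho^{1/2} x \rho^{1/2} \mathcal{T}_t(y)\right) = \mathrm{tr}\left(\rho^{1/2} \sigma(y^*) \rho^{1/2} \mathcal{T}_t(\sigma(x^*))\right) = \mathrm{tr}\left(\rho^{1/2} \sigma\big(\mathcal{T}_t(\sigma(x^*))^*\big) \rho^{1/2} \sigma\big(\sigma(y^*)^*\big)\right) = \mathrm{tr}\left(\rho^{1/2} \sigma\big(\mathcal{T}_t(\sigma(x))\big) \rho^{1/2} y\right)$$
for every $x, y \in \mathcal{B}(\mathsf{h})$ and (19) follows. Therefore, if $\mathcal{T}$ is norm continuous, $\mathcal{T}' = (\sigma \circ \mathcal{T}_t \circ \sigma)_t$ is also. Conversely, if (19) holds, the commutation between $\rho$ and $\theta$ implies
$$\mathrm{tr}\left(\rho^{1/2} \mathcal{T}'_t(x) \rho^{1/2} y\right) = \mathrm{tr}\left(\rho^{1/2} \theta\, \mathcal{T}_t(\theta x \theta)\, \theta \rho^{1/2} y\right) = \mathrm{tr}\left(\theta \big(\rho^{1/2} \mathcal{T}_t(\theta x \theta)\, \theta \rho^{1/2} y \theta\big) \theta\right) = \mathrm{tr}\left(\rho^{1/2} \theta y^* \theta\, \rho^{1/2} \mathcal{T}_t(\theta x^* \theta)\right)$$
and (19) is proved. Now (20) follows from (19) differentiating at $t = 0$.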
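We can now describe the relationship between special GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$.

Proposition 6. If $\mathcal{T}$ satisfies the SQDB-$\theta$ then, for every special GKSL representation of $\mathcal{L}$ by means of operators $H, L_k$, the operators $H' = -\theta H \theta$ and $L'_k = \theta L_k \theta$ yield a special GKSL representation of $\mathcal{L}'$.

Proof. Consider a special GKSL representation of $\mathcal{L}$ by means of operators $H, L_k$. Since $\mathcal{L}'(a) = \theta \mathcal{L}(\theta a \theta) \theta$ by Proposition 5, from the antilinearity of $\theta$ and $\theta^2 = \mathbb{1}$ we get
$$\theta \mathcal{L}'(a) \theta = i[H, \theta a \theta] - \frac12 \sum_k \left( L_k^* L_k \theta a \theta - 2 L_k^* \theta a \theta L_k + \theta a \theta L_k^* L_k \right)$$
$$= i\theta \left( \theta H \theta a - a \theta H \theta \right) \theta + \sum_k \theta (\theta L_k^* \theta) a (\theta L_k \theta) \theta - \frac12 \sum_k \theta \left( (\theta L_k^* \theta)(\theta L_k \theta) a + a (\theta L_k^* \theta)(\theta L_k \theta) \right) \theta$$
$$= \theta \left( -i[\theta H \theta, a] \right) \theta - \frac12 \sum_k \theta \left( L_k'^* L'_k a - 2 L_k'^* a L'_k + a L_k'^* L'_k \right) \theta,$$
where $L'_k := \theta L_k \theta$. Therefore, putting $H' = -\theta H \theta$, we find a GKSL representation of $\mathcal{L}'$ which is also special because $\mathrm{tr}(\rho L'_k) = \mathrm{tr}(\theta \rho L_k \theta) = \mathrm{tr}(L_k^* \rho) = \overline{\mathrm{tr}(\rho L_k)} = 0$.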
The structure of generators of QMSs satisfying the SQDB-$\theta$ is described by the following

Theorem 8. A QMS $\mathcal{T}$ satisfies the SQDB-$\theta$ condition if and only if there exists a special GKSL representation of $\mathcal{L}$, with operators $G, L_\ell$, such that:
1. $\rho^{1/2} \theta G^* \theta = G \rho^{1/2}$,
2. $\rho^{1/2} \theta L_k^* \theta = \sum_j u_{kj} L_j \rho^{1/2}$ for a self-adjoint unitary $(u_{kj})_{kj}$ on $\mathsf{k}$.

Proof. Suppose that $\mathcal{T}$ satisfies the SQDB-$\theta$ condition and consider a special GKSL representation of the generator $\mathcal{L}$ with operators $G, L_k$. The operators $-\theta H \theta$ and $\theta L_k \theta$ give then a special GKSL representation of $\mathcal{L}'$ by Proposition 6. Moreover, by Theorem 4, we have another special GKSL representation of $\mathcal{L}'$ by means of operators $G', L'_k$ such that $G' \rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$ for some $c \in \mathbb{R}$, and $L'_k \rho^{1/2} = \rho^{1/2} L_k^*$. Therefore there exists a unitary $(v_{kj})_{kj}$ on $\mathsf{k}$ such that $L'_k = \sum_j v_{kj}\, \theta L_j \theta$, and $\rho^{1/2} L_k^* = \sum_j v_{kj}\, \theta L_j \theta \rho^{1/2}$. Condition 2 follows then with $u_{kj} = \bar v_{kj}$, left and right multiplying by the antiunitary $\theta$. In order to find condition 1, first notice that, by the unitarity of $(v_{kj})_{kj}$,
$$\sum_k L_k'^* L'_k = \sum_k \theta L_k^* L_k \theta. \tag{21}$$
Now, by the uniqueness of $G'$ up to a purely imaginary multiple of the identity in a special GKSL representation, $H' = (G'^* - G')/(2i)$ is equal to $-\theta H \theta + c_1$ for some $c_1 \in \mathbb{R}$. From (21) and $G' \rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$ we obtain then
$$\rho^{1/2} G^* + ic\rho^{1/2} = G' \rho^{1/2} = -iH' \rho^{1/2} - \frac12 \sum_k L_k'^* L'_k \rho^{1/2} = i\theta H \theta \rho^{1/2} + ic_1 \rho^{1/2} - \frac12 \sum_k \theta L_k^* L_k \theta \rho^{1/2} = \theta G \theta \rho^{1/2} + ic_1 \rho^{1/2}.$$
It follows that $\rho^{1/2} \theta G^* \theta = G \rho^{1/2} + ic_2 \rho^{1/2}$ for some $c_2 \in \mathbb{R}$. Left multiplying by $\rho^{1/2}$ and tracing we find
$$ic_2 = \mathrm{tr}\left(\theta \rho G^* \theta\right) - \mathrm{tr}(\rho G) = \mathrm{tr}(G\rho) - \mathrm{tr}(\rho G) = 0$$
and condition 1 holds. Finally we show that the square of the unitary $(u_{kj})_{kj}$ on $\mathsf{k}$ is the identity operator. Indeed, taking the adjoint of the identity $\rho^{1/2} \theta L_k^* \theta = \sum_j u_{kj} L_j \rho^{1/2}$, we have
$$\theta L_k \theta \rho^{1/2} = \sum_j \bar u_{kj}\, \rho^{1/2} L_j^*.$$
Left and right multiplying by the antilinear time reversal $\theta$ (commuting with $\rho$) we find
$$L_k \rho^{1/2} = \theta \Big( \sum_j \bar u_{kj}\, \rho^{1/2} L_j^* \Big) \theta = \sum_j u_{kj}\, \rho^{1/2} \theta L_j^* \theta.$$
Writing $\rho^{1/2} \theta L_j^* \theta$ as $\sum_m u_{jm} L_m \rho^{1/2}$ by condition 2 we have then
$$L_k \rho^{1/2} = \sum_{j,m} u_{kj} u_{jm} L_m \rho^{1/2} = \sum_m (u^2)_{km} L_m \rho^{1/2},$$
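which implies that $u^2 = \mathbb{1}$ by the linear independence of the $L_m \rho^{1/2}$. Therefore, since $u$ is unitary, $u = u^*$.

Conversely, if 1 and 2 hold, we can write $\rho^{1/2} \theta \mathcal{L}(\theta x \theta) \theta \rho^{1/2}$ as
$$\rho^{1/2} \theta G^* \theta x \rho^{1/2} + \sum_k \rho^{1/2} \theta L_k^* \theta\, x\, \theta L_k \theta \rho^{1/2} + \rho^{1/2} x \theta G \theta \rho^{1/2} = G \rho^{1/2} x \rho^{1/2} + \sum_j L_j \rho^{1/2} x \rho^{1/2} L_j^* + \rho^{1/2} x \rho^{1/2} G^*.$$
This, by Theorem 4, can be written as
$$\rho^{1/2} (G')^* x \rho^{1/2} + \sum_j \rho^{1/2} (L'_j)^* x L'_j \rho^{1/2} + \rho^{1/2} x G' \rho^{1/2} = \rho^{1/2} \mathcal{L}'(x) \rho^{1/2}.$$
It follows that $\theta \mathcal{L}(\theta x \theta) \theta = \mathcal{L}'(x)$ for all $x \in \mathcal{B}(\mathsf{h})$ because $\rho$ is faithful. Moreover, it is easy to check by induction that $\theta \mathcal{L}^n(\theta x \theta) \theta = (\mathcal{L}')^n(x)$ for all $n \ge 0$. Therefore $\theta \mathcal{T}_t(\theta x \theta) \theta = \mathcal{T}'_t(x)$ for all $t \ge 0$ and $\mathcal{T}$ satisfies the SQDB-$\theta$ condition by Proposition 5.

We now provide a geometrical characterisation of the SQDB-$\theta$ condition as in Theorem 6. To this end we introduce the trace class operator $R$ on $\mathsf{k}$
$$R_{jk} = \mathrm{tr}\left(\rho^{1/2} L_j^* \rho^{1/2} \theta L_k^* \theta\right). \tag{22}$$
A direct application of Lemma 3 shows that $R$ is trace class. Moreover it is self-adjoint because, by the property $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$ of the antilinear time reversal, we have
$$R_{jk} = \mathrm{tr}\left(\rho^{1/2} L_j^* \rho^{1/2} \theta L_k^* \theta\right) = \overline{\mathrm{tr}\left(\theta (L_k \theta \rho^{1/2} L_j \rho^{1/2} \theta) \theta\right)} = \overline{\mathrm{tr}\left(\rho^{1/2} \theta L_j^* \rho^{1/2} \theta L_k^*\right)} = \overline{\mathrm{tr}\left((\rho^{1/2} \theta L_j^* \theta)(\rho^{1/2} L_k^*)\right)} = \overline{R_{kj}}.$$

Theorem 9. $\mathcal{T}$ satisfies the SQDB-$\theta$ if and only if the operators $G, L_k$ of a special GKSL representation of the generator $\mathcal{L}$ fulfill the following conditions:
1. $\rho^{1/2} \theta G^* \theta = G \rho^{1/2}$,
2. the closed linear spans of $\{\rho^{1/2} \theta L_\ell^* \theta \mid \ell \ge 1\}$ and $\{L_\ell \rho^{1/2} \mid \ell \ge 1\}$ in the Hilbert space of Hilbert-Schmidt operators on $\mathsf{h}$ coincide,
3. the self-adjoint trace class operators $R, C$ defined by (17) and (22) commute and $C^{-1}R$ is unitary and self-adjoint.

Proof. It suffices to show that conditions 2 and 3 above are equivalent to condition 2 of Theorem 8. If $\mathcal{T}$ satisfies the SQDB-$\theta$, then it can be shown as in the proof of Theorem 6 that 2 follows from condition 2 of Theorem 8. Moreover, left multiplying by $\rho^{1/2} L_\ell^*$ the identity $\rho^{1/2} \theta L_k^* \theta = \sum_j u_{kj} L_j \rho^{1/2}$ and tracing, we find
$$\mathrm{tr}\left(\rho^{1/2} L_\ell^* \rho^{1/2} \theta L_k^* \theta\right) = \sum_j u_{kj}\, \mathrm{tr}\left(\rho L_\ell^* L_j\right)$$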
for all $k, \ell$, i.e. $R = CU^T$. The operator $U^T$ is also self-adjoint and unitary. Therefore $R$ and $C$ have the same range and, since the domain of $C^{-1}$ coincides with the range of $C$, the operator $C^{-1}R$ is everywhere defined, unitary and self-adjoint. It follows that the densely defined operator $RC^{-1}$ is a restriction of $(C^{-1}R)^* = C^{-1}R$ and $CR = RC$.

In order to prove, conversely, that 2 and 3 imply condition 2 of Theorem 8, we first notice that, by the spectral theorem, there exists a unitary $V = (v_{mn})_{m,n\ge 1}$ on the multiplicity space $\mathsf{k}$ such that $V^* C V$ is diagonal. Choosing a new GKSL representation of the generator $\mathcal{L}$ by means of the operators $L'_k = \sum_{n\ge 1} v_{nk} L_n$, if necessary, we can suppose that both $(L_\ell \rho^{1/2})_{\ell\ge 1}$ and $(\rho^{1/2} L_k^*)_{k\ge 1}$ are orthogonal bases of the same closed linear space. Note that
$$\mathrm{tr}\left(\rho^{1/2} (L')_k^* \rho^{1/2} \theta (L')_j^* \theta\right) = \sum_{m,n\ge 1} \bar v_{nk} v_{mj}\, \mathrm{tr}\left(\rho^{1/2} L_n^* \rho^{1/2} \theta L_m^* \theta\right)$$
and the operator $R$, in the new GKSL representation, transforms into $V^* R V$, which is also self-adjoint. Expanding $\rho^{1/2} \theta L_k^* \theta$ with respect to the orthogonal basis $(L_\ell \rho^{1/2})_{\ell\ge 1}$, for all $k \ge 1$, we have
$$\rho^{1/2} \theta L_k^* \theta = \sum_{\ell\ge 1} \frac{\mathrm{tr}\left(\rho^{1/2} L_\ell^* \rho^{1/2} \theta L_k^* \theta\right)}{\| L_\ell \rho^{1/2} \|_{HS}^2}\, L_\ell \rho^{1/2}, \tag{23}$$
i.e. $\rho^{1/2} \theta L_k^* \theta = \sum_\ell y_{k\ell} L_\ell \rho^{1/2}$ with a unitary matrix $Y$ of complex numbers $y_{k\ell}$. Clearly, we have $y_{k\ell} = (C^{-1}R)_{\ell k}$. It follows then from condition 3 above that $Y$ coincides with the unitary operator $(C^{-1}R)^T$ and condition 2 of Theorem 8 holds. Moreover, $Y$ is self-adjoint because both $R$ and $C$ are.

As an immediate consequence of the commutation of $R$ and $C$ we have the following parallel of Corollary 1 for the SQDB condition.

Corollary 2. Suppose that a QMS $\mathcal{T}$ satisfies the SQDB-$\theta$ condition. For every special GKSL representation of $\mathcal{L}$ with operators $L_\ell \rho^{1/2}$ orthogonal as Hilbert-Schmidt operators on $\mathsf{h}$, if $\mathrm{tr}(\rho^{1/2} L_\ell^* \rho^{1/2} \theta L_k^* \theta) \ne 0$ for a pair of indices $k, \ell \ge 1$, then $\mathrm{tr}(\rho L_\ell^* L_\ell) = \mathrm{tr}(\rho L_k^* L_k)$.

When the time reversal $\theta$ is given by the conjugation $\theta u = \bar u$ (with respect to some orthonormal basis of $\mathsf{h}$), $\theta x^* \theta$ is equal to the transpose $x^T$ of $x$ and we find the following

Corollary 3. $\mathcal{T}$ satisfies the SQDB-$\theta$ condition if and only if there exists a special GKSL representation of $\mathcal{L}$, with operators $G, L_k$, such that:
1. $\rho^{1/2} G^T = G \rho^{1/2}$;
2. $\rho^{1/2} L_k^T = \sum_j u_{kj} L_j \rho^{1/2}$ for some unitary self-adjoint $(u_{kj})_{kj}$.

6. SQDB-$\theta$ for QMS on $M_2(\mathbb{C})$

In this section, as an application, we find a standard form of a special GKSL representation of the generator $\mathcal{L}$ of a QMS on $M_2(\mathbb{C})$ satisfying the SQDB-$\theta$. The faithful invariant state $\rho$, in a suitable basis of $\mathbb{C}^2$, can be written in the form
$$\rho = \begin{pmatrix} \nu & 0 \\ 0 & 1-\nu \end{pmatrix} = \frac12 \left( \sigma_0 + (2\nu - 1)\sigma_3 \right), \qquad 0 < \nu < 1,$$
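where $\sigma_0$ is the identity matrix and $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
The time reversal $\theta$ is the usual conjugation in the same basis of $\mathbb{C}^2$. In order to determine the structure of the operators $G$ and $L_k$ satisfying the conditions of Corollary 3 we first find a convenient basis of $M_2(\mathbb{C})$. We choose then a basis of eigenvectors of the linear map $X \mapsto \rho^{1/2} X^T \rho^{-1/2}$ in $M_2(\mathbb{C})$ given by $\sigma_0, \sigma_1^\nu, \sigma_2^\nu, \sigma_3$, where
$$\sigma_1^\nu = \begin{pmatrix} 0 & \sqrt{2\nu} \\ \sqrt{2(1-\nu)} & 0 \end{pmatrix}, \qquad \sigma_2^\nu = \begin{pmatrix} 0 & -i\sqrt{2\nu} \\ i\sqrt{2(1-\nu)} & 0 \end{pmatrix}.$$
Indeed, $\sigma_0, \sigma_1^\nu, \sigma_3$ (resp. $\sigma_2^\nu$) are eigenvectors of the eigenvalue $1$ (resp. $-1$). Every special GKSL representation of $\mathcal{L}$ is given by (see [11], Lemma 6.1)
$$L_k = -(2\nu - 1) z_{k3} \sigma_0 + z_{k1} \sigma_1^\nu + z_{k2} \sigma_2^\nu + z_{k3} \sigma_3, \qquad k \in J \subseteq \{1, 2, 3\},$$
with vectors $z_k := (z_{k1}, z_{k2}, z_{k3})$ ($k \in J$) linearly independent in $\mathbb{C}^3$. The SQDB-$\theta$ holds if and only if $G, L_k$ satisfy
(i) $G = \rho^{1/2} G^T \rho^{-1/2}$,
(ii) $L_k = \sum_{j\in J} u_{kj}\, \rho^{1/2} L_j^T \rho^{-1/2}$ for some unitary self-adjoint $U = (u_{kj})_{k,j\in J}$.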
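The eigenvector claim for the map $X \mapsto \rho^{1/2} X^T \rho^{-1/2}$ and the condition $\mathrm{tr}(\rho L_k) = 0$ are easy to confirm numerically. The sketch below is only a sanity check; the value of $\nu$ and the coefficients `z1, z2, z3` are arbitrary test values, and the function names are illustrative.

```python
import numpy as np


def basis(nu):
    """Return sigma_0, sigma_1^nu, sigma_2^nu, sigma_3 for a given 0 < nu < 1."""
    s0 = np.eye(2, dtype=complex)
    s3 = np.diag([1.0, -1.0]).astype(complex)
    a, b = np.sqrt(2.0 * nu), np.sqrt(2.0 * (1.0 - nu))
    s1nu = np.array([[0.0, a], [b, 0.0]], dtype=complex)
    s2nu = np.array([[0.0, -1j * a], [1j * b, 0.0]], dtype=complex)
    return s0, s1nu, s2nu, s3


def twisted_transpose(x, nu):
    """The linear map X -> rho^{1/2} X^T rho^{-1/2} with rho = diag(nu, 1 - nu)."""
    rho_half = np.diag(np.sqrt([nu, 1.0 - nu]))
    rho_half_inv = np.diag(1.0 / np.sqrt([nu, 1.0 - nu]))
    return rho_half @ x.T @ rho_half_inv


nu = 0.3
s0, s1nu, s2nu, s3 = basis(nu)

# Expected eigenvalues: +1 for sigma_0, sigma_1^nu, sigma_3 and -1 for sigma_2^nu.
eigen_checks = [
    np.allclose(twisted_transpose(s0, nu), s0),
    np.allclose(twisted_transpose(s1nu, nu), s1nu),
    np.allclose(twisted_transpose(s2nu, nu), -s2nu),
    np.allclose(twisted_transpose(s3, nu), s3),
]

# A generic L_k of the stated form has zero mean in the state rho.
z1, z2, z3 = 0.4 + 0.1j, -0.2j, 0.7
Lk = -(2.0 * nu - 1.0) * z3 * s0 + z1 * s1nu + z2 * s2nu + z3 * s3
trace_rho_Lk = np.trace(np.diag([nu, 1.0 - nu]) @ Lk)
```

The zero-mean property holds because $\mathrm{tr}(\rho\sigma_0) = 1$, $\mathrm{tr}(\rho\sigma_3) = 2\nu - 1$ and the two off-diagonal basis elements have vanishing trace against the diagonal $\rho$.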
Now, if $J \ne \emptyset$, since every unitary self-adjoint matrix is diagonalizable and its spectrum is contained in $\{-1, 1\}$, it follows that $U = W^* D W$ for some unitary matrix $W = (w_{ij})_{i,j\in J}$ and some diagonal matrix $D$ of the form
$$\mathrm{diag}(\epsilon_1, \ldots, \epsilon_{|J|}), \qquad \epsilon_i \in \{-1, 1\}, \tag{24}$$
where $|J|$ denotes the cardinality of $J$. Therefore, replacing the $L_k$'s by operators $L'_k := \sum_{j\in J} w_{kj} L_j$ if necessary, we can take $U$ of the form (24).

We now analyze the structure of the $L_k$'s corresponding to the different (diagonal) forms of $U$. By condition (ii) we have either $L_k = \rho^{1/2} L_k^T \rho^{-1/2}$ or $L_k = -\rho^{1/2} L_k^T \rho^{-1/2}$; an easy calculation shows that
$$L_k = \rho^{1/2} L_k^T \rho^{-1/2} \quad \text{if and only if} \quad z_{k2} = 0 \tag{25}$$
and
$$L_k = -\rho^{1/2} L_k^T \rho^{-1/2} \quad \text{if and only if} \quad z_{k1} = z_{k3} = 0. \tag{26}$$
Therefore, the linear independence of $\{z_j : j \in J\}$ forces $U$ to have at most two eigenvalues equal to $1$ and at most one equal to $-1$ and, with a suitable choice of a phase factor for each $L_k$, we can write
$$L_k = (1 - 2\nu) r_k \sigma_0 + r_k \sigma_3 + \zeta_k \sigma_1^\nu \quad \text{for } k = 1, 2 \text{ and } r_k \in \mathbb{R},\ \zeta_k \in \mathbb{C}, \tag{27}$$
$$L_3 = r_3 \sigma_2^\nu, \qquad r_3 \in \mathbb{R}. \tag{28}$$
Clearly $L_1$ and $L_2$ are linearly independent if and only if $r_1 \zeta_2 \ne r_2 \zeta_1$. This, together with non-triviality conditions, leaves us, up to a change of indices, with the following possibilities:
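(a) $|J| = 1$, $U = 1$, then $J = \{1\}$ with $r_1 \zeta_1 \ne 0$,
(b) $|J| = 1$, $U = -1$, then $J = \{3\}$ with $r_3 \ne 0$,
(c) $|J| = 2$, $U = \mathrm{diag}(1, 1)$, then $J = \{1, 2\}$ with $r_1 \zeta_1 r_2 \zeta_2 \ne 0$, $r_1 \zeta_2 \ne r_2 \zeta_1$,
(d) $|J| = 2$, $U = \mathrm{diag}(1, -1)$, then $J = \{1, 3\}$ with $r_3 \ne 0$, $r_1 \zeta_1 \ne 0$,
(e) $|J| = 3$, $U = \mathrm{diag}(1, 1, -1)$, then $J = \{1, 2, 3\}$ with $r_1 \zeta_2 \ne r_2 \zeta_1$, $r_3 \ne 0$, $r_1 \zeta_1 r_2 \zeta_2 \ne 0$.

To conclude, we analyze condition (i). If $G = (g_{jk})_{1\le j,k\le 2}$ then statement (i) is equivalent to
$$\sqrt{\nu}\, g_{21} = \sqrt{1-\nu}\, g_{12}. \tag{29}$$
Since $G = -iH - 2^{-1} \sum_k L_k^* L_k$ with $H = \sum_{j=1}^3 v_j \sigma_j$, $v_j \in \mathbb{R}$, and $\sum_k L_k^* L_k$ is equal to the sum of a term depending only on $\sigma_0$ and $\sigma_3$ plus
$$\sum_{k=1,2} 2 r_k \begin{pmatrix} 0 & \zeta_k \sqrt{2\nu}\,(1-\nu) - \bar\zeta_k\, \nu \sqrt{2(1-\nu)} \\ \bar\zeta_k \sqrt{2\nu}\,(1-\nu) - \zeta_k\, \nu \sqrt{2(1-\nu)} & 0 \end{pmatrix},$$
in the case $J \ne \emptyset$ the identity (29) holds if and only if
$$\begin{aligned} v_1 \left( \sqrt{1-\nu} - \sqrt{\nu} \right) &= -\sqrt{2\nu(1-\nu)} \left( \sqrt{1-\nu} + \sqrt{\nu} \right)^2 \sum_{k=1}^{2} r_k\, \Im \zeta_k, \\ v_2 \left( \sqrt{1-\nu} + \sqrt{\nu} \right) &= -\sqrt{2\nu(1-\nu)} \left( \sqrt{1-\nu} - \sqrt{\nu} \right)^2 \sum_{k=1}^{2} r_k\, \Re \zeta_k. \end{aligned} \tag{30}$$
On the other hand, when $J = \emptyset$, condition (29) is equivalent to $\sqrt{\nu}\,(v_1 + i v_2) = \sqrt{1-\nu}\,(v_1 - i v_2)$, i.e.
$$v_1 \left( \sqrt{1-\nu} - \sqrt{\nu} \right) = 0, \qquad v_2 = 0. \tag{31}$$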
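The relations (30) can be cross-checked by building $G$ explicitly and testing condition (i). The sketch below is an illustration only: the numerical values of $\nu$, $r_k$, $\zeta_k$, $r_3$ and $v_3$ are arbitrary assumptions, the function name is hypothetical, and $\nu \ne 1/2$ is required so that (30) can be solved for $v_1$.

```python
import numpy as np


def build_G(nu, r, zeta, r3=0.6, v3=0.2):
    """Return G = -iH - (1/2) sum_k L_k^* L_k with v1, v2 fixed by (30).

    r and zeta hold (r_1, r_2) and (zeta_1, zeta_2) of (27); r3 enters (28).
    Requires nu != 1/2.
    """
    s, t = np.sqrt(nu), np.sqrt(1.0 - nu)
    pref = -np.sqrt(2.0 * nu * (1.0 - nu))
    im_sum = sum(rk * zk.imag for rk, zk in zip(r, zeta))
    re_sum = sum(rk * zk.real for rk, zk in zip(r, zeta))
    v1 = pref * (t + s) ** 2 * im_sum / (t - s)
    v2 = pref * (t - s) ** 2 * re_sum / (t + s)

    s0 = np.eye(2, dtype=complex)
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.diag([1.0, -1.0]).astype(complex)
    a, b = np.sqrt(2.0 * nu), np.sqrt(2.0 * (1.0 - nu))
    s1nu = np.array([[0, a], [b, 0]], dtype=complex)
    s2nu = np.array([[0, -1j * a], [1j * b, 0]], dtype=complex)

    H = v1 * s1 + v2 * s2 + v3 * s3
    Ls = [(1.0 - 2.0 * nu) * rk * s0 + rk * s3 + zk * s1nu for rk, zk in zip(r, zeta)]
    Ls.append(r3 * s2nu)
    return -1j * H - 0.5 * sum(L.conj().T @ L for L in Ls)


nu = 0.2
G = build_G(nu, r=(1.0, -0.5), zeta=(0.3 + 0.4j, 1.1 - 0.2j))
rho_half = np.diag(np.sqrt([nu, 1.0 - nu]))
rho_half_inv = np.diag(1.0 / np.sqrt([nu, 1.0 - nu]))

# Condition (i): G = rho^{1/2} G^T rho^{-1/2}, equivalently identity (29).
condition_i_ok = np.allclose(G, rho_half @ G.T @ rho_half_inv)
identity_29_gap = abs(np.sqrt(nu) * G[1, 0] - np.sqrt(1.0 - nu) * G[0, 1])
```

Note that $L_3^* L_3$ is diagonal, so $r_3$ does not affect the off-diagonal entries of $G$; this is why only $k = 1, 2$ appear in (30).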
Therefore we have the following possible standard forms for $\mathcal{L}$.

Theorem 10. Let $L_1, L_2, L_3$ be as in (27), (28), $H = \sum_{j=1}^3 v_j \sigma_j$ with $v_1, v_2$ as in (30) and $v_3 \in \mathbb{R}$. The QMS $\mathcal{T}$ satisfies the SQDB-$\theta$ if and only if there exists a special GKSL representation of $\mathcal{L}$ given, up to phase factors multiplying $L_1, L_2, L_3$, in one of the following ways:
(o) $H$ with $v_1 = v_2 = 0$ if $\nu \ne 1/2$, and $v_1 \in \mathbb{R}$, $v_2 = 0$ if $\nu = 1/2$,
(a) $H, L_1$ with $r_1 \zeta_1 \ne 0$,
(b) $H, L_3$ with $r_3 \ne 0$,
(c) $H, L_1, L_2$ with $r_1 \zeta_1 r_2 \zeta_2 \ne 0$ and $r_1 \zeta_2 \ne r_2 \zeta_1$,
(d) $H, L_1, L_3$ with $r_3 \ne 0$ and $r_1 \zeta_1 \ne 0$,
(e) $H, L_1, L_2, L_3$ with $r_1 \zeta_2 \ne r_2 \zeta_1$, $r_1 \zeta_1 r_2 \zeta_2 \ne 0$ and $r_3 \ne 0$.
Roughly speaking, the standard form of $\mathcal{L}$ corresponds, up to degeneracies when some of the parameters vanish or when some linear dependence arises, to the case (e). We know that a QMS satisfying the usual (i.e. with pre-scalar product with $s = 0$) QDB-$\theta$ condition must commute with the modular group. Moreover, when this happens, the SQDB-$\theta$ and QDB-$\theta$ conditions are equivalent (see e.g. [6,11]). We finally show how the generators of QMSs on $M_2(\mathbb{C})$ satisfying the usual QDB-$\theta$ condition can be recovered by a special choice of the parameters $r_1, r_2, r_3, \zeta_1, \zeta_2$ in Theorem 10 describing the generator of a QMS satisfying the SQDB-$\theta$ condition.
To this end, we recall that $\mathcal{T}$ fulfills the QDB-$\theta$ when $\mathrm{tr}(\rho x \mathcal{T}_t(y)) = \mathrm{tr}(\rho\, \theta y^* \theta\, \mathcal{T}_t(\theta x^* \theta))$ for all $x, y \in \mathcal{B}(\mathsf{h})$. In [11] we classified generators of QMS on $M_2(\mathbb{C})$ satisfying the QDB condition without time reversal (i.e., formally, replacing $\theta$ by the identity operator, which is, of course, not antiunitary). The same type of arguments show that, disregarding trivialisations that may occur when some of the parameters below vanish, QMSs on $M_2(\mathbb{C})$ satisfying the QDB-$\theta$ condition have the following standard form
$$\mathcal{L}(x) = i[H, x] - \frac{|\eta|^2}{2} \left( L^2 x - 2 L x L + x L^2 \right) - \frac{|\lambda|^2}{2} \left( \sigma^- \sigma^+ x - 2 \sigma^- x \sigma^+ + x \sigma^- \sigma^+ \right) - \frac{|\mu|^2}{2} \left( \sigma^+ \sigma^- x - 2 \sigma^+ x \sigma^- + x \sigma^+ \sigma^- \right), \tag{32}$$
where $H = h_0 \sigma_0 + h_3 \sigma_3$ ($h_0, h_3 \in \mathbb{R}$), $L = -(2\nu - 1)\sigma_0 + \sigma_3$, $\sigma^\pm = (\sigma_1 \pm i\sigma_2)/2$ and, changing phases if necessary, $\lambda, \mu, \eta$ can be chosen as non-negative real numbers satisfying
$$\lambda^2 (1 - \nu) = \nu \mu^2. \tag{33}$$
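The role of the constraint (33) can be seen in a short numerical experiment. The sketch below is a hedged illustration: it implements the standard form (32) with arbitrary test values for $\nu$, $\lambda$, $\eta$, $h_0$, $h_3$ (all assumptions), fixes $\mu$ through (33), and checks both the invariance of $\rho$ and the generator-level version of the QDB-$\theta$ identity obtained by differentiating at $t = 0$, with $\theta y^* \theta = y^T$ for the conjugation $\theta$.

```python
import numpy as np


def qdb_theta_generator(nu, lam, eta, h0=0.3, h3=-0.7):
    """Return (rho, gen) for the standard form (32); mu is determined by (33)."""
    mu = lam * np.sqrt((1.0 - nu) / nu)  # lam^2 (1 - nu) = nu * mu^2
    s0 = np.eye(2, dtype=complex)
    s3 = np.diag([1.0, -1.0]).astype(complex)
    sp = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)  # sigma^+ = (s1 + i s2)/2
    sm = sp.conj().T                                         # sigma^-
    rho = np.diag([nu, 1.0 - nu]).astype(complex)
    H = h0 * s0 + h3 * s3
    ops = [eta * (-(2.0 * nu - 1.0) * s0 + s3), lam * sp, mu * sm]

    def gen(x):
        out = 1j * (H @ x - x @ H)
        for op in ops:
            op_dag_op = op.conj().T @ op
            out = out - 0.5 * (op_dag_op @ x - 2.0 * op.conj().T @ x @ op + x @ op_dag_op)
        return out

    return rho, gen


rng = np.random.default_rng(0)
rho, gen = qdb_theta_generator(nu=0.3, lam=0.8, eta=0.5)
x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
y = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

# Invariance of rho: tr(rho L(x)) = 0 for every x.
invariance_gap = abs(np.trace(rho @ gen(x)))

# Generator form of QDB-theta: tr(rho x L(y)) = tr(rho y^T L(x^T)).
qdb_gap = abs(np.trace(rho @ x @ gen(y)) - np.trace(rho @ y.T @ gen(x.T)))
```

Both gaps are expected to vanish up to rounding. If $\mu$ is chosen so that (33) fails, the invariance check fails as well, since the two jump terms no longer balance on the diagonal of $\rho$.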
Choosing $r_1 = \eta$, $\zeta_1 = 0$ we find immediately that the operator $L$ in (32) coincides with the operator $L_1$ in (27). Moreover, choosing $r_2 = 0$ we find $v_2 = 0$ and also $v_1 = 0$ for $\nu \ne 1/2$. A straightforward computation yields
$$\begin{pmatrix} \lambda \sigma^+ \\ \mu \sigma^- \end{pmatrix} = \begin{pmatrix} \lambda/(2\zeta_2 \sqrt{2\nu}) & i\lambda/(2 r_3 \sqrt{2\nu}) \\ \mu/(2\zeta_2 \sqrt{2(1-\nu)}) & -i\mu/(2 r_3 \sqrt{2(1-\nu)}) \end{pmatrix} \begin{pmatrix} L_2 \\ L_3 \end{pmatrix}$$
and the above $2 \times 2$ matrix is unitary if we choose $\zeta_2 = \lambda/(2\sqrt{\nu})$, $r_3 = i\mu/(2\sqrt{1-\nu}) = i\zeta_2$ because of (33), changing the phase of $r_3$ in order to find a unitary that is also self-adjoint. This shows that we can recover the standard form (32) choosing $H, L_1, L_2, L_3$ as in Theorem 10 (e) with $r_1 = \eta$, $\zeta_1 = 0$, $r_2 = 0$, $\zeta_2 = \lambda/(2\sqrt{\nu})$, $r_3 = i\mu/(2\sqrt{1-\nu})$, $v_1 = v_2 = 0$.

Appendix

We denote by $\ell^2(J)$ the Hilbert space of complex-valued, square summable sequences indexed by a finite or countable set $J$.

Lemma 3. Let $\mathcal{J}$ be a complex separable Hilbert space and let $(\xi_j)_{j\in J}$, $(\eta_j)_{j\in J}$ be two Hilbertian bases of $\mathcal{J}$ satisfying $\sum_{j\in J} \|\xi_j\|^2 < \infty$, $\sum_{j\in J} \|\eta_j\|^2 < \infty$. The complex matrices $A = (a_{jk})_{j,k\in J}$, $B = (b_{jk})_{j,k\in J}$, $C = (c_{jk})_{j,k\in J}$ given by
$$a_{jk} = \langle \xi_j, \xi_k \rangle, \qquad b_{jk} = \langle \xi_j, \eta_k \rangle, \qquad c_{jk} = \langle \eta_j, \eta_k \rangle$$
define trace class operators on $\ell^2(J)$ satisfying $B^* A^{-1} B = C$. Moreover $A$ and $C$ are self-adjoint and positive.

Proof. Note that
$$\sum_{j,k\ge 1} |b_{jk}|^2 \le \sum_{j,k\ge 1} \|\xi_j\|^2 \cdot \|\eta_k\|^2 = \sum_j \|\xi_j\|^2 \cdot \sum_k \|\eta_k\|^2 < \infty.$$
Therefore $B$ defines a Hilbert-Schmidt operator on $\ell^2(J)$.
In a similar way $A$ and $C$ define Hilbert-Schmidt operators on $\ell^2(J)$ that are obviously self-adjoint. These are also positive because for any sequence $(z_m)_{m\in J}$ of complex numbers with $z_m \ne 0$ for a finite number of indices $m$ at most we have
$$\sum_{m,n\in J} \bar z_m a_{mn} z_n = \sum_{m,n\in J} \bar z_m \langle \xi_m, \xi_n \rangle z_n = \Big\| \sum_{m\in J} z_m \xi_m \Big\|^2 \ge 0.$$
Moreover, they are trace class because
$$\sum_{j\in J} a_{jj} = \sum_{j\in J} \|\xi_j\|^2 < \infty, \qquad \sum_{j\in J} c_{jj} = \sum_{j\in J} \|\eta_j\|^2 < \infty.$$
Finally, we show that $B$ is also trace class. By the spectral theorem, we can find a unitary $V = (v_{kj})_{k,j\in J}$ on $\ell^2(J)$ such that $V^* A V$ is diagonal. The series $\sum_{m\in J} v_{mj} \xi_m$ is norm convergent because
$$\Big\| \sum_m v_{mj} \xi_m \Big\|^2 = \sum_{m,n\in J} \bar v_{nj} a_{nm} v_{mj} = (V^* A V)_{jj}.$$
The series $\sum_{m\in J} v_{mj} \eta_m$ is norm convergent as well for a similar reason. Therefore, putting $\xi'_j = \sum_{m\in J} v_{mj} \xi_m$ and $\eta'_j = \sum_{m\in J} v_{mj} \eta_m$, we find immediately $(V^* A V)_{kj} = \langle \xi'_k, \xi'_j \rangle = 0$ for $j \ne k$, $(V^* A V)_{jj} = \|\xi'_j\|^2$ and
$$(V^* B V)_{kj} = \sum_{m,n} \bar v_{mk} v_{nj} \langle \xi_m, \eta_n \rangle = \langle \xi'_k, \eta'_j \rangle, \qquad (V^* C V)_{kj} = \sum_{m,n} \bar v_{mk} v_{nj} \langle \eta_m, \eta_n \rangle = \langle \eta'_k, \eta'_j \rangle.$$
As a consequence, the following identity
$$\left( V^* B^* A^{-1} B V \right)_{kj} = \left( (V^* B^* V)(V^* A V)^{-1}(V^* B V) \right)_{kj} = \sum_{m\in J} (V^* B^* V)_{km} (V^* A V)^{-1}_{mm} (V^* B V)_{mj} = \sum_{m\in J} \Big\langle \eta'_k, \frac{\xi'_m}{\|\xi'_m\|} \Big\rangle \Big\langle \frac{\xi'_m}{\|\xi'_m\|}, \eta'_j \Big\rangle = \langle \eta'_k, \eta'_j \rangle = (V^* C V)_{kj}$$
holds because $(\xi'_m / \|\xi'_m\|)_{m\in J}$ is an orthonormal basis of $\mathcal{J}$. This proves that $V^* B^* A^{-1} B V = V^* C V$, i.e. $B^* A^{-1} B = C$. It follows that $|A^{-1/2} B| = C^{1/2}$ is Hilbert-Schmidt as well as $A^{-1/2} B$, and $B = A^{1/2}(A^{-1/2} B)$ is trace class, being the product of two Hilbert-Schmidt operators.

Acknowledgements. The financial support from the MIUR PRIN 2007 project "Quantum Probability and Applications to Information Theory" is gratefully acknowledged.
References

1. Albeverio, S., Goswami, D.: A remark on the structure of symmetric quantum dynamical semigroups. Inf. Dim. Anal. Quant. Prob. Relat. Top. 5, 571–579 (2002)
2. Accardi, L., Imafuku, K.: Dynamical detailed balance and local KMS condition for non-equilibrium states. Int. J. Mod. Phys. B 18(4-5), 435–467 (2004)
3. Accardi, L., Mohari, A.: Time reflected markov processes. Inf. Dim. Anal. Quant. Prob. Relat. Top. 2, 397–426 (1999)
4. Agarwal, G.S.: Open quantum Markovian systems and the microreversibility. Z. Physik 258(5), 409–422 (1973)
5. Alicki, R.: On the detailed balance condition for non-Hamiltonian systems. Rep. Math. Phys. 10, 249–258 (1976)
6. Cipriani, F.: Dirichlet forms and markovian semigroups on standard forms of von neumann algebras. J. Funct. Anal. 147, 259–300 (1997)
7. Cipriani, F.: Dirichlet forms on noncommutative spaces. In: Quantum Potential Theory, Lecture Notes in Math., 1954, Berlin-Heidelberg-New York: Springer, 2008, pp. 161–276
8. Davies, E.B., Lindsay, J.M.: Non-commutative symmetric Markov semigroups. Math. Z. 210, 379–411 (1992)
9. Derezynski, J., Fruboes, R.: Fermi golden rule and open quantum systems. In: S. Attal et al. (eds.) Open Quantum Systems III, Lecture Notes in Mathematics 1882, Berlin-Heidelberg-New York: Springer, 2006, pp. 67–116
10. Fagnola, F., Umanità, V.: Detailed balance, time reversal and generators of quantum Markov semigroups, M. Zametki, 84 (1) 108–116 (2008) (Russian); translation Math. Notes 84 (1–2), 108–115 (2008)
11. Fagnola, F., Umanità, V.: Generators of detailed balance quantum markov semigroups. Inf. Dim. Anal. Quant. Prob. Relat. Top. 10(3), 335–363 (2007)
12. Fagnola, F., Umanità, V.: On two quantum versions of the detailed balance condition. To appear in: Noncommutative harmonic analysis with applications to probability, M. Bozejko, et al. eds., Banach Center Publications, Polish Academy of Sciences 2009
13. Fukushima, M., Oshima, Y., Takeda, M.: Dirichlet Forms and Symmetric Markov Processes, de Gruyter Studies in Mathematics 19, Berlin: de Gruyter, 1994
14. Goldstein, S., Lindsay, J.M.: Beurling-Deny condition for KMS-symmetric dynamical semigroups. C. R. Acad. Sci. Paris 317, 1053–1057 (1993)
15. Goldstein, S., Lindsay, J.M.: KMS symmetric semigroups. Math. Z. 219, 591–608 (1995)
16. Gorini, V., Kossakowski, A., Sudarshan, E.C.G.: Completely positive dynamical semigroups of N-level systems. J. Math. Phys. 17, 821–825 (1976)
17. Guido, D., Isola, T., Scarlatti, S.: Non-symmetric Dirichlet forms on semifinite von Neumann algebras. J. Funct. Anal. 135(1), 50–75 (1996)
18. Kossakowski, A., Frigerio, A., Gorini, V., Verri, M.: Quantum detailed balance and KMS condition. Commun. Math. Phys. 57, 97–110 (1977)
19. Lindblad, G.: On the generators of quantum dynamical semigroups. Commun. Math. Phys. 48, 119–130 (1976)
20. Majewski, W.A.: On the relationship between the reversibility of detailed balance conditions. Ann. Inst. Henri Poincaré, A 39, 45–54 (1983)
21. Majewski, W.A.: The detailed balance condition in quantum statistical mechanics. J. Math. Phys. 25(3), 614–616 (1984)
22. Majewski, W.A., Streater, R.F.: Detailed balance and quantum dynamical maps. J. Phys. A: Math. Gen. 31, 7981–7995 (1998)
23. Park, Y.M.: Remarks on the structure of Dirichlet forms on standard forms of von Neumann Algebras. Inf. Dim. Anal. Quant. Prob. Rel. Top. 8, 179–197 (2005)
24. Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Monographs in Mathematics 85, Basel: Birkhäuser-Verlag, 1992
25. Petz, D.: Conditional expectation in quantum probability. In: L. Accardi and W. von Waldenfels (eds.), Quantum Probability and Applications III. Proceedings, Oberwolfach 1987. LNM 1303, Berlin-Heidelberg-New York: Springer, 1988, pp. 251–260
26. Sauvageot, J.L.: Quantum Dirichlet forms, differential calculus and semigroups. In: L. Accardi, W. von Waldenfels (eds.), Quantum Probability and Applications V. LNM 1442, Berlin-Heidelberg-New York: Springer, 1990, pp. 334–346
27. Talkner, P.: The failure of the quantum regression hypothesis. Ann. Phys. 167(2), 390–436 (1986)

Communicated by M.B. Ruskai
Commun. Math. Phys. 298, 549–572 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1044-5
Communications in
Mathematical Physics
Random Matrices: Universality of Local Eigenvalue Statistics up to the Edge Terence Tao1,∗ , Van Vu2,∗∗ 1 Department of Mathematics, UCLA, Los Angeles CA 90095-1555, USA. E-mail:
[email protected] 2 Department of Mathematics, Rutgers, Piscataway, NJ 08854, USA. E-mail:
[email protected] Received: 13 August 2009 / Accepted: 8 January 2010 Published online: 3 April 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
∗ T. Tao is supported by a grant from the MacArthur Foundation, by NSF grant DMS-0649473, and by the NSF Waterman award.
∗∗ V. Vu is supported by research grants DMS-0901216 and AFOSAR-FA-9550-09-1-0167.

Abstract: This is a continuation of our earlier paper (Tao and Vu, http://arxiv.org/abs/0908.1982v4[math.PR], 2010) on the universality of the eigenvalues of Wigner random matrices. The main new results of this paper are an extension of the results in Tao and Vu (http://arxiv.org/abs/0908.1982v4[math.PR], 2010) from the bulk of the spectrum up to the edge. In particular, we prove a variant of the universality results of Soshnikov (Commun Math Phys 207(3):697–733, 1999) for the largest eigenvalues, assuming moment conditions rather than symmetry conditions. The main new technical observation is that there is a significant bias in the Cauchy interlacing law near the edge of the spectrum which allows one to continue ensuring the delocalization of eigenvectors.

1. Introduction

In our recent paper [27], a universality result (the Four Moment Theorem) was established for the eigenvalue spacings in the bulk of the spectrum of random Hermitian matrices. (See [6] for an extended discussion of the universality phenomenon, and [27] for further references on universality results in the context of Wigner Hermitian matrices.) The main purpose of this paper is to extend this universality result to the edge of the spectrum as well.

1.1. Universality in the bulk. To recall the Four Moment Theorem, we need some notation.

Definition 1.1 (Condition C0). A random Hermitian matrix $M_n = (\zeta_{ij})_{1\le i,j\le n}$ is said to obey condition C0 if
• The $\zeta_{ij}$ are independent (but not necessarily identically distributed) for $1 \le i \le j \le n$. For $1 \le i < j \le n$, they have mean zero and variance 1; for $i = j$, they have mean zero and variance $c$ for some fixed $c > 0$ independent of $n$.
• (Uniform exponential decay) There exist constants $C, C' > 0$ such that
$$\mathbf{P}(|\zeta_{ij}| \ge t^C) \le \exp(-t) \tag{1}$$
for all $t \ge C'$ and $1 \le i, j \le n$.

Examples of random Hermitian matrices obeying Condition C0 include the GUE and GOE ensembles, or the random symmetric Bernoulli ensemble in which each of the $\zeta_{ij}$ is equal to $\pm 1$ with equal probability $1/2$. In GOE one has $c = 2$, but in the other two cases one has $c = 1$. The arguments in the previous paper [27] were largely phrased for the case $c = 1$, but it is not difficult to see that the arguments extend without difficulty to other values of $c$ (the main point being that a modification of the variance of a single entry of a row vector does not significantly affect the Talagrand concentration inequality, [27, Lemma 43], or Lemma 2.1 below).

Given an $n \times n$ Hermitian matrix $A$, we denote its $n$ eigenvalues as
$$\lambda_1(A) \le \cdots \le \lambda_n(A),$$
and write $\lambda(A) := (\lambda_1(A), \ldots, \lambda_n(A))$. We also let $u_1(A), \ldots, u_n(A) \in \mathbb{C}^n$ be an orthonormal basis of eigenvectors of $A$ with $A u_i(A) = \lambda_i(A) u_i(A)$; these eigenvectors $u_i(A)$ are only determined up to a complex phase even when the eigenvalues are simple, but this ambiguity will not cause a difficulty in our results as we will only be interested in the magnitude $|u_i(A)^* X|$ of various inner products $u_i(A)^* X$ of $u_i(A)$ with other vectors $X$.

It will be convenient to introduce the following notation for frequent events depending on $n$, in increasing order of likelihood:

Definition 1.2 (Frequent events). Let $E$ be an event depending on $n$.
• $E$ holds asymptotically almost surely if$^1$ $\mathbf{P}(E) = 1 - o(1)$.
• $E$ holds with high probability if $\mathbf{P}(E) \ge 1 - O(n^{-c})$ for some constant $c > 0$.
• $E$ holds with overwhelming probability if $\mathbf{P}(E) \ge 1 - O_C(n^{-C})$ for every constant $C > 0$ (or equivalently, that $\mathbf{P}(E) \ge 1 - \exp(-\omega(\log n))$).
• $E$ holds almost surely if $\mathbf{P}(E) = 1$.

Definition 1.3 (Moment matching). We say that two complex random variables $\zeta$ and $\zeta'$ match to order $k$ if
$$\mathbf{E}\, \mathrm{Re}(\zeta)^m\, \mathrm{Im}(\zeta)^l = \mathbf{E}\, \mathrm{Re}(\zeta')^m\, \mathrm{Im}(\zeta')^l$$
for all $m, l \ge 0$ such that $m + l \le k$.
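As a small worked example of Definition 1.3, the sketch below compares the exact moments of a real standard Gaussian (as for off-diagonal GOE entries) with those of a symmetric Bernoulli variable taking the values $\pm 1$. Both are real, so all mixed moments with $l \ge 1$ vanish and only the moments $\mathbf{E}\zeta^m$ need to be compared. The function names are illustrative and not taken from [27].

```python
from math import prod


def gaussian_moment(m):
    """E g^m for a real standard Gaussian g: 0 for odd m, (m - 1)!! for even m."""
    return 0 if m % 2 else prod(range(1, m, 2))


def bernoulli_moment(m):
    """E b^m for b equal to +1 or -1 with probability 1/2 each."""
    return 0 if m % 2 else 1


def matching_order(moment_a, moment_b, max_order=10):
    """Largest k <= max_order such that the m-th moments agree for all 1 <= m <= k."""
    order = 0
    for m in range(1, max_order + 1):
        if moment_a(m) != moment_b(m):
            break
        order = m
    return order


order = matching_order(gaussian_moment, bernoulli_moment)
```

Here `order` equals 3: the first three moments agree (0, 1, 0), while the fourth moments differ (3 for the Gaussian, 1 for the Bernoulli variable). This is the situation in which only matching to order 3, rather than 4, is available.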
$^1$ See Sect. 1.4 for our conventions on asymptotic notation.

The first main result of [27] can now be stated as follows:

Theorem 1.4 (Four Moment Theorem) [27, Theorem 15]. There is a small positive constant $c_0$ such that for every $0 < \varepsilon < 1$ and $k \ge 1$ the following holds. Let $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ and $M'_n = (\zeta'_{ij})_{1 \le i,j \le n}$ be two random matrices satisfying C0. Assume furthermore that for any $1 \le i < j \le n$, $\zeta_{ij}$ and $\zeta'_{ij}$ match to order 4 and for any $1 \le i \le n$,
Universality up to the Edge
$\zeta_{ii}$ and $\zeta'_{ii}$ match to order 2. Set $A_n := \sqrt{n} M_n$ and $A'_n := \sqrt{n} M'_n$, and let $G : \mathbb{R}^k \to \mathbb{R}$ be a smooth function obeying the derivative bounds
$$|\nabla^j G(x)| \le n^{c_0} \tag{2}$$
for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$. Then for any $\varepsilon n \le i_1 < i_2 < \cdots < i_k \le (1-\varepsilon)n$, and for $n$ sufficiently large depending on $\varepsilon, k$ (and the constants $C, C'$ in Definition 1.1) we have
$$|\mathbf{E}(G(\lambda_{i_1}(A_n), \ldots, \lambda_{i_k}(A_n))) - \mathbf{E}(G(\lambda_{i_1}(A'_n), \ldots, \lambda_{i_k}(A'_n)))| \le n^{-c_0}. \tag{3}$$
If $\zeta_{ij}$ and $\zeta'_{ij}$ only match to order 3 rather than 4, then there is a positive constant $C$ independent of $c_0$ such that the conclusion (3) still holds provided that one strengthens (2) to $|\nabla^j G(x)| \le n^{-C j c_0}$ for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.

Informally, this theorem asserts that the distribution of any bounded number of eigenvalues in the bulk of the spectrum of a random Hermitian matrix obeying Condition C0 depends only on the first four moments of the coefficients. There is also a useful companion result to Theorem 1.4, which is used both in the proof of that theorem, and in several of its applications:

Theorem 1.5 (Lower tail estimates) [27, Theorem 17]. Let $0 < \varepsilon < 1$ be a constant, and let $M_n$ be a random matrix obeying Condition C0. Set $A_n := \sqrt{n} M_n$. Then for every $c_0 > 0$, and for $n$ sufficiently large depending on $\varepsilon$, $c_0$ and the constants $C, C'$ in Definition 1.1, and for each $\varepsilon n \le i \le (1-\varepsilon)n$, one has $\lambda_{i+1}(A_n) - \lambda_i(A_n) \ge n^{-c_0}$ with high probability. In fact, one has $\mathbf{P}(\lambda_{i+1}(A_n) - \lambda_i(A_n) \le n^{-c_0}) \le n^{-c_1}$ for some $c_1 > 0$ depending on $c_0$ (and independent of $\varepsilon$).

Theorem 1.4 (and to a lesser extent, Theorem 1.5) can be used to extend the range of applicability for various results on eigenvalue statistics in the bulk for Hermitian or symmetric matrices, for instance in extending results for special ensembles such as GUE or GOE (or ensembles obeying some regularity or divisibility conditions) to more general classes of matrices. See [27,13,10] for some examples of this type of extension. We also remark that a level repulsion estimate which has a similar spirit to Theorem 1.5 was established in [9, Theorem 3.5], although the latter result establishes repulsion of eigenvalues in a fixed small interval $I$, rather than at a fixed index $i$ of the sequence of eigenvalues, and does not seem to be directly substitutable for Theorem 1.5 in the arguments of this paper.

The results of Theorem 1.4 and Theorem 1.5 only control eigenvalues $\lambda_i(A_n)$ in the bulk region $\varepsilon n \le i \le (1-\varepsilon)n$ for some fixed $\varepsilon > 0$ (independent of $n$).
The reason for this restriction was technical, and originated from the use of the following two related results (which are variants of previous results of Erdős, Schlein, and Yau [7–9]), whose proofs relied on the assumption that one was in the bulk:
T. Tao, V. Vu
Theorem 1.6 (Concentration for ESD) [27, Theorem 56]. For any $\varepsilon, \delta > 0$ and any random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $1 \le K \le n^{1/2-\varepsilon}$, and any interval $I$ in $[-2+\varepsilon, 2-\varepsilon]$ of width $|I| \ge \frac{K^2 \log^{20} n}{n}$, the number of eigenvalues $N_I$ of $W_n := \frac{1}{\sqrt{n}} M_n$ in $I$ obeys the concentration estimate
$$\left| N_I - n \int_I \rho_{sc}(x)\,dx \right| \ll \delta n |I|$$
with overwhelming probability, where $\rho_{sc}$ is the semicircular distribution
$$\rho_{sc}(x) := \begin{cases} \frac{1}{2\pi}\sqrt{4 - x^2}, & |x| \le 2 \\ 0, & |x| > 2. \end{cases} \tag{4}$$
In particular, $N_I = \Theta_\varepsilon(n|I|)$ with overwhelming probability.

Proposition 1.7 (Delocalization of eigenvectors) [27, Prop. 58]. Let $\varepsilon, M_n, W_n, \zeta_{ij}, K$ be as in Theorem 1.6. Then for any $1 \le i \le n$ with $\lambda_i(W_n) \in [-2+\varepsilon, 2-\varepsilon]$, if $u_i(W_n)$ denotes a unit eigenvector corresponding to $\lambda_i(W_n)$, then with overwhelming probability each coordinate of $u_i(M_n)$ is $O_\varepsilon\!\left(\frac{K^2 \log^{20} n}{n^{1/2}}\right)$.

In the bulk region $[-2+\varepsilon, 2-\varepsilon]$, the semicircular function $\rho_{sc}$ is bounded away from zero. Thus, Theorem 1.6 ensures that the eigenvalues of $W_n$ in the bulk tend to have a mean spacing of $\Theta_\varepsilon(1/n)$ on the average. Applying the Cauchy interlacing law
$$\lambda_i(W_n) \le \lambda_i(W_{n-1}) \le \lambda_{i+1}(W_n), \tag{5}$$
where $W_{n-1}$ is an $(n-1) \times (n-1)$ minor of $W_n$, this implies that the bulk eigenvalues of $W_{n-1}$ are within $\Theta_\varepsilon(1/n)$ of the corresponding eigenvalues of $W_n$ on the average. Using linear algebra to express the coordinates of the eigenvector $u_i(M_n)$ in terms of $W_n$ and a minor $W_{n-1}$ (see Lemma 4.1 below), and using some concentration of measure results concerning the projection of a random vector to a subspace (see Lemma 2.1), we eventually obtain Proposition 1.7.

1.2. Universality up to the edge.

The main results of this paper are that the above four theorems can be extended to the edge of the spectrum (thus effectively sending $\varepsilon$ to zero). Let us now state these results more precisely. Firstly, we have the following extension of Theorem 1.6:

Theorem 1.8 (Concentration for ESD up to edge). Consider a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Let $0 < \delta < 1/2$ be a quantity which can depend on $n$, and let $I$ be an interval such that
$$|I| \ge \frac{K^2 \log^4 n}{n \delta^{10}}.$$
We also make the mild assumption $K = o(n^{1/2}\delta^2)$. Then the number of eigenvalues $N_I$ of $W_n := \frac{1}{\sqrt{n}} M_n$ in $I$ obeys the concentration estimate
$$\left| N_I - n \int_I \rho_{sc}(x)\,dx \right| \ll \delta n |I|$$
with overwhelming probability.
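To get a feel for the counting statement, one can compare the number of eigenvalues in an interval with n times the semicircle mass of that interval. This is only a numerical sketch under assumptions of our own: the helper names, the choice of the symmetric Bernoulli ensemble, the matrix size and the interval are illustrative and do not come from the paper.

```python
import numpy as np

def semicircle_cdf(x):
    """Cumulative distribution function of the semicircular density on [-2, 2]."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x * x) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

def expected_count(n, a, b):
    """n times the semicircle mass of the interval [a, b]."""
    return n * (semicircle_cdf(b) - semicircle_cdf(a))

def bernoulli_wigner(n, rng):
    """Symmetric matrix with independent +-1 entries on and above the diagonal, divided by sqrt(n)."""
    upper = np.triu(rng.choice([-1.0, 1.0], size=(n, n)))
    m = upper + np.triu(upper, 1).T
    return m / np.sqrt(n)

rng = np.random.default_rng(0)
n, a, b = 400, -1.0, 1.0
eigs = np.linalg.eigvalsh(bernoulli_wigner(n, rng))
count = int(np.sum((eigs >= a) & (eigs <= b)))
print(count, expected_count(n, a, b))
```

Moving the interval towards the endpoints of [-2, 2] and shrinking it shows the regime that Theorem 1.8 is designed to handle, where the main term itself becomes small.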
Remark 1.9. The powers of $K$, $\delta$ and $\log n$ here are probably not best possible, but this will not be relevant for our purposes. In our applications $K$ will be a power of $\log n$, and $\delta$ will be a negative power of $\log n$. (This allows the error term $O(\delta n|I|)$ in the above estimate for $N_I$ to exceed the main term $n \int_I \rho_{sc}(x)\,dx$ when one is very near the edge, but this will not impact our arguments.)

We prove this theorem in Sect. 3, using the same (standard) Stieltjes transform method that was used to prove Theorem 1.6 in [27] (see also [9]), with a somewhat more careful analysis. We next use it to obtain the following extension of Proposition 1.7:

Proposition 1.10 (Delocalization of eigenvectors up to the edge). Let $M_n$ be a random matrix obeying Condition C0. Then with overwhelming probability, every unit eigenvector $u_i(M_n)$ of $M_n$ has coefficients at most $n^{-1/2} \log^{O(1)} n$, thus
$$\sup_{1 \le i,j \le n} |u_i(M_n)^* e_j| \ll n^{-1/2} \log^{O(1)} n,$$
where $e_1, \ldots, e_n$ is the standard basis.

The deduction of Proposition 1.10 from Theorem 1.8 differs significantly from the analogous deduction of Proposition 1.7 from Theorem 1.6 in [27]. The main difference is that in the current case $\rho_{sc}$ is no longer bounded away from zero, which causes the average eigenvalue spacing between $\lambda_i(W_n)$ and $\lambda_{i+1}(W_n)$ to be considerably larger than $1/n$. For instance, the gap between the second largest eigenvalue $\lambda_{n-1}(W_n)$ and the largest eigenvalue $\lambda_n(W_n)$ is typically of size $n^{-2/3}$. The key new ingredient that helps us to deal with this problem is the following observation: the Cauchy interlacing law (5), when applied to the eigenvalues at the edge, is strongly biased. In particular, the gap between $\lambda_i(W_{n-1})$ and $\lambda_i(W_n)$ is significantly smaller than the gap between $\lambda_i(W_{n-1})$ and $\lambda_{i+1}(W_n)$. We can show that (with high probability), the first gap is of order $n^{-1+o(1)}$ while the second can be as large as $n^{-2/3}$ (and similarly for the gap between $\lambda_{i+1}(W_n)$ and $\lambda_i(W_{n-1})$ when $n/2 \le i \le n$). This new ingredient will be sufficient to recover Proposition 1.10; see Sect. 4, where the above proposition is proved.

Using Theorem 1.8 and Proposition 1.10, one can continue the arguments from [27] to establish the following extensions of Theorem 1.4 and Theorem 1.5:

Theorem 1.11 (Four Moment Theorem up to the edge). There is a small positive constant $c_0$ such that for every $k \ge 1$ the following holds. Let $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ and $M'_n = (\zeta'_{ij})_{1 \le i,j \le n}$ be two random matrices satisfying C0. Assume furthermore that for any $1 \le i < j \le n$, $\zeta_{ij}$ and $\zeta'_{ij}$ match to order 4 and for any $1 \le i \le n$, $\zeta_{ii}$ and $\zeta'_{ii}$ match to order 2. Set $A_n := \sqrt{n} M_n$ and $A'_n := \sqrt{n} M'_n$, and let $G : \mathbb{R}^k \to \mathbb{R}$ be a smooth function obeying the derivative bounds (2) for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$. Then for any $1 \le i_1 < i_2 < \cdots < i_k \le n$, and for $n$ sufficiently large depending on $k$ (and the constants $C, C'$ in Definition 1.1) we have (3).
If $\zeta_{ij}$ and $\zeta'_{ij}$ only match to order 3 rather than 4, then there is a positive constant $C$ independent of $c_0$ such that the conclusion (3) still holds provided that one strengthens (2) to
$$|\nabla^j G(x)| \le n^{-C j c_0} \tag{6}$$
for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.
Theorem 1.12 (Lower tail estimates up to the edge). Let $M_n$ be a random matrix obeying Condition C0. Set $A_n := \sqrt{n} M_n$. Then for every $c_0 > 0$, and for $n$ sufficiently large depending on $c_0$ and the constants $C, C'$ in Definition 1.1, and for each $1 \le i \le n$, one has $\lambda_{i+1}(A_n) - \lambda_i(A_n) \ge n^{-c_0}$ with high probability, uniformly in $i$.

The novelty here is that we have no assumption on the indices $i_j$ and $i$. We present the proof of these theorems in Sects. 5, 6, following the arguments in [27] closely.

1.3. Applications.

As Theorems 1.11, 1.12 extend Theorems 1.4, 1.5, all the applications of the latter theorems in [27] (concerning the bulk of the spectrum) can also be viewed as applications of these theorems. But because these results extend all the way to the edge, we can now obtain some results on the edge of the spectrum as well. For instance, we can prove

Theorem 1.13. Let $k$ be a fixed integer and $M_n$ be a matrix obeying Condition C0, and suppose that the real and imaginary parts of the atom variables have the same covariance matrix as the GUE ensemble (i.e. both components have variance 1/2, and have covariance 0). Assume furthermore that all third moments of the atom variables vanish. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Then the joint distribution of the $k$-dimensional random vector
$$\left( (\lambda_n(W_n) - 2) n^{2/3}, \ldots, (\lambda_{n-k+1}(W_n) - 2) n^{2/3} \right) \tag{7}$$
has a weak limit as $n \to \infty$, which coincides with that in the GUE case (in particular, the limit for $k = 1$ is the GUE Tracy-Widom distribution [28], and for higher $k$ is controlled by the Airy kernel [14]). The result also holds for the smallest eigenvalues $\lambda_1, \ldots, \lambda_k$, with the offset $-2$ replaced by $+2$. If the atom variables have the same covariance matrix as the GOE ensemble (i.e. they are real with variance 1 off the diagonal, and 2 on the diagonal), instead of the GUE ensemble, then the same conclusion applies but with the GUE distribution replaced of course by the GOE distribution (see [29] for the $k = 1$ case).
This result was previously established by Soshnikov [25] (see also [23,24]) in the case when $M_n$ is a Wigner Hermitian matrix (i.e. the off-diagonal entries are iid, and the matrix matches GUE to second order at least) with symmetric distribution (which implies, but is stronger than, matching to third order). For some additional partial results in the non-symmetric case see [20,21]. The exponential decay condition in Soshnikov's result has been lowered to a finite number of moments; see [22,18]. It is reasonable to conjecture that the exponential decay conditions in this current paper can similarly be lowered, but we will not pursue this issue here. It also seems plausible that the third moment matching conditions could be dropped, though this is barely beyond the reach of the current method$^2$.

$^2$ Note added in proof. The third moment condition has recently been dropped in [16], by combining the four moment theorem here with a new proof of universality for the distribution of the largest eigenvalue for Gauss divisible matrices.

Proof. We just prove the claim for the largest $k$ eigenvalues and for GUE, as the claim for the smallest $k$ and/or GOE is similar. Set $A_n := \sqrt{n} M_n$. It suffices to show that for every smooth function $G : \mathbb{R}^k \to \mathbb{R}$, the expectation
$$\mathbf{E}\, G\!\left( (\lambda_n(A_n) - 2n)/n^{1/3}, \ldots, (\lambda_{n-k+1}(A_n) - 2n)/n^{1/3} \right) \tag{8}$$
only changes by $o(1)$ when the matrix $M_n$ is replaced with GUE. But this follows from the final conclusion of Theorem 1.11, thanks to the extra factor $n^{-1/3}$.
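The "extra factor" comes from the chain rule: composing G with the rescaling x -> (x - 2n)/n^{1/3} multiplies each derivative by n^{-1/3}. The following sketch checks this numerically for k = 1; the test function tanh, the sample point and the helper names are illustrative assumptions and are not part of the argument.

```python
import math

def rescaled(g, n):
    """Return the function x -> g((x - 2n) / n^(1/3)), the composition appearing in (8) for k = 1."""
    scale = n ** (1.0 / 3.0)
    return lambda x: g((x - 2.0 * n) / scale)

def central_difference(f, x, h):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

n = 1000
g_tilde = rescaled(math.tanh, n)
u = 0.3                                   # position in rescaled coordinates
x0 = 2.0 * n + u * n ** (1.0 / 3.0)       # the same position on the scale of A_n
numeric = central_difference(g_tilde, x0, 1e-3 * n ** (1.0 / 3.0))
predicted = n ** (-1.0 / 3.0) * (1.0 - math.tanh(u) ** 2)
print(numeric, predicted)
```

Each further derivative gains another factor n^{-1/3}, which is why a fixed smooth G, once rescaled, satisfies bounds of the form (6) when the exponent there is small enough.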
Remark 1.14. Notice that there is some room to spare in this argument, as the $n^{-1/3}$ gain in (8) is far more than is needed for (6). Because of this, one can obtain similar universality results for suitably normalised eigenvalues $\lambda_i(A_n)$ with $i \le n^{1-\varepsilon}$ or $i \ge n - n^{1-\varepsilon}$ for any $\varepsilon > 0$ (where the normalisation factor for $\lambda_i(A_n)$ is now $n^{2/3} \min(i, n-i)^{1/3}$, and the offset $-2$ is replaced by $-t$, where $\int_{-2}^{t} \rho_{sc}(x)\,dx = \frac{i}{n}$). We omit the details.

Remark 1.15. In analogy with [13], one should be able to drop the third moment condition in Theorem 1.13 if one can control the distribution of the largest (or smallest) eigenvalues from random matrices obtained from a suitable Ornstein-Uhlenbeck process, as in [12].

1.4. Notation.

We consider $n$ as an asymptotic parameter tending to infinity. We use $X \ll Y$, $Y \gg X$, $Y = \Omega(X)$, or $X = O(Y)$ to denote the bound $X \le CY$ for all sufficiently large $n$ and for some constant $C$. Notations such as $X \ll_k Y$, $X = O_k(Y)$ mean that the hidden constant $C$ depends on another constant $k$. $X = o(Y)$ or $Y = \omega(X)$ means that $X/Y \to 0$ as $n \to \infty$; the rate of decay here will be allowed to depend on other parameters. We write $X = \Theta(Y)$ for $Y \ll X \ll Y$. We view vectors $x \in \mathbb{C}^n$ as column vectors. The Euclidean norm of a vector $x \in \mathbb{C}^n$ is defined as $\|x\| := (x^* x)^{1/2}$. Eigenvalues are always ordered in increasing order, thus for instance $\lambda_n(A_n)$ is the largest eigenvalue of a Hermitian matrix $A_n$, and $\lambda_1(A_n)$ is the smallest.

2. General Tools

In this section we record some general tools (proven in [27]) which we will use repeatedly in the sequel. We begin with a very useful concentration of measure result that describes the projection of a random vector to a subspace.

Lemma 2.1 (Projection Lemma). Let $X = (\xi_1, \ldots, \xi_n) \in \mathbb{C}^n$ be a random vector whose entries are independent with mean zero, variance 1, and are bounded in magnitude by $K$ almost surely for some $K$, where $K \ge 10(\mathbf{E}|\xi|^4 + 1)$. Let $H$ be a subspace of dimension $d$ and $\pi_H$ the orthogonal projection onto $H$. Then
$$\mathbf{P}\left( \left| \|\pi_H(X)\| - \sqrt{d} \right| \ge t \right) \le 10 \exp\left( -\frac{t^2}{10 K^2} \right).$$
In particular, one has
$$\|\pi_H(X)\| = \sqrt{d} + O(K \log n)$$
with overwhelming probability. The same conclusion holds (with 10 replaced by another explicit constant) if one of the entries $\xi_j$ of $X$ is assumed to have variance $c$ instead of 1, for some absolute constant $c > 0$.

Proof. See [27, Lem. 40]. (The main tool in the proof is Talagrand's concentration inequality.) It is clear from the triangle inequality that the modification of variance in a single entry does not significantly affect the conclusion except for constants.
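The content of the Projection Lemma is easy to observe numerically: the length of the projection of a random sign vector onto a d-dimensional subspace stays close to the square root of d. The sketch below is an illustration under our own assumptions (a real subspace obtained from a QR factorisation, sign entries, and the helper name `projection_norm`); it is not the argument of [27].

```python
import numpy as np

def projection_norm(n, d, seed=0):
    """Norm of the orthogonal projection of a random sign vector onto a random d-dimensional subspace of R^n."""
    rng = np.random.default_rng(seed)
    # Columns of q form an orthonormal basis of the subspace H.
    q, _ = np.linalg.qr(rng.standard_normal((n, d)))
    # Independent entries with mean zero and variance 1, bounded by 1 in magnitude.
    x = rng.choice([-1.0, 1.0], size=n)
    # ||pi_H(x)|| equals the norm of the coefficient vector q^T x.
    return float(np.linalg.norm(q.T @ x))

print(projection_norm(500, 100, seed=1), np.sqrt(100))
```

Repeating the experiment over several seeds shows fluctuations of order 1 around the square root of d, in line with the Gaussian-type tail in the lemma.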
Next, we record a crude but useful upper bound on the number of eigenvalues in a short interval.

Lemma 2.2 (Upper bound on ESD). Consider a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1 (with variance $c$ on the diagonal for some absolute constant $c > 0$), and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Then for any interval $I \subset \mathbb{R}$ with $|I| \ge \frac{K^2 \log^2 n}{n}$,
$$N_I \ll n|I|$$
with overwhelming probability, where $N_I$ is the number of eigenvalues of $W_n$ in $I$.

Proof. See [27, Prop. 62]. (The main tools in the proof are the Stieltjes transform method, Lemma 3.3 below, and Lemma 2.1.) Again, the generalisation to variances other than 1 on the diagonal does not cause significant changes to the argument.
Finally, we recall a Berry-Esséen type theorem:

Theorem 2.3 (Tail bounds for complex random walks). Let $1 \le N \le n$ be integers, and let $A = (a_{i,j})_{1 \le i \le N; 1 \le j \le n}$ be an $N \times n$ complex matrix whose $N$ rows are orthonormal in $\mathbb{C}^n$, and obeying the incompressibility condition
$$\sup_{1 \le i \le N; 1 \le j \le n} |a_{i,j}| \le \sigma \tag{9}$$
for some $\sigma > 0$. Let $\zeta_1, \ldots, \zeta_n$ be independent complex random variables with mean zero, variance $\mathbf{E}|\zeta_j|^2$ equal to 1, and obeying $\mathbf{E}|\zeta_i|^3 \le C$ for some $C \ge 1$. For each $1 \le i \le N$, let $S_i$ be the complex random variable
$$S_i := \sum_{j=1}^{n} a_{i,j} \zeta_j$$
and let $\vec{S}$ be the $\mathbb{C}^N$-valued random variable with coefficients $S_1, \ldots, S_N$:
• (Upper tail bound on $S_i$) For $t \ge 1$, we have $\mathbf{P}(|S_i| \ge t) \ll \exp(-ct^2) + C\sigma$ for some absolute constant $c > 0$.
• (Lower tail bound on $\vec{S}$) For any $t \le \sqrt{N}$, one has $\mathbf{P}(|\vec{S}| \le t) \ll O(t/\sqrt{N})^{\lfloor N/4 \rfloor} + C N^4 t^{-3} \sigma$.
The same claim holds if one of the $\zeta_i$ is assumed to have variance $c$ instead of 1 for some absolute constant $c > 0$.

Proof. See [27, Th. 41]. Again, the modification of the variance on a single entry can be easily seen to have no substantial effect on the conclusion.
3. Asymptotics for the ESD

In this section we prove Theorem 1.8, using the Stieltjes transform method (see [2] for a general discussion of this method). We may assume throughout that $n$ is large, since the claim is vacuous for $n$ small.

It is known by the moment method (see e.g. [2] or [4]) that with overwhelming probability, all eigenvalues of $W_n$ have magnitude at most $2 + o(1)$. Because of this, we may restrict attention to the case when $I$ lies in the interval $[-3, 3]$ (say).

We recall the Stieltjes transform $s_n(z)$ of a Hermitian matrix $W_n$, defined for complex $z$ by the formula
$$s_n(z) := \frac{1}{n} \sum_{i=1}^{n} \frac{1}{\lambda_i(W_n) - z}. \tag{10}$$
We also introduce the semicircular counterpart
$$s(z) := \int_{-2}^{2} \frac{1}{x - z} \rho_{sc}(x)\,dx,$$
which by a standard contour integral computation can be given explicitly as
$$s(z) = \frac{1}{2}\left( -z + \sqrt{z^2 - 4} \right), \tag{11}$$
where we use the branch of the square root of $z^2 - 4$ with cut at $[-2, 2]$ which is asymptotic to $z$ at infinity.

It is well known that one can control the empirical spectral distribution $N_I$ via the Stieltjes transform. We will use the following formalization of this principle:

Lemma 3.1 (Control of Stieltjes transform implies control on ESD). There is a positive constant $C$ such that the following holds for any Hermitian matrix $W_n$. Let $1/10 \ge \eta \ge 1/n$ and $L, \varepsilon, \delta > 0$. Suppose that one has the bound
$$|s_n(z) - s(z)| \le \delta \tag{12}$$
with (uniformly) overwhelming probability for all $z$ with $|\mathrm{Re}(z)| \le L$ and $\mathrm{Im}(z) \ge \eta$. Then for any interval $I$ in $[-L+\varepsilon, L-\varepsilon]$ with $|I| \ge \max(2\eta, \frac{\eta}{\delta} \log \frac{1}{\delta})$, one has
$$\left| N_I - n \int_I \rho_{sc}(x)\,dx \right| \ll_\varepsilon \delta n |I|$$
with overwhelming probability, where $N_I$ is the number of eigenvalues of $W_n$ in $I$.

Proof. See [27, Lem. 60].
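The explicit formula (11) and its branch convention can be sanity-checked against a direct quadrature of the defining integral. In the sketch below (function names are our own), the branch is realised as the product of the principal square roots of z - 2 and z + 2, which is one standard way to obtain the square root with cut [-2, 2] that behaves like z at infinity; the quadrature uses the substitution x = 2 cos(theta).

```python
import numpy as np

def s_semicircle(z):
    """Formula (11), with sqrt(z^2 - 4) taken as sqrt(z - 2) * sqrt(z + 2) (principal branches)."""
    z = complex(z)
    return 0.5 * (-z + np.sqrt(z - 2.0) * np.sqrt(z + 2.0))

def s_semicircle_quadrature(z, m=20000):
    """Midpoint rule for the integral of rho_sc(x) / (x - z) over [-2, 2], after x = 2 cos(theta)."""
    theta = (np.arange(m) + 0.5) * np.pi / m
    integrand = (2.0 / np.pi) * np.sin(theta) ** 2 / (2.0 * np.cos(theta) - z)
    return integrand.sum() * np.pi / m

z = 0.3 + 0.5j
print(s_semicircle(z), s_semicircle_quadrature(z))
# (11) says that s solves s^2 + z s + 1 = 0, i.e. s + 1/(s + z) = 0:
print(abs(s_semicircle(z) + 1.0 / (s_semicircle(z) + z)))
```

The quadratic relation printed last is the self-consistent equation that the proof of Theorem 3.2 tries to reproduce, approximately, for the empirical transform.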
As a consequence of this lemma (with $L = 4$ and $\varepsilon = 1$, say), we see that Theorem 1.8 follows from

Theorem 3.2 (Concentration for the Stieltjes transform up to edge). Consider a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1, with variance $c$ on the diagonal for some absolute constant $c > 0$, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Let $0 < \delta < 1/2$ (which can depend on $n$), and suppose that
$K = o(n^{1/2}\delta^2)$. Then (12) holds with (uniformly) overwhelming probability for all $z$ with $|\mathrm{Re}(z)| \le 4$ and
$$\mathrm{Im}(z) \ge \frac{K^2 \log^{3.5} n}{\delta^8 n}.$$

The remainder of this section is devoted to proving Theorem 3.2. Fix $z$ as in Theorem 3.2, thus $|\mathrm{Re}(z)| \le 4$ and $\mathrm{Im}(z) = \eta$, where
$$\eta n \ge \frac{K^2 \log^{3.5} n}{\delta^8}. \tag{13}$$
Our objective is to show (12) with (uniformly) overwhelming probability.

As in previous works (in particular [9,27]), the key is to exploit the fact that when $\mathrm{Im}\, z > 0$, $s(z)$ is the unique solution to the equation
$$s(z) + \frac{1}{s(z) + z} = 0 \tag{14}$$
with $\mathrm{Im}\, s(z) > 0$; this is immediate from (11). We now seek a similar relation for $s_n$. Note that $\mathrm{Im}\, s_n(z) > 0$ by (10). We use the following standard matrix identity (cf. [27, Lem. 39], or [2, Chap. 11]):

Lemma 3.3. We have
$$s_n(z) = \frac{1}{n} \sum_{k=1}^{n} \frac{1}{\frac{1}{\sqrt{n}} \zeta_{kk} - z - Y_k}, \tag{15}$$
where $Y_k := a_k^* (W_{n,k} - zI)^{-1} a_k$, $W_{n,k}$ is the matrix $W_n$ with the $k$th row and column removed, and $a_k$ is the $k$th row of $W_n$ with the $k$th element removed.

Proof. By Schur's complement, $\frac{1}{\frac{1}{\sqrt{n}}\zeta_{kk} - z - a_k^* (W_{n,k} - zI)^{-1} a_k}$ is the $k$th diagonal entry of $(W_n - zI)^{-1}$. Taking traces, one obtains the claim.
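Since (15) is an exact identity, it can be verified to machine precision on a small matrix. The sketch below is an illustration with assumed names; it takes a_k to be the k-th column of W_n with its k-th entry removed, which is the column-vector form needed for the quadratic form, and uses a small GUE-like matrix as test input.

```python
import numpy as np

def stieltjes(w, z):
    """s_n(z) = (1/n) * sum_i 1 / (lambda_i(w) - z) for a Hermitian matrix w, as in (10)."""
    return np.mean(1.0 / (np.linalg.eigvalsh(w) - z))

def stieltjes_via_minors(w, z):
    """Right-hand side of (15): the average over k of 1 / (w_kk - z - Y_k)."""
    n = w.shape[0]
    total = 0.0
    for k in range(n):
        keep = np.arange(n) != k
        minor = w[np.ix_(keep, keep)]      # W_{n,k}: k-th row and column removed
        a_k = w[keep, k]                   # k-th column of W_n without its k-th entry
        y_k = a_k.conj() @ np.linalg.solve(minor - z * np.eye(n - 1), a_k)
        total += 1.0 / (w[k, k] - z - y_k)
    return total / n

rng = np.random.default_rng(3)
n = 8
g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
w = (g + g.conj().T) / (2.0 * np.sqrt(n))  # Hermitian test matrix; w_kk plays the role of zeta_kk / sqrt(n)
z = 0.3 + 0.1j
print(stieltjes(w, z), stieltjes_via_minors(w, z))
```

The proof of Theorem 3.2 then replaces each Y_k by s_n(z) plus a small error, turning this exact identity into an approximate version of (14).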
Proposition 3.4 (Concentration of $Y_k$). For each $1 \le k \le n$, one has $Y_k = s_n(z) + o(\delta^2)$ with overwhelming probability.

Proof. Fix $k$, and write $z = x + \sqrt{-1}\eta$. The entries of $a_k$ are independent of each other and of $W_{n,k}$, and have mean zero and variance $\frac{1}{n}$. By linearity of expectation we thus have, on conditioning on $W_{n,k}$,
$$\mathbf{E}(Y_k | W_{n,k}) = \frac{1}{n} \mathrm{trace}(W_{n,k} - zI)^{-1} = \left(1 - \frac{1}{n}\right) s_{n,k}(z),$$
where
$$s_{n,k}(z) := \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{1}{\lambda_i(W_{n,k}) - z}$$
is the Stieltjes transform of $W_{n,k}$. From the Cauchy interlacing law (5) and (13), we have
$$s_n(z) - \left(1 - \frac{1}{n}\right) s_{n,k}(z) = O\left( \frac{1}{n} \int_{\mathbb{R}} \frac{1}{|x - z|^2}\,dx \right) = O\left( \frac{1}{n\eta} \right) = o(\delta^2).$$
It follows that
$$\mathbf{E}(Y_k | W_{n,k}) = s_n(z) + o(\delta^2),$$
and so it will remain to show the concentration estimate
$$Y_k - \mathbf{E}(Y_k | W_{n,k}) = o(\delta^2)$$
with overwhelming probability. Rewriting $Y_k$, it suffices to show that
$$\sum_{j=1}^{n-1} \frac{R_j}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)} = o(\delta^2) \tag{16}$$
with overwhelming probability, where $R_j := |u_j(W_{n,k})^* a_k|^2 - 1/n$.

Let $1 \le i_- < i_+ \le n$, then
$$\sum_{i_- \le j \le i_+} R_j = \|P_H a_k\|^2 - \frac{\dim(H)}{n},$$
where $H$ is the space spanned by the $u_j(W_{n,k})$ for $i_- \le j \le i_+$. From Lemma 2.1 and the union bound, we conclude that with overwhelming probability
$$\left| \sum_{i_- \le j \le i_+} R_j \right| \ll \frac{\sqrt{i_+ - i_-}\, K \log n + K^2 \log^2 n}{n}. \tag{17}$$
By the triangle inequality, this implies that
$$\sum_{i_- \le j \le i_+} |u_j(W_{n,k})^* a_k|^2 = \|P_H a_k\|^2 \ll \frac{i_+ - i_-}{n} + \frac{\sqrt{i_+ - i_-}\, K \log n + K^2 \log^2 n}{n},$$
and hence by a further application of the triangle inequality
$$\sum_{i_- \le j \le i_+} |R_j| \ll \frac{(i_+ - i_-) + K^2 \log^2 n}{n} \tag{18}$$
with overwhelming probability.

The plan is to use (17) and (18) to establish (16). Accordingly, we split the LHS of (16) into several subsums according to the distance $|\lambda_j - x|$. Lemma 2.2 provides a sharp estimate on the number of terms of each subsum, which will allow us to obtain a good upper bound on the absolute value.
We turn to the details. From (13) we can choose two auxiliary parameters $0 < \delta', \alpha < 1$ such that
$$\delta' = o(\delta^2); \quad \alpha \log n = o(\delta^2); \quad \alpha \delta' \eta n \ge K^2 \log^2 n; \quad \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} = o(\delta^2). \tag{19}$$
Indeed, one could set $\delta' := \delta^2 \log^{-0.01} n$ and $\alpha := \delta^2 \log^{-1.01} n$ and use (13).

Fix such parameters, and consider the contribution to (16) of the indices $j$ for which $|\lambda_j(W_n) - x| \le \delta' \eta$. By Lemma 2.2 and (19), the interval of $j$ for which this occurs has cardinality $O(\delta' \eta n)$ (with overwhelming probability). On this interval, the quantity $\frac{1}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)}$ has magnitude $O(\frac{1}{\eta})$. Applying (18) (and (19)), we see that the contribution of this case is thus
$$\ll \frac{1}{\eta} \cdot \frac{\delta' \eta n}{n} = o(\delta^2),$$
which is acceptable.

Next, we consider the contribution to (16) of those indices $j$ for which
$$(1+\alpha)^l \delta' \eta < |\lambda_j(W_n) - x| \le (1+\alpha)^{l+1} \delta' \eta$$
for some integer $0 \le l \ll \log n/\alpha$, and then sum over $l$. By Lemma 2.2 and (19), the set of $j$ for which this occurs is contained (with overwhelming probability) in at most two intervals of cardinality $O((1+\alpha)^l \alpha \delta' \eta n)$. On each of these intervals, the quantity $\frac{1}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)}$ has magnitude $O\left(\frac{1}{(1+\alpha)^l \delta' \eta}\right)$ and fluctuates by $O\left(\frac{\alpha}{(1+\alpha)^l \delta' \eta}\right)$. Applying (17), (18) (and noting that $(1+\alpha)^l \alpha \delta' \eta n$ exceeds $K^2 \log^2 n$, by (19)) we see that the contribution of a single $l$ to (16) is at most
$$\ll \frac{1}{(1+\alpha)^l \delta' \eta} \cdot \frac{\sqrt{\alpha (1+\alpha)^l \delta' \eta n}\, K \log n}{n} + \frac{\alpha}{(1+\alpha)^l \delta' \eta} \cdot \frac{\alpha (1+\alpha)^l \delta' \eta n}{n},$$
which simplifies to
$$\ll \alpha (1+\alpha)^{-l/2} \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} + \alpha^2.$$
Summing over $l$ we obtain a bound of
$$\ll \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} + \alpha \log n,$$
which is acceptable by (19).
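The summation over l in the last step rests on two elementary facts: the geometric series of alpha (1 + alpha)^{-l/2} is bounded by an absolute constant, and O(log n / alpha) copies of alpha^2 add up to O(alpha log n). A small numerical sketch of the first fact follows; the function names and the sample values of alpha are illustrative assumptions.

```python
def layer_sum(alpha, num_layers):
    """Sum over 0 <= l < num_layers of alpha * (1 + alpha)^(-l/2)."""
    return sum(alpha * (1.0 + alpha) ** (-l / 2.0) for l in range(num_layers))

def layer_sum_limit(alpha):
    """Closed form of the infinite geometric series: alpha / (1 - (1 + alpha)^(-1/2))."""
    return alpha / (1.0 - (1.0 + alpha) ** -0.5)

for alpha in (0.01, 0.1, 0.5):
    print(alpha, layer_sum(alpha, 5000), layer_sum_limit(alpha))
```

For small alpha the limit is close to 2, so the first term of the per-layer bound sums to a constant multiple of K log n divided by the square root of alpha delta' eta n, as stated above.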
We now conclude the proof of Theorem 1.8. By hypothesis,
$$\left| \frac{1}{\sqrt{n}} \zeta_{kk} \right| \le K/\sqrt{n} = o(\delta^2)$$
almost surely. Inserting these bounds into (15), we see that with overwhelming probability
$$s_n(z) + \frac{1}{n} \sum_{k=1}^{n} \frac{1}{s_n(z) + z + o(\delta^2)} = 0.$$
By the triangle inequality (and square rooting the $o()$ decay), we can assume that either the error term $o(\delta^2)$ is $o(\delta^2 |s_n(z) + z|)$, or that $|s_n(z) + z|$ is $o(1)$. Suppose the former holds. Then by Taylor expansion
$$\frac{1}{s_n(z) + z + o(\delta^2)} = \frac{1}{s_n(z) + z} + o(\delta^2),$$
and thus
$$s_n(z) + \frac{1}{s_n(z) + z} = o(\delta^2).$$
If we assume $|z| \le 10$ (say), we conclude that $|s_n(z)| \le 100$. Multiplying out by $s_n(z) + z$ and rearranging, we obtain
$$\left( s_n(z) + \frac{z}{2} \right)^2 = \frac{z^2 - 4}{4} + o(\delta^2).$$
Thus
$$s_n(z) + \frac{z}{2} = \pm \sqrt{\frac{z^2 - 4}{4}} + o(\delta)$$
(treating the case when $z^2 - 4 = o(\delta^2)$ separately).

To summarise, we have shown (with overwhelming probability) in the region
$$|z| \le 10; \quad |\mathrm{Re}(z)| \le 4; \quad \mathrm{Im}(z) \ge \frac{K^2 \log^{3.5} n}{\delta^8 n}$$
that one either has $s_n(z) = s(z) + o(\delta)$, $s_n(z) = -z - s(z) + o(1) = s(z) - \sqrt{z^2 - 4} + o(1)$, or $|s_n(z) + z| = o(1)$. It is not hard to see that the first two cases are disconnected from the third (for $n$ large enough) in this region, because $s(z) = \frac{-1}{s(z) + z}$ is bounded away from zero, as is $s(z) + z = \frac{-1}{s(z)}$. Furthermore, the first and second possibilities are also disconnected from each other except when $z^2 - 4 = o(\delta^2)$. Also, the second and third possibilities can only hold for $\mathrm{Im}(z) = o(1)$ since $s_n(z)$ and $z$ both have positive imaginary part. A continuity argument thus shows that the first possibility must hold throughout the region except when $z^2 - 4 = o(\delta^2)$, in which case either the first or second possibility can hold; but in that region, the first and second possibility are equivalent, and (12) follows. The proof of Theorem 1.8 is now complete.
4. Delocalization of Eigenvectors

Without loss of generality, we can assume that the entries are continuously distributed. Having established Theorem 1.8, we now use this theorem to establish Proposition 1.10.

Let $M_n$ obey Condition C0. Then by Markov's inequality, one has $|\zeta_{ij}| \ll \log^{O(1)} n$ with overwhelming probability (here and in the sequel we allow implied constants in the $O()$ notation to depend on the constants $C, C'$ in (1)). By conditioning the $\zeta_{ij}$ to this event$^3$, we may thus assume that
$$|\zeta_{ij}| \le K \tag{20}$$
almost surely for some $K = O(\log^{O(1)} n)$.

Fix $1 \le i \le n$; by symmetry we may take $i \ge n/2$. By the union bound and another application of symmetry, it suffices to show that
$$|u_i(M_n)^* e_1| \ll n^{-1/2} \log^{O(1)} n$$
with overwhelming probability.

To compute $u_i(M_n)^* e_1$ we use the following identity from [7] (see also [27, Lem. 38]):

Lemma 4.1. Let
$$A_n = \begin{pmatrix} a & X^* \\ X & A_{n-1} \end{pmatrix}$$
be an $n \times n$ Hermitian matrix for some $a \in \mathbb{R}$ and $X \in \mathbb{C}^{n-1}$, and let $\begin{pmatrix} x \\ v \end{pmatrix}$ be a unit eigenvector of $A$ with eigenvalue $\lambda_i(A)$, where $x \in \mathbb{C}$ and $v \in \mathbb{C}^{n-1}$. Suppose that none of the eigenvalues of $A_{n-1}$ are equal to $\lambda_i(A)$. Then
$$|x|^2 = \frac{1}{1 + \sum_{j=1}^{n-1} (\lambda_j(A_{n-1}) - \lambda_i(A_n))^{-2} |u_j(A_{n-1})^* X|^2},$$
where $u_j(A_{n-1})$ is a unit eigenvector corresponding to the eigenvalue $\lambda_j(A_{n-1})$.

Proof. By subtracting $\lambda_i(A) I$ from $A$ we may assume $\lambda_i(A) = 0$. The eigenvector equation then gives
$$xX + A_{n-1} v = 0,$$
thus
$$v = -x A_{n-1}^{-1} X.$$
Since $\|v\|^2 + |x|^2 = 1$, we conclude
$$|x|^2 \left( 1 + \|A_{n-1}^{-1} X\|^2 \right) = 1.$$
Since $\|A_{n-1}^{-1} X\|^2 = \sum_{j=1}^{n-1} (\lambda_j(A_{n-1}))^{-2} |u_j(A_{n-1})^* X|^2$, the claim follows.
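Lemma 4.1 is an exact linear-algebra identity and can be checked directly against a numerical eigendecomposition. A minimal sketch follows; the function name and the random Hermitian test matrix are illustrative assumptions, with the first row and column playing the role of (a, X).

```python
import numpy as np

def first_coordinate_sq(a_n, i):
    """Right-hand side of Lemma 4.1 for the i-th eigenvalue (0-based, increasing order) of the Hermitian matrix a_n."""
    lam_i = np.linalg.eigvalsh(a_n)[i]
    x_col = a_n[1:, 0]                                     # the vector X
    minor_vals, minor_vecs = np.linalg.eigh(a_n[1:, 1:])   # eigenvalues and eigenvectors of A_{n-1}
    overlaps = np.abs(minor_vecs.conj().T @ x_col) ** 2    # |u_j(A_{n-1})^* X|^2
    return 1.0 / (1.0 + np.sum(overlaps / (minor_vals - lam_i) ** 2))

rng = np.random.default_rng(4)
n = 6
g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
a_n = (g + g.conj().T) / 2.0
_, vecs = np.linalg.eigh(a_n)
for i in range(n):
    print(i, abs(vecs[0, i]) ** 2, first_coordinate_sq(a_n, i))
```

The formula shows why delocalization reduces to a lower bound on the weighted sum in the denominator: the coordinate is small exactly when that sum is large.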
$^3$ Strictly speaking, this distorts the mean and variance of $\zeta_{ij}$ by an exponentially small amount, but one can easily check that this does not significantly impact any of the arguments in this section.
Let $M_{n-1}$ be the bottom right $(n-1) \times (n-1)$ minor of $M_n$. As we are assuming that the coefficients of $M_n$ are continuously distributed, we see almost surely that none of the eigenvalues of $M_{n-1}$ are equal to $\lambda_i(M_n)$. We may thus apply Lemma 4.1 and conclude that
$$|u_i(M_n)^* e_1|^2 = \frac{1}{1 + \sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^2}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2}},$$
where $X$ is the bottom left $(n-1) \times 1$ vector of $M_n$ (and thus has entries $\zeta_{j1}$ for $1 < j \le n$). It thus suffices to show that
$$\sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^2}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2} \gg n \log^{-O(1)} n$$
with overwhelming probability.

It will be convenient to eliminate the exponent 2 in the denominator, as follows. From Lemma 2.1, one has $|u_j(M_{n-1})^* X| \ll \log^{O(1)} n$ with overwhelming probability for each $j$ (and hence for all $j$, by the union bound). It thus suffices to show that
$$\sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^4}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2} \gg n \log^{-O(1)} n$$
with overwhelming probability. By the Cauchy-Schwarz inequality, it thus suffices to show that
$$\sum_{j: i - T_- \le j \le i + T_+} \frac{|u_j(M_{n-1})^* X|^2}{|\lambda_j(M_{n-1}) - \lambda_i(M_n)|} \gg n^{1/2} \log^{-O(1)} n$$
with overwhelming probability for some $1 \le T_-, T_+ \ll \log^{O(1)} n$.

It is convenient to work with the normalized matrix $W_n := \frac{1}{\sqrt{n}} M_n$, thus we need to show
$$\sum_{j: i - T_- \le j \le i + T_+} \frac{|u_j(W_{n-1})^* Y|^2}{|\lambda_j(W_{n-1}) - \lambda_i(W_n)|} \gg \log^{-O(1)} n \tag{21}$$
with overwhelming probability for some $1 \le T_-, T_+ \ll \log^{O(1)} n$, where $Y := \frac{1}{\sqrt{n}} X$ has entries $\frac{1}{\sqrt{n}} \zeta_{j1}$ for $1 < j \le n$.

There are two cases: the bulk case and the edge case; the former was already treated in [27], but the latter is new.

4.1. The bulk case.

Suppose that $n/2 \le i < 0.999n$. Then from the semicircular law (or Theorem 1.8) we see that $\lambda_i(W_n) \in [-2 + \varepsilon, 2 - \varepsilon]$ with overwhelming probability for some absolute constant $\varepsilon > 0$. Let $A$ be a large constant to be chosen later. A further application of Theorem 1.8 then shows that there is an interval $I$ of length $\log^A n/n$ centered at $\lambda_i(W_n)$ which contains $\Theta(\log^A n)$ eigenvalues of $W_n$. If $\lambda_j(W_n), \lambda_{j+1}(W_n)$
lie in $I$, then by the Cauchy interlacing property (5), $|\lambda_j(W_{n-1}) - \lambda_i(W_n)| \ll \log^A n/n$. One can thus lower bound the left-hand side of (21) (for suitable values of $T$) by
$$\gg n \log^{-A} n \sum_{j: \lambda_j(W_n), \lambda_{j+1}(W_n) \in I} |u_j(W_{n-1})^* Y|^2.$$
One can rewrite this as $\log^{-A} n\, \|\pi_H X\|^2$, where $H$ is the span of the $u_j(W_{n-1})$ for $\lambda_j(W_n), \lambda_{j+1}(W_n) \in I$. The claim then follows from Lemma 2.1 (for $A$ large enough).

4.2. The edge case.

We now turn to the more interesting edge case when $0.999n \le i \le n$. Using the semicircular law, we now see that
$$\lambda_i(W_n) \ge 1.9 \tag{22}$$
(say) with overwhelming probability. Next, we can exploit the following identity:

Lemma 4.2 (Interlacing identity) [27, Lem. 37]. If $u_j(W_{n-1})^* X$ is non-zero for every $j$, then
$$\sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} = \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda_i(W_n). \tag{23}$$

Proof. By diagonalising $W_{n-1}$ (noting that this does not affect either side of (23)), we may assume that $W_{n-1} = \mathrm{diag}(\lambda_1(W_{n-1}), \ldots, \lambda_{n-1}(W_{n-1}))$ and $u_j(W_{n-1}) = e_j$ for $j = 1, \ldots, n-1$. One then easily verifies that the characteristic polynomial $\det(W_n - \lambda I)$ of $W_n$ is equal to
$$\prod_{j=1}^{n-1} (\lambda_j(W_{n-1}) - \lambda) \left[ \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda - \sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda} \right]$$
when $\lambda$ is distinct from $\lambda_1(W_{n-1}), \ldots, \lambda_{n-1}(W_{n-1})$. Since $u_j(W_{n-1})^* X$ is non-zero by hypothesis, we see that this polynomial does not vanish at any of the $\lambda_j(W_{n-1})$. Substituting $\lambda_i(W_n)$ for $\lambda$, we obtain (23).
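The interlacing identity (23) can also be confirmed numerically. In the sketch below (names and test matrix are our own assumptions) the last row and column of a Hermitian matrix w are the ones removed, the vector in the numerator is the off-diagonal part of that column of w itself, and the removed diagonal entry of w plays the role of the term involving zeta_nn.

```python
import numpy as np

def interlacing_identity_sides(w, i):
    """Both sides of (23) for a Hermitian matrix w and its top-left (n-1) x (n-1) minor."""
    lam_i = np.linalg.eigvalsh(w)[i]
    col = w[:-1, -1]                                       # off-diagonal part of the last column of w
    minor_vals, minor_vecs = np.linalg.eigh(w[:-1, :-1])   # spectrum of the minor
    overlaps = np.abs(minor_vecs.conj().T @ col) ** 2
    lhs = np.sum(overlaps / (minor_vals - lam_i))
    rhs = w[-1, -1].real - lam_i
    return lhs, rhs

rng = np.random.default_rng(5)
n = 7
g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
w = (g + g.conj().T) / (2.0 * np.sqrt(n))
for i in range(n):
    print(i, *interlacing_identity_sides(w, i))
```

At the top of the spectrum the right-hand side is close to -2, and the edge argument below shows that most of this mass must come from a few nearby eigenvalues of the minor.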
Again, the continuity of the entries of Mn ensure that the hypothesis of Lemma 4.2 is obeyed almost surely. From (20), (22), (23) one has n−1 |u j (Wn−1 )∗ X |2 ≥ 1.9 − o(1) j=1 λ j (Wn−1 ) − λi (Wn ) with overwhelming probability, so to show (21), it will suffice by the triangle inequality to show that ∗ X |2 |u (W ) j n−1 ≤ 1.8 + o(1) (24) j>i+T+ or j 100 be a large constant to be chosen later. By Theorem 1.8, we see (if A is large enough) that N I = nα I |I | + O(|I |n log−A/20 n)
(25)
with overwhelming probability for any interval I of length |I | = n/n, where α I := |I1| I ρsc (x) d x. For any such interval, we see from Lemma 2.1 (and Cauchy interlacing (5)) that with overwhelming probability log A/2+O(1) n NI ∗ 2 +O |u j (Wn−1 ) X | = n n log A
j:λ j (Wn−1 )∈I
and thus by (25) (for A large enough) |u j (Wn−1 )∗ X |2 = α I |I | + O(|I | log−A/20) n). j:λ j (Wn−1 )∈I
Set d I :=
dist(λi (Wn ),I ) . |I |
If d I ≥ log n (say), then
1 1 = +O λ j (Wn−1 ) − λi (Wn ) d I |I |
1 d I2 |I |
for all j in the above sum, thus j:λ j (Wn−1 )∈I
|u j (Wn−1 )∗ X |2 αI = +O λ j (Wn−1 ) − λi (Wn ) dI
log−A/20 n dI
+O
αI d I2
.
(26)
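We now partition the real line into intervals $I$ of length $\log^A n / n$, and sum (26) over all $I$ with $d_I \ge \log n$. Bounding $\alpha_I$ crudely by $O(1)$, we see that
\[ \sum_I O\left( \frac{\alpha_I}{d_I^2} \right) = O\left( \frac{1}{\log n} \right) = o(1). \]
Similarly, one has
\[ \sum_I O\left( \frac{\log^{-A/20} n}{d_I} \right) = O(\log^{-A/20} n \log n) = o(1) \]
if $A$ is large enough. Finally, Riemann integration of the principal value integral
\[ \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx := \lim_{\varepsilon \to 0} \int_{|x| \le 2:\ |x - \lambda_i(W_n)| > \varepsilon} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx \]
shows that
\[ \sum_I \frac{\alpha_I}{d_I} = \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx + o(1). \]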
The operator norm of $W_n$ is $2 + o(1)$ with overwhelming probability (see e.g. [2,4]), so $|\lambda_i(W_n)| \le 2 + o(1)$. Using the formula (11) for the Stieltjes transform, one obtains from residue calculus that
\[ \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx = -\lambda_i(W_n)/2 \]
for $|\lambda_i(W_n)| \le 2$, with the right-hand side replaced by $-\lambda_i(W_n)/2 + \sqrt{\lambda_i(W_n)^2 - 4}/2$ for $|\lambda_i(W_n)| > 2$. In either event, we have
\[ \left| \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx \right| \le 1 + o(1). \]
Putting all this together, we see that
\[ \left| \sum_{I: d_I \ge \log n}\ \sum_{j: \lambda_j(W_{n-1}) \in I} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \le 1 + o(1). \]
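The value $-\lambda/2$ of the principal value integral inside the bulk can be sanity-checked numerically. The sketch below assumes the usual semicircle density $\rho_{sc}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}$ and uses a standard singularity-subtraction trick; the function names and the grid size are illustrative choices, not taken from the paper.

```python
import numpy as np

def rho_sc(x):
    """Semicircle density (1 / (2 pi)) * sqrt(4 - x^2), zero outside [-2, 2]."""
    return np.sqrt(np.clip(4.0 - x * x, 0.0, None)) / (2.0 * np.pi)

def pv_semicircle(lam, num_points=400_000):
    """Approximate p.v. int_{-2}^{2} rho_sc(x) / (x - lam) dx for |lam| < 2.

    Singularity subtraction: the integrand is split into
    (rho_sc(x) - rho_sc(lam)) / (x - lam), which is integrable in the
    ordinary sense, plus rho_sc(lam) / (x - lam), whose principal value
    over [-2, 2] equals rho_sc(lam) * log((2 - lam) / (2 + lam)).
    """
    h = 4.0 / num_points
    x = -2.0 + (np.arange(num_points) + 0.5) * h   # midpoint rule nodes
    diff = x - lam
    safe = np.abs(diff) > 1e-12
    # Derivative of rho_sc at lam, used only if a node coincides with lam.
    slope = -lam / (2.0 * np.pi * np.sqrt(4.0 - lam * lam))
    quotient = np.full_like(x, slope)
    quotient[safe] = (rho_sc(x[safe]) - rho_sc(lam)) / diff[safe]
    regular_part = np.sum(quotient) * h
    log_part = rho_sc(lam) * np.log((2.0 - lam) / (2.0 + lam))
    return float(regular_part + log_part)

for lam in (-1.5, 0.0, 0.5, 1.0):
    print(lam, pv_semicircle(lam), -lam / 2.0)
```

Each printed pair should agree closely, consistent with the bound $1 + o(1)$ used above for $|\lambda_i(W_n)| \le 2$.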
The intervals $I$ with $d_I < \log n$ will contribute at most $\log^{A+O(1)} n$ eigenvalues, by (25) (and Cauchy interlacing (5)). The claim (24) now follows by setting $T_-$ and $T_+$ appropriately. The proof of Proposition 1.10 is now complete.
Remark 4.3 From (21) and Lemma 2.1 one sees that $|\lambda_{i-1}(W_{n-1}) - \lambda_i(W_n)| \ll \log^{O(1)} n / n$ with overwhelming probability for all $n/2 \le i \le n$, and similarly one has $|\lambda_i(W_{n-1}) - \lambda_i(W_n)| \ll \log^{O(1)} n / n$ with overwhelming probability for all $1 \le i \le n/2$. On the other hand, according to the Tracy-Widom law, the gap between $\lambda_n(W_n)$ and $\lambda_{n-1}(W_n)$ (or between $\lambda_1(W_n)$ and $\lambda_2(W_n)$) can be expected to be as large as $n^{-2/3}$. Thus we see that there is a significant bias at the edge in the interlacing law (5), which can ultimately be traced to the imbalance of “forces” in the interlacing identity (23) at that edge.
5. Lower Bound on Eigenvalue Gap
We now give the proof of Theorem 1.12. Most of the proof will follow closely the proof of Theorem 1.5 in [27], so we shall focus on the changes needed to that argument. As such, this section will assume substantial familiarity with the material from [27], and will cite from it repeatedly (similarly for the next section). For technical reasons relating to an induction argument, it turns out that one has to treat the extreme cases $i = 1, n$ separately:
Proposition 5.1 (Extreme cases). Theorem 1.12 is true when $i = 1$ or $i = n$.
Proof By symmetry it suffices to do this for $i = n$. By a limiting argument we may assume that the entries $\zeta_{ij}$ of $M_n$ are continuously distributed. From Lemma 4.2 one has (almost surely) that
\[ \sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_n(W_n)} = \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda_n(W_n). \]
Recall that $\lambda_n(W_n) = 2 + o(1)$ with overwhelming probability; also, $\frac{1}{\sqrt{n}} \zeta_{nn} = o(1)$ with overwhelming probability. As all the terms in the left-hand side have the same sign, we conclude that
\[ \frac{|u_{n-1}(W_{n-1})^* X|^2}{|\lambda_{n-1}(W_{n-1}) - \lambda_n(W_n)|} \ll 1. \]
From Theorem 2.3 and Proposition 1.10, we have $|u_{n-1}(W_{n-1})^* X| \ge n^{-c_0/10}$ (say) with high probability, and so $|\lambda_{n-1}(W_{n-1}) - \lambda_n(W_n)| \ge n^{-c_0}$ with high probability. The claim now follows from the Cauchy interlacing property (5).
Remark 5.2 In fact, at the edge, one should be able to improve the lower bound on the eigenvalue gap substantially, from $n^{-c_0}$ to $n^{1/3-c_0}$, in accordance with the Tracy-Widom law, but we will not need to do so here.
Now we handle the general case of Theorem 1.12. Fix $M_n$ and $c_0$. We write $n_0, i_0$ for $n, i$, thus $1 \le i_0 \le n_0$ and our task is to show that $\lambda_{i_0+1}(A_{n_0}) - \lambda_{i_0}(A_{n_0}) \ge n_0^{-c_0}$ with high probability. By Proposition 5.1 we may assume $1 < i_0 < n_0$. We may also assume $n_0$ to be large, as the claim is vacuous otherwise. As in previous sections, we may truncate so that all coefficients $\zeta_{ij}$ are of size $O(\log^{O(1)} n_0)$ (as before, the exponentially small corrections to the mean and variance of $\zeta_{ij}$ caused by this are easily controlled), and approximate so that the distribution is continuous rather than discrete. For each $n_0/2 \le n \le n_0$, let $A_n$ be the top left $n \times n$ minor of $A_{n_0}$. As in [27, Sect. 3.4], we introduce the regularized gap $g_{i,l,n} :=$
inf
λi+ (An ) − λi− (An )
1≤i − ≤i−l 0, and assume $n$ sufficiently large depending on these parameters. Let $1 \le i_1 < \dots < i_k \le n$. For a complex parameter $z$, let $A(z)$ be a (deterministic) family of $n \times n$ Hermitian matrices of the form
\[ A(z) = A(0) + z e_p e_q^* + \bar{z} e_q e_p^*, \]
where $e_p, e_q$ are unit vectors. We assume that for every $1 \le j \le k$ and every $|z| \le n^{1/2+\varepsilon_1}$ whose real and imaginary parts are multiples of $n^{-C_1}$, we have
• (Eigenvalue separation) For any $1 \le i \le n$ with $|i - i_j| \ge n^{\varepsilon_1}$, we have
\[ |\lambda_i(A(z)) - \lambda_{i_j}(A(z))| \ge n^{-\varepsilon_1} |i - i_j|. \tag{33} \]
• (Delocalization at $i_j$) If $P_{i_j}(A(z))$ is the orthogonal projection to the eigenspace associated to $\lambda_{i_j}(A(z))$, then
\[ \|P_{i_j}(A(z)) e_p\|,\ \|P_{i_j}(A(z)) e_q\| \le n^{-1/2+\varepsilon_1}. \tag{34} \]
• For every $\alpha \ge 0$,
\[ \|P_{i_j,\alpha}(A(z)) e_p\|,\ \|P_{i_j,\alpha}(A(z)) e_q\| \le 2^{\alpha/2} n^{-1/2+\varepsilon_1}, \tag{35} \]
whenever $P_{i_j,\alpha}$ is the orthogonal projection to the eigenspaces corresponding to eigenvalues $\lambda_i(A(z))$ with $2^\alpha \le |i - i_j| < 2^{\alpha+1}$.
We say that $A(0), e_p, e_q$ are a good configuration for $i_1, \dots, i_k$ if the above properties hold. Assuming this good configuration, we have
\[ \mathbf{E}(F(\zeta)) = \mathbf{E}F(\zeta') + O(n^{-(r+1)/2+O(\varepsilon_1)}), \tag{36} \]
whenever
\[ F(z) := G(\lambda_{i_1}(A(z)), \dots, \lambda_{i_k}(A(z)), Q_{i_1}(A(z)), \dots, Q_{i_k}(A(z))), \]
and $G = G(\lambda_{i_1}, \dots, \lambda_{i_k}, Q_{i_1}, \dots, Q_{i_k})$ is a smooth function from $\mathbb{R}^k \times \mathbb{R}^k_+ \to \mathbb{R}$ that is supported on the region $Q_{i_1}, \dots, Q_{i_k} \le n^{\varepsilon_1}$ and obeys the derivative bounds $|\nabla^j G| \le n^{\varepsilon_1}$ for all $0 \le j \le 5$, and $\zeta, \zeta'$ are random variables with $|\zeta|, |\zeta'| \le n^{1/2+\varepsilon_1}$ almost surely, which match to order $r$ for some $r = 2, 3, 4$. If $G$ obeys the improved derivative bounds $|\nabla^j G| \le n^{-Cj\varepsilon_1}$ for $0 \le j \le 5$ and some sufficiently large absolute constant $C$, then we can strengthen $n^{-(r+1)/2+O(\varepsilon_1)}$ in (36) to $n^{-(r+1)/2-\varepsilon_1}$.
Proof See [27, Prop. 43].
The second proposition asserts that these good configurations occur very frequently:
Proposition 6.2 (Good configurations occur very frequently). Let $\varepsilon_1 > 0$ and $C, C_1, k \ge 1$. Let $1 \le i_1 < \dots < i_k \le n$, let $1 \le p, q \le n$, let $e_1, \dots, e_n$ be the standard basis of $\mathbb{C}^n$, and let $A(0) = (\zeta_{ij})_{1 \le i,j \le n}$ be a random Hermitian matrix with independent upper-triangular entries and $|\zeta_{ij}| \le n^{1/2} \log^C n$ for all $1 \le i, j \le n$, with $\zeta_{pq} = \zeta_{qp} = 0$, but with $\zeta_{ij}$ having mean zero and variance 1 for all other $ij$, except on the diagonal where the variance is instead $c$ for some absolute constant $c > 0$, and also being distributed continuously in the complex plane. Then $A(0), e_p, e_q$ obey the Good Configuration Condition in Theorem 6.1 for $i_1, \dots, i_k$ and with the indicated value of $\varepsilon_1, C_1$ with overwhelming probability.
Proof The proof of this proposition repeats the proof of [27, Prop. 44 in Sect. 5] almost exactly. Only the following changes have to be made:
• All references to [27, Th. 56] (i.e. Theorem 1.6) need to be replaced with Theorem 1.8.
• All references to [27, Prop. 58] (i.e. Proposition 1.7) need to be replaced with Proposition 1.10.
• The edge regions in which $\lambda_i(A(z))$ do not fall inside the bulk region $[(-2+\varepsilon)n, (2-\varepsilon)n]$ no longer need to be treated separately, thus simplifying the last paragraph of the proof somewhat.
Given these two propositions, the proof of Theorem 1.11 repeats the proof of [27, Th. 15 in Sect. 3.3] almost exactly. Only the following changes have to be made:
• All references to [27, Prop. 44] need to be replaced with Proposition 6.2.
The proof of Theorem 1.11 is now complete.
Acknowledgements. The authors thank the anonymous referee for helpful comments and references, and Horng-Tzer Yau for additional references.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
1. Anderson, G., Guionnet, A., Zeitouni, O.: An introduction to random matrices. To be published by Cambridge Univ. Press
2. Bai, Z.D., Silverstein, J.: Spectral analysis of large dimensional random matrices. Mathematics Monograph Series 2, Beijing: Science Press, 2006
3. Bai, Z.D., Yin, Y.Q.: Convergence to the semicircle law. Ann. Probab. 16, 863–875 (1988)
4. Bai, Z.D., Yin, Y.Q.: Necessary and Sufficient Conditions for Almost Sure Convergence of the Largest Eigenvalue of a Wigner Matrix. Ann. Probab. 16, 1729–1741 (1988)
5. Deift, P.: Orthogonal polynomials and random matrices: a Riemann-Hilbert approach. Courant Lecture Notes in Mathematics, 3. New York University, Courant Institute of Mathematical Sciences, New York; Providence, RI: Amer. Math. Soc., 1999
6. Deift, P.: Universality for mathematical and physical systems. In: International Congress of Mathematicians Vol. I, Zürich: Eur. Math. Soc., 2007, pp. 125–152
7. Erdős, L., Schlein, B., Yau, H.-T.: Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices. Ann. Prob. 37(3), 815–852 (2009)
8. Erdős, L., Schlein, B., Yau, H.-T.: Local semicircle law and complete delocalization for Wigner random matrices. Commun. Math. Phys. 287(2), 641–655 (2009)
9. Erdős, L., Schlein, B., Yau, H.-T.: Wegner estimate and level repulsion for Wigner random matrices. Submitted, available at http://arxiv.org/abs/0811.2591v3[math.ph], 2009
10. Erdős, L., Schlein, B., Yau, H.-T.: Universality of Random Matrices and Local Relaxation Flow. http://arxiv.org/abs/0907.5605v3[math-ph], 2009
11. Erdős, L., Ramirez, J., Schlein, B., Yau, H.-T.: Universality of sine-kernel for Wigner matrices with a small Gaussian perturbation. http://arxiv.org/abs/0905.2089v1[math-ph], 2009
12. Erdős, L., Ramirez, J., Schlein, B., Yau, H.-T.: Bulk universality for Wigner matrices. http://arxiv.org/abs/0905.4176v2[math-ph], 2009
13. Erdős, L., Ramirez, J., Schlein, B., Tao, T., Vu, V., Yau, H.-T.: Bulk universality for Wigner hermitian matrices with subexponential decay. http://arxiv.org/abs/0906.4400v1[math.PR], 2009
14. Forrester, P.: The spectral edge of random matrix ensembles. Nucl. Phys. B 402, 709–728 (1993)
15. Johansson, K.: Universality of the local spacing distribution in certain ensembles of Hermitian Wigner matrices. Commun. Math. Phys. 215(3), 683–705 (2001)
16. Johansson, K.: Universality for certain Hermitian Wigner matrices under weak moment conditions, preprint
17. Katz, N., Sarnak, P.: Random matrices, Frobenius eigenvalues, and monodromy. American Mathematical Society Colloquium Publications, 45. Providence, RI: Amer. Math. Soc., 1999
18. Khorunzhiy, O.: High Moments of Large Wigner Random Matrices and Asymptotic Properties of the Spectral Norm. http://arxiv.org/abs/0907.3743v2[math.PR], 2009
19. Mehta, M.L.: Random Matrices and the Statistical Theory of Energy Levels. New York: Academic Press, 1967
20. Péché, S., Soshnikov, A.: On the lower bound of the spectral norm of symmetric random matrices with independent entries. Electron. Commun. Probab. 13, 280–290 (2008)
21. Péché, S., Soshnikov, A.: Wigner random matrices with non-symmetrically distributed entries. J. Stat. Phys. 129(5–6), 857–884 (2007)
22. Ruzmaikina, A.: Universality of the edge distribution of eigenvalues of Wigner random matrices with polynomially decaying distributions of entries. Commun. Math. Phys. 261(2), 277–296 (2006)
23. Sinai, Y., Soshnikov, A.: Central limit theorem for traces of large symmetric matrices with independent matrix elements. Bol. Soc. Brazil. Mat. 29, 1–24 (1998)
24. Sinai, Y., Soshnikov, A.: A refinement of Wigner's semicircle law in a neighborhood of the spectrum edge for random symmetric matrices. Func. Anal. Appl. 32, 114–131 (1998)
25. Soshnikov, A.: Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 207(3), 697–733 (1999)
26. Soshnikov, A.: Gaussian limit for determinantal random point fields. Ann. Probab. 30(1), 171–187 (2002)
27. Tao, T., Vu, V.: Random matrices: Universality of the local eigenvalue statistics. Submitted, available at http://arxiv.org/abs/0908.1982v4[math.PR], 2010
28. Tracy, C., Widom, H.: Level spacing distribution and Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)
29. Tracy, C., Widom, H.: On orthogonal and symplectic matrix ensembles. Commun. Math. Phys. 177, 727–754 (1996)
Communicated by H.-T. Yau
Commun. Math. Phys. 298, 573–583 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1046-3
Communications in
Mathematical Physics
A Quantum Boltzmann Equation for Haldane Statistics and Hard Forces; the Space-Homogeneous Initial Value Problem L. Arkeryd Mathematical Sciences, Chalmers, S-41296 Gothenburg, Sweden. E-mail:
[email protected] Received: 27 August 2009 / Accepted: 12 January 2010 Published online: 8 April 2010 – © Springer-Verlag 2010
Abstract: The paper considers equations of Boltzmann type for Haldane exclusion statistics. Existence and some basic properties of the solutions are studied for the space homogeneous initial value problem with hard forces and angular cut-off. The approach uses strong $L^1$ compactness. Some of the technical estimates are based on $L^\infty$ decay properties, and the control of the filling factor on range estimates for the solutions.
1. Introduction
The quantum Boltzmann Haldane equation (BH) is a kinetic equation of Boltzmann type for confined quasi-particles with Haldane statistics [H], e.g. in the interior of condensed matter. This exclusion statistics interpolates between the Fermi and Bose quantum behaviours. In quantum statistical mechanics, the number of quantum states of $N$ identical particles occupying $G$ states is given by
\[ \frac{(G+N-1)!}{N!\,(G-1)!} \quad \text{and} \quad \frac{G!}{N!\,(G-N)!} \]
in the boson resp. fermion cases. The interpolated number of quantum states for the fractional exclusion of Haldane and Wu is ([W])
\[ \frac{(G+(N-1)(1-\alpha))!}{N!\,(G-\alpha N-(1-\alpha))!}, \qquad 0 < \alpha < 1. \tag{1.1} \]
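As a quick illustration of how (1.1) interpolates between the two classical counts, the sketch below evaluates it for a small example. Two assumptions are made here that are not in the text: factorials of non-integer arguments are read as $x! = \Gamma(x+1)$, and the function name `haldane_wu_states` and the sample values of $G$ and $N$ are hypothetical.

```python
from math import exp, lgamma

def haldane_wu_states(G, N, alpha):
    """Evaluate the state count (1.1), reading x! as Gamma(x + 1).

    The Gamma-function reading of the factorial is an assumption made
    for this illustration so that non-integer arguments are allowed.
    """
    top = G + (N - 1) * (1.0 - alpha)
    bottom = G - alpha * N - (1.0 - alpha)
    return exp(lgamma(top + 1.0) - lgamma(N + 1.0) - lgamma(bottom + 1.0))

G, N = 5, 3
boson_count = haldane_wu_states(G, N, 0.0)    # should match (G+N-1)! / (N! (G-1)!)
fermion_count = haldane_wu_states(G, N, 1.0)  # should match G! / (N! (G-N)!)
half_count = haldane_wu_states(G, N, 0.5)     # an intermediate case
print(boson_count, half_count, fermion_count)
```

With these sample values the endpoint cases reduce to the binomial coefficients $\binom{7}{3}$ and $\binom{5}{3}$, and the intermediate value lies between them.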
When applied to the fractional quantum Hall effect, the Haldane statistics coincides with the two-dimensional anyon definition in terms of the braiding of particle trajectories. Haldane statistics may also be realized [BMB] for neutral fermionic atoms at ultra-low temperature in three dimensions at unitarity. Elastic collisions in a Boltzmann type collision operator are pair collisions preserving mass, first moments, and energy. For two particles having pre-collisional velocities v,
$v_*$ in $\mathbb{R}^d$, the velocities after collision are denoted by $v'$, $v'_*$. The density function in the corresponding variables is denoted by $f$, $f_*$ respectively $f'$, $f'_*$. The collision operator $Q$ for Haldane statistics, first introduced in [BBM], is
\[ Q(f) = \int_{\mathbb{R}^d \times S^{d-1}} B(v - v_*, \omega) \left[ f' f'_* F(f) F(f_*) - f f_* F(f') F(f'_*) \right] dv_*\, d\omega. \]
Here $d\omega$ corresponds to the Lebesgue probability measure on the sphere. The collision kernel $B$ in the variables $(z, \omega) \in \mathbb{R}^d \times S^{d-1}$ is positive, locally integrable, and only depends on $|z|$ and $|(z, \omega)|$. The filling factor $F$ is given by
\[ F(f) = (1 - \alpha f)^{\alpha} (1 + (1-\alpha) f)^{1-\alpha}, \qquad 0 < \alpha < 1. \]
It is concave with maximum value one at $f = 0$ for $\alpha \ge \frac{1}{2}$, and maximum value $(\frac{1}{\alpha} - 1)^{1-2\alpha} > 1$ at $f = \frac{1-2\alpha}{\alpha(1-\alpha)}$ for $\alpha < \frac{1}{2}$. With this filling factor, the collision operator vanishes identically for the Haldane equilibrium distribution functions as obtained in [W] under (1.1), but for no other functions. The Boltzmann equation for the limiting cases, representing boson statistics ($\alpha = 0$) and fermion statistics ($\alpha = 1$), was first studied by [Lu1] ($\alpha = 0$ in a space-homogeneous isotropic situation) resp. [D,L]. In their case ($\alpha = 1$) the cancellation of quartic terms in the collision integral, and applicability of Lions’ compactness result for the classical gain term, together allow for a space-dependent study resembling the weak $L^1$ analysis for the quadratic Boltzmann equation. For $0 < \alpha < 1$, however, there is no cancellation in the collision term. Moreover, the Lipschitz continuity of the collision term in the Fermi-Dirac case is now replaced by a weaker Hölder continuity. Those features led to the choice of a strong $L^1$-compactness approach in this paper.
Consider the initial value problem for the Boltzmann equation with Haldane statistics in the space-homogeneous case with velocities in $\mathbb{R}^d$, $d \ge 2$,
\[ \frac{df}{dt} = Q(f). \tag{1.2} \]
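The location and value of the maximum of the filling factor $F$ introduced above can be checked on a grid. This is only a numerical sketch: the function names, the grid resolution, and the sample values of $\alpha$ are assumptions made for illustration.

```python
import numpy as np

def filling_factor(f, alpha):
    """F(f) = (1 - alpha f)^alpha * (1 + (1 - alpha) f)^(1 - alpha) on [0, 1/alpha]."""
    # Clip guards against a tiny negative base from rounding at f = 1/alpha.
    base = np.clip(1.0 - alpha * f, 0.0, None)
    return base ** alpha * (1.0 + (1.0 - alpha) * f) ** (1.0 - alpha)

def predicted_maximum(alpha):
    """Maximiser and maximum value as stated in the text."""
    if alpha >= 0.5:
        return 0.0, 1.0
    f_star = (1.0 - 2.0 * alpha) / (alpha * (1.0 - alpha))
    return f_star, (1.0 / alpha - 1.0) ** (1.0 - 2.0 * alpha)

def grid_maximum(alpha, num_points=200_001):
    """Brute-force maximiser of F over a uniform grid on [0, 1/alpha]."""
    f = np.linspace(0.0, 1.0 / alpha, num_points)
    values = filling_factor(f, alpha)
    k = int(np.argmax(values))
    return float(f[k]), float(values[k])

for alpha in (0.25, 0.75):
    print(alpha, grid_maximum(alpha), predicted_maximum(alpha))
```

For $\alpha < \frac{1}{2}$ the grid maximiser should sit near $\frac{1-2\alpha}{\alpha(1-\alpha)}$ with a value above one, while for $\alpha \ge \frac{1}{2}$ the maximum should be attained at $f = 0$.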
Because of the filling factor $F$, the range for the initial value $f_0$ should belong to $[0, \frac{1}{\alpha}]$, which is then formally preserved by the equation. The general BH equation ($0 < \alpha < 1$) retains important properties from the Fermi-Dirac case (cf. [BBM]), but it has so far not been validated from basic quantum theory. Therefore the choice here is to consider the case of hard forces, $B(z, \omega) = |z|^{\beta} b(\frac{(z,\omega)}{|z|})$ with $0 < \beta \le 1$ and Grad angular cut-off, which also agrees with the better understood limiting cases $\alpha = 0, 1$. In the approach to existence in this paper, stronger limitations on the cut-off allow for weaker moment conditions for $f_0$. For the discussion below it is assumed that $0 < b \le c|\sin\theta \cos\theta|^{d-1}$, with an initial moment $(1 + |v|^s) f_0$ in $L^\infty$ for $s \ge d - 1 + \beta$.
In Section Two this initial value problem is considered for a family of approximations with bounded support for the kernel $B$, when $0 < f_0 \le \mathrm{esssup} f_0 < \frac{1}{\alpha}$. Starting from approximations with Lipschitz continuous filling factor, the corresponding solutions are shown to stay away uniformly from $\frac{1}{\alpha}$, the upper bound for the range. Uniform Lipschitz continuity follows for the approximating operators and leads to well posedness for the limiting problem. Section Three studies uniform $L^\infty$ moment bounds for the approximate solutions using an approach from the classical Boltzmann case [A]. Based on those preliminary results, in Section Four the main global existence result for hard forces, Theorem 4.1, is stated and proved. Section Five extends the result to initial values with $0 < f_0 \le \frac{1}{\alpha}$. Mass and first moments are conserved and energy is bounded by its initial value. That bound on energy in turn implies energy conservation using the arguments for energy conservation from [Lu2] or [MW]. A stability property of the solutions is discussed. The question of long time behaviour is left open.
Equations of Boltzmann Type for Haldane Exclusion Statistics
2. A Particular Collision Kernel This initial value problem for (1.2) will first be considered for kernels of a particular pseudo-Maxwellian type with bounded support, to be used as approximations for the main problem of the paper. For n sufficiently large, take B¯ n = χn B, where χn is the characteristic function of n , the complement of the set 1 , n 1 or |v − v∗ | − |(v − v∗ ) · ω| < }. n
{(v, v∗ , ω); v, v∗ ∈ IR d , ω ∈ Sd−1 , |v − v∗ | > n, or |(v − v∗ , ω)|
0, define χn (x) = χ (x − n), and set B¯ = Bn = χn (|v|)χn (|v∗ |)χn ((v |)χn (|v∗ |) B¯ n := n B¯ n . Proposition 2.1. Suppose that the initial value 0 < f 0 ∈ L 1 (Rd ) for the space¯ has finite energy and esssup f 0 (v) < 1 . homogeneous equation (1.2) with kernel B, α This initial value problem is well posed in L 1 with conservation of mass and energy. The essential supremum of the solution remains smaller than α1 on any set {(t, v); 0 ≤ t ≤ t0 , v ∈ Rd }. Proof. A family of locally Lipschitz continuous approximations is first introduced that preserves mass and the bounds for f 0 . In the collision operator for the approximations, Q with > 0, the modified filling factor is F ( f ) =
(1 − α f ) (1 + (1 − α) f )1−α . ( + (1 − α f )1−α )
(2.1)
It has bounded first derivatives with respect to f , when the range of f is contained in [0, α1 ], and so there is Lipschitz continuity for the corresponding approximate collision operators. With f¯ = 0 for f < 0, f¯ = f for 0 ≤ f ≤ α1 , f¯ = α1 for f > α1 , the Lipschitz continuity implies that the initial value problem df = Q ( f¯), dt
f (0, v) = f 0 (v)
is solvable for t near 0 with values in L ∞ . It follows from the equation that f is strictly increasing whenever it attains the value 0, and strictly decreasing at the value α1 . So f is absolutely continuous in t for a.e. v, satisfies 0 < f < α1 for t > 0, and f = f¯ solves the approximate initial value problem uniquely. This local solution can similarly be continued into a unique, global solution. For an initial value in L 1 , the solution stays in L 1 and conserves mass and energy. The function f (., v) is decreasing whenever ¯ f f ∗ F ( f )F ( f ∗ ) − f f ∗ F ( f )F ( f ∗ )) ≤ 0, dv∗ dω B( or
F ( f ) ≤
dv∗ dω B¯ f f ∗ F ( f )F ( f ∗ ) , dv∗ dω B¯ f f ∗ F ( f ∗ )
and in particular for v in t = {v; |v| ≤ n, F ( f (t, v)) ≤ 41 } , if dv∗ dω B¯ f f ∗ F ( f )F ( f ∗ ) F ( f )(t, v) ≤ inf t . dv∗ dω B¯ f f ∗ F ( f ∗ )
(2.2)
Given t0 > 0 and ∪t≤t0 t = ∅, the infimum over ∪t≤t0 t is positive. This holds using the conditions on B and the bounds on f , since the denominator has a uniform (in ) upper bound. Define b0 by F( α1 − b0 ) = 21 . Take > 0 small so that F ( α1 − b0 ) ≥ 41 . A positive lower bound of the numerator can be obtained as follows. In the integrand, by definition f (t, v) ≥ α1 − b0 on t for t ≤ t0 . For the factor f (t, v∗ ), the exponential form of the equation gives a lower bound coming from the initial value term f 0 , which is multiplied by an exponential factor. The exponent is a negative time integral of dv∗ dω B¯ f ∗ F ( f )F ( f ∗ ), again with uniform bound, and so f (t, v∗ ) ≥ e−t0 C f 0 (v∗ ). For the factor F ( f )F ( f ∗ ) in the numerator, remove from the integrand the set (of uniformly in bounded measure), where F ( f ) < 21 , or F ( f ∗ ) < 21 . This leads for n large to an -independent positive lower bound for the numerator. Hence there is a constant C0 > 0 (independent of ) such that, if 0 < f 0 ≤ α1 − C1 with 0 < C1 ≤ C0 , then the function f (t, v) of t for v fixed, starts decreasing not later than when reaching the value α1 − C1 . And so the inequality is preserved by f (t) for 0 < t ≤ t0 . This implies that the derivatives (with respect to f) ddf F ( f )(t, v) are uniformly in bounded. It then follows from the equation in mild form that sup | f (t, v + q) − f (t, v)|dv ≤ | f 0 (v + q) − f 0 (v)|dv t≤t0 |v| 0, for t ≤ t0 and independent of n, the exponential form of the equation gives a lower bound for the factor f n (t, v∗ ) equal to the positive initial value f 0 , multiplied by an exponential factor, which is bounded from below for |v| ≤ n 0 (with bound independent of n ≥ n 0 ). So for n 0 large enough (depending on B) and for n ≥ n 0 , the following estimate holds for the collision frequency: dv∗ dωBn f n (t, v∗ )F( f n )F( f n ∗ ) ≥ dω dv∗ n χn0 B f n (t, v∗ )F( f n )F( f n ∗ ) f n , f n ∗ r
≥ Cr β χn (|v|)
f n , f n ∗ r
n 0 χn0 f n ∗ dv∗ dω ≥ C1 χn (|v|),
for some C1 > 0 independent of n. Since |v − v∗ |β ≥ |v|β − |v∗ |β , it also holds that dv∗ dωBn f n (t, v∗ )F( f n )F( f n ∗ ) β ≥ c1 |v| n χn0 f n ∗ dv∗ dω − c2 (1 + v 2 ) f n ∗ dv∗ χn (|v|). f n , f n ∗ 0, and f + h 1 f ≤ h 2 (t > 0), then supt>0 f (t) ≤ max( f (0), sup t>0
h 2 (t) ). h 1 (t)
Proof of Proposition 3.1. Given $t_0 > 0$, the proposition follows from Lemma 3.2, Lemma 3.4, and Lemma 3.5 with $s = \min(d - 1 + \beta, \frac{2\beta(d+1)+2}{d})$.
4. Hard Force Collisions
In this section the above existence result for (1.2) with truncated kernels will be extended to hard force kernels with
\[ 0 < B(z, \omega) \le C|z|^{\beta} |\sin\theta \cos\theta|^{d-1}, \quad \text{where } 0 < \beta \le 1,\ d > 2, \text{ and } 0 < \beta < 1,\ d = 2. \tag{4.1} \]
Theorem 4.1. Let the initial value $0 < f_0 \in L^1$ of the space-homogeneous equation (1.2) for hard forces have finite energy and satisfy $\mathrm{esssup} f_0(v) < \frac{1}{\alpha}$. If $\mathrm{esssup}(1 + |v|^s) f_0 < \infty$ for $s = d - 1 + \beta$, then this initial value problem for (1.2) has a solution in the space of functions continuous from $t \ge 0$ into $L^1 \cap L^\infty$, which conserves mass and energy, and for $t_0 > 0$ given, has $\mathrm{esssup}_{v, t \le t_0} |v|^{s'} f(t, v)$ bounded, where $s' = \min(s, \frac{2\beta(d+1)+2}{d})$.
Proof. We shall use the solutions $f_n$ for the approximate kernels $B_n(v, \omega)$ of the previous sections to which Propositions 2.1 and 3.1 apply, and base the proof on a strong $L^1$-compactness property for the sequence $(f_n)$. Strong precompactness of the sequence $(f_n)$ in $L^1$ is equivalent to
\[ \lim_{q \to 0} \int |f_n(v + q) - f_n(v)|\, dv = 0, \tag{4.2} \]
uniformly in $n$.
Given $t_0 > 0$, by Proposition 3.1 there is $\lambda > 0$ such that $f_n(t, v) \le \frac{1}{2\alpha}$ for all $n \in \mathbb{N}$ and all $|v| > \lambda$. It follows from the proof of Proposition 2.1 that there is an $n_0 \ge \lambda$ and $C > 0$, such that $f_n(t, v) \le \frac{1}{\alpha} - C$ for all $n \ge n_0$ and all $|v| \le \lambda$. Hence for functions $f = f_n$ with $n \ge n_0$, the derivative $\frac{d}{df} F(f)$ is uniformly bounded when $t \le t_0$, $v \in \mathbb{R}^d$. With Sgn the sign function for $f_n(t, q + v) - f_n(t, v)$, it holds that
\[ \frac{d}{dt}\, \mathrm{Sgn}\, \big( f_n(t, q + v) - f_n(t, v) \big) = \mathrm{Sgn}\, \big( Q_n(f_n)(t, q + v) - Q_n(f_n)(t, v) \big). \tag{4.3} \]
The right hand side is split into a sum of four differences,
\[ \begin{aligned}
& \mathrm{Sgn} \Big( \int dv_* d\omega\, B_n f_n' f_{n*}' F(f_{n*})(t, q + v) - \int dv_* d\omega\, B_n f_n' f_{n*}' F(f_{n*})(t, v) \Big) F(f_n)(t, q + v) \\
& - \mathrm{Sgn} \Big( \int dv_* d\omega\, B_n f_{n*} F(f_n') F(f_{n*}')(t, q + v) - \int dv_* d\omega\, B_n f_{n*} F(f_n') F(f_{n*}')(t, v) \Big) f_n(t, v) \\
& + \mathrm{Sgn} \big( F(f_n)(t, q + v) - F(f_n)(t, v) \big) \int dv_* d\omega\, B_n f_n' f_{n*}' F(f_{n*})(t, v) \\
& - \mathrm{Sgn} \big( f_n(t, q + v) - f_n(t, v) \big) \int dv_* d\omega\, B_n f_{n*} F(f_n') F(f_{n*}')(t, q + v).
\end{aligned} \tag{4.4} \]
Here the last, negative term is removed, and integrals of the first three terms will be estimated using the uniform bound on the derivative $\frac{d}{df} F(f)$ for the sequence $(f_n)_{n \ge n_0}$ and $t \le t_0$. In the third term, by the proof of Lemma 3.3 the (gain type) integral is bounded uniformly in $n$. And so the estimate of the third term from above by $C \int |f_n(t, v + q) - f_n(t, v)|\, dv\, dt$ follows with $C$ independent of $n$.
The remaining two terms are split into two further differences. Of the four ensuing terms, the two terms
\[ \int dv\, dt \int dv_* d\omega\, |B_n(v + q - v_*, \omega) - B_n(v - v_*, \omega)|\, f_n(t, v_*) \big( F(f_n') F(f_{n*}') f_n \big)(t, q + v), \]
\[ \int dv\, dt \int dv_* d\omega\, |B_n(v + q - v_*, \omega) - B_n(v - v_*, \omega)|\, F(f_n)(t, v_*) \big( f_n' f_{n*}' F(f_n) \big)(t, q + v), \]
tend to zero when $q \to 0$, uniformly with respect to $n$, since mass and energy of the $f_n$'s are conserved, and the factor $F(f_n)$ is bounded uniformly in $v$ and $n$. In a third term,
\[ \int dv\, dt \int dv_* d\omega\, B_n(v - v_*, \omega) f_n(t, v_*) \big| F(f_n') F(f_{n*}')(t, q + v) - F(f_n') F(f_{n*}')(t, v) \big| f_n(t, q + v), \tag{4.5} \]
it is used that f (t, v)(1 + |v|2β ) is uniformly in n and t ≤ t0 bounded in L 1 ∩ L ∞ . The Carleman representation (3.1) together with Proposition 3.1, the condition (4.1), and the bound on the derivative ddf F( f ), can be used to estimate the difference in F( f n )F( f n ∗ ). When the difference is taken for F( f n ), the hyperplane integral in the Carleman representation is evaluated for the integrand (1 + |v − v∗ |β )−1 |v − v∗ |−d+1
b(θ )(sin θ )d−1−β (sin θ )β |v − v∗ |β (1 + |v − v∗ |β ) , (cos θ )d−1 (1 + |v|2β )(1 + |v∗ |2β )
which is convergent, and similarly for the difference in F( f n ∗ ). The terms in the integrand are evaluated at v (v + q, v∗ ) etc., and an upper bound C sup | f n (t, v + q ) − f n (t, v)|dvdt |q |≤|q|
is obtained with C independent of n. The same bound can actually be obtained for all the other terms of (4.3), if the integration with respect to time is first carried out, followed by taking a supremum with respect to |q | ≤ |q|, and only then the integration in v. This will be used. There remains the term Sgn( dv∗ dωBn (v − v∗ , ω)F( f n )(t, v∗ )( f n f n ∗ (t, q + v) − f n f n ∗ (t, v))F( f n )(t, q + v)).
(4.6)
We shall insert F( f ) = F(0) + f F ( ), where 0 < < f , and use that f n (t, v) ≤ c s , uniformly in n and for t ≤ t0 . Then 1+|v| the term resulting from F(0)F(0) (= 1), gives a Boltzmann gain term sequence, which is here compact (cf. [L]). It follows that (after taking the subsequence), uniformly in n, dvdt sup | dv∗ dωBn (v − v∗ , ω)( f n f n ∗ (t, q + v) − f n f n ∗ (t, v))| |q |≤|q|
converges to zero when q → 0. In the remaining terms, there is at least one factor from F( f n ) or F( f n ∗ ) that can be estimated by c s or c s , e.g. f n ∗ in the term Sgn
1+|v|
1+|v∗ |
dv∗ dωBn (v − v∗ , ω) f n ∗ F ( n )( f n (t, v (v + q , v∗ )) − f n (t, v (v, v∗ ))) f n ∗
can after integration be estimated from above by c | f (t, v (v + q , v∗ )) dv∗ dωBn (v − v∗ , ω) dvdt sup s n 1 + |v ∗| |q |≤|q| 1 β f (1 + |v | ) ≤ c dvdtdv∗ sup | f n (t, v + q ) − f n (t, v )| ∗ 1 + |v∗ |β n ∗ |q |≤|q| − f n (t, v)|(1 + |v∗ |2 ) f n (t, v∗ ). Here (4.1), together with |v∗ − v∗ | = cos θ |v − v∗ |, |v∗ − v∗ |β ≤ (1 + |v∗ |β )(1 + |v∗ |β ) were used. The other remaining terms can be handled similarly. That gives sup sup | f n (t, v + q ) − f n (t, v)|dv ≤ sup | f 0 (v + q ) t≤t0 |v| λ. Set ν(v) = dv∗ dωB f ∗ F( f )F( f ∗ ) and ν˜ (v) = dv∗ dωB f f ∗ F( f ∗ ). It holds that Q f < 0 if F( f ) < fν˜ν , in particular if F( f ) < in f t fν˜ν := 2C1 . Here t = {v; |v| ≤ λ, F( f (t, v)) ≤ 21 } and obviously C1 > 0. Define b1 by F( α1 − b1 ) = min{ 41 , C1 }. Hence for t ≤ t0 it holds F( f ) ≤ C1 if and only if |v| ≤ λ and α1 ≥ f ≥ α1 − b1 . For such f -values F( f )˜ν ≤ C1 ν˜ ≤ 21 f ν, and so 1 1 1 Q f ≤ − f ν ≤ − ( − b1 )C12 B f ∗ dωdv∗ 2 2 α F( f ),F( f ∗ )>C1 t0 1 1 2 β ≤ − ( − b1 )C1 r inf exp(− νds)dωdv∗ |v|≤λ F( f ),F( f )>C1 ,|v−v∗ |>r 2 α 0 ∗ := −C2 , with C2 > 0. This gives a maximum time for the equation ddtf = Q f with initial value f 0 to reach f (t, v) ≤ α1 − b1 . Also for any t > 0 the solutions stay uniformly in n away from α1 . From here a version of the previous study of (4.4) can be used to prove existence for an initial time interval and an initial value f 0 not remaining uniformly in v away from 1 ∞ 1 α . At a few places the L estimate in time should then be replaced by an L -estimate to α α−1 providing a ’small handle factors of the type α(b + ct) (estimates for integrals of t factor’). Thus in the third term of (4.4) the factor F( f )(t, q + v) − F( f )(t, v) is now bounded by (c3 + c4 t α−1 )( f (t, q + v) − f ((t, v)). This gives an upper bound for the integral of the third term (c3 t + c4 t α ) sup | f (t, q + v) − f (t, v)|dv. (5.1) t≤t0
In the remaining two terms of (4.4) the B-differences are treated as in Sect. 4. For (4.5) the F( f )-difference is estimated as above (the previous third term), now giving the upper bound (c3 t + c4 t α ) supt≤t0 sup|q |≤|q| | f (t, v + q ) − f (t, v)|dv. There remains to consider (4.6) and (4.7) in the present setting. Again an estimate of F(θ )-factors by (c3 + c4 t α−1 ) leads to upper bounds of the type (5.1). Similarly to Sect. 4 the claims of the theorem follow.
Acknowledgement. The author would like to thank J. Bergh for useful discussions during the preparation of the paper. The insightful comments from one referee helped to improve the paper.
References
[A] Arkeryd, L.: $L^\infty$-estimates for the space-homogeneous Boltzmann equation. J. Stat. Phys. 31, 347–361 (1983)
[BBM] Bhaduri, R.K., Bhalero, R.S., Murthy, M.V.: Haldane exclusion statistics and the Boltzmann equation. J. Stat. Phys. 82, 1659–1668 (1996)
[BMB] Bhaduri, R.K., Murthy, M.V., Brack, M.: Fermionic ground state at unitarity and Haldane exclusion statistics. J. Phys. B 41, 115301 (2008)
[D] Dolbeault, J.: Kinetic models and quantum effects: a modified Boltzmann equation for Fermi-Dirac particles. Arch. Rat. Mech. Anal. 127, 101–131 (1994)
[H] Haldane, F.D.: Fractional statistics in arbitrary dimensions: a generalization of the Pauli principle. Phys. Rev. Lett. 67, 937–940 (1991)
[L] Lions, P.L.: Compactness in Boltzmann’s equation via Fourier integral operators and applications I, III. J. Math. Kyoto Univ. 34, 391–427, 539–584 (1994)
[Lu1] Lu, X.: A modified Boltzmann equation for Bose-Einstein particles: isotropic solutions and long time behaviour. J. Stat. Phys. 98, 1335–1394 (2000)
[Lu2] Lu, X.: Conservation of energy, entropy identity, and local stability for the spatially homogeneous Boltzmann equation. J. Stat. Phys. 96, 765–796 (1999)
[MW] Mischler, S., Wennberg, B.: On the spatially homogeneous Boltzmann equation. Ann. Inst. Henri Poincaré 16, 467–501 (1999)
[W] Wu, Y.S.: Statistical distribution for generalized ideal gas of fractional-statistics particles. Phys. Rev. Lett. 73, 922–925 (1994)
Communicated by H. Spohn
Commun. Math. Phys. 298, 585–611 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1086-8
Communications in
Mathematical Physics
Deformation Quasi-Hopf Algebras of Non-semisimple Type from Cochain Twists C. A. S. Young, R. Zegers Department of Mathematical Sciences, University of Durham, South Road, Durham DH1 3LE, UK. E-mail:
[email protected];
[email protected] Received: 18 December 2008 / Accepted: 7 June 2010 Published online: 11 July 2010 – © Springer-Verlag 2010
Abstract: Given a symmetric decomposition g = h ⊕ p of a semisimple Lie algebra g, we define the notion of a p-contractible quantized universal enveloping algebra (QUEA): for these QUEAs the contraction g → g0 making p abelian is nonsingular and yields a QUEA of g0 . For a certain class of symmetric decompositions, we prove, by refining cohomological arguments due to Drinfel’d, that every QUEA of g0 so obtained is isomorphic to a cochain twist of the undeformed envelope U(g0 ). To do so we introduce the p-contractible Chevalley-Eilenberg complex and prove, for this class of symmetric decompositions, a version of Whitehead’s lemma for this complex. By virtue of the existence of the cochain twist, there exist triangular quasi-Hopf algebras based on these contracted QUEAs and, in the approach due to Beggs and Majid, the dual quantized coordinate algebras admit quasi-associative differential calculi of classical dimensions. As examples, we consider κ-Poincaré in 3 and 4 spacetime dimensions. 1. Introduction This paper is concerned with deformation quantizations of the universal enveloping algebras (UEAs) of a certain class of non-semisimple Lie algebras, and more particularly with proving that these deformations are cochain twists of their undeformed counterparts. The Lie algebras we consider have the property that they can be obtained by contraction of semisimple Lie algebras; among them is the Poincaré algebra, which is the case of clearest physical interest and will be the example we treat in detail. Let us first recall the situation concerning twists of (semi)simple Lie algebras. For any simple Lie algebra g, the standard Drinfel’d-Jimbo quantization Uh (g) comes equipped with a quasitriangular structure R, which provides the isomorphisms required to turn its category of representations into a quasitensor category [1–3]. 
As a quasitriangular Hopf algebra, (U_h(g), R) is not twist-equivalent by any cocycle twist to (U(g), 1 ⊗ 1), the undeformed UEA equipped with the usual Hopf algebra structure and trivial triangular structure. It cannot be, because R is strictly quasitriangular (i.e. R_21 ≠ R^{−1}) and
the property of triangularity is preserved by twisting. The celebrated result of Drinfel’d [4–6] is that U_h(g) and U(g) are twist equivalent in the larger category of quasi-Hopf algebras. Here one drops the requirement that the twist element F obey the cocycle condition. Since this condition is what guarantees the preservation of coassociativity under twisting, a quasi-Hopf algebra may fail to be coassociative; but it does so in a controlled fashion, specified by the coassociator Φ. In the special case Φ = 1 ⊗ 1 ⊗ 1 one recovers the definition of a Hopf algebra. Drinfel’d showed that (U_h(g), R, 1 ⊗ 1 ⊗ 1) can be reached by a cochain twist starting from (U(g), R_KZ, Φ_KZ); that is, starting from the quasitriangular quasi-Hopf (qtqH) algebra obtained by equipping U(g) with a certain R-matrix R_KZ and coassociator Φ_KZ constructed from the monodromies of a Knizhnik-Zamolodchikov system of equations, which in turn depend on the quadratic Casimir t of g. One has R_KZ = e^{ht}, where the Casimir is split over the tensor product.
An alternative possibility, discussed notably by Beggs and Majid [7,8], is to start instead with (U(g), 1^{⊗2}, 1^{⊗3}) and twist by the same cochain F as in Drinfel’d’s construction. What results is, necessarily, the same algebra and coproduct as U_h(g), but now equipped with non-standard R-matrix F_21 F^{−1} and coassociator Φ_F (the coboundary of F, closely related to Φ_KZ). Φ_F is central in the sense that the coproduct of U_h(g) is coassociative, but it is nevertheless non-trivial and thus (U_h(g), F_21 F^{−1}, Φ_F) is a triangular but strictly quasi-Hopf algebra. Dually, the deformed function algebra C_h(G) becomes a co-quasi-Hopf algebra which happens to be associative. The non-triviality of Φ_F is seen at the level of intertwiners of representations (the category of representations is symmetric but non-trivially monoidal).
It also appears when one tries to construct a differential calculus on C_h(G), and in fact this was one of the original motivations for considering the set-up: Beggs and Majid showed that, at least for semisimple g, the standard quantum groups C_h(G) do not admit any bi-covariant associative differential calculus of classical dimensions in deformation theory. But, by the existence of the cochain twist, one can construct a quasi-associative differential calculus Ω(C_h(G)) of classical dimensions [7,8].
The results summarized above pertain to semisimple Lie algebras. To the authors’ knowledge no systematic extension to quantized universal enveloping algebras (QUEAs) of general Lie algebras is known. The proof of the existence of Drinfel’d’s twist element F relies on the vanishing of a certain cohomology class, which holds for semisimple Lie algebras but may fail more generally. Drinfel’d did show [5] that any qtqH QUEA is isomorphic to a cochain twist of the undeformed UEA of the underlying Lie algebra g. So the existence of a qtqH structure is sufficient as well as necessary for the existence of the twist. But when g is not semisimple, one has no general means of knowing whether the given QUEA admits a qtqH algebra structure.
In physics one is also concerned with non-semisimple Lie algebras. In particular, in trying to formulate non-commutative quantum field theory by (paralleling the usual approach in [9]) beginning with particles, regarded as irreducible representations of the algebra of spacetime symmetries, one is certainly interested in the Poincaré algebra iso(1, n) and its deformations. Possibly the most well-known deformation of U(iso(1, n)) is the θ-deformation, which is dual to the usual non-commutative coordinate algebra [x_i, x_j] = θ_ij, with θ_ij constant. It is known to be twist-equivalent to U(iso(1, n)) by a cocycle twist [10,11], making it in a sense a rather mild deformation.
The results presented here will apply, rather, to what is referred to as the κ-deformation of U(iso(1, n)) [12–19]. κ-Poincaré can be understood in more than one way. From one perspective, it arises as a certain bicrossproduct [20], and this viewpoint allows for a nice geometrical interpretation discussed in [21]. Another formulation [22–24] is as a
particular contraction limit of the appropriate real form of the standard Drinfel’d-Jimbo QUEA U_h(so(n + 1, C)). It is this property which will be relevant in the present work.
The main idea, then, is to consider a class of QUEAs obtained by applying to (e.g. the standard) QUEAs U_h(g) of semisimple Lie algebras g a contraction procedure modelled on that used to obtain κ-Poincaré. As we recall in detail below, to every symmetric decomposition g = h ⊕ p of a Lie algebra g there is associated an Inönü-Wigner contraction, in which p is rescaled to become an abelian ideal of the contracted Lie algebra g_0. Whenever the contraction procedure is non-singular at the level of the U_h(g), this yields a quantization U_h(g_0) of U(g_0). One must specify how the formal deformation parameter is rescaled in the contraction limit to produce the limiting parameter h; obviously there are many possibilities, and we will consider the choice that ensures that κ-Poincaré is captured by our results.
We will show (Theorem 6.1) that, given a certain restriction on the allowed symmetric decomposition (see Definition 3.2), every such QUEA U_h(g_0) is isomorphic to a twist of the undeformed UEA U(g_0) by a cochain F_0. We do this by refining the cohomological arguments of Drinfel’d so as to prove that the twist element F which relates U(g) and U_h(g) can be chosen to be non-singular in the contraction limit. Since U(g_0) can always be endowed with the trivial qtqH structure R = 1^{⊗2}, Φ = 1^{⊗3}, the existence of this twist F_0 means that one can certainly obtain (see Corollary 5.4 below) a triangular quasi-Hopf algebra (U_h(g_0), (F_0)_21 F_0^{−1}, Φ_{F_0}). And, from [7,8], the deformed coordinate algebra C_h(G_0) dual to U_h(g_0) will admit a quasi-associative bicovariant differential calculus of classical dimensions. It is a separate question whether U_h(g_0) admits a quasitriangular Hopf algebra structure. In Sect.
5, we give a necessary condition for such a structure to arise by contraction (see Corollary 5.5). Examples, in the case of κ-Poincaré, are in Sect. 7.
The paper is organised as follows. In Sect. 2, we recall the definition of symmetric semisimple Lie algebras. The important notion of contractibility is introduced in Sect. 3 after a brief reminder of the definitions of the filtered and graded algebras associated to UEAs. Sect. 4 is dedicated to the cohomology of associative algebras and Lie algebras. After a brief account of Hochschild and Chevalley-Eilenberg cohomology, we introduce the notion of contractible Chevalley-Eilenberg cohomology. We establish, in particular, the vanishing of the first contractible Chevalley-Eilenberg cohomology module for symmetric semisimple Lie algebras possessing the restriction property (Definition 3.2). This will be crucial in proving the existence of a contractible twist. In Sect. 5, the usual rigidity theorems for semisimple Lie algebras are then refined, with special regard to the contractibility of the structures. We construct, in particular, a contractible twist from every contractible QUEA of restrictive type to the undeformed UEA of the underlying Lie algebra. The actual contraction is performed in Sect. 6. In Sect. 7 we comment on the implications of our mathematical results for the particular example of κ-Poincaré. We discuss how they are compatible with previous work and explain certain previous results. Throughout Sects. 1 through 6, K denotes a field of characteristic zero.
2. Symmetric Decompositions of Lie Algebras
Let us briefly review some well-known facts concerning symmetric semisimple Lie algebras. Following [25,26], we have
Definition 2.1. A symmetric Lie algebra is a pair (g, θ), where g is a Lie algebra and θ : g → g is an involutive (i.e. θ ◦ θ = id and θ ≠ id) automorphism of Lie algebras.
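To make Definition 2.1 concrete, here is a minimal worked example. It is our own illustration rather than part of the paper: the choice g = so(3), the basis J_1, J_2, J_3 and the particular map θ are all assumptions made for the sake of the sketch.

```latex
% Illustrative example (assumed, not from the paper):
% g = so(3) with basis J_1, J_2, J_3 and brackets
\[
  [J_1, J_2] = J_3, \qquad [J_2, J_3] = J_1, \qquad [J_3, J_1] = J_2 .
\]
% Candidate involution: flip the sign of J_1 and J_2, fix J_3.
\[
  \theta(J_1) = -J_1, \qquad \theta(J_2) = -J_2, \qquad \theta(J_3) = J_3 .
\]
% theta respects the three defining brackets, so it is an automorphism:
\begin{align*}
  [\theta J_1, \theta J_2] &= [-J_1, -J_2] = J_3 = \theta(J_3), \\
  [\theta J_2, \theta J_3] &= [-J_2, J_3] = -J_1 = \theta(J_1), \\
  [\theta J_3, \theta J_1] &= [J_3, -J_1] = -J_2 = \theta(J_2).
\end{align*}
% Clearly theta o theta = id and theta is not the identity.
```

Under these assumptions (so(3), θ) is a symmetric Lie algebra; the +1 eigenspace of θ is spanned by J_3 and the −1 eigenspace by J_1 and J_2.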
As θ ◦ θ = id, the eigenvalues of θ are +1 and −1. Let h = ker(θ − id) and p = ker(θ + id) be the corresponding eigenspaces. Every such θ thus defines a symmetric decomposition of g, i.e. a triple (g, h, p) such that
• h ⊂ g is a Lie subalgebra;
• g = h ⊕ p as K-modules;
• [h, p] ⊆ p and [p, p] ⊆ h.
Any Lie subalgebra h of g that is the fixed point set of some involutive automorphism will be referred to as a symmetrizing subalgebra. If, in addition, g is semisimple then p must be the orthogonal complement of h in g with respect to the (non-degenerate) Killing form, and thus every given symmetrizing subalgebra h uniquely determines p and hence θ. In this case, we shall refer to (g, h) as a symmetric pair. A symmetric semisimple Lie algebra (g, θ) is said to be diagonal if g = v ⊕ v for some semisimple Lie algebra v and θ(x, y) = (y, x) for all (x, y) ∈ g. A symmetric Lie algebra splits into symmetric subalgebras (g_i, θ_i)_{i∈I} if g = ⊕_{i∈I} g_i and the restrictions θ|_{g_i} = θ_i for all i ∈ I.
Lemma 2.2. Every symmetric semisimple Lie algebra (g, θ) splits into a diagonal symmetric Lie algebra (g_d, θ_d) and a collection of symmetric simple Lie subalgebras (g_i, θ_i)_{i∈I}.
A proof can be found in Chap. 8 of [26]. Lemma 2.2 allows for a complete classification of the symmetric semisimple Lie algebras; see [26,27]. It also follows that we have the following
Lemma 2.3. Let (g, θ) be a symmetric semisimple Lie algebra and let g = h ⊕ p be the associated symmetric decomposition of g. Then h is linearly generated by [p, p].
Proof. By virtue of Lemma 2.2, it suffices to prove this result on symmetric simple Lie algebras and on diagonal symmetric Lie algebras. Let us first assume that g is simple. The linear span of [p, p] defines a non-trivial ideal in h and span([p, p]) ⊕ p therefore defines a non-trivial ideal in g. Since g is simple, it immediately follows that span([p, p]) = h. Suppose now that (g, θ) is a diagonal symmetric Lie algebra, i.e.
that there exists a semisimple Lie algebra v such that g = v ⊕ v and θ(x, y) = (y, x) for all (x, y) ∈ g. In this case, we have a symmetric decomposition g = h ⊕ p, where h is the set of elements of the form (x, x) for all x ∈ v, whereas p is the set of elements of the form (x, −x) for all x ∈ v. We naturally have [p, p] ⊆ h. Now, as v is semisimple, it follows that for every x ∈ v, there exist y, z ∈ v such that x = [y, z]. Then for all (x, x) ∈ h, we have (x, x) = ([y, z], [y, z]) = [(y, y), (z, z)] = [(y, −y), (z, −z)]. But both (y, −y) and (z, −z) are in p.
3. Contractible QUEAs
3.1. Filtrations of the Universal Enveloping Algebra. Given a Lie algebra g over K, its universal enveloping algebra U(g) is defined as the quotient of the graded tensor algebra Tg = ⊕_{n≥0} g^{⊗n} by the two-sided ideal I(g) generated by the elements of the form x ⊗ y − y ⊗ x − [x, y], for all x, y ∈ g. This quotient constitutes a filtered K-algebra, i.e. there exists an increasing sequence
\[
  \{0\} \subset F_0(U(\mathfrak{g})) \subset \cdots \subset F_n(U(\mathfrak{g})) \subset \cdots \subset U(\mathfrak{g}),
  \tag{3.1}
\]
such that¹
\[
  U(\mathfrak{g}) = \bigcup_{n \ge 0} F_n(U(\mathfrak{g}))
  \quad\text{and}\quad
  F_n(U(\mathfrak{g})) \cdot F_m(U(\mathfrak{g})) \subset F_{n+m}(U(\mathfrak{g})).
  \tag{3.2}
\]
The elements of this sequence are, for all n ∈ N0,
\[
  F_n(U(\mathfrak{g})) = \bigoplus_{m=0}^{n} \mathfrak{g}^{\otimes m} / I(\mathfrak{g}).
  \tag{3.3}
\]
In particular, F_0(U(g)) = K and F_1(U(g)) = K ⊕ g. Let us identify g with its image under the canonical inclusion g → U(g), and further write x_1 ··· x_n for the equivalence class of x_1 ⊗ ··· ⊗ x_n. In this notation, F_n(U(g)) is linearly generated by elements that can be written as words composed of at most n symbols from g. We define the left action of g on g^{⊗n} by extending the adjoint action x ▷ x′ = [x, x′] of g on g as a derivation:
\[
  x \triangleright (x_1 \otimes \cdots \otimes x_n)
  = \sum_{i=1}^{n} x_1 \otimes \cdots \otimes [x, x_i] \otimes \cdots \otimes x_n \in \mathfrak{g}^{\otimes n},
  \tag{3.4}
\]
for all x, x_1, ..., x_n ∈ g. In this way we endow Tg with the structure of a left g-module. As the ideal I(g) is stable under this action, the F_n(U(g)) are also left g-modules. We therefore have a filtration of U(g) not only as a K-algebra, but also as a left g-module. We will also need such a filtration on (U(g))^{⊗2}. In fact, for all m ∈ N0, there is a K-algebra filtration on the universal envelope U(g^{⊕m}) of the Lie algebra g^{⊕m}, as defined above. If we endow g^{⊕m} with the structure of a left g-module according to
\[
  x \triangleright (x_1, \ldots, x_m) := ([x, x_1], \ldots, [x, x_m]),
  \tag{3.5}
\]
and extend this action to all of U(g^{⊕m}) as a derivation, then we have a filtration of U(g^{⊕m}) as a left g-module. But there is a natural isomorphism
\[
  \rho_m : U(\mathfrak{g}^{\oplus m}) \xrightarrow{\;\sim\;} (U(\mathfrak{g}))^{\otimes m}
  \tag{3.6}
\]
of K-algebras (see e.g. [25, Sect. 2.2]). This induces a left action of g on (U(g))^{⊗m} and a filtration of (U(g))^{⊗m} as a left g-module. We write the elements of this filtration as F_n((U(g))^{⊗m}). Given now any symmetric decomposition
\[
  \mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p},
  \tag{3.7}
\]
there is an associated bifiltration (F_{n,m}(U(g)))_{n,m∈N0} of U(g), i.e. a doubly increasing sequence
\[
  F_{n,m}(U(\mathfrak{g})) \subset F_{n+1,m}(U(\mathfrak{g}))
  \quad\text{and}\quad
  F_{n,m}(U(\mathfrak{g})) \subset F_{n,m+1}(U(\mathfrak{g})),
  \tag{3.8}
\]
such that
\[
  U(\mathfrak{g}) = \bigcup_{n,m \ge 0} F_{n,m}(U(\mathfrak{g}))
  \quad\text{and}\quad
  F_{n,m}(U(\mathfrak{g})) \cdot F_{k,l}(U(\mathfrak{g})) \subset F_{n+k,m+l}(U(\mathfrak{g})),
  \tag{3.9}
\]
¹ Although F_n(U(g)) · F_m(U(g)) is usually strictly contained in F_{n+m}(U(g)), it linearly generates the latter.
for all n, m, k, l ∈ N0. The elements of this sequence are, for all n, m ∈ N0,
\[
  F_{n,m}(U(\mathfrak{g})) = \bigoplus_{p=0}^{n} \bigoplus_{q=0}^{m}
  \mathrm{Sym}\bigl(\mathfrak{h}^{\otimes p} \otimes \mathfrak{p}^{\otimes q}\bigr) / I(\mathfrak{g}),
  \tag{3.10}
\]
where, for all n ∈ N0 and all K-submodules X_1, ..., X_n ⊂ g,
\[
  \mathrm{Sym}(X_1 \otimes \cdots \otimes X_n)
  = \bigoplus_{\sigma \in \Sigma_n} X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(n)}
  \tag{3.11}
\]
is the direct sum over all permutations of submodules in the tensor product. Each F_{n,m}(U(g)) is therefore the left h-module linearly generated by elements of U(g) that can be written as words containing at most n symbols in h and at most m symbols in p. In particular, F_{1,0}(U(g)) = K ⊕ h and F_{0,1}(U(g)) = K ⊕ p. We also have, for all m, n ∈ N0,
\[
  F_{n,m}(U(\mathfrak{g})) \subset F_{n+m}(U(\mathfrak{g}))
  \quad\text{and}\quad
  F_n(U(\mathfrak{g})) = \sum_{m=0}^{n} F_{n-m,m}(U(\mathfrak{g})).
  \tag{3.12}
\]
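The following small computation may help to see how the bifiltration interacts with the bracket relations. It is only a sketch under our own assumptions: we take g = so(3) with basis J_1, J_2, J_3, h = span{J_3} and p = span{J_1, J_2}, none of which is fixed by the text.

```latex
% Assumed example: g = so(3), [J_1,J_2] = J_3, [J_2,J_3] = J_1, [J_3,J_1] = J_2,
% with h = span{J_3} and p = span{J_1, J_2}.
% Words with two symbols from p lie in F_{0,2}:
\[
  J_1 J_2,\; J_2 J_1 \;\in\; F_{0,2}(U(\mathfrak{g})) \subset F_2(U(\mathfrak{g})).
\]
% Their difference is a single symbol from h:
\[
  J_1 J_2 - J_2 J_1 = [J_1, J_2] = J_3 \;\in\; F_{1,0}(U(\mathfrak{g})) \cap F_{0,2}(U(\mathfrak{g})).
\]
% So the pieces F_{n,m} overlap: [p,p] \subseteq h lets one symbol of h
% be traded for two symbols of p.
% The case n = 1 of the decomposition of F_n reads
\[
  F_1(U(\mathfrak{g})) = F_{1,0}(U(\mathfrak{g})) + F_{0,1}(U(\mathfrak{g}))
  = K \oplus \mathfrak{h} \oplus \mathfrak{p} = K \oplus \mathfrak{g}.
\]
```

The sum in the last line is not direct, since both summands contain the scalars K.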
In complete analogy with the F_n((U(g))^{⊗m}), we can construct bifiltrations F_{n,p}((U(g))^{⊗m}) of all the m-fold tensor products of U(g).
3.2. Symmetric tensors. Let S(g) be the graded algebra associated to the filtration of U(g) by setting, for all n ∈ N0,
\[
  S_n(\mathfrak{g}) = F_n(U(\mathfrak{g})) / F_{n-1}(U(\mathfrak{g}))
  \quad\text{and}\quad
  S(\mathfrak{g}) = \bigoplus_{n \ge 0} S_n(\mathfrak{g}).
  \tag{3.13}
\]
Since the F_n(U(g)) are left g-modules, so are the S_n(g). The symmetrization map, sym : S(g) → U(g), defined by
\[
  \mathrm{sym}(x_1 \cdots x_n) = \frac{1}{n!} \sum_{\sigma \in S_n} x_{\sigma(1)} \cdots x_{\sigma(n)}
  \tag{3.14}
\]
for all n ∈ N0 and all x_1, ..., x_n ∈ g, constitutes an isomorphism of left g-modules². The image of a given S_n(g) through sym is the g-module of symmetric tensors in g^{⊗n}. If now g = h ⊕ p is a symmetric decomposition, let
\[
  S_{m,n}(\mathfrak{g}) = F_{m,n}(U(\mathfrak{g})) / F_{m+n-1}(U(\mathfrak{g})),
  \tag{3.15}
\]
for all m, n ∈ N0. These obviously constitute left h-modules. As such, they are isomorphic to the left h-modules of symmetric tensors in the Sym(h^{⊗m} ⊗ p^{⊗n}), which are linearly generated by totally symmetric words with exactly m symbols in h and exactly n symbols in p. Note that these h-modules are mixed under the left p-action. Indeed, let m, n ∈ N0 be two non-negative integers and let x ∈ S_{m,n}(g). We have:
• if m > 0 and n = 0, then p ▷ x ∈ S_{m−1,n+1}(g);
• if m > 0 and n > 0, then p ▷ x ∈ S_{m+1,n−1}(g) ⊕ S_{m−1,n+1}(g);
• if m = 0 and n > 0, then p ▷ x ∈ S_{m+1,n−1}(g).
² Recall that we assume K has characteristic zero.
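The three cases above can be checked by hand in a small example. The computation below is a sketch under our own assumptions (g = so(3), h = span{J_3}, p = span{J_1, J_2}, with the action extended to products as a derivation); it is not taken from the paper.

```latex
% Assumed example: g = so(3), [J_1,J_2] = J_3, [J_2,J_3] = J_1, [J_3,J_1] = J_2,
% h = span{J_3}, p = span{J_1, J_2}; products are taken in S(g).
\begin{align*}
  % m = 1, n = 0: an element of p sends S_{1,0} to S_{0,1}
  J_1 \triangleright J_3 &= [J_1, J_3] = -J_2 \in S_{0,1}(\mathfrak{g}), \\
  % m = 0, n = 1: an element of p sends S_{0,1} to S_{1,0}
  J_2 \triangleright J_1 &= [J_2, J_1] = -J_3 \in S_{1,0}(\mathfrak{g}), \\
  % m = 1, n = 1: both neighbours appear
  J_2 \triangleright (J_3 J_1) &= [J_2, J_3]\, J_1 + J_3\, [J_2, J_1]
    = J_1 J_1 - J_3 J_3 \in S_{0,2}(\mathfrak{g}) \oplus S_{2,0}(\mathfrak{g}).
\end{align*}
```

In each line the total degree m + n is preserved, while one symbol moves between h and p, as the bullet list states.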
This is better represented by the following diagram in S_{m+n}(g).
[Diagram: the chain of h-modules ···, S_{m+1,n−1}(g), S_{m,n}(g), S_{m−1,n+1}(g), S_{m−2,n+2}(g), ···, in which the action of h maps each module to itself and the action of p maps each module to its two neighbours.]
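Using the action (3.5) of g on g^{⊕m} we have entirely analogous structures for g^{⊕m} with
\[
  S_{n,p}(\mathfrak{g}^{\oplus m}) = F_{n,p}(U(\mathfrak{g}^{\oplus m})) / F_{n+p-1}(U(\mathfrak{g}^{\oplus m})).
  \tag{3.16}
\]
In view of (3.6), it follows that
\[
  S_{n,p}(\mathfrak{g}^{\oplus m}) \cong F_{n,p}\bigl((U(\mathfrak{g}))^{\otimes m}\bigr) / F_{n+p-1}\bigl((U(\mathfrak{g}))^{\otimes m}\bigr)
  \tag{3.17}
\]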
for all n, p ∈ N0. We shall therefore identify each S_{n,p}(g^{⊕m}) with the left h-module of symmetric tensors on (U(g))^{⊗m} containing exactly n factors in h and p factors in p.
3.3. Symmetric invariants and the restriction property. For all n, p ∈ N0, let S_n(g ⊕ g)^g be the set of g-invariant elements of the left g-module S_n(g ⊕ g) and let S_{n,p}(g ⊕ g)^h denote the set of h-invariant elements of the left h-module S_{n,p}(g ⊕ g). We have the following two lemmas.
Lemma 3.1. Let n and p be positive integers. Every x ∈ S_{n−p,p}(g ⊕ g)^h such that p ▷ x ∈ S_{n−p+1,p−1}(g ⊕ g) is in the linear span of S_{n−p,0}(g ⊕ g)^g S_{0,p}(g ⊕ g)^h.
Proof. Let (h_i)_{i∈I} and (p_j)_{j∈J} be ordered bases of h ⊕ h and p ⊕ p respectively. Every element x ∈ S_{n−p,p}(g ⊕ g) can be written as
\[
  x = \sum_{\substack{i_1 \le \cdots \le i_{n-p} \\ j_1 \le \cdots \le j_p}}
  x_{i_1 \ldots i_{n-p} j_1 \ldots j_p}\, h_{i_1} \cdots h_{i_{n-p}}\, p_{j_1} \cdots p_{j_p},
\]
where, for all i_1, ..., i_{n−p} ∈ I and j_1, ..., j_p ∈ J, x_{i_1...i_{n−p} j_1...j_p} ∈ K. Then, omitting the ordered sums, we have
\[
  \mathfrak{p} \triangleright x = x_{i_1 \ldots i_{n-p} j_1 \ldots j_p}
  \Bigl[ \bigl(\mathfrak{p} \triangleright (h_{i_1} \cdots h_{i_{n-p}})\bigr)\, p_{j_1} \cdots p_{j_p}
  + h_{i_1} \cdots h_{i_{n-p}}\, \bigl(\mathfrak{p} \triangleright (p_{j_1} \cdots p_{j_p})\bigr) \Bigr].
\]
Since (p ▷ x) ∩ S_{n−p−1,p+1}(g ⊕ g) = {0}, we have
\[
  x_{i_1 \ldots i_{n-p} j_1 \ldots j_p}\; \mathfrak{p} \triangleright (h_{i_1} \cdots h_{i_{n-p}}) = 0,
\]
for all j_1 ≤ ··· ≤ j_p ∈ J; it follows that this quantity is also invariant under [p, p] and hence, by Lemma 2.3, under h. Thus it is actually g-invariant. Introduce a basis (y_k)_{k∈K} of the K-module S_{n−p,0}(g ⊕ g)^g, so that we can write
\[
  x_{i_1 \ldots i_{n-p} j_1 \ldots j_p}\, h_{i_1} \cdots h_{i_{n-p}} = \sum_{k \in K} b_{k j_1 \ldots j_p}\, y_k,
\]
with b_{k j_1...j_p} ∈ K, for all j_1 ≤ ··· ≤ j_p ∈ J. Now, as x is h-invariant, we also have
\[
  \mathfrak{h} \triangleright x = b_{k j_1 \ldots j_p}\, y_k\, \bigl(\mathfrak{h} \triangleright (p_{j_1} \cdots p_{j_p})\bigr) = 0.
\]
This yields h ▷ (b_{k j_1...j_p} p_{j_1} ··· p_{j_p}) = 0, for all k ∈ K. Introduce a basis (z_l)_{l∈L} for the K-module S_{0,p}(g ⊕ g)^h, so that we can write, for all k ∈ K,
\[
  b_{k j_1 \ldots j_p}\, p_{j_1} \cdots p_{j_p} = \sum_{l \in L} a_{kl}\, z_l,
\]
with a_{kl} ∈ K for all k ∈ K and l ∈ L. Now, x can be rewritten as
\[
  x = \sum_{k \in K} \sum_{l \in L} a_{kl}\, y_k z_l,
\]
with y_k ∈ S_{n−p,0}(g ⊕ g)^g for all k ∈ K and z_l ∈ S_{0,p}(g ⊕ g)^h for all l ∈ L.
Let us now restrict our attention to the class of symmetric Lie algebras encompassed by the following
Definition 3.2. We say that a symmetric semisimple Lie algebra (g, θ) with associated symmetric decomposition g = h ⊕ p is of restrictive type (or has the restriction property) if and only if for all p ∈ N0, the projection from g to p maps S_p(g ⊕ g)^g onto S_{0,p}(g ⊕ g)^h.
This restriction property will be sufficient to allow us to prove a refined version of Whitehead’s lemma in the next section. Note that it is similar to the so-called surjection property (namely that the restriction from g to p maps S(g)^g onto S(p)^h), which is known to hold for all classical symmetric Lie algebras [28] and which has proven useful in a number of contexts [29,30]. In our case we have, at least,
Lemma 3.3. If a symmetric semisimple Lie algebra splits (as in Lemma 2.2), in such a way that its simple factors are drawn only from the following classical families of simple symmetric Lie algebras:
AI_{n>2} : (su(n), so(n))_{n>2},
AII_n : (su(2n), sp(2n))_{n∈N∗},
BDI_{n>2,1} : (so(n + 1), so(n))_{n>2},
then it is of restrictive type.
Proof. See Appendix.
3.4. Contractible homomorphisms of K[[h]]-modules. Let K[[h]] denote the K-algebra of formal power series in h with coefficients in the field K and let U(g)[[h]] be the U(g)-algebra of formal power series in h with coefficients in U(g). We have a natural K-algebra monomorphism i : U(g) → U(g)[[h]]. There is also an epimorphism of K-algebras j : U(g)[[h]] ↠ U(g) such that j ◦ i = id on U(g). We shall therefore identify U(g) with its image i(U(g)) ⊂ U(g)[[h]]. We shall also consider complete K[[h]]-modules and it is assumed that the tensor products considered from now on are completed in the h-adic topology. In this subsection, we further assume that g = h ⊕ p is a symmetric decomposition.
Definition 3.4. Let p ∈ Z, m ∈ N0 be integers. An element x of (U(g))^{⊗m}[[h]] is (p, p)-contractible (the first entry being the integer p, the second the subspace p) if and only if there exists a collection (x_n)_{n∈N0} of elements of (U(g))^{⊗m} such that
\[
  x = \sum_{n \ge 0} h^n x_n
  \tag{3.18}
\]
and, for all n ∈ N0, there exists l(n) ∈ N0 such that x_n ∈ F_{l(n),n+p}((U(g))^{⊗m}). Similarly, a subset X ⊂ (U(g))^{⊗m}[[h]] is (p, p)-contractible if all its elements are, according to the previous definition. Note that for the sake of simplicity, we shall refer to (0, p)-contractible elements or sets as p-contractible. Let us now define the notion of contractibility for K[[h]]-module homomorphisms in Hom((U(g))^{⊗m}[[h]], (U(g))^{⊗n}[[h]]).
Definition 3.5. Let r, s ∈ N0 and p ∈ Z be integers. A homomorphism of K[[h]]-modules φ : (U(g))^{⊗r}[[h]] → (U(g))^{⊗s}[[h]] is p-contractible if and only if, for all n, m ∈ N0, φ(F_{n,m}((U(g))^{⊗r})) is (m, p)-contractible as a subset.
Let us emphasize that for every p-contractible K[[h]]-module homomorphism φ : (U(g))^{⊗r}[[h]] → (U(g))^{⊗s}[[h]], there exists a collection (ϕ_n)_{n∈N0} of K[[h]]-module homomorphisms ϕ_n : (U(g))^{⊗r}[[h]] → (U(g))^{⊗s}[[h]] such that
\[
  \varphi = \sum_{n \ge 0} h^n \phi_n
  \tag{3.19}
\]
(with \varphi standing for φ and \phi_n for ϕ_n) and, for all n, m, p ∈ N0, there exists l(n) ∈ N0 such that ϕ_n(F_{m,p}((U(g))^{⊗r})) ⊆ F_{l(n),n+p}((U(g))^{⊗s}). The following two lemmas will be useful in the next sections.
Lemma 3.6. Let φ and ψ be two p-contractible homomorphisms of K[[h]]-modules. Then the K[[h]]-module homomorphism φ ◦ ψ is p-contractible.
Proof. We have
\[
  \varphi = \sum_{n \ge 0} h^n \phi_n
  \quad\text{and}\quad
  \psi = \sum_{n \ge 0} h^n \psi_n,
\]
with, for all n, m, p ∈ N0, ϕ_n(F_{m,p}) ⊆ F_{∗,n+p} and ψ_n(F_{m,p}) ⊆ F_{∗,n+p}. For the sake of simplicity we shall omit the arguments of the bifiltration and denote by ∗ the integer l(n) whose existence is guaranteed by the definition of contractibility. We thus have
\[
  \varphi \circ \psi = \sum_{n \ge 0} \sum_{m \ge 0} h^{n+m}\, \phi_n \circ \psi_m
  = \sum_{n \ge 0} h^n \sum_{m=0}^{n} \phi_m \circ \psi_{n-m},
\]
with, for all l, m, n, p ∈ N0, ϕ_m ◦ ψ_{n−m}(F_{l,p}) ⊆ ϕ_m(F_{∗,n−m+p}) ⊆ F_{∗,n+p}.
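A simple instance of Definition 3.5 and of the composition rule of Lemma 3.6 can be written out explicitly. The maps below are our own illustrative choice, not taken from the paper: P and Q are assumed to be fixed elements of the subspace p (written \mathfrak{p} in the block), r = s = 1, and the maps are right multiplications extended K[[h]]-linearly.

```latex
% Assumed example: P, Q fixed elements of \mathfrak{p}; r = s = 1.
% Right multiplication by exp(hP) and exp(hQ):
\[
  \varphi(x) = x\, e^{hP} = \sum_{n \ge 0} h^n \phi_n(x), \qquad \phi_n(x) = \frac{x P^n}{n!},
  \qquad
  \psi(x) = x\, e^{hQ}, \qquad \psi_n(x) = \frac{x Q^n}{n!}.
\]
% Since P^n is a word with n symbols from \mathfrak{p}, P^n \in F_{0,n}, and by (3.9)
\[
  \phi_n(F_{m,q}) \subseteq F_{m,q} \cdot F_{0,n} \subseteq F_{m,q+n},
\]
% so \varphi (and likewise \psi) is \mathfrak{p}-contractible.
% For the composite, the order-n coefficient is the Cauchy product
\[
  (\varphi \circ \psi)(x) = x\, e^{hQ} e^{hP}
  = \sum_{n \ge 0} h^n \sum_{k=0}^{n} \frac{x\, Q^{\,n-k} P^{\,k}}{(n-k)!\, k!}
  = \sum_{n \ge 0} h^n \sum_{k=0}^{n} \phi_k\bigl(\psi_{n-k}(x)\bigr),
\]
% and each term x Q^{n-k} P^k lies in F_{m,q+n} whenever x \in F_{m,q}.
```

The point of the sketch is that the number of symbols from p added at order h^n is n in total, however it is split between the two factors; this is the bookkeeping used in the proof above.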
The following holds for the inverse.
Lemma 3.7. Let φ be a p-contractible homomorphism of K[[h]]-modules, congruent with id mod h. Then the K[[h]]-module homomorphism φ^{−1} = id mod h is p-contractible.
Proof. We shall construct
\[
  \varphi^{-1} = \sum_{n \ge 0} h^n \phi_n,
\]
(with \varphi standing for φ and \phi_n for ϕ_n) by recursion on the order in h, by demanding that φ ◦ φ^{−1} = id. At leading order, we have ϕ_0 = id and therefore ϕ_0(F_{m,p}) ⊆ F_{m,p}, for all m, p ∈ N0. Let us assume that we have a polynomial φ_n^{−1} of degree n ≥ 0 such that
\[
  \varphi \circ \varphi_n^{-1} - \mathrm{id} = h^{n+1} q \mod h^{n+2}.
\]
Assuming that φ_n^{−1} is p-contractible, we have by Lemma 3.6 that φ ◦ φ_n^{−1} is p-contractible, as φ is p-contractible by assumption. Therefore, q(F_{m,p}) ⊆ F_{∗,n+1+p}. Now, to complete the recursion, we have to find ϕ_{n+1} such that
\[
  \varphi \circ \bigl( \varphi_n^{-1} + h^{n+1} \phi_{n+1} \bigr) - \mathrm{id} = 0 \mod h^{n+2}.
\]
This is achieved by taking ϕ_{n+1} = −q. We thus have ϕ_{n+1}(F_{m,p}) ⊆ F_{∗,n+1+p}.
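In the simplest case the recursion of Lemma 3.7 can be solved in closed form. The sketch below is our own illustration: we assume that φ has only a first-order term, φ = id + hϕ_1, where ϕ_1 is a K-linear map (extended K[[h]]-linearly) that adds at most one symbol from the subspace p, written \mathfrak{p} in the block.

```latex
% Assumed example: \varphi = id + h \phi_1 with \phi_1(F_{m,q}) \subseteq F_{*,q+1}.
% The inverse is the formal geometric (Neumann) series
\[
  \varphi^{-1} = \sum_{n \ge 0} (-1)^n h^n \phi_1^{\,n},
\]
% as one checks by a telescoping sum:
\[
  (\mathrm{id} + h \phi_1) \circ \sum_{n \ge 0} (-1)^n h^n \phi_1^{\,n}
  = \sum_{n \ge 0} (-1)^n h^n \phi_1^{\,n} - \sum_{n \ge 1} (-1)^{n} h^{n} \phi_1^{\,n}
  = \mathrm{id}.
\]
% By induction on n, the order-n coefficient satisfies
\[
  \phi_1^{\,n}(F_{m,q}) \subseteq F_{*,q+n},
\]
% so \varphi^{-1} is again \mathfrak{p}-contractible.
% Concrete instance: \phi_1(x) = xP with P \in \mathfrak{p} gives
% \varphi^{-1}(x) = x (1 + hP)^{-1} = \sum_{n \ge 0} (-h)^n x P^n.
```

This matches the recursive step of the proof: at each order the new coefficient is minus the obstruction, and it carries exactly one more symbol from p than the previous one.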
Finally, when φ is not only a K[[h]]-module homomorphism but also a K[[h]]-algebra homomorphism, we have the following useful lemma.
Lemma 3.8. Let φ : (U(g))^{⊗s}[[h]] → (U(g))^{⊗t}[[h]] be a homomorphism of K[[h]]-algebras. It is p-contractible if and only if φ(F_{1,0}((U(g))^{⊗s})) is (0, p)-contractible and φ(F_{0,1}((U(g))^{⊗s})) is (1, p)-contractible.
Proof. If φ is p-contractible, it follows from the definition that, in particular, φ(F_{1,0}) is (0, p)-contractible and φ(F_{0,1}) is (1, p)-contractible. Now, assuming that φ(F_{1,0}) is (0, p)-contractible and φ(F_{0,1}) is (1, p)-contractible, we want to prove that, for all m, p ∈ N0, φ(F_{m,p}) is (p, p)-contractible. We proceed by recursion on m and p. We have assumed the result for m = 1 and p = 0, as well as for m = 0 and p = 1. Suppose that, for some m, p ∈ N0, we have proven that, for all m′ < m, p′ < p and n ∈ N0, there exists l ∈ N0 such that ϕ_n(F_{m′,p′}) ⊆ F_{l,n+p′}. Then, for all n ∈ N0,
\begin{align*}
  \phi_n\bigl(F_{m,p+1}((U(\mathfrak{g}))^{\otimes s})\bigr)
  &= \phi_n\Bigl( \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}\; F_{k,l} \cdot F_{0,1} \cdot F_{m-k,p-l} \Bigr) \\
  &= \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span} \sum_{\sigma \in C_3(n)}
     \phi_{\sigma_1}(F_{k,l}) \cdot \phi_{\sigma_2}(F_{0,1}) \cdot \phi_{\sigma_3}(F_{m-k,p-l}) \\
  &\subseteq \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}_{\sigma \in C_3(n)}\;
     F_{*,\sigma_1+l} \cdot F_{*,\sigma_2+1} \cdot F_{*,\sigma_3+p-l} \\
  &= F_{*,n+p+1},
\end{align*}
where, for all X ⊆ (U(g))^{⊗s}, span X denotes the K-module linearly generated by X and C_3(n) is the set {σ = (σ_1, σ_2, σ_3) ∈ N0^3 : σ_1 + σ_2 + σ_3 = n} of weak 3-compositions of n. Similarly, we have
\begin{align*}
  \phi_n\bigl(F_{m+1,p}((U(\mathfrak{g}))^{\otimes s})\bigr)
  &= \phi_n\Bigl( \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}\; F_{k,l} \cdot F_{1,0} \cdot F_{m-k,p-l} \Bigr) \\
  &= \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span} \sum_{\sigma \in C_3(n)}
     \phi_{\sigma_1}(F_{k,l}) \cdot \phi_{\sigma_2}(F_{1,0}) \cdot \phi_{\sigma_3}(F_{m-k,p-l}) \\
  &\subseteq \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}_{\sigma \in C_3(n)}\;
     F_{*,\sigma_1+l} \cdot F_{*,\sigma_2} \cdot F_{*,\sigma_3+p-l}
   = F_{*,n+p},
\end{align*}
for all n ∈ N0.
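Lemma 3.8 reduces contractibility of an algebra homomorphism to a check on generators. As a hedged illustration (our own example, not from the paper), take s = t = 1 and let φ be conjugation by exp(hP) for an assumed fixed element P of the subspace p, written \mathfrak{p} below; \mathfrak{h} denotes the subalgebra h.

```latex
% Assumed example: P \in \mathfrak{p} fixed, \varphi a K[[h]]-algebra automorphism of U(g)[[h]].
\[
  \varphi(x) = e^{hP}\, x\, e^{-hP} = \sum_{n \ge 0} h^n \phi_n(x),
  \qquad \phi_n(x) = \frac{1}{n!}\, \mathrm{ad}_P^{\,n}(x), \quad \mathrm{ad}_P(x) = [P, x].
\]
% On generators, [\mathfrak{p},\mathfrak{h}] \subseteq \mathfrak{p} and
% [\mathfrak{p},\mathfrak{p}] \subseteq \mathfrak{h} give \mathrm{ad}_P^{\,n}(x) \in \mathfrak{g}
% for every x \in \mathfrak{g}.
% For x \in F_{1,0} = K \oplus \mathfrak{h}:
\[
  \phi_0(x) = x \in F_{1,0}, \qquad
  \phi_n(x) \in \mathfrak{g} \subset F_{1,1} \subseteq F_{1,n} \quad (n \ge 1),
\]
% so \varphi(F_{1,0}) is (0,\mathfrak{p})-contractible.
% For y \in F_{0,1} = K \oplus \mathfrak{p}:
\[
  \phi_n(y) \in K \oplus \mathfrak{g} \subseteq F_{1,1} \subseteq F_{1,n+1} \quad (n \ge 0),
\]
% so \varphi(F_{0,1}) is (1,\mathfrak{p})-contractible.
```

Under these assumptions Lemma 3.8 then gives that φ is p-contractible on all of U(g)[[h]], without having to track ad_P on arbitrary words.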
3.5. Contractible deformation Hopf algebras. We recall that U(g) possesses a natural cocommutative Hopf algebra structure, whose coproduct is the algebra homomorphism Δ_0 : U(g) → U(g) ⊗ U(g) defined by Δ_0(x) = x ⊗ 1 + 1 ⊗ x for all x ∈ g, and whose counit and antipode are specified by ε_0(1) = 1 and S_0(1) = 1. We refer to this as the undeformed Hopf algebra structure. Given the notion of contractibility introduced in the preceding subsections, it is natural to specialize the usual notion of a quantization (i.e. a deformation) of a universal enveloping algebra, as follows.
Definition 3.9. Let (g, θ) be a symmetric Lie algebra, with symmetric decomposition g = h ⊕ p. A p-contractible deformation (U_h(g), ·_h, Δ_h, ε_h, S_h) of the Hopf algebra (U(g), ·, Δ_0, ε_0, S_0) is a topological Hopf algebra such that
• there exists a K[[h]]-module isomorphism η : U_h(g) → U(g)[[h]];
• μ_h := η ◦ (·_h) ◦ (η^{−1} ⊗ η^{−1}) = · mod h and μ_h is p-contractible;
• Δ̃_h := (η ⊗ η) ◦ Δ_h ◦ η^{−1} = Δ_0 mod h and Δ̃_h is p-contractible;
• S̃_h := η ◦ S_h ◦ η^{−1} = S_0 mod h and S̃_h is p-contractible;
• ε̃_h := ε_h ◦ η^{−1} = ε_0 mod h and ε̃_h is p-contractible.
This definition can be naturally restricted to bialgebras and algebras.
4. On the Cohomology of Associative and Lie Algebras
4.1. The Hochschild cohomology. Let A be a K-algebra. For any (A, A)-bimodule (M, ▷, ◁) and all n ∈ N0∗, we define the (A, A)-bimodule of n-cochains C^n(A, M) = Hom(A^{⊗n}, M). We also set C^0(A, M) = M. To each cochain module C^n(A, M), we associate a coboundary operator, i.e. a derivation operator δ_n : C^n(A, M) → C^{n+1}(A, M), by setting, for all f ∈ C^n(A, M),
\begin{align*}
  \delta_n f(x_1, \ldots, x_{n+1})
  &= x_1 \triangleright f(x_2, \ldots, x_{n+1})
   + \sum_{i=1}^{n} (-1)^i f(x_1, \ldots, x_i x_{i+1}, \ldots, x_{n+1}) \\
  &\quad + (-1)^{n+1} f(x_1, \ldots, x_n) \triangleleft x_{n+1}
  \tag{4.1}
\end{align*}
for all x_1, ..., x_{n+1} ∈ A. One can check that δ_{n+1} ◦ δ_n = 0 for all n. Therefore, the (C^n, δ_n) thus defined constitute a cochain complex. It is known as the Hochschild or standard complex [31,32]; see also [33] or [34]. An element of the (A, A)-bimodule Z^n(A, M) = ker δ_n ⊂ C^n(A, M) is called an n-cocycle, while an element of the (A, A)-bimodule B^n(A, M) = im δ_{n−1} ⊂ C^n(A, M) is called an n-coboundary. As usual, the quotient
\[
  HH^n(A, M) = Z^n(A, M) / B^n(A, M)
  \tag{4.2}
\]
defines the n-th cohomology module of A with coefficients in M. In the next section, we shall be particularly interested in the Hochschild cohomology of the universal enveloping algebra of a given Lie algebra g, i.e. A = U(g), with coefficients in M = U(g). The latter trivially constitutes a (U(g), U(g))-bimodule with the multiplication · of U(g) as left and right U(g)-action. Concerning the Hochschild cohomology we will need the following result; see for example Theorem 6.1.8 in [2].
Lemma 4.1. Let g be a semisimple Lie algebra over K. Then, HH^2(U(g), U(g)) = 0.
4.2. The Chevalley-Eilenberg cohomology. Let g be a Lie algebra over K and (M, ▷) a left g-module. For all n ∈ N0∗, we define the left g-module of n-cochains C^n(g, M) = Hom(∧^n g, M), with left g-action
\[
  (x \triangleright f)(x_1, \ldots, x_n)
  = x \triangleright \bigl( f(x_1, \ldots, x_n) \bigr)
  - \sum_{i=1}^{n} f(x_1, \ldots, [x, x_i], \ldots, x_n),
  \tag{4.3}
\]
for all f ∈ C^n(g, M) and all x, x_1, ..., x_n ∈ g. We also set C^0(g, M) = M with its natural left g-module structure. To each cochain module C^n(g, M), we associate a coboundary operator, i.e. a derivation operator d_n : C^n(g, M) → C^{n+1}(g, M), by setting, for all f ∈ C^n(g, M),
\begin{align*}
  d_n f(x_1, \ldots, x_{n+1})
  &= \sum_{i=1}^{n+1} (-1)^{i+1}\, x_i \triangleright f(x_1, \ldots, \hat{x}_i, \ldots, x_{n+1}) \\
  &\quad + \sum_{1 \le i < j \le n+1} (-1)^{i+j}
     f\bigl([x_i, x_j], x_1, \ldots, \hat{x}_i, \ldots, \hat{x}_j, \ldots, x_{n+1}\bigr)
  \tag{4.4}
\end{align*}
for all x_1, ..., x_{n+1} ∈ g. In (4.4), hatted quantities are omitted and ▷ denotes the left g-action on M. One can check that d_{n+1} ◦ d_n = 0 for all n. Therefore, the (C^n, d_n) thus defined constitute a cochain complex. It is known as the Chevalley-Eilenberg complex [35]; see also [33] or [34]. An element of Z^n(g, M) = ker d_n ⊂ C^n(g, M) is called an n-cocycle, while an element of B^n(g, M) = im d_{n−1} ⊂ C^n(g, M) is called an n-coboundary. As usual, the quotient
\[
  H^n(\mathfrak{g}, M) = Z^n(\mathfrak{g}, M) / B^n(\mathfrak{g}, M)
  \tag{4.5}
\]
defines the n-th cohomology module of g with coefficients in M. One can check that, for all n ∈ N0, Z^n(g, M), B^n(g, M) and H^n(g, M) naturally inherit the left g-module structure of C^n(g, M), as for all n ∈ N0,
\[
  d(x \triangleright f) = x \triangleright d f,
  \tag{4.6}
\]
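for all $f \in C^n(\mathfrak{g}, M)$ and all $x \in \mathfrak{g}$. An important result about the Chevalley-Eilenberg cohomology of Lie algebras concerns finite-dimensional complex semisimple Lie algebras. It is known as Whitehead's Lemma.

Lemma 4.2. Let $\mathfrak{g}$ be a semisimple Lie algebra over $\mathbb{K}$. If $M$ is any finite-dimensional left $\mathfrak{g}$-module, then $H^1(\mathfrak{g}, M) = H^2(\mathfrak{g}, M) = 0$.

A proof of this result can be found, for instance, in Sect. 7.8 of [34].

4.3. Contractible Chevalley-Eilenberg cohomology. In the next section, we will be mostly interested in the module $M = U(\mathfrak{g}) \otimes U(\mathfrak{g})$, with the left $\mathfrak{g}$-action induced by (3.5) and (3.6), i.e.
$$g \triangleright x = [\Delta_0(g), x], \tag{4.7}$$
for all $g \in \mathfrak{g}$ and all $x \in U(\mathfrak{g}) \otimes U(\mathfrak{g})$. In particular, we shall need a refinement of Whitehead's Lemma, in the case of symmetric semisimple Lie algebras of restrictive type, taking into account the possible $\mathfrak{p}$-contractibility of the generating cocycles of $Z^*(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$. For all $m, n \in \mathbb{N}_0$, we therefore define $C^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ as the set of $(m, \mathfrak{p})$-contractible $n$-cochains, by which we mean the set of $n$-cochains $f \in C^n(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ such that, for all $0 \le p \le n$,
$$f\big((\wedge^{n-p}\mathfrak{h}) \wedge (\wedge^{p}\mathfrak{p})\big) \subseteq F_{l,m+p}(U(\mathfrak{g}) \otimes U(\mathfrak{g})),$$
for some $l \in \mathbb{N}_0$. Defining similarly
$$Z^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) = \ker d_n \cap C^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) \quad\text{and}\quad B^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) = d_{n-1}\, C^{n-1}_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$$
as the modules of the $(m, \mathfrak{p})$-contractible $n$-cocycles and of the $n$-coboundaries of $(m, \mathfrak{p})$-contractible $(n-1)$-cochains, respectively, we can define the $n$th $(m, \mathfrak{p})$-contractible cohomology module as
$$H^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) = Z^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))/B^n_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})). \tag{4.8}$$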
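It is worth emphasizing that these cohomology modules generally differ from the usual ones $H^n(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$. Consider for instance a case for which $H^1(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) = 0$. We have that every 1-cocycle in $Z^1(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$, and therefore every cocycle $f \in Z^1_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$, is the coboundary of an element $x \in U(\mathfrak{g}) \otimes U(\mathfrak{g})$. However, although the considered $f$ is $(m, \mathfrak{p})$-contractible, it may be that it can only be obtained as the coboundary of an element $x \in U(\mathfrak{g}) \otimes U(\mathfrak{g})$ that does not belong to any $F_{*,m}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$, thus yielding a non-trivial cohomology class in $H^1_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$. When $\mathfrak{g}$ is a symmetric semisimple Lie algebra of restrictive type, we nonetheless establish the following lemma concerning the first $(m, \mathfrak{p})$-contractible cohomology module $H^1_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$.

Lemma 4.3. Let $(\mathfrak{g}, \theta)$ be a symmetric semisimple Lie algebra of restrictive type over $\mathbb{K}$ and let $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ be the associated symmetric decomposition of $\mathfrak{g}$. We have $H^1_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) = 0$, for all $m \in \mathbb{N}_0$.

Proof. Let $m \in \mathbb{N}_0$ be a non-negative integer. We have to prove that every $(m, \mathfrak{p})$-contractible 1-cocycle $f \in Z^1_{m,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ is the coboundary of an element $\alpha \in F_{l,m}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$, for some $l \in \mathbb{N}_0$. From Lemma 4.2, there exists an $x \in U(\mathfrak{g}) \otimes U(\mathfrak{g})$ such that $f = d_0 x$. All we have to prove is that we can always find a left $\mathfrak{g}$-invariant $y \in (U(\mathfrak{g}) \otimes U(\mathfrak{g}))^{\mathfrak{g}}$ such that $x = y$ modulo $F_{l,m}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ for some $l \in \mathbb{N}_0$. Then, we can check that for $\alpha = x - y \in F_{l,m}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$, we have $d_0 \alpha = d_0(x - y) = d_0 x = f$, since $d_0 y = 0$ for a $\mathfrak{g}$-invariant $y$.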
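In view of (3.17), we can first expand $x$ into its components in the left $\mathfrak{g}$-modules isomorphic to the $S_n(\mathfrak{g} \oplus \mathfrak{g})$, for all $n \in \mathbb{N}_0$. Up to the isomorphism of left $\mathfrak{g}$-modules, which we shall omit here, we have $x = \sum_{n \ge 0} x_n$ where, for all $n \in \mathbb{N}_0$, $x_n \in S_n(\mathfrak{g} \oplus \mathfrak{g})$. Similarly, we can further decompose each $S_n(\mathfrak{g} \oplus \mathfrak{g})$ into the left $\mathfrak{h}$-modules $S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})$, with $0 \le p \le n$, and, accordingly, each $x_n$. We are now going to construct the desired $y \in (U(\mathfrak{g}) \otimes U(\mathfrak{g}))^{\mathfrak{g}}$ by recursion, submodule by submodule. If $x_n = 0$ for all $n > m$, we can set $y = 0$ and we are done. So, suppose that there exists an $n > m$ such that $x_n \neq 0$ and let $x_{0,n}$ be the component of $x_n$ in $S_{0,n}(\mathfrak{g} \oplus \mathfrak{g})$. If $x_{0,n}$ vanishes, we can skip to the component of $x_n$ in $S_{1,n-1}(\mathfrak{g} \oplus \mathfrak{g})$. Otherwise, we are going to prove that there exists a $\mathfrak{g}$-invariant $y_{n,0} \in S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ such that the component of $x_n - y_{n,0}$ in $S_{0,n}(\mathfrak{g} \oplus \mathfrak{g})$ vanishes. From $f$ being $(m, \mathfrak{p})$-contractible, we know that
$$f(\mathfrak{h}) = d_0 x(\mathfrak{h}) = \mathfrak{h} \triangleright \Big( x_n + \sum_{n' \neq n} x_{n'} \Big) \subseteq F_{l,m}(U(\mathfrak{g}) \otimes U(\mathfrak{g})), \tag{4.9}$$
for some $l \in \mathbb{N}_0$. Therefore, since the $S_{m,p}(\mathfrak{g} \oplus \mathfrak{g})$ are left $\mathfrak{h}$-modules, we have $\mathfrak{h} \triangleright x_{0,n} = 0$. Since $\mathfrak{g}$ has the restriction property, Definition 3.2, it follows that the $\mathfrak{h}$-invariant tensor $x_{0,n} \in S_{0,n}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$ is the restriction to $\mathfrak{p}$ of a $\mathfrak{g}$-invariant tensor $y_{n,0} \in S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$. Now consider $x_n - y_{n,0}$. By construction, it has no component in $S_{0,n}(\mathfrak{g} \oplus \mathfrak{g})$. If $n - 1 \le m$, we set $y_n = y_{n,0}$ and skip to another $\mathfrak{g}$-module $S_{n' > m}(\mathfrak{g} \oplus \mathfrak{g})$, where $x$ has a non-vanishing component, if any. Otherwise, let $0 \le k < n - m$ and assume that we have found $y_{n,k} \in S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ such that $x_n - y_{n,k}$ has vanishing component in all the $S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})$ with $p \ge n - k > m$. We are going to prove that there exists $y_{n,k+1} \in S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ such that $x_n - y_{n,k+1}$ has vanishing component in all the $S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})$ with $p \ge n - k - 1$. To do so, let $x_{k+1,n-k-1}$ be the component of $x_n - y_{n,k}$ in $S_{k+1,n-k-1}(\mathfrak{g} \oplus \mathfrak{g})$. If it is zero, we set $y_{n,k+1} = y_{n,k}$. Otherwise, note that from (4.9), we have $\mathfrak{h} \triangleright x_{k+1,n-k-1} = 0$. But the $(m, \mathfrak{p})$-contractibility of $f$ also implies that
$$f(\mathfrak{p}) = d_0 x(\mathfrak{p}) = \mathfrak{p} \triangleright \Big( x_n - y_{n,k} + \sum_{n' \neq n} x_{n'} \Big) \subseteq F_{l,m+1}(U(\mathfrak{g}) \otimes U(\mathfrak{g})),$$
from which it follows that $\mathfrak{p} \triangleright x_{k+1,n-k-1} \in S_{k+2,n-k-2}(\mathfrak{g} \oplus \mathfrak{g})$. According to Lemma 3.1, we can write $x_{k+1,n-k-1} = \sum_{i,j} a_{ij} w_i z_j$, with $a_{ij} \in \mathbb{K}$, $w_i \in S_{k+1,0}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ and $z_j \in S_{0,n-k-1}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$. Since $\mathfrak{g}$ has the restriction property, all the $z_j$ are the restrictions to $\mathfrak{p}$ of $\mathfrak{g}$-invariant elements $\zeta_j \in S_{n-k-1}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$. Now, set $y_{n,k+1} = y_{n,k} + \sum_{i,j} a_{ij} w_i \zeta_j$. It is obvious that $y_{n,k+1} \in S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ and, by construction, $x_n - y_{n,k+1}$ has no component in any of the $S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})$ with $p \ge n - k - 1$. The recursion goes on until we have $y_{n,n-m} \in S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ such that $x_n - y_{n,n-m}$ has vanishing components in all the $S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})$ with $p > m$. We therefore set $y_n = y_{n,n-m}$. By repeating this a finite number of times,³ in all the $S_{n' > m}(\mathfrak{g} \oplus \mathfrak{g})$ in which $x$ has non-vanishing components, we obtain the desired $y = \sum_{n \ge 0} y_n$.

3 It is rather obvious that $x$ has non-vanishing components in a finite number of submodules $S_n(\mathfrak{g} \oplus \mathfrak{g})$, as there always exists an $l \in \mathbb{N}$ such that $x \in F_l(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$.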
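5. Rigidity Theorems

5.1. Contractible algebra isomorphisms.

Proposition 5.1. Let $\mathfrak{g}$ be a semisimple Lie algebra over $\mathbb{K}$ and let $\mathfrak{h}$ be a symmetrizing Lie subalgebra with orthogonal complement $\mathfrak{p}$ in $\mathfrak{g}$. Then, for every $\mathfrak{p}$-contractible deformation algebra $(U_h(\mathfrak{g}), \cdot_h)$ of $(U(\mathfrak{g}), \cdot)$, there exists a $\mathfrak{p}$-contractible isomorphism of $\mathbb{K}[[h]]$-algebras $(U_h(\mathfrak{g}), \cdot_h) \xrightarrow{\sim} (U(\mathfrak{g})[[h]], \cdot)$ that is congruent with $\mathrm{id}$ mod $h$.

Proof. By definition, there exists a $\mathbb{K}[[h]]$-module isomorphism $\eta : U_h(\mathfrak{g}) \xrightarrow{\sim} U(\mathfrak{g})[[h]]$. The latter defines a $\mathbb{K}[[h]]$-algebra isomorphism between $(U_h(\mathfrak{g}), \cdot_h)$ and $(U(\mathfrak{g})[[h]], \mu_h)$, where $\mu_h := \eta \circ (\cdot_h) \circ (\eta^{-1} \otimes \eta^{-1}) = \cdot \mod h$. If we found a $\mathfrak{p}$-contractible $\mathbb{K}[[h]]$-algebra automorphism
$$\phi : (U(\mathfrak{g})[[h]], \mu_h) \xrightarrow{\sim} (U(\mathfrak{g})[[h]], \cdot), \tag{5.1}$$
we would prove the proposition, as $\phi \circ \eta$ would constitute the desired $\mathbb{K}[[h]]$-algebra isomorphism from $(U_h(\mathfrak{g}), \cdot_h)$ to $(U(\mathfrak{g})[[h]], \cdot)$. Let $\phi$ be a $\mathbb{K}[[h]]$-module automorphism on $U(\mathfrak{g})[[h]]$. The condition for such an automorphism to be the $\mathbb{K}[[h]]$-algebra automorphism (5.1) is
$$\mu_h = \phi^{-1} \circ (\cdot) \circ (\phi \otimes \phi). \tag{5.2}$$
Let us construct
$$\phi = \sum_{n \ge 0} h^n \varphi_n, \tag{5.3}$$
order by order in $h$. At leading order, we have $\mu_0 = \cdot$ and we can take $\varphi_0 = \mathrm{id} \in \mathrm{Hom}(U(\mathfrak{g})[[h]], U(\mathfrak{g})[[h]])$. We thus have $\varphi_0(F_{m,p}(U(\mathfrak{g}))) \subseteq F_{m,p}(U(\mathfrak{g}))$, for all $m, p \in \mathbb{N}_0$. Suppose now that we have found a polynomial of degree $n \ge 0$,
$$\phi_n = \sum_{m=0}^{n} h^m \varphi_m, \tag{5.4}$$
such that
$$\mu_h - \phi_n^{-1} \circ (\cdot) \circ (\phi_n \otimes \phi_n) = h^{n+1} r \mod h^{n+2}, \tag{5.5}$$
where $\phi_n^{-1}$ denotes the exact inverse series of $\phi_n$ defined by $\phi_n \circ \phi_n^{-1} = \mathrm{id}$ and $r \in \mathrm{Hom}(U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]], U(\mathfrak{g})[[h]])$. We assume that $\phi_n$ is $\mathfrak{p}$-contractible. Therefore, $(\cdot) \circ (\phi_n \otimes \phi_n)$ is $\mathfrak{p}$-contractible. By Lemma 3.7, $\phi_n^{-1}$ is $\mathfrak{p}$-contractible and, by Lemma 3.6, $\phi_n^{-1} \circ (\cdot) \circ (\phi_n \otimes \phi_n)$ is $\mathfrak{p}$-contractible. By definition of a $\mathfrak{p}$-contractible deformation algebra, we know that $\mu_h$ is $\mathfrak{p}$-contractible. It therefore follows from (5.5) at order $h^{n+1}$ that $r(F_{m,p}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))) \subseteq F_{*,n+1+p}(U(\mathfrak{g}))$, for all $m, p \in \mathbb{N}_0$. From the associativity of $\mu_h$, we deduce that $r$ is a 2-cocycle in the Hochschild complex,
$$\delta_2 r = 0. \tag{5.6}$$
As $\mathfrak{g}$ is semisimple, it follows from Lemma 4.1 that its second Hochschild cohomology module $HH^2(U(\mathfrak{g}), U(\mathfrak{g}))$ vanishes, so that $r$ is a coboundary. We thus have $r = \delta_1 \beta$, for some $\beta \in \mathrm{Hom}(U(\mathfrak{g})[[h]], U(\mathfrak{g})[[h]])$. But we know that, in particular, $r(F_{2,0}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))) \subseteq F_{*,n+1}(U(\mathfrak{g}))$ and $r(F_{1,1}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))) \subseteq F_{*,n+2}(U(\mathfrak{g}))$. It follows that $\beta$ can be consistently chosen so that $\beta(F_{1,0}(U(\mathfrak{g}))) \subseteq F_{*,n+1}(U(\mathfrak{g}))$ and $\beta(F_{0,1}(U(\mathfrak{g}))) \subseteq F_{*,n+2}(U(\mathfrak{g}))$. To complete the recursion, we have to solve
$$\mu_h = \big(\phi_n^{-1} - h^{n+1} \varphi_{n+1} \bmod h^{n+2}\big) \circ \Big( \big(\phi_n + h^{n+1} \varphi_{n+1}\big) \cdot \big(\phi_n + h^{n+1} \varphi_{n+1}\big) \Big) \mod h^{n+2},$$
that is
$$\delta_1 \varphi_{n+1} = r. \tag{5.7}$$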
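As a minimal sanity check of the Hochschild bookkeeping used in this step, the following Python sketch verifies on a toy algebra that every 1-coboundary is a 2-cocycle, i.e. that $\delta_2 \delta_1 \beta = 0$. Everything in it is an assumption made for illustration: the algebra is that of $2 \times 2$ integer matrices rather than $U(\mathfrak{g})$, the map `transpose` merely stands in for a generic linear $\beta$, and the sign conventions chosen for `delta1` and `delta2` are the standard ones, which the text above does not spell out.

```python
# Toy algebra (an assumption for illustration): 2x2 integer matrices,
# stored as tuples of rows. This is NOT U(g); it only illustrates the
# Hochschild identities behind (5.6) and (5.7).

def mul(a, b):
    """Matrix product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def lin(*terms):
    """Signed sum of matrices: lin((+1, A), (-1, B), ...)."""
    return tuple(tuple(sum(s * m[i][j] for s, m in terms) for j in range(2))
                 for i in range(2))

def delta1(beta):
    """(delta_1 beta)(a, b) = a beta(b) - beta(ab) + beta(a) b (assumed convention)."""
    return lambda a, b: lin((1, mul(a, beta(b))),
                            (-1, beta(mul(a, b))),
                            (1, mul(beta(a), b)))

def delta2(r):
    """(delta_2 r)(a, b, c) = a r(b,c) - r(ab,c) + r(a,bc) - r(a,b) c (assumed convention)."""
    return lambda a, b, c: lin((1, mul(a, r(b, c))),
                               (-1, r(mul(a, b), c)),
                               (1, r(a, mul(b, c))),
                               (-1, mul(r(a, b), c)))

def transpose(a):
    """A linear map that is not a derivation, used as a sample 1-cochain."""
    return tuple(zip(*a))

A = ((1, 2), (3, 4))
B = ((0, 1), (5, -2))
C = ((2, 0), (-1, 3))
ZERO = ((0, 0), (0, 0))

r = delta1(transpose)                 # a 1-coboundary
print(r(A, B) != ZERO)                # the coboundary itself is non-trivial
print(delta2(r)(A, B, C) == ZERO)     # but it is always a 2-cocycle
```

With these conventions, replacing $\phi_n$ by $\phi_n + h^{n+1}\varphi_{n+1}$ shifts $\phi_n^{-1} \circ (\cdot) \circ (\phi_n \otimes \phi_n)$ by $h^{n+1}\delta_1\varphi_{n+1}$ at order $h^{n+1}$, which is how the correction is matched against $r$.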
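Equation (5.7) can be solved by taking $\varphi_{n+1} = \beta$, since $r = \delta_1 \beta$, which implies that $\varphi_{n+1}(F_{1,0}(U(\mathfrak{g}))) \subseteq F_{*,n+1}(U(\mathfrak{g}))$ and $\varphi_{n+1}(F_{0,1}(U(\mathfrak{g}))) \subseteq F_{*,n+2}(U(\mathfrak{g}))$. The proposition then follows from Lemma 3.8.

5.2. Contractible twisting for symmetric semisimple Lie algebras.

Proposition 5.2. Let $(\mathfrak{g}, \theta)$ be a symmetric semisimple Lie algebra over $\mathbb{K}$ having the restriction property, and let $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ be the associated symmetric decomposition of $\mathfrak{g}$. Every $\mathfrak{p}$-contractible deformation $(U_h(\mathfrak{g}), \Delta, \epsilon, S)$ of the Hopf algebra $(U(\mathfrak{g}), \Delta_0, \epsilon_0, S_0)$ is isomorphic, as a Hopf algebra over $\mathbb{K}[[h]]$, to a twist of $(U(\mathfrak{g}), \Delta_0, \epsilon_0, S_0)$ by a $\mathfrak{p}$-contractible invertible element $F \in U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]]$, congruent with $1 \otimes 1$ mod $h$.

Proof. We consider the composite map
$$\tilde{\Delta} : U(\mathfrak{g})[[h]] \xrightarrow[\sim]{\;\phi^{-1}\;} U_h(\mathfrak{g}) \xrightarrow{\;\Delta\;} U_h(\mathfrak{g}) \otimes U_h(\mathfrak{g}) \xrightarrow[\sim]{\;\phi \otimes \phi\;} U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]], \tag{5.8}$$
where the existence of a $\mathfrak{p}$-contractible isomorphism of $\mathbb{K}[[h]]$-algebras $\phi$ follows from Proposition 5.1. As $\phi$ is an algebra isomorphism, the composite map $\tilde{\Delta}$ is an algebra homomorphism. By repeated use of Lemma 3.6, one can show that it is $\mathfrak{p}$-contractible. Now, we want to prove that there exists a $\mathfrak{p}$-contractible and invertible element $F \in U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]]$, such that $F = 1 \otimes 1$ mod $h$ and
$$\tilde{\Delta} = F \Delta_0 F^{-1}. \tag{5.9}$$
We shall proceed by recursion on the order in $h$. To first order, we have, by construction,
$$\tilde{\Delta} = \Delta_0 \mod h, \tag{5.10}$$
and we can take $F = 1 \otimes 1$ mod $h$. We thus have $F|_{h=0} \in F_{0,0}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$. Suppose now that we have found a polynomial $F_n \in U(\mathfrak{g}) \otimes U(\mathfrak{g})[h]$ of degree $n$,
$$F_n = \sum_{m=0}^{n} h^m f_m, \tag{5.11}$$
such that
$$\tilde{\Delta} - F_n \Delta_0 F_n^{-1} = h^{n+1} \xi \mod h^{n+2}, \tag{5.12}$$
where $F_n^{-1} \in U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]]$ is the formal inverse of $F_n$ in the sense that $F_n^{-1} F_n = 1 \otimes 1$ and $\xi \in \mathrm{Hom}(U(\mathfrak{g})[[h]], U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]])$. We assume that $F_n$ is $\mathfrak{p}$-contractible, i.e.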
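for all $0 \le m \le n$, $f_m \in F_{*,m}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$. Since $\tilde{\Delta}$ is $\mathfrak{p}$-contractible, we deduce that $\xi(F_{1,0}(U(\mathfrak{g}))) \subseteq F_{*,n+1}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ and $\xi(F_{0,1}(U(\mathfrak{g}))) \subseteq F_{*,n+2}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$. It follows from (5.12) that, for all $X, Y \in \mathfrak{g}$, we have
$$\big(\tilde{\Delta} - F_n \Delta_0 F_n^{-1}\big)([X, Y]) = h^{n+1} \xi([X, Y]) \mod h^{n+2}, \tag{5.13}$$
on the one hand and, on the other hand, since $\tilde{\Delta}$ is an algebra homomorphism,
$$\big(\tilde{\Delta} - F_n \Delta_0 F_n^{-1}\big)([X, Y]) = \big[\tilde{\Delta} X, \tilde{\Delta} Y\big] - F_n \Delta_0([X, Y]) F_n^{-1} = h^{n+1} \big([\Delta_0 X, \xi(Y)] + [\xi(X), \Delta_0 Y]\big) \mod h^{n+2}. \tag{5.14}$$
Equating (5.13) and (5.14), we finally get
$$d_1 \xi = 0. \tag{5.15}$$
The map $\xi$ is thus a 1-cocycle of $Z^1(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ in the sense of the Chevalley-Eilenberg complex.⁴ As $\mathfrak{g}$ is semisimple, it follows from Lemma 4.2 that the cohomology module $H^1(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ vanishes. We therefore conclude that $\xi$ is a coboundary. But we know that $\xi(F_{0,1}(U(\mathfrak{g}))) \subseteq F_{*,n+2}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ and $\xi(F_{1,0}(U(\mathfrak{g}))) \subseteq F_{*,n+1}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$, so that $\xi$ is an $(n+1, \mathfrak{p})$-contractible 1-cocycle in the contractible Chevalley-Eilenberg complex defined in Subsect. 4.3. As $\mathfrak{g}$ is of restrictive type, it follows from Lemma 4.3 that $H^1_{n+1,\mathfrak{p}}(\mathfrak{g}, U(\mathfrak{g}) \otimes U(\mathfrak{g})) = 0$, so that $\xi$ is the coboundary of an $(n+1, \mathfrak{p})$-contractible element in $U(\mathfrak{g}) \otimes U(\mathfrak{g})$, i.e. there exists an $\alpha \in F_{*,n+1}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$ such that $\xi = d_0 \alpha = \delta_0 \alpha$. In order to complete the recursion, we have to find an $f_{n+1} \in U(\mathfrak{g}) \otimes U(\mathfrak{g})$ such that
$$\tilde{\Delta} - \big(F_n + h^{n+1} f_{n+1}\big)\, \Delta_0\, \big(F_n^{-1} - h^{n+1} f_{n+1} \bmod h^{n+2}\big) = 0 \mod h^{n+2}. \tag{5.16}$$
Expanding the above equation to order $h^{n+1}$ yields
$$\delta_0 f_{n+1} + \xi = 0. \tag{5.17}$$
This equation can then be solved by choosing $f_{n+1} = -\alpha \in F_{*,n+1}(U(\mathfrak{g}) \otimes U(\mathfrak{g}))$.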
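The cocycle condition (5.15) and the statement that coboundaries $d_0 \alpha$ automatically satisfy it can be illustrated on a toy model. The sketch below is only an illustration under stated assumptions: it takes $\mathfrak{g} = \mathfrak{so}(3)$, realised on $\mathbb{K}^3$ by the cross product, acting on itself, instead of the module $U(\mathfrak{g}) \otimes U(\mathfrak{g})$ of the proof. The function names `d0`, `d1` and `is_cocycle` are hypothetical, and the formula used for `d1` is the one read off from (5.13) and (5.14), namely $\xi([X,Y]) = X \triangleright \xi(Y) - Y \triangleright \xi(X)$.

```python
# Toy model (an assumption, not the module used in the paper):
# g = so(3) realised on K^3 by the cross product, acting on itself.

def cross(x, y):
    """Lie bracket of so(3) in the basis J1, J2, J3."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def sub(x, y):
    """Componentwise difference of two vectors."""
    return tuple(a - b for a, b in zip(x, y))

def d0(alpha):
    """0-cochain alpha -> 1-cochain x |-> x acting on alpha."""
    return lambda x: cross(x, alpha)

def d1(xi):
    """1-cochain xi -> 2-cochain (x, y) |-> x.xi(y) - y.xi(x) - xi([x, y])."""
    return lambda x, y: sub(sub(cross(x, xi(y)), cross(y, xi(x))),
                            xi(cross(x, y)))

BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def is_cocycle(xi):
    """Check d1(xi) = 0 on all pairs of basis elements (xi is assumed linear)."""
    return all(d1(xi)(x, y) == (0, 0, 0) for x in BASIS for y in BASIS)

coboundary = d0((2, -3, 5))            # xi = d0(alpha) for a sample alpha
projection = lambda x: (x[0], 0, 0)    # a linear map that is not a cocycle

print(is_cocycle(coboundary))   # coboundaries satisfy (5.15), by the Jacobi identity
print(is_cocycle(projection))   # a generic linear map does not
```

In this toy case Whitehead's Lemma guarantees the converse as well: every 1-cocycle is of the form `d0(alpha)`. What the sketch cannot show is the filtration degree of $\alpha$, which is the additional information supplied by Lemma 4.3.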
5.3. Contractible quasi-Hopf algebras. Generically, cochain twists map quasi-Hopf algebras to quasi-Hopf algebras [2–4]. Under twisting, the coproduct and coassociator of a given quasi-Hopf algebra transform as
$$\Delta^{F} X = F \cdot (\Delta X) \cdot F^{-1}, \qquad \Phi^{F} = F_{12} \cdot (\Delta \otimes \mathrm{id})(F) \cdot \Phi \cdot (\mathrm{id} \otimes \Delta)(F^{-1}) \cdot F_{23}^{-1}, \tag{5.18}$$
and, if the quasi-Hopf algebra is in addition quasitriangular, then the R-matrix $R$ transforms as
$$R^{F} = F_{21} R F^{-1}. \tag{5.19}$$
4 By rewriting (5.13–5.14) for the associative product of two arbitrary elements in U (g), we also show that ξ is a 1-cocycle in the sense of the Hochschild complex. This indeed provides a unique continuation of ξ from g to U (g) as a derivation.
In the previous section it happened that both $\Delta$ and $\Delta_0$ were coassociative, so that both $U_h(\mathfrak{g})$ and $U(\mathfrak{g})$ happened to be Hopf algebras, but the theory applies more generally. Suppose now that $R \in (U_h(\mathfrak{g}))^{\otimes 2}$ and $\Phi \in (U_h(\mathfrak{g}))^{\otimes 3}$ are any R-matrix and coassociator that make a given QUEA $(U_h(\mathfrak{g}), \Delta, \epsilon, S)$ into a (coassociative) qtqH algebra, which we denote, by a slight abbreviation, as $(U_h(\mathfrak{g}), \Delta, R, \Phi)$. We say that this qtqH algebra is $\mathfrak{p}$-contractible with respect to a symmetric decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ if and only if $(U_h(\mathfrak{g}), \Delta, \epsilon, S)$ is $\mathfrak{p}$-contractible in the sense of Definition 3.9 and $R$ and $\Phi$ are $\mathfrak{p}$-contractible as elements of their respective tensor products. It then follows from the definitions above that

Proposition 5.3. For any QUEA $U_h(\mathfrak{g})$ and any symmetric decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$, if $(U_h(\mathfrak{g}), \Delta, R, \Phi)$ is a $\mathfrak{p}$-contractible qtqH algebra and $F \in (U_h(\mathfrak{g}))^{\otimes 2}$ is a $\mathfrak{p}$-contractible twist, then $((U_h(\mathfrak{g}))^{F}, \Delta^{F}, R^{F}, \Phi^{F})$ is a $\mathfrak{p}$-contractible qtqH algebra.

Combining this with Propositions 5.1 and 5.2, we have that every $\mathfrak{p}$-contractible qtqH algebra $(U_h(\mathfrak{g}), \Delta, R, \Phi)$ can be obtained, via $\mathfrak{p}$-contractible change of basis and twist, from some $\mathfrak{p}$-contractible qtqH algebra $(U(\mathfrak{g}), \Delta_0, R', \Phi')$ based on the undeformed UEA. In particular, starting from the trivial triangular quasi-Hopf structure $(R = 1 \otimes 1, \Phi = 1 \otimes 1 \otimes 1)$ on $U(\mathfrak{g})$, which is obviously $\mathfrak{p}$-contractible, we have

Corollary 5.4. For any $\mathfrak{p}$-contractible deformation Hopf algebra $(U_h(\mathfrak{g}), \Delta, \epsilon, S)$ based on a symmetric semisimple Lie algebra of restrictive type with symmetric decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$, there is an R-matrix $R$ and coassociator $\Phi$ such that $(U_h(\mathfrak{g}), \Delta, R, \Phi)$ is a $\mathfrak{p}$-contractible triangular quasi-Hopf algebra.

Proof. Explicitly, by Propositions 5.1 and 5.2, there exists a $\mathfrak{p}$-contractible invertible element $F \in U(\mathfrak{g}) \otimes U(\mathfrak{g})[[h]]$ and a $\mathfrak{p}$-contractible $\mathbb{K}[[h]]$-algebra isomorphism $\phi$, such that
$$\Delta = \big(\phi^{-1} \otimes \phi^{-1}\big) \circ F \Delta_0 F^{-1} \circ \phi.$$
Defining
$$R := \big(\phi^{-1} \otimes \phi^{-1}\big)\big(F_{21} F^{-1}\big), \tag{5.20}$$
$$\Phi := \big(\phi^{-1} \otimes \phi^{-1} \otimes \phi^{-1}\big)\big(F_{12} \cdot (\Delta_0 \otimes \mathrm{id})(F) \cdot (\mathrm{id} \otimes \Delta_0)(F^{-1}) \cdot F_{23}^{-1}\big) \tag{5.21}$$
provides the required structure.
One may also want to know when a given $\mathfrak{p}$-contractible Hopf QUEA $(U_h(\mathfrak{g}), \Delta, \epsilon, S)$ admits a $\mathfrak{p}$-contractible quasitriangular structure. When $\mathfrak{g}$ is a semisimple Lie algebra, we can at least give necessary conditions, by adapting the argument surrounding Proposition 3.16 in [4]. Let $t \in \mathrm{sym}(\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{g}}$ be a $\mathfrak{g}$-invariant symmetric element. For semisimple $\mathfrak{g}$, $t$ is a linear combination of invariant symmetric elements of the simple factors of $\mathfrak{g}$. Let $(U_h(\mathfrak{g}), \Delta, \bar{R})$ be the corresponding standard quasitriangular Hopf QUEA and $(U(\mathfrak{g})[[h]], \Delta_0, R, \Phi)$ the qtqH algebra with $R = e^{ht/2}$, both as defined (simple factor by simple factor) in [4].

Corollary 5.5. Let $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ be a symmetric decomposition of restrictive type of a semisimple Lie algebra $\mathfrak{g}$. If $(U_h(\mathfrak{g}), \Delta, \bar{R})$ is $\mathfrak{p}$-contractible, then it is isomorphic, via a $\mathfrak{p}$-contractible isomorphism of $\mathbb{K}[[h]]$-algebras, to a $\mathfrak{p}$-contractible twist of $(U(\mathfrak{g})[[h]], \Delta_0, R = e^{ht/2}, \Phi)$. Furthermore, $ht$ is necessarily $\mathfrak{p}$-contractible.
Proof. (Outline) One follows the proof of Proposition 3.16 in [4] to reach the qtqH algebra $(U(\mathfrak{g})[[h]], \Delta_0, R, \Phi)$, where $R$ and $\Phi$ are $\mathfrak{g}$-invariants but, as above, one knows from Propositions 5.1 and 5.2 that the required isomorphism $\phi$ and twist $F$ can be chosen to be $\mathfrak{p}$-contractible. Indeed, a further $\mathfrak{g}$-invariant twist may be required to ensure that $R_{21} = R$, but this twist is $\mathfrak{p}$-contractible as $R$ is (cf. Prop. 3.5 in [4]). Then the rest of the proof is unmodified, and one has that $R = e^{ht/2}$ and that $\Phi$ is the corresponding coassociator, as defined in [4]. Moreover, since both $R$ and $\Phi$ depend on $h$ and $t$ solely through $ht$, their $\mathfrak{p}$-contractibility implies that of $ht$.

Remark. Knowing, ahead of time, that the standard quasitriangular Hopf QUEAs of semisimple Lie algebras exist allows one to conclude that, to the datum $(\mathfrak{g}, t)$, corresponds, via twisting, a quasitriangular Hopf algebra. It does not, however, allow us to conclude anything about $\mathfrak{p}$-contractibility. In order to decide whether the existence of a $\mathfrak{p}$-contractible $ht \in h\,\mathrm{sym}(\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{g}}$ is also a sufficient condition for the existence of a $\mathfrak{p}$-contractible quasitriangular Hopf algebra $(U_h(\mathfrak{g}), \Delta, \bar{R})$ based on $(U_h(\mathfrak{g}), \Delta, \epsilon, S)$, it might be helpful to refine the approach of Donin and Shnider [38], where it is shown by direct cohomological arguments that there exists a twist from $(U(\mathfrak{g}), \Delta_0, R, \Phi)$ to the latter, therefore setting the coassociator to unity. In Sect. 7 we will see an example for which a $\mathfrak{p}$-contractible $ht$ (and a $\mathfrak{p}$-contractible quasitriangular Hopf algebra) does exist, and one for which it does not.

6. Twists and p-Contractions

We can now finally turn to the objects in which we are really interested in this paper: those deformed enveloping algebras of non-semisimple Lie algebras that are obtained by a certain contraction procedure modelled on that used in [22–24] to obtain the κ-deformation of Poincaré.
The notion of p-contractibilty introduced in the previous sections is formulated with this type of contraction in mind, as we now discuss. Recall first that if g = h ⊕ p is a symmetric decomposition of a Lie algebra g, a standard procedure known as Inönu-Wigner contraction, [36,37], consists in contracting the submodule p by means of a one-parameter family of linear automorphisms of the form t = πh + t πp,
(6.1)
where πh : g h and πp : g p denote the linear projections from g to h and p respectively and t ∈ (0, 1]. For all t ∈ (0, 1], the image of g by the automorphism −1 t is the symmetric semisimple Lie algebra gt , isomorphic to g = h ⊕ p as a K-module, with Lie bracket [X, Y ]t = −1 t ([t (X ), t (Y )])
(6.2)
for all X, Y ∈ g. It has the property that [h, h]t ⊂ h ,
[h, p]t ⊂ p , and [p, p]t ⊂ t 2 h,
(6.3)
so in the limit t → 0 one obtains a Lie algebra g0 , isomorphic to g = h ⊕ p as a K-module, whose Lie bracket [, ]0 = limt→0 [, ]t obeys [h, h]0 ⊂ h ,
[h, p]0 ⊂ p , and [p, p]0 = {0}.
(6.4)
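The scaling behaviour in (6.3) and (6.4) can be checked by hand on the smallest example. The following Python sketch is an illustration under assumptions of our own choosing: it takes $\mathfrak{g} = \mathfrak{so}(3)$ with basis $J_1, J_2, J_3$ and bracket given by the cross product, $\mathfrak{h} = \mathrm{span}(J_3)$ and $\mathfrak{p} = \mathrm{span}(J_1, J_2)$. The function names `rescale`, `rescale_inv` and `bracket_t` are hypothetical labels for the rescaling automorphism of (6.1), its inverse, and the bracket (6.2).

```python
from fractions import Fraction

# Assumed toy example: g = so(3), vectors are coordinates in the basis
# (J1, J2, J3); h = span(J3) and p = span(J1, J2).

def bracket(x, y):
    """so(3) bracket, realised as the cross product."""
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]

def rescale(t, x):
    """The map pi_h + t * pi_p of (6.1): fixes h, multiplies p by t."""
    return [t * x[0], t * x[1], x[2]]

def rescale_inv(t, x):
    """Inverse of rescale, defined for t != 0."""
    return [x[0] / t, x[1] / t, x[2]]

def bracket_t(t, x, y):
    """The rescaled bracket of (6.2)."""
    return rescale_inv(t, bracket(rescale(t, x), rescale(t, y)))

J1, J2, J3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
t = Fraction(1, 10)

print(bracket_t(t, J3, J1))   # [h, p]_t stays in p, independently of t
print(bracket_t(t, J1, J2))   # [p, p]_t is t^2 times an element of h
```

Exact rational arithmetic is used so that the factor $t^2$ in $[\mathfrak{p}, \mathfrak{p}]_t$ appears without rounding. Letting $t$ tend to zero then kills this bracket, and the contracted algebra is, in this example, the Euclidean algebra of the plane.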
The submodule p is therefore an abelian ideal in g0 . The undeformed Hopf algebra structure defined in Sect. 3.1 is preserved as t tends to zero. There is thus a natural undeformed Hopf algebra structure on the envelope U(g0 ) of the contracted Lie algebra, which we may write as (U(g0 ), 0 , S0 , 0 ) without ambiguity. We may extend t over U(g)[[h]] as a K[[h]]-algebra homomorphism. Further, by means of the K[[h]]-module isomorphism η of Definition 3.9, we can regard t as a map Uh (g) → Uh (g) on any QUEA Uh (g). This specifies how every element of the latter is to be rescaled in the contraction limit. The relevance of the definition of p-contractibility from Sect. 3 is then contained in the following Definition-Proposition 6.1. Let (g, θ ) be a symmetric semisimple Lie algebra with symmetric decomposition g = h ⊕ p and let (Uh (g), h , Sh , h ) be a deformation of the Hopf algebra (U(g), 0 , S0 , 0 ). For all t ∈ (0, 1], set −1
(t) = (−1 t ⊗ t ) ◦ th ◦ t ,
S(t) = −1 t ◦ Sth ◦ t
and (t) = th ◦ t , (6.5)
where h = h/t is the rescaled deformation parameter. Then the limit of (Uth (gt ), (t) , S(t) , (t) ) as t → 0 exists if and only if (Uh (g), h , Sh , h ) is p-contractible. If so, one has a deformation of (U(g0 ), 0 , S0 , 0 ) which we denote by (Uh (g0 ), h , Sh , h ), and refer to as the p-contraction of (Uh (g), h , Sh , h ). Proof. Let r, s ∈ N and let φ : (U(g))⊗r [[h]] → (U(g))⊗s [[h]] be a homomorphism ⊗s ◦ φ ◦ ( )⊗r has a finite limit of K[[h]]-modules. We want to prove that φt = (−1 t t ) when t → 0 if and only if φ is p-contractible. First assume that φ is p-contractible; then from Lemma 3.8, there exists a collection (ϕn )n∈N0 of K[[h]]-module homomorphisms ϕn : (U(g))⊗r [[h]] → (U(g))⊗s [[h]] such that φ=
h n ϕn
(6.6)
n≥0
⊗r and, for m, all n,⊗s p ∈ N0 , there exists l ∈ N0 such that ϕn Fm, p ((U(g)) ) ⊆ Fl,n+ p (U(g)) . We thus have, for all n, m, p ∈ N0 , ⊗s ⊗s h n (−1 ◦ ϕn ◦ (t )⊗r Sm, p (g⊕r ) = h −n t n+ p (−1 ◦ ϕn Sm, p (g⊕r ) t ) t ) ⊗s ⊆ h −n t n+ p (−1 Fl,n+ p ((U(g))⊗s ) t ) = h −n t n+ p O(t −(n+ p) ) Fl,n+ p ((U(g))⊗s ) = h −n O(1) Fl,n+ p ((U(g))⊗s ) . This obviously has a finite limit when t → 0 and so does φt . Conversely, one sees that if φ is not p-contractible, φt diverges at least as t −1 . It is worth emphasizing that the notion of p-contraction defined here is not the only possible contraction that can be performed on a QUEA of g with respect to a given symmetric decomposition g = h ⊕ p: one could also, for example, consider contractions where the deformation parameter h is not rescaled in the limit.
Finally, we can state our main result concerning twists and p-contracted QUEAs: Theorem 6.1. If a deformation Hopf algebra (Uh (g0 ), h , Sh , h ) is the p-contraction of a QUEA of a symmetric semisimple Lie algebra (g, θ ) having the restriction property, then it is isomorphic, as a Hopf algebra over K[[h ]], to a twist of the undeformed Hopf algebra (U(g0 ), 0 , S0 , 0 ) by an invertible element F0 ∈ Uh (g0 ) ⊗ Uh (g0 )[[h ]] congruent with 1 ⊗ 1 modulo h . Proof. By Proposition 6.1, Proposition 5.2 applies. By arguing as in the proof of 6.1, we have that if F is the p-contractible twist element of Proposition 5.2, then −1 F0 = lim (−1 t ⊗ t )(F )
(6.7)
t→0
is well-defined. By construction, this is the twist we seek.
From Corollary 5.4, one has similarly that for every such $\mathfrak{p}$-contracted QUEA $U_{h'}(\mathfrak{g}_0)$ there exists an R-matrix $R$ and coassociator $\Phi$ that make $(U_{h'}(\mathfrak{g}_0), R, \Phi)$ into a triangular quasi-Hopf algebra.

7. Examples: κ-Poincaré in 3 and 4 Dimensions

We now turn to explicit examples. Let $\mathbb{K} = \mathbb{C}$, and consider the symmetric decomposition
$$\mathfrak{so}(n+1) = \mathfrak{so}(n) \oplus \mathfrak{p}_n, \qquad n > 2, \tag{7.1}$$
whose Inönü-Wigner contraction of course yields the Lie algebra $\mathfrak{iso}(n)$ of the complexified Euclidean group in $n$ dimensions, $ISO(n, \mathbb{C})$. By Lemma 3.3, this decomposition is of restrictive type. Thus, the results above will apply to any $\mathfrak{p}_n$-contractible deformation algebra $U_h(\mathfrak{so}(n+1))$. Finding such deformations is itself a non-trivial task. In the cases $n = 3, 4$, this was achieved in [23,24],⁵ yielding the κ-deformations $U_\kappa(\mathfrak{iso}(3))$ and $U_\kappa(\mathfrak{iso}(4))$. These can be written in terms of the generators
$$M_{ij} = -M_{ji}, \qquad N_i, \qquad P_i, \qquad P_0 = E, \tag{7.2}$$
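for all $1 \le i, j \le n - 1$ and $n = 3, 4$. The deformation parameter is conventionally denoted as $\kappa = 1/h'$, and the algebra is then given by
$$[M_{ij}, P_k] = \delta_{k[i} P_{j]}, \tag{7.3}$$
$$[N_i, E] = P_i, \qquad [N_i, P_j] = \delta_{ij}\, \kappa \sinh\frac{E}{\kappa}, \tag{7.4}$$
$$[N_i, N_j] = -M_{ij} \cosh\frac{E}{\kappa} + \frac{1}{4\kappa^2} \big( P \cdot P\, M_{ij} + P_k P_{[i} M_{j]k} \big), \tag{7.5}$$
for all $1 \le i, j, k, l \le n - 1$. The coproduct is given by
$$\Delta_\kappa(E) = E \otimes 1 + 1 \otimes E, \tag{7.6}$$
$$\Delta_\kappa(P_i) = P_i \otimes e^{\frac{E}{2\kappa}} + e^{-\frac{E}{2\kappa}} \otimes P_i, \tag{7.7}$$
$$\Delta_\kappa(N_i) = N_i \otimes e^{\frac{E}{2\kappa}} + e^{-\frac{E}{2\kappa}} \otimes N_i + \frac{1}{2\kappa} \Big( P_j \otimes e^{\frac{E}{2\kappa}} M_{ij} - e^{-\frac{E}{2\kappa}} M_{ij} \otimes P_j \Big), \tag{7.8}$$
$$\Delta_\kappa(M_{ij}) = M_{ij} \otimes 1 + 1 \otimes M_{ij}, \tag{7.9}$$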
5 Note that although the κ-Poincaré algebra exists in arbitrary dimension [39], to the authors’ knowledge it has only explicitly been shown to arise as a p-contraction for n ≤ 4.
and the antipode by
$$S_\kappa(P_\mu) = -P_\mu, \qquad S_\kappa(M_{ij}) = -M_{ij}, \qquad S_\kappa(N_i) = -N_i + \frac{d}{2\kappa} P_i. \tag{7.10}$$
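As a quick numerical illustration of how the deformed relations reduce to the undeformed ones, the sketch below evaluates the $E$-dependent coefficients appearing in (7.4) and (7.5) for increasing $\kappa$. It treats $E$ as an ordinary real number, which is an assumption made purely for illustration (in the algebra $E$ is a generator), and the function names are hypothetical.

```python
import math

def boost_momentum_coefficient(energy, kappa):
    """Coefficient of delta_ij in [N_i, P_j], read off from (7.4)."""
    return kappa * math.sinh(energy / kappa)

def boost_boost_coefficient(energy, kappa):
    """Coefficient multiplying -M_ij in [N_i, N_j], read off from (7.5)."""
    return math.cosh(energy / kappa)

energy = 2.0  # sample numerical value standing in for E (assumption)
for kappa in (1.0, 10.0, 100.0, 1.0e4):
    print(kappa,
          boost_momentum_coefficient(energy, kappa) - energy,
          boost_boost_coefficient(energy, kappa) - 1.0)
```

Both differences decay like $1/\kappa^2$, consistent with the deformation being an expansion in $h' = 1/\kappa$ whose leading term is the undeformed bracket.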
The counit map is undeformed, $\epsilon(M_{ij}) = \epsilon(N_i) = \epsilon(P_\mu) = 0$, for all $0 \le \mu \le n - 1$. It follows from the results presented in the previous sections that $U_\kappa(\mathfrak{iso}(3))$ and $U_\kappa(\mathfrak{iso}(4))$ are isomorphic to cochain twists of $U(\mathfrak{iso}(3))$ and $U(\mathfrak{iso}(4))$ respectively. Let us comment on the relationship between this statement and various previous results. First, it should not be confused with other statements that exist in the literature [40,41] concerning twists and κ-deformed Minkowski space-time, which involve enlarged algebras that include the dilatation generator. Next, as we saw above, the existence of the cochain twist means there certainly exist triangular quasi-Hopf algebras $(U_\kappa(\mathfrak{iso}(n)), R, \Phi)$, at least for $n = 3, 4$. They are obtained, as in the approach of Beggs and Majid [7,8], by twisting $(U_\kappa(\mathfrak{iso}(n)), 1^{\otimes 2}, 1^{\otimes 3})$. To the first few orders in $h' = 1/\kappa$, the structures $R$ and $\Phi$ were explicitly computed, for any $n \ge 2$, in [42]; see also [43,44]. One can also understand the existence of the quasitriangular Hopf algebra structure of $U_\kappa(\mathfrak{iso}(3))$ exhibited in [23] in the context of the results above. Among the special orthogonal algebras, $\mathfrak{so}(4, \mathbb{C})$ alone is not simple: $\mathfrak{so}(4, \mathbb{C}) = \mathfrak{a}_1 \oplus \mathfrak{a}_1$. There is thus a two-dimensional space of quadratic Casimirs. It is straightforward to verify that a one-dimensional subspace of them is $\mathfrak{p}$-contractible, namely the one spanned by $h\,t := h\,\epsilon_{ijk} M_{ij} P_k$. For $n \neq 3$, it is known that there is no classical r-matrix obeying the classical Yang-Baxter equation [42,45] and therefore no quasitriangular Hopf algebra structure. This now also follows from Corollary 5.5, given that for all $n \neq 3$ the unique quadratic Casimir of $\mathfrak{so}(n+1)$ fails to be $\mathfrak{p}$-contractible. As for versions of the κ-deformed Poincaré algebra in higher and lower space-time dimensions, a consistent definition was first given in [39].
The main idea is that the four-dimensional case is generic enough that the $(1+d)$-dimensional case can be obtained by simply extending or truncating the range of the spatial indices from $1, \dots, 3$ to $1, \dots, d$. It is reasonable to think that the twist obtained in the four-dimensional case can be similarly extended to arbitrary dimensions, thus extending to all dimensions the existence of a triangular quasi-Hopf algebra structure on the κ-deformation of the Poincaré algebra. In particular, we expect that the κ-deformation of $U(\mathfrak{sl}(2))$ admits a triangular quasi-Hopf algebra structure [44], but a proof of this statement would require a refinement of the arguments used here so as to circumvent the obstructions arising in this case (cf. the Appendix). Such a refinement could, for instance, rely on a further symmetry property of the $\mathfrak{p}$-contractible Chevalley-Eilenberg cohomology of $\mathfrak{sl}(2)$. Finally, we note that it would be interesting to understand the existence of the twist from the point of view of the other, conceptually distinct, construction of κ-Poincaré, namely as a bicrossproduct [20,21,46,47].

Acknowledgments. The research of C.A.S.Y. was supported by the Leverhulme foundation. R.Z. was funded by an EPSRC postdoctoral fellowship.
Appendix: Proof of Lemma 3.3

In this Appendix, we provide a proof of Lemma 3.3. Let $(\mathfrak{g}, \theta)$ be a symmetric semisimple Lie algebra obeying the conditions of the lemma. If $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ is the associated symmetric decomposition of $\mathfrak{g}$, we want to prove that, for all $p \in \mathbb{N}$, the projection from $\mathfrak{g}$ to $\mathfrak{p}$ maps $S_p(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ onto $S_{0,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$. The isomorphism of left $\mathfrak{g}$-modules (3.6) induces a similar isomorphism $S(\mathfrak{g} \oplus \mathfrak{g}) \cong S(\mathfrak{g}) \otimes S(\mathfrak{g})$ at the level of the symmetric algebras, from which it follows that
$$S_m(\mathfrak{g} \oplus \mathfrak{g}) \cong \bigoplus_{k=0}^{m} S_k(\mathfrak{g}) \otimes S_{m-k}(\mathfrak{g}), \tag{7.11}$$
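for all $m \in \mathbb{N}$. We thus have a decomposition of $S(\mathfrak{g} \oplus \mathfrak{g})$ into the $\mathfrak{g}$-submodules isomorphic to $S_k(\mathfrak{g}) \otimes S_{m-k}(\mathfrak{g})$. There is an analogous decomposition of $S_{0,m}(\mathfrak{g} \oplus \mathfrak{g})$ into $\mathfrak{h}$-submodules isomorphic to $S_{0,k}(\mathfrak{g}) \otimes S_{0,m-k}(\mathfrak{g}) = S_k(\mathfrak{p}) \otimes S_{m-k}(\mathfrak{p})$. It therefore suffices to show that, for all $k, \ell \in \mathbb{N}$, the restriction map induces a surjection
$$\big(S_k(\mathfrak{g}) \otimes S_\ell(\mathfrak{g})\big)^{\mathfrak{g}} \twoheadrightarrow \big(S_k(\mathfrak{p}) \otimes S_\ell(\mathfrak{p})\big)^{\mathfrak{h}}. \tag{7.12}$$
Identifying $\mathfrak{g} \cong \mathfrak{g}^*$, and in particular $\mathfrak{p} \cong \mathfrak{p}^*$, by means of the Killing form, an element $d \in S_k(\mathfrak{p}) \otimes S_\ell(\mathfrak{p})$ can be regarded as a $(k + \ell)$-linear map
$$\mathfrak{p} \times \cdots \times \mathfrak{p} \to \mathbb{K}; \quad (X, \dots, Y) \mapsto d(X, \dots, Y) \tag{7.13}$$
that is symmetric in its first $k$ and final $\ell$ slots. In view of the polarization formulae, such maps are in bijection with polynomials of two variables in $\mathfrak{p}$, according to
$$p^{(d)}(X, Y) = d(\underbrace{X, \dots, X}_{k}, \underbrace{Y, \dots, Y}_{\ell}). \tag{7.14}$$
These polynomials are $(k, \ell)$-homogeneous, by which we mean that they are homogeneous of degree $k$ with respect to their first argument and of degree $\ell$ with respect to their second argument. We denote by $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]$ the left $\mathfrak{h}$-module of $(k, \ell)$-homogeneous polynomials on $\mathfrak{p}$. Then for all $k, \ell \in \mathbb{N}$, $(S_k(\mathfrak{p}) \otimes S_\ell(\mathfrak{p}))^{\mathfrak{h}}$ is in bijection with the submodule $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}}$ of $\mathfrak{h}$-invariant $(k, \ell)$-homogeneous polynomials. Similarly, $(S_k(\mathfrak{g}) \otimes S_\ell(\mathfrak{g}))^{\mathfrak{g}}$ is in bijection with $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$. Therefore, it suffices to show that the restriction map from $\mathfrak{g}$ to $\mathfrak{p}$ maps $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$ onto $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}}$. By virtue of Lemma 2.2, it will be sufficient to consider separately the cases of diagonal symmetric Lie algebras and of the symmetric simple Lie algebras listed in Lemma 3.3.

We recall that a diagonal symmetric Lie algebra is a pair $(\mathfrak{g}, \theta)$, where $\mathfrak{g} = \mathfrak{v} \oplus \mathfrak{v}$, for some semisimple Lie algebra $\mathfrak{v}$, and $\theta$ is the involutive automorphism of Lie algebras defined by $\theta(x, y) = (y, x)$, for all $(x, y) \in \mathfrak{g}$. We thus have $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$, where $\mathfrak{h}$ is the set of elements of $\mathfrak{g}$ of the form $(x, x)$, whereas $\mathfrak{p}$ is the set of elements of $\mathfrak{g}$ of the form $(x, -x)$, for $x \in \mathfrak{v}$. We are first going to prove that $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}} \cong \mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$. Let $p \in \mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]$ be a polynomial. For all $X, Y \in \mathfrak{p}$, we have
$$p(X, Y) = p((x, -x), (y, -y)) = \tilde{p}(x, y), \tag{7.15}$$
for some $x, y \in \mathfrak{v}$. The left $\mathfrak{h}$-action on $\mathfrak{p}$ induces a left $\mathfrak{h}$-action on $\mathfrak{p} \times \mathfrak{p}$, given, for all $h \in \mathfrak{h}$ and all $X, Y \in \mathfrak{p}$, by
$$h \triangleright (X, Y) = (z, z) \triangleright ((x, -x), (y, -y)) = ((z \triangleright x, -z \triangleright x), (z \triangleright y, -z \triangleright y)), \tag{7.16}$$
for some $x, y \in \mathfrak{v}$ and some $z \in \mathfrak{v}$; from which it obviously follows that $\tilde{p}$ is $\mathfrak{v}$-invariant if and only if $p$ is $\mathfrak{h}$-invariant. Now, we are going to prove that the restriction map is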
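a surjection from $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$ onto $\mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$. Let $p \in \mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$ be a $\mathfrak{g}$-invariant polynomial on $\mathfrak{g}$. The left $\mathfrak{g}$-action on $\mathfrak{g} \oplus \mathfrak{g}$ is given, for all $g \in \mathfrak{g}$ and all $X, Y \in \mathfrak{g}$, by
$$g \triangleright (X, Y) = (g_1, g_2) \triangleright ((x_1, x_2), (y_1, y_2)) = ((g_1 \triangleright x_1, g_2 \triangleright x_2), (g_1 \triangleright y_1, g_2 \triangleright y_2)), \tag{7.17}$$
for some $g_1, g_2 \in \mathfrak{v}$ and some $x_1, x_2, y_1, y_2 \in \mathfrak{v}$. As one can always choose $g_1$ and $g_2$ independently, it follows that in order for $p$ to be $\mathfrak{g}$-invariant, there must be a polynomial $f : \mathbb{K} \times \mathbb{K} \to \mathbb{K}$ and two $\mathfrak{v}$-invariant polynomials $p_1, p_2 \in \mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$ such that
$$p((x_1, x_2), (y_1, y_2)) = f(p_1(x_1, y_1), p_2(x_2, y_2)), \tag{7.18}$$
for all $x_1, x_2, y_1, y_2 \in \mathfrak{v}$. Now restricting $p$ to $\mathfrak{p}$, we get
$$p((x_1, -x_1), (y_1, -y_1)) = f(p_1(x_1, y_1), p_2(-(x_1, y_1))) = \tilde{p}(x_1, y_1) \in \mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}, \tag{7.19}$$
for all $x_1, y_1 \in \mathfrak{v}$. Now, it is obvious that every polynomial in $\mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$ can be obtained as the restriction to $\mathfrak{p}$ of a polynomial in $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$; e.g. take $p_2 = 0$, $f = \mathrm{id}$ and $p_1 = \tilde{p}$.

We are now going to consider the different symmetric simple Lie algebras listed in Lemma 3.3. Let us first consider the symmetric simple Lie algebras of type $\mathrm{AI}_n$ for all $n > 2$. In this case, we have $\mathfrak{g} = \mathfrak{su}(n)$ endowed with an involutive automorphism $\theta$ given by complex conjugation, i.e. $\theta(x) = \bar{x}$, for all $x \in \mathfrak{su}(n)$. The fixed points of $\theta$ are traceless real antisymmetric matrices which generate an $\mathfrak{so}(n)$ subalgebra. We thus have the symmetric decomposition $\mathfrak{su}(n) = \mathfrak{so}(n) \oplus \mathfrak{p}$, where the orthogonal complement $\mathfrak{p}$ is the left $\mathfrak{so}(n)$-module generated by the traceless imaginary symmetric matrices of $\mathfrak{su}(n)$. It follows from the first fundamental theorem for $\mathfrak{so}(n)$-invariant polynomials on $n \times n$ matrices [48] that $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{so}(n)}$ is generated by the following polynomials:
$$(x, y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \operatorname{tr} P(x, y), \tag{7.20}$$
for all $(i, j)$-homogeneous noncommutative polynomials $P \in \mathbb{K}_{i,j}[X, Y]$, with $i \le k$ and $j \le \ell$. The polynomials defined in (7.20) are obviously restrictions to $\mathfrak{p}$ of $\mathfrak{su}(n)$-invariant polynomials on $\mathfrak{su}(n)$ as, for all $P \in \mathbb{K}_{i,j}[X, Y]$ and all $x, y \in \mathfrak{su}(n)$,
$$(x, y) \mapsto \operatorname{tr} P(x, y) \tag{7.21}$$
defines an element in $\mathbb{K}_{i,j}[\mathfrak{su}(n), \mathfrak{su}(n)]^{\mathfrak{su}(n)}$. This proves Lemma 3.3 for simple symmetric Lie algebras of type $\mathrm{AI}_{n>2}$. It is worth noting that in the case of $\mathrm{AI}_2$, there exist obstructions to the above result which are related to the existence of a further $\mathfrak{so}(2)$-invariant with appropriate symmetries, namely the pfaffian $(x, y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \mathrm{Pf}([x, y])$. As the latter is not the restriction to $\mathfrak{p}$ of any $\mathfrak{su}(2)$ invariant on $\mathfrak{su}(2)$, Lemma 3.3 does not hold in this case.

We now turn to type $\mathrm{AII}_n$. In this case, we have $\mathfrak{g} = \mathfrak{su}(2n)$ endowed with an involutive automorphism $\theta$ given by the symplectic transpose, i.e., for all $x \in \mathfrak{su}(2n)$, $\theta(x) = J x^{t} J$, where $J$ is a non-singular skew-symmetric $2n \times 2n$ matrix such that $J^2 = -1$. The fixed point set of $\theta$ constitutes an $\mathfrak{sp}(2n)$ subalgebra and we have the following symmetric decomposition $\mathfrak{su}(2n) = \mathfrak{sp}(2n) \oplus \mathfrak{p}$, where $\mathfrak{p} \subset \mathfrak{su}(2n)$ is the left $\mathfrak{sp}(2n)$-module of matrices $x \in \mathfrak{su}(2n)$ such that $\theta(x) = -x$. It follows from the first fundamental theorem for $\mathfrak{sp}(2n)$-invariant polynomials on $2n \times 2n$ matrices [48] that $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{sp}(2n)}$ is generated by the following polynomials:
$$(x, y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \operatorname{tr} P(x, y), \tag{7.22}$$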
for all noncommutative $(i, j)$-homogeneous polynomials $P \in \mathbb{K}_{i,j}[X, Y]$, with $i \le k$ and $j \le \ell$. These polynomials are restrictions to $\mathfrak{p}$ of $\mathfrak{su}(2n)$-invariant polynomials on $\mathfrak{su}(2n)$ as, for all $P \in \mathbb{K}_{i,j}[X, Y]$ and all $x, y \in \mathfrak{su}(2n)$,
$$(x, y) \mapsto \operatorname{tr} P(x, y) \tag{7.23}$$
defines an element in $\mathbb{K}_{i,j}[\mathfrak{su}(2n), \mathfrak{su}(2n)]^{\mathfrak{su}(2n)}$. This proves Lemma 3.3 for simple symmetric Lie algebras of type $\mathrm{AII}_n$.

We finally consider the symmetric simple Lie algebras of type $\mathrm{BDI}_{n,1}$ for all $n > 2$. In this case, we have the symmetric pairs $(\mathfrak{so}(n+1), \mathfrak{so}(n))_{n>2}$. We introduce the usual basis of $\mathfrak{gl}(n+1)$, i.e. the $(E_{ij})_{0 \le i,j \le n}$ defined as the $(n+1) \times (n+1)$ matrices with a 1 at the intersection of the $i$th row and $j$th column and 0 everywhere else. The matrices $M_{ij} = E_{ij} - E_{ji}$, $0 \le i, j \le n$, span $\mathfrak{so}(n+1)$, and of these, the $M_{ij}$ with $1 \le i, j \le n$ generate an $\mathfrak{so}(n)$ subalgebra. We thus have the symmetric decomposition $\mathfrak{so}(n+1) = \mathfrak{so}(n) \oplus \mathfrak{p}$, where $\mathfrak{p}$ is the $n$-dimensional $\mathfrak{so}(n)$-module spanned by the $P_i = M_{0,i}$, for all $1 \le i \le n$. The $P_i$ transform under the fundamental representation $\mathbf{n}$ of $\mathfrak{so}(n)$, as can be checked from
$$M_{ij} \triangleright P_k = [M_{ij}, P_k] = \delta_{jk} P_i - \delta_{ik} P_j, \tag{7.24}$$
for all $1 \le i, j, k \le n$. This means that we are looking for $SO(n)$-invariant $(k, \ell)$-homogeneous polynomials on $\mathfrak{p} \times \mathfrak{p} = \mathbf{n} \times \mathbf{n}$. For all $n > 2$, it follows from the first fundamental theorem for $\mathfrak{so}(n)$-invariant polynomials on vectors [49,50] that such polynomials only depend on the $SO(n)$ scalars built out of the scalar products of their arguments. Let $q$ be the quadratic form defined on $\mathfrak{p} \times \mathfrak{p}$ by $q(P_i, P_j) = \delta_{ij}$ for all $1 \le i, j \le n$. For all $p \in \mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}}$, there exists a polynomial $f : \mathbb{K}^3 \to \mathbb{K}$ such that, for all $X, Y \in \mathfrak{p}$,
$$p(X, Y) = f(q(X, X), q(X, Y), q(Y, Y)). \tag{7.25}$$
Now, it is obvious that q is the restriction to p of the map 1 so(n + 1) × so(n + 1) → K ; (X, Y ) → − tr(X Y ), 2 which is so(n + 1)-invariant. This proves the result for symmetric simple Lie algebras of type BDIn>2,1 . It is worth noting that in the case of BDI2,1 , there exist obstructions to the above result which are related to the existence of a further S O(2) invariant, namely (X, Y ) ∈ p × p → det(X, Y ). As the latter is not the restriction to p of any so(3) invariant, Lemma 3.3 does not hold in this case. By virtue of the special isomorphisms between lower rank simple Lie algebras, the list of summands in Lemma 3.3 actually includes CII1,1 = BDI4,1 and BDI3,3 = AI4 . The latter respectively correspond to the symmetric decompositions sp(4) = (sp(2) ⊕ sp(2))⊕ p and so(6) = (so(3) ⊕ so(3)) ⊕ p. References 1. Drinfeld, V.G.: Quantum groups. J. Sov. Math. 41, 898 (1988) [Zap. Nauchn. Semin. 155, 18 (1986)]. Also in Proc. Int. Cong. Math. (Berkeley,1986) 1, 1987, pp. 798–820 2. Chari, V., Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press, 1994 3. Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 2000 4. Drinfel’d, V.G.: Quasi-Hopf algebras. (Russian) Algebra i Analiz 1(6), 114–148 (1989); translation in Leningrad Math. J. 1(6), 1419–1457 (1990)
C. A. S. Young, R. Zegers
5. Drinfel’d, V.G.: On the structure of quasitriangular quasi-Hopf algebras. (Russian) Funktsional. Anal. i Prilozhen. 26(1), 78–80 (1992); translation in Funct. Anal. Appl. 26(1), 63–65 (1992)
6. Drinfel’d, V.G.: On almost cocommutative Hopf algebras. (Russian) Algebra i Analiz 1(2), 30–46 (1989); translation in Leningrad Math. J. 1(2), 321–342 (1990)
7. Beggs, E.J., Majid, S.: Semi-classical differential structures. Pac. J. Math. 224(1), 1–44 (2006)
8. Beggs, E.J., Majid, S.: Quantization by cochain twists and nonassociative differentials. http://arxiv.org/abs/math/0506450v2[math.QA], 2005
9. Weinberg, S.: The Quantum Theory of Fields. Vol. 1: Foundations. Cambridge: Cambridge University Press, 1995
10. Lukierski, J.: Quantum deformations of Einstein’s relativistic symmetries. AIP Conf. Proc. 861, 398 (2006)
11. Oeckl, R.: Untwisting noncommutative R**d and the equivalence of quantum field theories. Nucl. Phys. B 581, 559 (2000)
12. Lukierski, J., Ruegg, H., Zakrzewski, W.J.: Classical quantum mechanics of free kappa relativistic systems. Annals Phys. 243, 90 (1995)
13. Kosinski, P., Lukierski, J., Maslanka, P.: Local D = 4 field theory on kappa-deformed Minkowski space. Phys. Rev. D 62, 025004 (2000)
14. Amelino-Camelia, G., Majid, S.: Waves on noncommutative spacetime and gamma-ray bursts. Int. J. Mod. Phys. A 15, 4301 (2000)
15. Agostini, A., Amelino-Camelia, G., D’Andrea, F.: Hopf-algebra description of noncommutative-spacetime symmetries. Int. J. Mod. Phys. A 19, 5187 (2004)
16. Dimitrijevic, M., Jonke, L., Moller, L., Tsouchnika, E., Wess, J., Wohlgenannt, M.: Field theory on kappa-spacetime. Czech. J. Phys. 54, 1243 (2004)
17. Grosse, H., Wohlgenannt, M.: On kappa-deformation and UV/IR mixing. Nucl. Phys. B 748, 473 (2006)
18. Kresic-Juric, S., Meljanac, S., Stojic, M.: Covariant realizations of kappa-deformed space. Eur. Phys. J. C 51, 229 (2007)
19. Daszkiewicz, M., Lukierski, J., Woronowicz, M.: κ-deformed statistics and classical fourmomentum addition law. Mod. Phys. Lett. A23, 653–665 (2008)
20. Majid, S., Ruegg, H.: Bicrossproduct structure of kappa poincare group and noncommutative geometry. Phys. Lett. B 334, 348 (1994)
21. Freidel, L., Kowalski-Glikman, J., Nowak, S.: Field theory on κ–Minkowski space revisited: Noether charges and breaking of Lorentz symmetry. Int. J. Mod. Phys. A 23, 2687 (2008)
22. Celeghini, E., Giachetti, R., Sorace, E., Tarlini, M.: Three dimensional quantum groups from contraction of SU(2)_Q. J. Math. Phys. 31, 2548 (1990)
23. Celeghini, E., Giachetti, R., Sorace, E., Tarlini, M.: The Three-dimensional Euclidean quantum group E(3)-q and its R matrix. J. Math. Phys. 32, 1159 (1991)
24. Lukierski, J., Ruegg, H., Nowicki, A., Tolstoi, V.N.: Q-deformation of Poincare algebra. Phys. Lett. B 264, 331 (1991)
25. Dixmier, J.: Enveloping Algebras. Amsterdam: North Holland Publishing Company, 1977
26. Helgason, S.: Differential Geometry and Symmetric Spaces. London: Academic Press, 1962
27. Cartan, E.: Sur certaines formes riemanniennes remarquables des géométries à groupe fondamental simple. Ann. Sci. Ecole Norm. Sup. 44, 345–467 (1927)
28. Helgason, S.: Fundamental solutions of invariant differential operators on symmetric spaces. Amer. J. Math. 86(3), 565–601 (1964)
29. Burstall, F.E., Ferus, D., Pedit, F., Pinkall, U.: Harmonic tori in symmetric spaces and commuting Hamiltonian systems on loop algebras. Ann. Math. 138(1), 173–212 (1993)
30. Evans, J.M.: Integrable sigma-models and Drinfeld-Sokolov hierarchies. Nucl. Phys. B 608, 591 (2001)
31. Hochschild, G.: On the cohomology groups of an associative algebra. Ann. Math. 46(1), 58–67 (1945)
32. Hochschild, G.: On the cohomology theory for associative algebras. Ann. Math. 47(3), 568–579 (1946)
33. Cartan, H., Eilenberg, S.: Homological Algebra. Princeton, NJ: Princeton University Press, 1956
34. Weibel, C.A.: An Introduction to Homological Algebra. Cambridge studies in advanced mathematics 38, Cambridge: Cambridge University Press, 1994
35. Chevalley, C., Eilenberg, S.: Cohomology theory of Lie groups and Lie algebras. Trans. Amer. Math. Soc. 63, 85–124 (1948)
36. Inönu, E., Wigner, E.P.: On the contraction of groups and their representations. Proc. Natl. Acad. Sci. U.S.A 39, 510–524 (1953)
37. Saletan, E.J.: Contraction of Lie groups. J. Math. Phys. 2, 1–21 (1961)
38. Donin, J., Shnider, S.: Cohomological construction of quantized universal enveloping algebras. Trans. Am. Math. Soc. 349, 1611 (1997)
39. Lukierski, J., Ruegg, H.: Quantum kappa poincaré in any dimension. Phys. Lett. B 329, 189 (1994)
40. Govindarajan, T.R., Gupta, K.S., Harikumar, E., Meljanac, S., Meljanac, D.: Twisted statistics in kappa-minkowski spacetime. Phys. Rev. D 77, 105010 (2008)
41. Borowiec, A., Pachol, A.: kappa-Minkowski spacetime as the result of Jordanian twist deformation. Phys. Rev. D 79, 045012 (2009)
42. Young, C.A.S., Zegers, R.: On kappa-deformation and triangular quasibialgebra structure. Nucl. Phys. B 809, 439–457 (2009)
43. Young, C.A.S., Zegers, R.: Covariant particle statistics and intertwiners of the kappa-deformed Poincare algebra. Nucl. Phys. B 797, 537 (2008)
44. Young, C.A.S., Zegers, R.: Covariant particle exchange for kappa-deformed theories in 1+1 dimensions. Nucl. Phys. B 804, 342 (2008)
45. Zakrzewski, S.: Poisson Poincaré groups. http://arxiv.org/abs/hep-th/9412099v1, 1994
46. Schroers, B.J.: Lessons from (2 + 1)-dimensional quantum gravity. PoS QG-PH, 035 (2007)
47. Majid, S., Schroers, B.J.: q-Deformation and semidualisation in 3d quantum gravity. http://arxiv.org/abs/0806.2587v2[gr-qc], 2009
48. Procesi, C.: The invariants of n × n matrices. Adv. Math. 19, 306–381 (1976)
49. Kraft, H., Procesi, C.: Classical invariant theory. Available online at http://www.math.unibas.ch/~kraft/Papers/KP-Primer.pdf, 1996
50. Spivak, M.: A comprehensive introduction to differential geometry. Vol. 5, Houston, TX: Publish or Perish, 1979

Communicated by A. Connes
Commun. Math. Phys. 298, 613–643 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1088-6
Communications in
Mathematical Physics
String Theory and the Kauffman Polynomial Marcos Mariño Département de Physique Théorique et Section de Mathématiques, Université de Genève, Genève CH-1211, Switzerland. E-mail:
[email protected] Received: 6 June 2009 / Accepted: 16 April 2010 Published online: 6 July 2010 – © Springer-Verlag 2010
Abstract: We propose a new, precise integrality conjecture for the colored Kauffman polynomial of knots and links inspired by large N dualities and the structure of topological string theory on orientifolds. According to this conjecture, the natural knot invariant in an unoriented theory involves both the colored Kauffman polynomial and the colored HOMFLY polynomial for composite representations, i.e. it involves the full HOMFLY skein of the annulus. The conjecture sheds new light on the relationship between the Kauffman and the HOMFLY polynomials, and it implies for example Rudolph’s theorem. We provide various non-trivial tests of the conjecture and we sketch the string theory arguments that lead to it.

Contents
1. Introduction
2. Colored HOMFLY and Kauffman Polynomials
   2.1 Basic ingredients from representation theory
   2.2 The colored HOMFLY polynomial
   2.3 The colored Kauffman polynomial
   2.4 Relationships between the HOMFLY and the Kauffman invariants
3. The Conjecture
   3.1 Review of the conjecture for the colored HOMFLY invariant
   3.2 The conjecture for the colored Kauffman invariant
4. Evidence
   4.1 Direct computations
   4.2 General predictions for knots
   4.3 General predictions for links
5. String Theory Interpretation
   5.1 Chern–Simons theory and D-branes
   5.2 Topological string dual
6. Conclusions and Outlook
1. Introduction

The HOMFLY [12] and the Kauffman [23] polynomials are probably the most useful two-variable polynomial invariants of knots and links. Both of them generalize the Jones polynomial, and they have become basic building blocks of quantum topology. However, many aspects of these polynomial invariants are still poorly understood. As Joan Birman remarked in 1993, “we can compute the simplest of the invariants by hand and quickly fill pages (...) without having the slightest idea what they mean” [6]. One particularly interesting question concerns the relationship between the HOMFLY and the Kauffman invariants. Since their discovery almost thirty years ago, a number of isolated connections have been found between them. For example, when written in an appropriate way, they have the same lowest order term [33]. Other connections can be found when one considers their colored versions. The colored invariants can be formulated in terms of skein theory, in terms of quantum groups, or in terms of Chern–Simons gauge theory [53]. In the language of quantum groups or Chern–Simons theory, different colorings correspond to different choices of group representation. The original HOMFLY and Kauffman invariants are obtained when one considers the fundamental representation of SU(N) and SO(N)/Sp(N), respectively. One could consider other representations, for example the adjoint representation. An intriguing result of Rudolph [49] states that the HOMFLY invariant of a link colored by the adjoint representation equals the square of the Kauffman polynomial of the same link, after the coefficients are reduced modulo two. This type of relationship has been recently extended by Morton and Ryder to more general colorings [41,43]. In spite of these connections, no unified, general picture has emerged to describe both invariants.
More recently, knot invariants have been reinterpreted in the context of string theory thanks to the Gopakumar–Vafa conjecture [14], which postulates an equivalence between the 1/N expansion of Chern–Simons theory on the three-sphere, and topological string theory on a Calabi–Yau manifold called the resolved conifold. As a consequence of this conjecture, correlation functions of Chern–Simons gauge theory with U(N ) gauge group (i.e. colored HOMFLY invariants) are given by correlation functions in open topological string theory, which mathematically correspond to open Gromov–Witten invariants. Since Gromov–Witten invariants enjoy highly nontrivial integrality properties [13,31,44], this equivalence provides strong structural results on the colored HOMFLY polynomial [29,31,44] which have been tested in detail in various cases [29,36,48] and finally proved in [37]. Moreover, there is a full cohomology theory behind these invariants [31] which should be connected to categorifications of the HOMFLY polynomial [16]. Therefore, the string theory description “explains” to a large extent many aspects of the colored U(N ) invariants and leads to new predictions about their algebraic structure. The string theory perspective is potentially the most powerful tool to understand the connections between the colored HOMFLY and Kauffman polynomials. From the point of view of Chern–Simons theory, these polynomials correspond to the gauge groups U(N ) and SO(N )/Sp(N ), respectively. But when a gauge theory has a string theory large N dual, as is the case here, the theory with orthogonal or symplectic gauge groups can be obtained from the theory with unitary gauge group by using a special type of orbifold action called an orientifold. The building block of an orientifold is an involution I in the target space X of the string theory, which is then combined with an orientation reversal in the worldsheet of the string to produce unoriented strings in the quotient space X/I. 
Very roughly, one finds that correlation functions in the orbifold theory are given by correlation functions in the SO/Sp gauge theories. As for any orbifold, these
functions are given by a sum over an “untwisted” sector involving oriented strings, and a “twisted” sector involving the unoriented strings introduced by the orientifold. The contributions from oriented strings are still given by correlation functions in the U(N) gauge theory. The use of orientifolds in the context of the Gopakumar–Vafa conjecture was initiated in [51], which identified the relevant involution of the resolved conifold and studied the closed string sector. This line of research was developed in more detail in [7,8]. In particular, [8] extended the orientifold action to the open string sector and pointed out that, as a consequence of the underlying string/gauge theory correspondence, the colored Kauffman invariant of a link should be given by the sum of an appropriate HOMFLY invariant plus an “unoriented” contribution. The results of [8] made it possible to formulate some partial conjectures on the structure of the Kauffman polynomial and test them in examples (see [45] for further tests)¹. Unfortunately, these results were not precise enough to provide a full, detailed string-based picture. The reason was that one of the crucial ingredients, the appropriate HOMFLY invariant that corresponds to the untwisted sector of the orientifold, was not identified. In this paper we remedy this situation and we identify these invariants as HOMFLY polynomials colored by composite representations of U(N). This will allow us to state a precise conjecture on the structure of the colored Kauffman polynomial of knots and links. In skein-theoretic language, the appearance of composite representations means that, in order to understand the colored Kauffman polynomial in the light of string theory, one has to use the full HOMFLY skein of the annulus (see for example [19]).
We will indeed see that the natural link invariant to consider in an unoriented theory involves both the colored Kauffman polynomial and the colored HOMFLY polynomial for composite representations and for all possible orientations of the link components. Our conjecture generalizes the results of [31,44] for the U(N ) case, and it “explains” various aspects of the relationship between the HOMFLY and the Kauffman polynomials, like for example Rudolph’s theorem. It also predicts some new, simple relationships between the Kauffman and the HOMFLY polynomial of links. In terms of open topological string theory, this paper adds little to the results of [8]. The bulk of the paper is then devoted to a detailed statement and discussion of the conjecture in the language of knot theory. Sect. 2 introduces our notation and reviews the construction of the colored HOMFLY and Kauffman polynomials, as well as of their relations. In Sect. 3 we review the conjecture of [31,44] and we state the new conjecture for the colored Kauffman polynomial. Sect. 4 provides some nontrivial evidence for the conjecture by looking at particular knots and links, and it explains how some standard results relating the HOMFLY and the Kauffman polynomials follow easily from our conjecture. In Sect. 5 we sketch the string theory arguments that lead to the conjecture, building on [7,8,51]. Finally, Sect. 6 contains some conclusions and prospects for future work.
2. Colored HOMFLY and Kauffman Polynomials

In this section we introduce various tools from the theory of symmetric polynomials and we recall the construction of the colored Kauffman and HOMFLY polynomials, mainly to fix notations.

¹ Some of the proposals of [8] were reformulated and recently proved in [10].
2.1. Basic ingredients from representation theory. Let $R$ be an irreducible representation of the symmetric group $S_\ell$. We will represent it by a Young diagram or partition,

$$R = \{l_i\}_{i=1,\dots,r(R)}, \qquad l_1 \ge l_2 \ge \cdots \ge l_{r(R)}, \tag{2.1}$$

where $l_i$ is the number of boxes in the $i$th row of the diagram and $r(R)$ is the total number of rows. Important quantities associated to the diagram are its total number of boxes,

$$\ell(R) = \sum_{i=1}^{r(R)} l_i, \tag{2.2}$$

which equals $\ell$, as well as the quantity

$$\kappa_R = \sum_{i=1}^{r(R)} l_i (l_i - 2i + 1). \tag{2.3}$$

The ring of symmetric polynomials in an infinite number of variables $\{v_i\}_{i \ge 1}$ will be denoted by $\Lambda$. It can be easily constructed as a direct limit of the ring of symmetric polynomials with a finite number of variables, see for example [38] for the details. It has a basis given by the Schur polynomials $s_R(v)$, which are labelled by Young diagrams. The multiplication rule for these polynomials is encoded in the Littlewood–Richardson coefficients

$$s_{R_1}(v)\, s_{R_2}(v) = \sum_{R} N^{R}_{R_1 R_2}\, s_R(v). \tag{2.4}$$

The identity of this ring is the Schur polynomial associated to the empty diagram, which we will denote by $R = \cdot$. We will also need the $n$th Adams operation

$$\psi_n(s_R(v)) = s_R(v^n) = s_R(v_1^n, v_2^n, \dots). \tag{2.5}$$

One can use elementary representation theory of the symmetric group to express $s_R(v^n)$ as a linear combination of Schur polynomials labelled by representations $U$ with $n \cdot \ell(R)$ boxes

$$s_R(v^n) = \sum_{U} c^{U}_{n;R}\, s_U(v). \tag{2.6}$$

Let $\chi_R$ be the character of the symmetric group associated to the diagram $R$. Let $C_\mu$ be the conjugacy class associated to the partition $\mu$, and let $|C_\mu|$ be the number of elements in the conjugacy class. It is easy to show that the coefficients $c^{U}_{n;R}$ are given by [36]

$$c^{U}_{n;R} = \sum_{\mu} \frac{1}{z_\mu}\, \chi_R(C_\mu)\, \chi_U(C_{n\mu}), \tag{2.7}$$

where $n\mu = (n\mu_1, n\mu_2, \cdots)$ and

$$z_\mu = \frac{\ell(\mu)!}{|C_\mu|}. \tag{2.8}$$
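The combinatorial quantities in (2.2), (2.3) and (2.8) are easy to tabulate. The following is a minimal sketch, not part of the original text: the function names are hypothetical, partitions are written as tuples of row lengths, and the closed form $z_\mu = \prod_k k^{m_k} m_k!$ (with $m_k$ the number of parts of $\mu$ equal to $k$) is the standard expression for the order of the centralizer, used here as an assumption to recover $|C_\mu|$ from (2.8).

```python
from collections import Counter
from math import factorial


def num_boxes(R):
    """ell(R): total number of boxes of the diagram, eq. (2.2)."""
    return sum(R)


def kappa(R):
    """kappa_R = sum_i l_i (l_i - 2i + 1), eq. (2.3); rows are numbered from i = 1."""
    return sum(l * (l - 2 * i + 1) for i, l in enumerate(R, start=1))


def z_mu(mu):
    """z_mu = prod_k k^{m_k} m_k!, with m_k the number of parts of mu equal to k."""
    z = 1
    for k, m in Counter(mu).items():
        z *= k ** m * factorial(m)
    return z


def class_size(mu):
    """|C_mu| = ell(mu)! / z_mu, i.e. eq. (2.8) solved for |C_mu|."""
    return factorial(num_boxes(mu)) // z_mu(mu)


if __name__ == "__main__":
    for R in [(2,), (1, 1), (2, 1), (3,)]:
        print(R, num_boxes(R), kappa(R), z_mu(R), class_size(R))
```

For instance, the two diagrams with two boxes have $\kappa_R = 2$ and $\kappa_R = -2$, and the three conjugacy classes of $S_3$ have sizes adding up to $3! = 6$.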
Fig. 1. A composite representation made out of the diagrams R and S
We will also regard a Young diagram $R$ as an irreducible representation of U(N). The quadratic Casimir of $R$ is then given by

$$C_R = \kappa_R + N \ell(R). \tag{2.9}$$

The most general irreducible representation of U(N) is a composite representation (see for example [25] for a collection of useful results on composite representations). Composite representations are labelled by a pair of Young diagrams

$$(R, S). \tag{2.10}$$

This representation is usually depicted as in the left-hand side of Fig. 1, where the second representation $S$ is drawn upside down at the bottom of the diagram. When regarded as a representation of SU(N), the composite representation corresponds to the diagram depicted on the right-hand side of Fig. 1, and it has in total

$$N \mu_1 + \ell(R) - \ell(S) \tag{2.11}$$

boxes, where $\mu_1$ is the number of boxes in the first row of $S$. For example, the composite representation $(\square, \square)$ is the adjoint representation of SU(N). It is easy to show that [15]

$$C_{(R,S)} = C_R + C_S. \tag{2.12}$$

The composite representation can be understood as the tensor product $R \otimes \overline{S}$, where $\overline{S}$ is the conjugate representation to $S$, plus a series of “lower order corrections” involving tensor products of smaller representations. The precise formula is [25]

$$(R, S) = \sum_{U,V,W} (-1)^{\ell(U)}\, N^{R}_{UV}\, N^{S}_{U^T W}\, (V \otimes \overline{W}). \tag{2.13}$$
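As a small worked illustration of (2.13), added here and not part of the original text, take $R = S = \square$. The Littlewood–Richardson coefficient $N^{\square}_{UV}$ is nonzero only for $(U, V) = (\cdot, \square)$ or $(\square, \cdot)$, so only two terms survive in the sum:

```latex
\begin{align*}
(\square,\square)
  &= (-1)^{0}\, N^{\square}_{\cdot\,\square}\, N^{\square}_{\cdot\,\square}\,
     (\square \otimes \overline{\square})
   + (-1)^{1}\, N^{\square}_{\square\,\cdot}\, N^{\square}_{\square\,\cdot}\,
     (\cdot \otimes \overline{\cdot}) \\
  &= \square \otimes \overline{\square} \;-\; \cdot\, ,
  \qquad\text{so that}\qquad
  \dim(\square,\square) = N \cdot N - 1 = N^{2} - 1 .
\end{align*}
```

This reproduces the familiar decomposition of the product of the fundamental and antifundamental representations into the adjoint plus a singlet, consistent with the statement that $(\square, \square)$ is the adjoint representation of SU(N).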
2.2. The colored HOMFLY polynomial. The HOMFLY polynomial of an oriented link $\mathcal{L}$, $P_{\mathcal{L}}(t, \nu)$, can be defined by using a planar projection of $\mathcal{L}$. This gives an oriented diagram in the plane which will be denoted as $D_{\mathcal{L}}$. The skein of the plane is the set of linear combinations of these diagrams, modulo the skein relations

[skein relation diagrams not reproduced] (2.14)
Using the skein relations, the diagram of a link $D_{\mathcal{L}}$ can be seen to be proportional to the trivial diagram $\bigcirc$. The proportionality factor $\langle D_{\mathcal{L}} \rangle$ is a scalar and gives a regular isotopy invariant (i.e. a quantity which is invariant under the Reidemeister moves II and III, but not under move I). A true ambient isotopy invariant is obtained by defining

$$P_{\mathcal{L}}(t, \nu) = \nu^{-\bar{w}(D_{\mathcal{L}})} \langle D_{\mathcal{L}} \rangle, \tag{2.15}$$

where $\bar{w}(D_{\mathcal{L}})$ is the self-writhe of the link diagram $D_{\mathcal{L}}$ (see for example [33], p. 173). This is defined as the sum of the signs of crossings at which all link components cross themselves, and not other components. It differs from the standard writhe $w(D_{\mathcal{L}})$ in twice the total linking number of the link, $\mathrm{lk}(\mathcal{L})$. Notice that the standard HOMFLY polynomial is usually defined by using the total writhe in (2.15) [33]. Therefore, the HOMFLY polynomial, as defined in (2.15), is given by the standard HOMFLY polynomial times a factor

$$\nu^{2\, \mathrm{lk}(\mathcal{L})}. \tag{2.16}$$

As discussed in detail in [31], this is the natural version of the HOMFLY polynomial from the string theory point of view. The HOMFLY invariant of the link is then defined by

$$\mathcal{H}(\mathcal{L}) = P_{\mathcal{L}}(t, \nu)\, \mathcal{H}(\bigcirc), \tag{2.17}$$

and we choose the normalization

$$\mathcal{H}(\bigcirc) = \frac{\nu - \nu^{-1}}{t - t^{-1}}. \tag{2.18}$$

From the skein theory point of view, the colored HOMFLY invariant of a link is obtained by considering satellites of the knot. Let $\mathcal{K}$ be a framed knot, and let $P$ be a knot diagram in the annulus. Around $\mathcal{K}$ there is a framing annulus, or equivalently a parallel of $\mathcal{K}$. The satellite knot

$$\mathcal{K} \star P \tag{2.19}$$

is obtained by replacing the framing annulus around $\mathcal{K}$ by $P$, or equivalently by mapping $P$ to $S^3$ using the parallel of $\mathcal{K}$. Here, $\mathcal{K}$ is called the companion knot while $P$ is called the pattern. In Fig. 2 we show a satellite where the companion $\mathcal{K}$ is the trefoil knot. Since the diagrams in the annulus form a vector space (called the skein of the annulus) we can obtain the most general satellite of a knot by considering the basis of this vector space. There is a very convenient basis constructed in [19] whose elements are labelled by pairs of Young diagrams $P_{(R,S)}$. Given a knot $\mathcal{K}$, the HOMFLY invariant colored by the partitions $(R, S)$ is simply

$$\mathcal{H}_{(R,S)}(\mathcal{K}) = \mathcal{H}(\mathcal{K} \star P_{(R,S)}). \tag{2.20}$$

If we have a link $\mathcal{L}$ with $L$ components $\mathcal{K}_1, \dots, \mathcal{K}_L$, one can color each component independently, and one obtains an invariant of the form

$$\mathcal{H}_{(R_1,S_1),\dots,(R_L,S_L)}(\mathcal{L}). \tag{2.21}$$
The colored HOMFLY invariant has various important properties which will be needed in the following:
Fig. 2. An example of a satellite knot. The companion knot $\mathcal{K}$ is the trefoil knot, and below we show the framing annulus. If we replace this framing annulus by the pattern $P$, we obtain the satellite $\mathcal{K} \star P$
1. The pattern $P_{(S,R)}$ is equal to the pattern $P_{(R,S)}$ with its orientation reversed. In particular, coloring with $(\square, \cdot)$ gives the original knot $\mathcal{K}$, while coloring with $(\cdot, \square)$ gives the knot $\overline{\mathcal{K}}$ with the opposite orientation. Since the HOMFLY invariant of a knot is invariant under reversal of orientation, we have that

$$\mathcal{H}_{(S,R)}(\mathcal{K}) = \mathcal{H}_{(R,S)}(\overline{\mathcal{K}}) = \mathcal{H}_{(R,S)}(\mathcal{K}). \tag{2.22}$$

However, the HOMFLY invariant of a link is only invariant under a global reversal of orientation, therefore in general one has that

$$\mathcal{H}_{(R_1,S_1),\dots,(S_j,R_j),\dots,(R_L,S_L)}(\mathcal{L}) = \mathcal{H}_{(R_1,S_1),\dots,(R_j,S_j),\dots,(R_L,S_L)}(\overline{\mathcal{L}}_j), \tag{2.23}$$

where $\overline{\mathcal{L}}_j$ is the link obtained from the link $\mathcal{L}$ by reversing the orientation of the $j$th component, see for example Fig. 3.

2. If one of the patterns is empty, say $S = \cdot$, the skein theory is simpler and it has been developed in for example [3,4]. In this case, the HOMFLY invariant of the knot $\mathcal{K}$ (which we denote by $\mathcal{H}_R(\mathcal{K})$) is equal to the invariant of $\mathcal{K}$ obtained from the quantum group $U_q(sl(N, \mathbb{C}))$ in the representation $R$, with the identification

$$t = q^{1/2}, \qquad \nu = t^N. \tag{2.24}$$

In particular we have

$$\mathcal{H}_R(\bigcirc) = \dim_q R, \tag{2.25}$$

where $\dim_q R$ is the quantum dimension of $R$.
Fig. 3. Changing (R, S) into (S, R) reverses the orientation of a knot. If the knot is a component of a link, this leads in general to different HOMFLY invariants
Fig. 4. Examples of patterns for various representations. The patterns are written as formal combinations of braids, and after closing them we find elements in the skein of the annulus. In the last example, 1 refers to the empty diagram
3. For a general pattern labeled by two representations $(R, S)$, the HOMFLY invariant of the knot $\mathcal{K}$, $\mathcal{H}_{(R,S)}(\mathcal{K})$, equals the invariant of $\mathcal{K}$ obtained from the quantum group $U_q(gl(N, \mathbb{C}))$ in the composite representation $(R, S)$. In particular [19]

$$\mathcal{H}_{(R,S)}(\bigcirc) = \dim_q (R, S) = \sum_{U,V,W} (-1)^{\ell(U)}\, N^{R}_{UV}\, N^{S}_{U^T W}\, \dim_q V\, \dim_q W. \tag{2.26}$$

Here, $U^T$ is the transposed Young diagram. The second equality follows from (2.13). In Fig. 4 we show some examples of patterns associated to different representations. The patterns are represented as elements in the braid group, which can be closed by joining the endpoints to produce patterns in the annulus. In the following we will denote

$$z = t - t^{-1}. \tag{2.27}$$
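The unknot normalizations above can be checked with exact rational arithmetic. The sketch below is an illustration added here, with hypothetical function names: it evaluates (2.18) at $\nu = t^N$, as in (2.24), and the adjoint case of (2.26), where only $U = \cdot$ and $U = \square$ contribute, giving $\dim_q(\square, \square) = (\dim_q \square)^2 - 1$. The value of $t$ is assumed to be a rational number different from $0$ and $\pm 1$.

```python
from fractions import Fraction


def quantum_integer(n, t):
    """[n]_t = (t^n - t^-n) / (t - t^-1), for t different from 0, +1 and -1."""
    return (t ** n - t ** (-n)) / (t - t ** (-1))


def unknot_fundamental(N, t):
    """H_box(unknot): eq. (2.18) evaluated at nu = t^N, as in (2.24)."""
    nu = t ** N
    return (nu - 1 / nu) / (t - 1 / t)


def unknot_adjoint(N, t):
    """dim_q(box, box) from (2.26): the U = empty term minus the U = box term."""
    d = quantum_integer(N, t)
    return d * d - 1


if __name__ == "__main__":
    t = Fraction(2)
    print(unknot_fundamental(3, t))   # quantum dimension of the fundamental of U(3)
    print(unknot_adjoint(3, t))       # quantum dimension of the adjoint
```

With these definitions the adjoint value coincides with $[N+1]_t [N-1]_t$, and for $t$ close to 1 it approaches the classical dimension $N^2 - 1$.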
Remark 2.1. In this paper, the above skein rules will be used to compute the values of the HOMFLY and Kauffman invariants in the standard framing. Starting from this framing, a change of framing by $f$ units is done through [40]

$$\mathcal{H}_R(\mathcal{K}) \to (-1)^{f \ell(R)}\, t^{f \kappa_R}\, \mathcal{H}_R(\mathcal{K}). \tag{2.28}$$

This is the rule that preserves the integrality properties of the invariants that will be discussed below. The framing of links is done in a similar way, with one framing factor like (2.28) for each component.

Example 2.2. The HOMFLY polynomial of the trefoil knot is, in our conventions,

$$P_{3_1}(t, \nu) = 2\nu^2 - \nu^4 + z^2 \nu^2, \tag{2.29}$$

while the HOMFLY polynomial of the Hopf link is

$$P_{2^2_1}(t, \nu) = (\nu - \nu^{-1})\, z^{-1} + \nu z. \tag{2.30}$$
2.3. The colored Kauffman polynomial. The Kauffman polynomial is also defined by a skein theory [23], but the diagrams correspond now to planar projections of unoriented knots and links. The skein relations are

[skein relation diagrams not reproduced] (2.31)

This is sometimes called the “Dubrovnik” version of the Kauffman invariant. As in the case of HOMFLY, the diagram of an unoriented link $\mathcal{L}$, which will be denoted by $E_{\mathcal{L}}$, is proportional to the trivial diagram $\bigcirc$, and the proportionality factor $\langle E_{\mathcal{L}} \rangle$ is a regular isotopy invariant. The Kauffman polynomial is defined as

$$F_{\mathcal{L}}(t, \nu) = \nu^{-\bar{w}(E_{\mathcal{L}})} \langle E_{\mathcal{L}} \rangle. \tag{2.32}$$

Like before, this differs from the standard Kauffman polynomial (as defined for example in [33]) in an overall factor (2.16). More importantly, the use of the self-writhe guarantees that the resulting polynomial is still an invariant of unoriented links. The Kauffman invariant of the link will be defined as

$$\mathcal{G}(\mathcal{L}) = F_{\mathcal{L}}(t, \nu)\, \mathcal{G}(\bigcirc), \tag{2.33}$$

and we choose the normalization

$$\mathcal{G}(\bigcirc) = 1 + \frac{\nu - \nu^{-1}}{t - t^{-1}}. \tag{2.34}$$

The colored Kauffman polynomial is obtained, similarly to the HOMFLY case, by considering the Kauffman skein of the annulus and by forming satellites with elements of
this skein taken as patterns. There is again a basis $y_R$ labelled by Young tableaux [5], and we define

$$\mathcal{G}_R(\mathcal{K}) = \mathcal{G}(\mathcal{K} \star y_R). \tag{2.35}$$

For a link of $L$ components, we can color each component independently, and one obtains in this way the colored Kauffman invariant of the link

$$\mathcal{G}_{R_1,\dots,R_L}(t, \nu). \tag{2.36}$$

The invariant defined in this way equals the invariant obtained from the quantum group $U_q(so(N, \mathbb{C}))$ in the representation $R$, after identifying

$$t = q^{1/2}, \qquad \nu = q^{N-1}. \tag{2.37}$$

In particular, for the unknot the invariant is equal to the quantum dimension of $R$,

$$\mathcal{G}_R(\bigcirc) = \dim_q^{SO(N)} R. \tag{2.38}$$

The results for the Kauffman invariants of knots and links will be presented in the standard framing. The change of framing is also done with the rule (2.28).

Example 2.3. The Kauffman polynomial of the trefoil knot is, in our conventions,

$$F_{3_1}(t, \nu) = 2\nu^2 - \nu^4 + z(-\nu^3 + \nu^5) + z^2(\nu^2 - \nu^4), \tag{2.39}$$

while that of the Hopf link is

$$F_{2^2_1}(t, \nu) = z^{-1}\left( \nu - \nu^{-1} + z + z^2 (\nu - \nu^{-1}) \right). \tag{2.40}$$
2.4. Relationships between the HOMFLY and the Kauffman invariants. As we mentioned in the Introduction, the colored HOMFLY and Kauffman invariants of a link are not unrelated. The simplest relation concerns the invariants of a link $\mathcal{L}$ in which all components have the coloring $R = \square$, i.e. the original HOMFLY and Kauffman polynomials. It is easy to show that these polynomials have the structure

$$P_{\mathcal{L}}(z, \nu) = z^{1-L} \sum_{i \ge 0} p_i(\nu)\, z^{2i}, \qquad F_{\mathcal{L}}(z, \nu) = z^{1-L} \sum_{i \ge 0} k_i(\nu)\, z^{i}. \tag{2.41}$$

It turns out that (see for example [33], Prop. 16.9)

$$p_0(\nu) = k_0(\nu). \tag{2.42}$$

In general, the Kauffman polynomial contains many more terms than the HOMFLY polynomial. In particular, as (2.41) shows, it contains both even and odd powers of $z$, while the HOMFLY polynomial only contains even powers. In the case of torus knots the HOMFLY polynomial can even be obtained from the Kauffman polynomial by the formula [32]

$$\mathcal{H}(z, \nu) = \frac{1}{2} \left( \mathcal{G}(z, \nu) - \mathcal{G}(-z, \nu) \right). \tag{2.43}$$
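Formula (2.43) can be tested on the trefoil, which is a torus knot. The sketch below is an illustration added here, with hypothetical function names: it builds $\mathcal{G}$ from the Kauffman polynomial (2.39) and the normalization (2.34), takes the part odd in $z$ as in (2.43), and divides by the HOMFLY unknot factor (2.18). Exact rational arithmetic is used, and the sample points are assumed to satisfy $z \neq 0$ and $\nu \neq \pm 1$ so that no division by zero occurs.

```python
from fractions import Fraction


def kauffman_unknot(z, nu):
    """G(unknot) = 1 + (nu - 1/nu) / z, eq. (2.34) with z = t - 1/t."""
    return 1 + (nu - 1 / nu) / z


def homfly_unknot(z, nu):
    """H(unknot) = (nu - 1/nu) / z, eq. (2.18)."""
    return (nu - 1 / nu) / z


def F_trefoil(z, nu):
    """Kauffman polynomial of the trefoil, eq. (2.39)."""
    return 2 * nu**2 - nu**4 + z * (-nu**3 + nu**5) + z**2 * (nu**2 - nu**4)


def G_trefoil(z, nu):
    """Kauffman invariant G = F * G(unknot), eq. (2.33)."""
    return F_trefoil(z, nu) * kauffman_unknot(z, nu)


def H_from_G(z, nu):
    """HOMFLY invariant predicted by eq. (2.43): the part of G odd in z."""
    return (G_trefoil(z, nu) - G_trefoil(-z, nu)) / 2


def P_from_G(z, nu):
    """HOMFLY polynomial obtained by stripping the unknot factor, eq. (2.17)."""
    return H_from_G(z, nu) / homfly_unknot(z, nu)


if __name__ == "__main__":
    z, nu = Fraction(1), Fraction(2)
    print(P_from_G(z, nu), 2 * nu**2 - nu**4 + z**2 * nu**2)
```

At every sample point the result agrees with $2\nu^2 - \nu^4 + z^2\nu^2$, the HOMFLY polynomial of the trefoil, whose $z^0$ coefficient is also the $z^0$ coefficient of (2.39), in line with (2.42).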
There are also highly nontrivial relations between the two invariants when we consider colorings. An intriguing theorem of Rudolph [49] states the following. Let $\mathcal{L}$ be an unoriented link with $L$ components. Pick an arbitrary orientation of $\mathcal{L}$ and consider its HOMFLY invariant

$$\mathcal{H}_{(\square,\square),\dots,(\square,\square)}(\mathcal{L}). \tag{2.44}$$

Due to (2.23), this invariant does not depend on the choice of orientation in $\mathcal{L}$, and it is therefore an invariant of the unoriented link. One can show that (2.44) is an element in $\mathbb{Z}[z^{\pm 1}, \nu^{\pm 1}]$ (see for example [41]). The square of the Kauffman invariant of $\mathcal{L}$, $\mathcal{G}^2(\mathcal{L})$, belongs to the same ring. By reducing the coefficients of these polynomials modulo 2, we obtain two polynomials in $\mathbb{Z}_2[z^{\pm 1}, \nu^{\pm 1}]$. Rudolph’s theorem states that these reduced polynomials are the same. In other words,

$$\mathcal{G}^2(\mathcal{L}) \equiv \mathcal{H}_{(\square,\square),\dots,(\square,\square)}(\mathcal{L}) \mod 2, \tag{2.45}$$

see [50] for this statement of Rudolph’s theorem. Morton and Ryder have recently extended this result to more general colorings [41,43]. This generalization requires more care since now the invariants have denominators involving products of $t^r - t^{-r}$, $r \in \mathbb{Z}_{>0}$. However, one can still make sense of the reduction modulo 2, and one obtains that, for any unoriented link $\mathcal{L}$,

$$\mathcal{G}^2_{(R_1,\dots,R_L)}(\mathcal{L}) = \mathcal{H}_{(R_1,R_1),\dots,(R_L,R_L)}(\mathcal{L}) \mod 2. \tag{2.46}$$
3. The Conjecture

3.1. Review of the conjecture for the colored HOMFLY invariant. We start by recalling the conjecture of [30,31,44] on the integrality structure of the colored HOMFLY polynomial. We first state the conjecture for knots, and then we briefly consider the generalization to links. Notice that these conjectures have now been proved in [37].

Let $\mathcal{K}$ be a knot, and let $H_R(\mathcal{K})$ be its colored HOMFLY invariant with the coloring R. We first define the generating functional

$$Z_H(v) = \sum_R H_R(\mathcal{K}) s_R(v), \tag{3.1}$$

understood as a formal power series in $s_R(v)$. Here we sum over all possible colorings, including the empty one $R = \cdot$. We also define the free energy

$$F_H(v) = \log Z_H(v), \tag{3.2}$$

which is also a formal power series. The reformulated HOMFLY invariants of $\mathcal{K}$, $f_R(t, \nu)$, are defined through the equation

$$F_H(v) = \sum_{d=1}^{\infty} \sum_R \frac{1}{d} f_R(t^d, \nu^d) s_R(v^d). \tag{3.3}$$

One can easily prove [30] that this equation determines uniquely the reformulated HOMFLY invariants $f_R$ in terms of the colored HOMFLY invariants of $\mathcal{K}$. Explicit formulae for $f_R$ in terms of $H_R$ for representations with up to three boxes are listed in [30].
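To illustrate how (3.3) is inverted in practice, here is a sketch of the first two orders. It assumes only $H_{\cdot} = 1$, the product rule $s_\square(v)^2 = s_{\yng(2)}(v) + s_{\yng(1,1)}(v)$ and the plethysm $s_\square(v^2) = s_{\yng(2)}(v) - s_{\yng(1,1)}(v)$; the resulting expressions should be compared against the list in [30] rather than taken as a quotation from it.

```latex
% Expanding log Z_H up to two boxes and matching with the d = 1, 2 terms of (3.3)
\begin{align*}
F_H(v) &= H_\square\, s_\square(v)
        + \Bigl(H_{\yng(2)} - \tfrac12 H_\square^2\Bigr) s_{\yng(2)}(v)
        + \Bigl(H_{\yng(1,1)} - \tfrac12 H_\square^2\Bigr) s_{\yng(1,1)}(v) + \cdots, \\
f_\square(t,\nu) &= H_\square(t,\nu), \\
f_{\yng(2)}(t,\nu) &= H_{\yng(2)}(t,\nu)
        - \tfrac12\Bigl(H_\square(t,\nu)^2 + H_\square(t^2,\nu^2)\Bigr), \\
f_{\yng(1,1)}(t,\nu) &= H_{\yng(1,1)}(t,\nu)
        - \tfrac12\Bigl(H_\square(t,\nu)^2 - H_\square(t^2,\nu^2)\Bigr).
\end{align*}
```

The relative sign of the $H_\square(t^2,\nu^2)$ term in the last two lines comes entirely from the $d = 2$ contribution $\tfrac12 f_\square(t^2,\nu^2)\, s_\square(v^2)$.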
If $\ell(R) = \ell(S)$ we define the matrix

$$M_{RS} = \sum_{\mu} \frac{1}{z_\mu} \chi_R(C_\mu) \chi_S(C_\mu) \frac{\prod_{i=1}^{\ell(\mu)} \left(t^{\mu_i} - t^{-\mu_i}\right)}{t - t^{-1}}, \tag{3.4}$$

which is zero otherwise. It is easy to show that this matrix is invertible (see for example [31,30]). We now define

$$\hat f_R(t, \nu) = \sum_S M^{-1}_{RS} f_S(t, \nu). \tag{3.5}$$

In principle, $\hat f_R(t, \nu)$ are rational functions, i.e. they belong to the ring $\mathbb{Q}[t^{\pm 1}, \nu^{\pm 1}]$ with denominators given by products of $t^r - t^{-r}$. However, we have the following

Conjecture 3.1. $\hat f_R(t, \nu) \in z^{-1}\mathbb{Z}[z^2, \nu^{\pm 1}]$, i.e. they have the structure

$$\hat f_R(t, \nu) = z^{-1} \sum_{g \ge 0} \sum_{Q \in \mathbb{Z}} N_{R;g,Q}\, z^{2g} \nu^{Q}, \tag{3.6}$$

where $N_{R;g,Q}$ are integer numbers and are called the BPS invariants of the knot $\mathcal{K}$. The sum appearing here is finite, i.e. for a given knot and a given coloring R, the $N_{R;g,Q}$ vanish except for finitely many values of g, Q.

The conjecture can be generalized to links. Let $\mathcal{L}$ be a link of L components $\mathcal{K}_1, \ldots, \mathcal{K}_L$, and let $v_l$, $l = 1, \ldots, L$, be formal sets of infinite variables. The subindex l refers here to the l-th component of the link, and each $v_l$ has the form $v_l = ((v_l)_1, (v_l)_2, \cdots)$. We define

$$Z_H(v_1, \ldots, v_L) = \sum_{R_1,\ldots,R_L} H_{R_1,\ldots,R_L}(\mathcal{L})\, s_{R_1}(v_1) \cdots s_{R_L}(v_L) \tag{3.7}$$

as well as the free energy

$$F_H(v_1, \ldots, v_L) = \log Z_H(v_1, \ldots, v_L). \tag{3.8}$$

If any of the $R_i$ are given by the trivial coloring $R_i = \cdot$, it is understood that $H_{R_1,\ldots,R_L}(\mathcal{L})$ is the HOMFLY invariant of the sublink of $\mathcal{L}$ obtained after removing the corresponding $\mathcal{K}_i$. The reformulated invariants are now defined by

$$F_H(v_1, \ldots, v_L) = \sum_{d=1}^{\infty} \sum_{R_1,\ldots,R_L} \frac{1}{d} f_{R_1,\ldots,R_L}(t^d, \nu^d)\, s_{R_1}(v_1^d) \cdots s_{R_L}(v_L^d) \tag{3.9}$$

and

$$\hat f_{R_1,\ldots,R_L}(t, \nu) = \sum_{S_1,\ldots,S_L} M^{-1}_{R_1 S_1} \cdots M^{-1}_{R_L S_L} f_{S_1,\ldots,S_L}(t, \nu). \tag{3.10}$$
Remark 3.1. Notice that, for the fundamental representation,

$$\hat f_{\square,\ldots,\square}(t, \nu) = f_{\square,\ldots,\square}(t, \nu). \tag{3.11}$$

We can now state the conjecture for links.

Conjecture 3.2. $\hat f_{R_1,\ldots,R_L}(t, \nu) \in z^{L-2}\mathbb{Z}[z^2, \nu^{\pm 1}]$, i.e. they have the structure

$$\hat f_{R_1,\ldots,R_L}(t, \nu) = z^{L-2} \sum_{g \ge 0} \sum_{Q \in \mathbb{Z}} N_{R_1,\ldots,R_L;g,Q}\, z^{2g} \nu^{Q}. \tag{3.12}$$

These conjectures also hold for framed knots and links [40].
3.2. The conjecture for the colored Kauffman invariant. We will first state the conjecture for knots. Let $\mathcal{K}$ be an oriented knot, and let $H_{(R,S)}$ be its HOMFLY invariant in the composite representation (R, S). The composite invariant of the knot $\mathcal{K}$, colored by the representation R, and denoted by $\mathcal{R}_R$, is given by

$$\mathcal{R}_R(\mathcal{K}) = \sum_{R_1, R_2} N^{R}_{R_1 R_2} H_{(R_1,R_2)}(\mathcal{K}), \tag{3.13}$$

where $N^{R}_{R_1 R_2}$ are the Littlewood–Richardson coefficients defined by (2.4). Notice that, due to (2.22), this invariant is independent of the choice of orientation of the knot, and it is therefore an invariant of unoriented knots.

Example 3.2. We give some simple examples of the composite invariant for colorings with up to two boxes:

$$\mathcal{R}_{\square} = 2H_{\square}, \qquad \mathcal{R}_{\yng(2)} = 2H_{\yng(2)} + H_{(\square,\square)}, \qquad \mathcal{R}_{\yng(1,1)} = 2H_{\yng(1,1)} + H_{(\square,\square)}. \tag{3.14}$$

Using these invariants we define the generating functionals

$$Z_{\mathcal{R}}(v) = \sum_R \mathcal{R}_R(\mathcal{K}) s_R(v), \qquad F_{\mathcal{R}}(v) = \log Z_{\mathcal{R}}(v). \tag{3.15}$$

We also define the generating functionals for colored Kauffman invariants of $\mathcal{K}$ as

$$Z_G(v) = \sum_R G_R(\mathcal{K}) s_R(v), \qquad F_G(v) = \log Z_G(v). \tag{3.16}$$

This allows us to define two sets of reformulated invariants, $h_R$ and $g_R$, as follows. The $h_R$ are defined by a relation identical to (3.3),

$$F_{\mathcal{R}}(v) = \sum_{d=1}^{\infty} \sum_R \frac{1}{d} h_R(t^d, \nu^d) s_R(v^d), \tag{3.17}$$

while the $g_R$ are defined by

$$F_G(v) - \frac{1}{2} F_{\mathcal{R}}(v) = \sum_{d\ \mathrm{odd}} \sum_R \frac{1}{d} g_R(t^d, \nu^d) s_R(v^d). \tag{3.18}$$
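Here the sum over d is over all positive odd integers. The $h_R$ can be explicitly obtained in terms of colored HOMFLY invariants for composite representations, while the $g_R$ can be written in terms of these invariants and the colored Kauffman invariants.

Example 3.3. We list here explicit expressions for the reformulated invariants $g_R$ of a knot, where R is a representation of up to three boxes. We have

$$\begin{aligned}
g_{\square} &= G_{\square} - H_{\square}, \\
g_{\yng(2)} &= G_{\yng(2)} - \frac{1}{2} G_{\square}^2 - H_{\yng(2)} + H_{\square}^2 - \frac{1}{2} H_{(\square,\square)}, \\
g_{\yng(1,1)} &= G_{\yng(1,1)} - \frac{1}{2} G_{\square}^2 - H_{\yng(1,1)} + H_{\square}^2 - \frac{1}{2} H_{(\square,\square)},
\end{aligned} \tag{3.19}$$

as well as

$$\begin{aligned}
g_{\yng(3)} &= G_{\yng(3)} - G_{\square} G_{\yng(2)} + \frac{1}{3} G_{\square}^3 - \frac{1}{3} g_{\square}(t^3, \nu^3) - H_{\yng(3)} - H_{(\yng(2),\square)} + 2 H_{\square} H_{\yng(2)} + H_{(\square,\square)} H_{\square} - \frac{4}{3} H_{\square}^3, \\
g_{\yng(2,1)} &= G_{\yng(2,1)} - G_{\square} G_{\yng(2)} - G_{\square} G_{\yng(1,1)} + \frac{2}{3} G_{\square}^3 + \frac{1}{3} g_{\square}(t^3, \nu^3) - H_{\yng(2,1)} - H_{(\yng(2),\square)} - H_{(\yng(1,1),\square)} \\
&\quad + 2 H_{\square} H_{\yng(2)} + 2 H_{\square} H_{\yng(1,1)} + 2 H_{(\square,\square)} H_{\square} - \frac{8}{3} H_{\square}^3, \\
g_{\yng(1,1,1)} &= G_{\yng(1,1,1)} - G_{\square} G_{\yng(1,1)} + \frac{1}{3} G_{\square}^3 - \frac{1}{3} g_{\square}(t^3, \nu^3) - H_{\yng(1,1,1)} - H_{(\yng(1,1),\square)} + 2 H_{\square} H_{\yng(1,1)} + H_{(\square,\square)} H_{\square} - \frac{4}{3} H_{\square}^3.
\end{aligned} \tag{3.20}$$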
The invariants $\hat h_R$, $\hat g_R$ are defined by a relation identical to (3.5),

$$\hat h_R(t, \nu) = \sum_S M^{-1}_{RS} h_S(t, \nu), \qquad \hat g_R(t, \nu) = \sum_S M^{-1}_{RS} g_S(t, \nu). \tag{3.21}$$

Like before, $\hat h_R(t, \nu)$ and $\hat g_R(t, \nu)$ belong in principle to the ring $\mathbb{Q}[t^{\pm 1}, \nu^{\pm 1}]$ with denominators given by products of $t^r - t^{-r}$. The conjecture for the colored Kauffman polynomial states an integrality property similar to the one we stated for the colored HOMFLY invariant.

Conjecture 3.3. We have that

$$\hat h_R(t, \nu) \in z^{-1}\mathbb{Z}[z^2, \nu^{\pm 1}], \qquad \hat g_R(t, \nu) \in \mathbb{Z}[z, \nu^{\pm 1}], \tag{3.22}$$

i.e. they have the structure

$$\hat h_R(t, \nu) = z^{-1} \sum_{g \ge 0} \sum_{Q \in \mathbb{Z}} N^{c=0}_{R;g,Q}\, z^{2g} \nu^{Q}, \qquad \hat g_R(t, \nu) = \sum_{g \ge 0} \sum_{Q \in \mathbb{Z}} \left( N^{c=1}_{R;g,Q}\, z^{2g} \nu^{Q} + N^{c=2}_{R;g,Q}\, z^{2g+1} \nu^{Q} \right), \tag{3.23}$$

where $N^{c=0,1,2}_{R;g,Q}$ are integers.
Again, there is a generalization to links as follows. Let $\mathcal{L}$ be an unoriented link, and pick an arbitrary orientation. We define the composite invariant of $\mathcal{L}$ as

$$\mathcal{R}_{R_1,\ldots,R_L}(\mathcal{L}) = \sum_{U_1,V_1,\cdots,U_L,V_L} N^{R_1}_{U_1 V_1} \cdots N^{R_L}_{U_L V_L} H_{(U_1,V_1),\ldots,(U_L,V_L)}(\mathcal{L}). \tag{3.24}$$

Due to (2.23), this invariant does not depend on the choice of orientation of $\mathcal{L}$, and it is therefore an invariant of unoriented links. We further define the generating functionals

$$\begin{aligned}
Z_{\mathcal{R}}(v_1, \ldots, v_L) &= \sum_{R_1,\ldots,R_L} \mathcal{R}_{R_1,\ldots,R_L}(\mathcal{L})\, s_{R_1}(v_1) \cdots s_{R_L}(v_L), \\
Z_G(v_1, \ldots, v_L) &= \sum_{R_1,\ldots,R_L} G_{R_1,\ldots,R_L}(\mathcal{L})\, s_{R_1}(v_1) \cdots s_{R_L}(v_L),
\end{aligned} \tag{3.25}$$
as well as the free energies

$$F_{\mathcal{R}}(v_1, \ldots, v_L) = \log Z_{\mathcal{R}}(v_1, \ldots, v_L), \qquad F_G(v_1, \ldots, v_L) = \log Z_G(v_1, \ldots, v_L). \tag{3.26}$$

The reformulated invariants $h_{R_1,\ldots,R_L}$, $g_{R_1,\ldots,R_L}$ are now defined by

$$F_{\mathcal{R}}(v_1, \ldots, v_L) = \sum_{d=1}^{\infty} \sum_{R_1,\ldots,R_L} \frac{1}{d} h_{R_1,\ldots,R_L}(t^d, \nu^d)\, s_{R_1}(v_1^d) \cdots s_{R_L}(v_L^d) \tag{3.27}$$

and

$$F_G(v_1, \ldots, v_L) - \frac{1}{2} F_{\mathcal{R}}(v_1, \ldots, v_L) = \sum_{d\ \mathrm{odd}} \sum_{R_1,\ldots,R_L} \frac{1}{d} g_{R_1,\ldots,R_L}(t^d, \nu^d)\, s_{R_1}(v_1^d) \cdots s_{R_L}(v_L^d). \tag{3.28}$$

Finally, the "hatted" invariants are defined by the relation

$$\begin{aligned}
\hat h_{R_1,\ldots,R_L}(t, \nu) &= \sum_{S_1,\ldots,S_L} M^{-1}_{R_1 S_1} \cdots M^{-1}_{R_L S_L} h_{S_1,\ldots,S_L}(t, \nu), \\
\hat g_{R_1,\ldots,R_L}(t, \nu) &= \sum_{S_1,\ldots,S_L} M^{-1}_{R_1 S_1} \cdots M^{-1}_{R_L S_L} g_{S_1,\ldots,S_L}(t, \nu).
\end{aligned} \tag{3.29}$$
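Example 3.4. For links $\mathcal{L}$ of two components $\mathcal{K}_1$, $\mathcal{K}_2$ we have

$$\begin{aligned}
g_{\square,\square}(\mathcal{L}) &= G_{\square,\square}(\mathcal{L}) - G_{\square}(\mathcal{K}_1) G_{\square}(\mathcal{K}_2) - H_{\square,\square}(\mathcal{L}) - H_{\square,\square}(\overline{\mathcal{L}}) + 2 H_{\square}(\mathcal{K}_1) H_{\square}(\mathcal{K}_2), \\
g_{\yng(2),\square}(\mathcal{L}) &= G_{\yng(2),\square}(\mathcal{L}) - G_{\square,\square}(\mathcal{L}) G_{\square}(\mathcal{K}_1) - G_{\yng(2)}(\mathcal{K}_1) G_{\square}(\mathcal{K}_2) + G_{\square}(\mathcal{K}_1)^2 G_{\square}(\mathcal{K}_2) \\
&\quad - H_{\yng(2),\square}(\mathcal{L}) - H_{\yng(2),\square}(\overline{\mathcal{L}}) - H_{(\square,\square),\square}(\mathcal{L}) + 2\left( H_{\square,\square}(\mathcal{L}) + H_{\square,\square}(\overline{\mathcal{L}}) \right) H_{\square}(\mathcal{K}_1) \\
&\quad + 2 H_{\yng(2)}(\mathcal{K}_1) H_{\square}(\mathcal{K}_2) + H_{(\square,\square)}(\mathcal{K}_1) H_{\square}(\mathcal{K}_2) - 4 H_{\square}(\mathcal{K}_1)^2 H_{\square}(\mathcal{K}_2), \\
g_{\yng(1,1),\square}(\mathcal{L}) &= G_{\yng(1,1),\square}(\mathcal{L}) - G_{\square,\square}(\mathcal{L}) G_{\square}(\mathcal{K}_1) - G_{\yng(1,1)}(\mathcal{K}_1) G_{\square}(\mathcal{K}_2) + G_{\square}(\mathcal{K}_1)^2 G_{\square}(\mathcal{K}_2) \\
&\quad - H_{\yng(1,1),\square}(\mathcal{L}) - H_{\yng(1,1),\square}(\overline{\mathcal{L}}) - H_{(\square,\square),\square}(\mathcal{L}) + 2\left( H_{\square,\square}(\mathcal{L}) + H_{\square,\square}(\overline{\mathcal{L}}) \right) H_{\square}(\mathcal{K}_1) \\
&\quad + 2 H_{\yng(1,1)}(\mathcal{K}_1) H_{\square}(\mathcal{K}_2) + H_{(\square,\square)}(\mathcal{K}_1) H_{\square}(\mathcal{K}_2) - 4 H_{\square}(\mathcal{K}_1)^2 H_{\square}(\mathcal{K}_2).
\end{aligned} \tag{3.30}$$

In these equations, $\overline{\mathcal{L}}$ is the link obtained from $\mathcal{L}$ by inverting the orientation of one of its components.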
Fig. 5. The reformulated invariant of an unoriented link $\mathcal{L}$, $\hat g_{R_1,\ldots,R_L}(\mathcal{L})$, involves the Kauffman invariant of $\mathcal{L}$, together with the HOMFLY invariants of all possible choices of orientations for the components of the link. Here we illustrate it for the fundamental representation and for a two-component link
Example 3.5. For general links of L components it is easy to write down a general formula for $g_{\square,\ldots,\square}(\mathcal{L})$. We first define the connected Kauffman invariant of a link $\mathcal{L}$ as the term multiplying $s_{\square}(v_1) \cdots s_{\square}(v_L)$ in the expansion of $F_G(v_1, \cdots, v_L)$. It is given by

$$G^{(c)}(\mathcal{L}) = G(\mathcal{L}) - \sum_{j=1}^{L} G(\mathcal{K}_j) G(\mathcal{L}_j) + \cdots, \tag{3.31}$$
where the link $\mathcal{L}_j$ is obtained from $\mathcal{L}$ by removing the j-th component. Further corrections involve all possible sublinks of $\mathcal{L}$, and the combinatorics appearing in the formula is the same one that appears in the calculation of the cumulants of a probability distribution. A similar definition gives the connected HOMFLY invariant of a link, $H^{(c)}(\mathcal{L})$, which was studied in detail in [29,31].

We now consider all possible oriented links that can be obtained from an unoriented link $\mathcal{L}$ of L components by choosing different orientations in their component knots. In principle there are $2^L$ oriented links that can be obtained in this way, but they can be grouped in pairs that differ in an overall reversal of orientation, and therefore lead to the same HOMFLY invariant. We conclude that there are $2^{L-1}$ different links which differ in the relative orientation of their components and have a priori different HOMFLY invariants. We will denote these links by $\mathcal{L}_\alpha$, where $\alpha = 1, \cdots, 2^{L-1}$. Using (2.23), it is easy to see that the composite invariant (3.24) for $R_1 = \cdots = R_L = \square$ involves the sum over all possible orientations of the link, and we have

$$\mathcal{R}_{\square,\ldots,\square}(\mathcal{L}) = 2 \sum_{\alpha=1}^{2^{L-1}} H(\mathcal{L}_\alpha). \tag{3.32}$$
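The counting behind (3.32) can be made concrete with a small enumeration. The sketch below encodes an orientation as a tuple of signs, one per component (an illustrative encoding, not notation from the text), and keeps one representative of each pair related by an overall reversal.

```python
from itertools import product

def relative_orientations(num_components):
    """Orientations of an L-component link, identified up to overall reversal.

    An orientation is a tuple of +1/-1 signs, one per component. Reversing all
    components at once gives the same HOMFLY invariant, so one representative
    per pair is kept: the one whose first sign is +1.
    """
    return [signs
            for signs in product((1, -1), repeat=num_components)
            if signs[0] == 1]

for L in range(1, 6):
    # 2^L orientations collapse to 2^(L-1) classes
    assert len(relative_orientations(L)) == 2 ** (L - 1)

print(relative_orientations(2))
```

For L = 2 the two classes correspond to the link and the link with one component reversed, which is why the prefactor 2 appears in (3.32).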
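The reformulated invariant $g_{\square,\ldots,\square}(\mathcal{L})$ is then given by

$$g_{\square,\ldots,\square}(\mathcal{L}) = G^{(c)}(\mathcal{L}) - \sum_{\alpha=1}^{2^{L-1}} H^{(c)}(\mathcal{L}_\alpha). \tag{3.33}$$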
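The cumulant-like combinatorics of the connected invariants entering (3.33) can be sketched as follows. The code assumes the standard joint-cumulant formula, a sum over set partitions of the components weighted by $(-1)^{k-1}(k-1)!$, which reproduces the first terms of (3.31). The function names and the toy numerical values are hypothetical and stand in for actual link invariants.

```python
from math import factorial

def set_partitions(items):
    """Yield all partitions of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part                     # `first` in its own block
        for i in range(len(part)):                 # or joined to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def connected(invariant, components):
    """Connected invariant: sum over partitions of (-1)^(k-1) (k-1)! prod_B invariant(B)."""
    total = 0
    for part in set_partitions(list(components)):
        k = len(part)
        term = (-1) ** (k - 1) * factorial(k - 1)
        for block in part:
            term *= invariant(frozenset(block))
        total += term
    return total

# Two components: G(L) - G(K1) G(K2), with toy values
toy = {frozenset({1}): 2, frozenset({2}): 3, frozenset({1, 2}): 10}
print(connected(toy.__getitem__, [1, 2]))          # 10 - 2*3

# A multiplicative (completely unlinked) toy assignment has vanishing connected part
values = {1: 2, 2: 3, 3: 5}
def split_invariant(block):
    out = 1
    for j in block:
        out *= values[j]
    return out
print(connected(split_invariant, [1, 2, 3]))
```

The second call illustrates why connected invariants are the natural objects here: they vanish whenever the invariant factorizes over the components.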
In general, the reformulated invariant of an unoriented link $\mathcal{L}$, $\hat g_{R_1,\ldots,R_L}(\mathcal{L})$, involves colored Kauffman invariants of $\mathcal{L}$, together with colored HOMFLY invariants of all possible choices of orientations of the link. This is an important feature of the reformulated invariants, and we illustrate it graphically for a two-component link in Fig. 5. The fact that one has to consider all possible orientations of the unoriented link bears some resemblance to Jaeger's model for the Kauffman polynomial in terms of the HOMFLY polynomial (see for example [24], pp. 219–222), and it has appeared before in the context of the Kauffman invariant in [46]. We can now state our conjecture for the Kauffman invariant of links.
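Conjecture 3.4. We have that

$$\hat h_{R_1,\ldots,R_L}(t, \nu) \in z^{L-2}\mathbb{Z}[z^2, \nu^{\pm 1}], \qquad \hat g_{R_1,\ldots,R_L}(t, \nu) \in z^{L-1}\mathbb{Z}[z, \nu^{\pm 1}], \tag{3.34}$$

i.e. they have the structure

$$\begin{aligned}
\hat h_{R_1,\ldots,R_L}(t, \nu) &= z^{L-2} \sum_{g \ge 0} \sum_{Q \in \mathbb{Z}} N^{c=0}_{R_1,\ldots,R_L;g,Q}\, z^{2g} \nu^{Q}, \\
\hat g_{R_1,\ldots,R_L}(t, \nu) &= z^{L-1} \sum_{g \ge 0} \sum_{Q \in \mathbb{Z}} \left( N^{c=1}_{R_1,\ldots,R_L;g,Q}\, z^{2g} \nu^{Q} + N^{c=2}_{R_1,\ldots,R_L;g,Q}\, z^{2g+1} \nu^{Q} \right).
\end{aligned} \tag{3.35}$$

Remark 3.6. It follows from this conjecture that

$$h_{R_1,\ldots,R_L} \in z^{L-2}\mathbb{Z}[t^{\pm 1}, \nu^{\pm 1}], \qquad g_{R_1,\ldots,R_L} \in z^{L-1}\mathbb{Z}[t^{\pm 1}, \nu^{\pm 1}]. \tag{3.36}$$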
As in the colored HOMFLY case, the conjecture is supposed to hold as well for framed knots and links.

4. Evidence

4.1. Direct computations. In this section we provide some evidence for our conjectures concerning the colored Kauffman invariant of knots and links. The first type of evidence follows from direct computation of the invariants $\hat h_{R_1,\ldots,R_L}$, $\hat g_{R_1,\ldots,R_L}$ for simple knots and links and for representations with a small number of boxes.

Example 4.1. The simplest example is of course the unknot. The colored HOMFLY and Kauffman invariants are just quantum dimensions. For the standard framing one finds that the only nonvanishing $\hat h_R$, $\hat g_R$ are

$$\hat h_{\square} = 2(\nu - \nu^{-1}) z^{-1}, \qquad \hat h_{\yng(2)} = -z^{-1}, \qquad \hat h_{\yng(1,1)} = -z^{-1}, \qquad \hat g_{\square} = 1. \tag{4.1}$$
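The two-box entries of (4.1) can be reproduced numerically from the definitions (3.4), (3.14), (3.17) and (3.21). The sketch below evaluates everything at rational sample values of t and ν with exact arithmetic. It assumes the standard U(N) quantum dimensions with $\nu = t^N$ and $z = t - t^{-1}$, including $\dim_q(\square,\square) = ((\nu-\nu^{-1})/z)^2 - 1$; these formulas are not quoted in this section, and the function name is illustrative.

```python
from fractions import Fraction

def unknot_h_hat(t, nu):
    """Return (h_sym, h_asym, hhat_sym, hhat_asym) for the unknot at sample (t, nu)."""
    z = t - 1 / t

    def H1(t, nu):                       # dim_q of the fundamental, assumed standard
        return (nu - 1 / nu) / (t - 1 / t)

    denom = (t - 1 / t) * (t**2 - t**-2)
    H_sym = (nu - 1 / nu) * (nu * t - 1 / (nu * t)) / denom    # two-box symmetric
    H_asym = (nu - 1 / nu) * (nu / t - t / nu) / denom         # two-box antisymmetric
    H_adj = H1(t, nu) ** 2 - 1           # composite representation (box, box)

    # composite invariants of Example 3.2
    R1 = 2 * H1(t, nu)
    R_sym = 2 * H_sym + H_adj
    R_asym = 2 * H_asym + H_adj

    # two-box part of (3.17), using s_box(v^2) = s_sym(v) - s_asym(v)
    R1_d2 = 2 * H1(t**2, nu**2)          # R_box evaluated at (t^2, nu^2)
    h_sym = R_sym - R1**2 / 2 - R1_d2 / 2
    h_asym = R_asym - R1**2 / 2 + R1_d2 / 2

    # matrix (3.4) on two boxes: mu = (1,1) and mu = (2), both with z_mu = 2
    w11 = z / 2
    w2 = (t**2 - t**-2) / (2 * z)
    M_diag = w11 + w2                    # both characters equal
    M_off = w11 - w2                     # characters differ by a sign on mu = (2)
    det = M_diag**2 - M_off**2
    hhat_sym = (M_diag * h_sym - M_off * h_asym) / det
    hhat_asym = (M_diag * h_asym - M_off * h_sym) / det
    return h_sym, h_asym, hhat_sym, hhat_asym

t, nu = Fraction(2), Fraction(3)
h_sym, h_asym, hhat_sym, hhat_asym = unknot_h_hat(t, nu)
assert h_sym == h_asym == -1
assert hhat_sym == hhat_asym == -1 / (t - 1 / t)
print(hhat_sym)   # equals -1/z at the sample point
```

At this level the check is only numerical, but since both sides are rational functions of t and ν, agreement at a few generic sample points is already a strong consistency test of the conventions.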
Although we have only computed the reformulated invariants up to four boxes, we conjecture that the $\hat h_R$, $\hat g_R$ vanish for all remaining representations. Of course, this has the structure predicted by our conjecture.

We now consider more complicated examples. As in [29,31], a useful testing ground are torus knots and links, since for them one can write down general expressions for the colored invariants in any representation. Torus knots are labelled by two coprime integers n, m, and we will denote them by $\mathcal{K}_{n,m}$. Torus links are labelled by two integers, and their g.c.d. is the number of components of the link, L. We will denote a torus link by $\mathcal{L}_{Ln,Lm}$, where n, m are coprime.

Explicit formulae for the HOMFLY invariant of a torus knot $\mathcal{K}_{n,m}$, colored by a representation R, can be obtained in many ways. In the context of Chern–Simons theory, one can use for example the formalism of knot operators of [28] to write down general expressions [29]. In fact [52], one can obtain formulae in the knot operator formalism which are much simpler than those presented in [29] and make contact with the elegant result derived in [36] by using Hecke algebras. The formula one obtains is simply

$$H_R(\mathcal{K}_{n,m}) = t^{nm C_R} \sum_U c^{U}_{n;R}\, t^{-\frac{m}{n} C_U} \dim_q U, \tag{4.2}$$

where $c^{U}_{n;R}$ is defined in (2.7). Of course one has to set $t^N = \nu$. This formula is also valid for composite representations, which after all are just a special type of representations of U(N). The generalization to torus links is immediate, as noticed in [31], and the invariant for $\mathcal{L}_{Ln,Lm}$ is given by

$$H_{R_1,\ldots,R_L}(\mathcal{L}_{Ln,Lm}) = \sum_S N^{S}_{R_1,\ldots,R_L}\, t^{mn\left(C_S - \sum_{j=1}^{L} C_{R_j}\right)} H_S(\mathcal{K}_{n,m}). \tag{4.3}$$
This expression is also valid for composite representations, but one has to use the appropriate Littlewood–Richardson coefficients (as computed in for example [25]).

Example 4.2. By using these formulae one obtains, for the trefoil knot,

$$H_{(\square,\square)}(\mathcal{K}_{2,3}) = \dim_q(\square,\square) \left[ 4\nu^4 - 4\nu^6 + \nu^8 + z^2\left(4\nu^4 - 7\nu^6 + 2\nu^8 + \nu^{10}\right) + z^4\left(\nu^4 - 2\nu^6 + \nu^8\right) \right], \tag{4.4}$$

while for the Hopf link we have for example

$$H_{(\square,\square),(\square,\cdot)}(\mathcal{L}_{2,2}) = \dim_q(\square,\square)\, \dim_q \square\, \left(1 + z^2\right). \tag{4.5}$$
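Formula (4.4) offers a direct test of Rudolph's theorem (2.45) for the trefoil, using only the Kauffman polynomial (2.39). The sketch below assumes $G = (1 + x) F_{3_1}$ with $x = (\nu - \nu^{-1})/z$ and $\dim_q(\square,\square) = x^2 - 1$; the latter is the standard quantum dimension and is not quoted in this section. It then checks that every coefficient of $G^2 - H_{(\square,\square)}$ is even. The helper names are illustrative.

```python
from collections import defaultdict

def poly(*terms):
    """Laurent polynomial in (z, nu), built from (coeff, z_exp, nu_exp) triples."""
    out = defaultdict(int)
    for coeff, z_exp, nu_exp in terms:
        out[(z_exp, nu_exp)] += coeff
    return {k: c for k, c in out.items() if c != 0}

def add(p, q, sign=1):
    """p + sign * q for Laurent polynomials stored as dictionaries."""
    out = defaultdict(int, p)
    for k, c in q.items():
        out[k] += sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(p, q):
    out = defaultdict(int)
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return {k: c for k, c in out.items() if c != 0}

one = poly((1, 0, 0))
x = poly((1, -1, 1), (-1, -1, -1))            # (nu - 1/nu) / z

# Kauffman polynomial of the trefoil, eq. (2.39)
F = poly((2, 0, 2), (-1, 0, 4), (-1, 1, 3), (1, 1, 5), (1, 2, 2), (-1, 2, 4))
# Square bracket of eq. (4.4)
bracket = poly((4, 0, 4), (-4, 0, 6), (1, 0, 8),
               (4, 2, 4), (-7, 2, 6), (2, 2, 8), (1, 2, 10),
               (1, 4, 4), (-2, 4, 6), (1, 4, 8))

G = mul(add(one, x), F)                       # Kauffman invariant of the trefoil
dim_adj = add(mul(x, x), one, sign=-1)        # assumed: dim_q(box, box) = x^2 - 1
H = mul(dim_adj, bracket)                     # eq. (4.4)

difference = add(mul(G, G), H, sign=-1)
assert difference and all(c % 2 == 0 for c in difference.values())
print("G^2 = H_(box,box) mod 2 for the trefoil")
```

The mechanism is visible by hand as well: modulo 2 the square of a polynomial is the sum of the squares of its monomials, and the square of (2.39) reduced this way coincides with the bracket of (4.4) reduced modulo 2.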
Remark 4.3. In the case of the Hopf link, a general expression for H(R1 ,S1 ),(R2 ,S2 ) in terms of the topological vertex [1] can be read from the results for the “covering contribution” in [8]. This expression has reappeared in other studies of topological string theory, see [2,22]. Particular cases have been computed by using skein theory in [42]. One also needs to compute the colored Kauffman invariants of torus knots and links. Very likely, the expression (4.2) generalizes to the Kauffman case by using the group theory data for SO(N )/Sp(N ), but we have used the expression presented in [8] for torus knots of the type (2, 2m + 1), based on the approach of [11]. With these ingredients it is straightforward to compute the reformulated invariants gˆ R for torus knots, although the expressions quickly become quite complicated. We have verified the conjecture for various framed torus knots and links and representations with up to four boxes.
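Example 4.4. For the trefoil knot in the standard framing one finds

$$\begin{aligned}
\hat g_{\yng(2)} &= -21\nu^{11} + 79\nu^{9} - 111\nu^{7} + 69\nu^{5} - 16\nu^{3} + 21z\left(\nu^{12} - 3\nu^{10} + 3\nu^{8} - \nu^{6}\right) \\
&\quad + z^2\left(-70\nu^{11} + 251\nu^{9} - 307\nu^{7} + 146\nu^{5} - 20\nu^{3}\right) + 7z^3\left(10\nu^{12} - 33\nu^{10} + 33\nu^{8} - 10\nu^{6}\right) \\
&\quad - 2z^4\left(42\nu^{11} - 165\nu^{9} + 183\nu^{7} - 64\nu^{5} + 4\nu^{3}\right) + 14z^5\left(6\nu^{12} - 23\nu^{10} + 23\nu^{8} - 6\nu^{6}\right) \\
&\quad + z^6\left(-45\nu^{11} + 220\nu^{9} - 230\nu^{7} + 56\nu^{5} - \nu^{3}\right) + 3z^7\left(15\nu^{12} - 73\nu^{10} + 73\nu^{8} - 15\nu^{6}\right) \\
&\quad + z^8\left(-11\nu^{11} + 78\nu^{9} - 79\nu^{7} + 12\nu^{5}\right) + z^9\left(-11\nu^{11} + 78\nu^{9} - 79\nu^{7} + 12\nu^{5}\right) \\
&\quad + z^{10}\left(-\nu^{11} + 14\nu^{9} - 14\nu^{7} + \nu^{5}\right) + z^{11}\left(\nu^{12} - 14\nu^{10} + 14\nu^{8} - \nu^{6}\right) + z^{12}\left(\nu^{9} - \nu^{7}\right) + z^{13}\left(\nu^{8} - \nu^{10}\right), \\
\hat g_{\yng(1,1)} &= -15\nu^{11} + 53\nu^{9} - 69\nu^{7} + 39\nu^{5} - 8\nu^{3} + 15z\left(\nu^{12} - 3\nu^{10} + 3\nu^{8} - \nu^{6}\right) \\
&\quad + z^2\left(-35\nu^{11} + 126\nu^{9} - 146\nu^{7} + 61\nu^{5} - 6\nu^{3}\right) + 5z^3\left(7\nu^{12} - 24\nu^{10} + 24\nu^{8} - 7\nu^{6}\right) \\
&\quad + z^4\left(-28\nu^{11} + 120\nu^{9} - 128\nu^{7} + 37\nu^{5} - \nu^{3}\right) + 7z^5\left(4\nu^{12} - 17\nu^{10} + 17\nu^{8} - 4\nu^{6}\right) \\
&\quad + z^6\left(-9\nu^{11} + 55\nu^{9} - 56\nu^{7} + 10\nu^{5}\right) + z^7\left(9\nu^{12} - 55\nu^{10} + 55\nu^{8} - 9\nu^{6}\right) \\
&\quad + z^8\left(-\nu^{11} + 12\nu^{9} - 12\nu^{7} + \nu^{5}\right) + z^9\left(\nu^{12} - 12\nu^{10} + 12\nu^{8} - \nu^{6}\right) + z^{10}\left(\nu^{9} - \nu^{7}\right) + z^{11}\left(\nu^{8} - \nu^{10}\right).
\end{aligned} \tag{4.6}$$

From these expressions one can read the BPS invariants $N^{c=1,2}_{\yng(2);g,Q}$ and $N^{c=1,2}_{\yng(1,1);g,Q}$. In [8] the invariants with c = 1 were obtained by exploiting parity properties of the Kauffman invariant, but the c = 2 invariants were not determined. Notice that our convention for the matrix $M_{RS}$ is different from the one in [8], so in order to compare with the results for c = 1 presented in [8] one has to change $N_{R;g,Q} \to (-1)^{\ell(R)-1} N_{R^T;g,Q}$.

Example 4.5. For the Hopf link one finds

$$\hat g_{\square,\square} = z(\nu - \nu^{-1}), \qquad \hat g_{\yng(2),\square} = z(\nu^2 - 1), \qquad \hat g_{\yng(1,1),\square} = z(\nu^{-2} - 1). \tag{4.7}$$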
4.2. General predictions for knots. We now discuss general predictions of our conjecture. We will see that it makes contact with well-known properties of the Kauffman invariant, and that it makes some simple, new predictions for the structure of the Kauffman invariant of links. We start by discussing general predictions for knots. From their definition (2.17), (2.33) we can write

$$\begin{aligned}
G(\mathcal{K}) &= \left(1 + \frac{\nu - \nu^{-1}}{z}\right) \sum_{i \ge 0} k_i^{\mathcal{K}}(\nu) z^{i}, \\
H(\mathcal{K}) &= \frac{\nu - \nu^{-1}}{z} \sum_{i \ge 0} p_i^{\mathcal{K}}(\nu) z^{2i}.
\end{aligned} \tag{4.8}$$

According to our conjecture, $\hat g_{\square} = g_{\square}$ has no terms in $z^{-1}$, therefore one must have

$$k_0^{\mathcal{K}}(\nu) = p_0^{\mathcal{K}}(\nu), \tag{4.9}$$

which is (2.42) in the case of knots. Therefore, the equality of the lowest order terms of the HOMFLY and Kauffman polynomials is a simple consequence of our conjecture. This was already noticed in [8].

We now consider the reformulated polynomial $g_R$ for representations with two boxes. Our conjecture implies that this quantity belongs to $z^{-1}\mathbb{Z}[\nu^{\pm 1}, t^{\pm 1}]$. By looking at the definition of $g_{\yng(2)}(\mathcal{K})$, $g_{\yng(1,1)}(\mathcal{K})$ in terms of colored Kauffman and HOMFLY invariants, we see that the only possible term which might spoil integrality is

$$\frac{1}{2}\left( G(\mathcal{K})^2 + H_{(\square,\square)}(\mathcal{K}) \right). \tag{4.10}$$

Therefore our conjecture implies that

$$G(\mathcal{K})^2 \equiv H_{(\square,\square)}(\mathcal{K}) \mod 2. \tag{4.11}$$
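This is precisely Rudolph's theorem (2.45) for knots. Very likely, our integrality conjecture also leads to the generalization of Rudolph's theorem due to Morton and Ryder (2.46), although the combinatorics becomes more involved. As an example, we will briefly show how to derive (2.46) in the case of knots (L = 1) and with $R = \yng(2)$. To do this, we look at $g_{\yng(4)}$, which is given by

$$\begin{aligned}
g_{\yng(4)} &= G_{\yng(4)} - G_{\square} G_{\yng(3)} - \frac{1}{2} G_{\yng(2)}^2 + G_{\square}^2 G_{\yng(2)} - \frac{1}{4} G_{\square}^4 \\
&\quad - \frac{1}{2}\left( \mathcal{R}_{\yng(4)} - \mathcal{R}_{\square} \mathcal{R}_{\yng(3)} - \frac{1}{2} \mathcal{R}_{\yng(2)}^2 + \mathcal{R}_{\square}^2 \mathcal{R}_{\yng(2)} - \frac{1}{4} \mathcal{R}_{\square}^4 \right).
\end{aligned} \tag{4.12}$$

Most terms in the r.h.s. are manifestly elements in $\mathbb{Z}[t^{\pm 1}, \nu^{\pm 1}]$ with denominators given by products of $t^r - t^{-r}$. The only possible source for rational coefficients is the term

$$-\frac{1}{2}\left( G_{\yng(2)}^2 + H_{(\yng(2),\yng(2))} \right) - \frac{1}{4}\left( G_{\square}^4 - H_{(\square,\square)}^2 \right). \tag{4.13}$$

However, the last two terms inside the bracket are also in $\mathbb{Z}[t^{\pm 1}, \nu^{\pm 1}]$ thanks to Rudolph's theorem, and we conclude that integrality of $g_{\yng(4)}$ requires

$$G_{\yng(2)}^2 \equiv H_{(\yng(2),\yng(2))} \mod 2. \tag{4.14}$$

This is Morton–Ryder's theorem (2.46) for a knot colored by $R = \yng(2)$. It seems likely that the general case of their theorem, for an arbitrary representation R, follows from integrality of $g_S$, where $S \in R \otimes R$.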
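4.3. General predictions for links. Let us now consider two-component links. When both components are colored by $\square$, the HOMFLY and Kauffman invariants have the form

$$\begin{aligned}
G(\mathcal{L}) &= \left(1 + \frac{\nu - \nu^{-1}}{z}\right) \sum_{i \ge 0} k_i^{\mathcal{L}}(\nu) z^{i-1}, \\
H(\mathcal{L}) &= \frac{\nu - \nu^{-1}}{z} \sum_{i \ge 0} p_i^{\mathcal{L}}(\nu) z^{2i-1}.
\end{aligned} \tag{4.15}$$

Our conjecture (3.34) says that $g_{\square,\square}(\mathcal{L})$ belongs to $z\,\mathbb{Z}[\nu^{\pm 1}, z]$, i.e. it contains only strictly positive powers of z. From its explicit definition in terms of the HOMFLY and Kauffman polynomials of $\mathcal{L}$ we find that the vanishing of the terms of order $z^{-2}$, $z^{-1}$ and $z^{0}$ leads to three different relations. The first one is

$$k_0^{\mathcal{L}}(\nu) = (\nu - \nu^{-1})\, k_0^{\mathcal{K}_1}(\nu)\, k_0^{\mathcal{K}_2}(\nu). \tag{4.16}$$

The conjecture in the case of HOMFLY leads to a similar relation [31]

$$p_0^{\mathcal{L}}(\nu) = (\nu - \nu^{-1})\, p_0^{\mathcal{K}_1}(\nu)\, p_0^{\mathcal{K}_2}(\nu). \tag{4.17}$$

Notice that, due to (4.9), we also have from (4.17) and (4.16) that

$$p_0^{\mathcal{L}}(\nu) = k_0^{\mathcal{L}}(\nu) \tag{4.18}$$

for links of two components. The second relation determines the second coefficient of the Kauffman polynomial of a link as

$$k_1^{\mathcal{L}}(\nu) = k_0^{\mathcal{K}_1}(\nu) k_0^{\mathcal{K}_2}(\nu) + (\nu - \nu^{-1}) \left( k_0^{\mathcal{K}_1}(\nu) k_1^{\mathcal{K}_2}(\nu) + k_1^{\mathcal{K}_1}(\nu) k_0^{\mathcal{K}_2}(\nu) \right). \tag{4.19}$$

Finally, the third relation gives an equation for $k_2^{\mathcal{L}}(\nu)$,

$$\begin{aligned}
k_2^{\mathcal{L}}(\nu) &= p_1^{\mathcal{L}}(\nu) + p_1^{\overline{\mathcal{L}}}(\nu) - 2(\nu - \nu^{-1}) \left( p_0^{\mathcal{K}_1}(\nu) p_1^{\mathcal{K}_2}(\nu) + p_1^{\mathcal{K}_1}(\nu) p_0^{\mathcal{K}_2}(\nu) \right) + k_0^{\mathcal{K}_1}(\nu) k_1^{\mathcal{K}_2}(\nu) + k_1^{\mathcal{K}_1}(\nu) k_0^{\mathcal{K}_2}(\nu) \\
&\quad + (\nu - \nu^{-1}) \left( k_0^{\mathcal{K}_1}(\nu) k_2^{\mathcal{K}_2}(\nu) + k_2^{\mathcal{K}_1}(\nu) k_0^{\mathcal{K}_2}(\nu) + k_1^{\mathcal{K}_1}(\nu) k_1^{\mathcal{K}_2}(\nu) \right).
\end{aligned} \tag{4.20}$$

These results can be easily generalized to a general link $\mathcal{L}$ with L components $\mathcal{K}_j$, $j = 1, \ldots, L$, as follows. If we calculate the connected invariants of the link from their definitions in terms of invariants of sublinks, we obtain an expression of the form

$$\begin{aligned}
G^{(c)}(\mathcal{L}) &= \left(1 + \frac{\nu - \nu^{-1}}{z}\right) \sum_{i \ge 0} k_i^{(c),\mathcal{L}}(\nu) z^{i+1-L}, \\
H^{(c)}(\mathcal{L}) &= \frac{\nu - \nu^{-1}}{z} \sum_{i \ge 0} p_i^{(c),\mathcal{L}}(\nu) z^{2i+1-L},
\end{aligned} \tag{4.21}$$

where $p_i^{(c),\mathcal{L}}(\nu)$, $k_i^{(c),\mathcal{L}}(\nu)$ can be obtained in terms of the polynomials $p_i^{\mathcal{L}'}(\nu)$, $k_i^{\mathcal{L}'}(\nu)$ of the different sublinks of $\mathcal{L}$, $\mathcal{L}' \subset \mathcal{L}$. For example,

$$p_0^{(c),\mathcal{L}}(\nu) = p_0^{\mathcal{L}}(\nu) - (\nu - \nu^{-1})^{L-1} \prod_{j=1}^{L} p_0^{\mathcal{K}_j}(\nu). \tag{4.22}$$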
The conjecture of [30,31,44] for the colored HOMFLY invariant implies in particular that the connected HOMFLY invariant belongs to $z^{L-2}\mathbb{Z}[z^2, \nu^{\pm 1}]$. This leads to [30,31]

$$p_0^{(c),\mathcal{L}}(\nu) = \cdots = p_{L-2}^{(c),\mathcal{L}}(\nu) = 0 \tag{4.23}$$

for any link $\mathcal{L}$. The fact that $p_0^{(c),\mathcal{L}}(\nu) = 0$ is a result of Lickorish and Millett [34], and the vanishing of $p_{1,2}^{(c),\mathcal{L}}(\nu)$ has been proved in [21]. Our conjecture for the colored Kauffman implies that (3.33) belongs to $z^{L-1}\mathbb{Z}[z, \nu^{\pm 1}]$. This gives the relations

$$k_0^{(c),\mathcal{L}}(\nu) = \cdots = k_{2L-3}^{(c),\mathcal{L}}(\nu) = 0 \tag{4.24}$$

as well as

$$k_{2L-2}^{(c),\mathcal{L}}(\nu) = \sum_{\alpha=1}^{2^{L-1}} p_{L-1}^{(c),\mathcal{L}_\alpha}(\nu). \tag{4.25}$$

The relations (4.24) generalize (4.16) and (4.19), while (4.25) generalizes (4.20) to any link. The equality (2.42) for any link now follows from the vanishing of $p_0^{(c),\mathcal{L}}(\nu)$, $k_0^{(c),\mathcal{L}}(\nu)$ and the equality in the case of knots (4.9). The vanishing of $k_i^{(c),\mathcal{L}}(\nu)$ with $i = 0, \cdots, 3$ has been proved by Kanenobu in [20].

More evidence for (4.24) comes from Brunnian links. A Brunnian link is a nontrivial link with the property that every proper sublink is trivial. The Hopf link is a Brunnian link of two components, while the famous Borromean rings give a Brunnian link of three components. A Brunnian link with four components is shown in Fig. 6. It is easy to see that the connected invariants of a Brunnian link $\mathcal{B}$ with L components are of the form

$$\begin{aligned}
G^{(c)}(\mathcal{B}) &= \left(1 + \frac{\nu - \nu^{-1}}{z}\right) F_{\mathcal{B}}(z, \nu) - \left(1 + \frac{\nu - \nu^{-1}}{z}\right)^{L}, \\
H^{(c)}(\mathcal{B}) &= \frac{\nu - \nu^{-1}}{z} P_{\mathcal{B}}(z, \nu) - \left(\frac{\nu - \nu^{-1}}{z}\right)^{L}.
\end{aligned} \tag{4.26}$$

Conjectures (4.23) and (4.24) imply that, for Brunnian links,

$$\begin{aligned}
P_{\mathcal{B}}(z, \nu) - \left(\frac{\nu - \nu^{-1}}{z}\right)^{L-1} &= O(z^{L-1}), \\
F_{\mathcal{B}}(z, \nu) - \left(1 + \frac{\nu - \nu^{-1}}{z}\right)^{L-1} &= O(z^{L-1}).
\end{aligned} \tag{4.27}$$

This has been proved in [18,47].

Fig. 6. A Brunnian link with four components
Fig. 7. The two oriented links $\mathcal{L}$ and $\overline{\mathcal{L}}$ tabulated as $4^2_1$, and differing in the relative orientation of their components
The relation (4.25) (and in particular (4.20) for links with two components) seems however to be a new result in the theory of the Kauffman polynomial. It relates the Kauffman polynomial of an unoriented link $\mathcal{L}$ to the HOMFLY polynomial of all the oriented links that can be obtained from $\mathcal{L}$, modulo an overall reversal of the orientation. In the case of links made out of two unknots, it further simplifies to

$$k_2^{\mathcal{L}}(\nu) = p_1^{\mathcal{L}}(\nu) + p_1^{\overline{\mathcal{L}}}(\nu), \tag{4.28}$$

and it can be easily checked in various cases by looking for example at the tables presented in [35].

Example 4.6. Let us check (4.28) for some simple links made out of two unknots. The easiest example is of course the Hopf link, where $\mathcal{L}$ and $\overline{\mathcal{L}}$ are depicted in Fig. 3. Their HOMFLY polynomials are given by

$$P_{\mathcal{L}} = \frac{\nu - \nu^{-1}}{z} + \nu z, \qquad P_{\overline{\mathcal{L}}} = \frac{\nu - \nu^{-1}}{z} - \nu^{-1} z, \tag{4.29}$$

and

$$p_1^{\mathcal{L}}(\nu) + p_1^{\overline{\mathcal{L}}}(\nu) = \nu - \nu^{-1}. \tag{4.30}$$

By comparing with (2.40), we see that (4.28) holds. Let us now consider the pair of oriented links depicted in Fig. 7. Their HOMFLY polynomials are

$$\begin{aligned}
P_{\mathcal{L}} &= \left(\nu - \nu^{-1}\right) z^{-1} + \left(\nu - 3\nu^{-1}\right) z - \nu^{-1} z^3, \\
P_{\overline{\mathcal{L}}} &= \left(\nu - \nu^{-1}\right) z^{-1} + \left(\nu + \nu^3\right) z,
\end{aligned} \tag{4.31}$$

while the Kauffman polynomial is

$$F_{\mathcal{L}} = \left(\nu - \nu^{-1}\right) z^{-1} + 1 + \left(\nu^3 + 2\nu - 3\nu^{-1}\right) z + \left(1 - \nu^2\right) z^2 + \left(\nu - \nu^{-1}\right) z^3. \tag{4.32}$$

Again, the relation (4.28) holds.

Fig. 8. The oriented link $\mathcal{L}$, tabulated as $5^2_1$. It is invariant under reversal of orientation of its components, hence $\overline{\mathcal{L}} = \mathcal{L}$

Finally, we consider the link depicted in Fig. 8, and tabulated as $5^2_1$. This link is invariant under reversal of orientation of its components, hence $\overline{\mathcal{L}} = \mathcal{L}$, and its HOMFLY polynomial equals

$$P_{\mathcal{L}} = \left(\nu - \nu^{-1}\right) z^{-1} + \left(-\nu^{-1} + 2\nu - \nu^3\right) z + \nu z^3, \tag{4.33}$$

while its Kauffman polynomial is

$$F_{\mathcal{L}} = \left(\nu - \nu^{-1}\right) z^{-1} + 1 + \left(-2\nu^{-1} + 4\nu - 2\nu^3\right) z + \left(-1 + \nu^4\right) z^2 + \left(-\nu^{-1} + 3\nu - 2\nu^3\right) z^3 + \left(-1 + \nu^2\right) z^4. \tag{4.34}$$
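The three checks of this example can be replayed mechanically. The sketch below stores each polynomial of (2.40) and (4.29)–(4.34) as a nested dictionary and extracts the coefficients $p_i$, $k_i$ according to (2.41) with L = 2. It then tests (4.28), together with the specializations of (4.16) and (4.19) to unknot components, for which $k_0^{\mathcal{K}} = 1$ and $k_1^{\mathcal{K}} = 0$. The data layout and function names are illustrative choices.

```python
# Each polynomial is stored as {z_exponent: {nu_exponent: coefficient}}.
P_5_2_1 = {-1: {1: 1, -1: -1}, 1: {-1: -1, 1: 2, 3: -1}, 3: {1: 1}}

links = {
    "Hopf": dict(
        P={-1: {1: 1, -1: -1}, 1: {1: 1}},                       # (4.29)
        P_bar={-1: {1: 1, -1: -1}, 1: {-1: -1}},                 # (4.29)
        F={-1: {1: 1, -1: -1}, 0: {0: 1}, 1: {1: 1, -1: -1}},    # (2.40)
    ),
    "4^2_1": dict(
        P={-1: {1: 1, -1: -1}, 1: {1: 1, -1: -3}, 3: {-1: -1}},  # (4.31)
        P_bar={-1: {1: 1, -1: -1}, 1: {1: 1, 3: 1}},             # (4.31)
        F={-1: {1: 1, -1: -1}, 0: {0: 1}, 1: {3: 1, 1: 2, -1: -3},
           2: {0: 1, 2: -1}, 3: {1: 1, -1: -1}},                 # (4.32)
    ),
    "5^2_1": dict(
        P=P_5_2_1, P_bar=P_5_2_1,                                # (4.33)
        F={-1: {1: 1, -1: -1}, 0: {0: 1}, 1: {-1: -2, 1: 4, 3: -2},
           2: {0: -1, 4: 1}, 3: {-1: -1, 1: 3, 3: -2},
           4: {0: -1, 2: 1}},                                    # (4.34)
    ),
}

def p(poly, i):
    """p_i(nu): coefficient of z^(2i-1) in a two-component HOMFLY polynomial."""
    return poly.get(2 * i - 1, {})

def k(poly, i):
    """k_i(nu): coefficient of z^(i-1) in a two-component Kauffman polynomial."""
    return poly.get(i - 1, {})

def add(a, b):
    out = dict(a)
    for e, c in b.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

nu_minus_inverse = {1: 1, -1: -1}     # nu - 1/nu

for name, data in links.items():
    assert k(data["F"], 0) == nu_minus_inverse                          # (4.16)
    assert k(data["F"], 1) == {0: 1}                                    # (4.19)
    assert k(data["F"], 2) == add(p(data["P"], 1), p(data["P_bar"], 1)) # (4.28)
    print(name, "satisfies (4.16), (4.19) and (4.28)")
```

Since all three links have unknot components, the product terms in (4.20) drop out and only the sum over the two relative orientations survives.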
Here, $k_2^{\mathcal{L}}(\nu) = 2 p_1^{\mathcal{L}}(\nu)$, again in agreement with (4.28).

Finally, it is easy to see that Rudolph's theorem for a link $\mathcal{L}$ can be obtained by requiring integrality of, for example, the reformulated invariant $g_{\yng(2),\ldots,\yng(2)}$, generalizing in this way our analysis for knots. It is likely that (2.46) follows from looking at $g_{S_1,\ldots,S_L}$ with $S_i \in R_i \otimes R_i$.

5. String Theory Interpretation

The conjecture stated in this paper is mostly based on the analysis performed in [8], which in turn builds upon previous work on the large N duality between Chern–Simons theory and topological strings (see [39] for a review of these developments). In this section we sketch some of the string theory considerations which lead to the above conjecture. For simplicity we will restrict ourselves to the case of knots. The extension of these considerations to the case of links is straightforward.

5.1. Chern–Simons theory and D-branes. In [54], Witten showed that Chern–Simons theory on a three-manifold M can be obtained by considering open topological strings on the cotangent space $T^*M$ with boundaries lying on M, which is a Lagrangian submanifold of $T^*M$. Equivalently, one can say that the theory describing N topological branes wrapping M inside $T^*M$ is U(N) Chern–Simons theory. To incorporate knots and links into this framework one has to introduce a different set of branes, as explained by Ooguri and Vafa [44]. This goes as follows: given any knot $\mathcal{K}$ in $\mathbf{S}^3$, one can construct a natural Lagrangian submanifold $\mathcal{N}_{\mathcal{K}}$ in $T^*\mathbf{S}^3$. This construction is rather canonical, and it is called the conormal bundle of $\mathcal{K}$. Let us parametrize the knot $\mathcal{K}$ by a curve q(s), where $s \in [0, 2\pi]$. The conormal bundle of $\mathcal{K}$ is the space

$$\mathcal{N}_{\mathcal{K}} = \left\{ (q(s), p) \in T^*\mathbf{S}^3 \;\Big|\; \sum_i p_i \dot q_i = 0,\ 0 \le s \le 2\pi \right\}, \tag{5.1}$$
where $q_i$, $p_i$ are coordinates for the base and the fibre of the cotangent bundle, respectively, and $\dot q_i$ denote derivatives w.r.t. s. The space $\mathcal{N}_{\mathcal{K}}$ is an $\mathbb{R}^2$-fibration of the knot itself, where the fiber on the point q(s) is given by the two-dimensional subspace of $T^*_{q(s)}\mathbf{S}^3$ of planes orthogonal to $\dot q(s)$. $\mathcal{N}_{\mathcal{K}}$ has the topology of $\mathbf{S}^1 \times \mathbb{R}^2$, and intersects $\mathbf{S}^3$ along the knot $\mathcal{K}$. As a matter of fact, for some aspects of the construction, the appropriate submanifolds to consider are deformations of $\mathcal{N}_{\mathcal{K}}$ which are disconnected from the zero section. For example, [26] considers a perturbation

$$\mathcal{N}_{\mathcal{K},\epsilon} = \left\{ (q(s), p + \epsilon\, \dot q(s)) \in T^*\mathbf{S}^3 \;\Big|\; \sum_i p_i \dot q_i = 0,\ 0 \le s \le 2\pi \right\}. \tag{5.2}$$
Let us now wrap M probe branes around $\mathcal{N}_{\mathcal{K}}$. There will be open strings with one endpoint on $\mathbf{S}^3$, and another endpoint on $\mathcal{N}_{\mathcal{K}}$. These open strings lead to the insertion of the following operator (also called the Ooguri–Vafa operator) in the Chern–Simons theory on $\mathbf{S}^3$ [44]:

$$Z_{U(N)}(v) = \sum_R \mathrm{Tr}_R^{U(N)}(U_{\mathcal{K}})\, s_R(v). \tag{5.3}$$
Here $U_{\mathcal{K}}$ is the holonomy of the Chern–Simons gauge field around $\mathcal{K}$, while v is a U(M) matrix associated to the M branes wrapping $\mathcal{N}_{\mathcal{K}}$. After computing the expectation value of this operator in Chern–Simons theory, we obtain the generating functional (3.1).

In order to describe the Kauffman polynomial, we need a Chern–Simons theory on $\mathbf{S}^3$ with gauge group SO(N) or Sp(N). From the point of view of the open string description, we need an orientifold of topological string theory on $T^*\mathbf{S}^3$. This orientifold was constructed in [51], and it can be described as follows. As a complex manifold, the cotangent space $T^*\mathbf{S}^3$ is a Calabi–Yau manifold called the deformed conifold. It can be described by the equation

$$\sum_{i=1}^{4} x_i^2 = \mu, \tag{5.4}$$

where $x_i$ are complex coordinates. For real $\mu > 0$, the submanifold $\mathrm{Im}\, x_i = 0$ is nothing but $\mathbf{S}^3$, while $\mathrm{Im}\, x_i$ are coordinates of the cotangent space. We now consider the following involution of the geometry

$$I: x_i \to \bar x_i. \tag{5.5}$$

This leaves the $\mathbf{S}^3$ invariant, and acts as a reflection on the coordinates of the fiber:

$$p_i \to -p_i. \tag{5.6}$$
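The statements around (5.4)–(5.6) can be made explicit by splitting the complex coordinates into real and imaginary parts. The following short computation is a sketch: writing $x_i = q_i + i p_i$ is a choice of notation made here to match (5.1), and the identification with $T^*\mathbf{S}^3$ involves an additional rescaling of $q$ that is not spelled out.

```latex
% Real form of the deformed conifold (5.4) with x_i = q_i + i p_i, q_i, p_i real
\begin{align*}
\sum_{i=1}^{4} x_i^2 = \mu
  \quad &\Longleftrightarrow \quad
  \sum_{i=1}^{4} \left(q_i^2 - p_i^2\right) = \mu,
  \qquad \sum_{i=1}^{4} q_i p_i = 0, \\
p = 0 \quad &\Longrightarrow \quad \sum_{i=1}^{4} q_i^2 = \mu
  \qquad \text{(a three-sphere of radius } \sqrt{\mu}\text{)}, \\
I:\ x_i \to \bar x_i \quad &\Longleftrightarrow \quad (q_i, p_i) \to (q_i, -p_i).
\end{align*}
```

The constraint $\sum_i q_i p_i = 0$ says that p is tangent to the sphere at q, which is why the imaginary parts play the role of fibre coordinates and why complex conjugation acts as the reflection (5.6) while fixing the zero section.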
If we now wrap N D-branes on $\mathbf{S}^3$, the corresponding gauge theory description is Chern–Simons theory with gauge group SO(N) or Sp(N), depending on the choice of orientifold action on the gauge group [51]. Since an orientifold theory is a particular case of a $\mathbb{Z}_2$ orbifold, the partition function is expected to be the sum of the partition function in the untwisted sector, plus the partition function of the twisted sector. The partition function in the untwisted sector corresponds to a theory of oriented open strings in the "covering geometry," i.e. the original target space geometry but with the closed moduli identified
according to the action of the involution. The partition function in the twisted sector is given by the contributions of unoriented strings. We can then write [7,51]

$$Z_{SO(N)/Sp(N)} = \frac{1}{2} Z_{\mathrm{cov}} + Z_{\mathrm{unor}}. \tag{5.7}$$
In the case considered in [51], the partition function for the covering geometry is just the U(N) partition function. We can introduce Wilson loops around knots and links by following the strategy of Ooguri and Vafa, i.e. by introducing branes wrapping the Lagrangian submanifold $\mathcal{N}_{\mathcal{K}}$. This leads to the insertion of the operator [7,8]

$$Z_{SO(N)/Sp(N)}(v) = \sum_R \mathrm{Tr}_R^{SO(N)/Sp(N)}(U_{\mathcal{K}})\, s_R(v). \tag{5.8}$$
After computing the expectation value of this operator in Chern–Simons theory, we obtain the generating functional (3.16). In this paper we have used the Kauffman invariant which follows naturally from the gauge group SO(N), but in fact the two choices of gauge group lead to essentially identical theories due to the "SO(N) = Sp(−N)" equivalence, see [7] for a discussion and references.

How does (5.8) decompose into a sum (5.7) of covering and unoriented contributions? In general, if we have a geometry X with a submanifold $\mathcal{L}$, there will be two submanifolds in the covering geometry: the original submanifold $\mathcal{L}$ and its image under the involution $I(\mathcal{L})$ [8]. Although $I(\mathcal{N}_{\mathcal{K}}) = \mathcal{N}_{\mathcal{K}}$, if one considers deformations of the conormal bundle, the resulting submanifolds will be different (this has been previously noticed in [17]). For example, for the deformation (5.2) one has that $I(\mathcal{N}_{\mathcal{K},\epsilon}) = \mathcal{N}_{\mathcal{K},-\epsilon}$. Therefore, after deformation we will have two sets of probe D-branes in $T^*\mathbf{S}^3$, wrapping two different submanifolds related by the involution I, and leading to two different sources $v_1$ and $v_2$ [8]. In particular, we have two sets of open strings, going from the two sets of probe branes to the branes wrapping $\mathbf{S}^3$ in the orientifold plane, and related by the orientifold action. This action involves both the target space involution $p_i \to -p_i$ and an orientation reversal which conjugates the Chan–Paton charges. We then conclude that one set of open strings will lead to the insertion of Wilson lines in $\mathbf{S}^3$ involving representations $R = \cdot, \square, \yng(2), \ldots$, while the other set of open strings will lead to conjugate representations $S = \cdot, \overline{\square}, \overline{\yng(2)}, \ldots$. This is illustrated in Fig. 9. The partition function of the covering geometry will then have the structure

$$Z_{\mathrm{cov}}(v) = \sum_{R,S} \mathrm{Tr}_{(R,S)}^{U(N)}(U_{\mathcal{K}})\, s_R(v) s_S(v). \tag{5.9}$$
Since we have to identify the closed and open moduli according to the action of the involution, in (5.9) we have set v1 = v2 = v in the source terms s R (v1 ) and s S (v2 ). After computing the expectation value of this operator in Chern–Simons theory, we obtain the generating functional Z R in (3.15). This argument by itself does not make it possible to decide if the representation induced by the orientifold action is the composite representation (R, S) or the tensor product representation R ⊗ S, which differ in “lower order corrections” as specified in (2.13). One needs in principle a more detailed study of the orientifold, but as we will see in a moment, by looking at the topological string theory realization for simple knots and links, we can verify that the covering geometry involves indeed the composite representation.
String Theory and the Kauffman Polynomial
Fig. 9. The two sets of open strings in the covering geometry, going from the probe branes to the orientifold “plane” in S3 and extending along the cotangent directions. They are related by the target space involution, which sends pi → − pi , and by orientation reversal. The Chan–Paton charges ending on S3 lead to a Wilson line colored by (R, S), while the Chan-Paton charges in the probe branes lead to sources v1 , v2 which have to be identified by the involution: v1 = v2
5.2. Topological string dual. It was conjectured in [14] that open topological string theory on $T^*S^3$ with N D-branes wrapping $S^3$ is equivalent to a closed topological string theory on the resolved conifold
$$X = \mathcal{O}(-1) \oplus \mathcal{O}(-1) \to \mathbb{P}^1.$$ (5.10)
This leads to a large N duality between Chern–Simons theory on $S^3$ and the closed topological string on (5.10). The open string theory on the deformed conifold is related to the closed string theory on the resolved conifold by a so-called geometric transition. In this case this is the conifold transition. This duality was extended by Ooguri and Vafa to the situation in which one has probe branes in $T^*S^3$ wrapping the Lagrangian submanifold $N_K$. They postulated that, given any knot K, one can construct a Lagrangian submanifold in the resolved conifold, $L_K$, which can be understood as a geometric transition of the Lagrangian $N_K$ in the deformed conifold. The total free energy in the deformed conifold can be computed in terms of Chern–Simons theory and it is given by $F_H(v)$. By the large N duality, it should be equal to the free energy of an open topological string theory on the resolved conifold X with Lagrangian boundary conditions given by $L_K$. Since open topological string amplitudes can be reformulated in terms of counting of BPS invariants and satisfy integrality conditions [13,31,44], one obtains the conjecture about the integrality properties of the colored HOMFLY invariant [30,39] which we reviewed above. As first shown in [51], one can extend the large N duality of [14] to the orientifold case and obtain an equivalence between Chern–Simons theory on $S^3$ with SO(N)/Sp(N) gauge groups and topological string theory on an orientifold of the resolved conifold. A convenient description of (5.10) is as a toric manifold, defined by the equation
$$|X_1|^2 + |X_2|^2 - |X_3|^2 - |X_4|^2 = t$$ (5.11)
M. Mariño
and a further quotient by a U (1) action where the coordinates (X 1 , . . . , X 4 ) have charges (1, 1, −1, −1). In this description, the orientifold is defined by the involution (X 1 , X 2 , X 3 , X 4 ) → (X 2 , −X 1 , X 4 , −X 3 ).
(5.12)
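The two properties that make (5.12) a sensible involution on the toric quotient can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it takes the map exactly as printed in (5.12). Since complex conjugations may have been lost in typesetting, the optional `conjugate` flag also tests the anti-holomorphic variant; both versions preserve the left-hand side of (5.11), and both square to the U(1) element $e^{i\pi}$, hence to the identity on the quotient.

```python
import cmath
import random

# U(1) charges of (X1, X2, X3, X4), as stated in the text.
CHARGES = (1, 1, -1, -1)


def moment_map(x):
    """Left-hand side of (5.11): |X1|^2 + |X2|^2 - |X3|^2 - |X4|^2."""
    return sum(q * abs(z) ** 2 for q, z in zip(CHARGES, x))


def u1_action(x, theta):
    """Act with exp(i * theta) on each coordinate according to its charge."""
    return tuple(cmath.exp(1j * q * theta) * z for q, z in zip(CHARGES, x))


def involution(x, conjugate=False):
    """(X1, X2, X3, X4) -> (X2, -X1, X4, -X3), as printed in (5.12).

    conjugate=True applies complex conjugation first (an assumption made
    here only to test the anti-holomorphic variant of the same map).
    """
    if conjugate:
        x = tuple(z.conjugate() for z in x)
    x1, x2, x3, x4 = x
    return (x2, -x1, x4, -x3)


def close(a, b, tol=1e-12):
    return all(abs(p - q) < tol for p, q in zip(a, b))


def check_involution(trials=100, seed=0):
    """Check on random points that the map preserves (5.11) and that
    applying it twice equals the U(1) rotation by theta = pi."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = tuple(complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4))
        for conj in (False, True):
            y = involution(x, conj)
            if abs(moment_map(y) - moment_map(x)) > 1e-12:
                return False
            if not close(involution(y, conj), u1_action(x, cmath.pi)):
                return False
    return True
```

Calling `check_involution()` returns `True` when both properties hold on all sampled points; this is a numerical sanity check, not a proof.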
Let us now consider the open topological string theory on the resolved conifold defined by the Lagrangian submanifold $L_K$ associated to a knot. If we perform the orientifold action (5.12) we obtain an orientifold of this open topological string theory. The total partition function of this orientifold of the resolved conifold should be equal to the total partition function of the orientifold of the deformed conifold, namely $Z_G(v)$. The contribution of the covering geometry will be given by the partition function of topological open strings on X in the presence of two Lagrangian submanifolds, $L_K$ and $I(L_K)$, after identifying the sources. This should be equal to the contribution of the covering of the deformed geometry, i.e. $Z_R(v)$. This partition function can then be expressed in terms of integer BPS invariants $N^{c=0}_{R;g,Q}$. On the other hand, the unoriented contribution to the orientifold partition function will be given by the partition function of unoriented topological open strings in the quotient geometry X/I with a brane $L_K$. This unoriented partition function also has an integrality structure [7,8] generalizing [13]. In particular, it can be written in terms of BPS invariants $N^{c=1,2}_{R;g,Q}$ related to the counting of curves with boundaries ending on $L_K$ and with one or two crosscaps. This explains the integrality properties for the colored Kauffman polynomial that we conjectured in this paper.
Remark 5.1. Note that the two choices of orientifold action which lead to the gauge groups SO(N)/Sp(N) in the deformed conifold become here a choice of overall sign for the c = 1 contribution, see for example [7] for a discussion and examples.
Remark 5.2. The sum over odd positive integers d in (3.18) seems to be a general feature of multicovering formulae for unoriented surfaces, as noticed in [7,8,51]. See [27] for recent examples.
When K is the unknot it is possible to construct explicitly the corresponding Lagrangian submanifold $L_K$ in X.
It turns out to be given by a toric construction, and it is possible to compute $Z_R(v)$ by using the topological vertex [8]. The explicit computation in Eq. (3.10) of [8] confirms that the vacuum expectation value of the operator appearing in (5.9) is indeed the quantum dimension of the composite representation (2.26),
$$\left\langle \mathrm{Tr}^{U(N)}_{(R,S)}(U_K) \right\rangle = \dim_q (R, S).$$ (5.13)
One can also find an explicit description of the Lagrangian submanifolds in X corresponding to the Hopf link [8], and compute the covering contribution to the orientifold partition function by using the topological vertex. The resulting expression (Eq. (3.17) of [8]) agrees again with the HOMFLY invariant of the Hopf link for general composite representations, which was computed in [2,22] in a different context.
6. Conclusions and Outlook
In this paper we have formulated a new conjecture on the structure of the colored Kauffman polynomial of knots and links. This conjecture is mainly based on the results of [8], but it adds a crucial ingredient which was missing in that paper: the fact that partition
functions in the untwisted sector of the orientifold are given by HOMFLY invariants colored by composite representations. This makes it possible to extend the results obtained for the colored HOMFLY invariant in [30,31,44] to the colored Kauffman polynomial. According to our conjecture, the natural invariant of unoriented knots and links involves both the Kauffman polynomial and the HOMFLY polynomial colored with composite representations. In particular, in the case of links, it involves considering all possible orientations for the components of a link. This is probably the most interesting aspect of the conjecture, and it “explains” many aspects of the relationship between these invariants, like Rudolph’s theorem [49]. It also leads to new, simple results about the Kauffman polynomial, like for example (4.25). From the point of view of physics, the results presented in this paper provide new precision tests of a large N string/gauge theory correspondence. It would be very interesting to relate the integrality properties conjectured here to appropriate generalizations of Khovanov homology, as in [16]. Indeed, as in the case of the colored HOMFLY invariants, the integers $N^{c=0,1,2}_{R_1,\ldots,R_L;g,Q}$ are Euler characteristics of cohomology theories associated to BPS states, and it is natural to conjecture that these cohomologies give categorifications of the colored Kauffman invariant. There has already been work in this direction for knots colored by the fundamental representation [17]. The case of links and/or higher representations should involve, as conjectured in this paper, both the Kauffman invariant and the HOMFLY invariant for composite representations. Finally, it was noticed in [29] that the reformulated invariants $f_R$, when expanded in power series, t = exp(x/2), lead to Vassiliev invariants. On the other hand, some of the properties that follow from our conjectures (like (4.27)) have a natural interpretation in Vassiliev theory.
Therefore, it would be interesting to have a precise interpretation of our conjectures in terms of Vassiliev invariants, especially now that we have a rather complete understanding of Chern–Simons invariants for all classical gauge groups in terms of string theory.
Note added. After this paper was submitted, two papers appeared [9,52] with extensive checks of Conjecture 3.3 for framed torus knots and links.
Acknowledgements. My interest in this problem was revived by the recent paper of Morton and Ryder [43], and I would like to thank them for a very useful correspondence. I would also like to thank Vincent Bouchard, Jose Labastida and Cumrun Vafa for many conversations on this topic over the years, and Sébastien Stevan for recent discussions on torus knots. Finally, I would like to thank Vincent Bouchard, Stavros Garoufalidis and especially Hugh Morton for a detailed reading of the manuscript. This work was supported in part by the Fonds National Suisse.
References 1. Aganagic, M., Klemm, A., Mariño, M., Vafa, C.: The topological vertex. Commun. Math. Phys. 254, 425 (2005) 2. Aganagic, M., Neitzke, A., Vafa, C.: BPS microstates and the open topological string wave function. http://arxiv.org/abs/hep-th/0504054v1, 2005 3. Aiston, A.K.: Skein theoretic idempotents of Hecke algebras and quantum group invariants. Ph.D. Thesis, University of Liverpool (1996), available in http://www.liv.ac.uk/~su14/knotprints.html 4. Aiston, A.K., Morton, H.R.: Idempotents of Hecke algebras of type A. J. Knot Theory Ramifications 7, 463 (1998) 5. Beliakova, A., Blanchet, C.: Skein construction of idempotents in Birman-Murakami-Wenzl algebras. Math. Ann. 321, 347 (2001)
6. Birman, J.: New points of view in knot theory. Bull. Am. Math. Soc., New Ser. 28, 253 (1993) 7. Bouchard, V., Florea, B., Mariño, M.: Counting higher genus curves with crosscaps in Calabi-Yau orientifolds. JHEP 0412, 035 (2004) 8. Bouchard, V., Florea, B., Mariño, M.: Topological open string amplitudes on orientifolds. JHEP 0502, 002 (2005) 9. Chandrima, P., Pravina, B., Ramadevi, P.: Composite Invariants and Unoriented Topological String Amplitudes. http://arxiv.org/abs/1003.5282v1[hep-th], 2010 10. Chen, L., Chen, Q., Reshetikhin, N.: Orthogonal quantum group invariants of links (to appear) 11. Rama Devi, P., Govindarajan, T.R., Kaul, R.K.: Three-dimensional Chern-Simons theory as a theory of knots and links. 3. Compact semisimple group. Nucl. Phys. B 402, 548 (1993) 12. Freyd, P., Yetter, D., Hoste, J., Lickorish, W.B.R., Millett, K., Ocneanu, A.: A new polynomial invariant of knots and links. Bull. Amer. Math. Soc. 12, 239 (2002) 13. Gopakumar, R., Vafa, C.: M-theory and topological strings. I, II. http://arxiv.org/abs/hep-th/9809187v1, 1998 and http://arxiv.org/abs/hep-th/9812127v1, 1998 14. Gopakumar, R., Vafa, C.: On the gauge theory/geometry correspondence. Adv. Theor. Math. Phys. 3, 1415 (1999) 15. Gross, D.J., Taylor, W.: Two-dimensional QCD is a string theory. Nucl. Phys. B 400, 181 (1993) 16. Gukov, S., Schwarz, A.S., Vafa, C.: Khovanov-Rozansky homology and topological strings. Lett. Math. Phys. 74, 53 (2005) 17. Gukov, S., Walcher, J.: Matrix factorizations and Kauffman homology. http://arxiv.org/abs/hep-th/ 0512298v1, 2005 18. Habiro, K.: Brunnian links, claspers and Goussarov-Vassiliev finite type invariants. Math. Proc. Cambridge Philos. Soc. 142, 459 (2007) 19. Hadji, R.J., Morton, H.R.: A basis for the full Homfly skein of the annulus. Math. Proc. Cambridge Philos. Soc. 141, 81 (2006) 20. Kanenobu, T.: The first four terms of the Kauffman’s link polynomial. Kyungpook Math. J. 46, 509 (2006) 21. 
Kanenobu, T., Miyazawa, Y.: The second and third terms of the HOMFLY polynomial of a link. Kobe J. Math. 16, 147 (1999) 22. Kanno, H.: Universal character and large N factorization in topological gauge/string theory. Nucl. Phys. B 745, 165 (2006) 23. Kauffman, L.H.: An invariant of regular isotopy. Trans. Amer. Math. Soc. 318, 417 (1990) 24. Kauffman, L.H.: Knots and physics. Third edition, Singapore: World Scientific, 2001 25. Koike, K.: On the decomposition of tensor products of the representations of the classical groups: by means of the universal characters. Adv. Math. 74, 57 (1989) 26. Koshkin, S.: Conormal bundles to knots and the Gopakumar–Vafa conjecture. Adv. Theor. Math. Phys. 11, 591 (2007) 27. Krefl, D., Walcher, J.: The Real Topological String on a local Calabi-Yau. http://arxiv.org/abs/0902. 0616v1[hep-th], 2009 28. Labastida, J.M.F., Llatas, P.M., Ramallo, A.V.: Knot operators in Chern-Simons gauge theory. Nucl. Phys. B 348, 651 (1991) 29. Labastida, J.M.F., Mariño, M.: Polynomial invariants for torus knots and topological strings. Commun. Math. Phys. 217, 423 (2001) 30. Labastida, J.M.F., Mariño, M.: A new point of view in the theory of knot and link invariants. J. Knot Theory Ramifications 11, 173 (2002) 31. Labastida, J.M.F., Mariño, M., Vafa, C.: Knots, links and branes at large N. JHEP 0011, 007 (2000) 32. Labastida, J.M.F., Pérez, E.: A Relation Between The Kauffman And The Homfly Polynomials For Torus Knots. J. Math. Phys. 37, 2013 (1996) 33. Lickorish, W.B.R.: An introduction to knot theory. Berlin-Heidelberg-New York: Springer-Verlag, 1997 34. Lickorish, W.B.R., Millett, K.C.: A polynomial invariant of oriented links. Topology 26, 107 (1987) 35. Lickorish, W.B.R., Millett, K.C.: The new polynomial invariants of knots and links. Math. Mag. 61, 3 (1988) 36. Lin, X.-S., Zheng, H.: On the Hecke algebras and the colored HOMFLY polynomial. http://arxiv.org/ abs/math/0601267v1[math.QA], 2006 37. 
Liu, K., Peng, P.: Proof of the Labastida–Mariño–Ooguri–Vafa conjecture. http://arxiv.org/abs/0704. 1526v3[math.QA], 2009 38. Macdonald, I.G.: Symmetric functions and Hall polynomials. Second edition, Oxford: Oxford University Press, 1995 39. Mariño, M.: Chern-Simons theory and topological strings. Rev. Mod. Phys. 77, 675 (2005) 40. Mariño, M., Vafa, C.: Framed knots at large N. http://arxiv.org/abs/hep-th/0108064v1, 2001 41. Morton, H.R.: Integrality of Homfly 1-tangle invariants. Algebr. Geom. Topol. 7, 327 (2007) 42. Morton, H.R., Hadji, R.J.: Homfly polynomials of generalized Hopf links. Algebr. Geom. Topol. 2, 11 (2002)
43. Morton, H.R., Ryder, N.D.A.: Relations between Kauffman and Homfly satellite invariants. Math. Proc. Phil. Soc. 149, 105–114 (2010) 44. Ooguri, H., Vafa, C.: Knot invariants and topological strings. Nucl. Phys. B 577, 419 (2000) 45. Pravina, B., Ramadevi, P.: SO(N) reformulated link invariants from topological strings. Nucl. Phys. B 727, 471 (2005) 46. Przytycki, J.H.: A note on the Lickorish–Millett–Turaev formula for the Kauffman polynomial. Proc. Amer. Math. Soc. 121, 645 (1994) 47. Przytycki, J.H., Taniyama, K.: The Kanenobu-Miyazawa conjecture and the Vassiliev-Gusarov skein modules based on mixed crossings. Proc. Amer. Math. Soc. 129, 2799 (2001) 48. Ramadevi, P., Sarkar, T.: On link invariants and topological string amplitudes. Nucl. Phys. B 600, 487 (2001) 49. Rudolph, L.: A congruence between link polynomials. Math. Proc. Cambridge Philos. Soc. 107, 319 (1990) 50. Ryder, N.D.A.: Skein based invariants and the Kauffman polynomial. Ph.D. Thesis, University of Liverpool, 2008 51. Sinha, S., Vafa, C.: SO and Sp Chern-Simons at large N. http://arxiv.org/abs/hep-th/0012136v1, 2000 52. Stevan, S.: Chern-Simons Invariants of Torus Knots and Links. http://arxiv.org/abs/1003.2861v1[hep-th], 2010 53. Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351 (1989) 54. Witten, E.: Chern-Simons Gauge Theory As A String Theory. Prog. Math. 133, 637 (1995) Communicated by A. Kapustin
Commun. Math. Phys. 298, 645–672 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1087-7
Irreducible Characters of General Linear Superalgebra and Super Duality
Shun-Jen Cheng (1), Ngau Lam (2)
(1) Institute of Mathematics, Academia Sinica, Taipei 10617, Taiwan. E-mail: [email protected]
(2) Department of Mathematics, National Cheng-Kung University, Tainan 70101, Taiwan. E-mail: [email protected]
Received: 5 August 2009 / Accepted: 16 April 2010 Published online: 9 July 2010 – © Springer-Verlag 2010
Abstract: We develop a new method to solve the irreducible character problem for a wide class of modules over the general linear superalgebra, including all the finite-dimensional modules, by directly relating the problem to the classical Kazhdan-Lusztig theory. Furthermore, we prove that certain parabolic BGG categories over the general linear algebra and over the general linear superalgebra are equivalent. We also verify a parabolic version of a conjecture of Brundan on the irreducible characters in the BGG category of the general linear superalgebra.
Partially supported by an NSC-grant and an Academia Sinica Investigator grant.
Partially supported by an NSC-grant.
1. Introduction
The problem of finding the finite-dimensional irreducible characters of simple Lie superalgebras was first posed in [K1,K2]. This problem turned out to be one of the most challenging problems in the theory of Lie superalgebras, and in the type A case was first solved by Serganova [Se]. Later on, inspired by [LLT], Brundan in [B] provided an elegant new solution of the problem. To be more precise, Brundan in [B, Conjecture 4.32 and (4.35)] gave a conjectural character formula for every irreducible highest weight gl(m|n)-module in the Bernstein-Gelfand-Gelfand category $\mathcal{O}$ in terms of certain Brundan-Kazhdan-Lusztig polynomials. The validity of the conjecture would imply a remarkable formulation of the Kazhdan-Lusztig theory of $\mathcal{O}$ in terms of canonical and dual basis on a certain Fock space. Brundan then solved the finite-dimensional irreducible character problem by verifying the conjecture for the subcategory of finite-dimensional gl(m|n)-modules, in which case the Fock space is $\mathcal{E}^{m|n}$ (see Sect. 4.3). One of the main purposes of the present paper is to establish Brundan’s conjecture for a substantially larger subcategory of $\mathcal{O}$ of gl(m|n)-modules, which includes all the finite-dimensional ones. We note that a similar Fock space formulation is known among experts for modules of the general linear algebra gl(m + n) in the category $\mathcal{O}$, and in particular for modules in the maximal parabolic subcategory corresponding to the Levi subalgebra gl(m) ⊕ gl(n), in which case the Fock space is $\mathcal{E}^{m+n}$ (see Sect. 4.3).
Let $\mathfrak{g}$ and $\overline{\mathfrak{g}}$ denote direct limits of general linear algebras gl(m + n) and of general linear superalgebras gl(m|n), respectively, as n → ∞ (see Sect. 2.2 and Sect. 2.3). Motivated by [B] it was shown in [CWZ] that in the limit n → ∞ the Fock spaces $\mathcal{E}^{m+n}$ and $\mathcal{E}^{m|n}$ have compatible canonical and dual canonical bases, and that the Kazhdan-Lusztig polynomials in $\mathcal{E}^{m+\infty}$ and $\mathcal{E}^{m|\infty}$ can be identified. Now the Kazhdan-Lusztig polynomials of $\mathcal{E}^{m+\infty}$ describe the $\mathfrak{g}$-module category $\mathcal{O}^f_{[-m,-2]}$ whose objects are the direct limits of modules in the above-mentioned maximal parabolic subcategory, while those of $\mathcal{E}^{m|\infty}$ describe the $\overline{\mathfrak{g}}$-module category $\overline{\mathcal{O}}^f_{[-m,-2]}$ whose objects are direct limits of finite-dimensional gl(m|n)-modules. From this and Brundan’s formulation it follows that the classical (parabolic) Kazhdan-Lusztig polynomials of the general linear algebra also give a solution to the finite-dimensional irreducible character problem for the general linear superalgebra.
Motivated by [CWZ] Wang and the first author in [CW] compare a more general parabolic $\mathfrak{g}$-module category $\mathcal{O}^f_Y$ with a corresponding $\overline{\mathfrak{g}}$-module category $\overline{\mathcal{O}}^f_Y$, where Y here is any subset of [−m, −2] (see Sects. 2.2, 2.3, and Remark 3.13). A precise statement of the parabolic Brundan conjecture ([B, Conj. 4.32]) on the character of irreducible $\overline{\mathfrak{g}}$-modules in $\overline{\mathcal{O}}^f_Y$ was given in [CW, Conj. 3.10]. The results in [CWZ,CW] suggest a direct connection between the categories $\mathcal{O}^f_Y$ and $\overline{\mathcal{O}}^f_Y$. In fact, the categories $\mathcal{O}^f_Y$ and $\overline{\mathcal{O}}^f_Y$ are conjectured to be equivalent in [CW, Conj. 4.18], which was referred to as super duality.
The purpose of the present paper is to establish this super duality. Our main idea is the introduction of a bigger Lie superalgebra $\widetilde{\mathfrak{g}}$ (Sect. 2.1), which contains and interpolates $\mathfrak{g}$ and $\overline{\mathfrak{g}}$. We then study a corresponding category $\widetilde{\mathcal{O}}^f_Y$ of $\widetilde{\mathfrak{g}}$-modules and define certain truncation functors $T : \widetilde{\mathcal{O}}^f_Y \to \mathcal{O}^f_Y$ and $\overline{T} : \widetilde{\mathcal{O}}^f_Y \to \overline{\mathcal{O}}^f_Y$ (Sect. 3.2). These functors are shown to send parabolic Verma $\widetilde{\mathfrak{g}}$-modules to the respective parabolic Verma $\mathfrak{g}$- and $\overline{\mathfrak{g}}$-modules, and furthermore irreducible $\widetilde{\mathfrak{g}}$-modules to the respective irreducible $\mathfrak{g}$- and $\overline{\mathfrak{g}}$-modules. From this we obtain in Theorem 3.16 a solution of the irreducible character problem for $\overline{\mathfrak{g}}$-modules in $\overline{\mathcal{O}}^f_Y$. The solution of the irreducible character problem then allows us to compare the Kazhdan-Lusztig polynomials in $\mathcal{O}^f_Y$ with those in $\overline{\mathcal{O}}^f_Y$. This then enables us to prove in Theorem 5.1 that the functors $T$ and $\overline{T}$ define equivalences of categories from which super duality follows.
Note that a special case of the super duality conjecture was already formulated for the categories $\mathcal{O}^f_{[-m,-2]}$ and $\overline{\mathcal{O}}^f_{[-m,-2]}$ in [CWZ, Conj. 6.10]. A proof of this special case was announced recently in [BS], with a proof to appear in a sequel of [BS]. Our method differs significantly from that of Brundan and Stroppel, as ours is independent of [B]. Furthermore our approach enables us to explicitly construct functors inducing this equivalence, and it is applicable to more general module categories. We want to emphasize that, in contrast to [CWZ], the arguments presented in this article do not depend on [B] or [Se], and hence Theorem 3.16 also gives an independent solution to the finite-dimensional irreducible character problem for the general linear superalgebra as a special case. By directly relating the irreducible character problem of Lie superalgebras to that of Lie algebras our solution of the problem becomes
surprisingly elementary. Our method is applicable to other finite and infinite-dimensional superalgebras, e.g. the ortho-symplectic Lie superalgebras [CLW].
This article is organized as follows. In Sect. 2 the Lie superalgebras $\widetilde{\mathfrak{g}}$, $\mathfrak{g}$ and $\overline{\mathfrak{g}}$ are defined, together with the module categories $\widetilde{\mathcal{O}}^f_Y$, $\mathcal{O}^f_Y$ and $\overline{\mathcal{O}}^f_Y$. In Sect. 3 the main tool, odd reflections [LSS], of making connections between these categories is introduced, and the crucial Lemma 3.2 is proved. In Sect. 4 we show that the Kazhdan-Lusztig polynomials of these categories coincide, from which we then derive in Sect. 5 the equivalence of these categories.
We conclude this Introduction by setting the notation to be used throughout this article. The symbols $\mathbb{Z}$, $\mathbb{N}$, and $\mathbb{Z}_+$ stand for the sets of all, positive and non-negative integers, respectively. For $m \in \mathbb{Z}$ we set $\langle m \rangle := m$, if $m > 0$, and $\langle m \rangle := 0$, otherwise. For integers $a < b$ we set $[a, b] := \{a, a + 1, \ldots, b\}$. Let $\mathcal{P}$ denote the set of partitions. For $\lambda \in \mathcal{P}$ we denote by $\lambda'$ the transpose partition of $\lambda$, by $\ell(\lambda)$ the length of $\lambda$ and by $s_\lambda(y_1, y_2, \ldots)$ the Schur function in the indeterminates $y_1, y_2, \ldots$ associated with $\lambda$. For a super space $V = V_{\bar{0}} \oplus V_{\bar{1}}$ and a homogeneous element $v \in V$, we use the notation $|v|$ to denote the $\mathbb{Z}_2$-degree of $v$. Let $U(\mathfrak{g})$ denote the universal enveloping algebra of a Lie (super)algebra $\mathfrak{g}$. Finally all vector spaces, algebras, tensor products, et cetera, are over the field of complex numbers $\mathbb{C}$.
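The combinatorial conventions fixed in the notation paragraph (the truncation of an integer to its positive part, the integer interval [a, b], and the transpose and length of a partition) are easy to make concrete. The following is a minimal sketch with hypothetical function names, representing a partition as a weakly decreasing list of positive integers; it is meant only as an illustration of the definitions.

```python
def positive_part(m):
    """Return m if m > 0 and 0 otherwise."""
    return m if m > 0 else 0


def interval(a, b):
    """The integer interval [a, b] = {a, a + 1, ..., b}, defined for a < b."""
    if not a < b:
        raise ValueError("interval(a, b) requires a < b")
    return list(range(a, b + 1))


def length(partition):
    """Number of nonzero parts of the partition."""
    return sum(1 for part in partition if part > 0)


def transpose(partition):
    """Transpose (conjugate) partition: its j-th part counts the parts
    of the original partition that are strictly greater than j."""
    parts = [part for part in partition if part > 0]
    if not parts:
        return []
    return [sum(1 for part in parts if part > j) for j in range(parts[0])]
```

For instance, the partition (4, 2, 1) has length 3 and transpose (3, 2, 1, 1); transposing twice returns the original partition.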
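2. The Lie Superalgebras $\widetilde{\mathfrak{g}}$, $\mathfrak{g}$ and $\overline{\mathfrak{g}}$
2.1. The Lie superalgebra $\widetilde{\mathfrak{g}}$. For $m \in \mathbb{N}$, let $\widetilde{V}$ denote the complex super space with homogeneous basis $\{v_r \mid r \in [-m, -1] \cup \frac{1}{2}\mathbb{N}\}$. The $\mathbb{Z}_2$-gradation is determined by $|v_r| = \bar{1}$, for $r \in \frac{1}{2} + \mathbb{Z}_+$, and $|v_i| = \bar{0}$, for $i \in [-m, -1] \cup \mathbb{N}$. We denote by $\widetilde{\mathfrak{g}}$ the Lie superalgebra of endomorphisms of $\widetilde{V}$ vanishing on all but finitely many $v_r$'s. For $r, s, p \in [-m, -1] \cup \frac{1}{2}\mathbb{N}$, let $E_{rs}$ denote the endomorphism defined by $E_{rs}(v_p) := \delta_{sp} v_r$. Then $\widetilde{\mathfrak{g}}$ equals the Lie superalgebra spanned by these $E_{rs}$'s. Let g0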
$$\frac{f_R(q^n, Q^n)}{n}\, \mathrm{Tr}_R V^n,$$
where $f_R(q, Q)$ is completely determined by the BPS degeneracies of the M2 brane, $N_{R,J,s}$, where R denotes the representation the BPS state transforms in, J is the charge of the brane and s is the spin. Moreover the sign of N is correlated with its fermion number. It was proposed in [8] that there is a further charge one can consider in labeling the BPS states of M2 branes ending on M5 branes: The normal geometry to the M5 brane includes, in addition to the spacetime $\mathbb{R}^3$, and the three normal directions inside the CY, an extra $\mathbb{R}^2$ plane. It was proposed there that the extra SO(2) rotation in this plane will provide an extra gradation which could be viewed as a refinement of topological strings and it was conjectured that this is related to link homologies that we will review in the next section. This gives a refinement of $N_{R,J,s} \to N_{R,J,r,s}$. In other words for a given representation R we have a triply graded structure labeling the BPS states.
2 For a complete mathematical proof of the integrality of $N_{R,J,s}$ see [26].
Link Homologies and the Refined Topological Vertex
3. Link Homologies and Topological Strings
Now, let us proceed to describing the properties of link homologies suggested by their relation to Hilbert spaces of BPS states. We mostly follow notations of [8,9]. Let L be an oriented link in $S^3$ with $\ell$ components, $K_1, \ldots, K_\ell$. We shall consider homological as well as polynomial invariants of L whose components are colored by representations $R_1, \ldots, R_\ell$ of the Lie algebra g. Although in this paper we shall consider only g = sl(N), there is a natural generalization to other classical Lie algebras of type B, C, and D. In particular, there are obvious analogs of the structural properties of sl(N) knot homologies for so(N) and sp(N) homologies (see [10,28] for some work in this direction). Given a link colored by a collection of representations $R_1, \ldots, R_\ell$ of sl(N), we denote the corresponding polynomial invariant by
$$\overline{P}_{sl(N);R_1,\ldots,R_\ell}(q).$$ (20)
Here and below, the “bar” means that (20) is the unnormalized invariant; its normalized version $P_{sl(N);R_1,\ldots,R_\ell}(q)$ obtained by dividing by the invariant of the unknot is written without a bar. Since this “reduced” version depends on the choice of the “preferred” component of the link L, below we mainly consider a more natural, unnormalized invariant (20). In the special case when every $R_a$, $a = 1, \ldots, \ell$, is the fundamental representation of sl(N) we simply write
$$\overline{P}_N(q) \equiv \overline{P}_{sl(N);\square,\ldots,\square}(q).$$ (21)
The polynomial invariants (20) are related to expectation values of Wilson loop operators $W(L) = W_{R_1,\ldots,R_\ell}(L)$ in Chern-Simons theory. For example, the polynomial sl(N) invariant $\overline{P}_N(q)$ is related to the expectation value of the Wilson loop operator $W(L) = W_{\square,\ldots,\square}(L)$,
$$\overline{P}_N(L) = q^{-2N\,\mathrm{lk}(L)} \langle W(L) \rangle,$$ (22)
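where lk(L) = a 0, which remains fixed throughout the present section, and divide the interval $[0, \pi m/2] = [0, C] \cup [C, \pi m/2]$ (see Sect. 3.1). We then have the following lemma (cf. Lemma 3.3); to prove it just use (86) and the bound (88) for $W_0^\varphi$ together with Lemma 3.2.
Lemma 4.3. For any constant C > 0, we have as n → ∞,
$$I_n^\varphi = 2 \tilde{I}_n^\varphi + O_{\|\varphi\|_\infty}\left(\frac{1}{n}\right),$$ (90)
where
$$\tilde{I}_n^\varphi := \int_C^{\pi m/2} \left( K_n(\psi) - \frac{1}{4} \right) W^\varphi\left(\frac{\psi}{m}\right) d\psi.$$ (91)
Proof of Theorem 1.4. First we evaluate $\tilde{I}_n^\varphi$ as defined in (91). Plugging (56) into (91), we have (cf. (59))
$$\tilde{I}_n^\varphi = \int_C^{\pi m/2} \left[ \frac{1}{2} \frac{\sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{65}{256} \frac{1}{\pi^2 n \sin(\psi/m)\psi} + \frac{9}{32} \frac{\cos(2\psi)}{\pi n \psi \sin(\psi/m)} + \frac{\frac{27}{64}\sin(2\psi) - \frac{11}{256}\cos(4\psi)}{\pi^2 n \psi \sin(\psi/m)} \right] W^\varphi\left(\frac{\psi}{m}\right) d\psi + O\left( \int_C^{\pi m/2} \left( \frac{1}{\psi^3} + \frac{1}{n\psi} \right) W^\varphi\left(\frac{\psi}{m}\right) d\psi \right)$$
$$= \frac{1}{16\pi^3 n} \int_C^{\pi m/2} \left[ \sin(2\psi) + \frac{65}{128} \frac{1}{\pi\psi} + \frac{9}{16} \frac{\cos(2\psi)}{\psi} + \frac{\frac{27}{32}\sin(2\psi) - \frac{11}{128}\cos(4\psi)}{\pi\psi} \right] W_0^\varphi\left(\frac{\psi}{m}\right) d\psi + O\left( \|\varphi\|_\infty^2 \cdot \frac{1}{n} \right),$$ (92)
with constants involved in the “O”-notation universal. Here we used the identity (86); to effectively control the error term we use (88).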
I. Wigman
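We integrate by parts the first oscillatory term in (92), using the continuous differentiability assumptions; this yields the bound for its contribution
$$\frac{1}{n} \int_C^{\pi m/2} \sin(2\psi) W_0^\varphi(\psi/m)\, d\psi \ll \frac{1}{n} \left| \cos(2\psi) \cdot W_0^\varphi(\psi/m) \Big|_{\psi=\pi m/2}^{C} \right| + \frac{1}{n^2} \left| \int_C^{\pi m/2} \cos(2\psi) \left(W_0^\varphi\right)'(\psi/m)\, d\psi \right| \ll \frac{\|\varphi\|_\infty^2}{n} + \frac{\left\| \left(W_0^\varphi\right)' \right\|_{L^1([0,\pi])}}{n} \ll \left( \|\varphi\|_\infty^2 + \|\varphi\|_\infty V(\varphi) \right) \cdot \frac{1}{n}$$
with constants involved in the “≪”-notation universal, by (89). It is easy to establish similar bounds for the remaining oscillatory terms in (92), i.e. the 3rd and the 4th terms.
To analyze the main contribution, which comes from the remaining second term in (92), we note that the continuous differentiability of $W_0^\varphi$ implies
$$W_0^\varphi(\phi) = 2\pi \|\varphi\|^2_{L^2(S^2)} + O_{\|\varphi\|_\infty, V(\varphi)}(\phi),$$
by (87) and (89). The main contribution to (92) is then
$$\frac{65}{2048\pi^4} \frac{1}{n} \int_C^{\pi m/2} \frac{W_0^\varphi(\psi/m)}{\psi}\, d\psi = \frac{65}{1024\pi^3} \|\varphi\|^2_{L^2(S^2)} \cdot \frac{1}{n} \int_C^{\pi m/2} \frac{d\psi}{\psi} + O_{\|\varphi\|_\infty, V(\varphi)}\left( \frac{1}{n^2} \int_C^{\pi m/2} d\psi \right) = \frac{65}{1024\pi^3} \|\varphi\|^2_{L^2(S^2)} \cdot \frac{\log n}{n} + O_{\|\varphi\|_\infty, V(\varphi)}\left( \frac{1}{n} \right).$$
All in all we evaluated $\tilde{I}_n^\varphi$ as
$$\tilde{I}_n^\varphi = \frac{65}{1024\pi^3} \|\varphi\|^2_{L^2(S^2)} \cdot \frac{\log n}{n} + O_{\|\varphi\|_\infty, V(\varphi)}\left( \frac{1}{n} \right).$$
Plugging this into (90) yields
$$I_n^\varphi = \frac{65}{512\pi^3} \|\varphi\|^2_{L^2(S^2)} \cdot \frac{\log n}{n} + O_{\|\varphi\|_\infty, V(\varphi)}\left( \frac{1}{n} \right).$$ (93)
We finally obtain the statement of Theorem 1.4 by plugging (93) into (84).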
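The bookkeeping of constants in this proof can be double-checked mechanically. The sketch below is only an arithmetic sanity check under stated assumptions, not part of the argument: it assumes that passing from $W^\varphi$ to $W_0^\varphi$ via (86) contributes a factor $\sin(\phi)/(8\pi^2)$ (an inference from comparing the two lines of (92), since (86) is not restated here), and all function names are hypothetical. It confirms that the rational coefficients in the two lines of (92) match, that the leading constant passes from 65/2048 to 65/1024 to 65/512 (suppressing the powers of π), and that $\int_C^{\pi m/2} d\psi/\psi$ differs from log n by a bounded amount.

```python
import math
from fractions import Fraction as F

# Rational coefficients of the five terms in the first line of (92) ...
FIRST_LINE = [F(1, 2), F(65, 256), F(9, 32), F(27, 64), F(11, 256)]
# ... and in the second line, after pulling out 1 / (16 pi^3 n).
SECOND_LINE = [F(1), F(65, 128), F(9, 16), F(27, 32), F(11, 128)]


def rescale(coeffs):
    """Multiply by 16/8: the assumed factor 1/(8 pi^2) from (86) divided
    by the prefactor 1/(16 pi^3 n), ignoring powers of pi and n."""
    return [c * F(16, 8) for c in coeffs]


def leading_constants():
    """Rational parts of the leading constants of the tilde-I and I
    asymptotics, i.e. the coefficients of ||phi||^2 log(n) / (pi^3 n)."""
    main_term = F(65, 128) * F(1, 16)  # 65 / (2048 pi^4): second term of (92)
    tilde_i = main_term * 2            # W_0(0) = 2 pi ||phi||^2 gives 65 / (1024 pi^3)
    full_i = tilde_i * 2               # (90): I = 2 * tilde-I gives 65 / (512 pi^3)
    return tilde_i, full_i


def log_integral(n, c=1.0):
    """Exact value of the integral of d(psi)/psi from c to pi*m/2, m = n + 1/2."""
    return math.log(math.pi * (n + 0.5) / (2.0 * c))
```

With C = 1 the difference `log_integral(n) - math.log(n)` tends to log(π/2), which is why the logarithmic integral may be replaced by log n at the cost of an O(1/n) error after multiplying by 1/n.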
5. Proof of Theorem 1.5 As implied by the formulation of Theorem 1.5, in this section we will deal with functions of bounded variation. The definition and some basic properties of the class BV (S 2 ) of functions of bounded variation is given in Appendix C.
Fluctuations of Nodal Length
5.1. On the proof of Theorem 1.5. To prove Theorem 1.5 one wishes to apply a standard approximation argument, approximating our test function ϕ of bounded variation with a sequence ϕi of C ∞ , for which we can apply Theorem 1.4. There are two major issues with this approach however. On one hand, one needs to check that ϕi approximating ϕ implies the corresponding statement for the random variables Z ϕ ( f n ) and Z ϕi ( f n ), and, in particular, their variance. While it is easy to check that if ϕi → ϕ in L 1 then for every fixed n we also have E[Z ϕi ( f n )] → E[Z ϕ ( f n )], the analogous statement for the variance is much less trivial (see Proposition 5.17 ). On the other hand, when applying Theorem 1.4 for ϕi , one needs to control the error term in (14), which may a priori depend on ϕi . To resolve the latter we take advantage8 of the fact that Theorem 1.4 allows us to control the dependency of the error term in (14) on the test function in terms of its L ∞ norm and total variation. Thus to resolve this issue it would be sufficient to require from ϕi to be essentially uniformly bounded and having uniformly bounded total variation. Fortunately the standard symmetric mollifiers construction from [13] as given in Appendix C satisfy both the requirements above. Namely given a function ϕ ∈ BV (S 2 ) we obtain a sequence ϕi of the C ∞ function, that converge in L 1 to ϕ, ϕi ∞ ≤ ϕ∞ and in addition V (ϕi ) → V (ϕ). 5.2. Continuity of the distribution of Z ϕ . As pointed in Sect. 5.1, to prove Theorem 1.5 we will need to show that the distribution of Z ϕ depends continuously on ϕ. Proposition 5.1 makes this statement precise. We believe that it is of independent interest. Proposition 5.1. Let ϕ ∈ BV (S 2 ) ∩ L ∞ (S 2 ) be any test function. Then E Z ϕ ( f n )2 = O n 2 ϕ2L 1 (S 2 ) + ϕ∞ ϕ L 1 (S 2 ) ,
(94)
where the constant involved in the “O”-notation are universal. In particular, if F ⊆ S 2 has a C 2 boundary then 2 F = O(n 2 |F|2 + |F|). E Z ( fn ) Proof. Recall that we defined W ϕ as (82); the assumption ϕ ∈ L ∞ (S 2 ) saves us from dealing with the validity of this definition. Starting from (121), and repeating the steps in the proof of Lemma 2.1 from either [23] or [4,5], we may extend the validity of the Kac-Rice formula (84) with (85) for this class as well. Note that the constant term 7 Proposition 5.1 gives a stronger claim. First, it evaluates the second moment rather than the variance.
2
2 Secondly, it gives a general bound for E Zϕi ( f n ) − Zϕ ( f n ) = E Zϕi −ϕ ( f n ) . It is easy to derive the result we need employing the triangle inequality. 8 This is by no means a lucky coincidence; it is precisely the proof of Theorem 1.5 that motivated the technical statement made in Theorem 1.4.
822
I. Wigman
in (85) comes from the squared expectation, so that we need to omit it if we want to compute the second moment. We then have E
2 = 8π 2 Z ϕ ( fn )
En J ϕ, n + 1/2 n
(95)
where π m/2
Jnϕ
=
K n (ψ) W
ϕ
ψ m
dφ,
0
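denoting as usual $m := n + 1/2$. As usual when estimating integrals of this kind, we separate out the origin by choosing a constant $C > 0$ and writing
\[
  J_n^{\phi} = J_{n,1}^{\phi} + J_{n,2}^{\phi}, \tag{96}
\]
where
\[
  J_{n,1}^{\phi} = \int_0^{C} K_n(\psi)\, W^{\phi}\!\left( \frac{\psi}{m} \right) d\psi
\]
and
\[
  J_{n,2}^{\phi} = \int_C^{\pi m/2} K_n(\psi)\, W^{\phi}\!\left( \frac{\psi}{m} \right) d\psi.
\]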
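First, for $C < \psi < \pi m/2$, $K_n(\psi)$ is bounded by a constant, which may depend only on $C$, i.e.
\[
  |K_n(\psi)| = O_C(1),
\]
which follows directly from Proposition 3.5. Therefore we may bound $J_{n,2}^{\phi}$ as
\[
  |J_{n,2}^{\phi}| \ll \int_C^{\pi m/2} \left| W^{\phi}\!\left( \frac{\psi}{m} \right) \right| d\psi
  \le \int_0^{\pi m/2} \left| W^{\phi}\!\left( \frac{\psi}{m} \right) \right| d\psi
  = m \int_0^{\pi/2} \left| W^{\phi}(\varphi) \right| d\varphi
  \ll n \|\phi\|^2_{L^1(S^2)}, \tag{97}
\]
as earlier. We claim that for $0 < \psi < C$ we may bound $K_n$ as
\[
  |K_n(\psi)| = O_C\!\left( \frac{1}{\psi} \right). \tag{98}
\]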
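Before proving this estimate we will show how it helps us to bound $J_{n,1}^{\phi}$. We have by the definition of $J_{n,1}^{\phi}$,
\[
  |J_{n,1}^{\phi}| \ll_C \int_0^{C} \frac{1}{\psi} \left| W^{\phi}\!\left( \frac{\psi}{m} \right) \right| d\psi
  = \int_0^{C/m} \frac{1}{\varphi} \left| W^{\phi}(\varphi) \right| d\varphi
  \ll_C \frac{1}{n} \|\phi\|_\infty \|\phi\|_{L^1(S^2)}, \tag{99}
\]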
by (86) and the first inequality of (88). The statement of the proposition now follows from plugging the estimates (97) and (99) into (96) and (95). We still have to prove (98), though. To see (98) we use Remark 2.10 and the Cauchy-Schwarz inequality to write
\[
  K_n(\psi) = \frac{1}{(2\pi)\sqrt{1 - P_n(\cos(\psi/m))^2}} \, E\left[ \|U\| \cdot \|V\| \right], \tag{100}
\]
where $U$ and $V$ are 2-dimensional mean zero Gaussian vectors with covariance matrix (45), whose entries are uniformly bounded by an absolute constant, whence
\[
  E\left[ \|U\| \cdot \|V\| \right] \le \sqrt{ E\left[\|U\|^2\right] E\left[\|V\|^2\right] } = O(1), \tag{101}
\]
with the constant involved in the “O”-notation uniform. For the other term Lemma B.2 yields
\[
  \frac{1}{\sqrt{1 - P_n(\cos(\psi/m))^2}} \ll \frac{1}{\psi}, \tag{102}
\]
so that we obtain the necessary bound (98) for $K_n(\psi)$ by plugging the estimates (101) and (102) into (100).

5.3. Proof of Theorem 1.5. Now we are ready to give a proof of Theorem 1.5.

Proof of Theorem 1.5. Given a function $\phi \in BV(S^2)$, let $\phi_i \in C^\infty$ be a sequence of smooth functions such that
\[
  \phi_i \to \phi \ \text{in } L^1(S^2), \qquad V_i := V(\phi_i) \to V(\phi) \qquad \text{and} \qquad \|\phi_i\|_\infty \le \|\phi\|_\infty \tag{103}
\]
(see Appendix C). Let $M_1 := \|\phi\|_\infty$ and $M_2 := \max\{V_i\}_{i \ge 1} < \infty$, since $V_i$ is convergent. Theorem 1.4 applied to $\phi_i \in C^\infty(S^2)$ states that
\[
  \mathrm{Var}\left( Z^{\phi_i}(f_n) \right) = c(\phi_i) \cdot \log n + O_{M_1, M_2}(1), \tag{104}
\]
where $c(\phi_i)$ is given by
\[
  c(\phi_i) := 65 \, \frac{\|\phi_i\|^2_{L^2(S^2)}}{128\pi} > 0.
\]
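As a quick sanity check on the size of this constant (an illustrative computation, not part of the original argument; it assumes that the constant test function $\phi \equiv 1$ is admissible and uses only that the unit sphere has area $4\pi$):

```latex
\[
  c(1) \;=\; 65\,\frac{\|1\|^2_{L^2(S^2)}}{128\pi}
       \;=\; 65\cdot\frac{4\pi}{128\pi}
       \;=\; \frac{65}{32} \;\approx\; 2.03 .
\]
```

Under that assumption the leading term in (104) would read $\frac{65}{32}\log n$.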
Note that since $\phi_i$ and $\phi$ are uniformly bounded by (103), $L^1(S^2)$ convergence implies $L^2(S^2)$ convergence, because $\|\phi_i - \phi\|^2_{L^2(S^2)} \le \|\phi_i - \phi\|_\infty \|\phi_i - \phi\|_{L^1(S^2)} \le 2 M_1 \|\phi_i - \phi\|_{L^1(S^2)}$, so that
\[
  c(\phi_i) \to c(\phi), \tag{105}
\]
the latter being given by (13). On the other hand we know from Proposition 5.1 that
\[
  E\left[ \left( Z^{\phi_i}(f_n) - Z^{\phi}(f_n) \right)^2 \right] = E\left[ Z^{\phi_i - \phi}(f_n)^2 \right] \to 0,
\]
using the uniform boundedness (103) again to ensure that (94) holds uniformly. This together with the triangle inequality implies that
\[
  \mathrm{Var}\left( Z^{\phi_i}(f_n) \right) \to \mathrm{Var}\left( Z^{\phi}(f_n) \right), \tag{106}
\]
and we take the limit $i \to \infty$ in (104) to finally obtain the main statement of Theorem 1.5.

Remark 5.2. From the proof presented, it is easy to see that the constant in the “O”-notation in the statement (14) of Theorem 1.5 could be made dependent only on $\|\phi\|_\infty$ and $V(\phi)$.

Appendix A. Computation of the Covariance Matrix

In this section we explicitly compute the $4 \times 4$ covariance matrix prescribed by (37), namely that of the mean zero Gaussian random vector $Z_2$ in (23), with $x \ne y \in S^2$ any two points on the arc $\{\theta = 0\}$ with $d(x, y) = \varphi$, conditioned upon $f(x) = f(y) = 0$. Recall that this matrix is given by (34), where $A = A_n(x, y)$, $B = B_n(x, y)$ and $C = C_n(x, y)$ are given by (30), (31) and (32) respectively. Here the gradients are given in the orthonormal frame (36) of the tangent planes $T_x(S^2)$ associated to the spherical coordinates (see Sect. 2.4 for explanation). Let $x$ and $y$ correspond to the spherical coordinates $(\varphi_x, \theta_x = 0)$ and $(\varphi_y, \theta_y = 0)$, and denote $\varphi = d(x, y) = |\varphi_x - \varphi_y|$. Recall that
\[
  u_n(x, y) = P_n(\cos(d(x, y))) = P_n(\cos\varphi).
\]
First we compute the inverse of $A$ in (30) as
\[
  A_n(\varphi)^{-1} = \frac{1}{1 - P_n(\cos\varphi)^2}
  \begin{pmatrix} 1 & -P_n(\cos\varphi) \\ -P_n(\cos\varphi) & 1 \end{pmatrix}. \tag{107}
\]
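It is easy to either see from the geometric picture or compute explicitly that
\[
  \nabla_x u_n(x, y) = -\nabla_y u_n(x, y) = \pm P_n'(\cos\varphi) \sin(\varphi)\,(1, 0), \tag{108}
\]
depending on whether $\varphi_x > \varphi_y$ or $\varphi_x < \varphi_y$, so that
\[
  B_n(\varphi) = \pm \begin{pmatrix}
    0 & 0 & P_n'(\cos\varphi)\sin\varphi & 0 \\
    -P_n'(\cos\varphi)\sin\varphi & 0 & 0 & 0
  \end{pmatrix}. \tag{109}
\]
Next we turn to the missing part of $C_n(\varphi)$ defined in (32), i.e. the “pseudo-Hessian” $H_n(\varphi)$ given by (33). By the chain rule
\[
  H_n(\varphi) = \nabla_x \otimes \nabla_y u_n(x, y)
  = \nabla_x \otimes \left[ P_n'(\cos(d(x, y))) \nabla_y \cos(d(x, y)) \right]
\]
\[
  = P_n''(\cos\varphi)\, \nabla_x \cos(d(x, y)) \otimes \nabla_y \cos(d(x, y))
  + P_n'(\cos\varphi)\, \nabla_x \otimes \nabla_y \cos(d(x, y)). \tag{110}
\]
We denote
\[
  h(x, y) := \cos d(x, y) = \cos\varphi_x \cos\varphi_y + \sin\varphi_x \sin\varphi_y \cos(\theta_x - \theta_y),
\]
and compute explicitly that for $\theta_x = \theta_y = 0$ we have
\[
  \nabla_x \otimes \nabla_y \cos(d(x, y)) = \nabla_x \otimes \nabla_y h(x, y)
  = \begin{pmatrix} \cos\varphi & 0 \\ 0 & 1 \end{pmatrix}. \tag{111}
\]
Plugging (108) and (111) into (110) we obtain
\[
  H = \begin{pmatrix}
    P_n'(\cos\varphi)\cos\varphi - P_n''(\cos\varphi)\sin(\varphi)^2 & 0 \\
    0 & P_n'(\cos\varphi)
  \end{pmatrix}. \tag{112}
\]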
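The matrix in (111) can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes that the orthonormal frame is $(\partial/\partial\varphi,\ (\sin\varphi)^{-1}\partial/\partial\theta)$ and approximates the mixed second derivatives of $h$ by central finite differences. The function names `h` and `mixed_hessian` are hypothetical helpers introduced only for this illustration.

```python
import math


def h(phi_x, theta_x, phi_y, theta_y):
    """cos d(x, y) in spherical coordinates (phi = colatitude, theta = longitude)."""
    return (math.cos(phi_x) * math.cos(phi_y)
            + math.sin(phi_x) * math.sin(phi_y) * math.cos(theta_x - theta_y))


def mixed_hessian(phi_x, phi_y, eps=1e-4):
    """Finite-difference approximation of grad_x (tensor) grad_y h at theta_x = theta_y = 0.

    Entries are expressed in the (assumed) orthonormal frames
    (d/dphi, (1/sin phi) d/dtheta) at x and at y.
    """
    def d2(ix, iy):
        # Central difference for the mixed partial derivative in
        # coordinate ix of x (0 = phi, 1 = theta) and coordinate iy of y.
        total = 0.0
        for sx in (1, -1):
            for sy in (1, -1):
                px = [phi_x, 0.0]
                py = [phi_y, 0.0]
                px[ix] += sx * eps
                py[iy] += sy * eps
                total += sx * sy * h(px[0], px[1], py[0], py[1])
        return total / (4.0 * eps * eps)

    scale_x = [1.0, 1.0 / math.sin(phi_x)]
    scale_y = [1.0, 1.0 / math.sin(phi_y)]
    return [[scale_x[i] * scale_y[j] * d2(i, j) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    H = mixed_hessian(1.1, 0.4)
    print("finite differences:", H)
    print("cos(phi) for comparison:", math.cos(1.1 - 0.4))
```

The diagonal entries should be close to $\cos\varphi$ and $1$ with $\varphi = |\varphi_x - \varphi_y|$, and the off-diagonal entries close to zero, up to the discretisation error of the finite differences.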
Finally, plugging (112) into (32), and plugging that together with (107) and (109) into (34), we obtain an explicit expression for the covariance matrix as prescribed by (37), with entries given by (38), (39) and (40).

Appendix B. Estimates for the Legendre Polynomials and Related Functions

The goal of this section is to give a brief introduction to the Legendre polynomials $P_n : [-1, 1] \to \mathbb{R}$ and to collect the basic information needed for the purposes of the present paper. The high degree asymptotic analysis of the behaviour of $P_n$ and its first two derivatives involves Hilb’s asymptotics in Lemma B.1 together with the recursion (114) for the first derivative and the differential equation (113) for the second one. We refer the reader to [17] for more information. The Legendre polynomials $P_n$ are defined as the unique polynomials of degree $n$ orthogonal w.r.t. the constant weight function $\omega(t) \equiv 1$ on $[-1, 1]$ with the normalization $P_n(1) = 1$. They satisfy the following second order differential equation:
\[
  P_n''(\cos(\psi/m)) = -\frac{n(n+1)}{\sin(\psi/m)^2} P_n(\cos(\psi/m)) + \frac{2\cos(\psi/m)}{\sin(\psi/m)^2} P_n'(\cos(\psi/m)), \tag{113}
\]
as well as the recursion
\[
  P_n'(\cos(\psi/m)) = \left( P_{n-1}(\cos(\psi/m)) - \cos(\psi/m) P_n(\cos(\psi/m)) \right) \frac{n}{\sin(\psi/m)^2}. \tag{114}
\]
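Both identities are easy to test numerically. The sketch below is an illustration rather than anything from the paper: it evaluates $P_n$ by the standard three-term recurrence, computes $P_n'$ from (114) and $P_n''$ from (113) written in the variable $t = \cos(\psi/m)$, and compares them with central finite differences. The helper names `legendre`, `dlegendre` and `d2legendre` are hypothetical.

```python
import math


def legendre(n, t):
    """Return (P_{n-1}(t), P_n(t)) via the three-term recurrence, for n >= 1."""
    p_prev, p = 1.0, t
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p_prev, p


def dlegendre(n, t):
    """P_n'(t) from the recursion (114): (1 - t^2) P_n' = n (P_{n-1} - t P_n)."""
    p_prev, p = legendre(n, t)
    return n * (p_prev - t * p) / (1.0 - t * t)


def d2legendre(n, t):
    """P_n''(t) from the differential equation (113), with sin^2 = 1 - t^2."""
    _, p = legendre(n, t)
    return (-n * (n + 1) * p + 2.0 * t * dlegendre(n, t)) / (1.0 - t * t)


if __name__ == "__main__":
    n, psi = 20, 5.0
    t = math.cos(psi / (n + 0.5))   # t = cos(psi / m) with m = n + 1/2
    eps = 1e-6
    fd1 = (legendre(n, t + eps)[1] - legendre(n, t - eps)[1]) / (2 * eps)
    fd2 = (dlegendre(n, t + eps) - dlegendre(n, t - eps)) / (2 * eps)
    print("P_n'  from (114):", dlegendre(n, t), " finite difference:", fd1)
    print("P_n'' from (113):", d2legendre(n, t), " finite difference:", fd2)
```

The two columns should agree to several digits; the choice $n = 20$, $\psi = 5$ is arbitrary.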
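The Hilb asymptotics give the high degree asymptotic behaviour of $P_n$.

Lemma B.1. (Hilb Asymptotics (formula (8.21.17) on p. 197 of Szegő [17]))
\[
  P_n(\cos\varphi) = \left( \frac{\varphi}{\sin\varphi} \right)^{1/2} J_0((n + 1/2)\varphi) + \delta(\varphi), \tag{115}
\]
uniformly for $0 \le \varphi \le \pi/2$, where $J_0$ is the Bessel $J$ function of order 0 and the error term is
\[
  \delta(\varphi) \ll \begin{cases}
    \varphi^{1/2}\, O(n^{-3/2}), & C n^{-1} < \varphi < \pi/2, \\
    \varphi^{\alpha+2}\, O(n^{\alpha}), & 0 < \varphi < C n^{-1},
  \end{cases}
\]
where $C > 0$ is any constant and the constants involved in the “O”-notation depend on $C$ only.

We have the following rough estimate for the behaviour of the Legendre polynomials at $\pm 1$, which follows directly from Hilb’s asymptotics.

Lemma B.2. For $0 < \varphi$ […] $C$:

(1)
\[
  P_n(\cos(\psi/m)) = \sqrt{\frac{2}{\pi}} \frac{1}{\sqrt{n \sin(\psi/m)}} \left( \sin\!\left(\psi + \frac{\pi}{4}\right) - \frac{1}{8} \frac{\cos(\psi + \frac{\pi}{4})}{\psi} \right) + O\!\left( \frac{1}{\psi^{5/2}} + \frac{1}{\sqrt{\psi}\, n} \right), \tag{116}
\]

(2)
\[
  P_n'(\cos(\psi/m)) = \sqrt{\frac{2}{\pi}} \frac{\sqrt{n}}{\sin(\psi/m)^{5/2}} \left( \sin(\psi/m) \sin\!\left(\psi - \frac{\pi}{4}\right) + \frac{3}{8n} \sin\!\left(\psi + \frac{\pi}{4}\right) \right) + O\!\left( \frac{n^2}{\psi^{7/2}} + \frac{n}{\psi^{3/2}} \right), \tag{117}
\]

(3)
\[
  P_n''(\cos(\psi/m)) = -\frac{n^2}{\sin(\psi/m)^2} P_n(\cos(\psi/m)) + \frac{2}{\sin(\psi/m)^2} P_n'(\cos(\psi/m)) + O\!\left( \frac{n^3}{\psi^{5/2}} \right). \tag{118}
\]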
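Proof. By Lemma B.1 and the standard asymptotics for the Bessel functions we obtain
\[
  P_n(\cos(\psi/m)) = \frac{\sqrt{\psi/m}}{\sqrt{\sin(\psi/m)}} J_0(\psi) + O\!\left( \frac{\sqrt{\psi}}{n^2} \right)
\]
\[
  = \sqrt{\frac{2}{\pi}} \frac{\sqrt{\psi/m}}{\sqrt{\sin(\psi/m)}} \left( \frac{\sin(\psi + \frac{\pi}{4})}{\sqrt{\psi}} - \frac{1}{8} \frac{\cos(\psi + \frac{\pi}{4})}{\psi^{3/2}} + O\!\left( \frac{1}{\psi^{5/2}} \right) \right) + O\!\left( \frac{\sqrt{\psi}}{n^2} \right)
\]
\[
  = \sqrt{\frac{2}{\pi}} \frac{1}{\sqrt{n \sin(\psi/m)}} \left( \sin\!\left(\psi + \frac{\pi}{4}\right) - \frac{1}{8} \frac{\cos(\psi + \frac{\pi}{4})}{\psi} \right) + O\!\left( \frac{1}{\psi^{5/2}} + \frac{1}{\sqrt{\psi}\, n} \right),
\]
which is (116). To obtain (117) we employ the recursive formula (114), evaluating the Legendre polynomials appearing there using (116). Finally we obtain a simple approximate differential equation (118), replacing $n(n+1)$ by $n^2$ and $\cos(\psi/m)$ by 1 in the differential equation (113) satisfied by the Legendre polynomials. To do so we use the decay
\[
  |P_n(\cos(\psi/m))| = O\!\left( \frac{1}{\sqrt{\psi}} \right)
\]
of $P_n$, which follows directly from (116), as well as (79) of its derivative.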
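The Bessel step used in this proof, the two-term large-argument expansion of $J_0$, can be illustrated numerically. This is a sketch under stated assumptions, not the author's computation: $J_0$ is evaluated from the classical integral representation $J_0(x) = \frac{1}{\pi}\int_0^{\pi} \cos(x \sin t)\,dt$ with a midpoint rule, and `bessel_j0` and `j0_two_term` are hypothetical helper names.

```python
import math


def bessel_j0(x, n=4000):
    """J_0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, midpoint rule with n nodes."""
    step = math.pi / n
    total = sum(math.cos(x * math.sin((k + 0.5) * step)) for k in range(n))
    return total * step / math.pi


def j0_two_term(x):
    """Two-term expansion: sqrt(2/(pi x)) * (sin(x + pi/4) - cos(x + pi/4) / (8 x))."""
    return math.sqrt(2.0 / (math.pi * x)) * (
        math.sin(x + math.pi / 4) - math.cos(x + math.pi / 4) / (8.0 * x))


if __name__ == "__main__":
    for x in (10.0, 30.0, 100.0):
        exact = bessel_j0(x)
        approx = j0_two_term(x)
        # The difference is expected to be of order x^(-5/2).
        print(x, exact, approx, abs(exact - approx) * x ** 2.5)
```

The last printed column stays bounded as $x$ grows, which is the $O(\psi^{-5/2})$ remainder appearing in the second line of the chain above.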
Appendix C. Functions of Bounded Variation

In this section we give the definition and some basic properties of functions of bounded variation. For more information we refer the reader to [13]. Classically, the variation of a function $\eta : [a, b] \to \mathbb{R}$ on $[a, x]$ is defined as
\[
  V(\eta; x) := \sup_{\lambda:\ t_1 = a\, [\ldots]} \ \sum^{k-1} [\ldots]
\]

[…] $\hat{t}(q)$ we have $t(q) = a$. The function $\check{t} = \tau + t$ is a temporal function and $\check{t}(q) = \tau(q) + a < \tau(p) + b = \check{t}(p)$, a contradiction.

It must be remarked that to every temporal function $t$ there corresponds a flow generated by the future directed timelike unit vector $u = -\nabla t / \sqrt{-g(\nabla t, \nabla t)}$. The generated congruence of timelike curves represents an extended reference frame, so that every curve of the congruence is identified with an observer “at rest in the frame”. The flow is orthogonal to the slices $t = \mathrm{const.}$, which therefore are the natural simultaneity slices as they would be obtained by the observers at rest in the frame by a local application of Einstein’s simultaneity convention [27,30,41]. This observation shows that the temporal functions, at least in principle, can be physically realized through a well defined operational procedure. The above theorem then states that while observers living in different extended reference frames may disagree on which event of a pair comes “before” or “after” the other, according to their own time function, they certainly agree whenever the pair of compared events belongs to the $K^+$ (Seifert) relation, and in fact only for pairs of that type. In other words, the $K^+$ (Seifert) relation is the set of pairs of events for which all the observers agree on their temporal order. Equation (3) can be rewritten in the equivalent form
\[
  K^+ = \bigcap_{t \in \mathcal{A}} T^+[t], \tag{5}
\]
thus we have just obtained an alternative proof of the same equation. This result allows us to establish the circumstances in which the chronological or causal relation can be recovered from the knowledge of the time or temporal functions. Recall that a spacetime is causally easy if it is strongly causal and $\bar{J}^+$ is transitive [35]. Recall also that a causally continuous spacetime is a spacetime which is distinguishing and reflective. Finally, a spacetime is causally simple [5] if it is causal and $\bar{J}^+ = J^+$. We have causal simplicity ⇒ causal continuity ⇒ causal easiness ⇒ $K$-causality. By definition of causal easiness $K^+ = \bar{J}^+$, thus as $I^+ = \mathrm{Int}\, \bar{J}^+$, we easily find

Proposition 1. In a causally easy spacetime $I^+ = \mathrm{Int} \bigcap_t T^+[t]$, and in a causally simple spacetime $J^+ = \bigcap_t T^+[t]$, where the intersections are with respect to the sets of time or temporal functions.
5. Conclusions

The concept of causal influence is more primitive, and in fact more intuitive, than that of time. General relativistic spacetimes have by definition a causal structure but may lack a time function, namely a continuous function which respects the notion of causal precedence (i.e. if $a$ influences $b$ then the time of $a$ is less than that of $b$). In this work we have recognized the mathematical coincidence between the problem of the existence of a (semi-)time function on spacetime in relativistic physics and the problem of the existence of a utility function for an agent in microeconomics. From these problems two so far independent lines of research arose which, as we noted, often passed through the very same concepts. Remarkably, some results obtained in one field were not rediscovered in the other, a fact which has allowed us to use Peleg’s and Levin’s theorems to reach new results concerning the existence of (semi-)time functions in relativity. In particular, we have proved that a chronological spacetime in which $\bar{J}^+$ is transitive (for instance a reflective spacetime) admits a semi-time function. Also, in a $K$-causal spacetime the existence of a time function follows solely from the closure and antisymmetry of the $K^+$ relation. In the other direction we have proved, without the help of smoothing techniques, that the existence of a time function implies $K$-causality. We have also given a new proof of the equivalence between $K$-causality and stable causality by using Levin’s theorem and smoothing techniques. Finally, we have shown in two different ways that in a $K$-causal (i.e. stably causal) spacetime the $K^+$ (i.e. Seifert) relation can be recovered from the set of time or temporal functions allowed by the spacetime. This result singles out the $K^+$ relation as one of the most important ones for the development of causality theory.

Acknowledgments. This work has been partially supported by GNFM of INDAM and by FQXi.
References
1. Andrikopoulos, A.: Szpilrajn-type theorems in economics (May 2009). Mimeo, Univ. of Ioannina. Available at http://ideas.repec.org/p/pra/mprap/14345.html
2. Aumann, R.J.: Utility theory without the completeness axiom. Econometrica 30, 445–462 (1962)
3. Beem, J.K.: Conformal changes and geodesic completeness. Commun. Math. Phys. 49, 179–186 (1976)
4. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257, 43–50 (2005)
5. Bernal, A.N., Sánchez, M.: Globally hyperbolic spacetimes can be defined as ‘causal’ instead of ‘strongly causal’. Class. Quant. Grav. 24, 745–749 (2007)
6. Bossert, W.: Intersection quasi-orderings: An alternative proof. Order 16, 221–225 (1999)
7. Bridges, D.S., Mehta, G.B.: Representations of preference orderings, Vol. 442 of Lecture Notes in Economics and Mathematical Systems. Berlin: Springer-Verlag, 1995
8. Candeal-Haro, J.C., Induráin-Eraso, E.: Utility representations from the concept of measure. Math. Soc. Sci. 26, 51–62 (1993)
9. Clarke, C.J.S., Joshi, P.S.: On reflecting spacetimes. Class. Quant. Grav. 5, 19–25 (1988)
10. Debreu, G.: Representation of preference ordering by a numerical function. In: Decision Processes, ed. Thrall, R.M., Coombs, C.H., Davis, R.L., New York: John Wiley, 1954, pp. 159–165
11. Debreu, G.: Continuity properties of Paretian utility. Int. Econ. Rev. 5, 285–293 (1964)
12. Dieckmann, J.: Volume functions in general relativity. Gen. Rel. Grav. 20, 859–867 (1988)
13. Donaldson, D., Weymark, J.A.: A quasiordering is the intersection of orderings. J. Econ. Theory 78, 328–387 (1998)
14. Dushnik, B., Miller, E.: Partially ordered sets. Amer. J. Math. 63, 600–610 (1941)
15. Eilenberg, S.: Ordered topological spaces. Amer. J. Math. 63, 39–45 (1941)
16. Evren, O., Ok, E.A.: On the multi-utility representation of preference relations. J. Econ. Theory (in press)
17. Geroch, R.: Domain of dependence. J. Math. Phys. 11, 437–449 (1970)
18. Hawking, S.W.: The existence of cosmic time functions. Proc. Roy. Soc. London, Series A 308, 433–435 (1968)
19. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge: Cambridge University Press, 1973
20. Hawking, S.W., Sachs, R.K.: Causally continuous spacetimes. Commun. Math. Phys. 35, 287–296 (1974)
21. Herden, G.: On the existence of utility functions. Math. Soc. Sci. 17, 297–313 (1989)
22. Herden, G.: On some equivalent approaches to mathematical utility theory. Math. Soc. Sci. 29, 19–31 (1995)
23. Herden, G., Pallack, A.: On the continuous analogue of the Szpilrajn theorem I. Math. Soc. Sci. 43, 115–134 (2002)
24. Kim, J.-C., Kim, J.-H.: Totally vicious spacetimes. J. Math. Phys. 34, 2435–2439 (1993)
25. Lee, L.-F.: The theorems of Debreu and Peleg for ordered topological spaces. Econometrica 40, 1151–1153 (1972)
26. Levin, V.L.: A continuous utility theorem for closed preorders on a σ-compact metrizable space. Sov. Math. Dokl. 28, 715–718 (1983)
27. Malament, D.B.: Causal theories of time and the conventionality of simultaneity. Noûs 11, 293–300 (1977)
28. Mehta, G.: Topological ordered spaces and utility functions. Int. Econ. Rev. 18, 779–782 (1977)
29. Mehta, G.: Ordered topological spaces and the theorems of Debreu and Peleg. Indian J. Pure Appl. Math. 14, 1174–1182 (1983)
30. Minguzzi, E.: Simultaneity and generalized connections in general relativity. Class. Quant. Grav. 20, 2443–2456 (2003)
31. Minguzzi, E.: The causal ladder and the strength of K-causality. I. Class. Quant. Grav. 25, 015009 (2008)
32. Minguzzi, E.: The causal ladder and the strength of K-causality. II. Class. Quant. Grav. 25, 015010 (2008)
33. Minguzzi, E.: Limit curve theorems in Lorentzian geometry. J. Math. Phys. 49, 092501 (2008)
34. Minguzzi, E.: Non-imprisonment conditions on spacetime. J. Math. Phys. 49, 062503 (2008)
35. Minguzzi, E.: K-causality coincides with stable causality. Commun. Math. Phys. 290, 239–248 (2009)
36. Minguzzi, E., Sánchez, M.: The causal hierarchy of spacetimes. In: Baum, H., Alekseevsky, D. (eds.), Recent developments in pseudo-Riemannian geometry, ESI Lect. Math. Phys., Zurich: Eur. Math. Soc. Publ. House, 2008, pp. 299–358
37. Nachbin, L.: Topology and order. Princeton: D. Van Nostrand Company, Inc., 1965
38. Nomizu, K., Ozeki, H.: The existence of complete Riemannian metrics. Proc. Amer. Math. Soc. 12, 889–891 (1961)
39. Peleg, B.: Utility functions for partially ordered topological spaces. Econometrica 38, 93–96 (1970)
40. Rader, T.: The existence of a utility function to represent preferences. Rev. Econ. Stud. 30, 229–232 (1963)
41. Robb, A.A.: A Theory of Time and Space. Cambridge: Cambridge University Press, 1914
42. Seifert, H.: The causal boundary of space-times. Gen. Rel. Grav. 1, 247–259 (1971)
43. Seifert, H.J.: Smoothing and extending cosmic time functions. Gen. Rel. Grav. 8, 815–831 (1977)
44. Sondermann, D.: Utility representations for partial orders. J. Econ. Theory 23, 183–188 (1980)
45. Sorkin, R.D., Woolgar, E.: A causal order for spacetimes with $C^0$ Lorentzian metrics: proof of compactness of the space of causal curves. Class. Quant. Grav. 13, 1971–1993 (1996)
46. Szpilrajn, E.: Sur l’extension de l’ordre partiel. Fund. Math. 16, 386–389 (1930)
47. Ward, L.E. Jr.: Partially ordered topological spaces. Proc. Am. Math. Soc. 5, 144–161 (1954)

Communicated by P.T. Chruściel
Commun. Math. Phys. 298, 869–878 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1079-7
Communications in
Mathematical Physics
On the Best Constant in the Moser-Onofri-Aubin Inequality

Nassif Ghoussoub¹, Chang-Shou Lin²

¹ Department of Mathematics, University of British Columbia, Vancouver, BC V6T1Z2, Canada. E-mail: [email protected]
² Department of Mathematics, Taida Institute for Mathematical Sciences, National Taiwan University, Taipei, 106, Taiwan

Received: 29 September 2009 / Accepted: 24 February 2010
Published online: 27 June 2010 – © Springer-Verlag 2010
Abstract: Let $S^2$ be the 2-dimensional unit sphere and let $J_\alpha$ denote the nonlinear functional on the Sobolev space $H^1(S^2)$ defined by
\[
  J_\alpha(u) = \frac{\alpha}{16\pi} \int_{S^2} |\nabla u|^2 \, d\mu_0 + \frac{1}{4\pi} \int_{S^2} u \, d\mu_0 - \ln \int_{S^2} e^{u} \, \frac{d\mu_0}{4\pi},
\]
where $d\mu_0 = \sin\theta \, d\theta \wedge d\varphi$. Onofri had established that $J_\alpha$ is non-negative on $H^1(S^2)$ provided $\alpha \ge 1$. In this note, we show that if $J_\alpha$ is restricted to those $u \in H^1(S^2)$ that satisfy the Aubin condition
\[
  \int_{S^2} e^{u} x_j \, d\mu_0 = 0 \quad \text{for all } 1 \le j \le 3,
\]
then the same inequality continues to hold (i.e., $J_\alpha(u) \ge 0$) whenever $\alpha \ge \frac{2}{3} - \epsilon_0$ for some $\epsilon_0 > 0$. The question of Chang-Yang on whether this remains true for all $\alpha \ge \frac{1}{2}$ remains open.

1. Introduction

Let $S^2$ be the 2-dimensional unit sphere with the standard metric $g_0$ whose corresponding volume form $d\omega := \frac{d\mu_0}{4\pi}$ is normalized so that $\int_{S^2} d\omega = 1$. For $\alpha > 0$, we consider the following nonlinear functional on the Sobolev space $H^1(S^2)$:
\[
  J_\alpha(u) = \frac{\alpha}{16\pi} \int_{S^2} |\nabla_{g_0} u|^2 \, d\omega + \int_{S^2} u \, d\omega - \ln \int_{S^2} e^{u} \, d\omega.
\]
The classical Moser-Trudinger inequality [14] yields that $J_\alpha$ is bounded from below in $H^1(S^2)$ if and only if $\alpha \ge 1$. In [15], Onofri proved that the infimum is actually equal to zero for $\alpha = 1$, by using the conformal invariance of $J_1$ to show that
\[
  \inf_{u \in M} J_1(u) = \inf_{u \in H^1(S^2)} J_1(u) = 0, \tag{1.1}
\]
where $M$ is the submanifold of $H^1(S^2)$ defined by
\[
  M := \left\{ u \in H^1(S^2) \ ; \ \int_{S^2} e^{u} x \, d\omega = 0 \right\}, \tag{1.2}
\]
with $x = (x_1, x_2, x_3) \in S^2$, on which the infimum of $J_1$ is attained. Other proofs were also given by Osgood-Phillips-Sarnak [16] and by Hong [11]. Prior to that, Aubin [1] had shown that by restricting the functional $J_\alpha$ to $M$, it is then again bounded below by a (necessarily non-positive) constant $C_\alpha$, for any $\alpha \ge \frac{1}{2}$. In their work on Nirenberg’s prescribing Gaussian curvature problem on $S^2$, Chang and Yang [5,6] showed that $C_\alpha$ can be taken to be equal to 0 for $\alpha \ge 1 - \epsilon_0$ for some small $\epsilon_0$. This led them to the following

Conjecture 1. If $\alpha \ge \frac{1}{2}$ then $\inf_{u \in M} J_\alpha(u) = 0$.

Note that this fails if $\alpha < \frac{1}{2}$, since the functional $J_\alpha$ is then unbounded from below (see [9]). In this article, we want to give a partial answer to this question by showing that this is indeed the case for $\alpha \ge \frac{2}{3}$ and slightly below that. As mentioned above, Aubin had proved that for all $\alpha \ge \frac{1}{2}$, the functional $J_\alpha$ is coercive on $M$, and that it attains its infimum on some function $u \in M$. Accounting for the Lagrange multipliers, and setting $\rho = \frac{1}{\alpha}$, the Euler-Lagrange equation for $u$ is then
\[
  \Delta_{g_0} u + 8\pi\rho \left( \frac{e^{u}}{\int_{S^2} e^{u} \, d\omega} - 1 \right) = \sum_{j=1}^{3} \alpha_j x_j e^{u} \quad \text{on } S^2.
\]
In [6], Chang and Yang proved however that $\alpha_j$, $j = 1, 2, 3$, necessarily vanish. Thus $u$ satisfies, up to an additive constant, the equation
\[
  \Delta_{g_0} u + 8\pi\rho (e^{u} - 1) = 0 \quad \text{on } S^2,
\]
equivalently
\[
  \Delta u + 2\rho (e^{u} - 1) = 0 \quad \text{on } S^2, \tag{1.3}
\]
where now the Laplacian $\Delta := \frac{1}{4\pi} \Delta_{g_0}$ corresponds to the metric on the unit sphere whose volume form is $d\mu_0 = \sin\theta \, d\theta \wedge d\varphi$. Here is the main result of this note.

Theorem 1.1. If $1 < \rho \le \frac{3}{2}$ and $u$ is a solution of (1.3), then $u \equiv 0$ on $S^2$.
This then clearly gives a positive answer to Conjecture 1 for $\alpha \ge \frac{2}{3}$.

2. The Axially Symmetric Case

The proof of Theorem 1.1 relies on the fact that the conjecture has been shown to be true in the axially symmetric case. In other words, the following result holds.

Theorem A. Let $u$ be a solution of (1.3) with $1 < \rho \le 2$. If $u$ is axially symmetric, then $u \equiv 0$ on $S^2$.

Theorem A was first established by Feldman, Froese, Ghoussoub and Gui [9] for $1 < \rho \le \frac{25}{16}$. It was eventually proved for all $1 < \rho \le 2$ by Gui and Wei [10], and independently by Lin [12]. Note that this means that the following one-dimensional inequality holds:
\[
  \frac{1}{2} \int_{-1}^{1} (1 - x^2) |g'(x)|^2 \, dx + 2 \int_{-1}^{1} g(x) \, dx - 2 \ln \left( \frac{1}{2} \int_{-1}^{1} e^{2g(x)} \, dx \right) \ge 0,
\]
for every function $g$ on $(-1, 1)$ satisfying $\int_{-1}^{1} (1 - x^2) |g'(x)|^2 \, dx < \infty$ and $\int_{-1}^{1} e^{2g(x)} x \, dx = 0$.

We now give a sketch of the proof of Theorem A that connects the conjecture of Chang-Yang to an equally interesting Liouville type theorem on $\mathbb{R}^2$. For that, we let $\Pi$ denote the stereographic projection $S^2 \to \mathbb{R}^2$ with respect to the North pole $N = (0, 0, 1)$:
\[
  \Pi(x) := \left( \frac{x_1}{1 - x_3}, \frac{x_2}{1 - x_3} \right).
\]
Suppose $u$ is a solution of (1.3), and set $\tilde{u}(y) := u(\Pi^{-1}(y))$ for $y \in \mathbb{R}^2$. Then $\tilde{u}$ satisfies
\[
  \Delta \tilde{u} + 2\rho J(y) \left( e^{\tilde{u}} - 1 \right) = 0 \quad \text{in } \mathbb{R}^2,
\]
where $J(y) := \left( \frac{2}{1 + |y|^2} \right)^2$ is the Jacobian of $\Pi$. By letting
\[
  v(y) := \tilde{u}(y) + \rho \log\left( (1 + |y|^2)^{-2} \right) + \log(8\rho) \quad \text{for } y \in \mathbb{R}^2, \tag{2.1}
\]
we have that $v$ satisfies
\[
  \Delta v + (1 + |y|^2)^{l} e^{v} = 0 \quad \text{in } \mathbb{R}^2, \tag{2.2}
\]
where $l = 2(\rho - 1)$. Let $v$ be a solution of (2.2) and suppose $\beta_l(v)$ is finite, where
\[
  \beta_l(v) = \frac{1}{2\pi} \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy. \tag{2.3}
\]
Then $v(y)$ has the following asymptotic behavior at $\infty$:
\[
  v(y) = -\beta_l(v) \log |y| + O(1). \tag{2.4}
\]
We refer to [7] for a proof of (2.4), which once combined with the Pohozaev identity yields the following result.

Lemma 2.1. Let $l > 0$ and $v$ be a solution of (2.2) such that $\beta_l(v) < +\infty$. Then $4 < \beta_l(v) < 4(l + 1)$.
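Proof. Multiplying (2.2) by $y \cdot \nabla v$ and integrating by parts on $B_R = \{ y \mid |y| < R \}$, we have
\[
  \int_{\partial B_R} (y \cdot \nabla v) \frac{\partial v}{\partial \nu} \, ds - \frac{1}{2} \int_{\partial B_R} (y \cdot \nu) |\nabla v|^2 \, ds = - \int_{B_R} (1 + |y|^2)^{l} \, y \cdot \nabla e^{v} \, dy
\]
\[
  = (2l + 2) \int_{B_R} (1 + |y|^2)^{l} e^{v} \, dy - 2l \int_{B_R} (1 + |y|^2)^{l-1} e^{v} \, dy - \int_{\partial B_R} (1 + |y|^2)^{l} (y \cdot \nu) e^{v} \, ds,
\]
where the coefficients come from $\nabla \cdot \left( (1 + |y|^2)^{l} y \right) = (2l + 2)(1 + |y|^2)^{l} - 2l (1 + |y|^2)^{l-1}$. By letting $R \to +\infty$ in the above formula and by using (2.4), we obtain that
\[
  (2l + 2) \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy - 2l \int_{\mathbb{R}^2} (1 + |y|^2)^{l-1} e^{v} \, dy = \pi \beta_l^2(v).
\]
Note now that
\[
  4\pi \beta_l(v) = 2 \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy
  < (2l + 2) \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy - 2l \int_{\mathbb{R}^2} (1 + |y|^2)^{l-1} e^{v} \, dy
\]
\[
  < (2l + 2) \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy = 2\pi (2l + 2) \beta_l(v),
\]
which means that $4\pi \beta_l(v) < \pi \beta_l^2(v) < 2\pi (2l + 2) \beta_l(v)$, i.e., $4 < \beta_l(v) < 4(l + 1)$.

Note that by using (2.1) with $u \equiv 0$, Eq. (2.2) always has a special axially symmetric solution, namely
\[
  v^{*}(y) = -2\rho \log(1 + |y|^2) + \log(8\rho) \quad \text{for } y \in \mathbb{R}^2, \tag{2.5}
\]
where
\[
  \beta_l(v^{*}) = 4\rho = 2(l + 2). \tag{2.6}
\]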
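The explicit solution $v^{*}$ gives a convenient numerical test of (2.2), (2.6) and Lemma 2.1. The sketch below is illustrative only: it checks the equation with a five-point finite-difference Laplacian, evaluates the radial integrals by the substitution $r = \tan t$, and verifies the Pohozaev-type identity in the form $(2l+2)\int (1+|y|^2)^{l} e^{v} - 2l \int (1+|y|^2)^{l-1} e^{v} = \pi \beta_l^2$ obtained from the divergence formula. The names `v_star`, `residual` and `radial_integral` are hypothetical helpers, and the value $\rho = 3/2$ is an arbitrary choice.

```python
import math


def v_star(y1, y2, rho):
    """The explicit solution (2.5): v*(y) = -2 rho log(1 + |y|^2) + log(8 rho)."""
    return -2.0 * rho * math.log(1.0 + y1 * y1 + y2 * y2) + math.log(8.0 * rho)


def residual(y1, y2, rho, eps=1e-3):
    """Five-point finite-difference Laplacian of v* plus (1 + |y|^2)^l e^{v*}."""
    l = 2.0 * (rho - 1.0)
    lap = (v_star(y1 + eps, y2, rho) + v_star(y1 - eps, y2, rho)
           + v_star(y1, y2 + eps, rho) + v_star(y1, y2 - eps, rho)
           - 4.0 * v_star(y1, y2, rho)) / (eps * eps)
    return lap + (1.0 + y1 * y1 + y2 * y2) ** l * math.exp(v_star(y1, y2, rho))


def radial_integral(rho, power_shift, n=20000):
    """Integral over R^2 of (1 + |y|^2)^(l - power_shift) * exp(v*), l = 2(rho - 1).

    Uses polar coordinates, the substitution r = tan(t) and the midpoint
    rule in t on (0, pi/2).
    """
    l = 2.0 * (rho - 1.0)
    step = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        r = math.tan((k + 0.5) * step)
        s = 1.0 + r * r
        density = s ** (l - power_shift) * math.exp(v_star(r, 0.0, rho))
        # dy = 2 pi r dr and dr = (1 + r^2) dt
        total += density * 2.0 * math.pi * r * s * step
    return total


if __name__ == "__main__":
    rho = 1.5
    l = 2.0 * (rho - 1.0)
    A = radial_integral(rho, 0)
    B = radial_integral(rho, 1)
    beta = A / (2.0 * math.pi)
    print("residual of (2.2) at a sample point:", residual(0.3, -0.7, rho))
    print("beta_l(v*):", beta, " expected 4*rho =", 4.0 * rho)
    print("Pohozaev:", (2 * l + 2) * A - 2 * l * B, " vs pi*beta^2 =", math.pi * beta ** 2)
    print("Lemma 2.1 bounds hold:", 4.0 < beta < 4.0 * (l + 1.0))
```

For $\rho > 1$ the computed $\beta_l(v^{*})$ lies strictly between the two bounds of Lemma 2.1, as it must.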
An open question that would clearly imply the conjecture of Chang and Yang is the following:

Conjecture 2. Is $v^{*}$ the only solution of (2.2) with $\beta_l = 2(l + 2)$?

Note that it is indeed the case if $l < 0$ (i.e., $\rho < 1$ and $\alpha > 1$), since then we can employ the method of moving planes to show that $v(y)$ is radially symmetric with respect to the origin, and then conclude that $u(x)$ is axially symmetric with respect to any line passing through the origin. Thus $u(x)$ must be a constant function on $S^2$. Equation (1.3) then yields $u = 0$, which implies $J_\alpha \ge 0$ on $M$. By passing to the limit as $\alpha \to 1$, we recover the Onofri inequality. When $l > 0$ (i.e., $\rho > 1$ and $\alpha < 1$), the method of moving planes fails and it is still an open problem whether any solution of (2.2) with $\beta_l = 2(l + 2)$ is equal to $v^{*}$ or not. The following uniqueness theorem, however, reduces the problem to whether any solution of (2.2) is radially symmetric.
Theorem B. Suppose $l > 0$ and $v_i(y) = v_i(|y|)$, $i = 1, 2$, are two solutions of (2.2) satisfying
\[
  \beta_l(v_1) = \beta_l(v_2). \tag{2.7}
\]
Then $v_1 = v_2$ under one of the following conditions: (i) $l \le 1$, or (ii) $l > 1$ and $4l < \beta_l(v_i) < 4(1 + l)$ for $i = 1, 2$.

See [12] for a proof of Theorem B. In order to show how Theorem B implies Theorem A, we suppose $u$ is a solution of (1.3) that is axially symmetric with respect to some direction. By rotating, the direction can be assumed to be $(0, 0, 1)$. By using the stereographic projection as above, and setting $v$ as in (2.1), we have
\[
  v(y) = -4\rho \log |y| + O(1), \qquad \frac{1}{2\pi} \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy = 4\rho = 4 + 2l. \tag{2.8}
\]
If $l \le 1$, i.e., $\rho \le \frac{3}{2}$, then $v = v^{*}$ by (i) of Theorem B, and then $u \equiv 0$. If $2 > l > 1$, then by noting that $4l < 4\rho = 4 + 2l = \beta_l(v) < 4 + 4l$, we deduce that $v = v^{*}$ by (ii) of Theorem B, which again means that $u \equiv 0$.

3. Proof of the Main Theorem

We shall prove Theorem 1.1 by showing that if $\rho \le \frac{3}{2}$, then any solution of (1.3) is necessarily axially symmetric. We can then conclude by using Theorem A. We shall need the following lemma.

Lemma 3.1. Let $\Omega$ be a simply connected domain in $\mathbb{R}^2$, and suppose $g \in C^2(\Omega)$ satisfies
\[
  \Delta g + e^{g} > 0 \ \text{in } \Omega \quad \text{and} \quad \int_{\Omega} e^{g} \, dy \le 8\pi.
\]
Consider an open set $\omega \subset \Omega$ such that $\lambda_{1,g}(\omega) \le 0$, where $\lambda_{1,g}(\omega)$ is the first eigenvalue of the operator $\Delta + e^{g}$ on $H_0^1(\omega)$. Then, we necessarily have that
\[
  \int_{\omega} e^{g} \, dy > 4\pi. \tag{3.1}
\]
Lemma 3.1 was first proved in [2] by using the classical Bol inequality. The strict inequality of (3.1) is due to the fact that $\Delta g + e^{g} > 0$ in $\Omega$. See [3] and references therein.

Remark 3.2. We note that Lemma 3.1 can be applied even when $\omega$ is unbounded. Indeed, for simplicity, we shall assume (as will be the case in the application below to the proof of Theorem 1.1) that for some $\beta \ge 2$, we have $g(y) = -\beta \log |y| + O(1)$ at $\infty$.
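We shall also assume that the corresponding null-eigenfunction $\phi$ in $\omega$, i.e.,
\[
  \Delta \phi + e^{g} \phi = 0 \ \text{in } \omega, \qquad \phi|_{\partial\omega} = 0,
\]
is bounded in $\omega$. Without loss of generality, we may also assume that $0 \in \omega$. Now set
\[
  \hat{g}(x) = g\!\left( \frac{x}{|x|^2} \right) - 2 \log |x| \quad \text{and} \quad \hat{\phi}(x) = \phi\!\left( \frac{x}{|x|^2} \right) \quad \text{for } x \in \omega^{*} = \left\{ y = \frac{x}{|x|^2} \ ; \ x \in \omega \right\}.
\]
Since $\beta \ge 2$, $e^{\hat{g}}$ is a Hölder function at $0 \in \omega^{*}$, and $\hat{g}$ and $\hat{\phi}$ satisfy
\[
  \Delta \hat{g} + e^{\hat{g}} > 0 \ \text{in } \omega^{*} \setminus \{0\} \quad \text{and} \quad \Delta \hat{\phi} + e^{\hat{g}} \hat{\phi} = 0 \ \text{in } \omega^{*}.
\]
By the boundedness of $\hat{\phi}$, $\hat{\phi}$ is continuous on $\omega^{*}$. If $0 \in \omega^{*}$, then by noting that $\hat{g}$ satisfies $\Delta \hat{g} + e^{\hat{g}} \ge (\beta - 2)\delta_0$, where $\delta_0$ is the Dirac measure at 0 and $\beta - 2 \ge 0$, we can then apply a version of Lemma 3.1 where $\hat{g}$ can have a singularity (see [3]), to deduce that
\[
  \int_{\omega^{*}} e^{\hat{g}(x)} \, dx = \int_{\omega} e^{g(x)} \, dx \ge 4\pi.
\]
We note that in the application of the lemma to the proof of Theorem 1.1, we have that $\phi$ is bounded on all of $\mathbb{R}^2$. Now we are in the position to prove the main theorem.

Proof of Theorem 1.1. Suppose $u(x)$ is a solution of (1.3). Let $\xi_0$ be a critical point of $u$. Without loss of generality, we may assume $\xi_0 = (0, 0, -1)$. By using the stereographic projection as before and letting
\[
  v(y) := u(\Pi^{-1}(y)) - 2\rho \log(1 + |y|^2) + \log(8\rho),
\]
$v$ satisfies (2.2) and
\[
  \nabla v(0) = 0. \tag{3.2}
\]
Set
\[
  \phi(y) := y_2 \frac{\partial v}{\partial y_1} - y_1 \frac{\partial v}{\partial y_2}.
\]
Then $\phi$ satisfies
\[
  \Delta \phi + (1 + |y|^2)^{l} e^{v} \phi = 0 \quad \text{in } \mathbb{R}^2. \tag{3.3}
\]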
By (2.1), it is easy to see that $\phi$ is bounded in $\mathbb{R}^2$. If $\phi \not\equiv 0$, then by (3.2),
\[
  \phi(y) = Q(y) + \text{higher order terms} \quad \text{for } |y| \ll 1,
\]
where $Q(y)$ is a homogeneous polynomial of degree $m$ with $m \ge 2$, that is also a harmonic function, i.e., $\Delta Q = 0$. Thus, the nodal line $\{ y \mid \phi(y) = 0 \}$ divides a small neighborhood of the origin into at least four regions. Let $\gamma_i$, $i = 1, 2, 3, 4$, be four branches of the nodal line of $\phi$ issuing from the origin. If $\gamma_i$ does not intersect $\gamma_j$ for $i \ne j$, then $\mathbb{R}^2 \setminus \bigcup_{i=1}^{4} \gamma_i$ contains at least four simply-connected components. See Fig. 1 below. If $\gamma_i$ intersects some $\gamma_j$, then $\mathbb{R}^2 \setminus \bigcup_{i=1}^{4} \gamma_i$ contains at least three simply-connected components. See Fig. 2.

Fig. 1.

Fig. 2.

If there are more branches of the nodal line of $\phi$ issuing from the origin, then $\mathbb{R}^2 \setminus \{\phi = 0\}$ is divided into more components of simply-connected domains. Therefore, we conclude that $\mathbb{R}^2$ is divided by the nodal line $\{ y \mid \phi(y) = 0 \}$ into at least 3 regions, i.e.,
\[
  \mathbb{R}^2 \setminus \{ y \mid \phi(y) = 0 \} = \bigcup_{j=1}^{3} \Omega_j.
\]
In each component $\Omega_j$, the first eigenvalue of $\Delta + (1 + |y|^2)^{l} e^{v}$ is equal to 0. Let now
\[
  g := \log\left( (1 + |y|^2)^{l} e^{v} \right).
\]
By noting that $\Delta g + e^{g} > 0$ in $\mathbb{R}^2$ (indeed, by (2.2), $\Delta g + e^{g} = l \, \Delta \log(1 + |y|^2) = 4l / (1 + |y|^2)^2$), Lemma 3.1 then implies that for each $j = 1, 2, 3$,
\[
  \int_{\Omega_j} e^{g} \, dy = \int_{\Omega_j} (1 + |y|^2)^{l} e^{v} \, dy > 4\pi.
\]
It follows that
\[
  8\pi\rho = \sum_{j=1}^{3} \int_{\Omega_j} (1 + |y|^2)^{l} e^{v} \, dy > 12\pi,
\]
which is a contradiction if we had assumed that $\rho \le \frac{3}{2}$. Thus we have $\phi(y) \equiv 0$, i.e., $v(y)$ is axially symmetric. By Theorem A, we can conclude $u \equiv 0$.

Remark 3.3. If we further assume that the antipodal of $\xi_0$ is also a critical point of $u$, then $\mathbb{R}^2 \setminus \{ y \mid \phi(y) = 0 \} = \bigcup_{j=1}^{m} \Omega_j$, where $m \ge 4$. Lemma 3.1 then yields
\[
  8\pi\rho = \int_{\mathbb{R}^2} (1 + |y|^2)^{l} e^{v} \, dy \ge \sum_{j=1}^{m} \int_{\Omega_j} (1 + |y|^2)^{l} e^{v} \, dy > 4m\pi \ge 16\pi,
\]
which is a contradiction whenever $\rho \le 2$. By Theorem A, we have again that $u \equiv 0$. For example, if $u$ is even on $S^2$ (i.e., $u(z) = u(-z)$ for all $z \in S^2$), then the main theorem holds for $\rho \le 2$.

Remark 3.4. If $v$ is a solution of (2.2) with $\beta_l(v) \le 6$, and 0 is a critical point of $v$, then by the same proof as for Theorem 1.1, we can conclude that $v$ is radially symmetric in $\mathbb{R}^2$. Furthermore, if $v(x_1, x_2)$ is even in both $x_1$ and $x_2$, then $v$ is radially symmetric if $\beta_l(v) \le 8$.

Remark 3.5. One can actually show that Conjecture 1 holds for $\rho \le \frac{3}{2} + \epsilon_0$ for some $\epsilon_0 > 0$. Indeed, it suffices to show that for $\alpha$ smaller than but close to $\frac{2}{3}$, the functional $J_\alpha$ is non-negative. Assuming not, there exists a sequence $\{\alpha_k\}_k$ such that $\frac{1}{2} < \alpha_k < \frac{2}{3}$, $\lim_k \alpha_k = \frac{2}{3}$ and $\inf_M J_{\alpha_k}(u) < 0$. Since $J_\alpha$ is coercive for each $\alpha > \frac{1}{2}$, a standard compactness argument yields the existence of a minimizer $u_k \in M$ for $J_{\alpha_k}$. Moreover, $\|u_k\|_{H^1} < C$ for some positive constant $C$ independent of $k$. Modulo extracting a subsequence, $u_k$ then converges weakly to some $u_0$ in $M$ as $k \to \infty$, and $u_0$ is necessarily a minimizer for $J_{2/3}$ in $M$. By our main result, $u_0 \equiv 0$. Now, we claim that $u_k$ actually converges strongly in $H^1$ to $u_0 \equiv 0$. This is because, as argued by Chang and Yang, the Euler-Lagrange equations are then
\[
  \frac{\alpha_k}{2} \Delta u_k - 1 + \frac{1}{\lambda_k} e^{u_k} = 0, \tag{3.20}
\]
where $\lambda_k = \int_{S^2} e^{u_k} \, d\omega < C$ for some positive constant $C$. Multiplying (3.20) by $u_k$ and integrating over $S^2$, we obtain
\[
  \frac{\alpha_k}{2} \int_{S^2} |\nabla u_k|^2 \, d\omega + \int_{S^2} u_k(x) \, d\omega = \frac{1}{\lambda_k} \int_{S^2} e^{u_k(x)} u_k(x) \, d\omega. \tag{3.21}
\]
Applying Onofri’s inequality for $u_k$ and using that $\|u_k\|_{H^1} < C$, we get that $\int_{S^2} e^{2u_k} \, d\omega$ is also uniformly bounded. This, combined with Hölder’s inequality and the fact that $u_k$ converges strongly to 0 in $L^2$, yields that $\int_{S^2} e^{u_k} u_k \, d\omega \to 0$. Use now (3.21) to conclude that $\|u_k\|_{H^1} \to 0$ as $k \to \infty$.
Now, write $u = v + o(\|u\|)$ for $\|u\|$ small, where $v$ belongs to the tangent space of the submanifold $M$ at $u_0 \equiv 0$ in $H^1(S^2)$. It is easy to see that $\int_{S^2} v\,x\, dw = 0$. We can calculate the second variation of $J_\alpha$ in $M$ at $u_0 \equiv 0$ and get the following estimate around 0:
\[
J_\alpha(u) = \alpha \int_{S^2} |\nabla v|^2\, dw - 2 \int_{S^2} |v|^2\, dw + o(\|u\|^2).
\]
Note that the eigenvalues of the Laplacian on $S^2$ corresponding to the eigenspace generated by $x_1, x_2, x_3$ are $\lambda_2 = \lambda_3 = \lambda_4 = 2$, while $\lambda_5 = 6$. Since $v$ is orthogonal to $x$, we have
\[
\int_{S^2} |\nabla v|^2\, dw \ge 6 \int_{S^2} |v|^2\, dw,
\]
and therefore
\[
J_\alpha(u) \ge \Bigl(\alpha - \frac{1}{3}\Bigr) \|u\|^2 + o(\|u\|^2).
\]
Taking $\alpha = \alpha_k$ and $u = u_k$ for $k$ large enough, we get that $J_{\alpha_k}(u_k) \ge 0$, which clearly contradicts our initial assumption on $u_k$.

Concluding remarks. (i) The question whether $J_\alpha(u) \ge 0$ for $\frac{1}{2} \le \alpha < \frac{2}{3}$ under the condition (1.2) is still open. However, in [13], it was proved that there is a constant $C \ge 0$ such that for any solution $u$ of (1.3) with $1 < \rho \le 2$ (i.e. $\frac{1}{2} \le \alpha < 1$), we have $|u(x)| \le C$ for all $x \in S^2$.

(ii) Recently, Liouville type equations with singular data have attracted a lot of attention in the PDE community, since they are closely related to vortex condensates which appear in many physics models. One of the challenges in this line of research is to understand bubbling phenomena arising from solutions of these equations, and the past twenty years have seen many works in this direction. The most delicate case in bubbling phenomena is when more than one vortex collapses into a single point. Equation (2.2) is one of the model equations that allows an accurate description of the bubbling behavior during such a collapse. See [4] and [8] for related details. Thus, understanding the structure of solutions to Eq. (2.2) is fundamentally important. As mentioned above, it is conjectured that for $l \le 2$, all solutions of (2.2) must be radially symmetric. This remains an open question, although a partial answer has been given recently in [4].

References
1. Aubin, T.: Meilleures constantes dans le théorème d'inclusion de Sobolev et un théorème de Fredholm non linéaire pour la transformation conforme de la courbure scalaire (French). J. Funct. Anal. 32(2), 148–174 (1979)
2. Bandle, C.: Isoperimetric inequalities and applications. Monographs and Studies in Mathematics, 7. Boston, MA, London: Pitman (Advanced Publishing Program), 1980
3. Bartolucci, D., Lin, C.S.: Uniqueness results for mean field equations with singular data. Comm. Part. Diff. Eqs. 34(3), 676–702 (2009)
4. Bartolucci, D., Lin, C.S., Tarantello, G.: Preprint, 2009
5. Chang, S.Y., Yang, P.: Conformal deformation of metrics on $S^2$. J. Diff. Geom. 27(2), 259–296 (1988)
6. Chang, S.Y., Yang, P.: Prescribing Gaussian curvature on $S^2$. Acta Math. 159(3–4), 215–259 (1987)
7. Cheng, K.S., Lin, C.S.: On the asymptotic behavior of solutions of the conformal Gaussian curvature equations in $\mathbb{R}^2$. Math. Ann. 308(1), 119–139 (1997)
8. Dolbeault, J., Esteban, M.J., Tarantello, G.: Multiplicity results for the assigned Gaussian curvature problem in $\mathbb{R}^2$. Nonlinear Anal. 70, 2870–2881 (2009)
9. Feldman, J., Froese, R., Ghoussoub, N., Gui, C.F.: An improved Moser-Aubin-Onofri inequality for axially symmetric functions on $S^2$. Calc. Var. Part. Diff. Eqs. 6(2), 95–104 (1998)
10. Gui, C.F., Wei, J.C.: On a sharp Moser-Aubin-Onofri inequality for functions on $S^2$ with symmetry. Pac. J. Math. 194(2), 349–358 (2000)
11. Hong, C.: A best constant and the Gaussian curvature. Proc. AMS 97, 737–747 (1986)
12. Lin, C.S.: Uniqueness of solutions to the mean field equations for the spherical Onsager vortex. Arch. Rat. Mech. Anal. 153(2), 153–176 (2000)
13. Lin, C.S.: Topological degree for mean field equations on $S^2$. Duke Math. J. 104(3), 501–536 (2000)
14. Moser, J.: A sharp form of an inequality by N. Trudinger. Indiana U. Math. J. 20, 1077–1091 (1971)
15. Onofri, E.: On the positivity of the effective action in a theory of random surfaces. Commun. Math. Phys. 86(3), 321–326 (1982)
16. Osgood, B., Phillips, R., Sarnak, P.: Extremals of determinants of Laplacians. J. Funct. Anal. 80, 148–211 (1988)

Communicated by M. Salmhofer